Approximation and learning of anisotropic and mixed smooth functions by deep ReLU neural networks

:::info Stub: Full Engineering Breakdown Coming
This paper was auto-fetched from arXiv on 2026-06-01. A full breakdown with production viability rating, implementation notes, and honest limitations is being written.
:::


Authors	Yunfei Yang & Jun Fan
Year	2026
Field	Statistics / ML
arXiv	2605.31152
PDF	Download
Categories	stat.ML, cs.LG

Abstract

This paper studies how efficiently deep ReLU neural networks can approximate and learn smooth functions. When the error is measured in the $L^p([0,1]^d)$ norm and the approximator is a network with width $W$ and depth $L$, recent works have proven the super approximation rate $\mathcal{O}((WL)^{-2s/d})$ for the Besov space $\mathcal{B}^s_{q,r}([0,1]^d)$ under the Sobolev embedding condition $s/d>1/q-1/p$. In order to overcome the curse of dimensionality in this rate, we extend this result to anisotropic and mixed smooth function classes. We establish the approximation rate $\mathcal{O}((WL)^{-2\tilde{s}})$ for the anisotropic Besov space $\mathcal{B}^{\boldsymbol{s}}_{q,r}([0,1]^d)$ with anisotropic smoothness $\boldsymbol{s}=(s_1,\dots,s_d)$ under the embedding condition $\tilde{s} > 1/q-1/p$, where the mean smoothness is $\tilde{s} = (\sum_{i=1}^d s_i^{-1})^{-1}$. For the mixed smooth Besov space $\mathcal{MB}^s_{q,r}([0,1]^d)$ with mixed smoothness $s>1/q-1/p$, we show that the approximation rate $\mathcal{O}((WL)^{-2s})$ holds up to logarithmic factors. Using these results, we also derive approximation bounds for the composition of anisotropic Besov functions. As an application, it is shown that deep ReLU neural networks can achieve minimax optimal rates up to logarithmic factors for a wide range of smooth function classes.

Engineering Breakdown

The Problem

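Deep ReLU networks of width $W$ and depth $L$ are known to approximate functions in the Besov space $\mathcal{B}^s_{q,r}([0,1]^d)$ at the rate $\mathcal{O}((WL)^{-2s/d})$ in the $L^p([0,1]^d)$ norm. The exponent $2s/d$ shrinks as the input dimension $d$ grows, so this rate suffers from the curse of dimensionality. The paper sets out to overcome that dependence on $d$ for function classes with more structure.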
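To make the dimension dependence of the isotropic rate $\mathcal{O}((WL)^{-2s/d})$ from the abstract concrete, the sketch below inverts the rate to estimate how large the product $WL$ must be to reach a target error. It assumes the hidden constant in the bound is 1, which the source does not state, and `size_needed` is a hypothetical helper name, so the printed numbers are illustrative only.

```python
def size_needed(eps: float, s: float, d: int) -> float:
    """Smallest WL satisfying (WL)**(-2*s/d) <= eps.

    Assumption: the constant hidden in the O(.) bound is taken to be 1.
    """
    if not (0.0 < eps < 1.0) or s <= 0 or d <= 0:
        raise ValueError("need 0 < eps < 1, s > 0, d > 0")
    return eps ** (-d / (2.0 * s))


# Same smoothness and target error, growing input dimension.
for d in (2, 8, 32):
    print(f"d={d:2d}  WL >= {size_needed(1e-2, s=2.0, d=d):.3g}")
```

With smoothness and target error held fixed, the required $WL$ grows exponentially in $d$, which is the behavior the anisotropic and mixed smooth rates are meant to avoid.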

The Approach

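The authors extend the isotropic result to two structured function classes. For the anisotropic Besov space $\mathcal{B}^{\boldsymbol{s}}_{q,r}([0,1]^d)$ with smoothness $\boldsymbol{s}=(s_1,\dots,s_d)$, they establish the rate $\mathcal{O}((WL)^{-2\tilde{s}})$ under the embedding condition $\tilde{s} > 1/q-1/p$, where $\tilde{s} = (\sum_{i=1}^d s_i^{-1})^{-1}$ is the mean smoothness. For the mixed smooth Besov space $\mathcal{MB}^s_{q,r}([0,1]^d)$ with mixed smoothness $s>1/q-1/p$, they show that the approximation rate $\mathcal{O}((WL)^{-2s})$ holds up to logarithmic factors. These results also yield approximation bounds for compositions of anisotropic Besov functions.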
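The mean smoothness $\tilde{s} = (\sum_{i=1}^d s_i^{-1})^{-1}$ defined in the abstract is easy to compute directly. The following minimal sketch does so and compares the resulting exponent $2\tilde{s}$ for an isotropic and an anisotropic smoothness vector. The function names and the example vectors are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def mean_smoothness(s: Sequence[float]) -> float:
    """s_tilde = (sum_i 1/s_i)^(-1), the mean smoothness from the abstract."""
    if len(s) == 0 or any(si <= 0 for si in s):
        raise ValueError("need a non-empty vector of positive smoothness indices")
    return 1.0 / sum(1.0 / si for si in s)


def rate_exponent(s_tilde: float) -> float:
    """Exponent a in the approximation bound (WL)^(-a), with a = 2 * s_tilde."""
    return 2.0 * s_tilde


# Illustrative inputs (assumptions, not taken from the paper).
isotropic = [2.0, 2.0, 2.0, 2.0]      # s_i = s = 2 in d = 4
anisotropic = [2.0, 2.0, 20.0, 20.0]  # much smoother in the last two coordinates

for name, s in (("isotropic", isotropic), ("anisotropic", anisotropic)):
    st = mean_smoothness(s)
    print(f"{name:12s} s_tilde={st:.4f}  exponent={rate_exponent(st):.4f}")
```

When all $s_i$ equal $s$, the mean smoothness reduces to $s/d$ and the exponent to $2s/d$, matching the isotropic rate. Extra smoothness in some coordinates raises $\tilde{s}$ and therefore improves the exponent.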

Key Results

As an application, it is shown that deep ReLU neural networks can achieve minimax optimal rates up to logarithmic factors for a wide range of smooth function classes.
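The abstract does not spell out the minimax rates themselves. As an assumption for illustration, the sketch below uses the classical nonparametric regression rate in squared $L^2$ risk, $n^{-2\tilde{s}/(2\tilde{s}+1)}$ for $n$ samples, which reduces to the familiar $n^{-2s/(2s+d)}$ when $\tilde{s} = s/d$. Logarithmic factors are ignored, and the rates proved in the paper may differ in their exact setting. The function name is hypothetical.

```python
def classical_minimax_exponent(s_tilde: float) -> float:
    """Exponent b in the classical rate n^(-b), b = 2*s_tilde / (2*s_tilde + 1).

    Assumption: standard nonparametric regression in squared L2 risk;
    this formula is not quoted from the paper.
    """
    if s_tilde <= 0:
        raise ValueError("s_tilde must be positive")
    return 2.0 * s_tilde / (2.0 * s_tilde + 1.0)


# Isotropic smoothness s = 2 in d = 4 gives s_tilde = s/d = 0.5.
iso = classical_minimax_exponent(2.0 / 4)
# Anisotropic smoothness (2, 2, 20, 20) gives s_tilde = 1 / (1/2 + 1/2 + 1/20 + 1/20).
aniso = classical_minimax_exponent(1.0 / (0.5 + 0.5 + 0.05 + 0.05))

print(f"isotropic exponent:   {iso:.4f}")
print(f"anisotropic exponent: {aniso:.4f}")
```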

Research Areas

This paper contributes to the following areas of AI/ML engineering:

Machine learning
Deep learning
Neural networks
Model optimization
AI systems
Approximation

