Universal Approximation Theorem: Interactive Demos

Step Function Approximation

ReLU Network Approximation Visualization

This interactive demo shows how a neural network decomposes a target function into a weighted sum of shifted ReLU components. The example network uses 5 ReLU neurons to approximate a cubic function.

For a ReLU function ReLU(wx + b), the bias term b determines the activation threshold where the function "turns on." The ReLU switches from outputting 0 to outputting the linear part \(wx + b\) at the point where \(wx + b = 0\), which places the breakpoint at \(x = -b/w\). You can see this in the visualization: ReLU(x - 2) activates at \(x = 2\) (since \(-b/w = -(-2)/1 = 2\)), and ReLU(x + 1) activates at \(x = -1\) (since \(-b/w = -1/1 = -1\)).

The interactive panel shows the ReLU decomposition, the approximation error (measured in the L∞ norm), and the network architecture.
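
To reproduce the decomposition outside the demo, here is a minimal Python sketch that fits the output-layer weights of five ReLU units with fixed breakpoints to a cubic by least squares and reports the L∞ error. The specific cubic, the breakpoint positions, and the use of a plain least-squares fit are illustrative assumptions, not the demo's actual parameters.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Hypothetical target: the demo's cubic is not specified, so we pick one.
def target(x):
    return x**3 - 3.0 * x

x = np.linspace(-2.0, 2.0, 401)

# Five ReLU units, each with weight w = 1, so the breakpoint of unit i is
# x = -b_i/w = breakpoints[i]. The spacing is an illustrative choice.
breakpoints = np.linspace(-2.0, 1.0, 5)

# Design matrix: one column per ReLU(x - breakpoint), plus a constant output bias.
features = np.column_stack([relu(x - b) for b in breakpoints] + [np.ones_like(x)])

# Fit the output-layer weights by least squares.
coeffs, *_ = np.linalg.lstsq(features, target(x), rcond=None)
approx = features @ coeffs

# Worst-case (L-infinity) approximation error, as reported in the demo.
print("L_inf error:", np.max(np.abs(approx - target(x))))
```

Each column of the design matrix is one neuron's output, so the fitted coefficients play the role of the output-layer weights; adding more breakpoints (more neurons) refines the piecewise-linear fit and drives the L∞ error down.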

External Visualization

Try Desmos Graph

Activation Function Comparison

This demo compares how different activation functions approximate a target function. Notice how the parabolic activation \(y = x^2\) may seem to work for sine but fails for other functions: because it is a polynomial, it falls outside the family of activations covered by the universal approximation theorem.
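
A rough way to reproduce this comparison is to fix random hidden-layer weights, fit only the output layer by least squares, and measure the worst-case error for each activation. In the sketch below, the target \(\sin(3x)\), the hidden width of 30 units, and the random-feature setup are assumptions chosen for illustration, not the demo's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target function; the demo's targets are not specified here.
x = np.linspace(-np.pi, np.pi, 500)
y = np.sin(3 * x)

activations = {
    "relu": lambda z: np.maximum(0.0, z),
    "sigmoid": lambda z: 1.0 / (1.0 + np.exp(-z)),
    "parabolic": lambda z: z**2,  # polynomial, so not covered by the UAT
}

n_hidden = 30  # illustrative width

for name, act in activations.items():
    # One hidden layer with random, fixed weights; only the output layer is
    # fit (by least squares), which is enough to compare expressive power.
    w = rng.normal(scale=2.0, size=n_hidden)
    b = rng.uniform(-np.pi, np.pi, size=n_hidden)
    hidden = act(np.outer(x, w) + b)                 # shape (500, n_hidden)
    features = np.column_stack([hidden, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(features, y, rcond=None)
    err = np.max(np.abs(features @ coeffs - y))
    print(f"{name:10s} L_inf error: {err:.3f}")
```

The parabolic case fails for a structural reason: each hidden unit \((wx + b)^2\) is itself a quadratic in \(x\), so no matter how many units are added, the network's output is still just a quadratic, whereas ReLU and sigmoid units keep contributing genuinely new shapes.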

Key Observations

Watch how different activation functions approximate various target functions:

- ReLU & Sigmoid: universal approximators (they can approximate any continuous function on a compact domain to arbitrary accuracy)
- Parabolic: not universal (it may work for specific cases but fails in general)