Universal Approximation Theorem: Interactive Demos
Step Function Approximation
ReLU Network Approximation Visualization
This interactive demo shows how a neural network decomposes functions into ReLU components. The example network uses 5 ReLU neurons to approximate a cubic function.
For a ReLU unit \(\mathrm{ReLU}(wx + b)\), the bias \(b\) determines the activation threshold where the unit "turns on." The output switches from 0 to the linear part \(wx + b\) at the point where \(wx + b = 0\), which places the breakpoint (the kink) at \(x = -b/w\). You can see this in the visualization: \(\mathrm{ReLU}(x - 2)\) activates at \(x = 2\) (since \(-b/w = -(-2)/1 = 2\)), and \(\mathrm{ReLU}(x + 1)\) activates at \(x = -1\) (since \(-b/w = -(+1)/1 = -1\)).
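To make the decomposition concrete, here is a minimal NumPy sketch under assumed parameters (these are not the demo's actual weights): it places five ReLU units at hand-chosen breakpoints \(x = -b/w\), fits the output weights to an example cubic by least squares, and reports the resulting L∞ error.

```python
# Minimal sketch (assumed weights, not the demo's parameters): five ReLU units
# with hand-chosen breakpoints, output weights fit by least squares to a cubic.
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

x = np.linspace(-3, 3, 601)
target = x**3 - 3 * x                            # example cubic (assumed)

breaks = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])   # hand-chosen breakpoints
w = np.ones(5)                                   # hidden weights (hypothetical)
b = -w * breaks                                  # each unit "turns on" at x = -b/w

# Hidden features: one ReLU column per unit, plus a constant column for the output bias.
H = np.column_stack([relu(np.outer(x, w) + b), np.ones_like(x)])
coef, *_ = np.linalg.lstsq(H, target, rcond=None)  # fit output-layer weights
approx = H @ coef

print("breakpoints -b/w:", -b / w)               # [-2. -1.  0.  1.  2.]
print("L_inf error:", np.max(np.abs(approx - target)))
```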
(Interactive ReLU decomposition widget: displays the network architecture and reports the L∞ approximation error.)
External Visualization
Activation Function Comparison
This demo compares how different activation functions approximate a target function. Notice how the parabolic activation (\(y = x^2\)) may seem to work for a sine-like target over a narrow range but fails in general: a one-hidden-layer network \(\sum_i v_i (w_i x + b_i)^2\) is still just a quadratic in \(x\), and the universal approximation theorem requires a non-polynomial activation.
Key Observations
Watch how different activation functions approximate various target functions:

- ReLU & Sigmoid: universal approximators (work for all continuous functions on a compact interval)
- Parabolic: not universal (may work for specific cases but fails generally; see the sketch below)
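As a rough check of these observations, the sketch below uses an assumed setup (fixed random hidden weights with least-squares output weights, and an arbitrary target \(\sin 2x + 0.3x^3\); this is not the demo's implementation). The ReLU and sigmoid errors shrink as hidden units are added, while the parabolic error plateaus at the best quadratic fit, since any sum of \((wx + b)^2\) terms remains a quadratic in \(x\).

```python
# Sketch (assumed setup, not the demo's code): one hidden layer with fixed random
# weights, output weights fit by least squares, compared across activations.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 400)
target = np.sin(2 * x) + 0.3 * x**3              # example target (assumed)

activations = {
    "relu":      lambda z: np.maximum(0.0, z),
    "sigmoid":   lambda z: 1.0 / (1.0 + np.exp(-z)),
    "parabolic": lambda z: z**2,                 # not universal: stays quadratic in x
}

def linf_error(act, n_hidden):
    w = rng.normal(size=n_hidden)                # random hidden weights
    b = rng.uniform(-3.0, 3.0, size=n_hidden)    # random biases
    H = np.column_stack([act(np.outer(x, w) + b), np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(H, target, rcond=None)  # fit output weights only
    return np.max(np.abs(H @ coef - target))

for name, act in activations.items():
    errors = [linf_error(act, n) for n in (5, 20, 100)]
    print(name, ["%.3f" % e for e in errors])    # ReLU/sigmoid improve; parabolic stalls
```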