I'm working on a problem related to function approximation within the $L^2\left(I_n\right)$ space of square-integrable functions:
Problem Statement:
Given a lemma without proof:
$\textit{Lemma}$: Let $g \in L^2\left(I_n\right)$ be such that $\int_{\mathcal{H}} g(x)\, dx = 0$ for every half-space $\mathcal{H} := \left\{x : w^T x + \theta > 0\right\} \cap I_n$. Then $g = 0$ almost everywhere.
Note that by choosing a suitable value of the parameter $\theta$ (for instance, $\theta$ large enough that $w^T x + \theta > 0$ holds on all of $I_n$), the half-space becomes the entire hypercube. In particular, the hypothesis then gives $\int_{I_n} g(x)\, dx = 0$.
The current task is to show that any function $g \in L^2\left(I_n\right)$ can be approximated, in the $L^2$ norm, by the output of a one-hidden-layer perceptron whose activation function $\sigma(x)$ is the Heaviside step function, defined as: $$ \sigma(x)= \begin{cases}1, & x \geq 0 \\ 0, & x<0\end{cases} $$
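To fix notation, here is a minimal numerical sketch (in Python, with hypothetical parameter names `W`, `theta`, `c`) of the network output in question, $f(x) = \sum_{i} c_i\, \sigma\!\left(w_i^T x + \theta_i\right)$; a single hidden unit computes exactly the indicator of a half-space of the form appearing in the lemma:

```python
import numpy as np

def heaviside(z):
    # sigma(z) = 1 if z >= 0, else 0 (applied elementwise)
    return (np.asarray(z) >= 0).astype(float)

def perceptron(x, W, theta, c):
    """Output of a one-hidden-layer perceptron with Heaviside activation:
       f(x) = sum_i c_i * sigma(w_i^T x + theta_i).
       x: (n,) input, W: (m, n) hidden weights, theta: (m,) biases, c: (m,) output weights."""
    return c @ heaviside(W @ x + theta)

# Example: one unit with w = e_1, theta = -1/2 is the indicator of {x : x_1 >= 1/2}.
W = np.array([[1.0]])
theta = np.array([-0.5])
c = np.array([1.0])
```

So the density question is exactly whether the span of such half-space indicators is dense in $L^2(I_n)$.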
Progress Made So Far:
I am examining a one-hidden-layer perceptron with Heaviside activation for approximating functions in $L^2\left(I_n\right)$. The idea is to build indicator functions of hyperrectangles from intersections of the half-spaces computed by individual neurons, and then to approximate an arbitrary square-integrable function by linear combinations of these indicators.
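In one dimension this step is already explicit: the indicator of an interval is a two-unit network, $1_{[a,b)}(x) = \sigma(x-a) - \sigma(x-b)$. (In $n \geq 2$ dimensions the indicator of a hyperrectangle is a *product* of such terms, not directly a linear combination of half-space indicators, which is part of why the lemma is needed.) A sketch of the one-dimensional identity:

```python
import numpy as np

def sigma(z):
    # Heaviside step: 1 if z >= 0, else 0
    return (np.asarray(z) >= 0).astype(float)

def interval_indicator(x, a, b):
    # 1_{[a, b)}(x) = sigma(x - a) - sigma(x - b): a two-hidden-unit network in 1D
    return sigma(x - a) - sigma(x - b)
```

With $a = 0.2$, $b = 0.8$ this returns $1$ for $x \in [0.2, 0.8)$ and $0$ otherwise.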
However, I'm seeking advice on $\textit{formally}$ proving the density of these approximations in $L^2(I_n)$ and establishing a method for choosing perceptron parameters (weights and biases) that ensures any function $g \in L^2(I_n)$ can be approximated with arbitrary precision.
Any guidance on applying functional analysis or approximation theory principles to support this approximation technique in $\textit{rigorous math}$ would be appreciated.
Are you just asking whether any $L^2$ function can be approximated (in $L^2$ norm) by linear combinations of step functions, i.e. functions of the form $\sum c_i 1_{E_i}$?
– Mar 25 '24 at 05:32