Let us assume I am working with a dataset of black-and-white dog images.
Each image is of size $28 \times 28$.
Now, I can say that I have a sample space $S$ of all possible images, and $p_{data}$ is the probability distribution of dog images; every image that is not a dog gets probability zero. Since each image consists of $28 \times 28$ binary pixels, $n(S) = 2^{28 \times 28}$.
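As a quick sanity check of that count (assuming strictly binary pixels, and using Python purely for the arithmetic):

```python
# Size of the sample space S of binary 28x28 images
# (assuming each pixel takes one of two values, black or white).
n_pixels = 28 * 28          # 784 pixels per image
n_S = 2 ** n_pixels         # n(S) = 2^784
print(len(str(n_S)))        # 237 decimal digits, i.e. n(S) is on the order of 10^236
```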
Now, I want to design a generative model that samples from $S$ according to $p_{data}$, rather than sampling uniformly at random.
My generative model is a neural network that takes random noise (say, a vector of length 100) and generates an image of size $28 \times 28$. So my network is learning a function $f$, which is a totally different object from $p_{data}$: $f$ maps $\mathbb{R}^{100}$ to $S$, whereas $p_{data}$ maps $S$ to $[0, 1]$.
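For concreteness, here is a minimal sketch of such an $f$ (the architecture, layer sizes, and use of PyTorch are my own illustrative choices, not taken from any paper):

```python
import torch
import torch.nn as nn

# A minimal sketch of f: R^100 -> S. The architecture (one hidden layer,
# sigmoid output) is only illustrative; any network with this
# input/output shape would do.
class Generator(nn.Module):
    def __init__(self, noise_dim=100, img_size=28):
        super().__init__()
        self.img_size = img_size
        self.net = nn.Sequential(
            nn.Linear(noise_dim, 256),
            nn.ReLU(),
            nn.Linear(256, img_size * img_size),
            nn.Sigmoid(),              # pixel values in [0, 1]
        )

    def forward(self, z):
        return self.net(z).view(-1, self.img_size, self.img_size)

f = Generator()
z = torch.randn(1, 100)    # random noise of length 100
x = f(z)                   # a 28x28 "image": f maps R^100 into image space
```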
In the literature, I often read phrases like "our generative model learned $p_{data}$" or "our goal is to estimate $p_{data}$", but in fact they are learning $f$, which merely obeys $p_{data}$ when producing its outputs.
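To make the distinction I am describing concrete, here is a tiny sketch of the two objects side by side (the stand-in $f$ below is a made-up mapping; only its signature matters):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((28 * 28, 100))

def f(z):
    # Stand-in for a trained generator f: R^100 -> S
    # (a fixed random linear map plus a threshold, purely illustrative).
    return (W @ z > 0).astype(int).reshape(28, 28)

z = rng.standard_normal(100)   # noise vector in R^100
x = f(z)                       # an image in S: a 28x28 binary array

# p_data, by contrast, is a map from S to [0, 1]. It is never evaluated
# while sampling; the model only "obeys" it in the sense that the
# distribution of f(z) is supposed to match p_data after training.
```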
Am I going wrong anywhere, or is the usage in the literature somewhat loose?