We have a function $p(x, \theta)\in \mathbb{R},\ x\in\mathbb{R},\ \theta\in\mathbb{R}^n$.
We can also view $\theta\mapsto p(\cdot, \theta)$ as a map from $\mathbb{R}^n$ to a Banach space of functions. For example, assuming $p(\cdot, \theta)\in \mathbb{L}^1(\mathbb{R})$ for each $\theta$, we have that $\hat p:\theta\mapsto p(\cdot, \theta)$ is a map from $\mathbb{R}^n$ to $\mathbb{L}^1(\mathbb{R})$.
My question is about the relation of the ordinary partial derivative $\frac{\partial p(x, \theta)}{\partial\theta}$ (assuming each component $\frac{\partial p(\cdot, \theta)}{\partial\theta_i}$ is in $\mathbb{L}^1(\mathbb{R})$) to the Fréchet derivative of $\hat p:\theta\mapsto p(\cdot, \theta)$. When are they the same?
Addition after the comments of Zerox:
For the ordinary partial derivative we have, for each $x$, $$ \lim_{d\theta\to0}\frac{\left|p(x,\theta+d\theta)-p(x, \theta)-\frac{\partial p(x, \theta)}{\partial\theta}\cdot d\theta\right|}{\Vert d\theta \Vert}=0. $$ For the Fréchet derivative we have a different kind of convergence, $$\lim_{d\theta\to0}\frac{\Vert p(\cdot,\theta+d\theta)-p(\cdot, \theta)-L(d\theta)\Vert_{\mathbb{L}^1}}{\Vert d\theta\Vert}=0,$$ where $L$ is a continuous linear map from $\mathbb{R}^n$ to $\mathbb{L}^1(\mathbb{R})$.
The linear map $d\theta\mapsto\frac{\partial p(\cdot, \theta)}{\partial\theta}\cdot d\theta$ from $\mathbb{R}^n$ to $\mathbb{L}^1(\mathbb{R})$ is continuous in the Banach norm: $$\left\|\frac{\partial p(\cdot, \theta)}{\partial\theta}\cdot d\theta\right\|_{\mathbb{L}^1}=\left\|\sum_i\frac{\partial p(\cdot, \theta)}{\partial\theta_i}d\theta_i\right\|_{\mathbb{L}^1}\leq\sum_i\left\Vert\frac{\partial p(\cdot, \theta)}{\partial\theta_i}\right\Vert_{\mathbb{L}^1}|d\theta_i|\leq\max_i\left\Vert\frac{\partial p(\cdot, \theta)}{\partial\theta_i}\right\Vert_{\mathbb{L}^1}\Vert d\theta\Vert_{1},$$ where $\Vert d\theta\Vert_1=\sum_i|d\theta_i|$; since all norms on $\mathbb{R}^n$ are equivalent, this gives continuity with respect to any norm on $\mathbb{R}^n$.
So this map is a natural candidate for $L$, but I don't see why (or when) the two derivatives have to coincide: pointwise convergence (in $x$) doesn't imply $\mathbb{L}^1$ convergence, and vice versa.
Of course, in the topology of pointwise convergence on the function space the two would coincide, but I am interested in $\mathbb{L}^1$ convergence.
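To make that last remark concrete, here are the standard textbook examples (not specific to this question): the escaping bump $$f_m=\mathbf 1_{[m,m+1]}\qquad\text{satisfies}\qquad f_m(x)\to0\ \forall x\ \text{but}\ \Vert f_m\Vert_{\mathbb{L}^1}=1\ \forall m,$$ while the "typewriter" sequence $\mathbf 1_{[0,1]},\ \mathbf 1_{[0,1/2]},\ \mathbf 1_{[1/2,1]},\ \mathbf 1_{[0,1/4]},\ \mathbf 1_{[1/4,1/2]},\dots$ converges to $0$ in $\mathbb{L}^1$ but converges at no point of $[0,1]$.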
A special case of parametrized probability density functions:
If there are no convenient conditions for the general case, then maybe there are some for this case? $$\int p(x,\theta)dx=1,\ p(x,\theta)\geq0\ \forall\theta$$
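For instance (just a concrete example to keep in mind, not part of the question itself), the Gaussian location family with $n=1$, $$p(x,\theta)=\frac{1}{\sqrt{2\pi}}e^{-(x-\theta)^2/2},\qquad \frac{\partial p(x,\theta)}{\partial\theta}=\frac{x-\theta}{\sqrt{2\pi}}e^{-(x-\theta)^2/2},$$ belongs to this class, and for $|\theta'-\theta|\le1$ its partial derivative is dominated by the integrable envelope $g(x)=\frac{1}{\sqrt{2\pi}}(|x-\theta|+1)e^{-\max(|x-\theta|-1,\,0)^2/2}$, so the dominated case treated below applies to it.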
Upd: deleted the incorrect application of Scheffé's lemma.
The case of dominated partial derivatives:
Assume that $p(x,\cdot)$ is differentiable in a neighbourhood of $\theta$ and that $\left|\frac{\partial p(x, \theta')}{\partial\theta_i}\right|\leq g(x)$ for all $i$ and all $\theta'$ in that neighbourhood, with $g\in \mathbb{L}^1(\mathbb{R})$. Then by the multivariate MVT (applied along the segment from $\theta$ to $\theta+d\theta$, with $c$ a point on that segment) $$ \begin{split} \frac{\left|p(x,\theta+d\theta)-p(x, \theta)-\frac{\partial p(x, \theta)}{\partial\theta}\cdot d\theta\right|}{\Vert d\theta\Vert}\leq&\frac{\left|\frac{\partial p(x, c)}{\partial\theta}\cdot d\theta\right|+\left|\frac{\partial p(x, \theta)}{\partial\theta}\cdot d\theta\right|}{\Vert d\theta\Vert}\\ \leq&\left\Vert\frac{\partial p(x, c)}{\partial\theta}\right\Vert+\left\Vert\frac{\partial p(x, \theta)}{\partial\theta}\right\Vert\leq2\sqrt n\, g(x), \end{split} $$ where the second step is Cauchy–Schwarz and $\Vert\cdot\Vert$ denotes the Euclidean norm on $\mathbb{R}^n$. So the whole expression is dominated by an $\mathbb{L}^1(\mathbb{R})$ function.
By the dominated convergence theorem the expression above converges to $0$ in $\mathbb{L}^1$ as $d\theta\to0$ (pointwise in $x$ it converges to $0$ by the definition of the ordinary derivative), so $\hat p$ is Fréchet differentiable at $\theta$ with $L(d\theta)=\frac{\partial p(\cdot, \theta)}{\partial\theta}\cdot d\theta$.
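Here is a small numerical sanity check of this convergence for the Gaussian location family from the example above (my own illustration, assuming `numpy` and `scipy` are available): the $\mathbb{L}^1$ norm of the Fréchet remainder, divided by $|d\theta|$, should go to $0$.

```python
import numpy as np
from scipy.integrate import quad

# Example family (n = 1): Gaussian location model p(x, theta) = N(x; theta, 1).
def p(x, theta):
    return np.exp(-0.5 * (x - theta) ** 2) / np.sqrt(2.0 * np.pi)

def dp_dtheta(x, theta):
    return (x - theta) * p(x, theta)

theta = 0.3
for dtheta in (1e-1, 1e-2, 1e-3):
    # L^1 norm of the remainder p(., theta + dtheta) - p(., theta) - dp/dtheta(., theta) * dtheta
    integrand = lambda x: abs(p(x, theta + dtheta) - p(x, theta) - dp_dtheta(x, theta) * dtheta)
    l1_norm, _ = quad(integrand, -np.inf, np.inf)
    print(f"dtheta = {dtheta:.0e}:  remainder / |dtheta| = {l1_norm / dtheta:.3e}")
```

The printed ratios shrink roughly linearly in $d\theta$, which is what one expects here since for this family the second $\theta$-derivative is also dominated by an integrable function.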
If both derivatives exist, they have to be equal:
One more observation. For any sequence $\mathbb R \ni h_m\to0$, the existence of the Fréchet derivative implies the $\mathbb L^1$ convergence to $0$ of the corresponding difference quotient, for each $i=1,\cdots,n$: $$ \left\Vert\frac{p(\cdot,\theta+e_ih_m)-p(\cdot, \theta)}{h_m}-L(e_i)\right\Vert_{\mathbb{L}^1}\to0,$$ where $e_i$ is the $i$th vector of the standard basis.
Then there exists a subsequence of $h_m$, call it $\mathbb R \ni g_l\to0$, along which the expression above converges pointwise to $0$ a.e.: $$\left|\frac{p(x,\theta+e_ig_l)-p(x, \theta)}{g_l}-L(e_i)(x)\right|\to0\ \textbf{a.e.}$$ But for the ordinary partial derivative we also have, for every $x$, $$\left|\frac{p(x,\theta+e_ig_l)-p(x, \theta)}{g_l}-\frac{\partial p(x, \theta)}{\partial\theta_i}\right|\to0.$$
So $L(e_i)=\frac{\partial p(\cdot, \theta)}{\partial\theta_i}$ a.e., and thus $L(d\theta)=\frac{\partial p(\cdot, \theta)}{\partial\theta}\cdot d\theta$.
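So, to summarize the two observations above: if $p(x,\cdot)$ is differentiable near $\theta$ with partial derivatives dominated by an integrable $g$, then $\hat p$ is Fréchet differentiable at $\theta$ with $$L(d\theta)=\frac{\partial p(\cdot,\theta)}{\partial\theta}\cdot d\theta,$$ and whenever both the Fréchet derivative and the ordinary partial derivatives exist, they necessarily agree (a.e. in $x$).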
Additions, corrections, and indications of errors are still welcome.