Let's start with the following question. Suppose
$$
\Sigma:=\left\{x\in\Bbb R^m \mid f(x)=0\in\Bbb R^q\right\},\qquad f\in\mathscr C^1(\Bbb R^m,\Bbb R^q),\quad 1\le q<m,
$$
and there is a scalar field, the so-called "goal function", $\theta:\Sigma\to\Bbb R$ on $\Sigma$. We seek $x_{*}$ such that
$$\theta (x_*)=\sup_{x\in\Sigma}\theta(x)\quad\text{or}\quad\inf_{x\in\Sigma}\theta(x)$$
Now we apply the Implicit Function Theorem. We split $x\in\Bbb R^m$ into two parts $(\tilde{x},\hat{x})\in\Bbb R^p\times\Bbb R^q$, where $p+q=m$, so that the constraint reads $f(\tilde{x},\hat{x})=0\in\Bbb R^q$. Suppose that for every $x=(\tilde{x},\hat{x})\in \Sigma$ (relabeling the coordinates if necessary) the $q\times q$ Jacobian block $\partial f/\partial\hat{x}$ is invertible, i.e. ($D$ denotes the Jacobian matrix)
\begin{equation}
\det \big((D_{\hat{x}}f)(x)\big)\ne 0
\end{equation}
Then, at least locally (in general $\Sigma$ is covered by several such domains, and we work within one of them), there exists an open parameter domain $U_{\Sigma}\subset\Bbb R^p$ and an implicit function $\xi$:
\begin{equation}
\xi:U_{\Sigma}\ni\tilde{x}\mapsto\xi(\tilde{x})\in \Bbb R^q
\end{equation}
which is determined by the constraint $f(\tilde{x},\xi(\tilde{x}))=0\in\Bbb R^q$, or equivalently $(\tilde{x},\xi(\tilde{x}))\in\Sigma$.
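As a concrete illustration (this example is mine, not part of the setup above), take $m=2$, $q=1$, $p=1$ and $f(x_1,x_2)=x_1^2+x_2^2-1$, so that $\Sigma$ is the unit circle. Near any point with $x_2>0$ we have $\partial f/\partial\hat{x}=2x_2\ne 0$, and the implicit function is $\xi(x_1)=\sqrt{1-x_1^2}$ on $U_\Sigma=(-1,1)$. A minimal numeric sketch:

```python
import math

# Assumed example: f(x1, x2) = x1^2 + x2^2 - 1 (unit circle), m = 2, q = 1, p = 1.
def f(x1, x2):
    return x1**2 + x2**2 - 1.0

# Implicit function on U_Sigma = (-1, 1), upper branch: near any point with
# x2 > 0 we have D_x^ f = 2*x2 != 0, so the Implicit Function Theorem applies.
def xi(x1):
    return math.sqrt(1.0 - x1**2)

# Verify the defining property f(x1, xi(x1)) = 0 on the parameter domain.
for t in (-0.9, -0.5, 0.0, 0.5, 0.9):
    assert abs(f(t, xi(t))) < 1e-12
```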
Here comes the key part: we recognize $\Sigma$ as a "hypersurface" sitting inside $\Bbb R^{p+q}$ and parameterized by points of $\Bbb R^p$, and we parameterize it with the aid of the implicit function: regarding $U_{\Sigma}$ as the parameter domain for the hypersurface $\Sigma$, we can immediately define the parametrization map $\sigma$ of $\Sigma$ as
\begin{equation}\sigma:U_{\Sigma}\ni\tilde{x}\mapsto \sigma(\tilde{x}):=(\tilde{x},\xi(\tilde{x}))\in\Sigma\subset \Bbb R^m\end{equation}
Hence we can rewrite the "goal function" $\theta(x)$ as
\begin{equation}\Theta:U_{\Sigma}\ni\tilde{x}\mapsto \Theta(\tilde{x}):=(\theta\circ\sigma)(\tilde{x})\in\Bbb R\end{equation}
The significant difference between the original form $\theta(x)$ and the rewritten form $\Theta(\tilde{x})$ is that the latter is defined directly on an open domain $U_{\Sigma}$, "freed" from any constraint. Therefore, to seek local extrema of $\Theta(\tilde{x})$, all we have to do is set
\begin{equation}
(D\Theta)(\tilde{x})=\big(D(\theta\circ\sigma)\big)(\tilde{x})=0\in\Bbb R^{1\times p}
\end{equation}
By the Chain Rule, we have
$$
\big(D(\theta\circ\sigma)\big)(\tilde{x})=(D\theta)(\sigma(\tilde{x}))\,(D\sigma)(\tilde{x})
$$
Note that $\sigma(\tilde{x})=(\tilde{x},\xi(\tilde{x}))$ and hence $$(D\theta)(\cdot)=\left[(D_{\tilde{x}}\theta)(\cdot),(D_{\hat{x}}\theta)(\cdot)\right]$$
and ($I_p$ denotes the $p\times p$ identity matrix)
$$(D\sigma)(\tilde{x})=\begin{bmatrix}
I_p \\ (D\xi)(\tilde{x})
\end{bmatrix}$$
so that, writing $x=\sigma(\tilde{x})$,
\begin{equation}
\big(D(\theta\circ\sigma)\big)(\tilde{x})=(D_{\tilde{x}}\theta)(x)+(D_{\hat{x}}\theta)(x)(D\xi)(\tilde{x})=0\in\Bbb R^{1\times p}
\end{equation}
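Continuing the assumed circle example with the goal function $\theta(x_1,x_2)=x_1+x_2$: here $\Theta(x_1)=x_1+\sqrt{1-x_1^2}$, $(D_{\tilde{x}}\theta)=1$, $(D_{\hat{x}}\theta)=1$, and $(D\xi)(x_1)=-x_1/\sqrt{1-x_1^2}$, so the block formula above agrees with differentiating $\Theta$ directly:

```python
import math

# Assumed example: theta(x1, x2) = x1 + x2 on the unit circle,
# xi(x1) = sqrt(1 - x1^2), hence Theta(x1) = x1 + sqrt(1 - x1^2).

def DTheta_direct(x1):
    # Differentiate Theta(x1) = x1 + sqrt(1 - x1^2) directly.
    return 1.0 - x1 / math.sqrt(1.0 - x1**2)

def DTheta_blocks(x1):
    # D_x~ theta + D_x^ theta * (D xi)(x1), with (D xi)(x1) = -x1/sqrt(1 - x1^2).
    Dxi = -x1 / math.sqrt(1.0 - x1**2)
    return 1.0 + 1.0 * Dxi

# The two computations agree on the parameter domain.
for t in (-0.5, 0.0, 0.3, 0.6):
    assert abs(DTheta_direct(t) - DTheta_blocks(t)) < 1e-12

# (D Theta)(x1) = 0 at x1 = 1/sqrt(2): an unconstrained stationary point of Theta.
assert abs(DTheta_direct(1.0 / math.sqrt(2.0))) < 1e-12
```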
Again, aided by the Implicit Function Theorem (differentiate $f(\tilde{x},\xi(\tilde{x}))=0$ and solve for $D\xi$), we have
\begin{equation}
(D\xi)(\tilde{x})=-(D_{\hat{x}}f)^{-1}(x)(D_{\tilde{x}}f)(x)
\end{equation}
Plugging this into the previous equation, we obtain
\begin{equation}(D_{\tilde{x}}\theta)(x)-(D_{\hat{x}}\theta)(x)(D_{\hat{x}}f)^{-1}(x)(D_{\tilde{x}}f)(x)=0\in\Bbb R^{1\times p}\end{equation}
Together with the constraint $f(x)=0\in\Bbb R^q$, we have
$$
\left\{
\begin{array}{l}
(D_{\tilde{x}}\theta)(x)-(D_{\hat{x}}\theta)(x)(D_{\hat{x}}f)^{-1}(x)(D_{\tilde{x}}f)(x)=0\in\Bbb R^{1\times p}\\
f(x)=0\in\Bbb R^q
\end{array}
\right.
$$
Provided that $\Sigma$ is compact, so that extrema are guaranteed to exist, these $m$ equations in the $m$ unknowns $x$ determine all the candidate points $x_*$ that are not located on $\partial\Sigma$. This, in my opinion, is the intrinsic form of the so-called Lagrange multiplier method.
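For the same assumed circle example ($\theta=x_1+x_2$, $f=x_1^2+x_2^2-1$), the intrinsic system reads $1-\frac{2x_1}{2x_2}=0$ together with $x_1^2+x_2^2-1=0$, whose solutions are $x_*=\pm(1/\sqrt2,1/\sqrt2)$. A quick numeric check:

```python
import math

# Assumed example: theta = x1 + x2, f = x1^2 + x2^2 - 1, split x~ = x1, x^ = x2.
def intrinsic_eq(x1, x2):
    # D_x~ theta - D_x^ theta * (D_x^ f)^{-1} * (D_x~ f)
    return 1.0 - 1.0 * (2.0 * x1) / (2.0 * x2)

def constraint(x1, x2):
    return x1**2 + x2**2 - 1.0

# Both candidates +/-(1/sqrt(2), 1/sqrt(2)) solve the m = 2 equations;
# the '+' point is the sup of theta on Sigma, the '-' point the inf.
c = 1.0 / math.sqrt(2.0)
for x1, x2 in ((c, c), (-c, -c)):
    assert abs(intrinsic_eq(x1, x2)) < 1e-12
    assert abs(constraint(x1, x2)) < 1e-12
```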
To see how the common Lagrange function "coincides" with this form, let
$$L(x,\lambda):\Bbb R^m\times\Bbb R^q\ni (x,\lambda)\mapsto L(x,\lambda):=\theta(x)+\lambda^Tf(x)\in\Bbb R$$
Differentiating $L$ and setting the derivative to zero, we get
\begin{align*}
(DL)(x,\lambda)&=(DL)(\tilde{x},\hat{x},\lambda)=\left[(D_{\tilde{x}}L),(D_{\hat{x}}L),(D_{\lambda}L)\right](x,\lambda)\\
&=\left[(D_{\tilde{x}}\theta)(x)+\lambda^T(D_{\tilde{x}}f)(x),(D_{\hat{x}}\theta)(x)+\lambda^T(D_{\hat{x}}f)(x),(f(x))^T\right]\\
&=\left[0\in\Bbb R^{1\times p},0\in\Bbb R^{1\times q},0\in\Bbb R^{1\times q}\right]
\end{align*}
from which it follows that
$$(D_{\hat{x}}\theta)(x)+\lambda^T(D_{\hat{x}}f)(x)=0\in\Bbb R^{1\times q}\implies \lambda^T=-(D_{\hat{x}}\theta)(x)(D_{\hat{x}}f)^{-1}(x)\in\Bbb R^{1\times q}$$
Plugging this into
$$(D_{\tilde{x}}\theta)(x)+\lambda^T(D_{\tilde{x}}f)(x)=0\in\Bbb R^{1\times p}$$
we obtain
$$(D_{\tilde{x}}\theta)(x)-(D_{\hat{x}}\theta)(x)(D_{\hat{x}}f)^{-1}(x)(D_{\tilde{x}}f)(x)=0\in\Bbb R^{1\times p}$$
together with $f(x)=0\in\Bbb R^{q}$, we have returned to the $m$ equations of the intrinsic form.
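For the same assumed example, the Lagrange route recovers the multiplier $\lambda=-1/(2x_2)$ from the $\hat{x}$-block and then reproduces the intrinsic equation in the $\tilde{x}$-block:

```python
import math

# Assumed example: theta = x1 + x2, f = x1^2 + x2^2 - 1, q = 1,
# and L(x, lam) = theta(x) + lam * f(x).
c = 1.0 / math.sqrt(2.0)      # candidate x* = (c, c) found via the intrinsic form
lam = -1.0 / (2.0 * c)        # lam^T = -(D_x^ theta)(D_x^ f)^{-1} = -1/(2*x2)

assert abs(1.0 + lam * 2.0 * c) < 1e-12   # D_x~ L = 0 (the intrinsic equation)
assert abs(1.0 + lam * 2.0 * c) < 1e-12   # D_x^ L = 0 (defines lam; here x1 = x2)
assert abs(c**2 + c**2 - 1.0) < 1e-12     # D_lam L = f(x) = 0 (the constraint)
```

The three blocks of $DL=0$ are exactly the $m+q$ equations above: the $\hat{x}$-block only serves to eliminate $\lambda$.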