Lagrange multipliers can be used to minimize or maximize a multivariable function $f$ subject to one or more constraints. Dedner et al. used the technique of generalized Lagrange multipliers (GLM) to minimize the divergence of the magnetic field $\vec{B}(x,y,z)$ in the ideal MHD equations.
They transformed the equations \begin{equation} \begin{array}{rcl} \displaystyle \nabla\cdot\vec{B} &=& 0 \,,\\ \displaystyle \frac{\partial}{\partial t}\vec{B} &=& \nabla\times\left(\vec{u}\times\vec{B}\right)\, , \end{array}\tag{1} \end{equation} into the form \begin{equation} \begin{array}{rcl} \displaystyle \nabla\cdot\vec{B} + \mathcal{D}(\psi) &=& 0 \,,\\ \displaystyle \frac{\partial}{\partial t}\vec{B} + \nabla \psi &=& \nabla\times\left(\vec{u}\times\vec{B}\right)\, . \end{array}\tag{2} \end{equation}
Suitable choices for the differential operator $\mathcal{D}$ acting on the new scalar field $\psi$ apparently lead to a scheme that removes the divergence of $\vec{B}$, e.g. \begin{equation}\tag{3} \mathcal{D}(\psi) = \frac{1}{c^2}\frac{\partial}{\partial t}\psi\, , \end{equation} with another newly introduced parameter $c$.
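For what it is worth, here is as far as I get on my own. This is only a sketch, and it assumes that $\vec{B}$, $\vec{u}$ and $\psi$ are smooth enough for the derivatives to commute and that $\mathcal{D}$ is linear. Taking the divergence of the second line of (2) and using $\nabla\cdot(\nabla\times\vec{a}) = 0$ for any smooth vector field $\vec{a}$ gives \begin{equation} \frac{\partial}{\partial t}\left(\nabla\cdot\vec{B}\right) + \Delta\psi = 0 \, . \end{equation} Inserting $\nabla\cdot\vec{B} = -\mathcal{D}(\psi)$ from the first line of (2) then yields \begin{equation}\tag{$\ast$} \frac{\partial}{\partial t}\mathcal{D}(\psi) = \Delta\psi \, . \end{equation} So the choice of $\mathcal{D}$ fixes how $\psi$, and with it $\nabla\cdot\vec{B}$, evolves in time. If $\mathcal{D}(\psi)$ is proportional to $\frac{1}{c^2}\frac{\partial}{\partial t}\psi$, then ($\ast$) is second order in time and $c$ has the units of a velocity; with the sign for which it reads $\frac{\partial^2}{\partial t^2}\psi = c^2\Delta\psi$ it is a wave equation and $c$ would be the propagation speed. This only tells me what (2) implies, not why (2) is a legitimate replacement for (1).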
My question is not about the physics involved but about the mathematical background: why is it allowed to do this? I would really like to understand the underlying math to get a better picture of what is happening here. However, I have never seen Lagrange multipliers used like this. It is entirely unclear to me how one gets from (1) to (2), and why yet another parameter, $c$, has to be introduced in (3).