I'm trying to prove (or find a reference for the proof) that, given a metric tensor $g$ on a manifold $M$ and vector fields $X,Y,Z\in \mathfrak{X}(M)$, $$(\mathcal L_Xg)(Y,Z)=g(\nabla_Y X,Z)+g(Y,\nabla_Z X)$$ where $\nabla$ is the Levi-Civita connection. At first, I thought that it would follow easily from the defining properties of the Levi-Civita connection, namely $$[X,Y]=\nabla_XY-\nabla_YX \qquad \text{and} \qquad \mathcal L_X(g(Y,Z))=g(\nabla_X Y,Z)+g(Y,\nabla_X Z).$$ Substituting $\nabla_YX=\nabla_XY+[Y,X]$ and $\nabla_ZX=\nabla_XZ-[X,Z]$, this gives us $$g(\nabla_YX,Z)+g(Y,\nabla_ZX)=g([Y,X],Z)-g(Y,[X,Z])+\mathcal L_X(g(Y,Z))$$ and I was hoping to find some metric version of the Jacobi identity. However, I didn't know how to bring the Lie derivative of the tensor $g$ itself (as opposed to the Lie derivative of the function $g(Y,Z)$) into play.
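For reference, here is a sketch of one way the Lie derivative of the tensor can enter. It assumes the standard Leibniz rule for the Lie derivative of a $(0,2)$-tensor, which is not derived here, and uses $\mathcal L_X(g(Y,Z))=X\big(g(Y,Z)\big)$ for the function $g(Y,Z)$: $$X\big(g(Y,Z)\big)=(\mathcal L_Xg)(Y,Z)+g([X,Y],Z)+g(Y,[X,Z]).$$ Solving for $(\mathcal L_Xg)(Y,Z)$ and inserting metric compatibility and torsion-freeness would then give \begin{align*} (\mathcal L_Xg)(Y,Z)&=g(\nabla_XY,Z)+g(Y,\nabla_XZ)-g(\nabla_XY-\nabla_YX,Z)-g(Y,\nabla_XZ-\nabla_ZX)\\ &=g(\nabla_YX,Z)+g(Y,\nabla_ZX). \end{align*}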
Locally (by brute force), $g=\sum_{i,j=1}^n g_{ij}\,dx_i\otimes dx_j$ and $X=\sum_{i=1}^nX^i \frac{\partial}{\partial x_i}$, and thus \begin{align*} \mathcal L_Xg&=\left.\frac{d}{dt}\right|_{t=0} (\phi^X_t)^*g=\sum_{i,j}\left.\frac{d}{dt}\right|_{t=0} g_{ij}(\phi^X_t)(\phi^X_t)^*(dx_i\otimes dx_j)= \sum_{i,j}\left.\frac{d}{dt}\right|_{t=0} g_{ij}(\phi^X_t)\frac{\partial\phi^X_t}{\partial x_i}\frac{\partial\phi^X_t}{\partial x_j}dx_i\otimes dx_j\\ &=\sum_{i,j,k,l}\delta_{(i,j)}^{(k,l)}\frac{\partial g_{ij}}{\partial x_k}(\phi_0^X)\frac{\partial g_{ij}}{\partial x_l}(\phi_0^X)\frac{\partial\phi^X_0}{\partial x_i}\frac{\partial\phi^X_0}{\partial x_j}\left.\frac{d}{dt}\right|_{t=0} \phi^X_t\, dx_i\otimes dx_j+\sum_{i,j}g_{ij}(\phi_0^X)\left.\frac{d}{dt}\right|_{t=0}\left(\frac{\partial\phi^X_t}{\partial x_i}\frac{\partial\phi^X_t}{\partial x_j}\right)dx_i\otimes dx_j\end{align*} Since $\phi^X_0=\mathrm{Id}$, we obtain \begin{align*} \mathcal L_Xg&=\sum_{i,j}\left(\frac{\partial g_{ij}}{\partial x_i}\frac{\partial g_{ij}}{\partial x_j} \left.\frac{d}{dt}\right|_{t=0} \phi^X_t+g_{ij}\frac{\partial}{\partial x_i}\left.\frac{d}{dt}\right|_{t=0} \phi^X_t+g_{ij}\frac{\partial}{ \partial x_j}\left.\frac{d}{dt}\right|_{t=0} \phi^X_t\right) dx_i\otimes dx_j \end{align*} Now $\left.\frac{d}{dt}\right|_{t=0} \phi^X_t$ should coincide with the vector field $X$, but formally, I don't see how to put a vector field inside something that consists of functions with values in $\mathbb R$. Is this at least the correct way to do this?
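For comparison, here is a sketch of how the coordinate computation is usually organized. The component notation $\phi^k_t:=x_k\circ\phi^X_t$ is an assumption introduced here, not part of the attempt above. Each $\phi^k_t$ is a real-valued function, so the vector field only ever enters through its components: $\left.\frac{d}{dt}\right|_{t=0}\phi^k_t=X^k$. With $(\phi^X_t)^*dx_k=d\phi^k_t=\sum_i\frac{\partial\phi^k_t}{\partial x_i}dx_i$, the pullback reads $$(\phi^X_t)^*g=\sum_{i,j}\left(\sum_{k,l}(g_{kl}\circ\phi^X_t)\frac{\partial\phi^k_t}{\partial x_i}\frac{\partial\phi^l_t}{\partial x_j}\right)dx_i\otimes dx_j.$$ Differentiating at $t=0$ with the product and chain rules, and using $\left.\frac{\partial\phi^k_t}{\partial x_i}\right|_{t=0}=\delta^k_i$ together with $\left.\frac{d}{dt}\right|_{t=0}\frac{\partial\phi^k_t}{\partial x_i}=\frac{\partial X^k}{\partial x_i}$, one would expect \begin{align*} \mathcal L_Xg&=\sum_{i,j}\left(\sum_{k}X^k\frac{\partial g_{ij}}{\partial x_k}+\sum_{k}g_{kj}\frac{\partial X^k}{\partial x_i}+\sum_{k}g_{ik}\frac{\partial X^k}{\partial x_j}\right)dx_i\otimes dx_j. \end{align*}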