How can I calculate the hyperparameter gradients for the Matérn 5/2 kernel?
If the general Matérn kernel is written as
$$ \mathcal{K}_M(r) = \sigma^2\frac{2^{1-\nu}}{\Gamma(\nu)}
\left( \frac{r\sqrt{2\nu}}{\ell} \right)^\nu K_\nu\left( \frac{r\sqrt{2\nu}}{\ell} \right) $$
then setting $\nu = 5/2$ gives the Matérn $5/2$ kernel
$$ \mathcal{K}_{M,5/2}(r) = \sigma^2\left[1 + \frac{r\sqrt{5}}{\ell} + \frac{5r^2}{3\ell^2} \right]
\exp\left(-\frac{r\sqrt{5}}{\ell}\right) $$
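For completeness, here is a sketch of how the $\nu = 5/2$ form falls out of the general expression. The shorthand $z = r\sqrt{5}/\ell$ is introduced here for convenience and is not part of the original notation. It uses the standard half-integer closed form $K_{5/2}(z) = \sqrt{\frac{\pi}{2z}}\,e^{-z}\left(1 + \frac{3}{z} + \frac{3}{z^2}\right)$ and $\Gamma(5/2) = \frac{3\sqrt{\pi}}{4}$:
\begin{align}
\frac{2^{-3/2}}{\Gamma(5/2)}\, z^{5/2} K_{5/2}(z)
&=
\frac{4}{3\sqrt{\pi}\,2^{3/2}}\, z^{5/2} \sqrt{\frac{\pi}{2z}}\, e^{-z}
\left(1 + \frac{3}{z} + \frac{3}{z^2}\right) \\
&=
\frac{z^2}{3}\, e^{-z}
\left(1 + \frac{3}{z} + \frac{3}{z^2}\right) \\
&=
\left(1 + z + \frac{z^2}{3}\right) e^{-z}
\end{align}
Substituting $z = r\sqrt{5}/\ell$ back in gives $z^2/3 = 5r^2/(3\ell^2)$, which matches the bracketed polynomial above.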
The gradients with respect to $\ell$ and $\sigma$ then follow from the product rule:
\begin{align}
\frac{\partial \mathcal{K}_{M,5/2} }{\partial \ell}
&=
\sigma^2\exp\left(-\frac{r\sqrt{5}}{\ell}\right)\frac{\partial }{\partial \ell}
\left[1 + \frac{r\sqrt{5}}{\ell} + \frac{5r^2}{3\ell^2} \right]
+
\sigma^2\left[1 + \frac{r\sqrt{5}}{\ell} + \frac{5r^2}{3\ell^2} \right]
\frac{\partial }{\partial \ell}
\exp\left(-\frac{r\sqrt{5}}{\ell}\right) \\
&=
\sigma^2\exp\left(-\frac{r\sqrt{5}}{\ell}\right)
\left[ \frac{-r\sqrt{5}}{\ell^2} - \frac{10r^2}{3\ell^3} \right]
+
\sigma^2\left[1 + \frac{r\sqrt{5}}{\ell} + \frac{5r^2}{3\ell^2} \right]
\left(\frac{r\sqrt{5}}{\ell^2}\right) \exp\left(-\frac{r\sqrt{5}}{\ell}\right) \\
&=
\sigma^2\exp\left(-\frac{r\sqrt{5}}{\ell}\right)
\left[
\frac{-r\sqrt{5}}{\ell^2} - \frac{10r^2}{3\ell^3}
+
\frac{r\sqrt{5}}{\ell^2} + \frac{5{r^2}}{\ell^3} + \frac{5\sqrt{5}r^3}{3\ell^4}
\right] \\
&=
\sigma^2\exp\left(-\frac{r\sqrt{5}}{\ell}\right)
\left[
\frac{5{r^2}}{3\ell^3} + \frac{5\sqrt{5}r^3}{3\ell^4}
\right] \\
&=
\frac{5{r^2}\sigma^2}{3\ell^3}
\exp\left(-\frac{r\sqrt{5}}{\ell}\right)
\left[
1 + \frac{r\sqrt{5}}{\ell}
\right] \\
\frac{\partial \mathcal{K}_{M,5/2} }{\partial \sigma}
&=
\frac{2}{\sigma}\mathcal{K}_{M,5/2}(r)
\end{align}
Source: Rasmussen and Williams, Gaussian Processes for Machine Learning.
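As a sanity check, the closed-form gradients above can be compared against central finite differences. The following is a minimal sketch, not a reference implementation: the function names (`matern52`, `matern52_grad_ell`, `matern52_grad_sigma`, `central_difference`, `max_gradient_error`), the step size `h`, and the test values of $\ell$, $\sigma$ and $r$ are all illustrative assumptions.

```python
import math


def matern52(r, ell, sigma):
    """Matern 5/2 kernel value for distance r, length scale ell, amplitude sigma."""
    z = math.sqrt(5.0) * r / ell
    return sigma**2 * (1.0 + z + z**2 / 3.0) * math.exp(-z)


def matern52_grad_ell(r, ell, sigma):
    """Closed-form dK/d(ell): (5 r^2 sigma^2 / (3 ell^3)) * exp(-z) * (1 + z)."""
    z = math.sqrt(5.0) * r / ell
    return (5.0 * r**2 * sigma**2) / (3.0 * ell**3) * math.exp(-z) * (1.0 + z)


def matern52_grad_sigma(r, ell, sigma):
    """Closed-form dK/d(sigma): (2 / sigma) * K."""
    return 2.0 / sigma * matern52(r, ell, sigma)


def central_difference(f, x, h=1e-6):
    """Second-order finite-difference estimate of f'(x); h is an assumed step size."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def max_gradient_error(ell=0.8, sigma=1.5, distances=(0.0, 0.1, 0.5, 1.0, 2.0, 5.0)):
    """Largest absolute gap between analytic and numerical gradients over the distances."""
    err_ell = 0.0
    err_sigma = 0.0
    for r in distances:
        # Differentiate numerically in one hyperparameter while holding the other fixed.
        fd_ell = central_difference(lambda l: matern52(r, l, sigma), ell)
        fd_sigma = central_difference(lambda s: matern52(r, ell, s), sigma)
        err_ell = max(err_ell, abs(fd_ell - matern52_grad_ell(r, ell, sigma)))
        err_sigma = max(err_sigma, abs(fd_sigma - matern52_grad_sigma(r, ell, sigma)))
    return err_ell, err_sigma


if __name__ == "__main__":
    err_ell, err_sigma = max_gradient_error()
    print("max |error| in dK/d(ell):  ", err_ell)
    print("max |error| in dK/d(sigma):", err_sigma)
```

Both errors should come out many orders of magnitude smaller than the gradients themselves. A large discrepancy would point to a sign or factor mistake in the derivation or in the code.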