In a multivariate linear model, I have come across the following matrix-valued function of $\beta \in \Bbb R^p$:
$$\beta \mapsto(y-X\beta)(y-X\beta)^{T}$$
where the matrix $X \in \Bbb R^{n \times p}$ and the vector $y \in \Bbb R^n$ are given. I need to differentiate it with respect to $\beta$. Can anyone please help me with how to do this?
The other examples I have seen on this site differentiate $(y-X\beta)^{T}(y-X\beta)$, which is a scalar, but here the expression is an $n \times n$ matrix and I am not sure how to handle it. I would also appreciate references or reading material on this kind of matrix-vector differentiation for beginners.
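
To make the question concrete, here is a tentative entrywise formulation. It assumes the derivative is meant component by component, so the result would be an $n \times n \times p$ array rather than a matrix. The symbols $M$, $r$ and $X_{ik}$ are my own notation, and I am not sure this is the standard way to present it. Writing $r = y - X\beta$ and $M(\beta) = r r^{T}$, so that $M_{ij} = r_i r_j$ and $\partial r_i / \partial \beta_k = -X_{ik}$, the product rule would give
$$\frac{\partial M_{ij}}{\partial \beta_k} = -X_{ik}\, r_j - r_i\, X_{jk}, \qquad i,j = 1,\dots,n, \quad k = 1,\dots,p,$$
or, slice by slice, with $X_{\cdot k}$ denoting the $k$-th column of $X$,
$$\frac{\partial M}{\partial \beta_k} = -\left(X_{\cdot k}\, r^{T} + r\, X_{\cdot k}^{T}\right).$$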