In the proof of the matrix solution to the least squares problem, I see some matrix calculus that I don't understand. Can anyone explain it to me or recommend a good resource for studying this sort of matrix calculus?
In the least squares method, we want to find a vector $x$ such that $\|Ax-b\|$ is minimized.
Let $r=Ax-b$
$\Rightarrow\|r\|^2=x^TA^TAx-2b^TAx+b^Tb$
$\Rightarrow \nabla_x \|r\|^2=2A^TAx-2A^Tb$
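For reference, here is one way the two steps above could be spelled out. It is a sketch that assumes real-valued $A$, $x$ and $b$, and uses only $\|r\|^2=r^Tr$ and $(Ax)^T=x^TA^T$:

$$
\begin{aligned}
\|r\|^2 &= (Ax-b)^T(Ax-b) \\
&= x^TA^TAx - x^TA^Tb - b^TAx + b^Tb \\
&= x^TA^TAx - 2b^TAx + b^Tb
\end{aligned}
$$

The last line holds because $x^TA^Tb$ is a scalar, so it equals its own transpose $b^TAx$. For the gradient, two standard rules are enough: $\nabla_x(c^Tx)=c$ and $\nabla_x(x^TMx)=(M+M^T)x$. Writing $b^TAx=(A^Tb)^Tx$ gives $c=A^Tb$, and $M=A^TA$ is symmetric, so $(M+M^T)x=2A^TAx$:

$$
\nabla_x\|r\|^2 = 2A^TAx - 2A^Tb
$$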
In the end we set the gradient to zero and solve for the minimizing $x$. I understand the overall idea, but I don't know how the matrix calculus is actually carried out here. For example, can anyone tell me how we got the transposes in $\|r\|^2$ (by what rule?) and how we got the gradient (how exactly do we take a gradient in matrix form)?
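One way to build confidence in the gradient formula $2A^TAx-2A^Tb$ is to compare it with a finite-difference approximation on random data. The sketch below assumes NumPy is available. The names `squared_residual`, `analytic_gradient` and `numeric_gradient` are made up for illustration, and the matrix sizes and random seed are arbitrary.

```python
import numpy as np

# Arbitrary random test problem (sizes and seed are illustrative assumptions).
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)
x = rng.standard_normal(3)


def squared_residual(x):
    """Return ||Ax - b||^2 as a scalar."""
    r = A @ x - b
    return r @ r


def analytic_gradient(x):
    """Gradient from the formula in the question: 2 A^T A x - 2 A^T b."""
    return 2 * A.T @ A @ x - 2 * A.T @ b


def numeric_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g


# Largest entrywise gap between the formula and the numerical estimate.
gap = np.max(np.abs(analytic_gradient(x) - numeric_gradient(squared_residual, x)))
print("max gradient difference:", gap)

# Setting the gradient to zero gives A^T A x = A^T b; solve it directly.
x_star = np.linalg.solve(A.T @ A, A.T @ b)
print("gradient norm at x_star:", np.linalg.norm(analytic_gradient(x_star)))
```

Solving $A^TAx=A^Tb$ directly is used here only to mirror the derivation. It assumes $A$ has full column rank, and in practice a routine such as `np.linalg.lstsq` is usually preferred for numerical stability.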
I'd really appreciate it if you could help me out. Thanks!