I have a system $\mathbf A x \approx b$ where the vector $b$ is not in the column space of the matrix $\mathbf A$. I want to use a least squares approach to minimize the distance between $\mathbf Ax$ and $b$.
$$\begin{aligned} \min_x \Vert \mathbf Ax - b\Vert ^2 &= (\mathbf Ax-b)^T (\mathbf Ax-b) \\ &= x^T \mathbf A^T \mathbf A x - (\mathbf Ax)^Tb - b^T \mathbf Ax + b^Tb \\ &= x^T \mathbf A^T \mathbf A x - 2b^T \mathbf Ax + b^Tb \end{aligned}$$
I'm struggling to understand how to arrive at
$$0 = 2 \mathbf A^T \mathbf Ax -2 \mathbf A^T b$$
I understand that a derivative with respect to $x$ has been taken and set to zero in order to minimize the distance, but what does it mean to take the derivative of an expression involving matrix products with respect to a vector? I have never come across such a calculation.
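For what it's worth, I tried to check the expression numerically. This is only a rough sketch, assuming that "derivative with respect to $x$" means the vector of partial derivatives $\partial f / \partial x_i$ of $f(x) = \Vert \mathbf Ax - b\Vert^2$. The random test data, the step size `h`, and the helper names `f` and `numerical_gradient` are my own choices and are not from any reference.

```python
import numpy as np

# Hypothetical test data: a tall matrix, so b is generally not in its column space.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
x = rng.standard_normal(3)

def f(x):
    """Squared distance ||Ax - b||^2."""
    r = A @ x - b
    return r @ r

def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of each partial derivative df/dx_i."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# The expression from the formula I am asking about.
claimed = 2 * A.T @ A @ x - 2 * A.T @ b
numeric = numerical_gradient(f, x)
max_diff = np.max(np.abs(claimed - numeric))
print("largest componentwise difference:", max_diff)

# Setting the expression to zero gives A^T A x = A^T b; compare with NumPy's solver.
x_star = np.linalg.solve(A.T @ A, A.T @ b)
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
print("agrees with np.linalg.lstsq:", np.allclose(x_star, x_lstsq))
```

If I have set this up correctly, the two gradients agree up to rounding error, so the formula seems to be the vector of partial derivatives. What I am missing is how to derive it by hand.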
Thanks in advance.