I am studying for my linear algebra final, and it was suggested that I learn how to derive the normal equation used in solving least squares problems. I have been looking in various places online but haven't had any luck, probably because I don't know the topic well enough to translate what I find into the notation my professor uses. Note: the notation I'm using comes directly from my class, and I don't have a textbook to reference because he doesn't use one.
Here is an example from my class. I'm trying to use it to build a general example from which I could derive the normal equation $A^TA\vec x=A^T\vec b$.
Given the model $N=a_0+a_1t$ and a graph with 4 collected data points of the form $(t_i,n_i)$, I created this error function:
$E_2\begin{bmatrix}a_0\\a_1\end{bmatrix}= (a_0+a_1t_1-n_1)^2+(a_0+a_1t_2-n_2)^2+(a_0+a_1t_3-n_3)^2+(a_0+a_1t_4-n_4)^2=e_1^2+e_2^2+e_3^2+e_4^2=\Vert \vec e\Vert^2= \Vert A\vec x-\vec b\Vert^2 $
The summation I have is: $$\sum_{i=1}^4 e_i^2 = \sum_{i=1}^4 (a_0+a_1t_i-n_i)^2$$
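To connect the sum to the matrix notation, this is how I think the pieces line up for the 4-point example. It is my own reading of the notation, so the layout of $A$, $\vec x$, and $\vec b$ is an assumption rather than something copied from class:

$$A=\begin{bmatrix}1&t_1\\1&t_2\\1&t_3\\1&t_4\end{bmatrix},\quad \vec x=\begin{bmatrix}a_0\\a_1\end{bmatrix},\quad \vec b=\begin{bmatrix}n_1\\n_2\\n_3\\n_4\end{bmatrix},\quad \vec e=A\vec x-\vec b$$

With this choice the $i$-th entry of $\vec e$ is $e_i=a_0+a_1t_i-n_i$, so $\Vert\vec e\Vert^2=\sum_{i=1}^4 e_i^2$.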
Since this is only a specific case, I know I need a general error function to work with in order to derive the normal equation, so for a polynomial of degree $n$ and $m$ data points I created this one:
$$E_2\begin{bmatrix}a_0\\a_1\\\vdots\\a_n\end{bmatrix}=\sum_{i=1}^m e_i^2 = \sum_{i=1}^m (a_0+a_1t_i+a_2t_i^2+\cdots+a_nt_i^n-n_i)^2$$
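As a sanity check that the general error function really is the same thing as $\Vert A\vec x-\vec b\Vert^2$, here is a minimal Python sketch. The data values, the trial coefficients, and the helper names (`design_matrix`, `matvec`, `error_as_sum`, `error_as_norm`) are all made up for illustration and are not from my class. It assumes that row $i$ of $A$ is $[1, t_i, t_i^2, \dots]$.

```python
def design_matrix(t, degree):
    """Row i is [1, t_i, t_i**2, ..., t_i**degree] (assumed layout of A)."""
    return [[t_i ** k for k in range(degree + 1)] for t_i in t]


def matvec(A, x):
    """Plain matrix-vector product A x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def error_as_sum(coeffs, t, n):
    """Error written as an explicit sum of squared residuals e_i**2."""
    total = 0.0
    for t_i, n_i in zip(t, n):
        p = sum(a_k * t_i ** k for k, a_k in enumerate(coeffs))
        total += (p - n_i) ** 2
    return total


def error_as_norm(coeffs, t, n):
    """Same error written as the squared length of A x - b."""
    A = design_matrix(t, degree=len(coeffs) - 1)
    residual = [r - n_i for r, n_i in zip(matvec(A, coeffs), n)]
    return sum(e * e for e in residual)


# Hypothetical data points (t_i, n_i) and arbitrary trial coefficients a_0, a_1.
t = [0.0, 1.0, 2.0, 3.0]
n = [1.0, 2.9, 5.1, 7.2]
trial = [0.5, 2.0]

print(error_as_sum(trial, t, n), error_as_norm(trial, t, n))
```

The two printed numbers should agree up to rounding for any choice of coefficients, which is what makes me think the sum form and the matrix form are interchangeable.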
I'm not even sure that I need this information to do my derivation. What do I need to do?