The Gram-Schmidt orthogonalization of a linearly independent set $S=\lbrace v_1,v_2,\dots,v_p \rbrace$ (assumed finite for convenience) is given by $u_1=v_1$ and, for $k>1$, $$u_k=v_k-\sum_{i=1}^{k-1}\frac{\langle v_k,u_i \rangle}{\lvert u_i \rvert^2}\,u_i.$$
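As a quick sanity check on the recurrence (not part of the proof), here is a minimal NumPy sketch; the function name `gram_schmidt` and the three test vectors are just placeholders I've chosen, and the point is only that the pairwise inner products of the resulting $u_k$ come out numerically zero:

```python
import numpy as np

def gram_schmidt(vectors):
    """Return the (unnormalized) orthogonal vectors u_1, ..., u_p."""
    us = []
    for v in vectors:
        # u_k = v_k - sum_{i<k} (<v_k, u_i> / |u_i|^2) u_i
        u = v - sum(((v @ u_i) / (u_i @ u_i)) * u_i for u_i in us)
        us.append(u)
    return us

# arbitrary linearly independent test vectors
vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
us = gram_schmidt(vs)

# every pairwise inner product <u_i, u_j>, i < j, should be (numerically) zero
for i in range(len(us)):
    for j in range(i + 1, len(us)):
        print(f"<u_{i+1}, u_{j+1}> = {us[i] @ us[j]:.2e}")
```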
Many of the proofs of the Gram-Schmidt orthogonalization that I've seen have been quite confusing, save perhaps for the one here. I liked the proof in that document, but was confused by the inductive hypothesis on page 3. The author assumes that $u_1,u_2,\dots,u_{k-2}$ are all orthogonal to $u_{k-1}$, then sets out to prove that $u_1,u_2,\dots,u_{k-2}$ are also all orthogonal to $u_k$. That seems fine; however, in the inductive step they further assume that $u_1,u_2,\dots,u_{k-2}$ are mutually orthogonal, which (to my knowledge) wasn't stated in the inductive hypothesis.
Below is my attempt at a proof using strong induction.
Define a predicate $\varphi(k)$ in one variable $k\in\left\{ 1,2,\dots,p\right\} $ as follows: $$ \varphi(k)=\forall(m<k).\left\langle u_{m},u_{k}\right\rangle = 0. $$ In words, $\varphi(k)$ means that $u_{k}$ is orthogonal to each preceding $u_{m}$. Clearly we have $\varphi(1)$, since $m\nless1$ for $m\in\left\{ 1,2,\dots,p\right\} $; we also have $\varphi(2)$, since $u_{1}\perp u_{2}$ (a quick direct computation, shown below), as well as $\varphi(3)$, since $u_{1}\perp u_{3}$ and $u_{2}\perp u_{3}$.
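For instance, $\varphi(2)$ follows directly from the definition of $u_2$ (here, as in the proof below, I am assuming a real inner product, so that $\left\langle x,y\right\rangle =\left\langle y,x\right\rangle$): $$\left\langle u_{1},u_{2}\right\rangle =\left\langle u_{1},v_{2}-\frac{\left\langle v_{2},u_{1}\right\rangle }{\left|u_{1}\right|^{2}}u_{1}\right\rangle =\left\langle u_{1},v_{2}\right\rangle -\frac{\left\langle v_{2},u_{1}\right\rangle }{\left|u_{1}\right|^{2}}\left\langle u_{1},u_{1}\right\rangle =\left\langle u_{1},v_{2}\right\rangle -\left\langle v_{2},u_{1}\right\rangle =0,$$ and $\varphi(3)$ follows from two similar computations (the check that $u_{2}\perp u_{3}$ also uses $u_{1}\perp u_{2}$).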
Claim: We have $\varphi(j)$ for all $j\in\left\{ 1,2,\dots,p\right\} =: I$. Equivalently, we have $u_{i}\perp u_{j}$ for all $i\neq j$ in $I$.
Proof: As stated above, we already have $\varphi(1)$, $\varphi(2)$ and $\varphi(3)$. Now suppose there is some $j\in I$ such that $\varphi(k)$ holds for all $k<j$ in $I$; that is, for all $k<j$ and all $m<k$, we have $\left\langle u_{m},u_{k}\right\rangle =0$. Fix an arbitrary $m<j$ and consider $\left\langle u_{m},u_{j}\right\rangle$. Letting $J:=\left\{ 1,2,\dots,j\right\} $, so that $J\setminus\left\{ j,m\right\} =\left\{ 1,2,\dots,m-1,m+1,\dots,j-1\right\}$, we have \begin{align*} \left\langle u_{m},u_{j}\right\rangle & =\left\langle u_{m},v_{j}-\sum_{i=1}^{j-1}\frac{\left\langle v_{j},u_{i}\right\rangle }{\left|u_{i}\right|^{2}}u_{i}\right\rangle =\left\langle u_{m},v_{j}\right\rangle -\sum_{i=1}^{j-1}\frac{\left\langle v_{j},u_{i}\right\rangle \left\langle u_{m},u_{i}\right\rangle }{\left|u_{i}\right|^{2}}\\ & =\left\langle u_{m},v_{j}\right\rangle -\sum_{i\in J\setminus\left\{ j\right\} }\frac{\left\langle v_{j},u_{i}\right\rangle \left\langle u_{m},u_{i}\right\rangle }{\left|u_{i}\right|^{2}}\\ & =\left\langle u_{m},v_{j}\right\rangle -\left\langle v_{j},u_{m}\right\rangle -\sum_{i\in J\setminus\left\{ j,m\right\} }\frac{\left\langle v_{j},u_{i}\right\rangle \left\langle u_{m},u_{i}\right\rangle }{\left|u_{i}\right|^{2}}\\ & =-\sum_{i\in J\setminus\left\{ j,m\right\} }\frac{\left\langle v_{j},u_{i}\right\rangle \left\langle u_{m},u_{i}\right\rangle }{\left|u_{i}\right|^{2}}, \end{align*} where the third line extracts the $i=m$ term (note $\left\langle u_{m},u_{m}\right\rangle =\left|u_{m}\right|^{2}$) and the last line uses the symmetry $\left\langle u_{m},v_{j}\right\rangle =\left\langle v_{j},u_{m}\right\rangle $ of the real inner product. Inside the remaining sum we have $i<j$ and $i\neq m$. If $i>m$, then $\left\langle u_{m},u_{i}\right\rangle =0$ by $\varphi(i)$; if $i<m$, then $\left\langle u_{i},u_{m}\right\rangle =0$ by $\varphi(m)$ (recall $m<j$), hence $\left\langle u_{m},u_{i}\right\rangle =0$ by symmetry. Either way the inductive hypothesis kills every term of the sum, so $\left\langle u_{m},u_{j}\right\rangle =0$. Since $m<j$ was arbitrary, this gives $\varphi(j)$.
We've just shown that, for each $j\in I$, $\left[\forall(k<j).\varphi(k)\right]\implies\varphi(j)$. By the principle of (complete/strong) mathematical induction, $\varphi(j)$ holds for all $j\in I$, and so $u_{i}\perp u_{j}$ for all $i\neq j$ in $I$.
I'd appreciate it if anyone could assuage my woes over the first proof mentioned above, as well as verify the proof I've provided. I also know that it's possible to use double induction, first proving that $\langle u_1, u_i \rangle = 0$ and then proving that $\langle u_i,u_j \rangle = 0$ in general, similar to the proof of commutativity of multiplication on $\mathbb{N}$, but I'm not sure how to tackle the second induction there.