In fact, the proof from $\left\| \mathbf{A}\right\|_2 =\max_{\left\| \mathbf{x}\right\|_2=1} \left\| \mathbf{Ax} \right\|_2$ to $\left\| \mathbf{A}\right\|_2 = \sqrt{\lambda_{\max}(\mathbf{A}^H \mathbf{A})}$ is straightforward. We first prove that when $\mathbf{P}$ is Hermitian with largest eigenvalue $\lambda_{\max}$,
$$
\lambda_{\max} = \max_{\| \mathbf{x} \|_2=1} \mathbf{x}^H \mathbf{Px}.
$$
That's because when $\mathbf{P}$ is Hermitian, there exists a unitary matrix $\mathbf{U}$ (not unique in general) that diagonalizes $\mathbf{P}$ as $\mathbf{U}^H \mathbf{PU}=\mathbf{D}$ (so $\mathbf{P}=\mathbf{UDU}^H$), where $\mathbf{D}$ is a diagonal matrix with the real eigenvalues of $\mathbf{P}$ on the diagonal, and the columns of $\mathbf{U}$ are the corresponding eigenvectors. Let $\mathbf{y}=\mathbf{U}^H \mathbf{x}$, so that $\| \mathbf{y} \|_2 = \| \mathbf{x} \|_2$ because $\mathbf{U}$ is unitary. Substituting $\mathbf{x} = \mathbf{Uy}$ into the optimization problem, we obtain
$$
\max_{\| \mathbf{x} \|_2=1} \mathbf{x}^H \mathbf{Px} = \max_{\| \mathbf{y} \|_2=1} \mathbf{y}^H \mathbf{Dy} = \max_{\| \mathbf{y} \|_2=1} \sum_{i=1}^n \lambda_i |y_i|^2 \le \lambda_{\max} \max_{\| \mathbf{y} \|_2=1} \sum_{i=1}^n |y_i|^2 = \lambda_{\max}
$$
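The bound is attained by choosing $\mathbf{x}$ as a unit eigenvector corresponding to the eigenvalue $\lambda_{\max}$, so $\max_{\| \mathbf{x} \|_2=1} \mathbf{x}^H \mathbf{Px} = \lambda_{\max}$. Since $\left\| \mathbf{Ax} \right\|_2^2 = \mathbf{x}^H \mathbf{A}^H \mathbf{A} \mathbf{x}$ and $\mathbf{A}^H \mathbf{A}$ is Hermitian, applying this result with $\mathbf{P} = \mathbf{A}^H \mathbf{A}$ gives $\left\| \mathbf{A}\right\|_2^2 = \lambda_{\max}(\mathbf{A}^H \mathbf{A})$. This proves $\left\| \mathbf{A}\right\|_2 = \sqrt{\lambda_{\max}(\mathbf{A}^H \mathbf{A})}$.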
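As a quick numerical sanity check of the argument above, here is a minimal NumPy sketch. The matrix size, the random seed, and the helper name `rayleigh` are arbitrary choices made for illustration, not part of the proof.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, arbitrary choice
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))

# P = A^H A is Hermitian; eigh returns real eigenvalues in ascending order
P = A.conj().T @ A
eigvals, eigvecs = np.linalg.eigh(P)
lam_max = eigvals[-1]
v = eigvecs[:, -1]  # unit-norm eigenvector belonging to lam_max


def rayleigh(P, x):
    """Return x^H P x for a unit-normalized copy of x (real when P is Hermitian)."""
    x = x / np.linalg.norm(x)
    return np.real(np.vdot(x, P @ x))  # vdot conjugates its first argument


# Randomly sampled unit vectors should never exceed lam_max ...
samples = rng.standard_normal((1000, 3)) + 1j * rng.standard_normal((1000, 3))
max_sampled = max(rayleigh(P, x) for x in samples)

# ... while the top eigenvector attains it
attained = rayleigh(P, v)

spectral_norm = np.linalg.norm(A, 2)  # largest singular value of A

print(max_sampled <= lam_max + 1e-9)
print(np.isclose(attained, lam_max))
print(np.isclose(spectral_norm, np.sqrt(lam_max)))
```

All three printed checks should come out `True`, up to floating-point tolerance. Random sampling only illustrates the upper bound; the eigenvector is what shows the bound is tight.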
Next, because the $n\times n$ matrix $\mathbf{A}^H \mathbf{A}$ is positive semidefinite, all of its eigenvalues are nonnegative. Letting $\text{rank}~\mathbf{A}^H \mathbf{A}=r$ (which equals $\text{rank}~\mathbf{A}$), we can arrange the eigenvalues in decreasing order:
$$
\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_r > \lambda_{r+1} = \cdots = \lambda_n = 0.
$$
Recall that for every $\mathbf{X}\in \mathbb{C}^{n\times n}$,
$$
\text{trace}~\mathbf{X} = \sum\limits_{i=1}^{n} \lambda_i,
$$
where $\lambda_i$, $i=1,2,\ldots,n$, are the eigenvalues of $\mathbf{X}$ counted with algebraic multiplicity. Moreover, it is easy to verify that
$$
\left\| \mathbf{A}\right\|_F = \sqrt{\text{trace}~ \mathbf{A}^H \mathbf{A}}.
$$
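Applying the trace formula to $\mathbf{X} = \mathbf{A}^H \mathbf{A}$ gives $\left\| \mathbf{A}\right\|_F^2 = \sum_{i=1}^{n} \lambda_i$, a sum with only $r$ nonzero terms, each at most $\lambda_1 = \left\| \mathbf{A}\right\|_2^2$. Thus, through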
$$
\sqrt{\lambda_1} \leq \sqrt{\sum_{i=1}^{n} \lambda_i} \leq \sqrt{r \cdot \lambda_1}
$$
we have
$$
\left\| \mathbf{A}\right\|_2 \leq \left\| \mathbf{A}\right\|_F \leq \sqrt{r} \left\| \mathbf{A}\right\|_2
$$
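The final chain of inequalities can also be checked numerically. The sketch below is only an illustration under assumed values: it builds a rank-deficient complex matrix from two random factors (the sizes and seed are arbitrary), then compares the spectral norm, the Frobenius norm, and the trace identity.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, arbitrary choice

# A 6x4 complex matrix of rank 2, built as a product of 6x2 and 2x4 factors
B = rng.standard_normal((6, 2)) + 1j * rng.standard_normal((6, 2))
C = rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))
A = B @ C

r = np.linalg.matrix_rank(A)         # numerical rank of A (equals rank of A^H A)
two_norm = np.linalg.norm(A, 2)      # spectral norm: largest singular value
fro_norm = np.linalg.norm(A, "fro")  # Frobenius norm

# Frobenius norm recomputed from the trace identity ||A||_F = sqrt(trace(A^H A))
fro_from_trace = np.sqrt(np.real(np.trace(A.conj().T @ A)))

# Eigenvalues of A^H A in decreasing order; only the first r are (numerically) nonzero
lam = np.linalg.eigvalsh(A.conj().T @ A)[::-1]

print("rank:", r)
print("||A||_2         :", two_norm)
print("||A||_F         :", fro_norm)
print("sqrt(r)*||A||_2 :", np.sqrt(r) * two_norm)
print("trace identity holds:", np.isclose(fro_norm, fro_from_trace))
print("sum of eigenvalues matches:", np.isclose(fro_norm**2, lam.sum()))
```

The printed norms should satisfy $\left\| \mathbf{A}\right\|_2 \leq \left\| \mathbf{A}\right\|_F \leq \sqrt{r} \left\| \mathbf{A}\right\|_2$ with $r = 2$ here, which is tighter than the bound $\sqrt{n}\left\| \mathbf{A}\right\|_2$ one would get by ignoring the rank.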