Linear algebra states the Schwarz inequality as $$\lvert\mathbf x^\mathrm T\mathbf y\rvert\le\lVert\mathbf x\rVert\lVert\mathbf y\rVert\tag 1$$ whereas probability theory states it as $$(\mathbf E[XY])^2\le\mathbf E[X^2]\mathbf E[Y^2]\tag 2$$ Compare $\lvert\sum_i x_iy_i\rvert\le\sqrt{\sum_i x_i^2\sum_i y_i^2}$ with $\lvert\sum_y\sum_x xyp_{X,Y}(x,y)\rvert\le\sqrt{\sum_x x^2p_X(x)\sum_y y^2p_Y(y)}$. If the pairs $(x_i,y_i)$ are distinct and $p_{X,Y}$ is the uniform distribution on them, $p_{X,Y}(x,y)=\begin{cases}\frac1n&\text{if $x=x_i$ and $y=y_i$ for $i\in\{1,2,\ldots,n\}$}\\0&\text{otherwise}\end{cases}$, then each side of the second inequality equals the corresponding side of the first multiplied by $\frac1n$, so $(2)$ reduces to $(1)$. Thus, $(2)$ can be thought of as a more general form of the inequality.
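As a quick numerical sanity check of the comparison above, here is a minimal sketch in plain Python. The vectors `x` and `y` are made-up sample data (not from any particular problem), and the expectations are taken under the uniform distribution on the $n$ pairs, as assumed in the text.

```python
import math

# Hypothetical sample data: n distinct pairs (x_i, y_i).
x = [1.0, -2.0, 3.0, 0.5]
y = [2.0, 1.0, -1.0, 4.0]
n = len(x)

# Vector form (1): |x^T y| <= ||x|| ||y||
dot_xy = sum(a * b for a, b in zip(x, y))
norm_x = math.sqrt(sum(a * a for a in x))
norm_y = math.sqrt(sum(b * b for b in y))
lhs1, rhs1 = abs(dot_xy), norm_x * norm_y

# Probabilistic form (2) under the uniform pmf: each pair has mass 1/n.
e_xy = dot_xy / n
e_x2 = sum(a * a for a in x) / n
e_y2 = sum(b * b for b in y) / n
lhs2, rhs2 = e_xy ** 2, e_x2 * e_y2

print("(1):", lhs1, "<=", rhs1)
print("(2):", lhs2, "<=", rhs2)
# Squaring (1) and dividing by n^2 gives (2), so the two ratios agree.
print("ratios:", lhs1 ** 2 / rhs1 ** 2, lhs2 / rhs2)
```

Both inequalities hold for this data, and the two printed ratios coincide up to rounding, which is the sense in which $(2)$ with the uniform pmf is just $(1)$ rescaled.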
Another way to think about this is to compare $\lvert\cos\theta\rvert=\frac{\lvert\mathbf x^\mathrm T\mathbf y\rvert}{\lVert\mathbf x\rVert\lVert\mathbf y\rVert}\le1$ with $\lvert\rho\rvert=\frac{\lvert\mathbf{cov}(X,Y)\rvert}{\sqrt{\mathbf{var}(X)\mathbf{var}(Y)}}\le1$. The former is exactly $(1)$, while the latter, once squared, coincides with $(2)$ only when $\mathbf E[X]=\mathbf E[Y]=0$. In some sense, we can view $\mathbf x^\mathrm T\mathbf y$ as a special case of $\mathbf{cov}(X,Y)$. It would then follow that $\mathbf x^\mathrm T\mathbf x$ is a special case of $\mathbf{var}(X)$ and $\lVert\mathbf x\rVert$ a special case of $\sqrt{\mathbf{var}(X)}$.
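The comparison between $\cos\theta$ and $\rho$ can also be checked numerically. The sketch below again uses made-up data and the uniform pmf on the pairs (so variances and covariances carry a factor $\frac1n$, not $\frac1{n-1}$); `cos_angle` and `correlation` are helper names introduced here for illustration only.

```python
import math

def cos_angle(x, y):
    """cos(theta) = x^T y / (||x|| ||y||)."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))

def correlation(x, y):
    """rho = cov(X, Y) / sqrt(var(X) var(Y)) under the uniform pmf on the pairs."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mx * my
    var_x = sum(a * a for a in x) / n - mx * mx
    var_y = sum(b * b for b in y) / n - my * my
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data chosen so that both sample means are zero.
x = [1.0, -2.0, 3.0, -2.0]
y = [2.0, 1.0, -1.0, -2.0]
print(cos_angle(x, y), correlation(x, y))  # the two values agree

# Shifting x by a constant makes E[X] nonzero: rho is unchanged, cos(theta) is not.
x_shift = [a + 5.0 for a in x]
print(cos_angle(x_shift, y), correlation(x_shift, y))
```

With zero means the two quantities agree; after the shift only $\rho$ keeps its value, which illustrates why the condition $\mathbf E[X]=\mathbf E[Y]=0$ appears.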
What is the corresponding special case of $\mathbf E[X]$, and how should we understand $\mathbf E[X]=\mathbf E[Y]=0$ in linear algebra? With $p_{X,Y}$ defined above, we have $\mathbf E[XY]=\frac{\mathbf x^\mathrm T\mathbf y}n$, but since $\mathbf{cov}(X,Y)=\mathbf E[XY]-\mathbf E[X]\mathbf E[Y]$, we get $\mathbf{cov}(X,Y)\ne\mathbf E[XY]$ unless $\mathbf E[X]=0$ or $\mathbf E[Y]=0$. How can we obtain a relation between $\mathbf{cov}(X,Y)$ and $\mathbf x^\mathrm T\mathbf y$?