Proving the dot product cosine identity for dimensions $> 2$

Question

In $\mathbb{R}^n$, define the dot product $u \cdot v := \sum_{i = 1}^n u_i v_i$. I understand two proofs that $$u \cdot v = \|u\| \; \|v\| \cos \theta$$ for $n = 2$, where $\|u\| := \sqrt{u \cdot u}$ is the Euclidean norm and $\theta$ is the counterclockwise rotation from $u$ to $v$ (or clockwise rotation, as $\cos(x) = \cos(2 \pi - x)$), but I don't know how to generalize them to $n > 2$.

The first proof is that the identity holds for $u = (1, 0)$ and $v = (\cos \theta, \sin \theta)$ and is invariant under rotation and scaling in $\mathbb{R}^2$. One author says "by an appropriate choice of coordinates we may assume we are working in 2 dimensions," and the other concludes "For higher dimensions, just notice that the two vectors $a$, $b$ span a two-dimensional subspace, for which the argument above applies."

The second proof is to write the vectors in polar form and apply $\cos(x - y) = \cos x \cos y + \sin x \sin y$.
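
As a quick numerical sanity check of this polar-form argument (a sketch only; the helper names `dot` and `polar` and the sample radii and angles are assumptions chosen for illustration, not part of either proof):

```python
import math

def dot(u, v):
    """Algebraic dot product: sum of componentwise products."""
    return sum(a * b for a, b in zip(u, v))

def polar(r, angle):
    """Vector in R^2 with length r and polar angle `angle` (radians)."""
    return (r * math.cos(angle), r * math.sin(angle))

# Hypothetical sample data: lengths and polar angles of u and v.
r, alpha = 2.0, 0.4
s, beta = 3.0, 1.5
u = polar(r, alpha)
v = polar(s, beta)

lhs = dot(u, v)                        # r*s*(cos a cos b + sin a sin b)
rhs = r * s * math.cos(beta - alpha)   # |u| |v| cos(theta), theta = beta - alpha
print(lhs, rhs)
```

The two printed numbers should agree up to rounding, which is exactly the angle-difference formula at work; it says nothing yet about $n > 2$.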

Is it possible to adapt these proofs to higher dimensions? More strongly, if I have any proof of this identity for $n = 2$, is that sufficient to prove it for $n > 2$ due to some $2$-dimensional subspace argument?

I'm starting to believe that the only elegant way to prove this formula for all $n$ is with the law of cosines. It is somewhat unfortunate that Wikipedia uses this formula to prove the law of cosines... (Don't worry, the page has other proofs.) Bonus points if you have a good name for this formula - "dot product cosine identity" is the best I could do.

@NicNic8 Is there a simple way to show $x \cdot y = \sum_i x_i y_i$ after starting with the geometric definition? — jskattt797, Sep 12 '21 at 05:04
You can use the law of cosines: https://www.math.usm.edu/lambers/mat169/fall09/lecture21.pdf — NicNic8, Sep 12 '21 at 15:04
The plane containing $u,\,v$ can already be taken as $\Bbb R^2$, as it has $u,\,v$ as a basis. So these proofs handle higher dimensions automatically. (The cases $u=0,\,v=0,\,u\parallel v$ need to be treated separately, but that's all.) — J.G., Sep 18 '21 at 07:22
@J.G. I rely on this $2$-dimensional subspace idea to visualize the angle between $u, v \in \mathbb{R}^n$ for $n > 2$. But I'm struggling to understand how it allows these proofs to "handle higher dimensions automatically," because the dot product in $\mathbb{R}^3$ (for example) is a completely different algebraic expression than the dot product in $\mathbb{R}^2$. — jskattt797, Sep 19 '21 at 17:15
But in an orthonormal basis whose first two vectors span the plane of $u,\,v$, the $n$-dimensional $u\cdot v$ has just $2$ nonzero terms, i.e. it is the $2$-dimensional dot product. The hard part is explaining why the original product's value is unchanged when we choose this basis, but that follows as explained in my last comment here. — J.G., Sep 19 '21 at 18:12
@J.G. The "last comment" link discusses notation abuse and why $a^T b$ is a matrix and not a scalar - how does this show that $u \cdot v$ is unchanged when we map the vectors to $\mathrm{Span}\left( (1, 0, \dots, 0), (0, 1, \dots, 0) \right)$? — jskattt797, Sep 19 '21 at 21:01
@jskattt797 Its proof is easily generalized to show rotations satisfy $(Ru)\cdot Rv=u\cdot v$. — J.G., Sep 19 '21 at 21:10

score 1 · Answer 1 · answered Sep 12 '21 at 05:58

The answer to this depends on how we define the angle $\theta$ between the vectors $u$ and $v$. Often* it is simply defined to be $$\theta = \arccos \bigg ( \frac{ u \cdot v}{\| u \| \| v \|} \bigg ) $$ provided $u , v \neq 0$. In this case your identity is tautologically true.

A more geometric (but equivalent) definition of $\theta$ is to first consider the span of $u$ and $v$. If $u,v$ are linearly independent, then their span is a 2D plane, so we can restrict ourselves to this plane, and the definition of $\theta$ is identical to the case $n=2$. In this case the proof of the identity is also the same as the $n=2$ case. If $u,v$ are nonzero and linearly dependent, then $v = cu$ for some scalar $c \neq 0$; we take $\theta = 0$ if $c > 0$ and $\theta = \pi$ if $c < 0$, and the identity follows directly from $u \cdot v = c \, \|u\|^2$ and $\|v\| = |c| \, \|u\|$.

*When I say "often" I mean in standard Euclidean space. Note that for a general inner product space this is the definition.

Can you please give more explanation/details on "we can restrict ourselves to this plane and the definition of $\theta$ is identical to the case $n=2$" and why the proof "is also the same as the $n=2$ case"? I am looking for at least one of the following: a rigorous proof that we can reduce higher-dimensional cases to the $2$-dimensional case, or a rigorous generalization of the two proofs in the OP to $n > 2$. — jskattt797, Sep 12 '21 at 17:59

Mo Pol Bol · Accepted Answer · 2021-09-20T23:16:17.333

Here is another elegant proof in $\mathbb{R}^2$:

Let $\mathbf{u}$ and $\mathbf{v}$ be given. Choose $a$ such that $\hat{\mathbf{v}}=a\mathbf{u}$ forms a right triangle with $\mathbf{v}$ and $\mathbf{0}$ as shown in Figure 1. For simplicity, assume $a\gt0$ and thus $\hat{\mathbf{v}}$ makes the same angle $\theta$ with $\mathbf{v}$ (a nice exercise is to show the proof holds when $\theta$ is obtuse).

[Figure 1: the right triangle formed by $\mathbf{0}$, $\hat{\mathbf{v}}=a\mathbf{u}$ and $\mathbf{v}$]

By definition $$\tag{1}\cos\theta = \frac{|\hat{\mathbf{v}} |}{|\mathbf{v}|}.$$ Let $\mathbf{z}=\hat{\mathbf{v}}-\mathbf{v}$ be the remaining side of the triangle. Since $a\mathbf{u}=(au_1,au_2)$ and $\mathbf{z}=(z_1,z_2)$ are perpendicular, their slopes multiply to $-1$ (when neither is vertical; otherwise one is vertical and the other horizontal, and the conclusion is immediate): $$\frac{au_2}{au_1}\frac{z_2}{z_1}=-1\implies au_1z_1+au_2z_2=0$$ and so $$\hat{\mathbf{v}}\cdot \mathbf{z}=\hat{\mathbf{v}} \cdot (\hat{\mathbf{v}}-\mathbf{v})=0$$ $$\implies \hat{\mathbf{v}}\cdot \hat{\mathbf{v}}- \hat{\mathbf{v}}\cdot \mathbf{v}=0$$ $$\tag{2}\implies \hat{\mathbf{v}}\cdot \mathbf{v} =|\hat{\mathbf{v}}|^2. $$ Putting everything together, using (2), then (1), then $|\hat{\mathbf{v}}|=a|\mathbf{u}|$ for $a\gt0$: $$a(\mathbf{u}\cdot \mathbf{v})=\hat{\mathbf{v}}\cdot \mathbf{v}=|\hat{\mathbf{v}}|^2=|\hat{\mathbf{v}}||\mathbf{v}|\cos\theta=a(|\mathbf{u}||\mathbf{v}|\cos\theta)$$ $$\tag{3}\implies \mathbf{u}\cdot \mathbf{v}=|\mathbf{u}||\mathbf{v}|\cos\theta. \qquad \qquad \square$$

Discussion: Going back over the proof, in particular Figure 1, we can see that the decisive step was the decomposition of the vector $\mathbf{v}$ into the sum of two vectors $$\tag{4} \mathbf{v}= \hat{\mathbf{v}} +(-\mathbf{z})=\hat{\mathbf{v}} +\mathbf{z}',$$ one a multiple of $\mathbf{u}$ and the other orthogonal to it. Alternatively, we could say that the cosine law for dot products fell out of the orthogonal projection of $\mathbf{v}$ onto the line $L$ (the $1$-dimensional subspace of $\mathbb{R}^2$ spanned by $\mathbf{u}$). In fact, the vector $\hat{\mathbf{v}}$ is called the orthogonal projection of $\mathbf{v}$ onto $\mathbf{u}$ and, as our proof shows, $$\tag{5} \hat{\mathbf{v}}=\operatorname{proj}_L\mathbf{v}=\frac{\mathbf{v}\cdot\mathbf{u}}{\mathbf{u} \cdot\mathbf{u}}\mathbf{u}.$$ Orthogonal projections naturally generalise to higher dimensions and lead to the orthogonal decomposition theorem, which states that, if $\{\mathbf{u_1},\mathbf{u_2},\dots,\mathbf{u_p}\}$ is any orthogonal basis of a subspace $W \subset\mathbb{R}^n$, then the orthogonal projection of $\mathbf{v}$ onto $W$ is $$\tag{6} \hat{\mathbf{v}}=\frac{\mathbf{v}\cdot\mathbf{u_1}}{\mathbf{u_1}\cdot\mathbf{u_1}}\mathbf{u_1}+\dots+\frac{\mathbf{v}\cdot\mathbf{u_p}}{\mathbf{u_p}\cdot\mathbf{u_p}}\mathbf{u_p}.$$ Geometrically, this theorem states that the projection $\operatorname{proj}_W\mathbf{v}$ is the sum of $1$-dimensional projections onto the basis vectors (which coordinatise $W$). Figure 2 demonstrates the case for $\mathbb{R}^3$.

[Figure 2: orthogonal projection of $\mathbf{v}$ onto a subspace $W$ of $\mathbb{R}^3$]
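
To make (4), (5) and (2) concrete outside the plane, here is a small sketch in plain Python. The function names `dot` and `proj` and the two sample vectors in $\mathbb{R}^4$ are assumptions picked for illustration; nothing here depends on the dimension being $2$.

```python
import math

def dot(u, v):
    """Algebraic dot product: sum of componentwise products."""
    return sum(a * b for a, b in zip(u, v))

def proj(v, u):
    """Orthogonal projection of v onto the line spanned by nonzero u, as in (5)."""
    c = dot(v, u) / dot(u, u)
    return [c * a for a in u]

# Hypothetical sample vectors in R^4.
u = [1.0, 2.0, -1.0, 0.5]
v = [3.0, -1.0, 4.0, 2.0]

v_hat = proj(v, u)                            # the multiple a*u of u
z_prime = [a - b for a, b in zip(v, v_hat)]   # v = v_hat + z_prime, as in (4)

print(dot(z_prime, u))                    # ~0: z_prime is orthogonal to u
print(dot(v_hat, v), dot(v_hat, v_hat))   # equal, which is identity (2)
```

For these particular vectors the coefficient $a$ is negative, so the check also covers the obtuse case.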

So, with the above pictures in mind, we can outline the general proof in $\mathbb{R}^n$. Let $\mathbf{u}$ and $\mathbf{v}$ be any two linearly independent vectors in $\mathbb{R}^n$; then they span a plane $\pi$. Using the properties of orthogonal projections outlined above, the Gram-Schmidt process allows us to construct an orthonormal basis $\{\mathbf{b_1}, \mathbf{b_2}\}$ of $\pi$. Now, it is easy to define (exercise) a linear transformation $T$, mapping $\{\mathbf{b_1}, \mathbf{b_2}\}$ to the standard basis $\{\mathbf{e_1}, \mathbf{e_2}\}$ of the plane $\pi'$ consisting of all $n$-tuples $\mathbf{x}=(x_1,x_2,0,\dots,0)$. The map $T$ is an isometry (exercise); that is, $T$ is a distance-preserving map: $$\tag{7} | \mathbf{u}-\mathbf{v}|= | T(\mathbf{u})-T(\mathbf{v})|.$$ Since $T$ is a linear isometry, it also preserves dot products, because $\mathbf{u}\cdot\mathbf{v}=\tfrac12\left(|\mathbf{u}|^2+|\mathbf{v}|^2-|\mathbf{u}-\mathbf{v}|^2\right)$ expresses the dot product through lengths alone. Clearly, in the plane $\pi'$, $$\tag{8} T(\mathbf{u})\cdot T(\mathbf{v})=|T(\mathbf{u})||T(\mathbf{v})|\cos\theta,$$ which we can prove by another orthogonal projection! Since $T$ is an isometry, the result must also hold in the plane $\pi$. $\qquad \qquad \square$
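
The outline above can be checked numerically. The following is a minimal sketch, assuming the sample vectors in $\mathbb{R}^4$ and the helper names `orthonormal_basis` and `T` (both hypothetical); `T` returns only the first two coordinates $(x_1, x_2)$ and drops the trailing zeros of $(x_1, x_2, 0, \dots, 0)$.

```python
import math

def dot(u, v):
    """Algebraic dot product: sum of componentwise products."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def orthonormal_basis(u, v):
    """Gram-Schmidt on linearly independent u, v; returns (b1, b2)."""
    b1 = [a / norm(u) for a in u]
    c = dot(v, b1)
    w = [a - c * b for a, b in zip(v, b1)]   # part of v orthogonal to b1
    b2 = [a / norm(w) for a in w]
    return b1, b2

def T(x, b1, b2):
    """Coordinates of x (assumed to lie in span(b1, b2)) in that basis."""
    return (dot(x, b1), dot(x, b2))

# Hypothetical sample vectors in R^4.
u = [1.0, 2.0, -1.0, 0.5]
v = [3.0, -1.0, 4.0, 2.0]

b1, b2 = orthonormal_basis(u, v)
Tu, Tv = T(u, b1, b2), T(v, b1, b2)

# Angle between Tu and Tv measured inside the plane, via polar angles.
theta = math.atan2(Tv[1], Tv[0]) - math.atan2(Tu[1], Tu[0])

print(dot(u, v))                             # n-dimensional dot product
print(dot(Tu, Tv))                           # 2-dimensional dot product
print(norm(u) * norm(v) * math.cos(theta))   # |u| |v| cos(theta)
```

All three printed values should coincide up to rounding. This is only evidence for one pair of vectors, not a substitute for the isometry exercise.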

A Note on Isometries: A function $h:\mathbb{R}^n\rightarrow\mathbb{R}^n$ is an isometry iff it equals an orthogonal transformation followed by a translation: $$\tag{9} h(\mathbf{x})=A\mathbf{x} +\mathbf{p}.$$ There is however another, more geometrically appealing way to view isometries: as reflections. A famous result states that every isometry of $\mathbb{R}^n$ is a composition of at most $n+1$ reflections in hyperplanes. This can be visualised most evocatively in $\mathbb{R}^2$, where it is known as the three reflections theorem. A nice exercise is to show, via a drawing, that $$\tag{10} \operatorname{refl}_L \mathbf{v}=2\operatorname{proj}_L \mathbf{v}-\mathbf{v}.$$
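
Formula (10) is easy to experiment with. Below is a small sketch in $\mathbb{R}^2$, where the line $L$ is spanned by a sample vector `u`; the names `proj` and `refl` and all sample vectors are assumptions for illustration. It checks that reflecting preserves dot products and that reflecting twice returns the original vector.

```python
import math

def dot(u, v):
    """Algebraic dot product: sum of componentwise products."""
    return sum(a * b for a, b in zip(u, v))

def proj(v, u):
    """Orthogonal projection of v onto the line L spanned by nonzero u."""
    c = dot(v, u) / dot(u, u)
    return [c * a for a in u]

def refl(v, u):
    """Reflection of v in the line L spanned by u, using (10)."""
    p = proj(v, u)
    return [2 * a - b for a, b in zip(p, v)]

# Hypothetical sample data in R^2: u spans the mirror line L.
u = [1.0, 2.0]
v = [3.0, -1.0]
w = [0.5, 4.0]

rv, rw = refl(v, u), refl(w, u)
print(dot(v, w), dot(rv, rw))   # equal: the reflection preserves dot products
print(refl(rv, u))              # recovers v: reflecting twice is the identity
```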

Edit: For the case where $\theta$ is obtuse (so that $a\lt0$), we get the following picture:

The isometry idea is exactly what I was looking for. Preceding (3), we know that $| \hat{v} | = | v | \cos \theta$, so $$ a (u \cdot v) = \hat{v} \cdot v = | \hat{v} |^2 = | \hat{v} | | v | \cos \theta = |a| | u | | v | \cos \theta$$ But how do we know $a > 0$? And how do we conclude (1) and (2) in the case $\theta = \pi/2$? — jskattt797, Sep 19 '21 at 18:54
@jskattt797 I have edited to add a picture which hopefully explains how to proceed in the case $a\lt0$. — Mo Pol Bol, Sep 20 '21 at 00:23
@jskattt797 For $\theta=\pi/2$, algebraically we can work out, since their slopes multiply to $-1$, the dot product equals $0$, and since cosine is also $0$ at $\pi/2$, the result holds. There are several geometric proofs showing slopes of perpendicular lines multiply to $-1$. A slightly different proof appeals to the relative slopes of lines $$rs=\pm |\frac{t_1-t_2}{1+t_1t_2}|,$$ which comes from the formula for $\tan(\theta_1-\theta_2)$. Since vertical lines have undefined or infinite slope, $rs$ must be undefined when lines are perpendicular. — Mo Pol Bol, Sep 20 '21 at 02:15
"The isometry $T$ above is most easily calculated via matrix multiplication." Shouldn't "This is not surprising since..." end in "since $T$ is a linear transformation"? Before the statement "Since $T$ is an isometry, it also preserves dot products." it might be worth noting that a function between metric spaces is an isometry iff it preserves distances, but there are many other equivalent statements that qualify a linear operator as an isometry (preserve norms, preserve inner products, preserve orthonormality, inverse = adjoint, adjoint preserves distances). — jskattt797, Sep 20 '21 at 15:42
Why is the angle between $Tu$ and $Tv$ the same as the angle between $u$ and $v$? — jskattt797, Sep 20 '21 at 15:49
@jskattt797 Yeah, last part isn’t worded well. I mainly just wanted to point out that 1) they always equal an orthogonal matrix A (often with an affine translation) and 2) although you can work out the isometry algebraically without much thought to what the matrix represents, it’s nice to have the geometric idea of reflections (or rotations) in mind. The angles are the same because the triangles $\mathbf{0}\mathbf{u}\mathbf{v}$ and $\mathbf{0}T(\mathbf{u})T(\mathbf{v})$ are congruent (SSS). — Mo Pol Bol, Sep 20 '21 at 23:14
