Why is the dot product of two vectors defined the way in which it is?

Question

Most people who’ve sat through any lesson involving vectors will know about the vector dot product:

If $\displaystyle \mathbf p=\left[{v_1\atop v_2}\right]$ and $\displaystyle \mathbf q=\left[{w_1\atop w_2}\right]$, then $$\mathbf p \cdot \mathbf q=v_1w_1+v_2w_2$$

Obviously this is the special case where the vectors lie in the two-dimensional plane, and the formula extends to $n$ dimensions; my typesetting abilities just aren’t advanced enough to represent such vectors. In any case, so that I don’t go through future linear algebra courses blind, I’m wondering: why do we define the dot product of vectors in this way, instead of, for example, multiplying each component of one vector by the corresponding component of the second to produce a third vector whose components are these products?

Any input is appreciated, thank you.

It's because it is a very useful value that arises in many natural situations. Just like the function $f(x) = e^x$ say - to ask why is the "exponential function" defined this way is a rather meaningless question. A rose by any other name would smell as sweet. If you think about it enough, you should be able to see for yourself in which natural situations does such a value arise. — Project Book, Dec 29 '17 at 11:00
I’m not asking why we don’t define it component wise, though, I’m asking why we do define it the way we do. @HansLundmark — joshuaheckroodt, Dec 29 '17 at 11:10
OK, fair enough. Have a look at some of these questions then: https://www.google.com/search?q=site:math.stackexchange.com+dot+product+definition — Hans Lundmark, Dec 29 '17 at 11:21
@ProjectBook it is very rarely the case that something so useful is justified just by its usefulness. Most often, things are defined by analogy or by extension from well understood things, and some are later found to be useful. Then standard textbooks (unfortunately) neglect to mention the justification, focusing only on the uses and technicalities, thus giving the impression that magically this thing is just so so so useful. I cannot think of a single concept that pops out of nowhere and turns out to be any good. — Ittay Weiss, Dec 29 '17 at 11:29
@joshuaheckroodt this is a very good question. I wish more students would regularly show such interest in understanding the concepts they encounter. — Ittay Weiss, Dec 29 '17 at 11:30
The dot product is the extra term that pops out when you try to express $|u+v|^2$ in terms of $|u|^2$ and $|v|^2$. — , Jan 01 '18 at 21:24
You can go ahead and define your own product in whichever way you want, including the way you have been taught in class. The question is: Is your product useful? The dot product defined the way you state is very special, which is why it is used. — pshmath0, Jan 02 '18 at 00:50

score 6 · Answer 1 · edited Apr 08 '20 at 15:58

Tl;dr

In my opinion, the dot product cannot be motivated naturally, because no single one of its applications justifies this exact definition. However, the sheer number of naturally occurring formulas that contain one or more terms of the form $v_1\cdot w_1+v_2\cdot w_2$ gives a hard-to-dispute a posteriori motivation for this exact notation.

So the reason why the dot product is defined this way and no other: because this is the term that occurs in hundreds of naturally emerging formulas, and no other.

The nature of definitions

In contrast to mathematical proofs or ideas for how to solve certain problems, which must be developed from the first second on, many definitions are given a posteriori, i.e. after the subject reached some maturity. The reason is that only after solving many similar problems, it turns out which definitions would have been useful in the first place. Many definitions arise for one of the following reasons:

A certain important term is long, ugly or hard to remember. Therefore we introduce a shorthand to hide some complexity.
A certain term occurs over and over, and introducing a shorthand creates a useful abstraction that might reveal what is really going on.

Another reason for a definition, one that also makes sense a priori, is the following:

We know what we want to compute, but we lack the exact expression, for now. Still, we have to develop a whole lot of theory until we have a result. Therefore we introduce a placeholder term. This is often done for quantities arising from modeling reality, e.g. curve lengths etc.

The dot product is a classic example of the second motivation (among others like determinants, matrix multiplication, ...). Look at the following problems and their solutions. I will not show you how to derive them as this will be done as you advance in linear algebra (or you already know them):

Do you want to compute the length of a vector $\mathbf v=(v_1,v_2)$? Do it like this: $$\sqrt{v_1\cdot v_1+v_2\cdot v_2}.$$
Do you want to know the angle $\alpha$ between two vectors $\mathbf v=(v_1,v_2)$ and $\mathbf w=(w_1,w_2)$? Do it like this: $$\cos(\alpha)=\frac{v_1\cdot w_1+v_2\cdot w_2}{\sqrt{v_1\cdot v_1+v_2\cdot v_2}\sqrt{w_1\cdot w_1+w_2\cdot w_2}}.$$
Do you need to project a vector $\mathbf v=(v_1,v_2)$ onto the line (in higher dimensions, the hyperplane) with normal vector $\mathbf n=(n_1,n_2)$? Do it like this: $$\mathbf v-\frac{v_1\cdot n_1+v_2\cdot n_2}{n_1\cdot n_1+n_2\cdot n_2}\mathbf n.$$
Do you need to know if two vectors $\mathbf v=(v_1,v_2)$ and $\mathbf w=(w_1,w_2)$ are orthogonal? Check whether $$v_1\cdot w_1+v_2\cdot w_2=0.$$

All these problems arise naturally in a geometrically motivated subject like linear algebra. And do you see what all of them have in common? They all can benefit from the definition

$$\mathbf v\cdot \mathbf w := v_1\cdot w_1+v_2\cdot w_2.$$

All the complexity vanishes and we get (in this order):

$$\sqrt{\mathbf v\cdot \mathbf v},\qquad \cos(\alpha)=\frac{\mathbf v\cdot \mathbf w}{\sqrt{\mathbf v\cdot\mathbf v}\sqrt{\mathbf w\cdot\mathbf w}},\qquad \mathbf v-\frac{\mathbf v\cdot\mathbf n}{\mathbf n\cdot\mathbf n}\mathbf n,\qquad\mathbf v\cdot\mathbf w=0.$$

Further simplification can be obtained via the definition $\|\mathbf v\|=\sqrt{\mathbf v\cdot\mathbf v}$ after it is proven that $\mathbf v\cdot\mathbf v\ge0$. Also, this definition opens up the way for a coordinate-free approach to linear algebra which only then justifies the word algebra in the name.
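As a rough illustration of how the four toy problems collapse onto one helper, here is a minimal Python sketch. The function names (`dot`, `norm`, `angle`, `reject`, `is_orthogonal`) and the tolerance `tol` are my own choices for this example, not standard names, and the code assumes nonzero vectors wherever it divides.

```python
import math

def dot(v, w):
    """v . w = v_1*w_1 + ... + v_n*w_n for equal-length sequences."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    """Length of v, i.e. sqrt(v . v)."""
    return math.sqrt(dot(v, v))

def angle(v, w):
    """Angle in radians between two nonzero vectors."""
    c = dot(v, w) / (norm(v) * norm(w))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.acos(max(-1.0, min(1.0, c)))

def reject(v, n):
    """v - (v.n / n.n) n: v with its component along nonzero n removed."""
    s = dot(v, n) / dot(n, n)
    return [a - s * b for a, b in zip(v, n)]

def is_orthogonal(v, w, tol=1e-12):
    """True if v . w is zero up to the (assumed) tolerance tol."""
    return abs(dot(v, w)) <= tol

v, w = [3.0, 4.0], [4.0, -3.0]
print(norm(v))                 # 5.0
print(reject(v, [0.0, 1.0]))   # [3.0, 0.0]
print(is_orthogonal(v, w))     # True
```

Every function after `dot` is a one- or two-line wrapper around it, which is the point of the shorthand.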

From a didactic point of view

I generally avoid introducing definitions without some motivation. For very central and recurring elements like the dot product, it is hard to demonstrate the true importance before giving the definition, if only for notational reasons.

But what can be done is to work through at least two of the above toy problems, thereby demonstrating the recurrent character of this expression in naturally occurring tasks.

Only after this definition has proven its usefulness in certain relevant problems is it appropriate to give definitions which are more like theorems:

This definition is the only way to define a bilinear multiplication on vectors that yields a scalar and also gives $\mathbf e_1\cdot\mathbf e_1=\mathbf e_2\cdot\mathbf e_2=1$ and $\mathbf e_1\cdot\mathbf e_2=\mathbf e_2\cdot\mathbf e_1=0$ for $\mathbf e_1=(1,0)$ and $\mathbf e_2=(0,1)$.

or which are mainly based on further unmotivated axioms:

A dot product is a symmetric, positive definite bilinear form.
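To make the theorem-like characterization above a little more concrete: a bilinear, scalar-valued product is fixed entirely by its values on pairs of basis vectors. The sketch below is only an illustration; `bilinear_from_basis` and the table `gram` (holding the chosen values of $\mathbf e_i\cdot\mathbf e_j$) are hypothetical names introduced for this example.

```python
def bilinear_from_basis(gram):
    """Build B(v, w) = sum over i, j of v[i] * w[j] * gram[i][j].

    gram[i][j] is the value assigned to the pair (e_i, e_j);
    bilinearity then determines B on all other vectors.
    """
    def b(v, w):
        return sum(v[i] * w[j] * gram[i][j]
                   for i in range(len(v))
                   for j in range(len(w)))
    return b

v, w = [2, 5], [7, -1]

# e_i . e_i = 1 and mixed pairs = 0 reproduces v_1*w_1 + v_2*w_2.
standard = bilinear_from_basis([[1, 0], [0, 1]])
print(standard(v, w))   # 9

# Same diagonal, but e_1 . e_2 = 1: still bilinear, different product.
skewed = bilinear_from_basis([[1, 1], [0, 1]])
print(skewed(v, w))     # 7
```

The second table shows that the values on the mixed pairs matter as well: fixing only the diagonal entries does not pin the product down.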

score 2 · Answer 2 · answered Dec 29 '17 at 11:33

There can be many answers to your question but let me explain my thoughts.

First, as you have noticed, the inner product isn't really a product in the usual sense, since the output is not again an element of $\mathbb{R}^n$ but just a real number (which is why it is also called the scalar product). Now, in a really down-to-earth sense, the inner product is a tool to measure angles. In fact, the formal way to define the angle between two nonzero elements $u, v$ of $\mathbb{R}^n$ is $\theta=\cos^{-1}\left(\frac{\langle u,v\rangle}{\sqrt{\langle u,u\rangle}\,\sqrt{\langle v,v\rangle}}\right)$.

Why are we defining the inner product that way? Taking a cue from the properties of the usual Euclidean inner product, we say that any bilinear form on a vector space (a function that takes two vectors and returns a number, linear in each argument) is called an inner product if it satisfies certain axioms (see Wikipedia's article). One can show that in Euclidean spaces all inner products stem from the standard one with an appropriate change of basis. So that answers why we define the operation like that.

What good is the inner product? The parallelogram law, of course! $\|x\|^2+\|y\|^2=\frac{\|x+y\|^2+\|x-y\|^2}{2}$, which is easily verified using $\|x\|^2=\langle x,x\rangle$ (it is a hard and surprising theorem that if you have a normed space satisfying the parallelogram law, you also have an inner product!). This property is not only a generalisation of good old Pythagoras' theorem, it is really the essential tool for proving geometric properties of linear spaces. If you read about Hilbert spaces and Banach spaces you will come to realise how useful this is.
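A quick numerical check of the parallelogram law may help here. The sketch below also evaluates the polarization identity $\langle x,y\rangle=\frac{\|x+y\|^2-\|x-y\|^2}{4}$, which is the standard way the inner product is recovered from the norm over the reals. The helper names and the sample vectors are my own choices for this example.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def sub(x, y):
    return [a - b for a, b in zip(x, y)]

def polarization(x, y):
    """Recover <x, y> from norms alone (real case)."""
    return (norm(add(x, y)) ** 2 - norm(sub(x, y)) ** 2) / 4

x, y = [1.0, 2.0, -3.0], [4.0, 0.5, 2.0]

# Parallelogram law: |x|^2 + |y|^2 == (|x+y|^2 + |x-y|^2) / 2
lhs = norm(x) ** 2 + norm(y) ** 2
rhs = (norm(add(x, y)) ** 2 + norm(sub(x, y)) ** 2) / 2
print(math.isclose(lhs, rhs))                      # True

# The norm alone is enough to get the inner product back.
print(math.isclose(polarization(x, y), dot(x, y))) # True
```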

Finally, why not define multiplication pointwise? Simply because it is useless! Pointwise multiplication really has no good properties, and I haven't seen any good use of it. It is really hard to define a multiplication between vectors of $\mathbb{R}^n$ that returns another vector of $\mathbb{R}^n$, and in fact it is a really surprising (and deep) theorem that you cannot always do it: such a multiplication with good properties (one that allows division) exists only in $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^4$ and $\mathbb{R}^8$, and it gets progressively "weaker" (see Wikipedia's articles on quaternions and octonions).
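For the $\mathbb{R}^2$ case, the well-behaved vector-valued multiplication is complex multiplication. The sketch below is only an illustration with made-up function names (`cmul`, `pointwise`). It checks that lengths multiply under `cmul`, and shows one concrete defect of the pointwise product: two nonzero vectors can multiply to zero, so division is impossible.

```python
import math

def cmul(p, q):
    """Multiply (a, b) and (c, d) as complex numbers a+bi and c+di."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)

def pointwise(p, q):
    """Component-wise product of two vectors."""
    return tuple(a * b for a, b in zip(p, q))

def norm(p):
    return math.hypot(*p)

p, q = (3.0, 4.0), (1.0, -2.0)
print(cmul(p, q))   # (11.0, -2.0)

# Lengths multiply: |pq| == |p| * |q|
print(math.isclose(norm(cmul(p, q)), norm(p) * norm(q)))  # True

# Pointwise product of two nonzero vectors can be the zero vector.
print(pointwise((1.0, 0.0), (0.0, 1.0)))   # (0.0, 0.0)
```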

TL;DR : We want a function that measures angles and especially orthogonality and this definition is the simplest we can get.

score 1 · Answer 3 · answered Dec 29 '17 at 11:25

The law of cosines in elementary Euclidean geometry gives a formula relating angles and lengths. When it is interpreted with coordinates and Pythagoras' theorem is applied, isolating the angle component gives rise to the Cauchy-Schwarz inequality involving the standard inner product. This is why the standard inner product is defined the way it is.
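The step from the law of cosines to the coordinate formula can be checked numerically. For the triangle with sides $\mathbf v$, $\mathbf w$ and $\mathbf v-\mathbf w$, the law of cosines reads $|\mathbf v-\mathbf w|^2=|\mathbf v|^2+|\mathbf w|^2-2|\mathbf v||\mathbf w|\cos\theta$, so $|\mathbf v||\mathbf w|\cos\theta$ can be computed from squared lengths alone. The sketch below (function names are my own) compares that quantity with $v_1w_1+v_2w_2$.

```python
def squared_length(u):
    """Pythagoras: |u|^2 is the sum of squared coordinates."""
    return sum(a * a for a in u)

def cosine_term(v, w):
    """|v||w|cos(theta), isolated from the law of cosines."""
    diff = [a - b for a, b in zip(v, w)]
    return (squared_length(v) + squared_length(w) - squared_length(diff)) / 2

v, w = [2.0, 1.0], [-1.0, 3.0]
print(cosine_term(v, w))             # 1.0
print(v[0] * w[0] + v[1] * w[1])     # 1.0
```

Expanding the squares by hand shows why the two always agree: the $v_i^2$ and $w_i^2$ terms cancel and only the mixed terms $v_iw_i$ survive.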

answered Dec 29 '17 at 11:25

Ittay Weiss


I think reducing the dot product to Euclidean geometry does it a disservice, because it goes beyond being an inner product. Fundamentally, the dot product is the primal example of "duality", of the interaction between two "things". One of the most significant principles that comes out of this interaction is $\langle Tx, x^* \rangle = \langle x, T^* x^* \rangle$, which, as an example, results in the (in my humble opinion, mind-blowing) duality: existence of a "solution" to the primal system is "equivalent" to uniqueness of a "solution" to the dual system. – Project Book Dec 29 '17 at 22:23
The self-duality of Euclidean, and more generally, of Hilbert spaces, is such a special thing that it itself has a very important significance that goes beyond angles (or is it that angles and orthogonality are more important than what people usually think of them as?) In any case, I think it is both important to give the student a glimpse beyond, as well as to relate this glimpse to something they are familiar with, which is Euclidean geometry, or the simple duality of "quantity" and "weight" for example. – Project Book Dec 29 '17 at 22:29
@ProjectBook, I think the utility of Hilbert spaces is exactly that they extrapolate our physical intuition for (finite-dimensional) physical spaces, where there is a sense of "orthogonality" (and angles). E.g., the "Dirichlet principle" is literally incorrect in Banach spaces that are not Hilbert spaces... – paul garrett Feb 23 '18 at 00:41

score 1 · Answer 4 · answered Dec 29 '17 at 11:32

Here is a physicist's point of view. A dot product is a projection of one vector onto another. Write each vector in polar coordinates, $\mathbf p=p\,(\cos\theta_p,\sin\theta_p)$ and $\mathbf q=q\,(\cos\theta_q,\sin\theta_q)$, where $p$ and $q$ are the lengths. The dot product gives $$\mathbf p\cdot\mathbf q=pq\,(\cos\theta_p\cos\theta_q+\sin\theta_p\sin\theta_q)=pq\cos(\theta_p-\theta_q),$$ which depends only on the two lengths and on the angle between the two vectors. With a little bit of geometry:

You can see that the dot product is the component of vector $\mathbf p$ along $\mathbf q$, multiplied by the length of $\mathbf q$. In fact, this formula can be generalized to any dimension because you can always define vectors in polar coordinates.
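The polar-coordinate identity is easy to test numerically. In this small sketch the helper `from_polar` and the sample lengths and angles are arbitrary choices of mine; it compares the component formula with $pq\cos(\theta_p-\theta_q)$.

```python
import math

def from_polar(r, theta):
    """Cartesian components of the vector with length r and angle theta."""
    return (r * math.cos(theta), r * math.sin(theta))

def dot(p, q):
    return p[0] * q[0] + p[1] * q[1]

r_p, theta_p = 2.0, math.radians(75)
r_q, theta_q = 3.0, math.radians(15)

p = from_polar(r_p, theta_p)
q = from_polar(r_q, theta_q)

component_form = dot(p, q)
polar_form = r_p * r_q * math.cos(theta_p - theta_q)
print(math.isclose(component_form, polar_form))   # True
```

With a 60 degree angle between the vectors, both expressions come out at about 3, i.e. $2\cdot 3\cdot\cos 60^\circ$.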
