Why is the dot product of two vectors $\mathbf{x},\mathbf{y}$ the same as $x^T y$?

Question

I've always thought that a $1\times 1$ matrix is not the same as a scalar. However, $\textbf{many}$ times throughout my first year of undergrad, I have seen people interchange $x\cdot y$ with $x^T y$ (a scalar in the former, and a $1\times 1$ matrix in the latter).

Is this just sloppy/lazy notation?

Surely there will be instances where using one or the other "breaks" the working of a question?

e.g.
Let $\mathbf{x,y}$ be vectors in $\mathbb{R}^3$, and let $X$ be a $2\times 2$ matrix.

Then $(x\cdot y)X$ is defined, yet $x^T y X$ is not defined.

Is there a situation where treating them as equivalent is beneficial?

Ginger88895 · Answer 1 · 2017-09-09T11:37:18.110

2

Your statement is correct.

However, in practice, since the ring of $1\times 1$ matrices is isomorphic to the base field (that is, they behave like scalars), we can define their product with other matrices as scalar multiplication, while still keeping the properties of matrix products.

EDIT: By "Your statement is correct", I mean that, in a strict sense, $(x^T y)X$ is not defined as a matrix product.
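To make the strict reading concrete, here is a minimal sketch in plain Python. The helpers `shape`, `matmul` and `scale` are hypothetical names introduced only for this illustration, and the numeric values of $x$, $y$ and $X$ are arbitrary. The strict matrix product rejects a $1\times 1$ times $2\times 2$ product, while reading the $1\times 1$ entry as a scalar works:

```python
# Hypothetical helpers for illustration; a matrix is a list of rows.

def shape(a):
    return (len(a), len(a[0]))

def matmul(a, b):
    """Strict matrix product: the inner dimensions must agree."""
    (m, n), (p, q) = shape(a), shape(b)
    if n != p:
        raise ValueError(f"cannot multiply {m}x{n} by {p}x{q}")
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(q)]
            for i in range(m)]

def scale(k, a):
    """Scalar multiplication kA."""
    return [[k * entry for entry in row] for row in a]

x_t = [[1, 2, 3]]        # x^T as a 1x3 row (values chosen arbitrarily)
y = [[4], [5], [6]]      # y as a 3x1 column
X = [[1, 0], [0, 1]]     # some 2x2 matrix

xty = matmul(x_t, y)     # a 1x1 matrix

try:
    matmul(xty, X)       # 1x1 times 2x2: inner dimensions 1 and 2 differ
    strict_product_defined = True
except ValueError:
    strict_product_defined = False

# Identifying the 1x1 matrix with its single entry makes the product meaningful.
scaled = scale(xty[0][0], X)
```

The identification of $[k]$ with $k$ is exactly the step that turns the failing `matmul` call into the working `scale` call.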

edited Sep 09 '17 at 11:37

answered Sep 09 '17 at 10:58


"Your statement is correct". No, it's not. Please see my explanation below. – Andreas Sep 09 '17 at 11:13
I have edited my answer to be more accurate now. Thanks for pointing it out :) – Ginger88895 Sep 09 '17 at 11:42

score 1 · Answer 2 · answered Sep 09 '17 at 15:22

If that kind of argument makes you feel uncomfortable, you can get around any difficulty by formulating the following simple theorem:

“If $k$ is a scalar, and $[k]$ the corresponding $1 \times 1$ matrix, and if $X$ is an $n \times 1$ matrix (i.e., a column vector), then the equality $$ kX = X[k] \tag{$*$} $$ holds.”

This is true for trivial reasons (because of how things are defined), and it's exactly in this situation that this trick is usually applied, like when deriving the matrix $I - 2 \mathbf{n} \mathbf{n}^t$ for the reflection $R$ in a hyperplane with unit normal $\mathbf{n}$: $$ R(\mathbf{x}) = \mathbf{x} - 2 \underbrace{(\mathbf{x} \cdot \mathbf{n})}_{\text{scalar}} \mathbf{n} \overset{(*)}{=} I\mathbf{x} - 2 \mathbf{n} \underbrace{(\mathbf{n}^t \mathbf{x})}_{\text{$1\times 1$}} = (I - 2 \mathbf{n} \mathbf{n}^t) \mathbf{x}. $$
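As a sanity check on the reflection formula, the following sketch compares the vector form $\mathbf{x} - 2(\mathbf{x}\cdot\mathbf{n})\mathbf{n}$ with the matrix form, where $\mathbf{n}\mathbf{n}^t$ is the square outer product of $\mathbf{n}$ with itself. The particular $\mathbf{n}$ and $\mathbf{x}$ are assumptions chosen for illustration, and exact fractions are used so that the comparison does not depend on floating-point rounding:

```python
from fractions import Fraction as F

# Illustrative data: a unit normal n (4/9 + 4/9 + 1/9 = 1) and a vector x.
n = [F(2, 3), F(2, 3), F(1, 3)]
x = [F(1), F(2), F(3)]

def dot(u, v):
    """Scalar (dot) product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Vector form: x - 2 (x . n) n, with (x . n) treated as a scalar.
k = dot(x, n)
reflected_vector_form = [xi - 2 * k * ni for xi, ni in zip(x, n)]

# Matrix form: R = I - 2 n n^t, where n n^t is the outer product.
size = len(n)
R = [[(1 if i == j else 0) - 2 * n[i] * n[j] for j in range(size)]
     for i in range(size)]
reflected_matrix_form = [dot(row, x) for row in R]
```

Both forms give the same vector, and its squared length equals that of $\mathbf{x}$, as expected for a reflection.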

Andreas · Answer 3 · 2017-09-09T11:14:05.733

$x \cdot y$ is commonly understood, if no reference is made to matrices, as the scalar product. I.e. $x \cdot y = \sum_{i=1}^N x_i y_i$.

In matrix notation, for two matrices $x$ and $y$ of the same shape, $x \cdot y$ is defined only if both are square ($N \times N$); otherwise it is undefined. If both $x$ and $y$ are $N \times K$ matrices, you have to specify whether you want $x^T \cdot y$ or $x \cdot y^T$. The result is then $K \times K$ or $N \times N$, respectively. In particular, if either $N$ or $K$ equals 1, you are back to the scalar product in one of the two cases $x^T \cdot y$ or $x \cdot y^T$. Written out as matrices:

$\begin{pmatrix} x_1 \\ \vdots \\ x_N \end{pmatrix}^T \cdot \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} = (x_1, \dots, x_N) \cdot \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} = \sum_{i=1}^N x_i y_i$

and

$(x_1, \dots, x_N) \cdot (y_1, \dots, y_N)^T = (x_1, \dots, x_N) \cdot \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} = \sum_{i=1}^N x_i y_i$
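
For a concrete instance (the numbers are chosen here purely for illustration), take $N = 3$ and the column vectors $x = (1, 2, 3)^T$, $y = (4, 5, 6)^T$. The two possible products are then

$$ x^T y = (1, 2, 3) \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = 32, \qquad x y^T = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} (4, 5, 6) = \begin{pmatrix} 4 & 5 & 6 \\ 8 & 10 & 12 \\ 12 & 15 & 18 \end{pmatrix}. $$

Only the first, the $1 \times 1$ case, reproduces the scalar product; the second is a $3 \times 3$ matrix.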

So, contrary to what you are saying, $(x \cdot y)X$ is defined only if $x \cdot y$ is understood as the scalar product of vectors. However, this should be avoided, since it is "sloppy" notation that mixes vectors and matrices. If $x$ and $y$ were matrices, it would not be defined in general.

$ x^T y X $ is defined without problems (all are matrices), if you interpret the order of operations as $ (x^T y) X $ .
