3

You get taught about matrices and how they work but nobody ever tells you WHY they work in the way that they do. What was the idea that sparked the creation of matrices?

turnip
  • 415

3 Answers

4

Composition of linear transformations. Thus, for example, $$ \begin{align} m & = 4p+7q, & p & =10x-13y \\ \\ n & =-3p+2q, & q & = 2x+5y \end{align} $$

So how do you get $m$ and $n$ as functions of $x$ and $y$? You multiply matrices.
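Carrying out both the substitution and the matrix product (the arithmetic here is just a worked check of the answer's claim) gives the same coefficients:

$$\begin{bmatrix} 4 & 7 \\ -3 & 2\end{bmatrix}\begin{bmatrix}10 & -13 \\ 2 & 5\end{bmatrix} = \begin{bmatrix} 4\cdot 10+7\cdot 2 & 4\cdot(-13)+7\cdot 5 \\ -3\cdot 10+2\cdot 2 & -3\cdot(-13)+2\cdot 5\end{bmatrix} = \begin{bmatrix} 54 & -17 \\ -26 & 49\end{bmatrix}$$

so $m = 54x - 17y$ and $n = -26x + 49y$, exactly what direct substitution of $p$ and $q$ yields.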

  • Show us an example @MichaelHardy – Ali Caglayan Jan 23 '14 at 01:02
  • 3
    @Alizter : What I posted is an example. – Michael Hardy Jan 23 '14 at 01:03
  • this answer is an example @Alizter – janmarqz Jan 23 '14 at 01:03
  • Apologies, let me rephrase. Show us an example of how the matrices multiply to give the result. – Ali Caglayan Jan 23 '14 at 01:06
  • 7
    We have $\begin{bmatrix} m \\ n\end{bmatrix}=\begin{bmatrix}4 & 7 \\ -3 & 2\end{bmatrix}\begin{bmatrix}p \\ q\end{bmatrix}$ and $\begin{bmatrix} p \\ q\end{bmatrix}=\begin{bmatrix}10 & -13 \\ 2 & 5\end{bmatrix}\begin{bmatrix}x \\ y \end{bmatrix}$. So $\begin{bmatrix} m \\ n\end{bmatrix}=\begin{bmatrix}4 & 7 \\ -3 & 2\end{bmatrix}\begin{bmatrix}10 & -13 \\ 2 & 5\end{bmatrix}\begin{bmatrix}x \\ y\end{bmatrix}$. Do the matrix multiplication in the usual way, and show that it's the same as the result you'd get from routine algebra if you'd never heard of matrix multiplication. – Michael Hardy Jan 23 '14 at 01:13
  • @PPG : I do something similar to Michael Hardy's comment above. First define how to multiply a matrix by a column vector. Then define the product of two matrices $A$ and $B$, $AB$, so that $(AB)\mathbf{x}=A(B\mathbf{x})$ (assuming the dimensions of $A$, $B$, and $\mathbf{x}$ make sense). There is only one way to do this. – Stefan Smith Jan 23 '14 at 01:23
4

Matrices are representations of linear maps in terms of specific bases, similar to how decimal and hex numbers are representations of integers in specific bases. Operations on matrices are defined precisely so that they correspond to the associated operations on their corresponding linear maps, e.g. matrix multiplication corresponds to composition of linear maps. One can derive all of the usual formulas for matrix operations from this fact alone. This is explained in every good linear algebra textbook, e.g. Axler's Linear Algebra Done Right. $\:$ See also Arturo Magidin's answer here.
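This correspondence can be checked directly with a minimal JavaScript sketch (the helper names `apply` and `mul` are mine, not from any library): multiplying two matrices and then applying the result is the same as applying the two maps one after the other.

```javascript
// A 2x2 matrix acts as a linear map on column vectors [x, y].
const apply = (M, v) => [
    M[0][0] * v[0] + M[0][1] * v[1],
    M[1][0] * v[0] + M[1][1] * v[1],
]

// The matrix product, defined precisely so that
// apply(mul(A, B), v) equals apply(A, apply(B, v)).
const mul = (A, B) => [
    [A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]],
    [A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]],
]

const A = [[4, 7], [-3, 2]]
const B = [[10, -13], [2, 5]]
const v = [1, 1]

console.log(apply(mul(A, B), v))   // composition via the product matrix: [37, 23]
console.log(apply(A, apply(B, v))) // apply B, then A: also [37, 23]
```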

Bill Dubuque
  • 272,048
3

I just had a eureka moment after MANY years of trying to make sense of this strange construct. I want to share it with those who have been as confused as I was, and hopefully this will unconfuse you.

For a moment, let's forget about matrices and recall functions restricted to two operations: multiplication and addition.

For this example, I will exclusively use JavaScript lambda notation for functions, because this is 2016 and it is a functional notation that everyone reading this can almost surely run, either locally or in an online interpreter.

  1. const f = x => 2 is the same as f(x) = 2.
  2. const f = g => g(2) is the same as f(g) = g(2), a function that takes a function and calls that function with the value 2.
  3. const f = () => x => x + 1 is a function that takes "nothing" and creates a function that can be called with an x to add 1 to it.
  4. const f = (x, y) => x + y is a function that takes two inputs x and y and when you call it with f(1, 2) it will give you 3.
  5. Side note: const f = f => f is a function named f that takes one input named f and returns that input. This is fine because the inner f shadows the function itself, which is only a problem if you want to be recursive; even then, you can pass the function back to itself as an argument to bypass this.
  6. const cannot be reassigned, var can be reassigned; let also exists, but I'll avoid it for now.
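The notations above can be checked directly; this is a minimal sketch, and the constant names here are made up for illustration:

```javascript
// Item 2: a function that takes a function and calls it with the value 2.
const callWithTwo = g => g(2)

// Item 3: takes "nothing" and returns a function that adds 1 to its input.
const makeIncrement = () => x => x + 1

// Item 4: a function of two inputs.
const add = (x, y) => x + y

console.log(callWithTwo(x => x * 10)) // 20
console.log(makeIncrement()(41))      // 42
console.log(add(1, 2))                // 3
```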

Now that we are in 2016, let's invent a special operational space where we are only allowed to add and multiply. In this space, we can express functions like the following:

  • f = x => 2
  • f = (x, y) => x + y
  • f = (x, y) => x * y
  • f = (x, y) => x - (2 * y)

This is all nice, but it is one-dimensional: the result of each of these functions is always a single value. And there is really no need to have multiple arguments at all; we could just use a new structure called a vector (an array in JavaScript).

Instead of the

  • const f = (x, y, z, alpha, beta, zeta) => x + y

notation, let's start using this:

  • const f = A => A[0] + A[1]

where A is effectively a column vector.

To gain an intuition for this; here are some examples:

// 33
const f1 = A => A[0] 
console.log(f1([33]))

// 1 - (2 * 0.5) = 0
const f2 = A => A[0] - (2 * A[1])
console.log(f2([1, 0.5])) 

// 1 + 1 - 1 = 1
const f3 = A => A[0] + A[1] - A[2]
console.log(f3([1, 1, 1])) 

We lost our variable names, but we no longer have to make them up; we have a nice ordering of 0, 1, 2, 3... whereas letters are quite limiting.

Now let's say we have MANY functions, and we want to apply them to MANY inputs AT THE SAME TIME.

Let's start with a simple case; we have a function:

  • f = x => x + 2

and we want to apply it to each element of the array [1, 2, 3, 4, 5, 6], without changing anything in our original array; instead, we create a new array holding the new values.

We now have two choices; we can cheat and do:

const f = A => A.map(x => x + 2)
console.log(f([1, 2, 3, 4, 5, 6]))

This approach creates a function f representing a structure-preserving transformation of the input A: the result still has n elements, but the function has been applied to each of them to produce a new array.

This function can now be used on ANY array-like structure to apply x => x + 2 to all of its elements and produce a new array.

The second approach is to create an array that contains our functions and find two convenient rules.

For any two values and any two functions, we want the following to be true:

$$ result = \begin{pmatrix} x => x + 1 \\ y => y + 2 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix}$$

is absolutely the same as

const f = (x => x + 1)(1)
const g = (x => x + 2)(2)
const result = [f, g]

Suppose we want a function that automates this process of finding result for any 2 inputs; in the case above those two inputs were 1 and 2 but what if we had to do this hundreds of times?

const f = x => x + 1
const g = x => x + 2
const F = A => [f(A[0]), g(A[1])]

Now we have a function F that, given an array of two items, produces an array of the functions individually applied to those values.
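As a quick check (restating the definitions so the snippet is self-contained), F reproduces the result we built by hand above:

```javascript
const f = x => x + 1
const g = x => x + 2

// Apply the i-th function to the i-th value.
const F = A => [f(A[0]), g(A[1])]

console.log(F([1, 2])) // [2, 4], the same as building result by hand
```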

This is, in essence, what a matrix does, except that it is expressed as follows:

'use strict'

const M = [
    [1, 2, 0, 5], 
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]
const v = [1, 2, 0, 5]

const zip = (xs, ys) => 
    xs.map((x, index) => [x, ys[index]])

const row_vector_to_lambda = row_vector => 
    // get all [value, multiplier] pairs
    xs => zip(row_vector, xs)
        // multiply each value by its multiplier, making the array flat.
        .map(pair => pair[0] * pair[1])
        // boring summation.
        .reduce((prev, current) => prev + current, 0)

const matrix_to_lambdas = Matrix =>
    Matrix.map(row => row_vector_to_lambda(row))

const call_matrix = (Matrix, values) => 
    // pair row i's lambda with values[i] and apply it; passing n copies
    // of the same vector therefore gives matrix-vector multiplication.
    zip(matrix_to_lambdas(Matrix), values)
        .map(pair => pair[0](pair[1]))

const lambda = row_vector_to_lambda(v)
console.log(lambda([1, 2, 1, 1])) // 1*1 + 2*2 + 0*1 + 5*1 = 10

const lambdas = matrix_to_lambdas(M);
console.log(lambdas[0]([1, 2, 1, 1])); // the first row of M is v, so again 10

const result = call_matrix(M, [
    [1, 2, 1, 1],
    [1, 2, 1, 1],
    [1, 2, 1, 1],
    [1, 2, 1, 1]
]);

console.log(JSON.stringify(result)); // [10,1,2,1]

This "honestly" expresses how matrices map to anonymous functions via higher-order functions.

What we notice now is that matrices are a very efficient way to separate the representation of bundles of linear functions from their interpretation. This is great for performance, but it means you carry the burden of remembering how to interpret them, whereas lambdas are honest and leave nothing out, but are much longer to write out.

To make a long story shorter: matrices work by being a construct that preserves multiplication and addition for any number of inputs and outputs, rather than only one. They are, at the same time, both an array of functions (each row vector represents a way to transform a column vector, e.g. how much to change the price of every item in a store, or in several stores at once) and everyday arrays of numbers (each column vector is e.g. a list of prices for the items in a store, with each index representing a different item).

This becomes convenient because you can represent 10 stores selling 3 items each as an 11-column, 4-row matrix (with the last row and column all zeros, except the cell at row 4, column 11, which is 1). If you then want to multiply or add the prices of any store by a constant, or based on one another, you can do that by multiplying on the left by an appropriate matrix representing that transformation. You can even represent the history of a made-up universe as a long chain of matrix multiplications (e.g. how prices changed from the start of the universe to the end, and what time it was in that universe at the time of any particular change).

Matrices also preserve groupings of transformations: you can produce a new transformation just by "matrix multiplying" two matrices, whereas function composition is a whole different story (see note 1).
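The store trick mentioned above (an extra row and column of zeros with a single 1 in the corner) can be sketched like this; the concrete prices and the update matrix are made up for illustration:

```javascript
// Prices as a column vector with a trailing 1 (a homogeneous coordinate),
// so that a single matrix can both scale prices and add constants to them.
const applyMatrix = (M, v) =>
    M.map(row => row.reduce((sum, m, i) => sum + m * v[i], 0))

// [price of item 0, price of item 1, price of item 2, 1]
const prices = [3, 4, 6, 1]

// Double item 0, add a flat 5 to item 1, leave item 2 alone.
// The last row keeps the trailing 1 intact.
const update = [
    [2, 0, 0, 0],
    [0, 1, 0, 5],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

console.log(applyMatrix(update, prices)) // [6, 9, 6, 1]
```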

(note 1) Well, in truth they are absolutely isomorphic: programs can convert matrices to functions and back. If we preserved source code and enforced a few new rules, functions would behave the same way as matrices; we would just need a new, stricter kind of function that supports addition and multiplication, and the kind of input/output it works on would be a bit more meta than we are used to. As you can see, it gets really counterintuitive really fast.

Dmytro
  • 1,155