
When trying to find the inverse of an $n\times n$ matrix $A$, one way of going about it is to solve $AX=I$, where $I$ is the $n\times n$ identity matrix and $X$ is some $n\times n$ matrix which is the inverse of $A$. Writing out the matrix product $AX$ leaves you with $n^2$ equations in $n^2$ unknowns (for $n=2$ I write these out below). Could someone explain to me why finding the inverse of an invertible matrix $A$ by writing it like this is valid:

$$\left(A|I\right)= \left( \begin{array}{cccc|cccc} a_{1,1}&a_{1,2} &\cdots &a_{1,n} &1 &0 &\cdots &0\\ a_{2,1}&a_{2,2} &\cdots &a_{2,n} &0 &1 &\cdots &0\\ \vdots &\vdots &\ddots &\vdots &\vdots &\vdots &\ddots &\vdots \\ a_{n,1}&a_{n,2} &\cdots&a_{n,n} &0 &0 &\cdots &1 \end{array} \right)$$
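
To make the $n^2$ equations concrete: for $n=2$, the system $AX=I$ reads

$$\begin{pmatrix}a_{1,1}&a_{1,2}\\a_{2,1}&a_{2,2}\end{pmatrix}\begin{pmatrix}x_{1,1}&x_{1,2}\\x_{2,1}&x_{2,2}\end{pmatrix}=\begin{pmatrix}1&0\\0&1\end{pmatrix},$$

that is, the four linear equations $a_{1,1}x_{1,1}+a_{1,2}x_{2,1}=1,$ $a_{1,1}x_{1,2}+a_{1,2}x_{2,2}=0,$ $a_{2,1}x_{1,1}+a_{2,2}x_{2,1}=0,$ $a_{2,1}x_{1,2}+a_{2,2}x_{2,2}=1$ in the four unknowns $x_{i,j}$.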

It makes sense to do this with a system $\mathit{A}\mathbf{x} = \mathbf{b}$, where $\mathbf{x}, \mathbf{b}$ are column vectors and $A$ is a coefficient matrix. Solving the augmented matrix shown above isn't difficult; I understand how to do it, and how to get a solution, but I don't understand why it's a valid thing to do. I mean, the identity matrix $I$ on the right isn't a column vector, and as such, when I row-reduce $A$ to the identity matrix, I get:

$$\left(I|C\right)= \left( \begin{array}{cccc|cccc} 1 &0 &\cdots &0 &c_{1,1} &c_{1,2} &\cdots &c_{1,n}\\ 0 &1 &\cdots &0 &c_{2,1} &c_{2,2} &\cdots &c_{2,n}\\ \vdots &\vdots &\ddots &\vdots &\vdots &\vdots &\ddots &\vdots \\ 0 &0 &\cdots &1 &c_{n,1} &c_{n,2} &\cdots &c_{n,n} \end{array} \right)$$

which means that each of my variables is equal to a row vector. For example, $x_{1,1}$ would be $x_{1,1}=[c_{1,1}, c_{1,2}, \cdots ,c_{1,n}]$. How is this possible? It doesn't make any sense to me at all. It makes me wonder what the unknowns of a system with coefficient matrix $A$ even are; apparently, they're row vectors. But how can that be, when we were originally trying to solve $AX=I$: a matrix product which yields only linear equations, namely dot products of the coefficients $a_{i,j}$ with the variables $x_{i,j}$?

Ius Klesar

2 Answers


Doing row operations is equivalent to multiplying your matrix from the left by an elementary matrix, thus you get

$$E_m E_{m-1}\cdot\ldots\cdot E_2E_1A=I$$

Now the above simply means $\;E_m\cdot\ldots\cdot E_1= A^{-1}\;$, and this is precisely what the augmented-matrix method computes: the same row operations that reduce the left block $A$ to $I$ turn the right block $I$ into $E_m\cdot\ldots\cdot E_1\,I=A^{-1}$.
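
To see this concretely, here is a minimal NumPy sketch (the helper name `inverse_by_row_ops` is just for illustration, and it assumes $A$ is invertible): every row operation is recorded as an elementary matrix multiplied on the left, and the running product of those elementary matrices ends up being $A^{-1}$.

```python
import numpy as np

def inverse_by_row_ops(A):
    """Gauss-Jordan elimination on A, recording each row operation
    as an elementary matrix E applied on the left. Assumes A is
    invertible; the running product E_m ... E_1 is then A^{-1}."""
    n = A.shape[0]
    M = A.astype(float).copy()
    product = np.eye(n)                  # running product E_k ... E_1
    for col in range(n):
        # swap a (large) nonzero pivot into place if needed
        pivot = col + np.argmax(np.abs(M[col:, col]))
        if pivot != col:
            E = np.eye(n)
            E[[col, pivot]] = E[[pivot, col]]      # row-swap matrix
            M, product = E @ M, E @ product
        # scale the pivot row so the pivot becomes 1
        E = np.eye(n)
        E[col, col] = 1.0 / M[col, col]
        M, product = E @ M, E @ product
        # eliminate every other entry in this column
        for row in range(n):
            if row != col:
                E = np.eye(n)
                E[row, col] = -M[row, col]         # row-addition matrix
                M, product = E @ M, E @ product
    # at this point M == I, so product @ A == I
    return product

A = np.array([[1.0, 2.0], [3.0, -1.0]])
Ainv = inverse_by_row_ops(A)
print(np.allclose(Ainv @ A, np.eye(2)))       # True
print(np.allclose(Ainv, np.linalg.inv(A)))    # True
```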

You can also do the above with column operations, which is the same as multiplying your matrix from the right by elementary matrices, but never mix row and column operations: if you began with one kind, stick to it all the way through. (Mixing them only gives you $E_m\cdots E_1\,A\,F_1\cdots F_k=I$, and neither product of elementary matrices is, by itself, $A^{-1}$.)

DonAntonio
  • Yes, I'm aware of that. However, say I've got some other system I'm trying to solve, $AX=B$? This doesn't fundamentally make sense to me when $X$ and $B$ aren't column vectors. – Ius Klesar May 31 '14 at 17:27

If you understand how to solve $Ax=b,$ where $x,b$ are column vectors, then you are done. What you are doing when using this method is just solving $n$ systems simultaneously, $Ax_i=e_i,$ where $e_i=(0,\cdots, 1,\cdots, 0)^T$ has the $1$ in the $i$-th position.

Thus column $i$ of $C$ satisfies $Ac_i=e_i,$ $i=1,\cdots,n.$ That is, $AC=I,$ or, in other words, $C=A^{-1}.$
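
Here is a minimal NumPy sketch of this column-by-column viewpoint (using `np.linalg.solve` for each system; the $2\times 2$ matrix is the one from the example below):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, -1.0]])
n = A.shape[0]

# Solve A c_i = e_i for each standard basis vector e_i; stacking the
# solutions as columns gives C with AC = I, i.e. C = A^{-1}.
C = np.column_stack([np.linalg.solve(A, e) for e in np.eye(n)])

print(np.allclose(A @ C, np.eye(n)))   # True
```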

Edit

Let us see an example. Assume we want to solve the linear system

$$\left(\begin{matrix}1 & 2 \\ 3 & -1 \end{matrix} \right)\left(\begin{matrix}x \\ y \end{matrix}\right)=\left(\begin{matrix} 3 \\ 2 \end{matrix}\right).$$ So

$$\left(\begin{array}{rr|r} 1 & 2 & 3 \\ 3 & -1 & 2\end{array} \right)\rightarrow \cdots \rightarrow\left(\begin{array}{rr|r} 1 & 0 & 1 \\ 0 & 1 & 1\end{array} \right).$$

Now, if we want to get $A^{-1}$ we have to find a matrix $C=\left(\begin{matrix} x & u \\ y & v \end{matrix} \right)$ such that $AC=\left(\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix} \right).$ That is, we have to solve the following two systems

$$A \left(\begin{matrix} x \\ y \end{matrix} \right) =\left(\begin{matrix} 1\\ 0 \end{matrix} \right) \:\: \text{and} \:\: A \left(\begin{matrix} u\\ v \end{matrix} \right) =\left(\begin{matrix} 0\\ 1 \end{matrix} \right).$$ We can solve them separately

$$\left(\begin{array}{rr|r} 1 & 2 & 1\\ 3 & -1 & 0\end{array} \right) \:\: \text{and} \:\: \left(\begin{array}{rr|r} 1 & 2 & 0\\ 3 & -1 & 1\end{array} \right),$$ or, since we have to perform the same row operations on the matrix $A$ in both cases, we can solve them simultaneously:

$$\left(\begin{array}{rr|rr} 1 & 2 & 1 & 0\\ 3 & -1 & 0 & 1\end{array} \right).$$
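
Carrying the elimination through gives

$$\left(\begin{array}{rr|rr} 1 & 2 & 1 & 0\\ 3 & -1 & 0 & 1\end{array} \right)\rightarrow\left(\begin{array}{rr|rr} 1 & 2 & 1 & 0\\ 0 & -7 & -3 & 1\end{array} \right)\rightarrow\left(\begin{array}{rr|rr} 1 & 0 & 1/7 & 2/7\\ 0 & 1 & 3/7 & -1/7\end{array} \right),$$

so $C=A^{-1}=\frac{1}{7}\left(\begin{matrix} 1 & 2\\ 3 & -1\end{matrix}\right),$ which you can check satisfies $AC=I.$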

The names of the variables are of no importance at all. I have written

$$A \left(\begin{matrix} x \\ y \end{matrix} \right) =\left(\begin{matrix} 1\\ 0 \end{matrix} \right) \:\: \text{and} \:\: A \left(\begin{matrix} u\\ v \end{matrix} \right) =\left(\begin{matrix} 0\\ 1 \end{matrix} \right),$$ only to give a name to the entries of $C.$ There is no need to write them as $x_{i,j}.$ Actually we are solving

$$A \left(\begin{matrix} x \\ y \end{matrix} \right) =\left(\begin{matrix} 1\\ 0 \end{matrix} \right) \:\: \text{and} \:\: A \left(\begin{matrix} x\\ y \end{matrix} \right) =\left(\begin{matrix} 0\\ 1 \end{matrix} \right).$$ Of course, for each system we get a different solution (different values of the variables) because we are solving different systems. For the first system the solution is the first column of $C,$ for the second system the solution is the second column of $C,$ and so on.

mfl