17

How can we prove that all change-of-basis matrices are invertible? The trivial case of a change of basis for $\mathbb{R}^{n}$ is easily demonstrable using, for example, determinants. But I am struggling to show this rigorously for all bases, for example for a two-dimensional subspace of $\mathbb{R}^{4}$. I am sure that there are many ways to go about this proof, and I would appreciate as many of them as possible, to back up my intuition!

Arturo Magidin
  • 398,050
  • 7
    For something like this you probably should include the definitions that you use. This is because there are different pedagogical presentations of linear algebra that treat different things as definitions and different things as consequences of those definitions. In one of the presentations I am familiar with, a change-of-basis matrix is pretty much by definition invertible (being a square matrix of full rank). – Willie Wong Mar 08 '11 at 16:57
  • @Willie: in my class we acknowledge the matrix to be square, but I haven't heard the mention of full rank. I guess it would be a consequence of it being a change of basis matrix, but that's not something we spoke of. –  Mar 08 '11 at 17:01
  • Any vector space of finite dimension $n$ is isomorphic to $\mathbb{R}^n$, so if you can prove it for $\mathbb{R}^n$ you've proved it for all finite-dimensional vector spaces. In particular, whatever you can do with determinants to show it for $\mathbb{R}^n$, you can do the same thing with respect to arbitrary bases of arbitrary finite-dimensional vector spaces. – joriki Mar 08 '11 at 17:02
  • 2
    I agree with Willie. Off the top of my head, here is one way to look at it (which may or may not agree with your setup): view the change of basis matrix as a linear transformation $L$ which carries one basis $v_1,...,v_n$ to a different basis $w_1,...,w_n$. Then there is a unique linear transformation carrying the basis $w_1,...,w_n$ to $v_1,...,v_n$ and the matrix of this transformation must be the inverse to the first change of basis matrix. – Pete L. Clark Mar 08 '11 at 17:08
  • Maybe this specific example will clarify things a bit: if I am given $A=BC$ where $A$ and $B$ are 4x2 matrices whose columns form bases for a 2-dimensional subspace of $\mathbb{R^{4}}$, and $C$ is a 2x2 change-of-basis matrix, how do we show that $C$ is invertible? Using matrix operations, algebra, or anything else. –  Mar 08 '11 at 17:34
  • @Karamislo: please give your definition of "a change-of-basis matrix". Also see http://math.stackexchange.com/questions/21557/prove-that-if-s-is-a-change-of-basis-matrix-its-columns-are-a-basis-for-math – wildildildlife Mar 08 '11 at 17:41
  • 1
    What is a change-of-basis matrix except a matrix which is invertible? – Qiaochu Yuan Mar 08 '11 at 17:44
  • @wildildildlife: ok, let's say that in the example I cited $C$ is a 2x2 matrix, which when multiplied by a matrix whose columns form a basis produces another matrix whose columns form a basis for the same subspace. How do we show that such a matrix $C$ is invertible (not even calling it change-of-basis)? The question in the link you provided deals with bases for $\mathbb{R}^{n}$, for which case as I said the proof is trivial (also that question is concerned more with proving that the columns of the c-o-b matrix form a basis for a space, which is not the issue in my question). –  Mar 08 '11 at 17:51
  • @Karamislo: Why is the proof trivial for $\mathbb{R}^n$ and not for other finite-dimensional vector spaces? Your answer to that question might also go some way towards answering some of the other questions that've been asked about what you're assuming and what definitions you're using. – joriki Mar 08 '11 at 18:44
  • @joriki: for bases of the whole $\mathbb{R}^{n}$ I can write $A=BC$ where $A$, $B$, and $C$ are $n\times n$ matrices, and $A$ and $B$ have basis vectors of $\mathbb{R}^{n}$ as their columns. Then I can say the following: $\det A$ and $\det B$ are not zero, as the matrices are invertible since their columns are linearly independent. Then $\det C$ can't be zero either, otherwise the equality $\det A=\det B\det C$ would not hold. So, $C$ is invertible. But I can't figure out how to do something of this sort for a $4\times2$ matrix, for example. –  Mar 08 '11 at 18:53
  • 1
    @Karamislo: A change-of-basis matrix is never a $4\times2$ matrix. Irrespective of whether the vector space is $\mathbb{R}^2$ or a two-dimensional subspace of $\mathbb{R}^4$, as long as it's two-dimensional, all its bases have 2 elements, and any matrix representing a change from one basis to another is a $2\times2$ matrix, to which you can apply exactly the same reasoning as if it represented a change of basis in $\mathbb{R}^2$. – joriki Mar 08 '11 at 19:08
  • @joriki: I didn't say it was... I was referring to $A$ and $B$ being $4\times2$ matrices as in the example I presented before, in which case $A=BC$ where $C$ is $2\times2$. I can't invoke the determinant equality here since $A$ and $B$ are not square matrices. –  Mar 08 '11 at 19:32
  • @Karamislo: this conversation is going in circles. What is your definition of a change-of-basis matrix? – Qiaochu Yuan Mar 08 '11 at 19:35
  • @Qiaochu: My definition of a change-of-basis matrix is a matrix $C$ which, when multiplied by a matrix $B$ whose columns form a basis of a certain subspace, produces another matrix $A$ whose columns form a basis for the same subspace. As I showed, I can algebraically prove that this $C$ is invertible when $A$ and $B$ are square, but I can't find a way to do that when $A$ and $B$ are not square, as in the example with 4x2 matrices. (A numerical sketch of this setup follows these comments.) –  Mar 08 '11 at 19:37
  • Can't you just show that since the columns of the change-of-basis matrix are linearly independent, it is invertible by the fundamental theorem of invertible matrices? – Cormano Nov 12 '13 at 00:36
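To make the concrete setup discussed in the comments easy to experiment with, here is a minimal numerical sketch (my addition, not from the thread; it assumes NumPy, and the particular matrices are arbitrary illustrative choices): two $4\times 2$ matrices $A$ and $B$ whose columns are bases of the same plane in $\mathbb{R}^4$, related by $A=BC$, from which $C$ can be recovered and checked to be invertible.

```python
# Hedged sketch of the setup from the comments: A and B are 4x2 matrices whose
# columns are bases of the same 2-dimensional subspace of R^4, with A = B C.
# The particular numbers are arbitrary; only NumPy is assumed.
import numpy as np

B = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 1.0],
              [1.0, 1.0]])                  # columns: one basis of the plane
C = np.array([[2.0, 1.0],
              [1.0, 1.0]])                  # 2x2 coefficient matrix
A = B @ C                                   # columns: another basis of the same plane

# Recover C from A and B alone: B has full column rank, so B^T B is invertible
# and the normal equations (B^T B) C = B^T A pin C down uniquely.
C_recovered = np.linalg.solve(B.T @ B, B.T @ A)
assert np.allclose(C_recovered, C)

print(np.linalg.det(C_recovered))           # 1.0 (nonzero), so C is invertible
print(np.linalg.matrix_rank(C_recovered))   # 2
```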

3 Answers

39

What is a change-of-basis matrix? You have a vector space $\mathbf{V}$ (and it doesn't matter if $\mathbf{V}$ is all of $\mathbb{R}^n$, or some subspace thereof, or even something entirely different), and two different ordered bases for $\mathbf{V}$, $\beta_1$ and $\beta_2$ (necessarily of the same size, since two bases of the same vector space always have the same size): \begin{align*} \beta_1 &= \Bigl[ \mathbf{v}_1,\mathbf{v}_2,\ldots,\mathbf{v}_n\Bigr]\\ \beta_2 &= \Bigl[ \mathbf{w}_1,\mathbf{w}_2,\ldots,\mathbf{w}_n\Bigr]. \end{align*} A "change of basis" matrix is a matrix that translates from $\beta_1$ coordinates to $\beta_2$ coordinates. That is, $A$ is a change-of-basis matrix (from $\beta_1$ to $\beta_2$) if, given the coordinate vector $[\mathbf{x}]_{\beta_1}$ of a vector $\mathbf{x}$ relative to $\beta_1$, then $A[\mathbf{x}]_{\beta_1}=[\mathbf{x}]_{\beta_2}$ gives the coordinate vector of $\mathbf{x}$ relative to $\beta_2$, for all $\mathbf{x}$ in $\mathbf{V}$.

How do we get a change-of-basis matrix? We write each vector of $\beta_1$ in terms of $\beta_2$, and these are the columns of $A$: \begin{align*} \mathbf{v}_1 &= a_{11}\mathbf{w}_1 + a_{21}\mathbf{w}_2+\cdots+a_{n1}\mathbf{w}_n\\ \mathbf{v}_2 &= a_{12}\mathbf{w}_1 + a_{22}\mathbf{w}_2 +\cdots + a_{n2}\mathbf{w}_n\\ &\vdots\\ \mathbf{v}_n &= a_{1n}\mathbf{w}_1 + a_{2n}\mathbf{w}_2 + \cdots + a_{nn}\mathbf{w}_n. \end{align*} We know we can do this because $\beta_2$ is a basis, so we can express any vector (in particular, the vectors in $\beta_1$) as linear combinations of the vectors in $\beta_2$.

Then the change-of-basis matrix translating from $\beta_1$ to $\beta_2$ is $$ A = \left(\begin{array}{cccc} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{array}\right).$$
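As a sanity check on this construction, here is a hedged numerical sketch (my addition, not part of the answer): the two bases below are arbitrary bases of $\mathbb{R}^3$, stored as the columns of matrices named `basis1` and `basis2`, which are my own names. Column $j$ of $A$ is found by solving for the $\beta_2$-coordinates of $\mathbf{v}_j$.

```python
# Hedged sketch: build the change-of-basis matrix A from two assumed bases of R^3.
import numpy as np

basis1 = np.array([[1.0, 0.0, 1.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 1.0, 1.0]])   # columns v_1, v_2, v_3  (basis beta_1)
basis2 = np.array([[1.0, 1.0, 0.0],
                   [0.0, 1.0, 1.0],
                   [0.0, 0.0, 1.0]])   # columns w_1, w_2, w_3  (basis beta_2)

# Column j of A holds the coefficients a_{1j},...,a_{nj} of v_j in terms of the
# w's, i.e. basis2 @ A = basis1, so A is obtained by solving a linear system.
A = np.linalg.solve(basis2, basis1)

# Check the defining property: A turns beta_1-coordinates into beta_2-coordinates.
c1 = np.array([2.0, -1.0, 3.0])        # some beta_1-coordinate vector
x = basis1 @ c1                        # the actual vector it represents
assert np.allclose(basis2 @ (A @ c1), x)   # reassembling from beta_2-coordinates gives x back
```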

Why is $A$ always invertible? Because just like there is a change-of-basis from $\beta_1$ to $\beta_2$, there is also a change-of-basis from $\beta_2$ to $\beta_1$. Since $\beta_1$ is a basis, we can express every vector in $\beta_2$ using the vectors in $\beta_1$: \begin{align*} \mathbf{w}_1 &= b_{11}\mathbf{v}_1 + b_{21}\mathbf{v}_2 + \cdots + b_{n1}\mathbf{v}_n\\ \mathbf{w}_2 &= b_{12}\mathbf{v}_1 + b_{22}\mathbf{v}_2 + \cdots + b_{n2}\mathbf{v}_n\\ &\vdots\\ \mathbf{w}_n &= b_{1n}\mathbf{v}_1 + b_{2n}\mathbf{v}_2 + \cdots + b_{nn}\mathbf{v}_n. \end{align*} So the matrix $B$, with $$B = \left(\begin{array}{cccc} b_{11} & b_{12} & \cdots & b_{1n}\\ b_{21} & b_{22} & \cdots & b_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ b_{n1} & b_{n2} & \cdots & b_{nn} \end{array}\right),$$ has the property that given any vector $\mathbf{x}$, if $[\mathbf{x}]_{\beta_2}$ is the coordinate vector of $\mathbf{x}$ relative to $\beta_2$, then $B[\mathbf{x}]_{\beta_2}=[\mathbf{x}]_{\beta_1}$ is the coordinate vector of $\mathbf{x}$ relative to $\beta_1$.

But now, consider what the matrix $BA$ does to the standard basis of $\mathbb{R}^n$ (or $\mathbf{F}^n$, in the general case): what is $BA\mathbf{e}_i$, where $\mathbf{e}_i$ is the vector that has a $1$ in the $i$th coordinate and zeros elsewhere? It's a matter of interpreting this correctly: $\mathbf{e}_i$ is the coordinate vector relative to $\beta_1$ of $\mathbf{v}_i$, because $[\mathbf{v}_i]_{\beta_1}=\mathbf{e}_i$. Therefore, since $A[\mathbf{x}]_{\beta_1} = [\mathbf{x}]_{\beta_2}$ and $B[\mathbf{x}]_{\beta_2}=[\mathbf{x}]_{\beta_1}$ for every $\mathbf{x}$, we have: $$BA\mathbf{e}_i = B(A\mathbf{e}_i) = B(A[\mathbf{v}_i]_{\beta_1}) = B[\mathbf{v}_i]_{\beta_2} = [\mathbf{v}_i]_{\beta_1} = \mathbf{e}_i.$$ That is, $BA$ maps $\mathbf{e}_i$ to $\mathbf{e}_i$ for $i=1,\ldots,n$. The only way for this to happen is if $BA=I_n$ is the identity, since the columns of $BA$ are precisely the vectors $BA\mathbf{e}_1,\ldots,BA\mathbf{e}_n$. The same argument, now interpreting $\mathbf{e}_i$ as $[\mathbf{w}_i]_{\beta_2}$, shows that $AB$ is also the identity.

So $A$ and $B$ are both invertible.

So every change-of-basis matrix is necessarily invertible.
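A quick numerical check of this two-sided inverse relationship, using the same assumed example bases as in the previous sketch (again my addition, not part of the answer):

```python
# Hedged sketch: the matrix A (beta_1 -> beta_2) and the matrix B (beta_2 -> beta_1)
# built from two assumed bases of R^3 multiply to the identity in both orders.
import numpy as np

basis1 = np.array([[1.0, 0.0, 1.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 1.0, 1.0]])   # columns: beta_1
basis2 = np.array([[1.0, 1.0, 0.0],
                   [0.0, 1.0, 1.0],
                   [0.0, 0.0, 1.0]])   # columns: beta_2

A = np.linalg.solve(basis2, basis1)    # beta_1-coordinates -> beta_2-coordinates
B = np.linalg.solve(basis1, basis2)    # beta_2-coordinates -> beta_1-coordinates

assert np.allclose(B @ A, np.eye(3))
assert np.allclose(A @ B, np.eye(3))
```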

It doesn't really matter if you are considering a subspace of $\mathbb{R}^N$, a vector space of polynomials or functions, or any other vector space. So long as it is finite dimensional (so that you can define the "change-of-basis" matrix), change-of-basis matrices are always invertible.

Added. I just saw the comment where you give the definition you have of change-of-basis matrix: a matrix $C$ which, when multiplied by a matrix $B$ whose columns form a basis of a certain subspace, produces another matrix $A$ whose columns form a basis for the same subspace.

This matrix $C$ is just the matrix that expresses the columns of $A$ in terms of the columns of $B$. That is, it's the change-of-basis matrix from "columns-of-A" coordinates to "columns-of-B" coordinates.

For example, take the subspace of $\mathbb{R}^4$ given by $x=z$ and $y=w$, with basis $$\left(\begin{array}{c}1\\0\\1\\0\end{array}\right),\quad\left(\begin{array}{c}0\\1\\0\\1\end{array}\right),$$ and now consider the same space, but with basis $$\left(\begin{array}{c}1\\1\\1\\1\end{array}\right),\quad \left(\begin{array}{r}1\\-2\\1\\-2\end{array}\right).$$ The matrix $C$ such that $$ \left(\begin{array}{rr} 1 & 1\\ 1 & -2\\ 1 & 1\\ 1 & -2 \end{array}\right) = \left(\begin{array}{cc} 1 & 0\\ 0 & 1\\ 1 & 0\\ 0 & 1 \end{array}\right)C$$ is obtained by writing each vector in the columns of $A$ in terms of the columns of $B$: \begin{align*} \left(\begin{array}{r} 1\\1\\1\\1\end{array}\right) &= 1\left(\begin{array}{c}1\\ 0\\ 1\\ 0\end{array}\right) + 1\left(\begin{array}{c}0 \\ 1 \\ 0 \\ 1\end{array}\right),\\ \left(\begin{array}{r} 1\\ -2\\ 1\\ -2\end{array}\right) &= 1\left(\begin{array}{c}1\\0\\1\\0\end{array}\right) -2\left(\begin{array}{c}0\\1\\0\\1\end{array}\right). \end{align*} And so, the matrix $C$ is $$C = \left(\begin{array}{rr} 1 & 1\\ 1 & -2 \end{array}\right).$$ Expressing the columns of $B$ in terms of the columns of $A$ gives the inverse: \begin{align*} \left(\begin{array}{c}1\\ 0\\ 1\\ 0\end{array}\right) &= \frac{2}{3}\left(\begin{array}{c}1 \\ 1\\ 1\\ 1\end{array}\right) + \frac{1}{3}\left(\begin{array}{r}1 \\ -2\\ 1\\ -2\end{array}\right)\\ \left(\begin{array}{c}0\\ 1\\ 0\\ 1\end{array}\right) &= \frac{1}{3}\left(\begin{array}{c} 1\\ 1\\ 1\\ 1\end{array}\right) -\frac{1}{3}\left(\begin{array}{r}1\\ -2\\ 1\\ -2\end{array}\right), \end{align*} so the inverse of $C$ is: $$C^{-1} = \left(\begin{array}{rr} \frac{2}{3} & \frac{1}{3}\\ \frac{1}{3} & -\frac{1}{3} \end{array}\right),$$ which you can verify by multiplying by $C$.
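For completeness, here is a small numerical verification of this worked example (the NumPy check is my addition; the numbers are the ones above):

```python
# Numerical check of the worked example: A = B C, and C_inv is the inverse of C.
import numpy as np

B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])           # first basis of the plane x=z, y=w
A = np.array([[1.0,  1.0],
              [1.0, -2.0],
              [1.0,  1.0],
              [1.0, -2.0]])          # second basis of the same plane
C = np.array([[1.0,  1.0],
              [1.0, -2.0]])
C_inv = np.array([[2/3,  1/3],
                  [1/3, -1/3]])

assert np.allclose(B @ C, A)              # C expresses the columns of A via the columns of B
assert np.allclose(C @ C_inv, np.eye(2))  # C_inv really is the inverse of C
assert np.allclose(np.linalg.inv(C), C_inv)
```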

joriki
  • 238,052
Arturo Magidin
  • 398,050
  • @Arturo Magidin: Nice explanation -- but I don't understand why you brought the standard basis of $\mathbb{R}^n$ into it -- why not just say that since $A$ translates $\beta_1$ coordinates into $\beta_2$ coordinates and $B$ translates $\beta_2$ coordinates into $\beta_1$ coordinates, then applying first $A$ and then $B$ translates $\beta_1$ coordinates into $\beta_2$ coordinates and back to $\beta_1$ coordinates, and thus $BA$ must be the identity matrix (and likewise for $AB$)? – joriki Mar 08 '11 at 21:28
  • 3
    @joriki: Actually, now I remember why: Since you are dealing with the matrices $A$ and $B$, you would have to think about the transformation that maps from $\mathbf{V}$ to $\mathbb{R}^n$ via "coordinate vector", then map to $\mathbb{R}^n$ via $BA$, then map back to $\mathbf{V}$ via "what the coordinate vector means", so I thought it would be a bit more cumbersome than just directly seeing what the matrices $AB$ and $BA$ do to a basis for $\mathbb{R}^n$. – Arturo Magidin Mar 08 '11 at 23:40
  • @Arturo Magidin: I don't understand. The thought of 'a transformation that maps from $\mathbf{V}$ to $\mathbb{R}^n$ via "coordinate vector"' never crosses my mind when I think about this -- all you need is the fact that the coordinates of a vector in a basis are unique -- then if $BA$ translates from one set of coordinates to another and back, it has to be the identity. To my mind, mentioning "the standard basis of $\mathbb{R}^n$" just confuses things, since it sounds as if $\mathbb{R}^n$ is entering as a vector space, whereas actually it's only entering as the set of n-tuples of coordinates. – joriki Mar 09 '11 at 05:18
  • @joriki: I disagree that $\mathbb{R}^n$ only enters as "set of n-tuples of coordinates". In fact, the act of writing the coordinate vector is a linear transformation from $\mathbf{V}$ to $\mathbb{R}^n$ (in fact, it's how we prove that an $n$-dimensional vector space is isomorphic to $\mathbf{F}^n$). If you're not used to these things, you do think about those "translations", and I suspect the OP is not used to them yet. But in any case, it's a disagreement between you and me on how to explain things, not on substance. – Arturo Magidin Mar 09 '11 at 05:23
  • @Arturo Magidin: I disagree that it's a disagreement :-) I began both my comments with "I don't understand". I've noticed that you're good at explaining things to people who are "not used to these things", and I'm always trying to get better at that; it's quite likely that you're right and I just don't see it yet. I agree that we view a coordinate tuple as a vector when we prove isomorphism, but you're not proving isomorphism here, and I still don't see how a basis of $\mathbb{R}^n$ plays any role here and why it helps to consider the coordinate tuple as anything other than just a tuple. – joriki Mar 09 '11 at 09:08
  • @Arturo Magidin: (Of course we view the tuple as a row "vector" when we multiply it by matrices, but that doesn't rely on bases of $\mathbb{R}^n$ or viewing $\mathbb{R}^n$ as a vector space.) – joriki Mar 09 '11 at 09:11
  • @joriki: The idea for my argument was: "How do we show that $AB$ is the identity? We show that it acts like the identity on a basis" (that's pretty much the same as what you're saying, I believe). What basis? $AB$ is a matrix; we can either "translate" this matrix into a linear transformation and see how it acts on $\beta_1$ (what you are proposing to do); or else we can let $AB$ act on the space it naturally acts on, $\mathbb{R}^n$, and interpret those vectors in some way that make the action of $AB$ on them clear. You are proposing the former, I did the latter. – Arturo Magidin Mar 09 '11 at 16:41
  • @Arturo Magidin: I think there's a misunderstanding there. I'm not proposing to see how $AB$ acts on $\beta_1$. My point is that there's no need to mention any basis at this point, neither one of $\mathbb{R}^n$, nor one of $\mathbf{V}$. $AB$ is the identity simply because multiplication by $B$ represents a translation from $\beta_2$ coordinates to $\beta_1$ coordinates and multiplication by $A$ represents a translation from $\beta_1$ coordinates to $\beta_2$ coordinates, so applying them one after the other translates from $\beta_2$ coordinates to $\beta_2$ coordinates, which is the identity. – joriki Mar 09 '11 at 16:58
  • @joriki: Okay; rather than look at specific vectors like I did, you propose looking at a generic vector and simply interpreting it as a "coordinate vector". Sure, that definitely works. I happen to be teaching advanced linear algebra at the moment, and "look at what happens to a basis" is one of our standard tricks (e.g., to show two linear transformations are equal), so I may just have gone down that road by inertia rather than pedagogical concerns. – Arturo Magidin Mar 09 '11 at 17:22
0

You can prove this by linking change-of-basis matrices to linear transformations. See Kuldeep Singh's Linear Algebra: Step by Step (2013), p. 414.

[Image: excerpt from Kuldeep Singh, Linear Algebra: Step by Step (2013), p. 414]

Let me know if you want me to post the solution to Exercise 5.6.

  • 1. This doesn't answer the question being asked. 2. The text is rather obscure in its usage of "invertible linear transformation": it can happen that only one of (a) or (b) holds, not both. We usually use the notion of "invertible linear transformation" for linear operators (a linear transformation from a vector space to the same vector space, $T:V\to V$) satisfying (a) and (b) simultaneously (see also Proposition 5.36). – user600016 Mar 30 '21 at 13:34
0

Let $M=[I]_{B_1}^{B_2}$ be the change-of-basis matrix of the identity operator $I:V \rightarrow V$, with $B_1,B_2$ being bases for the vector space $V$.

We will show that the equation $Mx=0$ has $x=0$ as the only solution, where $x$ is a coordinate column vector in $\mathbb{R}^n$ (with $n=\dim V$).

As mentioned in the answer above as well, the columns of the matrix $M$ are the $B_2$-coordinate representations of the vectors obtained by the action of $I$ on the vectors of the basis $B_1$.

In mathematical terms:

Let $B_1=\{u^1,u^2,\ldots,u^n \}$ and $B_2=\{v^1,v^2,\ldots,v^n \}$ for $u^1,\ldots,u^n,v^1,\ldots,v^n \in V$.

$I(u^j) = u^j = a_{1j} v^1 + a_{2j} v^2+\ldots+a_{nj}v^n$ for some constants $a_{1j},a_{2j},\ldots,a_{nj}$, so the $B_2$-coordinate vector of $u^j$ is $[u^j]_{B_2} = (a_{1j},a_{2j},\ldots,a_{nj})^{T}$.

Let's denote the $j$th column of the matrix $M$, namely $\begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{nj} \end{pmatrix} = [u^j]_{B_2}$, by $c_j$.

Claim: The columns of matrix $M$ are linearly independent.

Consider $\alpha_1 c_1 + \alpha_2 c_2 + \ldots + \alpha_n c_n = \alpha_1 [u^1]_{B_2} + \alpha_2 [u^2]_{B_2} + \ldots + \alpha_n [u^n]_{B_2}=0$

$\implies [\alpha_1 u^1 + \alpha_2 u^2 + \ldots + \alpha_n u^n]_{B_2} = 0$ (taking $B_2$-coordinates is linear)

$\implies \alpha_1 u^1 + \alpha_2 u^2 + \ldots + \alpha_n u^n = 0$ (the only vector all of whose coordinates are zero is the zero vector)

$\implies \alpha_1 = \alpha_2 = \ldots = \alpha_n = 0$ (as the $u^j$'s form the basis $B_1$ and are therefore linearly independent).

Hence, for $x = \begin{pmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{pmatrix} \in \mathbb{R}^n$ we have:

$Mx=0 \implies x_1c_1+x_2c_2+\ldots+x_nc_n=0 \implies x_1=x_2=\ldots=x_n=0$ (from the claim above).

That is, $x=0$ is the only solution of $Mx=0$, and hence $M$, the change-of-basis matrix, is invertible.

Note: Instead of considering $Mx=0$, we could also have completed the proof by concluding that $M$ has full rank and hence is invertible.
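A small numerical illustration of this argument (my addition; the two bases of $\mathbb{R}^3$ below are arbitrary assumed examples): the columns of $M$ are linearly independent, so $Mx=0$ forces $x=0$.

```python
# Hedged sketch: M = [I]_{B1}^{B2} for two assumed bases of R^3, written as the
# columns of B1cols (basis B1) and B2cols (basis B2).
import numpy as np

B1cols = np.array([[1.0, 0.0, 1.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 1.0, 1.0]])   # columns u^1, u^2, u^3
B2cols = np.array([[1.0, 1.0, 1.0],
                   [0.0, 1.0, 1.0],
                   [0.0, 0.0, 1.0]])   # columns v^1, v^2, v^3

# Column j of M holds the B2-coordinates of u^j, i.e. B2cols @ M = B1cols.
M = np.linalg.solve(B2cols, B1cols)

print(np.linalg.matrix_rank(M))   # 3: the columns c_1, ..., c_n are linearly independent
print(np.linalg.det(M))           # nonzero, so Mx = 0 forces x = 0 and M is invertible
```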

user600016
  • 2,165