15

I was reviewing my foundations in linear algebra and realized that I am confused about independence and dependence. I understand that by definition independence means:

A set of vectors $\{x_1,\ldots,x_k\}$ is independent if the only linear combination that gives the zero vector is the trivial one, i.e. $[x_1, \ldots, x_k]c = Xc = 0$ iff $c=0$.

I understand what the definition says, but it goes somewhat against my intuition of what the definition of dependence should be (and hence its negation, independence). In my head, dependence intuitively means that a set of vectors depends on each other; in other words, one should always be able to express each vector as a linear combination of the others. Something like:

$$ \forall x_i \in \{x_1,\ldots,x_k\}, \exists c \neq 0 : \sum_{j \neq i} c_j x_j = x_i$$

However, my definition above (which I know is wrong and is not the standard definition; I am trying to come to terms with why it is wrong) implies that a set of independent vectors with the zero vector tacked on is not dependent (i.e. it is independent), which is the opposite of what it should be. That is, under my definition, tacking on the zero vector leaves the set independent (this should be wrong because the coefficient vector $[0,\ldots,0,1]$ is not the zero vector, yet it gives $0$, and only the zero coefficient vector should give $0$).

Consider the simple example $\{ x_1,x_2,0 \}$, where $x_1,x_2$ give zero only with the zero coefficient vector (the standard definition of independence). With my definition of things it is obvious that these vectors are independent. In reality they should be dependent, because $[0,0,1]$ is now in the nullspace, and things are only independent if only the zero vector is in the nullspace. With my definition the vectors are independent because there is no way to express any of them in terms of each other. For example:

  1. $a x_1 + b x_2 = 0$
  2. $c x_1 + d 0 = x_2$
  3. $e x_2 + f 0 = x_1$

None of the above can be made true with nonzero (non-trivial) linear combinations. Thus, the vectors are not dependent, so they are independent. I know it's sort of an "edge case" of the definition, but it sort of flipped my world to find out that I've been thinking wrongly about such a fundamental concept as independence and dependence in linear algebra, and I'm trying to come to terms with it.
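To make the edge case concrete, here is a minimal numerical sketch I put together (the specific vectors are arbitrary stand-ins I chose; numpy is only used to check the rank and a null vector):

```python
import numpy as np

# Arbitrary stand-ins for x_1, x_2: two independent vectors in R^2.
x1 = np.array([1.0, 0.0])
x2 = np.array([0.0, 1.0])
zero = np.zeros(2)

# Standard definition: the set is independent iff Xc = 0 only for c = 0,
# i.e. iff the matrix with these vectors as columns has full column rank.
X = np.column_stack([x1, x2, zero])
print(np.linalg.matrix_rank(X))          # 2, but there are 3 columns -> dependent

# The nontrivial null vector mentioned above: c = [0, 0, 1] is not zero, yet Xc = 0.
c = np.array([0.0, 0.0, 1.0])
print(np.allclose(X @ c, np.zeros(2)))   # True
```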

Why is my intuition incorrect? Why was the standard definition of independence, $Xc = 0 \iff c=0$, the accepted definition of independence? What's wrong with my definition? Are they essentially the same definition except for this weird edge case?


The last footnote is about what the word dependence means with respect to the number zero and the vector zero. I think what my last confusion boils down to is why $0x = \mathbf{0}$ is considered as $\mathbf{0}$ depending on $x$. In my head, saying that we don't need any of $x$ to express $\mathbf{0}$ seems to mean that $\mathbf{0}$ doesn't need $x$ (or any other vector). But the convention, according to everything pointed out by everyone in this set of answers, is the opposite. I don't understand why. Is it that just having an equation linking terms means dependence, even if we specify with a zero coefficient that we don't actually need the term?

8 Answers

22

Your intuition for linear (in)dependence is very close. Based on your intuition, the definition you're looking for is:

$\{v_1, ..., v_k\}$ is linearly dependent if there exists an index $i$ and scalars $c_1, ..., c_k$ (excluding $c_i$) such that $v_i = \sum_{j \ne i} c_j v_j.$

You can prove that this is equivalent to the standard definition.

Notice how this differs from your proposed definition:

(1) It says there exists a $v_i$, not for all $v_i$.

(2) There is no requirement that the $c_j$'s be nonzero.

(1) is important because all it takes is a single redundancy to get linear dependence. Not all vectors have to be expressible in terms of the others. To see why this is the case, just think about the case where a set $\{v_1, \ldots, v_k\}$ is already dependent and then I suddenly add a $v_{k+1}$ which cannot be expressed as a linear combination of $v_1, \ldots, v_k$. Adding a vector to a dependent set shouldn't turn it into an independent set.

As for (2), the standard definition needs to say that the $c$'s can't all be $0$ because you don't want $\sum 0 v_i = 0$ to imply dependence. But with the above definition, you've already singled out a vector to have a coefficient of $1$ (which is not $0$), so you don't need any condition on the $c$'s anymore.
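If it helps, here is a small numerical sketch (an illustration with arbitrarily chosen vectors; the helper names are made up for this example) comparing the standard null-space test with the "some $v_i$ is a combination of the others, zero coefficients allowed" test:

```python
import numpy as np

def dependent_standard(vectors):
    # Standard definition: dependent iff Xc = 0 has a nonzero solution,
    # i.e. iff the rank is less than the number of vectors.
    X = np.column_stack(vectors)
    return np.linalg.matrix_rank(X) < len(vectors)

def dependent_some_vi(vectors):
    # Equivalent form: dependent iff SOME v_i is a linear combination
    # of the others (the coefficients c_j may be zero).
    for i, v in enumerate(vectors):
        others = [w for j, w in enumerate(vectors) if j != i]
        A = np.column_stack(others)
        c, *_ = np.linalg.lstsq(A, v, rcond=None)  # best least-squares fit
        if np.allclose(A @ c, v):                  # exact fit => v_i is in span(others)
            return True
    return False

e1, e2, z = np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.zeros(2)
for s in ([e1, e2], [e1, e2, z], [e1, 2 * e1, e2]):
    print(dependent_standard(s), dependent_some_vi(s))
# Prints: False False / True True / True True -- the two tests agree.
```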

Ted
  • 33,788
  • I think what I am still struggling with a lot is why we don't require the $c_i$'s to be non-zero. Do you mind clarifying that? – Charlie Parker Jul 16 '17 at 04:48
  • 3
    I'm not sure what you're unclear on. Linear dependence = one of the vectors can be expressed as a linear combination of the others. It shouldn't matter if the linear combination involves zero coefficients or not. It's only when you don't single out one of the vectors (as in the standard definition) that you say that the coefficients can't all be 0, because when a vector has coefficient 0, it's "not really there", so if all coefficients are 0 then you don't really have any vectors around at all. But if you use the above definition, then you already have a vector which is "there". – Ted Jul 16 '17 at 05:02
  • I think what is unclear to me is why "it shouldn't matter if the linear combination involves zero coefficients or not." That's what's driving me crazy right now. :( Why does 0x=0 mean 0 depends on x? It doesn't depend on x, since no amount of x is being used. Shouldn't that mean 0 and x are independent? – Charlie Parker Jul 16 '17 at 05:27
  • 3
    @Pinocchio, "dependent" is an English word that fits most of the time. When the zero vector is around, it's a bad fit. Since there's no English word that meant something like "one of the members of this set is redundant", we just have to live with the word dependent. – Mark S. Jul 16 '17 at 05:41
  • @MarkS. so basically what you're saying is that the word dependence means redundancy (in this context)? – Charlie Parker Jul 16 '17 at 06:00
  • 2
    @Pinocchio The word dependent in math means [the formal definition you don't like] or anything equivalent to that, and it serves a mathematical purpose. The English word "dependent" was probably chosen because it fits the intuition in the common practical cases when the zero vector is not around. – Mark S. Jul 16 '17 at 06:03
  • Not sure if this piece of justification adds anything, but I like thinking of the case where things are dependent if they lie on the same line (so there is an equation linking them). If we have the vectors x and 2x, we shrink 2x by half to bring it to x. In a way it seems like the same process to consider the vector x and shrink it to the origin. So they are dependent in the mathematical sense that there is an equation connecting them. At this point, if I keep questioning things, it even seems sensible to ask whether the origin point is really a "vector", since it has no size. – Charlie Parker Jul 16 '17 at 06:09
  • Anyway, I think in the end Ted is right: the only thing that really matters is whether there is at least a single 1 among the coefficients (choosing a "thing"), since if this happens we can rearrange the equation to show there is an equation connecting the term with the 1 and the other terms. It seems that the "all zeros" part of the definition is more of a trick to make the definition concise (and elegant?). The concept that the "all zeros" condition is really capturing is that when all coefficients are zero there is no real way to express one term in terms of the others. I think that's what's going on. – Charlie Parker Jul 16 '17 at 06:13
  • @Pinocchio : The way to see that $\vec{0}$ is a vector is to observe that $\vec{v} + (-\vec{v})$ is a vector (because the definition of the operations gives us that minus a vector is a vector and that the sum of two vectors is a vector). – Eric Towers Jul 16 '17 at 06:44
  • @Ted In your quote in the answer, should that say "linearly dependent", rather than "linearly independent"? – David Z Jul 16 '17 at 07:03
  • @DavidZ Yes. Fixed. – Ted Jul 16 '17 at 07:36
  • @EricTowers sorry for being dense eric, how does $\mathbf 0 = ( \mathbf v) + (- \mathbf v)$ help to see $\mathbf 0$ as a vector? – Charlie Parker Jul 16 '17 at 16:09
  • @Pinocchio : Do you have a vector, $v$, in some vector space? Then the operations of negation and addition force that $\vec{0}$ is in your vector space (and so is a vector). – Eric Towers Jul 16 '17 at 20:54
14

Let me address your last question (and hopefully it will help with clarifying some of your misconceptions):

Are they essentially the same definition except for this weird edge case?

No, not only in that case. Consider e.g. the following set of three vectors in $\mathbb{R}^2$:

$$\mathbf{v}_1=\begin{bmatrix}1\\0\end{bmatrix}, \quad \mathbf{v}_2=\begin{bmatrix}2\\0\end{bmatrix}, \quad \mathbf{v}_3=\begin{bmatrix}0\\1\end{bmatrix}.$$

It's easy to see that this set is linearly dependent according to the standard definition because $$2\mathbf{v}_1+(-1)\mathbf{v}_2+0\mathbf{v}_3=\mathbf{0}.$$ However it doesn't satisfy your definition. Although vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ can be expressed as (nontrivial) linear combinations of the other ones, viz. $\mathbf{v}_1=0.5\mathbf{v}_2+0\mathbf{v}_3$ and $\mathbf{v}_2=2\mathbf{v}_1+0\mathbf{v}_3$, we can't do the same with the last vector because the equation $$\mathbf{v}_3=c_1\mathbf{v}_1+c_2\mathbf{v}_2$$ clearly has no solutions.

Let me try to describe informally what I think is going on here. The standard definition of linear dependency basically says that there's some dependency somewhere, but not necessarily everywhere, as you seem to believe.

As @AOrtiz already said, one way to think of dependency is that it means redundancy in the given system of vectors. Look at it this way. Given a set of vectors, we may want to construct its span, i.e. the set of all linear combinations of those vectors. If the original set is linearly dependent, then it's redundant in the sense that you can remove some (but not arbitrary!) vectors and still have the same span. The standard definition of linear dependence helps us detect if that's the case.
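A quick numerical check of this example (just a sketch; least squares is used only to test whether a vector lies in the span of the others):

```python
import numpy as np

v1 = np.array([1.0, 0.0])
v2 = np.array([2.0, 0.0])
v3 = np.array([0.0, 1.0])

# Standard definition: dependent, e.g. 2*v1 + (-1)*v2 + 0*v3 = 0.
X = np.column_stack([v1, v2, v3])
print(np.linalg.matrix_rank(X) < 3)                    # True -> dependent
print(np.allclose(X @ np.array([2.0, -1.0, 0.0]), 0))  # True

# But v3 is NOT a linear combination of v1 and v2:
A = np.column_stack([v1, v2])
c, *_ = np.linalg.lstsq(A, v3, rcond=None)
print(np.allclose(A @ c, v3))                          # False
```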

zipirovich
  • 14,670
  • 3
    Good job pointing out that the redundancy that arises in linear dependence arises somewhere and that it is not the case that all vectors in a dependent set need to be linear combinations of the others. – Alex Ortiz Jul 16 '17 at 03:55
  • this is very interesting. So essentially "some dependency somewhere" is equivalent to redundancy, as this is what the definition of independence tries to capture, right? Independent means there is truly no dependence anywhere. But in my example I see how tacking on the vector $0$ is redundant; how is there dependence somewhere now with your new framework for thinking about the definition? Can't seem to find it. – Charlie Parker Jul 16 '17 at 03:59
  • @Pinocchio: You've incorrectly excluded linear combinations with all coefficients 0. – user2357112 Jul 16 '17 at 04:03
  • @Pinocchio: A statement equivalent to the standard definition of linear dependence says that there exists at least one vector (not arbitrary, as we've already agreed) which is a linear combination of the other vectors. Note that it only says "linear combination" -- it doesn't have to be nontrivial. (Which was another misconception in your "definition", by the way.) That's how your example fits into this framework: $\mathbf{0}=0\mathbf{x}_1+0\mathbf{x}_2$. – zipirovich Jul 16 '17 at 04:06
  • @zipirovich I am definitely very confused now because of the "non-trivial" example. Can you clarify that? How is it legal to just use zeros as the coefficients and call it dependent... isn't that what independent means? (sorry for being dense and thanks for the patience). – Charlie Parker Jul 16 '17 at 04:10
  • @AOrtiz: Thank you! By the way, you gave an excellent answer, and I couldn't agree more with it. But I also wanted to offer the OP a slightly different point of view with a more specific example. – zipirovich Jul 16 '17 at 04:11
  • 1
    @Pinocchio: In the last example I'm not saying that the set $\{\mathbf{x}_1,\mathbf{x}_2\}$ is linearly dependent -- it isn't. But the set $\{\mathbf{x}_1,\mathbf{x}_2,\mathbf{0}\}$ is linearly dependent. To avoid confusion, let's give the last vector a name too: $\mathbf{x}_3=\mathbf{0}$. From the point of view of the standard definition: dependent because e.g. $0\mathbf{x}_1+0\mathbf{x}_2+(-1)\mathbf{x}_3=\mathbf{0}$. Move the last term to the other side, and you'll get $\mathbf{x}_3=0\mathbf{x}_1+0\mathbf{x}_2$, i.e. linear dependence from the point of view of this other equivalent definition. – zipirovich Jul 16 '17 at 04:19
  • Am I missing something or is $c_1 = -2, c_2 = 1$ a solution to your equation? – YoTengoUnLCD Jul 16 '17 at 04:36
  • @zipirovich I see. That was definitely helpful. Though, why does the definition of dependence not require that the linear combination be non-trivial? I think if we get that, we've got the last nail in the coffin of my confusions. – Charlie Parker Jul 16 '17 at 04:44
  • @Pinocchio: See the answer from Ted below. The vector on the left-hand side of $\mathbf{v}_i=\sum\limits_{j\neq i}c_j\mathbf{v}_j$ has a coefficient of $1$. So even if all coefficients on the right-hand side are zeroes, you still have a nontrivial property, i.e. a nontrivial linear combination -- just move $1\mathbf{v}_i$ to the other side. – zipirovich Jul 16 '17 at 04:48
  • @YoTengoUnLCD: Yes, you're missing something -- the second (bottom) component. – zipirovich Jul 16 '17 at 04:50
  • @zipirovich I think what my confusion boils down to is why $0x = \mathbf{0}$ is considered as $\mathbf{0}$ depending on $x$. I guess in my head saying that we don't need any of $x$ to express $\mathbf{0}$ seems to mean that $\mathbf{0}$ doesn't need $x$ (or any other vector). But the convention according to everything pointed out by everyone in this set of answers points to the opposite. I don't understand why. Is it that just having an equation linking terms means dependence even if we specify with a zero that we don't actually need the term? – Charlie Parker Jul 16 '17 at 05:41
  • @zipirovich Ohh right, sorry, I read that equation too fast. – YoTengoUnLCD Jul 16 '17 at 06:09
  • @Pinocchio Why would $0x$ not be $0$? (pretend that last zero is bolded; I don't know the latex command) – user253751 Jul 17 '17 at 05:20
  • @immibis sorry, my sentence was phrased weirdly. Of course $0 \mathbf x = \mathbf 0$ is correct (that's not my complaint). My complaint is that that equation means dependence of the zero vector on the $\mathbf x$ vector. What I mean is that multiplying any vector $\mathbf x$ by the number zero is something we consider as the vector $\mathbf 0$ depending on $\mathbf x$. That's what I mean. Essentially, the word dependence does mean having an equation that links one vector to another, instead of what I was thinking, namely that if you need zero of something you don't really need it. – Charlie Parker Jul 17 '17 at 15:58
  • @immibis essentially I was really caught up on what the word dependence means (compared to natural language), which at this point I have accepted: it means we can express one vector in terms of the others in some way. The fact that the definition of independence requires all zeros in the combination expressing zero looked very weird to me, if my previous comment is correct. In the end, as Ted correctly pointed out, that's not what is going on; it's just sort of a "trick" to compactly express the definition of independence, plus we really don't want $ \sum 0 v = 0$ implying dependence, of course. – Charlie Parker Jul 17 '17 at 16:02
6

I find that many of my students think the same way. Instead of thinking about null linear combinations, they usually prefer to think of vectors as linear combinations of other vectors. And honestly, I probably do too. The definition of linear independence that is most intuitively geometric to me is that no vector in the list can be expressed as a linear combination of the others. This is equivalent to the other definitions of linear independence.

The negation of this is that some vector (not all vectors) in the list can be written as a linear combination of the others. That is linear dependence. It has nothing to do with non-zero linear combinations (otherwise, as you pointed out, adding $0$ to the list would preserve linear independence). The zero vector is always a linear combination of the other vectors; it adds nothing to the span, and therefore nothing to the dimension.

There are other cases, aside from $0$, where not every vector in a linearly dependent list can be expressed as a linear combination of the others. For example,

$$((1, 0), (2, 0), (0, 1))$$

Some vectors (i.e. $(1, 0)$ and $(2, 0)$) can be expressed as linear combinations of the others, but not all. There is still dependency in the list.

Hope that helps.

Theo Bendit
  • 50,900
4

I would prefer you state your definition of linear independence thusly:

Definition: The subset $\{v_1,\dots,v_n\}\subset V$ is linearly independent if whenever $a_1,\dots,a_n\in F$ and $$ a_1v_1+\dots+a_nv_n = 0, $$ then $a_1 = \dots = a_n = 0$.

Let's see how your intuition breaks down:

Definition: A set $A=\{v_1,\dots,v_n\}\subset V$ is linearly dependent if for each $v_i\in A$ there is a nontrivial linear combination $$ a_1v_1+\dots+\widehat{a_iv_i}+\dots+a_nv_n = v_i, $$ where the notation $a_1v_1+\dots+\widehat{a_iv_i}+\dots+a_nv_n$ means that $a_iv_i$ is excluded from the sum.

This says that every vector in the set $A$ can be expressed as a nontrivial linear combination of the other vectors. Well, what if we consider the set $A=\{e_1, 0\}$, where $e_1 = \begin{bmatrix} 1 \\ 0\end{bmatrix}$ is a column vector in $\Bbb R^2$, and $0$ is the zero vector. Then, according to our definition, this set $A$ is not dependent, since we can't express $0$ as a nontrivial linear combination of $e_1$. However, we expect this to be dependent because of course $0$ does depend on $e_1$, as in $0 = 0e_1$. [As I said in my comment to the OP below, I don't like how I originally phrased this—I would rather explain the intuition for why $\{e_1,0\}$ is dependent solely in terms of the redundancy that $0$ brings to this set.]

A better intuition for linear independence is that a set is linearly independent if we are specifying a minimal amount of information for the space it spans. That is, we can always consider the span of a set of vectors $A\subset V$. If we specify the minimal amount of information to achieve the span of $A$, then there are no redundancies: the vectors are independent. So dependent sets should be ones where we can find redundant information leftover.

To be concrete about the idea of how dependence $\leftrightarrow$ redundancies, consider the set $\{e_1,0\}$ again; this time consider its span too, i.e., $\{a_1e_1 + a_20:a_1,a_2\in F\}$. The $0$ vector is redundant because $\operatorname{span}(\{e_1,0\}) = \operatorname{span}(\{e_1\})$. Thus the $0$ vector is redundant, and the set is dependent.

On the other hand, if a set is independent, like $A=\{e_1,e_2\}\subset\Bbb R^2$, then we should not be able to remove even one vector from the set $A$ without changing $\operatorname{span}(A)$. This bears itself out here of course—reinforcing the intuition that independence $\leftrightarrow$ specifying the minimal amount of information.
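Here is a minimal sketch of the span-redundancy idea (my own illustration; it uses the fact that two sets of columns have the same span exactly when each set's rank equals the rank of all the columns stacked together):

```python
import numpy as np

def same_span(A_vecs, B_vecs):
    # span(A) == span(B) iff rank(A) == rank(B) == rank([A | B]).
    A, B = np.column_stack(A_vecs), np.column_stack(B_vecs)
    r = np.linalg.matrix_rank
    return r(A) == r(B) == r(np.column_stack([A, B]))

e1, e2, z = np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.zeros(2)

# {e1, 0} is dependent: the zero vector is redundant for the span.
print(same_span([e1, z], [e1]))    # True

# {e1, e2} is independent: removing either vector changes the span.
print(same_span([e1, e2], [e1]))   # False
print(same_span([e1, e2], [e2]))   # False
```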

Alex Ortiz
  • 24,844
  • Sorry, I am still confused about this last requirement of non-triviality (i.e. the zeros in the coefficients). Why do we allow $0 = 0 e_1$ to mean dependence? Isn't that what we want independence to eventually mean, or is there some weird exception with the number and the vector $0$? – Charlie Parker Jul 16 '17 at 04:52
  • I think what is unclear to me is why 0x=0 means 0 depends on x. For me it says that the 0 vector doesn't depend on x, since no amount of x is being used (i.e. the coefficient is zero). Shouldn't that mean 0 and x are not dependent? We are using no amount of x to form the target vector y (which is zero in this case by coincidence), so x and the target vector y are independent. – Charlie Parker Jul 16 '17 at 05:31
  • 1
    @Pinocchio I'm not sure if I really like the way I worded it. Looking back at it, I would rather have explained the intuition for why $\{e_1,0\}$ is dependent solely in terms of redundancy, since that is what is really at the root of what's important here. The language "$0$ depends on $e_1$" is a little too imprecise to be useful, and I wouldn't get too hung up on it. – Alex Ortiz Jul 16 '17 at 05:45
3

To say that one of the vectors is a linear combination of the others singles out a vector to play a different role from the others. And it's possible that there are some among them that are not linear combinations of the others, but also some that are.

The point of the conventional definition is to make a statement in which none of the vectors plays a role different from the roles of the others, at least in the statement of the definition.

2

Why is my intuition incorrect?

I posit your intuition is incorrect because you learned from a biased source.

You probably learned about the idea of independence from the talk of (in)dependent variables in introductory calculus. However, dependence there is not spoken of in a general sense, but is instead oriented towards a very specific application.

Specifically, actual problems are often most naturally expressed in terms of related variables, but introductory calculus tends to be presented in a very function-oriented manner. Thus, one is taught to re-express such problems in terms of functions, the typical method being to single out one or more variables (the 'independent variable(s)') to be used as function inputs, and interpreting the remaining variables as function outputs.

The general definition of independence in this setting is actually of the following form: a collection of variables is independent if and only if the only function $f$ satisfying $f(x_1, x_2, \ldots, x_n) = 0$ is the zero function.

You can also talk about more nuanced cases of independence, such as continuously independent ($f$ is restricted to continuous functions), differentiably independent ($f$ is restricted to differentiable functions), analytically independent ($f$ is restricted to analytic functions)... and, of course, the case at hand: linearly independent ($f$ is restricted to linear functions).


Incidentally, the fact that independence can be expressed in terms of comparisons to zero is a sort of weird quirk that is often applied to simplify definitions; the point may seem more intuitive when expressed in the following equivalent form:

A collection $\{ x_1, \ldots, x_n \}$ of vectors is linearly independent if and only if, whenever $f$ and $g$ are linear functions satisfying $f(x_1, \ldots, x_n) = g(x_1, \ldots, x_n)$, then $f = g$.
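To spell out why this is equivalent to the standard comparison-to-zero form (a short sketch, identifying a linear function of the $x_i$ with its list of coefficients): if $f$ and $g$ are linear, then $h = f - g$ is linear, and

$$ f(x_1, \ldots, x_n) = g(x_1, \ldots, x_n) \iff h(x_1, \ldots, x_n) = \sum_{i=1}^n c_i x_i = 0, $$

where the $c_i$ are the coefficients of $h$. So "$f$ and $g$ can only agree on the $x_i$ when $f = g$" says exactly that $\sum_i c_i x_i = 0$ forces every $c_i = 0$, which is the standard definition.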

  • why is: "The general definition of independence in this setting is actually of the following form: a collection of variables are independent if and only if the only function $f$ satisfying $f(x_1,...,x_n)=0$ is the zero function." the definition of independent variables in the (weird?) calculus setting? Do you mind clarifying this point? seems a bit odd/random? – Charlie Parker Jul 16 '17 at 06:22
2

I think if you look at "dependent" as the negation of "independent" instead of the other way around, it'll make sense to you.

Independent is the lack of any dependence. So if there is even the tiniest dependence (between only a subset of the vectors), the whole set of vectors is dependent.

Your proposed definition requires a dependence relation between all of the vectors, which is just a "higher level of dependence" (so to speak) than is required to negate "independent".

baris
  • 121
0

Let me try to give you some intuition on the term independence, since you are lacking one so far. The usual definition is indeed

\begin{align} \{ v_1 , \dots, v_n \} \text{ are linearly independent} :\Longleftrightarrow&~ \textstyle \Big( \forall \alpha_j~~ \sum_{j = 1}^n \alpha_j v_j = 0 ~\Longrightarrow~ \forall j ~~\alpha_j = 0 \Big) ~~~~(\ast) \\ \Longleftrightarrow&~ \textstyle \Big( \forall \alpha_j~~ \sum_{j = 1}^n \alpha_j v_j = 0 ~\Longleftrightarrow~ \forall j ~~\alpha_j = 0 \Big) \end{align}

The last line follows from the fact that $\forall j ~~\alpha_j = 0 ~\Rightarrow~ \sum_{j = 1}^n \alpha_j v_j = 0$ is trivially true.

But (as you can check) this is then also equivalent to the following

$\forall \alpha_j , \beta_j ~~\sum_{j = 1}^n \alpha_j v_j = \sum_{j = 1}^n \beta_j v_j ~~\Longleftrightarrow~~ \forall j ~~ \alpha_j = \beta_j$

which can be given a more intuitive interpretation. It basically says that two linear combinations (one with the $\alpha_j$'s and one with the $\beta_j$'s) of linearly independent vectors $v_j$ result in the same vector (i.e. $\sum \alpha_j v_j = \sum \beta_j v_j$) only when all the coefficients are already identical.

As you said, dependence is the negation of the above, and negating this reformulation of our definition $(\ast)$ gives

$\exists \alpha_j , \beta_j ~~\sum_{j = 1}^n \alpha_j v_j = \sum_{j = 1}^n \beta_j v_j ~~\text{and}~~ \exists j ~~ \alpha_j \neq \beta_j$

This says that the $v_j$ are dependent iff there are coefficients $\alpha_j, \beta_j$ which are not all the same (there is at least one $j$ such that $\alpha_j \neq \beta_j$) but still give the same linear combination $\sum \alpha_j v_j = \sum \beta_j v_j$.


tl;dr Independent vectors have unique linear combinations; Dependent vectors have ambiguous ones.
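A minimal numerical sketch of this uniqueness/ambiguity point (the vectors and coefficient choices are arbitrary illustrations):

```python
import numpy as np

# Dependent columns (v2 = 2*v1): coefficients are ambiguous.
v1, v2, v3 = np.array([1.0, 0.0]), np.array([2.0, 0.0]), np.array([0.0, 1.0])
X = np.column_stack([v1, v2, v3])

alpha = np.array([2.0, 0.0, 5.0])   # 2*v1 + 0*v2 + 5*v3
beta  = np.array([0.0, 1.0, 5.0])   # 0*v1 + 1*v2 + 5*v3
print(np.allclose(X @ alpha, X @ beta))   # True: same vector, different coefficients

# Independent columns: X @ (alpha - beta) = 0 forces alpha = beta,
# so every vector in the span has exactly one coefficient list.
Y = np.column_stack([v1, v3])
print(np.linalg.matrix_rank(Y) == 2)      # True -> unique coefficients
```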

Léreau
  • 3,015