15

Reading about the cyclic values returned by integral powers of $i$: $$\begin{align*} i^0&=\hphantom{-}1\\ i^1&=\hphantom{-}i\\ i^2&=-1\\ i^3&=-i\\ & \,\,\,\vdots \end{align*}$$

and the pattern continues. Now, a simple question popped up in my mind: $i^2 = -1\implies i\notin\mathbb{R}$. So, how did we come to the conclusion that: $$i^3=i^2\cdot i=-1\cdot i=-i$$ because $i$ does not necessarily have the properties of reals. Is this just an assumption for the construction of the complex numbers, or is there a proof?

codetalker
  • 2,419
  • 14
    Your assumption is that the "exponent rules" for example ($k^3 = k^2\cdot k$) are "real number rules." They aren't. The whole field of abstract algebra developed from this observation: very different sets, with very different operations, have similar properties. Often the elements of those sets don't look like numbers at all, and the operations don't look like arithmetic, and yet the same rules apply. –  Aug 26 '16 at 18:25
  • 1
    It does depend on what framework you invented complex numbers for. If your definition involves complex numbers being a field, then $x^{n}x^{m}=x^{n+m}$ is a property of fields. – fleablood Aug 26 '16 at 19:07

6 Answers

33

It does indeed have a proof. But this involves going back to exactly how the complex numbers are defined in the first place.

Let's say for the moment that we're happy with the real numbers and their properties. (Of course, there's a separate question - "how do we construct the reals?" - but let's leave that aside for now.) There are now a number of ways to rigorously define the complex numbers. Here's probably the simplest (although not the most useful): we let $\mathbb{C}$ be the set of ordered pairs $(a, b)$ where $a, b\in\mathbb{R}$, and we define the operations $+$ and $\times$ on $\mathbb{C}$ as follows (intuitively, the pair $(a, b)$ represents the number $a+bi$):

  • $(a, b)+(c, d)=(a+c, b+d)$.

  • $(a, b)\times (c, d)=(a\times c-b\times d, a\times d+b\times c)$.

Note that on the right hand side of each of those expressions, I'm using the notion of $+$ and $\times$ (and $-$) for real numbers. Arguably, a better way to write this would be

  • $(a, b)+_\mathbb{C}(c, d)=(a+_\mathbb{R}c, b+_\mathbb{R}d)$.

  • $(a, b)\times_\mathbb{C} (c, d)=(a\times_\mathbb{R} c-_\mathbb{R}b\times_\mathbb{R} d, a\times_\mathbb{R} d+_\mathbb{R}b\times_\mathbb{R} c)$.

That is, I've already defined operations $+_\mathbb{R}, \times_\mathbb{R}, -_\mathbb{R}$ for the reals, and now I'm defining new operations $+_\mathbb{C}, \times_\mathbb{C}$ for the complex numbers. In general, it will be clear from context which is meant; but when learning this material it's a good habit to clarify things with subscripts whenever both types of operation are in play.

We can use this explicit definition to verify basic claims about complex numbers. For example,

  • Check that $(0, 1)\times(0, 1)=(-1, 0)$, and that $(0, 1)$ and $(0, -1)$ are the only complex numbers with this property. We call the former "$i$" (but it doesn't really matter which one we pick).

  • Now, we can compute $(0, 1)^3=(0, 1)\times (0, 1)\times (0, 1)=(0, -1)$. This shows that indeed $i^3=-i$.
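
The two checks above can also be run mechanically. Here is a minimal Python sketch of the pair construction, assuming integer entries stand in for real numbers; the names `c_add`, `c_mul`, and `c_pow` are made up for this illustration and are not part of the original answer.

```python
from functools import reduce

def c_add(p, q):
    """(a, b) + (c, d) = (a + c, b + d), using ordinary real addition."""
    a, b = p
    c, d = q
    return (a + c, b + d)

def c_mul(p, q):
    """(a, b) x (c, d) = (ac - bd, ad + bc), using ordinary real arithmetic."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)

def c_pow(p, n):
    """p multiplied by itself n times (n >= 1), grouped from left to right."""
    return reduce(c_mul, [p] * n)

i = (0, 1)                 # the pair we agreed to call "i"
i_squared = c_mul(i, i)    # the pair standing for -1
i_cubed = c_pow(i, 3)      # the pair standing for -i

print(i_squared, i_cubed)
```

Note that `c_pow` fixes one particular grouping, $((p\times p)\times p)\cdots$; that any other grouping gives the same pair is exactly the associativity claim.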

What about more general claims? For example, I implicitly used the associativity of $\times$ above, in writing out $(0, 1)^3$. We can also use the definition of $\mathbb{C}$ to prove that $\times$ is associative (and more generally that $\mathbb{C}$ is a field). This gets a bit messy; let me give as an example the proof that addition of complex numbers is commutative, which is simpler:

$$(a, b)+_\mathbb{C}(c, d)=(a+_\mathbb{R}c, b+_\mathbb{R}d)=(c+_\mathbb{R}a, d+_\mathbb{R}b)=(c, d)+_\mathbb{C}(a, b).$$

Note how we use (at the second "$=$" sign) the commutativity of $+_\mathbb{R}$; we're building on facts we already know about the reals, to prove facts about the complex numbers, and this is justified because of how the complex numbers are constructed from the reals.


EDIT: Just to tie up a loose end, what did I mean above when I said that this was "not the most useful" way to construct the complex numbers?

Well, it turns out that there is a much more abstract approach we can use, that - while much harder at first - ultimately turns out to be much more mathematically useful. The key point is to notice that essentially the whole point of the complex numbers is to solve a single equation, $x^2=-1$. In some sense, they are what you get when you start with the reals and "fill in" the hole left by that equation.

This suggests that we should look for some tool for doing the following: if $\mathbb{S}$ is some number system (reals, complexes, integers, rationals, quaternions, hyperbolic sedenions, or thing-I-just-made-upions) and $E$ is some equation in $\mathbb{S}$, there should be a way to build a thing like $\mathbb{S}$ but that can also solve $E$. This is really really vague, but ultimately leads to a bunch of ideas in abstract algebra; for example, algebraic extensions of fields. In this context, there's a precise sense in which the complex numbers are the "smallest" number system containing the reals where $x^2=-1$ has a solution, and the relevant notation is $\mathbb{R}[x]/\langle x^2+1\rangle$ - which is probably gibberish right now but down the road you'll see how to view this as a "recipe" for building the complex numbers from the real numbers.
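
To make the recipe $\mathbb{R}[x]/\langle x^2+1\rangle$ a little less mysterious, here is a rough Python sketch of the idea: multiply polynomials as usual, then replace $x^2$ by $-1$ until only a constant term and an $x$ term remain. The function names `poly_mul` and `reduce_mod_x2_plus_1` are made up for illustration, and integer coefficients stand in for real ones.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists [c0, c1, c2, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def reduce_mod_x2_plus_1(p):
    """Reduce modulo x^2 + 1 by replacing x^2 with -1, highest power first."""
    p = list(p) + [0, 0]            # make sure positions 0 and 1 exist
    for k in range(len(p) - 1, 1, -1):
        p[k - 2] -= p[k]            # c*x^k is congruent to -c*x^(k-2)
        p[k] = 0
    return (p[0], p[1])             # remainder c0 + c1*x, read as the pair (c0, c1)

x = [0, 1]                                            # the polynomial "x"
x_squared = reduce_mod_x2_plus_1(poly_mul(x, x))      # behaves like i^2
product = reduce_mod_x2_plus_1(poly_mul([2, 3], [4, 5]))  # (2 + 3x)(4 + 5x)

print(x_squared, product)
```

The remainder of $(a+bx)(c+dx)$ comes out as $(ac-bd)+(ad+bc)x$, which is the same formula that the pair construction takes as its definition of multiplication.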

Noah Schweber
  • 245,398
  • 3
    Great Explanation, thnx – codetalker Aug 26 '16 at 18:17
  • 3
    Wouldn't it be simpler to define $\mathbb C$ as $\mathbb R[x] / (x^2 + 1)$ and then deduce the formula for a product of two complex numbers? – Santiago Aug 26 '16 at 18:19
  • 2
    @Santiago I tend not to think so, although that is certainly the more mathematically useful way to define $\mathbb{C}$ (and I've added a bit about it very very briefly); I find that passing to polynomials, then modding out by an ideal, etc. winds up introducing a lot of unneeded confusion for most people (myself included) seeing it for the first time. But of course this varies person to person, and what's simpler for me may not be simpler for anyone else. – Noah Schweber Aug 26 '16 at 18:24
  • Textbook introduction. Very nice. – StubbornAtom Aug 26 '16 at 18:35
  • 1
    Very nice! But I'd like to actually point out that the "real rule" $x^{n+m}=x^nx^m$ would hold for all constructs where $ab$ is an associative binary operation (associative means $(ab)c = a(bc)$, i.e. which grouping of terms we do first doesn't matter; binary operation means simply that you can combine any two values into a single definite value) and we define $a^m$, for $m$ a positive integer, to mean $aa......a$ (operated $m$ times). If so, $a^ma^n=aa......a = a^{m+n}$ is unavoidable no matter what the set is or what the operation is. – fleablood Aug 26 '16 at 19:14
  • @fleablood Yes, that's of course absolutely true - and a very good point. I thought about adding a section on how specific facts only depend on some properties, but I decided against it on the grounds that this was too long already. Maybe I should . . . – Noah Schweber Aug 26 '16 at 19:19
  • @Siddhant But we can give an elementary presentation of $\,\Bbb C := \Bbb R[x]/(x^2+1)\,$ that does not require knowledge of ideals or quotient rings. Namely we can use congruences (in exactly the same way we use congruences to present the quotient rings $\,\Bbb Z/m\,$ in elementary number theory). This is more natural than using Hamilton's pair construction. – Bill Dubuque Aug 26 '16 at 20:37
  • 4
    @BillDubuque In my experience, many students find even congruences more complicated than the pair construction. Of course, preference varies student to student, but I tend to think that the pair construction is the easiest for the majority of students. One possible reason why is that it doesn't involve "type confusion" - I've often found that students find it weird that an element of the "new" algebra is on the same level as a subset of the old algebra, and canonical representatives don't help here as much as with $\mathbb{Z}/m$. But we can agree to disagree. – Noah Schweber Aug 26 '16 at 20:41
  • Besides, I think the pair construction has a very positive feature: it encourages playing around. It suggests other possible structures on sets of $n$-tuples for varying $n$, and playing around with these can both function as a great way to introduce congruences (what happens when we can deduce $(a, b)=(c, d)$?) and deep questions about abstract algebra (can we define a "nice" system on triples?). By contrast, congruences feel much more abstract and harder to play with (in my experience). Tl;dr - the pair construction is definitely more ad hoc, but I actually quite like it on its own terms. – Noah Schweber Aug 26 '16 at 20:45
  • (Full disclosure - when I say "many students," I mean "many students including me.") – Noah Schweber Aug 26 '16 at 20:45
  • @Noah The elementary congruence-based presentations don't employ quotient sets or equivalence classes. – Bill Dubuque Aug 26 '16 at 20:45
  • @BillDubuque So you choose canonical representatives? I've tried that, and it still seems more difficult for most (of my) students. – Noah Schweber Aug 26 '16 at 20:46
  • @Noah By your argument choosing the least-terms canonical rep of a fraction should be beyond most students. But this is not true. – Bill Dubuque Aug 26 '16 at 20:47
  • 2
    @BillDubuque No, that's not what I said. I said that in general contexts choosing canonical representatives has been, in my experience, a nontrivial step for most students. Reducing fractions, and modular arithmetic, are specific contexts in which there are very "natural" choices of canonical representative. In my experience, constructing the complex numbers gets to the point where choosing canonical representatives adds a layer of difficulty not present with the pairs construction. (Btw I think this thread is the wrong place for this conversation; please email me if you want to continue.) – Noah Schweber Aug 26 '16 at 20:49
  • @Noah It can be successfully taught that way. Students who have already mastered using remainders as normal reps in $\,\Bbb Z\ {\rm mod}\ m\,$ usually don't have much difficulty generalizing that to $\,\Bbb R[x]\ {\rm mod}\ f(x),\,$ esp. if one stresses the analogy (both have Euclidean division algorithm). – Bill Dubuque Aug 26 '16 at 20:55
  • @BillDubuque My claim was never that it can't be taught that way - just that I find it easier to teach via pairs instead. (Keep in mind that, as I said above, I personally found it both easier and more interesting (initially) via pairs.) Regardless, I don't think this comment thread is the place for this discussion; please email me if you want to continue (my email address can easily be computed from available information :P). – Noah Schweber Aug 26 '16 at 20:56
  • 4
    @BillDubuque if you think there's a better presentation, the place for it would be as an alternate answer. The people to convince are the visitors to this question, not Noah Schweber, who already gave his answer. – Spike0xff Aug 27 '16 at 04:32
  • 1
    @Spike0xff Answers are not the correct place for replies to comments (which is what sparked this comment thread). – Bill Dubuque Aug 27 '16 at 13:30
  • @Noah Obviously what is easiest in any instance will be highly context dependent. My point was merely to point out one elementary way that is more conceptually intuitive. Worth mention is that fractions are also more naturally presented this way, so one gains a bit greater generality by teaching the congruence-based approaches. But I don't doubt that this may be too advanced for some audiences. – Bill Dubuque Aug 27 '16 at 14:16
  • 1
    The ordered pair approach solves a different problem: show that real numbers can coexist with a square root of -1. To bridge the gap between "assuming" and "proving" in relation to complex numbers, one would have to show that the usual forms of "assuming" create an R(i) or Q(i) or Z[i], with $i$ defined as a formal solution of $i^2=-1$, and that this expanded ring or field makes sense and has the usual properties. – zyx Aug 27 '16 at 20:40
5

Although those are excellent answers I think the answer is even simpler.

$b^{m}b^{n} = b^{m+n}$ is not a "real number rule" but an "associative binary operation rule".

If we define/construct/invent any set of elements (these could be gumdrops or chess pieces for all we care) and create any operation between two elements ($a*b$ could be "melt the two gumdrops together and make a new one" or "always pick the second chess piece; always") so that:

--$a * b$, for any two elements of the set, results in some $c$ that is also a member of the set, and the same $a,b$ combined always result in the same member $c$. (The term for this is that $*$ is a "binary operation".)

--If we evaluate $a*b*c*d$ it doesn't matter how we group them. $(a*(b*c))*d$ (where we first do $b*c= e$ and then we do $a*e = f$ and then we do $f*d = g$) will give the same result as $(a*b)*(c*d)$ (where we do $a*b$ and get $h$ and then we do $c*d$ and get $i$ and then we do $h*i$ and get, amazingly, $g$). Or more simply $(a*b)*c = a*(b*c)$. (such an operation is called "associative". An example of something that isn't associative is $(2^3)^2 \ne 2^{(3^2)}$.)

Then we define, as a matter of notation (it's just notation), that if $n\in \mathbb N$ then $a^n := a*a*a*....*a$, where $a$ is operated $n$ times.

If we create such a system, (whether gumdrops, chess pieces, or numbers), then we will always have:

$b^mb^n = (b*.....*b)*(b*.....*b) = (b*....................*b) = b^{m+n}$.

Always.

So this is true for complex numbers. (Assuming we defined what $c*d$ means and that $c*(d*e) = (c*d)*e$).

===

Just to be perverse, let's do this with chess pieces. And define $a*b = b$. Then $(knight*pawn)*bishop = pawn*bishop = bishop$ which is equal to $knight*(pawn*bishop) = knight*bishop = bishop$.

Then $pawn^2*pawn = (pawn*pawn)*pawn=pawn*pawn= pawn; pawn*pawn^2 = pawn*(pawn*pawn) = pawn*pawn = pawn; pawn^3 = pawn*pawn*pawn = pawn*pawn = pawn$.

So $pawn^2pawn=pawn*pawn^2 = pawn^3$.
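
The same check can be run for any operation at all. The following is only a sketch under stated assumptions: `power`, `second`, `concat`, and `tower` are names invented here, `power` always groups from the left, and string concatenation is used as a stand-in for a less boring associative operation (gumdrops being hard to melt in Python).

```python
from functools import reduce

def power(op, x, n):
    """x combined with itself n times under op (n >= 1), grouped left to right."""
    return reduce(op, [x] * n)

def second(a, b):
    """The chess-piece operation: always pick the second piece."""
    return b

def concat(a, b):
    """String concatenation, another associative binary operation."""
    return a + b

def tower(a, b):
    """Exponentiation a**b, which is NOT associative."""
    return a ** b

# Associative operations: b^m * b^n equals b^(m+n).
chess_ok = second(power(second, "pawn", 2), power(second, "pawn", 1)) == power(second, "pawn", 3)
concat_ok = concat(power(concat, "ab", 2), power(concat, "ab", 3)) == power(concat, "ab", 5)

# Non-associative operation: "b^1 * b^2" and "b^3" can disagree.
tower_ok = tower(power(tower, 3, 1), power(tower, 3, 2)) == power(tower, 3, 3)

print(chess_ok, concat_ok, tower_ok)
```

With `tower`, the left-hand side is $3^{(3^3)}$ while the left-grouped cube is $(3^3)^3$, so the rule breaks as soon as associativity is dropped.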

That was kind of boring.

=====

To go to Noah Schweber's excellent answer. If $i = (0,1)$ and $(a,b)*(c,d)= (ac - bd, ad + bc)$ is the definition of complex numbers then:

-- yes, it is binary. ($(a,b)*(c,d)= (ac - bd, ad + bc)$ results in a real value pair.)

-- yes, it is associative. $(a,b)*[(c,d)*(e,f)] = (a,b)*(ce-df,cf+de)=(a(ce-df)-b(cf+de), b(ce-df) + a(cf+de))=(ace-adf-bcf-bde,bce-bdf+acf+ade)$, while $[(a,b)*(c,d)]*(e,f) = (ac - bd,bc+ad)*(e,f)=((ac-bd)e-(bc+ad)f,(bc+ad)e + (ac-bd)f)=(ace-adf-bcf-bde,bce-bdf+acf+ade)$, so $(a,b)*[(c,d)*(e,f)] = [(a,b)*(c,d)]*(e,f)$.

Then

$(a,b)^3 = (a,b)(a,b)(a,b)=(a^2 - b^2,2ab)(a,b) = (a^3 - 3ab^2,3a^2b-b^3)$

$(a,b)^2(a,b) = [(a,b)(a,b)](a,b)=(a^2 - b^2,2ab)(a,b) = (a^3 - 3ab^2,3a^2b-b^3)$

$(a,b)(a,b)^2 = (a,b)(a^2 - b^2,2ab)=(a^3 - 3ab^2,3a^2b-b^3)$

all equal.
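
For anyone who would rather not expand these by hand, here is a small Python spot-check of both the associativity identity and the cube formula on random integer pairs. It is a sanity check on samples, not a proof; the names `mul` and `cube_formula` and the sampling range are arbitrary choices made for this sketch.

```python
import random

def mul(p, q):
    """(a, b) * (c, d) = (ac - bd, ad + bc)."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)

def cube_formula(p):
    """The closed form (a^3 - 3ab^2, 3a^2b - b^3) for (a, b)^3."""
    a, b = p
    return (a**3 - 3 * a * b**2, 3 * a**2 * b - b**3)

random.seed(0)  # fixed seed so the run is repeatable
samples = [(random.randint(-20, 20), random.randint(-20, 20)) for _ in range(300)]
triples = list(zip(samples[0::3], samples[1::3], samples[2::3]))

# p * (q * r) versus (p * q) * r on every sampled triple
associative_on_samples = all(
    mul(p, mul(q, r)) == mul(mul(p, q), r) for p, q, r in triples
)

# (p * p) * p, p * (p * p), and the closed form on every sampled pair
cube_matches = all(
    mul(mul(p, p), p) == mul(p, mul(p, p)) == cube_formula(p) for p in samples
)

print(associative_on_samples, cube_matches)
```

Integers are used so that every comparison is exact; with floating-point entries the two groupings could differ by rounding error even though the identity holds for real numbers.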

fleablood
  • 124,253
  • +1, at least for positive integer exponents, there's no need to go beyond checking associativity. – celtschk Aug 27 '16 at 11:23
  • I'm not sure where I did go beyond checking associativity. But we need binary and associativity otherwise $b^n$ isn't well defined. As $n$ is integer and $b^n$ is notation it is impossible for $b^{n+m} \ne b^nb^m$ but I do feel this will seem like "magic" if it's simply stated as fact. So I feel a physical demonstration was in order. – fleablood Aug 27 '16 at 17:46
  • Oh, I see. You are talking about my display that $(a,b)^3$ is $(a,b)^2(a,b)$. I wasn't actually trying to "prove" that but demonstrate that my argument really does hold, in the hopes that by seeing it concretely and looking at it, it will be made apparent and obvious that my argument must be true and hold for any associative binary operation. – fleablood Aug 27 '16 at 17:49
  • You misunderstood me. I didn't criticize you, I just summarized. However for the $i^0$ also present in the question, you actually do need a bit more: Namely the neutral element ($1$). Well, strictly speaking, you only need an idempotent element that when multiplied by $i$ from either side gives $i$. For your chess pieces example, $a^0=a$ would work; defining some piece as "1" and writing $a^0=1$ would not, as for $a\ne 1$ you'd get $a*1=1\ne a$ – celtschk Aug 27 '16 at 17:52
  • Oh, I did skip the $i^0$. The question holds without it. If you want to be picky, I always felt that "$b^1=b$ times itself 1 time" could be considered ambiguous. Anyway, IF the system has an identity element (gumdrops and chessmen do not), defining $a^0 = $ identity is the one choice that extends $b^{n+m}=b^{n}b^m$. Likewise, if the system has inverses, $b^{-m} := {b^{m}}^{inv}$ consistently extends to preserve the rule. – fleablood Aug 27 '16 at 18:06
3

Are properties of the imaginary unit assumed or proved?

Both. The intuitive approach of assuming a solution to a previously unsolvable equation (such as an $i$ solving $i^2= -1$) that satisfies other algebraic properties of the number system, came into use in the 1400-1700's and was later proved to create a consistent number system that works as expected (such as any element of the system with $i$ being uniquely expressible as $x+yi$).

Today the sequence is similar: it is assumed to work in high school, and proved to work at university.

zyx
  • 35,436
3

There are various logically rigorous ways to prove that the set of all complex numbers is a field, i.e. it satisfies certain axioms about addition, subtraction, multiplication, and division, but there is also another point of view worth knowing about:

  • Multiplication by any complex number $z = x+iy$ amounts to rotating through the angle whose sine is $y/|z|$ and whose cosine is $x/|z|$ and then multiplying by the real number $|z|$.
  • In particular, multiplying by $i$ means rotating $90^\circ$ counterclockwise. Thus multiplying by $i^3$ amounts to rotating $270^\circ$ counterclockwise, which gives you $-i$.
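
The rotation picture can be checked numerically with Python's built-in complex type and the standard `cmath` module. This is a sketch for one sample point, $z = 3+4i$, chosen arbitrarily; comparisons use `math.isclose` because the angles are floating-point values.

```python
import cmath
import math

z = 3 + 4j            # an arbitrary sample point
w = 1j * z            # multiply by i

# Multiplying by i keeps the length and adds 90 degrees (pi/2) to the angle.
same_length = math.isclose(abs(w), abs(z))
quarter_turn = math.isclose(cmath.phase(w) - cmath.phase(z), math.pi / 2)

# Three quarter turns applied to 1 land on -i.
i_cubed = 1j * 1j * 1j
three_quarter_turn = cmath.rect(1.0, 3 * math.pi / 2)   # length 1, angle 270 degrees

print(same_length, quarter_turn, i_cubed == -1j)
```

Here `cmath.phase` returns the angle of a complex number in radians and `cmath.rect` builds a complex number from a length and an angle, so `three_quarter_turn` agrees with `i_cubed` up to rounding.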
2

As you point out, any integral power of $i$ is $i$ or $-1$, or $-i$ or $1$.

Complex 'multiplication' is defined by $(a,b).(c,d)=(ac-bd,ad+bc)$ where $a,b,c,d\in\mathbb{R}$. Also note that the complex number $(0,1)=0+i.1$ is denoted by $i$, the imaginary unit.

So, $i^2=(0,1)(0,1)=(-1,0)=-1$ and we can similarly show that $i^3=(0,-1)=-i$.

StubbornAtom
  • 17,052
2

Define $ i $ as a root of $ x^2 + 1 $, and define $ \mathbb C = \mathbb R(i) $, a finite field extension of the reals. With this definition, $ \mathbb C $ satisfies the field axioms, so multiplication is associative and you can apply associativity to obtain your statement.