A quaternion has a real part and an imaginary part, also called its scalar and vector parts. That is, we can interpret $\mathbb{H}$ (named after Hamilton) as $\mathbb{R}\oplus\mathbb{R}^3$. We already know how to multiply a scalar by a scalar, and a vector by a scalar, so it remains to describe how to multiply two 3D vectors. The scalar and vector parts of the product $\mathbf{uv}$ are the (opposite) dot product $-\mathbf{u}\cdot\mathbf{v}$ and the cross product $\mathbf{u}\times\mathbf{v}$ respectively, so
$$ \mathbf{uv}=-\mathbf{u}\cdot\mathbf{v}+\mathbf{u}\times\mathbf{v}. $$
From this we may conclude, for instance:
- The square roots of $1$ are $\pm1$, and the square roots of $-1$ are precisely the unit vectors.
- Euler's formula $\exp(\theta\mathbf{u})=\cos(\theta)+\sin(\theta)\mathbf{u}$ for unit vectors $\mathbf{u}$.
- All quaternions have a polar form $p=r\exp(\theta\mathbf{u})$ with $r=\|p\|$.
- Two quaternions commute if and only if their vector parts are parallel.
- Two quaternions anticommute iff they are perpendicular vectors.
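As a quick sanity check of the product formula and a couple of the bullet points, here is a minimal Python sketch. The representation of a quaternion as a `(scalar, (x, y, z))` pair and the helper names `dot`, `cross`, `qmul` and `vec` are my own choices for illustration, not anything standard.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def qmul(p, q):
    """Multiply quaternions stored as (scalar, (x, y, z)).

    Expands (a + u)(b + v) = ab + a v + b u + uv,
    with uv = -u.v + u x v as in the formula above.
    """
    a, u = p
    b, v = q
    c = cross(u, v)
    scalar = a * b - dot(u, v)
    vector = tuple(a * v[n] + b * u[n] + c[n] for n in range(3))
    return (scalar, vector)

def vec(x, y, z):
    """A pure-vector quaternion (zero scalar part)."""
    return (0.0, (x, y, z))

i, j, k = vec(1, 0, 0), vec(0, 1, 0), vec(0, 0, 1)

# A unit vector squares to -1 (up to floating-point rounding).
u = vec(1 / 3, 2 / 3, 2 / 3)
sq = qmul(u, u)

# Perpendicular vectors anticommute: ij = k and ji = -k.
ij, ji = qmul(i, j), qmul(j, i)
print(sq, ij, ji)
```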
We consider the problem of describing 3D rotations by "inject a structure into a larger structure and describe it there". Now instead of looking at 3D rotations, we start by looking at rotations in 4D [...]
Exactly!
Given any unit vector $\mathbf{u}$, we may extend it to an oriented orthonormal basis $\{\mathbf{u},\mathbf{v},\mathbf{w}\}$ of $\mathbb{R}^3$, and if we adjoin the scalar $1$ we get an oriented orthonormal basis for $\mathbb{H}$. Define $L_p(x)=px$ and $R_p(x)=xp$. Then $L_{\mathbf{u}}$ has two invariant planes, the spans of $\{1,\mathbf{u}\}$ and $\{\mathbf{v},\mathbf{w}\}$. More to the point, $L_{\mathbf{u}}$ is a right-angle rotation in the $1\mathbf{u}$-plane and in the $\mathbf{vw}$-plane. The same applies to $R_{\mathbf{u}}$, except it turns the opposite direction in the $\mathbf{vw}$-plane. Just as $\exp(i\theta)$ turns the complex plane by $\theta$, we can show that $L_p$ and $R_p$ (where $p=\exp(\theta\mathbf{u})$) turn the $1\mathbf{u}$- and $\mathbf{vw}$-planes by $\theta$, but with opposite directions in the $\mathbf{vw}$-plane.
If you want, you can write the matrices for $L_p$ and $R_p$ with respect to the basis $\{1,\mathbf{u},\mathbf{v},\mathbf{w}\}$.
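For concreteness, here is what those matrices come out to, assuming the convention that the columns are the images of $1,\mathbf{u},\mathbf{v},\mathbf{w}$ in that order, and using $\mathbf{uv}=\mathbf{w}$, $\mathbf{uw}=-\mathbf{v}$, $\mathbf{vu}=-\mathbf{w}$, $\mathbf{wu}=\mathbf{v}$ from the product formula. With $p=\cos\theta+\sin\theta\,\mathbf{u}$, $c=\cos\theta$ and $s=\sin\theta$:

$$
L_p=\begin{pmatrix} c & -s & 0 & 0\\ s & c & 0 & 0\\ 0 & 0 & c & -s\\ 0 & 0 & s & c\end{pmatrix},
\qquad
R_p=\begin{pmatrix} c & -s & 0 & 0\\ s & c & 0 & 0\\ 0 & 0 & c & s\\ 0 & 0 & -s & c\end{pmatrix}.
$$

The two upper-left blocks agree, while the lower-right blocks are inverse to each other, which is the "opposite direction in the $\mathbf{vw}$-plane" statement.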
Inverting $L_p$ or $R_p$ reverses the direction of rotation in both planes. Consequently, the conjugation $L_p\circ R_p^{-1}$ (i.e. $x\mapsto pxp^{-1}$) rotates by $2\theta$ in the $\mathbf{vw}$-plane and acts trivially in the $1\mathbf{u}$-plane. Restricting to $\mathbb{R}^3$, we can simply say it rotates around the oriented $\mathbf{u}$-axis by $2\theta$. So the answer to this is yes:
[...] we start with those that are induced by choosing a pair of coordinates, rotating it a certain angle, and then rotating the other two remaining coordinates. [...] You play around and maybe you end up realizing "hey if I switch the orientation in the other pair, and then conjugate an element by this rotation, it's actually a 3d rotations." - is that true or do I get that wrong?
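The angle doubling under conjugation is easy to check numerically. The following is a minimal sketch; the 4-tuple representation `(scalar, x, y, z)` and the helper names `qmul`, `conj` and `rotate` are assumptions of mine, and the axis $\mathbf{u}=\mathbf{k}$ with $\mathbf{v}=\mathbf{i}$, $\mathbf{w}=\mathbf{j}$ is just one convenient oriented basis.

```python
import math

def qmul(p, q):
    """Quaternion product on 4-tuples (scalar, x, y, z) via dot and cross products."""
    a, u = p[0], p[1:]
    b, v = q[0], q[1:]
    d = sum(x * y for x, y in zip(u, v))
    c = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    return (a * b - d,) + tuple(a * v[n] + b * u[n] + c[n] for n in range(3))

def conj(p):
    """Quaternion conjugate; equals the inverse when p has unit norm."""
    return (p[0], -p[1], -p[2], -p[3])

def rotate(theta, u, x):
    """Apply x -> p x p^{-1} with p = cos(theta) + sin(theta) u.

    u is a unit 3-vector, x is any 3-vector; returns the vector part of the result.
    """
    p = (math.cos(theta),) + tuple(math.sin(theta) * c for c in u)
    return qmul(qmul(p, (0.0,) + tuple(x)), conj(p))[1:]

theta = math.pi / 6
u, v, w = (0, 0, 1), (1, 0, 0), (0, 1, 0)  # oriented: u x v = w

rv = rotate(theta, u, v)  # should be cos(2 theta) v + sin(2 theta) w
ru = rotate(theta, u, u)  # the axis itself should be fixed
print(rv, ru)
```

Note that `theta` is the half-angle here: the vector `v` is turned by `2 * theta` towards `w`.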
On the other hand,
Is there a way to clearly see that other pairs like $1, i+j$ also define some sort of plane that $i+j$ rotates via multiplication? [...] What I don't get in this approach is why things will still work even for other "rotations" of this type - how would you formalize that in some sense left multiplication by $(i+j)\sqrt{2}/2$ will "rotate" $\langle 1,(i+j)\sqrt{2}/2\rangle$ and also the "orthogonal complement" of $\langle 1,(i+j)\sqrt{2}/2\rangle$?
This follows, I think reasonably directly, from the formula for the quaternion product of two vectors that I mentioned above: since the product is given by the dot and cross products, multiplying two orthogonal vectors yields a third vector orthogonal to both. You can use this to show that the $1\mathbf{u}$- and $\mathbf{vw}$-planes are indeed invariant planes with respect to $L_p$ and $R_p$, and to check the matrix representations of $L_p$ and $R_p$ in the appropriate basis.
It suffices to know what $L_p$ and $R_p$ do on these invariant planes because they are complementary and span the whole of $\mathbb{H}$; you can figure out what $L_p$ and $R_p$ do to any quaternion by splitting that quaternion into components with respect to the invariant planes.
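To make this concrete for the example in the question, here is a small sketch with $\mathbf{u}=(\mathbf{i}+\mathbf{j})/\sqrt2$. The completion $\mathbf{v}=(\mathbf{i}-\mathbf{j})/\sqrt2$, $\mathbf{w}=\mathbf{u}\times\mathbf{v}=-\mathbf{k}$ is one choice among many, and the helper names and the test quaternion `x` are mine.

```python
import math

def qmul(p, q):
    """Quaternion product on 4-tuples (scalar, x, y, z) via dot and cross products."""
    a, u = p[0], p[1:]
    b, v = q[0], q[1:]
    d = sum(x * y for x, y in zip(u, v))
    c = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    return (a * b - d,) + tuple(a * v[n] + b * u[n] + c[n] for n in range(3))

def close(p, q):
    return all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(p, q))

def neg(p):
    return tuple(-c for c in p)

a = 1 / math.sqrt(2)
one = (1.0, 0.0, 0.0, 0.0)
u = (0.0, a, a, 0.0)      # (i + j)/sqrt(2)
v = (0.0, a, -a, 0.0)     # (i - j)/sqrt(2), orthogonal to u
w = (0.0, 0.0, 0.0, -1.0) # u x v = -k

# L_u is a quarter turn in each invariant plane: 1 -> u -> -1 and v -> w -> -v.
quarter_turns = (close(qmul(u, one), u) and close(qmul(u, u), neg(one))
                 and close(qmul(u, v), w) and close(qmul(u, w), neg(v)))

# Split an arbitrary quaternion into coordinates along 1, u, v, w ...
x = (1.0, 2.0, 3.0, 4.0)
c0, c1, c2, c3 = (sum(s * t for s, t in zip(x, e)) for e in (one, u, v, w))

# ... and rotate each plane's component separately; this should reproduce u*x.
by_planes = tuple(-c1 * one[n] + c0 * u[n] - c3 * v[n] + c2 * w[n] for n in range(4))
direct = qmul(u, x)
print(quarter_turns, direct, by_planes)
```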
Does extending the multiplication operation distributively still preserve the quality of "rotating two separate coordinate pairs" and if yes, how do I see that?
Adding two unit quaternions generally does not yield a unit quaternion, so the answer is technically no as written, but the answer is yes if you say "rotating two separate planes by the same angle and rescaling."
Of course adding two quaternions gives a quaternion, so algebraically this is clear. I don't really think it's clear geometrically, however, and with good reason: this is a very exceptional accident that occurs in precisely four dimensions, and in no other dimension. (I have a related answer on "Are left isoclinic rotations a group?".)
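To spell out the algebraic side using the polar form from the list at the top: if $p$ and $q$ are unit quaternions with $p+q\neq0$, write $p+q=r\exp(\phi\mathbf{n})$ with $r=\|p+q\|$ and $\mathbf{n}$ a unit vector (the names $r$, $\phi$, $\mathbf{n}$ are mine). Then

$$
L_p+L_q=L_{p+q}=r\,L_{\exp(\phi\mathbf{n})},
$$

so the sum of the two left multiplications turns the $1\mathbf{n}$-plane and its orthogonal complement by the same angle $\phi$ and then rescales by $r$. When $p+q$ is real, $\mathbf{n}$ may be any unit vector and $\phi$ is $0$ or $\pi$.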
I want to get some ideas about how one might have discovered quaternions in the first place.
Finding a number system to describe 3D rotations just as complex numbers describe 2D rotations was indeed the way Hamilton discovered the quaternions. He needed a number system with an inner product corresponding to a multiplicative norm, and some square roots of $-1$ to act as "generators" for the rotations. He first assumed it would be a 3D number system with $x=a+b\mathbf{i}+c\mathbf{j}$ and agonized for years over how to get it to work right, in particular what $\mathbf{ij}$ ought to be. Eventually he realized $|x^2|=|x|^2$ forced $\mathbf{i}$ and $\mathbf{j}$ to anticommute, and then he had an infamous flash of bridge-adjacent insight that $\mathbf{ij}$ ought to be independent of $\mathbf{i}$ and $\mathbf{j}$; from there everything else - the full multiplication table - flowed smoothly from the 4D insight and the requirement $|xy|=|x||y|$.
Once you have the number system in place, you can begin investigating it.
This is my best recollection anyway.
I haven't read a ton, I was hoping to get some high level understanding before trying to tackle this topic.
– John P May 27 '20 at 20:08