As I doubt that adding a 34th reply to the other thread would be noticed, I am replying to this thread with a point that I don't see discussed over there.
When defining length in $\Bbb R$, area in $\Bbb R^2$ and volume in $\Bbb R^3$, there are certain obvious principles we want from these concepts that pretty much nail down what the values should be for "reasonable" sets. These are:
- The measure of an interval $[a,b]$ should be $b - a$. The measure of a square of sidelength $s$ should be $s^2$. The measure of a cube of sidelength $s$ should be $s^3$.
- If $A$ and $B$ do not overlap, then the measure of $A \cup B$ should be the sum of the measures of $A$ and $B$.
- If $A \subset B$, then the measure of $A$ should be $\le$ the measure of $B$.
These give you a strategy for uniquely defining the measure of a very large class of sets:
- Any set that can be expressed as the union of a non-overlapping collection of intervals/squares/cubes has a length/area/volume equal to the sum of the measures of those intervals/squares/cubes. Call these "base sets". Include the empty set, with measure $0$, as a base set.
- For any other set $A$, consider base sets containing $A$. The set of the measures of all such base sets is bounded below by $0$, and thus has a greatest lower bound $M$. Now consider base sets contained within $A$. If the set of all measures of those base sets is bounded above, it has a least upper bound $m$. If it is unbounded, treat $m = \infty$ as its least upper bound.
- By the last principle, since $A$ is contained in all the upper base sets, it should have measure $\le M$. Since $A$ contains all the lower base sets, it should have measure $\ge m$. So if $m = M$, the only possible measure for $A$ is the common value.
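To make the trapping strategy above concrete, here is a minimal Python sketch for the unit disk in $\Bbb R^2$. The function name `disk_area_bounds` and the choice of a uniform grid of squares of side $1/n$ are my own assumptions for illustration; the squares that fit inside the disk form a lower base set, and the squares that meet the disk form an upper base set.

```python
import math

def disk_area_bounds(n):
    """Return (inner, outer) areas of grid-square base sets for the unit disk.

    The grid squares have side 1/n. Coordinates are kept in integer grid
    units (one unit = 1/n) so every comparison below is exact.
    """
    inner = 0  # squares lying entirely inside the closed disk
    outer = 0  # squares that meet the open disk
    for i in range(-n, n):
        for j in range(-n, n):
            # Square [i, i+1] x [j, j+1]: farthest corner from the origin...
            far_x = max(abs(i), abs(i + 1))
            far_y = max(abs(j), abs(j + 1))
            # ...and its nearest point to the origin (0 if it straddles an axis).
            near_x = min(abs(i), abs(i + 1))
            near_y = min(abs(j), abs(j + 1))
            if far_x * far_x + far_y * far_y <= n * n:
                inner += 1
            if near_x * near_x + near_y * near_y < n * n:
                outer += 1
    # Each square has area 1/n^2.
    return inner / (n * n), outer / (n * n)

if __name__ == "__main__":
    for n in (10, 100, 400):
        lo, hi = disk_area_bounds(n)
        print(n, lo, hi, math.pi)
```

The inner value plays the role of a lower bound for $m$ and the outer value an upper bound for $M$. As $n$ grows the two squeeze together around $\pi$, which is the only value the three principles leave available for the area of the disk.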
It turns out that sets have to be really weird not to have $m = M$. In fact, you cannot even construct one. You have to use non-constructive methods to prove that they exist. So this is sufficient to define length in $\Bbb R$, area in $\Bbb R^2$ or volume in $\Bbb R^3$ for any "reasonable" set. (And there is nothing that can be done about the "unreasonable" sets in $\Bbb R^3$, and even in $\Bbb R^2$ and $\Bbb R$ they can only be assigned measures by infinitely many arbitrary choices.)
But none of the above applies to your paradox. You are not measuring length in $\Bbb R$. You are measuring length in $\Bbb R^2$. And that is where the problem comes in. The reason we can trap the measure of $A$ to a single value is the third principle. We can find base sets that contain $A$ and know by this principle that the measure of $A$ can be no greater than that of the base set. And vice versa for base sets within $A$.
But this principle is of little use for measuring length in $\Bbb R^2$ or $\Bbb R^3$. Pythagoras figured out how to extend the concept of length in $\Bbb R$ to line segments in these higher spaces, and that is what you used to get $\sqrt 2$ for the diagonal. But more general curves, such as the circle, contain no straight pieces: no part of them bigger than a set of discrete points lies on a line. How do we define their lengths?
In your paradox, the stair-step approximations to the diagonal are not subsets of that diagonal, nor is the diagonal a subset of the stair-steps. The third principle says nothing about how they are related.
There is one more general principle we can fall back on:
- The shortest distance between two points is a line. That is, among all curves going from one point to the other, the line segment connecting them will have the shortest length.
So if we approximate a curve by choosing a selection of "vertex" points along it (in order as they occur on the curve) and connecting those vertices with line segments, we get a polygonal curve that approximates the original. The length of the polygonal curve is just the sum of the lengths of its segments. But by the principle, the length of the curve between each adjacent pair of vertices should be at least as long as the line segment connecting them. Thus the whole curve must be at least as long as the polygonal curve. If we consider the collection of all such polygonal curves approximating the original curve, the least upper bound of their lengths has to be $\le$ the length of the original curve.
So we have a lower limit on how short the original curve can be, but what about an upper limit? Houston, we have a problem! There is nothing that gives us an upper limit. When Archimedes showed that the circumference of a circle is given by $2\pi r$, he waved his hands and just assumed that the circumferences of circumscribed polygons must be greater than that of the circle, because he had no property that implied it. But since he didn't have any definition of length for a non-linear curve to work with, there was nothing else he could do.
Today, we just define the length of a curve to be the least upper bound of the lengths of the polygonal approximations. If there is no upper bound, then the curve is of infinite length ("unbounded variation" is the term-of-art).
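As a rough numerical illustration of this definition, the sketch below computes the lengths of polygonal curves whose vertices lie on the unit circle. The names `polygonal_length` and `unit_circle`, and the choice of equally spaced parameter values, are my own assumptions for the example. Equally spaced vertices are only one family of polygonal approximations, but for the circle they are enough to see the least upper bound emerge.

```python
import math

def unit_circle(t):
    """Point on the unit circle for parameter t in [0, 1]."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def polygonal_length(curve, n):
    """Length of the polygonal curve through n + 1 points ON the curve.

    The vertices are curve(0), curve(1/n), ..., curve(1), taken in order.
    """
    pts = [curve(k / n) for k in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

if __name__ == "__main__":
    for n in (4, 8, 16, 1000):
        print(n, polygonal_length(unit_circle, n), 2 * math.pi)
```

Each doubling of $n$ keeps the old vertices and adds new ones, so the lengths can only go up, and they stay below $2\pi$ while getting arbitrarily close to it.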
Your paradox and the $\pi = 4$ paradox approximate the curve in question with a polygonal path whose vertices do not all lie on the curve. Our only principle says straight lines between points on the curve must be at least as short as the curve itself between those points. The stair-steps are not straight lines between points on the curve, so they do not have to be a shorter path than the curve. And all these farcical paradoxes show is that the stair-steps are not as short as the curve.
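Here is a small sketch of the stair-step situation for the diagonal from $(0,0)$ to $(1,1)$. The helper names `staircase`, `path_length` and `max_distance_to_diagonal` are hypothetical, chosen just for this example. It shows the stair-steps hugging the diagonal ever more closely while their length never moves.

```python
import math

def staircase(n):
    """Vertices of an n-step staircase from (0, 0) to (1, 1)."""
    pts = [(0.0, 0.0)]
    for k in range(n):
        pts.append(((k + 1) / n, k / n))        # step right: corner OFF the diagonal
        pts.append(((k + 1) / n, (k + 1) / n))  # step up: back ON the diagonal
    return pts

def path_length(pts):
    """Sum of the segment lengths of a polygonal path."""
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def max_distance_to_diagonal(pts):
    """Largest distance from a vertex to the line y = x.

    The distance from (x, y) to that line is |x - y| / sqrt(2), and along a
    segment it is largest at an endpoint, so checking vertices is enough.
    """
    return max(abs(x - y) for x, y in pts) / math.sqrt(2)

if __name__ == "__main__":
    for n in (1, 10, 1000):
        pts = staircase(n)
        print(n, path_length(pts), max_distance_to_diagonal(pts))
    # Polygonal path with vertices on the diagonal only: the diagonal itself.
    print(path_length([(0.0, 0.0), (1.0, 1.0)]), math.sqrt(2))
```

The staircase length stays at $2$ for every $n$ even though its distance from the diagonal shrinks like $1/(n\sqrt 2)$. Half of its vertices are corners off the diagonal, so the shortest-path principle puts no ceiling on its length relative to $\sqrt 2$.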