Periodically I've tried to wrap my head around nonstandard calculus and hyperreals, but I always thought I needed a lot more of a background in formal logic and/or set theory to understand what's going on with them. Last night I had a sudden flash of insight about how they work, based on the ultrapower construction, but it was so anticlimactic that I wonder if I'm missing something. I want to outline my understanding to see if I actually have the basic idea correct. The construction I'm following is per Wikipedia, and I'm heavily paraphrasing:
One way to construct the hyperreals is as equivalence classes of real-valued sequences $\mathbf{x} = \{ x_i \}_{i \in \Bbb{N}} \in \Bbb{R}^\Bbb{N}.$ The construction is not dissimilar to using Cauchy sequences of rationals to define the reals, except that our equivalence relation doesn't necessarily treat Cauchy sequences that converge to the same number as being "equal"; rather, they are considered to be "infinitesimally different".
Any operation or function that we can define on the real numbers (addition, multiplication, comparison, absolute value, floor function) is extended to these sequences componentwise. This gives us a ring with a lot of the same nice structure and operations as we had on $\Bbb{R}$ originally; we can identify each element $r$ of $\Bbb{R}$ with the constant sequence $\mathbf{r} = (r, r, r, r, ...)$; and we can put a partial order on these sequences via componentwise comparison: $\mathbf{a} \leq \mathbf{b}$ iff $a_i \leq b_i$ for all $i \in \Bbb{N}$. This order suggests treating sequences like $\mathbf{ε} = (1, 0.5, 0.25, 0.125, ..., 2^{-i}, ...)$ as infinitesimal in some sense: $\mathbf{ε}$ is "greater than" the sequence $\mathbf{0} = (0, 0, 0, 0, ...)$ at every index and, for any $r > 0$, "less than" the constant sequence $\mathbf{r}$ at all but finitely many indices, so that $\mathbf{0} < \mathbf{ε} < \mathbf{r}$ once those finitely many indices are ignored.
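To make the componentwise picture concrete, here is a minimal Python sketch. It models a sequence as a function from an index $i = 0, 1, 2, ...$ to an exact rational, and it can only ever inspect a finite prefix, so it illustrates the idea rather than proving anything. The helper names (`const`, `add`, `mul`, `failing_indices`) and the choice $r = 1/10$ are illustrative assumptions, not part of the construction.

```python
from fractions import Fraction

def const(r):
    """Constant sequence (r, r, r, ...), the copy of the real number r."""
    r = Fraction(r)
    return lambda i: r

def add(a, b):
    """Componentwise sum of two sequences."""
    return lambda i: a(i) + b(i)

def mul(a, b):
    """Componentwise product of two sequences."""
    return lambda i: a(i) * b(i)

def eps(i):
    """The sequence (1, 1/2, 1/4, ..., 2^-i, ...)."""
    return Fraction(1, 2 ** i)

def failing_indices(a, b, n):
    """Indices i < n at which the strict comparison a_i < b_i fails."""
    return [i for i in range(n) if not a(i) < b(i)]

r = const(Fraction(1, 10))           # hypothetical choice of a positive real
bad = failing_indices(eps, r, 50)    # where eps_i < r_i fails on the prefix
pos = failing_indices(const(0), eps, 50)  # where 0 < eps_i fails on the prefix

two_eps = add(eps, eps)
eps_squared = mul(eps, eps)

print(bad)                               # [0, 1, 2, 3]
print(pos)                               # []
print([two_eps(i) for i in range(4)])
print([eps_squared(i) for i in range(4)])
```

On this prefix the comparison with the constant sequence fails only at the first four indices, which is the "all but finitely many" behaviour that the equivalence relation is meant to absorb.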
However, most sequences in $\Bbb{R}^\Bbb{N}$ are not comparable in this partial order, and the ring of such sequences is also rife with zero divisors, both of which are serious deficiencies if we hope to do any calculus. If possible, we would also like two sequences $\mathbf{a}, \mathbf{b} \in \Bbb{R}^\Bbb{N}$ to be considered "equivalent" (i.e. different representatives of the same number) if $a_i = b_i$ for all but finitely many $i$. We use a nonprincipal ultrafilter on $\Bbb{N}$ to accomplish this: we declare $\mathbf{a} \leq \mathbf{b}$ iff the set of indices $i$ with $a_i \leq b_i$ belongs to the ultrafilter. This extends the partial order $\leq$ on sequences to a total order on equivalence classes of sequences (where the equivalence classes are given by "$\mathbf{a} \sim \mathbf{b}$" iff "$\mathbf{a} \leq \mathbf{b}$ and $\mathbf{b} \leq \mathbf{a}$"). The set of all equivalence classes, $\Bbb{R}^\Bbb{N} / \sim$, is the hyperreal numbers ${}^*\Bbb{R}$. Every function $f: \Bbb{R} \to \Bbb{R}$ has a nonstandard counterpart ${}^*f: {}^*\Bbb{R} \to {}^*\Bbb{R}$ given by $${}^*f([x_1, x_2, x_3, ...]) := [f(x_1), f(x_2), f(x_3), ...]$$ where $[x_1, x_2, x_3, ...]$ means the equivalence class of the sequence $(x_1, x_2, x_3, ...) \in \Bbb{R}^\Bbb{N}$.
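The two deficiencies of the plain sequence ring, and the componentwise extension of a function, can be seen on a small example. The sketch below is hedged in the obvious way: a nonprincipal ultrafilter cannot be written down explicitly, so the code only checks finite prefixes and never forms equivalence classes. The sequences `a`, `b` and the helper `star` are hypothetical names chosen for illustration.

```python
from fractions import Fraction

def a(i):
    """The sequence (1, 0, 1, 0, ...)."""
    return 1 if i % 2 == 0 else 0

def b(i):
    """The sequence (0, 1, 0, 1, ...)."""
    return 0 if i % 2 == 0 else 1

def mul(x, y):
    """Componentwise product of two sequences."""
    return lambda i: x(i) * y(i)

def star(f):
    """Componentwise extension of f to sequences: (x_i) -> (f(x_i))."""
    return lambda x: (lambda i: f(x(i)))

N = 20  # length of the finite prefix we inspect

# Zero divisors: neither a nor b is the zero sequence, yet a * b is.
product = [mul(a, b)(i) for i in range(N)]

# Incomparability: a <= b fails at even indices, b <= a fails at odd ones.
a_leq_b = all(a(i) <= b(i) for i in range(N))
b_leq_a = all(b(i) <= a(i) for i in range(N))

# Extending f(t) = t^2 to the sequence (1, 1/2, 1/4, ...).
square = star(lambda t: t * t)
eps = lambda i: Fraction(1, 2 ** i)
sq_eps = [square(eps)(i) for i in range(4)]

print(product)           # all zeros
print(a_leq_b, b_leq_a)  # False False
print(sq_eps)
```

An ultrafilter contains exactly one of the set of even indices and the set of odd indices, so after passing to equivalence classes exactly one of `a`, `b` becomes equivalent to $\mathbf{0}$ and the other to $\mathbf{1}$; that is how both the zero divisors and the incomparability disappear.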
Under this interpretation, every infinitesimal hyperreal is the equivalence class of a Cauchy sequence with limit zero; every finite hyperreal is the equivalence class of a Cauchy sequence; and the "infinite hyperreals" are equivalence classes of unbounded sequences that either approach $+\infty$ or $-\infty$ (which gives an interesting way to rigorously interpret Big O notation). The standard part of a finite hyperreal is just the limit of a Cauchy sequence in its equivalence class.
Now that I understand how this works, it seems pretty clever to treat "essentially different" Cauchy sequences as different numbers "infinitesimally close" to a given one, since these sequences are the foundation of the limiting processes in calculus. But it also seems surprising to me that such heavy machinery was applied to proving that nonstandard calculus is logically equivalent to regular calculus, and such pedagogical emphasis (in the sources I've perused, like Robinson's original Nonstandard Analysis) is placed on the transfer principle. Because now that I have this understanding, nonstandard analysis almost seems like a trivial rewording of regular calculus; and I have such confidence that it works just like regular calculus, because the rewording is so mechanical once you understand that "standard part of a finite hyperreal" means "limit of a Cauchy sequence in its equivalence class":
Continuity
Standard: $f: \Bbb{R} \to \Bbb{R}$ is continuous iff $\{ f(x_i) \}_{i \in \Bbb{N}}$ is Cauchy whenever $\{ x_i \}_{i \in \Bbb{N}}$ is Cauchy (using the standard definition of a Cauchy sequence).
Nonstandard: ${}^*f: {}^*\Bbb{R} \to {}^*\Bbb{R}$ is continuous iff $\operatorname{st}({}^*f(\mathbf{x})) = f(\operatorname{st}(\mathbf{x}))$ for every finite hyperreal $\mathbf{x}$.
Differentiability
Standard: $f: \Bbb{R} \to \Bbb{R}$ is differentiable at $x$ iff $\{ \frac{f(x + h_i) - f(x)}{h_i} \}_{i \in \Bbb{N}}$ is a Cauchy sequence converging to a specific number $f'(x)$ independent of the sequence $\{ h_i \}_{i \in \Bbb{N}}$, whenever $\{ h_i \}$ is Cauchy, converges to $0$, and has every $h_i \neq 0$.
Nonstandard: ${}^*f: {}^*\Bbb{R} \to {}^*\Bbb{R}$ is differentiable iff for any nonzero infinitesimal $\mathbf{h}$, $\operatorname{st}\left(\frac{{}^*f(\mathbf{x}+\mathbf{h}) - {}^*f(\mathbf{x})}{\mathbf{h}}\right) = f'(\operatorname{st}(\mathbf{x}))$ independent of the value of $\mathbf{h}$.
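As a sanity check on the sequence formulation of the derivative, here is a small sketch using exact rational arithmetic. It assumes the concrete choices $f(t) = t^2$, $x = 3$, and two null sequences $h_i = 2^{-i}$ and $h_i = (-1)^i/(i+1)$; all of these, and the helper name `quotients`, are illustrative assumptions. For this $f$ the difference quotient equals $2x + h_i$ exactly, so its distance from $f'(x) = 6$ is $|h_i|$ whichever null sequence is used.

```python
from fractions import Fraction

def f(t):
    """Example function f(t) = t^2, with derivative 2t."""
    return t * t

def quotients(f, x, h, n):
    """First n difference quotients (f(x + h_i) - f(x)) / h_i."""
    return [(f(x + h(i)) - f(x)) / h(i) for i in range(n)]

# Two different sequences of nonzero terms converging to 0.
h1 = lambda i: Fraction(1, 2 ** i)
h2 = lambda i: Fraction((-1) ** i, i + 1)

x = Fraction(3)
q1 = quotients(f, x, h1, 10)
q2 = quotients(f, x, h2, 10)

# Distance of each quotient from the expected derivative 2 * x = 6.
gap1 = [abs(q - 2 * x) for q in q1]
gap2 = [abs(q - 2 * x) for q in q2]

print(gap1[-1], gap2[-1])
```

In the hyperreal phrasing, the same computation reads: the quotient is $2\mathbf{x} + \mathbf{h}$, and taking the standard part discards the infinitesimal $\mathbf{h}$.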
So I guess my questions are:
Am I understanding this right and, on some level, the "infinitesimals" are just a thinly disguised version of Cauchy sequences with limit zero? That seems significantly more mundane than the name's suggestion of "infinitely small numbers" would imply.
What is the gain in clarity, elegance, speed, etc. one gets by working with "finite hyperreals" rather than the representative Cauchy sequences, per se? That is, what are some specific instances when one gains meaningful new insight into a result from standard analysis (as opposed to a spiffy new phrasing of the result) by working in the nonstandard/hyperreal setting?