
An algorithm $\tilde{f}$ for a problem $f$ is said to be backward stable if for any input $x$, $$ \tilde{f}(x) = f(\tilde{x}) \ \text{ for some } \tilde{x} \text{ such that } \frac{||\tilde{x} - x||}{||x||} = O(\epsilon) $$ as $\epsilon \to 0$, where $\epsilon$ is the machine epsilon.

The first time I saw this definition, I was very confused: where does the $\epsilon$ come from? There is no $\epsilon$ in the definitions of $f$, $\tilde{f}$, $x$, $\tilde{x}$, etc.

After some thought, I realized that the algorithm $\tilde{f}$ is meant to be implemented on a family of idealized machines whose machine epsilon $\epsilon \to 0$. Thus, the algorithm is actually a family of functions $\{ \tilde{f}_\epsilon \}_{\epsilon > 0}$, where each $\tilde{f}_\epsilon$ corresponds to a different machine epsilon $\epsilon$.

Here is my understanding of backward stability:

An algorithm $\{ \tilde{f}_\epsilon \}_{\epsilon > 0}$ is backward stable if there exist $C, \delta > 0$ such that for every input $x$ and every $0 < \epsilon < \delta$, there exists $\tilde{x}$ such that $$ \tilde{f}_\epsilon(x) = f(\tilde{x}) \ \text{ and } \ ||\tilde{x} - x|| \leq C \epsilon ||x||. $$

Is my understanding correct? This definition seems too complicated.
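To convince myself, here is a small numerical check of this reading (a sketch of my own; the choice of NumPy dtypes and of `fractions.Fraction` as a stand-in for exact arithmetic are illustrative assumptions). It treats float16/float32/float64 as three members of the family $\{ \tilde{f}_\epsilon \}_{\epsilon > 0}$, with $f$ a single addition; the observed relative error tracks each machine's $\epsilon$:

```python
from fractions import Fraction

import numpy as np

# Treat NumPy's float16/float32/float64 as three "machines" with shrinking
# machine epsilon, all running the same algorithm: one addition.
# Fraction arithmetic plays the role of the exact problem f(x, y) = x + y.
x, y = 0.1, 0.2
exact = Fraction(x) + Fraction(y)   # exact sum of the binary64 inputs

for dtype in (np.float16, np.float32, np.float64):
    eps = np.finfo(dtype).eps
    computed = dtype(x) + dtype(y)  # f_eps(x, y): inputs rounded, then added
    rel_err = abs(Fraction(float(computed)) - exact) / exact
    # For adding two positive numbers, a relative output error of this size
    # can be pushed back onto the inputs, so it is also a backward error.
    print(f"{np.dtype(dtype).name}: eps = {eps:.1e}, relative error = {float(rel_err):.1e}")
```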

1 Answer


Your understanding is very close to correct, I would say. The difficulty with the original definition that you quote stems from confusing an algorithm with its output when implemented on a variety of machines with inexact arithmetic.

I hesitate to give a general definition of what an algorithm is, since I am not a computer scientist. But commonly we need to talk about implementing the same algorithm on different machines. On any given machine $m$, this determines an output $\tilde f_m(x)$ from any given input $x$.

To analyze the machine-dependent error, we suppose that the relative error in inexact arithmetic operations can be bounded in terms of the machine $\epsilon$ in a standard way that, for example, IEEE floating-point arithmetic satisfies. (Different machines with the same machine $\epsilon$ can produce different outputs, as I know from bitter experience: around 1990, a certain British PC maker refused to fix an overflow-on-underflow bug in their Intel 8088 software floating-point emulation that I ran into and told them about.)
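For concreteness, the standard model usually meant by this is that each elementary operation is exact up to one relative rounding error, $$ \mathrm{fl}(x \ast y) = (x \ast y)(1 + \delta), \qquad |\delta| \le \epsilon, \qquad \ast \in \{+, -, \times, \div\}, $$ where the precise value of $\epsilon$ depends on convention (on the order of $2^{-53}$ for IEEE double precision). Backward-error analyses then push each such $\delta$ back onto the input data.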

Under these assumptions on machine arithmetic, we can hope to prove that, as you say, there exist $C, \delta > 0$ such that if $0 < \epsilon < \delta$, for each input $x$ there exists $\tilde x$ such that $$ \tilde f_m(x) = \tilde f_0(\tilde x) \ \text{ and } \ \|\tilde x - x\| \le C \epsilon \|x\|, $$ where $\tilde f_0(\tilde x)$ is the (idealized) output of the algorithm run in exact arithmetic with input $\tilde x$. We may prefer here to replace $\tilde f_0(\tilde x)$ by the quantity $f(\tilde x)$ that the algorithm should output if it could be run exactly, without worrying about whether this is possible or whether it is defined differently. In this sense, evaluating $f(x)$ is the original "problem."
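As an illustration (a sketch of my own, with random data; for linear systems the Rigal–Gaches formula gives a computable normwise backward error, standing in for "there exists $\tilde x$"), running `np.linalg.solve` on the same data in float32 and float64 shows the backward error tracking each machine's $\epsilon$:

```python
import numpy as np

rng = np.random.default_rng(0)
A64 = rng.standard_normal((200, 200))
b64 = rng.standard_normal(200)

for dtype in (np.float32, np.float64):
    A, b = A64.astype(dtype), b64.astype(dtype)
    x = np.linalg.solve(A, b)  # \tilde f_m(A, b) on this "machine"
    Ad, bd, xd = A.astype(np.float64), b.astype(np.float64), x.astype(np.float64)
    # Normwise backward error (Rigal-Gaches): the size of the smallest
    # relative perturbation of (A, b) for which xd is the exact solution.
    eta = np.linalg.norm(bd - Ad @ xd) / (
        np.linalg.norm(Ad, 2) * np.linalg.norm(xd) + np.linalg.norm(bd)
    )
    print(f"{np.dtype(dtype).name}: eps = {np.finfo(dtype).eps:.1e}, eta = {eta:.1e}")
```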

As to why this is fairly complicated, it has to do with distinguishing error generated by inexact arithmetic from error due to problem sensitivity. There is a possibility that the values of $f$ may depend very sensitively on the inputs, so that $f(x)$ may differ very greatly from $\tilde f_m(x)$, but it is not the algorithm's fault: It computes the exact answer $f(\tilde x)$ to a problem with input $\tilde x$ as close to $x$ as we can reasonably expect, given finite-precision arithmetic. See answers to this question for more on this and further references.
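A quick way to see this separation numerically (again a sketch, under my own choice of example) is to solve a Hilbert system, whose condition number is enormous, and compare the forward error with the backward error:

```python
import numpy as np

n = 12
# Hilbert matrix: notoriously ill-conditioned, cond(A) ~ 1e16 for n = 12.
A = 1.0 / (np.arange(1, n + 1)[:, None] + np.arange(n)[None, :])
x_true = np.ones(n)
b = A @ x_true

x = np.linalg.solve(A, b)
forward = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
eta = np.linalg.norm(b - A @ x) / (np.linalg.norm(A, 2) * np.linalg.norm(x)
                                   + np.linalg.norm(b))

print(f"cond(A)        ~ {np.linalg.cond(A):.1e}")
print(f"forward error  = {forward:.1e}")   # large: the *problem* is sensitive
print(f"backward error = {eta:.1e}")       # ~ 1e-16: the *algorithm* is fine
```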

Bob Pego