Take $f(x) = x^2$ and $p = 3$ for simplicity's sake. Then the derivative $f'$ lets us define a linear approximation of $f$ near $p$ as follows: $l(x) = f'(p)(x - p) + f(p)$. It is often stated that $l(x)$ is the best linear approximation one could possibly find; however, I've never seen how one compares one linear approximation to another in order to decide which is better.
For example, take $l(x) = 3x$ instead. It approximates $f(x)$ at $p$ perfectly: $f(p) = 3^2 = 9 = 3 \times 3 = l(p)$, and its error function $e(x) = f(x) - l(x)$ also approaches $0$ as $x$ approaches $p$ from either side.
So both of them give the same perfect approximation at the point $p$, and both have arbitrarily small errors on some interval $(p - \delta, p + \delta)$, $\delta > 0$. How do I convince myself, then, that one is indeed better than the other?
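
To make the comparison concrete, here is a rough numerical sketch, not a proof. It tabulates the error of both lines at $x = p + h$ for shrinking $h$, together with the error divided by $h$. The helper names `tangent`, `other`, and `errors` are made up for illustration, and looking at the ratio error$/h$ is only one possible way to compare the two lines, chosen here as an assumption.

```python
def f(x):
    return x ** 2

p = 3.0

def tangent(x):
    # l(x) = f'(p)(x - p) + f(p), with f'(p) = 2p = 6, i.e. 6x - 9
    return 2 * p * (x - p) + f(p)

def other(x):
    # The competing line from the example, l(x) = 3x
    return 3 * x

def errors(h):
    # Errors f(x) - l(x) of both lines at x = p + h
    x = p + h
    return f(x) - tangent(x), f(x) - other(x)

for h in (0.1, 0.01, 0.001):
    e_tan, e_other = errors(h)
    # Print each error and the error relative to the distance h from p
    print(f"h={h}: tangent error={e_tan:.6g} (error/h={e_tan / h:.6g}), "
          f"3x error={e_other:.6g} (error/h={e_other / h:.6g})")
```

Algebraically, the tangent line's error here is $(x - 3)^2 = h^2$, while the error of $3x$ is $x(x - 3) = 3h + h^2$. Both go to $0$, but only the first still goes to $0$ after dividing by $h$; the second ratio tends to $3$.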