So you know that $y=Ae^{bx}$, and that's all you know. You have a whole bunch of data $(x_1,y_1), (x_2,y_2), \ldots, (x_N,y_N)$ and you can plot these points on a graph, and then you see that they lie on an exponential curve, but how do you get the values of $A$ and $b$?
What if we take the log of both sides? We get $\ln(y) = \ln(A) + bx$. But wait! This is our familiar equation of a line: output equals slope times input plus a constant. In this case, $x$ is the input, $\ln(y)$ is the output, $b$ is the slope, and $\ln(A)$ is the constant. So now all we have to do is fit a line to points, and we know how to do that.
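To see that the transformed data really do lie on a line, here's a minimal sketch, assuming NumPy and made-up values $A=2$, $b=0.5$ chosen only for illustration:

```python
import numpy as np

# Hypothetical ground truth, chosen only for illustration.
A, b = 2.0, 0.5
x = np.linspace(0, 4, 9)
y = A * np.exp(b * x)

z = np.log(y)                     # z = ln(y) = ln(A) + b*x
slope = np.diff(z) / np.diff(x)   # constant slope => the transformed points are collinear
print(slope)                      # every entry is 0.5, i.e. b
```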
How do we find the relevant quantities? We could plot $\ln(y_i)$ against $x_i$ by hand, see that the points almost fall on a line, and then take a ruler and try to draw a line that goes through most of them. This would do fairly well, but it probably wouldn't be the "best" fit in any sense we actually care about.
Alternatively, we could solve the following problem:
\begin{align}
\text{minimize}\hspace{8pt}\sum_i \left(\ln(y_i) - (\ln(A)+bx_i)\right)^2.
\end{align}
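In practice this is just an ordinary line fit on the points $(x_i, \ln(y_i))$. Here's a sketch of how you might do it with NumPy; the data are synthetic, generated only to illustrate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic noisy exponential data (A=2, b=0.5 are made up for this example).
x = np.linspace(0, 4, 50)
y = 2.0 * np.exp(0.5 * x) + rng.normal(scale=0.2, size=x.size)

# Fit a line to (x, ln y): the slope is b, the intercept is ln(A).
b_hat, lnA_hat = np.polyfit(x, np.log(y), 1)
A_hat = np.exp(lnA_hat)
print(A_hat, b_hat)   # roughly 2 and 0.5
```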
This is called the "least squares problem" because we are minimizing the squared differences between the points we know and our model's predictions. If we think of each difference as an error, then we're minimizing the sum of the squared errors:
\begin{align}
\text{minimize}\hspace{8pt}\sum_i \text{error}_i^2
\end{align}
where $\text{error}_i = \ln(y_i) - (\ln(A)+bx_i)$.
But what has happened? Take any one data point and think about its error in the original exponential plot. The distance between the point and the curve there, the error, is not the same as the distance between the point and the line in the new plot of $\ln(y)$ versus $x$. And, as the referenced article points out, the smaller values of $y$ will matter more than the larger values of $y$.
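To see how lopsided this can get, here is a tiny numerical illustration (the values are arbitrary): the same absolute error of $0.05$ produces a log-space residual hundreds of times larger at $y=0.1$ than at $y=100$.

```python
import numpy as np

# The same absolute error on y...
for y_true in (0.1, 100.0):
    y_obs = y_true + 0.05
    # ...looks very different once we take logs.
    print(y_true, np.log(y_obs) - np.log(y_true))
# y = 0.1   -> log residual ~0.405
# y = 100.0 -> log residual ~0.0005
```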
This may be OK. For example, if you know that you care more about the smaller values of $y$ than the larger ones, then this might be exactly what you want. But what if it's not?
The key is to realize that the function we're minimizing was chosen arbitrarily. If you know that you care about the larger values of $y$ as much as the smaller values, then you can alter the minimization function as follows:
\begin{align}
\text{minimize}\hspace{8pt}\sum_i w_i \left( \ln(y_i)-(\ln(A)+bx_i)\right)^2.
\end{align}
What did you just do? You created a set of weights $w_i$ that control how much each term in the sum matters. If you want to give less weight to the data points with small values of $y$, you can let $w_i=y_i$.
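Here's one way you might implement this choice, a sketch with the same kind of synthetic data as before (again assuming NumPy). The trick is that minimizing $\sum_i w_i\,\text{error}_i^2$ is the same as ordinary least squares on a system whose rows are scaled by $\sqrt{w_i}$:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 4, 50)
y = 2.0 * np.exp(0.5 * x) + rng.normal(scale=0.2, size=x.size)

z = np.log(y)
w = y                                 # w_i = y_i: down-weight the small-y points

# Design matrix for the line ln(A) + b*x; scale each row by sqrt(w_i) so that
# ordinary least squares on the scaled system minimizes sum_i w_i * error_i^2.
X = np.column_stack([np.ones_like(x), x])
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(X * sw[:, None], z * sw, rcond=None)
lnA_hat, b_hat = coef
print(np.exp(lnA_hat), b_hat)
```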
This type of problem is called "weighted least squares". Note that the objective is a differentiable function of $\ln(A)$ and $b$, so you can solve it by taking derivatives and setting them equal to zero.
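Concretely, writing $z_i = \ln(y_i)$ and $c = \ln(A)$, setting the two partial derivatives of the objective to zero gives
\begin{align}
\frac{\partial}{\partial c}\sum_i w_i \left(z_i - c - bx_i\right)^2 &= -2\sum_i w_i \left(z_i - c - bx_i\right) = 0,\\
\frac{\partial}{\partial b}\sum_i w_i \left(z_i - c - bx_i\right)^2 &= -2\sum_i w_i x_i\left(z_i - c - bx_i\right) = 0,
\end{align}
which rearrange into the weighted normal equations
\begin{align}
c\sum_i w_i + b\sum_i w_i x_i &= \sum_i w_i z_i,\\
c\sum_i w_i x_i + b\sum_i w_i x_i^2 &= \sum_i w_i x_i z_i,
\end{align}
a $2\times 2$ linear system you can solve directly for $c=\ln(A)$ and $b$.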
For many engineering problems, choosing the weights well can be the difference between a solution that works and one that doesn't.