
How can I linearize a nonlinear model using

$$y=Ax^b$$

and

$$y=A\ln x+B$$

I couldn't find much online about these methods.

The notes are from a class on Models in Applied Mathematics.

– iordanis

3 Answers


Suppose you have to fit experimental data with the model $y=ax^b$. You should first notice that this model is nonlinear with respect to its parameters. The problem with nonlinear regression is that you must provide "reasonable" starting guesses for the parameters $a$ and $b$.

In some cases, you can linearize the model; in your case, taking logarithms of both sides gives $\log (y)=\log (a)+b \log (x)$, which is of the form $Y=A+B\,X$ with $Y=\log (y)$, $A=\log(a)$, $B=b$ and $X=\log (x)$. Standard linear regression can then be used to obtain the parameters $A$ and $B$. Going backwards, $b=B$ and $a=e^A$.

However, this is not the end of the story, for the simple reason that when you use the linearized form you minimize the sum of the squares of the errors for $Y$, that is to say for $\log(y)$, while the original problem is to minimize the sum of the squares of the errors for $y$, which is not the same thing.

So, using the estimates obtained by the linear regression, you must start the nonlinear regression. If the errors are small, these estimates will be quite close to the final solution.
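
A minimal sketch of this two-step procedure in Python (the use of numpy and scipy is my assumption, not part of the answer; the data points are the ones suggested in the comments below):

```python
import numpy as np
from scipy.optimize import curve_fit

# Example data points suggested by Claude Leibovici in the comments below.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.65, 4.91, 9.24, 14.6, 20.7, 27.4])

# Step 1: linearize.  ln(y) = ln(a) + b ln(x), so fit a straight line to
# (ln x, ln y) with ordinary linear least squares.
B, A = np.polyfit(np.log(x), np.log(y), 1)   # slope B = b, intercept A = ln(a)
a0, b0 = np.exp(A), B                        # back-transformed starting guesses

# Step 2: refine with a nonlinear least-squares fit on the original data,
# using the linearized estimates as starting values.
(a, b), _ = curve_fit(lambda x, a, b: a * x ** b, x, y, p0=(a0, b0))

print("linearized estimates:", a0, b0)
print("refined estimates:   ", a, b)
```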

  • So the point of linearizing a data set is so that it can be represented by a line? Or so that I can find the variables described above and fit it on a logistic model? – iordanis Feb 10 '14 at 08:43
  • You are right but on a log-log plot. – Claude Leibovici Feb 10 '14 at 08:44
  • I would appreciate it if you could link me to an example that makes use of the above technique. – iordanis Feb 10 '14 at 08:47
  • I give you a few data points : (1,1.65), (2,4.91), (3,9.24), (4,14.6), (5,20.7), (6,27.4). Now, apply and send me your estimated values for the original parameters $a$ and $b$. Cheers. – Claude Leibovici Feb 10 '14 at 08:53
  • Solve the system of $-2e^{B}(1.65-Ae^{B})-2\cdot 4.91\,e^{2B}(1-Ae^{2B})-2\cdot 9.24\,e^{3B}(1-Ae^{3B})=0$ and $\frac{d}{dB}\sum\left(y-Ae^{Bx}\right)^{2}=0$, obtain $A$ and $B$, and then replace them in $Y=A+BX$? – iordanis Feb 10 '14 at 09:08
  • I did it for the first 3 points – iordanis Feb 10 '14 at 09:10
  • No. First plot $y$ as a function of $x$. Then another plot of $Y$ as a function of $X$. Do you see any difference from a graphing point of view ? – Claude Leibovici Feb 10 '14 at 09:11
  • Can you give me the answer to the question so that I can solve it later on and confirm it? For the first 3 points only. – iordanis Feb 10 '14 at 09:15
  • Answer my question first, please. By the way, have you been taught linear regression yet? – Claude Leibovici Feb 10 '14 at 09:26
  • I just understood it very well... So the steps I take are: first I transform the data set from the form $\{x_i,y_i\}$ to $\{x_i,\ln(y_i)\}$, then I perform a linear least squares fit on that data, then I find the factors $A$ and $B$ from the linear least squares fit and use them to replace the factors of the equation $y=e^{c}e^{Bx}$, where $c$ and $B$ come from $\ln(y)=Bx+c$. – iordanis Feb 10 '14 at 13:24
  • I am actually trying to find it using the approach in my previous comment. Regardless of that, thank you very much; I have an exam in half an hour! Your help was appreciated. – iordanis Feb 10 '14 at 13:25

Problem statement

Given a sequence of $m=5$ measurements $\left\{ x_{k}, y_{k} \right\}_{k=1}^{m}$ and the model $$ y(x) = a x^{b}, $$ use the method of least squares to find the solution $$ \boxed{ (a,b)_{LS} = \left\{ (a,b) \in\mathbb{R}^{2} \colon \sum_{k=1}^{m} \left( y_{k} - a x^{b}_{k} \right)^{2} \text{ is minimized} \right\} } \tag{1} $$ The target of minimization is the merit function $$ M(a,b) = \sum_{k=1}^{m} \left( y_{k} - a x^{b}_{k} \right)^{2} \tag{2} $$ Defining the residual error as $$ r_{k} = y_{k} - a x^{b}_{k}, $$ the least squares problem minimizes the sum of the squares of the residual errors, $r\cdot r = r^{2}$.

Data

The input parameters are $(a,b) = (1, \frac{1}{2})$, and random noise $|\epsilon| < 0.1$ is stirred in. The data set is then $$ \begin{array}{ccc} k & x & y \\\hline 1 & 0.2 & 0.510691 \\ 2 & 0.4 & 0.554739 \\ 3 & 0.6 & 0.832502 \\ 4 & 0.8 & 0.831988 \\ 5 & 1 & 0.948272 \\ \end{array} $$

Least squares solution

The least squares solution described in $(1)$ is $$ (a,b)_{LS} = \left( 0.943281, 0.424359 \right) $$ with a total error of $r^{2} = 0.0143$.

The solution point is displayed against a contour plot of the merit function $(2)$. The least squares solution is nestled in at the very bottom of the total error surface.

[Figure: contour plot of the merit function $M(a,b)$ with the least squares solution marked.]
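
A sketch of how this solution can be reproduced with an off-the-shelf nonlinear least-squares routine (scipy's `curve_fit` is my choice here, not something from the post; any routine that minimizes the merit function $(2)$ will do):

```python
import numpy as np
from scipy.optimize import curve_fit

# Data table from the "Data" section above.
x = np.array([0.2, 0.4, 0.6, 0.8, 1.0])
y = np.array([0.510691, 0.554739, 0.832502, 0.831988, 0.948272])

# Minimize the merit function M(a, b) = sum_k (y_k - a x_k^b)^2 directly.
(a, b), _ = curve_fit(lambda x, a, b: a * x ** b, x, y, p0=(1.0, 0.5))
r2 = np.sum((y - a * x ** b) ** 2)

print(a, b, r2)   # expected to be close to 0.943281, 0.424359, 0.0143
```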

Least squares solution of reduced system

What if we don't have access to a conjugate gradient code? The problem can be simplified: the considerable challenge of finding a minimum in two dimensions will be supplanted by a minimization problem in one dimension.

Go back to equation (1), and attack with the conventional tools of calculus. Demand that the derivative with respect to $a$ be $0$: $$ \frac{\partial} {\partial a} \sum_{k=1}^{m} \left( y_{k} - a x^{b}_{k} \right)^{2} = -2 \sum_{k=1}^{m} \left( y_{k} - a x^{b}_{k} \right)x^{b}_{k} = 0 $$ This produces a constrained version of the parameter $$ a_{*}(b) = \frac{\sum y_{k}x^{b}_{k}} {\sum x^{2b}_{k}} $$ Given $b_{LS}$, we can compute $a_{LS}$ like so: $$ a_{LS} = a_{*} \left( b_{LS} \right) \tag{3} $$ The figure below plots $a_{*}(b)$ as a dashed line against the full two-dimensional merit function.

[Figure: the constrained parameter $a_{*}(b)$ shown as a dashed line over a contour plot of the merit function.]

Instead of searching in two dimensions, the problem reduces to searching along this trajectory.

[Figure: the one-dimensional search trajectory through the merit function.]

The constrained merit function is $$ M_{*}(b) = M\left( a_{*}(b), b \right) = \sum_{k=1}^{m} \left( y_{k} - a_{*}(b)\, x^{b}_{k} \right)^{2} = \sum_{k=1}^{m} \left( y_{k} - \frac{\sum_{j} y_{j}x^{b}_{j}} {\sum_{j} x^{2b}_{j}}\, x^{b}_{k} \right)^{2} \tag{4} $$

The function is a delight to minimize. It has a lone minimum and is monotonic in the best way: moving from left to right, the function decreases monotonically up to the minimum, and then increases monotonically.

Find $b_{LS}$ by finding the minimum of $(4)$, then use $(3)$ to compute $a_{LS}$. This exactly matches the full least squares solution and minimum error.
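
A sketch of the reduced, one-dimensional search in Python (the use of scipy's `minimize_scalar` and the search interval $0 \le b \le 2$ are my assumptions, not part of the post):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Data table from the "Data" section above.
x = np.array([0.2, 0.4, 0.6, 0.8, 1.0])
y = np.array([0.510691, 0.554739, 0.832502, 0.831988, 0.948272])

def a_star(b):
    # Constrained parameter from (3): a_*(b) = sum(y_k x_k^b) / sum(x_k^{2b}).
    return np.sum(y * x ** b) / np.sum(x ** (2 * b))

def merit_star(b):
    # Constrained merit function (4): the 2-D problem reduced to one dimension.
    return np.sum((y - a_star(b) * x ** b) ** 2)

# One-dimensional minimization along the trajectory a = a_*(b).
result = minimize_scalar(merit_star, bounds=(0.0, 2.0), method="bounded")
b_ls = result.x
a_ls = a_star(b_ls)

print(a_ls, b_ls)   # should match the full least squares solution quoted earlier
```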

Logarithmic transformation

A logarithmic transformation is used to batter the data set until it can be forced into a linear system. In the limit of no error, $$ \begin{align} y(x) &= a x^{b} \\ \ln y(x) &= \ln \left( ax^{b} \right) = \ln a + b \ln x \end{align} $$ is an exact solution. But if there were no error, there would be no need for least squares. A way to think about the problem is this: the method of least squares takes a "peanut butter" approach to smoothing out the errors. A residual error of $1$ has the same effect whether $y=1$ or $y=1000$. The logarithm enforces a location dependence. If $y=1$ and the residual in $\log_{10}$ space is $1$, the prediction is $y=10$; the difference in our linear space is $9$. If that same residual error is measured at $y=1000$, the prediction is $10{,}000$, a difference of $9000$. This is very disruptive for the solution.

Proceeding, the battered data is inserted into the linear system $$ \mathbf{A}\, \alpha = \tilde{y}, \qquad \left[ \begin{array}{cc} 1 & \ln x \end{array} \right] \left[ \begin{array}{c} \ln a \\ b \end{array} \right] = \left[ \begin{array}{c} \ln y \end{array} \right], $$ that is, $$ \left[ \begin{array}{cr} 1 & -1.60944 \\ 1 & -0.916291 \\ 1 & -0.510826 \\ 1 & -0.223144 \\ 1 & 0 \\ \end{array} \right] \left[ \begin{array}{c} \ln a \\ b \end{array} \right] = \left[ \begin{array}{r} -0.67199 \\ -0.589257 \\ -0.18332 \\ -0.183937 \\ -0.0531137 \end{array} \right] $$ The least squares solution of this linear system is $$ \alpha = \mathbf{A}^{+}\tilde{y} = \left[ \begin{array}{c} \ln a \\ b \end{array} \right] = \left[ \begin{array}{r} -0.0700446 \\ 0.408441 \end{array} \right] $$ To compare solutions, $$ \left[ \begin{array}{c} a \\ b \end{array} \right] = \left[ \begin{array}{c} 0.932352 \\ 0.408441 \end{array} \right] $$ The total error is $r^{2} = 0.0146$.

How close are we? The red dot marks the solution found after logarithmic transformation.

[Figure: the logarithmic-transformation solution marked as a red dot on the contour plot of the merit function.]
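
A sketch of the transformed problem as an ordinary linear least-squares solve (numpy's `lstsq` stands in here for the pseudoinverse $\mathbf{A}^{+}$ used above; the data are from the table above):

```python
import numpy as np

# Data table from the "Data" section above.
x = np.array([0.2, 0.4, 0.6, 0.8, 1.0])
y = np.array([0.510691, 0.554739, 0.832502, 0.831988, 0.948272])

# Design matrix [1, ln x] and right-hand side ln y of the linear system.
A_mat = np.column_stack([np.ones_like(x), np.log(x)])
rhs = np.log(y)

# Least squares solution of the linear system (equivalent to applying A^+).
(ln_a, b), *_ = np.linalg.lstsq(A_mat, rhs, rcond=None)
a = np.exp(ln_a)

# Total error measured on the original data, not on the logarithms.
r2 = np.sum((y - a * x ** b) ** 2)

print(a, b, r2)   # expected to be close to 0.932352, 0.408441, 0.0146
```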

In closing, this long post has provided the details behind the insight of @Claude Leibovici:

So, using the estimates obtained by the linear regression, you must start the nonlinear regression.

– dantopa

Take logs: $$ \ln(y) = \ln(A) + b\ln(x), $$ which we can write in the linear form $$ Y = A' + BX, $$ with $Y = \ln(y)$, $X = \ln(x)$, $A' = \ln(A)$ and $B = b$.
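
A minimal sketch of this substitution (the noise-free data generated from $y = 2x^{1.5}$ is purely hypothetical, chosen only so the recovered parameters can be checked by eye):

```python
import numpy as np

# Hypothetical noise-free data from y = 2 x^1.5, used only to illustrate the substitution.
x = np.linspace(1.0, 5.0, 20)
y = 2.0 * x ** 1.5

# Y = ln(y), X = ln(x); fit the straight line Y = A' + B X by linear least squares.
B, A_prime = np.polyfit(np.log(x), np.log(y), 1)

print(np.exp(A_prime), B)   # recovers A = 2 and b = 1.5 (up to rounding)
```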

– Chinny84
  • I appreciate your quick response but can you demonstrate with an example? I would greatly appreciate it! – iordanis Feb 10 '14 at 07:34
  • How do you mean? – Chinny84 Feb 10 '14 at 07:35
  • An example where you linearize a nonlinear model. – iordanis Feb 10 '14 at 07:37
  • Is this for a regression model? If so, with the linearised equation you can use simple linear regression to fit the data. – Chinny84 Feb 10 '14 at 07:42
  • I am not sure exactly what it is for, but these are the 2 equations I found in my notes. I am just trying to study more about this subject but can't find resources. If you can identify it I would appreciate it. – iordanis Feb 10 '14 at 07:44
  • If it is to fit data using regression, then you linearise the model equation and then apply linear regression to obtain the parameters of the model equation, i.e. $a$ and $b$. To further understand what you are trying to do, maybe explain in your question what subject your notes were discussing. – Chinny84 Feb 10 '14 at 08:10
  • It is Models in Applied Math – iordanis Feb 10 '14 at 08:12