If I want to fit a quadratic function of two variables to some data $(x_j, y_j, z_j)$, I can use

$$f(x, y) = c_1 x^2 + c_2 xy + c_3 y^2 + c_4 x + c_5 y + c_6$$

and set the derivative of the sum of squared residuals with respect to each coefficient to zero,

$$\frac{\partial}{\partial c_i} \sum_j\left( z_j - f(x_j, y_j) \right)^2 = 0$$

to obtain six equations, and then endeavor to solve them.
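
For concreteness, here is a minimal sketch of that fit in Python with NumPy. The function name `fit_quadratic_2d`, the coefficients in `true_c`, and the synthetic data are all made up for illustration; `np.linalg.lstsq` minimises the same sum of squares, so its answer satisfies the six equations above.

```python
import numpy as np

def fit_quadratic_2d(x, y, z):
    """Least-squares fit of z ~ c1*x^2 + c2*x*y + c3*y^2 + c4*x + c5*y + c6.

    Hypothetical helper for illustration only.
    """
    # One column per basis function, one row per data point.
    A = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])
    # lstsq minimises ||A c - z||^2 over the coefficient vector c.
    coeffs, _, rank, _ = np.linalg.lstsq(A, z, rcond=None)
    return coeffs, rank

# Synthetic, noise-free data (assumed values, not from any real measurement).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 50)
y = rng.uniform(-1.0, 1.0, 50)
true_c = np.array([1.0, -2.0, 0.5, 3.0, -1.0, 4.0])
z = (true_c[0] * x**2 + true_c[1] * x * y + true_c[2] * y**2
     + true_c[3] * x + true_c[4] * y + true_c[5])

coeffs, rank = fit_quadratic_2d(x, y, z)
print("rank of design matrix:", rank)
print("fitted coefficients:", coeffs)
```

Extending to more variables or higher order only adds columns to `A`; the problem stays linear in the coefficients.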

I've done it for one variable, not two, but I'm guessing the process is straightforward.

If I extend this to more variables, and to higher order than 2, when will the analytical expressions be at risk of having multiple solutions?

uhoh
  • This does not seem to be possible since the system is linear wrt the parameters. Please post a set of $(x_i,y_i,z_i)$ – Claude Leibovici Mar 25 '21 at 04:22
  • @ClaudeLeibovici yes that's right; for least squares it will always be linear. It's been half a century but I remember that now, that's one reason why least squares is/was so convenient. Thanks! – uhoh Mar 25 '21 at 04:29
  • It is possible if the system of linear equations is rank deficient. Take for example $y=\beta_{0}+\beta_{1}x$ and the data points $(1,1)$ and $(1,2)$. There are infinitely many least-squares solutions. – Brian Borchers Mar 25 '21 at 04:56
  • @BrianBorchers that's a good point and I didn't address that at all in my amateur answer below. You are right, there is always the potential for data to be inadequate for a given fitting. – uhoh Mar 25 '21 at 05:01
  • This appears to be a duplicate. For example, "How come least square can have many solutions?", and many, many more. – David Hammen Mar 25 '21 at 05:06
  • @DavidHammen yep, do you think that's the best one to dupe to? – uhoh Mar 25 '21 at 05:11

1 Answer

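As @ClaudeLeibovici quickly pointed out, the least-squares equations will always be linear in the coefficients $c_i$, because the model $f$ is itself linear in them (as any polynomial is).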

This is one reason why least squares is/was so convenient/popular back when folks were "writing with feathers using light from burning animal fat".

For example:

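$$\frac{\partial}{\partial c_1} \sum_j\left( z_j - f(x_j, y_j) \right)^2 = -2 \sum_j \left( z_j - f(x_j, y_j) \right) \frac{\partial f}{\partial c_1}$$ and since $\frac{\partial f}{\partial c_1} = x_j^2$ does not depend on any coefficient and $\left( z_j - f(x_j, y_j) \right)$ is linear in every $c_i$, setting this to zero gives an equation that is linear in the coefficients. As long as the data are adequate for the fit (that is, the linear system is not rank deficient, as @BrianBorchers notes in the comments), there will be exactly one solution.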
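
Collecting all six conditions at once makes the linearity explicit. The notation here is introduced only for this sketch and is not from the question: $A$ is the matrix with one row $(x_j^2,\ x_j y_j,\ y_j^2,\ x_j,\ y_j,\ 1)$ per data point, $\mathbf{c} = (c_1, \dots, c_6)^{\mathsf T}$, and $\mathbf{z}$ is the vector of the $z_j$. Then $f(x_j, y_j) = (A\mathbf{c})_j$ and the six equations read

$$A^{\mathsf T}\left(\mathbf{z} - A\mathbf{c}\right) = 0 \quad\Longleftrightarrow\quad A^{\mathsf T} A\,\mathbf{c} = A^{\mathsf T}\mathbf{z}$$

which is a linear system for $\mathbf{c}$. It has exactly one solution when $A^{\mathsf T} A$ is invertible, i.e. when $A$ has full column rank, and infinitely many otherwise.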

The number of variables and the powers to which they are raised are unimportant, because $f$ stays linear in the coefficients either way. It's only the power of 2 in "least squares" that we need to notice here.
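
The rank-deficient case raised by @BrianBorchers in the comments can be checked numerically. This is only a sketch using his two data points; the names `null_dir`, `beta_other`, and `sse` are hypothetical helpers, and the offset of `3.0` is an arbitrary choice.

```python
import numpy as np

# Fit y = b0 + b1*x to the points (1, 1) and (1, 2) from the comments.
x = np.array([1.0, 1.0])
y = np.array([1.0, 2.0])
A = np.column_stack([np.ones_like(x), x])  # columns: [1, x]

# Both columns are identical, so the design matrix is rank deficient.
rank = np.linalg.matrix_rank(A)

# lstsq returns the minimum-norm least-squares solution in this case.
beta_min_norm, _, _, _ = np.linalg.lstsq(A, y, rcond=None)

# Adding any multiple of a null-space direction of A leaves the fit unchanged.
null_dir = np.array([1.0, -1.0])
beta_other = beta_min_norm + 3.0 * null_dir

def sse(beta):
    """Sum of squared residuals for a given coefficient vector."""
    r = y - A @ beta
    return float(r @ r)

print("rank:", rank)
print("SSE (min-norm solution):", sse(beta_min_norm))
print("SSE (shifted solution): ", sse(beta_other))
```

Both coefficient vectors give the same sum of squares, so the minimum is not unique; the equations are still linear, there are just too few independent ones.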

uhoh