
Here is a thread that inspired this question. I don't agree with Ben's answer, for the reason wcochran stated.

I am trying to find the equation for a plane specifically using least squares. So say we have an equation for a plane, $x+y+z=0$ (no intercept! this will later be problematic). Now using this equation, we generate 4 data points, say, $$ (0,0,0) \\ (1,0,-1) \\ (0,1,-1) \\ (1,1,-2) $$
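As a quick sanity check (a minimal numpy sketch, using only the points above), each of the four generated points does satisfy $x+y+z=0$:

```python
import numpy as np

# The four points generated from the plane x + y + z = 0
points = np.array([
    [0, 0, 0],
    [1, 0, -1],
    [0, 1, -1],
    [1, 1, -2],
])

# Every row should sum to zero, i.e. satisfy x + y + z = 0
print(points.sum(axis=1))  # [0 0 0 0]
```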

Now say that I want to take these points and fit least squares to get the original equation of a plane. So to do this, I assume the equation of the plane is of some generic form: $$ 0 = \beta_1x + \beta_2y + \beta_3z + \beta_0 $$ We can formulate a system of linear equations using the above data points $$ \begin{bmatrix} 0 & 0 & 0 & 1\\ 1 & 0 & -1 & 1\\ 0 & 1 & -1 & 1 \\ 1 & 1 & -2 & 1 \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_0 \end{bmatrix} = 0 $$

But the issue is that this matrix isn't full rank, so we can't use least squares (or, since the matrix happens to be square here, simply invert it, were it nonsingular)... precisely because the points were generated from $x+y+z = 0$, where each variable is a linear combination of the other variables. The missing intercept term also plays a role: the plane passes through the origin, which is what allows each variable to be written as a linear combination of the others.
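The rank deficiency is easy to verify numerically (a minimal numpy sketch over the matrix from the question):

```python
import numpy as np

# Design matrix from the question: columns are x, y, z, 1
X = np.array([
    [0, 0, 0, 1],
    [1, 0, -1, 1],
    [0, 1, -1, 1],
    [1, 1, -2, 1],
], dtype=float)

print(np.linalg.matrix_rank(X))        # 3, not 4: rank-deficient
print(abs(np.linalg.det(X)) < 1e-12)   # True: X is singular

# X^T X inherits the rank deficiency, so the normal equations
# have no unique solution
print(np.linalg.matrix_rank(X.T @ X))  # 3
```

The fourth row is exactly (row 2) + (row 3) − (row 1), which is why the rank is 3.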

Am I missing something fundamental with my thought process or is it impossible to reconstruct the plane $x + y + z = 0$ using the 4 given data points above? I believe we should be able to since 3 non-collinear points are all that is needed to construct a plane, but from what I can see, my process doesn't work here.
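For what it's worth, the homogeneous system $X\beta = 0$ can still be solved up to scale: the coefficient vector spans the one-dimensional null space of $X$, which the SVD exposes as the right singular vector belonging to the zero singular value. A minimal numpy sketch (the normalization choice is mine):

```python
import numpy as np

# Same design matrix as above: columns are x, y, z, 1
X = np.array([
    [0, 0, 0, 1],
    [1, 0, -1, 1],
    [0, 1, -1, 1],
    [1, 1, -2, 1],
], dtype=float)

# Solve X beta = 0 up to scale: beta is the right singular vector
# belonging to the smallest singular value of X
_, s, Vt = np.linalg.svd(X)
beta = Vt[-1]

# The plane is only determined up to a nonzero scalar multiple,
# so normalize the first coefficient to 1
beta = beta / beta[0]
print(beta)  # approximately [1, 1, 1, 0], i.e. x + y + z = 0
```

So the plane is recoverable from these 4 points; it is only the normal-equations route $(X^TX)^{-1}X^T$ that breaks down.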

24n8
  • @lamaron You may not even recover the initial plane... Imagine for instance that the four points belong to the same straight line. – PierreCarre Jul 01 '20 at 22:45
  • @PierreCarre Right, but in this case, I made sure the points aren't all collinear. – 24n8 Jul 01 '20 at 22:47
  • If you use a numerical method such as singular value decomposition (SVD) then you can do this with any number of points. I suggest you read up on SVD, least squares, and pseudo inverses (also called Moore-Penrose inverse). Lots of literature on that. Here is a start: https://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_inverse – Jap88 Jul 02 '20 at 00:39
  • @Jap88 Yes, I understand this particular problem can be solved with SVD as the linked post in the OP uses. I just don't understand why LS doesn't work here. – 24n8 Jul 02 '20 at 00:54
  • @lamanon Oh, I completely missed that link. Well, the method presented there using pseudoinverses and any least squares method all have to be mathematically equivalent. The point with least square methods is that the matrix should be allowed to be of more or less any rank. The problem I think is with the way you have setup the problem. You got one unknown too many. In the link they provide two methods to get rid of the "extra degree of freedom": 1) move to the centroid position 2) set $\beta_3=1$. – Jap88 Jul 02 '20 at 01:11
  • @lamanon ... since your equation can be multiplied by any scalar and still represent the same plane. – Jap88 Jul 02 '20 at 01:17
  • @Jap88 Right, but I think the one unknown too many is because the data points were acquired from the specific equation $x+y+z=0$, which gives you at most 2 linearly independent variables. Also, ordinary least squares requires the coefficient matrix $X$ to be full column rank, otherwise $X^TX$ isn't invertible. – 24n8 Jul 02 '20 at 01:19
  • @lamanon Since the 4 points were taken from the equation of a plane, they are guaranteed to only span a subspace. The "full-rank" least-squares method will not work in this case. If you perturb one point randomly you will (with high probability) get a full-rank matrix, and then "full-rank" least squares will work. This is actually exactly one of the reasons "full-rank" least squares is not used that much in practice — it is already a problem when you are merely close to not having full rank. SVD and similar methods, on the other hand, are extremely stable. – Jap88 Jul 02 '20 at 02:38
  • @Jap88 Ah I see. Could you expand on why the 4 points taken only span a subspace and why it doesn't result in a full column rank matrix, or provide a reference so I can read up more on it? – 24n8 Jul 02 '20 at 03:15
  • There are several equivalent ways of seeing this. One way to understand a full-rank matrix is to think probabilistically. The columns (or rows) of a 4-by-4 matrix have the capability of spanning the full 4-dimensional space, and this is what you get with probability 1 if you pick all entries at random. Here, however, the "picks" are constrained. You can think of your plane as the hyperplane $(\beta_1,\beta_2,\beta_3,\beta_0) \cdot (x,y,z,w)=0$ but where $w=1$. So here you have three free coordinates $x,y,z$, and the subspace where your points live is 3D. Thus your matrix will have rank 3. Hope that helps. – Jap88 Jul 02 '20 at 04:09
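The perturbation experiment suggested in the comments is easy to run (a sketch with numpy; the seed and noise scale are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)

# Design matrix built from the 4 points on the plane x + y + z = 0
X = np.array([
    [0, 0, 0, 1],
    [1, 0, -1, 1],
    [0, 1, -1, 1],
    [1, 1, -2, 1],
], dtype=float)
print(np.linalg.matrix_rank(X))        # 3: points drawn exactly from the plane

# Nudge one point slightly off the plane: the rows now span all of R^4
X_noisy = X.copy()
X_noisy[3, :3] += rng.normal(scale=0.01, size=3)
print(np.linalg.matrix_rank(X_noisy))  # 4 (with probability 1)
```

As the comment notes, the perturbed matrix is full rank but badly conditioned, which is why numerically stable methods like the SVD are preferred over forming $X^TX$ directly.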

0 Answers