Reversing 2D image pixels projection back to 3D world coordinates.

Question

I am using a camera mounted on a pole to detect objects. The equation below, from OpenCV, maps 3D world coordinates to pixel coordinates.

image_pixels_in_2D (3 x 1) = [intrinsic_matrix (3 x 3)] . [extrinsic_matrix (3 x 4)] . [world_coord (4 x 1)]

For example, using the values that I have

[x, y, 1] T =

           [[1.62101313e+03 0.00000000e+00 9.60000000e+02]
           [0.00000000e+00 1.62025316e+03 1.28000000e+03] 
           [0.00000000e+00 0.00000000e+00 1.00000000e+00]]
                       dot product with 
   [[-3.46094253e-01  9.21477975e-01 -1.76343727e-01  1.59227339e+04]
    [ 6.37135469e-01  9.28747529e-02 -7.65135723e-01 -3.60030568e+05]
    [-6.88677836e-01 -3.77163920e-01 -6.19249720e-01  4.40335632e+05]]
                       dot product with
                     [538000, 180023, 6]
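As a rough sketch of the product above in NumPy (the names `K`, `Rt` and the helper `project` are my own, not part of OpenCV): the world point gets a trailing 1 appended so that it becomes the 4 x 1 homogeneous vector, and the result is divided by its third component to obtain pixel coordinates. Lens distortion is ignored here.

```python
import numpy as np

# Intrinsic matrix (3 x 3), values copied from the question.
K = np.array([
    [1.62101313e+03, 0.0,            9.60e+02],
    [0.0,            1.62025316e+03, 1.28e+03],
    [0.0,            0.0,            1.0],
])

# Extrinsic matrix [R | t] (3 x 4), values copied from the question.
Rt = np.array([
    [-3.46094253e-01,  9.21477975e-01, -1.76343727e-01,  1.59227339e+04],
    [ 6.37135469e-01,  9.28747529e-02, -7.65135723e-01, -3.60030568e+05],
    [-6.88677836e-01, -3.77163920e-01, -6.19249720e-01,  4.40335632e+05],
])

def project(K, Rt, world_xyz):
    """Project a 3D world point to pixel coordinates (no lens distortion)."""
    # Append 1 to get the homogeneous 4-vector [X, Y, Z, 1].
    X_h = np.append(np.asarray(world_xyz, dtype=float), 1.0)
    uvw = K @ Rt @ X_h                 # equals (s*u, s*v, s)
    return uvw[:2] / uvw[2], uvw[2]    # pixel (u, v) and the scale s

pixel, scale = project(K, Rt, [538000, 180023, 6])
print("pixel:", pixel, "scale:", scale)
```

The scale returned alongside the pixel is the third component of the product. It is discarded when going from 3D to 2D, which is why the reverse direction needs extra information such as a known height.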

Sorry for the formatting. My world coordinates are non-homogeneous (I presumed this meant that the z-coordinate is non-zero). I derived the extrinsic matrix from this non-homogeneous data using solvePnP in OpenCV.

cv2.solvePnP(world_coord_pts, image_pixel_pts, intrMatrix, distCoeffs)

where distCoeffs is an empty array. Using some initial points I get my camera matrix, which is

            [[-1.22215405e+03  1.13165054e+03 -8.80335229e+02  4.48533168e+08]
             [ 1.50813130e+02 -3.32289205e+02 -2.03235322e+03 -1.97110575e+07]
             [-6.88677836e-01 -3.77163920e-01 -6.19249720e-01  4.40335632e+05]]
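The matrix above appears to be the product of the intrinsic and extrinsic matrices quoted earlier. A minimal NumPy check, assuming that reading (the names `K`, `Rt`, `P` and `P_quoted` are my own):

```python
import numpy as np

# Intrinsic and extrinsic matrices, values copied from the question.
K = np.array([
    [1.62101313e+03, 0.0,            9.60e+02],
    [0.0,            1.62025316e+03, 1.28e+03],
    [0.0,            0.0,            1.0],
])
Rt = np.array([
    [-3.46094253e-01,  9.21477975e-01, -1.76343727e-01,  1.59227339e+04],
    [ 6.37135469e-01,  9.28747529e-02, -7.65135723e-01, -3.60030568e+05],
    [-6.88677836e-01, -3.77163920e-01, -6.19249720e-01,  4.40335632e+05],
])

# Camera matrix as quoted in the question.
P_quoted = np.array([
    [-1.22215405e+03,  1.13165054e+03, -8.80335229e+02,  4.48533168e+08],
    [ 1.50813130e+02, -3.32289205e+02, -2.03235322e+03, -1.97110575e+07],
    [-6.88677836e-01, -3.77163920e-01, -6.19249720e-01,  4.40335632e+05],
])

# Compose the 3 x 4 camera matrix and compare it with the quoted one.
P = K @ Rt
max_rel_diff = np.max(np.abs(P - P_quoted) / np.abs(P_quoted))
print("largest relative difference:", max_rel_diff)
```

A small relative difference would only reflect the rounding of the printed values.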

I wish to do the reverse. The altitude of the location will always be known (say 6 m for this point, maybe 13 m for another point). I read other threads, where the following equation is suggested:

$$\tilde{X}(\mu) = M^{-1}(\mu x - p_4).$$

where, as I understand it, my camera matrix needs to be decomposed into M and p4, with M being the left 3x3 block of the camera matrix:

            [[-1.22215405e+03  1.13165054e+03 -8.80335229e+02]
             [ 1.50813130e+02 -3.32289205e+02 -2.03235322e+03]
             [-6.88677836e-01 -3.77163920e-01 -6.19249720e-01]]

and p4 is the last column of the camera matrix, so it should be

                        [[4.48533168e+08]
                         [-1.97110575e+07]
                         [4.40335632e+05]]

and x would be my pixel coordinates, say [30, 900, 1], corresponding to the world point [538000, 180023, 6].

So the first question is: what is μ on the right-hand side? How do I compute it? Do I find it from known 3D and 2D points and then use the result to find the 3D points for all other pixels?

Second question: what is μ on the left-hand side?

Third question: what does the tilde (~) on top of the capital X mean in the equation?

Can someone help compute μ given the values above, please?

Thank you. Any help in deciphering this method would be great. I am writing my code in Python.
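One possible reading of the equation, offered as a sketch rather than a definitive answer: μ is the unknown scale of the homogeneous pixel, so that μ x = M X̃ + p4, and the same μ appears on both sides. X̃(μ) then traces the ray of 3D points that all project to x, with the tilde marking ordinary (non-homogeneous) 3D coordinates in the usual textbook notation. If the height Z is known, requiring the third component of X̃(μ) to equal Z gives one linear equation in μ: with a = M⁻¹x and b = M⁻¹p4, μ = (Z + b₃) / a₃. The helpers `project` and `back_project` below are hypothetical names, lens distortion is ignored, and the test pixel is produced by the camera matrix itself, so this only shows a round trip.

```python
import numpy as np

# Camera matrix P = [M | p4], values copied from the question.
P = np.array([
    [-1.22215405e+03,  1.13165054e+03, -8.80335229e+02,  4.48533168e+08],
    [ 1.50813130e+02, -3.32289205e+02, -2.03235322e+03, -1.97110575e+07],
    [-6.88677836e-01, -3.77163920e-01, -6.19249720e-01,  4.40335632e+05],
])

def project(P, world_xyz):
    """Forward projection: 3D world point -> pixel (u, v)."""
    uvw = P @ np.append(np.asarray(world_xyz, dtype=float), 1.0)
    return uvw[:2] / uvw[2]

def back_project(P, pixel_uv, z_known):
    """Intersect the viewing ray of a pixel with the plane Z = z_known."""
    M, p4 = P[:, :3], P[:, 3]
    x = np.array([pixel_uv[0], pixel_uv[1], 1.0])
    a = np.linalg.solve(M, x)     # M^{-1} x
    b = np.linalg.solve(M, p4)    # M^{-1} p4
    if abs(a[2]) < 1e-12:
        raise ValueError("viewing ray is parallel to the plane Z = const")
    mu = (z_known + b[2]) / a[2]  # from Z = mu * a[2] - b[2]
    return mu * a - b, mu         # X~(mu) = M^{-1} (mu * x - p4)

world = np.array([538000.0, 180023.0, 6.0])
pixel = project(P, world)
recovered, mu = back_project(P, pixel, z_known=6.0)
print("pixel:", pixel)
print("mu:", mu)
print("recovered world point:", recovered)
```

In this reading μ is recomputed for every pixel from its own known height. It is not a single constant estimated once from reference points.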

The other link with a similar question is here: https://math.stackexchange.com/questions/3577395/reverse-perspective-matrix-to-find-2d-coordinate-with-known-height
But my problem is that I was not able to completely understand the method. That is why I shared the values, in case someone could point me in the right direction. Thank you. — user25194, Apr 08 '21 at 16:57
This is not possible in general. Two points in 3D might be mapped to the same pixel. — K.defaoite, Apr 08 '21 at 17:02
@K.defaoite I was warned that this might be the case, but the solutions shared in the links seem to have worked for many. Any help understanding the equation would be great. Thanks. — user25194, Apr 09 '21 at 08:33
