Trig, matrix transform, or...?

Question

I am working on an app that will transform a figure such as this:

{fig 1}

Into this:

{fig 2}

In short, the grey "canvas" is deformed so that the inner black quadrangle is as close to a rectangle as possible, while conserving the rectangle's area where possible. Please assume that in the second image the black figure is much, much closer to a rectangle.

iOS already offers a method for deforming the canvas by stretching the corners to four new points. I just need to apply a generic solution for calculating the new points.

I am at a loss as to a consistent method; simply pulling a corner in a single direction is not consistent.

Do I need to rotate the corner through an arc? Do I need to move the new corner points diametrically away from the center of the grey figure?

All suggestions appreciated.

Thompson

The thing you need is called a homography. Do you know about homogeneous coordinates? — Mårten W, May 03 '13 at 23:58
Mårten, I do not, but I just peeked at Wikipedia and I am VERY intrigued; affine transformation matrices are readily available in iOS. Would I use the ratio of the gray:black figures as the value of the transform? — Thompson, May 04 '13 at 03:16
One can always find a homography which transforms any non-degenerate quadrilateral into any other given non-degenerate quadrilateral. The area does not uniquely determine the target rectangle. Do you have any requirements on the target rectangle? — Mårten W, May 04 '13 at 20:44
Mårten, if I may take a step back for a moment, the black rectangle in this project will be text that the user has photographed and that I will then attempt OCR on. Prior to that, I need to de-skew the target for reliable results. With that in mind, would it be reasonable to assert that the length of the smallest side not change after transformation? — Thompson, May 04 '13 at 22:52
As I understand it, you need to perform the following two steps: 1) guess the proportions of the sides of the text rectangle, and 2) rectify the image so that the text becomes a rectangle with the guessed proportions. Step 1) is very problematic, since there is no way of guessing that always yields sensible results. The problem can, unfortunately, not be well posed. However, if you can assume that the perspective effects are moderate, it should be possible to get a reasonable guess of the proportions. — Mårten W, May 04 '13 at 23:46
Mårten, I appreciate your input. However, after many weeks of trial and error, I have found that I cannot assume the perspective effects are moderate, and I have yet to find a generic solution that works for more than one or two of my test cases at a time. I understand that ultimately it may be that the problem cannot be well posed. Nonetheless, as a last attempt, what if I always assume that the target black rectangle in figure two must have the dimensions of the grey rectangle in figure one? Are there examples or tutorials of how to find that homography? — Thompson, May 05 '13 at 16:01

Mårten W · Accepted Answer · 2013-05-06T21:35:46.547

After the discussion in the comments, I assume that the proportions of the desired rectangle are known to be $a:b$.

Though the method I describe below is good for understanding, it is not suitable for direct implementation. The resulting equation system is typically ill-conditioned. (It is also possible to get rid of the $\lambda_i$ by using cross products to express collinearity, and thus get a slightly smaller system to solve. This is described in detail in Multiple View Geometry by Hartley and Zisserman.)

Let $\boldsymbol{p}_i = (x_i, y_i, 1)$ be the homogeneous coordinates of the four corners in the input image. Let their desired destinations be $$ \left\{ \begin{aligned} \boldsymbol{q}_1 = (0, 0, 1) \\ \boldsymbol{q}_2 = (a, 0, 1) \\ \boldsymbol{q}_3 = (a, b, 1) \\ \boldsymbol{q}_4 = (0, b, 1) \end{aligned} \right.. $$ The corners are related through a planar homography, represented by a $3\times 3$-matrix $\boldsymbol{H}$, and the relations are $\lambda_i\boldsymbol{q}_i=\boldsymbol{H}\boldsymbol{p}_i$ for some scalars $\lambda_i$.
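To make the relation $\lambda\boldsymbol{q}=\boldsymbol{H}\boldsymbol{p}$ concrete, here is a minimal Python sketch of how a point is pushed through a homography once $\boldsymbol{H}$ is known. The function name `apply_homography` and the nested-list representation of $\boldsymbol{H}$ are assumptions made for illustration only; they are not part of any iOS or OpenCV API.

```python
def apply_homography(H, x, y):
    """Map the image point (x, y) through a 3x3 matrix H given as nested lists.

    Hypothetical helper for illustration: computes H p = lambda * q and
    divides by lambda so that the result has third coordinate 1.
    """
    p = (x, y, 1.0)  # homogeneous coordinates of the input point
    # The three components of H p are (lambda * u, lambda * v, lambda).
    su, sv, lam = (sum(H[r][c] * p[c] for c in range(3)) for r in range(3))
    if abs(lam) < 1e-12:
        raise ValueError("point is mapped to infinity")
    return su / lam, sv / lam
```

Note that multiplying $\boldsymbol{H}$ by any non-zero constant gives the same mapping, since the constant cancels in the division by $\lambda$.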

How does one determine $\boldsymbol{H}$? Let $\boldsymbol{h}_1^T$, $\boldsymbol{h}_2^T$ and $\boldsymbol{h}_3^T$ be the rows of $\boldsymbol{H}$. Then, for $i = 1, \dots, 4$, $$ \left\{ \begin{aligned} \lambda_i q_{i1} = \boldsymbol{h}_1^T\boldsymbol{p}_i \\ \lambda_i q_{i2} = \boldsymbol{h}_2^T\boldsymbol{p}_i \\ \lambda_i q_{i3} = \boldsymbol{h}_3^T\boldsymbol{p}_i \end{aligned} \right. \Longleftrightarrow \left\{ \begin{aligned} \boldsymbol{p}_i^T\boldsymbol{h}_1 - \lambda_i q_{i1} = 0 \\ \boldsymbol{p}_i^T\boldsymbol{h}_2 - \lambda_i q_{i2} = 0 \\ \boldsymbol{p}_i^T\boldsymbol{h}_3 - \lambda_i q_{i3} = 0 \end{aligned} \right.. $$ Write this system of equations in matrix form, $$\begin{bmatrix} \boldsymbol{p}_1^T & \boldsymbol{0} & \boldsymbol{0} & -q_{11} & 0 & 0 & 0 \\ \boldsymbol{0} & \boldsymbol{p}_1^T & \boldsymbol{0} & -q_{12} & 0 & 0 & 0 \\ \boldsymbol{0} & \boldsymbol{0} & \boldsymbol{p}_1^T & -q_{13} & 0 & 0 & 0 \\ \boldsymbol{p}_2^T & \boldsymbol{0} & \boldsymbol{0} & 0 & -q_{21} & 0 & 0 \\ \boldsymbol{0} & \boldsymbol{p}_2^T & \boldsymbol{0} & 0 & -q_{22} & 0 & 0 \\ \boldsymbol{0} & \boldsymbol{0} & \boldsymbol{p}_2^T & 0 & -q_{23} & 0 & 0 \\ \boldsymbol{p}_3^T & \boldsymbol{0} & \boldsymbol{0} & 0 & 0 & -q_{31} & 0 \\ \boldsymbol{0} & \boldsymbol{p}_3^T & \boldsymbol{0} & 0 & 0 & -q_{32} & 0 \\ \boldsymbol{0} & \boldsymbol{0} & \boldsymbol{p}_3^T & 0 & 0 & -q_{33} & 0 \\ \boldsymbol{p}_4^T & \boldsymbol{0} & \boldsymbol{0} & 0 & 0 & 0 & -q_{41} \\ \boldsymbol{0} & \boldsymbol{p}_4^T & \boldsymbol{0} & 0 & 0 & 0 & -q_{42} \\ \boldsymbol{0} & \boldsymbol{0} & \boldsymbol{p}_4^T & 0 & 0 & 0 & -q_{43} \\ \end{bmatrix} \begin{bmatrix} \boldsymbol{h}_1 \\ \boldsymbol{h}_2 \\ \boldsymbol{h}_3 \\ \lambda_1 \\ \lambda_2 \\ \lambda_3 \\ \lambda_4 \end{bmatrix} = \boldsymbol{0}, $$ and solve for a non-trivial solution. The system has 12 equations in 13 unknowns, so the solution is determined only up to a common scale factor, which does not affect the mapping. Now, any point $\boldsymbol{p}$ in the input image is mapped to the point $\boldsymbol{q}$ given by $\lambda\boldsymbol{q}=\boldsymbol{H}\boldsymbol{p}$, that is, $\boldsymbol{H}\boldsymbol{p}$ divided by its third coordinate. In particular, the corners are mapped to the corners of a rectangle.
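For illustration only, the system of equations $\boldsymbol{p}_i^T\boldsymbol{h}_k - \lambda_i q_{ik} = 0$ can be solved in a few lines of plain Python. This is a minimal sketch under two assumptions that are not part of the derivation above: the free scale is fixed by setting $\lambda_1 = 1$ (which turns the homogeneous system into an ordinary $12\times 12$ linear system), and the corners are passed in the same order as $\boldsymbol{q}_1,\dots,\boldsymbol{q}_4$. The names `solve_linear` and `homography_from_corners` are hypothetical. Given the conditioning caveat mentioned earlier, treat this as a way to check understanding on small examples rather than as production code.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system (degenerate corners?)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def homography_from_corners(corners, a, b):
    """Return H (3x3 nested lists) mapping four (x, y) corners to an a-by-b rectangle.

    Assumption: corners are ordered to match q_1..q_4, and lambda_1 is set to 1.
    """
    targets = [(0.0, 0.0, 1.0), (a, 0.0, 1.0), (a, b, 1.0), (0.0, b, 1.0)]
    rows, rhs = [], []
    for i, ((x, y), q) in enumerate(zip(corners, targets)):
        p = [x, y, 1.0]
        for k in range(3):  # equation: p_i^T h_k - lambda_i * q_ik = 0
            # Unknowns: h_1, h_2, h_3 (9 values), then lambda_2, lambda_3, lambda_4.
            row = [0.0] * 12
            row[3 * k:3 * k + 3] = p
            if i == 0:
                rhs.append(q[k])  # lambda_1 = 1, so its term moves to the right
            else:
                row[9 + i - 1] = -q[k]
                rhs.append(0.0)
            rows.append(row)
    sol = solve_linear(rows, rhs)
    return [sol[0:3], sol[3:6], sol[6:9]]
```

Calling `homography_from_corners([(1, 1), (5, 2), (4, 6), (0, 4)], 4, 3)` returns a matrix that sends those four corners to the corners of a $4\times 3$ rectangle after division by the third coordinate. Fixing $\lambda_1 = 1$ is safe for a non-degenerate quadrilateral, because every $\lambda_i$ is then non-zero.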

@Thompson: Glad to help. If you still feel that something is unclear, don't hesitate to ask. I would also like to recommend OpenCV. It has a function for determining homographies, along with lots of other useful tools for computer vision. — Mårten W, May 07 '13 at 18:59
See also this post of mine for an approach to find $H$ which only involves $3\times3$ systems of equations, instead of the underdetermined $12\times13$ system here. — MvG, May 29 '13 at 19:33
