I am researching a project that would include document scanning. What I want to achieve is something like the document scanner in the iOS Notes app.
There, the input is a single image from a smartphone camera. The app automatically estimates the boundaries of the document within the image and then flattens it: it undoes the perspective distortion (keystoning) caused by the camera angle, and even handles multiple segments of the document lying at different angles, for example a letter that has come out of an envelope.
What techniques or tools should I look into to understand how to achieve something like this?
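
To make the flattening step concrete, here is a minimal sketch of what I understand the perspective part to involve, assuming the four document corners have already been found by some detection step (which is not shown). It solves for the 3x3 homography that maps the detected corners onto an upright rectangle. The function names `solve_linear`, `homography` and `apply_h` and the corner coordinates are made up for illustration; as far as I know, OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective` cover this step in practice. It does not handle the multi-segment case.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def homography(src, dst):
    """Return the 3x3 matrix H that maps each src corner (x, y) to dst (u, v).

    Each correspondence gives two linear equations in the eight unknown
    entries of H (the bottom-right entry is fixed to 1).
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(h, pt):
    """Map a point through H, including the perspective divide."""
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical corners of a page in a photo, clockwise from top-left,
# and the upright 400 x 600 rectangle they should be mapped to.
corners = [(120, 80), (520, 130), (480, 700), (90, 620)]
target = [(0, 0), (400, 0), (400, 600), (0, 600)]
H = homography(corners, target)
```

Producing the flattened image would then mean sampling the photo through the inverse of `H` for every output pixel, which is the part I would expect a library to handle.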