
Is there a way to set the DPI to 600 when saving with cv2.imwrite? I'm writing text to an image that I will later need to OCR.

I'm also wondering whether a particular image format is best for OCR. Disk space and processing time are not primary constraints (yet). I'm using Python 3.4 and opencv2, and would appreciate any tips on how to increase the DPI.

ImportanceOfBeingErnest
Matt Johnson
  • I'm still just reading docs to see what I'm not understanding about imwrite. It seems there would be a set of params to define image quality and compression, etc, but I don't see that. Is this something that has to get defined as a part of the np.array? – Matt Johnson Jun 10 '16 at 04:39
  • No, you can't (see here: http://stackoverflow.com/questions/10860969/change-dpi-of-an-image-in-opencv). How do you expect it to benefit OCR, anyway? – Headcrab Jun 10 '16 at 06:32
  • Thanks for the link - I assumed my issue was DPI. Here's my problem: when I OCR a doc (plain black Arial font on white), I get 99% accuracy. However, the text on that page isn't delimited. So I recognize each word, add a delimiter, write it back at the same x,y onto a new blank background, then OCR it again (voilà, no time spent on table detection). However, the accuracy of the second OCR pass is around 75%. Right now my suspects are the Hershey font versus Arial, cv2.imwrite adding artifacts around the letters, or something else to do with the compression or resolution of the image. – Matt Johnson Jun 10 '16 at 14:21
  • Do you save your intermediate results? Do you use the JPEG format for that? – Headcrab Jun 12 '16 at 05:13
  • I switched to PNG and the output improved significantly, but it's still not as good as OCR on the source image. – Matt Johnson Jun 13 '16 at 19:52
  • Have a look here... https://stackoverflow.com/a/57555123/2836621 – Mark Setchell Aug 19 '19 at 15:45

0 Answers