OCR images and screenshots: pull text out of any picture
A photo of a menu in another language, a scanned tax form, a screenshot of a long error message — OCR turns all of these into selectable, searchable, editable text. Here's when to use which OCR flavor.
OCR — optical character recognition — is the technology that turns a picture of text into actual text. It used to be slow, error-prone, and tuned for clean black-on-white scans. Modern OCR handles screenshots, phone photos, multilingual menus, handwritten notes, and faded receipts. The interesting part isn't whether to use OCR — it's which OCR mode to use, because picking wrong is the most common reason people get bad results.
The five common situations and the right tool for each
1. A screenshot, and you want the plain text
Error messages, code snippets, chat logs — you want the text, you don't care about formatting. Use image to text and paste the result. Works on dark mode screenshots, blurry phone photos of laptop screens, anything you can read.
2. A photo of a table
Phone photo of a printed spreadsheet, screenshot of a web table, scan of a printed report. Don't use plain text OCR — the columns will collapse. Use image to Excel for a styled spreadsheet, or image to CSV for raw data. Both detect the table grid and preserve rows and columns.
If the photo's already a clean table screenshot, image to HTML gives you real <table> elements you can drop into a web page or convert further with table to JSON.
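To see why plain text OCR collapses columns while a table-aware mode doesn't, here's a rough Python sketch. It assumes an OCR engine hands back each word along with its position on the page. The WORDS data, the tolerances, and every function name are made up for illustration; this is not how image to CSV works internally, just the general idea of rebuilding a grid from word positions.

```python
import csv
import io

# Hypothetical OCR output: (text, x_left, y_top) for each recognised word.
# The coordinates are invented for this example.
WORDS = [
    ("Item", 10, 12), ("Qty", 120, 10), ("Price", 200, 11),
    ("Apples", 10, 40), ("3", 120, 42), ("1.20", 200, 41),
    ("Pears", 10, 71), ("12", 120, 70), ("0.80", 200, 72),
]

def group_rows(words, y_tolerance=8):
    """Group words into rows: a word joins the current row if its y is close
    to the y of that row's first word."""
    rows = []
    for word in sorted(words, key=lambda w: w[2]):
        if rows and abs(word[2] - rows[-1][0][2]) <= y_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    return rows

def column_anchors(words, x_tolerance=20):
    """Find column start positions by clustering the words' left edges."""
    anchors = []
    for x in sorted(w[1] for w in words):
        if not anchors or x - anchors[-1] > x_tolerance:
            anchors.append(x)
    return anchors

def to_grid(words):
    """Place every word in the cell of its row and nearest column anchor."""
    anchors = column_anchors(words)
    grid = []
    for row in group_rows(words):
        cells = [""] * len(anchors)
        for text, x, _y in sorted(row, key=lambda w: w[1]):
            col = min(range(len(anchors)), key=lambda i: abs(anchors[i] - x))
            cells[col] = (cells[col] + " " + text).strip()
        grid.append(cells)
    return grid

def to_csv(grid):
    """Serialise the grid as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(grid)
    return buffer.getvalue()

def to_plain_text(words):
    """Roughly what plain text OCR gives you: words in reading order, no grid."""
    return " ".join(
        w[0] for row in group_rows(words) for w in sorted(row, key=lambda w: w[1])
    )

print(to_plain_text(WORDS))
print(to_csv(to_grid(WORDS)))
```

The plain-text version is one run of words with no way to tell where a cell ends, while the grid version keeps each value in its row and column.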
3. A photo of a document with headings and structure
Magazine article, printed report, multi-page contract. Plain text OCR loses the structure. Use image to Word or image to Markdown. Word gives you a .docx you can edit further; Markdown is better if the text is going to a wiki, README, or notes app.
4. Text in a language you can't read
A photo of a menu in Tokyo, a sign in Cairo, a label in Stockholm. Don't run OCR and then translate manually. Use image translator, which combines OCR and translation in one step. It outputs both the original text and the translation, and recognises ~60 languages.
5. Handwritten notes
Whiteboard photos, journal pages, lecture notes. Generic OCR tends to fail on cursive, so use handwriting OCR, which is tuned for connected, sloppy, real-human handwriting. It works best with dark ink on plain paper; the more contrast, the better the result.
One special case: scanned PDFs that need to stay PDFs
If you have a scanned PDF that you want to keep as a PDF but make searchable — say, a scanned contract you need to grep for a clause — don't convert it to Word. Use searchable PDF: it adds a hidden text layer over the image so the PDF still looks like a scan but Ctrl-F and PDF search work. This also lets you compress the image layer aggressively afterwards without losing the text.
Get better OCR results: three rules
- Contrast is king. Black ink on a white background reads almost perfectly. Faded ink, busy backgrounds, and patterned paper all hurt. If you control the photo, take it in flat, even light with the document filling the frame.
- Straight beats crooked. OCR engines auto-correct slight skew, but shots taken at an angle (from across a desk, say) add perspective distortion and lose accuracy. Hold the camera parallel to the page.
- Resolution matters up to a point. About 300 DPI equivalent is the sweet spot: for an A4 page, that's roughly 2480 × 3508 pixels. Higher resolution makes OCR slower without making it noticeably more accurate.
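The 300 DPI figure is easy to turn into pixel counts. Here's a small Python sketch of the arithmetic, assuming the page fills the frame edge to edge (in a real photo it rarely does, so treat the result as an upper bound). The function names are just for this example.

```python
MM_PER_INCH = 25.4
A4_MM = (210, 297)  # width and height of an A4 page in millimetres

def pixels_for_dpi(size_mm, dpi):
    """Pixel dimensions needed to capture a page of size_mm at the given DPI."""
    return tuple(round(side / MM_PER_INCH * dpi) for side in size_mm)

def effective_dpi(long_edge_px, long_edge_mm=A4_MM[1]):
    """DPI equivalent of an image whose long edge spans the page's long edge."""
    return long_edge_px / (long_edge_mm / MM_PER_INCH)

print(pixels_for_dpi(A4_MM, 300))   # (2480, 3508)
print(round(effective_dpi(2000)))   # DPI equivalent of a 2000-pixel long edge
```

If your image comes out well under 300, move closer or crop less. If it's far above, downscaling before OCR should save time without costing accuracy.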
Specialised OCR you might not have known existed
- Receipts: extract vendor, total, tax, date, and line items as JSON with receipt extractor. Built specifically for expense reports and bookkeeping.
- Business cards: generate a vCard ready to import into Contacts with business card scanner. Stops the conference-card pile from rotting on your desk.
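If you've never looked inside a vCard, it's a short plain-text file. The Python sketch below builds a minimal vCard 3.0 by hand to show the shape. Which fields business card scanner actually fills in isn't stated here, so the field selection, the make_vcard helper, and the sample contact are all assumptions for illustration.

```python
def escape(value):
    """Escape characters that have special meaning in vCard text values."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )

def make_vcard(given, family, org="", phone="", email=""):
    """Build a minimal vCard 3.0 string; optional fields are skipped if empty."""
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        f"N:{escape(family)};{escape(given)};;;",   # structured name
        f"FN:{escape(given)} {escape(family)}",     # display name
    ]
    if org:
        lines.append(f"ORG:{escape(org)}")
    if phone:
        lines.append(f"TEL;TYPE=WORK:{phone}")
    if email:
        lines.append(f"EMAIL:{email}")
    lines.append("END:VCARD")
    # vCard lines are separated by CRLF.
    return "\r\n".join(lines) + "\r\n"

# Made-up contact, standing in for text read off a business card.
card = make_vcard(
    "Ada", "Example",
    org="Example Co, Ltd",
    phone="+1-555-0100",
    email="ada@example.com",
)
print(card)
```

Saved with a .vcf extension, a file like this is what contact apps expect to import.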
OCR is one of the highest-leverage tools in your toolbox: five minutes of retyping text from a photo can become five seconds. Pick the right mode for the situation and the results come out cleanly editable on the first try.
Pixoate