May 27, 2026 · 8 min read

How we evaluated 12 OCR services for SideQuest (and which one won).

A real customer dropped a PDF into our parser playground last week. The PDF was a clean Datamoto-template purchase order. Tesseract, the open-source OCR we'd been running, returned text that looked like this: "SN[momcods [Description [ay [unt". Bordered table, mangled output, no line items extracted. That same week we sat down and evaluated every credible OCR option on the market. Here's what we found and what we're shipping in v0.10.0.

Why Tesseract wasn't going to cut it

Tesseract is open-source, free, and runs locally. We started there because SideQuest is a local-first connector — the customer's PO data never leaves their machine when Tesseract is the OCR. Tesseract on a clean typewriter-style scan is fine. Tesseract on a bordered invoice table is bad: it reads the cell borders as pipe characters and brackets, scrambles the column alignment, and turns "DM19012 Rollerblade 10.0 123.00 1,230.00" into garbage.

The deterministic parser sitting downstream of OCR can't fix what OCR scrambles. If "DM19012" never makes it through, no amount of regex cleverness produces a line item. We needed better OCR.
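To make that concrete, here is a minimal sketch of a deterministic line-item matcher. The regex and the `parse_line_item` helper are hypothetical illustrations, not SideQuest's actual parser; the two sample strings are the ones quoted above.

```python
import re

# Hypothetical line-item pattern: SKU, description, quantity, unit price, amount.
LINE_ITEM = re.compile(
    r"^(?P<sku>[A-Z]{2}\d{5})\s+"
    r"(?P<description>.+?)\s+"
    r"(?P<qty>\d+(?:\.\d+)?)\s+"
    r"(?P<unit_price>[\d,]+\.\d{2})\s+"
    r"(?P<amount>[\d,]+\.\d{2})$"
)


def parse_line_item(text):
    """Return a dict for a clean line item, or None if the text doesn't match."""
    match = LINE_ITEM.match(text.strip())
    if match is None:
        return None
    item = match.groupdict()
    item["qty"] = float(item["qty"])
    # Strip thousands separators before converting money fields.
    item["unit_price"] = float(item["unit_price"].replace(",", ""))
    item["amount"] = float(item["amount"].replace(",", ""))
    return item


clean = "DM19012 Rollerblade 10.0 123.00 1,230.00"
garbled = "SN[momcods [Description [ay [unt."

print(parse_line_item(clean))    # a populated dict
print(parse_line_item(garbled))  # None: nothing here for a pattern to anchor on
```

Loosening the pattern enough to accept the garbled string would also accept noise, which is why the fix has to happen at the OCR layer.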

The 12 we looked at

We split them into three buckets: cloud APIs from hyperscalers, purpose-built invoice OCR services, and large-language-model vision. Per-page price below is the published list price as of May 27, 2026.

Service	Per-page price	Notes
Google Document AI — Invoice Parser	$0.01	Pretrained invoice schema, structured output. Independent benchmarks put line-item accuracy near 40%. Strong on header fields, weak on multi-column line tables.
AWS Textract — AnalyzeExpense	$0.01	Solid line-item handling (around 82% in published benchmarks). Returns normalized ITEM/QTY/PRICE fields. Asynchronous batch processing available.
Azure Document Intelligence — prebuilt-invoice (our pick)	$0.01	Same price as Google and AWS; leads the published benchmarks at roughly 93% header accuracy and 87% line-item accuracy. Free F0 tier covers development. First-party Python SDK. Rich invoice schema with line items already structured.
Mindee — Invoice OCR	$0.10	Purpose-built for invoices, strong on European formats. Free 250 pages per month; price tapers toward $0.01/page at high volume. Modern Python SDK.
Klippa DocHorizon	on request	Marketed at 99%+ accuracy but no public benchmark. Annual license or per-doc.
Veryfi — Invoices OCR	$0.16/doc	Per document, not per page. Strong accuracy. $500/month minimum on the Standard tier kills the math for a $39 Solo customer.
Rossum AI	$18,000/yr starter	Excellent OCR but minimum spend wipes out small distributors. Targets ERP-grade customers.
Affinda	From $80/mo	Scales by invoice volume. Decent invoice coverage; no public benchmark.
Nanonets	~$0.30/page	Block-based pricing. First 100 free. Strong with fine-tuning, mediocre out of the box.
Hyperscience	~$50,000/yr floor	99.5% claimed accuracy. Federal and Fortune 500. Not a fit for the kind of customer we serve.
ABBYY FlexiCapture / FineReader	quote-based	Mature, accurate on structured forms. Heavy install and templating overhead.
Claude Sonnet 4.6 vision (rescue)	~$0.015	The Claude API counts an image as roughly (width × height) / 750 tokens, with dimensions in pixels. A standard letter-size PO at 1568×1212 pixels comes in around 2,500 input tokens. With a small JSON response, the all-in cost is about a penny and a half per page. Accuracy on invoice extraction matches the GPT-4o-class benchmarks at around 98%.
Claude Haiku 4.5 vision	~$0.005	Cheaper than Sonnet, well above Tesseract, matches Google Vision OCR on table tests in the benchmarks we found.

Sources for the table are in the underlying memo at marketing/ocr-vendor-eval-2026-05-27.md with fetch dates and links.

Three things that surprised us

Azure leads the published invoice benchmarks, not Google. Going in, we expected Google Document AI to win on the strength of Google's general-purpose OCR reputation. The independent benchmarks tell a different story: on invoice line-item extraction specifically, Azure's prebuilt-invoice model lands at roughly 87% versus Google Invoice Parser's 40%. AWS Textract sits in between at around 82%. All three are at the same $0.01/page list price. That's surprising and it changes the answer.

Claude vision at $0.015/page is in the same ballpark as the OCR specialists. If you had told us a year ago that a frontier LLM with vision would compete on per-page cost with Azure DI and AWS Textract, we'd have laughed. The math works because Claude prices vision as image tokens, and a single PO page is about 2,500 tokens. The accuracy on invoice-shaped documents lands at GPT-4o-class numbers (~98%) because the model can use the whole-page context to disambiguate columns and footers in a way pure OCR can't.
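A rough sketch of that per-page math follows. The (width × height) / 750 estimate and the 1568×1212 page size come from the table above; the per-million-token prices ($3 input, $15 output) and the 500-token JSON response are assumptions for illustration, so check them against current list pricing. The prompt text itself is ignored here.

```python
def image_tokens(width_px, height_px):
    """Approximate image token count: (width x height) / 750."""
    return (width_px * height_px) / 750


def page_cost_usd(
    width_px,
    height_px,
    output_tokens=500,            # assumed size of the JSON response
    input_price_per_mtok=3.00,    # assumed USD per million input tokens
    output_price_per_mtok=15.00,  # assumed USD per million output tokens
):
    """Estimated cost of one vision call for a single page image."""
    input_cost = image_tokens(width_px, height_px) * input_price_per_mtok / 1_000_000
    output_cost = output_tokens * output_price_per_mtok / 1_000_000
    return input_cost + output_cost


tokens = image_tokens(1568, 1212)
cost = page_cost_usd(1568, 1212)
print(f"{tokens:.0f} image tokens, about ${cost:.4f} per page")
```

Under these assumptions roughly half the cost is the image and half is the response, so a verbose output schema moves the per-page price as much as image resolution does.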

None of the enterprise PO-automation incumbents target QuickBooks. Conexiom, Esker, Rossum, Hyperscience, and Endeavor AI all sell into SAP, Oracle, NetSuite, Infor, Microsoft Dynamics, and Epicor Prophet 21. They start at $18,000 to $50,000 per year and need ERP-grade onboarding. None of them has a native QuickBooks Online play. The five-person distributor on QBO has been quietly underserved while every credible PO automation tool chased the enterprise list. That's the gap we sit in.

What we're shipping in v0.10.0

Azure Document Intelligence primary, Claude Sonnet 4.6 vision rescue.

Azure DI runs first on every PDF page. Returns a normalized invoice schema with line items already structured. $0.01 per page.

Claude Sonnet 4.6 vision runs as a second pass on any page that trips one of three checks: Azure's confidence falls below 0.85 on a line-item field, the line count from raw OCR text disagrees with the parsed schema, or the document total differs from the sum of line items by more than $0.50. Roughly 10% of pages in our existing logs trip at least one of those conditions, so Sonnet at $0.015/page adds about $0.0015 per page when amortized across all pages.
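The three rescue triggers can be sketched as a single predicate. This is an illustration only: `LineItem`, `ParsedPage`, and their field names are hypothetical stand-ins rather than the Azure SDK's result types, and the total check ignores tax and shipping lines for simplicity.

```python
from dataclasses import dataclass
from typing import List

CONFIDENCE_FLOOR = 0.85     # threshold from the post
TOTAL_TOLERANCE_USD = 0.50  # tolerance from the post


@dataclass
class LineItem:
    amount: float
    min_field_confidence: float  # lowest confidence across this line's fields


@dataclass
class ParsedPage:
    line_items: List[LineItem]
    document_total: float
    raw_text_line_count: int  # line-item rows counted from the raw OCR text


def rescue_reasons(page):
    """Return the list of triggers a page trips; an empty list means no rescue."""
    reasons = []
    if any(li.min_field_confidence < CONFIDENCE_FLOOR for li in page.line_items):
        reasons.append("low_confidence")
    if page.raw_text_line_count != len(page.line_items):
        reasons.append("line_count_mismatch")
    line_sum = sum(li.amount for li in page.line_items)
    if abs(page.document_total - line_sum) > TOTAL_TOLERANCE_USD:
        reasons.append("total_mismatch")
    return reasons


def needs_rescue(page):
    return bool(rescue_reasons(page))


page = ParsedPage(
    line_items=[LineItem(1230.00, 0.97), LineItem(45.50, 0.62)],
    document_total=1275.50,
    raw_text_line_count=2,
)
print(needs_rescue(page), rescue_reasons(page))
```

Returning the reasons rather than a bare boolean makes it easier to see in logs which trigger fires most often before tuning the thresholds.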

Tesseract stays in the binary as an offline fallback for the air-gapped demo case, but never runs as the active path once v0.10.0 ships.

Cost at each pricing tier

Assuming two pages per PO and the hybrid pipeline above:

Solo (100 POs/month, $39/mo plan)

~$2.65/month in OCR + rescue costs. That's 6.8% of revenue. Tesseract was "free" but cost us roughly an hour per customer per month in support tickets on garbled tables — call it $50 in support load. Net win.

Growth (1,000 POs/month, $149/mo plan)

~$26/month in OCR + rescue costs. 17% of revenue. Still comfortably inside our 70% gross margin target.

Scale (5,000 POs/month, $399/mo plan)

~$130/month in OCR + rescue costs. 33% of revenue, which on OCR cost alone leaves about 67% gross margin, just under our 70% target. The volume conversation gets sharper here, and we'll renegotiate Azure's pricing at the 5,000-page-per-month mark when we have a customer there. The hybrid approach gives us room to dial the rescue threshold up or down without changing vendors.
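For anyone who wants to rerun the tier math with their own volumes, here is a minimal sketch using only the inputs stated in this post: $0.01 per page for Azure, $0.015 per rescued page for Sonnet, a 10% rescue rate, and two pages per PO. The function and constant names are hypothetical.

```python
AZURE_PER_PAGE = 0.01    # USD, list price quoted in the post
SONNET_PER_PAGE = 0.015  # USD, per rescued page
RESCUE_RATE = 0.10       # share of pages sent to the second pass
PAGES_PER_PO = 2

# Tier name -> (POs per month, plan price in USD per month)
TIERS = {"Solo": (100, 39), "Growth": (1000, 149), "Scale": (5000, 399)}


def monthly_ocr_cost(pos_per_month, rescue_rate=RESCUE_RATE):
    """Azure on every page plus Sonnet on the rescued share of pages."""
    pages = pos_per_month * PAGES_PER_PO
    return pages * (AZURE_PER_PAGE + rescue_rate * SONNET_PER_PAGE)


for name, (pos, plan_price) in TIERS.items():
    cost = monthly_ocr_cost(pos)
    print(f"{name}: ${cost:.2f}/month, {cost / plan_price:.1%} of revenue")
```

With exactly these inputs the sketch lands a little below the rounded figures quoted for each tier (about $2.30, $23, and $115), so treat it as the shape of the calculation rather than a reconciliation. The rescue rate is the main lever: at 20%, the same function returns about $26 for Growth and $130 for Scale.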

What this changes for prospects

If you tried our parser playground last week with a bordered-table PDF and saw the OCR quality warning, the v0.10.0 release is the answer. The playground will continue to use browser-side Tesseract because we can't run an Azure API call from a static page. The production connector running on your Mac or Windows machine will use Azure DI plus Sonnet rescue, and the bordered tables that broke Tesseract will parse cleanly.

If you're running on a 2019 MacBook Air or an older Windows PC, the change costs you nothing in compute: Azure runs in their data center, and your laptop just makes an HTTPS call. The connector still touches your files locally; we send only the page images to Azure (and only the pages flagged for rescue to Claude). Your customer list, your QuickBooks data, and your match cross-references never leave your machine.

What's next

The v0.10.0 work is in our queue and shipping shortly. Once it's in production, we'll publish a follow-up post with the regression numbers on the 47-PO bordered-table set we use internally. If you'd like a heads-up when v0.10.0 lands, the welcome kit includes an email opt-in. If you want to test the new pipeline on your real POs before it ships broadly, the beta program takes the first cohort.

And if you're still running an enterprise PO automation tool that's costing you $20,000+ a year because it's the only thing that read bordered tables until now: there's a path off it. Tell us what you're paying and we'll model what SideQuest would cost on the same volume.

SideQuest Automation · sidequestautomation.com
