Lettalize OCR Gets an AI Brain: Nova + Textract Hybrid Flow
Just wrapped up a major overhaul of Lettalize's OCR system, and honestly? This feels like a huge step forward.
The Problem with Simple OCR
Before today, my document analysis was pretty basic – just AWS Textract doing raw text extraction with some regex magic. It worked, but barely. Documents would get misclassified, important fields would get missed, and the whole experience felt... clunky.
Enter the 3-Step AI Hybrid Flow
I completely rebuilt the analysis pipeline into a smart 3-step process:
Step 1 – Document Classification (Amazon Nova Lite) First, I feed the document image to Nova Lite as a multimodal classifier. It tells me what type of document this is, what country it's from, the language, and how confident it is about each guess. This context becomes crucial for step 3.
Step 2 – Text Extraction (AWS Textract)
This part stays the same – Textract is just really good at pulling text out of documents. Basic tier uses DetectDocumentText, advanced tier adds AnalyzeDocument with forms and tables plus AnalyzeExpense.
Step 3 – Smart Field Extraction (Amazon Nova Micro) Here's where the magic happens. I take the raw OCR text, combine it with the document classification from step 1, and feed it to Nova Micro with document-type-specific hints. Official letters get prompts about reference numbers (Kassenzeichen), fines get prompts about penalty amounts (Bußgeld), etc.
The UX Gets Smarter Too
Now the review screen actually helps users understand what needs attention:
- Yellow warning icons for uncertain fields (confidence < 0.70)
- Red shield icons for fields that always need verification (like IBANs)
- Clean session handling so metadata doesn't get stale after re-scans
AWS IAM Adventures
Hit a fun snag with permissions. Turns out EU cross-region inference profiles route through multiple regions internally – my policy was too restrictive. Had to add wildcard foundation-model ARNs alongside the specific inference-profile ones.
Also learned that Amazon Nova doesn't need marketplace approval like Anthropic's Haiku does, which saved me a headache since my root account lacks those permissions.
What's Working
Tested with three real documents (including a Stadt Viersen enforcement notice) and the field extraction is dramatically better. The AI actually understands context now instead of just pattern matching.
Next Up: AI Letter Generation
Feature 1 feels solid, so I'm moving on to Feature 2 – an AI-powered response letter generator. Users will pick a letter type (cancellation, objection, inquiry, complaint), and Nova will generate a proper DIN 5008 formatted response. Then they can edit it section by section before saving as PDF.
The infrastructure thinking is done, just need to build it. Sometimes the hardest part is figuring out what to build – actually building it feels almost relaxing after all that decision making.
Tier limits will be 1/3/10/20 letters per month for Free/Lite/Plus/Pro respectively. Seems reasonable for an AI feature that probably costs me a few cents per generation.
Feels good to see Lettalize getting genuinely smart instead of just... functional.