Overview
This cookbook shows how to extract structured data from invoices using the Aifano /extract endpoint. You’ll define a JSON schema for the data you need, and Aifano will extract it from any invoice format: PDF, scanned image, or digital document.
What You’ll Build
A script that:
- Uploads an invoice to Aifano
- Extracts vendor info, line items, totals, and payment terms
- Returns clean, structured JSON ready for your accounting system
Step 1: Define the Extraction Schema
Create a JSON schema that describes the data you want to extract.
Step 2: Extract Data from an Invoice
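A minimal Python sketch of Steps 1 and 2, under stated assumptions: it defines an invoice schema covering the vendor, line items, totals, and payment terms, then assembles a request body that carries the schema. The field names inside the schema, the helper `build_extract_request`, and the request keys `input` and `schema` are illustrative assumptions, not confirmed parts of the Aifano API. Check the endpoint reference before sending anything.

```python
import json

# Hypothetical invoice schema. Field names are illustrative only.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "address": {"type": "string"},
            },
        },
        "invoice_number": {"type": "string"},
        "invoice_date": {"type": "string", "description": "Date printed on the invoice"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "quantity": {"type": "number"},
                    "unit_price": {"type": "number"},
                    "amount": {"type": "number"},
                },
            },
        },
        "subtotal": {"type": "number"},
        "tax": {"type": "number"},
        "total": {"type": "number"},
        "payment_terms": {"type": "string"},
    },
    "required": ["vendor", "line_items", "total"],
}


def build_extract_request(document_ref: str, schema: dict) -> dict:
    """Assemble a request body for the extract endpoint.

    The keys "input" and "schema" are assumptions made for this sketch.
    """
    return {"input": document_ref, "schema": schema}


payload = build_extract_request("https://example.com/invoice.pdf", INVOICE_SCHEMA)
print(json.dumps(payload, indent=2))
```

Keeping the schema as a plain dict means it can be serialized, versioned, and reused across requests without depending on any particular client library.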
Step 4: Batch Processing Multiple Invoices
For processing multiple invoices, use async endpoints to maximize throughput.
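The sketch below shows only the client-side concurrency pattern for a batch, using Python's `asyncio`. The function `extract_invoice` is a placeholder stub that stands in for the real asynchronous Aifano call, and the file names and the `max_concurrency` limit are assumptions chosen for illustration.

```python
import asyncio


async def extract_invoice(path: str) -> dict:
    """Placeholder for the real async extract call (hypothetical stub)."""
    await asyncio.sleep(0.01)  # stands in for network latency
    return {"file": path, "total": None}


async def extract_batch(paths: list[str], max_concurrency: int = 5) -> list[dict]:
    """Run extractions concurrently with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(path: str) -> dict:
        async with semaphore:
            try:
                return await extract_invoice(path)
            except Exception as exc:
                # Record the failure so one bad invoice does not abort the batch.
                return {"file": path, "error": str(exc)}

    # gather returns results in the same order as the input paths.
    return await asyncio.gather(*(bounded(p) for p in paths))


if __name__ == "__main__":
    invoices = ["inv_001.pdf", "inv_002.pdf", "inv_003.pdf"]
    for result in asyncio.run(extract_batch(invoices)):
        print(result)
```

Capping concurrency with a semaphore is a design choice: it keeps throughput high while leaving room to stay under whatever rate limits apply to your account.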
Tips
Use system_prompt for better accuracy
Add context like currency format, date format, or language to the system_prompt to improve extraction accuracy.
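As an illustration, a prompt carrying this kind of context could be assembled as follows. The helper `build_system_prompt`, the wording of the prompt, and the placement of `system_prompt` as a top-level request key next to a hypothetical `input` key are all assumptions for this sketch.

```python
def build_system_prompt(currency: str, date_format: str, language: str) -> str:
    """Compose extraction hints about language, currency, and date format."""
    return (
        f"Invoices are written in {language}. "
        f"Monetary amounts are in {currency}; return them as plain numbers without symbols. "
        f"Dates follow the {date_format} format."
    )


request_body = {
    "input": "https://example.com/invoice.pdf",  # hypothetical document reference
    "system_prompt": build_system_prompt("EUR", "DD.MM.YYYY", "German"),
}
print(request_body["system_prompt"])
```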
Reuse parsed results with jobid://
If you need to extract different fields from the same invoice, use jobid:// to skip re-parsing and save credits.
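A sketch of how one parsed invoice might be reused for two different extractions. Only the `jobid://` prefix comes from this cookbook. The job id value, the helper `jobid_ref`, the two small schemas, and the request keys `input` and `schema` are hypothetical.

```python
def jobid_ref(job_id: str) -> str:
    """Build a jobid:// reference to a previously parsed document."""
    if not job_id:
        raise ValueError("job_id must be non-empty")
    return f"jobid://{job_id}"


previous_job_id = "abc123"  # hypothetical id returned by an earlier request

# Two narrow schemas that ask for different fields from the same invoice.
totals_schema = {
    "type": "object",
    "properties": {"subtotal": {"type": "number"}, "total": {"type": "number"}},
}
terms_schema = {
    "type": "object",
    "properties": {"payment_terms": {"type": "string"}},
}

# Both requests point at the same parsed document instead of re-uploading it.
requests_to_send = [
    {"input": jobid_ref(previous_job_id), "schema": schema}
    for schema in (totals_schema, terms_schema)
]
for body in requests_to_send:
    print(body["input"], list(body["schema"]["properties"]))
```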
Handle missing fields gracefully
Not all invoices have every field. Check for null values in the response and handle them in your application logic.
Next Steps
- Contract Analysis — Extract clauses and terms from legal documents
- Multi-Document Pipelines — Process bundled document packages