Overview
The Parse endpoint converts any document into structured JSON. It extracts text, tables, figures, and metadata with precise bounding boxes for every element.
Credits: 1 credit per page
Basic Usage
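A minimal request body can be sketched as follows. It assumes the body is JSON and carries the document under the `input` field. The endpoint path and authentication scheme are not given in this section, so the sketch only builds the payload and does not send it. The helper names `build_parse_request` and `estimate_credits` are hypothetical.

```python
import json

# Stated pricing: 1 credit per page.
CREDITS_PER_PAGE = 1


def build_parse_request(document: str) -> str:
    """Serialize a minimal Parse request body (assumed JSON shape)."""
    if not document:
        raise ValueError("input must be a non-empty string")
    return json.dumps({"input": document})


def estimate_credits(page_count: int) -> int:
    """Estimate the credit cost of parsing a document with page_count pages."""
    if page_count < 0:
        raise ValueError("page_count must not be negative")
    return page_count * CREDITS_PER_PAGE


body = build_parse_request("https://example.com/doc.pdf")
print(body)
print(estimate_credits(12))
```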
Input Options
The input field accepts:
| Format | Example | Description |
|---|---|---|
| Public URL | https://example.com/doc.pdf | Any publicly accessible document URL |
| Presigned URL | https://s3.amazonaws.com/... | AWS S3 presigned URLs |
| Aifano reference | aifano://abc123.pdf | File uploaded via /upload |
| Job reference | jobid://job_abc123 | Reuse parsed result from a previous job |
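
The four input formats in the table can be told apart by their URI scheme. The sketch below is a client-side check, not part of the API; the function name `classify_input` and its return labels are hypothetical. Public and presigned URLs are both plain HTTP(S) URLs, so they are reported together.

```python
from urllib.parse import urlparse


def classify_input(value: str) -> str:
    """Return a label for the kind of input reference, based on its scheme."""
    scheme = urlparse(value).scheme.lower()
    if scheme == "aifano":
        # File previously uploaded via /upload
        return "aifano_reference"
    if scheme == "jobid":
        # Reuse the parsed result of an earlier job
        return "job_reference"
    if scheme in ("http", "https"):
        # Public URL or presigned URL
        return "url"
    raise ValueError(f"unsupported input reference: {value!r}")


for ref in ("https://example.com/doc.pdf", "aifano://abc123.pdf", "jobid://job_abc123"):
    print(ref, "->", classify_input(ref))
```

Rejecting unknown schemes before sending a request keeps a malformed reference from being discovered only after a failed call.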
Configuration Options
Enhance
Use vision language models to improve accuracy for specific block types:
Chunking & Retrieval
Configure how content is chunked for RAG pipelines:
| Chunk Mode | Description |
|---|---|
| disabled | No chunking (default) |
| variable | Variable-size chunks based on content |
| section | One chunk per document section |
| page | One chunk per page |
| block | One chunk per block |
| page_sections | Sections within pages |
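
The chunk modes above form a closed set, so an enum can validate them before a request is built. This is a sketch only: the mode values come from the table, but the config key `chunk_mode` and the helper `chunking_config` are assumptions, since the exact request field is not shown in this section.

```python
from enum import Enum


class ChunkMode(str, Enum):
    DISABLED = "disabled"            # No chunking (default)
    VARIABLE = "variable"            # Variable-size chunks based on content
    SECTION = "section"              # One chunk per document section
    PAGE = "page"                    # One chunk per page
    BLOCK = "block"                  # One chunk per block
    PAGE_SECTIONS = "page_sections"  # Sections within pages


def chunking_config(mode: str = "disabled") -> dict:
    """Build a chunking config; the 'chunk_mode' key is a hypothetical name."""
    # ChunkMode(mode) raises ValueError for values outside the table.
    return {"chunk_mode": ChunkMode(mode).value}


print(chunking_config())
print(chunking_config("page_sections"))
```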
Formatting
Control output format for tables and special content:
Settings
Fine-tune OCR and processing behavior:
| Setting | Options | Description |
|---|---|---|
| ocr_system | standard, legacy | Standard supports all languages; legacy for Germanic only |
| extraction_mode | hybrid, ocr | Hybrid combines OCR + embedded text for best accuracy |
| page_range | {start, end} | Process only specific pages |
| document_password | string | Password for encrypted PDFs |
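
The settings above can be assembled and checked client-side with a small helper. The setting names and allowed values are taken from the table; the helper `build_settings` is hypothetical. Defaults are not stated in this section, so unset options are simply omitted, and whether page numbers start at 0 or 1 is left to the caller.

```python
from typing import Optional, Tuple

OCR_SYSTEMS = {"standard", "legacy"}
EXTRACTION_MODES = {"hybrid", "ocr"}


def build_settings(
    ocr_system: Optional[str] = None,
    extraction_mode: Optional[str] = None,
    page_range: Optional[Tuple[int, int]] = None,
    document_password: Optional[str] = None,
) -> dict:
    """Return a settings dict containing only the options that were supplied."""
    settings: dict = {}
    if ocr_system is not None:
        if ocr_system not in OCR_SYSTEMS:
            raise ValueError(f"ocr_system must be one of {sorted(OCR_SYSTEMS)}")
        settings["ocr_system"] = ocr_system
    if extraction_mode is not None:
        if extraction_mode not in EXTRACTION_MODES:
            raise ValueError(f"extraction_mode must be one of {sorted(EXTRACTION_MODES)}")
        settings["extraction_mode"] = extraction_mode
    if page_range is not None:
        start, end = page_range
        if start > end:
            raise ValueError("page_range start must not exceed end")
        settings["page_range"] = {"start": start, "end": end}
    if document_password is not None:
        settings["document_password"] = document_password
    return settings


print(build_settings(ocr_system="legacy", page_range=(1, 3)))
```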
Response Structure
Block Types
| Type | Description |
|---|---|
| Title | Document or section title |
| Section Header | Sub-section heading |
| Text | Body text paragraph |
| Table | Tabular data |
| Figure | Image or chart |
| List Item | Bulleted or numbered list item |
| Header | Page header |
| Footer | Page footer |
| Page Number | Page number |
| Key Value | Key-value pair |
| Comment | Annotation or comment |
| Signature | Signature block |
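
Downstream code often needs only some block types, for example tables. The sketch below filters blocks by type. The type names come from the table above, but the block shape is an assumption: each block is taken to be a dict with a `type` key, and the `content` key in the sample data is likewise hypothetical. Adjust both to the actual response fields.

```python
BLOCK_TYPES = {
    "Title", "Section Header", "Text", "Table", "Figure", "List Item",
    "Header", "Footer", "Page Number", "Key Value", "Comment", "Signature",
}


def blocks_of_type(blocks: list, block_type: str) -> list:
    """Return the blocks whose (assumed) 'type' field equals block_type."""
    if block_type not in BLOCK_TYPES:
        raise ValueError(f"unknown block type: {block_type!r}")
    return [block for block in blocks if block.get("type") == block_type]


# Hypothetical sample data, not an actual API response.
sample_blocks = [
    {"type": "Title", "content": "Quarterly Report"},
    {"type": "Text", "content": "Revenue grew in the third quarter."},
    {"type": "Table", "content": "Region | Revenue"},
    {"type": "Page Number", "content": "1"},
]

tables = blocks_of_type(sample_blocks, "Table")
print(len(tables))
```

Filtering out Header, Footer, and Page Number blocks in the same way is a common step before indexing body text.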