## How It Works

Instead of processing an entire document in a single LLM call, Chunking:

- Identifies array structures in your JSON schema (e.g., `line_items`, `transactions`, `entries`)
- Extracts unique keys from the document (e.g., product names, invoice numbers, dates) that identify each item
- Segments the document by locating where each key appears
- Processes items concurrently: each array item is extracted in parallel using dedicated LLM calls
- Merges results into the final structured output while preserving order
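The segment-and-parallelize steps above can be sketched in plain Python. This is an illustrative sketch, not Retab's actual implementation: `segment_by_keys` and the `extract_item` callback are hypothetical names, and a real pipeline would call an LLM where the callback is invoked.

```python
from concurrent.futures import ThreadPoolExecutor

def segment_by_keys(document_text, keys):
    """Split a document into per-item segments by locating each unique key.

    Each segment runs from one key's first occurrence to the next key's,
    so every chunk contains roughly one item's surrounding context.
    """
    positions = sorted((document_text.index(k), k) for k in keys)
    segments = {}
    for i, (start, key) in enumerate(positions):
        end = positions[i + 1][0] if i + 1 < len(positions) else len(document_text)
        segments[key] = document_text[start:end]
    return segments

def extract_items_in_parallel(segments, extract_item):
    """Run one extraction per segment concurrently, preserving key order."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(extract_item, segments.values()))
    return dict(zip(segments.keys(), results))
```

Because each segment is independent, total latency is dominated by the slowest single-item call rather than by document length.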
## Context Engineering: Why Chunking is More Accurate

Beyond speed, Chunking significantly improves extraction accuracy through Context Engineering: the practice of optimizing what context the LLM sees for each extraction task. When processing a 50-page invoice with 200 line items in a single LLM call, the model must:

- Hold the entire document in its context window
- Track hundreds of items simultaneously
- Maintain attention across thousands of tokens
- Avoid confusing similar items that appear pages apart
| Aspect | Standard Extraction | Chunking |
|---|---|---|
| Context per item | Entire document (all pages) | Only the relevant segment |
| Noise level | High (hundreds of unrelated items) | Minimal (just the target item) |
| Attention dilution | Significant on long documents | None — laser-focused extraction |
| Position bias | Later items often less accurate | Equal accuracy for all items |
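The "context per item" row of the table can be made concrete with a toy measurement. This helper (a hypothetical name, for illustration only) compares the full-document context a standard call must attend to against the largest per-item chunk a chunked call sees:

```python
def context_sizes(document_text, keys):
    """Compare the context a standard call sees vs. the largest chunked context.

    Returns (full_size, largest_chunk_size) in characters; each chunked call
    only sees the span from its key to the next key's position.
    """
    starts = sorted(document_text.index(k) for k in keys)
    bounds = starts + [len(document_text)]
    chunk_sizes = [bounds[i + 1] - bounds[i] for i in range(len(starts))]
    return len(document_text), max(chunk_sizes)
```

With many repeating items, the largest chunk is a small fraction of the full document, which is exactly where the attention-dilution and position-bias benefits come from.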
## When to Use Chunking

Chunking is ideal when:

- Your schema contains arrays of objects (e.g., `line_items: [{sku, description, quantity, price}]`)
- Documents have many repeating items (10+ items benefit most)
- You need faster turnaround on large documents
- Items can be uniquely identified by a key field
## Usage

Enable Chunking by specifying the `chunking_keys` parameter in your extraction request.

The `chunking_keys` parameter is a dictionary mapping:

- Key: the path to the array in your schema (e.g., `"line_items"`, `"transactions"`, `"items.products"`)
- Value: the field within each array item that uniquely identifies it (e.g., `"product_name"`, `"sku"`, `"id"`)
### Example Schema

Given `chunking_keys={"line_items": "product_name"}`, Retab will:

- Extract all product names from the document
- Locate each product's position in the document
- Extract each line item's details in parallel
- Merge results and extract constants (`invoice_number`, `date`, `total_amount`) separately
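The final merge step can be sketched as a simple combination of the per-item results (kept in document order) with the separately extracted constants. This is a toy sketch with hypothetical names, not Retab's internals:

```python
def merge_results(item_results, constants, array_path="line_items"):
    """Combine per-item extractions (already in document order) with
    document-level constants into one structured output."""
    output = dict(constants)            # e.g. invoice_number, date, total_amount
    output[array_path] = list(item_results)
    return output
```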
## Supported Document Types
- PDF documents
- Images (JPEG, PNG, etc.)
- Office documents (DOCX, PPTX, ODT, ODP)
- Excel spreadsheets (XLSX, XLS)
## Performance Benefits
| Document Size | Standard Extraction | Chunking |
|---|---|---|
| 5 line items | ~3s | ~3s |
| 20 line items | ~8s | ~4s |
| 50 line items | ~15s | ~5s |
| 100+ items | ~30s+ | ~6s |
## Consensus with Chunking

Chunking fully supports the `n_consensus` parameter. When enabled, each item is extracted multiple times independently, and the results are compared to improve accuracy. This is particularly useful for:
- High-value documents requiring verification
- Documents with challenging handwriting or scan quality
- Compliance-critical extractions
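One plausible way to reconcile the independent extractions is a per-field majority vote. The sketch below is an assumption about how such a comparison could work, not Retab's documented algorithm:

```python
from collections import Counter

def consensus(extractions):
    """Given n_consensus independent extractions of the same item,
    keep the most common value for each field (majority vote)."""
    fields = extractions[0].keys()
    return {
        field: Counter(e[field] for e in extractions).most_common(1)[0][0]
        for field in fields
    }
```

A single misread digit in one run (common with poor scans or handwriting) is outvoted by the other runs, which is why consensus helps most on low-quality inputs.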
## Pricing

Chunking uses more credits than standard extraction because it processes the document in multiple chunks: it charges 2x the base rate per page.

### Cost Breakdown
| Component | Cost |
|---|---|
| Billed pages | 2 × document_page_count |
| Credits per billed page | n_consensus × model_credits |
| Total | (2 × document_page_count) × n_consensus × model_credits |
### Example

For a 10-page invoice using `retab-small` (1.0 credit) and `n_consensus=1`:

- Billed pages: 2 × 10 = 20
- Credits per billed page: 1 × 1.0 = 1.0
- Total: 20 × 1.0 = 20 credits

With `n_consensus=3`, the same document costs: 20 × (3 × 1.0) = 60 credits
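The cost formula from the breakdown table is straightforward to express in code (the function name is ours, for illustration):

```python
def chunking_cost(page_count, n_consensus, model_credits):
    """Total credits for a chunked extraction:
    (2 x document pages) billed pages x consensus runs x per-page model rate."""
    return (2 * page_count) * n_consensus * model_credits
```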