Core Concepts
Model Routing
Retab provides intelligent model routing through two special model identifiers: auto-large and auto-small. These identifiers automatically route your requests to the current best-performing model based on availability, performance, and speed metrics. You don’t need to update your model selection manually when new, better-performing models become available; Retab handles the routing for you, so your applications always use the optimal model for your use case.
Sync & Async Client
Retab offers both synchronous and asynchronous client interfaces, making it versatile for different application needs. The asynchronous client (AsyncRetab) is ideal for high-performance, non-blocking applications where multiple tasks run concurrently. For simpler or blocking operations, the synchronous client (Retab) provides a straightforward approach.
Here’s how you can use both:
Both clients provide the same core functionality, enabling you to list models, create messages, extract data from documents, and more, with the flexibility to match your application’s concurrency model.
Pagination
Many top-level resources have support for bulk fetches via list API methods. For instance, you can list extraction links, list email addresses, and list logs. These list API methods share a common structure, taking at least these four parameters: limit, order, after, and before.
Retab uses cursor-based pagination via the after and before parameters. Both parameters take an existing object ID as a cursor, and the matching objects are returned in either descending or ascending order by creation time.
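The cursor pattern can be sketched as follows. This is a minimal illustration, not the Retab client: fetch_page is a hypothetical stand-in that pages through an in-memory list, and the "id" field name and the newest-first ordering are assumptions made for the example.

```python
from typing import Callable, Dict, Iterator, List, Optional

# In-memory stand-in for a list endpoint; newest objects first (assumed ordering).
ITEMS: List[Dict[str, str]] = [{"id": f"obj_{i:03d}"} for i in range(25, 0, -1)]


def fetch_page(limit: int, after: Optional[str] = None) -> List[Dict[str, str]]:
    """Hypothetical list call: return up to `limit` objects that follow `after`."""
    start = 0
    if after is not None:
        ids = [item["id"] for item in ITEMS]
        start = ids.index(after) + 1  # resume just past the cursor object
    return ITEMS[start:start + limit]


def iterate_all(
    fetch: Callable[..., List[Dict[str, str]]], limit: int = 10
) -> Iterator[Dict[str, str]]:
    """Walk every page by passing the last object ID seen as the next cursor."""
    after: Optional[str] = None
    while True:
        page = fetch(limit=limit, after=after)
        if not page:
            return  # an empty page means there is nothing left to fetch
        yield from page
        after = page[-1]["id"]
```

With a real list method, fetch would wrap the API call and the loop would stay the same: the ID of the last object on each page becomes the after value for the next request.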
Idempotency
The Retab API supports idempotency, which guarantees that performing the same operation multiple times has the same result as performing it once. This is useful when you need to retry a request after a failure, or when you want to prevent accidental duplicate requests from creating more than one resource.
To make a request idempotent, add an Idempotency-Key request header to any Retab API request, with a unique string as the value. Each subsequent request carrying the same key will return the same response. We suggest using v4 UUIDs for idempotency keys to avoid collisions.
Idempotency keys expire after 24 hours. The Retab API will generate a new response if you submit a request with an expired key.
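A minimal sketch of the retry pattern this enables, assuming a hypothetical send callable that performs the HTTP request and returns a status code (no real request is made here). The important detail is that the key is generated once and reused for every retry of the same logical operation.

```python
import uuid
from typing import Callable, Dict


def new_idempotency_headers() -> Dict[str, str]:
    """Build the Idempotency-Key header with a fresh v4 UUID."""
    return {"Idempotency-Key": str(uuid.uuid4())}


def send_with_retries(
    send: Callable[[Dict[str, str]], int], max_attempts: int = 3
) -> int:
    """Retry `send` on connection errors, reusing one idempotency key.

    `send` is a hypothetical transport function: it takes the request
    headers and returns the HTTP status code.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    headers = new_idempotency_headers()  # generated once, shared by all attempts
    last_error: Exception = ConnectionError("request was never attempted")
    for _ in range(max_attempts):
        try:
            return send(headers)
        except ConnectionError as exc:
            last_error = exc  # safe to retry: the key marks it as the same operation
    raise last_error
```

Generating a new key per attempt would defeat the purpose, since each retry would then be treated as a distinct operation.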
Rate Limits
Retab implements rate limiting to ensure stable service for all users. The API uses a rolling window rate limit with the following configuration:
- 300 requests per 60-second window
- Applies across the following API endpoints:
  - POST /v1/documents/extractions
  - POST /v1/documents/create_messages
When you exceed the rate limit, the API will return a 429 Too Many Requests response. The response headers will include:
For high-volume applications, we can provide a dedicated plan. Contact us for more information.
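To stay under a limit of 300 requests per 60 seconds, a client can track its own request timestamps. The sketch below is a client-side approximation only: the class name and its methods are hypothetical, and the server's exact window accounting is not described here, so a 429 response should still be handled.

```python
import time
from collections import deque
from typing import Callable, Deque


class RollingWindowLimiter:
    """Client-side rolling-window limiter (hypothetical helper, not part of Retab)."""

    def __init__(
        self,
        max_requests: int = 300,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent: Deque[float] = deque()  # timestamps of recent requests, oldest first

    def wait_time(self) -> float:
        """Return how many seconds to wait before the next request fits the window."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        # The window is full: wait until the oldest request leaves it.
        return self.window_seconds - (now - self._sent[0])

    def record(self) -> None:
        """Record that a request was just sent."""
        self._sent.append(self._clock())
```

A caller would sleep for wait_time() seconds, send the request, then call record(). The injectable clock exists only to make the limiter easy to test.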
Modality
LLMs work with text and image data. Retab converts documents into different modalities, based on the document type.
Native modalities
Here is the list of native modalities supported by Retab:
You can also use the modality parameter to specify the modality of the document and override the default modality.
Image Settings
When processing images, several factors can affect the LLM’s ability to accurately interpret and extract information. The image_resolution_dpi and browser_canvas parameters allow you to tune image settings to improve extraction quality.
API Reference
image_resolution_dpi: The DPI of the image. Defaults to 96.
browser_canvas: The canvas size of the browser. Must be one of:
- “A3” (11.7in x 16.54in)
- “A4” (8.27in x 11.7in)
- “A5” (5.83in x 8.27in)
Defaults to “A4”.
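To get a feel for how these two settings interact, the pixel size of a rendered page is the canvas size in inches multiplied by the DPI. The helper below is only that arithmetic; canvas_pixels is a hypothetical function, and how Retab actually rasterizes a document is not specified here.

```python
from typing import Dict, Tuple

# Canvas sizes in inches (width, height), as listed above.
CANVAS_INCHES: Dict[str, Tuple[float, float]] = {
    "A3": (11.7, 16.54),
    "A4": (8.27, 11.7),
    "A5": (5.83, 8.27),
}


def canvas_pixels(
    browser_canvas: str = "A4", image_resolution_dpi: int = 96
) -> Tuple[int, int]:
    """Approximate (width, height) in pixels: inches x DPI, rounded to whole pixels."""
    width_in, height_in = CANVAS_INCHES[browser_canvas]
    return (
        round(width_in * image_resolution_dpi),
        round(height_in * image_resolution_dpi),
    )
```

With the defaults (A4 at 96 DPI) this gives roughly 794 x 1123 pixels. Raising the DPI to 300 gives about 2481 x 3510, which preserves small text better at the cost of a much larger image.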
Consensus
The consensus feature improves extraction accuracy by aggregating the results of multiple LLMs.
The consensus principle is simple: multiple runs should give the same result. If they do not, the LLM is not confident about the result, and neither should you be. We compute a consensus score for each field.
Some additional _consensus_score fields are added to the likelihoods object. They are computed as the average of the consensus scores within some context.
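As an illustration of the principle, the sketch below scores each field by the share of runs that agree with the most common value, then averages those scores. The exact scoring formula Retab uses is not given here, so the majority-vote rule, the function names, and the flat dictionary layout are all assumptions.

```python
from collections import Counter
from statistics import mean
from typing import Any, Dict, List, Tuple


def field_consensus(values: List[Any]) -> Tuple[Any, float]:
    """Return the most common value and the fraction of runs that produced it.

    Values must be hashable (strings, numbers, tuples).
    """
    winner, votes = Counter(values).most_common(1)[0]
    return winner, votes / len(values)


def consensus(runs: List[Dict[str, Any]]) -> Tuple[Dict[str, Any], Dict[str, float]]:
    """Aggregate several extraction runs that share the same flat set of fields."""
    result: Dict[str, Any] = {}
    scores: Dict[str, float] = {}
    for name in runs[0]:
        result[name], scores[name] = field_consensus([run[name] for run in runs])
    # Aggregate score: the average of the per-field scores (assumed "context").
    scores["_consensus_score"] = mean(scores.values())
    return result, scores
```

For example, if four runs agree on the currency but only three agree on the total, the currency scores 1.0, the total scores 0.75, and the aggregate is 0.875. A low score on a field is a signal to review that field manually.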