Create Extraction
Run a structured extraction on a document.
Extracts structured data from the document according to the supplied
json_schema, using the requested model. On success, responds with 201 and
the extraction, including its output, consensus details, and usage. When
stream is true, partial results are streamed back as they are produced.
Extract structured data from a document against a JSON schema and persist the result as an Extraction resource that can later be retrieved via GET /v1/extractions/{extraction_id} or listed via GET /v1/extractions.
Authorizations
Body
Request to run a structured extraction on a single document.
Extends the base extraction request with the document to process (either
inline content or a reference to a previously uploaded file) and a stream
flag that controls whether results are returned incrementally.
A file represented by its filename and a base64 data URL.
- MIMEData
- FileRef
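The inline form of the document (a filename plus a base64 data URL) can be sketched as follows. This is a minimal illustration, not the official client: the helper name to_mime_data and the key names "filename" and "url" are assumptions, since this page describes the shape but does not name the fields.

```python
import base64
import mimetypes


def to_mime_data(filename: str, content: bytes) -> dict:
    """Wrap raw file bytes as a filename plus a base64 data URL."""
    # Guess the MIME type from the file extension; fall back to a generic
    # binary type when the extension is unknown.
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None:
        mime_type = "application/octet-stream"

    encoded = base64.b64encode(content).decode("ascii")

    # "filename" and "url" are assumed key names, not confirmed by this page.
    return {"filename": filename, "url": f"data:{mime_type};base64,{encoded}"}
```

For example, to_mime_data("invoice.pdf", pdf_bytes) yields a dictionary whose URL starts with data:application/pdf;base64, followed by the encoded file content.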
JSON schema describing the structured output
The model to use for the extraction
Resolution of the image sent to the LLM
Free-form instructions appended to the system prompt to steer the extraction.
Number of consensus extraction runs to perform. When set to 1, a single deterministic pass is used.
User-defined metadata to associate with this extraction
Additional chat messages forwarded to the extraction model.
If true, skip the LLM cache and force a fresh completion
If true, run asynchronously: returns immediately with status 'queued' and an empty output. Poll GET /v1//{id} until status is terminal. Mutually exclusive with stream.
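The body fields above could be assembled into a request roughly as shown here. This is a sketch under several assumptions: the base URL is a placeholder, the POST path is inferred from the GET /v1/extractions listing route, the Authorization header scheme is a guess (the Authorizations section is empty here), and the key names "document", "model", and "background" are hypothetical. Only json_schema and stream are named on this page. The request is built but not sent.

```python
import json
import urllib.request

# Placeholder base URL; replace with the real API host.
BASE_URL = "https://api.example.com"


def build_extraction_request(api_key, document, json_schema, model,
                             *, stream=False, background=False, extra=None):
    """Build (but do not send) a POST request that creates an extraction."""
    # The page states that background execution and streaming are
    # mutually exclusive, so reject the combination before any network call.
    if stream and background:
        raise ValueError("background and stream are mutually exclusive")

    payload = {
        "document": document,        # assumed key name
        "json_schema": json_schema,  # named on this page
        "model": model,              # assumed key name
        "stream": stream,            # named on this page
        "background": background,    # assumed key name
    }
    # Optional fields (instructions, metadata, and so on) go in `extra`,
    # using whatever key names the API actually defines.
    if extra:
        payload.update(extra)

    return urllib.request.Request(
        f"{BASE_URL}/v1/extractions",  # POST path is an assumption
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )
```

Validating the stream and background flags on the client side gives an immediate, local error rather than a rejected request.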
Response
Successful Response
A stored extraction record from the Retab API.
Unique identifier of the extraction
Information about the extracted file
Model used for the extraction
JSON schema used for the extraction
The extracted structured data
Number of consensus votes used
DPI used to render document images
Free-form instructions supplied with the extraction request.
Lifecycle status. The synchronous path returns 'completed'. Background runs progress pending -> queued -> in_progress -> completed | failed | cancelled.
pending, queued, in_progress, completed, failed, cancelled
Error details when a background run fails; null otherwise. Always present so consumers can read it without an existence check.
Consensus metadata for multi-vote extraction runs
Usage information for the extraction
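For background runs, the lifecycle described above suggests polling until the status reaches completed, failed, or cancelled. The sketch below is one way to do that, assuming the response exposes the lifecycle under a "status" key. The helper name wait_for_extraction and its parameters are hypothetical, and the HTTP call is left to a caller-supplied fetch function so that no endpoint details are assumed here.

```python
import time
from typing import Callable

# Terminal states taken from the lifecycle listed on this page.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_extraction(fetch: Callable[[], dict],
                        interval: float = 1.0,
                        max_polls: int = 60,
                        sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call `fetch` repeatedly until the extraction reaches a terminal status.

    `fetch` should return the extraction record as a dict, for example by
    issuing the GET request for a single extraction and decoding the JSON.
    """
    for attempt in range(max_polls):
        extraction = fetch()
        if extraction["status"] in TERMINAL_STATUSES:
            return extraction
        # Wait between polls, but not after the final attempt.
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError(f"extraction not terminal after {max_polls} polls")
```

A caller would then inspect the returned record: on completed, read the extracted data; on failed, read the error details, which the page says are always present.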