Create Streaming Extraction
Run a structured extraction on a document and stream partial results as they are produced.
The extraction is persisted. The
request body is identical to that of POST /v1/extractions;
the response is a stream of application/stream+json chunks, each carrying the
latest partial output as the model fills the schema.
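Because every chunk carries the latest partial output, a client only needs to keep the most recent one. The sketch below shows one way to do that. It assumes the application/stream+json body is newline-delimited JSON (one object per line) and that the partial result sits under a key named "output"; both the framing and the key name are assumptions, not confirmed by this page.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Optional, Union


def iter_chunks(lines: Iterable[Union[bytes, str]]) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of the stream."""
    for raw in lines:
        text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        text = text.strip()
        if not text:
            continue  # tolerate blank keep-alive lines
        yield json.loads(text)


def latest_output(lines: Iterable[Union[bytes, str]], key: str = "output") -> Optional[Any]:
    """Return the value under `key` from the last chunk that contains it.

    `key` is a hypothetical field name; adjust it to the real chunk shape.
    """
    latest = None
    for chunk in iter_chunks(lines):
        if key in chunk:
            latest = chunk[key]  # each chunk supersedes the previous one
    return latest


# In-memory stand-in for the lines of a streamed HTTP response body.
sample = [
    b'{"output": {"total": null}}\n',
    b"\n",
    b'{"output": {"total": 42}}\n',
]
print(latest_output(sample))  # {'total': 42}
```

An HTTP response object that can be iterated line by line (for example the one returned by urllib.request.urlopen) can be passed in place of the sample list.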
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
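As an illustration, the header can be attached to a JSON POST with the Python standard library. This is a minimal sketch: the host api.example.com is a placeholder, the helper name build_request is hypothetical, and the URL of the streaming endpoint itself is not stated here, so the caller supplies it.

```python
import json
import urllib.request
from typing import Any, Dict


def build_request(url: str, token: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Build a POST request carrying a JSON body and a Bearer token."""
    if not token:
        raise ValueError("an auth token is required")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # form: Bearer <token>
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and token; no network call is made here.
req = build_request("https://api.example.com/v1/extractions", "my-token", {"model": "example-model"})
print(req.get_header("Authorization"))  # Bearer my-token
```

Passing the resulting object to urllib.request.urlopen would send the request.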
Body
Request to run a structured extraction on a single document.
Extends the base extraction request with the document to process (either
inline content or a reference to a previously uploaded file) and a stream
flag that controls whether results are returned incrementally.
A file represented by its filename and a Base64 data URL.
JSON schema describing the structured output
The model to use for the extraction
Resolution of the image sent to the LLM
Free-form instructions appended to the system prompt to steer the extraction.
Number of consensus extraction runs to perform. A value of 1 runs a single deterministic pass.
User-defined metadata to associate with this extraction
Additional chat messages forwarded to the extraction model.
If true, skip the LLM cache and force a fresh completion
If true, run asynchronously: returns immediately with status 'queued' and an empty output. Poll GET /v1//{id} until status is terminal. Mutually exclusive with stream.
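The body fields above can be assembled as sketched below. The field names in this listing are not given on this page, so every key used here ("document", "filename", "url", "json_schema", "model", "n_consensus", "background") is hypothetical, as is the model name; only the stream flag is named in the text. The sketch shows two things that are described: encoding a file as a Base64 data URL, and rejecting a body that sets both the asynchronous flag and stream.

```python
import base64
import mimetypes
from typing import Any, Dict


def to_data_url(filename: str, content: bytes) -> str:
    """Encode raw bytes as a data URL, guessing the MIME type from the filename."""
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_body(
    filename: str,
    content: bytes,
    json_schema: Dict[str, Any],
    model: str,
    *,
    stream: bool = True,
    background: bool = False,  # hypothetical name for the asynchronous flag
    n_consensus: int = 1,
) -> Dict[str, Any]:
    """Assemble a request body; all key names except "stream" are assumptions."""
    if stream and background:
        raise ValueError("asynchronous mode and stream are mutually exclusive")
    if n_consensus < 1:
        raise ValueError("n_consensus must be at least 1")
    return {
        "document": {"filename": filename, "url": to_data_url(filename, content)},
        "json_schema": json_schema,
        "model": model,
        "n_consensus": n_consensus,
        "stream": stream,
        "background": background,
    }


schema = {"type": "object", "properties": {"total": {"type": "number"}}}
body = build_body("invoice.pdf", b"%PDF-1.7 ...", schema, "example-model")
print(sorted(body))
```

Validating the flag combination on the client avoids a round trip for a request the server is documented to treat as contradictory.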
Response
Streaming extraction chunks