A route for extracting structured data from documents using LLMs. This endpoint processes document data and extracts information according to the provided JSON schema. It supports various document types and can return either a complete response or streamed chunks.
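As a rough illustration of how a request to this route might be assembled, here is a minimal sketch. The field names (`document`, `model`, `json_schema`, `temperature`, `stream`) are assumptions inferred from the parameter descriptions below, not a confirmed wire format; consult the actual API schema for the exact keys.

```python
import json

def build_extraction_request(document_url, model, json_schema,
                             temperature=0, stream=False):
    """Assemble a request body for the extraction route.

    Field names here are illustrative assumptions; check the
    API schema for the authoritative keys.
    """
    return {
        "document": {"url": document_url},   # document to be analyzed
        "model": model,                      # model used for chat completion
        "json_schema": json_schema,          # schema used to validate the output
        "temperature": temperature,          # defaults to 0 if omitted
        "stream": stream,                    # stream chunks over WebSocket
    }

body = build_extraction_request(
    document_url="https://example.com/invoice.pdf",
    model="gpt-4o",
    json_schema={"type": "object",
                 "properties": {"total": {"type": "number"}}},
)
print(json.dumps(body, indent=2))
```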
Document to be analyzed
Model used for chat completion
JSON schema format used to validate the output data.
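For instance, a minimal schema for extracting invoice fields might look like the following. The property names are illustrative only, not taken from this reference.

```json
{
  "type": "object",
  "properties": {
    "invoice_number": { "type": "string" },
    "total": { "type": "number" },
    "line_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": { "type": "string" },
          "amount": { "type": "number" }
        }
      }
    }
  },
  "required": ["invoice_number", "total"]
}
```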
Resolution of the image sent to the LLM. Required range: 96 <= x <= 300
Temperature for sampling. If not provided, the default temperature for the model will be used. Default: 0
The effort level for the model to reason about the input data. If not provided, the default reasoning effort for the model will be used. Allowed values: none, minimal, low, medium, high, xhigh
Number of consensus models to use for extraction. If greater than 1, the temperature cannot be 0.
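The consensus constraint above can be sketched as a client-side check. This is a hypothetical helper, not part of the API:

```python
def validate_sampling(n_consensus: int, temperature: float) -> None:
    """Reject parameter combinations the extraction route disallows:
    with more than one consensus model, a temperature of 0 would make
    every model produce the same output, defeating the purpose of
    consensus."""
    if n_consensus > 1 and temperature == 0:
        raise ValueError("temperature cannot be 0 when n_consensus > 1")

validate_sampling(n_consensus=1, temperature=0)    # single model: allowed
validate_sampling(n_consensus=3, temperature=0.5)  # consensus with sampling: allowed
```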
If true, the extraction will be streamed to the user using the active WebSocket connection
Seed for the random number generator. If not provided, a random seed will be generated. Default: null
If true, the extraction will be stored in the database
If set, keys to be used for the extraction of long lists of data using Parallel OCR. Example:
{
  "products": "identity.id",
  "properties": "ID"
}
Enable web search enrichment with Parallel AI to add external context during extraction
User-defined metadata to associate with this extraction
Extraction ID to use for this extraction. If not provided, a new ID will be generated.
Additional chat messages to append after the document content messages. Useful for providing extra context or instructions.
Successful Response
Allowed value: "chat.completion"
Allowed values: auto, default, flex, scale, priority
Usage statistics for the completion request.
Object defining the uncertainties of the fields extracted when using consensus. Follows the same structure as the extraction object.
Flag indicating if the extraction requires human review
Timestamp of the request
Timestamp of the first token of the document. If non-streaming, this equals last_token_at
Timestamp of the last token of the document