POST /v1/jobs

from retab import Retab

client = Retab()

# Create an async extraction job
job = client.jobs.create(
    endpoint="/v1/documents/extract",
    request={
        "document": {
            "filename": "invoice.pdf",
            "url": "data:application/pdf;base64,JVBERi0xLjQK..."
        },
        "json_schema": {
            "type": "object",
            "properties": {
                "invoice_number": {"type": "string"},
                "total": {"type": "number"}
            },
            "required": ["invoice_number", "total"]
        },
        "model": "retab-small"
    },
    metadata={"batch_id": "batch_001", "source": "api"}
)

print(f"Job ID: {job.id}")
print(f"Status: {job.status}")
{
  "id": "job_V1StGXR8_Z5jdHi6B-myT",
  "object": "job",
  "status": "queued",
  "endpoint": "/v1/documents/extract",
  "request": {
    "document": {
      "filename": "invoice.pdf",
      "url": "data:application/pdf;base64,JVBERi0xLjQK..."
    },
    "json_schema": {
      "type": "object",
      "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"}
      },
      "required": ["invoice_number", "total"]
    },
    "model": "retab-small"
  },
  "response": null,
  "error": null,
  "created_at": 1705420800,
  "started_at": null,
  "completed_at": null,
  "expires_at": 1706025600,
  "organization_id": "org_abc123",
  "metadata": {
    "batch_id": "batch_001",
    "source": "api"
  }
}

Request Parameters

endpoint
string
required
The API endpoint to call asynchronously. Supported values:
  • /v1/documents/extract - Extract structured data
  • /v1/documents/parse - Parse to text/markdown
  • /v1/documents/split - Split documents
  • /v1/documents/classify - Classify documents
  • /v1/schemas/generate - Generate schemas
  • /v1/edit/agent/fill - AI agent form filling
  • /v1/edit/templates/fill - Template filling
  • /v1/edit/templates/generate - Generate form schema
  • /v1/evals/extract/process - Run an extract eval process job. Requires project_id in request
  • /v1/evals/split/process - Run a split eval process job. Requires project_id in request
  • /v1/evals/classify/process - Run a classify eval process job. Requires project_id in request
  • /v1/evals/extract/extract - Run extract evaluation extraction. Requires project_id in request
  • /v1/evals/extract/split - Run extract evaluation splitting. Requires project_id in request
request
object
required
The full request body for the target endpoint. Must match the schema expected by the specified endpoint.
metadata
object
Optional key-value pairs for tracking. Maximum 16 pairs; keys up to 64 characters, values up to 512 characters.
If request contains embedded MIME/data URLs, Retab may offload those artifacts to object storage internally before persisting the job. The API response still returns the original request shape.
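
The constraints above (supported endpoints, the project_id requirement for eval endpoints, and the metadata limits) can be checked on the client before a job is submitted. The following is a minimal sketch, not part of the Retab SDK: validate_job_payload is a hypothetical helper, and it assumes metadata values are strings whose length is measured in characters.

```python
# Hypothetical client-side pre-flight check; not part of the Retab SDK.
SUPPORTED_ENDPOINTS = {
    "/v1/documents/extract",
    "/v1/documents/parse",
    "/v1/documents/split",
    "/v1/documents/classify",
    "/v1/schemas/generate",
    "/v1/edit/agent/fill",
    "/v1/edit/templates/fill",
    "/v1/edit/templates/generate",
    "/v1/evals/extract/process",
    "/v1/evals/split/process",
    "/v1/evals/classify/process",
    "/v1/evals/extract/extract",
    "/v1/evals/extract/split",
}

MAX_METADATA_PAIRS = 16
MAX_KEY_CHARS = 64
MAX_VALUE_CHARS = 512


def validate_job_payload(endpoint, request, metadata=None):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []

    if endpoint not in SUPPORTED_ENDPOINTS:
        problems.append(f"unsupported endpoint: {endpoint}")
    elif endpoint.startswith("/v1/evals/") and "project_id" not in request:
        # Every eval endpoint in the list above requires project_id.
        problems.append(f"{endpoint} requires project_id in request")

    if metadata is not None:
        if len(metadata) > MAX_METADATA_PAIRS:
            problems.append(f"metadata has more than {MAX_METADATA_PAIRS} pairs")
        for key, value in metadata.items():
            if len(key) > MAX_KEY_CHARS:
                problems.append(f"metadata key too long: {key[:20]}...")
            # Assumption: values are strings; other types are measured via str().
            if len(str(value)) > MAX_VALUE_CHARS:
                problems.append(f"metadata value too long for key: {key[:20]}")

    return problems
```

Calling validate_job_payload("/v1/evals/extract/process", {}) returns a single problem naming the missing project_id. The server remains the authority on what is accepted; a check like this only catches obvious mistakes early.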

Response Fields

id
string
Unique identifier for the job, prefixed with job_.
object
string
Always "job".
status
string
Current status: validating, queued, in_progress, completed, failed, cancelled, or expired.
endpoint
string
The target endpoint for this job.
request
object
The original request body submitted.
response
object | null
The response from the target endpoint when status is completed. Contains status_code and body.
error
object | null
Error details when status is failed. Contains code, message, and optional details.
created_at
integer
Unix timestamp when the job was created.
started_at
integer | null
Unix timestamp when processing started.
completed_at
integer | null
Unix timestamp when the job reached a terminal status.
expires_at
integer
Unix timestamp when the job data will expire (7 days after creation).
organization_id
string
The organization that owns this job.
metadata
object | null
User-provided metadata.

Dispatch Behavior

Most successful creates return 200 with a queued job. If the job record is stored successfully but task dispatch is temporarily delayed, Retab returns 202 Accepted with the job already created and a warning like dispatch_delayed. In that case, poll the returned job.id instead of creating a duplicate job.
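
The advice to poll job.id rather than resubmit can be sketched as a small loop. This is an illustration under stated assumptions, not the SDK's own polling API: wait_for_job is a hypothetical helper, fetch_job stands for any callable that returns the current job as a dict for a given ID, and the set of terminal statuses is inferred from the status list in this page.

```python
import time

# Assumption: these four statuses are terminal; validating, queued,
# and in_progress mean the job may still change.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}


def wait_for_job(fetch_job, job_id, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_job(job_id) until the job reaches a terminal status.

    fetch_job is a hypothetical callable supplied by the caller; it must
    return a dict with at least a "status" key.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)
```

The same loop serves both the 200 and the 202 case, because in both the job record already exists; only the time spent in queued differs.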

Authorizations

Api-Key
string
header
required

Headers

Idempotency-Key
string | null

Query Parameters

access_token
string | null

Body

application/json

Request body for POST /v1/jobs.

endpoint
enum<string>
required
Available options:
/v1/documents/extract,
/v1/documents/parse,
/v1/documents/split,
/v1/documents/classify,
/v1/schemas/generate,
/v1/edit/agent/fill,
/v1/edit/templates/fill,
/v1/edit/templates/generate,
/v1/evals/extract/process,
/v1/evals/split/process,
/v1/evals/classify/process,
/v1/evals/extract/extract,
/v1/evals/extract/split
request
Request · object
required
metadata
Metadata · object

Max 16 pairs; keys ≤64 chars, values ≤512 chars
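
For callers not using the SDK, the pieces listed above (the Api-Key header, the optional Idempotency-Key header, and the JSON body) can be assembled as follows. This is a sketch only: build_create_job_request is a hypothetical helper, the base URL is left as a parameter because this page does not state it, and the idea that reusing an Idempotency-Key on a retry avoids a duplicate job is the conventional meaning of such a header, not something this page confirms.

```python
import json


def build_create_job_request(base_url, api_key, endpoint, request,
                             metadata=None, idempotency_key=None):
    """Assemble method, URL, headers, and body for POST /v1/jobs.

    Returns plain data so it can be sent with any HTTP client.
    """
    headers = {
        "Api-Key": api_key,
        "Content-Type": "application/json",
    }
    if idempotency_key is not None:
        # Assumption: the same key should be reused when retrying the same create.
        headers["Idempotency-Key"] = idempotency_key

    body = {"endpoint": endpoint, "request": request}
    if metadata is not None:
        body["metadata"] = metadata

    return {
        "method": "POST",
        "url": base_url.rstrip("/") + "/v1/jobs",
        "headers": headers,
        "body": json.dumps(body),
    }
```

The returned dict maps directly onto the arguments of most HTTP clients, so the same helper works whichever client library sends the request.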

Response

Successful Response

Core Job object following OpenAI-style specification.

Represents a single asynchronous job that can be polled for status and result retrieval.

endpoint
enum<string>
required
Available options:
/v1/documents/extract,
/v1/documents/parse,
/v1/documents/split,
/v1/documents/classify,
/v1/schemas/generate,
/v1/edit/agent/fill,
/v1/edit/templates/fill,
/v1/edit/templates/generate,
/v1/evals/extract/process,
/v1/evals/split/process,
/v1/evals/classify/process,
/v1/evals/extract/extract,
/v1/evals/extract/split
organization_id
string
required
id
string
object
string
default:job
Allowed value: "job"
status
enum<string>
default:validating
Available options:
validating,
queued,
in_progress,
completed,
failed,
cancelled,
expired
error
JobError · object

Error details when job fails.

warnings
JobWarning · object[] | null
created_at
integer
started_at
integer | null
completed_at
integer | null
expires_at
integer
metadata
Metadata · object
cloud_task_name
string | null
cancelled
boolean
default:false
attempt_count
integer
default:0
last_attempt_at
integer | null
last_failure_code
string | null
request
Request · object
response
JobResponse · object

Response stored when job completes successfully.

request_gcs_paths
null
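
Once a job has stopped changing, the status, response, and error fields described above determine what the caller gets back. A minimal sketch of that branching, assuming the job has been loaded as a plain dict; unwrap_job and JobFailedError are hypothetical names, not SDK symbols.

```python
class JobFailedError(Exception):
    """Raised when a job finished with status "failed"."""

    def __init__(self, code, message, details=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details


def unwrap_job(job):
    """Return the target endpoint's response body for a completed job.

    Raises JobFailedError for failed jobs and RuntimeError for any other
    status, since those carry no result.
    """
    status = job["status"]
    if status == "completed":
        # response contains status_code and body when status is completed.
        return job["response"]["body"]
    if status == "failed":
        err = job.get("error") or {}
        raise JobFailedError(err.get("code"), err.get("message"), err.get("details"))
    raise RuntimeError(f"job {job.get('id')} has no result (status={status})")
```

Keeping failed separate from the other non-completed statuses lets callers log the error code and message, while cancelled or expired jobs can simply be resubmitted if needed.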