GET /v1/jobs/{job_id}
from retab import Retab

client = Retab()

# Retrieve a job by ID
job = client.jobs.retrieve_full("job_V1StGXR8_Z5jdHi6B-myT")

print(f"Status: {job.status}")

if job.status == "completed":
    print(f"Result: {job.response.body}")
elif job.status == "failed":
    print(f"Error: {job.error.message}")
{
  "id": "job_V1StGXR8_Z5jdHi6B-myT",
  "object": "job",
  "status": "queued",
  "endpoint": "/v1/documents/extract",
  "request": {
    "document": {
      "filename": "invoice.pdf",
      "url": "data:application/pdf;base64,..."
    },
    "json_schema": { ... },
    "model": "retab-small"
  },
  "response": null,
  "error": null,
  "created_at": 1705420800,
  "started_at": null,
  "completed_at": null,
  "expires_at": 1706025600,
  "organization_id": "org_abc123",
  "metadata": {"batch_id": "batch_001"}
}

Path Parameters

job_id
string
required
The unique identifier of the job to retrieve.

Query Parameters

include_request
boolean
default:false
Whether to include and restore the original request payload.
include_response
boolean
default:false
Whether to include and restore response payloads/documents.
The default retrieve response is lightweight (request and response are omitted).
For tight polling loops, use GET /v1/jobs/{job_id}/status. It avoids request/response restoration work and returns only lifecycle metadata.
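
When calling the endpoint over raw HTTP instead of through the SDK, the two flags are ordinary query parameters. The following is a minimal sketch of building the relevant URLs. The helper name job_url and the host https://api.example.com are placeholders for illustration, not part of the Retab API; substitute the real base URL for your deployment.

```python
from urllib.parse import quote, urlencode


def job_url(
    base_url: str,
    job_id: str,
    *,
    include_request: bool = False,
    include_response: bool = False,
    status_only: bool = False,
) -> str:
    """Build the URL for GET /v1/jobs/{job_id} or its /status variant.

    `base_url` is an assumption supplied by the caller; this page does not
    state the API host.
    """
    path = f"{base_url.rstrip('/')}/v1/jobs/{quote(job_id, safe='')}"
    if status_only:
        # Lifecycle metadata only, suited to tight polling loops.
        return f"{path}/status"
    params = {}
    if include_request:
        params["include_request"] = "true"
    if include_response:
        params["include_response"] = "true"
    # Omit the query string entirely for the lightweight default.
    return f"{path}?{urlencode(params)}" if params else path


print(job_url("https://api.example.com", "job_V1StGXR8_Z5jdHi6B-myT", include_response=True))
```

Leaving both flags unset yields the lightweight form described above, and status_only=True targets the dedicated status route.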

SDK Convenience Methods

  • Python: client.jobs.retrieve(job_id) defaults to lightweight polling.
  • Python: client.jobs.retrieve_full(job_id) fetches full request/response payloads.
  • JavaScript: client.jobs.retrieve(jobId) defaults to lightweight polling.
  • JavaScript: client.jobs.retrieveFull(jobId) fetches full request/response payloads.

Response Fields

id
string
Unique identifier for the job.
object
string
Always "job".
status
string
Current status of the job:
  • validating - Request is being validated
  • queued - Waiting in the processing queue
  • in_progress - Currently being processed
  • completed - Successfully finished
  • failed - Execution failed
  • cancelled - Cancelled by user
  • expired - Job data has expired
endpoint
string
The target endpoint for this job.
request
object | null
The original request body submitted. May be omitted when include_request=false.
response
object | null
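Present when status is completed; null otherwise. May be omitted when include_response=false. The example above reads the result from response.body.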
error
object | null
Present when status is failed; null otherwise. The example above reads the failure reason from error.message.
created_at
integer
Unix timestamp when the job was created.
started_at
integer | null
Unix timestamp when processing started (null if not yet started).
completed_at
integer | null
Unix timestamp when the job reached a terminal status (null until then).
expires_at
integer
Unix timestamp when the job data will expire.
organization_id
string
The organization that owns this job.
metadata
object | null
User-provided metadata from job creation.
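
To make the field semantics concrete, here is a hedged sketch that summarizes a job returned as parsed JSON (a plain dict). The helper names to_utc and summarize_job are hypothetical, and the set of terminal statuses is taken from the polling example on this page rather than from a formal specification.

```python
from datetime import datetime, timezone
from typing import Optional

# Statuses after which the job no longer changes (as used in the polling example).
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}


def to_utc(ts: Optional[int]) -> Optional[datetime]:
    """Convert a Unix timestamp in seconds to an aware UTC datetime."""
    return None if ts is None else datetime.fromtimestamp(ts, tz=timezone.utc)


def summarize_job(job: dict) -> dict:
    """Extract lifecycle information from a job dict."""
    started = job.get("started_at")
    completed = job.get("completed_at")
    return {
        "id": job["id"],
        "status": job["status"],
        "is_terminal": job["status"] in TERMINAL_STATUSES,
        "created_at": to_utc(job["created_at"]),
        "expires_at": to_utc(job["expires_at"]),
        # Only defined once the job has both started and finished.
        "processing_seconds": (
            completed - started
            if started is not None and completed is not None
            else None
        ),
    }


sample = {
    "id": "job_V1StGXR8_Z5jdHi6B-myT",
    "status": "queued",
    "created_at": 1705420800,
    "started_at": None,
    "completed_at": None,
    "expires_at": 1706025600,
}
summary = summarize_job(sample)
print(summary["status"], summary["is_terminal"], summary["expires_at"] - summary["created_at"])
```

In the sample response, expires_at is 604800 seconds after created_at, which is exactly seven days.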

Payload Restoration

include_request=true restores any request-side file artifacts that were offloaded during job creation. include_response=true restores response payloads or document artifacts for completed jobs. If restoration fails for a specific artifact, the job is still returned and Retab may attach a warning describing the degraded restore.
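
Because a degraded restore still returns the job, callers may want to check for warnings before trusting restored payloads. The sketch below works on a job parsed from JSON into a dict and relies only on the warnings field listed in the response schema (an array of JobWarning objects, or null). The fields inside each warning are not documented on this page, so the helper, whose name restore_warnings is hypothetical, treats them as opaque values.

```python
def restore_warnings(job: dict) -> list:
    """Return the job's warnings as a list, treating null or missing as empty."""
    return list(job.get("warnings") or [])


def report_restore(job: dict) -> str:
    """Produce a one-line description of whether restoration was degraded."""
    warnings = restore_warnings(job)
    if not warnings:
        return f"{job['id']}: payloads restored without warnings"
    # Warning contents are printed as-is; their structure is not assumed.
    return f"{job['id']}: {len(warnings)} restore warning(s): {warnings!r}"


print(report_restore({"id": "job_V1StGXR8_Z5jdHi6B-myT", "warnings": None}))
```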

Polling for Completion

Use polling to wait for a job to complete:
import time

from retab import Retab

client = Retab()
job_id = "job_V1StGXR8_Z5jdHi6B-myT"
terminal_statuses = ("completed", "failed", "cancelled", "expired")

# Lightweight retrieve: request and response payloads are omitted
job = client.jobs.retrieve(job_id)

while job.status not in terminal_statuses:
    time.sleep(2)  # Wait 2 seconds between polls
    job = client.jobs.retrieve(job_id)

# Fetch the full payload only once the job is terminal
if job.status == "completed":
    job = client.jobs.retrieve_full(job_id)

# Alternatively, replace the loop above with the helper method
job = client.jobs.wait_for_completion(
    job_id,
    poll_interval_seconds=2.0,
    timeout_seconds=300,
)
By default, wait_for_completion/waitForCompletion polls with lightweight retrieves and returns a final job with include_response=true and include_request=false. Override these with options when needed.

Authorizations

Api-Key
string
header
required
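
Outside the SDK, the key is sent in the Api-Key request header. Below is a minimal standard-library sketch; the function names and the host https://api.example.com are placeholders chosen for illustration, and the base URL is an assumption to be replaced with the real API host.

```python
import json
import urllib.request


def build_job_request(base_url: str, job_id: str, api_key: str) -> urllib.request.Request:
    """Prepare an authenticated GET /v1/jobs/{job_id} request (not yet sent)."""
    url = f"{base_url.rstrip('/')}/v1/jobs/{job_id}"
    return urllib.request.Request(url, headers={"Api-Key": api_key}, method="GET")


def fetch_job(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON job object."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)


req = build_job_request("https://api.example.com", "job_V1StGXR8_Z5jdHi6B-myT", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the authentication logic easy to inspect and test without network access.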

Path Parameters

job_id
string
required

Query Parameters

include_request
boolean
default:false

Whether to restore request MIME documents from GCS in the response.

include_response
boolean
default:false

Whether to restore response payload/documents from GCS in the response.

access_token
string | null

Response

Successful Response

Core Job object following OpenAI-style specification.

Represents a single asynchronous job that can be polled for status and result retrieval.

endpoint
enum<string>
required
Available options:
/v1/documents/extract,
/v1/documents/parse,
/v1/documents/split,
/v1/documents/classify,
/v1/schemas/generate,
/v1/edit/agent/fill,
/v1/edit/templates/fill,
/v1/edit/templates/generate,
/v1/evals/extract/process,
/v1/evals/split/process,
/v1/evals/classify/process,
/v1/evals/extract/extract,
/v1/evals/extract/split
organization_id
string
required
id
string
object
string
default:job
Allowed value: "job"
status
enum<string>
default:validating
Available options:
validating,
queued,
in_progress,
completed,
failed,
cancelled,
expired
error
JobError · object

Error details when job fails.

warnings
JobWarning · object[] | null
created_at
integer
started_at
integer | null
completed_at
integer | null
expires_at
integer
metadata
Metadata · object
cloud_task_name
string | null
cancelled
boolean
default:false
attempt_count
integer
default:0
last_attempt_at
integer | null
last_failure_code
string | null
request
Request · object
response
JobResponse · object

Response stored when job completes successfully.

request_gcs_paths
null