GET /v1/jobs
from retab import Retab

client = Retab()

# List all jobs
jobs_response = client.jobs.list()
for job in jobs_response.data:
    print(f"{job.id}: {job.status} - {job.endpoint}")

# List with filters and pagination
jobs_response = client.jobs.list(
    status="completed",
    limit=10
)

# Paginate through results
if jobs_response.has_more:
    next_page = client.jobs.list(
        after=jobs_response.last_id,
        limit=10
    )
{
  "object": "list",
  "data": [
    {
      "id": "job_V1StGXR8_Z5jdHi6B-myT",
      "object": "job",
      "status": "completed",
      "endpoint": "/v1/documents/extract",
      "request": { ... },
      "response": {
        "status_code": 200,
        "body": { ... }
      },
      "error": null,
      "created_at": 1705420810,
      "started_at": 1705420812,
      "completed_at": 1705420820,
      "expires_at": 1706025610,
      "organization_id": "org_abc123",
      "metadata": {"batch_id": "batch_001"}
    },
    {
      "id": "job_X2TuHYS9_A6keIj7C-nzU",
      "object": "job",
      "status": "queued",
      "endpoint": "/v1/documents/parse",
      "request": { ... },
      "response": null,
      "error": null,
      "created_at": 1705420800,
      "started_at": null,
      "completed_at": null,
      "expires_at": 1706025600,
      "organization_id": "org_abc123",
      "metadata": null
    }
  ],
  "first_id": "job_V1StGXR8_Z5jdHi6B-myT",
  "last_id": "job_X2TuHYS9_A6keIj7C-nzU",
  "has_more": true
}
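
The timestamp fields in the sample response read like Unix epoch seconds. Assuming that interpretation holds, the sketch below derives queue wait, run time, and retention window for a job. The helper name job_timings is hypothetical, and the dict simply copies the timestamps of the first job in the sample.

```python
from typing import Optional


def job_timings(job: dict) -> dict:
    """Derive durations in seconds from a job's timestamp fields.

    Assumption: created_at, started_at, completed_at and expires_at are
    Unix epoch seconds, as the sample response suggests.
    """
    def diff(later: Optional[int], earlier: Optional[int]) -> Optional[int]:
        # Queued jobs have null started_at/completed_at, so propagate None.
        if later is None or earlier is None:
            return None
        return later - earlier

    return {
        "queue_wait": diff(job.get("started_at"), job.get("created_at")),
        "run_time": diff(job.get("completed_at"), job.get("started_at")),
        "retention": diff(job.get("expires_at"), job.get("created_at")),
    }


# Timestamps copied from the first job in the sample response.
completed = {
    "created_at": 1705420810,
    "started_at": 1705420812,
    "completed_at": 1705420820,
    "expires_at": 1706025610,
}
timings = job_timings(completed)
print(timings)
```

For a job that is still queued, queue_wait and run_time come back as None because started_at and completed_at are null.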

Query Parameters

before
string
Reverse-pagination cursor. Use the first_id from the current page to fetch the previous page.
after
string
Pagination cursor. Use the last_id from the previous response to get the next page.
limit
integer
default:"20"
Maximum number of jobs to return. Range: 1-100.
order
string
default:"desc"
Sort order by created_at. Supported values: asc, desc.
id
string
Filter by an exact job ID.
status
string
Filter by job status. Valid values:
  • validating
  • queued
  • in_progress
  • completed
  • failed
  • cancelled
  • expired
endpoint
string
Filter by the target jobs endpoint.
source
string
High-level source filter. Supported values:
  • api - Jobs created directly through the API
  • project - Jobs associated with a request.project_id
  • workflow - Jobs associated with metadata.workflow_id
project_id
string
Filter by request.project_id.
workflow_id
string
Filter by metadata.workflow_id.
workflow_node_id
string
Filter by metadata.workflow_node_id or metadata.node_id.
model
string
Filter by request.model.
filename_regex
string
Anchored filename pattern applied to request.document.filename or request.documents[].filename.
filename_contains
string
Plain-text substring filter applied to request filenames.
document_type
string[]
Repeatable filter for file types such as pdf, png, jpg, docx, xlsx, csv, json, or xml.
from_date
string
Filter jobs created on or after this date in YYYY-MM-DD format.
to_date
string
Filter jobs created on or before this date in YYYY-MM-DD format.
metadata
string
JSON object string for exact metadata filtering, for example {"batch_id":"batch_001"}.
include_request
boolean
default:"false"
Whether to include and restore full request payloads in each list item.
include_response
boolean
default:"false"
Whether to include and restore full response payloads/documents in each list item.
By default, list responses are lightweight: request and response are omitted or null. Pass include_request=true&include_response=true when you need full payloads.
When include_response=true, limit must be 20 or less so that response restoration stays bounded.
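
To make the parameter rules above concrete, here is a minimal sketch that assembles the path and query string for GET /v1/jobs using only the standard library. The helper build_jobs_query is hypothetical and not part of the Retab SDK. It also assumes that "repeatable" for document_type means repeating the query key once per value, and it sends no request.

```python
from urllib.parse import urlencode


def build_jobs_query(limit=20, include_request=False, include_response=False,
                     document_types=(), **filters):
    """Return the path and query string for a GET /v1/jobs call."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if include_response and limit > 20:
        raise ValueError("include_response=true requires limit <= 20")

    params = [("limit", limit)]
    # Simple string filters such as status, order, from_date, to_date.
    params += [(key, value) for key, value in filters.items() if value is not None]
    # Assumption: document_type is repeated once per file type.
    params += [("document_type", doc_type) for doc_type in document_types]
    # Booleans are written in lowercase, matching include_request=true above.
    if include_request:
        params.append(("include_request", "true"))
    if include_response:
        params.append(("include_response", "true"))
    return "/v1/jobs?" + urlencode(params)


query = build_jobs_query(limit=10, status="completed", from_date="2024-01-01",
                         document_types=["pdf", "png"], include_response=True)
print(query)
```

Validating the limit before the request is sent surfaces the include_response constraint locally instead of as an API error.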

Response Fields

object
string
Always "list".
data
array
Array of Job objects, sorted by created_at. The default order is descending (newest first); pass order=asc to reverse it.
first_id
string | null
ID of the first job in this page (null if empty). Use as the before parameter to fetch the previous page.
last_id
string | null
ID of the last job in this page. Use as the after parameter for pagination.
has_more
boolean
Whether there are more results available.

Pagination

By default, jobs are returned in descending order by creation time (newest first). Use cursor-based pagination to iterate through all results:
all_jobs = []
after = None

while True:
    response = client.jobs.list(after=after, limit=100)
    all_jobs.extend(response.data)

    if not response.has_more:
        break
    after = response.last_id

print(f"Total jobs: {len(all_jobs)}")
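
The same loop can be wrapped in a generator so that callers process jobs as pages arrive instead of holding every job in memory. This is a sketch: iter_jobs is a hypothetical helper, and it only assumes that the callable passed in accepts after and limit and returns an object with data, has_more, and last_id, as client.jobs.list does in the examples above.

```python
from typing import Any, Callable, Iterator


def iter_jobs(list_page: Callable[..., Any], page_size: int = 100,
              **filters: Any) -> Iterator[Any]:
    """Yield jobs one at a time, following the after cursor across pages."""
    after = None
    while True:
        page = list_page(after=after, limit=page_size, **filters)
        yield from page.data
        # Stop when the API reports no further pages, or when an empty page
        # leaves no cursor to continue from.
        if not page.has_more or page.last_id is None:
            break
        after = page.last_id
```

Usage would look like `for job in iter_jobs(client.jobs.list, status="failed"): ...`, with any extra keyword arguments forwarded unchanged to each page request.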

Filtering Examples

Monitor Active Jobs

# Get jobs currently processing (first page only; the default limit is 20)
in_progress = client.jobs.list(status="in_progress")
queued = client.jobs.list(status="queued")

print(f"In progress: {len(in_progress.data)}")
print(f"Queued: {len(queued.data)}")

Find Failed Jobs

# List failed jobs to investigate errors
failed_jobs = client.jobs.list(status="failed", limit=50)

for job in failed_jobs.data:
    print(f"{job.id}: {job.error.code} - {job.error.message}")
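
When many jobs fail, grouping them by error code shows which failure dominates. The sketch below is one way to do that. The helper count_error_codes is hypothetical, and it assumes each job exposes error.code as in the snippet above, with error possibly being None.

```python
from collections import Counter
from typing import Any, Iterable


def count_error_codes(jobs: Iterable[Any]) -> Counter:
    """Count failed jobs per error code, skipping jobs without an error."""
    return Counter(job.error.code for job in jobs if job.error is not None)
```

Calling `count_error_codes(failed_jobs.data).most_common()` returns (code, count) pairs ordered from most to least frequent.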

Filter by Metadata

The metadata query parameter filters on exact metadata key/value pairs server-side. You can also filter client-side:
# Find jobs from a specific batch
batch_id = "batch_001"
batch_jobs = []

response = client.jobs.list(limit=100)
for job in response.data:
    if job.metadata and job.metadata.get("batch_id") == batch_id:
        batch_jobs.append(job)

print(f"Jobs in {batch_id}: {len(batch_jobs)}")
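
For comparison, the metadata query parameter documented under Query Parameters takes a JSON object string. The sketch below shows how such a request path could be encoded with the standard library. The helper metadata_query is hypothetical, and the sketch builds the query string without sending a request.

```python
import json
from urllib.parse import urlencode


def metadata_query(metadata: dict, limit: int = 100) -> str:
    """Build the GET /v1/jobs path for an exact metadata filter."""
    # Compact separators reproduce the {"batch_id":"batch_001"} form.
    encoded = json.dumps(metadata, separators=(",", ":"))
    # urlencode percent-encodes the braces, quotes, and colon.
    return "/v1/jobs?" + urlencode({"metadata": encoded, "limit": limit})


path = metadata_query({"batch_id": "batch_001"})
print(path)
```

Filtering on the server avoids downloading pages of unrelated jobs just to discard them locally.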

Authorizations

Api-Key
string
header
required

Query Parameters

before
string | null

Pagination cursor (first ID from current page)

after
string | null

Pagination cursor (last ID from previous page)

limit
integer
default:20

Number of jobs to return

Required range: 1 <= x <= 100
order
enum<string> | null
default:desc

Sort order by created_at

Available options: asc, desc
id
string | null

Filter by job ID

status
enum<string> | null

Filter by status

Available options: validating, queued, in_progress, completed, failed, cancelled, expired
endpoint
enum<string> | null

Filter by endpoint

Available options: /v1/documents/extract, /v1/documents/parse, /v1/documents/split, /v1/documents/classify, /v1/schemas/generate, /v1/edit/agent/fill, /v1/edit/templates/fill, /v1/edit/templates/generate, /v1/evals/extract/process, /v1/evals/split/process, /v1/evals/classify/process, /v1/evals/extract/extract, /v1/evals/extract/split
source
enum<string> | null

High-level source filter. Use api/project/workflow.

Available options: api, project, workflow
project_id
string | null

Filter by request.project_id

workflow_id
string | null

Filter by metadata.workflow_id

workflow_node_id
string | null

Filter by metadata.workflow_node_id or metadata.node_id

model
string | null

Filter by request.model

filename_regex
string | null

Regex or plain text pattern applied to request filenames.

filename_contains
string | null

Plain text substring applied to request filenames.

document_type
string[] | null

Filter by document type. Can be repeated. Accepted values: bmp, csv, doc, docm, docx, dotm, dotx, eml, gif, heic, heif, htm, html, jpeg, jpg, json, md, mhtml, msg, odp, ods, odt, ots, ott, pdf, png, ppt, pptx, rtf, svg, tif, tiff, tsv, txt, webp, xlam, xls, xlsb, xlsm, xlsx, xltm, xltx, xml, yaml, yml.

from_date
string | null

Filter jobs created on or after this date (YYYY-MM-DD)

to_date
string | null

Filter jobs created on or before this date (YYYY-MM-DD)

metadata
string | null

JSON object string to filter metadata key/value pairs.

include_request
boolean
default:false

Whether to include the full original request body in each listed job.

include_response
boolean
default:false

Whether to include full response payloads in each listed job.

access_token
string | null

Response

Successful Response

Response for GET /v1/jobs.

data
JobListItem · object[]
required
object
string
default:list
Allowed value: "list"
first_id
string | null
last_id
string | null
has_more
boolean
default:false