List Jobs - Retab Docs

from retab import Retab

client = Retab()

# List all jobs
jobs_response = client.jobs.list()
for job in jobs_response.data:
    print(f"{job.id}: {job.status} - {job.endpoint}")

# List with filters and pagination
jobs_response = client.jobs.list(
    status="completed",
    limit=10
)

# Paginate through results
if jobs_response.has_more:
    next_page = client.jobs.list(
        after=jobs_response.last_id,
        limit=10
    )

{
  "object": "list",
  "data": [
    {
      "id": "job_V1StGXR8_Z5jdHi6B-myT",
      "object": "job",
      "status": "completed",
      "endpoint": "/v1/extractions",
      "request": { ... },
      "response": {
        "status_code": 200,
        "body": { ... }
      },
      "error": null,
      "created_at": 1705420810,
      "started_at": 1705420812,
      "completed_at": 1705420820,
      "expires_at": 1706025610,
      "organization_id": "org_abc123",
      "metadata": {"batch_id": "batch_001"}
    },
    {
      "id": "job_X2TuHYS9_A6keIj7C-nzU",
      "object": "job",
      "status": "queued",
      "endpoint": "/v1/parses",
      "request": { ... },
      "response": null,
      "error": null,
      "created_at": 1705420800,
      "started_at": null,
      "completed_at": null,
      "expires_at": 1706025600,
      "organization_id": "org_abc123",
      "metadata": null
    }
  ],
  "first_id": "job_V1StGXR8_Z5jdHi6B-myT",
  "last_id": "job_X2TuHYS9_A6keIj7C-nzU",
  "has_more": true
}

Query Parameters

before

string

Reverse-pagination cursor. Use the first_id from the current page to fetch the previous page.

after

string

Pagination cursor. Use the last_id from the previous response to get the next page.

limit

integer

default:"20"

Maximum number of jobs to return. Range: 1-100.

order

string

default:"desc"

Sort order by created_at. Supported values: asc, desc.

id

string

Filter by an exact job ID.

status

string

Filter by job status. Valid values:

validating
queued
in_progress
completed
failed
cancelled
expired
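Because status only accepts the values listed above, it can help to validate a value locally before sending a request. The following is a minimal sketch; the names JOB_STATUSES and check_status are illustrative and not part of the Retab SDK.

```python
# Status values copied from the list above.
JOB_STATUSES = frozenset({
    "validating", "queued", "in_progress",
    "completed", "failed", "cancelled", "expired",
})


def check_status(status: str) -> str:
    """Return status unchanged, or raise ValueError if it is not a documented value."""
    if status not in JOB_STATUSES:
        raise ValueError(f"unknown job status: {status!r}")
    return status


print(check_status("in_progress"))  # in_progress
```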

endpoint

string

Filter by the target jobs endpoint.

source

string

High-level source filter. Supported values:

api - Jobs created directly through the API
project - Jobs associated with a request.project_id
workflow - Jobs associated with metadata.workflow_id

project_id

string

Filter by request.project_id.

workflow_id

string

Filter by metadata.workflow_id.

workflow_block_id

string

Filter by metadata.workflow_block_id or metadata.block_id.

model

string

Filter by request.model.

filename_regex

string

Anchored filename pattern applied to request.document.filename or request.documents[].filename.

filename_contains

string

Plain-text substring filter applied to request filenames.

document_type

string[]

Repeatable filter for file types such as pdf, png, jpg, docx, xlsx, csv, json, or xml.
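A repeatable parameter appears once per value in the query string. As a sketch of what that looks like on the wire, the standard library can expand a list into repeated keys; this only builds the query string and does not call the API.

```python
from urllib.parse import urlencode

# doseq=True turns a list value into one key=value pair per element.
params = {"status": "completed", "document_type": ["pdf", "png"]}
query = urlencode(params, doseq=True)

print(query)  # status=completed&document_type=pdf&document_type=png
```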

from_date

string

Filter jobs created on or after this date in YYYY-MM-DD format.

to_date

string

Filter jobs created on or before this date in YYYY-MM-DD format.
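Both date bounds are inclusive, so a seven-day window ending today spans today and the six days before it. A small sketch of building the two values follows; date_window is a hypothetical helper name, not an SDK function.

```python
from datetime import date, timedelta
from urllib.parse import urlencode


def date_window(end: date, days: int) -> dict:
    """Build from_date/to_date values covering `days` days ending on `end`, inclusive."""
    start = end - timedelta(days=days - 1)
    return {"from_date": start.isoformat(), "to_date": end.isoformat()}


window = date_window(date(2024, 1, 16), days=7)
print(urlencode(window))  # from_date=2024-01-10&to_date=2024-01-16
```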

metadata

string

JSON object string for exact metadata filtering, for example {"batch_id":"batch_001"}.
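Since metadata is passed as a JSON object string, it has to be serialized and then URL-encoded. The sketch below shows one way to do that with the standard library and checks that the value survives a round trip; it builds the query string only and makes no request.

```python
import json
from urllib.parse import parse_qs, urlencode

metadata_filter = {"batch_id": "batch_001"}

# Serialize compactly, then let urlencode percent-escape braces, quotes and colons.
query = urlencode({"metadata": json.dumps(metadata_filter, separators=(",", ":"))})

# Decoding the query string should give back the original object.
decoded = json.loads(parse_qs(query)["metadata"][0])
print(decoded == metadata_filter)  # True
```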

include_request

boolean

default:"false"

Whether to include and restore full request payloads in each list item.

include_response

boolean

default:"false"

Whether to include and restore full response payloads/documents in each list item.

By default, list responses are lightweight: request and response are omitted or null. Pass include_request=true and include_response=true when you need the full payloads.

When include_response=true, limit must not exceed 20. This keeps response restoration bounded.
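The interaction between limit and include_response can be checked before a request is sent. This is a sketch under the constraints stated above (limit between 1 and 100, and at most 20 when include_response is true); list_params is a hypothetical helper, and the lowercase true/false strings follow the query-string form used in the note above.

```python
def list_params(limit: int = 20, include_request: bool = False,
                include_response: bool = False) -> dict:
    """Validate limit against the documented ranges and return query parameters."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if include_response and limit > 20:
        raise ValueError("limit must be <= 20 when include_response is true")
    return {
        "limit": limit,
        # Booleans are written as lowercase strings for the query string.
        "include_request": str(include_request).lower(),
        "include_response": str(include_response).lower(),
    }


print(list_params(limit=20, include_response=True))
```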

Response Fields

object

string

Always "list".

data

array

Array of Job objects, sorted by created_at. The default order is descending (newest first); set order=asc to reverse it.

Job object properties

id

string

Unique identifier for the job.

object

string

Always "job".

status

string

Current status of the job.

endpoint

string

The target endpoint for this job.

request

object | null

The original request body. May be omitted when include_request=false.

response

object | null

The response when completed. May be omitted when include_response=false.

error

object | null

Error details when failed.

created_at

integer

Unix timestamp of creation.

started_at

integer | null

Unix timestamp when processing started.

completed_at

integer | null

Unix timestamp of completion.

expires_at

integer

Unix timestamp of expiration.

organization_id

string

The owning organization.

metadata

object | null

User-provided metadata.

first_id

string | null

ID of the first job in this page (null if empty).

last_id

string | null

ID of the last job in this page. Use as the after parameter for pagination.

has_more

boolean

Whether there are more results available.
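The timestamp fields can be combined to see how long a job waited and ran. The sketch below uses the values from the first job in the example response and assumes the timestamps are whole seconds in UTC; job_timings is an illustrative helper, not part of the SDK.

```python
from datetime import datetime, timezone


def job_timings(job: dict) -> dict:
    """Derive durations in seconds; entries are None while the job has not reached that stage."""
    created = job["created_at"]
    started = job.get("started_at")
    completed = job.get("completed_at")
    return {
        "queue_wait_s": started - created if started is not None else None,
        "run_time_s": (completed - started
                       if started is not None and completed is not None else None),
        "retention_s": job["expires_at"] - created,
    }


completed_job = {
    "created_at": 1705420810,
    "started_at": 1705420812,
    "completed_at": 1705420820,
    "expires_at": 1706025610,
}

print(job_timings(completed_job))
# {'queue_wait_s': 2, 'run_time_s': 8, 'retention_s': 604800}
print(datetime.fromtimestamp(completed_job["created_at"], tz=timezone.utc).isoformat())
# 2024-01-16T16:00:10+00:00
```

In this example the job expires 604800 seconds (7 days) after it was created.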

Pagination

By default, jobs are returned in descending order by creation time (newest first). Use cursor-based pagination to iterate through all results:

all_jobs = []
after = None

while True:
    response = client.jobs.list(after=after, limit=100)
    all_jobs.extend(response.data)

    if not response.has_more:
        break
    after = response.last_id

print(f"Total jobs: {len(all_jobs)}")
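The same loop can be wrapped in a generator so callers process jobs as they arrive instead of holding every page in memory. This is a sketch: iter_jobs is a hypothetical helper that accepts any callable with the after/limit keyword arguments and the data, last_id and has_more fields described above, and fake_list is a stand-in with made-up IDs so the example runs offline.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterator


def iter_jobs(list_page: Callable[..., Any], limit: int = 100, **filters: Any) -> Iterator[Any]:
    """Yield jobs one at a time, following last_id until has_more is false."""
    after = None
    while True:
        page = list_page(after=after, limit=limit, **filters)
        yield from page.data
        # Stop on the last page, or if the page was empty (last_id is null).
        if not page.has_more or page.last_id is None:
            return
        after = page.last_id


# Stand-in for client.jobs.list (hypothetical data, no network access).
def fake_list(after=None, limit=100):
    ids = ["job_a", "job_b", "job_c"]
    start = ids.index(after) + 1 if after else 0
    chunk = ids[start:start + limit]
    return SimpleNamespace(
        data=chunk,
        last_id=chunk[-1] if chunk else None,
        has_more=start + limit < len(ids),
    )


print(list(iter_jobs(fake_list, limit=2)))  # ['job_a', 'job_b', 'job_c']
```

With the SDK, the call would presumably look like iter_jobs(client.jobs.list, status="failed"), assuming client.jobs.list accepts those keyword arguments as in the examples on this page.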

Filtering Examples

Monitor Active Jobs

# Fetch the first page of running and waiting jobs (at most 20 each with the default limit)
in_progress = client.jobs.list(status="in_progress")
queued = client.jobs.list(status="queued")

print(f"In progress: {len(in_progress.data)}")
print(f"Queued: {len(queued.data)}")
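Note that len(response.data) counts only the jobs on the page that was returned. When jobs from several pages or statuses have been collected into one list, a tally per status gives a quick overview. The following sketch uses a hypothetical count_by_status helper and stand-in objects that only carry a status attribute.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_status(jobs) -> Counter:
    """Tally jobs by their status attribute."""
    return Counter(job.status for job in jobs)


# Stand-in job objects (hypothetical data).
sample = [SimpleNamespace(status=s) for s in ("queued", "in_progress", "queued")]
counts = count_by_status(sample)

print(counts["queued"], counts["in_progress"])  # 2 1
```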

Find Failed Jobs

# List failed jobs to investigate errors
failed_jobs = client.jobs.list(status="failed", limit=50)

for job in failed_jobs.data:
    print(f"{job.id}: {job.error.code} - {job.error.message}")
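Since error is documented as object | null, a defensive variant can guard against a missing error before reading its code. The sketch below groups job IDs by error code; group_failures is an illustrative helper, and the error codes in the sample data are invented for the example.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_failures(jobs) -> dict:
    """Map error code -> list of job IDs; jobs without error details go under 'unknown'."""
    groups = defaultdict(list)
    for job in jobs:
        code = job.error.code if job.error is not None else "unknown"
        groups[code].append(job.id)
    return dict(groups)


# Stand-in job objects; the codes are hypothetical.
sample = [
    SimpleNamespace(id="job_1", error=SimpleNamespace(code="timeout", message="...")),
    SimpleNamespace(id="job_2", error=None),
    SimpleNamespace(id="job_3", error=SimpleNamespace(code="timeout", message="...")),
]

print(group_failures(sample))  # {'timeout': ['job_1', 'job_3'], 'unknown': ['job_2']}
```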

Filter by Metadata

The metadata query parameter performs exact key/value matching on the server. For other conditions, you can also filter client-side:

# Find jobs from a specific batch
batch_id = "batch_001"
batch_jobs = []

response = client.jobs.list(limit=100)
for job in response.data:
    if job.metadata and job.metadata.get("batch_id") == batch_id:
        batch_jobs.append(job)

print(f"Jobs in {batch_id}: {len(batch_jobs)}")
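When matching on more than one metadata key, the client-side check can be factored into a small predicate that also handles jobs whose metadata is null. This is a sketch; metadata_matches is a hypothetical helper name.

```python
from typing import Any, Mapping, Optional


def metadata_matches(metadata: Optional[Mapping[str, Any]],
                     wanted: Mapping[str, Any]) -> bool:
    """True when every wanted key is present in metadata with an equal value."""
    if metadata is None:
        return False
    return all(key in metadata and metadata[key] == value
               for key, value in wanted.items())


print(metadata_matches({"batch_id": "batch_001", "env": "prod"}, {"batch_id": "batch_001"}))  # True
print(metadata_matches(None, {"batch_id": "batch_001"}))  # False
```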

Authorizations

Api-Key

string

header

required
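For callers not using the SDK, the key travels in an Api-Key request header. The sketch below only constructs a GET /v1/jobs request with the standard library and does not send it. The base URL is an assumption and should be replaced with the host for your deployment; build_list_jobs_request is an illustrative helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.retab.com"  # assumption: replace with your actual API host


def build_list_jobs_request(api_key: str, **params) -> Request:
    """Build (but do not send) a GET /v1/jobs request carrying the Api-Key header."""
    query = urlencode(params, doseq=True)
    url = f"{BASE_URL}/v1/jobs" + (f"?{query}" if query else "")
    return Request(url, headers={"Api-Key": api_key}, method="GET")


req = build_list_jobs_request("YOUR_API_KEY", status="completed", limit=10)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; that step is left out so the example runs without network access or a real key.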

Query Parameters

before

string | null

Pagination cursor (first ID from current page)

after

string | null

Pagination cursor (last ID from previous page)

limit

integer

default:20

Number of jobs to return

Required range: 1 <= x <= 100

order

enum<string> | null

default:desc

Sort order by created_at

Available options:

asc,

desc

id

string | null

Filter by job ID

status

enum<string> | null

Filter by status

Available options:

validating,

queued,

in_progress,

completed,

failed,

cancelled,

expired

endpoint

enum<string> | null

Filter by endpoint

Available options:

/v1/extractions,

/v1/parses,

/v1/splits,

/v1/partitions,

/v1/classifications,

/v1/schemas/generate,

/v1/edits,

/v1/edits/templates/fill,

/v1/edits/templates/generate,

/v1/evals/extract/process,

/v1/evals/extract/extract,

/v1/evals/extract/split

source

enum<string> | null

High-level source filter. Use api/project/workflow.

Available options:

api,

project,

workflow

project_id

string | null

Filter by request.project_id

workflow_id

string | null

Filter by metadata.workflow_id

workflow_block_id

string | null

Filter by metadata.workflow_block_id or metadata.block_id

model

string | null

Filter by request.model

filename_regex

string | null

Regex or plain text pattern applied to request filenames.

filename_contains

string | null

Plain text substring applied to request filenames.

document_type

string[] | null

Filter by document type. Can be repeated. Accepted values: bmp, csv, doc, docm, docx, dotm, dotx, eml, gif, heic, heif, htm, html, jpeg, jpg, json, md, mhtml, msg, odp, ods, odt, ots, ott, pdf, png, ppt, pptx, rtf, svg, tif, tiff, tsv, txt, webp, xlam, xls, xlsb, xlsm, xlsx, xltm, xltx, xml, yaml, yml.

from_date

string | null

Filter jobs created on or after this date (YYYY-MM-DD)

to_date

string | null

Filter jobs created on or before this date (YYYY-MM-DD)

metadata

string | null

JSON object string to filter metadata key/value pairs.

include_request

boolean

default:false

Whether to include the full original request body in each listed job.

include_response

boolean

default:false

Whether to include full response payloads in each listed job.

access_token

string | null

operation_name

string

default:get_ready_gcp_settings

Response

Successful Response

Response for GET /v1/jobs.

data

JobListItem · object[]

required

object

string

default:list

Allowed value: "list"

first_id

string | null

last_id

string | null

has_more

boolean

default:false
