
Introduction

The Files API lets you upload, manage, and retrieve documents stored in Retab. Files are the foundation of document processing: once uploaded, a file can be referenced by ID across extractions, workflows, and projects without re-uploading. The module exposes four methods:
Method             Purpose
upload             Upload a document and receive a persistent file_id for future reference.
list               List uploaded files with pagination, filename prefix search, and MIME type filtering.
get                Retrieve metadata for a single file by ID.
get_download_link  Get a temporary signed URL (60 min) to download the original file.

Uploading files

Files are uploaded as MIMEData objects, the same format used across the Retab API for documents. As the example shows, upload accepts either a local file path or a dict with a filename and a url:
from retab import Retab
from pathlib import Path

client = Retab()

# Upload from a file path
response = client.files.upload(Path("invoice.pdf"))
print(f"File ID: {response.file_id}")

# Upload from a base64 data URI passed in the "url" field
response = client.files.upload({
    "filename": "invoice.pdf",
    "url": "data:application/pdf;base64,JVBERi0xLjQK..."
})
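
The dict form above carries the file's bytes as a base64 data URI. A minimal sketch of building such a dict from a local file, using only the standard library, could look like the following. The helper name to_mime_data is hypothetical and not part of the Retab SDK; the only thing taken from the example above is the {"filename", "url"} shape.

```python
import base64
import mimetypes
from pathlib import Path


def to_mime_data(path: Path) -> dict:
    """Build a {"filename", "url"} dict holding the file as a base64 data URI.

    Hypothetical helper for illustration; not part of the Retab SDK.
    """
    # Guess the MIME type from the file extension; fall back to generic binary.
    mime_type, _ = mimetypes.guess_type(path.name)
    mime_type = mime_type or "application/octet-stream"
    # Base64-encode the raw bytes and wrap them in a data URI.
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return {"filename": path.name, "url": f"data:{mime_type};base64,{encoded}"}
```

Assuming the upload method accepts this dict as shown earlier, the result could be passed straight through, for example client.files.upload(to_mime_data(Path("invoice.pdf"))).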

The file data structure

File Object
{
  "id": "file_a1b2c3d4e5f6",
  "object": "file",
  "filename": "invoice.pdf",
  "organization_id": "org_abc123",
  "page_count": 3,
  "created_at": "2024-01-15T10:30:00Z",
  "updated_at": "2024-01-15T10:30:00Z"
}
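
If you work with the raw JSON rather than SDK objects, the payload above maps naturally onto a small typed record. The sketch below is one way to do that; FileRecord, parse_timestamp, and parse_file are hypothetical names chosen for illustration, and the field list is taken only from the example payload, so a real response may carry more fields.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileRecord:  # hypothetical type, not an SDK class
    id: str
    filename: str
    organization_id: str
    page_count: int
    created_at: datetime
    updated_at: datetime


def parse_timestamp(value: str) -> datetime:
    # Rewrite the trailing "Z" as "+00:00" so that datetime.fromisoformat
    # also accepts it on Python versions older than 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_file(payload: str) -> FileRecord:
    """Parse a JSON file object shaped like the example payload."""
    data = json.loads(payload)
    if data.get("object") != "file":
        raise ValueError(f"expected object 'file', got {data.get('object')!r}")
    return FileRecord(
        id=data["id"],
        filename=data["filename"],
        organization_id=data["organization_id"],
        page_count=int(data["page_count"]),
        created_at=parse_timestamp(data["created_at"]),
        updated_at=parse_timestamp(data["updated_at"]),
    )
```

Converting the timestamps to timezone-aware datetime values up front makes later comparisons, such as sorting by creation time, less error-prone than comparing strings.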

Listing and filtering

Use list to browse uploaded files with cursor-based pagination:
# List recent files
files = client.files.list(limit=20)
for f in files:
    print(f"{f.id}: {f.filename}")

# Filter by filename prefix and MIME type
pdfs = client.files.list(filename="invoice", mime_type="application/pdf")
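
The examples above do not show how the cursor is passed between calls, so the following is only a generic sketch of the cursor-based pagination pattern, not the Retab client's actual interface. It assumes a hypothetical fetch_page callable that takes the current cursor (None for the first page) and returns a pair of (items, next_cursor), with next_cursor set to None on the last page. The fake_fetch function is an in-memory stand-in so the sketch runs without any network access.

```python
from typing import Any, Callable, Iterator, List, Optional, Tuple

# Hypothetical page shape: (items on this page, cursor for the next page or None).
Page = Tuple[List[Any], Optional[str]]


def iter_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Any]:
    """Yield every item across pages by following cursors until none is left."""
    cursor: Optional[str] = None
    while True:
        items, cursor = fetch_page(cursor)
        yield from items
        if cursor is None:
            break


def fake_fetch(cursor: Optional[str]) -> Page:
    """In-memory stand-in for a paginated endpoint, two items per page."""
    names = ["a.pdf", "b.pdf", "c.pdf", "d.pdf", "e.pdf"]
    start = int(cursor) if cursor else 0
    next_start = start + 2
    next_cursor = str(next_start) if next_start < len(names) else None
    return names[start:next_start], next_cursor


all_names = list(iter_all(fake_fetch))
```

Wrapping the loop in a generator keeps callers free of cursor bookkeeping; to use it against the real API, fetch_page would need to be adapted to whatever cursor parameters client.files.list actually exposes.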

Downloading files

Retrieve a time-limited signed URL to download the original file:
link = client.files.get_download_link("file_a1b2c3d4e5f6")
print(f"Download URL: {link.download_url}")
print(f"Expires in: {link.expires_in}")
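
Once you have the signed URL, the file itself can be fetched with any HTTP client. A minimal sketch using only the standard library is shown below; download_to is a hypothetical helper name, and it assumes the signed URL can be fetched with a plain GET request without extra headers.

```python
import shutil
import urllib.request
from pathlib import Path


def download_to(url: str, destination: Path) -> int:
    """Stream the content at `url` into `destination` and return its size in bytes.

    Hypothetical helper for illustration; not part of the Retab SDK.
    """
    # copyfileobj streams in chunks, so large files are not held in memory.
    with urllib.request.urlopen(url) as response, destination.open("wb") as out:
        shutil.copyfileobj(response, out)
    return destination.stat().st_size
```

For example, download_to(link.download_url, Path("invoice.pdf")) would save the original file locally. Because the link is temporary, it is safest to start the download soon after requesting it and to request a fresh link if the old one has expired.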