GET /v1/projects/{project_id}/documents/{document_id}

from retab import Retab

# Create a client, then fetch a single document from a project by its IDs.
client = Retab()
response = client.projects.documents.get("<project_id>", "<document_id>")

{
  "mime_data": {
    "filename": "file.pdf",
    "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAADIA..."
  },
  "annotation": {},
  "annotation_metadata": {
    "extraction_id": "<string>",
    "likelihoods": {},
    "field_locations": {},
    "agentic_field_locations": {},
    "consensus_details": [
      {}
    ]
  },
  "playground_extraction": {},
  "playground_extraction_metadata": {
    "extraction_id": "<string>",
    "likelihoods": {},
    "field_locations": {},
    "agentic_field_locations": {},
    "consensus_details": [
      {}
    ]
  },
  "id": "<string>",
  "ocr_file_id": "<string>"
}

Authorizations

Api-Key: string, header, required

Path Parameters

project_id: string, required
document_id: string, required

Query Parameters

include_content: boolean, default: false

Whether to include the content of the document.
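
As a rough sketch of how the header, path parameters, and query parameter fit together, the request could be assembled with the Python standard library as follows. The base URL `https://api.example.com` and the helper name `build_get_document_request` are placeholders assumed for illustration and do not come from this reference; only the path, the `Api-Key` header, and `include_content` are taken from the page. Serializing the boolean as lowercase `true`/`false` is likewise an assumption.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Placeholder base URL: an assumption for illustration, not from this reference.
BASE_URL = "https://api.example.com"


def build_get_document_request(api_key, project_id, document_id, include_content=False):
    """Build (but do not send) a GET request for a single project document."""
    # Percent-encode the IDs so unusual characters cannot alter the path.
    path = (
        f"/v1/projects/{quote(project_id, safe='')}"
        f"/documents/{quote(document_id, safe='')}"
    )
    # Assumed serialization: lowercase "true"/"false" in the query string.
    query = urlencode({"include_content": "true" if include_content else "false"})
    request = Request(f"{BASE_URL}{path}?{query}", method="GET")
    # The API key travels in the Api-Key header, as listed under Authorizations.
    request.add_header("Api-Key", api_key)
    return request
```

Passing the returned object to `urllib.request.urlopen` would send it. Whether the SDK call `client.projects.documents.get` accepts an `include_content` argument is not stated on this page.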

Response

Successful Response

mime_data: object, required

The MIME data of the document. This can also be a BaseMIMEData, which is why the separate id field exists: it identifies the file and is equal to mime_data.id.

id: string, required

The ID of the document. Equal to mime_data.id, and still available when mime_data is a BaseMIMEData.

annotation: object

The ground truth of the document.

annotation_metadata: object | null

The metadata of the annotation when the annotation is a prediction.

playground_extraction: object

The playground extraction of the document.

playground_extraction_metadata: object | null

The metadata of the playground extraction.

ocr_file_id: string | null

The ID of the OCR file.
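
Since `annotation_metadata`, `playground_extraction_metadata`, and `ocr_file_id` may be null, and the top-level `id` is the field documented as always identifying the document, a consumer might want to read the response defensively. Here is a minimal sketch, assuming the response has already been decoded into a plain `dict`; the helper name `summarize_document` is hypothetical, and the SDK may return a typed object instead.

```python
def summarize_document(doc):
    """Pull commonly needed fields out of a decoded document response (a dict)."""
    mime_data = doc["mime_data"]  # documented as required
    # annotation_metadata may be null, so fall back to an empty dict.
    annotation_metadata = doc.get("annotation_metadata") or {}
    return {
        # Use the top-level id: it is documented as equal to mime_data.id
        # and remains usable when mime_data is a BaseMIMEData.
        "id": doc["id"],
        "filename": mime_data.get("filename"),
        "has_annotation": bool(doc.get("annotation")),
        "extraction_id": annotation_metadata.get("extraction_id"),
        "ocr_file_id": doc.get("ocr_file_id"),  # None when the API returns null
    }


# Illustrative payload shaped like the example response; the values are made up.
example = {
    "mime_data": {"filename": "file.pdf", "url": "data:image/png;base64,..."},
    "annotation": {},
    "annotation_metadata": None,
    "id": "doc_1",
    "ocr_file_id": None,
}
summary = summarize_document(example)
```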