Introduction
Retab offers a consolidated, production-grade pipeline for processing any type of document with AI. Our model reads documents the way humans do: it accepts native digital files (images, PDFs, DOCX, XLSX, e-mail), parses the text, and detects visual structure across pages, tables, forms, and figures. Please check the API Reference for more details. The module exposes the following high-level methods:

Method | Purpose | When to use |
---|---|---|
extract | Executes the extraction and returns the parsed object (optionally with consensus voting). | One-step OCR + LLM parsing when only the structured output is required. |
parse | Converts any document into structured text content with page-by-page extraction. | Well suited to RAG, text extraction, and preparing documents for further processing or indexing. |
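The table mentions optional consensus voting for extract. As a rough illustration of the general idea only (not Retab's actual implementation, which this page does not describe), the sketch below merges several candidate extraction results by taking a per-field majority vote. The function name `consensus_vote` and the sample field names are hypothetical.

```python
from collections import Counter
from typing import Any


def consensus_vote(candidates: list[dict[str, Any]]) -> dict[str, Any]:
    """Pick, for each field, the value that most candidates agree on."""
    result: dict[str, Any] = {}
    # Union of all keys, kept in first-seen order
    keys = list(dict.fromkeys(k for c in candidates for k in c))
    for key in keys:
        # repr() serves as a hashable stand-in for unhashable values such as lists
        votes = Counter(repr(c.get(key)) for c in candidates)
        winner_repr, _ = votes.most_common(1)[0]
        # Return the first original value whose repr matches the winning vote
        result[key] = next(
            c.get(key) for c in candidates if repr(c.get(key)) == winner_repr
        )
    return result


# Three hypothetical extraction runs over the same document
runs = [
    {"invoice_number": "INV-001", "total": 120.0},
    {"invoice_number": "INV-001", "total": 120.0},
    {"invoice_number": "INV-007", "total": 120.0},
]
merged = consensus_vote(runs)
```

Here two of the three runs agree on the invoice number, so the outlier is discarded. A real system may weigh candidates differently or compare values more loosely than exact equality.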
The document data structure
Documents in Retab are represented as MIMEData objects, which encapsulate the file content and metadata. This structure lets you work with documents in a consistent way regardless of their original format. The url field directly matches OpenAI’s expected format for image inputs.
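To make the shape of such an object concrete, here is a minimal sketch using only the standard library. It assumes the url field holds a base64 data URL and that the object also carries a filename; the class `MIMEDataSketch` and the helper `to_mime_data` are hypothetical stand-ins, not Retab's own definitions.

```python
import base64
import mimetypes
from dataclasses import dataclass


@dataclass
class MIMEDataSketch:
    """Hypothetical stand-in for MIMEData, for illustration only."""
    filename: str  # assumed metadata field
    url: str       # assumed to be a data URL: "data:<mime>;base64,<payload>"


def to_mime_data(filename: str, content: bytes) -> MIMEDataSketch:
    # Guess the MIME type from the extension; fall back to a generic binary type
    mime_type, _ = mimetypes.guess_type(filename)
    mime_type = mime_type or "application/octet-stream"
    payload = base64.b64encode(content).decode("ascii")
    return MIMEDataSketch(filename=filename, url=f"data:{mime_type};base64,{payload}")


doc = to_mime_data("scan.png", b"\x89PNG\r\n\x1a\n")

# The same url string fits an OpenAI-style image content part unchanged
image_part = {"type": "image_url", "image_url": {"url": doc.url}}
```

Because the url is already in data-URL form, no further conversion is needed before placing it in a chat message that carries image input.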
You can pass the document parameter as a file path, bytes, or a PIL.Image.Image object, and we will automatically convert it to a MIMEData object for you.
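The automatic conversion could look roughly like the sketch below. It is an assumption about the general approach, not the SDK's code: the function `normalize_document`, the default filenames, and the choice of PNG for images are all hypothetical. Images are detected by duck typing on a `save` method, so the example runs without Pillow installed.

```python
import base64
import io
import mimetypes
from pathlib import Path
from typing import Any


def normalize_document(document: Any, default_name: str = "document.bin") -> dict[str, str]:
    """Return {'filename', 'url'} for a path, raw bytes, or an image-like object."""
    if isinstance(document, (str, Path)):
        # File path: read the bytes and keep the original file name
        path = Path(document)
        filename, content = path.name, path.read_bytes()
    elif isinstance(document, (bytes, bytearray)):
        # Raw bytes carry no name, so a placeholder is used
        filename, content = default_name, bytes(document)
    elif hasattr(document, "save"):
        # Duck-typed PIL.Image.Image: serialize to PNG in memory
        buffer = io.BytesIO()
        document.save(buffer, format="PNG")
        filename, content = "image.png", buffer.getvalue()
    else:
        raise TypeError(f"Unsupported document type: {type(document).__name__}")

    mime_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    payload = base64.b64encode(content).decode("ascii")
    return {"filename": filename, "url": f"data:{mime_type};base64,{payload}"}
```

Dispatching on the input type in one place keeps every downstream step working with a single representation, which is the benefit the section describes.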