Retab transforms any document into structured data, with six core functionalities:
parse -> Convert any file (PDFs, Excel, emails, images) into LLM-ready markdown.
extract -> Extract structured JSON from documents using your defined schema.
edit -> Modify document content while preserving formatting.
split -> Intelligently split documents into logical sections.
partition -> Group repeated records in a document into chunks by a key such as invoice number or policy ID.
classify -> Categorize documents based on content and type.
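To make the partition idea concrete, here is a minimal plain-Python sketch of grouping repeated records by a key. It does not call Retab and is not how Retab implements partitioning; the function name `partition_by_key`, the field names, and the sample rows are all made up for illustration.

```python
def partition_by_key(records, key):
    """Group records sharing the same value for `key`, keeping first-seen order."""
    groups = {}
    for record in records:
        # setdefault creates the chunk the first time a key value is seen
        groups.setdefault(record[key], []).append(record)
    return groups


# Hypothetical line items, as they might appear interleaved in one document
rows = [
    {"invoice_number": "INV-001", "item": "Widget", "amount": 40.0},
    {"invoice_number": "INV-002", "item": "Gadget", "amount": 15.5},
    {"invoice_number": "INV-001", "item": "Shipping", "amount": 5.0},
]

chunks = partition_by_key(rows, "invoice_number")
for invoice_number, items in chunks.items():
    print(invoice_number, [item["item"] for item in items])
```

Each chunk collects every record that belongs to one invoice, even when the records are not adjacent in the source.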
Quickstart
The most basic workflow is extracting structured data from a document. The easiest way to access the API is through the Python or Node SDK.
Get Started
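As a rough picture of what "structured data from a document" means, the sketch below defines a small schema and checks that a piece of extracted JSON matches it. It assumes the schema is written in JSON Schema style and deliberately makes no SDK call; `invoice_schema`, `conforms`, and the sample payload are hypothetical, so consult the SDK documentation for the actual extraction call.

```python
import json

# Hypothetical schema describing the fields we want out of an invoice
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number", "total"],
}

# Maps the two JSON Schema type names used above to Python types
PY_TYPES = {"string": str, "number": (int, float)}


def conforms(data, schema):
    """Check required fields and basic types; not a full JSON Schema validator."""
    if not isinstance(data, dict):
        return False
    if any(name not in data for name in schema["required"]):
        return False
    for name, spec in schema["properties"].items():
        if name not in data:
            continue
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, PY_TYPES[spec["type"]]):
            return False
    return True


# Made-up example of what an extraction result could look like
sample = json.loads('{"invoice_number": "INV-001", "total": 45.0}')
print(conforms(sample, invoice_schema))
```

Checking the result against the schema before using it downstream is a cheap guard against missing or mistyped fields.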
Workflows
Build complex document workflows with our no-code editor.
API Playground
Explore the API playground and try the Retab API.
Community
Discord
Join our community for tips and best practices.
GitHub
Star us and contribute to the project.