What is Retab?

Retab addresses four major challenges in processing data with Large Language Models (LLMs):
  1. Parsing: Convert any file type (PDFs, Excel, emails, etc.) into LLM-ready format without writing custom parsers
  2. Extraction: Get consistent, reliable outputs using schema-based prompt engineering
  3. Projects: Evaluate the performance of models against annotated datasets
  4. Deployments: Publish a live, stable, shareable document processor from your project

Our goal is to make analyzing documents and unstructured data as easy and transparent as possible. We offer the software-defined primitives you need to build your own document processing solutions. We see it as Stripe for document processing.

A new, lighter paradigm

Large Language Models collapse entire layers of legacy OCR pipelines into a single, elegant abstraction. When a model can read, reason, and structure text natively, we no longer need brittle heuristics, handcrafted parsers, or heavyweight ETL jobs. Instead, we can expose a small, principled API: input your document, define the output schema, and receive reliable structured data. The result is less complexity, better accuracy, faster processing, and reduced costs. By building around LLMs from the ground up, we shift the focus from tedious infrastructure to extracting meaningful answers from your data.

Many people haven’t yet realized how powerful LLMs have become at document processing tasks. We believe that LLMs and structured generation are among the most impactful breakthroughs of the 21st century. AI is the new electricity, and Retab is here to help you tame it.

Structured Generation

JSON is one of the most widely used formats in the world for applications to exchange data. Structured Generation is a feature that ensures the AI model will always generate responses that adhere to your supplied JSON Schema, so you don’t need to worry about the model omitting a required key, or hallucinating an invalid enum value.
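To make the two failure modes above concrete, here is a minimal sketch in plain Python. It does not call Retab; the schema, its field names, and the `find_violations` helper are hypothetical and exist only for illustration. The helper checks just the `required` and `enum` keywords of a flat JSON Schema, which are the two problems Structured Generation is described as ruling out.

```python
import json

# Hypothetical schema for an invoice extraction; field names are illustrative only.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "currency": {"type": "string", "enum": ["USD", "EUR", "GBP"]},
        "total": {"type": "number"},
    },
    "required": ["invoice_number", "currency", "total"],
}


def find_violations(schema: dict, output: dict) -> list:
    """Return missing required keys and invalid enum values found in `output`.

    Covers only the `required` and `enum` keywords of a flat object schema;
    this is not a full JSON Schema validator.
    """
    violations = []
    for key in schema.get("required", []):
        if key not in output:
            violations.append(f"missing required key: {key}")
    for key, spec in schema.get("properties", {}).items():
        if key in output and "enum" in spec and output[key] not in spec["enum"]:
            violations.append(f"invalid enum value for {key}: {output[key]!r}")
    return violations


# A response generated without schema constraints: drops a key, invents an enum value.
unconstrained = json.loads('{"invoice_number": "INV-001", "currency": "dollars"}')
# A response that conforms to the schema.
conforming = json.loads('{"invoice_number": "INV-001", "currency": "USD", "total": 120.5}')

print(find_violations(INVOICE_SCHEMA, unconstrained))
print(find_violations(INVOICE_SCHEMA, conforming))
```

The first call reports both problems and the second returns an empty list. With Structured Generation, a check like this becomes a safety net instead of a required post-processing step, since the model output is constrained to the schema at generation time.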

Community

Let’s create the future of document processing together! Join our Discord community to share tips, discuss best practices, and showcase what you build. Or just tweet at us. We can’t wait to see how you’ll use Retab.

Roadmap

We share our roadmap publicly. Please submit your feature requests on GitHub. Among the features we’re working on:
  • Schema optimization autopilot
  • Sources API
  • Document Edit API

Learn More