Projects provide a systematic way to test and validate your extraction schemas against known ground truth data. Think of it as unit testing for document AI: you can measure accuracy, compare different models, and optimize your extraction pipelines with confidence.

A project consists of documents with annotations (your test data), iterations (test runs with different settings), and a schema (what you want to extract). This structure lets you run A/B tests between models and systematically improve your document processing accuracy.
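A schema is typically expressed as a JSON Schema describing the fields you want extracted. A minimal sketch (the field names here are illustrative, not a required layout):

```python
# A minimal extraction schema for an invoice, expressed as JSON Schema.
# Field names and descriptions are illustrative examples.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string", "description": "Unique invoice identifier"},
        "issue_date": {"type": "string", "description": "Issue date (YYYY-MM-DD)"},
        "total_amount": {"type": "number", "description": "Grand total including tax"},
    },
    "required": ["invoice_number", "total_amount"],
}
```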
The typical workflow:

1. Upload test documents with manually verified ground truth annotations.
2. Run iterations with different model settings (GPT-4o vs. GPT-4o-mini, consensus, etc.).
3. Compare the results to find the optimal configuration for your use case (see the SDK sketch below).
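As a rough illustration of these steps in the Python SDK, here is a hedged sketch. The client import follows the SDK's usual pattern, but the `projects.*` method and parameter names below are assumptions for illustration, not a verified API surface; consult the SDK reference for exact names.

```python
from retab import Retab  # assumes the Retab Python SDK is installed

client = Retab()  # assumes RETAB_API_KEY is set in the environment

schema = {  # minimal schema; see the earlier sketch for a fuller example
    "type": "object",
    "properties": {"invoice_number": {"type": "string"}},
}

# NOTE: method and parameter names below are illustrative assumptions.
# 1. Create a project around your schema.
project = client.projects.create(name="invoice-eval", json_schema=schema)

# 2. Upload a test document together with its verified ground truth.
client.projects.documents.create(
    project_id=project.id,
    document="tests/invoice_001.pdf",
    annotation={"invoice_number": "INV-001"},
)

# 3. Run one iteration per configuration you want to compare.
for model in ("gpt-4o", "gpt-4o-mini"):
    client.projects.iterations.create(project_id=project.id, model=model)
```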
Retab automatically calculates accuracy metrics by comparing each iteration’s output against your ground truth annotations, giving you objective performance measurements.
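Conceptually, a field-level accuracy metric is the fraction of annotated fields an iteration reproduced. A toy sketch of that idea (not Retab's internal implementation, which may weight fields or score partial matches differently):

```python
def field_accuracy(prediction: dict, ground_truth: dict) -> float:
    """Fraction of ground-truth fields the prediction matched exactly.

    Toy illustration only; Retab's built-in metrics may differ.
    """
    if not ground_truth:
        return 0.0
    matched = sum(1 for k, v in ground_truth.items() if prediction.get(k) == v)
    return matched / len(ground_truth)


# Example: 1 of 2 annotated fields matched -> 0.5
print(field_accuracy(
    {"invoice_number": "INV-001", "total_amount": 1200.00},
    {"invoice_number": "INV-001", "total_amount": 1250.00},
))
```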
While you can create projects programmatically with the SDK, we recommend using the Retab platform for project management. The web interface provides powerful schema editing tools, visual result comparisons, and collaborative features that make optimization much easier.