Generate Schema
The Generate Schema endpoint allows you to automatically generate a JSON Schema from a set of example documents. This is particularly useful when you want to create a schema that captures all the important fields and patterns present in your documents. You can provide multiple documents to ensure the generated schema covers all possible variations in your data structure. The AI will analyze the documents and create a comprehensive schema with appropriate field descriptions and validation rules.
The Schema Object
The Schema class turns a design-time schema that you supply (JSON Schema or Pydantic model) into all the artefacts required to obtain, validate and post-process a large-language-model (LLM) extraction:
- A system prompt
- An inference-time schema
A Schema object represents a JSON schema for structured data extraction.
Introduction
Schema offers a single abstraction that:
- Ingests a schema specifying the data structure you want to extract (JSON Schema or Pydantic Model).
- Produces a system prompt and unfolds the reasoning fields into a new data structure that is used when calling the LLM.
1. Architectural Overview
Schema is the bridge that keeps two views in sync:
- Authoring view: the exact schema you provided, held in Schema.json_schema and suitable for documentation or version control.
- Inference view: an enhanced schema plus a comprehensive system prompt, used only when interacting with the LLM.
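To make the relationship between the two views concrete, here is a minimal standalone sketch of how an inference schema could be derived from an authoring schema. It does not use the library: the function `build_inference_schema` and the `reasoning___` key prefix are hypothetical names chosen for illustration, and the actual transformation performed by Schema may differ.

```python
import copy

# Assumption: reasoning fields are marked with this prefix. The naming
# convention the library really uses is not stated in this section.
REASONING_PREFIX = "reasoning___"


def build_inference_schema(authoring_schema: dict) -> dict:
    """Return a copy of the schema with a reasoning string before each property."""
    inference = copy.deepcopy(authoring_schema)  # never mutate the authoring view
    unfolded = {}
    for name, spec in authoring_schema.get("properties", {}).items():
        # The reasoning field comes first so the model explains before it answers.
        unfolded[REASONING_PREFIX + name] = {
            "type": "string",
            "description": f"Step-by-step justification for '{name}'.",
        }
        unfolded[name] = copy.deepcopy(spec)
    inference["properties"] = unfolded

    # A required field drags its reasoning field along with it.
    required = []
    for name in authoring_schema.get("required", []):
        required += [REASONING_PREFIX + name, name]
    if required:
        inference["required"] = required
    return inference


authoring_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number"],
}

inference_schema = build_inference_schema(authoring_schema)
print(list(inference_schema["properties"]))
```

The authoring schema is left untouched, which is what allows it to stay under version control while the inference schema is regenerated whenever it is needed.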
2. Typical Usage Pattern
3. Life-Cycle of a Response
Step | Validator / Schema | Goal |
---|---|---|
LLM output | inference_json_schema | Ensure the model produces both data and detailed reasoning. |
Post-process | filter_auxiliary_fields_json() | Remove the reasoning keys. |
Application object | Original Pydantic model (Invoice in the example) | Validate the cleaned payload and obtain a type-safe object. |
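The three steps in the table can be sketched end to end as follows. This is an illustration under stated assumptions, not the library's implementation: `strip_reasoning` is a hypothetical stand-in for filter_auxiliary_fields_json(), the `reasoning___` prefix is assumed, and a standard-library dataclass stands in for the Pydantic Invoice model so that the sketch has no third-party dependencies.

```python
import json
from dataclasses import dataclass

# Assumption: reasoning keys share this prefix (not confirmed by this section).
REASONING_PREFIX = "reasoning___"


def strip_reasoning(value):
    """Hypothetical stand-in for filter_auxiliary_fields_json():
    recursively drop every key that carries reasoning text."""
    if isinstance(value, dict):
        return {
            key: strip_reasoning(item)
            for key, item in value.items()
            if not key.startswith(REASONING_PREFIX)
        }
    if isinstance(value, list):
        return [strip_reasoning(item) for item in value]
    return value


@dataclass
class Invoice:
    """Stand-in for the original Pydantic model, with manual type checks."""
    invoice_number: str
    total: float

    def __post_init__(self):
        if not isinstance(self.invoice_number, str):
            raise TypeError("invoice_number must be a string")
        if not isinstance(self.total, (int, float)):
            raise TypeError("total must be a number")
        self.total = float(self.total)


# Step 1: the LLM output follows the inference schema (data plus reasoning).
raw_llm_output = json.dumps({
    "reasoning___invoice_number": "The header reads 'Invoice INV-001'.",
    "invoice_number": "INV-001",
    "reasoning___total": "The last line of the table shows 42.50.",
    "total": 42.5,
})

# Step 2: post-process by removing the reasoning keys.
cleaned = strip_reasoning(json.loads(raw_llm_output))

# Step 3: validate the cleaned payload against the application model.
invoice = Invoice(**cleaned)
print(invoice)
```

With a real Pydantic model, step 3 would typically be a call such as `Invoice.model_validate(cleaned)` (Pydantic v2), which raises a validation error if the cleaned payload does not match the model.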