Function blocks execute sandboxed function code. The language config field currently supports Python, where code receives upstream data as a typed Input Pydantic model and returns a typed Output model, enabling arbitrary transformations, validations, and computed fields.
Overview
When processing documents, you often need values that aren’t directly extracted but can be computed from other fields. For example:
- Line item totals: quantity * unit_price
- Invoice totals: Sum of all line item amounts
- Reconciliation checks: Verify that computed totals match stated totals
- Conditional values: Apply different logic based on field values
With language: "Python", function blocks let you write Python code with access to the full standard library plus packages like pydantic, pandas, numpy, duckdb, and rapidfuzz.
Configuration
| Field | Description |
|---|---|
| language | Execution language. Currently only Python is supported. |
| output_schema | JSON schema defining the output structure. Required for stable downstream typing. |
| code | Python code containing a transform(input_data: Input) -> Output function. |
| timeout_seconds | Sandbox execution timeout in seconds (1-300, default 60). |
| table_refs | Optional list of workflow tables to mount as CSV files in the sandbox. |
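To make the fields above concrete, here is a sketch of what a function block configuration could contain, written as a Python dict. The exact format in which a block config is stored or submitted is not described on this page, so treat the dict shape, the schema fields, and the `timeout_in_range` helper as assumptions for illustration only.

```python
# Hypothetical function block configuration, shown as a plain Python dict.
function_block_config = {
    "language": "Python",
    "output_schema": {
        "type": "object",
        "properties": {
            "computed_total": {"type": "number"},
            "totals_match": {"type": "boolean"},
        },
        "required": ["computed_total", "totals_match"],
    },
    # The code field holds the Python source that defines transform().
    "code": "def transform(input_data: Input) -> Output:\n    ...\n",
    "timeout_seconds": 120,
    # table_refs is optional; the entry format is not specified here, so it is left empty.
    "table_refs": [],
}


def timeout_in_range(config: dict) -> bool:
    """Check the documented bounds: 1 to 300 seconds, defaulting to 60."""
    return 1 <= config.get("timeout_seconds", 60) <= 300
```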
Output Schema
Define the output contract as a JSON schema.
Code
Import the auto-generated Input and Output models from the virtual models module.
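As a rough sketch of how an output schema and a transform function fit together, the example below computes line totals and an invoice total. The field names are hypothetical, and the Input and Output dataclasses are local stand-ins so the snippet runs on its own. Inside the sandbox these models are auto-generated Pydantic models that you import rather than define by hand.

```python
from dataclasses import dataclass
from typing import List

# Illustrative output contract (hypothetical fields), written as a Python dict.
OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "line_totals": {"type": "array", "items": {"type": "number"}},
        "invoice_total": {"type": "number"},
    },
    "required": ["line_totals", "invoice_total"],
}


# Stand-ins for the auto-generated models. In a real function block, do not
# define these; import them from the virtual models module instead.
@dataclass
class LineItem:
    quantity: float
    unit_price: float


@dataclass
class Input:
    line_items: List[LineItem]


@dataclass
class Output:
    line_totals: List[float]
    invoice_total: float


def transform(input_data: Input) -> Output:
    # Compute each line total, then sum them for the invoice total.
    line_totals = [
        round(item.quantity * item.unit_price, 2) for item in input_data.line_items
    ]
    return Output(line_totals=line_totals, invoice_total=round(sum(line_totals), 2))


result = transform(Input(line_items=[LineItem(2, 9.5), LineItem(1, 4.0)]))
```

The keys in `OUTPUT_SCHEMA` mirror the fields that `transform` returns, which is what keeps downstream typing stable.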
Validation Patterns
Function blocks are commonly used after Extract blocks to validate extracted data.
Sum Check
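For the sum check pattern, a minimal sketch might look like the following. The field names (`line_amounts`, `stated_total`), the 0.01 tolerance, and the dataclass stand-ins for the auto-generated Input and Output models are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

TOLERANCE = 0.01  # assumed rounding tolerance, in currency units


@dataclass
class Input:  # stand-in for the auto-generated model
    line_amounts: List[float]
    stated_total: float


@dataclass
class Output:  # stand-in for the auto-generated model
    computed_total: float
    sum_matches: bool


def transform(input_data: Input) -> Output:
    # Recompute the total from its parts and compare with the stated value.
    computed = round(sum(input_data.line_amounts), 2)
    matches = abs(computed - input_data.stated_total) <= TOLERANCE
    return Output(computed_total=computed, sum_matches=matches)
```

Comparing within a tolerance rather than with `==` avoids false mismatches caused by floating-point rounding.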
Verify a total matches the sum of its parts.
Difference Check
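For the difference check pattern (a result that should equal A - B - C), one possible sketch is shown here. The fields `gross_amount`, `discount`, `deposit_paid`, and `balance_due` are hypothetical, as are the tolerance and the dataclass stand-ins for the auto-generated models.

```python
from dataclasses import dataclass

TOLERANCE = 0.01  # assumed rounding tolerance


@dataclass
class Input:  # stand-in for the auto-generated model
    gross_amount: float  # A
    discount: float      # B
    deposit_paid: float  # C
    balance_due: float   # stated result


@dataclass
class Output:  # stand-in for the auto-generated model
    expected_balance: float
    difference_matches: bool


def transform(input_data: Input) -> Output:
    # Expected result is A - B - C; compare it with the stated result.
    expected = round(
        input_data.gross_amount - input_data.discount - input_data.deposit_paid, 2
    )
    matches = abs(expected - input_data.balance_due) <= TOLERANCE
    return Output(expected_balance=expected, difference_matches=matches)
```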
Verify a result equals A - B - C.
Equality Check
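For the equality check pattern, a sketch along these lines compares two fields after light normalization. The field names are hypothetical, the choice to ignore case and surrounding whitespace is an assumption, and the dataclasses stand in for the auto-generated models.

```python
from dataclasses import dataclass


@dataclass
class Input:  # stand-in for the auto-generated model
    invoice_po_number: str
    order_po_number: str


@dataclass
class Output:  # stand-in for the auto-generated model
    po_numbers_match: bool


def normalize(value: str) -> str:
    # Ignore surrounding whitespace and letter case when comparing.
    return value.strip().casefold()


def transform(input_data: Input) -> Output:
    same = normalize(input_data.invoice_po_number) == normalize(
        input_data.order_po_number
    )
    return Output(po_numbers_match=same)
```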
Verify two fields match.
Conditional Labeling
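For the conditional labeling pattern, the sketch below assigns a category based on thresholds. The thresholds, label names, and the `amount` field are made up for the example, and the dataclasses stand in for the auto-generated models.

```python
from dataclasses import dataclass


@dataclass
class Input:  # stand-in for the auto-generated model
    amount: float


@dataclass
class Output:  # stand-in for the auto-generated model
    size_label: str


def transform(input_data: Input) -> Output:
    # Hypothetical thresholds: below 1,000 is small, below 10,000 is medium.
    if input_data.amount < 1_000:
        label = "small"
    elif input_data.amount < 10_000:
        label = "medium"
    else:
        label = "large"
    return Output(size_label=label)
```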
Categorize values.
String Extraction
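For the string extraction pattern, a regular expression from the standard library is often enough. This sketch assumes a hypothetical invoice number format such as `INV-2024-00123`; the pattern, the field names, and the dataclass stand-ins for the auto-generated models are illustrative only.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed format: uppercase prefix, four-digit year, numeric sequence.
INVOICE_NUMBER = re.compile(r"([A-Z]+)-(\d{4})-(\d+)")


@dataclass
class Input:  # stand-in for the auto-generated model
    invoice_number: str


@dataclass
class Output:  # stand-in for the auto-generated model
    prefix: Optional[str]
    year: Optional[int]
    sequence: Optional[int]


def transform(input_data: Input) -> Output:
    match = INVOICE_NUMBER.fullmatch(input_data.invoice_number.strip())
    if match is None:
        # Return empty parts rather than raising when the format is unexpected.
        return Output(prefix=None, year=None, sequence=None)
    prefix, year, sequence = match.groups()
    return Output(prefix=prefix, year=int(year), sequence=int(sequence))
```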
Extract structured parts from text.
Fuzzy Matching with DuckDB
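For fuzzy matching with DuckDB, the sketch below scores each row of a CSV file against an input string and keeps the best candidate. The column names (`vendor_id`, `name`), the 0.85 threshold, and the helper name `closest_vendor` are assumptions. It relies on DuckDB's `read_csv_auto` and `jaro_winkler_similarity` functions, so check both against the DuckDB version available in your sandbox.

```python
def closest_vendor(csv_path: str, raw_name: str, min_score: float = 0.85):
    """Return (vendor_id, name, score) for the best fuzzy match, or None."""
    import duckdb  # imported lazily; listed among the sandbox packages

    # Escape single quotes so the path is a valid SQL string literal.
    safe_path = csv_path.replace("'", "''")
    con = duckdb.connect()  # in-memory database
    try:
        row = con.execute(
            f"""
            SELECT vendor_id, name,
                   jaro_winkler_similarity(lower(name), lower(?)) AS score
            FROM read_csv_auto('{safe_path}', header = true)
            ORDER BY score DESC
            LIMIT 1
            """,
            [raw_name],
        ).fetchone()
    finally:
        con.close()
    # Reject weak matches so a poor candidate is not treated as a hit.
    if row is None or row[2] < min_score:
        return None
    return row
```

Inside `transform`, you would call a helper like this with the path of the mounted table and the extracted value, then copy the matched fields into the Output model.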
Look up values in a mounted workflow table.
Available Packages
Standard library (json, re, datetime, math, os, collections, itertools, etc.), plus:
| Package | Use Case |
|---|---|
| pydantic | Input/Output models (auto-generated) |
| pandas, numpy, scipy | Data manipulation and math |
| python-dateutil | Date parsing |
| beautifulsoup4, lxml | HTML/XML parsing |
| duckdb | In-memory SQL analytics, fuzzy string matching |
| rapidfuzz | Fast fuzzy string matching |
Outbound network access is disabled inside function sandboxes. Use the
api_call block when you need to call external HTTP APIs, then pass the
response into the function block.
Workflow Tables
Mount workflow tables (managed via the Tables UI or API) as CSV files in the sandbox.
Use /tmp/ as the mount prefix (not /data/). In local dev mode the sandbox runs on the host filesystem and /tmp/ is always writable.
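Because mounted tables are ordinary CSV files, the standard library is enough to read them. The sketch below loads one into a dict for exact lookups. The helper name `load_lookup` and the column names are hypothetical, and the real file name under /tmp/ depends on how the table is mounted.

```python
import csv
from typing import Dict


def load_lookup(csv_path: str, key_column: str, value_column: str) -> Dict[str, str]:
    """Read a mounted CSV table into a dict keyed by one of its columns."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return {row[key_column]: row[value_column] for row in csv.DictReader(handle)}
```

A call such as `load_lookup("/tmp/vendors.csv", "vendor_id", "name")` (with a hypothetical path) returns a mapping you can consult inside `transform`. Note that `csv.DictReader` yields every value as a string.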
Rules
- Always provide an output_schema that matches what transform() returns.
- transform() must accept input_data: Input and return an Output instance.
- Access input fields via dot notation: input_data.field_name.
- Do not redefine the Input class; it is auto-generated from the upstream block’s schema.
- If the output is nested, return plain dict/list structures matching output_schema.
- Use os.environ["VAR_NAME"] for secrets; never hardcode credentials.
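To illustrate the rule about secrets, a small helper can read a value from the environment and fail with a clear message when it is missing. The helper name `get_secret` and the variable name in the usage note are hypothetical, and how environment variables are provisioned for the sandbox is not covered here.

```python
import os


def get_secret(name: str) -> str:
    """Read a secret from the environment instead of hardcoding it."""
    try:
        return os.environ[name]
    except KeyError:
        # Name the missing variable, but never echo secret values.
        raise RuntimeError(f"Missing required environment variable: {name}") from None
```

For example, `get_secret("SIGNING_KEY")` returns the value if the variable is set and raises `RuntimeError` otherwise.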
Go Further
- Extraction - Learn how to extract structured data, design schemas, add reasoning prompts, and inspect provenance
- Schema - Design your extraction schemas