Large language models are non-deterministic: the same prompt can return different answers each time. Consensus reduces this variability by sending the prompt through K independent model calls and merging their JSON outputs into one result. Our algorithm then compares those outputs field by field and scores the uncertainty of every field by averaging the pairwise distance between answers. The smaller the distance, the higher the likelihood. On Retab’s platform this signal exposes ambiguous user queries. When the calls disagree, the prompt can be interpreted in several ways, so we need to tighten the schema (clearer descriptions, less ambiguous names, stricter types) before moving on. These field-level likelihoods also lay the groundwork for an agent that can refine schemas autonomously, even without human supervision. You can use consensus through our SDK with the `n_consensus` parameter, or with our dedicated k-llms library.
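As a rough illustration of the scoring idea described above, the sketch below computes a per-field likelihood as one minus the mean pairwise distance between answers. The names `distance` and `field_likelihoods` and the 0/1 equality distance are assumptions made for this example, not Retab's actual metric, which may use a softer distance for strings or numbers.

```python
from itertools import combinations
from typing import Any


def distance(a: Any, b: Any) -> float:
    """Toy distance (assumption): 0.0 when two answers are equal, 1.0 otherwise."""
    return 0.0 if a == b else 1.0


def field_likelihoods(outputs: list[dict[str, Any]]) -> dict[str, float]:
    """Score each top-level field as 1 - mean pairwise distance across outputs."""
    fields = {key for output in outputs for key in output}
    scores: dict[str, float] = {}
    for field in sorted(fields):
        values = [output.get(field) for output in outputs]
        pairs = list(combinations(values, 2))
        if not pairs:  # a single output has nothing to disagree with
            scores[field] = 1.0
            continue
        mean_distance = sum(distance(a, b) for a, b in pairs) / len(pairs)
        scores[field] = 1.0 - mean_distance
    return scores


# Three hypothetical model outputs for the same prompt.
outputs = [
    {"invoice_number": "INV-001", "total": 120.0},
    {"invoice_number": "INV-001", "total": 120.0},
    {"invoice_number": "INV-001", "total": 102.0},
]
print(field_likelihoods(outputs))
```

Here `invoice_number` scores 1.0 because every call agrees, while `total` scores about 0.33 because two of the three pairs disagree. A low score like this is the kind of signal that would suggest tightening that field's description or type.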
Context: This approach follows similar principles to Palantir’s “K-LLMs” methodology, where multiple models evaluate the same prompt and their outputs are synthesized for increased accuracy, confidence, and reduced hallucinations.
How it works
Under the hood, Retab:
- Fires `n_consensus` identical calls.
- Parses each raw answer into a Pydantic model / JSON‑Schema object.
- Runs a deterministic reconciliation strategy:
- Exact match vote for scalar fields (strings, numbers, booleans)
- Deep merge for arrays when all models agree on length and order
- Recursive reconciliation for nested objects
- Returns the reconciled object in `response.output_parsed` (Responses) or `completion.choices[0].message.parsed` (Completions).
ConsensusError is raised.
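The reconciliation rules listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Retab's implementation: the helper names `majority` and `reconcile` are hypothetical, ties are broken in favour of the first answer seen, and arrays whose lengths differ fall back to a whole-value vote (the text above does not say what happens in that case).

```python
import json
from collections import Counter
from typing import Any


def majority(values: list[Any]) -> Any:
    """Exact-match vote: return the most frequent value, compared by canonical JSON."""
    keys = [json.dumps(value, sort_keys=True) for value in values]
    winner, _ = Counter(keys).most_common(1)[0]
    return values[keys.index(winner)]


def reconcile(candidates: list[Any]) -> Any:
    """Merge parsed candidates into a single value."""
    if all(isinstance(c, dict) for c in candidates):
        # Nested objects: reconcile each key recursively.
        keys = sorted({key for c in candidates for key in c})
        return {k: reconcile([c[k] for c in candidates if k in c]) for k in keys}
    if all(isinstance(c, list) for c in candidates):
        if len({len(c) for c in candidates}) == 1:
            # Arrays of equal length: merge position by position.
            return [reconcile(list(items)) for items in zip(*candidates)]
        return majority(candidates)  # assumption: vote on the whole array
    return majority(candidates)  # scalars: strings, numbers, booleans


# Three hypothetical parsed outputs for the same document.
candidates = [
    {"vendor": "Acme", "total": 120.0, "items": [{"sku": "A1", "qty": 2}]},
    {"vendor": "Acme", "total": 120.0, "items": [{"sku": "A1", "qty": 2}]},
    {"vendor": "ACME Corp", "total": 102.0, "items": [{"sku": "A1", "qty": 3}]},
]
print(reconcile(candidates))
```

Because every step is a pure function of the parsed candidates, the same set of answers always reconciles to the same object, which is what makes the strategy deterministic even though the underlying calls are not.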
Note: Running consensus with `n=4` means 4× the API costs, and parallel requests may hit rate limits. Consider using smaller models like `gpt-5-mini` for cost optimization.
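If you fan out calls yourself and run into rate limits, one common mitigation is to cap how many requests are in flight at once. The sketch below shows that pattern with `asyncio.Semaphore`. The `fan_out` helper and `fake_model_call` stub are hypothetical stand-ins for this example and do not call any real API. Whether Retab applies a similar cap internally is not stated here.

```python
import asyncio
from typing import Awaitable, Callable


async def fan_out(
    call: Callable[[], Awaitable[str]],
    n: int,
    max_concurrency: int = 2,
) -> list[str]:
    """Run `call` n times with at most `max_concurrency` requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded() -> str:
        async with semaphore:
            return await call()

    return await asyncio.gather(*(guarded() for _ in range(n)))


async def fake_model_call() -> str:
    """Stand-in for a real API request (assumption); returns a fixed JSON string."""
    await asyncio.sleep(0.01)
    return '{"total": 120.0}'


answers = asyncio.run(fan_out(fake_model_call, n=4))
print(len(answers))  # 4 answers, collected at most two at a time
```

Lowering `max_concurrency` trades latency for a smaller burst of requests; the total number of calls, and therefore the cost, stays at `n`.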
k-LLMs
We provide a quick, type‑safe wrapper around the OpenAI Chat Completions and Responses endpoints with automatic consensus reconciliation.
- Step 1: Generating diverse answers. We purposely sample each call with diversity settings to obtain distinct answers.
- Step 2: SOTA reconciliation. A bespoke strategy merges the candidates with state‑of‑the‑art accuracy.
Quick‑Start: Switch in One Line
Arguments
| Name | Type | Default | Notes |
|---|---|---|---|
| `model` | `str` | — | Any OpenAI chat model name. |
| `messages` | `list[dict]` | — | Same shape as OpenAI’s messages. |
| `response_format` | `pydantic.BaseModel` or `dict` JSON‑Schema | — | Target structure. |
| `n` | `int` | `1` | >1 enables consensus. |