Consensus can be enabled not only through the `n_consensus` parameter, but also with our dedicated k-llms library.
Context: This approach follows similar principles to Palantir's "K-LLMs" methodology, where multiple models evaluate the same prompt and their outputs are synthesized for higher accuracy, greater confidence, and fewer hallucinations.
How it works
Under the hood, Retab:

- Fires `n_consensus` identical calls.
- Parses each raw answer into a Pydantic model / JSON-Schema object.
- Runs a deterministic reconciliation strategy (illustrated in the sketch below):
  - Exact-match vote for scalar fields (strings, numbers, booleans)
  - Deep merge for arrays when all models agree on length and order
  - Recursive reconciliation for nested objects
- Returns the reconciled object in `response.output_parsed` (Responses) or `completion.choices[0].message.parsed` (Completions).
If the candidate answers cannot be reconciled, a `ConsensusError` is raised.
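Because the reconciliation strategy is deterministic, it can be pictured as plain dictionary merging. The sketch below is purely illustrative (it is not Retab's actual implementation) and assumes the parsed candidates are plain Python dicts sharing one schema, with a strict-majority rule standing in for the real tie-breaking logic:

```python
from collections import Counter
from typing import Any


class ConsensusError(Exception):
    """Raised when the candidate answers cannot be reconciled."""


def reconcile(candidates: list[Any]) -> Any:
    """Deterministically merge parsed candidates (illustrative sketch only)."""
    first = candidates[0]

    # Nested objects: reconcile each field recursively.
    # Assumes all candidates share the same keys (they were parsed against one schema).
    if isinstance(first, dict):
        return {key: reconcile([c[key] for c in candidates]) for key in first}

    # Arrays: deep-merge element-wise only when all models agree on length and order.
    if isinstance(first, list):
        if any(len(c) != len(first) for c in candidates):
            raise ConsensusError("Candidates disagree on array length")
        return [reconcile(list(items)) for items in zip(*candidates)]

    # Scalars (strings, numbers, booleans): exact-match vote.
    # A strict majority is required here; the real tie-breaking rule may differ.
    value, votes = Counter(candidates).most_common(1)[0]
    if votes <= len(candidates) // 2:
        raise ConsensusError(f"No majority for scalar value: {candidates!r}")
    return value


# Example: three candidate extractions of the same invoice.
print(reconcile([
    {"vendor": "Acme", "total": 42.0},
    {"vendor": "Acme", "total": 42.0},
    {"vendor": "ACME", "total": 42.0},
]))  # -> {'vendor': 'Acme', 'total': 42.0}
```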
Note: Running consensus with `n=4` means 4× the API cost, and the parallel requests may hit rate limits. Consider using smaller models like `gpt-4.1-mini` for cost optimization.
k-LLMs
We provide a quick, type-safe wrapper around the OpenAI Chat Completions and Responses endpoints with automatic consensus reconciliation.

- Step 1: Generate diverse answers. We purposely sample each call with `temperature > 0` to obtain distinct answers (see the sampling sketch after this list).
- Step 2: SOTA reconciliation. A bespoke strategy merges the candidates with state-of-the-art accuracy.
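Conceptually, Step 1 amounts to firing the same request several times with a non-zero temperature. The snippet below makes that idea concrete using the plain OpenAI SDK; the wrapper performs this fan-out (and the reconciliation of Step 2) for you, and the temperature value shown is an arbitrary assumption:

```python
from openai import OpenAI

client = OpenAI()

prompt = [{"role": "user",
           "content": "Extract the vendor name from: 'Invoice from Acme Corp, total $42.'"}]

# Step 1 (conceptually): sample k independent answers with temperature > 0
# so the candidates are diverse enough for a meaningful vote.
candidates = [
    client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=prompt,
        temperature=0.7,  # assumption: any value > 0 works; the library picks its own
    ).choices[0].message.content
    for _ in range(4)
]

# Step 2 is the reconciliation pass sketched earlier, run over `candidates`.
print(candidates)
```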
Quick‑Start: Switch in One Line
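A rough sketch of what the one-line switch might look like, assuming the wrapper exposes an OpenAI-compatible client (the `kllms` import path and `KLLMs` client name below are placeholders, not confirmed API):

```python
# from openai import OpenAI    # before: the stock OpenAI client
# client = OpenAI()
from kllms import KLLMs        # after: assumed import path for the k-llms wrapper
client = KLLMs()               # assumed client name; the call shape mirrors OpenAI's

completion = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{"role": "user",
               "content": "What is the capital of France? Answer with the city name only."}],
    n=4,  # n > 1 enables consensus (see Arguments below)
)
print(completion.choices[0].message.content)
```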
Arguments
| Name | Type | Default | Notes |
|---|---|---|---|
| `model` | `str` | — | Any OpenAI chat model name. |
| `messages` | `list[dict]` | — | Same shape as OpenAI's `messages`. |
| `response_format` | `pydantic.BaseModel` or `dict` (JSON-Schema) | — | Target structure. |
| `n` | `int` | `1` | `>1` enables consensus. |
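Putting the arguments together, the hypothetical example below (reusing the assumed `KLLMs` client from the sketch above) passes a Pydantic model as `response_format` and reads the reconciled object from `completion.choices[0].message.parsed`, as described in "How it works":

```python
from pydantic import BaseModel

from kllms import KLLMs  # assumed import path, as in the quick-start sketch


class Invoice(BaseModel):
    vendor: str
    total: float


client = KLLMs()  # assumed client name
completion = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{"role": "user", "content": "Invoice from Acme Corp, total $42."}],
    response_format=Invoice,  # target structure: Pydantic model or JSON-Schema dict
    n=4,                      # >1 enables consensus reconciliation
)

invoice = completion.choices[0].message.parsed  # reconciled, validated Invoice instance
print(invoice)
```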