Create Workflow Eval Run
Create a workflow-scoped eval run.
Provide workflow_id in the request body as the execution context, and
optionally narrow execution with scope to one saved eval or one block.
If scope is omitted, every saved eval in the workflow runs.
The response is a run resource. Use its id with the run-id-first endpoints:
Get Workflow Eval Run and List Eval Run Results.
The request body has a workflow context and an optional scope:
- omitted scope - run every saved eval in the workflow.
- scope.type = "single" - run one saved eval by eval_id.
- scope.type = "block" - run every saved eval for one block by block_id.
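The three scope shapes above can be sketched as a small request-body builder. This is a minimal illustration, not the official client: the helper name build_eval_run_body is hypothetical, and it assumes eval_id and block_id are nested inside the scope object next to type, which the text implies but does not state outright.

```python
from typing import Any, Dict, Optional


def build_eval_run_body(
    workflow_id: str,
    eval_id: Optional[str] = None,
    block_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a JSON-serializable body for creating a workflow eval run.

    Hypothetical helper. Assumes eval_id / block_id live inside `scope`.
    """
    if eval_id is not None and block_id is not None:
        raise ValueError("scope targets either one eval or one block, not both")

    # workflow_id is always required; it is the execution context.
    body: Dict[str, Any] = {"workflow_id": workflow_id}

    if eval_id is not None:
        # Run one saved eval.
        body["scope"] = {"type": "single", "eval_id": eval_id}
    elif block_id is not None:
        # Run every saved eval attached to one block.
        body["scope"] = {"type": "block", "block_id": block_id}
    # Otherwise scope is omitted and every saved eval in the workflow runs.

    return body
```

Calling build_eval_run_body("wf_1") yields a body with no scope key, so the run covers every saved eval in the workflow. The identifier values such as "wf_1" are placeholders.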
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
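As a minimal sketch of the header form described above, assuming the token is sent in the standard HTTP Authorization header (the helper name auth_headers is hypothetical):

```python
from typing import Dict


def auth_headers(token: str) -> Dict[str, str]:
    """Return request headers carrying the bearer token.

    Hypothetical helper; assumes the standard Authorization header.
    """
    if not token:
        raise ValueError("an auth token is required")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed to whichever HTTP client is in use when sending the create request.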
Body
Create a workflow eval run. Provide a workflow_id, and optionally narrow execution with scope to a single eval or one block. Omit scope to run every saved workflow eval.
Response
Successful Response
A batch execution of a workflow's evals, with overall lifecycle, timing, and pass/fail counts.
The eval run has been created but execution has not started.
- PendingWorkflowEvalRun
- QueuedWorkflowEvalRun
- RunningWorkflowEvalRun
- CompletedWorkflowEvalRun
- ErrorWorkflowEvalRun
- CancelledWorkflowEvalRun
Public workflow-eval target.
The storage layer remains block-scoped today, but the API shape names the tested entity explicitly so workflow-level targets can be added later.
Aggregate counts for a batch of block-eval runs.
Each individual run contributes to exactly one lifecycle_counts
bucket, and additionally to one outcome bucket when
lifecycle_counts.completed is incremented.
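The counting rule above can be illustrated with a short tally. This is a sketch under assumptions, not the service's implementation: the lifecycle bucket names are inferred from the run variants listed earlier, and the outcome bucket names (passed, failed) and the outcome_counts key are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

# Assumed bucket names; only lifecycle_counts.completed is confirmed by the text.
LIFECYCLE_STATES = ("pending", "queued", "running", "completed", "error", "cancelled")
OUTCOMES = ("passed", "failed")  # hypothetical outcome bucket names


def aggregate_counts(
    runs: Iterable[Tuple[str, Optional[str]]],
) -> Dict[str, Dict[str, int]]:
    """Tally (lifecycle_state, outcome) pairs into the two count groups."""
    lifecycle = {state: 0 for state in LIFECYCLE_STATES}
    outcome = {name: 0 for name in OUTCOMES}

    for state, result in runs:
        if state not in lifecycle:
            raise ValueError(f"unknown lifecycle state: {state!r}")
        # Every run lands in exactly one lifecycle bucket.
        lifecycle[state] += 1
        # Only completed runs also land in an outcome bucket.
        if state == "completed":
            if result not in outcome:
                raise ValueError(f"completed run needs an outcome, got {result!r}")
            outcome[result] += 1

    return {"lifecycle_counts": lifecycle, "outcome_counts": outcome}
```

Under this reading, the outcome buckets always sum to lifecycle_counts.completed, and the lifecycle buckets sum to the total number of individual runs in the batch.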