client.evals.classify is the SDK surface for benchmarking single-label document classification. You define the category set once, attach labeled examples to datasets, create iterations with category and model overrides, and measure how well each iteration performs before publishing it.
Resource Map
Quick Start
Core Resources
Eval
The top-level classify eval resource manages the category set and published configuration.
- create(name, categories, ...): create a new classify eval.
- get(eval_id), list(...), update(eval_id, ...), delete(eval_id): standard CRUD methods.
- publish(eval_id, origin=None): publish the current draft config.
- process(eval_id, iteration_id=None, document=..., ...): classify a document against the base eval or a specific iteration.
process() returns ClassifyResponse, which contains the winning classification and the model reasoning.
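The create, publish, and process calls listed for this resource can be chained as in the following minimal sketch. It runs against a `MagicMock` stand-in rather than the real client, so it only demonstrates the call order. The `id` attribute on the created eval, the shape of `categories`, the eval name, and the document path are all assumptions for illustration and are not confirmed by this page.

```python
from unittest.mock import MagicMock  # stand-in only; swap for the real SDK client


def create_publish_classify(client, name, categories, document):
    """Create a classify eval, publish its draft config, then classify one document.

    Assumptions (not confirmed by this page): the created eval exposes an
    `id` attribute, and `document` is whatever the SDK accepts for that argument.
    """
    classify = client.evals.classify
    created = classify.create(name=name, categories=categories)
    classify.publish(created.id)
    # No iteration_id is passed, so this targets the base eval configuration.
    return classify.process(created.id, document=document)


# Dry run against a mock so the call sequence can be checked without network access.
client = MagicMock()
client.evals.classify.create.return_value.id = "eval_123"  # hypothetical id
response = create_publish_classify(
    client,
    name="invoice-vs-receipt",  # hypothetical eval name
    categories=[{"name": "invoice"}, {"name": "receipt"}],  # shape is an assumption
    document="sample.pdf",  # hypothetical path
)
```

With a real client, `response` would be the ClassifyResponse described above. Its field names are not listed here, so the sketch returns it untouched instead of reading attributes from it.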
Datasets
Datasets store documents plus their expected labels.
- datasets.create(..., base_categories=..., base_inference_settings=...): create a labeled classification dataset.
- datasets.add_document(...): add a document with optional prediction_data.
- datasets.update_document(...): update validation_flag, prediction_data, classification_id, or extraction_id.
- datasets.process_document(...): run a document through the base eval configuration.
For classification datasets, the expected label is stored at prediction_data.prediction.classification.
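Assuming the path prediction_data.prediction.classification holds a document's expected label and maps to nested dictionary keys, a payload for it could be built as in the sketch below. The helper name `build_prediction_data` and the category names are hypothetical, and the plain-dict shape is an assumption. The SDK may expect a typed model instead.

```python
def build_prediction_data(label, category_names):
    """Wrap an expected label in the nested shape the dotted path implies.

    Assumption: prediction_data is a plain dict and the dotted path maps to
    nested keys. Check the SDK's own types before relying on this.
    """
    if label not in category_names:
        # Single-label classification: the label must be one of the eval's categories.
        raise ValueError(f"unknown category: {label!r}")
    return {"prediction": {"classification": label}}


categories = ["invoice", "receipt", "contract"]  # hypothetical category names
payload = build_prediction_data("receipt", categories)
```

Validating the label locally before upload catches typos early, since a misspelled label would otherwise count as a wrong answer in every scored iteration.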
Iterations
Iterations let you experiment with model settings and category text.
- iterations.create(...): start a new iteration.
- iterations.update_draft(...): update draft inference_settings or category_overrides.
- iterations.get_categories(...): fetch the effective category list for the iteration.
- iterations.get_schema(...): fetch the server-side schema view for the iteration.
- iterations.finalize(...): freeze the draft into a finalized iteration.
- iterations.process_documents(...): queue one labeled dataset document for iteration scoring.
- iterations.get_metrics(...): compute quality metrics for the iteration.
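This page does not say which metrics iterations.get_metrics(...) returns. As a point of reference only, the sketch below computes overall accuracy and per-category recall locally from (expected, predicted) label pairs, which are the usual starting points for single-label classification. The function name and sample labels are hypothetical, and the result is not a reproduction of the server-side computation.

```python
from collections import Counter


def score_predictions(pairs):
    """Compute overall accuracy and per-category recall.

    `pairs` is an iterable of (expected_label, predicted_label) tuples.
    """
    totals, hits = Counter(), Counter()
    for expected, predicted in pairs:
        totals[expected] += 1
        if predicted == expected:
            hits[expected] += 1
    n = sum(totals.values())
    accuracy = sum(hits.values()) / n if n else 0.0
    # Recall per category: correct predictions divided by documents with that label.
    per_category_recall = {label: hits[label] / totals[label] for label in totals}
    return accuracy, per_category_recall


pairs = [
    ("invoice", "invoice"),
    ("invoice", "receipt"),
    ("receipt", "receipt"),
    ("contract", "contract"),
]
accuracy, per_category_recall = score_predictions(pairs)
```

Here three of the four documents are labeled correctly, so accuracy is 0.75, and the per-category view shows that the miss is concentrated in the invoice category.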
category_overrides is the key iteration-specific knob for classify evals. It lets you refine category descriptions without rebuilding the whole eval.
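One way to picture how overrides produce an iteration's effective category list is a local merge of base categories with replacement descriptions. The sketch below is a conceptual model only. The override shape (category name mapped to a new description), the `name` and `description` keys, and the helper name are assumptions, and the server's actual merge logic is not documented here.

```python
def apply_category_overrides(base_categories, overrides):
    """Return a new category list with descriptions replaced where overridden.

    Assumption: categories are dicts with "name" and "description" keys and
    `overrides` maps a category name to its replacement description.
    """
    known = {category["name"] for category in base_categories}
    unknown = set(overrides) - known
    if unknown:
        # Overrides refine existing categories; they do not add new ones.
        raise KeyError(f"overrides reference unknown categories: {sorted(unknown)}")
    return [
        {**category, "description": overrides.get(category["name"], category["description"])}
        for category in base_categories
    ]


base = [
    {"name": "invoice", "description": "A bill"},
    {"name": "receipt", "description": "Proof of payment"},
]
effective = apply_category_overrides(
    base, {"receipt": "Proof of payment issued after a purchase"}
)
```

The base list is left untouched, which mirrors the idea that an iteration refines category text without rebuilding the eval itself.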
Templates
Templates work the same way as on the other eval resources.
- templates.list(...): browse classify templates.
- templates.get(template_id): inspect one template.
- templates.clone(template_id, name=None): create a new classify eval from the template.
- templates.list_builder_documents(template_id): fetch the template’s sample documents.
- templates.list_builder_document_previews(template_ids): fetch previews for multiple templates.
SDK Notes
- Python list() methods on evals, datasets, and iterations return model lists. Node returns paginated API payloads with data.
- Python and Node both expose get_categories() on classify iterations. This method is specific to classification evals.
- Classification evals do not expose process_stream() in the SDK.