
Evaluating dataset quality

Read evaluation status and quality metrics for a processed dataset.

After an adaptation run finishes successfully, the platform can produce evaluation signals—scores and related metrics that summarize how the augmented data compares to the original on quality dimensions the pipeline measures.

In the Python SDK you retrieve that information in two complementary ways:

  • datasets.get_evaluation(dataset_id) — dedicated response with evaluation pipeline status and structured quality metrics.
  • datasets.get(dataset_id) — full dataset record including evaluation_summary, a compact mirror of the headline metrics when evaluation has finished.

Call get_evaluation with the same dataset_id you adapted:

ev = client.datasets.get_evaluation(dataset_id)
print(ev.status)  # pending | running | succeeded | failed | skipped
if ev.quality:
    print(f"Score before: {ev.quality.score_before}")
    print(f"Score after: {ev.quality.score_after}")
    print(f"Improvement: {ev.quality.improvement_percent}%")

When status is succeeded, quality includes fields such as score_before / score_after (0–10 scale), letter grades, improvement_percent, and percentile_after where applicable. If evaluation is still pending or running, expect quality to be absent until the pipeline finishes.
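Those fields can be folded into a one-line report for logs or dashboards. The helper below is a local sketch, not an SDK method; it reads percentile_after defensively with getattr since that field is only present where applicable:

```python
def summarize_quality(quality):
    """Build a one-line summary from a succeeded evaluation's quality metrics.

    score_before / score_after are on the 0-10 scale; percentile_after may be
    absent, so it is read defensively rather than accessed directly.
    """
    parts = [
        f"{quality.score_before:.1f} -> {quality.score_after:.1f}",
        f"(+{quality.improvement_percent:.0f}%)",
    ]
    percentile = getattr(quality, "percentile_after", None)
    if percentile is not None:
        parts.append(f"p{percentile}")
    return " ".join(parts)
```

Guarding the call with `if ev.status == "succeeded" and ev.quality:` keeps it safe while evaluation is still pending or running.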

datasets.get returns a Dataset whose evaluation_summary is populated when a compact summary is available—useful for dashboards or listing datasets without a second request:

ds = client.datasets.get(dataset_id)
if ds.evaluation_summary:
    print(ds.evaluation_summary.score_after, ds.evaluation_summary.improvement_percent)

get_status focuses on ingestion/run progress and does not include evaluation; use get or get_evaluation when you care about quality metrics.

Adaptation may show succeeded before evaluation is done. Poll get_evaluation (or get if you only need evaluation_summary) until status is no longer pending or running:

import time

while True:
    ev = client.datasets.get_evaluation(dataset_id)
    if ev.status in ("succeeded", "failed", "skipped"):
        break
    time.sleep(5)

if ev.status == "succeeded" and ev.quality:
    print(ev.quality.model_dump(exclude_none=True))

Adjust the polling interval to match your environment (notebooks vs CI), and add a timeout so an unattended job fails fast if evaluation stalls.
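One way to package the loop with both knobs is a small local helper (wait_for_evaluation is not an SDK method; the terminal statuses are the ones listed above):

```python
import time


def wait_for_evaluation(client, dataset_id, interval=5.0, timeout=300.0):
    """Poll get_evaluation until it reaches a terminal status or timeout expires.

    Returns the final evaluation response; raises TimeoutError if the
    evaluation is still pending/running after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        ev = client.datasets.get_evaluation(dataset_id)
        if ev.status in ("succeeded", "failed", "skipped"):
            return ev
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"evaluation for {dataset_id} still {ev.status!r} after {timeout}s"
            )
        time.sleep(interval)
```

In a notebook a long timeout and a short interval are fine; in CI, prefer a tight timeout so a stalled pipeline surfaces as a failed job rather than a hung build.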

Async clients use the same shape: await client.datasets.get_evaluation(dataset_id).
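A minimal async sketch of that shape, wrapped in a coroutine so it can run under asyncio. How the async client is constructed is an assumption here (check the SDK for the actual async client export); only the awaited call mirrors the documented method:

```python
import asyncio


async def fetch_evaluation_status(client, dataset_id):
    """Await the evaluation response and return its status string.

    `client` is assumed to be the SDK's async client, whose
    datasets.get_evaluation coroutine has the same shape as the sync call.
    """
    ev = await client.datasets.get_evaluation(dataset_id)
    return ev.status
```

Call it with `asyncio.run(fetch_evaluation_status(client, dataset_id))` from sync code, or await it directly inside an existing event loop.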

After a completed run (see Getting started), pull evaluation:

import os

from adaption import Adaption

client = Adaption(api_key=os.environ["ADAPTION_API_KEY"])
dataset_id = os.environ["ADAPTION_DATASET_ID"]

ev = client.datasets.get_evaluation(dataset_id)
print(f"Evaluation status: {ev.status}")

ds = client.datasets.get(dataset_id)
print(f"Dataset status: {ds.status}")
if ds.evaluation_summary:
    print(f"Summary: {ds.evaluation_summary.model_dump(exclude_none=True)}")

Use get_evaluation when you need explicit evaluation status and full quality details. Use get when you already fetch the dataset and want a single summary on the same object. Pair either approach with estimate=True on future runs (see Processing large datasets and the FAQ) when you are iterating on quality before scaling row counts.