Skip to content
SupportGo to app

Mitigating hallucinations

Reduce fabricated content in adapted data with hallucination mitigation and web-search grounding via brand_controls.

Generative models can produce hallucinations—plausible text that is not grounded in your data or verifiable facts. When that noise lands in training or evaluation sets, it works against durable, transferable model behavior—the kind of control Adaptive Data is designed to reinforce.

Hallucination mitigation steers generations toward verifiable information using web-search grounding when enabled, so adapted completions are less likely to invent facts you will later have to filter or relabel.

Set hallucination_mitigation to true inside brand_controls:

run = client.datasets.run(
dataset_id,
column_mapping={
"prompt": "instruction",
"completion": "response",
},
brand_controls={
"hallucination_mitigation": True,
},
)

You can combine this with other brand_controls (length, safety categories, and related options) on the same run.

Same structure as Getting started: ingest (or reuse a dataset_id), wait until the dataset is ready, adapt with mitigation enabled, wait for completion, then export.

import os
import time
from adaption import Adaption, DatasetTimeout
client = Adaption(api_key=os.environ["ADAPTION_API_KEY"])
dataset_id = os.environ.get("ADAPTION_DATASET_ID")
if not dataset_id:
result = client.datasets.upload_file("training_data.csv")
dataset_id = result.dataset_id
while True:
st = client.datasets.get_status(dataset_id)
if st.row_count is not None:
break
time.sleep(2)
estimate = client.datasets.run(
dataset_id,
column_mapping={"prompt": "instruction", "completion": "response"},
brand_controls={"hallucination_mitigation": True},
estimate=True,
)
print(f"Estimated credits: {estimate.estimated_credits_consumed}")
run = client.datasets.run(
dataset_id,
column_mapping={"prompt": "instruction", "completion": "response"},
brand_controls={"hallucination_mitigation": True},
)
print(f"Run started: {run.run_id}")
try:
final = client.datasets.wait_for_completion(dataset_id, timeout=3600)
print(f"Finished: {final.status}")
if final.error:
raise RuntimeError(final.error.message)
except DatasetTimeout:
print("Timed out — poll datasets.get or get_status in your environment")
url = client.datasets.download(dataset_id)
print(f"Download: {url}")

Enable mitigation when outputs must stay fact-aligned: customer support, RAG-style workflows with external truth, compliance-sensitive text, or any pipeline where invented details are expensive to catch later. For creative or purely subjective tasks, you may omit it to save latency or credits; use estimate=True to compare cost before you scale.