Mitigating hallucinations
Reduce fabricated content in adapted data with hallucination mitigation and web-search grounding via brand_controls.
Generative models can produce hallucinations—plausible text that is not grounded in your data or verifiable facts. When that noise lands in training or evaluation sets, it works against durable, transferable model behavior—the kind of control Adaptive Data is designed to reinforce.
Hallucination mitigation steers generations toward verifiable information using web-search grounding when enabled, so adapted completions are less likely to invent facts you will later have to filter or relabel.
Configure a run
Set hallucination_mitigation to true inside brand_controls:
run = client.datasets.run(
    dataset_id,
    column_mapping={
        "prompt": "instruction",
        "completion": "response",
    },
    brand_controls={
        "hallucination_mitigation": True,
    },
)

You can combine this with other brand_controls (length, safety categories, and related options) on the same run.
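When several controls are combined, it can help to assemble the brand_controls dict in one place before passing it to the run. The following is a minimal sketch, not part of the SDK: build_brand_controls is a hypothetical helper, and example_option is a placeholder key rather than a documented brand_controls name. Consult the brand_controls reference for the real option names.

```python
from typing import Any


def build_brand_controls(hallucination_mitigation: bool = True, **extra: Any) -> dict[str, Any]:
    """Return a brand_controls dict with mitigation set explicitly.

    Hypothetical helper, not part of the SDK. Keys passed via **extra are
    forwarded unchanged; options whose value is None are dropped so unset
    values are never sent.
    """
    controls: dict[str, Any] = {"hallucination_mitigation": hallucination_mitigation}
    for key, value in extra.items():
        if value is not None:
            controls[key] = value
    return controls


# "example_option" and "unused_option" are placeholder keys, not documented names.
controls = build_brand_controls(example_option="value", unused_option=None)
print(controls)
```

The resulting dict can then be passed as brand_controls=controls to client.datasets.run, which keeps the mitigation flag visible even as other options are added or removed.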
Full example
Same structure as Getting started: ingest (or reuse a dataset_id), wait until the dataset is ready, adapt with mitigation enabled, wait for completion, then export.
import os
import time

from adaption import Adaption, DatasetTimeout

client = Adaption(api_key=os.environ["ADAPTION_API_KEY"])
dataset_id = os.environ.get("ADAPTION_DATASET_ID")

# Upload a file only when no existing dataset ID is supplied.
if not dataset_id:
    result = client.datasets.upload_file("training_data.csv")
    dataset_id = result.dataset_id
    # Poll until ingestion reports a row count, which signals the dataset is ready.
    while True:
        st = client.datasets.get_status(dataset_id)
        if st.row_count is not None:
            break
        time.sleep(2)

# Dry run: request a credit estimate before starting the real run.
estimate = client.datasets.run(
    dataset_id,
    column_mapping={"prompt": "instruction", "completion": "response"},
    brand_controls={"hallucination_mitigation": True},
    estimate=True,
)
print(f"Estimated credits: {estimate.estimated_credits_consumed}")

# Start the adaptation run with mitigation enabled.
run = client.datasets.run(
    dataset_id,
    column_mapping={"prompt": "instruction", "completion": "response"},
    brand_controls={"hallucination_mitigation": True},
)
print(f"Run started: {run.run_id}")

# Block until the run finishes, or give up after one hour.
try:
    final = client.datasets.wait_for_completion(dataset_id, timeout=3600)
    print(f"Finished: {final.status}")
    if final.error:
        raise RuntimeError(final.error.message)
except DatasetTimeout:
    print("Timed out; poll datasets.get or get_status in your environment")

# Export the adapted dataset.
url = client.datasets.download(dataset_id)
print(f"Download: {url}")

When to use it
Enable mitigation when outputs must stay fact-aligned: customer support, RAG-style workflows with external truth, compliance-sensitive text, or any pipeline where invented details are expensive to catch later. For creative or purely subjective tasks, you may omit it to save latency or credits; use estimate=True to compare cost before you scale.
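The cost comparison mentioned above can be sketched as two estimate-only calls, one with mitigation and one without. This is an illustrative sketch, not an SDK feature: compare_estimates is a hypothetical helper, and it assumes that passing "hallucination_mitigation": False is accepted and behaves like omitting the option. The run call, estimate=True, and estimated_credits_consumed are used as they appear in the full example.

```python
def compare_estimates(client, dataset_id, column_mapping):
    """Return estimated credits with and without hallucination mitigation.

    Hypothetical helper. `client` is expected to expose datasets.run(...)
    as in the full example; only estimate requests are made here.
    """
    results = {}
    for label, enabled in (("without_mitigation", False), ("with_mitigation", True)):
        estimate = client.datasets.run(
            dataset_id,
            column_mapping=column_mapping,
            # Assumption: an explicit False is equivalent to leaving the option out.
            brand_controls={"hallucination_mitigation": enabled},
            estimate=True,  # request an estimate, as in the full example
        )
        results[label] = estimate.estimated_credits_consumed
    return results
```

Calling compare_estimates(client, dataset_id, {"prompt": "instruction", "completion": "response"}) returns a dict with both figures, which makes it easy to decide whether the extra credits are justified for a given dataset.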