Brand controls

Guides

Encode length, content safety, and freeform brand voice into each run.

Adaptive Data treats specification as foundational: goals such as length, safety, and brand voice become objectives that steer what the platform produces. Data configuration should be structural, not a layer of preferences applied after the fact.

In the Python SDK, brand_controls on datasets.run is where you express that specification. It covers:

length — target verbosity of generated completions.
safety_categories — content safety categories to enforce.
hallucination_mitigation — web-search grounding (covered in Mitigating hallucinations).
blueprint — a freeform system prompt for tone, persona, language, or any guideline that does not fit the structured fields.

Together these help ensure adapted data matches how much your product says, what it is allowed to say, and how it says it.

Length

Sets how verbose adapted completions should be—aligned with UI copy limits, documentation depth, or coaching-style responses.

run = client.datasets.run(
    dataset_id,
    column_mapping={"prompt": "instruction", "completion": "response"},
    brand_controls={
        "length": "concise",
    },
)

Safety categories

Completions that violate selected categories are handled according to platform policy—so training data stays closer to rules you can audit, not ad hoc filtering downstream.

run = client.datasets.run(
    dataset_id,
    column_mapping={"prompt": "instruction", "completion": "response"},
    brand_controls={
        "safety_categories": ["harassment", "hate"],
    },
)

Example values appear in tests; for production, treat the API schema as source of truth.

Blueprint: freeform brand voice

Use blueprint when the guideline is a sentence or paragraph, not a structured flag—tone, persona, target audience, language preferences, or product-specific phrasing. The string is injected as a system prompt on every generated completion, so it applies uniformly across the run.

run = client.datasets.run(
    dataset_id,
    column_mapping={"prompt": "instruction", "completion": "response"},
    brand_controls={
        "blueprint": (
            "You are a customer-success assistant for Acme. "
            "Answer in British English, stay warm but concise, "
            "and never recommend third-party tools."
        ),
    },
)

Use blueprint alongside the structured fields: length and safety_categories handle quantitative and policy constraints, while blueprint carries qualitative brand/style instructions. They compose on the same run.

Combined example

run = client.datasets.run(
    dataset_id,
    column_mapping={"prompt": "instruction", "completion": "response"},
    brand_controls={
        "length": "detailed",
        "safety_categories": ["harassment", "hate"],
        "blueprint": "Answer in British English with a warm, concise tone.",
    },
)

Full script

import os
import time

from adaption import Adaption, DatasetTimeout

client = Adaption(api_key=os.environ["ADAPTION_API_KEY"])
dataset_id = os.environ.get("ADAPTION_DATASET_ID")

if not dataset_id:
    result = client.datasets.upload_file("training_data.csv")
    dataset_id = result.dataset_id
    while True:
        status = client.datasets.get_status(dataset_id)
        if status.row_count is not None:
            break
        time.sleep(2)

run = client.datasets.run(
    dataset_id,
    column_mapping={"prompt": "instruction", "completion": "response"},
    brand_controls={
        "length": "concise",
        "safety_categories": ["harassment", "hate"],
        "blueprint": "Answer in British English with a warm, concise tone.",
    },
)
print(f"Run started: {run.run_id}")

try:
    final = client.datasets.wait_for_completion(dataset_id, timeout=3600)
    print(f"Finished: {final.status}")
    if final.error:
        raise RuntimeError(final.error.message)
except DatasetTimeout:
    print("Timed out")

url = client.datasets.download(dataset_id)
print(f"Download: {url}")

When to use each control

length — Match assistant response depth to product constraints (short replies vs long explanations).
safety_categories — Encode policy-aligned training data for moderated or customer-facing models.
blueprint — Encode qualitative brand voice: tone, persona, language, product-specific phrasing, or any guideline that does not fit the structured fields.
hallucination_mitigation — Enable web-search grounding to reduce fabricated content. See Mitigating hallucinations.