Datasets

Get a dataset by ID
GET /api/v1/datasets/{dataset_id}
List datasets
GET /api/v1/datasets
Get the processing status of a dataset
GET /api/v1/datasets/{dataset_id}/status
Download the processed dataset
GET /api/v1/datasets/{dataset_id}/download
Publish a dataset to an external platform
POST /api/v1/datasets/{dataset_id}/publish
Start an augmentation run (or estimate cost)
POST /api/v1/datasets/{dataset_id}/run
Get evaluation results for a dataset
GET /api/v1/datasets/{dataset_id}/evaluation
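
A common way to use the status endpoint is to poll it until the dataset reaches a terminal state. The sketch below illustrates that pattern under stated assumptions: the status response is assumed to carry a `status` field with the same values as the Dataset model ("pending", "running", "succeeded", "failed"), and `fetch_json` is a hypothetical caller-supplied function that performs an authenticated GET against a path and returns the decoded JSON body. The helper names, poll interval, and poll limit are illustrative, not part of the API.

```python
import time
from typing import Any, Callable, Dict

# Path prefix taken from the endpoint list above.
API_PREFIX = "/api/v1/datasets"

# Assumption: these two values mean no further processing will happen.
TERMINAL_STATUSES = {"succeeded", "failed"}


def dataset_path(dataset_id: str, action: str = "") -> str:
    """Build a dataset path, e.g. dataset_path("ds_1", "status")."""
    base = f"{API_PREFIX}/{dataset_id}"
    return f"{base}/{action}" if action else base


def wait_for_dataset(
    fetch_json: Callable[[str], Dict[str, Any]],  # hypothetical HTTP helper
    dataset_id: str,
    interval_s: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll the status endpoint until the status is terminal.

    Returns the last response body, or raises TimeoutError after
    max_polls attempts without reaching a terminal status.
    """
    path = dataset_path(dataset_id, "status")
    for _ in range(max_polls):
        body = fetch_json(path)
        if body.get("status") in TERMINAL_STATUSES:
            return body
        sleep(interval_s)
    raise TimeoutError(f"dataset {dataset_id} still processing after {max_polls} polls")
```

Passing `fetch_json` and `sleep` in as parameters keeps the loop independent of any particular HTTP client and makes it easy to exercise without network access.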
Models
Dataset = object { configured_column_mapping, created_at, dataset_id, 8 more }
configured_column_mapping: object { chat, completion, context, prompt }

User-configured column mapping. Null if not yet configured.

chat: string
completion: string
context: array of string
prompt: string
created_at: string

Timestamp when the dataset was created

format: date-time
dataset_id: string

Unique dataset identifier

error: object { message }

Error details if the dataset failed. Null otherwise.

message: string

Error message

evaluation_summary: object { grade_after, grade_before, improvement_percent, 2 more }

Compact evaluation summary. Null if evaluation has not completed.

grade_after: string

Letter grade (A-E) after augmentation

grade_before: string

Letter grade (A-E) before augmentation

improvement_percent: number

Relative improvement percentage

score_after: number

Quality score after augmentation

score_before: number

Quality score before augmentation

name: string

Human-readable name for the dataset

progress: object { percent, processed_rows, total_rows }

Processing progress. Null when no run is active.

percent: number

Progress percentage (0-100)

processed_rows: number

Number of rows processed so far

total_rows: number

Total rows to process (samples_to_process or row_count)

row_count: number

Total number of rows in the dataset

run_id: string

ID of the currently active run

status: "pending" or "running" or "succeeded" or "failed"

Lifecycle status of the dataset.

One of the following:
"pending"
"running"
"succeeded"
"failed"
updated_at: string

Timestamp of the last update

format: date-time
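
Because several Dataset fields are documented as nullable (`error`, `evaluation_summary`, `progress`, `configured_column_mapping`), client code generally needs to guard against missing nested objects. The following is a minimal sketch of parsing a decoded Dataset JSON payload into typed Python objects. The class names `DatasetView`, `Progress`, and `EvaluationSummary` and the function `parse_dataset` are hypothetical, and the choice to treat only `dataset_id` and `status` as required is an assumption; the schema above does not say which fields are always present.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

# Status values listed in the Dataset model.
VALID_STATUSES = {"pending", "running", "succeeded", "failed"}


@dataclass
class Progress:
    percent: float
    processed_rows: int
    total_rows: int


@dataclass
class EvaluationSummary:
    grade_before: str
    grade_after: str
    score_before: float
    score_after: float
    improvement_percent: float


@dataclass
class DatasetView:
    dataset_id: str
    status: str
    name: Optional[str]
    row_count: Optional[int]
    error_message: Optional[str]            # taken from error.message
    progress: Optional[Progress]            # None when no run is active
    evaluation: Optional[EvaluationSummary]  # None until evaluation completes


def parse_dataset(payload: Dict[str, Any]) -> DatasetView:
    """Convert a decoded Dataset JSON object into a DatasetView."""
    status = payload["status"]
    if status not in VALID_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")

    # Nullable nested objects: treat a missing key the same as null.
    error = payload.get("error")
    prog = payload.get("progress")
    ev = payload.get("evaluation_summary")

    return DatasetView(
        dataset_id=payload["dataset_id"],
        status=status,
        name=payload.get("name"),
        row_count=payload.get("row_count"),
        error_message=error["message"] if error else None,
        progress=Progress(
            percent=prog["percent"],
            processed_rows=prog["processed_rows"],
            total_rows=prog["total_rows"],
        ) if prog else None,
        evaluation=EvaluationSummary(
            grade_before=ev["grade_before"],
            grade_after=ev["grade_after"],
            score_before=ev["score_before"],
            score_after=ev["score_after"],
            improvement_percent=ev["improvement_percent"],
        ) if ev else None,
    )
```

Fields are copied explicitly rather than unpacked with `**`, so an unexpected extra key in a nested object does not cause a failure.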

Datasets: Upload

Initiate a dataset upload
POST /api/v1/datasets/upload/initiate
Complete a dataset upload and trigger processing
POST /api/v1/datasets/upload/complete
Complete a file upload and trigger processing
POST /api/v1/datasets/{dataset_id}/upload/complete