Datasets

Create a dataset from file upload, HuggingFace, or Kaggle

POST /api/v1/datasets

Get the processing status of a dataset

GET /api/v1/datasets/{dataset_id}/status

Download the processed dataset

GET /api/v1/datasets/{dataset_id}/download

Publish a dataset to an external platform

POST /api/v1/datasets/{dataset_id}/publish

Start an augmentation run (or estimate cost)

POST /api/v1/datasets/{dataset_id}/run

Get evaluation results for a dataset

GET /api/v1/datasets/{dataset_id}/evaluation
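The operations listed above share one path prefix and differ only in method and suffix, so a client can keep them in a small lookup table. The following is a minimal sketch, not an official client: the `BASE_URL` host, the operation names used as dictionary keys, and the `build_request` helper are all hypothetical and chosen only for illustration.

```python
from urllib.parse import quote

# Hypothetical base URL; the real host is not given in this reference.
BASE_URL = "https://api.example.com"

# Method and path template for each documented dataset operation.
# The dictionary keys are illustrative names, not part of the API.
ENDPOINTS = {
    "create": ("POST", "/api/v1/datasets"),
    "status": ("GET", "/api/v1/datasets/{dataset_id}/status"),
    "download": ("GET", "/api/v1/datasets/{dataset_id}/download"),
    "publish": ("POST", "/api/v1/datasets/{dataset_id}/publish"),
    "run": ("POST", "/api/v1/datasets/{dataset_id}/run"),
    "evaluation": ("GET", "/api/v1/datasets/{dataset_id}/evaluation"),
}


def build_request(operation, dataset_id=None, base_url=BASE_URL):
    """Return (method, url) for one of the operations in ENDPOINTS."""
    method, template = ENDPOINTS[operation]
    if "{dataset_id}" in template:
        if not dataset_id:
            raise ValueError(f"{operation!r} requires a dataset_id")
        # Percent-encode the ID so it cannot alter the path structure.
        path = template.replace("{dataset_id}", quote(dataset_id, safe=""))
    else:
        path = template
    return method, base_url.rstrip("/") + path
```

The returned pair can be handed to whichever HTTP library the caller already uses. Authentication and request bodies are not covered here because this section does not describe them.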

Models

Dataset = object { configured_column_mapping, created_at, dataset_id, 9 more }

configured_column_mapping: object { chat, completion, context, 2 more }

User-configured column mapping. Null if not yet configured.

chat: string

completion: string

context: array of string

image: string

prompt: string

created_at: string

Timestamp when the dataset was created

format: date-time

dataset_id: string

Unique dataset identifier

error_data: object { code, level, message }

Error details if the dataset failed. Null otherwise.

code: string

Stable error code when the failure was structured (e.g. E0100)

level: "error" or "warning"

Severity when known

One of the following:

"error"

"warning"

message: string

Error message

evaluation_summary: object { grade_after, grade_before, improvement_percent, 2 more }

Compact evaluation summary. Null if evaluation has not completed.

grade_after: string

Letter grade (A-E) after augmentation

grade_before: string

Letter grade (A-E) before augmentation

improvement_percent: number

Relative improvement percentage

score_after: number

Quality score after augmentation

score_before: number

Quality score before augmentation
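This reference does not state how improvement_percent is derived from the two scores. As a rough sketch, and assuming the conventional definition of relative change (an assumption, not something the API confirms), a client could cross-check the reported value like this. The `relative_improvement` helper is hypothetical.

```python
def relative_improvement(score_before, score_after):
    """Relative change from score_before to score_after, in percent.

    Assumes the conventional formula (after - before) / before * 100.
    The API may compute improvement_percent differently.
    Returns None when score_before is zero, where the ratio is undefined.
    """
    if score_before == 0:
        return None
    return (score_after - score_before) / score_before * 100
```

If the computed value disagrees with the improvement_percent the API returns, trust the API value; the formula here is only a guess at its meaning.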

image_column_formats: map["embedded_bytes" or "url" or "file_reference"]

Per-column export encoding for detected image columns (column name → format). When reading the output of GET /datasets/{dataset_id}/download, look up the active image column (the mapped image column that also appears in configured_column_mapping.context) in this map to determine how each row's original_image is encoded. Null or empty when no image columns were detected.

One of the following:

"embedded_bytes"

"url"

"file_reference"
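The lookup described for image_column_formats can be sketched as follows. This assumes the Dataset object has already been decoded from JSON into a plain dictionary; the `active_image_format` helper name is hypothetical.

```python
def active_image_format(dataset):
    """Return the export encoding of the active image column, or None.

    The active image column is the column named by
    configured_column_mapping.image, provided it is also listed in
    configured_column_mapping.context.
    """
    # Both fields may be null, so fall back to empty containers.
    mapping = dataset.get("configured_column_mapping") or {}
    formats = dataset.get("image_column_formats") or {}

    image_column = mapping.get("image")
    context_columns = mapping.get("context") or []

    if image_column is None or image_column not in context_columns:
        return None
    # One of "embedded_bytes", "url", "file_reference", or None if absent.
    return formats.get(image_column)
```

A caller would branch on the returned string to decide how to decode original_image in each downloaded row.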

name: string

Human-readable name for the dataset

progress: object { percent, processed_rows, total_rows }

Processing progress. Null when no run is active.

percent: number

Progress percentage (0-100)

processed_rows: number

Number of rows processed so far

total_rows: number

Total rows to process (samples_to_process or row_count)

row_count: number

Total number of rows in the dataset

run_id: string

ID of the currently active run

status: "pending" or "running" or "succeeded" or "failed"

Lifecycle status: pending, running, succeeded, or failed

One of the following:

"pending"

"running"

"succeeded"

"failed"
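Because status moves from pending or running to either succeeded or failed, a client typically polls until one of the two terminal values appears. The sketch below is one way to do that. It assumes the caller supplies a `fetch_status` function (hypothetical) that calls the status endpoint and returns a decoded dictionary containing a status key; the interval and timeout defaults are arbitrary.

```python
import time

# The two lifecycle values after which the status no longer changes.
TERMINAL_STATUSES = {"succeeded", "failed"}


def wait_for_dataset(fetch_status, dataset_id, interval=2.0, timeout=600.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status(dataset_id) until the status is terminal.

    fetch_status is caller-supplied and must return a dict with a
    "status" key. sleep and clock are injectable so the loop can be
    tested without real delays.
    """
    deadline = clock() + timeout
    while True:
        dataset = fetch_status(dataset_id)
        if dataset["status"] in TERMINAL_STATUSES:
            return dataset
        if clock() >= deadline:
            raise TimeoutError(
                f"dataset {dataset_id} still {dataset['status']!r} "
                f"after {timeout} seconds"
            )
        sleep(interval)
```

On a failed result, the error_data object on the returned dataset carries the code, level, and message.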

updated_at: string

Timestamp of the last update

format: date-time
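The created_at and updated_at fields use the date-time format, which normally means an RFC 3339 timestamp string. A minimal sketch for turning them into Python datetime objects follows. It assumes the API emits timestamps such as 2024-01-02T03:04:05Z (the exact form is not shown in this reference), and both helper names are hypothetical.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an RFC 3339 style timestamp into a datetime.

    datetime.fromisoformat only accepts a trailing "Z" from Python 3.11
    onward, so it is rewritten to an explicit UTC offset first.
    """
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def seconds_since_creation(dataset):
    """Seconds between created_at and updated_at on a Dataset dict."""
    created = parse_timestamp(dataset["created_at"])
    updated = parse_timestamp(dataset["updated_at"])
    return (updated - created).total_seconds()
```

On Python versions before 3.11, fromisoformat also rejects fractional seconds that are not exactly three or six digits long, so a dedicated parser may be needed if the API returns other precisions.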

Datasets Upload

Initiate a dataset upload

POST /api/v1/datasets/upload/initiate

Complete a dataset upload and trigger processing

POST /api/v1/datasets/upload/complete

Complete a file upload and trigger processing

POST /api/v1/datasets/{dataset_id}/upload/complete
