Datasets

Create a dataset from file upload, HuggingFace, or Kaggle

client.datasets.create(, ?): DatasetCreateResponse { dataset_id, status, upload_instructions }

POST /api/v1/datasets

Get a dataset by ID

client.datasets.get(, ?): Dataset { configured_column_mapping, created_at, dataset_id, 8 more }

GET /api/v1/datasets/{dataset_id}

List datasets

client.datasets.list(?, ?): DatasetListResponse { datasets, next_cursor }

GET /api/v1/datasets
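The `next_cursor` field in the list response suggests cursor-based pagination. The following is a minimal sketch, assuming `next_cursor` is null or absent on the last page and is handed back on the next request. The `fetchPage` callback is a hypothetical stand-in for `client.datasets.list`, because the parameter names are not shown in this reference.

```typescript
// Shape of one page, taken from DatasetListResponse { datasets, next_cursor }.
// Assumption: next_cursor is null or undefined once the last page is reached.
interface DatasetPage<T> {
  datasets: T[];
  next_cursor?: string | null;
}

// Hypothetical callback wrapping client.datasets.list; how the cursor is
// passed to the real method is not shown in this reference.
type FetchPage<T> = (cursor?: string) => Promise<DatasetPage<T>>;

// Follow next_cursor until it runs out, collecting every dataset.
async function listAllDatasets<T>(
  fetchPage: FetchPage<T>,
  maxPages = 100,
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | undefined = undefined;
  for (let page = 0; page < maxPages; page++) {
    const res: DatasetPage<T> = await fetchPage(cursor);
    all.push(...res.datasets);
    if (!res.next_cursor) return all; // no further pages
    cursor = res.next_cursor;
  }
  throw new Error(`stopped after ${maxPages} pages; raise maxPages if this is expected`);
}
```

The `maxPages` guard is there so that a server that keeps returning a cursor cannot loop the caller forever.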

Get the processing status of a dataset

client.datasets.getStatus(, ?): DatasetGetStatusResponse { dataset_id, error, progress, 2 more }

GET /api/v1/datasets/{dataset_id}/status
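Because processing is asynchronous, callers will usually poll the status endpoint until the dataset reaches a final state. The sketch below assumes the status response carries the same `status` values as the Dataset model and treats `succeeded` and `failed` as terminal. The `getStatus` callback is a hypothetical wrapper around `client.datasets.getStatus`, and the interval and attempt limits are arbitrary defaults.

```typescript
// Status values listed for the Dataset model in this reference.
type DatasetStatus = "pending" | "running" | "succeeded" | "failed";

// Assumption: only the fields used here; the real response has more.
interface StatusSnapshot {
  status: DatasetStatus;
  progress?: { percent: number | null } | null;
}

// "succeeded" and "failed" are treated as terminal; the others mean keep waiting.
function isTerminal(status: DatasetStatus): boolean {
  return status === "succeeded" || status === "failed";
}

// Hypothetical callback wrapping client.datasets.getStatus for one dataset.
type GetStatus = () => Promise<StatusSnapshot>;

// Poll until the dataset reaches a terminal status or the attempts run out.
async function waitForDataset(
  getStatus: GetStatus,
  intervalMs = 2000,
  maxAttempts = 150,
): Promise<StatusSnapshot> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const snapshot = await getStatus();
    if (isTerminal(snapshot.status)) return snapshot;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`dataset still not finished after ${maxAttempts} polls`);
}
```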

Download the processed dataset

client.datasets.download(, ?, ?): DatasetDownloadResponse

GET /api/v1/datasets/{dataset_id}/download

Publish a dataset to an external platform

client.datasets.publish(, , ?): DatasetPublishResponse { publish_id, status, message }

POST /api/v1/datasets/{dataset_id}/publish

Start an augmentation run (or estimate cost)

client.datasets.run(, , ?): DatasetRunResponse { estimate, estimatedCreditsConsumed, estimatedMinutes, run_id }

POST /api/v1/datasets/{dataset_id}/run

Get evaluation results for a dataset

client.datasets.getEvaluation(, ?): DatasetGetEvaluationResponse { dataset_id, quality, raw_results, status }

GET /api/v1/datasets/{dataset_id}/evaluation
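For callers that hit the HTTP routes directly instead of going through the client, the per-dataset paths above follow one pattern. The helper below is a small sketch of that pattern. `datasetPath` is a hypothetical name, percent-encoding the ID is an assumption, and the base URL and authentication are not covered by this reference.

```typescript
// Sub-resources that hang off /api/v1/datasets/{dataset_id} in the list above.
type DatasetSubresource = "status" | "download" | "publish" | "run" | "evaluation";

const DATASETS_ROOT = "/api/v1/datasets";

// Build the path for a dataset or for one of its sub-resources.
function datasetPath(datasetId: string, sub?: DatasetSubresource): string {
  if (datasetId.length === 0) throw new Error("datasetId must not be empty");
  // Assumption: IDs are percent-encoded so they cannot alter the path.
  const base = `${DATASETS_ROOT}/${encodeURIComponent(datasetId)}`;
  return sub ? `${base}/${sub}` : base;
}
```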

Models

Dataset { configured_column_mapping, created_at, dataset_id, 8 more }

configured_column_mapping: ConfiguredColumnMapping | null

User-configured column mapping. Null if not yet configured.

chat: string | null

completion: string | null

context: Array<string>

prompt: string | null

created_at: string

Timestamp when the dataset was created

format: date-time

dataset_id: string

Unique dataset identifier

error: Error | null

Error details if the dataset failed. Null otherwise.

message: string

Error message

evaluation_summary: EvaluationSummary | null

Compact evaluation summary. Null if evaluation has not completed.

grade_after: string | null

Letter grade (A-E) after augmentation

grade_before: string | null

Letter grade (A-E) before augmentation

improvement_percent: number | null

Relative improvement percentage

score_after: number | null

Quality score after augmentation

score_before: number | null

Quality score before augmentation

name: string | null

Human-readable name for the dataset

progress: Progress | null

Processing progress. Null when no run is active.

percent: number | null

Progress percentage (0-100)

processed_rows: number | null

Number of rows processed so far

total_rows: number | null

Total rows to process (samples_to_process or row_count)

row_count: number | null

Total number of rows in the dataset

run_id: string | null

ID of the currently active run

status: "pending" | "running" | "succeeded" | "failed"

Lifecycle status: pending, running, succeeded, or failed

One of the following:

"pending"

"running"

"succeeded"

"failed"

updated_at: string

Timestamp of the last update

format: date-time
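The Dataset field list above can be read as the following TypeScript shapes. This is a sketch rather than the SDK's own type definitions: the nesting (which fields belong to `ConfiguredColumnMapping`, `Error`, `EvaluationSummary`, and `Progress`) is inferred from the order of the listing, and `describeDataset` is a hypothetical helper that shows how the nullable fields might be handled.

```typescript
interface ConfiguredColumnMapping {
  chat: string | null;
  completion: string | null;
  context: string[];
  prompt: string | null;
}

interface EvaluationSummary {
  grade_after: string | null;
  grade_before: string | null;
  improvement_percent: number | null;
  score_after: number | null;
  score_before: number | null;
}

interface Progress {
  percent: number | null;
  processed_rows: number | null;
  total_rows: number | null;
}

interface Dataset {
  configured_column_mapping: ConfiguredColumnMapping | null;
  created_at: string; // date-time string
  dataset_id: string;
  error: { message: string } | null;
  evaluation_summary: EvaluationSummary | null;
  name: string | null;
  progress: Progress | null;
  row_count: number | null;
  run_id: string | null;
  status: "pending" | "running" | "succeeded" | "failed";
  updated_at: string; // date-time string
}

// Hypothetical helper: one-line summary that tolerates every nullable field.
function describeDataset(d: Dataset): string {
  const label = d.name ?? d.dataset_id;
  if (d.status === "failed") {
    return `${label}: failed (${d.error?.message ?? "no error details"})`;
  }
  if (d.status === "running" && d.progress?.percent != null) {
    return `${label}: running, ${d.progress.percent}% done`;
  }
  const ev = d.evaluation_summary;
  if (d.status === "succeeded" && ev?.grade_before != null && ev.grade_after != null) {
    return `${label}: succeeded, grade ${ev.grade_before} -> ${ev.grade_after}`;
  }
  return `${label}: ${d.status}`;
}
```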

Datasets Upload

Initiate a dataset upload

client.datasets.upload.initiate(, ?): UploadInitiateResponse { upload_url }

POST /api/v1/datasets/upload/initiate

Complete a dataset upload and trigger processing

client.datasets.upload.complete(, ?): UploadCompleteResponse { dataset_id }

POST /api/v1/datasets/upload/complete

Complete a file upload and trigger processing

client.datasets.upload.completeByID(, , ?): UploadCompleteByIDResponse { dataset_id, status }

POST /api/v1/datasets/{dataset_id}/upload/complete
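The initiate and complete calls above read as a two-phase upload. The sketch below shows only the ordering, under the assumption that the file bytes are sent to `upload_url` between the two calls, as is common with pre-signed URLs. That step is not described in this reference. The three step functions are hypothetical stand-ins for `client.datasets.upload.initiate`, the file transfer, and `client.datasets.upload.complete`, whose parameters are not shown here.

```typescript
// Hypothetical step functions standing in for the SDK calls and the transfer.
interface UploadSteps {
  initiate: () => Promise<{ upload_url: string }>;
  // Assumption: the caller sends the file bytes to upload_url itself.
  sendFile: (uploadUrl: string) => Promise<void>;
  complete: () => Promise<{ dataset_id: string }>;
}

// Run the three phases in order and return the resulting dataset ID.
async function uploadDataset(steps: UploadSteps): Promise<string> {
  const { upload_url } = await steps.initiate();
  await steps.sendFile(upload_url);
  const { dataset_id } = await steps.complete();
  return dataset_id;
}
```

Since completing the upload triggers processing, the returned `dataset_id` is what a caller would then pass to the status endpoint.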