Datasets

Create a dataset from file upload, HuggingFace, or Kaggle
client.datasets.create(DatasetCreateParams { source } body, RequestOptions options?): DatasetCreateResponse { dataset_id, status, upload_instructions }
POST /api/v1/datasets
Get a dataset by ID
client.datasets.get(string datasetID, RequestOptions options?): Dataset { configured_column_mapping, created_at, dataset_id, 8 more }
GET /api/v1/datasets/{dataset_id}
List datasets
client.datasets.list(DatasetListParams { created_after, created_before, cursor, 5 more } query?, RequestOptions options?): DatasetListResponse { datasets, next_cursor }
GET /api/v1/datasets
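
The `cursor` query parameter and the `next_cursor` response field suggest cursor-based pagination. The following is a minimal sketch of walking every page, assuming that `next_cursor` is null or absent on the last page and that passing it back as `cursor` fetches the next page. The `DatasetLister` interface and the `listAllDatasets` helper are hypothetical names introduced for illustration; they are not part of the SDK.

```typescript
// Minimal structural types reconstructed from the signatures above.
// Only the fields this sketch touches are declared.
interface DatasetSummary {
  dataset_id: string;
}

interface DatasetListResponse {
  datasets: DatasetSummary[];
  next_cursor?: string | null; // assumption: null or absent on the last page
}

// Hypothetical interface: anything with a compatible list() method.
interface DatasetLister {
  list(query?: { cursor?: string }): Promise<DatasetListResponse>;
}

// Follows next_cursor until the server stops returning one.
async function listAllDatasets(
  datasets: DatasetLister,
  maxPages = 100, // safety cap so a repeating cursor cannot loop forever
): Promise<DatasetSummary[]> {
  const all: DatasetSummary[] = [];
  let cursor: string | undefined;
  for (let page = 0; page < maxPages; page++) {
    const res: DatasetListResponse = await datasets.list(
      cursor ? { cursor } : undefined,
    );
    all.push(...res.datasets);
    if (!res.next_cursor) break;
    cursor = res.next_cursor;
  }
  return all;
}
```

If the real `client.datasets` object is structurally compatible, it could be passed directly as `listAllDatasets(client.datasets)`.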
Get the processing status of a dataset
client.datasets.getStatus(string datasetID, RequestOptions options?): DatasetGetStatusResponse { dataset_id, error, progress, 2 more }
GET /api/v1/datasets/{dataset_id}/status
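
Because processing is asynchronous, a caller will typically poll the status endpoint until the dataset reaches a terminal state. The sketch below shows one way to do that. It assumes the status response carries a `status` field with the same four values as the `Dataset` model (the signature above elides two fields, so this is a guess), and that `error` has a `message` string. `StatusReader` and `waitForDataset` are hypothetical names.

```typescript
type DatasetStatus = "pending" | "running" | "succeeded" | "failed";

interface StatusResponse {
  dataset_id: string;
  status: DatasetStatus; // assumption: one of the elided "2 more" fields
  error: { message: string } | null;
}

// Hypothetical interface: anything with a compatible getStatus() method.
interface StatusReader {
  getStatus(datasetID: string): Promise<StatusResponse>;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls until the dataset succeeds, fails, or the attempt budget runs out.
async function waitForDataset(
  datasets: StatusReader,
  datasetID: string,
  intervalMs = 2000,
  maxAttempts = 150,
): Promise<StatusResponse> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await datasets.getStatus(datasetID);
    if (res.status === "succeeded") return res;
    if (res.status === "failed") {
      throw new Error(res.error?.message ?? `dataset ${datasetID} failed`);
    }
    await sleep(intervalMs);
  }
  throw new Error(`timed out waiting for dataset ${datasetID}`);
}
```

A fixed interval keeps the sketch short; a production caller might prefer exponential backoff.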
Download the processed dataset
client.datasets.download(string datasetID, DatasetDownloadParams { fileFormat } query?, RequestOptions options?): DatasetDownloadResponse
GET /api/v1/datasets/{dataset_id}/download
Publish a dataset to an external platform
client.datasets.publish(string datasetID, DatasetPublishParams { target, target_spec } body, RequestOptions options?): DatasetPublishResponse { publish_id, status, message }
POST /api/v1/datasets/{dataset_id}/publish
Start an augmentation run (or estimate cost)
client.datasets.run(string datasetID, DatasetRunParams { brand_controls, column_mapping, estimate, 2 more } body, RequestOptions options?): DatasetRunResponse { estimate, estimatedCreditsConsumed, estimatedMinutes, run_id }
POST /api/v1/datasets/{dataset_id}/run
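
Since the same endpoint either starts a run or returns a cost estimate, one plausible pattern is to request an estimate first and start the run only if it fits a budget. The sketch below assumes that `estimate` is a boolean flag, that an estimate response fills `estimatedCreditsConsumed`, and that a real run fills `run_id`; none of this is confirmed by the listing above. Other body fields such as `column_mapping` may be required in practice and are omitted. `DatasetRunner` and `runWithinBudget` are hypothetical names.

```typescript
// Response fields taken from the signature above; their types are assumptions.
interface RunResponse {
  estimate?: boolean | null;
  estimatedCreditsConsumed?: number | null;
  estimatedMinutes?: number | null;
  run_id?: string | null;
}

// Hypothetical interface: anything with a compatible run() method.
interface DatasetRunner {
  run(datasetID: string, body: { estimate?: boolean }): Promise<RunResponse>;
}

// Requests an estimate, then starts the run only if it fits the budget.
async function runWithinBudget(
  datasets: DatasetRunner,
  datasetID: string,
  maxCredits: number,
): Promise<string> {
  const quote = await datasets.run(datasetID, { estimate: true });
  const credits = quote.estimatedCreditsConsumed;
  if (credits == null) {
    throw new Error("estimate did not include a credit figure");
  }
  if (credits > maxCredits) {
    throw new Error(
      `estimated ${credits} credits exceeds budget of ${maxCredits}`,
    );
  }
  const started = await datasets.run(datasetID, { estimate: false });
  if (!started.run_id) {
    throw new Error("run started without a run_id");
  }
  return started.run_id;
}
```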
Get evaluation results for a dataset
client.datasets.getEvaluation(string datasetID, RequestOptions options?): DatasetGetEvaluationResponse { dataset_id, quality, raw_results, status }
GET /api/v1/datasets/{dataset_id}/evaluation
Models
Dataset { configured_column_mapping, created_at, dataset_id, 8 more }
configured_column_mapping: ConfiguredColumnMapping | null

User-configured column mapping. Null if not yet configured.

chat: string | null
completion: string | null
context: Array<string>
prompt: string | null
created_at: string

Timestamp when the dataset was created

format: date-time
dataset_id: string

Unique dataset identifier

error: Error | null

Error details if the dataset failed. Null otherwise.

message: string

Error message

evaluation_summary: EvaluationSummary | null

Compact evaluation summary. Null if evaluation has not completed.

grade_after: string | null

Letter grade (A-E) after augmentation

grade_before: string | null

Letter grade (A-E) before augmentation

improvement_percent: number | null

Relative improvement after augmentation, as a percentage

score_after: number | null

Quality score after augmentation

score_before: number | null

Quality score before augmentation

name: string | null

Human-readable name for the dataset

progress: Progress | null

Processing progress. Null when no run is active.

percent: number | null

Progress percentage (0-100)

processed_rows: number | null

Number of rows processed so far

total_rows: number | null

Total rows to process (samples_to_process or row_count)

row_count: number | null

Total number of rows in the dataset

run_id: string | null

ID of the currently active run

status: "pending" | "running" | "succeeded" | "failed"

Lifecycle status of the dataset

One of the following:
"pending"
"running"
"succeeded"
"failed"
updated_at: string

Timestamp of the last update

format: date-time
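
For reference, the field list above can be read as the following TypeScript shape. This is a reconstruction, not the SDK's published type: the nesting of sub-fields under `configured_column_mapping`, `error`, `evaluation_summary`, and `progress` is inferred from the order in which they appear. The `describeDataset` helper is a hypothetical example of consuming the nullable fields safely.

```typescript
// Reconstructed from the field list; nesting is inferred, not confirmed.
interface Dataset {
  configured_column_mapping: {
    chat: string | null;
    completion: string | null;
    context: string[];
    prompt: string | null;
  } | null;
  created_at: string; // date-time
  dataset_id: string;
  error: { message: string } | null;
  evaluation_summary: {
    grade_after: string | null;
    grade_before: string | null;
    improvement_percent: number | null;
    score_after: number | null;
    score_before: number | null;
  } | null;
  name: string | null;
  progress: {
    percent: number | null;
    processed_rows: number | null;
    total_rows: number | null;
  } | null;
  row_count: number | null;
  run_id: string | null;
  status: "pending" | "running" | "succeeded" | "failed";
  updated_at: string; // date-time
}

// Hypothetical helper: a one-line, null-safe summary of a dataset.
function describeDataset(d: Dataset): string {
  const label = d.name ?? d.dataset_id;
  if (d.status === "failed") {
    return `${label}: failed (${d.error?.message ?? "no error details"})`;
  }
  if (d.progress?.percent != null) {
    return `${label}: ${d.status}, ${d.progress.percent}% processed`;
  }
  const ev = d.evaluation_summary;
  if (ev?.grade_before && ev.grade_after) {
    return `${label}: ${d.status}, grade ${ev.grade_before} -> ${ev.grade_after}`;
  }
  return `${label}: ${d.status}`;
}
```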

Datasets Upload

Initiate a dataset upload
client.datasets.upload.initiate(UploadInitiateParams { file_format, name } body, RequestOptions options?): UploadInitiateResponse { upload_url }
POST /api/v1/datasets/upload/initiate
Complete a dataset upload and trigger processing
client.datasets.upload.complete(UploadCompleteParams { file_format, file_size_bytes, name, s3_key } body, RequestOptions options?): UploadCompleteResponse { dataset_id }
POST /api/v1/datasets/upload/complete
Complete a file upload and trigger processing
client.datasets.upload.completeByID(string datasetID, UploadCompleteByIDParams { file_size_bytes, sha256 } body, RequestOptions options?): UploadCompleteByIDResponse { dataset_id, status }
POST /api/v1/datasets/{dataset_id}/upload/complete
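
The `completeByID` body takes the uploaded file's size and SHA-256 digest. A minimal sketch of computing both from the file bytes in Node.js follows. It assumes the API expects the digest as a lowercase hex string, which the listing above does not state. `buildCompleteBody` is a hypothetical helper name.

```typescript
import { createHash } from "node:crypto";

// Body fields taken from UploadCompleteByIDParams above.
interface UploadCompleteByIDBody {
  file_size_bytes: number;
  sha256: string; // assumption: lowercase hex encoding
}

// Derives the size and SHA-256 digest from the raw file bytes.
function buildCompleteBody(data: Uint8Array): UploadCompleteByIDBody {
  return {
    file_size_bytes: data.byteLength,
    sha256: createHash("sha256").update(data).digest("hex"),
  };
}
```

The result could then be passed as the body of `client.datasets.upload.completeByID(datasetID, body)` once the file bytes have been uploaded.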