Create a dataset from file upload, HuggingFace, or Kaggle

client.datasets.create(DatasetCreateParams { source } body, RequestOptions options?): DatasetCreateResponse { dataset_id, status, upload_instructions }
POST /api/v1/datasets

Unified ingest endpoint. Behavior depends on source.type: "file" returns upload instructions for a presigned S3 PUT, while "huggingface" and "kaggle" start an asynchronous import.

Parameters
body: DatasetCreateParams { source }
source: FileSourceDto { file_format, name, type } | HuggingfaceSourceDto { files, type, url } | KaggleSourceDto { files, type, url }

Dataset source configuration. Discriminated by type: file, huggingface, or kaggle.

One of the following:
FileSourceDto { file_format, name, type }
file_format: "csv" | "json" | "jsonl" | "parquet"

Format of the file being uploaded

One of the following:
"csv"
"json"
"jsonl"
"parquet"
name: string

Human-readable name for the dataset

type: "file"

Source type

HuggingfaceSourceDto { files, type, url }
files: Array<string>

File paths to download from the repository

type: "huggingface"

Source type

url: string

HuggingFace dataset repository URL

KaggleSourceDto { files, type, url }
files: Array<string>

File paths to download from the dataset

type: "kaggle"

Source type

url: string

Kaggle dataset URL
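
The three source shapes above form a discriminated union on type. The following is a minimal sketch of how that union could be modeled and narrowed in caller code. The interfaces are local mirrors of the documented fields, not types imported from the SDK, and describeSource, hfSource, the URL, and the file path are hypothetical names and placeholders used only for illustration.

```typescript
// Local mirrors of the documented source shapes (assumption: not the SDK's own exported types).
type FileFormat = 'csv' | 'json' | 'jsonl' | 'parquet';

interface FileSource {
  type: 'file';
  name: string;
  file_format: FileFormat;
}

interface HuggingfaceSource {
  type: 'huggingface';
  url: string;
  files: string[];
}

interface KaggleSource {
  type: 'kaggle';
  url: string;
  files: string[];
}

type DatasetSource = FileSource | HuggingfaceSource | KaggleSource;

// Hypothetical helper: switching on `type` narrows the union, so each branch
// can only read the fields that exist for that source.
function describeSource(source: DatasetSource): string {
  switch (source.type) {
    case 'file':
      return `upload ${source.name} as ${source.file_format}`;
    case 'huggingface':
    case 'kaggle':
      return `import ${source.files.length} file(s) from ${source.url}`;
  }
}

// Placeholder values; substitute a real repository URL and file paths.
const hfSource: DatasetSource = {
  type: 'huggingface',
  url: 'https://huggingface.co/datasets/example-org/example-dataset',
  files: ['data/train.parquet'],
};

console.log(describeSource(hfSource));
```

An object shaped like hfSource would be passed as the source field of the request body.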

Returns
DatasetCreateResponse { dataset_id, status, upload_instructions }
dataset_id: string

ID of the newly created dataset

status: string

Current dataset status

upload_instructions?: UploadInstructions { method, s3_key, url }

Upload instructions for file sources. PUT your file to the provided URL.

method: string

HTTP method to use

s3_key: string

S3 object key. Pass this back in the complete request if needed for verification.

url: string

Pre-signed URL for uploading the file
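
Because upload_instructions is optional, caller code has to check for it before uploading. The sketch below shows one way to turn a response into the pieces of an HTTP upload request. The interfaces mirror the documented response fields locally, and buildUploadRequest and UploadRequest are hypothetical names, not part of the SDK.

```typescript
// Local mirrors of the documented response shapes (assumption: not the SDK's own exported types).
interface UploadInstructions {
  method: string;
  s3_key: string;
  url: string;
}

interface DatasetCreateResponse {
  dataset_id: string;
  status: string;
  upload_instructions?: UploadInstructions;
}

// Hypothetical container for the parts of the upload request.
interface UploadRequest {
  url: string;
  method: string;
  body: Uint8Array;
}

// Returns null when the response carries no upload instructions,
// which is what this sketch expects for the asynchronous import sources.
function buildUploadRequest(
  response: DatasetCreateResponse,
  fileBytes: Uint8Array,
): UploadRequest | null {
  const instructions = response.upload_instructions;
  if (!instructions) {
    return null;
  }
  // Use the method the API returned instead of hard-coding "PUT".
  return { url: instructions.url, method: instructions.method, body: fileBytes };
}
```

In a runtime with a global fetch (for example Node 18 or later), a non-null result could then be sent with `await fetch(req.url, { method: req.method, body: req.body })`. Whether extra headers such as Content-Type are required depends on how the URL was signed, which this page does not specify.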

Create a dataset from file upload, HuggingFace, or Kaggle

import Adaption from 'adaption';

const client = new Adaption({
  apiKey: process.env['ADAPTION_API_KEY'], // This is the default and can be omitted
});

const dataset = await client.datasets.create({
  source: {
    file_format: 'csv',
    name: 'my-training-data',
    type: 'file',
  },
});

console.log(dataset.dataset_id);
Returns Examples
{
  "dataset_id": "dataset_id",
  "status": "status",
  "upload_instructions": {
    "method": "PUT",
    "s3_key": "s3_key",
    "url": "https://s3.amazonaws.com/bucket/key?X-Amz-Signature=..."
  }
}