Create a dataset from file upload, HuggingFace, or Kaggle

POST /api/v1/datasets

Unified ingest endpoint. The request is discriminated by source.type: "file" returns upload instructions for a presigned S3 PUT, while "huggingface" and "kaggle" each start an asynchronous import.

Body Parameters (JSON)
source: object { file_format, name, type } or object { files, type, url } or object { files, type, url }

Dataset source configuration. Discriminated by type: file, huggingface, or kaggle.

One of the following:
FileSourceDto = object { file_format, name, type }
file_format: "csv" or "json" or "jsonl" or "parquet"

Format of the file being uploaded

One of the following:
"csv"
"json"
"jsonl"
"parquet"
name: string

Human-readable name for the dataset

type: "file"

Source type

HuggingfaceSourceDto = object { files, type, url }
files: array of string

File paths to download from the repository

type: "huggingface"

Source type

url: string

HuggingFace dataset repository URL

KaggleSourceDto = object { files, type, url }
files: array of string

File paths to download from the dataset

type: "kaggle"

Source type

url: string

Kaggle dataset URL
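
The three source variants above can be sketched as small request-body builders. This is a minimal illustration, not an official client: the helper names `file_source` and `remote_source` are hypothetical, and the validation only mirrors the enum values listed in this reference.

```python
from typing import Any

# Allowed values for source.file_format, as listed in the reference above.
FILE_FORMATS = {"csv", "json", "jsonl", "parquet"}

# Source types that start an async import rather than a file upload.
REMOTE_TYPES = {"huggingface", "kaggle"}


def file_source(name: str, file_format: str) -> dict[str, Any]:
    """Build a request body for a direct file upload (source.type == "file")."""
    if file_format not in FILE_FORMATS:
        raise ValueError(f"unsupported file_format: {file_format!r}")
    return {"source": {"type": "file", "name": name, "file_format": file_format}}


def remote_source(source_type: str, url: str, files: list[str]) -> dict[str, Any]:
    """Build a request body for a HuggingFace or Kaggle import."""
    if source_type not in REMOTE_TYPES:
        raise ValueError(f"unsupported source type: {source_type!r}")
    return {"source": {"type": source_type, "url": url, "files": list(files)}}
```

For example, `file_source("my-training-data", "csv")` yields the same body as the curl example on this page. A call such as `remote_source("kaggle", url, ["train.csv"])` would take whatever dataset URL and file paths apply to your import; the file name here is only a placeholder.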

Returns
dataset_id: string

ID of the newly created dataset

status: string

Current dataset status

upload_instructions: optional object { method, s3_key, url }

Upload instructions for file sources. PUT your file to the provided URL.

method: string

HTTP method to use

s3_key: string

S3 object key. Pass this back in the complete request if needed for verification.

url: string

Pre-signed URL for uploading the file
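
Because upload_instructions is optional, a client has to branch on its presence. The sketch below prepares the upload request from a parsed response using only the Python standard library. The function name `build_upload_request` is hypothetical, and whether the presigned URL requires additional headers (such as a specific Content-Type) is not stated in this reference, so none are set here.

```python
import urllib.request
from typing import Any, Optional


def build_upload_request(
    response: dict[str, Any], data: bytes
) -> Optional[urllib.request.Request]:
    """Prepare the file upload described by a create-dataset response.

    Returns None when the response carries no upload_instructions,
    which is the expected case for "huggingface" and "kaggle" sources.
    """
    instructions = response.get("upload_instructions")
    if not instructions:
        return None
    # Use the method and URL exactly as returned by the API.
    return urllib.request.Request(
        instructions["url"],
        data=data,
        method=instructions["method"],
    )
```

The returned request can then be sent with `urllib.request.urlopen(request)`. Building the request separately from sending it keeps the branching logic easy to check without network access.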

Create a dataset from file upload, HuggingFace, or Kaggle

curl https://api.adaptionlabs.ai/api/v1/datasets \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $ADAPTION_API_KEY" \
    -d '{
          "source": {
            "file_format": "csv",
            "name": "my-training-data",
            "type": "file"
          }
        }'
Returns Examples
{
  "dataset_id": "dataset_id",
  "status": "status",
  "upload_instructions": {
    "method": "PUT",
    "s3_key": "s3_key",
    "url": "https://s3.amazonaws.com/bucket/key?X-Amz-Signature=..."
  }
}
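
The curl request above can also be issued from Python. The following is a rough standard-library equivalent, assuming the same endpoint URL and bearer-token header as the curl example; the function name `build_create_request` is hypothetical.

```python
import json
import urllib.request
from typing import Any

# Endpoint taken from the curl example on this page.
API_URL = "https://api.adaptionlabs.ai/api/v1/datasets"


def build_create_request(body: dict[str, Any], api_key: str) -> urllib.request.Request:
    """Prepare a POST /api/v1/datasets request with a JSON body."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

To send it, read the key from the `ADAPTION_API_KEY` environment variable, pass the request to `urllib.request.urlopen`, and parse the reply with `json.load`. The parsed object should have the dataset_id, status, and (for file sources) upload_instructions fields described under Returns.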