Create a dataset from file upload, HuggingFace, or Kaggle

datasets.create(**kwargs: DatasetCreateParams) -> DatasetCreateResponse
POST /api/v1/datasets

Unified ingest endpoint. Behavior is discriminated by source.type: "file" returns upload instructions for a presigned S3 PUT, while "huggingface" and "kaggle" each start an asynchronous import.

Parameters
source: Source

Dataset source configuration. Discriminated by type: file, huggingface, or kaggle.

One of the following:
class SourceFileSourceDto:
file_format: Literal["csv", "json", "jsonl", "parquet"]

Format of the file being uploaded

One of the following:
"csv"
"json"
"jsonl"
"parquet"
name: str

Human-readable name for the dataset

type: Literal["file"]

Source type

class SourceHuggingfaceSourceDto:
files: SequenceNotStr[str]

File paths to download from the repository

type: Literal["huggingface"]

Source type

url: str

HuggingFace dataset repository URL

class SourceKaggleSourceDto:
files: SequenceNotStr[str]

File paths to download from the dataset

type: Literal["kaggle"]

Source type

url: str

Kaggle dataset URL
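
The three source shapes above can be written as plain dictionaries keyed by type. The following is a minimal sketch rather than the SDK's own types: the helpers `file_source` and `remote_source` are hypothetical names introduced here, and the URL and file path in the usage lines are placeholders.

```python
from typing import Dict, List, Sequence, Union

# File formats listed for SourceFileSourceDto.file_format above.
ALLOWED_FILE_FORMATS = ("csv", "json", "jsonl", "parquet")
# Source types that take a url plus a list of files.
REMOTE_TYPES = ("huggingface", "kaggle")

SourceDict = Dict[str, Union[str, List[str]]]


def file_source(name: str, file_format: str) -> SourceDict:
    """Build a "file" source dict (hypothetical helper, not part of the SDK)."""
    if file_format not in ALLOWED_FILE_FORMATS:
        raise ValueError(f"unsupported file_format: {file_format!r}")
    return {"type": "file", "name": name, "file_format": file_format}


def remote_source(kind: str, url: str, files: Sequence[str]) -> SourceDict:
    """Build a "huggingface" or "kaggle" source dict (hypothetical helper)."""
    if kind not in REMOTE_TYPES:
        raise ValueError(f"unsupported source type: {kind!r}")
    return {"type": kind, "url": url, "files": list(files)}


# Placeholder values for illustration only.
print(file_source("my-training-data", "csv"))
print(remote_source("kaggle", "https://example.invalid/dataset", ["train.csv"]))
```

Either dictionary could then be passed as the source argument of datasets.create. Validating file_format locally is a convenience of this sketch; the source does not say how the server reports an invalid value.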

Returns
class DatasetCreateResponse:
dataset_id: str

ID of the newly created dataset

status: str

Current dataset status

upload_instructions: Optional[UploadInstructions]

Upload instructions for file sources. PUT your file to the provided URL.

method: str

HTTP method to use

s3_key: str

S3 object key. Pass this back in the complete request if needed for verification.

url: str

Pre-signed URL for uploading the file
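
For a file source, the returned upload_instructions describe where to send the file. The sketch below builds that request with Python's standard library and does not send it. It makes several assumptions: the instructions are passed as a plain mapping (with the SDK you would read the same fields as attributes of the response object), non-file sources are assumed to return no upload instructions, and `build_upload_request` is a hypothetical helper. Whether the presigned URL expects particular headers is not stated in this reference.

```python
import urllib.request
from typing import Mapping, Optional


def build_upload_request(
    upload_instructions: Optional[Mapping[str, str]], payload: bytes
) -> Optional[urllib.request.Request]:
    """Return a request for the presigned URL, or None if there is nothing to upload."""
    if upload_instructions is None:
        # Assumption: huggingface and kaggle sources carry no upload instructions.
        return None
    return urllib.request.Request(
        url=upload_instructions["url"],
        data=payload,
        method=upload_instructions["method"],
    )


# Placeholder values mirroring the response shape; not a real presigned URL.
instructions = {
    "method": "PUT",
    "s3_key": "s3_key",
    "url": "https://s3.amazonaws.com/bucket/key?X-Amz-Signature=placeholder",
}
request = build_upload_request(instructions, b"col_a,col_b\n1,2\n")
print(request.get_method(), request.full_url)
# To actually send the file: urllib.request.urlopen(request)
```

Taking the HTTP method from the response instead of hard-coding PUT keeps the client aligned with whatever the endpoint returns.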

Create a dataset from file upload, HuggingFace, or Kaggle

import os
from adaption import Adaption

client = Adaption(
    api_key=os.environ.get("ADAPTION_API_KEY"),  # This is the default and can be omitted
)
dataset = client.datasets.create(
    source={
        "file_format": "csv",
        "name": "my-training-data",
        "type": "file",
    },
)
print(dataset.dataset_id)
Returns Examples
{
  "dataset_id": "dataset_id",
  "status": "status",
  "upload_instructions": {
    "method": "PUT",
    "s3_key": "s3_key",
    "url": "https://s3.amazonaws.com/bucket/key?X-Amz-Signature=..."
  }
}
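
The example above covers the file source only. A HuggingFace import goes through the same datasets.create call with a different source shape, sketched below under stated assumptions. The helper `start_huggingface_import` is hypothetical, the repository URL and file path are placeholders, and a stand-in client replaces the real Adaption client so the sketch runs without credentials or network access. The stand-in returns no upload instructions, which is an assumption about how import sources respond.

```python
from types import SimpleNamespace
from typing import Any, Sequence


def start_huggingface_import(client: Any, url: str, files: Sequence[str]) -> Any:
    """Call datasets.create with a "huggingface" source (hypothetical helper)."""
    return client.datasets.create(
        source={"type": "huggingface", "url": url, "files": list(files)},
    )


def _fake_create(source: dict) -> SimpleNamespace:
    # Stand-in for the real endpoint; echoes placeholder field values.
    return SimpleNamespace(
        dataset_id="dataset_id",
        status="status",
        upload_instructions=None,  # assumed absent for import sources
        received_source=source,  # kept only so the sketch can be inspected
    )


fake_client = SimpleNamespace(datasets=SimpleNamespace(create=_fake_create))

dataset = start_huggingface_import(
    fake_client,
    url="https://huggingface.co/datasets/example-org/example-dataset",  # placeholder
    files=["data/train.csv"],  # placeholder
)
print(dataset.dataset_id, dataset.upload_instructions)
```

With the real client, pass the Adaption instance in place of fake_client. A Kaggle import would differ only in the type value and the URL.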