
Upload Data

This article explains how to upload training data to Hyperstack AI Studio using the UI or API. It covers file upload via signed URLs, saving logs in real time, required formats like JSONL, and tagging strategies to organize your data effectively.

Upload Logs Using the API

Uploading training data through the API is a three-step process: obtain a signed URL, upload your file to that URL, and register the file under your account.

  1. Get a Signed URL

    This request generates a signed URL and filename for file upload. You will need your API key to authenticate the request.

    curl -X GET "https://api.genai.hyperstack.cloud/tailor/v1/generate_signed_url" \
    -H "X-API-KEY: API_KEY"

    Replace API_KEY with your AI Studio API key. To learn how to generate one, click here.

    Response:

    {
    "filename": "b24f7b35-ab20-4a62-8f34-4bb6550a9f58",
    "signedUrl": "https://ca1-dev.s3.nexgencloud.io/prod-tailor-logs-ca1/b24f7b35-ab20-4a62-8f34-4bb6550a9f58?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=X7C8YRB1Z2QJTBRY1IPF%2F20250528%2Fca1%2Fs3%2Faws4_request&X-Amz-Date=20250528T162709Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=content-type%3Bhost&X-Amz-Signature=aba1218d754a517badc54e77dea28ca86aebad50a596adedfa9369ca16c62792"
    }
  2. Upload File to Signed URL

    Use the signedUrl returned from the previous step to upload your .jsonl file.

    Replace the following variables before running the command:

    • signedUrl: The value of the signedUrl field returned from the generate_signed_url API response.
    • YOUR_FILE_NAME.jsonl: The name of your local .jsonl data file, including its path.

    curl -X PUT "signedUrl" \
    -H "Content-Type: application/octet-stream" \
    --data-binary @YOUR_FILE_NAME.jsonl
  3. Associate File with Account

    After uploading the file to the signed URL, you must register the file with your Hyperstack account using the custom_log_upload endpoint. This step links the uploaded file to your account and allows you to tag and organize the logs.

    Replace the following variables before running the command:

    • API_KEY: Your AI Studio API key.
    • FILENAME: The exact filename value returned by the generate_signed_url request in step 1. It is sent in the custom_logs_filename field.
    • tag1, tag2: Meaningful tags (e.g., "customer-feedback") that help identify or categorize the data. They are sent as an array in the save_logs_with_tags field.

    curl -X POST "https://api.genai.hyperstack.cloud/tailor/v1/custom_log_upload" \
    -H "X-API-KEY: API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
    "custom_logs_filename": "FILENAME",
    "save_logs_with_tags": ["tag1", "tag2"]
    }'

    Response

    If the upload is successful, the API will return the following confirmation response:

    {
    "message": "Successfully uploaded custom logs",
    "status": "success"
    }

    You can now view and manage your uploaded logs in the UI by navigating to the Logs tab under the Data & Datasets section.
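
The three steps above can also be scripted. The following is a minimal Python sketch using only the standard library. It assumes the endpoints, headers, and field names shown in the curl commands; the function names (signed_url_request, upload_request, register_request, upload_logs) are illustrative and are not part of any Hyperstack SDK. Error handling and retries are omitted.

```python
import json
import urllib.request

BASE_URL = "https://api.genai.hyperstack.cloud/tailor/v1"


def signed_url_request(api_key):
    """Step 1: build the request for a signed URL and filename."""
    return urllib.request.Request(
        f"{BASE_URL}/generate_signed_url",
        headers={"X-API-KEY": api_key},
        method="GET",
    )


def upload_request(signed_url, file_bytes):
    """Step 2: build the PUT of the raw .jsonl bytes to the signed URL."""
    return urllib.request.Request(
        signed_url,
        data=file_bytes,
        headers={"Content-Type": "application/octet-stream"},
        method="PUT",
    )


def register_request(api_key, filename, tags):
    """Step 3: build the request that registers the file and attaches tags."""
    body = {"custom_logs_filename": filename, "save_logs_with_tags": list(tags)}
    return urllib.request.Request(
        f"{BASE_URL}/custom_log_upload",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def upload_logs(api_key, path, tags):
    """Run all three steps in order. This performs real network calls."""
    with urllib.request.urlopen(signed_url_request(api_key)) as resp:
        signed = json.load(resp)
    with open(path, "rb") as f:
        payload = f.read()
    with urllib.request.urlopen(upload_request(signed["signedUrl"], payload)):
        pass  # a 2xx status means the object was stored
    with urllib.request.urlopen(
        register_request(api_key, signed["filename"], tags)
    ) as resp:
        return json.load(resp)
```

A call such as upload_logs("YOUR_API_KEY", "data.jsonl", ["customer-feedback"]) would then be expected to return the confirmation response shown in step 3. Keeping the request builders separate from the function that sends them makes each step easy to inspect without touching the network.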


Upload Logs Using the UI

You can upload logs directly through the AI Studio UI using the following steps:

  1. Open the Logs Page

    Navigate to the Logs & Datasets page.

  2. Upload Your .jsonl File

    Ensure your logs meet the required file format as outlined in the JSONL File Format guidelines.

    Click the Upload Logs button in the top-right corner, then either select your .jsonl file from your device or drag and drop it into the upload area.

  3. Add Tags

    Enter at least one tag to help categorize your logs (e.g., testing).

  4. Validate and Upload

    Click Validate & Upload.

    The system will automatically check your file format and structure, then upload the logs if validation succeeds.

Once uploaded, you can:

  • View and manage logs individually.
  • Associate logs with datasets for training.
  • Add metadata tags to improve filtering and traceability.

Save Data Using the API

This endpoint allows you to log individual interactions programmatically. It’s useful for sending data in real time from an application.

Replace the following variables before running the command:

  • YOUR_API_KEY: Your AI Studio API key.
  • MODEL_NAME: The value of the model field, i.e. the name of the model used (e.g., llama-3.1-70B-instruct, or your custom model name).
  • messages: An array of message objects, each containing a role (system, user, or assistant) and a content string. The expected structure is described in the JSONL File Format section.
  • CREATION_TIME: The value of the creation_time field, a UTC timestamp indicating when the interaction occurred. It must be formatted as an ISO 8601 datetime string, e.g., "2025-06-05T15:50:00Z". This is used for log ordering and audit purposes.

Additional optional fields such as kwargs, tags, and usage can be included. See Optional Parameters for details.

curl -X POST "https://api.genai.hyperstack.cloud/tailor/v1/data" \
-H "X-API-KEY: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What'\''s the weather today?"}
],
"model": "MODEL_NAME",
"creation_time": "CREATION_TIME",
"kwargs": {},
"tags": ["example", "weather"],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 20,
"total_tokens": 30
}
}'

Response

If successful, the API will return the following confirmation response:

{
"message": "Data successfully saved to database",
"status": "success"
}
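
When the request is sent from application code rather than a shell, building the JSON body programmatically avoids shell quoting issues (such as the escaped apostrophe in the curl example) and makes it easy to generate creation_time in the required format. The sketch below is illustrative only: build_data_payload is a hypothetical helper, not part of any Hyperstack library, and it assumes the field names shown in the request above.

```python
import json
from datetime import datetime, timezone


def build_data_payload(messages, model, created_at=None, tags=None, usage=None, kwargs=None):
    """Assemble the JSON body for the /data endpoint (hypothetical helper)."""
    # created_at must be timezone-aware; default to the current UTC time.
    when = created_at or datetime.now(timezone.utc)
    payload = {
        "messages": messages,
        "model": model,
        # ISO 8601 in UTC with a trailing "Z", e.g. 2025-06-05T15:50:00Z
        "creation_time": when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    # Optional fields are included only when the caller supplies them.
    if tags is not None:
        payload["tags"] = list(tags)
    if usage is not None:
        payload["usage"] = dict(usage)
    if kwargs is not None:
        payload["kwargs"] = dict(kwargs)
    return payload


body = json.dumps(
    build_data_payload(
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What's the weather today?"},
        ],
        model="MODEL_NAME",
        created_at=datetime(2025, 6, 5, 15, 50, tzinfo=timezone.utc),
        tags=["example", "weather"],
    )
)
```

The resulting body string can be sent as the POST data with any HTTP client, together with the X-API-KEY and Content-Type headers from the curl command.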

JSONL File Format

The JSONL file should contain one JSON object per line, where each object represents a conversation or interaction. Here's an example of the expected format:

{"messages": [{"role": "user", "content": "What's the capital of Australia?"}, {"role": "assistant", "content": "The capital of Australia is Canberra."}]}
{"messages": [{"role": "system", "content": "You are a travel advisor."}, {"role": "user", "content": "Where should I go in Europe for a summer vacation?"}, {"role": "assistant", "content": "Consider Italy, Spain, or Greece—they offer great weather, food, and culture in the summer!"}]}
{"messages": [{"role": "user", "content": "What's the capital of Australia?"}, {"role": "assistant", "content": "The capital of Australia is Canberra."}]}
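
If your conversations already live in application code, a file in this shape can be produced with a standard JSON serializer. The following is a minimal sketch; to_jsonl and write_jsonl are hypothetical helper names, and the input is assumed to be a list of conversations, each a list of message dictionaries.

```python
import json


def to_jsonl(conversations):
    """Serialize conversations (lists of message dicts) to JSONL text."""
    # json.dumps escapes newlines inside strings, so each object stays on one line.
    return "".join(
        json.dumps({"messages": messages}, ensure_ascii=False) + "\n"
        for messages in conversations
    )


def write_jsonl(path, conversations):
    """Write the JSONL text to disk using UTF-8 encoding."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_jsonl(conversations))


example = to_jsonl(
    [
        [
            {"role": "user", "content": "What's the capital of Australia?"},
            {"role": "assistant", "content": "The capital of Australia is Canberra."},
        ],
        [
            {"role": "user", "content": "Line one\nLine two"},
            {"role": "assistant", "content": "Received both lines."},
        ],
    ]
)
```

Passing ensure_ascii=False keeps non-ASCII characters readable in the file; the default escaped form is equally valid JSON.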

Each line in the JSONL file must be a valid JSON object containing:

  • messages: An array of message objects
  • Each message object must have:
    • role: Either "system", "user", or "assistant"
    • content: The text content of the message

Make sure your JSONL file:

  • Has one complete JSON object per line
  • Uses proper JSON formatting
  • Contains the required fields for each message
  • Has no trailing commas
  • Uses UTF-8 encoding
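
The checklist above can be approximated locally before uploading. The sketch below is a hypothetical pre-check, not the validation AI Studio itself performs; treating empty lines and empty messages arrays as problems is an assumption made here for strictness.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_jsonl(text):
    """Return a list of (line_number, problem) tuples; an empty list means no problems found."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # ignore the single trailing newline at end of file
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((number, "empty line"))
            continue
        try:
            record = json.loads(line)  # rejects trailing commas and other malformed JSON
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append((number, "missing or empty 'messages' array"))
            continue
        for message in messages:
            if not isinstance(message, dict) or message.get("role") not in ALLOWED_ROLES:
                problems.append((number, "message has missing or unknown role"))
            elif not isinstance(message.get("content"), str):
                problems.append((number, "message content is not a string"))
    return problems


def validate_file(path):
    """Read a file as UTF-8 (raises UnicodeDecodeError otherwise) and validate it."""
    with open(path, encoding="utf-8") as f:
        return validate_jsonl(f.read())
```

For example, validate_file("data.jsonl") returns an empty list for a file like the sample above, and one tuple per offending message or line otherwise.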