
Deploy Models

This article explains how to deploy, monitor, and undeploy trained models in Hyperstack AI Studio using both the API and the user interface. It walks through the deployment process, checking deployment status, and managing production-ready models from the My Models page or programmatically.

Deploying Using the API

Replace the following variables before running the command:

  • API_KEY: Your AI Studio API key.
  • MODEL_NAME: Name of the model to deploy, passed in the request body.
curl -X POST https://api.genai.hyperstack.cloud/tailor/v1/deploy_model \
-H "X-API-Key: API_KEY" \
-H "Content-Type: application/json" \
-d '{"model_name": "MODEL_NAME"}'

Response

{
  "message": "Deploying model",
  "status": "success"
}

Asynchronous deployment

Deployment is asynchronous: the model first enters the "deploying" state and transitions to "deployed" once it is ready to serve requests.
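For scripted deployments, the curl call above can be reproduced in Python with only the standard library. This is a minimal sketch, not an official client; the API key and model name are placeholders you must supply:

```python
import json
import os
import urllib.request

API_BASE = "https://api.genai.hyperstack.cloud/tailor/v1"

def build_deploy_request(api_key: str, model_name: str) -> urllib.request.Request:
    """Build the POST request for the deploy_model endpoint."""
    return urllib.request.Request(
        f"{API_BASE}/deploy_model",
        data=json.dumps({"model_name": model_name}).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__" and "API_KEY" in os.environ:
    # Expects your key in the API_KEY environment variable.
    req = build_deploy_request(os.environ["API_KEY"], "MODEL_NAME")
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))  # e.g. {"message": "Deploying model", "status": "success"}
```

Because deployment is asynchronous, a "success" response here only means the request was accepted; check the model's state as described below before sending inference traffic.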

Deploying Using the UI

Follow these steps to deploy your fine-tuned model:

  1. Open the Models Page
    Navigate to the My Models page.

  2. Select a Model
    Click on the fine-tuned model you want to deploy to open its details view.

  3. Toggle Deployment
    In the Status section, use the Deploy toggle to activate or deactivate the model. Deployment typically takes a few seconds.

Checking Deployment Status Using the API

Use this endpoint to retrieve the deployment status and configuration details of a specific fine-tuned model.

A model is successfully deployed when the response includes "state": "deployed", as shown near the bottom of the example response below.

Replace the following variables before running the command:

  • API_KEY: Your AI Studio API key.
  • MODEL_NAME: Name of the model, included in the request path.
curl -X GET "https://api.genai.hyperstack.cloud/tailor/v1/models/by_name/{MODEL_NAME}" \
-H "X-API-Key: API_KEY" \
-H "Content-Type: application/json"

Response

{
  "message": {
    "base_model_data": {
      "available_for_finetuning": true,
      "available_for_inference": true,
      "default_batch_size": 64,
      "default_gradient_accumulation_steps": 1,
      "default_lr": 0.0002,
      "default_micro_batch_size": 32,
      "deployment_records": [],
      "display_name": "Mistral 7B Instruct (v0.3)",
      "hf_repo": "mistralai/mistral-7b-instruct-v0.3",
      "model_id": 1,
      "model_name": "mistral-7b-instruct-v0.3",
      "model_size": "0B",
      "model_type": "language_model",
      "price_per_m_token": 0.0,
      "supported_context_len": 8192,
      "supported_locations": [
        "ca1"
      ],
      "training_time_per_log": 1.048,
      "training_time_y_intercept": 95.568
    },
    "base_model_id": 1,
    "created_at": "Thu, 05 Jun 2025",
    "created_at_unix": 1749147227.14679,
    "deployment_records": [
      [
        "2025-06-05T18:55:31.407283+00:00",
        null
      ]
    ],
    "last_deployed_on": null,
    "last_deployed_on_unix": null,
    "last_used": null,
    "last_used_unix": null,
    "model_config": {
      "base_model": "mistral-7b-instruct-v0.3",
      "batch_size": 4,
      "custom_dataset": null,
      "custom_logs_filename": null,
      "epoch": 1,
      "gradient_accumulation_steps": 1,
      "learning_rate": 0.0002,
      "lora_alpha": 16,
      "lora_dropout": 0.05,
      "lora_r": 32,
      "micro_batch_size": 2,
      "number_of_bad_logs": 0,
      "pipeline_id": null,
      "pipeline_uuid": null,
      "requested_batch_size": 4,
      "save_logs_with_tags": null,
      "tags": [
        "example"
      ]
    },
    "model_evaluation": {},
    "model_evaluation_state": {
      "mix_eval": {
        "status": {}
      },
      "needlehaystack": {
        "status": "not_started"
      }
    },
    "model_id": 34,
    "model_name": "test-model",
    "parent_id": null,
    "router_id": null,
    "self_hosted": false,
    "state": "deployed",
    "training_ended_at": "Thu, 05 Jun 2025",
    "training_ended_at_unix": 1749147390.553279,
    "updated_at": "Thu, 05 Jun 2025",
    "usage_data": [],
    "user_id": 21
  },
  "status": "success"
}
Response field descriptions
status string

The status of the API call. "success" indicates the request was handled correctly.


message object

Contains detailed information about the queried fine-tuned model.

base_model_data object

Metadata and configuration about the original base model.

available_for_finetuning boolean

Indicates whether the base model can be fine-tuned.


available_for_inference boolean

Indicates whether the base model can be used for inference.


default_batch_size integer

The default batch size used during training.


default_gradient_accumulation_steps integer

The default number of gradient accumulation steps during training.


default_lr number

Default learning rate used during fine-tuning.


default_micro_batch_size integer

Default micro batch size used for the training loop.


deployment_records array

Records of deployments for the base model (empty if not deployed).


display_name string

User-friendly display name of the base model.


hf_repo string

HuggingFace model repository identifier.


model_id integer

Internal ID of the base model.


model_name string

Technical name of the base model.


model_size string

Size of the model (e.g., weights file size).


model_type string

Type of model (e.g., "language_model").


price_per_m_token number

Cost per million tokens for using this base model.


supported_context_len integer

Maximum supported input length for the model in tokens.


supported_locations array

List of regions where the base model is available.


training_time_per_log number

Estimated time per log (in seconds) during training.


training_time_y_intercept number

Y-intercept value used for training time estimation.


base_model_id integer

ID of the base model used to create the fine-tuned model.


created_at string

Human-readable timestamp when the model was created.


created_at_unix number

UNIX timestamp for when the model was created.


deployment_records array

List of [deployment, undeployment] timestamp pairs, one per deployment; a null undeployment timestamp means that deployment is still active.


last_deployed_on string/null

The last time the model was deployed (if any).


last_used string/null

The last time the model was used (if any).


model_config object

The configuration used during the fine-tuning process.

base_model string

Name of the base model used.


batch_size integer

Effective batch size used in training.


custom_dataset string/null

Custom dataset name used for training (if any).


custom_logs_filename string/null

Custom logs filename used in training (if any).


epoch integer

Number of epochs the model was trained for.


gradient_accumulation_steps integer

Gradient accumulation steps used.


learning_rate number

Learning rate used.


lora_alpha integer

LoRA alpha hyperparameter.


lora_dropout number

LoRA dropout rate.


lora_r integer

LoRA rank.


micro_batch_size integer

Micro batch size used.


number_of_bad_logs integer

Count of bad logs skipped during training.


requested_batch_size integer

User-requested batch size.


save_logs_with_tags boolean/null

Whether logs were saved with tags.


tags array

List of tags used to filter training data.


model_evaluation object

Model evaluation results (currently empty).


model_evaluation_state object

The evaluation state for different tasks.

mix_eval.status object

Evaluation status for mixed evaluation task.


needlehaystack.status string

Status of the needlehaystack evaluation task. Example: "not_started".


model_id integer

ID of the fine-tuned model.


model_name string

User-defined name for the fine-tuned model.


parent_id integer/null

Parent model ID if the model is a derivative.


router_id integer/null

Routing ID (null if not set).


self_hosted boolean

Indicates if the model is self-hosted.


state string

Current state of the model. Example: "deployed".


training_ended_at string

Human-readable end time of training.


training_ended_at_unix number

UNIX timestamp when training ended.


updated_at string

Last updated timestamp (human-readable).


usage_data array

Reserved for future usage metrics.


user_id integer

ID of the user who owns this model.

Undeploying Using the API

Replace the following variables before running the command:

  • API_KEY: Your AI Studio API key.
  • MODEL_NAME: Name of the model to undeploy, passed in the request body.
curl -X POST https://api.genai.hyperstack.cloud/tailor/v1/undeploy_model \
-H "X-API-Key: API_KEY" \
-H "Content-Type: application/json" \
-d '{"model_name": "MODEL_NAME"}'

Response

{
  "message": "Undeploying model",
  "status": "success"
}
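The deploy and undeploy endpoints share the same request shape, so scripts can route both through one helper. This is a sketch built on the two endpoints shown in this article, not an official client:

```python
import json
import urllib.request

API_BASE = "https://api.genai.hyperstack.cloud/tailor/v1"

def build_toggle_request(api_key: str, model_name: str,
                         deploy: bool) -> urllib.request.Request:
    """Build a POST request for deploy_model or undeploy_model."""
    endpoint = "deploy_model" if deploy else "undeploy_model"
    return urllib.request.Request(
        f"{API_BASE}/{endpoint}",
        data=json.dumps({"model_name": model_name}).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen sends it; a successful undeploy returns the "Undeploying model" body shown above.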