
Deploy Models

This article explains how to deploy, monitor, and undeploy trained models in Hyperstack AI Studio using both the API and the user interface. It walks through the deployment process, checking deployment status, and managing production-ready models from the My Models page or programmatically.

Deploying Using the API

Replace the following variables before running the command:

  • API_KEY: Your AI Studio API key.
  • MODEL_NAME: Name of the model to deploy, passed in the request body.
curl -X POST https://api.genai.hyperstack.cloud/tailor/v1/deploy_model \
-H "X-API-Key: API_KEY" \
-H "Content-Type: application/json" \
-d '{"model_name": "MODEL_NAME"}'

Response

{
  "message": "Deploying model",
  "status": "success"
}

Asynchronous deployment

Deployment is asynchronous: the model first enters the "deploying" state and transitions to "deployed" once it is ready to serve requests.
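For scripted deployments, the curl call above can be reproduced in Python with only the standard library. This is a minimal sketch, not an official client; the API key and model name are placeholders you must supply:

```python
import json
import os
import urllib.request

API_BASE = "https://api.genai.hyperstack.cloud/tailor/v1"

def build_deploy_request(api_key: str, model_name: str) -> urllib.request.Request:
    """Build the POST request for the deploy_model endpoint."""
    return urllib.request.Request(
        f"{API_BASE}/deploy_model",
        data=json.dumps({"model_name": model_name}).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__" and "API_KEY" in os.environ:
    # Expects your key in the API_KEY environment variable.
    req = build_deploy_request(os.environ["API_KEY"], "MODEL_NAME")
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))  # e.g. {"message": "Deploying model", "status": "success"}
```

Because deployment is asynchronous, a "success" response here only means the request was accepted; check the model's state as described below before sending inference traffic.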

Deploying Using the UI

Follow these steps to deploy your fine-tuned model:

  1. Open the Models Page
    Navigate to the My Models page.

  2. Select a Model
    Click on the fine-tuned model you want to deploy to open its details view.

  3. Toggle Deployment
    In the Status section, use the Deploy toggle to activate or deactivate the model. Deployment typically takes a few seconds.

Checking Deployment Status Using the API

Use this endpoint to retrieve the deployment status and configuration details of a specific fine-tuned model.

A model is successfully deployed when the response includes "state": "deployed", as shown near the bottom of the example response below.

Replace the following variables before running the command:

  • API_KEY: Your AI Studio API key.
  • MODEL_NAME: Name of the model, included in the request path.
curl -X GET "https://api.genai.hyperstack.cloud/tailor/v1/models/by_name/{MODEL_NAME}" \
-H "X-API-Key: API_KEY" \
-H "Content-Type: application/json"

Response

{
  "message": {
    "base_model_data": {
      "available_for_finetuning": true,
      "available_for_inference": true,
      "default_batch_size": 64,
      "default_gradient_accumulation_steps": 1,
      "default_lr": 0.0002,
      "default_micro_batch_size": 32,
      "deployment_records": [],
      "display_name": "Mistral 7B Instruct (v0.3)",
      "hf_repo": "mistralai/mistral-7b-instruct-v0.3",
      "model_id": 1,
      "model_name": "mistral-7b-instruct-v0.3",
      "model_size": "0B",
      "model_type": "language_model",
      "price_per_m_token": 0.0,
      "supported_context_len": 8192,
      "supported_locations": [
        "ca1"
      ],
      "training_time_per_log": 1.048,
      "training_time_y_intercept": 95.568
    },
    "base_model_id": 1,
    "created_at": "Thu, 05 Jun 2025",
    "created_at_unix": 1749147227.14679,
    "deployment_records": [
      [
        "2025-06-05T18:55:31.407283+00:00",
        null
      ]
    ],
    "last_deployed_on": null,
    "last_deployed_on_unix": null,
    "last_used": null,
    "last_used_unix": null,
    "model_config": {
      "base_model": "mistral-7b-instruct-v0.3",
      "batch_size": 4,
      "custom_dataset": null,
      "custom_logs_filename": null,
      "epoch": 1,
      "gradient_accumulation_steps": 1,
      "learning_rate": 0.0002,
      "lora_alpha": 16,
      "lora_dropout": 0.05,
      "lora_r": 32,
      "micro_batch_size": 2,
      "number_of_bad_logs": 0,
      "pipeline_id": null,
      "pipeline_uuid": null,
      "requested_batch_size": 4,
      "save_logs_with_tags": null,
      "tags": [
        "example"
      ]
    },
    "model_evaluation": {},
    "model_evaluation_state": {
      "mix_eval": {
        "status": {}
      },
      "needlehaystack": {
        "status": "not_started"
      }
    },
    "model_id": 34,
    "model_name": "test-model",
    "parent_id": null,
    "router_id": null,
    "self_hosted": false,
    "state": "deployed",
    "training_ended_at": "Thu, 05 Jun 2025",
    "training_ended_at_unix": 1749147390.553279,
    "updated_at": "Thu, 05 Jun 2025",
    "usage_data": [],
    "user_id": 21
  },
  "status": "success"
}
Response field descriptions
status string

The status of the API call. "success" indicates the request was handled correctly.


message object

Contains detailed information about the queried fine-tuned model.

base_model_data object

Metadata and configuration about the original base model.

available_for_finetuning boolean

Indicates whether the base model can be fine-tuned.


available_for_inference boolean

Indicates whether the base model can be used for inference.


default_batch_size integer

The default batch size used during training.


default_gradient_accumulation_steps integer

The default number of gradient accumulation steps during training.


default_lr number

Default learning rate used during fine-tuning.


default_micro_batch_size integer

Default micro batch size used for the training loop.


deployment_records array

Records of deployments for the base model (empty if not deployed).


display_name string

User-friendly display name of the base model.


hf_repo string

HuggingFace model repository identifier.


model_id integer

Internal ID of the base model.


model_name string

Technical name of the base model.


model_size string

Size of the model (e.g., weights file size).


model_type string

Type of model (e.g., "language_model").


price_per_m_token number

Cost per million tokens for using this base model.


supported_context_len integer

Maximum supported input length for the model in tokens.


supported_locations array

List of regions where the base model is available.


training_time_per_log number

Estimated time per log (in seconds) during training.


training_time_y_intercept number

Y-intercept value used for training time estimation.


base_model_id integer

ID of the base model used to create the fine-tuned model.


created_at string

Human-readable timestamp when the model was created.


created_at_unix number

UNIX timestamp for when the model was created.


deployment_records array

List of [deployment, undeployment] timestamp pairs, one per deployment; a null undeployment timestamp means that deployment is still active.


last_deployed_on string/null

The last time the model was deployed (if any).


last_used string/null

The last time the model was used (if any).


model_config object

The configuration used during the fine-tuning process.

base_model string

Name of the base model used.


batch_size integer

Effective batch size used in training.


custom_dataset string/null

Custom dataset name used for training (if any).


custom_logs_filename string/null

Custom logs filename used in training (if any).


epoch integer

Number of epochs the model was trained for.


gradient_accumulation_steps integer

Gradient accumulation steps used.


learning_rate number

Learning rate used.


lora_alpha integer

LoRA alpha hyperparameter.


lora_dropout number

LoRA dropout rate.


lora_r integer

LoRA rank.


micro_batch_size integer

Micro batch size used.


number_of_bad_logs integer

Count of bad logs skipped during training.


requested_batch_size integer

User-requested batch size.


save_logs_with_tags boolean/null

Whether logs were saved with tags.


tags array

List of tags used to filter training data.


model_evaluation object

Model evaluation results (currently empty).


model_evaluation_state object

The evaluation state for different tasks.

mix_eval.status object

Evaluation status for mixed evaluation task.


needlehaystack.status string

Status of the needlehaystack evaluation task. Example: "not_started".


model_id integer

ID of the fine-tuned model.


model_name string

User-defined name for the fine-tuned model.


parent_id integer/null

Parent model ID if the model is a derivative.


router_id integer/null

Routing ID (null if not set).


self_hosted boolean

Indicates if the model is self-hosted.


state string

Current state of the model. Example: "deployed".


training_ended_at string

Human-readable end time of training.


training_ended_at_unix number

UNIX timestamp when training ended.


updated_at string

Last updated timestamp (human-readable).


usage_data array

Reserved for future usage metrics.


user_id integer

ID of the user who owns this model.

Undeploying Using the API

Replace the following variables before running the command:

  • API_KEY: Your AI Studio API key.
  • MODEL_NAME: Name of the model to undeploy, passed in the request body.
curl -X POST https://api.genai.hyperstack.cloud/tailor/v1/undeploy_model \
-H "X-API-Key: API_KEY" \
-H "Content-Type: application/json" \
-d '{"model_name": "MODEL_NAME"}'

Response

{
  "message": "Undeploying model",
  "status": "success"
}
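The deploy and undeploy endpoints share the same request shape, so scripts can route both through one helper. This is a sketch built on the two endpoints shown in this article, not an official client:

```python
import json
import urllib.request

API_BASE = "https://api.genai.hyperstack.cloud/tailor/v1"

def build_toggle_request(api_key: str, model_name: str,
                         deploy: bool) -> urllib.request.Request:
    """Build a POST request for deploy_model or undeploy_model."""
    endpoint = "deploy_model" if deploy else "undeploy_model"
    return urllib.request.Request(
        f"{API_BASE}/{endpoint}",
        data=json.dumps({"model_name": model_name}).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen sends it; a successful undeploy returns the "Undeploying model" body shown above.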