Deploy Models
This article explains how to deploy, monitor, and undeploy trained models in Hyperstack AI Studio using both the API and the user interface. It walks through the deployment process, checking deployment status, and managing production-ready models from the My Models page or programmatically.
In this article
- Deploying Using the API
- Deploying Using the UI
- Checking Deployment Status Using the API
- Undeploying Using the API
Deploying Using the API
Replace the following variables before running the command:
API_KEY
: Your AI Studio API key.
MODEL_NAME
: Name of the model to deploy.
curl -X POST https://api.genai.hyperstack.cloud/tailor/v1/deploy_model \
-H "X-API-Key: API_KEY" \
-H "Content-Type: application/json" \
-d '{"model_name": "MODEL_NAME"}'
Required Parameters
model_name (string) – Unique name of the model to be deployed.
Response
{
"message": "Deploying model",
"status": "success"
}
Deployment is asynchronous. The model initially enters the deploying state.
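Because the call above only needs an API key, a model name, and two headers, it is straightforward to build programmatically. The sketch below uses only Python's standard library; `build_deploy_request` is a hypothetical helper name, not part of the AI Studio SDK.

```python
import json
import urllib.request

API_BASE = "https://api.genai.hyperstack.cloud/tailor/v1"

def build_deploy_request(api_key: str, model_name: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for /deploy_model."""
    payload = json.dumps({"model_name": model_name}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/deploy_model",
        data=payload,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

# Sending the request is then a one-liner:
# with urllib.request.urlopen(build_deploy_request(API_KEY, "my-model")) as resp:
#     print(json.load(resp))
```

Separating request construction from sending makes the payload easy to inspect or log before any network traffic occurs.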
Deploying Using the UI
Follow these steps to deploy your fine-tuned model:
1. Open the Models Page
   Navigate to the My Models page.
2. Select a Model
   Click on the fine-tuned model you want to deploy to open its details view.
3. Toggle Deployment
   In the Status section, use the Deploy toggle to activate or deactivate the model. Deployment typically takes a few seconds.
Checking Deployment Status Using the API
Use this endpoint to retrieve the deployment status and configuration details of a specific fine-tuned model.
A model is successfully deployed when the response includes "state": "deployed", as shown near the bottom of the example response below.
Replace the following variables before running the command:
API_KEY
: Your AI Studio API key.
MODEL_NAME
: Include the name of the model in the path.
curl -X GET "https://api.genai.hyperstack.cloud/tailor/v1/models/by_name/{MODEL_NAME}" \
-H "X-API-Key: API_KEY" \
-H "Content-Type: application/json"
Required Parameters
model_name (string) – The name of the fine-tuned model to retrieve details for.
Response
{
"message": {
"base_model_data": {
"available_for_finetuning": true,
"available_for_inference": true,
"default_batch_size": 64,
"default_gradient_accumulation_steps": 1,
"default_lr": 0.0002,
"default_micro_batch_size": 32,
"deployment_records": [],
"display_name": "Mistral 7B Instruct (v0.3)",
"hf_repo": "mistralai/mistral-7b-instruct-v0.3",
"model_id": 1,
"model_name": "mistral-7b-instruct-v0.3",
"model_size": "0B",
"model_type": "language_model",
"price_per_m_token": 0.0,
"supported_context_len": 8192,
"supported_locations": [
"ca1"
],
"training_time_per_log": 1.048,
"training_time_y_intercept": 95.568
},
"base_model_id": 1,
"created_at": "Thu, 05 Jun 2025",
"created_at_unix": 1749147227.14679,
"deployment_records": [
[
"2025-06-05T18:55:31.407283+00:00",
null
]
],
"last_deployed_on": null,
"last_deployed_on_unix": null,
"last_used": null,
"last_used_unix": null,
"model_config": {
"base_model": "mistral-7b-instruct-v0.3",
"batch_size": 4,
"custom_dataset": null,
"custom_logs_filename": null,
"epoch": 1,
"gradient_accumulation_steps": 1,
"learning_rate": 0.0002,
"lora_alpha": 16,
"lora_dropout": 0.05,
"lora_r": 32,
"micro_batch_size": 2,
"number_of_bad_logs": 0,
"pipeline_id": null,
"pipeline_uuid": null,
"requested_batch_size": 4,
"save_logs_with_tags": null,
"tags": [
"example"
]
},
"model_evaluation": {},
"model_evaluation_state": {
"mix_eval": {
"status": {}
},
"needlehaystack": {
"status": "not_started"
}
},
"model_id": 34,
"model_name": "test-model",
"parent_id": null,
"router_id": null,
"self_hosted": false,
"state": "deployed",
"training_ended_at": "Thu, 05 Jun 2025",
"training_ended_at_unix": 1749147390.553279,
"updated_at": "Thu, 05 Jun 2025",
"usage_data": [],
"user_id": 21
},
"status": "success"
}
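Given a response like the one above, deciding whether the model is live comes down to reading two fields. This minimal sketch (with a hypothetical `is_deployed` helper) shows the check:

```python
import json

def is_deployed(response: dict) -> bool:
    """Return True when a /models/by_name response reports the model as deployed."""
    return (response.get("status") == "success"
            and response.get("message", {}).get("state") == "deployed")

# Abbreviated version of the example response above:
sample = json.loads(
    '{"message": {"state": "deployed", "model_name": "test-model"}, "status": "success"}'
)
print(is_deployed(sample))  # True
```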
Response field descriptions
status string
The status of the API call. "success" indicates the request was handled correctly.
message object
Contains detailed information about the queried fine-tuned model.
base_model_data object
Metadata and configuration about the original base model.
available_for_finetuning boolean
Indicates whether the base model can be fine-tuned.
available_for_inference boolean
Indicates whether the base model can be used for inference.
default_batch_size integer
The default batch size used during training.
default_gradient_accumulation_steps integer
The default number of gradient accumulation steps during training.
default_lr number
Default learning rate used during fine-tuning.
default_micro_batch_size integer
Default micro batch size used for the training loop.
deployment_records array
Records of deployments for the base model (empty if not deployed).
display_name string
User-friendly display name of the base model.
hf_repo string
HuggingFace model repository identifier.
model_id integer
Internal ID of the base model.
model_name string
Technical name of the base model.
model_size string
Size of the model (e.g., weights file size).
model_type string
Type of model (e.g., "language_model").
price_per_m_token number
Cost per million tokens for using this base model.
supported_context_len integer
Maximum supported input length for the model in tokens.
supported_locations array
List of regions where the base model is available.
training_time_per_log number
Estimated time per log (in seconds) during training.
training_time_y_intercept number
Y-intercept value used for training time estimation.
base_model_id integer
ID of the base model used to create the fine-tuned model.
created_at string
Human-readable timestamp when the model was created.
created_at_unix number
UNIX timestamp for when the model was created.
deployment_records array
List of tuples showing model deployment timestamps and undeployment events.
last_deployed_on string/null
The last time the model was deployed (if any).
last_used string/null
The last time the model was used (if any).
model_config object
The configuration used during the fine-tuning process.
base_model string
Name of the base model used.
batch_size integer
Effective batch size used in training.
custom_dataset string/null
Custom dataset name used for training (if any).
custom_logs_filename string/null
Custom logs filename used in training (if any).
epoch integer
Number of epochs the model was trained for.
gradient_accumulation_steps integer
Gradient accumulation steps used.
learning_rate number
Learning rate used.
lora_alpha integer
LoRA alpha hyperparameter.
lora_dropout number
LoRA dropout rate.
lora_r integer
LoRA rank.
micro_batch_size integer
Micro batch size used.
number_of_bad_logs integer
Count of bad logs skipped during training.
requested_batch_size integer
User-requested batch size.
save_logs_with_tags boolean/null
Whether logs were saved with tags.
tags array
List of tags used to filter training data.
model_evaluation object
Model evaluation results (currently empty).
model_evaluation_state object
The evaluation state for different tasks.
model_id integer
ID of the fine-tuned model.
model_name string
User-defined name for the fine-tuned model.
parent_id integer/null
Parent model ID if the model is a derivative.
router_id integer/null
Routing ID (null if not set).
self_hosted boolean
Indicates if the model is self-hosted.
state string
Current state of the model. Example: "deployed".
training_ended_at string
Human-readable end time of training.
training_ended_at_unix number
UNIX timestamp when training ended.
updated_at string
Last updated timestamp (human-readable).
usage_data array
Reserved for future usage metrics.
user_id integer
ID of the user who owns this model.
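Since deployment is asynchronous, a common pattern is to poll the status endpoint until "state" becomes "deployed". The sketch below is a hypothetical polling helper: `fetch_state` stands in for a function that performs the GET request above and returns `response["message"]["state"]`.

```python
import time
from typing import Callable

def wait_until_deployed(fetch_state: Callable[[], str],
                        timeout_s: float = 300.0,
                        poll_interval_s: float = 5.0) -> bool:
    """Poll fetch_state() until it returns "deployed" or the timeout expires.

    fetch_state is expected to call GET /models/by_name/{MODEL_NAME}
    and return the "state" field from the response's "message" object.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if fetch_state() == "deployed":
            return True
        time.sleep(poll_interval_s)
    return False
```

Injecting `fetch_state` as a callable keeps the polling logic testable without a network connection; in production it would wrap the curl call shown earlier.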
Undeploying Using the API
Replace the following variables before running the command:
API_KEY
: Your AI Studio API key.
MODEL_NAME
: Include the name of the model to undeploy in the body of the request.
curl -X POST https://api.genai.hyperstack.cloud/tailor/v1/undeploy_model \
-H "X-API-Key: API_KEY" \
-H "Content-Type: application/json" \
-d '{"model_name": "MODEL_NAME}'
Required Parameters
model_name (string) – Name of the model to undeploy.
Response
{
"message": "Undeploying model",
"status": "success"
}
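The undeploy call mirrors the deploy call, so the same request-building approach applies. As before, this is a standard-library sketch and `build_undeploy_request` is a hypothetical helper name:

```python
import json
import urllib.request

API_BASE = "https://api.genai.hyperstack.cloud/tailor/v1"

def build_undeploy_request(api_key: str, model_name: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for /undeploy_model."""
    return urllib.request.Request(
        f"{API_BASE}/undeploy_model",
        data=json.dumps({"model_name": model_name}).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```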