
Models Overview - Base Models, Aliases, and Rankings

This guide introduces the core model management concepts in Hyperstack AI Studio: base models, model aliasing, and model rankings. Understanding how these elements work together will help you iterate faster, streamline deployment, and select an appropriate base model for your fine-tuning or inference needs.

Base Models

Hyperstack AI Studio provides several open-source foundation models that you can use out of the box or fine-tune to meet your application needs. These models are production-ready and optimized for various use cases such as summarization, reasoning, code generation, and instruction following.

Available Base Models

The table below summarizes the base models available in AI Studio, with a short description of each.

| Model Name | About the Model |
| --- | --- |
| Mistral 7B Instruct (v0.3) | A 7.3B parameter dense transformer model tuned for instruction following. It uses sliding-window and grouped-query attention to improve efficiency and throughput, and performs well on reasoning, summarization, and code generation tasks. |
| Mixtral 8×7B Instruct (v0.1) | A sparse mixture-of-experts model that routes each token to two of its eight experts. This architecture increases efficiency while delivering strong results across general language tasks, making it well suited to high-throughput or cost-sensitive deployments. |
| Mistral Small 3 | A compact 24B parameter model designed for low-latency inference and resource-constrained environments. Despite its smaller size, it delivers strong results on instruction following, math, and code generation, making it a good fit for real-time applications such as chatbots, support agents, and embedded systems. |
| Llama 3.1 70B Instruct | Meta's 70B parameter model, fine-tuned for instruction-based tasks and optimized for high-complexity work such as reasoning and long-context generation. Suitable for production use in advanced chatbots and text analysis systems. |
| Llama 3.1 8B Instruct | A smaller alternative to the 70B variant that balances performance and efficiency. It supports a broad range of general-purpose tasks while being more cost-effective for development and experimentation. |

Managing Base Models

Base models can be accessed and managed directly from the Base Models tab on the My Models page. Here, you can:

  • Explore detailed information about each model, including usage metrics, cost per 1 million tokens, and more (a rough cost-estimation sketch follows this list).
  • Initiate fine-tuning for your specific use case.
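
As a rough illustration of how a per-1-million-token price translates into spend, the sketch below estimates the cost of a request from its token counts. The price and token figures are placeholders, not actual AI Studio rates; check the Base Models tab for each model's real pricing.

```python
# Rough cost estimate from token usage and a per-1M-token price.
# The figures below are placeholders, not actual AI Studio rates.

def estimate_cost(input_tokens: int, output_tokens: int, price_per_million: float) -> float:
    """Estimated cost (in the pricing currency) for a request or batch."""
    total_tokens = input_tokens + output_tokens
    return (total_tokens / 1_000_000) * price_per_million

# Example: 120k input tokens + 30k output tokens at a hypothetical 0.40 per 1M tokens.
print(round(estimate_cost(120_000, 30_000, 0.40), 4))  # 0.06
```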

Model Aliasing

Model aliasing allows you to assign a custom, stable name (alias) to a specific base or fine-tuned model version. Instead of using the exact model name or ID in your deployment or API calls, you can reference an alias like chatbot-production.

Aliases give you intuitive, stable names for referencing models in deployment and inference. This is useful for:

  • Fine-tuned and base models: clearly identifying models for deployment and usage.
  • Inference: referencing models consistently across API calls or tools (see the request sketch after this list).
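
For example, an inference request can target the alias instead of a versioned model ID, as in the sketch below. This assumes an OpenAI-compatible chat completions endpoint; the base URL, environment variable names, and timeout are placeholders rather than confirmed AI Studio values, and chatbot-production is the alias from the example above.

```python
import os

import requests

# Placeholders: substitute your actual AI Studio endpoint and API key.
BASE_URL = os.environ.get("AI_STUDIO_BASE_URL", "https://api.example.com/v1")
API_KEY = os.environ["AI_STUDIO_API_KEY"]

response = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        # The alias stands in for the exact model name or ID, so the
        # underlying model can be swapped without touching this call.
        "model": "chatbot-production",
        "messages": [{"role": "user", "content": "Summarize our refund policy."}],
    },
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

Because the call references the alias, repointing chatbot-production at a newer fine-tuned model updates every caller without a code change.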

Manage your aliases on the Model Aliasing page.

Creating a Model Alias

  1. Visit the Model Aliasing page.
  2. Select a currently deployed model.
  3. Optionally, add a suffix to the alias name.
  4. Click Create Alias.

After you create an alias, it appears on the Model Aliasing page, where you can update or remove the linked model or delete the alias entirely.


Model Rankings

You can access the Model Rankings page to explore benchmark-based evaluation scores for a wide range of foundation models, including both open-source and proprietary options. This page helps you compare performance across standardized tasks to make informed decisions when selecting a base model for fine-tuning or inference.

Each model is evaluated on a variety of popular benchmarks such as:

  • AGIEval, ARC, MMLU – General academic and reasoning tasks
  • GSM8K, MATH, DROP – Math and numerical reasoning
  • BoolQ, PIQA, SIQA – Commonsense and logic

For a complete list of benchmark datasets used in evaluation, see the Benchmark Datasets section.

Scores range from 0 to 1, with higher values indicating stronger performance.

You can search and filter the list to compare models side by side. This is especially useful for assessing trade-offs between model size, quality, and task-specific capabilities.
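
The snippet below sketches that kind of side-by-side comparison using made-up scores on the same 0-to-1 scale; the model names and numbers are illustrative placeholders, not values from the Model Rankings page.

```python
# Illustrative only: model names and scores are placeholders on a 0-1 scale,
# not actual values from the Model Rankings page.
scores = {
    "model-a": {"MMLU": 0.82, "GSM8K": 0.88, "ARC": 0.93},
    "model-b": {"MMLU": 0.68, "GSM8K": 0.57, "ARC": 0.83},
}
benchmarks = ["MMLU", "GSM8K", "ARC"]

# Print a simple side-by-side table with a per-model average as a headline number.
print(f"{'model':<10}" + "".join(f"{b:>8}" for b in benchmarks) + f"{'avg':>8}")
for name, s in scores.items():
    avg = sum(s[b] for b in benchmarks) / len(benchmarks)
    print(f"{name:<10}" + "".join(f"{s[b]:>8.2f}" for b in benchmarks) + f"{avg:>8.2f}")
```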

