Balon AI

Model Catalog

53 models

Llama 3.3 70B Instruct Turbo

Meta

Lightweight 8B Meta Llama 3 model for fast, efficient chat tasks.

8K ctx

DeepSeek V4 Pro

DeepSeek

DeepSeek's latest V4 Pro model with strong tool use and general reasoning capabilities.

131K ctx

Cogito v2.1 671B

DeepCogito

671B-parameter DeepCogito model for advanced reasoning and deduction.

164K ctx

Qwen3 Coder 480B

Qwen

480B-parameter mixture-of-experts (MoE) coding model for software engineering and refactoring tasks.

262K ctx

Qwen3.5 397B

Qwen

397B-parameter Qwen3.5 model for complex reasoning and generation.

262K ctx

Qwen3 235B Instruct

Qwen

High-throughput 235B Qwen3 model with tool support, optimized for scale.

262K ctx

Qwen3.5 9B

Qwen

Compact and fast Qwen3.5 model for efficient inference and standard chat.

262K ctx

Qwen2.5 7B Instruct Turbo

Qwen

Fast 7B Qwen model with excellent instruction following, tool use, and low latency.

33K ctx

Kimi K2.6

Moonshot AI

Strong agentic model from Moonshot AI with advanced tool-use capabilities.

131K ctx

MiniMax M2.7

MiniMax AI

High-performance foundation model from MiniMax for conversational and creative tasks.

197K ctx

GLM-5

ZAI

Zhipu AI's next-generation foundation model with extended context and reasoning.

203K ctx

GLM-5.1

ZAI

Improved iteration of Zhipu AI's GLM-5 with enhanced multilingual and reasoning capabilities.

203K ctx

OpenAI GPT-OSS 20B

OpenAI OSS

OpenAI's 20B open-weight model for standard conversational and instruction-following tasks.

131K ctx

OpenAI GPT-OSS 120B

OpenAI OSS

OpenAI's 120B open-weight model for complex tasks that benefit from a larger parameter count.

131K ctx

Gemma 3N E4B Instruct

Google

Google's lightweight Gemma model, fine-tuned for instruction following and tool execution.

33K ctx

…