W&B Inference provides access to several open-source foundation models. Each model has different strengths and use cases.

Model catalog

| Model | Model ID (for API usage) | Type | Context Window | Parameters | Description |
|---|---|---|---|---|---|
| DeepSeek R1-0528 | `deepseek-ai/DeepSeek-R1-0528` | Text | 161K | 37B-680B (Active-Total) | Optimized for precise reasoning tasks, including complex coding, math, and structured document analysis. |
| DeepSeek V3-0324 | `deepseek-ai/DeepSeek-V3-0324` | Text | 161K | 37B-680B (Active-Total) | Robust Mixture-of-Experts model tailored for high-complexity language processing and comprehensive document analysis. |
| DeepSeek V3.1 | `deepseek-ai/DeepSeek-V3.1` | Text | 161K | 37B-671B (Active-Total) | Large hybrid model that supports both thinking and non-thinking modes via prompt templates. |
| Meta Llama 4 Scout | `meta-llama/Llama-4-Scout-17B-16E-Instruct` | Text, Vision | 64K | 17B-109B (Active-Total) | Multimodal model integrating text and image understanding, ideal for visual tasks and combined analysis. |
| Meta Llama 3.3 70B | `meta-llama/Llama-3.3-70B-Instruct` | Text | 128K | 70B (Total) | Multilingual model excelling at conversational tasks, detailed instruction following, and coding. |
| Meta Llama 3.1 70B | `meta-llama/Llama-3.1-70B-Instruct` | Text | 128K | 70B (Total) | Efficient conversational model optimized for responsive multilingual chatbot interactions. |
| Meta Llama 3.1 8B | `meta-llama/Llama-3.1-8B-Instruct` | Text | 128K | 8B (Total) | Efficient conversational model optimized for responsive multilingual chatbot interactions. |
| Microsoft Phi 4 Mini 3.8B | `microsoft/Phi-4-mini-instruct` | Text | 128K | 3.8B (Total) | Compact, efficient model ideal for fast responses in resource-constrained environments. |
| Moonshot AI Kimi K2.5 | `moonshotai/Kimi-K2.5` | Text, Vision | 262K | 32B-1T (Active-Total) | Multimodal Mixture-of-Experts model with 32 billion activated parameters and 1 trillion total parameters. |
| OpenAI GPT OSS 120B | `openai/gpt-oss-120b` | Text | 131K | 5.1B-117B (Active-Total) | Efficient Mixture-of-Experts model designed for high-reasoning, agentic, and general-purpose use cases. |
| OpenAI GPT OSS 20B | `openai/gpt-oss-20b` | Text | 131K | 3.6B-20B (Active-Total) | Lower-latency Mixture-of-Experts model trained on OpenAI's Harmony response format, with reasoning capabilities. |
| OpenPipe Qwen3 14B Instruct | `OpenPipe/Qwen3-14B-Instruct` | Text | 32.8K | 14.8B (Total) | Efficient multilingual, dense, instruction-tuned model, optimized by OpenPipe for building agents with fine-tuning. |
| Qwen3 235B A22B Thinking-2507 | `Qwen/Qwen3-235B-A22B-Thinking-2507` | Text | 262K | 22B-235B (Active-Total) | High-performance Mixture-of-Experts model optimized for structured reasoning, math, and long-form generation. |
| Qwen3 235B A22B-2507 | `Qwen/Qwen3-235B-A22B-Instruct-2507` | Text | 262K | 22B-235B (Active-Total) | Efficient multilingual Mixture-of-Experts, instruction-tuned model, optimized for logical reasoning. |
| Qwen3 30B A3B | `Qwen/Qwen3-30B-A3B-Instruct-2507` | Text | 262K | 3.3B-30.5B (Active-Total) | 30.5B Mixture-of-Experts instruction-tuned model with enhanced reasoning, coding, and long-context understanding. |
| Qwen3 Coder 480B A35B | `Qwen/Qwen3-Coder-480B-A35B-Instruct` | Text | 262K | 35B-480B (Active-Total) | Mixture-of-Experts model optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning. |
| Z.AI GLM 5 | `zai-org/GLM-5-FP8` | Text | 200K | 40B-744B (Active-Total) | Mixture-of-Experts model for long-horizon agentic tasks, with strong performance on reasoning and coding. |
| Moonshot AI Kimi K2 (deprecated) | `moonshotai/Kimi-K2-Instruct` | Text | 131K | 32B-1T (Active-Total) | Mixture-of-Experts model optimized for complex tool use, reasoning, and code synthesis. |
| Moonshot AI Kimi K2 Instruct 0905 (deprecated) | `moonshotai/Kimi-K2-Instruct-0905` | Text | 262K | 32B-1T (Active-Total) | Updated version of the Kimi K2 Mixture-of-Experts model, with 32 billion activated parameters and 1 trillion total parameters. |
| Qwen2.5 14B Instruct (deprecated) | `Qwen/Qwen2.5-14B-Instruct` | Text | 32.8K | 14.7B (Total) | Dense multilingual instruction-tuned model with tool-use and structured-output support. |
| Z.AI GLM 4.5 (deprecated) | `zai-org/GLM-4.5` | Text | 131K | 32B-355B (Active-Total) | Mixture-of-Experts model with user-controllable thinking/non-thinking modes, for strong reasoning, code generation, and agent alignment. |

Using model IDs

When using the API, specify the model using its Model ID from the table above. For example:
```python
import openai

# W&B Inference exposes an OpenAI-compatible API; point the standard
# OpenAI client at the W&B Inference endpoint with your W&B API key.
client = openai.OpenAI(
    base_url="https://api.inference.wandb.ai/v1",
    api_key="<your-wandb-api-key>",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[...],
)
```
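Because every model is addressed purely by its string ID, application code can also choose a model from the catalog programmatically. The sketch below keeps a few IDs and their context windows (copied from the table above) in a plain dictionary and picks the smallest model whose window fits a prompt; `CONTEXT_WINDOWS` and `pick_model` are illustrative helpers invented for this example, not part of the W&B Inference API.

```python
# A few model IDs and approximate context windows (in tokens),
# taken from the catalog table above. Illustrative only.
CONTEXT_WINDOWS = {
    "meta-llama/Llama-3.1-8B-Instruct": 128_000,
    "openai/gpt-oss-120b": 131_000,
    "Qwen/Qwen3-Coder-480B-A35B-Instruct": 262_000,
}


def pick_model(required_tokens: int) -> str:
    """Return the ID of the smallest-context model that fits required_tokens."""
    candidates = [(w, m) for m, w in CONTEXT_WINDOWS.items() if w >= required_tokens]
    if not candidates:
        raise ValueError(f"no listed model supports {required_tokens} tokens")
    return min(candidates)[1]


print(pick_model(200_000))  # Qwen/Qwen3-Coder-480B-A35B-Instruct
```

The returned string can be passed directly as the `model` argument in the API call above.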

Next steps