---
title: Model providers
sidebarTitle: Model providers
summary: Model provider overview with example configs + CLI flows
read_when:
  - You need a provider-by-provider model setup reference
  - You want example configs or CLI onboarding commands for model providers
---

Reference for LLM/model providers (not chat channels like WhatsApp/Telegram). For model selection rules, see Models.

Quick rules

- Model refs use `provider/model` (example: `opencode/claude-opus-4-6`).
- `agents.defaults.models` acts as an allowlist when set.
- CLI helpers: `openclaw onboard`, `openclaw models list`, `openclaw models set <provider/model>`.
- `models.providers.*.contextWindow` / `contextTokens` / `maxTokens` set provider-level defaults; `models.providers.*.models[].contextWindow` / `contextTokens` / `maxTokens` override them per model.
- Fallback rules, cooldown probes, and session-override persistence: [Model failover](/concepts/model-failover).
- OpenAI-family routes are prefix-specific:
  - `openai/<model>` uses the direct OpenAI API-key provider in PI.
  - `openai-codex/<model>` uses Codex OAuth in PI.
  - `openai/<model>` plus `agents.defaults.agentRuntime.id: "codex"` uses the native Codex app-server harness.

See [OpenAI](/providers/openai) and [Codex harness](/plugins/codex-harness). If the provider/runtime split is confusing, read [Agent runtimes](/concepts/agent-runtimes) first.

Plugin auto-enable follows the same boundary: `openai-codex/<model>` belongs to the OpenAI plugin, while the Codex plugin is enabled by `agentRuntime.id: "codex"` or legacy `codex/<model>` refs.

GPT-5.5 is available through `openai/gpt-5.5` for direct API-key traffic, `openai-codex/gpt-5.5` in PI for Codex OAuth, and the native Codex app-server harness when `agentRuntime.id: "codex"` is set.
CLI runtimes use the same split: choose canonical model refs such as `anthropic/claude-*`, `google/gemini-*`, or `openai/gpt-*`, then set `agents.defaults.agentRuntime.id` to `claude-cli`, `google-gemini-cli`, or `codex-cli` when you want a local CLI backend.
Legacy `claude-cli/*`, `google-gemini-cli/*`, and `codex-cli/*` refs migrate back to canonical provider refs with the runtime recorded separately.
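
A rough sketch of that split, assuming `agents.defaults.agentRuntime` is an object whose `id` field selects the backend (inferred from the path above):

{
  agents: {
    defaults: {
      model: { primary: "anthropic/claude-opus-4-6" },
      // Runtime is chosen separately from the model ref; the shape of this block is an
      // assumption based on the documented agents.defaults.agentRuntime.id path.
      agentRuntime: { id: "claude-cli" },
    },
  },
}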

Plugin-owned provider behavior

Most provider-specific logic lives in provider plugins (registerProvider(...)) while OpenClaw keeps the generic inference loop. Plugins own onboarding, model catalogs, auth env-var mapping, transport/config normalization, tool-schema cleanup, failover classification, OAuth refresh, usage reporting, thinking/reasoning profiles, and more.

The full list of provider-SDK hooks and bundled-plugin examples lives in Provider plugins. A provider that needs a totally custom request executor is a separate, deeper extension surface.

Provider-owned runner behavior lives on explicit provider hooks such as replay policy, tool-schema normalization, stream wrapping, and transport/request helpers. The legacy `ProviderPlugin.capabilities` static bag is compatibility-only and is no longer read by shared runner logic.

API key rotation

Configure multiple keys via:
- `OPENCLAW_LIVE_<PROVIDER>_KEY` (single live override, highest priority)
- `<PROVIDER>_API_KEYS` (comma or semicolon list)
- `<PROVIDER>_API_KEY` (primary key)
- `<PROVIDER>_API_KEY_*` (numbered list, e.g. `<PROVIDER>_API_KEY_1`)

For Google providers, `GOOGLE_API_KEY` is also included as fallback. Key selection order preserves priority and deduplicates values.
- Requests are retried with the next key only on rate-limit responses (for example `429`, `rate_limit`, `quota`, `resource exhausted`, `Too many concurrent requests`, `ThrottlingException`, `concurrency limit reached`, `workers_ai ... quota limit exceeded`, or periodic usage-limit messages).
- Non-rate-limit failures fail immediately; no key rotation is attempted.
- When all candidate keys fail, the final error is returned from the last attempt.
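
For example, a hypothetical OpenAI key-rotation setup (all key values below are placeholders) could look like this on the gateway host, listed highest priority first:

export OPENCLAW_LIVE_OPENAI_KEY="sk-live-override"   # single live override
export OPENAI_API_KEYS="sk-key-a;sk-key-b"           # comma/semicolon list
export OPENAI_API_KEY="sk-primary"                   # primary key
export OPENAI_API_KEY_1="sk-extra-1"                 # numbered extras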

Built-in providers (pi-ai catalog)

OpenClaw ships with the pi-ai catalog. These providers require no `models.providers` config; just set auth and pick a model.

OpenAI

  • Provider: openai
  • Auth: OPENAI_API_KEY
  • Optional rotation: OPENAI_API_KEYS, OPENAI_API_KEY_1, OPENAI_API_KEY_2, plus OPENCLAW_LIVE_OPENAI_KEY (single override)
  • Example models: openai/gpt-5.5, openai/gpt-5.4-mini
  • Verify account/model availability with openclaw models list --provider openai if a specific install or API key behaves differently.
  • CLI: openclaw onboard --auth-choice openai-api-key
  • Default transport is auto (WebSocket-first, SSE fallback)
  • Override per model via agents.defaults.models["openai/<model>"].params.transport ("sse", "websocket", or "auto")
  • OpenAI Responses WebSocket warm-up defaults to enabled via params.openaiWsWarmup (true/false)
  • OpenAI priority processing can be enabled via agents.defaults.models["openai/<model>"].params.serviceTier
  • /fast and params.fastMode map direct openai/* Responses requests to service_tier=priority on api.openai.com
  • Use params.serviceTier when you want an explicit tier instead of the shared /fast toggle
  • Hidden OpenClaw attribution headers (originator, version, User-Agent) apply only on native OpenAI traffic to api.openai.com, not generic OpenAI-compatible proxies
  • Native OpenAI routes also keep Responses store, prompt-cache hints, and OpenAI reasoning-compat payload shaping; proxy routes do not
  • openai/gpt-5.3-codex-spark is intentionally suppressed in OpenClaw because live OpenAI API requests reject it and the current Codex catalog does not expose it
{
  agents: { defaults: { model: { primary: "openai/gpt-5.5" } } },
}
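
A hedged sketch of the per-model knobs above (values shown are examples, not required settings):

{
  agents: {
    defaults: {
      models: {
        "openai/gpt-5.5": {
          // transport: "auto" (default), "sse", or "websocket"; serviceTier opts into
          // OpenAI priority processing as described above
          params: { transport: "websocket", serviceTier: "priority" },
        },
      },
    },
  },
}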

Anthropic

  • Provider: anthropic
  • Auth: ANTHROPIC_API_KEY
  • Optional rotation: ANTHROPIC_API_KEYS, ANTHROPIC_API_KEY_1, ANTHROPIC_API_KEY_2, plus OPENCLAW_LIVE_ANTHROPIC_KEY (single override)
  • Example model: anthropic/claude-opus-4-6
  • CLI: openclaw onboard --auth-choice apiKey
  • Direct public Anthropic requests support the shared /fast toggle and params.fastMode, including API-key and OAuth-authenticated traffic sent to api.anthropic.com; OpenClaw maps that to Anthropic service_tier (auto vs standard_only)
  • Preferred Claude CLI config keeps the model ref canonical and selects the CLI backend separately: anthropic/claude-opus-4-7 with agents.defaults.agentRuntime.id: "claude-cli". Legacy claude-cli/claude-opus-4-7 refs still work for compatibility.
Anthropic staff told us OpenClaw-style Claude CLI usage is allowed again, so OpenClaw treats Claude CLI reuse and `claude -p` usage as sanctioned for this integration unless Anthropic publishes a new policy. Anthropic setup-token remains available as a supported OpenClaw token path, but OpenClaw now prefers Claude CLI reuse and `claude -p` when available.
{
  agents: { defaults: { model: { primary: "anthropic/claude-opus-4-6" } } },
}
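
If you want fast mode pinned in config instead of toggling /fast per session, a minimal sketch using the documented params path:

{
  agents: {
    defaults: {
      models: {
        "anthropic/claude-opus-4-6": {
          // Maps to Anthropic service_tier selection as described above
          params: { fastMode: true },
        },
      },
    },
  },
}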

OpenAI Codex OAuth

  • Provider: openai-codex
  • Auth: OAuth (ChatGPT)
  • PI model ref: openai-codex/gpt-5.5
  • Native Codex app-server harness ref: openai/gpt-5.5 with agents.defaults.agentRuntime.id: "codex"
  • Native Codex app-server harness docs: Codex harness
  • Legacy model refs: codex/gpt-*
  • Plugin boundary: openai-codex/* loads the OpenAI plugin; the native Codex app-server plugin is selected only by the Codex harness runtime or legacy codex/* refs.
  • CLI: openclaw onboard --auth-choice openai-codex or openclaw models auth login --provider openai-codex
  • Default transport is auto (WebSocket-first, SSE fallback)
  • Override per PI model via agents.defaults.models["openai-codex/<model>"].params.transport ("sse", "websocket", or "auto")
  • params.serviceTier is also forwarded on native Codex Responses requests (chatgpt.com/backend-api)
  • Hidden OpenClaw attribution headers (originator, version, User-Agent) are only attached on native Codex traffic to chatgpt.com/backend-api, not generic OpenAI-compatible proxies
  • Shares the same /fast toggle and params.fastMode config as direct openai/*; OpenClaw maps that to service_tier=priority
  • openai-codex/gpt-5.5 uses the Codex catalog native contextWindow = 400000 and default runtime contextTokens = 272000; override the runtime cap with models.providers.openai-codex.models[].contextTokens
  • Policy note: OpenAI Codex OAuth is explicitly supported for external tools/workflows like OpenClaw.
  • Use openai-codex/gpt-5.5 when you want the Codex OAuth/subscription route; use openai/gpt-5.5 when your API-key setup and local catalog expose the public API route.
{
  agents: { defaults: { model: { primary: "openai-codex/gpt-5.5" } } },
}
{
  models: {
    providers: {
      "openai-codex": {
        models: [{ id: "gpt-5.5", contextTokens: 160000 }],
      },
    },
  },
}

Other subscription-style hosted options

- Z.AI Coding Plan or general API endpoints.
- MiniMax Coding Plan OAuth or API key access.
- Qwen Cloud provider surface plus Alibaba DashScope and Coding Plan endpoint mapping.

OpenCode

  • Auth: OPENCODE_API_KEY (or OPENCODE_ZEN_API_KEY)
  • Zen runtime provider: opencode
  • Go runtime provider: opencode-go
  • Example models: opencode/claude-opus-4-6, opencode-go/kimi-k2.6
  • CLI: openclaw onboard --auth-choice opencode-zen or openclaw onboard --auth-choice opencode-go
{
  agents: { defaults: { model: { primary: "opencode/claude-opus-4-6" } } },
}

Google Gemini (API key)

  • Provider: google
  • Auth: GEMINI_API_KEY
  • Optional rotation: GEMINI_API_KEYS, GEMINI_API_KEY_1, GEMINI_API_KEY_2, GOOGLE_API_KEY fallback, and OPENCLAW_LIVE_GEMINI_KEY (single override)
  • Example models: google/gemini-3.1-pro-preview, google/gemini-3-flash-preview
  • Compatibility: legacy OpenClaw config using google/gemini-3.1-flash-preview is normalized to google/gemini-3-flash-preview
  • Alias: google/gemini-3.1-pro is accepted and normalized to Google's live Gemini API id, google/gemini-3.1-pro-preview
  • CLI: openclaw onboard --auth-choice gemini-api-key
  • Thinking: /think adaptive uses Google dynamic thinking. Gemini 3/3.1 omit a fixed thinkingLevel; Gemini 2.5 sends thinkingBudget: -1.
  • Direct Gemini runs also accept agents.defaults.models["google/<model>"].params.cachedContent (or legacy cached_content) to forward a provider-native cachedContents/... handle; Gemini cache hits surface as OpenClaw cacheRead
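
A minimal config sketch, including the optional cachedContent forwarding described above (the cache handle is a placeholder; supply a real cachedContents/... id from the Gemini API):

{
  agents: {
    defaults: {
      model: { primary: "google/gemini-3.1-pro-preview" },
      models: {
        "google/gemini-3-flash-preview": {
          // Placeholder handle for illustration only
          params: { cachedContent: "cachedContents/your-cache-id" },
        },
      },
    },
  },
}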

Google Vertex and Gemini CLI

  • Providers: google-vertex, google-gemini-cli
  • Auth: Vertex uses gcloud ADC; Gemini CLI uses its OAuth flow
Gemini CLI OAuth in OpenClaw is an unofficial integration. Some users have reported Google account restrictions after using third-party clients. Review Google terms and use a non-critical account if you choose to proceed.

Gemini CLI OAuth is shipped as part of the bundled google plugin.

```bash
# Install the Gemini CLI (Homebrew)
brew install gemini-cli
```

```bash
# Or install via npm
npm install -g @google/gemini-cli
```

```bash
# Enable the bundled google plugin
openclaw plugins enable google
```

```bash
# Log in with Gemini CLI OAuth and set it as the default provider
openclaw models auth login --provider google-gemini-cli --set-default
```
Default model: `google-gemini-cli/gemini-3-flash-preview`. You do **not** paste a client id or secret into `openclaw.json`. The CLI login flow stores tokens in auth profiles on the gateway host.
If requests fail after login, set `GOOGLE_CLOUD_PROJECT` or `GOOGLE_CLOUD_PROJECT_ID` on the gateway host.

Gemini CLI JSON replies are parsed from the `response` field; usage falls back to `stats`, with `stats.cached` normalized into OpenClaw cacheRead.

Z.AI (GLM)

  • Provider: zai
  • Auth: ZAI_API_KEY
  • Example model: zai/glm-5.1
  • CLI: openclaw onboard --auth-choice zai-api-key
    • Aliases: z.ai/* and z-ai/* normalize to zai/*
    • zai-api-key auto-detects the matching Z.AI endpoint; zai-coding-global, zai-coding-cn, zai-global, and zai-cn force a specific surface
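
Example config using the model above:

{
  agents: { defaults: { model: { primary: "zai/glm-5.1" } } },
}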

Vercel AI Gateway

  • Provider: vercel-ai-gateway
  • Auth: AI_GATEWAY_API_KEY
  • Example models: vercel-ai-gateway/anthropic/claude-opus-4.6, vercel-ai-gateway/moonshotai/kimi-k2.6
  • CLI: openclaw onboard --auth-choice ai-gateway-api-key
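
Example config using one of the models above:

{
  agents: {
    defaults: { model: { primary: "vercel-ai-gateway/anthropic/claude-opus-4.6" } },
  },
}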

Kilo Gateway

  • Provider: kilocode
  • Auth: KILOCODE_API_KEY
  • Example model: kilocode/kilo/auto
  • CLI: openclaw onboard --auth-choice kilocode-api-key
  • Base URL: https://api.kilo.ai/api/gateway/
  • Static fallback catalog ships kilocode/kilo/auto; live https://api.kilo.ai/api/gateway/models discovery can expand the runtime catalog further.
  • Exact upstream routing behind kilocode/kilo/auto is owned by Kilo Gateway, not hard-coded in OpenClaw.
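
Example config using the auto route above:

{
  agents: { defaults: { model: { primary: "kilocode/kilo/auto" } } },
}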

See /providers/kilocode for setup details.

Other bundled provider plugins

| Provider | Id | Auth env | Example model |
| --- | --- | --- | --- |
| BytePlus | byteplus / byteplus-plan | BYTEPLUS_API_KEY | byteplus-plan/ark-code-latest |
| Cerebras | cerebras | CEREBRAS_API_KEY | cerebras/zai-glm-4.7 |
| Cloudflare AI Gateway | cloudflare-ai-gateway | CLOUDFLARE_AI_GATEWAY_API_KEY | |
| DeepInfra | deepinfra | DEEPINFRA_API_KEY | deepinfra/deepseek-ai/DeepSeek-V3.2 |
| DeepSeek | deepseek | DEEPSEEK_API_KEY | deepseek/deepseek-v4-flash |
| GitHub Copilot | github-copilot | COPILOT_GITHUB_TOKEN / GH_TOKEN / GITHUB_TOKEN | |
| Groq | groq | GROQ_API_KEY | |
| Hugging Face Inference | huggingface | HUGGINGFACE_HUB_TOKEN or HF_TOKEN | huggingface/deepseek-ai/DeepSeek-R1 |
| Kilo Gateway | kilocode | KILOCODE_API_KEY | kilocode/kilo/auto |
| Kimi Coding | kimi | KIMI_API_KEY or KIMICODE_API_KEY | kimi/kimi-code |
| MiniMax | minimax / minimax-portal | MINIMAX_API_KEY / MINIMAX_OAUTH_TOKEN | minimax/MiniMax-M2.7 |
| Mistral | mistral | MISTRAL_API_KEY | mistral/mistral-large-latest |
| Moonshot | moonshot | MOONSHOT_API_KEY | moonshot/kimi-k2.6 |
| NVIDIA | nvidia | NVIDIA_API_KEY | nvidia/nvidia/nemotron-3-super-120b-a12b |
| OpenRouter | openrouter | OPENROUTER_API_KEY | openrouter/auto |
| Qianfan | qianfan | QIANFAN_API_KEY | qianfan/deepseek-v3.2 |
| Qwen Cloud | qwen | QWEN_API_KEY / MODELSTUDIO_API_KEY / DASHSCOPE_API_KEY | qwen/qwen3.5-plus |
| StepFun | stepfun / stepfun-plan | STEPFUN_API_KEY | stepfun/step-3.5-flash |
| Together | together | TOGETHER_API_KEY | together/moonshotai/Kimi-K2.5 |
| Venice | venice | VENICE_API_KEY | |
| Vercel AI Gateway | vercel-ai-gateway | AI_GATEWAY_API_KEY | vercel-ai-gateway/anthropic/claude-opus-4.6 |
| Volcano Engine (Doubao) | volcengine / volcengine-plan | VOLCANO_ENGINE_API_KEY | volcengine-plan/ark-code-latest |
| xAI | xai | XAI_API_KEY | xai/grok-4 |
| Xiaomi | xiaomi | XIAOMI_API_KEY | xiaomi/mimo-v2-flash |

Quirks worth knowing

- **OpenRouter**: applies its app-attribution headers and Anthropic `cache_control` markers only on verified `openrouter.ai` routes. DeepSeek, Moonshot, and ZAI refs are cache-TTL eligible for OpenRouter-managed prompt caching but do not receive Anthropic cache markers. As a proxy-style OpenAI-compatible path, it skips native-OpenAI-only shaping (`serviceTier`, Responses `store`, prompt-cache hints, OpenAI reasoning-compat). Gemini-backed refs keep proxy-Gemini thought-signature sanitation only.
- **Kilo Gateway**: Gemini-backed refs follow the same proxy-Gemini sanitation path; `kilocode/kilo/auto` and other proxy-reasoning-unsupported refs skip proxy reasoning injection.
- **MiniMax**: API-key onboarding writes explicit text-only M2.7 chat model definitions; image understanding stays on the plugin-owned `MiniMax-VL-01` media provider.
- **NVIDIA**: model ids use a nested `nvidia/<maker>/...` namespace (for example `nvidia/nvidia/nemotron-...` alongside `nvidia/moonshotai/kimi-k2.5`); pickers preserve the literal double-prefixed composition while the canonical key sent to the API stays single-prefixed.
- **xAI**: uses the xAI Responses path. `/fast` or `params.fastMode: true` rewrites `grok-3`, `grok-3-mini`, `grok-4`, and `grok-4-0709` to their `*-fast` variants. `tool_stream` defaults on; disable via `agents.defaults.models["xai/<model>"].params.tool_stream = false`.
- **Cerebras**: ships as the bundled `cerebras` provider plugin. GLM uses `zai-glm-4.7`; the OpenAI-compatible base URL is `https://api.cerebras.ai/v1`.

Providers via models.providers (custom/base URL)

Use models.providers (or models.json) to add custom providers or OpenAI- or Anthropic-compatible proxies.

Many of the bundled provider plugins below already publish a default catalog. Use explicit models.providers.<id> entries only when you want to override the default base URL, headers, or model list.

Gateway model capability checks also read explicit models.providers.<id>.models[] metadata. If a custom or proxy model accepts images, set input: ["text", "image"] on that model so WebChat and node-origin attachment paths pass images as native model inputs instead of text-only media refs.
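
For instance, a hypothetical OpenAI-compatible proxy (the provider id, base URL, and model id below are illustrative, not shipped defaults) would declare image input like this so attachments are passed as native model inputs:

{
  models: {
    providers: {
      // Hypothetical provider id and endpoint, shown only to illustrate input metadata
      "my-proxy": {
        baseUrl: "https://proxy.example.com/v1",
        apiKey: "${MY_PROXY_API_KEY}",
        api: "openai-completions",
        models: [{ id: "my-vision-model", input: ["text", "image"] }],
      },
    },
  },
}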

Moonshot AI (Kimi)

Moonshot ships as a bundled provider plugin. Use the built-in provider by default, and add an explicit models.providers.moonshot entry only when you need to override the base URL or model metadata:

  • Provider: moonshot
  • Auth: MOONSHOT_API_KEY
  • Example model: moonshot/kimi-k2.6
  • CLI: openclaw onboard --auth-choice moonshot-api-key or openclaw onboard --auth-choice moonshot-api-key-cn

Kimi K2 model IDs:

  • moonshot/kimi-k2.6
  • moonshot/kimi-k2.5
  • moonshot/kimi-k2-thinking
  • moonshot/kimi-k2-thinking-turbo
  • moonshot/kimi-k2-turbo
{
  agents: {
    defaults: { model: { primary: "moonshot/kimi-k2.6" } },
  },
  models: {
    mode: "merge",
    providers: {
      moonshot: {
        baseUrl: "https://api.moonshot.ai/v1",
        apiKey: "${MOONSHOT_API_KEY}",
        api: "openai-completions",
        models: [{ id: "kimi-k2.6", name: "Kimi K2.6" }],
      },
    },
  },
}

Kimi coding

Kimi Coding uses Moonshot AI's Anthropic-compatible endpoint:

  • Provider: kimi
  • Auth: KIMI_API_KEY
  • Example model: kimi/kimi-code
{
  env: { KIMI_API_KEY: "sk-..." },
  agents: {
    defaults: { model: { primary: "kimi/kimi-code" } },
  },
}

Legacy kimi/k2p5 remains accepted as a compatibility model id.

Volcano Engine (Doubao)

Volcano Engine (火山引擎) provides access to Doubao and other models in China.

  • Provider: volcengine (coding: volcengine-plan)
  • Auth: VOLCANO_ENGINE_API_KEY
  • Example model: volcengine-plan/ark-code-latest
  • CLI: openclaw onboard --auth-choice volcengine-api-key
{
  agents: {
    defaults: { model: { primary: "volcengine-plan/ark-code-latest" } },
  },
}

Onboarding defaults to the coding surface, but the general volcengine/* catalog is registered at the same time.

In onboarding/configure model pickers, the Volcengine auth choice prefers both volcengine/* and volcengine-plan/* rows. If those models are not loaded yet, OpenClaw falls back to the unfiltered catalog instead of showing an empty provider-scoped picker.

Notable model refs:

- `volcengine/doubao-seed-1-8-251228` (Doubao Seed 1.8)
- `volcengine/doubao-seed-code-preview-251028`
- `volcengine/kimi-k2-5-260127` (Kimi K2.5)
- `volcengine/glm-4-7-251222` (GLM 4.7)
- `volcengine/deepseek-v3-2-251201` (DeepSeek V3.2 128K)
- `volcengine-plan/ark-code-latest`
- `volcengine-plan/doubao-seed-code`
- `volcengine-plan/kimi-k2.5`
- `volcengine-plan/kimi-k2-thinking`
- `volcengine-plan/glm-4.7`

BytePlus (International)

BytePlus ARK provides access to the same models as Volcano Engine for international users.

  • Provider: byteplus (coding: byteplus-plan)
  • Auth: BYTEPLUS_API_KEY
  • Example model: byteplus-plan/ark-code-latest
  • CLI: openclaw onboard --auth-choice byteplus-api-key
{
  agents: {
    defaults: { model: { primary: "byteplus-plan/ark-code-latest" } },
  },
}

Onboarding defaults to the coding surface, but the general byteplus/* catalog is registered at the same time.

In onboarding/configure model pickers, the BytePlus auth choice prefers both byteplus/* and byteplus-plan/* rows. If those models are not loaded yet, OpenClaw falls back to the unfiltered catalog instead of showing an empty provider-scoped picker.

Notable model refs:

- `byteplus/seed-1-8-251228` (Seed 1.8)
- `byteplus/kimi-k2-5-260127` (Kimi K2.5)
- `byteplus/glm-4-7-251222` (GLM 4.7)
- `byteplus-plan/ark-code-latest`
- `byteplus-plan/doubao-seed-code`
- `byteplus-plan/kimi-k2.5`
- `byteplus-plan/kimi-k2-thinking`
- `byteplus-plan/glm-4.7`

Synthetic

Synthetic provides Anthropic-compatible models behind the synthetic provider:

  • Provider: synthetic
  • Auth: SYNTHETIC_API_KEY
  • Example model: synthetic/hf:MiniMaxAI/MiniMax-M2.5
  • CLI: openclaw onboard --auth-choice synthetic-api-key
{
  agents: {
    defaults: { model: { primary: "synthetic/hf:MiniMaxAI/MiniMax-M2.5" } },
  },
  models: {
    mode: "merge",
    providers: {
      synthetic: {
        baseUrl: "https://api.synthetic.new/anthropic",
        apiKey: "${SYNTHETIC_API_KEY}",
        api: "anthropic-messages",
        models: [{ id: "hf:MiniMaxAI/MiniMax-M2.5", name: "MiniMax M2.5" }],
      },
    },
  },
}

MiniMax

MiniMax is configured via models.providers because it uses custom endpoints:

  • MiniMax OAuth (Global): --auth-choice minimax-global-oauth
  • MiniMax OAuth (CN): --auth-choice minimax-cn-oauth
  • MiniMax API key (Global): --auth-choice minimax-global-api
  • MiniMax API key (CN): --auth-choice minimax-cn-api
  • Auth: MINIMAX_API_KEY for minimax; MINIMAX_OAUTH_TOKEN or MINIMAX_API_KEY for minimax-portal

See /providers/minimax for setup details, model options, and config snippets.

On MiniMax's Anthropic-compatible streaming path, OpenClaw disables thinking by default unless you explicitly set it, and `/fast on` rewrites `MiniMax-M2.7` to `MiniMax-M2.7-highspeed`.

Plugin-owned capability split:

  • Text/chat defaults stay on minimax/MiniMax-M2.7
  • Image generation is minimax/image-01 or minimax-portal/image-01
  • Image understanding is plugin-owned MiniMax-VL-01 on both MiniMax auth paths
  • Web search stays on provider id minimax

LM Studio

LM Studio ships as a bundled provider plugin which uses the native API:

  • Provider: lmstudio
  • Auth: LM_API_TOKEN
  • Default inference base URL: http://localhost:1234/v1

Then set a model (replace with one of the IDs returned by http://localhost:1234/api/v1/models):

{
  agents: {
    defaults: { model: { primary: "lmstudio/openai/gpt-oss-20b" } },
  },
}

OpenClaw uses LM Studio's native /api/v1/models and /api/v1/models/load for discovery + auto-load, with /v1/chat/completions for inference by default. See /providers/lmstudio for setup and troubleshooting.

Ollama

Ollama ships as a bundled provider plugin and uses Ollama's native API:

# Install Ollama, then pull a model:
ollama pull llama3.3
{
  agents: {
    defaults: { model: { primary: "ollama/llama3.3" } },
  },
}

Ollama is detected locally at http://127.0.0.1:11434 when you opt in with OLLAMA_API_KEY, and the bundled provider plugin adds Ollama directly to openclaw onboard and the model picker. See /providers/ollama for onboarding, cloud/local mode, and custom configuration.
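
To opt in to local detection, set the variable on the gateway host before onboarding (the value below is a placeholder; see /providers/ollama for what your setup actually requires):

export OLLAMA_API_KEY="ollama-local"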

vLLM

vLLM ships as a bundled provider plugin for local/self-hosted OpenAI-compatible servers:

  • Provider: vllm
  • Auth: Optional (depends on your server)
  • Default base URL: http://127.0.0.1:8000/v1

To opt in to auto-discovery locally (any value works if your server doesn't enforce auth):

export VLLM_API_KEY="vllm-local"

Then set a model (replace with one of the IDs returned by /v1/models):

{
  agents: {
    defaults: { model: { primary: "vllm/your-model-id" } },
  },
}

See /providers/vllm for details.

SGLang

SGLang ships as a bundled provider plugin for fast self-hosted OpenAI-compatible servers:

  • Provider: sglang
  • Auth: Optional (depends on your server)
  • Default base URL: http://127.0.0.1:30000/v1

To opt in to auto-discovery locally (any value works if your server does not enforce auth):

export SGLANG_API_KEY="sglang-local"

Then set a model (replace with one of the IDs returned by /v1/models):

{
  agents: {
    defaults: { model: { primary: "sglang/your-model-id" } },
  },
}

See /providers/sglang for details.

Local proxies (LM Studio, vLLM, LiteLLM, etc.)

Example (OpenAI-compatible):

{
  agents: {
    defaults: {
      model: { primary: "lmstudio/my-local-model" },
      models: { "lmstudio/my-local-model": { alias: "Local" } },
    },
  },
  models: {
    providers: {
      lmstudio: {
        baseUrl: "http://localhost:1234/v1",
        apiKey: "${LM_API_TOKEN}",
        api: "openai-completions",
        timeoutSeconds: 300,
        models: [
          {
            id: "my-local-model",
            name: "Local Model",
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 200000,
            maxTokens: 8192,
          },
        ],
      },
    },
  },
}
For custom providers, `reasoning`, `input`, `cost`, `contextWindow`, and `maxTokens` are optional. When omitted, OpenClaw defaults to:
- `reasoning: false`
- `input: ["text"]`
- `cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }`
- `contextWindow: 200000`
- `maxTokens: 8192`

Recommended: set explicit values that match your proxy/model limits.
- For `api: "openai-completions"` on non-native endpoints (any non-empty `baseUrl` whose host is not `api.openai.com`), OpenClaw forces `compat.supportsDeveloperRole: false` to avoid provider 400 errors for unsupported `developer` roles. - Proxy-style OpenAI-compatible routes also skip native OpenAI-only request shaping: no `service_tier`, no Responses `store`, no Completions `store`, no prompt-cache hints, no OpenAI reasoning-compat payload shaping, and no hidden OpenClaw attribution headers. - For OpenAI-compatible Completions proxies that need vendor-specific fields, set `agents.defaults.models["provider/model"].params.extra_body` (or `extraBody`) to merge extra JSON into the outbound request body. - For vLLM chat-template controls, set `agents.defaults.models["provider/model"].params.chat_template_kwargs`. The bundled vLLM plugin automatically sends `enable_thinking: false` and `force_nonempty_content: true` for `vllm/nemotron-3-*` when the session thinking level is off. - For slow local models or remote LAN/tailnet hosts, set `models.providers..timeoutSeconds`. This extends provider model HTTP request handling, including connect, headers, body streaming, and the total guarded-fetch abort, without increasing the whole agent runtime timeout. - If `baseUrl` is empty/omitted, OpenClaw keeps the default OpenAI behavior (which resolves to `api.openai.com`). - For safety, an explicit `compat.supportsDeveloperRole: true` is still overridden on non-native `openai-completions` endpoints. - For `api: "anthropic-messages"` on non-direct endpoints (any provider other than canonical `anthropic`, or a custom `models.providers.anthropic.baseUrl` whose host is not a public `api.anthropic.com` endpoint), OpenClaw suppresses implicit Anthropic beta headers such as `claude-code-20250219`, `interleaved-thinking-2025-05-14`, and OAuth markers, so custom Anthropic-compatible proxies do not reject unsupported beta flags. Set `models.providers..headers["anthropic-beta"]` explicitly if your proxy needs specific beta features.

CLI examples

openclaw onboard --auth-choice opencode-zen
openclaw models set opencode/claude-opus-4-6
openclaw models list

See also: Configuration for full configuration examples.