| summary | read_when | title | sidebarTitle |
|---|---|---|---|
| Deep troubleshooting runbook for gateway, channels, automation, nodes, and browser | | Troubleshooting | Troubleshooting |
This page is the deep runbook. Start at /help/troubleshooting if you want the fast triage flow first.
Command ladder
Run these first, in this order:
openclaw status
openclaw gateway status
openclaw logs --follow
openclaw doctor
openclaw channels status --probe
Expected healthy signals:
- `openclaw gateway status` shows `Runtime: running`, `Connectivity probe: ok`, and a `Capability: ...` line.
- `openclaw doctor` reports no blocking config/service issues.
- `openclaw channels status --probe` shows live per-account transport status and, where supported, probe/audit results such as `works` or `audit ok`.
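The first two healthy signals can be checked mechanically. The following is a minimal sketch, assuming the strings `Runtime: running` and `Connectivity probe: ok` appear verbatim in the output of `openclaw gateway status`; `gateway_looks_healthy` is a hypothetical helper, not part of the OpenClaw CLI.

```bash
# Hypothetical helper: reads gateway status text on stdin and exits 0 only
# when both healthy signals are present.
# Assumption: the signal strings appear verbatim in the status output.
gateway_looks_healthy() {
  awk '
    /Runtime: running/       { runtime_ok = 1 }
    /Connectivity probe: ok/ { probe_ok = 1 }
    END { exit !(runtime_ok && probe_ok) }
  '
}

# Usage (not executed here):
#   openclaw gateway status | gateway_looks_healthy && echo "gateway looks healthy"
```

A non-zero exit only means one of the two strings was missing; read the full status output before drawing conclusions.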
Split brain installs and newer config guard
Use this when a gateway service unexpectedly stops after an update, or logs show that one `openclaw` binary is older than the version that last wrote `openclaw.json`.
OpenClaw stamps config writes with `meta.lastTouchedVersion`. Read-only commands can still inspect a config written by a newer OpenClaw, but process and service mutations refuse to continue from an older binary. Blocked actions include gateway service start, stop, restart, uninstall, forced service reinstall, service-mode gateway startup, and `gateway --force` port cleanup.
which openclaw
openclaw --version
openclaw gateway status --deep
openclaw config get meta.lastTouchedVersion
```bash
openclaw gateway install --force
openclaw gateway restart
```
Remove stale system package or old wrapper entries that still point at an old `openclaw` binary.
For intentional downgrade or emergency recovery only, set `OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS=1` for the single command. Leave it unset for normal operation.
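To illustrate the kind of comparison behind this guard, the sketch below checks whether one version string sorts strictly before another. It assumes plain dot-separated version strings and relies on `sort -V`; `version_older` is a hypothetical helper, the sample versions are made up, and the real guard inside OpenClaw may compare versions differently.

```bash
# Hypothetical helper: exit 0 when version $1 sorts strictly before version $2.
# Assumption: both arguments are plain dot-separated version strings.
version_older() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example with made-up versions:
if version_older "1.2.3" "1.10.0"; then
  echo "binary is older than the config stamp"
fi
```

In practice the two inputs would come from `openclaw --version` and `openclaw config get meta.lastTouchedVersion`, after trimming any surrounding text those commands print.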
Anthropic 429 extra usage required for long context
Use this when logs/errors include: `HTTP 429: rate_limit_error: Extra usage is required for long context requests`.
openclaw logs --follow
openclaw models status
openclaw config get agents.defaults.models
Look for:
- Selected Anthropic Opus/Sonnet model has `params.context1m: true`.
- Current Anthropic credential is not eligible for long-context usage.
- Requests fail only on long sessions/model runs that need the 1M beta path.
Fix options:
- Disable `context1m` for that model to fall back to the normal context window.
- Use an Anthropic credential that is eligible for long-context requests, or switch to an Anthropic API key.
- Configure fallback models so runs continue when Anthropic long-context requests are rejected.
Local OpenAI-compatible backend passes direct probes but agent runs fail
Use this when:
- `curl ... /v1/models` works
- tiny direct `/v1/chat/completions` calls work
- OpenClaw model runs fail only on normal agent turns
curl http://127.0.0.1:1234/v1/models
curl http://127.0.0.1:1234/v1/chat/completions \
-H 'content-type: application/json' \
-d '{"model":"<id>","messages":[{"role":"user","content":"hi"}],"stream":false}'
openclaw infer model run --model <provider/model> --prompt "hi" --json
openclaw logs --follow
Look for:
- direct tiny calls succeed, but OpenClaw runs fail only on larger prompts
- `model_not_found` or 404 errors even though direct `/v1/chat/completions` works with the same bare model id
- backend errors about `messages[].content` expecting a string
- intermittent `incomplete turn detected ... stopReason=stop payloads=0` warnings with an OpenAI-compatible local backend
- backend crashes that appear only with larger prompt-token counts or full agent runtime prompts
No replies
If channels are up but nothing answers, check routing and policy before reconnecting anything.
openclaw status
openclaw channels status --probe
openclaw pairing list --channel <channel> [--account <id>]
openclaw config get channels
openclaw logs --follow
Look for:
- Pairing pending for DM senders.
- Group mention gating (`requireMention`, `mentionPatterns`).
- Channel/group allowlist mismatches.
Common signatures:
- `drop guild message (mention required` → group message ignored until mention.
- `pairing request` → sender needs approval.
- `blocked` / `allowlist` → sender/channel was filtered by policy.
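When scanning a long log, the signatures above can be matched line by line. This is a minimal sketch, assuming the signature text appears verbatim somewhere in each log line; `explain_no_reply` is a hypothetical helper and the log line format itself is not specified here.

```bash
# Hypothetical helper: map a log line to the meaning of the signature it contains.
# Assumption: signature strings appear verbatim inside the log line.
explain_no_reply() {
  case "$1" in
    *"mention required"*)  echo "group message ignored until mention" ;;
    *"pairing request"*)   echo "sender needs approval" ;;
    *blocked*|*allowlist*) echo "sender/channel was filtered by policy" ;;
    *)                     echo "no known signature" ;;
  esac
}

# Usage (not executed here):
#   openclaw logs --follow | while IFS= read -r line; do explain_no_reply "$line"; done
```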
Dashboard control UI connectivity
When the dashboard/Control UI will not connect, validate the URL, auth mode, and secure-context assumptions.
openclaw gateway status
openclaw status
openclaw logs --follow
openclaw doctor
openclaw gateway status --json
Look for:
- Correct probe URL and dashboard URL.
- Auth mode/token mismatch between client and gateway.
- HTTP usage where device identity is required.
Auth detail codes quick map
Use `error.details.code` from the failed connect response to pick the next action:
| Detail code | Meaning | Recommended action |
|---|---|---|
| `AUTH_TOKEN_MISSING` | Client did not send a required shared token. | Paste/set token in the client and retry. For dashboard paths: `openclaw config get gateway.auth.token` then paste into Control UI settings. |
| `AUTH_TOKEN_MISMATCH` | Shared token did not match gateway auth token. | If `canRetryWithDeviceToken=true`, allow one trusted retry. Cached-token retries reuse stored approved scopes; explicit `deviceToken` / `scopes` callers keep requested scopes. If still failing, run the token drift recovery checklist. |
| `AUTH_DEVICE_TOKEN_MISMATCH` | Cached per-device token is stale or revoked. | Rotate/re-approve device token using devices CLI, then reconnect. |
| `PAIRING_REQUIRED` | Device identity needs approval. Check `error.details.reason` for `not-paired`, `scope-upgrade`, `role-upgrade`, or `metadata-upgrade`, and use `requestId` / `remediationHint` when present. | Approve pending request: `openclaw devices list` then `openclaw devices approve <requestId>`. Scope/role upgrades use the same flow after you review the requested access. |
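For scripted triage, the table can be condensed into a lookup. The sketch below assumes the detail code has already been extracted from `error.details.code` by some other means; `auth_next_step` is a hypothetical helper, and the printed hints only paraphrase the table.

```bash
# Hypothetical helper: print a short next-step hint for a connect detail code.
# The hints paraphrase the quick map; they are not output from OpenClaw itself.
auth_next_step() {
  case "$1" in
    AUTH_TOKEN_MISSING)         echo "set the shared token in the client and retry" ;;
    AUTH_TOKEN_MISMATCH)        echo "allow one trusted retry, then run the token drift recovery checklist" ;;
    AUTH_DEVICE_TOKEN_MISMATCH) echo "rotate or re-approve the device token, then reconnect" ;;
    PAIRING_REQUIRED)           echo "run openclaw devices list, then openclaw devices approve <requestId>" ;;
    *)                          echo "unmapped detail code: $1" ;;
  esac
}

# Example:
auth_next_step PAIRING_REQUIRED
```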
Device auth v2 migration check:
openclaw --version
openclaw doctor
openclaw gateway status
If logs show nonce/signature errors, update the connecting client and verify it:
- Client waits for the gateway-issued `connect.challenge`.
- Client signs the challenge-bound payload.
- Client sends `connect.params.device.nonce` with the same challenge nonce.

If `openclaw devices rotate` / `revoke` / `remove` is denied unexpectedly:
- paired-device token sessions can manage only their own device unless the caller also has `operator.admin`
- `openclaw devices rotate --scope ...` can only request operator scopes that the caller session already holds
Related:
- Configuration (gateway auth modes)
- Control UI
- Devices
- Remote access
- Trusted proxy auth
Gateway service not running
Use this when the service is installed but the process does not stay up.
openclaw gateway status
openclaw status
openclaw logs --follow
openclaw doctor
openclaw gateway status --deep # also scan system-level services
Look for:
- `Runtime: stopped` with exit hints.
- Service config mismatch (`Config (cli)` vs `Config (service)`).
- Port/listener conflicts.
- Extra launchd/systemd/schtasks installs when `--deep` is used.
- `Other gateway-like services detected (best effort)` cleanup hints.
Gateway restored last-known-good config
Use this when the Gateway starts, but logs say it restored `openclaw.json`.
openclaw logs --follow
openclaw config file
openclaw config validate
openclaw doctor
Look for:
- `Config auto-restored from last-known-good`
- `gateway: invalid config was restored from last-known-good backup`
- `config reload restored last-known-good config after invalid-config`
- A timestamped `openclaw.json.clobbered.*` file beside the active config
- A main-agent system event that starts with `Config recovery warning`
Gateway probe warnings
Use this when `openclaw gateway probe` reaches something, but still prints a warning block.
openclaw gateway probe
openclaw gateway probe --json
openclaw gateway probe --ssh user@gateway-host
Look for:
- `warnings[].code` and `primaryTargetId` in JSON output.
- Whether the warning is about SSH fallback, multiple gateways, missing scopes, or unresolved auth refs.
Common signatures:
- `SSH tunnel failed to start; falling back to direct probes.` → SSH setup failed, but the command still tried direct configured/loopback targets.
- `multiple reachable gateways detected` → more than one target answered. Usually this means an intentional multi-gateway setup or stale/duplicate listeners.
- `Read-probe diagnostics are limited by gateway scopes (missing operator.read)` → connect worked, but detail RPC is scope-limited; pair device identity or use credentials with `operator.read`.
- `Gateway accepted the WebSocket connection, but follow-up read diagnostics failed` → connect worked, but the full diagnostic RPC set timed out or failed. Treat this as a reachable Gateway with degraded diagnostics; compare `connect.ok` and `connect.rpcOk` in `--json` output.
- `Capability: pairing-pending` or `gateway closed (1008): pairing required` → the gateway answered, but this client still needs pairing/approval before normal operator access.
- unresolved `gateway.auth.*` / `gateway.remote.*` SecretRef warning text → auth material was unavailable in this command path for the failed target.
Channel connected, messages not flowing
If channel state is connected but message flow is dead, focus on policy, permissions, and channel-specific delivery rules.
openclaw channels status --probe
openclaw pairing list --channel <channel> [--account <id>]
openclaw status --deep
openclaw logs --follow
openclaw config get channels
Look for:
- DM policy (`pairing`, `allowlist`, `open`, `disabled`).
- Group allowlist and mention requirements.
- Missing channel API permissions/scopes.
Common signatures:
- `mention required` → message ignored by group mention policy.
- `pairing` / pending approval traces → sender is not approved.
- `missing_scope`, `not_in_channel`, `Forbidden`, `401/403` → channel auth/permissions issue.
Cron and heartbeat delivery
If cron or heartbeat did not run or did not deliver, verify the scheduler state first, then the delivery target.
openclaw cron status
openclaw cron list
openclaw cron runs --id <jobId> --limit 20
openclaw system heartbeat last
openclaw logs --follow
Look for:
- Cron enabled and next wake present.
- Job run history status (`ok`, `skipped`, `error`).
- Heartbeat skip reasons (`quiet-hours`, `requests-in-flight`, `cron-in-progress`, `lanes-busy`, `alerts-disabled`, `empty-heartbeat-file`, `no-tasks-due`).
Node paired, tool fails
If a node is paired but tools fail, isolate foreground, permission, and approval state.
openclaw nodes status
openclaw nodes describe --node <idOrNameOrIp>
openclaw approvals get --node <idOrNameOrIp>
openclaw logs --follow
openclaw status
Look for:
- Node online with expected capabilities.
- OS permission grants for camera/mic/location/screen.
- Exec approvals and allowlist state.
Common signatures:
- `NODE_BACKGROUND_UNAVAILABLE` → node app must be in foreground.
- `*_PERMISSION_REQUIRED` / `LOCATION_PERMISSION_REQUIRED` → missing OS permission.
- `SYSTEM_RUN_DENIED: approval required` → exec approval pending.
- `SYSTEM_RUN_DENIED: allowlist miss` → command blocked by allowlist.
Browser tool fails
Use this when browser tool actions fail even though the gateway itself is healthy.
openclaw browser status
openclaw browser start --browser-profile openclaw
openclaw browser profiles
openclaw logs --follow
openclaw doctor
Look for:
- Whether `plugins.allow` is set and includes `browser`.
- Valid browser executable path.
- CDP profile reachability.
- Local Chrome availability for `existing-session` / `user` profiles.
If you upgraded and something suddenly broke
Most post-upgrade breakage is config drift or stricter defaults now being enforced.
```bash
openclaw gateway status
openclaw config get gateway.mode
openclaw config get gateway.remote.url
openclaw config get gateway.auth.mode
```
What to check:
- If `gateway.mode=remote`, CLI calls may be targeting remote while your local service is fine.
- Explicit `--url` calls do not fall back to stored credentials.
Common signatures:
- `gateway connect failed:` → wrong URL target.
- `unauthorized` → endpoint reachable but wrong auth.
```bash
openclaw config get gateway.bind
openclaw config get gateway.auth.mode
openclaw config get gateway.auth.token
openclaw gateway status
openclaw logs --follow
```
What to check:
- Non-loopback binds (`lan`, `tailnet`, `custom`) need a valid gateway auth path: shared token/password auth, or a correctly configured non-loopback `trusted-proxy` deployment.
- Old keys like `gateway.token` do not replace `gateway.auth.token`.
Common signatures:
- `refusing to bind gateway ... without auth` → non-loopback bind without a valid gateway auth path.
- `Connectivity probe: failed` while runtime is running → gateway alive but inaccessible with current auth/url.
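The bind rule above can be expressed as a small predicate. This is a minimal sketch, assuming `openclaw config get gateway.bind` prints the bare bind value; `bind_requires_auth` is a hypothetical helper, and any value other than the three non-loopback binds named above is treated as not needing this check.

```bash
# Hypothetical helper: exit 0 when the bind mode is one of the non-loopback
# binds listed above, which need a valid gateway auth path.
bind_requires_auth() {
  case "$1" in
    lan|tailnet|custom) return 0 ;;
    *)                  return 1 ;;
  esac
}

# Usage (not executed here; assumes the command prints only the bind value):
#   bind_requires_auth "$(openclaw config get gateway.bind)" &&
#     echo "non-loopback bind: confirm gateway.auth.mode and gateway.auth.token"
```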
```bash
openclaw devices list
openclaw pairing list --channel <channel> [--account <id>]
openclaw logs --follow
openclaw doctor
```
What to check:
- Pending device approvals for dashboard/nodes.
- Pending DM pairing approvals after policy or identity changes.
Common signatures:
- `device identity required` → device auth not satisfied.
- `pairing required` → sender/device must be approved.
If the service config and runtime still disagree after checks, reinstall service metadata from the same profile/state directory:
openclaw gateway install --force
openclaw gateway restart