fix(cli): streamline local model probes

Peter Steinberger
2026-04-27 23:02:26 +01:00
parent d7dcd0e21e
commit 42dddbbe78
14 changed files with 605 additions and 56 deletions

View File

@@ -16,6 +16,7 @@ Docs: https://docs.openclaw.ai
### Fixes
- CLI/Ollama: run local `infer model run` through the lean provider completion path and skip global model discovery for one-shot local probes, so Ollama smoke tests no longer pay the full chat-agent/tool startup cost or hang before the native `/api/chat` request. Fixes #72851. Thanks @TotalRes2020.
- Channels/commands: make generated `/dock-*` commands switch the active session reply route through `session.identityLinks` instead of falling through to normal chat. Fixes #69206; carries forward #73033. Thanks @clawbones and @michaelatamuk.
- Providers/Cloudflare AI Gateway: strip assistant prefill turns from Anthropic Messages payloads when thinking is enabled, so Claude requests through Cloudflare AI Gateway no longer fail Anthropic's conversation-ending validation. Fixes #72905; carries forward #73005. Thanks @AaronFaby and @sahilsatralkar.
- Gateway/startup: keep primary-model startup prewarm on scoped metadata preparation, let native approval bootstraps retry outside channel startup, and skip the global hook runner when no `gateway_start` hook is registered, so clean post-ready sidecar work stays off the critical path. Refs #72846. Thanks @RayWoo, @livekm0309, and @mrz1836.

View File

@@ -130,7 +130,8 @@ This table maps common inference tasks to the corresponding infer command.
- Stateless execution commands default to local.
- Gateway-managed state commands default to gateway.
- The normal local path does not require the gateway to be running.
- `model run` is one-shot. MCP servers opened through the agent runtime for that command are retired after the reply for both local and `--gateway` execution, so repeated scripted invocations do not keep stdio MCP child processes alive.
- Local `model run` is a lean one-shot provider completion. It resolves the configured agent model and auth, but does not start a chat-agent turn, load tools, or open bundled MCP servers.
- `model run --gateway` still uses the Gateway agent runtime so it can exercise the same routed runtime path as a normal Gateway-backed turn. MCP servers opened through that runtime are retired after the reply, so repeated scripted invocations do not keep stdio MCP child processes alive.
## Model
@@ -143,10 +144,22 @@ openclaw infer model providers --json
openclaw infer model inspect --name gpt-5.5 --json
```
Use full `<provider/model>` refs to smoke-test a specific provider without
starting the Gateway or loading the full agent tool surface:
```bash
openclaw infer model run --local --model anthropic/claude-sonnet-4-6 --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model cerebras/zai-glm-4.7 --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model google/gemini-2.5-flash --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model groq/llama-3.1-8b-instant --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model mistral/mistral-small-latest --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model openai/gpt-4.1 --prompt "Reply with exactly: pong" --json
```
Notes:
- `model run` reuses the agent runtime so provider/model overrides behave like normal agent execution.
- Because `model run` is intended for headless automation, it does not retain per-session bundled MCP runtimes after the command finishes.
- Local `model run` is the narrowest CLI smoke for provider/model/auth health because it sends only the supplied prompt to the selected model.
- Use `model run --gateway` when you need to test Gateway routing, agent-runtime setup, or Gateway-managed provider state instead of the lean local completion path.
- `model auth login`, `model auth logout`, and `model auth status` manage saved provider auth state.
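For scripted probes, the `--json` output can be checked mechanically instead of grepping stdout. A minimal sketch of such a check; the envelope fields (`ok`, `transport`, `outputs`) are assumptions based on this probe's JSON output, not a stable contract:

```typescript
// Sketch: decide whether a scripted
// `openclaw infer model run --local ... --json` probe succeeded.
interface ProbeEnvelope {
  ok?: boolean;
  transport?: string;
  provider?: string;
  model?: string;
  outputs?: Array<{ text?: string }>;
}

function probeSucceeded(stdout: string): boolean {
  // The CLI may print log lines before the envelope; keep only the
  // trailing JSON object.
  const trimmed = stdout.trim();
  const start = trimmed.lastIndexOf("\n{");
  const raw = start >= 0 ? trimmed.slice(start + 1) : trimmed;
  let payload: ProbeEnvelope;
  try {
    payload = JSON.parse(raw) as ProbeEnvelope;
  } catch {
    return false;
  }
  // A healthy probe reports ok on the local transport with non-empty text.
  return (
    payload.ok === true &&
    payload.transport === "local" &&
    (payload.outputs?.[0]?.text?.trim().length ?? 0) > 0
  );
}
```

A CI wrapper can run the probe per provider and fail fast on the first envelope that does not pass this check.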
## Image

View File

@@ -239,14 +239,20 @@ Compatibility notes for stricter OpenAI-compatible backends:
```
- Some smaller or stricter local backends are unstable with OpenClaw's full
agent-runtime prompt shape, especially when tool schemas are included. If the
backend works for tiny direct `/v1/chat/completions` calls but fails on normal
OpenClaw agent turns, first try
agent-runtime prompt shape, especially when tool schemas are included. First
verify the provider path with the lean local probe:
```bash
openclaw infer model run --local --model <provider/model> --prompt "Reply with exactly: pong" --json
```
If that succeeds but normal OpenClaw agent turns fail, first try
`agents.defaults.experimental.localModelLean: true` to drop heavyweight
default tools like `browser`, `cron`, and `message`; this is an experimental
flag, not a stable default-mode setting. See
[Experimental Features](/concepts/experimental-features). If that still fails, try
`models.providers.<provider>.models[].compat.supportsTools: false`.
- If the backend still fails only on larger OpenClaw runs, the remaining issue
is usually upstream model/server capacity or a backend bug, not OpenClaw's
transport layer.
@@ -264,10 +270,11 @@ Compatibility notes for stricter OpenAI-compatible backends:
- Context errors? Lower `contextWindow` or raise your server limit.
- OpenAI-compatible server returns `messages[].content ... expected a string`?
Add `compat.requiresStringContent: true` on that model entry.
- Direct tiny `/v1/chat/completions` calls work, but `openclaw infer model run`
fails on Gemma or another local model? Disable tool schemas first with
`compat.supportsTools: false`, then retest. If the server still crashes only
on larger OpenClaw prompts, treat it as an upstream server/model limitation.
- Direct tiny `/v1/chat/completions` calls work, but `openclaw infer model run --local`
fails on Gemma or another local model? Check the provider URL, model ref, auth
marker, and server logs first; local `model run` does not include agent tools.
If local `model run` succeeds but larger agent turns fail, reduce the agent
tool surface with `localModelLean` or `compat.supportsTools: false`.
- Tool calls show up as raw JSON/XML/ReAct text, or the provider returns an
empty `tool_calls` array? Do not add a proxy that blindly converts assistant
text into tool execution. Fix the server chat template/parser first. If the

View File

@@ -185,7 +185,7 @@ When you set `OLLAMA_API_KEY` (or an auth profile) and **do not** define `models
| Token limits | Sets `maxTokens` to the default Ollama max-token cap used by OpenClaw |
| Costs | Sets all costs to `0` |
This avoids manual model entries while keeping the catalog aligned with the local Ollama instance.
This avoids manual model entries while keeping the catalog aligned with the local Ollama instance. You can use a full ref such as `ollama/<pulled-model>:latest` in local `infer model run`; OpenClaw resolves that installed model from Ollama's live catalog without requiring a hand-written `models.json` entry.
```bash
# See what models are available
@@ -193,6 +193,31 @@ ollama list
openclaw models list
```
For a narrow text-generation smoke test that avoids the full agent tool surface,
use local `infer model run` with a full Ollama model ref:
```bash
OLLAMA_API_KEY=ollama-local \
openclaw infer model run \
--local \
--model ollama/llama3.2:latest \
--prompt "Reply with exactly: pong" \
--json
```
That path still uses OpenClaw's configured provider, auth, and native Ollama
transport, but it does not start a chat-agent turn or load MCP/tool context. If
this succeeds while normal agent replies fail, troubleshoot the model's agent
prompt/tool capacity next.
Live-verify the local text path, native stream path, and embeddings against
local Ollama with:
```bash
OPENCLAW_LIVE_TEST=1 OPENCLAW_LIVE_OLLAMA=1 OPENCLAW_LIVE_OLLAMA_WEB_SEARCH=0 \
pnpm test:live -- extensions/ollama/ollama.live.test.ts
```
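Before probing, it can help to confirm the model is actually pulled. Ollama's local API lists the installed catalog at `GET /api/tags`; a minimal sketch of matching a full OpenClaw ref against that response (the helper name is illustrative):

```typescript
// Sketch: check whether an "ollama/<name>:<tag>" ref matches a model
// reported by Ollama's GET /api/tags endpoint, whose response has the
// shape { models: [{ name: "llama3.2:latest", ... }] }.
interface OllamaTagsResponse {
  models?: Array<{ name?: string }>;
}

function isModelPulled(tags: OllamaTagsResponse, fullRef: string): boolean {
  // Strip the OpenClaw provider prefix ("ollama/") to get Ollama's own name.
  const name = fullRef.startsWith("ollama/")
    ? fullRef.slice("ollama/".length)
    : fullRef;
  return (tags.models ?? []).some((m) => m.name === name);
}
```

A live check would `fetch("http://127.0.0.1:11434/api/tags")` and pass the parsed body in; when the ref is missing, `ollama pull` is the fix rather than any OpenClaw configuration change.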
To add a new model, simply pull it with Ollama:
```bash

View File

@@ -369,6 +369,57 @@ describe("ollama plugin", () => {
});
});
it("resolves dynamic local models from Ollama without generating PI models.json", async () => {
const provider = registerProvider();
const previous = process.env.OLLAMA_API_KEY;
process.env.OLLAMA_API_KEY = "ollama-local";
buildOllamaProviderMock.mockResolvedValueOnce({
baseUrl: "http://127.0.0.1:11434",
api: "ollama",
models: [
{
id: "llama3.2:latest",
name: "llama3.2:latest",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 8192,
maxTokens: 2048,
},
],
});
try {
await provider.prepareDynamicModel?.({
config: {},
provider: "ollama",
modelId: "llama3.2:latest",
modelRegistry: { find: vi.fn(() => null) },
} as never);
expect(
provider.resolveDynamicModel?.({
config: {},
provider: "ollama",
modelId: "llama3.2:latest",
modelRegistry: { find: vi.fn(() => null) },
} as never),
).toMatchObject({
provider: "ollama",
id: "llama3.2:latest",
api: "ollama",
baseUrl: "http://127.0.0.1:11434",
});
expect(buildOllamaProviderMock).toHaveBeenCalledWith(undefined, { quiet: true });
} finally {
if (previous === undefined) {
delete process.env.OLLAMA_API_KEY;
} else {
process.env.OLLAMA_API_KEY = previous;
}
}
});
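The prepare/resolve split exercised above boils down to an async discovery step that fills a cache which a later synchronous lookup reads. A generic sketch of that pattern, with types simplified rather than the plugin's real interfaces:

```typescript
// Sketch: two-phase dynamic model resolution. prepare() may do async
// discovery; resolve() must stay synchronous, so it only reads the cache.
type CachedModel = { id: string; baseUrl: string };

const cache = new Map<string, CachedModel[]>();

// NUL is a safe separator because it cannot appear in provider ids or URLs.
const cacheKey = (provider: string, baseUrl?: string) =>
  `${provider}\0${baseUrl ?? ""}`;

async function prepare(
  provider: string,
  baseUrl: string,
  discover: (url: string) => Promise<CachedModel[]>,
): Promise<void> {
  cache.set(cacheKey(provider, baseUrl), await discover(baseUrl));
}

function resolve(
  provider: string,
  baseUrl: string,
  modelId: string,
): CachedModel | undefined {
  return cache.get(cacheKey(provider, baseUrl))?.find((m) => m.id === modelId);
}
```

Keying on provider plus base URL means two Ollama endpoints configured side by side cannot serve each other's cached catalogs.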
it("skips implicit localhost discovery when a custom remote Ollama provider is configured", async () => {
const provider = registerProvider();

View File

@@ -7,8 +7,13 @@ import {
type ProviderAuthMethodNonInteractiveContext,
type ProviderAuthResult,
type ProviderDiscoveryContext,
type ProviderRuntimeModel,
} from "openclaw/plugin-sdk/plugin-entry";
import { buildApiKeyCredential } from "openclaw/plugin-sdk/provider-auth";
import type {
ModelDefinitionConfig,
ModelProviderConfig,
} from "openclaw/plugin-sdk/provider-model-shared";
import {
buildOpenAICompatibleReplayPolicy,
OPENAI_COMPATIBLE_REPLAY_HOOKS,
@@ -57,6 +62,44 @@ function usesOllamaOpenAICompatTransport(model: {
);
}
const dynamicModelCache = new Map<string, ProviderRuntimeModel[]>();
function buildDynamicCacheKey(provider: string, baseUrl: string | undefined): string {
return `${provider}\0${baseUrl ?? ""}`;
}
function hasOllamaDiscoverySignal(providerConfig: ModelProviderConfig | undefined): boolean {
return (
Boolean(process.env.OLLAMA_API_KEY?.trim()) ||
shouldUseSyntheticOllamaAuth(providerConfig) ||
Boolean(providerConfig?.apiKey)
);
}
function toDynamicOllamaModel(params: {
provider: string;
providerConfig: ModelProviderConfig;
model: ModelDefinitionConfig;
}): ProviderRuntimeModel {
const input = (params.model.input ?? ["text"]).filter(
(value): value is "text" | "image" => value === "text" || value === "image",
);
return {
id: params.model.id,
name: params.model.name ?? params.model.id,
provider: params.provider,
api: "ollama",
baseUrl: readProviderBaseUrl(params.providerConfig) ?? "",
reasoning: params.model.reasoning ?? false,
input: input.length > 0 ? input : ["text"],
cost: params.model.cost ?? { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: params.model.contextWindow ?? 8192,
maxTokens: params.model.maxTokens ?? 8192,
...(params.model.compat ? { compat: params.model.compat as never } : {}),
...(params.model.params ? { params: params.model.params } : {}),
};
}
export default definePluginEntry({
id: "ollama",
name: "Ollama Provider",
@@ -215,6 +258,36 @@ export default definePluginEntry({
},
shouldDeferSyntheticProfileAuth: ({ resolvedApiKey }) =>
resolvedApiKey?.trim() === OLLAMA_DEFAULT_API_KEY,
prepareDynamicModel: async (ctx) => {
const providerConfig = resolveConfiguredOllamaProviderConfig({
config: ctx.config,
providerId: ctx.provider,
});
if (!hasOllamaDiscoverySignal(providerConfig)) {
return;
}
const baseUrl = readProviderBaseUrl(providerConfig);
const provider = await buildOllamaProvider(baseUrl, { quiet: true });
dynamicModelCache.set(
buildDynamicCacheKey(ctx.provider, baseUrl),
(provider.models ?? []).map((model) =>
toDynamicOllamaModel({
provider: ctx.provider,
providerConfig: provider,
model,
}),
),
);
},
resolveDynamicModel: (ctx) => {
const providerConfig = resolveConfiguredOllamaProviderConfig({
config: ctx.config,
providerId: ctx.provider,
});
return dynamicModelCache
.get(buildDynamicCacheKey(ctx.provider, readProviderBaseUrl(providerConfig)))
?.find((model) => model.id === ctx.modelId);
},
buildUnknownModelHint: () =>
"Ollama requires authentication to be registered as a provider. " +
'Set OLLAMA_API_KEY="ollama-local" (any value works) or run "openclaw configure". ' +

View File

@@ -1,3 +1,8 @@
import { spawnSync } from "node:child_process";
import * as fsSync from "node:fs";
import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { describe, expect, it } from "vitest";
import { createOllamaEmbeddingProvider } from "./src/embedding-provider.js";
import { createOllamaStreamFn } from "./src/stream.js";
@@ -20,7 +25,133 @@ async function collectStreamEvents<T>(stream: AsyncIterable<T>): Promise<T[]> {
return events;
}
async function withTempOpenClawState<T>(run: (paths: { root: string }) => Promise<T>): Promise<T> {
const root = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-ollama-cli-live-"));
try {
await fs.writeFile(
path.join(root, "openclaw.json"),
JSON.stringify(
{
models: {
providers: {
ollama: {
api: "ollama",
baseUrl: OLLAMA_BASE_URL,
apiKey: "ollama-local",
models: [],
},
},
},
},
null,
2,
),
);
return await run({ root });
} finally {
await fs.rm(root, { recursive: true, force: true });
}
}
async function runOpenClawCli(args: string[], env: NodeJS.ProcessEnv) {
const outputRoot = fsSync.mkdtempSync(path.join(os.tmpdir(), "openclaw-ollama-cli-output-"));
const stdoutPath = path.join(outputRoot, "stdout.txt");
const stderrPath = path.join(outputRoot, "stderr.txt");
const stdoutFd = fsSync.openSync(stdoutPath, "w");
const stderrFd = fsSync.openSync(stderrPath, "w");
let stdoutClosed = false;
let stderrClosed = false;
try {
const result = spawnSync(process.execPath, ["openclaw.mjs", ...args], {
cwd: process.cwd(),
env,
timeout: 90_000,
stdio: ["ignore", stdoutFd, stderrFd],
});
fsSync.closeSync(stdoutFd);
stdoutClosed = true;
fsSync.closeSync(stderrFd);
stderrClosed = true;
return {
exitCode: result.status ?? (result.error ? 1 : 0),
stdout: fsSync.readFileSync(stdoutPath, "utf8"),
stderr: fsSync.readFileSync(stderrPath, "utf8"),
};
} finally {
if (!stdoutClosed) {
fsSync.closeSync(stdoutFd);
}
if (!stderrClosed) {
fsSync.closeSync(stderrFd);
}
fsSync.rmSync(outputRoot, { recursive: true, force: true });
}
}
function parseJsonEnvelope(stdout: string): Record<string, unknown> {
const trimmed = stdout.trim();
const jsonStart = trimmed.lastIndexOf("\n{");
const rawJson = jsonStart >= 0 ? trimmed.slice(jsonStart + 1) : trimmed;
return JSON.parse(rawJson) as Record<string, unknown>;
}
function buildCliEnv(root: string): NodeJS.ProcessEnv {
return {
PATH: process.env.PATH,
HOME: process.env.HOME,
USER: process.env.USER,
TMPDIR: process.env.TMPDIR,
NODE_PATH: process.env.NODE_PATH,
NODE_OPTIONS: process.env.NODE_OPTIONS,
OPENCLAW_LIVE_TEST: "1",
OPENCLAW_LIVE_OLLAMA: "1",
OPENCLAW_LIVE_OLLAMA_WEB_SEARCH: "0",
OPENCLAW_STATE_DIR: path.join(root, "state"),
OPENCLAW_CONFIG_PATH: path.join(root, "openclaw.json"),
OPENCLAW_NO_RESPAWN: "1",
OPENCLAW_TEST_FAST: "1",
OLLAMA_API_KEY: "ollama-local",
};
}
describe.skipIf(!LIVE)("ollama live", () => {
it("runs infer model run through the local CLI path without PI model discovery", async () => {
await withTempOpenClawState(async ({ root }) => {
const result = await runOpenClawCli(
[
"infer",
"model",
"run",
"--local",
"--model",
`ollama/${CHAT_MODEL}`,
"--prompt",
"Reply with exactly one word: pong",
"--json",
],
buildCliEnv(root),
);
expect(result.exitCode).toBe(0);
expect(result.stderr).not.toContain("[agents/auth-profiles]");
expect(result.stdout.trim(), result.stderr).not.toHaveLength(0);
const payload = parseJsonEnvelope(result.stdout) as {
ok?: boolean;
transport?: string;
provider?: string;
model?: string;
outputs?: Array<{ text?: string }>;
};
expect(payload).toMatchObject({
ok: true,
transport: "local",
provider: "ollama",
model: CHAT_MODEL,
});
expect(payload.outputs?.[0]?.text?.trim().length ?? 0).toBeGreaterThan(0);
});
}, 120_000);
it("runs native chat with a custom provider prefix and normalized tool schemas", async () => {
const streamFn = createOllamaStreamFn(OLLAMA_BASE_URL);
let payload:

View File

@@ -15,6 +15,9 @@ vi.mock("../plugins/plugin-registry.js", () => ({
{
origin: "bundled",
nonSecretAuthMarkers: ["gcp-vertex-credentials", "ollama-local"],
providerAuthEnvVars: {
ollama: ["OLLAMA_API_KEY"],
},
},
],
}),
@@ -163,6 +166,20 @@ async function withoutEnv<T>(key: string, fn: () => Promise<T>): Promise<T> {
}
}
async function withEnv<T>(key: string, value: string, fn: () => Promise<T>): Promise<T> {
const previous = process.env[key];
process.env[key] = value;
try {
return await fn();
} finally {
if (previous === undefined) {
delete process.env[key];
} else {
process.env[key] = previous;
}
}
}
function createCustomProviderConfig(
baseUrl: string,
modelId = "llama3",
@@ -809,6 +826,30 @@ describe("resolveApiKeyForProvider", () => {
mode: "api-key",
});
});
it("prefers non-secret local env markers over ambient profiles", async () => {
const resolved = await withEnv("OLLAMA_API_KEY", "ollama-local", () =>
resolveApiKeyForProvider({
provider: "ollama",
store: {
version: 1,
profiles: {
"ollama:default": {
type: "api_key",
provider: "ollama",
key: "ollama-cloud-profile", // pragma: allowlist secret
},
},
},
}),
);
expect(resolved).toMatchObject({
apiKey: "ollama-local",
mode: "api-key",
});
expect(resolved.source).toContain("OLLAMA_API_KEY");
});
});
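The precedence under test can be restated as a standalone sketch: a non-secret local marker from config or env wins over a saved auth profile. Names and the marker check below are illustrative, not OpenClaw's actual API:

```typescript
// Sketch: resolve an API key with non-secret local markers taking
// precedence over stored auth profiles.
type Resolved = { apiKey: string; source: string };

// A "non-secret marker" is a well-known placeholder like "ollama-local",
// used to enable a local provider without a real credential.
const isNonSecretMarker = (key: string) => key === "ollama-local";

function resolveApiKey(params: {
  configKey?: string;
  envVar?: { name: string; value: string };
  profileKey?: string;
}): Resolved | undefined {
  if (params.configKey && isNonSecretMarker(params.configKey)) {
    return { apiKey: params.configKey, source: "config" };
  }
  if (params.envVar && isNonSecretMarker(params.envVar.value)) {
    return { apiKey: params.envVar.value, source: `env:${params.envVar.name}` };
  }
  if (params.profileKey) {
    return { apiKey: params.profileKey, source: "profile" };
  }
  return undefined;
}
```

Checking the marker branches before touching the profile store is what keeps an ambient cloud profile from hijacking a deliberately local setup.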
describe("resolveApiKeyForProvider synthetic local auth for custom providers", () => {

View File

@@ -523,6 +523,22 @@ export async function resolveApiKeyForProvider(params: {
}
const providerConfig = resolveProviderConfig(cfg, provider);
const configuredLocalKey = resolveUsableCustomProviderApiKey({ cfg, provider });
if (configuredLocalKey && isNonSecretApiKeyMarker(configuredLocalKey.apiKey)) {
return {
apiKey: configuredLocalKey.apiKey,
source: configuredLocalKey.source,
mode: "api-key",
};
}
const localMarkerEnv = resolveEnvApiKey(provider);
if (localMarkerEnv && isNonSecretApiKeyMarker(localMarkerEnv.apiKey)) {
return {
apiKey: localMarkerEnv.apiKey,
source: localMarkerEnv.source,
mode: "api-key",
};
}
const store = params.store ?? ensureAuthProfileStore(params.agentDir);
const order = resolveAuthProfileOrder({
cfg,

View File

@@ -1,5 +1,5 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
import { discoverModels } from "../pi-model-discovery.js";
import { discoverAuthStorage, discoverModels } from "../pi-model-discovery.js";
import { createProviderRuntimeTestMock } from "./model.provider-runtime.test-support.js";
vi.mock("../model-suppression.js", () => ({
@@ -55,6 +55,8 @@ import {
beforeEach(() => {
resetMockDiscoverModels(discoverModels);
vi.mocked(discoverModels).mockClear();
vi.mocked(discoverAuthStorage).mockClear();
mockGetOpenRouterModelCapabilities.mockReset();
mockGetOpenRouterModelCapabilities.mockReturnValue(undefined);
mockLoadOpenRouterModelCapabilities.mockReset();
@@ -110,6 +112,27 @@ function resolveModelAsyncForTest(
}
describe("resolveModel", () => {
it("skips PI auth and model discovery during dynamic model resolution", async () => {
const result = await resolveModelAsync(
"openrouter",
"openrouter/auto",
"/tmp/agent",
undefined,
{
runtimeHooks: createRuntimeHooks(),
skipPiDiscovery: true,
},
);
expect(result.error).toBeUndefined();
expect(result.model).toMatchObject({
provider: "openrouter",
id: "openrouter/auto",
});
expect(discoverAuthStorage).not.toHaveBeenCalled();
expect(discoverModels).not.toHaveBeenCalled();
});
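The `skipPiDiscovery` behavior above amounts to: when the flag is set, resolve only from already-known models and never invoke the discovery layer. A toy sketch of that guard, not the real resolver:

```typescript
// Sketch: resolve a model from a static registry, invoking the (possibly
// expensive) discovery callback only when skipping is not requested.
type ModelEntry = { provider: string; id: string };

async function resolveModelLean(
  provider: string,
  id: string,
  registry: Map<string, ModelEntry>,
  opts: {
    skipDiscovery?: boolean;
    discover?: () => Promise<ModelEntry | undefined>;
  },
): Promise<ModelEntry | undefined> {
  const known = registry.get(`${provider}/${id}`);
  if (known || opts.skipDiscovery) {
    // Never fall through to discovery on the lean path.
    return known;
  }
  return opts.discover?.();
}
```

The one-shot probe pays the cost of this trade: an unknown model yields `undefined` immediately instead of a slow discovery pass, which is exactly what keeps the local probe fast.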
it("defaults model input to text when discovery omits input", () => {
mockDiscoveredModel(discoverModels, {
provider: "custom",

View File

@@ -1,16 +1,29 @@
import type { Model } from "@mariozechner/pi-ai";
import { beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
const hoisted = vi.hoisted(() => ({
resolveModelMock: vi.fn(),
resolveModelAsyncMock: vi.fn(),
getApiKeyForModelMock: vi.fn(),
applyLocalNoAuthHeaderOverrideMock: vi.fn(),
setRuntimeApiKeyMock: vi.fn(),
resolveCopilotApiTokenMock: vi.fn(),
prepareProviderRuntimeAuthMock: vi.fn(),
prepareModelForSimpleCompletionMock: vi.fn((params: { model: unknown }) => params.model),
completeMock: vi.fn(),
}));
vi.mock("@mariozechner/pi-ai", () => ({
complete: hoisted.completeMock,
}));
vi.mock("./pi-embedded-runner/model.js", () => ({
resolveModel: hoisted.resolveModelMock,
resolveModelAsync: hoisted.resolveModelAsyncMock,
}));
vi.mock("./simple-completion-transport.js", () => ({
prepareModelForSimpleCompletion: hoisted.prepareModelForSimpleCompletionMock,
}));
vi.mock("./model-auth.js", () => ({
@@ -26,21 +39,30 @@ vi.mock("../plugins/provider-runtime.runtime.js", () => ({
prepareProviderRuntimeAuth: hoisted.prepareProviderRuntimeAuthMock,
}));
let completeWithPreparedSimpleCompletionModel: typeof import("./simple-completion-runtime.js").completeWithPreparedSimpleCompletionModel;
let prepareSimpleCompletionModel: typeof import("./simple-completion-runtime.js").prepareSimpleCompletionModel;
beforeAll(async () => {
({ prepareSimpleCompletionModel } = await import("./simple-completion-runtime.js"));
({ completeWithPreparedSimpleCompletionModel, prepareSimpleCompletionModel } =
await import("./simple-completion-runtime.js"));
});
beforeEach(() => {
hoisted.resolveModelMock.mockReset();
hoisted.resolveModelAsyncMock.mockReset();
hoisted.getApiKeyForModelMock.mockReset();
hoisted.applyLocalNoAuthHeaderOverrideMock.mockReset();
hoisted.setRuntimeApiKeyMock.mockReset();
hoisted.resolveCopilotApiTokenMock.mockReset();
hoisted.prepareProviderRuntimeAuthMock.mockReset();
hoisted.prepareModelForSimpleCompletionMock.mockReset();
hoisted.completeMock.mockReset();
hoisted.applyLocalNoAuthHeaderOverrideMock.mockImplementation((model: unknown) => model);
hoisted.prepareModelForSimpleCompletionMock.mockImplementation(
(params: { model: unknown }) => params.model,
);
hoisted.completeMock.mockResolvedValue({ content: [{ type: "text", text: "ok" }] });
hoisted.resolveModelMock.mockReturnValue({
model: {
@@ -52,6 +74,9 @@ beforeEach(() => {
},
modelRegistry: {},
});
hoisted.resolveModelAsyncMock.mockImplementation((...args: unknown[]) =>
Promise.resolve(hoisted.resolveModelMock(...args)),
);
hoisted.getApiKeyForModelMock.mockResolvedValue({
apiKey: "sk-test",
source: "env:TEST_API_KEY",
@@ -405,4 +430,86 @@ describe("prepareSimpleCompletionModel", () => {
}),
);
});
it("can skip Pi model/auth discovery for config-scoped one-shot completions", async () => {
hoisted.resolveModelAsyncMock.mockResolvedValueOnce({
model: {
provider: "ollama",
id: "llama3.2:latest",
},
authStorage: {
setRuntimeApiKey: hoisted.setRuntimeApiKeyMock,
},
modelRegistry: {},
});
hoisted.getApiKeyForModelMock.mockResolvedValueOnce({
apiKey: "ollama-local",
source: "models.json (local marker)",
mode: "api-key",
});
const result = await prepareSimpleCompletionModel({
cfg: undefined,
provider: "ollama",
modelId: "llama3.2:latest",
skipPiDiscovery: true,
});
expect(result).not.toHaveProperty("error");
expect(hoisted.resolveModelMock).not.toHaveBeenCalled();
expect(hoisted.resolveModelAsyncMock).toHaveBeenCalledWith(
"ollama",
"llama3.2:latest",
undefined,
undefined,
{
skipPiDiscovery: true,
},
);
});
});
describe("completeWithPreparedSimpleCompletionModel", () => {
it("prepares provider-owned stream APIs before running a completion", async () => {
const model = {
provider: "ollama",
id: "llama3.2:latest",
name: "llama3.2:latest",
api: "ollama",
baseUrl: "http://127.0.0.1:11434",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 8192,
maxTokens: 1024,
} satisfies Model<"ollama">;
const preparedModel = {
...model,
api: "openclaw-ollama-simple-test",
};
hoisted.prepareModelForSimpleCompletionMock.mockReturnValueOnce(preparedModel);
await completeWithPreparedSimpleCompletionModel({
model,
auth: {
apiKey: "ollama-local",
source: "models.json (local marker)",
mode: "api-key",
},
context: {
messages: [{ role: "user", content: "pong", timestamp: 1 }],
},
});
expect(hoisted.prepareModelForSimpleCompletionMock).toHaveBeenCalledWith({ model });
expect(hoisted.completeMock).toHaveBeenCalledWith(
preparedModel,
{
messages: [{ role: "user", content: "pong", timestamp: 1 }],
},
{
apiKey: "ollama-local",
},
);
});
});
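The lean path reduces a completion result to plain text by keeping only text blocks, as the CLI does before building its envelope. A minimal sketch of that reduction, with the block shape simplified:

```typescript
// Sketch: join the text blocks of a completion result into one string,
// ignoring non-text blocks (tool calls, images, etc.).
type ContentBlock = { type: string; text?: string };

function extractText(content: ContentBlock[]): string {
  return content
    .map((block) => (block.type === "text" && block.text ? block.text : ""))
    .join("")
    .trim();
}
```

An empty result after trimming is a useful signal in its own right: the probe reached the model but got no usable text back, which the envelope can report as zero outputs.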

View File

@@ -15,7 +15,8 @@ import {
resolveDefaultModelForAgent,
resolveModelRefFromString,
} from "./model-selection.js";
import { resolveModel } from "./pi-embedded-runner/model.js";
import { resolveModel, resolveModelAsync } from "./pi-embedded-runner/model.js";
import { prepareModelForSimpleCompletion } from "./simple-completion-transport.js";
type SimpleCompletionAuthStorage = {
setRuntimeApiKey: (provider: string, apiKey: string) => void;
@@ -158,8 +159,13 @@ export async function prepareSimpleCompletionModel(params: {
profileId?: string;
preferredProfile?: string;
allowMissingApiKeyModes?: ReadonlyArray<AllowedMissingApiKeyMode>;
skipPiDiscovery?: boolean;
}): Promise<PreparedSimpleCompletionModel> {
const resolved = resolveModel(params.provider, params.modelId, params.agentDir, params.cfg);
const resolved = params.skipPiDiscovery
? await resolveModelAsync(params.provider, params.modelId, params.agentDir, params.cfg, {
skipPiDiscovery: true,
})
: resolveModel(params.provider, params.modelId, params.agentDir, params.cfg);
if (!resolved.model) {
return {
error: resolved.error ?? `Unknown model: ${params.provider}/${params.modelId}`,
@@ -233,6 +239,7 @@ export async function prepareSimpleCompletionModelForAgent(params: {
modelRef?: string;
preferredProfile?: string;
allowMissingApiKeyModes?: ReadonlyArray<AllowedMissingApiKeyMode>;
skipPiDiscovery?: boolean;
}): Promise<PreparedSimpleCompletionModelForAgent> {
const selection = resolveSimpleCompletionSelectionForAgent({
cfg: params.cfg,
@@ -252,6 +259,7 @@ export async function prepareSimpleCompletionModelForAgent(params: {
profileId: selection.profileId,
preferredProfile: params.preferredProfile,
allowMissingApiKeyModes: params.allowMissingApiKeyModes,
skipPiDiscovery: params.skipPiDiscovery,
});
if ("error" in prepared) {
return {
@@ -272,7 +280,8 @@ export async function completeWithPreparedSimpleCompletionModel(params: {
context: Parameters<typeof complete>[1];
options?: SimpleCompletionModelOptions;
}) {
return await complete(params.model, params.context, {
const completionModel = prepareModelForSimpleCompletion({ model: params.model });
return await complete(completionModel, params.context, {
...params.options,
apiKey: params.auth.apiKey,
});

View File

@@ -34,9 +34,25 @@ const mocks = vi.hoisted(() => ({
),
resolveMemorySearchConfig: vi.fn(() => null),
loadModelCatalog: vi.fn(async () => []),
agentCommand: vi.fn(async () => ({
payloads: [{ text: "local reply" }],
meta: { agentMeta: { provider: "openai", model: "gpt-5.4" } },
prepareSimpleCompletionModelForAgent: vi.fn(async () => ({
selection: {
provider: "openai",
modelId: "gpt-5.4",
agentDir: "/tmp/agent",
},
model: {
provider: "openai",
id: "gpt-5.4",
maxTokens: 128,
},
auth: {
apiKey: "sk-test",
source: "env:TEST_API_KEY",
mode: "api-key",
},
})),
completeWithPreparedSimpleCompletionModel: vi.fn(async () => ({
content: [{ type: "text", text: "local reply" }],
})),
callGateway: vi.fn(async ({ method }: { method: string }) => {
if (method === "tts.status") {
@@ -131,11 +147,6 @@ vi.mock("../config/config.js", () => ({
loadConfig: mocks.loadConfig as typeof import("../config/config.js").loadConfig,
}));
vi.mock("../agents/agent-command.js", () => ({
agentCommand:
mocks.agentCommand as unknown as typeof import("../agents/agent-command.js").agentCommand,
}));
vi.mock("../agents/agent-scope.js", () => ({
resolveDefaultAgentId: () => "main",
resolveAgentDir: () => "/tmp/agent",
@@ -146,6 +157,13 @@ vi.mock("../agents/model-catalog.js", () => ({
mocks.loadModelCatalog as typeof import("../agents/model-catalog.js").loadModelCatalog,
}));
vi.mock("../agents/simple-completion-runtime.js", () => ({
prepareSimpleCompletionModelForAgent:
mocks.prepareSimpleCompletionModelForAgent as unknown as typeof import("../agents/simple-completion-runtime.js").prepareSimpleCompletionModelForAgent,
completeWithPreparedSimpleCompletionModel:
mocks.completeWithPreparedSimpleCompletionModel as unknown as typeof import("../agents/simple-completion-runtime.js").completeWithPreparedSimpleCompletionModel,
}));
vi.mock("../agents/auth-profiles.js", () => ({
loadAuthProfileStoreForRuntime:
mocks.loadAuthProfileStoreForRuntime as unknown as typeof import("../agents/auth-profiles.js").loadAuthProfileStoreForRuntime,
@@ -291,7 +309,8 @@ describe("capability cli", () => {
return store;
});
mocks.resolveMemorySearchConfig.mockReset().mockReturnValue(null);
mocks.agentCommand.mockClear();
mocks.prepareSimpleCompletionModelForAgent.mockClear();
mocks.completeWithPreparedSimpleCompletionModel.mockClear();
mocks.callGateway.mockClear().mockImplementation((async ({ method }: { method: string }) => {
if (method === "tts.status") {
return { enabled: true, provider: "openai" };
@@ -362,7 +381,8 @@ describe("capability cli", () => {
argv: ["capability", "model", "run", "--prompt", "hello", "--json"],
});
expect(mocks.agentCommand).toHaveBeenCalledTimes(1);
expect(mocks.prepareSimpleCompletionModelForAgent).toHaveBeenCalledTimes(1);
expect(mocks.completeWithPreparedSimpleCompletionModel).toHaveBeenCalledTimes(1);
expect(mocks.callGateway).not.toHaveBeenCalled();
expect(mocks.runtime.writeJson).toHaveBeenCalledWith(
expect.objectContaining({
@@ -372,20 +392,30 @@ describe("capability cli", () => {
);
});
it("runs local model probes without chat-agent prompt policy or tools", async () => {
it("runs local model probes through the lean completion path", async () => {
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: ["capability", "model", "run", "--prompt", "hello", "--json"],
});
expect(mocks.agentCommand).toHaveBeenCalledWith(
expect(mocks.prepareSimpleCompletionModelForAgent).toHaveBeenCalledWith(
expect.objectContaining({
cleanupBundleMcpOnRunEnd: true,
modelRun: true,
promptMode: "none",
agentId: "main",
allowMissingApiKeyModes: ["aws-sdk"],
skipPiDiscovery: true,
}),
);
expect(mocks.completeWithPreparedSimpleCompletionModel).toHaveBeenCalledWith(
expect.objectContaining({
context: {
messages: [
expect.objectContaining({
role: "user",
content: "hello",
}),
],
},
}),
expect.anything(),
expect.anything(),
);
});

View File

@@ -4,7 +4,6 @@ import path from "node:path";
import { Readable } from "node:stream";
import { pipeline } from "node:stream/promises";
import type { Command } from "commander";
import { agentCommand } from "../agents/agent-command.js";
import { resolveAgentDir, resolveDefaultAgentId } from "../agents/agent-scope.js";
import {
listProfilesForProvider,
@@ -13,6 +12,10 @@ import {
import { updateAuthProfileStoreWithLock } from "../agents/auth-profiles/store.js";
import { resolveMemorySearchConfig } from "../agents/memory-search.js";
import { loadModelCatalog } from "../agents/model-catalog.js";
import {
completeWithPreparedSimpleCompletionModel,
prepareSimpleCompletionModelForAgent,
} from "../agents/simple-completion-runtime.js";
import { getRuntimeConfig } from "../config/config.js";
import { resolveAgentModelPrimaryValue } from "../config/model-input.js";
import type { OpenClawConfig } from "../config/types.openclaw.js";
@@ -79,7 +82,6 @@ import {
runWebSearch,
} from "../web-search/runtime.js";
import { runCommandWithRuntime } from "./cli-utils.js";
import { createDefaultDeps } from "./deps.js";
import { removeCommandByName } from "./program/command-tree.js";
import { collectOption } from "./program/helpers.js";
@@ -576,34 +578,54 @@ async function runModelRun(params: {
const cfg = getRuntimeConfig();
const agentId = resolveDefaultAgentId(cfg);
if (params.transport === "local") {
const result = await agentCommand(
{
message: params.prompt,
agentId,
model: params.model,
json: false,
modelRun: true,
promptMode: "none",
cleanupBundleMcpOnRunEnd: true,
const prepared = await prepareSimpleCompletionModelForAgent({
cfg,
agentId,
modelRef: params.model,
allowMissingApiKeyModes: ["aws-sdk"],
skipPiDiscovery: true,
});
if ("error" in prepared) {
throw new Error(prepared.error);
}
const result = await completeWithPreparedSimpleCompletionModel({
model: prepared.model,
auth: prepared.auth,
context: {
messages: [
{
role: "user",
content: params.prompt,
timestamp: Date.now(),
},
],
},
{
...defaultRuntime,
log: () => {},
options: {
maxTokens:
typeof prepared.model.maxTokens === "number" && Number.isFinite(prepared.model.maxTokens)
? prepared.model.maxTokens
: undefined,
},
createDefaultDeps(),
);
});
const text = result.content
.map((block) => (block.type === "text" ? block.text : ""))
.join("")
.trim();
return {
ok: true,
capability: "model.run",
transport: "local" as const,
provider: result?.meta?.agentMeta?.provider,
model: result?.meta?.agentMeta?.model,
provider: prepared.selection.provider,
model: prepared.selection.modelId,
attempts: [],
outputs: (result?.payloads ?? []).map((payload) => ({
text: payload.text,
mediaUrl: payload.mediaUrl,
mediaUrls: payload.mediaUrls,
})),
outputs: text
? [
{
text,
mediaUrl: null,
},
]
: [],
} satisfies CapabilityEnvelope;
}