Closes#74587. AI-assisted, fully tested.
The previous deprecation warning ("set config.modelFallback explicitly
if you want a fallback model") read naturally as runtime failover —
model A errors → switch to model B. The actual semantics in
`getModelRef` are different: `modelFallback` is the **last candidate
in the chain-resolution walk**, consulted only when `config.model`,
the current run's model, AND the agent's configured default have all
resolved to nothing. There is no error-recovery / retry-with-different-model
path.
The mismatch wastes real debug time. The issue filer reports ~1 hour of
cycles before reading source revealed the gap; users without source
access can debug for much longer assuming runtime failover exists.
## Fix
Rewrite the warning string to:
1. State the deprecation (preserved).
2. Describe `modelFallback`'s actual semantics — chain-resolution
last-resort, gated on the three earlier candidates resolving to
nothing.
3. Explicitly disclaim the wrong mental model — "it is NOT a runtime
failover that substitutes a different model when the resolved model
errors out" — so a quick read can't lead the operator astray.
No behavior change, only operator-facing copy. Surrounding code paths
(`getModelRef`, `hasDeprecatedModelFallbackPolicy`, the warn caller in
`register()`) are untouched.
## Tests
`extensions/active-memory/index.test.ts` extends the existing
deprecation-warning assertion to pin both the positive copy
(`chain-resolution`, `last-resort`) and the negative disclaimer
(`NOT a runtime failover`), so a future "let's reword this" change
that reintroduces the failover-implying language fails the test
instead of silently regressing.
`pnpm test extensions/active-memory/index.test.ts` — 94 passed.
`pnpm exec oxfmt --check` — clean. `pnpm exec oxlint` — 0 warnings,
0 errors.
## AI-assisted PR
- [x] Mark as AI-assisted (Claude). Lightly tested via the targeted
Vitest extension shard; not exercised against a live Ollama / AM
rollout because the change is a log-string update, not behavior.
- [x] Confirm I understand what the code does: yes — `getModelRef`
walks four candidates (`config.model`, `currentRunModel`,
`configuredDefaultModel`, `config.modelFallback`) and returns the
first non-null parse; `modelFallback` is purely a default-when-empty
selector, not a runtime failover.
Fixes#73502.
Active Memory now allows its hidden recall sub-agent to use both bundled memory tool contracts: memory_recall for memory-lancedb and memory_search/memory_get for memory-core. The prompt prefers memory_recall when available and falls back to the legacy tool pair when that is the active backend surface.
Also updates Active Memory docs, QA mock fixtures, and debug parsing compatibility for the two recall paths.
Wrap runRecallSubagent() with Promise.race so maybeResolveActiveRecall
returns a timeout result at the configured timeoutMs even when the
embedded run has not cooperatively checked the abort signal. Late
subagent rejections are caught silently to prevent unhandled promise
errors.
Fixes#71629
The normalizePluginConfig clamp hard-coded a 60_000 ms ceiling for
config.timeoutMs, silently reducing any configured value above 60
seconds down to 60 000 ms at runtime. This made it impossible for
operators to set longer recall budgets even though the docs
(docs/pi.md) showed 120_000 as a valid example.
Raise the ceiling to 120_000 ms so values between 60 001 and 120 000
are honored. Values above 120 000 are still clamped to prevent
unbounded blocking.
Adds two regression tests:
- 90 000 ms is passed through unchanged
- 200 000 ms is clamped to 120 000 ms
Fixes#68410.
* move active memory into prompt prefix
* document active memory prompt prefix
* strip active memory prefixes from recall history
* harden active memory prompt prefix handling
* hide active memory prefix in leading history views
* strip hidden memory blocks after prompt merges
* preserve user turns in memory recall cleanup
* Refine plugin debug plumbing
* Tighten plugin debug handling
* Reduce active memory overhead
* Abort active memory sidecar on timeout
* Rename active memory blocking subagent wording
* Fix active memory cache and recall selection
* Preserve active memory session scope
* Sanitize recalled context before retrieval
* Add active memory changelog entry
* Harden active memory debug and transcript handling
* Add active memory policy config
* Raise active memory timeout default
* Keep usage footer on primary reply
* Clear stale active memory status lines
* Match legacy active memory status prefixes
* Preserve numeric active memory bullets
* Reuse canonical session keys for active memory
* Let active memory subagent decide relevance
* Refine active memory plugin summary flow
* Fix active memory main-session DM detection
* Trim active memory summaries at word boundaries
* Add active memory prompt styles
* Fix active memory stale status cleanup
* Rename active memory subagent wording
* Add active memory prompt and thinking overrides
* Remove active memory legacy status compat
* Resolve active memory session id status
* Add active memory session toggle
* Add active memory global toggle
* Fix active memory toggle state handling
* Harden active memory transcript persistence
* Fix active memory chat type gating
* Scope active memory transcripts by agent
* Show plugin debug before replies