Commit Graph

54 Commits

Author SHA1 Message Date
Peter Steinberger
dadcfb574f test: trim surrogate chunk fixtures 2026-04-17 19:38:53 +01:00
Peter Steinberger
77e6e4cf87 refactor: move memory embeddings into provider plugins 2026-04-17 02:57:18 +01:00
Peter Steinberger
3a59eddd07 test: reduce hotspot fixture overhead 2026-04-17 02:37:00 +01:00
Peter Steinberger
cd330f5f98 test: isolate gemini embedding request helpers 2026-04-17 01:46:47 +01:00
Peter Steinberger
2e08c77582 test: slim provider registry mocks 2026-04-17 01:29:12 +01:00
Peter Steinberger
272536015f test: slim runtime hotspot mocks 2026-04-17 01:15:31 +01:00
Peter Steinberger
b2cae7f12a test: trim duplicate memory hotspot coverage 2026-04-16 23:15:38 +01:00
Peter Steinberger
d59604b15e test: speed up hotspot boundaries 2026-04-16 22:55:30 +01:00
Peter Steinberger
63e53fbf2e test: trim duplicate hotspot coverage 2026-04-16 22:19:32 +01:00
Peter Steinberger
372c0051ba test: speed up slow import-boundary tests 2026-04-16 21:14:17 +01:00
zqchris
82e349a48a memory: strip inbound metadata envelopes from user messages in session corpus (#66548)
Merged via squash.

Prepared head SHA: 98562b2a84
Co-authored-by: zqchris <4436110+zqchris@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
2026-04-16 11:15:44 -07:00
Tak Hoffman
4f00b76925 fix(context-window): Tighten context limits and bound memory excerpts (#67277)
* Tighten context limits and bound memory excerpts

* Align startup context defaults in config docs

* Align qmd memory_get bounds with shared limits

* Preserve qmd partial memory reads

* Fix shared memory read type import

* Add changelog entry for context bounds
2026-04-15 13:06:02 -05:00
Mariano
3d06d90e83 fix(memory): unify default root memory handling (#66141)
* fix(memory): unify default root memory handling

* test(memory): align legacy migration expectation

* docs(changelog): tag qmd root-memory fix

* docs(changelog): append qmd root-memory entry

* docs(changelog): dedupe qmd root-memory entry

* docs(changelog): attribute qmd root-memory fix

---------

Co-authored-by: mbelinky <mbelinky@users.noreply.github.com>
2026-04-13 23:59:57 +02:00
Rugved Somwanshi
0cfb83edfa feat: LM Studio Integration (#53248)
* Feat: LM Studio Integration

* Format

* Support usage in streaming true

Fix token count

* Add custom window check

* Drop max tokens fallback

* tweak docs

Update generated

* Avoid error if stale header does not resolve

* Fix test

* Fix test

* Fix rebase issues

Trim code

* Fix tests

Drop keyless

Fixes

* Fix linter issues in tests

* Update generated artifacts

* Do not have fatal header resoltuion for discovery

* Do the same for API key as well

* fix: honor lmstudio preload runtime auth

* fix: clear stale lmstudio header auth

* fix: lazy-load lmstudio runtime facade

* fix: preserve lmstudio shared synthetic auth

* fix: clear stale lmstudio header auth in discovery

* fix: prefer lmstudio header auth for discovery

* fix: honor lmstudio header auth in warmup paths

* fix: clear stale lmstudio profile auth

* fix: ignore lmstudio env auth on header migration

* fix: use local lmstudio setup seam

* fix: resolve lmstudio rebase fallout

---------

Co-authored-by: Frank Yang <frank.ekn@gmail.com>
2026-04-13 15:22:44 +08:00
Tak Hoffman
847739d82c Fix dreaming replay, repair polluted artifacts, and gate wiki tabs (#65138)
* fix(active-memory): preserve parent channel context for recall runs

* fix(active-memory): keep recall runs on the resolved channel

* fix(active-memory): prefer resolved recall channel over wrapper hints

* fix(active-memory): trust explicit recall channel hints

* fix(active-memory): rank recall channel fallbacks by trust

* Fix dreaming replay and recovery flows

* fix: prevent dreaming event loss and diary write races

* chore: add changelog entry for memory fixes

* fix: harden dreaming repair and diary writes

* fix: harden dreaming artifact archive naming
2026-04-12 00:25:11 -05:00
Peter Steinberger
1fb2e18f47 refactor: simplify cli conversions 2026-04-11 01:27:48 +01:00
Peter Steinberger
56fc20fb7c test: align Vitest config path assertions 2026-04-10 15:49:37 +01:00
Mariano
383ea34efe fix(reply): keep resolved secret config stable (#64249)
Merged via squash.

Prepared head SHA: 973f863d8c
Co-authored-by: mbelinky <132747814+mbelinky@users.noreply.github.com>
Co-authored-by: mbelinky <132747814+mbelinky@users.noreply.github.com>
Reviewed-by: @mbelinky
2026-04-10 14:56:30 +02:00
Vincent Koc
5342cc49b1 fix(memory-host-sdk): route ollama shim through sdk runtime facade 2026-04-09 08:23:06 +01:00
Peter Steinberger
cb28d8d6b8 refactor: dedupe browser and memory lowercase helpers 2026-04-07 13:44:41 +01:00
Peter Steinberger
a15a5a1edc refactor: dedupe lowercase helper readers 2026-04-07 13:44:41 +01:00
Peter Steinberger
ad0c4309e6 refactor: dedupe shared trim readers 2026-04-07 07:36:11 +01:00
Peter Steinberger
7cf72f7bc8 Tests: skip bedrock auth probe in embeddings spec 2026-04-07 12:55:09 +08:00
Peter Steinberger
8c7dd66a7b refactor: dedupe string readers 2026-04-07 05:06:54 +01:00
Peter Steinberger
69f4022950 refactor: dedupe browser and memory host error helpers 2026-04-07 05:06:53 +01:00
wirjo
699b2320a8 feat(memory): add Bedrock embedding provider for memory search (#61547)
* feat(memory): add Bedrock embedding provider for memory search

Add Amazon Bedrock as a native embedding provider for memory search.
Supports Titan Embed Text v1/v2 and Cohere Embed models via AWS SDK.

- New embeddings-bedrock.ts: BedrockRuntimeClient + InvokeModel
- Auth via AWS default credential chain (same as Bedrock inference)
- Auto-selected in 'auto' mode when AWS credentials are detected
- Titan V2: configurable dimensions (256/512/1024), normalization
- Cohere: native batch support with search_query/search_document types
- 16 new tests covering all model types, auth detection, edge cases

Closes #26289

* fix(memory): harden bedrock embedding selection

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-04-06 01:19:56 +01:00
Dave Morin
2ed2dbba00 Memory: move dreaming trail to dreams.md (#61537)
* Memory: move dreaming trail to dreams.md

* docs(changelog): add dreams.md entry

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-04-05 23:19:31 +01:00
Peter Steinberger
2aafa8fb7d refactor: remove ollama sdk facades 2026-04-05 08:15:39 +01:00
Vincent Koc
2a999bf9c9 refactor(memory): invert memory host sdk dependency 2026-04-05 07:34:33 +01:00
scoootscooob
9860db5cea fix(memory): allow Gemini multimodal fallback before registry hydration (#61085)
* fix(memory): allow Gemini multimodal fallback

* docs(memory): clarify multimodal fallback
2026-04-04 18:24:20 -07:00
Peter Steinberger
bc23db501b test: trim more core importOriginal usage 2026-04-03 19:49:43 +01:00
Shakker
e9e7033ea1 test: trim embeddings provider import cost 2026-04-04 01:07:28 +09:00
Peter Steinberger
847faa3d04 test: trim extension test import churn 2026-04-03 04:41:08 +01:00
ImLukeF
78b48735fa test: restore fetch stubs in embedding suites 2026-04-01 20:53:16 +11:00
ImLukeF
101c31f5e1 test: harden ci-sensitive unit suites 2026-04-01 20:53:16 +11:00
Vincent Koc
ab4ddff7f1 feat(memory): add per-agent QMD extra collections for cross-agent session search (#58211)
* feat(memory): add per-agent qmd extra collections

* test(config): cover qmd extra collections schema outputs

* docs(config): refresh qmd extra collections baseline

* docs(config): regenerate qmd extra collections baselines

* docs(config): clarify qmd extra collection naming
2026-03-31 17:08:18 +09:00
Vincent Koc
075645f5cb fix(memory): use explicit qmd snippet line metadata (#58181)
* fix(memory): preserve qmd snippet line metadata

* Memory/QMD: preserve snippet span with partial line metadata
2026-03-31 17:05:53 +09:00
Altay
910134b702 fix(memory): stabilize qmd collection scoping 2026-03-30 22:41:21 +03:00
Vincent Koc
b7de04f23f fix(memory): preserve shared qmd collection names (#57628)
* fix(memory): preserve shared qmd collection names

* fix(memory): canonicalize qmd path containment
2026-03-30 19:29:35 +09:00
Vincent Koc
8623c28f1d fix(memory): warn when qmd binary is missing (#57467)
* fix(memory): warn when qmd binary is missing

* fix(memory): avoid probing cached qmd managers

* docs(memory): clarify qmd doctor probe behavior

* fix(memory): probe qmd from agent workspace
2026-03-30 13:44:41 +09:00
Vincent Koc
b7d59f7831 fix(memory): export archived qmd session transcripts (#57446)
* fix(memory): export archived qmd session transcripts

* test(memory): separate qmd session listing describe
2026-03-30 12:50:21 +09:00
Gustavo Madeira Santana
313fdf5adf Memory Host: make backend-config path tests portable 2026-03-29 23:41:42 -04:00
Vincent Koc
da35718cb2 fix(memory): add qmd mcporter search tool override (#57363)
* fix(memory): add qmd mcporter search tool override

* fix(memory): tighten qmd search tool override guards

* chore(config): drop generated docs baselines from qmd pr

* fix(memory): keep explicit qmd query override on v2 args

* docs(changelog): normalize qmd search tool attribution

* fix(memory): reuse v1 qmd tool after query fallback
2026-03-30 12:07:32 +09:00
Amine Harch el korane
219d4f03bd fix: wire memorySearch.extraPaths to QMD indexing (#57315)
* fix: wire memorySearch.extraPaths to QMD indexing

The 'agents.defaults.memorySearch.extraPaths' config field was documented
to add extra directories to the memory index, but the paths were never
actually passed to the QMD backend. Only 'memory.qmd.paths' worked.

This fix reads extraPaths from the memorySearch config and maps them
to QMD custom path collections, so users can simply configure:

  memorySearch:
    extraPaths:
      - odd-vault
      - /Users/odd/workspace
      - /Users/odd/docs

And have those directories indexed alongside the default memory files.

Closes #57302

* fix: handle per-agent memorySearch.extraPaths overrides + add tests

- Read per-agent overrides from agents.list[].memorySearch.extraPaths
- Agent-specific overrides take priority over defaults
- Falls back to defaults when agent has no overrides
- Added 3 test cases for the feature

* fix: merge defaults + agent overrides instead of replacing

* fix: remove any types from tests, fix merge behavior assertion

* fix(memory): merge qmd extra path collections

* fix(memory): normalize qmd extra path resolution

* fix(memory): type qmd extra path merge

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-03-30 08:58:42 +09:00
Peter Steinberger
8e0ab35b0e refactor(plugins): decouple bundled plugin runtime loading 2026-03-29 09:10:38 +01:00
Tak Hoffman
3ce48aff66 Memory: add configurable FTS5 tokenizer for CJK text support (openclaw#56707)
Verified:
- pnpm build
- pnpm check
- pnpm test -- extensions/memory-core/src/memory/manager-search.test.ts packages/memory-host-sdk/src/host/query-expansion.test.ts
- pnpm test -- extensions/memory-core/src/memory/index.test.ts -t "reindexes when extraPaths change"
- pnpm test -- src/config/schema.base.generated.test.ts
- pnpm test -- src/media-understanding/image.test.ts
- pnpm test

Co-authored-by: Mitsuyuki Osabe <24588751+carrotRakko@users.noreply.github.com>
2026-03-28 20:53:29 -05:00
AaronLuo00
f8547fcae4 fix: guard fine-split against breaking UTF-16 surrogate pairs
When re-splitting CJK-heavy segments at chunking.tokens, check whether the
slice boundary falls on a high surrogate (0xD800–0xDBFF) and if so extend
by one code unit to keep the pair intact.  Prevents producing broken
surrogate halves for CJK Extension B+ characters (U+20000+).

Add test verifying no lone surrogates appear when splitting lines of
surrogate-pair characters with an odd token budget.

Addresses third-round Codex P2 review comment.
2026-03-29 10:22:43 +09:00
AaronLuo00
3b95aa8804 fix: address second-round review — Latin backward compat and emoji consistency
- Two-pass line splitting: first slice at maxChars (unchanged for Latin),
  then re-split only CJK-heavy segments at chunking.tokens. This preserves
  the original ~800-char segments for ASCII lines while keeping CJK chunks
  within the token budget.

- Narrow surrogate-pair adjustment to CJK Extension B+ range (D840–D87E)
  only, so emoji surrogate pairs are not affected. Mixed CJK+emoji text
  is now handled consistently regardless of composition.

- Add tests: emoji handling (2), Latin backward-compat long-line (1).

Addresses Codex P1 (oversized CJK segments) and P2s (Latin over-splitting,
emoji surrogate inconsistency).
2026-03-29 10:22:43 +09:00
AaronLuo00
a5147d4d88 fix: address bot review — surrogate-pair counting and CJK line splitting
- Use code-point length instead of UTF-16 length in estimateStringChars()
  so that CJK Extension B+ surrogate pairs (U+20000+) are counted as 1
  character, not 2 (fixes ~25% overestimate for rare characters).

- Change long-line split step from maxChars to chunking.tokens so that
  CJK lines are sliced into token-budget-sized segments instead of
  char-budget-sized segments that produce ~4x oversized chunks.

- Add tests for both fixes: surrogate-pair handling and long CJK line
  splitting.

Addresses review feedback from Greptile and Codex bots.
2026-03-29 10:22:43 +09:00
AaronLuo00
971ecabe80 fix(memory): account for CJK characters in QMD memory chunking
The QMD memory system uses a fixed 4:1 chars-to-tokens ratio for chunk
sizing, which severely underestimates CJK (Chinese/Japanese/Korean) text
where each character is roughly 1 token. This causes oversized chunks for
CJK users, degrading vector search quality and wasting context window space.

Changes:
- Add shared src/utils/cjk-chars.ts module with CJK-aware character
  counting (estimateStringChars) and token estimation helpers
- Update chunkMarkdown() in src/memory/internal.ts to use weighted
  character lengths for chunk boundary decisions and overlap calculation
- Replace hardcoded estimateTokensFromChars in the context report
  command with the shared utility
- Add 13 unit tests for the CJK estimation module and 5 new tests for
  CJK-aware memory chunking behavior

Backward compatible: pure ASCII/Latin text behavior is unchanged.

Closes #39965
Related: #40216
2026-03-29 10:22:43 +09:00