2295 Commits

Author SHA1 Message Date
Luis Pater
8f4a4eabfc feat(docs): add VisionCoder sponsorship details and optimize external links
- Added VisionCoder sponsorship information to `README.md`, `README_CN.md`, and `README_JA.md`.
- Updated external links to include `target="_blank"` for improved user experience.
- Added new logo asset `visioncoder.png` for README use.
v6.9.30
2026-04-19 23:00:09 +08:00
Luis Pater
e05abec01f Merge pull request #2898 from octo-patch/fix/issue-2866-remove-obsolete-1m-beta-header
fix(executor): drop obsolete context-1m-2025-08-07 beta header
2026-04-19 11:46:21 +08:00
octo-patch
f4eb16102b fix(executor): drop obsolete context-1m-2025-08-07 beta header (fixes #2866)
Anthropic has moved the 1M-context-window feature to General Availability,
so the context-1m-2025-08-07 beta flag is no longer accepted and now causes
400 Bad Request errors when forwarded upstream.

Remove the X-CPA-CLAUDE-1M detection and the corresponding injection of the
now-invalid beta header.  Also drop the unused net/textproto import that was
only needed for the header-key lookup.
2026-04-19 10:38:16 +08:00
Luis Pater
86c856f56f feat(translator): add partial and full image generation support in Codex-GPT and Codex-Gemini flows
- Introduced `LastImageHashByItemID` in Codex-GPT and `LastImageHashByID` in Codex-Gemini for deduplication of generated images.
- Added support for handling `partial_image` and `image_generation_call` types, with inline data embedding for Gemini and URL payload conversion for GPT.
- Extended unit tests to verify image handling in both streaming and non-streaming modes.
2026-04-19 03:21:59 +08:00
Supra4E8C
c6baa64b4e Merge pull request #2892 from router-for-me/fix-provider
feat(api): integrate auth index into key retrieval endpoints for Gemi…
v6.9.29
2026-04-18 17:24:43 +08:00
Supra4E8C
a64141a9a6 fix(tests): remove obsolete config_auth_index_test file 2026-04-18 17:22:16 +08:00
Supra4E8C
c26936e2e6 fix(management): stabilize auth-index mapping 2026-04-18 17:12:14 +08:00
Supra4E8C
894baad829 feat(api): integrate auth index into key retrieval endpoints for Gemini, Claude, Codex, OpenAI, and Vertex 2026-04-18 16:44:33 +08:00
hkfires
da43f63735 fix(tests): update Gemini family test case numbers for consistency 2026-04-17 08:43:19 +08:00
hkfires
d9a3b3e5f3 fix(tests): update model lookup references and enhance Claude executor tests 2026-04-17 08:32:07 +08:00
Luis Pater
5dcca69e8c feat(models): add Claude Opus 4.7 model entry to registry JSON v6.9.28 2026-04-17 01:08:19 +08:00
Luis Pater
f5dc6483d5 chore: remove iFlow-related modules and dependencies
- Deleted `iflow` provider implementation, including thinking configuration (`apply.go`) and authentication modules.
- Removed iFlow-specific tests, executors, and helpers across SDK and internal components.
- Updated all references to exclude iFlow functionality.
2026-04-17 01:07:12 +08:00
Luis Pater
d949921143 feat(auth): add proxy URL override support to auth constructors and executors
- Introduced `WithProxyURL` variants for `CodexAuth`, `ClaudeAuth`, `IFlowAuth`, and `DeviceFlowClient`.
- Updated executors to use proxy-aware constructors for improved configurability.
- Added unit tests to validate proxy override precedence and functionality.

Closes: #2823
2026-04-16 22:11:39 +08:00
Luis Pater
7b03f04670 fix(handlers): include execution session metadata and skip idempotency key when absent
- Refactored `requestExecutionMetadata` to handle empty `Idempotency-Key` gracefully.
- Added test to validate metadata inclusion of execution session without idempotency key.
2026-04-16 21:44:32 +08:00
hkfires
1267fddf61 fix(docker-build): improve argument handling and error messaging for usage option 2026-04-16 09:19:03 +08:00
Luis Pater
ba454dbfbf Merge pull request #2817 from sususu98/fix/antigravity-strip-billing-header
fix(antigravity): strip billing header from system instruction before upstream call
v6.9.27
2026-04-16 02:46:05 +08:00
Luis Pater
d1508ca030 Merge pull request #2816 from sususu98/feat/session-affinity
feat(session-affinity): add session-sticky routing for multi-account load balancing
2026-04-16 02:45:31 +08:00
sususu98
d4a6a5ae15 fix(antigravity): strip billing header from system instruction before upstream call
The x-anthropic-billing-header block in the Claude system array is
client-internal metadata and should not be forwarded to the Gemini
upstream as part of systemInstruction.parts.
2026-04-16 00:19:01 +08:00
sususu98
7c24d54ca8 feat(session-affinity): add session-sticky routing for multi-account load balancing
When multiple auth credentials are configured, requests from the same
session are now routed to the same credential, improving upstream prompt
cache hit rates and maintaining context continuity.

Core components:
- SessionAffinitySelector: wraps RoundRobin/FillFirst selectors with
  session-to-auth binding; automatic failover when bound auth is
  unavailable, re-binding via the fallback selector for even distribution
- SessionCache: TTL-based in-memory cache with background cleanup
  goroutine, supporting per-session and per-auth invalidation
- StoppableSelector interface: lifecycle hook for selectors holding
  resources, called during Manager.StopAutoRefresh()

Session ID extraction priority (extractSessionIDs):
1. metadata.user_id with Claude Code session format (old
   user_{hash}_session_{uuid} and new JSON {session_id} format)
2. X-Session-ID header (generic client support)
3. metadata.user_id (non-Claude format, used as-is)
4. conversation_id field
5. Stable FNV hash from system prompt + first user/assistant messages
   (fallback for clients with no explicit session ID); returns both a
   full hash (primaryID) and a short hash without assistant content
   (fallbackID) to inherit bindings from the first turn

Multi-format message hash covers OpenAI messages, Claude system array,
Gemini contents/systemInstruction, and OpenAI Responses API input items
(including inline messages with role but no type field).

Configuration (config.yaml routing section):
- session-affinity: bool (default false)
- session-affinity-ttl: duration string (default "1h")
- claude-code-session-affinity: bool (deprecated, alias for above)
All three fields trigger selector rebuild on config hot reload.

Side effect: Idempotency-Key header is no longer auto-generated with a
random UUID when absent — only forwarded when explicitly provided by the
client, to avoid polluting session hash extraction.
2026-04-16 00:18:47 +08:00
Luis Pater
a4c1e32ff6 chore(models): remove outdated GPT-5 and related model entries from registry JSON 2026-04-15 20:37:32 +08:00
Luis Pater
f56cf42461 Merge pull request #2800 from sususu98/fix/antigravity-max-output-tokens-cap
fix(antigravity): cap maxOutputTokens using registry max_completion_tokens
2026-04-15 20:35:11 +08:00
Luis Pater
3dea1da249 Merge pull request #2782 from sususu98/fix/strip-invalid-signature-thinking-blocks
fix(antigravity): use E-prefixed fake signature in strict bypass test
2026-04-15 20:34:32 +08:00
Luis Pater
8fac29631d chore: remove Qwen support from SDK and internal components
- Deleted `QwenAuthenticator`, internal `qwen_auth`, and `qwen_executor` implementations.
- Removed all Qwen-related OAuth flows, token handling, and execution logic.
- Cleaned up dependencies and references to Qwen across the codebase.
v6.9.26
2026-04-15 12:16:08 +08:00
sususu98
8fecd625d2 fix(antigravity): cap maxOutputTokens using registry max_completion_tokens
Claude models on antigravity have a 64000 token output limit but
max_tokens from downstream requests was passed through uncapped,
causing 400 INVALID_ARGUMENT from Google when clients sent 128000.
2026-04-15 11:57:55 +08:00
sususu98
10b55b5ddd fix(antigravity): use E-prefixed fake signature in strict bypass test
The strict bypass test used testGeminiSignaturePayload() which produces
a base64 string starting with 'C'. Since StripInvalidSignatureThinkingBlocks
now strips all non-E/R signatures unconditionally, the test payload was
stripped before reaching ValidateClaudeBypassSignatures, causing the test
to pass the request through instead of rejecting it with 400.

Replace with testFakeClaudeSignature() which produces a base64 string
starting with 'E' (valid at the lightweight check) but with invalid
protobuf content (no valid field 2), so strict mode correctly rejects
it at the deep validation layer.
2026-04-14 15:46:02 +08:00
sususu98
41ae2c81e7 fix(antigravity): discard thinking blocks with non-Claude-format signatures
Proxy-generated thinking blocks may carry hex hashes or other non-Claude
signatures (e.g. "d5cb9cd0823142109f451861") from Gemini responses. These
are now discarded alongside empty-signature blocks during the strip phase,
before validation runs. Valid Claude signatures always start with 'E' or 'R'
(after stripping any cache prefix).
v6.9.25
2026-04-14 15:14:48 +08:00
sususu98
278a89824c fix(antigravity): strip thinking blocks with empty signatures instead of rejecting
Thinking blocks with empty signatures come from proxy-generated
responses (Antigravity/Gemini routed as Claude). These should be
silently dropped from the request payload before forwarding, not
rejected with 400. Fixes 10 "missing thinking signature" errors.
2026-04-14 15:14:48 +08:00
sususu98
c4459c4346 Merge pull request #2724 from sususu98/fix/skip-schema-cleanup-empty-tools
fix(antigravity): skip full schema cleanup for empty tool requests
v6.9.24
2026-04-12 14:05:47 +08:00
sususu98
61e0447f92 Merge pull request #2723 from sususu98/fix/drop-redacted-thinking-blocks
fix(antigravity): drop redacted thinking blocks with empty text
2026-04-12 14:05:41 +08:00
sususu98
1dc3018fd6 Merge pull request #2716 from sususu98/pr/antigravity-bypass-log-noise
fix(antigravity): reduce bypass mode log noise
2026-04-12 14:05:34 +08:00
sususu98
26fd3eff03 Merge pull request #2715 from sususu98/pr/antigravity-32mb-bypass-signatures
fix(antigravity): allow 32MB bypass signatures
2026-04-12 14:05:27 +08:00
Luis Pater
5bfaf8086b feat(auth): add configurable worker pool size for auto-refresh loop
- Introduced `auth-auto-refresh-workers` config option to override default concurrency.
- Updated `authAutoRefreshLoop` to support customizable worker counts.
- Enhanced token refresh scheduling flexibility by aligning worker pool with runtime configurations.
2026-04-12 13:56:05 +08:00
Luis Pater
6c0a1efd71 refactor(auth): simplify auth directory scanning and improve JSON processing logic
- Replaced `filepath.Walk` with `os.ReadDir` for cleaner directory traversal.
- Fixed `isAuthJSON` check to use `filepath.Dir` for directory comparison.
- Updated auth hash cache generation and file synthesis to improve readability and maintainability.
2026-04-12 13:32:03 +08:00
sususu98
f5ed5c7453 fix(antigravity): skip full schema cleanup for empty tool requests
Avoid whole-payload schema sanitization when translated Antigravity requests have no actual tool schemas, including missing and empty tools arrays. Add regression coverage so image-heavy no-tool requests keep bypassing the old memory amplification path.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-12 12:51:42 +08:00
sususu98
65158cce46 fix(antigravity): drop redacted thinking blocks with empty text
Antigravity wraps empty thinking text into a prompt-caching-scope
object that omits the required inner "thinking" field, causing 400
"messages.N.content.0.thinking.thinking: Field required" when Claude
Max requests are routed through Antigravity in bypass mode.
2026-04-12 12:30:43 +08:00
Luis Pater
a583463d60 feat(auth): implement auto-refresh loop for managing auth token schedule
- Introduced `authAutoRefreshLoop` to handle token refresh scheduling.
- Replaced semaphore-based refresh logic in `Manager` with the new loop.
- Added unit tests to verify refresh schedule logic and edge cases.
2026-04-12 02:06:40 +08:00
sususu98
8ed290c1c4 fix(antigravity): reduce bypass mode log noise
Keep cache-disable visibility at info level while suppressing duplicate state-change logs and moving strict-mode chatter down to debug.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-12 00:52:41 +08:00
sususu98
727221df2e fix(antigravity): allow 32MB bypass signatures
Raise the local bypass-signature ceiling so long Claude thinking signatures are not rejected before request translation, and keep the oversized-signature test cheap to execute.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-12 00:51:53 +08:00
Luis Pater
0ab1f5412f fix(executor): handle 429 Retry-After header and default retry logic for quota exhaustion
- Added proper parsing of `Retry-After` headers for 429 responses.
- Set default retry duration when "disable cooling" is active on quota exhaustion.
- Updated tests to verify `Retry-After` handling and default behavior.
v6.9.23
2026-04-11 21:04:55 +08:00
Luis Pater
9ded75d335 Merge pull request #2702 from AllenReder/docs/add-quota-inspector
docs(README): add CLIproxyAPI Quota Inspector to community projects list
2026-04-11 16:42:02 +08:00
Allen Yi
f135fdf7fc docs: clarify codex quota window wording in README locales 2026-04-11 16:39:32 +08:00
Luis Pater
828df80088 refactor(executor): remove immediate retry with token refresh on 429 for Qwen and update tests accordingly 2026-04-11 16:35:18 +08:00
Allen Yi
c585caa0ce docs: fix CLIProxyAPI Quota Inspector naming and link casing 2026-04-11 16:22:45 +08:00
Allen Yi
5bb69fa4ab docs: refine CLIproxyAPI Quota Inspector description in all README locales 2026-04-11 15:22:27 +08:00
Luis Pater
5ab9afac83 fix(executor): handle OAuth tool name remapping with rename detection and add tests
Closes: #2656
v6.9.22
2026-04-10 21:54:59 +08:00
Luis Pater
65ce86338b fix(executor): implement immediate retry with token refresh on 429 for Qwen and add associated tests
Closes: #2661
2026-04-10 21:12:03 +08:00
Chén Mù
2a97037d7b Merge pull request #2670 from sususu98/feat/antigravity-prefer-prod-url
feat(antigravity): prefer prod URL as first priority
v6.9.21
2026-04-10 19:43:27 +08:00
sususu98
d801393841 feat(antigravity): prefer prod URL as first priority
Promote cloudcode-pa.googleapis.com to the first position in the
fallback order, with daily and sandbox URLs as fallbacks.
2026-04-10 19:37:56 +08:00
Luis Pater
b2c0cdfc88 Merge pull request #2621 from wykk-12138/fix/oauth-extra-usage-detection
fix(claude): prevent OAuth extra-usage billing via tool name fingerprinting and system prompt cloaking
v6.9.20
2026-04-10 10:29:27 +08:00
Luis Pater
f32c8c9620 fix(handlers): update listener to bind on all interfaces instead of localhost
Fixed: #2640
2026-04-10 07:24:34 +08:00