- Added VisionCoder sponsorship information to `README.md`, `README_CN.md`, and `README_JA.md`.
- Updated external links to include `target="_blank"` for improved user experience.
- Added new logo asset `visioncoder.png` for README use.
Anthropic has moved the 1M-context-window feature to General Availability,
so the context-1m-2025-08-07 beta flag is no longer accepted and now causes
400 Bad Request errors when forwarded upstream.
Remove the X-CPA-CLAUDE-1M detection and the corresponding injection of the
now-invalid beta header. Also drop the unused net/textproto import that was
only needed for the header-key lookup.
- Introduced `LastImageHashByItemID` in Codex-GPT and `LastImageHashByID` in Codex-Gemini for deduplication of generated images.
- Added support for handling `partial_image` and `image_generation_call` types, with inline data embedding for Gemini and URL payload conversion for GPT.
- Extended unit tests to verify image handling in both streaming and non-streaming modes.
- Deleted `iflow` provider implementation, including thinking configuration (`apply.go`) and authentication modules.
- Removed iFlow-specific tests, executors, and helpers across SDK and internal components.
- Updated all references to exclude iFlow functionality.
- Introduced `WithProxyURL` variants for `CodexAuth`, `ClaudeAuth`, `IFlowAuth`, and `DeviceFlowClient`.
- Updated executors to use proxy-aware constructors for improved configurability.
- Added unit tests to validate proxy override precedence and functionality.
Closes: #2823
- Refactored `requestExecutionMetadata` to handle empty `Idempotency-Key` gracefully.
- Added test to validate metadata inclusion of execution session without idempotency key.
The x-anthropic-billing-header block in the Claude system array is
client-internal metadata and should not be forwarded to the Gemini
upstream as part of systemInstruction.parts.
When multiple auth credentials are configured, requests from the same
session are now routed to the same credential, improving upstream prompt
cache hit rates and maintaining context continuity.
Core components:
- SessionAffinitySelector: wraps RoundRobin/FillFirst selectors with
session-to-auth binding; automatic failover when bound auth is
unavailable, re-binding via the fallback selector for even distribution
- SessionCache: TTL-based in-memory cache with background cleanup
goroutine, supporting per-session and per-auth invalidation
- StoppableSelector interface: lifecycle hook for selectors holding
resources, called during Manager.StopAutoRefresh()
Session ID extraction priority (extractSessionIDs):
1. metadata.user_id with Claude Code session format (old
user_{hash}_session_{uuid} and new JSON {session_id} format)
2. X-Session-ID header (generic client support)
3. metadata.user_id (non-Claude format, used as-is)
4. conversation_id field
5. Stable FNV hash from system prompt + first user/assistant messages
(fallback for clients with no explicit session ID); returns both a
full hash (primaryID) and a short hash without assistant content
(fallbackID) to inherit bindings from the first turn
Multi-format message hash covers OpenAI messages, Claude system array,
Gemini contents/systemInstruction, and OpenAI Responses API input items
(including inline messages with role but no type field).
Configuration (config.yaml routing section):
- session-affinity: bool (default false)
- session-affinity-ttl: duration string (default "1h")
- claude-code-session-affinity: bool (deprecated, alias for above)
All three fields trigger selector rebuild on config hot reload.
Side effect: Idempotency-Key header is no longer auto-generated with a
random UUID when absent — only forwarded when explicitly provided by the
client, to avoid polluting session hash extraction.
- Deleted `QwenAuthenticator`, internal `qwen_auth`, and `qwen_executor` implementations.
- Removed all Qwen-related OAuth flows, token handling, and execution logic.
- Cleaned up dependencies and references to Qwen across the codebase.
Claude models on antigravity have a 64000 token output limit but
max_tokens from downstream requests was passed through uncapped,
causing 400 INVALID_ARGUMENT from Google when clients sent 128000.
The strict bypass test used testGeminiSignaturePayload() which produces
a base64 string starting with 'C'. Since StripInvalidSignatureThinkingBlocks
now strips all non-E/R signatures unconditionally, the test payload was
stripped before reaching ValidateClaudeBypassSignatures, causing the test
to pass the request through instead of rejecting it with 400.
Replace with testFakeClaudeSignature() which produces a base64 string
starting with 'E' (valid at the lightweight check) but with invalid
protobuf content (no valid field 2), so strict mode correctly rejects
it at the deep validation layer.
Proxy-generated thinking blocks may carry hex hashes or other non-Claude
signatures (e.g. "d5cb9cd0823142109f451861") from Gemini responses. These
are now discarded alongside empty-signature blocks during the strip phase,
before validation runs. Valid Claude signatures always start with 'E' or 'R'
(after stripping any cache prefix).
Thinking blocks with empty signatures come from proxy-generated
responses (Antigravity/Gemini routed as Claude). These should be
silently dropped from the request payload before forwarding, not
rejected with 400. Fixes 10 "missing thinking signature" errors.
- Introduced `auth-auto-refresh-workers` config option to override default concurrency.
- Updated `authAutoRefreshLoop` to support customizable worker counts.
- Enhanced token refresh scheduling flexibility by aligning worker pool with runtime configurations.
- Replaced `filepath.Walk` with `os.ReadDir` for cleaner directory traversal.
- Fixed `isAuthJSON` check to use `filepath.Dir` for directory comparison.
- Updated auth hash cache generation and file synthesis to improve readability and maintainability.
Avoid whole-payload schema sanitization when translated Antigravity requests have no actual tool schemas, including missing and empty tools arrays. Add regression coverage so image-heavy no-tool requests keep bypassing the old memory amplification path.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Antigravity wraps empty thinking text into a prompt-caching-scope
object that omits the required inner "thinking" field, causing 400
"messages.N.content.0.thinking.thinking: Field required" when Claude
Max requests are routed through Antigravity in bypass mode.
- Introduced `authAutoRefreshLoop` to handle token refresh scheduling.
- Replaced semaphore-based refresh logic in `Manager` with the new loop.
- Added unit tests to verify refresh schedule logic and edge cases.
Raise the local bypass-signature ceiling so long Claude thinking signatures are not rejected before request translation, and keep the oversized-signature test cheap to execute.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Added proper parsing of `Retry-After` headers for 429 responses.
- Set default retry duration when "disable cooling" is active on quota exhaustion.
- Updated tests to verify `Retry-After` handling and default behavior.