2322 Commits

Author SHA1 Message Date
Luis Pater
f1ba6151a9 feat(codex): pass base model to enable conditional image_generation tool injection
- Modified `ensureImageGenerationTool` to accept `baseModel` for conditional logic.
- Ensured `gpt-5.3-codex-spark` models bypass image_generation tool injection.
- Updated relevant tests and executor logic to reflect changes.
v6.9.36
2026-04-24 07:21:03 +08:00
Luis Pater
7b89583cf8 chore(models): remove GPT-5.5 model entry from registry JSON 2026-04-24 05:07:03 +08:00
Luis Pater
1576d14137 Merge pull request #2989 from ben-vargas/gpt-5-5-support
Align GPT-5.5 Codex metadata with runtime cache
2026-04-24 04:24:58 +08:00
Ben Vargas
736018a0b0 Add GPT-5.5 Codex model support 2026-04-23 13:43:02 -06:00
Luis Pater
7d5f6d9382 feat(models): add GPT-5.5 model entry to registry JSON 2026-04-24 02:43:12 +08:00
sususu98
12195a276e Merge pull request #2971 from sususu98/feat/antigravity-credits-fallback
feat(antigravity): conductor-level credits fallback for Claude models
v6.9.35
2026-04-24 00:15:23 +08:00
Luis Pater
25137b1984 feat(logging): add AI API path support for image routes
- Included `/v1/images` in AI API path prefixes.
- Introduced tests to validate `/v1/images/generations` and `/v1/images/edits` as AI API paths.
2026-04-24 00:11:42 +08:00
sususu98
7ad1900041 perf(antigravity): async credits hint refresh for warm tokens 2026-04-23 23:58:10 +08:00
Luis Pater
8eb56e5602 Merge pull request #2962 from MoYeRanqianzhi/feat/codex-image-generation-tool-injection
feat(codex): inject image_generation tool + route aliases for Codex CLI image generation
2026-04-23 23:48:31 +08:00
sususu98
f130846ec1 fix(auth): break credits cold-start deadlock by keeping unknown-hint auths as fallback candidates
Replace antigravityCreditsAvailableForModel with inline known/unknown
split. Auths whose credit hints are not yet populated are kept as
lower-priority candidates instead of being rejected, breaking the
chicken-and-egg deadlock at cold start.
2026-04-23 22:47:51 +08:00
sususu98
920b6efffa refactor(logging): strip unrelated deferred body changes, keep credits-only logging
Remove deferred body optimization and maxErrorLog constants that were
unrelated to credits fallback. Keep only MarkCreditsUsed/CreditsUsed
helpers for flagging requests that consumed AI credits.
2026-04-23 17:41:54 +08:00
sususu98
e75daa299b fix(antigravity): respect pinned auth in credits fallback, release deferred body on success
- findAllAntigravityCreditsCandidateAuths now filters by PinnedAuthMetadataKey
  to prevent credential isolation violations during credits fallback
- Release deferredBody reference on success path to avoid holding large
  payloads in memory for the lifetime of the gin context
2026-04-23 17:38:02 +08:00
sususu98
4de5c29f86 fix(antigravity): remove credits fallback from CountTokens, fix gofmt
CountTokens upstream API does not support enabledCreditTypes, so
remove the dead credits fallback path from ExecuteCount and delete
the unused tryAntigravityCreditsExecuteCount method. Fix gofmt on
credits test file.
2026-04-23 15:17:00 +08:00
sususu98
14d46a0a5d feat(antigravity): conductor-level credits fallback for Claude models
Move credits handling from executor-level retry to conductor-level
orchestration. When all free-tier auths are exhausted (429/503), the
conductor discovers auths with available Google One AI credits and
retries with enabledCreditTypes injected via context flag.

Key changes:
- Add AntigravityCreditsHint system for tracking per-auth credits state
- Conductor tries credits fallback after all auths fail (Execute/Stream/Count)
- Executor injects enabledCreditTypes only when conductor sets context flag
- Credits fallback respects provider scope (requires antigravity in providers)
- Add context cancellation check in credits fallback to avoid wasted requests
- Remove executor-level attemptCreditsFallback and preferCredits machinery
- Restructure 429 decision logic (parse details first, keyword fallback)
- Expand shouldAbort to cover INVALID_ARGUMENT/FAILED_PRECONDITION/500+UNKNOWN
- Support human-readable retry delay parsing (e.g. "1h43m56s")
2026-04-23 13:44:20 +08:00
MoYeRanQianZhi
31934ae04c feat(codex): enable image generation for all Codex upstream requests
Codex CLI gates the built-in image_generation tool behind
AuthMode::Chatgpt (OAuth only). When clients connect via API key
auth through CPA, the tool is absent from requests, making image
generation unavailable through the reverse proxy.

Changes:

1. Inject image_generation tool (codex_executor.go):
   Add ensureImageGenerationTool() that appends
   {"type":"image_generation","output_format":"png"} to the tools
   array if not already present. Applied to all three execution
   paths: Execute, executeCompact, and ExecuteStream.

2. Route aliases for Codex CLI direct access (server.go):
   Add /backend-api/codex/responses routes that map to the same
   OpenAI Responses API handlers as /v1/responses. This allows
   Codex CLI to connect via chatgpt_base_url config while keeping
   AuthMode::Chatgpt, which enables the built-in image_generation
   tool on the client side.

3. Unit tests (codex_executor_imagegen_test.go):
   Cover no-tools, existing tools, already-present, empty array,
   and mixed built-in tool scenarios.
2026-04-23 01:24:40 +08:00
Luis Pater
a188159632 fix(handlers): remove references to unsupported n parameter in OpenAI image handlers v6.9.34 2026-04-22 21:28:17 +08:00
Luis Pater
fd71960c3e fix(handlers): remove handling of unsupported n parameter in OpenAI image handlers v6.9.33 2026-04-22 21:12:50 +08:00
Luis Pater
e935196df4 feat(models): add hardcoded GPT-Image-2 model support in Codex
- Added `GPT-Image-2` as a built-in model to avoid dependency on remote updates for Codex.
- Updated model tier functions (`CodexFree`, `CodexTeam`, etc.) to include built-in models via `WithCodexBuiltins`.
- Introduced new handlers for image generation and edit operations under `OpenAIAPIHandler`.
- Extended tests to validate 503 response for unsupported image model requests.
v6.9.32
2026-04-22 20:51:13 +08:00
Luis Pater
4fc2c619fb feat(models): add Kimi K2.6 model entry to registry JSON 2026-04-21 20:53:03 +08:00
Luis Pater
8ced7a548f Merge pull request #2834 from muzhi1991/fix/openai-compat-host-header
fix(util): forward custom Host header to upstream
v6.9.31
2026-04-21 20:29:14 +08:00
Luis Pater
3444820958 Merge pull request #2939 from stringer07/fix/codex-stream-output-backfill
fix(codex): backfill streaming response output
2026-04-21 20:18:57 +08:00
Luis Pater
1716a845eb feat(api): add support for HEAD requests to /healthz endpoint
- Refactored `/healthz` handler to support `HEAD` requests alongside `GET`.
- Updated tests to include validation for `HEAD` requests with expected status and empty body.

Closes: #2929
2026-04-21 20:16:18 +08:00
stringer07
b6781d69be perf(codex): avoid repeated output patch writes 2026-04-21 16:29:54 +08:00
stringer07
bb8408cef5 fix(codex): backfill streaming response output 2026-04-21 16:03:56 +08:00
Luis Pater
e6866ff19c feat(auth): add refresh backoff for ineffective token updates
- Introduced `refreshIneffectiveBackoff` to prevent tight-looping in auto-refresh when token refresh fails to update expiry.
- Adjusted refresh logic to apply backoff when `shouldRefresh` evaluates true.

Closes: #2830
2026-04-20 15:40:43 +08:00
Luis Pater
8f4a4eabfc feat(docs): add VisionCoder sponsorship details and optimize external links
- Added VisionCoder sponsorship information to `README.md`, `README_CN.md`, and `README_JA.md`.
- Updated external links to include `target="_blank"` for improved user experience.
- Added new logo asset `visioncoder.png` for README use.
v6.9.30
2026-04-19 23:00:09 +08:00
Luis Pater
e05abec01f Merge pull request #2898 from octo-patch/fix/issue-2866-remove-obsolete-1m-beta-header
fix(executor): drop obsolete context-1m-2025-08-07 beta header
2026-04-19 11:46:21 +08:00
octo-patch
f4eb16102b fix(executor): drop obsolete context-1m-2025-08-07 beta header (fixes #2866)
Anthropic has moved the 1M-context-window feature to General Availability,
so the context-1m-2025-08-07 beta flag is no longer accepted and now causes
400 Bad Request errors when forwarded upstream.

Remove the X-CPA-CLAUDE-1M detection and the corresponding injection of the
now-invalid beta header.  Also drop the unused net/textproto import that was
only needed for the header-key lookup.
2026-04-19 10:38:16 +08:00
Luis Pater
86c856f56f feat(translator): add partial and full image generation support in Codex-GPT and Codex-Gemini flows
- Introduced `LastImageHashByItemID` in Codex-GPT and `LastImageHashByID` in Codex-Gemini for deduplication of generated images.
- Added support for handling `partial_image` and `image_generation_call` types, with inline data embedding for Gemini and URL payload conversion for GPT.
- Extended unit tests to verify image handling in both streaming and non-streaming modes.
2026-04-19 03:21:59 +08:00
Supra4E8C
c6baa64b4e Merge pull request #2892 from router-for-me/fix-provider
feat(api): integrate auth index into key retrieval endpoints for Gemi…
v6.9.29
2026-04-18 17:24:43 +08:00
Supra4E8C
a64141a9a6 fix(tests): remove obsolete config_auth_index_test file 2026-04-18 17:22:16 +08:00
Supra4E8C
c26936e2e6 fix(management): stabilize auth-index mapping 2026-04-18 17:12:14 +08:00
Supra4E8C
894baad829 feat(api): integrate auth index into key retrieval endpoints for Gemini, Claude, Codex, OpenAI, and Vertex 2026-04-18 16:44:33 +08:00
muzhi1991
eba561bf6f fix(util): also keep Host in header map for synthetic requests
Addressing the P1 note from the Codex reviewer: applyCustomHeaders is
also called with a synthetic &http.Request{Header: ...} from the
websockets executors (aistudio_executor.go, codex_websockets_executor.go),
which forward only the header map. The previous continue meant a custom
Host was dropped from that map, regressing virtual-host overrides on
those flows. Mirror the value to both r.Host (for real net/http) and
r.Header (for header-map-only consumers).
2026-04-17 09:28:59 +08:00
hkfires
da43f63735 fix(tests): update Gemini family test case numbers for consistency 2026-04-17 08:43:19 +08:00
hkfires
d9a3b3e5f3 fix(tests): update model lookup references and enhance Claude executor tests 2026-04-17 08:32:07 +08:00
Luis Pater
5dcca69e8c feat(models): add Claude Opus 4.7 model entry to registry JSON v6.9.28 2026-04-17 01:08:19 +08:00
Luis Pater
f5dc6483d5 chore: remove iFlow-related modules and dependencies
- Deleted `iflow` provider implementation, including thinking configuration (`apply.go`) and authentication modules.
- Removed iFlow-specific tests, executors, and helpers across SDK and internal components.
- Updated all references to exclude iFlow functionality.
2026-04-17 01:07:12 +08:00
Luis Pater
d949921143 feat(auth): add proxy URL override support to auth constructors and executors
- Introduced `WithProxyURL` variants for `CodexAuth`, `ClaudeAuth`, `IFlowAuth`, and `DeviceFlowClient`.
- Updated executors to use proxy-aware constructors for improved configurability.
- Added unit tests to validate proxy override precedence and functionality.

Closes: #2823
2026-04-16 22:11:39 +08:00
Luis Pater
7b03f04670 fix(handlers): include execution session metadata and skip idempotency key when absent
- Refactored `requestExecutionMetadata` to handle empty `Idempotency-Key` gracefully.
- Added test to validate metadata inclusion of execution session without idempotency key.
2026-04-16 21:44:32 +08:00
muzhi1991
8f9e6622b0 fix(util): forward custom Host header to upstream
Custom headers configured under openai-compatibility (and any other
provider passing through applyCustomHeaders) were silently dropped for
the Host key, because Go's net/http reads the wire Host from
req.Host, not req.Header["Host"]. As a result, virtual-host routed
upstreams (e.g. LiteLLM behind an ingress) saw the base-url's host
instead of the user-configured override and returned 404.

Detect the Host key with http.CanonicalHeaderKey and assign it to
req.Host so it is actually written on the wire. Other headers continue
to use Header.Set as before.

Fixes #2833
2026-04-16 20:45:37 +08:00
hkfires
1267fddf61 fix(docker-build): improve argument handling and error messaging for usage option 2026-04-16 09:19:03 +08:00
Luis Pater
ba454dbfbf Merge pull request #2817 from sususu98/fix/antigravity-strip-billing-header
fix(antigravity): strip billing header from system instruction before upstream call
v6.9.27
2026-04-16 02:46:05 +08:00
Luis Pater
d1508ca030 Merge pull request #2816 from sususu98/feat/session-affinity
feat(session-affinity): add session-sticky routing for multi-account load balancing
2026-04-16 02:45:31 +08:00
sususu98
d4a6a5ae15 fix(antigravity): strip billing header from system instruction before upstream call
The x-anthropic-billing-header block in the Claude system array is
client-internal metadata and should not be forwarded to the Gemini
upstream as part of systemInstruction.parts.
2026-04-16 00:19:01 +08:00
sususu98
7c24d54ca8 feat(session-affinity): add session-sticky routing for multi-account load balancing
When multiple auth credentials are configured, requests from the same
session are now routed to the same credential, improving upstream prompt
cache hit rates and maintaining context continuity.

Core components:
- SessionAffinitySelector: wraps RoundRobin/FillFirst selectors with
  session-to-auth binding; automatic failover when bound auth is
  unavailable, re-binding via the fallback selector for even distribution
- SessionCache: TTL-based in-memory cache with background cleanup
  goroutine, supporting per-session and per-auth invalidation
- StoppableSelector interface: lifecycle hook for selectors holding
  resources, called during Manager.StopAutoRefresh()

Session ID extraction priority (extractSessionIDs):
1. metadata.user_id with Claude Code session format (old
   user_{hash}_session_{uuid} and new JSON {session_id} format)
2. X-Session-ID header (generic client support)
3. metadata.user_id (non-Claude format, used as-is)
4. conversation_id field
5. Stable FNV hash from system prompt + first user/assistant messages
   (fallback for clients with no explicit session ID); returns both a
   full hash (primaryID) and a short hash without assistant content
   (fallbackID) to inherit bindings from the first turn

Multi-format message hash covers OpenAI messages, Claude system array,
Gemini contents/systemInstruction, and OpenAI Responses API input items
(including inline messages with role but no type field).

Configuration (config.yaml routing section):
- session-affinity: bool (default false)
- session-affinity-ttl: duration string (default "1h")
- claude-code-session-affinity: bool (deprecated, alias for above)
All three fields trigger selector rebuild on config hot reload.

Side effect: Idempotency-Key header is no longer auto-generated with a
random UUID when absent — only forwarded when explicitly provided by the
client, to avoid polluting session hash extraction.
2026-04-16 00:18:47 +08:00
Luis Pater
a4c1e32ff6 chore(models): remove outdated GPT-5 and related model entries from registry JSON 2026-04-15 20:37:32 +08:00
Luis Pater
f56cf42461 Merge pull request #2800 from sususu98/fix/antigravity-max-output-tokens-cap
fix(antigravity): cap maxOutputTokens using registry max_completion_tokens
2026-04-15 20:35:11 +08:00
Luis Pater
3dea1da249 Merge pull request #2782 from sususu98/fix/strip-invalid-signature-thinking-blocks
fix(antigravity): use E-prefixed fake signature in strict bypass test
2026-04-15 20:34:32 +08:00
Luis Pater
8fac29631d chore: remove Qwen support from SDK and internal components
- Deleted `QwenAuthenticator`, internal `qwen_auth`, and `qwen_executor` implementations.
- Removed all Qwen-related OAuth flows, token handling, and execution logic.
- Cleaned up dependencies and references to Qwen across the codebase.
v6.9.26
2026-04-15 12:16:08 +08:00