- Deleted `iflow` provider implementation, including thinking configuration (`apply.go`) and authentication modules.
- Removed iFlow-specific tests, executors, and helpers across SDK and internal components.
- Updated all references to exclude iFlow functionality.
When multiple auth credentials are configured, requests from the same
session are now routed to the same credential, improving upstream prompt
cache hit rates and maintaining context continuity.
Core components:
- SessionAffinitySelector: wraps RoundRobin/FillFirst selectors with
session-to-auth binding; automatic failover when bound auth is
unavailable, re-binding via the fallback selector for even distribution
- SessionCache: TTL-based in-memory cache with background cleanup
goroutine, supporting per-session and per-auth invalidation
- StoppableSelector interface: lifecycle hook for selectors holding
resources, called during Manager.StopAutoRefresh()
Session ID extraction priority (extractSessionIDs):
1. metadata.user_id with Claude Code session format (old
user_{hash}_session_{uuid} and new JSON {session_id} format)
2. X-Session-ID header (generic client support)
3. metadata.user_id (non-Claude format, used as-is)
4. conversation_id field
5. Stable FNV hash from system prompt + first user/assistant messages
(fallback for clients with no explicit session ID); returns both a
full hash (primaryID) and a short hash without assistant content
(fallbackID) to inherit bindings from the first turn
Multi-format message hash covers OpenAI messages, Claude system array,
Gemini contents/systemInstruction, and OpenAI Responses API input items
(including inline messages with role but no type field).
Configuration (config.yaml routing section):
- session-affinity: bool (default false)
- session-affinity-ttl: duration string (default "1h")
- claude-code-session-affinity: bool (deprecated, alias for above)
All three fields trigger selector rebuild on config hot reload.
Side effect: Idempotency-Key header is no longer auto-generated with a
random UUID when absent — only forwarded when explicitly provided by the
client, to avoid polluting session hash extraction.
- Deleted `QwenAuthenticator`, internal `qwen_auth`, and `qwen_executor` implementations.
- Removed all Qwen-related OAuth flows, token handling, and execution logic.
- Cleaned up dependencies and references to Qwen across the codebase.
- Introduced `auth-auto-refresh-workers` config option to override default concurrency.
- Updated `authAutoRefreshLoop` to support customizable worker counts.
- Enhanced token refresh scheduling flexibility by aligning worker pool with runtime configurations.
- Abort update when SHA256 digest mismatch is detected instead of
logging a warning and proceeding (prevents MITM asset replacement)
- Cap asset download size to 10 MB via io.LimitReader (defense-in-depth
against OOM from oversized responses)
- Add `auto-update-panel` config option (default: false) to make the
periodic background updater opt-in; the panel is still downloaded
on first access when missing, but no longer silently auto-updated
every 3 hours unless explicitly enabled
Default to generating a fresh random user_id per request instead of
reusing cached IDs. Add cache-user-id config option to opt in to the
previous caching behavior.
- Add CacheUserID field to CloakConfig
- Extract user_id cache logic to dedicated file
- Generate fresh user_id by default, cache only when enabled
- Add tests for both paths
- Introduced `passthrough-headers` option in configuration to control forwarding of upstream response headers.
- Updated handlers to respect the passthrough headers setting.
- Added tests to verify behavior when passthrough is enabled or disabled.
- Introduced unit tests for request logging middleware to enhance coverage.
- Added WebSocket-based Codex executor to support Responses API upgrade.
- Updated middleware logic to selectively capture request bodies for memory efficiency.
- Enhanced Codex configuration handling with new WebSocket attributes.
Update hardcoded X-Stainless-* and User-Agent defaults to match
Claude Code 2.1.44 / @anthropic-ai/sdk 0.74.0 (verified via
diagnostic proxy capture 2026-02-17).
Changes:
- X-Stainless-Os/Arch: dynamic via runtime.GOOS/GOARCH
- X-Stainless-Package-Version: 0.55.1 → 0.74.0
- X-Stainless-Timeout: 60 → 600
- User-Agent: claude-cli/1.0.83 (external, cli) → claude-cli/2.1.44 (external, sdk-cli)
Add claude-header-defaults config section so values can be updated
without recompilation when Claude Code releases new versions.
- Introduced a new `pprof` server to enable/debug HTTP profiling.
- Added configuration options for enabling/disabling and specifying the server address.
- Integrated pprof server lifecycle management with `Service`.
#1287
Introduce `Filter` rules in the payload configuration to remove specified JSON paths from the payload. Update related helper functions and add examples to `config.example.yaml`.
- Add ErrorLogsMaxFiles config field with default value 10
- Support hot-reload via config file changes
- Add Management API: GET/PUT/PATCH /v0/management/error-logs-max-files
- Maintain SDK backward compatibility with NewFileRequestLogger (3 params)
- Add NewFileRequestLoggerWithOptions for custom error log retention
When request logging is disabled, forced error logs are retained up to
the configured limit. Set to 0 to disable cleanup.
- Introduced `default-raw` and `override-raw` rules to handle raw JSON values.
- Enhanced `PayloadConfig` to validate and sanitize raw JSON payload rules.
- Updated executor logic to apply `default-raw` and `override-raw` rules.
- Extended example YAML to include usage of raw JSON rules.