Claude's output_config.effort parameter (low/medium/high/max) was not
being extracted from requests or logged in the reasoning_effort column
of usage logs. Only the OpenAI path populated this field.
Changes:
- Extract output_config.effort in ParseGatewayRequest
- Add ReasoningEffort field to ForwardResult
- Populate reasoning_effort in both RecordUsage and RecordUsageWithLongContext
- Guard against overwriting service-set effort values in handler
- Update stale comments that described reasoning_effort as OpenAI-only
- Add unit tests for extraction, normalization, and persistence
- Use TxPipeline (MULTI/EXEC) instead of Pipeline for atomic INCR+EXPIRE
- Filter negative values in GetBaseRPM(), update test expectation
- Add RPM batch query (GetRPMBatch) to account List API
- Add warn logs for RPM increment failures in gateway handler
- Reset enableRpmLimit on BulkEditAccountModal close
- Use union type 'tiered' | 'sticky_exempt' for rpmStrategy refs
- Add design decision comments for rdb.Time() RTT trade-off
- Remove Gemini platform exclusion from model restriction UI in
Create/Edit account modals (Gemini now supports model_mapping)
- Remove outdated Gemini model passthrough info cards
- Add model_mapping field to GeminiCredentials type
- Extend warmup request interception toggle to Antigravity platform
- Remove redundant try/catch in API key account creation
- Remove noisy gateway.request_completed debug log
- Reorganize Gemini model mapping sections in constants.go
For retryable transient errors (Google 400 "invalid project resource name"
and empty stream responses), retry on the same account up to 2 times
(with 500ms delay) before switching to another account.
- Add RetryableOnSameAccount field to UpstreamFailoverError
- Add same-account retry loop in both Gemini and Claude/OpenAI handler paths
- Move temp-unschedule from service layer to handler layer (only after
all same-account retries exhausted)
- Reduce temp-unschedule cooldown from 30 minutes to 1 minute
ParseGatewayRequest only parsed Anthropic format (system/messages),
ignoring Gemini native format (systemInstruction/contents). This caused
GenerateSessionHash to produce identical hashes for all Gemini sessions.
Add protocol parameter to ParseGatewayRequest to branch between
Anthropic and Gemini parsing. Update GenerateSessionHash message
traversal to extract text from both formats.
Mix SessionContext (ClientIP, UserAgent, APIKeyID) into
GenerateSessionHash 3rd-level fallback to differentiate requests
from different users sending identical content.
Also switch hashContent from SHA256-truncated to XXHash64 for
better performance, and optimize Trie Lua script to match from
longest prefix first.
Upstream accounts now use the standard APIKey type instead of a dedicated
upstream type. GetBaseURL() and new GetGeminiBaseURL() automatically append
/antigravity for Antigravity platform APIKey accounts, eliminating the need
for separate upstream forwarding methods.
- Remove ForwardUpstream, ForwardUpstreamGemini, testUpstreamConnection
- Remove upstream branch guards in Forward/ForwardGemini/TestConnection
- Add migration 052 to convert existing upstream accounts to apikey
- Update frontend CreateAccountModal to create apikey type
- Add unit tests for GetBaseURL and GetGeminiBaseURL
Previously the /v1/usage endpoint aggregated usage stats (today/total
tokens, cost, RPM/TPM) across all API Keys belonging to the user.
This made it impossible to distinguish usage from different API Keys
(e.g. balance vs subscription keys).
Now the usage stats are filtered by the current request's API Key ID,
so each key only sees its own usage data. The balance/remaining fields
are unaffected and still reflect the user-level wallet balance.
Changes:
- Add GetAPIKeyDashboardStats to repository interface and implementation
- Add getPerformanceStatsByAPIKey helper (also fixes TPM to include
cache_creation_tokens and cache_read_tokens)
- Add GetAPIKeyDashboardStats to UsageService
- Update Usage handler to call GetAPIKeyDashboardStats(apiKey.ID)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>