- Fix gofmt alignment in admin_service.go and trailing newline in
antigravity_credits_overages.go
- Suppress errcheck for fmt.Sscanf in client.go GetMinimumAmount
Remove SimulateClaudeMaxEnabled field and related logic from
admin_service.go, and remove applyClaudeMaxCacheBillingPolicyToUsage,
applyClaudeMaxNonStreamingRewrite, setupClaudeMaxStreamingHook calls
from antigravity_gateway_service.go. These symbols are not yet
available in upstream/main.
Replace process-memory sync.Map + per-model runtime state with a single
"AICredits" key in model_rate_limits, making credits exhaustion fully
isomorphic with model-level rate limiting.
Scheduler: rate-limited accounts with overages enabled + credits available
are now scheduled instead of excluded.
Forwarding: when model is rate-limited + credits available, inject credits
proactively without waiting for a 429 round trip.
Storage: credits exhaustion stored as model_rate_limits["AICredits"] with
5h duration, reusing SetModelRateLimit/isRateLimitActiveForKey.
Frontend: show credits_active (yellow ⚡) when model rate-limited but
credits available, credits_exhausted (red) when AICredits key active.
Tests: add unit tests for shouldMarkCreditsExhausted, injectEnabledCreditTypes,
clearCreditsExhausted, and update existing overages tests.
Introduce OAuthRefreshAPI as the single entry point for all OAuth token
refresh operations, eliminating the race condition where background
refresh and inline refresh could simultaneously use the same
refresh_token (fixes#1035).
Key changes:
- Add OAuthRefreshExecutor interface extending TokenRefresher with CacheKey
- Add OAuthRefreshAPI.RefreshIfNeeded with lock → DB re-read → double-check flow
- Add ProviderRefreshPolicy / BackgroundRefreshPolicy strategy types
- Simplify all 4 TokenProviders to delegate to OAuthRefreshAPI
- Rewrite TokenRefreshService.refreshWithRetry to use unified API path
- Add MergeCredentials and BuildClaudeAccountCredentials helpers
- Add 40 unit tests covering all new and modified code paths
- Apply InboundEndpointMiddleware to all gateway route groups
- Replace normalizedOpenAIInboundEndpoint/normalizedOpenAIUpstreamEndpoint and normalizedGatewayInboundEndpoint/normalizedGatewayUpstreamEndpoint with GetInboundEndpoint/GetUpstreamEndpoint
- Remove 4 old constants and 4 old normalization functions (-70 lines)
- Migrate existing endpoint normalization test to new API
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Introduce endpoint.go with shared constants, NormalizeInboundEndpoint, DeriveUpstreamEndpoint, InboundEndpointMiddleware, and context helpers. This replaces the two separate normalization implementations (OpenAI and Gateway) with a single source of truth. Includes comprehensive test coverage.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Extend RecordUsageInput and RecordUsageLongContextInput structs with InboundEndpoint and UpstreamEndpoint so that Claude, Gemini, and Sora handlers can record endpoint info alongside OpenAI handlers.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Replace t.Add(24*time.Hour - time.Nanosecond) with t.AddDate(0, 0, 1) and use SQL < instead of <= for end-of-day boundaries. This avoids edge-case misses around DST transitions.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
The upstream Gemini API returns "Insufficient account balance" which
doesn't contain the substring "insufficient balance". Add explicit
match for the full phrase to ensure the filter works correctly.
- Add 5th error filter switch IgnoreInsufficientBalanceErrors to suppress
upstream insufficient balance / insufficient_quota errors from ops log
- Extract hardcoded error strings into package-level constants for
shouldSkipOpsErrorLog, normalizeOpsErrorType, classifyOpsPhase, and
classifyOpsIsBusinessLimited
- Define ErrNoAvailableAccounts sentinel error and replace all
errors.New("no available accounts") call sites
- Update tests to use require.ErrorIs with the sentinel error
Previously, v-model.number produced "" when input was cleared, causing
JSON decode errors on the backend. Also, normalizeLimit treated 0 as
"unlimited" which prevented setting a zero quota. Now "" is converted
to null (unlimited) in frontend, and 0 is preserved as a valid limit.
ClosesWei-Shaw/sub2api#1021
When Redis cache is populated from DB with a NULL window_1d_start, the
Lua increment script only updates usage counters without setting window
timestamps. IsWindowExpired(nil) previously returned false, so the
accumulated usage was never reset across time windows, effectively
turning usage_1d into a lifetime counter. Once this exceeded
rate_limit_1d the key was incorrectly blocked with "日限额已用完".
FixesWei-Shaw/sub2api#1022
Claude's output_config.effort parameter (low/medium/high/max) was not
being extracted from requests or logged in the reasoning_effort column
of usage logs. Only the OpenAI path populated this field.
Changes:
- Extract output_config.effort in ParseGatewayRequest
- Add ReasoningEffort field to ForwardResult
- Populate reasoning_effort in both RecordUsage and RecordUsageWithLongContext
- Guard against overwriting service-set effort values in handler
- Update stale comments that described reasoning_effort as OpenAI-only
- Add unit tests for extraction, normalization, and persistence