Remove SimulateClaudeMaxEnabled field and related logic from
admin_service.go, and remove applyClaudeMaxCacheBillingPolicyToUsage,
applyClaudeMaxNonStreamingRewrite, setupClaudeMaxStreamingHook calls
from antigravity_gateway_service.go. These symbols are not yet
available in upstream/main.
Replace process-memory sync.Map + per-model runtime state with a single
"AICredits" key in model_rate_limits, making credits exhaustion fully
isomorphic with model-level rate limiting.
Scheduler: rate-limited accounts with overages enabled + credits available
are now scheduled instead of excluded.
Forwarding: when model is rate-limited + credits available, inject credits
proactively without waiting for a 429 round trip.
Storage: credits exhaustion stored as model_rate_limits["AICredits"] with
5h duration, reusing SetModelRateLimit/isRateLimitActiveForKey.
Frontend: show credits_active (yellow ⚡) when model rate-limited but
credits available, credits_exhausted (red) when AICredits key active.
Tests: add unit tests for shouldMarkCreditsExhausted, injectEnabledCreditTypes,
clearCreditsExhausted, and update existing overages tests.
The upstream Gemini API returns "Insufficient account balance" which
doesn't contain the substring "insufficient balance". Add explicit
match for the full phrase to ensure the filter works correctly.
- Add 5th error filter switch IgnoreInsufficientBalanceErrors to suppress
upstream insufficient balance / insufficient_quota errors from ops log
- Extract hardcoded error strings into package-level constants for
shouldSkipOpsErrorLog, normalizeOpsErrorType, classifyOpsPhase, and
classifyOpsIsBusinessLimited
- Define ErrNoAvailableAccounts sentinel error and replace all
errors.New("no available accounts") call sites
- Update tests to use require.ErrorIs with the sentinel error
Previously, v-model.number produced "" when input was cleared, causing
JSON decode errors on the backend. Also, normalizeLimit treated 0 as
"unlimited" which prevented setting a zero quota. Now "" is converted
to null (unlimited) in frontend, and 0 is preserved as a valid limit.
ClosesWei-Shaw/sub2api#1021
When Redis cache is populated from DB with a NULL window_1d_start, the
Lua increment script only updates usage counters without setting window
timestamps. IsWindowExpired(nil) previously returned false, so the
accumulated usage was never reset across time windows, effectively
turning usage_1d into a lifetime counter. Once this exceeded
rate_limit_1d the key was incorrectly blocked with "日限额已用完".
FixesWei-Shaw/sub2api#1022
Claude's output_config.effort parameter (low/medium/high/max) was not
being extracted from requests or logged in the reasoning_effort column
of usage logs. Only the OpenAI path populated this field.
Changes:
- Extract output_config.effort in ParseGatewayRequest
- Add ReasoningEffort field to ForwardResult
- Populate reasoning_effort in both RecordUsage and RecordUsageWithLongContext
- Guard against overwriting service-set effort values in handler
- Update stale comments that described reasoning_effort as OpenAI-only
- Add unit tests for extraction, normalization, and persistence
Increase MAX(bucket_start) query timeout from 3s to 5s to reduce
timeout-induced fallbacks. Shrink backfill window from 30 days to
1 hour so that fallback recomputation stays lightweight instead of
scanning the entire retention range.