TestConnection now reuses antigravityRetryLoop instead of a standalone
HTTP loop, gaining credits overages, smart retry, and 429/503 backoff
for free. AccountSwitchError is caught and surfaced as a friendly
message. Also populates RateLimitedModel in TempUnscheduled switch error.
Test fixes:
- Use RATE_LIMIT_EXCEEDED in 503 short-delay test to avoid 60x1s timeout
- Clamp waitDuration=0 instead of 999s to avoid 15s max-wait timeout
- Enhance mockSmartRetryUpstream with repeatLast and body caching
The expanded user breakdown rows already align with the parent table
columns (Requests, Token, Actual, Standard), so the repeated sub-header
wastes vertical space. Remove the <thead> from UserBreakdownSubTable.
Click on a group name, model name, or endpoint name in the distribution
tables to expand and show per-user usage breakdown (requests, tokens,
actual cost, standard cost).
Backend: new GET /admin/dashboard/user-breakdown API with group_id,
model, endpoint, endpoint_type filters.
Frontend: clickable rows with expand/collapse sub-table in all three
distribution charts.
OAuth upstreams (ChatGPT) reject requests containing role:"system" in
the input array with HTTP 400 "System messages are not allowed". Extract
such items before forwarding and merge their content into the top-level
instructions field, prepending to any existing value.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Docker named volumes and host bind-mounts may be owned by root,
causing "open data/model_pricing.sha256: permission denied" when
the container runs as the non-root sub2api user.
Add an entrypoint script that fixes /app/data ownership before
dropping to sub2api via su-exec. Replace USER directive with the
entrypoint approach across all three Dockerfiles and update both
GoReleaser configs to include the script in Docker build contexts.
The usage stats query defaults to a 30-day rolling window, but the
UI label said "Total"/"累计" implying lifetime aggregation. Rename
to "Last 30d"/"近30天" so the label matches the actual query semantics.
Closes#1060
Antigravity streaming handlers were missing the keepalive mechanism
that exists in the standard gateway, causing proxy/CDN idle timeouts
to break connections during long thinking phases (e.g. claude-opus-4-6).
This resulted in truncated responses with missing tool calls.
Add StreamKeepaliveInterval support to all three Antigravity streaming
paths: Claude SSE, Gemini SSE, and upstream passthrough.
- Simplify OpenAI rendering: always fetch /usage, prefer fetched data over
codex snapshot (snapshot serves as loading placeholder only)
- Remove dead code: preferFetchedOpenAIUsage, isOpenAICodexSnapshotStale,
and unreachable template branch
- Add today-stats support for key accounts (req/tokens/A/U badges)
- Use formatCompactNumber for consistent number formatting
- Add A/U badge titles for clarity
- Filter zero-value window stats in UsageProgressBar to avoid empty badges
- Update tests to match new fetched-data-first behavior
- Pass todayStats/todayStatsLoading to AccountUsageCell for key accounts
- Propagate usageManualRefreshToken to force usage reload on explicit refresh
- Refresh today stats when toggling usage/today_stats columns visible
Removes hasMeaningfulWindowStats guard so the /usage endpoint consistently
returns WindowStats for both time windows. The frontend now controls
zero-value display filtering at the component level.
The updateRateLimitUsageScript Lua script previously performed
unconditional HINCRBYFLOAT on all usage counters without checking
whether the rate limit window had expired. This caused usage to
accumulate across window boundaries in Redis while the DB correctly
reset on expiration, leading to incorrect 429 rate limiting that
could persist for up to 24 hours.
The Lua script now checks each window timestamp before incrementing:
- If the window has expired, usage is reset to the current cost and
the window timestamp is updated (matching DB-side semantics)
- If the window is still valid, usage is accumulated normally
This also resolves the async race condition where stale HINCRBYFLOAT
tasks from the worker queue could pollute a freshly rebuilt cache
after invalidation, since the script now self-corrects expired windows.
Closes#1049
- Fix gofmt alignment in admin_service.go and trailing newline in
antigravity_credits_overages.go
- Suppress errcheck for fmt.Sscanf in client.go GetMinimumAmount
Remove SimulateClaudeMaxEnabled field and related logic from
admin_service.go, and remove applyClaudeMaxCacheBillingPolicyToUsage,
applyClaudeMaxNonStreamingRewrite, setupClaudeMaxStreamingHook calls
from antigravity_gateway_service.go. These symbols are not yet
available in upstream/main.
Replace process-memory sync.Map + per-model runtime state with a single
"AICredits" key in model_rate_limits, making credits exhaustion fully
isomorphic with model-level rate limiting.
Scheduler: rate-limited accounts with overages enabled + credits available
are now scheduled instead of excluded.
Forwarding: when model is rate-limited + credits available, inject credits
proactively without waiting for a 429 round trip.
Storage: credits exhaustion stored as model_rate_limits["AICredits"] with
5h duration, reusing SetModelRateLimit/isRateLimitActiveForKey.
Frontend: show credits_active (yellow ⚡) when model rate-limited but
credits available, credits_exhausted (red) when AICredits key active.
Tests: add unit tests for shouldMarkCreditsExhausted, injectEnabledCreditTypes,
clearCreditsExhausted, and update existing overages tests.