4K image base64 data can exceed 40MB limit, causing "bufio.Scanner:
token too long" errors. Scanner is adaptive (starts at 64KB, grows
as needed), so increasing the cap has no impact on normal responses.
Extend the existing total quota limit with daily and weekly periodic
dimensions. Each dimension is independently configurable and uses lazy
reset — when the period expires, usage is automatically reset to zero on
the next increment. Any dimension exceeding its limit will pause the
account from scheduling.
Backend:
- Add GetQuotaDailyLimit/Used, GetQuotaWeeklyLimit/Used, HasAnyQuotaLimit
- Rewrite IncrementQuotaUsed with atomic CTE SQL for 3-dimension update
- Rewrite ResetQuotaUsed to clear all dimensions and period timestamps
- Update postUsageBilling to use HasAnyQuotaLimit()
- Preserve daily/weekly used values on account edit
Frontend:
- Refactor QuotaLimitCard from single v-model to 3-dimension props
- Add QuotaBadge component for compact D/W/$ display
- Update AccountCapacityCell with per-dimension badges
- Update Create/Edit modals with daily/weekly quota fields
- Update AccountActionMenu hasQuotaLimit to check all dimensions
- Add i18n strings for daily/weekly/total quota labels
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Anthropic Messages API support for OpenAI platform groups, enabling
clients using Claude-style /v1/messages format to access OpenAI accounts
through automatic protocol conversion.
- Add apicompat package with type definitions and bidirectional converters
(Anthropic ↔ Chat, Chat ↔ Responses, Anthropic ↔ Responses)
- Implement /v1/messages endpoint for OpenAI gateway with streaming support
- Add model mapping UI for OpenAI OAuth accounts (whitelist + mapping modes)
- Support prompt caching fields and codex OAuth transforms
- Fix tool call ID conversion for Responses API (fc_ prefix)
- Ensure function_call_output has non-empty output field
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract postUsageBilling() to consolidate billing logic across
GatewayService.RecordUsage, RecordUsageWithLongContext, and
OpenAIGatewayService.RecordUsage, eliminating ~120 lines of
duplicated code
- Fix account quota to use TotalCost × accountRateMultiplier
(was using raw TotalCost, inconsistent with account cost stats)
- Fix RecordUsageWithLongContext API Key quota only updating in
balance mode (now updates regardless of billing type)
- Fix WebSocket client disconnect detection on Windows by adding
"an established connection was aborted" to known disconnect errors
- Use TxPipeline (MULTI/EXEC) instead of Pipeline for atomic INCR+EXPIRE
- Filter negative values in GetBaseRPM(), update test expectation
- Add RPM batch query (GetRPMBatch) to account List API
- Add warn logs for RPM increment failures in gateway handler
- Reset enableRpmLimit on BulkEditAccountModal close
- Use union type 'tiered' | 'sticky_exempt' for rpmStrategy refs
- Add design decision comments for rdb.Time() RTT trade-off
- Add migration 060 to update model_mapping for all antigravity accounts
- Remove gemini-3-pro-image and gemini-3-pro-image-preview mappings
- Add gemini-3.1-flash-image and gemini-3.1-flash-image-preview mappings
- Update frontend usage window to show GImage for new model
- Update isImageGenerationModel to support new model
PR #635 returned HTTP 200 with {"input_tokens": 0} when upstream doesn't
support count_tokens (404). This caused Claude Code CLI to trust the zero
value, believing context uses 0 tokens, so auto-compression never triggers.
Fix: return 404 with proper error body so CLI falls back to its local
tokenizer for accurate estimation. Return nil (not error) to avoid
polluting ops error metrics with expected 404s.
Affected paths:
- Passthrough APIKey accounts: upstream 404 now passed through as 404
- Antigravity accounts: same fix (was also returning fake 200)
Remove the special case that bypassed model-supported checks for Gemini
API Key accounts, allowing model_mapping to filter requests properly.
Add tests for multiplatform model filtering behavior.