Introduce OAuthRefreshAPI as the single entry point for all OAuth token
refresh operations, eliminating the race condition where background
refresh and inline refresh could simultaneously use the same
refresh_token (fixes#1035).
Key changes:
- Add OAuthRefreshExecutor interface extending TokenRefresher with CacheKey
- Add OAuthRefreshAPI.RefreshIfNeeded with lock → DB re-read → double-check flow
- Add ProviderRefreshPolicy / BackgroundRefreshPolicy strategy types
- Simplify all 4 TokenProviders to delegate to OAuthRefreshAPI
- Rewrite TokenRefreshService.refreshWithRetry to use unified API path
- Add MergeCredentials and BuildClaudeAccountCredentials helpers
- Add 40 unit tests covering all new and modified code paths
- Apply InboundEndpointMiddleware to all gateway route groups
- Replace normalizedOpenAIInboundEndpoint/normalizedOpenAIUpstreamEndpoint and normalizedGatewayInboundEndpoint/normalizedGatewayUpstreamEndpoint with GetInboundEndpoint/GetUpstreamEndpoint
- Remove 4 old constants and 4 old normalization functions (-70 lines)
- Migrate existing endpoint normalization test to new API
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Introduce endpoint.go with shared constants, NormalizeInboundEndpoint, DeriveUpstreamEndpoint, InboundEndpointMiddleware, and context helpers. This replaces the two separate normalization implementations (OpenAI and Gateway) with a single source of truth. Includes comprehensive test coverage.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Extend RecordUsageInput and RecordUsageLongContextInput structs with InboundEndpoint and UpstreamEndpoint so that Claude, Gemini, and Sora handlers can record endpoint info alongside OpenAI handlers.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Replace t.Add(24*time.Hour - time.Nanosecond) with t.AddDate(0, 0, 1) and use SQL < instead of <= for end-of-day boundaries. This avoids edge-case misses around DST transitions.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
The upstream Gemini API returns "Insufficient account balance" which
doesn't contain the substring "insufficient balance". Add explicit
match for the full phrase to ensure the filter works correctly.
- Add 5th error filter switch IgnoreInsufficientBalanceErrors to suppress
upstream insufficient balance / insufficient_quota errors from ops log
- Extract hardcoded error strings into package-level constants for
shouldSkipOpsErrorLog, normalizeOpsErrorType, classifyOpsPhase, and
classifyOpsIsBusinessLimited
- Define ErrNoAvailableAccounts sentinel error and replace all
errors.New("no available accounts") call sites
- Update tests to use require.ErrorIs with the sentinel error
Previously, v-model.number produced "" when input was cleared, causing
JSON decode errors on the backend. Also, normalizeLimit treated 0 as
"unlimited" which prevented setting a zero quota. Now "" is converted
to null (unlimited) in frontend, and 0 is preserved as a valid limit.
ClosesWei-Shaw/sub2api#1021
When Redis cache is populated from DB with a NULL window_1d_start, the
Lua increment script only updates usage counters without setting window
timestamps. IsWindowExpired(nil) previously returned false, so the
accumulated usage was never reset across time windows, effectively
turning usage_1d into a lifetime counter. Once this exceeded
rate_limit_1d the key was incorrectly blocked with "日限额已用完".
FixesWei-Shaw/sub2api#1022
Claude's output_config.effort parameter (low/medium/high/max) was not
being extracted from requests or logged in the reasoning_effort column
of usage logs. Only the OpenAI path populated this field.
Changes:
- Extract output_config.effort in ParseGatewayRequest
- Add ReasoningEffort field to ForwardResult
- Populate reasoning_effort in both RecordUsage and RecordUsageWithLongContext
- Guard against overwriting service-set effort values in handler
- Update stale comments that described reasoning_effort as OpenAI-only
- Add unit tests for extraction, normalization, and persistence
Increase MAX(bucket_start) query timeout from 3s to 5s to reduce
timeout-induced fallbacks. Shrink backfill window from 30 days to
1 hour so that fallback recomputation stays lightweight instead of
scanning the entire retention range.