Both short (<20s) and long (>=20s or missing) retryDelay values now retry once:
- Short: wait actual retryDelay, retry once
- Long/missing: wait 20s, retry once
- Still capacity exhausted: switch account
- Different error: let upper layer handle
- MODEL_CAPACITY_EXHAUSTED now uses independent retry strategy:
  - retryDelay < 20s: wait actual retryDelay then retry once
  - retryDelay >= 20s or missing: retry up to 5 times at 20s intervals
  - Still capacity exhausted after retries: switch account (failover)
  - Different error during retry (e.g. 429): handle by actual error code
  - No model rate limit set (capacity != rate limit)
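
The MODEL_CAPACITY_EXHAUSTED rules above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: every identifier here (retryCapacityExhausted, capacityRetryPlan, errSwitchAccount, seconds, noSleep) is hypothetical, and only the 20s threshold, the 20s interval, and the 5-attempt cap come from the notes above.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Values taken from the notes above; the constant names are hypothetical.
const (
	capacityShortThreshold = 20 * time.Second
	capacityRetryInterval  = 20 * time.Second
	capacityMaxLongRetries = 5
)

// Hypothetical sentinel errors used only for this sketch.
var (
	errCapacityExhausted = errors.New("MODEL_CAPACITY_EXHAUSTED")
	errRateLimited       = errors.New("429 rate limited")
	errSwitchAccount     = errors.New("switch account")
)

// capacityRetryPlan returns the wait before each retry and the retry count.
// A nil retryDelay models a response without a retryDelay field.
func capacityRetryPlan(retryDelay *time.Duration) (time.Duration, int) {
	if retryDelay != nil && *retryDelay < capacityShortThreshold {
		return *retryDelay, 1
	}
	return capacityRetryInterval, capacityMaxLongRetries
}

// retryCapacityExhausted is meant to run after the first request has already
// failed with MODEL_CAPACITY_EXHAUSTED. It never sets a model rate limit.
func retryCapacityExhausted(retryDelay *time.Duration, send func() error, sleep func(time.Duration)) error {
	wait, retries := capacityRetryPlan(retryDelay)
	for i := 0; i < retries; i++ {
		sleep(wait)
		err := send()
		if err == nil {
			return nil
		}
		if !errors.Is(err, errCapacityExhausted) {
			return err // different error: caller handles it by its actual code
		}
	}
	return errSwitchAccount // still exhausted: fail over to another account
}

// seconds builds a *time.Duration for the examples.
func seconds(n int) *time.Duration {
	d := time.Duration(n) * time.Second
	return &d
}

// noSleep replaces time.Sleep so the sketch runs instantly.
func noSleep(time.Duration) {}

func main() {
	attempts := 0
	err := retryCapacityExhausted(seconds(30), func() error {
		attempts++
		return errCapacityExhausted
	}, noSleep)
	fmt.Println(attempts, err)
}
```

Passing the sleep function as a parameter is a choice made for this sketch so the long path can be exercised without waiting 100 seconds; real code would likely pass time.Sleep or a context-aware wait.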
- Remove Antigravity extra failover retries feature:
  Same-account retry mechanism (cherry-picked) makes it redundant.
  Removed: antigravityExtraRetries config, sleepFixedDelay, skip-non-antigravity logic.
- Change antigravitySmartRetryMaxAttempts from 3 to 1 to prevent
  repeated rate limiting and long waits
- Clear sticky session binding (DeleteSessionAccountID) after smart
  retry exhaustion, so subsequent requests don't hit the same
  rate-limited account
- Add flow diagrams to Forward/ForwardGemini doc comments
- Add comprehensive unit tests covering:
  - Sticky session cleared on retry failure (429, 503, network error)
  - Sticky session NOT cleared on retry success
  - Sticky session NOT cleared for non-sticky requests (empty hash)
  - Sticky session NOT cleared on long delay path (handled by handler)
  - Nil cache safety (no panic)
  - MaxAttempts constant verification
  - End-to-end retryLoop → switchError propagation with session clear
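
The sticky-session guard conditions covered by these tests might look like the sketch below. It assumes a simplified cache interface: DeleteSessionAccountID is named in this change, but the signature shown here is a guess, and sessionCache, memoryCache, clearStickyOnRetryFailure, and errRetryFailed are hypothetical names used only for illustration. The long delay path is not modeled.

```go
package main

import (
	"errors"
	"fmt"
)

// sessionCache is a hypothetical, simplified view of the real cache.
// The actual DeleteSessionAccountID signature may differ.
type sessionCache interface {
	DeleteSessionAccountID(sessionHash string)
}

// memoryCache is an in-memory stand-in used only for this sketch.
type memoryCache struct {
	bindings map[string]int64 // session hash -> account ID
}

func (m *memoryCache) DeleteSessionAccountID(sessionHash string) {
	delete(m.bindings, sessionHash)
}

// errRetryFailed stands in for a 429, 503, or network error.
var errRetryFailed = errors.New("smart retry failed")

// clearStickyOnRetryFailure drops the binding only when the retry failed,
// a cache is present, and the request was sticky (non-empty hash).
// It reports whether the binding was cleared.
func clearStickyOnRetryFailure(cache sessionCache, sessionHash string, retryErr error) bool {
	if retryErr == nil || cache == nil || sessionHash == "" {
		return false
	}
	cache.DeleteSessionAccountID(sessionHash)
	return true
}

func main() {
	cache := &memoryCache{bindings: map[string]int64{"abc": 42}}
	cleared := clearStickyOnRetryFailure(cache, "abc", errRetryFailed)
	fmt.Println(cleared, len(cache.bindings))
}
```

The nil check here only covers a nil interface value; a typed nil pointer wrapped in the interface would still reach the method call.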
- GetAccessToken: add upstream branch to read api_key from credentials
- shouldTriggerAntigravitySmartRetry: relax check from IsOAuth to Platform-based
- isModelSupportedByAccount/WithContext: replace IsAntigravityModelSupported
  whitelist with mapAntigravityModel for unified scheduling/forwarding logic
- mapAntigravityModel: fix edge case where wildcard target equals request model
- Update tests for new behavior and add custom model_mapping test cases