Ethan0x0000
7bd1972f94
refactor: migrate all handlers to shared endpoint normalization middleware
...
- Apply InboundEndpointMiddleware to all gateway route groups
- Replace normalizedOpenAIInboundEndpoint/normalizedOpenAIUpstreamEndpoint and normalizedGatewayInboundEndpoint/normalizedGatewayUpstreamEndpoint with GetInboundEndpoint/GetUpstreamEndpoint
- Remove 4 old constants and 4 old normalization functions (-70 lines)
- Migrate existing endpoint normalization test to new API
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode )
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai >
2026-03-15 22:13:42 +08:00
Wesley Liddick
8321e4a647
Merge pull request #1023 from YanzheL/fix/claude-output-effort-logging
...
fix: extract and log Claude output_config.effort in usage records
2026-03-15 16:45:37 +08:00
Elysia
359e56751b
增加测试
2026-03-15 16:21:49 +08:00
YanzheL
1bff2292a6
fix: extract and log Claude output_config.effort in usage records
...
Claude's output_config.effort parameter (low/medium/high/max) was not
being extracted from requests or logged in the reasoning_effort column
of usage logs. Only the OpenAI path populated this field.
Changes:
- Extract output_config.effort in ParseGatewayRequest
- Add ReasoningEffort field to ForwardResult
- Populate reasoning_effort in both RecordUsage and RecordUsageWithLongContext
- Guard against overwriting service-set effort values in handler
- Update stale comments that described reasoning_effort as OpenAI-only
- Add unit tests for extraction, normalization, and persistence
2026-03-15 12:55:37 +08:00
Elysia
0e23732631
fix(gateway): 防止流式 failover 拼接腐化导致客户端收到双 message_start
...
当上游在 SSE 流中途返回 event:error 时,handleStreamingResponse 已将
部分 SSE 事件写入客户端,但原先的 failover 逻辑仍会切换到下一个账号
并写入完整流,导致客户端收到两个 message_start 进而产生 400 错误。
修复方案:在每次 Forward 调用前记录 c.Writer.Size(),若 Forward 返回
UpstreamFailoverError 后 writer 字节数增加,说明 SSE 内容已不可撤销地
发送给客户端,此时直接调用 handleFailoverExhausted 发送 SSE error 事件
终止流,而非继续 failover。
Ping-only 场景不受影响:slot 等待期的 ping 字节在 Forward 前后相等,
正常 failover 流程照常进行。
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com >
2026-03-14 22:49:23 +08:00
ius
611fd884bd
feat: decouple billing correctness from usage log batching
2026-03-12 16:53:18 +08:00
shaw
00a0a12138
feat: Anthropic平台可配置 anthropic-beta 策略
2026-03-10 11:20:10 +08:00
shaw
c7fcb7a84b
feat: apikey限额支持查询重置时间
2026-03-09 10:22:24 +08:00
shaw
7a353028e7
fix: 修复keys速率限制未自动重置额度的bug
2026-03-07 10:13:51 +08:00
shaw
838dad8759
feat: 重构 /v1/usage 端点,支持 quota_limited 和 unrestricted 双模式
...
- quota_limited 模式:返回 Key 级别的总额度、速率限制窗口用量和过期时间
- unrestricted 模式:返回订阅限额或钱包余额信息(向后兼容)
- 新增 model_stats 字段,支持 start_date/end_date 参数查询按模型用量统计
- 提取 buildUsageData/parseUsageDateRange 等辅助方法,减少主函数复杂度
- 新增 APIKeyService.GetRateLimitData 和 UsageService.GetAPIKeyModelStats
2026-03-03 20:59:12 +08:00
shaw
a80ec5d8bb
feat: apikey支持5h/1d/7d速率控制
2026-03-03 15:01:10 +08:00
QTom
a9285b8a94
feat(gateway): 双模式用户消息队列 — 串行队列 + 软性限速
...
新增 UMQ (User Message Queue) 双模式支持:
- serialize: 账号级分布式串行锁 + RPM 自适应延迟(严格限流)
- throttle: 仅 RPM 自适应前置延迟,不阻塞并发(软性限速)
后端:
- config: 新增 Mode 字段,保留 Enabled 向后兼容
- service: 新增 UserMessageQueueService(Lua 锁/延迟算法/清理 worker)
- repository: 新增 UserMsgQueueCache(Redis Lua acquire/release/force-release)
- handler: 新增 UserMsgQueueHelper(SSE ping + 等待循环 + throttle)
- gateway: 按 mode 分支集成 serialize/throttle 逻辑
- lint: 修复 gofmt rewrite rules、errcheck 类型断言、staticcheck QF1012
前端:
- 三态选择器 UI(关闭/软性限速/串行队列)替代 toggle 开关
- BulkEdit 支持 null 语义(不修改)
- i18n 中英文文案
通过 6 轮专家评审(42 次 review)、golangci-lint、单元测试、集成测试。
2026-03-03 01:05:11 +08:00
QTom
4280aca82c
feat(gateway): 添加 Claude Code 客户端最低版本检查功能
...
- 通过 User-Agent 识别 Claude Code 客户端并提取版本号
- 在网关层验证客户端版本是否满足管理员配置的最低要求
- 在管理后台提供版本要求配置选项(英文/中文双语)
- 实现原子缓存 + singleflight 防止并发问题和 thundering herd
- 使用 context.WithoutCancel 隔离 DB 查询,避免客户端断连影响缓存
- 双 TTL 策略:60s 正常、5s 错误恢复,保证性能与可用性
- 仅检查 Claude Code 客户端,其他客户端不受影响
- 添加完整单元测试覆盖版本提取、比对、上下文操作
2026-03-01 15:45:44 +08:00
QTom
e63c83955a
fix: address deep code review issues for RPM limiting
...
- Move IncrementRPM after Forward success to prevent phantom RPM
consumption during account switch retries
- Add base_rpm input sanitization (clamp to 0-10000) in Create/Update
- Add WindowCost scheduling checks to legacy path sticky sessions
(4 check sites + 4 prefetch sites), fixing pre-existing gap
- Clean up rpm_strategy/rpm_sticky_buffer when disabling RPM in
BulkEditModal (JSONB merge cannot delete keys, use empty values)
- Add json.Number test cases to TestGetBaseRPM/TestGetRPMStickyBuffer
- Document TOCTOU race as accepted soft-limit design trade-off
2026-02-28 20:38:06 +08:00
QTom
607237571f
fix: address code review issues for RPM limiting feature
...
- Use TxPipeline (MULTI/EXEC) instead of Pipeline for atomic INCR+EXPIRE
- Filter negative values in GetBaseRPM(), update test expectation
- Add RPM batch query (GetRPMBatch) to account List API
- Add warn logs for RPM increment failures in gateway handler
- Reset enableRpmLimit on BulkEditAccountModal close
- Use union type 'tiered' | 'sticky_exempt' for rpmStrategy refs
- Add design decision comments for rdb.Time() RTT trade-off
2026-02-28 20:37:37 +08:00
QTom
f648b8e026
feat: increment RPM counter before request forwarding
2026-02-28 20:37:10 +08:00
yangjianbo
bb664d9bbf
feat(sync): full code sync from release
2026-02-28 15:01:20 +08:00
erio
d8d4b0c0c7
fix: enable Gemini model_mapping UI and extend warmup to Antigravity
...
- Remove Gemini platform exclusion from model restriction UI in
Create/Edit account modals (Gemini now supports model_mapping)
- Remove outdated Gemini model passthrough info cards
- Add model_mapping field to GeminiCredentials type
- Extend warmup request interception toggle to Antigravity platform
- Remove redundant try/catch in API key account creation
- Remove noisy gateway.request_completed debug log
- Reorganize Gemini model mapping sections in constants.go
2026-02-24 21:30:32 +08:00
erio
09166a52f8
refactor: extract failover error handling into FailoverState
...
- Extract duplicated failover logic from gateway_handler.go (3 places)
and gemini_v1beta_handler.go into shared failover_loop.go
- Introduce FailoverState with HandleFailoverError and HandleSelectionExhausted
- Move helper functions (needForceCacheBilling, sleepWithContext) into failover_loop.go
- Add comprehensive unit tests (32+ test cases)
- Delete redundant gateway_handler_single_account_retry_test.go
2026-02-24 18:08:04 +08:00
yangjianbo
2ee6c26676
fix(gateway): 修复粘性会话预取分组错配并优化并发等待热路径
2026-02-22 16:43:33 +08:00
yangjianbo
a89477ddf5
perf(gateway): 优化热点路径并补齐高覆盖测试
2026-02-22 13:31:30 +08:00
yangjianbo
33db7a0fb6
feat(gateway): 引入使用量记录有界 worker 池与自动扩缩容
...
- 新增 UsageRecordWorkerPool,支持有界队列、溢出降级策略与自动扩缩容
- 将 Gateway/OpenAI/Sora/Gemini 使用量记录改为提交到统一任务池执行
- 增加 usage_record 配置默认值与校验规则,并补充配置与任务提交相关测试
- 注入并托管 worker 池生命周期,服务退出时统一 StopAndWait
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com >
2026-02-22 12:56:57 +08:00
yangjianbo
d04b47b3ca
feat(backend): 提交后端审计修复与配套测试改动
2026-02-14 11:23:10 +08:00
yangjianbo
abf5de69fb
Merge branch 'main' into test
2026-02-12 23:43:47 +08:00
yangjianbo
584cfc3db2
chore(logging): 完成后端日志审计与结构化迁移
...
- 将高密度服务与处理器日志迁移到新日志系统(LegacyPrintf/结构化日志)
- 增加 stdlog bridge 与兼容测试,保留旧日志捕获能力
- 将 OpenAI 断流告警改为结构化 Warn 并改造对应测试为 sink 捕获
- 补齐后端相关文件 logger 引用并通过全量 go test
2026-02-12 19:01:09 +08:00
yangjianbo
fff1d54858
feat(log): 落地统一日志底座与系统日志运维能力
2026-02-12 16:27:29 +08:00
程序猿MT
8da5fac69e
Merge branch 'Wei-Shaw:main' into main
2026-02-11 18:39:52 +08:00
Edric Li
2d4236f76e
fix: 修复错误透传规则 skip_monitoring 未生效的问题
...
- ops_error_logger: status < 400 分支增加 OpsSkipPassthroughKey 检查
- ops_upstream_context: 新增 checkSkipMonitoringForUpstreamEvent,中间重试/故障转移事件也能触发跳过标记
- gateway_handler/openai_gateway_handler/gemini_v1beta_handler: handleFailoverExhausted 匹配规则后设置 OpsSkipPassthroughKey
- antigravity_gateway_service: writeMappedClaudeError 增加 applyErrorPassthroughRule 调用
2026-02-10 20:56:01 +08:00
yangjianbo
3b0910f664
Merge branch 'main' into test-sora
2026-02-10 18:01:17 +08:00
程序猿MT
1dd3158c7e
Merge branch 'Wei-Shaw:main' into main
2026-02-10 13:55:51 +08:00
Edric Li
7d0a30fa8f
merge: sync upstream main (antigravity single-account 503 retry)
...
合并上游新增的 Antigravity 单账号 503 退避重试机制,
解决与本地 MODEL_CAPACITY_EXHAUSTED 逻辑的冲突,两者共存。
2026-02-10 12:00:21 +08:00
Edric Li
d6c2921f2b
feat: same-account retry before failover for transient errors
...
For retryable transient errors (Google 400 "invalid project resource name"
and empty stream responses), retry on the same account up to 2 times
(with 500ms delay) before switching to another account.
- Add RetryableOnSameAccount field to UpstreamFailoverError
- Add same-account retry loop in both Gemini and Claude/OpenAI handler paths
- Move temp-unschedule from service layer to handler layer (only after
all same-account retries exhausted)
- Reduce temp-unschedule cooldown from 30 minutes to 1 minute
2026-02-10 00:53:54 +08:00
yangjianbo
d367d1cde6
Merge branch 'main' into test-sora
2026-02-09 20:40:09 +08:00
yangjianbo
16131c3d3f
Merge branch 'main' of https://github.com/mt21625457/aicodex2api
2026-02-09 20:26:03 +08:00
Rose Ding
021abfca18
fix: 单账号分组首次 503 不设模型限流标记,避免后续请求雪崩
...
单账号 antigravity 分组收到 503 (MODEL_CAPACITY_EXHAUSTED) 时,
原逻辑会设置 ~29s 模型限流标记。由于只有一个账号无法切换,
后续所有新请求在预检查时命中限流 → 几毫秒内直接返回 503,
导致约 30 秒的雪崩窗口。
修复:在 Handler 入口处检查分组是否只有单个 antigravity 账号,
如果是则提前设置 SingleAccountRetry context 标记,让 Service 层
首次 503 就走原地重试逻辑(不设限流标记),避免污染后续请求。
2026-02-09 17:25:36 +08:00
Rose Ding
f6cfab9901
feat: 添加 Antigravity 单账号 503 退避重试机制
...
当分组内只有一个可用账号且上游返回 503 (MODEL_CAPACITY_EXHAUSTED) 时,
不再设置模型限流+切换账号(因为切换回来还是同一个账号),而是在 Service 层
原地等待+重试,避免双重等待问题。
主要变更:
- Handler 层:检测单账号 503 场景,清除排除列表并设置 SingleAccountRetry 标记
- Service 层:新增 handleSingleAccountRetryInPlace 原地重试逻辑
- Service 层:预检查跳过单账号模式下的限流检查
- 新增 ctxkey.SingleAccountRetry 上下文标记
2026-02-09 14:26:01 +08:00
erio
72b08f9cc5
fix: ensure sticky session failover triggers cache billing exemption
2026-02-09 06:57:07 +08:00
erio
681950dadd
feat: add linear delay between Antigravity account failover switches
2026-02-09 06:56:29 +08:00
erio
35598d5648
fix: parse Gemini native request format in ParseGatewayRequest for correct session hash generation
...
ParseGatewayRequest only parsed Anthropic format (system/messages),
ignoring Gemini native format (systemInstruction/contents). This caused
GenerateSessionHash to produce identical hashes for all Gemini sessions.
Add protocol parameter to ParseGatewayRequest to branch between
Anthropic and Gemini parsing. Update GenerateSessionHash message
traversal to extract text from both formats.
2026-02-09 06:47:22 +08:00
erio
5c76b9e45a
fix: prevent sessionHash collision for different users with same messages
...
Mix SessionContext (ClientIP, UserAgent, APIKeyID) into
GenerateSessionHash 3rd-level fallback to differentiate requests
from different users sending identical content.
Also switch hashContent from SHA256-truncated to XXHash64 for
better performance, and optimize Trie Lua script to match from
longest prefix first.
2026-02-09 06:46:32 +08:00
erio
fb58560d15
refactor(upstream): replace upstream account type with apikey, auto-append /antigravity
...
Upstream accounts now use the standard APIKey type instead of a dedicated
upstream type. GetBaseURL() and new GetGeminiBaseURL() automatically append
/antigravity for Antigravity platform APIKey accounts, eliminating the need
for separate upstream forwarding methods.
- Remove ForwardUpstream, ForwardUpstreamGemini, testUpstreamConnection
- Remove upstream branch guards in Forward/ForwardGemini/TestConnection
- Add migration 052 to convert existing upstream accounts to apikey
- Update frontend CreateAccountModal to create apikey type
- Add unit tests for GetBaseURL and GetGeminiBaseURL
2026-02-08 13:06:25 +08:00
yangjianbo
fd43be8d0b
merge: 合并 main 分支到 test,解决 config 和 modelWhitelist 冲突
...
- config.go: 保留 Sora 配置,合入 SubscriptionCache 配置
- useModelWhitelist.ts: 同时保留 soraModels 和 antigravityModels
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com >
2026-02-07 20:18:07 +08:00
yangjianbo
a14dfb769a
Merge branch 'dev-release'
2026-02-07 19:58:00 +08:00
yangjianbo
2588fa6a8f
fix(audit): 第二批审计修复 — P0 生产 Bug、安全加固、性能优化、缓存一致性、代码质量
...
基于 backend-code-audit 审计报告,修复剩余 P0/P1/P2 共 34 项问题:
P0 生产 Bug:
- 修复 time.Since(time.Now()) 计时逻辑错误 (P0-03)
- generateRandomID 改用 crypto/rand 替代固定索引 (P0-04)
- IncrementQuotaUsed 重写为 Ent 原子操作消除 TOCTOU 竞态 (P0-05)
安全加固:
- gateway/openai handler 错误响应替换为泛化消息,防止内部信息泄露 (P1-14)
- usage_log_repo dateFormat 参数改用白名单映射,防止 SQL 注入 (P1-16)
- 默认配置安全加固:sslmode=prefer、response_headers=true、mode=release (P1-18/19, P2-15)
性能优化:
- gateway handler 循环内 defer 替换为显式 releaseWait 闭包 (P1-02)
- group_repo/promo_code_repo Count 前 Clone 查询避免状态污染 (P1-03)
- usage_log_repo 四个查询添加 LIMIT 10000 防止 OOM (P1-07)
- GetBatchUsageStats 添加时间范围参数,默认最近 30 天 (P1-10)
- ip.go CIDR 预编译为包级变量 (P1-11)
- BatchUpdateCredentials 重构为先验证后更新 (P1-13)
缓存一致性:
- billing_cache 添加 jitteredTTL 防止缓存雪崩 (P2-10)
- DeductUserBalance/UpdateSubscriptionUsage 错误传播修复 (P2-12)
- UserService.UpdateBalance 成功后异步失效 billingCache (P2-13)
代码质量:
- search 截断改为按 rune 处理,支持多字节字符 (P2-01)
- TLS Handshake 改为 HandshakeContext 支持 context 取消 (P2-07)
- CORS 预检添加 Access-Control-Max-Age: 86400 (P2-16)
测试覆盖:
- 新增 user_service_test.go(UpdateBalance 缓存失效 6 个用例)
- 新增 batch_update_credentials_test.go(fail-fast + 类型验证 7 个用例)
- 新增 response_transformer_test.go、ip_test.go、usage_log_repo_unit_test.go、search_truncate_test.go
- 集成测试:IncrementQuotaUsed 并发测试、billing_cache 错误传播测试
- config_test.go 补充 server.mode/sslmode 默认值断言
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com >
2026-02-07 19:46:42 +08:00
shaw
6aaa4aee6a
fix: 收敛 Claude Code 探测拦截并补齐回归测试
2026-02-07 19:04:08 +08:00
erio
edb0937024
fix: restore non-failover error passthrough from 7b156489
2026-02-07 14:24:55 +08:00
erio
5e98445b22
feat(antigravity): comprehensive enhancements - model mapping, rate limiting, scheduling & ops
...
Key changes:
- Upgrade model mapping: Opus 4.5 → Opus 4.6-thinking with precise matching
- Unified rate limiting: scope-level → model-level with Redis snapshot sync
- Load-balanced scheduling by call count with smart retry mechanism
- Force cache billing support
- Model identity injection in prompts with leak prevention
- Thinking mode auto-handling (max_tokens/budget_tokens fix)
- Frontend: whitelist mode toggle, model mapping validation, status indicators
- Gemini session fallback with Redis Trie O(L) matching
- Ops: enhanced concurrency monitoring, account availability, retry logic
- Migration scripts: 049-051 for model mapping unification
2026-02-07 12:31:10 +08:00
shaw
7b1564898b
fix: make error passthrough effective for non-failover upstream errors
2026-02-07 10:25:56 +08:00
yangjianbo
d8e405511e
Merge branch 'main' of https://github.com/mt21625457/aicodex2api
2026-02-06 06:56:23 +08:00
shaw
39e05a2dad
feat: 新增全局错误透传规则功能
...
支持管理员配置上游错误如何返回给客户端:
- 新增 ErrorPassthroughRule 数据模型和 Ent Schema
- 实现规则的 CRUD API(/admin/error-passthrough-rules)
- 支持按错误码、关键词匹配,支持 any/all 匹配模式
- 支持按平台过滤(anthropic/openai/gemini/antigravity)
- 支持透传或自定义响应状态码和错误消息
- 实现两级缓存(Redis + 本地内存)和多实例同步
- 集成到 gateway_handler 的错误处理流程
- 新增前端管理界面组件
- 新增单元测试覆盖核心匹配逻辑
优化:
- 移除 refreshLocalCache 中的冗余排序(数据库已排序)
- 后端 Validate() 增加匹配条件非空校验
2026-02-05 21:52:54 +08:00