279 Commits

Author SHA1 Message Date
erio
b3fe0506fb Merge branch 'release/custom-0.1.95' into release/custom-0.1.96 2026-03-12 18:12:47 +08:00
amberwarden
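6e90ec6111 fix: add downstream keepalive ping to Anthropic Messages API stream forwarding
The streaming forward path of the Anthropic Messages API (gateway_service.go)
sends nothing downstream while the upstream is silent for a long time (for
example, during the Opus extended thinking phase), so proxies such as
Cloudflare Tunnel drop the connection as idle.

Reuse the existing StreamKeepaliveInterval setting (default 10 seconds) and add
a keepalive branch to the select loop that periodically sends a ping event in
Anthropic's native format, consistent with the implementation pattern of the
OpenAI-compatible path.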

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 18:43:03 +08:00
erio
4caa3c2701 merge: integrate upstream v0.1.95 with our customizations
Merge main (custom features) into release/custom-0.1.95 (upstream v0.1.95).
New upstream features: group subscription binding, multi-dimension quota (daily/weekly/total),
allow_messages_dispatch, default_mapped_model, recover state API.
Our customizations: simulate_claude_max_enabled, usage status detection, 403 validation handling.
2026-03-11 03:23:44 +08:00
shaw
00a0a12138 feat: configurable anthropic-beta policy for the Anthropic platform 2026-03-10 11:20:10 +08:00
shaw
a461538d58 fix: gpt->claude conversion failing to hit the codex cache 2026-03-09 15:08:37 +08:00
shaw
ebe6f418f3 fix: align effort mapping and fast in gpt->claude format conversion 2026-03-09 11:42:35 +08:00
Wesley Liddick
6cb8980404 Merge pull request #807 from touwaeriol/fix/openai-passthrough-v2
fix(openai): remove misplaced passthrough check from isModelSupportedByAccount
2026-03-09 09:06:35 +08:00
erio
91ef085d7d fix: increase SSE scanner max line size from 40MB to 500MB
4K image base64 data can exceed 40MB limit, causing "bufio.Scanner:
token too long" errors. Scanner is adaptive (starts at 64KB, grows
as needed), so increasing the cap has no impact on normal responses.
2026-03-09 08:56:54 +08:00
Wesley Liddick
97aaa24733 Merge pull request #858 from james-6-23/fix/pool-mode-03bf3485
Support configurable same-account retry count and custom error policy for API Key upstream pool mode
2026-03-09 08:48:53 +08:00
Wesley Liddick
01180b316f Merge pull request #841 from touwaeriol/feature/account-periodic-quota
feat(account): add daily/weekly periodic quota limits for API Key accounts
2026-03-08 20:34:15 +08:00
kyx236
e643fc382c feat: support configurable same-account retry count and custom error policy for API Key upstream pool mode 2026-03-08 14:12:17 +08:00
shaw
a3791104f9 feat: support an admin setting to enable or disable the rectifier 2026-03-07 21:55:38 +08:00
erio
1ee17383f8 feat(account): add daily/weekly periodic quota limits for API Key accounts
Extend the existing total quota limit with daily and weekly periodic
dimensions. Each dimension is independently configurable and uses lazy
reset — when the period expires, usage is automatically reset to zero on
the next increment. Any dimension exceeding its limit will pause the
account from scheduling.

Backend:
- Add GetQuotaDailyLimit/Used, GetQuotaWeeklyLimit/Used, HasAnyQuotaLimit
- Rewrite IncrementQuotaUsed with atomic CTE SQL for 3-dimension update
- Rewrite ResetQuotaUsed to clear all dimensions and period timestamps
- Update postUsageBilling to use HasAnyQuotaLimit()
- Preserve daily/weekly used values on account edit

Frontend:
- Refactor QuotaLimitCard from single v-model to 3-dimension props
- Add QuotaBadge component for compact D/W/$ display
- Update AccountCapacityCell with per-dimension badges
- Update Create/Edit modals with daily/weekly quota fields
- Update AccountActionMenu hasQuotaLimit to check all dimensions
- Add i18n strings for daily/weekly/total quota labels

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 19:06:59 +08:00
Wesley Liddick
afbe8bf001 Merge pull request #809 from alfadb/feature/openai-messages
feat(openai): add /v1/messages endpoint and API compatibility layer
2026-03-06 20:16:06 +08:00
Wesley Liddick
005d0c5f53 Merge pull request #815 from mt21625457/pr/openai-user-group-rate-upstream
fix(openai): unify the dedicated rate multiplier billing path and add missing regression tests
2026-03-06 17:33:09 +08:00
yangjianbo
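a18bbb5f2f fix(openai): unify the dedicated rate multiplier billing path and add missing regression tests
Extract a shared resolver for user-group dedicated rate multipliers, unifying the cache, singleflight, and fallback logic.

Have the standalone OpenAI billing path reuse the dedicated rate multiplier resolver, fixing usage records and actual charges that did not apply the user's dedicated rate multiplier.

Add unit tests for OpenAI billing and the resolver, and fix lint blockers exposed by the full regression run.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>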
2026-03-06 16:47:51 +08:00
erio
c28f691f32 fix(openai): remove misplaced passthrough check from isModelSupportedByAccount
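isModelSupportedByAccount is not called by the OpenAI scheduling path.
OpenAI /responses and /chat/completions go through
openai_account_scheduler.go, and the passthrough short-circuit was already
added correctly to that file in the second commit of PR #806.

The check here is redundant dead code, because OpenAI accounts never reach
this branch of isModelSupportedByAccount.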
2026-03-06 14:32:08 +08:00
alfadb
ff1f114989 feat(openai): add /v1/messages endpoint and API compatibility layer
Add Anthropic Messages API support for OpenAI platform groups, enabling
clients using Claude-style /v1/messages format to access OpenAI accounts
through automatic protocol conversion.

- Add apicompat package with type definitions and bidirectional converters
  (Anthropic ↔ Chat, Chat ↔ Responses, Anthropic ↔ Responses)
- Implement /v1/messages endpoint for OpenAI gateway with streaming support
- Add model mapping UI for OpenAI OAuth accounts (whitelist + mapping modes)
- Support prompt caching fields and codex OAuth transforms
- Fix tool call ID conversion for Responses API (fc_ prefix)
- Ensure function_call_output has non-empty output field

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 14:29:22 +08:00
erio
79ae15d5e8 fix: OpenAI passthrough accounts bypass model mapping check
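Passthrough-mode accounts only replace authentication and should let every
model through. Previously, isModelSupportedByAccount in the scheduling stage
was unaware of passthrough mode, so new models not configured in
model_mapping (such as gpt-5.4) were rejected with a 503.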
2026-03-06 14:01:47 +08:00
Wesley Liddick
63a8c76946 Merge pull request #798 from touwaeriol/feature/account-load-factor
feat: add account load_factor for scheduling load calculation
2026-03-06 09:42:10 +08:00
erio
0d6c1c7790 feat: add independent load_factor field for scheduling load calculation 2026-03-06 05:07:10 +08:00
erio
f89465fb39 Merge branch 'main' into release/custom-0.1.91
# Conflicts:
#	frontend/src/components/admin/account/AccountActionMenu.vue
#	frontend/src/views/admin/AccountsView.vue
2026-03-06 04:08:14 +08:00
erio
440c3f46a7 feat: add independent load_factor field for scheduling load calculation
- Separate load factor from concurrency: concurrency controls actual
  slot acquisition, load_factor controls load rate calculation
- Add EffectiveLoadFactor() method: LoadFactor > Concurrency > 1
- Add load_factor field to Create/Edit/BulkEdit account forms
- Fix RPM default value: auto-fill 15 when RPM enabled but not set
- Fix stale test compilation errors in server and handler packages
2026-03-06 03:42:24 +08:00
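The fallback order "LoadFactor > Concurrency > 1" from the commit above could look like the sketch below. The `Account` struct is reduced to two fields, and treating values of zero or less as "unset" is an assumption. The actual field types are not shown in this log.

```go
package main

import "fmt"

// Account holds only the two fields relevant to this example.
type Account struct {
	LoadFactor  int // assumed: <= 0 means "not set"
	Concurrency int // assumed: <= 0 means "not set"
}

// EffectiveLoadFactor prefers LoadFactor, falls back to Concurrency,
// and finally to 1 so the load rate never divides by zero.
func (a Account) EffectiveLoadFactor() int {
	if a.LoadFactor > 0 {
		return a.LoadFactor
	}
	if a.Concurrency > 0 {
		return a.Concurrency
	}
	return 1
}

func main() {
	withBoth := Account{LoadFactor: 8, Concurrency: 3}
	onlyConcurrency := Account{Concurrency: 3}
	neither := Account{}

	fmt.Println(withBoth.EffectiveLoadFactor())        // 8
	fmt.Println(onlyConcurrency.EffectiveLoadFactor()) // 3
	fmt.Println(neither.EffectiveLoadFactor())         // 1
}
```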
erio
b2d6879b3f refactor: unify post-usage billing logic and fix account quota calculation
- Extract postUsageBilling() to consolidate billing logic across
  GatewayService.RecordUsage, RecordUsageWithLongContext, and
  OpenAIGatewayService.RecordUsage, eliminating ~120 lines of
  duplicated code
- Fix account quota to use TotalCost × accountRateMultiplier
  (was using raw TotalCost, inconsistent with account cost stats)
- Fix RecordUsageWithLongContext API Key quota only updating in
  balance mode (now updates regardless of billing type)
- Fix WebSocket client disconnect detection on Windows by adding
  "an established connection was aborted" to known disconnect errors
2026-03-06 00:54:48 +08:00
erio
02dea7b09b refactor: unify post-usage billing logic and fix account quota calculation
- Extract postUsageBilling() to consolidate billing logic across
  GatewayService.RecordUsage, RecordUsageWithLongContext, and
  OpenAIGatewayService.RecordUsage, eliminating ~120 lines of
  duplicated code
- Fix account quota to use TotalCost × accountRateMultiplier
  (was using raw TotalCost, inconsistent with account cost stats)
- Fix RecordUsageWithLongContext API Key quota only updating in
  balance mode (now updates regardless of billing type)
- Fix WebSocket client disconnect detection on Windows by adding
  "an established connection was aborted" to known disconnect errors
2026-03-06 00:54:17 +08:00
erio
05527b13db feat: add quota limit for API key accounts
- Add configurable spending limit (quota_limit) for apikey-type accounts
- Atomic quota accumulation via PostgreSQL JSONB operations on TotalCost
- Scheduler filters out over-quota accounts with outbox-triggered snapshot refresh
- Display quota usage ($used / $limit) in account capacity column
- Add "Reset Quota" action in account menu to reset usage to zero
- Editing account settings preserves quota_used (no accidental reset)
- Covers all 3 billing paths: Anthropic, Gemini, OpenAI RecordUsage

chore: bump version to 0.1.90.4
2026-03-06 00:35:09 +08:00
erio
95cf59b2f6 feat: add quota limit for API key accounts
- Add configurable spending limit (quota_limit) for apikey-type accounts
- Atomic quota accumulation via PostgreSQL JSONB operations on TotalCost
- Scheduler filters out over-quota accounts with outbox-triggered snapshot refresh
- Display quota usage ($used / $limit) in account capacity column
- Add "Reset Quota" action in account menu to reset usage to zero
- Editing account settings preserves quota_used (no accidental reset)
- Covers all 3 billing paths: Anthropic, Gemini, OpenAI RecordUsage

chore: bump version to 0.1.90.4
2026-03-05 21:48:37 +08:00
shaw
9d70c38504 fix: claude apikey account requests not carrying the beta=true query parameter 2026-03-05 15:01:04 +08:00
shaw
aeb464f3ca feat: apply model mapping to the /v1/messages/count_tokens endpoint 2026-03-05 14:49:28 +08:00
erio
a6026e7ac4 Merge tag 'v0.1.90' into merge/upstream-v0.1.90
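Registration email domain allowlist policy launched; admin performance with large datasets greatly improved.

- Registration email domain allowlist: administrators can configure which email domains are allowed to register
- Keys page form filtering: the user /keys page supports filtering API Keys by condition
- Settings page split into tabs: the admin settings page is shown in tabs by functional module

- Admin large-dataset load performance: the dashboard/users/accounts/Ops pages load large datasets significantly faster
- Usage large-table pagination: avoid a full COUNT(*) by default, greatly reducing pagination query time
- Remove the duplicated normalizeAccountIDList and add unit tests for the new components
- Clean up unused files and outdated docs to slim down the project structure
- Replace hard-coded English strings in EmailVerifyView with i18n calls

- Fix Anthropic-platform 429 responses without a rate-limit reset time wrongly marking the account as rate-limited
- Fix custom menu pages not taking effect in the administrator view
- Fix the Ops error detail dialog not showing the real upstream payload
- Fix the icon display for the top-up/subscription menu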

# Conflicts:
#	.gitignore
#	backend/cmd/server/VERSION
#	backend/ent/group.go
#	backend/ent/runtime/runtime.go
#	backend/ent/schema/group.go
#	backend/go.sum
#	backend/internal/handler/admin/account_handler.go
#	backend/internal/handler/admin/dashboard_handler.go
#	backend/internal/pkg/usagestats/usage_log_types.go
#	backend/internal/repository/group_repo.go
#	backend/internal/repository/usage_log_repo.go
#	backend/internal/server/middleware/security_headers.go
#	backend/internal/server/router.go
#	backend/internal/service/account_usage_service.go
#	backend/internal/service/admin_service_bulk_update_test.go
#	backend/internal/service/dashboard_service.go
#	backend/internal/service/gateway_service.go
#	frontend/src/api/admin/dashboard.ts
#	frontend/src/components/account/BulkEditAccountModal.vue
#	frontend/src/components/charts/GroupDistributionChart.vue
#	frontend/src/components/layout/AppSidebar.vue
#	frontend/src/i18n/locales/en.ts
#	frontend/src/i18n/locales/zh.ts
#	frontend/src/views/admin/GroupsView.vue
#	frontend/src/views/admin/SettingsView.vue
#	frontend/src/views/admin/UsageView.vue
#	frontend/src/views/user/PurchaseSubscriptionView.vue
2026-03-04 19:58:38 +08:00
Wesley Liddick
43c203333e Merge pull request #733 from DaydreamCoding/fix/group-isolation
fix(gateway): group isolation, forbid cross-group scheduling of ungrouped accounts
2026-03-03 15:10:30 +08:00
shaw
a80ec5d8bb feat: apikey supports 5h/1d/7d rate control 2026-03-03 15:01:10 +08:00
QTom
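530a16291c fix(gateway): group isolation, forbid cross-group scheduling of ungrouped accounts
When an API Key has no group, scheduling selects only from the pool of
ungrouped accounts. Fix the isAccountInGroup logic for groupID==nil, and add
the missing SimpleMode guards in scheduler_snapshot_service and
gemini_compat_service so that group isolation takes effect on every
scheduling path.

Add the ListSchedulableUngroupedByPlatform/s methods, which use Ent's
Not(HasAccountGroups()) predicate to isolate ungrouped accounts.
Add 17 unit and end-to-end isolation tests covering all branches and boundary conditions.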
2026-03-03 13:20:58 +08:00
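The groupID==nil rule from the commit above can be sketched as a pure function. The signature below is hypothetical. The real isAccountInGroup operates on the project's own account type, which this log does not show.

```go
package main

import "fmt"

// isAccountInGroup is a simplified stand-in: accountGroupIDs lists the groups
// an account belongs to, and groupID is the API key's group (nil if none).
func isAccountInGroup(accountGroupIDs []int64, groupID *int64) bool {
	if groupID == nil {
		// A key without a group may only use accounts without a group.
		return len(accountGroupIDs) == 0
	}
	for _, id := range accountGroupIDs {
		if id == *groupID {
			return true
		}
	}
	return false
}

func main() {
	g1 := int64(1)

	fmt.Println(isAccountInGroup(nil, nil))           // ungrouped key, ungrouped account
	fmt.Println(isAccountInGroup([]int64{1}, nil))    // ungrouped key, grouped account
	fmt.Println(isAccountInGroup([]int64{1, 2}, &g1)) // key in group 1, account in groups 1 and 2
	fmt.Println(isAccountInGroup(nil, &g1))           // key in group 1, ungrouped account
}
```

Under these assumptions only the first and third calls return true, so neither side can reach across the group boundary.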
QTom
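a9285b8a94 feat(gateway): dual-mode user message queue, serial queue + soft rate limiting
Add dual-mode support for UMQ (User Message Queue):
- serialize: account-level distributed serial lock + RPM-adaptive delay (strict rate limiting)
- throttle: RPM-adaptive pre-request delay only, without blocking concurrency (soft rate limiting)

Backend:
- config: add a Mode field, keep Enabled for backward compatibility
- service: add UserMessageQueueService (Lua lock / delay algorithm / cleanup worker)
- repository: add UserMsgQueueCache (Redis Lua acquire/release/force-release)
- handler: add UserMsgQueueHelper (SSE ping + wait loop + throttle)
- gateway: integrate serialize/throttle logic in per-mode branches
- lint: fix gofmt rewrite rules, errcheck type assertions, staticcheck QF1012

Frontend:
- Three-state selector UI (off / soft rate limiting / serial queue) replacing the toggle switch
- BulkEdit supports null semantics (leave unchanged)
- i18n copy in Chinese and English

Passed 6 rounds of expert review (42 reviews), golangci-lint, unit tests, and integration tests.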
2026-03-03 01:05:11 +08:00
QTom
2491e9b5ad fix: round-3 review fixes for RPM limiting
- Add sanitizeExtraBaseRPM to BulkUpdate handler (was missing)
- Add WindowCost scheduling checks to legacy non-sticky selection
  paths (4 sites), matching existing sticky + load-aware coverage
- Export ParseExtraInt from service package, remove duplicate
  parseExtraIntForValidation from admin handler
2026-02-28 20:38:06 +08:00
QTom
e63c83955a fix: address deep code review issues for RPM limiting
- Move IncrementRPM after Forward success to prevent phantom RPM
  consumption during account switch retries
- Add base_rpm input sanitization (clamp to 0-10000) in Create/Update
- Add WindowCost scheduling checks to legacy path sticky sessions
  (4 check sites + 4 prefetch sites), fixing pre-existing gap
- Clean up rpm_strategy/rpm_sticky_buffer when disabling RPM in
  BulkEditModal (JSONB merge cannot delete keys, use empty values)
- Add json.Number test cases to TestGetBaseRPM/TestGetRPMStickyBuffer
- Document TOCTOU race as accepted soft-limit design trade-off
2026-02-28 20:38:06 +08:00
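The base_rpm sanitization mentioned in the commit above (clamp to 0-10000) amounts to a bounds check. A minimal sketch with a hypothetical `clampBaseRPM` helper. The project's own sanitizer works on the account's extra fields and is not reproduced here.

```go
package main

import "fmt"

const maxBaseRPM = 10000 // upper bound taken from the commit message

// clampBaseRPM (hypothetical name) forces v into the range [0, maxBaseRPM].
func clampBaseRPM(v int) int {
	if v < 0 {
		return 0
	}
	if v > maxBaseRPM {
		return maxBaseRPM
	}
	return v
}

func main() {
	for _, v := range []int{-5, 15, 20000} {
		fmt.Println(v, "->", clampBaseRPM(v))
	}
}
```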
QTom
ff9683b0fc fix: move RPM prefetch before routing segment in legacy/mixed paths
Ensures isAccountSchedulableForRPM calls within the routing segment
hit the prefetch cache instead of querying Redis individually.
2026-02-28 20:37:37 +08:00
QTom
607237571f fix: address code review issues for RPM limiting feature
- Use TxPipeline (MULTI/EXEC) instead of Pipeline for atomic INCR+EXPIRE
- Filter negative values in GetBaseRPM(), update test expectation
- Add RPM batch query (GetRPMBatch) to account List API
- Add warn logs for RPM increment failures in gateway handler
- Reset enableRpmLimit on BulkEditAccountModal close
- Use union type 'tiered' | 'sticky_exempt' for rpmStrategy refs
- Add design decision comments for rdb.Time() RTT trade-off
2026-02-28 20:37:37 +08:00
QTom
f648b8e026 feat: increment RPM counter before request forwarding 2026-02-28 20:37:10 +08:00
QTom
678c3ae132 feat: integrate RPM scheduling checks into account selection flow 2026-02-28 20:37:10 +08:00
QTom
c1c31ed9b2 feat: wire RPMCache into GatewayService and AccountHandler 2026-02-28 20:35:38 +08:00
yangjianbo
bb664d9bbf feat(sync): full code sync from release 2026-02-28 15:01:20 +08:00
erio
0e69895603 Merge branch 'main' into release/custom-0.1.87
# Conflicts:
#	frontend/src/components/keys/UseKeyModal.vue
2026-02-27 21:20:22 +08:00
erio
81d896bf78 fix: sync Antigravity ForwardResult.Usage with client response simulation
Apply Claude Max cache billing to usage before returning ForwardResult
in Antigravity Forward, ensuring RecordUsage gets the same simulated
usage that clients see. Restore apply+fallback in RecordUsage for
consistency across GatewayService and Antigravity paths.
2026-02-27 20:42:53 +08:00
erio
741eae59bb refactor: decouple claude max cache simulation from RecordUsage
Extract setupClaudeMaxStreamingHook and applyClaudeMaxNonStreamingRewrite
facade functions to helpers file. RecordUsage now uses detect-only (no
mutation), client response rewriting handled at Forward layer.
2026-02-27 19:59:36 +08:00
erio
61ef73cb12 refactor: isolate claude max response usage simulation by group toggle 2026-02-27 16:14:07 +08:00
erio
6da2f54e50 refactor: decouple claude max cache policy and add tokenizer 2026-02-27 12:18:22 +08:00
erio
886464b2e9 Merge branch 'feature/claude-max-simulation-review' into release/custom-0.1.86
# Conflicts:
#	backend/cmd/server/VERSION
2026-02-27 09:58:01 +08:00
erio
a6f9f9f968 feat: replace gemini-3-pro-image with gemini-3.1-flash-image
- Add migration 060 to update model_mapping for all antigravity accounts
- Remove gemini-3-pro-image and gemini-3-pro-image-preview mappings
- Add gemini-3.1-flash-image and gemini-3.1-flash-image-preview mappings
- Update frontend usage window to show GImage for new model
- Update isImageGenerationModel to support new model
2026-02-27 09:52:50 +08:00
erio
756b09b6b8 feat: replace gemini-3-pro-image with gemini-3.1-flash-image
- Add migration 060 to update model_mapping for all antigravity accounts
- Remove gemini-3-pro-image and gemini-3-pro-image-preview mappings
- Add gemini-3.1-flash-image and gemini-3.1-flash-image-preview mappings
- Update frontend usage window to show GImage for new model
- Update isImageGenerationModel to support new model
2026-02-27 09:30:44 +08:00