diff --git a/.github/workflows/backend-ci.yml b/.github/workflows/backend-ci.yml
index 2596a18c..84575a96 100644
--- a/.github/workflows/backend-ci.yml
+++ b/.github/workflows/backend-ci.yml
@@ -17,6 +17,7 @@ jobs:
           go-version-file: backend/go.mod
           check-latest: false
           cache: true
+          cache-dependency-path: backend/go.sum
       - name: Verify Go version
         run: |
           go version | grep -q 'go1.25.7'
@@ -36,6 +37,7 @@ jobs:
           go-version-file: backend/go.mod
           check-latest: false
           cache: true
+          cache-dependency-path: backend/go.sum
       - name: Verify Go version
         run: |
           go version | grep -q 'go1.25.7'
diff --git a/.gitignore b/.gitignore
index 48172982..925912fa 100644
--- a/.gitignore
+++ b/.gitignore
@@ -78,6 +78,7 @@ Desktop.ini
 # ===================
 tmp/
 temp/
+logs/
 *.tmp
 *.temp
 *.log
@@ -129,4 +130,12 @@ deploy/docker-compose.override.yml
 .gocache/
 vite.config.js
 docs/*
-.serena/
\ No newline at end of file
+.serena/
+
+# ===================
+# Load-testing tools
+# ===================
+tools/loadtest/
+# Antigravity Manager
+Antigravity-Manager/
+antigravity_projectid_fix.patch
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 00000000..0202e94f
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,1210 @@
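+# Sub2API Development Guide
+
+## Versioning Strategy
+
+### Version Number Rules
+
+We append our own patch number to the official version number:
+
+- Official version: `v0.1.68`
+- Our versions: `v0.1.68.1`, `v0.1.68.2` (incrementing)
+
+### Branching Strategy
+
+| Branch | Description |
+|------|------|
+| `main` | Our main branch, containing all custom features |
+| `release/custom-X.Y.Z` | Release branch based on the official `vX.Y.Z` |
+| `upstream/main` | The official upstream repository |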
+
+---
+
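+## Release Process (Based on a New Official Version)
+
+When upstream publishes a new version (e.g. `v0.1.69`):
+
+### 1. Sync Upstream and Create a Release Branch
+
+```bash
+# Fetch the latest upstream code
+git fetch upstream --tags
+
+# Create a new release branch from the official tag
+git checkout v0.1.69 -b release/custom-0.1.69
+
+# Merge our main branch (contains all custom features)
+git merge main --no-edit
+
+# Resolve any conflicts, then continue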
+```
+
+### 2. Update the Version Number and Tag
+
+```bash
+# Update the version file
+echo "0.1.69.1" > backend/cmd/server/VERSION
+git add backend/cmd/server/VERSION
+git commit -m "chore: bump version to 0.1.69.1"
+
+# Create our own tag
+git tag v0.1.69.1
+
+# Push the branch and the tag
+git push origin release/custom-0.1.69
+git push origin v0.1.69.1
+```
+
+### 3. Update the main Branch
+
+```bash
+# Merge the release branch back into main so main keeps the latest custom features
+git checkout main
+git merge release/custom-0.1.69
+git push origin main
+```
+
+---
+
+## Hotfix Release (Fixing an Existing Version)
+
+When a fix needs to be released on the current version:
+
+```bash
+# Apply the fix on the current release branch
+git checkout release/custom-0.1.68
+# ... make the fix ...
+git commit -m "fix: describe the fix"
+
+# Increment the patch number
+echo "0.1.68.2" > backend/cmd/server/VERSION
+git add backend/cmd/server/VERSION
+git commit -m "chore: bump version to 0.1.68.2"
+
+# Tag and push
+git tag v0.1.68.2
+git push origin release/custom-0.1.68
+git push origin v0.1.68.2
+
+# Sync the fix to main
+git checkout main
+git cherry-pick <fix-commit-hash>
+git push origin main
+```
+
+---
+
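+## Server Deployment Process
+
+### Prerequisites
+
+- The SSH alias `clicodeplus` is configured locally and connects to the production server (runs the services)
+- The SSH alias `us-asaki-root` is configured locally and connects to the build server (pulls code, builds images)
+- Production deployment directories: `/root/sub2api` (production), `/root/sub2api-beta` (testing)
+- The production server deploys with Docker Compose
+- **All images are built on the build server** so that compilation does not consume CPU/memory on the production server and affect live services
+
+### Server Roles
+
+| Server | SSH Alias | Responsibilities |
+|--------|----------|------|
+| Build server | `us-asaki-root` | Pull code, build images with `docker build` |
+| Production server | `clicodeplus` | Load images, run services, verify deployments |
+
+### Deployment Environments
+
+| Environment | Directory (production server) | Port | Database | Container Name |
+|------|------|------|--------|--------|
+| Production | `/root/sub2api` | 8080 | `sub2api` | `sub2api` |
+| Beta | `/root/sub2api-beta` | 8084 | `beta` | `sub2api-beta` |
+
+### External Database
+
+Production and Beta **share one external PostgreSQL instance** (not a database running in a container), configured in the `.env` file:
+- `DATABASE_HOST`: address of the external database
+- `DATABASE_SSLMODE`: SSL mode (usually `require`)
+- `POSTGRES_USER` / `POSTGRES_DB`: user name and database name
+
+#### Database Commands
+
+Run database operations on the server over SSH:
+
+```bash
+# Production - query migration records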
+ssh clicodeplus "source /root/sub2api/deploy/.env && PGPASSWORD=\"\$POSTGRES_PASSWORD\" psql -h \$DATABASE_HOST -U \$POSTGRES_USER -d \$POSTGRES_DB -c 'SELECT * FROM schema_migrations ORDER BY applied_at DESC LIMIT 5;'"
+
+# Beta - query migration records
+ssh clicodeplus "source /root/sub2api-beta/deploy/.env && PGPASSWORD=\"\$POSTGRES_PASSWORD\" psql -h \$DATABASE_HOST -U \$POSTGRES_USER -d \$POSTGRES_DB -c 'SELECT * FROM schema_migrations ORDER BY applied_at DESC LIMIT 5;'"
+
+# Beta - delete a specific migration record (so the migration runs again)
+ssh clicodeplus "source /root/sub2api-beta/deploy/.env && PGPASSWORD=\"\$POSTGRES_PASSWORD\" psql -h \$DATABASE_HOST -U \$POSTGRES_USER -d \$POSTGRES_DB -c \"DELETE FROM schema_migrations WHERE filename LIKE '%049%';\""
+
+# Beta - update account data
+ssh clicodeplus "source /root/sub2api-beta/deploy/.env && PGPASSWORD=\"\$POSTGRES_PASSWORD\" psql -h \$DATABASE_HOST -U \$POSTGRES_USER -d \$POSTGRES_DB -c \"UPDATE accounts SET credentials = credentials - 'model_mapping' WHERE platform = 'antigravity';\""
+```
+
+> **Note**: load the environment variables with `source .env` so the password is never typed literally on the command line.
+
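+### Deployment Steps
+
+**Important: the version number must be incremented for every deployment!**
+
+#### 0. Increment the Version Number and Push (Local)
+
+Before each deployment, increment the patch number locally and make sure the push succeeds:
+
+```bash
+# Check the current version
+cat backend/cmd/server/VERSION
+# Assume it is currently 0.1.69.1
+
+# Increment the version
+echo "0.1.69.2" > backend/cmd/server/VERSION
+git add backend/cmd/server/VERSION
+git commit -m "chore: bump version to 0.1.69.2"
+git push origin release/custom-0.1.69
+
+# ⚠️ Confirm the push succeeded (the output must show the branch update, with no "rejected" error)
+```
+
+> **Checkpoint**: if there are other uncommitted changes, commit and push them first so that all code on the release branch is on the remote.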
+
+#### 1. Pull the Code on the Build Server
+
+```bash
+# Fetch the latest code and switch branches
+ssh us-asaki-root "cd /root/sub2api && git fetch origin && git checkout -B release/custom-0.1.69 origin/release/custom-0.1.69"
+
+# ⚠️ Verify that the version matches step 0
+ssh us-asaki-root "cat /root/sub2api/backend/cmd/server/VERSION"
+```
+
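+> **First time using the build server?** Initialize the repository first; see "First-Time Build Server Initialization" below.
+
+#### 2. Build the Image on the Build Server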
+
+```bash
+ssh us-asaki-root "cd /root/sub2api && docker build --no-cache -t sub2api:latest -f Dockerfile ."
+
+# ⚠️ The build must report success; if it fails, troubleshoot before continuing
+```
+
+> **Common build problems**:
+> - An outdated `buildx` causes an API version mismatch → update buildx: `curl -fsSL "https://github.com/docker/buildx/releases/latest/download/buildx-$(curl -fsSL https://api.github.com/repos/docker/buildx/releases/latest | grep tag_name | cut -d'"' -f4).linux-amd64" -o ~/.docker/cli-plugins/docker-buildx && chmod +x ~/.docker/cli-plugins/docker-buildx`
+> - Insufficient disk space → run `docker system prune -f` to clean up dangling images and other unused data
+
+#### 3. Transfer the Image to the Production Server and Load It
+
+```bash
+# Export the image → stream it through a pipe → load it on the production server
+ssh us-asaki-root "docker save sub2api:latest" | ssh clicodeplus "docker load"
+
+# ⚠️ The output must include "Loaded image: sub2api:latest"
+```
+
+#### 4. Sync Code, Retag the Image, and Restart on the Production Server
+
+```bash
+# Sync code (used for version confirmation and deploy configuration)
+ssh clicodeplus "cd /root/sub2api && git fetch fork && git checkout -B release/custom-0.1.69 fork/release/custom-0.1.69"
+
+# Retag the image and restart
+ssh clicodeplus "docker tag sub2api:latest weishaw/sub2api:latest"
+ssh clicodeplus "cd /root/sub2api/deploy && docker compose up -d --force-recreate sub2api"
+```
+
+#### 5. Verify the Deployment
+
+```bash
+# View the startup logs
+ssh clicodeplus "docker logs sub2api --tail 20"
+
+# Confirm the version (must match the version set in step 0)
+ssh clicodeplus "cat /root/sub2api/backend/cmd/server/VERSION"
+
+# Check the container status (must show healthy)
+ssh clicodeplus "docker ps | grep sub2api"
+```
+
+---
+
+### First-Time Build Server Initialization
+
+The first time `us-asaki-root` is used as the build server, run these one-time steps:
+
+```bash
+ssh us-asaki-root
+
+# 1) Clone the repository
+cd /root
+git clone https://github.com/touwaeriol/sub2api.git sub2api
+cd sub2api
+
+# 2) Verify the Docker and buildx versions
+docker version
+docker buildx version
+# If buildx is too old (< v0.14), update it:
+# LATEST=$(curl -fsSL https://api.github.com/repos/docker/buildx/releases/latest | grep tag_name | cut -d'"' -f4)
+# curl -fsSL "https://github.com/docker/buildx/releases/download/${LATEST}/buildx-${LATEST}.linux-amd64" -o ~/.docker/cli-plugins/docker-buildx
+# chmod +x ~/.docker/cli-plugins/docker-buildx
+
+# 3) Verify that builds work
+docker build --no-cache -t sub2api:test -f Dockerfile .
+docker rmi sub2api:test
+```
+
+---
+
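+## Parallel Beta Deployment (No Impact on Production)
+
+Goal: run a beta instance in parallel on the same server (e.g. on port `8084`). **Never modify or restart** the production instance (default directory `/root/sub2api`).
+
+### Design Principles
+
+- **Separate directory**: beta uses its own directory, e.g. `/root/sub2api-beta`.
+- **Secrets live only in `.env`**: beta's database password, JWT_SECRET, etc. are written only to `/root/sub2api-beta/deploy/.env` and must not be committed to git.
+- **Separate Compose project**: start with `docker compose -p sub2api-beta ...` to keep networks and volumes isolated.
+- **Separate port**: map the host port through `SERVER_PORT` in `.env` (e.g. `8084:8080`).
+
+### Pre-checks
+
+```bash
+# 1) Make sure port 8084 is not in use
+ssh clicodeplus "ss -ltnp | grep :8084 || echo '8084 is free'"
+
+# 2) Confirm the production containers are still running (read-only check)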
+ssh clicodeplus "docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Ports}}' | sed -n '1,200p'"
+```
+
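+### First Deployment Steps
+
+> **Build server note**: production and beta share the `/root/sub2api` repository on the build server and are distinguished by image tag (`sub2api:latest` for production, `sub2api:beta` for testing).
+
+```bash
+# 1) Build the beta image on the build server (shared /root/sub2api repo; check out the target branch, then tag as beta)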
+ssh us-asaki-root "cd /root/sub2api && git fetch origin && git checkout -B release/custom-0.1.71 origin/release/custom-0.1.71"
+ssh us-asaki-root "cd /root/sub2api && docker build --no-cache -t sub2api:beta -f Dockerfile ."
+
+# ⚠️ To restore the production branch after the build:
+# ssh us-asaki-root "cd /root/sub2api && git checkout release/custom-<production-version>"
+
+# 2) Transfer the image to the production server
+ssh us-asaki-root "docker save sub2api:beta" | ssh clicodeplus "docker load"
+# ⚠️ The output must include "Loaded image: sub2api:beta"
+
+# 3) Prepare the beta environment on the production server
+ssh clicodeplus
+
+# Clone the code (only for deploy configuration and version confirmation; nothing is built here)
+cd /root
+git clone https://github.com/touwaeriol/sub2api.git sub2api-beta
+cd /root/sub2api-beta
+git checkout release/custom-0.1.71
+
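+# 4) Prepare beta's .env (secrets go only here)
+cd /root/sub2api-beta/deploy
+
+# Recommended: copy the production .env so everything except DB name/user/port is identical
+cp -f /root/sub2api/deploy/.env ./.env
+
+# Change only the following three settings (leave everything else unchanged)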
+perl -pi -e 's/^SERVER_PORT=.*/SERVER_PORT=8084/' ./.env
+perl -pi -e 's/^POSTGRES_USER=.*/POSTGRES_USER=beta/' ./.env
+perl -pi -e 's/^POSTGRES_DB=.*/POSTGRES_DB=beta/' ./.env
+
+# 5) Write a compose override (avoids container-name conflicts with production; uses the sub2api:beta image transferred from the build server)
+cat > docker-compose.override.yml <<'YAML'
+services:
+  sub2api:
+    image: sub2api:beta
+    container_name: sub2api-beta
+  redis:
+    container_name: sub2api-beta-redis
+YAML
+
+# 6) Start beta (separate project, so production is not affected)
+cd /root/sub2api-beta/deploy
+docker compose -p sub2api-beta --env-file .env -f docker-compose.yml -f docker-compose.override.yml up -d
+
+# 7) Verify beta
+curl -fsS http://127.0.0.1:8084/health
+docker logs sub2api-beta --tail 50
+```
+
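+### Database Configuration Conventions (beta)
+
+- Database address/SSL/password: same as production (copy them from the production `.env`).
+- Change only:
+  - `POSTGRES_USER=beta`
+  - `POSTGRES_DB=beta`
+
+Note: the `beta` user and `beta` database must already exist on the database side with privileges granted; otherwise the container fails to start and keeps restarting.
+
+### Updating beta (Build on the Build Server + Transfer + Restart Only the beta Container)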
+
+```bash
+# 1) Pull the code and build the image on the build server (shared /root/sub2api repo)
+ssh us-asaki-root "cd /root/sub2api && git fetch origin && git checkout -B release/custom-0.1.71 origin/release/custom-0.1.71"
+ssh us-asaki-root "cd /root/sub2api && docker build --no-cache -t sub2api:beta -f Dockerfile ."
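+# ⚠️ The build must report success
+
+# 2) Transfer the image to the production server
+ssh us-asaki-root "docker save sub2api:beta" | ssh clicodeplus "docker load"
+# ⚠️ The output must include "Loaded image: sub2api:beta"
+
+# 3) Sync code on the production server (used for version confirmation and deploy configuration)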
+ssh clicodeplus "set -e; cd /root/sub2api-beta && git fetch --all --tags && git checkout -f release/custom-0.1.71 && git reset --hard origin/release/custom-0.1.71"
+
+# 4) Restart the beta container and verify
+ssh clicodeplus "cd /root/sub2api-beta/deploy && docker compose -p sub2api-beta --env-file .env -f docker-compose.yml -f docker-compose.override.yml up -d --no-deps --force-recreate sub2api"
+ssh clicodeplus "sleep 5 && curl -fsS http://127.0.0.1:8084/health"
+ssh clicodeplus "cat /root/sub2api-beta/backend/cmd/server/VERSION"
+```
+
+### Stopping/Rolling Back beta (Affects Only beta)
+
+```bash
+ssh clicodeplus "cd /root/sub2api-beta/deploy && docker compose -p sub2api-beta -f docker-compose.yml -f docker-compose.override.yml down"
+```
+
+---
+
+## First-Time Server Deployment
+
+### 1. Build Server: Clone the Code and Configure Remotes
+
+```bash
+ssh us-asaki-root
+cd /root
+git clone https://github.com/Wei-Shaw/sub2api.git
+cd sub2api
+
+# Add the fork remote
+git remote add fork https://github.com/touwaeriol/sub2api.git
+```
+
+### 2. Build Server: Switch to the Custom Branch and Build the Image
+
+```bash
+git fetch fork
+git checkout -B release/custom-0.1.69 fork/release/custom-0.1.69
+
+cd /root/sub2api
+docker build -t sub2api:latest -f Dockerfile .
+exit
+```
+
+### 3. Transfer the Image to the Production Server
+
+```bash
+ssh us-asaki-root "docker save sub2api:latest" | ssh clicodeplus "docker load"
+```
+
+### 4. Production Server: Clone the Code and Configure the Environment
+
+```bash
+ssh clicodeplus
+cd /root
+git clone https://github.com/Wei-Shaw/sub2api.git
+cd sub2api
+
+# Add the fork remote
+git remote add fork https://github.com/touwaeriol/sub2api.git
+git fetch fork
+git checkout -B release/custom-0.1.69 fork/release/custom-0.1.69
+
+# Configure environment variables
+cd deploy
+cp .env.example .env
+vim .env  # Set DATABASE_URL, REDIS_URL, JWT_SECRET, etc.
+```
+
+### 5. Production Server: Retag the Image and Start the Services
+
+```bash
+docker tag sub2api:latest weishaw/sub2api:latest
+cd /root/sub2api/deploy && docker compose up -d
+```
+
+### 6. Verify the Deployment
+
+```bash
+# View the application logs
+docker logs sub2api --tail 50
+
+# Check health status
+curl http://localhost:8080/health
+
+# Confirm the version
+cat /root/sub2api/backend/cmd/server/VERSION
+```
+
+### 7. Common Operations Commands
+
+```bash
+# Follow logs in real time
+docker logs -f sub2api
+
+# Restart the service
+docker compose restart sub2api
+
+# Stop all services
+docker compose down
+
+# Stop and remove data volumes (use with caution: this deletes database data)
+docker compose down -v
+
+# View resource usage
+docker stats sub2api
+```
+
+---
+
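+## Custom Features
+
+The custom branch currently includes the following features (relative to the official version):
+
+### UI/UX Customizations
+
+| Feature | Description |
+|------|------|
+| Home page improvements | User-facing value-proposition design |
+| GitHub link removed | The user menu no longer shows the GitHub link |
+| WeChat support button | Floating WeChat customer-support entry on the home page |
+| Precise rate-limit time display | Account rate-limit times are shown to the second |
+
+### Antigravity Platform Enhancements
+
+| Feature | Description |
+|------|------|
+| Scope-level rate limiting | Independent rate limiting per quota scope (claude/gemini_text/gemini_image), so the whole account is not locked |
+| Model-level rate limiting | Independent rate limiting per model (e.g. claude-opus-4-5) for finer-grained control |
+| Rate-limit pre-check | The scheduler checks account/model rate-limit status up front to avoid selecting rate-limited accounts |
+| Second-level cooldown | Supports cooldown times from 429 responses with second-level precision |
+| Identity injection improvements | Model identity injection plus a silent boundary to prevent identity leakage |
+| thoughtSignature fix | Fixes the 400 error on Gemini 3 function calls |
+| max_tokens auto-correction | Automatically corrects the 400 error caused by max_tokens <= budget_tokens |
+
+### Scheduling Algorithm Optimizations
+
+| Feature | Description |
+|------|------|
+| Layered filtering selection | The scheduler uses layered filtering instead of a full sort, improving performance |
+| LRU random selection | Picks randomly among accounts with the same LRU time to avoid concentrating load |
+| Configurable rate-limit wait threshold | The rate-limit wait threshold is configurable |
+
+### Operations Enhancements
+
+| Feature | Description |
+|------|------|
+| Scope rate-limit statistics | The ops UI shows scope-level rate-limit statistics for Antigravity accounts |
+| Account rate-limit status display | The account list shows scope-level and model-level rate-limit status |
+| Enhanced clear-rate-limit button | The clear button is also shown when a scope/model rate limit is active |
+
+### Other Fixes
+
+| Feature | Description |
+|------|------|
+| .gitattributes | Ensures migration files use LF line endings (fixes SQL digest mismatches on Windows) |
+| Deployment config improvements | DATABASE_HOST and DATABASE_SSLMODE can be configured through .env |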
+
+---
+
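+## Admin API Reference
+
+### Authentication
+
+All Admin API requests authenticate by passing the Admin API Key in the `x-api-key` request header.
+
+```
+x-api-key: admin-xxx
+```
+
+> **Usage**: once the user provides an admin token, use it directly as the value of `x-api-key`. The token format is `admin-` followed by 64 hexadecimal characters, generated in the admin console under `Settings > Admin API Key`. **Never write a real token into documentation or code.**
+
+### Environment URLs
+
+| Environment | Base URL | Notes |
+|------|----------|------|
+| Production | `https://clicodeplus.com` | Production environment |
+| Beta | `http://<server-ip>:8084` | Internal network only |
+| OpenAI | `http://<server-ip>:8083` | Internal network only |
+
+> In the API reference below, `${BASE}` is the environment base URL and `${KEY}` is the admin token provided by the user.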
+
+---
+
+### 1. Account Management
+
+#### 1.1 List Accounts
+
+```
+GET /api/v1/admin/accounts
+```
+
+**Query parameters**:
+
+| Parameter | Type | Required | Description |
+|------|------|------|------|
+| `platform` | string | No | Platform filter: `antigravity` / `anthropic` / `openai` / `gemini` |
+| `type` | string | No | Account type: `oauth` / `api_key` / `cookie` |
+| `status` | string | No | Status: `active` / `disabled` / `error` |
+| `search` | string | No | Search keyword (name, notes) |
+| `page` | int | No | Page number, default 1 |
+| `page_size` | int | No | Items per page, default 20 |
+
+```bash
+curl -s "${BASE}/api/v1/admin/accounts?platform=antigravity&page=1&page_size=100" \
+  -H "x-api-key: ${KEY}"
+```
+
+**Response**:
+```json
+{
+  "code": 0,
+  "message": "success",
+  "data": {
+    "items": [{"id": 1, "name": "xxx@gmail.com", "platform": "antigravity", "status": "active", ...}],
+    "total": 66
+  }
+}
+```
+
+#### 1.2 Get Account Details
+
+```
+GET /api/v1/admin/accounts/:id
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/accounts/1" -H "x-api-key: ${KEY}"
+```
+
+#### 1.3 Test Account Connectivity
+
+```
+POST /api/v1/admin/accounts/:id/test
+```
+
+**Request body** (JSON, optional):
+
+| Field | Type | Required | Description |
+|------|------|------|------|
+| `model_id` | string | No | Model to test, e.g. `claude-opus-4-6`; the default model is used if omitted |
+
+**Response format**: SSE (Server-Sent Events) stream
+
+```bash
+curl -N -X POST "${BASE}/api/v1/admin/accounts/1/test" \
+  -H "x-api-key: ${KEY}" \
+  -H "Content-Type: application/json" \
+  -d '{"model_id": "claude-opus-4-6"}'
+```
+
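+**SSE event types**:
+
+| type | Fields | Description |
+|------|------|------|
+| `test_start` | `model` | Test started; returns the name of the model under test |
+| `content` | `text` | Model response content (streamed text fragments) |
+| `test_end` | `success`, `error` | Test finished; `success=true` means it passed |
+| `error` | `text` | Error message |
+
+#### 1.4 Clear Account Rate Limit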
+
+```
+POST /api/v1/admin/accounts/:id/clear-rate-limit
+```
+
+```bash
+curl -X POST "${BASE}/api/v1/admin/accounts/1/clear-rate-limit" \
+  -H "x-api-key: ${KEY}"
+```
+
+#### 1.5 Clear Account Error State
+
+```
+POST /api/v1/admin/accounts/:id/clear-error
+```
+
+```bash
+curl -X POST "${BASE}/api/v1/admin/accounts/1/clear-error" \
+  -H "x-api-key: ${KEY}"
+```
+
+#### 1.6 List Available Models for an Account
+
+```
+GET /api/v1/admin/accounts/:id/models
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/accounts/1/models" -H "x-api-key: ${KEY}"
+```
+
+#### 1.7 Refresh OAuth Token
+
+```
+POST /api/v1/admin/accounts/:id/refresh
+```
+
+```bash
+curl -X POST "${BASE}/api/v1/admin/accounts/1/refresh" -H "x-api-key: ${KEY}"
+```
+
+#### 1.8 Refresh Account Tier
+
+```
+POST /api/v1/admin/accounts/:id/refresh-tier
+```
+
+```bash
+curl -X POST "${BASE}/api/v1/admin/accounts/1/refresh-tier" -H "x-api-key: ${KEY}"
+```
+
+#### 1.9 Get Account Statistics
+
+```
+GET /api/v1/admin/accounts/:id/stats
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/accounts/1/stats" -H "x-api-key: ${KEY}"
+```
+
+#### 1.10 Get Account Usage
+
+```
+GET /api/v1/admin/accounts/:id/usage
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/accounts/1/usage" -H "x-api-key: ${KEY}"
+```
+
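+#### 1.11 Batch-Test Accounts (Script)
+
+Test connectivity of a given model across all accounts on a given platform:
+
+```bash
+# The user must provide: BASE (environment URL), KEY (admin token), MODEL (model to test)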
+ACCOUNT_IDS=$(curl -s "${BASE}/api/v1/admin/accounts?platform=antigravity&page=1&page_size=100" \
+  -H "x-api-key: ${KEY}" | python3 -c "
+import json, sys
+data = json.load(sys.stdin)
+for item in data['data']['items']:
+    print(f\"{item['id']}|{item['name']}\")
+")
+
+while IFS='|' read -r ID NAME; do
+    echo "Testing account ID=${ID} (${NAME})..."
+    RESPONSE=$(curl -s --max-time 60 -N \
+      -X POST "${BASE}/api/v1/admin/accounts/${ID}/test" \
+      -H "x-api-key: ${KEY}" \
+      -H "Content-Type: application/json" \
+      -d "{\"model_id\": \"${MODEL}\"}" 2>&1)
+    if echo "$RESPONSE" | grep -q '"success":true'; then
+        echo "  ✅ Success"
+    elif echo "$RESPONSE" | grep -q '"type":"content"'; then
+        echo "  ✅ Success (content received)"
+    else
+        ERROR_MSG=$(echo "$RESPONSE" | grep -o '"error":"[^"]*"' | tail -1)
+        echo "  ❌ Failed: ${ERROR_MSG}"
+    fi
+done <<< "$ACCOUNT_IDS"
+```
+
+---
+
+### 2. Operations Monitoring
+
+#### 2.1 Concurrency Statistics
+
+```
+GET /api/v1/admin/ops/concurrency
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/ops/concurrency" -H "x-api-key: ${KEY}"
+```
+
+#### 2.2 Account Availability
+
+```
+GET /api/v1/admin/ops/account-availability
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/ops/account-availability" -H "x-api-key: ${KEY}"
+```
+
+#### 2.3 Real-Time Traffic Summary
+
+```
+GET /api/v1/admin/ops/realtime-traffic
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/ops/realtime-traffic" -H "x-api-key: ${KEY}"
+```
+
+#### 2.4 Request Error List
+
+```
+GET /api/v1/admin/ops/request-errors
+```
+
+**Query parameters**: `page`, `page_size`
+
+```bash
+curl -s "${BASE}/api/v1/admin/ops/request-errors?page=1&page_size=50" \
+  -H "x-api-key: ${KEY}"
+```
+
+#### 2.5 Upstream Error List
+
+```
+GET /api/v1/admin/ops/upstream-errors
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/ops/upstream-errors?page=1&page_size=50" \
+  -H "x-api-key: ${KEY}"
+```
+
+#### 2.6 Dashboard Overview
+
+```
+GET /api/v1/admin/ops/dashboard/overview
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/ops/dashboard/overview" -H "x-api-key: ${KEY}"
+```
+
+---
+
+### 3. System Settings
+
+#### 3.1 Get System Settings
+
+```
+GET /api/v1/admin/settings
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/settings" -H "x-api-key: ${KEY}"
+```
+
+#### 3.2 Update System Settings
+
+```
+PUT /api/v1/admin/settings
+```
+
+```bash
+curl -X PUT "${BASE}/api/v1/admin/settings" \
+  -H "x-api-key: ${KEY}" \
+  -H "Content-Type: application/json" \
+  -d '{ ... }'
+```
+
+#### 3.3 Admin API Key Status (Masked)
+
+```
+GET /api/v1/admin/settings/admin-api-key
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/settings/admin-api-key" -H "x-api-key: ${KEY}"
+```
+
+---
+
+### 4. User Management
+
+#### 4.1 List Users
+
+```
+GET /api/v1/admin/users
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/users?page=1&page_size=20" -H "x-api-key: ${KEY}"
+```
+
+#### 4.2 Get User Details
+
+```
+GET /api/v1/admin/users/:id
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/users/1" -H "x-api-key: ${KEY}"
+```
+
+#### 4.3 Update User Balance
+
+```
+POST /api/v1/admin/users/:id/balance
+```
+
+```bash
+curl -X POST "${BASE}/api/v1/admin/users/1/balance" \
+  -H "x-api-key: ${KEY}" \
+  -H "Content-Type: application/json" \
+  -d '{"amount": 100, "reason": "top-up"}'
+```
+
+---
+
+### 5. Group Management
+
+#### 5.1 List Groups
+
+```
+GET /api/v1/admin/groups
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/groups" -H "x-api-key: ${KEY}"
+```
+
+#### 5.2 All Groups (Unpaginated)
+
+```
+GET /api/v1/admin/groups/all
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/groups/all" -H "x-api-key: ${KEY}"
+```
+
+---
+
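+## Notes
+
+1. **The frontend must be bundled into the image**: build with `docker build` on the build server (`us-asaki-root`). The Dockerfile compiles the frontend and embeds it in the backend binary; after the build, transfer the image to the production server (`clicodeplus`) with `docker save | docker load`
+
+2. **Image tag**: docker-compose.yml uses `weishaw/sub2api:latest`, so a locally built image must be retagged with `docker tag` to replace it
+
+3. **Windows line endings**: resolved via `.gitattributes`, which ensures `*.sql` files always use LF
+
+4. **Version management**: every release must update `backend/cmd/server/VERSION` and create a tag
+
+5. **Merge conflicts**: when merging a new upstream version, watch for conflicts in these files in particular: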
+   - `backend/internal/service/antigravity_gateway_service.go`
+   - `backend/internal/service/gateway_service.go`
+   - `backend/internal/pkg/antigravity/request_transformer.go`
+
+---
+
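+## Go Code Conventions
+
+### 1. Function Design
+
+#### Single Responsibility Principle
+- **Function length**: a function should normally not exceed **30 lines**; split it into helper functions when it does. Logic that genuinely cannot be split (such as a complex state machine or protocol parsing) is an exception, but add a comment explaining why
+- **Nesting depth**: avoid more than 3 levels of nesting; use early returns to reduce nesting
+
+```go
+// ❌ Not recommended: deep nesting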
+func process(data []Item) {
+    for _, item := range data {
+        if item.Valid {
+            if item.Type == "A" {
+                if item.Status == "active" {
+                    // business logic...
+                }
+            }
+        }
+    }
+}
+
+// ✅ Recommended: early return
+func process(data []Item) {
+    for _, item := range data {
+        if !item.Valid {
+            continue
+        }
+        if item.Type != "A" {
+            continue
+        }
+        if item.Status != "active" {
+            continue
+        }
+        // business logic...
+    }
+}
+```
+
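+#### Extracting Complex Logic
+Extract complex conditions or processing logic into separate functions:
+
+```go
+// ❌ Not recommended: complex logic inline
+if resp.StatusCode == 429 || resp.StatusCode == 503 {
+    // 80+ lines of handling logic...
+}
+
+// ✅ Recommended: extract into a separate function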
+result := handleRateLimitResponse(resp, params)
+switch result.action {
+case actionRetry:
+    continue
+case actionBreak:
+    return result.resp, nil
+}
+```
+
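+### 2. Eliminating Duplicate Code
+
+#### Configuration Lookup Pattern
+Extract repeated configuration-lookup logic into a method:
+
+```go
+// ❌ Not recommended: duplicated code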
+logBody := s.settingService != nil && s.settingService.cfg != nil && s.settingService.cfg.Gateway.LogUpstreamErrorBody
+maxBytes := 2048
+if s.settingService != nil && s.settingService.cfg != nil && s.settingService.cfg.Gateway.LogUpstreamErrorBodyMaxBytes > 0 {
+    maxBytes = s.settingService.cfg.Gateway.LogUpstreamErrorBodyMaxBytes
+}
+
+// ✅ Recommended: extract into a method
+func (s *Service) getLogConfig() (logBody bool, maxBytes int) {
+    maxBytes = 2048
+    if s.settingService == nil || s.settingService.cfg == nil {
+        return false, maxBytes
+    }
+    cfg := s.settingService.cfg.Gateway
+    if cfg.LogUpstreamErrorBodyMaxBytes > 0 {
+        maxBytes = cfg.LogUpstreamErrorBodyMaxBytes
+    }
+    return cfg.LogUpstreamErrorBody, maxBytes
+}
+```
+
+### 3. Constants
+
+#### Avoid Magic Numbers
+Every hard-coded numeric value should be defined as a constant:
+
+```go
+// ❌ Not recommended
+if retryDelay >= 10*time.Second {
+    resetAt := time.Now().Add(30 * time.Second)
+}
+
+// ✅ Recommended
+const (
+    rateLimitThreshold       = 10 * time.Second
+    defaultRateLimitDuration = 30 * time.Second
+)
+
+if retryDelay >= rateLimitThreshold {
+    resetAt := time.Now().Add(defaultRateLimitDuration)
+}
+```
+
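+#### Reference Constant Names in Comments
+Refer to the constant name in comments rather than the hard-coded value:
+
+```go
+// ❌ Not recommended
+// < 10s: wait, then retry
+
+// ✅ Recommended
+// < rateLimitThreshold: wait, then retry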
+```
+
+### 4. Error Handling
+
+#### Use Structured Logging
+Prefer `slog` for structured logging:
+
+```go
+// ❌ Not recommended
+log.Printf("%s status=%d model_rate_limit_failed model=%s error=%v", prefix, statusCode, modelName, err)
+
+// ✅ Recommended
+slog.Error("failed to set model rate limit",
+    "prefix", prefix,
+    "status_code", statusCode,
+    "model", modelName,
+    "error", err,
+)
+```
+
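+### 5. Testing Conventions
+
+#### Keep Mock Signatures in Sync
+When a function signature changes, update every mock of it in the tests as well:
+
+```go
+// If the handleError signature changes
+handleError func(..., groupID int64, sessionHash string) *Result
+
+// the mock in the tests must be updated to match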
+handleError: func(..., groupID int64, sessionHash string) *Result {
+    return nil
+},
+```
+
+#### Test Build Tag
+Use one consistent build tag for tests:
+
+```go
+//go:build unit
+
+package service
+```
+
+### 6. Parsing Time Formats
+
+#### Use the Standard Library
+Prefer `time.ParseDuration`, which supports all Go duration formats:
+
+```go
+// ❌ Not recommended: restricting the format by hand
+if !strings.HasSuffix(delay, "s") || strings.Contains(delay, "m") {
+    continue
+}
+
+// ✅ Recommended: use the standard library
+dur, err := time.ParseDuration(delay) // accepts "0.5s", "4m50s", "1h30m", etc.
+```
+
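+### 7. Interface Design
+
+#### Interface Segregation Principle
+Define minimal interfaces that contain only the methods required:
+
+```go
+// ❌ Not recommended: an overly broad interface
+type AccountRepository interface {
+    // 20+ methods...
+}
+
+// ✅ Recommended: define a minimal interface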
+type ModelRateLimiter interface {
+    SetModelRateLimit(ctx context.Context, id int64, modelKey string, resetAt time.Time) error
+}
+```
+
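+### 8. Concurrency Safety
+
+#### Protecting Shared Data
+Ensure thread safety when accessing data that may be modified concurrently:
+
+```go
+// If Account.Extra may be modified concurrently,
+// reads must be protected with a mutex or atomic operations
+func (a *Account) GetRateLimitRemainingTime(model string) time.Duration {
+    a.mu.RLock()
+    defer a.mu.RUnlock()
+    // read the Extra field...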
+}
+```
+
+### 9. Naming Conventions
+
+#### Consistent Naming Style
+- Constants use camelCase: `rateLimitThreshold`
+- Types use PascalCase: `AntigravityQuotaScope`
+- Use one name per concept: `Threshold` or `Limit`, not both
+
+```go
+// ❌ Not recommended: inconsistent naming
+antigravitySmartRetryMinWait    // uses Min
+antigravityRateLimitThreshold   // uses Threshold
+
+// ✅ Recommended: consistent style
+antigravityMinRetryWait
+antigravityRateLimitThreshold
+```
+
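+### 10. Code Review Checklist
+
+Before committing code, check the following:
+
+- [ ] Does any function exceed 30 lines? (Logic that cannot be split is exempt but needs an explanatory comment)
+- [ ] Is nesting deeper than 3 levels?
+- [ ] Is there duplicated code that can be extracted?
+- [ ] Are there magic numbers?
+- [ ] Do mock function signatures match the real functions?
+- [ ] Do tests cover the new logic?
+- [ ] Do logs include enough context?
+- [ ] Has concurrency safety been considered?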
+
+---
+
+## CI Checks and Release Gates
+
+### GitHub Actions Checks
+
+This project has 4 CI jobs, and **all of them must pass before any code push or release**:
+
+| Workflow | Job | Description | Local Verification Command |
+|----------|-----|------|-------------|
+| CI | `test` | Unit tests + integration tests | `cd backend && make test-unit && make test-integration` |
+| CI | `golangci-lint` | Go static analysis (golangci-lint v2.7) | `cd backend && golangci-lint run --timeout=5m` |
+| Security Scan | `backend-security` | govulncheck + gosec security scans | `cd backend && govulncheck ./... && gosec -severity high -confidence high ./...` |
+| Security Scan | `frontend-security` | pnpm audit dependency security check for the frontend | `cd frontend && pnpm audit --prod --audit-level=high` |
+
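+### Submitting PRs Upstream
+
+PRs target the official upstream repository and **contain only general-purpose changes** (bug fixes, new features, performance improvements, etc.).
+
+**The following must not appear in a PR** (they are customizations specific to our fork):
+- `CLAUDE.md`, `AGENTS.md`: our development docs
+- `backend/cmd/server/VERSION`: our version file
+- UI customizations (GitHub link removal, WeChat support button, home page customization, etc.)
+- Deployment configuration (custom changes under `deploy/`)
+
+**PR workflow**:
+1. Create a feature branch from `develop` containing only the changes to submit upstream
+2. After pushing the branch, **wait for all 4 CI jobs to pass**
+3. Create the PR only after they pass
+4. Check status with `gh run list --repo touwaeriol/sub2api --branch <branch>`
+
+### Pushing to Our Own Branches (develop / main)
+
+Pushes to our own `develop` or `main` branches include all changes (customizations + general-purpose features).
+
+**Run all CI checks locally before pushing** (do not wait for GitHub Actions):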
+
+```bash
+# Make sure the Go toolchain is available (macOS Homebrew)
+export PATH="/opt/homebrew/bin:$HOME/go/bin:$PATH"
+
+# 1. Unit tests (required)
+cd backend && make test-unit
+
+# 2. Integration tests (recommended; requires Docker)
+make test-integration
+
+# 3. golangci-lint static analysis (required)
+golangci-lint run --timeout=5m
+
+# 4. gofmt formatting check (required)
+gofmt -l .
+# If it prints any files, run gofmt -w <file> to fix them
+```
+
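+**After pushing**:
+1. Check the GitHub Actions status with `gh run list --repo touwaeriol/sub2api --branch <branch>`
+2. Confirm that all 4 jobs across the CI and Security Scan workflows are green ✅
+3. Any failed job must be fixed immediately; **do not continue with later steps while CI is failing**
+
+### Releasing a Version
+
+1. Run all the CI checks above locally and confirm they pass
+2. Increment `backend/cmd/server/VERSION`, commit, and push
+3. After pushing, confirm that all 4 CI jobs pass in GitHub Actions
+4. **Do not deploy while CI is failing**: fix the problem first
+5. Confirm the status with `gh run list --repo touwaeriol/sub2api --limit 10`
+
+### Common CI Failures and Fixes
+- **gofmt**: inconsistent struct field alignment → run `gofmt -w <file>` to fix
+- **golangci-lint**: unused variables/imports → remove them or use `_` to ignore
+- **test failures**: mock function signature mismatch → update the mock to match
+- **gosec**: security findings → fix as indicated or add an exception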
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 00000000..737cdf19
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,1251 @@
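+# Sub2API Development Guide
+
+## Versioning Strategy
+
+### Version Number Rules
+
+We append our own patch number to the official version number:
+
+- Official version: `v0.1.68`
+- Our versions: `v0.1.68.1`, `v0.1.68.2` (incrementing)
+
+### Branching Strategy
+
+| Branch | Description |
+|------|------|
+| `main` | Our main branch, containing all custom features |
+| `release/custom-X.Y.Z` | Release branch based on the official `vX.Y.Z` |
+| `upstream/main` | The official upstream repository |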
+
+---
+
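+## Release Process (Based on a New Official Version)
+
+When upstream publishes a new version (e.g. `v0.1.69`):
+
+### 1. Sync Upstream and Create a Release Branch
+
+```bash
+# Fetch the latest upstream code
+git fetch upstream --tags
+
+# Create a new release branch from the official tag
+git checkout v0.1.69 -b release/custom-0.1.69
+
+# Merge our main branch (contains all custom features)
+git merge main --no-edit
+
+# Resolve any conflicts, then continue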
+```
+
+### 2. Update the Version Number and Tag
+
+```bash
+# Update the version file
+echo "0.1.69.1" > backend/cmd/server/VERSION
+git add backend/cmd/server/VERSION
+git commit -m "chore: bump version to 0.1.69.1"
+
+# Create our own tag
+git tag v0.1.69.1
+
+# Push the branch and the tag
+git push origin release/custom-0.1.69
+git push origin v0.1.69.1
+```
+
+### 3. Update the main Branch
+
+```bash
+# Merge the release branch back into main so main keeps the latest custom features
+git checkout main
+git merge release/custom-0.1.69
+git push origin main
+```
+
+---
+
+## Hotfix Release (Fixing an Existing Version)
+
+When a fix needs to be released on the current version:
+
+```bash
+# Apply the fix on the current release branch
+git checkout release/custom-0.1.68
+# ... make the fix ...
+git commit -m "fix: describe the fix"
+
+# Increment the patch number
+echo "0.1.68.2" > backend/cmd/server/VERSION
+git add backend/cmd/server/VERSION
+git commit -m "chore: bump version to 0.1.68.2"
+
+# Tag and push
+git tag v0.1.68.2
+git push origin release/custom-0.1.68
+git push origin v0.1.68.2
+
+# Sync the fix to main
+git checkout main
+git cherry-pick <fix-commit-hash>
+git push origin main
+```
+
+---
+
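+## Server Deployment Process
+
+### Prerequisites
+
+- The SSH alias `clicodeplus` is configured locally and connects to the production server (runs the services)
+- The SSH alias `us-asaki-root` is configured locally and connects to the build server (pulls code, builds images)
+- Production deployment directories: `/root/sub2api` (production), `/root/sub2api-beta` (testing)
+- The production server deploys with Docker Compose
+- **All images are built on the build server** so that compilation does not consume CPU/memory on the production server and affect live services
+
+### Server Roles
+
+| Server | SSH Alias | Responsibilities |
+|--------|----------|------|
+| Build server | `us-asaki-root` | Pull code, build images with `docker build` |
+| Production server | `clicodeplus` | Load images, run services, verify deployments |
+| Database server | `db-clicodeplus` | PostgreSQL 16 + Redis 7, shared by all environments |
+
+> Database server operations manual: `db-clicodeplus:/root/README.md`
+
+### Deployment Environments
+
+| Environment | Directory (production server) | Port | Database | Redis DB | Container Name |
+|------|------|------|--------|----------|--------|
+| Production | `/root/sub2api` | 8080 | `sub2api` | 0 | `sub2api` |
+| Beta | `/root/sub2api-beta` | 8084 | `beta` | 2 | `sub2api-beta` |
+| OpenAI | `/root/sub2api-openai` | 8083 | `openai` | 3 | `sub2api-openai` |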
+
+### External Database and Redis
+
+All environments (Production, Beta, OpenAI) share **PostgreSQL 16** and **Redis 7** on `db.clicodeplus.com`; no containerized database or Redis is used.
+
+**PostgreSQL** (port 5432, TLS encryption, scram-sha-256 authentication):
+
+| Environment | User | Database |
+|------|--------|--------|
+| Production | `sub2api` | `sub2api` |
+| Beta | `beta` | `beta` |
+| OpenAI | `openai` | `openai` |
+
+**Redis** (port 6379, password authentication):
+
+| Environment | DB |
+|------|-----|
+| Production | 0 |
+| Beta | 2 |
+| OpenAI | 3 |
+
+**Configuration**:
+- The database is configured through `DATABASE_HOST`, `DATABASE_SSLMODE`, `POSTGRES_USER`, `POSTGRES_PASSWORD`, and `POSTGRES_DB` in `.env`
+- Redis: `REDIS_HOST` is overridden in `docker-compose.override.yml` (the main compose file hard-codes it to `redis`), and the password is set through `REDIS_PASSWORD` in `.env`
+- Each environment's `docker-compose.override.yml` removes the dependency on the containerized Redis via `depends_on: !reset {}` and `redis: profiles: [disabled]`
+
+#### Database Commands
+
+Run database operations on the server over SSH:
+
+```bash
+# Production - query migration records
+ssh clicodeplus "source /root/sub2api/deploy/.env && PGPASSWORD=\"\$POSTGRES_PASSWORD\" psql -h \$DATABASE_HOST -U \$POSTGRES_USER -d \$POSTGRES_DB -c 'SELECT * FROM schema_migrations ORDER BY applied_at DESC LIMIT 5;'"
+
+# Beta - query migration records
+ssh clicodeplus "source /root/sub2api-beta/deploy/.env && PGPASSWORD=\"\$POSTGRES_PASSWORD\" psql -h \$DATABASE_HOST -U \$POSTGRES_USER -d \$POSTGRES_DB -c 'SELECT * FROM schema_migrations ORDER BY applied_at DESC LIMIT 5;'"
+
+# Beta - delete a specific migration record (so the migration runs again)
+ssh clicodeplus "source /root/sub2api-beta/deploy/.env && PGPASSWORD=\"\$POSTGRES_PASSWORD\" psql -h \$DATABASE_HOST -U \$POSTGRES_USER -d \$POSTGRES_DB -c \"DELETE FROM schema_migrations WHERE filename LIKE '%049%';\""
+
+# Beta: update account data
+ssh clicodeplus "source /root/sub2api-beta/deploy/.env && PGPASSWORD=\"\$POSTGRES_PASSWORD\" psql -h \$DATABASE_HOST -U \$POSTGRES_USER -d \$POSTGRES_DB -c \"UPDATE accounts SET credentials = credentials - 'model_mapping' WHERE platform = 'antigravity';\""
+```
+
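+> **Note**: `source .env` loads the credentials from the environment file, so the password is never typed on the command line.
+
+### Deployment steps
+
+**Important: increment the version number on every deployment!**
+
+#### 0. Increment the version number and push (local)
+
+Before each deployment, increment the custom patch version locally and make sure the push succeeds: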
+
+```bash
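+# Check the current version
+cat backend/cmd/server/VERSION
+# Suppose it is currently 0.1.69.1
+
+# Increment the version
+echo "0.1.69.2" > backend/cmd/server/VERSION
+git add backend/cmd/server/VERSION
+git commit -m "chore: bump version to 0.1.69.2"
+git push origin release/custom-0.1.69
+
+# ⚠️ Confirm the push succeeded (the output must show the branch update, with no "rejected" error)
+```
+
+> **Checkpoint**: if there are other uncommitted changes, commit and push them first so that everything on the release branch is on the remote.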
+
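+#### 1. Pull the code on the build server
+
+```bash
+# Fetch the latest code and switch to the release branch
+ssh us-asaki-root "cd /root/sub2api && git fetch origin && git checkout -B release/custom-0.1.69 origin/release/custom-0.1.69"
+
+# ⚠️ Verify that the version matches step 0
+ssh us-asaki-root "cat /root/sub2api/backend/cmd/server/VERSION"
+```
+
+> **First time using the build server?** Initialize the repository first; see the section on first-time build server initialization below.
+
+#### 2. Build the image on the build server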
+
+```bash
+ssh us-asaki-root "cd /root/sub2api && docker build --no-cache -t sub2api:latest -f Dockerfile ."
+
+# ⚠️ The build must report success; if it fails, troubleshoot before continuing
+```
+
+> **Common build problems**:
+> - An outdated `buildx` causes an API version mismatch → update buildx: `curl -fsSL "https://github.com/docker/buildx/releases/latest/download/buildx-$(curl -fsSL https://api.github.com/repos/docker/buildx/releases/latest | grep tag_name | cut -d'"' -f4).linux-amd64" -o ~/.docker/cli-plugins/docker-buildx && chmod +x ~/.docker/cli-plugins/docker-buildx`
+> - Out of disk space → run `docker system prune -f` to clean up unused Docker data
+
+#### 3. Transfer the image to the production server and load it
+
+```bash
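+# Export the image → stream it through a pipe → load it on the production server
+ssh us-asaki-root "docker save sub2api:latest" | ssh clicodeplus "docker load"
+
+# ⚠️ The output must include "Loaded image: sub2api:latest"
+```
+
+#### 4. Sync code, retag, and restart on the production server
+
+```bash
+# Sync the code (used for version confirmation and the deploy configuration)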
+ssh clicodeplus "cd /root/sub2api && git fetch fork && git checkout -B release/custom-0.1.69 fork/release/custom-0.1.69"
+
+# Retag the image and restart
+ssh clicodeplus "docker tag sub2api:latest weishaw/sub2api:latest"
+ssh clicodeplus "cd /root/sub2api/deploy && docker compose up -d --force-recreate sub2api"
+```
+
+#### 5. Verify the deployment
+
+```bash
+# View the startup logs
+ssh clicodeplus "docker logs sub2api --tail 20"
+
+# Confirm the version (must match the version set in step 0)
+ssh clicodeplus "cat /root/sub2api/backend/cmd/server/VERSION"
+
+# Check the container status (must show healthy)
+ssh clicodeplus "docker ps | grep sub2api"
+```
+
+---
+
+### First-time build server initialization
+
+The first time `us-asaki-root` is used as the build server, run the following one-time steps:
+
+```bash
+ssh us-asaki-root
+
+# 1) Clone the repository
+cd /root
+git clone https://github.com/touwaeriol/sub2api.git sub2api
+cd sub2api
+
+# 2) Check the Docker and buildx versions
+docker version
+docker buildx version
+# If buildx is too old (< v0.14), update it:
+# LATEST=$(curl -fsSL https://api.github.com/repos/docker/buildx/releases/latest | grep tag_name | cut -d'"' -f4)
+# curl -fsSL "https://github.com/docker/buildx/releases/download/${LATEST}/buildx-${LATEST}.linux-amd64" -o ~/.docker/cli-plugins/docker-buildx
+# chmod +x ~/.docker/cli-plugins/docker-buildx
+
+# 3) Verify that a build works
+docker build --no-cache -t sub2api:test -f Dockerfile .
+docker rmi sub2api:test
+```
+
+---
+
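+## Parallel Beta deployment (without affecting production)
+
+Goal: start a beta instance in parallel on the same server (for example on port `8084`), and **never modify or restart** the production instance (default directory `/root/sub2api`).
+
+### Design principles
+
+- **Separate directory**: beta uses its own directory, for example `/root/sub2api-beta`.
+- **Secrets live only in `.env`**: beta's database password, JWT_SECRET, and similar values go only into `/root/sub2api-beta/deploy/.env`; do not commit them to git.
+- **Separate Compose project**: start it with `docker compose -p sub2api-beta ...` so that networks and volumes are isolated.
+- **Separate port**: map the host port through `SERVER_PORT` in `.env` (for example `8084:8080`).
+
+### Pre-flight checks
+
+```bash
+# 1) Make sure port 8084 is free
+ssh clicodeplus "ss -ltnp | grep :8084 || echo '8084 is free'"
+
+# 2) Confirm the production containers are still running (read-only check)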
+ssh clicodeplus "docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Ports}}' | sed -n '1,200p'"
+```
+
+### First-time deployment
+
+> **Build server note**: production and beta share the `/root/sub2api` repository on the build server and are told apart by image tag (`sub2api:latest` for production, `sub2api:beta` for testing).
+
+```bash
+# 1) Build the beta image on the build server (shares /root/sub2api; check out the target branch, then tag the image as beta)
+ssh us-asaki-root "cd /root/sub2api && git fetch origin && git checkout -B release/custom-0.1.71 origin/release/custom-0.1.71"
+ssh us-asaki-root "cd /root/sub2api && docker build --no-cache -t sub2api:beta -f Dockerfile ."
+
+# ⚠️ To restore the production branch after the build:
+# ssh us-asaki-root "cd /root/sub2api && git checkout release/custom-<production-version>"
+
+# 2) Transfer the image to the production server
+ssh us-asaki-root "docker save sub2api:beta" | ssh clicodeplus "docker load"
+# ⚠️ The output must include "Loaded image: sub2api:beta"
+
+# 3) Prepare the beta environment on the production server
+ssh clicodeplus
+
+# Clone the code (only for the deploy configuration and version confirmation; nothing is built here)
+cd /root
+git clone https://github.com/touwaeriol/sub2api.git sub2api-beta
+cd /root/sub2api-beta
+git checkout release/custom-0.1.71
+
+# 4) Prepare beta's .env (secrets go only here)
+cd /root/sub2api-beta/deploy
+
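+# Recommended: copy production's .env so that everything except the port, DB user, DB name, and Redis DB stays identical
+cp -f /root/sub2api/deploy/.env ./.env
+
+# Change only the following items (leave everything else as is); REDIS_DB=2 matches the Beta row of the Redis table
+perl -pi -e 's/^SERVER_PORT=.*/SERVER_PORT=8084/' ./.env
+perl -pi -e 's/^POSTGRES_USER=.*/POSTGRES_USER=beta/' ./.env
+perl -pi -e 's/^POSTGRES_DB=.*/POSTGRES_DB=beta/; s/^REDIS_DB=.*/REDIS_DB=2/' ./.env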
+
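+# 5) Write the compose override (avoids a container-name clash with production; uses the sub2api:beta image transferred from the build server; Redis points to the external service)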
+cat > docker-compose.override.yml <<'YAML'
+services:
+  sub2api:
+    image: sub2api:beta
+    container_name: sub2api-beta
+    environment:
+      - DATABASE_HOST=${DATABASE_HOST:-postgres}
+      - DATABASE_SSLMODE=${DATABASE_SSLMODE:-disable}
+      - REDIS_HOST=db.clicodeplus.com
+    depends_on: !reset {}
+  redis:
+    profiles:
+      - disabled
+YAML
+
+# 6) Start beta (separate project, so production is unaffected)
+cd /root/sub2api-beta/deploy
+docker compose -p sub2api-beta --env-file .env -f docker-compose.yml -f docker-compose.override.yml up -d
+
+# 7) Verify beta
+curl -fsS http://127.0.0.1:8084/health
+docker logs sub2api-beta --tail 50
+```
+
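+### Database configuration conventions (beta)
+
+- Database host/SSL/password: same as production (copy them from production's `.env`); both point to `db.clicodeplus.com`.
+- Change only:
+  - `POSTGRES_USER=beta`
+  - `POSTGRES_DB=beta`
+  - `REDIS_DB=2`
+
+Note: the `beta` user and `beta` database must already exist on the database server with the required privileges; otherwise the container fails to start and keeps restarting.
+
+### Updating beta (build on the build server, transfer, restart only the beta container)
+
+```bash
+# 1) Pull the code and build the image on the build server (shares the /root/sub2api repository)
+ssh us-asaki-root "cd /root/sub2api && git fetch origin && git checkout -B release/custom-0.1.71 origin/release/custom-0.1.71"
+ssh us-asaki-root "cd /root/sub2api && docker build --no-cache -t sub2api:beta -f Dockerfile ."
+# ⚠️ The build must report success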
+
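+# 2) Transfer the image to the production server
+ssh us-asaki-root "docker save sub2api:beta" | ssh clicodeplus "docker load"
+# ⚠️ The output must include "Loaded image: sub2api:beta"
+
+# 3) Sync the code on the production server (used for version confirmation and the deploy configuration)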
+ssh clicodeplus "set -e; cd /root/sub2api-beta && git fetch --all --tags && git checkout -f release/custom-0.1.71 && git reset --hard origin/release/custom-0.1.71"
+
+# 4) Restart the beta container and verify
+ssh clicodeplus "cd /root/sub2api-beta/deploy && docker compose -p sub2api-beta --env-file .env -f docker-compose.yml -f docker-compose.override.yml up -d --no-deps --force-recreate sub2api"
+ssh clicodeplus "sleep 5 && curl -fsS http://127.0.0.1:8084/health"
+ssh clicodeplus "cat /root/sub2api-beta/backend/cmd/server/VERSION"
+```
+
+### Stopping or rolling back beta (affects beta only)
+
+```bash
+ssh clicodeplus "cd /root/sub2api-beta/deploy && docker compose -p sub2api-beta -f docker-compose.yml -f docker-compose.override.yml down"
+```
+
+---
+
+## First-time server deployment
+
+### 1. Build server: clone the code and configure the remotes
+
+```bash
+ssh us-asaki-root
+cd /root
+git clone https://github.com/Wei-Shaw/sub2api.git
+cd sub2api
+
+# Add the fork remote
+git remote add fork https://github.com/touwaeriol/sub2api.git
+```
+
+### 2. Build server: switch to the custom branch and build the image
+
+```bash
+git fetch fork
+git checkout -B release/custom-0.1.69 fork/release/custom-0.1.69
+
+cd /root/sub2api
+docker build -t sub2api:latest -f Dockerfile .
+exit
+```
+
+### 3. Transfer the image to the production server
+
+```bash
+ssh us-asaki-root "docker save sub2api:latest" | ssh clicodeplus "docker load"
+```
+
+### 4. Production server: clone the code and configure the environment
+
+```bash
+ssh clicodeplus
+cd /root
+git clone https://github.com/Wei-Shaw/sub2api.git
+cd sub2api
+
+# Add the fork remote
+git remote add fork https://github.com/touwaeriol/sub2api.git
+git fetch fork
+git checkout -B release/custom-0.1.69 fork/release/custom-0.1.69
+
+# Configure the environment variables
+cd deploy
+cp .env.example .env
+vim .env  # set DATABASE_HOST=db.clicodeplus.com, POSTGRES_PASSWORD, REDIS_PASSWORD, JWT_SECRET, etc.
+
+# Create the override file (Redis points to the external service; the container Redis dependency is removed)
+cat > docker-compose.override.yml <<'YAML'
+services:
+  sub2api:
+    environment:
+      - REDIS_HOST=db.clicodeplus.com
+    depends_on: !reset {}
+  redis:
+    profiles:
+      - disabled
+YAML
+```
+
+### 5. Production server: retag the image and start the service
+
+```bash
+docker tag sub2api:latest weishaw/sub2api:latest
+cd /root/sub2api/deploy && docker compose up -d
+```
+
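+### 6. Verify the deployment
+
+```bash
+# View the application logs
+docker logs sub2api --tail 50
+
+# Check the health endpoint
+curl http://localhost:8080/health
+
+# Confirm the version
+cat /root/sub2api/backend/cmd/server/VERSION
+```
+
+### 7. Common operations commands
+
+```bash
+# Follow the logs in real time
+docker logs -f sub2api
+
+# Restart the service
+docker compose restart sub2api
+
+# Stop all services
+docker compose down
+
+# Stop and delete the data volumes (use with caution: all data stored in them is lost)
+docker compose down -v
+
+# Show resource usage
+docker stats sub2api
+```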
+
+---
+
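+## Custom features
+
+The custom branch currently adds the following features on top of the official release:
+
+### UI/UX customizations
+
+| Feature | Description |
+|------|------|
+| Home page redesign | Value-proposition design aimed at end users |
+| GitHub link removed | The user menu no longer shows the GitHub link |
+| WeChat support button | Floating WeChat customer-support entry on the home page |
+| Precise rate-limit time | Account rate-limit times are shown to the second |
+
+### Antigravity platform enhancements
+
+| Feature | Description |
+|------|------|
+| Scope-level rate limiting | Rate limits apply per quota scope (claude/gemini_text/gemini_image), so the whole account is not locked |
+| Model-level rate limiting | Rate limits apply per model (for example claude-opus-4-5) for finer-grained control |
+| Rate-limit pre-check | The scheduler checks account/model rate-limit state up front and skips accounts that are already limited |
+| Second-level cooldown | Honors second-precision cooldown times from 429 responses |
+| Identity injection | Injects model identity information and adds a silent boundary to prevent identity leaks |
+| thoughtSignature fix | Fixes the 400 error on Gemini 3 function calls |
+| max_tokens auto-correction | Automatically corrects requests where max_tokens <= budget_tokens, which would otherwise return 400 |
+
+### Scheduling algorithm improvements
+
+| Feature | Description |
+|------|------|
+| Layered filtering | The scheduler uses layered filtering instead of a full sort, which improves performance |
+| Random LRU tie-break | Accounts with the same LRU time are picked at random to avoid piling onto one account |
+| Configurable rate-limit wait threshold | The rate-limit wait threshold can be configured |
+
+### Operations enhancements
+
+| Feature | Description |
+|------|------|
+| Scope rate-limit statistics | The ops UI shows scope-level rate-limit statistics for Antigravity accounts |
+| Account rate-limit status | The account list shows scope-level and model-level rate-limit status |
+| Clear-rate-limit button | The button also appears when a scope or model rate limit is active |
+
+### Other fixes
+
+| Feature | Description |
+|------|------|
+| .gitattributes | Forces LF line endings for migration files (fixes SQL checksum mismatches on Windows) |
+| Deployment configuration | DATABASE_HOST and DATABASE_SSLMODE can be set through .env |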
+
+---
+
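+## Admin API reference
+
+### Authentication
+
+Every Admin API request authenticates by passing the Admin API Key in the `x-api-key` request header.
+
+```
+x-api-key: admin-xxx
+```
+
+> **Usage**: once the user supplies an admin token, use it directly as the value of `x-api-key`. The token is `admin-` followed by 64 hexadecimal characters and is generated in the admin console under `设置 > Admin API Key` (Settings > Admin API Key). **Never write a real token into documentation or code.**
+
+### Environment URLs
+
+| Environment | Base URL | Notes |
+|------|----------|------|
+| Production | `https://clicodeplus.com` | Production environment |
+| Beta | `http://<server-ip>:8084` | Internal network only |
+| OpenAI | `http://<server-ip>:8083` | Internal network only |
+
+> In the reference below, `${BASE}` stands for the environment's base URL and `${KEY}` for the admin token supplied by the user.
+
+---
+
+### 1. Account management
+
+#### 1.1 List accounts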
+
+```
+GET /api/v1/admin/accounts
+```
+
+**Query parameters**:
+
+| Parameter | Type | Required | Description |
+|------|------|------|------|
+| `platform` | string | No | Platform filter: `antigravity` / `anthropic` / `openai` / `gemini` |
+| `type` | string | No | Account type: `oauth` / `api_key` / `cookie` |
+| `status` | string | No | Status: `active` / `disabled` / `error` |
+| `search` | string | No | Search keyword (name, notes) |
+| `page` | int | No | Page number, default 1 |
+| `page_size` | int | No | Items per page, default 20 |
+
+```bash
+curl -s "${BASE}/api/v1/admin/accounts?platform=antigravity&page=1&page_size=100" \
+  -H "x-api-key: ${KEY}"
+```
+
+**Response**:
+```json
+{
+  "code": 0,
+  "message": "success",
+  "data": {
+    "items": [{"id": 1, "name": "xxx@gmail.com", "platform": "antigravity", "status": "active", ...}],
+    "total": 66
+  }
+}
+```
+
+#### 1.2 Get account details
+
+```
+GET /api/v1/admin/accounts/:id
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/accounts/1" -H "x-api-key: ${KEY}"
+```
+
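+#### 1.3 Test account connectivity
+
+```
+POST /api/v1/admin/accounts/:id/test
+```
+
+**Request body** (JSON, optional):
+
+| Field | Type | Required | Description |
+|------|------|------|------|
+| `model_id` | string | No | Model to test, for example `claude-opus-4-6`; the default model is used when omitted |
+
+**Response format**: SSE (Server-Sent Events) stream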
+
+```bash
+curl -N -X POST "${BASE}/api/v1/admin/accounts/1/test" \
+  -H "x-api-key: ${KEY}" \
+  -H "Content-Type: application/json" \
+  -d '{"model_id": "claude-opus-4-6"}'
+```
+
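+**SSE event types**:
+
+| type | Fields | Description |
+|------|------|------|
+| `test_start` | `model` | Test started; carries the name of the model under test |
+| `content` | `text` | Model response content (streamed text fragment) |
+| `test_end` | `success`, `error` | Test finished; `success=true` means it passed |
+| `error` | `text` | Error message |
+
+#### 1.4 Clear account rate limit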
+
+```
+POST /api/v1/admin/accounts/:id/clear-rate-limit
+```
+
+```bash
+curl -X POST "${BASE}/api/v1/admin/accounts/1/clear-rate-limit" \
+  -H "x-api-key: ${KEY}"
+```
+
+#### 1.5 Clear account error state
+
+```
+POST /api/v1/admin/accounts/:id/clear-error
+```
+
+```bash
+curl -X POST "${BASE}/api/v1/admin/accounts/1/clear-error" \
+  -H "x-api-key: ${KEY}"
+```
+
+#### 1.6 List models available to an account
+
+```
+GET /api/v1/admin/accounts/:id/models
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/accounts/1/models" -H "x-api-key: ${KEY}"
+```
+
+#### 1.7 Refresh OAuth token
+
+```
+POST /api/v1/admin/accounts/:id/refresh
+```
+
+```bash
+curl -X POST "${BASE}/api/v1/admin/accounts/1/refresh" -H "x-api-key: ${KEY}"
+```
+
+#### 1.8 Refresh account tier
+
+```
+POST /api/v1/admin/accounts/:id/refresh-tier
+```
+
+```bash
+curl -X POST "${BASE}/api/v1/admin/accounts/1/refresh-tier" -H "x-api-key: ${KEY}"
+```
+
+#### 1.9 Get account statistics
+
+```
+GET /api/v1/admin/accounts/:id/stats
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/accounts/1/stats" -H "x-api-key: ${KEY}"
+```
+
+#### 1.10 Get account usage
+
+```
+GET /api/v1/admin/accounts/:id/usage
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/accounts/1/usage" -H "x-api-key: ${KEY}"
+```
+
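+#### 1.11 Batch-test accounts (script)
+
+Test connectivity for a given model across every account on a platform:
+
+```bash
+# The user must provide: BASE (environment URL), KEY (admin token), MODEL (model to test)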
+ACCOUNT_IDS=$(curl -s "${BASE}/api/v1/admin/accounts?platform=antigravity&page=1&page_size=100" \
+  -H "x-api-key: ${KEY}" | python3 -c "
+import json, sys
+data = json.load(sys.stdin)
+for item in data['data']['items']:
+    print(f\"{item['id']}|{item['name']}\")
+")
+
+while IFS='|' read -r ID NAME; do
+    echo "Testing account ID=${ID} (${NAME})..."
+    RESPONSE=$(curl -s --max-time 60 -N \
+      -X POST "${BASE}/api/v1/admin/accounts/${ID}/test" \
+      -H "x-api-key: ${KEY}" \
+      -H "Content-Type: application/json" \
+      -d "{\"model_id\": \"${MODEL}\"}" 2>&1)
+    if echo "$RESPONSE" | grep -q '"success":true'; then
+        echo "  ✅ Success"
+    elif echo "$RESPONSE" | grep -q '"type":"content"'; then
+        echo "  ✅ Success (content received)"
+    else
+        ERROR_MSG=$(echo "$RESPONSE" | grep -o '"error":"[^"]*"' | tail -1)
+        echo "  ❌ Failed: ${ERROR_MSG}"
+    fi
+done <<< "$ACCOUNT_IDS"
+```
+
+---
+
+### 2. Operations monitoring
+
+#### 2.1 Concurrency statistics
+
+```
+GET /api/v1/admin/ops/concurrency
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/ops/concurrency" -H "x-api-key: ${KEY}"
+```
+
+#### 2.2 Account availability
+
+```
+GET /api/v1/admin/ops/account-availability
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/ops/account-availability" -H "x-api-key: ${KEY}"
+```
+
+#### 2.3 Real-time traffic summary
+
+```
+GET /api/v1/admin/ops/realtime-traffic
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/ops/realtime-traffic" -H "x-api-key: ${KEY}"
+```
+
+#### 2.4 Request error list
+
+```
+GET /api/v1/admin/ops/request-errors
+```
+
+**Query parameters**: `page`, `page_size`
+
+```bash
+curl -s "${BASE}/api/v1/admin/ops/request-errors?page=1&page_size=50" \
+  -H "x-api-key: ${KEY}"
+```
+
+#### 2.5 Upstream error list
+
+```
+GET /api/v1/admin/ops/upstream-errors
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/ops/upstream-errors?page=1&page_size=50" \
+  -H "x-api-key: ${KEY}"
+```
+
+#### 2.6 Dashboard overview
+
+```
+GET /api/v1/admin/ops/dashboard/overview
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/ops/dashboard/overview" -H "x-api-key: ${KEY}"
+```
+
+---
+
+### 3. System settings
+
+#### 3.1 Get system settings
+
+```
+GET /api/v1/admin/settings
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/settings" -H "x-api-key: ${KEY}"
+```
+
+#### 3.2 Update system settings
+
+```
+PUT /api/v1/admin/settings
+```
+
+```bash
+curl -X PUT "${BASE}/api/v1/admin/settings" \
+  -H "x-api-key: ${KEY}" \
+  -H "Content-Type: application/json" \
+  -d '{ ... }'
+```
+
+#### 3.3 Admin API Key status (masked)
+
+```
+GET /api/v1/admin/settings/admin-api-key
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/settings/admin-api-key" -H "x-api-key: ${KEY}"
+```
+
+---
+
+### 4. User management
+
+#### 4.1 List users
+
+```
+GET /api/v1/admin/users
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/users?page=1&page_size=20" -H "x-api-key: ${KEY}"
+```
+
+#### 4.2 Get user details
+
+```
+GET /api/v1/admin/users/:id
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/users/1" -H "x-api-key: ${KEY}"
+```
+
+#### 4.3 Update user balance
+
+```
+POST /api/v1/admin/users/:id/balance
+```
+
+```bash
+curl -X POST "${BASE}/api/v1/admin/users/1/balance" \
+  -H "x-api-key: ${KEY}" \
+  -H "Content-Type: application/json" \
+  -d '{"amount": 100, "reason": "top-up"}'
+```
+
+---
+
+### 5. Group management
+
+#### 5.1 List groups
+
+```
+GET /api/v1/admin/groups
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/groups" -H "x-api-key: ${KEY}"
+```
+
+#### 5.2 All groups (unpaginated)
+
+```
+GET /api/v1/admin/groups/all
+```
+
+```bash
+curl -s "${BASE}/api/v1/admin/groups/all" -H "x-api-key: ${KEY}"
+```
+
+---
+
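+## Notes
+
+1. **The frontend must be bundled into the image**: run `docker build` on the build server (`us-asaki-root`). The Dockerfile compiles the frontend and embeds it into the backend binary; after the build, transfer the image to the production server (`clicodeplus`) with `docker save | docker load`
+
+2. **Image tag**: docker-compose.yml uses `weishaw/sub2api:latest`, so a locally built image must be retagged over it with `docker tag`
+
+3. **Windows line endings**: handled by `.gitattributes`, which keeps `*.sql` files on LF
+
+4. **Version management**: every release must update `backend/cmd/server/VERSION` and create a tag
+
+5. **Merge conflicts**: when merging a new upstream release, watch these files for conflicts: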
+   - `backend/internal/service/antigravity_gateway_service.go`
+   - `backend/internal/service/gateway_service.go`
+   - `backend/internal/pkg/antigravity/request_transformer.go`
+
+---
+
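+## Go code conventions
+
+### 1. Function design
+
+#### Single responsibility
+- **Function length**: a function should normally stay within **30 lines**; split it into helpers when it grows beyond that. Logic that genuinely cannot be split (complex state machines, protocol parsing, and so on) is exempt, but add a comment explaining why
+- **Nesting depth**: avoid more than 3 levels of nesting; use early returns to flatten the code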
+
+```go
+// ❌ Not recommended: deep nesting
+func process(data []Item) {
+    for _, item := range data {
+        if item.Valid {
+            if item.Type == "A" {
+                if item.Status == "active" {
+                    // business logic...
+                }
+            }
+        }
+    }
+}
+
+// ✅ Recommended: guard clauses that exit early
+func process(data []Item) {
+    for _, item := range data {
+        if !item.Valid {
+            continue
+        }
+        if item.Type != "A" {
+            continue
+        }
+        if item.Status != "active" {
+            continue
+        }
+        // business logic...
+    }
+}
+```
+
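+#### Extract complex logic
+Move complex conditions or handling logic into separate functions:
+
+```go
+// ❌ Not recommended: complex logic inline
+if resp.StatusCode == 429 || resp.StatusCode == 503 {
+    // 80+ lines of handling logic...
+}
+
+// ✅ Recommended: extract into a separate function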
+result := handleRateLimitResponse(resp, params)
+switch result.action {
+case actionRetry:
+    continue
+case actionBreak:
+    return result.resp, nil
+}
+```
+
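+### 2. Eliminating duplicate code
+
+#### Configuration access pattern
+Extract repeated configuration lookups into a method:
+
+```go
+// ❌ Not recommended: duplicated code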
+logBody := s.settingService != nil && s.settingService.cfg != nil && s.settingService.cfg.Gateway.LogUpstreamErrorBody
+maxBytes := 2048
+if s.settingService != nil && s.settingService.cfg != nil && s.settingService.cfg.Gateway.LogUpstreamErrorBodyMaxBytes > 0 {
+    maxBytes = s.settingService.cfg.Gateway.LogUpstreamErrorBodyMaxBytes
+}
+
+// ✅ Recommended: extract into a method
+func (s *Service) getLogConfig() (logBody bool, maxBytes int) {
+    maxBytes = 2048
+    if s.settingService == nil || s.settingService.cfg == nil {
+        return false, maxBytes
+    }
+    cfg := s.settingService.cfg.Gateway
+    if cfg.LogUpstreamErrorBodyMaxBytes > 0 {
+        maxBytes = cfg.LogUpstreamErrorBodyMaxBytes
+    }
+    return cfg.LogUpstreamErrorBody, maxBytes
+}
+```
+
+### 3. Constants
+
+#### Avoid magic numbers
+Define every hardcoded value as a named constant:
+
+```go
+// ❌ Not recommended
+if retryDelay >= 10*time.Second {
+    resetAt := time.Now().Add(30 * time.Second)
+}
+
+// ✅ Recommended
+const (
+    rateLimitThreshold       = 10 * time.Second
+    defaultRateLimitDuration = 30 * time.Second
+)
+
+if retryDelay >= rateLimitThreshold {
+    resetAt := time.Now().Add(defaultRateLimitDuration)
+}
+```
+
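+#### Reference constant names in comments
+In comments, refer to the constant by name instead of repeating its value:
+
+```go
+// ❌ Not recommended
+// < 10s: wait, then retry
+
+// ✅ Recommended
+// < rateLimitThreshold: wait, then retry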
+```
+
+### 4. Error handling
+
+#### Use structured logging
+Prefer `slog` for structured logging:
+
+```go
+// ❌ Not recommended
+log.Printf("%s status=%d model_rate_limit_failed model=%s error=%v", prefix, statusCode, modelName, err)
+
+// ✅ Recommended
+slog.Error("failed to set model rate limit",
+    "prefix", prefix,
+    "status_code", statusCode,
+    "model", modelName,
+    "error", err,
+)
+```
+
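+### 5. Testing conventions
+
+#### Keep mock signatures in sync
+When a function signature changes, update every mock of it in the tests as well:
+
+```go
+// If the handleError signature changes
+handleError func(..., groupID int64, sessionHash string) *Result
+
+// the mock in the tests must be updated to match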
+handleError: func(..., groupID int64, sessionHash string) *Result {
+    return nil
+},
+```
+
+#### Test build tags
+Use a consistent build tag for tests:
+
+```go
+//go:build unit
+
+package service
+```
+
+### 6. Parsing durations
+
+#### Use the standard library
+Prefer `time.ParseDuration`, which accepts every Go duration format:
+
+```go
+// ❌ Not recommended: hand-rolled format restrictions
+if !strings.HasSuffix(delay, "s") || strings.Contains(delay, "m") {
+    continue
+}
+
+// ✅ Recommended: use the standard library
+dur, err := time.ParseDuration(delay) // accepts "0.5s", "4m50s", "1h30m", etc.
+```
+
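+### 7. Interface design
+
+#### Interface segregation
+Define minimal interfaces that contain only the methods a caller needs:
+
+```go
+// ❌ Not recommended: an overly broad interface
+type AccountRepository interface {
+    // 20+ methods...
+}
+
+// ✅ Recommended: a minimal interface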
+type ModelRateLimiter interface {
+    SetModelRateLimit(ctx context.Context, id int64, modelKey string, resetAt time.Time) error
+}
+```
+
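+### 8. Concurrency safety
+
+#### Protect shared data
+Make access thread-safe when the data may be modified concurrently:
+
+```go
+// If Account.Extra may be modified concurrently,
+// guard reads with a mutex or atomic operations
+func (a *Account) GetRateLimitRemainingTime(model string) time.Duration {
+    a.mu.RLock()
+    defer a.mu.RUnlock()
+    // read the Extra field...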
+}
+```
+
+### 9. Naming
+
+#### Consistent naming style
+- Constants use camelCase: `rateLimitThreshold`
+- Types use PascalCase: `AntigravityQuotaScope`
+- Use one name per concept: pick `Threshold` or `Limit` and do not mix them
+
+```go
+// ❌ Not recommended: inconsistent naming
+antigravitySmartRetryMinWait    // uses Min
+antigravityRateLimitThreshold   // uses Threshold
+
+// ✅ Recommended: one consistent style
+antigravityMinRetryWait
+antigravityRateLimitThreshold
+```
+
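+### 10. Code review checklist
+
+Before committing, check the following:
+
+- [ ] Does any function exceed 30 lines? (Logic that cannot be split is exempt but needs an explanatory comment)
+- [ ] Is anything nested more than 3 levels deep?
+- [ ] Is there duplicated code that could be extracted?
+- [ ] Are there any magic numbers?
+- [ ] Do mock signatures match the real functions?
+- [ ] Do the tests cover the new logic?
+- [ ] Do log messages carry enough context?
+- [ ] Has concurrency safety been considered?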
+
+---
+
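+## CI checks and release gates
+
+### GitHub Actions checks
+
+The project has 4 CI jobs, and **all of them must pass before any push or release**:
+
+| Workflow | Job | Description | Local command |
+|----------|-----|------|-------------|
+| CI | `test` | Unit tests + integration tests | `cd backend && make test-unit && make test-integration` |
+| CI | `golangci-lint` | Go static analysis (golangci-lint v2.7) | `cd backend && golangci-lint run --timeout=5m` |
+| Security Scan | `backend-security` | govulncheck + gosec security scan | `cd backend && govulncheck ./... && gosec -severity high -confidence high ./...` |
+| Security Scan | `frontend-security` | pnpm audit of frontend dependencies | `cd frontend && pnpm audit --prod --audit-level=high` |
+
+### Submitting PRs upstream
+
+PRs target the official upstream repository and **contain only general-purpose changes** (bug fixes, new features, performance improvements, and so on).
+
+**The following must never appear in a PR** (they are customizations specific to our fork):
+- `CLAUDE.md`, `AGENTS.md`: our development docs
+- `backend/cmd/server/VERSION`: our version file
+- UI customizations (GitHub link removal, WeChat support button, home page changes, and so on)
+- Deployment configuration (custom changes under `deploy/`)
+
+**PR workflow**:
+1. Create a feature branch from `develop` that contains only the changes intended for upstream
+2. After pushing the branch, **wait for all 4 CI jobs to pass**
+3. Create the PR only after they pass
+4. Check the status with `gh run list --repo touwaeriol/sub2api --branch <branch>`
+
+### Pushing to our own branches (develop / main)
+
+Pushes to our own `develop` or `main` branch include every change (customizations + general-purpose features).
+
+**Run all CI checks locally before pushing** (do not wait for GitHub Actions):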
+
+```bash
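+# Make sure the Go toolchain is on PATH (macOS Homebrew)
+export PATH="/opt/homebrew/bin:$HOME/go/bin:$PATH"
+
+# 1. Unit tests (required)
+cd backend && make test-unit
+
+# 2. Integration tests (recommended, requires Docker)
+make test-integration
+
+# 3. golangci-lint static analysis (required)
+golangci-lint run --timeout=5m
+
+# 4. gofmt formatting check (required; gofmt takes paths, not the ./... package pattern)
+gofmt -l .
+# If it prints any files, run gofmt -w <file> to fix them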
+```
+
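+**After pushing**:
+1. Check the GitHub Actions status with `gh run list --repo touwaeriol/sub2api --branch <branch>`
+2. Confirm that all 4 jobs across the CI and Security Scan workflows are green ✅
+3. Fix any failed job immediately; **do not proceed with later steps while CI is failing**
+
+### Releasing a version
+
+1. Run all of the CI checks above locally and make sure they pass
+2. Increment `backend/cmd/server/VERSION`, commit, and push
+3. After pushing, confirm that all 4 GitHub Actions CI jobs pass
+4. **Do not deploy while CI is failing**; fix the problem first
+5. Confirm the status with `gh run list --repo touwaeriol/sub2api --limit 10`
+
+### Common CI failures and fixes
+- **gofmt**: inconsistent struct field alignment → run `gofmt -w <file>`
+- **golangci-lint**: unused variables or imports → remove them, or discard the value with `_`
+- **test failures**: mock signatures out of sync → update the mocks
+- **gosec**: security finding → fix it as suggested or add an exception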
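The duration-parsing guideline in the Go conventions of `AGENTS.md` can be illustrated with a small standalone sketch. It assumes the upstream delay arrives as a string such as `"0.5s"` or `"4m50s"`; the helper name `parseRetryDelay` and the fallback constant `defaultRetryDelay` are hypothetical and do not come from the repository.

```go
package main

import (
	"fmt"
	"time"
)

// defaultRetryDelay is a hypothetical fallback used when the upstream value is unusable.
const defaultRetryDelay = 30 * time.Second

// parseRetryDelay converts an upstream delay string into a time.Duration.
// Malformed or negative values yield the fallback together with an error,
// so the caller can log the problem and still proceed with a sane delay.
func parseRetryDelay(raw string) (time.Duration, error) {
	d, err := time.ParseDuration(raw)
	if err != nil {
		return defaultRetryDelay, fmt.Errorf("parse retry delay %q: %w", raw, err)
	}
	if d < 0 {
		return defaultRetryDelay, fmt.Errorf("retry delay %q is negative", raw)
	}
	return d, nil
}

func main() {
	for _, raw := range []string{"0.5s", "4m50s", "1h30m", "soon"} {
		d, err := parseRetryDelay(raw)
		fmt.Println(raw, d, err)
	}
}
```

Returning the fallback alongside the error is one possible design; a caller that prefers to skip unparseable values can ignore the duration whenever the error is non-nil.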
diff --git a/backend/cmd/server/VERSION b/backend/cmd/server/VERSION
index 5087e794..0aa59ad9 100644
--- a/backend/cmd/server/VERSION
+++ b/backend/cmd/server/VERSION
@@ -1 +1 @@
-0.1.76
\ No newline at end of file
+0.1.79.7
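The VERSION change above follows the fork's scheme of appending a custom counter to the upstream version (`v0.1.68` upstream, `v0.1.68.1`, `v0.1.68.2` for the fork). A minimal sketch of that bump rule, assuming versions are purely numeric, dot-separated strings without a `v` prefix; `bumpCustomVersion` and `isDigits` are hypothetical helpers, not part of the repository:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isDigits reports whether s is a non-empty string of ASCII digits.
func isDigits(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// bumpCustomVersion returns the next fork version:
// a three-part upstream version gets ".1" appended,
// a four-part fork version has its last component incremented.
func bumpCustomVersion(v string) (string, error) {
	parts := strings.Split(strings.TrimSpace(v), ".")
	for _, p := range parts {
		if !isDigits(p) {
			return "", fmt.Errorf("invalid version component %q in %q", p, v)
		}
	}
	switch len(parts) {
	case 3:
		return strings.Join(parts, ".") + ".1", nil
	case 4:
		n, err := strconv.Atoi(parts[3])
		if err != nil {
			return "", fmt.Errorf("custom counter %q: %w", parts[3], err)
		}
		parts[3] = strconv.Itoa(n + 1)
		return strings.Join(parts, "."), nil
	default:
		return "", fmt.Errorf("expected 3 or 4 components, got %d in %q", len(parts), v)
	}
}

func main() {
	for _, v := range []string{"0.1.69", "0.1.69.1", "0.1.79.9"} {
		next, err := bumpCustomVersion(v)
		fmt.Println(v, "->", next, err)
	}
}
```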
diff --git a/backend/internal/config/config.go b/backend/internal/config/config.go
index 91437ba8..7b6b4a37 100644
--- a/backend/internal/config/config.go
+++ b/backend/internal/config/config.go
@@ -883,6 +883,7 @@ func setDefaults() {
 	viper.SetDefault("gateway.max_account_switches", 10)
 	viper.SetDefault("gateway.max_account_switches_gemini", 3)
 	viper.SetDefault("gateway.antigravity_fallback_cooldown_minutes", 1)
+	viper.SetDefault("gateway.antigravity_extra_retries", 10)
 	viper.SetDefault("gateway.max_body_size", int64(100*1024*1024))
 	viper.SetDefault("gateway.connection_pool_isolation", ConnectionPoolIsolationAccountProxy)
 	// HTTP 上游连接池配置（针对 5000+ 并发用户优化）
diff --git a/backend/internal/handler/admin/ops_realtime_handler.go b/backend/internal/handler/admin/ops_realtime_handler.go
index c175dcd0..2d3cce4b 100644
--- a/backend/internal/handler/admin/ops_realtime_handler.go
+++ b/backend/internal/handler/admin/ops_realtime_handler.go
@@ -65,6 +65,10 @@ func (h *OpsHandler) GetConcurrencyStats(c *gin.Context) {
 
 // GetUserConcurrencyStats returns real-time concurrency usage for all active users.
 // GET /api/v1/admin/ops/user-concurrency
+//
+// Query params:
+// - platform: optional, filter users by allowed platform
+// - group_id: optional, filter users by allowed group
 func (h *OpsHandler) GetUserConcurrencyStats(c *gin.Context) {
 	if h.opsService == nil {
 		response.Error(c, http.StatusServiceUnavailable, "Ops service not available")
@@ -84,7 +88,18 @@ func (h *OpsHandler) GetUserConcurrencyStats(c *gin.Context) {
 		return
 	}
 
-	users, collectedAt, err := h.opsService.GetUserConcurrencyStats(c.Request.Context())
+	platformFilter := strings.TrimSpace(c.Query("platform"))
+	var groupID *int64
+	if v := strings.TrimSpace(c.Query("group_id")); v != "" {
+		id, err := strconv.ParseInt(v, 10, 64)
+		if err != nil || id <= 0 {
+			response.BadRequest(c, "Invalid group_id")
+			return
+		}
+		groupID = &id
+	}
+
+	users, collectedAt, err := h.opsService.GetUserConcurrencyStats(c.Request.Context(), platformFilter, groupID)
 	if err != nil {
 		response.ErrorFrom(c, err)
 		return
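The optional `group_id` handling added to `GetUserConcurrencyStats` can be isolated into a framework-free sketch. The function name `parseOptionalGroupID` and the sentinel `errInvalidGroupID` are hypothetical; the logic mirrors the handler above (trim, treat empty as "no filter", reject non-numeric or non-positive values) but takes the raw query value as a plain string instead of reading it from a gin context.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// errInvalidGroupID is a hypothetical sentinel standing in for the handler's 400 response.
var errInvalidGroupID = errors.New("invalid group_id")

// parseOptionalGroupID returns nil when no filter was supplied,
// a pointer to the parsed ID when the value is a positive integer,
// and errInvalidGroupID otherwise.
func parseOptionalGroupID(raw string) (*int64, error) {
	v := strings.TrimSpace(raw)
	if v == "" {
		return nil, nil // absent filter: the caller should not restrict by group
	}
	id, err := strconv.ParseInt(v, 10, 64)
	if err != nil || id <= 0 {
		return nil, errInvalidGroupID
	}
	return &id, nil
}

func main() {
	for _, raw := range []string{"", " 42 ", "0", "abc"} {
		id, err := parseOptionalGroupID(raw)
		if id != nil {
			fmt.Printf("%q -> %d (err=%v)\n", raw, *id, err)
			continue
		}
		fmt.Printf("%q -> nil (err=%v)\n", raw, err)
	}
}
```

Using a pointer keeps "no filter" distinguishable from any concrete ID, which is why the handler passes `*int64` down to the service layer.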
diff --git a/backend/internal/handler/failover_loop.go b/backend/internal/handler/failover_loop.go
new file mode 100644
index 00000000..1f8a7e9a
--- /dev/null
+++ b/backend/internal/handler/failover_loop.go
@@ -0,0 +1,160 @@
+package handler
+
+import (
+	"context"
+	"log"
+	"net/http"
+	"time"
+
+	"github.com/Wei-Shaw/sub2api/internal/service"
+)
+
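+// TempUnscheduler temporarily removes an account from scheduling once HandleFailoverError has exhausted its same-account retries.
+// GatewayService implements this interface implicitly.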
+type TempUnscheduler interface {
+	TempUnscheduleRetryableError(ctx context.Context, accountID int64, failoverErr *service.UpstreamFailoverError)
+}
+
+// FailoverAction is the next step to take after a failover error has been handled.
+type FailoverAction int
+
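+const (
+	// FailoverContinue means keep looping (same-account retry or account switch; the caller simply continues).
+	FailoverContinue FailoverAction = iota
+	// FailoverExhausted means the switch budget is used up (the caller should return an error response).
+	FailoverExhausted
+	// FailoverCanceled means the context was canceled (the caller should return immediately).
+	FailoverCanceled
+)
+
+const (
+	// maxSameAccountRetries caps retries on the same account (for RetryableOnSameAccount errors).
+	maxSameAccountRetries = 2
+	// sameAccountRetryDelay is the pause between same-account retries.
+	sameAccountRetryDelay = 500 * time.Millisecond
+	// singleAccountBackoffDelay is the fixed backoff before retrying a 503 in a single-account group.
+	// In SingleAccountRetry mode the service layer already retries in place (up to 3 attempts, 30s total wait),
+	// so the handler layer only needs a short pause before re-entering the service layer.
+	singleAccountBackoffDelay = 2 * time.Second
+)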
+
+// FailoverState holds the failover state shared across loop iterations.
+type FailoverState struct {
+	SwitchCount           int
+	MaxSwitches           int
+	FailedAccountIDs      map[int64]struct{}
+	SameAccountRetryCount map[int64]int
+	LastFailoverErr       *service.UpstreamFailoverError
+	ForceCacheBilling     bool
+	hasBoundSession       bool
+}
+
+// NewFailoverState creates a FailoverState.
+func NewFailoverState(maxSwitches int, hasBoundSession bool) *FailoverState {
+	return &FailoverState{
+		MaxSwitches:           maxSwitches,
+		FailedAccountIDs:      make(map[int64]struct{}),
+		SameAccountRetryCount: make(map[int64]int),
+		hasBoundSession:       hasBoundSession,
+	}
+}
+
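+// HandleFailoverError handles an UpstreamFailoverError and returns the next action.
+// It covers the cache-billing decision, same-account retries, temporary unscheduling, switch counting, and the Antigravity delay.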
+func (s *FailoverState) HandleFailoverError(
+	ctx context.Context,
+	gatewayService TempUnscheduler,
+	accountID int64,
+	platform string,
+	failoverErr *service.UpstreamFailoverError,
+) FailoverAction {
+	s.LastFailoverErr = failoverErr
+
+	// Cache-billing decision
+	if needForceCacheBilling(s.hasBoundSession, failoverErr) {
+		s.ForceCacheBilling = true
+	}
+
+	// Same-account retry: for transient errors marked RetryableOnSameAccount, retry on the same account first
+	if failoverErr.RetryableOnSameAccount && s.SameAccountRetryCount[accountID] < maxSameAccountRetries {
+		s.SameAccountRetryCount[accountID]++
+		log.Printf("Account %d: retryable error %d, same-account retry %d/%d",
+			accountID, failoverErr.StatusCode, s.SameAccountRetryCount[accountID], maxSameAccountRetries)
+		if !sleepWithContext(ctx, sameAccountRetryDelay) {
+			return FailoverCanceled
+		}
+		return FailoverContinue
+	}
+
+	// Same-account retries are used up: temporarily unschedule the account
+	if failoverErr.RetryableOnSameAccount {
+		gatewayService.TempUnscheduleRetryableError(ctx, accountID, failoverErr)
+	}
+
+	// Add the account to the failed set
+	s.FailedAccountIDs[accountID] = struct{}{}
+
+	// Check whether the switch budget is exhausted
+	if s.SwitchCount >= s.MaxSwitches {
+		return FailoverExhausted
+	}
+
+	// Increment the switch count
+	s.SwitchCount++
+	log.Printf("Account %d: upstream error %d, switching account %d/%d",
+		accountID, failoverErr.StatusCode, s.SwitchCount, s.MaxSwitches)
+
+	// Antigravity: the delay grows linearly with each account switch
+	if platform == service.PlatformAntigravity {
+		delay := time.Duration(s.SwitchCount-1) * time.Second
+		if !sleepWithContext(ctx, delay) {
+			return FailoverCanceled
+		}
+	}
+
+	return FailoverContinue
+}
+
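+// HandleSelectionExhausted decides whether to back off and retry when account selection fails (every candidate is in the exclusion set).
+// It targets the 503 (MODEL_CAPACITY_EXHAUSTED) case for Antigravity single-account groups:
+// clear the exclusion set, wait for the backoff delay, then select again.
+//
+// On FailoverContinue the caller should set the SingleAccountRetry context and continue.
+// On FailoverExhausted the caller should return an error response.
+// On FailoverCanceled the caller should return immediately.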
+func (s *FailoverState) HandleSelectionExhausted(ctx context.Context) FailoverAction {
+	if s.LastFailoverErr != nil &&
+		s.LastFailoverErr.StatusCode == http.StatusServiceUnavailable &&
+		s.SwitchCount <= s.MaxSwitches {
+
+		log.Printf("Antigravity single-account 503 backoff: waiting %v before retry (attempt %d)",
+			singleAccountBackoffDelay, s.SwitchCount)
+		if !sleepWithContext(ctx, singleAccountBackoffDelay) {
+			return FailoverCanceled
+		}
+		log.Printf("Antigravity single-account 503 retry: clearing failed accounts, retry %d/%d",
+			s.SwitchCount, s.MaxSwitches)
+		s.FailedAccountIDs = make(map[int64]struct{})
+		return FailoverContinue
+	}
+	return FailoverExhausted
+}
+
+// needForceCacheBilling reports whether a failover must force cache billing.
+// When a sticky session switches accounts, or when upstream flags it explicitly, input_tokens are billed as cache_read.
+func needForceCacheBilling(hasBoundSession bool, failoverErr *service.UpstreamFailoverError) bool {
+	return hasBoundSession || (failoverErr != nil && failoverErr.ForceCacheBilling)
+}
+
+// sleepWithContext waits for d and returns false if the context is canceled first.
+func sleepWithContext(ctx context.Context, d time.Duration) bool {
+	if d <= 0 {
+		return true
+	}
+	select {
+	case <-ctx.Done():
+		return false
+	case <-time.After(d):
+		return true
+	}
+}
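`failover_loop.go` defines the state machine but not the loop that drives it. The sketch below shows how a caller might consume the returned action. It is a simplified stand-in, not the handler code from this PR: the types `failoverState` and `action`, and the helpers `runWithFailover` and `pickAccount`, are hypothetical, and the sleeps, cache-billing flag, temporary unscheduling, and context cancellation are deliberately left out so that only the retry and switch accounting remains.

```go
package main

import (
	"errors"
	"fmt"
)

type action int

const (
	actionContinue action = iota
	actionExhausted
)

const maxSameAccountRetries = 2

var errExhausted = errors.New("failover exhausted")

// failoverState is a trimmed-down stand-in for FailoverState.
type failoverState struct {
	switchCount int
	maxSwitches int
	failed      map[int64]struct{}
	sameRetries map[int64]int
}

// handle mirrors the accounting in HandleFailoverError:
// retry the same account first, then mark it failed and spend one switch.
func (s *failoverState) handle(accountID int64, retryable bool) action {
	if retryable && s.sameRetries[accountID] < maxSameAccountRetries {
		s.sameRetries[accountID]++
		return actionContinue
	}
	s.failed[accountID] = struct{}{}
	if s.switchCount >= s.maxSwitches {
		return actionExhausted
	}
	s.switchCount++
	return actionContinue
}

// pickAccount returns the first account that is not in the failed set.
func pickAccount(accounts []int64, failed map[int64]struct{}) (int64, bool) {
	for _, id := range accounts {
		if _, bad := failed[id]; !bad {
			return id, true
		}
	}
	return 0, false
}

// runWithFailover calls the upstream until it succeeds or the budget runs out.
// It returns the winning account ID and the number of upstream attempts made.
func runWithFailover(accounts []int64, maxSwitches int, call func(id int64) (ok, retryable bool)) (int64, int, error) {
	s := &failoverState{
		maxSwitches: maxSwitches,
		failed:      make(map[int64]struct{}),
		sameRetries: make(map[int64]int),
	}
	attempts := 0
	for {
		id, found := pickAccount(accounts, s.failed)
		if !found {
			return 0, attempts, errExhausted
		}
		attempts++
		ok, retryable := call(id)
		if ok {
			return id, attempts, nil
		}
		if s.handle(id, retryable) == actionExhausted {
			return 0, attempts, errExhausted
		}
	}
}

func main() {
	// Account 1 always fails with a retryable error; account 2 succeeds.
	id, attempts, err := runWithFailover([]int64{1, 2}, 3, func(id int64) (bool, bool) {
		return id == 2, true
	})
	fmt.Println(id, attempts, err) // prints: 2 4 <nil>
}
```

In this run account 1 is tried three times (the initial call plus two same-account retries) before the loop switches to account 2, which is why four attempts are reported.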
diff --git a/backend/internal/handler/failover_loop_test.go b/backend/internal/handler/failover_loop_test.go
new file mode 100644
index 00000000..5a41b2dd
--- /dev/null
+++ b/backend/internal/handler/failover_loop_test.go
@@ -0,0 +1,732 @@
+package handler
+
+import (
+	"context"
+	"testing"
+	"time"
+
+	"github.com/Wei-Shaw/sub2api/internal/service"
+
+	"github.com/stretchr/testify/require"
+)
+
+// ---------------------------------------------------------------------------
+// Mock
+// ---------------------------------------------------------------------------
+
+// mockTempUnscheduler records the arguments of each TempUnscheduleRetryableError call.
+type mockTempUnscheduler struct {
+	calls []tempUnscheduleCall
+}
+
+type tempUnscheduleCall struct {
+	accountID   int64
+	failoverErr *service.UpstreamFailoverError
+}
+
+func (m *mockTempUnscheduler) TempUnscheduleRetryableError(_ context.Context, accountID int64, failoverErr *service.UpstreamFailoverError) {
+	m.calls = append(m.calls, tempUnscheduleCall{accountID: accountID, failoverErr: failoverErr})
+}
+
+// ---------------------------------------------------------------------------
+// Helper
+// ---------------------------------------------------------------------------
+
+func newTestFailoverErr(statusCode int, retryable, forceBilling bool) *service.UpstreamFailoverError {
+	return &service.UpstreamFailoverError{
+		StatusCode:             statusCode,
+		RetryableOnSameAccount: retryable,
+		ForceCacheBilling:      forceBilling,
+	}
+}
+
+// ---------------------------------------------------------------------------
+// NewFailoverState tests
+// ---------------------------------------------------------------------------
+
+func TestNewFailoverState(t *testing.T) {
+	t.Run("初始化字段正确", func(t *testing.T) {
+		fs := NewFailoverState(5, true)
+		require.Equal(t, 5, fs.MaxSwitches)
+		require.Equal(t, 0, fs.SwitchCount)
+		require.NotNil(t, fs.FailedAccountIDs)
+		require.Empty(t, fs.FailedAccountIDs)
+		require.NotNil(t, fs.SameAccountRetryCount)
+		require.Empty(t, fs.SameAccountRetryCount)
+		require.Nil(t, fs.LastFailoverErr)
+		require.False(t, fs.ForceCacheBilling)
+		require.True(t, fs.hasBoundSession)
+	})
+
+	t.Run("无绑定会话", func(t *testing.T) {
+		fs := NewFailoverState(3, false)
+		require.Equal(t, 3, fs.MaxSwitches)
+		require.False(t, fs.hasBoundSession)
+	})
+
+	t.Run("零最大切换次数", func(t *testing.T) {
+		fs := NewFailoverState(0, false)
+		require.Equal(t, 0, fs.MaxSwitches)
+	})
+}
+
+// ---------------------------------------------------------------------------
+// sleepWithContext tests
+// ---------------------------------------------------------------------------
+
+func TestSleepWithContext(t *testing.T) {
+	t.Run("零时长立即返回true", func(t *testing.T) {
+		start := time.Now()
+		ok := sleepWithContext(context.Background(), 0)
+		require.True(t, ok)
+		require.Less(t, time.Since(start), 50*time.Millisecond)
+	})
+
+	t.Run("负时长立即返回true", func(t *testing.T) {
+		start := time.Now()
+		ok := sleepWithContext(context.Background(), -1*time.Second)
+		require.True(t, ok)
+		require.Less(t, time.Since(start), 50*time.Millisecond)
+	})
+
+	t.Run("正常等待后返回true", func(t *testing.T) {
+		start := time.Now()
+		ok := sleepWithContext(context.Background(), 50*time.Millisecond)
+		elapsed := time.Since(start)
+		require.True(t, ok)
+		require.GreaterOrEqual(t, elapsed, 40*time.Millisecond)
+		require.Less(t, elapsed, 500*time.Millisecond)
+	})
+
+	t.Run("已取消context立即返回false", func(t *testing.T) {
+		ctx, cancel := context.WithCancel(context.Background())
+		cancel()
+
+		start := time.Now()
+		ok := sleepWithContext(ctx, 5*time.Second)
+		require.False(t, ok)
+		require.Less(t, time.Since(start), 50*time.Millisecond)
+	})
+
+	t.Run("等待期间context取消返回false", func(t *testing.T) {
+		ctx, cancel := context.WithCancel(context.Background())
+		go func() {
+			time.Sleep(30 * time.Millisecond)
+			cancel()
+		}()
+
+		start := time.Now()
+		ok := sleepWithContext(ctx, 5*time.Second)
+		elapsed := time.Since(start)
+		require.False(t, ok)
+		require.Less(t, elapsed, 500*time.Millisecond)
+	})
+}
+
+// ---------------------------------------------------------------------------
+// HandleFailoverError: basic switch flow
+// ---------------------------------------------------------------------------
+
+func TestHandleFailoverError_BasicSwitch(t *testing.T) {
+	t.Run("非重试错误_非Antigravity_直接切换", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		err := newTestFailoverErr(500, false, false)
+
+		action := fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 1, fs.SwitchCount)
+		require.Contains(t, fs.FailedAccountIDs, int64(100))
+		require.Equal(t, err, fs.LastFailoverErr)
+		require.False(t, fs.ForceCacheBilling)
+		require.Empty(t, mock.calls, "TempUnschedule should not be called")
+	})
+
+	t.Run("非重试错误_Antigravity_第一次切换无延迟", func(t *testing.T) {
+		// When SwitchCount goes 0→1, the Antigravity delay is (1-1)*1s = 0.
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		err := newTestFailoverErr(500, false, false)
+
+		start := time.Now()
+		action := fs.HandleFailoverError(context.Background(), mock, 100, service.PlatformAntigravity, err)
+		elapsed := time.Since(start)
+
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 1, fs.SwitchCount)
+		require.Less(t, elapsed, 200*time.Millisecond, "first switch should have zero delay")
+	})
+
+	t.Run("非重试错误_Antigravity_第二次切换有1秒延迟", func(t *testing.T) {
+		// When SwitchCount goes 1→2, the Antigravity delay is (2-1)*1s = 1s.
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		fs.SwitchCount = 1 // simulate one prior switch
+
+		err := newTestFailoverErr(500, false, false)
+		start := time.Now()
+		action := fs.HandleFailoverError(context.Background(), mock, 200, service.PlatformAntigravity, err)
+		elapsed := time.Since(start)
+
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 2, fs.SwitchCount)
+		require.GreaterOrEqual(t, elapsed, 800*time.Millisecond, "second switch should be delayed by about 1s")
+		require.Less(t, elapsed, 3*time.Second)
+	})
+
+	t.Run("连续切换直到耗尽", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(2, false)
+
+		// First switch: 0→1
+		err1 := newTestFailoverErr(500, false, false)
+		action := fs.HandleFailoverError(context.Background(), mock, 100, "openai", err1)
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 1, fs.SwitchCount)
+
+		// Second switch: 1→2
+		err2 := newTestFailoverErr(502, false, false)
+		action = fs.HandleFailoverError(context.Background(), mock, 200, "openai", err2)
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 2, fs.SwitchCount)
+
+		// Third call is exhausted: SwitchCount(2) >= MaxSwitches(2)
+		err3 := newTestFailoverErr(503, false, false)
+		action = fs.HandleFailoverError(context.Background(), mock, 300, "openai", err3)
+		require.Equal(t, FailoverExhausted, action)
+		require.Equal(t, 2, fs.SwitchCount, "SwitchCount should not increase once exhausted")
+
+		// Verify the failed-account list
+		require.Len(t, fs.FailedAccountIDs, 3)
+		require.Contains(t, fs.FailedAccountIDs, int64(100))
+		require.Contains(t, fs.FailedAccountIDs, int64(200))
+		require.Contains(t, fs.FailedAccountIDs, int64(300))
+
+		// LastFailoverErr should hold the most recent error
+		require.Equal(t, err3, fs.LastFailoverErr)
+	})
+
+	t.Run("MaxSwitches为0时首次即耗尽", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(0, false)
+		err := newTestFailoverErr(500, false, false)
+
+		action := fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		require.Equal(t, FailoverExhausted, action)
+		require.Equal(t, 0, fs.SwitchCount)
+		require.Contains(t, fs.FailedAccountIDs, int64(100))
+	})
+}
+
+// ---------------------------------------------------------------------------
+// HandleFailoverError: cache billing (ForceCacheBilling)
+// ---------------------------------------------------------------------------
+
+func TestHandleFailoverError_CacheBilling(t *testing.T) {
+	t.Run("hasBoundSession为true时设置ForceCacheBilling", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, true) // hasBoundSession=true
+		err := newTestFailoverErr(500, false, false)
+
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		require.True(t, fs.ForceCacheBilling)
+	})
+
+	t.Run("failoverErr.ForceCacheBilling为true时设置", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		err := newTestFailoverErr(500, false, true) // ForceCacheBilling=true
+
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		require.True(t, fs.ForceCacheBilling)
+	})
+
+	t.Run("两者均为false时不设置", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		err := newTestFailoverErr(500, false, false)
+
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		require.False(t, fs.ForceCacheBilling)
+	})
+
+	t.Run("一旦设置不会被后续错误重置", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+
+		// First call: ForceCacheBilling=true, so the flag is set
+		err1 := newTestFailoverErr(500, false, true)
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", err1)
+		require.True(t, fs.ForceCacheBilling)
+
+		// Second call: ForceCacheBilling=false, but the flag stays true
+		err2 := newTestFailoverErr(502, false, false)
+		fs.HandleFailoverError(context.Background(), mock, 200, "openai", err2)
+		require.True(t, fs.ForceCacheBilling, "ForceCacheBilling should not be reset once set")
+	})
+}
+
+// ---------------------------------------------------------------------------
+// HandleFailoverError: same-account retry (RetryableOnSameAccount)
+// ---------------------------------------------------------------------------
+
+func TestHandleFailoverError_SameAccountRetry(t *testing.T) {
+	t.Run("第一次重试返回FailoverContinue", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		err := newTestFailoverErr(400, true, false)
+
+		start := time.Now()
+		action := fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		elapsed := time.Since(start)
+
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 1, fs.SameAccountRetryCount[100])
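+		require.Equal(t, 0, fs.SwitchCount, "same-account retry should not increase the switch count")
+		require.NotContains(t, fs.FailedAccountIDs, int64(100), "same-account retry should not add the account to the failed list")
+		require.Empty(t, mock.calls, "TempUnschedule should not be called during same-account retries")
+		// Verify that it waited for sameAccountRetryDelay (500ms)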
+		require.GreaterOrEqual(t, elapsed, 400*time.Millisecond)
+		require.Less(t, elapsed, 2*time.Second)
+	})
+
+	t.Run("第二次重试仍返回FailoverContinue", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		err := newTestFailoverErr(400, true, false)
+
+		// First retry
+		action := fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 1, fs.SameAccountRetryCount[100])
+
+		// Second retry
+		action = fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 2, fs.SameAccountRetryCount[100])
+
+		require.Empty(t, mock.calls, "TempUnschedule should not be called during either retry")
+	})
+
+	t.Run("第三次重试耗尽_触发TempUnschedule并切换", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		err := newTestFailoverErr(400, true, false)
+
+		// First and second retries
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		require.Equal(t, 2, fs.SameAccountRetryCount[100])
+
+		// Third call: retries have reached maxSameAccountRetries(2), so switch accounts
+		action := fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 1, fs.SwitchCount)
+		require.Contains(t, fs.FailedAccountIDs, int64(100))
+
+		// Verify that TempUnschedule was called
+		require.Len(t, mock.calls, 1)
+		require.Equal(t, int64(100), mock.calls[0].accountID)
+		require.Equal(t, err, mock.calls[0].failoverErr)
+	})
+
+	t.Run("不同账号独立跟踪重试次数", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(5, false)
+		err := newTestFailoverErr(400, true, false)
+
+		// Account 100, first retry
+		action := fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 1, fs.SameAccountRetryCount[100])
+
+		// Account 200, first retry (counted independently)
+		action = fs.HandleFailoverError(context.Background(), mock, 200, "openai", err)
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 1, fs.SameAccountRetryCount[200])
+		require.Equal(t, 1, fs.SameAccountRetryCount[100], "the count for account 100 should be unaffected")
+	})
+
+	t.Run("重试耗尽后再次遇到同账号_直接切换", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(5, false)
+		err := newTestFailoverErr(400, true, false)
+
+		// Exhaust the retries for account 100
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		// Third call: retries exhausted, so switch
+		action := fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		require.Equal(t, FailoverContinue, action)
+
+		// Account 100 shows up again; its count is still 2, so the retry condition fails and it switches directly
+		action = fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		require.Equal(t, FailoverContinue, action)
+		require.Len(t, mock.calls, 2, "the second exhaustion should also call TempUnschedule")
+	})
+}
+
+// ---------------------------------------------------------------------------
+// HandleFailoverError: TempUnschedule call verification
+// ---------------------------------------------------------------------------
+
+func TestHandleFailoverError_TempUnschedule(t *testing.T) {
+	t.Run("非重试错误不调用TempUnschedule", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		err := newTestFailoverErr(500, false, false) // RetryableOnSameAccount=false
+
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		require.Empty(t, mock.calls)
+	})
+
+	t.Run("重试错误耗尽后调用TempUnschedule_传入正确参数", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		err := newTestFailoverErr(502, true, false)
+
+		// Exhaust the retries
+		fs.HandleFailoverError(context.Background(), mock, 42, "openai", err)
+		fs.HandleFailoverError(context.Background(), mock, 42, "openai", err)
+		fs.HandleFailoverError(context.Background(), mock, 42, "openai", err)
+
+		require.Len(t, mock.calls, 1)
+		require.Equal(t, int64(42), mock.calls[0].accountID)
+		require.Equal(t, 502, mock.calls[0].failoverErr.StatusCode)
+		require.True(t, mock.calls[0].failoverErr.RetryableOnSameAccount)
+	})
+}
+
+// ---------------------------------------------------------------------------
+// HandleFailoverError: context cancellation
+// ---------------------------------------------------------------------------
+
+func TestHandleFailoverError_ContextCanceled(t *testing.T) {
+	t.Run("同账号重试sleep期间context取消", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		err := newTestFailoverErr(400, true, false)
+
+		ctx, cancel := context.WithCancel(context.Background())
+		cancel() // cancel immediately
+
+		start := time.Now()
+		action := fs.HandleFailoverError(ctx, mock, 100, "openai", err)
+		elapsed := time.Since(start)
+
+		require.Equal(t, FailoverCanceled, action)
+		require.Less(t, elapsed, 100*time.Millisecond, "should return immediately")
+		// The retry count should still be incremented
+		require.Equal(t, 1, fs.SameAccountRetryCount[100])
+	})
+
+	t.Run("Antigravity延迟期间context取消", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		fs.SwitchCount = 1 // the next switch makes SwitchCount=2, so delay = 1s
+		err := newTestFailoverErr(500, false, false)
+
+		ctx, cancel := context.WithCancel(context.Background())
+		cancel() // cancel immediately
+
+		start := time.Now()
+		action := fs.HandleFailoverError(ctx, mock, 100, service.PlatformAntigravity, err)
+		elapsed := time.Since(start)
+
+		require.Equal(t, FailoverCanceled, action)
+		require.Less(t, elapsed, 100*time.Millisecond, "should return immediately instead of waiting 1s")
+	})
+}
+
+// ---------------------------------------------------------------------------
+// HandleFailoverError: FailedAccountIDs tracking
+// ---------------------------------------------------------------------------
+
+func TestHandleFailoverError_FailedAccountIDs(t *testing.T) {
+	t.Run("切换时添加到失败列表", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", newTestFailoverErr(500, false, false))
+		require.Contains(t, fs.FailedAccountIDs, int64(100))
+
+		fs.HandleFailoverError(context.Background(), mock, 200, "openai", newTestFailoverErr(502, false, false))
+		require.Contains(t, fs.FailedAccountIDs, int64(200))
+		require.Len(t, fs.FailedAccountIDs, 2)
+	})
+
+	t.Run("耗尽时也添加到失败列表", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(0, false)
+
+		action := fs.HandleFailoverError(context.Background(), mock, 100, "openai", newTestFailoverErr(500, false, false))
+		require.Equal(t, FailoverExhausted, action)
+		require.Contains(t, fs.FailedAccountIDs, int64(100))
+	})
+
+	t.Run("同账号重试期间不添加到失败列表", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+
+		action := fs.HandleFailoverError(context.Background(), mock, 100, "openai", newTestFailoverErr(400, true, false))
+		require.Equal(t, FailoverContinue, action)
+		require.NotContains(t, fs.FailedAccountIDs, int64(100))
+	})
+
+	t.Run("同一账号多次切换不重复添加", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(5, false)
+
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", newTestFailoverErr(500, false, false))
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", newTestFailoverErr(500, false, false))
+		require.Len(t, fs.FailedAccountIDs, 1, "map keys deduplicate naturally")
+	})
+}
+
+// ---------------------------------------------------------------------------
+// HandleFailoverError: LastFailoverErr updates
+// ---------------------------------------------------------------------------
+
+func TestHandleFailoverError_LastFailoverErr(t *testing.T) {
+	t.Run("每次调用都更新LastFailoverErr", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+
+		err1 := newTestFailoverErr(500, false, false)
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", err1)
+		require.Equal(t, err1, fs.LastFailoverErr)
+
+		err2 := newTestFailoverErr(502, false, false)
+		fs.HandleFailoverError(context.Background(), mock, 200, "openai", err2)
+		require.Equal(t, err2, fs.LastFailoverErr)
+	})
+
+	t.Run("同账号重试时也更新LastFailoverErr", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+
+		err := newTestFailoverErr(400, true, false)
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		require.Equal(t, err, fs.LastFailoverErr)
+	})
+}
+
+// ---------------------------------------------------------------------------
+// HandleFailoverError: combined integration scenarios
+// ---------------------------------------------------------------------------
+
+func TestHandleFailoverError_IntegrationScenario(t *testing.T) {
+	t.Run("模拟完整failover流程_多账号混合重试与切换", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, true) // hasBoundSession=true
+
+		// 1. Account 100 hits a retryable error and is retried on the same account twice
+		retryErr := newTestFailoverErr(400, true, false)
+		action := fs.HandleFailoverError(context.Background(), mock, 100, "openai", retryErr)
+		require.Equal(t, FailoverContinue, action)
+		require.True(t, fs.ForceCacheBilling, "hasBoundSession=true should set ForceCacheBilling")
+
+		action = fs.HandleFailoverError(context.Background(), mock, 100, "openai", retryErr)
+		require.Equal(t, FailoverContinue, action)
+
+		// 2. Account 100 exhausts its retries: TempUnschedule, then switch
+		action = fs.HandleFailoverError(context.Background(), mock, 100, "openai", retryErr)
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 1, fs.SwitchCount)
+		require.Len(t, mock.calls, 1)
+
+		// 3. Account 200 hits a non-retryable error: switch directly
+		switchErr := newTestFailoverErr(500, false, false)
+		action = fs.HandleFailoverError(context.Background(), mock, 200, "openai", switchErr)
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 2, fs.SwitchCount)
+
+		// 4. Account 300 hits a non-retryable error: switch again
+		action = fs.HandleFailoverError(context.Background(), mock, 300, "openai", switchErr)
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 3, fs.SwitchCount)
+
+		// 5. Account 400: exhausted (SwitchCount=3 >= MaxSwitches=3)
+		action = fs.HandleFailoverError(context.Background(), mock, 400, "openai", switchErr)
+		require.Equal(t, FailoverExhausted, action)
+
+		// Verify the final state
+		require.Equal(t, 3, fs.SwitchCount, "no further increment once exhausted")
+		require.Len(t, fs.FailedAccountIDs, 4, "all 4 distinct accounts are in the failed list")
+		require.True(t, fs.ForceCacheBilling)
+		require.Len(t, mock.calls, 1, "only account 100 triggered TempUnschedule")
+	})
+
+	t.Run("模拟Antigravity平台完整流程", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(2, false)
+
+		err := newTestFailoverErr(500, false, false)
+
+		// First switch: delay = 0s
+		start := time.Now()
+		action := fs.HandleFailoverError(context.Background(), mock, 100, service.PlatformAntigravity, err)
+		elapsed := time.Since(start)
+		require.Equal(t, FailoverContinue, action)
+		require.Less(t, elapsed, 200*time.Millisecond, "first switch has zero delay")
+
+		// Second switch: delay = 1s
+		start = time.Now()
+		action = fs.HandleFailoverError(context.Background(), mock, 200, service.PlatformAntigravity, err)
+		elapsed = time.Since(start)
+		require.Equal(t, FailoverContinue, action)
+		require.GreaterOrEqual(t, elapsed, 800*time.Millisecond, "second switch is delayed by about 1s")
+
+		// Third call: exhausted (no delay, because it returns before the delay check)
+		start = time.Now()
+		action = fs.HandleFailoverError(context.Background(), mock, 300, service.PlatformAntigravity, err)
+		elapsed = time.Since(start)
+		require.Equal(t, FailoverExhausted, action)
+		require.Less(t, elapsed, 200*time.Millisecond, "there should be no delay once exhausted")
+	})
+
+	t.Run("ForceCacheBilling通过错误标志设置", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false) // hasBoundSession=false
+
+		// First call: ForceCacheBilling=false
+		err1 := newTestFailoverErr(500, false, false)
+		fs.HandleFailoverError(context.Background(), mock, 100, "openai", err1)
+		require.False(t, fs.ForceCacheBilling)
+
+		// Second call: ForceCacheBilling=true (Antigravity sticky-session switch)
+		err2 := newTestFailoverErr(500, false, true)
+		fs.HandleFailoverError(context.Background(), mock, 200, "openai", err2)
+		require.True(t, fs.ForceCacheBilling, "the error flag should set ForceCacheBilling")
+
+		// Third call: ForceCacheBilling=false, but the state stays true
+		err3 := newTestFailoverErr(500, false, false)
+		fs.HandleFailoverError(context.Background(), mock, 300, "openai", err3)
+		require.True(t, fs.ForceCacheBilling, "should not be reset")
+	})
+}
+
+// ---------------------------------------------------------------------------
+// HandleFailoverError: edge cases
+// ---------------------------------------------------------------------------
+
+func TestHandleFailoverError_EdgeCases(t *testing.T) {
+	t.Run("StatusCode为0的错误也能正常处理", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		err := newTestFailoverErr(0, false, false)
+
+		action := fs.HandleFailoverError(context.Background(), mock, 100, "openai", err)
+		require.Equal(t, FailoverContinue, action)
+	})
+
+	t.Run("AccountID为0也能正常跟踪", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		err := newTestFailoverErr(500, true, false)
+
+		action := fs.HandleFailoverError(context.Background(), mock, 0, "openai", err)
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 1, fs.SameAccountRetryCount[0])
+	})
+
+	t.Run("负AccountID也能正常跟踪", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		err := newTestFailoverErr(500, true, false)
+
+		action := fs.HandleFailoverError(context.Background(), mock, -1, "openai", err)
+		require.Equal(t, FailoverContinue, action)
+		require.Equal(t, 1, fs.SameAccountRetryCount[-1])
+	})
+
+	t.Run("空平台名称不触发Antigravity延迟", func(t *testing.T) {
+		mock := &mockTempUnscheduler{}
+		fs := NewFailoverState(3, false)
+		fs.SwitchCount = 1
+		err := newTestFailoverErr(500, false, false)
+
+		start := time.Now()
+		action := fs.HandleFailoverError(context.Background(), mock, 100, "", err)
+		elapsed := time.Since(start)
+
+		require.Equal(t, FailoverContinue, action)
+		require.Less(t, elapsed, 200*time.Millisecond, "an empty platform should not trigger the Antigravity delay")
+	})
+}
+
+// ---------------------------------------------------------------------------
+// HandleSelectionExhausted tests
+// ---------------------------------------------------------------------------
+
+func TestHandleSelectionExhausted(t *testing.T) {
+	t.Run("无LastFailoverErr时返回Exhausted", func(t *testing.T) {
+		fs := NewFailoverState(3, false)
+		// LastFailoverErr is nil
+
+		action := fs.HandleSelectionExhausted(context.Background())
+		require.Equal(t, FailoverExhausted, action)
+	})
+
+	t.Run("非503错误返回Exhausted", func(t *testing.T) {
+		fs := NewFailoverState(3, false)
+		fs.LastFailoverErr = newTestFailoverErr(500, false, false)
+
+		action := fs.HandleSelectionExhausted(context.Background())
+		require.Equal(t, FailoverExhausted, action)
+	})
+
+	t.Run("503且未耗尽_等待后返回Continue并清除失败列表", func(t *testing.T) {
+		fs := NewFailoverState(3, false)
+		fs.LastFailoverErr = newTestFailoverErr(503, false, false)
+		fs.FailedAccountIDs[100] = struct{}{}
+		fs.SwitchCount = 1
+
+		start := time.Now()
+		action := fs.HandleSelectionExhausted(context.Background())
+		elapsed := time.Since(start)
+
+		require.Equal(t, FailoverContinue, action)
+		require.Empty(t, fs.FailedAccountIDs, "the failed-account list should be cleared")
+		require.GreaterOrEqual(t, elapsed, 1500*time.Millisecond, "should wait about 2s")
+		require.Less(t, elapsed, 5*time.Second)
+	})
+
+	t.Run("503但SwitchCount已超过MaxSwitches_返回Exhausted", func(t *testing.T) {
+		fs := NewFailoverState(2, false)
+		fs.LastFailoverErr = newTestFailoverErr(503, false, false)
+		fs.SwitchCount = 3 // > MaxSwitches(2)
+
+		start := time.Now()
+		action := fs.HandleSelectionExhausted(context.Background())
+		elapsed := time.Since(start)
+
+		require.Equal(t, FailoverExhausted, action)
+		require.Less(t, elapsed, 100*time.Millisecond, "should not wait")
+	})
+
+	t.Run("503但context已取消_返回Canceled", func(t *testing.T) {
+		fs := NewFailoverState(3, false)
+		fs.LastFailoverErr = newTestFailoverErr(503, false, false)
+
+		ctx, cancel := context.WithCancel(context.Background())
+		cancel()
+
+		start := time.Now()
+		action := fs.HandleSelectionExhausted(ctx)
+		elapsed := time.Since(start)
+
+		require.Equal(t, FailoverCanceled, action)
+		require.Less(t, elapsed, 100*time.Millisecond, "should return immediately")
+	})
+
+	t.Run("503且SwitchCount等于MaxSwitches_仍可重试", func(t *testing.T) {
+		fs := NewFailoverState(2, false)
+		fs.LastFailoverErr = newTestFailoverErr(503, false, false)
+		fs.SwitchCount = 2 // == MaxSwitches; the condition is <=, so a retry is still allowed
+
+		action := fs.HandleSelectionExhausted(context.Background())
+		require.Equal(t, FailoverContinue, action)
+	})
+}
diff --git a/backend/internal/handler/gateway_handler.go b/backend/internal/handler/gateway_handler.go
index c2b6bf09..0cc86bb4 100644
--- a/backend/internal/handler/gateway_handler.go
+++ b/backend/internal/handler/gateway_handler.go
@@ -232,12 +232,7 @@ func (h *GatewayHandler) Messages(c *gin.Context) {
 	hasBoundSession := sessionKey != "" && sessionBoundAccountID > 0
 
 	if platform == service.PlatformGemini {
-		maxAccountSwitches := h.maxAccountSwitchesGemini
-		switchCount := 0
-		failedAccountIDs := make(map[int64]struct{})
-		sameAccountRetryCount := make(map[int64]int) // 同账号重试计数
-		var lastFailoverErr *service.UpstreamFailoverError
-		var forceCacheBilling bool // 粘性会话切换时的缓存计费标记
+		fs := NewFailoverState(h.maxAccountSwitchesGemini, hasBoundSession)
 
 		// 单账号分组提前设置 SingleAccountRetry 标记，让 Service 层首次 503 就不设模型限流标记。
 		// 避免单账号分组收到 503 (MODEL_CAPACITY_EXHAUSTED) 时设 29s 限流，导致后续请求连续快速失败。
@@ -247,31 +242,28 @@ func (h *GatewayHandler) Messages(c *gin.Context) {
 		}
 
 		for {
-			selection, err := h.gatewayService.SelectAccountWithLoadAwareness(c.Request.Context(), apiKey.GroupID, sessionKey, reqModel, failedAccountIDs, "") // Gemini 不使用会话限制
+			selection, err := h.gatewayService.SelectAccountWithLoadAwareness(c.Request.Context(), apiKey.GroupID, sessionKey, reqModel, fs.FailedAccountIDs, "") // Gemini does not use session limits
 			if err != nil {
-				if len(failedAccountIDs) == 0 {
+				if len(fs.FailedAccountIDs) == 0 {
 					h.handleStreamingAwareError(c, http.StatusServiceUnavailable, "api_error", "No available accounts: "+err.Error(), streamStarted)
 					return
 				}
-				// Antigravity 单账号退避重试：分组内没有其他可用账号时，
-				// 对 503 错误不直接返回，而是清除排除列表、等待退避后重试同一个账号。
-				// 谷歌上游 503 (MODEL_CAPACITY_EXHAUSTED) 通常是暂时性的，等几秒就能恢复。
-				if lastFailoverErr != nil && lastFailoverErr.StatusCode == http.StatusServiceUnavailable && switchCount <= maxAccountSwitches {
-					if sleepAntigravitySingleAccountBackoff(c.Request.Context(), switchCount) {
-						log.Printf("Antigravity single-account 503 retry: clearing failed accounts, retry %d/%d", switchCount, maxAccountSwitches)
-						failedAccountIDs = make(map[int64]struct{})
-						// 设置 context 标记，让 Service 层预检查等待限流过期而非直接切换
-						ctx := context.WithValue(c.Request.Context(), ctxkey.SingleAccountRetry, true)
-						c.Request = c.Request.WithContext(ctx)
-						continue
+				action := fs.HandleSelectionExhausted(c.Request.Context())
+				switch action {
+				case FailoverContinue:
+					ctx := context.WithValue(c.Request.Context(), ctxkey.SingleAccountRetry, true)
+					c.Request = c.Request.WithContext(ctx)
+					continue
+				case FailoverCanceled:
+					return
+				default: // FailoverExhausted
+					if fs.LastFailoverErr != nil {
+						h.handleFailoverExhausted(c, fs.LastFailoverErr, service.PlatformGemini, streamStarted)
+					} else {
+						h.handleFailoverExhaustedSimple(c, 502, streamStarted)
 					}
+					return
 				}
-				if lastFailoverErr != nil {
-					h.handleFailoverExhausted(c, lastFailoverErr, service.PlatformGemini, streamStarted)
-				} else {
-					h.handleFailoverExhaustedSimple(c, 502, streamStarted)
-				}
-				return
 			}
 			account := selection.Account
 			setOpsSelectedAccount(c, account.ID)
@@ -346,8 +338,8 @@ func (h *GatewayHandler) Messages(c *gin.Context) {
 			// 转发请求 - 根据账号平台分流
 			var result *service.ForwardResult
 			requestCtx := c.Request.Context()
-			if switchCount > 0 {
-				requestCtx = context.WithValue(requestCtx, ctxkey.AccountSwitchCount, switchCount)
+			if fs.SwitchCount > 0 {
+				requestCtx = context.WithValue(requestCtx, ctxkey.AccountSwitchCount, fs.SwitchCount)
 			}
 			if account.Platform == service.PlatformAntigravity {
 				result, err = h.antigravityGatewayService.ForwardGemini(requestCtx, c, account, reqModel, "generateContent", reqStream, body, hasBoundSession)
@@ -360,40 +352,16 @@ func (h *GatewayHandler) Messages(c *gin.Context) {
 			if err != nil {
 				var failoverErr *service.UpstreamFailoverError
 				if errors.As(err, &failoverErr) {
-					lastFailoverErr = failoverErr
-					if needForceCacheBilling(hasBoundSession, failoverErr) {
-						forceCacheBilling = true
-					}
-
-					// 同账号重试：对 RetryableOnSameAccount 的临时性错误，先在同一账号上重试
-					if failoverErr.RetryableOnSameAccount && sameAccountRetryCount[account.ID] < maxSameAccountRetries {
-						sameAccountRetryCount[account.ID]++
-						log.Printf("Account %d: retryable error %d, same-account retry %d/%d",
-							account.ID, failoverErr.StatusCode, sameAccountRetryCount[account.ID], maxSameAccountRetries)
-						if !sleepSameAccountRetryDelay(c.Request.Context()) {
-							return
-						}
+					action := fs.HandleFailoverError(c.Request.Context(), h.gatewayService, account.ID, account.Platform, failoverErr)
+					switch action {
+					case FailoverContinue:
 						continue
-					}
-
-					// 同账号重试用尽，执行临时封禁并切换账号
-					if failoverErr.RetryableOnSameAccount {
-						h.gatewayService.TempUnscheduleRetryableError(c.Request.Context(), account.ID, failoverErr)
-					}
-
-					failedAccountIDs[account.ID] = struct{}{}
-					if switchCount >= maxAccountSwitches {
-						h.handleFailoverExhausted(c, failoverErr, service.PlatformGemini, streamStarted)
+					case FailoverExhausted:
+						h.handleFailoverExhausted(c, fs.LastFailoverErr, service.PlatformGemini, streamStarted)
+						return
+					case FailoverCanceled:
 						return
 					}
-					switchCount++
-					log.Printf("Account %d: upstream error %d, switching account %d/%d", account.ID, failoverErr.StatusCode, switchCount, maxAccountSwitches)
-					if account.Platform == service.PlatformAntigravity {
-						if !sleepFailoverDelay(c.Request.Context(), switchCount) {
-							return
-						}
-					}
-					continue
 				}
 				// 错误响应已在Forward中处理，这里只记录日志
 				log.Printf("Forward request failed: %v", err)
@@ -421,7 +389,7 @@ func (h *GatewayHandler) Messages(c *gin.Context) {
 				}); err != nil {
 					log.Printf("Record usage failed: %v", err)
 				}
-			}(result, account, userAgent, clientIP, forceCacheBilling)
+			}(result, account, userAgent, clientIP, fs.ForceCacheBilling)
 			return
 		}
 	}
@@ -442,41 +410,33 @@ func (h *GatewayHandler) Messages(c *gin.Context) {
 	}
 
 	for {
-		maxAccountSwitches := h.maxAccountSwitches
-		switchCount := 0
-		failedAccountIDs := make(map[int64]struct{})
-		sameAccountRetryCount := make(map[int64]int) // 同账号重试计数
-		var lastFailoverErr *service.UpstreamFailoverError
+		fs := NewFailoverState(h.maxAccountSwitches, hasBoundSession)
 		retryWithFallback := false
-		var forceCacheBilling bool // 粘性会话切换时的缓存计费标记
 
 		for {
 			// 选择支持该模型的账号
-			selection, err := h.gatewayService.SelectAccountWithLoadAwareness(c.Request.Context(), currentAPIKey.GroupID, sessionKey, reqModel, failedAccountIDs, parsedReq.MetadataUserID)
+			selection, err := h.gatewayService.SelectAccountWithLoadAwareness(c.Request.Context(), currentAPIKey.GroupID, sessionKey, reqModel, fs.FailedAccountIDs, parsedReq.MetadataUserID)
 			if err != nil {
-				if len(failedAccountIDs) == 0 {
+				if len(fs.FailedAccountIDs) == 0 {
 					h.handleStreamingAwareError(c, http.StatusServiceUnavailable, "api_error", "No available accounts: "+err.Error(), streamStarted)
 					return
 				}
-				// Antigravity 单账号退避重试：分组内没有其他可用账号时，
-				// 对 503 错误不直接返回，而是清除排除列表、等待退避后重试同一个账号。
-				// 谷歌上游 503 (MODEL_CAPACITY_EXHAUSTED) 通常是暂时性的，等几秒就能恢复。
-				if lastFailoverErr != nil && lastFailoverErr.StatusCode == http.StatusServiceUnavailable && switchCount <= maxAccountSwitches {
-					if sleepAntigravitySingleAccountBackoff(c.Request.Context(), switchCount) {
-						log.Printf("Antigravity single-account 503 retry: clearing failed accounts, retry %d/%d", switchCount, maxAccountSwitches)
-						failedAccountIDs = make(map[int64]struct{})
-						// 设置 context 标记，让 Service 层预检查等待限流过期而非直接切换
-						ctx := context.WithValue(c.Request.Context(), ctxkey.SingleAccountRetry, true)
-						c.Request = c.Request.WithContext(ctx)
-						continue
+				action := fs.HandleSelectionExhausted(c.Request.Context())
+				switch action {
+				case FailoverContinue:
+					ctx := context.WithValue(c.Request.Context(), ctxkey.SingleAccountRetry, true)
+					c.Request = c.Request.WithContext(ctx)
+					continue
+				case FailoverCanceled:
+					return
+				default: // FailoverExhausted
+					if fs.LastFailoverErr != nil {
+						h.handleFailoverExhausted(c, fs.LastFailoverErr, platform, streamStarted)
+					} else {
+						h.handleFailoverExhaustedSimple(c, 502, streamStarted)
 					}
+					return
 				}
-				if lastFailoverErr != nil {
-					h.handleFailoverExhausted(c, lastFailoverErr, platform, streamStarted)
-				} else {
-					h.handleFailoverExhaustedSimple(c, 502, streamStarted)
-				}
-				return
 			}
 			account := selection.Account
 			setOpsSelectedAccount(c, account.ID)
@@ -549,8 +509,8 @@ func (h *GatewayHandler) Messages(c *gin.Context) {
 			// 转发请求 - 根据账号平台分流
 			var result *service.ForwardResult
 			requestCtx := c.Request.Context()
-			if switchCount > 0 {
-				requestCtx = context.WithValue(requestCtx, ctxkey.AccountSwitchCount, switchCount)
+			if fs.SwitchCount > 0 {
+				requestCtx = context.WithValue(requestCtx, ctxkey.AccountSwitchCount, fs.SwitchCount)
 			}
 			if account.Platform == service.PlatformAntigravity && account.Type != service.AccountTypeAPIKey {
 				result, err = h.antigravityGatewayService.Forward(requestCtx, c, account, body, hasBoundSession)
@@ -598,40 +558,16 @@ func (h *GatewayHandler) Messages(c *gin.Context) {
 				}
 				var failoverErr *service.UpstreamFailoverError
 				if errors.As(err, &failoverErr) {
-					lastFailoverErr = failoverErr
-					if needForceCacheBilling(hasBoundSession, failoverErr) {
-						forceCacheBilling = true
-					}
-
-					// 同账号重试：对 RetryableOnSameAccount 的临时性错误，先在同一账号上重试
-					if failoverErr.RetryableOnSameAccount && sameAccountRetryCount[account.ID] < maxSameAccountRetries {
-						sameAccountRetryCount[account.ID]++
-						log.Printf("Account %d: retryable error %d, same-account retry %d/%d",
-							account.ID, failoverErr.StatusCode, sameAccountRetryCount[account.ID], maxSameAccountRetries)
-						if !sleepSameAccountRetryDelay(c.Request.Context()) {
-							return
-						}
+					action := fs.HandleFailoverError(c.Request.Context(), h.gatewayService, account.ID, account.Platform, failoverErr)
+					switch action {
+					case FailoverContinue:
 						continue
-					}
-
-					// 同账号重试用尽，执行临时封禁并切换账号
-					if failoverErr.RetryableOnSameAccount {
-						h.gatewayService.TempUnscheduleRetryableError(c.Request.Context(), account.ID, failoverErr)
-					}
-
-					failedAccountIDs[account.ID] = struct{}{}
-					if switchCount >= maxAccountSwitches {
-						h.handleFailoverExhausted(c, failoverErr, account.Platform, streamStarted)
+					case FailoverExhausted:
+						h.handleFailoverExhausted(c, fs.LastFailoverErr, account.Platform, streamStarted)
+						return
+					case FailoverCanceled:
 						return
 					}
-					switchCount++
-					log.Printf("Account %d: upstream error %d, switching account %d/%d", account.ID, failoverErr.StatusCode, switchCount, maxAccountSwitches)
-					if account.Platform == service.PlatformAntigravity {
-						if !sleepFailoverDelay(c.Request.Context(), switchCount) {
-							return
-						}
-					}
-					continue
 				}
 				// 错误响应已在Forward中处理，这里只记录日志
 				log.Printf("Account %d: Forward request failed: %v", account.ID, err)
@@ -659,7 +595,7 @@ func (h *GatewayHandler) Messages(c *gin.Context) {
 				}); err != nil {
 					log.Printf("Record usage failed: %v", err)
 				}
-			}(result, account, userAgent, clientIP, forceCacheBilling)
+			}(result, account, userAgent, clientIP, fs.ForceCacheBilling)
 			return
 		}
 		if !retryWithFallback {
@@ -893,65 +829,6 @@ func (h *GatewayHandler) handleConcurrencyError(c *gin.Context, err error, slotT
 		fmt.Sprintf("Concurrency limit exceeded for %s, please retry later", slotType), streamStarted)
 }
 
-// needForceCacheBilling 判断 failover 时是否需要强制缓存计费
-// 粘性会话切换账号、或上游明确标记时，将 input_tokens 转为 cache_read 计费
-func needForceCacheBilling(hasBoundSession bool, failoverErr *service.UpstreamFailoverError) bool {
-	return hasBoundSession || (failoverErr != nil && failoverErr.ForceCacheBilling)
-}
-
-const (
-	// maxSameAccountRetries 同账号重试次数上限（针对 RetryableOnSameAccount 错误）
-	maxSameAccountRetries = 2
-	// sameAccountRetryDelay 同账号重试间隔
-	sameAccountRetryDelay = 500 * time.Millisecond
-)
-
-// sleepSameAccountRetryDelay 同账号重试固定延时，返回 false 表示 context 已取消。
-func sleepSameAccountRetryDelay(ctx context.Context) bool {
-	select {
-	case <-ctx.Done():
-		return false
-	case <-time.After(sameAccountRetryDelay):
-		return true
-	}
-}
-
-// sleepFailoverDelay 账号切换线性递增延时：第1次0s、第2次1s、第3次2s…
-// 返回 false 表示 context 已取消。
-func sleepFailoverDelay(ctx context.Context, switchCount int) bool {
-	delay := time.Duration(switchCount-1) * time.Second
-	if delay <= 0 {
-		return true
-	}
-	select {
-	case <-ctx.Done():
-		return false
-	case <-time.After(delay):
-		return true
-	}
-}
-
-// sleepAntigravitySingleAccountBackoff Antigravity 平台单账号分组的 503 退避重试延时。
-// 当分组内只有一个可用账号且上游返回 503（MODEL_CAPACITY_EXHAUSTED）时使用，
-// 采用短固定延时策略。Service 层在 SingleAccountRetry 模式下已经做了充分的原地重试
-// （最多 3 次、总等待 30s），所以 Handler 层的退避只需短暂等待即可。
-// 返回 false 表示 context 已取消。
-func sleepAntigravitySingleAccountBackoff(ctx context.Context, retryCount int) bool {
-	// 固定短延时：2s
-	// Service 层已经在原地等待了足够长的时间（retryDelay × 重试次数），
-	// Handler 层只需短暂间隔后重新进入 Service 层即可。
-	const delay = 2 * time.Second
-
-	log.Printf("Antigravity single-account 503 backoff: waiting %v before retry (attempt %d)", delay, retryCount)
-
-	select {
-	case <-ctx.Done():
-		return false
-	case <-time.After(delay):
-		return true
-	}
-}
-
 func (h *GatewayHandler) handleFailoverExhausted(c *gin.Context, failoverErr *service.UpstreamFailoverError, platform string, streamStarted bool) {
 	statusCode := failoverErr.StatusCode
 	responseBody := failoverErr.ResponseBody
diff --git a/backend/internal/handler/gateway_handler_single_account_retry_test.go b/backend/internal/handler/gateway_handler_single_account_retry_test.go
deleted file mode 100644
index 96aa14c6..00000000
--- a/backend/internal/handler/gateway_handler_single_account_retry_test.go
+++ /dev/null
@@ -1,51 +0,0 @@
-package handler
-
-import (
-	"context"
-	"testing"
-	"time"
-
-	"github.com/stretchr/testify/require"
-)
-
-// ---------------------------------------------------------------------------
-// sleepAntigravitySingleAccountBackoff 测试
-// ---------------------------------------------------------------------------
-
-func TestSleepAntigravitySingleAccountBackoff_ReturnsTrue(t *testing.T) {
-	ctx := context.Background()
-	start := time.Now()
-	ok := sleepAntigravitySingleAccountBackoff(ctx, 1)
-	elapsed := time.Since(start)
-
-	require.True(t, ok, "should return true when context is not canceled")
-	// 固定延迟 2s
-	require.GreaterOrEqual(t, elapsed, 1500*time.Millisecond, "should wait approximately 2s")
-	require.Less(t, elapsed, 5*time.Second, "should not wait too long")
-}
-
-func TestSleepAntigravitySingleAccountBackoff_ContextCanceled(t *testing.T) {
-	ctx, cancel := context.WithCancel(context.Background())
-	cancel() // 立即取消
-
-	start := time.Now()
-	ok := sleepAntigravitySingleAccountBackoff(ctx, 1)
-	elapsed := time.Since(start)
-
-	require.False(t, ok, "should return false when context is canceled")
-	require.Less(t, elapsed, 500*time.Millisecond, "should return immediately on cancel")
-}
-
-func TestSleepAntigravitySingleAccountBackoff_FixedDelay(t *testing.T) {
-	// 验证不同 retryCount 都使用固定 2s 延迟
-	ctx := context.Background()
-
-	start := time.Now()
-	ok := sleepAntigravitySingleAccountBackoff(ctx, 5)
-	elapsed := time.Since(start)
-
-	require.True(t, ok)
-	// 即使 retryCount=5，延迟仍然是固定的 2s
-	require.GreaterOrEqual(t, elapsed, 1500*time.Millisecond)
-	require.Less(t, elapsed, 5*time.Second)
-}
diff --git a/backend/internal/handler/gemini_v1beta_handler.go b/backend/internal/handler/gemini_v1beta_handler.go
index 3d25505b..51b77037 100644
--- a/backend/internal/handler/gemini_v1beta_handler.go
+++ b/backend/internal/handler/gemini_v1beta_handler.go
@@ -321,11 +321,7 @@ func (h *GatewayHandler) GeminiV1BetaModels(c *gin.Context) {
 	hasBoundSession := sessionKey != "" && sessionBoundAccountID > 0
 	cleanedForUnknownBinding := false
 
-	maxAccountSwitches := h.maxAccountSwitchesGemini
-	switchCount := 0
-	failedAccountIDs := make(map[int64]struct{})
-	var lastFailoverErr *service.UpstreamFailoverError
-	var forceCacheBilling bool // 粘性会话切换时的缓存计费标记
+	fs := NewFailoverState(h.maxAccountSwitchesGemini, hasBoundSession)
 
 	// 单账号分组提前设置 SingleAccountRetry 标记，让 Service 层首次 503 就不设模型限流标记。
 	// 避免单账号分组收到 503 (MODEL_CAPACITY_EXHAUSTED) 时设 29s 限流，导致后续请求连续快速失败。
@@ -335,27 +331,24 @@ func (h *GatewayHandler) GeminiV1BetaModels(c *gin.Context) {
 	}
 
 	for {
-		selection, err := h.gatewayService.SelectAccountWithLoadAwareness(c.Request.Context(), apiKey.GroupID, sessionKey, modelName, failedAccountIDs, "") // Gemini 不使用会话限制
+		selection, err := h.gatewayService.SelectAccountWithLoadAwareness(c.Request.Context(), apiKey.GroupID, sessionKey, modelName, fs.FailedAccountIDs, "") // Gemini 不使用会话限制
 		if err != nil {
-			if len(failedAccountIDs) == 0 {
+			if len(fs.FailedAccountIDs) == 0 {
 				googleError(c, http.StatusServiceUnavailable, "No available Gemini accounts: "+err.Error())
 				return
 			}
-			// Antigravity 单账号退避重试：分组内没有其他可用账号时，
-			// 对 503 错误不直接返回，而是清除排除列表、等待退避后重试同一个账号。
-			// 谷歌上游 503 (MODEL_CAPACITY_EXHAUSTED) 通常是暂时性的，等几秒就能恢复。
-			if lastFailoverErr != nil && lastFailoverErr.StatusCode == http.StatusServiceUnavailable && switchCount <= maxAccountSwitches {
-				if sleepAntigravitySingleAccountBackoff(c.Request.Context(), switchCount) {
-					log.Printf("Antigravity single-account 503 retry: clearing failed accounts, retry %d/%d", switchCount, maxAccountSwitches)
-					failedAccountIDs = make(map[int64]struct{})
-					// 设置 context 标记，让 Service 层预检查等待限流过期而非直接切换
-					ctx := context.WithValue(c.Request.Context(), ctxkey.SingleAccountRetry, true)
-					c.Request = c.Request.WithContext(ctx)
-					continue
-				}
+			action := fs.HandleSelectionExhausted(c.Request.Context())
+			switch action {
+			case FailoverContinue:
+				ctx := context.WithValue(c.Request.Context(), ctxkey.SingleAccountRetry, true)
+				c.Request = c.Request.WithContext(ctx)
+				continue
+			case FailoverCanceled:
+				return
+			default: // FailoverExhausted
+				h.handleGeminiFailoverExhausted(c, fs.LastFailoverErr)
+				return
 			}
-			h.handleGeminiFailoverExhausted(c, lastFailoverErr)
-			return
 		}
 		account := selection.Account
 		setOpsSelectedAccount(c, account.ID)
@@ -429,8 +422,8 @@ func (h *GatewayHandler) GeminiV1BetaModels(c *gin.Context) {
 		// 5) forward (根据平台分流)
 		var result *service.ForwardResult
 		requestCtx := c.Request.Context()
-		if switchCount > 0 {
-			requestCtx = context.WithValue(requestCtx, ctxkey.AccountSwitchCount, switchCount)
+		if fs.SwitchCount > 0 {
+			requestCtx = context.WithValue(requestCtx, ctxkey.AccountSwitchCount, fs.SwitchCount)
 		}
 		if account.Platform == service.PlatformAntigravity && account.Type != service.AccountTypeAPIKey {
 			result, err = h.antigravityGatewayService.ForwardGemini(requestCtx, c, account, modelName, action, stream, body, hasBoundSession)
@@ -443,24 +436,16 @@ func (h *GatewayHandler) GeminiV1BetaModels(c *gin.Context) {
 		if err != nil {
 			var failoverErr *service.UpstreamFailoverError
 			if errors.As(err, &failoverErr) {
-				failedAccountIDs[account.ID] = struct{}{}
-				if needForceCacheBilling(hasBoundSession, failoverErr) {
-					forceCacheBilling = true
-				}
-				if switchCount >= maxAccountSwitches {
-					lastFailoverErr = failoverErr
-					h.handleGeminiFailoverExhausted(c, lastFailoverErr)
+				action := fs.HandleFailoverError(c.Request.Context(), h.gatewayService, account.ID, account.Platform, failoverErr)
+				switch action {
+				case FailoverContinue:
+					continue
+				case FailoverExhausted:
+					h.handleGeminiFailoverExhausted(c, fs.LastFailoverErr)
+					return
+				case FailoverCanceled:
 					return
 				}
-				lastFailoverErr = failoverErr
-				switchCount++
-				log.Printf("Gemini account %d: upstream error %d, switching account %d/%d", account.ID, failoverErr.StatusCode, switchCount, maxAccountSwitches)
-				if account.Platform == service.PlatformAntigravity {
-					if !sleepFailoverDelay(c.Request.Context(), switchCount) {
-						return
-					}
-				}
-				continue
 			}
 			// ForwardNative already wrote the response
 			log.Printf("Gemini native forward failed: %v", err)
@@ -506,7 +491,7 @@ func (h *GatewayHandler) GeminiV1BetaModels(c *gin.Context) {
 			}); err != nil {
 				log.Printf("Record usage failed: %v", err)
 			}
-		}(result, account, userAgent, clientIP, forceCacheBilling)
+		}(result, account, userAgent, clientIP, fs.ForceCacheBilling)
 		return
 	}
 }
diff --git a/backend/internal/service/antigravity_gateway_service.go b/backend/internal/service/antigravity_gateway_service.go
index 0d054c49..5ca7b3f3 100644
--- a/backend/internal/service/antigravity_gateway_service.go
+++ b/backend/internal/service/antigravity_gateway_service.go
@@ -1372,6 +1372,10 @@ func (s *AntigravityGatewayService) Forward(ctx context.Context, c *gin.Context,
 				ForceCacheBilling: switchErr.IsStickySession,
 			}
 		}
+		// Distinguish a client cancellation from a genuine upstream failure so the error message is accurate
+		if c.Request.Context().Err() != nil {
+			return nil, s.writeClaudeError(c, http.StatusBadGateway, "client_disconnected", "Client disconnected before upstream response")
+		}
 		return nil, s.writeClaudeError(c, http.StatusBadGateway, "upstream_error", "Upstream request failed after retries")
 	}
 	resp := result.resp
@@ -2044,6 +2048,10 @@ func (s *AntigravityGatewayService) ForwardGemini(ctx context.Context, c *gin.Co
 				ForceCacheBilling: switchErr.IsStickySession,
 			}
 		}
+		// Distinguish a client cancellation from a genuine upstream failure so the error message is accurate
+		if c.Request.Context().Err() != nil {
+			return nil, s.writeGoogleError(c, http.StatusBadGateway, "Client disconnected before upstream response")
+		}
 		return nil, s.writeGoogleError(c, http.StatusBadGateway, "Upstream request failed after retries")
 	}
 	resp := result.resp
diff --git a/backend/internal/service/error_passthrough_runtime_test.go b/backend/internal/service/error_passthrough_runtime_test.go
index 0a45e57a..4a4309f9 100644
--- a/backend/internal/service/error_passthrough_runtime_test.go
+++ b/backend/internal/service/error_passthrough_runtime_test.go
@@ -220,7 +220,7 @@ func TestApplyErrorPassthroughRule_SkipMonitoringSetsContextKey(t *testing.T) {
 	v, exists := c.Get(OpsSkipPassthroughKey)
 	assert.True(t, exists, "OpsSkipPassthroughKey should be set when skip_monitoring=true")
 	boolVal, ok := v.(bool)
-	assert.True(t, ok, "value should be bool")
+	assert.True(t, ok, "value should be a bool")
 	assert.True(t, boolVal)
 }
 
diff --git a/backend/internal/service/ops_concurrency.go b/backend/internal/service/ops_concurrency.go
index f6541d08..faac2d5b 100644
--- a/backend/internal/service/ops_concurrency.go
+++ b/backend/internal/service/ops_concurrency.go
@@ -344,8 +344,16 @@ func (s *OpsService) getUsersLoadMapBestEffort(ctx context.Context, users []User
 	return out
 }
 
-// GetUserConcurrencyStats returns real-time concurrency usage for all active users.
-func (s *OpsService) GetUserConcurrencyStats(ctx context.Context) (map[int64]*UserConcurrencyInfo, *time.Time, error) {
+// GetUserConcurrencyStats returns real-time concurrency usage for active users.
+//
+// Optional filters:
+// - platformFilter: only include users who have access to groups belonging to that platform
+// - groupIDFilter: only include users who have access to that specific group
+func (s *OpsService) GetUserConcurrencyStats(
+	ctx context.Context,
+	platformFilter string,
+	groupIDFilter *int64,
+) (map[int64]*UserConcurrencyInfo, *time.Time, error) {
 	if err := s.RequireMonitoringEnabled(ctx); err != nil {
 		return nil, nil, err
 	}
@@ -355,6 +363,15 @@ func (s *OpsService) GetUserConcurrencyStats(ctx context.Context) (map[int64]*Us
 		return nil, nil, err
 	}
 
+	// Build a set of allowed group IDs when filtering is requested.
+	var allowedGroupIDs map[int64]struct{}
+	if platformFilter != "" || (groupIDFilter != nil && *groupIDFilter > 0) {
+		allowedGroupIDs, err = s.buildAllowedGroupIDsForFilter(ctx, platformFilter, groupIDFilter)
+		if err != nil {
+			return nil, nil, err
+		}
+	}
+
 	collectedAt := time.Now()
 	loadMap := s.getUsersLoadMapBestEffort(ctx, users)
 
@@ -365,6 +382,12 @@ func (s *OpsService) GetUserConcurrencyStats(ctx context.Context) (map[int64]*Us
 			continue
 		}
 
+		// Apply group/platform filter: skip users whose AllowedGroups
+		// have no intersection with the matching group IDs.
+		if allowedGroupIDs != nil && !userMatchesGroupFilter(u.AllowedGroups, allowedGroupIDs) {
+			continue
+		}
+
 		load := loadMap[u.ID]
 		currentInUse := int64(0)
 		waiting := int64(0)
@@ -394,3 +417,46 @@ func (s *OpsService) GetUserConcurrencyStats(ctx context.Context) (map[int64]*Us
 
 	return result, &collectedAt, nil
 }
+
+// buildAllowedGroupIDsForFilter returns the set of group IDs that match the given
+// platform and/or group ID filter. It reuses listAllAccountsForOps (which already
+// supports platform filtering at the DB level) to collect group IDs from accounts.
+func (s *OpsService) buildAllowedGroupIDsForFilter(ctx context.Context, platformFilter string, groupIDFilter *int64) (map[int64]struct{}, error) {
+	// Fast path: only group ID filter, no platform filter needed.
+	if platformFilter == "" && groupIDFilter != nil && *groupIDFilter > 0 {
+		return map[int64]struct{}{*groupIDFilter: {}}, nil
+	}
+
+	// Use the same account-based approach as GetConcurrencyStats to collect group IDs.
+	accounts, err := s.listAllAccountsForOps(ctx, platformFilter)
+	if err != nil {
+		return nil, err
+	}
+
+	groupIDs := make(map[int64]struct{})
+	for _, acc := range accounts {
+		for _, grp := range acc.Groups {
+			if grp == nil || grp.ID <= 0 {
+				continue
+			}
+			// If groupIDFilter is set, only include that specific group.
+			if groupIDFilter != nil && *groupIDFilter > 0 && grp.ID != *groupIDFilter {
+				continue
+			}
+			groupIDs[grp.ID] = struct{}{}
+		}
+	}
+
+	return groupIDs, nil
+}
+
+// userMatchesGroupFilter returns true if the user's AllowedGroups contains
+// at least one group ID in the allowed set.
+func userMatchesGroupFilter(userGroups []int64, allowedGroupIDs map[int64]struct{}) bool {
+	for _, gid := range userGroups {
+		if _, ok := allowedGroupIDs[gid]; ok {
+			return true
+		}
+	}
+	return false
+}
diff --git a/deploy/docker-compose.yml b/deploy/docker-compose.yml
index 033731ac..f1d19f84 100644
--- a/deploy/docker-compose.yml
+++ b/deploy/docker-compose.yml
@@ -47,13 +47,15 @@ services:
 
       # =======================================================================
       # Database Configuration (PostgreSQL)
+      # Default: host "postgres" (this file does not define a postgres service; provide one separately)
+      # External DB: set DATABASE_HOST and DATABASE_SSLMODE in .env
       # =======================================================================
-      - DATABASE_HOST=postgres
-      - DATABASE_PORT=5432
+      - DATABASE_HOST=${DATABASE_HOST:-postgres}
+      - DATABASE_PORT=${DATABASE_PORT:-5432}
       - DATABASE_USER=${POSTGRES_USER:-sub2api}
       - DATABASE_PASSWORD=${POSTGRES_PASSWORD:?POSTGRES_PASSWORD is required}
       - DATABASE_DBNAME=${POSTGRES_DB:-sub2api}
-      - DATABASE_SSLMODE=disable
+      - DATABASE_SSLMODE=${DATABASE_SSLMODE:-disable}
 
       # =======================================================================
       # Redis Configuration
@@ -128,8 +130,6 @@ services:
       # Examples: http://host:port, socks5://host:port
       - UPDATE_PROXY_URL=${UPDATE_PROXY_URL:-}
     depends_on:
-      postgres:
-        condition: service_healthy
       redis:
         condition: service_healthy
     networks:
@@ -141,35 +141,6 @@ services:
       retries: 3
       start_period: 30s
 
-  # ===========================================================================
-  # PostgreSQL Database
-  # ===========================================================================
-  postgres:
-    image: postgres:18-alpine
-    container_name: sub2api-postgres
-    restart: unless-stopped
-    ulimits:
-      nofile:
-        soft: 100000
-        hard: 100000
-    volumes:
-      - postgres_data:/var/lib/postgresql/data
-    environment:
-      - POSTGRES_USER=${POSTGRES_USER:-sub2api}
-      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD:?POSTGRES_PASSWORD is required}
-      - POSTGRES_DB=${POSTGRES_DB:-sub2api}
-      - TZ=${TZ:-Asia/Shanghai}
-    networks:
-      - sub2api-network
-    healthcheck:
-      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-sub2api} -d ${POSTGRES_DB:-sub2api}"]
-      interval: 10s
-      timeout: 5s
-      retries: 5
-      start_period: 10s
-    # 注意：不暴露端口到宿主机，应用通过内部网络连接
-    # 如需调试，可临时添加：ports: ["127.0.0.1:5433:5432"]
-
   # ===========================================================================
   # Redis Cache
   # ===========================================================================
@@ -209,8 +180,6 @@ services:
 volumes:
   sub2api_data:
     driver: local
-  postgres_data:
-    driver: local
   redis_data:
     driver: local
 
diff --git a/frontend/public/wechat-qr.jpg b/frontend/public/wechat-qr.jpg
new file mode 100644
index 00000000..659068d8
Binary files /dev/null and b/frontend/public/wechat-qr.jpg differ
diff --git a/frontend/src/api/admin/ops.ts b/frontend/src/api/admin/ops.ts
index 9f980a12..523fbd00 100644
--- a/frontend/src/api/admin/ops.ts
+++ b/frontend/src/api/admin/ops.ts
@@ -366,8 +366,16 @@ export async function getConcurrencyStats(platform?: string, groupId?: number |
   return data
 }
 
-export async function getUserConcurrencyStats(): Promise<OpsUserConcurrencyStatsResponse> {
-  const { data } = await apiClient.get<OpsUserConcurrencyStatsResponse>('/admin/ops/user-concurrency')
+export async function getUserConcurrencyStats(platform?: string, groupId?: number | null): Promise<OpsUserConcurrencyStatsResponse> {
+  const params: Record<string, string | number> = {}
+  if (platform) {
+    params.platform = platform
+  }
+  if (typeof groupId === 'number' && groupId > 0) {
+    params.group_id = groupId
+  }
+
+  const { data } = await apiClient.get<OpsUserConcurrencyStatsResponse>('/admin/ops/user-concurrency', { params })
   return data
 }
 
diff --git a/frontend/src/components/common/WechatServiceButton.vue b/frontend/src/components/common/WechatServiceButton.vue
new file mode 100644
index 00000000..9ee8d3d5
--- /dev/null
+++ b/frontend/src/components/common/WechatServiceButton.vue
@@ -0,0 +1,104 @@
+<template>
+  <!-- Floating button, uses the theme color -->
+  <button
+    @click="showModal = true"
+    class="fixed bottom-6 right-6 z-50 flex items-center gap-2 rounded-full bg-gradient-to-r from-primary-500 to-primary-600 px-4 py-3 text-white shadow-lg shadow-primary-500/25 transition-all hover:from-primary-600 hover:to-primary-700 hover:shadow-xl hover:shadow-primary-500/30"
+  >
+    <svg class="h-5 w-5" viewBox="0 0 24 24" fill="currentColor">
+      <path d="M8.691 2.188C3.891 2.188 0 5.476 0 9.53c0 2.212 1.17 4.203 3.002 5.55a.59.59 0 01.213.665l-.39 1.48c-.019.07-.048.141-.048.213 0 .163.13.295.29.295a.328.328 0 00.186-.059l2.114-1.225a.87.87 0 01.415-.106.807.807 0 01.213.026 10.07 10.07 0 002.696.37c.262 0 .52-.011.776-.028a5.91 5.91 0 01-.193-1.479c0-3.644 3.374-6.6 7.536-6.6.262 0 .52.011.776.028-.628-3.513-4.27-6.472-8.885-6.472zM5.785 5.97a1.1 1.1 0 110 2.2 1.1 1.1 0 010-2.2zm5.813 0a1.1 1.1 0 110 2.2 1.1 1.1 0 010-2.2zm5.192 2.642c-3.703 0-6.71 2.567-6.71 5.73 0 3.163 3.007 5.73 6.71 5.73a7.9 7.9 0 002.126-.288.644.644 0 01.17-.022.69.69 0 01.329.085l1.672.97a.262.262 0 00.147.046c.128 0 .23-.104.23-.233a.403.403 0 00-.038-.168l-.309-1.17a.468.468 0 01.168-.527c1.449-1.065 2.374-2.643 2.374-4.423 0-3.163-3.007-5.73-6.71-5.73h-.159zm-2.434 3.34a.88.88 0 110 1.76.88.88 0 010-1.76zm4.868 0a.88.88 0 110 1.76.88.88 0 010-1.76z"/>
+    </svg>
+    <span class="text-sm font-medium">客服</span>
+  </button>
+
+  <!-- Modal -->
+  <Teleport to="body">
+    <Transition name="fade">
+      <div
+        v-if="showModal"
+        class="fixed inset-0 z-[100] flex items-center justify-center bg-black/50 p-4 backdrop-blur-sm"
+        @click.self="showModal = false"
+      >
+        <Transition name="scale">
+          <div
+            v-if="showModal"
+            class="relative w-full max-w-sm rounded-2xl bg-white p-6 shadow-2xl dark:bg-dark-700"
+          >
+            <!-- Close button -->
+            <button
+              @click="showModal = false"
+              class="absolute right-4 top-4 text-gray-400 transition-colors hover:text-gray-600 dark:text-dark-400 dark:hover:text-dark-200"
+            >
+              <svg class="h-5 w-5" fill="none" stroke="currentColor" viewBox="0 0 24 24">
+                <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M6 18L18 6M6 6l12 12" />
+              </svg>
+            </button>
+
+            <!-- Title -->
+            <div class="mb-4 flex items-center gap-3">
+              <div class="flex h-10 w-10 items-center justify-center rounded-full bg-gradient-to-br from-primary-500 to-primary-600">
+                <svg class="h-6 w-6 text-white" viewBox="0 0 24 24" fill="currentColor">
+                  <path d="M8.691 2.188C3.891 2.188 0 5.476 0 9.53c0 2.212 1.17 4.203 3.002 5.55a.59.59 0 01.213.665l-.39 1.48c-.019.07-.048.141-.048.213 0 .163.13.295.29.295a.328.328 0 00.186-.059l2.114-1.225a.87.87 0 01.415-.106.807.807 0 01.213.026 10.07 10.07 0 002.696.37c.262 0 .52-.011.776-.028a5.91 5.91 0 01-.193-1.479c0-3.644 3.374-6.6 7.536-6.6.262 0 .52.011.776.028-.628-3.513-4.27-6.472-8.885-6.472zM5.785 5.97a1.1 1.1 0 110 2.2 1.1 1.1 0 010-2.2zm5.813 0a1.1 1.1 0 110 2.2 1.1 1.1 0 010-2.2zm5.192 2.642c-3.703 0-6.71 2.567-6.71 5.73 0 3.163 3.007 5.73 6.71 5.73a7.9 7.9 0 002.126-.288.644.644 0 01.17-.022.69.69 0 01.329.085l1.672.97a.262.262 0 00.147.046c.128 0 .23-.104.23-.233a.403.403 0 00-.038-.168l-.309-1.17a.468.468 0 01.168-.527c1.449-1.065 2.374-2.643 2.374-4.423 0-3.163-3.007-5.73-6.71-5.73h-.159zm-2.434 3.34a.88.88 0 110 1.76.88.88 0 010-1.76zm4.868 0a.88.88 0 110 1.76.88.88 0 010-1.76z"/>
+                </svg>
+              </div>
+              <div>
+                <h3 class="text-lg font-semibold text-gray-900 dark:text-white">联系客服</h3>
+                <p class="text-sm text-gray-500 dark:text-dark-400">扫码添加好友</p>
+              </div>
+            </div>
+
+            <!-- QR code card -->
+            <div class="mb-4 overflow-hidden rounded-xl border border-primary-100 bg-gradient-to-br from-primary-50 to-white p-3 dark:border-primary-800/30 dark:from-primary-900/10 dark:to-dark-800">
+              <img
+                src="/wechat-qr.jpg"
+                alt="微信二维码"
+                class="w-full rounded-lg"
+              />
+            </div>
+
+            <!-- Hint text -->
+            <div class="text-center">
+              <p class="mb-2 text-sm font-medium text-primary-600 dark:text-primary-400">
+                微信扫码添加客服
+              </p>
+              <p class="flex items-center justify-center gap-1 text-xs text-gray-500 dark:text-dark-400">
+                <svg class="h-4 w-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
+                  <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M12 8v4l3 3m6-3a9 9 0 11-18 0 9 9 0 0118 0z" />
+                </svg>
+                工作时间：周一至周五 9:00-18:00
+              </p>
+            </div>
+          </div>
+        </Transition>
+      </div>
+    </Transition>
+  </Teleport>
+</template>
+
+<script setup lang="ts">
+import { ref } from 'vue'
+
+const showModal = ref(false)
+</script>
+
+<style scoped>
+.fade-enter-active,
+.fade-leave-active {
+  transition: opacity 0.2s ease;
+}
+
+.fade-enter-from,
+.fade-leave-to {
+  opacity: 0;
+}
+
+.scale-enter-active,
+.scale-leave-active {
+  transition: all 0.2s ease;
+}
+
+.scale-enter-from,
+.scale-leave-to {
+  opacity: 0;
+  transform: scale(0.95);
+}
+</style>
diff --git a/frontend/src/components/layout/AppHeader.vue b/frontend/src/components/layout/AppHeader.vue
index a6b4030f..53a0c01e 100644
--- a/frontend/src/components/layout/AppHeader.vue
+++ b/frontend/src/components/layout/AppHeader.vue
@@ -121,23 +121,6 @@
                   <Icon name="key" size="sm" />
                   {{ t('nav.apiKeys') }}
                 </router-link>
-
-                <a
-                  href="https://github.com/Wei-Shaw/sub2api"
-                  target="_blank"
-                  rel="noopener noreferrer"
-                  @click="closeDropdown"
-                  class="dropdown-item"
-                >
-                  <svg class="h-4 w-4" fill="currentColor" viewBox="0 0 24 24">
-                    <path
-                      fill-rule="evenodd"
-                      clip-rule="evenodd"
-                      d="M12 2C6.477 2 2 6.477 2 12c0 4.42 2.865 8.17 6.839 9.49.5.092.682-.217.682-.482 0-.237-.008-.866-.013-1.7-2.782.604-3.369-1.34-3.369-1.34-.454-1.156-1.11-1.464-1.11-1.464-.908-.62.069-.608.069-.608 1.003.07 1.531 1.03 1.531 1.03.892 1.529 2.341 1.087 2.91.831.092-.646.35-1.086.636-1.336-2.22-.253-4.555-1.11-4.555-4.943 0-1.091.39-1.984 1.029-2.683-.103-.253-.446-1.27.098-2.647 0 0 .84-.269 2.75 1.025A9.578 9.578 0 0112 6.836c.85.004 1.705.114 2.504.336 1.909-1.294 2.747-1.025 2.747-1.025.546 1.377.203 2.394.1 2.647.64.699 1.028 1.592 1.028 2.683 0 3.842-2.339 4.687-4.566 4.935.359.309.678.919.678 1.852 0 1.336-.012 2.415-.012 2.743 0 .267.18.578.688.48C19.138 20.167 22 16.418 22 12c0-5.523-4.477-10-10-10z"
-                    />
-                  </svg>
-                  {{ t('nav.github') }}
-                </a>
               </div>
 
               <!-- Contact Support (only show if configured) -->
diff --git a/frontend/src/views/HomeView.vue b/frontend/src/views/HomeView.vue
index 6a3753f1..babcf046 100644
--- a/frontend/src/views/HomeView.vue
+++ b/frontend/src/views/HomeView.vue
@@ -122,8 +122,11 @@
             >
               {{ siteName }}
             </h1>
-            <p class="mb-8 text-lg text-gray-600 dark:text-dark-300 md:text-xl">
-              {{ siteSubtitle }}
+            <p class="mb-3 text-xl font-semibold text-primary-600 dark:text-primary-400 md:text-2xl">
+              {{ t('home.heroSubtitle') }}
+            </p>
+            <p class="mb-8 text-base text-gray-600 dark:text-dark-300 md:text-lg">
+              {{ t('home.heroDescription') }}
             </p>
 
             <!-- CTA Button -->
@@ -177,7 +180,7 @@
         </div>
 
         <!-- Feature Tags - Centered -->
-        <div class="mb-12 flex flex-wrap items-center justify-center gap-4 md:gap-6">
+        <div class="mb-16 flex flex-wrap items-center justify-center gap-4 md:gap-6">
           <div
             class="inline-flex items-center gap-2.5 rounded-full border border-gray-200/50 bg-white/80 px-5 py-2.5 shadow-sm backdrop-blur-sm dark:border-dark-700/50 dark:bg-dark-800/80"
           >
@@ -204,6 +207,63 @@
           </div>
         </div>
 
+        <!-- Pain Points Section -->
+        <div class="mb-16">
+          <h2 class="mb-8 text-center text-2xl font-bold text-gray-900 dark:text-white md:text-3xl">
+            {{ t('home.painPoints.title') }}
+          </h2>
+          <div class="grid gap-4 sm:grid-cols-2 lg:grid-cols-4">
+            <!-- Pain Point 1: Expensive -->
+            <div class="rounded-xl border border-red-200/50 bg-red-50/50 p-5 dark:border-red-900/30 dark:bg-red-950/20">
+              <div class="mb-3 flex h-10 w-10 items-center justify-center rounded-lg bg-red-100 dark:bg-red-900/30">
+                <svg class="h-5 w-5 text-red-500" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
+                  <path stroke-linecap="round" stroke-linejoin="round" d="M12 8c-1.657 0-3 .895-3 2s1.343 2 3 2 3 .895 3 2-1.343 2-3 2m0-8c1.11 0 2.08.402 2.599 1M12 8V7m0 1v8m0 0v1m0-1c-1.11 0-2.08-.402-2.599-1M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
+                </svg>
+              </div>
+              <h3 class="mb-1.5 font-semibold text-gray-900 dark:text-white">{{ t('home.painPoints.items.expensive.title') }}</h3>
+              <p class="text-sm text-gray-600 dark:text-dark-400">{{ t('home.painPoints.items.expensive.desc') }}</p>
+            </div>
+            <!-- Pain Point 2: Complex -->
+            <div class="rounded-xl border border-orange-200/50 bg-orange-50/50 p-5 dark:border-orange-900/30 dark:bg-orange-950/20">
+              <div class="mb-3 flex h-10 w-10 items-center justify-center rounded-lg bg-orange-100 dark:bg-orange-900/30">
+                <svg class="h-5 w-5 text-orange-500" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
+                  <path stroke-linecap="round" stroke-linejoin="round" d="M19 11H5m14 0a2 2 0 012 2v6a2 2 0 01-2 2H5a2 2 0 01-2-2v-6a2 2 0 012-2m14 0V9a2 2 0 00-2-2M5 11V9a2 2 0 012-2m0 0V5a2 2 0 012-2h6a2 2 0 012 2v2M7 7h10" />
+                </svg>
+              </div>
+              <h3 class="mb-1.5 font-semibold text-gray-900 dark:text-white">{{ t('home.painPoints.items.complex.title') }}</h3>
+              <p class="text-sm text-gray-600 dark:text-dark-400">{{ t('home.painPoints.items.complex.desc') }}</p>
+            </div>
+            <!-- Pain Point 3: Unstable -->
+            <div class="rounded-xl border border-yellow-200/50 bg-yellow-50/50 p-5 dark:border-yellow-900/30 dark:bg-yellow-950/20">
+              <div class="mb-3 flex h-10 w-10 items-center justify-center rounded-lg bg-yellow-100 dark:bg-yellow-900/30">
+                <svg class="h-5 w-5 text-yellow-600" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
+                  <path stroke-linecap="round" stroke-linejoin="round" d="M12 9v2m0 4h.01m-6.938 4h13.856c1.54 0 2.502-1.667 1.732-3L13.732 4c-.77-1.333-2.694-1.333-3.464 0L3.34 16c-.77 1.333.192 3 1.732 3z" />
+                </svg>
+              </div>
+              <h3 class="mb-1.5 font-semibold text-gray-900 dark:text-white">{{ t('home.painPoints.items.unstable.title') }}</h3>
+              <p class="text-sm text-gray-600 dark:text-dark-400">{{ t('home.painPoints.items.unstable.desc') }}</p>
+            </div>
+            <!-- Pain Point 4: No Control -->
+            <div class="rounded-xl border border-gray-200/50 bg-gray-50/50 p-5 dark:border-dark-700/50 dark:bg-dark-800/50">
+              <div class="mb-3 flex h-10 w-10 items-center justify-center rounded-lg bg-gray-100 dark:bg-dark-700">
+                <svg class="h-5 w-5 text-gray-500" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
+                  <path stroke-linecap="round" stroke-linejoin="round" d="M18.364 18.364A9 9 0 005.636 5.636m12.728 12.728A9 9 0 015.636 5.636m12.728 12.728L5.636 5.636" />
+                </svg>
+              </div>
+              <h3 class="mb-1.5 font-semibold text-gray-900 dark:text-white">{{ t('home.painPoints.items.noControl.title') }}</h3>
+              <p class="text-sm text-gray-600 dark:text-dark-400">{{ t('home.painPoints.items.noControl.desc') }}</p>
+            </div>
+          </div>
+        </div>
+
+        <!-- Solutions Section Title -->
+        <div class="mb-8 text-center">
+          <h2 class="mb-2 text-2xl font-bold text-gray-900 dark:text-white md:text-3xl">
+            {{ t('home.solutions.title') }}
+          </h2>
+          <p class="text-gray-600 dark:text-dark-400">{{ t('home.solutions.subtitle') }}</p>
+        </div>
+
         <!-- Features Grid -->
         <div class="mb-12 grid gap-6 md:grid-cols-3">
           <!-- Feature 1: Unified Gateway -->
@@ -369,6 +429,77 @@
             >
           </div>
         </div>
+
+        <!-- Comparison Table -->
+        <div class="mb-16">
+          <h2 class="mb-8 text-center text-2xl font-bold text-gray-900 dark:text-white md:text-3xl">
+            {{ t('home.comparison.title') }}
+          </h2>
+          <div class="overflow-x-auto">
+            <table class="w-full rounded-xl border border-gray-200/50 bg-white/60 backdrop-blur-sm dark:border-dark-700/50 dark:bg-dark-800/60">
+              <thead>
+                <tr class="border-b border-gray-200/50 dark:border-dark-700/50">
+                  <th class="px-6 py-4 text-left text-sm font-semibold text-gray-900 dark:text-white">{{ t('home.comparison.headers.feature') }}</th>
+                  <th class="px-6 py-4 text-center text-sm font-semibold text-gray-500 dark:text-dark-400">{{ t('home.comparison.headers.official') }}</th>
+                  <th class="px-6 py-4 text-center text-sm font-semibold text-primary-600 dark:text-primary-400">{{ t('home.comparison.headers.us') }}</th>
+                </tr>
+              </thead>
+              <tbody class="divide-y divide-gray-200/50 dark:divide-dark-700/50">
+                <tr>
+                  <td class="px-6 py-4 text-sm font-medium text-gray-900 dark:text-white">{{ t('home.comparison.items.pricing.feature') }}</td>
+                  <td class="px-6 py-4 text-center text-sm text-gray-500 dark:text-dark-400">{{ t('home.comparison.items.pricing.official') }}</td>
+                  <td class="px-6 py-4 text-center text-sm font-medium text-primary-600 dark:text-primary-400">{{ t('home.comparison.items.pricing.us') }}</td>
+                </tr>
+                <tr>
+                  <td class="px-6 py-4 text-sm font-medium text-gray-900 dark:text-white">{{ t('home.comparison.items.models.feature') }}</td>
+                  <td class="px-6 py-4 text-center text-sm text-gray-500 dark:text-dark-400">{{ t('home.comparison.items.models.official') }}</td>
+                  <td class="px-6 py-4 text-center text-sm font-medium text-primary-600 dark:text-primary-400">{{ t('home.comparison.items.models.us') }}</td>
+                </tr>
+                <tr>
+                  <td class="px-6 py-4 text-sm font-medium text-gray-900 dark:text-white">{{ t('home.comparison.items.management.feature') }}</td>
+                  <td class="px-6 py-4 text-center text-sm text-gray-500 dark:text-dark-400">{{ t('home.comparison.items.management.official') }}</td>
+                  <td class="px-6 py-4 text-center text-sm font-medium text-primary-600 dark:text-primary-400">{{ t('home.comparison.items.management.us') }}</td>
+                </tr>
+                <tr>
+                  <td class="px-6 py-4 text-sm font-medium text-gray-900 dark:text-white">{{ t('home.comparison.items.stability.feature') }}</td>
+                  <td class="px-6 py-4 text-center text-sm text-gray-500 dark:text-dark-400">{{ t('home.comparison.items.stability.official') }}</td>
+                  <td class="px-6 py-4 text-center text-sm font-medium text-primary-600 dark:text-primary-400">{{ t('home.comparison.items.stability.us') }}</td>
+                </tr>
+                <tr>
+                  <td class="px-6 py-4 text-sm font-medium text-gray-900 dark:text-white">{{ t('home.comparison.items.control.feature') }}</td>
+                  <td class="px-6 py-4 text-center text-sm text-gray-500 dark:text-dark-400">{{ t('home.comparison.items.control.official') }}</td>
+                  <td class="px-6 py-4 text-center text-sm font-medium text-primary-600 dark:text-primary-400">{{ t('home.comparison.items.control.us') }}</td>
+                </tr>
+              </tbody>
+            </table>
+          </div>
+        </div>
+
+        <!-- CTA Section -->
+        <div class="mb-8 rounded-2xl bg-gradient-to-r from-primary-500 to-primary-600 p-8 text-center shadow-xl shadow-primary-500/20 md:p-12">
+          <h2 class="mb-3 text-2xl font-bold text-white md:text-3xl">
+            {{ t('home.cta.title') }}
+          </h2>
+          <p class="mb-6 text-primary-100">
+            {{ t('home.cta.description') }}
+          </p>
+          <router-link
+            v-if="!isAuthenticated"
+            to="/register"
+            class="inline-flex items-center gap-2 rounded-full bg-white px-8 py-3 font-semibold text-primary-600 shadow-lg transition-all hover:bg-gray-50 hover:shadow-xl"
+          >
+            {{ t('home.cta.button') }}
+            <Icon name="arrowRight" size="md" :stroke-width="2" />
+          </router-link>
+          <router-link
+            v-else
+            :to="dashboardPath"
+            class="inline-flex items-center gap-2 rounded-full bg-white px-8 py-3 font-semibold text-primary-600 shadow-lg transition-all hover:bg-gray-50 hover:shadow-xl"
+          >
+            {{ t('home.goToDashboard') }}
+            <Icon name="arrowRight" size="md" :stroke-width="2" />
+          </router-link>
+        </div>
       </div>
     </main>
 
@@ -380,27 +511,20 @@
         <p class="text-sm text-gray-500 dark:text-dark-400">
           &copy; {{ currentYear }} {{ siteName }}. {{ t('home.footer.allRightsReserved') }}
         </p>
-        <div class="flex items-center gap-4">
-          <a
-            v-if="docUrl"
-            :href="docUrl"
-            target="_blank"
-            rel="noopener noreferrer"
-            class="text-sm text-gray-500 transition-colors hover:text-gray-700 dark:text-dark-400 dark:hover:text-white"
-          >
-            {{ t('home.docs') }}
-          </a>
-          <a
-            :href="githubUrl"
-            target="_blank"
-            rel="noopener noreferrer"
-            class="text-sm text-gray-500 transition-colors hover:text-gray-700 dark:text-dark-400 dark:hover:text-white"
-          >
-            GitHub
-          </a>
-        </div>
+        <a
+          v-if="docUrl"
+          :href="docUrl"
+          target="_blank"
+          rel="noopener noreferrer"
+          class="text-sm text-gray-500 transition-colors hover:text-gray-700 dark:text-dark-400 dark:hover:text-white"
+        >
+          {{ t('home.docs') }}
+        </a>
       </div>
     </footer>
+
+    <!-- WeChat customer service floating button -->
+    <WechatServiceButton />
   </div>
 </template>
 
@@ -410,6 +534,7 @@ import { useI18n } from 'vue-i18n'
 import { useAuthStore, useAppStore } from '@/stores'
 import LocaleSwitcher from '@/components/common/LocaleSwitcher.vue'
 import Icon from '@/components/icons/Icon.vue'
+import WechatServiceButton from '@/components/common/WechatServiceButton.vue'
 
 const { t } = useI18n()
 
@@ -419,7 +544,6 @@ const appStore = useAppStore()
 // Site settings - directly from appStore (already initialized from injected config)
 const siteName = computed(() => appStore.cachedPublicSettings?.site_name || appStore.siteName || 'Sub2API')
 const siteLogo = computed(() => appStore.cachedPublicSettings?.site_logo || appStore.siteLogo || '')
-const siteSubtitle = computed(() => appStore.cachedPublicSettings?.site_subtitle || 'AI API Gateway Platform')
 const docUrl = computed(() => appStore.cachedPublicSettings?.doc_url || appStore.docUrl || '')
 const homeContent = computed(() => appStore.cachedPublicSettings?.home_content || '')
 
@@ -432,9 +556,6 @@ const isHomeContentUrl = computed(() => {
 // Theme
 const isDark = ref(document.documentElement.classList.contains('dark'))
 
-// GitHub URL
-const githubUrl = 'https://github.com/Wei-Shaw/sub2api'
-
 // Auth state
 const isAuthenticated = computed(() => authStore.isAuthenticated)
 const isAdmin = computed(() => authStore.isAdmin)
diff --git a/frontend/src/views/admin/ops/components/OpsConcurrencyCard.vue b/frontend/src/views/admin/ops/components/OpsConcurrencyCard.vue
index c7370ab5..0956caa5 100644
--- a/frontend/src/views/admin/ops/components/OpsConcurrencyCard.vue
+++ b/frontend/src/views/admin/ops/components/OpsConcurrencyCard.vue
@@ -122,6 +122,7 @@ const platformRows = computed((): SummaryRow[] => {
       available_accounts: availableAccounts,
       rate_limited_accounts: safeNumber(avail.rate_limit_count),
 
+
       error_accounts: safeNumber(avail.error_count),
       total_concurrency: totalConcurrency,
       used_concurrency: usedConcurrency,
@@ -161,7 +162,6 @@ const groupRows = computed((): SummaryRow[] => {
         total_accounts: totalAccounts,
         available_accounts: availableAccounts,
         rate_limited_accounts: safeNumber(avail.rate_limit_count),
-  
         error_accounts: safeNumber(avail.error_count),
         total_concurrency: totalConcurrency,
         used_concurrency: usedConcurrency,
@@ -265,7 +265,7 @@ async function loadData() {
   try {
     if (showByUser.value) {
       // 用户视图模式只加载用户并发数据
-      const userData = await opsAPI.getUserConcurrencyStats()
+      const userData = await opsAPI.getUserConcurrencyStats(props.platformFilter, props.groupIdFilter)
       userConcurrency.value = userData
     } else {
       // 常规模式加载账号/平台/分组数据
@@ -301,6 +301,14 @@ watch(
   }
 )
 
+// Reload data when the filter conditions change
+watch(
+  [() => props.platformFilter, () => props.groupIdFilter],
+  () => {
+    loadData()
+  }
+)
+
 function getLoadBarClass(loadPct: number): string {
   if (loadPct >= 90) return 'bg-red-500 dark:bg-red-600'
   if (loadPct >= 70) return 'bg-orange-500 dark:bg-orange-600'
@@ -329,6 +337,7 @@ function formatDuration(seconds: number): string {
 }
 
 
+
 watch(
   () => realtimeEnabled.value,
   async (enabled) => {
diff --git a/stress_test_gemini_session.sh b/stress_test_gemini_session.sh
new file mode 100644
index 00000000..1f2aca57
--- /dev/null
+++ b/stress_test_gemini_session.sh
@@ -0,0 +1,127 @@
+#!/bin/bash
+
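+# Gemini sticky-session stress test script
+# Goal: verify that different sessions are assigned different accounts and that a single session stays on the same account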
+
+BASE_URL="${BASE_URL:-http://host.clicodeplus.com:8080}"
+API_KEY="${API_KEY:-sk-32ad0a3197e528c840ea84f0dc6b2056dd3fead03526b5c605a60709bd408f7e}"
+MODEL="${MODEL:-gemini-2.5-flash}"
+
+# Create a temporary directory for the results
+RESULT_DIR="/tmp/gemini_stress_test_$(date +%s)"
+mkdir -p "$RESULT_DIR"
+
+echo "=========================================="
+echo "Gemini sticky-session stress test"
+echo "Results directory: $RESULT_DIR"
+echo "=========================================="
+
+# Function: send a request and record the response
+send_request() {
+    local session_id=$1
+    local round=$2
+    local system_prompt=$3
+    local contents=$4
+    local output_file="$RESULT_DIR/session_${session_id}_round_${round}.json"
+
+    local request_body=$(cat <<EOF
+{
+    "systemInstruction": {
+        "parts": [{"text": "$system_prompt"}]
+    },
+    "contents": $contents
+}
+EOF
+)
+
+    curl -s -X POST "${BASE_URL}/v1beta/models/${MODEL}:generateContent" \
+        -H "Content-Type: application/json" \
+        -H "x-goog-api-key: ${API_KEY}" \
+        -d "$request_body" > "$output_file" 2>&1
+
+    echo "[Session $session_id Round $round] done"
+}
+
+# Session 1: math calculator (conversation history accumulates each round)
+run_session_1() {
+    local sys_prompt="你是一个数学计算器，只返回计算结果数字，不要任何解释"
+
+    # Round 1: 1+1=?
+    send_request 1 1 "$sys_prompt" '[{"role":"user","parts":[{"text":"1+1=?"}]}]'
+
+    # Round 2: continue with 2+2=? (history accumulated)
+    send_request 1 2 "$sys_prompt" '[{"role":"user","parts":[{"text":"1+1=?"}]},{"role":"model","parts":[{"text":"2"}]},{"role":"user","parts":[{"text":"2+2=?"}]}]'
+
+    # Round 3: continue with 3+3=?
+    send_request 1 3 "$sys_prompt" '[{"role":"user","parts":[{"text":"1+1=?"}]},{"role":"model","parts":[{"text":"2"}]},{"role":"user","parts":[{"text":"2+2=?"}]},{"role":"model","parts":[{"text":"4"}]},{"role":"user","parts":[{"text":"3+3=?"}]}]'
+
+    # Round 4: batch calculation of 10+10, 20+20, 30+30
+    send_request 1 4 "$sys_prompt" '[{"role":"user","parts":[{"text":"1+1=?"}]},{"role":"model","parts":[{"text":"2"}]},{"role":"user","parts":[{"text":"2+2=?"}]},{"role":"model","parts":[{"text":"4"}]},{"role":"user","parts":[{"text":"3+3=?"}]},{"role":"model","parts":[{"text":"6"}]},{"role":"user","parts":[{"text":"计算: 10+10=? 20+20=? 30+30=?"}]}]'
+}
+
+# Session 2: English translator (different system prompt = different session)
+run_session_2() {
+    local sys_prompt="你是一个英文翻译器，将中文翻译成英文，只返回翻译结果"
+
+    send_request 2 1 "$sys_prompt" '[{"role":"user","parts":[{"text":"你好"}]}]'
+    send_request 2 2 "$sys_prompt" '[{"role":"user","parts":[{"text":"你好"}]},{"role":"model","parts":[{"text":"Hello"}]},{"role":"user","parts":[{"text":"世界"}]}]'
+    send_request 2 3 "$sys_prompt" '[{"role":"user","parts":[{"text":"你好"}]},{"role":"model","parts":[{"text":"Hello"}]},{"role":"user","parts":[{"text":"世界"}]},{"role":"model","parts":[{"text":"World"}]},{"role":"user","parts":[{"text":"早上好"}]}]'
+}
+
+# Session 3: Japanese translator
+run_session_3() {
+    local sys_prompt="你是一个日文翻译器，将中文翻译成日文，只返回翻译结果"
+
+    send_request 3 1 "$sys_prompt" '[{"role":"user","parts":[{"text":"你好"}]}]'
+    send_request 3 2 "$sys_prompt" '[{"role":"user","parts":[{"text":"你好"}]},{"role":"model","parts":[{"text":"こんにちは"}]},{"role":"user","parts":[{"text":"谢谢"}]}]'
+    send_request 3 3 "$sys_prompt" '[{"role":"user","parts":[{"text":"你好"}]},{"role":"model","parts":[{"text":"こんにちは"}]},{"role":"user","parts":[{"text":"谢谢"}]},{"role":"model","parts":[{"text":"ありがとう"}]},{"role":"user","parts":[{"text":"再见"}]}]'
+}
+
+# Session 4: multiplication calculator (another math session, but with a different system prompt)
+run_session_4() {
+    local sys_prompt="你是一个乘法专用计算器，只计算乘法，返回数字结果"
+
+    send_request 4 1 "$sys_prompt" '[{"role":"user","parts":[{"text":"2*3=?"}]}]'
+    send_request 4 2 "$sys_prompt" '[{"role":"user","parts":[{"text":"2*3=?"}]},{"role":"model","parts":[{"text":"6"}]},{"role":"user","parts":[{"text":"4*5=?"}]}]'
+    send_request 4 3 "$sys_prompt" '[{"role":"user","parts":[{"text":"2*3=?"}]},{"role":"model","parts":[{"text":"6"}]},{"role":"user","parts":[{"text":"4*5=?"}]},{"role":"model","parts":[{"text":"20"}]},{"role":"user","parts":[{"text":"计算: 10*10=? 20*20=?"}]}]'
+}
+
+# Session 5: poet (a completely different role)
+run_session_5() {
+    local sys_prompt="你是一位诗人，用简短的诗句回应每个话题，每次只写一句诗"
+
+    send_request 5 1 "$sys_prompt" '[{"role":"user","parts":[{"text":"春天"}]}]'
+    send_request 5 2 "$sys_prompt" '[{"role":"user","parts":[{"text":"春天"}]},{"role":"model","parts":[{"text":"春风拂面花满枝"}]},{"role":"user","parts":[{"text":"夏天"}]}]'
+    send_request 5 3 "$sys_prompt" '[{"role":"user","parts":[{"text":"春天"}]},{"role":"model","parts":[{"text":"春风拂面花满枝"}]},{"role":"user","parts":[{"text":"夏天"}]},{"role":"model","parts":[{"text":"蝉鸣蛙声伴荷香"}]},{"role":"user","parts":[{"text":"秋天"}]}]'
+}
+
+echo ""
+echo "Starting concurrent test of 5 independent sessions..."
+echo ""
+
+# Run all sessions concurrently
+run_session_1 &
+run_session_2 &
+run_session_3 &
+run_session_4 &
+run_session_5 &
+
+# Wait for all background jobs to finish
+wait
+
+echo ""
+echo "=========================================="
+echo "All requests finished; results saved in: $RESULT_DIR"
+echo "=========================================="
+
+# Show a summary of the results
+echo ""
+echo "Response summary:"
+for f in "$RESULT_DIR"/*.json; do
+    filename=$(basename "$f")
+    response=$(head -c 200 "$f")
+    echo "[$filename]: ${response}..."
+done
+
+echo ""
+echo "Check the server logs to confirm how accounts were assigned"
