Commit Graph

25 Commits

Author SHA1 Message Date
DaydreamCoding
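30f55a1f72 feat(openai): full OpenAI Fast/Flex Policy implementation (HTTP + WebSocket + Admin)
Mirrors the Claude BetaPolicy fast-mode filtering implementation: adds a
three-state pass / filter / block policy for the OpenAI upstream
service_tier field (priority / flex, including the client-side
"fast" → "priority" normalization), covering every OpenAI entry point plus
the admin configuration entry point.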

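Backend core
- Add the SettingKeyOpenAIFastPolicySettings, OpenAIFastPolicyRule, and
  OpenAIFastPolicySettings config models; rules carry the dimensions
  service_tier × action × scope × model whitelist × fallback action.
- SettingService.Get/SetOpenAIFastPolicySettings; when the setting is
  missing, return the built-in default policy (priority is filtered for all
  models, whitelist empty, fallback=pass). Rationale: service_tier=fast is a
  user-level switch orthogonal to the model field, so locking the default to
  specific model slugs would leave a bypass path of "use gpt-4 + fast to
  pass priority through to the upstream". JSON parse failures no longer
  fall back silently; slog.Warn records the dirty data so operators can
  locate it.
- service_tier normalization (trim + ToLower + fast→priority + whitelist of
  priority/flex) and policy evaluation (evaluateOpenAIFastPolicy) are the
  single source of truth, shared by HTTP and WS. The pure function
  evaluateOpenAIFastPolicyWithSettings is extracted and paired with a
  ctx-bound settings snapshot (withOpenAIFastPolicyContext /
  openAIFastPolicySettingsFromContext); a long-lived WS session prefetches
  the settings once at entry and reuses them for every frame, avoiding a
  settingService hit per frame.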
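The normalization and evaluation steps described above can be sketched roughly as follows. This is a minimal illustration, not the commit's code: normalizeServiceTier, policyRule, and evaluatePolicy are hypothetical names, the scope dimension is omitted, and the whitelist/fallback semantics (empty whitelist matches every model, a non-matching model takes the rule's fallback action) are an assumption inferred from the default policy described in the message.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeServiceTier (hypothetical) applies the steps named in the commit
// message: trim, lowercase, map "fast" to "priority", then accept only
// "priority" or "flex".
func normalizeServiceTier(raw string) (string, bool) {
	tier := strings.ToLower(strings.TrimSpace(raw))
	if tier == "fast" {
		tier = "priority"
	}
	switch tier {
	case "priority", "flex":
		return tier, true
	default:
		return "", false
	}
}

// policyRule is a simplified, assumed shape of a rule; the real
// OpenAIFastPolicyRule also carries a scope dimension.
type policyRule struct {
	ServiceTier    string
	Action         string   // "pass", "filter", or "block"
	ModelWhitelist []string // assumption: empty means "all models"
	FallbackAction string   // assumption: used when the model is not whitelisted
}

// evaluatePolicy is a pure function: it depends only on its arguments, so a
// settings snapshot can be fetched once and reused for many frames.
func evaluatePolicy(rules []policyRule, tier, model string) string {
	for _, r := range rules {
		if r.ServiceTier != tier {
			continue
		}
		if len(r.ModelWhitelist) == 0 {
			return r.Action
		}
		for _, m := range r.ModelWhitelist {
			if m == model {
				return r.Action
			}
		}
		return r.FallbackAction
	}
	return "pass" // no rule for this tier
}

func main() {
	tier, ok := normalizeServiceTier("  FAST ")
	fmt.Println(tier, ok)
	rules := []policyRule{{ServiceTier: "priority", Action: "filter", FallbackAction: "pass"}}
	fmt.Println(evaluatePolicy(rules, tier, "gpt-4"))
}
```

Keeping evaluation free of I/O is what would let a WebSocket session evaluate every frame against one prefetched snapshot.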

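HTTP entry points (4)
- Chat Completions, Anthropic compat (Messages, including the secondary
  hit BetaFastMode→priority), native Responses, and Passthrough Responses
  all call applyOpenAIFastPolicyToBody; filter deletes the top-level
  service_tier via sjson, block returns a 403 forbidden_error JSON.
- All 4 entry points use the upstream-view model (the slug after
  GetMappedModel + normalizeOpenAIModelForUpstream + Codex OAuth
  normalize), so that chat / messages / native /responses / passthrough do
  not differ in whitelist matching because of different model dimensions.
- On the pass path the client "fast" alias is also normalized to "priority"
  and written back to the body; otherwise the native /responses and
  passthrough entry points would pass "fast" through to the upstream
  verbatim and get a 400/rejection (normalizeResponsesBodyServiceTier on
  the chat-completions entry point already behaved this way).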
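The three outcomes for an HTTP body (pass with alias rewrite, filter, block) can be sketched as below. The commit uses sjson for the top-level delete; this sketch swaps in the standard library's encoding/json so it stays self-contained, and applyPolicyToBody and errPolicyBlocked are hypothetical names. It assumes the action has already been evaluated by the policy.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// errPolicyBlocked is a hypothetical sentinel; a handler would translate it
// into a 403 forbidden_error JSON response.
var errPolicyBlocked = errors.New("service_tier blocked by policy")

// applyPolicyToBody applies an already-evaluated action to a request body.
// Only the top-level service_tier field is touched.
func applyPolicyToBody(body []byte, action string) ([]byte, error) {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(body, &fields); err != nil {
		return nil, err
	}
	raw, present := fields["service_tier"]
	if !present {
		return body, nil // nothing to do, bytes unchanged
	}
	var tier string
	if err := json.Unmarshal(raw, &tier); err != nil {
		return body, nil // non-string service_tier: leave untouched
	}
	switch action {
	case "block":
		return nil, errPolicyBlocked // body is never modified on block
	case "filter":
		delete(fields, "service_tier")
	default: // pass: still rewrite the "fast" alias for the upstream
		if strings.ToLower(strings.TrimSpace(tier)) != "fast" {
			return body, nil
		}
		fields["service_tier"] = json.RawMessage(`"priority"`)
	}
	return json.Marshal(fields)
}

func main() {
	body := []byte(`{"model":"gpt-4","service_tier":"fast"}`)
	for _, action := range []string{"pass", "filter", "block"} {
		out, err := applyPolicyToBody(body, action)
		fmt.Println(action, string(out), err)
	}
}
```

Unlike sjson, re-marshalling a map does not preserve the original key order, which is one reason a byte-level edit may be preferable in a gateway.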

WebSocket entry points
- Add applyOpenAIFastPolicyToWSResponseCreate: strictly matches
  type="response.create" and handles only the top-level service_tier;
  filter deletes the field with sjson, block returns a typed
  *OpenAIFastBlockedError.
- The ingress path calls it inside parseClientPayload; on a block hit it
  first Writes a Realtime-style error event and then returns
  OpenAIWSClientCloseError(StatusPolicyViolation=1008), relying on the
  synchronous flush of the underlying WebSocket Conn.Write to guarantee
  that the error precedes the close.
- The passthrough path applies the policy to firstClientMessage before
  RunEntry, and wraps ReadFrame with openAIWSPolicyEnforcingFrameConn to
  enforce the policy on every client→upstream frame; later frames without
  a model field fall back to capturedSessionModel. The filter closure also
  watches the session.model field of session.update / session.created
  frames to refresh capturedSessionModel, closing the mid-session bypass
  "first frame model=gpt-4o (pass) → session.update switches to gpt-5.5 →
  a response.create without model falls back to gpt-4o".
- passthrough billing: requestServiceTier is extracted from
  firstClientMessage only after the policy filter has run; on a filter hit
  OpenAIForwardResult.ServiceTier reports nil (default tier), consistent
  with the semantics of the HTTP entry points (reqBody comes from the
  post-filter map) and WS ingress (payload comes from the post-filter
  bytes).
- Error event schema: {event_id: "evt_<32hex>", type: "error",
  error: {type: "forbidden_error", code: "policy_violation", message}},
  compatible with the error event parsing of the OpenAI codex client.

Admin / Frontend
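- dto.SystemSettings / UpdateSettingsRequest gain the
  openai_fast_policy_settings field (omitempty), wired into bulk GET/PUT.
- The Gateway tab of the Settings page gains a Fast/Flex Policy form card:
  full-field configuration of service_tier × action × scope × model
  whitelist × fallback action.
- Frontend guard: the openaiFastPolicyLoaded flag allows write-back only
  when the GET actually returned the field, so a rollout or an error cannot
  overwrite the default rules with an empty set; the saveSettings
  write-back loop skips this field, which is handled by dedicated refresh
  logic; error_message is sent only when action=block, matching the
  backend's omitempty behavior.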

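Tests
- HTTP path: openai_fast_policy_test.go covers the default config
  (whitelist=[], priority filtered for all models) / block with a custom
  error / scope distinction / filter deleting the field / block leaving the
  body unmodified / block short-circuiting the upstream / Anthropic
  BetaFastMode triggering the OpenAI fast policy, among other scenarios.
- WebSocket path: openai_fast_policy_ws_test.go covers
    helper units (filter / fast→priority normalization / flex pass-through
    / block typed error / bytes unchanged when service_tier is absent /
    non-response.create frames untouched / empty-type frames untouched /
    event_id+code field assertions / tolerance of non-string service_tier) +
    pass-path fast alias normalization regression +
    ingress end-to-end (upstream sees no service_tier after filter / after
    block the client receives the error event first, then close 1008, with
    0 upstream writes) +
    passthrough capturedSessionModel fallback cases (established by the
    first frame under a whitelist policy, a missing model hits the
    fallback, the leak when no fallback exists is documented) +
    passthrough regression for the mid-session bypass in which
    session.update / session.created rotate capturedSessionModel +
    passthrough billing post-filter ServiceTier and idempotent filter
    regressions.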

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 11:15:09 +08:00
fjl5
6c89d8d35c add prompt_cache_key injection for messages→responses 2026-04-15 23:56:56 +08:00
sakurawztlt
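a1e299a355 fix: Anthropic non-streaming path rebuilds response content from delta events when the upstream terminal event has an empty output
The BufferedResponseAccumulator introduced in b2e379cf fixed the
chat_completions non-streaming path and the responses OAuth non-streaming
path, but missed the Anthropic /v1/messages non-streaming path
(handleAnthropicBufferedStreamingResponse).

When a client requests stream=false and the model has thinking enabled,
the output field of the upstream response.completed terminal event may be
empty, with the actual message content delivered through
response.output_text.delta incremental events. The old code read only the
Response of the terminal event, so the client received an empty content
field ([{"type":"text"}]).

This commit mirrors the fix pattern from b2e379cf onto the Anthropic path:
during the SSE scan, BufferedResponseAccumulator accumulates the delta
content, and when the terminal output is empty SupplementResponseOutput
rebuilds it.

It also fixes handleAnthropicBufferedStreamingResponse missing the
response.done event type, aligning it with the chat completions path. This
avoids the potential problem where the upstream sends response.done, the
handler does not recognize it as a terminal event, and a 502 is returned.

BufferedResponseAccumulator already has full unit test coverage in
chatcompletions_responses_test.go (TextOnly/ToolCalls/Reasoning/Mixed/
SupplementEmpty/NoSupplementWhenOutputExists/EmptyDeltas/
IgnoresNonFunctionCallItems), so reusing the same accumulator here needs
no new tests.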
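The accumulate-then-supplement pattern described above can be illustrated with a text-only sketch. deltaAccumulator, bufferedResponse, and isTerminalEvent are hypothetical stand-ins; the real BufferedResponseAccumulator also handles tool calls and reasoning items (per the test names listed), and its actual API is not shown in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// bufferedResponse is a simplified stand-in for the terminal Response.
type bufferedResponse struct {
	Output []string
}

// deltaAccumulator collects text deltas seen while scanning the SSE stream.
type deltaAccumulator struct {
	text strings.Builder
}

// Add records the delta of response.output_text.delta events and ignores
// every other event type.
func (a *deltaAccumulator) Add(eventType, delta string) {
	if eventType == "response.output_text.delta" {
		a.text.WriteString(delta)
	}
}

// Supplement fills in the output only when the terminal event left it empty
// and at least one delta was accumulated.
func (a *deltaAccumulator) Supplement(resp *bufferedResponse) {
	if len(resp.Output) > 0 || a.text.Len() == 0 {
		return
	}
	resp.Output = []string{a.text.String()}
}

// isTerminalEvent treats both response.completed and response.done as
// terminal, matching the second fix in the commit.
func isTerminalEvent(eventType string) bool {
	return eventType == "response.completed" || eventType == "response.done"
}

func main() {
	events := []struct{ Type, Delta string }{
		{"response.output_text.delta", "Hello, "},
		{"response.output_text.delta", "world"},
		{"response.done", ""},
	}
	var acc deltaAccumulator
	resp := &bufferedResponse{} // terminal event carried an empty output
	for _, ev := range events {
		acc.Add(ev.Type, ev.Delta)
		if isTerminalEvent(ev.Type) {
			acc.Supplement(resp)
		}
	}
	fmt.Println(resp.Output)
}
```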
2026-04-13 18:51:49 +08:00
IanShaw027
4de4823a65 feat(openai): support messages model mapping and instructions template injection 2026-04-09 12:29:49 +08:00
Alex
3a07e92b60 fix(openai): do not normalize /completion API token based accounts 2026-04-07 11:40:41 +03:00
erio
e27b0adbc8 refactor: remove resolveOpenAIUpstreamModel, use normalizeCodexModel directly
Eliminates unnecessary indirection layer. The wrapper function only
called normalizeCodexModel with a special case for "gpt 5.3 codex spark"
(space-separated variant) that is no longer needed.

All call sites now use normalizeCodexModel directly.
2026-04-04 14:07:19 +08:00
InCerry
0b3feb9d4c fix(openai): resolve Anthropic compat mapping from normalized model
Anthropic compat requests normalize reasoning suffixes before forwarding, but the account mapping step was still using the raw request model. Resolve billing and upstream models from the normalized compat model so explicit account mappings win over fallback defaults.
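The ordering this commit describes (normalize first, then consult the account mapping, then fall back) might look like the sketch below. Everything here is illustrative: normalizeCompatModel, resolveUpstreamModel, and the suffix list are assumptions, not the repository's actual functions or values.

```go
package main

import (
	"fmt"
	"strings"
)

// reasoningSuffixes is an assumed list of reasoning-effort suffixes.
var reasoningSuffixes = []string{"-xhigh", "-high", "-medium", "-low"}

// normalizeCompatModel strips a trailing reasoning suffix, if any.
func normalizeCompatModel(model string) string {
	for _, suffix := range reasoningSuffixes {
		if strings.HasSuffix(model, suffix) {
			return strings.TrimSuffix(model, suffix)
		}
	}
	return model
}

// resolveUpstreamModel looks up the account mapping with the normalized
// model, so an explicit mapping wins over the fallback default.
func resolveUpstreamModel(requested string, accountMapping map[string]string, fallback string) string {
	normalized := normalizeCompatModel(requested)
	if mapped, ok := accountMapping[normalized]; ok {
		return mapped
	}
	return fallback
}

func main() {
	mapping := map[string]string{"gpt-5.4": "custom-upstream-model"}
	fmt.Println(resolveUpstreamModel("gpt-5.4-xhigh", mapping, "default-model"))
	fmt.Println(resolveUpstreamModel("unknown-model", mapping, "default-model"))
}
```

Looking up the raw name "gpt-5.4-xhigh" would miss the "gpt-5.4" mapping and land on the fallback, which is the behavior the commit says it corrects.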

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-03-31 10:33:28 +08:00
InCerry
ca8692c747 Merge remote-tracking branch 'upstream/main'
# Conflicts:
#	backend/internal/service/openai_gateway_messages.go
2026-03-31 09:38:40 +08:00
YanzheL
8c10941142 fix(openai): normalize gpt-5.4-xhigh compat mapping 2026-03-29 20:52:29 +08:00
InCerry
995ef1348a refactor: improve model resolution and normalization logic for OpenAI integration 2026-03-24 19:20:15 +08:00
Ethan0x0000
2e4ac88ad9 feat(service): record upstream model across all gateway paths
Propagate UpstreamModel through ForwardResult and OpenAIForwardResult in Anthropic direct, API-key passthrough, Bedrock, and OpenAI gateway flows. Extract optionalNonEqualStringPtr and optionalTrimmedStringPtr into usage_log_helpers.go. Store upstream_model only when it differs from the requested model.

Also introduces anthropicPassthroughForwardInput struct to reduce parameter count.
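The two helper names come from the commit message, but their signatures are not shown there. A plausible sketch, under the assumption that both return nil when there is nothing worth storing:

```go
package main

import (
	"fmt"
	"strings"
)

// optionalTrimmedStringPtr (assumed signature) returns nil for blank input,
// otherwise a pointer to the trimmed value.
func optionalTrimmedStringPtr(s string) *string {
	trimmed := strings.TrimSpace(s)
	if trimmed == "" {
		return nil
	}
	return &trimmed
}

// optionalNonEqualStringPtr (assumed signature) returns a pointer to value
// only when it is non-blank and differs from compare. This matches the rule
// "store upstream_model only when it differs from the requested model".
func optionalNonEqualStringPtr(value, compare string) *string {
	value = strings.TrimSpace(value)
	if value == "" || value == strings.TrimSpace(compare) {
		return nil
	}
	return &value
}

func main() {
	fmt.Println(optionalNonEqualStringPtr("gpt-5", "gpt-5") == nil)
	if p := optionalNonEqualStringPtr("gpt-5-upstream", "gpt-5"); p != nil {
		fmt.Println(*p)
	}
}
```

Returning a nil pointer lets a nullable upstream_model column stay NULL in the common case where no mapping was applied.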
2026-03-17 19:25:35 +08:00
QTom
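ab4e8b2cf0 fix(gateway): prevent OpenAI Codex streams from crossing between users
Root cause: when multiple users share one OAuth account, the
conversation_id/session_id headers were not isolated per user, so the
upstream chatgpt.com associated requests from different users with the
same session.

HTTP SSE fix:
- Add isolateOpenAISessionID(apiKeyID, raw), which mixes the API Key ID
  into the session identifier (xxhash) so that users with different keys
  produce different upstream sessions
- buildUpstreamRequest: the OAuth branch first Dels the session headers
  passed through from the client, then overwrites them with the isolated
  values
- buildUpstreamRequestOpenAIPassthrough: the passthrough path is isolated
  the same way
- ForwardAsAnthropic: the Anthropic Messages compat path gets the same fix
- buildOpenAIWSHeaders: the OAuth session headers on the WS path are
  isolated as well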
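The isolation idea (derive the upstream session identifier from both the API key ID and the client-supplied value) can be sketched as follows. The commit uses xxhash; this sketch substitutes the standard library's FNV-1a so it runs without third-party modules, and isolateSessionID with its output format is a hypothetical stand-in for isolateOpenAISessionID.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// isolateSessionID mixes the API key ID into the raw session identifier so
// that two keys sending the same raw value map to different upstream
// sessions. FNV-1a is used here in place of xxhash.
func isolateSessionID(apiKeyID int64, raw string) string {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		return "" // nothing to isolate
	}
	h := fnv.New64a()
	fmt.Fprintf(h, "%d:%s", apiKeyID, raw)
	return fmt.Sprintf("%016x", h.Sum64())
}

func main() {
	fmt.Println(isolateSessionID(1, "session-abc"))
	fmt.Println(isolateSessionID(2, "session-abc"))
}
```

The mapping is deterministic per key, so a single user's requests still land in one upstream session across calls.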
2026-03-16 10:28:51 +08:00
Wang Lvyuan
4e8615f276 fix: honor account model mapping before group fallback 2026-03-14 10:47:31 +08:00
shaw
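9d81467937 refactor: rework the Chat Completions endpoint to use type-safe Responses API conversion
Refactors the /v1/chat/completions endpoint from the ResponseWriter
hijacking pattern into a standalone type-safe conversion path, aligned with
the architecture of the Anthropic Messages endpoint:

- Add complete Chat Completions type definitions and bidirectional
  converters to the apicompat package
- Add the ForwardAsChatCompletions service method, which goes through the
  Responses API upstream
- The handler now runs its own account selection/failover loop and no
  longer hijacks the Responses handler
- Extract handleCompatErrorResponse, shared by Chat Completions and Messages
- Remove the old forwardChatCompletions direct-forward path and related
  dead code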
2026-03-11 22:15:32 +08:00
Wesley Liddick
944b7f7617 Merge pull request #904 from james-6-23/fix-pool-mode-retry
fix: same-account retry in pool mode for transient OpenAI 400 errors & HelpTooltip layering fix
2026-03-10 09:08:12 +08:00
kyx236
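5fa22fdf82 fix: same-account retry in pool mode for transient OpenAI 400 errors & HelpTooltip layering fix
1. Recognize the transient OpenAI 400 error "An error occurred while processing your request"
   and trigger failover; in pool mode also mark it RetryableOnSameAccount so the same account can be retried
2. The ForwardAsAnthropic path likewise supports recognizing transient 400 errors and same-account retry
3. The HelpTooltip component renders to body via Teleport, fixing clipping inside dialogs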
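Items 1 and 2 describe a classification step that could be sketched as below. failoverDecision and classifyUpstreamError are hypothetical names, and matching on a substring of the response body is an assumption about how the error is recognized.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// transientMessage is the error text quoted in the commit message.
const transientMessage = "An error occurred while processing your request"

// failoverDecision is a simplified, assumed result shape.
type failoverDecision struct {
	Failover               bool
	RetryableOnSameAccount bool
}

// classifyUpstreamError treats a 400 carrying the transient message as a
// failover trigger, and allows a same-account retry only in pool mode.
func classifyUpstreamError(status int, body []byte, poolMode bool) failoverDecision {
	if status != http.StatusBadRequest || !bytes.Contains(body, []byte(transientMessage)) {
		return failoverDecision{}
	}
	return failoverDecision{Failover: true, RetryableOnSameAccount: poolMode}
}

func main() {
	body := []byte(`{"error":{"message":"An error occurred while processing your request."}}`)
	fmt.Printf("%+v\n", classifyUpstreamError(http.StatusBadRequest, body, true))
	fmt.Printf("%+v\n", classifyUpstreamError(http.StatusBadRequest, []byte(`{"error":{"message":"invalid model"}}`), true))
}
```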
2026-03-10 03:00:58 +08:00
erio
bcaae2eb91 fix: use shared max_line_size config for OpenAI Responses SSE scanner
Two SSE scanners in openai_gateway_messages.go were hardcoded to 1MB
while all other scanners use defaultMaxLineSize (500MB) with config
override. This caused Responses API streams to fail on large payloads.
2026-03-10 02:50:04 +08:00
shaw
25178cdbe1 fix: gpt->claude synchronous requests returning SSE 2026-03-09 15:58:44 +08:00
shaw
a461538d58 fix: gpt->claude conversion failing to hit the codex cache 2026-03-09 15:08:37 +08:00
shaw
ebe6f418f3 fix: align effort mapping and fast in gpt->claude format conversion 2026-03-09 11:42:35 +08:00
shaw
92d35409de feat: add a messages scheduling switch and default mapping model for openai groups 2026-03-07 17:02:19 +08:00
shaw
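1b4d2a41c9 fix(openai): add Codex usage snapshot extraction and error passthrough rules to the /v1/messages endpoint
Aligns with the Forward method of /v1/responses, fixing two inconsistencies:
- On a successful response, extract the OAuth account's Codex usage data from the response headers
- In non-failover error scenarios, apply the admin-configured error passthrough rules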
2026-03-07 08:40:07 +08:00
shaw
921599948b feat: adapt the /v1/messages endpoint to the codex account pool 2026-03-06 22:44:07 +08:00
alfadb
bc194a7d8c fix: address PR review - Anthropic error format in panic recovery and nil guard
- Add recoverAnthropicMessagesPanic for Messages handler to return
  Anthropic-formatted errors instead of OpenAI Responses format on panic
- Add nil check for rateLimitService.HandleUpstreamError in
  ForwardAsAnthropic to match defensive pattern used elsewhere

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 15:40:15 +08:00
alfadb
ff1f114989 feat(openai): add /v1/messages endpoint and API compatibility layer
Add Anthropic Messages API support for OpenAI platform groups, enabling
clients using Claude-style /v1/messages format to access OpenAI accounts
through automatic protocol conversion.

- Add apicompat package with type definitions and bidirectional converters
  (Anthropic ↔ Chat, Chat ↔ Responses, Anthropic ↔ Responses)
- Implement /v1/messages endpoint for OpenAI gateway with streaming support
- Add model mapping UI for OpenAI OAuth accounts (whitelist + mapping modes)
- Support prompt caching fields and codex OAuth transforms
- Fix tool call ID conversion for Responses API (fc_ prefix)
- Ensure function_call_output has non-empty output field

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 14:29:22 +08:00