- Security: force token_uri to Google default, preventing SSRF via crafted service account JSON
- Dedup: extract shared getVertexServiceAccountAccessToken() to eliminate ~35 lines of duplication between ClaudeTokenProvider and GeminiTokenProvider
- Fix: apply model mapping + Vertex model ID normalization in forward_as_responses and forward_as_chat_completions paths
- Fix: exclude service_account from AI Studio endpoint selection (Vertex cannot serve generativelanguage.googleapis.com)
- Feature: add model restriction/mapping UI for service_account in EditAccountModal
- Dedup: extract VERTEX_LOCATION_OPTIONS to shared constants
- i18n: replace all hardcoded Chinese strings in Vertex UI with translation keys
(*net.OpError).Error() concatenates Source/Addr fields, so the previous
disconnectMsg surfaced internal source IP/port and upstream server address
to clients via SSE error frames and UpstreamFailoverError.ResponseBody
(reported by @Wei-Shaw on PR #2066).
- Add sanitizeStreamError that maps known errors (io.ErrUnexpectedEOF,
context.Canceled, syscall.ECONNRESET/EPIPE/ETIMEDOUT/...) to fixed
descriptions and falls back to a generic placeholder, with an explicit
*net.OpError branch that drops Source/Addr fields entirely.
- Use sanitized message in client-facing disconnectMsg; full ev.err is
still preserved in the existing operator log line for diagnosis.
- Tests cover net.OpError redaction, the failover ResponseBody path, and
every known sanitized error mapping.
Background
The ops cleanup task currently rejects retention days < 1 in both validate
and normalize, so operators who want minimal-history setups (e.g. high-churn
deployments that prefer near-realtime cleanup) cannot express that
intent through the UI. The minimum configurable value is 1 day, which keeps
at least 24 hours of history regardless of cron frequency.
Purpose
Let admins set retention days to 0, meaning "every scheduled cleanup
run wipes the corresponding table(s) entirely". Combined with a more
frequent cron (e.g. `0 * * * *`) this yields effectively rolling cleanup.
Changes
Backend
- service/ops_settings.go: validate accepts [0, 365]; normalize only
refills default 30 when value is < 0 (negative is treated as legacy
bad data, 0 is honoured)
- service/ops_cleanup_service.go: introduce `opsCleanupPlan(now, days)`
returning `(cutoff, truncate, ok)`. days==0 returns truncate=true and
short-circuits to a new `truncateOpsTable` helper that uses
  `TRUNCATE TABLE` (O(1), minimal WAL, no VACUUM pressure). days>0 keeps
the existing batched DELETE path unchanged. Empty tables skip
TRUNCATE to avoid the ACCESS EXCLUSIVE lock entirely
- Extract `isMissingRelationError` helper to dedupe the "table not
yet created" tolerance shared by both delete and truncate paths
- Add unit tests for `opsCleanupPlan` (three branches) and
`isMissingRelationError`
Frontend
- OpsSettingsDialog.vue: validation accepts [0, 365]; input min=0
- i18n (zh/en): hint mentions "0 = wipe all on every cleanup",
validation message updated to 0-365 range
Trade-offs
- TRUNCATE requires ACCESS EXCLUSIVE lock briefly, but ops tables only
have the cleanup task as a writer, so the lock is invisible to other
workloads
- Empty-table guard avoids the lock when there is nothing to clean
- Negative values are still treated as legacy bad data and replaced
with default 30 to preserve compatibility
Two follow-ups to PR #2066's failover-wrap fix:
1. Failover ResponseBody (`UpstreamFailoverError.ResponseBody`) was encoded
as `{"error": "<msg>"}` (string field). `ExtractUpstreamErrorMessage`
probes for `error.message`, `detail`, or top-level `message` only — so
`handleFailoverExhausted` and downstream passthrough rules saw an empty
message, losing the EOF root cause in ops logs. Re-encode as the
Anthropic standard shape `{"type":"error","error":{"type":"upstream_disconnected","message":"..."}}`.
(Addresses the inline review comment from copilot-pull-request-reviewer
on Wei-Shaw/sub2api#2066.)
2. The streaming `event: error` SSE frame for `response_too_large`,
`stream_read_error`, and `stream_timeout` was non-standard
(`{"error":"<reason>"}`). Anthropic SDKs (and Claude Code) expect
`{"type":"error","error":{"type":"...","message":"..."}}` and parse
`error.type`/`error.message` accordingly. Refactor `sendErrorEvent` to
take both reason and message, and emit the standard frame so client
SDKs surface a real diagnostic message instead of a generic stream error.
This does not by itself prevent task interruption on long-stream EOF
(SSE has no resume; client-side retry remains the only complete fix), but
it gives both server-side ops logs and client-side error UIs a meaningful
upstream message so users know the next step is to retry.
Tests updated to assert the new body shape on both branches plus a new
assertion that `ExtractUpstreamErrorMessage` returns a non-empty string.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Anthropic streaming path (gateway_service.go) returned a plain error on
upstream SSE read failure, so the handler-level UpstreamFailoverError check
never fired and the client received a bare `stream_read_error` event,
breaking long-running tasks even when no bytes had been written yet.
The most common trigger is HTTP/2 GOAWAY from api.anthropic.com edge
backends doing graceful rotation: Go's http.Transport surfaces this as
`unexpected EOF` and never auto-retries.
Mirror what the OpenAI and antigravity gateways already do: when the read
error happens before any byte has reached the client (`!c.Writer.Written()`),
return `*UpstreamFailoverError{StatusCode: 502, RetryableOnSameAccount: true}`
so the handler can retry on the same or another account. After client
output has begun, SSE has no resume protocol — keep the existing passthrough
behavior.
Tests cover both branches via streamReadCloser-based fixtures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #1914 unconditionally applied the full mimicry pipeline to all OAuth
accounts, including real Claude Code CLI clients. This replaced the
client's long system prompt (~10K+ tokens with stable cache_control
breakpoints) with a short ~45 token [billing, CC prompt] pair, which
falls below Anthropic's 1024-token minimum cacheable prefix threshold.
The result: every request created a new cache but never hit an existing
one.
Fix: restore the Claude Code client detection gate so that real CC
clients bypass body-level mimicry (system rewrite, message cache
management, tool name obfuscation). Non-CC third-party clients
(opencode, etc.) continue to receive full mimicry.
Also harden the detection logic:
- Make UA regex case-insensitive (align with claude_code_validator.go)
- Validate metadata.user_id format via ParseMetadataUserID() instead of
just checking non-empty, preventing third-party tools from spoofing
a claude-cli/* UA with an arbitrary user_id string to bypass mimicry
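A sketch of the hardened check. The case-insensitive `claude-cli` UA match mirrors the commit; the `metadata.user_id` pattern below is purely hypothetical (the real rules live in `ParseMetadataUserID`), and only illustrates "validate the format, don't just check non-empty":

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// (?i) makes the UA match case-insensitive, aligned with
	// claude_code_validator.go.
	claudeCLIUA = regexp.MustCompile(`(?i)^claude-cli/`)
	// Hypothetical user_id shape for illustration only; the shipped rules
	// are whatever ParseMetadataUserID accepts.
	metadataUserIDRe = regexp.MustCompile(
		`^user_[0-9a-f]{8,64}_account_[0-9a-f-]{8,36}_session_[0-9a-f-]{8,36}$`)
)

// looksLikeClaudeCodeClient requires BOTH a claude-cli UA and a
// well-formed metadata.user_id, so a third-party tool cannot bypass
// mimicry by spoofing the UA alongside an arbitrary user_id string.
func looksLikeClaudeCodeClient(ua, metadataUserID string) bool {
	return claudeCLIUA.MatchString(ua) && metadataUserIDRe.MatchString(metadataUserID)
}

func main() {
	fmt.Println(looksLikeClaudeCodeClient("claude-cli/2.0", "spoofed-id")) // false
	fmt.Println(looksLikeClaudeCodeClient(
		"CLAUDE-CLI/2.0", "user_deadbeef_account_12345678_session_87654321")) // true
}
```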
- gofmt: realign AffiliateDetail struct tags in affiliate_service.go
- ineffassign: remove dead seenCompleted assignment before return in account_test_service.go
The hardcoded codex CLI version (0.104.0) causes upstream rejection
when using gpt-5.5 with compact, as the server treats the request
as an outdated client and returns 400/502.
Update codexCLIVersion, codexCLIUserAgent, and openAICodexProbeVersion
to 0.125.0 to match the current Codex CLI release.
Fixes #1933, #1887, #1865
Related: #1609, #1298, #849
- staticcheck QF1001: apply De Morgan's law to the OAuth-mimic header
passthrough guard (`!(a && b)` → `a != ... || !b`).
- unused: drop `isClaudeCodeRequest`, which became dead after PR #1914
switched both `/v1/messages` and `/count_tokens` paths to unconditional
`account.IsOAuth()` mimicry. The lowercase helper `isClaudeCodeClient`
is kept (still referenced by `TestIsClaudeCodeClient`).
- Drop SetAffiliateService setters and ProvideAuthService /
ProvidePaymentService / ProvideUserHandler wrappers in favor of direct
Wire constructor injection. AffiliateService has no back-edge to
Auth/Payment/User, so the indirection was never required.
- Change RegisterWithVerification's variadic affiliateCode to a fixed
parameter; adjust all call sites.
- Validate aff_code length and charset in BindInviterByCode before any
DB lookup, eliminating timing-side-channel and useless DB roundtrips
on malformed input.
- Make affiliate cache invalidation synchronous; surface Redis errors
via the project logger instead of swallowing them in a detached
goroutine.
- Add an integration test guarding cross-layer tx propagation in
AccrueQuota and a unit test pinning the aff_code format rules.
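The pre-lookup validation can be sketched as below. The 6-16 alphanumeric rule is an illustrative assumption, not the pinned format; the point is that malformed codes are rejected with constant work and no DB roundtrip:

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// affCodeRe is an assumed shape (6-16 alphanumeric characters); the real
// charset/length rules are whatever the unit test pins down.
var affCodeRe = regexp.MustCompile(`^[A-Za-z0-9]{6,16}$`)

// validateAffCode rejects malformed codes before any DB lookup, so an
// attacker cannot probe code existence via timing, and garbage input
// never costs a database roundtrip.
func validateAffCode(code string) error {
	if !affCodeRe.MatchString(code) {
		return errors.New("invalid aff_code")
	}
	return nil
}

func main() {
	fmt.Println(validateAffCode("ABC123") == nil)  // true
	fmt.Println(validateAffCode("../etc") != nil)  // true: rejected pre-lookup
}
```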
Root cause of persistent third-party detection: sub2api's
buildUpstreamRequest transparently forwards client headers via
allowedHeaders whitelist (addHeaderRaw) before applying mimicry
overrides. When third-party clients (opencode, etc.) send their own
anthropic-beta / user-agent / x-stainless-* / x-claude-code-session-id
values, these get appended to the request alongside our injected
headers, creating an inconsistent header set that Anthropic detects.
Parrot's build_upstream_headers constructs exactly 9 headers from
scratch and never forwards anything from the client. This is why
'same opencode version, some users work, some don't': different
opencode configs/versions send different header combinations.
Fix: when tokenType=oauth and mimicClaudeCode=true, skip the
client header passthrough loop entirely. The subsequent
applyClaudeCodeMimicHeaders + ApplyFingerprint + beta merge
pipeline constructs all necessary headers from our controlled values.
Also: remove systemIncludesClaudeCodePrompt gate — OAuth accounts
now unconditionally rewrite system (even if client already sent a
Claude Code-style prompt), ensuring billing attribution block is
always present.
Before: isClaudeCodeRequest() checked whether the client looks like a
real Claude Code CLI (UA, system prompt, X-App header, metadata format).
If it looked like Claude Code, all mimicry was skipped — the assumption
being that a real CLI needs no help.
Problem: third-party tools like opencode partially impersonate Claude
Code (sending claude-cli UA + claude-code beta + CC system prompt) but
miss critical details (billing attribution block, tool-name obfuscation,
cache breakpoints, full beta set). Some users' opencode instances pass
the isClaudeCodeRequest check, causing sub2api to skip mimicry entirely,
while Anthropic still detects the request as third-party.
This explains why 'same opencode version, some users work, some don't'
— it depends on which opencode features/config trigger the validator.
Fix: OAuth accounts now unconditionally run the full mimicry pipeline,
matching Parrot's behavior (Parrot never checks client identity).
This is safe because our mimicry is strictly more complete than any
third-party client's partial impersonation.
Changed:
- /v1/messages path: remove isClaudeCode gate
- /v1/messages/count_tokens path: same
The previous commit only wired stripMessageCacheControl,
addMessageCacheBreakpoints, and tool-name obfuscation into
applyClaudeCodeOAuthMimicryToBody (used by /chat/completions and
/responses). The native /v1/messages path and count_tokens path
have their own independent mimicry code blocks and were missed.
Now all three entry points share the same D/E/F pipeline:
- /v1/messages (gateway_service.go forwardAnthropic)
- /v1/messages/count_tokens (gateway_service.go countTokens)
- OpenAI compat (applyClaudeCodeOAuthMimicryToBody)
Implements the remaining three parity items with Parrot cc_mimicry:
D) Tool-name obfuscation
- Dynamic mapping when tools.length > 5 (matches Parrot threshold).
Fake names follow {prefix}{name[:3]}{i:02d} (e.g. 'manage_bas00').
Go port of random.Random(hash(tuple(names))) uses fnv64a seed +
math/rand; byte-exact reproduction is impossible (Python hash vs
Go hash), but the two invariants that matter are preserved:
* same input tool_names yield identical mapping (cache hit)
* prefix pool is shuffled (names look distributed)
- Static prefix map (sessions_ -> cc_sess_, session_ -> cc_ses_)
applied as fallback, matching Parrot TOOL_NAME_REWRITES verbatim.
- Server tools (web_search_20250305, computer_*, etc.) are NOT
renamed; only type=='function' and type=='custom' tools are.
- tool_choice.name is rewritten in sync (only when type=='tool').
- Response side: bytes-level replace on every SSE chunk / JSON
body at 6 injection points (standard stream/non-stream,
passthrough stream/non-stream, chat_completions stream +
non-stream, responses stream + non-stream). Reverse mapping
applied longest-fake-name-first to prevent substring conflicts
(parity with Parrot _restore_tool_names_in_chunk).
- tool_choice is no longer unconditionally deleted in
normalizeClaudeOAuthRequestBody — Parrot passes it through.
E) tools[-1] cache_control breakpoint
- Injected as {type:ephemeral, ttl:<DefaultCacheControlTTL>} when
the last tool has no cache_control. Client-provided ttl is
passed through unchanged (repo-wide policy).
F) messages cache_control strategy
- stripMessageCacheControl removes every client-provided
messages[*].content[*].cache_control (multi-turn stability).
- addMessageCacheBreakpoints then injects two stable breakpoints:
(1) last message, and (2) second-to-last user turn when
messages.length >= 4.
- Combined with the system block breakpoint and tools[-1]
breakpoint, this gives exactly the 4 breakpoints Anthropic
allows per request.
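The breakpoint-selection rule in (F) can be sketched as below. The rule (mark the last message; when `messages.length >= 4`, also mark the second-to-last user turn) is from the commit; a bool stands in for the real `{type: "ephemeral"}` cache_control injection on the last content block:

```go
package main

import "fmt"

type message struct {
	Role         string
	CacheControl bool // stands in for a cache_control marker on content
}

// addMessageCacheBreakpoints injects the two stable message breakpoints
// after stripMessageCacheControl has removed all client-provided ones:
// (1) the last message, and (2) the second-to-last user turn when there
// are at least 4 messages.
func addMessageCacheBreakpoints(msgs []message) []message {
	if len(msgs) == 0 {
		return msgs
	}
	msgs[len(msgs)-1].CacheControl = true // breakpoint (1): last message
	if len(msgs) >= 4 {
		seen := 0
		for i := len(msgs) - 1; i >= 0; i-- {
			if msgs[i].Role == "user" {
				seen++
				if seen == 2 { // breakpoint (2): second-to-last user turn
					msgs[i].CacheControl = true
					break
				}
			}
		}
	}
	return msgs
}

func main() {
	out := addMessageCacheBreakpoints([]message{
		{Role: "user"}, {Role: "assistant"},
		{Role: "user"}, {Role: "assistant"}, {Role: "user"},
	})
	for i, m := range out {
		if m.CacheControl {
			fmt.Println(i) // prints 2, then 4
		}
	}
}
```

With the system-block breakpoint and the tools[-1] breakpoint from (E), this yields exactly the 4 breakpoints Anthropic allows per request.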
Non-trivial implementation details to be aware of when rebasing:
* Two new files, no upstream collision:
gateway_tool_rewrite.go (D + E algorithms)
gateway_messages_cache.go (F strip + breakpoints)
* Two new feature calls bolted onto the tail of
applyClaudeCodeOAuthMimicryToBody in gateway_service.go — rebase
conflicts will be ~10 lines maximum.
* Response-side injection points all wrap their existing write with
reverseToolNamesIfPresent(c, ...), preserving original behavior
when no mapping is stored (static prefix rollback still runs).
* Non-stream chat/responses switched from c.JSON to
json.Marshal + c.Data so bytes-level replace is possible.
* Retry bodies (FilterThinkingBlocksForRetry,
FilterSignatureSensitiveBlocksForRetry, RectifyThinkingBudget)
only prune blocks — they preserve the already-obfuscated tool
names, so no extra mapping re-application is needed.
Manual QA: end-to-end scenario verified with 6 tools (above threshold)
and tool_choice.type == 'tool'. The obfuscation + restore roundtrip was
confirmed in test logs; the temporary test file was then removed.
Tests (16 new):
- buildDynamicToolMap stability + below-threshold guard
- sanitizeToolName precedence (dynamic > static)
- restoreToolNamesInBytes longest-first + static rollback
- applyToolNameRewriteToBody skips server tools + syncs tool_choice
- applyToolsLastCacheBreakpoint defaults to 5m + passes client ttl
- stripMessageCacheControl + addMessageCacheBreakpoints in the
1/4/string-content cases + second-to-last user turn selection
- buildToolNameRewriteFromBody ReverseOrdered is desc-by-fake-length
- fake name shape follows Parrot {prefix}{head3}{i:02d}