(*net.OpError).Error() concatenates Source/Addr fields, so the previous
disconnectMsg surfaced internal source IP/port and upstream server address
to clients via SSE error frames and UpstreamFailoverError.ResponseBody
(reported by @Wei-Shaw on PR #2066).
- Add sanitizeStreamError that maps known errors (io.ErrUnexpectedEOF,
context.Canceled, syscall.ECONNRESET/EPIPE/ETIMEDOUT/...) to fixed
descriptions and falls back to a generic placeholder, with an explicit
*net.OpError branch that drops Source/Addr fields entirely.
- Use sanitized message in client-facing disconnectMsg; full ev.err is
still preserved in the existing operator log line for diagnosis.
- Tests cover net.OpError redaction, the failover ResponseBody path, and
every known sanitized error mapping.
Two follow-ups to PR #2066's failover-wrap fix:
1. Failover ResponseBody (`UpstreamFailoverError.ResponseBody`) was encoded
as `{"error": "<msg>"}` (string field). `ExtractUpstreamErrorMessage`
probes for `error.message`, `detail`, or top-level `message` only — so
`handleFailoverExhausted` and downstream passthrough rules saw an empty
message, losing the EOF root cause in ops logs. Re-encode as the
Anthropic standard shape `{"type":"error","error":{"type":"upstream_disconnected","message":"..."}}`.
(Addresses the inline review comment from copilot-pull-request-reviewer
on Wei-Shaw/sub2api#2066.)
2. The streaming `event: error` SSE frame for `response_too_large`,
`stream_read_error`, and `stream_timeout` was non-standard
(`{"error":"<reason>"}`). Anthropic SDKs (and Claude Code) expect
`{"type":"error","error":{"type":"...","message":"..."}}` and parse
`error.type`/`error.message` accordingly. Refactor `sendErrorEvent` to
take both reason and message, and emit the standard frame so client
SDKs surface a real diagnostic message instead of a generic stream error.
This does not by itself prevent task interruption on long-stream EOF
(SSE has no resume; client-side retry remains the only complete fix), but
it gives both server-side ops logs and client-side error UIs a meaningful
upstream message so users know the next step is to retry.
Tests updated to assert the new body shape on both branches plus a new
assertion that `ExtractUpstreamErrorMessage` returns a non-empty string.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Anthropic streaming path (gateway_service.go) returned a plain error on
upstream SSE read failure, so the handler-level UpstreamFailoverError check
never fired and the client received a bare `stream_read_error` event,
breaking long-running tasks even when no bytes had been written yet.
The most common trigger is HTTP/2 GOAWAY from api.anthropic.com edge
backends doing graceful rotation: Go's http.Transport surfaces this as
`unexpected EOF` and never auto-retries.
Mirror what the OpenAI and antigravity gateways already do: when the read
error happens before any byte has reached the client (`!c.Writer.Written()`),
return `*UpstreamFailoverError{StatusCode: 502, RetryableOnSameAccount: true}`
so the handler can retry on the same or another account. After client
output has begun, SSE has no resume protocol — keep the existing passthrough
behavior.
Tests cover both branches via streamReadCloser-based fixtures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>