feat: add view_image tool and optimize web fetch tools

Add image viewing capability for vision-enabled models with ViewImageMiddleware and view_image_tool. Limit web_fetch tool output to 4096 characters to prevent excessive content. Update model config to support vision capability flag.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-04-18 12:04:45 +08:00 · 2026-01-29 13:44:04 +08:00
parent 73a5a7972e
commit 09d9c18a28
12 changed files with 390 additions and 13 deletions
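
The commit message describes two behaviours: capping web_fetch output at 4096 characters, and gating image viewing on a per-model vision flag. The following is a minimal sketch of how those two ideas could look, not the code from this commit. `ModelConfigSketch`, `truncate_fetch_output`, and `available_tools` are hypothetical names, and a plain dataclass stands in for the pydantic `ModelConfig` so the example runs without dependencies.

```python
from dataclasses import dataclass

# Limit taken from the commit message; the constant name is an assumption.
MAX_FETCH_CHARS = 4096


def truncate_fetch_output(text: str, limit: int = MAX_FETCH_CHARS) -> str:
    """Return at most `limit` characters of `text` (hypothetical helper)."""
    return text[:limit]


@dataclass
class ModelConfigSketch:
    """Stand-in for ModelConfig; only the new flag is modelled here."""
    name: str
    supports_vision: bool = False  # mirrors the default in the diff


def available_tools(config: ModelConfigSketch, base_tools: list[str]) -> list[str]:
    """Offer the image tool only when the model is vision-enabled (assumed gating logic)."""
    tools = list(base_tools)
    if config.supports_vision:
        tools.append("view_image")
    return tools
```

With this sketch, a model configured without `supports_vision` would see only the base tools, while a vision-enabled one would also be offered `view_image`. Whether the real middleware gates the tool this way is not visible in this part of the diff.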
--- a/backend/src/config/model_config.py
+++ b/backend/src/config/model_config.py
@@ -18,3 +18,4 @@ class ModelConfig(BaseModel):
        default_factory=lambda: None,
        description="Extra settings to be passed to the model when thinking is enabled",
    )
+    supports_vision: bool = Field(default_factory=lambda: False, description="Whether the model supports vision/image inputs")