# Architecture Overview

This document provides a comprehensive overview of the DeerFlow backend architecture.

## System Architecture

```
┌──────────────────────────────────────────────────────────────────────────┐
│                             Client (Browser)                             │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────────┐
│                            Nginx (Port 2026)                             │
│                    Unified Reverse Proxy Entry Point                     │
│  ┌────────────────────────────────────────────────────────────────────┐  │
│  │  /api/langgraph/*  →  LangGraph Server (2024)                      │  │
│  │  /api/*            →  Gateway API (8001)                           │  │
│  │  /*                →  Frontend (3000)                              │  │
│  └────────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
          ┌───────────────────────┼───────────────────────┐
          │                       │                       │
          ▼                       ▼                       ▼
┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
│  LangGraph Server   │ │    Gateway API      │ │      Frontend       │
│    (Port 2024)      │ │    (Port 8001)      │ │    (Port 3000)      │
│                     │ │                     │ │                     │
│ - Agent Runtime     │ │ - Models API        │ │ - Next.js App       │
│ - Thread Mgmt       │ │ - MCP Config        │ │ - React UI          │
│ - SSE Streaming     │ │ - Skills Mgmt       │ │ - Chat Interface    │
│ - Checkpointing     │ │ - File Uploads      │ │                     │
│                     │ │ - Artifacts         │ │                     │
└─────────────────────┘ └─────────────────────┘ └─────────────────────┘
           │                       │
           │     ┌─────────────────┘
           │     │
           ▼     ▼
┌──────────────────────────────────────────────────────────────────────────┐
│                           Shared Configuration                           │
│  ┌─────────────────────────┐ ┌────────────────────────────────────────┐  │
│  │ config.yaml             │ │ extensions_config.json                 │  │
│  │ - Models                │ │ - MCP Servers                          │  │
│  │ - Tools                 │ │ - Skills State                         │  │
│  │ - Sandbox               │ │                                        │  │
│  │ - Summarization         │ │                                        │  │
│  └─────────────────────────┘ └────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────┘
```

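The routing rules in the Nginx box amount to a most-specific-prefix match. The sketch below models that lookup in plain Python for illustration only; the real routing lives in the Nginx configuration, and the `localhost` hostnames, the `ROUTES` table, and `resolve_upstream` are assumptions introduced here.

```python
from typing import List, Tuple

# Prefix → upstream, ordered from most to least specific.
# Ports come from the diagram; the "localhost" host is an assumption.
ROUTES: List[Tuple[str, str]] = [
    ("/api/langgraph/", "http://localhost:2024"),  # LangGraph Server
    ("/api/", "http://localhost:8001"),            # Gateway API
    ("/", "http://localhost:3000"),                # Frontend
]


def resolve_upstream(path: str) -> str:
    """Return the upstream that would serve `path` (hypothetical helper)."""
    for prefix, upstream in ROUTES:
        if path.startswith(prefix):
            return upstream
    raise ValueError(f"no route matches {path!r}")
```

Because `/api/langgraph/` is listed before `/api/`, agent traffic never falls through to the Gateway API.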
## Component Details

### LangGraph Server

The LangGraph server is the core agent runtime, built on LangGraph for robust multi-agent workflow orchestration.

**Entry Point**: `src/agents/lead_agent/agent.py:make_lead_agent`

**Key Responsibilities**:
- Agent creation and configuration
- Thread state management
- Middleware chain execution
- Tool execution orchestration
- SSE streaming for real-time responses

**Configuration**: `langgraph.json`

```json
{
  "agent": {
    "type": "agent",
    "path": "src.agents:make_lead_agent"
  }
}
```

### Gateway API

FastAPI application providing REST endpoints for non-agent operations.

**Entry Point**: `src/gateway/app.py`

**Routers**:
- `models.py` - `/api/models` - Model listing and details
- `mcp.py` - `/api/mcp` - MCP server configuration
- `skills.py` - `/api/skills` - Skills management
- `uploads.py` - `/api/threads/{id}/uploads` - File upload
- `artifacts.py` - `/api/threads/{id}/artifacts` - Artifact serving

### Agent Architecture

```
┌─────────────────────────────────────────────────────────────────────────┐
│                         make_lead_agent(config)                         │
└────────────────────────────────────┬────────────────────────────────────┘
                                     │
                                     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                            Middleware Chain                             │
│  ┌──────────────────────────────────────────────────────────────────┐   │
│  │ 1. ThreadDataMiddleware  - Initialize workspace/uploads/outputs  │   │
│  │ 2. UploadsMiddleware     - Process uploaded files                │   │
│  │ 3. SandboxMiddleware     - Acquire sandbox environment           │   │
│  │ 4. SummarizationMiddleware - Context reduction (if enabled)      │   │
│  │ 5. TitleMiddleware       - Auto-generate titles                  │   │
│  │ 6. TodoListMiddleware    - Task tracking (if plan_mode)          │   │
│  │ 7. ViewImageMiddleware   - Vision model support                  │   │
│  │ 8. ClarificationMiddleware - Handle clarifications               │   │
│  └──────────────────────────────────────────────────────────────────┘   │
└────────────────────────────────────┬────────────────────────────────────┘
                                     │
                                     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                               Agent Core                                │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────────┐   │
│  │      Model       │  │      Tools       │  │    System Prompt     │   │
│  │  (from factory)  │  │  (configured +   │  │    (with skills)     │   │
│  │                  │  │   MCP + builtin) │  │                      │   │
│  └──────────────────┘  └──────────────────┘  └──────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────┘
```

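Two entries in the chain are conditional: summarization runs only when enabled, and the todo list only in plan mode. The sketch below shows how such an ordered chain could be assembled. It returns middleware names rather than real instances, and the function name and its two flags are hypothetical, not taken from the DeerFlow code.

```python
from typing import List


def build_middleware_names(summarization_enabled: bool, plan_mode: bool) -> List[str]:
    """Return middleware names in the order shown in the diagram (illustrative)."""
    chain = ["ThreadDataMiddleware", "UploadsMiddleware", "SandboxMiddleware"]
    if summarization_enabled:  # "Context reduction (if enabled)"
        chain.append("SummarizationMiddleware")
    chain.append("TitleMiddleware")
    if plan_mode:  # "Task tracking (if plan_mode)"
        chain.append("TodoListMiddleware")
    chain += ["ViewImageMiddleware", "ClarificationMiddleware"]
    return chain
```

Keeping the order in one place makes it easy to see that sandbox acquisition always precedes anything that might touch files.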
### Thread State

`ThreadState` extends LangGraph's `AgentState` with additional fields:

```python
class ThreadState(AgentState):
    # Core state from AgentState
    messages: list[BaseMessage]

    # DeerFlow extensions
    sandbox: dict             # Sandbox environment info
    artifacts: list[str]      # Generated file paths
    thread_data: dict         # {workspace, uploads, outputs} paths
    title: str | None         # Auto-generated conversation title
    todos: list[dict]         # Task tracking (plan mode)
    viewed_images: dict       # Vision model image data
```

### Sandbox System

```
┌─────────────────────────────────────────────────────────────────────────┐
│                          Sandbox Architecture                           │
└─────────────────────────────────────────────────────────────────────────┘

                     ┌─────────────────────────┐
                     │    SandboxProvider      │ (Abstract)
                     │  - acquire()            │
                     │  - get()                │
                     │  - release()            │
                     └────────────┬────────────┘
                                  │
             ┌────────────────────┼────────────────────┐
             │                                         │
             ▼                                         ▼
┌─────────────────────────┐               ┌─────────────────────────┐
│  LocalSandboxProvider   │               │   AioSandboxProvider    │
│ (src/sandbox/local.py)  │               │  (src/community/)       │
│                         │               │                         │
│  - Singleton instance   │               │  - Docker-based         │
│  - Direct execution     │               │  - Isolated containers  │
│  - Development use      │               │  - Production use       │
└─────────────────────────┘               └─────────────────────────┘

                     ┌─────────────────────────┐
                     │        Sandbox          │ (Abstract)
                     │  - execute_command()    │
                     │  - read_file()          │
                     │  - write_file()         │
                     │  - list_dir()           │
                     └─────────────────────────┘
```

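The provider/sandbox split in the diagram can be sketched with abstract base classes. The method names come from the diagram, but every signature below is an assumption, as are the `DirectorySandbox` and `SingletonProvider` classes. `execute_command()` is left out to keep the sketch free of shell calls.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import List, Optional


class Sandbox(ABC):
    """File operations from the diagram; signatures are assumed."""

    @abstractmethod
    def read_file(self, path: str) -> str: ...

    @abstractmethod
    def write_file(self, path: str, content: str) -> None: ...

    @abstractmethod
    def list_dir(self, path: str) -> List[str]: ...


class SandboxProvider(ABC):
    """Lifecycle operations from the diagram; signatures are assumed."""

    @abstractmethod
    def acquire(self, thread_id: str) -> str: ...

    @abstractmethod
    def get(self, sandbox_id: str) -> Optional[Sandbox]: ...

    @abstractmethod
    def release(self, sandbox_id: str) -> None: ...


class DirectorySandbox(Sandbox):
    """Hypothetical sandbox that reads and writes under one root directory."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def read_file(self, path: str) -> str:
        return (self.root / path).read_text(encoding="utf-8")

    def write_file(self, path: str, content: str) -> None:
        target = self.root / path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")

    def list_dir(self, path: str) -> List[str]:
        return sorted(p.name for p in (self.root / path).iterdir())


class SingletonProvider(SandboxProvider):
    """Hands every thread the same sandbox, like the 'singleton instance' note."""

    def __init__(self, root: Path) -> None:
        self._sandbox = DirectorySandbox(root)

    def acquire(self, thread_id: str) -> str:
        return "local"  # one shared id regardless of thread

    def get(self, sandbox_id: str) -> Optional[Sandbox]:
        return self._sandbox if sandbox_id == "local" else None

    def release(self, sandbox_id: str) -> None:
        pass  # nothing to tear down for a shared instance
```

A container-backed provider would implement the same three methods but create and destroy an isolated environment per sandbox id.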
**Virtual Path Mapping**:

| Virtual Path | Physical Path |
|-------------|---------------|
| `/mnt/user-data/workspace` | `backend/.deer-flow/threads/{thread_id}/user-data/workspace` |
| `/mnt/user-data/uploads` | `backend/.deer-flow/threads/{thread_id}/user-data/uploads` |
| `/mnt/user-data/outputs` | `backend/.deer-flow/threads/{thread_id}/user-data/outputs` |
| `/mnt/skills` | `deer-flow/skills/` |

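The table above can be read as a pure path translation. The following is a minimal sketch of the `/mnt/user-data/*` rows only, with a simple traversal check. The function name, its parameters, and the choice to raise `ValueError` are assumptions, and the `/mnt/skills` row is omitted.

```python
from pathlib import Path, PurePosixPath

VIRTUAL_ROOT = PurePosixPath("/mnt/user-data")
ALLOWED_SUBDIRS = {"workspace", "uploads", "outputs"}


def resolve_virtual_path(virtual_path: str, thread_id: str, base_dir: Path) -> Path:
    """Translate a virtual sandbox path into a per-thread physical path.

    `base_dir` stands for the `.deer-flow` directory. Hypothetical helper,
    not the project's actual implementation.
    """
    # relative_to() raises ValueError for anything outside /mnt/user-data.
    rel = PurePosixPath(virtual_path).relative_to(VIRTUAL_ROOT)
    if not rel.parts or rel.parts[0] not in ALLOWED_SUBDIRS:
        raise ValueError(f"unsupported virtual path: {virtual_path!r}")
    if ".." in rel.parts:  # refuse traversal out of the thread directory
        raise ValueError(f"path traversal rejected: {virtual_path!r}")
    return base_dir / "threads" / thread_id / "user-data" / Path(*rel.parts)
```

For example, `resolve_virtual_path("/mnt/user-data/uploads/doc.pdf", "t1", Path("backend/.deer-flow"))` yields a path under `backend/.deer-flow/threads/t1/user-data/uploads/`.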
### Tool System

```
┌─────────────────────────────────────────────────────────────────────────┐
│                              Tool Sources                               │
└─────────────────────────────────────────────────────────────────────────┘

┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
│   Built-in Tools    │ │  Configured Tools   │ │     MCP Tools       │
│  (src/tools/)       │ │  (config.yaml)      │ │  (extensions.json)  │
├─────────────────────┤ ├─────────────────────┤ ├─────────────────────┤
│ - present_file      │ │ - web_search        │ │ - github            │
│ - ask_clarification │ │ - web_fetch         │ │ - filesystem        │
│ - view_image        │ │ - bash              │ │ - postgres          │
│                     │ │ - read_file         │ │ - brave-search      │
│                     │ │ - write_file        │ │ - puppeteer         │
│                     │ │ - str_replace       │ │ - ...               │
│                     │ │ - ls                │ │                     │
└─────────────────────┘ └─────────────────────┘ └─────────────────────┘
           │                       │                       │
           └───────────────────────┴───────────────────────┘
                                   │
                                   ▼
                      ┌─────────────────────────┐
                      │  get_available_tools()  │
                      │  (src/tools/__init__)   │
                      └─────────────────────────┘
```

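The diagram shows three tool sources feeding one list. The sketch below illustrates that merge on tool names only. `merge_tool_sources` is a hypothetical stand-in for whatever `get_available_tools()` does internally, and the first-source-wins rule for duplicate names is an assumption.

```python
from typing import Iterable, List


def merge_tool_sources(
    builtin: Iterable[str],
    configured: Iterable[str],
    mcp: Iterable[str],
) -> List[str]:
    """Combine tool names from the three sources, keeping first occurrences."""
    seen = set()
    merged: List[str] = []
    for source in (builtin, configured, mcp):
        for name in source:
            if name not in seen:  # assumed precedence: earlier source wins
                seen.add(name)
                merged.append(name)
    return merged
```

With `["present_file"]`, `["bash", "ls"]`, and `["github", "bash"]` as inputs, the result keeps a single `bash` entry, from the configured tools.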
### Model Factory

```
┌─────────────────────────────────────────────────────────────────────────┐
│                              Model Factory                              │
│                         (src/models/factory.py)                         │
└─────────────────────────────────────────────────────────────────────────┘

config.yaml:
┌─────────────────────────────────────────────────────────────────────────┐
│  models:                                                                │
│    - name: gpt-4                                                        │
│      display_name: GPT-4                                                │
│      use: langchain_openai:ChatOpenAI                                   │
│      model: gpt-4                                                       │
│      api_key: $OPENAI_API_KEY                                           │
│      max_tokens: 4096                                                   │
│      supports_thinking: false                                           │
│      supports_vision: true                                              │
└─────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
                        ┌─────────────────────────┐
                        │  create_chat_model()    │
                        │  - name: str            │
                        │  - thinking_enabled     │
                        └────────────┬────────────┘
                                     │
                                     ▼
                        ┌─────────────────────────┐
                        │  resolve_class()        │
                        │  (reflection system)    │
                        └────────────┬────────────┘
                                     │
                                     ▼
                        ┌─────────────────────────┐
                        │  BaseChatModel          │
                        │  (LangChain instance)   │
                        └─────────────────────────┘
```

**Supported Providers**:
- OpenAI (`langchain_openai:ChatOpenAI`)
- Anthropic (`langchain_anthropic:ChatAnthropic`)
- DeepSeek (`langchain_deepseek:ChatDeepSeek`)
- Custom via LangChain integrations

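The `use: module:ClassName` convention can be resolved with the standard `importlib` module. The sketch below illustrates the idea behind `resolve_class()` under that assumption. `load_symbol`, `build_from_config`, and the set of metadata keys stripped before instantiation are hypothetical, not DeerFlow's actual code.

```python
import importlib
from typing import Any, Dict

# Keys assumed to describe the model entry rather than configure the class.
METADATA_KEYS = {"name", "display_name", "use", "supports_thinking", "supports_vision"}


def load_symbol(spec: str) -> Any:
    """Import `module:attribute` and return the attribute."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def build_from_config(entry: Dict[str, Any]) -> Any:
    """Instantiate the class named by `use`, passing the remaining keys as kwargs."""
    cls = load_symbol(entry["use"])
    kwargs = {k: v for k, v in entry.items() if k not in METADATA_KEYS}
    return cls(**kwargs)
```

The same mechanism works for any importable class, which is what lets a new provider be added through configuration alone. For instance, `load_symbol("collections:OrderedDict")` returns the standard-library class.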
### MCP Integration

```
┌─────────────────────────────────────────────────────────────────────────┐
│                             MCP Integration                             │
│                          (src/mcp/manager.py)                           │
└─────────────────────────────────────────────────────────────────────────┘

extensions_config.json:
┌─────────────────────────────────────────────────────────────────────────┐
│  {                                                                      │
│    "mcpServers": {                                                      │
│      "github": {                                                        │
│        "enabled": true,                                                 │
│        "type": "stdio",                                                 │
│        "command": "npx",                                                │
│        "args": ["-y", "@modelcontextprotocol/server-github"],           │
│        "env": {"GITHUB_TOKEN": "$GITHUB_TOKEN"}                         │
│      }                                                                  │
│    }                                                                    │
│  }                                                                      │
└─────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
                        ┌─────────────────────────┐
                        │  MultiServerMCPClient   │
                        │ (langchain-mcp-adapters)│
                        └────────────┬────────────┘
                                     │
                ┌────────────────────┼────────────────────┐
                │                    │                    │
                ▼                    ▼                    ▼
          ┌───────────┐        ┌───────────┐        ┌───────────┐
          │   stdio   │        │    SSE    │        │   HTTP    │
          │ transport │        │ transport │        │ transport │
          └───────────┘        └───────────┘        └───────────┘
```

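Before the server entries reach the MCP client, disabled servers have to be dropped and `$VAR` placeholders replaced with real values. The sketch below shows one way to do that with the standard library. Both function names are hypothetical, and raising `KeyError` for an unset variable is an assumed policy.

```python
from typing import Any, Dict, Mapping


def resolve_env_refs(value: Any, environ: Mapping[str, str]) -> Any:
    """Recursively replace strings of the form "$NAME" with environ["NAME"]."""
    if isinstance(value, str) and value.startswith("$"):
        return environ[value[1:]]  # KeyError if the variable is unset (assumed)
    if isinstance(value, dict):
        return {k: resolve_env_refs(v, environ) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_env_refs(v, environ) for v in value]
    return value


def enabled_servers(config: Dict[str, Any], environ: Mapping[str, str]) -> Dict[str, Any]:
    """Return only servers with "enabled": true, with env references resolved."""
    servers = config.get("mcpServers", {})
    return {
        name: resolve_env_refs(server, environ)
        for name, server in servers.items()
        if server.get("enabled", False)
    }
```

In real use `environ` would be `os.environ`. Passing it in explicitly keeps the function easy to test and keeps secrets out of the JSON file.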
### Skills System

```
┌─────────────────────────────────────────────────────────────────────────┐
│                              Skills System                              │
│                         (src/skills/loader.py)                          │
└─────────────────────────────────────────────────────────────────────────┘

Directory Structure:
┌─────────────────────────────────────────────────────────────────────────┐
│  skills/                                                                │
│  ├── public/                       # Public skills (committed)          │
│  │   ├── pdf-processing/                                                │
│  │   │   └── SKILL.md                                                   │
│  │   ├── frontend-design/                                               │
│  │   │   └── SKILL.md                                                   │
│  │   └── ...                                                            │
│  └── custom/                       # Custom skills (gitignored)         │
│      └── user-installed/                                                │
│          └── SKILL.md                                                   │
└─────────────────────────────────────────────────────────────────────────┘

SKILL.md Format:
┌─────────────────────────────────────────────────────────────────────────┐
│  ---                                                                    │
│  name: PDF Processing                                                   │
│  description: Handle PDF documents efficiently                          │
│  license: MIT                                                           │
│  allowed-tools:                                                         │
│    - read_file                                                          │
│    - write_file                                                         │
│    - bash                                                               │
│  ---                                                                    │
│                                                                         │
│  # Skill Instructions                                                   │
│  Content injected into system prompt...                                 │
└─────────────────────────────────────────────────────────────────────────┘
```

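A `SKILL.md` file is front matter between two `---` lines followed by a Markdown body. The sketch below parses the subset shown in the example (scalar `key: value` pairs and simple `- item` lists) using only the standard library. The real loader may well use a YAML parser; `parse_skill_md` and `SAMPLE` are illustrative names.

```python
from typing import Dict, List, Tuple, Union

SAMPLE = """---
name: PDF Processing
description: Handle PDF documents efficiently
license: MIT
allowed-tools:
  - read_file
  - write_file
  - bash
---

# Skill Instructions
Content injected into system prompt...
"""


def parse_skill_md(text: str) -> Tuple[Dict[str, Union[str, List[str]]], str]:
    """Split SKILL.md text into (metadata, body). Handles only flat keys and lists."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing front matter")
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("unterminated front matter") from None

    meta: Dict[str, Union[str, List[str]]] = {}
    current_list = None  # list being filled by "- item" lines, if any
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- ") and current_list is not None:
            current_list.append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        value = value.strip()
        if value:
            meta[key.strip()] = value
            current_list = None
        else:  # "key:" with no value starts a list
            current_list = []
            meta[key.strip()] = current_list
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body
```

Calling `parse_skill_md(SAMPLE)` returns the metadata dictionary and the instruction text that would be injected into the system prompt.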
### Request Flow

```
┌─────────────────────────────────────────────────────────────────────────┐
│                          Request Flow Example                           │
│                       User sends message to agent                       │
└─────────────────────────────────────────────────────────────────────────┘

1. Client → Nginx
   POST /api/langgraph/threads/{thread_id}/runs
   {"input": {"messages": [{"role": "user", "content": "Hello"}]}}

2. Nginx → LangGraph Server (2024)
   Proxied to LangGraph server

3. LangGraph Server
   a. Load/create thread state
   b. Execute middleware chain:
      - ThreadDataMiddleware: Set up paths
      - UploadsMiddleware: Inject file list
      - SandboxMiddleware: Acquire sandbox
      - SummarizationMiddleware: Check token limits
      - TitleMiddleware: Generate title if needed
      - TodoListMiddleware: Load todos (if plan mode)
      - ViewImageMiddleware: Process images
      - ClarificationMiddleware: Check for clarifications

   c. Execute agent:
      - Model processes messages
      - May call tools (bash, web_search, etc.)
      - Tools execute via sandbox
      - Results added to messages

   d. Stream response via SSE

4. Client receives streaming response
```

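Step 1 of the flow can be expressed as a plain HTTP request object. The sketch below only builds the request with the standard library and does not send it. The `http://localhost:2026` base URL and the `build_run_request` helper are assumptions, while the path and payload shape come from the flow above.

```python
import json
from urllib.request import Request


def build_run_request(base_url: str, thread_id: str, text: str) -> Request:
    """Build the POST request for a new run on a thread (not sent here)."""
    payload = {"input": {"messages": [{"role": "user", "content": text}]}}
    return Request(
        url=f"{base_url}/api/langgraph/threads/{thread_id}/runs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical usage; passing the result to urllib.request.urlopen would send it.
request = build_run_request("http://localhost:2026", "thread-123", "Hello")
```

Because the request goes to the Nginx port, the `/api/langgraph/` prefix is what routes it to the LangGraph server.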
## Data Flow

### File Upload Flow

```
1. Client uploads file
   POST /api/threads/{thread_id}/uploads
   Content-Type: multipart/form-data

2. Gateway receives file
   - Validates file
   - Stores in .deer-flow/threads/{thread_id}/user-data/uploads/
   - If document: converts to Markdown via markitdown

3. Returns response
   {
     "files": [{
       "filename": "doc.pdf",
       "path": ".deer-flow/.../uploads/doc.pdf",
       "virtual_path": "/mnt/user-data/uploads/doc.pdf",
       "artifact_url": "/api/threads/.../artifacts/mnt/.../doc.pdf"
     }]
   }

4. Next agent run
   - UploadsMiddleware lists files
   - Injects file list into messages
   - Agent can access via virtual_path
```

### Configuration Reload

```
1. Client updates MCP config
   PUT /api/mcp/config

2. Gateway writes extensions_config.json
   - Updates mcpServers section
   - File mtime changes

3. MCP Manager detects change
   - get_cached_mcp_tools() checks mtime
   - If changed: reinitializes MCP client
   - Loads updated server configurations

4. Next agent run uses new tools
```

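The reload in step 3 depends on comparing the file's modification time with the one seen at the last load. The sketch below shows that pattern in isolation. `MtimeCache` and `load_mcp_servers` are hypothetical names, and the real `get_cached_mcp_tools()` may differ in detail.

```python
import json
import os
from typing import Any, Callable, Optional


class MtimeCache:
    """Cache a value derived from a file; reload only when its mtime changes."""

    def __init__(self, path: str, loader: Callable[[str], Any]) -> None:
        self.path = path
        self.loader = loader
        self._mtime_ns: Optional[int] = None
        self._value: Any = None

    def get(self) -> Any:
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns != self._mtime_ns:  # first call, or file rewritten since
            self._value = self.loader(self.path)
            self._mtime_ns = mtime_ns
        return self._value


def load_mcp_servers(path: str) -> dict:
    """Read the mcpServers section of an extensions_config.json-style file."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh).get("mcpServers", {})
```

An `os.stat` call per lookup is cheap compared with re-creating MCP client connections, which is why the check can run on every agent run.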
## Security Considerations

### Sandbox Isolation

- Agent code executes within sandbox boundaries
- Local sandbox: Direct execution (development only)
- Docker sandbox: Container isolation (production recommended)
- Path traversal prevention in file operations

### API Security

- Thread isolation: Each thread has separate data directories
- File validation: Uploads checked for path safety
- Environment variable resolution: Secrets not stored in config

### MCP Security

- Each MCP server runs in its own process
- Environment variables resolved at runtime
- Servers can be enabled/disabled independently

## Performance Considerations

### Caching

- MCP tools cached with file mtime invalidation
- Configuration loaded once, reloaded on file change
- Skills parsed once at startup, cached in memory

### Streaming

- SSE used for real-time response streaming
- Reduces time to first token
- Enables progress visibility for long operations

### Context Management

- Summarization middleware reduces context when limits approached
- Configurable triggers: tokens, messages, or fraction
- Preserves recent messages while summarizing older ones

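The three trigger kinds listed above (tokens, messages, fraction) can be combined into a single check. The sketch below is one possible reading, not the middleware's actual logic. The `Trigger` fields, the "any trigger fires" rule, and taking `fraction` as a share of the model's context window are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Trigger:
    """Thresholds that start summarization; None disables a threshold."""
    tokens: Optional[int] = None      # absolute token count
    messages: Optional[int] = None    # number of messages
    fraction: Optional[float] = None  # share of the context window, 0..1


def should_summarize(trigger: Trigger, token_count: int,
                     message_count: int, context_window: int) -> bool:
    """Return True if any configured threshold has been reached."""
    if trigger.tokens is not None and token_count >= trigger.tokens:
        return True
    if trigger.messages is not None and message_count >= trigger.messages:
        return True
    if trigger.fraction is not None and token_count >= trigger.fraction * context_window:
        return True
    return False


def split_history(messages: List[str], keep_recent: int) -> Tuple[List[str], List[str]]:
    """Split into (older messages to summarize, recent messages to keep)."""
    if keep_recent <= 0:
        return list(messages), []
    return messages[:-keep_recent], messages[-keep_recent:]
```

The older half would be replaced by a summary while the recent half is passed through unchanged, which matches the "preserves recent messages" behavior described above.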