docs: revise backend README and CLAUDE.md to reflect full architecture

Updated documentation to accurately cover all backend subsystems including subagents, memory, middleware chain, sandbox, MCP, skills, and gateway API. Fixed broken MCP_SETUP.md link in root README. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 06:12:14 +08:00 · 2026-02-08 22:49:36 +08:00
parent 010aba1e28
commit 2703eb0b22
3 changed files with 395 additions and 371 deletions
--- a/README.md
+++ b/README.md
@@ -72,7 +72,7 @@ See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed local development guide.
 - [Contributing Guide](CONTRIBUTING.md) - Development environment setup and workflow
 - [Configuration Guide](backend/docs/CONFIGURATION.md) - Setup and configuration instructions
 - [Architecture Overview](backend/CLAUDE.md) - Technical architecture details
- [MCP Setup Guide](MCP_SETUP.md) - Configure Model Context Protocol servers for additional tools
+- [Backend Architecture](backend/README.md) - Backend architecture and API reference

 ## Contributing

--- a/backend/CLAUDE.md
+++ b/backend/CLAUDE.md
@@ -4,11 +4,11 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

 ## Project Overview

-DeerFlow is a LangGraph-based AI agent system with a full-stack architecture. The backend provides a "super agent" with sandbox execution capabilities that can execute code, browse the web, and manage files in isolated environments.
+DeerFlow is a LangGraph-based AI super agent system with a full-stack architecture. The backend provides a "super agent" with sandbox execution, persistent memory, subagent delegation, and extensible tool integration - all operating in per-thread isolated environments.

 **Architecture**:
 - **LangGraph Server** (port 2024): Agent runtime and workflow execution
- **Gateway API** (port 8001): REST API for models, MCP, skills, artifacts, and uploads
+- **Gateway API** (port 8001): REST API for models, MCP, skills, memory, artifacts, and uploads
 - **Frontend** (port 3000): Next.js web interface
 - **Nginx** (port 2026): Unified reverse proxy entry point

@@ -16,24 +16,39 @@ DeerFlow is a LangGraph-based AI agent system with a full-stack architecture. Th
 ```
 deer-flow/
 ├── Makefile                    # Root commands (check, install, dev, stop)
-├── nginx.conf                  # Nginx reverse proxy configuration
 ├── config.yaml                 # Main application configuration
 ├── extensions_config.json      # MCP servers and skills configuration
 ├── backend/                    # Backend application (this directory)
 │   ├── Makefile               # Backend-only commands (dev, gateway, lint)
+│   ├── langgraph.json         # LangGraph server configuration
 │   ├── src/
-│   │   ├── agents/            # LangGraph agents and workflows
+│   │   ├── agents/            # LangGraph agent system
+│   │   │   ├── lead_agent/    # Main agent (factory + system prompt)
+│   │   │   ├── middlewares/   # 9 middleware components
+│   │   │   ├── memory/        # Memory extraction, queue, prompts
+│   │   │   └── thread_state.py # ThreadState schema
 │   │   ├── gateway/           # FastAPI Gateway API
+│   │   │   ├── app.py         # FastAPI application
+│   │   │   └── routers/       # 6 route modules
 │   │   ├── sandbox/           # Sandbox execution system
-│   │   ├── tools/             # Agent tools
-│   │   ├── mcp/               # MCP integration
-│   │   ├── models/            # Model factory
-│   │   ├── skills/            # Skills loading and management
-│   │   ├── config/            # Configuration system
-│   │   ├── community/         # Community tools (web search, etc.)
-│   │   ├── reflection/        # Dynamic module loading
-│   │   └── utils/             # Utilities
-│   └── langgraph.json         # LangGraph server configuration
+│   │   │   ├── local/         # Local filesystem provider
+│   │   │   ├── sandbox.py     # Abstract Sandbox interface
+│   │   │   ├── tools.py       # bash, ls, read/write/str_replace
+│   │   │   └── middleware.py  # Sandbox lifecycle management
+│   │   ├── subagents/         # Subagent delegation system
+│   │   │   ├── builtins/      # general-purpose, bash agents
+│   │   │   ├── executor.py    # Background execution engine
+│   │   │   └── registry.py    # Agent registry
+│   │   ├── tools/builtins/    # Built-in tools (present_files, ask_clarification, view_image)
+│   │   ├── mcp/               # MCP integration (tools, cache, client)
+│   │   ├── models/            # Model factory with thinking/vision support
+│   │   ├── skills/            # Skills discovery, loading, parsing
+│   │   ├── config/            # Configuration system (app, model, sandbox, tool, etc.)
+│   │   ├── community/         # Community tools (tavily, jina_ai, firecrawl, image_search, aio_sandbox)
+│   │   ├── reflection/        # Dynamic module loading (resolve_variable, resolve_class)
+│   │   └── utils/             # Utilities (network, readability)
+│   ├── tests/                 # Test suite
+│   └── docs/                  # Documentation
 ├── frontend/                   # Next.js frontend application
 └── skills/                     # Agent skills directory
    ├── public/                # Public skills (committed)
@@ -55,52 +70,61 @@ When making code changes, you MUST update the relevant documentation:

 **Root directory** (for full application):
 ```bash
-# Check system requirements
-make check
-
-# Install all dependencies (frontend + backend)
-make install
-
-# Start all services (LangGraph + Gateway + Frontend + Nginx)
-make dev
-
-# Stop all services
-make stop
+make check      # Check system requirements
+make install    # Install all dependencies (frontend + backend)
+make dev        # Start all services (LangGraph + Gateway + Frontend + Nginx)
+make stop       # Stop all services
 ```

 **Backend directory** (for backend development only):
 ```bash
-# Install backend dependencies
-make install
-
-# Run LangGraph server only (port 2024)
-make dev
-
-# Run Gateway API only (port 8001)
-make gateway
-
-# Lint
-make lint
-
-# Format code
-make format
+make install    # Install backend dependencies
+make dev        # Run LangGraph server only (port 2024)
+make gateway    # Run Gateway API only (port 8001)
+make lint       # Lint with ruff
+make format     # Format code with ruff
 ```

 ## Architecture

-### Configuration System
+### Agent System

-The app uses a two-tier YAML/JSON-based configuration system.
+**Lead Agent** (`src/agents/lead_agent/agent.py`):
+- Entry point: `make_lead_agent(config: RunnableConfig)` registered in `langgraph.json`
+- Dynamic model selection via `create_chat_model()` with thinking/vision support
+- Tools loaded via `get_available_tools()` - combines sandbox, built-in, MCP, community, and subagent tools
+- System prompt generated by `apply_prompt_template()` with skills, memory, and subagent instructions
+
+**ThreadState** (`src/agents/thread_state.py`):
+- Extends `AgentState` with: `sandbox`, `thread_data`, `title`, `artifacts`, `todos`, `uploaded_files`, `viewed_images`
+- Uses custom reducers: `merge_artifacts` (deduplicate), `merge_viewed_images` (merge/clear)
+
+**Runtime Configuration** (via `config.configurable`):
+- `thinking_enabled` - Enable model's extended thinking
+- `model_name` - Select specific LLM model
+- `is_plan_mode` - Enable TodoList middleware
+- `subagent_enabled` - Enable task delegation tool
+
+### Middleware Chain
+
+Middlewares execute in strict order in `src/agents/lead_agent/agent.py`:
+
+1. **ThreadDataMiddleware** - Creates per-thread directories (`backend/.deer-flow/threads/{thread_id}/user-data/{workspace,uploads,outputs}`)
+2. **UploadsMiddleware** - Tracks and injects newly uploaded files into conversation
+3. **SandboxMiddleware** - Acquires sandbox, stores `sandbox_id` in state
+4. **SummarizationMiddleware** - Context reduction when approaching token limits (optional, if enabled)
+5. **TodoListMiddleware** - Task tracking with `write_todos` tool (optional, if plan_mode)
+6. **TitleMiddleware** - Auto-generates thread title after first complete exchange
+7. **MemoryMiddleware** - Queues conversations for async memory update (filters to user + final AI responses)
+8. **ViewImageMiddleware** - Injects base64 image data before LLM call (conditional on vision support)
+9. **ClarificationMiddleware** - Intercepts `ask_clarification` tool calls, interrupts via `Command(goto=END)` (must be last)
+
+### Configuration System

 **Main Configuration** (`config.yaml`):

 Setup: Copy `config.example.yaml` to `config.yaml` in the **project root** directory.

-```bash
-# From project root (deer-flow/)
-cp config.example.yaml config.yaml
-```
-
 Configuration priority:
 1. Explicit `config_path` argument
 2. `DEER_FLOW_CONFIG_PATH` environment variable
@@ -113,185 +137,149 @@ Config values starting with `$` are resolved as environment variables (e.g., `$O

 MCP servers and skills are configured together in `extensions_config.json` in project root:

-```bash
-# From project root (deer-flow/)
-cp extensions_config.example.json extensions_config.json
-```
-
 Configuration priority:
 1. Explicit `config_path` argument
 2. `DEER_FLOW_EXTENSIONS_CONFIG_PATH` environment variable
 3. `extensions_config.json` in current directory (backend/)
 4. `extensions_config.json` in parent directory (project root - **recommended location**)
-5. For backward compatibility: `mcp_config.json` (will be deprecated)

-### Core Components
+### Gateway API (`src/gateway/`)

-**Gateway API** (`src/gateway/`)
- FastAPI application that provides REST endpoints for frontend integration
- Endpoints:
-  - `/api/models` - List available LLM models from configuration
-  - `/api/mcp` - Manage MCP server configurations (GET, POST)
-  - `/api/skills` - Manage skill configurations (GET, POST)
-  - `/api/threads/{thread_id}/artifacts/*` - Serve agent-generated artifacts
-  - `/api/threads/{thread_id}/uploads` - File upload, list, delete
- Works alongside LangGraph server, handling non-agent HTTP operations
- Proxied through nginx under `/api/*` routes (except `/api/langgraph/*`)
+FastAPI application on port 8001 with health check at `GET /health`.

-**Agent Graph** (`src/agents/`)
- `lead_agent` is the main entry point registered in `langgraph.json`
- Uses `ThreadState` which extends `AgentState` with:
-  - `sandbox`: Sandbox environment info
-  - `artifacts`: Generated file paths
-  - `thread_data`: Workspace/uploads/outputs paths
-  - `title`: Auto-generated conversation title
-  - `todos`: Task tracking (plan mode)
-  - `viewed_images`: Vision model image data
- Agent is created via `make_lead_agent(config)` with model, tools, middleware, and system prompt
+**Routers**:

-**Sandbox System** (`src/sandbox/`)
- Abstract `Sandbox` base class defines interface: `execute_command`, `read_file`, `write_file`, `list_dir`
- `SandboxProvider` manages sandbox lifecycle: `acquire`, `get`, `release`
- `SandboxMiddleware` automatically acquires sandbox on agent start and injects into state
- `LocalSandboxProvider` is a singleton implementation for local execution
- `AioSandboxProvider` provides Docker-based isolation (in `src/community/`)
- Sandbox tools (`bash`, `ls`, `read_file`, `write_file`, `str_replace`) extract sandbox from tool runtime
+| Router | Endpoints |
+|--------|-----------|
+| **Models** (`/api/models`) | `GET /` - list models; `GET /{name}` - model details |
+| **MCP** (`/api/mcp`) | `GET /config` - get config; `PUT /config` - update config (saves to extensions_config.json) |
+| **Skills** (`/api/skills`) | `GET /` - list skills; `GET /{name}` - details; `PUT /{name}` - update enabled; `POST /install` - install from .skill archive |
+| **Memory** (`/api/memory`) | `GET /` - memory data; `POST /reload` - force reload; `GET /config` - config; `GET /status` - config + data |
+| **Uploads** (`/api/threads/{id}/uploads`) | `POST /` - upload files (auto-converts PDF/PPT/Excel/Word); `GET /list` - list; `DELETE /{filename}` - delete |
+| **Artifacts** (`/api/threads/{id}/artifacts`) | `GET /{path}` - serve artifacts; `?download=true` for download with citation removal |
+
+Proxied through nginx: `/api/langgraph/*` → LangGraph, all other `/api/*` → Gateway.
+
+### Sandbox System (`src/sandbox/`)
+
+**Interface**: Abstract `Sandbox` with `execute_command`, `read_file`, `write_file`, `list_dir`
+**Provider Pattern**: `SandboxProvider` with `acquire`, `get`, `release` lifecycle
+**Implementations**:
+- `LocalSandboxProvider` - Singleton local filesystem execution with path mappings
+- `AioSandboxProvider` (`src/community/`) - Docker-based isolation

 **Virtual Path System**:
- Paths map between virtual and physical locations
- Virtual: `/mnt/user-data/{workspace,uploads,outputs}` - used by agent
- Physical: `backend/.deer-flow/threads/{thread_id}/user-data/{workspace,uploads,outputs}`
- Skills path: `/mnt/skills` maps to `deer-flow/skills/`
+- Agent sees: `/mnt/user-data/{workspace,uploads,outputs}`, `/mnt/skills`
+- Physical: `backend/.deer-flow/threads/{thread_id}/user-data/...`, `deer-flow/skills/`
+- Translation: `replace_virtual_path()` / `replace_virtual_paths_in_command()`
+- Detection: `is_local_sandbox()` checks `sandbox_id == "local"`

-**Model Factory** (`src/models/factory.py`)
- `create_chat_model()` instantiates LLM from config using reflection
+**Sandbox Tools** (in `src/sandbox/tools.py`):
+- `bash` - Execute commands with path translation and error handling
+- `ls` - Directory listing (tree format, max 2 levels)
+- `read_file` - Read file contents with optional line range
+- `write_file` - Write/append to files, creates directories
+- `str_replace` - Substring replacement (single or all occurrences)
+
+### Subagent System (`src/subagents/`)
+
+**Built-in Agents**: `general-purpose` (all tools except `task`) and `bash` (command specialist)
+**Execution**: Dual thread pool - `_scheduler_pool` (3 workers) + `_execution_pool` (3 workers)
+**Concurrency**: `MAX_CONCURRENT_SUBAGENTS = 3` per trace, 15-minute timeout
+**Flow**: `task()` tool → `SubagentExecutor` → background thread → poll 5s → SSE events → result
+**Events**: `task_started`, `task_running`, `task_completed`/`task_failed`/`task_timed_out`
+
+### Tool System (`src/tools/`)
+
+`get_available_tools(groups, include_mcp, model_name, subagent_enabled)` assembles:
+1. **Config-defined tools** - Resolved from `config.yaml` via `resolve_variable()`
+2. **MCP tools** - From enabled MCP servers (lazy initialized, cached with mtime invalidation)
+3. **Built-in tools**:
+   - `present_files` - Make output files visible to user (only `/mnt/user-data/outputs`)
+   - `ask_clarification` - Request clarification (intercepted by ClarificationMiddleware → interrupts)
+   - `view_image` - Read image as base64 (added only if model supports vision)
+4. **Subagent tool** (if enabled):
+   - `task` - Delegate to subagent (description, prompt, subagent_type, max_turns)
+
+**Community tools** (`src/community/`):
+- `tavily/` - Web search (5 results default) and web fetch (4KB limit)
+- `jina_ai/` - Web fetch via Jina reader API with readability extraction
+- `firecrawl/` - Web scraping via Firecrawl API
+- `image_search/` - Image search via DuckDuckGo
+
+### MCP System (`src/mcp/`)
+
+- Uses `langchain-mcp-adapters` `MultiServerMCPClient` for multi-server management
+- **Lazy initialization**: Tools loaded on first use via `get_cached_mcp_tools()`
+- **Cache invalidation**: Detects config file changes via mtime comparison
+- **Transports**: stdio (command-based), SSE, HTTP
+- **Runtime updates**: Gateway API saves to extensions_config.json; LangGraph detects via mtime
+
+### Skills System (`src/skills/`)
+
+- **Location**: `deer-flow/skills/{public,custom}/`
+- **Format**: Directory with `SKILL.md` (YAML frontmatter: name, description, license, allowed-tools)
+- **Loading**: `load_skills()` scans directories, parses SKILL.md, reads enabled state from extensions_config.json
+- **Injection**: Enabled skills listed in agent system prompt with container paths
+- **Installation**: `POST /api/skills/install` extracts .skill ZIP archive to custom/ directory
+
+### Model Factory (`src/models/factory.py`)
+
+- `create_chat_model(name, thinking_enabled)` instantiates LLM from config via reflection
 - Supports `thinking_enabled` flag with per-model `when_thinking_enabled` overrides
 - Supports `supports_vision` flag for image understanding models
+- Config values starting with `$` resolved as environment variables

-**Tool System** (`src/tools/`)
- Tools defined in config with `use` path (e.g., `src.sandbox.tools:bash_tool`)
- `get_available_tools()` resolves tool paths via reflection
- Built-in tools in `src/tools/builtins/`:
-  - `present_file_tool` - Display files to users
-  - `ask_clarification_tool` - Request clarification
-  - `view_image_tool` - Vision model integration (conditional on model capability)
- Community tools in `src/community/`: Jina AI (web fetch), Tavily (web search), Firecrawl (scraping)
- Supports MCP (Model Context Protocol) for pluggable external tools
+### Memory System (`src/agents/memory/`)

-**MCP System** (`src/mcp/`)
- Integrates with MCP servers to provide pluggable external tools using `langchain-mcp-adapters`
- Uses `MultiServerMCPClient` from langchain-mcp-adapters for multi-server management
- **Automatic initialization**: Tools are loaded on first use with lazy initialization
- Supports both eager loading (FastAPI startup) and lazy loading (LangGraph Studio)
- `initialize_mcp_tools()` can be called in FastAPI lifespan handler for eager loading
- `get_cached_mcp_tools()` automatically initializes tools if not already loaded
- Each server can be enabled/disabled independently via `enabled` flag
- Support types: stdio (command-based), SSE, HTTP
- Built on top of langchain-ai/langchain-mcp-adapters for seamless integration
+**Components**:
+- `updater.py` - LLM-based memory updates with fact extraction and atomic file I/O
+- `queue.py` - Debounced update queue (per-thread deduplication, configurable wait time)
+- `prompt.py` - Prompt templates for memory updates

-**Reflection System** (`src/reflection/`)
- `resolve_variable()` imports module and returns variable (e.g., `module:variable`)
- `resolve_class()` imports and validates class against base class
+**Data Structure** (stored in `backend/.deer-flow/memory.json`):
+- **User Context**: `workContext`, `personalContext`, `topOfMind` (1-3 sentence summaries)
+- **History**: `recentMonths`, `earlierContext`, `longTermBackground`
+- **Facts**: Discrete facts with `id`, `content`, `category` (preference/knowledge/context/behavior/goal), `confidence` (0-1), `createdAt`, `source`

-**Skills System** (`src/skills/`)
- Skills provide specialized workflows for specific tasks (e.g., PDF processing, frontend design)
- Located in `deer-flow/skills/{public,custom}` directory structure
- Each skill has a `SKILL.md` file with YAML front matter (name, description, license, allowed-tools)
- Skills are automatically discovered and loaded at runtime
- `load_skills()` scans directories and parses SKILL.md files
- Skills are injected into agent's system prompt with paths (only enabled skills)
- Path mapping system allows seamless access in both local and Docker sandbox
- Each skill can be enabled/disabled independently via `enabled` flag in extensions config
+**Workflow**:
+1. `MemoryMiddleware` filters messages (user inputs + final AI responses) and queues conversation
+2. Queue debounces (30s default), batches updates, deduplicates per-thread
+3. Background thread invokes LLM to extract context updates and facts
+4. Applies updates atomically (temp file + rename) with cache invalidation
+5. Next interaction injects top 15 facts + context into `<memory>` tags in system prompt

-**Middleware System** (`src/agents/middlewares/`)
- Custom middlewares handle cross-cutting concerns
- Middlewares are registered in `src/agents/lead_agent/agent.py` with execution order:
-  1. `ThreadDataMiddleware` - Initializes thread context (workspace, uploads, outputs paths)
-  2. `UploadsMiddleware` - Processes uploaded files, injects file list into state
-  3. `SandboxMiddleware` - Manages sandbox lifecycle, acquires on start
-  4. `SummarizationMiddleware` - Reduces context when token limits approached (if enabled)
-  5. `TitleMiddleware` - Generates conversation titles
-  6. `TodoListMiddleware` - Tracks multi-step tasks (if plan_mode enabled)
-  7. `ViewImageMiddleware` - Injects image details for vision models
-  8. `MemoryMiddleware` - Automatic context retention and personalization (if enabled)
-  9. `ClarificationMiddleware` - Handles clarification requests (must be last)
+**Configuration** (`config.yaml` → `memory`):
+- `enabled` / `injection_enabled` - Master switches
+- `storage_path` - Path to memory.json
+- `debounce_seconds` - Wait time before processing (default: 30)
+- `model_name` - LLM for updates (null = default model)
+- `max_facts` / `fact_confidence_threshold` - Fact storage limits (100 / 0.7)
+- `max_injection_tokens` - Token limit for prompt injection (2000)

-**Memory System** (`src/agents/memory/`)
- LLM-powered personalization layer that automatically extracts and stores user context across conversations
- Components:
-  - `updater.py` - LLM-based memory updates with fact extraction and file I/O
-  - `queue.py` - Debounced update queue for batching and performance optimization
-  - `prompt.py` - Prompt templates and formatting utilities for memory updates
- `MemoryMiddleware` (`src/agents/middlewares/memory_middleware.py`) - Queues conversations for memory updates
- Gateway API (`src/gateway/routers/memory.py`) - REST endpoints for memory management
- Storage: JSON file at `backend/.deer-flow/memory.json`
+### Reflection System (`src/reflection/`)

-**Memory Data Structure**:
- **User Context** (current state):
-  - `workContext` - Work-related information (job, projects, technologies)
-  - `personalContext` - Preferences, communication style, background
-  - `topOfMind` - Current focus areas and immediate priorities
- **History** (temporal context):
-  - `recentMonths` - Recent activities and discussions
-  - `earlierContext` - Important historical context
-  - `longTermBackground` - Persistent background information
- **Facts** (structured knowledge):
-  - Discrete facts with categories: `preference`, `knowledge`, `context`, `behavior`, `goal`
-  - Each fact includes: `id`, `content`, `category`, `confidence` (0-1), `createdAt`, `source` (thread ID)
-  - Confidence threshold (default 0.7) filters low-quality facts
-  - Max facts limit (default 100) keeps highest-confidence facts
-
-**Memory Workflow**:
-1. **Post-Interaction**: `MemoryMiddleware` filters messages (user inputs + final AI responses only) and queues conversation
-2. **Debounced Processing**: Queue waits 30s (configurable), batches multiple updates, resets timer on new updates
-3. **LLM-Based Update**: Background thread loads memory, formats conversation, invokes LLM to extract:
-   - Updated context summaries (1-3 sentences each)
-   - New facts with confidence scores and categories
-   - Facts to remove (contradictions)
-4. **Storage**: Applies updates atomically to `memory.json` with cache invalidation (mtime-based)
-5. **Injection**: Next interaction loads memory, formats top 15 facts + context, injects into `<memory>` tags in system prompt
-
-**Memory API Endpoints** (`/api/memory`):
- `GET /api/memory` - Retrieve current memory data
- `POST /api/memory/reload` - Force reload from file (invalidates cache)
- `GET /api/memory/config` - Get memory configuration
- `GET /api/memory/status` - Get both config and data
+- `resolve_variable(path)` - Import module and return variable (e.g., `module.path:variable_name`)
+- `resolve_class(path, base_class)` - Import and validate class against base class

 ### Config Schema

-Models, tools, sandbox providers, skills, and middleware settings are configured in `config.yaml`:
- `models[]`: LLM configurations with `use` class path, `supports_thinking`, `supports_vision`
- `tools[]`: Tool configurations with `use` variable path and `group`
- `tool_groups[]`: Logical groupings for tools
- `sandbox.use`: Sandbox provider class path
- `skills.path`: Host path to skills directory (optional, default: `../skills`)
- `skills.container_path`: Container mount path (default: `/mnt/skills`)
- `title`: Automatic thread title generation configuration
- `summarization`: Automatic conversation summarization configuration
- `subagents`: Subagent (task tool) configuration
-  - `enabled`: Master switch to enable/disable subagents (boolean, default: true)
- `memory`: Memory system configuration
-  - `enabled`: Master switch (boolean)
-  - `storage_path`: Path to memory.json file (relative to backend/)
-  - `debounce_seconds`: Wait time before processing updates (default: 30)
-  - `model_name`: LLM model for memory updates (null = use default model)
-  - `max_facts`: Maximum facts to store (default: 100)
-  - `fact_confidence_threshold`: Minimum confidence to store fact (default: 0.7)
-  - `injection_enabled`: Inject memory into system prompt (boolean)
-  - `max_injection_tokens`: Token limit for memory injection (default: 2000)
+**`config.yaml`** key sections:
+- `models[]` - LLM configs with `use` class path, `supports_thinking`, `supports_vision`, provider-specific fields
+- `tools[]` - Tool configs with `use` variable path and `group`
+- `tool_groups[]` - Logical groupings for tools
+- `sandbox.use` - Sandbox provider class path
+- `skills.path` / `skills.container_path` - Host and container paths to skills directory
+- `title` - Auto-title generation (enabled, max_words, max_chars, prompt_template)
+- `summarization` - Context summarization (enabled, trigger conditions, keep policy)
+- `subagents.enabled` - Master switch for subagent delegation
+- `memory` - Memory system (enabled, storage_path, debounce_seconds, model_name, max_facts, fact_confidence_threshold, injection_enabled, max_injection_tokens)

-**Extensions Configuration Schema** (`extensions_config.json`):
- `mcpServers`: Map of MCP server name to configuration
-  - `enabled`: Whether the server is enabled (boolean)
-  - `type`: Transport type (`stdio`, `sse`, `http`)
-  - `command`: Command to execute (for stdio type)
-  - `args`: Arguments to pass to the command (array)
-  - `env`: Environment variables (object with `$VAR` support)
-  - `description`: Human-readable description
- `skills`: Map of skill name to state configuration
-  - `enabled`: Whether the skill is enabled (boolean, default: true if not specified)
+**`extensions_config.json`**:
+- `mcpServers` - Map of server name → config (enabled, type, command, args, env, url, headers, description)
+- `skills` - Map of skill name → state (enabled)

-Both MCP servers and skills can be modified at runtime via API endpoints.
+Both can be modified at runtime via Gateway API endpoints.

 ## Development Workflow

@@ -305,13 +293,13 @@ make dev
 This starts all services and makes the application available at `http://localhost:2026`.

 **Nginx routing**:
- `/api/langgraph/*` → LangGraph Server (2024) - Agent interactions, threads, streaming
- `/api/*` (other) → Gateway API (8001) - Models, MCP, skills, artifacts, uploads
- `/` (non-API) → Frontend (3000) - Web interface
+- `/api/langgraph/*` → LangGraph Server (2024)
+- `/api/*` (other) → Gateway API (8001)
+- `/` (non-API) → Frontend (3000)

 ### Running Backend Services Separately

-For backend-only development, from the **backend** directory:
+From the **backend** directory:

 ```bash
 # Terminal 1: LangGraph server
@@ -337,21 +325,20 @@ When using `make dev` from root, the frontend automatically connects through ngi

 ### File Upload

-The backend supports multi-file upload with automatic document conversion:
+Multi-file upload with automatic document conversion:
 - Endpoint: `POST /api/threads/{thread_id}/uploads`
- Supports: PDF, PPT, Excel, Word documents
- Auto-converts documents to Markdown using `markitdown`
+- Supports: PDF, PPT, Excel, Word documents (converted via `markitdown`)
 - Files stored in thread-isolated directories
- Agent automatically receives uploaded file list via `UploadsMiddleware`
+- Agent receives uploaded file list via `UploadsMiddleware`

 See [docs/FILE_UPLOAD.md](docs/FILE_UPLOAD.md) for details.

 ### Plan Mode

-Enable TodoList middleware for complex multi-step tasks:
+TodoList middleware for complex multi-step tasks:
 - Controlled via runtime config: `config.configurable.is_plan_mode = True`
 - Provides `write_todos` tool for task tracking
- Agent can break down complex tasks and track progress
+- One task in_progress at a time, real-time updates

 See [docs/plan_mode_usage.md](docs/plan_mode_usage.md) for details.

@@ -369,30 +356,7 @@ See [docs/summarization.md](docs/summarization.md) for details.
 For models with `supports_vision: true`:
 - `ViewImageMiddleware` processes images in conversation
 - `view_image_tool` added to agent's toolset
- Images automatically converted and injected into state
-
-### Memory System
-
-Persistent context retention and personalization across conversations:
- **Automatic Extraction**: LLM analyzes conversations to extract user context, facts, and preferences
- **Structured Storage**: Maintains user context, history, and confidence-scored facts in JSON format
- **Smart Filtering**: Only processes meaningful messages (user inputs + final AI responses)
- **Debounced Updates**: Batches updates to minimize LLM calls (configurable wait time)
- **System Prompt Injection**: Automatically injects relevant memory context into agent prompts
- **Cache Optimization**: File modification time-based cache invalidation for external edits
- **Thread Safety**: Locks protect queue and cache for concurrent access
- **REST API**: Full CRUD operations via `/api/memory` endpoints
- **Frontend Integration**: Memory settings page for viewing and managing memory data
-
-**Configuration**: Controlled via `memory` section in `config.yaml`
- Enable/disable memory system
- Configure storage path, debounce timing, fact limits
- Control system prompt injection and token limits
- Set confidence thresholds for fact storage
-
-**Storage Location**: `backend/.deer-flow/memory.json`
-
-See configuration section for detailed settings.
+- Images automatically converted to base64 and injected into state

 ## Code Style

@@ -405,6 +369,8 @@ See configuration section for detailed settings.

 See `docs/` directory for detailed documentation:
 - [CONFIGURATION.md](docs/CONFIGURATION.md) - Configuration options
+- [ARCHITECTURE.md](docs/ARCHITECTURE.md) - Architecture details
+- [API.md](docs/API.md) - API reference
 - [SETUP.md](docs/SETUP.md) - Setup guide
 - [FILE_UPLOAD.md](docs/FILE_UPLOAD.md) - File upload feature
 - [PATH_EXAMPLES.md](docs/PATH_EXAMPLES.md) - Path types and usage
--- a/backend/README.md
+++ b/backend/README.md
@@ -1,41 +1,133 @@
 # DeerFlow Backend

-DeerFlow is a LangGraph-based AI agent system that provides a powerful "super agent" with sandbox execution capabilities. The backend enables AI agents to execute code, browse the web, manage files, and perform complex multi-step tasks in isolated environments.
+DeerFlow is a LangGraph-based AI super agent with sandbox execution, persistent memory, and extensible tool integration. The backend enables AI agents to execute code, browse the web, manage files, delegate tasks to subagents, and retain context across conversations - all in isolated, per-thread environments.

 ---
-## Features

- **LangGraph Agent Runtime**: Built on LangGraph for robust multi-agent workflow orchestration
- **Sandbox Execution**: Safe code execution with local or Docker-based isolation
- **Multi-Model Support**: OpenAI, Anthropic Claude, DeepSeek, Doubao, Kimi, and custom LangChain-compatible models
- **MCP Integration**: Extensible tool ecosystem via Model Context Protocol
- **Skills System**: Specialized domain workflows injected into agent prompts
- **File Upload & Processing**: Multi-format document upload with automatic Markdown conversion
- **Context Summarization**: Automatic conversation summarization for long conversations
- **Plan Mode**: TodoList middleware for complex multi-step task tracking
-
---
 ## Architecture

 ```
-┌─────────────────────────────────────────────────────────────────┐
+                        ┌──────────────────────────────────────┐
                        │          Nginx (Port 2026)           │
-│              Unified reverse proxy entry point                   │
-└─────────────────┬───────────────────────────────┬───────────────┘
+                        │      Unified reverse proxy           │
+                        └───────┬──────────────────┬───────────┘
                                │                  │
+              /api/langgraph/*  │                  │  /api/* (other)
                                ▼                  ▼
-┌─────────────────────────────┐   ┌─────────────────────────────┐
-│   LangGraph Server (2024)   │   │    Gateway API (8001)       │
-│   Agent runtime & workflows │   │   Models, MCP, Skills, etc. │
-└─────────────────────────────┘   └─────────────────────────────┘
+               ┌────────────────────┐  ┌────────────────────────┐
+               │ LangGraph Server   │  │   Gateway API (8001)   │
+               │    (Port 2024)     │  │   FastAPI REST         │
+               │                    │  │                        │
+               │ ┌────────────────┐ │  │ Models, MCP, Skills,   │
+               │ │  Lead Agent    │ │  │ Memory, Uploads,       │
+               │ │  ┌──────────┐  │ │  │ Artifacts              │
+               │ │  │Middleware│  │ │  └────────────────────────┘
+               │ │  │  Chain   │  │ │
+               │ │  └──────────┘  │ │
+               │ │  ┌──────────┐  │ │
+               │ │  │  Tools   │  │ │
+               │ │  └──────────┘  │ │
+               │ │  ┌──────────┐  │ │
+               │ │  │Subagents │  │ │
+               │ │  └──────────┘  │ │
+               │ └────────────────┘ │
+               └────────────────────┘
 ```

-**Request Routing**:
- `/api/langgraph/*` → LangGraph Server (agent interactions, threads, streaming)
- `/api/*` (other) → Gateway API (models, MCP, skills, artifacts, uploads)
- `/` (non-API) → Frontend (web interface)
+**Request Routing** (via Nginx):
+- `/api/langgraph/*` → LangGraph Server - agent interactions, threads, streaming
+- `/api/*` (other) → Gateway API - models, MCP, skills, memory, artifacts, uploads
+- `/` (non-API) → Frontend - Next.js web interface

 ---
+
+## Core Components
+
+### Lead Agent
+
+The single LangGraph agent (`lead_agent`) is the runtime entry point, created via `make_lead_agent(config)`. It combines:
+
+- **Dynamic model selection** with thinking and vision support
+- **Middleware chain** for cross-cutting concerns (9 middlewares)
+- **Tool system** with sandbox, MCP, community, and built-in tools
+- **Subagent delegation** for parallel task execution
+- **System prompt** with skills injection, memory context, and working directory guidance
+
+### Middleware Chain
+
+Middlewares execute in strict order, each handling a specific concern:
+
+| # | Middleware | Purpose |
+|---|-----------|---------|
+| 1 | **ThreadDataMiddleware** | Creates per-thread isolated directories (workspace, uploads, outputs) |
+| 2 | **UploadsMiddleware** | Injects newly uploaded files into conversation context |
+| 3 | **SandboxMiddleware** | Acquires sandbox environment for code execution |
+| 4 | **SummarizationMiddleware** | Reduces context when approaching token limits (optional) |
+| 5 | **TodoListMiddleware** | Tracks multi-step tasks in plan mode (optional) |
+| 6 | **TitleMiddleware** | Auto-generates conversation titles after first exchange |
+| 7 | **MemoryMiddleware** | Queues conversations for async memory extraction |
+| 8 | **ViewImageMiddleware** | Injects image data for vision-capable models (conditional) |
+| 9 | **ClarificationMiddleware** | Intercepts clarification requests and interrupts execution (must be last) |
+
+### Sandbox System
+
+Per-thread isolated execution with virtual path translation:
+
+- **Abstract interface**: `execute_command`, `read_file`, `write_file`, `list_dir`
+- **Providers**: `LocalSandboxProvider` (filesystem) and `AioSandboxProvider` (Docker, in community/)
+- **Virtual paths**: `/mnt/user-data/{workspace,uploads,outputs}` → thread-specific physical directories
+- **Skills path**: `/mnt/skills` → `deer-flow/skills/` directory
+- **Tools**: `bash`, `ls`, `read_file`, `write_file`, `str_replace`
+
+### Subagent System
+
+Async task delegation with concurrent execution:
+
+- **Built-in agents**: `general-purpose` (full toolset) and `bash` (command specialist)
+- **Concurrency**: Max 3 subagents per turn, 15-minute timeout
+- **Execution**: Background thread pools with status tracking and SSE events
+- **Flow**: Agent calls `task()` tool → executor runs subagent in background → polls for completion → returns result
+
+### Memory System
+
+LLM-powered persistent context retention across conversations:
+
+- **Automatic extraction**: Analyzes conversations for user context, facts, and preferences
+- **Structured storage**: User context (work, personal, top-of-mind), history, and confidence-scored facts
+- **Debounced updates**: Batches updates to minimize LLM calls (configurable wait time)
+- **System prompt injection**: Top facts + context injected into agent prompts
+- **Storage**: JSON file with mtime-based cache invalidation
+
+### Tool Ecosystem
+
+| Category | Tools |
+|----------|-------|
+| **Sandbox** | `bash`, `ls`, `read_file`, `write_file`, `str_replace` |
+| **Built-in** | `present_files`, `ask_clarification`, `view_image`, `task` (subagent) |
+| **Community** | Tavily (web search), Jina AI (web fetch), Firecrawl (scraping), DuckDuckGo (image search) |
+| **MCP** | Any Model Context Protocol server (stdio, SSE, HTTP transports) |
+| **Skills** | Domain-specific workflows injected via system prompt |
+
+### Gateway API
+
+FastAPI application providing REST endpoints for frontend integration:
+
+| Route | Purpose |
+|-------|---------|
+| `GET /api/models` | List available LLM models |
+| `GET/PUT /api/mcp/config` | Manage MCP server configurations |
+| `GET/PUT /api/skills` | List and manage skills |
+| `POST /api/skills/install` | Install skill from `.skill` archive |
+| `GET /api/memory` | Retrieve memory data |
+| `POST /api/memory/reload` | Force memory reload |
+| `GET /api/memory/config` | Memory configuration |
+| `GET /api/memory/status` | Combined config + data |
+| `POST /api/threads/{id}/uploads` | Upload files (auto-converts PDF/PPT/Excel/Word to Markdown) |
+| `GET /api/threads/{id}/uploads/list` | List uploaded files |
+| `GET /api/threads/{id}/artifacts/{path}` | Serve generated artifacts |
+
+---
+
 ## Quick Start

 ### Prerequisites
@@ -47,7 +139,6 @@ DeerFlow is a LangGraph-based AI agent system that provides a powerful "super ag
 ### Installation

 ```bash
-# Clone the repository (if not already)
 cd deer-flow

 # Copy configuration files
@@ -61,23 +152,23 @@ make install

 ### Configuration

-Edit `config.yaml` in the project root to configure your models and tools:
+Edit `config.yaml` in the project root:

 ```yaml
 models:
-  - name: gpt-4
-    display_name: GPT-4
+  - name: gpt-4o
+    display_name: GPT-4o
    use: langchain_openai:ChatOpenAI
-    model: gpt-4
-    api_key: $OPENAI_API_KEY  # Set environment variable
-    max_tokens: 4096
+    model: gpt-4o
+    api_key: $OPENAI_API_KEY
+    supports_thinking: false
+    supports_vision: true
 ```

 Set your API keys:

 ```bash
 export OPENAI_API_KEY="your-api-key-here"
-# Or other provider keys as needed
 ```

 ### Running
@@ -100,96 +191,70 @@ make dev
 make gateway
 ```

-Direct access:
- LangGraph: http://localhost:2024
- Gateway: http://localhost:8001
+Direct access: LangGraph at http://localhost:2024, Gateway at http://localhost:8001

 ---
+
 ## Project Structure

 ```
 backend/
 ├── src/
-│   ├── agents/              # LangGraph agents and workflows
-│   │   ├── lead_agent/      # Main agent implementation
-│   │   └── middlewares/     # Agent middlewares
+│   ├── agents/                  # Agent system
+│   │   ├── lead_agent/         # Main agent (factory, prompts)
+│   │   ├── middlewares/        # 9 middleware components
+│   │   ├── memory/             # Memory extraction & storage
+│   │   └── thread_state.py    # ThreadState schema
 │   ├── gateway/                # FastAPI Gateway API
-│   │   └── routers/         # API route handlers
-│   ├── sandbox/             # Sandbox execution system
-│   ├── tools/               # Agent tools (builtins)
-│   ├── mcp/                 # MCP integration
+│   │   ├── app.py             # Application setup
+│   │   └── routers/           # 6 route modules
+│   ├── sandbox/                # Sandbox execution
+│   │   ├── local/             # Local filesystem provider
+│   │   ├── sandbox.py         # Abstract interface
+│   │   ├── tools.py           # bash, ls, read/write/str_replace
+│   │   └── middleware.py      # Sandbox lifecycle
+│   ├── subagents/              # Subagent delegation
+│   │   ├── builtins/          # general-purpose, bash agents
+│   │   ├── executor.py        # Background execution engine
+│   │   └── registry.py        # Agent registry
+│   ├── tools/builtins/         # Built-in tools
+│   ├── mcp/                    # MCP protocol integration
 │   ├── models/                 # Model factory
-│   ├── skills/              # Skills loader
+│   ├── skills/                 # Skill discovery & loading
 │   ├── config/                 # Configuration system
-│   ├── community/           # Community tools (web search, etc.)
+│   ├── community/              # Community tools & providers
 │   ├── reflection/             # Dynamic module loading
-│   └── utils/               # Utility functions
+│   └── utils/                  # Utilities
 ├── docs/                       # Documentation
 ├── tests/                      # Test suite
 ├── langgraph.json              # LangGraph server configuration
-├── config.yaml              # Application configuration (optional)
 ├── pyproject.toml              # Python dependencies
 ├── Makefile                    # Development commands
 └── Dockerfile                  # Container build
 ```

 ---
-## API Reference

-### LangGraph API (via `/api/langgraph/*`)
-
- `POST /threads` - Create new conversation thread
- `POST /threads/{thread_id}/runs` - Execute agent with input
- `GET /threads/{thread_id}/runs` - Get run history
- `GET /threads/{thread_id}/state` - Get current conversation state
- WebSocket support for streaming responses
-
-### Gateway API (via `/api/*`)
-
-**Models**:
- `GET /api/models` - List available LLM models
- `GET /api/models/{model_name}` - Get model details
-
-**MCP Configuration**:
- `GET /api/mcp/config` - Get current MCP server configurations
- `PUT /api/mcp/config` - Update MCP configuration
-
-**Skills Management**:
- `GET /api/skills` - List all skills
- `GET /api/skills/{skill_name}` - Get skill details
- `POST /api/skills/{skill_name}/enable` - Enable a skill
- `POST /api/skills/{skill_name}/disable` - Disable a skill
- `POST /api/skills/install` - Install skill from `.skill` file
-
-**File Uploads**:
- `POST /api/threads/{thread_id}/uploads` - Upload files
- `GET /api/threads/{thread_id}/uploads/list` - List uploaded files
- `DELETE /api/threads/{thread_id}/uploads/{filename}` - Delete file
-
-**Artifacts**:
- `GET /api/threads/{thread_id}/artifacts/{path}` - Download generated artifacts
-
---
 ## Configuration

 ### Main Configuration (`config.yaml`)

-The application uses a YAML-based configuration file. Place it in the project root directory.
+Place in project root. Config values starting with `$` resolve as environment variables.

 Key sections:
- `models`: LLM configurations with class paths and API keys
- `tool_groups`: Logical groupings for tools
- `tools`: Tool definitions with module paths
- `sandbox`: Execution environment settings
- `skills`: Skills directory configuration
- `title`: Auto-title generation settings
- `summarization`: Context summarization settings
-
-See [docs/CONFIGURATION.md](docs/CONFIGURATION.md) for detailed documentation.
+- `models` - LLM configurations with class paths, API keys, thinking/vision flags
+- `tools` - Tool definitions with module paths and groups
+- `tool_groups` - Logical tool groupings
+- `sandbox` - Execution environment provider
+- `skills` - Skills directory paths
+- `title` - Auto-title generation settings
+- `summarization` - Context summarization settings
+- `subagents` - Subagent system (enabled/disabled)
+- `memory` - Memory system settings (enabled, storage, debounce, facts limits)

 ### Extensions Configuration (`extensions_config.json`)

-MCP servers and skills are configured in `extensions_config.json`:
+MCP servers and skill states in a single file:

 ```json
 {
@@ -216,6 +281,7 @@ MCP servers and skills are configured in `extensions_config.json`:
 - Tool API keys: `TAVILY_API_KEY`, `GITHUB_TOKEN`, etc.

 ---
+
 ## Development

 ### Commands
@@ -230,10 +296,11 @@ make format     # Format code (ruff)

 ### Code Style

- Uses `ruff` for linting and formatting
- Line length: 240 characters
- Python 3.12+ with type hints
- Double quotes, space indentation
+- **Linter/Formatter**: `ruff`
+- **Line length**: 240 characters
+- **Python**: 3.12+ with type hints
+- **Quotes**: Double quotes
+- **Indentation**: 4 spaces

 ### Testing

@@ -242,45 +309,36 @@ uv run pytest
 ```

 ---
-## Documentation

- [Configuration Guide](docs/CONFIGURATION.md) - Detailed configuration options
- [Setup Guide](docs/SETUP.md) - Quick setup instructions
- [File Upload](docs/FILE_UPLOAD.md) - File upload functionality
- [Path Examples](docs/PATH_EXAMPLES.md) - Path types and usage
- [Summarization](docs/summarization.md) - Context summarization feature
- [Plan Mode](docs/plan_mode_usage.md) - TodoList middleware usage
-
---
 ## Technology Stack

-### Core Frameworks
- **LangChain** (1.2.3+) - LLM orchestration
- **LangGraph** (1.0.6+) - Multi-agent workflows
- **FastAPI** (0.115.0+) - REST API
- **Uvicorn** (0.34.0+) - ASGI server
-
-### LLM Integrations
- `langchain-openai` - OpenAI models
- `langchain-anthropic` - Claude models
- `langchain-deepseek` - DeepSeek models
-
-### Extensions
- `langchain-mcp-adapters` - MCP protocol support
- `agent-sandbox` - Sandboxed code execution
-
-### Utilities
- `markitdown` - Multi-format to Markdown conversion
- `tavily-python` - Web search
- `firecrawl-py` - Web scraping
- `ddgs` - DuckDuckGo image search
+- **LangGraph** (1.0.6+) - Agent framework and multi-agent orchestration
+- **LangChain** (1.2.3+) - LLM abstractions and tool system
+- **FastAPI** (0.115.0+) - Gateway REST API
+- **langchain-mcp-adapters** - Model Context Protocol support
+- **agent-sandbox** - Sandboxed code execution
+- **markitdown** - Multi-format document conversion
+- **tavily-python** / **firecrawl-py** - Web search and scraping

 ---
+
+## Documentation
+
+- [Configuration Guide](docs/CONFIGURATION.md)
+- [Architecture Details](docs/ARCHITECTURE.md)
+- [API Reference](docs/API.md)
+- [File Upload](docs/FILE_UPLOAD.md)
+- [Path Examples](docs/PATH_EXAMPLES.md)
+- [Context Summarization](docs/summarization.md)
+- [Plan Mode](docs/plan_mode_usage.md)
+- [Setup Guide](docs/SETUP.md)
+
+---
+
 ## License

 See the [LICENSE](../LICENSE) file in the project root.

---
 ## Contributing

 See [CONTRIBUTING.md](CONTRIBUTING.md) for contribution guidelines.