mirror of
https://gitee.com/wanwujie/deer-flow
synced 2026-04-03 06:12:14 +08:00
* refactor: extract shared utils to break harness→app cross-layer imports Move _validate_skill_frontmatter to src/skills/validation.py and CONVERTIBLE_EXTENSIONS + convert_file_to_markdown to src/utils/file_conversion.py. This eliminates the two reverse dependencies from client.py (harness layer) into gateway/routers/ (app layer), preparing for the harness/app package split. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: split backend/src into harness (deerflow.*) and app (app.*) Physically split the monolithic backend/src/ package into two layers: - **Harness** (`packages/harness/deerflow/`): publishable agent framework package with import prefix `deerflow.*`. Contains agents, sandbox, tools, models, MCP, skills, config, and all core infrastructure. - **App** (`app/`): unpublished application code with import prefix `app.*`. Contains gateway (FastAPI REST API) and channels (IM integrations). Key changes: - Move 13 harness modules to packages/harness/deerflow/ via git mv - Move gateway + channels to app/ via git mv - Rename all imports: src.* → deerflow.* (harness) / app.* (app layer) - Set up uv workspace with deerflow-harness as workspace member - Update langgraph.json, config.example.yaml, all scripts, Docker files - Add build-system (hatchling) to harness pyproject.toml - Add PYTHONPATH=. to gateway startup commands for app.* resolution - Update ruff.toml with known-first-party for import sorting - Update all documentation to reflect new directory structure Boundary rule enforced: harness code never imports from app. All 429 tests pass. Lint clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: add harness→app boundary check test and update docs Add test_harness_boundary.py that scans all Python files in packages/harness/deerflow/ and fails if any `from app.*` or `import app.*` statement is found. This enforces the architectural rule that the harness layer never depends on the app layer. Update CLAUDE.md to document the harness/app split architecture, import conventions, and the boundary enforcement test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add config versioning with auto-upgrade on startup When config.example.yaml schema changes, developers' local config.yaml files can silently become outdated. This adds a config_version field and auto-upgrade mechanism so breaking changes (like src.* → deerflow.* renames) are applied automatically before services start. - Add config_version: 1 to config.example.yaml - Add startup version check warning in AppConfig.from_file() - Add scripts/config-upgrade.sh with migration registry for value replacements - Add `make config-upgrade` target - Auto-run config-upgrade in serve.sh and start-daemon.sh before starting services - Add config error hints in service failure messages Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix comments * fix: update src.* import in test_sandbox_tools_security to deerflow.* Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: handle empty config and search parent dirs for config.example.yaml Address Copilot review comments on PR #1131: - Guard against yaml.safe_load() returning None for empty config files - Search parent directories for config.example.yaml instead of only looking next to config.yaml, fixing detection in common setups Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: correct skills root path depth and config_version type coercion - loader.py: fix get_skills_root_path() to use 5 parent levels (was 3) after harness split, file lives at packages/harness/deerflow/skills/ so parent×3 resolved to backend/packages/harness/ instead of backend/ - app_config.py: coerce config_version to int() before comparison in _check_config_version() to prevent TypeError when YAML stores value as string (e.g. config_version: "1") - tests: add regression tests for both fixes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: update test imports from src.* to deerflow.*/app.* after harness refactor Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
9.9 KiB
9.9 KiB
Contributing to DeerFlow Backend
Thank you for your interest in contributing to DeerFlow! This document provides guidelines and instructions for contributing to the backend codebase.
Table of Contents
- Getting Started
- Development Setup
- Project Structure
- Code Style
- Making Changes
- Testing
- Pull Request Process
- Architecture Guidelines
Getting Started
Prerequisites
- Python 3.12 or higher
- uv package manager
- Git
- Docker (optional, for Docker sandbox testing)
Fork and Clone
- Fork the repository on GitHub
- Clone your fork locally:
git clone https://github.com/YOUR_USERNAME/deer-flow.git cd deer-flow
Development Setup
Install Dependencies
# From project root
cp config.example.yaml config.yaml
# Install backend dependencies
cd backend
make install
Configure Environment
Set up your API keys for testing:
export OPENAI_API_KEY="your-api-key"
# Add other keys as needed
Run the Development Server
# Terminal 1: LangGraph server
make dev
# Terminal 2: Gateway API
make gateway
Project Structure
backend/src/
├── agents/ # Agent system
│ ├── lead_agent/ # Main agent implementation
│ │ └── agent.py # Agent factory and creation
│ ├── middlewares/ # Agent middlewares
│ │ ├── thread_data_middleware.py
│ │ ├── sandbox_middleware.py
│ │ ├── title_middleware.py
│ │ ├── uploads_middleware.py
│ │ ├── view_image_middleware.py
│ │ └── clarification_middleware.py
│ └── thread_state.py # Thread state definition
│
├── gateway/ # FastAPI Gateway
│ ├── app.py # FastAPI application
│ └── routers/ # Route handlers
│ ├── models.py # /api/models endpoints
│ ├── mcp.py # /api/mcp endpoints
│ ├── skills.py # /api/skills endpoints
│ ├── artifacts.py # /api/threads/.../artifacts
│ └── uploads.py # /api/threads/.../uploads
│
├── sandbox/ # Sandbox execution
│ ├── __init__.py # Sandbox interface
│ ├── local.py # Local sandbox provider
│ └── tools.py # Sandbox tools (bash, file ops)
│
├── tools/ # Agent tools
│ └── builtins/ # Built-in tools
│ ├── present_file_tool.py
│ ├── ask_clarification_tool.py
│ └── view_image_tool.py
│
├── mcp/ # MCP integration
│ └── manager.py # MCP server management
│
├── models/ # Model system
│ └── factory.py # Model factory
│
├── skills/ # Skills system
│ └── loader.py # Skills loader
│
├── config/ # Configuration
│ ├── app_config.py # Main app config
│ ├── extensions_config.py # Extensions config
│ └── summarization_config.py
│
├── community/ # Community tools
│ ├── tavily/ # Tavily web search
│ ├── jina/ # Jina web fetch
│ ├── firecrawl/ # Firecrawl scraping
│ └── aio_sandbox/ # Docker sandbox
│
├── reflection/ # Dynamic loading
│ └── __init__.py # Module resolution
│
└── utils/ # Utilities
└── __init__.py
Code Style
Linting and Formatting
We use ruff for both linting and formatting:
# Check for issues
make lint
# Auto-fix and format
make format
Style Guidelines
- Line length: 240 characters maximum
- Python version: 3.12+ features allowed
- Type hints: Use type hints for function signatures
- Quotes: Double quotes for strings
- Indentation: 4 spaces (no tabs)
- Imports: Group by standard library, third-party, local
Docstrings
Use docstrings for public functions and classes:
def create_chat_model(name: str, thinking_enabled: bool = False) -> BaseChatModel:
"""Create a chat model instance from configuration.
Args:
name: The model name as defined in config.yaml
thinking_enabled: Whether to enable extended thinking
Returns:
A configured LangChain chat model instance
Raises:
ValueError: If the model name is not found in configuration
"""
...
Making Changes
Branch Naming
Use descriptive branch names:
feature/add-new-tool- New featuresfix/sandbox-timeout- Bug fixesdocs/update-readme- Documentationrefactor/config-system- Code refactoring
Commit Messages
Write clear, concise commit messages:
feat: add support for Claude 3.5 model
- Add model configuration in config.yaml
- Update model factory to handle Claude-specific settings
- Add tests for new model
Prefix types:
feat:- New featurefix:- Bug fixdocs:- Documentationrefactor:- Code refactoringtest:- Testschore:- Build/config changes
Testing
Running Tests
uv run pytest
Writing Tests
Place tests in the tests/ directory mirroring the source structure:
tests/
├── test_models/
│ └── test_factory.py
├── test_sandbox/
│ └── test_local.py
└── test_gateway/
└── test_models_router.py
Example test:
import pytest
from deerflow.models.factory import create_chat_model
def test_create_chat_model_with_valid_name():
"""Test that a valid model name creates a model instance."""
model = create_chat_model("gpt-4")
assert model is not None
def test_create_chat_model_with_invalid_name():
"""Test that an invalid model name raises ValueError."""
with pytest.raises(ValueError):
create_chat_model("nonexistent-model")
Pull Request Process
Before Submitting
- Ensure tests pass:
uv run pytest - Run linter:
make lint - Format code:
make format - Update documentation if needed
PR Description
Include in your PR description:
- What: Brief description of changes
- Why: Motivation for the change
- How: Implementation approach
- Testing: How you tested the changes
Review Process
- Submit PR with clear description
- Address review feedback
- Ensure CI passes
- Maintainer will merge when approved
Architecture Guidelines
Adding New Tools
- Create tool in
packages/harness/deerflow/tools/builtins/orpackages/harness/deerflow/community/:
# packages/harness/deerflow/tools/builtins/my_tool.py
from langchain_core.tools import tool
@tool
def my_tool(param: str) -> str:
"""Tool description for the agent.
Args:
param: Description of the parameter
Returns:
Description of return value
"""
return f"Result: {param}"
- Register in
config.yaml:
tools:
- name: my_tool
group: my_group
use: deerflow.tools.builtins.my_tool:my_tool
Adding New Middleware
- Create middleware in
packages/harness/deerflow/agents/middlewares/:
# packages/harness/deerflow/agents/middlewares/my_middleware.py
from langchain.agents.middleware import BaseMiddleware
from langchain_core.runnables import RunnableConfig
class MyMiddleware(BaseMiddleware):
"""Middleware description."""
def transform_state(self, state: dict, config: RunnableConfig) -> dict:
"""Transform the state before agent execution."""
# Modify state as needed
return state
- Register in
packages/harness/deerflow/agents/lead_agent/agent.py:
middlewares = [
ThreadDataMiddleware(),
SandboxMiddleware(),
MyMiddleware(), # Add your middleware
TitleMiddleware(),
ClarificationMiddleware(),
]
Adding New API Endpoints
- Create router in
app/gateway/routers/:
# app/gateway/routers/my_router.py
from fastapi import APIRouter
router = APIRouter(prefix="/my-endpoint", tags=["my-endpoint"])
@router.get("/")
async def get_items():
"""Get all items."""
return {"items": []}
@router.post("/")
async def create_item(data: dict):
"""Create a new item."""
return {"created": data}
- Register in
app/gateway/app.py:
from app.gateway.routers import my_router
app.include_router(my_router.router)
Configuration Changes
When adding new configuration options:
- Update
packages/harness/deerflow/config/app_config.pywith new fields - Add default values in
config.example.yaml - Document in
docs/CONFIGURATION.md
MCP Server Integration
To add support for a new MCP server:
- Add configuration in
extensions_config.json:
{
"mcpServers": {
"my-server": {
"enabled": true,
"type": "stdio",
"command": "npx",
"args": ["-y", "@my-org/mcp-server"],
"description": "My MCP Server"
}
}
}
- Update
extensions_config.example.jsonwith the new server
Skills Development
To create a new skill:
- Create directory in
skills/public/orskills/custom/:
skills/public/my-skill/
└── SKILL.md
- Write
SKILL.mdwith YAML front matter:
---
name: My Skill
description: What this skill does
license: MIT
allowed-tools:
- read_file
- write_file
- bash
---
# My Skill
Instructions for the agent when this skill is enabled...
Questions?
If you have questions about contributing:
- Check existing documentation in
docs/ - Look for similar issues or PRs on GitHub
- Open a discussion or issue on GitHub
Thank you for contributing to DeerFlow!