mirror of https://gitee.com/wanwujie/deer-flow synced 2026-04-03 06:12:14 +08:00

Files

DanielWalnut 76803b826f refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

* refactor: extract shared utils to break harness→app cross-layer imports

Move _validate_skill_frontmatter to src/skills/validation.py and
CONVERTIBLE_EXTENSIONS + convert_file_to_markdown to src/utils/file_conversion.py.
This eliminates the two reverse dependencies from client.py (harness layer)
into gateway/routers/ (app layer), preparing for the harness/app package split.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: split backend/src into harness (deerflow.*) and app (app.*)

Physically split the monolithic backend/src/ package into two layers:

- **Harness** (`packages/harness/deerflow/`): publishable agent framework
  package with import prefix `deerflow.*`. Contains agents, sandbox, tools,
  models, MCP, skills, config, and all core infrastructure.

- **App** (`app/`): unpublished application code with import prefix `app.*`.
  Contains gateway (FastAPI REST API) and channels (IM integrations).

Key changes:
- Move 13 harness modules to packages/harness/deerflow/ via git mv
- Move gateway + channels to app/ via git mv
- Rename all imports: src.* → deerflow.* (harness) / app.* (app layer)
- Set up uv workspace with deerflow-harness as workspace member
- Update langgraph.json, config.example.yaml, all scripts, Docker files
- Add build-system (hatchling) to harness pyproject.toml
- Add PYTHONPATH=. to gateway startup commands for app.* resolution
- Update ruff.toml with known-first-party for import sorting
- Update all documentation to reflect new directory structure

Boundary rule enforced: harness code never imports from app.
All 429 tests pass. Lint clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: add harness→app boundary check test and update docs

Add test_harness_boundary.py that scans all Python files in
packages/harness/deerflow/ and fails if any `from app.*` or
`import app.*` statement is found. This enforces the architectural
rule that the harness layer never depends on the app layer.

Update CLAUDE.md to document the harness/app split architecture,
import conventions, and the boundary enforcement test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add config versioning with auto-upgrade on startup

When config.example.yaml schema changes, developers' local config.yaml
files can silently become outdated. This adds a config_version field and
auto-upgrade mechanism so breaking changes (like src.* → deerflow.*
renames) are applied automatically before services start.

- Add config_version: 1 to config.example.yaml
- Add startup version check warning in AppConfig.from_file()
- Add scripts/config-upgrade.sh with migration registry for value replacements
- Add `make config-upgrade` target
- Auto-run config-upgrade in serve.sh and start-daemon.sh before starting services
- Add config error hints in service failure messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix comments

* fix: update src.* import in test_sandbox_tools_security to deerflow.*

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: handle empty config and search parent dirs for config.example.yaml

Address Copilot review comments on PR #1131:
- Guard against yaml.safe_load() returning None for empty config files
- Search parent directories for config.example.yaml instead of only
  looking next to config.yaml, fixing detection in common setups

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: correct skills root path depth and config_version type coercion

- loader.py: fix get_skills_root_path() to use 5 parent levels (was 3)
  after harness split, file lives at packages/harness/deerflow/skills/
  so parent×3 resolved to backend/packages/harness/ instead of backend/
- app_config.py: coerce config_version to int() before comparison in
  _check_config_version() to prevent TypeError when YAML stores value
  as string (e.g. config_version: "1")
- tests: add regression tests for both fixes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: update test imports from src.* to deerflow.*/app.* after harness refactor

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-14 22:55:52 +08:00

8.4 KiB

Raw Blame History

Configuration Guide

This guide explains how to configure DeerFlow for your environment.

Config Versioning

config.example.yaml contains a config_version field that tracks schema changes. When the example version is higher than your local config.yaml, the application emits a startup warning:

WARNING - Your config.yaml (version 0) is outdated — the latest version is 1.
Run `make config-upgrade` to merge new fields into your config.

Missing config_version in your config is treated as version 0.
Run make config-upgrade to auto-merge missing fields (your existing values are preserved, a .bak backup is created).
When changing the config schema, bump config_version in config.example.yaml.

Configuration Sections

Models

Configure the LLM models available to the agent:

models:
  - name: gpt-4                    # Internal identifier
    display_name: GPT-4            # Human-readable name
    use: langchain_openai:ChatOpenAI  # LangChain class path
    model: gpt-4                   # Model identifier for API
    api_key: $OPENAI_API_KEY       # API key (use env var)
    max_tokens: 4096               # Max tokens per request
    temperature: 0.7               # Sampling temperature

Supported Providers:

OpenAI (langchain_openai:ChatOpenAI)
Anthropic (langchain_anthropic:ChatAnthropic)
DeepSeek (langchain_deepseek:ChatDeepSeek)
Any LangChain-compatible provider

For OpenAI-compatible gateways (for example Novita or OpenRouter), keep using langchain_openai:ChatOpenAI and set base_url:

models:
  - name: novita-deepseek-v3.2
    display_name: Novita DeepSeek V3.2
    use: langchain_openai:ChatOpenAI
    model: deepseek/deepseek-v3.2
    api_key: $NOVITA_API_KEY
    base_url: https://api.novita.ai/openai
    supports_thinking: true
    when_thinking_enabled:
      extra_body:
        thinking:
          type: enabled

  - name: minimax-m2.5
    display_name: MiniMax M2.5
    use: langchain_openai:ChatOpenAI
    model: MiniMax-M2.5
    api_key: $MINIMAX_API_KEY
    base_url: https://api.minimax.io/v1
    max_tokens: 4096
    temperature: 1.0  # MiniMax requires temperature in (0.0, 1.0]
    supports_vision: true

  - name: minimax-m2.5-highspeed
    display_name: MiniMax M2.5 Highspeed
    use: langchain_openai:ChatOpenAI
    model: MiniMax-M2.5-highspeed
    api_key: $MINIMAX_API_KEY
    base_url: https://api.minimax.io/v1
    max_tokens: 4096
    temperature: 1.0  # MiniMax requires temperature in (0.0, 1.0]
    supports_vision: true
  - name: openrouter-gemini-2.5-flash
    display_name: Gemini 2.5 Flash (OpenRouter)
    use: langchain_openai:ChatOpenAI
    model: google/gemini-2.5-flash-preview
    api_key: $OPENAI_API_KEY
    base_url: https://openrouter.ai/api/v1

If your OpenRouter key lives in a different environment variable name, point api_key at that variable explicitly (for example api_key: $OPENROUTER_API_KEY).

Thinking Models: Some models support "thinking" mode for complex reasoning:

models:
  - name: deepseek-v3
    supports_thinking: true
    when_thinking_enabled:
      extra_body:
        thinking:
          type: enabled

Tool Groups

Organize tools into logical groups:

tool_groups:
  - name: web          # Web browsing and search
  - name: file:read    # Read-only file operations
  - name: file:write   # Write file operations
  - name: bash         # Shell command execution

Tools

Configure specific tools available to the agent:

tools:
  - name: web_search
    group: web
    use: deerflow.community.tavily.tools:web_search_tool
    max_results: 5
    # api_key: $TAVILY_API_KEY  # Optional

Built-in Tools:

web_search - Search the web (Tavily)
web_fetch - Fetch web pages (Jina AI)
ls - List directory contents
read_file - Read file contents
write_file - Write file contents
str_replace - String replacement in files
bash - Execute bash commands

Sandbox

DeerFlow supports multiple sandbox execution modes. Configure your preferred mode in config.yaml:

Local Execution (runs sandbox code directly on the host machine):

sandbox:
   use: deerflow.sandbox.local:LocalSandboxProvider # Local execution

Docker Execution (runs sandbox code in isolated Docker containers):

sandbox:
   use: deerflow.community.aio_sandbox:AioSandboxProvider # Docker-based sandbox

Docker Execution with Kubernetes (runs sandbox code in Kubernetes pods via provisioner service):

This mode runs each sandbox in an isolated Kubernetes Pod on your host machine's cluster. Requires Docker Desktop K8s, OrbStack, or similar local K8s setup.

sandbox:
   use: deerflow.community.aio_sandbox:AioSandboxProvider
   provisioner_url: http://provisioner:8002

When using Docker development (make docker-start), DeerFlow starts the provisioner service only if this provisioner mode is configured. In local or plain Docker sandbox modes, provisioner is skipped.

See Provisioner Setup Guide for detailed configuration, prerequisites, and troubleshooting.

Choose between local execution or Docker-based isolation:

Option 1: Local Sandbox (default, simpler setup):

sandbox:
  use: deerflow.sandbox.local:LocalSandboxProvider

Option 2: Docker Sandbox (isolated, more secure):

sandbox:
  use: deerflow.community.aio_sandbox:AioSandboxProvider
  port: 8080
  auto_start: true
  container_prefix: deer-flow-sandbox

  # Optional: Additional mounts
  mounts:
    - host_path: /path/on/host
      container_path: /path/in/container
      read_only: false

Skills

Configure the skills directory for specialized workflows:

skills:
  # Host path (optional, default: ../skills)
  path: /custom/path/to/skills

  # Container mount path (default: /mnt/skills)
  container_path: /mnt/skills

How Skills Work:

Skills are stored in deer-flow/skills/{public,custom}/
Each skill has a SKILL.md file with metadata
Skills are automatically discovered and loaded
Available in both local and Docker sandbox via path mapping

Title Generation

Automatic conversation title generation:

title:
  enabled: true
  max_words: 6
  max_chars: 60
  model_name: null  # Use first model in list

Environment Variables

DeerFlow supports environment variable substitution using the $ prefix:

models:
  - api_key: $OPENAI_API_KEY  # Reads from environment

Common Environment Variables:

OPENAI_API_KEY - OpenAI API key
ANTHROPIC_API_KEY - Anthropic API key
DEEPSEEK_API_KEY - DeepSeek API key
NOVITA_API_KEY - Novita API key (OpenAI-compatible endpoint)
TAVILY_API_KEY - Tavily search API key
DEER_FLOW_CONFIG_PATH - Custom config file path

Configuration Location

The configuration file should be placed in the project root directory (deer-flow/config.yaml), not in the backend directory.

Configuration Priority

DeerFlow searches for configuration in this order:

Path specified in code via config_path argument
Path from DEER_FLOW_CONFIG_PATH environment variable
config.yaml in current working directory (typically backend/ when running)
config.yaml in parent directory (project root: deer-flow/)

Best Practices

Place config.yaml in project root - Not in backend/ directory
Never commit config.yaml - It's already in .gitignore
Use environment variables for secrets - Don't hardcode API keys
Keep config.example.yaml updated - Document all new options
Test configuration changes locally - Before deploying
Use Docker sandbox for production - Better isolation and security

Troubleshooting

"Config file not found"

Ensure config.yaml exists in the project root directory (deer-flow/config.yaml)
The backend searches parent directory by default, so root location is preferred
Alternatively, set DEER_FLOW_CONFIG_PATH environment variable to custom location

"Invalid API key"

Verify environment variables are set correctly
Check that $ prefix is used for env var references

"Skills not loading"

Check that deer-flow/skills/ directory exists
Verify skills have valid SKILL.md files
Check skills.path configuration if using custom path

"Docker sandbox fails to start"

Ensure Docker is running
Check port 8080 (or configured port) is available
Verify Docker image is accessible

Examples

See config.example.yaml for complete examples of all configuration options.

8.4 KiB Raw Blame History