# Task Tool Improvements

## Overview

The `task` tool has been improved to eliminate wasteful LLM polling. Previously, when using background tasks, the LLM had to repeatedly call `task_status` to poll for completion, causing unnecessary API requests.

## Changes Made

### 1. Removed `run_in_background` Parameter

The `run_in_background` parameter has been removed from the `task` tool. All subagent tasks still run asynchronously in the backend, but the tool call itself now blocks and handles completion automatically.

**Before:**

```python
# The LLM had to manage polling itself
task_id = task(
    subagent_type="bash",
    prompt="Run tests",
    description="Run tests",
    run_in_background=True
)
# ...and then poll repeatedly:
while True:
    status = task_status(task_id)
    if status.completed:
        break
```

**After:**

```python
# The tool blocks until completion; polling happens in the backend
result = task(
    subagent_type="bash",
    prompt="Run tests",
    description="Run tests"
)
# The result is available as soon as the call returns
```

### 2. Backend Polling

The `task_tool` now:

- Starts the subagent task asynchronously
- Polls for completion in the backend (every 2 seconds)
- Blocks the tool call until completion
- Returns the final result directly

This means:

- The LLM makes only ONE tool call
- No wasteful LLM polling requests
- The backend handles all status checking
- Timeout protection (5 minutes max)

### 3. Removed `task_status` from LLM Tools

The `task_status_tool` is no longer exposed to the LLM. It is kept in the codebase for potential internal/debugging use, but the LLM cannot call it.
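
As a purely hypothetical sketch of what this means at tool-registration time (the registry list and the import below are illustrative, not deer-flow's actual API):

```python
# Illustrative only: the real registration mechanism may differ.
from deerflow.tools.builtins.task_tool import task_status_tool, task_tool

# Only task_tool is handed to the LLM. task_status_tool stays importable
# for internal/debugging use but is never exposed as a callable tool.
LLM_EXPOSED_TOOLS = [task_tool]
```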

### 4. Updated Documentation

- Updated `SUBAGENT_SECTION` in `prompt.py` to remove all references to background tasks and polling
- Simplified the usage examples
- Made it clear that the tool automatically waits for completion

## Implementation Details

### Polling Logic

Located in `packages/harness/deerflow/tools/builtins/task_tool.py`:

```python
# Start background execution
task_id = executor.execute_async(prompt)

# Poll for task completion in the backend
poll_count = 0
while True:
    result = get_background_task_result(task_id)

    # Check whether the task completed or failed
    if result.status == SubagentStatus.COMPLETED:
        return f"[Subagent: {subagent_type}]\n\n{result.result}"
    elif result.status == SubagentStatus.FAILED:
        return f"[Subagent: {subagent_type}] Task failed: {result.error}"

    # Wait before the next poll
    time.sleep(2)

    # Timeout protection (150 polls × 2 seconds = 5 minutes)
    poll_count += 1
    if poll_count > 150:
        return "Task timed out after 5 minutes"
```

### Execution Timeout

In addition to the polling timeout, subagent execution now has a built-in timeout mechanism.

Configuration (`packages/harness/deerflow/subagents/config.py`):

```python
from dataclasses import dataclass

@dataclass
class SubagentConfig:
    # ...
    timeout_seconds: int = 300  # 5 minutes default
```
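
Assuming the elided fields all have defaults, overriding the timeout is a plain constructor argument, as in this minimal sketch:

```python
# A minimal sketch: raise the execution timeout to 10 minutes.
config = SubagentConfig(timeout_seconds=600)
```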

**Thread Pool Architecture:**

To avoid nested thread pools and resource waste, we use two dedicated thread pools:

1. **Scheduler Pool** (`_scheduler_pool`):
   - Max workers: 4
   - Purpose: orchestrates background task execution
   - Runs the `run_task()` function that manages the task lifecycle

2. **Execution Pool** (`_execution_pool`):
   - Max workers: 8 (larger to avoid blocking)
   - Purpose: actual subagent execution with timeout support
   - Runs the `execute()` method that invokes the agent

**How it works:**

```python
# In execute_async():
_scheduler_pool.submit(run_task)  # Submit the orchestration task

# In run_task():
future = _execution_pool.submit(self.execute, task)  # Submit execution
exec_result = future.result(timeout=timeout_seconds)  # Wait with timeout
```
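
For context, here is a minimal self-contained sketch of the two-pool pattern (the pool sizes follow the list above; the `execute` body and the timeout handling are assumptions, not the actual implementation):

```python
import concurrent.futures
import time

_scheduler_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)
_execution_pool = concurrent.futures.ThreadPoolExecutor(max_workers=8)

def execute(task: str) -> str:
    # Stand-in for the real subagent invocation.
    time.sleep(1)
    return f"done: {task}"

def run_task(task: str, timeout_seconds: int = 300) -> str:
    # Orchestration: submit the actual execution to the execution pool
    # and enforce the timeout at this level.
    future = _execution_pool.submit(execute, task)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        return "Task timed out"

# execute_async() hands the orchestration function to the scheduler pool;
# the caller can later poll or wait on the returned future.
scheduled = _scheduler_pool.submit(run_task, "run tests")
print(scheduled.result())  # -> "done: run tests"
```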

**Benefits:**

- Clean separation of concerns (scheduling vs. execution)
- No nested thread pools
- Timeout enforcement at the right level
- Better resource utilization

**Two-Level Timeout Protection:**

1. **Execution timeout**: subagent execution itself has a 5-minute timeout (configurable in `SubagentConfig`)
2. **Polling timeout**: the tool's polling loop gives up after 5 minutes (150 polls × 2 seconds)

This ensures that even if subagent execution hangs, the system won't wait indefinitely.
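
The two limits line up; a quick sanity check using the constants stated above:

```python
POLL_INTERVAL_SECONDS = 2
MAX_POLLS = 150
EXECUTION_TIMEOUT_SECONDS = 300  # SubagentConfig.timeout_seconds default

# The polling loop gives up at the same point the execution timeout fires.
assert MAX_POLLS * POLL_INTERVAL_SECONDS == EXECUTION_TIMEOUT_SECONDS
```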

## Benefits

1. **Reduced API costs**: no more repeated LLM requests for polling
2. **Simpler UX**: the LLM doesn't need to manage polling logic
3. **Better reliability**: the backend handles all status checking consistently
4. **Timeout protection**: two-level timeouts (execution + polling) prevent infinite waiting

## Testing

To verify the changes work correctly:

1. Start a subagent task that takes a few seconds
2. Verify the tool call blocks until completion
3. Verify the result is returned directly
4. Verify no `task_status` calls are made

Example test scenario:

```python
# This should block for ~10 seconds, then return the result
result = task(
    subagent_type="bash",
    prompt="sleep 10 && echo 'Done'",
    description="Test task"
)
# result should contain "Done"
```

## Migration Notes

For users/code that previously used `run_in_background=True`:

- Simply remove the parameter
- Remove any polling logic
- The tool will automatically wait for completion

No other changes are needed; aside from the removed parameter, the API is unchanged.