feat(harness): integrate ACP agent tool (#1344)

* refactor: extract shared utils to break harness→app cross-layer imports

Move _validate_skill_frontmatter to src/skills/validation.py and
CONVERTIBLE_EXTENSIONS + convert_file_to_markdown to src/utils/file_conversion.py.
This eliminates the two reverse dependencies from client.py (harness layer)
into gateway/routers/ (app layer), preparing for the harness/app package split.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: split backend/src into harness (deerflow.*) and app (app.*)

Physically split the monolithic backend/src/ package into two layers:

- **Harness** (`packages/harness/deerflow/`): publishable agent framework
  package with import prefix `deerflow.*`. Contains agents, sandbox, tools,
  models, MCP, skills, config, and all core infrastructure.

- **App** (`app/`): unpublished application code with import prefix `app.*`.
  Contains gateway (FastAPI REST API) and channels (IM integrations).

Key changes:
- Move 13 harness modules to packages/harness/deerflow/ via git mv
- Move gateway + channels to app/ via git mv
- Rename all imports: src.* → deerflow.* (harness) / app.* (app layer)
- Set up uv workspace with deerflow-harness as workspace member
- Update langgraph.json, config.example.yaml, all scripts, Docker files
- Add build-system (hatchling) to harness pyproject.toml
- Add PYTHONPATH=. to gateway startup commands for app.* resolution
- Update ruff.toml with known-first-party for import sorting
- Update all documentation to reflect new directory structure

Boundary rule enforced: harness code never imports from app.
All 429 tests pass. Lint clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: add harness→app boundary check test and update docs

Add test_harness_boundary.py that scans all Python files in
packages/harness/deerflow/ and fails if any `from app.*` or
`import app.*` statement is found. This enforces the architectural
rule that the harness layer never depends on the app layer.

Update CLAUDE.md to document the harness/app split architecture,
import conventions, and the boundary enforcement test.
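
The boundary scan can be sketched as an AST walk (a minimal sketch; the helper names are illustrative, not the real test's API — using `ast` instead of string matching avoids false positives from comments and string literals):

```python
import ast
from pathlib import Path


def find_app_imports(source: str) -> list[str]:
    """Return the app.* modules imported by the given source code."""
    violations: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            violations += [a.name for a in node.names
                           if a.name == "app" or a.name.startswith("app.")]
        elif isinstance(node, ast.ImportFrom):
            mod = node.module or ""
            if node.level == 0 and (mod == "app" or mod.startswith("app.")):
                violations.append(mod)
    return violations


def check_harness_boundary(harness_root: Path) -> list[str]:
    """Collect '<file>: <module>' entries for every boundary violation."""
    failures: list[str] = []
    for py_file in harness_root.rglob("*.py"):
        for mod in find_app_imports(py_file.read_text()):
            failures.append(f"{py_file}: {mod}")
    return failures
```

The real test would simply assert that `check_harness_boundary(...)` returns an empty list.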

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add config versioning with auto-upgrade on startup

When config.example.yaml schema changes, developers' local config.yaml
files can silently become outdated. This adds a config_version field and
auto-upgrade mechanism so breaking changes (like src.* → deerflow.*
renames) are applied automatically before services start.

- Add config_version: 1 to config.example.yaml
- Add startup version check warning in AppConfig.from_file()
- Add scripts/config-upgrade.sh with migration registry for value replacements
- Add `make config-upgrade` target
- Auto-run config-upgrade in serve.sh and start-daemon.sh before starting services
- Add config error hints in service failure messages
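
A minimal sketch of the startup check (only the `config_version` field comes from this commit; the surrounding `AppConfig` plumbing and helper name are assumptions). Coercing to `int` also covers YAML files that quote the value, e.g. `config_version: "1"`:

```python
import logging

logger = logging.getLogger(__name__)
CURRENT_CONFIG_VERSION = 1


def check_config_version(raw_config: dict) -> bool:
    """Return True when the config is current; warn when it is outdated."""
    try:
        version = int(raw_config.get("config_version", 0))
    except (TypeError, ValueError):
        version = 0
    if version < CURRENT_CONFIG_VERSION:
        logger.warning(
            "config.yaml is at version %s but version %s is expected; "
            "run `make config-upgrade` to migrate.",
            version, CURRENT_CONFIG_VERSION,
        )
        return False
    return True
```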

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix comments

* fix: update src.* import in test_sandbox_tools_security to deerflow.*

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: handle empty config and search parent dirs for config.example.yaml

Address Copilot review comments on PR #1131:
- Guard against yaml.safe_load() returning None for empty config files
- Search parent directories for config.example.yaml instead of only
  looking next to config.yaml, fixing detection in common setups

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: correct skills root path depth and config_version type coercion

- loader.py: fix get_skills_root_path() to use 5 parent levels (was 3)
  after harness split, file lives at packages/harness/deerflow/skills/
  so parent×3 resolved to backend/packages/harness/ instead of backend/
- app_config.py: coerce config_version to int() before comparison in
  _check_config_version() to prevent TypeError when YAML stores value
  as string (e.g. config_version: "1")
- tests: add regression tests for both fixes
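
The depth bug in miniature (paths are illustrative): `Path.parents[n]` climbs `n + 1` levels, so after the move three hops no longer reach the backend root:

```python
from pathlib import Path

# New location of loader.py after the harness split:
loader_file = Path("backend/packages/harness/deerflow/skills/loader.py")

# parents[0] is the containing directory; parents[n] climbs n + 1 levels.
three_up = loader_file.parents[2]  # backend/packages/harness (the old, wrong root)
five_up = loader_file.parents[4]   # backend (the corrected root)
```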

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: update test imports from src.* to deerflow.*/app.* after harness refactor

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(harness): add tool-first ACP agent invocation (#37)

* feat(harness): add tool-first ACP agent invocation

* build(harness): make ACP dependency required

* fix(harness): address ACP review feedback

* feat(harness): decouple ACP agent workspace from thread data

ACP agents (codex, claude-code) previously used per-thread workspace
directories, causing path resolution complexity and coupling task
execution to DeerFlow's internal thread data layout. This change:

- Replace _resolve_cwd() with a fixed _get_work_dir() that always uses
  {base_dir}/acp-workspace/, eliminating virtual path translation and
  thread_id lookups
- Introduce /mnt/acp-workspace virtual path for lead agent read-only
  access to ACP agent output files (same pattern as /mnt/skills)
- Add security guards: read-only validation, path traversal prevention,
  command path allowlisting, and output masking for acp-workspace
- Update system prompt and tool description to guide LLM: send
  self-contained tasks to ACP agents, copy results via /mnt/acp-workspace
- Add 11 new security tests for ACP workspace path handling
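
The traversal guard can be sketched like this (a sketch only; the helper name is an assumption, not the sandbox's real API): resolve the requested path and confirm it stays inside the workspace root before serving a read.

```python
from pathlib import Path


def resolve_workspace_read(workspace_root: Path, relative: str) -> Path:
    """Resolve a path inside the ACP workspace, rejecting escapes."""
    root = workspace_root.resolve()
    target = (root / relative).resolve()
    if target != root and root not in target.parents:
        raise PermissionError(f"Path escapes acp-workspace: {relative}")
    return target
```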

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(prompt): inject ACP section only when ACP agents are configured

The ACP agent guidance in the system prompt is now conditionally built
by _build_acp_section(), which checks get_acp_agents() and returns an
empty string when no ACP agents are configured. This avoids polluting
the prompt with irrelevant instructions for users who don't use ACP.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix lint

* fix(harness): address Copilot review comments on sandbox path handling and ACP tool

- local_sandbox: fix path-segment boundary bug in _resolve_path (== or startswith +"/")
  and add lookahead in _resolve_paths_in_command regex to prevent /mnt/skills matching
  inside /mnt/skills-extra
- local_sandbox_provider: replace print() with logger.warning(..., exc_info=True)
- invoke_acp_agent_tool: guard getattr(option, "optionId") with None default + continue;
  move full prompt from INFO to DEBUG level (truncated to 200 chars)
- sandbox/tools: fix _get_acp_workspace_host_path docstring to match implementation;
  remove misleading "read-only" language from validate_local_bash_command_paths
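
The boundary bug in miniature: a naive `startswith("/mnt/skills")` also matches `/mnt/skills-extra`. The fix requires an exact match or a match at a path-segment boundary (the helper name is illustrative):

```python
import re


def is_under(path: str, prefix: str) -> bool:
    """True only when path is prefix itself or a descendant of it."""
    return path == prefix or path.startswith(prefix + "/")


# The same boundary rule in regex form, via a lookahead that demands a
# segment separator or end-of-string right after the prefix:
SKILLS_RE = re.compile(r"/mnt/skills(?=/|$)")
```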

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(acp): thread-isolated workspaces, permission guardrail, and ContextVar registry

P1.1 – ACP workspace thread isolation
- Add `Paths.acp_workspace_dir(thread_id)` for per-thread paths
- `_get_work_dir(thread_id)` in invoke_acp_agent_tool now uses
  `{base_dir}/threads/{thread_id}/acp-workspace/`; falls back to
  global workspace when thread_id is absent or invalid
- `_invoke` extracts thread_id from `RunnableConfig` via
  `Annotated[RunnableConfig, InjectedToolArg]`
- `sandbox/tools.py`: `_get_acp_workspace_host_path(thread_id)`,
  `_resolve_acp_workspace_path(path, thread_id)`, and all callers
  (`replace_virtual_paths_in_command`, `mask_local_paths_in_output`,
  `ls_tool`, `read_file_tool`) now resolve ACP paths per-thread

P1.2 – ACP permission guardrail
- New `auto_approve_permissions: bool = False` field in `ACPAgentConfig`
- `_build_permission_response(options, *, auto_approve: bool)` now
  defaults to deny; only approves when `auto_approve=True`
- Document field in `config.example.yaml`

P2 – Deferred tool registry race condition
- Replace module-level `_registry` global with `contextvars.ContextVar`
- Each asyncio request context gets its own registry; worker threads
  inherit the context automatically via `loop.run_in_executor`
- Expose `get_deferred_registry` / `set_deferred_registry` /
  `reset_deferred_registry` helpers

Tests: 831 pass (57 for affected modules, 3 new tests)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(sandbox): mount /mnt/acp-workspace in docker sandbox container

The AioSandboxProvider was not mounting the ACP workspace into the
sandbox container, so /mnt/acp-workspace was inaccessible when the lead
agent tried to read ACP results in docker mode.

Changes:
- `ensure_thread_dirs`: also create `acp-workspace/` (chmod 0o777) so
  the directory exists before the sandbox container starts — required
  for Docker volume mounts
- `_get_thread_mounts`: add read-only `/mnt/acp-workspace` mount using
  the per-thread host path (`host_paths.acp_workspace_dir(thread_id)`)
- Update stale CLAUDE.md description (was "fixed global workspace")
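
The mount wiring can be sketched as follows (the helper and the docker-py-style binding shape are assumptions, not the provider's exact API):

```python
def get_thread_mounts(host_paths, thread_id: str) -> dict[str, dict[str, str]]:
    """Map the per-thread host workspace to a read-only container mount."""
    host_dir = str(host_paths.acp_workspace_dir(thread_id))
    return {host_dir: {"bind": "/mnt/acp-workspace", "mode": "ro"}}
```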

Tests: `test_aio_sandbox_provider.py` (4 new tests)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): remove unused imports in test_aio_sandbox_provider

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix config

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Author: DanielWalnut
Date: 2026-03-26 14:20:18 +08:00 (committed by GitHub)
Commit: d119214fee (parent: 792c49e6af)
46 changed files with 1565 additions and 218 deletions


@@ -0,0 +1,208 @@
"""Built-in tool for invoking external ACP-compatible agents."""

import logging
import shutil
from typing import Annotated, Any

from langchain_core.runnables import RunnableConfig
from langchain_core.tools import BaseTool, InjectedToolArg, StructuredTool
from pydantic import BaseModel, Field

logger = logging.getLogger(__name__)


class _InvokeACPAgentInput(BaseModel):
    agent: str = Field(description="Name of the ACP agent to invoke")
    prompt: str = Field(description="The concise task prompt to send to the agent")


def _get_work_dir(thread_id: str | None) -> str:
    """Get the per-thread ACP workspace directory.

    Each thread gets an isolated workspace under
    ``{base_dir}/threads/{thread_id}/acp-workspace/`` so that concurrent
    sessions cannot read or overwrite each other's ACP agent outputs.
    Falls back to the legacy global ``{base_dir}/acp-workspace/`` when
    ``thread_id`` is not available (e.g. embedded / direct invocation).

    The directory is created automatically if it does not exist.

    Returns:
        An absolute physical filesystem path to use as the working directory.
    """
    from deerflow.config.paths import get_paths

    paths = get_paths()
    if thread_id:
        try:
            work_dir = paths.acp_workspace_dir(thread_id)
        except ValueError:
            logger.warning("Invalid thread_id %r for ACP workspace, falling back to global", thread_id)
            work_dir = paths.base_dir / "acp-workspace"
    else:
        work_dir = paths.base_dir / "acp-workspace"
    work_dir.mkdir(parents=True, exist_ok=True)
    logger.info("ACP agent work_dir: %s", work_dir)
    return str(work_dir)


def _build_mcp_servers() -> dict[str, dict[str, Any]]:
    """Build ACP ``mcpServers`` config from DeerFlow's enabled MCP servers."""
    from deerflow.config.extensions_config import ExtensionsConfig
    from deerflow.mcp.client import build_servers_config

    return build_servers_config(ExtensionsConfig.from_file())


def _build_permission_response(options: list[Any], *, auto_approve: bool) -> Any:
    """Build an ACP permission response.

    When ``auto_approve`` is True, selects the first ``allow_once`` (preferred)
    or ``allow_always`` option. When False (the default), always cancels —
    permission requests must be handled by the ACP agent's own policy or the
    agent must be configured to operate without requesting permissions.
    """
    from acp import RequestPermissionResponse
    from acp.schema import AllowedOutcome, DeniedOutcome

    if auto_approve:
        for preferred_kind in ("allow_once", "allow_always"):
            for option in options:
                if getattr(option, "kind", None) != preferred_kind:
                    continue
                option_id = getattr(option, "option_id", None)
                if option_id is None:
                    option_id = getattr(option, "optionId", None)
                if option_id is None:
                    continue
                return RequestPermissionResponse(
                    outcome=AllowedOutcome(outcome="selected", optionId=option_id),
                )
    return RequestPermissionResponse(outcome=DeniedOutcome(outcome="cancelled"))


def _format_invocation_error(agent: str, cmd: str, exc: Exception) -> str:
    """Return a user-facing ACP invocation error with actionable remediation."""
    if not isinstance(exc, FileNotFoundError):
        return f"Error invoking ACP agent '{agent}': {exc}"
    message = f"Error invoking ACP agent '{agent}': Command '{cmd}' was not found on PATH."
    if cmd == "codex-acp" and shutil.which("codex"):
        return (
            f"{message} The installed `codex` CLI does not speak ACP directly. "
            "Install a Codex ACP adapter (for example `npx @zed-industries/codex-acp`) "
            "or update `acp_agents.codex.command` and `args` in config.yaml."
        )
    return f"{message} Install the agent binary or update `acp_agents.{agent}.command` in config.yaml."


def build_invoke_acp_agent_tool(agents: dict) -> BaseTool:
    """Create the ``invoke_acp_agent`` tool with a description generated from configured agents.

    The tool description includes the list of available agents so that the LLM
    knows which agents it can invoke without requiring hardcoded names.

    Args:
        agents: Mapping of agent name -> ``ACPAgentConfig``.

    Returns:
        A LangChain ``BaseTool`` ready to be included in the tool list.
    """
    agent_lines = "\n".join(f"- {name}: {cfg.description}" for name, cfg in agents.items())
    description = (
        "Invoke an external ACP-compatible agent and return its final response.\n\n"
        "Available agents:\n"
        f"{agent_lines}\n\n"
        "IMPORTANT: ACP agents operate in their own independent workspace. "
        "Do NOT include /mnt/user-data paths in the prompt. "
        "Give the agent a self-contained task description — it will produce results in its own workspace. "
        "After the agent completes, its output files are accessible at /mnt/acp-workspace/ (read-only)."
    )
    # Capture agents in closure so the function can reference it
    _agents = dict(agents)

    async def _invoke(agent: str, prompt: str, config: Annotated[RunnableConfig, InjectedToolArg] = None) -> str:
        logger.info("Invoking ACP agent %s (prompt length: %d)", agent, len(prompt))
        logger.debug("Invoking ACP agent %s with prompt: %.200s%s", agent, prompt, "..." if len(prompt) > 200 else "")
        if agent not in _agents:
            available = ", ".join(_agents.keys())
            return f"Error: Unknown agent '{agent}'. Available: {available}"
        agent_config = _agents[agent]
        thread_id: str | None = ((config or {}).get("configurable") or {}).get("thread_id")
        try:
            from acp import PROTOCOL_VERSION, Client, text_block
            from acp.schema import ClientCapabilities, Implementation
        except ImportError:
            return "Error: agent-client-protocol package is not installed. Run `uv sync` to install project dependencies."

        class _CollectingClient(Client):
            """Minimal ACP Client that collects streamed text from session updates."""

            def __init__(self) -> None:
                self._chunks: list[str] = []

            @property
            def collected_text(self) -> str:
                return "".join(self._chunks)

            async def session_update(self, session_id: str, update, **kwargs) -> None:  # type: ignore[override]
                try:
                    from acp.schema import TextContentBlock

                    if hasattr(update, "content") and isinstance(update.content, TextContentBlock):
                        self._chunks.append(update.content.text)
                except Exception:
                    pass

            async def request_permission(self, options, session_id: str, tool_call, **kwargs):  # type: ignore[override]
                response = _build_permission_response(options, auto_approve=agent_config.auto_approve_permissions)
                outcome = response.outcome.outcome
                if outcome == "selected":
                    logger.info("ACP permission auto-approved for tool call %s in session %s", tool_call.tool_call_id, session_id)
                else:
                    logger.warning(
                        "ACP permission denied for tool call %s in session %s "
                        "(set auto_approve_permissions: true in config.yaml to enable)",
                        tool_call.tool_call_id, session_id,
                    )
                return response

        client = _CollectingClient()
        cmd = agent_config.command
        args = agent_config.args or []
        physical_cwd = _get_work_dir(thread_id)
        mcp_servers = _build_mcp_servers()
        try:
            from acp import spawn_agent_process

            async with spawn_agent_process(client, cmd, *args, cwd=physical_cwd) as (conn, proc):
                logger.info("Spawning ACP agent '%s' with command '%s' and args %s in cwd %s", agent, cmd, args, physical_cwd)
                await conn.initialize(
                    protocol_version=PROTOCOL_VERSION,
                    client_capabilities=ClientCapabilities(),
                    client_info=Implementation(name="deerflow", title="DeerFlow", version="0.1.0"),
                )
                session_kwargs: dict[str, Any] = {"cwd": physical_cwd, "mcp_servers": mcp_servers}
                if agent_config.model:
                    session_kwargs["model"] = agent_config.model
                session = await conn.new_session(**session_kwargs)
                await conn.prompt(
                    session_id=session.session_id,
                    prompt=[text_block(prompt)],
                )
                result = client.collected_text
                logger.debug("ACP agent '%s' returned %s", agent, result[:1000])
                logger.info("ACP agent '%s' returned %d characters", agent, len(result))
                return result or "(no response)"
        except Exception as e:
            logger.error("ACP agent '%s' invocation failed: %s", agent, e)
            return _format_invocation_error(agent, cmd, e)

    return StructuredTool.from_function(
        name="invoke_acp_agent",
        description=description,
        coroutine=_invoke,
        args_schema=_InvokeACPAgentInput,
    )


@@ -9,6 +9,7 @@ call them until it fetches their full schema via the tool_search tool.
 Source-agnostic: no mention of MCP or tool origin.
 """

+import contextvars
 import json
 import logging
 import re
@@ -108,24 +109,31 @@ def _regex_score(pattern: str, entry: DeferredToolEntry) -> int:
     return len(regex.findall(f"{entry.name} {entry.description}"))


-# ── Singleton ──
+# ── Per-request registry (ContextVar) ──
+#
+# Using a ContextVar instead of a module-level global prevents concurrent
+# requests from clobbering each other's registry. In asyncio-based LangGraph
+# each graph run executes in its own async context, so each request gets an
+# independent registry value. For synchronous tools run via
+# loop.run_in_executor, Python copies the current context to the worker thread,
+# so the ContextVar value is correctly inherited there too.

-_registry: DeferredToolRegistry | None = None
+_registry_var: contextvars.ContextVar[DeferredToolRegistry | None] = contextvars.ContextVar(
+    "deferred_tool_registry", default=None
+)


 def get_deferred_registry() -> DeferredToolRegistry | None:
-    return _registry
+    return _registry_var.get()


 def set_deferred_registry(registry: DeferredToolRegistry) -> None:
-    global _registry
-    _registry = registry
+    _registry_var.set(registry)


 def reset_deferred_registry() -> None:
-    """Reset the deferred registry singleton. Useful for testing."""
-    global _registry
-    _registry = None
+    """Reset the deferred registry for the current async context."""
+    _registry_var.set(None)


 # ── Tool ──


@@ -97,5 +97,18 @@ def get_available_tools(
     except Exception as e:
         logger.error(f"Failed to get cached MCP tools: {e}")

-    logger.info(f"Total tools loaded: {len(loaded_tools)}, built-in tools: {len(builtin_tools)}, MCP tools: {len(mcp_tools)}")
-    return loaded_tools + builtin_tools + mcp_tools
+    # Add invoke_acp_agent tool if any ACP agents are configured
+    acp_tools: list[BaseTool] = []
+    try:
+        from deerflow.config.acp_config import get_acp_agents
+        from deerflow.tools.builtins.invoke_acp_agent_tool import build_invoke_acp_agent_tool
+
+        acp_agents = get_acp_agents()
+        if acp_agents:
+            acp_tools.append(build_invoke_acp_agent_tool(acp_agents))
+            logger.info(f"Including invoke_acp_agent tool ({len(acp_agents)} agent(s): {list(acp_agents.keys())})")
+    except Exception as e:
+        logger.warning(f"Failed to load ACP tool: {e}")
+
+    logger.info(f"Total tools loaded: {len(loaded_tools)}, built-in tools: {len(builtin_tools)}, MCP tools: {len(mcp_tools)}, ACP tools: {len(acp_tools)}")
+    return loaded_tools + builtin_tools + mcp_tools + acp_tools