Files
deer-flow/backend/packages/harness/deerflow/community/aio_sandbox/local_backend.py

328 lines
12 KiB
Python
Raw Normal View History

"""Local container backend for sandbox provisioning.
Manages sandbox containers using Docker or Apple Container on the local machine.
Handles container lifecycle, port allocation, and cross-process container discovery.
"""
from __future__ import annotations
import logging
import os
import subprocess
refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) * refactor: extract shared utils to break harness→app cross-layer imports Move _validate_skill_frontmatter to src/skills/validation.py and CONVERTIBLE_EXTENSIONS + convert_file_to_markdown to src/utils/file_conversion.py. This eliminates the two reverse dependencies from client.py (harness layer) into gateway/routers/ (app layer), preparing for the harness/app package split. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: split backend/src into harness (deerflow.*) and app (app.*) Physically split the monolithic backend/src/ package into two layers: - **Harness** (`packages/harness/deerflow/`): publishable agent framework package with import prefix `deerflow.*`. Contains agents, sandbox, tools, models, MCP, skills, config, and all core infrastructure. - **App** (`app/`): unpublished application code with import prefix `app.*`. Contains gateway (FastAPI REST API) and channels (IM integrations). Key changes: - Move 13 harness modules to packages/harness/deerflow/ via git mv - Move gateway + channels to app/ via git mv - Rename all imports: src.* → deerflow.* (harness) / app.* (app layer) - Set up uv workspace with deerflow-harness as workspace member - Update langgraph.json, config.example.yaml, all scripts, Docker files - Add build-system (hatchling) to harness pyproject.toml - Add PYTHONPATH=. to gateway startup commands for app.* resolution - Update ruff.toml with known-first-party for import sorting - Update all documentation to reflect new directory structure Boundary rule enforced: harness code never imports from app. All 429 tests pass. Lint clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: add harness→app boundary check test and update docs Add test_harness_boundary.py that scans all Python files in packages/harness/deerflow/ and fails if any `from app.*` or `import app.*` statement is found. This enforces the architectural rule that the harness layer never depends on the app layer. Update CLAUDE.md to document the harness/app split architecture, import conventions, and the boundary enforcement test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add config versioning with auto-upgrade on startup When config.example.yaml schema changes, developers' local config.yaml files can silently become outdated. This adds a config_version field and auto-upgrade mechanism so breaking changes (like src.* → deerflow.* renames) are applied automatically before services start. - Add config_version: 1 to config.example.yaml - Add startup version check warning in AppConfig.from_file() - Add scripts/config-upgrade.sh with migration registry for value replacements - Add `make config-upgrade` target - Auto-run config-upgrade in serve.sh and start-daemon.sh before starting services - Add config error hints in service failure messages Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix comments * fix: update src.* import in test_sandbox_tools_security to deerflow.* Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: handle empty config and search parent dirs for config.example.yaml Address Copilot review comments on PR #1131: - Guard against yaml.safe_load() returning None for empty config files - Search parent directories for config.example.yaml instead of only looking next to config.yaml, fixing detection in common setups Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: correct skills root path depth and config_version type coercion - loader.py: fix get_skills_root_path() to use 5 parent levels (was 3) after harness split, file lives at packages/harness/deerflow/skills/ so parent×3 resolved to backend/packages/harness/ instead of backend/ - app_config.py: coerce config_version to int() before comparison in _check_config_version() to prevent TypeError when YAML stores value as string (e.g. config_version: "1") - tests: add regression tests for both fixes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: update test imports from src.* to deerflow.*/app.* after harness refactor Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 22:55:52 +08:00
from deerflow.utils.network import get_free_port, release_port
from .backend import SandboxBackend, wait_for_sandbox_ready
from .sandbox_info import SandboxInfo
logger = logging.getLogger(__name__)
class LocalContainerBackend(SandboxBackend):
"""Backend that manages sandbox containers locally using Docker or Apple Container.
On macOS, automatically prefers Apple Container if available, otherwise falls back to Docker.
On other platforms, uses Docker.
Features:
- Deterministic container naming for cross-process discovery
- Port allocation with thread-safe utilities
- Container lifecycle management (start/stop with --rm)
- Support for volume mounts and environment variables
"""
def __init__(
self,
*,
image: str,
base_port: int,
container_prefix: str,
config_mounts: list,
environment: dict[str, str],
):
"""Initialize the local container backend.
Args:
image: Container image to use.
base_port: Base port number to start searching for free ports.
container_prefix: Prefix for container names (e.g., "deer-flow-sandbox").
config_mounts: Volume mount configurations from config (list of VolumeMountConfig).
environment: Environment variables to inject into containers.
"""
self._image = image
self._base_port = base_port
self._container_prefix = container_prefix
self._config_mounts = config_mounts
self._environment = environment
self._runtime = self._detect_runtime()
@property
def runtime(self) -> str:
"""The detected container runtime ("docker" or "container")."""
return self._runtime
def _detect_runtime(self) -> str:
"""Detect which container runtime to use.
On macOS, prefer Apple Container if available, otherwise fall back to Docker.
On other platforms, use Docker.
Returns:
"container" for Apple Container, "docker" for Docker.
"""
import platform
if platform.system() == "Darwin":
try:
result = subprocess.run(
["container", "--version"],
capture_output=True,
text=True,
check=True,
timeout=5,
)
logger.info(f"Detected Apple Container: {result.stdout.strip()}")
return "container"
except (FileNotFoundError, subprocess.CalledProcessError, subprocess.TimeoutExpired):
logger.info("Apple Container not available, falling back to Docker")
return "docker"
# ── SandboxBackend interface ──────────────────────────────────────────
def create(self, thread_id: str, sandbox_id: str, extra_mounts: list[tuple[str, str, bool]] | None = None) -> SandboxInfo:
"""Start a new container and return its connection info.
Args:
thread_id: Thread ID for which the sandbox is being created. Useful for backends that want to organize sandboxes by thread.
sandbox_id: Deterministic sandbox identifier (used in container name).
extra_mounts: Additional volume mounts as (host_path, container_path, read_only) tuples.
Returns:
SandboxInfo with container details.
Raises:
RuntimeError: If the container fails to start.
"""
container_name = f"{self._container_prefix}-{sandbox_id}"
# Retry loop: if Docker rejects the port (e.g. a stale container still
# holds the binding after a process restart), skip that port and try the
# next one. The socket-bind check in get_free_port mirrors Docker's
# 0.0.0.0 bind, but Docker's port-release can be slightly asynchronous,
# so a reactive fallback here ensures we always make progress.
_next_start = self._base_port
container_id: str | None = None
port: int = 0
for _attempt in range(10):
port = get_free_port(start_port=_next_start)
try:
container_id = self._start_container(container_name, port, extra_mounts)
break
except RuntimeError as exc:
release_port(port)
err = str(exc)
err_lower = err.lower()
# Port already bound: skip this port and retry with the next one.
if "port is already allocated" in err or "address already in use" in err_lower:
logger.warning(f"Port {port} rejected by Docker (already allocated), retrying with next port")
_next_start = port + 1
continue
# Container-name conflict: another process may have already started
# the deterministic sandbox container for this sandbox_id. Try to
# discover and adopt the existing container instead of failing.
if "is already in use by container" in err_lower or "conflict. the container name" in err_lower:
logger.warning(f"Container name {container_name} already in use, attempting to discover existing sandbox instance")
existing = self.discover(sandbox_id)
if existing is not None:
return existing
raise
else:
raise RuntimeError("Could not start sandbox container: all candidate ports are already allocated by Docker")
# When running inside Docker (DooD), sandbox containers are reachable via
# host.docker.internal rather than localhost (they run on the host daemon).
sandbox_host = os.environ.get("DEER_FLOW_SANDBOX_HOST", "localhost")
return SandboxInfo(
sandbox_id=sandbox_id,
sandbox_url=f"http://{sandbox_host}:{port}",
container_name=container_name,
container_id=container_id,
)
def destroy(self, info: SandboxInfo) -> None:
"""Stop the container and release its port."""
if info.container_id:
self._stop_container(info.container_id)
# Extract port from sandbox_url for release
try:
from urllib.parse import urlparse
port = urlparse(info.sandbox_url).port
if port:
release_port(port)
except Exception:
pass
def is_alive(self, info: SandboxInfo) -> bool:
"""Check if the container is still running (lightweight, no HTTP)."""
if info.container_name:
return self._is_container_running(info.container_name)
return False
def discover(self, sandbox_id: str) -> SandboxInfo | None:
"""Discover an existing container by its deterministic name.
Checks if a container with the expected name is running, retrieves its
port, and verifies it responds to health checks.
Args:
sandbox_id: The deterministic sandbox ID (determines container name).
Returns:
SandboxInfo if container found and healthy, None otherwise.
"""
container_name = f"{self._container_prefix}-{sandbox_id}"
if not self._is_container_running(container_name):
return None
port = self._get_container_port(container_name)
if port is None:
return None
sandbox_host = os.environ.get("DEER_FLOW_SANDBOX_HOST", "localhost")
sandbox_url = f"http://{sandbox_host}:{port}"
if not wait_for_sandbox_ready(sandbox_url, timeout=5):
return None
return SandboxInfo(
sandbox_id=sandbox_id,
sandbox_url=sandbox_url,
container_name=container_name,
)
# ── Container operations ─────────────────────────────────────────────
def _start_container(
self,
container_name: str,
port: int,
extra_mounts: list[tuple[str, str, bool]] | None = None,
) -> str:
"""Start a new container.
Args:
container_name: Name for the container.
port: Host port to map to container port 8080.
extra_mounts: Additional volume mounts.
Returns:
The container ID.
Raises:
RuntimeError: If container fails to start.
"""
cmd = [self._runtime, "run"]
# Docker-specific security options
if self._runtime == "docker":
cmd.extend(["--security-opt", "seccomp=unconfined"])
cmd.extend(
[
"--rm",
"-d",
"-p",
f"{port}:8080",
"--name",
container_name,
]
)
# Environment variables
for key, value in self._environment.items():
cmd.extend(["-e", f"{key}={value}"])
# Config-level volume mounts
for mount in self._config_mounts:
mount_spec = f"{mount.host_path}:{mount.container_path}"
if mount.read_only:
mount_spec += ":ro"
cmd.extend(["-v", mount_spec])
# Extra mounts (thread-specific, skills, etc.)
if extra_mounts:
for host_path, container_path, read_only in extra_mounts:
mount_spec = f"{host_path}:{container_path}"
if read_only:
mount_spec += ":ro"
cmd.extend(["-v", mount_spec])
cmd.append(self._image)
logger.info(f"Starting container using {self._runtime}: {' '.join(cmd)}")
try:
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
container_id = result.stdout.strip()
logger.info(f"Started container {container_name} (ID: {container_id}) using {self._runtime}")
return container_id
except subprocess.CalledProcessError as e:
logger.error(f"Failed to start container using {self._runtime}: {e.stderr}")
raise RuntimeError(f"Failed to start sandbox container: {e.stderr}")
def _stop_container(self, container_id: str) -> None:
"""Stop a container (--rm ensures automatic removal)."""
try:
subprocess.run(
[self._runtime, "stop", container_id],
capture_output=True,
text=True,
check=True,
)
logger.info(f"Stopped container {container_id} using {self._runtime}")
except subprocess.CalledProcessError as e:
logger.warning(f"Failed to stop container {container_id}: {e.stderr}")
def _is_container_running(self, container_name: str) -> bool:
"""Check if a named container is currently running.
This enables cross-process container discovery any process can detect
containers started by another process via the deterministic container name.
"""
try:
result = subprocess.run(
[self._runtime, "inspect", "-f", "{{.State.Running}}", container_name],
capture_output=True,
text=True,
timeout=5,
)
return result.returncode == 0 and result.stdout.strip().lower() == "true"
except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
return False
def _get_container_port(self, container_name: str) -> int | None:
"""Get the host port of a running container.
Args:
container_name: The container name to inspect.
Returns:
The host port mapped to container port 8080, or None if not found.
"""
try:
result = subprocess.run(
[self._runtime, "port", container_name, "8080"],
capture_output=True,
text=True,
timeout=5,
)
if result.returncode == 0 and result.stdout.strip():
# Output format: "0.0.0.0:PORT" or ":::PORT"
port_str = result.stdout.strip().split(":")[-1]
return int(port_str)
except (subprocess.CalledProcessError, subprocess.TimeoutExpired, ValueError):
pass
return None