Files
DanielWalnut 76803b826f refactor: split backend into harness (deerflow.*) and app (app.*) (#1131)
* refactor: extract shared utils to break harness→app cross-layer imports

Move _validate_skill_frontmatter to src/skills/validation.py and
CONVERTIBLE_EXTENSIONS + convert_file_to_markdown to src/utils/file_conversion.py.
This eliminates the two reverse dependencies from client.py (harness layer)
into gateway/routers/ (app layer), preparing for the harness/app package split.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: split backend/src into harness (deerflow.*) and app (app.*)

Physically split the monolithic backend/src/ package into two layers:

- **Harness** (`packages/harness/deerflow/`): publishable agent framework
  package with import prefix `deerflow.*`. Contains agents, sandbox, tools,
  models, MCP, skills, config, and all core infrastructure.

- **App** (`app/`): unpublished application code with import prefix `app.*`.
  Contains gateway (FastAPI REST API) and channels (IM integrations).

Key changes:
- Move 13 harness modules to packages/harness/deerflow/ via git mv
- Move gateway + channels to app/ via git mv
- Rename all imports: src.* → deerflow.* (harness) / app.* (app layer)
- Set up uv workspace with deerflow-harness as workspace member
- Update langgraph.json, config.example.yaml, all scripts, Docker files
- Add build-system (hatchling) to harness pyproject.toml
- Add PYTHONPATH=. to gateway startup commands for app.* resolution
- Update ruff.toml with known-first-party for import sorting
- Update all documentation to reflect new directory structure

Boundary rule enforced: harness code never imports from app.
All 429 tests pass. Lint clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: add harness→app boundary check test and update docs

Add test_harness_boundary.py that scans all Python files in
packages/harness/deerflow/ and fails if any `from app.*` or
`import app.*` statement is found. This enforces the architectural
rule that the harness layer never depends on the app layer.

Update CLAUDE.md to document the harness/app split architecture,
import conventions, and the boundary enforcement test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add config versioning with auto-upgrade on startup

When config.example.yaml schema changes, developers' local config.yaml
files can silently become outdated. This adds a config_version field and
auto-upgrade mechanism so breaking changes (like src.* → deerflow.*
renames) are applied automatically before services start.

- Add config_version: 1 to config.example.yaml
- Add startup version check warning in AppConfig.from_file()
- Add scripts/config-upgrade.sh with migration registry for value replacements
- Add `make config-upgrade` target
- Auto-run config-upgrade in serve.sh and start-daemon.sh before starting services
- Add config error hints in service failure messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix comments

* fix: update src.* import in test_sandbox_tools_security to deerflow.*

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: handle empty config and search parent dirs for config.example.yaml

Address Copilot review comments on PR #1131:
- Guard against yaml.safe_load() returning None for empty config files
- Search parent directories for config.example.yaml instead of only
  looking next to config.yaml, fixing detection in common setups

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: correct skills root path depth and config_version type coercion

- loader.py: fix get_skills_root_path() to use 5 parent levels (was 3)
  after harness split, file lives at packages/harness/deerflow/skills/
  so parent×3 resolved to backend/packages/harness/ instead of backend/
- app_config.py: coerce config_version to int() before comparison in
  _check_config_version() to prevent TypeError when YAML stores value
  as string (e.g. config_version: "1")
- tests: add regression tests for both fixes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: update test imports from src.* to deerflow.*/app.* after harness refactor

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 22:55:52 +08:00

96 lines
4.0 KiB
Python

from importlib import import_module
MODULE_TO_PACKAGE_HINTS = {
"langchain_google_genai": "langchain-google-genai",
"langchain_anthropic": "langchain-anthropic",
"langchain_openai": "langchain-openai",
"langchain_deepseek": "langchain-deepseek",
}
def _build_missing_dependency_hint(module_path: str, err: ImportError) -> str:
"""Build an actionable hint when module import fails."""
module_root = module_path.split(".", 1)[0]
missing_module = getattr(err, "name", None) or module_root
# Prefer provider package hints for known integrations, even when the import
# error is triggered by a transitive dependency (e.g. `google`).
package_name = MODULE_TO_PACKAGE_HINTS.get(module_root)
if package_name is None:
package_name = MODULE_TO_PACKAGE_HINTS.get(missing_module, missing_module.replace("_", "-"))
return f"Missing dependency '{missing_module}'. Install it with `uv add {package_name}` (or `pip install {package_name}`), then restart DeerFlow."
def resolve_variable[T](
variable_path: str,
expected_type: type[T] | tuple[type, ...] | None = None,
) -> T:
"""Resolve a variable from a path.
Args:
variable_path: The path to the variable (e.g. "parent_package_name.sub_package_name.module_name:variable_name").
expected_type: Optional type or tuple of types to validate the resolved variable against.
If provided, uses isinstance() to check if the variable is an instance of the expected type(s).
Returns:
The resolved variable.
Raises:
ImportError: If the module path is invalid or the attribute doesn't exist.
ValueError: If the resolved variable doesn't pass the validation checks.
"""
try:
module_path, variable_name = variable_path.rsplit(":", 1)
except ValueError as err:
raise ImportError(f"{variable_path} doesn't look like a variable path. Example: parent_package_name.sub_package_name.module_name:variable_name") from err
try:
module = import_module(module_path)
except ImportError as err:
module_root = module_path.split(".", 1)[0]
err_name = getattr(err, "name", None)
if isinstance(err, ModuleNotFoundError) or err_name == module_root:
hint = _build_missing_dependency_hint(module_path, err)
raise ImportError(f"Could not import module {module_path}. {hint}") from err
# Preserve the original ImportError message for non-missing-module failures.
raise ImportError(f"Error importing module {module_path}: {err}") from err
try:
variable = getattr(module, variable_name)
except AttributeError as err:
raise ImportError(f"Module {module_path} does not define a {variable_name} attribute/class") from err
# Type validation
if expected_type is not None:
if not isinstance(variable, expected_type):
type_name = expected_type.__name__ if isinstance(expected_type, type) else " or ".join(t.__name__ for t in expected_type)
raise ValueError(f"{variable_path} is not an instance of {type_name}, got {type(variable).__name__}")
return variable
def resolve_class[T](class_path: str, base_class: type[T] | None = None) -> type[T]:
"""Resolve a class from a module path and class name.
Args:
class_path: The path to the class (e.g. "langchain_openai:ChatOpenAI").
base_class: The base class to check if the resolved class is a subclass of.
Returns:
The resolved class.
Raises:
ImportError: If the module path is invalid or the attribute doesn't exist.
ValueError: If the resolved object is not a class or not a subclass of base_class.
"""
model_class = resolve_variable(class_path, expected_type=type)
if not isinstance(model_class, type):
raise ValueError(f"{class_path} is not a valid class")
if base_class is not None and not issubclass(model_class, base_class):
raise ValueError(f"{class_path} is not a subclass of {base_class.__name__}")
return model_class