diff --git a/backend/CLAUDE.md b/backend/CLAUDE.md index d6b275c..7af5fd7 100644 --- a/backend/CLAUDE.md +++ b/backend/CLAUDE.md @@ -156,13 +156,14 @@ Middlewares execute in strict order in `packages/harness/deerflow/agents/lead_ag 2. **UploadsMiddleware** - Tracks and injects newly uploaded files into conversation 3. **SandboxMiddleware** - Acquires sandbox, stores `sandbox_id` in state 4. **DanglingToolCallMiddleware** - Injects placeholder ToolMessages for AIMessage tool_calls that lack responses (e.g., due to user interruption) -5. **SummarizationMiddleware** - Context reduction when approaching token limits (optional, if enabled) -6. **TodoListMiddleware** - Task tracking with `write_todos` tool (optional, if plan_mode) -7. **TitleMiddleware** - Auto-generates thread title after first complete exchange and normalizes structured message content before prompting the title model -8. **MemoryMiddleware** - Queues conversations for async memory update (filters to user + final AI responses) -9. **ViewImageMiddleware** - Injects base64 image data before LLM call (conditional on vision support) -10. **SubagentLimitMiddleware** - Truncates excess `task` tool calls from model response to enforce `MAX_CONCURRENT_SUBAGENTS` limit (optional, if subagent_enabled) -11. **ClarificationMiddleware** - Intercepts `ask_clarification` tool calls, interrupts via `Command(goto=END)` (must be last) +5. **GuardrailMiddleware** - Pre-tool-call authorization via pluggable `GuardrailProvider` protocol (optional, if `guardrails.enabled` in config). Evaluates each tool call and returns error ToolMessage on deny. Three provider options: built-in `AllowlistProvider` (zero deps), OAP policy providers (e.g. `aport-agent-guardrails`), or custom providers. See [docs/GUARDRAILS.md](docs/GUARDRAILS.md) for setup, usage, and how to implement a provider. +6. **SummarizationMiddleware** - Context reduction when approaching token limits (optional, if enabled) +7. 
**TodoListMiddleware** - Task tracking with `write_todos` tool (optional, if plan_mode) +8. **TitleMiddleware** - Auto-generates thread title after first complete exchange and normalizes structured message content before prompting the title model +9. **MemoryMiddleware** - Queues conversations for async memory update (filters to user + final AI responses) +10. **ViewImageMiddleware** - Injects base64 image data before LLM call (conditional on vision support) +11. **SubagentLimitMiddleware** - Truncates excess `task` tool calls from model response to enforce `MAX_CONCURRENT_SUBAGENTS` limit (optional, if subagent_enabled) +12. **ClarificationMiddleware** - Intercepts `ask_clarification` tool calls, interrupts via `Command(goto=END)` (must be last) ### Configuration System diff --git a/backend/docs/GUARDRAILS.md b/backend/docs/GUARDRAILS.md new file mode 100644 index 0000000..81fc4be --- /dev/null +++ b/backend/docs/GUARDRAILS.md @@ -0,0 +1,385 @@ +# Guardrails: Pre-Tool-Call Authorization + +> **Context:** [Issue #1213](https://github.com/bytedance/deer-flow/issues/1213) — DeerFlow has Docker sandboxing and human approval via `ask_clarification`, but no deterministic, policy-driven authorization layer for tool calls. An agent running autonomous multi-step tasks can execute any loaded tool with any arguments. Guardrails add a middleware that evaluates every tool call against a policy **before** execution. + +## Why Guardrails + +``` +Without guardrails: With guardrails: + + Agent Agent + │ │ + ▼ ▼ + ┌──────────┐ ┌──────────┐ + │ bash │──▶ executes immediately │ bash │──▶ GuardrailMiddleware + │ rm -rf / │ │ rm -rf / │ │ + └──────────┘ └──────────┘ ▼ + ┌──────────────┐ + │ Provider │ + │ evaluates │ + │ against │ + │ policy │ + └──────┬───────┘ + │ + ┌─────┴─────┐ + │ │ + ALLOW DENY + │ │ + ▼ ▼ + Tool runs Agent sees: + normally "Guardrail denied: + rm -rf blocked" +``` + +- **Sandboxing** provides process isolation but not semantic authorization. 
A sandboxed `bash` can still `curl` data out. +- **Human approval** (`ask_clarification`) requires a human in the loop for every action. Not viable for autonomous workflows. +- **Guardrails** provide deterministic, policy-driven authorization that works without human intervention. + +## Architecture + +``` +┌─────────────────────────────────────────────────────────────────────┐ +│ Middleware Chain │ +│ │ +│ 1. ThreadDataMiddleware ─── per-thread dirs │ +│ 2. UploadsMiddleware ─── file upload tracking │ +│ 3. SandboxMiddleware ─── sandbox acquisition │ +│ 4. DanglingToolCallMiddleware ── fix incomplete tool calls │ +│ 5. GuardrailMiddleware ◄──── EVALUATES EVERY TOOL CALL │ +│ 6. ToolErrorHandlingMiddleware ── convert exceptions to messages │ +│ 7-12. (Summarization, Title, Memory, Vision, Subagent, Clarify) │ +│ │ +└─────────────────────────────────────────────────────────────────────┘ + │ + ▼ + ┌──────────────────────────┐ + │ GuardrailProvider │ ◄── pluggable: any class + │ (configured in YAML) │ with evaluate/aevaluate + └────────────┬─────────────┘ + │ + ┌─────────┼──────────────┐ + │ │ │ + ▼ ▼ ▼ + Built-in OAP Passport Custom + Allowlist Provider Provider + (zero dep) (open standard) (your code) + │ + Any implementation + (e.g. APort, or + your own evaluator) +``` + +The `GuardrailMiddleware` implements `wrap_tool_call` / `awrap_tool_call` (the same `AgentMiddleware` pattern used by `ToolErrorHandlingMiddleware`). It: + +1. Builds a `GuardrailRequest` with tool name, arguments, and passport reference +2. Calls `provider.evaluate(request)` on whatever provider is configured +3. If **deny**: returns `ToolMessage(status="error")` with the reason -- agent sees the denial and adapts +4. If **allow**: passes through to the actual tool handler +5. If **provider error** and `fail_closed=true` (default): blocks the call +6. 
`GraphBubbleUp` exceptions (LangGraph control signals) are always propagated, never caught + +## Three Provider Options + +### Option 1: Built-in AllowlistProvider (Zero Dependencies) + +The simplest option. Ships with DeerFlow. Block or allow tools by name. No external packages, no passport, no network. + +**config.yaml:** +```yaml +guardrails: + enabled: true + provider: + use: deerflow.guardrails.builtin:AllowlistProvider + config: + denied_tools: ["bash", "write_file"] +``` + +This blocks `bash` and `write_file` for all requests. All other tools pass through. + +You can also use an allowlist (only these tools are permitted): +```yaml +guardrails: + enabled: true + provider: + use: deerflow.guardrails.builtin:AllowlistProvider + config: + allowed_tools: ["web_search", "read_file", "ls"] +``` + +**Try it:** +1. Add the config above to your `config.yaml` +2. Start DeerFlow: `make dev` +3. Ask the agent: "Use bash to run echo hello" +4. The agent sees: `Guardrail denied: tool 'bash' was blocked (oap.tool_not_allowed)` + +### Option 2: OAP Passport Provider (Policy-Based) + +For policy enforcement based on the [Open Agent Passport (OAP)](https://github.com/aporthq/aport-spec) open standard. An OAP passport is a JSON document that declares an agent's identity, capabilities, and operational limits. Any provider that reads an OAP passport and returns OAP-compliant decisions works with DeerFlow. 
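As a rough illustration of what "reading a passport" can mean in practice, the sketch below applies a passport's `status` and `system.command.execute` limits to a shell command. The field names and reason codes are the ones used in this document; the function name `check_command` and the matching rules (substring match for `blocked_patterns`, first-token match for `allowed_commands`) are assumptions made for the example, not the OAP specification or APort's behavior.

```python
# Hypothetical sketch: field names follow the example passport in this
# document; the matching rules are assumptions, not the OAP specification.


def check_command(passport: dict, command: str) -> tuple[bool, str]:
    """Return (allow, reason_code) for a shell command under a passport."""
    # The status field acts as a kill switch: anything but "active" denies.
    if passport.get("status") != "active":
        return False, "oap.passport_suspended"

    limits = passport.get("limits", {}).get("system.command.execute", {})

    # Assumption: blocked patterns are plain substrings of the command line.
    for pattern in limits.get("blocked_patterns", []):
        if pattern in command:
            return False, "oap.blocked_pattern"

    # Assumption: the first whitespace-separated token is the command name,
    # and "*" in allowed_commands permits any command.
    allowed = limits.get("allowed_commands", [])
    parts = command.split()
    name = parts[0] if parts else ""
    if "*" not in allowed and name not in allowed:
        return False, "oap.command_not_allowed"

    return True, "oap.allowed"


passport = {
    "spec_version": "oap/1.0",
    "status": "active",
    "limits": {
        "system.command.execute": {
            "allowed_commands": ["git", "npm", "node", "ls"],
            "blocked_patterns": ["rm -rf", "sudo", "chmod 777"],
        }
    },
}

print(check_command(passport, "git status"))
print(check_command(passport, "rm -rf /tmp/build"))
print(check_command(passport, "curl https://example.com"))
```

Blocked patterns are checked before the allowlist here so that a denied pattern wins even when the command itself is allowed; a real provider may order or combine these checks differently.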
+ +``` +┌─────────────────────────────────────────────────────────────┐ +│ OAP Passport (JSON) │ +│ (open standard, any provider) │ +│ { │ +│ "spec_version": "oap/1.0", │ +│ "status": "active", │ +│ "capabilities": [ │ +│ {"id": "system.command.execute"}, │ +│ {"id": "data.file.read"}, │ +│ {"id": "data.file.write"}, │ +│ {"id": "web.fetch"}, │ +│ {"id": "mcp.tool.execute"} │ +│ ], │ +│ "limits": { │ +│ "system.command.execute": { │ +│ "allowed_commands": ["git", "npm", "node", "ls"], │ +│ "blocked_patterns": ["rm -rf", "sudo", "chmod 777"] │ +│ } │ +│ } │ +│ } │ +└──────────────────────────┬──────────────────────────────────┘ + │ + Any OAP-compliant provider + ┌────────────────┼────────────────┐ + │ │ │ + Your own APort (ref. Other future + evaluator implementation) implementations +``` + +**Creating a passport manually:** + +An OAP passport is just a JSON file. You can create one by hand following the [OAP specification](https://github.com/aporthq/aport-spec/blob/main/oap/oap-spec.md) and validate it against the [JSON schema](https://github.com/aporthq/aport-spec/blob/main/oap/passport-schema.json). See the [examples](https://github.com/aporthq/aport-spec/tree/main/oap/examples) directory for templates. + +**Using APort as a reference implementation:** + +[APort Agent Guardrails](https://github.com/aporthq/aport-agent-guardrails) is one open-source (Apache 2.0) implementation of an OAP provider. It handles passport creation, local evaluation, and optional hosted API evaluation. 
+ +```bash +pip install aport-agent-guardrails +aport setup --framework deerflow +``` + +This creates: +- `~/.aport/deerflow/config.yaml` -- evaluator config (local or API mode) +- `~/.aport/deerflow/aport/passport.json` -- OAP passport with capabilities and limits + +**config.yaml (using APort as the provider):** +```yaml +guardrails: + enabled: true + provider: + use: aport_guardrails.providers.generic:OAPGuardrailProvider +``` + +**config.yaml (using your own OAP provider):** +```yaml +guardrails: + enabled: true + provider: + use: my_oap_provider:MyOAPProvider + config: + passport_path: ./my-passport.json +``` + +Any provider that accepts `framework` as a kwarg and implements `evaluate`/`aevaluate` works. The OAP standard defines the passport format and decision codes; DeerFlow doesn't care which provider reads them. + +**What the passport controls:** + +| Passport field | What it does | Example | +|---|---|---| +| `capabilities[].id` | Which tool categories the agent can use | `system.command.execute`, `data.file.write` | +| `limits.*.allowed_commands` | Which commands are allowed | `["git", "npm", "node"]` or `["*"]` for all | +| `limits.*.blocked_patterns` | Patterns always denied | `["rm -rf", "sudo", "chmod 777"]` | +| `status` | Kill switch | `active`, `suspended`, `revoked` | + +**Evaluation modes (provider-dependent):** + +OAP providers may support different evaluation modes. For example, the APort reference implementation supports: + +| Mode | How it works | Network | Latency | +|---|---|---|---| +| **Local** | Evaluates passport locally (bash script). | None | ~300ms | +| **API** | Sends passport + context to a hosted evaluator. Signed decisions. | Yes | ~65ms | + +A custom OAP provider can implement any evaluation strategy -- the DeerFlow middleware doesn't care how the provider reaches its decision. + +**Try it:** +1. Install and set up as above +2. Start DeerFlow and ask: "Create a file called test.txt with content hello" +3. 
Then ask: "Now delete it using bash rm -rf" +4. Guardrail blocks it: `oap.blocked_pattern: Command contains blocked pattern: rm -rf` + +### Option 3: Custom Provider (Bring Your Own) + +Any Python class with `evaluate(request)` and `aevaluate(request)` methods works. No base class or inheritance needed -- it's a structural protocol. + +```python +# my_guardrail.py + +class MyGuardrailProvider: + name = "my-company" + + def evaluate(self, request): + from deerflow.guardrails.provider import GuardrailDecision, GuardrailReason + + # Example: block any bash command containing "delete" + if request.tool_name == "bash" and "delete" in str(request.tool_input): + return GuardrailDecision( + allow=False, + reasons=[GuardrailReason(code="custom.blocked", message="delete not allowed")], + policy_id="custom.v1", + ) + return GuardrailDecision(allow=True, reasons=[GuardrailReason(code="oap.allowed")]) + + async def aevaluate(self, request): + return self.evaluate(request) +``` + +**config.yaml:** +```yaml +guardrails: + enabled: true + provider: + use: my_guardrail:MyGuardrailProvider +``` + +Make sure `my_guardrail.py` is on the Python path (e.g. in the backend directory or installed as a package). + +**Try it:** +1. Create `my_guardrail.py` in the backend directory +2. Add the config +3. Start DeerFlow and ask: "Use bash to delete test.txt" +4. 
Your provider blocks it + +## Implementing a Provider + +### Required Interface + +``` +┌──────────────────────────────────────────────────┐ +│ GuardrailProvider Protocol │ +│ │ +│ name: str │ +│ │ +│ evaluate(request: GuardrailRequest) │ +│ -> GuardrailDecision │ +│ │ +│ aevaluate(request: GuardrailRequest) (async) │ +│ -> GuardrailDecision │ +└──────────────────────────────────────────────────┘ + +┌──────────────────────────┐ ┌──────────────────────────┐ +│ GuardrailRequest │ │ GuardrailDecision │ +│ │ │ │ +│ tool_name: str │ │ allow: bool │ +│ tool_input: dict │ │ reasons: [GuardrailReason]│ +│ agent_id: str | None │ │ policy_id: str | None │ +│ thread_id: str | None │ │ metadata: dict │ +│ is_subagent: bool │ │ │ +│ timestamp: str │ │ GuardrailReason: │ +│ │ │ code: str │ +└──────────────────────────┘ │ message: str │ + └──────────────────────────┘ +``` + +### DeerFlow Tool Names + +These are the tool names your provider will see in `request.tool_name`: + +| Tool | What it does | +|---|---| +| `bash` | Shell command execution | +| `write_file` | Create/overwrite a file | +| `str_replace` | Edit a file (find and replace) | +| `read_file` | Read file content | +| `ls` | List directory | +| `web_search` | Web search query | +| `web_fetch` | Fetch URL content | +| `image_search` | Image search | +| `present_file` | Present file to user | +| `view_image` | Display image | +| `ask_clarification` | Ask user a question | +| `task` | Delegate to subagent | +| `mcp__*` | MCP tools (dynamic) | + +### OAP Reason Codes + +Standard codes used by the [OAP specification](https://github.com/aporthq/aport-spec): + +| Code | Meaning | +|---|---| +| `oap.allowed` | Tool call authorized | +| `oap.tool_not_allowed` | Tool not in allowlist | +| `oap.command_not_allowed` | Command not in allowed_commands | +| `oap.blocked_pattern` | Command matches a blocked pattern | +| `oap.limit_exceeded` | Operation exceeds a limit | +| `oap.passport_suspended` | Passport status is 
suspended/revoked | +| `oap.evaluator_error` | Provider crashed (fail-closed) | + +### Provider Loading + +DeerFlow loads providers via `resolve_variable()` -- the same mechanism used for models, tools, and sandbox providers. The `use:` field is a Python class path: `package.module:ClassName`. + +The provider is instantiated with `**config` kwargs if `config:` is set. `framework="deerflow"` is also injected, but only when the constructor accepts a `framework` parameter or `**kwargs` and `config:` does not already set `framework`. Accept `**kwargs` to stay forward-compatible: + +```python +class YourProvider: + def __init__(self, framework: str = "generic", **kwargs): + # framework="deerflow" tells you which config dir to use + ... +``` + +## Configuration Reference + +```yaml +guardrails: + # Enable/disable guardrail middleware (default: false) + enabled: true + + # Block tool calls if provider raises an exception (default: true) + fail_closed: true + + # Passport reference -- passed as request.agent_id to the provider. + # File path, hosted agent ID, or null (provider resolves from its config). + passport: null + + # Provider: loaded by class path via resolve_variable + provider: + use: deerflow.guardrails.builtin:AllowlistProvider + config: # optional kwargs passed to provider.__init__ + denied_tools: ["bash"] +``` + +## Testing + +```bash +cd backend +uv run python -m pytest tests/test_guardrail_middleware.py -v +``` + +25 tests covering: +- AllowlistProvider: allow, deny, both allowlist+denylist, async +- GuardrailMiddleware: allow passthrough, deny with OAP codes, fail-closed, fail-open, passport forwarding, empty reasons fallback, empty tool name, protocol isinstance check +- Async paths: awrap_tool_call for allow, deny, fail-closed, fail-open +- GraphBubbleUp: LangGraph control signals propagate through (not caught) +- Config: defaults, from_dict, singleton load/reset + +## Files + +``` +packages/harness/deerflow/guardrails/ + __init__.py # Public exports + provider.py # GuardrailProvider protocol, GuardrailRequest, GuardrailDecision + middleware.py # GuardrailMiddleware 
(AgentMiddleware subclass) + builtin.py # AllowlistProvider (zero deps) + +packages/harness/deerflow/config/ + guardrails_config.py # GuardrailsConfig Pydantic model + singleton + +packages/harness/deerflow/agents/middlewares/ + tool_error_handling_middleware.py # Registers GuardrailMiddleware in chain + +config.example.yaml # Three provider options documented +tests/test_guardrail_middleware.py # 25 tests +docs/GUARDRAILS.md # This file +``` diff --git a/backend/packages/harness/deerflow/agents/middlewares/tool_error_handling_middleware.py b/backend/packages/harness/deerflow/agents/middlewares/tool_error_handling_middleware.py index 8c9be44..b692da4 100644 --- a/backend/packages/harness/deerflow/agents/middlewares/tool_error_handling_middleware.py +++ b/backend/packages/harness/deerflow/agents/middlewares/tool_error_handling_middleware.py @@ -90,6 +90,31 @@ def _build_runtime_middlewares( middlewares.append(DanglingToolCallMiddleware()) + # Guardrail middleware (if configured) + from deerflow.config.guardrails_config import get_guardrails_config + + guardrails_config = get_guardrails_config() + if guardrails_config.enabled and guardrails_config.provider: + import inspect + + from deerflow.guardrails.middleware import GuardrailMiddleware + from deerflow.reflection import resolve_variable + + provider_cls = resolve_variable(guardrails_config.provider.use) + provider_kwargs = dict(guardrails_config.provider.config) if guardrails_config.provider.config else {} + # Pass framework hint if the provider accepts it (e.g. for config discovery). + # Built-in providers like AllowlistProvider don't need it, so only inject + # when the constructor accepts 'framework' or '**kwargs'. 
+ if "framework" not in provider_kwargs: + try: + sig = inspect.signature(provider_cls.__init__) + if "framework" in sig.parameters or any(p.kind == inspect.Parameter.VAR_KEYWORD for p in sig.parameters.values()): + provider_kwargs["framework"] = "deerflow" + except (ValueError, TypeError): + pass + provider = provider_cls(**provider_kwargs) + middlewares.append(GuardrailMiddleware(provider, fail_closed=guardrails_config.fail_closed, passport=guardrails_config.passport)) + middlewares.append(ToolErrorHandlingMiddleware()) return middlewares diff --git a/backend/packages/harness/deerflow/config/app_config.py b/backend/packages/harness/deerflow/config/app_config.py index fc48d1e..390680c 100644 --- a/backend/packages/harness/deerflow/config/app_config.py +++ b/backend/packages/harness/deerflow/config/app_config.py @@ -9,6 +9,7 @@ from pydantic import BaseModel, ConfigDict, Field from deerflow.config.checkpointer_config import CheckpointerConfig, load_checkpointer_config_from_dict from deerflow.config.extensions_config import ExtensionsConfig +from deerflow.config.guardrails_config import load_guardrails_config_from_dict from deerflow.config.memory_config import load_memory_config_from_dict from deerflow.config.model_config import ModelConfig from deerflow.config.sandbox_config import SandboxConfig @@ -107,6 +108,10 @@ class AppConfig(BaseModel): if "tool_search" in config_data: load_tool_search_config_from_dict(config_data["tool_search"]) + # Load guardrails config if present + if "guardrails" in config_data: + load_guardrails_config_from_dict(config_data["guardrails"]) + # Load checkpointer config if present if "checkpointer" in config_data: load_checkpointer_config_from_dict(config_data["checkpointer"]) diff --git a/backend/packages/harness/deerflow/config/guardrails_config.py b/backend/packages/harness/deerflow/config/guardrails_config.py new file mode 100644 index 0000000..fe7a0b8 --- /dev/null +++ b/backend/packages/harness/deerflow/config/guardrails_config.py 
@@ -0,0 +1,48 @@ +"""Configuration for pre-tool-call authorization.""" + +from pydantic import BaseModel, Field + + +class GuardrailProviderConfig(BaseModel): + """Configuration for a guardrail provider.""" + + use: str = Field(description="Class path (e.g. 'deerflow.guardrails.builtin:AllowlistProvider')") + config: dict = Field(default_factory=dict, description="Provider-specific settings passed as kwargs") + + +class GuardrailsConfig(BaseModel): + """Configuration for pre-tool-call authorization. + + When enabled, every tool call passes through the configured provider + before execution. The provider receives tool name, arguments, and the + agent's passport reference, and returns an allow/deny decision. + """ + + enabled: bool = Field(default=False, description="Enable guardrail middleware") + fail_closed: bool = Field(default=True, description="Block tool calls if provider errors") + passport: str | None = Field(default=None, description="OAP passport path or hosted agent ID") + provider: GuardrailProviderConfig | None = Field(default=None, description="Guardrail provider configuration") + + +_guardrails_config: GuardrailsConfig | None = None + + +def get_guardrails_config() -> GuardrailsConfig: + """Get the guardrails config, returning defaults if not loaded.""" + global _guardrails_config + if _guardrails_config is None: + _guardrails_config = GuardrailsConfig() + return _guardrails_config + + +def load_guardrails_config_from_dict(data: dict) -> GuardrailsConfig: + """Load guardrails config from a dict (called during AppConfig loading).""" + global _guardrails_config + _guardrails_config = GuardrailsConfig.model_validate(data) + return _guardrails_config + + +def reset_guardrails_config() -> None: + """Reset the cached config instance. 
Used in tests to prevent singleton leaks.""" + global _guardrails_config + _guardrails_config = None diff --git a/backend/packages/harness/deerflow/guardrails/__init__.py b/backend/packages/harness/deerflow/guardrails/__init__.py new file mode 100644 index 0000000..3c23cd0 --- /dev/null +++ b/backend/packages/harness/deerflow/guardrails/__init__.py @@ -0,0 +1,14 @@ +"""Pre-tool-call authorization middleware.""" + +from deerflow.guardrails.builtin import AllowlistProvider +from deerflow.guardrails.middleware import GuardrailMiddleware +from deerflow.guardrails.provider import GuardrailDecision, GuardrailProvider, GuardrailReason, GuardrailRequest + +__all__ = [ + "AllowlistProvider", + "GuardrailDecision", + "GuardrailMiddleware", + "GuardrailProvider", + "GuardrailReason", + "GuardrailRequest", +] diff --git a/backend/packages/harness/deerflow/guardrails/builtin.py b/backend/packages/harness/deerflow/guardrails/builtin.py new file mode 100644 index 0000000..53ce9f8 --- /dev/null +++ b/backend/packages/harness/deerflow/guardrails/builtin.py @@ -0,0 +1,23 @@ +"""Built-in guardrail providers that ship with DeerFlow.""" + +from deerflow.guardrails.provider import GuardrailDecision, GuardrailReason, GuardrailRequest + + +class AllowlistProvider: + """Simple allowlist/denylist provider. 
No external dependencies.""" + + name = "allowlist" + + def __init__(self, *, allowed_tools: list[str] | None = None, denied_tools: list[str] | None = None): + self._allowed = set(allowed_tools) if allowed_tools else None + self._denied = set(denied_tools) if denied_tools else set() + + def evaluate(self, request: GuardrailRequest) -> GuardrailDecision: + if self._allowed is not None and request.tool_name not in self._allowed: + return GuardrailDecision(allow=False, reasons=[GuardrailReason(code="oap.tool_not_allowed", message=f"tool '{request.tool_name}' not in allowlist")]) + if request.tool_name in self._denied: + return GuardrailDecision(allow=False, reasons=[GuardrailReason(code="oap.tool_not_allowed", message=f"tool '{request.tool_name}' is denied")]) + return GuardrailDecision(allow=True, reasons=[GuardrailReason(code="oap.allowed")]) + + async def aevaluate(self, request: GuardrailRequest) -> GuardrailDecision: + return self.evaluate(request) diff --git a/backend/packages/harness/deerflow/guardrails/middleware.py b/backend/packages/harness/deerflow/guardrails/middleware.py new file mode 100644 index 0000000..a35e155 --- /dev/null +++ b/backend/packages/harness/deerflow/guardrails/middleware.py @@ -0,0 +1,98 @@ +"""GuardrailMiddleware - evaluates tool calls against a GuardrailProvider before execution.""" + +import logging +from collections.abc import Awaitable, Callable +from datetime import UTC, datetime +from typing import override + +from langchain.agents import AgentState +from langchain.agents.middleware import AgentMiddleware +from langchain_core.messages import ToolMessage +from langgraph.errors import GraphBubbleUp +from langgraph.prebuilt.tool_node import ToolCallRequest +from langgraph.types import Command + +from deerflow.guardrails.provider import GuardrailDecision, GuardrailProvider, GuardrailReason, GuardrailRequest + +logger = logging.getLogger(__name__) + + +class GuardrailMiddleware(AgentMiddleware[AgentState]): + """Evaluate tool calls 
against a GuardrailProvider before execution. + + Denied calls return an error ToolMessage so the agent can adapt. + If the provider raises, behavior depends on fail_closed: + - True (default): block the call + - False: allow it through with a warning + """ + + def __init__(self, provider: GuardrailProvider, *, fail_closed: bool = True, passport: str | None = None): + self.provider = provider + self.fail_closed = fail_closed + self.passport = passport + + def _build_request(self, request: ToolCallRequest) -> GuardrailRequest: + return GuardrailRequest( + tool_name=str(request.tool_call.get("name", "")), + tool_input=request.tool_call.get("args", {}), + agent_id=self.passport, + timestamp=datetime.now(UTC).isoformat(), + ) + + def _build_denied_message(self, request: ToolCallRequest, decision: GuardrailDecision) -> ToolMessage: + tool_name = str(request.tool_call.get("name", "unknown_tool")) + tool_call_id = str(request.tool_call.get("id", "missing_id")) + reason_text = decision.reasons[0].message if decision.reasons else "blocked by guardrail policy" + reason_code = decision.reasons[0].code if decision.reasons else "oap.denied" + return ToolMessage( + content=f"Guardrail denied: tool '{tool_name}' was blocked ({reason_code}). Reason: {reason_text}. Choose an alternative approach.", + tool_call_id=tool_call_id, + name=tool_name, + status="error", + ) + + @override + def wrap_tool_call( + self, + request: ToolCallRequest, + handler: Callable[[ToolCallRequest], ToolMessage | Command], + ) -> ToolMessage | Command: + gr = self._build_request(request) + try: + decision = self.provider.evaluate(gr) + except GraphBubbleUp: + # Preserve LangGraph control-flow signals (interrupt/pause/resume). 
+ raise + except Exception: + logger.exception("Guardrail provider error (sync)") + if self.fail_closed: + decision = GuardrailDecision(allow=False, reasons=[GuardrailReason(code="oap.evaluator_error", message="guardrail provider error (fail-closed)")]) + else: + return handler(request) + if not decision.allow: + logger.warning("Guardrail denied: tool=%s policy=%s code=%s", gr.tool_name, decision.policy_id, decision.reasons[0].code if decision.reasons else "unknown") + return self._build_denied_message(request, decision) + return handler(request) + + @override + async def awrap_tool_call( + self, + request: ToolCallRequest, + handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]], + ) -> ToolMessage | Command: + gr = self._build_request(request) + try: + decision = await self.provider.aevaluate(gr) + except GraphBubbleUp: + # Preserve LangGraph control-flow signals (interrupt/pause/resume). + raise + except Exception: + logger.exception("Guardrail provider error (async)") + if self.fail_closed: + decision = GuardrailDecision(allow=False, reasons=[GuardrailReason(code="oap.evaluator_error", message="guardrail provider error (fail-closed)")]) + else: + return await handler(request) + if not decision.allow: + logger.warning("Guardrail denied: tool=%s policy=%s code=%s", gr.tool_name, decision.policy_id, decision.reasons[0].code if decision.reasons else "unknown") + return self._build_denied_message(request, decision) + return await handler(request) diff --git a/backend/packages/harness/deerflow/guardrails/provider.py b/backend/packages/harness/deerflow/guardrails/provider.py new file mode 100644 index 0000000..f9cb718 --- /dev/null +++ b/backend/packages/harness/deerflow/guardrails/provider.py @@ -0,0 +1,56 @@ +"""GuardrailProvider protocol and data structures for pre-tool-call authorization.""" + +from __future__ import annotations + +from dataclasses import dataclass, field +from typing import Any, Protocol, runtime_checkable + + +@dataclass +class 
GuardrailRequest: + """Context passed to the provider for each tool call.""" + + tool_name: str + tool_input: dict[str, Any] + agent_id: str | None = None + thread_id: str | None = None + is_subagent: bool = False + timestamp: str = "" + + +@dataclass +class GuardrailReason: + """Structured reason for an allow/deny decision (OAP reason object).""" + + code: str + message: str = "" + + +@dataclass +class GuardrailDecision: + """Provider's allow/deny verdict (aligned with OAP Decision object).""" + + allow: bool + reasons: list[GuardrailReason] = field(default_factory=list) + policy_id: str | None = None + metadata: dict[str, Any] = field(default_factory=dict) + + +@runtime_checkable +class GuardrailProvider(Protocol): + """Contract for pluggable tool-call authorization. + + Any class with these methods works - no base class required. + Providers are loaded by class path via resolve_variable(), + the same mechanism DeerFlow uses for models, tools, and sandbox. + """ + + name: str + + def evaluate(self, request: GuardrailRequest) -> GuardrailDecision: + """Evaluate whether a tool call should proceed.""" + ... + + async def aevaluate(self, request: GuardrailRequest) -> GuardrailDecision: + """Async variant.""" + ... 
diff --git a/backend/tests/test_guardrail_middleware.py b/backend/tests/test_guardrail_middleware.py new file mode 100644 index 0000000..5c021ba --- /dev/null +++ b/backend/tests/test_guardrail_middleware.py @@ -0,0 +1,344 @@ +"""Tests for the guardrail middleware and built-in providers.""" + +from __future__ import annotations + +import asyncio +from unittest.mock import MagicMock + +import pytest +from langgraph.errors import GraphBubbleUp + +from deerflow.guardrails.builtin import AllowlistProvider +from deerflow.guardrails.middleware import GuardrailMiddleware +from deerflow.guardrails.provider import GuardrailDecision, GuardrailReason, GuardrailRequest + +# --- Helpers --- + + +def _make_tool_call_request(name: str = "bash", args: dict | None = None, call_id: str = "call_1"): + """Create a mock ToolCallRequest.""" + req = MagicMock() + req.tool_call = {"name": name, "args": args or {}, "id": call_id} + return req + + +class _AllowAllProvider: + name = "allow-all" + + def evaluate(self, request: GuardrailRequest) -> GuardrailDecision: + return GuardrailDecision(allow=True, reasons=[GuardrailReason(code="oap.allowed")]) + + async def aevaluate(self, request: GuardrailRequest) -> GuardrailDecision: + return self.evaluate(request) + + +class _DenyAllProvider: + name = "deny-all" + + def evaluate(self, request: GuardrailRequest) -> GuardrailDecision: + return GuardrailDecision( + allow=False, + reasons=[GuardrailReason(code="oap.denied", message="all tools blocked")], + policy_id="test.deny.v1", + ) + + async def aevaluate(self, request: GuardrailRequest) -> GuardrailDecision: + return self.evaluate(request) + + +class _ExplodingProvider: + name = "exploding" + + def evaluate(self, request: GuardrailRequest) -> GuardrailDecision: + raise RuntimeError("provider crashed") + + async def aevaluate(self, request: GuardrailRequest) -> GuardrailDecision: + raise RuntimeError("provider crashed") + + +# --- AllowlistProvider tests --- + + +class TestAllowlistProvider: + def 
test_no_restrictions_allows_all(self):
+        provider = AllowlistProvider()
+        req = GuardrailRequest(tool_name="bash", tool_input={})
+        decision = provider.evaluate(req)
+        assert decision.allow is True
+
+    def test_denied_tools(self):
+        provider = AllowlistProvider(denied_tools=["bash", "write_file"])
+        req = GuardrailRequest(tool_name="bash", tool_input={})
+        decision = provider.evaluate(req)
+        assert decision.allow is False
+        assert decision.reasons[0].code == "oap.tool_not_allowed"
+
+    def test_denied_tools_allows_unlisted(self):
+        provider = AllowlistProvider(denied_tools=["bash"])
+        req = GuardrailRequest(tool_name="web_search", tool_input={})
+        decision = provider.evaluate(req)
+        assert decision.allow is True
+
+    def test_allowed_tools_blocks_unlisted(self):
+        provider = AllowlistProvider(allowed_tools=["web_search", "read_file"])
+        req = GuardrailRequest(tool_name="bash", tool_input={})
+        decision = provider.evaluate(req)
+        assert decision.allow is False
+
+    def test_allowed_tools_allows_listed(self):
+        provider = AllowlistProvider(allowed_tools=["web_search"])
+        req = GuardrailRequest(tool_name="web_search", tool_input={})
+        decision = provider.evaluate(req)
+        assert decision.allow is True
+
+    def test_both_allowed_and_denied(self):
+        provider = AllowlistProvider(allowed_tools=["bash", "web_search"], denied_tools=["bash"])
+        # bash is in both: allowlist passes, denylist blocks
+        req = GuardrailRequest(tool_name="bash", tool_input={})
+        decision = provider.evaluate(req)
+        assert decision.allow is False
+
+    def test_async_delegates_to_sync(self):
+        provider = AllowlistProvider(denied_tools=["bash"])
+        req = GuardrailRequest(tool_name="bash", tool_input={})
+        decision = asyncio.run(provider.aevaluate(req))
+        assert decision.allow is False
+
+
+# --- GuardrailMiddleware tests ---
+
+
+class TestGuardrailMiddleware:
+    def test_allowed_tool_passes_through(self):
+        mw = GuardrailMiddleware(_AllowAllProvider())
+        req = _make_tool_call_request("web_search")
+        expected = MagicMock()
+        handler = MagicMock(return_value=expected)
+        result = mw.wrap_tool_call(req, handler)
+        handler.assert_called_once_with(req)
+        assert result is expected
+
+    def test_denied_tool_returns_error_message(self):
+        mw = GuardrailMiddleware(_DenyAllProvider())
+        req = _make_tool_call_request("bash")
+        handler = MagicMock()
+        result = mw.wrap_tool_call(req, handler)
+        handler.assert_not_called()
+        assert result.status == "error"
+        assert "oap.denied" in result.content
+        assert result.name == "bash"
+
+    def test_fail_closed_on_provider_error(self):
+        mw = GuardrailMiddleware(_ExplodingProvider(), fail_closed=True)
+        req = _make_tool_call_request("bash")
+        handler = MagicMock()
+        result = mw.wrap_tool_call(req, handler)
+        handler.assert_not_called()
+        assert result.status == "error"
+        assert "oap.evaluator_error" in result.content
+
+    def test_fail_open_on_provider_error(self):
+        mw = GuardrailMiddleware(_ExplodingProvider(), fail_closed=False)
+        req = _make_tool_call_request("bash")
+        expected = MagicMock()
+        handler = MagicMock(return_value=expected)
+        result = mw.wrap_tool_call(req, handler)
+        handler.assert_called_once_with(req)
+        assert result is expected
+
+    def test_passport_passed_as_agent_id(self):
+        captured = {}
+
+        class CapturingProvider:
+            name = "capture"
+
+            def evaluate(self, request):
+                captured["agent_id"] = request.agent_id
+                return GuardrailDecision(allow=True)
+
+            async def aevaluate(self, request):
+                return self.evaluate(request)
+
+        mw = GuardrailMiddleware(CapturingProvider(), passport="./guardrails/passport.json")
+        req = _make_tool_call_request("bash")
+        mw.wrap_tool_call(req, MagicMock())
+        assert captured["agent_id"] == "./guardrails/passport.json"
+
+    def test_decision_contains_oap_reason_codes(self):
+        mw = GuardrailMiddleware(_DenyAllProvider())
+        req = _make_tool_call_request("bash")
+        result = mw.wrap_tool_call(req, MagicMock())
+        assert "oap.denied" in result.content
+        assert "all tools blocked" in result.content
+
+    def test_deny_with_empty_reasons_uses_fallback(self):
+        """Provider returns deny with empty reasons list -- middleware uses fallback text."""
+
+        class EmptyReasonProvider:
+            name = "empty-reason"
+
+            def evaluate(self, request):
+                return GuardrailDecision(allow=False, reasons=[])
+
+            async def aevaluate(self, request):
+                return self.evaluate(request)
+
+        mw = GuardrailMiddleware(EmptyReasonProvider())
+        req = _make_tool_call_request("bash")
+        result = mw.wrap_tool_call(req, MagicMock())
+        assert result.status == "error"
+        assert "blocked by guardrail policy" in result.content
+
+    def test_empty_tool_name(self):
+        """Tool call with empty name is handled gracefully."""
+        mw = GuardrailMiddleware(_AllowAllProvider())
+        req = _make_tool_call_request("")
+        expected = MagicMock()
+        handler = MagicMock(return_value=expected)
+        result = mw.wrap_tool_call(req, handler)
+        assert result is expected
+
+    def test_protocol_isinstance_check(self):
+        """AllowlistProvider satisfies GuardrailProvider protocol at runtime."""
+        from deerflow.guardrails.provider import GuardrailProvider
+
+        assert isinstance(AllowlistProvider(), GuardrailProvider)
+
+    def test_async_allowed(self):
+        mw = GuardrailMiddleware(_AllowAllProvider())
+        req = _make_tool_call_request("web_search")
+        expected = MagicMock()
+
+        async def handler(r):
+            return expected
+
+        async def run():
+            return await mw.awrap_tool_call(req, handler)
+
+        result = asyncio.run(run())
+        assert result is expected
+
+    def test_async_denied(self):
+        mw = GuardrailMiddleware(_DenyAllProvider())
+        req = _make_tool_call_request("bash")
+
+        async def handler(r):
+            return MagicMock()
+
+        async def run():
+            return await mw.awrap_tool_call(req, handler)
+
+        result = asyncio.run(run())
+        assert result.status == "error"
+
+    def test_async_fail_closed(self):
+        mw = GuardrailMiddleware(_ExplodingProvider(), fail_closed=True)
+        req = _make_tool_call_request("bash")
+
+        async def handler(r):
+            return MagicMock()
+
+        async def run():
+            return await mw.awrap_tool_call(req, handler)
+
+        result = asyncio.run(run())
+        assert result.status == "error"
+
+    def test_async_fail_open(self):
+        mw = GuardrailMiddleware(_ExplodingProvider(), fail_closed=False)
+        req = _make_tool_call_request("bash")
+        expected = MagicMock()
+
+        async def handler(r):
+            return expected
+
+        async def run():
+            return await mw.awrap_tool_call(req, handler)
+
+        result = asyncio.run(run())
+        assert result is expected
+
+    def test_graph_bubble_up_not_swallowed(self):
+        """GraphBubbleUp (LangGraph interrupt/pause) must propagate, not be caught."""
+
+        class BubbleProvider:
+            name = "bubble"
+
+            def evaluate(self, request):
+                raise GraphBubbleUp()
+
+            async def aevaluate(self, request):
+                raise GraphBubbleUp()
+
+        mw = GuardrailMiddleware(BubbleProvider(), fail_closed=True)
+        req = _make_tool_call_request("bash")
+        with pytest.raises(GraphBubbleUp):
+            mw.wrap_tool_call(req, MagicMock())
+
+    def test_async_graph_bubble_up_not_swallowed(self):
+        """Async: GraphBubbleUp must propagate."""
+
+        class BubbleProvider:
+            name = "bubble"
+
+            def evaluate(self, request):
+                raise GraphBubbleUp()
+
+            async def aevaluate(self, request):
+                raise GraphBubbleUp()
+
+        mw = GuardrailMiddleware(BubbleProvider(), fail_closed=True)
+        req = _make_tool_call_request("bash")
+
+        async def handler(r):
+            return MagicMock()
+
+        async def run():
+            return await mw.awrap_tool_call(req, handler)
+
+        with pytest.raises(GraphBubbleUp):
+            asyncio.run(run())
+
+
+# --- Config tests ---
+
+
+class TestGuardrailsConfig:
+    def test_config_defaults(self):
+        from deerflow.config.guardrails_config import GuardrailsConfig
+
+        config = GuardrailsConfig()
+        assert config.enabled is False
+        assert config.fail_closed is True
+        assert config.passport is None
+        assert config.provider is None
+
+    def test_config_from_dict(self):
+        from deerflow.config.guardrails_config import GuardrailsConfig
+
+        config = GuardrailsConfig.model_validate(
+            {
+                "enabled": True,
+                "fail_closed": False,
+                "passport": "./guardrails/passport.json",
+                "provider": {
+                    "use": "deerflow.guardrails.builtin:AllowlistProvider",
+                    "config": {"denied_tools": ["bash"]},
+                },
+            }
+        )
+        assert config.enabled is True
+        assert config.fail_closed is False
+        assert config.passport == "./guardrails/passport.json"
+        assert config.provider.use == "deerflow.guardrails.builtin:AllowlistProvider"
+        assert config.provider.config == {"denied_tools": ["bash"]}
+
+    def test_singleton_load_and_get(self):
+        from deerflow.config.guardrails_config import get_guardrails_config, load_guardrails_config_from_dict, reset_guardrails_config
+
+        try:
+            load_guardrails_config_from_dict({"enabled": True, "provider": {"use": "test:Foo"}})
+            config = get_guardrails_config()
+            assert config.enabled is True
+        finally:
+            reset_guardrails_config()
diff --git a/config.example.yaml b/config.example.yaml
index 44ebb0d..b966f65 100644
--- a/config.example.yaml
+++ b/config.example.yaml
@@ -505,3 +505,38 @@ checkpointer:
 # context:
 #   thinking_enabled: true
 #   subagent_enabled: true
+
+# ============================================================================
+# Guardrails Configuration
+# ============================================================================
+# Optional pre-execution authorization for tool calls.
+# When enabled, every tool call passes through the configured provider
+# before execution. Three options: built-in allowlist, OAP policy provider,
+# or custom provider. See backend/docs/GUARDRAILS.md for full documentation.
+#
+# Providers are loaded by class path via resolve_variable (same as models/tools).
+
+# --- Option 1: Built-in AllowlistProvider (zero external deps) ---
+# guardrails:
+#   enabled: true
+#   provider:
+#     use: deerflow.guardrails.builtin:AllowlistProvider
+#     config:
+#       denied_tools: ["bash", "write_file"]
+
+# --- Option 2: OAP passport provider (open standard, any implementation) ---
+# The Open Agent Passport (OAP) spec defines passport format and decision codes.
+# Any OAP-compliant provider works. Example using APort (reference implementation):
+#   pip install aport-agent-guardrails && aport setup --framework deerflow
+# guardrails:
+#   enabled: true
+#   provider:
+#     use: aport_guardrails.providers.generic:OAPGuardrailProvider
+
+# --- Option 3: Custom provider (any class with evaluate/aevaluate methods) ---
+# guardrails:
+#   enabled: true
+#   provider:
+#     use: my_package:MyGuardrailProvider
+#     config:
+#       key: value
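
Reviewer note on Option 3: a minimal sketch of what a custom provider might look like, based only on the shape the tests above exercise (a `name` attribute, `evaluate`, and `aevaluate`). The `Request`, `Decision`, and `Reason` dataclasses are local stand-ins for the real `GuardrailRequest` and `GuardrailDecision` types so the snippet runs on its own. The `blocked_substrings` option is hypothetical, and passing the `config` mapping as keyword arguments to the constructor is an assumption inferred from how `AllowlistProvider(denied_tools=[...])` is built in the tests.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List, Optional


# Local stand-ins for the deerflow guardrail types (assumed shapes, not the real classes).
@dataclass
class Reason:
    code: str
    message: str = ""


@dataclass
class Decision:
    allow: bool
    reasons: List[Reason] = field(default_factory=list)


@dataclass
class Request:
    tool_name: str
    tool_input: dict
    agent_id: Optional[str] = None


class MyGuardrailProvider:
    """Hypothetical provider: deny any tool call whose input contains a blocked substring."""

    name = "my-provider"

    def __init__(self, blocked_substrings=None):
        # Assumption: the YAML `config:` mapping arrives here as keyword arguments.
        self.blocked = list(blocked_substrings or [])

    def evaluate(self, request):
        text = str(request.tool_input)
        for needle in self.blocked:
            if needle in text:
                reason = Reason("oap.denied", f"input contains {needle!r}")
                return Decision(allow=False, reasons=[reason])
        return Decision(allow=True)

    async def aevaluate(self, request):
        # Same pattern as the test providers above: the async path delegates to the sync one.
        return self.evaluate(request)


provider = MyGuardrailProvider(blocked_substrings=["rm -rf"])
denied = provider.evaluate(Request("bash", {"command": "rm -rf /"}))
allowed = provider.evaluate(Request("bash", {"command": "ls"}))
async_denied = asyncio.run(provider.aevaluate(Request("bash", {"command": "rm -rf /tmp"})))
```

Unlike the allowlist, this sketch inspects `tool_input` rather than only `tool_name`, which is the main reason one might reach for a custom provider.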