refactor: split backend into harness (deerflow.*) and app (app.*) (#1131)

* refactor: extract shared utils to break harness→app cross-layer imports

Move _validate_skill_frontmatter to src/skills/validation.py and
CONVERTIBLE_EXTENSIONS + convert_file_to_markdown to src/utils/file_conversion.py.
This eliminates the two reverse dependencies from client.py (harness layer)
into gateway/routers/ (app layer), preparing for the harness/app package split.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: split backend/src into harness (deerflow.*) and app (app.*)

Physically split the monolithic backend/src/ package into two layers:

- **Harness** (`packages/harness/deerflow/`): publishable agent framework
  package with import prefix `deerflow.*`. Contains agents, sandbox, tools,
  models, MCP, skills, config, and all core infrastructure.

- **App** (`app/`): unpublished application code with import prefix `app.*`.
  Contains gateway (FastAPI REST API) and channels (IM integrations).

Key changes:
- Move 13 harness modules to packages/harness/deerflow/ via git mv
- Move gateway + channels to app/ via git mv
- Rename all imports: src.* → deerflow.* (harness) / app.* (app layer)
- Set up uv workspace with deerflow-harness as workspace member
- Update langgraph.json, config.example.yaml, all scripts, Docker files
- Add build-system (hatchling) to harness pyproject.toml
- Add PYTHONPATH=. to gateway startup commands for app.* resolution
- Update ruff.toml with known-first-party for import sorting
- Update all documentation to reflect new directory structure

Boundary rule enforced: harness code never imports from app.
All 429 tests pass. Lint clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: add harness→app boundary check test and update docs

Add test_harness_boundary.py that scans all Python files in
packages/harness/deerflow/ and fails if any `from app.*` or
`import app.*` statement is found. This enforces the architectural
rule that the harness layer never depends on the app layer.

Update CLAUDE.md to document the harness/app split architecture,
import conventions, and the boundary enforcement test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add config versioning with auto-upgrade on startup

When config.example.yaml schema changes, developers' local config.yaml
files can silently become outdated. This adds a config_version field and
auto-upgrade mechanism so breaking changes (like src.* → deerflow.*
renames) are applied automatically before services start.

- Add config_version: 1 to config.example.yaml
- Add startup version check warning in AppConfig.from_file()
- Add scripts/config-upgrade.sh with migration registry for value replacements
- Add `make config-upgrade` target
- Auto-run config-upgrade in serve.sh and start-daemon.sh before starting services
- Add config error hints in service failure messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix comments

* fix: update src.* import in test_sandbox_tools_security to deerflow.*

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: handle empty config and search parent dirs for config.example.yaml

Address Copilot review comments on PR #1131:
- Guard against yaml.safe_load() returning None for empty config files
- Search parent directories for config.example.yaml instead of only
  looking next to config.yaml, fixing detection in common setups

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: correct skills root path depth and config_version type coercion

- loader.py: fix get_skills_root_path() to use 5 parent levels (was 3)
  after harness split, file lives at packages/harness/deerflow/skills/
  so parent×3 resolved to backend/packages/harness/ instead of backend/
- app_config.py: coerce config_version to int() before comparison in
  _check_config_version() to prevent TypeError when YAML stores value
  as string (e.g. config_version: "1")
- tests: add regression tests for both fixes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: update test imports from src.* to deerflow.*/app.* after harness refactor

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
DanielWalnut
2026-03-14 22:55:52 +08:00
committed by GitHub
parent 9b49a80dda
commit 76803b826f
198 changed files with 1786 additions and 941 deletions

0
backend/app/__init__.py Normal file
View File

View File

@@ -0,0 +1,16 @@
"""IM Channel integration for DeerFlow.
Provides a pluggable channel system that connects external messaging platforms
(Feishu/Lark, Slack, Telegram) to the DeerFlow agent via the ChannelManager,
which uses ``langgraph-sdk`` to communicate with the underlying LangGraph Server.
"""
from app.channels.base import Channel
from app.channels.message_bus import InboundMessage, MessageBus, OutboundMessage
__all__ = [
"Channel",
"InboundMessage",
"MessageBus",
"OutboundMessage",
]

View File

@@ -0,0 +1,108 @@
"""Abstract base class for IM channels."""
from __future__ import annotations
import logging
from abc import ABC, abstractmethod
from typing import Any
from app.channels.message_bus import InboundMessage, InboundMessageType, MessageBus, OutboundMessage, ResolvedAttachment
logger = logging.getLogger(__name__)
class Channel(ABC):
"""Base class for all IM channel implementations.
Each channel connects to an external messaging platform and:
1. Receives messages, wraps them as InboundMessage, publishes to the bus.
2. Subscribes to outbound messages and sends replies back to the platform.
Subclasses must implement ``start``, ``stop``, and ``send``.
"""
def __init__(self, name: str, bus: MessageBus, config: dict[str, Any]) -> None:
self.name = name
self.bus = bus
self.config = config
self._running = False
@property
def is_running(self) -> bool:
return self._running
# -- lifecycle ---------------------------------------------------------
@abstractmethod
async def start(self) -> None:
"""Start listening for messages from the external platform."""
@abstractmethod
async def stop(self) -> None:
"""Gracefully stop the channel."""
# -- outbound ----------------------------------------------------------
@abstractmethod
async def send(self, msg: OutboundMessage) -> None:
"""Send a message back to the external platform.
The implementation should use ``msg.chat_id`` and ``msg.thread_ts``
to route the reply to the correct conversation/thread.
"""
async def send_file(self, msg: OutboundMessage, attachment: ResolvedAttachment) -> bool:
"""Upload a single file attachment to the platform.
Returns True if the upload succeeded, False otherwise.
Default implementation returns False (no file upload support).
"""
return False
# -- helpers -----------------------------------------------------------
def _make_inbound(
self,
chat_id: str,
user_id: str,
text: str,
*,
msg_type: InboundMessageType = InboundMessageType.CHAT,
thread_ts: str | None = None,
files: list[dict[str, Any]] | None = None,
metadata: dict[str, Any] | None = None,
) -> InboundMessage:
"""Convenience factory for creating InboundMessage instances."""
return InboundMessage(
channel_name=self.name,
chat_id=chat_id,
user_id=user_id,
text=text,
msg_type=msg_type,
thread_ts=thread_ts,
files=files or [],
metadata=metadata or {},
)
async def _on_outbound(self, msg: OutboundMessage) -> None:
"""Outbound callback registered with the bus.
Only forwards messages targeted at this channel.
Sends the text message first, then uploads any file attachments.
File uploads are skipped entirely when the text send fails to avoid
partial deliveries (files without accompanying text).
"""
if msg.channel_name == self.name:
try:
await self.send(msg)
except Exception:
logger.exception("Failed to send outbound message on channel %s", self.name)
return # Do not attempt file uploads when the text message failed
for attachment in msg.attachments:
try:
success = await self.send_file(msg, attachment)
if not success:
logger.warning("[%s] file upload skipped for %s", self.name, attachment.filename)
except Exception:
logger.exception("[%s] failed to upload file %s", self.name, attachment.filename)

View File

@@ -0,0 +1,510 @@
"""Feishu/Lark channel — connects to Feishu via WebSocket (no public IP needed)."""
from __future__ import annotations
import asyncio
import json
import logging
import threading
from typing import Any
from app.channels.base import Channel
from app.channels.message_bus import InboundMessageType, MessageBus, OutboundMessage, ResolvedAttachment
logger = logging.getLogger(__name__)
class FeishuChannel(Channel):
"""Feishu/Lark IM channel using the ``lark-oapi`` WebSocket client.
Configuration keys (in ``config.yaml`` under ``channels.feishu``):
- ``app_id``: Feishu app ID.
- ``app_secret``: Feishu app secret.
- ``verification_token``: (optional) Event verification token.
The channel uses WebSocket long-connection mode so no public IP is required.
Message flow:
1. User sends a message → bot adds "OK" emoji reaction
2. Bot replies in thread: "Working on it......"
3. Agent processes the message and returns a result
4. Bot replies in thread with the result
5. Bot adds "DONE" emoji reaction to the original message
"""
def __init__(self, bus: MessageBus, config: dict[str, Any]) -> None:
super().__init__(name="feishu", bus=bus, config=config)
self._thread: threading.Thread | None = None
self._main_loop: asyncio.AbstractEventLoop | None = None
self._api_client = None
self._CreateMessageReactionRequest = None
self._CreateMessageReactionRequestBody = None
self._Emoji = None
self._PatchMessageRequest = None
self._PatchMessageRequestBody = None
self._background_tasks: set[asyncio.Task] = set()
self._running_card_ids: dict[str, str] = {}
self._running_card_tasks: dict[str, asyncio.Task] = {}
self._CreateFileRequest = None
self._CreateFileRequestBody = None
self._CreateImageRequest = None
self._CreateImageRequestBody = None
async def start(self) -> None:
if self._running:
return
try:
import lark_oapi as lark
from lark_oapi.api.im.v1 import (
CreateFileRequest,
CreateFileRequestBody,
CreateImageRequest,
CreateImageRequestBody,
CreateMessageReactionRequest,
CreateMessageReactionRequestBody,
CreateMessageRequest,
CreateMessageRequestBody,
Emoji,
PatchMessageRequest,
PatchMessageRequestBody,
ReplyMessageRequest,
ReplyMessageRequestBody,
)
except ImportError:
logger.error("lark-oapi is not installed. Install it with: uv add lark-oapi")
return
self._lark = lark
self._CreateMessageRequest = CreateMessageRequest
self._CreateMessageRequestBody = CreateMessageRequestBody
self._ReplyMessageRequest = ReplyMessageRequest
self._ReplyMessageRequestBody = ReplyMessageRequestBody
self._CreateMessageReactionRequest = CreateMessageReactionRequest
self._CreateMessageReactionRequestBody = CreateMessageReactionRequestBody
self._Emoji = Emoji
self._PatchMessageRequest = PatchMessageRequest
self._PatchMessageRequestBody = PatchMessageRequestBody
self._CreateFileRequest = CreateFileRequest
self._CreateFileRequestBody = CreateFileRequestBody
self._CreateImageRequest = CreateImageRequest
self._CreateImageRequestBody = CreateImageRequestBody
app_id = self.config.get("app_id", "")
app_secret = self.config.get("app_secret", "")
if not app_id or not app_secret:
logger.error("Feishu channel requires app_id and app_secret")
return
self._api_client = lark.Client.builder().app_id(app_id).app_secret(app_secret).build()
self._main_loop = asyncio.get_event_loop()
self._running = True
self.bus.subscribe_outbound(self._on_outbound)
# Both ws.Client construction and start() must happen in a dedicated
# thread with its own event loop. lark-oapi caches the running loop
# at construction time and later calls loop.run_until_complete(),
# which conflicts with an already-running uvloop.
self._thread = threading.Thread(
target=self._run_ws,
args=(app_id, app_secret),
daemon=True,
)
self._thread.start()
logger.info("Feishu channel started")
def _run_ws(self, app_id: str, app_secret: str) -> None:
"""Construct and run the lark WS client in a thread with a fresh event loop.
The lark-oapi SDK captures a module-level event loop at import time
(``lark_oapi.ws.client.loop``). When uvicorn uses uvloop, that
captured loop is the *main* thread's uvloop — which is already
running, so ``loop.run_until_complete()`` inside ``Client.start()``
raises ``RuntimeError``.
We work around this by creating a plain asyncio event loop for this
thread and patching the SDK's module-level reference before calling
``start()``.
"""
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
import lark_oapi as lark
import lark_oapi.ws.client as _ws_client_mod
# Replace the SDK's module-level loop so Client.start() uses
# this thread's (non-running) event loop instead of the main
# thread's uvloop.
_ws_client_mod.loop = loop
event_handler = lark.EventDispatcherHandler.builder("", "").register_p2_im_message_receive_v1(self._on_message).build()
ws_client = lark.ws.Client(
app_id=app_id,
app_secret=app_secret,
event_handler=event_handler,
log_level=lark.LogLevel.INFO,
)
ws_client.start()
except Exception:
if self._running:
logger.exception("Feishu WebSocket error")
async def stop(self) -> None:
self._running = False
self.bus.unsubscribe_outbound(self._on_outbound)
for task in list(self._background_tasks):
task.cancel()
self._background_tasks.clear()
for task in list(self._running_card_tasks.values()):
task.cancel()
self._running_card_tasks.clear()
if self._thread:
self._thread.join(timeout=5)
self._thread = None
logger.info("Feishu channel stopped")
async def send(self, msg: OutboundMessage, *, _max_retries: int = 3) -> None:
if not self._api_client:
logger.warning("[Feishu] send called but no api_client available")
return
logger.info(
"[Feishu] sending reply: chat_id=%s, thread_ts=%s, text_len=%d",
msg.chat_id,
msg.thread_ts,
len(msg.text),
)
last_exc: Exception | None = None
for attempt in range(_max_retries):
try:
await self._send_card_message(msg)
return # success
except Exception as exc:
last_exc = exc
if attempt < _max_retries - 1:
delay = 2**attempt # 1s, 2s
logger.warning(
"[Feishu] send failed (attempt %d/%d), retrying in %ds: %s",
attempt + 1,
_max_retries,
delay,
exc,
)
await asyncio.sleep(delay)
logger.error("[Feishu] send failed after %d attempts: %s", _max_retries, last_exc)
raise last_exc # type: ignore[misc]
async def send_file(self, msg: OutboundMessage, attachment: ResolvedAttachment) -> bool:
if not self._api_client:
return False
# Check size limits (image: 10MB, file: 30MB)
if attachment.is_image and attachment.size > 10 * 1024 * 1024:
logger.warning("[Feishu] image too large (%d bytes), skipping: %s", attachment.size, attachment.filename)
return False
if not attachment.is_image and attachment.size > 30 * 1024 * 1024:
logger.warning("[Feishu] file too large (%d bytes), skipping: %s", attachment.size, attachment.filename)
return False
try:
if attachment.is_image:
file_key = await self._upload_image(attachment.actual_path)
msg_type = "image"
content = json.dumps({"image_key": file_key})
else:
file_key = await self._upload_file(attachment.actual_path, attachment.filename)
msg_type = "file"
content = json.dumps({"file_key": file_key})
if msg.thread_ts:
request = self._ReplyMessageRequest.builder().message_id(msg.thread_ts).request_body(self._ReplyMessageRequestBody.builder().msg_type(msg_type).content(content).reply_in_thread(True).build()).build()
await asyncio.to_thread(self._api_client.im.v1.message.reply, request)
else:
request = self._CreateMessageRequest.builder().receive_id_type("chat_id").request_body(self._CreateMessageRequestBody.builder().receive_id(msg.chat_id).msg_type(msg_type).content(content).build()).build()
await asyncio.to_thread(self._api_client.im.v1.message.create, request)
logger.info("[Feishu] file sent: %s (type=%s)", attachment.filename, msg_type)
return True
except Exception:
logger.exception("[Feishu] failed to upload/send file: %s", attachment.filename)
return False
async def _upload_image(self, path) -> str:
"""Upload an image to Feishu and return the image_key."""
with open(str(path), "rb") as f:
request = self._CreateImageRequest.builder().request_body(self._CreateImageRequestBody.builder().image_type("message").image(f).build()).build()
response = await asyncio.to_thread(self._api_client.im.v1.image.create, request)
if not response.success():
raise RuntimeError(f"Feishu image upload failed: code={response.code}, msg={response.msg}")
return response.data.image_key
async def _upload_file(self, path, filename: str) -> str:
"""Upload a file to Feishu and return the file_key."""
suffix = path.suffix.lower() if hasattr(path, "suffix") else ""
if suffix in (".xls", ".xlsx", ".csv"):
file_type = "xls"
elif suffix in (".ppt", ".pptx"):
file_type = "ppt"
elif suffix == ".pdf":
file_type = "pdf"
elif suffix in (".doc", ".docx"):
file_type = "doc"
else:
file_type = "stream"
with open(str(path), "rb") as f:
request = self._CreateFileRequest.builder().request_body(self._CreateFileRequestBody.builder().file_type(file_type).file_name(filename).file(f).build()).build()
response = await asyncio.to_thread(self._api_client.im.v1.file.create, request)
if not response.success():
raise RuntimeError(f"Feishu file upload failed: code={response.code}, msg={response.msg}")
return response.data.file_key
# -- message formatting ------------------------------------------------
@staticmethod
def _build_card_content(text: str) -> str:
"""Build a Feishu interactive card with markdown content.
Feishu's interactive card format natively renders markdown, including
headers, bold/italic, code blocks, lists, and links.
"""
card = {
"config": {"wide_screen_mode": True, "update_multi": True},
"elements": [{"tag": "markdown", "content": text}],
}
return json.dumps(card)
# -- reaction helpers --------------------------------------------------
async def _add_reaction(self, message_id: str, emoji_type: str = "THUMBSUP") -> None:
"""Add an emoji reaction to a message."""
if not self._api_client or not self._CreateMessageReactionRequest:
return
try:
request = self._CreateMessageReactionRequest.builder().message_id(message_id).request_body(self._CreateMessageReactionRequestBody.builder().reaction_type(self._Emoji.builder().emoji_type(emoji_type).build()).build()).build()
await asyncio.to_thread(self._api_client.im.v1.message_reaction.create, request)
logger.info("[Feishu] reaction '%s' added to message %s", emoji_type, message_id)
except Exception:
logger.exception("[Feishu] failed to add reaction '%s' to message %s", emoji_type, message_id)
async def _reply_card(self, message_id: str, text: str) -> str | None:
"""Reply with an interactive card and return the created card message ID."""
if not self._api_client:
return None
content = self._build_card_content(text)
request = self._ReplyMessageRequest.builder().message_id(message_id).request_body(self._ReplyMessageRequestBody.builder().msg_type("interactive").content(content).reply_in_thread(True).build()).build()
response = await asyncio.to_thread(self._api_client.im.v1.message.reply, request)
response_data = getattr(response, "data", None)
return getattr(response_data, "message_id", None)
async def _create_card(self, chat_id: str, text: str) -> None:
"""Create a new card message in the target chat."""
if not self._api_client:
return
content = self._build_card_content(text)
request = self._CreateMessageRequest.builder().receive_id_type("chat_id").request_body(self._CreateMessageRequestBody.builder().receive_id(chat_id).msg_type("interactive").content(content).build()).build()
await asyncio.to_thread(self._api_client.im.v1.message.create, request)
async def _update_card(self, message_id: str, text: str) -> None:
"""Patch an existing card message in place."""
if not self._api_client or not self._PatchMessageRequest:
return
content = self._build_card_content(text)
request = self._PatchMessageRequest.builder().message_id(message_id).request_body(self._PatchMessageRequestBody.builder().content(content).build()).build()
await asyncio.to_thread(self._api_client.im.v1.message.patch, request)
def _track_background_task(self, task: asyncio.Task, *, name: str, msg_id: str) -> None:
"""Keep a strong reference to fire-and-forget tasks and surface errors."""
self._background_tasks.add(task)
task.add_done_callback(lambda done_task, task_name=name, mid=msg_id: self._finalize_background_task(done_task, task_name, mid))
def _finalize_background_task(self, task: asyncio.Task, name: str, msg_id: str) -> None:
self._background_tasks.discard(task)
self._log_task_error(task, name, msg_id)
async def _create_running_card(self, source_message_id: str, text: str) -> str | None:
"""Create the running card and cache its message ID when available."""
running_card_id = await self._reply_card(source_message_id, text)
if running_card_id:
self._running_card_ids[source_message_id] = running_card_id
logger.info("[Feishu] running card created: source=%s card=%s", source_message_id, running_card_id)
else:
logger.warning("[Feishu] running card creation returned no message_id for source=%s, subsequent updates will fall back to new replies", source_message_id)
return running_card_id
def _ensure_running_card_started(self, source_message_id: str, text: str = "Working on it...") -> asyncio.Task | None:
"""Start running-card creation once per source message."""
running_card_id = self._running_card_ids.get(source_message_id)
if running_card_id:
return None
running_card_task = self._running_card_tasks.get(source_message_id)
if running_card_task:
return running_card_task
running_card_task = asyncio.create_task(self._create_running_card(source_message_id, text))
self._running_card_tasks[source_message_id] = running_card_task
running_card_task.add_done_callback(lambda done_task, mid=source_message_id: self._finalize_running_card_task(mid, done_task))
return running_card_task
def _finalize_running_card_task(self, source_message_id: str, task: asyncio.Task) -> None:
if self._running_card_tasks.get(source_message_id) is task:
self._running_card_tasks.pop(source_message_id, None)
self._log_task_error(task, "create_running_card", source_message_id)
async def _ensure_running_card(self, source_message_id: str, text: str = "Working on it...") -> str | None:
"""Ensure the in-thread running card exists and track its message ID."""
running_card_id = self._running_card_ids.get(source_message_id)
if running_card_id:
return running_card_id
running_card_task = self._ensure_running_card_started(source_message_id, text)
if running_card_task is None:
return self._running_card_ids.get(source_message_id)
return await running_card_task
async def _send_running_reply(self, message_id: str) -> None:
"""Reply to a message in-thread with a running card."""
try:
await self._ensure_running_card(message_id)
except Exception:
logger.exception("[Feishu] failed to send running reply for message %s", message_id)
async def _send_card_message(self, msg: OutboundMessage) -> None:
"""Send or update the Feishu card tied to the current request."""
source_message_id = msg.thread_ts
if source_message_id:
running_card_id = self._running_card_ids.get(source_message_id)
awaited_running_card_task = False
if not running_card_id:
running_card_task = self._running_card_tasks.get(source_message_id)
if running_card_task:
awaited_running_card_task = True
running_card_id = await running_card_task
if running_card_id:
try:
await self._update_card(running_card_id, msg.text)
except Exception:
if not msg.is_final:
raise
logger.exception(
"[Feishu] failed to patch running card %s, falling back to final reply",
running_card_id,
)
await self._reply_card(source_message_id, msg.text)
else:
logger.info("[Feishu] running card updated: source=%s card=%s", source_message_id, running_card_id)
elif msg.is_final:
await self._reply_card(source_message_id, msg.text)
elif awaited_running_card_task:
logger.warning(
"[Feishu] running card task finished without message_id for source=%s, skipping duplicate non-final creation",
source_message_id,
)
else:
await self._ensure_running_card(source_message_id, msg.text)
if msg.is_final:
self._running_card_ids.pop(source_message_id, None)
await self._add_reaction(source_message_id, "DONE")
return
await self._create_card(msg.chat_id, msg.text)
# -- internal ----------------------------------------------------------
@staticmethod
def _log_future_error(fut, name: str, msg_id: str) -> None:
"""Callback for run_coroutine_threadsafe futures to surface errors."""
try:
exc = fut.exception()
if exc:
logger.error("[Feishu] %s failed for msg_id=%s: %s", name, msg_id, exc)
except Exception:
pass
@staticmethod
def _log_task_error(task: asyncio.Task, name: str, msg_id: str) -> None:
"""Callback for background asyncio tasks to surface errors."""
try:
exc = task.exception()
if exc:
logger.error("[Feishu] %s failed for msg_id=%s: %s", name, msg_id, exc)
except asyncio.CancelledError:
logger.info("[Feishu] %s cancelled for msg_id=%s", name, msg_id)
except Exception:
pass
async def _prepare_inbound(self, msg_id: str, inbound) -> None:
"""Kick off Feishu side effects without delaying inbound dispatch."""
reaction_task = asyncio.create_task(self._add_reaction(msg_id, "OK"))
self._track_background_task(reaction_task, name="add_reaction", msg_id=msg_id)
self._ensure_running_card_started(msg_id)
await self.bus.publish_inbound(inbound)
def _on_message(self, event) -> None:
"""Called by lark-oapi when a message is received (runs in lark thread)."""
try:
logger.info("[Feishu] raw event received: type=%s", type(event).__name__)
message = event.event.message
chat_id = message.chat_id
msg_id = message.message_id
sender_id = event.event.sender.sender_id.open_id
# root_id is set when the message is a reply within a Feishu thread.
# Use it as topic_id so all replies share the same DeerFlow thread.
root_id = getattr(message, "root_id", None) or None
# Parse message content
content = json.loads(message.content)
text = content.get("text", "").strip()
logger.info(
"[Feishu] parsed message: chat_id=%s, msg_id=%s, root_id=%s, sender=%s, text=%r",
chat_id,
msg_id,
root_id,
sender_id,
text[:100] if text else "",
)
if not text:
logger.info("[Feishu] empty text, ignoring message")
return
# Check if it's a command
if text.startswith("/"):
msg_type = InboundMessageType.COMMAND
else:
msg_type = InboundMessageType.CHAT
# topic_id: use root_id for replies (same topic), msg_id for new messages (new topic)
topic_id = root_id or msg_id
inbound = self._make_inbound(
chat_id=chat_id,
user_id=sender_id,
text=text,
msg_type=msg_type,
thread_ts=msg_id,
metadata={"message_id": msg_id, "root_id": root_id},
)
inbound.topic_id = topic_id
# Schedule on the async event loop
if self._main_loop and self._main_loop.is_running():
logger.info("[Feishu] publishing inbound message to bus (type=%s, msg_id=%s)", msg_type.value, msg_id)
fut = asyncio.run_coroutine_threadsafe(self._prepare_inbound(msg_id, inbound), self._main_loop)
fut.add_done_callback(lambda f, mid=msg_id: self._log_future_error(f, "prepare_inbound", mid))
else:
logger.warning("[Feishu] main loop not running, cannot publish inbound message")
except Exception:
logger.exception("[Feishu] error processing message")

View File

@@ -0,0 +1,703 @@
"""ChannelManager — consumes inbound messages and dispatches them to the DeerFlow agent via LangGraph Server."""
from __future__ import annotations
import asyncio
import logging
import mimetypes
import time
from collections.abc import Mapping
from typing import Any
from app.channels.message_bus import InboundMessage, InboundMessageType, MessageBus, OutboundMessage, ResolvedAttachment
from app.channels.store import ChannelStore
logger = logging.getLogger(__name__)
DEFAULT_LANGGRAPH_URL = "http://localhost:2024"
DEFAULT_GATEWAY_URL = "http://localhost:8001"
DEFAULT_ASSISTANT_ID = "lead_agent"
DEFAULT_RUN_CONFIG: dict[str, Any] = {"recursion_limit": 100}
DEFAULT_RUN_CONTEXT: dict[str, Any] = {
"thinking_enabled": True,
"is_plan_mode": False,
"subagent_enabled": False,
}
STREAM_UPDATE_MIN_INTERVAL_SECONDS = 0.35
def _as_dict(value: Any) -> dict[str, Any]:
return dict(value) if isinstance(value, Mapping) else {}
def _merge_dicts(*layers: Any) -> dict[str, Any]:
merged: dict[str, Any] = {}
for layer in layers:
if isinstance(layer, Mapping):
merged.update(layer)
return merged
def _extract_response_text(result: dict | list) -> str:
"""Extract the last AI message text from a LangGraph runs.wait result.
``runs.wait`` returns the final state dict which contains a ``messages``
list. Each message is a dict with at least ``type`` and ``content``.
Handles special cases:
- Regular AI text responses
- Clarification interrupts (``ask_clarification`` tool messages)
- AI messages with tool_calls but no text content
"""
if isinstance(result, list):
messages = result
elif isinstance(result, dict):
messages = result.get("messages", [])
else:
return ""
# Walk backwards to find usable response text, but stop at the last
# human message to avoid returning text from a previous turn.
for msg in reversed(messages):
if not isinstance(msg, dict):
continue
msg_type = msg.get("type")
# Stop at the last human message — anything before it is a previous turn
if msg_type == "human":
break
# Check for tool messages from ask_clarification (interrupt case)
if msg_type == "tool" and msg.get("name") == "ask_clarification":
content = msg.get("content", "")
if isinstance(content, str) and content:
return content
# Regular AI message with text content
if msg_type == "ai":
content = msg.get("content", "")
if isinstance(content, str) and content:
return content
# content can be a list of content blocks
if isinstance(content, list):
parts = []
for block in content:
if isinstance(block, dict) and block.get("type") == "text":
parts.append(block.get("text", ""))
elif isinstance(block, str):
parts.append(block)
text = "".join(parts)
if text:
return text
return ""
def _extract_text_content(content: Any) -> str:
"""Extract text from a streaming payload content field."""
if isinstance(content, str):
return content
if isinstance(content, list):
parts: list[str] = []
for block in content:
if isinstance(block, str):
parts.append(block)
elif isinstance(block, Mapping):
text = block.get("text")
if isinstance(text, str):
parts.append(text)
else:
nested = block.get("content")
if isinstance(nested, str):
parts.append(nested)
return "".join(parts)
if isinstance(content, Mapping):
for key in ("text", "content"):
value = content.get(key)
if isinstance(value, str):
return value
return ""
def _merge_stream_text(existing: str, chunk: str) -> str:
"""Merge either delta text or cumulative text into a single snapshot."""
if not chunk:
return existing
if not existing or chunk == existing:
return chunk or existing
if chunk.startswith(existing):
return chunk
if existing.endswith(chunk):
return existing
return existing + chunk
def _extract_stream_message_id(payload: Any, metadata: Any) -> str | None:
"""Best-effort extraction of the streamed AI message identifier."""
candidates = [payload, metadata]
if isinstance(payload, Mapping):
candidates.append(payload.get("kwargs"))
for candidate in candidates:
if not isinstance(candidate, Mapping):
continue
for key in ("id", "message_id"):
value = candidate.get(key)
if isinstance(value, str) and value:
return value
return None
def _accumulate_stream_text(
buffers: dict[str, str],
current_message_id: str | None,
event_data: Any,
) -> tuple[str | None, str | None]:
"""Convert a ``messages-tuple`` event into the latest displayable AI text."""
payload = event_data
metadata: Any = None
if isinstance(event_data, (list, tuple)):
if event_data:
payload = event_data[0]
if len(event_data) > 1:
metadata = event_data[1]
if isinstance(payload, str):
message_id = current_message_id or "__default__"
buffers[message_id] = _merge_stream_text(buffers.get(message_id, ""), payload)
return buffers[message_id], message_id
if not isinstance(payload, Mapping):
return None, current_message_id
payload_type = str(payload.get("type", "")).lower()
if "tool" in payload_type:
return None, current_message_id
text = _extract_text_content(payload.get("content"))
if not text and isinstance(payload.get("kwargs"), Mapping):
text = _extract_text_content(payload["kwargs"].get("content"))
if not text:
return None, current_message_id
message_id = _extract_stream_message_id(payload, metadata) or current_message_id or "__default__"
buffers[message_id] = _merge_stream_text(buffers.get(message_id, ""), text)
return buffers[message_id], message_id
def _extract_artifacts(result: dict | list) -> list[str]:
"""Extract artifact paths from the last AI response cycle only.
Instead of reading the full accumulated ``artifacts`` state (which contains
all artifacts ever produced in the thread), this inspects the messages after
the last human message and collects file paths from ``present_files`` tool
calls. This ensures only newly-produced artifacts are returned.
"""
if isinstance(result, list):
messages = result
elif isinstance(result, dict):
messages = result.get("messages", [])
else:
return []
artifacts: list[str] = []
for msg in reversed(messages):
if not isinstance(msg, dict):
continue
# Stop at the last human message — anything before it is a previous turn
if msg.get("type") == "human":
break
# Look for AI messages with present_files tool calls
if msg.get("type") == "ai":
for tc in msg.get("tool_calls", []):
if isinstance(tc, dict) and tc.get("name") == "present_files":
args = tc.get("args", {})
paths = args.get("filepaths", [])
if isinstance(paths, list):
artifacts.extend(p for p in paths if isinstance(p, str))
return artifacts
def _format_artifact_text(artifacts: list[str]) -> str:
"""Format artifact paths into a human-readable text block listing filenames."""
import posixpath
filenames = [posixpath.basename(p) for p in artifacts]
if len(filenames) == 1:
return f"Created File: 📎 {filenames[0]}"
return "Created Files: 📎 " + "".join(filenames)
_OUTPUTS_VIRTUAL_PREFIX = "/mnt/user-data/outputs/"
def _resolve_attachments(thread_id: str, artifacts: list[str]) -> list[ResolvedAttachment]:
"""Resolve virtual artifact paths to host filesystem paths with metadata.
Only paths under ``/mnt/user-data/outputs/`` are accepted; any other
virtual path is rejected with a warning to prevent exfiltrating uploads
or workspace files via IM channels.
Skips artifacts that cannot be resolved (missing files, invalid paths)
and logs warnings for them.
"""
from deerflow.config.paths import get_paths
attachments: list[ResolvedAttachment] = []
paths = get_paths()
outputs_dir = paths.sandbox_outputs_dir(thread_id).resolve()
for virtual_path in artifacts:
# Security: only allow files from the agent outputs directory
if not virtual_path.startswith(_OUTPUTS_VIRTUAL_PREFIX):
logger.warning("[Manager] rejected non-outputs artifact path: %s", virtual_path)
continue
try:
actual = paths.resolve_virtual_path(thread_id, virtual_path)
# Verify the resolved path is actually under the outputs directory
# (guards against path-traversal even after prefix check)
try:
actual.resolve().relative_to(outputs_dir)
except ValueError:
logger.warning("[Manager] artifact path escapes outputs dir: %s -> %s", virtual_path, actual)
continue
if not actual.is_file():
logger.warning("[Manager] artifact not found on disk: %s -> %s", virtual_path, actual)
continue
mime, _ = mimetypes.guess_type(str(actual))
mime = mime or "application/octet-stream"
attachments.append(
ResolvedAttachment(
virtual_path=virtual_path,
actual_path=actual,
filename=actual.name,
mime_type=mime,
size=actual.stat().st_size,
is_image=mime.startswith("image/"),
)
)
except (ValueError, OSError) as exc:
logger.warning("[Manager] failed to resolve artifact %s: %s", virtual_path, exc)
return attachments
def _prepare_artifact_delivery(
thread_id: str,
response_text: str,
artifacts: list[str],
) -> tuple[str, list[ResolvedAttachment]]:
"""Resolve attachments and append filename fallbacks to the text response."""
attachments: list[ResolvedAttachment] = []
if not artifacts:
return response_text, attachments
attachments = _resolve_attachments(thread_id, artifacts)
resolved_virtuals = {attachment.virtual_path for attachment in attachments}
unresolved = [path for path in artifacts if path not in resolved_virtuals]
if unresolved:
artifact_text = _format_artifact_text(unresolved)
response_text = (response_text + "\n\n" + artifact_text) if response_text else artifact_text
# Always include resolved attachment filenames as a text fallback so files
# remain discoverable even when the upload is skipped or fails.
if attachments:
resolved_text = _format_artifact_text([attachment.virtual_path for attachment in attachments])
response_text = (response_text + "\n\n" + resolved_text) if response_text else resolved_text
return response_text, attachments
class ChannelManager:
"""Core dispatcher that bridges IM channels to the DeerFlow agent.
It reads from the MessageBus inbound queue, creates/reuses threads on
the LangGraph Server, sends messages via ``runs.wait``, and publishes
outbound responses back through the bus.
"""
def __init__(
self,
bus: MessageBus,
store: ChannelStore,
*,
max_concurrency: int = 5,
langgraph_url: str = DEFAULT_LANGGRAPH_URL,
gateway_url: str = DEFAULT_GATEWAY_URL,
assistant_id: str = DEFAULT_ASSISTANT_ID,
default_session: dict[str, Any] | None = None,
channel_sessions: dict[str, Any] | None = None,
) -> None:
self.bus = bus
self.store = store
self._max_concurrency = max_concurrency
self._langgraph_url = langgraph_url
self._gateway_url = gateway_url
self._assistant_id = assistant_id
self._default_session = _as_dict(default_session)
self._channel_sessions = dict(channel_sessions or {})
self._client = None # lazy init — langgraph_sdk async client
self._semaphore: asyncio.Semaphore | None = None
self._running = False
self._task: asyncio.Task | None = None
def _resolve_session_layer(self, msg: InboundMessage) -> tuple[dict[str, Any], dict[str, Any]]:
channel_layer = _as_dict(self._channel_sessions.get(msg.channel_name))
users_layer = _as_dict(channel_layer.get("users"))
user_layer = _as_dict(users_layer.get(msg.user_id))
return channel_layer, user_layer
def _resolve_run_params(self, msg: InboundMessage, thread_id: str) -> tuple[str, dict[str, Any], dict[str, Any]]:
channel_layer, user_layer = self._resolve_session_layer(msg)
assistant_id = user_layer.get("assistant_id") or channel_layer.get("assistant_id") or self._default_session.get("assistant_id") or self._assistant_id
if not isinstance(assistant_id, str) or not assistant_id.strip():
assistant_id = self._assistant_id
run_config = _merge_dicts(
DEFAULT_RUN_CONFIG,
self._default_session.get("config"),
channel_layer.get("config"),
user_layer.get("config"),
)
run_context = _merge_dicts(
DEFAULT_RUN_CONTEXT,
self._default_session.get("context"),
channel_layer.get("context"),
user_layer.get("context"),
{"thread_id": thread_id},
)
return assistant_id, run_config, run_context
# -- LangGraph SDK client (lazy) ----------------------------------------
def _get_client(self):
"""Return the ``langgraph_sdk`` async client, creating it on first use."""
if self._client is None:
from langgraph_sdk import get_client
self._client = get_client(url=self._langgraph_url)
return self._client
# -- lifecycle ---------------------------------------------------------
async def start(self) -> None:
"""Start the dispatch loop."""
if self._running:
return
self._running = True
self._semaphore = asyncio.Semaphore(self._max_concurrency)
self._task = asyncio.create_task(self._dispatch_loop())
logger.info("ChannelManager started (max_concurrency=%d)", self._max_concurrency)
async def stop(self) -> None:
"""Stop the dispatch loop."""
self._running = False
if self._task:
self._task.cancel()
try:
await self._task
except asyncio.CancelledError:
pass
self._task = None
logger.info("ChannelManager stopped")
# -- dispatch loop -----------------------------------------------------
async def _dispatch_loop(self) -> None:
logger.info("[Manager] dispatch loop started, waiting for inbound messages")
while self._running:
try:
msg = await asyncio.wait_for(self.bus.get_inbound(), timeout=1.0)
except TimeoutError:
continue
except asyncio.CancelledError:
break
logger.info(
"[Manager] received inbound: channel=%s, chat_id=%s, type=%s, text=%r",
msg.channel_name,
msg.chat_id,
msg.msg_type.value,
msg.text[:100] if msg.text else "",
)
task = asyncio.create_task(self._handle_message(msg))
task.add_done_callback(self._log_task_error)
@staticmethod
def _log_task_error(task: asyncio.Task) -> None:
"""Surface unhandled exceptions from background tasks."""
if task.cancelled():
return
exc = task.exception()
if exc:
logger.error("[Manager] unhandled error in message task: %s", exc, exc_info=exc)
async def _handle_message(self, msg: InboundMessage) -> None:
async with self._semaphore:
try:
if msg.msg_type == InboundMessageType.COMMAND:
await self._handle_command(msg)
else:
await self._handle_chat(msg)
except Exception:
logger.exception(
"Error handling message from %s (chat=%s)",
msg.channel_name,
msg.chat_id,
)
await self._send_error(msg, "An internal error occurred. Please try again.")
# -- chat handling -----------------------------------------------------
async def _create_thread(self, client, msg: InboundMessage) -> str:
"""Create a new thread on the LangGraph Server and store the mapping."""
thread = await client.threads.create()
thread_id = thread["thread_id"]
self.store.set_thread_id(
msg.channel_name,
msg.chat_id,
thread_id,
topic_id=msg.topic_id,
user_id=msg.user_id,
)
logger.info("[Manager] new thread created on LangGraph Server: thread_id=%s for chat_id=%s topic_id=%s", thread_id, msg.chat_id, msg.topic_id)
return thread_id
async def _handle_chat(self, msg: InboundMessage) -> None:
client = self._get_client()
# Look up existing DeerFlow thread.
# topic_id may be None (e.g. Telegram private chats) — the store
# handles this by using the "channel:chat_id" key without a topic suffix.
thread_id = self.store.get_thread_id(msg.channel_name, msg.chat_id, topic_id=msg.topic_id)
if thread_id:
logger.info("[Manager] reusing thread: thread_id=%s for topic_id=%s", thread_id, msg.topic_id)
# No existing thread found — create a new one
if thread_id is None:
thread_id = await self._create_thread(client, msg)
assistant_id, run_config, run_context = self._resolve_run_params(msg, thread_id)
if msg.channel_name == "feishu":
await self._handle_streaming_chat(
client,
msg,
thread_id,
assistant_id,
run_config,
run_context,
)
return
logger.info("[Manager] invoking runs.wait(thread_id=%s, text=%r)", thread_id, msg.text[:100])
result = await client.runs.wait(
thread_id,
assistant_id,
input={"messages": [{"role": "human", "content": msg.text}]},
config=run_config,
context=run_context,
)
response_text = _extract_response_text(result)
artifacts = _extract_artifacts(result)
logger.info(
"[Manager] agent response received: thread_id=%s, response_len=%d, artifacts=%d",
thread_id,
len(response_text) if response_text else 0,
len(artifacts),
)
response_text, attachments = _prepare_artifact_delivery(thread_id, response_text, artifacts)
if not response_text:
if attachments:
response_text = _format_artifact_text([a.virtual_path for a in attachments])
else:
response_text = "(No response from agent)"
outbound = OutboundMessage(
channel_name=msg.channel_name,
chat_id=msg.chat_id,
thread_id=thread_id,
text=response_text,
artifacts=artifacts,
attachments=attachments,
thread_ts=msg.thread_ts,
)
logger.info("[Manager] publishing outbound message to bus: channel=%s, chat_id=%s", msg.channel_name, msg.chat_id)
await self.bus.publish_outbound(outbound)
async def _handle_streaming_chat(
self,
client,
msg: InboundMessage,
thread_id: str,
assistant_id: str,
run_config: dict[str, Any],
run_context: dict[str, Any],
) -> None:
logger.info("[Manager] invoking runs.stream(thread_id=%s, text=%r)", thread_id, msg.text[:100])
last_values: dict[str, Any] | list | None = None
streamed_buffers: dict[str, str] = {}
current_message_id: str | None = None
latest_text = ""
last_published_text = ""
last_publish_at = 0.0
stream_error: BaseException | None = None
try:
async for chunk in client.runs.stream(
thread_id,
assistant_id,
input={"messages": [{"role": "human", "content": msg.text}]},
config=run_config,
context=run_context,
stream_mode=["messages-tuple", "values"],
):
event = getattr(chunk, "event", "")
data = getattr(chunk, "data", None)
if event == "messages-tuple":
accumulated_text, current_message_id = _accumulate_stream_text(streamed_buffers, current_message_id, data)
if accumulated_text:
latest_text = accumulated_text
elif event == "values" and isinstance(data, (dict, list)):
last_values = data
snapshot_text = _extract_response_text(data)
if snapshot_text:
latest_text = snapshot_text
if not latest_text or latest_text == last_published_text:
continue
now = time.monotonic()
if last_published_text and now - last_publish_at < STREAM_UPDATE_MIN_INTERVAL_SECONDS:
continue
await self.bus.publish_outbound(
OutboundMessage(
channel_name=msg.channel_name,
chat_id=msg.chat_id,
thread_id=thread_id,
text=latest_text,
is_final=False,
thread_ts=msg.thread_ts,
)
)
last_published_text = latest_text
last_publish_at = now
except Exception as exc:
stream_error = exc
logger.exception("[Manager] streaming error: thread_id=%s", thread_id)
finally:
result = last_values if last_values is not None else {"messages": [{"type": "ai", "content": latest_text}]}
response_text = _extract_response_text(result)
artifacts = _extract_artifacts(result)
response_text, attachments = _prepare_artifact_delivery(thread_id, response_text, artifacts)
if not response_text:
if attachments:
response_text = _format_artifact_text([attachment.virtual_path for attachment in attachments])
elif stream_error:
response_text = "An error occurred while processing your request. Please try again."
else:
response_text = latest_text or "(No response from agent)"
logger.info(
"[Manager] streaming response completed: thread_id=%s, response_len=%d, artifacts=%d, error=%s",
thread_id,
len(response_text),
len(artifacts),
stream_error,
)
await self.bus.publish_outbound(
OutboundMessage(
channel_name=msg.channel_name,
chat_id=msg.chat_id,
thread_id=thread_id,
text=response_text,
artifacts=artifacts,
attachments=attachments,
is_final=True,
thread_ts=msg.thread_ts,
)
)
# -- command handling --------------------------------------------------
async def _handle_command(self, msg: InboundMessage) -> None:
text = msg.text.strip()
parts = text.split(maxsplit=1)
command = parts[0].lower().lstrip("/")
if command == "new":
# Create a new thread on the LangGraph Server
client = self._get_client()
thread = await client.threads.create()
new_thread_id = thread["thread_id"]
self.store.set_thread_id(
msg.channel_name,
msg.chat_id,
new_thread_id,
topic_id=msg.topic_id,
user_id=msg.user_id,
)
reply = "New conversation started."
elif command == "status":
thread_id = self.store.get_thread_id(msg.channel_name, msg.chat_id, topic_id=msg.topic_id)
reply = f"Active thread: {thread_id}" if thread_id else "No active conversation."
elif command == "models":
reply = await self._fetch_gateway("/api/models", "models")
elif command == "memory":
reply = await self._fetch_gateway("/api/memory", "memory")
elif command == "help":
reply = "Available commands:\n/new — Start a new conversation\n/status — Show current thread info\n/models — List available models\n/memory — Show memory status\n/help — Show this help"
else:
reply = f"Unknown command: /{command}. Type /help for available commands."
outbound = OutboundMessage(
channel_name=msg.channel_name,
chat_id=msg.chat_id,
thread_id=self.store.get_thread_id(msg.channel_name, msg.chat_id) or "",
text=reply,
thread_ts=msg.thread_ts,
)
await self.bus.publish_outbound(outbound)
async def _fetch_gateway(self, path: str, kind: str) -> str:
"""Fetch data from the Gateway API for command responses."""
import httpx
try:
async with httpx.AsyncClient() as http:
resp = await http.get(f"{self._gateway_url}{path}", timeout=10)
resp.raise_for_status()
data = resp.json()
except Exception:
logger.exception("Failed to fetch %s from gateway", kind)
return f"Failed to fetch {kind} information."
if kind == "models":
names = [m["name"] for m in data.get("models", [])]
return ("Available models:\n" + "\n".join(f"{n}" for n in names)) if names else "No models configured."
elif kind == "memory":
facts = data.get("facts", [])
return f"Memory contains {len(facts)} fact(s)."
return str(data)
# -- error helper ------------------------------------------------------
async def _send_error(self, msg: InboundMessage, error_text: str) -> None:
outbound = OutboundMessage(
channel_name=msg.channel_name,
chat_id=msg.chat_id,
thread_id=self.store.get_thread_id(msg.channel_name, msg.chat_id) or "",
text=error_text,
thread_ts=msg.thread_ts,
)
await self.bus.publish_outbound(outbound)

View File

@@ -0,0 +1,173 @@
"""MessageBus — async pub/sub hub that decouples channels from the agent dispatcher."""
from __future__ import annotations
import asyncio
import logging
import time
from collections.abc import Callable, Coroutine
from dataclasses import dataclass, field
from enum import StrEnum
from pathlib import Path
from typing import Any
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Message types
# ---------------------------------------------------------------------------
class InboundMessageType(StrEnum):
"""Types of messages arriving from IM channels."""
CHAT = "chat"
COMMAND = "command"
@dataclass
class InboundMessage:
"""A message arriving from an IM channel toward the agent dispatcher.
Attributes:
channel_name: Name of the source channel (e.g. "feishu", "slack").
chat_id: Platform-specific chat/conversation identifier.
user_id: Platform-specific user identifier.
text: The message text.
msg_type: Whether this is a regular chat message or a command.
thread_ts: Optional platform thread identifier (for threaded replies).
topic_id: Conversation topic identifier used to map to a DeerFlow thread.
Messages sharing the same ``topic_id`` within a ``chat_id`` will
reuse the same DeerFlow thread. When ``None``, each message
creates a new thread (one-shot Q&A).
files: Optional list of file attachments (platform-specific dicts).
metadata: Arbitrary extra data from the channel.
created_at: Unix timestamp when the message was created.
"""
channel_name: str
chat_id: str
user_id: str
text: str
msg_type: InboundMessageType = InboundMessageType.CHAT
thread_ts: str | None = None
topic_id: str | None = None
files: list[dict[str, Any]] = field(default_factory=list)
metadata: dict[str, Any] = field(default_factory=dict)
created_at: float = field(default_factory=time.time)
@dataclass
class ResolvedAttachment:
"""A file attachment resolved to a host filesystem path, ready for upload.
Attributes:
virtual_path: Original virtual path (e.g. /mnt/user-data/outputs/report.pdf).
actual_path: Resolved host filesystem path.
filename: Basename of the file.
mime_type: MIME type (e.g. "application/pdf").
size: File size in bytes.
is_image: True for image/* MIME types (platforms may handle images differently).
"""
virtual_path: str
actual_path: Path
filename: str
mime_type: str
size: int
is_image: bool
@dataclass
class OutboundMessage:
"""A message from the agent dispatcher back to a channel.
Attributes:
channel_name: Target channel name (used for routing).
chat_id: Target chat/conversation identifier.
thread_id: DeerFlow thread ID that produced this response.
text: The response text.
artifacts: List of artifact paths produced by the agent.
is_final: Whether this is the final message in the response stream.
thread_ts: Optional platform thread identifier for threaded replies.
metadata: Arbitrary extra data.
created_at: Unix timestamp.
"""
channel_name: str
chat_id: str
thread_id: str
text: str
artifacts: list[str] = field(default_factory=list)
attachments: list[ResolvedAttachment] = field(default_factory=list)
is_final: bool = True
thread_ts: str | None = None
metadata: dict[str, Any] = field(default_factory=dict)
created_at: float = field(default_factory=time.time)
# ---------------------------------------------------------------------------
# MessageBus
# ---------------------------------------------------------------------------
OutboundCallback = Callable[[OutboundMessage], Coroutine[Any, Any, None]]
class MessageBus:
"""Async pub/sub hub connecting channels and the agent dispatcher.
Channels publish inbound messages; the dispatcher consumes them.
The dispatcher publishes outbound messages; channels receive them
via registered callbacks.
"""
def __init__(self) -> None:
self._inbound_queue: asyncio.Queue[InboundMessage] = asyncio.Queue()
self._outbound_listeners: list[OutboundCallback] = []
# -- inbound -----------------------------------------------------------
async def publish_inbound(self, msg: InboundMessage) -> None:
"""Enqueue an inbound message from a channel."""
await self._inbound_queue.put(msg)
logger.info(
"[Bus] inbound enqueued: channel=%s, chat_id=%s, type=%s, queue_size=%d",
msg.channel_name,
msg.chat_id,
msg.msg_type.value,
self._inbound_queue.qsize(),
)
async def get_inbound(self) -> InboundMessage:
"""Block until the next inbound message is available."""
return await self._inbound_queue.get()
@property
def inbound_queue(self) -> asyncio.Queue[InboundMessage]:
return self._inbound_queue
# -- outbound ----------------------------------------------------------
def subscribe_outbound(self, callback: OutboundCallback) -> None:
"""Register an async callback for outbound messages."""
self._outbound_listeners.append(callback)
def unsubscribe_outbound(self, callback: OutboundCallback) -> None:
"""Remove a previously registered outbound callback."""
self._outbound_listeners = [cb for cb in self._outbound_listeners if cb is not callback]
async def publish_outbound(self, msg: OutboundMessage) -> None:
"""Dispatch an outbound message to all registered listeners."""
logger.info(
"[Bus] outbound dispatching: channel=%s, chat_id=%s, listeners=%d, text_len=%d",
msg.channel_name,
msg.chat_id,
len(self._outbound_listeners),
len(msg.text),
)
for callback in self._outbound_listeners:
try:
await callback(msg)
except Exception:
logger.exception("Error in outbound callback for channel=%s", msg.channel_name)

View File

@@ -0,0 +1,178 @@
"""ChannelService — manages the lifecycle of all IM channels."""
from __future__ import annotations
import logging
from typing import Any
from app.channels.manager import ChannelManager
from app.channels.message_bus import MessageBus
from app.channels.store import ChannelStore
logger = logging.getLogger(__name__)
# Channel name → import path for lazy loading
_CHANNEL_REGISTRY: dict[str, str] = {
"feishu": "app.channels.feishu:FeishuChannel",
"slack": "app.channels.slack:SlackChannel",
"telegram": "app.channels.telegram:TelegramChannel",
}
class ChannelService:
"""Manages the lifecycle of all configured IM channels.
Reads configuration from ``config.yaml`` under the ``channels`` key,
instantiates enabled channels, and starts the ChannelManager dispatcher.
"""
def __init__(self, channels_config: dict[str, Any] | None = None) -> None:
self.bus = MessageBus()
self.store = ChannelStore()
config = dict(channels_config or {})
langgraph_url = config.pop("langgraph_url", None) or "http://localhost:2024"
gateway_url = config.pop("gateway_url", None) or "http://localhost:8001"
default_session = config.pop("session", None)
channel_sessions = {name: channel_config.get("session") for name, channel_config in config.items() if isinstance(channel_config, dict)}
self.manager = ChannelManager(
bus=self.bus,
store=self.store,
langgraph_url=langgraph_url,
gateway_url=gateway_url,
default_session=default_session if isinstance(default_session, dict) else None,
channel_sessions=channel_sessions,
)
self._channels: dict[str, Any] = {} # name -> Channel instance
self._config = config
self._running = False
@classmethod
def from_app_config(cls) -> ChannelService:
"""Create a ChannelService from the application config."""
from deerflow.config.app_config import get_app_config
config = get_app_config()
channels_config = {}
# extra fields are allowed by AppConfig (extra="allow")
extra = config.model_extra or {}
if "channels" in extra:
channels_config = extra["channels"]
return cls(channels_config=channels_config)
async def start(self) -> None:
"""Start the manager and all enabled channels."""
if self._running:
return
await self.manager.start()
for name, channel_config in self._config.items():
if not isinstance(channel_config, dict):
continue
if not channel_config.get("enabled", False):
logger.info("Channel %s is disabled, skipping", name)
continue
await self._start_channel(name, channel_config)
self._running = True
logger.info("ChannelService started with channels: %s", list(self._channels.keys()))
async def stop(self) -> None:
"""Stop all channels and the manager."""
for name, channel in list(self._channels.items()):
try:
await channel.stop()
logger.info("Channel %s stopped", name)
except Exception:
logger.exception("Error stopping channel %s", name)
self._channels.clear()
await self.manager.stop()
self._running = False
logger.info("ChannelService stopped")
async def restart_channel(self, name: str) -> bool:
"""Restart a specific channel. Returns True if successful."""
if name in self._channels:
try:
await self._channels[name].stop()
except Exception:
logger.exception("Error stopping channel %s for restart", name)
del self._channels[name]
config = self._config.get(name)
if not config or not isinstance(config, dict):
logger.warning("No config for channel %s", name)
return False
return await self._start_channel(name, config)
async def _start_channel(self, name: str, config: dict[str, Any]) -> bool:
"""Instantiate and start a single channel."""
import_path = _CHANNEL_REGISTRY.get(name)
if not import_path:
logger.warning("Unknown channel type: %s", name)
return False
try:
from deerflow.reflection import resolve_class
channel_cls = resolve_class(import_path, base_class=None)
except Exception:
logger.exception("Failed to import channel class for %s", name)
return False
try:
channel = channel_cls(bus=self.bus, config=config)
await channel.start()
self._channels[name] = channel
logger.info("Channel %s started", name)
return True
except Exception:
logger.exception("Failed to start channel %s", name)
return False
def get_status(self) -> dict[str, Any]:
"""Return status information for all channels."""
channels_status = {}
for name in _CHANNEL_REGISTRY:
config = self._config.get(name, {})
enabled = isinstance(config, dict) and config.get("enabled", False)
running = name in self._channels and self._channels[name].is_running
channels_status[name] = {
"enabled": enabled,
"running": running,
}
return {
"service_running": self._running,
"channels": channels_status,
}
# -- singleton access -------------------------------------------------------
_channel_service: ChannelService | None = None
def get_channel_service() -> ChannelService | None:
"""Get the singleton ChannelService instance (if started)."""
return _channel_service
async def start_channel_service() -> ChannelService:
"""Create and start the global ChannelService from app config."""
global _channel_service
if _channel_service is not None:
return _channel_service
_channel_service = ChannelService.from_app_config()
await _channel_service.start()
return _channel_service
async def stop_channel_service() -> None:
"""Stop the global ChannelService."""
global _channel_service
if _channel_service is not None:
await _channel_service.stop()
_channel_service = None

View File

@@ -0,0 +1,244 @@
"""Slack channel — connects via Socket Mode (no public IP needed)."""
from __future__ import annotations
import asyncio
import logging
from typing import Any
from markdown_to_mrkdwn import SlackMarkdownConverter
from app.channels.base import Channel
from app.channels.message_bus import InboundMessageType, MessageBus, OutboundMessage, ResolvedAttachment
logger = logging.getLogger(__name__)
_slack_md_converter = SlackMarkdownConverter()
class SlackChannel(Channel):
"""Slack IM channel using Socket Mode (WebSocket, no public IP).
Configuration keys (in ``config.yaml`` under ``channels.slack``):
- ``bot_token``: Slack Bot User OAuth Token (xoxb-...).
- ``app_token``: Slack App-Level Token (xapp-...) for Socket Mode.
- ``allowed_users``: (optional) List of allowed Slack user IDs. Empty = allow all.
"""
def __init__(self, bus: MessageBus, config: dict[str, Any]) -> None:
super().__init__(name="slack", bus=bus, config=config)
self._socket_client = None
self._web_client = None
self._loop: asyncio.AbstractEventLoop | None = None
self._allowed_users: set[str] = set(config.get("allowed_users", []))
async def start(self) -> None:
if self._running:
return
try:
from slack_sdk import WebClient
from slack_sdk.socket_mode import SocketModeClient
from slack_sdk.socket_mode.response import SocketModeResponse
except ImportError:
logger.error("slack-sdk is not installed. Install it with: uv add slack-sdk")
return
self._SocketModeResponse = SocketModeResponse
bot_token = self.config.get("bot_token", "")
app_token = self.config.get("app_token", "")
if not bot_token or not app_token:
logger.error("Slack channel requires bot_token and app_token")
return
self._web_client = WebClient(token=bot_token)
self._socket_client = SocketModeClient(
app_token=app_token,
web_client=self._web_client,
)
self._loop = asyncio.get_event_loop()
self._socket_client.socket_mode_request_listeners.append(self._on_socket_event)
self._running = True
self.bus.subscribe_outbound(self._on_outbound)
# Start socket mode in background thread
asyncio.get_event_loop().run_in_executor(None, self._socket_client.connect)
logger.info("Slack channel started")
async def stop(self) -> None:
self._running = False
self.bus.unsubscribe_outbound(self._on_outbound)
if self._socket_client:
self._socket_client.close()
self._socket_client = None
logger.info("Slack channel stopped")
async def send(self, msg: OutboundMessage, *, _max_retries: int = 3) -> None:
if not self._web_client:
return
kwargs: dict[str, Any] = {
"channel": msg.chat_id,
"text": _slack_md_converter.convert(msg.text),
}
if msg.thread_ts:
kwargs["thread_ts"] = msg.thread_ts
last_exc: Exception | None = None
for attempt in range(_max_retries):
try:
await asyncio.to_thread(self._web_client.chat_postMessage, **kwargs)
# Add a completion reaction to the thread root
if msg.thread_ts:
await asyncio.to_thread(
self._add_reaction,
msg.chat_id,
msg.thread_ts,
"white_check_mark",
)
return
except Exception as exc:
last_exc = exc
if attempt < _max_retries - 1:
delay = 2**attempt # 1s, 2s
logger.warning(
"[Slack] send failed (attempt %d/%d), retrying in %ds: %s",
attempt + 1,
_max_retries,
delay,
exc,
)
await asyncio.sleep(delay)
logger.error("[Slack] send failed after %d attempts: %s", _max_retries, last_exc)
# Add failure reaction on error
if msg.thread_ts:
try:
await asyncio.to_thread(
self._add_reaction,
msg.chat_id,
msg.thread_ts,
"x",
)
except Exception:
pass
raise last_exc # type: ignore[misc]
async def send_file(self, msg: OutboundMessage, attachment: ResolvedAttachment) -> bool:
if not self._web_client:
return False
try:
kwargs: dict[str, Any] = {
"channel": msg.chat_id,
"file": str(attachment.actual_path),
"filename": attachment.filename,
"title": attachment.filename,
}
if msg.thread_ts:
kwargs["thread_ts"] = msg.thread_ts
await asyncio.to_thread(self._web_client.files_upload_v2, **kwargs)
logger.info("[Slack] file uploaded: %s to channel=%s", attachment.filename, msg.chat_id)
return True
except Exception:
logger.exception("[Slack] failed to upload file: %s", attachment.filename)
return False
# -- internal ----------------------------------------------------------
def _add_reaction(self, channel_id: str, timestamp: str, emoji: str) -> None:
"""Add an emoji reaction to a message (best-effort, non-blocking)."""
if not self._web_client:
return
try:
self._web_client.reactions_add(
channel=channel_id,
timestamp=timestamp,
name=emoji,
)
except Exception as exc:
if "already_reacted" not in str(exc):
logger.warning("[Slack] failed to add reaction %s: %s", emoji, exc)
def _send_running_reply(self, channel_id: str, thread_ts: str) -> None:
"""Send a 'Working on it......' reply in the thread (called from SDK thread)."""
if not self._web_client:
return
try:
self._web_client.chat_postMessage(
channel=channel_id,
text=":hourglass_flowing_sand: Working on it...",
thread_ts=thread_ts,
)
logger.info("[Slack] 'Working on it...' reply sent in channel=%s, thread_ts=%s", channel_id, thread_ts)
except Exception:
logger.exception("[Slack] failed to send running reply in channel=%s", channel_id)
def _on_socket_event(self, client, req) -> None:
"""Called by slack-sdk for each Socket Mode event."""
try:
# Acknowledge the event
response = self._SocketModeResponse(envelope_id=req.envelope_id)
client.send_socket_mode_response(response)
event_type = req.type
if event_type != "events_api":
return
event = req.payload.get("event", {})
etype = event.get("type", "")
# Handle message events (DM or @mention)
if etype in ("message", "app_mention"):
self._handle_message_event(event)
except Exception:
logger.exception("Error processing Slack event")
def _handle_message_event(self, event: dict) -> None:
# Ignore bot messages
if event.get("bot_id") or event.get("subtype"):
return
user_id = event.get("user", "")
# Check allowed users
if self._allowed_users and user_id not in self._allowed_users:
logger.debug("Ignoring message from non-allowed user: %s", user_id)
return
text = event.get("text", "").strip()
if not text:
return
channel_id = event.get("channel", "")
thread_ts = event.get("thread_ts") or event.get("ts", "")
if text.startswith("/"):
msg_type = InboundMessageType.COMMAND
else:
msg_type = InboundMessageType.CHAT
# topic_id: use thread_ts as the topic identifier.
# For threaded messages, thread_ts is the root message ts (shared topic).
# For non-threaded messages, thread_ts is the message's own ts (new topic).
inbound = self._make_inbound(
chat_id=channel_id,
user_id=user_id,
text=text,
msg_type=msg_type,
thread_ts=thread_ts,
)
inbound.topic_id = thread_ts
if self._loop and self._loop.is_running():
# Acknowledge with an eyes reaction
self._add_reaction(channel_id, event.get("ts", thread_ts), "eyes")
# Send "running" reply first (fire-and-forget from SDK thread)
self._send_running_reply(channel_id, thread_ts)
asyncio.run_coroutine_threadsafe(self.bus.publish_inbound(inbound), self._loop)

View File

@@ -0,0 +1,153 @@
"""ChannelStore — persists IM chat-to-DeerFlow thread mappings."""
from __future__ import annotations
import json
import logging
import tempfile
import threading
import time
from pathlib import Path
from typing import Any
logger = logging.getLogger(__name__)
class ChannelStore:
"""JSON-file-backed store that maps IM conversations to DeerFlow threads.
Data layout (on disk)::
{
"<channel_name>:<chat_id>": {
"thread_id": "<uuid>",
"user_id": "<platform_user>",
"created_at": 1700000000.0,
"updated_at": 1700000000.0
},
...
}
The store is intentionally simple — a single JSON file that is atomically
rewritten on every mutation. For production workloads with high concurrency,
this can be swapped for a proper database backend.
"""
def __init__(self, path: str | Path | None = None) -> None:
if path is None:
from deerflow.config.paths import get_paths
path = Path(get_paths().base_dir) / "channels" / "store.json"
self._path = Path(path)
self._path.parent.mkdir(parents=True, exist_ok=True)
self._data: dict[str, dict[str, Any]] = self._load()
self._lock = threading.Lock()
# -- persistence -------------------------------------------------------
def _load(self) -> dict[str, dict[str, Any]]:
if self._path.exists():
try:
return json.loads(self._path.read_text(encoding="utf-8"))
except (json.JSONDecodeError, OSError):
logger.warning("Corrupt channel store at %s, starting fresh", self._path)
return {}
def _save(self) -> None:
fd = tempfile.NamedTemporaryFile(
mode="w",
dir=self._path.parent,
suffix=".tmp",
delete=False,
)
try:
json.dump(self._data, fd, indent=2)
fd.close()
Path(fd.name).replace(self._path)
except BaseException:
fd.close()
Path(fd.name).unlink(missing_ok=True)
raise
# -- key helpers -------------------------------------------------------
@staticmethod
def _key(channel_name: str, chat_id: str, topic_id: str | None = None) -> str:
if topic_id:
return f"{channel_name}:{chat_id}:{topic_id}"
return f"{channel_name}:{chat_id}"
# -- public API --------------------------------------------------------
def get_thread_id(self, channel_name: str, chat_id: str, topic_id: str | None = None) -> str | None:
"""Look up the DeerFlow thread_id for a given IM conversation/topic."""
entry = self._data.get(self._key(channel_name, chat_id, topic_id))
return entry["thread_id"] if entry else None
def set_thread_id(
self,
channel_name: str,
chat_id: str,
thread_id: str,
*,
topic_id: str | None = None,
user_id: str = "",
) -> None:
"""Create or update the mapping for an IM conversation/topic."""
with self._lock:
key = self._key(channel_name, chat_id, topic_id)
now = time.time()
existing = self._data.get(key)
self._data[key] = {
"thread_id": thread_id,
"user_id": user_id,
"created_at": existing["created_at"] if existing else now,
"updated_at": now,
}
self._save()
def remove(self, channel_name: str, chat_id: str, topic_id: str | None = None) -> bool:
"""Remove a mapping.
If ``topic_id`` is provided, only that specific conversation/topic mapping is removed.
If ``topic_id`` is omitted, all mappings whose key starts with
``"<channel_name>:<chat_id>"`` (including topic-specific ones) are removed.
Returns True if at least one mapping was removed.
"""
with self._lock:
# Remove a specific conversation/topic mapping.
if topic_id is not None:
key = self._key(channel_name, chat_id, topic_id)
if key in self._data:
del self._data[key]
self._save()
return True
return False
# Remove all mappings for this channel/chat_id (base and any topic-specific keys).
prefix = self._key(channel_name, chat_id)
keys_to_delete = [k for k in self._data if k == prefix or k.startswith(prefix + ":")]
if not keys_to_delete:
return False
for k in keys_to_delete:
del self._data[k]
self._save()
return True
def list_entries(self, channel_name: str | None = None) -> list[dict[str, Any]]:
"""List all stored mappings, optionally filtered by channel."""
results = []
for key, entry in self._data.items():
parts = key.split(":", 2)
ch = parts[0]
chat = parts[1] if len(parts) > 1 else ""
topic = parts[2] if len(parts) > 2 else None
if channel_name and ch != channel_name:
continue
item: dict[str, Any] = {"channel_name": ch, "chat_id": chat, **entry}
if topic is not None:
item["topic_id"] = topic
results.append(item)
return results

View File

@@ -0,0 +1,299 @@
"""Telegram channel — connects via long-polling (no public IP needed)."""
from __future__ import annotations
import asyncio
import logging
import threading
from typing import Any
from app.channels.base import Channel
from app.channels.message_bus import InboundMessageType, MessageBus, OutboundMessage, ResolvedAttachment
logger = logging.getLogger(__name__)
class TelegramChannel(Channel):
"""Telegram bot channel using long-polling.
Configuration keys (in ``config.yaml`` under ``channels.telegram``):
- ``bot_token``: Telegram Bot API token (from @BotFather).
- ``allowed_users``: (optional) List of allowed Telegram user IDs. Empty = allow all.
"""
def __init__(self, bus: MessageBus, config: dict[str, Any]) -> None:
super().__init__(name="telegram", bus=bus, config=config)
self._application = None
self._thread: threading.Thread | None = None
self._tg_loop: asyncio.AbstractEventLoop | None = None
self._main_loop: asyncio.AbstractEventLoop | None = None
self._allowed_users: set[int] = set()
for uid in config.get("allowed_users", []):
try:
self._allowed_users.add(int(uid))
except (ValueError, TypeError):
pass
# chat_id -> last sent message_id for threaded replies
self._last_bot_message: dict[str, int] = {}
async def start(self) -> None:
if self._running:
return
try:
from telegram.ext import ApplicationBuilder, CommandHandler, MessageHandler, filters
except ImportError:
logger.error("python-telegram-bot is not installed. Install it with: uv add python-telegram-bot")
return
bot_token = self.config.get("bot_token", "")
if not bot_token:
logger.error("Telegram channel requires bot_token")
return
self._main_loop = asyncio.get_event_loop()
self._running = True
self.bus.subscribe_outbound(self._on_outbound)
# Build the application
app = ApplicationBuilder().token(bot_token).build()
# Command handlers
app.add_handler(CommandHandler("start", self._cmd_start))
app.add_handler(CommandHandler("new", self._cmd_generic))
app.add_handler(CommandHandler("status", self._cmd_generic))
app.add_handler(CommandHandler("models", self._cmd_generic))
app.add_handler(CommandHandler("memory", self._cmd_generic))
app.add_handler(CommandHandler("help", self._cmd_generic))
# General message handler
app.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, self._on_text))
self._application = app
# Run polling in a dedicated thread with its own event loop
self._thread = threading.Thread(target=self._run_polling, daemon=True)
self._thread.start()
logger.info("Telegram channel started")
async def stop(self) -> None:
self._running = False
self.bus.unsubscribe_outbound(self._on_outbound)
if self._tg_loop and self._tg_loop.is_running():
self._tg_loop.call_soon_threadsafe(self._tg_loop.stop)
if self._thread:
self._thread.join(timeout=10)
self._thread = None
self._application = None
logger.info("Telegram channel stopped")
async def send(self, msg: OutboundMessage, *, _max_retries: int = 3) -> None:
if not self._application:
return
try:
chat_id = int(msg.chat_id)
except (ValueError, TypeError):
logger.error("Invalid Telegram chat_id: %s", msg.chat_id)
return
kwargs: dict[str, Any] = {"chat_id": chat_id, "text": msg.text}
# Reply to the last bot message in this chat for threading
reply_to = self._last_bot_message.get(msg.chat_id)
if reply_to:
kwargs["reply_to_message_id"] = reply_to
bot = self._application.bot
last_exc: Exception | None = None
for attempt in range(_max_retries):
try:
sent = await bot.send_message(**kwargs)
self._last_bot_message[msg.chat_id] = sent.message_id
return
except Exception as exc:
last_exc = exc
if attempt < _max_retries - 1:
delay = 2**attempt # 1s, 2s
logger.warning(
"[Telegram] send failed (attempt %d/%d), retrying in %ds: %s",
attempt + 1,
_max_retries,
delay,
exc,
)
await asyncio.sleep(delay)
logger.error("[Telegram] send failed after %d attempts: %s", _max_retries, last_exc)
raise last_exc # type: ignore[misc]
async def send_file(self, msg: OutboundMessage, attachment: ResolvedAttachment) -> bool:
if not self._application:
return False
try:
chat_id = int(msg.chat_id)
except (ValueError, TypeError):
logger.error("[Telegram] Invalid chat_id: %s", msg.chat_id)
return False
# Telegram limits: 10MB for photos, 50MB for documents
if attachment.size > 50 * 1024 * 1024:
logger.warning("[Telegram] file too large (%d bytes), skipping: %s", attachment.size, attachment.filename)
return False
bot = self._application.bot
reply_to = self._last_bot_message.get(msg.chat_id)
try:
if attachment.is_image and attachment.size <= 10 * 1024 * 1024:
with open(attachment.actual_path, "rb") as f:
kwargs: dict[str, Any] = {"chat_id": chat_id, "photo": f}
if reply_to:
kwargs["reply_to_message_id"] = reply_to
sent = await bot.send_photo(**kwargs)
else:
from telegram import InputFile
with open(attachment.actual_path, "rb") as f:
input_file = InputFile(f, filename=attachment.filename)
kwargs = {"chat_id": chat_id, "document": input_file}
if reply_to:
kwargs["reply_to_message_id"] = reply_to
sent = await bot.send_document(**kwargs)
self._last_bot_message[msg.chat_id] = sent.message_id
logger.info("[Telegram] file sent: %s to chat=%s", attachment.filename, msg.chat_id)
return True
except Exception:
logger.exception("[Telegram] failed to send file: %s", attachment.filename)
return False
# -- helpers -----------------------------------------------------------
async def _send_running_reply(self, chat_id: str, reply_to_message_id: int) -> None:
"""Send a 'Working on it...' reply to the user's message."""
if not self._application:
return
try:
bot = self._application.bot
await bot.send_message(
chat_id=int(chat_id),
text="Working on it...",
reply_to_message_id=reply_to_message_id,
)
logger.info("[Telegram] 'Working on it...' reply sent in chat=%s", chat_id)
except Exception:
logger.exception("[Telegram] failed to send running reply in chat=%s", chat_id)
# -- internal ----------------------------------------------------------
def _run_polling(self) -> None:
"""Run telegram polling in a dedicated thread."""
self._tg_loop = asyncio.new_event_loop()
asyncio.set_event_loop(self._tg_loop)
try:
# Cannot use run_polling() because it calls add_signal_handler(),
# which only works in the main thread. Instead, manually
# initialize the application and start the updater.
self._tg_loop.run_until_complete(self._application.initialize())
self._tg_loop.run_until_complete(self._application.start())
self._tg_loop.run_until_complete(self._application.updater.start_polling())
self._tg_loop.run_forever()
except Exception:
if self._running:
logger.exception("Telegram polling error")
finally:
# Graceful shutdown
try:
if self._application.updater.running:
self._tg_loop.run_until_complete(self._application.updater.stop())
self._tg_loop.run_until_complete(self._application.stop())
self._tg_loop.run_until_complete(self._application.shutdown())
except Exception:
logger.exception("Error during Telegram shutdown")
def _check_user(self, user_id: int) -> bool:
if not self._allowed_users:
return True
return user_id in self._allowed_users
async def _cmd_start(self, update, context) -> None:
"""Handle /start command."""
if not self._check_user(update.effective_user.id):
return
await update.message.reply_text("Welcome to DeerFlow! Send me a message to start a conversation.\nType /help for available commands.")
async def _cmd_generic(self, update, context) -> None:
"""Forward slash commands to the channel manager."""
if not self._check_user(update.effective_user.id):
return
text = update.message.text
chat_id = str(update.effective_chat.id)
user_id = str(update.effective_user.id)
msg_id = str(update.message.message_id)
# Use the same topic_id logic as _on_text so that commands
# like /new target the correct thread mapping.
if update.effective_chat.type == "private":
topic_id = None
else:
reply_to = update.message.reply_to_message
if reply_to:
topic_id = str(reply_to.message_id)
else:
topic_id = msg_id
inbound = self._make_inbound(
chat_id=chat_id,
user_id=user_id,
text=text,
msg_type=InboundMessageType.COMMAND,
thread_ts=msg_id,
)
inbound.topic_id = topic_id
if self._main_loop and self._main_loop.is_running():
asyncio.run_coroutine_threadsafe(self._send_running_reply(chat_id, update.message.message_id), self._main_loop)
asyncio.run_coroutine_threadsafe(self.bus.publish_inbound(inbound), self._main_loop)
async def _on_text(self, update, context) -> None:
"""Handle regular text messages."""
if not self._check_user(update.effective_user.id):
return
text = update.message.text.strip()
if not text:
return
chat_id = str(update.effective_chat.id)
user_id = str(update.effective_user.id)
msg_id = str(update.message.message_id)
# topic_id determines which DeerFlow thread the message maps to.
# In private chats, use None so that all messages share a single
# thread (the store key becomes "channel:chat_id").
# In group chats, use the reply-to message id or the current
# message id to keep separate conversation threads.
if update.effective_chat.type == "private":
topic_id = None
else:
reply_to = update.message.reply_to_message
if reply_to:
topic_id = str(reply_to.message_id)
else:
topic_id = msg_id
inbound = self._make_inbound(
chat_id=chat_id,
user_id=user_id,
text=text,
msg_type=InboundMessageType.CHAT,
thread_ts=msg_id,
)
inbound.topic_id = topic_id
if self._main_loop and self._main_loop.is_running():
asyncio.run_coroutine_threadsafe(self._send_running_reply(chat_id, update.message.message_id), self._main_loop)
asyncio.run_coroutine_threadsafe(self.bus.publish_inbound(inbound), self._main_loop)

View File

@@ -0,0 +1,4 @@
from .app import app, create_app
from .config import GatewayConfig, get_gateway_config
__all__ = ["app", "create_app", "GatewayConfig", "get_gateway_config"]

192
backend/app/gateway/app.py Normal file
View File

@@ -0,0 +1,192 @@
import logging
from collections.abc import AsyncGenerator
from contextlib import asynccontextmanager
from fastapi import FastAPI
from app.gateway.config import get_gateway_config
from app.gateway.routers import (
agents,
artifacts,
channels,
mcp,
memory,
models,
skills,
suggestions,
uploads,
)
from deerflow.config.app_config import get_app_config
# Configure logging
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
)
logger = logging.getLogger(__name__)
@asynccontextmanager
async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
"""Application lifespan handler."""
# Load config and check necessary environment variables at startup
try:
get_app_config()
logger.info("Configuration loaded successfully")
except Exception as e:
error_msg = f"Failed to load configuration during gateway startup: {e}"
logger.exception(error_msg)
raise RuntimeError(error_msg) from e
config = get_gateway_config()
logger.info(f"Starting API Gateway on {config.host}:{config.port}")
# NOTE: MCP tools initialization is NOT done here because:
# 1. Gateway doesn't use MCP tools - they are used by Agents in the LangGraph Server
# 2. Gateway and LangGraph Server are separate processes with independent caches
# MCP tools are lazily initialized in LangGraph Server when first needed
# Start IM channel service if any channels are configured
try:
from app.channels.service import start_channel_service
channel_service = await start_channel_service()
logger.info("Channel service started: %s", channel_service.get_status())
except Exception:
logger.exception("No IM channels configured or channel service failed to start")
yield
# Stop channel service on shutdown
try:
from app.channels.service import stop_channel_service
await stop_channel_service()
except Exception:
logger.exception("Failed to stop channel service")
logger.info("Shutting down API Gateway")
def create_app() -> FastAPI:
"""Create and configure the FastAPI application.
Returns:
Configured FastAPI application instance.
"""
app = FastAPI(
title="DeerFlow API Gateway",
description="""
## DeerFlow API Gateway
API Gateway for DeerFlow - A LangGraph-based AI agent backend with sandbox execution capabilities.
### Features
- **Models Management**: Query and retrieve available AI models
- **MCP Configuration**: Manage Model Context Protocol (MCP) server configurations
- **Memory Management**: Access and manage global memory data for personalized conversations
- **Skills Management**: Query and manage skills and their enabled status
- **Artifacts**: Access thread artifacts and generated files
- **Health Monitoring**: System health check endpoints
### Architecture
LangGraph requests are handled by nginx reverse proxy.
This gateway provides custom endpoints for models, MCP configuration, skills, and artifacts.
""",
version="0.1.0",
lifespan=lifespan,
docs_url="/docs",
redoc_url="/redoc",
openapi_url="/openapi.json",
openapi_tags=[
{
"name": "models",
"description": "Operations for querying available AI models and their configurations",
},
{
"name": "mcp",
"description": "Manage Model Context Protocol (MCP) server configurations",
},
{
"name": "memory",
"description": "Access and manage global memory data for personalized conversations",
},
{
"name": "skills",
"description": "Manage skills and their configurations",
},
{
"name": "artifacts",
"description": "Access and download thread artifacts and generated files",
},
{
"name": "uploads",
"description": "Upload and manage user files for threads",
},
{
"name": "agents",
"description": "Create and manage custom agents with per-agent config and prompts",
},
{
"name": "suggestions",
"description": "Generate follow-up question suggestions for conversations",
},
{
"name": "channels",
"description": "Manage IM channel integrations (Feishu, Slack, Telegram)",
},
{
"name": "health",
"description": "Health check and system status endpoints",
},
],
)
# CORS is handled by nginx - no need for FastAPI middleware
# Include routers
# Models API is mounted at /api/models
app.include_router(models.router)
# MCP API is mounted at /api/mcp
app.include_router(mcp.router)
# Memory API is mounted at /api/memory
app.include_router(memory.router)
# Skills API is mounted at /api/skills
app.include_router(skills.router)
# Artifacts API is mounted at /api/threads/{thread_id}/artifacts
app.include_router(artifacts.router)
# Uploads API is mounted at /api/threads/{thread_id}/uploads
app.include_router(uploads.router)
# Agents API is mounted at /api/agents
app.include_router(agents.router)
# Suggestions API is mounted at /api/threads/{thread_id}/suggestions
app.include_router(suggestions.router)
# Channels API is mounted at /api/channels
app.include_router(channels.router)
@app.get("/health", tags=["health"])
async def health_check() -> dict:
"""Health check endpoint.
Returns:
Service health status information.
"""
return {"status": "healthy", "service": "deer-flow-gateway"}
return app
# Create app instance for uvicorn
app = create_app()

View File

@@ -0,0 +1,27 @@
import os
from pydantic import BaseModel, Field
class GatewayConfig(BaseModel):
"""Configuration for the API Gateway."""
host: str = Field(default="0.0.0.0", description="Host to bind the gateway server")
port: int = Field(default=8001, description="Port to bind the gateway server")
cors_origins: list[str] = Field(default_factory=lambda: ["http://localhost:3000"], description="Allowed CORS origins")
_gateway_config: GatewayConfig | None = None
def get_gateway_config() -> GatewayConfig:
"""Get gateway config, loading from environment if available."""
global _gateway_config
if _gateway_config is None:
cors_origins_str = os.getenv("CORS_ORIGINS", "http://localhost:3000")
_gateway_config = GatewayConfig(
host=os.getenv("GATEWAY_HOST", "0.0.0.0"),
port=int(os.getenv("GATEWAY_PORT", "8001")),
cors_origins=cors_origins_str.split(","),
)
return _gateway_config

View File

@@ -0,0 +1,28 @@
"""Shared path resolution for thread virtual paths (e.g. mnt/user-data/outputs/...)."""
from pathlib import Path
from fastapi import HTTPException
from deerflow.config.paths import get_paths
def resolve_thread_virtual_path(thread_id: str, virtual_path: str) -> Path:
"""Resolve a virtual path to the actual filesystem path under thread user-data.
Args:
thread_id: The thread ID.
virtual_path: The virtual path as seen inside the sandbox
(e.g., /mnt/user-data/outputs/file.txt).
Returns:
The resolved filesystem path.
Raises:
HTTPException: If the path is invalid or outside allowed directories.
"""
try:
return get_paths().resolve_virtual_path(thread_id, virtual_path)
except ValueError as e:
status = 403 if "traversal" in str(e) else 400
raise HTTPException(status_code=status, detail=str(e))

View File

@@ -0,0 +1,3 @@
from . import artifacts, mcp, models, skills, suggestions, uploads
__all__ = ["artifacts", "mcp", "models", "skills", "suggestions", "uploads"]

View File

@@ -0,0 +1,383 @@
"""CRUD API for custom agents."""
import logging
import re
import shutil
import yaml
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel, Field
from deerflow.config.agents_config import AgentConfig, list_custom_agents, load_agent_config, load_agent_soul
from deerflow.config.paths import get_paths
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/api", tags=["agents"])
AGENT_NAME_PATTERN = re.compile(r"^[A-Za-z0-9-]+$")
class AgentResponse(BaseModel):
"""Response model for a custom agent."""
name: str = Field(..., description="Agent name (hyphen-case)")
description: str = Field(default="", description="Agent description")
model: str | None = Field(default=None, description="Optional model override")
tool_groups: list[str] | None = Field(default=None, description="Optional tool group whitelist")
soul: str | None = Field(default=None, description="SOUL.md content (included on GET /{name})")
class AgentsListResponse(BaseModel):
"""Response model for listing all custom agents."""
agents: list[AgentResponse]
class AgentCreateRequest(BaseModel):
"""Request body for creating a custom agent."""
name: str = Field(..., description="Agent name (must match ^[A-Za-z0-9-]+$, stored as lowercase)")
description: str = Field(default="", description="Agent description")
model: str | None = Field(default=None, description="Optional model override")
tool_groups: list[str] | None = Field(default=None, description="Optional tool group whitelist")
soul: str = Field(default="", description="SOUL.md content — agent personality and behavioral guardrails")
class AgentUpdateRequest(BaseModel):
"""Request body for updating a custom agent."""
description: str | None = Field(default=None, description="Updated description")
model: str | None = Field(default=None, description="Updated model override")
tool_groups: list[str] | None = Field(default=None, description="Updated tool group whitelist")
soul: str | None = Field(default=None, description="Updated SOUL.md content")
def _validate_agent_name(name: str) -> None:
"""Validate agent name against allowed pattern.
Args:
name: The agent name to validate.
Raises:
HTTPException: 422 if the name is invalid.
"""
if not AGENT_NAME_PATTERN.match(name):
raise HTTPException(
status_code=422,
detail=f"Invalid agent name '{name}'. Must match ^[A-Za-z0-9-]+$ (letters, digits, and hyphens only).",
)
def _normalize_agent_name(name: str) -> str:
"""Normalize agent name to lowercase for filesystem storage."""
return name.lower()
def _agent_config_to_response(agent_cfg: AgentConfig, include_soul: bool = False) -> AgentResponse:
"""Convert AgentConfig to AgentResponse."""
soul: str | None = None
if include_soul:
soul = load_agent_soul(agent_cfg.name) or ""
return AgentResponse(
name=agent_cfg.name,
description=agent_cfg.description,
model=agent_cfg.model,
tool_groups=agent_cfg.tool_groups,
soul=soul,
)
@router.get(
"/agents",
response_model=AgentsListResponse,
summary="List Custom Agents",
description="List all custom agents available in the agents directory.",
)
async def list_agents() -> AgentsListResponse:
"""List all custom agents.
Returns:
List of all custom agents with their metadata (without soul content).
"""
try:
agents = list_custom_agents()
return AgentsListResponse(agents=[_agent_config_to_response(a) for a in agents])
except Exception as e:
logger.error(f"Failed to list agents: {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Failed to list agents: {str(e)}")
@router.get(
"/agents/check",
summary="Check Agent Name",
description="Validate an agent name and check if it is available (case-insensitive).",
)
async def check_agent_name(name: str) -> dict:
"""Check whether an agent name is valid and not yet taken.
Args:
name: The agent name to check.
Returns:
``{"available": true/false, "name": "<normalized>"}``
Raises:
HTTPException: 422 if the name is invalid.
"""
_validate_agent_name(name)
normalized = _normalize_agent_name(name)
available = not get_paths().agent_dir(normalized).exists()
return {"available": available, "name": normalized}
@router.get(
"/agents/{name}",
response_model=AgentResponse,
summary="Get Custom Agent",
description="Retrieve details and SOUL.md content for a specific custom agent.",
)
async def get_agent(name: str) -> AgentResponse:
"""Get a specific custom agent by name.
Args:
name: The agent name.
Returns:
Agent details including SOUL.md content.
Raises:
HTTPException: 404 if agent not found.
"""
_validate_agent_name(name)
name = _normalize_agent_name(name)
try:
agent_cfg = load_agent_config(name)
return _agent_config_to_response(agent_cfg, include_soul=True)
except FileNotFoundError:
raise HTTPException(status_code=404, detail=f"Agent '{name}' not found")
except Exception as e:
logger.error(f"Failed to get agent '{name}': {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Failed to get agent: {str(e)}")
@router.post(
"/agents",
response_model=AgentResponse,
status_code=201,
summary="Create Custom Agent",
description="Create a new custom agent with its config and SOUL.md.",
)
async def create_agent_endpoint(request: AgentCreateRequest) -> AgentResponse:
"""Create a new custom agent.
Args:
request: The agent creation request.
Returns:
The created agent details.
Raises:
HTTPException: 409 if agent already exists, 422 if name is invalid.
"""
_validate_agent_name(request.name)
normalized_name = _normalize_agent_name(request.name)
agent_dir = get_paths().agent_dir(normalized_name)
if agent_dir.exists():
raise HTTPException(status_code=409, detail=f"Agent '{normalized_name}' already exists")
try:
agent_dir.mkdir(parents=True, exist_ok=True)
# Write config.yaml
config_data: dict = {"name": normalized_name}
if request.description:
config_data["description"] = request.description
if request.model is not None:
config_data["model"] = request.model
if request.tool_groups is not None:
config_data["tool_groups"] = request.tool_groups
config_file = agent_dir / "config.yaml"
with open(config_file, "w", encoding="utf-8") as f:
yaml.dump(config_data, f, default_flow_style=False, allow_unicode=True)
# Write SOUL.md
soul_file = agent_dir / "SOUL.md"
soul_file.write_text(request.soul, encoding="utf-8")
logger.info(f"Created agent '{normalized_name}' at {agent_dir}")
agent_cfg = load_agent_config(normalized_name)
return _agent_config_to_response(agent_cfg, include_soul=True)
except HTTPException:
raise
except Exception as e:
# Clean up on failure
if agent_dir.exists():
shutil.rmtree(agent_dir)
logger.error(f"Failed to create agent '{request.name}': {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Failed to create agent: {str(e)}")
@router.put(
"/agents/{name}",
response_model=AgentResponse,
summary="Update Custom Agent",
description="Update an existing custom agent's config and/or SOUL.md.",
)
async def update_agent(name: str, request: AgentUpdateRequest) -> AgentResponse:
"""Update an existing custom agent.
Args:
name: The agent name.
request: The update request (all fields optional).
Returns:
The updated agent details.
Raises:
HTTPException: 404 if agent not found.
"""
_validate_agent_name(name)
name = _normalize_agent_name(name)
try:
agent_cfg = load_agent_config(name)
except FileNotFoundError:
raise HTTPException(status_code=404, detail=f"Agent '{name}' not found")
agent_dir = get_paths().agent_dir(name)
try:
# Update config if any config fields changed
config_changed = any(v is not None for v in [request.description, request.model, request.tool_groups])
if config_changed:
updated: dict = {
"name": agent_cfg.name,
"description": request.description if request.description is not None else agent_cfg.description,
}
new_model = request.model if request.model is not None else agent_cfg.model
if new_model is not None:
updated["model"] = new_model
new_tool_groups = request.tool_groups if request.tool_groups is not None else agent_cfg.tool_groups
if new_tool_groups is not None:
updated["tool_groups"] = new_tool_groups
config_file = agent_dir / "config.yaml"
with open(config_file, "w", encoding="utf-8") as f:
yaml.dump(updated, f, default_flow_style=False, allow_unicode=True)
# Update SOUL.md if provided
if request.soul is not None:
soul_path = agent_dir / "SOUL.md"
soul_path.write_text(request.soul, encoding="utf-8")
logger.info(f"Updated agent '{name}'")
refreshed_cfg = load_agent_config(name)
return _agent_config_to_response(refreshed_cfg, include_soul=True)
except HTTPException:
raise
except Exception as e:
logger.error(f"Failed to update agent '{name}': {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Failed to update agent: {str(e)}")
class UserProfileResponse(BaseModel):
"""Response model for the global user profile (USER.md)."""
content: str | None = Field(default=None, description="USER.md content, or null if not yet created")
class UserProfileUpdateRequest(BaseModel):
"""Request body for setting the global user profile."""
content: str = Field(default="", description="USER.md content — describes the user's background and preferences")
@router.get(
"/user-profile",
response_model=UserProfileResponse,
summary="Get User Profile",
description="Read the global USER.md file that is injected into all custom agents.",
)
async def get_user_profile() -> UserProfileResponse:
"""Return the current USER.md content.
Returns:
UserProfileResponse with content=None if USER.md does not exist yet.
"""
try:
user_md_path = get_paths().user_md_file
if not user_md_path.exists():
return UserProfileResponse(content=None)
raw = user_md_path.read_text(encoding="utf-8").strip()
return UserProfileResponse(content=raw or None)
except Exception as e:
logger.error(f"Failed to read user profile: {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Failed to read user profile: {str(e)}")
@router.put(
"/user-profile",
response_model=UserProfileResponse,
summary="Update User Profile",
description="Write the global USER.md file that is injected into all custom agents.",
)
async def update_user_profile(request: UserProfileUpdateRequest) -> UserProfileResponse:
"""Create or overwrite the global USER.md.
Args:
request: The update request with the new USER.md content.
Returns:
UserProfileResponse with the saved content.
"""
try:
paths = get_paths()
paths.base_dir.mkdir(parents=True, exist_ok=True)
paths.user_md_file.write_text(request.content, encoding="utf-8")
logger.info(f"Updated USER.md at {paths.user_md_file}")
return UserProfileResponse(content=request.content or None)
except Exception as e:
logger.error(f"Failed to update user profile: {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Failed to update user profile: {str(e)}")
@router.delete(
"/agents/{name}",
status_code=204,
summary="Delete Custom Agent",
description="Delete a custom agent and all its files (config, SOUL.md, memory).",
)
async def delete_agent(name: str) -> None:
"""Delete a custom agent.
Args:
name: The agent name.
Raises:
HTTPException: 404 if agent not found.
"""
_validate_agent_name(name)
name = _normalize_agent_name(name)
agent_dir = get_paths().agent_dir(name)
if not agent_dir.exists():
raise HTTPException(status_code=404, detail=f"Agent '{name}' not found")
try:
shutil.rmtree(agent_dir)
logger.info(f"Deleted agent '{name}' from {agent_dir}")
except Exception as e:
logger.error(f"Failed to delete agent '{name}': {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Failed to delete agent: {str(e)}")

View File

@@ -0,0 +1,158 @@
import logging
import mimetypes
import zipfile
from pathlib import Path
from urllib.parse import quote
from fastapi import APIRouter, HTTPException, Request
from fastapi.responses import FileResponse, HTMLResponse, PlainTextResponse, Response
from app.gateway.path_utils import resolve_thread_virtual_path
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/api", tags=["artifacts"])
def is_text_file_by_content(path: Path, sample_size: int = 8192) -> bool:
"""Check if file is text by examining content for null bytes."""
try:
with open(path, "rb") as f:
chunk = f.read(sample_size)
# Text files shouldn't contain null bytes
return b"\x00" not in chunk
except Exception:
return False
def _extract_file_from_skill_archive(zip_path: Path, internal_path: str) -> bytes | None:
"""Extract a file from a .skill ZIP archive.
Args:
zip_path: Path to the .skill file (ZIP archive).
internal_path: Path to the file inside the archive (e.g., "SKILL.md").
Returns:
The file content as bytes, or None if not found.
"""
if not zipfile.is_zipfile(zip_path):
return None
try:
with zipfile.ZipFile(zip_path, "r") as zip_ref:
# List all files in the archive
namelist = zip_ref.namelist()
# Try direct path first
if internal_path in namelist:
return zip_ref.read(internal_path)
# Try with any top-level directory prefix (e.g., "skill-name/SKILL.md")
for name in namelist:
if name.endswith("/" + internal_path) or name == internal_path:
return zip_ref.read(name)
# Not found
return None
except (zipfile.BadZipFile, KeyError):
return None
@router.get(
"/threads/{thread_id}/artifacts/{path:path}",
summary="Get Artifact File",
description="Retrieve an artifact file generated by the AI agent. Supports text, HTML, and binary files.",
)
async def get_artifact(thread_id: str, path: str, request: Request) -> FileResponse:
"""Get an artifact file by its path.
The endpoint automatically detects file types and returns appropriate content types.
Use the `?download=true` query parameter to force file download.
Args:
thread_id: The thread ID.
path: The artifact path with virtual prefix (e.g., mnt/user-data/outputs/file.txt).
request: FastAPI request object (automatically injected).
Returns:
The file content as a FileResponse with appropriate content type:
- HTML files: Rendered as HTML
- Text files: Plain text with proper MIME type
- Binary files: Inline display with download option
Raises:
HTTPException:
- 400 if path is invalid or not a file
- 403 if access denied (path traversal detected)
- 404 if file not found
Query Parameters:
download (bool): If true, returns file as attachment for download
Example:
- Get HTML file: `/api/threads/abc123/artifacts/mnt/user-data/outputs/index.html`
- Download file: `/api/threads/abc123/artifacts/mnt/user-data/outputs/data.csv?download=true`
"""
# Check if this is a request for a file inside a .skill archive (e.g., xxx.skill/SKILL.md)
if ".skill/" in path:
# Split the path at ".skill/" to get the ZIP file path and internal path
skill_marker = ".skill/"
marker_pos = path.find(skill_marker)
skill_file_path = path[: marker_pos + len(".skill")] # e.g., "mnt/user-data/outputs/my-skill.skill"
internal_path = path[marker_pos + len(skill_marker) :] # e.g., "SKILL.md"
actual_skill_path = resolve_thread_virtual_path(thread_id, skill_file_path)
if not actual_skill_path.exists():
raise HTTPException(status_code=404, detail=f"Skill file not found: {skill_file_path}")
if not actual_skill_path.is_file():
raise HTTPException(status_code=400, detail=f"Path is not a file: {skill_file_path}")
# Extract the file from the .skill archive
content = _extract_file_from_skill_archive(actual_skill_path, internal_path)
if content is None:
raise HTTPException(status_code=404, detail=f"File '{internal_path}' not found in skill archive")
# Determine MIME type based on the internal file
mime_type, _ = mimetypes.guess_type(internal_path)
# Add cache headers to avoid repeated ZIP extraction (cache for 5 minutes)
cache_headers = {"Cache-Control": "private, max-age=300"}
if mime_type and mime_type.startswith("text/"):
return PlainTextResponse(content=content.decode("utf-8"), media_type=mime_type, headers=cache_headers)
# Default to plain text for unknown types that look like text
try:
return PlainTextResponse(content=content.decode("utf-8"), media_type="text/plain", headers=cache_headers)
except UnicodeDecodeError:
return Response(content=content, media_type=mime_type or "application/octet-stream", headers=cache_headers)
actual_path = resolve_thread_virtual_path(thread_id, path)
logger.info(f"Resolving artifact path: thread_id={thread_id}, requested_path={path}, actual_path={actual_path}")
if not actual_path.exists():
raise HTTPException(status_code=404, detail=f"Artifact not found: {path}")
if not actual_path.is_file():
raise HTTPException(status_code=400, detail=f"Path is not a file: {path}")
mime_type, _ = mimetypes.guess_type(actual_path)
# Encode filename for Content-Disposition header (RFC 5987)
encoded_filename = quote(actual_path.name)
# if `download` query parameter is true, return the file as a download
if request.query_params.get("download"):
return FileResponse(path=actual_path, filename=actual_path.name, media_type=mime_type, headers={"Content-Disposition": f"attachment; filename*=UTF-8''{encoded_filename}"})
if mime_type and mime_type == "text/html":
return HTMLResponse(content=actual_path.read_text())
if mime_type and mime_type.startswith("text/"):
return PlainTextResponse(content=actual_path.read_text(), media_type=mime_type)
if is_text_file_by_content(actual_path):
return PlainTextResponse(content=actual_path.read_text(), media_type=mime_type)
return Response(content=actual_path.read_bytes(), media_type=mime_type, headers={"Content-Disposition": f"inline; filename*=UTF-8''{encoded_filename}"})

View File

@@ -0,0 +1,52 @@
"""Gateway router for IM channel management."""
from __future__ import annotations
import logging
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/api/channels", tags=["channels"])
class ChannelStatusResponse(BaseModel):
service_running: bool
channels: dict[str, dict]
class ChannelRestartResponse(BaseModel):
success: bool
message: str
@router.get("/", response_model=ChannelStatusResponse)
async def get_channels_status() -> ChannelStatusResponse:
"""Get the status of all IM channels."""
from app.channels.service import get_channel_service
service = get_channel_service()
if service is None:
return ChannelStatusResponse(service_running=False, channels={})
status = service.get_status()
return ChannelStatusResponse(**status)
@router.post("/{name}/restart", response_model=ChannelRestartResponse)
async def restart_channel(name: str) -> ChannelRestartResponse:
"""Restart a specific IM channel."""
from app.channels.service import get_channel_service
service = get_channel_service()
if service is None:
raise HTTPException(status_code=503, detail="Channel service is not running")
success = await service.restart_channel(name)
if success:
logger.info("Channel %s restarted successfully", name)
return ChannelRestartResponse(success=True, message=f"Channel {name} restarted successfully")
else:
logger.warning("Failed to restart channel %s", name)
return ChannelRestartResponse(success=False, message=f"Failed to restart channel {name}")

View File

@@ -0,0 +1,169 @@
import json
import logging
from pathlib import Path
from typing import Literal
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel, Field
from deerflow.config.extensions_config import ExtensionsConfig, get_extensions_config, reload_extensions_config
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/api", tags=["mcp"])
class McpOAuthConfigResponse(BaseModel):
"""OAuth configuration for an MCP server."""
enabled: bool = Field(default=True, description="Whether OAuth token injection is enabled")
token_url: str = Field(default="", description="OAuth token endpoint URL")
grant_type: Literal["client_credentials", "refresh_token"] = Field(default="client_credentials", description="OAuth grant type")
client_id: str | None = Field(default=None, description="OAuth client ID")
client_secret: str | None = Field(default=None, description="OAuth client secret")
refresh_token: str | None = Field(default=None, description="OAuth refresh token")
scope: str | None = Field(default=None, description="OAuth scope")
audience: str | None = Field(default=None, description="OAuth audience")
token_field: str = Field(default="access_token", description="Token response field containing access token")
token_type_field: str = Field(default="token_type", description="Token response field containing token type")
expires_in_field: str = Field(default="expires_in", description="Token response field containing expires-in seconds")
default_token_type: str = Field(default="Bearer", description="Default token type when response omits token_type")
refresh_skew_seconds: int = Field(default=60, description="Refresh this many seconds before expiry")
extra_token_params: dict[str, str] = Field(default_factory=dict, description="Additional form params sent to token endpoint")
class McpServerConfigResponse(BaseModel):
"""Response model for MCP server configuration."""
enabled: bool = Field(default=True, description="Whether this MCP server is enabled")
type: str = Field(default="stdio", description="Transport type: 'stdio', 'sse', or 'http'")
command: str | None = Field(default=None, description="Command to execute to start the MCP server (for stdio type)")
args: list[str] = Field(default_factory=list, description="Arguments to pass to the command (for stdio type)")
env: dict[str, str] = Field(default_factory=dict, description="Environment variables for the MCP server")
url: str | None = Field(default=None, description="URL of the MCP server (for sse or http type)")
headers: dict[str, str] = Field(default_factory=dict, description="HTTP headers to send (for sse or http type)")
oauth: McpOAuthConfigResponse | None = Field(default=None, description="OAuth configuration for MCP HTTP/SSE servers")
description: str = Field(default="", description="Human-readable description of what this MCP server provides")
class McpConfigResponse(BaseModel):
"""Response model for MCP configuration."""
mcp_servers: dict[str, McpServerConfigResponse] = Field(
default_factory=dict,
description="Map of MCP server name to configuration",
)
class McpConfigUpdateRequest(BaseModel):
"""Request model for updating MCP configuration."""
mcp_servers: dict[str, McpServerConfigResponse] = Field(
...,
description="Map of MCP server name to configuration",
)
@router.get(
"/mcp/config",
response_model=McpConfigResponse,
summary="Get MCP Configuration",
description="Retrieve the current Model Context Protocol (MCP) server configurations.",
)
async def get_mcp_configuration() -> McpConfigResponse:
"""Get the current MCP configuration.
Returns:
The current MCP configuration with all servers.
Example:
```json
{
"mcp_servers": {
"github": {
"enabled": true,
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": {"GITHUB_TOKEN": "ghp_xxx"},
"description": "GitHub MCP server for repository operations"
}
}
}
```
"""
config = get_extensions_config()
return McpConfigResponse(mcp_servers={name: McpServerConfigResponse(**server.model_dump()) for name, server in config.mcp_servers.items()})
@router.put(
"/mcp/config",
response_model=McpConfigResponse,
summary="Update MCP Configuration",
description="Update Model Context Protocol (MCP) server configurations and save to file.",
)
async def update_mcp_configuration(request: McpConfigUpdateRequest) -> McpConfigResponse:
"""Update the MCP configuration.
This will:
1. Save the new configuration to the mcp_config.json file
2. Reload the configuration cache
3. Reset MCP tools cache to trigger reinitialization
Args:
request: The new MCP configuration to save.
Returns:
The updated MCP configuration.
Raises:
HTTPException: 500 if the configuration file cannot be written.
Example Request:
```json
{
"mcp_servers": {
"github": {
"enabled": true,
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": {"GITHUB_TOKEN": "$GITHUB_TOKEN"},
"description": "GitHub MCP server for repository operations"
}
}
}
```
"""
try:
# Get the current config path (or determine where to save it)
config_path = ExtensionsConfig.resolve_config_path()
# If no config file exists, create one in the parent directory (project root)
if config_path is None:
config_path = Path.cwd().parent / "extensions_config.json"
logger.info(f"No existing extensions config found. Creating new config at: {config_path}")
# Load current config to preserve skills configuration
current_config = get_extensions_config()
# Convert request to dict format for JSON serialization
config_data = {
"mcpServers": {name: server.model_dump() for name, server in request.mcp_servers.items()},
"skills": {name: {"enabled": skill.enabled} for name, skill in current_config.skills.items()},
}
# Write the configuration to file
with open(config_path, "w") as f:
json.dump(config_data, f, indent=2)
logger.info(f"MCP configuration updated and saved to: {config_path}")
# NOTE: No need to reload/reset cache here - LangGraph Server (separate process)
# will detect config file changes via mtime and reinitialize MCP tools automatically
# Reload the configuration and update the global cache
reloaded_config = reload_extensions_config()
return McpConfigResponse(mcp_servers={name: McpServerConfigResponse(**server.model_dump()) for name, server in reloaded_config.mcp_servers.items()})
except Exception as e:
logger.error(f"Failed to update MCP configuration: {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Failed to update MCP configuration: {str(e)}")

View File

@@ -0,0 +1,201 @@
"""Memory API router for retrieving and managing global memory data."""
from fastapi import APIRouter
from pydantic import BaseModel, Field
from deerflow.agents.memory.updater import get_memory_data, reload_memory_data
from deerflow.config.memory_config import get_memory_config
router = APIRouter(prefix="/api", tags=["memory"])
class ContextSection(BaseModel):
"""Model for context sections (user and history)."""
summary: str = Field(default="", description="Summary content")
updatedAt: str = Field(default="", description="Last update timestamp")
class UserContext(BaseModel):
"""Model for user context."""
workContext: ContextSection = Field(default_factory=ContextSection)
personalContext: ContextSection = Field(default_factory=ContextSection)
topOfMind: ContextSection = Field(default_factory=ContextSection)
class HistoryContext(BaseModel):
"""Model for history context."""
recentMonths: ContextSection = Field(default_factory=ContextSection)
earlierContext: ContextSection = Field(default_factory=ContextSection)
longTermBackground: ContextSection = Field(default_factory=ContextSection)
class Fact(BaseModel):
"""Model for a memory fact."""
id: str = Field(..., description="Unique identifier for the fact")
content: str = Field(..., description="Fact content")
category: str = Field(default="context", description="Fact category")
confidence: float = Field(default=0.5, description="Confidence score (0-1)")
createdAt: str = Field(default="", description="Creation timestamp")
source: str = Field(default="unknown", description="Source thread ID")
class MemoryResponse(BaseModel):
"""Response model for memory data."""
version: str = Field(default="1.0", description="Memory schema version")
lastUpdated: str = Field(default="", description="Last update timestamp")
user: UserContext = Field(default_factory=UserContext)
history: HistoryContext = Field(default_factory=HistoryContext)
facts: list[Fact] = Field(default_factory=list)
class MemoryConfigResponse(BaseModel):
"""Response model for memory configuration."""
enabled: bool = Field(..., description="Whether memory is enabled")
storage_path: str = Field(..., description="Path to memory storage file")
debounce_seconds: int = Field(..., description="Debounce time for memory updates")
max_facts: int = Field(..., description="Maximum number of facts to store")
fact_confidence_threshold: float = Field(..., description="Minimum confidence threshold for facts")
injection_enabled: bool = Field(..., description="Whether memory injection is enabled")
max_injection_tokens: int = Field(..., description="Maximum tokens for memory injection")
class MemoryStatusResponse(BaseModel):
"""Response model for memory status."""
config: MemoryConfigResponse
data: MemoryResponse
@router.get(
"/memory",
response_model=MemoryResponse,
summary="Get Memory Data",
description="Retrieve the current global memory data including user context, history, and facts.",
)
async def get_memory() -> MemoryResponse:
"""Get the current global memory data.
Returns:
The current memory data with user context, history, and facts.
Example Response:
```json
{
"version": "1.0",
"lastUpdated": "2024-01-15T10:30:00Z",
"user": {
"workContext": {"summary": "Working on DeerFlow project", "updatedAt": "..."},
"personalContext": {"summary": "Prefers concise responses", "updatedAt": "..."},
"topOfMind": {"summary": "Building memory API", "updatedAt": "..."}
},
"history": {
"recentMonths": {"summary": "Recent development activities", "updatedAt": "..."},
"earlierContext": {"summary": "", "updatedAt": ""},
"longTermBackground": {"summary": "", "updatedAt": ""}
},
"facts": [
{
"id": "fact_abc123",
"content": "User prefers TypeScript over JavaScript",
"category": "preference",
"confidence": 0.9,
"createdAt": "2024-01-15T10:30:00Z",
"source": "thread_xyz"
}
]
}
```
"""
memory_data = get_memory_data()
return MemoryResponse(**memory_data)
@router.post(
"/memory/reload",
response_model=MemoryResponse,
summary="Reload Memory Data",
description="Reload memory data from the storage file, refreshing the in-memory cache.",
)
async def reload_memory() -> MemoryResponse:
"""Reload memory data from file.
This forces a reload of the memory data from the storage file,
useful when the file has been modified externally.
Returns:
The reloaded memory data.
"""
memory_data = reload_memory_data()
return MemoryResponse(**memory_data)
@router.get(
"/memory/config",
response_model=MemoryConfigResponse,
summary="Get Memory Configuration",
description="Retrieve the current memory system configuration.",
)
async def get_memory_config_endpoint() -> MemoryConfigResponse:
"""Get the memory system configuration.
Returns:
The current memory configuration settings.
Example Response:
```json
{
"enabled": true,
"storage_path": ".deer-flow/memory.json",
"debounce_seconds": 30,
"max_facts": 100,
"fact_confidence_threshold": 0.7,
"injection_enabled": true,
"max_injection_tokens": 2000
}
```
"""
config = get_memory_config()
return MemoryConfigResponse(
enabled=config.enabled,
storage_path=config.storage_path,
debounce_seconds=config.debounce_seconds,
max_facts=config.max_facts,
fact_confidence_threshold=config.fact_confidence_threshold,
injection_enabled=config.injection_enabled,
max_injection_tokens=config.max_injection_tokens,
)
@router.get(
"/memory/status",
response_model=MemoryStatusResponse,
summary="Get Memory Status",
description="Retrieve both memory configuration and current data in a single request.",
)
async def get_memory_status() -> MemoryStatusResponse:
"""Get the memory system status including configuration and data.
Returns:
Combined memory configuration and current data.
"""
config = get_memory_config()
memory_data = get_memory_data()
return MemoryStatusResponse(
config=MemoryConfigResponse(
enabled=config.enabled,
storage_path=config.storage_path,
debounce_seconds=config.debounce_seconds,
max_facts=config.max_facts,
fact_confidence_threshold=config.fact_confidence_threshold,
injection_enabled=config.injection_enabled,
max_injection_tokens=config.max_injection_tokens,
),
data=MemoryResponse(**memory_data),
)

View File

@@ -0,0 +1,113 @@
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel, Field
from deerflow.config import get_app_config
router = APIRouter(prefix="/api", tags=["models"])
class ModelResponse(BaseModel):
"""Response model for model information."""
name: str = Field(..., description="Unique identifier for the model")
display_name: str | None = Field(None, description="Human-readable name")
description: str | None = Field(None, description="Model description")
supports_thinking: bool = Field(default=False, description="Whether model supports thinking mode")
supports_reasoning_effort: bool = Field(default=False, description="Whether model supports reasoning effort")
class ModelsListResponse(BaseModel):
"""Response model for listing all models."""
models: list[ModelResponse]
@router.get(
"/models",
response_model=ModelsListResponse,
summary="List All Models",
description="Retrieve a list of all available AI models configured in the system.",
)
async def list_models() -> ModelsListResponse:
"""List all available models from configuration.
Returns model information suitable for frontend display,
excluding sensitive fields like API keys and internal configuration.
Returns:
A list of all configured models with their metadata.
Example Response:
```json
{
"models": [
{
"name": "gpt-4",
"display_name": "GPT-4",
"description": "OpenAI GPT-4 model",
"supports_thinking": false
},
{
"name": "claude-3-opus",
"display_name": "Claude 3 Opus",
"description": "Anthropic Claude 3 Opus model",
"supports_thinking": true
}
]
}
```
"""
config = get_app_config()
models = [
ModelResponse(
name=model.name,
display_name=model.display_name,
description=model.description,
supports_thinking=model.supports_thinking,
supports_reasoning_effort=model.supports_reasoning_effort,
)
for model in config.models
]
return ModelsListResponse(models=models)
@router.get(
"/models/{model_name}",
response_model=ModelResponse,
summary="Get Model Details",
description="Retrieve detailed information about a specific AI model by its name.",
)
async def get_model(model_name: str) -> ModelResponse:
"""Get a specific model by name.
Args:
model_name: The unique name of the model to retrieve.
Returns:
Model information if found.
Raises:
HTTPException: 404 if model not found.
Example Response:
```json
{
"name": "gpt-4",
"display_name": "GPT-4",
"description": "OpenAI GPT-4 model",
"supports_thinking": false
}
```
"""
config = get_app_config()
model = config.get_model_config(model_name)
if model is None:
raise HTTPException(status_code=404, detail=f"Model '{model_name}' not found")
return ModelResponse(
name=model.name,
display_name=model.display_name,
description=model.description,
supports_thinking=model.supports_thinking,
supports_reasoning_effort=model.supports_reasoning_effort,
)

View File

@@ -0,0 +1,438 @@
import json
import logging
import shutil
import stat
import tempfile
import zipfile
from pathlib import Path
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel, Field
from app.gateway.path_utils import resolve_thread_virtual_path
from deerflow.config.extensions_config import ExtensionsConfig, SkillStateConfig, get_extensions_config, reload_extensions_config
from deerflow.skills import Skill, load_skills
from deerflow.skills.loader import get_skills_root_path
from deerflow.skills.validation import _validate_skill_frontmatter
logger = logging.getLogger(__name__)
def _is_unsafe_zip_member(info: zipfile.ZipInfo) -> bool:
"""Return True if the zip member path is absolute or attempts directory traversal."""
name = info.filename
if not name:
return False
path = Path(name)
if path.is_absolute():
return True
if ".." in path.parts:
return True
return False
def _is_symlink_member(info: zipfile.ZipInfo) -> bool:
"""Detect symlinks based on the external attributes stored in the ZipInfo."""
# Upper 16 bits of external_attr contain the Unix file mode when created on Unix.
mode = info.external_attr >> 16
return stat.S_ISLNK(mode)
def _safe_extract_skill_archive(
zip_ref: zipfile.ZipFile,
dest_path: Path,
max_total_size: int = 512 * 1024 * 1024,
) -> None:
"""Safely extract a skill archive into dest_path with basic protections.
Protections:
- Reject absolute paths and directory traversal (..).
- Skip symlink entries instead of materialising them.
- Enforce a hard limit on total uncompressed size to mitigate zip bombs.
"""
dest_root = Path(dest_path).resolve()
total_size = 0
for info in zip_ref.infolist():
# Reject absolute paths or any path that attempts directory traversal.
if _is_unsafe_zip_member(info):
raise HTTPException(
status_code=400,
detail=f"Archive contains unsafe member path: {info.filename!r}",
)
# Skip any symlink entries instead of materialising them on disk.
if _is_symlink_member(info):
logger.warning("Skipping symlink entry in skill archive: %s", info.filename)
continue
# Basic unzip-bomb defence: bound the total uncompressed size we will write.
total_size += max(info.file_size, 0)
if total_size > max_total_size:
raise HTTPException(
status_code=400,
detail="Skill archive is too large or appears highly compressed.",
)
member_path = dest_root / info.filename
member_path_parent = member_path.parent
member_path_parent.mkdir(parents=True, exist_ok=True)
if info.is_dir():
member_path.mkdir(parents=True, exist_ok=True)
continue
with zip_ref.open(info) as src, open(member_path, "wb") as dst:
shutil.copyfileobj(src, dst)
router = APIRouter(prefix="/api", tags=["skills"])
class SkillResponse(BaseModel):
"""Response model for skill information."""
name: str = Field(..., description="Name of the skill")
description: str = Field(..., description="Description of what the skill does")
license: str | None = Field(None, description="License information")
category: str = Field(..., description="Category of the skill (public or custom)")
enabled: bool = Field(default=True, description="Whether this skill is enabled")
class SkillsListResponse(BaseModel):
"""Response model for listing all skills."""
skills: list[SkillResponse]
class SkillUpdateRequest(BaseModel):
"""Request model for updating a skill."""
enabled: bool = Field(..., description="Whether to enable or disable the skill")
class SkillInstallRequest(BaseModel):
"""Request model for installing a skill from a .skill file."""
thread_id: str = Field(..., description="The thread ID where the .skill file is located")
path: str = Field(..., description="Virtual path to the .skill file (e.g., mnt/user-data/outputs/my-skill.skill)")
class SkillInstallResponse(BaseModel):
"""Response model for skill installation."""
success: bool = Field(..., description="Whether the installation was successful")
skill_name: str = Field(..., description="Name of the installed skill")
message: str = Field(..., description="Installation result message")
def _should_ignore_archive_entry(path: Path) -> bool:
return path.name.startswith(".") or path.name == "__MACOSX"
def _resolve_skill_dir_from_archive_root(temp_path: Path) -> Path:
extracted_items = [item for item in temp_path.iterdir() if not _should_ignore_archive_entry(item)]
if len(extracted_items) == 0:
raise HTTPException(status_code=400, detail="Skill archive is empty")
if len(extracted_items) == 1 and extracted_items[0].is_dir():
return extracted_items[0]
return temp_path
def _skill_to_response(skill: Skill) -> SkillResponse:
"""Convert a Skill object to a SkillResponse."""
return SkillResponse(
name=skill.name,
description=skill.description,
license=skill.license,
category=skill.category,
enabled=skill.enabled,
)
@router.get(
"/skills",
response_model=SkillsListResponse,
summary="List All Skills",
description="Retrieve a list of all available skills from both public and custom directories.",
)
async def list_skills() -> SkillsListResponse:
"""List all available skills.
Returns all skills regardless of their enabled status.
Returns:
A list of all skills with their metadata.
Example Response:
```json
{
"skills": [
{
"name": "PDF Processing",
"description": "Extract and analyze PDF content",
"license": "MIT",
"category": "public",
"enabled": true
},
{
"name": "Frontend Design",
"description": "Generate frontend designs and components",
"license": null,
"category": "custom",
"enabled": false
}
]
}
```
"""
try:
# Load all skills (including disabled ones)
skills = load_skills(enabled_only=False)
return SkillsListResponse(skills=[_skill_to_response(skill) for skill in skills])
except Exception as e:
logger.error(f"Failed to load skills: {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Failed to load skills: {str(e)}")
@router.get(
"/skills/{skill_name}",
response_model=SkillResponse,
summary="Get Skill Details",
description="Retrieve detailed information about a specific skill by its name.",
)
async def get_skill(skill_name: str) -> SkillResponse:
"""Get a specific skill by name.
Args:
skill_name: The name of the skill to retrieve.
Returns:
Skill information if found.
Raises:
HTTPException: 404 if skill not found.
Example Response:
```json
{
"name": "PDF Processing",
"description": "Extract and analyze PDF content",
"license": "MIT",
"category": "public",
"enabled": true
}
```
"""
try:
skills = load_skills(enabled_only=False)
skill = next((s for s in skills if s.name == skill_name), None)
if skill is None:
raise HTTPException(status_code=404, detail=f"Skill '{skill_name}' not found")
return _skill_to_response(skill)
except HTTPException:
raise
except Exception as e:
logger.error(f"Failed to get skill {skill_name}: {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Failed to get skill: {str(e)}")
@router.put(
"/skills/{skill_name}",
response_model=SkillResponse,
summary="Update Skill",
description="Update a skill's enabled status by modifying the extensions_config.json file.",
)
async def update_skill(skill_name: str, request: SkillUpdateRequest) -> SkillResponse:
"""Update a skill's enabled status.
This will modify the extensions_config.json file to update the enabled state.
The SKILL.md file itself is not modified.
Args:
skill_name: The name of the skill to update.
request: The update request containing the new enabled status.
Returns:
The updated skill information.
Raises:
HTTPException: 404 if skill not found, 500 if update fails.
Example Request:
```json
{
"enabled": false
}
```
Example Response:
```json
{
"name": "PDF Processing",
"description": "Extract and analyze PDF content",
"license": "MIT",
"category": "public",
"enabled": false
}
```
"""
try:
# Find the skill to verify it exists
skills = load_skills(enabled_only=False)
skill = next((s for s in skills if s.name == skill_name), None)
if skill is None:
raise HTTPException(status_code=404, detail=f"Skill '{skill_name}' not found")
# Get or create config path
config_path = ExtensionsConfig.resolve_config_path()
if config_path is None:
# Create new config file in parent directory (project root)
config_path = Path.cwd().parent / "extensions_config.json"
logger.info(f"No existing extensions config found. Creating new config at: {config_path}")
# Load current configuration
extensions_config = get_extensions_config()
# Update the skill's enabled status
extensions_config.skills[skill_name] = SkillStateConfig(enabled=request.enabled)
# Convert to JSON format (preserve MCP servers config)
config_data = {
"mcpServers": {name: server.model_dump() for name, server in extensions_config.mcp_servers.items()},
"skills": {name: {"enabled": skill_config.enabled} for name, skill_config in extensions_config.skills.items()},
}
# Write the configuration to file
with open(config_path, "w") as f:
json.dump(config_data, f, indent=2)
logger.info(f"Skills configuration updated and saved to: {config_path}")
# Reload the extensions config to update the global cache
reload_extensions_config()
# Reload the skills to get the updated status (for API response)
skills = load_skills(enabled_only=False)
updated_skill = next((s for s in skills if s.name == skill_name), None)
if updated_skill is None:
raise HTTPException(status_code=500, detail=f"Failed to reload skill '{skill_name}' after update")
logger.info(f"Skill '{skill_name}' enabled status updated to {request.enabled}")
return _skill_to_response(updated_skill)
except HTTPException:
raise
except Exception as e:
logger.error(f"Failed to update skill {skill_name}: {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Failed to update skill: {str(e)}")
@router.post(
"/skills/install",
response_model=SkillInstallResponse,
summary="Install Skill",
description="Install a skill from a .skill file (ZIP archive) located in the thread's user-data directory.",
)
async def install_skill(request: SkillInstallRequest) -> SkillInstallResponse:
"""Install a skill from a .skill file.
The .skill file is a ZIP archive containing a skill directory with SKILL.md
and optional resources (scripts, references, assets).
Args:
request: The install request containing thread_id and virtual path to .skill file.
Returns:
Installation result with skill name and status message.
Raises:
HTTPException:
- 400 if path is invalid or file is not a valid .skill file
- 403 if access denied (path traversal detected)
- 404 if file not found
- 409 if skill already exists
- 500 if installation fails
Example Request:
```json
{
"thread_id": "abc123-def456",
"path": "/mnt/user-data/outputs/my-skill.skill"
}
```
Example Response:
```json
{
"success": true,
"skill_name": "my-skill",
"message": "Skill 'my-skill' installed successfully"
}
```
"""
try:
# Resolve the virtual path to actual file path
skill_file_path = resolve_thread_virtual_path(request.thread_id, request.path)
# Check if file exists
if not skill_file_path.exists():
raise HTTPException(status_code=404, detail=f"Skill file not found: {request.path}")
# Check if it's a file
if not skill_file_path.is_file():
raise HTTPException(status_code=400, detail=f"Path is not a file: {request.path}")
# Check file extension
if not skill_file_path.suffix == ".skill":
raise HTTPException(status_code=400, detail="File must have .skill extension")
# Verify it's a valid ZIP file
if not zipfile.is_zipfile(skill_file_path):
raise HTTPException(status_code=400, detail="File is not a valid ZIP archive")
# Get the custom skills directory
skills_root = get_skills_root_path()
custom_skills_dir = skills_root / "custom"
# Create custom directory if it doesn't exist
custom_skills_dir.mkdir(parents=True, exist_ok=True)
# Extract to a temporary directory first for validation
with tempfile.TemporaryDirectory() as temp_dir:
temp_path = Path(temp_dir)
# Extract the .skill file with validation and protections.
with zipfile.ZipFile(skill_file_path, "r") as zip_ref:
_safe_extract_skill_archive(zip_ref, temp_path)
skill_dir = _resolve_skill_dir_from_archive_root(temp_path)
# Validate the skill
is_valid, message, skill_name = _validate_skill_frontmatter(skill_dir)
if not is_valid:
raise HTTPException(status_code=400, detail=f"Invalid skill: {message}")
if not skill_name:
raise HTTPException(status_code=400, detail="Could not determine skill name")
# Check if skill already exists
target_dir = custom_skills_dir / skill_name
if target_dir.exists():
raise HTTPException(status_code=409, detail=f"Skill '{skill_name}' already exists. Please remove it first or use a different name.")
# Move the skill directory to the custom skills directory
shutil.copytree(skill_dir, target_dir)
logger.info(f"Skill '{skill_name}' installed successfully to {target_dir}")
return SkillInstallResponse(success=True, skill_name=skill_name, message=f"Skill '{skill_name}' installed successfully")
except HTTPException:
raise
except Exception as e:
logger.error(f"Failed to install skill: {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Failed to install skill: {str(e)}")

View File

@@ -0,0 +1,132 @@
import json
import logging
from fastapi import APIRouter
from pydantic import BaseModel, Field
from deerflow.models import create_chat_model
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/api", tags=["suggestions"])
class SuggestionMessage(BaseModel):
role: str = Field(..., description="Message role: user|assistant")
content: str = Field(..., description="Message content as plain text")
class SuggestionsRequest(BaseModel):
messages: list[SuggestionMessage] = Field(..., description="Recent conversation messages")
n: int = Field(default=3, ge=1, le=5, description="Number of suggestions to generate")
model_name: str | None = Field(default=None, description="Optional model override")
class SuggestionsResponse(BaseModel):
suggestions: list[str] = Field(default_factory=list, description="Suggested follow-up questions")
def _strip_markdown_code_fence(text: str) -> str:
stripped = text.strip()
if not stripped.startswith("```"):
return stripped
lines = stripped.splitlines()
if len(lines) >= 3 and lines[0].startswith("```") and lines[-1].startswith("```"):
return "\n".join(lines[1:-1]).strip()
return stripped
def _parse_json_string_list(text: str) -> list[str] | None:
candidate = _strip_markdown_code_fence(text)
start = candidate.find("[")
end = candidate.rfind("]")
if start == -1 or end == -1 or end <= start:
return None
candidate = candidate[start : end + 1]
try:
data = json.loads(candidate)
except Exception:
return None
if not isinstance(data, list):
return None
out: list[str] = []
for item in data:
if not isinstance(item, str):
continue
s = item.strip()
if not s:
continue
out.append(s)
return out
def _extract_response_text(content: object) -> str:
if isinstance(content, str):
return content
if isinstance(content, list):
parts: list[str] = []
for block in content:
if isinstance(block, str):
parts.append(block)
elif isinstance(block, dict) and block.get("type") == "text":
text = block.get("text")
if isinstance(text, str):
parts.append(text)
return "\n".join(parts) if parts else ""
if content is None:
return ""
return str(content)
def _format_conversation(messages: list[SuggestionMessage]) -> str:
parts: list[str] = []
for m in messages:
role = m.role.strip().lower()
if role in ("user", "human"):
parts.append(f"User: {m.content.strip()}")
elif role in ("assistant", "ai"):
parts.append(f"Assistant: {m.content.strip()}")
else:
parts.append(f"{m.role}: {m.content.strip()}")
return "\n".join(parts).strip()
@router.post(
"/threads/{thread_id}/suggestions",
response_model=SuggestionsResponse,
summary="Generate Follow-up Questions",
description="Generate short follow-up questions a user might ask next, based on recent conversation context.",
)
async def generate_suggestions(thread_id: str, request: SuggestionsRequest) -> SuggestionsResponse:
if not request.messages:
return SuggestionsResponse(suggestions=[])
n = request.n
conversation = _format_conversation(request.messages)
if not conversation:
return SuggestionsResponse(suggestions=[])
prompt = (
"You are generating follow-up questions to help the user continue the conversation.\n"
f"Based on the conversation below, produce EXACTLY {n} short questions the user might ask next.\n"
"Requirements:\n"
"- Questions must be relevant to the conversation.\n"
"- Questions must be written in the same language as the user.\n"
"- Keep each question concise (ideally <= 20 words / <= 40 Chinese characters).\n"
"- Do NOT include numbering, markdown, or any extra text.\n"
"- Output MUST be a JSON array of strings only.\n\n"
"Conversation:\n"
f"{conversation}\n"
)
try:
model = create_chat_model(name=request.model_name, thinking_enabled=False)
response = model.invoke(prompt)
raw = _extract_response_text(response.content)
suggestions = _parse_json_string_list(raw) or []
cleaned = [s.replace("\n", " ").strip() for s in suggestions if s.strip()]
cleaned = cleaned[:n]
return SuggestionsResponse(suggestions=cleaned)
except Exception as exc:
logger.exception("Failed to generate suggestions: thread_id=%s err=%s", thread_id, exc)
return SuggestionsResponse(suggestions=[])

View File

@@ -0,0 +1,195 @@
"""Upload router for handling file uploads."""
import logging
from pathlib import Path
from fastapi import APIRouter, File, HTTPException, UploadFile
from pydantic import BaseModel
from deerflow.config.paths import VIRTUAL_PATH_PREFIX, get_paths
from deerflow.sandbox.sandbox_provider import get_sandbox_provider
from deerflow.utils.file_conversion import CONVERTIBLE_EXTENSIONS, convert_file_to_markdown
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/api/threads/{thread_id}/uploads", tags=["uploads"])
class UploadResponse(BaseModel):
"""Response model for file upload."""
success: bool
files: list[dict[str, str]]
message: str
def get_uploads_dir(thread_id: str) -> Path:
"""Get the uploads directory for a thread.
Args:
thread_id: The thread ID.
Returns:
Path to the uploads directory.
"""
base_dir = get_paths().sandbox_uploads_dir(thread_id)
base_dir.mkdir(parents=True, exist_ok=True)
return base_dir
@router.post("", response_model=UploadResponse)
async def upload_files(
thread_id: str,
files: list[UploadFile] = File(...),
) -> UploadResponse:
"""Upload multiple files to a thread's uploads directory.
For PDF, PPT, Excel, and Word files, they will be converted to markdown using markitdown.
All files (original and converted) are saved to /mnt/user-data/uploads.
Args:
thread_id: The thread ID to upload files to.
files: List of files to upload.
Returns:
Upload response with success status and file information.
"""
if not files:
raise HTTPException(status_code=400, detail="No files provided")
uploads_dir = get_uploads_dir(thread_id)
paths = get_paths()
uploaded_files = []
sandbox_provider = get_sandbox_provider()
sandbox_id = sandbox_provider.acquire(thread_id)
sandbox = sandbox_provider.get(sandbox_id)
for file in files:
if not file.filename:
continue
try:
# Normalize filename to prevent path traversal
safe_filename = Path(file.filename).name
if not safe_filename or safe_filename in {".", ".."} or "/" in safe_filename or "\\" in safe_filename:
logger.warning(f"Skipping file with unsafe filename: {file.filename!r}")
continue
content = await file.read()
file_path = uploads_dir / safe_filename
file_path.write_bytes(content)
# Build relative path from backend root
relative_path = str(paths.sandbox_uploads_dir(thread_id) / safe_filename)
virtual_path = f"{VIRTUAL_PATH_PREFIX}/uploads/{safe_filename}"
# Keep local sandbox source of truth in thread-scoped host storage.
# For non-local sandboxes, also sync to virtual path for runtime visibility.
if sandbox_id != "local":
sandbox.update_file(virtual_path, content)
file_info = {
"filename": safe_filename,
"size": str(len(content)),
"path": relative_path, # Actual filesystem path (relative to backend/)
"virtual_path": virtual_path, # Path for Agent in sandbox
"artifact_url": f"/api/threads/{thread_id}/artifacts/mnt/user-data/uploads/{safe_filename}", # HTTP URL
}
logger.info(f"Saved file: {safe_filename} ({len(content)} bytes) to {relative_path}")
# Check if file should be converted to markdown
file_ext = file_path.suffix.lower()
if file_ext in CONVERTIBLE_EXTENSIONS:
md_path = await convert_file_to_markdown(file_path)
if md_path:
md_relative_path = str(paths.sandbox_uploads_dir(thread_id) / md_path.name)
md_virtual_path = f"{VIRTUAL_PATH_PREFIX}/uploads/{md_path.name}"
if sandbox_id != "local":
sandbox.update_file(md_virtual_path, md_path.read_bytes())
file_info["markdown_file"] = md_path.name
file_info["markdown_path"] = md_relative_path
file_info["markdown_virtual_path"] = md_virtual_path
file_info["markdown_artifact_url"] = f"/api/threads/{thread_id}/artifacts/mnt/user-data/uploads/{md_path.name}"
uploaded_files.append(file_info)
except Exception as e:
logger.error(f"Failed to upload {file.filename}: {e}")
raise HTTPException(status_code=500, detail=f"Failed to upload {file.filename}: {str(e)}")
return UploadResponse(
success=True,
files=uploaded_files,
message=f"Successfully uploaded {len(uploaded_files)} file(s)",
)
@router.get("/list", response_model=dict)
async def list_uploaded_files(thread_id: str) -> dict:
"""List all files in a thread's uploads directory.
Args:
thread_id: The thread ID to list files for.
Returns:
Dictionary containing list of files with their metadata.
"""
uploads_dir = get_uploads_dir(thread_id)
if not uploads_dir.exists():
return {"files": [], "count": 0}
files = []
for file_path in sorted(uploads_dir.iterdir()):
if file_path.is_file():
stat = file_path.stat()
relative_path = str(get_paths().sandbox_uploads_dir(thread_id) / file_path.name)
files.append(
{
"filename": file_path.name,
"size": stat.st_size,
"path": relative_path, # Actual filesystem path
"virtual_path": f"{VIRTUAL_PATH_PREFIX}/uploads/{file_path.name}", # Path for Agent in sandbox
"artifact_url": f"/api/threads/{thread_id}/artifacts/mnt/user-data/uploads/{file_path.name}", # HTTP URL
"extension": file_path.suffix,
"modified": stat.st_mtime,
}
)
return {"files": files, "count": len(files)}
@router.delete("/{filename}")
async def delete_uploaded_file(thread_id: str, filename: str) -> dict:
"""Delete a file from a thread's uploads directory.
Args:
thread_id: The thread ID.
filename: The filename to delete.
Returns:
Success message.
"""
uploads_dir = get_uploads_dir(thread_id)
file_path = uploads_dir / filename
if not file_path.exists():
raise HTTPException(status_code=404, detail=f"File not found: {filename}")
# Security check: ensure the path is within the uploads directory
try:
file_path.resolve().relative_to(uploads_dir.resolve())
except ValueError:
raise HTTPException(status_code=403, detail="Access denied")
try:
file_path.unlink()
logger.info(f"Deleted file: {filename}")
return {"success": True, "message": f"Deleted {filename}"}
except Exception as e:
logger.error(f"Failed to delete {filename}: {e}")
raise HTTPException(status_code=500, detail=f"Failed to delete {filename}: {str(e)}")