mirror of
https://gitee.com/wanwujie/deer-flow
synced 2026-04-16 11:24:45 +08:00
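chore: remove all Citations-related logic in preparation for a later refactor

- Backend: remove `citations_format` and the citation-related reminders from lead_agent / general_purpose; artifact downloads no longer strip citations from markdown and always go through FileResponse; Response is kept for inline binary content
- Frontend: remove the core/citations module, inline-citation, and safe-citation-content; add MarkdownContent, which only renders Markdown; message/artifact preview and copy now use the raw content
- i18n: remove the citations namespace (loadingCitations, loadingCitationsWithCount)
- Skills and demo: reword to "references"; remove the `<citations>` blocks from the demo data
- Docs: update the CLAUDE/AGENTS/README descriptions; add a per-file diff summary of the code changes

Co-authored-by: Cursor <cursoragent@cursor.com>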
@@ -156,7 +156,7 @@ FastAPI application on port 8001 with health check at `GET /health`.
 | **Skills** (`/api/skills`) | `GET /` - list skills; `GET /{name}` - details; `PUT /{name}` - update enabled; `POST /install` - install from .skill archive |
 | **Memory** (`/api/memory`) | `GET /` - memory data; `POST /reload` - force reload; `GET /config` - config; `GET /status` - config + data |
 | **Uploads** (`/api/threads/{id}/uploads`) | `POST /` - upload files (auto-converts PDF/PPT/Excel/Word); `GET /list` - list; `DELETE /{filename}` - delete |
-| **Artifacts** (`/api/threads/{id}/artifacts`) | `GET /{path}` - serve artifacts; `?download=true` for download with citation removal |
+| **Artifacts** (`/api/threads/{id}/artifacts`) | `GET /{path}` - serve artifacts; `?download=true` for file download |

 Proxied through nginx: `/api/langgraph/*` → LangGraph, all other `/api/*` → Gateway.

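For orientation, the Artifacts route in the hunk above can be addressed with a URL like the one built here. This is only a sketch: the helper name `artifact_url`, the base URL, the thread ID, and the sample path are assumptions made for illustration; only the route shape and the `download=true` query parameter come from the table.

```python
from urllib.parse import quote, urlencode


def artifact_url(base: str, thread_id: str, path: str, download: bool = False) -> str:
    """Build a URL for GET /api/threads/{id}/artifacts/{path} (hypothetical helper)."""
    # Percent-encode each component; "/" inside the artifact path is left as-is.
    thread_part = quote(thread_id, safe="")
    path_part = quote(path.lstrip("/"))
    url = f"{base.rstrip('/')}/api/threads/{thread_part}/artifacts/{path_part}"
    if download:
        # After this commit, download=true returns the file unmodified.
        url += "?" + urlencode({"download": "true"})
    return url


if __name__ == "__main__":
    # Assumed base URL: the Gateway port mentioned in the hunk header.
    print(artifact_url("http://localhost:8001", "thread-1", "outputs/report.md", download=True))
```
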
@@ -240,34 +240,8 @@ You have access to skills that provide optimized workflows for specific tasks. E
 - Action-Oriented: Focus on delivering results, not explaining processes
 </response_style>

-<citations_format>
-After web_search, ALWAYS include citations in your output:
-
-1. Start with a `<citations>` block in JSONL format listing all sources
-2. In content, use FULL markdown link format: [Short Title](full_url)
-
-**CRITICAL - Citation Link Format:**
-- CORRECT: `[TechCrunch](https://techcrunch.com/ai-trends)` - full markdown link with URL
-- WRONG: `[arXiv:2502.19166]` - missing URL, will NOT render as link
-- WRONG: `[Source]` - missing URL, will NOT render as link
-
-**Rules:**
-- Every citation MUST be a complete markdown link with URL: `[Title](https://...)`
-- Write content naturally, add citation link at end of sentence/paragraph
-- NEVER use bare brackets like `[arXiv:xxx]` or `[Source]` without URL
-
-**Example:**
-<citations>
-{{"id": "cite-1", "title": "AI Trends 2026", "url": "https://techcrunch.com/ai-trends", "snippet": "Tech industry predictions"}}
-{{"id": "cite-2", "title": "OpenAI Research", "url": "https://openai.com/research", "snippet": "Latest AI research developments"}}
-</citations>
-The key AI trends for 2026 include enhanced reasoning capabilities and multimodal integration [TechCrunch](https://techcrunch.com/ai-trends). Recent breakthroughs in language models have also accelerated progress [OpenAI](https://openai.com/research).
-</citations_format>
-
-
 <critical_reminders>
 - **Clarification First**: ALWAYS clarify unclear/missing/ambiguous requirements BEFORE starting work - never assume or guess
-- **Web search citations**: When you use web_search (or synthesize subagent results that used it), you MUST output the `<citations>` block and [Title](url) links as specified in citations_format so citations display for the user.
 {subagent_reminder}- Skill First: Always load the relevant skill before starting **complex** tasks.
 - Progressive Loading: Load resources incrementally as referenced in skills
 - Output Files: Final deliverables must be in `/mnt/user-data/outputs`
@@ -341,7 +315,6 @@ def apply_prompt_template(subagent_enabled: bool = False) -> str:
     # Add subagent reminder to critical_reminders if enabled
     subagent_reminder = (
         "- **Orchestrator Mode**: You are a task orchestrator - decompose complex tasks into parallel sub-tasks and launch multiple subagents simultaneously. Synthesize results, don't execute directly.\n"
-        "- **Citations when synthesizing**: When you synthesize subagent results that used web search or cite sources, you MUST include a consolidated `<citations>` block (JSONL format) and use [Title](url) markdown links in your response so citations display correctly.\n"
         if subagent_enabled
         else ""
     )

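The `{subagent_reminder}` placeholder and the doubled braces in the removed JSONL example are consistent with the prompt being rendered through `str.format`, where `{{` and `}}` produce literal braces. The diff does not show the actual call, so the following is a sketch under that assumption; `TEMPLATE` and `render` are hypothetical names and the template text is abbreviated.

```python
# Abbreviated stand-in for the prompt template; not the repository's full text.
TEMPLATE = """<critical_reminders>
{subagent_reminder}- Skill First: Always load the relevant skill before starting **complex** tasks.
</critical_reminders>
Literal JSON survives formatting: {{"id": "example"}}
"""


def render(subagent_enabled: bool = False) -> str:
    """Fill the placeholder the same way the hunk above builds subagent_reminder."""
    subagent_reminder = (
        "- **Orchestrator Mode**: decompose complex tasks into parallel sub-tasks.\n"
        if subagent_enabled
        else ""
    )
    # str.format replaces {subagent_reminder} and turns {{ }} into single braces.
    return TEMPLATE.format(subagent_reminder=subagent_reminder)


if __name__ == "__main__":
    print(render(subagent_enabled=True))
```

Because the reminder string ends with a newline, the `- Skill First` bullet stays on its own line whether or not the subagent reminder is present.
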
@@ -1,12 +1,10 @@
-import json
 import mimetypes
-import re
 import zipfile
 from pathlib import Path
 from urllib.parse import quote

-from fastapi import APIRouter, HTTPException, Request, Response
-from fastapi.responses import FileResponse, HTMLResponse, PlainTextResponse
+from fastapi import APIRouter, HTTPException, Request
+from fastapi.responses import FileResponse, HTMLResponse, PlainTextResponse, Response

 from src.gateway.path_utils import resolve_thread_virtual_path

@@ -24,40 +22,6 @@ def is_text_file_by_content(path: Path, sample_size: int = 8192) -> bool:
         return False


-def _extract_citation_urls(content: str) -> set[str]:
-    """Extract URLs from <citations> JSONL blocks. Format must match frontend core/citations/utils.ts."""
-    urls: set[str] = set()
-    for match in re.finditer(r"<citations>([\s\S]*?)</citations>", content):
-        for line in match.group(1).split("\n"):
-            line = line.strip()
-            if line.startswith("{"):
-                try:
-                    obj = json.loads(line)
-                    if "url" in obj:
-                        urls.add(obj["url"])
-                except (json.JSONDecodeError, ValueError):
-                    pass
-    return urls
-
-
-def remove_citations_block(content: str) -> str:
-    """Remove ALL citations from markdown (blocks, [cite-N], and citation links). Used for downloads."""
-    if not content:
-        return content
-
-    citation_urls = _extract_citation_urls(content)
-
-    result = re.sub(r"<citations>[\s\S]*?</citations>", "", content)
-    if "<citations>" in result:
-        result = re.sub(r"<citations>[\s\S]*$", "", result)
-    result = re.sub(r"\[cite-\d+\]", "", result)
-
-    for url in citation_urls:
-        result = re.sub(rf"\[[^\]]+\]\({re.escape(url)}\)", "", result)
-
-    return re.sub(r"\n{3,}", "\n\n", result).strip()
-
-
 def _extract_file_from_skill_archive(zip_path: Path, internal_path: str) -> bytes | None:
     """Extract a file from a .skill ZIP archive.

@@ -172,24 +136,9 @@ async def get_artifact(thread_id: str, path: str, request: Request) -> FileRespo

     # Encode filename for Content-Disposition header (RFC 5987)
     encoded_filename = quote(actual_path.name)
-
-    # Check if this is a markdown file that might contain citations
-    is_markdown = mime_type == "text/markdown" or actual_path.suffix.lower() in [".md", ".markdown"]
-
+
     # if `download` query parameter is true, return the file as a download
     if request.query_params.get("download"):
-        # For markdown files, remove citations block before download
-        if is_markdown:
-            content = actual_path.read_text()
-            clean_content = remove_citations_block(content)
-            return Response(
-                content=clean_content.encode("utf-8"),
-                media_type="text/markdown",
-                headers={
-                    "Content-Disposition": f"attachment; filename*=UTF-8''{encoded_filename}",
-                    "Content-Type": "text/markdown; charset=utf-8"
-                }
-            )
         return FileResponse(path=actual_path, filename=actual_path.name, media_type=mime_type, headers={"Content-Disposition": f"attachment; filename*=UTF-8''{encoded_filename}"})

     if mime_type and mime_type == "text/html":

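With the markdown special case gone, every download relies on the same `filename*` header value built from `quote(actual_path.name)`. The sketch below isolates that construction so the encoding is easy to see; `content_disposition` is a hypothetical helper written for illustration, not a function in the repository.

```python
from pathlib import Path
from urllib.parse import quote


def content_disposition(path: Path) -> str:
    """Return an attachment header value using the RFC 5987 style filename* form."""
    # quote() percent-encodes the UTF-8 bytes of the name, so spaces and
    # non-ASCII characters become safe to place in an HTTP header.
    encoded_filename = quote(path.name)
    return f"attachment; filename*=UTF-8''{encoded_filename}"


if __name__ == "__main__":
    print(content_disposition(Path("/tmp/my report.md")))
    # attachment; filename*=UTF-8''my%20report.md
```
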
@@ -24,21 +24,10 @@ Do NOT use for simple, single-step operations.""",
 - Do NOT ask for clarification - work with the information provided
 </guidelines>

-<citations_format>
-If you used web_search (or similar) and cite sources, ALWAYS include citations in your output:
-1. Start with a `<citations>` block in JSONL format listing all sources (one JSON object per line)
-2. In content, use FULL markdown link format: [Short Title](full_url)
-- Every citation MUST be a complete markdown link with URL: [Title](https://...)
-- Example block:
-<citations>
-{"id": "cite-1", "title": "...", "url": "https://...", "snippet": "..."}
-</citations>
-</citations_format>
-
 <output_format>
 When you complete the task, provide:
 1. A brief summary of what was accomplished
-2. Key findings or results (with citation links when from web search)
+2. Key findings or results
 3. Any relevant file paths, data, or artifacts created
 4. Issues encountered (if any)
 </output_format>