From 9809af1f26982f73532296ddeb35a0875836ab52 Mon Sep 17 00:00:00 2001
From: lailoo <1811866786@qq.com>
Date: Tue, 17 Mar 2026 09:51:08 +0800
Subject: [PATCH] feat: add citation/reference support to deep research reports (#1143)

* feat: add citation/reference support to deep research reports (#1141)

- Enhance lead agent system prompt with mandatory citation requirements
  after web_search/web_fetch tool usage
- Add citation examples and best practices to GitHub Deep Research skill
- Add citation hints to report template (Executive Summary, Key Analysis)
- Style regular markdown links in frontend for visual distinction
  (color, underline, hover effect)
- Fix TitleMiddleware being registered when title generation is disabled

* fix: address PR review comments

- Revert TitleMiddleware conditional registration (agent.py) to avoid
  sync/async incompatibility with DeerFlowClient
- Fix markdown link rendering: merge classNames instead of overwriting,
  only set target=_blank for external http(s) URLs
- Remove unrelated package.json/pnpm-lock.yaml changes

* fix: use plain markdown links in Sources section for cleaner rendering

Inline citations in report body use [citation:Title](URL) for pill/badge style.
Sources section uses plain [Title](URL) for simple underlined link style.

* fix(frontend): render plain links as underlined text in artifact markdown

Only links with citation: prefix render as Badge pills. Regular links in
Sources section now render as underlined text links.

---------

Co-authored-by: Willem Jiang
---
 .../deerflow/agents/lead_agent/prompt.py      | 57 ++++++++++++++++++-
 .../artifacts/artifact-file-detail.tsx        |  4 +-
 .../workspace/citations/artifact-link.tsx     | 33 +++++++++++
 .../workspace/messages/markdown-content.tsx   | 20 ++++++-
 skills/public/github-deep-research/SKILL.md   | 19 ++++++-
 .../assets/report_template.md                 |  5 ++
 6 files changed, 128 insertions(+), 10 deletions(-)
 create mode 100644 frontend/src/components/workspace/citations/artifact-link.tsx

diff --git a/backend/packages/harness/deerflow/agents/lead_agent/prompt.py b/backend/packages/harness/deerflow/agents/lead_agent/prompt.py
index 3591d79..3417bbf 100644
--- a/backend/packages/harness/deerflow/agents/lead_agent/prompt.py
+++ b/backend/packages/harness/deerflow/agents/lead_agent/prompt.py
@@ -257,15 +257,66 @@ You: "Deploying to staging..." [proceed]
-- When to Use: After web_search, include citations if applicable
-- Format: Use Markdown link format `[citation:TITLE](URL)`
-- Example:
+**CRITICAL: Always include citations when using web search results**
+
+- **When to Use**: MANDATORY after web_search, web_fetch, or any external information source
+- **Format**: Use Markdown link format `[citation:TITLE](URL)` immediately after the claim
+- **Placement**: Inline citations should appear right after the sentence or claim they support
+- **Sources Section**: Also collect all citations in a "Sources" section at the end of reports
+
+**Example - Inline Citations:**
 ```markdown
 The key AI trends for 2026 include enhanced reasoning capabilities and multimodal integration [citation:AI Trends 2026](https://techcrunch.com/ai-trends).
 Recent breakthroughs in language models have also accelerated progress [citation:OpenAI Research](https://openai.com/research).
 ```
+
+**Example - Deep Research Report with Citations:**
+```markdown
+## Executive Summary
+
+DeerFlow is an open-source AI agent framework that gained significant traction in early 2026
+[citation:GitHub Repository](https://github.com/bytedance/deer-flow). The project focuses on
+providing a production-ready agent system with sandbox execution and memory management
+[citation:DeerFlow Documentation](https://deer-flow.dev/docs).
+
+## Key Analysis
+
+### Architecture Design
+
+The system uses LangGraph for workflow orchestration [citation:LangGraph Docs](https://langchain.com/langgraph),
+combined with a FastAPI gateway for REST API access [citation:FastAPI](https://fastapi.tiangolo.com).
+
+## Sources
+
+### Primary Sources
+- [GitHub Repository](https://github.com/bytedance/deer-flow) - Official source code and documentation
+- [DeerFlow Documentation](https://deer-flow.dev/docs) - Technical specifications
+
+### Media Coverage
+- [AI Trends 2026](https://techcrunch.com/ai-trends) - Industry analysis
+```
+
+**CRITICAL: Sources section format:**
+- Every item in the Sources section MUST be a clickable markdown link with URL
+- Use standard markdown link `[Title](URL) - Description` format (NOT `[citation:...]` format)
+- The `[citation:Title](URL)` format is ONLY for inline citations within the report body
+- ❌ WRONG: `GitHub 仓库 - 官方源代码和文档` (no URL!)
+- ❌ WRONG in Sources: `[citation:GitHub Repository](url)` (citation prefix is for inline only!)
+- ✅ RIGHT in Sources: `[GitHub Repository](https://github.com/bytedance/deer-flow) - 官方源代码和文档`
+
+**WORKFLOW for Research Tasks:**
+1. Use web_search to find sources → Extract {{title, url, snippet}} from results
+2. Write content with inline citations: `claim [citation:Title](url)`
+3. Collect all citations in a "Sources" section at the end
+4. NEVER write claims without citations when sources are available
+
+**CRITICAL RULES:**
+- ❌ DO NOT write research content without citations
+- ❌ DO NOT forget to extract URLs from search results
+- ✅ ALWAYS add `[citation:Title](URL)` after claims from external sources
+- ✅ ALWAYS include a "Sources" section listing all references
diff --git a/frontend/src/components/workspace/artifacts/artifact-file-detail.tsx b/frontend/src/components/workspace/artifacts/artifact-file-detail.tsx
index 0539d18..267320b 100644
--- a/frontend/src/components/workspace/artifacts/artifact-file-detail.tsx
+++ b/frontend/src/components/workspace/artifacts/artifact-file-detail.tsx
@@ -38,7 +38,7 @@ import { checkCodeFile, getFileName } from "@/core/utils/files";
 import { env } from "@/env";
 import { cn } from "@/lib/utils";
 
-import { CitationLink } from "../citations/citation-link";
+import { ArtifactLink } from "../citations/artifact-link";
 import { useThread } from "../messages/context";
 import { Tooltip } from "../tooltip";
 
@@ -274,7 +274,7 @@ export function ArtifactFilePreview({
 {content ?? ""}
diff --git a/frontend/src/components/workspace/citations/artifact-link.tsx b/frontend/src/components/workspace/citations/artifact-link.tsx
new file mode 100644
index 0000000..c7bc171
--- /dev/null
+++ b/frontend/src/components/workspace/citations/artifact-link.tsx
@@ -0,0 +1,33 @@
+import type { AnchorHTMLAttributes } from "react";
+
+import { cn } from "@/lib/utils";
+
+import { CitationLink } from "./citation-link";
+
+function isExternalUrl(href: string | undefined): boolean {
+  return !!href && /^https?:\/\//.test(href);
+}
+
+/** Link renderer for artifact markdown: citation: prefix → CitationLink, otherwise underlined text. */
+export function ArtifactLink(props: AnchorHTMLAttributes) {
+  if (typeof props.children === "string") {
+    const match = /^citation:(.+)$/.exec(props.children);
+    if (match) {
+      const [, text] = match;
+      return {text};
+    }
+  }
+  const { className, target, rel, ...rest } = props;
+  const external = isExternalUrl(props.href);
+  return (
+  );
+}
diff --git a/frontend/src/components/workspace/messages/markdown-content.tsx b/frontend/src/components/workspace/messages/markdown-content.tsx
index 2e240d8..4b61fc9 100644
--- a/frontend/src/components/workspace/messages/markdown-content.tsx
+++ b/frontend/src/components/workspace/messages/markdown-content.tsx
@@ -1,16 +1,21 @@
 "use client";
 
 import { useMemo } from "react";
-import type { HTMLAttributes } from "react";
+import type { AnchorHTMLAttributes } from "react";
 
 import {
   MessageResponse,
   type MessageResponseProps,
 } from "@/components/ai-elements/message";
 import { streamdownPlugins } from "@/core/streamdown";
+import { cn } from "@/lib/utils";
 
 import { CitationLink } from "../citations/citation-link";
 
+function isExternalUrl(href: string | undefined): boolean {
+  return !!href && /^https?:\/\//.test(href);
+}
+
 export type MarkdownContentProps = {
   content: string;
   isLoading: boolean;
@@ -30,7 +35,7 @@ export function MarkdownContent({
 }: MarkdownContentProps) {
   const components = useMemo(() => {
     return {
-      a: (props: HTMLAttributes) => {
+      a: (props: AnchorHTMLAttributes) => {
         if (typeof props.children === "string") {
           const match = /^citation:(.+)$/.exec(props.children);
           if (match) {
@@ -38,7 +43,16 @@ export function MarkdownContent({
             return {text};
           }
         }
-        return ;
+        const { className, target, rel, ...rest } = props;
+        const external = isExternalUrl(props.href);
+        return (
+        );
       },
       ...componentsFromProps,
     };
diff --git a/skills/public/github-deep-research/SKILL.md b/skills/public/github-deep-research/SKILL.md
index fafdb83..2add158 100644
--- a/skills/public/github-deep-research/SKILL.md
+++ b/skills/public/github-deep-research/SKILL.md
@@ -147,5 +147,20 @@ Save report as: `research_{topic}_{YYYYMMDD}.md`
 3. **Triangulate claims** - 2+ independent sources
 4. **Note conflicting info** - Don't hide contradictions
 5. **Distinguish fact vs opinion** - Label speculation clearly
-6. **Reference sources** - Add source references near claims where applicable
-7. **Update as you go** - Don't wait until end to synthesize
+6. **CRITICAL: Always include inline citations** - Use `[citation:Title](URL)` format immediately after each claim from external sources
+7. **Extract URLs from search results** - web_search returns {title, url, snippet} - always use the URL field
+8. **Update as you go** - Don't wait until end to synthesize
+
+### Citation Examples
+
+**Good - With inline citations:**
+```markdown
+The project gained 10,000 stars within 3 months of launch [citation:GitHub Stats](https://github.com/owner/repo).
+The architecture uses LangGraph for workflow orchestration [citation:LangGraph Docs](https://langchain.com/langgraph).
+```
+
+**Bad - Without citations:**
+```markdown
+The project gained 10,000 stars within 3 months of launch.
+The architecture uses LangGraph for workflow orchestration.
+```
diff --git a/skills/public/github-deep-research/assets/report_template.md b/skills/public/github-deep-research/assets/report_template.md
index 57c9c1a..9ea92ae 100644
--- a/skills/public/github-deep-research/assets/report_template.md
+++ b/skills/public/github-deep-research/assets/report_template.md
@@ -30,6 +30,9 @@
 
 {EXECUTIVE_SUMMARY}
 
+**IMPORTANT**: Include inline citations using `[citation:Title](URL)` format after each claim. Example:
+"The project gained 10k stars in 3 months [citation:GitHub Stats](https://github.com/owner/repo)."
+
 ---
 
 ## Complete Chronological Timeline
@@ -56,6 +59,8 @@
 
 ## Key Analysis
 
+**IMPORTANT**: Support each analysis point with inline citations `[citation:Title](URL)`.
+
 ### {ANALYSIS_SECTION_1_TITLE}
 
 {ANALYSIS_SECTION_1_CONTENT}
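
The two link formats this patch introduces, and the way the frontend tells them apart, can be summarized in a small framework-free sketch. It is an illustration under assumptions, not code from the patch: `SearchResult`, `LinkKind`, `classifyLink`, `inlineCitation`, and `sourcesEntry` are hypothetical names, and only the `citation:` prefix regex and the `isExternalUrl` test are taken from the diff above.

```typescript
// Sketch only: SearchResult, LinkKind, classifyLink, inlineCitation and
// sourcesEntry are hypothetical names that do not appear in the patch.
type SearchResult = { title: string; url: string; snippet: string };

type LinkKind =
  | { kind: "citation"; text: string } // inline citation, shown as a pill/badge
  | { kind: "external" } // plain link to an http(s) URL
  | { kind: "internal" }; // plain link to anything else

// Same check as in the patch: only http(s) URLs count as external.
function isExternalUrl(href: string | undefined): boolean {
  return !!href && /^https?:\/\//.test(href);
}

// Decide how a markdown link should be treated, based on its text and href.
function classifyLink(text: string, href: string | undefined): LinkKind {
  const match = /^citation:(.+)$/.exec(text);
  if (match) {
    // Strip the "citation:" prefix so only the title is displayed.
    return { kind: "citation", text: match[1] ?? "" };
  }
  return isExternalUrl(href) ? { kind: "external" } : { kind: "internal" };
}

// Inline form, used in the report body right after a claim.
function inlineCitation(source: SearchResult): string {
  return `[citation:${source.title}](${source.url})`;
}

// Plain form, used for entries in the "Sources" section.
function sourcesEntry(source: SearchResult, description: string): string {
  return `[${source.title}](${source.url}) - ${description}`;
}
```

Under these assumptions, a link produced by `inlineCitation` classifies as `citation`, while one produced by `sourcesEntry` classifies as `external` or `internal` depending on its URL, which matches the split between pill-style and underlined links described in the commit message.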