feat: add citation/reference support to deep research reports (#1143)

* feat: add citation/reference support to deep research reports (#1141)

- Enhance lead agent system prompt with mandatory citation requirements
  after web_search/web_fetch tool usage
- Add citation examples and best practices to GitHub Deep Research skill
- Add citation hints to report template (Executive Summary, Key Analysis)
- Style regular markdown links in frontend for visual distinction
  (color, underline, hover effect)
- Fix TitleMiddleware being registered when title generation is disabled

* fix: address PR review comments

- Revert TitleMiddleware conditional registration (agent.py) to avoid
  sync/async incompatibility with DeerFlowClient
- Fix markdown link rendering: merge classNames instead of overwriting,
  only set target=_blank for external http(s) URLs
- Remove unrelated package.json/pnpm-lock.yaml changes

* fix: use plain markdown links in Sources section for cleaner rendering

Inline citations in report body use [citation:Title](URL) for pill/badge style.
Sources section uses plain [Title](URL) for simple underlined link style.

* fix(frontend): render plain links as underlined text in artifact markdown

Only links with citation: prefix render as Badge pills.
Regular links in Sources section now render as underlined text links.

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
This commit is contained in:
lailoo
2026-03-17 09:51:08 +08:00
committed by GitHub
parent b1913a1902
commit 9809af1f26
6 changed files with 128 additions and 10 deletions

View File

@@ -257,15 +257,66 @@ You: "Deploying to staging..." [proceed]
</response_style> </response_style>
<citations> <citations>
- When to Use: After web_search, include citations if applicable **CRITICAL: Always include citations when using web search results**
- Format: Use Markdown link format `[citation:TITLE](URL)`
- Example: - **When to Use**: MANDATORY after web_search, web_fetch, or any external information source
- **Format**: Use Markdown link format `[citation:TITLE](URL)` immediately after the claim
- **Placement**: Inline citations should appear right after the sentence or claim they support
- **Sources Section**: Also collect all citations in a "Sources" section at the end of reports
**Example - Inline Citations:**
```markdown ```markdown
The key AI trends for 2026 include enhanced reasoning capabilities and multimodal integration The key AI trends for 2026 include enhanced reasoning capabilities and multimodal integration
[citation:AI Trends 2026](https://techcrunch.com/ai-trends). [citation:AI Trends 2026](https://techcrunch.com/ai-trends).
Recent breakthroughs in language models have also accelerated progress Recent breakthroughs in language models have also accelerated progress
[citation:OpenAI Research](https://openai.com/research). [citation:OpenAI Research](https://openai.com/research).
``` ```
**Example - Deep Research Report with Citations:**
```markdown
## Executive Summary
DeerFlow is an open-source AI agent framework that gained significant traction in early 2026
[citation:GitHub Repository](https://github.com/bytedance/deer-flow). The project focuses on
providing a production-ready agent system with sandbox execution and memory management
[citation:DeerFlow Documentation](https://deer-flow.dev/docs).
## Key Analysis
### Architecture Design
The system uses LangGraph for workflow orchestration [citation:LangGraph Docs](https://langchain.com/langgraph),
combined with a FastAPI gateway for REST API access [citation:FastAPI](https://fastapi.tiangolo.com).
## Sources
### Primary Sources
- [GitHub Repository](https://github.com/bytedance/deer-flow) - Official source code and documentation
- [DeerFlow Documentation](https://deer-flow.dev/docs) - Technical specifications
### Media Coverage
- [AI Trends 2026](https://techcrunch.com/ai-trends) - Industry analysis
```
**CRITICAL: Sources section format:**
- Every item in the Sources section MUST be a clickable markdown link with URL
- Use standard markdown link `[Title](URL) - Description` format (NOT `[citation:...]` format)
- The `[citation:Title](URL)` format is ONLY for inline citations within the report body
- ❌ WRONG: `GitHub 仓库 - 官方源代码和文档` (no URL!)
- ❌ WRONG in Sources: `[citation:GitHub Repository](url)` (citation prefix is for inline only!)
- ✅ RIGHT in Sources: `[GitHub Repository](https://github.com/bytedance/deer-flow) - 官方源代码和文档`
**WORKFLOW for Research Tasks:**
1. Use web_search to find sources → Extract {{title, url, snippet}} from results
2. Write content with inline citations: `claim [citation:Title](url)`
3. Collect all citations in a "Sources" section at the end
4. NEVER write claims without citations when sources are available
**CRITICAL RULES:**
- ❌ DO NOT write research content without citations
- ❌ DO NOT forget to extract URLs from search results
- ✅ ALWAYS add `[citation:Title](URL)` after claims from external sources
- ✅ ALWAYS include a "Sources" section listing all references
</citations> </citations>
<critical_reminders> <critical_reminders>

View File

@@ -38,7 +38,7 @@ import { checkCodeFile, getFileName } from "@/core/utils/files";
import { env } from "@/env"; import { env } from "@/env";
import { cn } from "@/lib/utils"; import { cn } from "@/lib/utils";
import { CitationLink } from "../citations/citation-link"; import { ArtifactLink } from "../citations/artifact-link";
import { useThread } from "../messages/context"; import { useThread } from "../messages/context";
import { Tooltip } from "../tooltip"; import { Tooltip } from "../tooltip";
@@ -274,7 +274,7 @@ export function ArtifactFilePreview({
<Streamdown <Streamdown
className="size-full" className="size-full"
{...streamdownPlugins} {...streamdownPlugins}
components={{ a: CitationLink }} components={{ a: ArtifactLink }}
> >
{content ?? ""} {content ?? ""}
</Streamdown> </Streamdown>

View File

@@ -0,0 +1,33 @@
import type { AnchorHTMLAttributes } from "react";
import { cn } from "@/lib/utils";
import { CitationLink } from "./citation-link";
function isExternalUrl(href: string | undefined): boolean {
return !!href && /^https?:\/\//.test(href);
}
/** Link renderer for artifact markdown: citation: prefix → CitationLink, otherwise underlined text. */
export function ArtifactLink(props: AnchorHTMLAttributes<HTMLAnchorElement>) {
if (typeof props.children === "string") {
const match = /^citation:(.+)$/.exec(props.children);
if (match) {
const [, text] = match;
return <CitationLink {...props}>{text}</CitationLink>;
}
}
const { className, target, rel, ...rest } = props;
const external = isExternalUrl(props.href);
return (
<a
{...rest}
className={cn(
"text-primary underline decoration-primary/30 underline-offset-2 hover:decoration-primary/60 transition-colors",
className,
)}
target={target ?? (external ? "_blank" : undefined)}
rel={rel ?? (external ? "noopener noreferrer" : undefined)}
/>
);
}

View File

@@ -1,16 +1,21 @@
"use client"; "use client";
import { useMemo } from "react"; import { useMemo } from "react";
import type { HTMLAttributes } from "react"; import type { AnchorHTMLAttributes } from "react";
import { import {
MessageResponse, MessageResponse,
type MessageResponseProps, type MessageResponseProps,
} from "@/components/ai-elements/message"; } from "@/components/ai-elements/message";
import { streamdownPlugins } from "@/core/streamdown"; import { streamdownPlugins } from "@/core/streamdown";
import { cn } from "@/lib/utils";
import { CitationLink } from "../citations/citation-link"; import { CitationLink } from "../citations/citation-link";
function isExternalUrl(href: string | undefined): boolean {
return !!href && /^https?:\/\//.test(href);
}
export type MarkdownContentProps = { export type MarkdownContentProps = {
content: string; content: string;
isLoading: boolean; isLoading: boolean;
@@ -30,7 +35,7 @@ export function MarkdownContent({
}: MarkdownContentProps) { }: MarkdownContentProps) {
const components = useMemo(() => { const components = useMemo(() => {
return { return {
a: (props: HTMLAttributes<HTMLAnchorElement>) => { a: (props: AnchorHTMLAttributes<HTMLAnchorElement>) => {
if (typeof props.children === "string") { if (typeof props.children === "string") {
const match = /^citation:(.+)$/.exec(props.children); const match = /^citation:(.+)$/.exec(props.children);
if (match) { if (match) {
@@ -38,7 +43,16 @@ export function MarkdownContent({
return <CitationLink {...props}>{text}</CitationLink>; return <CitationLink {...props}>{text}</CitationLink>;
} }
} }
return <a {...props} />; const { className, target, rel, ...rest } = props;
const external = isExternalUrl(props.href);
return (
<a
{...rest}
className={cn("text-primary underline decoration-primary/30 underline-offset-2 hover:decoration-primary/60 transition-colors", className)}
target={target ?? (external ? "_blank" : undefined)}
rel={rel ?? (external ? "noopener noreferrer" : undefined)}
/>
);
}, },
...componentsFromProps, ...componentsFromProps,
}; };

View File

@@ -147,5 +147,20 @@ Save report as: `research_{topic}_{YYYYMMDD}.md`
3. **Triangulate claims** - 2+ independent sources 3. **Triangulate claims** - 2+ independent sources
4. **Note conflicting info** - Don't hide contradictions 4. **Note conflicting info** - Don't hide contradictions
5. **Distinguish fact vs opinion** - Label speculation clearly 5. **Distinguish fact vs opinion** - Label speculation clearly
6. **Reference sources** - Add source references near claims where applicable 6. **CRITICAL: Always include inline citations** - Use `[citation:Title](URL)` format immediately after each claim from external sources
7. **Update as you go** - Don't wait until end to synthesize 7. **Extract URLs from search results** - web_search returns {title, url, snippet} - always use the URL field
8. **Update as you go** - Don't wait until end to synthesize
### Citation Examples
**Good - With inline citations:**
```markdown
The project gained 10,000 stars within 3 months of launch [citation:GitHub Stats](https://github.com/owner/repo).
The architecture uses LangGraph for workflow orchestration [citation:LangGraph Docs](https://langchain.com/langgraph).
```
**Bad - Without citations:**
```markdown
The project gained 10,000 stars within 3 months of launch.
The architecture uses LangGraph for workflow orchestration.
```

View File

@@ -30,6 +30,9 @@
{EXECUTIVE_SUMMARY} {EXECUTIVE_SUMMARY}
**IMPORTANT**: Include inline citations using `[citation:Title](URL)` format after each claim. Example:
"The project gained 10k stars in 3 months [citation:GitHub Stats](https://github.com/owner/repo)."
--- ---
## Complete Chronological Timeline ## Complete Chronological Timeline
@@ -56,6 +59,8 @@
## Key Analysis ## Key Analysis
**IMPORTANT**: Support each analysis point with inline citations `[citation:Title](URL)`.
### {ANALYSIS_SECTION_1_TITLE} ### {ANALYSIS_SECTION_1_TITLE}
{ANALYSIS_SECTION_1_CONTENT} {ANALYSIS_SECTION_1_CONTENT}