feat: add citation/reference support to deep research reports (#1143)

* feat: add citation/reference support to deep research reports (#1141)

- Enhance lead agent system prompt with mandatory citation requirements
  after web_search/web_fetch tool usage
- Add citation examples and best practices to GitHub Deep Research skill
- Add citation hints to report template (Executive Summary, Key Analysis)
- Style regular markdown links in frontend for visual distinction
  (color, underline, hover effect)
- Fix TitleMiddleware being registered when title generation is disabled

* fix: address PR review comments

- Revert TitleMiddleware conditional registration (agent.py) to avoid
  sync/async incompatibility with DeerFlowClient
- Fix markdown link rendering: merge classNames instead of overwriting,
  only set target=_blank for external http(s) URLs
- Remove unrelated package.json/pnpm-lock.yaml changes

* fix: use plain markdown links in Sources section for cleaner rendering

Inline citations in report body use [citation:Title](URL) for pill/badge style.
Sources section uses plain [Title](URL) for simple underlined link style.

* fix(frontend): render plain links as underlined text in artifact markdown

Only links with citation: prefix render as Badge pills.
Regular links in Sources section now render as underlined text links.

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
This commit is contained in:
lailoo
2026-03-17 09:51:08 +08:00
committed by GitHub
parent b1913a1902
commit 9809af1f26
6 changed files with 128 additions and 10 deletions

View File

@@ -257,15 +257,66 @@ You: "Deploying to staging..." [proceed]
</response_style>
<citations>
- When to Use: After web_search, include citations if applicable
- Format: Use Markdown link format `[citation:TITLE](URL)`
- Example:
**CRITICAL: Always include citations when using web search results**
- **When to Use**: MANDATORY after web_search, web_fetch, or any external information source
- **Format**: Use Markdown link format `[citation:TITLE](URL)` immediately after the claim
- **Placement**: Inline citations should appear right after the sentence or claim they support
- **Sources Section**: Also collect all citations in a "Sources" section at the end of reports
**Example - Inline Citations:**
```markdown
The key AI trends for 2026 include enhanced reasoning capabilities and multimodal integration
[citation:AI Trends 2026](https://techcrunch.com/ai-trends).
Recent breakthroughs in language models have also accelerated progress
[citation:OpenAI Research](https://openai.com/research).
```
**Example - Deep Research Report with Citations:**
```markdown
## Executive Summary
DeerFlow is an open-source AI agent framework that gained significant traction in early 2026
[citation:GitHub Repository](https://github.com/bytedance/deer-flow). The project focuses on
providing a production-ready agent system with sandbox execution and memory management
[citation:DeerFlow Documentation](https://deer-flow.dev/docs).
## Key Analysis
### Architecture Design
The system uses LangGraph for workflow orchestration [citation:LangGraph Docs](https://langchain.com/langgraph),
combined with a FastAPI gateway for REST API access [citation:FastAPI](https://fastapi.tiangolo.com).
## Sources
### Primary Sources
- [GitHub Repository](https://github.com/bytedance/deer-flow) - Official source code and documentation
- [DeerFlow Documentation](https://deer-flow.dev/docs) - Technical specifications
### Media Coverage
- [AI Trends 2026](https://techcrunch.com/ai-trends) - Industry analysis
```
**CRITICAL: Sources section format:**
- Every item in the Sources section MUST be a clickable markdown link with URL
- Use standard markdown link `[Title](URL) - Description` format (NOT `[citation:...]` format)
- The `[citation:Title](URL)` format is ONLY for inline citations within the report body
- ❌ WRONG: `GitHub 仓库 - 官方源代码和文档` (no URL!)
- ❌ WRONG in Sources: `[citation:GitHub Repository](url)` (citation prefix is for inline only!)
- ✅ RIGHT in Sources: `[GitHub Repository](https://github.com/bytedance/deer-flow) - 官方源代码和文档`
**WORKFLOW for Research Tasks:**
1. Use web_search to find sources → Extract {{title, url, snippet}} from results
2. Write content with inline citations: `claim [citation:Title](url)`
3. Collect all citations in a "Sources" section at the end
4. NEVER write claims without citations when sources are available
**CRITICAL RULES:**
- ❌ DO NOT write research content without citations
- ❌ DO NOT forget to extract URLs from search results
- ✅ ALWAYS add `[citation:Title](URL)` after claims from external sources
- ✅ ALWAYS include a "Sources" section listing all references
</citations>
<critical_reminders>

View File

@@ -38,7 +38,7 @@ import { checkCodeFile, getFileName } from "@/core/utils/files";
import { env } from "@/env";
import { cn } from "@/lib/utils";
import { CitationLink } from "../citations/citation-link";
import { ArtifactLink } from "../citations/artifact-link";
import { useThread } from "../messages/context";
import { Tooltip } from "../tooltip";
@@ -274,7 +274,7 @@ export function ArtifactFilePreview({
<Streamdown
className="size-full"
{...streamdownPlugins}
components={{ a: CitationLink }}
components={{ a: ArtifactLink }}
>
{content ?? ""}
</Streamdown>

View File

@@ -0,0 +1,33 @@
import type { AnchorHTMLAttributes } from "react";
import { cn } from "@/lib/utils";
import { CitationLink } from "./citation-link";
function isExternalUrl(href: string | undefined): boolean {
return !!href && /^https?:\/\//.test(href);
}
/** Link renderer for artifact markdown: citation: prefix → CitationLink, otherwise underlined text. */
export function ArtifactLink(props: AnchorHTMLAttributes<HTMLAnchorElement>) {
if (typeof props.children === "string") {
const match = /^citation:(.+)$/.exec(props.children);
if (match) {
const [, text] = match;
return <CitationLink {...props}>{text}</CitationLink>;
}
}
const { className, target, rel, ...rest } = props;
const external = isExternalUrl(props.href);
return (
<a
{...rest}
className={cn(
"text-primary underline decoration-primary/30 underline-offset-2 hover:decoration-primary/60 transition-colors",
className,
)}
target={target ?? (external ? "_blank" : undefined)}
rel={rel ?? (external ? "noopener noreferrer" : undefined)}
/>
);
}

View File

@@ -1,16 +1,21 @@
"use client";
import { useMemo } from "react";
import type { HTMLAttributes } from "react";
import type { AnchorHTMLAttributes } from "react";
import {
MessageResponse,
type MessageResponseProps,
} from "@/components/ai-elements/message";
import { streamdownPlugins } from "@/core/streamdown";
import { cn } from "@/lib/utils";
import { CitationLink } from "../citations/citation-link";
function isExternalUrl(href: string | undefined): boolean {
return !!href && /^https?:\/\//.test(href);
}
export type MarkdownContentProps = {
content: string;
isLoading: boolean;
@@ -30,7 +35,7 @@ export function MarkdownContent({
}: MarkdownContentProps) {
const components = useMemo(() => {
return {
a: (props: HTMLAttributes<HTMLAnchorElement>) => {
a: (props: AnchorHTMLAttributes<HTMLAnchorElement>) => {
if (typeof props.children === "string") {
const match = /^citation:(.+)$/.exec(props.children);
if (match) {
@@ -38,7 +43,16 @@ export function MarkdownContent({
return <CitationLink {...props}>{text}</CitationLink>;
}
}
return <a {...props} />;
const { className, target, rel, ...rest } = props;
const external = isExternalUrl(props.href);
return (
<a
{...rest}
className={cn("text-primary underline decoration-primary/30 underline-offset-2 hover:decoration-primary/60 transition-colors", className)}
target={target ?? (external ? "_blank" : undefined)}
rel={rel ?? (external ? "noopener noreferrer" : undefined)}
/>
);
},
...componentsFromProps,
};

View File

@@ -147,5 +147,20 @@ Save report as: `research_{topic}_{YYYYMMDD}.md`
3. **Triangulate claims** - 2+ independent sources
4. **Note conflicting info** - Don't hide contradictions
5. **Distinguish fact vs opinion** - Label speculation clearly
6. **Reference sources** - Add source references near claims where applicable
7. **Update as you go** - Don't wait until end to synthesize
6. **CRITICAL: Always include inline citations** - Use `[citation:Title](URL)` format immediately after each claim from external sources
7. **Extract URLs from search results** - web_search returns {title, url, snippet} - always use the URL field
8. **Update as you go** - Don't wait until end to synthesize
### Citation Examples
**Good - With inline citations:**
```markdown
The project gained 10,000 stars within 3 months of launch [citation:GitHub Stats](https://github.com/owner/repo).
The architecture uses LangGraph for workflow orchestration [citation:LangGraph Docs](https://langchain.com/langgraph).
```
**Bad - Without citations:**
```markdown
The project gained 10,000 stars within 3 months of launch.
The architecture uses LangGraph for workflow orchestration.
```

View File

@@ -30,6 +30,9 @@
{EXECUTIVE_SUMMARY}
**IMPORTANT**: Include inline citations using `[citation:Title](URL)` format after each claim. Example:
"The project gained 10k stars in 3 months [citation:GitHub Stats](https://github.com/owner/repo)."
---
## Complete Chronological Timeline
@@ -56,6 +59,8 @@
## Key Analysis
**IMPORTANT**: Support each analysis point with inline citations `[citation:Title](URL)`.
### {ANALYSIS_SECTION_1_TITLE}
{ANALYSIS_SECTION_1_CONTENT}