mirror of https://gitee.com/wanwujie/deer-flow synced 2026-04-03 06:12:14 +08:00


lailoo 9809af1f26 feat: add citation/reference support to deep research reports (#1143)

* feat: add citation/reference support to deep research reports (#1141)

- Enhance lead agent system prompt with mandatory citation requirements
  after web_search/web_fetch tool usage
- Add citation examples and best practices to GitHub Deep Research skill
- Add citation hints to report template (Executive Summary, Key Analysis)
- Style regular markdown links in frontend for visual distinction
  (color, underline, hover effect)
- Fix TitleMiddleware being registered when title generation is disabled

* fix: address PR review comments

- Revert TitleMiddleware conditional registration (agent.py) to avoid
  sync/async incompatibility with DeerFlowClient
- Fix markdown link rendering: merge classNames instead of overwriting,
  only set target=_blank for external http(s) URLs
- Remove unrelated package.json/pnpm-lock.yaml changes

* fix: use plain markdown links in Sources section for cleaner rendering

Inline citations in report body use [citation:Title](URL) for pill/badge style.
Sources section uses plain [Title](URL) for simple underlined link style.

* fix(frontend): render plain links as underlined text in artifact markdown

Only links with citation: prefix render as Badge pills.
Regular links in Sources section now render as underlined text links.

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>

2026-03-17 09:51:08 +08:00

4.9 KiB



name	description
github-deep-research	Conduct multi-round deep research on any GitHub repository. Use when users request comprehensive analysis, timeline reconstruction, competitive analysis, or in-depth investigation of a GitHub project. Produces structured markdown reports with executive summaries, chronological timelines, metrics analysis, and Mermaid diagrams. Triggers on GitHub repository URLs or open source projects.

GitHub Deep Research Skill

Multi-round research combining the GitHub API, web_search, and web_fetch to produce comprehensive markdown reports.

Research Workflow

Round 1: GitHub API
Round 2: Discovery
Round 3: Deep Investigation
Round 4: Deep Dive

Core Methodology

Query Strategy

Broad to Narrow: Start with the GitHub API, move on to general queries, then refine based on findings.

Round 1: GitHub API
Round 2: "{topic} overview"
Round 3: "{topic} architecture", "{topic} vs alternatives"
Round 4: "{topic} issues", "{topic} roadmap", "site:github.com {topic}"
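
The per-round query templates above can be expanded mechanically. The following is a minimal sketch, not part of the skill itself; `ROUND_QUERIES` and `build_queries` are hypothetical names, and Round 1 has no templates because it uses the GitHub API instead of web_search.

```python
from __future__ import annotations

# Query templates per round, copied from the strategy above.
# Round 1 is omitted on purpose: it calls the GitHub API, not web_search.
ROUND_QUERIES = {
    2: ["{topic} overview"],
    3: ["{topic} architecture", "{topic} vs alternatives"],
    4: ["{topic} issues", "{topic} roadmap", "site:github.com {topic}"],
}


def build_queries(topic: str, round_number: int) -> list[str]:
    """Return the web_search queries for one round (hypothetical helper)."""
    templates = ROUND_QUERIES.get(round_number, [])
    return [template.format(topic=topic) for template in templates]


if __name__ == "__main__":
    for round_number in (2, 3, 4):
        print(round_number, build_queries("deer-flow", round_number))
```

These are starting points only; later queries should still be refined from what earlier rounds turn up.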

Source Prioritization:

Official docs/repos (highest weight)
Technical blogs (Medium, Dev.to)
News articles (verified outlets)
Community discussions (Reddit, HN)
Social media (lowest weight, for sentiment)
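
One way to apply this ordering is to sort collected URLs by a weight before reading them. The sketch below is an assumption-heavy illustration: the domain table, the numeric weights, and the neutral default are all invented for the example, and a domain alone cannot tell whether a page is an official source or a verified news outlet.

```python
from __future__ import annotations

from urllib.parse import urlparse

# Weights follow the priority order above (5 = highest, 1 = lowest).
# The domains listed here are illustrative assumptions, not a vetted list.
DOMAIN_WEIGHTS = {
    "github.com": 5,            # repos (treated as official for this sketch)
    "medium.com": 4,            # technical blogs
    "dev.to": 4,
    "reddit.com": 2,            # community discussions
    "news.ycombinator.com": 2,
    "twitter.com": 1,           # social media, sentiment only
    "x.com": 1,
}
DEFAULT_WEIGHT = 3  # assumed middle weight for unknown domains (e.g. news)


def source_weight(url: str) -> int:
    """Look up a weight for the URL's host, matching subdomains as well."""
    host = (urlparse(url).hostname or "").lower()
    for domain, weight in DOMAIN_WEIGHTS.items():
        if host == domain or host.endswith("." + domain):
            return weight
    return DEFAULT_WEIGHT


def rank_sources(urls: list[str]) -> list[str]:
    """Sort URLs from highest to lowest weight; ties keep their input order."""
    return sorted(urls, key=source_weight, reverse=True)


if __name__ == "__main__":
    print(rank_sources([
        "https://x.com/someone/status/1",
        "https://github.com/owner/repo",
        "https://dev.to/someone/post",
    ]))
```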

Research Rounds

Round 1 - GitHub API

Execute scripts/github_api.py directly; do not open it with read_file() first:

python /path/to/skill/scripts/github_api.py <owner> <repo> summary
python /path/to/skill/scripts/github_api.py <owner> <repo> readme
python /path/to/skill/scripts/github_api.py <owner> <repo> tree

Available commands (the last argument of github_api.py):

summary
info
readme
tree
languages
contributors
commits
issues
prs
releases
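
When the script is driven from Python rather than a shell, validating the command name before running anything avoids wasted calls. This is a minimal sketch; `build_argv` is a hypothetical helper, the script path is the same placeholder used above, and the script's output format is not assumed here.

```python
from __future__ import annotations

# The commands listed above; the command is the last argument of github_api.py.
COMMANDS = (
    "summary", "info", "readme", "tree", "languages",
    "contributors", "commits", "issues", "prs", "releases",
)


def build_argv(script_path: str, owner: str, repo: str, command: str) -> list[str]:
    """Build the argument list for one github_api.py call (hypothetical helper)."""
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    return ["python", script_path, owner, repo, command]


if __name__ == "__main__":
    # Placeholder path, owner, and repo, exactly as in the examples above.
    argv = build_argv("/path/to/skill/scripts/github_api.py", "<owner>", "<repo>", "summary")
    print(" ".join(argv))
```

The resulting list can be passed to `subprocess.run(argv, capture_output=True, text=True)` once the real script path is known.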

Round 2 - Discovery (3-5 web_search calls)

Get overview and identify key terms
Find official website/repo
Identify main players/competitors

Round 3 - Deep Investigation (5-10 web_search + web_fetch calls)

Technical architecture details
Timeline of key events
Community sentiment
Use web_fetch on valuable URLs for full content

Round 4 - Deep Dive

Analyze commit history for timeline
Review issues/PRs for feature evolution
Check contributor activity

Report Structure

Follow template in assets/report_template.md:

Metadata Block - Date, confidence level, subject
Executive Summary - 2-3 sentence overview with key metrics
Chronological Timeline - Phased breakdown with dates
Key Analysis Sections - Topic-specific deep dives
Metrics & Comparisons - Tables, growth charts
Strengths & Weaknesses - Balanced assessment
Sources - Categorized references
Confidence Assessment - Claims by confidence level
Methodology - Research approach used

Mermaid Diagrams

Include diagrams where helpful:

Timeline (Gantt):

gantt
    title Project Timeline
    dateFormat YYYY-MM-DD
    section Phase 1
    Development    :2025-01-01, 2025-03-01
    section Phase 2
    Launch         :2025-03-01, 2025-04-01
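
A Gantt block like the one above can be generated from dated phases, for example dates verified from commits or releases. This is a sketch only; `render_gantt` and the tuple layout are assumptions made for illustration.

```python
from __future__ import annotations

from datetime import date


def render_gantt(title: str, phases: list[tuple[str, str, date, date]]) -> str:
    """Render (section, task, start, end) tuples as Mermaid gantt text."""
    lines = ["gantt", f"    title {title}", "    dateFormat YYYY-MM-DD"]
    for section, task, start, end in phases:
        lines.append(f"    section {section}")
        # isoformat() yields YYYY-MM-DD, matching the dateFormat line.
        lines.append(f"    {task} :{start.isoformat()}, {end.isoformat()}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_gantt("Project Timeline", [
        ("Phase 1", "Development", date(2025, 1, 1), date(2025, 3, 1)),
        ("Phase 2", "Launch", date(2025, 3, 1), date(2025, 4, 1)),
    ]))
```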

Architecture (Flowchart):

flowchart TD
    A[User] --> B[Coordinator]
    B --> C[Planner]
    C --> D[Research Team]
    D --> E[Reporter]

Comparison (Pie/Bar):

pie title Market Share
    "Project A" : 45
    "Project B" : 30
    "Others" : 25

Confidence Scoring

Assign confidence based on source quality:

Confidence	Criteria
High (90%+)	Official docs, GitHub data, multiple corroborating sources
Medium (70-89%)	Single reliable source, recent articles
Low (50-69%)	Social media, unverified claims, outdated info
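
The bands in this table map directly to a lookup. In the sketch below, `confidence_label` is a hypothetical helper, and the "Unrated" label for scores under 50% is an assumption, since the table does not define that range.

```python
def confidence_label(score: float) -> str:
    """Map a 0-100 confidence score to the band names in the table above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 90:
        return "High"
    if score >= 70:
        return "Medium"
    if score >= 50:
        return "Low"
    # Assumption: the table stops at 50%, so anything lower gets its own label.
    return "Unrated"


if __name__ == "__main__":
    for score in (95, 75, 55, 30):
        print(score, confidence_label(score))
```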

Output

Save report as: research_{topic}_{YYYYMMDD}.md
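
A small helper can build this filename consistently. The sketch assumes the topic should be lowercased with runs of non-word characters collapsed to underscores; the skill does not specify any normalization, so treat that rule (and the name `report_filename`) as hypothetical.

```python
from __future__ import annotations

import re
from datetime import date


def report_filename(topic: str, on: date | None = None) -> str:
    """Build research_{topic}_{YYYYMMDD}.md for the given (or current) date."""
    day = on or date.today()
    # Assumed normalization: collapse non-word characters to "_" and lowercase.
    slug = re.sub(r"\W+", "_", topic).strip("_").lower()
    return f"research_{slug}_{day:%Y%m%d}.md"


if __name__ == "__main__":
    print(report_filename("deer-flow", date(2026, 3, 17)))
```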

Formatting Rules

Chinese content: Use full-width punctuation（，。：；！？）
Technical terms: Provide Wiki/doc URL on first mention
Tables: Use for metrics, comparisons
Code blocks: For technical examples
Mermaid: For architecture, timelines, flows

Best Practices

Start with official sources - Repo, docs, company blog
Verify dates from commits/PRs - More reliable than articles
Triangulate claims - 2+ independent sources
Note conflicting info - Don't hide contradictions
Distinguish fact vs opinion - Label speculation clearly
CRITICAL: Always include inline citations - Use [citation:Title](URL) format immediately after each claim from external sources
Extract URLs from search results - web_search returns {title, url, snippet} - always use the URL field
Update as you go - Don't wait until the end to synthesize

Citation Examples

Good - With inline citations:

The project gained 10,000 stars within 3 months of launch [citation:GitHub Stats](https://github.com/owner/repo).
The architecture uses LangGraph for workflow orchestration [citation:LangGraph Docs](https://langchain.com/langgraph).

Bad - Without citations:

The project gained 10,000 stars within 3 months of launch.
The architecture uses LangGraph for workflow orchestration.
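
The citation format can be produced straight from a search result. This is a minimal sketch that assumes a result is a dict with the {title, url, snippet} fields mentioned under Best Practices; `format_citation`, `format_source`, and `cite` are hypothetical helpers, and replacing square brackets in titles is an assumption made to keep the link syntax intact.

```python
def _safe_title(title: str) -> str:
    """Replace square brackets so the title cannot break the link syntax."""
    return title.replace("[", "(").replace("]", ")")


def format_citation(result: dict) -> str:
    """Inline citation for the report body: [citation:Title](URL)."""
    return f"[citation:{_safe_title(result['title'])}]({result['url']})"


def format_source(result: dict) -> str:
    """Plain link for the Sources section: [Title](URL)."""
    return f"[{_safe_title(result['title'])}]({result['url']})"


def cite(claim: str, result: dict) -> str:
    """Attach the citation directly after the claim, before the final period."""
    return f"{claim.rstrip('. ')} {format_citation(result)}."


if __name__ == "__main__":
    result = {
        "title": "GitHub Stats",
        "url": "https://github.com/owner/repo",
        "snippet": "",
    }
    print(cite("The project gained 10,000 stars within 3 months of launch.", result))
    print(format_source(result))
```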
