mirror of
https://gitee.com/wanwujie/deer-flow
synced 2026-04-03 06:12:14 +08:00
* feat: add citation/reference support to deep research reports (#1141) - Enhance lead agent system prompt with mandatory citation requirements after web_search/web_fetch tool usage - Add citation examples and best practices to GitHub Deep Research skill - Add citation hints to report template (Executive Summary, Key Analysis) - Style regular markdown links in frontend for visual distinction (color, underline, hover effect) - Fix TitleMiddleware being registered when title generation is disabled * fix: address PR review comments - Revert TitleMiddleware conditional registration (agent.py) to avoid sync/async incompatibility with DeerFlowClient - Fix markdown link rendering: merge classNames instead of overwriting, only set target=_blank for external http(s) URLs - Remove unrelated package.json/pnpm-lock.yaml changes * fix: use plain markdown links in Sources section for cleaner rendering Inline citations in report body use [citation:Title](URL) for pill/badge style. Sources section uses plain [Title](URL) for simple underlined link style. * fix(frontend): render plain links as underlined text in artifact markdown Only links with citation: prefix render as Badge pills. Regular links in Sources section now render as underlined text links. --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
4.9 KiB
4.9 KiB
name, description
| name | description |
|---|---|
| github-deep-research | Conduct multi-round deep research on any GitHub Repo. Use when users request comprehensive analysis, timeline reconstruction, competitive analysis, or in-depth investigation of GitHub. Produces structured markdown reports with executive summaries, chronological timelines, metrics analysis, and Mermaid diagrams. Triggers on Github repository URL or open source projects. |
GitHub Deep Research Skill
Multi-round research combining GitHub API, web_search, web_fetch to produce comprehensive markdown reports.
Research Workflow
- Round 1: GitHub API
- Round 2: Discovery
- Round 3: Deep Investigation
- Round 4: Deep Dive
Core Methodology
Query Strategy
Broad to Narrow: Start with GitHub API, then general queries, refine based on findings.
Round 1: GitHub API
Round 2: "{topic} overview"
Round 3: "{topic} architecture", "{topic} vs alternatives"
Round 4: "{topic} issues", "{topic} roadmap", "site:github.com {topic}"
Source Prioritization:
- Official docs/repos (highest weight)
- Technical blogs (Medium, Dev.to)
- News articles (verified outlets)
- Community discussions (Reddit, HN)
- Social media (lowest weight, for sentiment)
Research Rounds
Round 1 - GitHub API
Directly execute scripts/github_api.py without read_file():
python /path/to/skill/scripts/github_api.py <owner> <repo> summary
python /path/to/skill/scripts/github_api.py <owner> <repo> readme
python /path/to/skill/scripts/github_api.py <owner> <repo> tree
Available commands (the last argument of github_api.py):
- summary
- info
- readme
- tree
- languages
- contributors
- commits
- issues
- prs
- releases
Round 2 - Discovery (3-5 web_search)
- Get overview and identify key terms
- Find official website/repo
- Identify main players/competitors
Round 3 - Deep Investigation (5-10 web_search + web_fetch)
- Technical architecture details
- Timeline of key events
- Community sentiment
- Use web_fetch on valuable URLs for full content
Round 4 - Deep Dive
- Analyze commit history for timeline
- Review issues/PRs for feature evolution
- Check contributor activity
Report Structure
Follow template in assets/report_template.md:
- Metadata Block - Date, confidence level, subject
- Executive Summary - 2-3 sentence overview with key metrics
- Chronological Timeline - Phased breakdown with dates
- Key Analysis Sections - Topic-specific deep dives
- Metrics & Comparisons - Tables, growth charts
- Strengths & Weaknesses - Balanced assessment
- Sources - Categorized references
- Confidence Assessment - Claims by confidence level
- Methodology - Research approach used
Mermaid Diagrams
Include diagrams where helpful:
Timeline (Gantt):
gantt
title Project Timeline
dateFormat YYYY-MM-DD
section Phase 1
Development :2025-01-01, 2025-03-01
section Phase 2
Launch :2025-03-01, 2025-04-01
Architecture (Flowchart):
flowchart TD
A[User] --> B[Coordinator]
B --> C[Planner]
C --> D[Research Team]
D --> E[Reporter]
Comparison (Pie/Bar):
pie title Market Share
"Project A" : 45
"Project B" : 30
"Others" : 25
Confidence Scoring
Assign confidence based on source quality:
| Confidence | Criteria |
|---|---|
| High (90%+) | Official docs, GitHub data, multiple corroborating sources |
| Medium (70-89%) | Single reliable source, recent articles |
| Low (50-69%) | Social media, unverified claims, outdated info |
Output
Save report as: research_{topic}_{YYYYMMDD}.md
Formatting Rules
- Chinese content: Use full-width punctuation(,。:;!?)
- Technical terms: Provide Wiki/doc URL on first mention
- Tables: Use for metrics, comparisons
- Code blocks: For technical examples
- Mermaid: For architecture, timelines, flows
Best Practices
- Start with official sources - Repo, docs, company blog
- Verify dates from commits/PRs - More reliable than articles
- Triangulate claims - 2+ independent sources
- Note conflicting info - Don't hide contradictions
- Distinguish fact vs opinion - Label speculation clearly
- CRITICAL: Always include inline citations - Use
[citation:Title](URL)format immediately after each claim from external sources - Extract URLs from search results - web_search returns {title, url, snippet} - always use the URL field
- Update as you go - Don't wait until end to synthesize
Citation Examples
Good - With inline citations:
The project gained 10,000 stars within 3 months of launch [citation:GitHub Stats](https://github.com/owner/repo).
The architecture uses LangGraph for workflow orchestration [citation:LangGraph Docs](https://langchain.com/langgraph).
Bad - Without citations:
The project gained 10,000 stars within 3 months of launch.
The architecture uses LangGraph for workflow orchestration.