fix: move Key Citations to early position in reporter prompt to reduce URL hallucination (#859)

* fix: move Key Citations to early position in reporter prompt to reduce URL hallucination

Move the Key Citations section from position 6 (end of report) to position 2
(immediately after title) in the reporter prompt. When citations are placed at
the end of a long report, LLMs tend to forget real URLs from source material
and fabricate plausible-looking but non-existent URLs.

Changes to src/prompts/reporter.md:
- Move Key Citations from section 6 to section 2 (right after Title)
- Add explicit anti-hallucination instructions: only use URLs from provided
  source material, never fabricate or guess URLs
- Keep a repeated citation list at the end (section 7) for completeness
- Renumber all subsequent sections accordingly
- Update Notes section to reflect new structure

Tested with a real DeerFlow backend and DuckDuckGo search:
- Before: multiple hallucinated URLs in report citations
- After: significantly fewer hallucinated URLs in report citations
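
One way to make a before/after comparison like this repeatable is to check every
markdown link in a generated report against the URLs that were actually collected.
The following is a minimal sketch and is not part of this commit; the helper name
`find_unsourced_urls` and the sample report are hypothetical.

```python
import re
from typing import List, Set

# Matches the URL part of a markdown link such as [Source Title](https://...).
MARKDOWN_LINK_URL = re.compile(r"\]\((https?://[^)\s]+)\)")


def find_unsourced_urls(report: str, source_urls: Set[str]) -> List[str]:
    """Return linked URLs in `report` that do not appear in `source_urls`.

    Hypothetical helper: URLs are compared as exact strings, so a trailing
    slash or a changed query string counts as a mismatch.
    """
    unsourced: List[str] = []
    for url in MARKDOWN_LINK_URL.findall(report):
        if url not in source_urls and url not in unsourced:
            unsourced.append(url)
    return unsourced


# Illustrative data only; these URLs are placeholders.
sample_report = (
    "# Title\n\n"
    "- [Real source](https://example.com/a)\n\n"
    "- [Invented source](https://example.com/made-up)\n"
)
collected = {"https://example.com/a"}
print(find_unsourced_urls(sample_report, collected))
```

An empty result means every link in the report can be traced back to a collected
source. Exact string matching is deliberately strict, so normalizing URLs first
may be preferable in practice.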

Closes #825

* fix: move citations after observations in reporter_node to reduce URL hallucination

Previously, the citation message was appended BEFORE the observation messages,
so it was buried under potentially thousands of characters of research data.
By the time the LLM reached the end of the context and began generating the report,
it had 'forgotten' the real URLs and fabricated plausible-looking ones instead.

Now citations are appended AFTER compressed observations, placing them
closest to the LLM's generation point for maximum recall accuracy.
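
The reordering can be sketched in isolation as follows. This is a simplified
illustration under stated assumptions, not the code from reporter_node: messages
are plain dicts rather than HumanMessage objects, context compression is omitted,
and `build_invoke_messages` is a hypothetical helper name.

```python
from typing import Dict, List

Message = Dict[str, str]


def build_invoke_messages(
    base_messages: List[Message],
    observations: List[str],
    citation_list: str,
) -> List[Message]:
    """Assemble reporter input with the citation list placed last.

    Hypothetical helper: plain dicts stand in for message objects, and the
    observations are appended uncompressed.
    """
    messages = list(base_messages)
    for observation in observations:
        messages.append({"name": "observation", "content": observation})
    # Citations go AFTER the observations so that they sit closest to the
    # point where the model starts generating the report.
    if citation_list:
        messages.append({"name": "system", "content": citation_list})
    return messages


# Illustrative data only.
example = build_invoke_messages(
    base_messages=[{"name": "system", "content": "Write a report."}],
    observations=["long research notes A", "long research notes B"],
    citation_list="## Available Source References\n\n[1] https://example.com/a",
)
for message in example:
    print(message["name"], "|", message["content"][:40])
```

When no citations were collected, the empty string is falsy and no extra message
is appended, which mirrors the `if citation_list:` guard in the diff.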

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
大猫子
2026-02-14 15:21:24 +08:00
committed by GitHub
parent c95b2711c3
commit 13a25112b1
2 changed files with 31 additions and 17 deletions


@@ -853,7 +853,8 @@ def reporter_node(state: State, config: RunnableConfig):
     # Get collected citations for the report
     citations = state.get("citations", [])
-    # If we have collected citations, provide them to the reporter
+    # Build citation messages for the reporter
+    citation_list = ""
     if citations:
         citation_list = "\n\n## Available Source References (use these in References section):\n\n"
         for i, citation in enumerate(citations, 1):
@@ -869,13 +870,6 @@ def reporter_node(state: State, config: RunnableConfig):
         logger.info(f"Providing {len(citations)} collected citations to reporter")
-        invoke_messages.append(
-            HumanMessage(
-                content=citation_list,
-                name="system",
-            )
-        )
     observation_messages = []
     for observation in observations:
         observation_messages.append(
@@ -892,6 +886,17 @@ def reporter_node(state: State, config: RunnableConfig):
     )
     invoke_messages += compressed_state.get("messages", [])
+    # Append citations AFTER observations so they are closest to the LLM's
+    # generation point. This reduces the chance of the model "forgetting"
+    # real URLs and fabricating plausible-looking ones instead.
+    if citation_list:
+        invoke_messages.append(
+            HumanMessage(
+                content=citation_list,
+                name="system",
+            )
+        )
     logger.debug(f"Current invoke messages: {invoke_messages}")
     response = get_llm_by_type(AGENT_LLM_MAP["reporter"]).invoke(invoke_messages)
     response_content = response.content


@@ -60,23 +60,31 @@ Structure your report in the following format:
 - Always use the first level heading for the title.
 - A concise title for the report.
-2. **Key Points**
+2. **Key Citations**
+- List all references IMMEDIATELY after the title, before any analysis content.
+- This section MUST come early to ensure all URLs are accurate and verifiable.
+- Only use URLs that appear in the provided source material or 'Available Source References'.
+- Include an empty line between each citation for better readability.
+- Format: `- [Source Title](URL)`
+- NEVER fabricate or guess URLs. If a URL is not available, omit it.
+3. **Key Points**
 - A bulleted list of the most important findings (4-6 points).
 - Each point should be concise (1-2 sentences).
 - Focus on the most significant and actionable information.
-3. **Overview**
+4. **Overview**
 - A brief introduction to the topic (1-2 paragraphs).
 - Provide context and significance.
-4. **Detailed Analysis**
+5. **Detailed Analysis**
 - Organize information into logical sections with clear headings.
 - Include relevant subsections as needed.
 - Present information in a structured, easy-to-follow manner.
 - Highlight unexpected or particularly noteworthy details.
 - **Including images from the previous steps in the report is very helpful.**
-5. **Survey Note** (for more comprehensive reports)
+6. **Survey Note** (for more comprehensive reports)
 {% if report_style == "academic" %}
 - **Literature Review & Theoretical Framework**: Comprehensive analysis of existing research and theoretical foundations
 - **Methodology & Data Analysis**: Detailed examination of research methods and analytical approaches
@@ -132,10 +140,10 @@ Structure your report in the following format:
 - This section is optional for shorter reports.
 {% endif %}
-6. **Key Citations**
-- List all references at the end in link reference format.
-- Include an empty line between each citation for better readability.
-- Format: `- [Source Title](URL)`
+7. **Key Citations** (repeated at end for completeness)
+- Repeat the same citation list from section 2 at the end of the report.
+- This ensures references are accessible both at the beginning and end.
+- ONLY use URLs from the provided source material. NEVER fabricate URLs.
 # Writing Guidelines
@@ -372,9 +380,10 @@ Structure your report in the following format:
 - If uncertain about any information, acknowledge the uncertainty.
 - Only include verifiable facts from the provided source material.
-- Structure your report to include: Key Points, Overview, Detailed Analysis, Survey Note (optional), and References.
+- Structure your report to include: Key Citations, Key Points, Overview, Detailed Analysis, Survey Note (optional), and References.
 - Use inline citations [n] in the text where appropriate.
 - The number n must correspond to the source index in the provided 'Available Source References' list.
+- NEVER fabricate or guess URLs. Only use URLs that appear in the provided source material or 'Available Source References'.
 - Make the inline citation a link to the reference at the bottom using the format `[[n]](#ref-n)`.
 - In the References section at the end, list the sources using the format `[[n]](#citation-target-n) **[Title](URL)**`.
 - PRIORITIZE USING MARKDOWN TABLES for data presentation and comparison. Use tables whenever presenting comparative data, statistics, features, or options.