feat: add citation support in research report block and markdown

* feat: add citation support in research report block and markdown

- Enhanced ResearchReportBlock to fetch citations based on researchId and pass them to the Markdown component.
- Introduced CitationLink component to display citation metadata on hover for links in markdown.
- Implemented CitationCard and CitationList components for displaying citation details and lists.
- Updated Markdown component to handle citation links and inline citations.
- Created HoverCard component for displaying citation information in a tooltip-like manner.
- Modified store to manage citations, including setting and retrieving citations for ongoing research.
- Added CitationsEvent type to handle citations in chat events and updated Message type to include citations.

* fix(log): Enable the logging level when enabling the DEBUG environment variable (#793)

* fix(frontend): render all tool calls in the frontend #796 (#797)

* build(deps): bump jspdf from 3.0.4 to 4.0.0 in /web (#798)

Bumps [jspdf](https://github.com/parallax/jsPDF) from 3.0.4 to 4.0.0.
- [Release notes](https://github.com/parallax/jsPDF/releases)
- [Changelog](https://github.com/parallax/jsPDF/blob/master/RELEASE.md)
- [Commits](https://github.com/parallax/jsPDF/compare/v3.0.4...v4.0.0)

---
updated-dependencies:
- dependency-name: jspdf
  dependency-version: 4.0.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix(frontend): added the display of the 'analyst' message #800 (#801)

* fix: migrate from deprecated create_react_agent to langchain.agents.create_agent (#802)

* fix: migrate from deprecated create_react_agent to langchain.agents.create_agent

Fixes #799

- Replace deprecated langgraph.prebuilt.create_react_agent with
  langchain.agents.create_agent (LangGraph 1.0 migration)
- Add DynamicPromptMiddleware to handle dynamic prompt templates
  (replaces the 'prompt' callable parameter)
- Add PreModelHookMiddleware to handle pre-model hooks
  (replaces the 'pre_model_hook' parameter)
- Update AgentState import from langchain.agents in template.py
- Update tests to use the new API

* fix: update the code with review comments

* fix: Add runtime parameter to compress_messages method (#803)

* fix: Add runtime parameter to compress_messages method (#803)

    The compress_messages method was being called by PreModelHookMiddleware
    with both state and runtime parameters, but only accepted state parameter.
    This caused a TypeError when the middleware executed the pre_model_hook.

    Added optional runtime parameter to compress_messages signature to match
    the expected interface while maintaining backward compatibility.
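
The fix described above can be sketched as follows. This is a minimal illustration, not the project's code: `ContextManager`, `max_messages`, and `run_pre_model_hook` are hypothetical stand-ins, and the trimming logic is assumed only so the example has something to do. The point is that an optional `runtime` argument lets the same method serve both the two-argument middleware call and older one-argument callers.

```python
from typing import Any, Optional


class ContextManager:
    """Hypothetical stand-in for the class that owns compress_messages."""

    def __init__(self, max_messages: int) -> None:
        self.max_messages = max_messages

    def compress_messages(self, state: dict, runtime: Optional[Any] = None) -> dict:
        # `runtime` is accepted so a middleware can pass it; this sketch ignores it.
        messages = state.get("messages", [])
        return {**state, "messages": messages[-self.max_messages:]}


def run_pre_model_hook(hook, state: dict, runtime: Any) -> dict:
    """Mimics a middleware that always calls the hook with (state, runtime)."""
    return hook(state, runtime)


manager = ContextManager(max_messages=2)
state = {"messages": ["a", "b", "c"]}

# New call style: the middleware passes both arguments.
via_middleware = run_pre_model_hook(manager.compress_messages, state, runtime=object())
# Old call style: existing callers that pass only the state keep working.
direct_call = manager.compress_messages(state)
```

Without the default value on `runtime`, the first call would raise a `TypeError` for an unexpected positional argument, which is the failure the commit message reports.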

* Update the code with the review comments

* fix: Refactor citation handling and add comprehensive tests for citation features

* refactor: Clean up imports and formatting across citation modules

* fix: Add monkeypatch to clear AGENT_RECURSION_LIMIT in recursion limit tests

* feat: Enhance citation link handling in Markdown component

* fix: Exclude citations from finish reason handling in mergeMessage function

* fix(nodes): update message handling

* fix(citations): improve citation extraction and handling in event processing

* feat(citations): enhance citation extraction and handling with improved merging and normalization

* fix(reporter): update citation formatting instructions for clarity and consistency

* fix(reporter): prioritize using Markdown tables for data presentation and comparison
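
The `AGENT_RECURSION_LIMIT` test fix listed above addresses a common problem: a value left in the developer's environment can leak into tests that expect the default. The sketch below shows the idea with the standard library only. `get_recursion_limit`, `DEFAULT_RECURSION_LIMIT`, and the value 25 are assumptions made for illustration; in pytest the analogous call is `monkeypatch.delenv("AGENT_RECURSION_LIMIT", raising=False)`.

```python
import os
from unittest import mock

DEFAULT_RECURSION_LIMIT = 25  # assumed default, for illustration only


def get_recursion_limit() -> int:
    """Hypothetical reader: use AGENT_RECURSION_LIMIT if it is a positive integer."""
    raw = os.environ.get("AGENT_RECURSION_LIMIT")
    if raw is None:
        return DEFAULT_RECURSION_LIMIT
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_RECURSION_LIMIT
    return value if value > 0 else DEFAULT_RECURSION_LIMIT


def limit_with_clean_env() -> int:
    """Evaluate the limit with AGENT_RECURSION_LIMIT removed, then restore the environment."""
    env_without_limit = {
        key: value
        for key, value in os.environ.items()
        if key != "AGENT_RECURSION_LIMIT"
    }
    # patch.dict restores the original environment when the block exits.
    with mock.patch.dict(os.environ, env_without_limit, clear=True):
        return get_recursion_limit()
```

Clearing the variable inside the test, rather than relying on the caller's shell, keeps the default-value assertions independent of the machine they run on.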

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: LoftyComet <1277173875@qq。>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
LoftyComet
2026-01-24 17:49:13 +08:00
committed by GitHub
parent 612bddd3fb
commit b7f0f54aa0
22 changed files with 2125 additions and 29 deletions


@@ -14,6 +14,7 @@ from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.types import Command, interrupt
from src.agents import create_agent
from src.citations import extract_citations_from_messages, merge_citations
from src.config.agents import AGENT_LLM_MAP
from src.config.configuration import Configuration
from src.llms.llm import get_llm_by_type, get_llm_token_limit_by_type
@@ -715,6 +716,7 @@ def coordinator_node(
            "clarified_research_topic": clarified_topic,
            "is_clarification_complete": False,
            "goto": goto,
            "citations": state.get("citations", []),
            "__interrupt__": [("coordinator", response.content)],
        },
        goto=goto,
@@ -802,6 +804,7 @@ def coordinator_node(
            "clarification_history": clarification_history,
            "is_clarification_complete": goto != "coordinator",
            "goto": goto,
            "citations": state.get("citations", []),
        },
        goto=goto,
    )
@@ -822,14 +825,32 @@ def reporter_node(state: State, config: RunnableConfig):
    }
    invoke_messages = apply_prompt_template("reporter", input_, configurable, input_.get("locale", "en-US"))
    observations = state.get("observations", [])
    # Get collected citations for the report
    citations = state.get("citations", [])
    # Add a reminder about the new report format, citation style, and table usage
    invoke_messages.append(
        HumanMessage(
            content="IMPORTANT: Structure your report according to the format in the prompt. Remember to include:\n\n1. Key Points - A bulleted list of the most important findings\n2. Overview - A brief introduction to the topic\n3. Detailed Analysis - Organized into logical sections\n4. Survey Note (optional) - For more comprehensive reports\n5. Key Citations - List all references at the end\n\nFor citations, DO NOT include inline citations in the text. Instead, place all citations in the 'Key Citations' section at the end using the format: `- [Source Title](URL)`. Include an empty line between each citation for better readability.\n\nPRIORITIZE USING MARKDOWN TABLES for data presentation and comparison. Use tables whenever presenting comparative data, statistics, features, or options. Structure tables with clear headers and aligned columns. Example table format:\n\n| Feature | Description | Pros | Cons |\n|---------|-------------|------|------|\n| Feature 1 | Description 1 | Pros 1 | Cons 1 |\n| Feature 2 | Description 2 | Pros 2 | Cons 2 |",
            name="system",
    # If we have collected citations, provide them to the reporter
    if citations:
        citation_list = "\n\n## Available Source References (use these in References section):\n\n"
        for i, citation in enumerate(citations, 1):
            title = citation.get("title", "Untitled")
            url = citation.get("url", "")
            domain = citation.get("domain", "")
            description = citation.get("description", "")
            desc_truncated = description[:150] if description else ""
            citation_list += f"{i}. **{title}**\n - URL: {url}\n - Domain: {domain}\n"
            if desc_truncated:
                citation_list += f" - Summary: {desc_truncated}...\n"
            citation_list += "\n"
        logger.info(f"Providing {len(citations)} collected citations to reporter")
        invoke_messages.append(
            HumanMessage(
                content=citation_list,
                name="system",
            )
        )
    )
    observation_messages = []
    for observation in observations:
@@ -852,7 +873,10 @@ def reporter_node(state: State, config: RunnableConfig):
    response_content = response.content
    logger.info(f"reporter response: {response_content}")
    return {"final_report": response_content}
    return {
        "final_report": response_content,
        "citations": citations,  # Pass citations through to final state
    }
def research_team_node(state: State):
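
The citation block that `reporter_node` appends to the prompt can be isolated as a small pure function, which makes the output format easy to inspect. The sketch below is an assumption-laden restatement of the loop in the hunk above: `format_citation_list` is a hypothetical name, and the spacing inside the f-strings is copied from the diff as displayed here, which may have lost leading spaces.

```python
from typing import Any


def format_citation_list(citations: list[dict[str, Any]]) -> str:
    """Render citations as the numbered block handed to the reporter (hypothetical helper)."""
    text = "\n\n## Available Source References (use these in References section):\n\n"
    for i, citation in enumerate(citations, 1):
        title = citation.get("title", "Untitled")
        url = citation.get("url", "")
        domain = citation.get("domain", "")
        description = citation.get("description", "")
        text += f"{i}. **{title}**\n - URL: {url}\n - Domain: {domain}\n"
        if description:
            # Same 150-character cut as the diff; the ellipsis is appended
            # even when the description is shorter than 150 characters.
            text += f" - Summary: {description[:150]}...\n"
        text += "\n"
    return text


example = format_citation_list(
    [
        {"title": "Example", "url": "https://example.com/a", "domain": "example.com"},
        {"url": "https://example.com/b", "description": "Short note"},
    ]
)
```

An entry without a description simply omits the summary line, and an entry without a title falls back to "Untitled".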
@@ -1114,11 +1138,23 @@ async def _execute_agent_step(
        f"All tool results will be preserved and streamed to frontend."
    )
    # Extract citations from tool call results (web_search, crawl)
    existing_citations = state.get("citations", [])
    new_citations = extract_citations_from_messages(agent_messages)
    merged_citations = merge_citations(existing_citations, new_citations)
    if new_citations:
        logger.info(
            f"Extracted {len(new_citations)} new citations from {agent_name} agent. "
            f"Total citations: {len(merged_citations)}"
        )
    return Command(
        update={
            **preserve_state_meta_fields(state),
            "messages": agent_messages,
            "observations": observations + [response_content + validation_info],
            **preserve_state_meta_fields(state),
            "citations": merged_citations,  # Store merged citations based on existing state and new tool results
        },
        goto="research_team",
    )
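
The body of `merge_citations` is not part of this excerpt. As a rough illustration of what merging existing and newly extracted citations might involve, the sketch below deduplicates by URL and keeps first-seen order. The function name `merge_citations_sketch`, the URL-as-key choice, and the rule of only filling in missing fields are all assumptions, not the project's implementation.

```python
from typing import Any


def merge_citations_sketch(
    existing: list[dict[str, Any]], new: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Hypothetical merge: one entry per URL, in first-seen order."""
    merged: dict[str, dict[str, Any]] = {}
    for citation in [*existing, *new]:
        url = citation.get("url", "")
        if not url:
            continue  # assumption: an entry without a URL cannot be cited
        if url in merged:
            # Fill in fields the earlier entry lacked; never overwrite.
            for key, value in citation.items():
                if value and not merged[url].get(key):
                    merged[url][key] = value
        else:
            merged[url] = dict(citation)  # copy so the inputs stay untouched
    return list(merged.values())


existing = [{"url": "https://example.com/a", "title": "A"}]
new = [
    {"url": "https://example.com/a", "title": "Other", "description": "details"},
    {"url": "https://example.com/b", "title": "B"},
]
merged = merge_citations_sketch(existing, new)
```

Returning a new list rather than mutating `existing` fits the way the node writes the result back through `Command(update=...)`: the previous state value is left as it was.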


@@ -3,6 +3,7 @@
from dataclasses import field
from typing import Any
from langgraph.graph import MessagesState
@@ -27,6 +28,10 @@ class State(MessagesState):
    auto_accepted_plan: bool = False
    enable_background_investigation: bool = True
    background_investigation_results: str = None
    # Citation metadata collected during research
    # Format: List of citation dictionaries with url, title, description, etc.
    citations: list[dict[str, Any]] = field(default_factory=list)
    # Clarification state tracking (disabled by default)
    enable_clarification: bool = (