deer-flow/tests/unit/citations/test_citations.py
LoftyComet b7f0f54aa0 feat: add citation support in research report block and markdown
* feat: add citation support in research report block and markdown

- Enhanced ResearchReportBlock to fetch citations based on researchId and pass them to the Markdown component.
- Introduced CitationLink component to display citation metadata on hover for links in markdown.
- Implemented CitationCard and CitationList components for displaying citation details and lists.
- Updated Markdown component to handle citation links and inline citations.
- Created HoverCard component for displaying citation information in a tooltip-like manner.
- Modified store to manage citations, including setting and retrieving citations for ongoing research.
- Added CitationsEvent type to handle citations in chat events and updated Message type to include citations.

* fix(log): Enable the logging level when enabling the DEBUG environment variable (#793)

* fix(frontend): render all tool calls in the frontend #796 (#797)

* build(deps): bump jspdf from 3.0.4 to 4.0.0 in /web (#798)

Bumps [jspdf](https://github.com/parallax/jsPDF) from 3.0.4 to 4.0.0.
- [Release notes](https://github.com/parallax/jsPDF/releases)
- [Changelog](https://github.com/parallax/jsPDF/blob/master/RELEASE.md)
- [Commits](https://github.com/parallax/jsPDF/compare/v3.0.4...v4.0.0)

---
updated-dependencies:
- dependency-name: jspdf
  dependency-version: 4.0.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix(frontend): add the display of the 'analyst' message #800 (#801)

* fix: migrate from deprecated create_react_agent to langchain.agents.create_agent (#802)

* fix: migrate from deprecated create_react_agent to langchain.agents.create_agent

Fixes #799

- Replace deprecated langgraph.prebuilt.create_react_agent with
  langchain.agents.create_agent (LangGraph 1.0 migration)
- Add DynamicPromptMiddleware to handle dynamic prompt templates
  (replaces the 'prompt' callable parameter)
- Add PreModelHookMiddleware to handle pre-model hooks
  (replaces the 'pre_model_hook' parameter)
- Update AgentState import from langchain.agents in template.py
- Update tests to use the new API

* fix: update the code with review comments

* fix: Add runtime parameter to compress_messages method (#803)

* fix: Add runtime parameter to compress_messages method (#803)

    The compress_messages method was being called by PreModelHookMiddleware
    with both state and runtime parameters, but only accepted state parameter.
    This caused a TypeError when the middleware executed the pre_model_hook.

    Added optional runtime parameter to compress_messages signature to match
    the expected interface while maintaining backward compatibility.

* Update the code with the review comments

* fix: Refactor citation handling and add comprehensive tests for citation features

* refactor: Clean up imports and formatting across citation modules

* fix: Add monkeypatch to clear AGENT_RECURSION_LIMIT in recursion limit tests

* feat: Enhance citation link handling in Markdown component

* fix: Exclude citations from finish reason handling in mergeMessage function

* fix(nodes): update message handling

* fix(citations): improve citation extraction and handling in event processing

* feat(citations): enhance citation extraction and handling with improved merging and normalization

* fix(reporter): update citation formatting instructions for clarity and consistency

* fix(reporter): prioritize using Markdown tables for data presentation and comparison

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: LoftyComet <1277173875@qq。>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-24 17:49:13 +08:00
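
The #803 fix listed in the commit message above widens `compress_messages` so that it accepts an optional `runtime` argument while older one-argument callers keep working. A minimal sketch of that pattern follows. The `MessageCompressor` class, its `max_messages` setting, and the trimming rule are hypothetical stand-ins for illustration, not code from the repository.

```python
from typing import Any, Optional


class MessageCompressor:
    """Hypothetical stand-in for the object that owns compress_messages."""

    def __init__(self, max_messages: int = 3) -> None:
        self.max_messages = max_messages

    def compress_messages(self, state: dict, runtime: Optional[Any] = None) -> dict:
        # `runtime` is accepted so a middleware can call hook(state, runtime);
        # it defaults to None so existing hook(state) callers are unaffected.
        messages = state.get("messages", [])
        # Assumed trimming rule for the sketch: keep only the newest messages.
        return {**state, "messages": messages[-self.max_messages:]}


compressor = MessageCompressor(max_messages=2)
state = {"messages": ["a", "b", "c"]}

old_style = compressor.compress_messages(state)            # pre-fix call shape
new_style = compressor.compress_messages(state, object())  # middleware call shape
```

Both call shapes return the same trimmed state here, which is the backward-compatibility property the commit message describes.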

137 lines
4.7 KiB
Python

# Copyright (c) 2025 Bytedance Ltd. and/or its affiliates
# SPDX-License-Identifier: MIT

from langchain_core.messages import ToolMessage

from src.citations.collector import CitationCollector
from src.citations.extractor import (
    _extract_domain,
    citations_to_markdown_references,
    extract_citations_from_messages,
    merge_citations,
)
from src.citations.formatter import CitationFormatter
from src.citations.models import Citation, CitationMetadata


class TestCitationMetadata:
    def test_initialization(self):
        meta = CitationMetadata(
            url="https://example.com/page",
            title="Example Page",
            description="An example description",
        )
        assert meta.url == "https://example.com/page"
        assert meta.title == "Example Page"
        assert meta.description == "An example description"
        assert meta.domain == "example.com"  # Auto-extracted in post_init

    def test_id_generation(self):
        meta = CitationMetadata(url="https://example.com", title="Test")
        # Just check it's a non-empty string, length 12
        assert len(meta.id) == 12
        assert isinstance(meta.id, str)

    def test_to_dict(self):
        meta = CitationMetadata(
            url="https://example.com", title="Test", relevance_score=0.8
        )
        data = meta.to_dict()
        assert data["url"] == "https://example.com"
        assert data["title"] == "Test"
        assert data["relevance_score"] == 0.8
        assert "id" in data


class TestCitation:
    def test_citation_wrapper(self):
        meta = CitationMetadata(url="https://example.com", title="Test")
        citation = Citation(number=1, metadata=meta)

        assert citation.number == 1
        assert citation.url == "https://example.com"
        assert citation.title == "Test"
        assert citation.to_markdown_reference() == "[Test](https://example.com)"
        assert citation.to_numbered_reference() == "[1] Test - https://example.com"


class TestExtractor:
    def test_extract_from_tool_message_web_search(self):
        import json

        search_result = {
            "results": [
                {
                    "url": "https://example.com/1",
                    "title": "Result 1",
                    "content": "Content 1",
                    "score": 0.9,
                }
            ]
        }
        # The tool message content is the search result serialized as JSON
        msg = ToolMessage(
            content=json.dumps(search_result),
            tool_call_id="call_1",
            name="web_search",
        )

        citations = extract_citations_from_messages([msg])
        assert len(citations) == 1
        assert citations[0]["url"] == "https://example.com/1"
        assert citations[0]["title"] == "Result 1"

    def test_extract_domain(self):
        assert _extract_domain("https://www.example.com/path") == "www.example.com"
        assert _extract_domain("http://example.org") == "example.org"

    def test_merge_citations(self):
        existing = [{"url": "https://a.com", "title": "A", "relevance_score": 0.5}]
        new = [
            {"url": "https://b.com", "title": "B", "relevance_score": 0.6},
            {
                "url": "https://a.com",
                "title": "A New",
                "relevance_score": 0.7,
            },  # Better score for A
        ]
        merged = merge_citations(existing, new)
        assert len(merged) == 2

        # Check A was updated
        a_citation = next(c for c in merged if c["url"] == "https://a.com")
        assert a_citation["relevance_score"] == 0.7

        # Check B is present
        b_citation = next(c for c in merged if c["url"] == "https://b.com")
        assert b_citation["title"] == "B"

    def test_citations_to_markdown(self):
        citations = [{"url": "https://a.com", "title": "A", "description": "Desc A"}]
        md = citations_to_markdown_references(citations)
        assert "## Key Citations" in md
        assert "- [A](https://a.com)" in md


class TestCollector:
    def test_add_citations(self):
        collector = CitationCollector()
        results = [
            {"url": "https://example.com", "title": "Example", "content": "Test"}
        ]
        added = collector.add_from_search_results(results, query="test")

        assert len(added) == 1
        assert added[0].url == "https://example.com"
        assert collector.count == 1


class TestFormatter:
    def test_format_inline(self):
        formatter = CitationFormatter(style="superscript")
        assert formatter.format_inline_marker(1) == "¹"
        assert formatter.format_inline_marker(12) == "¹²"
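
The `test_merge_citations` case above pins down one observable behaviour: citations are deduplicated by URL, and a duplicate with a higher `relevance_score` wins. The sketch below shows one way that contract could be met. The helper name `merge_by_url` and the rule of replacing the whole entry are assumptions made for illustration, not the actual `merge_citations` implementation in `src.citations.extractor`.

```python
from typing import Dict, List


def merge_by_url(existing: List[dict], new: List[dict]) -> List[dict]:
    """Hypothetical helper: merge citation dicts, keeping one entry per URL."""
    merged: Dict[str, dict] = {c["url"]: c for c in existing}
    for citation in new:
        current = merged.get(citation["url"])
        # Assumption: a duplicate URL replaces the stored entry only when its
        # relevance_score is strictly higher; missing scores count as 0.
        if current is None or citation.get("relevance_score", 0) > current.get(
            "relevance_score", 0
        ):
            merged[citation["url"]] = citation
    # Dicts preserve insertion order, so first-seen URLs keep their position.
    return list(merged.values())


existing = [{"url": "https://a.com", "title": "A", "relevance_score": 0.5}]
new = [
    {"url": "https://b.com", "title": "B", "relevance_score": 0.6},
    {"url": "https://a.com", "title": "A New", "relevance_score": 0.7},
]
merged = merge_by_url(existing, new)
```

With the same inputs as the test, `merged` holds two entries: the `https://a.com` entry now carries the 0.7 score, and the `https://b.com` entry is appended after it.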