deer-flow

mirror of https://gitee.com/wanwujie/deer-flow synced 2026-04-03 22:32:12 +08:00

Author	SHA1	Message	Date
Willem Jiang	2a97170b6c	feat: add Serper search engine support (#762) * feat: add Serper search engine support * docs: update configuration guide and env example for Serper * test: add test case for Serper with missing API key	2025-12-15 23:04:26 +08:00
infoquest-byteplus	7ec9e45702	feat: support infoquest (#708) * support infoquest * support html checker * support html checker * change line break format * change line break format * change line break format * change line break format * change line break format * change line break format * change line break format * change line break format * Fix several critical issues in the codebase - Resolve crawler panic by improving error handling - Fix plan validation to prevent invalid configurations - Correct InfoQuest crawler JSON conversion logic * add test for infoquest * add test for infoquest * Add InfoQuest introduction to the README * add test for infoquest * fix readme for infoquest * fix readme for infoquest * resolve the conflict * resolve the conflict * resolve the conflict * Fix formatting of INFOQUEST in SearchEngine enum * Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Willem Jiang <143703838+willem-bd@users.noreply.github.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-12-02 08:16:35 +08:00
Willem Jiang	170c4eb33c	Upgrade langchain version to 1.x (#720) * fix: revert the part of patch of issue-710 to extract the content from the plan * Upgrade the ddgs for the new compatible version * Upgraded langchain to 1.1.0 and updated langchain-related packages to the new compatible versions * Update pyproject.toml Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-11-28 22:09:13 +08:00
Willem Jiang	bec97f02ae	fix: the crawling error when encountering PDF URLs (#707) * fix: the crawling error when encountering PDF URLs * Added the unit test for the new feature of crawl tool * fix: address the code review problems * fix: address the code review problems	2025-11-25 09:24:52 +08:00
Willem Jiang	975b344ca7	fix: resolve issue #651 - crawl error with None content handling (#652) * fix: resolve issue #651 - crawl error with None content handling Fixed issue #651 by adding comprehensive null-safety checks and error handling to the crawl system. The fix prevents the ‘TypeError: Incoming markup is of an invalid type: None’ crash by: 1. Validating HTTP responses from Jina API 2. Handling None/empty content at extraction stage 3. Adding fallback handling in Article markdown/message conversion 4. Improving error diagnostics with detailed logging 5. Adding 16 new tests with 100% coverage for critical paths * Update src/crawler/readability_extractor.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update src/crawler/article.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-10-24 17:06:54 +08:00
jimmyuconn1982	2001a7c223	Fix: clarification bugs - max rounds, locale passing, and over-clarification (#647) Fixes: Max rounds bug, locale passing bug, over-clarification issue * resolve Copilot spelling comments --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2025-10-24 16:43:39 +08:00
Willem Jiang	052490b116	fix: resolve issue #467 - message content validation and Tavily search error handling (#645) * fix: resolve issue #467 - message content validation and Tavily search error handling This commit implements a comprehensive fix for issue #467 where the application crashed with 'Field required: input.messages.3.content' error when generating reports. ## Root Cause Analysis The issue had multiple interconnected causes: 1. Tavily tool returned mixed types (lists/error strings) instead of consistent JSON 2. background_investigation_node didn't handle error cases properly, returning None 3. Missing message content validation before LLM calls 4. Insufficient error diagnostics for content-related errors ## Changes Made ### Part 1: Fix Tavily Search Tool (tavily_search_results_with_images.py) - Modified _run() and _arun() methods to return JSON strings instead of mixed types - Error responses now return JSON: {"error": repr(e)} - Successful responses return JSON string: json.dumps(cleaned_results) - Ensures tool results always have valid string content for ToolMessages ### Part 2: Fix background_investigation_node Error Handling (graph/nodes.py) - Initialize background_investigation_results to empty list instead of None - Added proper JSON parsing for string responses from Tavily tool - Handle error responses with explicit error logging - Always return valid JSON (empty list if error) instead of None ### Part 3: Add Message Content Validation (utils/context_manager.py) - New validate_message_content() function validates all messages before LLM calls - Ensures all messages have content attribute and valid string content - Converts complex types (lists, dicts) to JSON strings - Provides graceful fallback for messages with issues ### Part 4: Enhanced Error Diagnostics (_execute_agent_step in graph/nodes.py) - Call message validation before agent invocation - Add detailed logging for content-related errors - Log message types, content types, and lengths when validation fails - Helps with future debugging of similar issues ## Testing - All unit tests pass (395 tests) - Python syntax verified for all modified files - No breaking changes to existing functionality * test: update tests for issue #467 fixes Update test expectations to match the new implementation: - Tavily search tool now returns JSON strings instead of mixed types - background_investigation_node returns empty list [] for errors instead of None - All tests updated to verify the new behavior - All 391 tests pass successfully * Update src/graph/nodes.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-10-23 22:08:14 +08:00
Willem Jiang	9ece3fd9c3	fix: support additional Tavily search parameters via configuration to fix #548 (#643) * fix: support additional Tavily search parameters via configuration to fix #548 - Add include_answer, search_depth, include_raw_content, include_images, include_image_descriptions to SEARCH_ENGINE config - Update get_web_search_tool() to load these parameters from configuration with sensible defaults - Parameters are now properly passed to TavilySearchWithImages during initialization - This fixes 'got an unexpected keyword argument' errors when using web_search tool - Update tests to verify new parameters are correctly set * test: add comprehensive unit tests for web search configuration loading - Add test for custom configuration values (include_answer, search_depth, etc.) - Add test for empty configuration (all defaults) - Add test for image_descriptions logic when include_images is false - Add test for partial configuration - Add test for missing config file - Add test for multiple domains in include/exclude lists All 7 new tests pass and provide comprehensive coverage of configuration loading and parameter handling for Tavily search tool initialization. * test: verify all Tavily configuration parameters are optional Add 8 comprehensive tests to verify that all Tavily engine configuration parameters are truly optional: - test_tavily_with_no_search_engine_section: SEARCH_ENGINE section missing - test_tavily_with_completely_empty_config: Entire config missing - test_tavily_with_only_include_answer_param: Single param, rest default - test_tavily_with_only_search_depth_param: Single param, rest default - test_tavily_with_only_include_domains_param: Domain param, rest default - test_tavily_with_explicit_false_boolean_values: False values work correctly - test_tavily_with_empty_domain_lists: Empty lists handled correctly - test_tavily_all_parameters_optional_mix: Multiple missing params work These tests verify: - Tool creation never fails regardless of missing configuration - All parameters have sensible defaults - Boolean parameters can be explicitly set to False - Any combination of optional parameters works - Domain lists can be empty or omitted All 15 Tavily configuration tests pass successfully.	2025-10-22 22:56:02 +08:00
Willem Jiang	d30c4d00d3	fix: convert crawl_tool dict return to JSON string for type consistency (#636) Keep fixing #631 This pull request updates the crawl_tool function to return its results as a JSON string instead of a dictionary, and adjusts the unit tests accordingly to handle the new return type. The changes ensure consistent serialization of output and proper validation in tests.	2025-10-21 10:00:33 +08:00
Willem Jiang	e2ff765460	fix: correct image result format for OpenAI compatibility to fix #632 (#634) - Change image result type from 'image' to 'image_url' to match OpenAI API expectations - Wrap image URL in dict structure: {"url": "..."} instead of plain string - Update SearchResultPostProcessor to handle dict-based image_url during duplicate removal - Update tests to validate new image format This fixes the 400 error: Invalid value: 'image'. Supported values are: 'text', 'image_url'... Co-authored-by: Willem Jiang <143703838+willem-bd@users.noreply.github.com>	2025-10-20 23:14:09 +08:00
jimmyuconn1982	2510cc61de	feat: Add intelligent clarification feature in coordinate step for research queries (#613) * fix: support local models by making thought field optional in Plan model - Make thought field optional in Plan model to fix Pydantic validation errors with local models - Add Ollama configuration example to conf.yaml.example - Update documentation to include local model support - Improve planner prompt with better JSON format requirements Fixes local model integration issues where models like qwen3:14b would fail due to missing thought field in JSON output. * feat: Add intelligent clarification feature for research queries - Add multi-turn clarification process to refine vague research questions - Implement three-dimension clarification standard (Tech/App, Focus, Scope) - Add clarification state management in coordinator node - Update coordinator prompt with detailed clarification guidelines - Add UI settings to enable/disable clarification feature (disabled by default) - Update workflow to handle clarification rounds recursively - Add comprehensive test coverage for clarification functionality - Update documentation with clarification feature usage guide Key components: - src/graph/nodes.py: Core clarification logic and state management - src/prompts/coordinator.md: Detailed clarification guidelines - src/workflow.py: Recursive clarification handling - web/: UI settings integration - tests/: Comprehensive test coverage - docs/: Updated configuration guide * fix: Improve clarification conversation continuity - Add comprehensive conversation history to clarification context - Include previous exchanges summary in system messages - Add explicit guidelines for continuing rounds in coordinator prompt - Prevent LLM from starting new topics during clarification - Ensure topic continuity across clarification rounds Fixes issue where LLM would restart clarification instead of building upon previous exchanges. * fix: Add conversation history to clarification context * fix: resolve clarification feature message to planner, prompt, test issues - Optimize coordinator.md prompt template for better clarification flow - Simplify final message sent to planner after clarification - Fix API key assertion issues in test_search.py * fix: Add configurable max_clarification_rounds and comprehensive tests - Add max_clarification_rounds parameter for external configuration - Add comprehensive test cases for clarification feature in test_app.py - Fixes issues found during interactive mode testing where: - Recursive call failed due to missing initial_state parameter - Clarification exited prematurely at max rounds - Incorrect logging of max rounds reached * Move clarification tests to test_nodes.py and add max_clarification_rounds to zh.json	2025-10-14 13:35:57 +08:00
Fancy-hjyp	5f4eb38fdb	feat: add context compress (#590) * feat: Add context compress * feat: Add unit test * feat: add unit test for context manager * feat: add postprocessor param && code format * feat: add configuration guide * fix: fix the configuration_guide * fix: fix the unit test * fix: fix the default value * feat: add test and log for context_manager	2025-09-27 21:42:22 +08:00
zgjja	3b4e993531	feat: 1. replace black with ruff for formatting and import sorting (#489) 2. use tavily from `langchain-tavily` rather than the older one from `langchain-community` Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2025-08-17 22:57:23 +08:00
Willem Jiang	9e691ecf20	fix: added configuration of python_repl (#503) * fix: added configuration of python_repl * fix the lint and unit test errors * fix the lint and unit test errors * fix: the lint check errors	2025-08-06 14:27:03 +08:00
DanielWalnut	6d8853b7c7	refine the research prompt (#460)	2025-07-22 14:49:04 +08:00
Willem Jiang	3c46201ff0	fix: fix the lint check errors of the main branch (#403)	2025-07-12 14:43:25 +08:00
Willem Jiang	4c2fe2e7f5	test: add more unit tests of tools (#315) * test: add more test on test_tts.py * test: add unit test of search and retriever in tools * test: remove the main code of search.py * test: add the tavily_search unit test * reformat the code * test: add unit tests of tools * Added the pytest-asyncio dependency * added the license header of test_tavily_search_api_wrapper.py	2025-06-12 20:43:32 +08:00

17 Commits
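
Commit 052490b116 describes two related changes: the search tool always returns a JSON string (errors included), and message content is normalized to a string before LLM calls. The sketch below illustrates that behavior only; it is not the project's code. The names Message, normalize_content, validate_messages, run_search, and parse_results are hypothetical, chosen for this example, and the real implementation in the repository may differ.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Message:
    """Hypothetical stand-in for a chat message; not the project's class."""
    role: str
    content: Any = None


def normalize_content(content: Any) -> str:
    """Return content as a string, serializing lists and dicts to JSON."""
    if content is None:
        return ""  # fallback for missing content
    if isinstance(content, str):
        return content
    if isinstance(content, (list, dict)):
        try:
            return json.dumps(content, ensure_ascii=False)
        except (TypeError, ValueError):
            return str(content)  # fallback for non-serializable values
    return str(content)


def validate_messages(messages: list[Message]) -> list[Message]:
    """Return copies of the messages whose content is always a string."""
    return [Message(m.role, normalize_content(m.content)) for m in messages]


def run_search(search: Callable[[str], list[dict]], query: str) -> str:
    """Call a search function and always return a JSON string."""
    try:
        return json.dumps(search(query), ensure_ascii=False)
    except Exception as e:  # errors are reported as JSON, never raised
        return json.dumps({"error": repr(e)})


def parse_results(raw: str) -> list:
    """Parse a tool result; return an empty list on error or bad JSON."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if isinstance(data, dict) and "error" in data:
        return []
    return data if isinstance(data, list) else []
```

With this shape, a caller can pass parse_results(run_search(fn, query)) downstream without checking for None, which is the failure mode the commit message attributes to issue #467.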