fix: ensure web search is performed for research plans to fix #535 (#640)

* fix: ensure web search is performed for research plans to fix #535

  When using certain models (DeepSeek-V3, Qwen3, or local deployments), the agent framework failed to trigger web search tools, resulting in hallucinated data.

  This fix implements multiple safeguards:

  1. Add enforce_web_search configuration flag:
     - New config option to mandate web search in research plans
     - Defaults to False for backward compatibility

  2. Add plan validation function validate_and_fix_plan():
     - Validates that plans include at least one research step with web search
     - Enforces web search requirement when enabled
     - Adds default research step if plan has no steps

  3. Enhance coordinator_node fallback logic:
     - When model fails to call tools, fallback to planner instead of __end__
     - Ensures workflow continues even when tool calling fails
     - Logs detailed diagnostic info for debugging

  4. Update prompts for stricter requirements:
     - planner.md: Add MANDATORY web search requirement and clear warnings
     - coordinator.md: Add CRITICAL tool calling requirement
     - Emphasize consequences of missing web search (hallucinated data)

  5. Update tests to reflect new behavior:
     - test_coordinator_node_no_tool_calls: Expect planner instead of __end__
     - test_coordinator_empty_llm_response_corner_case: Same expectation

  Fixes #535 by ensuring:
  - Web search is always performed for research tasks
  - Workflow doesn't terminate on tool calling failures
  - Models with poor tool calling support can still proceed
  - No hallucinated data without real information gathering

* Update src/graph/nodes.py

  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/graph/nodes.py

  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* accept the review suggestion of getting configuration

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
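
For readers without the full diff of src/graph/nodes.py at hand, the plan validation described in item 2 might look roughly like the sketch below. This is not the committed implementation. The plan shape (a dict with a "steps" list), the step fields "step_type" and "need_search", the DEFAULT_RESEARCH_STEP constant, and the choice to add the default step regardless of the flag are all assumptions made for illustration.

```python
from typing import Any

# Hypothetical default step; field names are assumptions, not taken from the commit.
DEFAULT_RESEARCH_STEP: dict[str, Any] = {
    "step_type": "research",
    "need_search": True,
    "title": "Initial research",
    "description": "Gather information with web search before analysis.",
}


def validate_and_fix_plan(
    plan: dict[str, Any], enforce_web_search: bool = False
) -> dict[str, Any]:
    """Return the plan, repaired in place so that it can trigger web search."""
    steps = plan.get("steps") or []
    plan["steps"] = steps

    # A plan with no steps gets a default research step (assumed to apply
    # whether or not enforcement is enabled).
    if not steps:
        steps.append(dict(DEFAULT_RESEARCH_STEP))
        return plan

    # With the flag off, existing plans pass through unchanged.
    if not enforce_web_search:
        return plan

    # Already valid: at least one research step that uses web search.
    if any(s.get("step_type") == "research" and s.get("need_search") for s in steps):
        return plan

    # Otherwise switch on search for the first research step, if there is one.
    for step in steps:
        if step.get("step_type") == "research":
            step["need_search"] = True
            return plan

    # No research step at all: prepend the default one.
    steps.insert(0, dict(DEFAULT_RESEARCH_STEP))
    return plan
```

With enforce_web_search left at its default of False, a non-empty plan is returned untouched, which matches the backward-compatibility note in item 1.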
2026-04-22 21:54:45 +08:00 · 2025-10-22 08:27:06 +08:00
parent 2ff7d9adf8
commit add0a701f4
5 changed files with 95 additions and 8 deletions
--- a/tests/integration/test_nodes.py
+++ b/tests/integration/test_nodes.py
@@ -568,7 +568,7 @@ def test_coordinator_node_no_tool_calls(
    patch_handoff_to_planner,
    patch_logger,
 ):
-    # No tool calls, should goto __end__
+    # No tool calls, should fallback to planner (fix for issue #535)
    with (
        patch("src.graph.nodes.AGENT_LLM_MAP", {"coordinator": "basic"}),
        patch("src.graph.nodes.get_llm_by_type") as mock_get_llm,
@@ -579,7 +579,8 @@ def test_coordinator_node_no_tool_calls(
        mock_get_llm.return_value = mock_llm

        result = coordinator_node(mock_state_coordinator, MagicMock())
-        assert result.goto == "__end__"
+        # Should fallback to planner instead of __end__ to ensure workflow continues
+        assert result.goto == "planner"
        assert result.update["locale"] == "en-US"
        assert result.update["resources"] == ["resource1", "resource2"]

@@ -1535,7 +1536,7 @@ def test_coordinator_empty_llm_response_corner_case(mock_get_llm):

    This tests error handling when LLM fails to return any content or tool calls
    in the initial state (clarification_rounds=0). The system should gracefully
-    handle this by going to __end__ instead of crashing.
+    handle this by going to planner instead of crashing (fix for issue #535).

    Note: This is NOT a typical clarification workflow test, but rather tests
    fault tolerance when LLM misbehaves.
@@ -1563,6 +1564,6 @@ def test_coordinator_empty_llm_response_corner_case(mock_get_llm):
    # Call coordinator_node - should not crash
    result = coordinator_node(state, config)

-    # Should gracefully handle empty response by going to __end__
-    assert result.goto == "__end__"
+    # Should gracefully handle empty response by going to planner to ensure workflow continues
+    assert result.goto == "planner"
    assert result.update["locale"] == "en-US"
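
The behavior these updated assertions pin down can be summarized by a small routing helper. The sketch below is only an illustration of the fallback rule, not the code in coordinator_node: the function name choose_next_node, the tool-call dict shape, and the tool name "handoff_to_planner" (inferred from the patch_handoff_to_planner fixture) are assumptions.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def choose_next_node(
    tool_calls: Optional[list[dict]], fallback: str = "planner"
) -> str:
    """Pick the next graph node from the coordinator's tool calls."""
    names = [call.get("name") for call in tool_calls or []]

    # Normal path: the model explicitly handed off to the planner.
    if "handoff_to_planner" in names:
        return "planner"

    # The model returned no usable tool call (empty response, or a model with
    # weak tool-calling support). Log it and continue instead of terminating.
    logger.warning(
        "Coordinator produced no handoff tool call (got %s); falling back to %s",
        names,
        fallback,
    )
    return fallback
```

Passing fallback="__end__" reproduces the pre-fix behavior that the removed assertions expected, which makes the change in the two tests easy to compare side by side.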