fix: ensure web search is performed for research plans to fix #535 (#640)

* fix: ensure web search is performed for research plans to fix #535

          When using certain models (DeepSeek-V3, Qwen3, or local deployments), the
          agent framework failed to trigger web search tools, resulting in hallucinated
          data. This fix implements multiple safeguards:

          1. Add enforce_web_search configuration flag:
             - New config option to mandate web search in research plans
             - Defaults to False for backward compatibility

          2. Add plan validation function validate_and_fix_plan():
             - Validates that plans include at least one research step with web search
             - Enforces web search requirement when enabled
             - Adds default research step if plan has no steps

          3. Enhance coordinator_node fallback logic:
             - When the model fails to call tools, fall back to planner instead of __end__
             - Ensures workflow continues even when tool calling fails
             - Logs detailed diagnostic info for debugging

          4. Update prompts for stricter requirements:
             - planner.md: Add MANDATORY web search requirement and clear warnings
             - coordinator.md: Add CRITICAL tool calling requirement
             - Emphasize consequences of missing web search (hallucinated data)

          5. Update tests to reflect new behavior:
             - test_coordinator_node_no_tool_calls: Expect planner instead of __end__
             - test_coordinator_empty_llm_response_corner_case: Same expectation

          Fixes #535 by ensuring:
          - Web search is always performed for research tasks
          - Workflow doesn't terminate on tool calling failures
          - Models with poor tool calling support can still proceed
          - No hallucinated data without real information gathering
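
A minimal sketch of the validation described in items 1 and 2, not the code from this commit. The plan shape (a dict with a "steps" list whose entries carry "step_type" and "need_search") and the contents of the default step are assumptions made for illustration, as is the choice to apply repairs only when the flag is on:

```python
from copy import deepcopy

# Hypothetical default step; the title and description used in the repo may differ.
DEFAULT_RESEARCH_STEP = {
    "step_type": "research",
    "need_search": True,
    "title": "Initial research",
    "description": "Gather current information with web search.",
}


def validate_and_fix_plan(plan: dict, enforce_web_search: bool = False) -> dict:
    """Return a copy of `plan` that contains a research step with web search.

    Assumption: with enforce_web_search=False the plan is returned unchanged,
    which matches the "defaults to False for backward compatibility" note.
    """
    fixed = deepcopy(plan)  # never mutate the caller's plan
    if not enforce_web_search:
        return fixed

    steps = fixed.get("steps") or []
    fixed["steps"] = steps

    # Empty plan: add a default research step.
    if not steps:
        steps.append(dict(DEFAULT_RESEARCH_STEP))
        return fixed

    # Already valid: at least one research step that uses web search.
    if any(s.get("step_type") == "research" and s.get("need_search") for s in steps):
        return fixed

    # Otherwise turn on search for the first research step,
    # or convert the first step when no research step exists.
    target = next((s for s in steps if s.get("step_type") == "research"), steps[0])
    target["step_type"] = "research"
    target["need_search"] = True
    return fixed
```

Returning a copy keeps the planner's original output available for logging or comparison; an in-place fix would work just as well if that is not needed.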

* Update src/graph/nodes.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/graph/nodes.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* accept the review suggestion of getting configuration

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Willem Jiang
2025-10-22 08:27:06 +08:00
committed by GitHub
parent 2ff7d9adf8
commit add0a701f4
5 changed files with 95 additions and 8 deletions


@@ -568,7 +568,7 @@ def test_coordinator_node_no_tool_calls(
patch_handoff_to_planner,
patch_logger,
):
-# No tool calls, should goto __end__
+# No tool calls, should fallback to planner (fix for issue #535)
with (
patch("src.graph.nodes.AGENT_LLM_MAP", {"coordinator": "basic"}),
patch("src.graph.nodes.get_llm_by_type") as mock_get_llm,
@@ -579,7 +579,8 @@ def test_coordinator_node_no_tool_calls(
mock_get_llm.return_value = mock_llm
result = coordinator_node(mock_state_coordinator, MagicMock())
-assert result.goto == "__end__"
+# Should fallback to planner instead of __end__ to ensure workflow continues
+assert result.goto == "planner"
assert result.update["locale"] == "en-US"
assert result.update["resources"] == ["resource1", "resource2"]
@@ -1535,7 +1536,7 @@ def test_coordinator_empty_llm_response_corner_case(mock_get_llm):
This tests error handling when LLM fails to return any content or tool calls
in the initial state (clarification_rounds=0). The system should gracefully
-handle this by going to __end__ instead of crashing.
+handle this by going to planner instead of crashing (fix for issue #535).
Note: This is NOT a typical clarification workflow test, but rather tests
fault tolerance when LLM misbehaves.
@@ -1563,6 +1564,6 @@ def test_coordinator_empty_llm_response_corner_case(mock_get_llm):
# Call coordinator_node - should not crash
result = coordinator_node(state, config)
-# Should gracefully handle empty response by going to __end__
-assert result.goto == "__end__"
+# Should gracefully handle empty response by going to planner to ensure workflow continues
+assert result.goto == "planner"
assert result.update["locale"] == "en-US"
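
The behavior both updated tests assert can be sketched without the graph framework. This is an illustration only: `Route` and `route_coordinator_response` are hypothetical stand-ins for the object that `coordinator_node` returns and for its routing decision, and the log message is invented.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger(__name__)


@dataclass
class Route:
    """Hypothetical stand-in for the routing result (goto target plus state update)."""
    goto: str
    update: dict = field(default_factory=dict)


def route_coordinator_response(tool_calls, content="", locale="en-US", resources=None):
    """Decide the next node after the coordinator LLM responds.

    Before the fix, an empty tool-call list ended the workflow ("__end__").
    After the fix, the same case falls back to the planner.
    """
    update = {"locale": locale, "resources": list(resources or [])}
    if not tool_calls:
        # Diagnostic log so tool-calling failures stay visible.
        logger.warning(
            "Coordinator returned no tool calls (content length=%d); "
            "falling back to planner.",
            len(content or ""),
        )
    return Route(goto="planner", update=update)
```

For example, `route_coordinator_response([], resources=["resource1", "resource2"])` yields a `Route` whose `goto` is `"planner"`, mirroring the assertions in the two tests.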