* fix: strip <think> tags from LLM output to prevent thinking text leakage (#781)
Some models (e.g. DeepSeek-R1, QwQ via ollama) embed reasoning in
content using <think>...</think> tags instead of the separate
reasoning_content field. This causes thinking text to leak into
both streamed messages and the final report.
Fix at two layers:
- server/app.py: strip <think> tags in _create_event_stream_message
so ALL streamed content is filtered (coordinator, planner, etc.)
- graph/nodes.py: strip <think> tags in reporter_node before storing
final_report (which is not streamed through the event layer)
The regex uses a fast-path check ("<think>" in content) to avoid
unnecessary regex calls on normal content.
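A minimal sketch of the stripping described above, assuming a non-greedy pattern compiled with re.DOTALL. The names strip_think_tags and _THINK_TAG_RE are hypothetical; the commit only states that an isinstance check and a fast-path substring check run before the regex.

```python
import re

# Hypothetical pattern: non-greedy so separate <think> blocks are removed
# independently; DOTALL so reasoning that spans several lines still matches.
_THINK_TAG_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_think_tags(content):
    """Remove <think>...</think> blocks from string content."""
    # Non-string content (e.g. a list of content parts) passes through as-is.
    if not isinstance(content, str):
        return content
    # Fast path: skip the regex entirely when no opening tag is present.
    if "<think>" not in content:
        return content
    return _THINK_TAG_RE.sub("", content)
```

Note that this sketch leaves an unclosed <think> tag untouched, since the pattern requires a matching </think>.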
* refactor: add defensive check for think tag stripping and add reporter_node tests (#781)
- Add isinstance and fast-path check in reporter_node before regex, consistent with app.py
- Add TestReporterNodeThinkTagStripping with 5 test cases covering various scenarios
* chore: re-trigger review
* fix(config): Add support for MCP server configuration parameters
* refactor: rename sse_readtimeout to sse_read_timeout
* update the code to address review comments
* update the MCP documentation for the latest changes
* test: add unit tests for global connection pool (Issue #778)
- Add TestLifespanFunction class with 9 tests for lifespan management:
  - PostgreSQL/MongoDB pool initialization success/failure
  - Cleanup on shutdown
  - Skip initialization when not configured
- Add TestGlobalConnectionPoolUsage class with 4 tests:
  - Using global pools when available
  - Fallback to per-request connections
- Fix missing dict_row import in app.py (bug from PR #757)
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* feat: add enable_web_search config to disable web search (#681)
* fix: skip enforce_researcher_search validation when web search is disabled
- Return json.dumps([]) instead of empty string for consistency in background_investigation_node
- Add enable_web_search check to skip validation warning when user intentionally disabled web search
- Add warning log when researcher has no tools available
- Update tests to include new enable_web_search parameter
* fix: address Copilot review feedback
- Coordinate enforce_web_search with enable_web_search in validate_and_fix_plan
- Fix misleading comment in background_investigation_node
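A rough sketch of how the two flags might be coordinated, assuming the plan is a dict with a "steps" list whose items carry a "need_search" flag. The function name fix_plan_search_flags and the plan shape are assumptions for illustration, not the actual validate_and_fix_plan code.

```python
def fix_plan_search_flags(plan, enforce_web_search, enable_web_search):
    """Force at least one search step, but only when web search is enabled."""
    # When the user disabled web search, enforcement is skipped entirely,
    # so no validation warning or plan rewrite happens.
    if not (enforce_web_search and enable_web_search):
        return plan
    steps = plan.get("steps", [])
    # Enforcement: if no step requests a search, flag the first one.
    # This mutates the plan in place and returns it for convenience.
    if steps and not any(step.get("need_search") for step in steps):
        steps[0]["need_search"] = True
    return plan
```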
* docs: add warning about local RAG setup when disabling web search
* docs: add web search toggle section to configuration guide
* Update uv.lock to sync with pyproject.toml
* fix: update Interrupt object attribute access for LangGraph 1.0+ (#730)
The Interrupt class in LangGraph 1.0 no longer has the 'ns' attribute.
This change updates _create_interrupt_event() to use the new 'id'
attribute instead, with a fallback to thread_id for compatibility.
Changes:
- Replace event_data["__interrupt__"][0].ns[0] with interrupt.id
- Use getattr() with fallback for backward compatibility
- Update debug log message from 'ns=' to 'id='
- Add unit tests for _create_interrupt_event function
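The getattr() fallback can be sketched as follows. The helper name get_interrupt_id is hypothetical, and SimpleNamespace objects stand in for LangGraph Interrupt instances so the example runs without LangGraph installed.

```python
from types import SimpleNamespace


def get_interrupt_id(interrupt, thread_id):
    """Return the interrupt's id, falling back to thread_id when it is absent."""
    # getattr with a default avoids AttributeError on objects lacking 'id'.
    return getattr(interrupt, "id", thread_id)


# Stand-ins for an Interrupt with the new 'id' attribute and one without it.
new_style = SimpleNamespace(id="interrupt-123", value="Please review the plan")
old_style = SimpleNamespace(ns=["node:abc"], value="Please review the plan")
```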
* fix the unit test error and address review comment
---------
Co-authored-by: Willem Jiang <143703838+willem-bd@users.noreply.github.com>
* security: add log injection attack prevention with input sanitization
- Created src/utils/log_sanitizer.py to sanitize user-controlled input before logging
- Prevents log injection attacks using newlines, tabs, carriage returns, etc.
- Escapes dangerous characters: \n, \r, \t, \0, \x1b
- Provides specialized functions for different input types:
  - sanitize_log_input: general purpose sanitization
  - sanitize_thread_id: for user-provided thread IDs
  - sanitize_user_content: for user messages (more aggressive truncation)
  - sanitize_agent_name: for agent identifiers
  - sanitize_tool_name: for tool names
  - sanitize_feedback: for user interrupt feedback
  - create_safe_log_message: template-based safe message creation
- Updated src/server/app.py to sanitize all user input in logging:
  - Thread IDs from request parameter
  - Message content from user
  - Agent names and node information
  - Tool names and feedback
- Updated src/agents/tool_interceptor.py to sanitize:
  - Tool names during execution
  - User feedback during interrupt handling
  - Tool input data
- Added 29 comprehensive unit tests covering:
  - Classic newline injection attacks
  - Carriage return injection
  - Tab and null character injection
  - HTML/ANSI escape sequence injection
  - Combined multi-character attacks
  - Truncation and length limits
Fixes potential log forgery vulnerability where malicious users could inject
fake log entries via unsanitized input containing control characters.
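A minimal sketch of the escaping idea, assuming a plain character-replacement table and a length cap. The function name escape_for_log, the max_length default, and the "..." suffix are hypothetical and not taken from src/utils/log_sanitizer.py.

```python
# Control characters listed in the commit, mapped to visible escapes.
_ESCAPES = {
    "\n": "\\n",
    "\r": "\\r",
    "\t": "\\t",
    "\0": "\\0",
    "\x1b": "\\x1b",
}


def escape_for_log(value, max_length=500):
    """Escape control characters and truncate so input stays on one log line."""
    text = str(value)
    for raw, escaped in _ESCAPES.items():
        text = text.replace(raw, escaped)
    # Truncate after escaping; the cap is an assumed default.
    if len(text) > max_length:
        text = text[:max_length] + "..."
    return text
```

With this, an input such as "abc\n[INFO] fake entry" is logged as a single line containing a literal \n rather than starting a forged entry.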
- Implement index-based grouping of tool call chunks in _process_tool_call_chunks()
- Add _validate_tool_call_chunks() for debug logging and validation
- Enhance _process_message_chunk() with tool call ID validation and boundary detection
- Add comprehensive unit tests (17 tests) for tool call chunk processing
- Fix issue where tool names were incorrectly concatenated (e.g., 'web_searchweb_search')
- Ensure chunks from different tool calls (different indices) remain properly separated
- Add detailed logging for debugging tool call streaming issues
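The index-based grouping can be sketched as below, assuming each chunk is a dict with "index", "id", "name", and "args" keys. The function name group_tool_call_chunks is hypothetical and this is not the actual _process_tool_call_chunks() implementation.

```python
def group_tool_call_chunks(chunks):
    """Merge streamed tool call chunks that share the same 'index'."""
    grouped = {}
    for chunk in chunks:
        index = chunk.get("index")
        entry = grouped.setdefault(
            index, {"index": index, "id": None, "name": "", "args": ""}
        )
        # Keep the first non-empty name instead of concatenating, which is
        # what would produce values like 'web_searchweb_search'.
        if chunk.get("name") and not entry["name"]:
            entry["name"] = chunk["name"]
        if chunk.get("id") and entry["id"] is None:
            entry["id"] = chunk["id"]
        # Argument fragments are partial JSON strings, so they are appended.
        entry["args"] += chunk.get("args") or ""
    # Dicts preserve insertion order, so calls come back in first-seen order.
    return list(grouped.values())
```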
* update the code per review suggestions
* fix: support local models by making thought field optional in Plan model
- Make thought field optional in Plan model to fix Pydantic validation errors with local models
- Add Ollama configuration example to conf.yaml.example
- Update documentation to include local model support
- Improve planner prompt with better JSON format requirements
Fixes local model integration issues where models like qwen3:14b would fail
due to missing thought field in JSON output.
* feat: Add intelligent clarification feature for research queries
- Add multi-turn clarification process to refine vague research questions
- Implement three-dimension clarification standard (Tech/App, Focus, Scope)
- Add clarification state management in coordinator node
- Update coordinator prompt with detailed clarification guidelines
- Add UI settings to enable/disable clarification feature (disabled by default)
- Update workflow to handle clarification rounds recursively
- Add comprehensive test coverage for clarification functionality
- Update documentation with clarification feature usage guide
Key components:
- src/graph/nodes.py: Core clarification logic and state management
- src/prompts/coordinator.md: Detailed clarification guidelines
- src/workflow.py: Recursive clarification handling
- web/: UI settings integration
- tests/: Comprehensive test coverage
- docs/: Updated configuration guide
* fix: Improve clarification conversation continuity
- Add comprehensive conversation history to clarification context
- Include previous exchanges summary in system messages
- Add explicit guidelines for continuing rounds in coordinator prompt
- Prevent LLM from starting new topics during clarification
- Ensure topic continuity across clarification rounds
Fixes issue where LLM would restart clarification instead of building upon previous exchanges.
* fix: Add conversation history to clarification context
* fix: resolve clarification feature issues in the planner message, prompt, and tests
- Optimize coordinator.md prompt template for better clarification flow
- Simplify final message sent to planner after clarification
- Fix API key assertion issues in test_search.py
* fix: Add configurable max_clarification_rounds and comprehensive tests
- Add max_clarification_rounds parameter for external configuration
- Add comprehensive test cases for clarification feature in test_app.py
- Fixes issues found during interactive mode testing where:
- Recursive call failed due to missing initial_state parameter
- Clarification exited prematurely at max rounds
- Incorrect logging of max rounds reached
* Move clarification tests to test_nodes.py and add max_clarification_rounds to zh.json
* fix: add max_clarification_rounds parameter passing from frontend to backend
- Add max_clarification_rounds parameter in store.ts sendMessage function
- Add max_clarification_rounds type definition in chat.ts
- Ensure frontend settings page clarification rounds are correctly passed to backend
* fix: refine clarification workflow state handling and coverage
- Add clarification history reconstruction
- Fix clarified topic accumulation
- Add clarified_research_topic state field
- Preserve clarification state in recursive calls
- Add comprehensive test coverage
* refactor: optimize coordinator logic and type annotations
- Simplify handoff topic logic in coordinator_node
- Update type annotations from Tuple to tuple
- Improve code readability and maintainability
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* feat: disable the MCP server configuration by default
* Fixed the lint and test errors
* fix the lint error
* feat: update the MCP config documentation and tests
* fixed the lint errors
* test: add unit tests in server
* test: add unit tests of app.py in server
* test: reformat the code
* test: add more tests to cover the exception part
* test: add more tests on the server app part
* fix: don't expose exception details to the client
* test: try to fix the CI test
* fix: keep the TTS API call from exposing internal information
* Fixed the unit test errors
* Fixed the lint error