feat: support infoquest (#708)

* support infoquest

* support html checker

* support html checker

* change line break format

* change line break format

* change line break format

* change line break format

* change line break format

* change line break format

* change line break format

* change line break format

* Fix several critical issues in the codebase
- Resolve crawler panic by improving error handling
- Fix plan validation to prevent invalid configurations
- Correct InfoQuest crawler JSON conversion logic

* add test for infoquest

* add test for infoquest

* Add InfoQuest introduction to the README

* add test for infoquest

* fix readme for infoquest

* fix readme for infoquest

* resolve the conflict

* resolve the conflict

* resolve the conflict

* Fix formatting of INFOQUEST in SearchEngine enum

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Willem Jiang <143703838+willem-bd@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
infoquest-byteplus
2025-12-02 08:16:35 +08:00
committed by GitHub
parent e179fb1632
commit 7ec9e45702
22 changed files with 2103 additions and 94 deletions


@@ -14,6 +14,7 @@
DeerFlow is now officially available in the [FaaS Application Center of Volcengine](https://console.volcengine.com/vefaas/region:vefaas+cn-beijing/market). You can try it online through the [experience link](https://console.volcengine.com/vefaas/region:vefaas+cn-beijing/market/deerflow/?channel=github&source=deerflow) to explore its features. DeerFlow also supports one-click deployment on Volcengine: click the [deployment link](https://console.volcengine.com/vefaas/region:vefaas+cn-beijing/application/create?templateId=683adf9e372daa0008aaed5c&channel=github&source=deerflow) to complete the deployment quickly and start your research.
DeerFlow now integrates [InfoQuest](https://console.byteplus.com/infoquest/infoquests), an intelligent search and crawling toolset independently developed by BytePlus that offers a free online trial.
Please visit [our official website](https://deerflow.tech/) for more details.
@@ -159,6 +160,13 @@ DeerFlow supports multiple search engines that can be configured in your `.env`
  - Requires `TAVILY_API_KEY` in your `.env` file
  - Sign up at: https://app.tavily.com/home
- **InfoQuest** (recommended): AI-optimized intelligent search and crawling toolset independently developed by BytePlus
  - Requires `INFOQUEST_API_KEY` in your `.env` file
  - Supports time range filtering and site filtering
  - Provides high-quality search results and content extraction
  - Sign up at: https://console.byteplus.com/infoquest/infoquests
  - Visit https://docs.byteplus.com/en/docs/InfoQuest/What_is_Info_Quest to learn more
- **DuckDuckGo**: Privacy-focused search engine
  - No API key required
@@ -177,10 +185,31 @@ DeerFlow supports multiple search engines that can be configured in your `.env`
To configure your preferred search engine, set the `SEARCH_API` variable in your `.env` file:
```bash
# Choose one: tavily, duckduckgo, brave_search, arxiv
# Choose one: tavily, infoquest, duckduckgo, brave_search, arxiv
SEARCH_API=tavily
```
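As a rough illustration of how the `SEARCH_API` setting and the API keys listed above fit together, here is a minimal Python sketch of a startup check. The helper `check_search_config` is hypothetical and is not part of DeerFlow. The key mapping covers only the requirements stated in this section, and falling back to `tavily` when `SEARCH_API` is unset is an assumption.

```python
import os

# Engine names copied from the comment in the .env example above.
SUPPORTED_ENGINES = ("tavily", "infoquest", "duckduckgo", "brave_search", "arxiv")

# Only the key requirements stated in this section; other engines may
# need keys that are not listed here.
REQUIRED_KEYS = {
    "tavily": "TAVILY_API_KEY",
    "infoquest": "INFOQUEST_API_KEY",
}


def check_search_config(env=None, default="tavily"):
    """Return the selected engine name, or raise ValueError on a bad setup.

    `env` is any mapping of variable names to values; it defaults to the
    process environment. The `default` fallback is an assumption.
    """
    env = os.environ if env is None else env
    engine = env.get("SEARCH_API", default).strip().lower()
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"Unsupported SEARCH_API value: {engine!r}")
    key = REQUIRED_KEYS.get(engine)
    if key and not env.get(key):
        raise ValueError(f"SEARCH_API={engine} requires {key} to be set")
    return engine
```

Calling `check_search_config()` once at startup would surface a missing `INFOQUEST_API_KEY` before the first search request is made, rather than at request time.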
### Crawling Tools
DeerFlow supports multiple crawling tools that can be configured in your `conf.yaml` file:
- **Jina** (default): Freely accessible web content crawling tool
- **InfoQuest** (recommended): AI-optimized intelligent search and crawling toolset developed by BytePlus
  - Requires `INFOQUEST_API_KEY` in your `.env` file
  - Provides configurable crawling parameters
  - Supports custom timeout settings
  - Offers more powerful content extraction capabilities
  - Visit https://docs.byteplus.com/en/docs/InfoQuest/What_is_Info_Quest to learn more
To configure your preferred crawling tool, set the following in your `conf.yaml` file:
```yaml
CRAWLER_ENGINE:
  # Engine type: "jina" (default) or "infoquest"
  engine: infoquest
```
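The lookup implied by the `CRAWLER_ENGINE` block can be sketched as follows. This is only an illustration: `resolve_crawler_engine` is a hypothetical helper, not DeerFlow's actual loader, and it assumes `conf.yaml` has already been parsed into a plain dictionary.

```python
# Engine names and the "jina" default come from the conf.yaml comment above.
CRAWLER_ENGINES = ("jina", "infoquest")


def resolve_crawler_engine(config):
    """Pick the crawler engine from an already-parsed conf.yaml mapping.

    Falls back to "jina" when the CRAWLER_ENGINE section or its `engine`
    key is missing, and rejects names not listed in the README.
    """
    section = config.get("CRAWLER_ENGINE") or {}
    engine = str(section.get("engine", "jina")).strip().lower()
    if engine not in CRAWLER_ENGINES:
        raise ValueError(f"Unsupported crawler engine: {engine!r}")
    return engine
```

With the configuration shown above, `resolve_crawler_engine({"CRAWLER_ENGINE": {"engine": "infoquest"}})` selects InfoQuest, while an empty mapping falls back to Jina.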
### Private Knowledgebase
DeerFlow supports private knowledgebases such as RAGFlow, Qdrant, Milvus, and VikingDB, so that you can use your private documents to answer questions.
@@ -221,8 +250,8 @@ DeerFlow supports private knowledgebase such as RAGFlow, Qdrant, Milvus, and Vik
### Tools and MCP Integrations
- 🔍 **Search and Retrieval**
  - Web search via Tavily, Brave Search and more
  - Crawling with Jina
  - Web search via Tavily, InfoQuest, Brave Search and more
  - Crawling with Jina and InfoQuest
  - Advanced content extraction
  - Support for private knowledgebase
@@ -284,7 +313,6 @@ The system employs a streamlined workflow with the following components:
   - Manages the research flow and decides when to generate the final report
3. **Research Team**: A collection of specialized agents that execute the plan:
   - **Researcher**: Conducts web searches and information gathering using tools like web search engines, crawling and even MCP services.
   - **Coder**: Handles code analysis, execution, and technical tasks using Python REPL tool.
Each agent has access to specific tools optimized for their role and operates within the LangGraph framework
@@ -475,7 +503,6 @@ docker build -t deer-flow-api .
```
Finally, start a Docker container running the web server:
```bash
# Replace deer-flow-api-app with your preferred container name
# Start the server then bind to localhost:8000
@@ -655,4 +682,4 @@ Your unwavering commitment and expertise have been the driving force behind Deer
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=bytedance/deer-flow&type=Date)](https://star-history.com/#bytedance/deer-flow&Date)