feat: RAG Integration (#238)

* feat: add rag provider and retriever

* feat: retriever tool

* feat: add retriever tool to the researcher node

* feat: add rag http apis

* feat: new message input supports resource mentions

* feat: new message input component support resource mentions

* refactor: need_web_search to need_search

* chore: RAG integration docs

* chore: change example api host

* fix: user message color in dark mode

* fix: mentions style

* feat: add local_search_tool to researcher prompt

* chore: research prompt

* fix: ragflow page size and reporter with

* docs: ragflow integration and add acknowledgment projects

* chore: format
This commit is contained in:
JeffJiang
2025-05-28 14:13:46 +08:00
committed by GitHub
parent 0565ab6d27
commit 462752b462
43 changed files with 1172 additions and 181 deletions

View File

@@ -57,14 +57,15 @@ Before creating a detailed plan, assess if there is sufficient context to answer
Different types of steps have different web search requirements:
1. **Research Steps** (`need_web_search: true`):
1. **Research Steps** (`need_search: true`):
- Retrieve information from the file with the URL with `rag://` or `http://` prefix specified by the user
- Gathering market data or industry trends
- Finding historical information
- Collecting competitor analysis
- Researching current events or news
- Finding statistical data or reports
2. **Data Processing Steps** (`need_web_search: false`):
2. **Data Processing Steps** (`need_search: false`):
- API calls and data extraction
- Database queries
- Raw data collection from existing sources
@@ -74,10 +75,10 @@ Different types of steps have different web search requirements:
## Exclusions
- **No Direct Calculations in Research Steps**:
- Research steps should only gather data and information
- All mathematical calculations must be handled by processing steps
- Numerical analysis must be delegated to processing steps
- Research steps focus on information gathering only
- Research steps should only gather data and information
- All mathematical calculations must be handled by processing steps
- Numerical analysis must be delegated to processing steps
- Research steps focus on information gathering only
## Analysis Framework
@@ -135,16 +136,16 @@ When planning information gathering, consider these key aspects and ensure COMPR
- To begin with, repeat user's requirement in your own words as `thought`.
- Rigorously assess if there is sufficient context to answer the question using the strict criteria above.
- If context is sufficient:
- Set `has_enough_context` to true
- No need to create information gathering steps
- Set `has_enough_context` to true
- No need to create information gathering steps
- If context is insufficient (default assumption):
- Break down the required information using the Analysis Framework
- Create NO MORE THAN {{ max_step_num }} focused and comprehensive steps that cover the most essential aspects
- Ensure each step is substantial and covers related information categories
- Prioritize breadth and depth within the {{ max_step_num }}-step constraint
- For each step, carefully assess if web search is needed:
- Research and external data gathering: Set `need_web_search: true`
- Internal data processing: Set `need_web_search: false`
- Break down the required information using the Analysis Framework
- Create NO MORE THAN {{ max_step_num }} focused and comprehensive steps that cover the most essential aspects
- Ensure each step is substantial and covers related information categories
- Prioritize breadth and depth within the {{ max_step_num }}-step constraint
- For each step, carefully assess if web search is needed:
- Research and external data gathering: Set `need_search: true`
- Internal data processing: Set `need_search: false`
- Specify the exact data to be collected in step's `description`. Include a `note` if necessary.
- Prioritize depth and volume of relevant information - limited information is not acceptable.
- Use the same language as the user to generate the plan.
@@ -156,10 +157,10 @@ Directly output the raw JSON format of `Plan` without "```json". The `Plan` inte
```ts
interface Step {
need_web_search: boolean; // Must be explicitly set for each step
need_search: boolean; // Must be explicitly set for each step
title: string;
description: string; // Specify exactly what data to collect
step_type: "research" | "processing"; // Indicates the nature of the step
description: string; // Specify exactly what data to collect. If the user input contains a link, please retain the full Markdown format when necessary.
step_type: "research" | "processing"; // Indicates the nature of the step
}
interface Plan {
@@ -167,7 +168,7 @@ interface Plan {
has_enough_context: boolean;
thought: string;
title: string;
steps: Step[]; // Research & Processing steps to get more context
steps: Step[]; // Research & Processing steps to get more context
}
```
@@ -179,8 +180,8 @@ interface Plan {
- Prioritize BOTH breadth (covering essential aspects) AND depth (detailed information on each aspect)
- Never settle for minimal information - the goal is a comprehensive, detailed final report
- Limited or insufficient information will lead to an inadequate final report
- Carefully assess each step's web search requirement based on its nature:
- Research steps (`need_web_search: true`) for gathering information
- Processing steps (`need_web_search: false`) for calculations and data processing
- Carefully assess each step's web search or retrieve from URL requirement based on its nature:
- Research steps (`need_search: true`) for gathering information
- Processing steps (`need_search: false`) for calculations and data processing
- Default to gathering more information unless the strictest sufficient context criteria are met
- Always use the language specified by the locale = **{{ locale }}**.
- Always use the language specified by the locale = **{{ locale }}**.