feat: add resource upload support for RAG (#768)

* feat: add resource upload support for RAG

- Backend: Added ingest_file method to Retriever and MilvusRetriever
- Backend: Added /api/rag/upload endpoint
- Frontend: Added RAGTab in settings for uploading resources
- Frontend: Updated translations and settings registration

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions from code review

* Apply suggestions from code review of src/rag/milvus.py

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Willem Jiang
2025-12-19 09:55:34 +08:00
committed by GitHub
parent 3e8f2ce3ad
commit 04296cdf5a
9 changed files with 567 additions and 2 deletions

@@ -79,3 +79,52 @@ class Retriever(abc.ABC):
        Query relevant documents from the resources.
        """
        pass

    def ingest_file(self, file_content: bytes, filename: str, **kwargs) -> Resource:
        """
        Ingest a file into the RAG provider and register it as a :class:`Resource`.

        This method is intended to be overridden by concrete retriever implementations.
        The default implementation always raises :class:`NotImplementedError`.

        Parameters
        ----------
        file_content:
            Raw bytes of the file to ingest. For text-based formats, implementations
            will typically assume UTF-8 encoding unless documented otherwise. Binary
            formats (such as PDF, images, or office documents) should be passed as
            their original bytes.
        filename:
            The original filename, including extension (e.g. ``"report.pdf"``). This
            can be used by implementations to infer the file type, MIME type, or to
            populate the resulting resource's title.
        **kwargs:
            Additional, implementation-specific options. Examples may include:

            - Explicit MIME type or file type hints.
            - Additional metadata to associate with the resource.
            - Chunking, indexing, or preprocessing parameters.

            Unsupported or invalid keyword arguments may result in an exception being
            raised by the concrete implementation.

        Returns
        -------
        Resource
            A :class:`Resource` instance describing the ingested file, including its
            URI and title. The exact URI scheme and how the resource is stored are
            implementation-defined.

        Raises
        ------
        NotImplementedError
            Always raised by the base ``Retriever`` implementation. Concrete
            implementations should override this method to provide functionality.
        ValueError
            May be raised by implementations if the input bytes, filename, or
            provided options are invalid.
        RuntimeError
            May be raised by implementations to signal unexpected ingestion or
            storage failures (e.g. backend service errors).
        """
        raise NotImplementedError("ingest_file is not implemented")
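
The contract described in the docstring above can be illustrated with a minimal sketch of a concrete override. Everything here except the `ingest_file` signature is an assumption made for illustration: `Resource` and `Retriever` are reduced stand-ins for the project's classes, while `InMemoryRetriever`, the `memory://` URI scheme, and the `title` keyword option are hypothetical and not part of this commit.

```python
import abc
import hashlib
from dataclasses import dataclass


@dataclass
class Resource:
    # Stand-in for the project's Resource model; the field names are assumptions.
    uri: str
    title: str
    description: str = ""


class Retriever(abc.ABC):
    # Reduced stand-in for the base class shown in the diff.
    def ingest_file(self, file_content: bytes, filename: str, **kwargs) -> Resource:
        raise NotImplementedError("ingest_file is not implemented")


class InMemoryRetriever(Retriever):
    """Hypothetical retriever that stores decoded text in a dict."""

    def __init__(self) -> None:
        self.documents = {}  # maps resource URI -> decoded text

    def ingest_file(self, file_content: bytes, filename: str, **kwargs) -> Resource:
        # Reject invalid input with ValueError, as the base docstring allows.
        if not filename:
            raise ValueError("filename must not be empty")
        if not file_content:
            raise ValueError("file_content must not be empty")
        try:
            # This sketch handles text only and assumes UTF-8.
            text = file_content.decode("utf-8")
        except UnicodeDecodeError as exc:
            raise ValueError(f"{filename} is not valid UTF-8 text") from exc

        # Derive a URI from the content hash (hypothetical scheme).
        digest = hashlib.sha256(file_content).hexdigest()[:12]
        uri = f"memory://{digest}/{filename}"
        self.documents[uri] = text

        # "title" is a hypothetical keyword option; default to the filename.
        return Resource(uri=uri, title=kwargs.get("title", filename))


if __name__ == "__main__":
    retriever = InMemoryRetriever()
    resource = retriever.ingest_file(b"hello world", "notes.txt")
    print(resource.uri, resource.title)
```

A caller that works against the base class would still need to handle `NotImplementedError`, since providers that do not override `ingest_file` keep the default behaviour.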