Commit Graph

7 Commits

Author SHA1 Message Date
Willem Jiang
170c4eb33c Upgrade langchain version to 1.x (#720)
* fix: revert the part of patch of issue-710 to extract the content from the plan

* Upgrade the ddgs for the new compatible version

* Upgraded langchain to 1.1.0
updated langchain related package to the new compatable version

* Update pyproject.toml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-28 22:09:13 +08:00
Willem Jiang
bec97f02ae fix: the crawling error when encountering PDF URLs (#707)
* fix: the crawling error when encountering PDF URLs

* Added the unit test for the new feature of crawl tool

* fix: address the code review problems

* fix: address the code review problems
2025-11-25 09:24:52 +08:00
Willem Jiang
975b344ca7 fix: resolve issue #651 - crawl error with None content handling (#652)
* fix: resolve issue #651 - crawl error with None content handling
Fixed issue #651 by adding comprehensive null-safety checks and error handling to the crawl system.
The fix prevents the ‘TypeError: Incoming markup is of an invalid type: None’ crash by:
1. Validating HTTP responses from Jina API
2. Handling None/empty content at extraction stage
3. Adding fallback handling in Article markdown/message conversion
4. Improving error diagnostics with detailed logging
5. Adding 16 new tests with 100% coverage for critical paths

* Update src/crawler/readability_extractor.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/crawler/article.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-10-24 17:06:54 +08:00
Willem Jiang
d30c4d00d3 fix: convert crawl_tool dict return to JSON string for type consistency (#636)
Keep fixing #631
This pull request updates the crawl_tool function to return its results as a JSON string instead of a dictionary, and adjusts the unit tests accordingly to handle the new return type. The changes ensure consistent serialization of output and proper validation in tests.
2025-10-21 10:00:33 +08:00
zgjja
3b4e993531 feat: 1. replace black with ruff for fomatting and sort import (#489)
2. use tavily from`langchain-tavily` rather than the older one from `langchain-community`

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2025-08-17 22:57:23 +08:00
Willem Jiang
3c46201ff0 fix: fix the lint check errors of the main branch (#403) 2025-07-12 14:43:25 +08:00
Willem Jiang
4c2fe2e7f5 test: add more unit tests of tools (#315)
* test: add more test on test_tts.py

* test: add unit test of search and retriever in tools

* test: remove the main code of search.py

* test: add the travily_search unit test

* reformate the codes

* test: add unit tests of tools

* Added the pytest-asyncio dependency

* added the license header of test_tavily_search_api_wrapper.py
2025-06-12 20:43:32 +08:00