wizardchen | bff0e742fa | fix: try fix ocr avx not support | 2025-09-11 13:21:21 +08:00
wizardchen | 6f6ca84dae | feat(docreader): add health check | 2025-09-10 20:22:14 +08:00
wizardchen | 7cfae7e0d3 | fix: pre fetch ocr models in docker container | 2025-09-10 17:24:26 +08:00
wizardchen | 19d2493afc | fix: make file docker build not work | 2025-09-10 15:13:12 +08:00
wizardchen | 0e1d7edca3 | fix: image parser concurrency error | 2025-09-10 13:19:39 +08:00
wizardchen | 7775559a9b | feat: use paddle ocr v4 instead | 2025-09-10 01:22:25 +08:00
wizardchen | 2b6cbee1b6 | feat: add aliyun rerank | 2025-09-10 01:22:25 +08:00
begoniezhao | 3f8a1d20c1 | fix(docreader): update paddle version | 2025-09-09 19:25:02 +08:00
Liwx | 4489a4da7f | Update base_parser.py | 2025-09-08 14:58:37 +08:00
Liwx | 202f353543 | Update base_parser.py | 2025-09-08 14:58:37 +08:00
Liwx1014 | 696815ddfb | update pdf_parser.py | 2025-09-08 14:58:37 +08:00
Liwx1014 | 88b467caf0 | fix:build docreader timeout; update ocr config;support pdf tables parsing | 2025-09-08 14:58:37 +08:00
Liwx1014 | eb27a30c41 | fix:build docreader timeout; update ocr config;support pdf tables parsing | 2025-09-08 14:58:37 +08:00
Liwx1014 | 3aad892a62 | fix:build docreader timeout; update ocr config;support pdf tables parsing | 2025-09-08 14:58:37 +08:00
fatelei | d74ae9153b | fix: https://github.com/Tencent/WeKnora/issues/114 | 2025-09-08 10:36:37 +08:00
Liwx | a1473fe731 | Update ocr_engine.py | 2025-08-29 12:24:58 +08:00
Liwx | 7d0037fc2d | Update ocr_engine.py | 2025-08-29 12:24:58 +08:00
Liwx1014 | 11910048c0 | fix:ocr extract error list out of range | 2025-08-29 12:24:58 +08:00
wizardchen | f8394c7e4d | fix processed_content used before assignment | 2025-08-21 16:52:39 +08:00
wizardchen | d801112f5f | fix: use image_data_list before assign | 2025-08-21 10:08:37 +08:00
wizardchen | 785261313f | feat: make CONCURRENCY_POOL_SIZE configurable | 2025-08-16 13:27:01 +08:00
begoniezhao | 20049d034a | refactor: optimize storage configuration priority and VLM configuration check logic | 2025-08-15 17:33:44 +08:00
wizardchen | 09d038eeb7 | fix: strip minio path prefix | 2025-08-15 01:36:04 +08:00
begoniezhao | f77720155c | feat: Added WEB_PROXY environment variable to optimize web content processing | 2025-08-14 17:09:11 +08:00
wizardchen | 8b43931886 | feat: support minio storage | 2025-08-14 12:16:08 +08:00
dongyuxiang | 396fd9326b | chore: ignore mac .DS_Store | 2025-08-12 17:55:51 +08:00
wizardchen | ddcf5edf02 | fix: Fix docx parser init failed | 2025-08-11 11:03:39 +08:00
wizardchen | bdabed6bfa | feat: Added web page for configuring model information | 2025-08-10 17:11:07 +08:00
begoniezhao | 24c190c492 | feat: add multimodal model configuration and VLM model authentication | 2025-08-08 17:05:24 +08:00
begoniezhao | 6d1e192a2c | refactor(caption.py): improve CaptionChatResp parsing logic and make field handling more robust | 2025-08-06 16:33:16 +08:00
begoniezhao | 8557297f28 | fix(docreader): add detail parameter for openai interface | 2025-08-06 11:52:35 +08:00
wizardchen | 56eb2bce33 | init commit | 2025-08-05 15:08:07 +08:00