Commit Graph

  • 1da347cbf8 docs: update index.md main 程序员阿江(Relakkes) 2025-11-22 09:12:25 +08:00
  • 422cc92dd1 docs: update README 程序员阿江(Relakkes) 2025-11-22 08:20:09 +08:00
  • 13d2302c9c docs: update README 程序员阿江(Relakkes) 2025-11-18 17:56:55 +08:00
  • ff8c92daad chore: add copyright to every file 程序员阿江(Relakkes) 2025-11-18 12:24:02 +08:00
  • 5288bddb42 refactor: weibo search #771 程序员阿江(Relakkes) 2025-11-17 17:24:47 +08:00
  • 6dcfd7e0a5 refactor: weibo login 程序员阿江(Relakkes) 2025-11-17 17:11:35 +08:00
  • e89a6d5781 feat: cdp browser cleanup after crawler done 程序员阿江(Relakkes) 2025-11-17 12:21:53 +08:00
  • a1c5e07df8 fix: xhs sub comment bugfix #769 程序员阿江(Relakkes) 2025-11-17 11:47:33 +08:00
  • b6caa7a85e refactor: add xhs creator params 程序员阿江(Relakkes) 2025-11-10 21:10:03 +08:00
  • 1e3637f238 refactor: update xhs note detail 程序员阿江(Relakkes) 2025-11-10 18:13:51 +08:00
  • b5dab6d1e8 refactor: use xhshow instead of the playwright signing scheme 程序员阿江(Relakkes) 2025-11-10 18:12:45 +08:00
  • 54f23b8d1c Merge pull request #768 from yangtao210/main 程序员阿江-Relakkes 2025-11-07 05:44:07 -05:00
  • 58eb89f073 Merge branch 'NanmiCoder:main' into main yangtao210 2025-11-07 17:44:09 +08:00
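  • 7888f4c6bd Optimize the MongoDB config retrieval logic and move the storage base class. Integration tests yt210 2025-11-07 17:42:50 +08:00
  • b61ec54a72 Optimize the MongoDB config retrieval logic and move the storage base class. yt210 2025-11-07 17:42:28 +08:00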
  • 60cbb3e37d fix: weibo container error #568 程序员阿江(Relakkes) 2025-11-06 19:43:09 +08:00
  • 05a1782746 Merge pull request #764 from yangtao210/main 程序员阿江-Relakkes 2025-11-06 06:10:49 -05:00
  • ef6948b305 Add storage to MongoDB yt210 2025-11-06 10:40:30 +08:00
  • 45ec4b433a docs: update 程序员阿江(Relakkes) 2025-11-06 00:08:03 +08:00
  • 0074e975dd fix: dy search 程序员阿江(Relakkes) 2025-11-04 00:14:16 +08:00
  • 889fa01466 fix: bili word cloud fix 程序员阿江(Relakkes) 2025-11-02 13:25:31 +08:00
  • 3f5925e326 feat: update xhs sign 程序员阿江(Relakkes) 2025-10-27 19:06:07 +08:00
  • ed6e0bfb5f refactor: tieba now fetches data via the browser 程序员阿江(Relakkes) 2025-10-19 17:09:55 +08:00
  • 26a261bc09 Merge branch 'feature/config-refactor-20251018' 程序员阿江(Relakkes) 2025-10-19 15:32:42 +08:00
  • 03e384bbe2 refactor: remove stealth injection in cdp mode 程序员阿江(Relakkes) 2025-10-19 15:32:03 +08:00
  • 56bf5d226f The configuration file supports URL crawling 程序员阿江-Relakkes 2025-10-18 07:42:14 +08:00
  • ae7955787c feat: kuaishou support url link feature/config-refactor-20251018 程序员阿江(Relakkes) 2025-10-18 07:40:10 +08:00
  • a9dd08680f feat: xhs support creator url link 程序员阿江(Relakkes) 2025-10-18 07:20:09 +08:00
  • cae707cb2a feat: douyin support url link 程序员阿江(Relakkes) 2025-10-18 07:00:21 +08:00
  • 906c259cc7 feat: bilibili support url link 程序员阿江(Relakkes) 2025-10-18 06:30:20 +08:00
  • 3b6fae8a62 docs: update README.md 程序员阿江(Relakkes) 2025-10-17 15:30:44 +08:00
  • a72504a33d Merge pull request #739 from callmeiks/add-tikhub-sponsor 程序员阿江-Relakkes 2025-10-16 16:54:18 +08:00
  • e177f799df docs: resize TikHub banner to smaller size Callmeiks 2025-10-16 01:51:55 -07:00
  • 1a5dcb6db7 Merge pull request #738 from callmeiks/add-tikhub-sponsor 程序员阿江-Relakkes 2025-10-16 16:41:19 +08:00
  • 2c9eec544d docs: add TikHub as sponsor Callmeiks 2025-10-16 01:22:40 -07:00
  • d1f73e811c docs: update README.md 程序员阿江(Relakkes) 2025-10-12 21:19:11 +08:00
  • 2d3e7555c6 docs: update README.md 程序员阿江(Relakkes) 2025-10-11 16:16:11 +08:00
  • 3c5b9e8035 docs: update wechat qrcode 程序员阿江(Relakkes) 2025-10-02 14:27:10 +08:00
  • e6f3182ed7 Merge branch 'codex/replace-argparse-with-typer-for-cli' 程序员阿江(Relakkes) 2025-09-26 18:11:02 +08:00
  • 2cf143cc7c fix: #730 程序员阿江(Relakkes) 2025-09-26 18:10:30 +08:00
  • eb625b0b48 Merge pull request #729 from NanmiCoder/codex/replace-argparse-with-typer-for-cli 程序员阿江-Relakkes 2025-09-26 18:08:21 +08:00
  • 84f6f650f8 fix: typer args bugfix codex/replace-argparse-with-typer-for-cli 程序员阿江(Relakkes) 2025-09-26 18:07:57 +08:00
  • 9d6cf065e9 fix(cli): support runtime without peps604 程序员阿江-Relakkes 2025-09-26 17:38:50 +08:00
  • 95c740dee2 refine: harden typer cli defaults 程序员阿江-Relakkes 2025-09-26 17:38:44 +08:00
  • f97e0c18cd feat(cli): migrate argument parsing to typer 程序员阿江-Relakkes 2025-09-26 17:21:47 +08:00
  • 879a72ea30 fix: browser launched via cdp could not be closed 程序员阿江-Relakkes 2025-09-26 16:57:48 +08:00
  • 3237073a0e Improve BrowserLauncher cleanup handling codex/fix-issue-with-mediacrawler-functionality 程序员阿江-Relakkes 2025-09-26 16:52:38 +08:00
  • 7b9db2f748 Merge pull request #726 from LePao1/main 程序员阿江-Relakkes 2025-09-25 01:58:24 +08:00
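  • 3954c40e69 feat(bilibili): add a video quality parameter, so the quality of downloaded videos can be changed via BILI_QN; add video quality configuration to BilibiliClient and improve error handling. Fix download requests being 302-redirected to a CDN: the old code did not follow redirects and accepted only "OK", so downloads failed; now even low-quality or CDN-redirected links download correctly. LePao1 2025-09-24 12:27:16 +08:00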
  • e2554288e0 docs: update README 程序员阿江(Relakkes) 2025-09-23 09:46:44 +08:00
  • 1342797486 Merge pull request #718 from persist-1/refactor 程序员阿江-Relakkes 2025-09-11 06:45:26 +08:00
  • 926ea9dc42 fix: path format problems caused by improper joining of path separators persist-1 2025-09-11 00:35:02 +08:00
  • a6d85b4194 sync #717 persist-1 2025-09-11 00:00:06 +08:00
  • 0d0af57a01 fix(store): improper use of 'crawler_type_var' caused abnormal csv/json save filenames persist-1 2025-09-10 23:47:05 +08:00
  • 4b346cfb61 Merge pull request #716 from persist-1/refactor 程序员阿江-Relakkes 2025-09-10 14:38:39 +08:00
  • bf7a0098bd Merge pull request #717 from wisty/patch-1 程序员阿江-Relakkes 2025-09-09 17:14:37 +08:00
  • c87df59996 log client modify 刘小龙 2025-09-09 15:27:46 +08:00
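  • d3bebd039e refactor(database): relocate the database module, rename the initialization arg, and update the docs persist-1 2025-09-08 01:14:31 +08:00
  • 99756612b4 chore: remove the previously synced sqlite database so that users initialize it themselves persist-1 2025-09-08 00:40:55 +08:00
  • 95a3dc8ce1 chore: remove unnecessary comments persist-1 2025-09-08 00:37:57 +08:00
  • 40de0e47e5 fix(store): replace the async for loop with an async with statement to fix zhihu database session management persist-1 2025-09-08 00:29:04 +08:00
  • a38058856f test: add a database sync test script for comparing and syncing the ORM with the database schema persist-1 2025-09-08 00:13:00 +08:00
  • 684a16ed9a fix(database): fix model field types to support a wider range of data formats; fix the xhs comment storage method, switching from batch to per-item processing persist-1 2025-09-07 04:10:49 +08:00
  • b04f5bcd6f feat(database): optimize database models and connection management persist-1 2025-09-06 06:08:28 +08:00
  • 0965bd6c96 fix: use get_current_time() instead of get_current_date() to avoid filename collisions on the same date persist-1 2025-09-06 04:43:56 +08:00
  • e92c6130e1 fix(store): fix the AsyncFileWriter import in the storage implementations persist-1 2025-09-06 04:41:37 +08:00
  • be306c6f54 refactor(database): rework the database storage implementation to use the SQLAlchemy ORM instead of raw SQL operations persist-1 2025-09-06 04:10:20 +08:00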
  • fa5f07e9ee docs: update README.md 程序员阿江(Relakkes) 2025-09-05 17:51:36 +08:00
  • 6b6fedd031 fix: #711 程序员阿江(Relakkes) 2025-09-02 18:57:18 +08:00
  • 2bce3593f7 feat: support time delay for all platforms 程序员阿江(Relakkes) 2025-09-02 16:43:09 +08:00
  • eb799e1fa7 refactor: xhs extractor 程序员阿江(Relakkes) 2025-09-02 14:50:32 +08:00
  • ce52c58b98 Merge pull request #707 from CzsGit/fix-douyin-json-format 程序员阿江-Relakkes 2025-08-18 19:15:50 +08:00
  • 48da268bc5 fix: add formatted output for Douyin JSON storage Czs-HF 2025-08-16 12:52:37 +08:00
  • 9e8c979164 fix: note_download_url field length error 程序员阿江(Relakkes) 2025-08-14 14:57:24 +08:00
  • 4a68e79ed0 docs: update README.md 程序员阿江(Relakkes) 2025-08-12 22:25:21 +08:00
  • 526c37822b Merge pull request #700 from 2513502304/main 程序员阿江-Relakkes 2025-08-06 17:26:29 +08:00
  • 2c11e64dc9 Merge branch 'NanmiCoder:main' into main 翟持江 2025-08-06 11:39:42 +08:00
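  • 6a10d0d11c The original HTTPStatusError could not catch exception types such as ConnectError and ReadError. This commit changes the caught exception type to HTTPError, the base class for httpx request exceptions, so that any exception raised in httpx.request (for example, a banned IP or a server refusing the connection) is caught, and an interrupted media crawl no longer interrupts the text crawl 未来可欺 2025-08-06 11:24:51 +08:00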
  • e4e0f659e0 Merge pull request #699 from 2513502304/main 程序员阿江-Relakkes 2025-08-05 16:11:03 +08:00
  • 81f2dbe4ab Add exception handling for media resource servers; see issue #691 未来可欺 2025-08-05 13:11:00 +08:00
  • b9d30bbabb fix: #693 程序员阿江(Relakkes) 2025-08-01 15:55:21 +08:00
  • 12450759d8 fix: httpx proxy format error feat: add a ip proxy provider 程序员阿江(Relakkes) 2025-08-01 01:05:11 +08:00
  • 0024ce6ab4 feat: upgrade httpx version to 0.28.1 程序员阿江-Relakkes 2025-07-31 23:19:08 +08:00
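  • a6fd9ebdbc Simplify the naming of saved Douyin images and videos: one video id corresponds to exactly one short video and returns a single video_download_url, so numeric naming is unnecessary 未来可欺 2025-07-31 23:11:45 +08:00
  • 0b81240aed Upgrade httpx to 0.28.1 and change the keyword argument proxies to proxy 未来可欺 2025-07-31 22:48:02 +08:00
  • 9d90e9fc6d fix issue #689; for now this appears to be a problem with the httpx library, because whether using the sync or async version, and whether or not an httpx.***Client object is constructed to make the request, the response comes back empty (response.content = b'', response.text = ''), whereas switching to the requests library retrieves the data normally 未来可欺 2025-07-31 22:01:48 +08:00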
  • a1535289c1 Merge pull request #687 from 2513502304/main 程序员阿江-Relakkes 2025-07-30 23:06:35 +08:00
  • e9f976117a Restore the config file to its original state 未来可欺 2025-07-30 21:31:50 +08:00
  • 082c316345 Merge branch 'NanmiCoder:main' into main 翟持江 2025-07-30 21:28:29 +08:00
  • c61ed57a20 fix: QR code fails to display on some systems #685 程序员阿江-Relakkes 2025-07-30 21:26:41 +08:00
  • 93a1c27fff Fix some runtime bugs found by testing search mode, and set a longer timeout for platforms that can crawl media 未来可欺 2025-07-30 21:19:56 +08:00
  • 87caf07495 fix: #685 GokoRuri 2025-07-30 21:14:37 +08:00
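  • a7cc18ec7d Revise some docs 未来可欺 2025-07-30 18:58:10 +08:00
  • ecddfbe02c Rename the .py files in the store folder that end in _video or _image so they all end in _media.py, avoiding platforms that have only a separate _video.py or _image.py implementation. All future code for storing videos or images goes in this file 未来可欺 2025-07-30 18:32:08 +08:00
  • 173bc08a9d Add logic for storing Douyin videos and images, rename the ENABLE_GET_IMAGES parameter in config.py to ENABLE_GET_MEIDAS, and slightly adjust the storage logic accordingly 未来可欺 2025-07-30 18:24:08 +08:00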
  • 417c39de69 docs: add a sponsor 程序员阿江(Relakkes) 2025-07-30 16:44:10 +08:00
  • b2d52918ae Merge pull request #684 from 2513502304/main 程序员阿江-Relakkes 2025-07-30 14:51:46 +08:00
  • 8ab1b7ee4c fix: fixed circular import issue 程序员阿江(Relakkes) 2025-07-30 14:47:11 +08:00
  • 214ccaa294 Update sqlite_tables.sql: sync sqlite to support saving note download URLs 翟持江 2025-07-30 10:48:52 +08:00
  • 612a9b53d3 Update tables.sql: sync this file to support saving note download URLs 翟持江 2025-07-30 10:46:46 +08:00