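- Pager: Given a paginated URL request format, fetches the responses for every page in a specified range.
- Ruler: Specifies the rules used to parse a page response.
- URL Collector: Relies on Pager and Ruler to collect the set of URLs of all pages whose data ultimately needs to be crawled.
- Data Collector: Reads URLs from the URL Collector and, given a set of Rulers, crawls the relevant data.
- Data Storage: Reads data from the Data Collector and stores it at a specified location; currently only CSV is supported.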
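
The five components above can be read as a small pipeline. The following is a minimal sketch of that flow, not the project's actual code: every function and class name, the regex-based rule format, the `{page}` placeholder, and the `example.test` URLs are assumptions made for illustration. A stub `fake_fetch` stands in for real HTTP requests so the sketch runs without network access.

```python
import csv
import io
import re
from typing import Callable, Dict, Iterable, Iterator, List, TextIO

Fetch = Callable[[str], str]  # hypothetical: maps a URL to its response body


def pager(url_format: str, first: int, last: int, fetch: Fetch) -> Iterator[str]:
    """Pager: fetch the response of every listing page in [first, last]."""
    for page in range(first, last + 1):
        yield fetch(url_format.format(page=page))


class Ruler:
    """Ruler: a named parsing rule (assumed here to be a regex with one group)."""

    def __init__(self, name: str, pattern: str) -> None:
        self.name = name
        self.pattern = re.compile(pattern)

    def apply(self, body: str) -> List[str]:
        return self.pattern.findall(body)


def collect_urls(pages: Iterable[str], ruler: Ruler) -> List[str]:
    """URL Collector: gather unique target URLs, keeping first-seen order."""
    seen: Dict[str, None] = {}
    for body in pages:
        for url in ruler.apply(body):
            seen.setdefault(url, None)
    return list(seen)


def collect_data(urls: Iterable[str], rulers: List[Ruler], fetch: Fetch) -> Iterator[Dict[str, str]]:
    """Data Collector: fetch each URL and apply every Ruler to its response."""
    for url in urls:
        body = fetch(url)
        row = {"url": url}
        for ruler in rulers:
            matches = ruler.apply(body)
            row[ruler.name] = matches[0] if matches else ""
        yield row


def store_csv(rows: Iterable[Dict[str, str]], fieldnames: List[str], out: TextIO) -> None:
    """Data Storage: write the collected rows as CSV."""
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Stub site used instead of real HTTP requests (illustrative data only).
FAKE_SITE = {
    "http://example.test/list?page=1": '<a href="http://example.test/item/1">a</a>',
    "http://example.test/list?page=2": '<a href="http://example.test/item/2">b</a>',
    "http://example.test/item/1": "<h1>First</h1>",
    "http://example.test/item/2": "<h1>Second</h1>",
}


def fake_fetch(url: str) -> str:
    return FAKE_SITE[url]


def run_demo() -> str:
    pages = pager("http://example.test/list?page={page}", 1, 2, fake_fetch)
    urls = collect_urls(pages, Ruler("link", r'href="([^"]+)"'))
    rows = collect_data(urls, [Ruler("title", r"<h1>(.*?)</h1>")], fake_fetch)
    out = io.StringIO()
    store_csv(rows, ["url", "title"], out)
    return out.getvalue()


if __name__ == "__main__":
    print(run_demo())
```

Passing the fetch function in as a parameter keeps each stage testable in isolation; a real crawler would replace `fake_fetch` with an HTTP client call and would likely use an HTML parser rather than regular expressions.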
songshine/crawler