Topwkl/SimpleSpider

 
 

A simple spider for Zhihu (zhihu.com), a Chinese Q&A site

Libraries used:

requests

gevent

You can use pip to install:

pip install requests

pip install gevent

To use it, create a file named user_headers.py with the following content:

user_headers.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# the content of browser's request headers, after you login zhihu.com
headers = {
    'Accept-Language': 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:50.0) Gecko/20100101 Firefox/50.0',
    'Accept-Encoding': 'gzip, deflate, br',
    'Origin': 'https://www.zhihu.com',
    'Connection': 'keep-alive',
    'X-Xsrftoken': 'your xsrf token',
    'Cookie': 'your cookies'
}

You can find the cookie by pressing F12 in Firefox (developer tools) after you log in to zhihu.com. The X-Xsrftoken key is optional; you can delete it.
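As a minimal sketch of how the dictionary above could be filled in from values copied out of the browser, the helper below builds the same headers from a raw cookie string and adds X-Xsrftoken only when a token is given. The name build_headers and its parameters are hypothetical and not part of this repository; the header values are the ones listed above.

```python
# Hypothetical helper (not part of this repository): builds the headers dict
# shown above from values copied out of the browser's developer tools.

def build_headers(cookie, xsrf_token=None):
    """Return request headers for zhihu.com using the given cookie string."""
    headers = {
        'Accept-Language': 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:50.0) Gecko/20100101 Firefox/50.0',
        'Accept-Encoding': 'gzip, deflate, br',
        'Origin': 'https://www.zhihu.com',
        'Connection': 'keep-alive',
        # Strip stray whitespace that often comes along when copy-pasting.
        'Cookie': cookie.strip(),
    }
    # X-Xsrftoken is optional, so only add it when a token was supplied.
    if xsrf_token:
        headers['X-Xsrftoken'] = xsrf_token
    return headers


# Placeholder value; paste the real cookie string from the browser here.
headers = build_headers('your cookies')
```

Placing this in user_headers.py would still expose a module-level name headers, which is what the file above defines.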

To get someone's followers, use it like this:

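from followers import Followers

# url_token of the user whose followers you want
flers = Followers(url_token="chen-er-bai-18")

# iterate over the followers; each item exposes fields such as name and url_token
for i in flers.get_api_object():
    print i["name"], i["url_token"]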

The modules following.py, question_api.py, voter_profile.py, comments.py, and following_topic.py are used in the same way.

question.py uses the old API, which returns HTML; the HTML is then parsed with pyquery.

gevent is used to improve efficiency, because its coroutines let many requests wait on the network at the same time.
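The idea can be illustrated with a small sketch, assuming gevent is installed as described above. It is not taken from this repository: fake_fetch is a hypothetical stand-in for a network request, and the tokens are placeholders. Each call waits cooperatively, so the three waits overlap instead of running one after another.

```python
import gevent


def fake_fetch(url_token):
    """Hypothetical stand-in for a network request.

    gevent.sleep yields control, so other greenlets run while this one waits.
    """
    gevent.sleep(0.1)
    return "fetched " + url_token


# Placeholder tokens, not real accounts.
tokens = ["user-a", "user-b", "user-c"]

# Start one greenlet per token, then wait for all of them to finish.
jobs = [gevent.spawn(fake_fetch, token) for token in tokens]
gevent.joinall(jobs)

# .value holds each greenlet's return value, in the order the jobs were created.
results = [job.value for job in jobs]
print(results)
```

With real requests calls, the blocking sockets usually need to be made cooperative first, for example by calling gevent.monkey.patch_all() at the very start of the program; whether this project does so is not stated here.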
