A Scrapy-based Python web crawler to notify users on a daily basis with up-to-date job postings.
git clone git@github.com:WHYjun/job-search-bot.git
The requirements.txt file should list all Python libraries that you should install to run Job Search Bot. You can install required libraries by using:
pip install --upgrade pip
pip install -r requirements.txt
If you don't have pip yet, please install pip from the following link: https://pip.pypa.io/en/stable/installing/
If you don't have MongoDB yet, please install MongoDB from the following link: https://docs.mongodb.com/manual/installation/
Once you have installed MongoDB, you should set dbpath first without any authentication.
mongod --dbpath "<your_db_path>"
If you already set your repository with running setup.js, you may want to reset the repository with running reset.js in a separate terminal window.
mongo reset.js
Then, copy the following JSON to any text editor and save as config.json. Please update name and password for security.
{
"repo": {
"name": "repo"
},
"admin": {
"name": "name",
"pwd": "password"
},
"user": {
"name": "name",
"pwd": "password"
}
}
Finally, run setup.js to set your repository with your own name and password in config.json.
mongo setup.js
Now, stop mongod and restart it with the following command (Enabling authentication).
mongod --auth --dbpath "<your_db_path>"
After setting up all requirements, change directory to /jobbot/jobbot/spiders, and run scrapy genspider <company_name> <company_base_url>. Please update the created file refer to example.py.
- Add the functionality to filter the job title.