CrushedIce

Collects and summarizes news articles from RSS feeds.

Setup

CrushedIce runs as a Twisted server. It connects to a MongoDB database to store the summarized articles and to an S3 bucket to store the image files. Internally, the articles are summarized using statistical text analysis. Tokenization and stemming rely on NLTK (the Natural Language Toolkit for Python).
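To give a rough idea of what statistical summarization can look like, here is a minimal frequency-based sketch. It is not the algorithm CrushedIce implements: the function names, the tiny stopword list, and the regex-based tokenizer are all assumptions made for illustration, and it skips stemming entirely so that it runs with the standard library alone.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (assumption); NLTK's stopwords corpus is far larger.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it", "on", "for"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens with stopwords removed (no stemming in this sketch)."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole article.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        # Average frequency of the sentence's words, so long sentences get no bonus.
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return [sentences[i] for i in keep]
```

Calling summarize(article_text, max_sentences=3) returns a list of up to three sentences. Sentences that share many of the article's frequent words score highest.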

Set up the requirements in a virtual environment:

virtualenv venv
source venv/bin/activate
pip install -r requirements.txt

To download the required NLTK modules, run this code in the Python interpreter. CrushedIce expects the NLTK files to be in a subfolder named nltk_data.

import nltk
nltk.download('punkt')
nltk.download('rslp')
nltk.download('stopwords')
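By default, nltk.download may store files outside the project directory. One possible way to place them in a local nltk_data subfolder is sketched below. It assumes the subfolder is relative to the directory you run the code from, which this README does not confirm; the helper name fetch_nltk_data and the constant NLTK_DATA_DIR are hypothetical.

```python
import os

# Assumed location: an nltk_data folder inside the current working directory.
NLTK_DATA_DIR = os.path.join(os.getcwd(), "nltk_data")


def fetch_nltk_data(packages=("punkt", "rslp", "stopwords"), target=NLTK_DATA_DIR):
    """Download the listed NLTK packages into target and register it with NLTK."""
    import nltk  # imported here so this module loads even before NLTK is installed

    os.makedirs(target, exist_ok=True)
    for name in packages:
        # download_dir tells NLTK where to store the package.
        nltk.download(name, download_dir=target)
    # Make sure NLTK searches the local folder when loading resources.
    if target not in nltk.data.path:
        nltk.data.path.append(target)
```

Run fetch_nltk_data() once from the main directory after installing the requirements. It needs network access.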

Configuration

Create the file config.py in the main directory, using the sample file config.sample.py as a template. The configuration file must contain the credentials for MongoDB and S3.
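As a rough illustration of the kind of values such a file holds, the sketch below reads credentials from environment variables so that secrets stay out of version control. Every setting name here is hypothetical. The names CrushedIce actually reads are the ones in config.sample.py, so copy that file rather than this sketch.

```python
import os

# Hypothetical setting names for illustration only; see config.sample.py for the real ones.

# MongoDB connection (assumed to be a standard MongoDB connection URI).
MONGODB_URI = os.environ.get("MONGODB_URI", "mongodb://localhost:27017")
MONGODB_DATABASE = os.environ.get("MONGODB_DATABASE", "crushedice")

# S3 bucket for image files, plus the AWS credentials used to access it.
S3_BUCKET = os.environ.get("S3_BUCKET", "")
AWS_ACCESS_KEY_ID = os.environ.get("AWS_ACCESS_KEY_ID", "")
AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_SECRET_ACCESS_KEY", "")
```

Whichever form the file takes, config.py contains secrets and should not be committed.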
