DataHack2015

Project Clarity!

Data pipeline: Mining_Wikipedia.py ---> tf-idf_transformation.py ---> Dimension_table.py ---> t-sne.py

Exactly as produced for the 2015 Data Hackathon in NYC, and seen a bit here.
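
The scripts themselves are not reproduced in this README, so the following is only a rough sketch of what a tf-idf stage like the one named in the pipeline typically computes. It is not the code from tf-idf_transformation.py. The function name `tf_idf`, the whitespace tokenizer, the unsmoothed `log(N / df)` weighting, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    Real pipelines often add smoothing or vector normalization.
    """
    # Naive tokenizer: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical stand-ins for text mined from Wikipedia.
docs = [
    "data science uses data",
    "wikipedia hosts articles",
    "science articles cite data",
]
vectors = tf_idf(docs)
```

Under this weighting, a term that appears in every document gets weight zero, and terms concentrated in a few documents score highest. Sparse per-document weights of this kind are what a later dimensionality-reduction step such as t-SNE would take as input.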