Tools, tutorials, code, and workflows for scraping data from websites.
- Select the sources from which the data will be collected.
- Write code to construct the URLs for downloading the HTML files that hold the data in tables, divs, etc.
- Download the articles.
- Parse the downloaded articles, sieve out the data (through regex and pattern matching), and save the output as CSV files.
- Analyze the CSV data after loading it into memory with Python or R.
- Publish the results using d3.js, nbviewer, etc.
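
The workflow above can be sketched end to end in Python using only the standard library. This is a minimal illustration, not the project's actual code: the base URL, the query parameters, the helper names (`build_url`, `download`, `extract_rows`, `rows_to_csv`, `load_csv`), and the sample table are all assumptions made up for the example. The regex-based extraction only handles simple, well-formed tables; a real site may need a proper HTML parser.

```python
import csv
import io
import re
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical source; replace with the site actually being scraped.
BASE_URL = "https://example.com/reports"


def build_url(base, **params):
    """Frame the URL by appending query parameters to the base address."""
    return f"{base}?{urlencode(params)}"


def download(url, timeout=30):
    """Download one page and return its HTML as text."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Patterns for simple tables: rows, then header/data cells, then leftover tags.
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.S | re.I)
CELL_RE = re.compile(r"<t[dh]\b[^>]*>(.*?)</t[dh]>", re.S | re.I)
TAG_RE = re.compile(r"<[^>]+>")


def extract_rows(html):
    """Sieve out table rows as lists of cell strings with inner tags removed."""
    rows = []
    for row_html in ROW_RE.findall(html):
        cells = [TAG_RE.sub("", cell).strip() for cell in CELL_RE.findall(row_html)]
        if cells:
            rows.append(cells)
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text (write this to a file to keep it)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def load_csv(text):
    """Load CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Made-up sample page standing in for a downloaded article.
SAMPLE_HTML = """
<table>
  <tr><th>name</th><th>value</th></tr>
  <tr><td>alpha</td><td>10</td></tr>
  <tr><td><b>beta</b></td><td>32</td></tr>
</table>
"""

if __name__ == "__main__":
    print(build_url(BASE_URL, year=2020, page=1))
    records = load_csv(rows_to_csv(extract_rows(SAMPLE_HTML)))
    print(sum(int(record["value"]) for record in records))
```

The sketch runs against the embedded sample so it works offline; in practice `download(build_url(...))` would supply the HTML, and the CSV text would be written to disk before the analysis and publishing steps.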