Using an open source dataset for sentiment analysis on Tweet data.
Data was retrieved from http://help.sentiment140.com/for-students/. The concept is that 1.6 million tweets were pulled and classified based on the emoticon contained within the body. Tweets containing :) were deemed positive and given a polarity of ‘4’; tweets containing :( were deemed negative and given a polarity of ‘0’. A polarity of ‘2’ is considered neutral. The attached dataset is a reduced version (25% of the original), drawn with equal random sampling from the original 1.6 million records to facilitate quicker training.
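As a rough illustration of the reduction step, the sketch below draws 25% of the rows from each polarity class of a small toy list. It assumes that "equal random sampling" means sampling the same fraction from every class, which is one possible reading. The helper name `sample_per_class`, the `(polarity, text)` row layout, and the toy rows are all hypothetical, and this is not necessarily the procedure used to build the attached file.

```python
import random
from collections import defaultdict

# Polarity codes as described above: 0 = negative, 2 = neutral, 4 = positive.
POLARITY_LABELS = {0: "negative", 2: "neutral", 4: "positive"}


def sample_per_class(rows, fraction=0.25, seed=0):
    """Return a random `fraction` of the rows from each polarity class.

    `rows` is an iterable of (polarity, text) pairs. Hypothetical helper:
    an assumption about how the reduced file could be produced.
    """
    # Group the rows by their polarity code.
    by_polarity = defaultdict(list)
    for polarity, text in rows:
        by_polarity[polarity].append((polarity, text))

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    sampled = []
    for polarity in sorted(by_polarity):
        group = by_polarity[polarity]
        keep = int(len(group) * fraction)  # rows to keep from this class
        sampled.extend(rng.sample(group, keep))
    rng.shuffle(sampled)  # avoid leaving the subset ordered by class
    return sampled


# Toy stand-in for the tweet data: 8 positive and 8 negative rows.
toy_rows = [(4, f"good day {i} :)") for i in range(8)]
toy_rows += [(0, f"bad day {i} :(") for i in range(8)]

subset = sample_per_class(toy_rows, fraction=0.25)
counts = {POLARITY_LABELS[p]: sum(1 for q, _ in subset if q == p) for p in (0, 4)}
print(counts)  # {'negative': 2, 'positive': 2}
```

Sampling within each class keeps the positive and negative counts in the same proportion as in the full data, so a model trained on the subset sees the same class balance.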