This is the implementation of the paper "End-to-End Compromised Account Detection" (http://cse.msu.edu/~karimiha/publications/E2ECAD.pdf).
Please refer to https://github.com/hamidkarimi/E2ECAD/wiki to see how to run the code.
The file ID-TwitterHandle-Label-Split.csv contains four fields: an internal ID assigned to each account (i.e., user), the user's Twitter handle, the label (1 for compromised and 0 for not compromised), and the split used in our experiments (i.e., train, test, or eval). Due to Twitter's constraints, we cannot release the plain text of the tweets. Instead, the zip file tweetid_for_users.zip contains the tweet IDs associated with each user.
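As a rough illustration of the four-field layout described above, the following Python sketch parses the CSV and counts accounts per split and label. It is not part of the released code. The column names in FIELDS and the helper names read_accounts and count_by_split_and_label are hypothetical, and the sketch assumes the file is comma-separated with the fields in the order listed above and no header row (pass has_header=True if there is one).

```python
import csv
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical column names: the README only states that the file has four
# fields in this order (internal ID, Twitter handle, label, split).
FIELDS = ("account_id", "handle", "label", "split")


def read_accounts(lines: Iterable[str], has_header: bool = False) -> List[Dict[str, str]]:
    """Parse the rows of the CSV into dictionaries keyed by FIELDS."""
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the header row, if the file has one
    rows = []
    for row in reader:
        if len(row) != len(FIELDS):
            continue  # ignore blank or malformed lines
        rows.append(dict(zip(FIELDS, (cell.strip() for cell in row))))
    return rows


def count_by_split_and_label(rows: List[Dict[str, str]]) -> Counter:
    """Count accounts per (split, label); label 1 = compromised, 0 = not."""
    return Counter((r["split"], int(r["label"])) for r in rows)
```

To use it on the real file, open ID-TwitterHandle-Label-Split.csv with open(path, newline="", encoding="utf-8") and pass the file object to read_accounts; any iterable of text lines works the same way.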
The complete set of processed data files is available from http://goo.gl/vabH3Z
Please cite the following papers if you use either the dataset or the source code.
@inproceedings{karimi2018end,
  title={End-to-End Compromised Account Detection},
  author={Karimi, Hamid and VanDam, Courtland and Ye, Liyang and Tang, Jiliang},
  booktitle={2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)},
  pages={314--321},
  year={2018},
  organization={IEEE}
}
@inproceedings{vandam2018cadet,
  title={CADET: A Multi-View Learning Framework for Compromised Account Detection on Twitter},
  author={VanDam, Courtland and Tan, Pang-Ning and Tang, Jiliang and Karimi, Hamid},
  booktitle={2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)},
  pages={471--478},
  year={2018},
  organization={IEEE}
}
To follow my work, please visit my webpage: http://cse.msu.edu/~karimiha/