Fake Twitter Account Classification of Fake News Spreading Using Naïve Bayes

Heru Agus Santoso(1), Eko Hari Rachmawanto(2), Ulfa Hidayati(3)


(1) Universitas Dian Nuswantoro
(2) Universitas Dian Nuswantoro
(3) Universitas Dian Nuswantoro

Abstract

Twitter is a very popular microblogging platform where users can find a wide range of information, current news, celebrity posts, and trending topics. Indonesia ranks 5th in the world by number of Twitter users. With so many users, Twitter is also exploited by parties with malicious goals, such as spreading fake news through fake accounts. Because fake accounts are often used to spread fake news, that spread must be limited quickly to minimize its negative impact. This research therefore aims to classify Twitter accounts as fake or genuine. The study applies the Naïve Bayes method, a data mining technique closely related to big data in decision making. Naïve Bayes is one of the most widely used classification methods because it offers good accuracy and fast computation. The classification uses nine parameters: Profile Created, Favorite Count, Follower Count, Following Count, Geo Enabled, Follower Rate, Following Rate, Follower Following Ratio, and Verified. The study uses a dataset of 210 Twitter accounts that spread fake news. The results show that Naïve Bayes is promising for classifying fake Twitter accounts: testing with 5% of the training set yields an accuracy of 80%.
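The classification idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say which Naïve Bayes variant the authors used, so the Gaussian variant below is an assumption, as are the function names `fit_gaussian_nb` and `predict`. The toy rows are invented for illustration only and use just two of the nine parameters (Follower Following Ratio and Favorite Count); they are not taken from the study's dataset.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels, var_floor=1e-9):
    """Estimate a prior plus per-feature mean and variance for each class.

    Assumption: numeric features modelled as independent Gaussians.
    var_floor keeps the variance strictly positive.
    """
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    model = {}
    for label, members in by_class.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((x - m) ** 2 for x in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior score."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            # Log of the Gaussian density, summed under the
            # naive (conditional independence) assumption.
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: [follower_following_ratio, favorite_count]
rows = [
    [0.10, 5.0], [0.20, 12.0], [0.15, 8.0],        # labelled fake
    [1.50, 300.0], [2.00, 450.0], [1.20, 380.0],   # labelled genuine
]
labels = ["fake"] * 3 + ["genuine"] * 3

model = fit_gaussian_nb(rows, labels)
low_ratio_account = predict(model, [0.12, 10.0])
high_ratio_account = predict(model, [1.80, 400.0])
```

Binary parameters such as Geo Enabled and Verified would more naturally be modelled with a Bernoulli likelihood; this sketch treats every feature as Gaussian only to stay short.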

Keywords

Social media, Twitter, Fake account, Naïve Bayes

Scientific Journal of Informatics (SJI)
p-ISSN 2407-7658 | e-ISSN 2460-0040
Published By Department of Computer Science Universitas Negeri Semarang
Website: https://journal.unnes.ac.id/nju/index.php/sji

This work is licensed under a Creative Commons Attribution 4.0 International License.