Hyperparameter Tuning of Long Short-Term Memory Model for Clickbait Classification in News Headlines

  • Grace Yudha Satriawan, Universitas Negeri Semarang
  • Budi Prasetiyo
Keywords: Hyperparameter Tuning, Neural Network, Clickbait, Natural Language Processing

Abstract

Information on the internet today is diverse and spreads very quickly. It has become easier for the general public to obtain, thanks to the many online media outlets, including news portals that provide up-to-date coverage. Many news portals earn revenue from pay-per-click advertising, which encourages article writers to use clickbait techniques to attract visitors. Clickbait, however, has negative effects, including a decline in journalism quality and the spread of hoaxes. This problem can be mitigated by using text classification to detect clickbait in news headlines. One method suited to text classification is the artificial neural network, an algorithm that can independently adjust its input weights, making it highly effective for modeling non-linear statistical data. Neural network architectures, especially the Long Short-Term Memory (LSTM), have been widely used across natural language processing tasks with satisfying results, including text classification. A model's performance can be further improved by adjusting its hyperparameters: parameters that cannot be learned from data and must be defined before training. In this research, LSTM models were applied to clickbait classification in news headlines. Sixteen neural network models were trained, each with a different hyperparameter configuration, and hyperparameter tuning was carried out using the random search algorithm. The dataset used was the CLICK-ID dataset published by William & Sari (2020) [1], comprising 15,000 annotated headlines. The results show that the best LSTM model achieved a validation accuracy of 0.8030, higher than that reported by William & Sari, with a validation loss of 0.4876. Using this model, researchers were able to classify clickbait in news headlines with fairly good accuracy.

Purpose: The aim of this study was to develop and evaluate an LSTM model with hyperparameter tuning for clickbait classification on news headlines. The study also aims to compare the performance of a simple LSTM and a bidirectional LSTM on this task.

Methods: This study uses the CLICK-ID dataset and applies different text preprocessing techniques. The dataset was then used to build and train 16 LSTM models with different hyperparameter configurations, which were evaluated using validation accuracy and validation loss. Hyperparameter tuning was performed using random search.
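The random-search procedure described above can be sketched as follows. Note that the search space below (layer counts, unit sizes, dropout rates, learning rates) is an illustrative assumption consistent with the hyperparameters reported in the results, not the exact space used in the study, and `dummy_evaluate` stands in for actually training an LSTM on the CLICK-ID data.

```python
import random

# Illustrative hyperparameter search space (assumed values, not the
# exact space used in the study).
SEARCH_SPACE = {
    "bidirectional": [True, False],
    "num_layers": [1, 2],
    "units": [32, 64, 128],
    "dropout_rate": [0.0, 0.2, 0.5],
    "learning_rate": [0.01, 0.001, 0.0001],
}


def sample_configuration(space, rng):
    """Draw one hyperparameter configuration uniformly at random."""
    return {name: rng.choice(values) for name, values in space.items()}


def random_search(space, n_trials, train_and_evaluate, seed=42):
    """Evaluate n_trials random configurations and return the best one.

    train_and_evaluate(config) must return a validation score
    (here: validation accuracy, higher is better).
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_configuration(space, rng)
        score = train_and_evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


if __name__ == "__main__":
    # Stand-in for real model training, so the search loop can be run
    # end to end without a deep learning framework.
    def dummy_evaluate(config):
        return config["units"] / 128 - config["dropout_rate"]

    best, score = random_search(SEARCH_SPACE, n_trials=16,
                                train_and_evaluate=dummy_evaluate)
    print(best, score)
```

With 16 trials, this mirrors the study's setup of 16 trained models; unlike grid search, each trial samples the whole space independently.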

Result: The results show that the best model for clickbait classification on news headlines is a bidirectional LSTM with one layer, 64 units, a 0.2 dropout rate, and a 0.001 learning rate. This model achieves a validation accuracy of 0.8030 and a validation loss of 0.4876. The results also show that hyperparameter tuning with random search can improve LSTM performance: because it samples configurations from the entire search space rather than a fixed grid, no region of the space is assigned zero probability, which helps it find near-optimal hyperparameter values.
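For reference, the two reported metrics, validation accuracy and binary cross-entropy (validation) loss, can be computed from model outputs as in the sketch below. The labels and predicted probabilities are made-up illustrative values, not outputs of the paper's model.

```python
import math


def validation_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of headlines whose thresholded prediction matches the label."""
    correct = sum(int(p >= threshold) == t for t, p in zip(y_true, y_prob))
    return correct / len(y_true)


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy loss; eps guards against log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


# Made-up labels (1 = clickbait) and predicted probabilities.
labels = [1, 0, 1, 1, 0]
probs = [0.9, 0.2, 0.6, 0.4, 0.3]

print(validation_accuracy(labels, probs))  # → 0.8
print(binary_cross_entropy(labels, probs))
```

Validation loss (0.4876 in the study) is this cross-entropy computed on held-out data, while validation accuracy (0.8030) is the thresholded match rate.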

Novelty: This study compares and analyzes different text preprocessing methods and different model configurations to find the best model for clickbait classification on news headlines. It also applies hyperparameter tuning to find optimal hyperparameter values and select the best-performing model.

References

[1] A. William and Y. Sari, “CLICK-ID: A Novel Dataset for Indonesian Clickbait Headlines,” Data Brief, vol. 32, p. 106231, Oct. 2020, doi: 10.1016/j.dib.2020.106231.
[2] I. Beleslin and B. R. Njegovan, “Clickbait Titles: Risky Formula for Attracting Readers and Advertisers,” in XVII International Scientific Conference on Industrial Systems, 2017.
[3] H. T. Zheng, J. Y. Chen, X. Yao, A. K. Sangaiah, Y. Jiang, and C. Z. Zhao, “Clickbait Convolutional Neural Network,” Symmetry (Basel), vol. 10, no. 5, May 2018, doi: 10.3390/sym10050138.
[4] C. I. Coste and D. Bufnea, “Advances in Clickbait and Fake News Detection Using New Language-Independent Strategies,” Journal of Communications Software and Systems, vol. 17, no. 3, 2021, doi: 10.24138/jcomss-2021-0038.
[5] M. Potthast, S. Köpsel, B. Stein, and M. Hagen, “Clickbait Detection,” in Advances in Information Retrieval (ECIR 2016), Lecture Notes in Computer Science, pp. 810–817, 2016, doi: 10.1007/978-3-319-30671-1.
[6] A. Chakraborty, B. Paranjape, S. Kakarla, and N. Ganguly, “Stop Clickbait: Detecting and Preventing Clickbaits in Online News Media,” in Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016, 2016. doi: 10.1109/ASONAM.2016.7752207.
[7] G. Loewenstein, “The Psychology of Curiosity: A Review and Reinterpretation.,” Psychol Bull, vol. 116, no. 1, pp. 75–98, Jul. 1994, doi: 10.1037/0033-2909.116.1.75.
[8] J. Kuiken, A. Schuth, M. Spitters, and M. Marx, “Effective Headlines of Newspaper Articles in a Digital Environment,” Digital Journalism, vol. 5, no. 10, 2017, doi: 10.1080/21670811.2017.1279978.
[9] W. Wang, F. Feng, X. He, H. Zhang, and T. S. Chua, “Clicks Can be Cheating: Counterfactual Recommendation for Mitigating Clickbait Issue,” in SIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021. doi: 10.1145/3404835.3462962.
[10] P. Biyani, K. Tsioutsiouliklis, and J. Blackmer, “‘8 Amazing Secrets for Getting More Clicks’: Detecting Clickbaits in News Streams Using Article Informality,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 30, no. 1, Feb. 2016, doi: 10.1609/aaai.v30i1.9966.
[11] M. M. Mirończuk and J. Protasiewicz, “A Recent Overview of the State-of-the-Art Elements of Text Classification,” Expert Systems with Applications, vol. 106. 2018. doi: 10.1016/j.eswa.2018.03.058.
[12] K. Kowsari, K. J. Meimandi, M. Heidarysafa, S. Mendu, L. Barnes, and D. Brown, “Text Classification Algorithms: A Survey,” Information (Switzerland), vol. 10, no. 4, 2019, doi: 10.3390/info10040150.
[13] H. Hassani, C. Beneki, S. Unger, M. T. Mazinani, and M. R. Yeganegi, “Text Mining in Big Data Analytics,” Big Data and Cognitive Computing, vol. 4, no. 1, 2020, doi: 10.3390/bdcc4010001.
[14] U. Güçlü and M. A. J. van Gerven, “Modeling the Dynamics of Human Brain Activity with Recurrent Neural Networks,” Front Comput Neurosci, vol. 11, 2017, doi: 10.3389/fncom.2017.00007.
[15] IBM Cloud Education, “What are Neural Networks? | IBM,” Ibm. 2020.
[16] S. Sharma, S. Sharma, and A. Athaiya, “Activation Functions in Neural Networks,” International Journal of Engineering Applied Sciences and Technology, vol. 04, no. 12, 2020, doi: 10.33564/ijeast.2020.v04i12.054.
[17] O. Agasi, J. Anderson, A. Cole, M. Berthold, M. Cox, and D. Dimov, “What is an Artificial Neural Network (ANN)? - Definition from Techopedia,” Techopedia. 2018.
[18] A. Sherstinsky, “Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network,” Physica D, vol. 404, 2020, doi: 10.1016/j.physd.2019.132306.
[19] B. Naeem, A. Khan, M. O. Beg, and H. Mujtaba, “A Deep Learning Framework for Clickbait Detection on Social Area Network Using Natural Language Cues,” J Comput Soc Sci, vol. 3, no. 1, 2020, doi: 10.1007/s42001-020-00063-y.
[20] S. Cornegruta, R. Bakewell, S. Withey, and G. Montana, “Modelling Radiological Language with Bidirectional Long Short-Term Memory Networks,” 2016.
[21] S. K. Palaniswamy and R. Venkatesan, “Hyperparameters Tuning of Ensemble Model for Software Effort Estimation,” J Ambient Intell Humaniz Comput, vol. 12, no. 6, 2021, doi: 10.1007/s12652-020-02277-4.
[22] L. Yang and A. Shami, “On Hyperparameter Optimization of Machine Learning Algorithms: Theory and Practice,” Jul. 2020, doi: 10.1016/j.neucom.2020.07.061.
[23] B. Wang, A. Wang, F. Chen, Y. Wang, and C. C. J. Kuo, “Evaluating Word Embedding Models: Methods and Experimental Results,” APSIPA Transactions on Signal and Information Processing, vol. 8. 2019. doi: 10.1017/ATSIP.2019.12.
[24] Google Code Archive, “word2vec,” Jul. 30, 2013. https://code.google.com/archive/p/word2vec/ (accessed Nov. 02, 2022).
[25] D. Goldhahn, T. Eckart, and U. Quasthoff, “Building Large Monolingual Dictionaries at the Leipzig Corpora Collection: From 100 to 200 Languages,” 2012. [Online]. Available: http://corpora.uni-leipzig.de
[26] M. Basaldella, E. Antolli, G. Serra, and C. Tasso, “Bidirectional LSTM Recurrent Neural Network for Keyphrase Extraction,” in Communications in Computer and Information Science, Springer Verlag, 2018, pp. 180–187. doi: 10.1007/978-3-319-73165-0_18.
[27] C. Sammut and G. I. Webb, Encyclopedia of Machine Learning. Boston, MA: Springer US, 2010. doi: 10.1007/978-0-387-30164-8.
[28] S. V. Stehman, “Selecting and Interpreting Measures of Thematic Classification Accuracy,” Remote Sensing of Environment, vol. 62, no. 1, pp. 77–89, 1997.
[29] M. Vakili, M. Ghamsari, and M. Rezaei, “Performance Analysis and Comparison of Machine and Deep Learning Algorithms for IoT Data Classification,” 2020.
Published
2024-03-31
How to Cite
Satriawan, G., & Prasetiyo, B. (2024). Hyperparameter Tuning of Long Short-Term Memory Model for Clickbait Classification in News Headlines. Recursive Journal of Informatics, 2(1), 28-36. https://doi.org/10.15294/rji.v2i1.71831
Section
Articles