• Faris Febri Rahanto Universitas Negeri Semarang
  • Iqbal Kharisudin
Keywords: Naive Bayes Classifier, Sentiment Analysis, Text Mining, Web Scraping


The TripAdvisor site hosts visitor reviews of The Wujil Resort & Conventions. Each visitor can leave a review in the form of criticism, suggestions, or a rating of the hotel. Because the number of incoming reviews is large, a dedicated technique is needed to extract information from them. This study aims to analyze the sentiment of the review data using the Naive Bayes Classifier method. Web scraping is used to collect the visitor review data for The Wujil Resort & Conventions from the website's pages. The reviews are then classified with machine learning, using the Naive Bayes Classifier method. The classification results are further analyzed with the text mining method, whose main concept is extensive exploration and extraction over a large and growing volume of data, in order to find facts and information that are considered important and can be useful for various purposes. Classification with the Naive Bayes method achieves an accuracy of 76.6%. Overall, the text mining analysis shows that more visitors give positive ratings than negative ratings.
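
To make the classification step concrete, the following is a minimal sketch of a Naive Bayes sentiment classifier in plain Python. The abstract does not state which Naive Bayes variant, tokenizer, or software was used, so the multinomial model with add-one (Laplace) smoothing, the function names, and the tiny review set below are all assumptions made for illustration; none of the example reviews come from the study's data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def train(samples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior under multinomial Naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior: share of training reviews carrying this label.
        score = math.log(label_counts[label] / total_docs)
        # Add-one smoothing: every vocabulary word gets one extra count.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, samples):
    """Fraction of (text, label) pairs whose predicted label matches the true label."""
    correct = sum(predict(model, text) == label for text, label in samples)
    return correct / len(samples)


# Hypothetical toy reviews, invented for this sketch (not the study's data).
TRAIN = [
    ("the room was clean and the staff were friendly", "positive"),
    ("great pool and lovely view", "positive"),
    ("friendly staff and great breakfast", "positive"),
    ("the room was dirty and the service was slow", "negative"),
    ("noisy room and rude staff", "negative"),
    ("slow service and bad breakfast", "negative"),
]
TEST = [
    ("clean room and great view", "positive"),
    ("dirty room and rude service", "negative"),
]

model = train(TRAIN)
test_accuracy = accuracy(model, TEST)
```

Accuracy here is simply the share of held-out reviews whose predicted label matches the true label, which is the same kind of figure as the 76.6% reported above; on a toy set this small the number says nothing about real-world performance.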

