Application of the Naïve Bayes Classifier Algorithm using N-Gram and Information Gain to Improve the Accuracy of Restaurant Review Sentiment Analysis
Abstract
Consumer reviews play an important role in influencing other people's decisions. Sentiment analysis can be used to identify whether a review is positive or negative. One of the popular techniques in sentiment analysis is the Naïve Bayes Classifier (NBC) algorithm, which performs well on this task. The purpose of this study was to improve the accuracy of the classifier in the sentiment analysis of restaurant reviews by applying N-Gram as feature extraction and Information Gain as feature selection. N-Gram is used to produce new, more varied features, while Information Gain selects the relevant features with high weights. The dataset used in this study is the sentiment-labeled dataset from the UCI Machine Learning Repository. Applying the NBC alone yielded an accuracy of 82.5%, while the NBC with N-Gram and Information Gain reached an accuracy of 86%. It can be concluded that applying N-Gram and Information Gain to the NBC algorithm improved the classification accuracy of restaurant review sentiment analysis by 3.5 percentage points.
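The pipeline described in the abstract (N-Gram feature extraction, Information Gain feature selection, then Naïve Bayes classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the four toy reviews are invented, the function names (`ngrams`, `information_gain`, `select_features`, `train_nb`, `predict`) are hypothetical, and whitespace tokenization, unigrams plus bigrams, a multinomial model with Laplace smoothing, and `k=4` selected features are all assumptions made for illustration.

```python
import math
from collections import Counter

def ngrams(text, n_max=2):
    """Return all 1..n_max-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def entropy(counts):
    """Shannon entropy (bits) of a list of class counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

def information_gain(doc_sets, labels, feature):
    """Information gain of splitting documents on presence of `feature`."""
    base = entropy(list(Counter(labels).values()))
    present = [y for d, y in zip(doc_sets, labels) if feature in d]
    absent = [y for d, y in zip(doc_sets, labels) if feature not in d]
    cond = 0.0
    for part in (present, absent):
        if part:
            cond += len(part) / len(labels) * entropy(list(Counter(part).values()))
    return base - cond

def select_features(docs, labels, k):
    """Keep the k features with the highest information gain."""
    doc_sets = [set(d) for d in docs]
    vocab = sorted({f for d in doc_sets for f in d})
    ranked = sorted(vocab, key=lambda f: information_gain(doc_sets, labels, f),
                    reverse=True)
    return set(ranked[:k])

def train_nb(docs, labels, vocab):
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    loglik = {}
    for c in classes:
        counts = Counter(f for d, y in zip(docs, labels) if y == c
                         for f in d if f in vocab)
        total = sum(counts.values()) + len(vocab)
        loglik[c] = {f: math.log((counts[f] + 1) / total) for f in vocab}
    return prior, loglik

def predict(model, doc):
    """Return the class with the highest log posterior; unseen features are ignored."""
    prior, loglik = model
    scores = {c: prior[c] + sum(loglik[c][f] for f in doc if f in loglik[c])
              for c in prior}
    return max(scores, key=scores.get)

# Invented toy data: 1 = positive, 0 = negative.
reviews = [
    ("the food was great", 1),
    ("great service and great food", 1),
    ("the food was not good", 0),
    ("not good service", 0),
]
docs = [ngrams(text) for text, _ in reviews]
labels = [label for _, label in reviews]

vocab = select_features(docs, labels, k=4)
model = train_nb(docs, labels, vocab)
print(predict(model, ngrams("great food")))
print(predict(model, ngrams("not good at all")))
```

In this toy setting the bigram "not good" becomes a feature in its own right, which is the kind of additional feature N-Gram extraction is meant to supply, and Information Gain discards features such as "the" that appear equally in both classes.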
Copyright (c) 2020 Journal of Advances in Information Systems and Technology
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.