Implementation of the Term Frequency-Inverse Document Frequency Method for Mental Health Classification Using Algorithm Support Vector Machine
DOI: https://doi.org/10.15294/rji.v3i2.1921
Keywords: Machine Learning, SVM, TF-IDF, Mental Health, Classification text
Abstract. Mental health is a person's emotional, social, and psychological condition. A person's psychological well-being can be influenced by their emotional experiences, behavior, environment, and family and educational background. Mental health disorders should not be underestimated, because the number of cases is currently increasing.
Purpose: The SVM algorithm combined with the TF-IDF method can produce good accuracy in text classification. This research therefore aims to implement the TF-IDF method and the SVM algorithm for mental health classification and to determine the accuracy they achieve.
Study Method/Design/Approach: This research uses Term Frequency-Inverse Document Frequency (TF-IDF) in the vectorization step to convert text into a numerical representation, and the Support Vector Machine (SVM) algorithm for modeling. The dataset is the Mental Health Corpus obtained from the Kaggle website. It consists of 27,977 records in two classes, each record containing a text and a label. Before the model is applied, the text is preprocessed with stopword removal and stemming. The cleaned text is then vectorized using CountVectorizer and TF-IDF.
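The vectorization step described above can be illustrated with a minimal sketch in plain Python. It is not the study's code: the stopword list, the example sentences, and the helper names `tokenize` and `tfidf` are assumptions made for illustration, stemming is omitted, and the weighting is the textbook form tf × ln(N/df). Libraries such as scikit-learn apply a smoothed and normalized variant, so their values will differ.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; the study's list is not given).
STOPWORDS = {"i", "a", "the", "and", "to", "is", "am", "so", "my", "of"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf(docs):
    """Return (vocabulary, matrix) where each weight is count * ln(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)  # raw term counts, as a count vectorizer would give
        matrix.append([counts[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, matrix


# Made-up sentences for illustration; not taken from the Mental Health Corpus.
docs = [
    "I feel so anxious and tired",
    "The weather is nice and I feel happy",
    "I am tired of feeling anxious",
]
vocab, matrix = tfidf(docs)
```

Terms that occur in only one document (here "happy" or "weather") receive the largest weights, while terms shared by several documents are down-weighted. Each row of `matrix` is the numerical representation that a classifier such as an SVM would take as input.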
Results/Findings: The SVM algorithm was evaluated with four kernels: linear, RBF, polynomial, and sigmoid. The RBF kernel gave the best results compared with the other kernels, with an accuracy of 92.62%, precision of 92.64%, recall of 92.62%, and F1-score of 92.62%.
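For readers unfamiliar with the reported metrics, the sketch below shows one way they can be computed from true and predicted labels. The abstract does not state how precision, recall, and F1-score were averaged across the two classes. Support-weighted averaging is assumed here, and the function name `weighted_scores` and the toy labels are hypothetical. Under that assumption, weighted recall always equals accuracy, which would be consistent with the identical 92.62% values reported for both.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = support[label]
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / actual if actual else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = actual / n  # each class contributes in proportion to its support
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return accuracy, precision, recall, f1


# Toy labels for illustration only; not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
accuracy, precision, recall, f1 = weighted_scores(y_true, y_pred)
```

In a kernel comparison, the same function would be applied to the predictions of each fitted model so that the four kernels are scored on an identical test split.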
Novelty/Originality/Value: It can be concluded that the SVM algorithm and the TF-IDF method can be used for mental health classification and yield high accuracy.






