Implementation of Bidirectional Long Short-Term Memory (Bi-LSTM) and Attention to Detect Political Fake News Using IndoBERT and GloVe Embedding
DOI: https://doi.org/10.15294/rji.v3i2.159

Keywords: Natural Language Processing, NLP, political hoax news, IndoBERT, Bi-LSTM, prediction, GloVe embedding, attention mechanism

Abstract
Indonesian political news now spreads widely through various media, especially social and online media. However, a great deal of fake news is circulated to bring down political opponents or to attract public sympathy and gain supporters. Such news needs to be watched for, and preventive measures must be taken so that it does not cause misunderstanding among the wider public.
Purpose: This study aims to classify political news as hoax or fact based on its narrative, and to show how to build such a detector with a suitable architecture and word embeddings.
Methods/Study design/approach: A Bi-LSTM architecture with an attention mechanism is used to achieve the study's goals. Many hoax-detection studies have not considered sentence context or the contribution of individual words in a news text; this architecture is designed to overcome that limitation. IndoBERT is used to optimize the model for the Indonesian language, and GloVe supplies word weights from a pre-trained embedding. Tokenization is performed with IndoBERT and Keras to generate token ids and attention masks. With the token ids and attention masks as input, training is run for three architectural scenarios, each configured with 20 epochs, a batch size of 32, and a learning rate of 0.00001.
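A minimal sketch of this pipeline, assuming TensorFlow/Keras with the HuggingFace `transformers` library and the `indobenchmark/indobert-base-p1` checkpoint, is shown below. The checkpoint name, sequence length, layer sizes, and the additive attention layer are illustrative assumptions rather than the authors' exact configuration, and the GloVe branch (concatenating pre-trained word vectors) is omitted for brevity.

```python
# Hedged sketch of the tokenization + Bi-LSTM + attention setup.
# Assumptions: TensorFlow 2.x, HuggingFace transformers, and the
# "indobenchmark/indobert-base-p1" checkpoint (illustrative choice).
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModel

MAX_LEN = 128  # assumed maximum sequence length

# 1) IndoBERT tokenization: produces token ids and attention masks.
tokenizer = AutoTokenizer.from_pretrained("indobenchmark/indobert-base-p1")
enc = tokenizer(["Contoh narasi berita politik."],  # placeholder narration
                padding="max_length", truncation=True,
                max_length=MAX_LEN, return_tensors="tf")

# 2) Bi-LSTM + additive attention on top of IndoBERT token embeddings.
#    Pass from_pt=True if the checkpoint only ships PyTorch weights.
bert = TFAutoModel.from_pretrained("indobenchmark/indobert-base-p1")
ids = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="input_ids")
mask = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="attention_mask")
seq = bert(input_ids=ids, attention_mask=mask).last_hidden_state
h = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(64, return_sequences=True))(seq)
scores = tf.keras.layers.Dense(1, activation="tanh")(h)    # per-token score
weights = tf.keras.layers.Softmax(axis=1)(scores)          # attention weights
context = tf.keras.layers.Lambda(                          # weighted sum over time
    lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([weights, h])
out = tf.keras.layers.Dense(1, activation="sigmoid")(context)  # hoax vs. fact

model = tf.keras.Model([ids, mask], out)
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),  # learning rate from the paper
              loss="binary_crossentropy", metrics=["accuracy"])
# Training configuration reported in the study:
# model.fit([train_ids, train_masks], train_labels, epochs=20, batch_size=32)
```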
Result/Findings: The results are evaluated with a confusion matrix, from which accuracy, recall, precision, and F1-score are derived. The combination of Bi-LSTM + Attention + IndoBERT + GloVe obtains the best result: 97.71% accuracy, 96.33% precision, 97.93% recall, and 97.72% F1-score.
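For reference, these metrics can be computed from model predictions with scikit-learn; the labels below are placeholders for illustration, not the study's data.

```python
# Hedged sketch: deriving the reported metrics from a confusion matrix.
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

y_true = [1, 0, 1, 1, 0]  # 1 = hoax, 0 = fact (illustrative labels)
y_pred = [1, 0, 1, 0, 0]  # placeholder model predictions

print(confusion_matrix(y_true, y_pred))
print("accuracy: ", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))
print("f1-score: ", f1_score(y_true, y_pred))
```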
Novelty/Originality/Value: This study combines two word embeddings to ensure that word weights are fully defined and optimized within the Bi-LSTM and attention mechanism architecture.