Classification Modeling with RNN-based, Random Forest, and XGBoost for Imbalanced Data: A Case of Early Crash Detection in ASEAN-5 Stock Markets

Authors

  • Deri Siswara, IPB University
  • Agus M. Soleh, IPB University
  • Aji Hamim Wigena, IPB University

DOI:

https://doi.org/10.15294/sji.v11i3.4067

Keywords:

Early crash detection, GRU, LSTM, RNN, Random forest, XGBoost

Abstract

Purpose: This research aims to evaluate the performance of several Recurrent Neural Network (RNN) architectures, including Simple RNN, Gated Recurrent Units (GRU), and Long Short-Term Memory (LSTM), compared to classic algorithms such as Random Forest and XGBoost, in building classification models for early crash detection in the ASEAN-5 stock markets.

Methods: Because market crashes are rare, the data are imbalanced. The study analyzes daily data from 2010 to 2023 for the major stock markets of the ASEAN-5 countries: Indonesia, Malaysia, Singapore, Thailand, and the Philippines. The target variable is a market crash, defined as the primary stock price index falling below a Value at Risk (VaR) threshold of 5%, 2.5%, or 1%. Predictors include technical indicators from major local and global markets and from commodity markets. The study incorporates 213 predictors with their respective lags (5, 10, 15, 22, 50, 200) and uses a time step of 7, which expands the total number of predictors to 1,491 (213 × 7). Data imbalance is addressed with SMOTE-ENN. Model performance is evaluated using the false alarm rate, hit rate, balanced accuracy, and the precision-recall curve (PRC) score.
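The labeling, windowing, and evaluation steps described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of a historical (empirical-quantile) VaR, the `<=` comparison, the one-day `horizon`, and the false alarm rate defined as FP / (FP + TN) are all assumptions made for illustration. SMOTE-ENN resampling and the models themselves are omitted.

```python
from typing import Dict, List, Sequence, Tuple


def empirical_var(returns: Sequence[float], alpha: float) -> float:
    """Historical VaR (assumed): the empirical alpha-quantile of the returns."""
    ordered = sorted(returns)
    k = max(0, int(alpha * len(ordered)) - 1)  # simple lower-quantile rule
    return ordered[k]


def label_crashes(returns: Sequence[float], alpha: float) -> List[int]:
    """Label a day 1 (crash) when its return is at or below the VaR threshold."""
    threshold = empirical_var(returns, alpha)
    return [1 if r <= threshold else 0 for r in returns]


def make_windows(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    time_step: int = 7,
    horizon: int = 1,  # hypothetical: predict the label `horizon` days ahead
) -> Tuple[List[List[List[float]]], List[int]]:
    """Build (window, label) pairs; each window holds `time_step` daily rows."""
    X, y = [], []
    last_start = len(features) - time_step - horizon
    for start in range(last_start + 1):
        end = start + time_step  # exclusive end of the window
        X.append([list(row) for row in features[start:end]])
        y.append(labels[end + horizon - 1])
    return X, y


def flatten_window(window: Sequence[Sequence[float]]) -> List[float]:
    """Flatten a (time_step x n_features) window into one row for tree models."""
    return [value for row in window for value in row]


def crash_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Hit rate, false alarm rate, and balanced accuracy from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    hit_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    balanced_accuracy = (hit_rate + (1.0 - false_alarm_rate)) / 2.0
    return {
        "hit_rate": hit_rate,
        "false_alarm_rate": false_alarm_rate,
        "balanced_accuracy": balanced_accuracy,
    }
```

With 213 features per day and a time step of 7, `flatten_window` yields 213 × 7 = 1,491 values per sample, matching the predictor count in the abstract. The sequence models would consume the unflattened windows, while Random Forest and XGBoost would need the flattened rows.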

Result: All RNN-based architectures outperform Random Forest and XGBoost. Among the RNN architectures, Simple RNN performs best, primarily because the data characteristics are simple and the model focuses on short-term information.

Novelty: This study extends the range of phenomena observed in previous studies by considering different geographical zones and periods and by introducing methodological adjustments.

Author Biographies

  • Agus M. Soleh, IPB University

    Statistics and Data Sciences Department, Faculty of Mathematics and Natural Sciences

  • Aji Hamim Wigena, IPB University

    Statistics and Data Sciences Department, Faculty of Mathematics and Natural Sciences

Article ID

4067

Published

05-08-2024

Section

Articles

How to Cite

Classification Modeling with RNN-based, Random Forest, and XGBoost for Imbalanced Data: A Case of Early Crash Detection in ASEAN-5 Stock Markets. (2024). Scientific Journal of Informatics, 11(3), 569-582. https://doi.org/10.15294/sji.v11i3.4067