Classification Modeling with RNN-based, Random Forest, and XGBoost for Imbalanced Data: A Case of Early Crash Detection in ASEAN-5 Stock Markets

Deri Siswara; Agus M. Soleh; Aji Hamim Wigena

doi:10.15294/sji.v11i3.4067

Authors

Deri Siswara, IPB University
Agus M. Soleh, IPB University
Aji Hamim Wigena, IPB University

DOI:

https://doi.org/10.15294/sji.v11i3.4067

Keywords:

Early crash detection, GRU, LSTM, RNN, Random forest, XGBoost

Abstract

Purpose: This research aims to evaluate the performance of several Recurrent Neural Network (RNN) architectures, including Simple RNN, Gated Recurrent Units (GRU), and Long Short-Term Memory (LSTM), compared to classic algorithms such as Random Forest and XGBoost, in building classification models for early crash detection in the ASEAN-5 stock markets.

Methods: The study examines imbalanced data, an expected consequence of the rarity of market crashes. It analyzes daily data from 2010 to 2023 across the major stock markets of the ASEAN-5 countries: Indonesia, Malaysia, Singapore, Thailand, and the Philippines. The target variable flags a market crash when a country's primary stock price index falls below the Value at Risk (VaR) threshold, evaluated at the 5%, 2.5%, and 1% levels. Predictors include technical indicators from major local and global stock markets and from commodity markets. The study incorporates 213 predictors with their respective lags (5, 10, 15, 22, 50, 200) and uses a time step of 7, which expands the total number of predictors to 1,491 (213 × 7). The challenge of data imbalance is addressed with SMOTE-ENN. Model performance is evaluated using the false alarm rate, hit rate, balanced accuracy, and the precision-recall curve (PRC) score.

Results: All RNN-based architectures outperform Random Forest and XGBoost. Among the RNN architectures, Simple RNN performs best, primarily because of the simple characteristics of the data and the model's focus on short-term information.

Novelty: This study extends the range of phenomena observed in previous studies by examining a different geographical zone and period and by introducing methodological adjustments.
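
The Methods paragraph describes three mechanical steps that a short sketch may make more concrete: labeling crash days against a VaR threshold, stacking 7 time steps of 213 predictors into 1,491 inputs, and scoring with hit rate, false alarm rate, and balanced accuracy. The Python below is only an illustration under stated assumptions, not the authors' implementation: it assumes VaR is taken as the empirical quantile of daily returns, that each window is labeled by its last day, and that the data are synthetic. The function names `crash_labels`, `make_windows`, and `crash_metrics` are hypothetical.

```python
import numpy as np


def crash_labels(returns, alpha):
    """Flag days whose return is below the empirical alpha-quantile.

    Assumption: historical (empirical-quantile) VaR on daily returns.
    """
    threshold = np.quantile(returns, alpha)
    return (returns < threshold).astype(int), threshold


def make_windows(features, labels, time_step=7):
    """Stack `time_step` consecutive rows into one sample.

    Assumption: each window takes the label of its last day.
    """
    n_samples = features.shape[0] - time_step + 1
    windows = np.stack([features[i:i + time_step] for i in range(n_samples)])
    return windows, labels[time_step - 1:]


def crash_metrics(y_true, y_pred):
    """Return hit rate, false alarm rate, and balanced accuracy."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    hit_rate = tp / (tp + fn)        # share of crashes that were flagged
    false_alarm = fp / (fp + tn)     # share of calm days flagged as crashes
    balanced_accuracy = (hit_rate + (1.0 - false_alarm)) / 2.0
    return hit_rate, false_alarm, balanced_accuracy


# Synthetic stand-in data: 1,000 trading days, 213 predictors.
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=1000)
features = rng.normal(size=(1000, 213))

labels, var_5 = crash_labels(returns, alpha=0.05)
X_seq, y = make_windows(features, labels, time_step=7)

# Sequence models would consume X_seq; tree-based models would need the
# flattened form, which has 7 * 213 = 1,491 columns.
X_flat = X_seq.reshape(len(X_seq), -1)
print(X_seq.shape, X_flat.shape)
```

With a 5% threshold roughly one day in twenty is labeled a crash, which is the imbalance the abstract refers to; resampling such as SMOTE-ENN is not covered by this sketch.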

Author Biographies

Agus M. Soleh, IPB University

Statistics and Data Sciences Department, Faculty of Mathematics and Natural Sciences

Aji Hamim Wigena, IPB University

Statistics and Data Sciences Department, Faculty of Mathematics and Natural Sciences
