Increasing Accuracy of The Random Forest Algorithm Using PCA and Resampling Techniques with Data Augmentation for Fraud Detection of Credit Card Transaction

  • Andhika Seno Tamtama, Department of Computer Science, Faculty of Mathematics and Natural Sciences, Universitas Negeri Semarang
  • Riza Arifudin, Department of Computer Science, Faculty of Mathematics and Natural Sciences, Universitas Negeri Semarang
Keywords: Imbalanced Data, Random Forest, PCA, Resampling Technique, Data Augmentation

Abstract

This study analyzes credit card transactions using the random forest algorithm as the classifier. The main problem in classifying the credit card fraud dataset is class imbalance, which biases the model learned from the training data. To address this, PCA is combined with resampling techniques and data augmentation to optimize random forest classification. PCA is applied in the preprocessing stage to transform the data into numerical features, while resampling and data augmentation are applied to bring the classes into balance. The dataset is a European credit card fraud dataset containing 284,807 transactions. Model accuracy was measured using a confusion matrix. The random forest combined with PCA and resampling techniques with data augmentation achieved the highest accuracy, 99.9976%.
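To make the imbalance problem and the confusion-matrix evaluation concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not specify the resampling or augmentation method, so random oversampling with small Gaussian jitter is used here purely as an assumed stand-in; the function names, the noise level, and the toy data (98 legitimate rows, 2 fraud rows) are all hypothetical.

```python
import random
from collections import Counter


def oversample_with_jitter(rows, labels, minority=1, noise=0.01, seed=0):
    """Balance classes by copying minority rows with small Gaussian noise.

    Assumption: the jitter stands in for "data augmentation"; the paper's
    actual augmentation method is not described in the abstract.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    if not minority_rows:
        raise ValueError("no minority samples to resample from")
    new_rows, new_labels = list(rows), list(labels)
    # Add synthetic minority rows until the minority matches the largest class.
    for _ in range(max(counts.values()) - counts[minority]):
        base = rng.choice(minority_rows)
        new_rows.append([v + rng.gauss(0.0, noise) for v in base])
        new_labels.append(minority)
    return new_rows, new_labels


def confusion_matrix(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fp, fn, tn):
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy data: 98 legitimate (0) and 2 fraudulent (1) transactions.
toy_rows = [[0.0, 0.0]] * 98 + [[1.0, 1.0]] * 2
toy_labels = [0] * 98 + [1] * 2

# A classifier that always predicts "legitimate" looks accurate but finds no fraud.
always_legit = [0] * len(toy_labels)
cm = confusion_matrix(toy_labels, always_legit)
print("accuracy:", accuracy(*cm), "recall:", recall(*cm))

# After oversampling, both classes have the same number of rows.
balanced_rows, balanced_labels = oversample_with_jitter(toy_rows, toy_labels)
print(Counter(balanced_labels))
```

On the toy data, the always-legitimate predictor reaches high accuracy while detecting no fraud at all, which is why accuracy figures on imbalanced data are usually read alongside the full confusion matrix. In a real pipeline, resampling would normally be applied to the training split only, so that the test set keeps the original class distribution.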

Published
2022-12-08
How to Cite
Tamtama, A., & Arifudin, R. (2022). Increasing Accuracy of The Random Forest Algorithm Using PCA and Resampling Techniques with Data Augmentation for Fraud Detection of Credit Card Transaction. Journal of Advances in Information Systems and Technology, 4(1), 60-76. https://doi.org/10.15294/jaist.v4i1.60865
Section
Articles