Increasing Accuracy of The Random Forest Algorithm Using PCA and Resampling Techniques with Data Augmentation for Fraud Detection of Credit Card Transaction
Abstract
This study uses the random forest algorithm to classify credit card transactions. The main problem in classifying the credit card fraud dataset is class imbalance, which skews the results of the model trained on it. To address this problem, a combination of PCA and resampling techniques with data augmentation is proposed to optimize the random forest classifier. PCA is used in the preprocessing stage to transform the data into numerical features, while resampling and data augmentation are used to bring the classes into balance. The dataset is a European credit card fraud dataset containing 284,807 transactions. Model accuracy was measured using a confusion matrix. The highest accuracy, 99.9976%, was obtained by the random forest combined with PCA and resampling techniques with data augmentation.
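The two steps named above that are easiest to illustrate in isolation are class balancing and accuracy measurement from a confusion matrix. The following is a minimal sketch only, not the authors' implementation: the abstract does not say which resampling technique was used, so simple random oversampling is assumed here, and the function names (`random_oversample`, `confusion_matrix`, `accuracy`) and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest one (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def confusion_matrix(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Accuracy = correctly classified samples / all samples."""
    return (tp + tn) / (tp + fp + fn + tn)


# Toy data (hypothetical): 8 legitimate (0) and 2 fraudulent (1) transactions.
rows = [[float(i)] for i in range(10)]
labels = [0] * 8 + [1] * 2
bal_rows, bal_labels = random_oversample(rows, labels)
balanced_counts = Counter(bal_labels)  # both classes now have 8 rows

# Toy predictions (hypothetical) scored with the confusion matrix.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred)
acc = accuracy(*cm)
```

In a full pipeline, balancing of this kind is normally applied to the training split only, so that the confusion matrix is computed on untouched test data.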
Copyright (c) 2022 Journal of Advances in Information Systems and Technology
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.