The Effect of Best First and Spreadsubsample on Selection of a Feature Wrapper With Naïve Bayes Classifier for The Classification of the Ratio of Inpatients

M Rizky Wijaya, Ristu Saptono, Afrizal Doewes

Abstract


Diabetes can lead to mortality and disability, so patients may need to be readmitted to hospital for further treatment. Previous research on feature selection with greedy stepwise forward search failed to classify the inpatient ratio of patients: precision and recall were both 0 when 60%, 75%, 80%, and 90% of the data were used for training. That research suggested handling the class imbalance in the data, which contains 6293 readmitted records against 64141 others. This research aimed to determine the effect of selecting the best model with best first search instead of greedy stepwise forward search, and of sampling the data with spreadsubsample to resolve the class imbalance. The data were patient records from 130 American hospitals from 1999 to 2008, comprising 70434 records. The methods used were best first search and spreadsubsample. With best first, precision reached 0.4 and 0.333 on the 75% and 90% training splits, while with spreadsubsample both precision and recall increased more substantially. Spreadsubsample therefore has a greater effect on precision and recall than best first.
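
The class imbalance described above can be illustrated with a minimal sketch of spread subsampling. This is not the WEKA SpreadSubsample filter itself, only a simplified Python approximation of the idea: cap every class at a multiple of the rarest class count. The function name `spread_subsample`, the `max_spread` parameter, and the class labels are assumptions made for illustration; only the counts 6293 and 64141 come from the abstract.

```python
import random
from collections import defaultdict


def spread_subsample(rows, labels, max_spread=1.0, seed=0):
    """Randomly undersample so that no class exceeds max_spread times
    the size of the rarest class (hypothetical helper, max_spread >= 1)."""
    # Group record indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    # The rarest class sets the cap for every other class.
    min_count = min(len(ix) for ix in by_class.values())
    cap = int(max_spread * min_count)

    rng = random.Random(seed)
    keep = []
    for ix in by_class.values():
        if len(ix) > cap:
            ix = rng.sample(ix, cap)  # drop surplus majority records
        keep.extend(ix)

    keep.sort()  # preserve the original record order
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Class counts taken from the abstract; the label names are illustrative.
labels = ["readmitted"] * 6293 + ["other"] * 64141
rows = list(range(len(labels)))

_, balanced = spread_subsample(rows, labels, max_spread=1.0)
counts = {y: balanced.count(y) for y in set(balanced)}
print(counts)
```

With `max_spread=1.0` the majority class is cut down to the size of the minority class, so a classifier can no longer reach high accuracy simply by predicting the majority class for every record. A larger `max_spread` would keep more majority records at the cost of a less balanced training set.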

Keywords


best first, unbalanced dataset, spreadsubsample, classification, feature selection






DOI: https://doi.org/10.15294/sji.v3i2.7910





Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.