Identification of Tuberculosis Patient Characteristics Using K-Means Clustering

Betha Nur Sari


Tuberculosis remains one of Indonesia's major unresolved health problems, and the country ranks second in the world in the number of tuberculosis cases. This research studies how K-means clustering can be applied to tuberculosis patient treatment data in order to identify the characteristics of tuberculosis patients. The K-means clustering results were validated using gene shaving and the silhouette coefficient. The experimental results indicate the optimum number of clusters obtained from K-means clustering, as validated by gene shaving and the silhouette coefficient. K-means clustering divided the tuberculosis patients into four groups based on their characteristics. The groups were distinguished by category of disease (pulmonary TB, extrapulmonary TB, or both), the age of the patient, and the treatment outcome.
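The general idea of choosing the number of clusters with the silhouette coefficient can be sketched as follows. This is a minimal illustration only, not the implementation used in the study: the data points are invented toy values rather than patient records, the function names are hypothetical, the deterministic farthest-first initialization is an assumption made for reproducibility, and gene shaving is not covered.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first_init(points, k):
    """Assumed initialization: start at the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids

def kmeans(points, k, max_iter=100):
    """Plain K-means: alternate assignment and centroid update steps."""
    centroids = farthest_first_init(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, p in enumerate(points):
        own = [math.sqrt(sq_dist(p, q)) for j, q in enumerate(points)
               if j != i and labels[j] == labels[i]]
        if not own:
            continue  # singleton cluster: s(i) = 0 by convention
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(math.sqrt(sq_dist(p, q))
                for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        total += (b - a) / max(a, b)
    return total / len(points)

def best_k(points, candidates):
    """Return the candidate k with the highest mean silhouette, plus all scores."""
    scores = {k: silhouette(points, kmeans(points, k)[0]) for k in candidates}
    return max(scores, key=scores.get), scores

# Invented toy data: three well-separated groups of two numeric features.
toy_points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
              (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
              (9.0, 1.0), (9.1, 1.2), (8.9, 0.8)]

if __name__ == "__main__":
    k, scores = best_k(toy_points, [2, 3, 4])
    print("best k:", k)
    print("scores:", scores)
```

On real patient data, categorical attributes such as disease category or treatment outcome would need to be encoded numerically and scaled before distances are meaningful; that preprocessing is outside this sketch.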


characteristic, clustering, K-means, patient, tuberculosis








This work is licensed under a Creative Commons Attribution 4.0 International License.