Comparison of Combinations of the Naïve Bayes Classification Algorithm with Fuzzy C-Means and K-Means for Determining Beef Cattle Quality in Semarang Regency

The quality of beef cattle affects the quality of the meat that is consumed. This research processes data to classify beef cattle quality. The data comprise 196 records collected in 2016 and 2017. Three variables are used to determine the quality of beef cattle in Semarang Regency: age (months), weight (kg), and Body Condition Score (BCS). Two combinations are applied: Naïve Bayes Classification with Fuzzy C-Means, and Naïve Bayes Classification with K-Means. The results of the two combinations are then analyzed to determine which achieves the higher accuracy. The combination of Fuzzy C-Means and the Naïve Bayes Classifier reaches an accuracy of 96.67%, while the combination of K-Means clustering and the Naïve Bayes Classifier reaches 98.33%. It can therefore be concluded that the combination of K-Means clustering and the Naïve Bayes Classifier is the more recommended one for determining the quality of beef cattle in Semarang Regency.


INTRODUCTION
Data mining is increasingly recognized as an important tool in information management because of the growing amount of information. One data mining technique is clustering [1]. Clustering is an unsupervised form of classification: it partitions a set of data objects into several appropriate classes, or clusters [2]. Several methods are used for clustering, including K-Means, Possibilistic C-Means (PCM), and Fuzzy C-Means (FCM) [3].
The Fuzzy C-Means algorithm is a data clustering technique in which the presence of each data point in a cluster is determined by its membership value [4]. K-Means is a data clustering method that partitions data into clusters, or groups [5].
The use of the K-Means algorithm is limited to numerical data [6]. One classification algorithm is Naïve Bayes Classification (NBC). The NBC algorithm aims to assign data to a particular class. The performance of a classifier is measured by its predictive accuracy [7].
Beef cattle breeding is a familiar activity for the wider community in Indonesia [8]. Beef cattle are cattle kept for fattening because of characteristics such as a good growth rate and good meat quality [9]. The Department of Agriculture, Fisheries and Food is an office located in Semarang Regency; one of its programs is to regularly record and observe the development and quality of cattle in the various areas of Semarang Regency. The components that affect the quality of beef cattle include age, weight, and Body Condition Score (BCS) [10].
Because the quality of beef cattle is still determined by manual observation, a data processing approach is needed that can determine beef cattle quality more effectively and efficiently. The authors therefore combine Naïve Bayes Classification with Fuzzy C-Means and with K-Means to determine the quality of beef cattle, and compare the accuracy of the two combinations in order to identify the more recommended combination for determining the quality of beef cattle in Semarang Regency.

METHODS
Data processing is done by combining Naïve Bayes Classification with Fuzzy C-Means, and Naïve Bayes Classification with K-Means. The accuracy of the two combinations is then compared to identify the combination with the higher accuracy, which is the one recommended for determining beef cattle quality in Semarang Regency. The flow diagram of the combination of Naïve Bayes Classification and Fuzzy C-Means is shown in Figure 1, and that of Naïve Bayes Classification and K-Means in Figure 2.
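The clustering stage of this pipeline can be illustrated with a minimal sketch, given here for K-Means on a single numeric attribute. The weights are made-up values, the function name `kmeans_1d` is hypothetical, and the evenly spread starting centroids are an assumption chosen for reproducibility; none of this is taken from the authors' implementation.

```python
def kmeans_1d(values, k=3, iters=100):
    """Lloyd's algorithm on 1-D data; returns (centroids, labels)."""
    lo, hi = min(values), max(values)
    # Deterministic initialisation (an assumption for this sketch;
    # K-Means commonly starts from randomly chosen centroids).
    centroids = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: abs(x - centroids[j])) for x in values
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [x for x, lab in zip(values, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = sum(members) / len(members)
    return centroids, labels


# Hypothetical weights in kg, not records from the study's dataset.
weights = [150, 170, 190, 210, 380, 400, 420, 440, 620, 640, 660, 670]
centroids, labels = kmeans_1d(weights)
print(centroids)
print(labels)
```

The resulting labels replace the continuous attribute with a discrete one, which a Naïve Bayes classifier can then treat as a categorical feature.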

RESULTS AND DISCUSSION
This research uses a dataset of beef cattle quality from 2016 and 2017 comprising 196 records, collected directly in the field. The attributes used to determine the quality of beef cattle are age (months), weight (kg), and Body Condition Score (BCS). Each record is then classified as Good or Bad quality. All attributes are continuous.
For the calculations, 136 records are used as training data and 60 records as testing data. The algorithms used in this research are Fuzzy C-Means, K-Means, and Naïve Bayes Classification.
The first step in determining the quality of beef cattle is to group the weight attribute into 3 classes using either the Fuzzy C-Means algorithm or K-Means. The grouping turns the originally continuous data into discrete data. Fuzzy C-Means is a clustering technique in which the presence of each data point in a cluster is determined by its degree of membership [11]; the result of clustering the weight attribute with it is shown in Table 1. Table 1 shows that C1 contains weights between 280 and 410 kg, C2 contains weights between 430 and 670 kg, and C3 contains weights between 150 and 260 kg. The advantages of Fuzzy C-Means are its high level of accuracy and fast computation time [12].
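The idea of membership-based grouping can be made concrete with a minimal sketch of Fuzzy C-Means on one numeric attribute. It assumes the common fuzzifier m = 2, evenly spread starting centers, and a hard label taken as the cluster with the largest membership; the weights are invented and the function names are hypothetical, so this is not the authors' implementation.

```python
def memberships_for(values, centers, m):
    """Return one membership row per value; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        dists = [abs(x - v) for v in centers]
        if 0.0 in dists:
            # A value lying exactly on a center belongs fully to it.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            # u_k = 1 / sum_j (d_k / d_j)^(2 / (m - 1))
            rows.append([1.0 / sum((dk / dj) ** power for dj in dists)
                         for dk in dists])
    return rows


def fuzzy_c_means_1d(values, c=3, m=2.0, iters=200, tol=1e-6):
    """Fuzzy C-Means on 1-D data; returns (centers, memberships)."""
    lo, hi = min(values), max(values)
    # Evenly spread initial centers (an assumption, for reproducibility).
    centers = [lo + (hi - lo) * (k + 0.5) / c for k in range(c)]
    for _ in range(iters):
        u = memberships_for(values, centers, m)
        # Each center becomes the mean of the values weighted by u^m.
        new_centers = []
        for k in range(c):
            w = [row[k] ** m for row in u]
            new_centers.append(sum(wi * x for wi, x in zip(w, values)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships_for(values, centers, m)


# Hypothetical weights in kg, not records from the study's dataset.
weights = [150, 170, 190, 210, 380, 400, 420, 440, 620, 640, 660, 670]
centers, u = fuzzy_c_means_1d(weights)
# Hard label = index of the cluster with the largest membership.
labels = [row.index(max(row)) for row in u]
print(labels)
```

Unlike a hard partition, every weight keeps a nonzero membership in each cluster unless it sits exactly on a center; the discrete label is only derived at the end.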
The result of clustering the weight attribute with the K-Means algorithm is shown in Table 2. Table 2 shows that C1 contains weights between 150 and 370 kg, C2 contains weights between 380 and 500 kg, and C3 contains weights between 530 and 670 kg. The results of K-Means are strongly influenced by the parameter k and by the centroid initialization. Generally, K-Means initializes the centroids randomly [13].
After the weight attribute has been clustered with either Fuzzy C-Means or K-Means, beef cattle quality is classified using Naïve Bayes Classification. The data with the weight attribute clustered by the Fuzzy C-Means algorithm are shown in Table 3. The confusion matrix of the Naïve Bayes Classification and K-Means combination can be seen in Table 8. Accuracy is calculated with Equation 1, which gives the following result:
$$\text{Accuracy} = \frac{54 + 5}{54 + 5 + 0 + 1} \times 100\% = 98.33\%$$
The accuracy comparison of the two combinations can be seen in Table 9. Clustering the weight attribute with either Fuzzy C-Means or K-Means was found to optimize the accuracy of the beef cattle quality classification relative to using Naïve Bayes Classification alone. However, the combination of Naïve Bayes Classification and K-Means, with an accuracy of 98.33%, performs better than the combination of Naïve Bayes Classification and Fuzzy C-Means, which has an accuracy of 96.67%.
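The arithmetic behind the reported accuracy can be checked with a short sketch. Equation 1 is not reproduced in this section, so the code assumes the usual definition of accuracy as correct predictions divided by all predictions, and the function name is hypothetical. The counts 54, 5, 0 and 1 are those in the calculation above. The 58 correct predictions for the Fuzzy C-Means combination are an inference from its reported 96.67% on 60 test records, not a figure stated in the text.

```python
def accuracy_percent(correct_counts, error_counts):
    """Accuracy in percent: correct predictions over all predictions."""
    correct = sum(correct_counts)
    total = correct + sum(error_counts)
    return 100.0 * correct / total


# Counts from the K-Means + Naive Bayes calculation in the text.
kmeans_nb = accuracy_percent(correct_counts=(54, 5), error_counts=(0, 1))

# Inferred, not reported: 58 of 60 correct reproduces 96.67%.
fcm_nb = accuracy_percent(correct_counts=(58,), error_counts=(2,))

print(f"{kmeans_nb:.2f}% vs {fcm_nb:.2f}%")  # 98.33% vs 96.67%
```

On a test set of 60 records, the gap between the two combinations therefore corresponds to a single additional correct prediction.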
The difference in accuracy between the two combinations is influenced by how each clustering algorithm, i.e., Fuzzy C-Means or K-Means, clusters the weight attribute. The two algorithms produce different cluster members, and the differing number of members in each weight cluster then yields different probability values in the classification process using Naïve Bayes Classification.
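How different cluster assignments change the probabilities used by Naïve Bayes can be illustrated with a toy sketch. It estimates the conditional probability P(cluster | quality) by relative frequency for two different clusterings of the same six animals. All labels below are invented for illustration, the function name is hypothetical, and no smoothing is applied, which may differ from what the authors did.

```python
from collections import Counter


def conditional_probabilities(cluster_labels, quality_labels):
    """Estimate P(cluster | quality) by relative frequency."""
    class_totals = Counter(quality_labels)
    joint = Counter(zip(cluster_labels, quality_labels))
    return {(c, q): n / class_totals[q] for (c, q), n in joint.items()}


# Hypothetical toy data: the quality of six animals and the weight
# cluster each one receives under two different clusterings.
quality = ["Good", "Good", "Good", "Good", "Bad", "Bad"]
clusters_a = ["C2", "C2", "C1", "C1", "C3", "C1"]
clusters_b = ["C3", "C2", "C2", "C1", "C1", "C1"]

p_a = conditional_probabilities(clusters_a, quality)
p_b = conditional_probabilities(clusters_b, quality)

# The same animals give different likelihoods for cluster C1 among
# Good cattle, depending on which clustering produced the labels.
print(p_a[("C1", "Good")], p_b[("C1", "Good")])  # 0.5 0.25
```

Because Naïve Bayes multiplies such likelihoods with the class prior, a change in cluster membership can shift the predicted class for records near a cluster boundary.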

CONCLUSION
The accuracy of the combination of Naïve Bayes Classification and Fuzzy C-Means is 96.67%, while that of the combination of Naïve Bayes Classification and K-Means is 98.33%. The combination with K-Means is thus more accurate than the combination with Fuzzy C-Means by 1.66 percentage points, so the combination of Naïve Bayes Classification and K-Means is the more recommended one.

Figure 1. Flowchart of Naïve Bayes Classification and Fuzzy C-Means algorithm

Figure 2. Flowchart of Naïve Bayes Classification and K-Means algorithm

Table 1. Clustering results of weight attribute using Fuzzy C-Means

Table 2. Clustering results of weight attribute using K-Means

Table 3. Data with the clustering result of weight attribute using Fuzzy C-Means

Table 8. The confusion matrix of Naïve Bayes Classification and K-Means

Table 9. The accuracy comparison of the two combinations