Implementation Of Naïve Bayes Method with Certainty Factor for Disease and Pest Diagnosis on Onion Plants

ABSTRACT


Introduction
Shallots are among the most widely cultivated bulb crops, and many farmers in various areas grow them. They are regarded as a non-substitutable commodity, used both as a food seasoning and as herbal medicine. With a planted area estimated at ninety thousand hectares, the crop is one of the largest contributors to the regional economy, worth about two point seven trillion per year (Udiarto et al., 2005). One of the areas known for producing shallots is Brebes Regency. The area is well suited to the crop because it has alluvial soil, a type of clay with a reaction of around pH 5.6 to pH 8.5 that contains the elements P and K. Despite these advantages, this soil type also has a disadvantage, namely its lack of nutrients (Rosliani & Hilman, 2002; Sumarni et al., 2013).
Every year, the demand for shallots increases, and Brebes Regency has therefore made shallots a superior product for its farmers. However, availability has not kept pace with the growing demand. One cause is the lack of knowledge about shallot cultivation, including how to deal with pests and diseases (Tuswanto & Fadlil, 2013). The head of the Banjarharjo Extension Implementation Agency also said, "It was difficult to provide socialization or direction because of the lack of an expert." He added that farmers complained about pests and diseases attacking their crops (Tuswanto & Fadlil, 2013). The pests and diseases found in shallots include leaf-mining fly (Liriomyza chinensis), onion caterpillar (Spodoptera exigua hubn), thrips (Thrips tabaci), soil caterpillar (Agrotis ipsilon), fusarium wilt (Fusarium oxysporum hanz), purple blotch (Alternaria porri), anthracnose (Colletotrichum gloeosporioides), and leaf spot (Cercospora duddiae) (Aldo & Putra, 2020).
To deal with these pests and diseases, a system that contains an expert's knowledge is needed to diagnose the early symptoms experienced by plants (Aldo & Putra, 2020). An expert system is an innovation from Artificial Intelligence (AI), generally in the form of an application technology (Al-Ajlan, 2015). It captures the knowledge of an expert and uses it to identify and solve problems (Wang et al., 2015). Such a system is needed to diagnose and help determine the types of pests and diseases on shallot plants and to provide solutions on how to handle them properly (Tuswanto & Fadlil, 2013). In this study, the authors combine two methods to perform the calculations. The first is the Naïve Bayes method, which is used to classify data. Previous studies report that Naïve Bayes achieves a high level of accuracy and needs only a small amount of training data (Kawani, 2019). The second is the Certainty Factor method, an expert system method used specifically to handle uncertainty. The method is believed to be able to overcome uncertain problems (Nofriansyah et al., 2015), so the authors use it in combination with the first method. Based on the problem described above, this study focuses on the implementation of the Naïve Bayes method with Certainty Factors for the diagnosis of diseases and pests on shallot plants.

Shallot
Shallots (Allium ascalonicum) are an ancient vegetable that humans have cultivated for generations. Their cultivation can be traced back to Egypt between the First and Second Dynasties, around 3200-2700 BC (Aryanta, 2019). Although farmers' interest in shallots is quite high, the business still faces many obstacles, both technical and economic (Sumarni & Hidayat, 2005). The benefits of shallots as a traditional medicine have been recognized by the wider community, and as a result the demand for shallots tends to increase (Fernando et al., 2020). Shallots contain protein, fat, carbohydrates, vitamins, and minerals, including 334 mg of potassium, 0.8 mg of iron, and 40 mg of phosphorus, and they provide 30 calories of energy. In addition, they contain compounds that act as antimutagenic and anticarcinogenic agents (Firmansyah & Sumarni, 2016).

Expert System
In general, an expert system is a system that seeks to bring human knowledge into computers so that they can solve problems as a professional would normally do. A good expert system is designed to solve a specific problem by imitating the work of an expert. It allows ordinary people to solve complex problems that could otherwise be solved only with the help of experts. For specialists, an expert system can also support their work as an experienced assistant (Fadli, 2010). A successful expert system solves existing problems by imitating the intelligence of an expert in a given area of expertise. An expert system has two environments, development and consultation. The development environment is the means of incorporating knowledge and system components into the knowledge base. The consultation environment is the tool through which users obtain knowledge and answers (Honggowibowo, 2015).

Classification
Classification is a data processing technique that examines the properties of a dataset and then groups the data into classes according to a model specified at the beginning (Canela et al., 2019). In data mining, there are several algorithms for classifying data, such as Decision Trees, Naïve Bayes, Artificial Neural Networks, K-Nearest Neighbors, Genetic Algorithms, and others (Ting, 2017). In general, the classification process is carried out in two stages. In the first stage, the system learns from existing data so that it can build a suitable model from a set of data. The second stage is the classification stage, in which data labeled according to the model are tested again and assigned to the appropriate classes (Abdullah et al., 2020).

Naïve Bayes Algorithm
The Naïve Bayes classifier is a probabilistic classifier that calculates a set of probabilities by counting the frequencies and combinations of values in a given data set. The algorithm uses Bayes' theorem and assumes that all attributes are independent of each other given the value of the class variable (Saleh, 2015). Naïve Bayes is also one way to classify data: Bayesian classification is a simple way of classifying data by predicting the probability that a record belongs to a class (Borman & Wati, 2020). Bayesian probability is a method used to solve cases using Bayes' theorem (Arhami, 2005).
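For reference, the independence assumption described above leads to the standard Naïve Bayes decision rule, written out below. The notation (a class c and attributes a_1, ..., a_n) is an assumption chosen to match the P(c) and P(a|c) terms used later in this paper; the source does not print the formula itself.

```latex
\[
  P(c \mid a_1, \dots, a_n) \;\propto\; P(c) \prod_{i=1}^{n} P(a_i \mid c),
  \qquad
  \hat{c} \;=\; \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(a_i \mid c)
\]
```

The class with the largest product is taken as the prediction, and the normalizing denominator can be dropped because it is the same for every class.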

Certainty Factor Algorithm
One method for handling data uncertainty in expert systems is the Certainty Factor (Kusrini, 2008). The Certainty Factor rule models the uncertainty in an expert's reasoning. An expert, such as a doctor, often states the result of an analysis in uncertain terms, such as "probably", "most likely", or "almost certain". The rule can therefore be used to express how confident the expert is in an answer to the problem at hand (Hutama, 2018).
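As an illustration, the form of the Certainty Factor calculation commonly used in expert systems of this kind is shown below. The source does not print these formulas, so the symbols (an expert weight, a user weight, and a pairwise combination) are assumptions, and the combination rule shown applies only to non-negative values.

```latex
\[
  CF(H, E) = CF_{\text{expert}} \times CF_{\text{user}},
  \qquad
  CF_{\text{combine}}(CF_1, CF_2) = CF_1 + CF_2\,(1 - CF_1)
  \quad \text{for } CF_1, CF_2 \ge 0
\]
```

When more than two symptoms are involved, the combination is applied repeatedly, feeding each result back in as CF_1.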

Method
This study focuses on the implementation of the Naive Bayes method with certainty factors for diagnosing diseases and pests on shallots. The research process consists of several main stages, including literature study, data collection, needs analysis, system design, system implementation, system testing, and conclusions. The stages of the process can be seen in Figure 1.

Literature Study
The initial activity carried out was a literature study. It aimed to obtain a detailed description of how onion diseases and pests are diagnosed using the Naïve Bayes and Certainty Factor methods. In this step, the researcher collects data, such as complete information on the influencing factors, from books, articles, journals, proceedings, and previous scientific works related to shallots, expert systems, the Naïve Bayes algorithm, and related methods. In addition to the main dataset, data taken from the literature study are used as references for classifying the factors or attributes in the main dataset. This data greatly affects the accuracy and final results of this study.

Data Collection
The data collection stage is carried out by looking for data sources from existing research to obtain reliable data. The data used in this study is taken from the statement of an expert from the Department of Agriculture and Food Security of Brebes Regency as the main dataset. Other datasets were taken from onion farmer respondents. In this study, the dataset used is divided into two parts. The first part is the certainty value data from an expert. The data is in the form of a certainty value of a symptom experienced by plants. The second data is test data that comes from the opinions of users or farmers who experience disease and pests on their plants.

Needs Analysis
In the needs analysis stage, the researchers identify everything required to support the creation of the system, that is, all the requirements for building an expert system based on the Naïve Bayes and Certainty Factor methods. The requirements are divided into two categories, functional and non-functional. Functional requirements relate to what the system does or the things that can be run by the system, while non-functional requirements relate to supporting tools, the platform on which the system is built, and the software and hardware needed to build the application system.

System Planning
In the next stage, the researcher designs the system by making an initial display design, or mockup, and specifying the hardware used to build the system. An initial display design is a design of the interface between the system and the user. The design is divided into two parts, the pages for the user and the pages for the admin. A prototype design of the diagnostic page can be seen in Figure 2.

System Implementation
Implementing the Naïve Bayes and Certainty Factor methods in the application for diagnosing diseases and pests of shallots is one of the stages in building the system. The flow of calculations using Naïve Bayes is depicted in Figure 3. The calculation begins with the user selecting the symptoms of their plant, then determining the nc value, which is the record in the training data. The value of nc is either 0 or 1: 1 if true and 0 if false. The next steps are to calculate P(vj), the prior; then P(ai|vj), the likelihood; and finally P(vj) x P(ai|vj), the posterior, before determining the diagnosis result. The highest posterior value is used as the diagnostic result. Figure 4 shows the flow of the calculation using the Certainty Factor. The first step is for the admin to enter the expert weight value; the system then calculates the diagnostic (combined) CF and the CF combine values, which produce the final CF value. The expert weight value is in the range of 0 to 1, and the closer the value is to 1, the more certain it is. The CF combine calculation is used when there is more than one symptom.
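The likelihood step in the Naïve Bayes flow above is not written out as a formula in the source. One plausible reading, given the quantities nc, n, and m that the paper names, is the m-estimate of probability; it is shown here as an assumption rather than as the authors' confirmed formula. Here n_c is the 0/1 record value for symptom a_i under class v_j, p is the prior estimate (taken as 1/x for x classes), and m is the number of symptoms.

```latex
\[
  P(v_j) = \frac{1}{x},
  \qquad
  P(a_i \mid v_j) = \frac{n_c + m\,p}{n + m},
  \qquad
  \text{posterior score}(v_j) = P(v_j) \prod_{i} P(a_i \mid v_j)
\]
```

Under this reading, a symptom that does not belong to a class (n_c = 0) still receives a small non-zero likelihood, so a single unmatched symptom does not force the score of that class to zero.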

System Testing
At this stage, the success and accuracy of the system are tested. The success test checks whether the system behaves as expected or whether there are still obstacles. The accuracy test compares the results produced by the system with the results of an expert's diagnosis. Black box testing is also carried out to test the system in terms of functionality. Black box testing is testing that focuses on the functional specifications of the software. The tester can define a set of input conditions and run tests against the program's functional specifications (Mustaqbal et al., 2015).

Conclusion
In this study, conclusions can be drawn from the above steps. The conclusion of this study is how to apply the Naive Bayes method and the Certainty Factor for early detection of shallot diseases and pests and how much accuracy is achieved in detecting shallot diseases.

Results and Discussion
Based on the previous chapter, everything related to this research has been explained, starting from the problem, the study of the literature used, and the research methodology. At this stage, the researcher will explain the results and conduct a discussion on the implementation of this research.

Knowledge Acquisition
Knowledge acquisition in this research is carried out by extracting, structuring, and organizing knowledge from one or more sources. At this stage, the researcher gathers the knowledge and datasets from an expert that will be used in the implementation of the expert system. The researchers interviewed and observed an expert, Mr. Maryadi S.P., at the Department of Agriculture and Food Security, Brebes Regency. The interviews produced the information and datasets used as a reference or data source for diagnosing diseases and pests in shallot plants. Table 1 shows the types of diseases and their symptoms.

Table 1. Onion Plant Diseases and Symptoms
Disease Name: Symptoms
Layu Fusarium (as seen in Figure 5): Leaves turn yellow but do not dry out; Twisted leaves wilt and are easy to pluck; Bulbs rot; A dead plant starts from the tip of the leaf and spreads to the bottom
Bercak Ungu (as seen in Figure 6): There are white or gray grooved spots on the leaves; There are spots resembling rings or ovals, reddish purple; The tips of the leaves dry up
Antraknosa (as seen in Figure 7): Up There are white spots on the leaves; Brown spots at the base of the stem
Bercak Daun (as seen in Figure 8): There are brown spots on the leaves; Wither like hot water

Table 2 shows the types of pests and their symptoms.

Table 2. Onion Plant Pests and Symptoms
Pest Name: Symptoms
Lalat Penggorok Daun (as seen in Figure 9): There are white spots on the leaves; There is a larval scraping burrow that winds in the leaves; Dry leaves; Brown leaves like burning or drying
Ulat Bawang (as seen in Figure 10): Dry leaves; There are transparent white spots on the leaves; Drooping leaves
Thrips (as seen in Figure 11): The white leaves shine like silver; Leaves turning brown and speckled with black; Small onion bulbs
Ulat Tanah (as seen in Figure 12): The sliced stem neck

Figure 12. Ulat Tanah Pest

The diseases and symptoms of shallots listed above were obtained from the interview stage with the expert. The data are entered into the application system that was created. The coding of onion diseases is shown in Table 3, and the coding of disease symptoms is shown in Table 4.

G05: There are white or gray grooved spots on the leaves
G06: There are spots resembling rings or ovals, reddish purple
G07: The tips of the leaves dry up
G08: Up There are white spots on the leaves
G09: Brown spots at the base of the stem
G10: There are brown spots on the leaves
G11: Wither like hot water

Pests and symptoms of shallots are the types of pests that attack the plants, together with their physical symptoms, as obtained from the interview process with the expert; they are also entered into the application system. The coding of shallot pests is shown in Table 5, and the coding of the symptoms of shallot pests is shown in Table 6.

G07: The white leaves shine like silver
G08: Leaves turning brown and speckled with black
G09: Small onion bulbs
G10: The sliced stem neck

Rule formation produces the rules used in the expert system to represent the information obtained from the expert, who in this case is an expert on the diseases and pests of shallots. The rules obtained take the following form.

The rules for the types of disease in the expert system are as follows:
RULE 1 = IF G01 AND G02 AND G03 AND G04 THEN P01
RULE 2 = IF G05 AND G06 AND G07 THEN P02
RULE 3 = IF G08 AND G09 THEN P03
RULE 4 = IF G10 AND G11 THEN P04

The rules for the types of pests in the expert system are as follows:
RULE 1 = IF G01 AND G02 AND G03 AND G04 THEN H01
RULE 2 = IF G03 AND G05 AND G06 THEN H02
RULE 3 = IF G07 AND G08 AND G09 THEN H03
RULE 4 = IF G10 THEN H04

After the rules are obtained from the expert, they are executed based on the symptoms selected by the user. The first step is the classification process using the Naïve Bayes method, followed by calculating the confidence value using the Certainty Factor method.
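The disease rules above map naturally onto a small lookup structure. The sketch below is one possible encoding, not the authors' implementation: the codes come from the text, while the dictionary layout and the function names `nc` and `fully_matched` are hypothetical. It also shows how the 0/1 value nc could be read off the rules.

```python
# Hypothetical encoding of the disease rules; codes P01..P04 and G01..G11
# follow the text, everything else is an illustrative assumption.
DISEASE_RULES = {
    "P01": {"G01", "G02", "G03", "G04"},
    "P02": {"G05", "G06", "G07"},
    "P03": {"G08", "G09"},
    "P04": {"G10", "G11"},
}


def nc(symptom, disease, rules=DISEASE_RULES):
    """Return 1 if the rule for `disease` contains `symptom`, otherwise 0."""
    return 1 if symptom in rules[disease] else 0


def fully_matched(selected, rules=DISEASE_RULES):
    """List diseases whose rule is satisfied under a strict AND reading."""
    return [d for d, symptoms in rules.items() if symptoms <= set(selected)]


if __name__ == "__main__":
    selected = {"G01", "G02", "G03"}
    # nc for each selected symptom against P01
    print([nc(s, "P01") for s in sorted(selected)])
    # A strict AND match fails here because G04 was not selected.
    print(fully_matched(selected))
```

Under a strict AND reading, a user who selects only three of the four symptoms of P01 matches no rule at all, which is one reason a probabilistic score is useful on top of the rules.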
• Finding the prior value and determining the values of N, m, x, and nc
Based on the selected training data, the prior of each class is computed and the values of N, m, x, and nc are determined. N is the duplicate value of the disease, m is the number of symptoms, x is the number of diseases, nc is the value based on the training data, and the prior is the probability of the disease, given by P(c) = 1 / x. The selected symptoms from the training data are shown in Table 7.

Table 7. Selected Symptoms
Symptom Code: Symptom
G01: Leaves turn yellow but do not dry out
G02: Twisted leaves wilt and are easy to pluck
G03: Bulbs rot

An example calculation to find the prior value manually.
At this stage, the probability value for each disease is calculated as P(c) x P(a|c), using the P(a|c) value already calculated for each symptom. The following is the calculation of P(c) x P(a|c) for each disease:
An example calculation to find the posterior manually.
From the calculation above, the probability value can be determined; it is taken from the calculation result with the largest value. Among the posteriors above, the largest value belongs to the disease layu fusarium, with a value of 0.007629.
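The manual calculation can be checked with a short sketch. It assumes the m-estimate form of the likelihood, (nc + m * p) / (n + m), with n = 1, p = P(c) = 1 / x, and m equal to the 11 disease symptoms; the source does not state these values, but under them the reported posterior of 0.007629 is reproduced. The mapping of P01 to layu fusarium is inferred from the order of Table 1, and the function name `posterior_scores` is hypothetical.

```python
from math import prod

# Disease rules as given in the text. The disease names in the comments are
# inferred from the order of Table 1 and are an assumption.
RULES = {
    "P01": {"G01", "G02", "G03", "G04"},  # layu fusarium (assumed)
    "P02": {"G05", "G06", "G07"},         # bercak ungu (assumed)
    "P03": {"G08", "G09"},                # antraknosa (assumed)
    "P04": {"G10", "G11"},                # bercak daun (assumed)
}


def posterior_scores(selected, rules, n=1):
    """Score each class as P(c) * prod P(a|c) using an assumed m-estimate."""
    x = len(rules)                          # number of diseases
    m = len(set().union(*rules.values()))   # number of distinct symptoms
    prior = 1 / x                           # P(c) = 1 / x
    scores = {}
    for disease, symptoms in rules.items():
        likelihoods = [
            ((1 if s in symptoms else 0) + m * prior) / (n + m)
            for s in selected
        ]
        scores[disease] = prior * prod(likelihoods)
    return scores


if __name__ == "__main__":
    scores = posterior_scores(["G01", "G02", "G03"], RULES)
    best = max(scores, key=scores.get)
    print(best, round(scores[best], 6))  # P01 0.007629
```

With the three selected symptoms G01, G02, and G03, every likelihood for P01 is 3.75 / 12 = 0.3125, so the score is 0.25 x 0.3125^3, which rounds to 0.007629; the other three diseases receive a lower score because none of their rule symptoms was selected.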
• Determining combined or diagnostic CF
At this stage, the expert CF value is obtained from the expert's information on the existing symptoms, while the user CF is obtained from the record. Table 8 displays the expert CF and user CF values. The result of the calculation is: Confidence Percentage = 1 x 100% = 100%. Based on the Certainty Factor calculation, the confidence value for fusarium wilt disease is 100%.
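The Certainty Factor step can be sketched as follows. This is a minimal illustration that assumes the commonly used scheme in which each symptom's CF is the product of the expert and user values and the results are folded with CF_old + CF_new * (1 - CF_old). The weights in the example are hypothetical and are not the values of Table 8, and the function names are likewise assumptions.

```python
def combine_cf(cf_values):
    """Fold non-negative CF values with CF_old + CF_new * (1 - CF_old)."""
    combined = 0.0
    for cf in cf_values:
        combined = combined + cf * (1 - combined)
    return combined


def confidence_percentage(expert_cfs, user_cfs):
    """Multiply expert and user CFs per symptom, combine, scale to percent."""
    per_symptom = [e * u for e, u in zip(expert_cfs, user_cfs)]
    return combine_cf(per_symptom) * 100


if __name__ == "__main__":
    # Hypothetical weights for three symptoms, not taken from Table 8.
    expert = [0.8, 0.6, 0.4]
    user = [1.0, 0.5, 1.0]
    print(round(confidence_percentage(expert, user), 2))
```

Under this combination rule, the result reaches exactly 1 (100%) as soon as any single symptom has a CF of 1, which is one way a confidence of 100% can arise.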

Implementation of the onion disease and pest system
This stage discusses the implementation of the expert system, guided by the information and the knowledge base obtained in the previous stage. At this stage, the researchers began to create a Web-based system. The steps of the implementation stage are as follows.

Interface Result
This section presents the interface that has been created. The interface makes it easier for users to interact with the system. The display consists of several parts, including:

Login Page and Registration Page
This is the initial page that the user sees, where the user enters an email and password. A user who does not yet have an account is directed to register first. Figure 13 displays the login page and Figure 14 displays the registration page.

The diagnosis page contains a display of symptoms that the user can choose from to make a diagnosis. Figure 15 shows the disease diagnosis page and Figure 16 shows the pest diagnosis page.

Disease and Pest Diagnosis History Page
This page displays the history of the disease diagnosis results and the pest diagnosis results. Figure 17 displays the disease diagnosis history page, and Figure 18 displays the pest diagnosis history page.

The information page displays information about the diseases and pests that occur in shallots. The information includes the names of the diseases and pests, their other names, their symptoms, and how to handle them. Figure 19 displays the disease info page, and Figure 20 displays the pest info page.

System Testing
At this stage, the researchers carried out black box testing and tested the system's level of accuracy.

BlackBox Test
This test was conducted with Mr. Maryadi S.P. on September 5, 2022, at the Office of Agriculture and Food Security, Brebes Regency. The testing phase is intended to check whether all functions of the system run as needed and meet expectations.

System Accuracy Level Test
This test is done by comparing the results of manual calculations with those generated by the system. The results calculated by the system must be the same as the results calculated manually, since the manual calculations are the reference in this study. The experiment used 35 test data, consisting of 20 disease data and 15 pest data. The data were provided by an expert, Mr. Maryadi S.P., on September 5, 2022, at the Office of Agriculture and Food Security, Brebes Regency. In the test, 20 disease test data and 14 pest test data matched the manual test data. Therefore, the level of accuracy is:
Accuracy (%) = (34 / 35) x 100% = 97%
So, the accuracy rate is 97%.
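The accuracy figure can be reproduced from the counts reported above. The sketch below is only an arithmetic check; the function name `accuracy_percent` is hypothetical.

```python
def accuracy_percent(matches, total):
    """Return the share of test cases that match the reference, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return matches / total * 100


if __name__ == "__main__":
    disease_matches, pest_matches = 20, 14   # counts reported in the text
    total_cases = 20 + 15                    # 20 disease and 15 pest test data
    result = accuracy_percent(disease_matches + pest_matches, total_cases)
    print(round(result, 2))  # 97.14
```

The unrounded value is about 97.14%, which the paper reports as 97%.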

Conclusion
The way this expert system application works is that it starts with a user choosing to detect a disease or onion pest. Then, the user begins to choose the symptoms experienced. After selecting a symptom, click on "Diagnose". The system will automatically display the results of the diagnosis.
In the diagnosis process, the system uses the Naïve Bayes calculation to classify each symptom. The calculation begins by finding the prior, then calculates the likelihood, and finally calculates the posterior, which yields a probability value. The second method used in the system is the Certainty Factor, which determines the confidence value of the diagnostic result from the first method. Its calculation begins with finding the diagnostic CF value, or combined CF, and then calculates the CF combine value, which gives the confidence value of the diagnostic result. This study achieved an accuracy rate of 97% on a total of 35 test data points. This is higher than the accuracy reported in previous studies that used Naïve Bayes alone (93.54%), the Certainty Factor alone (85.71%), and Dempster-Shafer (95%).