Classification SARS-CoV-2 Disease based on CT-Scan Image Using Convolutional Neural Network

Purpose: Convolutional Neural Network (CNN) is one of the most popular and widely used deep learning algorithms. It is commonly used in various applications, including image processing in medicine and digital forensics, speech recognition, and other academic disciplines. COVID-19, the disease caused by SARS-CoV-2, first appeared in Wuhan, China, and has symptoms similar to pneumonia. This study aims to classify COVID-19 by proposing a deep learning model, in order to help reduce infection rates. Methods: The dataset used in this study is a public dataset originating from a hospital in Sao Paulo, Brazil. It consists of 1252 images of patients infected with COVID-19 and 1230 images classified as non-COVID but showing other lung diseases. The classification method proposed in this research is a CNN model based on Resnet50. Result: The experimental results show that the proposed Resnet50-based convolutional neural network model works well in classifying SARS-CoV-2 disease using CT-Scan images. The proposed model obtains 95% accuracy, precision, recall, and f1-score at 500 epochs. Novelty: In this experiment, we used a Resnet50-based CNN model to classify SARS-CoV-2 (COVID-19) disease using CT-Scan images and obtained good performance.


INTRODUCTION
Deep learning is an evolution of machine learning, and its adaptability to computer vision has made great strides in problem solving [1], [2]. Deep learning has proven to be an advanced technology used in a wide range of applications, including image processing in medicine and digital forensics [3], speech recognition [4], [5], and other academic disciplines [6]. Many studies have applied deep learning techniques to classify or predict different illnesses, such as cardiovascular disease [7], lung disease [8], [9], brain tumor disease [10], leukemia [11], and SARS-CoV-2 disease [12], [13].
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is a virus that was detected in late 2019 at a traditional market in Wuhan, Hubei Province, China, when many people were experiencing pneumonia-like symptoms [14], [15]. After clinical examination, viral infection was confirmed by analyzing the patients' sputum using the polymerase chain reaction (PCR) method [14]. The disease it causes was then named "COVID-19" on the recommendation of the World Health Organization (WHO) and was officially declared a pandemic in March 2020 [16]. In addition to the RT-PCR test, SARS-CoV-2 infection can be diagnosed using radiographic imaging such as x-rays or computed tomography (CT) scans of the lungs [17]. The COVID-19 pandemic has considerably impacted the global economy, from large industries such as airlines to small businesses [18], [19].
A convolutional neural network (CNN) is a deep learning model inspired by human neural networks and is among the most widely used methods for classification tasks. Chaniago et al. detected SARS-CoV-2 on lung x-rays using a CNN, and the resulting model achieved an accuracy of 97.87% [20]. The study in [21] used x-ray images to classify SARS-CoV-2 disease with the VGG16 model, which achieved an accuracy of 79.58%. The study in [22] detected SARS-CoV-2 using CT images of the lungs with a CNN based on the AlexNet network. The model was trained three times, using 30, 70, and 100 epochs, and achieved an accuracy of 90.90%.
*Corresponding author. Email addresses: ceokelvin12@gmail.com (Kohsasih), b.herawan.hayadi@gmail.com (Hayadi). DOI: 10.15294/sji.v9i2.36583
In previous studies on the prediction and classification of SARS-CoV-2 disease, the datasets used are mostly X-ray images of the lungs. This study aims to create a classification model using a convolutional neural network with the Resnet50 architecture. The model is trained using lung CT-Scan images from SARS-CoV-2 infected and non-infected patients. The resulting model is expected to predict SARS-CoV-2 infection and to improve performance measures such as accuracy, precision, and recall.

METHODS
The research method consists of data collection, data normalization, CNN model building, model training, model testing, and evaluation of the built model's performance. Figure 1 shows the procedure performed in this study.

Data Collection
The dataset used in this study is a public dataset collected from actual patients in a hospital in Sao Paulo, Brazil. It contains 2482 CT scan images, consisting of 1252 images of patients infected with COVID-19 and 1230 images classified as non-COVID but from patients who presented other lung diseases [23]. Figure 2 shows the data of each class used in the study.

Image Processing (Normalization)
Normalization is typically used during preprocessing. In image processing, normalization changes the range of pixel intensity values [24], [25], [26]. Linear normalization of grayscale digital images is performed by Equation (1).
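The standard linear (min-max) normalization formula is given below, on the assumption that it is the form Equation (1) refers to. The symbols are conventional rather than taken from the study: $I$ is the original pixel intensity, $I_N$ the normalized intensity, $\mathrm{Min}$ and $\mathrm{Max}$ the smallest and largest intensities in the original image, and $\mathrm{newMin}$ and $\mathrm{newMax}$ the bounds of the desired range.

```latex
\[
I_N = (I - \mathrm{Min}) \, \frac{\mathrm{newMax} - \mathrm{newMin}}{\mathrm{Max} - \mathrm{Min}} + \mathrm{newMin}
\]
```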
For example, if the image intensity range is 60 to 190 and the desired range is 0 to 255, the process first subtracts 60 from each pixel intensity, resulting in a range of 0 to 130. Each pixel intensity is then multiplied by 255/130 to create a range of 0 to 255. In this study, image preprocessing consists of resizing the images from the original size of 348x256 pixels to 64x64 pixels and normalizing their pixel intensity values. Figure 3 shows an example of the data preprocessing on a CT scan image.
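The linear normalization described above can be sketched in plain Python as follows. This is a minimal illustration, not the study's preprocessing code: the function name `normalize_linear`, its parameters, and the sample image are hypothetical, and the resizing step to 64x64 pixels is not shown.

```python
def normalize_linear(image, new_min=0.0, new_max=255.0):
    """Linearly rescale pixel intensities to the range [new_min, new_max].

    `image` is a list of rows, each row a list of numeric intensities.
    """
    old_min = min(min(row) for row in image)
    old_max = max(max(row) for row in image)
    if old_max == old_min:
        # A constant image has no contrast to stretch; map it to new_min.
        return [[float(new_min) for _ in row] for row in image]
    scale = (new_max - new_min) / (old_max - old_min)
    return [[(p - old_min) * scale + new_min for p in row] for row in image]


# Hypothetical 2x3 grayscale image whose intensities span 60 to 190.
sample = [[60, 125, 190],
          [90, 160, 60]]
normalized = normalize_linear(sample)
```

With this input, the darkest pixel (60) maps to 0, the brightest (190) maps to 255, and the midpoint (125) maps to 127.5, up to floating-point rounding.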

Convolutional Neural Network
Convolutional Neural Network (CNN) is a deep learning algorithm developed from the Multilayer Perceptron (MLP) and designed to process data in two-dimensional form [27], such as sound and images [28]. It is a deep neural network widely used for image analysis and recognition [29]. The network structure consists of several convolution layers, pooling layers (max pooling or mean pooling), and a fully connected layer [30]-[32]. Medical applications of CNN models include brain tumor detection [33], breast cancer classification [34], skin diseases [35], and cardiovascular disease [36].
This study proposes COVID-19 disease classification with a CNN using the Resnet50 architecture. The layers, output shapes, and number of parameters in the proposed architecture are shown in Table 1. The proposed network has a total of 24,122,070 parameters, consisting of 24,064,342 trainable parameters and 57,728 non-trainable parameters.
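To make the convolution and pooling operations concrete, the following plain-Python sketch applies them to a tiny input. It is an illustration under stated assumptions, not the network used in this study: the function names `conv2d_valid` and `pool2d`, the 4x4 input, and the 2x2 kernel are all hypothetical, and the fully connected layer is not shown.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding.

    As in most deep learning libraries, the kernel is not flipped,
    so this is strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling over size x size windows."""
    if mode == "max":
        reduce_fn = max
    elif mode == "mean":
        reduce_fn = lambda window: sum(window) / len(window)
    else:
        raise ValueError("mode must be 'max' or 'mean'")
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[i + a][j + b]
                      for a in range(size) for b in range(size)]
            row.append(reduce_fn(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 input and 2x2 kernel.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0],
          [0, 1]]

features = conv2d_valid(image, kernel)      # 3x3 feature map
max_pooled = pool2d(image, mode="max")      # 2x2 map of window maxima
mean_pooled = pool2d(image, mode="mean")    # 2x2 map of window averages
```

Here the convolution shrinks the 4x4 input to a 3x3 feature map, while 2x2 pooling halves each spatial dimension, which is how stacked layers progressively reduce an image to a compact representation.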

Evaluation Measures
We evaluate the trained model to determine its performance. The classification model is evaluated using accuracy, precision, recall, and f1-score. A confusion matrix is used to represent the performance of the classification model [37], [38]. Table 2 shows the confusion matrix that visualizes the model's performance [39]. Accuracy is the ratio of correct predictions to all predictions. Accuracy is calculated using Equation (2). Precision is the ratio of correct positive predictions to all positive predictions. Precision is calculated using Equation (3).
Recall is the ratio of correct positive predictions to all actual positive data. Recall is calculated using Equation (4).
F1-score is a balance between precision and recall. It is the harmonic mean of the precision and recall values. F1-score is calculated using Equation (5).
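The standard definitions of these measures are given below, on the assumption that they are the forms the numbered equations refer to. The symbols are conventional rather than taken from the study: $TP$, $TN$, $FP$, and $FN$ are the true positive, true negative, false positive, and false negative counts of the confusion matrix. The tag (2) for accuracy is also an assumption, inferred from the numbering of the other equations.

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \tag{2} \\
\text{Precision} &= \frac{TP}{TP + FP} \tag{3} \\
\text{Recall}    &= \frac{TP}{TP + FN} \tag{4} \\
\text{F1-score}  &= 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{5}
\end{align}
```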

RESULT AND DISCUSSION
This section describes the results obtained from the experiments. This study classifies CT scan images to help diagnose COVID-19 infection. The dataset of 2482 CT scan images, consisting of 1252 images of patients infected with COVID-19 and 1230 images classified as non-COVID, is split into 80% training data and 20% test data. The results of the study are described as follows.
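The 80/20 split can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `train_test_split_indices`, the random seed, and the choice to round the test size up are hypothetical, and the split is not stratified by class. The study does not state how the split was implemented.

```python
import math
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into train and test lists.

    The test size is rounded up, a convention assumed here
    (it is not stated in the study).
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_split_indices(2482)
print(len(train_idx), len(test_idx))  # prints "1985 497"
```

Under this rounding assumption the test set holds 497 images, which agrees with the total of correctly and incorrectly classified images reported in the confusion matrix evaluation.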
In this research, we conducted several experiments with different numbers of epochs. In the first experiment, we trained the model for 50 epochs and reduced the learning rate whenever the monitored metric stopped improving. This experiment resulted in an accuracy of 90%. In the following experiments, we increased the number of epochs to improve the accuracy of the model. With 100 epochs, the model achieved 94% accuracy. When the number of epochs was increased again to 200 and 500, accuracy reached 94% and 95%, respectively. Based on these experiments, the proposed model obtains its best accuracy, 95%, at 500 epochs. Figure 4 shows graphs of the training results of the proposed model.
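Reducing the learning rate when a metric stops improving can be sketched as follows. This is a minimal illustration of the general technique, not the study's training code: the class name `ReduceOnPlateau`, the `factor`, `patience`, and `min_lr` values, and the accuracy history are all hypothetical. Library callbacks such as Keras's `ReduceLROnPlateau` implement the same idea, though the study does not state which implementation was used.

```python
class ReduceOnPlateau:
    """Multiply the learning rate by `factor` after `patience` epochs
    in which the monitored metric has not improved."""

    def __init__(self, lr, factor=0.5, patience=3, min_lr=1e-6):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.best = float("-inf")
        self.stale_epochs = 0

    def step(self, metric):
        """Record one epoch's metric (higher is better); return the new rate."""
        if metric > self.best:
            self.best = metric
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
            if self.stale_epochs >= self.patience:
                # Plateau detected: shrink the rate, but not below min_lr.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.stale_epochs = 0
        return self.lr


# Hypothetical validation accuracies that stop improving after epoch 3.
scheduler = ReduceOnPlateau(lr=0.001, factor=0.5, patience=3)
history = [0.80, 0.85, 0.88, 0.88, 0.87, 0.88, 0.89]
rates = [scheduler.step(acc) for acc in history]
```

In this run the rate stays at 0.001 for the first five epochs and is halved at the sixth, after three consecutive epochs without a new best accuracy.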

Figure 4. Model training results (panels: Epoch 50, Epoch 100, Epoch 200, Epoch 500)
In addition to assessing the model's accuracy, we also consider the confusion matrix of the trained model [40]. Figure 5 shows the confusion matrix evaluation.

Figure 5. Confusion matrix evaluation (panels: Epoch 50, Epoch 100, Epoch 200, Epoch 500)
As shown in Figure 5, the model trained for 500 epochs classifies 472 images correctly and 25 incorrectly. The model trained for 200 epochs classifies 468 images correctly and 29 incorrectly. These results improve on the 462 correctly classified images at 100 epochs and the 418 at 50 epochs. Table 3 shows the results of the evaluation of the model's performance at various epochs. The findings of this study indicate that the proposed model outperforms several previous studies, with precision, accuracy, recall, and f1-score values of 95%. In comparison, the study in [22] used the AlexNet model to classify SARS-CoV-2 and achieved an accuracy of 90.90%, and the study in [21] used the VGG16 model and achieved an accuracy of 79.58%.
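As a worked check, accuracy can be recomputed from the correct and incorrect counts reported for Figure 5. The sketch below covers only the 200-epoch and 500-epoch models, the two for which both counts are given. The function name `accuracy` and the dictionary layout are hypothetical.

```python
def accuracy(correct, incorrect):
    """Fraction of test images that were classified correctly."""
    return correct / (correct + incorrect)


# (correct, incorrect) counts reported for Figure 5, keyed by epoch count.
reported = {500: (472, 25), 200: (468, 29)}
for epochs, (correct, incorrect) in reported.items():
    print(f"{epochs} epochs: {accuracy(correct, incorrect):.1%}")
```

This prints 95.0% for 500 epochs and 94.2% for 200 epochs, in line with the reported accuracies of 95% and 94%.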
The confusion matrices show that the performance of the proposed model improves with more training epochs and that the model can classify SARS-CoV-2 CT scan images correctly. However, this study is limited in that the model is trained only on image data. Future research can therefore make predictions using other types of data and improve performance with other models.

CONCLUSION
This paper proposes a method for classifying SARS-CoV-2 (COVID-19) disease on lung CT images using a convolutional neural network model based on Resnet50. The study consists of data collection, image processing (normalization), network structure development, and model performance testing. Based on the results of our experiments, the accuracy, f1-score, precision, and recall of the SARS-CoV-2 classification reach 95%. This shows that the Resnet50-based CNN model can be used to classify SARS-CoV-2 disease and works well.