Prediction of COVID-19 Using Recurrent Neural Network Model

Purpose: The first human COVID-19 infection was discovered in China at the end of 2019. Since then, COVID-19 has spread to almost all countries in the world. Overcoming this problem requires a way to identify people infected with COVID-19 more quickly. Methods: In this paper, an RNN is implemented using the Elman network and applied to the COVID-19 dataset from Kaggle. The dataset is split into 70% training data and 30% test data. The learning parameters used were the maximum epoch, learning rate, and hidden nodes. Result: The research results show an accuracy of 88%. Novelty: The Recurrent Neural Network (RNN) is one alternative for diagnosing potential COVID-19 disease.


INTRODUCTION
Coronavirus Disease 2019 (COVID-19) has been designated as a pandemic by the World Health Organization (WHO) since March 2020. COVID-19, known as a disease that attacks the respiratory system, is easily transmitted between people through air and physical contact. Cough, shortness of breath, fever, loss of smell and taste, headache, and muscle ache are the most common symptoms of the virus [1]-[3]. As a pandemic, the virus has a high impact on various fields of life [4]. These conditions show that COVID-19 is a critical problem that must be solved soon.
The confirmed cases of COVID-19 have reached 138,416,498 people, with 2,975,875 deaths, across 192 countries. This number is still growing today. Reducing the rate of virus growth is one of the efforts that are continuously being made. A virus with rapid transmission also needs a quick response. The most used methods against the virus spread are diagnosis, testing, quarantine, and isolation [4], [5]. However, routine COVID-19 confirmation testing such as rRT-PCR is still insufficient to keep up with the virus spread [6]. Artificial Intelligence (AI) is a promising approach to improve the diagnostic method [7]-[12]. The faster people infected with COVID-19 are detected, the faster treatment can be applied. This can be one of the best efforts to reduce the rate of growth. Therefore, an alternative system for early diagnosis of people infected with COVID-19 is needed. One of the alternatives is the Recurrent Neural Network (RNN), an Artificial Neural Network model. An Artificial Neural Network (ANN) is an implementation of artificial intelligence technology that models the human brain and tries to simulate its learning process [4]. A Recurrent Neural Network (RNN) is a type of artificial neural network whose main building block is a repeating cell. Recurrent cells are arranged in sequence so that past information is used to learn new information [13].
In previous studies, several algorithms were used for prediction by classification or regression. Prasetiyo et al. [14] used the Naive Bayes algorithm as a classifier to solve their problem. Walid and Alamsyah [15] stated that RNN is a robust model for forecasting continuous data types. Muslim et al. [16] conducted a study on the diagnosis of chronic kidney disease using an expert system method based on the Mamdani Fuzzy Inference System (FIS) to determine the level of accuracy. The MOM (Mean of Maximum) method produced an accuracy of 97.14%, and the Bisector method produced an accuracy of 98.86%. Many researchers have also used X-ray images to predict COVID-19 [10], [17]-[19]. As a predictive model, the RNN produced largely robust detection of pneumonia in COVID-infected patients [19].
In this study, the Recurrent Neural Network (RNN) model is used for early diagnosis of COVID-19. The model used in the classification process greatly affects the output and the level of accuracy. An accuracy of 98.34% was achieved when the Support Vector Machine (SVM) model was used to predict chronic kidney disease [20]. Meanwhile, according to Sean Shensheng Xu [21], a Deep Learning Neural Network outperformed the SVM and ANN models in accuracy. Deep learning has also been shown to perform better than the decision tree algorithm in the early diagnosis of COVID-19 [22]. Using the RNN model to diagnose COVID-19 in this study is therefore expected to improve the output results.

Data Preprocessing
The data used in this study is the COVID-19 dataset taken from the UCI Machine Learning Repository. The dataset has 25 attributes and 400 instances: 24 attributes plus 1 class attribute, of which 11 attributes are numeric and 14 are nominal.

Processing Nominal Attribute Data into Binary
A nominal attribute is an attribute used to classify information, data, or values that have no meaningful order. This stage converts nominal attribute data into binary attributes that take the values 0 and 1.
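As a minimal sketch of this conversion, the snippet below replaces each nominal attribute with one 0/1 column per observed category. The attribute names and category labels (`cough`, `appetite`, `yes`/`no`) are hypothetical and are not taken from the dataset, and one-column-per-category encoding is an assumption about how the binary attributes were built.

```python
def one_hot_encode(rows, nominal_columns):
    """Replace each nominal column with one 0/1 column per observed category."""
    # Collect the sorted set of categories seen in each nominal column.
    categories = {
        col: sorted({row[col] for row in rows}) for col in nominal_columns
    }
    encoded = []
    for row in rows:
        # Keep every non-nominal attribute unchanged.
        new_row = {k: v for k, v in row.items() if k not in nominal_columns}
        for col in nominal_columns:
            for cat in categories[col]:
                new_row[f"{col}={cat}"] = 1 if row[col] == cat else 0
        encoded.append(new_row)
    return encoded


# Hypothetical records, for illustration only.
rows = [
    {"age": 48, "cough": "yes", "appetite": "good"},
    {"age": 62, "cough": "no", "appetite": "poor"},
]
encoded = one_hot_encode(rows, ["cough", "appetite"])
```

Encoding of this kind increases the number of columns, which is one plausible reason the processed dataset has more attributes than the raw one.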

Data Cleaning
In the dataset used for this study, there are missing values. Observations can lose attribute data for various reasons, such as medical events, cost savings, anomalies, and so on. It has been noted that missing values and their patterns often carry valuable information for setting targets, especially in supervised learning [23]. The missing values therefore need to be processed. In this study, each missing value is replaced with the most frequent value (the mode) of the same attribute.
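The most-frequent-value replacement can be sketched as follows. The function name `impute_with_mode` and the use of `None` as the missing-value marker are assumptions made for illustration; the dataset may mark missing entries differently.

```python
from collections import Counter


def impute_with_mode(column, missing=None):
    """Replace missing entries with the most frequent observed value."""
    observed = [v for v in column if v != missing]
    if not observed:
        return list(column)  # no observed value to learn the mode from
    # On ties, Counter keeps the value that was encountered first.
    mode_value = Counter(observed).most_common(1)[0][0]
    return [mode_value if v == missing else v for v in column]


filled = impute_with_mode(["yes", None, "no", "yes", None])
```

Here the two missing entries are filled with `"yes"`, the most frequent observed value in the column.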

Normalizing Attributes
Data normalization is the processing stage in which attribute data is scaled to fit into a smaller specific range, such as (0,1) or (-1,0). In this study, the data are normalized to the range (-1,1), where $x_{\max}$ is the maximum value of the attribute data and $x_{\min}$ is the minimum value of the attribute data, as in Equation 1.
If the data range is equal to zero, then the formula is as in Equation 2.
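Since Equations 1 and 2 are not reproduced in this text, the sketch below uses the standard min-max formula for the range (-1, 1), x' = 2(x - x_min)/(x_max - x_min) - 1, as an assumption. The handling of a zero data range (mapping every value to 0.0) is likewise a placeholder choice and may differ from the paper's Equation 2.

```python
def normalize_to_unit_range(values):
    """Scale values linearly so the minimum maps to -1 and the maximum to 1."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant attribute is mapped to 0.0.
        # The paper's own zero-range rule (Equation 2) is not available here.
        return [0.0 for _ in values]
    return [2.0 * (v - x_min) / (x_max - x_min) - 1.0 for v in values]


scaled = normalize_to_unit_range([10, 20, 30])
```

For the sample input, the smallest value maps to -1.0, the midpoint to 0.0, and the largest to 1.0.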

Recurrent Neural Network
A Recurrent Neural Network (RNN) is one of the Artificial Neural Network (ANN) models. It is a type of neural network that contains loops acting as feedback connections.
The Elman network training algorithm is similar to Multi-Layer Perceptron (MLP) training: the network output is compared with the target output, and the error is used to update the network weights according to the backpropagation algorithm, with the exception that the weights of the recurrent connections to the context units are held constant at 1.0.
The Elman network RNN model with the binary sigmoid activation function can be formulated mathematically as in Equation 3.
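Equation 3 is not reproduced in this text, so the following is only a sketch of one common way to write an Elman forward step, not the paper's exact formulation. Hidden activations are the sigmoid of the weighted inputs plus the weighted context values, the output is the sigmoid of the weighted hidden activations, and the context units then copy the hidden activations with a fixed weight of 1.0. All names (`elman_step`, `w_in`, `w_ctx`, `w_out`) and the demo weights are hypothetical.

```python
import math


def sigmoid(z):
    """Binary sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def elman_step(x, context, w_in, w_ctx, b_hid, w_out, b_out):
    """One forward step of a single-output Elman network.

    Returns (output, new_context).
    """
    hidden = []
    for j in range(len(b_hid)):
        z = b_hid[j]
        z += sum(w_in[j][i] * x[i] for i in range(len(x)))
        z += sum(w_ctx[j][k] * context[k] for k in range(len(context)))
        hidden.append(sigmoid(z))
    output = sigmoid(b_out + sum(w_out[j] * hidden[j] for j in range(len(hidden))))
    # Context units copy the hidden activations (fixed copy weight of 1.0).
    return output, list(hidden)


# Two inputs, two hidden nodes; the weights are arbitrary illustrative values.
w_in = [[0.01, -0.01], [0.005, 0.002]]
w_ctx = [[0.01, 0.0], [0.0, 0.01]]
context = [0.0, 0.0]
for x in ([1.0, 0.0], [0.0, 1.0]):
    y, context = elman_step(x, context, w_in, w_ctx, [0.0, 0.0], [0.01, -0.01], 0.0)
```

The loop shows how the context returned by one step becomes an additional input to the next step, which is what distinguishes this network from a plain MLP.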

RESULT AND DISCUSSION
At this stage, data classification is carried out using the Recurrent Neural Network (RNN) model. There are three stages: splitting the data, determining the learning parameters, and implementing the Recurrent Neural Network (RNN) with the optimum parameters.

Data Splitting
The first step is data splitting. The data is divided into two parts: training data used in the training phase and testing data used in the testing phase. In this study, 70% of the data was used for training and 30% for testing.
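A 70/30 split of this kind can be sketched as below. Shuffling before the split and the fixed `seed` are assumptions; the text does not say whether the records were shuffled or taken in their original order.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and testing parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Record indices stand in for the actual records.
train, test = train_test_split(range(400))
```

With 400 records, this yields 280 training records and 120 testing records.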

Determination of Learning Parameter
Maximum Epoch
The first test determines the value of the maximum epoch. An epoch is one iteration of the training computation over the data, in which the weight of each input is updated. In the maximum epoch test, the initial weights of the Recurrent Neural Network are randomly initialized in the range (-0.01) to (0.01), the learning rate is equal to (0.1), the number of hidden nodes is equal to (5), the target MSE is equal to (0.001), and the momentum is equal to (0.0). The maximum epochs used in the test are (50), (100), (150), (250), (350), (450), and (550). The maximum epoch test was applied to the entire dataset with the 63 attributes that had been previously processed. The maximum epoch testing results are shown in Table 1.
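The role of the maximum epoch and the target MSE can be illustrated with a generic training loop. This sketch assumes that training stops either when the maximum epoch is reached or when the MSE falls to the target, which is a common convention but is not stated explicitly in the text. The `fake_epoch` function is a stand-in for a real training pass, not the paper's network.

```python
def train_until(run_epoch, max_epochs, target_mse):
    """Call run_epoch() until max_epochs is reached or the MSE hits the target.

    Returns (epochs_run, last_mse).
    """
    mse = float("inf")
    epochs_run = 0
    for epoch in range(1, max_epochs + 1):
        mse = run_epoch()  # one pass over the training data
        epochs_run = epoch
        if mse <= target_mse:
            break
    return epochs_run, mse


# Stand-in for a real training pass: the error halves on every call.
state = {"mse": 1.0}


def fake_epoch():
    state["mse"] *= 0.5
    return state["mse"]


epochs_run, final_mse = train_until(fake_epoch, max_epochs=50, target_mse=0.001)
```

Under this convention, a very small target MSE means the loop usually runs for the full maximum epoch, so the maximum epoch becomes the effective stopping criterion.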

Learning Rate
The second test determines the value of the learning rate. The learning rate is the training parameter used to calculate the weight correction during the training process. This α value lies in the range (0) to (1). The greater the learning rate, the faster the training process runs. However, if the learning rate is too large, the training process can overshoot the optimal state, that is, the point at which the minimum error is reached. In other words, the learning rate affects the network accuracy. A larger learning rate tends to lower the network accuracy, whereas a smaller learning rate tends to increase it, at the cost of a longer training process. In the learning rate test, the initial weights of the Recurrent Neural Network are randomly initialized in the range (-0.01) to (0.01), the maximum epoch is equal to (1100), the number of hidden nodes is equal to (5), the target MSE is equal to (0.000001), and the momentum is equal to (0.0). The learning rates used in this test are (0.1), (0.2), (0.3), (0.5), (0.7), (0.9), and (1). The learning rate test was applied to the entire dataset with the 63 attributes that had been previously processed. The learning rate testing results are shown in Table 2. In this learning rate test, an optimum alpha value of (0.3) was obtained, with an accuracy of 89.09% and an MSE of 0.03507256.
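The overshoot effect described here can be seen on a toy problem. The sketch below minimizes f(w) = w^2 with the usual gradient descent update with momentum, dw = -alpha * gradient + momentum * previous dw. The toy function and the function name `gradient_descent` are illustrative assumptions and are unrelated to the paper's network or data.

```python
def gradient_descent(alpha, momentum=0.0, steps=20, w=1.0):
    """Minimize f(w) = w**2 using dw = -alpha * grad + momentum * dw_prev."""
    dw = 0.0
    for _ in range(steps):
        grad = 2.0 * w  # derivative of w**2
        dw = -alpha * grad + momentum * dw
        w += dw
    return w


w_small = gradient_descent(alpha=0.1)  # shrinks steadily toward the minimum at 0
w_large = gradient_descent(alpha=1.0)  # jumps between +1 and -1 and never settles
```

With a rate of 0.1 each step moves the weight closer to the minimum, while with a rate of 1.0 each step jumps across the minimum to the opposite side, so the error never decreases.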

Hidden Node
The third test determines the number of hidden nodes. The Elman network architecture is a multi-layer perceptron with a single hidden layer and connections from the hidden layer neurons to context units. The context units store the output values of the hidden nodes, and these values are fed back as additional inputs to the hidden neurons. In the hidden node test, the initial weights of the Recurrent Neural Network are randomly initialized in the range (-0.01) to (0.01), the maximum epoch is equal to (100), the learning rate is equal to (0.1), the target MSE is equal to (0.001), and the momentum is equal to (0.0). The hidden nodes used in this test are (1), (2), (3), (4), (5), (6), (7), (8), (9), and (10). The hidden node test was applied to the entire dataset with the 63 attributes that had been previously processed. The results of the hidden node test are shown in Table 3. In this hidden node test, an optimum hidden node value of (6) was obtained, with an accuracy of 88.18% and an MSE of 0.03710022.
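The accuracy and MSE figures reported in these tests can be computed along the following lines. This is a sketch under the assumption that the sigmoid output is thresholded at 0.5 to obtain a class label; the text does not state the threshold. The sample outputs and targets are made up for illustration and are not results from the paper.

```python
def accuracy_and_mse(outputs, targets, threshold=0.5):
    """Return (accuracy in percent, mean squared error) for sigmoid outputs."""
    n = len(targets)
    correct = sum(
        1 for y, t in zip(outputs, targets) if (1 if y >= threshold else 0) == t
    )
    mse = sum((y - t) ** 2 for y, t in zip(outputs, targets)) / n
    return 100.0 * correct / n, mse


# Illustrative network outputs and true class labels.
acc, mse = accuracy_and_mse([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 1])
```

Accuracy only counts whether the thresholded label is right, while MSE also reflects how far each raw output is from its target, so two configurations with similar accuracy can still differ in MSE.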

Testing Using Optimum Parameter
The last test enters the optimal learning parameters found in the previous tests into the network. The results of network testing with the optimal parameters can be seen in Table 4.
In this study, the Recurrent Neural Network (RNN) model is used for early diagnosis of COVID-19. The RNN model is used because a Recurrent Neural Network has at least one feedback loop, has very good imaging capabilities, and can overcome the weaknesses of feedforward networks. RNN also belongs to the Deep Learning Neural Network algorithms, which have outperformed the SVM and ANN models in terms of accuracy. Using the RNN model for early diagnosis of COVID-19 in this study is therefore expected to improve the output results. The output of the test is the level of accuracy in diagnosing COVID-19. The flowchart of the method used in this study is shown in Figure 1.

Figure 1. Flowchart of the Recurrent Neural Network implementation

Table 1. Maximum epoch testing results

Table 3. Hidden node testing results

Table 4. Testing results using optimum parameters

Based on Table 4, the model built using the optimal parameters, alpha (0.3), hidden nodes (6), and maximum epoch (450), generated an accuracy of 88%.

CONCLUSION
The method proposed in this research predicts COVID-19 with 25 attributes by tuning the learning parameters of the Recurrent Neural Network to obtain the optimum parameters. After the training phase, the learning rate, hidden node, and maximum epoch parameters yielded optimum values for building the best RNN model. The values are 0.3 for the learning rate, 6 for the hidden nodes, and 450 for the maximum epoch. With these optimum parameter values, the best accuracy obtained is 88%. This accuracy shows that RNN can be an alternative for diagnosing COVID-19. Future research is expected to achieve higher accuracy in diagnosing COVID-19.