
Proceedings of International Joint Conference on Neural Networks, Dallas, Texas, USA, August 4-9, 2013

Fast Diagnosing of Pediatric Respiratory Diseases by using High Speed Neural Networks

Hazem M. El-Bakry and Mohamed Hamada

Abstract: In this paper, a new fast neural model for testing massive volumes of medical data is presented. The idea is to accelerate the detection and classification of pediatric respiratory diseases by using neural networks. This is done by applying cross correlation between the input patterns and the input weights of the neural networks in the frequency domain rather than the time domain. Furthermore, such a model is very useful for understanding the internal relations between the medical patterns. In addition, the input patterns are collected in one vector and manipulated as one pattern. Moreover, before training the neural networks, rough sets are used to reduce the length of the input feature vector, so that only the most important feature elements are used for training. The reduced input medical patterns are classified into one of eight diseases. Simulation results confirm the theoretical considerations, as 98% of all tested cases are classified correctly. The presented model can be applied successfully to any other classification application.

I. INTRODUCTION

Diagnostic expert systems help doctors determine the range of alternative diagnoses or a definitive diagnosis, while systems used for treatment help in deciding on an action plan where multiple options are available in individual cases. Quick decision making in differential diagnosis and selection of the proper treatment in a short time are the defining features of a medical expert system [1-32]. It has been shown that neural networks are efficient in many different applications. The fast response of these neural networks is very important, especially in real-time applications.

In this paper, a new model for accelerating the operation of neural networks in the test phase is presented. The idea is to collect all patterns in one vector and treat them at the same time as one pattern. Cross correlations are then applied between this vector and the input weights of the neural networks. These cross correlations are applied in the frequency domain rather than the time domain. Performing cross correlation in the frequency domain is faster than in the time domain, while the final output is the same in both cases. The idea of applying cross correlation between the input data pattern and the input weights has been applied successfully in many different applications [1-6]. Compared to traditional neural networks (TNNs), shown in Fig. 1, cross correlation between the tested data and the input weights of neural networks in the frequency domain showed a significant reduction in the number of computation steps required for processing certain data [1,4,5]. Here, we make use of the theory of cross correlation implemented in the frequency domain to increase the speed of neural networks.

By using the presented model, important data can be hidden inside the whole input data and then processed without appearing to the neural network as a particular pattern. This is very useful for reasons of security. Furthermore, these important codes can be encrypted in the input data.

This paper is organized as follows. A medical expert system for diagnosing pediatric respiratory diseases is described in Section II. The design of neural networks for diagnosing pediatric respiratory diseases is introduced in Section III. A fast neural model for testing medical patterns is presented in Section IV. Finally, conclusions are given.

Manuscript received February 28, 2013. H. M. El-Bakry is with the Information Systems Department, Faculty of Computer & Information Sciences, Mansoura University, EGYPT (e-mail: [email protected]). M. Hamada is with the University of Aizu, Aizu Wakamatsu, Japan (e-mail: [email protected]).

978-1-4673-6129-3/13/$31.00 ©2013 IEEE

II. MEDICAL EXPERT SYSTEM FOR DIAGNOSING OF PEDIATRIC RESPIRATORY DISEASES

In the respiratory system, as in other fields of medicine, differentiating between normal and abnormal is difficult. The reason is that no two subjects are exactly alike, because every biological quantity is a variable influenced by heredity, environment, occupation, nutrition, age, sex, culture, and so on. Hence, even in perfectly normal subjects, most biological quantities are scattered over a certain range of values [32]. These facts should be taken into account during the development of medical expert systems for respiratory system diagnosis, which in turn requires close cooperation between the domain expert and the knowledge engineer. The scope for developing a wide variety of medical expert systems in respiratory medicine is unlimited. The complex jargon of respiratory physiology and mechanics has made pulmonary function tests very unpopular among general practitioners.
Decision making in intensive care units for mechanically ventilated patients is challenging even for specialists because of the complex ventilator mechanics. The evaluation of unexplained breathlessness or the assessment of respiratory function in sportsmen is not an easy task. Thus, creating a stable, popular, powerful, and user-friendly program can help the physician in many ways. Chronic diseases like asthma, chronic obstructive pulmonary diseases, interstitial lung diseases, and tuberculosis can be managed better if suitable algorithms are created and a set of treatment choices is outlined. This is useful in managing a large group of patients in busy hospitals, and it also makes decision making in individual cases easier.


Fig. 1. Traditional Neural Networks. [Diagram: inputs I1, I2, ..., In-1, In feed the input layer, followed by the hidden layer and the output layer. The serial input data 1:N is taken in groups of (n) elements, shifted by a step of one element each time, and cross correlated in the time domain with the weights of the hidden layer.]

Only a limited number of medical expert systems are available in this area. Even though the development of medical expert systems started in the mid-1970s with the development of MYCIN [33], the number of articles on computer-aided medical diagnosis up to 1979 was only two [34]. MYCIN was developed to assist in the treatment of infectious diseases, in particular bacterial infections in the blood. In addition to the performance program, it has three adjunct programs that increase system utility and flexibility. After this period, some medical expert systems were developed in the area of the respiratory system [35-39]. On close observation, each of them concentrates more or less on a very narrow, specific area of the respiratory system. None of the expert systems developed can analyze the respiratory system as a whole.

The artificial neural network is another artificial intelligence tool now in use for the development of expert systems. It consists of numerous simple processing units, or neurons, that can be globally programmed for computation. Neural networks can be programmed or trained to store, recognize, and associatively retrieve patterns or database entries, and to solve different types of optimization problems, i.e., to estimate sampled functions when the form of the functions is not known. The overall network behaves as an adaptive function estimator.

III. DESIGN OF NEURAL NETWORK FOR DIAGNOSING OF PEDIATRIC RESPIRATORY DISEASES

Imagine the way the human mind works when presented with a problem. At first, the problem's facts are analyzed and weighted at some sensorial level. Next, these facts are passed through neural paths, which act as filters and are based on previously known patterns. This leads to conclusions, which may be possible solutions to the problem or may serve as additional facts for a new iteration over the neural paths [19].

Artificial neural networks (ANNs) have been used extensively in many research areas, from marketing to medicine [15]. They first received much attention from computer scientists, neurophysiologists, psychologists, and engineers interested in biological nervous system organization and artificial intelligence. Their two main applications in medicine are pattern recognition (classification) and prediction. In recent years (from the 1990s and increasingly in the 2000s), applications for prognostic and diagnostic classification have attracted growing interest in the medical literature [16]. ANNs have been applied to a wide range of problems and have given, in many cases, results superior to those of standard statistical models [17]. In particular, the predictive reliability of ANN models has been demonstrated in medical diagnosis [18].

Neural networks are pattern-learning instruments that are more sophisticated than decision trees and Naïve Bayes. A neural network contains a set of nodes (neurons) and edges that form a network. There are three types of nodes: input, hidden, and output. Each edge links two nodes and has an associated weight. The direction of an edge represents the data flow during the prediction process. Each node is a unit of processing. Input nodes form the first layer of the network. In most neural networks, each input node is mapped to one input attribute (such as productive cough, stridor, or fever). Hidden nodes are the nodes in the intermediate layers. A hidden node receives input from nodes in the input layer or the preceding hidden layer. It combines all the inputs based on the weights of the associated edges, performs some calculations, and emits the result to the subsequent layer. Output nodes usually represent the predictable attributes. A neural network may have multiple output attributes; combining several output nodes in the same network reduces the processing time because such networks can share the common cost of scanning the source data. The result of an output node is often a floating-point number between 0 and 1 [19].

In the presented medical application, the input nodes represent the symptoms of eight respiratory diseases, such as dry cough, productive cough, fever, hemoptysis, tachypnea, chest pain, and dyspnea. They also represent medical signs, such as bronchial breathing, cyanosis, dullness on percussion, hyper-resonance on percussion, pleural rub, respiratory distress, and wheezing, as well as some medical investigations: X-ray [showing lung consolidation], X-ray [showing edematous epiglottis], X-ray [showing subglottic narrowing and classic narrow trachea], and X-ray [showing diffuse haziness]. The output nodes represent eight cases of respiratory disease. In our study, the output value is likewise a floating-point number between 0 and 1, consistent with the convention described above.
The original value of an input attribute must also be scaled to a floating-point number in a common range (often between -1 and 1) before processing [19]. The input data set is shown in Fig. 2. The second column is [0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0]^T, which means that the findings present in this patient are decreased breath sounds on auscultation, dullness on percussion, fever, and respiratory distress: these entries are 1, and all remaining entries are 0. The data set consists of 699 twenty-seven-element input vectors and the corresponding eight-element target vectors. The 699 cases share some symptoms and differ in others. The examination result is one of the eight diagnoses described in Fig. 3. There, the first column [1 0 0 0 0 0 0 0]^T means that the patient suffers from pneumonia; the second column [0 1 0 0 0 0 0 0]^T means that the patient with input vector [0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0]^T suffers from empyema, and so on.
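The encoding just described can be illustrated with a short sketch. It uses NumPy (an assumption; the authors worked in MATLAB) and only the two vectors quoted in the text. The mapping from positions to named symptoms is not reproduced here because the text does not list it in full; the variable names are illustrative.

```python
import numpy as np

# Example input column quoted in the text: 27 binary flags for
# symptoms, signs, and investigations (1 = present, 0 = absent).
x = np.array([0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1,
              0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

# One-hot target quoted in the text for the same case
# (second of the eight classes, which the text names as empyema).
t = np.array([0, 1, 0, 0, 0, 0, 0, 0])

# 1-based positions of the findings that are present.
present = np.flatnonzero(x) + 1

# 1-based index of the diagnosed class.
diagnosis = int(np.argmax(t)) + 1

print(present.tolist())
print(diagnosis)
```

Four entries of the input vector are set, matching the four findings named in the text, and the one-hot target selects the second class.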

Fig. 2. Sample of the input patterns used for training neural network.

Fig. 3. Sample of target outputs.

The simulation program was implemented in MATLAB, and the network was trained with the input and target vectors as shown in Fig. 4. Before training, rough sets are used to reduce the length of the input patterns. This is done by finding the most important elements in these medical patterns. As a result, the number of neurons in the hidden layer is reduced to 20. A data set of 699 samples obtained from experimental studies is used for the ANNs: 699 patterns are used for training the neural network, and the remaining 40% of the patterns are randomly selected and used as the test data set. The results are shown in Figs. 5-11. After training, the network is tested with several different cases. The results show that the neural network gives the correct diagnosis.

Fig. 4. Neural network training screen.

It can be seen that the mean squared error curves for the training data, validation data, and test data all follow the same curve. This means that the network learns from the cases, and how well it learns depends on the number of input cases. If the number of input cases is large, the outputs for the test cases are highly accurate.

Fig. 9. All curves true and false positive rates.

After training the network with the input and target vectors, we test it by presenting twenty cases without their target vectors and letting the network produce the diagnosis, which is one of the eight cases.

Fig. 5. Mean squared error for (Train, Validation, Test) data.

Fig. 10. Patterns used in testing the network.

Fig. 6. Training curve true and false positive rates.

The results for the test patterns are shown in Fig. 11. In the first column, the first value is 0.8130, which is close to 1, and all the following values are approximately zero. This means that the diagnosis for this patient is pneumonia, and so on for the other columns.

Fig. 7. Validation curve true and false positive rates.

Fig. 8. The test curve true and false positive rates.

Fig. 11. Result of the testing data.


IV. ACCELERATING THE OPERATION OF NEURAL NETWORKS FOR FAST TESTING OF MEDICAL PATTERNS

The operation of neural networks is divided into two phases. First, the neural networks are trained to classify the input medical patterns. Then, in the test phase, each pattern in the incoming matrix is processed and tested by the neural networks. At each position in the one-dimensional input matrix, each sub-matrix is multiplied by a window of weights that has the same size as the sub-matrix. The outputs of the neurons in the hidden layer are multiplied by the weights of the output layer. Thus, the whole problem is a cross correlation between the incoming serial data and the weights of the neurons in the hidden layer.

By the convolution theorem and its counterpart for correlation, the cross correlation of f with h is identical to the result of the following steps: let F and H be the Fourier transforms of f and h. Multiply F and H* (the complex conjugate of H) point by point in the frequency domain, and then transform this product back into the spatial domain via the inverse Fourier transform. As a result, these cross correlations can be represented by a product in the frequency domain. Thus, by using cross correlation in the frequency domain, a speed-up of an order of magnitude can be achieved during the test phase [1,4,5].

Assume that the size of the input pattern is 1xn. In the test phase, a sub-matrix I of size 1xn (sliding window) is extracted from the tested matrix, which has a size of 1xN. This sub-matrix, which contains the input pattern, is fed to the neural network. Let W_i be the vector of weights between the input sub-matrix and hidden neuron i. This vector has a size of 1xn and can be represented as a 1xn matrix. The output of hidden neuron i, h_i, can be calculated as follows [1,5]:

$$h_i = g\left(\sum_{k=1}^{n} W_i(k)\, I(k) + b_i\right) \tag{1}$$



where g is the activation function and b_i is the bias of hidden neuron i. Eq. 1 gives the output of each hidden neuron for a particular sub-matrix I. It can be extended to the whole input matrix Z as follows [2]:

$$h_i(u) = g\left(\sum_{k=-n/2}^{n/2} W_i(k)\, Z(u+k) + b_i\right) \tag{2}$$
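As a reference point for the frequency-domain formulation, the sliding-window computation in Eq. 2 can be sketched directly in the time domain. This is a minimal NumPy sketch, not the authors' MATLAB code. It assumes a sigmoid for the activation g and indexes each window from its first element (k = 0, ..., n-1) rather than symmetrically around u; the function names and sizes are illustrative.

```python
import numpy as np

def sigmoid(a):
    """Assumed activation function g (the text does not specify it)."""
    return 1.0 / (1.0 + np.exp(-a))

def hidden_time_domain(Z, W, b):
    """Hidden activations for every window position.

    Z: input vector of length N.
    W: weights of shape (q, n), one row per hidden neuron.
    b: biases of shape (q,).
    Returns an array of shape (q, N - n + 1).
    """
    n = W.shape[1]
    # All length-n windows of Z, shifted by one element each time.
    windows = np.lib.stride_tricks.sliding_window_view(Z, n)
    return sigmoid(W @ windows.T + b[:, None])

rng = np.random.default_rng(0)
Z = rng.standard_normal(64)          # hypothetical test vector, N = 64
W = rng.standard_normal((20, 27))    # q = 20 hidden neurons, n = 27 inputs
b = rng.standard_normal(20)
h = hidden_time_domain(Z, W, b)
print(h.shape)
```

Each column of the result is Eq. 1 applied to one window, so the cost grows with the product of q, n, and the number of window positions.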

Eq. 2 represents a cross correlation operation. Given any two functions f and d, their cross correlation can be obtained by [7]:

$$d(x) \otimes f(x) = \sum_{n=-\infty}^{\infty} f(x+n)\, d(n) \tag{3}$$

Therefore, Eq. 2 may be written as follows [3-83]:

$$h_i = g\left(W_i \otimes Z + b_i\right) \tag{4}$$

where h_i is the output of hidden neuron i and h_i(u) is the activity of hidden unit i when the sliding window is located at position u, with u ∈ [N-n+1], that is, u ranges over the N-n+1 positions of the sliding window. The above cross correlation can now be expressed in terms of the one-dimensional Fast Fourier Transform as follows [1,4,5]:

$$W_i \otimes Z = F^{-1}\left(F(Z) \bullet F^{*}(W_i)\right) \tag{5}$$
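The equivalence stated in Eq. 5 can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses NumPy's FFT routines, zero-pads each weight vector to the length of Z, and keeps only the first N-n+1 positions, where the circular correlation computed by the FFT does not wrap around. The helper name `xcorr_fft` and the sizes are hypothetical.

```python
import numpy as np

def xcorr_fft(Z, W):
    """Cross correlate each row of W (q, n) with Z (N,) in the frequency domain.

    Returns an array of shape (q, N - n + 1).
    """
    N = Z.size
    q, n = W.shape
    W_pad = np.zeros((q, N))
    W_pad[:, :n] = W                       # zero-pad each weight vector to length N
    FZ = np.fft.fft(Z)                     # one forward FFT of the input
    FW = np.fft.fft(W_pad, axis=1)         # q forward FFTs of the weights
    full = np.fft.ifft(FZ * np.conj(FW), axis=1).real   # q inverse FFTs
    return full[:, : N - n + 1]            # positions without wrap-around

rng = np.random.default_rng(0)
Z = rng.standard_normal(256)               # hypothetical test vector
W = rng.standard_normal((20, 27))          # hypothetical hidden-layer weights

# Time-domain reference: sum over k of W_i(k) * Z(u + k) at every valid u.
direct = np.array([np.correlate(Z, w, mode="valid") for w in W])
fast = xcorr_fft(Z, W)
print(np.allclose(direct, fast))
```

Adding the bias and applying the activation to the result gives Eq. 4. The transform of Z is computed once and reused for every hidden neuron.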

Hence, by evaluating this cross correlation, a speed-up ratio can be obtained compared to traditional neural networks. Also, the final output of the neural network can be evaluated as follows:

$$O(u) = g\left(\sum_{i=1}^{q} W_o(i)\, h_i(u) + b_o\right) \tag{6}$$

where q is the number of neurons in the hidden layer, O(u) is the output of the neural network when the sliding window is located at position u in the input matrix Z, and W_o is the weight matrix between the hidden and output layers.

The complexity of cross correlation in the frequency domain can be analyzed as follows:

1- For a tested matrix of 1xN elements, the 1D-FFT requires N log2 N complex computation steps [42]. The same number of complex computation steps is required for computing the 1D-FFT of the weight matrix at each neuron in the hidden layer.

2- At each neuron in the hidden layer, the inverse 1D-FFT is computed. Therefore, q backward and (1+q) forward transforms have to be computed. For a given matrix under test, the total number of operations required to compute the 1D-FFT is therefore (2q+1) N log2 N.

3- The number of computation steps required by the fast neural model (FNM) shown in Fig. 2 is complex and must be converted into a real version. It is known that the one-dimensional Fast Fourier Transform requires (N/2) log2 N complex multiplications and N log2 N complex additions [40]. Every complex multiplication is realized by six real floating-point operations, and every complex addition is implemented by two real floating-point operations. Therefore, the total number of computation steps required to obtain the 1D-FFT of a 1xN matrix is:

$$\rho = 6\left(\frac{N}{2}\log_2 N\right) + 2\left(N \log_2 N\right) \tag{7}$$

which, since 6(N/2) log2 N = 3N log2 N, may be simplified to:

$$\rho = 5N\log_2 N \tag{8}$$

4- Both the input and the weight matrices should be dot multiplied in the frequency domain. Thus, qN complex computation steps should be considered. This means that 6qN real operations are added to the number of computation steps required by fast neural networks (FNNs).

5- In order to perform cross correlation in the frequency domain, the weight matrix must be extended to have the same size as the input matrix. So, (N-n) zeros must be added to the weight matrix. This requires a total of q(N-n) real computation steps for all neurons. Moreover, after computing the FFT of the weight matrix, the conjugate of this matrix must be obtained. As a result, qN real computation steps should be added in order to obtain the conjugate of the weight matrix for all neurons. Also, N real computation steps are required to create the butterfly complex numbers (e^{-jk(2πn/N)}), where 0