The International Congress for global Science and Technology

ICGST International Journal on Artificial Intelligence and Machine Learning

(AIML) Volume (10), Issue (I) October, 2010 www.icgst.com www.icgst-amc.com www.icgst-ees.com © ICGST LLC, Delaware, USA, 2010

AIML Journal ISSN Print 1687-4846 ISSN Online 1687-4854 ISSN CD-ROM 1687-4862 © ICGST LLC, Delaware, USA, 2010

AIML Journal, Volume 10, Issue 1

Table of Contents

P1121004977, Alaa M. Elsayad, "Diagnosis of Breast Tumor using Boosted Decision Trees", pp. 1--11

P1121002967, Y. Mechqrane, R. Ezzahir, C. Bessiere and E.H. Bouyakhf, "A Constraint Based Approach To Air Traffic Control", pp. 13--22

P1121022077, Tarek Aboueldahab and Mahumod Fakhreldin, "Stock Market Indices Prediction via Hybrid Sigmoid Diagonal Recurrent Neural Networks and Enhanced Particle Swarm Optimization", pp. 23--30

P1121026153, S. Abu Naser, R. Al-Dahdooh, A. Mushtaha and M. El-Naffar, "Knowledge Management in ESMDA: Expert System for Medical Diagnostic Assistance", pp. 31--40

P1121052912, A. Thobbi, R. Kadam and W. Sheng, "Achieving Remote Presence using a Humanoid Robot Controlled by a Non-Invasive BCI Device", pp. 41--45

P1121052911, S. Kokila, P. Gomathi and T. Manigandan, "Design of Energy Efficient Humidification Plant for Textile Processing", pp. 47--54

ICGST International Journal on Artificial Intelligence and Machine Learning (AIML) A publication of the International Congress for global Science and Technology (ICGST)

ICGST Editor in Chief Dr. rer. nat. Ashraf Aboshosha www.icgst.com, www.icgst-amc.com, www.icgst-ees.com [email protected]


Diagnosis of Breast Tumor using Boosted Decision Trees

Alaa M. Elsayad
Department of Computers and Systems, Electronics Research Institute, 12622 Bohoth St., Dokki, Geza, Egypt.
[email protected], www.eri.sci.eg

Abstract
The decision tree (DT) is one of the most popular and effective data mining methods. DTs provide a way to find "rules" that separate the input samples into one of several groups without having to express the functional relationship directly. They avoid the limitations of parametric models and are well suited to the analysis of nonlinear events. The purpose of this study is to examine the performance of the recently developed C5.0 DT algorithm on the diagnosis of breast cancer using a cytologically proven tumor dataset. The objective is to classify a tumor as either benign or malignant based on cell descriptions gathered by microscopic examination. The classification performance of the C5.0 DT is evaluated and compared to that achieved by a radial basis function kernel support vector machine (RBF-SVM). The dataset has been partitioned into training and test subsets by the ratio 70:30%. Experimental results show that the generalization of the C5.0 DT is increased markedly by boosting, winnowing and tree pruning. The C5.0 DT model achieved a remarkable performance with 98.95% classification accuracy on the training subset and 100% on the test subset, while the RBF-SVM achieved 100% on both the training and test subsets.
Keywords: Breast cancer, cytology patterns, decision tree, support vector machine, performance measures.

1. Introduction
Breast cancer is a malignant tumor that develops from cells of the breast. It occurs in both men and women, although male breast cancer is rare. Worldwide, it is the most common form of cancer in females, affecting about 10% of all women at some stage of their life. It is the second leading cause of death for women in the United States, and the leading cause of cancer deaths for women aged 40-59 [3,4]. Not all tumors are cancer: tumors may be benign or malignant. Benign tumors are abnormal growths and are rarely life-threatening, although some benign breast lumps can increase a woman's risk of getting breast cancer, and some women with a history of breast biopsy for benign breast disease have an increased risk of breast cancer [3]. Malignant tumors, on the other hand, are cancer and are generally more serious, but early detection of this kind of cancer increases the chances of successful treatment [3]. X-ray mammography is the most common technique used by radiologists in the screening and diagnosis of breast cancer. Where interpretations of a mammogram vary, fine needle aspiration cytology (FNAC) is adopted [5]. The average correct identification rate of FNAC is only 90% [3,4], so better identification methods are needed to recognize breast cancer. Data mining methods can help reduce the number of false positive and false negative decisions [6-8]. The objective of these techniques is to assign a patient either to a 'benign' group that does not have breast cancer or to a 'malignant' group with strong evidence of breast cancer.

DT approaches are widely used data mining methods. They focus on conveying the relationships among the rules that express the results, allow for non-linear relations between independent attributes and their outcomes, and isolate outliers [18]. The C5.0 DT model is an improved version of the C4.5 and ID3 algorithms; it includes discretization of numerical attributes using information-theoretic functions, boosting, pre- and post-pruning, and other state-of-the-art options for building a DT model [14,16]. This paper investigates the effectiveness of the C5.0 DT model on the diagnosis of breast tumors as benign or malignant. The results are compared to those obtained using a radial basis function kernel support vector machine (RBF-SVM); SVM has been chosen as a good candidate because of its high generalization performance [11]. The remainder of the paper is organized as follows. Section 2 describes the background of this study; it presents the cytological attributes included in the dataset and reviews recent previous work. Section 3 describes the two classification models used here, the C5.0 DT and the RBF-SVM. Section 4 presents the statistical measures used to evaluate classification performance and the experimental results. Finally, conclusions and acknowledgments are given in Sections 5 and 6 respectively.

2. Background
2.1. Breast cancer dataset
The breast cancer dataset used here was collected by Dr. William H. Wolberg (1989-1991) at the University of Wisconsin-Madison Hospitals. The dataset contains 699 samples, of which 16 have attributes with missing values and 683 have complete data. Each sample records visually assessed nuclear features of fine needle aspirates taken from patients' breasts. Each record has nine cytological attributes measuring the external appearance and the internal chromosome changes on nine different scales. The nine attributes are graded on an interval scale from 1 (normal state) to 10 (the most abnormal state) [9]; they are all stored as ordinal data (an ordered set). The class attribute is of flag type with two states, 2 for benign and 4 for malignant, as shown in Table 1. This dataset can be downloaded from the University of California at Irvine (UCI) machine learning repository [1].
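For readers who wish to reproduce this setup, a minimal loading sketch is given below. It assumes the usual layout of the breast-cancer-wisconsin file in the UCI repository [1] (a leading sample-ID column and "?" codes for missing values); the file name is an assumption, not a detail stated in the paper.

```python
import pandas as pd

# Column names follow Table 1; file name and layout are those assumed for the
# "breast-cancer-wisconsin" data in the UCI repository [1].
columns = ["Sample_ID", "Clump_Thickness", "Uniformity_of_Cell_Size",
           "Uniformity_of_Cell_Shape", "Marginal_Adhesion",
           "Single_Epithelial_Cell_Size", "Bare_Nuclei", "Bland_Chromatin",
           "Normal_Nucleoli", "Mitoses", "Class"]

df = pd.read_csv("breast-cancer-wisconsin.data", names=columns,
                 na_values="?")   # missing values are assumed to be coded as "?"
df = df.dropna()                  # keep only the 683 complete records
print(len(df))                    # expect 683
print(df["Class"].value_counts()) # expect 444 benign (2) and 239 malignant (4)
```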

Table 1: Description of the Wisconsin breast cancer dataset; 683 complete samples, 239 malignant and 444 benign.

No.*  Attribute name                Description                                                      Type     Values
1     Clump Thickness                                                                                Ordinal  1-10
2     Uniformity of Cell Size                                                                        Ordinal  1-10
3     Uniformity of Cell Shape                                                                       Ordinal  1-10
4     Marginal Adhesion             Fibrous bands of tissue that form between two fibrous surfaces   Ordinal  1-10
5     Single Epithelial Cell Size   Size of a single cell of the tissue that lines the outside of the body and the passageways leading to or from the surface   Ordinal  1-10
6     Bare Nuclei                                                                                    Ordinal  1-10
7     Bland Chromatin               Evaluates the presence of Barr bodies                            Ordinal  1-10
8     Normal Nucleoli                                                                                Ordinal  1-10
9     Mitoses                       Cell growth                                                      Ordinal  1-10
10    Class                         The required diagnosis                                           Flag     2, 4
* All predictive attributes have a common range, from 1 to 10.

2.2 Literature review
The diagnosis of breast cancer has attracted many researchers, who have reached high classification accuracies using the dataset taken from the UCI machine learning repository. Three papers published in 2009 are reviewed here:
1. T.S. Subashini et al. compared the use of a polynomial kernel support vector machine (SVM) and a radial basis function neural network (RBFNN) in ascertaining the diagnostic accuracy of the Wisconsin dataset. They reported 92.13% accuracy with their SVM model and 96.57% with the RBFNN [30]. In their study, they normalized all the attributes between -1 and +1 so that the classifier had a common range to work with.
2. M. Karabatak and M. Cevdet applied association rules and a neural network to build a diagnostic expert system for breast cancer and achieved 97.4% accuracy [31].
3. P. Luukka used fuzzy robust PCA algorithms (FRPCA) and a similarity classifier; the algorithm achieved 98.19% accuracy on the breast cancer data [33].
In 2007, K. Polat et al. published a paper that used a least squares support vector machine and achieved 96.59% accuracy when the dataset was partitioned into 70% for training and 30% for testing [32]; their paper also presents a good review of the accuracies achieved in previous papers. On the other hand, the C4.5 DT model was used by Quinlan in [34], achieving 94.74% classification accuracy with 10-fold cross-validation.
The accuracies achieved here using the RBF-SVM model, with gamma (the width of the kernel) set to 0.66 and the regularization (cost) parameter set to 10, are 100% on both the training and test samples. In this study there is no preprocessing of the attributes, as they all logically have a common range. The accuracies achieved using the C5.0 DT model with boosting, pruning and winnowing are 98.95% on the training samples and 100% on the test samples.

3. Classification models
3.1 Decision tree
DT models are powerful classification algorithms that are becoming increasingly popular with the growth of data mining applications [12]. As the name implies, the model recursively separates data samples into branches to construct a tree structure for the purpose of improving classification accuracy. Each tree node is either a leaf node or a decision node. All decision nodes have splits, which test the values of some function of the data attributes; each branch from a decision node corresponds to a different outcome of the test, and each leaf node has a class label attached to it. The general algorithm to build a DT is as follows:
1. Start with the entire training subset and an empty tree.
2. If all training samples at the current node n have the same class label c, the node becomes a leaf node with label c.
3. Otherwise, select the splitting attribute x that is most important in separating the training samples into different classes. This attribute x becomes a decision node.
4. A branch is created for each individual value of x, and the samples are partitioned accordingly.
5. The process is iterated recursively until the specified stopping criterion is reached.

Different DT models use different splitting algorithms that maximize the purity of the resulting classes of data samples. Popular DT models include ID3, C4.5 [13,14], CART [15], QUEST [17], CHAID [10] and C5.0 [16]. Common splitting criteria include entropy-based information gain (used in ID3, C4.5 and C5.0), the Gini index (used in CART) and the chi-squared test (used in CHAID). This study uses the C5.0 DT algorithm, which is an improved version of the C4.5 and ID3 algorithms [16]. It is a commercial product designed by RuleQuest Research Pty Ltd to analyze huge datasets and is implemented in the SPSS Clementine data mining workbench [2]. C5.0 uses information gain as a measure of purity, which is based on the notion of entropy. Suppose the training subset consists of n samples X = {x1, x2, ..., xn}, where xi ∈ R^p is the vector of independent attributes of sample i and each sample has one class attribute yi ∈ C, with C a predefined set of k classes. The entropy of the set X relative to this k-wise classification is defined as:

entropy(X) = Σj −pj log2(pj),   (1)

where pj is the proportion of X belonging to class cj and the sum runs over the k classes. The information gain, gain(X, A), is simply the expected reduction in entropy caused by partitioning the set of samples X on an attribute A:

gain(X, A) = entropy(X) − Σ_{v ∈ values(A)} (|Xv| / |X|) entropy(Xv),   (2)

where values(A) is the set of all possible values of attribute A, and Xv is the subset of X for which attribute A has the value v, i.e. Xv = {x ∈ X | A(x) = v}.

Boosting, winnowing and pruning are the three methods used in the C5.0 tree construction; their purpose is to build a tree of the right size [11]. They increase the generalization and reduce the overfitting of the DT model. Boosting is a method for combining classifiers; it works by building multiple models in a sequence. The first model is built in the usual way; the second model is then built so that it focuses on the samples misclassified by the first model, the third model focuses on the second model's errors, and so on [22]. When a new sample is to be classified, each model votes for its predicted class and the votes are counted to determine the final class. Winnowing investigates the usefulness of the predictive attributes before the model is built [19]. This ability to pick and choose among the predictive attributes is an important advantage of tree-based modeling techniques: winnowing preselects a subset of the attributes that will be used to construct the tree, and attributes that are irrelevant are excluded from the tree-building process. In the case of the current cytological dataset, only five attributes were selected to build the tree. Pruning is the last method used here to increase the performance of the C5.0 DT model. It consists of two steps, pre-pruning and post-pruning [20]: the pre-pruning step allows only nodes with a minimum number of samples (node size), while the post-pruning step reduces the tree size based on the estimated classification errors.
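As a concrete illustration of Equations (1) and (2), the following sketch computes the entropy of a label set and the information gain of a categorical split. The attribute name and the four toy records are hypothetical; C5.0 itself adds machinery (discretization, boosting, winnowing, pruning) that is not shown here.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels, Eq. (1): sum_j -p_j * log2(p_j)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(samples, labels, attribute):
    """Information gain of splitting (samples, labels) on `attribute`, Eq. (2).

    `samples` is a list of dicts mapping attribute names to values.
    """
    n = len(labels)
    # Partition the labels by the value the attribute takes in each sample.
    partitions = {}
    for sample, label in zip(samples, labels):
        partitions.setdefault(sample[attribute], []).append(label)
    remainder = sum((len(part) / n) * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder

# Toy illustration with hypothetical cytology-like records (values 1-10, class 2/4):
samples = [{"clump_thickness": 1}, {"clump_thickness": 9},
           {"clump_thickness": 2}, {"clump_thickness": 8}]
labels = [2, 4, 2, 4]
print(information_gain(samples, labels, "clump_thickness"))  # 1.0: a perfect split
```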

Figure 1: Mapping the input space to a higher dimensional feature space.

Figure 2: Optimal hyperplane separating the two classes, with the support vectors lying on the margins.

3.2 Support vector machine
The SVM model is a supervised machine learning technique based on statistical learning theory. It was first proposed by Cortes and Vapnik, building on Vapnik's original work on structural risk minimization [23], and was later extended by Vapnik [24]. The SVM algorithm is able to create a complex decision boundary between two classes with good classification ability. Figures 1 and 2 illustrate the basic principles of SVM. When the data are not linearly separable, the algorithm maps the input space to a higher dimensional feature space through some nonlinear mapping chosen a priori (Figure 1) and constructs a hyperplane that splits class members from non-members (Figure 2). SVM introduces the concept of a 'margin' on either side of the hyperplane that separates the two classes. Maximizing the margin, and thus creating the largest possible distance between the separating hyperplane and the samples on either side, is proven to reduce an upper bound on the expected generalization error. SVM may be considered a linear classifier in the feature space; it becomes a nonlinear classifier as a result of the nonlinear mapping from the input space to the feature space [25,26]. For linearly separable classes, SVM divides the classes by finding the optimal (maximum-margin) separating hyperplane, which can be found by solving a convex quadratic programming (QP) problem [11]. Once the optimal separating hyperplane is found, the data samples that lie on its margin are known as support vectors. The solution to this optimization problem is a global one.

For a linearly separable decision space, suppose the training subset consists of n samples (x1, y1), ..., (xn, yn), with xi ∈ R^p and yi ∈ {+1, −1}, i.e. the data contain only two classes. The separating hyperplane can be written as:

D(xi) = w·xi + b,   (3)

where the vector w and the constant b are learned from a training subset of linearly separable samples. The solution of the SVM is equivalent to solving a linearly constrained quadratic programming problem subject to Equation (4), which covers both targets y = 1 and −1:

yi (w·xi + b) ≥ 1,   i = 1, ..., n.   (4)

As mentioned before, samples that satisfy the above formula with equality are referred to as support vectors, and SVM classifies any new sample using these support vectors. The margins of the hyperplane satisfy the inequality:

yi D(xi) / ||w|| ≥ Γ,   i = 1, ..., n.   (5)

The norm of w has to be minimized in order to maximize the margin Γ. To reduce the number of solutions for the norm of w, the following normalization is assumed:

Γ ||w|| = 1.   (6)

The algorithm then tries to minimize ½||w||² subject to the condition in Equation (4). In the case of non-separable samples, slack parameters ξi are added to Equation (4) as follows:

yi (w·xi + b) ≥ 1 − ξi,   ξi ≥ 0, ∀i,   (7)

and the quantity to be minimized becomes:

C Σi ξi + ½||w||²,   (8)

where C is the regularization parameter (sometimes called the cost parameter). It determines the level of tolerance of the model, with larger C values allowing larger deviations from the optimal solution; this parameter is optimized to balance the classification error against the complexity of the model. There is a family of kernel functions that may be used to map the input space into the feature space, ranging from simple linear and polynomial mappings to sigmoid and radial basis functions (RBFs). Once a hyperplane has been created, the kernel function is used to map new samples into the feature space for classification. This mapping technique makes SVM dimensionally independent, whereas other machine learning techniques are not. This study uses the RBF kernel to map the input space into the higher dimensional feature space. The RBF kernel is controlled by adjusting the width of the basis functions σ, so only one kernel parameter needs to be optimized [27]:

K(x, x′) = exp(−||x − x′||² / σ²),   (9)

where σ is a specified positive real number which determines the width of the RBF kernel. The RBF-SVM classification model therefore has two parameters that need to be optimized: the width of the basis function σ and the regularization parameter C [28].
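The following sketch shows what an RBF-kernel SVM of this form looks like with scikit-learn, on two made-up cytology-style records. Note that scikit-learn writes the kernel as exp(−γ·||x − x′||²), so γ plays the role of 1/σ² in Equation (9); the values 0.66 and 10 merely echo the settings reported later in this paper and are not a claim about the exact Clementine configuration.

```python
import numpy as np
from sklearn.svm import SVC

# Illustrative training data: rows of the nine cytological attributes (1-10),
# labels 2 (benign) / 4 (malignant) as in Table 1. The rows are made up.
X_train = np.array([[5, 1, 1, 1, 2, 1, 3, 1, 1],
                    [8, 10, 10, 8, 7, 10, 9, 7, 1]])
y_train = np.array([2, 4])

# scikit-learn's gamma corresponds to 1 / sigma^2 in Eq. (9).
clf = SVC(kernel="rbf", gamma=0.66, C=10)
clf.fit(X_train, y_train)
print(clf.predict([[4, 2, 1, 1, 2, 1, 2, 1, 1]]))  # predicts 2 or 4
```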

4. Experimental results
The classification performance of each model is evaluated using three statistical measures: classification accuracy, sensitivity and specificity. These measures are defined using the numbers of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN). A true positive decision occurs when the positive prediction of the classifier coincides with a positive prediction of the pathologist; a true negative decision occurs when both the classifier and the pathologist suggest the absence of a positive prediction. A false positive occurs when the classifier labels a benign case (negative) as a malignant one (positive), and a false negative occurs when the system labels a positive case as negative. Classification accuracy is the proportion of correctly classified cases and is equal to the sum of TP and TN divided by the total number of cases N:

Accuracy = (TP + TN) / N.   (10)

Sensitivity refers to the rate of correctly classified positives and is equal to TP divided by the sum of TP and FN; it may also be referred to as the true positive rate:

Sensitivity = TP / (TP + FN).   (11)

Specificity refers to the rate of correctly classified negatives and is equal to the ratio of TN to the sum of TN and FP; the false positive rate equals (100% − specificity):

Specificity = TN / (TN + FP).   (12)
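A short sketch of Equations (10)-(12), checked against the C5.0 training confusion matrix reported later in Table 2 (taking the malignant class as positive, as in the definitions above):

```python
def classification_measures(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity from Eqs. (10)-(12)."""
    n = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / n,        # Eq. (10)
        "sensitivity": tp / (tp + fn),    # Eq. (11), true positive rate
        "specificity": tn / (tn + fp),    # Eq. (12); FPR = 1 - specificity
    }

# C5.0 confusion matrix on the training subset (Table 2):
# TP = 160 malignant correctly flagged, TN = 313, FP = 3, FN = 2.
print(classification_measures(tp=160, tn=313, fp=3, fn=2))
# accuracy ~ 0.9895, sensitivity ~ 0.9877, specificity ~ 0.9905, matching Table 3.
```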

Figure 3 demonstrates the component nodes of the proposed stream. The stream is implemented in the SPSS Clementine data mining workbench on an Intel Core 2 Duo CPU at 2.00 GHz. Clementine uses a client/server architecture to distribute requests for resource-intensive operations to powerful server software, resulting in faster performance on larger datasets [2]. SPSS Clementine is very appropriate as a mining engine, with an interface and manipulation modules that allow data examination, manipulation and exploration of any interesting knowledge patterns. The software offers many modeling techniques, such as prediction, classification, segmentation and association detection algorithms. The components of the data mining stream are:

Cytological breast cancer dataset node: connected directly to the SPSS file that contains the source data. The dataset was explored for incorrect, inconsistent or missing values; the system uses only the 683 complete records out of 699. No preprocessing has been done on the attributes as they are all of ordinal type; their values represent categories with some intrinsic ranking, all on a scale from 1 (normal state) to 10 (the most abnormal state), so there is no need for normalization. Logically, all models have a common range to work with.

Type node: specifies the field metadata and properties that are important for modeling and other work in Clementine. These properties include the usage type, options for handling missing values, and the role of an attribute for modeling purposes (input or output). As previously stated, the first 9 attributes in Table 1 are defined as input (predictive) attributes and the disease is defined as the target class.

Figure 3: Diagnosis of the cytological breast cancer dataset using the C5.0 DT and RBF-SVM models.

Partition node: generates a partition field that splits the dataset into separate subsets for training and testing the models. In this study the dataset was partitioned by the ratio 70:30% into training and test subsets respectively.

C5.0 node: the DT model, trained using boosting (with 10 trials), pruning and winnowing to increase the model accuracy. The minimum number of samples per node is set to 2 and the system uses equal misclassification costs. The high speed of the C5.0 DT model is a notable feature; it clearly uses a special technique, although this has not been described in the open literature. The results listed in Table 3 show that boosting, pruning and winnowing enhanced the accuracy of the DT model to reach 100% on the test samples. Furthermore, Figure 6(a) illustrates that only 5 attributes are required to predict the diagnosis with this degree of accuracy.

SVM node: trains the RBF-SVM model. The value of σ (the width of the radial basis function) should normally lie between 3/k and 6/k, where k is the number of input attributes; with 9 attributes in the input dataset, its value is normally chosen in the range 1/3 to 2/3. Increasing the value improves the classification accuracy on the training samples, but this can also lead to overfitting. In this study, two values of σ have been used, 0.33 and 0.66; both result in 100% classification accuracy on the test samples. However, the 0.66 value uses only 8 attributes instead of 9: Figure 6(b) illustrates that the "Clump Thickness" attribute has no importance at all in the identification of the class value. The regularization parameter C controls the trade-off between maximizing the margin and minimizing the training error term, as discussed in Section 3.2. Normally, its value should be between 1 and 10; increasing the value improves the classification accuracy on the training samples, but this can also lead to overfitting. The best value of this parameter was found to be 10.

Filter, Analysis and Evaluation nodes: select and rename the classifier outputs in order to compute the statistical performance measures and to plot the evaluation charts.

The Wisconsin breast cancer dataset contains 683 complete cytology records, each with 9 attributes and 2 diagnosis classes, malignant and benign. The whole dataset is divided for training and testing the models by the ratio 70:30% respectively. The training set is used to estimate the model parameters, while the test set is used to independently assess each model; the models can then be applied again to the entire dataset and to any new data. Both models are fast: the time required to build each model on the Wisconsin dataset is below one second. In the C5.0 DT model, boosting can significantly improve the accuracy of the model, but it also requires longer training. It works by building multiple models in a sequence; cases are classified by applying the whole set of models to them and using a voting procedure to combine the separate predictions into one overall prediction. Figure 4 shows the resulting DT model without boosting, while Figure 5 lists the classification accuracies of the 10 boosted decision trees. A sketch of this experimental setup with open-source tools is given below.
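The rough open-source analogue referred to above assumes the cleaned dataframe df from the loading sketch in Section 2.1. Since C5.0's boosting, winnowing and pruning are proprietary, AdaBoost over decision trees with 10 rounds stands in for the 10-trial boosted C5.0, so the resulting accuracies will not exactly match those reported with Clementine.

```python
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

# `df` is the cleaned 683-record frame from the loading sketch in Section 2.1.
X = df.drop(columns=["Sample_ID", "Class"])
y = df["Class"]

# 70:30 partition into training and test subsets, as in the paper.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0, stratify=y)

# Stand-in for the boosted (10-trial) C5.0 model: AdaBoost over decision trees.
boosted_dt = AdaBoostClassifier(DecisionTreeClassifier(max_depth=5), n_estimators=10)
rbf_svm = SVC(kernel="rbf", gamma=0.66, C=10)

for name, model in [("boosted DT", boosted_dt), ("RBF-SVM", rbf_svm)]:
    model.fit(X_train, y_train)
    print(name, "test accuracy:", model.score(X_test, y_test))
```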

Figure 4: Decision tree generated on the training subset (70% of the whole set) of the cytological breast cancer dataset using the single-trial C5.0 DT model (without boosting).

Figure 5: Classification accuracies of the 10-trial C5.0 DT on the training subset.

The overall effect of the boosting algorithm is to create an adaptive weighted expansion of decision trees, which can produce an excellent fit of the predicted class values to the real ones. While there are some errors in diagnosing the training subset, the application of the boosted trees to the test subset achieved 100% success, as shown in Tables 2 and 3. The predictions of both models are compared to the original classes to identify the numbers of true positives, true negatives, false positives and false negatives. These values have been used to construct the confusion matrices tabulated in Table 2, where each cell contains the raw number of samples classified for the corresponding combination of desired and actual classifier outputs. The values of the statistical measures (sensitivity, specificity and total classification accuracy) of the two models were computed and are presented in Table 3. Sensitivity and specificity approximate the probability of the positive and negative labels being true; they assess the usefulness of the algorithm on a single model. From the results shown in Table 3, it can be seen that the sensitivity, specificity and classification accuracy of both models reach 100% on the test samples. However, the classification accuracy of the C5.0 DT model is 98.95% on the training samples. This percentage could be increased to 99.58% if the pruning parameters were changed to allow a minimum of one record per child branch, but the default value is 2, which helps prevent overtraining on noisy data. Sensitivity analysis is frequently used to recognize the degree to which each predictive attribute contributes to the identification of the output class values [29]. Normally, experts want to focus their modeling efforts on the attributes that matter most. Figures 6(a) and 6(b) show the relative importance of each attribute in the C5.0 DT and RBF-SVM models respectively. While Clump Thickness is the most important attribute in the C5.0 DT model, it has no importance at all in the RBF-SVM model. It is clear that the C5.0 DT model uses only 5 attributes (clump thickness, uniformity of cell shape, bare nuclei, uniformity of cell size and marginal adhesion), while the RBF-SVM uses 8 attributes. Figure 7 shows the cumulative gains charts of the two models for the training and test subsets; higher lines indicate better models, especially on the left side of the chart. The two curves are identical for the test subset and almost identical for the training subset.

Table 2: Confusion matrices of C5.0 DT and RBF-SVM for the training and test subsets

                              Training data             Test data
Model   Desired output        Benign    Malignant       Benign    Malignant
SVM     Benign                316       0               128       0
        Malignant             0         162             0         77
C5.0    Benign                313       3               128       0
        Malignant             2         160             0         77

Table 3: Values of the statistical measures of C5.0 DT and RBF-SVM for the training and test subsets

Model   Partition   Accuracy    Sensitivity   Specificity
SVM     Training    100.00%     100.00%       100.00%
        Test        100.00%     100.00%       100.00%
C5.0    Training    98.95%      98.77%        99.05%
        Test        100.00%     100.00%       100.00%

Figure 6: Relative attribute importance of the two models; (a) C5.0 DT and (b) RBF-SVM.
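The kind of relative attribute importance plotted in Figure 6 can be approximated generically with permutation importance; the snippet below assumes the rbf_svm model and the test split from the sketch in Section 4 are in scope, and is not the procedure used by Clementine.

```python
from sklearn.inspection import permutation_importance

# rbf_svm, X_test and y_test come from the training sketch in Section 4.
result = permutation_importance(rbf_svm, X_test, y_test,
                                n_repeats=20, random_state=0)
for name, score in sorted(zip(X_test.columns, result.importances_mean),
                          key=lambda item: -item[1]):
    print(f"{name:30s} {score:.3f}")
```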


Figure 7: The cumulative gains charts of the two models for the training and test subsets (the curves almost coincide).

5. Conclusions
DT modeling algorithms allow the development of classification systems that predict or classify future samples based on a set of decision rules. Besides their efficiency, they are popular for two additional reasons: they have the ability to pick and choose among the predictive attributes, and they are easy to interpret. However, patterns found by any data mining algorithm should be evaluated by medical professionals. Data mining does not aim to replace medical professionals and researchers, but to support them when taking decisions about their patients. The main contribution of this study is the application of the C5.0 DT model to the diagnosis of breast cancer. To increase the generalization of the tree structure, the model is trained using boosting, winnowing and pruning. This study has examined the application of boosted C5.0 decision trees and the radial basis function kernel support vector machine in actual clinical diagnosis of breast cancer. The dataset, obtained from the Wisconsin University Hospitals, contains cytologically proven tumor data used to train the models to categorize cancer patients according to their diagnosis. The dataset was partitioned into training and test subsets by the ratio 70:30% respectively. Experimental results show the effectiveness of both models. RBF-SVM identified a set of eight attributes that are sufficient to achieve 100% classification accuracy on both the training and test subsets, while C5.0 identified only five attributes and achieved 98.95% accuracy on the training subset and 100% on the test subset. Both the RBF-SVM and the boosted C5.0 DT models can be effectively used for breast cancer diagnosis to help physicians and oncologists; moreover, the C5.0 DT uses fewer attributes than the RBF-SVM to predict the required class labels.

6. Acknowledgements
I wish to express my gratitude to Eng. Mohamed Zohdi from the SPSS Egypt office for the evaluation copy of the Predictive Analytics Software (PASW, formerly SPSS) statistical portfolio.

References
[1] ftp://ftp.ics.uci.edu/pub/machine-learning-databases (last accessed: Jan. 2010).
[2] http://www.spss.com (last accessed: Jan. 2010).
[3] J. Calle. Breast cancer facts and figures 2003-2004. American Cancer Society, 2004. http://www.cancer.org/ (last accessed: Jan. 2010).
[4] Breast cancer Q&A/facts and statistics. http://www.komen.org/bci/bhealth/QA/q_and_a.asp (last accessed: Jan. 2010).
[5] A. U. Buzdar and R. S. Freedman. Breast Cancer. 2nd edition, Springer Science and Business Media, 2008.
[6] M. Karabatak and M. Cevdet. An expert system for detection of breast cancer based on association rules and neural network. Expert Systems with Applications, 36:3465-3469, 2009.
[7] B. Kovalerchuc, E. Triantaphyllou, J. F. Ruiz and J. Clayton. Fuzzy logic in computer-aided breast cancer diagnosis: Analysis of lobulation. Artificial Intelligence in Medicine, 11:75-85, 1997.
[8] P. C. Pendharkar, J. A. Rodger, G. J. Yaverbaum, N. Herman and M. Benner. Associations statistical, mathematical and neural approaches for mining breast cancer patterns. Expert Systems with Applications, 17:223-232, 1999.
[9] T. Kiyan and T. Yildirim. Breast cancer diagnosis using statistical neural networks. Istanbul University Journal of Electrical & Electronics Engineering, 4(2):1149-1153, 2004.
[10] J. A. Michael and S. L. Gordon. Data mining technique: For marketing, sales and customer support. Wiley, New York, 1997.
[11] J. W. Han and M. Kamber. Data mining concepts and techniques. 2nd edition, Morgan Kaufmann Publishers, San Francisco, CA, 2006.
[12] R. Nisbet, J. Elder and G. Miner. Handbook of statistical analysis and data mining applications. Academic Press, Burlington, MA, 2009.
[13] J. Quinlan. Induction of decision trees. Machine Learning, 1:81-106, 1986.
[14] J. Quinlan. C4.5: Programs for machine learning. Morgan Kaufmann, San Francisco, CA, 1993.
[15] L. Breiman, J. H. Friedman, R. A. Olshen and C. J. Stone. Classification and regression trees. Wadsworth & Brooks/Cole Advanced Books & Software, Monterey, CA, 1984.
[16] http://www.rulequest.com/see5-info.html (last accessed: Jan. 2010).
[17] W. Y. Loh and Y. S. Shih. Split selection methods for classification trees. Statistica Sinica, 7:815-840, 1997.
[18] S. Faderl, M. J. Keating, K. A. Do, S. Y. Liang, H. M. Kantarjian, S. O'Brien, et al. Expression profile of 11 proteins and their prognostic significance in patients with chronic lymphocytic leukemia (CLL). Leukemia, 16:1045-1052, 2002.
[19] L. A. Breslow and D. W. Aha. Simplifying decision trees: a survey. Knowledge Engineering Review, 12(1):1-40, 1997.
[20] S. K. Murthy. Automatic construction of decision trees from data: a multi-disciplinary survey. Data Mining and Knowledge Discovery, 2(4):345-389, 1998.
[21] D. P. Berrar, B. Sturgeon, I. Bradbury and W. Dubitzky. Microarray data integration and machine learning techniques for lung cancer survival prediction. http://www.camda.duke.edu/camda03/papers/days/friday/berrar/paper.pdf (last accessed: Jan. 2010).
[22] S. Dudoit, J. Fridlyand and T. P. Speed. Comparison of discrimination methods for the classification of tumors using gene expression data. Technical Report 576, Department of Statistics, University of California at Berkeley, Berkeley, CA, 2000.
[23] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20(2):273-297, 1995.
[24] V. N. Vapnik. Statistical Learning Theory. John Wiley & Sons, New York, 1998.
[25] H. Frohlich and A. Zell. Efficient parameter selection for support vector machines. IEEE International Joint Conference on Neural Networks, 3:1431-1436, 2005.
[26] F. Friedrichs and C. Igel. Trends in Neurocomputing. The 12th European Symposium on Artificial Neural Networks, 64:107-117, 2005.
[27] R. O. Duda, P. E. Hart and D. G. Stork. Pattern classification. John Wiley & Sons, New York, 2001.
[28] N. Cristianini and J. S. Taylor. An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, London, 2000.
[29] J. C. Principe, N. R. Euliano and W. C. Lefebvre. Neural and adaptive systems. Wiley, New York, 2000.
[30] T. S. Subashini, V. Ramalingam and S. Palanivel. Breast mass classification based on cytological patterns using RBFNN and SVM. Expert Systems with Applications, 36:5284-5290, 2009.
[31] M. Karabatak and M. Cevdet. An expert system for detection of breast cancer based on association rules and neural network. Expert Systems with Applications, 36:3465-3469, 2009.
[32] K. Polat and S. Güne. Breast cancer diagnosis using least square support vector machine. Digital Signal Processing, 17:694-701, 2007.
[33] P. Luukka. Classification based on fuzzy robust PCA algorithms and similarity classifier. Expert Systems with Applications, 36:7463-7468, 2009.
[34] J. R. Quinlan. Improved use of continuous attributes in C4.5. Journal of Artificial Intelligence Research, 4:77-90, 1996.


Biography
Alaa M. Elsayad received a PhD degree in image processing from Cairo University in 1998. He completed his postdoctoral research on telemedicine systems and image analysis in the Imaging Sciences and Information System (ISIS) unit at the Medical Center of Georgetown University. He currently works as an associate professor of computer science at the Electronics Research Institute, Cairo, Egypt, where he heads the information and decision support center. Before working in different universities in the Arab region, he gathered experience in information technology while working in several governmental organizations in Egypt, including the Ministry of Administrative Development, the Ministry of Communication and the Ministry of International Cooperation. Dr. Alaa has published several papers, all focusing on image analysis and data mining. His current research interests cover the theory of data mining algorithms and their applications to image understanding and signal processing; special research topics include medical image analysis and telemedicine systems.


A Constraint Based Approach To Air Traffic Control

Y. Mechqrane¹, R. Ezzahir¹, C. Bessiere², E.H. Bouyakhf¹
¹ LIMIARF FSR, University of Mohammed V Agdal, Morocco
² LIRMM/CNRS, University of Montpellier, France
[email protected], [email protected], [email protected], [email protected]

Abstract
During flight, a conflict between two aircraft occurs if the two aircraft are closer than a given safety distance. In this paper we propose a constraint-based approach to solve conflicts between aircraft in flight. Each pair of aircraft in the controlled airspace is related by a separation constraint which specifies that, at any moment, the two aircraft must be separated by a minimum distance guaranteeing safety; the issue is to determine the maneuvers to be implemented by the pilots such that the separation constraints are satisfied. To this end, we discretized time and used discrete variables and constraints. We formulated the separation constraints so that the number of tests required to check a constraint is independent of the time step. We also identified some useful properties of these constraints and used them to infer infeasible values during the search. Moreover, the filtering algorithm we used suspends forward checks until they are required by the search and avoids searching large domains for consistent values until it has to. These techniques save search effort and reduce the computational times. In addition, to minimize delays and additional fuel consumption, "least-aggressive" maneuvers are tested first. Our approach makes it possible to solve difficult Air Traffic Control situations in a few seconds.
Keywords: Air traffic control, conflict resolution, constraint satisfaction problem, constraint propagation.

1 Introduction
In Europe, the overall system currently implemented to ensure the safety and efficiency of air traffic can be conceptually divided into several layers of filters with decreasing time horizons:

∙ Strategic (more than six months), Air Space Management (ASM): design of routes, sectors and procedures.

∙ Tactical (a few days to a few hours), Air Traffic Flow Management (ATFM): the opening schedules of Air Traffic Control centres define hourly capacities for each open sector. To respect these capacity constraints, the Central Flow Management Unit (CFMU) computes and updates flow regulations and reroutings according to the posted flight plans and the resulting workload excess.

∙ Technical/Real Time (5/20 min), Air Traffic Control (ATC): surveillance, coordination with adjacent centres, conflict detection and resolution by various simple maneuvers transmitted to the pilots. A conflict occurs in flight if the distance between two aircraft is less than a minimum distance guaranteeing safety.

∙ Emergency (less than 5 min), safety nets: recourse to the emergency filter occurs only when there is no air traffic control or when the control system fails. This filter must resolve unforeseeable conflicts such as, for example, an aircraft exceeding a flight level given by a controller, or a technical malfunction which would considerably degrade aircraft performance.

We therefore tend to reduce the problem of air traffic control to that of conflict detection and resolution at the technical filter level, and therefore to the following problem: knowing the positions of the aircraft at any given time and their future positions (to a fixed degree of accuracy) during an anticipation period T (5 to 20 min), determine the maneuvers to be implemented by the pilots so that the trajectories of the aircraft do not generate conflicts and delay is minimized.

Nowadays, humans are an essential element in the process of resolving conflicts, thanks to their ability to make judgments. However, with the growth of airspace congestion, human controllers will no longer be able to solve conflicts on their own, which calls for automation tools.

In this paper, we develop a conflict resolution algorithm for an ATC ground-based centralized system. This algorithm deals with situations involving multiple aircraft. Although there are many contributions in the literature for the two-aircraft case, both in deterministic and probabilistic settings, treatments of the multiple-aircraft case are relatively rare and unsystematic. A survey of various approaches to air traffic conflict resolution is given in [7] and [3]. Among them we mention the following approaches. Force field approaches model each aircraft as a charged particle and use modified electrostatic equations to determine resolution maneuvers; the repulsive forces between aircraft define the maneuver each aircraft performs to avoid a collision [15], [6]. Optimized conflict resolution can involve a rule-based decision [13] or determining which of several avoidance options minimizes a given cost function. The Traffic Alert and Collision Avoidance System (TCAS), for example, searches through a set of potential climb or descend maneuvers and chooses the least-aggressive maneuver that still provides adequate protection. Algorithms for resolving three-dimensional conflicts involving multiple aircraft are presented in [10]; these algorithms are based on trajectory optimization methods and provide resolution actions that minimize a certain cost function. A resolution by Integer Linear Programming was proposed in [8]; however, the model does not take into account returning back to the main trajectory. In [11], [2] the authors describe mathematical programming models using heuristic methods based respectively on genetic algorithms and ant colony optimization. To our knowledge, only [12] has previously proposed an approach using the Constraint Satisfaction Problem (CSP) formalism to solve conflicts between aircraft in flight. The authors of [12] proposed two models: the first is only for emergency situations and is not suitable for common traffic control since it does not take into account returning back to the main trajectory; the second is more realistic, but its effectiveness is limited because only conflicts involving 2 or 3 aircraft could be solved. Our approach, which is also based on the CSP formalism, makes it possible to solve conflicts involving up to 19 aircraft in a few seconds. We have discretized time and used discrete variables and constraints. We have also identified some useful properties and used them to improve the efficiency of the search algorithm. In addition, the search algorithm is equipped with an efficient filtering algorithm which suspends forward checks until they are required by the search. These techniques save search effort and make it possible to solve difficult Air Traffic Control situations in a few seconds. On the other hand, to minimize delays and additional fuel consumption, "least-aggressive" maneuvers are tested first.

The remainder of this paper is organized as follows. We first recall some definitions in Section 2. In Section 3 we present a CSP model for the conflict resolution problem. We then describe some useful properties of the separation constraints in Section 4. Section 5 focuses on the search algorithm. Experimental results are given in Section 6. In Section 7, we show how our approach can be extended to handle some additional features of the problem. We conclude with Section 8.

2 Definitions and preliminaries

Constraint Satisfaction Problems (CSPs) occur widely in artificial intelligence. They involve finding values for problem variables subject to constraints which restrict the acceptable combinations. In this section, we briefly introduce some notations and definitions used hereafter.

Definition 1 A Binary Constraint Network is defined by:
∙ a finite set X = {X1, ..., Xn} of n variables such that each variable Xi has an associated domain D(Xi) denoting the set of values allowed for Xi;
∙ a finite set C = {cij} of binary constraints. A constraint cij relates the variables Xi and Xj and specifies the allowed pairs of values of these variables. It can be defined by a Boolean function on D(Xi) × D(Xj).

A solution to a constraint network is an assignment of values to all the variables such that all the constraints are satisfied. A constraint network is said to be satisfiable if it admits at least one solution. The Constraint Satisfaction Problem (CSP) is the task of determining whether or not a given constraint network, also called a CSP instance, is satisfiable. To solve a CSP instance, a depth-first search algorithm with backtracking can be applied, where each variable assignment made during the search is followed by a process called constraint propagation. This process consists of calling the filtering algorithms associated with the constraints. A filtering algorithm associated with a constraint c is an algorithm which may remove some values that are inconsistent with c, and never removes any consistent values.
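A minimal sketch (not the authors' implementation) of the depth-first backtracking search described above; for brevity, constraint propagation is reduced to checking each tentative assignment against the already-assigned variables, whereas a filtering algorithm would additionally prune future domains.

```python
def consistent(var, value, assignment, constraints):
    """Check `var = value` against all already-assigned variables."""
    for other, other_val in assignment.items():
        c = constraints.get((var, other)) or constraints.get((other, var))
        if c is None:
            continue
        # Constraints are stored with a fixed argument order (xi, xj).
        ok = c(value, other_val) if (var, other) in constraints else c(other_val, value)
        if not ok:
            return False
    return True

def backtrack(domains, constraints, assignment=None):
    """Depth-first search with backtracking; returns a solution dict or None."""
    assignment = dict(assignment or {})
    if len(assignment) == len(domains):
        return assignment
    var = next(v for v in domains if v not in assignment)
    for value in domains[var]:
        if consistent(var, value, assignment, constraints):
            result = backtrack(domains, constraints, {**assignment, var: value})
            if result is not None:
                return result
    return None

# Toy instance: X1, X2 in {1, 2, 3} with the single constraint X1 < X2.
domains = {"X1": [1, 2, 3], "X2": [1, 2, 3]}
constraints = {("X1", "X2"): lambda a, b: a < b}
print(backtrack(domains, constraints))  # e.g. {'X1': 1, 'X2': 2}
```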

3 Modelling

The Conflict Resolution Problem can be modelled as a binary CSP. In this section, we first present the hypotheses needed to formulate the problem, then identify the unknowns (variables) and their associated domains, and finally formulate the constraints.

We consider a finite number n of aircraft flying in the same horizontal plane. Each aircraft is characterized by its initial position, its speed and its direction; the speed of each aircraft is constant. An aircraft Ai (the i-th aircraft) is represented by a point in the plane. The trajectory prediction is done for an anticipation period T, and the beginning of the prediction coincides with the origin of time. The issue of the conflict resolution problem is to predict whether a conflict will occur between two or more aircraft during the anticipation period T and, in such a case, to determine the maneuvers to be implemented by the pilots to avoid conflicts and minimize delays. A conflict between two aircraft occurs if the two aircraft are closer than a given safety distance SD; current en-route air traffic control rules often consider this distance to be 5 nautical miles.

We consider that each aircraft flies a direct route from its point of departure to its point of arrival. To avoid a conflict, an aircraft is authorised to deviate from its initial route, but once the conflict is avoided, the aircraft must head back to its original direction by taking a heading that makes an angle with its original direction equal to the performed deviation.¹ We denote by B the time when the aircraft leaves its initial route, by α the deviation angle, and by Δ the duration between the time the aircraft leaves its initial route and the time it begins heading back to it (Figure 1). We consider that Δ, B and α are discrete variables. To define the domains of the variables B and Δ, we divide the anticipation period T into L intervals of fixed length τ (i.e. T = Lτ). Consequently, B can range over {0, ..., L} and Δ can range over {1, ..., L}. To give simple orders to pilots, deviation angles are also discretized and range over {−30, −20, −10, 10, 20, 30} degrees, which are realistic ATC orders.

¹ To account for standard routes, we would simply direct the aircraft to its next waypoint once the conflict is avoided.

The maneuvers that an aircraft may perform correspond to the different combinations {0, ..., L} × {−30, −20, −10, 10, 20, 30} × {1, ..., L} that the vector (B, α, Δ) can take. An aircraft may also persist on its initial route without performing any maneuver; we denote by the vector (0, 0, L) the maneuver persist.

To model the conflict resolution problem, we associate with each aircraft Ai a variable vector Xi of three elements that defines the maneuver mi = (B, α, Δ) which aircraft Ai will perform. The domain of the variable Xi is given by the set:

D(Xi) = {persist} ∪ ({0, ..., L} × {−30, −20, −10, 10, 20, 30} × {1, ..., L}).

To formulate the constraints, we first focus our attention on the case of two aircraft Ai and Aj. We will determine a sufficient condition which guarantees that, at any time, the future positions of Ai and Aj are separated by a distance greater than SD. To this end, instead of reasoning about positions at each time step, we consider straight pieces of the trajectories. Indeed, when we consider an aircraft Ai, we have five times 0, Bi, (Bi + Δi), (Bi + 2Δi) and L which divide the anticipation period into at most four intervals. During each interval, the trajectory of Ai is a line segment: during the first interval, the aircraft remains on its original trajectory; during the second, the aircraft is deviated by an angle α; during the third, the aircraft heads back to its original trajectory; and during the last one, the aircraft rejoins its original trajectory and continues its flight on it (Figure 1).

Figure 1: Trajectory avoidance model

Therefore, when we consider two aircraft Ai and Aj, we have 8 times 0, L, Bi, (Bi + Δi), (Bi + 2Δi), Bj, (Bj + Δj) and (Bj + 2Δj) which divide the anticipation period into at most 7 time intervals Ik. During an interval Ik, both trajectories of the aircraft Ai and Aj are line segments. Each interval Ik is defined by two times tk and tk+1 (i.e. Ik = [tk, tk+1]) with k ∈ {0, 1, ..., p}, p < 7. To calculate the times tk, we sort the set {0, L, Bi, (Bi + Δi), (Bi + 2Δi), Bj, (Bj + Δj), (Bj + 2Δj)} in strictly increasing order and eliminate the times greater than L; the time tk is then the k-th smallest time in this sorted set. For example, Figure 2 shows the trajectories of two aircraft Ai and Aj respectively performing the maneuvers mi = (Bi = 2, αi = 30°, Δi = 3) and mj = (Bj = 3, αj = 30°, Δj = 4). In this scenario we have t0 = 0, t1 = Bi = 2, t2 = Bj = 3, t3 = Bi + Δi = 5, t4 = Bj + Δj = 7, t5 = Bi + 2Δi = 8, t6 = Bj + 2Δj = 11 and t7 = L = 13. As shown in the figure, during each interval [tk, tk+1], both trajectories of the aircraft Ai and Aj are line segments.
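A small sketch of this breakpoint computation, reproducing the example above (the time unit τ is taken as 1, so B and Δ are used directly as times):

```python
def interval_breakpoints(m_i, m_j, L):
    """Times t_k that cut [0, L] into intervals on which both trajectories are straight.

    A maneuver is a tuple (B, alpha, delta); the candidate breakpoints are
    {0, L, B, B + delta, B + 2*delta} for each aircraft, with times beyond L dropped.
    """
    B_i, _, d_i = m_i
    B_j, _, d_j = m_j
    times = {0, L, B_i, B_i + d_i, B_i + 2 * d_i, B_j, B_j + d_j, B_j + 2 * d_j}
    return sorted(t for t in times if t <= L)

# Example from the text: m_i = (2, 30, 3), m_j = (3, 30, 4), L = 13.
print(interval_breakpoints((2, 30, 3), (3, 30, 4), 13))
# [0, 2, 3, 5, 7, 8, 11, 13]
```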

Figure 2: Trajectories are divided into straight pieces

To be separated during the whole anticipation period T by a distance strictly greater than the safety distance SD, it is sufficient that the minimum distance between Ai and Aj in each interval Ik be strictly greater than SD. We now show how to compute the minimum distance between two aircraft flying on straight paths during a time interval. Let Ai and Aj be two aircraft flying on straight paths with constant speed during a time interval Ik = [tk, tk+1]. We call dmin the minimum distance that will separate Ai and Aj during Ik. Let pi(t) and vi(t) be the position and the speed of the aircraft Ai at the instant t. We denote by pij(t) and vij(t) the relative position and the relative speed of Ai and Aj at time t: pij(t) = pj(t) − pi(t) and vij(t) = vj(t) − vi(t). Since we assume that speeds are constant, the relative speed is constant and is noted simply vij. We note u.v and u ∧ v the scalar and vector products of two vectors u and v, and ||u|| the norm of the vector u.

If vij = 0, then the distance between the aircraft Ai and Aj is constant during the entire interval [tk, tk+1]. Suppose now that vij ≠ 0; dmin depends on the scalar product pij(t).vij.

When pij(tk).vij ≥ 0 (Figure 3), the aircraft Ai and Aj are moving away from one another, and dmin is equal to the norm of the relative position at time tk (dmin = ||pij(tk)||).

Figure 3: Aircraft are moving away from one another

When pij(tk).vij < 0 (Figure 4), the aircraft Ai and Aj are moving closer to one another. If pij(tk+1).vij ≥ 0, the aircraft Ai and Aj will cross during the interval [tk, tk+1] and the minimum distance that will separate them is equal to ||pij(tk) ∧ vij|| / ||vij||.

Figure 4: Aircraft will cross during the interval [tk, tk+1]

Otherwise (i.e. pij(tk).vij < 0 and pij(tk+1).vij < 0), the aircraft Ai and Aj are moving closer to one another during the entire interval [tk, tk+1]. Thus, the minimum distance that will separate them is equal to the norm of the relative position at time tk+1 (dmin = ||pij(tk+1)||).

More formally, we define a function named dmin which returns the minimum distance between two aircraft Ai and Aj during an interval [tk, tk+1]. We also define a parameter φi = (vi(0), pi(0), di), where vi(0), pi(0), di are the initial speed, position and direction of aircraft Ai. dmin is defined as follows:

dmin(φi, φj, [tk, tk+1]) =
∙ ||pij(tk)|| if pij(tk).vij ≥ 0;
∙ ||pij(tk) ∧ vij|| / ||vij|| if pij(tk).vij < 0 and pij(tk+1).vij ≥ 0;
∙ ||pij(tk+1)|| if pij(tk).vij < 0 and pij(tk+1).vij < 0.
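A sketch of dmin in the plane, passing the positions at tk and the constant velocities directly instead of the φ parameters (in 2-D the vector product reduces to a scalar):

```python
import math

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def cross(u, v):   # 2-D vector product (a scalar)
    return u[0] * v[1] - u[1] * v[0]

def norm(u):
    return math.hypot(u[0], u[1])

def d_min(p_i, v_i, p_j, v_j, t_k, t_k1):
    """Minimum distance between two aircraft flying straight at constant speed on
    [t_k, t_k1]; p_i, v_i are the position at t_k and the (constant) velocity."""
    p = (p_j[0] - p_i[0], p_j[1] - p_i[1])         # relative position at t_k
    v = (v_j[0] - v_i[0], v_j[1] - v_i[1])         # relative (constant) velocity
    dt = t_k1 - t_k
    p_end = (p[0] + v[0] * dt, p[1] + v[1] * dt)   # relative position at t_k1
    if dot(p, v) >= 0:                             # moving apart (or v = 0)
        return norm(p)
    if dot(p_end, v) >= 0:                         # the aircraft cross in the interval
        return abs(cross(p, v)) / norm(v)
    return norm(p_end)                             # still closing at t_k1
```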

Therefore, we define a Boolean function named conflict which returns true if a conflict occurs between two aircraft Ai and Aj, respectively performing maneuvers mi and mj, during the anticipation period T, and false otherwise:

conflict((φi, mi), (φj, mj)) = false if ∀k ∈ {0, 1, ..., (h − 1)}: dmin(φi, φj, [tk, tk+1]) > SD; true otherwise,

where h is the number of intervals Ik; h is always less than or equal to 7. Note that the number of checks performed by the function conflict is independent of the time step; it depends only on h.

We can now state our model as follows. We associate with each aircraft Ai a variable vector Xi of three elements that defines the maneuver mi = (B, α, Δ) which aircraft Ai will perform. The domain of the variable Xi is given by the set D(Xi) = {persist} ∪ ({0, ..., L} × {−30, −20, −10, 10, 20, 30} × {1, ..., L}). The constraints specify that the future trajectories of each pair of aircraft must be separated during the whole anticipation period T. Thus, if n is the number of aircraft in the controlled airspace, there are n(n − 1)/2 binary constraints. A separation constraint cij relates two variables Xi and Xj and is defined by the following Boolean function:

cij: ¬conflict((φi, Xi), (φj, Xj)).
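A sketch of the conflict test and of the separation constraint it induces. It assumes the d_min and interval_breakpoints helpers from the sketches above are in scope, takes the time unit τ = 1, and represents φ as an initial position plus a constant velocity vector (the heading is folded into the velocity):

```python
import math

SAFETY_DISTANCE = 5.0   # nautical miles, the usual en-route separation

def rotate(v, angle_deg):
    a = math.radians(angle_deg)
    return (v[0] * math.cos(a) - v[1] * math.sin(a),
            v[0] * math.sin(a) + v[1] * math.cos(a))

def state_at(phi, m, t):
    """Position and velocity at time t of an aircraft with initial state
    phi = (p0, v0) performing maneuver m = (B, alpha, delta)."""
    p0, v0 = phi
    B, alpha, delta = m
    legs = [(B, v0), (delta, rotate(v0, alpha)),
            (delta, rotate(v0, -alpha)), (math.inf, v0)]
    p = p0
    for duration, v in legs:
        if t < duration or duration == math.inf:
            return (p[0] + v[0] * t, p[1] + v[1] * t), v
        p = (p[0] + v[0] * duration, p[1] + v[1] * duration)
        t -= duration
    return p, v0

def conflict(phi_i, m_i, phi_j, m_j, L, S_D=SAFETY_DISTANCE):
    """True iff the two maneuvered trajectories come within S_D during [0, L]."""
    t = interval_breakpoints(m_i, m_j, L)
    for t_k, t_k1 in zip(t, t[1:]):
        p_i, v_i = state_at(phi_i, m_i, t_k)
        p_j, v_j = state_at(phi_j, m_j, t_k)
        if d_min(p_i, v_i, p_j, v_j, t_k, t_k1) <= S_D:
            return True
    return False

# The separation constraint between aircraft i and j is simply the negation:
# c_ij(X_i, X_j) := not conflict(phi_i, X_i, phi_j, X_j, L)
```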

4

proof 1 Both maneuvers 𝑚𝑗 and 𝑤𝑗 start after the conflict date. Therefore, the position of the aircraft 𝐴𝑗 at the conflict date remains the same if we substitute the maneuver 𝑚𝑗 for 𝑤𝑗 (figure. 5.a). Proposition 2 Let (𝑋𝑖 = 𝑚𝑖 , 𝑋𝑗 = 𝑚𝑗 = (𝑏, 𝛼, 𝛿)) an assignment that violates 𝑐𝑖𝑗 and 𝑡𝑐 the conflict date. If (𝑏 < 𝑡𝑐 < 𝑏 + 𝛿) then ∀𝑤𝑗 ∈ 𝑊 (𝑚𝑖 , 𝑚𝑗 , 𝑐𝑖𝑗 ) = {(𝑏, 𝛼, 𝛿 ′ ) ∈ 𝐷𝑗 𝑠𝑢𝑐ℎ 𝑡ℎ𝑎𝑡 𝑏 + 𝛿 ′ > 𝑡𝑐 } the assignment (𝑋𝑖 = 𝑚𝑖 , 𝑋𝑗 = 𝑤𝑗 ) violates 𝑐𝑖𝑗 . proof 2 Whatever the maneuver performed by 𝐴𝑗 ( 𝑚𝑗 or 𝑤𝑗 ), 𝐴𝑗 will be deviated by the same angle 𝛼 at the same time 𝑏, and it will start to return to its original trajectory after the conflict date ((𝑏 + 𝛿 > 𝑡𝑐 ) and (𝑏 + 𝛿 ′ > 𝑡𝑐 )). Then, the position of 𝐴𝑗 at the conflict date will not change if we substitute the maneuver 𝑚𝑗 for 𝑤𝑗 (figure. 5.b).

Inferring infeasible values

We will now show that the model we presented above permit to identify some useful properties which we can exploit to reduce the search space. To present these properties, we need to introduce the following definitions:

Proposition 3 Let (𝑋𝑖 = 𝑚𝑖 , 𝑚𝑗 = (𝑏, 𝛼, 𝛿)) an assignment that violates 𝑐𝑖𝑗 and 𝑡𝑐 the conflict date. If (𝑡𝑐 > 𝑏 + 2𝛿) then ∀𝑤𝑗 ∈ 𝑊 (𝑚𝑖 , 𝑚𝑗 , 𝑐𝑖𝑗 ) = {(𝑏′ , 𝛼, 𝛿) ∈ 𝐷𝑗 𝑠𝑢𝑐ℎ 𝑡ℎ𝑎𝑡 𝑏′ + 2𝛿 < 𝑡𝑐 } the assignment (𝑋𝑖 = 𝑚𝑖 , 𝑋𝑗 = 𝑤𝑗 ) violates 𝑐𝑖𝑗 .

Definition 2 let 𝐴𝑖 and 𝐴𝑗 be two aircraft in the controlled airspace. If a conflict will occur between the aircraft 𝐴𝑖 and 𝐴𝑗 , then we call the conflict date the instant 𝑡𝑐 such that the distance between the two aircraft at 𝑡𝑐 is less or equal to the safety distance 𝑆𝐷 and ∀𝑡′ ∈ [0, 𝑡𝑐 [, the distance between the two aircraft is strictly greater than the safety distance.

proof 3 Both maneuvers 𝑚𝑗 and 𝑤𝑗 generate the same delay and will finish before the conflict date. Therefore, the position of the aircraft 𝐴𝑗 at the conflict date remains the same if we substitute the maneuver 𝑚𝑗 for 𝑤𝑗 (figure. 5.c).

Definition 3 let 𝑋𝑖 and 𝑋𝑗 be two variables linked by a constraint 𝑐𝑖𝑗 . Given a value 𝑚𝑖 ∈ 𝐷(𝑋𝑖 ) and a value 𝑚𝑗 ∈ 𝐷(𝑋𝑗 ), a value 𝑤𝑗 ∈ 𝐷(𝑋𝑗 ) is dominated by 𝑚𝑗 with respect to the constraint 𝑐𝑖𝑗 and 𝑚𝑖 if : (𝑋𝑖 = 𝑚𝑖 , 𝑋𝑗 = 𝑚𝑗 ) violates 𝑐𝑖𝑗 ⇒ (𝑋𝑖 = 𝑚𝑖 , 𝑋𝑗 = 𝑤𝑗 ) violates 𝑐𝑖𝑗 . Now, consider a separation constraint 𝑐𝑖𝑗 and suppose that 𝑋𝑖 is currently instantiated to a value 𝑚𝑖 . When we fail to instantiate 𝑋𝑗 with a value 𝑚𝑗 because 𝑐𝑖𝑗 is violated, we can identify a set 𝑊 (𝑚𝑖 , 𝑚𝑗 , 𝑐𝑖𝑗 ) = {𝑤𝑗 ∈ 𝐷𝑗 } of maneuvers in 𝐷𝑗 such that the assignment (𝑋𝑖 = 𝑚𝑖 , 𝑋𝑗 = 𝑤𝑗 ) will necessarily violate 𝑐𝑖𝑗 . The maneuvers in 𝑊 (𝑚𝑖 , 𝑚𝑗 , 𝑐𝑖𝑗 )

Figure 5: dominated maneuvers
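As a complement to Propositions 1-3 (and to Figure 5), here is a minimal, illustrative Python sketch of how the dominated set W(mi, mj, cij) could be collected once the conflict date tc of the failed assignment is known. Maneuvers are assumed to be encoded as (b, alpha, delta) tuples, with 'persist' as a separate sentinel value; the function name and encoding are ours, not the paper's.

def dominated_set(m_j, t_c, domain_j):
    """Values of D_j that Propositions 1-3 guarantee to violate c_ij,
    given that (m_i, m_j) violates it with conflict date t_c."""
    if m_j == 'persist':
        return set()                      # the propositions assume a deviation maneuver
    b, alpha, delta = m_j
    dominated = set()
    for w in domain_j:
        if w == 'persist':
            continue                      # the persist maneuver is never pruned here
        b2, alpha2, delta2 = w
        if t_c < b and b2 > t_c:
            dominated.add(w)              # Proposition 1: both maneuvers start after t_c
        elif b < t_c < b + delta and (b2, alpha2) == (b, alpha) and b2 + delta2 > t_c:
            dominated.add(w)              # Proposition 2: same start and angle, returns after t_c
        elif t_c > b + 2 * delta and (alpha2, delta2) == (alpha, delta) and b2 + 2 * delta2 < t_c:
            dominated.add(w)              # Proposition 3: same angle and delay, finished before t_c
    return dominated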


5 The search algorithm

The resolution method we propose is a depth-first search algorithm with backtracking, equipped with an appropriate filtering algorithm and suitable search heuristics. Since we manipulate variables with large domains, systematic domain filtering may become costly. This observation has already been made about forward checking (FC) [4] in [16] and largely clarified in [1]. Therefore, the filtering algorithm we associate with each separation constraint cij is based on Minimal Forward Checking (MFC) [1], which is a lazy version of FC. MFC is based on the observation that FC attempts to instantiate a new variable only when there is at least one value in each future domain that is consistent with all the variables that have already been instantiated. MFC finds and maintains one consistent value in every future domain, "suspending" forward checks until they are required by the search. In this way, MFC avoids searching (possibly large) domains for consistent values unless it has to, which can yield a considerable gain in computational time. The worst-case complexity is the same as for the usual forward checking, but the average-case behaviour can be far better, especially when the domains are large. Our filtering algorithm, which we call Strong Minimal Forward Checking (SMFC), also finds and maintains one value in every future domain. But, as we show below, SMFC is stronger than MFC since it prunes more values than MFC.

Our implementation is based on CHOCO [9], a library implementing basic primitives for constraint programming: domain management, constraint propagation, as well as global and local search procedures. From an operational point of view, the CHOCO solver can be described in terms of events and agents: during propagation, each time a variable domain is shrunk, an event is generated and the constraints connected to this variable are woken (react to the event). In turn, these constraints may generate new events (filtering out some values from domains). CHOCO manages four types of events:
∙ INCINF: the domain lower bound of some variable is increased;
∙ DECSUP: the domain upper bound of some variable is decreased;
∙ INSTANTIATE: the domain of some variable is reduced to a single value;
∙ REMOVAL: a value is removed from the domain of some variable.

Controlling propagation and filtering in CHOCO consists in creating subclasses of variables and constraints and redefining their propagation management functions (the awakeOn... functions). Algorithm 1 presents the SMFC for a separation constraint cij as we have implemented it in the CHOCO solver.

Algorithm 1: Strong Minimal Forward Checking for a separation constraint cij

function awakeOnInst(in index; out events):
begin
    i ← index;
    j ← the index of the other variable involved in this constraint;
    filter(j);
end

function awakeOnInf(in index; out events):
begin
    i ← index;
    j ← the index of the other variable involved in this constraint;
    if (Xj is instantiated) then filter(i);
end

function awakeOnSup(in index; out events):
begin nothing; end

function awakeOnRem(in index; out events):
begin nothing; end

function filter(in index; out events):
begin
    j ← index;
    i ← the index of the other variable involved in this constraint;
    LBj ← min(Dj);
    a ← the current value of Xi;
    found ← false;
    while (¬found) do
        found ← ¬(conflict((φi, a), (φj, LBj)));
        if ¬found then
            remove LBj from Dj;
            W(a, LBj, cij) ← the set of values in Dj dominated by LBj with respect to cij and the current value of Xi;
            remove W(a, LBj, cij) from Dj;
            if Dj = ∅ then exit("wipeout");
            LBj ← min(Dj);
    end
end

Each time a variable Xi is assigned (INSTANTIATE event), all the constraints cij connected to Xi are woken by calling their awakeOnInst(i) functions, and the domain of Xj, the other variable involved in each constraint cij, is filtered by means of the function filter(j). The function filter(j) searches through the current domain of Xj, attempting to find the first (i.e. the smallest) acceptable value. Testing LBj (the current lower bound of Xj) is done by calling the function conflict. When LBj is found to be inconsistent, it is removed (INCINF event), and all the values dominated by LBj are removed too. Therefore, the filtering level of our algorithm is larger than that of the original minimal forward checking. If the lower bound of Xj was updated, all the constraints ckj involving Xj are woken (by calling awakeOnInf(j)). When a function awakeOnInf(j) associated with a constraint ckj is called, and if Xk (the other variable connected to ckj) is already assigned, the function filter(j) is called in order to find the smallest value in the domain of Xj which is consistent with the value assigned to Xk. If the lower bound of Xj is modified once again, all the constraints that bind Xj are informed. The process terminates when the lower bound of Xj is consistent with all the variables already assigned, or when a wipe out occurs (the domain of Xj becomes empty). If, after assigning a variable Xi, the forward check is unsuccessful (the domain of a future variable becomes empty), the forward check previously done for the current value of Xi is undone and the current value of Xi is removed (i.e. its lower bound is updated, an INCINF event). This event is propagated until the next valid maneuver is found. If there are more values to choose from in the domain of Xi, the search can move forward again; otherwise a backtrack occurs.

The order in which the branches are visited is particularly interesting when looking for a solution satisfying a certain criterion. In our case, we try to find a solution that minimises the additional fuel consumption and the delays caused by the maneuvers. These two criteria depend primarily on the maneuver delay Δ and the deviation angle α. So, the domains are sorted as follows: the maneuver persist is the smallest maneuver; then, the maneuvers are considered in increasing order of Δ. If two maneuvers have the same duration Δ, we consider that the one with the lowest deviation is smaller. Therefore, the least aggressive maneuvers are considered first. To direct the search to the hard parts of the CSP and to exploit the filtering level of our algorithm, which is larger than that of the original minimal forward checking, we use the dom heuristic [4] to choose the next variable to instantiate: variables are ordered by increasing current domain size and the variable with the minimum domain size is selected.

6 Experimental results

In this section, we report the results obtained using the algorithm presented in the previous section. In the first group of case studies, we have considered common conflicting situations in ATC involving aircraft converging to the same waypoint. In the second group of case studies, we have considered randomly generated general cases. In all studied cases, we consider that the standard separation SD = 5 nm, that the anticipation period T = 12 min and that the time step τ = 12 s. In the following, results for both case studies are presented. We have used an Intel P4 machine with 1 GB of RAM under Linux (Ubuntu 6.06).

6.1 Symmetric scenarios

A common real conflicting situation in ATC involves aircraft converging to the same waypoint. To simulate these situations, authors often consider a number n of aircraft symmetrically distributed on a circle of radius R, flying at the same velocity v and heading towards the origin (see for example [8]). In the absence of maneuvers, all aircraft would have a conflict at the same time. Since the assumptions made by different authors differ, we choose to compare our results to [8], which is one of the most efficient approaches that have been proposed to address the conflict resolution problem. Pallottino in [8] formulated the conflict resolution problem as a mixed integer linear program where only heading angle changes are allowed, and solved the problem using CPLEX [5]. As in [8], we consider that R is equal to 60 nm and that v is equal to 246 m/s. In Table 1 we indicate by n the number of aircraft involved in the conflict and by TIME the computational time (in seconds) obtained by our solver. In the third column, we indicate by TIME* the computational times obtained in [8] to solve similar problems. It appears clearly that the computational times obtained by our solver are reasonable.
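Before presenting the results, here is a small illustrative sketch of how this benchmark geometry can be generated (n aircraft evenly spaced on a circle of radius R, all flying at speed v towards the origin). The dictionary layout and function name are assumptions of ours; R is in nautical miles and v in m/s, as in the text above.

import math

def symmetric_scenario(n, R=60.0, v=246.0):
    """Initial states for n aircraft converging on the origin, so that,
    without maneuvers, they would all be in conflict at the same time."""
    aircraft = []
    for i in range(n):
        theta = 2.0 * math.pi * i / n
        x, y = R * math.cos(theta), R * math.sin(theta)
        heading = math.atan2(-y, -x)      # pointing towards the origin
        aircraft.append({'position': (x, y), 'speed': v, 'heading': heading})
    return aircraft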


Table 1: Symmetric cases
n     TIME (s)   TIME* (s)
10    1.23       2.17
11    1.62       8.62
12    1.50       15.82
13    2.00       -
14    2.87       -
15    3.51       -
16    4.17       -
17    4.59       -
18    4.98       -
19    6.06       -

Figure 6: Part (a) shows the symmetric case of 19 aircraft; all aircraft would have a conflict at the same time. Part (b) plots the trajectories that the aircraft must follow to avoid conflicts. The solution was found by the conflict solver in 6 seconds.

6.2 Random scenarios

To simulate more general ATC events we have generated random scenarios. Each scenario is characterized by (#n, #xy, #sp, #h), where:
∙ n is the number of aircraft in the controlled airspace;
∙ xy is the vector of the initial positions of the aircraft in the horizontal plane (aircraft cruise at the same altitude); x[i] ∈ [−90 nm, 90 nm]; y[i] ∈ [−70 nm, 70 nm];
∙ sp is the vector of the speeds of the aircraft: sp[i] ∈ [250 m/s, 280 m/s];
∙ h is the vector of the initial directions.
Table 2 shows the performance averaged over 1000 instances for each n. In Table 2 we indicate by n the number of aircraft in the controlled airspace, by TIME the average computational time, and by CONF the average number of conflicts between the initial trajectories of the aircraft (one conflict involves two aircraft). We also indicate by MAX (resp. MIN) the maximal (resp. minimal) computational time for each n.

Table 2: Random scenarios
n     TIME (s)   MIN (s)   MAX (s)   CONF
10    0.17       0.03      58.39     1.75
11    0.12       0.03      2.12      1.80
12    0.15       0.03      1.76      2.20
13    0.26       0.04      72.65     2.67
14    0.26       0.04      34.86     2.99
15    0.34       0.04      45.79     3.45
16    0.37       0.04      48.67     4.00
17    0.36       0.04      45.70     4.53
18    0.45       0.06      60.09     5.20
19    0.59       0.07      56.09     5.65

Figure 7: Part (a) shows a random scenario of 19 aircraft; different conflicts will occur if no conflict avoidance maneuver is performed. Part (b) plots the trajectories that the aircraft must follow to avoid conflicts. The solution was found in 1 second.

As we have specified in the introduction, in this paper we study the conflict resolution problem at the technical filter level. In other words, we should resolve conflicts that take place in more than 5 min. Therefore, the computational times obtained are very reasonable, since the maximum times are far lower than 5 min. Hence, our resolution method can be used in a real-time context after suitable implementation studies.

7 Perspectives/Extensions

To further reduce the computational times, one can exploit dynamic constraint reasoning techniques. Indeed, to deal with the evolution of the environment (the exit and entry of aircraft in the controlled airspace), the conflict resolution problem must be solved each time a change occurs. Another option is to see the conflict resolution problem as a dynamic CSP: instead of resolving the problem from scratch every time a change occurs, the problem is reduced to the maintenance of a solution while exploiting as far as possible the search efforts made during the previous iterations. Many researchers have explored dynamic constraint problems and developed techniques to reason about them [14]. An additional feature of the problem that we must consider in the future is the uncertainty about the speed of the aircraft. Indeed, even during the cruise phase, the speed of an aircraft cannot be constant because of wind and turbulence. Uncertainties about the speed make it impossible to precisely forecast the future positions of the aircraft. Fortunately, one of the strengths of constraint programming is that a constraint can be defined in intension by a satisfaction function. This function indicates whether the constraint is satisfied for a valuation of the variables involved. A constraint is still well defined even if its associated function is modified. In our case, to handle uncertainties, it is sufficient to modify the function dmin which measures the minimum distance between two aircraft during a time interval: depending on the degree of uncertainty, dmin must ensure a safety margin on the minimum distance. This is the only change required, and everything else (model, resolution method, etc.) remains valid. It should also be noted that in this paper we have restricted our problem to aircraft flying at the same level (planar conflict resolution), but our approach can easily be extended to three dimensions. A simple way to do this is to use the horizontal projection of the climbing and descending aircraft and to treat them as levelled aircraft with different speeds. However, horizontal maneuvers might be preferred because they save energy, induce less passenger discomfort and do not require flight level changes.

8 Conclusion

In this paper, we have presented a constraint based approach to solving conflicts between aircraft during flight, using discrete variables and constraints. Even though we have discretized the time, the number of tests performed to check the consistency of a separation constraint is independent of the time step. We have also identified some useful properties of the separation constraints and used these properties to infer infeasible values during the search. Furthermore, the search algorithm was equipped with an appropriate filtering algorithm which suspends forward checks until they are required by the search. These techniques make it possible to save search effort and to reduce the computational times. The search algorithm was also equipped with suitable search heuristics. Experimental results show that the computational times obtained are very reasonable. Hence, our resolution method can be used in a real-time context after suitable implementation studies.

References

[1] M. J. Dent and R. E. Mercer. Minimal forward checking. In Proceedings of the Sixth International Conference on Tools with Artificial Intelligence, pages 432-438, New Orleans, LA, USA, November 1994.
[2] N. Durand and J. M. Alliot. Ant colony optimization for air traffic conflict resolution. In Proceedings of the USA/Europe Air Traffic Management Research and Development Seminar (online), http://www.atmseminar.org, July 2009.
[3] I. Roussos, K. Kyriakopoulos, E. Siva, A. Lecchini-Visintini, G. Chaloulos, J. Lygeros, and P. Casek. Comparative study of conflict resolution methods. Technical Report D5.1, iFly Project, June 2009.
[4] R. M. Haralick and G. L. Elliott. Increasing tree search efficiency for constraint satisfaction problems. In Proceedings of the 6th International Joint Conference on Artificial Intelligence, pages 356-364, San Francisco, CA, USA, 1979.
[5] ILOG. CPLEX user guide. ILOG, 1999.
[6] J. Kosecka, C. Tomlin, G. Pappas, and S. Sastry. Generation of conflict resolution maneuvers for air traffic management. In Proceedings of the 1997 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1598-1603 vol. 3, Grenoble, France, September 1997.
[7] J. K. Kuchar and L. C. Yang. A review of conflict detection and resolution modeling methods. IEEE Transactions on Intelligent Transportation Systems, 1(4):179-189, 2000.
[8] L. Pallottino, E. Feron, and A. Bicchi. Conflict resolution problems for air traffic management systems solved with mixed integer programming. IEEE Transactions on Intelligent Transportation Systems, 3(1):3-11, 2002.
[9] F. Laburthe. CHOCO: implementing a CP kernel. In Proceedings of the CP'00 Post-Conference Workshop on Techniques for Implementing Constraint Programming Systems (TRICS), Singapore, September 2000.
[10] P. K. Menon and G. D. Sweriduk. Optimal strategies for free-flight air traffic conflict resolution. Journal of Guidance, Control, and Dynamics, 22(2):202-211, 1997.
[11] N. Durand, J. M. Alliot, and J. Noailles. Automatic aircraft conflict resolution using genetic algorithms. In Proceedings of the 1996 ACM Symposium on Applied Computing, pages 289-298, New York, NY, USA, 1996.
[12] T. Feydy, N. Barnier, P. Brisset, and N. Durand. Mixed conflict model for air traffic control. In Interval Analysis, Constraint Propagation, Applications (IntCP 2005), 2005.
[13] V. Duong, E. Hoffman, F. Flochic, J. P. Nicolaon, and A. Bossu. Extended flight rules (EFR) to apply to the resolution of encounters in autonomous airborne separation. Technical report, Eurocontrol, September 1996.
[14] G. Verfaillie and N. Jussien. Constraint solving in uncertain and dynamic environments: A survey. Constraints, 10(3):253-281, 2005.
[15] K. Zeghal. A review of different approaches based on force fields for airborne conflict resolution. In Proceedings of the AIAA Guidance, Navigation, and Control Conference, pages 818-827, 1998.
[16] M. Zweben and M. Eskey. Constraint satisfaction with delayed evaluation. In Proceedings of the 11th International Joint Conference on Artificial Intelligence, pages 875-880, San Francisco, CA, USA, 1989.

Younes MECHQRANE is in the last year of a PhD program in Computer Science. His research interests include constraint reasoning, air traffic control, artificial intelligence, and computer networks. He has worked on centralised and distributed constraint satisfaction problems and is currently preparing papers in this field. He has also worked on frequency allocation for cellular radio networks and has published papers in this field.


Stock Market Indices Prediction via Hybrid Sigmoid Diagonal Recurrent Neural Networks and Enhanced Particle Swarm Optimization

Tarek Aboueldahab* and Mahmoud Fakhreldin**
*Cairo Metro Company, Ministry of Transport, Cairo, Egypt
**Computers and Systems Department, Electronics Research Institute, Cairo, Egypt
[email protected], [email protected]

Abstract
Recently, the use of hybrid intelligent models comprising both Neural Networks (NN) and Particle Swarm Optimization (PSO) for stock market prediction has become well established. Although this model has shown fast search speed in the complicated optimization and search problems of stock market prediction, PSO can often easily fall into local optima, decreasing prediction accuracy. This paper presents an Enhanced PSO (EPSO) to improve prediction accuracy and avoid premature convergence. The proposed method compares the cognitive term with a random perturbation term when relocating particles, and the network architecture used is the Sigmoid Diagonal Recurrent Neural Network (SDRNN). Experimental results on the most well known stock market indices show that EPSO successfully deals with these difficulties while maintaining fast search speed.
Keywords: Sigmoid Diagonal Recurrent Neural Networks, Enhanced Particle Swarm Optimization, Time Series Prediction, Stock Market

1. Introduction
Neural networks have become a very important method for stock market prediction because of their ability to deal with uncertainty and with insufficient data sets that fluctuate rapidly over very short periods of time. Many architectures have been applied to the prediction of famous stock market indices such as the Nasdaq100 and S&P500 indices [18-19]. For example, Recurrent Neural Networks (RNN) were introduced to forecast the daily closing prices of stock market indexes [10]. Also, a Genetic Algorithm (GA) was incorporated to train Feed-Forward Neural Network parameters (FNN-GA) [4], and optimum feature


selection was applied to train the network parameters [7]. A Polynomial Neural Network based on a Genetic Algorithm (PNN-GA) was used to search among all possible input variables and to select the order of the polynomial [11], and a Local Linear Wavelet Neural Network (LLWNN) optimized by an Estimation of Distribution Algorithm (EDA) was proposed to train the network parameters [16]. On the other hand, researchers have shown that ensembles of neural networks trained for the same task can produce more accurate results than individual neural networks [5]. Thus Particle Swarm Optimization (PSO) was used in training neural networks and was applied successfully in time series forecasting [2, 17]; it was also shown to be better suited for real time series prediction applications than GA, because it has few parameters to tune and does not follow survival of the fittest [12]. Based on this recognition, the PSO algorithm was used to train a selective neural network ensemble (PSOSEN) [14], and a Flexible Neural Tree (FNT), with its structure and parameters optimized using PSO incorporated with GA, was applied to both the Nasdaq 100 and S&P 500 indices [15]. PSO is a population-based search algorithm that starts with an initial population of randomly generated solutions called particles [8]. However, PSO can easily fall into local optima, because particles are driven by their cognitive behavior towards both their associated best positions and the global best position; once the best position is stuck in a local optimum, all the remaining particles quickly converge to it. Another problem occurs as the number of iterations increases: the particle speeds decrease, leading to convergence towards the best point found so far, which is not guaranteed even to be a local minimum. Many researchers have tried to enhance the PSO performance


and explore better solutions, either by introducing GA operations like crossover, mutation, and selection [9] or by tuning the PSO parameters [1, 3, 6]; however, this increases the computational effort and the structure complexity. Enhancing the mutation probability of particles, by decreasing the dependence of a particle on its own previous best and creating a dependency between the particle position and other particles in the swarm, changes the best particle position, helps the particles explore new areas, and lets them escape from local optima without increasing the structure complexity; this has therefore become a very important issue. In our previous work, we introduced the Sigmoid Diagonal Recurrent Neural Network (SDRNN) by adding a sigmoid weight to the hidden layer neurons of the standard DRNN to adapt the shape of the sigmoid function, and showed that it is the best suited architecture for reducing prediction errors in time series applications compared to different FNN and RNN architectures [13]. So, in this paper, an Enhanced PSO is presented based on modifying the normal cognitive term in the PSO velocity update equation by a random perturbation term and comparing their effect on the fitness function. We analyzed both the Nasdaq-100 index and the S&P 500 index [18-19], and our proposed SDRNN architecture is used in the experiments. The results show that this proposed modification gives significantly better performance and lower forecasting error than the standard PSO training algorithm. This paper is organized as follows: Section 2 reviews the SDRNN architecture, Section 3 introduces our proposed EPSO algorithm used for adaptation of the neural network weights, simulation results are presented in Section 4, and finally, Section 5 contains the conclusion and future work.

2. The Sigmoid Diagonal Recurrent Neural Network (SDRNN)
A typical neural network consists of layers. In a single layer network there is an input layer of source nodes and an output layer of neurons. A multi-layer network has, in addition, one or more hidden layers, and the number of hidden layer nodes is selected to make the network more efficient and to interpret the data more accurately. The input/output relationship can be either non-linear or linear, and its characteristics are determined by the weights assigned to the connections between the nodes in two adjacent layers. Changing the weights will


change the input-to-output behavior of the network [13]. The Sigmoid Diagonal Recurrent Neural Network (SDRNN) is an enhancement of the DRNN because it adapts the shape of the sigmoid function by introducing a sigmoid weight vector. This adaptation enables the hidden layer neuron outputs to take any proper value instead of being restricted to the sigmoid function output value, which leads to better accuracy in reflecting the required input/output mapping [13]. We therefore use this neural network architecture, and its mathematical model is given as follows.

Let n_i, n_h, n_o be the numbers of neurons in the input, hidden, and output layers respectively. W^I is the input weight matrix connecting the input layer and the hidden layer, W^S and W^D are the sigmoid weight vector and the diagonal weight vector of the hidden layer, and W^O is the output weight matrix connecting the hidden layer and the output layer.

Assume that, at sample k, the input to the i-th input neuron in the input layer is I_i(k). The output of the neural network can then be calculated as follows:

Hin_j(k) = W^D_j × H_j(k−1) + Σ_{i=1..n_i} W^I_{ij} × I_i(k)      (1)

The output of the j-th neuron in the hidden layer can be calculated as

H_j(k) = f(Hin_j(k) × W^S_j)      (2)

The output of the m-th neuron in the output layer can be written as follows:

Y_m(k) = Σ_{j=1..n_h} W^O_{jm} × H_j(k)      (3)

The structure of the hidden layer sigmoid neurons is shown in Figure 1.

Figure 1: The structure of the hidden layer sigmoid neurons
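As an illustration of equations (1)-(3) (not the authors' code), a minimal NumPy sketch of the SDRNN forward pass is given below. The sigmoid activation f and the exact weight shapes are assumptions consistent with the text; setting W_S to ones (or W_D to zeros) recovers the standard DRNN (or a feed-forward network), as noted in the remark that follows.

import numpy as np

def sdrnn_forward(inputs, W_I, W_D, W_S, W_O):
    """Forward pass of the SDRNN over a sequence of input samples.
    inputs : (T, n_i) array, one row per sample k
    W_I    : (n_i, n_h) input-to-hidden weight matrix
    W_D    : (n_h,) diagonal (self-recurrent) weight vector
    W_S    : (n_h,) sigmoid shape weight vector
    W_O    : (n_h, n_o) hidden-to-output weight matrix"""
    f = lambda x: 1.0 / (1.0 + np.exp(-x))      # assumed sigmoid activation
    H = np.zeros(W_D.shape[0])
    outputs = []
    for I_k in inputs:
        Hin = W_D * H + I_k @ W_I               # equation (1)
        H = f(Hin * W_S)                        # equation (2): W_S adapts the sigmoid shape
        outputs.append(H @ W_O)                 # equation (3)
    return np.array(outputs)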


Remark: For the standard Diagonal Recurrent Neural Network, the sigmoid weight vector W^S(k) is set to one, and for the normal Feed-Forward Neural Network the diagonal weight vector W^D(k) is set to zero.

3. The Enhanced Particle Swarm Optimization
In PSO, the particles are initially distributed randomly through the problem space and given an initial velocity. At each time step, the particle velocity is updated. Over time, the velocity of each particle is adjusted so that it moves stochastically toward its own best position and the global best position found by all other particles in the swarm [12]. The particle cognition term is associated with the particle's own thinking, without consideration of the other particles' thinking, which leads to a lack of diversity and can result in suboptimal solutions [8]. So, to add social thinking to the particle and keep diversity while searching for the global optimum, we add a perturbation that modifies this cognitive term by the weighted difference of the position vectors of two other distinct particles randomly chosen from the swarm. The particle is shifted to this new location only if it yields a better fitness value than the one obtained using the standard PSO.

Consider a swarm of S particles in a D-dimensional search space. Particle P, at any time instance t, is represented by its position vector X_P(t) = (x_1P(t), x_2P(t), ..., x_DP(t)) and its corresponding position change (velocity) V_P(t) = (v_1P(t), v_2P(t), ..., v_DP(t)).

In our EPSO, the following terms are considered: λ is the weighting (inertia) factor; μ1 and μ2 are the cognitive and social parameters respectively; C1(t+1) and C2(t+1) are cognitive and social random numbers drawn from the uniform distribution (0, 1) at time instance t+1; X_P_best is the position of particle P yielding the best obtained objective function value F_P_best so far; and X_G is the particle position with the globally best obtained objective function value F_G.

The algorithm for training the neural network weight set using Particle Swarm Optimization is as follows.
First (initialization, at time instance t = 0): for each particle P in the swarm, randomly initialize its position and velocity, and set X_P_best.
Second: increment t and, for each particle in the swarm, randomly initialize the cognitive and social numbers and do the following.

Step 1: apply the standard PSO algorithm as follows.
1 - Update the particle velocity by the following equation:
V_P^n(t+1) = λ V_P(t) + V_P^C(t+1) + V_P^S(t+1)      (4)
where V_P^C(t+1) and V_P^S(t+1) are the cognitive and social terms respectively, written as follows:
V_P^C(t+1) = μ1 C1(t+1) [X_P_best − X_P(t)]
V_P^S(t+1) = μ2 C2(t+1) [X_G − X_P(t)]
2 - Compute the particle's new position by adding its updated velocity vector to the particle's old position vector:
X_P^n(t+1) = X_P(t) + V_P^n(t+1)      (5)
3 - Using the obtained search space variables, for each sample k in the training set, the neural network output Y_P(k) is calculated using equations (1)-(3), and then the normal objective function value F_P^n(t) is calculated as follows:
F_P^n(t) = Σ_{k=1..N} (Y_P(k) − Y(k))^2      (6)
where Y(k) is the exact output and N is the training set size.

Step 2: get the modified objective function F_P^m(t) as follows.
1 - Construct the difference vector δ_P(t+1) by randomly selecting two other distinct particles, say L and M with P ≠ L ≠ M, and computing the difference between their position coordinates:
δ_P(t+1) = X_L(t) − X_M(t)      (7)
The perturbation term V_P^R(t+1) is calculated as follows:
V_P^R(t+1) = μ1 C1(t+1) × δ_P(t+1)
2 - Update the modified particle velocity V_P^m(t+1) by the following equation:
V_P^m(t+1) = λ V_P(t) + V_P^R(t+1) + V_P^S(t+1)      (8)
3 - Compute the particle's new modified position as follows:
X_P^m(t+1) = X_P(t) + V_P^m(t+1)      (9)
4 - Using this new modified position, compute the modified objective function F_P^m(t).

Step 3: compute the objective function and update the particle best and global best as follows.
1 - Select the objective function F_P(t) as the minimum of the normal and modified objective functions, and consequently select the new particle position.
2 - If F_P(t) ≤ F_P_best then F_P_best = F_P(t) and X_P_best = X_P(t).
3 - If F_P(t) ≤ F_G then F_G = F_P(t) and X_G = X_P(t).

The particle is shifted to the modified position if the modified objective function is better than the normal objective function associated with the standard PSO. Figure 2 shows the structure of the algorithm.

Figure 2: The structure of the proposed algorithm
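The following Python sketch summarises one EPSO iteration for a single particle, following Steps 1-3 and equations (4)-(9). Here fitness() stands for the SDRNN training error of equation (6), 'others' holds the positions of the other particles in the swarm, and all names and default parameter values are illustrative assumptions, not the authors' code.

import numpy as np

def epso_step(X, V, X_best, F_best, X_g, others, fitness,
              lam=0.7, mu1=2.0, mu2=2.0):
    """One EPSO update of a particle with position X and velocity V."""
    c1, c2 = np.random.rand(), np.random.rand()
    V_s = mu2 * c2 * (X_g - X)                        # social term
    # Step 1: standard PSO move (equations 4 and 5) and its fitness (equation 6)
    V_n = lam * V + mu1 * c1 * (X_best - X) + V_s
    X_n = X + V_n
    F_n = fitness(X_n)
    # Step 2: replace the cognitive term by a random perturbation (equations 7 to 9)
    L, M = np.random.choice(len(others), size=2, replace=False)
    V_m = lam * V + mu1 * c1 * (others[L] - others[M]) + V_s
    X_m = X + V_m
    F_m = fitness(X_m)
    # Step 3: keep whichever move gives the smaller training error, then update
    # the personal best (the global best is updated at swarm level, item 3).
    if F_m < F_n:
        X, V, F = X_m, V_m, F_m
    else:
        X, V, F = X_n, V_n, F_n
    if F <= F_best:
        X_best, F_best = X, F
    return X, V, X_best, F_best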

4. Experimental Results
Previous preliminary research carried out on both the NASDAQ100 and S&P500 stock market indices suggested that the input data comprise the daily open, maximum, and closing values, while the output data is the next day's closing value [14-16]. Our simulation experiments are carried out using the SDRNN trained by both PSO and EPSO, and the data can be found at www.finance.yahoo.com. Rescaling (normalization) of the data is important to ensure convergence and stability of the neural network predictor [2]; the normalization used is based on the following equation:

Î_i = (I_i − mean) / sqrt( Σ_{i=1..N} (I_i − mean)^2 / (N − 1) )      (10)

where mean = Σ_{i=1..N} I_i / N, I_i is the i-th sample in the vector I, Î_i is the normalized value of the i-th variable, and N is the number of samples.

The size of the network used in our study is 4, 4, 1 (i.e. 4 input neurons, 4 hidden layer neurons and 1 output neuron). The input vector contains the previous day's open, maximum, and closing values besides the delayed predictor output value, while the output of the neural network predictor is today's predicted closing value. The prediction accuracy was assessed in terms of the maximum absolute difference between the actual model output and the predicted output (MAX), the Mean Absolute Percentage Error (MAPE) and the Root Mean Square Error (RMSE) [2, 14-16], defined as follows:

MAX = max_k |Y(k) − Y_m(k)|      (11)

MAPE = (1/N) Σ_{k=1..N} |(Y(k) − Y_m(k)) / Y(k)| × 100      (12)

RMSE = sqrt( (1/N) Σ_{k=1..N} (Y(k) − Y_m(k))^2 )      (13)
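Under the notation above (Y the exact series, Y_m the predicted one), the normalization of equation (10) and the error measures of equations (11)-(13) can be computed as in the following sketch; the function names are ours and the code is illustrative only.

import numpy as np

def normalize(I):
    """Equation (10): zero-mean, unit-variance rescaling of an input vector."""
    return (I - I.mean()) / I.std(ddof=1)

def max_abs_error(y, y_m):
    """Equation (11): maximum absolute difference between actual and predicted."""
    return np.max(np.abs(y - y_m))

def mape(y, y_m):
    """Equation (12): mean absolute percentage error."""
    return 100.0 * np.mean(np.abs((y - y_m) / y))

def rmse(y, y_m):
    """Equation (13): root mean square error."""
    return np.sqrt(np.mean((y - y_m) ** 2))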

The swarm consists of 60 particles, each of which represents a possible neural network predictor solution. A neural network analysis consists of two stages, namely training and testing. During the training stage, an input/output mapping is determined iteratively using the available training data. The actual output error, propagated from the current input set, is compared with the target output, and the required compensation is transmitted backwards to adjust the node weights so that the error can be reduced at the next iteration. The training stage is stopped once a pre-defined error threshold or the maximum number of training epochs has been reached, and the node weights are frozen at this point. During the testing stage, data with unknown properties are provided as input and the corresponding output is calculated using the fixed node weights. To clearly show the influence of our proposed EPSO over the standard PSO, the experiments are carried out on both long term and short term data sets.

First: long term data set. The training data were chosen from 3/1/2000 to 1/1/2006, while the testing data were chosen from 3/1/2006 to 30/6/2010; the closing price data for both the S&P500 and NASDAQ100 indices are shown in Figures 3 and 4 respectively.


Figure 3: S&P500 PRICE INDEX.

Figure 4: NASDAQ100 PRICE INDEX.

Figure 5: ERROR IN S&P 500 USING EPSO

Figure 6: ERROR IN S&P 500 USING PSO

The performance measurements in the training and testing phases, using both the standard PSO and our proposed EPSO, for the S&P500 and NASDAQ100 indices are shown in Tables 1 and 2 respectively.

Table 1: S&P500 measurement performance using both PSO and EPSO
          TRAINING           TESTING
          PSO      EPSO      PSO      EPSO
MAX       94       75        31       16
MAPE      0.27%    0.11%     0.29%    0.11%
RMSE      6.7      3.3       6.7      2.7

Table 2: NASDAQ100 measurement performance using both PSO and EPSO
          TRAINING           TESTING
          PSO      EPSO      PSO      EPSO
MAX       466      94        530      163
MAPE      0.61%    0.20%     0.61%    0.20%
RMSE      121      26        125      28

Figure 7: ERROR IN NASDAQ100 USING EPSO

The error between the exact closing price value of the S&P500 index and the neural network predictor output over the entire long term period, for both training and testing data sets, using our enhanced EPSO and the standard PSO, is shown in Figures 5 and 6 respectively.


Figure 8: ERROR IN NASDAQ100 USING PSO


The error between the exact closing price value of the NASDAQ100 index and the neural network predictor output over the entire long term period, for both training and testing data sets, using our enhanced EPSO and the standard PSO, is shown in Figures 7 and 8 respectively.
Second: short term data set. The NASDAQ100 index suffered a major downtrend movement from 1/1/2008, when its value was 13043 points, till 23/4/2009, when its value became 7957 points, and then started an uptrend correctional movement. During the same period, the S&P500 behaved similarly, dropping from 1447 points to 851 points before starting the uptrend correction. The S&P500 and NASDAQ100 short term data sets are shown in Figures 9 and 10 respectively.

Table 3: Short term measurement performance using both PSO and EPSO
          S&P500             NASDAQ100
          PSO      EPSO      PSO      EPSO
MAX       27       7.6       503      157
MAPE      0.24%    0.11%     0.62%    0.22%
RMSE      8        4.3       173      37

The error between the exact closing price value of the S&P500 index and the neural network predictor output over the short term data set, using our enhanced EPSO and the standard PSO, is shown in Figures 11 and 12 respectively.

Figure 11: ERROR IN S&P 500 SHORTTERM USING EPSO

Figure 9: S&P500 SHORT TERM PRICE INDEX.

Figure 12: ERROR IN S&P 500 SHORTTERM USING PSO

Figure 10: NASDAQ100 SHORT TERM PRICE INDEX.

The performance measurements using both the standard PSO and our proposed EPSO for the S&P500 and NASDAQ100 indices on the short term data set are shown in Table 3.


The error between the exact closing price value of the NASDAQ100 index and the neural network predictor output over the short term data set, using our enhanced EPSO and the standard PSO, is shown in Figures 13 and 14 respectively.


For future work, it is suggested that this new EPSO be used in other time series prediction applications, and in applications other than time series prediction, such as pattern recognition and speech verification.

6. References

Figure 13 ERRORS IN NASDAQ100 SHORTTERM USING EPSO

Figure 14 ERRORS IN NASDAQ100 SHORTTERM USING PSO

From all the above figures and tables, it is clear that training the neural network parameters using EPSO significantly reduces all performance measurements compared to the values obtained when training with PSO, for both long term and short term forecasting of the NASDAQ100 and S&P500 stock market indices.

5. Conclusion and Future work
In this paper, we introduced a new Enhanced Particle Swarm Optimization (EPSO) to train the Sigmoid Diagonal Recurrent Neural Network (SDRNN) weights and applied this technique to the forecasting of both the NASDAQ100 and S&P500 stock market indices. The algorithm is based on modifying the cognitive term in the particle velocity update equation by comparing it with a new random perturbation term, formed from the position difference between two other randomly selected particles. This makes the particle aware of the performance of other particles and uses this awareness in choosing new positions, so as to avoid falling into local minima and to discover new, better positions that help in reaching the global minimum. Another advantage of this algorithm is that it does not add any new terms that would increase the computational effort and the structure complexity.


[1] A. A. E. Ahmed, L. T. Germano, and Z. C. Antonio “A hybrid particle swarm optimization applied to loss power minimization.” IEEE Transactions on Power Systems, Vol. 20, No. 2, pp. 859-866, May 2005. [2] A. Chaouachi, R. M. Kamel, R. Ichikawa, H. Hayashi, and K. Nagasaka. " Neural Network Ensemble-based Solar Power Generation ShortTerm Forecasting," World Academy of Science, Engineering and Technology 54, 2009 [3] A. Ratnaweera, S. Halgamuge, and H. Watson "Self-organizing hierarchical particle swarm optimizer with time-varying acceleration coefficients". In IEEE Transactions on Evolutionary Computation, volume 8, pp. 240255, 2004. [4] Asif Ullah Khan, T. K. Bandopadhyaya, Sudhir Sharma. " Genetic Algorithm Based Backpropagation Neural Network Performs better than Backpropagation Neural Network in Stock Rates Prediction." IJCSNS International Journal of Computer Science and Network Security, VOL.8, No. 7, July 2008. [5] H. X. Chen, S. M. Yuan and K. Jiang. ”Selective neural network ensemble based on clustering,” Lecture Notes in Computer Science, Springer Verlag, Heidelberg, Vol. 3971, pp. 545-550, 2006. [6] J. Y. Chen, Z. Qin, Y. Liu and J. Lu. ”Particle Swarm Optimization with Local Search,” Neural Networks and Brain, Vol. 1, pp. 481-484,2005. [7] K. Kim and W. Lee, "Stock Market Prediction using ANN with Optimal Feature Transformation," Neural Computing & Applications. Vol. 13, No.3, pp.225-260, 2004. [8] M. S. Arumugam, A. Chandramohan and M. V. C. Rao. ”Competitive Approaches to PSO Algorithms via New Acceleration Co-efficient Variant with Mutation Operators,” International Conference on Computational Intelligence and Multimedia Applications, 2005. [9] R. Poli, C. D. Chio, and W. B. Langdon. "Exploring extended particle swarms: a genetic programming approach". Genetic And Evolutionary Computation Conference (GECCO'05), pp. 169-176, 2005.


[10] S.I. Wu and H. Zheng. " Modeling Stock Market using Neural Networks." Proceedings of Software Engineering and Applications, ACTA Press, USA, 2003. [11] S. Farzi "A New Approach to Polynomial Neural Networks based on Genetic Algorithm" International Journal of Intelligent Systems and Technologies 3:3, 2008. [12] S. Sivanagaraju, J.Viswanatha "Capacitor Placement in Balanced and Unbalanced Radial Distribution System by Discrete Particle Swarm Optimization." ICGST-ACSE, Volume 8, Issue I, ISSN: 1687-4811, 2007.

[13] Tarek Aboueldahab" Improved Design of Nonlinear Controllers Using Recurrent Neural Networks." Master Dissertation, Cairo University, December 1997. [14] Xuegin Zhang, Yuehui Chen, Jack Y. Yang, "Stock Index Forecasting Using PSO Bases Selective Neural Network Ensemble." International Conference on Artificial Intelligence (ICAI07), pp. 260-264. 2007. [15] Yuehui Chen, Bo Yang, Ajith Abraham, "Flexible neural trees ensemble for stock index modeling" Neurocomputing, 70, pp. 697–703,2007 [16] Yuehui Chen, Xiaohui Dong, Yaou Zhao, "Stock Index Modeling using EDA based Local Linear Wavelet Neural Network", In Proc. of International Conference on Neural Networks and Brain (ICNN&B), pp.1646-1650, 2005. [17] Z. H. Zhou, J. X. Wu and W. Tang. ”Ensembling Neural Networks: Many Could Be Better Than All,” Artificial Intelligence, Vol. 137, pp. 239-263, 2002. [18] http://en.wikipedia.org/wiki/S&P_500 index [19] http://en.wikipedia.org/wiki/NASDAQ-100 index

Biographies Mahmoud Fakhreldin was born in Giza, Egypt, in 1968. He received the B.S. degree in automatic control from the University of Minufia in 1991 and the M.Sc. and PhD degree in Computer Engineering, from Electronics and Communications Department, Cairo University, Faculty of Engineering, in 1996 and 2004 respectively. He has been a Researcher at the Electronics Research Institute since 1993 till now. He has also worked at the Faculty of Engineering, Abha, KSA, 1998-1999. He works as a consultant at the ministry of communications and information technology, Egypt, 2004 – till now. His areas of interest are Evolutionary Computation, Advanced automatic control and image processing.


Tarek Aboueldahab was born in April 1971 and obtained the bachelor degree in electronics and communications engineering from the Faculty of Engineering, Cairo University, in 1993, and then a master degree in electronics and communications engineering (non-linear control) from the same university in 1998. He has been working at the Ministry of Transport, Cairo Metro Company, since 1995 in the field of control engineering. His fields of interest include non-linear control, artificial intelligence applications, particle swarm optimization, and neural networks.


Knowledge Management in ESMDA: Expert System for Medical Diagnostic Assistance

S. Abu Naser, R. Al-Dahdooh, A. Mushtaha and M. El-Naffar
Faculty of Engineering & Information Technology, Al-Azhar University, Gaza, Palestine
E-mail: [email protected]

Abstract
This research involved designing a prototype expert system that helps patients diagnose their diseases and offers them the proper advice; furthermore, the knowledge management used in the expert system is discussed. One of the main objectives of this research was to find a proper language for representing the patient's medical history and current situation in a knowledge base, so that the expert system is able to carry out the consultation effectively. Production rules were used to capture the knowledge. The expert system was developed using CLIPS (C Language Integrated Production System) with a Java interface. The expert system yielded good results in the analysis of the medical cases tested, and the system was able to determine the correct diagnosis in all cases.
Keywords: Knowledge Management, Expert System, CLIPS, Production System, Medical System

1. Introduction
An expert system solves problems by simulating the human reasoning process and applying specific knowledge and interfaces [5, 7, 8]. Expert systems also use human knowledge to solve problems that would normally require human intelligence. These expert systems represent the expertise as data or rules within the computer, and these rules and data can be called upon when needed to solve problems. Books and manual guides contain a tremendous amount of knowledge, but a human has to read and interpret that knowledge for it to be used. An expert system is a computer program designed to model the problem solving ability of a human expert. Expert systems and artificial intelligence encompass such diverse activities as game playing, automated reasoning, natural language, automatic programming, machine learning, robotics and vision, software tools, modelling human performance, and expert systems for complex decisions. Complex medical decisions are central in each phase, and our system helps to address this field of expert systems [2, 6, 15].
There are many medical diagnostic expert systems in the literature, such as MYCIN, EasyDiagnosis, PERFEX, INTERNIST-I, ONCOCIN, DXplain, and PUFF. MYCIN was the first well known medical expert system, developed by Shortliffe at Stanford University [16] to help doctors who are not expert in antimicrobial drugs prescribe such drugs for blood infections. The limitations of MYCIN were that its knowledge base is incomplete, since it does not cover anything like the full spectrum of infectious diseases; that running it would have required more computing power than most hospitals could afford at that time (1976); and that doctors do not relish typing at a terminal and require a much better user interface than the one provided. EasyDiagnosis is expert system software that provides a list and clinical description of the most likely conditions based on an analysis of the patient's particular symptoms [17]. EasyDiagnosis focuses on the most common medical complaints that account for the majority of physician visits and hospitalizations. EasyDiagnosis has a poorly designed user interface: the user is required to answer a large number of questions without any feedback giving him the feeling that his data is accepted and will be diagnosed. PERFEX is a medical expert system that supports solving problems clinicians currently have in evaluating perfusion studies [18]. The heart of the PERFEX system is the knowledge base, containing over 250 rules. They were formulated using the expertise of clinicians and researchers at Emory


University Hospital. PERFEX's limitation resides in its output, which is mostly numerical. INTERNIST-I is a rule-based expert system designed at the University of Pittsburgh in 1974 for the diagnosis of complex problems in general internal medicine [19]. ONCOCIN is a rule-based medical expert system for oncology protocol management developed at Stanford University [20]. ONCOCIN was designed to assist physicians with the treatment of cancer patients receiving chemotherapy. DXplain is a decision support system which uses a set of clinical findings (signs, symptoms, laboratory data) to produce a ranked list of diagnoses which might explain (or be associated with) the clinical manifestations [21]. DXplain provides justification for why each of these diseases might be considered, suggests what further clinical information would be useful to collect for each disease, and lists what clinical manifestations, if any, would be unusual or atypical for each of the specific diseases. PUFF is an expert system for the interpretation of pulmonary function tests for patients with lung disease [22]. PUFF was probably the first AI system to have been used in clinical practice. These expert systems suffer from limitations, a bad interface, or a poor output format. Our expert system is designed to overcome these limitations with descriptive output and a carefully designed interface. The primary goal of our expert system is to make expertise available to decision makers and technicians who need answers quickly. There is never enough expertise to go around; certainly it is not always available at the right place and the right time [9, 12]. The remainder of this paper is organized as follows: we first introduce the knowledge management in ESMDA in Section 2. In Section 3 we present how knowledge acquisition was done. Afterwards, we describe the knowledge representation schema in Section 4. Section 5 focuses on the user interface of the system. Experimental results are given in Section 6, and the conclusion and future work are given in Section 7.

2. Knowledge Management in ESMDA
In the past few years, the appearance of knowledge management has facilitated the improvement for the knowledge demander in searching for knowledge efficiently and effectively [1, 3]. The activity of knowledge management is wide and complex. It can be the management of personal knowledge or the operation of venture knowledge. It also includes activities that range from the communication of tacit knowledge to the integration of explicit knowledge. The basic activities of knowledge management are knowledge collection, creation, sharing/diffusion, and utilization [13]. A diversity of technologies has been applied to support these activities, such as e-mail, databases and data warehouses, group decision software, intranets and extranets, expert systems, intelligent agents, data mining, etc. [1, 3, 4, 10, 11]. Knowledge management in expert systems means how the knowledge is collected (knowledge acquisition), stored (knowledge representation) and retrieved (reasoning). ESMDA covers six areas of diseases: cold and flu, cough, fever, and ear and eye problems (see Figure 1).

3. Knowledge Acquisition
Basic information about cold and flu, cough, fever, and ear and eye problems, their symptoms and treatment, was collected from experts (physicians) and books. Knowledge elicitation was performed through interviews with the human experts.

4. Knowledge Representation
The environment of the system may affect its reliability. The use of some expert system programming languages makes the system limited to specific features. The expert system language chosen must be rule-based, portable, and use CLIPS rules. CLIPS (C Language Integrated Production System) was developed by Ernest Friedman-Hill of Sandia National Labs [5, 14]. CLIPS is a rule-based expert system shell that is suitable for our expert system. CLIPS is a productive development and delivery expert system tool which provides a complete environment for the construction of rule and/or object oriented expert systems, together with the Java class JClips. Class JClips is the Java part of the software bridge between CLIPS and Java; this class contains the methods to set up and control the CLIPS environment. JESS (Java Expert System Shell) was chosen to meet these needs. JESS is rule-based, with a syntax that is a superset of CLIPS. It is also implemented in Java, and is portable to any platform with a Java 2 compatible JVM. While JESS is not open source, it is free for educational use. Since JESS is implemented in Java, it is possible to work with the JESS inference engine directly from the Medical Diagnostic System interface.


JESS manipulates data stored in the knowledge base via a set of defined rules. All of the rule sets used in the system are loosely based on Knowledge.clp. The following rule is an example of how knowledge is represented in ESMDA using CLIPS:

(defrule SINUSITIS
  (Cough_Style (dry_cough yes))
  (Pain (around_eyes yes))      ;; or (around_cheeks yes)
                                ;; or (around_nose yes)
                                ;; or (around_forehead yes)
  (Symptoms (swelling yes) (Headache yes) (discharge_nose yes))  ;; end symptoms
  =>
  (printout t "You may be developing SINUSITIS" crlf))

ESMDA helps in diagnosing the following types of diseases:
a) Cold and flu problems. The rule set for cold & flu contains knowledge about diagnoses, the symptoms of the cold or the flu, and how to know when to see a doctor. The knowledge base of cold and flu covers the diagnoses listed in Table 1.
b) Cough problems. The rule set contains knowledge about how to diagnose and treat the symptoms of cough (see Table 2). A cough is an annoying symptom that can have many causes. Our expert system can help patients identify the problem and find suggestions for treatment.
c) Ear problems. The rule set contains knowledge about ear problems, which are often caused by an infection. However, other conditions may also cause ear pain or discomfort. Table 3 shows the diagnoses and symptoms of possible ear problems.
d) Eye problems. This rule set contains knowledge about eye problems. Eye pain or redness and changes in one's vision may be signs of a problem that requires medical attention. The possible diagnoses and symptoms of eye problems are listed in Table 4.
e) Fever. ESMDA has a rule set that contains knowledge about how to diagnose and treat the symptoms of fever. A fever is defined as a temperature 1° or more above the normal 98.6°. Minor infections may cause mild or short-term temperature elevations. Temperatures of 103° and above are considered high and can signal a potentially dangerous infection. Table 5 shows the diagnoses and symptoms of fever.
f) Cystinosis. ESMDA has a rule set that contains knowledge about how to diagnose and treat the symptoms of cystinosis. Cystinosis is a rare genetic disorder that causes an accumulation of the amino acid cystine within cells, forming crystals that can build up and damage the cells. These crystals negatively affect many systems in the body, especially the kidneys and eyes. The accumulation is caused by abnormal transport of cystine from lysosomes, resulting in a massive intra-lysosomal cystine accumulation in tissues. Via an as yet unknown mechanism, lysosomal cystine appears to amplify and alter apoptosis in such a way that cells die inappropriately, leading to loss of renal epithelial cells. This results in renal Fanconi syndrome, and similar loss in other tissues can account for the short stature, retinopathy, and other features of the disease.
Symptoms: there are three distinct types of cystinosis, each with slightly different symptoms: nephropathic cystinosis, intermediate cystinosis, and non-nephropathic or ocular cystinosis.
• Infants affected by nephropathic cystinosis initially exhibit poor growth and particular kidney problems (sometimes called renal Fanconi syndrome). The kidney problems lead to the loss of important minerals, salts, fluids, and other nutrients. The loss of nutrients not only impairs growth, but may result in soft, bowed bones (hypophosphatemic rickets), especially in the legs. The nutrient imbalances in the body lead to increased urination, thirst, dehydration, and abnormally acidic blood (acidosis). By about age two, cystine crystals may also be present in the cornea. The buildup of these crystals in the eye causes an increased sensitivity to light (photophobia). Without treatment, children with cystinosis are likely to experience complete kidney failure by about age ten. Other signs and symptoms that may occur in untreated patients include muscle deterioration, blindness, inability to swallow, diabetes, and thyroid and nervous system problems.
• The signs and symptoms of intermediate cystinosis are the same as those of nephropathic cystinosis, but they occur at a later age. Intermediate cystinosis typically begins to affect individuals around age twelve to fifteen. Malfunctioning kidneys and corneal crystals are the main initial features of this disorder. If intermediate cystinosis is left untreated, complete kidney failure will occur, but usually not until the late teens to mid twenties.
• People with non-nephropathic or ocular cystinosis do not usually experience growth impairment or kidney malfunction. The only symptom is photophobia due to cystine crystals in the cornea.

5. Expert System User Interface
The structure of the system is shown in Figure 2. When the expert system starts, a login screen is shown and the patient is supposed to log into the system (see Figure 3). If the patient is using the system for the first time, he/she must register by entering his personal data to get a new username and password (see Figure 4). The patient's data is stored in a secure database for future visits and diagnoses. Communication between the user and the expert system is done through the user interface in either English or Arabic, which was implemented to be easy for the regular end user (see Figure 5). The user interface does not require much typing. The user can select the patient user interface (see Figure 6) to edit or update his personal data and change the current password by clicking the Patient button in the expert system main user interface. Once the user selects Disease from the main expert system user interface, a new screen with the six areas of diseases (see Figure 1) is displayed to enable him to diagnose his problem. For example, if the user selects Ear problem from the areas of diseases, a new screen with multiple choice questions is displayed for the diagnosis of the possible diseases of the ear (see Figure 7). When the user has finished answering the questions, he/she should press the Start Diagnostic button. The expert system diagnoses the disease from the symptoms collected and the stored user profile. Once the diagnosis is finished, the result is stored in the user profile for future usage. When the user presses the Info button, a new screen of patient data history is shown (see Figure 8). The user can look at the results, print a report of the findings, or go back to the diagnosis screen again.

6. Experimental Results
We used ESMDA on 100 patient records from the hospital database, covering the different types of diseases that our expert system handles. The corresponding treatment results were compared to the results of our human expert doctor. To evaluate ESMDA, we used the accuracy metric, commonly used for this purpose: Accuracy = a, where a is the number of correctly diagnosed cases out of the possible 100 cases. The evaluation results are presented in Table 6, which shows a very good performance.

Table 6: Evaluation results for ESMDA performance
            Expert Doctor    ESMDA
Accuracy    96%              91%

7. Conclusion and Future Work

This paper has provided a brief introduction about knowledge management and the design of a prototype to ESMDA: Expert Systems for Medical Diagnostic Assistant. ESMDA helps patients in diagnosing their diseases and offering them the proper advice. CLIPS with Java language interface was used for representing patient’s medical history and current situation into a knowledge base for the expert systems to be able to carry out the consultation effectively. ESMDA yielded good results in the analysis of the medical cases tested and the system performed very well when compared to the expert doctor. We are planning to expand the knowledge base to include more diseases and have a web based version of the system.

References

[1] S. Abidi, Knowledge Management in Healthcare: Towards Knowledge-Driven Decision-Support Services, International Journal of Medical Informatics, Vol. 63, pp. 5-18, 2001. [2] Abu Naser, S.S., K.A. Kashkash and M. Fayyad, 2008. Developing an expert system for plant disease diagnosis. J. Artif. Intell., 1: 78-85. DOI: 10.3923/jai.2008.78.85-URL: http://scialert.net/abstract/?doi=jai.2008.78.85 [3] J. Barthes, and C. Tacla, Agent-supported Portals and Knowledge Management in

34

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

[16] B. Buchanan and E. Shortliffe, 1984. Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project. Reading, Mass.: Addison-Wesley. [17] F. Martin, Medical Diagnosis: Test First, Talk Later? 1(1), Mathemedics, Inc. ,2004. [18] N. Ezquerra, R. Mullick, E. Garcia, C. Cooke, and E. Kachouska, 1992, PERFEX: An Expert System for Interpreting 3D Myocardial Perfusion, Expert Systems with Applications, Pergamon Press, 1992. [19] A. Kumar, Y. Singh and S. Sanyal; Hybrid approach using case-based reasoning and rulebased reasoning for domain independent clinical decision support in ICU. Expert Systems with Applications, V(36), pp. 65-71, 2009. [20] G. Wiederhold, E. Shortliffe, L. Fagan, and L. Perreault, 2001,Medical Informatics: Computer Applications in Health Care and Biomedicine. New York: Springer, 2001. [21] G. Elhanan, S. Socratous, J. Cimino, Integrating DXplain into a clinical information system using the World Wide Web. Proc AMIA Annu Fall Symp. 1996;pp.348-52, 1996. [22] Aikins J., J. Kunz, E, Shortliffe, and R. Fallat. PUFF: an expert system for interpretation of pulmonary function data. Comput Biomed Res. 1983 Jun;16(3):pp.199-208, 1983.

Complex R&D Projects,Ą, Computers in Industry, Vol. 48 (1), pp. 3-16, 2002 [4] K. Chau, , C. Chuntian, and C. Li, Knowledge Management System on the Flow and Water Quality Modeling, Expert System with Applications, Vol. 22 (4), pp. 321-330, 2002. [5] J. Giarratano and G. Riley, Expert Systems: Principles and Programming, , PWS-Kent Publishing Co, 1989. [6] P. Hatzilygeroudis, J. Vassilakos, and A.Tsakalidis, An Intelligent Medical System for Diagnosis of Bone Diseases, Proceedings of the International Conference on Medical Physics and Biomedical Engineering (MPBE’94), Nicosia, Cyprus, May 1994, Vol. I, pp.148-152,1994. [7] P. Jackson, Introduction to Expert Systems: Principles and Programming, Third Edition Books/Cole Publishers, 1998 [8] P. I. James, Introduction to Expert Systems: The Development and Implementation of RuleBased Expert Systems, McGraw-Hill, Inc., 1991. [9] S. Karagiannis, A. Dounis, T. Chalastras, P. Tiropanis, and D .Papachristos, Design of Expert System for Search Allergy and Selection of the Skin Tests using CLIPS, International Journal Of Information Technology, 3(1) , 2006 [10] H. Nemati, D. Steiger, L. Iyer, and R. Herschel, Knowledge W arehouse: An Architectural Integration of Knowledge Management, Decision Support, Artificial Intelligence and Data Warehousing, Decision Support Systems, Vol. 33, pp. 143-161, 2002. [11] D. O'Leary, Enterprise Knowledge Management, Computer, Vol. 31 (3), pp. 54-61, 1998. [12] J. Rashid and A. H. Syed, Design of an Expert System for Diagnosis of Coronary Artery Disease Using Myocardial Perfusion Imaging, National Conference on Emerging Technologies 2004. [13] M. Rodica, U. Adina, A. Anca, Î. Iulian, Knowledge Management in E-Learning Systems. Revista Informatica Economica nr.2(46)/2008. [14] Scandia National Laboratories JESS: The Rule Engine for the Java Platform, 2003. (a Javabased expert system and environment, original based on CLIPS) [15] L. Shu-Hsie,. Expert system methodologies and applications - a decade review from 1995 to 2004, Expert Systems with Applications, 28: 93103, 2005.

Samy Abu Naser was born in Gaza, Palestine, in 1964. He received the B.S. and M.Sc. degrees in Computer Science from the University of Western Kentucky, USA in 1987 and 1989 respectively and the Ph.D. degree from North Dakota State University, USA in 1993 in Computer Science. He has been working as Associate Professor in Faculty of Engineering and Information Technology, Al-Azhar University, Gaza, Palestine since 1996. He was appointed as a Teaching Assistant at the University of Kentucky, USA, 1988-1989. He was appointed as a Research Assistant at the North Dakota State University, USA, 1990-1993. He has worked as Field Information Systems Officer at the United Nations Relief and Works Agency, Gaza 1993-1996. His areas of interest are Artificial Intelligence, Intelligent Tutoring Systems, Expert Systems, Knowledge Management Systems, and Compiler Design.

35

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

Table1: Diagnosis and symptoms of Cold and Flu Diagnosis Strep throat bacterial infection Flu influenza

Acute Bronchitis Allergies Gastroenteritis Sinusitis

Cold

Symptoms

Patient has fever, sore throat and headache without nasal drainage. Patient has fever, symptoms start suddenly, a combination of symptoms including muscle aches, chills, a sore throat, runny nose or cough, and don’t have a sore throat and headache without nasal drainage. Patient has fever, a persistent cough that brings up yellowish or greenish mucus, wheezing and shortness of breath. Patient has a runny nose, runny itchy, sneezing and itchy eyes. Either patient with high fever and headache, muscle aches and nausea, or vomiting and watery diarrhea Patient have dry cough and there is pain either around the eyes, cheeks, nose, or forehead. Furthermore, the symptoms are swellings and Headache and discharge nose. Sneezing, sore throat, headache, congestion and discharge nose.

Table2: Diagnosis and symptoms of Cough problems Diagnosis Symptoms Pulmonary Edema (fluid in the Patient has very short of breath, coughing up pink and frothy mucus. lungs). Viral illness such as a COLD or the Patient has cough, produce clear or pale yellow mucus. FLU. Chronic Bronchitis Patient have cough, produce yellow, and tan or green mucus. Pneumonia Patient has cough produce yellow, tan or green mucus, have a fever with shaking chills, and are very ill. Asthma a constriction of the Patient have cough come with shortness of breath and wheezing. airways Congestive heart failure Patient has shortness of breath, cough, or a feeling of not being able to get a deep breath. Pulmonary embolism Patient has recently started having sharp chest pain, rapid heartbeat, swelling of the legs and sudden shortness of breath. Tuberculosis Patient has a fever, chills and night sweats along with chest pain when he coughs or takes a deep breath. Lung cancer Patient have unintentionally lost weight, signs of lung cancer may include a cough that produces bloody sputum, shortness of breath and wheezing. Irritation of the airways Patient has inhale dust, particles or an object.

36

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

Table3: Diagnosis and symptoms of Ear problems Diagnosis

Symptoms

Otitis media Infected Mastoiditis or enlarged lymph node Ruptured Eardrum Otitis Externa Blocked Eustacian Tube

Tooth Problem Radiate Pain in Ear Barotraumas Serous Otitis or Ceruminosis

Patient has fever, experiencing pain deep in the ear and/or fluid draining from the ear. Patient has fever, redness and swelling of the outer ear and the surrounding skin. Patient has fever, a headache-type pain and redness behind the ear or tenderness when he touches the bone behind your ear. Patient has thick pus-filled or bloody drainage from the ear canal that started after a sharp, sudden pain. Patient ear swollen, and does it itch or hurt when you pull on your ear or earlobe. Patient hears fluid in his ear, feels pressure or stuffiness that can't be cleared with coughing, yawning or swallowing, and has cold or flu symptoms. Patient has tooth pain on the same side as the ear pain when he bites down. Patient ear pain start during an airplane flight or right after he traveled on an airplane. Patient has a child who doesn't have ear pain or redness but is having problems hearing?

Table4: Diagnosis and symptoms of Eye problems Diagnosis Symptoms Detached Retina Patient has sudden appearance of spots and strings floating in his field of vision; flashes of light in 1 or both eyes; partial loss of vision Acute Glaucoma Patient has red eye, severe eye pain, or has his vision suddenly decreased or become cloudy. Temporal Arteritis Patient experiencing flu-like symptoms such as fever, fatigue, muscle aches and a pain in one or both temples. Sinusitis Patient has thick nasal drainage and pain or pressure on his forehead and behind his eyes. Crossed Eyes Or Patient toddler have crossed eyes (can occur when eye muscles become weak or Strabismus when there is a loss or lack of vision in one eye). Allergy Or Insect Bite Patient eyes are red, itchy or swollen, or a bite-like swelling on one of his eyelid. Periorbital Cellulitis Patient has a fever, and his eye swollen and tender to the touch. Chalazion Patient has a firm, painful lump in the eyelid or a tender "pimple" on the edge of Hordeolum the eyelid. Conjunctivitis Pink Eye Sensation From Scratched Cornea Blepharitis Irritation from Contact

Patient white of the eye is pink, red or irritated, and have secretions or mucus from the eye. Patient has red eye, vision blurry and feel like sand in his eye. Patient has a burning sensation in the eye, the eye red and itchy, and the skin around the eye scaling. Patient wears contact lenses, and has eye pain.

37

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

Table5: Diagnosis and symptoms of Fever problems Diagnosis Symptoms Fever Effect Medicine Patient has recently started taking a new medicine. Meningitis Patient has a severe headache, neck stiffness, drowsiness and vomiting, and his Pyelonephritis eyes sensitive to light. Pneumonia Or Patient has a fever, chills, cough, unusually rapid breathing, breathing with Pulmonary Or Embolus grunting or wheezing sounds, labored breathing that makes a child's rib muscles retract , vomiting, chest pain, abdominal pain, decreased activity, loss of appetite (in older kids) or poor feeding (in infants), in extreme cases, bluish or gray color of the lips and fingernails Heat Exhaustion Patient has high fever (104°F or higher), severe headache, dizziness, and feeling light-headed, a flushed or red appearance to the skin, lack of sweating, muscle weakness or cramps, nausea, vomiting, fast heartbeat, fast breathing, feeling confused, anxious or disoriented, seizures Patient has a feeling of fullness in the ear, muffled hearing, fluid that drains Otitis Media Or from the ears, some pain inside the ear , trouble sleeping, irritability, fever, Swimmer's Ear Or Otitis headache Externa Strep Throat Patient has a fever between 101° and 103°, and he has a sore throat and headache. Tuberculosis Aids Patient fever come and go and his temperature stay between 97° and 102°, Has he lost weight unintentionally, and do he has a fever that comes and goes, night sweats or swollen lymph nodes. Bronchitis Pneumonia Patient fever come and go and his temperature stay between 97° and 102°, and Are he short of breath and do he has a cough that produces yellow, green or tan mucus. Gastroenteritis Patient fever come and go and his temperature stay between 97° and 102°, and he has aches, chills, nausea, vomiting, cramps or watery diarrhea. Mononucleosis Patient fever come and goes and his temperature stay between 97° and 102°, and he had a fever for weeks along with tiredness and a sore throat.

Figure 1: Category of disease covered by the expert system.

38

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

Figure 2: Structure of the ESMDA System

39

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

Figure 3: Login Screen of the expert system

Figure 4: Registering a new patient

Figure 5: Expert System main user interface

Figure 6: Patient user interface

Figure 7: Ear problems dialogue

Figure 8: Patient data history screen

40

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

Achieving Remote Presence using a Humanoid Robot Controlled by a NonInvasive BCI Device A. Thobbi, R. Kadam, W. Sheng Laboratory for Advanced Sensing, Computation and Control, Oklahoma State University, Stillwater, Oklahoma. [thobbi, rohit.kadam, weihua.sheng]@okstate.edu, http://ascc.okstate.edu Many applications require direct teleoperation. Teleoperation is difficult when we need to control complex robots such as humanoid robots having many degrees of freedom. The idea of using tele-operated robots for carrying out tasks in remote environments was first proposed by Marvin Minsky [1]. Advances in telecommunications have enabled considerable progress in this area. The focus shifted from simple tele-operation to telepresence, where the operator can have the feeling of being present in a remote location by having some control over the remote environment and receiving real-time feedback. Amongst the earliest works in this field, was by Hightower et al. [2] in which the steering wheel of a car in a remote location could be controlled by a human operator. The use of an anthropomorphic entity at remote environment was proposed. The remote presence technology has been used extensively for controlling mobile robots in hostile remote environments such as outer space, bomb-diffusion, radioactive waste dumping, extreme temperatures such as fire rescue or underwater exploration etc. Commercially sold robots such as Rovio and Spykee try to provide the users with video and audio feedback from a simple mobile robot present in the remote environment thus providing remote presence capabilities. A very good example where tele-robotic technology has found great applications is in the field of medicine [9]. Very recently, Cisco has introduced the Health Presence platform which aims to improve the doctor-patient collaboration during virtual visits [11]. It is common nowadays to perform delicate surgeries using teleoperated robots [10], [9]. Such applications have shown how human expertise can be transferred from one location to another via a robot embodiment. More industrial applications of such technology are used in virtual CAD modeling where users can create and interact with virtually created CAD models. The primary criteria for the remote presence system to work satisfactorily is that the operator should be able to

Abstract This paper presents a platform for ‘Remote Presence’ which enables a person to be present at a remote location through the embodiment of a humanoid robot. We specifically propose the use of a humanoid robot since it will endow human like capabilities for manipulating the remote environment. The numerous sensors available on the humanoid robot such as vision, microphones are essential to give feedback to the human controller about the remote environment. In addition to this, the humanoid has capabilities such as speech synthesis, obstacle avoidance, and ability to grasp objects which can be used to perform a wide array of tasks. To control the actions of the robot we propose the use of non-invasive Brain Computer Interface. The BCI enables the user to conveniently control the robot in the remote environment. The human user receives audio and video feedback from the robot on a personal media viewer such as video goggles. This would help the user to feel total immersion in the remote environment. This system could immensely benefit a variety of sectors such as military, medicine, disaster management etc. for carrying out dangerous or physically intensive tasks. 1 Keywords: Humanoid Robot, Tele-presence, Brain Computer Interface, Human Robot Interaction, Teleoperation.

1. Introduction The field of robotics has made considerable progress over the past few decades. More and more research has been focused on developing completely autonomous robots free of any kind of human control. Although many barriers have been broken, the intelligence of robots is not even close to human intellect. In order for robots to be useful today, they are still dependent on human guidance. Most of the industrial and service robots need explicit programming to achieve desired objectives. 1

This study has been implemented on the Nao Humanoid robot platform at ASCC lab, Oklahoma State University.

41

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

conveniently control the robot and receive rich feedback from the remote environment. In this paper, we propose a novel method of constructing such a remote presence platform. We propose that a humanoid robot be used in the remote environments primarily because humanoid robots have sensory and manipulatory abilities close to that of a human being. They contain numerous sensors such as cameras, microphones, sonar etc. which can provide rich data feedback from the remote environment. For convenient control of the humanoid robot in the remote environment, we propose the use of a Brain Computer Interface. Study by Bell et al. [3] has proven the utility of BCI devices for controlling the complex devices such as humanoids. The system proposed in this work also makes use of a personal media viewing device which can display the video captured from the humanoid’s camera. We believe that such a system would closely simulate the feeling of actually being present in the remote environment. The remainder of this paper is organized as follows. In

Section 2, we discuss the proposed system design of the remote presence platform and the related problems that need to be addressed. Section 3 discusses the decision filtering strategy. In Section 4, we present the experimental setup and the results to validate the system. Section 5 concludes the paper with future works.

2. System Design Overview: This section presents the software and hardware architecture of the proposed system. The system mainly consists of a non-invasive BCI device, a personal media viewing device, a wearable PC and the humanoid robot which would be controlled in the remote environment. This section presents the hardware and the software design of the proposed system. Fig 1 shows the hardware setup to be worn by the user and Fig 2 shows the complete block diagram of the remote presence system. Hardware architecture: In this subsection we present details about the hardware used for building the system.

Figure 2. System Block Diagram

EEG Device: The BCI device used for experiment-ation is the Emotiv Epoc Neuro-headset [12]. The Emotiv Epoc is a low cost Brain Computer Interface device intended to be used as a controller for gaming purposes. The device is based on recognition of patterns in ElectroEncaphalographic signals. EEG signals are the electric nerve responses generated by the human brain. The frequencies of the EEG signals characterize the thought pattern. Broadly, the signals are divided into 4 bands8 12 , 12 30 , 4 7 0 4 . Each part of the brain is responsible for various activities. A higher frequency in that particular part of the brain indicates that the particular part of the brain is more active than others. This sets the fundamental basis for understanding the thoughts by analyzing EEG signals. The Epoc device consists of 14 EEG electrodes. The electrodes are placed on the scalp according to the International 10-20 format. The electrodes can sample data at the rate of 2048 Hz. The device is connected to the computer using a Bluetooth interface. Since the

Figure 1. The hardware setup consisting of the Video-Goggles, BCI Device and Embedded PC

42

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

device has been created for gaming, arrangements have been made so that the system follows real time requirements. The device does not require a separate amplifier unit which reduces the cost drastically and allows mobility for the user. Video Goggles: Video goggles generally use a LCD or O-LED (Organic LED) display magnified by tiny lenses. This gives the wearer the sensation of viewing a very large screen (equivalent to 60 inches). Advanced goggles also allow for stereo vision capability. We propose the use of such video goggles because it eliminates the need for a display screen. The proposed system makes use of the MyVu video goggles [14]. Fit PC-2: The Fit PC-2 is a self contained CPU which can fit on the palm [15]. It includes a 1.6 GHz Intel Atom Processor, 1 GB DDR-2 RAM, 4 USB ports and a Wi-Fi and supports the Windows XP OS. Nao Humanoid Robot : The Nao Humanoid robot created by Aldebaran Robotics (France) is the state of art humanoid robot platform [13]. It is the standard platform used for the RoboCup competition. It has 26 degrees of freedom. It is equipped with the x86 AMD Geode microprocessor, 500 MHz, 256 MB RAM and 1 GB of flash memory. It can be connected wirelessly to a network using the Wi-Fi interface. The Bonjour Client helps in automatic network discovery and IP address assignment. The robot is equipped with 2 VGA cameras one pointing straight ahead and one pointing towards the floor. Inertial units and Force sensors on the feet help the robot to maintain balance while walking and other activities. It also has an ultrasonic measurement unit to detect and avoid obstacles. The robot is also equipped with 4 microphones and 2 speakers.

goal in the future is to develop algorithms which will use the raw EEG data for pattern recognition so that it is most suited for robot control. For this system, we try to improve upon the detection results we get from the Cognitiv suite by introducing a decision filtering block. The goal of the decision filter block is to identify whether the current thought is genuine or a falsely detected one. A heuristic approach is taken to remove the false positives, in order to get reliable signals for the robot control. Now, we shall briefly describe the working of the system as a whole. The user is equipped with the Epoc EEG Headset, the Video Goggles and the Embedded PC. The thoughts of the user are sensed by the device and transferred to the embedded PC via Bluetooth. Digital signal processing operations are performed within the SDK to improve the signal to noise ratio. These signals are converted to the feature space and then classified by the pattern recognition algorithm. Both of the above procedures are implemented in the Cognitive suite. The output of the Cognitive Suite is the decision about which action the user is thinking. This output is evaluated by the decision filter which predicts whether it is a genuine thought or a thought that is wrongly classified. Hence we can get a dependable control signal. This signal is applied to the robot’s control software which commands the robot to action corresponding to human thoughts. The video feedback from the robot’s camera is fed to the operator’s video goggles so that the operator can see the remote environment. The communication between the operator setup and the robot setup can occur through any TCP/IP network. Hence, the whole system gives the user a feel of being actually present at a remote location through the humanoid robot embodiment. The idea and design of the whole system is an important contribution of this work.

Software architecture: The software is developed on the wearable PC by using various Software Development Kits associated with the various hardware devices. The software architecture of the system can well understood by referring to Fig. 2 Nao Software Development Kit: The Nao Humanoid robot comes with an object oriented SDK for programming. The programming language used is C++. The SDK contains various functions running in real time for motor actions, sensory data acquisition. Motor action commands can be high level such as walking or low level such as moving individual joint angles. Emotive Software Development Kit: The Emotive software already contains programs which allow for pattern recognition in EEG signals. The program is called Cognitive Actions suite which can classify the active thoughts of the user. Basically, a scenario is provided where the user can control the motion of a virtually generated cube such as PUSH, LEFT, ROTATE etc. The software has to be trained for each user before it can be used. The pattern recognition in EEG signals is implemented in the Cognitive Suite using proprietary algorithms based on the concept of presence of Mu-Beta rhythms in various parts of the brain [5]. The Emotiv Research edition SDK allows us to get the raw EEG data as well as the data directly from the Cognitiv suite. In this work, we use the Cognitiv Suite based detection results to control the actions of the humanoid robot. An important

3. Decision Filtering Since we are using the signals directly from the SDK, these signals are not optimized for the control of a mobile robot. Especially we observe a number of false positives. In our application, false negatives are acceptable, whereas false positives are not since false positives will cause faulty robot movements. We have to take sufficient precautions for avoiding false positives. Hence, this block has been introduced. Using a heuristic technique, the probability of false positives can be greatly reduced. From the Cognitive Suite SDK, we can get 2 parameters such as the action the user is thinking about and the strength of the thought or equivalently the confidence of the SDK in detecting the user’s thought. The intuition we are relying upon is that, if the system is detecting the same thought for a sufficient duration of time as well as it is strongly confident about it, means that the thought is a genuine one. One method proposed is the integral over the thought (action) power function (ITPF). We take the integral of the power for the action performed for one thought cycle. A thought cycle is defined as the time for which the system is detecting the same thought continuously. A global threshold is set based on which the current thought cycle is accepted or rejected. Formally, let x FORWARD, BACK, LEFT, RIGHT denote the set of thought-actions and p t correspond to

43

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

Validation of the Emotiv Epoc BCI device The first experiment involves validation of the device in the lab environment where there are a lot of noise sources, audio and visual distractions for the user. The validation was done for the Cognitiv suite, provided in the Emotiv research SDK by training the system 10 times with training data of 8 seconds each. We tried 2 actions per user for less complexity. The user had to move a graphically generated block on screen either forward or to the left by his thoughts. We gave each user a sequence of 35 random actions such as MOVE LEFT, MOVE FORWARD, HOLD STILL. The number of times the user could move the block successfully in the given direction were recorded. We employed 4 volunteers for this experiment. Out of these 4 volunteers, volunteers ‘1’ and ‘2’ had used the system previously for 2 weeks whereas volunteers 3 and 4 were completely new to the system. The results for this experiment are presented in table I. The metric of performance is percentage of correct classification (PCC). From this experiment we can observe that the average rate of correct classification is 78.35%. It was also observed that the volunteers who had used the system for a few weeks before could give extremely good results. The conclusion from this experiment is that the BCI device gives acceptably good results. Also, with sufficient practice, users can develop expertise in using the BCI devices. Validation of the complete set up: To validate the effectiveness of the decision filtering strategy, initially, an experiment with a simulated robot was conducted. It was mentioned briefly that the decision filter is a heuristically designed block. Experiment with Simulated Robot: The experiment required the operator to move the green simulated robot towards the yellow target block. Instead of random paths, a pre-defined path was suggested for the operator to follow. The motion of the robot controlled by the operator was recorded. Fig. 4 shows the results of this experiment. We can see that the user is always able to hit the target successfully. A path (solid black line) close to the suggested path (shown by red-dotted line) could be maintained. The chances of false acceptances are very small because of the processing done by the decision filtering block. 2) Experiment with humanoid Robot: Finally, Fig. 5 shows the experiment done with the humanoid robot. A

Figure 3. The figure shows various parameters such as the thoughts, the power associated, ITPF and the ground truth data. It can be clearly seen that the spurious observations are removed by thresholding.

the power of the thought x at time t. If the decision cycle for the current thought is of duration t , then we define ITPF

p t dt

1

The decision filter accepts the thought x as genuine if ITPF is greater than a predefined threshold value. The threshold value is set heuristically by performing multiple trials under controlled conditions. The control action corresponding to thought x is applied to control the robot’s movement. Fig. 3 explains the working of the decision filtering block. The first row shows the observed thought actions obtained from the SDK. The second row shows the thought power function. The third row shows the integral over the thought power function. The integral is taken on individual thought cycles. The ITPF value is very small for thoughts whose duration as well as power is less. From Fig. 3 we can see the actual data obtained from experimentation. The last row shows the ground truth data. The ground truth was obtained by asking the user to push a button while he was thinking. From this we can understand whether it was a true positive or not. It can be seen from the figure that a spurious thought occurring at time 5 sec. was rejected by thresholding (threshold was set to be 9). Also, the thought at time 23 sec. was rejected, thus preventing multiple movements. Hence, the decision filtering block is absolutely essential to provide reliable control signals for the robot.

TABLE I: DEVICE VALIDATION

4. Experimental Evaluation The hypothesis proposed in this work is that a robot present at the remote location can be controlled using the visual feedback and EEG based BCI device. It is clear that the success of the system hinges on the performance of the BCI device. Hence, we have focused on validating the performance of the BCI device for the scenario of robot control. The experimentation has been done in two parts. In the first part we validate the performance of the SDK provided with Epoc. In the second part we test the efficacy of the decision filtering block and test whether getting reliable control signals is possible or not. We use a simulated robot to do the experiment since it is faster and easier to observe the results.

USER NUMBER

PERCENTAGE OF CORRECT CLASSIFICATION (%)

1 2 3 4

95.65 85.71 65.38 66.67

test path was set up shown approximately by the yellow line. The operator was asked to move the robot along the path. Retro-reflective markers were attached to the robot’s head. The motion was tracked and recorded by the Vicon MX optical motion capture system. Some motion paths are shown in Fig. 5. The thick red line 44

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

Figure 4. Simulated robot control experiment. The goal for the user was to hit the yellow target by controlling the motion of the robot in 4 directions. The user was asked to follow the path shown by red dotted lines. The solid black lines show the actual path obtained by human control using EEG headset

shows the ideal path to be followed by the robot. The ideal path was obtained by recording the motion of the robot when it was controlled using a joystick.

as well as the humanoid robot. The proposed platform can be used for many experiments involving Human-Robot interaction using BCI devices. For the future works investigation needs to be done to derive optimal algorithms for pattern recognition in EEG signals for robot motion control. The given system can be augmented with haptic devices to provide multimodal control to the user. Such a remote presence system can have many practical applications such as tele-operating humanoid robots to work in hostile environments, laborious work or simply working in a remote environment.

6. Acknowledgements This project is partially supported by the NSF grant CISE/CNS 0916864 and CISE/CNS MRI 0923238.

7. References [1] [2] [3]

[4] Figure 5. The top figure shows the test path to be followed. The bottom figure shows the observed paths. The thick red path is the path obtained when the robot was controlled using a joystick

[5]

In the experiment with humanoid robot, it can be seen that the user is able to move the robot from the start to finish with very few false movements We conclude from these experiments that the system is able to provide reliable motion control signals using only the human user’s thoughts.

[6] [7]

[8]

5. Conclusion [9]

This work presented a novel idea and a platform for remote presence through a humanoid robot embodiment. A special strategy called decision filtering was proposed to improve the results by reducing false positives. The merit of the system was evaluated by validating the BCI device and the complete system. It was seen that decision filtering successfully removes false positives. Thus, reliable signals for controlling the robot motion were obtained from the thoughts of the human. The human user was able to control the motion of the simulated robot

[10] [11]

[12] [13] [14] [15]

45

M. Minsky, “Toward a remotely manned Energy and Production Economy .” MIT AIL, 1979. J. Hightower, “Development of Remote Presence Technology for Teleoperator Systems”, Naval Ocean System Centre, 1986. Christian J. Bell, Pradeep Shenoy, Rawichote Chalodhorn, and Rajesh P. N. Rao. “Control of a humanoid robot by a noninvasive brain-computer interface in humans”. Journal of Neural Engineering, 5(2):214+, June 2008. Bauer, M., Heiber, T., Kortuem, G., and Segall, Z. “A Collaborative Wearable System with Remote Sensing”. In Proceedings of the 2nd IEEE international Symposium on Wearable Computers (October 19 - 20, 1998). ISWC. IEEE Computer Society, Washington, DC, 10. J. D. Millan, F. Renkens, J. Mourino, and W. Gerstner. “Noninvasive brain-actuated control of a mobile robot by human EEG”. Biomedical Engineering, IEEE Transactions on, 51(6):1026–1033, May 2004.Y. A. Ansar. “Visual and haptic collaborative tele-presence”. Computers & Graphics, 25(5):789–798, October 2001. Nijhholt A, , “BrainGain: BCI for HCI and Games,” The Society for the Study of Artificial Intelligence and Simulation of Behaviour, April, 2008. N. Roy, G. Baltus, D. Fox, F. Gemperle, J. Goetz, T. Hirsch, D. Magaritis, M. Montemerlo, J. Pineau, J. Schulte, and S. Thrun. “Towards personal service robots for the elderly,” June 2000. Satava R.M., Simon I.B., “Teleoperation, telerobotics, and telepresence in surgery.”, Endosc Surg Allied Technologies. (3):151-3, Jun 1993 Sheridan TB. Telerobotics, Automation, and Human Supervisory Control. Cambridge, MA: MIT Press, 1992. Nick Augustinos and Ash Shehata, Cisco IBSG Healthcare Practice, Cisco HealthPresence, Transforming Access to Healthcare, Jan 2009 http://www.emotiv.com/, Emotiv Epoc EEG Headset http://www.aldebaran-robotics.com/en, Nao Humanoid Robot, Aldebaran Robotics, France http://www.myvu.com/, MyVU personal multimedia viewers. http://www.fit-pc.com/web/, Fit-PC2 CPU system

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

46

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

Design of Energy Efficient Humidification Plant for Textile Processing S.Kokila, P.Gomathi, T.Manigandan PG Scholar,Assistant Professor, Dean School of Electrical Sciences Department of EEE, Kongu Engineering College,Perundurai-638 052,Tamilnadu, India. [email protected] level of 65-75% relative humidity (%RH) in textile manufacturing facilities static build-up can be reduced, regain improved, yarn breakage minimized and dust, fly and lint suppressed. This will dramatically improve quality of yarn and maintain consistent product weight thus profit is maximized. Humidity can be controlled by using Variable Frequency Drive for the humidification plant which is connected with textile unit to control motor speed to such a high tolerance (five significant digits) that speed variation and set point repeatability were no longer factors in [20] adversely affecting yarn quality. VFDs can significantly reduce energy consumption and operating costs of [3] the entire system while providing operational benefits to the owner. The basic block diagram representation of an energy efficient humidification plant with speed control loop as shown in fig1. The motor-control issues are traditionally handled by [15] fixed-gain proportionalintegral (PI) and proportional-integral-derivative (PID) controllers. However, the fixed-gain controllers are very sensitive to parameter variations, load disturbances, etc. Thus, the controller parameters have to be continually adapted. The problem can be solved by one of the most successful expert system techniques applied in a wide range of control applications has been fuzzy logic [21]. It can be combined with conventional P1controller, to built a fuzzy self-tuning controllers, in order to achieve a more robust control [22]. The fuzzy adaptation can be built via updating fuzzy sets functions, fuzzy rules, or controller gains. The contribution of this paper is a simple high performance fuzzy self-tuning PI speed controller of vector control of a variable frequency induction motor drive. The fuzzy rules used to update the two gains of the PI with fixed structure. are based on a qualitative knowledge and experience

Abstract: Humidification is an important ancillary process in a textile industry that is supportive to the production of yarns and fabrics. It improves not only the production but also the quality. Besides, humidification is the second-largest power consuming component next to textile mills and accounts for nearly 15% of the power bill of a textile mills. The increasing power cost at the rate of 12% per year, any effort to save power will be received in the industry. Energy saving in the tune of 25% to 65% in the existing condition is possible by incorporating control in the textile mill depending on the outside climate. Energy efficient control system for humidification plant in textile mills for making the existing humidification plants are more energy efficient in operation. Control consists of variable speed drives for air supply fans, exhaust fans and pumps. In this paper energy efficient humidification plant for textile processing is designed and the power of humidification plant is predicted using Fuzzy Logic. The circuit is simulated in MATLAB-SIMULINK and the results are shown. This Energy Efficient Humidification Plant reduces the power consumption and maintains the quality of the product. Keywords- Mathematical model, Humidification Plant, Fuzzy Logic, Energy Efficient Controller.

1. Introduction The power required for the supply air fan, pumping, and exhaust system of the humidification plant of a textile mill depends on many factors [1]. The most important factors are power of motor driving machinery, lighting and heating, number of people inside, temperature gradient and relative humidity. So the power consumed by the humidification plant can be reduced by varying the speed of supply air fan Induction motor [2] Humidity control in the textile industry is essential in order to maintain product quality and reduce imperfections. By maintaining a

47

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

Supply (KW)

RH (%)

Temperature

Persons

Light (KW)

Study number

Machine (KW)

Table 1.Existing Values in Textile Spinning Mills

established from a lot of simulation results of several transient speed responses obtained for different operating conditions such as response to step speed command from standstill, step load torque application and speed reversion, with nominal parameters and an increased and or decreased rotor resistance, self inductance and inertia The complete vector control scheme of IM incorporating the Fuzzy self tuning PI controller has been successfully simulated using MATLAB. The performances of the proposed drive have also been compared with those obtained from the conventional PI controller. Therefore, a tuning procedure must be presented to ensure that the controller can cope with the variations in the plant. For humidification power analysis, in paper [5] data like the power of the motors driving the machineries, pneumafils and the overhead cleaners, the lighting and heating load inside [17] the department, the number of persons inside the department, the outside design summer dry bulb temperature and relative humidity of the place of location of the mill [19], the existing humidification supply air power, pumping power and exhaust power had been collected for the spinning department of about 50 mills were collected. Out of this, 10 mills were selected for the study after conducting the heat load analysis and are given in Table 1 and all the input and output values are normalized. From this analysis power can be calculated for different machine rating. This paper provides a power prediction for same rating of machine with different speeds using fuzzy logic controller.

1

439

14

37

34.7

58

34

2

836

10

55

34.7

58

60

3

895

10

65

39.5

37

56

4

1063

20

90

38.3

38

81

5

821

16

70

34.7

58

67

6

1007

18

80

34.7

58

67

7

425

3.89

30

34.7

58

34

8

582

6

35

38.5

47

41

9

1134

29.6

39

35.3

38

78

10

828

13.4

50

34.7

58

60

These processes are operated manually or semiautomatic. This paper provides variable frequency drive for speed control of induction motor. Reference input

The remainder of the paper is organized as follows: Section (2) focuses on Proposed method Section (3) developing Mathematical model of humidification plant section(4) emphasizes

Output

PI Controller

Plant

Humidity Sensor

Fuzzy self-tuning PI speed controller design, Section(5) Prediction of humidification power using fuzzy controller, Section (6) shows the simulation results

Figure.1. Basic Block Diagram of Energy Efficient Humidification plant

3. Mathematical model of humidification plant

2. Proposed method

The humidification plant which consist of two supply air fan and two water pump motor. A mathematical model for individual machine of humidification plant is derived.

In textile mills, the power consumption of the humidification Plant can be reduced by the following methods. 1. Pump on and off. 2. Return Air damper gradually open and close. 3. Fresh Air damper gradually open and close. 4. Bypass damper gradually open and close. 5. Exhaust damper gradually open and close. 6. Supply and return air fan ON/OFF.

A. Need for Mathematical modeling Currently mathematical modeling tools for research and analysis of Electric Power Distribution Systems do not include efficient mathematical models of a

48

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

ω slm I lr · is leakage reactance (Ω)

number of loads that consist of combination of devices with different physical nature [11].This prevents comprehensive analysis of modes and processes of engineering subsystems, including their nonlinear characteristics, in unified systematic way. Steady state mathematical model of induction motor and centrifugal pump proposed in [12] requires defining control strategies of electromagnetic and hydraulic model coordinates in order to obtain unified solution. The goal of this work is the development of mathematical modeling for humidification plant in order to find the initial value of PI controller parameter to control the speed of induction motor. The pump and asynchronous squirrel cage motor data used in this mathematical modeling are the following:

Rr is rotor resistance (Ω) Vs is input voltage per phase (Volt). The machine torque and speed are related by the following equation [13].

J

dω r + B ω r = Te − T L dt

by J is moment of inertia, B is viscous friction, TL is load torque (N ·m) From relationship among equation (1) and (2), transfer function of induction motor for speed control is

Motor power=7.5kW. Voltage=415V. Rated Speed=1440rpm. Rs=0.288;Rr=0.158;

ω

r

V

s

K G(s)

Output C(s)

PID Controller

G2

input R(s)

+

Gc(s) G3

G4

Humidity Sensor

Figure 2.Basic Block Diagram for mathematical of Humidification plant

T

⎞ ⎟ ⎟ ⎠

2

R

r

+ωslm2 Llr 2

2

ω slm Rr R 2 r +ω slm 2 Llr 2

(4)

t

=

V Δ P

η

(5)

t

Dimensionless co efficient of pressure head

Ch =

ωslm Rr 2

⎛ P ⎞⎛ V ⎞ = 3⎜⎜ ⎟⎟ ⎜⎜ S ⎟⎟ ⎝ 2 ⎠ ⎝ ωe ⎠

E

B. Model of Induction motor for speed control In drive operation, the speed can be controlled indirectly by controlling the torque which, for the normal operating region, is directly proportional to the voltage to frequency. The torque e T [6] given by equation also ⎛ P ⎞⎛⎜ VS 3 = ⎜⎜ ⎟⎟⎜ e ⎝ 2 ⎠⎝ ωe

(3)

C. Model of Supply air fan The fan models used by currently available simulation tools can only simulate the performance of a fan itself. There are no models to simulate the performance of other components in a fan subsystem, such as motor, inverter, etc. However, it is difficult to commission a fan using these models because they simulate the energy consumption of a fan itself and this energy consumption, which is termed fan shaft power, is difficult to measure. The total energy consumption of supply air fan [7] is defined by the following equations

G1

-

K JS + B

=

Where

Pump Rated Speed=2870rpm;

+

(2)

E tη t        ρ N 2 D 2V

Fan  pressure head Δ P = SV

(1)

Where,

2

Where,

Te is developed torque (N ·m) P is pole of induction motor

Ch = dimensionless coefficient of pressure head

ωe is stator supply frequency (rad / s)

D = diameter of fan wheel (m) Er =rated fan power consumption (W)

49

(6)

(7)

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

Et = total energy consumption of fan subsystem (W)

G (s ) =

N = fan rotation speed (r/s)

0.00846 s + 2.5 s 2 + 1.76 s + 0.22

(13)

ΔP = fan pressure head (Pa) S = airflow resistance coefficient (pa/(m3/s)2) From this Initial Values for kp ki are selected by pole placement methods.

V = fan supply air volume flow rate (m3/s) ηt = total efficiency of fan subsystem ρ =air density (kg/m3).

Kp=1.76

The transfer function of fan

Ki=0.22

Transfer function for PI Controller is

1

V ⎛C ρ ⎞2 g1 = g 2 = = ⎜ h ⎟ N ⎝ S ⎠

(8)

Gc ( s ) = K p +

Transfer function of supply air fan

Ki s

(14)

1

G1 = G 2 =

V k ⎛ Cρ ⎞2 = ⋅⎜ ⎟ ⋅D vs Js + B ⎝ S ⎠

Transfer function of plant with PI Controller is

(9)

D. Model of Centrifugal pump model In order to energy saving, the rotate sped of the centrifugal pump can turn to n, then pump speed becomes varied. So form the relation (8), the rotate speed n of the centrifugal pump is:

T (s ) =

G (s )Gc (s ) 1 + G (s )Gc (s )H (s )

(15)

By substuting plant equations we get, {16} T (s ) =

Q n = no o Q1

(0.00846K P s 2 + (2.5K P + 0.00846K I )s + 2.5K I ) /

(10)

(s 3 + (1.76 + 0.00846K P )s 2 + (0.22 + 2.5K p + 0.0084Ki )s

+ 2.5Ki ) Form this transfer function the value of Kp and Ki are tuned as. Kp=1.7 Ki=0.2.

Where, n= rotate speed no=Rated Speed Q0=Flow rate at rated speed

4. Fuzzy self-tuning PI speed controller design

Q1=Flow rate at rotate speed Transfer function of centrifugal pump model [8][9] can be drive from the equations (3) and (10) is

The block diagram of the proposed Fuzzy Inference block is designed for speed control of an indirectfield oriented induction motor, is presented in figure 3.

1

n ω (s ) H (s ) K A G3 = G 4 = r ⋅ = ∗ 0 ∗ 1 V s (s ) ω r (s ) JS + B Q 2 S+

(11)

RA

E. Model of the Humidification plant From the equations (9) and (10) we can drive a transfer function as 1 ⎡ ⎤ n 1 ⎞ ⎛ Ch ρ ⎞ 2 ⎛ ⎢ 2⋅ ⎜s + ⎟ ⋅D+ 0 ⎥ ⎟⋅⎜ ⎢⎝ RA ⎠ ⎝ S ⎠ AQ 2 ⎥ ⎣ ⎦ G (s ) = B ⎛ J + BRA ⎞ 2 Js + ⎜ ⎟s + RA ⎝ RA ⎠

(12) Figure 3 Fuzzy Inference block

The two gains of the PI controller will be initialized using Well-known conventional methods, but, these gains depend on the induction motor estimated model at rated operating Conditions Then, a fuzzy algorithm for tuning these two gains of the PI controller is proposed to keep good control performance, when parameter variations

By substuting various parameters in plant model, we get

50

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

manipulated in the fuzzy inference mechanism is based on the fuzzy set theory, the associated fuzzy sets involved in the fuzzy control rules are defined as follows:LN: Low Negative LP: Low Positive N:Negative P: Positive The membership functions for the fuzzy sets corresponding to the temp, speed and the power are defined in fig. 4 and fig.5.In this application, mamdani FLC with two inputs (Temperature and Speed) and one output (power) was constructed. Each input contains five membership functions and table shows the membership functions and rules base.

take place ant or when disturbances are present. This approach uses fuzzy rules to generate Kp and Ki The design of these rules is based on a qualitative knowledge, deduced from an exhaustive simulation tests of a conventional PI speed controller of an induction motor for different values of Kp and Ki with different operating conditions One can note that, as both the two input variables (speed error e and its change Ae), and the two output variables (Kp and Ki) of the fuzzy adaptation mechanism, are described with seven fuzzy sets each, namely: NB: negative big, NM: negative medium,NS: negative small, Z: zero, PS: positive small PM: positive medium, PB: positive big Then, inference rules base can have at most 98 rules (49 rules for Kp tuning, and 49 rules for Ki tuning are proposed in this paper). The fuzzy sets are triangular shape functions with equal width and overlap.

NM B B B S B S B

NS B B B S B B B

Z S S S S S S S

PS B B B S B B B

PM B B B S B B B

PB B B B S B B B

medium

fast

NB B B B S B B B

brisk

CE/E NB NM NS Z PS PM PB

low

Table 2. Inference Rule matrix for the integral gain Ki of the PI controller

slack

Temp/Speed

Table 4.Fuzzy Logic Controller Operation

cold

LN

N

M

LP

P

cool

LN

N

M

LP

P

moderate

LN

N

M

LP

P

warm

LN

N

M

LP

P

hot

LN

N

M

LP

P

Table 3. Inference Rule matrix for the integral gain Kp of the PI controller

CE/E NB NM NS Z PS PM PB

NB B B B S B B B

NM B B B S B S B

NS B B B S B B B

Z S S S S S S S

PS B B B S B B B

PM B B B S B B S

PB B S B S B B B

5. Prediction of humidification power using fuzzy controller Fuzzy logic is convenient way for control in wide area of processes[10]. The main advantage of FLC to some classics algorithm of control can be applied in nonlinear systems and systems with dominated time delays. In humidification plant, the power consumed by this plant is 1/3rd of the unit. By considering [14] the variable frequency drive for controlling the induction motor speed the power consumption can be reduced. The power can be predicted by varying speed of induction motor using fuzzy logic controller. The inputs of the fuzzy adapter are: The temperature and the speed, the output is: power. The data

Figure 4 Membership functions for temp and speed

51

AIML Journal, ISSN: 1687-4846, Volume 10, Issue 1, ICGST LLC, Delaware, USA, October 2010

Figure.6.Simulink block of Energy Efficient Humidification Plant

6. Simulation results Many simulation tests were performed in order to comparethe performances of the self-tuning PI speed controller of vector control of a variable frequency induction motor drive, to those obtained with a conventional PI controller Some sample results are shown in this section. Fig 7 shows the conventional PI controller parameter of Kp=1.7,Ki=0.2 for Speed transient response for a step command of 145 rad/s. Fig 8 shows the fuzzy self tuning PI controller parameter of Kp=1.7,Ki=0.2 for Speed transient response for a step command of 145 rad/s. The fuzzy logic controller operation is based on the control operation shown in Table 4. Power can be calculated through fuzzy controller. From this the power can be evaluated by changing the temperature and speed. One sample set of calculations are shown in table 5 for the temperature of 320C with different possible speeds. This was simulated using MATLAB simulink.Simulink block of Energy Efficient Humidification Plant as shown in Fig 6.and the speed control waveform is shown in Fig 7.




Figure 7. Speed transient response of the conventional PI controller for a step command of 145 rad/s

Figure 8. Speed transient response of the fuzzy self-tuning PI controller for a step command of 145 rad/s

Table 5. Calculation of energy

Temperature (°C)   Speed (rpm)   Power (W)   Energy/month (kWh)   Energy saving (kWh)
32                 1300          6314        4546.080             853.920
32                 1360          6362        4580.640             819.360
32                 1400          6601        4752.720             647.280
32                 1420          6807        4901.040             498.960
32                 1440          7119        5125.680             274.320
32                 1450          7334        5280.480             119.520
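The monthly energy figures in Table 5 are consistent with multiplying the predicted power by about 720 operating hours per month and subtracting the result from a fixed-speed baseline of roughly 5400 kWh (7.5 kW); both of these figures are inferred from the tabulated values rather than stated in the text. A short check of one row:

hours_per_month = 720                  # assumed: 24 h x 30 days
baseline_kwh = 7.5 * hours_per_month   # assumed fixed-speed consumption per month

power_w = 6601                         # predicted power at 1400 rpm and 32 degC (Table 5)
energy_kwh = power_w / 1000.0 * hours_per_month
saving_kwh = baseline_kwh - energy_kwh
print(energy_kwh, saving_kwh)          # 4752.72 and 647.28, matching the Table 5 row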

7. Conclusion
In this paper, extensive simulation results of a conventional and a fuzzy self-tuning PI controller for speed regulation of a vector-controlled variable frequency induction motor were presented. The design of an energy efficient humidification plant for textile processing was achieved using a simple fuzzy logic adaptation mechanism. Energy is saved with the help of the variable frequency drive, the power is predicted using the fuzzy logic controller, and the corresponding results were tabulated.

8. Acknowledgements
We would like to thank the management of Kongu Engineering College as well as the members of the drives group.


9. References
[1] S. Rajasekaran, R. Prakasam, S. Dhandapani and Vasanthakalyani David, "Prediction of Humidification Power in Textile-Spinning Mills Using Functional and Neural Networks", Journal of Energy Engineering, Vol. 129, No. 1, 2003.
[2] A. K. Sharma, R. A. Gupta and Laxmi Srivastava, "A New Technique for Energy Reduction in Induction Motor Drives Using Artificial Neural Network", Journal of Theoretical and Applied Information Technology, 2008.
[3] Benjamin Cohen, "Variable Frequency Drives: Operation and Application with Evaporative Cooling Equipment", CTI Journal, Vol. 28, No. 2, 2002.
[4] N. Noroozi-Varcheshme, A. Ranjbar-Noiey and H. Karimi-Davijani, "Sensorless Indirect Field-Oriented Control of Induction Motor Using Intelligent PI Controller", Universities Power Engineering Conference, pp. 1-5, 2008.
[5] S. Rajasekaran, R. Prakasam and S. Dhandapani, "Studies in Power Engineering Using RPD Paradigm Functional and Neural Networks", IEEE, Vol. 85, 2005.
[6] Tianchai Suksri and Satean Tunyasrirut, "T-DOF PI Controller Design for a Speed Control of Induction Motor", World Academy of Science, Engineering and Technology 35, 2007.
[7] Fulin Wang, Harunori Yoshida and Masato Miyata, "Total Energy Consumption Model of Fan Subsystem Suitable for Continuous Commissioning", ASHRAE Transactions, Vol. 110, Part 1, 2004.
[8] Guo Junzhong, Qi Yuanshong, Xu Yafen and Liu Yang, "Operation Parameters Optimization of Centrifugal Pumps in Multi-Sources Water Injection System", International Technology and Innovation Conference, 2006.
[9] I. Boiko, X. Sun and E. Tamayo, "Variable-Structure PI Controller for Tank Level Process", American Control Conference, pp. 11-13, 2008.
[10] Ljubomir Francuski and Filip Kulic, "Fuzzy PI Controller for Vector Control of Induction Machine", 9th Symposium on Neural Network Applications in Electrical Engineering, University of Belgrade, pp. 25-27, 2008.
[11] Petro Gogolyuk, Vladyslav Lysiak and Ilya Grinberg, "Influence of Frequency Control Strategies on Induction Motor-Centrifugal Pump Unit and Its Modes", IEEE Conference proceedings, pp. 656-661, 2008.
[12] P. F. Gogolyuk, V. G. Lysiak and V. S. Kostyshyn, "Mathematical Modeling of Steady State Modes of a Load Consisting of Asynchronous Motor and Centrifugal Pump", Collections of Lviv Polytechnic National University: Electric Power and Electromechanical Systems, Vol. 479, pp. 58-67, 2003.
[13] Jae-Sub Ko, Jung-Sik Choi and Dong-Hwa Chung, "Hybrid Artificial Intelligent Control for Speed Control of Induction Motor", SICE-ICASE International Joint Conference, pp. 18-21, 2006.
[14] V. Chitra and R. S. Prabhakar, "Induction Motor Speed Control Using Fuzzy Logic Controller", World Academy of Science, Engineering and Technology 23, 2006.
[15] M. Nasir Uddin, Tawfik S. Radwan and M. Azizur Rahman, "Performances of Fuzzy-Logic-Based Indirect Vector Control for Induction Motor Drive", IEEE Transactions on Industry Applications, Vol. 38, No. 5, 2002.
[16] Haobin Zhou, Bo Long and Binggang Cao, "Vector Control System of Induction Motor Based on Fuzzy Control Method", Modern Applied Science, Vol. 3, No. 4, 2009.
[17] S. P. Patel and K. Subrahmanyam, "Air Conditioning in Textile Mills", Ahmedabad Textile Industries Research Association, Ahmedabad, 1974.
[18] R. Prakasam and A. R. Kalyanaraman, "Energy Conservation in Spinning Mills: Some Aspects in Humidification", Proceedings of the 32nd Joint Technological Conference, SITRA, Coimbatore, 22-23 June 1991, pp. 143-149.
[19] N. J. Shah, "Mathematical Studies in Psychrometry and Application in Textile Humidification", Proceedings of the 34th Joint Technological Conference, ATIRA, Ahmedabad, India, pp. 163-168, 1993.
[20] Daniel A. Dey, "Speed/Torque Considerations for Correct Textile Fiber Spinning Drives", Textile, Fiber and Film Industry Technical Conference, pp. 2/1-2/6, 1993.
[21] L. Pirrello, L. Yliniemi, K. Leiviskä and M. Galluzzo, "Self-Tuning Fuzzy Control for a Rotary Dryer", 15th Triennial World Congress of IFAC, Barcelona, Spain, July 2002.
[22] L. Mokrani and R. Abdessemed, "A Fuzzy Self-Tuning PI Controller for Speed Control of Induction Motor Drive", IEEE Conference proceedings, pp. 785-790, 2003.