IJCSI
International Journal of Computer Science Issues
Volume 7, Issue 3, No 3, May 2010 ISSN (Online): 1694-0784 ISSN (Print): 1694-0814
© IJCSI PUBLICATION www.IJCSI.org
IJCSI Publicity Board 2010
Dr. Borislav D Dimitrov Department of General Practice, Royal College of Surgeons in Ireland Dublin, Ireland
Dr. Vishal Goyal Department of Computer Science, Punjabi University Patiala, India
Mr. Nehinbe Joshua University of Essex Colchester, Essex, UK
Mr. Vassilis Papataxiarhis Department of Informatics and Telecommunications National and Kapodistrian University of Athens, Athens, Greece
EDITORIAL
In this third edition of 2010, we bring forward issues from various dynamic computer science areas, ranging from system performance, computer vision, artificial intelligence, software engineering, multimedia, pattern recognition, information retrieval, databases, security and networking, among others. As always, we thank all our reviewers for providing constructive comments on the papers sent to them for review. This helps enormously in improving the quality of the papers published in this issue. IJCSI will maintain its policy of sending print copies of the journal to all corresponding authors worldwide free of charge. Apart from the availability of the full texts on the journal website, all published papers are deposited in open-access repositories to make access easier and to ensure continuous availability of the proceedings. The transition from the 2nd issue to the 3rd has been marked by an agreement signed between IJCSI and ProQuest and EBSCOHOST, two leading directories, to help in the dissemination of our published papers. We believe that further indexing and wider dissemination will lead to further citations of our authors’ articles. We are pleased to present IJCSI Volume 7, Issue 3, May 2010, split into eleven numbers (IJCSI Vol. 7, Issue 3, No. 3). The acceptance rate for this issue is 37.88%. We wish you happy reading!
IJCSI Editorial Board May 2010 Issue ISSN (Print): 1694-0814 ISSN (Online): 1694-0784 © IJCSI Publications www.IJCSI.org
IJCSI Editorial Board 2010
Dr Tristan Vanrullen Chief Editor LPL, Laboratoire Parole et Langage - CNRS - Aix en Provence, France LABRI, Laboratoire Bordelais de Recherche en Informatique - INRIA - Bordeaux, France LEEE, Laboratoire d'Esthétique et Expérimentations de l'Espace - Université d'Auvergne, France
Dr Constantino Malagón Associate Professor Nebrija University Spain
Dr Lamia Fourati Chaari Associate Professor Multimedia and Informatics Higher Institute in SFAX Tunisia
Dr Mokhtar Beldjehem Professor Sainte-Anne University Halifax, NS, Canada
Dr Pascal Chatonnay Assistant Professor Maître de Conférences Laboratoire d'Informatique de l'Université de Franche-Comté Université de Franche-Comté France
Dr Yee-Ming Chen Professor Department of Industrial Engineering and Management Yuan Ze University Taiwan
Dr Vishal Goyal Assistant Professor Department of Computer Science Punjabi University Patiala, India
Dr Natarajan Meghanathan Assistant Professor REU Program Director Department of Computer Science Jackson State University Jackson, USA
Dr Deepak Laxmi Narasimha Department of Software Engineering, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Dr Navneet Agrawal Assistant Professor Department of ECE, College of Technology & Engineering, MPUAT, Udaipur 313001 Rajasthan, India
Prof N. Jaisankar Assistant Professor School of Computing Sciences, VIT University Vellore, Tamilnadu, India
IJCSI Reviewers Committee 2010 Mr. Markus Schatten, University of Zagreb, Faculty of Organization and Informatics, Croatia Mr. Vassilis Papataxiarhis, Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens, Greece Dr Modestos Stavrakis, University of the Aegean, Greece Dr Fadi KHALIL, LAAS -- CNRS Laboratory, France Dr Dimitar Trajanov, Faculty of Electrical Engineering and Information technologies, ss. Cyril and Methodius Univesity - Skopje, Macedonia Dr Jinping Yuan, College of Information System and Management,National Univ. of Defense Tech., China Dr Alexis Lazanas, Ministry of Education, Greece Dr Stavroula Mougiakakou, University of Bern, ARTORG Center for Biomedical Engineering Research, Switzerland Dr Cyril de Runz, CReSTIC-SIC, IUT de Reims, University of Reims, France Mr. Pramodkumar P. Gupta, Dept of Bioinformatics, Dr D Y Patil University, India Dr Alireza Fereidunian, School of ECE, University of Tehran, Iran Mr. Fred Viezens, Otto-Von-Guericke-University Magdeburg, Germany Dr. Richard G. Bush, Lawrence Technological University, United States Dr. Ola Osunkoya, Information Security Architect, USA Mr. Kotsokostas N.Antonios, TEI Piraeus, Hellas Prof Steven Totosy de Zepetnek, U of Halle-Wittenberg & Purdue U & National Sun Yat-sen U, Germany, USA, Taiwan Mr. M Arif Siddiqui, Najran University, Saudi Arabia Ms. Ilknur Icke, The Graduate Center, City University of New York, USA Prof Miroslav Baca, Faculty of Organization and Informatics, University of Zagreb, Croatia Dr. Elvia Ruiz Beltrán, Instituto Tecnológico de Aguascalientes, Mexico Mr. Moustafa Banbouk, Engineer du Telecom, UAE Mr. Kevin P. Monaghan, Wayne State University, Detroit, Michigan, USA Ms. Moira Stephens, University of Sydney, Australia Ms. Maryam Feily, National Advanced IPv6 Centre of Excellence (NAV6) , Universiti Sains Malaysia (USM), Malaysia Dr. 
Constantine YIALOURIS, Informatics Laboratory Agricultural University of Athens, Greece Mrs. Angeles Abella, U. de Montreal, Canada Dr. Patrizio Arrigo, CNR ISMAC, italy Mr. Anirban Mukhopadhyay, B.P.Poddar Institute of Management & Technology, India Mr. Dinesh Kumar, DAV Institute of Engineering & Technology, India Mr. Jorge L. Hernandez-Ardieta, INDRA SISTEMAS / University Carlos III of Madrid, Spain Mr. AliReza Shahrestani, University of Malaya (UM), National Advanced IPv6 Centre of Excellence (NAv6), Malaysia Mr. Blagoj Ristevski, Faculty of Administration and Information Systems Management - Bitola, Republic of Macedonia Mr. Mauricio Egidio Cantão, Department of Computer Science / University of São Paulo, Brazil Mr. Jules Ruis, Fractal Consultancy, The Netherlands
Mr. Mohammad Iftekhar Husain, University at Buffalo, USA Dr. Deepak Laxmi Narasimha, Department of Software Engineering, Faculty of Computer Science and Information Technology, University of Malaya, Malaysia Dr. Paola Di Maio, DMEM University of Strathclyde, UK Dr. Bhanu Pratap Singh, Institute of Instrumentation Engineering, Kurukshetra University Kurukshetra, India Mr. Sana Ullah, Inha University, South Korea Mr. Cornelis Pieter Pieters, Condast, The Netherlands Dr. Amogh Kavimandan, The MathWorks Inc., USA Dr. Zhinan Zhou, Samsung Telecommunications America, USA Mr. Alberto de Santos Sierra, Universidad Politécnica de Madrid, Spain Dr. Md. Atiqur Rahman Ahad, Department of Applied Physics, Electronics & Communication Engineering (APECE), University of Dhaka, Bangladesh Dr. Charalampos Bratsas, Lab of Medical Informatics, Medical Faculty, Aristotle University, Thessaloniki, Greece Ms. Alexia Dini Kounoudes, Cyprus University of Technology, Cyprus Mr. Anthony Gesase, University of Dar es salaam Computing Centre, Tanzania Dr. Jorge A. Ruiz-Vanoye, Universidad Juárez Autónoma de Tabasco, Mexico Dr. Alejandro Fuentes Penna, Universidad Popular Autónoma del Estado de Puebla, México Dr. Ocotlán Díaz-Parra, Universidad Juárez Autónoma de Tabasco, México Mrs. Nantia Iakovidou, Aristotle University of Thessaloniki, Greece Mr. Vinay Chopra, DAV Institute of Engineering & Technology, Jalandhar Ms. Carmen Lastres, Universidad Politécnica de Madrid - Centre for Smart Environments, Spain Dr. Sanja Lazarova-Molnar, United Arab Emirates University, UAE Mr. Srikrishna Nudurumati, Imaging & Printing Group R&D Hub, Hewlett-Packard, India Dr. Olivier Nocent, CReSTIC/SIC, University of Reims, France Mr. Burak Cizmeci, Isik University, Turkey Dr. Carlos Jaime Barrios Hernandez, LIG (Laboratory Of Informatics of Grenoble), France Mr. Md. Rabiul Islam, Rajshahi university of Engineering & Technology (RUET), Bangladesh Dr. 
LAKHOUA Mohamed Najeh, ISSAT - Laboratory of Analysis and Control of Systems, Tunisia Dr. Alessandro Lavacchi, Department of Chemistry - University of Firenze, Italy Mr. Mungwe, University of Oldenburg, Germany Mr. Somnath Tagore, Dr D Y Patil University, India Ms. Xueqin Wang, ATCS, USA Dr. Borislav D Dimitrov, Department of General Practice, Royal College of Surgeons in Ireland, Dublin, Ireland Dr. Fondjo Fotou Franklin, Langston University, USA Dr. Vishal Goyal, Department of Computer Science, Punjabi University, Patiala, India Mr. Thomas J. Clancy, ACM, United States Dr. Ahmed Nabih Zaki Rashed, Dr. in Electronic Engineering, Faculty of Electronic Engineering, menouf 32951, Electronics and Electrical Communication Engineering Department, Menoufia university, EGYPT, EGYPT Dr. Rushed Kanawati, LIPN, France Mr. Koteshwar Rao, K G Reddy College Of ENGG.&TECH,CHILKUR, RR DIST.,AP, India
Mr. M. Nagesh Kumar, Department of Electronics and Communication, J.S.S. research foundation, Mysore University, Mysore-6, India Dr. Ibrahim Noha, Grenoble Informatics Laboratory, France Mr. Muhammad Yasir Qadri, University of Essex, UK Mr. Annadurai .P, KMCPGS, Lawspet, Pondicherry, India, (Aff. Pondicherry Univeristy, India Mr. E Munivel , CEDTI (Govt. of India), India Dr. Chitra Ganesh Desai, University of Pune, India Mr. Syed, Analytical Services & Materials, Inc., USA Dr. Mashud Kabir, Department of Computer Science, University of Tuebingen, Germany Mrs. Payal N. Raj, Veer South Gujarat University, India Mrs. Priti Maheshwary, Maulana Azad National Institute of Technology, Bhopal, India Mr. Mahesh Goyani, S.P. University, India, India Mr. Vinay Verma, Defence Avionics Research Establishment, DRDO, India Dr. George A. Papakostas, Democritus University of Thrace, Greece Mr. Abhijit Sanjiv Kulkarni, DARE, DRDO, India Mr. Kavi Kumar Khedo, University of Mauritius, Mauritius Dr. B. Sivaselvan, Indian Institute of Information Technology, Design & Manufacturing, Kancheepuram, IIT Madras Campus, India Dr. Partha Pratim Bhattacharya, Greater Kolkata College of Engineering and Management, West Bengal University of Technology, India Mr. Manish Maheshwari, Makhanlal C University of Journalism & Communication, India Dr. Siddhartha Kumar Khaitan, Iowa State University, USA Dr. Mandhapati Raju, General Motors Inc, USA Dr. M.Iqbal Saripan, Universiti Putra Malaysia, Malaysia Mr. Ahmad Shukri Mohd Noor, University Malaysia Terengganu, Malaysia Mr. Selvakuberan K, TATA Consultancy Services, India Dr. Smita Rajpal, Institute of Technology and Management, Gurgaon, India Mr. Rakesh Kachroo, Tata Consultancy Services, India Mr. Raman Kumar, National Institute of Technology, Jalandhar, Punjab., India Mr. Nitesh Sureja, S.P.University, India Dr. M. Emre Celebi, Louisiana State University, Shreveport, USA Dr. Aung Kyaw Oo, Defence Services Academy, Myanmar Mr. Sanjay P. 
Patel, Sankalchand Patel College of Engineering, Visnagar, Gujarat, India Dr. Pascal Fallavollita, Queens University, Canada Mr. Jitendra Agrawal, Rajiv Gandhi Technological University, Bhopal, MP, India Mr. Ismael Rafael Ponce Medellín, Cenidet (Centro Nacional de Investigación y Desarrollo Tecnológico), Mexico Mr. Supheakmungkol SARIN, Waseda University, Japan Mr. Shoukat Ullah, Govt. Post Graduate College Bannu, Pakistan Dr. Vivian Augustine, Telecom Zimbabwe, Zimbabwe Mrs. Mutalli Vatila, Offshore Business Philipines, Philipines Dr. Emanuele Goldoni, University of Pavia, Dept. of Electronics, TLC & Networking Lab, Italy Mr. Pankaj Kumar, SAMA, India Dr. Himanshu Aggarwal, Punjabi University,Patiala, India Dr. Vauvert Guillaume, Europages, France
Prof Yee Ming Chen, Department of Industrial Engineering and Management, Yuan Ze University, Taiwan Dr. Constantino Malagón, Nebrija University, Spain Prof Kanwalvir Singh Dhindsa, B.B.S.B.Engg.College, Fatehgarh Sahib (Punjab), India Mr. Angkoon Phinyomark, Prince of Singkla University, Thailand Ms. Nital H. Mistry, Veer Narmad South Gujarat University, Surat, India Dr. M.R.Sumalatha, Anna University, India Mr. Somesh Kumar Dewangan, Disha Institute of Management and Technology, India Mr. Raman Maini, Punjabi University, Patiala(Punjab)-147002, India Dr. Abdelkader Outtagarts, Alcatel-Lucent Bell-Labs, France Prof Dr. Abdul Wahid, AKG Engg. College, Ghaziabad, India Mr. Prabu Mohandas, Anna University/Adhiyamaan College of Engineering, india Dr. Manish Kumar Jindal, Panjab University Regional Centre, Muktsar, India Prof Mydhili K Nair, M S Ramaiah Institute of Technnology, Bangalore, India Dr. C. Suresh Gnana Dhas, VelTech MultiTech Dr.Rangarajan Dr.Sagunthala Engineering College,Chennai,Tamilnadu, India Prof Akash Rajak, Krishna Institute of Engineering and Technology, Ghaziabad, India Mr. Ajay Kumar Shrivastava, Krishna Institute of Engineering & Technology, Ghaziabad, India Mr. Deo Prakash, SMVD University, Kakryal(J&K), India Dr. Vu Thanh Nguyen, University of Information Technology HoChiMinh City, VietNam Prof Deo Prakash, SMVD University (A Technical University open on I.I.T. Pattern) Kakryal (J&K), India Dr. Navneet Agrawal, Dept. of ECE, College of Technology & Engineering, MPUAT, Udaipur 313001 Rajasthan, India Mr. Sufal Das, Sikkim Manipal Institute of Technology, India Mr. Anil Kumar, Sikkim Manipal Institute of Technology, India Dr. B. Prasanalakshmi, King Saud University, Saudi Arabia. Dr. K D Verma, S.V. (P.G.) College, Aligarh, India Mr. Mohd Nazri Ismail, System and Networking Department, University of Kuala Lumpur (UniKL), Malaysia Dr. Nguyen Tuan Dang, University of Information Technology, Vietnam National University Ho Chi Minh city, Vietnam Dr. 
Abdul Aziz, University of Central Punjab, Pakistan Dr. P. Vasudeva Reddy, Andhra University, India Mrs. Savvas A. Chatzichristofis, Democritus University of Thrace, Greece Mr. Marcio Dorn, Federal University of Rio Grande do Sul - UFRGS Institute of Informatics, Brazil Mr. Luca Mazzola, University of Lugano, Switzerland Mr. Nadeem Mahmood, Department of Computer Science, University of Karachi, Pakistan Mr. Hafeez Ullah Amin, Kohat University of Science & Technology, Pakistan Dr. Professor Vikram Singh, Ch. Devi Lal University, Sirsa (Haryana), India Mr. M. Azath, Calicut/Mets School of Enginerring, India Dr. J. Hanumanthappa, DoS in CS, University of Mysore, India Dr. Shahanawaj Ahamad, Department of Computer Science, King Saud University, Saudi Arabia Dr. K. Duraiswamy, K. S. Rangasamy College of Technology, India Prof. Dr Mazlina Esa, Universiti Teknologi Malaysia, Malaysia
Dr. P. Vasant, Power Control Optimization (Global), Malaysia Dr. Taner Tuncer, Firat University, Turkey Dr. Norrozila Sulaiman, University Malaysia Pahang, Malaysia Prof. S K Gupta, BCET, Guradspur, India Dr. Latha Parameswaran, Amrita Vishwa Vidyapeetham, India Mr. M. Azath, Anna University, India Dr. P. Suresh Varma, Adikavi Nannaya University, India Prof. V. N. Kamalesh, JSS Academy of Technical Education, India Dr. D Gunaseelan, Ibri College of Technology, Oman Mr. Sanjay Kumar Anand, CDAC, India Mr. Akshat Verma, CDAC, India Mrs. Fazeela Tunnisa, Najran University, Kingdom of Saudi Arabia Mr. Hasan Asil, Islamic Azad University Tabriz Branch (Azarshahr), Iran Prof. Dr Sajal Kabiraj, Fr. C Rodrigues Institute of Management Studies (Affiliated to University of Mumbai, India), India Mr. Syed Fawad Mustafa, GAC Center, Shandong University, China Dr. Natarajan Meghanathan, Jackson State University, Jackson, MS, USA Prof. Selvakani Kandeeban, Francis Xavier Engineering College, India Mr. Tohid Sedghi, Urmia University, Iran Dr. S. Sasikumar, PSNA College of Engg and Tech, Dindigul, India Dr. Anupam Shukla, Indian Institute of Information Technology and Management Gwalior, India Mr. Rahul Kala, Indian Institute of Inforamtion Technology and Management Gwalior, India Dr. A V Nikolov, National University of Lesotho, Lesotho Mr. Kamal Sarkar, Department of Computer Science and Engineering, Jadavpur University, India Dr. Mokhled S. AlTarawneh, Computer Engineering Dept., Faculty of Engineering, Mutah University, Jordan, Jordan Prof. Sattar J Aboud, Iraqi Council of Representatives, Iraq-Baghdad Dr. Prasant Kumar Pattnaik, Department of CSE, KIST, India Dr. Mohammed Amoon, King Saud University, Saudi Arabia Dr. Tsvetanka Georgieva, Department of Information Technologies, St. Cyril and St. Methodius University of Veliko Tarnovo, Bulgaria Dr. Eva Volna, University of Ostrava, Czech Republic Mr. Ujjal Marjit, University of Kalyani, West-Bengal, India Dr. 
Prasant Kumar Pattnaik, KIST,Bhubaneswar,India, India Dr. Guezouri Mustapha, Department of Electronics, Faculty of Electrical Engineering, University of Science and Technology (USTO), Oran, Algeria Mr. Maniyar Shiraz Ahmed, Najran University, Najran, Saudi Arabia Dr. Sreedhar Reddy, JNTU, SSIETW, Hyderabad, India Mr. Bala Dhandayuthapani Veerasamy, Mekelle University, Ethiopa Mr. Arash Habibi Lashkari, University of Malaya (UM), Malaysia Mr. Rajesh Prasad, LDC Institute of Technical Studies, Allahabad, India Ms. Habib Izadkhah, Tabriz University, Iran Dr. Lokesh Kumar Sharma, Chhattisgarh Swami Vivekanand Technical University Bhilai, India Mr. Kuldeep Yadav, IIIT Delhi, India Dr. Naoufel Kraiem, Institut Superieur d'Informatique, Tunisia
Prof. Frank Ortmeier, Otto-von-Guericke-Universitaet Magdeburg, Germany Mr. Ashraf Aljammal, USM, Malaysia Mrs. Amandeep Kaur, Department of Computer Science, Punjabi University, Patiala, Punjab, India Mr. Babak Basharirad, University Technology of Malaysia, Malaysia Mr. Avinash singh, Kiet Ghaziabad, India Dr. Miguel Vargas-Lombardo, Technological University of Panama, Panama Dr. Tuncay Sevindik, Firat University, Turkey Ms. Pavai Kandavelu, Anna University Chennai, India Mr. Ravish Khichar, Global Institute of Technology, India Mr Aos Alaa Zaidan Ansaef, Multimedia University, Cyberjaya, Malaysia Dr. Awadhesh Kumar Sharma, Dept. of CSE, MMM Engg College, Gorakhpur-273010, UP, India Mr. Qasim Siddique, FUIEMS, Pakistan Dr. Le Hoang Thai, University of Science, Vietnam National University - Ho Chi Minh City, Vietnam Dr. Saravanan C, NIT, Durgapur, India Dr. Vijay Kumar Mago, DAV College, Jalandhar, India Dr. Do Van Nhon, University of Information Technology, Vietnam Mr. Georgios Kioumourtzis, University of Patras, Greece Mr. Amol D.Potgantwar, SITRC Nasik, India Mr. Lesedi Melton Masisi, Council for Scientific and Industrial Research, South Africa Dr. Karthik.S, Department of Computer Science & Engineering, SNS College of Technology, India Mr. Nafiz Imtiaz Bin Hamid, Department of Electrical and Electronic Engineering, Islamic University of Technology (IUT), Bangladesh Mr. Muhammad Imran Khan, Universiti Teknologi PETRONAS, Malaysia Dr. Abdul Kareem M. Radhi, Information Engineering - Nahrin University, Iraq Dr. Mohd Nazri Ismail, University of Kuala Lumpur, Malaysia Dr. Manuj Darbari, BBDNITM, Institute of Technology, A-649, Indira Nagar, Lucknow 226016, India Ms. Izerrouken, INP-IRIT, France Mr. Nitin Ashokrao Naik, Dept. of Computer Science, Yeshwant Mahavidyalaya, Nanded, India Mr. Nikhil Raj, National Institute of Technology, Kurukshetra, India Prof. Maher Ben Jemaa, National School of Engineers of Sfax, Tunisia Prof. 
Rajeshwar Singh, BRCM College of Engineering and Technology, Bahal Bhiwani, Haryana, India Mr. Gaurav Kumar, Department of Computer Applications, Chitkara Institute of Engineering and Technology, Rajpura, Punjab, India Mr. Ajeet Kumar Pandey, Indian Institute of Technology, Kharagpur, India Mr. Rajiv Phougat, IBM Corporation, USA Mrs. Aysha V, College of Applied Science Pattuvam affiliated with Kannur University, India Dr. Debotosh Bhattacharjee, Department of Computer Science and Engineering, Jadavpur University, Kolkata-700032, India Dr. Neelam Srivastava, Institute of engineering & Technology, Lucknow, India Prof. Sweta Verma, Galgotia's College of Engineering & Technology, Greater Noida, India Mr. Harminder Singh BIndra, MIMIT, INDIA Dr. Lokesh Kumar Sharma, Chhattisgarh Swami Vivekanand Technical University, Bhilai, India Mr. Tarun Kumar, U.P. Technical University/Radha Govinend Engg. College, India Mr. Tirthraj Rai, Jawahar Lal Nehru University, New Delhi, India
Mr. Akhilesh Tiwari, Madhav Institute of Technology & Science, India Mr. Dakshina Ranjan Kisku, Dr. B. C. Roy Engineering College, WBUT, India Ms. Anu Suneja, Maharshi Markandeshwar University, Mullana, Haryana, India Mr. Munish Kumar Jindal, Punjabi University Regional Centre, Jaito (Faridkot), India Dr. Ashraf Bany Mohammed, Management Information Systems Department, Faculty of Administrative and Financial Sciences, Petra University, Jordan Mrs. Jyoti Jain, R.G.P.V. Bhopal, India Dr. Lamia Chaari, SFAX University, Tunisia Mr. Akhter Raza Syed, Department of Computer Science, University of Karachi, Pakistan Prof. Khubaib Ahmed Qureshi, Information Technology Department, HIMS, Hamdard University, Pakistan Prof. Boubker Sbihi, Ecole des Sciences de L'Information, Morocco Dr. S. M. Riazul Islam, Inha University, South Korea Prof. Lokhande S.N., S.R.T.M.University, Nanded (MH), India Dr. Vijay H Mankar, Dept. of Electronics, Govt. Polytechnic, Nagpur, India Dr. M. Sreedhar Reddy, JNTU, Hyderabad, SSIETW, India Mr. Ojesanmi Olusegun, Ajayi Crowther University, Oyo, Nigeria Ms. Mamta Juneja, RBIEBT, PTU, India Dr. Ekta Walia Bhullar, Maharishi Markandeshwar University, Mullana Ambala (Haryana), India Prof. Chandra Mohan, John Bosco Engineering College, India Mr. Nitin A. Naik, Yeshwant Mahavidyalaya, Nanded, India Mr. Sunil Kashibarao Nayak, Bahirji Smarak Mahavidyalaya, Basmathnagar Dist-Hingoli., India Prof. Rakesh.L, Vijetha Institute of Technology, Bangalore, India Mr B. M. Patil, Indian Institute of Technology, Roorkee, Uttarakhand, India Mr. Thipendra Pal Singh, Sharda University, K.P. III, Greater Noida, Uttar Pradesh, India Prof. Chandra Mohan, John Bosco Engg College, India Mr. Hadi Saboohi, University of Malaya - Faculty of Computer Science and Information Technology, Malaysia Dr. R. Baskaran, Anna University, India Dr. Wichian Sittiprapaporn, Mahasarakham University College of Music, Thailand Mr. Lai Khin Wee, Universiti Teknologi Malaysia, Malaysia Dr. 
Kamaljit I. Lakhtaria, Atmiya Institute of Technology, India Mrs. Inderpreet Kaur, PTU, Jalandhar, India Mr. Iqbaldeep Kaur, PTU / RBIEBT, India Mrs. Vasudha Bahl, Maharaja Agrasen Institute of Technology, Delhi, India Prof. Vinay Uttamrao Kale, P.R.M. Institute of Technology & Research, Badnera, Amravati, Maharashtra, India Mr. Suhas J Manangi, Microsoft, India Ms. Anna Kuzio, Adam Mickiewicz University, School of English, Poland Dr. Debojyoti Mitra, Sir Padampat Singhania University, India Prof. Rachit Garg, Department of Computer Science, L K College, India Mrs. Manjula K A, Kannur University, India Mr. Rakesh Kumar, Indian Institute of Technology Roorkee, India
TABLE OF CONTENTS
1. Fault-tolerant Mobile Agent-based Monitoring Mechanism for Highly Dynamic Distributed Networks Jinho Ahn
2. Multi Objective AODV Based On a Realistic Mobility Model Hamideh Babaei, Morteza Romoozi
3. Qualitative analysis of periodically forced nonlinear oscillators responses and stability areas in the vicinity of bifurcation cascade Nizar Jabli, Hedi Khammari, Mohamed Faouzi Mimouni
4. Extended Diffie-Hellman Technique to Generate Multiple Shared Keys at a Time with Reduced KEOs and its Polynomial Time Complexity Nistala V.E.S. Murthy, Vankamamidi S. Naresh
5. An Effective Technique for Clustering Incremental Gene Expression data Sauravjyoti Sarmah, Dhruba K. Bhattacharyya
6. A Nonblocking Coordinated Checkpointing Algorithm for Mobile Computing Systems Rachit Garg, Praveen Kumar
7. Testing the relational Database Mitu Dhull, Archana Sharma
IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 3, No 3, May 2010 ISSN (Online): 1694-0784 ISSN (Print): 1694-0814
Fault-tolerant Mobile Agent-based Monitoring Mechanism for Highly Dynamic Distributed Networks
Jinho Ahn
Dept. of Computer Science, College of Natural Science, Kyonggi University, Suwon, Gyeonggi-do 443-760, Republic of Korea
Abstract
Thanks to the asynchronous and dynamic nature of mobile agents, a number of mobile agent-based monitoring mechanisms have been developed to monitor large-scale, dynamic distributed networked systems adaptively and efficiently. Some of these mechanisms attempt to adapt to dynamic changes in aspects such as network traffic patterns, resource addition and deletion, and network topology. However, failures of domain managers critically affect the ability to provide correct, real-time and efficient monitoring in a large-scale mobile agent-based distributed monitoring system. In this paper, we present a novel fault-tolerance mechanism with the following features, which suit large-scale and dynamic hierarchical mobile agent-based monitoring organizations. It supports fast failure detection with low failure-free overhead by having each domain manager transmit heart-beat messages to its immediate higher-level manager. It also minimizes the number of non-faulty monitoring managers affected by failures of domain managers. Moreover, it allows consistent failure detection actions to be performed continuously during agent creation, migration and termination, and it can execute consistent takeover actions even when domain managers fail concurrently.
Keywords: Distributed Network, Fault-tolerance, Mobile Agent, Scalability, Takeover.
1. Introduction
Recently, as the number of users of distributed systems and networks has increased considerably along with the complexity of their services and policies, system administrators have attempted to ensure the quality of service each user requires by maximizing the utilization of system resources [5]. To achieve this goal, correct, real-time and efficient management and monitoring mechanisms are essential. However, as the infrastructures of these systems rapidly scale up, a larger number of managed nodes and resources produce a huge amount of monitoring information, and so the complexity of the network monitoring function becomes extremely high [1]. In addition, the systems contain heterogeneous network environments that need to be monitored, and the managed resources are increasingly dynamic rather than static, which makes traditional static centralized and distributed monitoring mechanisms unsuitable for these systems [10]. Thus, mobile agent-based monitoring mechanisms have been actively developed to monitor these large-scale and dynamic distributed networked systems adaptively and efficiently.

A mobile agent is an autonomous and independent software program that pursues its user’s goal on the user’s behalf while visiting various target nodes through a network [3]. Mobile agent technology has several advantages, such as reducing network traffic, overcoming network delay, enabling asynchronous execution and enhancing dynamic adaptability. Thanks to these features, the technology is widely used in distributed systems, especially for network management. In a network management system, each mobile agent is generally designed to move to one or more agent-executable nodes in a network, sense other nodes and resources temporarily or permanently, and filter and deliver the received management information to the appropriate network management nodes [10].

Previous mobile agent-based monitoring mechanisms are classified into centralized and hierarchical distributed monitoring mechanisms. Most of them are based on the centralized monitoring model and fall into two categories: single mobile agent-based and segment-based mechanisms. In the first [11], a single management station creates a mobile agent and lets it visit the required nodes sequentially in a particular order. This mechanism is simple to implement, but the task completion time of a mobile agent becomes too long in large-scale distributed systems because the number of nodes to visit increases significantly and the size of the agent may grow considerably.
In particular, if the visited nodes are interconnected through low-bandwidth links, the round-trip delay may increase sharply. The second, segment-based mechanism [2] partitions a network into several sub-networks or domains, and creates and transfers a mobile agent to each domain. The collection and filtering of management information for monitored nodes can therefore be performed in parallel per domain, which addresses the scalability problem of the first mechanism to a certain extent. However, in this mechanism a single manager must execute the entire monitoring function and may become the performance bottleneck of the whole system. In addition, if the agent migration network includes expensive low-bandwidth links, it is very difficult to obtain and filter the monitoring information in real time.

To solve the scalability problem, mobile agent-based mechanisms using a hierarchical monitoring structure [6, 7] were proposed. They allow a network to be partitioned into a set of hierarchically organized domains and deploy a new monitoring agent to each domain. In this hierarchy, a main manager sits at the top level (level 1) and delegates monitoring tasks, together with monitoring agents, to the lower-level domain managers. Each manager clones a monitoring agent and dispatches it to the appropriate domain manager node, taking into account the redistribution of monitoring load. Each domain manager then collects management information from its lower-level managers, filters it, and delivers the processed information to its higher-level manager.

The original hierarchical monitoring mechanisms were mostly based on a static manager organization model. In other words, each network administrator configures a tree of network domains according to an initial monitoring policy, and the main manager at the root domain then creates monitoring manager agents and migrates them to other domains. However, when dynamic changes occur in aspects such as network traffic patterns, resource addition and deletion, or network topology, such a mechanism cannot adapt to them, and the overall management performance degrades significantly.
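
The delegation and upward filtering described above can be pictured with a small tree of managers. The following is a minimal sketch, not the mechanisms of [6, 7]: the `Manager` class, its `delegate` and `report` methods, the load readings and the threshold filter are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Manager:
    """Hypothetical domain manager in a monitoring hierarchy (level 1 = main manager)."""
    name: str
    level: int
    local: Dict[str, float] = field(default_factory=dict)    # readings from managed nodes
    children: List["Manager"] = field(default_factory=list)  # lower-level managers

    def delegate(self, name: str) -> "Manager":
        # Create a lower-level manager one level below this one.
        child = Manager(name=name, level=self.level + 1)
        self.children.append(child)
        return child

    def report(self, threshold: float) -> Dict[str, float]:
        # Collect from lower-level managers, then filter before passing upward.
        merged = dict(self.local)
        for child in self.children:
            merged.update(child.report(threshold))
        # Assumed filter: keep only readings at or above the threshold.
        return {node: load for node, load in merged.items() if load >= threshold}


main = Manager("main", level=1)
d1 = main.delegate("d1")
d2 = main.delegate("d2")
d21 = d2.delegate("d21")
d1.local = {"n1": 0.9, "n2": 0.2}
d2.local = {"n3": 0.7}
d21.local = {"n4": 0.1, "n5": 0.95}
summary = main.report(threshold=0.5)
```

In this sketch each manager forwards only filtered readings, so the main manager sees a reduced view rather than every raw measurement, which is the load-sharing idea behind the hierarchical structure.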
Some adaptive mobile agent-based mechanisms [8] have been presented to address this issue. In these mechanisms, if a domain manager at level i estimates at run-time that additional monitoring capability is needed, it creates and installs a new manager agent on an appropriate node at level i+1, or migrates to another node to keep its network monitoring location optimal. However, even assuming that the main manager can be made reliable using replication-based fault-tolerance mechanisms, failures of domain managers critically affect the ability to provide correct, real-time and efficient monitoring in a large-scale mobile agent-based distributed monitoring system. To the best of our knowledge, the fault-tolerance mechanism proposed in [13] is the only one that addresses this issue. In that mechanism, however, every agent must periodically send heart-beat messages to global failure detection agents (GFDAs). If a GFDA receives no heart-beat message from an agent for a predefined number of consecutive timeout intervals, it generates and delivers an AgentFailure message to a global recovery agent (GRA). The GRA then recreates a new agent based on its most recent configuration information and redeploys it to the appropriate target host. This behavior results in high failure-free overhead because failure detection is centralized in a single point within a large-scale hierarchical monitoring organization. Additionally, the takeover procedure performed by GRAs is poorly suited to maintaining a tree-like manager structure efficiently. Finally, the mechanism includes no concrete method for correctly detecting failures of manager agents during agent creation, migration and termination triggered by dynamic changes in a network.

This paper proposes a novel fault-tolerance mechanism with the following features, which suit large-scale and dynamic hierarchical mobile agent-based monitoring organizations:
• It supports fast failure detection with low failure-free overhead by having each domain manager periodically transmit heart-beat messages to its immediate higher-level manager.
• It minimizes the number of non-faulty monitoring managers affected by failures of domain managers.
• It enables consistent failure detection actions to be performed continuously during agent creation, migration and termination.
• It can execute consistent takeover actions even when domain managers fail concurrently.

The remainder of this paper is organized as follows. Sections 2 and 3 describe the proposed mechanism both conceptually and algorithmically and show its correctness proof. Section 4 compares the proposed mechanism with existing ones in detail, and Section 5 concludes the paper.
2. The Proposed Mechanism
In the following subsections, the data structures and algorithms of the proposed mechanism are described informally.
2.1 Data structures
Every domain monitoring manager α has to keep the following four variables.
• AIDα: the agent identifier of domain manager α.
• MMaddrα: the main manager's identifier, needed when domain manager α is created or the organization of its lower-level managers changes.
• IHMaddrα: the identifier of the immediate higher-level manager of domain manager α.
• ptrα: the root node of a tree that saves the identifier and timer of every lower-level manager of main or domain monitoring manager α. Each node is a tuple (aid, tinterval, ptr). The timer tinterval for a lower-level manager aid lets monitoring manager α detect whether aid is alive or has failed, and is initialized to τ. The field ptr for a lower-level manager aid is the next-level node, which maintains references to all lower-level managers of domain manager aid in a hierarchical manner.
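The variables listed above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class names `LowerEntry` and `ManagerState`, the helper `descendants`, and the numeric value chosen for τ are hypothetical and do not come from the paper, which specifies only the tuple (aid, tinterval, ptr) and the four per-manager variables.

```python
from dataclasses import dataclass, field
from typing import List

TAU = 3  # assumed initial timer value (tau); the paper does not fix a number


@dataclass
class LowerEntry:
    """One tree node (aid, tinterval, ptr) describing a lower-level manager."""
    aid: str
    tinterval: int = TAU
    ptr: List["LowerEntry"] = field(default_factory=list)


@dataclass
class ManagerState:
    """The four variables kept by a monitoring manager."""
    aid: str        # AID: own agent identifier
    mm_addr: str    # MMaddr: identifier of the main manager
    ihm_addr: str   # IHMaddr: identifier of the immediate higher-level manager
    ptr: List[LowerEntry] = field(default_factory=list)


def descendants(entries):
    """Return the identifiers of all managers below, depth first."""
    found = []
    for entry in entries:
        found.append(entry.aid)
        found.extend(descendants(entry.ptr))
    return found


dm_x = ManagerState(aid="DMx", mm_addr="MM", ihm_addr="DMw")
dm_x.ptr.append(LowerEntry("DMy", ptr=[LowerEntry("DMz")]))
dm_x.ptr.append(LowerEntry("DMa"))
print(descendants(dm_x.ptr))  # ['DMy', 'DMz', 'DMa']
```

Keeping the whole subtree at each manager is what lets a higher-level manager adopt the children of a failed manager without asking the failed node for them.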
2.2 Informal Description
Every domain manager α periodically transmits heart-beat messages only to its immediate higher-level manager IHMaddrα. Each monitoring manager therefore learns from these periodic notifications which of its immediate lower-level managers are alive and which have failed. In our mechanism, manager α decrements the timer tinterval of each immediate lower-level manager aid in ptrα by one at a fixed time interval. If α has not received any heart-beat message from the lower-level manager by the time the timer expires, it suspects that the lower-level manager has crashed. Because it exploits the tree-like organization of monitoring managers, this behavior keeps the failure-free overhead of failure detection low.
If a monitoring manager determines that a new immediate lower-level manager is needed for effective monitoring, it creates a new mobile agent for this purpose, as in figure 1. In this figure, manager DMx spawns a monitoring agent and transfers it to a new node DMz. The agent starts its monitoring task and notifies all nodes on the path between the main manager MM and itself of its location.
Fig.1 In case of another DM being required.
When a manager finds that it can no longer play its role well enough to guarantee the required monitoring performance, it is voluntarily replaced by agent migration, as in figure 2. In this figure, after manager DMz has made this decision, it finds an appropriate substitute node DMα and forces its agent to migrate to the substitute, where the agent resumes its monitoring task.
Fig.2 In case of DM replacement by agent migration.
If a manager detects that some of its immediate lower-level managers have failed, it activates our takeover procedure.
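The timer-based detection described in this subsection can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class `HeartbeatMonitor`, its method names and the value of τ are assumptions, and message delivery is replaced by direct method calls.

```python
TAU = 3  # assumed initial timer value (tau)


class HeartbeatMonitor:
    """Timers a manager keeps for its immediate lower-level managers."""

    def __init__(self, lower_aids):
        self.timers = {aid: TAU for aid in lower_aids}

    def on_heartbeat(self, aid):
        """Reset the timer when a heart-beat message from aid arrives."""
        if aid in self.timers:
            self.timers[aid] = TAU

    def tick(self):
        """Decrement every timer by one; return and forget suspected managers."""
        failed = []
        for aid in list(self.timers):
            self.timers[aid] -= 1
            if self.timers[aid] <= 0:
                failed.append(aid)
                del self.timers[aid]
        return failed


monitor = HeartbeatMonitor(["DMy", "DMz"])
suspected = []
for _ in range(TAU):
    monitor.on_heartbeat("DMy")  # DMy keeps reporting, DMz stays silent
    suspected += monitor.tick()
print(suspected)  # ['DMz']
```

Each manager watches only its own children, so the number of heart-beat messages per interval equals the number of tree edges, whatever the depth of the hierarchy.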
At this point, one of three cases arises, depending on the availability and capability of nodes. First, as in figure 3, when MM recognizes DMw's failure and a new node DMγ is a suitable substitute, the main manager creates a new monitoring agent with the same role and transfers it to node DMγ. DMγ then performs the monitoring function that the failed node DMw executed and informs DMw's immediate lower-level managers, e.g., DMx, of this replacement.
Fig.3 In case of a new DM taking over failed DM's task.
Second, when manager DMγ identifies the failure of its next-level manager DMx in figure 3 and no new node is available to replace the failed one, it checks whether one of DMx's immediate lower-level managers, DMy and DMα, is a proper replacement for DMx. If DMγ determines that DMy is suitable for the role, it allows DMy to take over DMx's task, as in figure 4. In this case, DMy notifies DMx's other immediate lower-level managers of this substitution and updates its location on all nodes on the path between the main manager MM and DMx.
Fig.4 In case of an existing DM taking over failed DM's task.
Third, when neither a new node nor a lower-level manager can substitute for the failed DMx in figure 3, DMx's immediate higher-level manager DMγ takes over DMx's role in addition to its own task, as in figure 5. The mechanism also performs a consistent takeover procedure when domain managers fail concurrently. The failure detection and takeover procedures for a main or domain manager Self are given algorithmically in figures 6 and 7.
3. Correctness Proof
This section presents theorems 1 and 2, which establish the safety and the liveness of the proposed mechanism, respectively.
Theorem 1. Even if multiple domain managers crash concurrently, our mechanism enables other live managers to monitor all the network elements previously managed by the failed ones.
Proof: Suppose that the entire distributed monitoring system consists of a finite set N of n monitoring managers, and let SCDM denote the set of all crashed domain managers. The proof proceeds by induction on the number of crashed domain managers, |SCDM| (|SCDM| < n).
[Base case] |SCDM| = 1, so there is only one crashed domain manager DMx. The following three cases should be considered.
Case 1: there is a new available domain manager DMγ capable of taking over DMx. In this case, after detecting DMx's failure, the immediate higher-level manager of DMx creates a new monitoring agent with the same role and transfers it to node DMγ. DMγ then performs the monitoring function that the failed node DMx executed, informs DMx's immediate lower-level managers of this replacement, and updates its location on all nodes on the path between the main manager and DMx.
Case 2: among DMx's immediate lower-level managers there is a proper substitute DMγ for DMx. In this case, DMx's immediate higher-level manager allows DMγ to take over DMx's task; DMγ notifies DMx's other immediate lower-level managers of this substitution and updates its location on all nodes on the path between the main manager and DMx.
Case 3: neither a new node nor a lower-level manager can substitute for DMx. In this case, DMx's immediate higher-level manager DMγ takes over DMx's role in addition to its own task. The subsequent procedure is the same as in case 2.
[Induction hypothesis] Assume that the theorem holds when |SCDM| = k.
[Induction step] By the hypothesis, the first k crashed managers have been taken over, so the theorem holds for |SCDM| = k+1 provided that the (k+1)-th crashed domain manager (k+1 < n) can be taken over by some other live manager. This follows from the same three cases as in the base case.
By induction, even after |SCDM| concurrent domain manager failures occur, our mechanism allows their monitoring functions to be taken over by surviving managers.
Theorem 2. Our mechanism terminates within a finite time.
Proof: As no more than |SCDM| (|SCDM| < n) domain manager crashes occur, the proposed mechanism has to re-execute its takeover procedure at most |SCDM| times, as explained in theorem 1. Thus, the mechanism terminates within a finite time.

Fig.5 In case of the immediate higher-level DM taking over failed DMs.

procedure NOTIF_AGENTALIVEMSG()
    send a message Update_AgentTInterval(AIDSelf) to IHMaddrSelf ;

procedure UPDATE_AGENTTINTERVAL(AID)
    for all e in ptrSelf do
        if (e.aid = AID) then
            e.tinterval ← τ ;
            return ;

procedure MIGRATE_AGENTTONEWNODE(nmngr)
    invoke MNGR_TAKEOVER(AIDSelf) on nmngr ;
    send a message Change_IHigherLevelMngr(IHMaddrSelf) to nmngr ;

procedure CHECK_AGENTLIVENESS()
    failedMngrs ← invoke DECR_TINTERVAL() on Self ;
    for all fmngr in failedMngrs do
        if (there is a new node nmngr as an appropriate substitute for fmngr) then
            invoke MNGR_TAKEOVER(fmngr) on nmngr ;
            send a message Change_IHigherLevelMngr(AIDSelf) to nmngr ;
        else if (there is a suitable substitute lmngr for fmngr among its immediate lower-level managers) then
            invoke MNGR_TAKEOVER(fmngr) on lmngr ;
            send a message Change_IHigherLevelMngr(AIDSelf) to lmngr ;
        else
            invoke MNGR_TAKEOVER(fmngr) on Self ;

Fig.6 Failure detection and takeover procedures for manager Self.

procedure DECR_TINTERVAL()
    failedMngrs ← ∅ ;
    for all e in ptrSelf do
        e.tinterval ← e.tinterval - 1 ;
        if (e.tinterval = 0) then
            failedMngrs ← failedMngrs ∪ {e} ;
    ptrSelf ← ptrSelf - failedMngrs ;
    return failedMngrs ;

procedure MNGR_TAKEOVER(fmngr)
    for all e in fmngr.ptr do
        ptrSelf ← ptrSelf ∪ {(e.aid, τ, e.ptr)} ;
        send a message Change_IHigherLevelMngr(AIDSelf) to e.aid ;
    send a message Change_TreeTopologyAtMMngr(AIDSelf, ptrSelf) to MMaddrSelf ;

procedure CHANGE_TREETOPOLOGYATMMNGR(AID, ptr)
    find a path mngrs to AID in ptrSelf ;
    find a node e in ptrSelf st (e.aid = AID) ;
    e.ptr ← ptr ;
    NLMngr ← the first element e in mngrs ;
    mngrs ← mngrs - {e} ;
    send a message Change_TreeTopology(mngrs, AID, ptr) to NLMngr ;

procedure CHANGE_TREETOPOLOGY(mngrs, AID, ptr)
    find a node e in ptrSelf st (e.aid = AID) ;
    e.ptr ← ptr ;
    if (mngrs ≠ ∅) then
        NLMngr ← the first element e in mngrs ;
        mngrs ← mngrs - {e} ;
        send a message Change_TreeTopology(mngrs, AID, ptr) to NLMngr ;

procedure CHANGE_IHIGHERLEVELMNGR(AID)
    IHMaddrSelf ← AID ;

Fig.7 Failure detection and takeover procedures for Self (continued).
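The three takeover cases used in the proof of theorem 1 and in procedure CHECK_AGENTLIVENESS can be sketched in Python. This is a sketch under stated assumptions: the function names, the `is_suitable` predicate and the dictionary of timers are hypothetical, since the paper leaves the suitability criterion open, and messages to other managers are omitted.

```python
def choose_successor(new_nodes, lower_managers, self_aid, is_suitable):
    """Pick who takes over a failed manager, trying the three cases in order.

    new_nodes, lower_managers: candidate identifiers (hypothetical inputs).
    is_suitable: caller-supplied predicate standing in for the paper's
    unspecified notion of an "appropriate substitute".
    """
    for aid in new_nodes:        # case 1: a new node is available
        if is_suitable(aid):
            return ("new-node", aid)
    for aid in lower_managers:   # case 2: a child of the failed manager
        if is_suitable(aid):
            return ("lower-level", aid)
    return ("self", self_aid)    # case 3: the detecting manager itself


def adopt_children(successor_aid, own_timers, failed_children, tau=3):
    """Give the successor fresh timers for the failed manager's children."""
    adopted = dict(own_timers)
    for aid in failed_children:
        if aid != successor_aid:  # a promoted child does not monitor itself
            adopted[aid] = tau
    return adopted


case, successor = choose_successor([], ["DMy", "DMa"], "DMg",
                                   lambda aid: aid == "DMy")
print(case, successor)  # lower-level DMy
print(adopt_children(successor, {"DMz": 2}, ["DMy", "DMa"]))
# {'DMz': 2, 'DMa': 3}
```

Because the last branch always returns the detecting manager, some live manager is chosen for every failure, which is the property the base case relies on.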
4. Comparisons
Most monitoring systems using mobile agents were developed on a flat network infrastructure. The single agent-based monitoring system proposed in [11] creates a single agent on the network manager and has it perform its monitoring task according to an itinerary consisting of its target nodes. It is simple to implement but not scalable: in large distributed networks, the agent's round-trip delay may increase significantly, especially under frequent polling, and the agent may grow considerably while visiting its target nodes. Corradi et al. [2] presented a segment-based monitoring mechanism that partitions a network into a set of subnetworks, or domains, and transfers a mobile agent to each domain. Compared with the single agent-based mechanism [11], it can greatly reduce the overall monitoring response time by collecting and filtering management data per domain in parallel. Gavalas et al. [4] proposed a broadcast-based monitoring mechanism, in which the network manager instantiates a mobile agent for each managed node and migrates it there. After the agent collects and analyzes the network traffic information of the corresponding node, it returns to the network manager platform with the requested information. This mechanism thus maximizes the parallelism of the monitoring process and achieves a short response time. However, as the number of managed network elements or resources grows very large, a large number of mobile agents are required. Broadcasting them may incur high agent movement overhead and so degrade the performance of the entire system considerably.
None of the mechanisms stated above fundamentally overcomes the scalability limitation, because of their centralized nature. The problem becomes worse if expensive, low-bandwidth links are included in the routing paths of agents. Liotta et al. [7] introduced a scalable multi-level monitoring mechanism based on the concept of Management by Delegation (MbD) [6]. This mechanism partitions a networked system into several domains composing a hierarchical structure and deploys a mobile agent to each of them. A distributed Java agent-based monitoring system, JAMM, was proposed for grid computing in [12]. This system triggers the execution of monitoring sensors based on actual client usage. Clients can control remote sensors and obtain the requested information from them in the form of events. In [9], a multi-agent based distributed monitoring system composed of dynamically controllable agents is implemented. The system is structured in three layers to support independence among communication protocols, message interpretation and monitoring tasks.
This independence among the three layers may reduce agent development time and make distributed systems easier to manage. However, these three mechanisms [7, 9, 12] cannot adapt autonomously to dynamic changes such as variations in network traffic patterns, resource addition and deletion, and changes of network topology, because their structure of monitoring managers is static after the initial agent deployment. In [8], an adaptive and hierarchical mobile agent-based monitoring mechanism was presented to address these problems. In this mechanism, each middle-level manager agent is not bound to a particular network node; it can sense the network, then find and move to better locations in pursuit of monitoring location optimality. However, none of the hierarchical mobile agent-based mechanisms stated so far addresses failure detection and recovery for a hierarchy of monitoring managers.
Tripathi et al. [13] presented a mobile agent-based distributed monitoring system supporting autonomic configuration and recovery. In this system, several global failure detection agents subscribe to heart-beat events from all monitoring agents, so every monitoring agent periodically sends its heart-beat message to each global failure detection agent. If a monitoring agent fails, one of the global recovery agents executes the following recovery procedure: it re-instantiates the monitoring agent from its most recent configuration information and relaunches it on an available node. In large-scale networks, this centralized failure detection procedure results in high failure-free overhead. Additionally, the takeover procedure of the global recovery agents is ill-suited to maintaining a tree-like monitoring manager structure efficiently. Finally, this system presents no concrete mechanism for keeping the hierarchical structure of monitoring managers adaptable, so that failure detection stays correct and efficient when agents are created, migrated and destroyed in response to dynamic changes in the network.
5. Conclusions
This paper presented a novel fault-tolerance mechanism with the following advantageous features for large-scale and dynamic hierarchical mobile agent-based monitoring organizations. It supports fast failure detection with low failure-free overhead, because each domain manager transmits heart-beat messages only to its immediate higher-level manager. It also minimizes the number of non-faulty monitoring managers affected by failures of domain managers. Moreover, it allows consistent failure detection to continue while agents are created, migrated and terminated, and it executes consistent takeover actions even when domain managers fail concurrently.
References
[1] H. Asgari, P. Trimintzios, M. Irons, G. Pavlou, S. Berghe, and R. Egan, "A Scalable Real-time Monitoring System for Supporting Traffic Engineering", in Proc. of the IEEE Workshop on IP Operations and Management, Dallas, USA, 2002.
[2] A. Corradi, C. Stefanelli, and F. Tarantino, "How to Employ Mobile Agents in Systems Management", in Proc. of the 3rd Int. Conf. on the Practical Application of Intelligent Agents and Multi-Agent Technology (PAAM'98), 1998, pp. 17-26.
[3] A. Fuggetta, G.P. Picco, and G. Vigna, "Understanding Code Mobility", IEEE Transactions on Software Engineering, Vol. 24, No. 5, 1998, pp. 342-361.
[4] D. Gavalas, D. Greenwood, M. Ghanbari, and M. O'Mahony, "Complementary Polling Modes for Network Performance Management Employing Mobile Agents", in Proc. of the IEEE Global Communications Conference (Globecom'99), 1999, pp. 401-405.
[5] D. Goderis, S. Bosch, and Y. T'Joens, "A Service-Centric IP Quality of Service Architecture for Next Generation Networks", in Proc. of the IEEE/IFIP Network Operations and Management Symposium, 2002, pp. 139-154.
[6] G. Goldszmidt and Y. Yemini, "Delegated Agents for Network Management", IEEE Communications Magazine, Vol. 36, No. 3, 1998, pp. 66-70.
[7] A. Liotta, G. Knight, and G. Pavlou, "Modelling Network and System Monitoring Over the Internet with Mobile Agents", in Proc. of the IEEE/IFIP Network Operations and Management Symposium (NOMS'98), 1998, pp. 303-312.
[8] A. Liotta, G. Pavlou, and G. Knight, "Exploiting Agent Mobility for Large-scale Network Monitoring", IEEE Network, 2002, pp. 7-15.
[9] S. Kwon and J. Choi, "An Agent-based Adaptive Monitoring System", Lecture Notes in Artificial Intelligence, Vol. 4088, 2006, pp. 672-677.
[10] J. Philippe, M. Flatin, and S. Znaty, "Two Taxonomies of Distributed Network and System Management Paradigms", Emerging Trends and Challenges in Network Management, 2000.
[11] G. Susilo, A. Bieszczad, and B. Pagurek, "Infrastructure for Advanced Network Management based on Mobile Code", in Proc. of the IEEE/IFIP Network Operations and Management Symposium (NOMS'98), 1998, pp. 322-333.
[12] B. Tierney, B. Crowley, D. Gunter, J. Lee, and M. Thompson, "A Monitoring Sensor Management System for Grid Environments", Cluster Computing Journal, Vol. 4, No. 1, 2001, pp. 19-28.
[13] A. Tripathi, D. Kulkarni, H. Talkad, M. Koka, S. Karanth, T. Ahmed, and I. Osipkov, "Autonomic Configuration and Recovery in a Mobile Agent-based Distributed Event Monitoring System", Software Practice and Experience, Vol. 37, 2007, pp. 493-522.
Jinho Ahn received his B.S., M.S. and Ph.D. degrees in Computer Science and Engineering from Korea University, Korea, in 1997, 1999 and 2003, respectively. He has been an associate professor in the Department of Computer Science, Kyonggi University. He has published more than 70 papers in refereed journals and conference proceedings and has served as a program or organizing committee member or session chair for several domestic and international conferences, as editor-in-chief of the Journal of Korean Institute of Information Technology, and as an editorial board member of the Journal of Korean Society for Internet Information. His research interests include distributed computing, fault-tolerance, sensor networks and mobile agent systems.
IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 3, No 3, May 2010 ISSN (Online): 1694-0784 ISSN (Print): 1694-0814
Multi Objective AODV Based On a Realistic Mobility Model
Hamideh Babaei 1, Morteza Romoozi 2
1 Computer Eng. Dept, Islamic Azad University, Naragh Branch, Naragh, Iran
2 Computer Eng. Dept, Islamic Azad University, Kashan Branch, Kashan, Iran
Abstract
Routing is one of the most important challenges in ad hoc networks. Numerous algorithms have been presented, and one of the most important is AODV. Like many other algorithms, it calculates an optimum path while paying no attention to the environment, the mobility pattern or the status of the mobile nodes. Several algorithms, known as environment-aware or mobility-based, do take these factors into account, but they do not consider realistic movement and environments, such as obstacles, pathways and realistic movement patterns of the mobile nodes. This article presents a new algorithm based on AODV that finds an optimum path based on multiple objectives. These objectives are extracted from a realistic mobility model, the internal status of the mobile nodes and their status in routing. In this method the objectives are optional, and each node can consider some of them in routing; the method therefore supports GPS-less mobile nodes. Evaluation of the new method shows that considering multiple objectives influences routing metrics and can improve some of them.
Keywords: Multi objective AODV, Realistic Mobility Model, Ad Hoc Network, Routing Algorithm, Mobility Model, Multi objective Problem.
1. Introduction
Wireless ad hoc networks have spread more and more because of their applications and services. An ad hoc network is a type of wireless network that does not include any static infrastructure. In such a network each node plays the role of both host and router: while it moves in its environment, each node sends and receives its own data packets and relays the data packets of other nodes toward their destinations. The topology of these networks varies because of the movement of their nodes, and there is no control center to maintain the network topology or to configure or reconfigure it. One of the main challenges of ad hoc networks is routing. An optimum routing algorithm plays a significant role in performance improvement. Problems such as limited bandwidth, limited power and end-to-end delay create the need for an optimum and fast routing algorithm.
Many routing algorithms have been presented for these networks, each with its own benefits. With respect to how routing information is gathered, routing algorithms are classified into two classes, proactive and reactive [1]. One of the best-known routing algorithms is AODV [2], a useful and effective reactive algorithm.
Graphs can model many things in the world, such as transportation networks, traffic control networks, neural networks and communication networks. The routing problem can be modeled as a graph too: each host is a vertex and each link between two hosts is an edge. The routing problem can therefore be considered a shortest path problem (SPP) in a graph. In the AODV algorithm, the path with the minimum hop count is selected as the optimum path.
In a Single Objective Problem (SOP), there is just one objective [3]. The AODV algorithm is an example. Single objective methods are not suitable for some kinds of problems, in which finding the best solution depends on multiple objectives. A new kind of problem, the Multi Objective Problem, therefore emerged, in which multiple objectives play a role [4]. In the shortest path problem [5], we can attach multiple objectives such as cost, time and distance to each edge and solve the problem based on them, so that the selected path satisfies multiple objectives. The Multi Objective Shortest Path Problem (MOSPP) thus finds an optimum path based on multiple objectives.
This paper proposes a novel method that improves the AODV routing algorithm by finding the best path based on multiple objectives. The objectives that are effective in routing must therefore be identified. Much research shows that mobility has a significant effect on routing [6]. Hence, if a routing algorithm is based on the mobility of the nodes or considers mobility parameters in routing, it should perform better. To study such a routing algorithm, we need to be able to simulate it on a network simulator. In the simulator, the mobility of the nodes is modeled by a mobility model, which dictates the initial placement and movement of the nodes. This model can also represent the environment around the nodes, such as obstacles and pathways. A good mobility model must be based on the realistic situation of both the nodes and their environment [7]. Therefore, if parameters extracted from a realistic mobility model are considered in a routing algorithm, it can perform better in routing. Non-mobility objectives that play a role in an optimum path, such as geographic distance, energy and traffic, can be considered too.
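The graph view of routing described in this introduction can be illustrated with a short Python sketch. It is not the AODV protocol itself: it is a plain breadth-first search that returns a minimum hop count path on an undirected graph, and the function name `min_hop_path` and the example topology are made up for illustration.

```python
from collections import deque


def min_hop_path(links, source, destination):
    """Return a path with the fewest hops, or None if no path exists."""
    adjacency = {}
    for a, b in links:  # each link is an undirected edge between two hosts
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            path = []
            while node is not None:  # walk the parent pointers back
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


links = [("S", "A"), ("A", "B"), ("B", "D"), ("S", "C"), ("C", "D")]
print(min_hop_path(links, "S", "D"))  # ['S', 'C', 'D']
```

Hop count is the only objective here, which is exactly the single objective setting that the multi objective formulation generalizes.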
In this paper, related work is introduced first. Second, the classic AODV algorithm is reviewed. Third, a realistic mobility model is introduced. Fourth, knowledge gained from the mobility model is used to identify effective objectives, and a multi objective AODV algorithm based on a realistic mobility model is proposed. Finally, the proposed method is evaluated and compared with classic AODV.
2. AODV Routing Protocol
AODV is capable of both unicast and multicast routing. It is an on-demand algorithm, meaning that it builds routes between nodes only as desired by source nodes. It maintains these routes as long as they are needed by the sources. Additionally, AODV forms trees which connect multicast group members. The trees are composed of the group members and the nodes needed to connect the members. AODV uses sequence numbers to ensure the freshness of routes. It is loop-free, self-starting, and scales to large numbers of mobile nodes.
AODV builds routes using a route request / route reply query cycle. When a source node desires a route to a destination for which it does not already have a route, it broadcasts a route request (RREQ) packet across the network. Nodes receiving this packet update their information for the source node and set up backward pointers to the source node in their route tables. In addition to the source node's IP address, current sequence number, and broadcast ID, the RREQ also contains the most recent sequence number for the destination of which the source node is aware. A node receiving the RREQ may send a route reply (RREP) if it is either the destination or has a route to the destination with a corresponding sequence number greater than or equal to that contained in the RREQ. If this is the case, it unicasts a RREP back to the source. Otherwise, it rebroadcasts the RREQ. Nodes keep track of the RREQ's source IP address and broadcast ID. If they receive a RREQ which they have already processed, they discard the RREQ and do not forward it. As the RREP propagates back to the source, nodes set up forward pointers to the destination. Once the source node receives the RREP, it may begin to forward data packets to the destination. If the source later receives a RREP that contains a greater sequence number, or the same sequence number with a smaller hop count, it may update its routing information for that destination and begin using the better route.
As long as the route remains active, it will continue to be maintained. A route is considered active as long as there are data packets periodically traveling from the source to the destination along that path. Once the source stops sending data packets, the links will time out and eventually be deleted from the intermediate node routing tables. If a link break occurs while the route is active, the node upstream of the break propagates a route error (RERR) message to the source node to inform it of the now unreachable destination(s). After receiving the RERR, if the source node still desires the route, it can reinitiate route discovery.
The RREQ and RREP packet formats are illustrated in figures 1 and 2, and figure 3 illustrates a route table entry of a node.
3. Cluster Based Mobility Model
The authors previously proposed a realistic mobility model named Cluster Based Mobility Model for Intelligent Nodes [7], which is one of the most realistic mobility models. This section summarizes it.
There are different nodes in an ad hoc network, and different nodes naturally have different mobility specifications. For instance, in a campus environment there are automobile nodes, static nodes such as billboards, and pedestrian nodes. Even nodes of one type differ in mobility. For example, pedestrian nodes do not all share the same mobility model: teacher nodes may be more active in some areas than in others (e.g. in faculties or libraries), while employee nodes seem to be more active in offices than in other locations. It can therefore be said that an environment contains different groups of nodes, which can be named clusters. Each cluster has a different movement pattern.
In this mobility model, to model the environment around the mobile nodes, obstacles are determined at the beginning of the simulation; pathways are then constructed by a Voronoi diagram computed from the centroids of obstacle corners [8].
IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 3, No 3, May 2010 www.IJCSI.org But what are the cluster movement specifications? To answer this question, a real campus environment where considered and the movements of different nodes where captured. This reveals the fact that each cluster has the following specifications: Activity area: it is an area on which the nodes are more active than other areas. It means that the nodes select places in this area or the places near it as their destination more than other locations. Speed range: speed range of the nodes in each cluster differs from the rate of other clusters. For example automobile clusters have different speed range from pedestrian clusters. Pause time range: pause time of each cluster is different too. For example, automobiles have shorter pause time than pedestrians. Capacity: each cluster has a certain capacity. For instance, the number of automobiles is less than that of pedestrians. Path choice method: the nodes in each cluster have different path choice method. Automobiles, for example, prefer sparser path even if it is longer, but pedestrians prefer shorter path even if it is crowded or some environment aware nodes choose shortcut path but others do not aware about it choose main paths. The following scenario describes movement behavior of the nodes in their environment. In the proposed model, the nodes become the members of clusters according to their capacity in a random way. They are distributed at Voronoi graph vertices based on their activity area at the beginning of simulation. Then, each node selects a vertex as destination based on its activity area and calculates an optimum path to destination based on path choice method and selects a speed rate between Vmin and Vmax, which has been specified for its cluster at the beginning of the simulation. Then it moves to the destination through the selected path in the predefined pathways. 
In destination it pauses between pmin and pmax that has been specified for its cluster at the beginning of the simulation. This procedure is repeated to the end of simulation.
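The per-node cycle described above (choose a destination, choose a speed, move, pause) can be sketched as follows. This is only an illustrative sketch: the function name `next_leg`, the uniform sampling of speed and pause time, and the uniform choice among the vertices of the activity area are assumptions, and the path computation itself is left out.

```python
import random

def next_leg(activity_vertices, v_min, v_max, p_min, p_max, rng=random):
    """Draw the parameters of one movement leg for a node.

    activity_vertices: Voronoi vertices in (or near) the node's activity area.
    Uniform sampling is an assumption; the text only gives the ranges.
    """
    destination = rng.choice(activity_vertices)   # destination vertex
    speed = rng.uniform(v_min, v_max)             # speed for this leg
    pause = rng.uniform(p_min, p_max)             # pause time at the destination
    return destination, speed, pause
```

A simulator would call such a function each time a node finishes its pause, then move the node along the path chosen by its cluster's path choice method.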
4. Proposed Method
As mentioned previously, the AODV algorithm chooses the path with the minimum hop count. This choice is not suitable at all times and in all places. A path with the minimum hop count may consist of nodes that are at the maximum distance from one another, so that after a small movement they leave each other's transmission range and the path breaks. A path with a higher hop count that takes the distance between its nodes into account is therefore better than a minimum hop count path that does not. The same can be said for energy, traffic and so on. An optimum path is thus a path that is selected on the basis of multiple objectives. The proposed method considers not only hop count but also other objectives, which are derived from the mobility model, the mobile node specifications and routing. By considering these objectives, multi-objective AODV can find paths that are optimal with respect to multiple objectives. In the proposed method, the selection of the objectives that participate in path finding is optional: if a node lacks a facility such as GPS, the objectives that need GPS can be left out. The proposed method therefore supports GPS-less mobile nodes. The objectives that play a role in path finding are introduced first, and then their use is explained.
4.1 Geographical Distance
Geographical distance can play a significant role in the durability and stability of a path. If two consecutive nodes are so far apart that a small movement takes them out of each other's transmission range, the path lacks durability and stability and may break within a short time. If all nodes have GPS, they can obtain their geographical position at any time. A field named Position is therefore added to the RREP packet. Each node that relays an RREP packet fills this field with its own geographical position. Each node thus knows both the position of the previous node and its own position, and can calculate the distance between itself and the previous node with Eq. (1):

$$dist = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}, \qquad D_D = \frac{dist - dist_{min}}{dist_{max} - dist_{min}} \tag{1}$$

In this formula, (x1, y1) are the coordinates of the previous node and (x2, y2) are the coordinates of the next node. dist_min is the minimum distance, which is equal to 0, and dist_max is the maximum distance, which is equal to 2 × the transmission range of the nodes. D_D is the distance objective, normalized to the range 0 to 1.
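As a minimal sketch of Eq. (1), the distance objective can be computed from the two positions and the transmission range. The function name and the final clamping to [0, 1] are assumptions added for illustration.

```python
import math

def distance_objective(prev_pos, own_pos, transmission_range):
    """Normalized distance objective D_D of Eq. (1)."""
    (x1, y1), (x2, y2) = prev_pos, own_pos
    dist = math.hypot(x1 - x2, y1 - y2)      # Euclidean distance
    dist_min = 0.0
    dist_max = 2.0 * transmission_range      # maximum distance as stated in the text
    d = (dist - dist_min) / (dist_max - dist_min)
    return min(max(d, 0.0), 1.0)             # clamping is an added safeguard
```

For two nodes 5 m apart with a 250 m transmission range, the objective is 5/500 = 0.01.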
4.2 Cluster Objective
As mentioned for the cluster-based mobility model, each node belongs to a particular cluster and has the movement specifications of that cluster. Some of these specifications can have a significant effect on durability and stability. For instance, if the nodes have a lower speed range or a higher pause time range, the path stays stable for longer. A rank can therefore be assigned to each cluster, so that a cluster whose specifications produce more stable paths takes a higher rank. These specifications include maximum speed and maximum pause time. The rank can be calculated according to Eq. (2):

$$C = \frac{v_{max} + v_{min}}{2} \cdot \frac{2}{p_{max} + p_{min}}, \qquad D_C = \frac{1}{C} \tag{2}$$

In the above formula, v_min and v_max are the minimum and maximum speed, and p_min and p_max are the minimum and maximum pause time. A lower value of C is better, so to normalize it and turn it into a maximum objective, 1/C is considered.

4.3 Activity Area Objective
Each node has a specific activity area, where it is found more often than anywhere else. If two consecutive nodes in a path belong to the same activity area, or if their activity areas are close to each other, the path is more likely to be stable and durable. Each node therefore sends its cluster number in the RREP packet. The receiver of the packet uses this number to identify the activity area of the previous node; since it also knows its own activity area, it can calculate the distance between the two. Eq. (3) calculates this distance:

$$dist = \sqrt{(x_{a1} - x_{a2})^2 + (y_{a1} - y_{a2})^2}, \qquad D_A = \frac{dist - dist_{min}}{dist_{max} - dist_{min}} \tag{3}$$

In the above formula, (xa1, ya1) are the coordinates of the center of the activity area of the previous node and (xa2, ya2) are the coordinates of the center of the activity area of the next node. dist_min is the minimum distance, which is equal to 0, and dist_max is the maximum distance, which is equal to the diameter of the simulation terrain. D_A is the normalized objective, with a value between 0 and 1.

4.4 Node Energy Objective
Mobile nodes are notebook computers or portable wireless devices. They are battery powered, and their energy may run out. If the energy of one or more nodes in a path runs out, the path is broken. A path whose nodes have sufficient energy is therefore more stable and durable. Suppose the energy of a node is a value between 0 and 100, where 100 is the maximum energy and 0 means that the node has no energy left to communicate. Energy decreases in three ways:
1. As time passes, a constant amount of energy is consumed.
2. For each packet sent, a constant amount of energy is consumed.
3. For each packet received, a constant amount of energy is consumed.
Energy is a maximum objective, which means that a higher value is better. To make it consistent with the other objectives, it is converted to a minimum objective. Eq. (4) calculates the resulting energy objective D_P from the node energy p; D_P is normalized to the range 0 to 1.
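The cluster rank of Eq. (2) and the activity-area objective of Eq. (3) can be sketched as below. The operators in Eq. (2) are partly lost in the source, so the code assumes that C is the mean speed of the cluster divided by its mean pause time; the function names are hypothetical, and dist_min = 0 is taken from the text.

```python
import math

def cluster_objective(v_min, v_max, p_min, p_max):
    """Return (C, D_C) under the assumed reading of Eq. (2).

    C = ((v_max + v_min) / 2) * (2 / (p_max + p_min)), D_C = 1 / C.
    Raises ZeroDivisionError if the mean speed or mean pause time is zero.
    """
    mean_speed = (v_max + v_min) / 2.0
    mean_pause = (p_max + p_min) / 2.0
    c = mean_speed / mean_pause
    return c, 1.0 / c

def activity_area_objective(center_prev, center_next, terrain_diameter):
    """Normalized distance D_A between two activity-area centers, Eq. (3)."""
    (xa1, ya1), (xa2, ya2) = center_prev, center_next
    dist = math.hypot(xa1 - xa2, ya1 - ya2)
    return dist / terrain_diameter   # dist_min = 0, dist_max = terrain diameter
```

A slow cluster with long pauses (for example speeds 0 to 2 and pauses 10 to 30) gives a small C and therefore a large 1/C.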
4.5 Traffic Objective
The next objective is the traffic along a path. A longer path with less traffic is better than a shorter path with high traffic. A high-traffic link can partition a whole path, because the link becomes a bottleneck that keeps packets in a long queue and may even drop them. Traffic can therefore have a significant role in the stability and durability of a path. To keep track of traffic, each node adds a new field to its neighbor table and increments it for each packet that is sent or relayed through that neighbor link. This field thus gives the number of packets that have been sent through the neighbor link during the simulation. To use this objective, each node that wants to send or relay an RREP packet adds to it the traffic objective calculated by Eq. (5):

$$D_T = \frac{1}{T} \tag{5}$$

T in the above formula is the value of the field in the neighbor table, and D_T is the traffic objective used in path finding, which converts T to a minimum objective and normalizes it to the range 0 to 1.
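The per-link counter and Eq. (5) can be sketched as a small neighbor table. The class and method names are hypothetical, and treating a link that has carried no packets as T = 1 is an assumption made only to avoid division by zero; the source does not say how that case is handled.

```python
class NeighborTable:
    """Counts packets sent or relayed over each neighbor link."""

    def __init__(self):
        self.packet_count = {}   # neighbor id -> packets sent through the link

    def record_packet(self, neighbor_id):
        # Called once per packet sent or relayed through this neighbor link.
        self.packet_count[neighbor_id] = self.packet_count.get(neighbor_id, 0) + 1

    def traffic_objective(self, neighbor_id):
        # D_T = 1 / T as in Eq. (5); an unused link is treated as T = 1 (assumption).
        t = max(self.packet_count.get(neighbor_id, 0), 1)
        return 1.0 / t
```

After four packets have been recorded for a neighbor, `traffic_objective` returns 0.25 for that link.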
4.6 Environment Obstacle Objective
Environment obstacles can affect the stability or durability of a path. The nodes of an ad-hoc network are usually mobile, and their movement can place an obstacle between two consecutive nodes, which blocks their signal and partitions the path. The stability of a path is therefore affected not only by the distance between two consecutive nodes but also by the obstacles around them.
But how can the effect of these obstacles be calculated? To this end, consider two consecutive nodes, each surrounded by a circle whose radius equals the transmission range of the node. If the two circles overlap, the two nodes are connected, and the larger the overlap region, the stronger the link between them. When two circles intersect, the intersection defines a sector in each circle. These pizza-like slices are illustrated in Figure 1 as sector BAC in the circle with center A and sector BDC in the circle with center D. The sectors can be taken as the connection area of the two nodes; they include some extra region, but we do not intend to calculate the area exactly. Figure 1 also shows a rectangle acting as an obstacle. It creates two further sectors, eDf and gAh, which cannot be counted as connection area for the two nodes. To calculate the effective connection region of the two nodes, these regions are therefore subtracted from sectors BAC and BDC.
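The subtraction of the obstructed sectors from the two connection sectors can be sketched as follows, assuming the four sector angles (in degrees) have already been obtained from the geometry of Figure 1. The function names are hypothetical.

```python
import math

def sector_area(angle_deg, r):
    """Area of a circular sector with the given central angle (degrees)."""
    return (angle_deg / 360.0) * math.pi * r ** 2

def effective_area(angle_bac, angle_bdc, angle_gah, angle_edf, r):
    """Connection sectors BAC and BDC minus the obstructed sectors gAh and eDf."""
    s1 = sector_area(angle_bac, r)
    s2 = sector_area(angle_bdc, r)
    s3 = sector_area(angle_gah, r)
    s4 = sector_area(angle_edf, r)
    return (s1 + s2) - (s3 + s4)
```

With no obstacle, both obstructed angles are zero and the result is simply the sum of the two connection sectors.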
Fig. 1 Effect of an obstacle on the connectivity of two nodes

$$\begin{aligned}
S_1 &= \frac{Angle(BAC)}{360}\,\pi r^2 && \text{(area of sector BAC)} \\
S_2 &= \frac{Angle(BDC)}{360}\,\pi r^2 && \text{(area of sector BDC)} \\
S_3 &= \frac{Angle(gAh)}{360}\,\pi r^2 && \text{(area of sector gAh)} \\
S_4 &= \frac{Angle(eDf)}{360}\,\pi r^2 && \text{(area of sector eDf)} \\
S &= (S_1 + S_2) - (S_3 + S_4) && \text{(area of effective section)}
\end{aligned} \tag{6}$$

Formula (6) gives the area of the effective region of the two circles. The larger S is, the stronger the link between the two nodes. O_A converts S to a minimum objective and D_O normalizes it to the range 0 to 1 (Eq. (7)):

$$O_A = \pi r^2 - S, \qquad D_O = \frac{O_A}{\pi r^2} \tag{7}$$

5. Proposed Method
As mentioned previously, the optimum path is the path that is optimal with respect to multiple objectives. The objectives introduced in the previous section are not all of the same kind: some are minimum objectives and some are maximum objectives. All of them are, however, converted to minimum objectives and normalized to the range 0 to 1. With such objectives, each node could select its path by a Pareto method based on the six objectives. To this end, AODV is improved by using a weighted sum method, in which all objectives are added together to form a single objective. Only two of the objectives, geographical distance and obstacle effect, require the mobile nodes to be equipped with a GPS receiver to calculate their position. The proposed method is able to leave some objectives out, and doing so is optional for a node when it wants to find a path. The proposed method therefore supports both GPS-equipped and GPS-less nodes. In the proposed method, some fields are added to the routing table, the RREP packet and the RREQ packet. Figures 2, 3 and 4 illustrate them.

Fig. 2 Route request message frame
Fig. 3 Route response message frame in the new algorithm
Fig. 4 Fields of the route table at each node in the new algorithm

The RREQ packet has a field named objective primitives, which specifies what the RREQ sender requires of the path. In this 6-bit field, each bit is associated with one objective. If a bit has the value 0, its associated objective is not used in path finding and the RREQ generator does not consider that objective. There is just one field for all objectives in the RREP packet and in the routing table, because a weighted sum method [9] is used. All objectives are added together according to the formula below, and the result is placed in the OBJS field:

$$F = F_{old} + \sum_i w_i f_i, \qquad f_i \in \{D_D, D_C, D_A, D_T, D_P, D_O\} \tag{8}$$

F_old in Eq. (8) is determined as follows:
If the node that sends the RREP generated it and is the destination of the path, F_old is 0.
If the node that sends the RREP generated it and is a node that has a path to the destination, F_old comes from the OBJS field of its routing table.
If the node that sends the RREP did not generate it and is an intermediate node, F_old comes from the OBJS field of the received RREP packet.
w_i in the above formula is the routing primitive with which the RREQ generator determines which objectives play a role in finding a path.
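The accumulation of Eq. (8) along a path can be sketched as follows. The mapping of bits to objectives (bit 0 for D_D up to bit 5 for D_O) and the function name are assumptions; the source only states that each of the six bits is associated with one objective.

```python
# Assumed bit order: bit 0 = DD, bit 1 = DC, bit 2 = DA, bit 3 = DT, bit 4 = DP, bit 5 = DO.
OBJECTIVES = ("DD", "DC", "DA", "DT", "DP", "DO")

def update_objs(f_old, primitives, values):
    """Return F = F_old + sum(w_i * f_i), with w_i read from the 6-bit field.

    values: objective name -> normalized value at this hop. Objectives whose
    bit is 0 are skipped and need not be present (e.g. on GPS-less nodes).
    """
    total = f_old
    for bit, name in enumerate(OBJECTIVES):
        if (primitives >> bit) & 1:
            total += values[name]
    return total
```

Under this assumed bit order, a GPS-less request would clear the bits of D_D and D_O, for example `0b011110`, so a node without GPS never has to supply those two values.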
Each node that receives an RREP packet inserts a reverse path into its routing table. If, at the time of insertion, an entry with the same destination and the same objective primitives already exists, the higher of the OBJS values in the RREP packet and in the routing table entry determines which of the two stays in the routing table. If there is no entry with the same destination, or none with the same objective primitives, the new path from the RREP packet is inserted directly into the routing table. Naturally, only fresh (unexpired) routes in the routing table are considered. After the routing table has been updated, the RREP packet is forwarded to the next hop towards the source of the path.
6. Simulation
The main goal of the simulation is to evaluate the proposed method and compare it with previous methods. The proposed method has therefore been compared with classic AODV. Three diagrams are used to evaluate the performance of the new method:
Proposed method: this diagram considers all of the objectives mentioned.
Proposed method for GPS-less networks: this diagram does not consider the GPS-related objectives (distance and obstacle effect) in the simulations.
AODV algorithm: this diagram is produced by the classic AODV algorithm and serves as the baseline for the performance of the proposed method.
The simulation has been carried out three times, each with a different variable parameter:
Simulation with variable speed: 50 nodes on a simulation terrain of size 1000x1000, with the speed of the nodes varying between 0 and 10 m/s.
Simulation with a variable number of nodes: the number of nodes varies from 20 to 70, the simulation terrain size is 1000x1000 and the speed of each node is a random number between 0 and 2.
Simulation with variable simulation terrain size: 50 nodes, with the terrain size varying between 800x800 and 1800x1800 and the speed of each node a random number between 0 and 2.
Each point of the diagrams has been calculated from 30 simulation runs with different random seeds.

6-1. Simulation Parameters
All simulations have been carried out with the Glomosim [10] network simulator, which is one of the most popular wireless network simulators. The mobility model is the Cluster Based Mobility Model for Intelligent Nodes, which was explained in the previous sections. The simulation terrain, shown in Figure 5, is 1000m*1000m with 7 obstacles and 3 clusters, each of which has an activity area shown in a different color. The maximum node transmission range is 250m. In the presence of obstructions, however, the actual transmission range of each individual node is likely to be limited. At the MAC layer, the IEEE 802.11 DCF protocol is used, and the bandwidth is 2Mbps. After the initial distribution of the nodes, the nodes move for 60 seconds so that they are distributed throughout the simulation area. Ten data sessions are then started. The data packet size is 512 bytes and the sending rate is 4 packets/second. The maximum number of packets that can be sent per data session is set to 6,000. Movement continues throughout the simulations for a period of 1800 seconds. Each data point is an average of 30 simulation runs with the nodes distributed in different initial positions.

Fig. 5 Simulation terrain
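As a quick sanity check on the traffic parameters of Section 6-1, the aggregate offered load and the duration of a data session can be worked out directly. This is only back-of-the-envelope arithmetic: it counts payload bytes only, assumes 2 Mbps means 2,000,000 bit/s, and ignores headers, control traffic and multi-hop forwarding.

```python
sessions = 10              # data sessions
packet_bytes = 512         # data packet size
rate_pps = 4               # packets per second per session
max_packets = 6000         # packet cap per session
warmup_s = 60              # initial movement before sessions start
sim_s = 1800               # total simulated time
bandwidth_bps = 2_000_000  # assumed meaning of "2Mbps"

offered_bps = sessions * rate_pps * packet_bytes * 8   # aggregate payload rate in bit/s
load_fraction = offered_bps / bandwidth_bps            # share of the nominal bandwidth
session_s = max_packets / rate_pps                     # seconds needed to send all packets
total_packets = sessions * max_packets                 # upper bound on data packets sent

print(offered_bps, load_fraction, session_s, total_packets)
```

The ten sessions together offer 163,840 bit/s of payload, about 8% of the nominal bandwidth, and a session that starts right after the 60-second warm-up needs 1500 seconds to send its 6,000 packets, which fits within the 1800-second run. At most 60,000 data packets can be sent in one run.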
6-2. Simulation Metrics
Routing metrics have been measured to evaluate the performance of the proposed algorithm and compare it with AODV. These metrics are as follows:
Data Packet Reception: the number of data packets received at their intended destinations.
Control Packet Overhead: the number of network-layer control packet transmissions.
End-to-End Delay: the end-to-end transmission time for data packets. This value includes delays due to route discovery.
The above metrics are measured in the three situations mentioned.
6-3. Simulation Results
a) Average End-to-End Delay
End-to-end delay is the time consumed in the point-to-point transmission of a data packet. This time includes the delay caused by routing. In this section, the average end-to-end delay is evaluated in three separate situations: variable speed range, variable number of nodes and variable size of the simulation terrain. The results are shown in Figures 6, 7 and 8.
Using distance and activity area provides paths with shorter and more stable links, so data is sent more quickly and more dependably. Using the traffic objective prevents packets from waiting in long queues, so they are sent rapidly. Considering energy and cluster also produces more stable paths. The obstacle effect objective decreases the probability that nodes leave each other's transmission range after a small movement.
The best results in all diagrams belong to the proposed method with all objectives considered. It can be concluded that all objectives play a role in finding stable and short paths. When two of the six objectives are not considered, the result is worse than in the previous diagram, which means that the two objectives distance and obstacle effect play a significant role in finding stable paths. The GPS-less diagram nevertheless also has a better result than classic AODV. The remaining objectives in the GPS-less diagram thus retain their effect on path finding and produce more stable paths than the AODV algorithm.
Increasing the speed increases the average end-to-end delay in all diagrams. This is because increased node movement causes nodes to leave each other's transmission range, so the path failure rate increases. The increase in average end-to-end delay with an increasing number of nodes, however, is unexpected. It is caused by the increase in node density and therefore in data sessions, which can raise the average end-to-end delay. The opposite holds for the diagram with variable simulation terrain size.
b) Average Data Packet Reception
The average data packet reception for variable speed, number of nodes and simulation terrain size is illustrated in Figures 9, 10 and 11. Considering the objectives mentioned plays a significant role in improving data packet reception, and using all of them gives the best result. This is due to the effects of the objectives described in the evaluation of the previous metric.
The GPS-less diagram has a better result than classic AODV, but not better than the diagram that considers all objectives. This shows the role of the two missing objectives, distance and obstacle effect, which have a significant effect on the stability of a path. The distance objective shortens the path, and the obstacle effect objective leads to the selection of a more stable and durable path.
As the speed or the size of the simulation terrain increases, the average data packet reception decreases. This is because of decreasing node density, which creates fewer paths, so fewer data packets are sent or received. The opposite holds when the number of nodes is increased.
c) Average Control Packet Overhead
The average control packet overhead is evaluated in three different cases: different speed ranges, different simulation terrain sizes and different numbers of nodes. As illustrated in Figures 12, 13 and 14, the average overhead of the proposed methods is higher than that of classic AODV in all diagrams. This can be due to the prioritized requesting of a path. When a node requests a path with self-defined primitives, the request may be received by an intermediate node that knows a path to the destination but whose path primitives do not match the requested ones. The path finding process is then not stopped, even though the intermediate node knows a path to the destination, whereas in classic AODV the path finding process stops in the same situation. The proposed method therefore consumes more control packets than classic AODV.
The overhead in the GPS-less diagram is somewhat lower than in the diagram that considers all objectives. This is because of the restriction of primitives in the GPS-less diagram, which decreases the variety of paths.
As the speed increases, the overhead increases, because the number of broken paths increases and new paths need new control packets. But why does the overhead increase as the number of nodes increases? This may be because the number of nodes that relay control packets increases.
The overhead decreases when the size of the simulation terrain increases. This is because of decreasing node density, which leads to less data packet reception and, as a result, fewer control packets.
7. Conclusion
Classic AODV uses just one objective, the shortest hop count, in finding a path. This objective is not appropriate in every case. A path may have the lowest hop count and yet be far from optimal in other objectives. This paper proposed a new Multi Objective AODV, based on a realistic mobility model, which could improve the performance of ad-hoc networks in some metrics. By using a multi-objective algorithm, the proposed routing algorithm could consider the most important objectives that play a role in routing, directly or indirectly.
Previous research has shown that the mobility model has a significant effect on the routing algorithm. The authors have therefore used a realistic mobility model that they proposed previously, extracted its parameters and used them as objectives in the routing algorithm.
One of the important points of the proposed method is that it supports both GPS-equipped and GPS-less nodes. This is possible because the objectives used in finding a path can be selected.

Fig. 6 End to end delay in variant speeds (x-axis: node speed; series: Classic AODV, Multi Objective AODV, Multi Objective AODV (GPS Less))
Fig. 7 End to end delay in variant number of nodes (x-axis: number of nodes; same series)
Fig. 8 End to end delay in variant size of terrain
Fig. 9 Average Data Packet Reception in variant speeds (x-axis: node speed; y-axis: number of data packets; same series)
Fig. 10 Average Data Packet Reception in variant number of nodes (x-axis: number of mobile nodes; y-axis: number of data packets; same series)
Fig. 11 Average Data Packet Reception in variant size of terrain
Fig. 12 Average Control Packet Overhead in variant speeds (x-axis: node speed in m/s; y-axis: number of control packets; same series)
Fig. 13 Average Control Packet Overhead in variant number of nodes (x-axis: number of nodes; y-axis: number of control packets; same series)
Fig. 14 Average Control Packet Overhead in variant size of terrain

Hamideh Babaei is currently a PhD student at the Science & Research branch of Islamic Azad University in Iran. She received her BS in software engineering from the University of Kashan in 2003 and her MS in computer science in 2005 in Iran. She is a faculty member of Islamic Azad University (Naragh branch). She has taught in the areas of Wireless Networks and Ad hoc and Sensor Networks. Her research interests include Semantic Web and Information Retrieval, and her recent research focuses on mobility models and routing protocols in ad hoc networks. She has published several articles in international conferences and the LNCS series.
References
[1] Elizabeth M. Royer, Chai-Keong Toh, A Review of Current Routing Protocols for Ad Hoc Mobile Wireless Networks, IEEE Personal Communications, Vol. 6, No. 2, April 1999, pp. 46-55.
[2] C. E. Perkins and E. M. Royer, "Ad Hoc on-demand distance vector (AODV) routing", IETF Internet Draft, draft-ietf-manet-aodv-l4.txt, Jul. 2003.
[3] P. L. Yu and M. Zeleny, "The techniques of linear multiobjective programming", RAIRO, Vol. 3, 1974, pp. 51-71.
[4] M. Zeleny, "Linear multi-objective programming", Berlin: Springer, 1974.
[5] A. Warburton, Approximation of Pareto Optima in Multiple-Objective, Shortest-Path Problems, Operations Research, Vol. 35, No. 1, 1987, pp. 70-79.
[6] F. Bai, N. Sadagopan, and A. Helmy, The IMPORTANT Framework For Analyzing The Impact of Mobility on Performance of Routing protocols for Adhoc NeTworks, in Proceedings of IEEE INFOCOM, San Francisco, CA, March/April 2003, pp. 825-835.
[7] M. Romoozi, H. Babaei, M. Fathi, A Cluster-Based Mobility Model for Intelligent Nodes in Ad hoc Networks, ICCSA 2009, LNCS 5592, Part II, Springer-Verlag Berlin Heidelberg, 2009, pp. 804-817.
[8] A. P. Jardosh, E. M. Belding-Royer, K. C. Almeroth, and S. Suri, Towards Realistic Mobility Models for Mobile Ad hoc Networks, in Proceedings of ACM MOBICOM, San Diego, CA, September 2003, pp. 217-229.
[9] A. Warburton, Approximation of Pareto Optima in Multiple-Objective, Shortest-Path Problems, Operations Research, Vol. 35, No. 1, 1987, pp. 70-79.
[10] Ashwini K. Pandary and Hiroshi Fujinoki, Study of MANET routing protocols by Glomosim simulator, International Journal of Network Management (Int. J. Network Mgmt), 2005, John Wiley & Sons, Ltd.
Acknowledgement
I take great pleasure in expressing my heartfelt thanks to Morteza Romoozi, my dear husband, whose favor toward me cannot be reckoned. His professional guidance helped me to overcome difficulties during the progress of the project. I should also acknowledge the great contribution of Islamic Azad University, Naragh branch, which provided financial support for the project.
Morteza Romoozi is currently a PhD student at the Science & Research branch of Islamic Azad University in Iran. He received his BS in software engineering from the University of Kashan in 2003 and his MS in computer science in 2006 in Iran. He is a faculty member of Islamic Azad University (Kashan branch). He has taught in the areas of Wireless Networks and Ad hoc and Sensor Networks. His research interests include Semantic Web and Information Retrieval, and his recent research focuses on mobility models and routing protocols in ad hoc networks. He has published several articles in international conferences and the LNCS series.
IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 3, No 3, May 2010 ISSN (Online): 1694-0784 ISSN (Print): 1694-0814
Qualitative analysis of periodically forced nonlinear oscillators responses and stability areas in the vicinity of bifurcation cascade
Nizar JABLI1, Hedi KHAMMARI2 and Mohamed Faouzi MIMOUNI3
1 Electrical Engineering Department, National Engineering School of Monastir, Monastir, Ibn al Jazar 5019, Tunisia
2 Computer Department, Faculty of Computer Science, Taief University, Taief, Saudi Arabia
3 Electrical Engineering Department, National Engineering School of Monastir, Monastir, Ibn al Jazar 5019, Tunisia
Abstract
Bifurcation theory is the mathematical investigation of changes in the qualitative or topological structure of a studied family. In this paper, we numerically investigate the qualitative behavior of a nonlinear RLC circuit excited by a sinusoidal voltage source, based on bifurcation analysis. Poincaré mapping and bifurcation methods are applied to study both the dynamics and the qualitative properties of the periodic responses of such an oscillator. As numerically illustrated here, a small variation in the amplitude or frequency of the driving sinusoidal voltage may involve qualitative changes for which the system exhibits fold, period doubling and pitchfork bifurcations. The presence of these kinds of bifurcation necessitates an examination of the role of these singularities in the dynamical behavior of the circuit. In particular, we numerically study the qualitative changes that may affect the number and stability of the periodic solutions, and the shapes of their associated basins of attraction, while approaching the neighborhood of a particular bifurcation structure known as an isoordinal lips cascade. Using a numerical scanning technique, higher harmonic domains that can prove the existence of such a cascade of bifurcations are numerically computed. Furthermore, we report on some numerical simulations of bifurcation singularities and attraction basins, which are useful tools for understanding and illustrating these effects.
Keywords: Qualitative behavior, bifurcations cascade, fold, flip, pitchfork, higher harmonic, Attraction basins
1. Introduction
Nonlinear dynamical systems undergo abrupt qualitative changes when crossing bifurcation points. Multistability is one of the most important properties of nonlinear systems: one can have two or more stable states for the same system parameters and for different sets of initial conditions. For a more exhaustive study of nonlinear system responses, it is necessary to identify the singularities of the parameter plane (bifurcations, chaos, ...) and the singularities of the phase plane (fixed points, cycles, invariant closed curves, attraction basins, ...). The singularity considered here is the attraction basins associated with the attractors that coexist for the same parameters of the RLC circuit.
The influence domain, stability domain or basin of an attractor is the open set (D) of the points X_n such that the consequents of all X_n approach the attractor asymptotically as n → ∞. The influence domain (or basin) of X* is the set of points X_0 giving the convergence of X_n towards X*. The attraction basin may be either all in one block, or made up of a finite or infinite number of disjoint parts with only one accumulation point [1]. The structure of a stability domain can undergo a global bifurcation that changes the connexity property of this area to non-connexity or vice versa. Previous studies have investigated global bifurcations that change the structure and properties of attraction basins and their boundaries for both two-dimensional diffeomorphisms [1], [2] and endomorphisms [3]. When a cycle is locally asymptotically stable, it is possible to enquire about its influence domain, i.e. about the largest admissible initial perturbation or, more exhaustively, about the set of points X_n such that the consequents of all X_n approach asymptotically and successively the k points of the cycle as n → ∞.
The forced RLC circuit with a nonlinear inductor exhibits a wide variety of nonlinear phenomena, such as jump and hysteresis, bifurcation and chaotic states, frequency entrainment, harmonic and subharmonic oscillations, and quasi-periodic behavior [4], [5]. A nonlinear forced oscillator containing a ferromagnetic core with saturation and hysteresis, or another hard characteristic, exhibits complex bifurcation phenomena near points of resonance [6], [7]. The Duffing Van der Pol oscillator of [8] shows a broad spectrum of dynamic behaviors, both chaotic and periodic. The circuit considered here, composed of a resistor, an inductor and a capacitor, is described by a two-dimensional dynamical system modeled by the following second-order nonlinear ordinary differential equation:
$$\ddot{x} + \psi(x, \dot{x}) + f(x) = h(t) \tag{1}$$
A particular bifurcation structure, namely an isoordinal cascade of bifurcations, studied in former works [9-11], includes local bifurcations of codimension one such as fold, flip and pitchfork, and bifurcations of codimension two such as cuspidal points, which correspond to the intersection of two fold curves. The symmetry property of the circuit is introduced by the pitchfork bifurcation. It was stated that a lip structure, which is a combination of two fold curves joined at their edges by cusp points, is surrounded by a pitchfork bifurcation curve and is associated with the predominance of an even higher harmonic. The aim of this work is to study the qualitative changes of the attraction basins of symmetrical attractors in the proximity of certain bifurcation points.
The rest of the paper is organized as follows. In section 2, we present an overview of the governing differential equations of a nonlinear RLC circuit excited by a sinusoidal voltage source. Section 3 is devoted to a reminder of some basic results on singularities in the phase plane and bifurcation sets in the parameter plane. In section 4, the analysis of higher harmonics of 2π-periodic solutions is examined. Finally, section 5 presents the main results of the paper: we numerically compute bifurcation diagrams and report the effects of the structure of singularities on the attraction basins of stable attractors.
2. RLC circuit equations
Fig. 1 shows a typical RLC circuit modeled as a series combination of a resistor R, an inductor L and a capacitor C. The inductor is characterized by a single-valued characteristic (i.e. without hysteresis). The current i is approximated by a cubic polynomial $i = a_1\phi + a_3\phi^3$, with $a_1 > 0$, $a_3 > 0$, where $\phi$ is the magnetic flux and $a_1$, $a_3$ are constants.
Fig. 1 Typical series RLC circuit.
The system is driven by a sinusoidal voltage source. Using the notation in Fig. 1, the fundamental equation for the circuit is:
$$\frac{d\phi}{dt} + R\,i(t) + \frac{1}{C}\int i(t)\,dt = e(t) \qquad (2)$$
where:
* $e(t) = e_m \sin(\omega t)$ is a sinusoidal voltage source, $\omega$ is the excitation frequency and $e_m$ is the amplitude.
* $u(t) = \frac{1}{C}\int i(t)\,dt$ is the voltage across the capacitor.
We normalize the state variables and the time variable as $x = \phi(t)$, $y = u(t)$ and $\tau = \omega t$. Equation (2) is then rewritten as follows:
$$\begin{cases} \dfrac{dx}{d\tau} = \dfrac{1}{\omega}\left(e_m \sin\tau - R\,(a_1 x + a_3 x^3) - y\right) \\[2mm] \dfrac{dy}{d\tau} = \dfrac{1}{\omega C}\,(a_1 x + a_3 x^3) \end{cases} \qquad (3)$$
We can easily verify that (3) is invariant under the transformation $(\tau, x, y) \to (\tau + \pi, -x, -y)$.
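The invariance stated above can be checked numerically. The following is a minimal sketch, assuming the normalized form of system (3) with the right-hand side written as a hypothetical helper `rlc_rhs`; the parameter values are illustrative placeholders and are not claimed to be an operating point of the paper. The symmetry holds if the vector field evaluated at $(\tau + \pi, -x, -y)$ is the opposite of the field at $(\tau, x, y)$.

```python
import math

def rlc_rhs(tau, x, y, omega, e_m, R, C, a1, a3):
    """Right-hand side (dx/dtau, dy/dtau) of the normalized system (3)."""
    current = a1 * x + a3 * x**3              # inductor current i(x)
    dx = (e_m * math.sin(tau) - R * current - y) / omega
    dy = current / (omega * C)
    return dx, dy

# Illustrative values only (assumed for this sketch).
PARAMS = dict(omega=50.0, e_m=113.0, R=20.0, C=1.0, a1=0.015, a3=0.365)

def symmetry_defect(tau, x, y):
    """Return |f(tau, x, y) + f(tau + pi, -x, -y)| component-wise.

    Both components vanish (up to rounding) when the system is invariant
    under (tau, x, y) -> (tau + pi, -x, -y).
    """
    fx, fy = rlc_rhs(tau, x, y, **PARAMS)
    gx, gy = rlc_rhs(tau + math.pi, -x, -y, **PARAMS)
    return abs(fx + gx), abs(fy + gy)
```

The check works because the characteristic $a_1 x + a_3 x^3$ is odd and $\sin(\tau + \pi) = -\sin\tau$, so every term of the right-hand side changes sign.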
3. Reminder of some basic results This section summarizes some basic results about the singularities of nonlinear systems described by second order nonlinear differential equations. These results will be useful for the analysis of the temporal behavior of a Duffing type equation modeling an RLC circuit with nonlinear core inductor. The Poincare map is usually
studied to understand the nature of the responses of nonlinear oscillatory systems in the phase space and their bifurcations in the parameter space. A complete treatment of the bifurcation types and their computation methods may be found in [12].
3.1 Phase plane singularities
The two-dimensional system of differential equations (3) can be rewritten in the following general form:
$$\frac{dX}{d\tau} = f(X, \tau, \lambda); \quad \tau \in \mathbb{R},\ X \in \mathbb{R}^2,\ \lambda \in \mathbb{R}^2 \qquad (4)$$
where $X = (x, y)^T$ denotes the state vector, $\lambda = (\omega, e_m)$ is the parameter vector and f is supposed to be $C^\infty$ and periodic of period $2\pi$. A classical technique for qualitatively investigating the system dynamics controlled by the parameter vector $\lambda = (\omega, e_m)$ is based on the Poincaré map T. This map is derived from equation (4) by merely sampling the continuous phase trajectories once per forcing period $2\pi$. This geometrical method, called the Poincaré section, gives rise to a discrete trajectory computed implicitly through numerical integration of the system of differential equations. Using the solution $\Phi(t, U_0, \lambda)$ of (4), with the initial condition $X(t_0) = U_0$, the Poincaré mapping is defined as:
$$T: \mathbb{R}^2 \to \mathbb{R}^2; \quad U_0 \mapsto \Phi(t_0 + 2\pi, U_0, \lambda) \qquad (5)$$
where $\lambda = (\omega, e_m) \in \mathbb{R}^2$ denotes the system parameters.
Thus, studying the singularities of T, that is, of the discrete dynamical system defined by relation (5), gives a more complete description of the behavior of the original system defined by relation (4). Indeed, a periodic solution of (4) of period $2\pi$ is associated with a fixed point of T, while a k-order cycle (made up of k points) corresponds to a $2k\pi$-periodic solution of (4); a point U of such a cycle satisfies the following equation:
$$T^k(U) - U = 0 \qquad (6)$$
In this paper only fixed-point singularities of T and their bifurcations will be considered.

3.2 Parameter plane singularities
The stability of periodic solutions is obtained by examining the Jacobian of the system at these solutions. It is therefore possible to describe the dynamical behavior around these points and to make qualitative studies without having to solve the system equations. The stability of a periodic point U is known from the roots S of the following characteristic equation:
$$\det\left(\frac{dT^k(U)}{dU} - S\,I_2\right) = 0 \qquad (7)$$
These roots, also called the multipliers, are denoted by $S_1$ and $S_2$ ($|S_1| \ge |S_2|$). Three topological types of points are defined as follows: if $|S_1| < 1$ the point is asymptotically stable; if $|S_1| > 1 > |S_2|$ the point is unstable and is called a saddle; and if $|S_2| > 1$ the point is completely unstable. At the particular value $|S| = 1$ we have a critical case in the sense of Lyapunov, and a bifurcation may occur. The following local bifurcations are to be identified in equation (3).
* The tangent bifurcation (or fold): this type of bifurcation occurs when one of the multipliers of a fixed point (or a cycle) satisfies $S_p = +1$ (p = 1 or 2). It is schemed in this paper as follows:
$$\emptyset \leftrightarrow \text{cycle}\,(k, j)^a + \text{cycle}\,(k, j)^r$$
where $\emptyset$ indicates the non-existence of the two cycles before the bifurcation point, cycle (k, j) denotes a k-order cycle, and j characterizes the order of iterations of the points of the cycle. The superscript a (resp. r) is attributed to an attractive (resp. repulsive) cycle. In the following discussion the curve associated with this type of bifurcation is denoted by $\Lambda_{(k)_0}^{j}$.
* The period doubling bifurcation (or flip): this type of bifurcation happens when $S_p = -1$ (p = 1 or 2). It is schemed in this paper as follows:
$$\text{cycle}\,(k, j)^a \leftrightarrow \text{cycle}\,(k, j)^r + \text{cycle}\,(2k, j)^a$$
$$\text{cycle}\,(k, j)^r \leftrightarrow \text{cycle}\,(k, j)^a + \text{cycle}\,(2k, j)^r$$
In a parameter plane the flip bifurcation curve is denoted by $\Lambda_{k}^{j}$.
* Pitchfork bifurcation: this type of bifurcation occurs when $S_p = +1$ (p = 1 or 2). After a k-order cycle crosses a pitchfork bifurcation, the stability of the cycle changes and two other k-order cycles with a different stability appear. This bifurcation is presented here as follows:
$$\text{cycle}\,(k, j)^r \leftrightarrow \text{cycle}\,(k, j)^a + \text{cycle}\,(k, j')^r + \text{cycle}\,(k, j'')^r$$
$$\text{cycle}\,(k, j)^a \leftrightarrow \text{cycle}\,(k, j)^r + \text{cycle}\,(k, j')^a + \text{cycle}\,(k, j'')^a$$
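The classification by multipliers can be illustrated with a short sketch. It assumes that the Jacobian of $T^k$ at the periodic point is already available as a 2x2 matrix (obtaining it requires integrating the variational equations, which is not shown here); the function names `multipliers` and `classify` are hypothetical and introduced only for illustration.

```python
import cmath

def multipliers(J):
    """Eigenvalues S1, S2 of a 2x2 matrix J = [[a, b], [c, d]]."""
    (a, b), (c, d) = J
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def classify(J, tol=1e-9):
    """Classify a periodic point from the multipliers of the Jacobian J."""
    s1, s2 = multipliers(J)
    for s in (s1, s2):
        if abs(abs(s) - 1) < tol:             # critical case: |S| = 1
            if abs(s - 1) < tol:
                return "critical (S = +1: fold or pitchfork)"
            if abs(s + 1) < tol:
                return "critical (S = -1: flip)"
            return "critical (|S| = 1)"
    small, large = sorted((abs(s1), abs(s2)))
    if large < 1:
        return "asymptotically stable"
    if small > 1:
        return "completely unstable"
    return "saddle"
```

For example, `classify([[2.0, 0.0], [0.0, 0.5]])` reports a saddle, since one multiplier lies outside the unit circle and the other inside.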
3.3 Attraction basins properties
The trajectory of a given system in state space heads for some final attracting region, or regions, which might be a point, a curve, an area, and so on. Such an object is called an attractor of the system, since a number of distinct trajectories are attracted to this set of points in the state space. The non-uniqueness of these attractors leads one to characterize each stable state by its associated stability domain (or attraction basin). These domains are the open sets of points in the space of initial conditions for which the solutions of the differential equation converge towards this stable state. Thus, an attraction basin (D) is the stability domain of an attractive set (or attractor), and it has a border (F) (see Fig. 2). The properties of the stability domains (D) of these attractors and of their borders (F) (connectedness, complex shape, fractal structure) for two-dimensional maps were analyzed in several works [1], [13], [14].
A Poincaré geometrical transformation T associated with a continuous dynamical system can be a diffeomorphism (invertible) or an endomorphism (non-invertible, non-uniqueness of $T^{-1}$). An attraction basin is connected (see Fig. 2) if the point transformation is invertible; otherwise it is disconnected and made of a finite or infinite number of domains [14]. The attraction basin can also be connected but include holes, which is the case of multiply connected basins [1]. In the case of the circuit presented in section 2, we analyze in particular the multistability of periodic attractors and the structure of the basins of attraction in the phase space, and its dependence on the bifurcation points.

Fig. 2 Connectedness of stability domains of a fixed point: a) connected attraction basin, b) disconnected basins

4. Higher harmonic spectral analysis
In former studies [9], [10], [15] it was shown that the $2\pi$-periodic solutions of the nonlinear differential equation governing the behavior of the considered RLC circuit with core inductor can be classified according to their Fourier spectra. A harmonic of rank m (m > 1) is said to be predominant when, ordering the spectral lines by descending amplitude, it takes the second place, or the first place (i.e. the greatest amplitude) in the case of full predominance. It is shown that the study of the higher harmonic predominance in a parameter plane leads to the existence of a certain bifurcation structure, namely the isoordinal cascade. Such a bifurcation structure is established in one cell of the parameter plane $(\omega, e_m)$. Numerically, we consider an $(\omega, e_m)$ parameter plane; the spectral 'scanning' consists in dividing the plane into small pixels having the same width $\Delta\omega$ and the same height $\Delta e_m$, then computing the Fourier expansion of the $2\pi$-periodic solution to identify the order of the predominant higher harmonic. Finally, each pixel takes the color assigned to the predominant higher harmonic of the oscillatory attractor (or fixed point). The numerically computed domains of higher harmonic predominance are shown in Fig. 3.

Fig. 3 Higher harmonic domains in the $(\omega, e_m)$-plane
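The per-pixel step of the spectral scanning can be sketched as follows. This is an illustration only: it assumes that one period of the steady-state solution has already been sampled uniformly (the integration itself is omitted), and the helper names `harmonic_amplitudes` and `predominant_higher_harmonic` are hypothetical. A plain discrete Fourier sum is used to estimate the amplitude of each harmonic.

```python
import cmath
import math

def harmonic_amplitudes(samples, max_rank):
    """Amplitudes of harmonics 1..max_rank from one uniformly sampled period."""
    n = len(samples)
    amps = {}
    for m in range(1, max_rank + 1):
        coeff = sum(s * cmath.exp(-2j * math.pi * m * k / n)
                    for k, s in enumerate(samples))
        amps[m] = 2 * abs(coeff) / n
    return amps

def predominant_higher_harmonic(samples, max_rank=9):
    """Return (rank, full) for the strongest harmonic of rank m > 1.

    `full` is True when this harmonic also exceeds the fundamental,
    i.e. the case of full predominance.
    """
    amps = harmonic_amplitudes(samples, max_rank)
    higher = {m: a for m, a in amps.items() if m > 1}
    rank = max(higher, key=higher.get)
    return rank, amps[rank] > amps[1]
```

In a scan of the parameter plane, the returned rank would select the color of the pixel; the number of samples per period must exceed twice `max_rank` to avoid aliasing.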
As previously reported in [15], the domains shown in Fig. 3 can be used to identify an isoordinal lips cascade embedded in the differently colored cells.

5. Numerical simulations of the qualitative behavior of RLC circuit
From the whole structure of the isoordinal cascade of lips extracted from [9], and numerically illustrated in Fig. 4, we consider a lip structure associated with the fourth higher harmonic predominance (Fig. 4). This lip structure, made of two fold bifurcation curves $\Lambda_{(1)_0}^{4}$ and $\Lambda_{(1)_0}^{4'}$ joined at their extremities in two cuspidal points $C_{1}^{4}$ and $C_{1'}^{4}$, is surrounded by a pitchfork bifurcation curve. This means that we have two symmetric lips, sketched out in the foliation structure of Fig. 7. A flip curve $\Lambda_{1}^{4}$ is also related to such a symmetric fold structure.

Fig. 4 Isoordinal lips cascade and associated flip and pitchfork curves

Fig. 5 The lip bifurcation structure ($L^4$).

Let us consider a point chosen between two fold bifurcation points A1 and A1' (see Fig. 8). The vertical cross-section reveals seven fixed points, of which three are unstable (M2, M4 and M6) and four are stable (M1, M3, M5 and M7). We are concerned with the attractive periodic solutions. In Fig. 9 the time series, phase trajectories and spectra of these attractors are presented. In the phase trajectories, a small red point identifies the fixed point, which is the accumulation point in the attraction basin of the attractor. The immediate basins of the stable attractors M1, M3, M5 and M7 are numerically illustrated in this section. The lip structure chosen below, related to an even higher harmonic predominance, is meant to provide four different stable attractors. Each of these attractors has its own stability domain, which will be estimated in the phase plane (x, y).

Fig. 6 Detailed bifurcation diagram of W-section
We shall consider in this section two cross-sections: the (W)-section, for $\omega$ = constant, and the (E)-section, for $e_m$ = constant, in two different regions (Z1) and (Z2) respectively, as in Fig. 5. Detailed diagrams of these cross-sections are given in Fig. 6 and Fig. 7. The (W)-section corresponds to relatively small values of $\omega$ and $e_m$, and includes a set of points $(\omega, e_m)$ bounded by two fold bifurcation points. This section is relatively far from the flip bifurcation curve and includes only fold bifurcations. The second cross-section, namely the (E)-section, is chosen in order to analyze the effect of both fold and flip bifurcations on the shapes and sizes of the attraction basins; this section contains two tangent bifurcations and a flip bifurcation. Also, we recall from section 2 that the axes of our phase plane are defined by the state variables $\phi(t)$ on the x-axis and $u(t)$ on the y-axis.
Fig. 8 Foliation of the bifurcation diagram.
5.1 W-section attraction basins
As mentioned above, this section intersects the lip structure $L^4$ in two fold bifurcation points ($\omega = 50$, $e_m = 112.959$) and ($\omega = 50$, $e_m = 128.256$). Choosing two points of the W-section, for a fixed value $\omega = 50$ and for the two values $e_m = 113.261$ and $e_m = 116.193$, the attraction basins are numerically computed by using the phase plane 'scanning' technique. The method consists in dividing a phase plane cell $[x_{Min}, x_{Max}] \times [y_{Min}, y_{Max}]$ into small pixels having the same dimensions, width $\Delta x$ and height $\Delta y$. The basin is computed by numerically integrating the differential equation starting from the set of initial conditions on the resulting 400 × 400 grid and, in each case, after allowing the transient to decay sufficiently, deciding which solution has been reached. Finally, each of the $16 \times 10^4$ pixels in the figure takes the color assigned to the attractive periodic solution (or attractor) reached from the initial point it contains.
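The scanning procedure can be sketched on a deliberately simple stand-in. The code below does not integrate the circuit equations: it assumes a toy bistable oscillator $\ddot{x} + 0.5\,\dot{x} - x + x^3 = 0$, whose two point attractors $(\pm 1, 0)$ are known in closed form, so that the "which solution has been reached" test is easy to state. The function names, the step size, the integration time and the tolerance are all assumptions chosen for illustration.

```python
import math

def rhs(x, y):
    # Toy bistable oscillator (an assumption, NOT the circuit of the paper):
    # x'' + 0.5 x' - x + x^3 = 0, written as a first-order system.
    return y, -0.5 * y + x - x**3

def rk4_step(x, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = rhs(x, y)
    k2x, k2y = rhs(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
    k3x, k3y = rhs(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
    k4x, k4y = rhs(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)

ATTRACTORS = [(1.0, 0.0), (-1.0, 0.0)]        # known attractors of the toy system

def reached_attractor(x, y, h=0.05, steps=800, tol=0.1):
    """Integrate past the transient, then return the index of the attractor
    reached, or -1 if the state is not within tol of any attractor."""
    for _ in range(steps):
        x, y = rk4_step(x, y, h)
    for idx, (ax, ay) in enumerate(ATTRACTORS):
        if math.hypot(x - ax, y - ay) < tol:
            return idx
    return -1

def scan_basins(x_min, x_max, y_min, y_max, n):
    """Label every node of an n x n grid of initial conditions."""
    dx = (x_max - x_min) / (n - 1)
    dy = (y_max - y_min) / (n - 1)
    return [[reached_attractor(x_min + i * dx, y_min + j * dy)
             for i in range(n)] for j in range(n)]
```

Each label plays the role of a pixel color. For a forced system such as (3), the comparison would be made on the stroboscopic (Poincaré) samples rather than on the continuous state, and a 400 × 400 grid would call for a compiled implementation.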
Fig. 7 Detailed bifurcation diagram of E-section.
The details of the intersection of the W-section with the lip structure are given in Fig. 5; we have two different fold bifurcation points, f1 and a1. Since the parameter space is foliated [16], the two fold curves including f1 and a1 respectively are the boundaries of three different sheets: two stable sheets related through a third, unstable one. This kind of bifurcation feature exhibits the phenomena of jump and hysteresis. At the point fc2 ($\omega = 50$, $e_m = 97.785$) and for increasing values of $e_m$, a stable fixed point undergoes a pitchfork bifurcation, becoming unstable and generating two symmetric stable fixed points.
The phase trajectories of the stable attractors, which are periodic solutions of the original differential system having the period of the forcing sinusoidal input (normalized to $2\pi$), are given in Fig. 10.
Fig.10. Phase trajectories of stable attractors.
The attraction basins of the four stable attractors corresponding to the same system parameters ($\omega = 50$, $e_m = 113.261$) are given in Fig. 11; A1, A2, A3 and A4 are the accumulation points of these stability domains.
Fig.11 Attraction basins in proximity of fold bifurcation (w= 50, em = 113.261199).
Fig. 9 Spectra and phase trajectories of stable attractors.
The attraction basins of A2 and A4 are clearly smaller than those of A1 and A3; this is due to the fact that the symmetric attractors A2 and A4 are very close to a fold bifurcation. Picking another point of the W-section, relatively far from the two bifurcation points ($\omega = 50$, $e_m = 116.193$), the obtained basins are shown in Fig. 12. It is worth noting that the sizes of the stability domains of A2 and A4 increase; thus these domains vanish in the vicinity of the bifurcation points. The attraction basins appear to be scrolled around a central part that includes the accumulation points.
Fig. 12 Attraction basins of attractors relatively far from fold bifurcation ($\omega$ = 50, $e_m$ = 116.193).

5.2 E-section attraction basins
The E-section includes two fold bifurcations and a flip bifurcation. In this particular case we choose two points: ($\omega = 545.989$, $e_m = 14 \times 10^3$), near a fold bifurcation, and ($\omega = 557.629$, $e_m = 14 \times 10^3$), close to both a flip and a fold bifurcation. In the latter case there is no contact between the two bifurcation points because they are not in the same sheet. In the first case, two attractors M1' and M7' are in the proximity of a fold bifurcation (Fig. 13), which is why their stability domains are greatly reduced. In contrast, when the four attractors are all close to bifurcation points (two attractors close to a fold bifurcation and the two others close to a flip bifurcation), their attraction basins have relatively large sizes (Fig. 14).

Fig. 13 Attraction basins in proximity of fold bifurcation ($\omega$ = 545.989, $e_m$ = 14000).

Fig. 14 Attraction basins in proximity of two bifurcation points ($\omega$ = 557.629, $e_m$ = 14000).

6. Conclusion
We have presented a combined qualitative and numerical analysis of the global behavior of a nonlinear RLC circuit by investigating both the dynamic responses of the nonlinear model and the bifurcation structure in the amplitude-pulsation parameter plane. A particular bifurcation structure known as the isoordinal lips cascade is analyzed. In particular, we have numerically illustrated the effect of parametric singularities such as fold, flip and cuspidal bifurcations on a phase plane singularity, namely the attraction basins of stable attractors. Some properties of these stability areas change when approaching the neighborhood of these kinds of bifurcation points. In addition, several basic properties of the proposed oscillator, such as multistability and symmetry, are pointed out.

Appendix
Using our bifurcation computing algorithms developed in FORTRAN, the numerical results in this work are obtained with the following values of the RLC circuit parameters.

Table 1: RLC circuit parameters
R [Ω]    C [µF]    a1       a3
20       1         0.015    0.365
Acknowledgments
This work was supported by the Networks and Electrical Machines Research Unit (RME). Directed by Professor Rachid DHIFAOUI, RME is established at INSAT-Tunis, Tunisia.
References
[1] C. Mira, Chaotic dynamics: from the one-dimensional endomorphism to the two-dimensional diffeomorphism, World Scientific, 1987.
[2] Helena E. Nusse and James A. Yorke, Bifurcations of basins of attraction from the view point of prime ends, Topology and its Applications, Vol. 154, No. 13, July 1, 2007, pp. 2567-2579.
[3] Wanda Szemplinska-Stupnicka and Elzbieta Tyrkiel, Effects of Multi Global Bifurcations on Basin Organization, Catastrophes and Final Outcomes in a Driven Nonlinear Oscillator at the 2T-Subharmonic Resonance, Nonlinear Dynamics, Vol. 17, No. 1, September 1998, pp. 41-59.
[4] Michele V. Bartuccelli, Jonathan H.B. Deane and Guido Gentile, Bifurcation phenomena and attractive periodic solutions in the saturating inductor circuit, Proceedings of the Royal Society A, Vol. 463, No. 2085, September 8, 2007, pp. 2351-2369.
[5] Munehisa Sekikawa, Naohiko Inaba, Tetsuya Yoshinaga and Hiroshi Kawakami, Bifurcation structure of fractional harmonic entrainments in the forced Rayleigh oscillator, Electronics and Communications in Japan, Part 3, Vol. 87, No. 3, 2004, pp. 30-40.
[6] Paul Bryant and Carson Jeffries, Bifurcations of a Forced Magnetic Oscillator near Points of Resonance, Physical Review Letters, Vol. 53, No. 3, July 16, 1984.
[7] Kenjiro Yamaguchi and Genji Yorimitsu, Bifurcation Phenomena of a Forced Self-Oscillatory System, Electronics and Communications in Japan, Part 3, Vol. 82, No. 9, 1999.
[8] J.D. Jeng, Y. Kang and Y.P. Chang, An Alternative Poincare Section for Steady-State Responses and Bifurcations of a Duffing-Van der Pol Oscillator, WSEAS Transactions on Systems, Vol. 7, No. 6, June 2008.
[9] H. Khammari, C. Mira and J.P. Carcasses, Behavior of harmonics generated by a Duffing type equation with a nonlinear damping: part I, 12th IEEE International Conference on Electronics, Circuits and Systems ICECS 2005, 11-14 Dec, Gammarth, Tunisia, 2005.
[10] H. Khammari and M. Benrejeb, Tangent bifurcation in doubling period process of a resonant circuit's responses, IEEE International Conference on Industrial Technology ICIT 2004, Hammamet, Tunisia, December 8-10, 2004.
[11] C. Mira, H. Kawakami, M. Touzani-Qriouet, Bifurcations structures generated by the non-autonomous Duffing equation, International Journal of Bifurcation and Chaos, Vol. 9, No. 7, 1999, pp. 1363-1379.
[12] H. Kawakami, Bifurcation of periodic responses in forced dynamic nonlinear circuits: computation of bifurcation values of the system parameters, IEEE Transactions on Circuits and Systems, Vol. 31, No. 3, 1984, pp. 248-260.
[13] Igor Gumowski, C. Mira, Recurrence and Discrete Dynamic Systems, Springer-Verlag, August 1980.
[14] C. Mira, D. Fournier-Prunaret, L. Gardini, H. Kawakami and J.C. Cathala, Basin bifurcations of two-dimensional noninvertible maps: fractalization of basins, International Journal of Bifurcation and Chaos in Applied Sciences and Engineering, Vol. 4, No. 2, 1994, pp. 343-382.
[15] H. Khammari, J.P. Carcasses and M. Benrejeb, Bifurcations of periodic solutions and higher harmonic oscillations in the RLC-circuit, CESA 98, Tunisia, April 4-5, 1998.
[16] C. Mira, J.P. Carcasses, C. Simo, J.C. Tatjer, Crossroad area-spring area transition. (II) Foliated parametric representation, International Journal of Bifurcation and Chaos, Vol. 1, No. 2, 1991, pp. 339-348.
Nizar JABLI, PhD Student. He was born in Gafsa, Tunisia, in 1977. He received the engineer diploma and the master degree from the National Engineering School of Sfax and of Monastir, Tunisia, in 2003 and 2005 respectively. He is currently working toward the PhD degree at Monastir University, Tunisia. His research interests are in the analysis and control of complex nonlinear electrical circuits and power systems: bifurcation and chaos theory in electrical engineering applications. N. JABLI is a member of IEEE and a member of the RME Research Unit.
Hedi KHAMMARI, PhD. He was born in Kairouan, Tunisia, in 1963. He received the engineer diploma and the Master degree from the National Engineering School of Tunis in 1988 and 1990 respectively. He received the PhD in Electrical Engineering in 1999. He is currently Associate Professor at Taief University, Saudi Arabia. His research interests are mainly in the area of nonlinear dynamics and the application of chaos theory in different fields, namely communication, electric systems and bioinformatics.
Mohamed Faouzi MIMOUNI, University Professor. He was born in Siliana, Tunisia, in 1960. He received the PhD in Electrical Engineering in 1997. He is currently Professor at Monastir University, Tunisia. His research interests are in the control of electrical asynchronous machines and power systems. He is in charge of the RME-Monastir research unit (Monastir section).
IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 3, No 3, May 2010 ISSN (Online): 1694-0784 ISSN (Print): 1694-0814
Extended Diffie-Hellman Technique to Generate Multiple Shared Keys at a Time with Reduced KEOs and its Polynomial Time Complexity
Nistala V.E.S. Murthy (1) and Vankamamidi S. Naresh (2)
(1) Department of Computer Science and Systems Engineering, Andhra University, Visakhapatnam-530003, India
(2) Department of Computer Science, S.V.K.P. and Dr. K.S.R. Arts and Science College, Penugonda-534320, India
Abstract
Recently Biswas [1] extended the Diffie-Hellman technique to generate multiple two-person shared keys by the exchange of two public keys. In this paper, we further generalize the Diffie-Hellman technique to generate multiple two-person shared keys by the exchange of any number of public keys, and study its polynomial time complexity, security, etc. We also derive an upper bound for the number of shared keys in terms of the number of exchanged keys and, for a given number of shared keys, the minimum number of keys that must be exchanged. Lastly, a comparative study between the proposed technique and the Diffie-Hellman technique repeated m times is made.
Keywords: Diffie-Hellman technique, DDH problem, multiple shared keys, key exchange operations, secure data transmission.
1. Introduction
Secure Data Transmission (SDT) is one of the most important parts of communication these days, with many transactions now carried out electronically. This SDT has to take place on public transmission channels such as the Internet, which are open to everyone, including cyber criminals. Hence, cryptosystems were developed and are constantly under research. Some cryptosystems (used between two persons) involve the exchange of public keys on public transmission channels and the construction of a shared key in private, using private keys. One logical way to increase security in SDT is to change the cryptosystem keys as frequently as possible. In fact, the ability to dynamically and publicly establish a session key for secure communication is a big challenge in cryptography. In this direction, Diffie and Hellman developed a simple and easy to implement technique, hereafter referred to as D-H. At any time, between two people, it requires the exchange of one public key from each one to construct one shared/common key, using their two private keys. Biswas [1] generalized D-H to generate 15 multiple keys using two public keys each between two persons. In this paper, we further generalize Biswas [1] to generate $2^{m^2}-1$ multiple keys using m public keys between two persons. The main advantages of the present work are:
i. The proposed technique significantly reduces not only the computational cost but also the key exchange overheads.
ii. Depending on the application and the security needed, by increasing the number of sessions and the number of shared keys, we can generate a sufficient number of shared keys N by selecting
$$m = \left\lceil \sqrt{\log(N+1)/\log 2} \right\rceil$$
One might think that repeated use of D-H itself might solve the problem in multiple sessions. But observe that it brings in additional key exchange operations (KEOs) per session and an increase in overhead (Cf. [2], [4]). In such conditions, if multiple shared keys are exchanged securely at a time with comparatively fewer KEOs, and if a key or even multiple keys are used in the same session, this not only eases the establishment of session keys but also reduces the key exchange overhead significantly. In what follows, we first recall the elements of the Diffie-Hellman technique.

1.1. Diffie-Hellman technique
Diffie and Hellman [3] introduced the concept of a two-person key exchange technique that allows two participants to exchange two public keys through an unsecured channel and generate a shared secure key between them. Let A and B be two participants that do not know anything about each other but wish to establish a secure shared key between them. Then they execute the following five steps:
Step 1: Both A and B agree on two large positive integers, n and g, such that n is a prime number and g is a group generator.
Step 2: A randomly chooses a positive integer x, smaller than n, which serves as A's private key. Similarly, B chooses its own private key y.
Step 3: A and B compute their public keys as $X = g^x \bmod n$ and $Y = g^y \bmod n$, respectively.
Step 4: They exchange their public keys through a public communication channel.
Step 5: Both A and B now compute their shared key K:
A computes $K = Y^x \bmod n = g^{xy} \bmod n$;
B computes $K = X^y \bmod n = g^{xy} \bmod n$.
For practical applications, it is assumed that D-H satisfies the decisional DH (DDH) assumption (Cf. [5], [6]), which means that no polynomial time algorithm exists to compute K upon knowing X, Y, g and n. In what follows, we first outline the Multiple Shared Key technique using m public keys, which we refer to as MSK-m. Next, we arrive at (1) an upper bound for the total number of shared keys N (not necessarily distinct) that can be generated in MSK-m in terms of the number of public keys m, and (2) a formula for the minimum number of public keys m required to generate a required number of shared keys N. Then we compare MSK-m with the repeated application of D-H, establishing its polynomial time complexity. Lastly, some security aspects of MSK-m and the selection/communication of shared keys are discussed.

2.0 Generation of $2^{m^2}-1$ two-person shared keys at a time by exchanging m public keys
In brief, each person in the proposed technique chooses m random values, namely $x_1, x_2, \dots, x_m$, and generates m public keys $X_1, X_2, \dots, X_m$. These keys are then exchanged between the two participants and multiple shared keys are generated. The details of shared key generation are given below.

2.1 Multiple Shared Key (MSK) technique
The participant A generates m public keys as
$$X_i = g^{x_i} \bmod n, \quad 1 \le i \le m$$
and sends them to B. So $X_i$ and $x_i$, for i = 1, 2, ..., m, are the public and private keys of A. Similarly, the participant B chooses private keys $y_1, y_2, \dots, y_m$ randomly, generates m public keys as
$$Y_j = g^{y_j} \bmod n, \quad 1 \le j \le m$$
and sends them to A. On exchange of the m pairs of public keys, the participants can generate more than one shared key because, instead of the single combination of private keys (xy) that exists in basic D-H, $m^2$ combinations $(x_i y_j)$, $1 \le i \le m$, $1 \le j \le m$, exist, and each of them can generate a D-H style key. The generation of the $m^2$ shared keys is as follows. The person A can compute the following keys:
$$K_{ij} = Y_j^{x_i} \bmod n = g^{x_i y_j} \bmod n, \quad 1 \le i \le m,\ 1 \le j \le m.$$
The person B can compute the following keys:
$$K_{ij} = X_i^{y_j} \bmod n = g^{x_i y_j} \bmod n, \quad 1 \le i \le m,\ 1 \le j \le m.$$
Both computations give the same value $K_{ij}$ for $1 \le i \le m$, $1 \le j \le m$. These $m^2$ keys are called the base keys. Additional shared keys, called extended keys, can be derived by multiplying these base keys in different combinations, as follows. Multiplying two base keys at a time out of the $m^2$ base keys generates $C(m^2, 2)$ shared keys such as
$$K = K_{ij} \times K_{kl} = g^{x_i y_j + x_k y_l} \bmod n.$$
Multiplying three base keys at a time out of the $m^2$ base keys generates $C(m^2, 3)$ shared keys such as
$$K = K_{ij} \times K_{kl} \times K_{pq} = g^{x_i y_j + x_k y_l + x_p y_q} \bmod n,$$
and so on. Multiplying $m^2 - 1$ base keys at a time out of the $m^2$ base keys generates $C(m^2, m^2 - 1)$ shared keys such as
$$K = K_{11} \times K_{12} \times \dots \times K_{1m} \times \dots \times K_{m1} \times K_{m2} \times \dots \times K_{m\,m-1} = g^{\sum_{(i,j) \ne (m,m)} x_i y_j} \bmod n.$$
Finally, multiplying all $m^2$ base keys generates only one shared key
$$K = K_{11} \times K_{12} \times \dots \times K_{1m} \times \dots \times K_{m1} \times K_{m2} \times \dots \times K_{mm} = g^{\sum_{i=1}^{m}\sum_{j=1}^{m} x_i y_j} \bmod n.$$
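The key generation described above can be sketched in a few lines. This is an illustration under toy assumptions, not the authors' implementation: the modulus n = 23, the generator g = 5 and the private keys are tiny hand-picked values (real use requires a large prime and random private keys), and the function names are hypothetical. Basic D-H is recovered as the case m = 1.

```python
from itertools import combinations

def public_keys(private, g, n):
    """Public keys g^x mod n for each private key x."""
    return [pow(g, x, n) for x in private]

def base_keys_A(xs, Y, n):
    """Base keys as computed by A: K_ij = Y_j^{x_i} mod n."""
    return {(i, j): pow(Y[j], xs[i], n)
            for i in range(len(xs)) for j in range(len(Y))}

def base_keys_B(ys, X, n):
    """Base keys as computed by B: K_ij = X_i^{y_j} mod n."""
    return {(i, j): pow(X[i], ys[j], n)
            for i in range(len(X)) for j in range(len(ys))}

def all_shared_keys(base, n):
    """Base keys plus extended keys: one key per non-empty subset of base keys."""
    labels = sorted(base)
    shared = {}
    for r in range(1, len(labels) + 1):
        for subset in combinations(labels, r):
            k = 1
            for label in subset:
                k = (k * base[label]) % n
            shared[subset] = k
    return shared

# Toy parameters (assumed for illustration only).
n, g = 23, 5
xs, ys = [6, 9], [15, 4]                      # private keys of A and of B
X, Y = public_keys(xs, g, n), public_keys(ys, g, n)
keys_A = all_shared_keys(base_keys_A(xs, Y, n), n)
keys_B = all_shared_keys(base_keys_B(ys, X, n), n)
```

With m = 2 both participants end up with the same 15 labelled keys. With such a small modulus many of the values coincide, which is consistent with the remark that the generated keys are not necessarily distinct.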
Using the above discussion, in what follows, we arrive at the total number of shared keys generated at a time, in terms of the number of exchanged keys.

2.2 An upper bound for the total number of shared keys in the proposed technique
The total number of shared keys = number of base keys + number of extended keys, which can easily be seen to be
$$\sum_{k=1}^{m^2} C(m^2, k) = 2^{m^2} - 1.$$
Observe that for m = 2 we get the 15 shared keys mentioned in Biswas [1].

Example
Number of public keys exchanged (m) | Number of base keys ($m^2$) | Total number of shared keys ($2^{m^2}-1$)
1 | $1^2 = 1$ | $2^1 - 1 = 1$
2 | $2^2 = 4$ | $2^4 - 1 = 15$
3 | $3^2 = 9$ | $2^9 - 1 = 511$
4 | $4^2 = 16$ | $2^{16} - 1 = 65535$
... | ... | ...
2.3 A lower bound for m to generate the required number of shared keys
To generate at least N shared keys we need
$$2^{m^2} - 1 \ge N \;\Rightarrow\; 2^{m^2} \ge N + 1 \;\Rightarrow\; m^2 \log 2 \ge \log(N+1) \;\Rightarrow\; m \ge \sqrt{\log(N+1)/\log 2}.$$
Therefore, by selecting $m = \left\lceil \sqrt{\log(N+1)/\log 2} \right\rceil$ we can generate at least N shared keys.
Example: if we require N = 50000 shared keys, we find that
$$m = \left\lceil \sqrt{\log(50001)/\log 2} \right\rceil = \lceil 3.950907406 \rceil = 4.$$
Thus, by exchanging 4 public keys one can generate as many as 65535 shared keys, possibly not all distinct.
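The bound can be cross-checked with exact integer arithmetic, which avoids any rounding issue in the logarithm and square root. The sketch below simply searches for the smallest m with $2^{m^2} - 1 \ge N$; the function names are hypothetical.

```python
def total_shared_keys(m):
    """Total number of shared keys obtainable from m public keys per party."""
    return 2 ** (m * m) - 1

def min_public_keys(N):
    """Smallest m such that 2^(m^2) - 1 >= N (N is assumed to be >= 1)."""
    m = 1
    while total_shared_keys(m) < N:
        m += 1
    return m
```

For N = 50000 the search stops at m = 4, in agreement with the worked example, since three public keys give only 511 keys while four give 65535.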
2.4. Comparison between MSK-m and D-H repeated m-times and the Polynomial Time Complexity of MSK-m
First observe that:
i. To generate $2^{m^2}-1$ shared keys, the D-H technique requires $2^{m^2}-1$ rounds, whereas the proposed technique (MSK-m) requires a single round.
ii. To generate $2^{m^2}-1$ shared keys, the D-H technique requires the interchange of $2(2^{m^2}-1)$ messages, and MSK-m requires 2m messages.
iii. To generate $2^{m^2}-1$ shared keys, the D-H technique requires $2^2(2^{m^2}-1)$ exponential operations, and MSK-m requires $2m^2 + 2m$.
iv. To generate $2^{m^2}-1$ shared keys, the D-H technique requires no multiplications, and MSK-m requires $2(2^{m^2-1}(m^2-2)+1)$. (A key formed from k base keys costs k − 1 multiplications, so each participant performs $\sum_{k=2}^{m^2} C(m^2,k)(k-1) = 2^{m^2-1}(m^2-2)+1$ of them.)
v. The time complexity of D-H is
$$T_{D\text{-}H}(m) = C_e\, 2^2 (2^{m^2}-1) = O(2^{m^2})$$
where $C_e$ denotes the time needed for the execution of one exponential operation. So D-H has non-polynomial (exponential) time complexity in m, and hence requires more time for execution. Since multiplications are much less expensive than exponentiations, we consider only the exponential operations for the time complexity of MSK-m:
$$T_{MSK\text{-}m}(m) = C_e (2m^2 + 2m) = O(m^2).$$
Thus MSK-m has polynomial (quadratic) time complexity, so it requires less time for execution than D-H.

Observe that, for $2^{m^2}-1$ shared keys:
Technique | Number of rounds | Number of messages interchanged | Number of exponentiations | Number of multiplications | Time complexity
D-H | $2^{m^2}-1$ | $2(2^{m^2}-1)$ | $2^2(2^{m^2}-1)$ | NIL | $O(2^{m^2})$ (exponential)
MSK-m | 1 | 2m | $2m^2+2m$ | $2(2^{m^2-1}(m^2-2)+1)$ | $O(m^2)$ (quadratic)

2.5. Security of the proposed technique
We rely on the Decision Diffie-Hellman (DDH) assumption for the security of the shared keys generated using the Group-DH technique. For this we assume large finite cyclic groups in which the DDH assumption holds (Cf. [5]); it is widely believed that there exist such groups for which DDH is intractable. For instance, let p, q and g be publicly known, where p and q are large primes with p = 2q + 1, g is a generator of the subgroup $QR_p$ of $Z_p^*$, and $q = |QR_p|$. If the participants choose their private keys from $Z_q = \{0, 1, 2, \ldots, q-1\}$, then an adversary cannot distinguish between random keys in $QR_p$ and the keys that are generated by the DH technique. In what follows, we recall some material from Zheng, Manz, Alves-Foss and Chen [7].
DDH assumption: Let G be a finite cyclic group and g a generator. Given $(g^a, g^b, g^{ab})$ and $(g^a, g^b, g^c)$ for random $a, b, c \in [1, |G|]$, no efficient algorithm can decide whether $c = ab$ in G. In other words, the value $g^{ab}$ is indistinguishable in polynomial time from a random element of G.
Using the DDH assumption, if $K_1$ and $K_2$ are random numbers and $PK_i$ for i = 1, 2 are the corresponding public values, then the shared value $K = PK_1 \exp K_2 = PK_2 \exp K_1$ obtained using two-party secure group key exchange is indistinguishable in polynomial time from a random value, where exp is the exponentiation operation. From the security point of view, the above assumption is very strong, and many secure practical cryptographic systems are based on it. The present paper also follows this assumption to generate secure multiple two-party shared keys. Now we make the following Proposition.
2.5.1 Proposition: The $m^2$ two-party shared keys $k_1, k_2, k_3, \ldots, k_{m^2}$ (base keys) derived in the application of the basic DH technique are indistinguishable in polynomial time from random numbers.
Proof: Since each of the $m^2$ shared keys is basically a D-H style key, and the proposition is true for a D-H shared key, our Proposition follows.
Corollary 1: The extended $2^{m^2}-1-m^2$ shared keys generated by multiplying the $m^2$ base keys in different combinations are also indistinguishable in polynomial time from random numbers.
For a communication, the participants must agree upon a particular key. One method for the selection of a shared key is shown below.
2.6 Selection of shared key
Now that the proposed extension can generate multiple shared keys, it is necessary to be able to select a key for a session. Here we suggest a method similar to that in the well-known Merkle's puzzle, which is recalled below:
A party generates n messages, each having a different puzzle number and a secret key number, and sends all the messages to the other party in encrypted form. Note that a different 20-bit key is used for the encryption of each message.
The other party chooses one message at random and performs a brute-force attack to decrypt it; although this needs a large amount of work, it is still computable. It then encrypts its message with the key thus recovered and sends it to the first party along with the puzzle number. Since the first party knows the puzzle number, it identifies the key and decrypts the message.
Similarly, in order to select a key out of the $2^{m^2}-1$ shared keys, one party generates a message comprising a shared key and a puzzle number. After encrypting it with either the smallest or the largest of the shared keys generated, it is sent to the other party. The message is easily decrypted, as the recipient knows all keys, and the (shared key, puzzle number) pair is recovered. The recipient then sends to the first party either the puzzle number alone or an encrypted message along with the puzzle number, where the message is encrypted with the shared key found. Since the first party knows the puzzle number, it identifies the session key and can decrypt the message. For subsequent changes, the present session key may be used to encrypt a shared key at one end, and it is decrypted at the other end to obtain the new session key.
Conclusions: Since the D-H technique is simple and easy to implement, and since MSK-m uses only the D-H style for shared key generation, MSK-m is also easy to implement. Further, using the lower and upper bounds for m and N (Cf. 1.2 and 1.3), the security levels can be increased in SDT with relatively little operational overhead (Cf. 1.4).
Acknowledgments
The author would like to thank the Management of S.V.K.P. and Dr. K.S.R. Arts and Science College for sponsorship and financial support.
References
[1] Biswas G.P., Diffie Hellman Technique Extended To Multiple Two Party Keys And One Multi Party Key, IET Inf. Sec., Vol. 2(1), pp. 12–18, 2008.
[2] Menezes A.J., Elliptic Curve Public Key Crypto Systems, Kluwer Academic Publishers, 1993.
[3] Diffie W. and Hellman M., New Directions in Cryptography, IEEE Trans. Inf. Theory, Vol. 22(6), pp. 644–654, 1976.
[4] Stallings W., Cryptography And Network Security, Principles And Practices, Pearson Education, 3rd Edn., 2004.
[5] Boneh D., The Decision Diffie-Hellman Problem, Proc. 3rd Algorithmic Number Theory Symposium, Lecture Notes in Computer Science, 1423, pp. 48–63, 1998.
[6] Boneh D. and Venkatesan R., Breaking RSA May Not Be Equivalent to Factoring, Advances in Cryptology, EUROCRYPT'98, pp. 59–71, 1998.
[7] Zheng S., Manz D., Alves-Foss J. and Chen Y., Security And Performance Of Group Key Agreement Protocols, Proc. IASTED Int. Conf. Networks and Communications Systems, pp. 321–327, March 2006.
[8] Merkle R.C., Secrecy, Authentication And Public Key Systems, Communications ACM, Vol. 21(4), pp. 294–299, 1978.
About the Authors
Nistala V.E.S. Murthy is currently working as a Professor in the Department of Computer Science and Systems Engineering of Andhra University, Visakhapatnam. He developed f-Set Theory, wherein f-maps exist between fuzzy sets with truth values in different complete lattices, generalizing the L-fuzzy set theory of Goguen, which in turn generalized the [0,1]-fuzzy set theory of Zadeh, the Father of Fuzzy Set Theories. He has also published papers on the representation of various fuzzy mathematical (sub)structures in terms of their appropriate crisp cousins.
Vankamamidi Srinivasa Naresh is currently working as a Director for the Post Graduate Department of Computer Science Courses in S.V.K.P. and Dr. K.S.R. Arts and Science College. He obtained an M.Sc. in Mathematics from Andhra University, an M.Phil. in Mathematics from Madurai Kamaraj University and an M.Tech in Computer Science and Engineering from J.N.T. University, Hyderabad. He is also a recipient of the U.G.C.-C.S.I.R. Junior Research Fellowship and has cleared the NET for Lectureship.
IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 3, No 3, May 2010 ISSN (Online): 1694-0784 ISSN (Print): 1694-0814
An Effective Technique for Clustering Incremental Gene Expression Data
Sauravjyoti Sarmah^1 and Dhruba K. Bhattacharyya^2
^{1,2} Department of Computer Science & Engg., Tezpur University^3, Tezpur-784028, Assam, India
Abstract
This paper presents a clustering technique (GenClus) for gene expression data which can also handle incremental data. It is based on a density-based approach. It retains the regulation information, which is also the main advantage of the clustering. It uses no proximity measures and is therefore free of the restrictions imposed by them. GenClus is capable of handling datasets which are updated incrementally. Experimental results show the efficiency of GenClus in detecting quality clusters over gene expression data. Our approach improves the cluster quality by identifying sub-clusters within big clusters. It was compared with some well-known clustering algorithms and found to perform well in terms of the z-score cluster validity measure.
Keywords: Clustering, microarray, gene expression, density based, incremental.
1. Introduction
According to [1], most data mining algorithms developed for microarray gene expression data deal with the problem of clustering. Clustering groups similar genes into the same cluster based on a proximity measure, so that genes in the same cluster have similar expression patterns. One of the characteristics of gene expression data is that it is meaningful to cluster both genes and samples. Co-expressed genes can be grouped into clusters based on their expression patterns [2], [3]. In gene-based clustering, the genes are treated as the objects, while the samples are the features. In sample-based clustering, the samples are partitioned into homogeneous groups, where the genes are regarded as features and the samples as objects. Both the gene-based and sample-based clustering approaches search for exclusive and exhaustive partitions of objects that share the same feature space. The third category, subspace clustering, captures clusters formed by a subset of genes across a subset of samples. For subspace clustering algorithms, either genes or samples can be regarded as objects or features. The details of the challenges and the representative clustering techniques will be discussed in Section 2. The result of clustering is dependent on the proximity measure used [4], and different measures give different results.
In this paper, we introduce an effective gene-based clustering approach (GenClus), which is capable of identifying clusters and sub-clusters of arbitrary shapes in any gene expression dataset, even in the presence of noise. GenClus attempts to find sub-clusters which may be relevant for biologists. It does not use any proximity measure while clustering the genes and is therefore free from the restrictions imposed by various proximity measures. GenClus gives a hierarchical view of the clusters and sub-clusters formed.
The development of internet technology and the constant increase in the number of microarray experiments conducted have led to an ever-increasing volume of data. There is therefore a need for incremental clustering, so that updates can be clustered in an incremental manner; incremental clustering techniques have often been found suitable for handling such growth in the volume of microarray data. This paper also introduces an incremental version of GenClus, i.e., InGenClus, which has been found to perform well over several gene datasets. Both GenClus and InGenClus are significant in view of the following points: they provide a hierarchical cluster solution; they are free from the use of proximity measures; they offer faster processing due to a simplified matching mechanism; they are capable of handling noisy datasets; and they do not require the number of clusters a priori. GenClus improves the quality of the clusters by identifying sub-clusters within large clusters. It can also handle the situation when the database is updated incrementally, using less computation time.
^3 The department is funded by UGC's DRS Phase I under the SAP.
In Section 2, we present some gene-based clustering techniques, Section 3 presents our proposed clustering algorithm GenClus and Section 4 presents our proposed incremental version of GenClus (InGenClus). Section 5 reports the performance evaluation of the algorithms and finally Section 6 presents the conclusion. Next, we discuss some of the clustering techniques.
2. Clustering Techniques
The goal of gene-based clustering is to group co-expressed genes together. Co-expressed genes indicate co-function and co-regulation [4]. Gene expression data has certain special characteristics, and clustering it is a challenging research problem. Here, we will first present the challenges of gene-based clustering and then review a series of gene-based clustering algorithms.
2.1 Challenges of Gene-based Clustering
The purpose of clustering gene expression data is to reveal the natural structure inherent in the data. A good clustering algorithm should depend as little as possible on prior knowledge, such as a predetermined number of clusters supplied as an input parameter. Clustering algorithms for gene expression data should be capable of extracting useful information from noisy data. Gene expression data are often highly connected and may have intersecting and embedded patterns [5]. Therefore, algorithms for gene-based clustering should be able to handle this situation effectively. Finally, biologists are interested not only in the clusters of genes, but also in the relationships (i.e., closeness) among the clusters and their sub-clusters, and in the relationships among the genes within a cluster (e.g., which gene can be considered the representative of the cluster and which genes are at the boundary of the cluster). A clustering algorithm which also provides some graphical representation of the cluster structure is much favored by biologists.
2.2 Gene based Clustering Techniques: A brief review
A large number of clustering techniques have been reported for analyzing gene expression data. They have been broadly classified into the following approaches.
Partitional Approaches: K-means [6] is a typical partition-based clustering algorithm used for clustering gene expression data. It divides the data into a pre-defined number of clusters in order to optimize a predefined criterion. Its major advantages are its simplicity and speed, which allow it to run on large datasets. However, it may not yield the same result with each run of the algorithm. Often, it is found incapable of handling outliers and is not suitable for detecting clusters of arbitrary shapes. A Self Organizing Map (SOM) [7] is more robust than K-means for clustering noisy data. It requires the number of clusters and the grid layout of the neuron map as user input. Specifying the number of clusters in advance is difficult in the case of gene expression data. Moreover, partitioning approaches are restricted to data of lower dimensionality with inherent well-separated clusters of high density, whereas gene expression datasets may be high dimensional and often contain intersecting and embedded clusters. A hierarchical structure can also be built based on SOM, such as the Self-Organizing Tree Algorithm (SOTA) [8]. Another example of a SOM extension is the Fuzzy Adaptive Resonance Theory (Fuzzy ART) [9], which provides some approaches to measure the coherence of a neuron (e.g., the vigilance criterion). The output map is adjusted by splitting the existing neurons or adding new neurons into the map, until the coherence of each neuron in the map satisfies a user-specified threshold.
Hierarchical Approaches: Hierarchical clustering generates a hierarchy of nested clusters. These algorithms are divided into agglomerative and divisive approaches. The Unweighted Pair Group Method with Arithmetic Mean (UPGMA), presented in [3], adopts an agglomerative method to graphically represent the clustered dataset. However, it is not robust in the presence of noise. In [10], the genes are split through a divisive approach, called the Deterministic-Annealing Algorithm (DAA). The Divisive Correlation Clustering Algorithm (DCCA) [11] uses Pearson's correlation as the similarity measure. All genes in a cluster have the highest average correlation with genes in that cluster. Hierarchical clustering not only groups together genes with similar expression patterns but also provides a natural way to graphically represent the dataset, allowing a thorough inspection.
However, a small change in the dataset may greatly change the hierarchical dendrogram structure. Another drawback is its high computational complexity.
Density Based Approaches: Density based clustering identifies dense areas in the object space. Clusters are highly dense areas separated by sparsely dense areas. DBSCAN [12] was one of the pioneering density based algorithms used over spatial datasets. In [5], Jiang et al. propose the Density-Based Hierarchical clustering method (DHC) to identify co-expressed gene groups. It can identify embedded clusters in the dataset and can also handle outliers. It can effectively visualize the internal structure of the dataset. A kernel density clustering method for gene expression profile analysis is reported in [13]. An alternative to this is to define the similarity of points in terms of their shared nearest neighbors. This idea was first introduced by Jarvis and Patrick [14]. In [15], a density based gene clustering algorithm, DGC, is presented. DGC uses the regulation information along with the order preserving [16] nature of gene expression data to identify clusters over gene expression data. A density-based approach discovers clusters of arbitrary shapes even in the presence of noise. However, density-based clustering techniques suffer from high computational complexity with increase in dimensionality (even if a spatial index structure is used) and from input parameter dependency.
Model Based Approaches: Model based approaches provide a statistical framework to model the cluster structure in gene expression data. The Expectation Maximization (EM) algorithm [17] discovers good values for its parameters iteratively. It can handle various shapes of data, but can be very expensive since a large number of iterations may be required. In [18], a signal shape similarity method is used to cluster genes using a Variational Bayes Expectation Maximization algorithm [19]. A model-based approach provides an estimated probability that a data object will belong to a particular cluster. Thus, a gene can have high correlation with two totally different clusters. However, the model-based approach assumes that the dataset fits a specific distribution, which is not always true.
Graph Theoretical Approaches: In graph-based clustering algorithms, graphs are built as combinations of objects, features or both, as nodes and edges, and partitioned by using graph theoretic algorithms. Graph theoretic algorithms are also used for the problem of clustering cDNAs based on their oligo-nucleotide fingerprints [20]. CLuster Identification via Connectivity Kernels (CLICK) [21] is suitable for subspace and high dimensional data clustering.
The Cluster Affinity Search Technique (CAST) by [2] takes as input pair-wise similarities between genes and an affinity threshold. It does not require a user-defined number of clusters and handles outliers efficiently. But, it faces difficulty in determining a good threshold value. To overcome this problem, E-CAST [22] calculates the threshold value dynamically based on the similarity values of the objects that are yet to be clustered. In [23], a graph based algorithm for identifying disjoint clusters over gene expression datasets is presented. It is based on the concept that inter-cluster genes have more repulsion between them while the repulsion of intra-cluster genes is less. The cluster results are dependent on a connectivity threshold which is calculated dynamically during the cluster creation process.
Soft Computing Approaches: Fuzzy c-means [24] and Genetic Algorithms (GA) (such as [25] and [26]) have been used effectively in clustering gene expression data. The Fuzzy c-means algorithm requires the number of clusters as an input parameter. The GA based algorithms have been found to detect biologically relevant clusters but are dependent on proper tuning of the input parameters.
The current information explosion, fuelled by the availability of the World Wide Web and the huge number of microarray experiments being conducted, has led to an ever increasing volume of data. Therefore, there is a need to introduce incremental clustering so that updates can be clustered in an incremental manner. Though a lot of research has been performed on incremental clustering in other application domains, incremental clustering of gene expression data has not been explored much yet.
Incremental Algorithms: In [27], the authors present an incremental clustering approach based on the DBSCAN [12] algorithm. A one pass clustering algorithm for relational datasets is proposed in [28]. Rough set theory is employed in the incremental approach for clustering interval datasets in [29]. In [30], an incremental genetic K-means algorithm is presented. In [31], an incremental gene selection algorithm is presented that uses a wrapper-based method and reduces the search space complexity, since it works on the ranking directly. In [32], an incremental clustering approach over gene expression data is presented that uses the regulation information to store the cluster information for use when clustering genes incrementally.
2.3. Discussion
From the discussion above, we conclude that various clustering algorithms require different types of input parameters and that clustering results are highly dependent on the values of these parameters. Gene expression data has coherent patterns embedded in the full gene space, the identification of which is an important research field. Coherent genes may indicate co-regulation and hence fall under the same functional classification. Clustering algorithms that do not require the number of clusters as an input parameter and are robust to noise are of utmost importance. Clustering algorithms are sensitive to the proximity measure chosen. In this paper, we present a gene based clustering technique which is able to identify clusters automatically from the dataset. An incremental version of GenClus is also presented that is capable of handling incremental gene expression datasets.
3. GENCLUS
GenClus is a gene based clustering technique which adopts the notion of the density based approach found in [12], [15]. It exploits a discretization technique which retains the up- or down-regulation information. GenClus normalizes the gene expression data and works over a discrete domain (of regulation information). Clustering is then run on the discretized data. The gene expression data is normalized to have mean 0 and standard deviation 1. Expression data having a low variance across conditions, as well as data having more than 3-fold variation, are filtered. Discretization is then performed on this normalized expression data. Discretization uses the regulation information, i.e. up- or down-regulation in each of the conditions for a particular gene. Here, let G be the set of all genes and T be the set of all conditions. The discretization is done as follows:
Fig. 1. Expression profiles of an example dataset
i. The discretized value of gene $g_i$ at condition $t_1$ (i.e., the first condition):
$$\xi_{g_i,t_1} = \begin{cases} 1 & \text{if } \varepsilon_{g_i,t_1} > 0 \\ 0 & \text{if } \varepsilon_{g_i,t_1} = 0 \\ -1 & \text{if } \varepsilon_{g_i,t_1} < 0 \end{cases}$$
ii. The discretized values of gene $g_i$ at conditions $t_{j+1}$ (j = 1, ..., (T − 1)), i.e., at the rest of the conditions (T − {$t_1$}):
$$\xi_{g_i,t_{j+1}} = \begin{cases} 1 & \text{if } \varepsilon_{g_i,t_j} < \varepsilon_{g_i,t_{j+1}} \\ 0 & \text{if } \varepsilon_{g_i,t_j} = \varepsilon_{g_i,t_{j+1}} \\ -1 & \text{if } \varepsilon_{g_i,t_j} > \varepsilon_{g_i,t_{j+1}} \end{cases}$$
where $\xi_{g_i,t_j}$ is the discretized value of gene $g_i$ at condition $t_j$ and $\varepsilon_{g_i,t_j}$ is the expression value of gene $g_i$ at condition $t_j$. We see in the above computation that the first condition, $t_1$, is treated as a special case: its discretized value is based directly on $\varepsilon_{g_i,t_1}$, i.e., the expression value at condition $t_1$. For the rest of the conditions, the discretized value is calculated by comparing the expression value with that of the previous condition. This helps in finding whether the gene is up- (1) or down- (-1) regulated at that particular condition. Each gene will now have a regulation pattern of 0, 1, and -1 across the conditions or time points. This pattern is represented as a string.
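The normalization and discretization steps described above can be sketched as follows. This is only an illustrative sketch, not the authors' Java implementation: the function names are hypothetical, the use of the population standard deviation is an assumption (the text does not say which estimator is used), and the comma-separated string encoding of the pattern is also an assumption.

```python
from statistics import mean, pstdev


def normalize(values):
    """Scale one gene's expression values to mean 0 and standard deviation 1.

    Assumption: population standard deviation; a constant profile maps to zeros.
    """
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def regulation_pattern(values):
    """Discretize one expression profile into values from {1, 0, -1}.

    The first condition uses the sign of the expression value itself;
    every later condition compares the value with the previous condition.
    """
    pattern = [sign(values[0])]
    for prev, cur in zip(values, values[1:]):
        pattern.append(sign(cur - prev))
    return pattern


def pattern_string(pattern):
    """Encode a pattern as a string (hypothetical encoding: comma-separated)."""
    return ",".join(str(p) for p in pattern)


if __name__ == "__main__":
    profile = [0.5, 1.2, 1.2, -0.3]
    print(pattern_string(regulation_pattern(profile)))
```

Two genes can then be compared by matching their pattern strings position by position.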
Fig. 2. Regulation and range information of the example dataset of Fig. 1
The expression values of each gene are assigned range_ids as follows. The range value for each expression level is obtained by uniformly dividing the difference between the maximum and minimum values in the normalized data:
$$range\_value = \frac{Max_{EV} - Min_{EV}}{I}$$
where $Max_{EV}$ is the maximum expression value, $Min_{EV}$ is the minimum expression value and $I$ is the number of intervals. For example, suppose I = 7. Then we will have 7 range_ids (3, 2, 1, 0, -1, -2, -3), and each expression value of a gene gets the range_id of the range in which it falls. Now, each gene will have a pattern of range_ids across the conditions or time points, which is represented as a string. Fig. 2 illustrates an example of a discretized matrix showing the regulation pattern and range_ids, where the number of intervals is set to 7, namely (3, 2, 1, 0, -1, -2, -3). The regulation information and range values are used together to cluster the gene expression dataset using a density based approach. A string matching approach is used for matching the regulation pattern and range pattern of two genes. Next, we give some definitions which provide the foundation of GenClus.
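The range_id assignment described above can be sketched as below. The text fixes only the interval width and the set of ids; the exact mapping from an interval to its id is therefore an assumption here (lowest interval gets the smallest id, ids centred on 0 for odd I, and the maximum value placed in the top interval). The function name `range_ids` is hypothetical.

```python
def range_ids(values, min_ev, max_ev, intervals=7):
    """Map expression values to range_ids using equal-width intervals.

    Assumptions: the lowest interval gets id -(intervals // 2), the highest
    gets +(intervals // 2) for odd `intervals`, and max_ev itself falls in
    the top interval.
    """
    if max_ev <= min_ev:
        raise ValueError("max_ev must be greater than min_ev")
    width = (max_ev - min_ev) / intervals        # range_value in the text
    ids = []
    for v in values:
        idx = int((v - min_ev) / width)          # interval index counted from the bottom
        idx = min(max(idx, 0), intervals - 1)    # clamp so that max_ev lands in the top interval
        ids.append(idx - intervals // 2)
    return ids


if __name__ == "__main__":
    print(range_ids([-3.5, -0.2, 0.0, 3.5, 2.6], min_ev=-3.5, max_ev=3.5))
```

With I = 7 this produces ids in the set (3, 2, 1, 0, -1, -2, -3), as in the example of Fig. 2.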
Definition 1. Neighborhood level of a gene: A gene $g_j$ is said to be a neighbor of gene $g_i$, i.e., $g_j \in N_{level}(g_i)$, if (i) $g_i$ matches with $g_j$ over each of the v conditions, where v is greater than a user-defined threshold α; and (ii) range_id($g_i$, $t_k$) ± level = range_id($g_j$, $t_k$), where $t_k$ refers to the conditions, k = 1, 2, ..., T, and level is a dynamically calculated parameter (initially level = 0).
Definition 2. Core gene: A gene $g_i$ is said to be a core gene if $|N_{level}(g_i)| \ge \sigma$ (a user-defined threshold). In our experiments, we have obtained good results for σ = 2.
Initially, level = 0 and the neighborhood of gene $g_i$ is searched for genes satisfying the core gene condition of Definition 2. If no neighbor gene is found, then level is increased in both the positive and the negative range by one, i.e., we search for neighbor genes in the adjacent range_ids of ($g_i$, $t_k$), and the neighborhood search continues.
Definition 3. Direct-Reachability: A gene $g_j$ is said to be directly reachable from another gene $g_i$ if $g_i$ is a core gene and $g_j \in N_{level}(g_i)$.
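Definitions 1 and 2 and the level expansion can be sketched as follows. Several points are assumptions made only for this illustration: condition (ii) is read as "the range_ids differ by at most level at every condition", the expansion stops at the first level at which any neighbor is found, and an upper limit `max_level` (not mentioned in the text) is added so that the search terminates. All function names are hypothetical.

```python
def matches(reg_i, reg_j):
    """Number of conditions at which two regulation patterns agree."""
    return sum(a == b for a, b in zip(reg_i, reg_j))


def neighborhood(i, regs, ranges, alpha, level):
    """Indices j != i that satisfy Definition 1 for the given level.

    regs[i] is the regulation pattern and ranges[i] the range_id pattern of gene i.
    """
    result = []
    for j in range(len(regs)):
        if j == i:
            continue
        if matches(regs[i], regs[j]) <= alpha:
            continue  # fewer than alpha + 1 matching conditions
        if all(abs(a - b) <= level for a, b in zip(ranges[i], ranges[j])):
            result.append(j)
    return result


def neighbors_with_expansion(i, regs, ranges, alpha, max_level):
    """Start at level 0 and widen the range_id tolerance until a neighbor is found."""
    for level in range(max_level + 1):
        found = neighborhood(i, regs, ranges, alpha, level)
        if found:
            return found, level
    return [], max_level


def is_core(i, regs, ranges, alpha, sigma=2, max_level=1):
    """Definition 2: gene i is core if it has at least sigma neighbors."""
    found, _ = neighbors_with_expansion(i, regs, ranges, alpha, max_level)
    return len(found) >= sigma
```

In this reading, a gene whose range pattern is slightly offset from an otherwise identical group can still become core once level is raised to 1.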
The clustering process starts with an arbitrary gene $g_i$ and searches its neighborhood to check whether it is core. If $g_i$ is not core, then the process restarts with another unclassified gene. If $g_i$ is a core gene, then clustering proceeds by finding all genes reachable from $g_i$. All reachable genes are assigned the same sub_cluster_id as $g_i$. If any gene among the neighbors of $g_i$ satisfies the core gene condition, sub-cluster expansion proceeds with that gene. The process continues till no more genes can be assigned to the sub-cluster. The process then restarts with another unclassified gene and starts forming the next sub-cluster. The clustering process continues till no more genes can be assigned a sub_cluster_id. Once all sub-clusters have been assigned, the process groups all sub-clusters, as well as genes having no sub_cluster_id but having the same regulation pattern, into the same cluster and assigns them the same cluster_id. All genes that remain unclassified are now termed noise genes.
Definition 4. Reachability: A gene $g_j$ is said to be reachable from another gene $g_i$ if there is a chain of genes $g_1, g_2, \ldots, g_p$ between $g_i$ and $g_j$ such that $g_{k+1}$ is directly reachable from $g_k$.
Finding sub-clusters within bigger clusters gives a finer clustering of a dataset. Sub-cluster information may be useful to biologists for visual display and interpretation.
Definition 5. Sub-Cluster: Let $D_G$ be a database of genes. A sub-cluster $S_i$ is a non-empty subset of $D_G$ satisfying the following conditions:
1. $\forall g_i, g_j$: if $g_i \in S_i$ and $g_j$ is reachable from $g_i$, then $g_j \in S_i$.
2. $g_i$ matches with $g_j$ over each of the v conditions.
3. range_id($g_i$, $t_k$) ± level = range_id($g_j$, $t_k$), where $t_k$ refers to the conditions, k = 1, 2, ..., T.
Definition 6. Cluster: Genes $g_i, g_j \in C_i$ (the ith cluster) if $g_i$ matches with $g_j$ over each of the v conditions, i.e., all genes having the same regulation pattern over v conditions are grouped into the same cluster. Sub-clusters $S_j$, where j = 1, 2, ..., will belong to cluster $C_i$ if they have the same regulation pattern.
Definition 7. Noise Genes: Let $C_1, C_2, \ldots, C_n$ be the set of clusters of $D_G$; then noise is the set of genes in $D_G$ not belonging to any cluster $C_i$, i.e., $noise = \{ g_x \in D_G \mid \forall i : g_x \notin C_i \}$.
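The sub-cluster expansion and the grouping of Definitions 4 to 7 can be sketched as follows. The sketch takes precomputed neighborhoods as input and makes two simplifying assumptions that are not in the text: clusters are formed from genes with exactly identical regulation pattern strings, and a gene is treated as noise when it shares its pattern with no other gene and belongs to no sub-cluster. The function names are hypothetical.

```python
from collections import deque


def form_subclusters(neighbors, sigma=2):
    """Assign sub_cluster_ids by expanding from core genes (Definitions 3-5).

    neighbors maps each gene to the set of genes in its neighborhood.
    """
    sub_id = {}
    next_id = 0
    for seed in neighbors:
        if seed in sub_id or len(neighbors[seed]) < sigma:
            continue  # already assigned, or not a core gene
        sub_id[seed] = next_id
        queue = deque([seed])
        while queue:
            g = queue.popleft()
            if len(neighbors[g]) < sigma:
                continue  # non-core member: belongs to the sub-cluster but does not expand it
            for h in neighbors[g]:
                if h not in sub_id:
                    sub_id[h] = next_id
                    queue.append(h)
        next_id += 1
    return sub_id


def form_clusters(patterns, sub_id):
    """Group genes by regulation pattern (Definition 6) and collect noise (Definition 7)."""
    groups = {}
    for gene, pat in patterns.items():
        groups.setdefault(pat, []).append(gene)
    cluster_id, noise = {}, []
    next_id = 0
    for genes in groups.values():
        if len(genes) == 1 and genes[0] not in sub_id:
            noise.append(genes[0])  # assumption: a lone, unassigned gene is noise
            continue
        for g in genes:
            cluster_id[g] = next_id
        next_id += 1
    return cluster_id, noise
```

A gene such as `d` in the test data below shares the regulation pattern of a cluster without being in any sub-cluster, which corresponds to the genes labelled UC in Fig. 3.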
Fig. 3. Clustering of the example dataset given in Fig. 2. Here, the $C_i$ (i = 1, 2, ...) are clusters; $SC_{ij}$ refers to the jth sub-cluster of cluster i and $UC_{ik}$ is the kth gene in cluster i not belonging to any sub-cluster.
The clusters and sub-clusters for the example dataset of Fig. 2 are illustrated in Fig. 3. It can be observed that sub-clusters give the highly coherent patterns in the dataset. The algorithms for cluster formation and cluster expansion are given in Fig. 4 and Fig. 5.

Cluster_creation()
// Pre-condition: All genes are unclassified
// cluster_id = 0
FOR i from 1 to DG do
    IF gi.classified = unclassified THEN
        Cluster_expand(gi, cluster_id)
        cluster_id++;
    END IF
END FOR
Fig. 4. Algorithm for cluster formation of GenClus
Cluster_expand(gi, cluster_id)
IF get_core(gi) = 0 THEN
    gi.cluster_id = cluster_id;
    RETURN;
ELSE
    gi.classified = classified;
    FOR j from 1 to DG do
        IF gj.classified = unclassified THEN
            Cluster_expand(gj, cluster_id)
        END IF
    END FOR
END IF
Fig. 5. Algorithm for cluster expansion of GenClus

Microarrays generate tens of thousands of data points in one experiment. Data volume is constantly increasing due to the huge number of microarray experiments performed. While clustering this type of data, it is of utmost importance that updates of the database are handled incrementally. Some incremental clustering algorithms are reported in Section 2. Though a lot of work has focused on incremental clustering over spatial datasets, not much research has been done on incrementally handling gene expression data. In this paper, we also introduce an incremental clustering technique (InGenClus) for gene expression data, which is based on GenClus. Once a clustering of the dataset is obtained, each cluster is represented by a cluster profile, which stores the regulation information of that particular cluster. Each sub-cluster is represented by a sub-cluster profile, which stores the regulation and range information of that particular sub-cluster. This information is further used by InGenClus when clustering the updated database incrementally.
4. Incremental Clustering
In this section, we present InGenClus, which is based on GenClus and is capable of handling incremental data. Due to the density based nature of GenClus, the insertion of a gene affects the current clustering only in the neighborhood of the gene. We examine the parts of an existing clustering affected by an update and show how InGenClus can handle incremental updates of a clustering after insertions. The changes of the clustering of the gene database $D_G$ are restricted to the neighborhood of an inserted gene. The previously core genes [15] retain their core property, but non-core genes (border genes or noise genes) may become cores. Thus new density connections may surface. The insertion of a gene $g_i$ may result in a change of cluster membership of genes in the neighborhood of $g_i$ and of all genes reachable from one of these genes in $D_G' = D_G \cup \{g_i\}$, where $D_G'$ is the updated dataset. While inserting $g_i$ the following cases may occur:
1. Fusion: Gene $g_i$ is fused into a cluster $C_i$ if the regulation pattern of $g_i$ matches the cluster profile of $C_i$. Gene $g_i$ may also be fused into a sub-cluster $S_i$ if $g_i$ is reachable from $S_i$.
2. Cluster Creation: Gene $g_i$ may have the same regulation pattern as some other noise or unclassified gene(s), which may lead to the formation of a new cluster.
3. Sub-cluster Creation: Gene $g_i$ may become core w.r.t. (i) some gene(s) in a cluster which are not members of any sub-cluster, which leads to the creation of a new sub-cluster; or (ii) some other noise or unclassified gene(s), which may lead to the formation of a new sub-cluster.
4. Noise: If $g_i$ does not match any of the cluster profiles, then $g_i$ is a noise gene and no density connections are changed.
InGenClus starts with a newly inserted gene $g_i$ and checks whether its regulation and range information matches any of the cluster or sub-cluster profiles. The following cases can arise:
i. $g_i$ matches the cluster profile of $C_i$; then InGenClus assigns the cluster_id of $C_i$ to $g_i$. After the insertion of $g_i$, one of the genes $g_k \in C_i$ with $g_k \in S_i$ ($S_i$ is a sub-cluster in $C_i$) may become core and hence become a potential candidate for sub-cluster expansion (case 1).
ii. $g_i$ matches none of the cluster profiles, but it matches some other unclassified genes. Then it creates a new cluster (case 2) and checks whether it can form sub-clusters (case 3). Finally, it forms the cluster and/or sub-cluster profiles accordingly.
iii. $g_i$ matches the cluster profile of $C_i$; then InGenClus assigns the cluster_id of $C_i$ to $g_i$. After the insertion of $g_i$, any gene $g_k \in C_i$ with $g_k$ not belonging to any sub-cluster in $C_i$ may become core and hence become a potential candidate for sub-cluster creation (case 3).
iv. $g_i$ matches none of the cluster profiles, nor does it match any other gene; then case 4 occurs.
In case of fusion, the affected cluster profiles are updated based on an effective data fusion technique. To achieve better space and time complexity, the cluster profiles are organized using an effective data structure. InGenClus has been found to yield the same result as GenClus in less time.
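The insertion cases above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that a regulation pattern is the sequence of up (1), down (-1) and no-change (0) steps between consecutive conditions, that a cluster profile is exactly such a pattern, and that a single matching unclassified gene is enough to start a new cluster. Range information, sub-clusters (case 3) and density reachability are omitted, and the names regulation_pattern and insert_gene are hypothetical.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[int, ...]


def regulation_pattern(expr: List[float], tol: float = 0.0) -> Pattern:
    """Assumed encoding: 1 = up, -1 = down, 0 = no change between
    consecutive conditions (changes within `tol` count as no change)."""
    steps = []
    for prev, cur in zip(expr, expr[1:]):
        diff = cur - prev
        steps.append(0 if abs(diff) <= tol else (1 if diff > 0 else -1))
    return tuple(steps)


def insert_gene(
    name: str,
    expr: List[float],
    clusters: Dict[Pattern, List[str]],
    unclassified: Dict[str, Pattern],
) -> str:
    """Insert one gene and report which case applied.

    `clusters` maps an (assumed) cluster profile to its member genes;
    `unclassified` maps noise/unclassified genes to their patterns.
    """
    pattern = regulation_pattern(expr)

    # Case 1 (fusion): the pattern matches an existing cluster profile.
    if pattern in clusters:
        clusters[pattern].append(name)
        return "fusion"

    # Case 2 (cluster creation): the pattern matches unclassified genes.
    partners = [g for g, p in unclassified.items() if p == pattern]
    if partners:
        for g in partners:
            del unclassified[g]
        clusters[pattern] = partners + [name]
        return "cluster creation"

    # Case 4 (noise): nothing matches, so the existing clustering is untouched.
    unclassified[name] = pattern
    return "noise"
```

In this sketch an insertion touches only the profile lookup and the pool of unclassified genes, so its cost does not depend on the size of the clusters already formed.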
5. Performance Evaluation
GenClus was implemented in Java in a Windows environment and evaluated on the three real-life datasets given in Table 1. All the datasets are normalized to have mean zero and standard deviation one. Some of the clusters formed from the full and reduced forms of Dataset 2 are shown in Fig. 6 and Fig. 7, and the clusters obtained from the reduced Dataset 1 are shown in Fig. 8. The datasets have been reduced by filtering out low variance genes and genes having more than 3-fold standard deviation. Some of the clusters obtained by GenClus from Dataset 3 are shown in Fig. 9. The hierarchy of four of the clusters and sub-clusters of Dataset 2 is shown in Fig. 10. In the figure, the full dataset is at the root, the clusters are shown with single line frames, sub-clusters are shown with double line frames, and the genes which are part of a higher level cluster but not part of any sub-cluster are shown with dotted line frames. The data from Dataset 2 was inserted incrementally and InGenClus was executed. Fig. 11 shows a sample output of some clusters of Dataset 2 with genes inserted incrementally. The inserted genes are shown in red color (grey for black & white images) with filled circles at the time points.

Table 1: Datasets used for evaluating the clustering algorithm
Serial No | Dataset | No. of genes | No. of conditions | Source
Dataset 1 | Yeast CDC28-13 [33] | 6218 | 17 | http://yscdp.stanford.edu/yeast_cell_cycle/full_data.html
Dataset 2 | Yeast Diauxic Shift [34] | 6089 | 7 | http://www.ncbi.nlm.nih.gov/geo/query
Dataset 3 | Subset of Human Fibroblasts Serum [35] | 517 | 13 | http://www.sciencemag.org/feature/data/984559.hsl

Fig. 6. Some clusters are illustrated from the full Dataset 2
Fig. 7. Result of GenClus on the reduced form of Dataset 2
Fig. 8. Some clusters are illustrated from the Dataset 1
Fig. 9. Some of the clusters obtained by GenClus over Dataset 3
Fig. 10. Hierarchy of four clusters of Dataset 2. The full dataset is at the root, the clusters are shown with the single line frames, sub-clusters are shown with double line frames and the genes which are part of a higher level cluster but not part of any sub-clusters are shown with dotted line frames.
Fig. 11. Some of the clusters obtained by InGenClus over data incrementally updated from Dataset 2
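The preprocessing described in this section (normalization to mean zero and standard deviation one, and removal of low variance genes) can be sketched as follows. This is only an illustration under assumptions: normalization is taken to be per gene and to use the population standard deviation, the variance threshold is a free parameter, and the function names are hypothetical; the text specifies none of these. Variance filtering is applied to the raw profiles, since every normalized profile has unit variance.

```python
import statistics
from typing import Dict, List


def normalize_gene(expr: List[float]) -> List[float]:
    """Scale one expression profile to mean 0 and standard deviation 1.

    Assumption: the population standard deviation is used.
    """
    mean = statistics.fmean(expr)
    sd = statistics.pstdev(expr)
    if sd == 0:
        raise ValueError("a constant profile cannot be normalized")
    return [(x - mean) / sd for x in expr]


def filter_low_variance(
    data: Dict[str, List[float]], min_variance: float
) -> Dict[str, List[float]]:
    """Keep only genes whose raw variance reaches `min_variance`.

    `min_variance` is a hypothetical threshold chosen by the user.
    """
    return {
        gene: expr
        for gene, expr in data.items()
        if statistics.pvariance(expr) >= min_variance
    }


def preprocess(
    data: Dict[str, List[float]], min_variance: float
) -> Dict[str, List[float]]:
    """Filter on raw variance first, then normalize the surviving genes."""
    kept = filter_low_variance(data, min_variance)
    return {gene: normalize_gene(expr) for gene, expr in kept.items()}
```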
5.1. Cluster Quality
To assess the quality of the clusters produced by different algorithms, we need an objective external criterion. In order to validate our clustering results, we employed z-score and p-value.
Z-score: We obtain a statistical rating of the relative gene expression activity shown by the genes associated in each cluster and the GO terms, and employ the z-score [36] as the measure of agreement. The z-score is calculated by investigating the relation between a clustering result and the functional annotation of the genes in the clusters. The higher the value of z, the better the clustering result, indicating more biologically relevant clusters of genes. We have used Gibbons' ClusterJudge [36] tool to calculate the z-score.

Table 2: z-scores for GenClus and its counterparts for Dataset 2
Method Applied | No. of Clusters | Total no. of genes | z-score
k-means | 62 | 614 | 5.57
SOM | 42 | 614 | 5.78
GenClus | 61 | 614 | 7.39
To test the performance of the clustering algorithm, we compared the clusters identified by GenClus with those from the well-known k-means and SOM algorithms. The z-scores obtained on the reduced form of Dataset 2 are shown in Table 2, where GenClus attains the highest z-score (7.39, against 5.57 for k-means and 5.78 for SOM). Similarly, InGenClus was implemented and tested over various datasets, and its results were compared with those of GenClus and found satisfactory. Some of the results obtained by InGenClus over Dataset 2 are reported in Fig. 11. InGenClus yields the same result as GenClus, as can be observed from Table 3.
Table 3: z-scores for GenClus and InGenClus for Dataset 1
Method Applied | No. of Clusters | Total no. of genes | z-score
GenClus | 21 | 384 | 11.68
InGenClus | 21 | 384 | 11.68
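The idea of a z-score that relates a clustering result to functional annotation can be illustrated with a short sketch. It assumes, without confirmation from the text, that agreement is measured by the mutual information between cluster labels and annotations, and that the observed value is standardized against randomly relabeled clusterings. Each gene carries a single annotation here, which is a simplification, and the function names, n_random and seed are hypothetical. The sketch does not reproduce the ClusterJudge tool itself.

```python
import math
import random
import statistics
from collections import Counter
from typing import Hashable, List, Sequence


def mutual_information(
    labels: Sequence[Hashable], annotations: Sequence[Hashable]
) -> float:
    """Mutual information (in nats) between cluster labels and annotations."""
    n = len(labels)
    joint = Counter(zip(labels, annotations))
    label_counts = Counter(labels)
    annot_counts = Counter(annotations)
    mi = 0.0
    for (lab, ann), count in joint.items():
        # p(l,a) * log( p(l,a) / (p(l) * p(a)) ), written with raw counts
        mi += (count / n) * math.log(
            (count * n) / (label_counts[lab] * annot_counts[ann])
        )
    return mi


def z_score(
    labels: Sequence[Hashable],
    annotations: Sequence[Hashable],
    n_random: int = 200,
    seed: int = 0,
) -> float:
    """Standardize the observed agreement against shuffled cluster labels."""
    rng = random.Random(seed)
    observed = mutual_information(labels, annotations)
    shuffled: List[Hashable] = list(labels)
    random_scores = []
    for _ in range(n_random):
        rng.shuffle(shuffled)  # keeps cluster sizes, breaks the gene assignment
        random_scores.append(mutual_information(shuffled, annotations))
    mean = statistics.fmean(random_scores)
    sd = statistics.stdev(random_scores)
    return (observed - mean) / sd
```

Under these assumptions, a clustering whose groups coincide with the annotation receives a large positive z-score, while a clustering unrelated to the annotation scores close to zero.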
Biological significance: The biological relevance of a cluster can be verified using the gene ontology (GO) annotation database located at http://db.yeastgenome.org/cgi-bin/GO/goTermFinder. It is used to test the functional enrichment of a group of genes in terms of three structured controlled ontologies, viz., associated biological processes, molecular functions and cellular components. The functional enrichment of each GO category in each of the clusters obtained is measured by its p-value, which represents the probability of observing by chance the number of genes from a specific GO functional category within each cluster. A low p-value indicates that the genes belonging to the enriched functional categories are biologically significant in the corresponding clusters. To compute the p-value, we used FuncAssociate [37], a Web-based tool that accepts as input a list of genes and returns a list of GO attributes that are over-represented (or under-represented) among the genes in the input list. FuncAssociate computes the hypergeometric functional enrichment score based on Molecular Function and Biological Process annotations, and the resulting scores are adjusted for multiple hypothesis testing using Monte Carlo simulations. To test the biological significance of the clusters obtained by GCA, we use a reduced form of Dataset 1 obtained from http://faculty.washington.edu/kayee/cluster. Functional categories with p-value