International Review on Computers and Software (IRECOS)

Contents

Achieving Optimal Firewall Filtering Through Dynamic Rule Reordering by Janani M., Subramaniyaswamy V., Lakshmi R. B.

680

Energy Based Efficiency Evaluation of Tree-Based Routing Protocols for Wireless Sensor Networks (WSNs) by M. Faheem, Zia Ud Din, M. A. Shahid, M. S. Ullah, Y. Munir, L. Sakar

688

FPGA Implementation of Baseband OFDM Transmitter and Receiver Using Modified IFFT/FFT to Reduce Latency by Amos H. Jeeva Oli, R. Rani Hemamalini

698

Smart Camera Based on FPGA Oriented to Embedded Image Processing by Yahia Said, Taoufik Saidani, Fethi Smach, Mohamed Atri, Hichem Snoussi

704

Developments in Vehicular Ad-Hoc Network by Rabindra Ku Jena

710

Outage Probability of Impairments Due to Combining Errors and Branch Correlation in Rayleigh Fading Channels Incorporating Diversity by J. Subhashini, Vidhyacharan Bhaskar

722

High Performance and Reliable Fault Detection Scheme for the Advanced Encryption Standard by Hassen Mestiri, Noura Benhadjyoussef, Mohsen Machhout, Rached Tourki

730

A New and Robust Image Watermarking Technique Using Contourlet-DCT Domain and Decomposition Model by S. Senhaji, A. Aarab

747

Forest Fire Image Intelligent Recognition Under Sea Computing Model by Qiang Yan, Li Yue, Zhao Juanjuan, Liu Yongxing

753

Hierarchical Energy and Delay Aware MAC Protocol for Wireless Sensor Networks by C. Venkataramanan, S. M. Girirajkumar

762

Swarm Based Fault Tolerant Routing in MPLS Networks by Venkata Raju S., Govardhan A., Premchand P.

770

Multipath Routing for Admission Control and Load Balancing in Wireless Mesh Networks by Rakesh Kumar Giri, Masih Saikia

779

Reliable and Energy Efficient Congestion Control Protocol for Wireless Sensor Networks by Srinivasan G., Murugappan S.

786

Load Balancing and Optimization of Network Lifetime by Use of Double Cluster Head Clustering Algorithm and its Comparison with Various Extended LEACH Versions by T. Shankar, S. Shanmugavel, A. Karthikeyan, Akanksha Mohan Gupte, Suryalok Sarkar

795

(continued)

Copyright © 2013 Praise Worthy Prize S.r.l. - All rights reserved

Block Matching Algorithms Study According to the Video Dynamic by Wissal Hassen, Hamid Amiri

804

Approximation of 3D Face Model by T. Khadhraoui, F. Benzarti, H. Amiri

810

MATHIS: a New Approach for Creating Views to Materialize in a Hybrid Integration System by Samir Anter, Ahmed Zellou, Ali Idri

816

An Efficient Hierarchical Tree Alternative Path (HTAP) and Prim's Algorithm Based QoS Routing Approach for MPLS Networks with Enhanced Bandwidth Constraints by S. Veni, G. M. Kadhar Nawaz

826

Performance Analysis of Probabilistic Routing Algorithms in Mobile Ad Hoc Networks Using a New Smart Algorithm by Sahar Ahmadzadeh Ghahnaviehei, Saadan Zokaei, Abbas Vafaei

837

A MDA-Based Model-Driven Approach to Generate GUI for Mobile Applications by Ayoub Sabraoui, Mohammed El Koutbi, Ismaïl Khriss

844

KSR-Quadtree: an Intelligent Knowledge Storage and Retrieval Using Quadtree Structure by A. Meenakshi, V. Mohan

853

“EAAS3” Distributed Control Architecture of MAS Robotic Systems by M. El Bakkali, S. Benaissa, S. Tallal, A. Sayouti, H. Medromi

863

A Contourlet Based Block Matching Process for Effective Audio Denoising by B. Jai Shankar, K. Duraiswamy

868

Software Project Scheduling Techniques: a Comparison Study by Osama K. Harfoushi

876

A Novel Approach Based on Nearest Neighbor Search on Encrypted Databases by Lakshmi R. B., Subramaniyaswamy V., Janani M.

881


International Review on Computers and Software (I.RE.CO.S.), Vol. 8, N. 3 ISSN 1828-6003 March 2013

Achieving Optimal Firewall Filtering Through Dynamic Rule Reordering
Janani M., Subramaniyaswamy V., Lakshmi R. B.

Abstract – Cutting-edge technologies such as cloud computing, web services, and web architectures have enhanced the business experience. These technologies have increased the demand for bandwidth, which in turn has heightened the need for routers that can handle large traffic volumes of up to thousands of packets per second. However, the greatest challenge is to protect the network from unintended information leakage through unauthorized traffic. Firewalls act as a defense against this unauthorized traffic by establishing secure communication in networks. Nevertheless, firewalls are controlled by security policies, which are complex and fraught with thousands of conflicting rules written by administrators over long periods while resolving issues. Effective firewall conflict management is therefore required so that the firewall can act as a barrier between trusted and untrusted network traffic, opposing unauthorized access to Internet-based enterprises. In this study, we propose a framework to handle policy conflicts in firewalls based on a risk assessment of the conflicts. We identify the risk level of a policy conflict on the basis of a vulnerability assessment of the secured network, and we use Dynamic Rule Reordering to reorder the conflicting rules and achieve optimal conflict resolution. The proposed method was found to detect anomalies much faster than existing methods. Copyright © 2013 Praise Worthy Prize S.r.l. - All rights reserved.

Keywords: Anomaly Management, Firewall Policy, Policy Conflicts, Rule Reordering

I.

Introduction

Securing data from internal and external threats is crucial for any successful business. In recent years, a number of reports have been published on identity theft, phishing attacks and other online crimes, which has put pressure on organizations to protect personal and business information against such threats. This protection is achieved through firewalls, which act as a barrier between trusted and untrusted network traffic, blocking entry by unauthorized Internet-based elements (Fig. 1). Nonetheless, the network security provided by a firewall is only as good as the organization's security policy, since the policy governs how the firewall is configured. Firewalls are often observed to violate well-established security guidelines because the policies are misunderstood. Understanding that firewalls are the foundation of corporate intranet security, organizations need to implement a comprehensive security policy and audit it regularly to ensure compliance. Despite the importance of firewall implementation as a step toward securing the network from untrusted elements, the complexity of managing security policies prevents firewalls from being used to their maximum potential. Seldom are the policies easily comprehended by those who use them.

Frequently, the policies are complex and written by experts in a language that is neither widely understood nor easy to implement. Using the policies incorrectly can have worse repercussions than not using them at all; this is well captured by [1], who emphasizes the correct usage of security mechanisms. In general, the function of the firewall is to route or block network traffic. As the firewall is placed between the private network and the Internet, it ensures that all packets pass through it. Based on the defined policy, the firewall gauges each packet's legitimacy to pass through. The policy, further, has a set of conditions that identify arriving packets and actions that define which packets are to be accepted, discarded, or selectively accepted and discarded. The main challenge in implementing these rules arises when a packet matches more than one filtering rule, or when independent firewalls on the same path apply dissimilar filtering actions to the same incoming traffic, which makes decision-making complex. To overcome this challenge, an efficient security policy that can be easily understood by users is required, so that users can evaluate, filter and confirm the accuracy of the rules and apply them appropriately. Rule reordering is therefore essential to establish security policies that can be followed with minimal intervention.

Manuscript received and revised February 2013, accepted March 2013
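As background for the conflicts discussed in this paper, a firewall evaluates each packet against an ordered rule list and applies the first matching rule, falling back to a default action. The following is a minimal, hypothetical sketch of first-match filtering; the rule fields and the example policy are ours, not taken from the paper:

```python
# Minimal first-match firewall filter (illustrative; fields are ours).
from dataclasses import dataclass

@dataclass
class Rule:
    proto: str   # "tcp", "udp", or "*" (any)
    src: str     # source address or "*"
    dst: str     # destination address or "*"
    action: str  # "accept" or "deny"

    def matches(self, proto, src, dst):
        return (self.proto in ("*", proto)
                and self.src in ("*", src)
                and self.dst in ("*", dst))

def filter_packet(rules, proto, src, dst, default="deny"):
    """Apply the first matching rule; fall back to the default policy."""
    for rule in rules:
        if rule.matches(proto, src, dst):
            return rule.action
    return default

policy = [
    Rule("tcp", "10.0.0.5", "*", "deny"),       # block one host first
    Rule("tcp", "*", "192.168.1.1", "accept"),  # then allow traffic to a server
]

print(filter_packet(policy, "tcp", "10.0.0.5", "192.168.1.1"))  # deny
print(filter_packet(policy, "tcp", "10.0.0.9", "192.168.1.1"))  # accept
```

Because only the first match fires, swapping the two rules above would change which packets are admitted, which is why the rule reordering studied in this paper must preserve the intended filtering semantics.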


The Firmato tool is especially useful in managing complex, multi-firewall environments. Other firewall analysis tools, such as Fang and Lumeta [8], [9], extract the appropriate rules of the firewall security policy to conduct specific queries on a set of filtering rules. Like Firmato, these tools also target extremely complex environments for configuring and managing firewalls. Firewalls follow a corporate security policy comprising thousands of rules and objects written at different times by administrators; user machines, servers and sub-networks all form part of these objects [6]. The dynamic nature of today's corporate world requires constant rule changes, which result in large and complex configurations that become hard to manage and, in addition, increase the network's vulnerability to attacks [10]. Many researchers have attempted to arrive at a policy that would unscramble the complexity of the rules governing network security. In one such attempt, Lupu and Sloman [11] developed a framework for firewall policy conflicts based on role management. Eronen and Zitting [12] sought to develop a system that could answer administrator queries about permitted network traffic; this system was particularly useful for listing the ports permitted by a host. Al-Shaer and Hamed [5] proposed an algorithm that automatically discovers anomalies in a firewall policy. This algorithm reveals rule conflicts and challenges in the firewall policy; it also inserts new rules and modifies or eliminates conflicting rules, thus achieving a conflict-free policy. Their firewall policy advisor tool significantly simplifies firewall policy management and reduces the risk to network security. Firewall policy anomalies are classified by many authors [13], [7] into shadowing, generalization, correlation, and redundancy.
A shadowing anomaly refers to a rule that is shadowed by a set of previous rules matching its packets, such that the shadowed rule is never activated [14]. This kind of anomaly results in blocking authorized traffic; hence it is essential to identify and rectify shadowed rules in the firewall policy. In a generalization anomaly, a preceding rule matches a subset of the packets of a later rule, while the two rules perform different actions. In a correlation anomaly, the first rule matches some packets of the second rule and the second rule matches some packets of the first, but they have different filtering actions; however, a correlation anomaly can be resolved by choosing the correct order without policy changes [15]. In a redundancy anomaly, a rule becomes redundant when another rule matching the same packets has the same action. Redundancy in the rule set increases search time and space requirements; consequently, it is essential to identify redundancy between rules so that the administrator can modify their filtering effect. Our analysis of the literature found little related work [16], [17] on correlation conflict resolution.
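The four anomaly types just described can be illustrated with a toy pairwise classifier. Here each rule's match set is modelled as a plain Python set of packet identifiers; real analyzers compare address and port ranges instead, and the function name and examples are ours:

```python
# Toy classifier for pairwise firewall-rule anomalies (illustrative).
def classify(earlier, later):
    """Relate a later rule to an earlier one; each rule is (match_set, action)."""
    m1, a1 = earlier
    m2, a2 = later
    if m2 <= m1:                      # later rule fully covered by earlier rule
        return "redundancy" if a1 == a2 else "shadowing"
    if m1 < m2 and a1 != a2:          # later rule generalizes the earlier one
        return "generalization"
    if (m1 & m2) and not (m1 <= m2) and not (m2 <= m1) and a1 != a2:
        return "correlation"          # partial overlap, different actions
    return "none"

r1 = ({"p1", "p2", "p3"}, "deny")
print(classify(r1, ({"p2"}, "accept")))                    # shadowing
print(classify(r1, ({"p2", "p3"}, "deny")))                # redundancy
print(classify(r1, ({"p1", "p2", "p3", "p4"}, "accept")))  # generalization
print(classify(r1, ({"p3", "p4"}, "accept")))              # correlation
```

The correlation case is the one the paper singles out as resolvable purely by reordering, since neither rule fully covers the other.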

Fig. 1. Firewall architecture

In a large-scale network, hundreds of filtering rules are recorded by administrators at different times. Such a large number of filtering rules can make modifying existing rules or writing new ones cumbersome and time consuming, and can further lead to the introduction of conflicting rules; in addition, they increase the network's vulnerability to threats. In this work, we focus on the relationships and interactions between the rules in order to regulate and organize rule reordering. We concentrate on conflict detection and resolution techniques by defining a framework for firewall policy management. The main objective of the study is to identify conflicting rules and resolve the conflicts through effective risk assessment and dynamic rule reordering, which resolves conflicts pertaining to specific actions. Dynamic rule reordering reorders the conflicting rules so as to satisfy equivalent action constraints. The proposed methodology can be used to develop a conflict detection and resolution policy within the overall firewall policy.

II.

Related Work

The firewall mechanism protects against information leakage through the Internet, which collaborating partners use to transfer corporate packet data [2], [3]. In addition to filtering traffic, firewalls also manage packet forwarding, bandwidth, routing control, etc. [4]. The firewall is managed by administrators, who define the policy required to secure the network. However, managing the firewall policy is an arduous task due to the highly complex and interdependent nature of policy rules, and ever-changing network and system environments add to the difficulty. Despite having several administrators, firewalls still contain anomalies [5]; further, Wool [6] found that the firewall policies of most organizations had security flaws. Tools like Firmato have been shown to help with firewall management [7].


In both cases [16], [17], the authors propose an algorithm to identify conflicts and resolve them in general packet filters. The main drawback of these algorithms is their ambiguity in the classification of packets. A geometric model using 2-tuple filtering rules was proposed by [16] to optimize packet classification in high-speed networks; this model would be highly beneficial if used for policy rule analysis in firewalls. In an alternative approach, a Relational Algebra (RA) technique and the Raining 2D-Box Model were used to discover anomalies within a rule set. An anomaly in this approach is represented by a two-dimensional box consisting of a set of relations mapped from the rules; the relation between the rules and the action specified for each rule is represented in a rectangular box, and an action not found in the box means that any action, i.e., accept or deny, can be taken [18]. Conflicts occurring due to shadowing of rules or redundancy were resolved through an algorithm called the range algorithm [19]. This algorithm appears to be one of the best methods to resolve conflicts due to shadowing, resulting in conflict-free rules. Though many authors [17], [11] have proposed traditional anomaly approaches, they lack consistency and are limited to detecting pairwise redundancy. Therefore, this study attempts to identify the relationships and interactions between the rules in order to regulate and organize rule reordering. Janani et al. [21] present a wide-ranging assessment illustrating the competence of dynamic rule reordering, which restructures conflicting rules. Mattas et al. [22] focus on the implementation and operation of the DARBAC (Dynamically Administering Role-Based Access Control) model. Singh et al. [23] survey network security trends through an exploration of reactive and proactive network security approaches.
Kondakci [24] presents a simplified model of security evaluation practice to express conditional risk features. Run Chen et al. [25] focus on distributed and parallel detection of network attacks. Qiang Fan et al. [26] focus on integrated network systems that adapt to the current market environment and take a market perspective.

Fig. 2. Admin aspect in proposed system

Fig. 3. End user aspect in proposed system

Algorithm: Detecting and Resolving Firewall Anomalies
  Input: set of rules R, set of packets P
  Begin
    Initialize NO := 5
    For each i = 0 to R do
      PCpi interrogate with RCri;
      If pi matches ri >= 5 then
        RCri can be reordered
      Else If pi matches ri ...

... > INTEREST-FREQUENCY Then
        SA = SA ∪ {Ai}
      End If
    End For
    Return SA;
  End.
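One hedged reading of the listing above is that rules whose observed match count reaches the threshold (NO = 5 in the listing) are promoted toward the front of the policy. The sketch below follows that reading only; the conflict checks the paper performs before reordering are omitted, and the rule names and counts are illustrative:

```python
# Hedged sketch of threshold-based rule promotion (names/counts are ours).
def reorder_rules(rules, match_counts, threshold=5):
    """Stable sort: frequently-matched rules move to the front,
    ties keep their original relative order (so first-match semantics
    among promoted rules is preserved)."""
    return sorted(rules, key=lambda r: match_counts.get(r, 0) < threshold)

rules = ["r1", "r2", "r3", "r4"]
counts = {"r1": 2, "r2": 9, "r3": 5, "r4": 1}
print(reorder_rules(rules, counts))  # ['r2', 'r3', 'r1', 'r4']
```

Promoting frequently matched rules reduces the average number of comparisons per packet, which is consistent with the paper's goal of faster anomaly detection and filtering.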

Availability: Among the objectives of a hybrid integration system is to provide answers to queries that require temporarily unavailable data. For this, we load this type of data into the mediator once it becomes available, and we compute for each attribute Ai its availability, expressed by:

a_Ai = (number of times Ai is available) / (number of times Ai is requested)

Frequency of change: The freshness of data is a very important factor in a hybrid integration system, so the data at the mediator must be updated as soon as they change in the sources, to ensure that this information really reflects what is present in the sources. However, if the materialized data change very often in the sources, keeping them up to date in the mediator significantly affects the performance of the system. It is therefore very important to consider this factor when choosing the part to materialize. To do this, we compute for each attribute Ai its frequency of change fAi, which represents the number of times this attribute has changed.

Update cost: The access cost associated with an attribute is the time needed to load it into the mediator. However, it should not be considered independently of the frequency of change. Consider, for example, two attributes A1 and A2 whose characteristics are presented in Table V.

TABLE V
CHARACTERISTICS OF THE ATTRIBUTES A1 AND A2

Attribute  Frequency of change  Access cost
A1         0.01                 6
A2         0.085                4

The access cost associated with A1 is higher than that associated with A2, but A1 changes less frequently than A2. Materializing A2 will increase the cost of updating the local database significantly more than materializing A1. To illustrate, assume the unit of time is the second. In one hour, A1 will change 36 times (3600 × 0.01) while A2 will change 306 times (3600 × 0.085). With access costs of 6 and 4, their update costs over one hour are therefore 216 and 1224, respectively. Thus A1 is less expensive to materialize in the long term than A2, even though its access cost is higher. The access cost alone therefore does not give a correct picture of an attribute. For this reason, we define a new cost, which we call the update cost, given by:

c_Ai = accessCost(Ai) × frequencyOfChange(Ai)

Size: The size of the space available for materialization is a critical factor in a hybrid integration system, so it must be taken into account when choosing the data to materialize. In most approaches, smaller views are preferred so that more free space remains. This choice is, in our opinion, inappropriate: our goal is not to materialize a maximum number of views, but rather to satisfy a maximum number of queries from the local database, which improves the performance of the system, especially query response time. Thus, for each attribute Ai, we calculate the ratio of the number of queries satisfied to the size:

t_Ai = f_Ai / s_Ai

where sAi is the size occupied by the attribute Ai in the materialization database and fAi its frequency of appearance. Attributes with a higher value of tAi are favored for materialization over those with a lower value.
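The criteria just defined are simple enough to check numerically. The sketch below (the function names are ours) reproduces the A1/A2 update-cost example of Table V and the t values of Table VI:

```python
# Per-attribute criteria from the text, as plain functions (names ours).
def update_cost(access_cost, change_freq):
    """c_Ai: access cost weighted by how often the attribute changes."""
    return access_cost * change_freq

def availability(times_available, times_requested):
    """a_Ai: fraction of requests for which the attribute was available."""
    return times_available / times_requested

def satisfied_per_size(appearance_freq, size):
    """t_Ai: queries satisfied per unit of materialization space."""
    return appearance_freq / size

# Table V example: A1 (cost 6, changes 0.01/s), A2 (cost 4, changes 0.085/s).
# Over one hour (3600 s): A1 costs 36 * 6 = 216, A2 costs 306 * 4 = 1224.
print(round(3600 * update_cost(6, 0.01)))    # 216
print(round(3600 * update_cost(4, 0.085)))   # 1224

# Table VI example: t(A1) = 24/8 = 3, t(A2) = 12/5 = 2.4, so A1 is favored.
print(satisfied_per_size(24, 8), satisfied_per_size(12, 5))
```

Even though A1 is costlier to access, its hourly update cost is an order of magnitude lower, which is exactly the point the update-cost criterion captures.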


Consider, for example, two attributes A1 and A2 whose characteristics are presented in Table VI.

TABLE VI
CHARACTERISTICS OF THE ATTRIBUTES A1 AND A2

Attribute  Size  Frequency of appearance  t
A1         8     24                       3
A2         5     12                       2.4

In this example, it is preferable to materialize A1 rather than A2: relative to its size, it satisfies more queries than A2.

IV.1.2. Selection of Attributes

So far, we have defined the selection criteria. It remains to choose the attributes that will appear in the views to materialize. Thus, for each attribute Ai of the set SA of attributes of interest (extracted previously), we compute the benefit expressed as:

Φ(Ai) = α·ω(f_Ai) + β·ω(c_Ai) + γ·ω(1/a_Ai) + δ·ω(t_Ai)

where α, β, γ and δ are constants defined by the administrator according to the importance given to each criterion, and ω a harmonization function defined for each xi ∈ {xj / 1 ≤ j ≤ N} by:

ω(xi) = (xi − min_{j=1..N} xj) / (max_{j=1..N} xj − min_{j=1..N} xj)

This function is justified by the fact that the values taken by the attributes for the different criteria are of different orders of magnitude. It is therefore necessary to harmonize them into values of the same order (in [0, 1]), so that the criteria carry the same weight and thus the same influence on the final benefit. Once the benefit of each attribute is obtained, we keep selecting the most beneficial ones until no free memory space remains. However, some attributes may never be selected even if the local database is not yet saturated: those whose benefit falls below a threshold known as BENEFIT-THRESHOLD. This choice is justified by the fact that such attributes would bring no benefit to the system and would otherwise degrade its performance. For this reason, we have defined an algorithm that we call the Attribute Selection Algorithm (ASA). It receives as input the set of queries posed on the system and returns as output the set of attributes from which the views to materialize will be created.

Given: SQ the set of queries previously posed on the system, and S the size of the local database.
Begin
  SM = Ø;  /* the set of attributes that will appear in the views to materialize */
  SA;      /* the set of attributes of interest */
  SA = EXTRACT-ATTRIBUTES-OF-INTEREST(SQ);
  While size(SM) < S and Cardinality(SA) > 0 do
    j = 0;
    For i = 1 to Cardinality(SA) do
      If Φ(Ai) > Φ(Aj) Then
        j = i;
      End If
    End For
    If Φ(Aj) > BENEFIT-THRESHOLD Then
      SM = SM ∪ {Aj}
    End If
    SA = SA \ {Aj};  /* the operator "\" denotes set difference */
  End While
  Return SM;
End.

IV.2. Creation of Views

The attributes obtained in the previous step are those that will appear in the views to materialize; it remains to organize them as views. To do this, we use the k-schema algorithm, which provides a set of view schemas to materialize. In a last step, we assign constraints to the attributes, forming instances; the sets of instances of the same schema are merged to obtain the views to materialize [14].

V.

Experimental Evaluation

In this section, we test our approach. To do this, we developed a prototype, shown in Fig. 3, and compare the results obtained by our approach with those obtained by the PHIS approach. Consider the global schema {A1, A2, ..., A30} whose attribute characteristics are shown in Table II. Assuming that the threshold used to select the attributes of interest is the average frequency (here equal to 35.53), the extracted attributes of interest are {A1, A10, A12, A15, A16, A17, A18, A19, A20, A21, A24, A26, A27, A28, A30}. From Table II, we calculated the new characteristics of the attributes of interest and their benefits; the results obtained are shown in Table VII. Assuming that the size of the space available for materialization is 40 units and that the benefit threshold is the average benefit (here equal to 1.75), the ASA algorithm selects the attributes {A20, A27, A15, A24, A17, A21, A30, A26, A10}. Calling the k-schema algorithm, we obtain the views to materialize in the mediator, shown in Table VIII. In the next section, we compare the results obtained by our approach with those obtained by the PHIS approach; to do this, we generated, at different times and in a random manner, a global schema together with the characteristics of its attributes.
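The harmonization function ω and the greedy structure of ASA can be sketched as follows. This is an illustrative reduction, not the authors' implementation: for simplicity the benefit here is taken to be the harmonized t_Ai alone (i.e., a single criterion), and the attribute names, sizes and capacity are invented:

```python
# Sketch of min-max harmonization (omega) and the greedy ASA loop.
def harmonize(values):
    """Map each value into [0, 1] by min-max normalization (the omega function)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def asa(attrs, benefit, size, capacity, threshold):
    """Greedy selection: repeatedly take the most beneficial remaining
    attribute, skipping those below the benefit threshold or too large
    to fit, until the space for materialization is exhausted."""
    remaining = list(attrs)
    selected, used = [], 0
    while remaining and used < capacity:
        best = max(remaining, key=lambda a: benefit[a])
        remaining.remove(best)
        if benefit[best] > threshold and used + size[best] <= capacity:
            selected.append(best)
            used += size[best]
    return selected

# Toy benefit: harmonized t values for four attributes A..D (values ours).
scores = dict(zip("ABCD", harmonize([31.5, 17.0, 12.0, 0.42])))
sizes = {"A": 2, "B": 4, "C": 3, "D": 8}
print(asa("ABCD", scores, sizes, capacity=9, threshold=0.1))  # ['A', 'B', 'C']
```

Note how the threshold excludes low-benefit attributes (D harmonizes to 0.0) even when space remains, mirroring the BENEFIT-THRESHOLD rationale in the text.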


TABLE VII
ATTRIBUTES OF INTEREST AND THEIR CHARACTERISTICS (with α = 1, β = 1, γ = 1 and δ = 1)

Ai    fAi    aAi   sAi  cAi  tAi    Φ(Ai)
A20   0.02   12%   2    7    31.5   2.52
A27   0.016  14%   4    7    17     2.19
A24   0.027  27%   2    8    31.5   1.89
A17   0.023  17%   3    5    12     1.50
A15   0.04   22%   3    2    12     0.95
A21   0.057  19%   3    3    26     1.34
A10   0.034  43%   4    3    12     0.86
A30   0.06   8%    7    2    8      1.36
A26   0.045  41%   3    3    16.67  0.89
A28   0.046  27%   3    9    15.67  1.11
A19   0.032  55%   5    5    10.4   0.86
A16   0.05   36%   6    4    9.83   0.68
A18   0.063  66%   8    5    4.88   0.35
A12   0.092  85%   86   41   0.42   1.01
A1    0.089  91%   70   35   0.51   0.86

TABLE VIII
MATERIALIZED VIEWS

View 1
Attribute  Frequency of change  Size  Availability  Access cost
A21        0.057                3     19%           3
A10        0.034                4     43%           3
A15        0.04                 3     22%           2
G.C        0.057                10    28%           8

View 2
Attribute  Frequency of change  Size  Availability  Access cost
A26        0.045                3     41%           3
A30        0.06                 7     8%            2
G.C        0.06                 10    24.5%         5

View 3
Attribute  Frequency of change  Size  Availability  Access cost
A27        0.016                4     14%           7
A17        0.023                3     17%           5
A24        0.027                2     27%           8
A20        0.02                 2     12%           7
G.C        0.07                 11    17.5%         27

(G.C denotes the global characteristics of each view.)

Fig. 3. Prototype of our solution


We then defined a number of factors on which will based our comparison, namely:  Availability: This is an important factor to take into consideration in the choice of the part to materialize. A system is best if it materialize the data rarely available.  Update cost: The cost needed to update the data at the mediator. It is equal to the access cost to these data multiplied by their frequency of change.  Ratio queries satisfied / size: The difference between our approach and the others is that they seek to maximize the size of data to materialize, whereas in our approach we seek to maximize the satisfied queries/size. This represents the number of satisfied queries per space unit.  Access cost: This factor is used to select the data with a high access cost, in order to materialize it in the mediator. A system is better if it materialize views with access cost is high. Based on these factors, we have plotted the following curves (Fig. 4).

we noted in Fig. 5, our approach has provided good results in this sense as PHIS.

Fig. 6. The ratio number of satisfied queries/size

As explained earlier in this paper, it is unnecessary to envisage the maximization of the materialized part and in the same time, it satisfies fewer requests. For this reason, we tried, as evidenced in Fig. 6 to maximize the number of satisfied queries / size.

Fig. 4. The availability of the materialized part Fig. 7. The access cost to the materialized part

As we note in Fig. 4, MATHIS has materialized the less available data. This implies that the number of queries satisfied among those requesting the sources rarely available becomes important.

The strength of our approach is that the cost of updating the materialized data is less important, and at the same time, the cost to extract them from sources is important. Materializing this type of data will make user queries less expensive. That explains the results presented in Fig. 7.

VI.

Conclusion and Outlooks

The hybrid integration systems are the most efficient solution for the integration of information sources. In fact, they offer a single point of access to various sources on the one hand, and provide a tradeoff between queries response time and freshness of data on the other hand. For this reason, they provide a local database where we store the information chosen in a selective manner. As the materialized data are organized as views, we propose a solution for this reason. To do this, we select the attributes most requested by users. Among them, we select those that respond better to selection criteria. These latter are then organized as views that will be

Fig. 5. The update cost of the materialized part

In a hybrid integration system, it is important that the up to date of the materialized part is less expensive. As

Copyright © 2013 Praise Worthy Prize S.r.l. - All rights reserved

International Review on Computers and Software, Vol. 8, N. 3

824

S. Anter, A. Zellou, A. Idri

[13] S. Anter, A. Zellou and A. Idri, “vers une architecture d’un système d’intégration hybride ”, in 7th international conference on intelligent systems: theories and applications, mohmmedia, morocco, 2012. [14] S. Anter, A. Zellou and A. Idri, “personalization of a hybrid integration system: Creation of views to materialize based on the distribution of user queries”, IEEE international Conference on Complex Systems (ICCS'12), Agadir, Morocco, 2012. [15] S. Anter, A. Zellou and A. Idri, “K-Schema: A new approach, based on the distribution of user queries, to create views to materialize in a hybrid integration system”, Journal of Theoretical and Applied Information Technology, Volume 47, Issue 1, Pages 158-170, 2013. [16] Xinyu Geng, Li Yang, and Xiaoyan Huang, A Distributed Data Access Model based on Multi-task Cooperative Agent, (2012) International Review on Computers and Software (IRECOS), 7 (7), pp. 3770-3775.

materialized. In the selection step, we are based on the same criteria used by the most approaches. Unlike the later that apply them on views, in our approach, they are applied on attributes. This choice is justified by the fact that it will be difficult to choose among the views that contain attributes that respond to selection criteria with different degrees. We also proposed a new method to calculate the aggregation of different criteria where we have exploited the relationships between them, unlike other methods that assume that they are independent. In our approach, we based only on the distribution of user queries for the selection of attributes that will appear in the views. It will be useful to exploit the user profile to obtain information about its interests and thus consider it in this phase. We based also, on the appearance of attributes in queries to calculate the degree of dependency. It is possible to exploit the domain ontology to calculate this dependency. In this article, we have calculated the global characteristics of the views in a very simplified manner. In future work, we propose to define functions that provide the precise results of view’s global characteristics, from the characteristics of their attributes.

Authors’ information Software Project Management (SPM) Team, Computer Science and Systems Analysis National Higher School (ENSIAS) Mohammed V Souissi University, Rabat, Morocco. E-mails: [email protected] [email protected] [email protected] Samir Anter is a professor at preparatory classes for high schools in Rabat – Morocco. He is a Ph.D student in systems Analysis National Higher School (ENSIAS) in Rabat – Morocco. His main research interests are hybrid integration information, fuzzy clustering, and data clustering.


Ahmed Zellou is a university professor at the Systems Analysis National Higher School (ENSIAS) in Rabat, Morocco. He received his Ph.D. in Applied Sciences from the Mohammedia Engineering School, Rabat (Morocco) in 2008. His main research interests are hybrid information integration, fuzzy and semantic integration, and semantic P2P. Ali Idri is a university professor at the Systems Analysis National Higher School (ENSIAS) in Rabat, Morocco. He received his Ph.D. in Cognitive Informatics from the University of Quebec at Montreal in 2003. His main research interests are empirical software engineering, software project management, and data mining.

Copyright © 2013 Praise Worthy Prize S.r.l. - All rights reserved

International Review on Computers and Software, Vol. 8, N. 3


International Review on Computers and Software (I.RE.CO.S.), Vol. 8, N. 3 ISSN 1828-6003 March 2013

An Efficient Hierarchical Tree Alternative Path (HTAP) and Prim's Algorithm Based QoS Routing Approach for MPLS Networks with Enhanced Bandwidth Constraints

S. Veni¹, G. M. Kadhar Nawaz²

Abstract – Multi-Protocol Label Switching (MPLS) technology supports numerous features in the Internet, such as routing performance, speed, and traffic engineering. MPLS provides mechanisms in IP backbones for explicit routing using Label Switched Paths (LSPs), encapsulating the IP packet in an MPLS packet. This research work concentrates on a new constraint-based routing algorithm for MPLS networks which uses both bandwidth and delay constraints. The Hierarchical Tree Alternative Path (HTAP) algorithm is employed for balancing traffic loads through underutilized paths in order to reduce network congestion. The delay of the path computed by the algorithm must be less than or equal to the delay constraint value, and the residual bandwidth of all the links along the computed path must be greater than or equal to the bandwidth constraint value. In the proposed algorithm the best path is computed by avoiding vital links to reduce the call blocking rate, by deleting the paths which do not satisfy the bandwidth and delay constraints to reduce the complexity of the algorithm, and by using Prim's algorithm to reduce the path length by finding the shortest path. The evaluation also compares two different topologies to study the performance of the proposed algorithm. Copyright © 2013 Praise Worthy Prize S.r.l. - All rights reserved.

Keywords: MPLS, Delay and Bandwidth Constraints, Hierarchical Tree Alternative Path Algorithm, Prim's Algorithm

Nomenclature

∑ Rxᵢ   Total receiving rate
Tx_max  Maximum rate at which a node can transmit
C(j)    Criticality of link j
W(j)    Weight of link j

I. Introduction

The growth of the Internet has been impressive in recent years. The coming high-speed optical networks are expected to support a wide variety of communication for real-time multimedia applications. The Internet has gradually become an integrated carrier for multiple services such as data, voice, video and multimedia. New multimedia applications require the network to guarantee quality of service. MPLS networks have the capability of routing under specific constraints to support the desired QoS. Rather than replacing IP routing, MPLS is designed to superimpose its functionality on top of existing and future routing technologies and to work across a variety of physical layers, facilitating well-organized data forwarding together with reservation of bandwidth for traffic flows with differing QoS requirements concerning bandwidth, delay, jitter, packet loss and reliability.

Multi-Protocol Label Switching is a multiservice Internet technology based on forwarding packets by means of a specific label switching technique. The premise of MPLS is to attach a short fixed-length label to packets at the access router of the MPLS domain. The edge routers are called Label Edge Routers (LERs), while routers able to forward both MPLS and IP packets are called Label Switching Routers (LSRs). Packets are forwarded along a Label Switched Path, where each LSR makes a forwarding decision; each LSR re-labels and switches incoming packets according to its forwarding table. Label switching provides a new, efficient and fast resilience mechanism and speeds up packet forwarding. Figure 1 shows an MPLS domain. The Label Distribution Protocol (LDP) and an extension of the Resource Reservation Protocol (RSVP) are used to set up, maintain, and remove LSPs. The MPLS network architecture does not offer header or payload encryption.

MPLS Operation

The MPLS mechanism works by prefixing packets by

Manuscript received and revised February 2013, accepted March 2013


means of an MPLS header consisting of one or more labels, called a label stack. Each label stack entry contains four fields [1], [18]:
• A 20-bit label value.
• A 3-bit Traffic Class field for QoS priority (experimental) and ECN (Explicit Congestion Notification).
• A 1-bit bottom-of-stack flag. If this is set, it signifies that the current label is the last in the stack.
• An 8-bit TTL (Time to Live) field.
In an MPLS network, labeled packets are switched after a label lookup/switch instead of a lookup into the IP table. Labels are distributed between LERs and LSRs by means of the Label Distribution Protocol. LSRs in an MPLS network regularly exchange label and reachability information with each other using standardized procedures, in order to build a complete picture of the network which they can then use to forward packets.

When an unlabeled packet enters the ingress router and needs to be forwarded into an MPLS tunnel, the router first determines the Forwarding Equivalence Class (FEC) the packet belongs to and then inserts one or more labels into the packet's newly created MPLS header. The packet is then passed on to the next-hop router for this tunnel. When a labeled packet is received by an MPLS router, the topmost label is examined first. Based on the contents of the label, a swap, push or pop operation is carried out on the packet's label stack. Routers can hold pre-built lookup tables that tell them which kind of operation to perform based on the topmost label of the incoming packet, so they can process the packet very quickly.

To ensure end-to-end QoS guarantees [2], [3], QoS routing protocols often impose a minimum QoS requirement on the path chosen for data transmission. Limiting the hop count of the selected path reduces resource consumption, while choosing the least loaded path balances the network load.
Many QoS routing protocols exist for MPLS networks; each finds an optimal path by means of its path selection algorithm. This paper focuses on both bandwidth and delay constraints and on congestion control during traffic load balancing. The delay of the path computed by the algorithm must be less than or equal to the delay constraint value, and the residual bandwidth of all the links along the computed path must be greater than or equal to the bandwidth constraint value. The proposed MPLS routing algorithm, a new QoS routing algorithm for MPLS networks with delay and bandwidth constraints, provides performance improvements in terms of CPU time, path length, call blocking ratio and maximum flow.
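The label-stack entry layout described above (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL) can be illustrated with a small encoding/decoding sketch. The helper names are ours, not part of the paper or of any MPLS implementation:

```python
def encode_mpls_entry(label: int, tc: int, s: int, ttl: int) -> int:
    """Pack one 32-bit MPLS label stack entry:
    20-bit label | 3-bit traffic class | 1-bit bottom-of-stack | 8-bit TTL."""
    assert 0 <= label < 2**20 and 0 <= tc < 8 and s in (0, 1) and 0 <= ttl < 256
    return (label << 12) | (tc << 9) | (s << 8) | ttl

def decode_mpls_entry(entry: int):
    """Unpack a 32-bit stack entry back into its four fields."""
    return ((entry >> 12) & 0xFFFFF,  # 20-bit label value
            (entry >> 9) & 0x7,       # 3-bit traffic class
            (entry >> 8) & 0x1,       # bottom-of-stack flag
            entry & 0xFF)             # 8-bit TTL

entry = encode_mpls_entry(label=16, tc=5, s=1, ttl=64)
print(decode_mpls_entry(entry))  # (16, 5, 1, 64)
```

A router performing a swap operation would decode the top entry, replace the label value, decrement the TTL, and re-encode.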

Fig. 1. MPLS Domain

II. Related Work

The most popular algorithms — the Minimum Hop Algorithm (MHA), the Widest Shortest Path algorithm (WSP), the Minimum Interference Routing Algorithm (MIRA), and the Bandwidth-Guaranteed MPLS Routing Algorithm (BGMRA) — are presented in [1]. These algorithms take into account the topological layout of the ingress and egress points of the network.

II.1. Min-Hop Algorithm

The Min-Hop Algorithm selects the path with the least number of links between source and destination. This scheme, based on Dijkstra's algorithm, is simple and computationally efficient. However, applying MHA everywhere can result in heavily loaded bottleneck links in the network, as it tends to overload some links while leaving others underutilized.

II.2. Widest Shortest Path Algorithm

The Widest Shortest Path algorithm improves on Min-Hop by attempting to balance the network load. WSP chooses a feasible path with the minimum number of links and, if there are several such paths, the one with the largest residual bandwidth, thereby discouraging the use of already heavily loaded links. WSP nevertheless retains the same drawback as MHA: path selection is performed among the shortest feasible paths, which are used until saturation before switching to longer feasible paths.

II.3. Shortest Widest Path Algorithm

The Shortest Widest Path algorithm finds the path with the greatest available bandwidth and, if there is more than one such path, the one with the least number of hops. SWP can also create bottlenecks for future LSPs and lead to network underutilization.
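The tie-breaking rules of WSP and SWP described above can be sketched with hypothetical candidate paths; the path names, hop counts and bandwidth figures are illustrative only:

```python
# Each candidate path maps to (hop_count, residual_bandwidth_of_bottleneck_link).
paths = {
    "A-B-D":   (2, 30),
    "A-C-D":   (2, 80),
    "A-B-C-D": (3, 100),
}

def wsp_choice(candidates):
    """Widest Shortest Path: minimize hops, break ties by widest bottleneck."""
    return min(candidates, key=lambda p: (candidates[p][0], -candidates[p][1]))

def swp_choice(candidates):
    """Shortest Widest Path: maximize bottleneck, break ties by fewest hops."""
    return max(candidates, key=lambda p: (candidates[p][1], -candidates[p][0]))

print(wsp_choice(paths))  # A-C-D: shortest (2 hops), widest among the 2-hop paths
print(swp_choice(paths))  # A-B-C-D: widest (100), despite being longer
```

The two key functions make the difference between the algorithms explicit: WSP sorts on hop count first, SWP on bottleneck bandwidth first.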


II.4. Minimum Interference Routing Algorithm

MIRA routes a new LSP over a path which least interferes with possible future requests. MIRA makes use of knowledge of the ingress-egress pairs when finding a feasible path. The main aim is to route a new connection along a path that does not interfere with a path that may be critical to satisfying a future demand. Here, a critical link is a link whose inclusion in a path would decrease the maximum flow (max-flow) value of one or more ingress-egress pairs. The algorithm attempts to avoid critical links as much as possible during path selection. In detail, MIRA measures the amount of interference on a particular ingress-egress pair (s, d) as the decrease in the maximal available bandwidth between s and d [4]. With this type of algorithm, path lengths can become long enough to make paths virtually impracticable.

The main goal of QoS-based routing is to select the most appropriate path according to the traffic requirements of multimedia applications. Appropriate transmission paths are selected through routing mechanisms based on available network resources and QoS requirements. Multimedia applications may suffer quality degradation in traditional networks such as the Internet [11]. This problem can be resolved in networks that provide dynamic path creation with guaranteed bandwidth and bounded delays [12]. Real-time applications impose strict QoS requirements, expressed by parameters such as acceptable end-to-end delays, necessary bandwidth [21] and acceptable losses. For instance, audio and video transmissions have strict requirements for delay and losses, and wide bandwidth has to be guaranteed for high-capacity transmission. Real-time traffic, video in particular, quite often consumes major quantities of network resources. Efficient management of network resources reduces network service cost and allows more applications to be transmitted at the same time. The task of finding appropriate paths through networks is handled by routing protocols. Since common routing protocols are reaching their acceptable complexity limits, it is important that the complexity introduced by QoS-based routing [13], [19] does not harm the scalability of routing protocols. MPLS is a versatile solution to many current problems faced by the Internet [14]. With wide support for QoS and traffic engineering, MPLS is establishing itself as a standard of the next generation's networks.

III. Proposed Algorithm for QoS Routing

A new QoS routing algorithm for MPLS networks with bandwidth and delay as constraints is presented here. Consider a network with n nodes (routers). A subset of these routers is considered to be the ingress-egress routers between which connections are set up. A path setup request arrives at the ingress router, where an explicit route for the request is computed locally. The ingress router sets up the path to the egress and reserves resources on each link along the path. For the computation of the explicit route, the ingress router needs to know the current network topology, the links' reserved bandwidth and the minimum delay, which are assumed to be known [5]. The main goal is to establish a feasible path for each request which satisfies the constraints of bandwidth and delay and performs well in terms of call blocking ratio, path length, CPU time and maximum flow [3]. The proposed algorithm combines link-constrained and path-constrained routing. The design objectives, the weight calculation, the path selection and the details of the routing algorithm are given below.

III.1. Designing Objectives
• Minimize interference levels among source-destination node pairs, in order to reserve more resources for future bandwidth demands.
• Balance traffic loads through underutilized paths in order to reduce network congestion, using the Hierarchical Tree Alternative Path (HTAP) algorithm.
• Optimize network resource utilization using Prim's algorithm.
• Reduce algorithm complexity.

III.2. Hierarchical Tree Alternative Path (HTAP) Algorithm

This work builds on a congestion control algorithm which bases its functionality on the creation of alternative paths from sources to sinks in order to prevent congestion from occurring. HTAP is a dynamic congestion control algorithm which bases its path switching decisions on local information, such as the congestion state of its neighbors. HTAP consists of four different schemes:
• Topology control;
• Hierarchical tree creation;
• Alternative path creation;
• Handling of powerless (dead) nodes.

III.2.1. Topology Control

Topology control is critical in WSNs because it can handle issues arising from a superfluous number of nodes and their dense deployment. Problems such as interference, a maximum number of possible routes, and the use of maximum power to communicate directly with isolated nodes are likely to arise. Since HTAP is an algorithm which attempts to utilize the network's spare resources (unused nodes), it is clear that guaranteeing a redundant number of paths is


essential. In order to maintain the performance characteristics of the network in case of congestion, these paths must be carefully chosen. Topology control is therefore the first scheme applied in the HTAP algorithm. An effective topology control algorithm should be able to preserve connectivity using minimal power, while maintaining an optimal number of neighbors for each node. This work uses, with a variation, the Local Minimum Spanning Tree (LMST) algorithm [6] as the initial topology control that runs on the network. LMST preserves the network connectivity while using minimal power, and the degree of any node in the resulting topology is limited to six. As explained in [7], this feature (six neighbors per node) is desirable because a small node degree reduces MAC-level contention and interference. In LMST every node constructs its local minimum spanning tree independently, by means of Prim's algorithm [8], and keeps in the tree only those nearest nodes which are one hop away. Additionally, the resulting topology is expected to use only bidirectional links, a property which is essential for the successful operation of HTAP. The variation introduced to LMST by HTAP concerns the selection of the neighbor list. Instead of selecting as a neighbor any node that fulfils the condition (maximum six), the modified LMST used by HTAP keeps as neighbors only those nodes that reside one level closer to the sink than the node itself.
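Prim's algorithm [8], which each node runs locally in LMST, can be sketched with a generic minimal implementation (our own sketch, not the authors' code):

```python
import heapq

def prim_mst(n, edges):
    """Prim's algorithm: n nodes (0..n-1), undirected edges as (weight, u, v).
    Returns the (weight, node) pairs added to a minimum spanning tree
    grown from node 0."""
    adj = {u: [] for u in range(n)}
    for w, u, v in edges:
        adj[u].append((w, v))
        adj[v].append((w, u))
    visited = {0}
    frontier = list(adj[0])            # candidate edges leaving the tree
    heapq.heapify(frontier)
    tree = []
    while frontier and len(visited) < n:
        w, v = heapq.heappop(frontier) # cheapest edge out of the tree so far
        if v in visited:
            continue
        visited.add(v)
        tree.append((w, v))
        for e in adj[v]:
            heapq.heappush(frontier, e)
    return tree

# Square 0-1-2-3 with one diagonal: the MST keeps the three lightest edges.
edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0), (5, 0, 2)]
print(sorted(w for w, _ in prim_mst(4, edges)))  # [1, 2, 3]
```

In LMST each node would run this on its one-hop neighborhood, with edge weights derived from the transmission power needed to reach each neighbor.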

III.2.2. Hierarchical Tree Creation

The hierarchical tree creation algorithm runs on top of the topology control algorithm, and only from the moment a node becomes a source (starts sensing a phenomenon). This algorithm consists of two main steps.

A. Path Creation
In this step a hierarchical tree is formed, beginning at the source node. After the end of the topology control phase, each node can be linked to at most six nodes which are one hop away from itself. During this phase each node that becomes a source assigns itself level 0 and sends a level discovery message to the six neighbors selected during the topology control phase. Nodes that receive this packet are considered children of the source node and are set as level 1. Each of these nodes retransmits the level discovery packet, and the pattern continues with the level 2 nodes and so on. This procedure iterates until all nodes are assigned a level, and stops when the level discovery packets reach the sink. When the procedure finishes it is possible that the sink has received more than one level discovery packet from different nodes, each packet carrying a different level value. This is an indication that disjoint paths reach the sink. The hierarchical tree algorithm is also able to recognize and rectify some issues that are likely to arise.

In particular, it is possible for a node to be the last one that receives the level discovery packet, because there are no other upstream nodes able to forward it. In that case the node broadcasts a negative acknowledgement (NACK) packet, indicating that it cannot route any packets. When the node that forwarded the packet receives the NACK, it learns that it cannot route any packets through this node. Figure 1(a) corresponds to the network's connectivity after the LMST topology control algorithm has been applied. After topology control, the level placement procedure takes this topology as input and attempts to place nodes in levels from each source to the sink. For instance, suppose node 1 is the source. When node 1 becomes a source, it transmits a message to the nodes it can reach according to the results of the topology control algorithm. In this case, nodes 2, 3 and 4 receive this message and are assigned as level 1 nodes for this source (node 1). These nodes (2, 3 and 4) then broadcast the message to the nodes they are linked to, and those nodes are assigned as level 2 nodes (5, 6, 7 and 8). The procedure iterates until packets reach the sink. During this procedure, if a node receives a packet from more than one node, it keeps a link only with the node that places it higher in the tree (the smaller level number). For instance, node 7 is connected, after the end of the topology control phase, with nodes 6 and 4. In the level placement procedure it will receive a packet from node 6, asking it to become a level 3 node, and a packet from node 4, asking it to become a level 2 node. In this case, node 7 becomes a level 2 node and asks its neighbor nodes to become level 3 nodes. Finally, node 17 represents the case where a node does not have any upstream node to which to transmit a packet; in such a case, the level placement algorithm removes this node from the table of upstream nodes of node 13.

B. Flow Establishment
A connection is established between each transmitter and receiver pair by means of a two-way handshake. Packets are exchanged between each transmitter and receiver in the network in order to get associated. During this packet exchange, the congestion state of each receiver is communicated to the transmitter. Consider again Figure 1, where node 1 is the source and nodes 2, 3 and 4 are receivers. Initially, node 1 sends a packet to node 2. When node 2 receives this packet, it sends an acknowledgement back to node 1, piggybacking its current congestion state. This exchange makes the source node aware of the congestion state of all its next-hop neighbors that it can overhear. When the congestion state of the children reaches a pre-specified limit, these nodes update their congestion state and inform node 1 using the bidirectional link.
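The level placement procedure described above — source at level 0, each node keeping the parent that places it highest in the tree — can be sketched as a breadth-first traversal. This is our illustrative reading of the procedure, with a toy topology echoing the node-7 example (reachable from node 4 at level 1 and node 6 at level 2):

```python
from collections import deque

def assign_levels(adj, source):
    """Breadth-first level placement: the source is level 0 and every other
    node takes the smallest level offered by its neighbors, i.e. it keeps
    the parent that places it highest in the tree."""
    level = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:          # first (smallest) level offer wins
                level[v] = level[u] + 1
                queue.append(v)
    return level

# Toy topology: node 7 hears offers from node 4 (would make it level 2)
# and node 6 (would make it level 3); it ends up as a level 2 node.
adj = {1: [2, 3, 4], 2: [5], 3: [6], 4: [7, 8], 5: [], 6: [7], 7: [], 8: []}
print(assign_levels(adj, 1)[7])  # 2
```

BFS visits nodes in non-decreasing level order, so the first offer a node receives is automatically the smallest, which matches the "keep the better parent" rule.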


III.2.3. Alternative Path Creation

This algorithm runs when congestion is likely to occur at a particular node in the network. Although the topology control algorithm in use is able to counteract collisions in the medium by choosing the smallest transmission power (one-hop nodes), congestion is still possible when a node receives packets at a higher rate than it can transmit (buffer-based congestion). In a wireless sensor network where all nodes, with the exception of the sink, are exactly the same, this can happen if a node is receiving packets from at least two flows, or if the nodes to which it has to transmit packets cannot accept any more packets. When the buffer of a node starts filling significantly, the node has to take action. In such a case each node is automatically programmed to run locally a lightweight congestion detection (CD) algorithm. As soon as the buffer reaches a buffer-level threshold value, the CD algorithm starts counting the rate at which packets are reaching the node. Since each packet is identified by the NodeID in its packet header, the CD algorithm is aware of the nodes that are transmitting packets through this node, as well as of their data rates:

∑ Rxᵢ ≥ Tx_max    (1)

By using Equation (1), the CD algorithm can estimate the total receiving rate (∑ Rxᵢ) and compare it with the maximum rate at which the node can transmit (Tx_max). When this ratio is high, above a certain percentage, the node sends a backpressure message to the nodes that keep transmitting packets through it, asking them to search for another path. These nodes are selected starting with those that transmit at the lowest rate. The purpose of this method is to preserve the performance characteristics of the network (keep the throughput of nodes at the maximum possible level without packet drops) and to minimize the impact of the change on the network. When a node is informed through the bidirectional link to stop transmitting through a specific node, it searches its neighbor table, finds the next available node with the higher level, and starts transmitting through it. If that node is within transmission range of the congested node, it is already aware of its condition and does not use it either. This method results in the relief of the congested node. An advantage of HTAP compared with similar schemes such as [9] and [10] is that it does not make use of special nodes such as distributors and mergers. Thanks to the topology control algorithm and the source-based hierarchical tree, each node is able to slow down the transmission of packets through itself and is also capable of joining the first available shortest path after path alternation.

A. Congestion Threshold
As mentioned before, a key point in the operation of the HTAP algorithm is the value of the congestion threshold. Nodes use Equation (1) to calculate the total receiving rate and compare it with the maximum transmission rate they have at the moment. The issue that arises is that both the receiving and the sending rate depend heavily on the current network situation. It is therefore possible for a node to receive a large number of packets in a short period of time and then to keep receiving packets at a lower (normal) rate. Such a node is congested in terms of occupied buffer space but is in fact not experiencing any problem that forces it to handle an overload situation. Consequently, the parameter that needs to be tuned is not just the value of the threshold, but also the duration for which this threshold is exceeded (the burst period). If the duration is set too low, the alternative path creation algorithm will be triggered frequently, and in cases where the situation is transient the construction of alternative paths will add redundant overhead to the network (delays and power consumption). If, on the other hand, the burst period is very long, buffer overflows will take place and nodes will react too late. HTAP handles this matter by means of an adaptive method. Initially, buffer monitoring starts when the buffer occupancy of a node reaches 50% of the total. At this point, the affected node counts the number of nodes from which it is receiving packets. It then assumes that each of these nodes is transmitting at the maximum data rate and calculates the time until the buffer occupancy will reach the 85% limit. When this time elapses it checks the buffer occupancy again. If it is between 80 and 85%, the node considers that it is indeed receiving packets at a higher rate than it can transmit and triggers the "alternative path" algorithm, so as to avoid congestion. If the buffer occupancy is less than 80%, the node recalculates the remaining buffer and adjusts the waiting time accordingly, which is obviously reduced. If at the next measuring period the buffer occupancy is still below 80%, the first threshold is adjusted from 50% to 70%. Such behavior can occur in areas where a long-lasting event is taking place; when a long-lasting event affects the network, setting the first threshold at just 50% is expected to contribute to the overhead. Let us consider a network whose nodes have a buffer size (B) of 128 Kbytes and a maximum data rate (r) of 128 Kbps, and a node that is receiving data from five different nodes (n). In this case, when 50% of the buffer (64 Kbytes) is full, the node begins the process. It counts the number of nodes from which it is receiving data (here, five) and assumes that they transmit at the maximum data rate. Thus, it calculates the time t = 64 Kbytes/(5 × 128 Kbps) = 0.8 s. This means that after 0.8 s the CD algorithm will check the buffer occupancy again. If the buffer occupancy is between 80 and


85%, the node will trigger the "alternative path" algorithm. If it is lower, e.g. 65%, the node will recount the number of nodes sending it packets and adjust the time accordingly. Suppose it is now receiving packets from four nodes instead of five: it will calculate the remaining buffer up to 85%, which is now 20%, together with the maximum data rate of four nodes. This gives a new waiting time, and the algorithm will check the buffer occupancy after this time elapses. If the occupancy remains below 70% but above 50%, the first threshold is set to 70% and monitoring stops until the buffer occupancy exceeds this threshold (70%). By using this extensive but lightweight congestion detection scheme, the HTAP algorithm is able to face both permanent and transient congestion situations effectively.
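The adaptive congestion check described above can be sketched as follows, using the paper's numbers (B = 128 Kbytes, r = 128 Kbps, n = 5 senders); the function names are ours:

```python
def time_until_limit(free_kbytes, n_senders, rate_kbps):
    """Worst-case time (s) until the remaining buffer fills, assuming every
    sender transmits at the maximum data rate (8 bits per byte)."""
    return (free_kbytes * 8) / (n_senders * rate_kbps)

def congestion_detected(rx_rates_kbps, tx_max_kbps):
    """Equation (1): total receiving rate >= maximum transmit rate."""
    return sum(rx_rates_kbps) >= tx_max_kbps

# Paper's example: monitoring starts at 50% of a 128 Kbyte buffer
# (64 Kbytes free), five senders at 128 Kbps each -> re-check after 0.8 s.
print(time_until_limit(free_kbytes=64, n_senders=5, rate_kbps=128))  # 0.8
print(congestion_detected([40, 40, 50], tx_max_kbps=128))  # True (130 >= 128)
```

On a real node the 0.8 s result would schedule the next buffer-occupancy check, and the 50/70/80/85% thresholds would decide whether to trigger the "alternative path" algorithm.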

III.2.4. Handling of Powerless (Dead) Nodes

The HTAP algorithm also deals with nodes whose power is exhausted. These nodes cause major problems to the network if they act as sources or relay nodes. Therefore, when a node is about to become power-exhausted, it should immediately be removed from the network and the tables of its neighbor nodes should be updated. This procedure should be as simple as possible, because it may occur while the network is in an emergency state. When the power of a node reaches the "power extinction" limit, it immediately announces this fact to the nodes around it. The nodes that receive this packet remove the node's ID from their neighbor lists. If the node is part of an active path (a path that is relaying packets to the sink), the nodes that were sending packets to it and received the "power extinction" message apply the "alternative path" algorithm and find another path to forward packets to the sink.

i. Calculation of Critical Links

C(j) = ∑ D_(s,d)(j)    (2)

where D_(s,d)(j) denotes the bandwidth demand of ingress-egress pair (s, d) carried on link j. From (2), the criticality of a link depends directly on the total demand per link. A higher criticality value means that many future requests are likely to pass through these ingress-egress routers, so links with higher criticality values are avoided to reduce network congestion. This also satisfies the first objective, minimizing interference levels among source-destination node pairs.

ii. Calculation of Link Weight

The weight of link j is determined by:

W(j) = C(j) / R(j)    (3)

where R(j) is the residual bandwidth of link j. From (3), the weight of a link is directly proportional to its criticality: the higher the criticality, the higher the weight of that particular link j. It is also inversely proportional to the residual bandwidth, so when the remaining bandwidth is small, the weight of the link is large. The proposed algorithm therefore avoids links with high weight, so as to balance loads through underutilized paths.

iii. Calculation of Path Weight

The weight of the path belonging to source-destination node pair {S, D} is obtained by:

W{S, D} = ∑ W(j), for all links j in the path {S, D}    (4)

This path weight is used to route the LSP from ingress node S to egress node D. The constraint is to avoid the path with the larger path weight. If there are several candidate paths with the same minimum path weight, the algorithm picks the shortest path among them in order to conserve network bandwidth.

iv. Proposed Algorithm
1) Compute C(j), the criticality of each link, according to formula (2).
2) Compute the weight of each link according to formula (3).
3) Use minimum interference routing to obtain the path with minimum path weight W{S, D}.
4) Select the best path as follows. Let (j, k) be the link between nodes j and k:
i) if bandwidth(j, k) < bandwidth constraint, delete the paths containing link (j, k);
ii) if delay(j, k) > delay constraint, delete the paths containing link (j, k).
5) Use Prim's algorithm to obtain the shortest path among the paths selected.
6) Establish the best path satisfying the request's bandwidth and delay constraints.
7) If no path is selected, the algorithm fails.

IV. Implementation Setup

For the simulation study, an extensive routing simulation program was developed with the ns-2 simulator. The topology used here, adopted from [16], [17], is the MIRA topology. Figure 2 represents the sample MPLS network with its nodes. Figure 3 depicts the MPLS label for each node. Data communication in the MPLS network is shown in Figures 4.
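Steps 2)–4) of the proposed algorithm above — weighting links by criticality over residual bandwidth as in (3), summing weights along a path as in (4), and pruning paths that violate the constraints — can be sketched with hypothetical link data (link names and numbers are illustrative only):

```python
def link_weight(criticality, residual_bw):
    """Formula (3): weight grows with criticality, shrinks with residual bandwidth."""
    return criticality / residual_bw

def path_weight(path, c, r):
    """Formula (4): sum of link weights along the path."""
    return sum(link_weight(c[link], r[link]) for link in path)

def feasible(path, bw, delay, bw_req, delay_req):
    """Step 4: a path survives only if every link meets the bandwidth
    constraint and the delay constraint."""
    return all(bw[link] >= bw_req and delay[link] <= delay_req for link in path)

c     = {"a": 4, "b": 1, "c": 2}        # criticality C(j) per link
r     = {"a": 2, "b": 10, "c": 5}       # residual bandwidth R(j) per link
bw    = {"a": 20, "b": 50, "c": 30}     # available bandwidth per link
delay = {"a": 1, "b": 3, "c": 2}        # delay per link
candidates = [["a"], ["b", "c"]]

ok = [p for p in candidates if feasible(p, bw, delay, bw_req=25, delay_req=6)]
best = min(ok, key=lambda p: path_weight(p, c, r))
print(ok, best)  # [['b', 'c']] ['b', 'c']
```

Here the one-hop path ["a"] is pruned in step 4 (its link offers only 20 units of bandwidth against a requirement of 25), so the lighter-weight two-hop path wins despite being longer.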


Fig. 2. Node Creation in MPLS

Fig. 3. Node Label in MPLS

Figs. 4(a). Data communication from 0 to 9 in MPLS


Figs. 4(b). Data communication from 0 to 9 in MPLS

V. Performance Evaluation

From Figure 6, it can be observed that the call blocking ratio increases consistently with the number of requests; even when the user bandwidth requirement is increased and the delay requirement is reduced, the proposed routing based on Prim's and the HTAP algorithm still gives better results. From Figure 7, it is observed that the maximum available flow decreases as the number of requests increases; hence the proposed routing based on Prim's and the HTAP algorithm is more efficient than the existing method. From Figure 8, it can be observed that the CPU time grows as the number of requests increases, but the proposed routing based on Prim's and the HTAP algorithm shows better results for the more complex topology.

From the simulation program, four parameters are measured to assess the performance of the algorithms: call blocking ratio, mean path length, maximum flow and CPU calculation time, obtained from (5) to (8). An MPLS routing algorithm should have a low call blocking ratio, a short mean path length, a high maximum flow and a low CPU calculation time:

Call Blocking Ratio = (number of blocked requests) / (total number of requests) (5)

Mean Path Length = (sum of the lengths of the established paths) / (number of established paths) (6)

Maximum Flow = (maximum available flow between the ingress-egress pair) − (bandwidth consumed by the established paths) (7)

CPU Time = (total computation time spent routing the requests) (8)
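Under definitions of this kind, the metrics could be computed from a simulation log as follows (an illustrative Python sketch; the record fields are assumptions, not the authors' code):

```python
def evaluate(requests):
    # requests: one record per connection request, with assumed fields
    # "blocked" (bool), "path_len" (hops, 0 if blocked) and "cpu_time" (s).
    total = len(requests)
    accepted = [r for r in requests if not r["blocked"]]
    call_blocking_ratio = (total - len(accepted)) / total
    mean_path_length = sum(r["path_len"] for r in accepted) / len(accepted)
    cpu_time = sum(r["cpu_time"] for r in requests)
    return call_blocking_ratio, mean_path_length, cpu_time

log = [{"blocked": False, "path_len": 3, "cpu_time": 0.01},
       {"blocked": True,  "path_len": 0, "cpu_time": 0.01},
       {"blocked": False, "path_len": 5, "cpu_time": 0.02}]
cbr, mpl, t = evaluate(log)
```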

V.1. Results

Simulations were performed on the network topologies and gave better results in terms of path length, call blocking ratio, maximum flow and CPU calculation time. From Figure 5, it can be observed that the path length increases with the number of requests, and more so when the user bandwidth requirement is increased and the delay requirement is reduced. Compared with the existing routing based on Dijkstra's algorithm, the proposed routing based on Prim's and the HTAP algorithm performs better with a larger number of nodes and a more complex topology.

VI. Conclusion

Finding a path in the network for every traffic flow that can guarantee several quality parameters, such as bandwidth and delay, is the task of the QoS routing algorithms developed for new IP networks based on label-forwarding techniques such as Multiprotocol Label Switching (MPLS). This paper presented a new QoS routing algorithm for MPLS networks using delay and bandwidth constraints. Path selection is done by Prim's algorithm, which is used to reduce the path length. Simulation experiments were conducted to examine the performance of the new algorithm on network topologies. The results show that the proposed algorithm performs better on complex networks in terms of path length, call blocking ratio, maximum flow and CPU time.


Fig. 5. The mean path length versus number of requests for the user bandwidth requirement (proposed routing based on Prim's and HTAP algorithm vs. routing based on Dijkstra's algorithm; 1000-3000 requests)

Fig. 6. Plots for call blocking ratio versus number of requests (proposed routing based on Prim's and HTAP algorithm vs. routing based on Dijkstra's algorithm)

Fig. 7. Maximum available flow versus number of requests (proposed routing based on Prim's and HTAP algorithm vs. routing based on Dijkstra's algorithm)


Fig. 8. Plots for CPU Time versus number of requests (proposed routing based on Prim's and HTAP algorithm vs. routing based on Dijkstra's algorithm)

References
[1] E. Rosen, A. Viswanathan, R. Callon, "Multiprotocol Label Switching Architecture," IETF RFC 3031, January 2001.
[2] S. Chen, K. Nahrstedt, "An overview of quality of service routing for the next generation high speed networks: problems and solutions," IEEE Network, 12(6) (1998), pp. 64-79.
[3] Bin Wang, Xu Su, C. L. P. Chen, "A new bandwidth guaranteed routing algorithm for MPLS traffic engineering," IEEE International Conference on Communications (ICC 2002), vol. 2, 2002, pp. 1001-1005.
[4] Antonio Capone, Luigi Fratta, Fabio Martignon, "Dynamic online QoS routing schemes: performance and bounds," Computer Networks, 2005.
[5] Koushik Kar, Murali Kodialam, T. V. Lakshman, "Minimum interference routing of bandwidth guaranteed tunnels with MPLS traffic engineering applications," IEEE Journal on Selected Areas in Communications, 18(12), December 2000.
[6] A. Alidadi, M. Mahdavi, M. R. Hashmi, "A new low-complexity QoS routing algorithm for MPLS traffic engineering," Proc. IEEE 9th Malaysia International Conference on Communications (MICC 2009), Kuala Lumpur, Malaysia, 15-17 December 2009, pp. 205-210.
[7] N. Li, J. Hou, L. Sha, "Design and analysis of an MST-based topology control algorithm," IEEE INFOCOM 2003, vol. 3, 2003, pp. 1702-1712, doi:10.1109/INFCOM.2003.1209193.
[8] R. C. Prim, "Shortest connection networks and some generalizations," Bell System Technical Journal, 36 (1957), pp. 1389-1401.
[9] J. Kang, Y. Zhang, B. Nath, "TARA: topology-aware resource adaptation to alleviate congestion in sensor networks," IEEE Transactions on Parallel and Distributed Systems, 18(7) (2007), pp. 919-931, doi:10.1109/TPDS.2007.1030.
[10] W.-w. Fang, J.-m. Chen, L. Shu, T.-s. Chu, D.-p. Qian, "Congestion avoidance, detection and alleviation in wireless sensor networks," Journal of Zhejiang University - Science C, 11 (2010), pp. 63-73, doi:10.1631/jzus.C0910204.
[11] A. Vasilakos, C. Ricudis, K. Anagnostakis, W. Pedrycz, A. Pitsillides, "Evolutionary fuzzy prediction for strategic QoS routing in broadband networks," IEEE, 1998.
[12] S. Balandin, A. P. Heiner, "SPF protocol and statistical tools for network simulations in NS-2," Proc. 24th International Conference on Information Technology Interfaces (ITI 2002), 2002.
[13] Eric Osborne, Ajay Simha, Traffic Engineering with MPLS, Cisco Press, July 2002, ISBN 1-58705-031-5.
[14] Baolin Sun, Layuan Li, Chao Gui, "Fuzzy QoS controllers based priority scheduler for mobile ad hoc networks," Proc. 2nd International Conference on Mobile Technology, Applications and Systems, 15-17 November 2005.
[15] Gurpreet S. Sandhu, Kuldip S. Rattan, "Design of a neuro-fuzzy controller," Department of Electrical Engineering, Wright State University.
[16] M. Kodialam, T. V. Lakshman, "Minimum interference routing with applications to MPLS traffic engineering," IEEE INFOCOM 2000, March 2000.
[17] A. Kotti, R. Hamza, K. Bouleimen, "Bandwidth constrained routing algorithm for MPLS traffic engineering," Proc. Third International Conference on Networking and Services (ICNS 2007), 19-25 June 2007.
[18] J. Oubaha, A. Habbani, M. Elkoutbi, "New approach multicriteria MPLS networks: design and implementation," (2011) International Review on Computers and Software (IRECOS), 6(2), pp. 237-243.
[19] Liu Chunxiao, Chang Guiran, Jia Jie, Sun Lina, Li Fengyun, "A hybrid routing algorithm for load balancing in wireless mesh networks," (2012) International Review on Computers and Software (IRECOS), 7(7), pp. 3513-3519.
[20] Farzaneh Azimiyan, Esmaeil Kheirkhah, Mehrdad Jalali, "Classification of routing protocols in wireless sensor networks," (2012) International Review on Computers and Software (IRECOS), 7(4), pp. 1614-1623.
[21] R. Aggarwal, H. Aggarwal, L. Kaur, "On bandwidth analysis of fault-tolerant multistage interconnection networks," (2008) International Review on Computers and Software (IRECOS), 3(2), pp. 199-202.

Authors’ information

1 Research Scholar, Bharathiar University, India.
E-mail: [email protected]

2 Director, Department of Computer Applications, Sona College of Technology, Salem, India.

S. Veni is presently working as an Assistant Professor in the Department of Computer Science, Karpagam University, Coimbatore, India. She has ten years of teaching experience and has presented ten papers at national conferences and two at international conferences. Her research interests include network architecture and network protocols.


Dr. G. M. Kadhar Nawaz is presently working as Director of the Department of Computer Applications, Sona College of Technology, Salem, India. He has presented and published papers in various national and international conferences and journals, and has also organized national conferences. He completed his Ph.D. in Computer Science at Periyar University; his research interests include Digital Image Processing and Steganography.


International Review on Computers and Software (I.RE.CO.S.), Vol. 8, N. 3 ISSN 1828-6003 March 2013

Performance Analysis of Probabilistic Routing Algorithms in Mobile Ad Hoc Networks Using a New Smart Algorithm Sahar Ahmadzadeh Ghahnaviehei1, Saadan Zokaei2, Abbas Vafaei3

Abstract – Flooding and Ad hoc On-demand Distance Vector (AODV) routing algorithms are common in Mobile Ad hoc Networks (MANETs). Flooding can dramatically affect the performance of a MANET, and a probabilistic approach to flooding has recently been proposed to solve the flooding storm problem, which leads to contention, collisions and duplicated messages. This paper proposes a new smart probabilistic method that improves the performance of the existing flooding protocol by increasing the throughput and decreasing the average end-to-end delay. Simulation results show that the combination of flooding with a suitable smart probabilistic optimization method can reduce the average end-to-end delay and increase the throughput. The method is also applied to the AODV algorithm; the results indicate that AODV is enhanced when nodes send messages probabilistically. Moreover, the proposed optimization algorithm improves AODV as well as flooding, although the enhancement occurs at different probabilities. Copyright © 2013 Praise Worthy Prize S.r.l. - All rights reserved. Keywords: MANET, Flooding, AODV, Smart Probabilistic Optimization

I. Introduction

Mobile ad hoc networks (MANETs) are collections of wireless mobile devices in which data transmission is done by intermediate devices that are independent of any base station [1]-[12]. In other words, MANETs are self-configuring networks of mobile hosts connected by wireless links. Owing to node movement, the network topology changes rapidly and sometimes unpredictably. To transfer data between source and destination there are several algorithms based on flooding, such as AODV, dynamic source routing (DSR) and simple flooding. Although the flooding scheme is intended to distribute and propagate messages to all nodes, it suffers from problems such as duplicate transmissions, collisions and contention, which lead to redundancy and packet loss; these are called broadcast storm problems. Since the broadcast storm problem degrades network performance, many efforts have been made to overcome it. Recently, a probabilistic approach to flooding has been proposed to solve the broadcast storm problem. In traditional probabilistic flooding algorithms, each node, after receiving a message, rebroadcasts it with probability p. This reduces redundant packets because nodes are allowed to drop messages with probability 1−p, so fewer redundant messages are propagated and performance is improved.

This paper presents new probabilistic flooding and AODV algorithms, called cuckoo optimized probabilistic flooding (COPF) and cuckoo optimized probabilistic AODV (COPAODV). The method enhances probabilistic flooding algorithms by proposing a probability density function (PDF) that depends on the number and speed of the nodes. The cuckoo optimization algorithm (COA) is used to optimize the PDF. The results show that implementing simple probabilistic flooding (SPF) and probabilistic AODV (PAODV) with the cuckoo optimized probability density function (COPDF) reduces the average end-to-end delay and increases the throughput compared with traditional SPF, simple flooding (SF), PAODV and AODV. The rest of this paper is organized as follows: Section 2 presents related works; Section 3 evaluates the throughput and average end-to-end delay of probabilistic flooding; Section 4 defines the cuckoo optimization algorithm (COA); Section 5 presents the new algorithm based on cuckoo optimized probabilistic flooding (COPF); Section 6 discusses comparisons among traditional simple flooding (SF), simple probabilistic flooding (SPF) and COPF; Section 7 analyzes AODV; and finally Section 8 concludes the paper and offers directions for future work.

Manuscript received and revised February 2013, accepted March 2013.


II. Related Works

First, probabilistic flooding in MANETs is explained, and then several efforts to reduce the effects of the broadcast storm problem are introduced. The broadcast storm problem is the result of collision and contention among redundant messages in flooding-based routing algorithms, and it can be avoided by reducing the number of nodes that forward the broadcast packets [2], [3]. The authors in [2] classify the proposed flooding algorithms into two categories, probabilistic and deterministic. In [4], the authors compare the performance of several flooding approaches, including probabilistic, counter-based, area-based and cluster-based schemes.

II.1. Probabilistic Flooding

Probabilistic flooding is one of the most efficient flooding techniques suggested in the literature [3]. In this approach, each intermediate node floods received packets with a predetermined forwarding probability, so an appropriate choice of the forwarding probability determines the effectiveness of the technique. The authors in [5] suggested the use of random graphs [6] and percolation theory [8] in MANETs. Some authors have claimed that there exists a probability value Pc < 1 such that, using Pc as the forwarding probability, almost all nodes receive a broadcast packet, while there is little improvement in throughput for any choice of p greater than Pc. It should be noted that, since the optimal value of Pc differs across MANET topologies and there is no mathematical method for estimating it, many probabilistic approaches use a predefined value of Pc. One important advantage of probabilistic flooding over other proposed methods [7], [8], [9] is its simplicity. However, investigations [3], [5] reveal that although probabilistic flooding algorithms can significantly reduce the degrading effects of the broadcast storm problem [3], they suffer from poor throughput, especially in sparse network topologies. The authors in [10] argued that the poor throughput exhibited by the probabilistic flooding algorithms in [3], [5] is due to assigning the same forwarding probability to every node in the network. Cartigny and Simplot [11] described a probabilistic scheme in which the forwarding probability p of each transmitting node is computed from the local density, i.e. the total number of its neighbors. The authors in [10] argued that the network topology can be partitioned into sparse and dense regions using local neighborhood information: each node located in a sparse region is assigned a high forwarding probability, whereas nodes located in dense regions are assigned a low forwarding probability.

III. Evaluation of Probabilistic Flooding

In this section simple probabilistic flooding (SPF) is evaluated with respect to throughput and average end-to-end delay. To this end, a network with the following characteristics is considered and the results of using SPF are evaluated on it. The test bed is a network with 250 nodes and a packet inter-arrival time of 0.05 second. The area is a flat grid of 1000 m × 1000 m, with the nodes located in a central square of 200 m × 200 m. Nodes can move and change their speed, but they cannot cross the network boundary. The following pseudocode is used for probabilistic flooding.

1) For (number of iterations = selected value)
2)   Set the random generator between (0, 1)
3)   If forwarding probability ≥ generated random value:
     upon receiving a message at a node for the first time, set the rebroadcast probability to p = p0, where p0 is the forwarding probability
4)   Else drop the message
   End if
5) For every forwarding probability between 0.3 and 1, with a step of 0.1, do the following:
   - evaluate the throughput, the ratio of delivered packets to all packets propagated in the network;
   - find the average end-to-end delay, i.e. the average delay of the packets received at the destination.
6) Repeat the process with different simulation random seeds and average the results for throughput and delay separately.

The simulation time is 20 seconds and constant bit rate (CBR) traffic is considered. The media access control (MAC) layer is based on the 802.11 standard and the transmission range is 25 m. The direction of the nodes' movement is random and each node's speed is chosen uniformly from [0, 1.5] m/s, which is close to human walking speed. The algorithm above is used to evaluate the throughput and average end-to-end delay of SPF. Figures 2 and 3 depict the results, where the simulation is repeated 50 times for each point of the curve. As shown in these figures, the best forwarding probability, providing maximum throughput and minimum delay with the SPF algorithm, lies between 0.5 and 0.7.
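The per-node rebroadcast decision in the pseudocode above can be sketched as follows (illustrative Python, not the ns-2 simulation code; names are invented):

```python
import random

def on_receive(node_id, msg_id, p_forward, seen, broadcast):
    # First reception: rebroadcast with probability p, drop with 1 - p.
    # Duplicate receptions are always dropped.
    if msg_id in seen:
        return False
    seen.add(msg_id)
    if random.random() < p_forward:
        broadcast(node_id, msg_id)
        return True
    return False

seen, sent = set(), []
on_receive("n1", "msg", 1.0, seen, lambda n, m: sent.append(m))  # first copy
on_receive("n1", "msg", 1.0, seen, lambda n, m: sent.append(m))  # duplicate
```

With p_forward = 1.0 this degenerates to simple flooding; smaller values trade delivery probability against redundancy, which is the trade-off the paper studies.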


IV. Cuckoo Optimization Algorithm

Cuckoo optimization algorithm (COA) is one of the most accurate and precise optimization methods. The algorithm is inspired by the life of the cuckoo bird and is suitable for both linear and nonlinear optimization problems [1]. Cuckoo populations in different societies can be categorized into two types: mature cuckoos and eggs. The effort to survive among cuckoos forms the basis of the algorithm: during the survival competition some cuckoos or their eggs die, while the surviving cuckoo societies immigrate to better environments, where they start reproducing and laying eggs. The cuckoos' survival effort is expected to converge to a single cuckoo society. The algorithm starts with populations of cuckoos that lay their eggs in the nests of host birds. Eggs that resemble the host bird's eggs are more likely to grow up and mature; eggs that are dissimilar are identified and killed by the host bird. The more eggs that survive in an area, the more profit is gained there, so the position in which more eggs survive is the position COA seeks to optimize. Because cuckoos want to maximize the survival rate of their eggs, they search for the most suitable area; after surviving and maturing, they establish societies, each with a habitat to live in. The best habitats are the areas in which more eggs survive, so cuckoos immigrate toward such regions and inhabit somewhere near them. Some parameters and concepts defined in COA must be considered. One is the egg-laying radius, the distance between the goal and where the cuckoos lay their eggs; the process of laying eggs in nests inside the goal continues until the best position with maximum profit value is obtained. In this algorithm the values of the variables are gathered in an array, called a habitat in COA. A habitat is an array of 1 × Nvar that represents the current position of a cuckoo, defined as follows:

Fig. 1. Throughput versus Probability in different speeds using SPF

Fig. 2. Average End-to-End Delay versus probability in different speeds using SPF

Simple flooding (SF), a specific form of SPF with a forwarding probability of 1, leads to a high number of redundant packets and collisions, as mentioned in previous sections. The probabilistic approach of the SPF algorithm is an effective way to control collision, contention and the broadcast storm problem; in particular, for probabilities in the range 0.5 to 0.7 the algorithm achieves its maximum efficiency for throughput and delay. The reason is that each node that receives a message for the first time forwards it with probability p and drops it with probability 1−p, and p exceeds 1−p in this range. Packet loss, however, increases for probabilities below 0.5, because the dropping probability 1−p exceeds the forwarding probability p. Hence probabilities between 0.5 and 0.7 are the best forwarding probabilities for achieving low average end-to-end delay and high throughput with SPF. Another reason is that in this range, especially around 0.6, the forwarding probability is not high enough to cause message redundancy, yet not low enough to cause heavy packet loss. Since the forwarding probability is an important factor in achieving higher throughput and lower delay, the cuckoo optimized probability density function (COPDF), based on the cuckoo optimization algorithm (COA) and dependent on the number and speed of the nodes, is proposed to obtain better results.

Habitat = [x₁, x₂, …, x_Nvar]  (1)

Each variable in the above array is a floating-point number. The profit of a habitat is obtained by evaluating the profit function f_p at that habitat; in this case, profit is defined as:

Profit = f_p(habitat) = f_p(x₁, x₂, …, x_Nvar)  (2)
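As a toy illustration of the habitat/profit machinery and the egg-laying step (not the authors' implementation; the profit function and all parameter values below are invented):

```python
import random

NVAR = 2                                      # habitat dimension, as in Eq. (1)
profit = lambda h: -(h[0] ** 2 + h[1] ** 2)   # toy profit function f_p from Eq. (2)

def lay_eggs(habitat, elr, n_eggs):
    # Each egg is placed uniformly within the egg-laying radius (ELR)
    # around the parent habitat.
    return [[x + random.uniform(-elr, elr) for x in habitat]
            for _ in range(n_eggs)]

best = [4.0, -3.0]                            # initial habitat
for _ in range(200):
    eggs = lay_eggs(best, elr=0.5, n_eggs=10)
    # Surviving eggs with higher profit become the next habitat,
    # so the population drifts toward the region of maximum profit.
    best = max(eggs + [best], key=profit)
```

Since the previous best habitat always competes with the new eggs, profit is monotonically non-decreasing across iterations.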

To use COA, a matrix of size Npop × Nvar is formed and a random number of eggs is considered for each habitat. Since in nature each cuckoo lays about 5 to 20 eggs, these values are used as the upper and lower limits when simulating the egg-laying process in different iterations. As cuckoos lay their eggs within a maximum distance from their habitat, this maximum distance is called the Egg Laying Radius (ELR); it grows with the share of the total eggs laid by the current cuckoo and with the width of the variables' domain.

IV.1. Results of Using COPDF

In this section the probability density function is considered to be a function of the number of nodes N and their speed V, with real coefficients a, b, c and d.

Point Insertion Algorithm (pseudocode):

Size = 2.0; thresh = 0.001; ds = 0.0; Len = 0.0; angle = 0.0
While (not Flag)
    IF (ds > Thresh)
        Splitnode(Ni)
    Else
        Addpoint(pt, Ni)
        Flag = true
    End IF
End While

Do
    Size = Size / 2
    Rpts = Referencepoint(pts, Size)
    ds = Eq_distance(pts, Rpts)
    Len = Vectorlength(pts)
    angle = angle + Vectorangle(pts, Rpts)
While (ds > thresh)

where:
Pt - point to be added to the quad tree (a structure representing the two-dimensional search space; it has the features vector-length, angle and plant id);
Ni - node id (a structure representing a two-dimensional converted record of the database; it has the features node id, children id and the list of points added to it);
Ds - distance between the node's coordinates and the point's coordinates;
Thresh - threshold value for adding a point to a node.

The quad-tree values are integrated with the parent table attributes and modeled into an XML structure, which is then used for retrieval.

The two-dimensional points are D2 = (Len, angle).

IV.1.3. Construction of the 2D Quad-Tree

Now that the records are in the form of 2D points, the quad-tree is constructed from them.

Quadtree Construction Algorithm
Input: length, angle, plant id. Output: quadtree.
1) Based on the range, the graph is divided into four quadrants: Top Left, Top Right, Bottom Left and Bottom Right.
2) Initially, the first set of input values is plotted in the corresponding quadrant.
3) The next input values are plotted by calculating the deviation with respect to the already plotted values:
   a. If the size of a quadrant is less than the threshold value, the repeated splitting of the quad tree is stopped and the x-side and y-side of the region are calculated.
   b. For each node id the corresponding x, y coordinates and side values are found, and the region value is calculated using the following expressions:
      x-side of the region = x-coordinate + side/2
      y-side of the region = y-coordinate + side/2
4) Step 3 is repeated until all the input values are plotted in the quadtree.
5) Now, with the root node as null, the points in the major quadrants as children of the root node and the points in the sub-quadrants as children of the major quadrants, the quadtree is constructed for all the input values.

IV.1.4. Construction of the XML Quadtree

The 2D quadtree is then passed to the function that creates the XML quadtree. This form allows easy retrieval of knowledge. Once the information is converted into an XML quadtree, the tree is stored in a file.

XML Quadtree Construction Algorithm (quad tree to XML conversion):
1. The mandatory fields (top right, top left, bottom left, bottom right) of the XML quad tree are specified.
2. The root node of the XML quad tree is assigned to 0.
3. The corresponding node id of the quad tree is assigned to the root id of the XML quad tree.
4. The points in the least divided region are used to map the plant details from the plant database.
5. Finally, the XML quad tree is created from the traversed regions of the quad tree together with the plant details in the least divided region.

A. Meenakshi, V. Mohan
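The threshold-controlled splitting in the construction algorithm can be sketched as a simplified region quadtree in Python; the MIN_SIDE threshold and the one-point-per-leaf policy are illustrative choices, not the paper's parameters:

```python
# Simplified region quadtree: leaves hold points; a leaf splits into four
# equal sub-quadrants when a second point arrives, until the quadrant
# side falls below a threshold (then points accumulate in the leaf).

class Quad:
    MIN_SIDE = 0.25  # illustrative splitting threshold

    def __init__(self, x, y, side):
        self.x, self.y, self.side = x, y, side
        self.points, self.children = [], None

    def child_for(self, px, py):
        # Pick the sub-quadrant whose origin is x (+ side/2) , y (+ side/2).
        h = self.side / 2
        qx = self.x + (h if px >= self.x + h else 0)
        qy = self.y + (h if py >= self.y + h else 0)
        return next(c for c in self.children if (c.x, c.y) == (qx, qy))

    def insert(self, px, py):
        if self.children is not None:
            self.child_for(px, py).insert(px, py)
        elif self.points and self.side > Quad.MIN_SIDE:
            h = self.side / 2
            self.children = [Quad(self.x, self.y, h), Quad(self.x + h, self.y, h),
                             Quad(self.x, self.y + h, h), Quad(self.x + h, self.y + h, h)]
            for q in self.points:           # push existing points down
                self.child_for(*q).insert(*q)
            self.points = []
            self.child_for(px, py).insert(px, py)
        else:
            self.points.append((px, py))

root = Quad(0, 0, 2.0)
for pt in [(0.1, 0.1), (1.5, 1.5), (0.2, 0.15)]:
    root.insert(*pt)
```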

The proposed system is made efficient by using a quadtree data structure, which gives accurate storage; the quadtree is then converted into an XML quadtree for retrieval.

Quad tree to XML conversion (pseudocode):

For n = 1 to quadtree.Count
    node = Quadtree(n)
    Ele = XmlCreateElement(node.name)
    Ele.setAttribute('X', node.X, 'Y', node.Y, 'SIDE', node.size, 'ISPARENT', node.haspoints)
    IF (node.haspoints)
        Points pts[] = node.getpoints
        For k = 1 to pts.Count
            Pe = XmlCreateElement(plant);      Pe.addString(pts[k].plantname);   Ele.add(Pe)
            Pe2 = XmlCreateElement(genealogy); Pe2.addString(pts[k].genealogy);  Ele.add(Pe2)
            Pe3 = XmlCreateElement(taxonomy);  Pe3.addString(pts[k].taxonomy);   Ele.add(Pe3)
        End For
    End IF
    xmlroot.add(Ele)
End For

IV.2. Knowledge Retrieval

The efficient retrieval of knowledge is accomplished through the XML architecture. The knowledge base constructed with the quadtree data structure cannot be used directly for knowledge retrieval because it is stored as coordinate values, which is the prime motivation for building an XML quadtree. Hence XML tags are used, and the XML tree is constructed using the (x, y) coordinate values, together with the side of each quadrant present in the quadtree, as parameters. The user can also add a new plant by supplying the appropriate soil characteristics, depth and plant id; the user is notified if the plant id is already present. The plant id is requested from the user because plant ids are used in building the XML quadtree. When a query with input parameters is posted, the appropriate plants are fetched from the XML quadtree, providing decision-supporting information that helps edaphologists and agricultural experts identify the right crops/plants for the given soil characteristics.

Knowledge retrieval from XML:

Function SearchXml(pt[], ni)
    Element Ec[4]; nxy[4, 2]
    Element E = Xml.GetElementById(ni)
    While (E.GetAttribute(ISPARENT) = 1)
        Ec[1] = E.NextChildElement(TopLeft)
        Ec[2] = E.NextChildElement(TopRight)
        Ec[3] = E.NextChildElement(BotLeft)
        Ec[4] = E.NextChildElement(BotRight)
        nxy[1, :] = Ec[1].GetAttributes(X, Y)
        nxy[2, :] = Ec[2].GetAttributes(X, Y)
        nxy[3, :] = Ec[3].GetAttributes(X, Y)
        nxy[4, :] = Ec[4].GetAttributes(X, Y)
        id = FindNearestNode(pt, nxy)
        ni = Ec[id].GetAttribute(NID)
        E = Xml.GetElementById(ni)
    End While
    Return ni
End Function

XML search:

Function AddPlant(pt[], plant)
    ni = 0; flag = false; nxy[1, 2]
    While (!flag)
        ni = SearchXml(pt, ni)
        Element E = Xml.GetElementById(ni)
        nxy[1, :] = E.GetAttributes(X, Y)
        ds = distance(pt, nxy)
        IF (ds > thresh)
            SplitXmlNode(E)
        Else
            AddPlantToXmlNode(E, plant)
            flag = true
        End IF
    End While
End Function
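A rough Python analogue of the quadtree-to-XML conversion, using the standard xml.etree.ElementTree module; the dictionary layout of the input tree is an assumption for illustration, not the paper's data model:

```python
import xml.etree.ElementTree as ET

def quad_to_xml(node, tag="node"):
    # Serialize a quadtree node (x, y, side, points, children) into XML,
    # mirroring the X / Y / SIDE / ISPARENT attributes described above.
    e = ET.Element(tag, X=str(node["x"]), Y=str(node["y"]),
                   SIDE=str(node["side"]),
                   ISPARENT="1" if node.get("children") else "0")
    for plant in node.get("points", []):
        p = ET.SubElement(e, "plant")
        p.text = plant
    for child in node.get("children", []):
        e.append(quad_to_xml(child))
    return e

tree = {"x": 0, "y": 0, "side": 2,
        "children": [{"x": 0, "y": 0, "side": 1, "points": ["Palmyrah"]}]}
xml_str = ET.tostring(quad_to_xml(tree), encoding="unicode")
```

The resulting XML can be persisted to a file and traversed child-by-child, which is what the SearchXml routine above relies on.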


V. Results and Discussion

This section presents the experimental results of the proposed quad-tree-based knowledge retrieval algorithm. The proposed algorithm is implemented on the .NET\SQL platform on a system with 4 GB RAM. The input database consists of two tables: a plant list table and a soil characteristics table, linked by the foreign key plant identification number. There are 148 plant ids in the database; the plant table has four attributes and the soil characteristics table has 15 attributes. The plant table attributes are plant identification number, name, geology and taxonomy. The soil characteristics table attributes are plant identification number, depth, description, clay, silt, sand, hydrogen ion concentration (pH), electrical conductivity, calcium, magnesium, sodium, potassium, phosphorus pentoxide and potassium oxide.

V.1. Evaluation Metrics

The proposed quad-tree-based knowledge retrieval algorithm is evaluated with the following metrics to demonstrate its effectiveness.
1) Effectiveness measure: the performance of the proposed knowledge retrieval algorithm is evaluated on the input dataset using precision, recall and F-measure, defined as follows:

Precision (P) = |{Relevant plants} ∩ {Retrieved plants}| / |{Retrieved plants}|

Recall (R) = |{Relevant plants} ∩ {Retrieved plants}| / |{Relevant plants}|

F-measure = 2·P·R / (P + R)

2) Efficiency measure: computation time refers to the time elapsed between the input query and the output list. The input query is a set of soil characteristics and the output is the plant list.

V.2. Experimental Sample Results

The sample results of the proposed quad-tree-based knowledge retrieval algorithm are presented in this section. A screenshot of the system is given in Fig. 5, and a sample query with its corresponding output is given in Table II. For experimentation, a query set is generated by combining six queries to evaluate the performance of the proposed technique. The sample query set is given in Table III, and its corresponding plant output with execution times is given in Table IV.

Fig. 5. Screenshot of the proposed system

TABLE II
SAMPLE QUERY
Input Query: Depth: 0-18; Color: red; Sand: sandy clay loam; Strength: moderate medium subangular blocky; Moist: slightly sticky; Pores: common pores; Clay: 24.60; Silt: 15.2; Sand: 60.2; PH: 7.36; EC: 0.04; CA: 8.0; Mg: 1.5; Na: 0.68; K: 0.32; P2O5: 29.45; K2O: 228.00
Output: Name: Palmyrah; Geology: Granite; Taxonomy: Fine, mixed, isohyperthermic, noncalcareous, Typic Rhodustalfs

In the following, we discuss the detailed analysis of Algorithm 1, where the soil characteristics are given as the user query and the plant list that fits the query is the output. For the analysis, we test with six different queries and evaluate the algorithm using the performance metrics; the six queries used for testing are given in Table III. In the analysis, we use the number of plants retrieved, computation time and memory usage as metric parameters. Tables V and VI show the values obtained for the different metrics for each query for the proposed method and the baseline method. Figs. 6, 7 and 8 chart the number of plants retrieved, computation time and memory usage for the various queries for the two methods.

V.3. Performance Analysis

The performance evaluation of the proposed knowledge retrieval algorithm is presented in Tables V and VI. For four different query sets, precision, recall and F-measure are computed and the corresponding values are reported in Table V. From the table, we can see that the proposed system reaches a precision of 88% with an F-measure of 93% for query set 2. Similarly, the total computation time of the proposed knowledge retrieval algorithm for the different query sets is given in Table VI.
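As a hedged illustration (not code from the paper), the three metrics defined in Section V.1 can be computed directly from the sets of retrieved and relevant plants. The plant ids below are invented; only the counts (17 retrieved, 15 relevant, recall of 1) come from query set 2 of Table V.

```python
def evaluate(retrieved, relevant):
    """Precision, recall and F-measure as defined in Section V.1,
    computed from sets of retrieved and relevant plant ids."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved)
    recall = hits / len(relevant)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Query set 2 of Table V: 17 plants retrieved, 15 relevant, and every
# relevant plant retrieved (recall = 1). Ids are illustrative only.
retrieved = list(range(17))
relevant = list(range(15))
p, r, f = evaluate(retrieved, relevant)
print(f"P={p:.2f} R={r:.2f} F={f:.4f}")
```

The resulting precision (0.88) and F-measure (about 0.93-0.94) match the Table V entries up to rounding.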



TABLE III
SAMPLE QUERY SET
Query 1: Depth = 26-52; Color = light grey to grey; Sand = sandy clay; Strength = medium moderate subangular blocky; Moist = sticky; Pores = pores; Clay = 36.31; Silt = 13.51; Sand = 50.18; PH = 6.28; EC = 9.40; CA = 9.40; Mg = 1.80; Na = 0.50; K = 0.12; P2O5 = 11.26; K2O = 108.00
Query 2: Depth = 50-91; Color = brownish yellow; Sand = clay; Strength = medium moderate subangular blocky; Moist = sticky; Pores = few pores; Clay = 38.89; Silt = 20.25; Sand = 50.18; PH = 8.65; EC = 0.09; CA = 15.34; Mg = 7.82; Na = 3.01; K = 0.12; P2O5 = 14.00; K2O = 302.00
Query 3: Depth = 13-33; Color = reddish brown; Sand = sandy clay; Strength = medium weak subangular blocky; Moist = sticky; Pores = few pores; Clay = 47.00; Silt = 22.00; Sand = 8.00; PH = 8.00; EC = 1.50; CA = 14.00; Mg = 10.00; Na = 1.10; K = 0.60; P2O5 = 7.00; K2O = 208.00
Query 4: Depth = 41-51; Color = very pale brown; Sand = sandy; Strength = medium subangular blocky; Moist = ; Pores = ; Clay = 3.40; Silt = 4.00; Sand = 92.60; PH = 8.00; EC = 0.06; CA = 3.57; Mg = 0.51; Na = 0.25; K = 0.10; P2O5 = 4.00; K2O = 49.00
Query 5: Depth = 23-37; Color = dark red; Sand = sandy clay; Strength = moderate medium subangular blocky; Moist = slightly sticky; Pores = few fine pores; Clay = 40.00; Silt = 24.00; Sand = 36.00; PH = 7.17; EC = 0.16; CA = 10.00; Mg = 4.50; Na = 1.28; K = 0.93; P2O5 = 25.65; K2O = 195.00
Query 6: Depth = ; Color = dark red; Sand = sandy clay; Strength = moderate medium subangular blocky; Moist = slightly sticky; Pores = few fine pores; Clay = 40.00; Silt = 24.00; Sand = 36.00; PH = 7.17; EC = 0.16; CA = 10.00; Mg = 4.50; Na = 1.28; K = 0.93; P2O5 = 25.65; K2O = 195.00

TABLE IV
SAMPLE QUERY SET AND ITS CORRESPONDING OUTPUT
(performance metrics: plants retrieved and computation time in milliseconds)
Query 1: Prosophis juliflora, Ipomea, Neem; computation time = 703 ms
Query 2: Manjanathi; computation time = 196 ms
Query 3: Neem, Palmyrah, Prosophis juliflora; computation time = 64 ms
Query 4: Palmyrah, Prosophis juliflora; computation time = 97 ms
Query 5: Neem, Prosophis juliflora; computation time = 56 ms
Query 6: Neem, Prosophis juliflora; computation time = 223 ms

TABLE V
EVALUATION METRICS OF DIFFERENT QUERY SETS
                      Query set 1   Query set 2   Query set 3   Query set 4
Retrieved plants      15            17            12            15
Relevant plants       11            15            10            13
Precision             0.73          0.88          0.83          0.86
Recall                1             1             1             1
F-measure             0.84          0.93          0.90          0.92

TABLE VI
COMPUTATION TIME OF DIFFERENT QUERY SETS
Total computation time (ms)
                      Query set 1   Query set 2   Query set 3   Query set 4
Proposed method       133           274           25            41
Existing method [17]  1072          1026          1030          1045



VI. Conclusion

We have proposed an efficient knowledge management system based on a quad tree and XML for handling information in the form of knowledge collected from edaphologists. The work has two phases, namely knowledge storage and knowledge retrieval. The knowledge is mapped into a quad tree, from which the XML quadtree is constructed; thus the knowledge is stored. During knowledge retrieval, the query given by the user is converted into coordinates and compared with the XML quadtree, and the matching value is displayed. The experimental results show that the knowledge engineering approach achieved persistent and compact data storage as well as faster knowledge retrieval. The performance of the knowledge retrieval is evaluated with the evaluation metrics. In terms of computation time, the proposed method obtained an average of 0.1185 seconds, which is better than the 1.043 seconds achieved by the previous method [17].

References
[1] Shyh-Kwei Chen, "An Exact Closed-Form Formula for D-Dimensional Quadtree Decomposition of Arbitrary Hyperrectangles", IEEE Transactions on Knowledge and Data Engineering, Vol. 18, No. 6, June 2006.
[2] Omer Egecioglu, Hakan Ferhatosmanoglu, and Umit Ogras, "Dimensionality Reduction and Similarity Computation by Inner-Product Approximations", IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 6, June 2004.
[3] Brian Kulis, Prateek Jain and Kristen Grauman, "Fast Similarity Search for Learned Metrics", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 31, No. 12, December 2009.
[4] Kyuseok Shim, Ramakrishnan Srikant, and Rakesh Agarwal, "High-Dimensional Similarity Joins", IEEE Transactions on Knowledge and Data Engineering, Vol. 14, No. 1, January/February 2002.
[5] D. S. Yeung and X. Z. Wang, "Improving Performance of Similarity-Based Clustering by Feature Weight Learning", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 4, April 2002.
[6] You Jung Kim and Jignesh M. Patel, "Performance Comparison of the R*-Tree and the Quadtree for kNN and Distance Join Queries", IEEE Transactions on Knowledge and Data Engineering, Vol. 22, No. 7, July 2010.
[7] Xiaofeng Zhu, Shichao Zhang, Zhi Jin, Zili Zhang and Zhuoming Xu, "Missing Value Estimation for Mixed-Attribute Data Sets", IEEE Transactions on Knowledge and Data Engineering, Vol. 23, No. 1, January 2011.
[8] Zhiwei Lin, Hui Wang and Sally McClean, "A Multi-Dimensional Sequence Approach to Measuring Tree Similarity", IEEE Transactions on Knowledge and Data Engineering.
[9] B. B. Chaudhuri, "Application of Quadtree, Octree, and Binary Tree Decomposition Techniques to Shape Analysis and Pattern Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-7, No. 6, November 1985.
[10] Christos Faloutsos, H. V. Jagadish and Yannis Manolopoulos, "Analysis of the N-Dimensional Quadtree Decomposition for Arbitrary Hyperrectangles", IEEE Transactions on Knowledge and Data Engineering, Vol. 9, No. 3, May/June 1997.
[11] Chengjun Liu, "The Bayes Decision Rule Induced Similarity Measures", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, No. 6, June 2007.
[12] Hanan Samet, "A Top-Down Quadtree Traversal Algorithm", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-7, No. 1, January 1985.
[13] Sameer A. Nene and Shree K. Nayar, "A Simple Algorithm for Nearest Neighbor Search in High Dimensions", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 9, September 1997.
[14] Jo-Mei Chang and King-Sun Fu, "Extended K-d Tree Database Organization: A Dynamic Multiattribute Clustering Method", IEEE Transactions on Software Engineering, Vol. SE-7, No. 3, May 1981.
[15] David Eppstein, Michael T. Goodrich and Jonathan Z. Sun, "The Skip Quadtree: A Simple Dynamic Data Structure for Multidimensional Data".
[16] Jon Louis Bentley, "Multidimensional Binary Search Trees Used for Associative Searching", Communications of the ACM, Vol. 18, No. 9, September 1975.
[17] A. Meenakshi and V. Mohan, "An Efficient Tree-Based System for Knowledge Management in Edaphology", European Journal of Scientific Research, Vol. 42, No. 2, pp. 253-267, 2010.
[18] Rizwana Irfan and Maqbool-uddin-Shaikh, "Enhance Knowledge Management Process for Group Decision Making", Proceedings of World Academy of Science, Engineering and Technology, 2009.
[19] Lin Cui, Caiyin Wang, "A Resource Retrieval Scheme based on Ontology Reasoning under Semantic P2P System", (2012) International Review on Computers and Software (IRECOS), 7 (4), pp. 1850-1854.
[20] Feifei Tao, Huimin Wang, Jinle Kang, Lei Qiu, "Study on Knowledge Base System of Extreme Flood Magnitude Based on Data Driven Reasoning", (2012) International Review on Computers and Software (IRECOS), 7 (5), pp. 2226-2230.

Authors’ information

Meenakshi A. received the B.Sc. (Physics) degree in May 1995 and the MCA degree in June 1998 from Madurai Kamaraj University, and the M.E. (Computer Science & Engineering) degree in June 2005 from Anna University, Chennai. She has registered for a PhD (Information and Communication Engineering) under Anna University, Tiruchirappalli, and has been doing research in the field of knowledge engineering for the past two years. She has presented more than 10 papers at national and international conferences. Since 1998 she has worked in computer science education as a Lecturer, Assistant Professor, Associate Professor and Professor in various engineering colleges, with more than 11 years of teaching experience. She is now working as a Professor in the Department of CSE at K.L.N. College of Information Technology, Sivagangai District, Tamil Nadu, India.

Dr. V. Mohan received his doctoral degree in Applied Mathematics from Madurai Kamaraj University, Tamil Nadu, India. He has more than 30 years of experience in research and teaching. He is now working as Professor and Head of the Department, and Dean of Planning and Administration, at Thiagarajar College of Engineering, Madurai. His research interests include graph theory, artificial intelligence and finite state automata. His papers have been published in various national and international journals and conferences.


International Review on Computers and Software (I.RE.CO.S.), Vol. 8, N. 3 ISSN 1828-6003 March 2013

"EAAS3" Distributed Control Architecture of MAS Robotic Systems M. El Bakkali1, S. Benaissa2, S. Tallal3, A. Sayouti4, H. Medromi5

Abstract – Remote monitoring is currently in wide use in several areas, usually in response to work-related problems in areas that are either inaccessible or hazardous for humans; the potential remote-control applications are many [1]. In order to find a reliable solution to the problems associated with remote control, several architectures have appeared [2]. The main idea behind these architectures is to propose a unified model of execution and planning and to constantly interleave these two phases. In parallel with these architectural solutions, the emergence of component-based software frameworks on the one hand, and powerful dynamic languages on the other, has fostered coordination and robustness. We present in this article the third version of the "EAAS3" architecture, which has been in development for several years. To make our approach tangible and concrete, we present two actual 2CRM/T cases [3], [4] so as to reflect the effectiveness of this control architecture. The architecture is built on a set of concepts drawn from the following systems: multi-agent, robotic, and real-time embedded. Copyright © 2013 Praise Worthy Prize S.r.l. - All rights reserved.

Keywords: Multi-Agent Systems, Robotic Systems, Embedded Systems Real-Time, Dynamic Languages, EAAS3, 2CRM/T

Nomenclature

EAAS: EAS Architecture for Autonomous System
EAS: Equipe Architecture System
2CR: Recognition, Connection, Control
2CRM: Recognition, Connection, Control, Measure
2CRT: Recognition, Connection, Control, Testing
2CRM/T: Recognition, Connection, Control, Measure/Testing

I. Introduction

The first control architectures for autonomous robots revolved around a decomposition of the system into three sequentially called sub-parts, namely perception, planning and execution (the SPA paradigm). The perception subsystem constructs a model of the environment from the collected data; planning then computes, based on this model, a sequence of actions to achieve the goal, which is executed by the execution subsystem (Figure 1). Execution being a simple problem, the majority of the work focuses on the planning and modeling phases [5]. Our aim is also to advance multi-agent architectures and facilitate, at the same time, the task of modular integration or modification without any coding, relying instead on a flexible and progressive knowledge base. Before proposing and developing our new architecture, we studied different systems that informed our approach.

Fig. 1. SPA Architecture

II. Description of the EAAS2 Architecture

The EAAS1 (EAS Architecture for Autonomous System) was originally designed for autonomous mobile systems [6]. The goal of this architecture is to provide a solution to design and create remote control applications, either via the Internet or by other means. The EAAS2 version was developed following an innovation based on theoretical research and application. The architecture shown in Figure 2 [7] has two control modes: the first is the autonomous control mode, in which the mobile robot is totally free; the second is the remotely operated mode. In this mode, the communication agent allows, through various communication protocols (Internet, Radio, Bluetooth ...), the control of the system. In situations where a rapid response is requested (case of inter-locking), the recovery agent can switch to low-level control, which directly sends the actions to the effectors.
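As a rough, hypothetical sketch (not the authors' code), the two EAAS2 control modes and the recovery agent's low-level override can be modeled as a simple mode switch; all class and method names below are invented for illustration.

```python
# Illustrative sketch of the EAAS2 control modes described above:
# autonomous planning, teleoperated command relay, and the recovery
# agent's direct low-level override. Names are hypothetical.
class Robot:
    def __init__(self):
        self.mode = "autonomous"    # or "teleoperated"
        self.log = []

    def act(self, command=None, emergency=False):
        if emergency:
            # Recovery agent: bypass deliberation, drive effectors directly.
            self.log.append("low-level: stop effectors")
        elif self.mode == "teleoperated" and command:
            # Communication agent relays the remote operator's command.
            self.log.append(f"relay: {command}")
        else:
            # Autonomous mode: the robot decides its own next action.
            self.log.append("plan: next autonomous action")

robot = Robot()
robot.act()                          # autonomous step
robot.mode = "teleoperated"
robot.act(command="turn left")       # operator command over Internet/Bluetooth
robot.act(emergency=True)            # inter-locking: recovery agent takes over
print(robot.log)
```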

Manuscript received and revised February 2013, accepted March 2013

Fig. 2. EAAS2 Architecture

III. Proposed EAAS3 Architecture

III.1. The Agents and 2CR

During the course of communication, many causes of failure are possible: a fault related to the functional layer, a temporary inability to respond to stress, or a missing agent.

The use of 2CR behaviors closely and internally pairs low-level perceptions and actions. According to Figure 3, the communication between the agents of the terminals proceeds in three phases:
 Recognition: this phase allows awareness of the existence of another agent;
 Connection: this step establishes the alliance and association between the agents of the terminals;
 Control: this phase issues the commands and dispatches orders between the agents of the terminals.
This mechanism applies to all communications between the agents and ensures their confidentiality.

Fig. 3. The agents and 2CR

The agents used for this purpose are:
Connection Agent
 It allows the recognition of another agent.
 It allows connecting with another agent according to a knowledge base.
Security Agent
 It secures the transmission through encryption.
 It intervenes in the event of a risk.
 It ensures integrity, i.e. data must be as expected and must not be altered accidentally or deliberately.
 It ensures confidentiality, i.e. only authorized agents have access to the information intended for them; any unwanted access must be prevented.
 It ensures availability, i.e. the system should operate without flaw during the time of use, guaranteeing access to services and installed resources within the expected response time.
 It ensures non-repudiation and imputation, i.e. no agent should be able to contest the operations it has performed within the framework of its authorized actions, and no third party should be able to attribute to itself the actions of another agent.
 It provides authentication, i.e. the identification of agents is paramount to manage access to the relevant workspaces and maintain confidence in the exchange relations.

III.2. The Agents and 2CRM/T

Figure 4 shows the collaboration between two agents of the terminals in the case of measure control and testing. The first phase to achieve is the 2CR. This communication allows the exchange of messages through "question-answer", which permits the reading of the target parameters. Also, during a test, a control is started to check the current status of the agent of terminal B by comparing it with the normal state existing in the knowledge base. In case of nonconformity, an alarm is triggered to announce the failure. The reception of a message is as simple as sending: the agent uses methods to retrieve an unread message.

Measure Agent
 It allows the reading of the parameters that define an agent.
Test Agent
 It allows verification after a measure, i.e. a comparison of the measured parameter with its normal state.
Alarm Agent
 When a parameter is non-conformant, the Alarm Agent intervenes to report the problem.
The general structure of the measure/test control between two agents is indicated in Fig. 4.

III.3. Links Between EAAS2 and 2CRM/T

We made a restricted change to the EAAS2 architecture to be integrated into the agent of terminal B. This aims to establish control of a mobile terminal "Robot". The resulting architecture is hybrid: it has a high-level deliberative layer which uses the Actions Selection Agent and a low-level reactive layer based on the Perception/Action agent pair.
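The 2CR handshake and the measure-test-alarm cycle of 2CRM/T can be illustrated with a small, purely hypothetical sketch; the knowledge base of normal states, the parameter names and the function names are all invented for illustration, not taken from the authors' system.

```python
# Hypothetical sketch of the 2CRM/T cycle: terminal A measures a parameter
# of terminal B, the Test agent compares it with the normal state in the
# knowledge base, and the Alarm agent reports non-conformity.
NORMAL_STATE = {"battery_v": (11.0, 13.0)}   # knowledge base: allowed ranges

def two_cr(agent_a, agent_b, known_agents):
    """Recognition, Connection, Control: refuse agents that are not
    present in the knowledge base of known agents."""
    if agent_b not in known_agents:          # Recognition fails
        return False
    return True                              # Connection and Control may proceed

def measure_test(parameter, value):
    """Measure agent reads `value`; Test agent compares it with the
    normal state; a non-conform value triggers an alarm message."""
    low, high = NORMAL_STATE[parameter]
    if not (low <= value <= high):
        return f"ALARM: {parameter}={value} outside [{low}, {high}]"
    return "conform"

assert two_cr("A", "B", known_agents={"B"})
print(measure_test("battery_v", 12.1))   # → conform
print(measure_test("battery_v", 9.5))    # → ALARM: battery_v=9.5 outside [11.0, 13.0]
```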



Fig. 4. The agents and 2CRM/T

Figure 5 shows the various agents of the control architecture and the sequence of actions that leads to the desired goal set by the remote user. When receiving the goal of the mission, the reading of the architecture starts from the bottom up.

Fig. 5. EAAS2 and 2CRM/T

This part of the EAAS3 control architecture has the following properties:
Mi and R Sensors
 They provide the measures or images, depending on what we use for the Perception Agent (odometers, ultrasonic sensors, infrared sensors, light sensors or cameras ...).
 They can be proprioceptive sensors (which measure internal parameters of the system).
 They can be exteroceptive sensors (which measure external parameters of the system).
Hardware Connection Agent
 It links the hardware and software layers to make our control architecture modular and scalable with respect to the hardware used (mobile system, calculator(s), transmission between calculators and mobile system, specific interface cards with sensors) [8].
 It transforms the physical values of the environment (mm, mm/s, degrees, degrees/s) in such a way as to make them understandable by the mobile system. This allows the Action Agents to be relatively independent of the mobile system used.
Perception Agent
 It allows, after processing, to determine the location of the mobile system and create representations of the environment.
 It provides the Actions Selection Agent with the location of the mobile system and the representations of the environment to enable the latter to make its decisions.
Localization
 It allows awareness of the environment and all its proximities by using a target model.
Actions Selection Agent
 It represents the deliberative part of our architecture.
 It chooses the sequence of actions that will lead to the desired goal set by the remote user, given the location of the mobile system, the current action, and the representations of the environment and their validity.
 It is used for the maintenance of the representations (periodic or exceptional processing), for the monitoring of the environment and processing (forecast/correction), and for the effective utilization of computing resources.
Planning
 Rather than directly giving the mobile system a path to follow, it can be more interesting to give it a purpose and let the control architecture autonomously define the optimal path to follow.
 Its aim may be expressed in different forms: an end point to reach, a set of places to visit in a predefined order by the mobile system, etc.; the choice of the path is closely linked to the environment.
Navigation
 The piloting level shows that it is possible to give the mobile system decision-making autonomy or responsiveness which takes into account unforeseen events. To give the system even more autonomy, we built a higher hierarchical level called navigation, whose input is a path to follow.
 It delivers to the piloting level a trajectory which is determined based on the desired path but also on



kinematic or dynamic capabilities of the mobile system and environmental constraints.
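The planning, navigation and piloting levels form a pipeline in which each level refines the output of the one above. The following hypothetical sketch (function names and data are illustrative, not the authors' implementation) shows that layering:

```python
# Illustrative sketch of the multilevel decomposition: planning produces a
# goal-level path, navigation turns it into a trajectory, and piloting
# emits low-level commands. All names are hypothetical.
def plan(goal):
    """Planning: the purpose expressed as places to visit, in order."""
    return ["start", "corridor", goal]

def navigate(path):
    """Navigation: derive a trajectory (here, waypoint pairs) from the
    path; kinematic constraints and the environment would be taken into
    account at this level."""
    return list(zip(path, path[1:]))

def pilot(trajectory):
    """Piloting: turn each trajectory segment into an effector command;
    dynamic obstacle avoidance would hook in here."""
    return [f"move {a} -> {b}" for a, b in trajectory]

commands = pilot(navigate(plan("dock")))
print(commands)   # → ['move start -> corridor', 'move corridor -> dock']
```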

Piloting
 This level receives as input the description of a path to follow, already defined by the navigation level. The path is chosen taking into account the kinematic or dynamic constraints of the mobile system, i.e. its real capacity for action. This path/trajectory is described by a suite of coordinates in the operational space.
 It decides to perform dynamic obstacle avoidance; this is the case when it finds that the desired trajectory is impossible to perform because of the position of certain elements in the environment of the mobile system.
 It should also be able to make a choice in the behavior to adopt, in particular if an unexpected event appears (safeties management).
Action Agent
 It establishes control rules for the control of the mobile system effectors.
 It carries out reflex actions from a representation given by a Perception Agent designated by the Actions Selection Agent.
 It combines the low-level controllers associated with the mobile system effectors.
 It allows the calculation of the control commands of the effectors.
Hardware Connection Agent
 It bonds the hardware and software layers of the architecture to make it modular and scalable for the hardware used.
Effectors
 These are the components that perform actions (examples of effectors: engine, pneumatic cylinder, electro-magnet).
Communication Agent
 It allows, through various communication protocols (Internet, Radio, Bluetooth ...), the control of the system. In situations where a rapid response is requested (case of inter-locking), the recovery agent can switch to low-level control, which directly sends the actions to the effectors.
Recovery Agent
 It allows switching to low-level command mode in order to directly send actions to the effectors.
Interface Agent
 This agent can be considered as an interface between the man and the mobile system in order to achieve the desired goal.
Real World Agent
 This agent can be considered as an interface between the real world and the mobile system.
To establish communication between the agent of terminal A and the mobile agent of terminal B "Robot", we integrate a Manager Agent.
Manager Agent
 This agent has the sociability and flexibility to manage connections and commands initiated by the agent of terminal A via the HMI (Human Machine Interface).
The EAAS3 general architecture becomes as indicated in Fig. 6.

Fig. 6. EAAS3 Architecture

Figure 6 shows the various agents of the control architecture we propose. The sensors (proprioceptive and exteroceptive) in this figure provide measurements or pictures (depending on what we use: odometers, ultrasonic sensors, infrared sensors, light sensors, or cameras ...) to the Perception Agent, which allows, after processing, to determine the location of the mobile system and create representations of the environment. The multilevel breakdown of the Actions Selection Agent allows to clearly separate the different commands to perform and, in particular, to distinguish the planning and navigation functions. In fact, these functions often have the disadvantage of being nested within other architectures and having borders that strongly depend on the actual application.

IV. Conclusions and Perspectives

In this paper, we explain the EAAS3 architecture and its basic aspects. We introduce mechanisms for multi-agent based remote control systems. In addition, the proposed solution is inherently robust, i.e. a fault of an agent is generally confined to that single agent and does not jeopardize the entire system. Furthermore, the concept of collaboration is built in to solve the problems of failure of an agent by passing the relay to another. Of course, the EAAS3 architecture aims to push forward multi-agent architectures and facilitate, at the same time, the task of modular integration or modification without any coding but rather with a flexible and progressive knowledge base. It is worth mentioning that this work is related to already



implemented projects by the operator Maroc Télécom (2CRM [3]) and the Eurecom laboratory (2CRT [4]). One of the current challenges is to continue to validate the EAAS3 architecture on more complex issues in the field of autonomy of movement considered in this thesis, and to confront control problems over different domains (e.g. service robotics and interactions with users). This may lead to the development of the architecture and its implementation to take into account new needs, or needs currently difficult to express with EAAS3. Beyond these developments, we see six major problems to explore as an extension of this work:
 A better exploitation of the definition of the functional layer,
 A learning capacity in the control architecture,
 The extension of the modeling and control methods,
 The integration of the advanced security mode,
 A good collaboration between all the robots of the system [9],
 And finally the extension of the principles of EAAS3 to multi-collaborative robot systems.

Authors’ information

Mohammed El Bakkali received his Master’s Degree in Computer Multimedia Telecommunications from Mohammed V Agdal University, Rabat, Morocco, in 2008, and a degree in Electronics from Moulay Ismail University, Meknes, Morocco, in 2006. He has been preparing his PhD since 2008 within the system architecture team of the ENSEM, Casablanca, Morocco. His main research interests concern the design and construction of a real-time measurement robot platform, based on multi-agent systems, for physical parameters of telecommunication lines and terminals.

Said Benaissa received his advanced postgraduate degree in Telecommunication Networks in 2008 from the National Institute of Science and Technology, Marrakech. In 2009 he joined the system architecture team of the ENSEM School, Casablanca, Morocco. His current main research interests concern embedded systems based on multi-agent systems.

Saadia Tallal received her PhD from the University of Poitiers, France, in 1990, and is currently Research Professor of Computer Science at ENSEM, Hassan II University, Casablanca, Morocco. Her current research focuses on multi-agent systems within the Systems Architecture Team of the LISER Laboratory.

References
[1] Samir Otmane, "Télétravail Robotisé et Réalité Augmentée: Application à la Téléopération via Internet", PhD thesis, CEMIF Laboratory of Complex Systems, engineering sciences, University of Evry-Val d'Essonne, 13 December 2000.
[2] Samir Otmane, "Modèles et techniques logicielles pour l'assistance à l'interaction et à la collaboration en réalité mixte", accreditation to lead research (HDR), Laboratory of Computer Science, Integrative Biology and Complex Systems, computer science, University of Evry-Val d'Essonne, 8 December 2010.
[3] M. El Bakkali, H. Medromi, "A Multi-Agent System based on a Real-Time Distributed '2CRM' Robot Platform for the Measurement of Telecommunications Lines and Terminals", (2010) International Review on Computers and Software (IRECOS), 5 (6), pp. 701-705.
[4] M. El Bakkali, H. Medromi, "Real-Time '2CRT' Architecture and Platform for the Testing of Telecommunications Terminals according to the Manhattan Mobility", (2011) International Review on Computers and Software (IRECOS), 6 (6), pp. 950-955.
[5] Arnaud Degroote, "Une architecture de contrôle distribuée pour l'autonomie des robots", PhD thesis, University of Toulouse, 2012.
[6] Adil Sayouti, "Conception et Réalisation d'une Architecture de Contrôle à Distance Via Internet à Base des Systèmes Multi-Agents", PhD thesis, computer science, ENSEM, University Hassan II, July 2009.
[7] Fouad Moutaouakkil, "Conception et Réalisation d'une plateforme de Contrôle autonome distribuée: Application en Robotique Mobile", doctoral thesis, computer science, ENSEM, University Hassan II, October 2010.
[8] Adil Sayouti and Hicham Medromi, "Les Systèmes Multi-Agents: Application au Contrôle sur Internet", European University Editions, August 2012.
[9] Said Benaissa, Mohammed El Bakkali, Hicham Medromi, "Agent Protocols for Autonomous Mobile Robots", (2012) International Journal on Communications Antenna and Propagation (IRECAP), 2 (3), pp. 200-204.

Adil Sayouti received his PhD in computer science from the ENSEM, Hassan II University, Casablanca, Morocco, in July 2009. In the same year he received the prize of excellence for the best thesis defended in 2009. In 2003 he obtained the Microsoft Certified Systems Engineer (MCSE) certification. In 2005 he joined the system architecture team of the ENSEM, Casablanca, Morocco. His main research interests concern remote control over the Internet based on multi-agent systems.

Hicham Medromi received his PhD in engineering science from the Sophia Antipolis University, Nice, France, in 1996. He is responsible for the system architecture team of the ENSEM, Hassan II University, Casablanca, Morocco. His current main research interests concern control architectures of mobile systems based on multi-agent systems. Since 2003 he has been a full professor of automatics, productics and computer science at the ENSEM School, Hassan II University, Casablanca.



A Contourlet Based Block Matching Process for Effective Audio Denoising B. Jai Shankar1, K. Duraiswamy2

Abstract – Audio signals are frequently corrupted by background environment noise and by buzzing or whining noise from audio equipment. Audio denoising is a technique intended to suppress the noise while retaining the underlying signals. Many denoising techniques have been introduced for the elimination of noise from digital audio signals; however, their effectiveness remains a problem. In this paper, an audio denoising technique based on the contourlet transformation is proposed; the contourlet transform is efficient in finding and removing noise. Denoising is carried out in the transformation domain, and the enhancement in denoising is attained by grouping closer blocks and creating multidimensional arrays. In this technique, the finest details are supplied by the grouped set of blocks, while the important unique features of every separate block are preserved. Every block is filtered and restored to the original position from which it was separated. The grouped blocks overlap each other, so for every element several different estimates are obtained, which are combined to remove noise from the input signal. The experimental results show that the proposed Contourlet and Daubechies transformation is more efficient when compared with the other techniques. Copyright © 2013 Praise Worthy Prize S.r.l. - All rights reserved. Keywords: Denoising, Contourlet Transform, Audio Signal, Daubechies, Block Matching Process

Diagonal time-frequency audio denoising algorithms attenuate the noise by processing each window Fourier or wavelet coefficient independently, with empirical Wiener [2], power subtraction [3], [4], [5], or thresholding operators [6]. These algorithms generate isolated time-frequency structures that are perceived as a “musical noise” [7], [8]. The studies in [9], [10] demonstrate that this musical noise is strongly attenuated with non-diagonal time-frequency estimators that regularize the estimation by recursively aggregating time-frequency coefficients. This approach has been further enhanced by optimizing the SNR estimation with parameterized filters that rely on stochastic audio models. On the other hand, these parameters are supposed to be tuned to the nature of the audio signal, which frequently varies and is unknown; in practice, they are empirically fixed. Decomposition, modification of the detail coefficients and restoration are the three main procedures of wavelet denoising. The modification is the core of wavelet denoising, and it mainly depends on the ability to choose a suitable threshold value and on how the thresholding is executed. In recent times, much research has been done to locate the most accurate way to compute the threshold value. A SURE threshold is chosen by means of Stein’s Unbiased Estimate of Risk (SURE) [11]. This rule provides an approximation of the risk for a particular threshold; minimizing the risk over the threshold gives a threshold

Nomenclature

S′	Signal in the biorthogonal wavelet domain
T	Transformation matrix
V	Vector signal
b_ref	Reference block
L	Number of grouped blocks
d	L2 distance between the vector elements of the blocks
B	Temporary block matrix
B′	Temporary vector in the Daubechies domain
A	Aggregated value
th	Threshold value
σ	Amount of noise level
K	Kaiser window function

I. Introduction

Multiscale techniques have become popular in many signal processing, medical and image processing applications in recent years. Many signal denoising techniques are based on shrinking time-frequency signal representations. The complex wavelet transform has a high resolution in time and frequency, which depends on the wavelet scales [1]. The short-time Fourier representation is suitable for analyzing stationary parts of signals, while the highly localized wavelet atoms in the high-frequency bands allow capturing transient features.

Manuscript received and revised February 2013, accepted March 2013


selection. A hybrid threshold is a mixture of the universal and SURE rules and seeks to overcome the limitations of SURE [12]. In [13], a speech enhancement method is based on time and scale adaptation of wavelet thresholds. The time dependence was established by approximating the Teager energy of the wavelet coefficients, while the scale dependence was defined by extending the principle of level-dependent thresholding to wavelet packet thresholding. Performance was evaluated on speech recorded in real conditions (plane, sawmill, tank, subway, babble, car, exhibition hall, restaurant, street, airport and train station) and on synthetically added noise. MEL-scale decomposition based on wavelet packets was compared to the regular wavelet packet scale. Signal-to-noise ratio (SNR) comparisons were taken into account for time adaptation and time-scale adaptation of the threshold wavelet coefficients; visual inspection of spectrograms and listening experiments were also used to support the results. In [14], a method based on critical-band decomposition converts a noisy signal into wavelet coefficients (WCs) and enhances the WCs by subtracting a threshold from the noisy WCs in each subband. The threshold of each subband is adapted according to the segmental SNR (SegSNR) and the noise masking threshold, so residual noise can be efficiently suppressed in a speech-dominated frame. In a noise-dominated frame, the background noise can be largely removed by adjusting the wavelet coefficient threshold (WCT) according to the SegSNR, and speech distortion can be reduced by decreasing the WCT in speech-dominated subbands. The method can efficiently enhance noisy speech contaminated by colored noise, and its performance was better than that of other wavelet-based speech enhancement methods in their experiments. Further research on the reduction of noise in cassette recordings has been carried out in [15].

Dolby noise reduction is based on reducing the perception of tape hiss, which comes from the magnetic particles on a tape. The idea is that music is encoded just prior to recording: the level of soft, high-frequency passages is raised to make them louder than the tape’s noise, while loud passages are left unchanged. The degree to which a particular set of wavelet coefficients forms a useful representation of a signal is a function of how well the mother wavelet matches the underlying signal characteristics, in addition to the times and scales selected. For application to signal enhancement, frequently referred to in the literature as wavelet denoising, the coefficient magnitudes are reduced after comparison to a threshold. With a good choice of representation, this thresholding removes noise while maintaining the signal properties. To address the fact that many types of signals have considerable non-stationarity and may not be well represented by a single fixed set of parameters, it is possible to make the wavelet transform adaptive, such that the characteristics of the transform change over time as a function of the underlying signal characteristics. There are several possible approaches to adaptive wavelet enhancement, including adaptation of the wavelet basis, adaptation of the wavelet packet pattern, direct adaptation of the time and scale variables, or adaptation of the thresholds or thresholding algorithms used. The most common approach is to use a time-varying threshold or gain function based on an a priori energy or SNR measure [16]-[20].

In the denoising of audio signals, the denoised signal obtained after performing wavelet transformation techniques is not totally free from noise: some residue of noise may be left, or some other kind of noise may be introduced by the transformation, disturbing the output signal. Several techniques have been introduced so far for the removal of noise from an audio signal, yet their efficiency remains an issue and several drawbacks are common. In this research, an audio signal denoising technique is proposed based on an improved block matching technique in the transformation domain. The improvement of the block matching is attained by grouping similar fragments of the audio signal into a set of multidimensional arrays. Due to the similarity between the grouped blocks, the transformation can achieve a clear representation of the input signal, so that the noise can be removed well during the reconstruction of the signal. A biorthogonal wavelet transform, which is invertible but not necessarily orthogonal, is used for the transformation process. A multidimensional signal vector is generated from the transformed signal vector, and the original vector signal is reconstructed by applying the inverse transform to the generated multidimensional signal vector. The denoised signal thus reconstructed from the vector signal has a well-attenuated noise level, and its signal-to-noise ratio (SNR) is considerably higher than that of the noisy input signal, increasing the quality of the signal to a remarkable level.
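The thresholding operation at the heart of these wavelet denoising schemes can be sketched as follows. This is a minimal illustration in Python on hypothetical coefficient values, not the authors' implementation; both the hard rule and the Donoho-Johnstone soft (shrinkage) rule are shown:

```python
import numpy as np

def hard_threshold(coeffs: np.ndarray, th: float) -> np.ndarray:
    """Hard rule: zero out coefficients whose magnitude is below th."""
    out = coeffs.copy()
    out[np.abs(out) < th] = 0.0
    return out

def soft_threshold(coeffs: np.ndarray, th: float) -> np.ndarray:
    """Soft (shrinkage) rule: pull every coefficient toward zero by th."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - th, 0.0)

c = np.array([0.05, -0.8, 0.3, -0.02, 1.2])   # toy transform coefficients
print(hard_threshold(c, 0.1))
print(soft_threshold(c, 0.1))
```

The hard rule keeps surviving coefficients unchanged, while the soft rule also shrinks them, trading a small bias for fewer artifacts.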

II. Contourlet Based Block Matching Process for Denoising

This section describes the proposed audio denoising technique for the effective removal of unwanted noise from an audio signal. Here, it is assumed that the audio signal is polluted by Additive White Gaussian Noise (AWGN), and the polluted signal is subjected to noise removal using the proposed denoising technique. The processes performed in the proposed technique are as follows: (i) transformation of the noisy signal to the contourlet domain [21], [22], [23], (ii) generation of a set of closer blocks, (iii) generation of a multidimensional array, (iv) calculation of the weight and aggregate value of the non-zero elements, and finally (v) reconstruction of the denoised audio signal. In the proposed work, the noisy audio signal is first subjected to the Contourlet Transformation. Contourlet


transformation produces a few significant coefficients for signals with discontinuities. Audio signals are smooth with a small number of discontinuities; hence, the contourlet transformation represents these signals more compactly than other transformations. Once the signal is transformed to the wavelet domain, the set of closer blocks is synthesized from it. The process of synthesizing the set of closer blocks is represented in Figure 1 and detailed in the following section.



S′ = T × S × Tᵀ (1)

where S′ is the signal in the biorthogonal wavelet domain, S is the reshaped noisy signal matrix, and T is the transformation matrix of the biorthogonal wavelets. Once the noisy signal has been transformed to the wavelet domain, it is represented as a vector signal, denoted by V, which eases the subsequent operations.

The vector signal is then converted into a set of blocks by grouping. Grouping can be defined as collecting successive fragments of the given signal into a single data structure. Each block has a size b that can be altered as per the required number of blocks. The representation of the blocks is given in Eq. (2) below.

Fig. 3. Block Representation of the noisy input signal

Fig. 1. Generation of a Set of Closer Blocks (input noisy signal, contourlet transformation, synthesis of closer blocks, grouping by block matching, thresholding)

II.1. Synthesizing Closer Blocks from the Noisy Vector Signal

Let the input noisy signal be S, corresponding to a vector of length n, as shown in Figure 2. The Bior 1.5 wavelet transformation is applied to the input noisy signal, and a transformed signal is obtained as output. A biorthogonal contourlet transformation is invertible but not necessarily orthogonal; designing biorthogonal wavelets allows more freedom of choice than orthogonal ones, one additional degree of freedom being the possibility to generate symmetric basis functions. For the transformation, the vector noisy audio signal is reshaped into a matrix whose size matches that of the transformation coefficient matrix:

V = {b₀, b₁, … , b_{L−1}} (2)

where b₀, b₁, … , b_{L−1} are the grouped blocks and L is their number. A variable step size is maintained to control the number of blocks generated from the noisy vector signal; the number of blocks is a function of the size of the input signal, the block size and the step size. The process then continues with the calculation of the L2 norm distance for each block generated from the vector signal. The transformed initial block is kept as the reference block, denoted b_ref, and the distance between the reference block and all the other grouped blocks is calculated. The process is repeated in the same fashion with every block considered, in turn, as a reference block. The L2 norm distance between the vector elements of a reference block and its associated blocks is given in Eq. (3):

d(b_ref, b_k) = ‖b_ref − b_k‖₂ = ( Σᵢ (b_ref(i) − b_k(i))² )^{1/2} (3)

Fig. 2. A Discrete Audio Signal Representation

Instead of considering all the generated L2 distance values, a threshold value is set to retain only a sufficient number of L2 distances and make the process much faster. The blocks corresponding to the respective

As given in Eq. (1), the noisy audio signal is transformed to the biorthogonal wavelet transformation domain.


L2 distances are grouped into a set of closer blocks: the blocks with L2 distances less than the threshold value are collected and considered the closer blocks. The same operation is repeated for the next reference block and its corresponding blocks. Thus, a set of closer blocks with respect to every reference block is determined on the basis of the L2 norm distances. Only these generated sets of blocks are considered in the further processing, instead of computing over the whole set of blocks. The following section explains the generation of a multidimensional vector signal for the removal of noise and the reconstruction of the original signal, illustrated in Figure 3.
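The grouping of closer blocks by thresholded L2 distances can be sketched as follows. This is a simplified Python illustration with an assumed block size, step size and distance threshold; `make_blocks` and `closer_blocks` are hypothetical helper names, not taken from the paper's MATLAB code:

```python
import numpy as np

def make_blocks(v: np.ndarray, size_b: int, step: int) -> np.ndarray:
    """Slice the vector signal into (possibly overlapping) blocks of length size_b."""
    starts = range(0, len(v) - size_b + 1, step)
    return np.stack([v[s:s + size_b] for s in starts])

def closer_blocks(blocks: np.ndarray, ref_idx: int, dist_th: float) -> np.ndarray:
    """Indices of blocks whose L2 distance to the reference block is below dist_th."""
    d = np.linalg.norm(blocks - blocks[ref_idx], axis=1)  # Eq. (3) against every block
    return np.flatnonzero(d < dist_th)

v = np.sin(np.linspace(0.0, 8.0 * np.pi, 64))  # toy "vector signal"
blocks = make_blocks(v, size_b=8, step=4)
group = closer_blocks(blocks, ref_idx=0, dist_th=1.0)
print(blocks.shape, group)
```

The reference block is always its own closest match (distance zero), and periodic signals yield several further matches below the threshold.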

II.2. Generation of a Multi Dimensional Vector

A set of closer blocks is synthesized by the operations explained in the previous section. A multidimensional vector signal is generated for the whole set of blocks, considering each reference block in turn. The closer blocks relevant to the first reference block are processed to produce the multidimensional vector as given in Figure 4. The first element of each closer block is detached and stacked independently in a temporary block of size 1 × L, where L is the number of blocks considered in the process. The separation of the first elements of the closer blocks is represented in Figure 4(b).

Figs. 4(a)-(c). Generation of a Multi Dimensional Vector
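The 2-point mean and difference operation underlying the Daubechies-style transform applied to the temporary blocks can be sketched as a single-level Haar/db1 step on a toy block. This is an illustrative sketch, not the authors' code:

```python
import numpy as np

def mean_diff_step(x: np.ndarray):
    """One level of the 2-point mean/difference (db1/Haar-style) operation."""
    pairs = x.reshape(-1, 2)
    mean = pairs.mean(axis=1)
    diff = (pairs[:, 0] - pairs[:, 1]) / 2.0
    return mean, diff

def inverse_mean_diff_step(mean: np.ndarray, diff: np.ndarray) -> np.ndarray:
    """Invert the mean/difference step exactly (the step is lossless)."""
    out = np.empty(mean.size * 2)
    out[0::2] = mean + diff
    out[1::2] = mean - diff
    return out

x = np.array([4.0, 2.0, 5.0, 5.0])       # toy temporary block
m, d = mean_diff_step(x)
print(m, d)
print(inverse_mean_diff_step(m, d))      # recovers the original block
```

Thresholding the difference coefficients and inverting the step is what suppresses small, noise-dominated details.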

In Figure 4(a), r represents the number of elements in a block and L represents the number of closer blocks of the relevant reference block. A temporary block consisting of all the first elements of the first reference block and its corresponding set of closer blocks is obtained, as represented in Figure 4(c). The obtained temporary vector signal is then transformed using Daubechies's transformation, with reshaping carried out to enable the transformation. Daubechies's transformation deals with a 2-point mean and difference operation. The transformation is carried out on the temporary vector signal, and the transformed vector signal is given by Eq. (4):

B′ = D × B × Dᵀ (4)

where D is the transformation matrix of Daubechies's transformation, B is the temporary block matrix and B′ is the temporary block in the Daubechies domain. Once the transformed matrix is obtained, it is converted into vector form as discussed earlier. Then, every element in the transformed vector block B′ is compared with a threshold value th: if the element's magnitude is less than th, the element is replaced with '0' and placed back in the temporary block; if it is greater than th, the value is left unchanged. Once the temporary block is processed, the values are transferred back to the corresponding locations in the respective blocks from which they were detached. The process is repeated for all the elements in the reference block and its grouped set of closer blocks, and then for all the blocks, with each block considered in turn as a reference block together with its set of closer blocks. This leads to a multidimensional data array for the whole transformed vector signal, which simplifies the denoising of the input audio signal. To perform the reconstruction, the transformed vector is subjected to the inverse Daubechies as well as the inverse contourlet transformation to obtain the time-domain multidimensional array. The aggregated value (the aggregation of the non-zero elements present in the


multidimensional array) and the weight of the array are calculated in the process explained in detail in the following section. The reconstruction of the audio signal is detailed in Figure 5.

Fig. 5. Reconstruction of the denoised audio signal (the set of closer blocks yields the multidimensional vector, which passes through the inverse Daubechies and inverse contourlet transformations to give the denoised output signal)

II.4. Reconstruction of the Denoised Audio Signal

The process of reconstructing the audio signal from the multidimensional vector is explained in this section. The generated multidimensional array elements are subjected to the inverse Daubechies transformation to create a regenerated signal that is free from noise. The signal is then subjected to the inverse bior1.5 transformation to obtain the time-domain vector signal. Reshaping is performed before the inverse transformation, and the inverse-transformed matrix is rebuilt into the denoised vector signal. For the purpose of reconstruction, a null vector signal y₀, of the same size as the input noisy signal, is generated and grouped into a set of blocks as was done for the input signal. Let y′ and y″ be the intermediate variables used for storing the reconstructed values. For calculating y′, firstly, the product of the reference block of the multidimensional vector signal and the Kaiser window function is determined; secondly, the obtained product is summed with the reference block of the null vector, and the resultant is divided by the aggregated count value. The calculation of y″ is similar, but the resultant is divided by the weight. The formulas for calculating y′ and y″ are given in Eqs. (7) and (8).

II.3. Calculation of Weight of the Generated Multidimensional Array

The transformed vector consists of both the replaced and the non-replaced elements. For the aggregated value calculation, all the non-zero elements present in the vector signal are counted. This is done for each particular set of blocks, and the counts are summed to create a final count value for the first reference block and its respective set of blocks. The process is repeated for all the sets of blocks and, by summing their count values, an aggregated value for the whole collection of reference blocks and their respective sets of closer blocks is obtained. The aggregated value A is expressed as:

A = Σᵢ φ(B′(i)), with φ(x) = 1 if |x| > 0 and φ(x) = 0 otherwise (5)

where the index i runs over the elements of the blocks. The aggregated value characterizes the total count of non-zero elements present in the entire set of blocks. The weight of the data array is calculated from the following equation:

w = 1 / (σ² × A) (6)

where A is the aggregated value of a reference block and σ is the amount of noise level in the signal. Once the weight of the data array has been calculated for the multidimensional vector signal, the reconstruction is performed using:

y′ = (y₀ + K × b_ref) / A (7)

y″ = (y₀ + K × b_ref) / w (8)

where b_ref is the reference block of the multidimensional vector signal, y₀ is the corresponding block of the null vector and K represents the Kaiser window function, a single-parameter window function widely used in DSP applications. The window function is given as:

K(m) = I₀( β (1 − (2m/(M − 1) − 1)²)^{1/2} ) / I₀(β) for 0 ≤ m ≤ M − 1, and K(m) = 0 otherwise (9)

where I₀ is the zeroth-order modified Bessel function of the first kind, β denotes the arbitrary real number that determines the shape of the window, M is an integer, and the length of the sequence is N = M + 1. The original signal is reconstructed from the generated y′ and y″ values, and the final audio speech signal is obtained as given in Eq. (10).
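Under this scheme, the Kaiser-window weighted aggregation of the filtered blocks might look like the following Python sketch. It is a stand-in for the MATLAB implementation; the weighting 1/(σ²·A), the normalization by accumulated window weights and all helper names are assumptions based on the text:

```python
import numpy as np

def reconstruct(blocks, starts, n, sigma, beta=5.0):
    """Overlap-add filtered blocks with a Kaiser window and per-block weights.

    blocks: filtered blocks to be returned to positions `starts` in a signal
    of length n; sigma is the noise level. Each block's weight is taken as
    1/(sigma**2 * A), where A counts its non-zero (retained) coefficients.
    """
    num = np.zeros(n)            # null vector accumulated with windowed, weighted blocks
    den = np.zeros(n)            # accumulated window weights (normalization)
    size_b = blocks.shape[1]
    k = np.kaiser(size_b, beta)  # Kaiser window, single shape parameter beta
    for b, s in zip(blocks, starts):
        a = max(np.count_nonzero(b), 1)   # aggregated value A
        w = 1.0 / (sigma ** 2 * a)        # weight, as in Eq. (6)
        num[s:s + size_b] += w * k * b
        den[s:s + size_b] += w * k
    return num / np.maximum(den, 1e-12)

blocks = np.array([[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]])
y = reconstruct(blocks, starts=[0, 2], n=6, sigma=0.05)
print(y)
```

Because the blocks overlap, each output sample is a weighted combination of several independent estimates, which is what suppresses residual noise.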


ŷ = y′ / y″ (10)

The obtained signal is then passed through a Wiener filter, and the residues of noise present in the signal are attenuated. Eventually, the audio speech signal obtained as the output of the filter is effectively denoised by the proposed block matching noise removal technique.
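The final Wiener filtering pass can be illustrated with a generic local-statistics Wiener smoother. This is a textbook formulation with an assumed window length and noise variance; the paper does not specify its filter parameters:

```python
import numpy as np

def wiener_1d(x: np.ndarray, m: int, noise_var: float) -> np.ndarray:
    """Local-statistics Wiener filter over a sliding window of length m."""
    kernel = np.ones(m) / m
    mean = np.convolve(x, kernel, mode="same")          # local mean estimate
    var = np.convolve(x ** 2, kernel, mode="same") - mean ** 2
    var = np.maximum(var, noise_var)                     # guard the division
    gain = (var - noise_var) / var                       # attenuate weak-signal regions
    return mean + gain * (x - mean)

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0.0, 4.0 * np.pi, 500))
noisy = clean + 0.3 * rng.standard_normal(500)
out = wiener_1d(noisy, m=9, noise_var=0.09)
print(np.mean((noisy - clean) ** 2) > np.mean((out - clean) ** 2))
```

Where the local variance is close to the noise variance, the gain approaches zero and the filter falls back to the local mean.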

III. Implementation Results

The proposed block matching technique was implemented in MATLAB (version 7.8). Its performance was tested using different audio signals, and the observations showed a remarkable improvement in the SNR values of the denoised signals. In the experiments, L = 16 closer blocks were selected for every reference block, and the hard threshold value th used to build the multidimensional array from the temporary blocks was selected as −0.2. For testing, a female laughing audio signal was considered as input and contaminated with AWGN. The input noisy signal is represented in Figure 6(a) for its full length n = 12000 and in Figure 6(b) for a length n = 100. The SNR of the noisy audio signal was 5.13 dB at a noise level of 0.047. A linear combination of the generated noise and the original signal was used as the primary input to the block matching technique. Figure 6(c) shows the denoised audio signal obtained as the final output.

Fig. 6(b). Noisy Audio signal

Fig. 6(c). Denoised audio signal

Table I shows the different levels of noise added to the audio signal and the respective SNR values of the noisy and denoised signals.

TABLE I
SNR LEVEL FOR NOISY INPUT AND DENOISED OUTPUT SIGNAL (FEMALE LAUGHING SOUND)
Noise level σ   SNR of noisy signal (dB)   SNR of denoised signal (dB)
0.021           10.35                      18.68
0.041           6.26                       16.35
0.066           3.09                       12.41
0.081           1.30                       6.90

Table II shows different types of audio signals and the respective SNR values of the noisy and denoised signals. Various sound signals were tested with the proposed block matching technique, and the variation in their SNR levels was examined with the noise level set to σ = 0.041. From the observations it is clear that the proposed technique attenuated the noise in the input noisy signal to a remarkable level.
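The SNR values reported in the tables can be computed in the usual way. This is a small sketch; the test signal here is synthetic, not the paper's audio data:

```python
import numpy as np

def snr_db(clean: np.ndarray, noisy: np.ndarray) -> float:
    """SNR in dB: signal power over the power of the residual (noisy - clean)."""
    noise = noisy - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

rng = np.random.default_rng(1)
s = np.sin(np.linspace(0.0, 2.0 * np.pi, 8000))
n = s + 0.1 * rng.standard_normal(8000)
print(snr_db(s, n))
```

For a unit sine (power 0.5) with noise of variance 0.01, the value lands near 17 dB, comparable in scale to the table entries.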

Fig. 6(a). Input Audio signal


TABLE II
SNR LEVELS OF NOISY AND DENOISED SIGNALS
Sl. No   Input signal type       Noise level σ   SNR of noisy signal (dB)   SNR of denoised signal (dB)
1        Dog barking sound       0.041           8.45                       13.68
2        Female laughing sound   0.041           7.28                       12.86
3        Laser gun sound         0.041           12.36                      14.52
4        Male laughing sound     0.041           11.12                      15.65
5        Doorbell sound          0.041           12.59                      16.76

IV. Conclusion

The proposed technique is based on the block matching denoising approach, and its efficient implementation has been presented in full detail. The implementation results show that the block matching process achieves state-of-the-art denoising performance, in terms of both peak signal-to-noise ratio and subjective improvement in the audible quality of the audio signal. Grouping similar blocks into a multidimensional array also improved the efficiency of the technique. The blocks were filtered and replaced in the original positions from which they were detached. The grouped blocks overlap each other, so for every element several different estimates are obtained, which are combined to remove noise from the input signal. The reduction in the noise level indicates that the technique preserves the vital unique features of each individual block even when the finest details are contributed by grouped blocks. In addition, the technique can be adapted to various other audio signals, as well as to other problems that can benefit from highly linear signal representations.

References

[1] L. Breiman, J. Friedman, R. Olshen, and C. J. Stone, Classification and Regression Trees, Belmont, CA: Wadsworth, 1983.
[2] R. J. McAulay and M. L. Malpass, "Speech enhancement using soft decision noise suppression filter," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-28, no. 2, pp. 137-145, Apr. 1980.
[3] M. Berouti, R. Schwartz, and J. Makhoul, "Enhancement of speech corrupted by acoustic noise," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), 1979, vol. 4, pp. 208-211.
[4] S. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 2, pp. 113-120, Apr. 1979.
[5] J. S. Lim and A. V. Oppenheim, "Enhancement and bandwidth compression of noisy speech," Proc. IEEE, vol. 67, Dec. 1979.
[6] D. Donoho and I. Johnstone, "Ideal spatial adaptation via wavelet shrinkage," Biometrika, vol. 81, pp. 425-455, 1994.
[7] O. Cappé, "Elimination of the musical noise phenomenon with the Ephraim and Malah noise suppressor," IEEE Trans. Speech, Audio Process., vol. 2, pp. 345-349, Apr. 1994.
[8] G. Yu, E. Bacry, and S. Mallat, "Audio signal denoising with complex wavelets and adaptive block attenuation," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), Apr. 2007, vol. 3, pp. III-869-III-872.
[9] Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean square error short-time spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol. 32, no. 6, pp. 1109-1121, Dec. 1984.
[10] Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean square error log-spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-33, no. 2, pp. 443-445, Apr. 1985.
[11] D. L. Donoho and I. M. Johnstone, "Ideal denoising in an orthonormal basis chosen from a library of bases," Technical Report 461, Department of Statistics, Stanford University, 1994.
[12] D. L. Donoho and I. M. Johnstone, "Adapting to unknown smoothness via wavelet shrinkage," J. Amer. Statist. Assoc., vol. 90, pp. 1200-1224, 1995.
[13] M. Bahoura and J. Rouat, "Wavelet speech enhancement based on the Teager energy operator," IEEE Signal Process. Lett., vol. 8, no. 1, pp. 10-12, 2001.
[14] C.-T. Lu and H.-C. Wang, "Enhancement of single channel speech based on masking property and wavelet transform," Speech Communication, vol. 41, no. 2-3, pp. 409-427, 2003.
[15] Dolby, "Making Cassettes Sound Better," http://www.dolby.com/cassette/bcsnr/, 2000.
[16] S.-H. Chen, S. Y. Chau, and J.-F. Wang, "Speech enhancement using perceptual wavelet packet decomposition and Teager energy operator," J. VLSI Signal Process. Systems, vol. 36, no. 2-3, pp. 125-139, 2004.
[17] Jianzhao Huang, Jian Xie, Qinhe Gao, Liang Li, "A Signal Threshold Denoising Method Based on Improved EEMD," (2012) International Review on Computers and Software (IRECOS), 7 (7), pp. 3600-3604.
[18] Q. Fu and E. A. Wan, "Perceptual wavelet adaptive denoising of speech," in Proc. Eurospeech, Geneva, 2003.
[19] Y. Hu and P. C. Loizou, "Speech enhancement based on wavelet thresholding the multitaper spectrum," IEEE Trans. Speech Audio Process., vol. 12, no. 1, pp. 59-67, 2004.
[20] C.-T. Lu and H.-C. Wang, "Enhancement of single channel speech based on masking property and wavelet transform," Speech Commun., vol. 41, no. 2-3, pp. 409-427, 2003.
[21] Rekha Lakshmanan and Vinu Thomas, "Microcalcification Detection by Morphology, Singularities of Contourlet Transform and Neural Network," Bonfring International Journal of Networking Technologies and Applications, vol. 1, no. 1, 2012.
[22] M. H. Malik, S. A. M. Gilani, Anwaar-ul-Haq, "Adaptive Image Fusion Scheme Based on Contourlet Transform and Machine Learning," (2008) International Review on Computers and Software (IRECOS), 3 (1), pp. 62-69.
[23] Xuelong Hu, Wei Fang, Wanpei Chen, Tongyu Jiang, Canjun Qian, "Mean-Shift Tracking Algorithm Based on Fused Texture Feature of Contourlet Transform," (2012) International Review on Computers and Software (IRECOS), 7 (7), pp. 3502-3506.

Authors’ information

1 Assistant Professor, KPR Institute of Engineering and Technology, Coimbatore, India. E-mail: [email protected]

2 DEAN, K. S. Rangasamy College of Technology, Tiruchengode, India.

B. Jai Shankar received the B.E. degree in Electronics and Communication Engineering from Government College of Engineering, Salem, and the M.E. degree in Applied Electronics from Kongu Engineering College, Erode. He has worked at K.S.R College of Engineering, Tiruchengode, and Kumaraguru College of Technology, Coimbatore. Currently, he is working as an Assistant Professor at KPR Institute of Engineering and Technology. His research interests include Digital Signal Processing, Image Processing and Wavelets.

K. Duraiswamy received his B.E. degree in Electrical and Electronics Engineering from P.S.G. College of Technology, Coimbatore, in 1965, the M.Sc.(Engg) from P.S.G. College of Technology, Coimbatore, in 1968, and the Ph.D. from Anna University in 1986. From 1965 to 1966 he was with the Electricity Board. From 1968 to 1970 he worked at ACCET, Karaikudi, and from 1970 to 1983 at Government College of Engineering, Salem. From 1983 to 1995 he was with Government College of Technology, Coimbatore, as Professor. From 1995 to 2005 he was Principal of K.S. Rangasamy College of Technology, Tiruchengode, and he presently serves as DEAN of K.S. Rangasamy College of Technology, Tiruchengode, India. Dr. K. Duraiswamy is interested in Digital Image Processing, Computer Architecture and Compiler Design. He received a 7-year Long Service Gold Medal for NCC. He is a life member of ISTE, a Senior Member of IEEE and a member of CSI.


International Review on Computers and Software (I.RE.CO.S.), Vol. 8, N. 3 ISSN 1828-6003 March 2013

Software Project Scheduling Techniques: a Comparison Study

Osama K. Harfoushi

Abstract – Project management tools are considered very important for the success of software projects. Moreover, there are different software project scheduling methods that project managers can follow in order to estimate the duration of software projects. This research discusses the main distribution methods, namely Uniform, Beta, Triangular and Gaussian, and provides a comparison between them through a simulation experiment. The simulation experiment is conducted using software that was developed for the purpose of this study. Copyright © 2013 Praise Worthy Prize S.r.l. - All rights reserved.

Keywords: Project Management, Schedule Methods, Software, Simulation

Manuscript received and revised February 2013, accepted March 2013

I. Introduction

During the last few decades, project management techniques have developed and taken their place alongside other important tools for increasing the efficiency of the project life cycle. Project management tools have been found by most organizations all over the world to be an effective way to supervise and manage major new systems development. In their most basic form, project management tools can be a helpful technique in the management of any project, including, and in particular, software projects. As the objective of this study is to find and evaluate the best techniques for decreasing the risk of project scheduling delays, it is important to understand the different approaches used in this study's experiment, such as the Uniform, Triangular, Beta and Gaussian distributions. Before looking at the definitions of project management, project controlling and project risk management, it is important to define project and project risk. Ref. [1] defines a project as something outside normal day-to-day work, adding the more specific detail that a project is organized around a "pre-defined" objective, has clear borders, needs resources as well as unique effort, therefore carries a level of risk and uncertainty, and operates under quality, cost and time constraints. For project risk, on the other hand, "a source of risk is any factor that can affect project performance, and risk arises when this effect is both uncertain and significant in its impact on project performance" [2]. According to the Project Management Institute Standards Committee (1996), project management is "the application of knowledge, skills, tools and techniques to project activities in order to meet or exceed customer needs and expectations from a project".

Meeting or exceeding the needs as well as the expectations of the customer involves controlling demands including scope, time, cost and quality. Ref. [3] summarizes the project management process as project plan, project schedule and project control integrated into one set of tasks; on this basis the objectives of the project can be achieved successfully and the needs of stakeholders, such as customers, satisfied. Most project management definitions highlight the importance of working with stakeholders to define needs, expectations and tasks. Moreover, project management is about managing people to achieve the expected objectives rather than managing the work itself. According to [4], project management is used on projects to increase both efficiency and effectiveness. Many researchers have tried to explain what project management is, and what most of their definitions share is that they all discuss, in some way, the criteria for the success of the project, namely cost, quality and time. These three factors are often referred to as The Iron Triangle [5]. The Project Management Institute's Project Management Body of Knowledge relates project management knowledge to one or more of the following fields: integration management, scope management, quality management, time management, cost management, risk management, human resources management, procurement management and communications management [6]. In other words, project management has to deal with a wide range of issues, which is a big challenge for project managers, mainly because managers do not see themselves as experts in all project management aspects [7]. This paper focuses on one main aspect of project management: minimizing the risk in scheduling the project tasks, and therefore decreasing the risk of possible delays. This is done by understanding and comparing the different scheduling techniques.

II. Scheduling Approaches and Problems in Project Management

Ref. [8] argues that, because of the increasingly competitive environment, scheduling tools prove to be better for certain types of projects. However, the limitations of these tools are also becoming known, and research is ongoing to improve them and to increase the use of other tools for scheduling activities, such as linear scheduling techniques, simulation techniques and genetic algorithms. Project scheduling is an essential part of a project at all stages, from the feasibility stage up to project completion. Different approaches are used to help in scheduling project activities, but the most common in use are the Critical Path Method (CPM) and the Program Evaluation and Review Technique (PERT). This paper attempts to evaluate the available techniques in project scheduling in order to improve the efficiency of estimating the project schedule. Ref. [8] developed a new model for improving the reliability of project schedules that face uncertainty in the duration of their activities. Their study shows that the tool applying the new concept needs minimal information, is simple to use and helps in the preparation of schedules at an accepted level of reliability. Ref. [9] presented the NETCOR (NETworks under CORrelated uncertainty) model, which can evaluate project schedule networks when activity durations are correlated, and showed its practical application to a current project. Using the same inputs, PERT and several simulation analyses that do not consider correlation were also evaluated; a comparison of the results shows the significance of considering correlation in scheduling analysis.

The Triangular distribution is one of the mathematical distribution techniques. This approach uses a minimum value, a maximum value and a most likely value. It is considered useful when clear information is not available; the triangular probability distribution is well suited to models lacking data. Another distribution method is the Uniform distribution, in which there is an equal number of occurrences in each interval: it gives the same probability to every value between its endpoints. The uniform distribution U(0,1) is fundamental for generating random values for the other distributions, and it may be used as an initial model for unknown data.

III. Experiment Results and Discussion

The main objective of this research is to help project managers estimate the project duration early in the project life cycle in the best possible way. The comparison between the different scheduling techniques is made through a simulation experiment using software developed especially for this objective. The software takes a number of inputs. The first main input is the total number of activities required to finish a new software project; the project manager decides the total number of activities as well as their descriptions. The project manager should also assign a precedence to each activity: the precedence is the number of the activity that must be completed before the beginning of the current activity. Once the work activities have been defined, the relations between the tasks can be identified; tasks must take place in a sequence usually pre-defined by the project manager. Sometimes one activity requires more than one precedence, or even no precedence at all. The software is developed to employ different kinds of probability distribution techniques. As stated earlier, the differences between these probability methods should be understood by the project manager or a member of the project management team, as some of these distribution methods require one, two or three parameters. Therefore, when a method is chosen, the corresponding number of parameter fields for that method is shown. These parameters represent the duration, that is, the time needed to finish each activity. Usually there are two duration estimates, an optimistic expectation and a pessimistic expectation, which any project manager must take into consideration. Sometimes, in the case of three duration estimates, the middle number is called the most likely value. Another input for the experiment is the total number of iterations: the number of repetitions of the simulation process, in other words how many random simulations will be applied to the same input data. Due to limited resources such as time, team and contacts in real software projects, this simulation experiment uses a random sample. It assumes that a software project consists of 10 activities with a specific set of assigned precedences. It is not really important whether there are 10 activities or 100; the important thing is that the number of activities remains the same during the implementation of the different probability distribution methods. It is also important that not only the number of tasks but also the precedences of the tasks


remain the same throughout the whole simulation experiment. The duration of each activity may take one, two or three different expected finishing times. These duration times must be entered in order to run the simulation experiment. The reason why there are one, two or three different duration times is that each probability distribution technique works in a different way, some with one duration, others with two or three. These expected durations also remain the same during the simulation experiment, to ensure the accuracy of the comparison between the different probability methods. In other words, all inputs of the simulation experiment are exactly the same, and the only two things that change from simulation to simulation are the distribution type (Uniform, Triangular, Beta or Gaussian (Normal)) and the number of iterations. Figure 1 shows the main interface of the experiment. As shown in the figure, the number of activities is 10, with the shown precedences and their expected end duration times. The simulation experiment is repeated for each of the distribution types (Uniform, Triangular, Beta and Gaussian (Normal)), and for each distribution type three different iteration tests are run (100, 1000 and 10000).

Fig. 1. System Input Value

III.1. Uniform Distribution Analysis

A uniform distribution is one in which there is an equal number of measures in each interval: an equal probability holds between its endpoints. Under the uniform distribution, no duration time between the minimum (optimistic) and the maximum (pessimistic) is more likely to occur than any other. The simulation uses the same number of activities, precedences and duration times for each activity, along with the uniform distribution technique. It was conducted three times, with 100, 1000 and 10000 repetitions of the simulation. Table I summarizes this first comparison between the three suggested iteration counts. Note that the max duration times in days for the three experiments are approximately close to each other. With 100 iterations, the estimated duration is 27.621 days with a standard deviation of 1.78. With 1000 iterations, it is 29.3 days with a standard deviation of almost 1.97. For the third experiment (10000 iterations), it is around 29.63 days with a standard deviation of 2.061. These close outputs indicate that the experiment results are likely accurate and very close to real life.

TABLE I
UNIFORM DISTRIBUTION METHOD COMPARISON

Number of Iterations | Max Duration in days | Standard Deviation in days
100 Iterations       | 27.621               | 1.78
1000 Iterations      | 29.3                 | 1.97
10000 Iterations     | 29.63                | 2.061

III.2. Triangular Distribution Analysis

The triangular distribution is used when the minimum, maximum and most likely values of the duration time are known without any other information; project managers often use this approach when data are scarce. The simulation uses the same number of activities, precedences and duration times for each activity as the previous experiment, along with the triangular distribution technique and the same three iteration tests. The max duration times in days for the three experiments are approximately close to each other, just as in the previous experiment; however, here the standard deviations are in general lower than with the uniform approach, which means more reliable results for project managers. With 100 iterations, the estimated duration is 27.332 days with a standard deviation of 1.431. With 1000 iterations, it is 28.782 days with a standard deviation of almost 1.6. For the third experiment (10000 iterations), it is around 29.145 days with a standard deviation of 1.559. These close outputs indicate that the experiment results are likely accurate and very close to real life, and better than the uniform method, especially because of the standard deviation output (Table II).

TABLE II
TRIANGULAR DISTRIBUTION METHOD COMPARISON

Number of Iterations | Max Duration in days | Standard Deviation in days
100 Iterations       | 27.332               | 1.431
1000 Iterations      | 28.782               | 1.6
10000 Iterations     | 29.145               | 1.559
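The simulation procedure described in this section — sample each activity's duration from the chosen distribution, honour the precedences, take the finishing time of the last activity as the project duration, and repeat for a given number of iterations — can be sketched as below. The 10-activity network, its three-point estimates and the use of Python's `random.triangular` are illustrative assumptions, not the paper's actual inputs.

```python
import random
import statistics

# Hypothetical 10-activity network: id -> (predecessor ids, (min, most likely, max) days).
# These values are illustrative only; the paper does not publish its input data.
ACTIVITIES = {
    1: ([], (2, 3, 5)),
    2: ([1], (1, 2, 4)),
    3: ([1], (3, 4, 6)),
    4: ([2], (2, 3, 4)),
    5: ([2, 3], (1, 1, 3)),
    6: ([3], (2, 4, 7)),
    7: ([4, 5], (3, 5, 8)),
    8: ([5, 6], (1, 2, 3)),
    9: ([6], (2, 3, 5)),
    10: ([7, 8, 9], (1, 2, 4)),
}

def simulate_once(sample):
    """One run: earliest finish time of each activity, honouring precedence;
    the project duration is the latest finish time."""
    finish = {}
    for act in sorted(ACTIVITIES):          # ids happen to be topologically ordered
        preds, params = ACTIVITIES[act]
        start = max((finish[p] for p in preds), default=0.0)
        finish[act] = start + sample(params)
    return max(finish.values())

def run(iterations, sample):
    """Repeat the simulation and summarise the project durations."""
    durations = [simulate_once(sample) for _ in range(iterations)]
    return statistics.mean(durations), statistics.stdev(durations)

# Triangular sampling from a (min, most likely, max) estimate, as in Section III.2.
triangular = lambda p: random.triangular(p[0], p[2], p[1])

random.seed(1)
for n in (100, 1000, 10000):
    mean, sd = run(n, triangular)
    print(f"{n:>5} iterations: mean duration {mean:.3f} days, std dev {sd:.3f}")
```

Swapping the `sample` callable (e.g. to a uniform or Gaussian draw) reproduces the paper's experimental design of changing only the distribution while keeping activities, precedences and estimates fixed.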

III.3. Beta Distribution Analysis

This simulation experiment was carried out three different times, with 100, 1000 and 10000 repetitions of the simulation. The max duration times in days for the three experiments are approximately close to each other, just as in the previous experiments; however, here the standard deviations are in general the lowest of all the distributions. With 100 iterations, the estimated duration is 32.360 days with a standard deviation of 0.31. With 1000 iterations, it is 34.450 days with a standard deviation of almost 0.299. For the third experiment (10000 iterations), it is around 33.6 days with a standard deviation of 0.291 (Table III).

TABLE III
BETA DISTRIBUTION METHOD COMPARISON

Number of Iterations | Max Duration in days | Standard Deviation in days
100 Iterations       | 32.360               | 0.31
1000 Iterations      | 34.450               | 0.299
10000 Iterations     | 33.6                 | 0.291

III.4. Gaussian (Normal) Distribution Analysis

The Gaussian, or Normal, distribution is used to describe events that have an equally likely chance of occurring above or below an average. Most outcomes are concentrated near this average; the farther away a value lies, the less likely it is. The normal distribution is very common because it is computationally convenient and has properties that generalize easily to real-world phenomena. As in the previous simulation experiments, this one uses the same number of activities and precedences, with two parameters assigned to each activity: the mean and the standard deviation of its duration. As before, this simulation experiment was carried out three different times, with 100, 1000 and 10000 repetitions. With 100 iterations, the estimated duration is 28.1 days with a standard deviation of 1.699. With 1000 iterations, it is 29.8 days with a standard deviation of almost 1.601. For the third experiment (10000 iterations), it is around 30.59 days with a standard deviation of 1.73. These outputs indicate that the results of this experiment differ from those of the other distributions (Table IV).

TABLE IV
GAUSSIAN DISTRIBUTION METHOD COMPARISON

Number of Iterations | Max Duration in days | Standard Deviation in days
100 Iterations       | 28.1                 | 1.699
1000 Iterations      | 29.8                 | 1.601
10000 Iterations     | 30.59                | 1.73

IV. Summary, Conclusions and Recommendations

This research helps project managers explore the distribution methods they can use when estimating the duration of a software project. It helps them compare the advantages of each type and decide when to use it to achieve the maximum benefit, and therefore to minimize the risk associated with estimating the project time period. Moreover, the results of this research can benefit practitioners in charge of project management, giving them a deeper understanding of the current distribution methods and a path toward well-designed and less risky software project development. The objective of this study was to test and evaluate the different distribution techniques that project managers can follow when estimating the project schedule, mainly to help reduce the risk of delays. The studied distributions are Uniform, Triangular, Beta and Gaussian (Normal). It was found that using these methods can, in general, decrease the risk of wrong project time estimation; however, it requires the project manager to be expert in, or at least familiar with, the usage of each of the tested distribution types. It was also found that, for the given inputs, the Triangular method provides the quickest duration in days among the types, with a low standard deviation from the mean. Table V shows the final output of the series of simulation experiments conducted to test the developed software. Note that the Triangular and Uniform distribution methods are recommended for project managers because they offer a competitive duration time. The Gaussian distribution type, on the other hand, may have a longer duration time, but with less risk of missing the project deadline.

TABLE V
CROSS ANALYSIS OF DISTRIBUTION TYPES (MAX DURATION IN DAYS)

Number of Iterations | Uniform | Triangular | Beta   | Gaussian
100 Iterations       | 27.621  | 27.332     | 32.360 | 28.1
1000 Iterations      | 29.3    | 28.782     | 34.450 | 29.8
10000 Iterations     | 29.63   | 29.145     | 33.6   | 30.59

Based on the research findings, the following recommendations are proposed. Software institutions must be aware of the great advantages project scheduling tools can provide: they save managers' time and increase flexibility, and therefore increase the productivity and performance of the organization. Another recommendation of this research is to apply software tools to the estimation of project duration, because accurate project estimation leads to client satisfaction and reduces the waste of software project resources. One more recommendation is to require project managers to know the main differences and


usages of different distribution types available in the market.
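To make those differences concrete, the four compared distribution types can all be sampled from the same three-point estimate (optimistic, most likely, pessimistic) and their spreads compared. The estimate values, the PERT-style parameterisation of the Beta distribution and the mean/sigma chosen for the Gaussian are assumptions for illustration; the paper does not specify its parameterisations.

```python
import random
import statistics

# Illustrative three-point estimate for one activity (days); not the paper's data.
OPT, LIKELY, PESS = 2.0, 4.0, 9.0

def beta_pert(a, m, b):
    """Beta sample stretched to [a, b]. The shape parameters follow the common
    PERT choice (weight 4 on the mode) -- an assumption, since the paper does
    not give its Beta parameterisation."""
    alpha = 1 + 4 * (m - a) / (b - a)
    beta = 1 + 4 * (b - m) / (b - a)
    return a + random.betavariate(alpha, beta) * (b - a)

SAMPLERS = {
    "Uniform":    lambda: random.uniform(OPT, PESS),
    "Triangular": lambda: random.triangular(OPT, PESS, LIKELY),
    "Beta":       lambda: beta_pert(OPT, LIKELY, PESS),
    # Gaussian needs a mean and a standard deviation instead of three points;
    # the PERT mean and (pessimistic - optimistic)/6 are used here as stand-ins.
    "Gaussian":   lambda: random.gauss((OPT + 4 * LIKELY + PESS) / 6, (PESS - OPT) / 6),
}

random.seed(7)
for name, draw in SAMPLERS.items():
    xs = [draw() for _ in range(10000)]
    print(f"{name:10s} mean {statistics.mean(xs):.2f}  sd {statistics.stdev(xs):.2f}")
```

Under these assumptions the Beta and Triangular draws show markedly smaller spread than the Uniform draw, which is consistent with the standard-deviation ordering reported in the experiments.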

References

[1] M. Field and L. Keller, Project Management (London: International Thomson Business Press), 1998.
[2] C. Chapman and S. Ward, Project Risk Management: Processes, Techniques and Insights (West Sussex: John Wiley & Sons), 1997.
[3] H. Kerzner, Advanced Project Management: Best Practices on Implementation (Second edition, John Wiley & Sons), 2004.
[4] K. Jugdev and R. Müller, A retrospective look at our evolving understanding of project success, Project Management Journal, Vol. 36, Issue 4, p. 20, 2005.
[5] R. Atkinson, Project management: cost, time and quality, two best guesses and a phenomenon, its time to accept other success criteria, International Journal of Project Management, Vol. 17, Issue 6, pp. 337-338, 1999.
[6] J. Turner, Towards a theory of Project Management: The functions of Project Management, International Journal of Project Management, Vol. 24, Issue 3, p. 187, 2006.
[7] B. Kolltveit, J. Karlsen, and K. Grønhaug, Perspectives on project management, International Journal of Project Management, Vol. 25, Issue 1, pp. 4, 8-9, 2007.
[8] Y. Ben-Haim and A. Laufer, Robust Reliability of Projects with Activity-Duration Uncertainty, ASCE Journal of Construction Engineering and Management, 124 (2), pp. 125-132, 1998.
[9] W. Wang and L. Demsetz, Model for Evaluating Networks Under Correlated Uncertainty – NETCOR, Journal of Construction Engineering and Management, 126 (6), pp. 458-466, 2000.

Authors’ information

Department of Business Information Technology, The University of Jordan, Amman, Jordan.

Osama K. Harfoushi holds a PhD degree in Information Technology from the University of Bradford in the UK, where he graduated in 2008. He was born in Amman, Jordan, on 28 August 1980. Currently, he is an Assistant Professor at the King Abdullah the Second School for Information Technology at the University of Jordan. His major research interests include e-business, e-government, e-learning and mobile business applications.


International Review on Computers and Software (I.RE.CO.S.), Vol. 8, N. 3 ISSN 1828-6003 March 2013

A Novel Approach Based on Nearest Neighbor Search on Encrypted Databases

Lakshmi R. B., Subramaniyaswamy V., Janani M.

Abstract – Data mining services permit designers and individuals to store their data on a server and reduce maintenance cost. Spatial queries, however, do not provide privacy, because the location of a query reveals sensitive information about it. Only an authorized user is allowed to access the query, while the service provider is not able to view it. This paper focuses on a novel k-nearest neighbor search on encrypted databases. It provides strong location privacy, which renders a query issued from any location indistinguishable in the data space. Because of the communication cost of query processing, existing work fails to support this search. We include a method that achieves strong location privacy by incorporating a Metric Preserving Transformation (MPT). Empirical results reveal that the efficacy and performance of the proposed methodology are improved compared to existing methodologies. Copyright © 2013 Praise Worthy Prize S.r.l. - All rights reserved.

Keywords: Data Privacy, Encryption, Nearest Neighbor Search, Protection, Query Processing, Security

Manuscript received and revised February 2013, accepted March 2013

I. Introduction

The positioning capabilities of mobile devices have made location-based services (LBS) the next killer application in the wireless database environment [1], [23]. LBS allow a client to query a service provider in an all-around manner to retrieve detailed information about points of interest. Similarity search has extensive applications across different database systems, such as multimedia databases and biological databases. The most common problem is the k-nearest neighbor search, which retrieves the k objects most similar to the query object. A difficulty arises in the distributed setting, where the data set S is dispersed among several parties who do not wish to disclose their private data. These problems occur when the data to be mined are sensitive, for example commercial secrets, proprietary data, data of national security significance, or data that must be kept private for legal requirements. This motivates the development of privacy-preserving protocols for data mining. Particular domains where privacy-preserving nearest neighbor protocols are needed include comparisons of chemical structures, biology and location privacy. Shaneck et al. proposed protocols intended to be adequately secure for this problem; however, their solution has a number of shortcomings, including the effective disclosure of information about confidential data and quadratic complexity. A powerful multi-step approach is given in [10]: in higher-dimensional settings, computing the accurate distance between two objects is expensive, so a distance function with low accuracy but particularly inexpensive cost is used, which lower-bounds the actual distance. With a privacy-preserving benchmark KNN protocol and such feature distances, our work can reduce the number of times the expensive distance protocol must be invoked. In our work, this can be used to improve privacy-preserving nearest neighbor algorithms such as KNN classification and outlier detection.

In this work, we focus on k-nearest neighbor queries with the highest degree of privacy, namely strong location privacy. The existing techniques contribute a degree of location privacy and can be classified into three concepts: location obfuscation, metric preserving transformation and data transformation. In our proposed work, KNN search with privacy has two main elements: the MPT and the query plan. With obfuscation techniques, the location-based service can only constrain the client to a subspace of the overall domain. MPT approaches build on a simple query primitive that retrieves a database block from the LBS without disclosing which block is retrieved; this primitive resists pattern attacks, and the client uses it to resolve a spatial query. To the best of our knowledge, among MPT methods dealing with KNN search, our proposed algorithm is the first in which a flexible number of block retrievals serves the spatial query. Even though each block retrieval is itself private, an MPT approach answering a KNN query may still disclose information, similarly to data transformation techniques, and therefore violate strong location privacy. Our method instead renders all queries identical, satisfying strong location privacy. However, prior work of this kind handles only single NN queries. Moreover, this approach can

lead to considerable computation and communication cost even for small POI databases, because it relies on an expensive MPT. The rest of this paper is structured as follows. Section II gives a deliberation on the related work. Section III describes the proposed approach to the problem and introduces the privacy notion. Section IV presents the experimental results, and Section V concludes the work with pointers to future work.
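The multi-step filter-and-refine idea cited above ([10]) — rank candidates by a cheap lower-bounding distance and compute the expensive exact distance only until the bound exceeds the current k-th best — can be sketched as follows. The point set, the Chebyshev lower bound and the Euclidean "expensive" distance are illustrative stand-ins, not the paper's privacy-preserving protocol.

```python
import heapq
import math

def exact_dist(p, q):
    # Stands in for an expensive (e.g. protocol-mediated) distance computation.
    return math.dist(p, q)

def lower_bound(p, q):
    # Chebyshev (max per-coordinate difference) lower-bounds Euclidean distance.
    return max(abs(a - b) for a, b in zip(p, q))

def knn_multistep(query, points, k):
    """Filter step: rank candidates by the cheap bound.
    Refine step: compute exact distances until the bound proves we can stop."""
    ranked = sorted(points, key=lambda p: lower_bound(query, p))
    best = []                                # max-heap of (-distance, point)
    exact_calls = 0
    for p in ranked:
        if len(best) == k and lower_bound(query, p) >= -best[0][0]:
            break                            # bound already beats the k-th best
        exact_calls += 1
        heapq.heappush(best, (-exact_dist(query, p), p))
        if len(best) > k:
            heapq.heappop(best)
    return sorted((-nd, p) for nd, p in best), exact_calls

pts = [(1, 1), (2, 5), (6, 2), (8, 8), (3, 3), (9, 1)]
neighbors, calls = knn_multistep((2, 2), pts, k=2)
print(neighbors, "exact computations:", calls)
```

Because later candidates have lower bounds at least as large as the stopping bound, and the exact distance never falls below its lower bound, the early termination cannot miss a true neighbor.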

II.

new entries, there is no need to change the encrypted values. Kreigal et al. described k-NN search with multi-step query processing algorithm using two boundary values in the filter step [8]. Then, this process shows that is optimal algorithm can produce and maintains the minimum amount of the applicants that needs to be polished. Due to this, they can use to generalize the Roptimality in the distance estimation that is available in the filter into a different account. Yao et al. discussed about the revisit of the security problem in nearest neighbor [15]. Then they show that the insecurity of existing solutions and main is the hardness of the problem. They designed a new model of partition based Secure Voronoi Diagram method (SVD). This technique is used to secure the encryption function and its uses and it can secure several encryption schemes to be employed in the SVD techniques. Subsequently, search on encrypted multimedia databases [9] described a first effort of the content based retrieval in excess of the encrypted database. By image database can be focused on the building protected search indexes that can be privacy for the image in the content from the server side and it can preserve the ability of same comparison. There are two types of securities in indexing scheme:  Secure is inverted index;  The secure is minimized in the hash sketch method. By using this method they can achieve a good retrieval in performance over the encrypted image indexes and server is mentioned as good for privacy preserving method in multimedia retrieval. Song et al. described an easy but efficient method to encrypt a index composed of pairs [13]. The objective is to retrieve the parallel value only if an applicant input is supplied. Consequently, to retrieve the tuple we must need an unacceptable key. Feasible and effective method for large dataset processing using support vector machines (SVM) has been proposed by Liang et al. [22]. 
Related Work

Several notable research efforts precede this work. Shashank et al. focus on content-based multimedia retrieval over an encrypted database, where privacy may be required on both the query side and the database side [12]. Their method retrieves results directly from the encrypted database, without multiple rounds of communication between the user and the server. They demonstrate the method on images, although it is also applicable to other multimedia such as video. Several related algorithms were proposed [6], [3]; however, they require the n input points nearest to the query point q to be made public. Consider also the distributed setting, where the data set S is partitioned among several parties who are unwilling to release their private data to others. This occurs when the mined data are sensitive, for example when they contain commercial secrets, concern national security, or must be kept private for legal reasons; it motivates the development of private protocols for data mining. Athitsos et al. described distance-based hashing (DBH) for approximate nearest neighbor retrieval in arbitrary spaces [2]. DBH creates several hash tables into which database objects and query objects are mapped. The key feature of DBH is that its formulation applies to arbitrary spaces and distance measures; it is inspired by LSH, but lifts the core LSH construction so that it can be applied in arbitrary spaces. Cao et al. addressed the problem of privacy-preserving graph query (PPGQ) [5]. The problem reduces to subgraph isomorphism, and they adopt the established filtering-and-verification paradigm to prune as many irrelevant graphs as possible before the verification step.

For the filtering step, an index is constructed in which each data graph is associated with a binary vector as a sub-index; each bit indicates whether the corresponding feature is subgraph-isomorphic to that data graph. The k-anonymity model [14] is applied in privacy-preserving publication of data sets: every tuple is generalized so that it is indistinguishable from at least k tuples in the table. Agrawal and Kiernan introduced an order-preserving encryption scheme in which queries can be applied directly to the encrypted database [1]; the query result contains neither false positives nor misses. To add the

Based on a study of the nature and difficulties of training SVMs, a novel reduction approach was proposed [22]. Attractive schemes supporting keyword search over an encrypted text repository are given by Feigenbaum et al. [7], where encrypted email messages are retrieved effectively; relational databases and techniques adapted to them are not discussed. Zagrouba et al. proposed a reliable image retrieval system based on graph matching [23], aimed at reducing the semantic gap between the query and the retrieved result. Bouganim et al. explained how the encryption and query-processing capabilities of a smart card can protect the retrieval of encrypted data kept on untrusted servers [4]. Moutachouik et al. described a hybrid method for information retrieval based on the similarity between queries, measured using the χ2 statistic and mutual information [20]. A new index structure for multidimensional data based on a forking technique is given by Hadi et al. [21]; the multidimensional data are visualized with a class label so that the impurity is easily perceived by the user.

Copyright © 2013 Praise Worthy Prize S.r.l. - All rights reserved

International Review on Computers and Software, Vol. 8, N. 3


Lakshmi R. B., Subramaniyaswamy V., Janani M


Strong location privacy does not require hiding the database itself; hence, we assume that the database and the index are not encrypted. Our proposed model instead protects the client's location. In MPT, Qi,j and Ci,j reveal no knowledge about the corresponding requested block Bi,j, and the query plan forces every KNN query to perform the same MPT retrievals on the database in the same order. Thus, all KNN query executions become indistinguishable.

III. Proposed Methodology

In our proposed work, a location-based service (LBS) holds a database, and the client wishes to issue a NN query against it without revealing its location. The LBS constructs an index structure on the database, integrates the database with the index, and arranges them into m (m ≥ 1) disjoint databases DB1, DB2, …, DBm; the reason for this decomposition, which depends on the particular solution, will become clear shortly. Each DBi consists of blocks Bi,1, Bi,2, …, all of equal size. Treating the secure hardware as a black box, the client issues a query Qi,j to retrieve the jth block of DBi; the reply is denoted Ci,j, an encrypted version of that block. Qi,j and Ci,j are intelligible only to the client and the secure hardware. Fig. 1 shows an overview of our methodology.
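The decomposition above can be sketched as follows. This is a minimal illustration, not the paper's prescribed layout: the round-robin assignment of records to sub-databases and the padding value are assumptions made for the example.

```python
def partition(records, m, block_size, pad=None):
    """Split records into m disjoint sub-databases of equal-size blocks."""
    # Round-robin the records so every sub-database receives a share.
    subs = [records[i::m] for i in range(m)]
    dbs = []
    for sub in subs:
        blocks = [sub[k:k + block_size] for k in range(0, len(sub), block_size)]
        # Pad the last block so that all blocks have the same size.
        if blocks and len(blocks[-1]) < block_size:
            blocks[-1] = blocks[-1] + [pad] * (block_size - len(blocks[-1]))
        dbs.append(blocks)
    return dbs

dbs = partition(list(range(10)), m=2, block_size=3)
# DB1 holds records 0,2,4,6,8 in blocks of 3; DB2 holds 1,3,5,7,9.
```

Equal-size blocks matter for privacy: block retrievals of identical size are indistinguishable to an observer of the traffic.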

III.2. Benchmark Explanation

We design solutions, called benchmarks, in which computing a query plan is the key to implementing strong location privacy over databases. We first describe the KNN algorithm.

III.2.1. KNN Algorithm

Let the database be a POI database. Each P ∈ DB has a single identifying attribute P.id, and P.tail denotes the further data associated with P. The LBS creates a g×g grid G over the POIs and constructs two databases, DB1 and DB2, consisting of blocks B1,i and B2,i respectively, to be retrieved via the MPT protocol. We focus on DB1. For every cell c ∈ G, the LBS creates a block B that stores one entry per POI in c, holding its coordinates P.x and P.y. When B cannot hold the entries of all POIs in the cell, the LBS creates additional blocks that form a list headed by B. The LBS stores one such head block per cell cij in DB1. Then

Fig. 1. Overview of proposed work

Let Q be the client's KNN query. The client executes an algorithm that processes Q in an informed, multi-step fashion. Initially the algorithm determines a set of blocks to retrieve privately from the LBS. The client then sends the LBS a parallel set of MPT queries Qi,j, Qu,v, …; the LBS processes them and replies with Ci,j, Cu,v, …. The retrieved blocks contain index or result data, which the algorithm uses to determine the blocks to retrieve in the next step. This process repeats until the result of Q is assembled; a KNN query thus translates into a sequence of MPT queries. Two requirements are obligatory for the security of a KNN query:
 Databases must be queried in the identical order.
 Every query in a category must issue the same number of MPT queries.
To satisfy these requirements, the LBS creates a query plan (QP) in an offline pre-processing stage and makes it publicly available. When generating MPT queries, the client-side algorithm takes the query plan into account. The QP guarantees successful result retrieval for a query anywhere in the data space.
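The fixed-shape requirement above can be sketched as follows: every query issues the same number of block retrievals, so short plans are padded with dummy requests. The padding scheme and the dummy marker are illustrative assumptions.

```python
def pad_plan(real_blocks, plan_len, dummy=-1):
    """Pad a block-request list to the fixed length dictated by the query plan."""
    # If the plan length is exceeded, the query plan was built incorrectly.
    assert len(real_blocks) <= plan_len, "query plan too small for this query"
    return real_blocks + [dummy] * (plan_len - len(real_blocks))

# Two different queries now produce request sequences of identical shape,
# so the access pattern reveals nothing about the query location.
plan_a = pad_plan([3, 7], 4)
plan_b = pad_plan([1, 2, 5], 4)
```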

any overflow blocks are appended to the end of DB1. The KNN algorithm runs on the client and consists of two phases. The first phase implements the grid-based KNN method CPM: during query processing it retrieves the grid cells in ascending order of their minimum distance to the query, and this technique is guaranteed to retrieve at most the cells intersecting the circle centered at the query with radius equal to the distance to the kth NN. When the method decides that cell cij should be accessed, all blocks of cij are fetched from DB1. This is possible because the client can identify the head block of cij in DB1: it is block (i−1)·g + j. The second phase of the KNN algorithm computes the result from the DB1 entries retrieved in the first phase, following their pointers to locate the result tails in the blocks of DB2.

III.3. Transformation Methods

We propose transformation techniques, together with the corresponding query algorithms: EHI, MPT, and FDH. They offer different trade-offs between query cost, accuracy, and data privacy. We also extend MPT and FDH to obey an ε-gap guarantee.
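The cell-to-block addressing above can be written down directly. This sketch assumes the 1-based grid indices used in the text.

```python
def head_block_index(i, j, g):
    """1-based position in DB1 of the head block of grid cell c_ij."""
    # Cells are laid out row by row in a g x g grid, as in the text.
    assert 1 <= i <= g and 1 <= j <= g
    return (i - 1) * g + j

g = 4
first = head_block_index(1, 1, g)   # first cell maps to block 1
last = head_block_index(g, g, g)    # last cell maps to block g*g
```

Because the mapping is arithmetic, the client needs no extra lookup structure to locate the head block of any cell.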

III.1. Threat Model and Protection

The primary privacy goal in our framework is strong location privacy: the query algorithm must not leak the client's location to the LBS.


Each object is examined with respect to its anchor object. Let ci denote the ith anchor object and ci.S the set of objects assigned to it. The anchor covers a radius ri, defined as the maximum distance from ci to any object in ci.S. Anchor distances play a significant role in query processing: for each object t in ci.S we compute the distance dist(ci, t) from the anchor, apply an order-preserving encryption function OPE to dist(ci, t), and send the encrypted object ECR(t, CK) to the server. The benefit of using OPE is that it hides the original distances while still allowing comparisons to be evaluated correctly on the server.
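The anchor-assignment step above can be sketched as follows. A strictly increasing toy function stands in for a real OPE scheme; the anchors, objects, and the function `ope` are illustrative assumptions.

```python
import math

def ope(d, key=7.0):
    # Any strictly increasing map preserves distance comparisons;
    # a deployment would use a proper order-preserving encryption scheme.
    return key * d + 0.5

def assign(objects, anchors):
    """Assign each object to its nearest anchor; store (anchor id, OPE(distance))."""
    table = []
    for x, y in objects:
        dists = [math.hypot(x - ax, y - ay) for ax, ay in anchors]
        i = min(range(len(anchors)), key=lambda k: dists[k])
        table.append((i, ope(dists[i])))
    return table

anchors = [(0.0, 0.0), (10.0, 0.0)]
table = assign([(1.0, 0.0), (9.0, 0.0)], anchors)
# Comparisons on encrypted distances match comparisons on plaintext ones.
```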

III.3.1. Encrypted Hierarchical Index Search (EHI)

This method stores an encrypted hierarchical (tree) index on the server and introduces a client-side algorithm for performing NN search over it. The technique offers data secrecy for the data owner, but query processing requires several communication round trips.

III.3.1.1. Query Processing

Since the tree index stored on the server is encrypted, the server cannot evaluate the KNN query by itself; a query-processing algorithm is therefore developed in which the server and the client cooperate to answer the NN query correctly. The whole response time of the algorithm consists of the data transfer time and the round-trip latency, the counterparts of the transfer and seek times in hard disks. The data transfer time is minimized by the best-first NN search algorithm; however, the server must then transfer a message each time a node is requested, so the process incurs very high round-trip latency. Our client-server design therefore improves upon the best-first NN search algorithm.
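The round-trip cost of the client-driven traversal can be seen in a small sketch of best-first NN search. The 1-D toy index, its bounding intervals, and the `fetch` callback are illustrative assumptions; in EHI each fetched node would additionally be decrypted by the client.

```python
import heapq

def mindist(q, lo, hi):
    # Lower bound on the distance from q to any point in the interval [lo, hi].
    return 0.0 if lo <= q <= hi else min(abs(q - lo), abs(q - hi))

def best_first_nn(q, fetch, root="root"):
    heap = [(0.0, root)]              # (distance lower bound, node id)
    best, best_d, trips = None, float("inf"), 0
    while heap:
        bound, node = heapq.heappop(heap)
        if bound >= best_d:
            break                     # remaining nodes cannot improve the NN
        children, points = fetch(node)   # one client-server round trip
        trips += 1
        for child, (lo, hi) in children:
            heapq.heappush(heap, (mindist(q, lo, hi), child))
        for p in points:
            if abs(p - q) < best_d:
                best, best_d = p, abs(p - q)
    return best, trips

# Toy 1-D index: each internal entry carries its subtree's bounding interval.
tree = {"root": ([("L", (1.0, 2.0)), ("R", (6.0, 9.0))], []),
        "L": ([], [1.0, 2.0]),
        "R": ([], [6.0, 9.0])}
nn, trips = best_first_nn(2.2, lambda n: tree[n])
```

Every node visit costs one round trip, which is exactly the latency the text attributes to EHI.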

III.3.2. Metric Preserving Transformation (MPT)

We next introduce a method that shifts the search functionality to the server. MPT stores qualified information on the server side with respect to a private set of anchor objects. The technique guarantees the accuracy of the final search result, at the cost of two rounds of communication: unlike EHI, MPT requires only two communication steps in the query phase. The fundamental idea of MPT is to take a set of anchor objects from the dataset T and assign every object of T to its nearest anchor. For each object, the distance dist to its anchor ci is computed and an order-preserving encryption function OPE is applied to the distance value. The order-preserving encrypted distances are stored on the server and used for processing NN queries.

Algorithm: MPT Searching Algorithm (Key Ck, Integer a, Integer b, Query q)
/* Query Processing */
1. The data owner uploads files qe to the server, encrypted with key Ck: Qe = qe + Ck (encrypted datasets).
2. H := new min-heap; p := null.
3. y := min i∈[1,A] dist(q, ai); // initial bound on the NN distance
4. For i = 1 to A do
5.   Let ai be the ith anchor object relevant to the bound y, with Samp its decrypted objects.
6.   Request from the server the objects whose anchor id equals ai and whose encrypted distance lies in [OPE(dist(q, ai) − y), OPE(dist(q, ai) + y)].
7.   For each p ∈ Samp do
8.     If mindist(q, ai, p) ≤ y then update y.
9. Return the object p ∈ Samp with the minimum dist(q, p) value as the result.
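A runnable sketch of this two-round search follows, assuming toy data and a strictly increasing stand-in for the OPE scheme. As in MPT, the anchors are drawn from the dataset itself, so the anchor distance upper-bounds the NN distance; all names here are illustrative.

```python
import math

def ope(d):
    # Stand-in for order-preserving encryption: any strictly increasing map.
    return 3.0 * d + 1.0

def build_server_table(objects, anchors):
    """Server stores only (anchor id, OPE-encrypted anchor distance) per object."""
    table = {}
    for oid, p in objects.items():
        aid = min(range(len(anchors)), key=lambda i: math.dist(p, anchors[i]))
        table[oid] = (aid, ope(math.dist(p, anchors[aid])))
    return table

def mpt_query(q, anchors, server_table, objects):
    # Round 1: bound the NN distance using the anchors alone (anchors are data objects).
    y = min(math.dist(q, a) for a in anchors)
    # Round 2: fetch every object whose encrypted anchor distance falls in
    # [OPE(dist(q, a_i) - y), OPE(dist(q, a_i) + y)] for its anchor a_i;
    # the triangle inequality guarantees the true NN is among the candidates.
    cand = []
    for oid, (aid, enc_d) in server_table.items():
        da = math.dist(q, anchors[aid])
        if ope(da - y) <= enc_d <= ope(da + y):
            cand.append(oid)
    # The client decrypts the candidates and refines them into the exact answer.
    return min(cand, key=lambda oid: math.dist(q, objects[oid]))

objects = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (9.0, 0.0), 3: (10.0, 0.0), 4: (5.0, 5.0)}
anchors = [objects[0], objects[3]]
nn = mpt_query((1.5, 0.0), anchors, build_server_table(objects, anchors), objects)
```

The server never sees plaintext distances, yet the range test on encrypted values filters exactly the same candidates an honest plaintext filter would.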

III.3.3. Flexible Distance-Based Hashing (FDH)

We propose Flexible Distance-Based Hashing for efficient NN query operation. The major benefit of FDH is that the server always returns a constant-size candidate set in a single round; the client subsequently refines the candidates into the final result. However, FDH does not guarantee that the exact final result appears among the candidates. FDH allows the client to tune a parameter controlling the precision of the query result, and this tuning can be applied without rebuilding the transformed data stored on the server; unlike our earlier development, query precision does not require rebuilding the directory organization. In our proposed work, FDH employs a novel method for linking similar hash buckets, which minimizes the transformed data transferred when answering queries.
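The hashing idea behind FDH, in the spirit of the DBH scheme it builds on [2], can be sketched as follows: each hash bit records whether an object's distance to a pivot falls inside an interval, and the server returns the bucket matching the query's code. The pivots and thresholds are illustrative assumptions.

```python
import math

def dbh_code(p, pivots):
    # One bit per (pivot, interval): 1 iff dist(p, pivot) lies in [lo, hi].
    return tuple(int(lo <= math.dist(p, c) <= hi) for c, (lo, hi) in pivots)

def build_buckets(objects, pivots):
    """Index every object under its hash code."""
    buckets = {}
    for oid, p in objects.items():
        buckets.setdefault(dbh_code(p, pivots), []).append(oid)
    return buckets

def candidates(q, buckets, pivots):
    # The server returns the bucket whose code matches the query's code;
    # the client refines it locally. No exactness guarantee, as noted above.
    return buckets.get(dbh_code(q, pivots), [])

objects = {0: (1.0, 0.0), 1: (9.0, 0.0), 2: (5.0, 5.0)}
pivots = [((0.0, 0.0), (0.0, 3.0)), ((10.0, 0.0), (0.0, 3.0))]
cand = candidates((2.0, 0.0), build_buckets(objects, pivots), pivots)
```

More (pivot, interval) pairs shrink the candidate set at the risk of missing the true NN, which is the precision/cost trade-off the client tunes.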

III.3.2.1. Transformation in MPT

The algorithm implements the Metric Preserving Transformation from the input dataset T. Let A denote the number of anchor objects. In our proposal, we use an M-tree to choose the group of A anchors from T; this heuristic is designed to optimize the quality of the anchor objects. Next we calculate the value of B, defined as the maximum number of objects that may be assigned to the same anchor object. We then apply a method to allocate every object to an anchor; each object is then examined with respect to that anchor, as described earlier.


IV. Experimental Results

Before evaluating the effects of the parameters, we describe the data sets; Table I below shows sample values from the MUSH data set. The data sets collect related records whose elements are separated per attribute, and are used to store the YEAST and MUSH values. Table I shows the data set taken for our experimental evaluation; the MUSH data set describes categorical attributes for North American mushrooms.

TABLE I
MUSH DATA SETS

RAD FLOW  FPV CLOSE  FPV OPEN  HIGH  BYPASS  BPV CLOSE  BPV OPEN  MAX  MCG OPEN  MCG CLOSE
55        0          81        0     -6      11         25        88   64        4
56        0          96        0     52      -4         40        44   4         4
50        -1         89        -7    50      0          25        37   12        4
55        2          82        0     54      -6         26        28   2         1
41        0          84        3     38      -4         43        45   2         1

Fig. 3. Extracted MUSH dataset

To test the results of our proposed work, we ran several executions on a synthetic sequence of tasks, evaluating the proposed nearest neighbor search model on the encrypted database. We include real data sets, measurements, and parameters, and our evaluation covers the effect of a variety of factors on the quality and performance of the distinct techniques. The YEAST data set attributes taken for our evaluation are extracted, transformed, and stored in the database for query processing (see Figs. 2 and 3). We then rank the KNN results on the YEAST vs MUSH data sets. Observe that MPT outranks EHI and FDH in time complexity for YEAST vs MUSH: EHI has by far the largest time complexity, well above FDH and MPT, and the time complexity of Encrypted Hierarchical Index Search depends strongly on the data set values. Compared with the other methods, MPT achieves the best time complexity for MUSH (see Fig. 4).

Fig. 4. Time Complexity for YEAST vs MUSH

Fig. 2. Extracted YEAST dataset

Fig. 5 shows the effect of varying the number of anchors used in the EHI, FDH, and MPT methods on the query cost for YEAST vs MUSH. We observe a steady improvement in query cost as the number of anchor objects increases.

Fig. 5. Query Cost for YEAST vs MUSH


More anchors provide more information on the location of objects. The same trend holds for the connection quality of EHI as the number of anchor objects grows, for both YEAST and MUSH; compared with YEAST, MUSH yields better execution time for the query-cost communication process in MPT (see Fig. 5). Comparing processing accuracy for YEAST vs MUSH, the MUSH values increase the accuracy relative to YEAST, since increasing MUSH values lead to more complex anchor objects. Moreover, MPT improves processing accuracy on the MUSH values and gives the best accuracy among the anchor-based methods (see Fig. 6).

Fig. 6. Processing Accuracy for YEAST vs MUSH

V. Conclusion

In this paper, we derived an efficient methodology for k nearest neighbor search on encrypted databases in which the query may be located anywhere in the data space. We proposed approaches that break a kNN query into a sequence of database block retrievals. We introduced the MPT approach, which stores comparative information on the server side along with a secret set of anchor objects; MPT guarantees the exactness of the result with two steps of communication. The FDH technique requires only a single step of communication, but does not guarantee retrieval of the exact result.

References

[1] R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu, "Order preserving encryption for numeric data," in Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, Paris, France, 2004.
[2] V. Athitsos, M. Potamias, P. Papapetrou, and G. Kollios, "Nearest neighbor retrieval using distance-based hashing," in Proceedings of the IEEE 24th International Conference on Data Engineering (ICDE 2008), 2008, pp. 327-336.
[3] S. Berchtold, B. Ertl, D. A. Keim, H.-P. Kriegel, and T. Seidl, "Fast nearest neighbor search in high-dimensional space," in Proceedings of the 14th International Conference on Data Engineering, 1998, pp. 209-218.
[4] L. Bouganim and P. Pucheral, "Chip-secured data access: Confidential data on untrusted servers," in Proceedings of the 28th International Conference on Very Large Data Bases, 2002, pp. 131-142.
[5] N. Cao, Z. Yang, C. Wang, K. Ren, and W. Lou, "Privacy-preserving query over encrypted graph-structured data in cloud computing," in Proceedings of the 31st International Conference on Distributed Computing Systems (ICDCS), 2011, pp. 393-402.
[6] C. Clifton, M. Kantarcioglu, J. Vaidya, X. Lin, and M. Y. Zhu, "Tools for privacy preserving distributed data mining," ACM SIGKDD Explorations Newsletter, vol. 4, pp. 28-34, 2002.
[7] J. Feigenbaum, M. Liverman, and R. N. Wright, "Cryptographic protection of databases and software," Distributed Computing and Cryptography, vol. 2, pp. 161-172, 1991.
[8] H.-P. Kriegel, P. Kröger, P. Kunath, and M. Renz, "Generalizing the optimality of multi-step k-nearest neighbor query processing," Advances in Spatial and Temporal Databases, pp. 75-92, 2007.
[9] W. Lu, A. Swaminathan, A. L. Varna, and M. Wu, "Enabling search over encrypted multimedia databases," SPIE/IS&T Media Forensics and Security, pp. 7254-18, 2009.
[10] T. Seidl and H.-P. Kriegel, "Optimal multi-step k-nearest neighbor search," in ACM SIGMOD Record, 1998, pp. 154-165.
[11] M. Shaneck, Y. Kim, and V. Kumar, "Privacy preserving nearest neighbor search," Machine Learning in Cyber Trust, pp. 247-276, 2009.
[12] J. Shashank, P. Kowshik, K. Srinathan, and C. Jawahar, "Private content based image retrieval," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008, pp. 1-8.
[13] D. X. Song, D. Wagner, and A. Perrig, "Practical techniques for searches on encrypted data," in Proceedings of the 2000 IEEE Symposium on Security and Privacy, 2000, pp. 44-55.
[14] L. Sweeney, "k-anonymity: A model for protecting privacy," International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 10, pp. 557-570, 2002.
[15] B. Yao, F. Li, and X. Xiao, "Secure nearest neighbor revisited," in Proceedings of the 29th IEEE International Conference on Data Engineering (ICDE 2013), Brisbane, Australia, 2013.
[16] G. R. Hjaltason and H. Samet, "Index-driven similarity search in metric spaces," ACM Transactions on Database Systems, vol. 28, no. 4, pp. 517-580, 2003.
[17] M. L. Yiu, I. Assent, C. S. Jensen, and P. Kalnis, "Outsourced similarity search on metric data assets," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 2, 2012.
[18] A. Khoshgozaran and C. Shahabi, "Blind evaluation of nearest neighbor queries using space transformation to preserve location privacy," in Proceedings of the 10th International Conference on Advances in Spatial and Temporal Databases (SSTD), pp. 239-257, 2007.
[19] G. Ghinita, P. Kalnis, A. Khoshgozaran, C. Shahabi, and K.-L. Tan, "Private queries in location based services: Anonymizers are not necessary," in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 121-132, 2008.
[20] H. Moutachouik, B. Ouhbi, H. Behja, B. Frikh, A. Marzak, H. Douzi, "Hybrid method for information retrieval based on the similarity between queries," (2012) International Review on Computers and Software (IRECOS), 7 (6), pp. 2960-2967.
[21] S. Hadi, R. Beg, S. Rizvi, "An intelligent tree based clustering method for large multi dimensional data," (2009) International Review on Computers and Software (IRECOS), 4 (6), pp. 648-651.
[22] K. Liang, F. Chen, X. Qu, "Research and application of SVM in large scaled data set processing," (2012) International Review on Computers and Software (IRECOS), 7 (4), pp. 1536-1540.
[23] E. Zagrouba, S. Ouni, W. Barhoumi, "A reliable image retrieval system based on spatial disposition graph matching," (2007) International Review on Computers and Software (IRECOS), 2 (2), pp. 108-117.


Authors’ information

Department of Computer Science and Engineering, School of Computing, SASTRA University, Thanjavur, Tamil Nadu, India.
E-mails: [email protected] [email protected] [email protected]

R. B. Lakshmi received her B.E. from Anna University, Chennai and is pursuing an M.Tech. at SASTRA University, Thanjavur. Her research interests include Data Mining, Network Security and Database Systems.

V. Subramaniyaswamy received his B.E. from Bharathidasan University, Trichy and M.Tech. from Sathyabama University, Chennai. He is pursuing the Ph.D. degree in Computer Science and Engineering at Anna University, Chennai, India. He is currently working as Assistant Professor in the School of Computing at SASTRA University, Thanjavur. His research interests include Data Mining, Information Retrieval and Recommender Systems.

M. Janani received her MCA from SASTRA University, Thanjavur and is pursuing an M.Tech. at SASTRA University, Thanjavur. Her research interests include Network Security, Data Mining and Cloud Computing.
