LNAI 7906 - Recent Trends in Applied Artificial Intelligence - UNLa

Moonis Ali Tibor Bosse Koen V. Hindriks Mark Hoogendoorn Catholijn M. Jonker Jan Treur (Eds.)

Recent Trends in Applied Artificial Intelligence 26th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2013 Amsterdam, The Netherlands, June 17-21, 2013 Proceedings


Series Editors Randy Goebel, University of Alberta, Edmonton, Canada Jörg Siekmann, University of Saarland, Saarbrücken, Germany Wolfgang Wahlster, DFKI and University of Saarland, Saarbrücken, Germany Volume Editors Moonis Ali Texas State University San Marcos, TX 78666, USA E-mail: [email protected] Koen V. Hindriks Catholijn M. Jonker Delft University of Technology 2628 CD Delft, The Netherlands E-mail: {k.v.hindriks; c.m.jonker}@tudelft.nl Tibor Bosse Mark Hoogendoorn Jan Treur VU University Amsterdam 1081 HV Amsterdam, The Netherlands E-mail: {t.bosse; m.hoogendoorn; j.treur}@vu.nl

ISSN 0302-9743 e-ISSN 1611-3349 ISBN 978-3-642-38576-6 e-ISBN 978-3-642-38577-3 DOI 10.1007/978-3-642-38577-3 Springer Heidelberg Dordrecht London New York Library of Congress Control Number: 2013938609 CR Subject Classification (1998): I.2, H.4, F.2, H.2.8, I.5, I.4, H.3, F.1, G.2, C.2, K.4.4, I.6 LNCS Sublibrary: SL 7 – Artificial Intelligence © Springer-Verlag Berlin Heidelberg 2013 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)

Using Ising Model to Study Distributed Systems Processing Capability Facundo Caram, Araceli Proto, Hernán Merlino, and Ramón García-Martínez Facultad de Ingeniería, Universidad de Buenos Aires, Argentina Grupo de Investigación en Dinámica y Complejidad de la Sociedad de la Información y Grupo de Investigación en Sistemas de Información, Departamento Desarrollo Productivo y Tecnológico, Universidad Nacional de Lanús, Argentina {aproto,rgarcia}@unla.edu.ar

Abstract. Quality of service in distributed systems must be preserved throughout all stages of the life cycle. The stage in which this is most critical is the planning of system capacity. Because capacity planning produces an estimate, the traditional approach relies on queues for the capacity calculation. This paper proposes using the traditional Ising model for capacity studies. Keywords: Distributed Systems, Processing Capability, Ising Model.

1 Introduction

Capacity planning of distributed systems presents the challenge of estimating the resources the system will use, based on requirements defined at an early stage of the system life cycle. To address this, the community has used queuing network (QN) models. A QN model is a set of interconnected queues, each of which includes the waiting time to serve every user [1]. Users move between these queues to complete their requests. The input parameters of a QN model fall into two groups: [a] current load: the system load at any given time; [b] service demand: the average service time provided by a resource. One important aspect of QN models is system performance.

Technologies for building software systems, such as Web Services, XML-RPC and Grid Computing, allow more robust and complex systems to be built, but their use makes capacity planning more complex, and the community needs tools to make precise estimates. For this reason, this paper presents an alternative way to estimate the performance of a distributed system, based on the traditional Ising model. In a distributed architecture each component has a finite processing capacity, and components may be redundant; in this setting, the traditional Ising model introduces a mechanism for resource cooperation. The "virtual" temperature, the magnetization and the critical parameters of the system are determined through simulation.

Performance Engineering (PE) encompasses a set of tasks, activities and deliverables that are applied to each phase of the life cycle in order to achieve the quality of service detailed in the non-functional requirements. PE is a collection of methods that supports the development of performance-oriented systems over the entire life cycle. In performance-oriented methodologies, the different stages of the life cycle are linked with performance engineering models: (a) load model: simulates the real load to be borne by the system; (b) performance model: gives the response time of a job request under a given workload; and (c) availability model: the standard by which to evaluate how long the system is available to work.

M. Ali et al. (Eds.): IEA/AIE 2013, LNAI 7906, pp. 92–101, 2013. © Springer-Verlag Berlin Heidelberg 2013

2 Distributed Systems Modeling Using Ising

Our interest is to find an analogy between distributed systems based on grid computing and a well-known physics model [3]. We believe that the Ising model [4] is particularly suitable for this purpose, as described below. In our model each site, or spin, of the traditional Ising model represents a cell, that is, a processing computer of the grid. Computers with available resources are assigned $s_i = -1$ (black site), and computers without available resources (saturated) are assigned $s_i = +1$ (white site) (see Figure 1).

Fig. 1. Black cells have available resources; white cells do not and are congested

The energy of the Ising system in a specified configuration $\{s_i\}$ is defined as:

$$E_I\{s_i\} = -\sum_{\langle ij \rangle} e_{ij}\, s_i s_j - H \sum_{i=1}^{N} s_i$$

where the subscript I refers to Ising and the symbol $\langle ij \rangle$ denotes a pair of nearest-neighbor spins; there is no distinction between $\langle ij \rangle$ and $\langle ji \rangle$. Thus the sum over $\langle ij \rangle$ has $\gamma N/2$ terms, where $\gamma$ is the number of nearest neighbors of any given site, and the geometry of the grid (lattice) is described by $\gamma$ and $e_{ij}$. A situation very close to reality is an L x L Ising lattice with all $e_{ij} = e$, $\gamma = 4$ and $H = 0$. The time between service requirements (TMER) corresponds to a "virtual" temperature of the grid, calculated using the parameter estimation technique [7, 8].
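As an illustration of the lattice described above, the following minimal sketch evaluates the standard nearest-neighbor Ising energy, E = -e Σ s_i s_j - H Σ s_i, on an L x L lattice. The function name `ising_energy` and the use of periodic boundaries (so that every site has exactly γ = 4 neighbors) are assumptions made for this example; the paper does not state its boundary conditions.

```python
def ising_energy(s, e=1.0, H=0.0):
    """Energy of an L x L spin lattice with uniform coupling e and field H.

    s is a list of L rows, each a list of L spins (+1 or -1).
    Periodic boundaries are assumed (not stated in the paper).
    """
    L = len(s)
    pair_sum = 0
    for i in range(L):
        for j in range(L):
            # Count each nearest-neighbor pair once: right and down neighbors.
            pair_sum += s[i][j] * (s[i][(j + 1) % L] + s[(i + 1) % L][j])
    field_sum = sum(sum(row) for row in s)
    return -e * pair_sum - H * field_sum


# All computers free (s = -1 everywhere) on a 4 x 4 grid: 32 aligned pairs.
all_free = [[-1] * 4 for _ in range(4)]
print(ising_energy(all_free))
```

With γ = 4 and N = 16 sites, the pair sum runs over γN/2 = 32 terms, which is why visiting only the right and down neighbor of each site is sufficient.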

3 Parameter Estimation in Ising Model

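If we define $\sigma_{i,j} = s_{i-1,j} + s_{i+1,j} + s_{i,j-1} + s_{i,j+1}$ and calculate the conditional probability for a fixed site at position (i, j), we obtain the following:

$$P(s_{i,j} \mid \sigma_{i,j}) = \frac{e^{\beta s_{i,j} \sigma_{i,j}}}{2\cosh(\beta \sigma_{i,j})} \tag{1}$$

The parameter $\beta$ is estimated using a common technique for parameter estimation in Markov random fields [5], the maximum pseudo-likelihood estimator [7, 9]. This criterion states that $\hat{\beta}$ (the estimator of $\beta$) is the value that maximizes the following product of probabilities:

$$\hat{\beta} = \arg\max_{\beta} \prod_{i,j} P(s_{i,j} \mid \sigma_{i,j}) \tag{2}$$

As usual, we take the natural logarithm of the above expression and, using expression (1), obtain the following objective function of the parameter $\beta$:

$$J(\beta) = \sum_{i,j} \left[ \beta s_{i,j} \sigma_{i,j} - \ln\big(2\cosh(\beta \sigma_{i,j})\big) \right] \tag{3}$$

We now differentiate with respect to $\beta$; $\hat{\beta}$ is the value at which the derivative of $J$ vanishes:

$$\left.\frac{dJ}{d\beta}\right|_{\hat{\beta}} = \sum_{i,j} \left[ s_{i,j} \sigma_{i,j} - \sigma_{i,j} \tanh(\hat{\beta} \sigma_{i,j}) \right] = 0 \tag{4}$$

To analyze this expression and develop an algorithm for estimating $\hat{\beta}$, we introduce the following definitions: $N_\alpha$ is the number of sites for which the sum of the neighbors is equal to $\alpha$. Note that the possible values of $\alpha$ are restricted to 0, +2, -2, +4 and -4, and of course $L^2 = N_0 + N_{-2} + N_{+2} + N_{-4} + N_{+4}$. We now decompose the sum in (4) as follows:

$$A = 2\Big(\sum_{N_{+2}} s_{i,j} - \sum_{N_{-2}} s_{i,j}\Big) + 4\Big(\sum_{N_{+4}} s_{i,j} - \sum_{N_{-4}} s_{i,j}\Big), \quad B = 2\,(N_{+2} + N_{-2}), \quad C = 4\,(N_{+4} + N_{-4}) \tag{5}$$

where the notation $\sum_{N_\alpha} s_{i,j}$ means the sum over all sites whose neighbors sum to $\alpha$. The numbers A, B and C are quantities that can easily be calculated from the sample $\{s_{i,j}\}$, and the estimator is obtained as the $\hat{\beta}$ that satisfies:

$$A - B \tanh(2\hat{\beta}) - C \tanh(4\hat{\beta}) = 0 \tag{6}$$

Once A, B and C have been calculated, the zero of this function is found using a numerical technique such as Müller's method [6].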
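The estimation procedure of this section can be sketched in a few lines. The sketch assumes H = 0, periodic boundaries, and the statistics A = Σ s σ, B = 2(N₊₂ + N₋₂), C = 4(N₊₄ + N₋₄), with the estimator taken as the root of A - B tanh(2β) - C tanh(4β) = 0. This is one form consistent with the pseudo-likelihood derivation, not necessarily the authors' exact definitions. Plain bisection replaces Müller's method here, which is valid because the function is monotonically decreasing in β. All function names are hypothetical.

```python
import math


def pseudo_likelihood_stats(s):
    """Return (A, B, C) for an L x L spin lattice with periodic boundaries."""
    L = len(s)
    A = n2 = n4 = 0
    for i in range(L):
        for j in range(L):
            # Sum of the four nearest neighbors of site (i, j).
            sigma = (s[(i - 1) % L][j] + s[(i + 1) % L][j]
                     + s[i][(j - 1) % L] + s[i][(j + 1) % L])
            A += s[i][j] * sigma
            if abs(sigma) == 2:
                n2 += 1
            elif abs(sigma) == 4:
                n4 += 1
    return A, 2 * n2, 4 * n4


def estimate_beta(s, lo=-5.0, hi=5.0, tol=1e-10):
    """Maximum pseudo-likelihood estimate of beta, clamped to [lo, hi]."""
    A, B, C = pseudo_likelihood_stats(s)
    if B == 0 and C == 0:
        return 0.0  # every neighbor sum is zero: the sample carries no information

    def f(beta):
        return A - B * math.tanh(2 * beta) - C * math.tanh(4 * beta)

    # f is decreasing in beta, so the root is bracketed when f(lo) > 0 > f(hi).
    if f(lo) <= 0:
        return lo
    if f(hi) >= 0:
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# A 4 x 4 grid that is saturated everywhere except for one free computer.
lattice = [[1] * 4 for _ in range(4)]
lattice[0][0] = -1
print(pseudo_likelihood_stats(lattice), estimate_beta(lattice))
```

The virtual temperature of the grid is then 1/β̂. A fully ordered lattice pushes the root to infinity, which is why the sketch clamps the result to a finite interval.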

4 Simulations and Results

This section presents the modes of operation of a computer network as a grid (Section 4.1), an analysis of the cooperative and non-cooperative scenarios (Section 4.2), and a discussion of the dynamic scenario (Section 4.3).

4.1 Modes of Operation of a Computer Network as a Grid

To model the operation of a computer network (grid computing, GC) on a 2D Ising lattice, we simulate the following cases: the "cooperative case", in which the cells (computers) interact with their nearest neighbors; the "non-cooperative case" (normal operation), in which the cells do not interact; and finally a dynamic case, in which there are no predetermined relationships between cells and each cell dynamically asks its neighbors for a free resource. In each case we observe the evolution of the grid temperature (1/β) as a function of the time between requirements. It is also observed that, when modeling the cooperative case, the number of lost requirements is lower.

In the dynamic case a demand rule is established that gives rise to an adaptive, dynamic type of interaction: when a computer is congested, it starts looking among its nearest neighbors for one with available resources. If it finds a neighbor with availability, a link is established between the two that remains fixed for future operations. If it finds no neighbor with available resources, the computer loses that requirement and remains unlinked until the next time it needs to search for resources to process a requirement. Thus, as with the random graphs in [3], the network (grid) evolves: at the beginning of the simulation the computers are not connected to each other. Then, while requirements are processed, and depending on the level of congestion, computers begin to look for resources among their neighbors. This produces a transition in the operating state of the grid, which generates links between computers. The number of possible links for each computer in this model ranges from zero to four. The rate at which links are generated depends on TMER. Once the transient is over and the steady state is reached, most computers have created four links and the grid behaves as in a cooperative scenario. The difference from the pure cooperative case is that this almost cooperative scenario was created dynamically, based on demand. The network thus evolves dynamically from one static scenario (without cooperation) at the start to another static scenario (with cooperation) in the steady state.

4.2 Analysis of Cooperative and Non-cooperative Scenarios

To evaluate the temperature (1/β) of the grid described above, we define the network parameters: number of computers (cells), amount of resources per computer (processing power), number of requirements, mean time between requirements and mean processing duration of each request. The same number of requests is generated for each computer, using an exponential distribution for the arrival of requirements. The duration of each requirement is drawn from a normal distribution. Different values of TMER are then used for both scenarios. Finally, from these parameters the parameter β is estimated, which determines the corresponding "virtual" temperature of the grid in each case. Figure 2 shows the values of the parameter β for each value of TMER. Besides the cases with and without cooperation, the same figure also shows the theoretical value of the critical temperature of the Ising model. Figure 3 shows the percentage of unprocessed requests for both cases.
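The experiment described above can be approximated with a small event-driven sketch. It is a simplified illustration under stated assumptions, not the authors' simulator: each request occupies one resource for its whole duration, a saturated computer in the cooperative case tries its four nearest neighbors in a fixed order, boundaries are periodic, and non-positive sampled durations are clipped to a tiny positive value. The function name `simulate_grid` and all parameter names are hypothetical.

```python
import heapq
import random


def simulate_grid(L=5, resources=10, n_req=200, tmer=0.01,
                  mean_dur=0.1, sd_dur=0.05, cooperative=True, seed=0):
    """Return (fraction of lost requests, final spin lattice)."""
    rng = random.Random(seed)
    # Exponential inter-arrival times with mean tmer, per computer.
    arrivals = []
    for i in range(L):
        for j in range(L):
            t = 0.0
            for _ in range(n_req):
                t += rng.expovariate(1.0 / tmer)
                arrivals.append((t, i, j))
    arrivals.sort()
    # busy[i][j] is a min-heap of release times of occupied resources.
    busy = [[[] for _ in range(L)] for _ in range(L)]

    def has_free(i, j, now):
        heap = busy[i][j]
        while heap and heap[0] <= now:
            heapq.heappop(heap)  # release resources whose job has finished
        return len(heap) < resources

    lost = 0
    for t, i, j in arrivals:
        # Normally distributed processing time, clipped to stay positive.
        duration = max(1e-9, rng.gauss(mean_dur, sd_dur))
        candidates = [(i, j)]
        if cooperative:
            candidates += [((i - 1) % L, j), ((i + 1) % L, j),
                           (i, (j - 1) % L), (i, (j + 1) % L)]
        for a, b in candidates:
            if has_free(a, b, t):
                heapq.heappush(busy[a][b], t + duration)
                break
        else:
            lost += 1  # no capacity found: the request is discarded
    end = arrivals[-1][0]
    # Spin convention of the paper: -1 = resources available, +1 = saturated.
    spins = [[-1 if has_free(i, j, end) else 1 for j in range(L)]
             for i in range(L)]
    return lost / len(arrivals), spins


for coop in (False, True):
    frac, _ = simulate_grid(tmer=0.001, cooperative=coop)
    print("cooperative" if coop else "non-cooperative", round(frac, 3))
```

The returned spin lattice is the kind of sample from which β would be estimated for a given TMER; sweeping `tmer` over several orders of magnitude gives curves of the type discussed for Figures 2 and 3.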

Fig. 2. β as a function of TMER, for the cases with and without cooperation, versus the critical temperature of the Ising model, β = 0.44. Tested with a lattice of 25x25 computers with 10 resources each, and requests with a mean of 0.1 and a dispersion of 0.05.
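For reference, the theoretical value quoted in the caption of Fig. 2 agrees with the exact critical coupling of the two-dimensional square-lattice Ising model at zero field (Onsager's solution). The expression below assumes that β is measured in units where the coupling e and Boltzmann's constant are absorbed into β, which appears to be the paper's convention:

```latex
\sinh(2\beta_c) = 1
\quad\Longrightarrow\quad
\beta_c = \tfrac{1}{2}\ln\!\left(1 + \sqrt{2}\right) \approx 0.4407
```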

Fig. 3. Proportions of lost and served requirements for both scenarios

Figure 3 shows three zones or regions. The first is where TMER ≤ 0.0001, which is very short relative to the amount of resources and computers and to the average duration of each requirement. This corresponds to a rate of arrival of requests so high that the grid cannot process them in time; for lack of capacity and resources it quickly becomes congested. We call this situation magnetization by high congestion (magnetization 2). The second zone is where 0.0001 ≤ TMER ≤ 0.01; here the grid is within its range of operation, and there is no magnetization. The third zone is where 0.01 ≤ TMER; here the grid never becomes highly congested, and congestion is on average low or zero. We call this situation magnetization by vacancy (magnetization 1), because the network has almost all of its resources available to process incoming requests. All states shown are stationary states for each value of TMER. It is easy to see that in the non-cooperative case the percentage of lost or unattended requirements is high and is always above that of the cooperative scenario, which is fairly logical. Analyzing the cooperative case, because of its relevance, one can clearly see that for low values of TMER, where the grid is very congested, the number of lost requirements is very high in both cases, although it is higher in the non-cooperative case. From this analysis we conclude that the cooperative scenario is preferable because, as the simulation shows, the number of lost requirements is lower than in the non-cooperative scenario.


4.3 Dynamic Scenario Analysis

In this section we simulate the behavior of the grid when there are no predefined links between neighbors in the initial state. Here the links are generated according to the needs of each computer and to TMER. We can observe the evolution of the grid from the non-cooperative state to the cooperative state for each value of TMER. The links are therefore generated dynamically, according to requirements and demand, as a function of the parameter TMER (see Figure 4). During the simulation, snapshots are taken of the evolution of the grid during the search process explained above; by observing samples for different values of TMER, we can determine how long the transition lasts for each value of TMER (see Figures 5 and 6). Graphically, this process is represented by a gradual change of color from black, where the computer has no links, through shades of gray, to white, where the computer has generated all four possible links (see Figure 4). To know how many links every computer has at every step, we add counters that indicate the number of computers without links (L0), the number of computers with a single link (L1), and so on up to the number of computers with four links (L4). We therefore only need to observe the evolution of these counters during the simulation to determine when the steady state has been reached for each TMER. For a simulation of 1500 requirements per computer, the counters were sampled once every 50 requirements, which gives 30 samples. Figures 5 and 6 show that the steady state is reached between samples 20 and 30 (when the L4 counter reaches almost 100%). For small values of TMER (magnetization 2), at first glance one might expect that, at a very high level of congestion, links would be generated very quickly, but the simulation indicates otherwise. Within this range the generation of links is very slow; the explanation lies in the amount of lost requirements, which is about 45%, as shown in Figure 7. We interpret this as follows: the rate of arrival of requests is so high that the grid does not have enough processing power, so the computers do not have time to serve all the incoming requirements. Thus only a few computers can generate links, and most of the requirements are discarded.
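The link-generation rule and the counters L0 to L4 can be sketched as follows. This is an illustrative sketch only: it assumes periodic boundaries, that a link is recorded symmetrically on both computers, and that a congested computer picks at random among its free neighbors. None of these details are specified in the text, and the names `request_link` and `link_counters` are hypothetical.

```python
import random


def neighbors(i, j, L):
    """Four nearest neighbors of (i, j) on an L x L grid (periodic, assumed)."""
    return [((i - 1) % L, j), ((i + 1) % L, j),
            (i, (j - 1) % L), (i, (j + 1) % L)]


def request_link(links, free, i, j, rng):
    """Congested computer (i, j) looks for a neighbor with free resources.

    Returns True and records a permanent link if one is found;
    returns False (the requirement is lost) otherwise.
    """
    L = len(free)
    options = [(a, b) for a, b in neighbors(i, j, L) if free[a][b]]
    if not options:
        return False
    partner = rng.choice(options)
    links[(i, j)].add(partner)
    links[partner].add((i, j))  # symmetric link: an assumption of this sketch
    return True


def link_counters(links):
    """Return [L0, L1, L2, L3, L4]: computers having 0..4 links."""
    counts = [0] * 5
    for linked in links.values():
        counts[len(linked)] += 1
    return counts


L = 3
rng = random.Random(0)
links = {(i, j): set() for i in range(L) for j in range(L)}
free = [[False] * L for _ in range(L)]
free[0][1] = True                      # only one computer has spare capacity
print(request_link(links, free, 0, 0, rng), link_counters(links))
```

Sampling `link_counters` once every 50 requirements over a run of 1500 requirements yields the 30 samples mentioned in the text.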

Fig. 4. Static and dynamic cooperation scenarios as a function of TMER, versus the critical temperature of the Ising model

Fig. 5. Links distribution for different TMER in a 30 size sample


Another way of looking at this phenomenon is to observe the evolution of each counter during the simulation, as shown in Figures 8 and 9 for some representative TMER values at which the grid behaves differently.

Fig. 6. Distribution of Links in 20 samples of different TMER

Fig. 7. Proportion of requests serviced and lost in the dynamic scenario

Fig. 8a. Links evolution in computers for TMER



Fig. 8b. Links evolution in computers for TMER

Fig. 9. Evolution of the grid during the processing of 1500 requirements on each computer, for 6 representative TMER values


The figures show the status of the counters during the simulation for four TMER values, which describe different behaviors of the grid. The first two correspond to a high level of congestion: it is easy to see that at sample 20 the L4 counter has not yet reached its maximum. For TMER = 0.0020, in the steady state without magnetization, the L4 counter has reached its maximum by sample 20. For the last value, the slope of L4 is smaller, since the request rate is lower and the need to generate links is reduced. The parameter values in the particular case discussed above are related to one another: for instance, varying the number of resources of each computer also increases or reduces the total amount of resources in the grid, which changes the shape of the curve of β as a function of TMER. The same happens if we change the mean or the standard deviation of each request to be processed. For this reason, we chose to vary only the parameter representing the rate of arrival of requests, leaving the other parameters fixed.

5 Conclusions

The result of this work is an analogy between the Ising model and the dynamics of requirement processing within a grid of computers. It was observed that the grid reaches a specific virtual operating temperature as a function of TMER (in this case, with all other parameters fixed). To do this, the authors used maximum pseudo-likelihood estimation. The authors found three areas of operation in this analogy: two in which the grid is "magnetized" and one in which it is "non-magnetized". The grid behaves like a ferromagnet subjected to a temperature higher than the critical temperature of the Ising model. The two magnetization zones indicate different behaviors. One occurs when the grid supports high traffic (magnetization 2) and consequently loses a large percentage of requirements, since it is incapable of processing them. The other (magnetization 1) occurs when TMER is sufficiently high, so that the congestion level is low and only a few requirements are discarded. The states mentioned above were analyzed for two separate scenarios: cooperative and non-cooperative. In both cases the non-magnetized range is similar, 0.0001 ≤ TMER ≤ 0.01. Different kinds of curves were observed for each scenario. For the cooperative case, the minimum was at TMER ≤ 0.0019. For the other case, the minimum was at TMER ≤ 0.0011. Finally, the authors analyzed a dynamic process of link formation, in which the grid was completely uncooperative at the start and, after going through a transitional state, became almost completely cooperative, since not all computers generated the four links. Here we also identified three zones: the transient state, where the number of un-served requests is very high; another state, where the grid is clearly within the area of "no magnetization"; and the last state where, depending on the time elapsed, the grid has generated almost 100% of the links.



References

[1] Fortier, P., Howard, M.: Computer Systems Performance Evaluation and Prediction. Digital Press (2003)
[2] Menascé, D., Almeida, V., Dowdy, L.: Performance by Design: Computer Capacity Planning by Example. Prentice Hall PTR (2004)
[3] Albert, R., Barabási, A.-L.: Statistical mechanics of complex networks. Reviews of Modern Physics 74, 47–97 (2002)
[4] Huang, K.: Statistical Mechanics, 2nd edn. Wiley (April 1987)
[5] Griffeath, D.: Introduction to Random Fields. In: Denumerable Markov Chains, pp. 425–458. Springer, New York (1976)
[6] Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T.: Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd edn., p. 364. Cambridge University Press, Cambridge (1992)
[7] Bouman, C.A.: ICIP 1995 Tutorial: Markov Random Fields and Stochastic Image Models (1995)
[8] Peierls, R.E.: On Ising's model of ferromagnetism. Proc. Camb. Phil. Soc. 32, 477 (1936)
[9] Besag, J.: Efficiency of pseudolikelihood estimation for simple Gaussian fields. Biometrika 64, 616 (1977)