HELSINKI UNIVERSITY OF TECHNOLOGY Department of Engineering Physics and Mathematics
Heikki Uljas
Performance evaluation of a subscriber database with queuing networks
Master's thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Technology

Espoo, 30.06.2003

Supervisor: Professor Ahti Salo
Instructor: M.Sc. Jukka-Petri Sahlberg
HELSINKI UNIVERSITY OF TECHNOLOGY - ABSTRACT OF MASTER'S THESIS
Department of Engineering Physics and Mathematics
Author: Heikki Uljas
Department: Department of Engineering Physics and Mathematics
Major subject: Systems and Operations Research
Minor subject: Software Systems
Title: Performance evaluation of a subscriber database with queuing networks
Title in Finnish: Jonoverkot tilaajatietokannan suorituskykyanalyysissa
Number of pages: 64
Chair: Mat-2 Applied Mathematics
Supervisor: Professor Ahti Salo
Instructor: M.Sc. Jukka-Petri Sahlberg

Abstract:
Often, software development attempts to build software that is correct from the functionality point of view before considering other desirable qualities like security, availability and performance. Commonly the performance of the software is determined only using measurement, which is only possible at a late phase of development. Performance modeling provides means to predict the performance of software before it can be measured. Queuing networks are the most commonly used performance models for software.

The software studied in this thesis is a subscriber database that serves as a centralized storage of subscriber preferences for other systems in the network. The database has a capacity of several million subscribers and can serve several thousands of read requests per minute. The thesis studies the performance of one of the services of the subscriber database on three different hardware platforms using queuing network models. The model service time parameter values are estimated on one of the platforms using measurement and are then projected to match the other two platforms based on the hardware performance differences. The results obtained from the model are compared to results from measurement.

According to the literature, queuing network models are suitable for software performance modeling. The results obtained in the thesis confirm this view.

Key words: Queuing networks, Software performance models
TEKNILLINEN KORKEAKOULU - ABSTRACT OF MASTER'S THESIS
Department of Engineering Physics and Mathematics
Author: Heikki Uljas
Department: Department of Engineering Physics and Mathematics
Major subject: Systems and Operations Research
Minor subject: Software Systems
Title: Jonoverkot tilaajatietokannan suorituskykyanalyysissa
Title in English: Performance evaluation of a subscriber database with queuing networks
Number of pages: 64
Chair: Mat-2 Applied Mathematics
Supervisor: Professor Ahti Salo
Instructor: M.Sc. Jukka-Petri Sahlberg

Abstract:
Software development has often attempted first to build software that works correctly from the functional point of view before considering other desirable qualities such as security, availability or performance. Performance is often determined only by measurement, which is possible only at a late phase of software development. Modeling makes it possible to predict the performance of software before it can be measured. Queuing networks are the most commonly used models for describing software performance.

The software studied in this work is a subscriber database that acts as a centralized storage system of subscriber-specific settings for the other systems in the network. The system can store millions of subscribers and can serve several thousands of read requests per minute. The work studies the performance of one service provided by the software on three different hardware platforms using queuing network models. The service time parameters of the model are first estimated by measurement on one hardware platform, after which they are transformed to correspond to the two other platforms by comparing the differences in hardware performance. Finally, the results obtained from the model are compared to results obtained by measurement.

According to the literature, queuing networks are suitable for modeling software performance. The results of this work confirm this view.

Key words: Queuing networks, Software performance models
Preface

I want to thank my instructor M.Sc. Jukka-Petri Sahlberg and M.Sc. Vesa Kärpijoki for their help and patience. I would also like to thank my supervisor Professor Ahti Salo for good advice and support.

Espoo, 30th of June 2003

Heikki Uljas
Table of contents

1 INTRODUCTION
2 SOFTWARE PERFORMANCE EVALUATION
  2.1 Software performance engineering
  2.2 Limits to performance
  2.3 Evaluation techniques
    2.3.1 Measurement
    2.3.2 Simulation
    2.3.3 Analytical modeling
3 QUEUING NETWORK MODELS
  3.1 Single service center queue
    3.1.1 Little's law
  3.2 Network of centers
    3.2.1 Types of service centers
    3.2.2 Types of workload
    3.2.3 Product form networks
  3.3 Stochastic analysis
    3.3.1 M/M/1 queue
    3.3.2 M/M/m queue
  3.4 Operational analysis
    3.4.1 Basic quantities
    3.4.2 Utilization law
    3.4.3 Interactive response time law
    3.4.4 Job flow analysis
    3.4.5 General response time law
  3.5 Comparison of operational analysis and stochastic analysis
4 SOLUTION TECHNIQUES FOR QUEUING NETWORK MODELS
  4.1 Bounding techniques
  4.2 MVA algorithm
    4.2.1 Single class MVA
  4.3 Hierarchical decomposition
  4.4 Software performance modeling
    4.4.1 SPE execution models
    4.4.2 Method of surrogates
  4.5 Sensitivity analysis
    4.5.1 Uncertainties in model input
5 PERFORMANCE EVALUATION OF A SUBSCRIBER DATABASE
  5.1 Technical description
  5.2 Shortcomings in earlier performance
  5.3 Performance modeling
    5.3.1 Service center identification
    5.3.2 Workload definition
    5.3.3 Service time estimation
    5.3.4 Service time transformation
  5.4 Results
    5.4.1 Production hardware A
    5.4.2 Production hardware B
  5.5 Summary of the results
6 CONCLUSIONS
REFERENCES
1 Introduction
Traditionally, software development is a functionality-centered process. It attempts to build software that gives correct results before considering other desirable qualities of software such as security, availability and performance. The difficulty with performance is often that it is not addressed until a late stage of development, when it can finally be measured. Performance modeling provides means to predict the performance of software before it can be measured. When used in combination with other techniques such as prototyping, it can reduce the performance-related risks in software development. The problem with modeling is that it is not usually part of the software development process, and the people developing the software are not familiar with modeling techniques or the theory behind them. It is also not clear what kind of effort modeling requires and how well the performance of modern software can be described with models.

Performance models are usually divided into analytical and simulation models according to the applicable solution techniques. Queuing network models are the most commonly used analytical performance models for computer systems and software. The basic principle in queuing network models is that when a resource is reserved by one customer, other customers have to wait for it in a queue. A comprehensive theory and a wide selection of solution techniques exist for queuing networks [7].

The software studied in this thesis is a subscriber database that is a centralized storage of subscriber preferences for other systems in the network. The software has a distributed architecture for scalability and fault tolerance. The database itself has a capacity of several million subscribers and the system can serve several thousands of read requests per minute. The software is available on several different hardware configurations.
The goal of the thesis is to study the feasibility of queuing network models in the performance evaluation of the subscriber database product. The scope of the thesis is limited to a single service of the subscriber database: a subscriber preferences read service dedicated to a specific client, which has high performance requirements for throughput and response time. The model is built to determine the performance under the same kind of artificial conditions used in the performance tests; that is, the load on the system is generated artificially and the data content of the database is artificially generated.

The thesis starts with a description of software performance evaluation in Chapter 2. Chapter 3 provides a general description of queuing network models and elementary results from queuing theory. Chapter 4 concentrates on describing different techniques for solving queuing network models. Chapter 5 describes the performance evaluation of the subscriber database and Chapter 6 contains the conclusions.
2 Software performance evaluation
Traditional software engineering methods have taken the so-called fix-it-later [23] approach towards performance. In this approach the focus is on the functional requirements of the system. The non-functional requirements, such as security, availability, reliability and performance, are typically taken into consideration in a later phase of the development process, i.e. integration and testing [13]. If performance problems are discovered at this point, the software is tuned to correct them or additional hardware is used. It is clear that the fix-it-later approach is undesirable. Performance problems may be so severe that they require extensive changes to the system. These changes will increase the development cost, delay deployment or affect other desirable qualities of a design, such as understandability, maintainability or reusability [23]. Some common reasons for performance problems in software development are [13]:

• Lack of scientific principles and models. Conventional engineers use scientific principles and models based on mathematics, physics and computational science to support their design process. Software engineers do not need to rely on formal and quantitative models; they can write code without formal methods.
• Education. Topics of computer performance evaluation are not taught to graduates of computer science.
• Single-user mindset. Most system designers and programmers who develop systems do not realize that the code they are writing will be executed by many concurrent requests.
• Small database mindset. Code that accesses the database is usually written without taking into account the size of the database.
2.1 Software performance engineering

The traditional phases of the software development process are the following:

• Project planning. The requirements for the software, resource allocation and scheduling are defined.
• Design. A detailed design that describes how the requirements are implemented is constructed.
• Implementation. Implementation is done according to the design.
• Testing. It is verified that the software satisfies the requirements.
• Introduction. The software is deployed into the real environment.
Software performance engineering (SPE) was developed to aid in designing software systems that meet performance requirements [24]. It attempts to estimate the performance at the design phase with performance models and to compare design alternatives. The additional activities introduced by SPE for the software development process phases are:

• Project planning. Tasks required by SPE are included in the project plan. Potential performance risks are identified and the cost of SPE tasks is estimated [21]. Based on the risk assessment, it is determined how much effort will be put into SPE activities [24]. Performance requirements are defined as part of the software requirements. Resource requirements for hardware, network architecture and software services are specified [21]. Performance-critical use cases are identified: those that are important to performance or those for which there is a performance risk [24].

• Design. On the basis of the performance requirements, an initial performance model is established for each scenario using estimated values. This model is used to check the proposed rough architecture and to identify potential performance bottlenecks [21]. Alternatives are compared using a crude model. Refined requirements to be used in design are derived from the model [21]. Benchmarks are used to measure the performance of required services (hardware, middleware, security). As the design becomes more detailed, the model is detailed further. The more detailed information can be obtained from measurements of existing systems, benchmarks, manufacturers' information and estimates. The performance-critical sections of the system are prepared for instrumentation at the design level. More detailed performance requirements are derived for implementation purposes. [21] The performance model is used when considering various design alternatives [24]. If the determined performance is not satisfactory, means of performance improvement are investigated, considering also other aspects. These means include changing the design, increasing the planned resource requirements (which increases the overall cost), defining stricter development requirements and reducing the technical demands [21].

• Implementation. The assumptions established for each model during analysis and design are replaced with the latest performance data during implementation [21]. This data is kept up to date during implementation, and the development of the actual performance is monitored [24]. Instrumentation is inserted into the performance-critical sections of the system.

• Testing. System performance is tested under conditions resembling the anticipated conditions as closely as possible. The performance models are checked for validity using the performance test data. Possible performance anomalies, i.e. unexpected performance data or behavior, are identified. [21]

• Introduction. The performance behavior of the information system is reviewed in pilot installations over a longer period under real operational conditions [21]. Performance data from the real environment is collected to provide feedback for development [21]. The model is used to estimate the performance effects of changes in the system [24].
The activities from the above list are illustrated in Figure 2.1. The activities shown are risk management, requirement analysis, performance modeling and performance measurement.

Figure 2.1. SPE activities in software development.
2.2 Limits to performance

The performance of computer systems depends on executing elements that reserve and compete for the system resources. Performance is most commonly measured with time: the time it takes for the system to complete a task (response time), or the number of operations performed within a time interval (throughput) [7]. Resource utilization is also often used as a performance measure [7], such as the amount of memory used by the application or the amount of disk I/O.
In modern computer systems the elements of execution are divided into processes and lightweight processes called threads [26]. They perform tasks for applications or human users, utilizing the resources in the system. The performance of a computer system is limited by service time and resource contention. Service time is the time required for completing a task. All hardware devices in computer systems have physical boundaries that limit the amount of work they can perform per time unit. For example, the speed of a processor is governed by its clock rate and instruction length, and hard disks are limited by the rotational speed and density of the disk.

In early computer systems the operating systems allowed the applications or processes to execute only sequentially [26]. When a process started execution it had control over all resources in the system until it was finished. There was always only one active process executing in the system, while all others waited for their turn. Systems with such behavior were called simple batch systems [26] and their performance was limited by the performance of the hardware. The problem with simple batch systems was that since the I/O devices were slow compared with the processor, the processor was often idle. In a multiprogramming system there are many concurrently active processes. When a process begins to wait for I/O, it is switched out from the processor and another process is switched in. [26] From the performance point of view, multiprogramming means two things:

• The overall performance of the system is improved, since processes can use different resources independently.
• The performance of an individual process may become worse because of resource contention.

Resource contention is caused by competition for hardware resources. When a resource is reserved by one process, other processes have to wait for it to become free. In such a case, performance is determined by a combination of wait time and service time. The gain in performance from resource sharing is generally greater than the loss due to resource contention. [26]
Hardware devices such as the processor or disk are usually limited by both service time and resource contention. Executing processes wait for the device, use it for service and release it for others to use. The reservation of these devices is handled by the operating system. Devices that have high capacity compared to their load do not necessarily have resource contention, but do have service time. Some resources may be of delay type, such as users accessing the system through a user interface. A user executes some operation and then spends some time looking at the results. This time is in general called the think time [7]. Some resources are required for processing but do not perform any work. Memory is an example of such a resource, where the limiting factor is resource contention alone. Critical software resources such as locks or semaphores used in synchronization [26] also fall into this category. These three types of resources are listed in Table 2.1.

Table 2.1. Different types of resources in computer systems.

Resource type                            Service time   Resource contention   Examples
Service time with limited capacity       Limited        Limited               CPU and disk
Service time with infinite capacity      Limited        None                  Network delay, human users
No service time with limited capacity    None           Limited               Memory, software resources
2.3 Evaluation techniques

The most common technique used in performance evaluation is measurement. In every system where performance is important, the performance is at least measured. A more rarely used approach is modeling. Two types of models are used, analytical and simulation models, depending on the solution method used. These two modeling techniques can also be combined through hybrid modeling [7], [9]. Which technique is used depends on the development stage of the system, the required accuracy of the results and the resources available for the evaluation. Any of these techniques might give results that are misleading or wrong [7]. Performance evaluation need not be done using only one of the techniques; different sections of a study might be solved using different techniques. For example, the system could be divided into sub-systems and the performance of each sub-system might be determined using a different technique. The total performance could then be determined using either a simulation or an analytical model. Another example is the use of measurement in the parameterization of models.
2.3.1 Measurement

Measurement is the most fundamental technique for performance evaluation. It is commonly used to verify that a system meets its performance requirements. It is also used with analytical and simulation modeling to obtain parameter values for the models and to validate the results. In a normal situation there is a system with a specific configuration and a workload running on that system. In addition, the measurement is affected by random fluctuation. [1]

When conducting a measurement, it is important to collect data for all four quantities shown in Figure 2.2. Careful tracking of configuration changes and monitoring of the workload levels is required. The random noise can be handled using statistical analysis of the results. [1]
Figure 2.2. Conceptual model of a system under measurement.
The measurement technique is the most accurate of the techniques if the premises are correct and the measurement is conducted appropriately. It is commonly used and therefore credible. Measurement requires some time for preparing the test, collecting the data and analyzing the results. On the other hand, it is prone to disruptions and can be very time consuming if there are unexpected difficulties. The most serious drawbacks of measurement are that it may be very expensive and that it requires a functioning system [7], [20]. These problems can be addressed with prototyping, but this usually requires some kind of model for uncertainty analysis and for estimating the total performance of the system, since only part of the functionality is covered by the prototype. Another drawback is that the measurement technique is likely to give accurate knowledge of system behavior under one set of assumptions, but not any insight that would allow generalization [11].
2.3.2 Simulation

The simulation technique involves the construction of a simulation model of the system's behavior and running it with an appropriate abstraction of the workload. It is used when analytical modeling is infeasible: it might be difficult to construct an analytical model of the system, or the model could be impossible to solve by analytical means. The major advantage of simulation is that almost any kind of behavior can be simulated. [9] The parameters that correspond to the workload and configuration of the actual system are given to the simulation model as input. The model is evaluated repeatedly and random variables are used in every evaluation round. Simulation data is received as a result, which is then analyzed to obtain performance measures. These concepts are illustrated in Figure 2.3.
Figure 2.3. Conceptual model of a system for simulation.
Simulation models are more comprehensive than analytical models in the sense that more details can be included in them. If the model is sufficiently detailed, it will provide accurate results. Building a simulation model may require a long time if no ready-made simulation software is available. Simulation does not depend on the development stage of the system, so it can be used, for example, in the design phase. [7]
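As an illustration of the simulation technique described above, the following sketch simulates a single first-come-first-served queue with randomly generated arrival and service times and estimates the mean response time from the simulation data. The function name and parameter values are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
# Minimal discrete-event style simulation of a single FCFS queue.
import random

def simulate_single_queue(arrival_rate, service_rate, num_customers, seed=1):
    """Simulate a single-server FCFS queue and return the mean response time (s)."""
    random.seed(seed)
    arrival_clock = 0.0      # time of the current arrival
    server_free_at = 0.0     # time at which the server becomes idle
    total_response = 0.0
    for _ in range(num_customers):
        arrival_clock += random.expovariate(arrival_rate)   # next arrival
        start = max(arrival_clock, server_free_at)          # wait if the server is busy
        service = random.expovariate(service_rate)          # random service time
        server_free_at = start + service
        total_response += server_free_at - arrival_clock    # wait time + service time
    return total_response / num_customers

# Example run: 10 000 customers, 50 req/s arrivals, 10 ms mean service time.
print(simulate_single_queue(arrival_rate=50.0, service_rate=100.0,
                            num_customers=10_000))
```

The simulated mean response time is itself a random quantity, which is why, as noted above, the simulation output has to be analyzed statistically to obtain performance measures.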
2.3.3 Analytical modeling

Analytical modeling involves the construction of a mathematical model of the system behavior, after which the model is solved analytically. The problem with this approach is that the domain of models solvable by analytical means is limited. Thus, analytical modeling will fail if the system must be studied in great detail; simulation can be used to solve more detailed models. [9] Analytical models are parameterized to describe the workload and system configuration. Sensitivity analysis is used to determine the sensitivity of the performance estimates to the model parameters [22]. The estimates with their confidence intervals are obtained as results from the model. These concepts are illustrated in Figure 2.4.
Figure 2.4. Conceptual model of a system for analytical modeling.
Analytical models tend to be less accurate than simulation models in the sense that very complex models cannot be solved by analytical means. However, analytical modeling is usually the cheapest and fastest of the techniques [7], and it can also be used with incomplete systems [20]. Analytical modeling can provide valuable insight into the functioning of the system even if the results are not correct [11], since the system has to be understood before the model can be built. Very abstract and simple models can provide surprisingly accurate estimates of system performance, and results from the analysis have better predictive value than those obtained from measurement or simulation [9]. The most commonly used analytical models are queuing network models [13], [2]. Simpler models can also be used, such as the software execution model [24].
3 Queuing network models
This chapter describes the basics of queuing network models to the extent that they are used in the later sections of this thesis. The main focus is on simple results that can be easily adopted in software performance evaluation. Sophisticated stochastic analysis is outside the scope of this thesis. In a queuing network model the system is represented as a network of service centers, which represent system resources, and customers, which represent users or requests. Queuing networks are used because their structure closely resembles the structure of computer systems; operating systems, for example, use queue-like structures to handle resource contention. Queuing network models have been widely used in computer system performance modeling. Experience indicates that queuing network models can be expected to be accurate to within 5% to 10% for utilization and throughput and to within 10% to 30% for response times [11].
3.1 Single service center queue

In its simplest form, a queuing network consists of a single service center. Customers arrive at the center, wait their turn in the queue if necessary, receive service and depart. This simple queue is illustrated in Figure 3.1.
Figure 3.1. Single service center queue.
The basic quantities used to describe a center are:

• Arrival rate (λ): The number of arrivals to the center per unit time. Expressed in requests or jobs per second (req/s or jobs/s).
• Service time (S): The amount of time the customer spends in service. Expressed in seconds or milliseconds (s or ms).
• Number of devices (m): The number of devices serving the requests at the center.

The basic performance measures obtained from a center are:

• Utilization (U): The fraction of time the center has a customer, i.e. is busy. For a center with multiple devices this definition does not take into account the number of devices that are reserved; an alternative definition as the average number of customers in service is more accurate in such a case. Utilization is expressed as a percentage, where 100% means a fully utilized service center in the single-device case. The maximum utilization for a multiple-device center is m·100%.
• Throughput (X): The number of departures from the service center per unit time. Expressed in requests or jobs per second (req/s or jobs/s).
• Response time (R): The time the customer spends in the center, including both the wait and service times. Usually expressed in seconds or milliseconds (s or ms).
• Queue length (Q): The number of customers in the center, including both the customers waiting for service and those receiving it.
3.1.1 Little's law

The following relationship is known as Little's law; it was first proved in [12]. It states that

N = λR    (3-1)

where N is the average number of customers in the system, λ is the arrival rate to the system and R is the average time spent in the system. Little's law holds for all systems where the number of arrivals equals the number of departures. It is important because it can be used with most queuing systems, and it applies both to single service centers and to networks of centers.
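As a brief illustration of how Little's law is applied in practice, the numbers below are invented for this example and do not come from the thesis; any one of the three quantities can be derived from the other two.

```python
# Little's law: N = lambda * R.
arrival_rate = 50.0          # lambda, requests per second (assumed)
mean_response_time = 0.2     # R, seconds (assumed)

mean_customers_in_system = arrival_rate * mean_response_time   # N = 10.0

# Rearranged: estimate the mean response time from a measured average
# population and throughput (valid when arrivals equal departures).
measured_population = 8.0    # customers observed in the system on average
measured_throughput = 40.0   # req/s
estimated_response_time = measured_population / measured_throughput  # 0.2 s
```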
3.2 Network of centers

In a network of centers, multiple centers are connected to each other. Requests flow through the system, receiving service at different service centers. When a customer departs from a center it proceeds to another until its service requirements have been met. In an open network, like the one shown in Figure 3.2, the customer population of the system is infinite and customers arrive at and depart from the system freely. The number of customers in the system varies over time.
Figure 3.2. Open queuing network.
In a closed network the customer population is fixed, so customers never leave or arrive at the system. Figure 3.3 shows an example of a closed network. Closed networks are used to model systems that have some constraint on the number of customers. However, open networks are easier to analyze than closed ones.
Figure 3.3. Closed queuing network with its open subnetwork.
The basic quantities used with queuing networks are:

• Arrival rate (λk): The number of arrivals to the kth center per unit time. Used to describe the workload in open networks.
• Routing probability (pkl): The probability that a customer departing from the kth center proceeds to center l.
• Service time (Sk): The amount of time a customer spends in service at center k.
• Number of devices (mk): The number of devices serving the requests at the kth center.
• Number of centers (K): The number of centers in the network.
• Number of customers in the network (N): The sum of the customers in all centers of the network. Fixed in a closed network.
• Think time (Z): The time that a customer spends thinking between visits to the system. Used when modeling interactive systems.

The basic performance measures obtained from the network are:

• Center utilization (Uk): The utilization of center k.
• System throughput (X): The throughput of the whole network.
• Center throughput (Xk): The throughput of the kth center.
• System response time (R): The response time of the whole network.
• Center response time (Rk): The response time of the kth center.
• Queue length (Qk): The number of customers in the kth queue.

The units used with a network are the same as those used with a single service center.
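To make these definitions concrete, the sketch below represents an open network by its external arrival rates and routing probabilities and computes the per-center arrival rates from the flow-balance (traffic) equations λk = λ0k + Σl λl·plk. The representation, the function name and the two-center example are illustrative assumptions and are not the notation or the system used later in the thesis.

```python
# Per-center arrival rates in an open network, solved by fixed-point iteration.
def center_arrival_rates(external_rates, routing, iterations=1000):
    """external_rates[k]: external arrival rate to center k (req/s).
    routing[l][k]: probability that a customer leaving center l goes to center k
    (rows may sum to less than 1; the remainder leaves the network)."""
    K = len(external_rates)
    rates = list(external_rates)
    for _ in range(iterations):
        rates = [external_rates[k] + sum(rates[l] * routing[l][k] for l in range(K))
                 for k in range(K)]
    return rates

# Hypothetical example: requests enter center 0 (CPU) at 40 req/s; after the CPU,
# 60% of requests visit center 1 (disk) and then always return to the CPU.
external = [40.0, 0.0]
routing = [[0.0, 0.6],   # CPU -> disk with probability 0.6, otherwise departs
           [1.0, 0.0]]   # disk -> CPU
rates = center_arrival_rates(external, routing)        # [100.0, 60.0] req/s
service_times = [0.005, 0.012]                          # Sk in seconds (assumed)
# Uk = lambda_k * Sk for single-device centers (the utilization law of Section 3.4).
utilizations = [r * s for r, s in zip(rates, service_times)]
```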
3.2.1 Types of service centers

There can be different kinds of service centers in a queuing network. Here they are divided into load-independent, load-dependent and infinite service centers, due to the differences in the analysis they require. In a load-independent service center the service rate is independent of the number of customers in the center. An example of this kind of service center is a subsystem with a single device serving the requests, such as a single processor. Load-dependent and infinite service centers are examples of service centers where the service rate depends on the number of customers in the center. A load-dependent center can be used to model any kind of load-dependent behavior. Commonly, load-dependent centers are used for subsystems where multiple devices serve a single queue, but they are also used when a single device or subsystem has load-dependent behavior. The infinite service center is a special case of a load-dependent service center where the customer never waits for service: the number of devices serving the customers is always larger than the number of customers at the center. Infinite service centers are also sometimes referred to as delay centers. They are used to model resources that have infinite capacity but some service time.
3.2.2 Types of workload

The workloads of a system can be divided into transaction, batch and interactive workloads [11]. In a transaction workload the customers arrive at and depart from the system freely; therefore an open network is used for modeling it. The workload intensity is defined by the rate of arrivals to the system. In a batch workload the number of customers in the system remains constant. In an interactive workload the number of customers in the system does change, due to the time they spend thinking between operations. The batch workload can be thought of as a special case of the interactive workload where the think time is zero. Closed networks are used to model systems with batch or interactive workloads, while a transaction workload is modeled with an open network. The workload intensity of an interactive workload is defined by the number of customers in the system and the think time. The think time of an interactive workload is included in the model as an infinite service center, as shown in Figure 3.4.
Figure 3.4. Network with interactive workload.
3.2.3 Product form networks

A major difficulty in the analysis of a network of queues is that the state of a single queue may depend on the state of other queues. Fortunately, a large subset of queuing networks, called product form networks [7], allows the queues to be analyzed separately.
Product form networks have the property that the measures of performance, such as mean queue lengths or response times, are insensitive to the higher moments of the distributions of service times and depend only on the mean service time. Furthermore, they do not depend on the order in which the service centers are visited, nor on the number of times a service center is visited, but only on the total demand that a customer places on a service center. [19]

A network is a product form network if it has the following properties:

• The interarrival-time distribution is exponential for an open network. [9]
• Each center in the network is one of the following three types:
  1. A delay center with a general service-time distribution for each class. [9]
  2. An FCFS (First-Come-First-Served) center with the same exponential service time for each class. The center may be load-dependent, but the service rate may only depend on the total number of customers at the center. [7]
  3. A PS (Processor-Sharing) center with an arbitrary service-time distribution for each class. The center may be load-dependent, with some restrictions on the load dependency among different classes. [9]
• For each class, the number of arrivals to a center must equal the number of departures from the center. [7]
• A customer may not be present at two or more centers at the same time. [7]

FCFS and PS are examples of queuing disciplines. A queuing discipline describes the order in which the customers are served at the center. With FCFS, the customer that arrives at the center first is served first. With PS, the center is shared by the customers and they all receive service at the same time. [7] The list presented above is not complete; a more complete listing can be found in [7] or [9]. The exact mathematical representation of product form networks can be found in [14].
3.3 Stochastic analysis

In the traditional stochastic analysis of queuing networks the queues are modeled as random processes whose state is defined by the number of customers in them. A random process describes the probabilities of the states of the system over time. Service times and the times between arrivals are modeled as random variables with known probability distributions. A shorthand notation of the form A/S/m is used to describe a single queue, where the letters stand for the arrival process, the service time distribution and the number of devices in the queue [14]. An important and widely used probability distribution is the exponential distribution [5], which is denoted by the letter M. The letter M refers to the memoryless property of the distribution [14]. For a service time, memorylessness means that the remaining service time does not depend on the time the customer has already spent in service.
3.3.1 M/M/1 queue

The simplest possible queue is the M/M/1 queue. It is a single-device queue with Poisson arrivals and exponential service times. This queue is an example of a load-independent center. The customers arrive at and depart from the M/M/1 queue one at a time. The state transition diagram shown in Figure 3.5 can be used to illustrate the transitions between the different states.

Figure 3.5. State transition diagram for the M/M/1 queue.
Since only one customer arrives at or leaves the system at a time, the transitions in the system occur only between adjacent states. Since the arrival of a new customer is independent of the system's state, the arrival rate λ is the same for every state. Furthermore, since the completion of a customer is independent of the system's state, the service rate µ is the same for every state. The service rate of the M/M/1 center is [14]:
µ = 1/S    (3-2)

where S is the average service time of a customer. The probability of having n customers in the center is [14]:

p(n) = (1 − ρ)ρ^n    (3-3)

where the term ρ = λ/µ is called the traffic intensity.

The utilization of the service center is given by the probability of having more than 0 customers in the system. From (3-3) we get

U = 1 − p(0) = ρ    (3-4)

The average queue length is given by [14]:

Q = ρ/(1 − ρ) = U/(1 − U)    (3-5)

Using Little's law (3-1) with (3-5), the average response time for the queue is

R = (1/µ)/(1 − ρ) = S/(1 − U)    (3-6)
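The M/M/1 results above translate directly into a few lines of code. The following sketch is illustrative only; the function name, parameter names and example values are not taken from the thesis.

```python
# M/M/1 performance measures from equations (3-2) to (3-6).
def mm1_measures(arrival_rate, service_time):
    """Return utilization, mean queue length and mean response time of an
    M/M/1 queue with the given arrival rate (req/s) and mean service time (s)."""
    rho = arrival_rate * service_time            # traffic intensity, rho = lambda / mu
    if rho >= 1.0:
        raise ValueError("queue is unstable: arrival rate exceeds the service rate")
    utilization = rho                             # U = rho            (3-4)
    queue_length = rho / (1.0 - rho)              # Q = rho / (1 - rho) (3-5)
    response_time = service_time / (1.0 - rho)    # R = S / (1 - U)     (3-6)
    return utilization, queue_length, response_time

# Example: 50 req/s with a 10 ms service time gives U = 0.5, Q = 1.0, R = 20 ms.
print(mm1_measures(50.0, 0.010))
```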
3.3.2 M/M/m queue

A commonly required extension of the M/M/1 queue is a variant with m identical devices serving the requests. This kind of queue is called the M/M/m queue. It is an example of a load-dependent center, where the service rate of the center µ depends on the number of customers in the center n. The service rate of the center is [7]:

µ(n) = n·µ0, when n ≤ m
µ(n) = m·µ0, when n > m

where µ0 = 1/S is the service rate of a single device.
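As a sketch of this load-dependent behavior (assuming, as above, that µ0 = 1/S is the per-device service rate; the function name and example values are illustrative only):

```python
# Load-dependent service rate of an M/M/m center: with n customers present,
# min(n, m) devices are busy, so the center completes work at rate min(n, m) * mu0.
def mmm_service_rate(n, m, service_time):
    """Service rate of an M/M/m center with n customers, m devices and
    per-device mean service time S (mu0 = 1/S)."""
    mu0 = 1.0 / service_time
    return min(n, m) * mu0

# Example: a center with 4 devices and a 20 ms service time completes at most
# 4 * 50 = 200 requests per second, regardless of how many customers are queued.
print(mmm_service_rate(2, 4, 0.020))   # 100.0 req/s
print(mmm_service_rate(10, 4, 0.020))  # 200.0 req/s
```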