Int. J. of Knowledge Society Research, Vol. x, No. x, xxxx
Deriving In-Depth Knowledge from IT-Performance Data Simulations

Konstantin Petruch
Deutsche Telekom AG, Bonn, Germany
E-mail: [email protected]

Vladimir Stantchev*
FOM Hochschule für Oekonomie und Management
Bismarckstr. 107, 10625 Berlin, Germany
Telephone: +49 30 318-6212
E-mail: [email protected]
*Corresponding author

Gerrit Tamm
SRH Hochschule Berlin, Germany
E-mail: [email protected]
Abstract: Knowledge of behavioral patterns of a system can contribute to an optimized management and governance of the same system, or of similar other systems. While human experience often manifests itself as intuition, intuition can be notoriously misleading, particularly in the case of quantitative data and subtle relations between different data sets. In this article we aim to augment managerial intuition with knowledge derived from a specific byproduct of automated transaction processing: performance and log data of the processing software. More specifically, we consider data generated by incident management and ticketing systems within IT support departments. Raw data from such systems can be evaluated both qualitatively and quantitatively. In our approach we utilize a rigorous analysis methodology based on System Dynamics. This allows us to identify real causalities and hidden dependencies between different datasets. We can then use them to derive and assemble knowledge bases for improved management and governance in this context. This approach is able to provide more in-depth insights as compared to typical data visualization and dashboard techniques. In our experimental results section we demonstrate the feasibility of the approach. We apply it to real-life datasets and log files from an international telecommunications provider and consider different improvements in management and governance that result from it.
Keywords: IT Governance, Knowledge Extension, Simulation, Cloud Governance

Copyright © 2011 Inderscience Enterprises Ltd.
Reference to this paper should be made as follows: Petruch, K. et al. (xxxx) 'Deriving In-Depth Knowledge from IT-Performance Data Simulations', Int. J. of Knowledge Society Research, Vol. x, No. x, pp. xxx-xxx.
1 Introduction

The ubiquity and fast availability of computer-based processing systems allows enterprises and organizations to conduct a wide range of operative processes electronically. This results not only in improved processing with respect to time and quality, but also in (more or less) automatically stored performance data about the processing (e.g., how many requests were processed, what was the duration of each processing task). Such data is often generated as a byproduct of these systems and is often neglected as a source of potential knowledge. Furthermore, data gathered by such operative systems often hides further nontrivial insights. Visualization can be a first step towards a more in-depth data comprehension. This is the application domain of business dashboards. One specific example is the usage of Google Analytics (www.google.com/analytics/) to visualize web server log files. More complex application scenarios involve the aggregation of multiple data sources and the subsequent analytical processing of data within a data warehouse. Data logged by operative transaction processing systems can provide insights about two distinct types of measurements: key goal indicators (KGIs), which provide insights about the results of an operative task, and key performance indicators (KPIs), which define the way these results were achieved (e.g., speed, transaction rate). When we define such indicators with respect to specific business processes we can apply an approach known as process mining [7]. In this article (an extended version of a paper we presented at WSKS 2011) we extend this approach as follows: (1) we aim to derive more in-depth knowledge with respect to both qualitative and quantitative assessments, and (2) we introduce and use more complex simulation techniques to accomplish this objective.
Our focus lies on KPIs and KGIs: we address the question whether an extended data analysis using simulation methodologies can provide additional value as compared to standard data visualization.
1.1 Work Structure

The rest of this article is structured as follows: Section 2 presents the state of the art in the measurement of process indicators, the terminology we use, as well as related work in the area of knowledge discovery. In Section 3 we posit our hypothesis and give an overview of our assessment framework for indicators in the
area of IT operations. In Section 4 we describe our experimental results regarding a specific data transformation and simulation process that we have assessed based on real-life data from an international telecommunications provider. Section 5 contains a summary of our experimental results and an outlook on our future research activities.
2 Preliminaries

In this section we present the motivation for performance and output metrics. We also discuss the possibilities for data and log file analysis in this context and reflect on specific aspects of knowledge discovery and generation.
2.1 Concepts of Indicators

As stated above, indicators can be generally divided into two groups: key performance indicators (KPIs) and key goal indicators (KGIs). KPIs measure how well a process is performing and are expressed in precisely measurable terms. KGIs represent a description of the outcome of the process, often have a customer and financial focus, and can typically be measured after the fact has occurred [8]. While KGIs specify what should be achieved, KPIs specify how it should be achieved.
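To make the distinction concrete, the following sketch computes one indicator of each kind from a handful of made-up ticket records (the records and the SLA flag are illustrative assumptions, not data from the study):

```python
from datetime import datetime

# Hypothetical ticket records: (opened, closed, resolved within SLA?)
tickets = [
    (datetime(2011, 5, 2, 9, 0), datetime(2011, 5, 2, 11, 30), True),
    (datetime(2011, 5, 2, 10, 0), datetime(2011, 5, 3, 10, 0), False),
    (datetime(2011, 5, 3, 8, 0), datetime(2011, 5, 3, 9, 0), True),
]

# KPI: how the results are achieved, e.g. mean resolution time in hours.
hours = [(closed - opened).total_seconds() / 3600 for opened, closed, _ in tickets]
kpi_mean_resolution = sum(hours) / len(hours)

# KGI: the outcome itself, e.g. the share of tickets resolved within the SLA.
kgi_sla_ratio = sum(1 for *_, within_sla in tickets if within_sla) / len(tickets)

print(f"KPI (mean resolution time): {kpi_mean_resolution:.2f} h")  # 9.17 h
print(f"KGI (SLA fulfilment ratio): {kgi_sla_ratio:.0%}")          # 67%
```

The KPI describes the way of working (speed), while the KGI describes whether the goal was met, independent of how.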
2.2 Objectives of Data and Log File Analysis

Various algorithms [30, 3] have been proposed to discover different types of models based on a log file. A special issue of Computers in Industry on process mining [31] offers more insights. In the context of process model verification, several notions of equivalence of process specifications have been developed, such as behavioral equivalence [32, 28], trace equivalence, and bisimulation [33]. Traditional equivalence notions like bisimulation or trace equivalence are defined as a verification property which yields a yes-or-no boolean value, but no insights on the degree of equivalence. When comparing a reference model with a process model, it is not realistic to assume that their granularities are the same. Therefore, the equivalence analysis with classical equivalence notions will most likely not be conclusive. In the context of process mining we should apply notions searching for behavioral similarity. Examples include the causal footprint [32] and the fitness function [28]. In [32], the authors introduce an approach for determining the similarity between process models by comparing the footprint of such models. The footprint describes two relationships between activities, the so-called look-back and look-ahead links, and returns the degree of process similarity expressed in [0, 1]. This value is not conclusive and requires further explanation. It is not possible to trace the missing or differing activities. Since traceability is an important requirement of the organization, the approach is not suitable in general. A specific focus on such organizational patterns can be found in [9]. In [28], the authors introduce the behavioral and the structural precision and recall. The behavioral equivalence of the process models compares a process model with respect to some typical behavior recorded in log files. The structural precision and recall equate the term "structure" with all firing sequences of a Petri net
that may occur in a process model. Other related works exist in the areas of pattern matching or semantic matching. Existing approaches [4] assume that the correspondence of activities can be established automatically. Since they suppose that the same label implies the same function, they try to identify the content of an activity by using an automated semantic matching algorithm based on the labels of activities. One specific approach to quality improvement in compliance is IT-supported compliance evaluation [16]. The notion of compliance has also been discussed in the context of business alignment [29].
2.3 Aspects of Knowledge Discovery and Generation

The generation of nontrivial knowledge from raw data is a typical objective of data mining and knowledge discovery [5]. Of particular importance in this context is the quality of the raw data and the proper extraction, transformation and loading (ETL) [34] of the data for further analysis. Data quality has emerged as a research topic over the last 25 years [13, 14, 36]. Typically, the idea of a datum is thereby understood according to the definition of data representation and data recording as a collection of triples <e, a, v>. The value v is selected from the domain of the attribute a in order to represent its value in the entity e. Data representation is then defined as the rules for storing these triples on a medium, and a data record is one physical instance of a dataset [6]. There exist different quality aspects that can be considered in the context of data representation. In a more narrow context we can regard data quality as the quality of the values v. Further dimensions of models and data representations are presented in [13], Section 3. We can consider four categories of dimensions of data quality: correctness, actuality and topicality, integrity, as well as consistency [6]. Delving further, we can try to operationalize it through parameters and indicators [35].
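The triple view of a datum can be sketched directly; the incident entities and attribute names below are illustrative, not taken from the original dataset:

```python
# A datum as a triple <e, a, v>: entity e, attribute a, value v.
records = [
    ("incident-4711", "priority", "high"),
    ("incident-4711", "affected_service", "e-mail"),
    ("incident-4712", "priority", "low"),
]

def values_of(triples, attribute):
    """Narrow view of data quality: inspect only the values v of one attribute a."""
    return [v for e, a, v in triples if a == attribute]

print(values_of(records, "priority"))  # ['high', 'low']
```

Assessing data quality "in the narrow sense" then means examining the collected values of each attribute for correctness, topicality, integrity, and consistency.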
2.3.1 Parameters of Data Quality

Parameters of data quality represent a qualitative or subjective dimension: the data quality as experienced by an observer or user. Examples are the trustworthiness of a data source or the timeliness of data.
2.3.2 Indicators of Data Quality

Indicators of data quality represent a data dimension that offers objective information about the data, e.g., source, time of generation, or methodology of data gathering in the field.
2.3.3 Attributes of Data Quality

Attributes of data quality are composed from parameters and indicators of data quality. Values of indicators represent measured characteristics of the data, e.g., source of data: Standard & Poor's.
2.3.4 Parameter Values of Data Quality

Parameter values of data quality depend on indicator values. For example, if an observer considers Standard & Poor's a trustworthy source, this observer would consider the credibility of data from this source as high.
2.3.5 Data Quality Requirements

Data quality requirements specify which indicators should be observed and documented, so that a data user can receive data with a specific quality as a result of a query.
3 Research Hypothesis and Assessment Framework for IT Governance

3.1 Research Hypothesis

Analysis of log data can in general provide a clear picture of performance and utilization of IT infrastructure components. Such data can even be used to dynamically reconfigure systems for better performance at different architectural levels (the definition of these levels is presented in [17, 19], while details about the techniques are presented in [18, 20]). Our hypothesis is that such data can also be used as a source to derive knowledge about IT management and governance objectives and their corresponding actions. In order to derive meaningful knowledge with respect to IT management and governance objectives, we need to augment and assess available log data. This augmentation and assessment should be considered within the context of IT management and IT governance. Our hypothesis is that such augmentation and a subsequent analysis facilitating existing data modeling approaches can provide extended nontrivial knowledge for IT executives.
3.2 IT Governance Frameworks

IT governance frameworks aim to define standardized processes and control metrics for IT provision. Commonly applied frameworks in this area include the IT Infrastructure Library (ITIL) [27] and the Control Objectives for Information and Related Technology (CObIT) [10]. They typically provide best practices for measurement and control of IT-specific indicators.
3.3 IT-specific Indicators

IT indicators should demonstrate the added value of IT to the business side. A well-accepted view of business objectives is Porter's distinction between operational effectiveness (efficiency and effectiveness) and strategic positioning (reach and structure) [12]. This view can be translated directly into corresponding goals and indicators for IT [26]. Organizations require well-designed business processes to achieve excellence in a competitive environment: here, not one-time optimized business processes play
the essential role, but rather the ability to quickly react to new developments and to flexibly adapt the respective business processes is decisive [2]. It is important that these processes are effectively supported through IT. These requirements have consequently been catalyzing increased interest in reference modeling for IT process management. Reference models such as ITIL and CObIT represent proven best practices and provide key indicators for the design and control of IT services [27]. On the one hand, utilization of reference models promises to enhance quality and facilitates better compliance with statutes and contractual agreements. On the other hand, IT processes have to correspond to corporate strategy and its respective goals. Therefore, the question arises how best practices can be implemented in a particular corporate environment. Another challenge lies in checking reference process execution as well as in assuring compliance of IT procedures with respect to new or altered business processes [25].
3.4 Aligning IT and Business Indicators

One way towards IT and business alignment can be the application of approaches such as CObIT and ITIL for the optimization of IT organizations. We recently introduced an approach for the continuous quality improvement of IT processes based on such models [7] and process mining. An organization can also try to assure the continuous provision of service levels, as demonstrated in our previous work with such reference models and our work in the area of service level assurance in SOA [24, 21, 23]. Furthermore, in order to coordinate and govern IT production, we can assess operative data and try to analyze it more deeply with the help of simulation models.
3.5 Cloud Governance Aspects

Governance of cloud computing should regard different deployment models. Abstracting services at the level of infrastructure (IaaS) allows comparatively easy virtualization: the user organization can configure and customize the platform and the services within the virtual image that is then being deployed and operated. This includes the definition of performance parameters for specific services (e.g., parameters of a Web Service Container), the security aspects of service access, and the integration of services within the platform. When using a standardized platform (the PaaS approach) the user organization deploys the services in a virtualized operating environment. This operating environment is typically provided as a service: the virtualization technology and the operating environment are managed by the provider. Integration capabilities are always provider-specific and there are currently no commonly accepted industry standards for integration between services operated in different PaaS environments. The usage of software services itself (the SaaS approach) precludes fine-grained control and enforcement of non-functional aspects (e.g., QoS, response time) and security parameters of the infrastructure and the platform by the user organization. These different levels of virtualization require different levels of security and abstraction. The grade of control and responsibility for security aspects declines with higher levels of abstraction: in IaaS the configuration is generally in the
hand of the user organization, while in SaaS it is primarily a responsibility of the Cloud provider. There are several emerging patterns for cloud usage. The first one is a natural consequence of the trend to outsource IT operations (aka IT-RUN functions) to external providers and results in demand for IaaS. IaaS is typically used for the implementation of test projects and as a way to overcome underprovisioning in on-premise infrastructures. The second one is coming from the SaaS area and focuses on the provision of Web 2.0 applications. Some well-known sites offer the user the chance to develop simple applications (à la PaaS) and offer them in a SaaS-like manner later on. This usage pattern could also be called extension facilities. PaaS is an optimal environment for users seeking testing and development capabilities; these are two new emerging usage patterns which are gaining popularity. Gaming will probably be one of the most remarkable usage patterns for Cloud technologies, due to its inherent scalability, endowing such applications with virtually unlimited graphical power and players. The rise of netbooks in the computer hardware industry has also triggered the development of Clouds. These slim devices depend on services being deployed in remote Cloud sites since their own capacity is limited. Behind this stands the idea of getting access to everything, from anywhere, at any time. A set of general Corporate Governance rules has to be specifically refined and targeted for every operational area in an enterprise. The idea of manageability in Cloud Computing is closely related to the operationalization of Corporate Governance in the different phases of the use of a Cloud Computing offering. A specific manifestation of such operationalization can be the introduction of SLA-based Governance. This would mean that the organization has to incorporate specific governance requirements as part of a service level agreement for a Cloud Computing offering.
Suitable examples include the so-called "four-eyes principle" that can be part of the SLA for a SaaS offering, or data availability requirements that can also be part of such an SLA. In order to introduce such transparent Cloud Governance mechanisms an organization has to consider all phases of the usage of a Cloud Computing offering. During the first phase of requirements identification and elicitation (often called the Plan phase) these requirements need to be specified and formalized. This allows addressing them already within a first assessment of the Cloud Computing market for the specific offering. Potential Cloud Computing providers can then be specifically evaluated with respect to the requirements, and specific SLAs can be negotiated with them during the second phase. The third phase can focus on the transparent communication of values and benefits of the SLA during the start of production for the specific business unit. The fourth phase would deal with performance monitoring and assessment of SLA fulfillment and associated bonuses or penalties. These phases and their associated activities can be introduced as specific Cloud Computing extensions to more traditional IT governance approaches such as CObIT and ITIL. This introduction is typically non-trivial, as there are significant differences between the abstraction levels and the semantics of Cloud Computing and IT governance. In the specific area of SaaS a more straightforward approach can focus on the introduction of a more specific approach from the area of SOA Governance: the
SOA LifeCycle [22]. It describes a governance approach for software functionality as provided by web services, which makes its paradigms and concepts more applicable to the aspects of SaaS Governance. On the other hand, the SOA LifeCycle can be incorporated as part of a general IT governance strategy based on CObIT and ITIL.
4 Experimental Results

In this section we provide an experimental evaluation of our approach for analyzing IT staff behavior based on log data using System Dynamics as described in [15].
4.1 Operative Log Data

The operative log data is generated by software applications that provide service support as a set of standardized IT functions. It describes request processing for three different IT services of an international telecommunications provider. The IT services are an e-mail service, an IP-based video-on-demand service, and a web-hosting service. Data is generated in a comma-separated values (CSV) format and includes an incident number, as well as the following fields:
• priority,
• short description,
• affected service,
• start of incident,
• end of incident,
• range of impact,
• number of process steps needed.
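A record of this shape can be loaded with a few lines of Python; the concrete field names, the ';' delimiter, and the sample rows below are assumptions for illustration, not an excerpt of the actual dataset:

```python
import csv
import io

# Hypothetical two-row excerpt mirroring the fields listed above.
raw = """incident;priority;short_description;affected_service;start;end;impact;steps
4711;1;mailbox unreachable;e-mail;1305014400;1305021600;regional;3
4712;3;stream stutters;video-on-demand;1305018000;1305019800;single user;1
"""

incidents = list(csv.DictReader(io.StringIO(raw), delimiter=";"))

# Derive a quantitative measure from the raw fields: processing duration in hours.
for row in incidents:
    row["duration_h"] = (int(row["end"]) - int(row["start"])) / 3600

print([(r["incident"], r["duration_h"]) for r in incidents])  # [('4711', 2.0), ('4712', 0.5)]
```

Already at this stage the raw log yields quantitative material (durations, counts per service) that feeds the transformations described in Section 4.4.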
4.2 The Concept of System Dynamics

System Dynamics makes it possible to represent and analyze complex causality structures. It can often provide insights that are not easily derived from the original data and are sometimes even counterintuitive. Sometimes such analyses can lead to the revision of already made decisions. Figure 1 shows an example of such a model.
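The core idea can be sketched as a minimal stock-and-flow loop: a stock (here, open incidents) is fed by an inflow (new tickets) and drained by an outflow that itself depends on the stock, which is what produces feedback behavior. All values below are illustrative, not parameters of our model:

```python
# Minimal System Dynamics sketch: one stock with a level-dependent outflow.
def simulate(days: int, inflow_per_day: float, close_rate: float) -> list[float]:
    open_incidents = 0.0
    history = []
    for _ in range(days):
        outflow = close_rate * open_incidents      # feedback: larger backlog, more closures
        open_incidents += inflow_per_day - outflow
        history.append(open_incidents)
    return history

trace = simulate(days=30, inflow_per_day=20, close_rate=0.25)
print(round(trace[-1], 1))  # approaches the equilibrium inflow/close_rate = 80.0
```

Even this tiny model shows the typical, somewhat counterintuitive behavior: the backlog does not grow without bound but settles at an equilibrium determined by the feedback loop, not by the inflow alone.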
4.3 Selection of a Simulation Tool

There are various tools for the definition and execution of System Dynamics models. Our next objective was the selection of a suitable tool for the enterprise application scenario. Our assessment was based on a cost-benefit analysis and the analytic hierarchy process (AHP) and included the following categories of requirements with their weightings:
Figure 1 A Sample System Dynamics Model

• Technical (15%)
• Functional (50%)
• Environment (25%)
• Supplier / Support (10%)

Figure 2 shows an exemplary excerpt from the assessment of the technical requirements of the four alternatives. The final decision was to use the simulation software Consideo Modeller (http://www.consideo.de).
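The final aggregation step of such an assessment reduces to a weighted sum over the four categories. Only the weights below come from the list above; the per-tool scores and tool names are made-up placeholders:

```python
# Category weights from the requirements list above.
weights = {"technical": 0.15, "functional": 0.50, "environment": 0.25, "supplier": 0.10}

# Fictitious normalized scores (0-10) for two of the four alternatives.
tools = {
    "Tool A": {"technical": 7, "functional": 8, "environment": 6, "supplier": 5},
    "Tool B": {"technical": 8, "functional": 6, "environment": 7, "supplier": 8},
}

scores = {name: sum(weights[c] * s[c] for c in weights) for name, s in tools.items()}
best = max(scores, key=scores.get)
print(scores, "->", best)  # Tool A scores 7.05, Tool B scores 6.75 -> Tool A
```

In the full AHP the weights themselves are derived from pairwise comparisons rather than set directly, but the ranking step shown here is the same.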
4.4 Data Transformation and Augmentation

The use of the log data as input for the System Dynamics simulation required further transformation. Examples of two specific transformations that we had to conduct are:
• The original log data includes timestamps (start and end of an incident) as UNIX-type datetime data. They had to be transformed to the supported DD.MM.YYYY HH:MM:SS format.
• We had to include an incident increment that we then used to map a time value to the number of incidents that are processed.
Figure 2 An Excerpt from the Technical Assessment

• We used only an excerpt of the available log data, which covered several years. Using data blocks per month allowed us to keep the execution time of the simulation short.
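The timestamp transformation is a one-liner; we assume here that the logged timestamps are UTC, which may differ from the original setup:

```python
from datetime import datetime, timezone

def to_simulation_format(unix_ts: int) -> str:
    """Convert a UNIX timestamp to the DD.MM.YYYY HH:MM:SS format the tool expects."""
    return datetime.fromtimestamp(unix_ts, tz=timezone.utc).strftime("%d.%m.%Y %H:%M:%S")

print(to_simulation_format(1305014400))  # 10.05.2011 08:00:00
```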
4.5 Qualitative and Quantitative Models and Their Results

We present our simulation models and results as follows:

• First, we describe the construction and the subsequent refinement of our qualitative model,
• Second, we present the assessment of results from the qualitative model,
• Third, we describe the construction and the elaboration of our quantitative model,
• Fourth, we present the simulation approach and the assessment of results from the quantitative model.
4.5.1 Construction and Refinement of the Qualitative Model

Figure 3 shows an overview of our qualitative modeling approach. It considers costs (Kosten), number of employees (Anzahl der Mitarbeiter), speed of ticket processing (Schnelligkeit der Ticketbearbeitung), customer satisfaction (Zufriedenheit beim Kunden), quality of ticket processing (Qualität der Ticketbearbeitung), and the number of open incidents (Anzahl offener Incidents). The considered factors are derived from operative systems in the area of IT support (speed of ticket
processing, quality of ticket processing, number of open incidents) and from standard ERP systems (costs, number of employees). We conducted specific augmentations as described in the previous section. Furthermore, we considered specific time frames and therefore had to synchronize data from these two sources correspondingly.
Figure 3 An Overview of the Qualitative Model. Legend: costs (Kosten), number of employees (Anzahl der Mitarbeiter), speed of ticket processing (Schnelligkeit der Ticketbearbeitung), customer satisfaction (Zufriedenheit beim Kunden), quality of ticket processing (Qualität der Ticketbearbeitung), number of open incidents (Anzahl offener Incidents)
This first qualitative model gives us specific insights with respect to the dependencies between the singular factors. In order to provide richer knowledge about other (hidden) factors we can consider further factors that can be aggregated from the existing ones and can influence the qualitative model. Figure 4 shows the consideration of the emergent factor stress, which can be aggregated from the existing factors speed of ticket processing, quality of ticket processing, and number of open incidents. We then further augment the factor Stress with the new factor 'number of staff away sick' (Krankenstand), as shown in Figure 5. This augmentation can provide potential insights into whether increased levels of stress contribute directly to increased numbers of staff ill. Figure 6 shows the complete qualitative model. It also contains the influence of the factor Stress (included as a sub-model) on the factor number of employees.
Figure 4 Augmentation of the Qualitative Model with the Factor Stress. The factor Stress is derived from the existing factors speed of ticket processing, quality of ticket processing, and number of open incidents.
4.5.2 Results from the Qualitative Model

Figure 7 shows a results matrix for the qualitative model. Dependency types and their relative strengths with respect to the number of open incidents are shown in a Cartesian coordinate system. The analysis of the dependencies offers the following insights:
• An increase in quality of ticket processing (4) correlates with an increase in the number of open incidents.
• An increase in the number of employees (1) correlates with an insignificant decrease in the number of open incidents.
• An increase in speed of ticket processing (5) correlates strongly with a decrease in the number of open incidents.

Let us assume that we are taking responsibility for an IT department and need to provide several quick wins. If we focus on the number of open incidents we can aim to increase the speed of ticket processing. We can then focus on the influences of the factors on the speed of ticket processing and thus find the proper variables that we need to control. Our results with the qualitative model seem to confirm our hypothesis: the augmentation of log data and its subsequent transformation to relevant
Figure 5 Augmentation of the Qualitative Sub-Model of the Factor Stress with the Factor 'number of staff away sick' (Krankenstand). The new factor is derived from existing factors and further data from the HR system.
factors of the model can provide additional knowledge and specific action points for IT executives. This approach can be subsequently extended and further operationalized with our quantitative models that we present next.
4.5.3 Construction and Refinement of the Quantitative Model

An important a priori consideration is that not every qualitative model can be transformed directly into a quantitative one. Not every factor of the qualitative model can be represented quantitatively. In the context of this work we therefore present one characteristic example where we transform a subset of our qualitative model into a quantitative model. More specifically, we are assessing the influence of stress on the number of open incidents, or, even more precisely, we are assessing the dependency of the number of (currently) open incidents on the stress levels. The dynamic aspects of the model are simulated by introducing the daily opened incidents into the system model using a random function f(x) ∈ [0, 1], a probability distribution of incidents over [0, 100], and subsequent rounding (see Figure 8). The number of daily completed (closed) incidents depends on the following three factors:
• Daily active working time,
• Number of employees, and
Figure 6 The Complete Qualitative Model. It considers the sub-model of the Factor Stress, its influences on the other factors, and vice versa.
• Processing ratio per hour.

We then augment the quantitative model at two places with the influence of the factor stress (see Figure 8). Stress impacts the completed incidents negatively and is itself influenced as follows: first, by the increasing number of open incidents, and second, by the introduction of quotas based on the processing ratio per hour. Here we assumed values of n open incidents where stress is negatively impacted when n > 100 (∆ = −0.20) and values of n_sim for simultaneously processed incidents where stress is negatively impacted when n_sim > 6 (∆ = −0.20). These values were also confirmed empirically.
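The two stress couplings can be sketched as a single daily-throughput function. Only the thresholds (n > 100, n_sim > 6) and the −0.20 deltas come from the model above; the base capacity formula and the exact penalty mechanics are our simplifying assumptions:

```python
def completed_today(open_incidents: float, new: float,
                    employees: int = 1, hours: float = 8, rate_per_hour: float = 6) -> float:
    """Daily closed incidents under the two stress penalties (illustrative sketch)."""
    capacity = employees * hours * rate_per_hour   # working time x staff x processing ratio
    penalty = 1.0
    if open_incidents + new > 100:                 # stress through open incidents
        penalty -= 0.20
    if rate_per_hour > 6:                          # stress through hourly quotas
        penalty -= 0.20
    return min(open_incidents + new, capacity * penalty)

print(f"{completed_today(open_incidents=80, new=40):.1f}")  # 38.4: backlog stress cuts 48 to 38.4
print(f"{completed_today(open_incidents=0, new=40):.1f}")   # 40.0: no stress, demand below capacity
```

The sketch makes the feedback visible: once the backlog crosses the threshold, throughput drops, which in turn keeps the backlog above the threshold.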
4.5.4 Simulation Results from the Quantitative Model

A key benefit of the quantitative model is that it allows an extended simulation. The model is executed step by step based on the specified factors, values and transformation functions. Changes in every factor can then be plotted (e.g., as histograms). Furthermore, when working with probabilities and randomly generated values, we can employ a Monte-Carlo simulation [1]. Figure 10 shows results of the Monte-Carlo simulation of our quantitative model. The model considers quotas based on the processing ratio per hour. The ratio is set to 6 incidents per hour (bottom right). Results in the histogram show that the daily number of new incidents can be processed without delays in more
Figure 7 The Results Matrix for the Qualitative Model. Legend: 1. number of employees (Anzahl der Mitarbeiter), 2. number of open incidents (Anzahl offener Incidents), 3. number of staff away sick (Krankenstand), 4. quality of ticket processing (Qualität der Ticketbearbeitung), 5. speed of ticket processing (Schnelligkeit der Ticketbearbeitung), 6. Stress
than 90% of the cases during a quarterly period (120 days). A gradual increase in the number of open incidents beyond the threshold of 100 leads to an increased feedback effect that causes the outliers. A short-term availability of additional employees to handle the number of incidents exceeding the threshold of 100 would have only a minor effect on costs but would practically eliminate delays in the incident processing. This finding can be regarded as a further corroboration of our hypothesis: based on the quantitative model and its subsequent simulation we are able to further extend knowledge and provide direct optimization recommendations from a governance point of view (e.g., employ additional resources, even if they are more expensive, when the number of open incidents is beyond the threshold of 100). Certainly, this is a finding specific to the IT organization we assessed during our experiments. Nevertheless, it demonstrates the feasibility of the approach. Experiments that we conducted in other organizations provided similar results, with the value of the threshold being slightly different. As a next step, we sought to further extend the knowledge that we can derive from the quantitative model and tried to simulate different governance settings without changing the number of employees. Our expectation was that we can find settings where we can avoid a self-enforcing increase in the number
Figure 8 First Iteration of the Quantitative Model. On the left: introduction of new daily opened incidents via a random function f(x) ∈ [0, 1], a probability distribution of incidents over [0, 100] and subsequent rounding; on the right: daily completed incidents, dependent on the daily active working time, the number of employees, and the processing ratio per hour.
of open incidents without having to provide additional manpower. Therefore, we configured the model with various factor settings and then ran the simulations. Figure 11 provides an overview of results from one specific configuration. It differs from the model in Figure 10 in the value of only one factor: the quota based on the processing ratio per hour has been increased, Qhour → Q′hour; Q′hour = 7. The results of this simulation are a particularly representative case for the generation of nontrivial knowledge from existing datasets using our methodology. The increase in hourly quotas Qhour → Q′hour is at first sight a typical increase in stress levels. The simulation shows that, as a direct effect of this increase, all incidents can be processed on time. Furthermore, we can even observe negative levels of stress (see Figure 11, left). Overall, our experimental results confirm our hypothesis: we are able to generate new knowledge by augmenting existing log data from day-to-day systems and modeling the dependencies between them. The use of System Dynamics-based models allows us to model dependencies both qualitatively and quantitatively. It also allows the use of standard simulation software, which is another benefit of our approach. Using our value-benefit assessment, every organization can choose the optimal simulation software based on its specific preferences and requirements.
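As a rough illustration of the kind of simulation described above, the following sketch draws daily incident arrivals from a rounded uniform distribution (as in Figure 8), caps daily completions by the product of employees, active working hours, and the hourly quota, and averages over many random runs in the Monte-Carlo style of Figures 10 and 11. All parameter values and function names are illustrative assumptions, not the configuration of the actual model:

```python
import random
import statistics

def fraction_of_delayed_days(q_hour, runs=200, days=120,
                             employees=1, active_hours=8.0):
    """Monte-Carlo estimate of the share of days ending with a nonzero
    backlog, for a given hourly processing quota. The arrival model and
    all parameter values are illustrative, not the paper's actual ones."""
    shares = []
    for seed in range(runs):
        rng = random.Random(seed)
        open_incidents, delayed = 0, 0
        # daily capacity: employees * active working hours * hourly quota
        capacity = round(employees * active_hours * q_hour)
        for _ in range(days):
            opened = round(rng.uniform(0, 100))   # rounded uniform arrivals
            completed = min(open_incidents + opened, capacity)
            open_incidents += opened - completed  # backlog carried over
            if open_incidents > 0:
                delayed += 1
        shares.append(delayed / days)
    return statistics.mean(shares)

# Mean demand is ~50 incidents/day, so a quota of 6 (capacity 48/day)
# lets the backlog build up, while a quota of 7 (capacity 56/day) absorbs it.
print(fraction_of_delayed_days(6) > fraction_of_delayed_days(7))
```

With these illustrative numbers, raising the quota from 6 to 7 flips the daily capacity from just below to just above the mean arrival rate, which mirrors the qualitative effect reported above for Q′hour = 7.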
Figure 9 Second Iteration of the Quantitative Model. The model is now augmented with the influence of the factor Stress in two ways: first, by the increasing number of open incidents (Stress durch offene Incidents, top), and second, by the introduction of quotas based on the processing ratio per hour (Stress durch Quotenvergabe, right).
We have constructed our qualitative and quantitative models using the software that we selected as suitable for the IT organization we evaluated. We cannot guarantee that the models can be created equally well in any of the other software alternatives. Based on our qualitative and quantitative models, we can provide directly applicable knowledge in the form of specific managerial (e.g., increase the number of operators when n > 100) or governance (e.g., set Qhour → Q′hour) recommendations.
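The first of these recommendations can be written down as a small helper function. The threshold of 100 is taken from the analysis above; the per-hour rate, the working-hours figure, and the function name are hypothetical illustrations:

```python
import math

def recommended_staff(open_incidents, base_staff, threshold=100,
                      per_hour=7, hours=8.0):
    """Threshold rule sketched in the text: add short-term capacity only
    while the backlog exceeds the (organization-specific) threshold.
    per_hour and hours are illustrative assumptions."""
    if open_incidents <= threshold:
        return base_staff
    excess = open_incidents - threshold
    # extra operators needed to clear the excess within one working day
    extra = math.ceil(excess / (per_hour * hours))
    return base_staff + extra

print(recommended_staff(90, 5))   # below threshold -> 5
print(recommended_staff(156, 5))  # 56 excess / 56 per operator-day -> 6
```

The point of the rule is that the extra operators are temporary: as soon as the backlog falls back below the threshold, staffing returns to its base level, which keeps the cost effect minor.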
5 Conclusion and Outlook

In this article we proposed the hypothesis that the proper augmentation of log data from day-to-day systems, and a subsequent analysis that leverages existing data modeling approaches, can provide extended nontrivial knowledge for IT executives. In order to verify our hypothesis, we considered existing IT governance frameworks and developed an augmentation and assessment approach based on System Dynamics. Our experimental results with real-life data from a complex IT organization confirm our hypothesis: we are able to generate new knowledge by
Figure 10 Results of the Monte-Carlo Simulation of the Quantitative Model. The model considers quotas based on the processing ratio per hour. The ratio Qhour is set to 6 incidents per hour (bottom right).
Figure 11 Results of the Monte-Carlo Simulation of the Quantitative Model. The model considers quotas based on the processing ratio per hour. The ratio Q′hour is set to 7 incidents per hour (bottom right).
augmenting existing log data from day-to-day systems and modeling the dependencies between them. While our hypothesis is confirmed, we explicitly warn researchers and IT practitioners against applying our results unchanged to other organizations. We presume that the factors we used in our models are generally applicable, as we derived them from standard indicators from the areas of ITIL and COBIT. The specific values of the factors, however, are organization-specific and need to be estimated using our approach for every organization independently. Nevertheless, the usefulness and the relevance of the obtained knowledge are considerable, although dependent on the specific augmentation and assessment approach we applied. Furthermore, such a research landscape can be a suitable environment for the evaluation of knowledge and learning objects, processes, strategies, systems, and performance as defined in [11].
References

[1] K. Binder. Monte Carlo methods. Mathematical Tools for Physicists, pages 249–280, 1986.
[2] J. Borzo. Business 2010 - Embracing the Challenge of Change. Technical report, 2005.
[3] A.K.A. de Medeiros, A.J.M.M. Weijters, and W.M.P. van der Aalst. Genetic process mining: an experimental evaluation. Data Mining and Knowledge Discovery, 14(2):245–304, 2007.
[4] M. Ehrig, A. Koschmider, and A. Oberweis. Measuring similarity between semantic business process models. In Proceedings of the Fourth Asia-Pacific Conference on Conceptual Modelling - Volume 67, pages 71–80. Australian Computer Society, Inc., 2007.
[5] U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy. Advances in Knowledge Discovery and Data Mining. 1996.
[6] C. Fox, A. Levitin, and T. Redman. The notion of data and its quality dimensions. Information Processing and Management: an International Journal, 30(1):9–19, 1994.
[7] K. Gerke and G. Tamm. Continuous Quality Improvement of IT Processes based on Reference Models and Process Mining. AMCIS 2009 Proceedings, page 786, 2009.
[8] Wim Van Grembergen, editor. Strategies for Information Technology Governance. IGI Publishing, Hershey, PA, USA, 2003.
[9] M. Grundstein. A pattern of reference to insure organizational learning process: The semi-opened infrastructure model (SOPIM). International Journal of Knowledge Society Research (IJKSR), 2(3):13–25, 2011.
[10] J.W. Lainhart IV. COBIT: A Methodology for Managing and Controlling Information and Information Technology Risks and Vulnerabilities. Journal of Information Systems, 14:21, 2000.
[11] M.D. Lytras and M.A. Sicilia. The Knowledge Society: a manifesto for knowledge and learning. International Journal of Knowledge and Learning, 1(1/2):1–11, 2005.
[12] M.E. Porter. What is strategy? Harvard Business Review, 74(4134):61–78, 1996.
[13] T.C. Redman. Data Quality: Management and Technology. Bantam Books, New York, NY, USA, 1992.
[14] T.C. Redman. Data Quality for the Information Age. Artech House, Norwood, MA, USA, 1997.
[15] George P. Richardson and Alexander L. Pugh. Introduction to System Dynamics Modeling with Dynamo. MIT Press, Cambridge, MA, USA, 1981.
[16] S. Sackmann and M. Kähmer. ExPDT: A layer-based approach for automating compliance. Wirtschaftsinformatik, 50(5):366–374, 2008.
[17] Vladimir Stantchev. Architectural Translucency. GITO Verlag, Berlin, Germany, 2008.
[18] Vladimir Stantchev. Effects of Replication on Web Service Performance in WebSphere. ICSI Tech Report 2008-03, International Computer Science Institute, Berkeley, California 94704, USA, February 2008.
[19] Vladimir Stantchev and Miroslaw Malek. Architectural Translucency in Service-oriented Architectures. IEE Proceedings - Software, 153(1):31–37, February 2006.
[20] Vladimir Stantchev and Miroslaw Malek. Addressing Web Service Performance by Replication at the Operating System Level. In ICIW '08: Proceedings of the 2008 Third International Conference on Internet and Web Applications and Services, pages 696–701, Los Alamitos, CA, USA, June 2008. IEEE Computer Society.
[21] Vladimir Stantchev and Miroslaw Malek. Translucent replication for service level assurance. In High Assurance Services Computing, pages 1–18, Berlin, New York, June 2009. Springer.
[22] Vladimir Stantchev and Miroslaw Malek. Addressing dependability throughout the SOA life cycle. IEEE Transactions on Services Computing, 99(PrePrints), 2010.
[23] Vladimir Stantchev and Christian Schröpfer. Negotiating and enforcing QoS and SLAs in grid and cloud computing. In GPC '09: Proceedings of the 4th International Conference on Advances in Grid and Pervasive Computing, pages 25–35, Berlin, Heidelberg, 2009. Springer-Verlag.
[24] Vladimir Stantchev and Christian Schröpfer. Service level enforcement in web-services based systems. International Journal on Web and Grid Services, 5(2):130–154, 2009.
[25] Vladimir Stantchev and Gerrit Tamm. Addressing non-functional properties of services in IT service management. In Non-Functional Properties in Service Oriented Architecture: Requirements, Models and Methods, pages 324–334, Hershey, PA, USA, May 2011. IGI Global.
[26] Paul P. Tallon, Kenneth L. Kraemer, and Vijay Gurbaxani. Executives' perceptions of the business value of information technology: a process-oriented approach. J. Manage. Inf. Syst., 16:145–173, March 2000.
[27] J. Van Bon. Foundations of IT Service Management Based on ITIL V3. Van Haren, 2008.
[28] W. Van der Aalst, A. de Medeiros, and A. Weijters. Process equivalence: Comparing two process models based on observed behavior. Business Process Management, pages 129–144, 2006.
[29] W. M. P. van der Aalst. Business alignment: using process mining as a tool for delta analysis and conformance testing. Requir. Eng., 10:198–211, November 2005.
[30] W. M. P. van der Aalst, B. F. van Dongen, J. Herbst, L. Maruster, G. Schimm, and A. J. M. M. Weijters. Workflow mining: a survey of issues and approaches. Data Knowl. Eng., 47:237–267, November 2003.
[31] W.M.P. van der Aalst and A. Weijters. Process mining: a research agenda. Computers in Industry, 53(3):231–244, 2004.
[32] B. van Dongen, R. Dijkman, and J. Mendling. Measuring similarity between business process models. In Advanced Information Systems Engineering, pages 450–464. Springer, 2008.
[33] R.J. Van Glabbeek and W.P. Weijland. Branching time and abstraction in bisimulation semantics. Journal of the ACM (JACM), 43(3):555–600, 1996.
[34] Panos Vassiliadis, Alkis Simitsis, and Spiros Skiadopoulos. Conceptual modeling for ETL processes. In Proceedings of the 5th ACM International Workshop on Data Warehousing and OLAP, DOLAP '02, pages 14–21, New York, NY, USA, 2002. ACM.
[35] R.Y. Wang, H.B. Kon, and S.E. Madnick. Data quality requirements analysis and modeling. In Data Engineering, 1993: Proceedings of the Ninth International Conference on Data Engineering, pages 670–677, 1993.
[36] R.Y. Wang, M. Ziad, and Y.W. Lee. Data Quality. Kluwer Academic Publishers, Norwell, MA, USA, 2001.