Application Run Time Estimation: A Quality of Service Metric for Web-based Data Mining Services
Shonali Krishnaswamy
Seng Wai Loke
Arkady Zaslavsky
School of Computer Science and Software Engineering Monash University 900 Dandenong Rd, Caulfield East, VIC 3145, Australia IDD + 61 3 9903 1402
School of Computer Science and Information Technology RMIT University GPO Box 2476V, Melbourne, VIC 3001, Australia IDD + 61 3 9925 9678
School of Computer Science and Software Engineering Monash University 900 Dandenong Rd, Caulfield East, VIC 3145, Australia IDD + 61 3 9903 2479
[email protected]
[email protected]
[email protected]
ABSTRACT
The emergence of Application Service Providers (ASP) hosting Internet-based data mining services is being seen as a viable alternative for organisations that value their knowledge resources but are constrained by the high cost of data mining software. Response time is an important Quality of Service (QoS) metric for web-based data mining service providers. The ability to estimate the response time of data mining algorithms a priori benefits both clients and service providers: it helps clients to impose QoS constraints on the service level agreements, and it helps service providers to optimise resource utilisation and scheduling. In this paper we present a novel rough sets based technique for identifying similarity templates to estimate application run times. We also present experimental results and an analysis of this technique.
Keywords
Data Mining E-Services, Quality of Service, Application Run Time Estimation, Rough Sets.
1. INTRODUCTION
Application Services are a type of e-service/web service characterised by the renting of software. Application Service Providers (ASP) operate by hosting software packages/applications for clients to access through the Internet (or in certain cases through dedicated communication channels) via a web interface. Payments are made for the usage of the software rather than for the software itself. The ASP paradigm is leading to the
emergence of several Internet-based service providers in the domain of business intelligence applications such as data mining, data warehousing, OLAP and CRM. This can be attributed to the following reasons:
• The economic viability of paying for the usage of high-end software packages rather than having to incur the costs of buying, setting up, training and maintenance.
• Increased demand for business intelligence as a key factor in strategic decision-making and in providing a competitive edge.
ASPs can also be characterised on the basis of their client interaction model as having either a "single service provider" approach or a "multiple service provider" approach [7]. The single service provider model is currently the predominant approach for business intelligence service providers. In this model, the client has a long-term contractual agreement with a single service provider. The mode of interaction, exchange of information, specification of tasks and delivery of results are determined mutually. The multiple service provider model involves a more ad-hoc relationship between clients and service providers, who typically have contractual agreements on a task-by-task basis. This model typifies a virtual marketplace of e-services where service providers compete for requests made by clients. It requires sophisticated techniques for locating service providers, matching task requests with service providers' capabilities, ranking service providers and optionally performing negotiation. Emerging e-services platforms such as E-Speak and technologies like Universal Description Discovery and Integration (UDDI), together with on-going research in areas such as the description of data mining services [7] and the ranking of service providers [12], facilitate this interaction model of ASPs. The models of operation and interaction notwithstanding, Service Level Agreements (SLA) are acknowledged as fundamental to the functioning of ASPs. Service Level Agreements govern the modalities and the performance of the services. In effect, they specify the Quality of Service (QoS) guaranteed by service providers. As ASPs become widespread, performance metrics and quality of service issues assume great significance [14]. This is evident from the number of QoS tools
and initiatives for e-services that are becoming available, including WebQoS [13] and bizQoS [9]. While typical QoS metrics for e-services include performance, availability, response time and reliability, the interpretation of these metrics and the processes that have to be monitored in order to obtain them are specific to the type of application that is being provided as a service. For example, "response time" in an ASP that hosts an online shopping agent could imply the time taken to query virtual malls for the availability of a requested product, while in a data mining service provider it could refer to the time taken to process a data set and deliver the results. In this paper, we focus on application run time estimation for ASPs who host data mining services. While we principally address data mining ASPs, the metric is generic to application services that are characterised by intensive and iterative computations on high volumes of data. We present a novel rough sets based algorithm for application run time estimation. As discussed above, response time for data mining services is the time taken to process a data mining task, that is, to apply a data mining algorithm to a specific data set (that has been cleaned and preprocessed) and obtain patterns from the data. In such services, the ability of the ASP to estimate the response time a priori can have several benefits:
• Form the basis for an SLA
• Facilitate the scheduling of different tasks for the ASP
• Allow ranking of ASPs in a multiple service provider environment
In this paper, we propose application run time as a metric that is appropriate for data mining service providers and present a novel rough sets based algorithm for estimating it. While its significance cannot be discounted in a marketplace of service providers, the ability to estimate application run time can be effectively used even in the current mode of operation to govern service level agreements.
It is evident that a mechanism for estimating response time is valuable for data mining service providers. However, the most significant factor in using such a technique is the accuracy of the estimation. The following sections of the paper present a novel rough sets approach to application run time estimation. The paper is organised as follows: section 2 reviews related research and surveys the landscape of commercial data mining service providers. Section 3 introduces the general principles of application run time estimation. Section 4 presents the theoretical aspects of our rough sets based algorithm for application run time estimation. Section 5 presents the experimentation and analysis of results. Finally, section 6 presents the conclusions and future directions.
2. RELATED WORK
In this section we review emerging research in the area of Internet delivery of data mining services. We also survey commercial data mining service providers. There are two aspects to the on-going research in delivering web-based data mining services. In [10], the focus is on providing data mining models as services on the Internet. In [7], the focus is on the exchange of messages and the description of task requests, service provider capabilities and access to infrastructure in a multiple service provider model.
The potential benefits and the intuitive soundness of the concept of hosting data mining services are leading to the emergence of a host of commercial business intelligence application service providers such as digiMine [1] and Information Discovery [5]. For a comparative analysis of the operations of these service providers, readers are referred to [7]. It must be noted that we have not been able to analyse and compare the QoS metrics used by these commercial data mining service providers, as this information is not publicly available. The currently predominant modus operandi for data mining ASPs is the single service provider model discussed in section 1. Thus, there is a one-to-one relationship between clients and service providers, and QoS metrics are negotiated and confidential. This mode of operation does not reflect a marketplace environment of e-services where clients can make ad-hoc requests and service providers compete for tasks. However, the emergence of e-services platforms such as E-Speak and enabling infrastructures such as UDDI, which facilitate the registration and discovery of service providers and provide an environment for interaction, has the potential to transform the current model of operation. Well-defined QoS metrics are vital for the dynamic nature of such a model and often have to be determined on a task-by-task basis.
In summary, predicting application run times is an important QoS metric for web-based data mining services. Clients benefit from it as it helps to impose QoS constraints on the service level agreements, and service providers benefit from it as it facilitates optimising resource utilisation and scheduling. To the best of our knowledge, there is no other research which focuses on applying run time estimation techniques to web-based data mining services.
3. APPLICATION RUN TIME ESTIMATION
Application run time estimation has uses in applications such as improving the performance of scheduling algorithms and estimating queue times in metacomputing environments to support better resource selection [11]. Application run time prediction algorithms [2][3][11] operate on the principle that applications that have similar characteristics have similar run times. Thus, a history of applications that executed in the past, along with their respective run times, is maintained in order to estimate the run time of a new task. Given an application for which the run time has to be predicted, the steps involved are:
1. Identify "similar" applications in the history.
2. Compute a statistical estimate (e.g. mean, linear regression) of the run times of the "similar" applications and use this as the predicted run time.
The fundamental problem is the definition of similarity. There can be diverse views on the criteria that make two applications similar. For instance, it is possible to say that two applications are similar because the same user on the same machine submitted them, or because they have the same application name and are required to operate on data of the same size. Thus, as discussed by [11], the basic question that needs to be addressed is the development of techniques that can effectively identify similar applications. Such techniques must be able to accurately choose the attributes of applications that best determine similarity. The obvious test for such techniques is the prediction accuracy of the estimates obtained by computing a statistical estimate of the run times of the applications identified as being similar. Thus, the closer the predicted run time is to the actual run time, the better the prediction accuracy of the technique. Several statistical measures can be used for computing the prediction (using the run times of similar applications that executed in the past), including measures of central tendency. Studies by [11] have shown that the mean performs very well as a predictor. Early work in this area by [2][3] proposed the use of "templates" of application characteristics to identify similar tasks in the history. Thus, for histories recorded from parallel computer workloads, [2] selected the queue name as the characteristic to determine similarity: applications that were assigned to the same queue were deemed similar. In [3] several templates are used for the same history, including (user, application name, number of nodes, age) and (user, application name). Thus, in order to estimate the run time for a new task, the following steps are performed (a sketch of this procedure is given after the list):
1. The template to be used is defined/selected.
2. The history is partitioned into different categories, where each category contains applications which have the same values for the attributes specified in the template. That is, if the template used is (user, application name), then tasks that had the same user and the same application name are placed into a distinct category.
3. The current task (whose run time has to be predicted) is matched against the different categories in the history to determine which set it belongs to.
4. The run time is predicted using the run times of the applications in the similar category.
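As a minimal illustration of these steps (not the prediction code used in our experiments; the attribute names and data are hypothetical), the following Python sketch partitions a history according to a fixed template and predicts the run time of a new task as the mean of the run times in its category:

```python
from statistics import mean

def predict_run_time(history, template, new_task):
    """Template-based run time prediction (illustrative sketch).

    history  : list of dicts, each holding recorded task attributes plus 'run_time'
    template : tuple of attribute names defining similarity, e.g. ('user', 'app_name')
    new_task : dict of attributes for the task whose run time is to be estimated
    """
    # Category key of the new task with respect to the template.
    key = tuple(new_task[attr] for attr in template)

    # Run times of all historical tasks that fall into the same category.
    similar = [rec['run_time'] for rec in history
               if tuple(rec[attr] for attr in template) == key]

    if not similar:
        return None           # no similar applications in the history
    return mean(similar)      # the mean has been shown to be a good predictor [11]

# Hypothetical usage
history = [
    {'user': 'alice', 'app_name': 'mine.exe', 'nodes': 8,  'run_time': 120.0},
    {'user': 'alice', 'app_name': 'mine.exe', 'nodes': 16, 'run_time': 150.0},
    {'user': 'bob',   'app_name': 'sort.exe', 'nodes': 4,  'run_time': 30.0},
]
print(predict_run_time(history, ('user', 'app_name'),
                       {'user': 'alice', 'app_name': 'mine.exe', 'nodes': 8}))  # 135.0
```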
This technique obviously has the limitation of requiring a sufficient history in order to function. In [11], it was argued that the manual selection of similarity templates has the following limitations:
• It is not always possible to identify the characteristics that best determine similarity.
• It is not generic: while a particular set of characteristics may be appropriate for one domain, it is not always applicable to other domains.
They proposed automated definition of and search for templates using genetic algorithms and greedy search techniques, and were able to obtain improved prediction accuracy with these techniques. For a detailed description of the template identification and search algorithms, readers are referred to [11]. In this paper we propose a rough sets based technique to address the problem of automatically selecting the characteristics that best define similarity. Rough sets provide an intuitively appropriate theory for identifying good "similarity templates" (sets of characteristics on the basis of which applications can be compared for similarity). The following sections of the paper discuss the theoretical soundness of our rough sets based approach and present experimental results establishing its good prediction accuracy and low mean error.
4. SIMILARITY TEMPLATE SELECTION USING ROUGH SETS
The theory of rough sets was introduced by Zdzislaw Pawlak [8] to deal with the classification and analysis of data tables. Rough sets are particularly suitable for handling uncertainty in data; when handling such data, rough sets produce an inexact or "rough" classification. Rough sets have been widely used in several application domains for rule generation, attribute reduction and prediction [6][8]. A distinctive feature of rough sets is that they operate using only the information provided by the data and do not require any additional model assumptions such as grades of membership or prior probabilities. A detailed treatment of rough sets is beyond the scope of this paper and readers are referred to [6]. In this section we address the following questions:
• The suitability of the theory of rough sets for identifying the characteristics that define similarity in application run time estimation
• A systematic method for applying the constructs of rough sets in this domain
4.1 Suitability of Rough Sets for Similarity Template Selection
The primary objective of similarity templates in task run time estimation is to identify characteristics of applications that define similarity. That is, it is necessary to establish that applications that are identical with respect to a given set of properties recorded in the history are similar and will consequently have similar run times. The issue is to identify this set of properties, on the basis of which applications can be compared. It is possible to attempt identical matching: if n characteristics are recorded in the history, two applications are defined as similar only if they are identical with respect to all n properties. However, this limits the ability to find similar applications considerably, since not all properties that are recorded are necessarily relevant in determining the run time. Such an
approach could also lead to errors, as applications which have important similarities might be considered dissimilar even if they differed only in a characteristic that had little bearing on the run time. This has been the main reason for previous efforts [2][3][11] to use subsets of the properties recorded in the history. Rough sets provide us with a sound theoretical basis for determining the properties that define similarity. A data set in rough sets is represented as a table called an information system, where each row is an object and each column is an attribute. The attributes are partitioned into condition attributes and decision attributes; the condition attributes determine the decision attribute. The history, as shown in figure 1, is a rough information system, where the objects are the previous applications whose run times (and other properties) have been recorded.

Figure 1: A Task History Modelled as a Rough Information System (the objects Object 1 ... Object n are previously executed applications; the condition attributes include application name, size, computational resources, etc.; the decision attribute is the run time)

The attributes in the information system are the properties of the applications that have been recorded. The decision attribute is the recorded application run time; the other recorded properties constitute the condition attributes. This model of a history intuitively facilitates reasoning about the recorded properties so as to identify the dependency between the recorded attributes and the run time. Thus, it is possible to concretise similarity in terms of the condition attributes that are relevant/significant in determining the decision attribute (i.e. the run time), and the set of attributes that have a strong dependency relation with the run time can form a good similarity template. The fact that rough sets operate entirely on the basis of the data that is available in the history and require no external additional information is of particular importance, as the lack of such information (beyond common sense and intuition) was the bane of manual similarity template selection techniques. Having cast the problem of application run time estimation as a rough information system, we now examine the fundamental concepts
that are applicable in determining the similarity template. Degree of Dependency. Using rough sets it is possible to measure the degree of dependency between two sets of attributes. The measure takes values [0,1] and higher values represent stronger degrees of dependency. It is evident that the problem of
identifying a similarity template can be stated as identifying a set of condition attributes in the history that have a strong degree of dependency with the run time. This measure computes the extent of the dependency between a set of condition attributes and the decision attribute and is therefore an important aspect of using rough sets for identifying similarity templates.
Significance of Attributes. The significance of an attribute is the extent by which the attribute alters the degree of dependency between a set of condition attributes and the decision attributes. If an attribute is "important" in discerning/determining the decision attribute, then its significance value, which is measured in the range [0,1], will be closer to 1. The similarity template should consist of a set of properties that are important for determining the run time. The significance of an attribute allows computing the extent to which an attribute affects the dependency between a set of condition attributes and a decision attribute. This measure allows quantification of the impact of individual attributes on the run time, which in turn enables the identification and consequent elimination of attributes that do not impact the run time and therefore should not be the basis for comparing applications for similarity.
Reduct. A reduct is a minimal set of condition attributes that has the same discerning power as the entire information system; all superfluous attributes are eliminated from a reduct. According to [4], while it is relatively simple to compute a single reduct, the general solution for finding all reducts is NP-hard. A similarity template should consist of the most important set of attributes that determine the run time, without any superfluous attributes. In other words, the similarity template is equivalent to a reduct which includes the most significant attributes.
It is evident that rough sets theory has highly suitable and appropriate constructs for identifying the properties that best define similarity for application run time estimation. A similarity template should have the following properties:
• It must include attributes that have a significant impact on the run time
• It must eliminate attributes that have no impact on the run time
This ensures that the criteria on which applications are compared for similarity have a significant bearing on the run time. As a consequence, applications that have the same characteristics with respect to these criteria will have similar run times. We have informally discussed the aspects of rough sets theory that are important for identifying similarity templates; a sketch of how the underlying measures can be computed is given below, after which we present our rough sets algorithm for identifying similarity templates in formal terms.
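The degree of dependency and attribute significance measures can be computed directly from the history. The following Python sketch is a minimal illustration of these standard rough sets measures, not the implementation used in our experiments; the attribute names are hypothetical and the run times are shown in a discretised form purely for illustration:

```python
from collections import defaultdict

def equivalence_classes(objects, attrs):
    """Partition objects (list of dicts) into classes that agree on all attributes in attrs."""
    classes = defaultdict(list)
    for i, obj in enumerate(objects):
        classes[tuple(obj[a] for a in attrs)].append(i)
    return list(classes.values())

def degree_of_dependency(objects, cond_attrs, dec_attrs):
    """K(C, D): fraction of objects whose condition class lies wholly within one decision class."""
    dec_classes = [set(c) for c in equivalence_classes(objects, dec_attrs)]
    positive = 0
    for c in equivalence_classes(objects, cond_attrs):
        if any(set(c) <= d for d in dec_classes):   # class is consistent with respect to D
            positive += len(c)
    return positive / len(objects)

def significance(objects, attr, reduct, dec_attrs):
    """SGF(a) = K(R U {a}, D) - K(R, D): change in dependency caused by adding attr to R."""
    return (degree_of_dependency(objects, list(reduct) + [attr], dec_attrs)
            - degree_of_dependency(objects, reduct, dec_attrs))

# Hypothetical history: condition attributes 'queue' and 'nodes'; decision attribute 'run_time'
history = [
    {'queue': 'batch', 'nodes': 8, 'run_time': 'long'},
    {'queue': 'batch', 'nodes': 8, 'run_time': 'long'},
    {'queue': 'inter', 'nodes': 4, 'run_time': 'short'},
    {'queue': 'inter', 'nodes': 8, 'run_time': 'short'},
]
print(degree_of_dependency(history, ['queue'], ['run_time']))   # 1.0: queue fully determines run time here
print(significance(history, 'nodes', ['queue'], ['run_time']))  # 0.0: nodes adds nothing given queue
```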
4.2 Algorithm for Similarity Template Selection
Our technique for applying rough sets to identify similarity templates centres round the concept of a reduct. A reduct by
definition is a set of condition attributes that is minimal (i.e. contains no superfluous attributes) and yet preserves the dependency relation between the condition and decision attributes by having the same classification power as the original set of condition attributes. As explained before, a reduct is conceptually equivalent to a similarity template in that it comprises the attributes that best determine the run time and are therefore best suited to form the basis for determining similarity. An information system can have several reducts and there are several techniques for computing reducts [6]. In computing a reduct for use as a similarity template in application run time estimation, we require the reduct to consist of attributes that are "significant" with respect to the run time. For this purpose, we use a variation of the reduct generation algorithm proposed by [4], which was intended to produce reducts that include user specified attributes and computes reducts by iteratively adding the most significant attribute. The modified algorithm we use to compute the reduct for use as a similarity template is shown in figure 2.

Figure 2: Reduct Generating Algorithm
1. Let A = {a1, a2, ..., an} be the set of condition attributes and D be the set of decision attributes
2. Let C be the D-core
3. REDUCT = C
4. A1 = A - REDUCT
5. Compute the significances (SGF) of the attributes in A1 and sort them in ascending order
6. For i = |A1| to 0
       K(REDUCT, D) = K(REDUCT, D) + SGF(ai)
       If K(REDUCT, D) = K(A, D)
           REDUCT = REDUCT ∪ {ai}
           Exit
       End If
       K(REDUCT, D) = K(REDUCT, D) - SGF(ai)
   End For
7. K(REDUCT, D) = K(REDUCT, D) + SGF(a|A1|)
8. While K(REDUCT, D) is not equal to K(A, D) Do
       REDUCT = REDUCT ∪ {ai} (where SGF(ai) is the highest of the attributes in A1)
       A1 = A1 - {ai}
       Compute the degree of dependency K(REDUCT, D)
   End While
9. N = |REDUCT|
10. For i = 0 to N
        If ai is not in C (the original attribute set of REDUCT at the start) and SGF(ai) is the least
            Remove ai from REDUCT
        End If
        Compute the degree of dependency K(REDUCT, D)
        If K(REDUCT, D) is not equal to K(A, D)
            REDUCT = REDUCT ∪ {ai}
        End If
    End For

This algorithm treats the D-core (note: the core of a rough information system is the intersection of all reducts, and the D-core is the core with respect to the set of decision attributes D) as the initial attribute set of the reduct. The attribute significances for the remaining attributes are then computed and sorted in ascending order. Each attribute (starting with the most significant) is added to the initial reduct and the value of the dependency between the initial reduct and the decision attribute is incremented by the SGF value of the attribute under consideration. This identifies any reduct of size |D-core + 1| and is one of the variations we introduce to the original algorithm developed by [4]. From our experiments we found that it is not unusual for the D-core to combine with a single attribute (typically the most significant attribute) to form a reduct. This iteration is computationally inexpensive as it involves only simple additions and no analysis. Normally, in order to determine the change in the degree of dependency caused by the addition of an attribute, the data has to be re-classified by computing equivalence relations for the changed attribute set. However, in this particular instance, we can show mathematically that simply adding the significance of the attribute to the dependency between the D-core and the decision attributes is sufficient. As part of computing and sorting the attribute significances, the degree of dependency between the initial reduct and the decision attributes (represented as K(REDUCT, D)) is already determined. Further, the attribute significances are calculated relative to the D-core. From section 4.1, we know that the significance of an attribute (SGF) is computed as:

SGF(a) = K(R ∪ {a}, D) - K(R, D)

In this case R = REDUCT, which implies that:

SGF(a) = K(REDUCT ∪ {a}, D) - K(REDUCT, D)

Since we know the value of K(REDUCT, D):

K(REDUCT ∪ {a}, D) = SGF(a) + K(REDUCT, D)

Thus, if a reduct of size |D-core + 1| exists, we find it without further computation, and since we start with the most significant attribute, we attempt to find reducts that involve the attributes with the highest significances. In the absence of such a reduct, the most significant attribute is added to the initial reduct and the dependency is computed as discussed. Then attributes (starting from the next most significant one) are iteratively added to the "new" reduct and the dependency between the new reduct and the decision attributes is re-computed. This re-computation of the dependency is expensive as it involves finding equivalence classes; the previous iteration was principally an attempt to avoid it. This continues until the dependency between the reduct (which grows as attributes are added) and the decision attributes equals the dependency between the full set of condition attributes and the decision attributes. At the end of this iteration, the reduct is the set of attributes that have some significance (i.e. are useful in determining the decision attributes). However, at this stage the reduct might not be minimal, as it could contain redundant attributes. Since reducts must be minimal, the algorithm then removes an attribute and checks whether the removal changes the dependency; a superfluous attribute will not affect the dependency and can be removed.

The computationally most expensive component of this algorithm is the need to determine equivalence classes several times, since this is necessary to determine the degree of dependency. As an implementation enhancement, we have introduced the notion of an "incremental equivalence class", which reduces the number of comparisons that have to be made when computing equivalence classes. Typically, in step 8 of the above algorithm, the dependency is computed as follows. Let REDUCT be the set of initial attributes. An attribute a is added to the reduct and equivalence classes are determined for the data with respect to the attributes REDUCT ∪ {a} in order to find the new dependency. When the next attribute b has to be added to the REDUCT, the equivalence classes have to be determined again, this time with respect to REDUCT, attribute a and attribute b. We can show that instead of computing equivalence classes for {REDUCT, a, b} from scratch, it is sufficient to compute equivalence classes for {b} alone within the existing classes. We term this incremental determination of equivalence classes.

In this section, we have presented the theoretical aspects of our rough sets based approach to defining similarity for application run time estimation. A sketch of the incremental refinement step is given below; the next section presents the experimental results of our technique.
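The following Python sketch illustrates the incremental determination of equivalence classes; it assumes that the classes for REDUCT ∪ {b} are obtained by refining the existing classes for REDUCT using only the values of b. The data and attribute names are hypothetical and this is not the code used in our implementation:

```python
from collections import defaultdict

def refine(classes, objects, new_attr):
    """Refine an existing partition by a single newly added attribute.

    classes  : current equivalence classes (lists of object indices) w.r.t. REDUCT
    new_attr : the attribute being added to REDUCT
    Only the values of new_attr are compared, and only within each existing class,
    instead of re-partitioning the data on the whole of REDUCT plus the new attribute.
    """
    refined = []
    for cls in classes:
        groups = defaultdict(list)
        for i in cls:
            groups[objects[i][new_attr]].append(i)
        refined.extend(groups.values())
    return refined

# Hypothetical usage: start from the partition induced by the current REDUCT,
# then add attributes one at a time.
objects = [
    {'queue': 'batch', 'nodes': 8,  'user': 'alice'},
    {'queue': 'batch', 'nodes': 16, 'user': 'alice'},
    {'queue': 'inter', 'nodes': 8,  'user': 'bob'},
]
classes = [[0, 1], [2]]                        # partition w.r.t. REDUCT = {'queue'}
classes = refine(classes, objects, 'nodes')    # now w.r.t. {'queue', 'nodes'}
classes = refine(classes, objects, 'user')     # now w.r.t. {'queue', 'nodes', 'user'}
print(classes)                                 # [[0], [1], [2]]
```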
5. EXPERIMENTAL RESULTS AND ANALYSIS
The general purpose of the experimentation is to establish the prediction accuracy of our rough sets algorithm for application run time estimation. However, the objectives of the experimentation need to be placed in the context of the objectives of this work. There are two questions that have to be addressed in this regard:
1. We have proposed a rough sets approach as a viable alternative to the automated similarity template selection techniques for application run time estimation developed by [11], who used genetic algorithms and greedy search. Therefore, it is important to compare the predictive accuracy of the techniques to justify the use of a rough sets approach.
2. While application run time estimation has its uses in several domains, our principal focus is on its use as a QoS metric for web-based data mining services. Therefore, it is important to test our algorithm on histories of data mining tasks.
Our first objective establishes the general validity and justification for using our algorithm, and our second objective demonstrates its applicability in the domain of web-based data mining services. As our research is currently in progress, at this stage we are only able to present experimental results that meet the first objective. It must be noted that without demonstrating the improved performance of our technique over existing approaches, its application in any area cannot be justified. Therefore, in this section of the paper, we present a comparative evaluation of the predictive accuracy of our rough sets algorithm against the techniques presented by [11], namely genetic algorithms and greedy search. In order to compare our technique with the greedy search and genetic algorithm methods, we performed experiments with two of the four data sets used by them (these two were publicly available). The two data sets are parallel computer workloads collected at the San Diego Supercomputer Center (SDSC) by Allen Downey for the years 1995 and 1996, termed SDSC95 and SDSC96 respectively. These data sets have the following information recorded for each job: account name, login name, partition to which the job was allocated, the number of nodes for the job, the job type (batch or interactive), the status of the job (successful or not), the number of requested cpu hours, the name of the queue to which the job was allocated, the rate of charging for cpu hours and idle hours, and the duration of the task in terms of when it was submitted, when it started and when it completed. Thus the data sets provided us with a history on which we could test the run time prediction accuracy of our algorithm. The experimental runs were conducted as follows. For each run, we took two samples from the data set: a history and a set of test cases. We tested our algorithm by varying the size of the history from 1000 to 100 records. For each run we used 25-30 test cases. A test case is distinguished from the records in the history by removing the run time information; thus, a test case consists of all the recorded information except the run time, and the run time recorded for the test case is the actual run time of the task. The idea is to determine an estimated run time using our
prediction technique and compare it with the actual run time of the task. For each data set (SDSC95 and SDSC96) we conducted 10 experimental runs. For each test case, we recorded the reduct, the estimated run time and the actual run time. In [11], the results are published in terms of the mean error in run time (i.e. the mean variation between the actual and the estimated run times) for the greedy search (GS) and genetic algorithm (GA) techniques. Since the mean error is the mean of the difference between the actual and the estimated values, a lower mean error indicates better prediction accuracy. In order to make a fair comparison, we compare the best results obtained by [11] with our best, worst and average mean errors. The comparison for the SDSC95 data set is presented in figure 3 (note: "Smith (GA-best)" and "Smith (GS-best)" refer to the best results of the genetic algorithm and the greedy search techniques of [11], respectively).

Figure 3: Mean Error Comparison for the SDSC95 Data Set. Mean error in minutes: Rough (Best) 3.4, Rough (Avg) 28.76, Rough (Worst) 61.46, Smith (GA-best) 59.65, Smith (GS-best) 67.63.

The mean error (in minutes) shows that the rough sets algorithm performs better than the greedy search technique by having a significantly lower mean error in all three cases (i.e. we compare the best mean error recorded by [11] with our best, worst and average mean errors). In comparison to the genetic algorithm technique, the rough sets approach outperforms it in the best and average cases and performs marginally worse in the worst case (again, this is a comparison of the best case for the GA with the worst case for the rough sets approach). The same comparison for the SDSC96 data set is presented in figure 4.

Figure 4: Mean Error Comparison for the SDSC96 Data Set. Mean error in minutes: Rough (Best) 9.95, Rough (Avg) 47.02, Rough (Worst) 110.44, Smith (GA-best) 74.56, Smith (GS-best) 76.2.

In this case we find that in the best and average cases the rough sets approach performs better than the best case of both the GA and the GS. However, the mean error for the worst case of the rough sets approach is significantly higher than the best cases of the GA and GS. The above comparisons show in general that the prediction accuracy of the rough sets approach is good. When compared with the best results of the other techniques, the rough sets approach fares well in its best and average results; its worst case is arguably worse than the best cases of the others. However, we believe that the most relevant comparison is that between the average case of the rough sets approach and the best cases of the greedy search and genetic algorithms, and it is here that the good performance of the rough sets technique is best evident. Another measure used for comparison is the mean error as a percentage of the mean run times; a comparison is presented in table 1.

Table 1: Mean Error as a Percentage of the Mean Run Times

                          SDSC95    SDSC96
  Smith (GA-best)          55.35     44.79
  Smith (GS-best)          62.76     45.77
  Rough Sets (Best)         0.94     72.59
  Rough Sets (Average)     29.59    126.17
  Rough Sets (Worst)       10.99     85.50

In terms of the mean error as a percentage of the mean run time, the rough sets approach performs exceedingly well on the SDSC95 data set but falls short on the SDSC96 data set. This is principally because we had high mean errors for two experimental runs, which involved applications with long run times. Rough sets operate only on the basis of the data, and on examination of the data we discovered that the applications which resulted in large errors were affected by two jobs in the history which were identical to the test case (not merely with respect to the similarity template, but with respect to all recorded characteristics) but had completely different run times. In general, the algorithm works on the basis of what has been recorded, and in this particular case the recorded characteristics did not sufficiently distinguish the jobs.
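For clarity, the two error measures reported above (the mean error in minutes and the mean error as a percentage of the mean run time) can be computed as in the following sketch; the run time values shown are hypothetical and the code is illustrative only:

```python
def mean_error(actual, predicted):
    """Mean absolute difference between actual and predicted run times (e.g. in minutes)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_error_percentage(actual, predicted):
    """Mean error expressed as a percentage of the mean actual run time."""
    return 100.0 * mean_error(actual, predicted) / (sum(actual) / len(actual))

# Hypothetical actual and estimated run times (in minutes) for a handful of test cases
actual    = [120.0, 45.0, 300.0, 60.0]
predicted = [110.0, 50.0, 280.0, 75.0]
print(mean_error(actual, predicted))             # 12.5 minutes
print(mean_error_percentage(actual, predicted))  # approximately 9.5 per cent
```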
The above experimentation shows that a rough sets based approach has good results for application run time estimation in general. This gives a basis for testing it on histories of data mining tasks. We are currently building such a history and we intend to publish results from this experimentation.
6. CONCLUSIONS AND FUTURE WORK
We have presented response time as a QoS metric for web-based data mining service providers. QoS metrics are becoming increasingly important for both customers of application services and ASPs. For data mining ASPs, the ability to estimate application run time a priori benefits both QoS guarantees and the scheduling and optimisation of resource utilisation. We have presented a novel rough sets approach to estimating application run time. The theoretical foundation of rough sets provides an intuitive solution to the problem of application run time estimation. We have discussed the suitability of the different rough sets constructs and shown how they can be applied to defining similarity for estimating application run time. This is validated by the experimental results, which demonstrate the good prediction accuracy of our rough sets approach. Our current experimental results use histories from parallel workloads. While we have established the general validity of our approach, we realise that experimentation with data mining histories is important, and we are currently developing a history of data mining tasks.
7. ACKNOWLEDGEMENTS
We thank Allen Downey for making the SDSC workloads publicly available and for his permission to use them in our experiments. We thank the Distributed Systems and Technologies Centre (DSTC), Australia for partially funding this project.

8. REFERENCES
[1] digiMine, URL: http://www.digiMine.com
[2] Downey, A.B., (1997), "Predicting Queue Times on Space-Sharing Parallel Computers", in Proceedings of the Eleventh International Parallel Processing Symposium (IPPS), Geneva, Switzerland, April.
[3] Gibbons, R., (1997), "A Historical Application Profiler for Use by Parallel Schedulers", Lecture Notes in Computer Science (LNCS), 1291, Springer-Verlag, pp. 58-75.
[4] Hu, X., (1995), "Knowledge Discovery in Databases: An Attribute-Oriented Rough Sets Approach", PhD Thesis, University of Regina, Canada.
[5] Information Discovery, URL: http://www.datamine.aa.psiweb.com
[6] Komorowski, J., Pawlak, Z., Polkowski, L., and Skowron, A., (1998), "Rough Sets: A Tutorial", in Rough-Fuzzy Hybridization: A New Trend in Decision Making, (eds) S.K. Pal and A. Skowron, Springer-Verlag, pp. 3-98.
[7] Krishnaswamy, S., Zaslavsky, A., and Loke, S.W., (2001), "Towards Data Mining Services on the Internet with a Multiple Service Provider Model: An XML Based Approach", Journal of Electronic Commerce Research, Special Issue on Electronic Commerce and Service Operations, Vol. 2, No. 3, http://www.csulb.au/journals/jecr
[8] Pawlak, Z., (1992), "Rough Sets: Theoretical Aspects of Reasoning about Data", Kluwer Academic Publishers, London.
[9] Sahai, A., Ouyang, J., Machiraju, V., and Werster, K., (2001), "BizQoS: Specifying and Guaranteeing Quality of Service for Web Services through Real Time Measurement and Adaptive Control", Hewlett-Packard Labs Technical Report HPL-2001-96, http://www.hpl.hp.com/techreports/2001/HPL-2001-134.html
[10] Sarawagi, S., and Nagaralu, S.H., (2000), "Data Mining Models as Services on the Internet", SIGKDD Explorations, Vol. 2, Issue 1, http://www.acm.org/sigkdd/explorations/
[11] Smith, W., Foster, I., and Taylor, V., (1998), "Predicting Application Run Times Using Historical Information", in Proc. of the IPPS/SPDP '99 Workshop on Job Scheduling Strategies for Parallel Processing.
[12] Tewari, G., and Maes, P., (2000), "A Generalized Platform for the Specification, Valuation, and Brokering of Heterogeneous Resources in Electronic Markets", in Lecture Notes in Artificial Intelligence (LNAI), 2033, Springer-Verlag, pp. 7-24.
[13] Web Quality of Service (WebQoS), (2001), Hewlett-Packard, Technical White Paper, http://www.hp.com/products1/webqos/infolibrary/whitepapers/wp.html
[14] Wolter, K., and Moorsel, A., (2001), "The Relationship between Quality of Service and Business Metrics: Monitoring, Notification and Optimization", Hewlett-Packard Labs Technical Report HPL-2001-96.