An ECA-Rule-Based Workflow Approach for Advance Resource Reservation in ShanghaiGrid

Yi Wang, Minglu Li, Jian Cao, Ying Li, Lin Chen, Xinhua Lin, Feilong Tang
Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
{wangsuper, li-ml, cao-jian, liying, chenling, lin-xh, tang-fl}@cs.sjtu.edu.cn

Abstract
ShanghaiGrid is the first metropolitan grid in China. The project aims to build an environment in which distributed resources in Shanghai can be shared conveniently using grid technology. Resources are always finite, however, especially expensive and specialized ones, so users must reserve them in advance before using them. Existing solutions require users to fix the absolute start time of a reservation beforehand, which is rarely possible in practice. Users can, however, often estimate the start time relative to a specific event, e.g. the beginning or end of a task. As part of ShanghaiGrid, we have developed an ECA rule-based workflow management system (EWMS) that arranges the relative start time of advance resource reservations according to users' demands. This paper introduces the architecture of EWMS and gives a modeling demo that shows how advance resource reservation is realized through the system.
1. Introduction
ShanghaiGrid, one of the top five grand projects in China, is an ongoing metropolitan grid project to advance the digitalization of the city. It aims at constructing a metropolis-area information service grid infrastructure and establishing an open standard for widespread upper-layer applications from both research communities and official departments, so that distributed resources in Shanghai can be shared conveniently using grid technology [1]. Yet these resources are still finite, especially expensive and specialized resources such as supercomputers and large instruments, which are in demand by many users. The demand for high QoS has stimulated research interest in advance resource reservation (ARR) in grids.

Present solutions for ARR focus on the concrete mechanism that realizes advance resource reservation and require users to fix the absolute start time of the reservation. If the start time is set too early, the user wastes money, since most reservations are charged according to the reserved time. Conversely, if the start time is set too late, the user cannot obtain the resource and finish the work on time.

We present an approach that uses workflow to solve this problem. There are many ways to define and describe a workflow model, such as the WfMC definition language, RAD graphs, and the EPCM model. We choose ECA rules, which originate from research on active databases, to model workflows for ARR. There are at least three good reasons for using ECA rules for Web services workflow modeling and analysis:
- They are easily understood by end users;
- They can express complicated logical relationships among web services;
- They are well suited to graphical realization.

In this paper, an ECA rule-based workflow management system (EWMS) is presented. Through the system, users can construct and instantiate workflow models that reserve resources in advance at specific points in a working process. ECA rules play the main role in our system.

The rest of this paper is organized as follows. Section 2 gives a brief overview of related work. Section 3 redefines some basic concepts of ECA rules to meet the needs of our system. Section 4 gives an overview of our workflow management system, EWMS. Section 5 presents a detailed modeling example for advance resource reservation. The last section concludes the paper and briefly points out future work.
2. Related Work
Advance resource reservation was put forward with the development of distributed multimedia applications [2]. Lars C. Wolf and Ralf Steinmetz give an overview of the concepts of resource reservation in advance [3]. Different techniques have been proposed for
Proceedings of the 2006 IEEE Asia-Pacific Conference on Services Computing (APSCC'06) 0-7695-2751-5/06 $20.00 © 2006
representing advance reservations, for balancing immediate and advance reservations [4], for advance reservation of predictive flows [5], and for handling multicast [6]. Ian Foster and his colleagues present an architecture named GARA that supports advance reservation and co-allocation [7]. All these proposals focus on the concrete mechanism that realizes advance resource reservation and require users to fix the absolute start time of the reservation, which is rarely possible in practice. In contrast, we observe that users can often estimate accurately the relative time at which a reservation should start after a specific event, e.g. the beginning or end of a task. We therefore present an ECA rule-based workflow approach to advance resource reservation.

The Event-Condition-Action (ECA) rule originates from research on active databases [8][9]. Such rules make data repositories react to internal or external events and trigger a chain of activities that includes notifying users and applications or performing database updates. This process resembles a business process, so workflow researchers took advantage of ECA rules and proposed a framework for ECA rule-based workflow [10]. Like some other workflow frameworks, however, that framework does not fit the development of Internet applications in a grid environment.
3. ECA Rule
The semantics of ECA rules can be described straightforwardly: when the event occurs (is signaled), the system evaluates the specified condition; if the condition is satisfied, the corresponding actions are executed. To use ECA rules in our system, we redefine some basic concepts.

Definition 1: An ECA rule-based workflow model (EWM) is a seven-tuple $(E, C, A, R, LC, F, DO)$, where:
- $E$ is a finite set of events,
- $C$ is a finite set of conditions,
- $A$ is a finite set of activities, which also serve as the actions in our system,
- $R$ is a finite set of rules, $R \subseteq E \times C \times A$,
- $LC$ is a finite set of logical connectors,
- $F \subseteq (A \times LC) \cup (LC \times A) \cup (LC \times LC) \cup (A \times A)$ is a set of flows,
- $DO$ is a finite set of data objects used in the workflow model.

Definition 2: An event is a function that maps time to a boolean value:

$$E : T \to \{\mathrm{True}, \mathrm{False}\}, \qquad E(t) = \begin{cases} \mathrm{True}, & \text{if an event of kind } E \text{ happens at time } t \\ \mathrm{False}, & \text{otherwise} \end{cases}$$

Events are divided into atomic events ($E_a$) and composite events ($E_c$).
- There are six kinds of atomic events for each activity: $E_a \subseteq \{ e \mid \forall a \in A,\ e \in \{ \mathit{Initialized}(a), \mathit{Started}(a), \mathit{EndOf}(a), \mathit{Overtime}(a), \mathit{Aborted}(a), \mathit{Error}(a) \} \}$. The six atomic event types denote the different execution states of an activity.
- $E_c$ is a set of atomic or composite events related by event operators. We define two operators, AND and OR: $e_1$ AND $e_2$ means that both $e_1$ and $e_2$ have to happen; $e_1$ OR $e_2$ means that at least one of $e_1$ and $e_2$ has to happen.

Definition 3: A condition is a constraint, defined by the process designer, on the relationship between objects or between objects and constants: $C : Exp \to \{\mathrm{True}, \mathrm{False}\}$, where $Exp$ denotes an expression that yields a boolean result.

Definition 4: An activity is a software application or a set of procedures executed to accomplish a mission. EWMS has several activity types, distinguished by function. If $a$ is an activity ($a \in A$), $a_{in}$ denotes the number of flows that point to $a$, and $a_{out}$ denotes the number of flows that leave $a$.
- $A_s$ and $A_e$ are the start and end activity sets. An EWM has exactly one start activity and one end activity; for $a \in A_s$ and $b \in A_e$, $a_{in} = 0$ and $b_{out} = 0$.
- $A_i$ is the set of activities responsible for invoking a service.
- $A_d$ is the set of activities that delay for a specific time.
- $A_{TX}$ is the set of activities that transform XML documents by XSLT.
- $A_{SV}$ is the set of activities responsible for assigning values to variables.
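Definition 2 can be illustrated with a minimal Python sketch. It treats an event as a function from time to a boolean and builds composite events with AND and OR. Two points are assumptions made for illustration and are not stated in the definitions: an event is taken to hold at time t once it has occurred at or before t (so that AND can combine events that occur at different moments), and the log layout and the helper names `atomic`, `event_and` and `event_or` are hypothetical.

```python
from typing import Callable, Iterable, Tuple

Time = float
Event = Callable[[Time], bool]        # Definition 2: E : T -> {True, False}
LogEntry = Tuple[str, str, Time]      # (kind, activity, timestamp); hypothetical layout

# The six atomic event kinds listed in Definition 2.
ATOMIC_KINDS = {"Initialized", "Started", "EndOf", "Overtime", "Aborted", "Error"}


def atomic(kind: str, activity: str, log: Iterable[LogEntry]) -> Event:
    """Atomic event that holds at time t once (kind, activity) is logged at or before t."""
    if kind not in ATOMIC_KINDS:
        raise ValueError(f"unknown atomic event kind: {kind}")
    entries = list(log)
    return lambda t: any(k == kind and a == activity and ts <= t for k, a, ts in entries)


def event_and(e1: Event, e2: Event) -> Event:
    """Composite event: both e1 and e2 have happened."""
    return lambda t: e1(t) and e2(t)


def event_or(e1: Event, e2: Event) -> Event:
    """Composite event: at least one of e1 and e2 has happened."""
    return lambda t: e1(t) or e2(t)


# Illustrative log with made-up timestamps (seconds).
log = [("EndOf", "reserve", 5.0), ("EndOf", "integrate", 7.0)]
end_reserve = atomic("EndOf", "reserve", log)
end_integrate = atomic("EndOf", "integrate", log)
both = event_and(end_reserve, end_integrate)
either = event_or(end_reserve, end_integrate)
```

With this log, `both` becomes true only at t = 7.0, when the later of the two events has occurred, whereas `either` is already true at t = 5.0.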
Definition 5: An ECA rule is a triple $R = \langle E, C, A \rangle$, where $E$ is the event that triggers the rule and can be an atomic or a composite event, $C$ is the condition set that reflects the status of the system and environment and is evaluated when the rule is triggered by its event, and $A$ is the action set that is executed when the rule is triggered and its condition is satisfied. An ECA rule states that if $E$ happens and the condition set $C$ is satisfied, then all the actions of $A$ are executed.

Definition 6: $LC$ expresses the logical relation between conditions and actions. We define a special transfer function $TC$ that maps each logical connector onto a connector type: $TC : LC \to \{ \langle In, Out \rangle \mid In, Out \in \{\mathrm{AND}, \mathrm{OR}\} \}$.
"In" represents the incoming flow logic, whereas "Out" represents the outgoing flow logic. For example, suppose $\exists lc \in LC$ with $TC(lc) = \langle \mathrm{AND}, \mathrm{OR} \rangle$ and $\exists f_1, f_2 \in \{ f \mid f \in A \times \{lc\} \cup LC \times \{lc\} \}$. If the triggering events of $f_1$ and $f_2$ are $e_1$ and $e_2$ respectively, then the triggering event of $lc$ is $e_1$ AND $e_2$. This is equivalent to the AND-JOIN workflow pattern in Petri nets. Similarly, if there are two outgoing control flows to activities, the execution pattern follows the "Out" operator, in this case OR. This is equivalent to the OR-SPLIT workflow pattern in Petri nets. There are therefore four logical connector types: and-and, and-or, or-and, and or-or.

Definition 7: $F$ is the set of directed flows. It determines the execution and data transfer sequence of an EWM: $F \subseteq \{ (P_s, P_e) \mid P_s, P_e \in A \cup LC \}$, where $P_s$ is the start point of the flow and $P_e$ is its end point. $F$ is divided into data flows ($F_D$) and control flows ($F_C$):

$$F = F_C \cup F_D, \qquad F_C \cap F_D = \emptyset,$$
$$F_D \subseteq \{ (P_s, P_e) \mid P_s \in A,\ P_e \in A \},$$
$$F_C \subseteq \{ (P_s, P_e) \mid P_s \in A,\ P_e \in LC \} \cup \{ (P_s, P_e) \mid P_s \in LC,\ P_e \in A \} \cup \{ (P_s, P_e) \mid P_s \in LC,\ P_e \in LC \}.$$

Definition 8: There are four categories of data object (DO) for the control and exchange of data in an ECA rule-based workflow model:
- XML Objects are XML schema-based data, generally used to represent the input and output messages of services.
- Other Document Objects are an abstract representation of data in document formats other than XML, such as Word, PDF, and RTF.
- Inherent Variables are of basic data types. They can be used to set guard conditions or to act as decision points.
- Object Variables are data items extracted from an XML Object or Other Document Object. They are usually used to assign values from one field of an XML Object to another.
Both inherent variables and object variables have five types: string, integer, float, double and boolean.

4. EWMS
4.1. Architecture
Figure 1. Architecture of EWMS

EWMS, an important part of the ShanghaiGrid project, is an ECA rule-based workflow management system. EWMS comprises five layers: user layer, engine layer, agent layer, grid middleware layer and resource layer (see Fig. 1). The user layer supplies the interfaces with which people design workflow processes, enact the engine, and monitor its execution. The workflow engine receives invocation requests, creates workflow instances, executes the instances, and saves the execution log to the database for monitoring. The multi-agent platform automatically discovers, integrates and invokes services at the user's request. The grid middleware layer consists of software for using grid resources directly; its core is a component named SGSIC (ShanghaiGrid Service Invocation Core), described in more detail in Section 4.2. The resource layer mainly involves, but is not limited to, the service resources in a service grid.
4.2. SGSIC
SGSIC is the key component of the grid middleware layer in EWMS. It supports the three service standards that exist in the present grid environment: standard web services, OGSI services based on GT3, and WSRF services based on GT4. As shown in Fig. 2, SGSIC includes two sub-layers: a basic layer and an extended layer.
Figure 2. Framework of SGSIC

The basic layer includes the essential components for service invocation and management: the task manager, data manager, service manager and information center. The service manager registers new services, indexes them, and searches for a suitable service according to the user's requirements. The task manager manages the input parameters and output results of service invocations. The data manager transfers data between the user and the grid site. The information center stores all data published by service providers. The adapter encapsulates the implementation details of invoking different kinds of services and gives the management components a uniform interface.

The extended layer consists of several components that offer more advanced abilities, such as service counting, resource monitoring, service dispatch, and grid transaction management. These components help users manage and monitor the grid environment more effectively.

5. Case Study – Traffic Flow Analysis
5.1. Background
Traffic jam control is a big problem for large cities. Solving it requires collecting real-time traffic data and analyzing the data quickly to obtain a control solution. Shanghai has many traffic monitoring instruments and substantial computing capacity, but these resources belong to different government agencies. The Traffic Information Grid (TIG) is a typical domain grid in ShanghaiGrid that provides traffic information and guidance to people. It uses grid technology to integrate traffic information, share data and computing resources, analyze traffic flow, and resolve traffic jam problems. Fig. 3 shows the details [1].
Figure 3. Architecture of TIG

Data Collection Nodes (DCN) provide dynamic traffic information through various traffic monitoring instruments. The Data Proxy Servers (DPS) manage the data and provide grid data services to other grid applications. TIG Virtual Database Nodes (VDBN) provide Virtual Database Services (VDBS), which support uniform access for retrieving data from the DPS. The returned data are cached in a cache server and are destroyed automatically after a period defined by the Time To Live (TTL) property. The TIG Computing Node (CN) performs the computing tasks. It is a grid node that provides cluster computing services. Each CN contains a Cluster Resource Information Server (CRIS), a PC cluster (Linux servers), and a supercomputer. For example, in the SCC Computing Node the supercomputer is a Dawning 3000 and the PC cluster contains about 30 Linux PC servers. The SJTU, FDU, SHU and TJU computing nodes have similar configurations. Note that these supercomputers and clusters serve other people and applications at the same time, so users must reserve these computing resources in advance. The TIG Information Server (IS) provides the Information Service. All grid nodes are registered in the IS, which provides information about the grid resources. Since we only want to validate the feasibility of our advance resource reservation method, our demo uses emulators to simulate the ShanghaiGrid environment.
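The TTL-based expiry of cached VDBS results mentioned above can be sketched as follows. This is only an illustration of the general technique, not the TIG cache server: the class name `TTLCache`, its methods, the injected clock and the sample key and value are all hypothetical.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class TTLCache:
    """Cache whose entries are dropped once their time to live has elapsed."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be demonstrated without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def put(self, key: Hashable, value: Any) -> None:
        # Store the value together with its absolute expiry time.
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # The entry has outlived its TTL: destroy it and report a miss.
            del self._entries[key]
            return None
        return value


# Demonstration with a manually advanced clock (all values are made up).
now = [0.0]
cache = TTLCache(ttl_seconds=10.0, clock=lambda: now[0])
cache.put("road-42", {"vehicles_per_minute": 31})
fresh = cache.get("road-42")   # read before the TTL elapses
now[0] = 10.0                  # advance the clock to the expiry time
stale = cache.get("road-42")   # read after the TTL has elapsed
```

Here `fresh` holds the cached value and `stale` is `None`, because the second read happens once the 10-second TTL has elapsed.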
5.2. Workflow Model
Fig. 4 shows the workflow model of the traffic flow analysis. The right side of Fig. 4 is a graphic design panel that shows the flow of the model. Distinct icons in the panel denote activities that invoke a service, activities that delay for a time set by the designer, control flows, and and-and connectors. The left side of Fig. 4 is an object navigation tree, which reflects the relations among the objects that belong to the workflow model.
Figure 4. Workflow Model for Traffic Flow Analysis

The model describes the following information handling process. The process starts with a service that collects traffic data, performed by the DCN. The data are then integrated by the DPS and VDBN, which includes data transmission, data cleaning and data reconstruction. Since the integration takes some time, we use a "delay" activity to decide the time point of the reservation. In our simulation environment, the delay is set to 5 minutes. Five minutes after the "integrate" service starts, the system invokes the "reserve" service to perform the advance resource reservation. The "compute" service can be invoked only when both the "integrate" and "reserve" services have ended. Note that two control flows begin at the "integrate" service and that their triggering events differ: the start of "integrate" and the end of "integrate". This reflects the advantage and effect of ECA rules.

5.3. Instantiating and Result Analysis
Once the design of the traffic flow analysis workflow is finished, it can be submitted through the user portal to the workflow engine for execution. Figure 5 shows a management tool for the workflow engine. It supports deploying, starting and monitoring the workflow engine service. When a workflow request arrives, the workflow engine obtains a copy of the workflow model from the database and then takes charge of the specific invocation and routing according to the ECA rules defined in the workflow model. The progress of the workflow instance is also saved in an instance database.

Figure 5. Engine Management Tool
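The rule-driven routing of the Section 5.2 model can be sketched as a small event-driven simulation. This is a simplified illustration under stated assumptions, not the EWMS engine: conditions are taken to be always true, each rule fires at most once, composite triggers use AND only, and the names `Rule`, `simulate`, `RULES` and `DURATIONS` as well as the activity durations (in seconds) are hypothetical, not measured values.

```python
import heapq
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple

AtomicEvent = Tuple[str, str]  # (kind, activity), e.g. ("EndOf", "integrate")


@dataclass(frozen=True)
class Rule:
    trigger: FrozenSet[AtomicEvent]  # all listed events must have occurred (AND)
    action: str                      # activity started when the rule fires


def simulate(rules: List[Rule], durations: Dict[str, float], first: str = "start"):
    """Return {activity: (start, end)} in seconds since instantiation."""
    occurred = set()                 # atomic events signaled so far
    timeline = {}                    # activity -> [start, end]
    fired = set()                    # indices of rules that have already fired
    queue = [(0.0, 0, "Started", first)]
    seq = 1                          # tie-breaker so heap entries stay comparable
    while queue:
        now, _, kind, activity = heapq.heappop(queue)
        occurred.add((kind, activity))
        if kind == "Started":
            timeline[activity] = [now, None]
            heapq.heappush(queue, (now + durations[activity], seq, "EndOf", activity))
            seq += 1
        else:
            timeline[activity][1] = now
        # Event part of each ECA rule: fire once every trigger event has occurred.
        for index, rule in enumerate(rules):
            if index not in fired and rule.trigger <= occurred:
                fired.add(index)
                heapq.heappush(queue, (now, seq, "Started", rule.action))
                seq += 1
    return {name: (start, end) for name, (start, end) in timeline.items()}


# Rules mirroring the control flows of the traffic flow analysis model.
RULES = [
    Rule(frozenset({("EndOf", "start")}), "collect"),
    Rule(frozenset({("EndOf", "collect")}), "integrate"),
    Rule(frozenset({("Started", "integrate")}), "delay"),
    Rule(frozenset({("EndOf", "delay")}), "reserve"),
    Rule(frozenset({("EndOf", "reserve"), ("EndOf", "integrate")}), "compute"),
    Rule(frozenset({("EndOf", "compute")}), "end"),
]

# Illustrative durations in seconds (assumed for the sketch).
DURATIONS = {"start": 0, "collect": 600, "integrate": 380,
             "delay": 300, "reserve": 10, "compute": 360, "end": 0}

timeline = simulate(RULES, DURATIONS)
```

With these durations the delay starts together with "integrate" at 600 s, "reserve" runs from 900 s to 910 s, and "compute" starts only at 980 s, when both of its trigger events have occurred.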
Table 1 shows a partial snapshot of the activity information of the instance. The startTime and endTime columns give the timestamps of each activity's start and end in the format "yyyy-mm-dd hh:mm:ss". When the Integrate activity starts at "2006-04-21 10:18:23", the Delay activity is triggered and waits for 5 minutes as designed. The Reserve activity is triggered once the delay has elapsed, and the Compute activity starts when both the Integrate and Reserve activities have ended. The Integrate activity runs for more than 6 minutes and finishes about one minute after the reservation completes, which means that about one minute of reservation cost is wasted. By analyzing the instance database, we can revise the workflow model to obtain a better reservation policy. In this demo, for example, we can adjust the delay time to 6 minutes.
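The arithmetic behind this observation can be checked with a short Python sketch that uses the Integrate and Reserve timestamps from Table 1. The helper `idle_reservation` is hypothetical, and it assumes that the Integrate and Reserve activities would take the same time again if the delay were changed.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"  # Python equivalent of "yyyy-mm-dd hh:mm:ss"


def ts(text: str) -> datetime:
    return datetime.strptime(text, FMT)


# Timestamps of the Integrate and Reserve activities, taken from Table 1.
integrate_start = ts("2006-04-21 10:18:23")
integrate_end = ts("2006-04-21 10:24:47")
reserve_start = ts("2006-04-21 10:23:23")
reserve_end = ts("2006-04-21 10:23:30")

integrate_duration = integrate_end - integrate_start  # 6 min 24 s
reserve_duration = reserve_end - reserve_start        # 7 s


def idle_reservation(delay_seconds: int) -> timedelta:
    """Time the reserved resource waits for Integrate to finish.

    Assumption: Integrate and Reserve take as long as in the logged instance.
    """
    ready_after = timedelta(seconds=delay_seconds) + reserve_duration
    return max(integrate_duration - ready_after, timedelta(0))


wasted_now = idle_reservation(300)       # designed delay of 5 minutes
wasted_adjusted = idle_reservation(360)  # proposed delay of 6 minutes
```

Under these assumptions the 5-minute delay leaves the reservation idle for 77 seconds, which matches the gap between the Reserve and Integrate end times in Table 1, and a 6-minute delay would reduce the idle time to 17 seconds.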
6. Conclusion and Future Work
Advance resource reservation is an important research field in grid computing. Present solutions require users to fix the absolute start time of a reservation at design time, which is too difficult for them. We have proposed an ECA rule-based workflow approach that lets users choose the start time of a reservation relative to a specific event, which is more feasible than other solutions.

This work is only a first step. The demo in this paper is very simple and ignores many factors that can affect the result of ARR. The management functions for advance resource reservation in our solution are also weak: the system cannot find resources dynamically and adaptively. We will extend the system to address these problems in the near future.
Table 1. Activity Information of Instance

activityID  activityName  triggerEvent                               startTime            endTime
1           start         Instantiation                              2006-04-21 10:08:12  2006-04-21 10:08:12
2           Collect       End of start                               2006-04-21 10:08:13  2006-04-21 10:18:23
3           Integrate     End of Collect                             2006-04-21 10:18:23  2006-04-21 10:24:47
4           Delay         Start of Integrate                         2006-04-21 10:18:23  2006-04-21 10:23:35
5           Reserve       End of Delay                               2006-04-21 10:23:23  2006-04-21 10:23:30
6           Compute       (End of Reserve) and (End of Integrate)    2006-04-21 10:24:48  2006-04-21 10:30:46
7           end           End of Compute                             2006-04-21 10:30:46  2006-04-21 10:30:46
Acknowledgement
This work is supported by the ShanghaiGrid grand project of the Science and Technology Commission of Shanghai Municipality (05DZ15005), the National Scientific Fund of China (No. 60503041), and the Natural Science Foundation of Shanghai (05ZR14081).
References
[1] Ying Li, Minglu Li, Jiadi Yu, et al., "A Workflow Services Middleware Model on Shanghai Grid", Proceedings of the IEEE International Conference on Services Computing (SCC 2004), 2004, pp. 366-371
[2] Wilko Reinhardt, "Advance Resource Reservation and its Impact on Reservation Protocols", Proceedings of Broadband Island'95, 1995, pp. 28-35
[3] Lars C. Wolf, Ralf Steinmetz, "Concepts for Resource Reservation in Advance", Multimedia Tools and Applications, vol. 4, no. 3, 1997, pp. 255-278
[4] D. Ferrari, A. Gupta, G. Ventre, "Distributed advance reservation of real-time connections", Journal on Multimedia Systems, vol. 5, no. 3, 1997, pp. 187-198
[5] M. Degermark, T. Kohler, S. Pink, O. Schelen, "Advance reservations for predictive service in the internet", Journal on Multimedia Systems, vol. 5, no. 3, 1997, pp. 177-186
[6] S. Berson, R. Lindell, "An architecture for advance reservations in the internet", Technical Report, USC Information Sciences Institute, 1998
[7] I. Foster, C. Kesselman, C. Lee, et al., "A Distributed Resource Management Architecture that Supports Advance Reservations and Co-Allocation", Proceedings of IWQoS'99, 1999
[8] McCarthy, D. R., Dayal, U., "The Architecture of An Active Database Management System", Proceedings of ACM-SIGMOD 1989 Int'l Conf. Management of Data, Portland, Oregon, 1989, pp. 215-224
[9] McCarthy, D. R., Dayal, U., "The Architecture of An Active Database Management System", Proceedings of ACM-SIGMOD 1989 Int'l Conf. Management of Data, 1989, pp. 215-224
[10] Dayal, U., Buchmann, A. P., McCarthy, D. R., "Rules are Objects Too: A Knowledge Model For An Active, Object-Oriented Database System", Proceedings of the 2nd Intl. Workshop on Advances in Object-Oriented Database System, 1988, pp. 129-143