SOCA (2007) 1:171–184 DOI 10.1007/s11761-007-0014-z
ORIGINAL RESEARCH
Automated building blocks selection based on business processes semantics in ERPs

Tommaso Di Noia · Eugenio Di Sciascio · Francesco M. Donini · Eufemia Tinelli · Francesco di Cugno · Azzurra Ragone
Received: 1 May 2007 / Revised: 24 July 2007 / Accepted: 25 July 2007 / Published online: 6 October 2007 © Springer-Verlag London Limited 2007
Abstract We present a novel approach and a system for the automated selection of building blocks, exploiting business process semantics. The selection process is based on a novel greedy concept covering algorithm, which computes, given a description of the request, the set of needed building blocks. If such a set does not exist, the algorithm returns the covered subset together with explanation information on both the uncovered part and the conflicting part of the request description, exploiting non-standard inference services. The requested description can be expressed as a conjunction of mandatory requirements and preferences. In order to efficiently cope with large datasets, a further enhancement is proposed, exploiting pre-classification techniques that use a Description Logics reasoning engine in conjunction with an RDBMS to reduce the computational burden. The approach has been deployed in a system specifically designed for SAP R/3 best practices reusability, which is fully functional and is currently being evaluated in an industrial setting.
T. Di Noia · E. Di Sciascio (B) · E. Tinelli · F. di Cugno · A. Ragone
SisInf Lab, Politecnico di Bari, via Re David 200, 70125 Bari, Italy

F. M. Donini
Università della Tuscia, via S. Carlo 32, 01100 Viterbo, Italy

E. Tinelli
Università di Bari, Via Orabona 4, 70125 Bari, Italy
Keywords Description logics · Concept abduction · Concept contraction · Concept covering · Business processes · ERP · Building blocks
1 Introduction

Enterprise resource planning (ERP) systems nowadays play a critical role in several enterprises, as they allow them to substantially improve the quality and efficiency of their business processes, both administration and production oriented. As a consequence, more and more companies, even medium and small-sized ones, tend to move to ERPs to reengineer and integrate their processes. Nevertheless, it is well known that the customization and adaptation process is usually much more expensive than the actual purchase of the software licences [1]. SAP R/3, for example, provides a huge number of parametric customizations in order to adapt the system to every particular organization context, and usually consultants, or consulting firms, are hired to provide the needed expertise in such a reengineering process. In SAP R/3 terms, the configuration process is called Customizing [2,3]. The customizing process plays a fundamental role in satisfying needs using a SAP R/3 solution. To reduce the analysis period, various methodologies have been developed; among others we mention global accelerated SAP (GASAP), accelerated SAP (ASAP) and Best Practices [3,4]. These methodologies aim at reusing results obtained in previous customized implementations. They provide support to a SAP customizer, showing the implementation roadmap, with technical and organizational optimizations. Central to the best practices
approach is the Building Block concept [4]. The basic idea is the modularization of a vertical solution, i.e., in SAP terminology, a complete SAP R/3 solution developed for a well defined organization scenario, identifying and extracting all its client-independent information. Building Blocks (BBs) are basically flexible, transparent and re-usable pieces of functionality that store the preconfiguration, the tools and the documentation required to set up a SAP solution. They can be used to create new business processes or scenarios quickly and reliably, or to adapt existing ones. Each BB comes with a building block description, which summarizes its contents. BB contents in SAP Best Practices are defined considering from the start the possibility of their reuse from an implementation point of view. Basically, the BB content is defined by identifying which Business Process parts can be reused within a predefined solution. The BB Library [5] provided by SAP in fact aims at sharing SAP knowledge within the community of developers. It is also possible to develop specific BBs able to provide particular solutions within a company context. Nevertheless, because of the rapid growth in the number of BBs, choosing the correct BB in order to satisfy part of a specific Business Process is increasingly expensive in terms of time, as the selection is usually driven only by the developers' experience. There are currently over 2,500 BBs in the SAP best practices library. It is also noteworthy that consulting companies, whose core business is often the customizing process, may develop their own libraries in addition to the public one, and have the community of SAP consultants share their knowledge in a flexible way. Such libraries become primary assets of the company itself, as their effective use can drastically reduce customizing costs and efforts. There is hence the need to make the selection of suitable BBs, given a business process description, simple and fast.

In this paper, we show how technologies borrowed from the Semantic Web initiative can help both in modeling, using semantic annotation, building block descriptions and business processes, and in supporting consultants in sharing their knowledge, by allowing automated selection of BBs from available business processes to compose new ones. More in detail, we present a complete framework that allows one to: (i) semantically annotate BB descriptions in a simple way using a graphical tool; (ii) given a business process description, also created using a GUI, carry out a semantic-based selection of either a single BB or a set of BBs covering the requested process requirements. Distinguishing aspects of our approach include the possibility to semantically rank retrieved BBs w.r.t. the requested business process, and to provide explanations of the ranking results. This is made possible by exploiting standard and non-standard reasoning services [6] that will be illustrated later on. Furthermore, although in a perfect world we would like to find BBs that perfectly fit the business process request, this is often not the case. Yet, differently
from other approaches, when computing a covering, if the available BBs cannot cover the request we are still able to provide explanations on what is in conflict and what is still missing to fully cover the request. Also interesting is that our approach allows users to mark features of the requested business process as either mandatory or preferential requirements.

The remainder of the paper is structured as follows: the next section provides the basics of the theoretical framework we adopt, which is based on the Description Logics formalism; then we present and motivate the concept covering algorithms used to select, given a semantically annotated business process, the BBs fulfilling its functionalities. The need to deal efficiently with large sets of BBs led us to study an enhancement of the basic framework, supporting persistence and the ability to efficiently solve Concept Covering problems, reverting, whenever possible, to an RDBMS. Such a framework is presented and discussed in Sect. 4. We present the system architecture in Sect. 5. Performance evaluation, in terms of execution times and covering results, is proposed in Sect. 6. The last section summarizes relations with other works and draws the conclusions.

Please note that, to ease presentation and without loss of generality, in the examples we will refer to a simplified business process and simplified BBs, such as:

BP: The process creates a service contract, which would reflect the terms of service agreed upon with the customer. A vendor contract is set up to define the agreement for external services. The service order is then used for recording the work to be performed by both internal and external resources. A purchase requisition is automatically generated from the service order for the external service requirement and a purchase order is created from it. The costs associated with internal work, as well as the ones for materials to be used, are transferred to the service order.

BBs:
– ZB3 submits orders and purchase requisitions referring to a vendor contract for an external service.
– ZC22 evaluates external employees' costs as well as the ones related to material to be used during the process.
– ZMM37 records times of both internal and external employees related to a vendor contract for an external service.

We will show how our logic-based approach answers the question “do these BBs cover the requested Business Process?” and, if this is not the case, what is missing and/or in conflict with the requirements.

2 Basics of our semantic-based framework

The formal framework we adopt here borrows from Semantic Web (http://www.w3.org/2001/sw/) languages
and Description Logics. In particular, Description Logics (DLs) are a family of logic formalisms for Knowledge Representation [7–9] whose basic syntax elements are concept names, role names and individuals. Intuitively, concepts stand for sets of objects in the domain, and roles link objects in different concepts. Individuals are used for special named elements belonging to concepts. Formally, concepts are interpreted as subsets of a domain of interpretation ∆, and roles as binary relations (subsets of ∆ × ∆). DL formulas (concepts) are given a semantics by defining the interpretation function over each construct. For example, given two generic concepts C and D, their conjunction C ⊓ D is interpreted as set intersection: (C ⊓ D)^I = C^I ∩ D^I; the other boolean connectives ⊔ and ¬, when present, are given the usual set-theoretic interpretation of union and complement. The interpretation of constructs involving quantification on roles needs to make domain elements explicit: for example, (∀R.C)^I = {d1 ∈ ∆ | ∀d2 ∈ ∆ : (d1, d2) ∈ R^I → d2 ∈ C^I}.

Concepts can be used in inclusion assertions C ⊑ D and definitions C ≡ D, which pose restrictions on the possible interpretations according to the knowledge elicited for a given domain. The semantics of inclusions and definitions is based on set containment: an interpretation I satisfies an inclusion C ⊑ D if C^I ⊆ D^I, and it satisfies a definition D ≡ C when D^I = C^I. A DL theory (a.k.a. TBox, or ontology) is a set of inclusion assertions and definitions. A model of a TBox T is an interpretation satisfying all inclusions and definitions of T.

It is possible to introduce a variety of constructors in DL languages, making them more and more expressive. Unfortunately, it is a well-known result that this may lead to an explosion in the computational complexity of inference services; hence a trade-off is necessary. In this paper, we refer to the ALN (Attributive Language with unqualified Number restrictions) DL, a subset of OWL-DL [46].

Example 1 Consider the description of building block ZC22 (see Sect. 1). A possible translation of its description in DL is: ...
Obviously, we do not expect our users to have any knowledge of DLs; users interact with a GUI that completely hides language and logical details, both to examine and to compose a description (see e.g. Fig. 1). All DL-based systems provide at least two basic reasoning services:

Concept satisfiability: T ⊭ D ⊑ ⊥. Given a DL theory T and a concept D, does there exist at least one model of T assigning a non-empty extension to D?

Subsumption: T ⊨ C ⊑ D. Given a DL theory T and two concepts C and D, is D more general than C in any model of T?

Given an ontology T modeling the investigated knowledge domain, a business process description C, and a Building Block description D, subsumption can be used to evaluate whether (1) the Building Block completely fulfills the request (T ⊨ D ⊑ C, full match), (2) the two descriptions are at least compatible with each other (T ⊭ C ⊓ D ⊑ ⊥, potential match), or (3) they are not (T ⊨ C ⊓ D ⊑ ⊥, partial match) [10]. It is easy to see that in the case of a full match all the information specified in C is also expressed, explicitly or by means of the ontology T, in D. In an analogous way one may discover whether two BBs share some functionalities or one is simply a specialization of the other.

Nevertheless, there are several cases in which we look for approximate matches to a given request, which should be provided taking into account the semantic annotations, as pointed out e.g. in [11]. In our approach we use non-standard inference services for this purpose, whose rationale and definitions are briefly recalled hereafter. Let us consider two concepts C and D; if their conjunction C ⊓ D is unsatisfiable in the TBox T representing the ontology, they are not compatible with each other. Suppose D is a business process description and C a BB description. This would mean that the BB cannot fulfill the requirements because of some conflicting specification. Yet, when nothing better is available, we may accept, as in a belief revision process, to retract some requirements (hopefully not essential ones) in D, called G (for Give up), to obtain a concept K (for Keep) such that K ⊓ C is satisfiable in T. More formally:

Definition 1 Let L be a DL, C, D be two concepts in L, and T be a set of axioms in L, where both C and D are satisfiable in T. A Concept Contraction Problem (CCP), identified by ⟨L, D, C, T⟩, is finding a pair of concepts ⟨G, K⟩ ∈ L × L such that T ⊨ D ≡ G ⊓ K, and K ⊓ C is satisfiable in T. We call K a contraction of D according to C and T.

Q is a symbol for a CCP, and SOL_CCP(Q) denotes the set of all solutions to Q. Obviously, there is always the trivial solution ⟨G, K⟩ = ⟨D, ⊤⟩ to whatever CCP Q, that is, give up everything of D. When C ⊓ D is satisfiable in T, the “best” possible solution is ⟨⊤, D⟩, that is, give up nothing. Hence, a Concept Contraction problem is an extension of a satisfiability one.

Example 2 Consider an excerpt of the BP we are using as reference: “. . . evaluate only costs related to internal employees and material to be used during the process”.
Fig. 1 A simple BB description as visualized in the system interface
In DL syntax it could be expressed as: ...
On the other hand, consider the description of ZC22 (see Example 1). Since our ontology obviously models that internal and external personnel are disjoint sets of objects (InternalEmployee ⊑ ¬ExternalEmployee), ZC22 is in conflict with the business process request. Nevertheless, if the requester gives up the constraint on the internal employee costs, i.e., contracts the request, the BB may suit the request. In this case, a solution to the contraction problem would be
⟨G, K⟩ = ⟨. . . , . . .⟩
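The way satisfiability, subsumption and Concept Contraction cooperate in the matchmaking process can be summarized in a short sketch. The following Java fragment is purely illustrative: the DLReasoner facade, its method names and the string-based encoding of concept expressions are hypothetical stand-ins for the DIG calls actually issued to the reasoner, and the same facade is reused in the later sketches.

// Hypothetical facade over a DIG-compliant reasoner: method names and the
// string encoding of concepts are illustrative stand-ins, not the actual
// MaMaS-tng interface.
interface DLReasoner {
    boolean satisfiable(String concept);                // does T ⊭ concept ⊑ ⊥ hold?
    boolean subsumes(String general, String specific);  // does T ⊨ specific ⊑ general hold?
    ContractionResult contract(String d, String c);     // ⟨G, K⟩ solving the CCP ⟨L, d, c, T⟩
    String abduce(String c, String d);                  // H solving the CAP ⟨L, c, d, T⟩
    int rank(String c, String d);                       // rankPotential(c, d)
}

final class ContractionResult {
    final String give;   // G: the part to give up
    final String keep;   // K: the part to keep
    ContractionResult(String give, String keep) { this.give = give; this.keep = keep; }
}

enum MatchType { FULL, POTENTIAL, PARTIAL }

final class Matchmaker {
    private final DLReasoner reasoner;
    Matchmaker(DLReasoner reasoner) { this.reasoner = reasoner; }

    /** Classify a BB description against a business process request. */
    MatchType classify(String bb, String request) {
        if (reasoner.subsumes(request, bb)) return MatchType.FULL;          // T ⊨ BB ⊑ request
        if (reasoner.satisfiable(and(bb, request))) return MatchType.POTENTIAL;
        return MatchType.PARTIAL;                                           // conflicting specifications
    }

    /** On a conflict, retract the conflicting part of the request (Definition 1) and keep K. */
    String compatiblePart(String bb, String request) {
        if (reasoner.satisfiable(and(bb, request))) return request;         // nothing to give up
        return reasoner.contract(request, bb).keep;                         // K ⊓ BB is satisfiable
    }

    private static String and(String a, String b) { return "(" + a + " AND " + b + ")"; }
}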
Since usually one wants to give up as few things as possible, some minimality criterion for the contraction must be defined [12]. In most cases a purely logic-based approach may not be sufficient to decide which beliefs to give up and which to keep, and some extra-logical information has to be modeled and defined. One approach is to give up minimal information [13]. Another one considers some pieces of information more important than others, so that the least important ones are retracted first [14]. In an analogous way, when subsumption does not hold, i.e., a full match is unavailable, one may want to hypothesize an explanation of the causes of this result.
Again, informally, the scenario is one where a given BB is compatible with a request but does not completely fulfill it; we may want to know why. In [15] the Concept Abduction Problem (CAP) was introduced and defined as a non-standard inference problem for DLs providing a logic-based answer to such a question.

Definition 2 Let C, D be two concepts in a Description Logic L, and T be a set of axioms, where both C and D are satisfiable in T. A Concept Abduction Problem (CAP), denoted as ⟨L, C, D, T⟩, is finding a concept H such that T ⊭ C ⊓ H ⊑ ⊥, and T ⊨ C ⊓ H ⊑ D.

P is a symbol for a CAP, and SOL(P) denotes the set of all solutions to P.

Example 3 Consider, in the requested business process: “. . . submit a purchase order and a purchase requisition related to a contract for an external service”. In DL it could be modeled as: ...
Let us consider the ZB3 BB; it does not completely satisfy the BP request because it is not said whether the BB is able to submit purchase orders: nothing is specified regarding the kind of orders. In this case a solution to the CAP would be: H = ∀submitDocument.PurchaseOrder. Given a CAP P, if H is a conjunction of concepts and no sub-conjunction of concepts in H is a solution to P, then H is an irreducible solution. In [15] minimality criteria for H and a polynomial algorithm computing irreducible solutions for an ALN DL have also been proposed.
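Irreducibility can be enforced by a simple reduction loop: repeatedly try to drop a conjunct of H and keep the reduction whenever what remains is still a solution of the CAP. The sketch below illustrates the idea on top of the hypothetical DLReasoner facade introduced earlier; it is not the structural polynomial algorithm of [15].

import java.util.ArrayList;
import java.util.List;

// Generic reduction loop yielding an irreducible abduction solution.
// A sketch over the hypothetical DLReasoner facade of the previous fragment.
final class AbductionReducer {
    private final DLReasoner reasoner;
    AbductionReducer(DLReasoner reasoner) { this.reasoner = reasoner; }

    /**
     * Given a solution H (as a list of conjuncts) of the CAP ⟨L, C, D, T⟩, drop
     * every conjunct whose removal still leaves a solution, i.e. such that
     * T ⊨ C ⊓ H' ⊑ D still holds for the reduced H'.
     */
    List<String> reduce(List<String> h, String c, String d) {
        List<String> reduced = new ArrayList<>(h);
        for (int i = reduced.size() - 1; i >= 0; i--) {
            List<String> candidate = new ArrayList<>(reduced);
            candidate.remove(i);
            if (reasoner.subsumes(d, conjoin(c, candidate))) {
                reduced = candidate;        // the i-th conjunct was not needed
            }
        }
        return reduced;                     // dropping any remaining conjunct breaks the solution
    }

    private static String conjoin(String c, List<String> conjuncts) {
        StringBuilder sb = new StringBuilder(c);
        for (String x : conjuncts) sb.append(" AND ").append(x);
        return "(" + sb + ")";
    }
}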
The rankPotential algorithm [10] allows one to numerically compute the length of H. Recalling the definition, rankPotential(C, D) corresponds to a numerical measure of what is still missing in C w.r.t. D. If C = ⊤ we have the maximum value of rankPotential(C, D), that is, the maximum (potential) mismatch of C from D. The value returned by rankPotential(⊤, D) hence amounts to how specific a complex concept expression D is with respect to an ontology T, what we call the depth of D. Obviously, we are not trivially measuring the depth of a node in a tree; in fact, an ontology is not a simple term taxonomy tree, i.e., its structure is not limited to simple IS-A relations between two atomic concepts, and a complex concept can be the conjunction of both atomic concepts and role expressions.

As an example of the usefulness of rankPotential used in conjunction with Concept Abduction, consider again the case where we need to “. . . submit a purchase order and a purchase requisition related to a contract for an external service” and there are two other BBs: ZB32, which “submits purchase order documents and purchase requisitions that refer to a contract”, and ZB33, which “submits purchase order documents and purchase requisitions that refer to a service contract”. In this case the solution to the two CAPs is exactly the same w.r.t. both ZB32 and ZB33:
Nevertheless, rankPotential ranks ZB33 better than ZB32, because ZB32 is more generic than ZB33 with respect to the requested external service contract. The solution to a CAP P can be interpreted as what has to be hypothesized in C and, in a second step, added to it in order to make C more specific than D, which would make the subsumption hold. So, as Concept Contraction extends satisfiability, Concept Abduction extends subsumption.

The previously described services can be used to infer information when we are matching a request with a given BB to find a suitable one. Yet, a request may have to be fulfilled by selecting several BBs. To reach such an objective using semantics, we have to make a step further.

3 Semantic-based selection of building blocks

Concept Covering, originally defined in [16] for a particular set of DLs with limited expressiveness, was later extended and generalized in terms of Concept Abduction in [17]. We recall here this definition:

Definition 3 Let D be a concept, R = {S1, S2, . . . , Sk} be a set of concepts, and T be a set of axioms representing an ontology, all in a DL L, where D and S1, . . . , Sk are satisfiable in T. Let also ≺_T be an order relation over L taking into account the ontology T. The Concept Covering Problem (CCoP), denoted as V = ⟨L, R, D, T⟩, is finding a pair ⟨Rc, H⟩ such that:
1. Rc ⊆ R, and T ⊭ ⊓_{Si∈Rc} Si ⊑ ⊥;
2. H ∈ SOL(⟨L, ⊓_{Si∈Rc} Si, D, T⟩), and H ≺_T D.
We call ⟨Rc, H⟩ a solution for V, and say that Rc covers D. Finally, we denote by SOL_CCoP(V) the set of all solutions to a CCoP V. In [17] the algorithm GREEDYsolveCCoP was also proposed, to perform a greedy concept covering using concept abduction. Intuitively, Rc is a set of concepts that completely or partially cover D w.r.t. T, while the abduced concept H represents what is still in D and is not covered by Rc. It is worth noticing that a greedy approach is needed for performance reasons, as the basic set covering problem is already NP-hard. Also note that a Concept Covering Problem is similar to, but has remarkable differences from, classical set covering. In fact a CCoP is not trivially a different formulation of a classical minimal set covering (SC) problem, as an exact solution to a CCoP may not exist. Furthermore, in SC elements are not related with each other, while in a CCoP elements are related with each other through axioms within the ontology; and while the aim of SC is to minimize the cardinality of Rc, the aim of a CCoP is to maximize the covering, hence minimizing H. There can be several solutions for a single CCoP, depending also on the strategy adopted for choosing concepts in Rc.

In the basic implementation of our approach [18] we adopted GREEDYsolveCCoP as the Concept Covering algorithm. Examining the first results of tests on users experiencing our system, we noticed that, when an exact covering is unavailable, users would anyway appreciate approximate covers, even with some requirements in conflict (as this eases the customizing process anyway), as long as the system can clearly point out the conflicting requirements. Hence we implemented an enhanced version of the Concept Covering algorithm, which is better able to take such needs into account (see Algorithm 1). The algorithm BBComposer(R, BP, T), where BP is the required Business Process and R = {BBi}, i = 1 . . . n, is a set of Building Blocks, all described w.r.t. a TBox T, also takes concept contraction into account during the process of selection of Building Blocks. The algorithm tries to cover the BP description “as much as possible”, using the concepts BBi ∈ R. If a new building block BBi can be added to the already composed set Rc, i.e., T ⊭ (⊓_{BBk∈Rc} BBk) ⊓ BBi ⊑ ⊥, then an extended matchmaking process is performed (rows 8–22). If BBi is not consistent with the uncovered part of the business process BPuncovered, the latter is contracted and subsequently an abduction process is performed between the contracted uncovered part and BBi (rows 8–10). If BBi is consistent with BPuncovered, only a concept abduction problem is solved
Algorithm 1 BBComposer(R, BP, T) – Extended Concept Covering
(rows 11–15). Based on the previously computed concepts G, K and H, a global score is computed as a metric to evaluate how good BBi is with respect to the covering set (rows 16–21). The score is determined through the function Ψ, which takes into account what has to be given up (G) and kept (K) in the uncovered part (BPu) of BP in order to be compatible with BBi, and what has to be hypothesized (H) in BBi in order to fully satisfy the contracted BPu, i.e., K. For the ALN subset of OWL-DL the following closed-form expression was originally proposed in [19]:

Ψ(G, K, H, BBi, BPu) = 1 − [N / (N − g)] ∗ (1 − h/k)    (1)

where N, k, g and h are numerical evaluations, computed via the rankPotential algorithm [10], of respectively:
– BPu, the uncovered part of BP in an intermediate step of the covering process: N = rankPotential(⊤, BPu);
– K, a solution of Q = ⟨ALN, BBi, BPu, T⟩: k = rankPotential(⊤, K);
– G, a solution of Q = ⟨ALN, BBi, BPu, T⟩: g = rankPotential(BPu, K);
– H, a solution of P = ⟨ALN, BBi, K, T⟩: h = rankPotential(BBi, K).
At first glance the right formulation for g would seem to be g = rankPotential(⊤, G, T), as was done for K. Using a trivial example we show why the adopted formulation, g = rankPotential(BPu, K), is more appropriate than this alternative. Imagine a simple T = {C ⊑ B ⊓ A; E ⊑ ¬C} with BBi = E and BPu = C ⊓ F. For Q = ⟨ALN, BBi, BPu, T⟩ we have ⟨G, K⟩ = ⟨C, B ⊓ F⟩ ∈ SOL_CCP(Q). Now, rankPotential(⊤, C) = 3 while rankPotential(C ⊓ F, B ⊓ F) = 1 (roughly speaking, in this example T can be seen as a tree and rankPotential(C, D, T) measures the hops needed to go from D to C). We note that the latter result better takes into account how much BPu has been contracted in order to be compatible with BBi.

The rationale of the closed-form expression (1) is given in [19]. By choosing the candidate descriptions minimizing Ψ, the algorithm takes into account both g and h, i.e., a numerical measure of how much has to be given up in the uncovered request BPu and of how much we have to hypothesize in the building block analyzed at each stage. Notice that we do not need to change the definition of the Concept Covering Problem provided in [17]: the problem we solve is still the one detailed in Definition 3, if we consider D = K_BP. The distinguishing element between the two solving algorithms GREEDYsolveCCoP and BBComposer is the choice criterion among the BBi greedily selected to compose the set. In GREEDYsolveCCoP such a choice was made on the basis of a minimality criterion on H, the solution of an abduction problem between BPuncovered (the part of the needed process yet to be covered at each stage of the algorithm) and the selected building block BBi. In BBComposer the choice criterion is the minimization of the function Ψ, which also takes into account a measure of how much the request has to be contracted before the selection, thus improving the selection process. The outputs of BBComposer are:
– Rc: the set of BBs selected to compose the business process;
– BPuncovered: the part of the business process description that, in case no exact cover exists, has not been covered;
– K_BP: the contracted BP;
– G_fin: the part of the business process description given up at the end of the whole composition process.

A compact sketch of this greedy loop is given below.
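The sketch builds on the hypothetical DLReasoner facade of the earlier fragments, scores candidates with Eq. (1) as reconstructed above, and omits the bookkeeping of K_BP and G_fin for brevity; it is an illustration, not the actual BBComposer implementation.

import java.util.ArrayList;
import java.util.List;

// Greedy covering loop in the spirit of BBComposer (Algorithm 1).
final class GreedyComposer {
    private static final String TOP = "TOP";   // stands for the top concept ⊤
    private final DLReasoner r;
    GreedyComposer(DLReasoner r) { this.r = r; }

    /** Returns the selected set Rc; the final uncovered part is written to uncoveredOut[0]. */
    List<String> compose(List<String> candidates, String bp, String[] uncoveredOut) {
        List<String> rc = new ArrayList<>();
        List<String> pool = new ArrayList<>(candidates);
        String uncovered = bp;                  // BP_uncovered
        boolean progress = true;
        while (progress && !pool.isEmpty()) {
            progress = false;
            String best = null, bestH = null;
            double bestScore = Double.POSITIVE_INFINITY;
            for (String bb : pool) {
                if (!rc.isEmpty() && !r.satisfiable(and(String.join(" AND ", rc), bb)))
                    continue;                   // BBi must remain consistent with the current Rc
                // Contract BP_u if BBi conflicts with it (rows 8-10), otherwise keep it (rows 11-15).
                String keep = r.satisfiable(and(bb, uncovered))
                        ? uncovered
                        : r.contract(uncovered, bb).keep;
                String h = r.abduce(bb, keep);  // what BBi still misses of K
                double score = psi(uncovered, keep, bb);
                if (score < bestScore) { bestScore = score; best = bb; bestH = h; }
            }
            // A score below 1 signals that the candidate contributes something under Eq. (1).
            if (best != null && bestScore < 1.0) {
                rc.add(best);
                pool.remove(best);
                uncovered = bestH;              // the abduced H is what still remains uncovered
                progress = true;
            }
        }
        uncoveredOut[0] = uncovered;
        return rc;
    }

    /** Eq. (1): Psi = 1 - (N / (N - g)) * (1 - h / k), with ranks from rankPotential. */
    private double psi(String bpU, String keep, String bb) {
        double n = r.rank(TOP, bpU);
        double k = r.rank(TOP, keep);
        double g = r.rank(bpU, keep);
        double h = r.rank(bb, keep);
        if (k == 0 || n == g) return 1.0;       // degenerate cases, treated as "no contribution"
        return 1.0 - (n / (n - g)) * (1.0 - h / k);
    }

    private static String and(String a, String b) { return "(" + a + " AND " + b + ")"; }
}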
The enhanced covering algorithm presented above also takes into account the contraction of the request during the search, thus providing approximate results also in the presence of conflicting requirements. Nevertheless, there can be cases in which users are not willing to give up, from the beginning, characteristics they consider of paramount importance in their request. To satisfy such a reasonable need
Table 1 Graphical representation of: the reference Business Process (a); ZB3 (b); ZC22 (c); ZMM37 (d)
we offer users the possibility to explicitly express mandatory requirements MND and preferences PRF within their request, as BP ≡ MND ⊓ PRF. Based on such a specification we pre-select the building blocks to be taken into account if and only if their description is compatible with MND. The pre-selection process is based on the following simple check:
if T ⊭ MND ⊓ BBi ≡ ⊥ then
    R_MND = R_MND ∪ {BBi};
end

where R_MND is the set of BBi which are compatible with the mandatory requirements MND of the user. Considering MND and PRF as conjunctions of ALN concepts, it is obviously possible to check the coherence of the user's specification using non-standard inference services: an incoherency can be detected by checking whether there exists an overlap between MND and PRF. Using Concept Abduction [15] or its numerical version [10] it is possible to compute how dissimilar PRF is from MND; if MND and PRF overlap, the user can be asked to reconsider her requirements. Now, based on the formalization of the reference Business Process, ZB3, ZC22 and ZMM37 (Table 1), it is easy to see (see Examples 1–3) that the covering solution is:

Actually, if the user sets MND = ∀evaluatesData.∀refersToHumanResource.¬ExternalEmployee, then the contraction process cannot be performed w.r.t. ZC22, which is hence not selected as part of the solution. In this case, the new solution would be (see Examples 1, 2):
That is, nothing is contracted in the BP request, but since ZC22 is not selected, the uncovered part increases.
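The pre-selection step can be sketched along the same lines; again the DLReasoner facade is hypothetical, and only blocks whose descriptions do not conflict with MND survive the filter.

import java.util.ArrayList;
import java.util.List;

// Pre-selection of building blocks against the mandatory part MND of a request
// BP ≡ MND ⊓ PRF. A sketch over the hypothetical DLReasoner facade; only blocks
// compatible with MND are passed on to the covering algorithm.
final class MandatoryFilter {
    private final DLReasoner r;
    MandatoryFilter(DLReasoner r) { this.r = r; }

    List<String> preselect(List<String> blocks, String mnd) {
        List<String> rMnd = new ArrayList<>();
        for (String bb : blocks) {
            if (r.satisfiable("(" + mnd + " AND " + bb + ")")) {
                rMnd.add(bb);                 // BBi does not conflict with MND
            }
        }
        return rMnd;
    }
}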
4 Improving efficiency

While semantic-based systems offer several benefits, they often become hard to use on large real-world datasets, so several recent approaches try to merge DL-based reasoning with classical relational DBMSs. The objective is obvious: by pre-classifying large knowledge bases on an RDBMS it is possible to reduce the burden on inference engines and deal efficiently with large datasets. Such approaches can be classified as either RDF-based [20–23] or OWL-based [24,25]. In most cases they are limited to the persistent
storage of ontologies on an RDBMS, and they sometimes have limited inference capabilities.

We developed a novel Java-based framework, Enduring Oak, designed specifically to support persistence and the ability to efficiently solve Concept Covering problems. Enduring Oak is compliant with the DIG specifications, currently uses Hypersonic SQL DB for the storage and retrieval of persistent data, and calls a DIG-compliant DL reasoner for computing inferences. We adopt MaMaS-tng, a DL reasoner that implements rankPotential and algorithms to solve concept abduction and contraction problems. The database schema (see Fig. 2) includes various tables, namely: individuals, concepts, roles, individual types, children, descendants, roles range and roles domain, with the obvious meaning. The relational model we adopted is clearly redundant [26], as e.g. the children table could have been dropped by adding a Boolean attribute with a true value assigned whenever a descendant is also a child. Our design choices were guided by the will to improve retrieval efficiency, even at the expense of normalization, i.e., with a greater storage area and a more complex insertion mechanism.

In the following we illustrate the modified Concept Covering algorithm implemented in the Enduring Oak framework (see Algorithm 2). Before starting to compute the actual Concept Covering, a preprocessing step is needed for the ontology T: for each role R in T having as range a concept C, we add the definition axiom AUX_R ≡ ∀R.C to T (we recall that a range relation can be modeled with the axiom ∀R.C in T). Notice that, since AUX_R is a defined concept, i.e., it is introduced via an equivalence axiom, the structure of the domain knowledge modeled in the original T is not modified. In this way we are able to classify also descriptions expressed in terms of universal restrictions, such as the ones used to describe Building Blocks and Business Processes (see Examples 1–3), w.r.t. the concept taxonomy. We also note that the main difference w.r.t. the original algorithm lies in the calls to the candidateRetrieval procedure (see Algorithm 3). candidateRetrieval selects a subset of all available BB descriptions in order to avoid checking the whole dataset. Retrieval is carried out combining database queries and inference service calls to the reasoner. Differently from the basic version of the algorithm, when using BBComposer a “full” check of the covering condition is not needed: C contains concepts which are surely consistent with BPu, and what remains to be evaluated is whether the best concept in C, from a covering point of view, is consistent with the ones in Rc (row 24 in Algorithm 2). Then, only a consistency check is needed instead of a full covering check. On the other hand, while computing a covering, the size of C cannot increase. In fact, candidateRetrieval computes
Algorithm 2 GsCCoP+(R, BP, T) – Fast Concept Covering
Algorithm 3 candidateRetrieval(BP, R, T) – Candidate Retrieval
the candidates to be added to Rc considering only the ones which overlap BPu. Since BPu decreases after each iteration, so does C. Notice that, due to the preprocessing step, AUX_R concepts are also taken into account and returned by candidateRetrieval; in this way complex concepts involving roles are also considered.
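The following sketch illustrates how candidate retrieval may combine an SQL query over the pre-classified taxonomy with reasoner calls. Table and column names are only loosely inspired by the schema of Fig. 2 and, like the whole JDBC fragment, are hypothetical; the actual Enduring Oak queries may differ.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Sketch of a candidate-retrieval step mixing an RDBMS query with reasoner calls.
final class CandidateRetrieval {
    private final Connection db;
    private final DLReasoner reasoner;
    CandidateRetrieval(Connection db, DLReasoner reasoner) {
        this.db = db;
        this.reasoner = reasoner;
    }

    /** Returns BB identifiers whose stored classification overlaps a concept occurring in BP_u. */
    List<String> candidates(List<String> bpuConcepts, String bpu) throws SQLException {
        List<String> result = new ArrayList<>();
        // Fetch every individual (BB description) classified under the concept itself
        // or under one of its descendants in the pre-computed taxonomy tables.
        String sql = "SELECT DISTINCT it.individual FROM individual_types it "
                   + "LEFT JOIN descendants d ON it.concept = d.descendant "
                   + "WHERE it.concept = ? OR d.concept = ?";
        for (String concept : bpuConcepts) {
            try (PreparedStatement ps = db.prepareStatement(sql)) {
                ps.setString(1, concept);
                ps.setString(2, concept);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        String bb = rs.getString(1);
                        // Keep only candidates whose description is consistent with BP_u.
                        if (!result.contains(bb)
                                && reasoner.satisfiable("(" + bb + " AND " + bpu + ")")) {
                            result.add(bb);
                        }
                    }
                }
            }
        }
        return result;
    }
}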
Fig. 2 Relational structure of the adopted database
5 System architecture

The framework and system we built have been designed with the objective of allowing a community of developers to share the knowledge they acquire, together with the one provided by the SAP documentation. The framework is based on an architecture, sketched in Fig. 3, comprising:
– a Knowledge Base Repository, composed of an Ontology Repository (OR) and an Instance Repository (IR). In the OR all ontologies shared within the system are stored as OWL files; we use different ontologies interacting with each other via the owl:imports tag. The IR contains all concept descriptions representing both BB capability descriptions and Business Processes. The IR is a DBMS-based repository; currently we adopt the Hypersonic SQL DBMS interfaced via JDBC. IR and OR can obviously be stored on different machines. The idea behind this decentralized structure is based on the observation that different consultants and companies may refer to the same ontologies in order to describe their own BBs or Business Processes. Hence, the rationale is that companies can share the ontologies on a single OR while there are different IRs, one for each company.
– a Client GUI: a user-friendly interface allowing the user to interact with the system in order to query it, tell new knowledge and examine the available modeled knowledge. The client allows the user to: (a) browse an ontology; (b) describe a new BB and store its ontology-based description within the Instance Repository (see Fig. 1);
(c) compose a business process request using the knowledge modeled within the ontologies, and query the system (see Fig. 4).
– a Reasoner, MAMAS-tng, to perform inferences based on the formal semantics of the language used to model both the ontologies and the individuals. MAMAS-tng is a DL reasoning engine endowed with both standard reasoning services and non-standard inferences. It is a multi-user, multi-ontology Java servlet based system and exposes an enhanced DIG 1.1 interface. It is available as an HTTP service at http://dee227.poliba.it:8080/MAMAS-tng/DIG/.

A common terminology can be shared by the community of SAP consultants within the OR, the Ontology Repository. We modeled all ontologies used within the system in the ALN subset of OWL-DL, thus allowing us to exploit the Concept Abduction and Concept Contraction services via MAMAS-tng and to perform a Concept Covering of the user requests. In order to satisfy the needed prerequisites, a Business Process description contains the list of the activities to be performed by such a process. Notice that in a Business Process, from the application point of view, only activities requiring particular functionalities from R/3 are relevant and hence modeled. In an equivalent way, a BB is described with respect to the functionalities it provides, representing the set of activities which can be performed using the BB itself with respect to a Business Process. The main competency questions for the ontologies we developed are basically related to BB functionalities and Business Process activities. Some of them are:
– Which are the activities a Business Process shall perform?
Fig. 3 System architecture
– Which scenario better approximates a given Business Process and which are the BBs used to implement it?
– Which are the BBs needed in order to manage a particular Business Process?
The GUI allows users to communicate with the KB and use the knowledge modeled and stored within it, without requiring any knowledge of the adopted formalisms. Care has been paid to make the look-and-feel of the interface at least similar to the typical R/3 one. Given a Business Process description, by exploiting the algorithms and inferences previously described, our system basically performs two types of query:

– a semantic search over already known and stored business scenarios;
– a fully automated search for a set of BBs that can implement the requested Business Process.
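From the client's point of view the two query types can be thought of as the following facade; names and result types are purely illustrative and do not correspond to the actual GUI or servlet API.

import java.util.List;

// Hypothetical client-side view of the two query types; illustrative only.
interface BusinessProcessSearch {
    /** Semantic search over already stored business scenarios. */
    List<ScenarioMatch> searchScenarios(String businessProcessDescription);

    /** Fully automated selection of a set of BBs covering the requested process. */
    CoverResult composeBuildingBlocks(String businessProcessDescription);
}

final class ScenarioMatch {
    String scenarioId;
    double score;        // semantic rank of the scenario w.r.t. the request
    String missing;      // uncovered requirements, shown behind the warning icon
    String conflicting;  // conflicting requirements, if any
}

final class CoverResult {
    List<String> selectedBlocks;  // Rc
    String uncovered;             // BP_uncovered
    String contracted;            // K_BP
    String givenUp;               // G_fin
}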
The user interface displays various pieces of useful information, as shown in Fig. 5: a visual representation of the OWL-DL formalization as a tree structure; an automatically generated natural-language-like translation of the formalized description; dockable panels presenting the DIG code of the descriptions and the available R/3 modules; and the search results for the two query types. It is noteworthy that, when search results do not completely fulfill a query, a warning icon is displayed; by clicking on it, the missing and/or conflicting requirements are presented.
6 Performance evaluation

Extensive experiments aimed at both quantitative and qualitative evaluation have been carried out. In particular, we report here on the comparison between the enhanced Enduring Oak approach and the basic system presented in [27]. We note that all experiments presented here have been carried out w.r.t. purely abductive versions of both algorithms, to ease comparison with existing data from previous tests. The test dataset is a randomly generated population of Building Block descriptions whose rank follows a Gaussian distribution with µ = 7 and σ = 5. Tests have been run using as sample requests descriptions randomly extracted from the dataset, having rank 5, 10 and 15, versus sets of 10, 50, 100, 500, 1,000 and 5,000 descriptions randomly extracted from the same dataset at each iteration of the tests. We note that a randomly generated population places the experiment in a worst-case setting. Tests were run on a PC platform equipped with a Pentium IV 3.06 GHz CPU and 512 MB of RAM, using Java 2 Standard Edition 5.0. We point out that the reported results include the time elapsed for ontology uploading and database connection, with a duration, on average, of 5.81 and 0.26 s, respectively. Obviously such bootstrap times should not be taken into account in the actual usage of the system.

Figure 6 presents a comparison between the basic greedy algorithm and the one presented in this paper, clearly showing the computational benefits of the enhanced approach, also in the presence of huge datasets. In both cases, time is normalized w.r.t. the depth of BP. To evaluate the effectiveness of the approach we carried out another set of tests using the same dataset. In this case requests were built as the conjunction
Fig. 4 Results for a Basic Search of existing business scenarios
Fig. 5 The GUI used to compose requested Business Processes and display search results
of a description randomly extracted from the selected dataset with a description randomly generated at run time. The rationale is that in this way we ensured that at least a part
of the request would have to be satisfied by the description available in the dataset. Results refer to datasets of respectively 10 and 50 descriptions and are summarized in terms of
Fig. 6 Comparison of concept covering time performances (GsCCoP: basic composer algorithm, GsCCoP+: enhanced composer algorithm)
Fig. 7 Comparison w.r.t. the BPu (uncovered part H) (GsCCoP: basic composer algorithm, GsCCoP+: enhanced composer algorithm)
BPu rank (recall that BPu is the part that has to be hypothesized, i.e., what remains uncovered), see Fig. 7, and covering percentage, see Fig. 8, w.r.t. the basic BBComposer and the enhanced GsCCoP+ algorithms. The tests show that the approximations introduced in GsCCoP+ affect the effectiveness of the covering procedure only in a limited way. We again note that these are worst-case tests, as the population is randomly generated; tests carried out with actual Building Block descriptions and requests provide practically equivalent results for both algorithms.
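As an indication of how such a synthetic dataset might be produced, the following sketch draws a target rank from a Gaussian distribution with µ = 7 and σ = 5 and assembles a random conjunction of atoms and universal role restrictions of roughly that depth; atom and role names are placeholders and the generator actually used for the tests may differ.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of a generator for a random test population with Gaussian-distributed ranks.
final class PopulationGenerator {
    private final Random rnd = new Random();
    private final List<String> atoms;   // concept names taken from the ontology
    private final List<String> roles;   // role names taken from the ontology
    PopulationGenerator(List<String> atoms, List<String> roles) {
        this.atoms = atoms;
        this.roles = roles;
    }

    /** Generates n random BB descriptions whose rank is approximately Gaussian(7, 5). */
    List<String> generate(int n) {
        List<String> dataset = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            int rank = Math.max(1, (int) Math.round(7 + 5 * rnd.nextGaussian()));
            dataset.add(randomConjunction(rank));
        }
        return dataset;
    }

    // Builds a conjunction of atoms and universal role restrictions of roughly the given depth.
    private String randomConjunction(int rank) {
        List<String> conjuncts = new ArrayList<>();
        int budget = rank;
        while (budget > 0) {
            if (budget > 1 && rnd.nextBoolean()) {
                int inner = 1 + rnd.nextInt(budget - 1);
                conjuncts.add("ALL " + pick(roles) + "." + randomConjunction(inner));
                budget -= inner + 1;      // the restriction itself also counts as one level
            } else {
                conjuncts.add(pick(atoms));
                budget -= 1;
            }
        }
        return "(" + String.join(" AND ", conjuncts) + ")";
    }

    private String pick(List<String> xs) { return xs.get(rnd.nextInt(xs.size())); }
}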
7 Discussion and conclusions

In this paper we have presented an approach and a system, fully exploiting technologies and languages borrowed from the Semantic Web initiative, for automated building block selection within ERPs, and particularly SAP R/3. Although, to the best of our knowledge, there are no other proposals using semantic-based technologies offering features
Fig. 8 Successful covering percentage (GsCCoP: basic composer algorithm, GsCCoP+: enhanced composer algorithm)
comparable to our system in the specific domain we tackle, we note that several other approaches exist in the framework of web service selection and composition. Without any claim of completeness, we just point the interested reader to some related works. In [28] web service discovery is based only on the inputs and outputs of a Web service, without taking into account the overall description of the service; DL-based approaches include those in [10,29–35], in which the discovery phase amounts to a semantic matchmaking process between requests and services. Agarwal and Lamparter [31] use a fuzzy description logic to obtain a ranking of matches, while in [33] a mixed approach is adopted, using both logic-based classification and rankings determined using classical unstructured text retrieval measures. DIANE, recently proposed in [36], also adopts a mixed fuzzy-oriented approach to provide automated selection of suitable services. In [34] a model to automate service discovery was proposed; in this approach static and dynamic aspects of service descriptions are considered, and providers of services are dynamically selected based on the semantic descriptions of their capabilities. Various proposals for composition derive from the planning community, although some issues, related to the typical closed-world assumption of planning approaches, still have to be clarified [37,38]. In [39] a prototype was developed guiding the user in the dynamic composition of Web services. A semi-automated system has also been proposed in [40], adopting a constraint-based approach. A rule-based system was proposed in [41]. Petri Nets have been proposed for both modeling and orchestration of services in [42,43]. In [44] an approach based on finite state machines was presented for automatic e-service composition, and in [45] incomplete specifications of the client's action sequences were also taken into account. In [46] several inferences are discussed, based on different ways of handling variance during the matching process. The authors of [47], differently from pure semantic approaches, propose a framework for semi-automated
service discovery, where both structured and unstructured information are used in order to satisfy user requests.

Differently from other approaches, our system, exploiting non-standard inference services, is able to provide a semantic-based discovery and selection of BBs able to fulfill the requirements of a business process. When an exact covering solution is unavailable we are able to identify an approximate set, pointing out missing and, when present, conflicting requirements. For the OWL-DL subset we use, the adopted algorithms are polynomial in time, as for Concept Covering we adopt greedy algorithms. In fact, as previously hinted, the basic set covering problem is already NP-hard, and the numerical tests we carried out show that computing a non-greedy Concept Covering would prevent any practical use of our approach, even with sets of a few dozen BBs. Furthermore, to effectively cope with large datasets of BBs we have further extended the approach, using the DL reasoning engine in conjunction with an RDBMS to reduce the computational burden, thus allowing our approach to scale up to thousands of BB descriptions with a negligible loss in terms of effectiveness. We are also working on an extension of our approach providing complete composition and activity scheduling. It must anyway be pointed out that extreme care has to be paid when strictly binding a business process to a workflow-like structure; as recently argued in [48], a particular sequencing of activities may become an over-specification of a business process, introducing unnecessary constraints on the execution of the process. Qualitative evaluation based on large-scale experiments with consultants is currently ongoing in the framework of the project ACAB-C2.

Acknowledgements We acknowledge partial support of the Apulia region explorative project ACAB-C2 (System for Automated Composition of business processes in ERPs using Abduction-Based Concept Covering), the PON project PITAGORA, and the EU-FP-6-IST STREP TOWL. We also thank Antonio Cascone for his support and useful implementations.
References

1. Scheer AW, Habermann F (2000) Making ERP a success. Commun ACM 43(4):57–61
2. SAP-AG (2003) Contract-based service order with third-party services and time & material billing, SAP best practices for service providers. http://help.sap.com/bestpractices/industry/serviceindustries/v346c_it/html/~index.htm
3. Hernández JA (2000) SAP R/3 handbook, 2nd edn. McGraw Hill, New York
4. R/3-Simplification-Group (2003) Best Practices for mySAP All-in-One, Building Blocks Concept
5. SAP-AG SAP building block library. http://help.sap.com/bestpractices/~BBLibrary/bblibrary_start.htm
6. Di Noia T, Di Sciascio E, Donini F (2007) Semantic matchmaking as non-monotonic reasoning: a description logic approach. J Artif Intell Res (JAIR) 29:269–307
7. Borgida A (1995) Description logics in data management. IEEE Trans Knowl Data Eng 7(5):671–682
8. Donini FM, Lenzerini M, Nardi D, Schaerf A (1996) Reasoning in description logics. In: Brewka G (ed) Principles of knowledge representation. Studies in logic, language and information. CSLI Publications, Stanford, pp 193–238
9. Baader F, Calvanese D, McGuinness D, Nardi D, Patel-Schneider P (eds) (2002) The description logic handbook. Cambridge University Press, Cambridge
10. Di Noia T, Di Sciascio E, Donini F, Mongiello M (2004) A system for principled matchmaking in an electronic marketplace. Int J Electron Commer (IJEC) 8(4):9–37
11. Goderis A, Sattler U, Lord P, Goble C (2005) Seven bottlenecks to workflow reuse and repurposing. In: Proceedings of the 4th international semantic web conference (ISWC 2005), volume 3279 of LNCS
12. Gärdenfors P (1988) Knowledge in flux: modeling the dynamics of epistemic states. Bradford Books, MIT Press, Cambridge
13. Colucci S, Di Noia T, Di Sciascio E, Donini F, Mongiello M (2003) Concept abduction and contraction in description logics. In: Proceedings of the 16th international workshop on description logics (DL'03), volume 81 of CEUR workshop proceedings
14. Di Noia T, Di Sciascio E, Donini FM (2004) Extending semantic-based matchmaking via concept abduction and contraction. In: Engineering knowledge in the age of the semantic web, volume 3257 of LNAI. Springer, Heidelberg, pp 307–320
15. Di Noia T, Di Sciascio E, Donini F, Mongiello M (2003) Abductive matchmaking using description logics. In: Proceedings of the 18th international joint conference on artificial intelligence (IJCAI 2003), Morgan Kaufmann, Los Altos, pp 337–342
16. Hacid MS, Leger A, Rey C, Toumani F (2002) Computing concept covers: a preliminary report. In: DL'02, volume 53 of CEUR workshop proceedings
17. Colucci S, Di Noia T, Di Sciascio E, Donini F, Piscitelli G, Coppi S (2005) Knowledge based approach to semantic composition of teams in an organization. In: SAC-05, ACM, New York, pp 1314–1319
18. di Cugno F, Di Noia T, Di Sciascio E, Donini F, Tinelli E (2005) Building-blocks composition based on business process semantics for SAP R/3. In: ISWC 2005 workshop on semantic web case studies and best practices for eBusiness (SWCASE05)
19. Colucci S, Di Noia T, Di Sciascio E, Donini F, Mongiello M (2005) Concept abduction and contraction for semantic-based discovery of matches and negotiation spaces in an E-marketplace. Electron Commer Res Appl (ECRA) 4(4):345–361
20. Broekstra J, Kampman A, Harmelen F (2002) Sesame: a generic architecture for storing and querying RDF and RDF-Schema. In: Proceedings of ISWC 2002
21. McBride B (2001) Jena: implementing the RDF model and syntax specifications. In: Proceedings of SemWeb 2001
22. Volz R, Oberle D, Staab S, Motik B (2003) Kaon server – a semantic web management system. In: Alternate track proceedings of the twelfth international world wide web conference, WWW2003, ACM, New York
23. Wood D, Gearon P, Adams T (2005) Kowari: a platform for semantic web storage and analysis. In: Proceedings of the XTech conference
24. Kessel T, Schlick M, Stern O (1995) Accessing configuration databases by means of description logics. In: Proceedings of the KI'95 workshop: knowledge representation meets databases
25. Horrocks I, Li L, Turi D, Bechhofer S (2004) The instance store: DL reasoning with large numbers of individuals. In: Proceedings of the 2004 description logic workshop (DL 2004), pp 31–40
26. Abiteboul S, Hull R, Vianu V (1995) Foundations of databases. Addison Wesley, Reading
27. di Cugno F, Di Noia T, Di Sciascio E, Donini F, Ragone A (2006) Concept covering for automated building blocks selection based on business processes semantics. In: The 8th IEEE conference on E-commerce technology (CEC'06) and the 3rd IEEE conference on enterprise computing, E-commerce and E-services (EEE'06), pp 72–79
28. Paolucci M, Kawamura T, Payne T, Sycara K (2002) Semantic matching of web services capabilities. In: The semantic web – ISWC 2002. Number 2342 in lecture notes in computer science. Springer, Heidelberg, pp 333–347
29. Gonzales-Castillo J, Trastour D, Bartolini C (2001) Description logics for matchmaking of services. In: Proceedings of the KI-2001 workshop on applications of description logics (ADL2001), volume 44, CEUR workshop proceedings
30. Li L, Horrocks I (2004) A software framework for matchmaking based on semantic web technology. Int J Electron Commer (IJEC) 8(4):39–60
31. Agarwal S, Lamparter S (2005) sMart – a semantic matchmaking portal for electronic markets. In: Proceedings of the 7th international IEEE conference on E-commerce technology (CEC'05), pp 405–408
32. Benatallah B, Hacid MS, Rey C, Toumani F (2003) Request rewriting-based web service discovery. In: International semantic web conference, volume 2870 of lecture notes in computer science, Springer, Heidelberg, pp 242–257
33. Klusch M, Fries B, Khalid M, Sycara K (2005) OWLS-MX: hybrid semantic web service retrieval. In: Proceedings of the 1st international AAAI fall symposium on agents and the semantic web
34. Keller U, Lara R, Lausen H, Polleres A, Fensel D (2005) Automatic location of services. In: The semantic web: research and applications, second European semantic web conference, ESWC 2005, Heraklion, Crete, Greece, LNCS, pp 1–16
35. Fronk M, Lemcke J (2006) Expressing semantic web service behavior using description logics. In: Semantics for business process management workshop at the third European semantic web conference, ESWC 2006
36. Kuster U, Konig-Ries B, Klein M, Stern M (2007) DIANE – a matchmaking-centered framework for automated service discovery, composition, binding and invocation on the web. Int J Electron Commer (IJEC) 12(2) (to appear)
37. Rao J, Su X (2004) A survey of automated web service composition methods. In: Proceedings of the first international workshop on semantic web services and web process composition, SWSWPC 2004, volume 3387 of lecture notes in computer science, Springer, Heidelberg, pp 43–54
38. Peer J (2005) Web service composition as AI planning – a survey. Technical report, University of St. Gallen. http://elektra.mcm.unisg.ch/pbwsc/docs/pfwsc.pdf
39. Sirin E, Hendler J, Parsia B (2003) Semi-automatic composition of web services using semantic descriptions. In: Proceedings of the ICEIS workshop on web services: modeling, architecture and infrastructure, pp 17–24
40. Liang Q, Chakarapani LN, Su SYW, Chikkamagalur RN, Lam H (2004) A semi-automatic approach to composite web services discovery, description and invocation. Int J Web Service Res 1(4):64–89
41. Medjahed B, Bouguettaya A, Elmagarmid AK (2003) Composing web services on the semantic web. VLDB J 12(4):333–351
42. Mecella M, Parisi Presicce F, Pernici B (2002) Modeling e-service orchestration through Petri nets. In: Proceedings of the 3rd VLDB international workshop on technologies for e-services (VLDB-TES 2002), Springer, Heidelberg, LNCS 2444, pp 38–47
43. Mecella M, Pernici B (2002) Building flexible and cooperative applications based on e-services. Technical report
44. Berardi D, Calvanese D, Giacomo GD, Lenzerini M, Mecella M (2003) Automatic composition of E-services that export their behavior. In: Orlowska ME, Weerawarana S, Papazoglou MP, Yang J (eds) ICSOC, volume 2910 of lecture notes in computer science, Springer, Heidelberg, pp 43–58
45. Berardi D, Calvanese D, Giacomo GD, Lenzerini M, Mecella M (2004) Synthesis of underspecified composite e-services based on automated reasoning. In: Proceedings of the 2nd international conference on service oriented computing (ICSOC), pp 105–114
46. Grimm S, Motik B, Preist C (2004) Variance in e-business service discovery. In: Semantic web services: preparing to meet the world of business applications, workshop at ISWC-2004, volume 119, CEUR workshop proceedings
47. Lara R, Corella M, Castells P (2007) A flexible model for the location of services on the Web. Int J Electron Commer (IJEC) 12(2) (to appear)
48. van der Aalst W, Pesic M (2006) Specifying, discovering, and monitoring service flows: making web services process-aware. BPM Center technical report no. BPM-06-09