XFolders: A Flexible Workflow System based on Electronic Circulation Folders
Stefania Castellani and Francois Pacull
Xerox Research Centre Europe, Grenoble Lab, 6, chemin de Maupertuis, F-38240 Meylan, France
Abstract
We present XFolders, a flexible, light-weight, document-centered workflow system that breaks physical and organisational boundaries, allowing users across distributed virtual organizations to collaborate flexibly and share documents. The paradigm underlying XFolders is the well-known and widely adopted internal office circulation envelope. We show how XFolders combines the simplicity of this metaphor with the power and benefits of today’s networked computers in terms of speed, distance bridging and support. In particular, we show how XFolders migrates documents across organizations hidden by firewalls and how it deals with scalability and dynamicity issues.
1. Introduction
We present XFolders, a light-weight document-centered workflow system designed to provide flexible support for collaborations both within and across organisations. XFolders breaks physical and organisational boundaries, allowing users belonging to distributed (virtual) organizations to collaborate flexibly and share documents. We have chosen for XFolders a metaphor close to users’ work practices: the well-known and widely adopted internal office circulation envelope. XFolders combines the simplicity of this metaphor with the power and benefits of today’s networked computers. Each collaborator simply inserts documents into the envelope and specifies the necessary recipients by writing their name(s) in the boxes on the front. The envelope then magically flies from pigeon hole to pigeon hole until everyone has fulfilled their role and the final task, which relies on people’s will and teamwork, is completed.

As an example of this kind of collaboration, consider the preparation of a multi-organisational project proposal involving a group of distributed users. Such a collaboration requires that: (1) relevant documents already stored in local heterogeneous Document Management Systems (DMS) are accessible to all involved partners without having to be copied into a centralised repository; (2) the process guiding the work is flexible, allowing it to evolve according to the different partners’ activities; (3) users at distributed sites are aware at any time of the current status of the whole process; and (4) project partners are able to collaborate safely in writing documents.

We have ourselves experienced the first need when collaborating with other teams in the frame of European projects or with other entities of the company spread over the world (e.g. administrative units, law firms, etc.). Typical examples related to the second and third needs [7] are the difficulties of dealing with office workers’ absence or unavailability and of keeping track of who has done what. The last need relates directly to a research topic orthogonal to the workflow aspect: collaboration in document authoring.

This paper shows how XFolders addresses these issues, offering the user a high degree of flexibility in modeling and enacting this kind of collaboration. XFolders is built on top of CLF [1], a platform providing advanced features to build distributed applications and to deploy them over large-scale distributed systems. We show how XFolders leverages the transactional coordination facilities of CLF, provided through a highly portable scripting language suited to describing flexible coordination of tasks across distributed users. Sec. 2 and Sec. 3 describe the main functionalities of XFolders and a typical example of collaboration. Sec. 4 presents the XFolders architecture and the coordination infrastructure underlying it. Finally, Sec. 5 discusses related work.
2. XFolders functionalities
In XFolders, a user starts a collaboration by creating an Electronic Circulation Folder (ECF) and by defining and attaching to it a routing process, i.e. a process description.

Process Descriptions can be defined from scratch or as instances of ECF models (an ECF model can be created by saving the structure of an instance). They define folder circulations as sequences of steps. A step is defined by a set of tasks that may be performed concurrently. A task definition includes: a user responsible for the task; a textual description of what the user is supposed to do; and, possibly, the documents required (for reference, modification or production) to accomplish the task. Documents are stored in one of the document repositories associated with the ECF.

Document Repositories store the documents involved in ECFs. They are not required to be centralized areas dedicated to the collaboration but can be existing repositories such as document management system collections or even simple directories. At any time, a user can associate a number of repositories with an ECF, provided she has the appropriate access rights. Once a repository has been associated with the ECF, the user can reference its documents in a task of the routing process. In order to keep the user interface simple and the semantics clear, document accessibility is defined as follows: a user can only reference (or access) a document she knows about, i.e. either the document is stored in one of the document repositories she has associated with the ECF, or it has been attached to an ECF task and thus made publicly available by another user.

Control: A folder circulates from step to step according to the routing process definition. Users involved in an active step work in parallel and enact the folder circulation through the following mechanisms.

Forward: Once her task in a step is completed, a user forwards the folder. Then, depending on the dependency mode specified in the routing process, the next step may be activated. The sequential mode requires a step to be finished (all users have forwarded) before the next one is activated. The overlap mode allows two subsequent steps to be performed in parallel: as soon as one of the users in the step has forwarded, all tasks in the next step are activated.

Accept: A user involved in an active step can start to work once she accepts the task.
Then, all required documents that are not accessible to the user (i.e. not stored in a repository for which she has access rights) are automatically “migrated”, that is, physically moved to a repository the user can access.

Retrieve: With the paper version of the circulation folder, it is often hard to retrieve a folder once it has been given to someone. This is a pity, because the ability to retrieve a folder that has been abandoned on a desk for days (or even weeks) adds flexibility. For instance, it makes it possible to update a document before it is processed or to deal with people’s unpredictable unavailability (e.g. sick leave). The XFolders retrieve mechanism deals with this kind of situation. From the moment all users in a step have forwarded the ECF until somebody in the next step accepts her task, any of the forwarders can retrieve the folder. She may then modify the routing process appropriately, e.g. by updating a document or replacing an unavailable person in a task, and forward the folder again.

Dynamic modifications of the routing process are possible for all users in an active step, who can change the contents and definitions of future steps. This ability is a major advantage of XFolders: it allows a user to add, modify or remove steps and tasks. Modifying a task consists in adding or removing documents or changing the task description. Documents can be chosen among those already attached to a task of the ECF or stored in a document repository the user has declared as part of the ECF.

Awareness of the current status of an ECF is provided to all involved users through an up-to-date, personalised view of the ECF. The view shows the possible interactions for each task according to its status. Active icons show whether a task is proposed (folder “available”), accepted (folder “opened”), done, or retrievable. No active icons are displayed for tasks in future (not yet activated) steps. The ECF view also shows a user the current activities of the other involved users (whether and when they have accepted, forwarded or retrieved a folder). Moreover, a user is notified by email upon creation of her first task in an ECF (resp. removal of her last one).

Timeout mechanisms can be used to remind a user of a task that has to be considered, and also to notify the users who have forwarded a folder that a task has still not been accepted after a given delay. One of these users can then decide to contact the user concerned or to retrieve the folder in order to allocate the task to somebody else before forwarding the folder again.

Concurrency control mechanisms provide basic concurrent access to workflow descriptions and documents. More sophisticated mechanisms are under consideration to allow concurrent access to a document by several users who do not share a common document repository. One possible strategy is to replicate the document, let the various copies diverge, and merge them (possibly with the users’ help) once the step is completed. Another possibility is to split the document into segments (e.g. sections) that are then handled separately by the different users. In this case, only the document segments migrate instead of the whole document.

Nomadic Operations, for example document modifications and forward requests, can be performed by a user disconnected from the network. Modifying a document requires that the user be able to access it, i.e. that the document be stored in a document repository hosted on the user’s laptop (e.g. a file system directory). When the user reconnects to the system, a synchronisation is performed automatically, triggering the appropriate operations, e.g. forwarding the folder or migrating documents.
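The forward, accept and retrieve mechanisms described in this section can be read as a small state machine over tasks. The following Python sketch is only an illustration of that reading, not the XFolders implementation: the class and attribute names (Folder, Step, Task, overlap_with_next) are hypothetical, a forward is assumed to require a previously accepted task, and a retrieve is assumed to put the retriever's task back into the accepted state and to deactivate the next step.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    FUTURE = "future"      # step not yet activated, no active icon
    PROPOSED = "proposed"  # folder "available"
    ACCEPTED = "accepted"  # folder "opened"
    DONE = "done"          # the user has forwarded the folder


@dataclass
class Task:
    user: str
    description: str
    documents: list = field(default_factory=list)
    status: Status = Status.FUTURE


@dataclass
class Step:
    tasks: list
    overlap_with_next: bool = False  # False models the sequential mode


@dataclass
class Folder:
    steps: list

    def start(self):
        self._activate(0)

    def _activate(self, index):
        # Propose every task of the step that is not yet active.
        if index < len(self.steps):
            for task in self.steps[index].tasks:
                if task.status is Status.FUTURE:
                    task.status = Status.PROPOSED

    def _task(self, user, status):
        """Return (step index, task) for the user's first task in `status`."""
        for index, step in enumerate(self.steps):
            for task in step.tasks:
                if task.user == user and task.status is status:
                    return index, task
        raise ValueError(f"{user} has no {status.value} task")

    def accept(self, user):
        _, task = self._task(user, Status.PROPOSED)
        task.status = Status.ACCEPTED

    def forward(self, user):
        index, task = self._task(user, Status.ACCEPTED)
        task.status = Status.DONE
        step = self.steps[index]
        finished = all(t.status is Status.DONE for t in step.tasks)
        # Sequential mode waits for everybody; overlap mode does not.
        if finished or step.overlap_with_next:
            self._activate(index + 1)

    def retrieve(self, user):
        """Allowed only while every task of the user's step is done and
        nobody in the next step has accepted yet."""
        for index in range(len(self.steps) - 1):
            step, following = self.steps[index], self.steps[index + 1]
            mine = [t for t in step.tasks if t.user == user]
            if (mine
                    and all(t.status is Status.DONE for t in step.tasks)
                    and all(t.status is Status.PROPOSED for t in following.tasks)):
                mine[0].status = Status.ACCEPTED   # assumption: reopened task
                for t in following.tasks:
                    t.status = Status.FUTURE       # assumption: step withdrawn
                return
        raise ValueError(f"{user} cannot retrieve the folder")
```

In this sketch, a step created with overlap_with_next=True proposes the tasks of the following step as soon as the first user forwards, whereas the default waits until all users of the step have forwarded.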
3. XFolders by example
This section gives an example of the kind of collaboration that requires the flexibility XFolders aims to provide: the “newcomer” process in our research centre. It involves several entities with their own constraints, rules and management; it cannot be completely defined from the beginning; and the number of possible exceptions is very high (e.g. special hardware requirements).

Figure 1. Viewing the newcomer ECF before (a) and after (b) a forward, and modifying it (c).

A newcomer will join the research centre under the responsibility of both Francois and Anto. Stefania, who coordinates the welcoming processes, defines a first draft of the process. Francois and Anto have to fill in for the newcomer, respectively, a hardware allocation request form (HARF) and a badge allocation request form. Both forms have to be approved by the manager, Christer, who validates for example the type of machine to be allocated to the newcomer and her badge access rights. Finally, the person responsible for support, Christophe, has to actually provide the machine, and an office has to be allocated with the corresponding facilities.

Stefania forwards the folder to the actors of the next step: Francois and Anto. Fig. 1(a) shows the resulting ECF. Francois accepts the task and fills in a HARF asking for a laptop with Linux for the newcomer. He then modifies the ECF (Fig. 1(c)), attaching the form (“harfForm.txt”) to Christer’s task (“approve forms”), and forwards the folder (Fig. 1(b)). Since the second and third steps overlap, the “approve forms” task can already be proposed to Christer, even though Anto has not yet accomplished her task. When Christer accepts his task, the migration of “harfForm.txt”, originally stored in a document repository restricted to Francois, is triggered. Upon reception of the form, Christer decides that a laptop is not mandatory and that a desktop will be allocated instead. He then modifies the HARF and refines Christophe’s task “allocate workstation” into “allocate desktop”, attaching the HARF document. Anto performs her task and in turn forwards the folder. After Christer’s approval and forwarding, the fourth step becomes active and Christophe will specify the rest of the process (e.g. either the desktop can be taken from the pool of available machines or it has to be ordered).
4. An infrastructure for XFolders
XFolders is built on top of CLF, a distributed object coordination platform which offers a library of ready-made customizable components (Mekano) and a portable scripting language for describing coordination rules [1]. CLF is built around an object model where objects are “resource managers” manipulating resources which are made visible only through their interfaces. A resource may denote a tangible entity such as a document as well as a virtual entity such as a task or a decision. A CLF component offers two types of interfaces, which define abstract services through which operations on resources are made possible. The first is a classical method invocation scheme that triggers actions through URL-encoded method calls returning XML or HTML pages. It is used to access the resources of a single object in order to offer a web-based graphical user interface. The second interface is based on the CLF protocol, which captures with 8 verbs high-level resource manipulations: resource discovery, atomic performance of resource operations, and resource insertion. It is used for cross-component transactional manipulations of resources thanks to the CLF scripting language, where resource manipulations are declaratively expressed by rules.

The CLF resource-oriented approach fits the workflow context quite naturally. Activity states, documents, activity traces, etc. are modelled as resources managed by components. In particular, process descriptions and the status of their instances are sets of resources, and status transitions are modelled with coordination rules expressing cross-component manipulations of such resources.

An XFolders instance can be deployed across several collaborating sites (intranets), leveraging the CLF infrastructure. It includes some pre-defined CLF/Mekano components: Coordinator, Workflow components and Document Repository (DR). A Coordinator manages the coordination rules modelling the interactions among the users across sites. Workflow components manage workflow descriptions and the status of instances. A DR offers a uniform interface to a storage unit able to manage documents, e.g. a document management system or a file system directory. An XFolders system instance includes some dedicated components as well. A CirculationSpecification (CS) manages information on the flow of ECF tasks across users. A DocumentCirculation (DS) manages information about the circulation of documents across users in a folder. A DocumentMigrator (DM) temporarily stores documents during the document migration process (see Sec. 4.2). For each user involved in the collaboration, a User component manages user information (e.g. preferences). Finally, for each collaborating site (LAN), a Site component manages information on the Users and DRs hosted on the site. A site can be hidden by a firewall, in which case the components it hosts are not accessible from outside. An XFolders system can have one or more CS, DS and DM components hosted by accessible sites.

Fig. 2 shows the architecture of an XFolders instance with 4 sites (A, B, C, D), each hosting a Site, a DR, a Coordinator, and a User. Sites A, B and C are hidden by firewalls; site D is the only one accessible from everywhere. Site D also hosts the CS, DS, Workflow and DM components.

Figure 2. An XFolders instance.

4.1. Document migration
CLF rules make it possible to model coordination mechanisms, in particular user interactions and document migrations. For example, accepting a task is expressed by a script including the following rule:

    acceptTask(EcfI,StepI,UserI): UserI -> EcfI, StepI is LOOKUP CS.Accept
    ...
    ‘acceptTask(EcfI,StepI,UserI) @
    requiredDoc(EcfI,StepI,UserI,DocI,DocT)
    - checkMigration(EcfI,StepI,UserI,DocI)

The acceptTask, requiredDoc and checkMigration tokens are interfaces to CS services. acceptTask is linked to the Accept service, which manages the acceptance of tasks by users. The token is pending until a resource representing the acceptance of the task by a given user in a step becomes available. requiredDoc is linked to a service storing resources that describe the documents associated with a given task and a given user. checkMigration is linked to a service storing resources that describe documents that are candidates for migration.

For example, if Stefania accepts a task in a step of an ECF, a resource acceptTask(EcfI,StepI,Stefania), where EcfI and StepI stand for the identifiers of that ECF and step, is created from the (web-based) user interface through the DMI mechanism. The rule is triggered as soon as a complete solution for the instantiation of all the token parameters is found. More precisely, the system propagates the EcfI, StepI and UserI values and looks, through the requiredDoc token, for the identifiers (DocI) and titles (DocT) of the documents associated with this step. The system retrieves values, if any, for those token parameters (DocI, DocT). For each possible solution (a combination of parameter values satisfying both the acceptTask and requiredDoc tokens), the coordinator verifies that the corresponding resources are available and, if so, executes the rule. First, the resource associated with the solution for requiredDoc is removed from the corresponding service. The resource associated with the solution for acceptTask is not removed (“‘” prefix), to allow further applications of the rule for the other documents, if any, associated with the user. Second, the resource checkMigration(EcfI,StepI,Stefania,DocI), with the values found for that solution, is inserted in the corresponding service. The presence of this resource means that the document may have to move to a DR to which the user Stefania has access.
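The firing of the acceptTask rule can be mimicked with plain sets of tuples standing in for the resources held by the CS services. The sketch below is a simplified reading of the description above and not CLF code: the function name apply_accept_rule and the tuple layout are assumptions, and the transactional guarantees of a real coordinator are not modelled.

```python
def apply_accept_rule(accept_task, required_doc, check_migration):
    """Fire the rule once per solution and return the number of firings.

    accept_task:     set of (ecf, step, user), read but never consumed
    required_doc:    set of (ecf, step, user, doc_id, title), consumed
    check_migration: set of (ecf, step, user, doc_id), produced
    """
    firings = 0
    # sorted() returns a new list, so removing from the set is safe here.
    for resource in sorted(required_doc):
        ecf, step, user, doc_id, _title = resource
        if (ecf, step, user) in accept_task:
            # A solution exists: consume requiredDoc, keep acceptTask,
            # and insert the corresponding checkMigration resource.
            required_doc.remove(resource)
            check_migration.add((ecf, step, user, doc_id))
            firings += 1
    return firings
```

Because the acceptTask resource is only read, the loop keeps firing for every document attached to the accepted task, which mirrors the role of the “‘” prefix in the rule.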
Then, for each document that is a candidate for migration, one of the following two (simplified) concurrent rules in the migration script may be triggered. We consider here that a user has access to only one DR; the case of multiple DRs requires a third rule for iterating over the list of DRs.

    checkMigration(EcfI,StepI,UserI,DocI) @
    ‘docLocation(DocI,DocT,DmsI) @
    ‘userProfile(UserI,DmsIDest) @
    match(DmsI,DmsIDest)
    - docAvailable(EcfI,StepI,UserI,DocI)

    checkMigration(EcfI,StepI,UserI,DocI) @
    ‘docLocation(DocI,DocT,DmsI) @
    ‘userProfile(UserI,DmsIDest) @
    noMatch(DmsI,DmsIDest)
    - migDoc(EcfI,StepI,UserI,DocI,DmsI,DmsIDest)

If the DR containing the document is accessible to the user (information encapsulated by a userProfile resource), the document does not need to migrate (first rule) and is “available” to the user during the step containing the task (insertion of a resource in docAvailable). Otherwise (second rule), the document has to migrate (insertion of a resource in migDoc). The match and noMatch tokens express the mutually exclusive enactment conditions of the two rules. After the execution of either rule, the resource returned by the checkMigration token is consumed, which prevents the other rule from being performed later. The migration has to take into account some security aspects, in particular firewalls.
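The mutually exclusive match and noMatch conditions amount to a single decision per checkMigration resource. The following sketch illustrates that decision under the same simplification as the rules (one DR per user); the function name, the dictionary layout and the repository names used to call it are hypothetical.

```python
def route_document(resource, doc_location, user_profile):
    """Decide the outcome for one checkMigration resource.

    resource:     (ecf, step, user, doc_id)
    doc_location: maps a document identifier to the DR that stores it
    user_profile: maps a user to the single DR she has access to
    """
    ecf, step, user, doc_id = resource
    source = doc_location[doc_id]
    destination = user_profile[user]
    if source == destination:
        # match: the document is already in a DR the user can access
        return "docAvailable", (ecf, step, user, doc_id)
    # noMatch: the document has to migrate to the user's DR
    return "migDoc", (ecf, step, user, doc_id, source, destination)
```

Exactly one of the two outcomes is returned for each resource, which corresponds to the checkMigration resource being consumed by whichever rule fires.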
4.2. Firewall, access control and security
Security is a crucial problem in environments where people belonging to several organisations have to share information that may be sensitive. In XFolders, we propose to migrate a document, when required, from one safe place to another safe place. However, this requires the system to be able to access the appropriate documents, which is not always easy since most enterprises install firewalls that block all incoming traffic from the network. Even if it is always possible to open holes in a firewall, this solution is not suitable, for reasons of both security and flexibility: we cannot adopt a solution that requires an action from a system manager each time we want to add a new site to the system.

Our approach is based on: (1) the use of the classical HTTP proxy, which allows an enterprise to be connected to the Web (global authorization of outgoing HTTP communication); (2) a set of CLF rules that, once enacted by appropriately located coordinators, create a channel between the two DRs concerned relying only on outgoing HTTP communications; and (3) a classical public key mechanism [8] ensuring that at any moment only encrypted data are outside the firewalls. As a result, we are able to connect two DRs without any help from the system manager.

Since in CLF all communications go from the coordinator to the other components, the virtual channel is based on two CLF rules, enacted respectively by a coordinator inside the firewall of the source DR and by a coordinator inside the firewall of the destination DR. The first rule takes the document from the source DR, builds a resource (containing the document and a reference to the target DR) and inserts it in a DocumentMigrator (DM) outside the firewall, which acts as a relay. The second rule waits for any available resource that contains a reference to a given DR. If such a resource becomes available, it is consumed, and the document is extracted from it and inserted into the DR.

In order to secure the document temporarily stored in the relay, the document is encrypted with the public key of the recipient user. Thus, only the latter can decrypt the document once it is delivered. Security is therefore guaranteed at two levels. First, the transactional capabilities of the CLF rules ensure that the document is never lost during the transfer. Second, all the documents that circulate outside the firewall are encrypted and readable only by the actual recipient. This is important because the document has to be temporarily stored in a relay that is by definition public, so relying only on an encrypted communication protocol (such as SSL) would not have been enough.
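The two-rule channel through the relay can be pictured with the following sketch, in which both sides only ever call the relay, in the same way that both coordinators only issue outgoing requests. It is an illustration under several assumptions: Relay, export_rule and import_rule are hypothetical names, DRs are modelled as dictionaries, the encrypt argument is a caller-supplied function standing in for encryption with the recipient's public key, and the transactional behaviour of CLF rules is not reproduced.

```python
class Relay:
    """Stand-in for a DocumentMigrator hosted on an accessible site."""

    def __init__(self):
        self._resources = []

    def insert(self, target_dr, doc_id, ciphertext):
        self._resources.append((target_dr, doc_id, ciphertext))

    def take(self, target_dr):
        """Consume and return the first resource addressed to target_dr."""
        for position, resource in enumerate(self._resources):
            if resource[0] == target_dr:
                return self._resources.pop(position)
        return None


def export_rule(source_dr, doc_id, target_dr, relay, encrypt):
    """First rule, run inside the firewall of the source DR."""
    document = source_dr.pop(doc_id)  # the document leaves the source DR
    # Only the encrypted form of the document ever reaches the relay.
    relay.insert(target_dr, doc_id, encrypt(document))


def import_rule(dr_name, dr, relay):
    """Second rule, run inside the firewall of the destination DR.

    Returns True if a document was delivered, False if nothing was waiting.
    """
    resource = relay.take(dr_name)
    if resource is None:
        return False
    _, doc_id, ciphertext = resource
    dr[doc_id] = ciphertext  # decryption is left to the recipient user
    return True
```

In a deployment, import_rule would be evaluated repeatedly by the destination coordinator; here it is a single call so that the flow stays easy to follow.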
4.3. Dynamicity, extensibility and scalability
The number of users, document repositories and even sites may vary, since a collaboration is a very dynamic process and at any moment new users, new documents and new repositories have to be considered. To deal with this, we have designed XFolders in such a way that each user does not interact directly with the other users but interacts with the system. From the implementation point of view, each user encapsulator component generates its own set of rules describing how it gives information to the system and how it obtains information from the system. This set of rules is inserted in the coordinator of the user’s site when the user joins the system. Analogously, leaving the system permanently is just a matter of removing this set of rules from the coordinator. This distribution of the load across coordinators contributes to the scalability, extensibility and autonomy of the system.
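Joining and leaving as insertion and removal of a per-user rule set can be sketched as follows. The Coordinator class below is a hypothetical stand-in that only stores rule texts; it does not evaluate them, and the rule strings used with it are placeholders rather than actual CLF rules.

```python
class Coordinator:
    """Keeps, for one site, the rule set contributed by each user."""

    def __init__(self):
        self._rule_sets = {}

    def join(self, user, rules):
        if user in self._rule_sets:
            raise ValueError(f"{user} has already joined")
        self._rule_sets[user] = list(rules)

    def leave(self, user):
        """Remove and return the rule set of a user who leaves for good."""
        return self._rule_sets.pop(user)

    def active_rules(self):
        return [rule for rules in self._rule_sets.values() for rule in rules]
```

Since each user's rules are held separately, removing one user leaves the rules of the others untouched.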
4.4. Flexibility
XFolders provides several forms of flexibility. First, it is very simple to extend the scope of the collaboration by dynamically (and possibly temporarily) adding new users, new documents and new document repositories during the life of a workflow process. This is true not only for entities that were already known at process creation time but also for entities that are dynamically added later in the process. This is very important, for example, if it is necessary to sub-contract some aspects in case of an exceptional event. Second, no extra administration is required: a new site can be started with no need to modify the current firewall settings or to re-launch the application. Third, because no specific role is required to modify the process, any person in an active step can decide to modify it. Finally, the resource-based programming approach of CLF is very well suited to modeling disconnected operations, thus allowing more flexibility for mobile users.
5. Discussion
The electronic circulation folder coordination concept, introduced by ProMInanD [7], has been adopted in POLITEAM [9]. In ProMInanD, as in XFolders, an ECF consists of the folder circulation description and the documents. However, in ProMInanD the circulation depends mainly on organizational information, and parallel work on sub-tasks is obtained by defining a family of ECFs. In XFolders, users are referred to by their identity and may dynamically join and leave the process. Moreover, all users in a step work in parallel on the documents belonging to a folder without partitioning it into sub-folders. POLITEAM uses ECFs for document transportation according to hierarchical organization requirements in a ministerial environment, where awareness of a document’s previous and future recipients may be an issue. XFolders aims instead at supporting collaborations where awareness of the workflow status is a strong requirement, and powerful mechanisms such as retrieve make it possible to manage a work process flexibly. In POLITEAM, a web-based interface for interacting with ECFs (POLIWEB [5]) was later added on top of LinkWorks™, the base system providing a global working environment. In XFolders, we use CLF, which directly leverages the Web infrastructure. XFolders shares with POLITEAM the choice of keeping process descriptions simple but strongly expressive. For the kind of activities we aim to support, more sophisticated process modelling techniques (e.g. [3]) may complicate the collaboration among the actors involved.

As a light-weight workflow system, XFolders can be compared to e-mail based workflow. XFolders provides more consistency and awareness than users can have when e-mailing copies of documents across organizations. Consistency, because it avoids the classical problem of diverging versions of duplicated documents that are sent from user to user.
Awareness, because when a user e-mails a document to a colleague, there is no practical way to know whether the receiver has already read the mail and processed the message. Moreover, with e-mail based workflow it is very difficult to maintain a history of the communications, and there is no clear mechanism for describing future recipients of the documents.

Compared to more traditional centralized workflow systems, XFolders is lighter and less constraining in terms of overhead for frequently evolving collaborations. For such collaborations, XFolders provides various forms of flexibility, which is still a strong need in workflow systems [2]. One form of flexibility is the support of dynamic changes of workflows at runtime. Several proposals, e.g. [4] and [6], allow workflow schemas to be changed while correctly propagating the changes to instances whose execution started with the old schema. In XFolders we propose a complementary approach, giving users a high degree of freedom to change workflow instances dynamically, in a distributed and transactional way. To some extent, XFolders aims more at helping users achieve their goals in the collaboration while letting them decide how to reach those goals, which is close to the approach followed by case handling systems [10]. Analogously, for role resolution, we have chosen to let the users directly address the actors of the collaboration, with the possibility of dynamic modifications using the retrieve mechanism, even though more sophisticated role resolution could easily be added (as in WebFlow [4]) thanks to the underlying declarative framework. Other forms of flexibility provided by XFolders have already been discussed. Currently, our main research activity for XFolders is focused on support for mobile users.
References
[1] J.-M. Andreoli, D. Arregui, F. Pacull, M. Riviere, J.-Y. Vion-Dury, and J. Willamowski. CLF/Mekano: a Framework for Building Virtual-Enterprise Applications. In Proc. of EDOC’99, Mannheim, Germany, 1999.
[2] CACM. Special Issue on Adaptive Workflow Systems. Communications of the ACM, 9:3, Nov. 2000.
[3] G. Florijn, T. Besamusca, and D. Greefhorst. Ariadne and HOPLa: Flexible coordination of collaborative processes. In Proc. of COORDINATION’96. LNCS 1061, 1996.
[4] A. Grasso, J. L. Meunier, D. Pagani, and R. Pareschi. Distributed Coordination and Workflow on the World Wide Web. CSCW: An international journal, 6:175–200, 1997.
[5] W. Grather, W. Prinz, and S. Kolvenbach. Enhancing Workflows by Web Technology. In Proc. of GROUP’97, Phoenix, Arizona, USA, 1997.
[6] J. J. Halliday, S. K. Shrivastava, and S. M. Wheater. Flexible Workflow Management in the OPENflow system. In Proc. of EDOC’2001, Seattle, Washington, USA, 2001.
[7] B. Karbe, N. Ramsperger, and P. Weiss. Support of Cooperative Work by Electronic Circulation Folders. In Proc. of Conf. on Office Information Systems. ACM Press, Cambridge, MA, 1990.
[8] S. Lakshmivarahan. Algorithms for public-key cryptosystems: Theory and application. Advances in Computers, 22:45–108, 1983.
[9] W. Prinz and S. Kolvenbach. Support for Workflows in a Ministerial Environment. In Proc. of CSCW’96, Cambridge, MA, USA, 1996.
[10] W. van der Aalst and P. Berens. Beyond Workflow Management: Product-Driven Case Handling. In Proc. of Group’01, Boulder, Colorado, USA, 2001.