An Extensible Virtual Digital Libraries Generator

Massimiliano Assante, Leonardo Candela, Donatella Castelli, Luca Frosini, Lucio Lelii, Paolo Manghi, Andrea Manzi, Pasquale Pagano, and Manuele Simi

Istituto di Scienza e Tecnologie dell’Informazione “Alessandro Faedo” – CNR, Pisa, Italy
[email protected]

Abstract. In this paper we describe the design and implementation of the VDL Generator, a tool that simplifies and automates the Digital Library development process. In particular, we discuss how our approach to the realisation of this tool simplifies the task of implementing, extending and modifying such a fundamental component. The tool models its task as a generic search problem that can easily be adapted to different application scenarios. To guarantee its extensibility, we carefully identify, isolate and organise the VDL Generator constituents, i.e. (i) the set of logical components that can be used when designing a Digital Library, (ii) the set of physical components that, by implementing the logical components, contribute to implementing the Digital Library, and (iii) the search strategy exploited to accomplish the generation task. Furthermore, we report on the experience gained in implementing and exploiting this service in the context of the DILIGENT EU funded project and discuss future plans for its consolidation.

B. Christensen-Dalsgaard et al. (Eds.): ECDL 2008, LNCS 5173, pp. 122–134, 2008. © Springer-Verlag Berlin Heidelberg 2008

1 Introduction

Since the term ‘Digital Library’ (DL) was coined and the first systems realising the functions expected from this special kind of information system were implemented, the Digital Library community at large has gone on a very long journey. DLs are now moving far beyond any connotation of the term “library” and are rapidly evolving into general systems covering the whole spectrum of knowledge management tasks [15,16,6]. DLs are now envisioned as systems that are at the centre of any intellectual activity and have no logical, conceptual, physical, temporal, or personal barriers on information [1]. In particular, they are shifting from content-centric systems, in charge of simply organising and providing access to particular collections of data, to person-centric systems that provide facilities for communication, collaboration and any kind of interaction among scientists, researchers, and the general audience interested in the topics the DL has been set up for.

This novel scenario calls into question the development approaches adopted in the past, which mainly consisted in developing ad-hoc systems from scratch, and poses new requirements that demand innovative solutions leading to Digital Library Management Systems (DLMSs) [6]. A very promising and innovative trend is represented by DLMSs built on e-Infrastructures. By definition, an e-Infrastructure is a framework enabling secure, cost-effective and on-demand resource sharing [11] across organisation boundaries. A resource is here intended as a generic entity, physical (e.g. storage and computing resources) or digital (e.g. software, processes, data), that can be shared and can interact with other resources to synergistically provide some functions serving its clients, either human or inanimate. Thus, an e-Infrastructure acts as a “broker” in a market of resources, with the role of accommodating the needs of resource providers and consumers. The infrastructure layer supports (i) resource providers, in “selling” their resources through it, and (ii) resource consumers, in “buying” and orchestrating such resources to build their applications. Further, it provides organisations with logistic and technical aids for application building, maintenance, and monitoring.

A well-known instance of such an e-Infrastructure is the Grid [10], where a service-based paradigm is adopted to share and reuse low-level physical resources [9]. Application-specific e-Infrastructures are in turn inspired by the generic e-Infrastructure framework and bring this vision into specific application domains by enriching the infrastructural resource model with specific service resources, i.e. software units that deliver functionality or content by exploiting available physical resources.

This potentially unlimited market of resources enables a new development paradigm based on the notion of Virtual Digital Library (VDL)¹, i.e. an advanced Digital Library implementing the person-centric pattern mentioned above. A VDL is built by aggregating the needed constituents after hiring them through the e-Infrastructure. In this development paradigm, Digital Libraries are considered as organised ‘views’ built atop the pool of available assets, ranging from computers and servers to collections and services, and the DLMS is the system taking care of the definition and operation of such views. To make this novel paradigm work, three fundamental facilities are needed, i.e.
(i) a mechanism operating the e-Infrastructure while guaranteeing the Quality of Service to all the involved parties, (ii) a mechanism supporting DL communities in easily characterising the VDL they are interested in, and (iii) a mechanism guaranteeing the deployment and operation of the defined VDLs, in addition to a comprehensive pool of resources.

In this paper, we focus on the implementation of this approach in the context of the DILIGENT EU project [8,5]. In particular, we present the VDL Generator, i.e. the component in charge of supporting VDL definition and starting the VDL deployment task. This component guarantees an optimal consumption of the e-Infrastructure resources by selecting the minimal amount of resources to be used and instructing them with a concrete and detailed specification of the behaviour needed to implement the expected environment. It is worth noticing that the design of this component is particularly critical because the e-Infrastructure’s pool of resources is very heterogeneous and dynamic, yet the component must provide an abstract and comprehensive view of it to the VDL designer, i.e. the user in charge of characterising the expected VDL.

The remainder of the paper is structured as follows. Section 2 focuses on the principles guiding the design of the VDL Generator component, showing how the proposed approach guarantees the extensibility of the resulting system w.r.t. the set of resources to be dealt with. Section 3 presents the concrete implementation of the VDL Generator service carried out during the DILIGENT project, with examples of how the designed features are exploited in concrete scenarios. Section 4 reports on existing works addressing issues similar to those touched by the VDL Generator component. Finally, Section 5 concludes the paper and reports on future research issues.

¹ In other contexts it is also known as Virtual Research Environment or Collaboratory.

2 VDL Generator Design

In a DLMS supporting VDLs, the VDL Generator service is the component that must provide its users, a.k.a. VDL designers, with the facilities needed to declaratively characterise the VDLs their community needs and to identify the concrete pool of resources to operate them. Because of this, a VDL Generator is tightly bound to the characteristics of the VDLs it can build by using the particular DLMS it is a part of. If this intrinsic binding is not managed properly from the design of the component onwards, the risk is to end up with a tightly coupled software component that can hardly be exploited in different application scenarios. This issue is particularly critical if the application scenario is extremely dynamic, as an e-Infrastructure is supposed to be. New resources of any type can appear at any time and the VDL Generator must be able to adapt by giving its users the possibility to exploit the newly available assets.

Having extensibility and maintainability as firm requirements, in the remainder of this section we present the design of the VDL Generator service which, by exploiting an object-oriented approach, clearly separates the constituents of such a component and makes room for easily plugging novel components into the VDL generation phases. This design was inspired by the work of N. Kabra and D. DeWitt on the design of a database query optimizer [17].

2.1 Basic Concepts

We assume that: (i) a digital library specification can be logically represented in terms of one or more plans of logical components (logical plans), i.e. trees having logical components as nodes; each logical component can be applied to other logical components representing its inputs; (ii) each logical component can be implemented through one or more physical components, i.e. the concrete constituents implementing the logical component and thus constituents of the actual VDL runtime environment.
By replacing each logical component in a logical plan with physical ones, a concrete deployment plan is generated; (iii) during its work, the VDL Generator is expected to generate various logical plans that represent the VDL specification and various deployment plans corresponding to the logical plans; moreover, it estimates or computes various properties of the logical and deployment plans (e.g. cost) to evaluate alternative solutions; and (iv) the VDL runtime environment exploits a component-oriented approach.

Examples of simple logical and deployment plans are presented in Figure 1. These are sample plans generated to satisfy a specification asking for a simple VDL designed to provide its users with Google-like search facilities on Collection1 and Collection2, and similarity search on Collection3 and Collection4. Logical plans and deployment plans are usually isomorphic, as expected from the assumptions above, but, as shown in the figure, in some cases they can slightly diverge because of the need to force certain characteristics on the input plans (cf. Sec. 2.3).

[Figure 1, panel (a) Logical Plan: UserInterface; KeywordBasedSearch over Collection1 and Collection2; SimilaritySearch over Collection3 and Collection4. Panel (b) Deployment Plan: Portal; Portlet, Portlet; KBSearchService, SSSearchService; CollService(C1), CollService(C2), Transform, CollService(C3), Transform, CollService(C4).]

Fig. 1. Logical plans and deployment plans

The above assumptions, once combined with the identified requirements, lead to the identification of the set of abstract classes in Figure 2.

[Figure 2, class boxes: SearchStrategy (Specification, Resources, OptimalPlan; optimize(), expandPlan()); LogicalComponent (DeploymentPlan[ ], Properties, input[ ]; compose(), applyComponent()); LogicalProperties (isEqualTo()); DLSpecification; ComponentSet; PhysicalComponent (LogicalComponent, Properties, input[ ]; makePlan(), applyComponent(), requirements(), generatePlan2Plan()); PhysicalProperties (isEqualTo()).]

Fig. 2. VDL Generator main classes and methods

Both logical components and physical components are modelled in terms of abstract classes. The search strategy is entirely implemented in terms of these abstract classes and their envisaged methods, and is itself an abstract class. DLSpecification and ComponentSet are helper classes representing, respectively, the VDL specification the VDL Generator is requested to satisfy and the set of available logical components together with the physical components implementing them. A VDL Generator for a specific scenario can be implemented by deriving further classes from these abstract classes. Information about the specific logical and physical components is expected to be encoded in the implementations of the virtual methods in the derived classes. The standard object-oriented inheritance mechanism ensures that the search strategy does not have to be changed whenever novel operators, either logical or physical, are added.

As anticipated, the VDL Generator accomplishes its task by exploiting a search strategy that, by relying on the set of a priori defined virtual methods, dynamically and incrementally generates the logical plans as well as the deployment plans. Such plans, representing alternative candidate solutions, are evaluated by using various characteristics attached to the generated plans. These actions are described in detail in the following sections by introducing the dedicated classes and their methods.
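The class organisation of Figure 2 can be illustrated with a minimal Python sketch. It is not the DILIGENT code: method names are transliterated to Python style, the `register` and `implementations_of` helpers and the `logical` attribute are assumptions introduced for illustration, and only the Collection and CollService names come from Figure 1.

```python
from abc import ABC, abstractmethod


class LogicalComponent(ABC):
    """A node of a logical plan; `inputs` holds the plans it is applied to."""

    def __init__(self, *inputs):
        self.inputs = list(inputs)

    @abstractmethod
    def apply_component(self, inputs):
        """Return True if this component can be applied to `inputs`."""


class PhysicalComponent(ABC):
    """A node of a deployment plan; `logical` is the class it implements."""

    logical = None  # assumed attribute linking physical to logical

    @abstractmethod
    def apply_component(self, inputs):
        """Return True if this component can be applied to `inputs`."""


class SearchStrategy(ABC):
    """Written only against the abstract classes above."""

    @abstractmethod
    def optimize(self, specification, component_set):
        """Orchestrate the whole generation process."""

    @abstractmethod
    def expand_plan(self, plan, component_set):
        """Perform a single generation step."""


class ComponentSet:
    """Helper: logical components and the physical ones implementing them."""

    def __init__(self):
        self._implementations = {}

    def register(self, physical_cls):
        self._implementations.setdefault(physical_cls.logical, []).append(physical_cls)

    def implementations_of(self, logical_cls):
        return list(self._implementations.get(logical_cls, []))


# Scenario-specific classes are derived without touching the classes above.
class Collection(LogicalComponent):
    def apply_component(self, inputs):
        return not inputs  # a collection is always a leaf


class CollService(PhysicalComponent):
    logical = Collection

    def apply_component(self, inputs):
        return not inputs


components = ComponentSet()
components.register(CollService)
```

Under these assumptions, supporting a new kind of resource amounts to deriving one more class and registering it; nothing in `SearchStrategy` or `ComponentSet` needs to change.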


2.2 Representing and Generating Logical Plans

The LogicalComponent class represents the logical components that can be used when defining a VDL. Examples are the single collections forming the requested information space, the functions to be activated on such an information space, like a full-text or geo-spatial search, and the presentation aspects, i.e. the customisations of the VDL user interface. All of these kinds of components are expected to be represented in terms of LogicalComponent derived classes. Logical plans are represented in terms of instances of these classes, by binding each component instance to the component it is an input of.

During the generation task, the VDL Generator service has to produce alternative logical plans and keep track of the properties of the resulting plans in order to evaluate alternative solutions. To produce plans, the search strategy relies on the compose and applyComponent methods. The compose method is inherited by all the logical component classes and implements the machinery to incrementally and systematically produce new plans having the application scenario as input and the operator it represents as parent of this input. The applyComponent method must be implemented in each specific component to determine whether the component can be applied to the given input or not (this logic is component specific).

Whenever a new logical plan is generated, it is annotated with an instance of the LogicalProperties class. This instance captures the properties of the annotated plan that are used when evaluating alternative solutions. In particular, it contains the plan cost, which is usually conditioned by many heterogeneous factors ranging from monetary costs to the number of deployed resources.
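The interplay between the inherited compose method and the component-specific applyComponent method might look as follows. This is a sketch under stated assumptions rather than the actual implementation: `arity`, `base_cost`, the additive cost standing in for LogicalProperties, and the choice of class methods are all hypothetical; only the component names come from Figure 1.

```python
from abc import ABC, abstractmethod
from itertools import combinations


class LogicalComponent(ABC):
    """Node of a logical plan; the plan is the tree rooted at this node."""

    arity = 0        # hypothetical: how many input plans the component takes
    base_cost = 1.0  # hypothetical: cost contributed by this component alone

    def __init__(self, label, inputs=()):
        self.label = label
        self.inputs = tuple(inputs)
        # Stand-in for LogicalProperties: the plan cost is simply additive.
        self.cost = self.base_cost + sum(p.cost for p in self.inputs)

    @classmethod
    def compose(cls, label, candidates):
        """Inherited machinery: build every plan having this component as root."""
        return [cls(label, group)
                for group in combinations(candidates, cls.arity)
                if cls.apply_component(group)]

    @classmethod
    @abstractmethod
    def apply_component(cls, inputs):
        """Component-specific test: is the component applicable to `inputs`?"""

    def describe(self):
        inner = ", ".join(p.describe() for p in self.inputs)
        return f"{self.label}({inner})" if inner else self.label


class Collection(LogicalComponent):
    @classmethod
    def apply_component(cls, inputs):
        return len(inputs) == 0  # collections are the leaves of a plan


class KeywordBasedSearch(LogicalComponent):
    arity = 2

    @classmethod
    def apply_component(cls, inputs):
        return all(isinstance(p, Collection) for p in inputs)


leaves = (Collection.compose("Collection1", [])
          + Collection.compose("Collection2", []))
plans = KeywordBasedSearch.compose("KeywordBasedSearch", leaves)
print(plans[0].describe(), plans[0].cost)
```

Here `compose` never inspects which component it is building; all scenario knowledge sits in `apply_component`, which is the separation the section describes.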
2.3 Representing and Generating Deployment Plans

The PhysicalComponent class represents the concrete architectural components that, once deployed in the context of a VDL, implement the functions of the logical component they are conceptually dedicated to. Examples are the component implementing a collection service, the component implementing the Google-like search by relying on Lucene technologies, and the component implementing the user interface through Ajax technologies². All the physical components that can be used are expected to be represented as derived classes of PhysicalComponent. It is worth noticing that alternative physical components, having different costs and characteristics, can be developed to implement the same logical component. It is the job of the VDL Generator to produce all the valid deployment plans. Deployment plans are represented in terms of instances of these classes, by binding each component instance to the component it is an input of.

During the generation task, the VDL Generator service has to produce alternative deployment plans and keep track of the properties of the resulting plans in order to evaluate alternative solutions. To produce plans, the search strategy relies on the makePlan and applyComponent methods. These methods behave similarly to their LogicalComponent counterparts: the former produces new plans having the application scenario as input and the component it represents as parent of this input; the latter implements the component-specific logic to determine whether the component can be applied to the given inputs or not. In addition, two other methods are envisaged, namely requirements and generatePlan2Plan, to deal with enforcer (a.k.a. glue) components, i.e. physical components explicitly added to the input plans in order to guarantee the applicability of the current physical component. Examples are the portlets and the transformers in Figure 1. The requirements method implements the logic needed to declare the characteristics the input plan must satisfy, e.g. the metadata formats of the objects. The generatePlan2Plan method implements the logic needed to verify and apply one of the available physical components to the input plan in order to guarantee the expected characteristics. Both these methods are component specific.

Whenever a new deployment plan is generated, it is annotated with an instance of the PhysicalProperties class, which captures the properties of the deployment plan to be used when evaluating alternative solutions, namely the plan cost, which is a function of various factors.

² These examples are intentionally “technology agnostic”, i.e. the notion of component can be materialised in a web service or in a software module in the VDL runtime environment.

2.4 The Search Strategy

In the previous sections we have presented the foundation mechanisms. In this section we clarify how these mechanisms are used by the search strategy to generate logical and deployment plans. Any search strategy that is implemented entirely in terms of the envisaged abstract classes and methods becomes independent of the pool of logical and physical components that can be used while generating the plan satisfying the requested VDL specification. The SearchStrategy class is the abstract class representing a search strategy; a specific strategy must be implemented by deriving from this abstract class.
It is provided with two abstract methods: optimize, i.e. the method orchestrating the whole generation process, which generates the deployment plan by consuming the VDL specification, and expandPlan, i.e. the method implementing the single steps of the generation task. Various search strategies can be implemented by using this approach. Moreover, since the search strategy and the application context are independent of each other, it is possible to easily re-use and experiment with strategies developed in other disciplines, e.g. dynamic programming, greedy, simulated annealing, hill climbing and iterative improvement techniques [22].

The current implementation of the VDL Generator has been equipped with a search strategy adopting dynamic programming with a bottom-up approach. This strategy generates the various plans in a bottom-up manner, as follows. The first step consists in generating the leaf plans implementing the information space. Then, one logical plan to be expanded is iteratively selected and new logical plans are exhaustively generated having the selected plan as input and the applicable operators as roots. For each newly generated logical plan, the strategy exhaustively generates the deployment plans having as input the plans produced to implement the input logical plans and as root the physical components implementing the root logical component.


Cost-based pruning of the generated plans is performed in order to avoid exploring parts of the search space that cannot lead to optimal solutions. In addition to this strategy, the VDL Generator has been equipped with a greedy version of it that ensures a short search time but can only guarantee a near-optimal solution. This greedy version replaces the exhaustive generation of all the possible plans with the expansion of only the most promising plan in each expansion step.

2.5 Extensibility

As anticipated, to meet the extensibility and maintainability requirements previously identified, the VDL Generator is, by design, composed of three parts: the search strategy, the logical components and their search space, and the physical components and their search space. Each of these macro parts can be changed independently. However, the main aspects on which VDL Generator implementers are expected to act to adapt and extend the system are the logical and physical components, in particular the latter because of the potentially higher number of alternative solutions implementing a certain logical component. Adding a novel component, either logical or physical, is quite simple, as it is sufficient to derive the relevant class and implement its fundamental methods. Making a change in the plans, i.e. in the rules governing the plan shapes, is more invasive: such changes can potentially affect the whole set of logical or physical component classes. As for the search strategies, it is expected that the one the system is already equipped with will be re-used. However, adding a novel search strategy is also straightforward, as it is sufficient to derive the dedicated class.
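To make the difference between the exhaustive bottom-up strategy and its greedy variant concrete, the following Python sketch runs both on a toy catalogue. Everything in it is an assumption made for illustration: the catalogue, the costs, the metadata formats and the tuple-based plan encoding are invented, and `optimize` is given a shared default implementation for brevity although the design above declares it abstract.

```python
from abc import ABC, abstractmethod

# Hypothetical catalogue: logical component -> (physical component, cost, output format).
CATALOGUE = {
    "Collection": [("CollServiceA", 1, "marc"), ("CollServiceB", 2, "dc")],
    "Search": [("KBSearchService", 1, "results")],
}
REQUIRED_INPUT = {"KBSearchService": "dc"}  # format a component expects from its inputs
TRANSFORM_COST = 3                          # cost of the Transform enforcer


def enforce(plan, wanted):
    """Wrap `plan` in a Transform enforcer when its format is not `wanted`."""
    cost, fmt, tree = plan
    if wanted is None or fmt == wanted:
        return plan
    return (cost + TRANSFORM_COST, wanted, ("Transform", tree))


class SearchStrategy(ABC):
    def optimize(self, logical_plan):
        """Orchestrate the generation and return the cheapest (cost, format, tree)."""
        return min(self.expand_plan(logical_plan), key=lambda plan: plan[0])

    @abstractmethod
    def expand_plan(self, logical_plan):
        """Single generation step: candidate deployment plans for `logical_plan`."""


class BottomUpStrategy(SearchStrategy):
    def expand_plan(self, logical_plan):
        kind, children = logical_plan
        options = [self.expand_plan(child) for child in children]  # leaves first
        best = {}  # output format -> cheapest plan found so far
        for name, cost, fmt in CATALOGUE[kind]:
            wanted = REQUIRED_INPUT.get(name)
            inputs = [min((enforce(p, wanted) for p in alternatives),
                          key=lambda plan: plan[0])
                      for alternatives in options]
            total = cost + sum(p[0] for p in inputs)
            # Cost-based pruning: keep one plan per output format.
            if fmt not in best or total < best[fmt][0]:
                best[fmt] = (total, fmt, (name, *[p[2] for p in inputs]))
        return list(best.values())


class GreedyStrategy(BottomUpStrategy):
    def expand_plan(self, logical_plan):
        # Keep only the locally cheapest plan at every expansion step.
        return [min(super().expand_plan(logical_plan), key=lambda plan: plan[0])]


spec = ("Search", [("Collection", []), ("Collection", [])])
optimal = BottomUpStrategy().optimize(spec)
greedy = GreedyStrategy().optimize(spec)
print(optimal[0], greedy[0])
```

With these contrived costs, the greedy variant commits early to the cheaper collection service and later pays for two Transform enforcers, while the exhaustive variant keeps the more expensive alternative alive because it produces a different format. Neither strategy class mentions a concrete component, so extending the catalogue does not require touching them.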

3 Experiences with the VDL Generator: DILIGENT

DILIGENT [8,5] is an EU FP6 funded project which lasted more than three years and successfully ended in November 2007. The goal of the project was to implement an e-Infrastructure supporting the creation and maintenance of virtual research environments, i.e. virtual digital libraries activated on a shared pool of resources ranging from traditional grid resources, like computing and storage resources, to various types of collections, data sources and application services. To prove the feasibility of the proposed approach, two concrete and complementary scenarios, involving various resources from the Cultural Heritage and Earth Observation domains, have been operated. Because of this, the resulting infrastructure is highly heterogeneous, e.g. data sources range from those containing digital documents like research studies, reports and papers to those containing satellite images, maps plotting the distribution of certain indicators like pollution and cloud cover, ancient books and related images, and Earth Observation products such as Chlorophyll-1.

Driven by the project goals and the concrete needs arising from these scenarios, a powerful service-oriented application framework named gCube [7] has been developed. This framework is an implementation of the DLMS concept and it has been conceived (i) to provide an e-Infrastructure with the management functions needed to properly deal with the potentially huge and heterogeneous set of constituent resources and users, (ii) to have on board the minimal set of functions needed to implement VDLs offering uniform information management and organisation facilities on a heterogeneous and dynamic information space, and (iii) to be open and extensible so as to easily adapt to different application contexts. The resulting framework follows a service-oriented approach and consists of 60 web services, 44 helper software libraries and 33 portlets.

One of the constituents of this complex framework is the VDL Generator service previously described. The service is organised according to the Model-View-Controller pattern [4]. It complements the presented machinery, which implements the logic needed to generate the specified VDL, with a user interface that helps the VDL designer during the creation of the specification. This user interface has been implemented by using a wizard approach, i.e. the definition process is organised in steps and the user interface drives the designers through these steps. The steps have been inspired by the Digital Library Manifesto [6] and consist of a subset of the identified dimensions, through which designers define their needs in terms of constraints on Content, Functionality, User and Architecture. Figure 3 shows highlights of the wizard steps. Once the specification is completed, it is passed to the VDL Generator logic, which must digest it and produce the deployment plan. The search strategy exploited to accomplish this task has been described before (cf. Sec. 2.4).

Fig. 3. DILIGENT VDL Generator Wizard Snapshots

As for the set of logical components implemented in the DILIGENT context, we do not enumerate all of them because of the space constraints of the paper, but provide some examples instead. A Collection component has been implemented to represent the constituents of the information space. Instances of this component are the leaves of any DILIGENT logical plan and can be used to capture both a materialised and a virtual collection, i.e. a collection generated at deployment time by manipulating existing information objects. A Search component has been implemented to represent the various search operations that have to be activated on the VDL information space. Each instance of this component represents a different type of search function and leads to the selection of a different physical component (usually a set of components) in the deployment plan. A Process component has been implemented to represent workflows, i.e. macro-functionality defined by combining existing functions into a sequence, usually not linear. Each instance of this component represents a workflow the user explicitly decided to use in the VDL and leads to a set of physical components in the deployment plan. A UserInterface component has been implemented to represent the presentation layer of the VDL. This is the placeholder for all the VDL user interface customisation aspects and it is expected to be the root of all the logical plans.

During the generation of the logical plans, the search strategy also incrementally generates the deployment plans by exploiting the physical components the generator has been instructed to deal with. Because of the space constraints, we do not enumerate all the components that have been implemented in DILIGENT but provide some samples to discuss how the VDL Generator exploitation patterns have been used. A CollectionService component has been implemented to represent each collection forming the information space. In the case of a virtual collection, the plan is enforced with a physical component implementing the transformations needed to materialise the collection. A GeoSearch component has been implemented to represent the process needed to accomplish a geo-referenced search. This component leads to the deployment plan being enforced with a set of specific components, e.g. QueryPlanner and GeoIndexLoockupService [24]. A ProcessEngine component has been implemented to represent the component in charge of executing workflows. An instance of this component implements an instance of the Process component above and may lead to the enforcement of the input deployment plan by adding the physical components that are exploited in the Process workflow.

A generic gCube peculiarity, i.e. one not linked to a particular physical component, is related to the management of new service instances versus existing service instances. In fact, one of the innovative features supported by gCube is dynamic deployment, i.e. the possibility to dynamically create a new instance of a service by using one of the available hosting nodes the infrastructure is provided with. This feature is captured through an enforcer, i.e. each service that has to be deployed is added to the deployment plan together with its candidate hosting node. On the contrary, if an existing instance has to be used, it is sufficient to add the instance of the physical operator representing it. A similar approach is implemented in the case of services adopting the factory pattern [3], i.e. services having a factory that is capable of dynamically generating the resource on which the service has to act. In this case the factory is modelled through an enforcer.

Another particularly relevant exploitation of the enforcers arises in the generation of the deployment plan implementing the UserInterface logical operator. The Portal component has been envisaged to implement it, but it is expected to enforce the input deployment plan with a set of Portlet component instances, each implementing a specific piece of the interface. For example, the GeoSearchPortlet is added whenever the deployment plan contains a GeoSearch component in order to give the VDL users access to its facilities.

These few examples provide an overview of the issues arising during dynamic VDL generation in complex scenarios such as those captured by DILIGENT. Despite the apparent complexity of the solution, the choices driving the VDL Generator design have proven to be effective in satisfying the extensibility and maintainability needs.
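The enforcer mechanism used in these examples can be sketched in Python as follows. This is only an illustration of the requirements and generatePlan2Plan idea, not the gCube code: the `Plan` record, the `metadata_format` property, the "features" and "dc" format names and the `PORTLETS` mapping are assumptions; GeoSearch, GeoSearchPortlet, Portal, Portlet, Transform, SSSearchService and CollService are the names used in the text and in Figure 1.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Deployment plan node (a hypothetical, much simplified representation)."""
    component: str
    inputs: list = field(default_factory=list)
    metadata_format: str = "any"  # assumed plan property

    def contains(self, name):
        return self.component == name or any(p.contains(name) for p in self.inputs)


class PhysicalComponent:
    name = "Abstract"

    def requirements(self):
        """Characteristics the input plans must satisfy (none by default)."""
        return {}

    def generate_plan2plan(self, plan):
        """Add an enforcer on top of `plan` if it does not meet requirements()."""
        wanted = self.requirements().get("metadata_format")
        if wanted and plan.metadata_format != wanted:
            return Plan("Transform", [plan], metadata_format=wanted)
        return plan

    def make_plan(self, inputs):
        """Build the plan rooted at this component over the enforced inputs."""
        return Plan(self.name, [self.generate_plan2plan(p) for p in inputs])


class SSSearchService(PhysicalComponent):
    name = "SSSearchService"

    def requirements(self):
        return {"metadata_format": "features"}  # assumed format name


class Portal(PhysicalComponent):
    name = "Portal"
    PORTLETS = {"GeoSearch": "GeoSearchPortlet"}  # hypothetical lookup table

    def generate_plan2plan(self, plan):
        # Each input plan is wrapped in the portlet exposing its functionality.
        for service, portlet in self.PORTLETS.items():
            if plan.contains(service):
                return Plan(portlet, [plan])
        return Plan("Portlet", [plan])


portal_plan = Portal().make_plan([Plan("GeoSearch")])
search_plan = SSSearchService().make_plan([Plan("CollService", metadata_format="dc")])
print(portal_plan.inputs[0].component, search_plan.inputs[0].component)
```

In this sketch the enforcers never appear in the logical plan; they are introduced only while the deployment plan is built, which is why the two plans of Figure 1 are not perfectly isomorphic.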

4 Related Work

Virtual Digital Libraries represent the mechanism we envisaged in order to overcome the drawbacks of the current digital library development processes. The VDL Generator, being a component of the DLMS, has not been created or proposed elsewhere simply because none of the current systems, besides gCube, provides the functionality expected from the DLMS we envisage. The closest work to our approach is represented by a family of tools built on top of the 5S framework and discussed in detail in the following section.

The problem of service composition has been studied in the web services community [19,2,21,20]. However, the proposed techniques seem to suffer from their general-purpose approach. The digital library area is restricted to certain types of components and well-known constraints; therefore the problem is more manageable and can be tackled with different, domain-specific techniques.

4.1 The 5S Products: 5SL, 5SGraph, and 5SGen

In his dissertation based on the 5S framework [14,12], Gonçalves presented a series of tools and applications for modelling and semi-automatically customising digital library services named 5SL, 5SGraph, and 5SGen.

5SL [13] is a declarative domain-specific language for digital library specification. With this language the specification of a digital library consists of five models related to the dimensions of the underlying formal framework. The stream model specifies the format of media objects supported by the digital library according to the web standard MIME types. The structural model defines, via an XML Schema, the structure of the information objects as well as the properties of collections and metadata the digital library deals with. The spatial model gives details about the digital library retrieval model, the characteristics of indexes, and the user interface appearance.
The societal model makes it possible to model the characteristics of actors and services by identifying the five core services each digital library must provide, i.e. user interface, index, search, repository, and browse. For both actors and services, the set of attributes and the set of interactions with the services are modelled. For services, the description of the operations is provided as well. Finally, the scenario model describes the behaviour of a service via a sequence of events. All these constructs are provided in XML.

Like any domain-specific language, 5SL has its own problems, namely (i) different semantics (at least one for each model) must be understood to define a digital library, (ii) the definition of a complex digital library is difficult even for experts, since there is a great amount of XML to be manually produced and a number of semantic constraints and dependencies to be verified in order to ensure consistency, and (iii) it is difficult to obtain the big picture of the defined digital library. To overcome these problems, 5SGraph was proposed.

5SGraph [25] is a domain-specific visual digital library modelling tool whose output is a specification of a digital library in terms of the 5SL language. This tool can be configured with a set of characteristics of the digital libraries it is allowed to create. These characteristics are expressed in terms of a 5S metamodel. The tool is thus able to enforce these constraints and ensure the semantic consistency and correctness of the digital library specifications produced.

5SGen [18] is the last link in the proposed digital library development chain. In particular, this software system is dedicated to the semi-automatic production of digital library components fulfilling the models of societies and scenarios expressed in terms of the 5SL language. The proposed approach is based on a component-oriented view of digital library systems and, thanks to this, DL designers can model (via sequence diagrams and state chart diagrams) the behaviour and co-operation of such basic components in delivering the expected digital library functionality. They can then obtain a set of service manager modules that produce the planned digital library functionality by exploiting freeware tools capable of dynamically generating Java code (i) from the XMI³ representation of the models for societies (XMI2Java) and (ii) from the finite state machine representation of the models for scenarios. In comparison, our proposed approach is less oriented towards software engineering and code generation, as it aims at creating digital libraries without the production of code.

Recently, Santos et al. [23] presented a wizard tool for setting up DLs.
This research presents many commonalities with the one proposed in this context despite the application context is completely different as well as the process governing the wizard activity is described too vaguely to be properly compared with the one proposed in this paper.
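To make the notion of a scenario as a sequence of events more concrete, the following Python fragment is a minimal sketch, under stated assumptions, of how such a sequence could be checked against a finite state machine describing a service's behaviour. It is neither 5SL syntax nor the output of 5SGen: the class `ServiceFSM`, the state names and the event names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceFSM:
    """Hypothetical finite state machine for the behaviour of one service."""
    name: str
    initial: str
    finals: set
    # Maps (current state, event) to the next state.
    transitions: dict = field(default_factory=dict)

    def accepts(self, events):
        """Return True if the events lead from the initial state to a final state."""
        state = self.initial
        for event in events:
            key = (state, event)
            if key not in self.transitions:
                return False  # the event is not allowed in the current state
            state = self.transitions[key]
        return state in self.finals


# Illustrative machine for a "search" service (all names are assumptions).
search = ServiceFSM(
    name="search",
    initial="idle",
    finals={"idle"},
    transitions={
        ("idle", "submit_query"): "querying",
        ("querying", "consult_index"): "ranking",
        ("ranking", "return_results"): "idle",
    },
)

# Two candidate scenarios, each a plain sequence of events.
valid_scenario = ["submit_query", "consult_index", "return_results"]
invalid_scenario = ["submit_query", "return_results"]

print(search.accepts(valid_scenario))
print(search.accepts(invalid_scenario))
```

In this sketch the first scenario is accepted because every event is permitted in the state where it occurs and the machine ends in a final state, whereas the second is rejected because it skips the index consultation step.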

5 Conclusion and Future Work

The demand for DLs has evolved in recent years, bringing into question the development processes adopted in the past. A new trend is taking shape, based on DLMSs supporting VDLs built by dynamically hiring the needed resources from those made available by an e-Infrastructure. In this paper we have presented the VDL Generator, a tool that supports this scenario by providing VDL designers with the facilities needed to characterise the VDLs their communities are interested in and to transparently obtain the desired environments. In particular, we have discussed how the design of the VDL Generator guarantees the extensibility of all its constituents, i.e. the logical components, the physical components and the search strategy, and exemplified the concrete exploitation of such a tool in the context of the DILIGENT EU project.

In January 2008, the D4Science⁴ EU-funded project started to consolidate and put into production the gCube technology experimented with in DILIGENT. The project plans to provide the Environmental Monitoring and Fishery Resource Management communities with VREs. These application scenarios represent a very challenging arena in which the VDL Generator service will be consolidated and expanded by enlarging its pool of logical and physical components so as to capture the novel resources these communities are interested in. Moreover, novel search strategies will be experimented with, aiming at improving the quality of the generation service both in terms of responsiveness, i.e. the time needed to find a solution, and breadth of the analysed search space, i.e. the set of constraints and characteristics taken into account while producing the deployment plan.

Acknowledgments. This work is partially funded by the European Commission in the context of the DILIGENT (FP6) and D4Science (FP7) projects.

³ An XML serialisation of the UML diagrams.
⁴ DIstributed colLaboratories Infrastructure on Grid ENabled Technology for Science, www.d4science.eu

References

1. DIGITAL LIBRARIES: Future Directions for a European Research Programme. Brainstorming report, DELOS, San Cassiano, Alta Badia, Italy (June 2001)
2. Aggarwal, R., Verma, K., Miller, J., Milnor, W.: Constraint Driven Web Service Composition in METEOR-S. In: Proceedings of the IEEE SCC 2004 (2004)
3. Banks, T.: Web Services Resource Framework (WSRF) - Primer. Committee draft 01, OASIS (December 2005), http://docs.oasis-open.org/wsrf/wsrf-primer-1.2-primer-cd-01.pdf
4. Buschmann, F., Meunier, R., Rohnert, H., Sommerlad, P., Stal, M.: Pattern-Oriented Software Architecture. John Wiley & Sons, Chichester (1996)
5. Candela, L., Akal, F., Avancini, H., Castelli, D., Fusco, L., Guidetti, V., Langguth, C., Manzi, A., Pagano, P., Schuldt, H., Simi, M., Springmann, M., Voicu, L.: DILIGENT: integrating Digital Library and Grid Technologies for a new Earth Observation Research Infrastructure. International Journal on Digital Libraries 7(1-2), 59–80 (2007)
6. Candela, L., Castelli, D., Ioannidis, Y., Koutrika, G., Pagano, P., Ross, S., Schek, H.-J., Schuldt, H., Thanos, C.: Setting the Foundations of Digital Libraries: The DELOS Manifesto. D-Lib Magazine 13(3/4) (March/April 2007)
7. Candela, L., Castelli, D., Pagano, P.: gCube: A Service-Oriented Application Framework on the Grid. ERCIM News (72), 48–49 (January 2008)
8. Castelli, D., Candela, L., Pagano, P., Simi, M.: DILIGENT: A DL Infrastructure for Supporting Joint Research. In: IEEE Computer Society (ed.) 2nd IEEE-CS International Symposium Global Data Interoperability - Challenges and Technologies, pp. 56–69 (2005)
9. EGEE: Enabling Grids for E-sciencE, INFSO 508833, http://public.eu-egee.org/
10. Foster, I., Kesselman, C., Nick, J., Tuecke, S.: The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. In: Open Grid Service Infrastructure WG, Global Grid Forum (June 2002)
11. Foster, I., Kesselman, C., Tuecke, S.: The Anatomy of the Grid: Enabling Scalable Virtual Organizations. The International Journal of High Performance Computing Applications 15(3), 200–222 (2001)
12. Gonçalves, M.A.: Streams, Structures, Spaces, Scenarios, and Societies (5S): A Formal Digital Library Framework and Its Applications. PhD thesis, Virginia Polytechnic Institute and State University (November 2004)
13. Gonçalves, M.A., Fox, E.A.: 5SL - A Language for Declarative Specification and Generation of Digital Libraries. In: Proceedings of the 2nd ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2002), Portland, Oregon, pp. 263–272 (July 2002)
14. Gonçalves, M.A., Fox, E.A., Watson, L.T., Kipp, N.A.: Streams, Structures, Spaces, Scenarios, Societies (5S): A Formal Model for Digital Libraries. ACM Transactions on Information Systems (TOIS) 22(2), 270–312 (2004)
15. Ioannidis, Y.: Digital libraries at a crossroads. International Journal on Digital Libraries 5(4), 255–265 (2005)
16. Ioannidis, Y., Maier, D., Abiteboul, S., Buneman, P., Davidson, S., Fox, E., Halevy, A., Knoblock, C., Rabitti, F., Schek, H., Weikum, G.: Digital library information-technology infrastructures. International Journal on Digital Libraries 5(4), 266–274 (2005)
17. Kabra, N., DeWitt, D.J.: OPT++: An Object-Oriented Implementation for Extensible Database Query Optimization. VLDB Journal 8(1), 55–78 (1999)
18. Kelapure, R., Gonçalves, M.A., Fox, E.A.: Scenario-based Generation of Digital Library Services. In: Koch, T., Sølvberg, I.T. (eds.) ECDL 2003. LNCS, vol. 2769, pp. 263–275. Springer, Heidelberg (2003)
19. Martin, D., Burstein, M., Hobbs, J., Lassila, O., McDermott, D., McIlraith, S., Narayanan, S., Paolucci, M., Parsia, B., Payne, T., Sirin, E., Srinivasan, N., Sycara, K.: OWL-S: Semantic Markup for Web Services (2004), http://www.daml.org/services/owl-s
20. Narayanan, S., McIlraith, S.A.: Simulation, verification and automated composition of web services. In: WWW 2002: Proceedings of the 11th international conference on World Wide Web, pp. 77–88. ACM Press, New York (2002)
21. Pistore, M., Barbon, F., Bertoli, P., Shaparau, D., Traverso, P.: Planning and Monitoring Web Service Composition. In: Workshop on Planning and Scheduling for Web and Grid Services in conjunction with ICAPS 2004 (2004)
22. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach. Prentice Hall International, Englewood Cliffs (1995)
23. Santos, R.L.T., Roberto, P.A., Gonçalves, M.A., Laender, A.H.F.: Design, Implementation, and Evaluation of a Wizard Tool for Setting Up Component-Based Digital Libraries. In: Gonzalo, J., Thanos, C., Verdejo, M.F., Carrasco, R.C. (eds.) ECDL 2006. LNCS, vol. 4172, pp. 135–146. Springer, Heidelberg (2006)
24. Simeoni, F., Candela, L., Kakaletris, G., Sibeko, M., Pagano, P., Papanikos, G., Polydoras, P., Ioannidis, Y.E., Aarvaag, D., Crestani, F.: A Grid-Based Infrastructure for Distributed Retrieval. In: Kovács, L., Fuhr, N., Meghini, C. (eds.) ECDL 2007. LNCS, vol. 4675, pp. 161–173. Springer, Heidelberg (2007)
25. Zhu, Q., Gonçalves, M.A., Shen, R., Cassell, L., Fox, E.A.: Visual Semantic Modeling of Digital Libraries. In: Koch, T., Sølvberg, I.T. (eds.) ECDL 2003. LNCS, vol. 2769, pp. 325–337. Springer, Heidelberg (2003)
