Enterprise Software Deployment: Foundations & Related Technologies
T. Coupaye and J. Estublier
IMAG LSR / Dassault Systèmes Joint Laboratory, Actimart Bat. 8, 2 Allée de Roumanie, 38610 Gières, France
{Thierry.Coupaye,Jacky.Estublier}@imag.fr
Abstract. While software engineering has so far focused mainly on software development, software deployment is now emerging as a new research field. Software (or application) deployment is a complex process that covers all the activities carried out from the end of development on the producer sites to the actual installation and maintenance of the application on consumer computers. This article lays the foundations for enterprise software deployment, which is the large-scale deployment of complex applications in large companies. It also surveys related technologies and describes how they are relevant to enterprise deployment.
1 Introduction
Application (or software) deployment is a complex process that covers all the activities carried out from the end of development on producer sites to the actual installation and maintenance of the application on consumer computers. Examples of such activities are the packaging of applications on producer sites, the transfer of applications from producer sites to consumer sites, and the installation, update and eventual uninstallation of applications on consumer sites.

It is worth noting that until recently software engineering focused on the development and evolution of applications. Very little research dealt with the delivery, installation and maintenance of applications on consumer sites. Whether this was considered unimportant, too complex, or hardly feasible with the available technologies, application deployment was done in an ad hoc and very poorly automated way (basically with installation scripts). The software industry, however, is becoming very sensitive to the subject: the manual approach to deployment will not remain viable much longer.

Dassault Systèmes is one of the world leaders in CAD/CAM solutions. CATIA, its main product, is very large (millions of lines of code) and composed of hundreds of components. Such large-scale deployments are hazardous because:
• applications are more and more complex: numerous components, versions and variants,
• applications evolve more and more quickly (a few months between two consecutive released versions),
• applications rely on other applications, components or services such as operating systems, middleware, servers (databases, web), editors, compilers, etc.,
• computing environments have shifted from mainframes to workstations,
• coexisting hardware and software targets are more and more numerous (flavors of Unix or Windows).

Moreover, the clients of Dassault Systèmes include very big companies such as Boeing or Chrysler, in which CATIA is deployed on thousands of seats. This aspect of large-scale deployment, which we call enterprise software deployment, adds new hazards. Deployment in this context must satisfy several constraints:
• Temporal constraints. The deployment must follow strict scheduling and synchronization constraints; for example, simultaneously on all seats (big bang), or incrementally following well-defined paths.
• Consistency constraints. The different instances of the application, installed on each seat, are likely to interact with each other. A number of compatibility constraints between the different application versions must be observed.
• Criticality. Deploying a vital application on all seats simultaneously is a very risky operation that may prevent the whole company from doing its job.

For all these reasons, it is becoming crucial to minimize deployment time and cost, and to avoid the risk of production breaks in consumer enterprises.

The research community, with the work on the Software Dock [26,27,28,29], has started to show interest in software deployment. Section 3.2 describes the contributions of the Software Dock to application deployment. For the time being, we argue that although no tool yet covers the whole deployment process, many domains in computer science are related to deployment, and studying them might be very useful.
Among these technologies are: Software Description Formalisms (SDF), Organization Description Formalisms, Software Configuration Managers (SCM), Package Managers (PM), Content Delivery (CD) systems and push technology, Installation Tools (IT), Application Management Systems (AMS) and Configurable Distributed Systems (CDS).
[Figure 1 diagram. Labels: Producer, Enterprise, User; SDF, SCM, PM, CD, ODF, AMS, CDS, IT; WAN, LAN; Enterprise Deployment.]
Fig. 1 Enterprise Software Deployment and Concurrent Technologies.

Our aim in this article is to lay the foundations for enterprise application (software) deployment by stressing that enterprise deployment spans three conceptual layers: the Producer, Enterprise and User layers (cf. Fig. 1). These foundations could also be used to develop deployment infrastructures that automate the deployment activities as much as possible. Existing technologies cover only one or two of these layers (cf. Fig. 1): SDFs, SCMs and PMs focus on activities performed at the producer level; CDs and ITs deal with the delivery of applications from a producer site to consumer user computers. AMSs and CDSs deal to some extent with the enterprise and user layers, but they generally adopt a centralized approach in which the enterprise layer effectively plays the role of the producer layer, so that the enterprise layer is, in a sense, missing. The enterprise layer is the least developed layer in existing technologies (including the Software Dock). Yet we consider it of primary importance for two reasons:
• Enterprise customized applications. Professional software applications most often offer their clients capabilities for customer-specific extensions, using either an API or a specific language. This is especially true in the case of Dassault Systèmes. Boeing, for instance, has developed over time many CATIA extensions whose size is about the same as that of CATIA itself. It is at the enterprise level that the producer application and the enterprise extensions are assembled to constitute the application that will actually be deployed on users' computers.
• Large consumer enterprises.
The second reason is that consumer enterprises want to control the overall deployment process as much as possible, especially the scheduling, synchronization and consistency of the deployment activities, so that applications are always operational and production stops are avoided. The deployment process is based on the organization of the consumer enterprises (teams, positions, projects, etc.) and on deployment policies defined in these enterprises (likely to differ from one application to another).

The remainder of this article is organized as follows. Section 2 defines the enterprise deployment process. Section 3 surveys existing technologies and states how they are relevant to enterprise deployment. Section 4 concludes and introduces our ongoing efforts to develop an enterprise deployment infrastructure.
2 The Enterprise Deployment Process
We see application deployment as a process, which is thus described in terms of:
• activities, which represent steps in the process,
• products, which represent data that flows between activities,
• and resources (humans, hardware and software) needed by the process.
Section 2.1 focuses on activities while Section 2.2 focuses on products and resources, which are described by different models.

2.1 Activities
Deployment is a single process, but it can also be seen as three interconnected processes, since the activities are performed on different sites (producer, enterprise and user) and do not have the same organizational objectives and responsibilities:
• The objective of a producer is mainly to pack and advertise what it wants to deliver to its consumers.
• The goal at the enterprise level is to prepare the physical deployment on the users' computers inside the enterprise. The main activities here are the assembly of the different components or applications that will form the final application to be deployed, and the specification of deployment policies, which state when and how the physical deployment will be performed.
• The objective at the user level is the actual physical deployment, i.e., the assembly and maintenance of an operational version of the application on each user's computer.

Activities on the Producer Site

The activities presented here are under the producer's responsibility. They represent the link between application development and application deployment. From a producer's point of view, application deployment covers two activities: release and unrelease (cf. Fig. 2).
[Figure 2 diagram. Labels: application artefacts; build & test tools & process; application model; unrelease; release; select; configure; build; retire; pack; advertise; PDS; released application package; SDK.]
Fig. 2 The Deployment Process at the Producer Level.
Releasing an application covers the activities that have to be done before the actual transfer of the application to consumers. The release activity has two sub-activities: pack and advertise. Packing an application consists in building a self-contained package that contains the application artifacts, the application model and any other information necessary to manage (build, configure, test, activate, etc.) the application after its transfer to consumers. The package should be in a form that is easy to transfer by electronic means (encoded, compressed, etc.).

Unreleasing an application means withdrawing the producer's support for this application. The main activity here is retire. It is assumed that an enterprise deployment infrastructure would provide on the producer's site an application repository, which supports the storage and retrieval of the applications distributed by this producer. When retiring an application, the infrastructure must make sure that this does not harm other applications in the repository (they might share some components, for instance).

Advertise is a sub-activity of both the release and unrelease activities. Advertising means letting consumers know that a new or revised version of an application has been released or that support for an application has been cancelled. The advertisement can take any form of communication, such as telephone or postal mail. In an automated deployment infrastructure, it would for instance use e-mail or newsgroups to notify people, or other notifications (events) to notify parts of the infrastructure itself (agents) on the consumer side.

Activities on the Enterprise Site

Activities described in this section are carried out at the level of the enterprise. They are preparatory activities to the physical deployment on end users' computers. This level is crucial for enterprise deployment because it reflects the organizational choices concerning the deployment process in the whole enterprise.

The first activity that follows the release of an application is the transfer of this application to consumer enterprises (cf. Fig. 3). It is in fact debatable whether this activity is controlled from the consumer's site or from the producer's site. From a technical point of view, in an automated deployment infrastructure, control of the transfer from the producer's site tends to imply a push technology, while control from the consumer's site tends to imply a pull technology. We believe the transfer is more likely to be controlled from the consumer's site, not for technological reasons (although crossing firewalls could be something of an obstacle) but for more political ones: large enterprises that deploy complex applications will very likely want as much control as possible over the deployment process. Small enterprises might prefer a push technology, especially for small applications. At this point, however, what matters is that a transfer activity exists, and that it is a real activity since it raises organizational and technical issues.
[Figure 3 diagram. Labels: released application package; SDK; Push/pull; transfer; EDS; released application package; SDK.]
Fig. 3 The Transfer Activity from a Producer's Site to Consumers' Sites.
Once the released application package has been transferred to the consumer's site, the assemble activity begins (cf. Fig. 4). This is one of the most time- and effort-consuming activities. It consists in building a complete application, which will be the typical application deployed on end-users' machines inside the enterprise. The building blocks or components may come from different released application packages, possibly from different producers, plus extensions developed by the consumer enterprise itself. We believe this activity is crucial to the enterprise deployment of complex applications. It is so in the context of Dassault Systèmes: the main application, CATIA, is already complex (several hundred components) and voluminous (several million lines of code) by itself, but the extensions developed by consumers can be as large or even larger. This is the case with the Boeing Company, for instance, which has developed over the years many components devoted to airplane design and construction.

The assemble activity is broken down into four sub-activities: unpack, compose, test and pack. After the unpack activity, the components of the released application package are directly available. Then the compose activity starts. This is really the core of the assemble activity. It takes a significant amount of time and human resources and cannot be fully automated. It consists in putting together components coming from different sources, and is mainly a 'manual' programming activity. Tools provided by the producer can be useful, not to say indispensable. These tools can form a complete development environment or Software Development Kit (SDK in the figures). This is again the case with Dassault Systèmes, which provides a complete SDK because CATIA is developed using special programming conventions that need special pre-processors, etc. The result of the compose activity is a complete prototypical (or typical) application, which can then be tested (using the build and test processes that are part of the released application package) and finally packed again (it is then the enterprise application package) before its transfer to each end-user desktop.
[Figure 4 diagram. Labels: released application package; deployable extensions package; SDK; EDS; assemble (unpack, compose, test, pack); enterprise model; enterprise application package; deployment policies; predispose; deployment model; advertise.]
Fig. 4 The Assemble and Predispose Activities at the Enterprise Level
The assemble activity specifies what will be deployed in the enterprise. The next activity, predispose, specifies where, when and how it will be deployed. It is probably the activity most characteristic of enterprise deployment (and the least studied so far as well), for it establishes a link between the (options of the) enterprise application and the organization of the enterprise (the enterprise model), plus a set of deployment policies that control the deployment process. Deployment policies would probably be described as a set of constraints (team A should get a new version of an application before team B, etc.). The result of the predispose activity is a deployment model, at least partly generated and then manually extended, which specifies for each user (computer) what version of the application he/she will get according to his/her position, role, and the team or project he/she belongs to (this information is provided by the enterprise model). The deployment model also specifies the overall control, scheduling and synchronization of all deployment activities in the enterprise, as well as the control of each specific activity. The deployment model is a process model that can be seen as a scenario to be interpreted by the deployment infrastructure in order to perform deployment activities automatically on end-users' desktops. It is our belief that, if feasible, the generation of the deployment model and its interpretation by a dedicated process engine could lead to substantial gains, both in the time required to prepare a deployment and in the reduction of errors.

Then an advertise activity takes place again. This time the advertisement is internal to the enterprise. We can therefore assume it involves fewer technical challenges than the producer-level advertisement seen earlier in Section 2.1, but it can be exactly the same in the case of extended enterprises (an enterprise along with its producers, sub-contractors and clients) or virtual enterprises (geographically or organizationally dispersed enterprises). The deployment model could also specify the advertise activity, i.e., describe when and to whom the advertisement should be made.

As we will see later on, existing technologies provide almost no support for these activities. This is especially true of the predispose activity, which we nonetheless consider the core of large-scale enterprise software deployment.
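As an illustration of how ordering constraints of the kind "team A should get a new version before team B" could be turned into a schedule, the following minimal Python sketch derives successive deployment waves with a topological sort. The team names, the `precedence` list and the `deployment_waves` function are hypothetical; the paper does not prescribe any particular policy format or algorithm.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical policy: each pair (before, after) means that team `before`
# must receive the new version before team `after`.
precedence = [("A", "B"), ("B", "C"), ("A", "D")]


def deployment_waves(constraints):
    """Group teams into successive waves that respect the constraints.

    Teams in the same wave have no ordering constraint between them and
    could therefore be deployed in parallel.
    """
    sorter = TopologicalSorter()
    for before, after in constraints:
        sorter.add(after, before)  # `after` depends on `before`
    sorter.prepare()  # raises graphlib.CycleError if policies contradict
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = deployment_waves(precedence)
for number, teams in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(teams)}")
```

Contradictory policies (A before B and B before A) are reported as an error instead of silently producing a schedule, which is one way of checking the consistency of the policies before any physical deployment starts.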
It is therefore at the enterprise level that forthcoming work on deployment should focus.

Activities on the End User Site

The activities carried out at the enterprise level, introduced in the previous section, finally produce a deployable application package, which specifies what will be deployed, and a deployment model, which specifies when, how and to whom it will be deployed. Everything is then ready to actually deploy the application on end users' computers. Again, this part of the deployment, which we call physical deployment, is in fact a set of interrelated activities.

The activities depicted in Fig. 5 are build-time activities. The first of them is the transfer from the enterprise site to each user's site. The second activity is dispose. This is a preparatory activity much like the predispose activity at the enterprise level. It takes as inputs the deployable application package and the deployment model set up at the enterprise level, and a site model that represents the hardware and software configuration of the user's desktop computer. It has two sub-activities: select and unpack. The goal of the select activity is to select the application's options with respect to the site model. For instance, a given option of the application might require at least 64 megabytes of main memory, while another might work with 32 MB. If the target computer has only 32 MB, then the second option is the one actually deployed on this computer.
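The memory example above can be sketched in a few lines of Python. The `Option` and `SiteModel` classes and the rule "prefer the most demanding option that still fits" are assumptions made for illustration only; an actual site model would carry many more parameters.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Option:
    """Hypothetical application option with a minimum memory requirement."""
    name: str
    min_memory_mb: int


@dataclass(frozen=True)
class SiteModel:
    """Hypothetical, heavily simplified site model of one user computer."""
    host: str
    memory_mb: int


def select_option(options: Iterable[Option], site: SiteModel) -> Optional[Option]:
    """Return the most demanding option the site can run, or None.

    Assumption: an option that needs more memory is the preferred one
    whenever the target computer can accommodate it.
    """
    feasible = [o for o in options if o.min_memory_mb <= site.memory_mb]
    return max(feasible, key=lambda o: o.min_memory_mb, default=None)


options = [Option("full", 64), Option("light", 32)]
chosen = select_option(options, SiteModel("ws-001", 32))
print(chosen.name if chosen else "no deployable option")
```

Returning `None` when no option fits leaves it to the caller to decide whether the failure triggers a recovery, a compensation, or a manual intervention.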
[Figure 5 diagram. Labels: Enterprise application package; site model; deployment model; transfer; dispose (unpack, select); UDS; deployable application; install; update; adapt; uninstall; configure; build; executable application.]
Fig. 5 Build-time Activities on User Sites
[Figure 6 diagram. Labels: executable application; activate; test; deactivate; reconfigure; UDS; executing application.]
Fig. 6 Runtime Activities on User Sites
The deployable application can then be installed, maintained or uninstalled. We make use of the update and adapt concepts introduced in the work on the Software Dock, with semantics slightly changed to take into account the enterprise level, which does not explicitly exist in the Software Dock. Update is thus a modification of the deployed application in response to a modification on the Producer Site or the Enterprise Site. Adaptation is a modification of the deployed application in response to a modification on the End User Site, i.e., a modification of the resources available on this computer. The update and adapt activities are special cases of the install activity. All three have the same two sub-activities: configure and build.

Four runtime activities are depicted in Fig. 6. Activation comprises all the operations, such as launching executables or daemons, that are needed to get the application up and running. Deactivation is the opposite, i.e., the shutdown of the application. The test activity is introduced here as a runtime activity, but it can of course also be part of the build-time activities, since static tests can be done without the whole application running. Finally, the reconfigure activity means dynamic reconfiguration as described in the configurable distributed systems literature, i.e., the addition, removal, update or relocation of components while the application is running.

2.2 Products and Resources
Products are the data that flow between activities, and resources are the hardware and software resources required by the deployment process. This information is captured by four models introduced hereafter: the application, enterprise, deployment and user site models.

Application Models

An application model is an abstraction of an application. Besides the application artifacts themselves (files, scripts, documentation, etc. that compose the application), the application model should provide a description of the architecture of the application in terms of components and connectors. It should also describe:
• the options of the application,
• the compatibilities between versions of components,
• constraints (hardware and software),
• external dependencies (dependencies on components and applications that are not provided by the same producer),
• other miscellaneous information such as contact information, dates of release, etc.
The formalism used to express application models could also provide an algebra to add, remove or replace components and connectors. This is important since deployment is concerned not only with installation but also with the maintenance of an installed application.

Enterprise Models

An enterprise model is an abstraction of an enterprise that should describe its organization in terms of teams and sub-teams, agents (humans), positions (manager, engineer) and roles (programmer, tester, etc.). This is the kind of information provided by existing organization models such as Actor Dependency (AD) [4] and Organization and Process Together (OPT) [5]. An organization model might also describe projects. The concept of project is becoming central to more and more enterprises. Projects are generally transversal to the 'static organization' (teams, positions, roles). An enterprise model might also describe the processes and data flows used in the organization, which might be useful for deployment. For instance, if the data flow indicates that a document produced by a team A must be transferred to a team B within 3 weeks, then the deployment activities in team B can be carried out within 3 weeks after those of team A. Something else that might be very useful for automated deployment is a link between the organization and the agents' (users') computers. This kind of information is absent from most existing organization models.

Deployment Policies and Models

Deployment policies represent decisions taken at the enterprise level concerning what should be deployed to whom, and when and how this should be done. Defining deployment policies requires the application and enterprise models. A deployment model is a scenario in some internal format that can be interpreted by a deployment infrastructure to perform the physical deployment on user desktops.

Selection

On the one hand, as we have seen before, an enterprise model establishes a link between the organization of the enterprise and the users' sites. On the other hand, a deployment model should establish a link between the organization of the enterprise and the application, i.e., select a given configuration and associate it with a team, a position or a role. By combining the two, a deployment model would then establish a link between a configuration of the application and each user site. The deployment infrastructure would then know which configuration has to be deployed on each site.
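The combination described under Selection amounts to a join between two mappings. A minimal Python sketch follows, assuming (hypothetically) that the enterprise model lists each agent with a team and a site, and that the deployment policy selects one configuration per team; all names and values are invented for the example.

```python
# Hypothetical enterprise model: which agent belongs to which team and
# works on which site (computer).
enterprise_model = [
    {"agent": "alice", "team": "design", "site": "ws-001"},
    {"agent": "bob", "team": "design", "site": "ws-002"},
    {"agent": "carol", "team": "analysis", "site": "ws-003"},
]

# Hypothetical deployment policy: the configuration selected for each team.
team_configuration = {"design": "config-full", "analysis": "config-light"}


def configuration_per_site(enterprise_model, team_configuration):
    """Combine both models into a mapping from site to configuration."""
    plan = {}
    for entry in enterprise_model:
        team = entry["team"]
        if team not in team_configuration:
            # Fail loudly rather than leave a site without a configuration.
            raise KeyError(f"no configuration selected for team {team!r}")
        plan[entry["site"]] = team_configuration[team]
    return plan


plan = configuration_per_site(enterprise_model, team_configuration)
for site, configuration in sorted(plan.items()):
    print(site, "->", configuration)
```

The same join could be keyed on positions or roles instead of teams; the sketch uses teams only to keep the example short.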
Scheduling and Synchronization

A deployment model could specify the type of scheduling used in the enterprise. Scheduling policies can be, for instance:
• Big-Bang: all sites of the enterprise must switch over at the same time;
• Incremental: a schedule is established based on teams, positions, roles or sites. In this case, several synchronization policies between the activities on each site can be established: as soon as possible, by deadlines, etc.
• Continuous: each site is autonomous and performs the deployment activities when it wishes to.

Process Control

A deployment model could specify different kinds of policies to control the overall deployment process:
• Whether the process should be automatic, manual or semi-automatic. In the automatic mode, the deployment process is completely silent and performs the different activities automatically. In the manual mode, it asks the user for confirmation before performing any activity. In the semi-automatic mode, the user can specify which activities can be performed automatically and which must be performed manually.
• Whether the process should use a recovery mechanism (if an activity fails, the system returns to its previous consistent state) or a compensation mechanism (if an activity fails, the system performs an alternative action) in case of a crash during a deployment activity.
• Whether the process should be simple or recursive. In the simple mode, if an activity fails because some resources are missing, the infrastructure uses the recovery or compensation mechanism as explained above. In the recursive mode, the infrastructure tries to obtain the missing resources by itself.
• Whether or not a log should be kept to trace the work of the process.
• Etc.

Activity Control

A deployment model could also specify finer-grained policies to control each individual activity. For instance, activities can be enabled or disabled (update and adapt, for instance, if the enterprise does not want to deploy each revision of the application). Another example would be to describe specific options: for instance, whether the advertise activity should use e-mail, newsgroups or both; whether the transfer activity should use push or pull technology; etc.

Site Models

Site models are abstractions of the end users' desktop computers on which applications will be physically deployed. They can be used by a deployment infrastructure to adapt the deployment model defined at the enterprise level to each individual desktop. Site models should exhibit:
• Hardware and operating system information such as the kind of processor, processor frequency, available memory and disk space, operating system, version, etc.
• Software information such as the general tools available (editors, compilers, etc.) and the components and applications already deployed. The latter is of primary importance because some components may be shared by distinct applications; their update or removal should thus be handled very cautiously.
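The caution required for shared components can be illustrated with a small Python sketch. It assumes, hypothetically, that the software part of a site model records the set of components installed for each application; `removable_components` then returns only the components that no other application on the site still uses.

```python
def removable_components(installed, app):
    """Return the components of `app` that no other application uses.

    `installed` is a hypothetical excerpt of a site model: a dict mapping
    each deployed application name to the set of its component names.
    """
    still_needed = set().union(
        *(components for name, components in installed.items() if name != app)
    )
    return installed[app] - still_needed


# Invented example: two applications sharing the "kernel" component.
installed = {
    "cad": {"kernel", "viewer"},
    "simulation": {"kernel", "solver"},
}
print(sorted(removable_components(installed, "cad")))
```

An uninstall activity built on such a check would leave the shared component in place until the last application depending on it is removed.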
Application Management Systems such as HP's OpenView [19], Novell's Zen [20] or Amdahl ADS [23] define ad hoc site models and formalisms to register and query site information. Microsoft and Tivoli have together initiated a more general approach. They propose AMS (Application Management Specification), a format intended to be used by all kinds of application management systems on Windows NT platforms. A part of AMS is dedicated to site description (hardware configurations are described by more than 200 parameters!). It is foreseeable that site models will be widely available on every platform (Windows, Unix, etc.) in the near future. The question is whether these models will be adequate for deployment infrastructures, or whether the latter will have to offer their own ad hoc models and formalisms.
3 Related Technologies and Research
Application deployment is only now emerging as a research field. Very little research is dedicated to this specific problem, apart from the Software Dock project that we describe in Section 3.2. Still, several technologies from different fields of computer science cover parts of the enterprise deployment process, and can thus contribute to enterprise software deployment. Section 3.1 introduces these technologies.

3.1 Related Technologies
Many works, proposals and systems represent each of the technologies we introduce here, and we do not claim to be exhaustive. Rather, we have tried to introduce the works most representative of each area with respect to enterprise application deployment.

System Description Formalisms

Besides the application models provided by Software Configuration Managers, Package Managers, Installation Tools, Application Management Systems and Configurable Distributed Systems, which we shall see in the following sections, several stand-alone formalisms have been proposed to represent complex software systems.

OSD (Open Software Description) [1], a joint effort by Microsoft and Marimba, is a format for describing applications. OSD is based on XML (it has been submitted to the W3C). It represents an application as a graph with parent, child and sibling relationships between application components. The main elements (nodes) are software packages, implementations (which implement software packages), and dependencies, both between software packages and between software packages and implementations.

CIM (Common Information Model) [2] from the DMTF (Distributed Management Task Force) is based on XML as well. CIM is more complete than OSD. It comprises several models: a core model, an application model, etc. However, just like OSD, its major limitation is that it does not support variability, i.e., it cannot describe a complex application with several options, versions and variants.

DSD (Deployable Software Description) [3] is the formalism associated with the Software Dock (cf. Section 3.2). DSD is also based on XML. It supports variability, as it can describe application families. It can also specify properties, composition rules (relationships between properties), assertions, and activities (to be undertaken during deployment).
Organization Description Formalisms

Several organization description formalisms such as Actor Dependencies (AD) [4] and Organization and Process Together (OPT) [5] have been proposed. They build on the growing recognition that the organization is essential to the development process of complex applications. We believe it is equally important to the deployment process.

AD is a fairly simple and intuitive formalism, which describes an organization as a graph in which nodes are actors and arcs are dependencies. An actor may be an agent (human), a role or a position. An agent can occupy one or more positions and can play one or more roles. A position can cover one or more roles. Task, resource and goal dependencies between actors can be defined.
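To make the agent, position and role relations concrete, the following Python sketch encodes them as plain dictionaries and computes all the roles an agent holds, either directly or through the positions occupied. This is only an illustration with invented names; it is not the AD notation itself and it ignores the task, resource and goal dependencies.

```python
# Invented example data for the three relations described above.
occupies = {"alice": {"manager"}, "bob": {"engineer"}}             # agent -> positions
plays = {"bob": {"tester"}}                                        # agent -> roles
covers = {"manager": {"reviewer"}, "engineer": {"programmer"}}     # position -> roles


def roles_of(agent, occupies, plays, covers):
    """Return the roles an agent plays directly or via occupied positions."""
    roles = set(plays.get(agent, ()))
    for position in occupies.get(agent, ()):
        roles |= covers.get(position, set())
    return roles


print(sorted(roles_of("bob", occupies, plays, covers)))
```

A deployment policy expressed per role could use such a lookup to find every agent, and hence every site, concerned by a given configuration.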
OPT is a more general formalism that is actually made of three models:
• a process model, which describes activities, products and resources,
• an organization model, which is based on AD (it provides hierarchical descriptions, i.e., trees instead of the graphs of AD),
• a coupling model, which establishes relations between the first two models. There are two kinds of relations: responsibility relations relate the process and the organization, or an activity and an agent; communication relations relate the organization and the process.

As far as organization models alone are concerned, the two formalisms presented here are quite close to one another. As far as we can tell today, they are probably sufficient to deal with deployment. OPT is interesting, though, because it is very close to our approach, in which the application model and the enterprise model are used together to define the abstract deployment model, which is then used with site models to define the physical deployment model used on each computer.

Software Configuration Managers

Software Configuration Management (SCM) systems such as ClearCase, Continuus, Adele [6], or TrueChange [8] are now recognized as central pieces in software development. They provide support for developing and managing application artifacts that can exist in multiple versions, revisions and variants. With respect to application deployment, configuration management is intrinsically related to the activities performed on the Producer side. The functionalities of SCMs can, however, be extended to deal with activities performed on the Consumer side.
Configuration management systems are relevant to deployment because:
• they provide rich application models which can deal with application variability,
• they provide support for the selection of consistent application configurations, which is relevant to the select activity of the deployment process,
• they can provide support for the building of applications, which is relevant to the build activity that is part of the install activity,
• they can provide some support for the update, adapt and uninstall activities. This is for instance the case of TrueChange, which is based on the so-called change sets approach.
Package Managers
Package Managers such as Red Hat’s RPM [9] or HP-UX’s Mkpkg [10] command are used to create and distribute software packages. A software package is an archive which contains the application artifacts, constraints and dependencies, plus some general information associated with the application (producer’s name, address, size of the application, etc.), plus meta-information which describes the content and organization of the package. Packages are generally stored in repositories (databases) on the producer sites. Package Managers offer ways to add, remove, update and query software packages in the repositories. Some of them (RPM) can perform consistency checks on the application (conflicts between packages, unsatisfied dependencies). Once a package has been transferred to a consumer, the application can be configured, built and installed.
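The kind of consistency check mentioned above (conflicts between packages, unsatisfied dependencies) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm used by RPM or Mkpkg: the `Package` class, its `requires` and `conflicts` fields and the package names are hypothetical, and versions are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    name: str
    requires: frozenset = frozenset()   # names of packages this one depends on
    conflicts: frozenset = frozenset()  # names of packages it cannot coexist with


def check_consistency(packages):
    """Return (unsatisfied, conflicting) pairs for a proposed set of packages."""
    names = {p.name for p in packages}
    # A dependency is unsatisfied when the required package is not in the set.
    unsatisfied = sorted(
        (p.name, dep) for p in packages for dep in p.requires if dep not in names
    )
    # A conflict arises when a package declared as conflicting is in the set.
    conflicting = sorted(
        (p.name, other) for p in packages for other in p.conflicts if other in names
    )
    return unsatisfied, conflicting


# Hypothetical repository content.
repo = [
    Package("editor", requires=frozenset({"gui-lib"})),
    Package("gui-lib", conflicts=frozenset({"old-gui-lib"})),
    Package("old-gui-lib"),
    Package("viewer", requires=frozenset({"codec"})),
]
unsatisfied, conflicting = check_consistency(repo)
```

Here `viewer` has an unsatisfied dependency on `codec`, and `gui-lib` conflicts with `old-gui-lib`; a real package manager would also have to take versions and transitive dependencies into account.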
Package Managers are relevant to deployment primarily with respect to the pack activity. They also offer application models and some support for the configure, build and retire activities, which participate in the install, update and adapt activities. Yet they provide little or no support for activities at the enterprise level or for runtime activities on user sites (activate, deactivate, reconfigure).
Content Delivery and Push Technology
Content Delivery Systems such as Pointcast [11] or Marimba’s Castanet [12] are used to transfer any kind of artifact (news, files, etc.) from one site to another on a network. This area is moving very fast, and systems referred to as publish/subscribe or push technology fall into this category. With respect to deployment, these systems deal mainly with the transfer activity. Note that they are not all equivalent on that point. They all claim to be based on push technology, but specialists speak of smart pull technology: communications are established by subscribers, which poll publisher sites, which in turn reply by sending back messages. There is one message per subscriber, so this approach cannot really scale to thousands or millions of subscribers. This is the case for Pointcast and Marimba, for instance. True push technology is based on asynchronous multicast (IP multicast for instance): only one message is actually sent by a publisher to multiple subscribers. Representatives of true push technology are Softwired’s iBus [13] and TIBCO’s TIB/Rendez-vous [14].
Installation Tools
Installation tools such as InstallFromTheWeb [15], NetDeploy [16], InstallManager [17] and AutoInstall [18] provide ways to pack, transfer and install applications through the network. Some of them offer additional functionalities. InstallFromTheWeb checks files before and after transfer and offers a recovery mechanism (if an installation crashes before its end).
NetDeploy provides automatic update (by a polling mechanism) of installed applications. InstallManager offers recovery and rollback mechanisms as well as mechanisms for detecting and resolving conflicts (dependencies between applications). It can also simulate installations. Installation tools are relevant to deployment with respect to the pack, transfer and install activities and also, to a lesser extent, to the update and uninstall activities. AutoInstall also offers application and site models and, more originally, supports different deployment policies. It is possible to specify the scheduling of deployment between sites on a Local Area Network (LAN). This scheduling can be ordered by users, computers, teams, days, hours, operating systems, etc. It is also possible to specify whether updates must be pushed (by producers) or pulled (by consumers).
Application Management Systems
The purpose of Application Management Systems (AMSs) such as HP OpenView’s Software Distributor [19], Novell’s ZENworks [20], Microsoft’s Systems Management Server (SMS) 2.0 [21], Tivoli Enterprise’s Software Distribution [22] and Amdahl’s EDM ADS [23] is to manage applications deployed on corporate LANs. They are based on a highly centralized client/server architecture. Inside a LAN, one server is responsible for application releases and deployments on client computers.
Application packages are stored on the server in centralized repositories. The deployment process is completely directed by the server. Application Management Systems are very relevant to enterprise deployment as they provide application and site models, and support for almost all the deployment activities on end-user sites. The activities at the enterprise level are of course missing, since they depend upon organization models, which do not exist in these approaches. Also, the producer and enterprise levels are mixed together. Site models can be very rich. As we have said before, Microsoft’s site model offers more than 200 parameters. It is supposed to reside on each Windows station and to be used by any kind of application. Tivoli’s approach is quite different. It offers a repository in which it is possible to register and query hardware and software configurations. The deployment is often transactional, with a recovery mechanism. SMS uses a push mechanism and thus does not support the adapt activity. Tivoli Enterprise supports adapt by using a publish/subscribe mechanism. EDM ADS is based on desired state technology and object differencing technology: the system polls the inventory (which is an abstract repository containing the application models, deployment models and site models) and reacts to changes in the inventory by performing maintenance activities. In conclusion, AMSs offer a great deal with respect to enterprise deployment. Still, they suffer from serious drawbacks. First, they only cover two levels, enterprise and user, with the enterprise actually acting as a producer.
Second, as a consequence, they offer poor or no support for deployment policies and models.
Configurable Distributed Systems
Configurable distributed systems provide ways of defining applications in which it is possible to dynamically add, remove, update or move (from one computer to another) components and connectors of the application while it is running. Splitting the application into components is essential because some parts (components) of the application can be deactivated for maintenance operations while the rest of the application is still running. ArchShell [24] is a representative of the most general approach in configurable distributed systems. An Architecture Description Language (ADL) is used to define the application architecture. This architecture is then compiled into an executable system. Arunja [25] is an original approach in which a Transactional Workflow System is used to dynamically reconfigure applications. Configurable distributed systems are relevant to application deployment because they offer application models and support for the reconfigure activity, but also for the pack, transfer, update and adapt activities. To do so, ArchShell proposes the Architecture Construction Notation (ACN), which is an API used to express modifications of the application architecture such as the addition, removal or update of components, the reconfiguration of the application architecture (connections between components and connectors) or of the system architecture (mapping of components to processors). In Arunja, applications are seen as workflow schemas and components as activities in a workflow. It is then possible to add, remove and update components by modifying the workflow.
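To make the kinds of modification listed above more concrete, the following Python sketch models a toy architecture with the four operations named in the text: addition, removal and update of components, reconnection of components to connectors, and remapping of components to processors. It is not ArchShell's ACN nor Arunja's workflow model; the `Architecture` class, its methods and all names in the example are hypothetical.

```python
class Architecture:
    """Toy runtime architecture: components, connectors and a placement map."""

    def __init__(self):
        self.components = {}  # name -> {"version": str, "active": bool}
        self.links = set()    # (component, connector) pairs
        self.placement = {}   # component -> processor
        self.log = []         # record of activation changes

    def add_component(self, name, version, processor):
        self.components[name] = {"version": version, "active": True}
        self.placement[name] = processor

    def remove_component(self, name):
        del self.components[name]
        self.placement.pop(name, None)
        # Drop every connection that involved the removed component.
        self.links = {(c, k) for (c, k) in self.links if c != name}

    def connect(self, component, connector):
        if component not in self.components:
            raise KeyError(component)
        self.links.add((component, connector))

    def move(self, name, processor):
        """Change the component-to-processor mapping (system architecture)."""
        if name not in self.components:
            raise KeyError(name)
        self.placement[name] = processor

    def update_component(self, name, new_version):
        """Deactivate only the component being maintained, then reactivate it."""
        comp = self.components[name]
        comp["active"] = False
        self.log.append(("deactivate", name))
        comp["version"] = new_version
        comp["active"] = True
        self.log.append(("activate", name))


# Hypothetical application with two components.
arch = Architecture()
arch.add_component("solver", "1.0", processor="host-a")
arch.add_component("viewer", "1.0", processor="host-b")
arch.connect("solver", "event-bus")
arch.update_component("solver", "1.1")
arch.move("viewer", "host-a")
arch.remove_component("viewer")
```

The point of the sketch is that each operation touches a single component, so the other components stay active while one of them is being maintained.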
Component Models
SUN’s Java Beans and Enterprise Java Beans, CORBA components or Microsoft’s COM+ are component models in which some characteristics of the components are provided by the containers in which components are executed. These characteristics can be manipulated declaratively at deployment time. Links with enterprise application deployment concern the pack, configure, build and install activities. The problem is that these activities are performed on each object in isolation. The concept of application does not exist (except possibly a weak one in CORBA 3).
Synthesis Table
Tab. 1 shows how existing technologies cover the models and activities of the deployment process. In the table we use normal and bold characters. Normal characters indicate weak coverage while bold characters indicate strong coverage. For instance, nearly all technologies offer application models of some sort, but the more accurate models come mainly from System Description Formalisms and Software Configuration Managers. Also, the table cannot be rigorous. A given system may or may not support the activities indicated. We try here to give a view by domain and not by product.
Domain | Models | Activities
System Description Formalisms (SDF) [OSD/CDF, CIM (MIF/AMS), DSD] | Application | -
Organization Description Formalisms (ODF) [AD, OPT] | Enterprise | -
Software Configuration Systems (SCM) [ClearCase, Continuus, Adele, PCL, ADC] | Application | select, configure, build, retire
Package Managers (PM) [RPM, Mkpkg] | Application | pack, unpack, configure, build, retire
Content Delivery and Push Technology (CD) [Castanet, Pointcast, iBus, TIB/Rendez-vous] | - | advertise, transfer, install, update
Installation Tools (IT) [InstallFromTheWeb, NetDeploy, InstallManager, AutoInstall] | Application, Site, Deployment | pack, transfer, install, update, uninstall
Application Management Systems (AMS) [ZEN, SMS, OpenView, Tivoli] | Application, Site, Deployment | transfer, configure, build, install, update, adapt, uninstall, activate, deactivate
Configurable Distributed Systems (CDS) [ArchShell, Arunja] | Application | update, adapt, reconfigure
Tab. 1 Deployment Models and Activities Covered by Existing Technologies. Bold characters indicate a potential ‘strong contribution’ for the field concerned.
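One way to exploit Tab. 1 is to encode its Activities column as a mapping and query it. The Python sketch below does so under two assumptions: the assignment of activities to domains follows our reading of the table, and the distinction between bold and normal characters is not encoded. The function names are hypothetical; `predispose` and `test` are activities named elsewhere in this article.

```python
# Activities per domain, transcribed from Tab. 1 (emphasis not encoded).
COVERAGE = {
    "SDF": set(),
    "ODF": set(),
    "SCM": {"select", "configure", "build", "retire"},
    "PM": {"pack", "unpack", "configure", "build", "retire"},
    "CD": {"advertise", "transfer", "install", "update"},
    "IT": {"pack", "transfer", "install", "update", "uninstall"},
    "AMS": {"transfer", "configure", "build", "install", "update", "adapt",
            "uninstall", "activate", "deactivate"},
    "CDS": {"update", "adapt", "reconfigure"},
}


def domains_covering(activity):
    """Domains whose row in the table lists the given activity."""
    return sorted(d for d, acts in COVERAGE.items() if activity in acts)


def uncovered(activities):
    """Activities that no domain in the table lists."""
    covered = set().union(*COVERAGE.values())
    return sorted(set(activities) - covered)


update_domains = domains_covering("update")
missing = uncovered(["predispose", "test", "install"])
```

Under this encoding, `update` is listed by four domains, whereas `predispose` and `test` are listed by none.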
Executive Summary
Most producer deployment activities are covered by Software Configuration Systems and Package Managers. Installation Tools also provide packing facilities.
Existing technologies provide almost no support for enterprise deployment activities. This is especially true for the predispose activity, which is nonetheless, in our view, the core of large-scale enterprise software deployment. As we said before, it is therefore at this level that forthcoming work on deployment should focus.
User Deployment
User (or physical) deployment is the part of the overall deployment process that is best covered by existing technologies. Installation Tools provide most build-time activities, with the notable exception of the test activity. Configurable Distributed Systems focus on runtime activities. Application Management Systems cover almost all activities. They are very relevant to enterprise deployment but suffer from serious drawbacks. First, they only cover two levels, enterprise and user, with the enterprise actually acting as a producer. Second, as a consequence, they offer poor support for deployment policies and models.
3.2
Related Research
As we mentioned before, application (or software) deployment is only emerging as a research field. At the time of writing, there exists, to the best of our knowledge, only one research work: the Software Dock at the SERL, University of Colorado at Boulder. The Software Dock [26,27,28,29] is “a distributed, agent-based framework to support the entire software deployment life cycle” [30]. This work has four main contributions:
• it establishes software deployment as a research field by introducing the problems and basics of software deployment;
• it provides a definition of the deployment process, which is actually described as “a life cycle of interrelated processes” in the Software Dock literature;
• it offers a declarative language called the Deployable Software Description (DSD), which is very complete and suitable for application deployment since it supports variability, which is an original feature (cf. Section 3.2);
• finally, it proposes a viable architecture that has been implemented as a prototype. The architecture is based on agents which perform deployment activities either in release docks at producer sites, or in field docks at consumer sites. It also uses a federated deployment registry, which is an aggregation of the release and field registries, and a wide-area messaging/event system by which agents communicate.
The Software Dock approach, compared to the approach we introduce in this article, suffers from one drawback that can be summarized in five words: it lacks the enterprise level. As a matter of fact, this is not exactly a drawback. It is more a difference of objectives. The Software Dock is a general-purpose system to deploy relatively simple software from a producer site to a consumer site. The purpose of enterprise deployment, as we see it, is to deploy complex applications in large enterprises. These complex applications are made of components that may come from multiple providers or be developed by the consumers themselves. Deploying in large companies means that different teams will have different application configurations. Above all, it means that the enterprise would like to have as much control as possible over the deployment itself, that is, to specify deployment policies. The distinction we make between “deployment”, embodied by the Software Dock, and “enterprise deployment”, which we introduce in this article, has a considerable impact on the last three points above: the deployment process, the different models and formalisms, and the deployment infrastructures to be developed, which would need to have “industrial” qualities. We say a few words on that topic in the next section.
4
Conclusions and Ongoing Works
Until now Software Engineering has focused on application development. Yet the software industry is more and more concerned with application deployment. Application deployment covers all the activities that have to be carried out in order to make applications operational on user computers. It is a complex process made of activities that must be performed on different sites: producers, consumer enterprises and end-user computers. We aim at taking application deployment, as covered by existing technologies and emerging research, a step further in order to tackle enterprise application deployment, i.e., the deployment of complex applications in large companies. The characteristics of the enterprise deployment process are the following:
• It must be able to deal with complex applications, which are made of multiple components coming from multiple producers, including the consumer enterprise itself. These components are assembled at the enterprise level in order to construct the actual application, which can then be tested and validated before its actual physical deployment in the enterprise.
• The deployment process should make use of the organization of the enterprise in terms of teams, positions and roles.
• The deployment process should allow for the specification of deployment policies used for configuration selection and for scheduling and synchronization.
This article aimed at defining the foundations of enterprise application deployment. Thus it did not say anything about the development of environments capable of supporting enterprise application deployment. Our mid- to long-term purpose is of course to develop such an infrastructure. It will probably be based on the Software Dock (we might even use the Software Dock) for user deployment support, and on active repositories for storing the different models: application, organization, deployment, user site. The infrastructure will reflect the three layers of enterprise deployment.
It will be made of three main components: a Producer Deployment Server, an Enterprise Deployment Server and a User Deployment Server, which will communicate through the network. This architecture is represented in the figures in Section 2 by PDS, EDS and UDS respectively. Be that as it may, we believe the design of such an enterprise deployment infrastructure requires a special design effort, for it must offer industrial qualities such as reliability and robustness (the infrastructure should preserve the integrity of deployed applications, by offering recovery and/or compensation mechanisms for instance), security, tractability and scalability, which are clearly not addressed by existing technologies that claim to support application deployment. System and database services should be of great value in this context.
References
1. A. van Hoff, H. Partovi, and T. Thai. “The Open Software Description Format (OSD)”, Microsoft Corp. and Marimba Inc., August 1997, (http://www.w3.org/TR/NOTEOSD.html).
2. Distributed Management Task Force. “Common Information Model (CIM) Specification. Version 2.0”, March 1998, (http://www.dmtf.org/spec/cim_spec_v20/).
3. R.S. Hall, D. Heimbigner, and A.L. Wolf. “Specifying the Deployable Software Description Format in XML”, SERL Technical Report CU-SERL-207-99, Software Engineering Research Laboratory, Department of Computer Science, University of Colorado, March 1999.
4. E. S. K. Yu and J. Mylopoulos. “Modelling Organizational Issues for Enterprise Integration”, Proc. Int. Conf. on Enterprise Integration and Modelling Technology, Turin, Italy, October 1997.
5. C. B. Seaman and V. R. Basili. “OPT: Organization and Process Together”, Proceedings of CASCON'93, IBM Centre for Advanced Studies, Toronto, October 1993.
6. J. Estublier and R. Casallas. “The Adele Configuration Manager”, Configuration Management, Wiley, 1994, pp. 99-134.
7. E. Tryggeseth, B. Gulla, and R. Conradi. “Modelling Systems with Variability using the PROTEUS Configuration Language”, Proc. of the 1995 Int. Symp. on System Configuration Management, Springer, 1995, pp. 216-240.
8. TrueSoft. “TrueChange”, White Paper, (http://www.truesoft.com).
9. M. Ewing and E. Troan. “The RPM Packaging System”, Proc. of the First Conference on Freely Redistributable Software, Cambridge, MA, USA, February 1996.
10. C. Staelin. “Mkpkg: A Software Packaging Tool”, HP-UX Tech. Report, January 14, 1997.
11. Pointcast, Inc. “Pointcast”, 1998, (http://www.poincast.com).
12. Marimba, Inc. “Introducing Castanet”, Technical White Paper, 1999, (http://www.marimba.com/products/castanet.htm).
13. Softwired, Inc. “Developing Publish/Subscribe Applications with iBus”, Technical White Paper, 1999, (http://www.softwired-inc.com/ibus).
14. TIBCO Software Inc. “TIB/Rendez-vous”, White Paper, 1999, (http://www.rv.tibco.com/whitepaper.html).
15. InstallShield. “InstallFromTheWeb Version 2.0”, White Paper, 1999, (http://www.installshield.com).
16. Open Software Associates. “NetDeploy 4 Technical Specification”, White Paper, 1999, (http://www.netdeploy.com).
17. Wise Solutions. “InstallManager”, White Paper, 1999, (http://wisesolutions.com).
18. 20/20 Software. “AutoInstall”, White Paper, 1999, (http://www.twenty.com).
19. Hewlett-Packard. “HP OpenView Software Distributor Quick Reference”, Technical White Paper, February 1996.
20. Novell, Inc. “ZENworks”, 1999, (http://www.novell.com/products/nds/zenworks).
21. Microsoft Corp. “Systems Management Server 2.0, Reviewer’s Guide”, White Paper, 1998.
22. Tivoli. “Tivoli Software Distribution”, White Paper, 1999, (http://www.tivoli.com/o_products/html/swdist_ds.html).
23. Amdahl, Inc. “Desired State Software Management”, White Paper, 1999, (http://www.amdahl.com/aplus/deploy/edmlit.htm).
24. P. Oreizy, M. M. Gorlik, R. N. Taylor, D. Heimbigner, G. Johnson, N. Medvidovic, A. Quilici, D. S. Rosenblum, and A. L. Wolf. “Self-Adaptive Software”, Technical Report UCI-ICS-98-27, Department of Information and Computer Science, University of California, Irvine, August 1998.
25. S. K. Shivastava and S. M. Wheater. “Architectural Support for Dynamic Reconfiguration of Large Scale Distributed Applications”, Proc. of the 4th Int. Conf. on Configurable Distributed Systems, IEEE Computer Society, May 1998, pp. 10-17.
26. R.S. Hall, D. Heimbigner, A. van der Hoek, and A.L. Wolf. “The Software Dock: A Distributed, Agent-based Software Deployment System”, Technical Report CU-CS-83297, Department of Computer Science, University of Colorado, 1997.
27. R.S. Hall, D. Heimbigner, A. van der Hoek, and A.L. Wolf. “An Architecture for Post-Development Configuration Management in a Wide-Area Network”, Proc. of the 17th Int. Conf. on Distributed Computing Systems, Baltimore, USA, May 1997.
28. A. van der Hoek, R.S. Hall, A. Carzaniga, D. Heimbigner, and A.L. Wolf. “Software Deployment: Extending Configuration Management Support into the Field”, Crosstalk, The Journal of Defense Software Engineering, volume 11, number 2, February 1998.
29. R.S. Hall, D. Heimbigner, and A.L. Wolf. “A Cooperative Approach to Support Software Deployment Using the Software Dock”, Technical Report CU-CS-871-98, Department of Computer Science, University of Colorado, October 1998.
30. A. Carzaniga, A. Fuggetta, R. S. Hall, A. van der Hoek, D. Heimbigner, and A. L. Wolf. “A Characterization Framework for Software Deployment Technologies”, Technical Report CU-CS-857-98, Department of Computer Science, University of Colorado, April 1998.