Building Awareness in Global Software Engineering: Using Issues as Context
Rafael Kobylinski, Oliver Creighton, Allen H. Dutoit, Bernd Bruegge
Institut für Informatik, Technische Universität München, Munich, Germany
{kobylins,creighto,dutoit,bruegge}@cs.tum.edu
ABSTRACT
In this paper, we propose an awareness system that enables participants to monitor the activities of others over a wide range of artifacts (e.g., system artifacts, organizational charts, or rationale models). Participants can subscribe to be notified when specific system artifacts are modified, when specific participants trigger an activity, or when participants trigger activities related to specific issues. Relationships among the system, organizational, and rationale models are then used to provide observers a context to interpret the activities of others. By providing context in terms of issues (as opposed to only system or communication artifacts), we hope to disseminate richer and more targeted awareness information, hence creating more opportunities for informal information exchanges and for distributed collaboration.
1. INTRODUCTION
In researching distributed software engineering, we have taken the approach of “learning-by-doing.” We have taught several distributed project courses in which teams located in Pittsburgh, PA and in Munich, Germany collaborated on developing a system for a single industrial client [2, 9]. We have observed that the most critical issues related to distribution are actually not technical, but social:
• Participants not knowing each other resulted in low awareness of each site and little daily collaboration.
• Lack of daily collaboration made it more difficult to accurately communicate team and project status, resulting in many crises being detected late.
• Lack of status information and informal contacts between sites made it difficult to manage crises once they were identified.
Researchers have already noted the importance of communication in software development [4, 12, 15]. In a field study, Curtis et al. [4] observed that documentation does not reduce the need for communication, in particular during the early phases of the project, when stakeholders coordinate their conventions and create informal communication networks. They also observed that obstacles in informal communication (e.g., organizational barriers) can lead to misunderstandings in conventions and rationale. Kraut and Streeter [15] note that formal communication (e.g., formal specifications, inspections) is useful for routine coordination while informal communication (e.g., hallway conversations, telephone) is needed in the face of uncertainty. Grinter et al. [12] studied distributed projects using different organizational models and confirmed these findings.
Several approaches have been proposed to support distributed communication. In the software process community, distributed process support systems have been proposed [14]. Detailed models of the software development process and of the organization are developed and enacted with the help of a distributed tool. Participants are automatically notified of the completion of other participants' tasks and are told what to do next. Assuming a realistic process model, the strength of process enactment environments is that responsible participants are notified in a timely manner, regardless of their location and independently of their counterparts. However, current challenges posed by such environments include the difficult adaptation of the workflow during exceptions, the interleaving of process modeling and process enactment, and user acceptance [11].
In computer supported collaborative work, groupware and videoconferencing have been proposed and evaluated with mixed results [13], especially in informal communication, as they require that participants are aware of each other's activities. This, in turn, triggered substantial research in improving group awareness [7].
Awareness systems are usually integrated with a groupware system and give project participants some indication of who is currently using their computers, along with hints about their communication activity. Knowledge of what others are doing enables participants to find their counterparts and to initiate conversations about impending crises.
We are researching comprehensive solutions for supporting collaboration in distributed software engineering projects, including supporting informal meetings [1], capturing and maintaining rationale [10], and the travel of small groups of
developers [9]. In this paper, we focus on group awareness. We propose to adapt and generalize the results from the awareness and design rationale communities to global software engineering. We see three critical issues when defining an awareness system:
1. Monitoring activities. Information about activities should be automatically constructed from events generated by communication and development tools. To make the awareness system usable, there should be few constraints on the tools monitored, so that participants can continue using familiar tools. For example, events monitored include checking new versions into a configuration management system or posting a message in a discussion tool.
2. Providing context for activities. Raw events generated by communication and development tools are too low level for a remote observer. We propose that decision making information (i.e., rationale [16]), including issues, alternatives, quality criteria, and decisions, be captured and related to other system artifacts. In addition to making issues under discussion visible to other sites, issues can also be used to enrich the context of awareness events.
3. Filtering awareness information. In a distributed project, the number of participants, activities, artifacts, and issues that can be monitored is too large for any single participant. Moreover, participants are not interested in seeing all events that occur in the project. Instead, participants should be able to express their interests by rating the importance of specific artifacts, specific project participants, or specific issues.
An awareness system enables participants to receive a stream of events that inform them about the activities of others, in real time. Although our primary objective is to create opportunities for participants to initiate informal communications, we see other longer term uses for this stream of events.
For example, a variety of process and system metrics can be derived from awareness events over time, hence informing a project manager of the direction of the project. Since issues are used as a main context element, these metrics could also be based on the issue models, hence providing a rationale-centric view of the software life cycle, as opposed to an entity-centric or a process-centric view.
In the rest of the paper, we describe in more detail our awareness system, our plans for its experimental evaluation, and future directions.
2. META-MODEL
Participants use many different sources of information to stay informed about the current status. For building an awareness system, we need to represent these various sources of information. We identify three types of models:
• System models represent the system under development from a variety of perspectives. A requirements specification, an architecture, a detailed design, source code, and test cases are system models. System models are usually put under configuration management, as any changes need to be tracked and reversible.
• Organizational models represent managerial information related to the system under development, including process, participants, teams, roles, and resources.
• Rationale models represent the reasoning behind decisions. While system and organizational models represent the end point of the decisions and their consequences, a rationale model represents the individual decision making elements that lead to a decision, including problems addressed, alternatives considered, quality criteria used for assessments, arguments, and justifications. This information is usually not represented explicitly and is scattered across a variety of communication media. Rationale-based approaches often represent this information as a semi-formal graph of nodes, called an issue model.
An issue model provides a structured overview of the various decisions that are made in the course of a project: every decision to be made in a project, no matter by whom, corresponds to one or more issues. Issues can be classified by several inherent properties, such as
• project phase: Requirements Elicitation, Analysis, System Design, Object Design, Implementation, or Maintenance
• category: a spectrum ranging from technical to managerial
• dependency: a collection of references to other issues that need to be resolved first
In our environment, we store issue models in a repository shared across all tools manipulating issues.
[Figure: meta-model. Rationale Element (Issue, Proposal, Criterion, Argument, Decision), Organization Element (Task, Team, Person, Resource), System Element (Use Case, Subsystem, Domain Class, Solution Class).]
In our meta-model we introduce the linkage across models. The semantics of these links depend on which elements are linked. These relationships among system, organizational, and rationale elements are used as context for awareness events. Usually, these links are not explicitly represented, or are represented only in an ad hoc manner. For example, the login name of a user can be used to link the authorship of system models to corresponding address book entries in an organizational model, when login names are generated from the first and last names of a user. We postulate that the value of these cross-model links is high for providing context information in an awareness system and, in general, for ensuring consistency among models manipulated by different tools.
Our approach so far has been to make it easier to create these types of links. For example, REQuest [10], a tool for writing specifications in terms of use cases, enables users to create and attach issues to specific use cases. The issues and the links are stored in the issue repository; users, however, view the issue models in the context of the specification. Following the same principle, we are developing a meeting manager tool as an extension of our communication infrastructure, which enables users to create meeting agendas from a set of issues, create a set of action items as a result of a meeting, and assign action items to participants. While users view issues in the context of meeting agendas and minutes, the meeting manager enables the creation of linkages between the organizational (action items, participants) and rationale (issues, decisions) models.
The elements of the meta-model and their linkages can then be traversed by the awareness system for monitoring, subscribing, and providing context. We describe the awareness system in more detail in the next section.
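The traversal of cross-model links described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the `Element` and `LinkStore` classes, the `max_hops` parameter, and all example names (such as `ReportEmergency` or `alice`) are hypothetical and do not describe the actual issue repository or REQuest.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A meta-model node; `model` is 'system', 'organization', or 'rationale'."""
    model: str
    kind: str   # e.g. 'UseCase', 'Issue', 'Task', 'Person'
    name: str


class LinkStore:
    """Hypothetical stand-in for a shared repository of cross-model links."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, a, b):
        # Links are stored symmetrically so they can be followed from either end.
        self._links[a].add(b)
        self._links[b].add(a)

    def context(self, start, max_hops=2):
        """Return all elements reachable from `start` within `max_hops` links."""
        seen = {start}
        frontier = [start]
        for _ in range(max_hops):
            next_frontier = []
            for elem in frontier:
                for neighbor in self._links[elem]:
                    if neighbor not in seen:
                        seen.add(neighbor)
                        next_frontier.append(neighbor)
            frontier = next_frontier
        return seen - {start}


# Illustrative elements only; none of these names come from a real project.
use_case = Element("system", "UseCase", "ReportEmergency")
issue = Element("rationale", "Issue", "How should the dispatcher be notified?")
action = Element("organization", "Task", "Prototype notification dialog")
person = Element("organization", "Person", "alice")

store = LinkStore()
store.link(use_case, issue)   # an issue attached to a use case
store.link(issue, action)     # an action item created from the issue in a meeting
store.link(action, person)    # the action item assigned to a participant

nearby = store.context(use_case, max_hops=2)
print(sorted(e.kind for e in nearby))   # prints ['Issue', 'Task']
```

With two hops, an event on the use case picks up the attached issue and the resulting action item as context; raising `max_hops` to three would also reach the assigned participant. How far to traverse is a design choice that the sketch leaves open.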
3. BUILDING AWARENESS
Group awareness can be described as “an understanding of the activities of others, which provides a context for your own activity” [6]. This understanding allows one to answer basic questions like who is around, what is happening, and where it is happening. In an environment where collaborating individuals are colocated, this understanding is acquired easily through direct audio-visual perception and informal communication. However, in environments where the collaborators are distributed in time and space, most of the available awareness information is usually lost.
A software developer can be considered to be working in two distinct work environments in parallel: the physical work environment and the virtual work environment. The physical work environment consists of offices, furniture, computers and other tangible assets, while the virtual work environment usually consists of project artifacts stored in files or databases and tools used during the development to manipulate those artifacts. Early general purpose awareness systems [7, 17] focused on the physical environment and allowed mediated access to audio and video from a remote site. We chose to focus on the virtual work environment, believing that, in terms of context, the most interesting activities in software development
happen there, rather than in the physical environment.
To support awareness in the virtual work environment we need a system which gathers awareness information in the form of events from the tools used by the developers and distributes them as notifications to interested parties. These events must include at least the originator of an action, the tool used, the artifact manipulated, and the tool command used to manipulate the artifact. After receiving an event, the system has to assess how important this event might be for others and either filter the event out or deliver a weighted notification about the event to the recipients. Notifications for recipients who are not available need to be stored persistently for later delivery.
Our approach for gathering awareness information is to provide an open interface for existing tools rather than to create new tools. The awareness system has to augment the existing development infrastructure instead of replacing it. This approach imposes restrictions on the granularity of information we can get from the tools. For example, while it is quite easy to get notified by a CM system like CVS about a check-in operation on a file, getting a notification about a method modification in an IDE can require elaborate customization, which often is not possible at all due to the lack of APIs to the IDE.
To assess the importance of events, the awareness system needs to know about the interests of the individual developers. The developers should be able to explicitly state their interest level in events related to a person, to an artifact, to a tool, to a command, or to a combination of all these elements. Because an awareness system based solely on explicit interests would require substantial ongoing configuration by each individual developer, awareness systems based on flexible rules have been proposed in the past [3]. Given a set of rules, such systems can create interest representations implicitly by processing the events originated by the user.
One such rule could state that when a developer starts to manipulate an artifact, an interest representation for that artifact is created.
When developers state (explicitly or implicitly by means of rules) their interest level for certain event types, they usually are also interested in events which are closely related to those events. One example of related events is events about artifacts that are strongly coupled in some way, e.g., a file containing the interface of a class and a file containing another class that uses this interface. Another example is events about developers belonging to the same team. Artifact relations can usually be derived from the internal, often hierarchical, models used by the tools. We believe that using these models to derive distance metrics for tool-specific artifacts is necessary for the importance assessment in an awareness system. However, these models are not sufficient when the developers want to express interest in all events belonging to a particular process task, such as requirements analysis, or a specific abstract set of artifacts, such as those related to a subsystem. Those events might not have elements which are related directly to each other, e.g. a file containing a class and a posting on a Bulletin Board might
be coupled only because they are related to the same subsystem.
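The importance assessment discussed above can be sketched as follows. This is a minimal illustration, not the Awareness Builder design: the `Event` and `Interests` classes, the path-based `tree_distance` metric, the decay factor of 0.5, and the default implicit interest level of 0.8 are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Event:
    originator: str   # who triggered the action
    tool: str         # e.g. 'cvs' or 'bboard'
    artifact: str     # tool-specific identifier, here a path-like string
    command: str      # e.g. 'checkin' or 'post'


def tree_distance(a, b):
    """Number of edges between two artifacts in a path hierarchy."""
    parts_a, parts_b = PurePosixPath(a).parts, PurePosixPath(b).parts
    common = 0
    for x, y in zip(parts_a, parts_b):
        if x != y:
            break
        common += 1
    return (len(parts_a) - common) + (len(parts_b) - common)


class Interests:
    """Interest levels of one developer in [0, 1], keyed by (tool, artifact)."""

    def __init__(self, decay=0.5):
        self.levels = {}
        self.decay = decay   # assumed attenuation per edge of distance

    def state(self, tool, artifact, level):
        """Explicit interest stated by the developer."""
        self.levels[(tool, artifact)] = level

    def observe_own(self, event, level=0.8):
        """Implicit rule: manipulating an artifact creates an interest in it."""
        key = (event.tool, event.artifact)
        self.levels[key] = max(self.levels.get(key, 0.0), level)

    def importance(self, event):
        """Best matching interest level, attenuated by tool-level distance."""
        best = 0.0
        for (tool, artifact), level in self.levels.items():
            if tool != event.tool:
                continue   # tool models give no relation across different tools
            distance = tree_distance(artifact, event.artifact)
            best = max(best, level * self.decay ** distance)
        return best


mine = Interests()
mine.observe_own(Event("alice", "cvs", "src/dispatch/Notifier.java", "checkin"))

sibling = Event("bob", "cvs", "src/dispatch/NotifierImpl.java", "checkin")
posting = Event("carol", "bboard", "dispatch/thread-12", "post")

print(mine.importance(sibling))   # nonzero: a sibling file in the same directory
print(mine.importance(posting))   # 0.0: no tool-level relation to the posting
```

The sibling file receives an attenuated but nonzero importance, while the bulletin board posting receives none even though it may concern the same subsystem. A tool-level metric alone therefore cannot express the coupling described above.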
To satisfy this requirement, we propose the addition of a service to the issue repository introduced in Section 2 that allows events to be mapped to the elements of the meta-model. Using this meta-model, we can then express couplings between artifacts coming from different tools and thus unrelated on the tool level. Incoming events are first passed to this service, which assigns the event a set of abstract rationale, organization, and system elements before any further processing with rules and interests is done. Thus, rules and interests can contain references to the elements of the meta-model, and interests in a particular element can be easily expressed.
4. TOWARDS AN ISSUE-BASED LIFE-CYCLE
We think that a non-linear process for software engineering projects is needed; one that is not modelled after waterfalls or spirals, but rather in the style of concurrent distributed execution in modern operating systems. Additionally, we assume that the support infrastructure should be process agnostic, and should therefore be able to support any process model that project managers choose to adopt. We therefore propose to introduce a sub-model, or a layer underneath the usual process models, which enables us to evaluate, compare, and ultimately execute process models. Issues, as described in Section 2, are our atoms of larger structures, and represent all artifacts that need to be shared between project participants.
Managers of team-based projects face the challenge of keeping track of project progress and health. Even when the process model is clear and simple, the overall project status depends on many variables that are often buried deep inside the artifact and communication repositories; exceptional states that are outside the scope of the applied model can cause substantially skewed perspectives. Our goal is to instrument all parts of our collaboration infrastructure to allow for the capture of events needed for building group awareness as described in Section 3. But beyond the peer-to-peer benefits we expect from this, a consolidated view onto the project status could be generated by appropriate heuristics. For example, such status indicators could include communication metrics [8, 5], component metrics, quality metrics, scheduling metrics, or integration metrics.
Another challenge for successful projects is the reduced project duration time, which forces the infrastructure to be set up as mechanically as possible. Because this setup is highly complex, the risk of omission errors increases with the number of tools. We therefore intend to automate as much of this task as possible and additionally provide an initial set of open issues, possibly even per project phase, which will populate the issue base with a collection of issues that were identified as recurring in previous projects. Through this we intend to enable a big-bang project startup by not only providing an ad-hoc infrastructure, but also an issue-based representation of multi-phase, multi-project management check-lists and best-practices guidelines.
In a nutshell, the engineering methods that we intend to support with our infrastructure are:
• Issue-based Project Modelling: Applying the issue paradigm to every decision should create a uniform and routine method for problem solving.
• Participatory/Democratic Management: The issue-base is open to every participant. Decisions on every level are made transparent and can benefit from a larger decision-making population.
• Shared Project Knowledge: As the issue-base is captured electronically, decision rationale can be shared among participants over time. The project gains a long-term memory that is externalized from experts.
• Design Rationale Capture: Design decisions can be captured from designer communication and fed into the documentation.
• Total Quality Management: Generation of meeting agendas and minutes, and management of action items.
• Active Awareness: The infrastructure augments the participants’ project perception by raising “red flags” at issues that are important for the respective participant.
For these, a project requires an artifact repository as shared memory, a communication infrastructure, a project management component for prioritizing and scheduling of issues, and development teams that solve their allotted issues.
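As a rough illustration of what scheduling issues could involve, the sketch below orders issues by the dependency property from Section 2 (issues that need to be resolved first). The issue identifiers, the `depends_on` mapping, and both helper functions are hypothetical; the ordering uses Python's standard `graphlib` module rather than any component of the described infrastructure.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical issue base: issue id -> ids of issues that must be resolved first.
depends_on = {
    "choose-persistence-layer": set(),
    "define-event-format": set(),
    "design-notification-ui": {"define-event-format"},
    "plan-integration-test": {"choose-persistence-layer", "design-notification-ui"},
}
resolved = {"define-event-format"}


def ready_issues(depends_on, resolved):
    """Open issues whose dependencies have all been resolved."""
    return sorted(
        issue
        for issue, deps in depends_on.items()
        if issue not in resolved and deps <= resolved
    )


def resolution_order(depends_on):
    """One valid order for resolving all issues; raises CycleError on cycles."""
    return list(TopologicalSorter(depends_on).static_order())


ready = ready_issues(depends_on, resolved)
order = resolution_order(depends_on)
print(ready)   # prints ['choose-persistence-layer', 'design-notification-ui']
print(order)
```

The list of ready issues is the kind of information a project management component could assign to teams, and a detected cycle would itself be worth a red flag. Prioritization among ready issues is left out of the sketch.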
5. EVALUATION
While an awareness system that satisfies the requirements stated in Section 3 can be designed in a domain agnostic way, its benefits for distributed software engineering depend on how well the implementation fits into the existing development infrastructure, and on how well its concepts, namely interests, rules, and models, are calibrated to satisfy the needs of its users. We believe that this has to be done through empirical research, and therefore we have designed and are currently implementing such a system, Awareness Builder, that we are going to start using this fall. The system will provide an XML-RPC interface for collecting events from the tools our developers use (CVS, Lotus Notes, and REQuest), and a graphical notification system. To get enough events from CVS we need to make sure that changes in the working copies are frequently promoted to the central repository. We will therefore introduce a configuration management policy which asks for promotions of changes at least once a day. This policy will be supplemented by a set of criteria with which the new code has to comply.
This summer, after the implementation is completed, we plan to perform a series of short studies with a limited number of students to calibrate the system. Those studies will
have three goals: to find suitable rules, to define meaningful distance metrics for both the tool-specific models and the meta-model, and to evaluate the user interface. The calibrated system will then be used by the participants of our next distributed project course (August 2002 - March 2003), where we will study the impact of increased awareness on project communication.
Once the awareness system and its relationship with the issue repository are sufficiently well tuned, we plan to investigate further uses of the stream of awareness events. As mentioned in Section 4, issues can play a central role in the software life cycle, by providing a tool both for establishing context and for making explicit information that usually stays implicit in single site projects. We will use the data generated by the awareness experiments to identify how issue models support this view of development and to elicit requirements for issue-based management tools.
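The XML-RPC event interface mentioned at the start of this section could look roughly like the sketch below. The method name `awareness.report_event`, its four parameters, and the in-memory `received` list are assumptions made for illustration; the actual Awareness Builder interface is not specified here. To stay self-contained, the sketch dispatches one encoded call in-process through the standard library's `_marshaled_dispatch` hook instead of opening a network socket.

```python
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher

received = []   # in-memory stand-in for the awareness system's event store


def report_event(originator, tool, artifact, command):
    """Hypothetical remote method that a tool adapter calls for each action."""
    received.append(
        {"originator": originator, "tool": tool,
         "artifact": artifact, "command": command}
    )
    return len(received)   # number of events stored so far


dispatcher = SimpleXMLRPCDispatcher()
dispatcher.register_function(report_event, "awareness.report_event")

# Encode a call the way a tool adapter would, then dispatch it in-process.
# A deployment would instead serve the same function over HTTP, for example
# with xmlrpc.server.SimpleXMLRPCServer, and call it via xmlrpc.client.ServerProxy.
request = xmlrpc.client.dumps(
    ("alice", "cvs", "src/dispatch/Notifier.java", "checkin"),
    methodname="awareness.report_event",
)
response = dispatcher._marshaled_dispatch(request)
(result,), _ = xmlrpc.client.loads(response)

print(result, received[0]["command"])
```

A CVS adapter could issue such a call from a commit hook, which would leave the developers' tools themselves unchanged.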
6. CONCLUSION
In this paper we propose a technology-driven approach, as opposed to a business administration-driven approach, for enabling software engineers to build better software through rationale capture and knowledge management. We investigate the use of group awareness to create opportunities for informal communication and for longer term uses, such as process and communication metrics.
Key challenges in group awareness include identifying incentives for developers to accept the monitoring of their actions and addressing privacy concerns. When identifying potential incentives, we note that lack of sharing of information among sites leads to adversarial relationships and lack of trust. We anticipate that sites can benefit from the system by offering a greater transparency into their activities. Such transparency can then lead, for example, to certification frameworks for supplier sites and reinforce long term relationships among sites. When addressing privacy concerns, we avoid common problems occurring in other monitoring systems by ensuring that only information that is already public (e.g., CVS events) is broadcast. Moreover, the exchange of information is symmetric, as users can only view awareness events if they agree to produce awareness events. In general, awareness events should comply with the access control mechanisms already in place.
The above issues are difficult to predict and anticipate, as they relate to complex organizational and human processes. Only an experimental approach will enable us to assess the impact of the system with respect to these issues and to design solutions that address them.
7. REFERENCES
[1] A. Braun, B. Bruegge, and A. H. Dutoit. Supporting informal requirements meetings. In 7th International Workshop on Requirements Engineering: Foundation for Software Quality, June 2001.
[2] B. Bruegge, A. H. Dutoit, R. Kobylinski, and G. Teubner. Transatlantic project courses in a university environment. In Asian Pacific Software Engineering Conference, Dec. 2000.
[3] M. Bürger. Unterstützung von Awareness bei der Gruppenarbeit mit gemeinsamen Arbeitsbereichen. PhD thesis, Institut für Informatik der Technischen Universität München, 1998.
[4] B. Curtis, H. Krasner, and N. Iscoe. A field study of the software design process for large systems. Communications of the ACM, 31(11), Nov. 1988.
[5] D. Damian. An empirical study of requirements engineering in distributed software projects: Is distance negotiation more effective? In Asian Pacific Software Engineering Conference, Dec. 2001.
[6] P. Dourish and V. Bellotti. Awareness and coordination in shared workspaces. In Conference Proceedings on Computer-supported Cooperative Work, 1992.
[7] P. Dourish and S. Bly. Portholes: Supporting awareness in a distributed work group. In Conference Proceedings on Human Factors in Computing Systems, 1992.
[8] A. H. Dutoit and B. Bruegge. Communication metrics for software development. IEEE Transactions on Software Engineering, 24(8), Aug. 1998.
[9] A. H. Dutoit, J. Johnstone, and B. Bruegge. Knowledge scouts: Reducing communication barriers in a distributed software development project. In Asian Pacific Software Engineering Conference, Dec. 2001.
[10] A. H. Dutoit and B. Paech. Rationale-based use case specification. Requirements Engineering Journal, 2002.
[11] S. Goldmann and B. Koetting. Software engineering over the internet. IEEE Internet Computing, pages 93–94, July 1999.
[12] R. E. Grinter, J. D. Herbsleb, and D. E. Perry. The geography of coordination: Dealing with distance in R&D work. ACM, 1999.
[13] J. Grudin and L. Palen. Why groupware succeeds: Discretion or mandate? In European Conference on Computer Supported Collaborative Work, 1995.
[14] P. J. Kammer, G. A. Bolcer, R. N. Taylor, A. S. Hitomi, and M. Bergman. Techniques for supporting dynamic and adaptive workflow. Computer Supported Cooperative Work, 9(3):269–292, 2000.
[15] R. E. Kraut and L. A. Streeter. Coordination in software development. Communications of the ACM, 38(3), Mar. 1995.
[16] T. Moran and J. Carroll. Design Rationale: Concepts, Techniques, and Use. Lawrence Erlbaum Associates, Mahwah, NJ, 1996.
[17] R. I. Sara Bly, S. Harrison. Media spaces: Bringing people together in a video, audio and computing environment. Communications of the ACM, 36(1), Jan. 1993.