Introduction to the Software Operation Knowledge Framework Henk van der Schuur, Slinger Jansen, Sjaak Brinkkemper Department of Information and Computing Sciences Utrecht University Utrecht, The Netherlands
{hw.schuur, s.jansen, s.brinkkemper}@cs.uu.nl
1.
INTRODUCTION
Figure 1 depicts the Software Operation Knowledge (SOK) framework. The SOK framework identifies the software operation knowledge life cycle phases and models the flow of software operation data, information and knowledge through software engineering tools and processes, from a number of perspectives. The framework is designed to reveal the potential role of software operation knowledge in the support and improvement of software engineering processes and tools. Furthermore, the framework serves as a substrate for determining the scope of our research on software operation knowledge, and can potentially fulfill an equivalent role in other research initiatives on software engineering processes, software evolution, tool development and change management. Note that the SOK framework is not a business intelligence or knowledge management framework. The parties, phases and perspectives that constitute the framework are detailed in the remainder of this paper.
1.1
Parties
The SOK framework distinguishes two parties: software vendors and customers. The ‘Customer’ party represents a software vendor’s business-to-consumer customers (end-users, or end-users of external enterprises), as well as a vendor’s business-to-business customers (external software vendors, partners that have licensed the vendor’s software, and end-users of these external software vendors). End-users and their behavior form the initial source of software operation knowledge; software vendors collect operation data from their customers and potentially respond to this data through software development, release, marketing or quality assurance processes. Other parties operating within a software vendor’s environment [4] are outside the scope of the framework.
1.2
Phases
As depicted in Figure 1, five software operation knowledge phases can be identified. The proposed phases take place sequentially, cyclically¹ and independently per software operation knowledge type k ∈ κ. The phases represent the life cycle of software operation knowledge and illustrate the transformation of software operation data (Identification, Acquisition) via software operation information (Acquisition, Integration, Presentation) into software operation knowledge (Presentation, Utilization). All phases are detailed in the next subsections.
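The life cycle described above can be sketched as a small cyclic state machine. This is an illustration only: the names `Phase`, `FORMS` and `next_phase`, and the placeholder type labels `k1` and `k2`, are assumptions and do not appear in the framework itself.

```python
from enum import Enum


class Phase(Enum):
    """The five SOK life cycle phases, in the order the text lists them."""
    IDENTIFICATION = 1
    ACQUISITION = 2
    INTEGRATION = 3
    PRESENTATION = 4
    UTILIZATION = 5


# Form of the operation artefact handled in each phase, as stated in the text.
FORMS = {
    Phase.IDENTIFICATION: {"data"},
    Phase.ACQUISITION: {"data", "information"},
    Phase.INTEGRATION: {"information"},
    Phase.PRESENTATION: {"information", "knowledge"},
    Phase.UTILIZATION: {"knowledge"},
}


def next_phase(phase: Phase) -> Phase:
    """Return the following phase; Utilization wraps around to Identification."""
    members = list(Phase)
    return members[(members.index(phase) + 1) % len(members)]


# Each knowledge type k advances through the cycle independently.
# "k1" and "k2" are placeholder labels for two members of the set of types.
current = {"k1": Phase.UTILIZATION, "k2": Phase.ACQUISITION}
current = {k: next_phase(p) for k, p in current.items()}
```

Keeping one phase value per knowledge type mirrors the statement that the phases run independently for each k ∈ κ.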
1.2.1
Identification
The first software operation knowledge phase encompasses the identification and selection of software operation data. Since software operation data acquisition potentially introduces a data explosion that hinders software vendors from successfully utilizing software operation knowledge, directed acquisition is required. The amount of software operation data that is acquired is controlled by defining acquisition and mining criteria. Operation data is associated with one or more SOK types, and each SOK type k ∈ κ is assigned a weight w that represents, for example, the acquisition priority of k. Furthermore, abstraction logic is defined for software operation data aggregation and encapsulation. Operation knowledge demands resulting from the utilization phase are translated into acquisition or mining criteria in order to adjust the SOK acquisition process. At the end of the identification phase, a set of acquisition, mining and abstraction criteria has been defined for use in the acquisition phase.
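Directed acquisition through weighted SOK types could look roughly as follows. This is a minimal sketch under stated assumptions: `AcquisitionCriterion`, `select_criteria`, `apply_demand`, the threshold, the event names and the weights are all hypothetical, since the text does not prescribe how weights are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcquisitionCriterion:
    """Hypothetical criterion: which event type to record, and for which SOK type."""
    event_type: str
    sok_type: str


def select_criteria(criteria, weights, threshold):
    """Keep criteria whose SOK type weight w meets the threshold, highest weight first."""
    kept = [c for c in criteria if weights.get(c.sok_type, 0.0) >= threshold]
    return sorted(kept, key=lambda c: weights[c.sok_type], reverse=True)


def apply_demand(weights, sok_type, new_weight):
    """Return a copy of the weights adjusted by a demand from the utilization phase."""
    updated = dict(weights)
    updated[sok_type] = new_weight
    return updated


# Illustrative weights per SOK type (placeholder labels and values).
weights = {"k1": 0.9, "k2": 0.4, "k3": 0.7}
criteria = [
    AcquisitionCriterion("unhandled_exception", "k1"),
    AcquisitionCriterion("menu_click", "k2"),
    AcquisitionCriterion("response_time", "k3"),
]
selected = select_criteria(criteria, weights, threshold=0.5)

# A new knowledge demand raises the priority of k2 for the next cycle.
reselected = select_criteria(criteria, apply_demand(weights, "k2", 0.8), threshold=0.5)
```

Filtering by weight is only one way to keep the acquired data volume in check; a vendor could equally use the weights to order or sample events.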
1.2.2
Acquisition
The SOK acquisition phase comprises several processes. First, the software behavior of end-users is translated into software operation data, in accordance with the acquisition criteria defined in the previous phase. Secondly, software operation data is transferred to the software vendor from the servers or workstations on which the software is deployed. Next, based on the mining and abstraction criteria and logic defined in the previous phase, software operation data sources are identified and software operation information is extracted from all operation data gathered. This operation information constitutes the input for the software operation knowledge integration phase.
Although software operation data is often acquired manually by extending the software code base with log code or trace classes, research shows that software operation data can also be deduced automatically from deployed software. When the acquisition of software operation data is considered a cross-cutting concern, for example, the data can be acquired using aspect-oriented programming (AOP) techniques [5]. Advices containing software operation data acquisition logic can be inserted at point-cuts in the target code, where each point-cut (or aspect) can be associated with a specific event type or method call to acquire specific software operation data [7]. Bowring et al. [1] apply a technique called software tomography to automatically gather data from deployed software.
Note that the amount of software operation data that is acquired depends on an end-user’s behavior, as well as on the type of software operation knowledge k ∈ κ that is eventually extracted from the operation data. While operation data associated with most operation knowledge types (κP, κQ, κU) can be acquired ‘for free’ during software usage, the amount of acquired operation data associated with κF is directly proportional to the amount of feedback submitted by end-users.
Software operation data may be transferred to the software vendor in real time or on a schedule, depending on security, regulation or capacity constraints. Both the AOP and tomography techniques can be used to limit the amount of data that is transferred. Abstraction and data mining techniques like those described by Han and Kamber [3] are applied to the operation data stored at the software vendor, to aggregate, generalize or particularize operation data. The mining and abstraction of operation data results in software operation information, the output of the acquisition phase.
¹ Identification initiates again once Utilization is concluded.
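To make the acquisition step concrete, the sketch below uses a Python decorator as a rough analogue of an AOP advice. It is not an AOP framework and not the authors' implementation. The names `record_operation`, `operation_log`, `export_report` and `abstract` are hypothetical, and the in-memory list merely stands in for data that would be transferred to the vendor.

```python
import functools
import time
from collections import Counter

operation_log = []  # stands in for operation data later transferred to the vendor


def record_operation(event_type):
    """Decorator acting like an advice: wraps a call and records operation data."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            outcome = "ok"
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                outcome = type(exc).__name__  # record the failure, then re-raise
                raise
            finally:
                operation_log.append({
                    "event": event_type,
                    "function": func.__name__,
                    "outcome": outcome,
                    "seconds": time.perf_counter() - start,
                })
        return wrapper
    return decorator


@record_operation("usage")
def export_report(rows):
    """Example application function whose behavior is being observed."""
    if not rows:
        raise ValueError("nothing to export")
    return len(rows)


def abstract(log):
    """Aggregate raw records into counts per (function, outcome)."""
    return Counter((r["function"], r["outcome"]) for r in log)


export_report([1, 2, 3])
try:
    export_report([])
except ValueError:
    pass
summary = abstract(operation_log)
```

The raw records in `operation_log` correspond to operation data, while the aggregated `summary` is a simple form of operation information; transferring only the summary is one way to limit the amount of data sent to the vendor.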
[Figure 1 diagram labels. Parties: Software Vendor, Customer. Flow: behavior, operation data, data mining + abstraction, operation information, operation knowledge demands, feedback response, software modification (updates, licenses, etc.). Perspectives: Development, Company, Customer. Tools: IDE, bug tracker, management tools, marketing tools, license activation tools, support system, training software. Presentations: exception type graph, bug priority list, management dashboard, usage report, operation report, performance report. Utilization: informed development, software maintenance, release management, usability improvement, resource management, roadmap construction, relationship management, core value determination, customized licensing, directed assistance, pro-active support, training modification.]
1.2.3
Integration
In the integration phase, software operation information resulting from the acquisition phase is integrated into the existing systems, tools, models and infrastructure of a software vendor. Plug-ins, conversion components or mediator services are developed for this purpose, which in turn enable purposeful presentation and utilization of software operation knowledge. For example, an IDE plug-in may be developed to integrate the software operation information of a particular code file into the development environment of software developers. Furthermore, a software information conversion service could be developed to automatically register unhandled exceptions in the software vendor’s bug tracker. When software operation information is integrated with a vendor’s systems and tools, existing (software engineering) processes and workflows have to be adjusted to incorporate and make use of the software operation information and the developed integration plug-ins and tools.
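The conversion service mentioned above could be sketched as follows. The `BugTracker` class is an in-memory stand-in with a hypothetical interface, not the API of any real bug tracker, and the record fields `component` and `exception` are assumptions about what the acquisition phase delivers.

```python
from dataclasses import dataclass, field


@dataclass
class BugTracker:
    """In-memory stand-in for a vendor's bug tracker (hypothetical interface)."""
    tickets: dict = field(default_factory=dict)

    def register(self, key, title):
        """Create a ticket for a new key, or count another occurrence of a known one."""
        ticket = self.tickets.setdefault(key, {"title": title, "occurrences": 0})
        ticket["occurrences"] += 1
        return ticket


def convert_and_register(exception_records, tracker):
    """Map unhandled-exception records onto tickets, one per (component, exception type)."""
    for rec in exception_records:
        key = (rec["component"], rec["exception"])
        tracker.register(key, f"{rec['exception']} in {rec['component']}")


# Illustrative operation information about unhandled exceptions.
records = [
    {"component": "billing", "exception": "KeyError"},
    {"component": "billing", "exception": "KeyError"},
    {"component": "export", "exception": "TimeoutError"},
]
tracker = BugTracker()
convert_and_register(records, tracker)
```

Deduplicating on component and exception type keeps repeated failures from flooding the tracker while still preserving their frequency, which a later phase could use to prioritize bugs.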
1.2.4
Presentation
Presentation, the fourth software operation knowledge phase, is concerned with the presentation of software operation information. Information resulting from the integration phase is visualized in graphs or diagrams, possibly by the integration plug-ins or tools developed in that phase. For example, based on exception event data, a bar chart is created showing exception frequencies per software component. Note that each of the perspectives (described in section 1.3) may require a different visual representation of software operation information, illustrating the data at various levels of detail. Especially when software operation information is presented in combination with historical software operation information or with software development, engineering or maintenance data, new insights or software operation knowledge are gained in this phase.
Figure 1: Software Operation Knowledge Framework
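The bar chart of exception frequencies per software component, given as an example in the Presentation phase, can be approximated in plain text. This is a minimal sketch: the function name `exception_bar_chart`, the event format and the component names are assumptions chosen for illustration.

```python
from collections import Counter


def exception_bar_chart(events, width=20):
    """Render exception counts per component as a text bar chart, most frequent first."""
    counts = Counter(e["component"] for e in events)
    if not counts:
        return ""
    peak = max(counts.values())
    label_width = max(len(name) for name in counts)
    lines = []
    # Sort by descending count, then by name so that ties are ordered deterministically.
    for name, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        bar = "#" * max(1, round(n * width / peak))
        lines.append(f"{name.ljust(label_width)} | {bar} {n}")
    return "\n".join(lines)


# Illustrative exception events, one dictionary per event.
events = [
    {"component": c}
    for c in ["billing", "billing", "export", "billing", "login", "export"]
]
chart = exception_bar_chart(events, width=6)
print(chart)
```

A Development perspective might want this chart per code file, whereas a Company perspective might only need totals per product; the same counts can feed both renderings.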
1.2.5
Utilization
The last software operation knowledge phase encompasses utilization of the software operation knowledge resulting from the previous phase to improve and enhance the software engineering processes of a software vendor. For example, integrating and presenting software usage, performance and exception statistics in a programmer’s IDE gives software developers knowledge of and insight into software performance, quality, usage and critical code failures, and thereby contributes to informed software development. Moreover, the gathered software operation knowledge supports concrete decision making, as described by Siebes [6]. For example, end-user feedback and software quality knowledge support usability improvement and release management decisions. Reviewing the utilization phase in retrospect may result in new operation knowledge demands, which form the input of the identification phase.
1.3
Perspectives
The route of software operation information through the integration, presentation and utilization phases can be observed from three perspectives that originate in the product software research framework of Brinkkemper and Xu [2]. First, the Development perspective concerns all processes that eventually produce software products that can readily be deployed at customers. Secondly, the Company perspective concerns processes not directly related to software development, such as marketing, sales, and quality control. Thirdly, the Customer perspective represents all factors and processes that influence the relation between a software vendor and its customers, such as training, support and relationship management processes. The SOK framework models software engineering processes and tools in terms of these perspectives. As a result, software operation knowledge that is integrated with development tools, or that results in software modifications, routes through the Development perspective. Knowledge that contributes to the indirect effects of a software vendor’s software engineering processes (identity creation, core value determination and roadmap development) routes through the Company perspective, and knowledge that contributes to customer support processes, or is used to respond to feedback, routes through the Customer perspective. In the next subsections, the significance of the perspectives within the SOK framework is described.
2.
REFERENCES
[1] J. Bowring, A. Orso, and M. J. Harrold. Monitoring Deployed Software Using Software Tomography. In PASTE ’02: Proceedings of the 2002 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering, pages 2–9, New York, NY, USA, 2002. ACM.
[2] S. Brinkkemper and L. Xu. Concepts of Product Software. European Journal of Information Systems, 16:531–541, 2007.
[3] J. Han and M. Kamber. Data Mining: Concepts and Techniques, 2nd edition. The Morgan Kaufmann Series in Data Management Systems. Morgan Kaufmann, San Francisco, California, 2006.
[4] S. Jansen, S. Brinkkemper, and A. Finkelstein. Providing Transparency in the Business of Software: A Modeling Technique for Software Supply Networks. In IFIP’07: Proceedings of the 8th IFIP Working Conference on Virtual Enterprises, pages 677–686, 2007.
[5] G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C. Lopes, J.-M. Loingtier, and J. Irwin. Aspect-Oriented Programming. In M. Akşit and S. Matsuoka, editors, Proceedings European Conference on Object-Oriented Programming, volume 1241, pages 220–242. Springer-Verlag, Berlin, Heidelberg, and New York, 1997.
[6] A. Siebes. From Discovered Knowledge to Decision Making. In W. Klösgen and J. M. Żytkow, editors, Handbook of Data Mining and Knowledge Discovery. Oxford University Press, Inc., New York, NY, USA, 2002.
[7] H. van der Schuur, S. Jansen, and S. Brinkkemper. Becoming Responsive to Service Usage and Performance Changes by Applying Service Feedback Metrics to Software Maintenance. In ASE’08: Proceedings of the 23rd IEEE/ACM International Conference on Automated Software Engineering Workshops, pages 53–62, 2008.