A Software Monitoring Framework for Quality Verification

Dileepa Jayathilake
99X Research, 99X Technology
Colombo, Sri Lanka
[email protected]

Abstract—Software functional testing can unveil a wide range of potential malfunctions in applications. However, a significant fraction of errors is rarely detected through a traditional testing process. Problems such as memory corruptions, memory leaks, performance bottlenecks, low-level system call failures and I/O errors might not show any symptoms on a tester's machine while causing disasters in production. On the other hand, many handy tools have emerged on all popular platforms that allow a tester or an analyst to monitor the behavior of an application with respect to these dark areas in order to identify potentially fatal problems that would otherwise go unnoticed. Unfortunately, these tools are not yet in widespread use, for a few reasons. First, using the tools requires a certain amount of expertise in system internals. Furthermore, these monitoring tools generate a vast amount of data even with elegant filtering, and thereby demand a significant amount of time for an analysis even from experts. As a result, using monitoring tools to improve software quality becomes a costly operation. Another facet of this problem is the lack of infrastructure to automate recurring analysis patterns. This paper describes the current state of ongoing research in developing a framework that automates a significant part of the process of monitoring various quality aspects of a software application using such tools and deriving conclusions based on the results. To the best of our knowledge, this is the first framework to do so. It provides infrastructure for analysts to extract relevant data from monitoring tool logs, process those data, make inferences and present analysis results to a wide range of stakeholders in a project.

Keywords-application monitoring; quality verification; blackbox testing; log analysis; software quality

I. INTRODUCTION
The value of quality in software is well recognized. Poor quality can incur substantial costs during various phases of the software life cycle and can, at times, even bring disaster to a business. It has been shown that detecting software malfunctions in early phases of development saves an organization a significant amount of money. The bottom line is that a malfunction should be captured and fixed before reaching production.

Many practices are employed to detect software defects during the development life cycle. Manual and automated functional testing is one prominent approach. It focuses on spotting any observable deviations of the product with respect to the desired features and behavior. The usual practice is to perform a sequence of predefined or ad hoc input actions on the software and to observe the output while comparing it with the pre-specified desired output. Any mismatch is identified as a defect. Although this process, when followed with discipline and proper commitment, can cover a significant part of the whole spectrum of possible defects, it inevitably leaves out certain important problems that carry the potential to wreak havoc in later stages. These include certain memory corruptions, memory leaks, disk access errors, failures in certain low-level operating system calls, pitfalls due to insufficient user access rights, unjustifiable performance bottlenecks, etc. Admittedly, the symptoms of these problems may surface during functional testing, but there is no guarantee that they will. For example, whether the effect of a memory corruption surfaces depends on the state of the virtual memory in a machine at the time. Accordingly, it can go unnoticed on a tester's machine while causing a crash in the end user's environment. Therefore it is important to have a deterministic way of capturing these problems in order to maintain software quality.

One way of attacking this problem is to supplement traditional functional testing with a thorough analysis of the source code, in view of identifying problems at a different stage of the software. This practice, usually termed "software white box testing", has proven its efficiency in surfacing many critical defects that would otherwise create severe problems in maintenance phases. Important metrics evaluated during white box testing include cyclomatic complexity, testability, unit test coverage, potential memory leaks, class and function level complexity, deprecated API usage, code duplication, undocumented code, styling errors, etc. However, white box testing has its own limitations. Some quality aspects of software such as memory corruptions, inefficient memory usage (e.g. memory fragmentation), I/O failures, performance bottlenecks, security pitfalls, platform-specific problems, insufficient user access rights and compatibility issues can hardly be evaluated until the software is put into operation in its real environment. This brings the need for tools that can monitor the behavior of software while in operation.

Fortunately, a rich collection of monitoring tools has emerged during recent years on every popular platform. These tools have the capability to record the behavior of a software application with respect to most of the aspects mentioned above. Recorded data can be used by an analyst to identify potential problems. Although not in ubiquitous use, this practice, when combined with functional testing and white box testing, adds significant value to the software quality verification process for the above-mentioned reasons. However, one prominent problem is that using these tools requires expertise. Since the tools generate a vast amount of data, even an expert needs a lot of time for an analysis procedure. Both these facts incur a significant cost to an organization, hence pushing many teams to skip this step. Even if a team decides to bear the cost, the results obtained from the analysis procedure are understood only by the expert analyst, which prevents other stakeholders such as testers, project managers and customers from interpreting the results at their own level to gain an overall idea of the health of the application. Furthermore, a manual analysis process makes it harder to automate recurrent analysis patterns. Though some monitoring tools provide command-recording mechanisms, automating an analysis procedure that involves more than one tool is increasingly difficult. These factors speak for the value of a framework with the capability to integrate and harmonize multiple monitoring tools while providing an easy automation interface and a dashboard for presenting analysis results to different levels of users. Though there are automation frameworks for functional testing [1] and white box testing [2], no such framework exists for software monitoring tools.

This paper details a survey of existing software monitoring tools and a framework we developed to integrate them. The framework is an extension of the work we presented in [3]. It provides a simple scripting language to automate analysis procedures, a declarative language to interpret the output from monitoring tools, a C++ API for implementing control code and a set of convenience tools. Section II summarizes the criteria we used to pick the monitoring tools to be used in the framework. Section III provides a review of each monitoring tool used. We hope these two sections form a solid starting point for software developers, analysts, testers and managers to select the right set of monitoring tools for their applications. Section IV describes framework features while justifying the requirement for each. In Section V we include the architecture and framework component descriptions along with implementation details. Section VI presents the results of an experiment done with the framework to showcase its capabilities. Section VII discusses related work. Section VIII concludes the work, and potential future enhancements are discussed in Section IX.

II. TOOL SELECTION
Tools play a major role in the framework. One inspiration for developing this framework was the range of excellent monitoring tools developed by operating system vendors and the open source community in the recent past. Moreover, the majority of these tools are free. We decided to survey these tools and pick appropriate ones for our framework, rather than trying to develop our own monitoring tools, for the following reasons: (a) software monitoring is a common problem in the industry and hence a large number of handy tools are already available for the job; (b) developing quality evaluation functionality inside the framework would require a large amount of resources; (c) it is easier to establish trust in the results of the framework when widely used standard tools are employed.

The tools need to be carefully selected considering both current requirements and possible future maintenance issues. First, the results obtained from the analysis procedure must be credible for all the stakeholders. This requires employing widely used tools that support standard quality metrics. The combination of required quality metrics will differ across projects. For example, a payment gateway application might prioritize security as a key concern, while latency-related metrics may be the highlight in a communication application. In order to cater for this wide range of demands, the tools should be highly configurable so that the degree of importance of each quality metric can be set at project level. The collection of tools should be selected so that each platform gets sufficient coverage. The possibility of false positives is also a decisive factor; tools known to generate a relatively high proportion of false positives were omitted in our tool selection process. The cost of tools is also considered in view of making the framework affordable for a wider range of users; ultimately, all the tools selected are free. However, when employing free tools, attention is paid to their future evolution and support. Only tools with an active community built around them are taken in.

III. TOOLS
This section provides an introduction and a practical review for each tool currently integrated into the framework.

A. Apache JMeter

This is an open source, free Java tool from the Apache Jakarta project [4]. It can be used to load test client-server applications with respect to both static and dynamic resources such as files, servlets, Perl scripts, Java objects, database queries, FTP servers, etc. The tool facilitates simulating heavy load on a server, a network or an object in order to test its strength or to analyze performance under different load types. For example, a huge number of concurrent users and activities on a web site can be simulated using JMeter, and the responsiveness can be quantified and observed graphically. Key features of the tool include [5]: (a) ability to load test different server types (HTTP, HTTPS, SOAP, database via JDBC, LDAP, JMS, POP3 and IMAP); (b) portability across different platforms; (c) concurrent sampling using multithreading; (d) caching and offline analysis; (e) ability to replay test results; (f) high extensibility (pluggable samplers for customized sampling, pluggable timers for customized load statistics, data analysis and visualization plug-ins for personalization, custom functions for providing dynamic input to a test or data manipulation, and scriptable samplers). JMeter provides a simple and intuitive user interface, so the tool can be used after a short learning curve. It is stable and has a strong community, so one can expect free expert solutions to problems when using the tool. The downside is that it does not support client-side scripts.
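As a rough illustration of how a framework could drive JMeter without its GUI, the following Python sketch launches an existing test plan in non-GUI mode and summarizes the resulting sample log. The plan path, the location of the jmeter executable and the assumption that results are saved as CSV with "elapsed" and "success" columns are illustrative; they depend on the local JMeter installation and its result-saving configuration.

```python
import csv
import subprocess

# Run a test plan in non-GUI mode (-n), reading the plan from -t and writing
# sample results to the JTL file named by -l. Paths are placeholders.
subprocess.run(["jmeter", "-n", "-t", "load_test.jmx", "-l", "results.jtl"], check=True)

# Summarize the sample log, assuming CSV output with a header row.
elapsed = []
failures = 0
with open("results.jtl", newline="") as results:
    for row in csv.DictReader(results):
        elapsed.append(int(row["elapsed"]))       # response time in milliseconds
        if row["success"].lower() != "true":      # JMeter records each sample as true/false
            failures += 1

if elapsed:
    print(f"samples={len(elapsed)} failures={failures} "
          f"avg={sum(elapsed) / len(elapsed):.1f} ms max={max(elapsed)} ms")
```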
B. Microsoft Application Verifier

This is a handy tool for detecting problems incurred at the system level when running an application on Microsoft Windows [6]. Some failures and malfunctions that occur in operating system calls do not cause immediate application level errors but result in subtle problems later on. These form one of the most difficult categories of bugs to track. Microsoft Application Verifier helps in detecting them immediately, and hence in identifying the reasons far more easily. It is a free tool for Windows XP and later versions. Usage is quite simple. The tool comes with a default grouping of quality aspects that can be monitored, and each aspect is given a comprehensible description. First, the user needs to activate the quality aspects of interest and optionally attach a debugger (e.g. Visual Studio or WinDbg). Then the required application (process) should be attached to the tool. After that, when the application is run and the test cases are performed, Application Verifier logs all the potential problems with respect to the activated quality aspects. The log file can be analyzed offline by an experienced engineer to identify errors, drilled down to source code level. In addition, if a debugger is attached, Application Verifier raises errors at run time as and when they occur. The key areas that can be analyzed using this tool include virtual memory usage, first-chance access violation exceptions, input/output transfers, synchronization objects (e.g. locks), handle usage, thread pool usage, thread local storage, DLL handling, interactive service creation, dangerous API calls, driver installations and user privilege issues. Application Verifier can also be used to simulate low resource conditions on the machine.

C. LeakDiag & LDGrapher

These are a pair of tools that can be used in conjunction to detect memory leaks in an application. Both tools are provided by Microsoft and are free. LeakDiag can be configured to monitor a specific area of an application's memory allocations. The possible areas are the virtual memory allocator, Windows heap allocator, COM allocator, C runtime allocator and TLS slot allocator. While the application is running, LeakDiag generates an XML log file, which can be analyzed offline. This log file can be analyzed as it is (in XML form), or LDGrapher can be used to display the information in the log file graphically. LDGrapher displays the memory allocation graph for every thread in the application, which makes the analysis easier. Several details, such as the stack of a thread at a given instance, can be obtained for analysis [7].

D. Process Monitor

Process Monitor is another free analysis tool from Microsoft [8]. It is an advanced Windows logging utility which collects and records most of the system activity regarding the file system, registry and process/thread activity. It comes with a rich GUI that displays this information in real time. The information can also be saved for later analysis. Process Monitor also offers advanced filtering options to trace the specific activity of interest to the analyst. Some of the key features of the tool are: capturing thread stacks for each operation; reliable capturing of process details (image path, command line, user and session ID); configurable and moveable columns for any event property; non-destructive
filters; an advanced logging architecture scaling to tens of millions of captured events and gigabytes of log data; the ability to set filters for any data field, including fields not configured as columns; a process tree showing the relationships between processes; detail tooltips; cancellable search; and boot time logging. Since Process Monitor captures and logs a vast amount of system events, it is important that it is run with only the essential monitoring areas activated and only for a limited time; otherwise the log file can easily grow to an unmanageably huge size.

E. Xperf

Being part of the Microsoft Windows Performance Toolkit, Xperf is a free performance-profiling tool for Windows applications [9]. It can be used for tracking performance bottlenecks in applications as well as for comparing the time taken by various operations in an application. Xperf uses the ETW (Event Tracing for Windows) mechanism to track system events. Profiles are employed to customize the specific domains data is collected for, according to the analyst's preference. The command line utility can be used to trace events into a file, which is later opened using another GUI utility named xperfview. This GUI displays system resource usage in each domain in the form of graphs. Selected parts of any graph can be analyzed while drilling down to details such as stacks. An experienced analyst can benefit hugely from the parallel view of resource usage, e.g. CPU, memory and I/O. Monitoring areas can be configured using the command line. Xperf comes with a rich collection of preset monitoring profiles that fit many situations. The tool can generate an XML log as well.

F. Application Compatibility Toolkit

This is a Windows application lifecycle management toolset that assists in identifying and managing an overall application portfolio [10]. The toolset is free and can be used to reduce the time and cost involved in resolving application compatibility issues. Its usage scenarios include: analyzing a portfolio of applications, web sites and computers; centrally managing compatibility evaluators and settings; rationalizing and organizing applications, web sites and computers; prioritizing application compatibility with filtered reporting; adding and managing issues and solutions for an enterprise computing environment; deploying automated mitigations to known compatibility issues; and sending and receiving compatibility information from the Microsoft Compatibility Exchange. The toolkit comprises three tools: Standard User Analyzer, Internet Explorer Test Tool and Setup Analysis Tool.

G. God

God is a free process-monitoring tool for Linux, BSD and Darwin systems [11]. It is implemented in Ruby. It provides watchdog functionality on processes with respect to their CPU and memory usage. God can be configured to perform certain actions on a process depending on its CPU and memory consumption; for example, it can restart a process if its memory footprint exceeds 100 MB. In addition to the polling mode, where God collects data about a process periodically, it can also respond to process events such as termination. This is useful in cases where it is necessary to keep a process running all the time. God writes a simple line log to the standard output, which can also be redirected to a physical file. Furthermore, it is capable of sending alerts as email, chat or Twitter messages. It can be configured to monitor multiple processes at the same time with respect to different aspects. Configuration is set via a Ruby file. (A conceptual sketch of this watchdog behavior appears at the end of this section.)

H. Instruments

This is the standard tool for profiling processes that run under Mac OS X or iOS [12]. It is a rich tool with a wide spectrum of monitoring capabilities. It uses monitoring templates called instruments to collect data on various aspects, including CPU consumption, memory usage, memory leaks, I/O operations, power usage and network traffic. Data viewers are built in, so digging into different levels of detail is possible. Monitored data can be saved for making comparisons across multiple runs. It is also possible to record a sequence of actions in an application and replay it automatically later.
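The watchdog behavior described for God in subsection G can be approximated in a few lines. The sketch below is a conceptual Python illustration, not God's actual Ruby configuration DSL; it assumes the third-party psutil package and a placeholder command for the monitored process.

```python
import subprocess
import time

import psutil  # third-party package, assumed to be installed

MEMORY_LIMIT = 100 * 1024 * 1024  # restart once resident memory exceeds 100 MB
POLL_INTERVAL = 30                # seconds between polls
COMMAND = ["./my_server"]         # placeholder for the monitored process

def watch():
    process = subprocess.Popen(COMMAND)
    while True:
        time.sleep(POLL_INTERVAL)
        if process.poll() is not None:
            # Event-like condition: the process terminated, so bring it back up.
            process = subprocess.Popen(COMMAND)
            continue
        if psutil.Process(process.pid).memory_info().rss > MEMORY_LIMIT:
            # Polling condition: memory footprint exceeded the configured limit.
            process.terminate()
            process.wait()
            process = subprocess.Popen(COMMAND)

if __name__ == "__main__":
    watch()
```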
IV. FRAMEWORK FEATURES
After the selection of a tool set, the next step is to formulate a mechanism to automate the process of running those monitoring tools with the appropriate configuration, interpret the monitored data and present them in a dashboard with views targeting different levels of stakeholders. Inspired by practical scenarios, we assume two use cases for the system.

1. The analyst runs the application to be analyzed and initiates the tool set with the proper configuration. He then performs several actions in the application and terminates the session.

2. The application is started automatically after a certain trigger (e.g. after the nightly build). Monitoring tools are automatically initiated with the correct configuration. Test cases are run in the application automatically. Then the session terminates.

Both cases leave monitored data as the final output. Since automatic triggering of application test cases is handled by existing test automation frameworks [1], we focus only on automatically processing the monitored data and presenting the results to different users.

A straightforward requirement is to have a means of configuring a monitoring profile that specifies which aspects should be activated for monitoring in each tool. Almost all monitoring tools generate some form of log file containing the monitored data. The vast majority of these logs are text logs (structured or flat text); therefore it is reasonable to include a text log parser in the framework. However, since log files have arbitrary syntax and structure, this parser should be able to handle any log file format. After monitored data are extracted from the log files, the framework should be equipped with means for automating an analysis procedure. This can be either an interpretation of monitored data provided by the framework or a script written by an analyst, specifying the steps of a recurring analysis procedure. After the monitored data are interpreted or analyzed, the framework should have a way of presenting them in a dashboard with configurable user interface controls. This will enable users to customize the presentation interface according to project needs.
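The monitoring profile mentioned above could take many concrete forms. The sketch below shows one hypothetical representation as a Python structure that maps each tool to the quality aspects activated for a given project; the tool and aspect names are illustrative assumptions, not the framework's actual configuration format.

```python
# A hypothetical monitoring profile: which aspects each tool should record for this project.
MONITORING_PROFILE = {
    "application_verifier": ["heaps", "handles", "locks", "tls"],
    "leakdiag":             ["windows_heap_allocator"],
    "process_monitor":      ["file_system", "registry"],
    "xperf":                ["cpu_sampling", "disk_io"],
}

def aspects_for(tool_name):
    """Return the aspects activated for a tool, or an empty list if it is not used."""
    return MONITORING_PROFILE.get(tool_name, [])

# Example: a tool manager would consult the profile before starting each tool.
for tool, aspects in MONITORING_PROFILE.items():
    print(f"start {tool} with aspects: {', '.join(aspects)}")
```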
V. FRAMEWORK IMPLEMENTATION
Fig. 1 shows the architecture of the framework. Monitoring tools provide the input, and the framework's output is the dashboard. A brief description of each component in the framework follows.
Figure 1. Major components in the framework and their interconnections
Configs UI – Provides the user interface for activating and deactivating quality aspects in monitoring tools. A user can create a monitoring profile for a given project through this interface.

Monitoring Tool Manager – This component insulates the other components from the differences between monitoring tools. It wraps the interfaces provided by the various monitoring tools and exposes a unified set of entry points, so that the tools can be manipulated in a homogeneous way.

MML & LDEL – These are the two languages exposed by the framework. MML stands for Mind Map Language, the mind map based procedural language we presented in [13]. It is a new language optimized for structured log analysis, using the tree as its core data structure. LDEL (Log Data Extraction Language) is an implementation of the declarative language specification we presented in [14]. The implementation is compliant with Simple Declarative Language [15]. LDEL provides generic infrastructure for declaring the structure and format of any log file. Given the declaration of a log file's format and structure, it can extract the log data into a tree, which can then be processed further by MML scripts.

Log Data Extraction – The framework is built with an LDEL declaration for the output log generated by each tool; the declaration specifies the format of the associated log file. When adding a new tool to the framework, the LDEL declaration for its log file format should also be added. The Log Data Extraction module contains these declarations and uses them, through the LDEL interpreter, to extract monitored data from the log files generated by the tools.

Inference Engine – This module takes the output of the Log Data Extraction module as its input. The data are then processed according to a set of MML scripts. The framework provides some of these scripts for simple interpretations of monitored data; other scripts are written by analysts to automate recurring analysis procedures. The output of this module is a set of mind maps containing the processed information (inferences).

User Interface Engine – This module contains MML scripts for generating the dashboard from the inferences. The framework provides scripts for generating simple HTML user interface controls such as tables and graphs. Analysts are expected to write scripts to customize the dashboard.

Data Manager – In our previous work [13] we used in-memory data structures for keeping data. However, they are volatile and their total size is limited by the available memory. In this work we have implemented a module that keeps all the data in a database. The module is implemented in such a way that different databases can be plugged in later. In this implementation we used a relational database built on SQLite. SQLite was selected because it is supported on almost every platform, including mobile operating systems.

Framework Tools – Two convenience tools were developed to make framework usage more efficient. One tool helps analysts mark the instants at which important actions are performed in the application during an analysis session: it writes a string provided by the analyst, describing the action or event, along with a timestamp into a separate log file. This log file can be immensely useful later for correlating data logged by monitoring tools with application events. The other tool assists in authoring the format declaration for a new log file type by analyzing the log file and highlighting common constructs such as timestamps, IP addresses, error codes, recurring strings, etc.
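To make the division of labour between the Log Data Extraction module, the Inference Engine and the event-marker tool concrete, the following Python sketch mimics the overall idea: a per-tool format declaration (here a plain regular expression standing in for real LDEL syntax) turns log lines into records, and an analysis step attributes each tool-reported error to the most recent analyst-marked application event. Log formats, field names and file names are illustrative assumptions, not the framework's actual LDEL or MML code.

```python
import re
from bisect import bisect_right
from collections import Counter

# Hypothetical format declarations: one pattern per log type, standing in for LDEL.
EVENT_PATTERN = re.compile(r"^(?P<time>\d+)\s+(?P<label>.+)$")           # event-marker tool log
ERROR_PATTERN = re.compile(r"^(?P<time>\d+)\s+ERROR\s+(?P<detail>.+)$")  # monitoring tool log

def extract(path, pattern):
    """Parse a log file into a list of dictionaries using the declared pattern."""
    records = []
    with open(path) as log:
        for line in log:
            match = pattern.match(line.strip())
            if match:
                records.append(match.groupdict())
    return records

def group_errors_by_event(events, errors):
    """Attribute each error to the latest application event that preceded it."""
    events = sorted(events, key=lambda e: int(e["time"]))
    times = [int(e["time"]) for e in events]
    counts = Counter()
    for error in errors:
        index = bisect_right(times, int(error["time"])) - 1
        label = events[index]["label"] if index >= 0 else "<before first event>"
        counts[label] += 1
    return counts

events = extract("app_events.log", EVENT_PATTERN)
errors = extract("tool_errors.log", ERROR_PATTERN)
for label, count in group_errors_by_event(events, errors).items():
    print(f"{label}: {count} reported problem(s)")
```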
VI. RESULTS
We extended the experiment done for our previous work [13]. The same XML reading application is used, with more actions this time. Microsoft Application Verifier is used to monitor the application with all monitoring aspects activated. The first framework tool mentioned in Section V is used to mark application events. LDEL declarations for both the Application Verifier log and the framework tool log are used to extract the recorded data. One MML script is written to correlate data from the two logs and to group all the failures reported by Application Verifier by the application event that caused them. Another MML script is added to the User Interface Engine to generate a line graph of error count against the time sequence of application events. Because LDEL is used for log data extraction (instead of MML as in our previous work), the total code line count was reduced by 18%. In addition, the data extraction code in LDEL is more readable than that in MML.

VII. RELATED WORK

Codehaus implemented Sonar, a similar framework that aggregates the output of white box testing tools [2]. It provides infrastructure for code quality management with respect to architecture and design, complexity, coding rules, unit tests, duplications, potential bugs and comments. It is capable of generating a rich, customizable dashboard with content from code quality evaluation tools. Furthermore, it provides the ability to trigger alerts and notifications for particular conditions, and the quality aspects used for evaluation can be customized. Although there are major differences between handling white box testing tools and software monitoring tools, Sonar was an inspiration when formulating the model for our framework.

Earlier, we presented the initial version of a mind map based log analysis framework [13] as a solution to the log analysis problem in general. This framework included, as its main component, a new procedural language optimized for processing log files. The language had the capability to extract log contents, process the extracted data and generate simple dashboards containing analysis results. Though the language seemed appropriate for log data processing, it resulted in long and less readable code when used for log data extraction and dashboard generation. We then realized the need for a separate mechanism for declaring the format of log files and presented a specification [14] for a language capable of expressing the format of any log file. After that we presented a software quality verification framework [3], which included a basic platform to manage software monitoring tools.
VIII. CONCLUSIONS

With this work we achieved a significant improvement over the Product Quality Analyzer framework we presented in [3]. The introduction of our own declarative language, LDEL, into the framework significantly reduces the code required to extract data from tool logs. It also improves the readability of the scripts for monitored data extraction. Since LDEL generates the same data structures used by MML, no data format conversion is required between the data extraction and processing phases.

Using an SQLite database as the data warehouse boosted the amount of data that can be handled during an analysis procedure. It also added data persistence, which can be immensely valuable when presenting a historical view of monitored data, like the timeline in Sonar [2]. For example, monitoring data recorded over a period can be used to showcase the evolution of a product with respect to the monitored aspects.

Although the use of a relational database improved the framework's capacity for handling volumes of monitored data, it is not a proper means of dealing with huge tool logs. The reason is that each node in the generated trees has its own set of attributes. Storing these heterogeneous data trees in a relational database results in sparse tables, which are inefficient both in terms of space and of processing time. NoSQL might be a good candidate for handling such cases because some NoSQL databases allow each row in a table to have its own schema. Depending on the requirements of the analysis procedure, users could then switch between in-memory data structures, relational databases and NoSQL. (A brief storage sketch is given at the end of this section.)

Apparently, the weakest part of the framework in its current state is the User Interface Engine. Though MML possesses strong capabilities for log data processing, it does not prove to be a good choice for user interface generation: making even a simple dashboard with it required a lot of code. The cost incurred can be significantly reduced by having a library of user interface controls implemented in MML and built into the framework. In the long term, however, we will have to come up with a different means of generating more appealing user interfaces.
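The storage problem outlined above can be illustrated briefly: instead of one wide table with a column for every possible node attribute (mostly NULL for any given node), a key-value layout gives each node only the attributes it actually has, at the cost of extra joins when querying. The schema below is a simplified Python/SQLite sketch, not the framework's actual Data Manager schema; table names and sample values are hypothetical.

```python
import sqlite3

connection = sqlite3.connect(":memory:")

# Key-value layout: heterogeneous tree nodes share two narrow tables instead of one sparse one.
connection.executescript("""
    CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, node_type TEXT);
    CREATE TABLE node_attribute (
        node_id INTEGER REFERENCES node(id),
        name    TEXT,
        value   TEXT
    );
""")

# Two nodes of different types carry completely different attribute sets.
connection.execute("INSERT INTO node VALUES (1, NULL, 'heap_error')")
connection.executemany(
    "INSERT INTO node_attribute VALUES (1, ?, ?)",
    [("address", "0x00a3f1c0"), ("stack_depth", "14")],
)
connection.execute("INSERT INTO node VALUES (2, 1, 'io_failure')")
connection.executemany(
    "INSERT INTO node_attribute VALUES (2, ?, ?)",
    [("file", "data.xml"), ("error_code", "5")],
)

# Reassemble one node's attributes when an analysis script asks for them.
rows = connection.execute(
    "SELECT name, value FROM node_attribute WHERE node_id = ?", (1,)
).fetchall()
print(dict(rows))
```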
IX. FUTURE WORK
A few things must be done to bring the framework to an industrially useful level. We first need a library of LDEL declarations for the log file formats used by a range of monitoring tools. A starting point would be to declare the log file formats for all the monitoring tools mentioned in this paper; currently we have the declaration for the Application Verifier log. Another important milestone will be to formulate a categorized list of the quality aspects covered by all the monitoring tools. Sonar does a similar thing for code quality analysis tools [2].

Addressing the inhomogeneity problem in the database is another important future enhancement. Since the nodes in data trees can have vastly different sets of attributes, the framework should support a database model suited to such cases. Our suggestion is to use a NoSQL database model such as BigTable, where each data entity can have its own schema, or a document-oriented model, where the database holds documents containing key-value pairs.

A future enhancement of paramount importance is to develop a new model for the User Interface Engine. The model should be able to create user interfaces that meaningfully represent the information kept in trees inside the framework. We suggest an HTML5-based implementation because of its platform neutrality.
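As a hint of what an HTML5-based User Interface Engine might emit, the sketch below renders a table of problem counts per application event as a static page. It is a deliberately minimal Python illustration under assumed data; the real engine would be driven by the inference trees and configurable controls described earlier.

```python
from html import escape

def render_dashboard(error_counts, path="dashboard.html"):
    """Write a minimal HTML5 page with one table of problem counts per application event."""
    rows = "\n".join(
        f"<tr><td>{escape(event)}</td><td>{count}</td></tr>"
        for event, count in error_counts.items()
    )
    page = f"""<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>Monitoring dashboard</title></head>
<body>
<h1>Reported problems per application event</h1>
<table border="1">
<tr><th>Application event</th><th>Problems</th></tr>
{rows}
</table>
</body>
</html>"""
    with open(path, "w", encoding="utf-8") as out:
        out.write(page)

# Illustrative data of the kind produced by the grouping step described in Section V.
render_dashboard({"Open file": 2, "Parse XML": 7, "Save report": 1})
```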
REFERENCES

[1] "Test Automation Frameworks." [Online]. Available: http://safsdev.sourceforge.net/FRAMESDataDrivenTestAutomationFrameworks.htm. [Accessed: 20-Feb-2012].
[2] "Sonar: Put your technical debt under control." [Online]. Available: http://www.sonarsource.org. [Accessed: 11-Jan-2012].
[3] D. Jayathilake, H. Yaggahavita, U. Senanayake, C. Alvitigala and D. Sriyananda, "A Scalable Product Quality Verifier Framework for an Outsourcing Supplier," in IEEE Conference on Computer Applications & Industrial Electronics, Penang, 2011, pp. 432-437.
[4] "Apache JMeter™." [Online]. Available: http://jmeter.apache.org. [Accessed: 26-Feb-2012].
[5] E. H. Halili, Apache JMeter: A practical beginner's guide to automated testing and performance measurement for your websites. Packt Publishing Ltd, 2008, pp. 16-25.
[6] "Using Application Verifier to Troubleshoot Programs in Windows XP." [Online]. Available: http://support.microsoft.com/kb/286568. [Accessed: 10-Aug-2011].
[7] "Using LeakDiag to Debug Unmanaged Memory Leaks." [Online]. Available: http://mcfunley.com/277/using-leakdiag-to-debug-unmanaged-memory-leaks/comment-page-1. [Accessed: 25-Feb-2012].
[8] "Process Monitor v2.96." [Online]. Available: http://technet.microsoft.com/en-us/sysinternals/bb896645. [Accessed: 10-Aug-2011].
[9] "Windows Performance Analysis Tools." [Online]. Available: http://msdn.microsoft.com/en-us/performance/cc825801. [Accessed: 10-Aug-2011].
[10] "Microsoft Application Compatibility Toolkit 5.0." [Online]. Available: http://technet.microsoft.com/en-us/library/cc507852.aspx. [Accessed: 10-Aug-2011].
[11] "GOD: A Process Monitoring Framework in Ruby." [Online]. Available: http://godrb.com. [Accessed: 26-Feb-2012].
[12] "Instruments User Guide." [Online]. Available: https://developer.apple.com/library/mac/#documentation/developertools/conceptual/InstrumentsUserGuide/AboutTracing/AboutTracing.html. [Accessed: 26-Feb-2012].
[13] D. Jayathilake, "A mind map based framework for automated software log file analysis," in International Conference on Software and Computer Applications, Kathmandu, 2011, pp. 1-6.
[14] D. Jayathilake, "A novel mind map based approach for log data extraction," in 6th IEEE International Conference on Industrial and Information Systems, Peradeniya, 2011, pp. 130-135.
[15] "Simple Declarative Language: A simple universal language for describing typed data in lists, maps and trees." [Online]. Available: http://sdl.ikayzo.org/display/SDL/Home. [Accessed: 15-Jan-2012].