Model-Based Performance Testing for Robotics Software Components

Johannes Wienke∗‡, Dennis Wigand∗‡, Norman Köster‡, and Sebastian Wrede∗‡
Email: {jwienke,dwigand,nkoester,swrede}@techfak.uni-bielefeld.de
∗ Research Institute for Cognition and Robotics (CoR-Lab), Bielefeld University, Germany
‡ Center of Excellence Cognitive Interaction Technology, Bielefeld University, Germany

Abstract—In complex technical systems like robotics platforms, a multitude of issues can impair dependability. While common testing and simulation methods largely focus on functional aspects, the utilization of resources like CPU, network bandwidth, or memory is only rarely tested systematically. With this contribution we propose a novel Domain-Specific Language (DSL) for modeling performance tests for individual robotics components with the aim of establishing a systematic testing process for detecting regressions in resource utilization. This DSL builds upon a testing framework from previous research and aims to significantly reduce the effort and complexity of creating performance tests. The DSL is built using the MPS language workbench and provides a feature-rich editor with modern editing aids. An evaluation indicates that developing performance tests requires only one third of the work compared to the original Java-based API.

I. INTRODUCTION

With the advances in hardware solutions and software capabilities, robots are becoming more important in many application areas. Consequently, as people and businesses increasingly rely on their robots, dependability requirements rise. However, the complexity of robotics systems is also increasing to cope with more flexible and complex scenarios and application areas. Ensuring an appropriate level of dependability in such an environment is a challenging task, and new techniques are required and evolving. Nevertheless, at least in some areas like research, robots are still known to be error-prone. One potential reason for this is that currently applied verification techniques like unit and integration testing [2] mainly focus on functional properties while disregarding nonfunctional aspects. One of these nonfunctional aspects is the utilization of system resources such as CPU, network bandwidth, and memory. Resource utilization is one important aspect of the performance of a system [3].

In previous research [4, 5] we have specifically focused on controlling the resource utilization of robotics software components, which has not yet received much attention in the robotics (research) community. Still, surveys have shown that such issues exist [2, 6], and unplanned changes in the utilization of system resources can severely impact the dependability of systems. For instance, a component wasting more memory than expected might eventually result in other components crashing or in processing delays caused by the swapping behavior of the host system.

Depending on the application domain, the outcomes of these issues might range from increased power consumption or spoiled data set trials in a research setting to severe injuries in safety-critical domains.

To address these issues, we have previously proposed a framework for testing individual robotics software components regarding their resource utilization characteristics (not task performance) [5]. This framework generates middleware inputs for individual vertical components [7] based on an abstract specification of actions to perform. These actions are then executed for a set of varying parameters with the aim of simulating different loads and usage scenarios. A limited set of actions has been identified that was sufficient to create performance tests for a range of different robotics components. During test execution, the tested component is instrumented to record time series of its resource utilization. An analysis tool can compare these time series from different revisions of a component and detect changes in the resource utilization with the aim of identifying unintended changes. The whole tooling can be used with common automation solutions like the Jenkins continuous integration server.

From the perspective of a developer or tester who creates performance tests with this framework, the most labor-intensive task is writing the specification of actions that generate the intended inputs for the tested component. The testing framework has been implemented as a Java API as a compromise between language and ecosystem complexity and the ability to generate high computation loads.
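The following minimal Java sketch illustrates the kind of resource-utilization instrumentation described above: it samples a resident set size (RSS) time series for a tested process by polling /proc on Linux. It is only an illustration of the concept, not the framework's actual recorder; the class name and the fixed page size are assumptions made here.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.List;

    public class RssSampler {

        // Typical Linux page size in bytes; an assumption, could instead be
        // queried at runtime (e.g., via getconf PAGE_SIZE).
        private static final long PAGE_SIZE = 4096;

        // Samples the resident set size (in bytes) of the given PID every
        // intervalMs milliseconds, count times, and returns the time series.
        public static List<Long> sampleRss(final long pid, final long intervalMs,
                final int count) throws IOException, InterruptedException {
            final List<Long> series = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                // The second field of /proc/<pid>/statm is the resident set size in pages.
                final String statm = new String(
                        Files.readAllBytes(Paths.get("/proc/" + pid + "/statm")));
                final long residentPages = Long.parseLong(statm.trim().split("\\s+")[1]);
                series.add(residentPages * PAGE_SIZE);
                Thread.sleep(intervalMs);
            }
            return series;
        }

        public static void main(final String[] args) throws Exception {
            // Usage: java RssSampler <pid>
            System.out.println(sampleRss(Long.parseLong(args[0]), 500, 20));
        }
    }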

new RpcAction(
    new StaticData(new ParticipantValue(remoteServer)),
    new CreateTrainEvent(),
    StaticData.newString("capture"),
    StaticData.newNumber(10000L));

Listing 1. Code-level constructs necessary to use static data in test action specifications (StaticData and ParticipantValue).

Even though Java is an easy language to learn and maintain and is sufficiently efficient to execute performance tests mimicking high loads, the language – by design – has a restricted feature set. Consequently, the abstract specification of the test cases is often mixed with code-level constructs required to achieve different features. Several of these constructs could be hidden in more flexible programming languages, but not with the simplicity of Java. Listing 1 shows an example of such constructs that are required to implement a parameter abstraction and a change detection mechanism. The necessity for code-level constructs ultimately slows down the creation of performance tests and reduces their readability.

Domain-Specific Languages (DSLs) [8] form one way to overcome the presented issues by providing an abstraction above the general-purpose programming language tailored for a specific use case. Therefore, with this contribution we present a DSL tailored to the needs of performance testing for robotics components. Tests formulated using this DSL specify high-level behavior and parameters for generating test loads in a concise language, which can be transformed into calls to the Java testing framework API. That way, performance tests are represented as abstract models of the test procedure, and creation, maintenance, and analysis are fostered. This follows the general ideas of model-based testing [9]. In the broad scope of this discipline, our approach realizes the "generation of test scripts from abstract tests" [9, p. 7] flavor defined by Utting and Legeard [9].

In the following, we will first review related work regarding performance testing in general and DSLs for testing purposes. Afterwards, the structure of the existing testing framework will be briefly explained to introduce the underlying concepts of the domain and to identify points that can be addressed by a DSL. Then, the construction of the DSL will be explained in detail and an evaluation will demonstrate the benefits of the DSL compared to the pure usage of the testing framework.

II. RELATED WORK

In contrast to robotics, other domains have established more robust strategies for testing the performance of their applications. This has most notably happened for large-scale enterprise systems with the Application Performance Management (APM) idea [10]. In this domain, performance testing is generally performed on the level of the whole application instead of individual components. Refer to Jiang and Hassan [11] for a survey on solutions available in this domain.

Regarding the way performance test cases are defined in this domain, several open-source tools can be analyzed. Tsung [12] uses an XML configuration file to specify the interactions and load profiles to impose on different kinds of web servers, whereas Apache JMeter [13] provides a graphical user interface for this purpose. Other tools like Locust [14], NLoad [15], The Grinder [16], or Chen et al. [17] provide an API using a general-purpose programming language to define the test cases. Most related to our approach in this category of tools is Gatling [18], where the Scala language is used to provide an internal DSL for test description.

A model-based alternative for specifying general tests is the UML Testing Profile (UTP) [19], a proposal for an application-agnostic language to define general software tests. Test behavior is modeled using extensions for UML behavioral diagrams (sequence, state machine). Because of the high abstraction level of UTP, in-depth knowledge about UML is required to use the language effectively, and acceptance is limited [20].

From the area of model-based performance testing, Jayasinghe et al. [21] present an approach targeting cloud environments. All aspects of test cases are specified using XML files, and a multi-stage transformation process to executable test artifacts is realized using XSLT. In this domain, several DSL-based approaches also exist. Sun et al. [22] introduce the GROWL DSL for configuring web server load tests. A JSON-inspired notation is used to specify structural aspects of test cases; behavioral specifications do not exist. GROWL specifications are eventually transformed into configuration files for Apache JMeter. A comparable solution is presented in Cunha et al. [23], where the CRAWL DSL resembles YAML notation instead of JSON and the generation target is a custom test execution framework. Dunning and Sawyer [24] present a DSL for specifying load tests for data management solutions. Here, a declarative programming language tailored to the specific needs of the application scenarios has been designed which covers all aspects required to execute a load test; test cases are declared in a single configuration file. Another DSL for performance testing of web applications was proposed by Bui et al. [25]. The developed language targets the declaration of load tests and their workloads and is transformed into C# code for the Microsoft Visual Studio load testing infrastructure. With a slightly different focus, the Wessbas-DSL presented in Hoorn et al. [26] is used to specify probabilistic performance tests and their load profiles based on Markov chains. Tests are generated towards a custom extension for Apache JMeter enabling Markov-based testing. Finally, Bernardino Da Silveira et al. [27] introduce the Canopus DSL for performance testing of server systems. Based on a prose-like representation, performance tests are specified as virtual users using the tested system. Multiple aspects of the DSL allow modeling how the system is monitored, which scenarios are tested, and how virtual users behave.

To our knowledge, no other performance testing framework exists for the robotics domain. Additionally, a recent survey on DSLs in robotics does not list a single DSL for this purpose [28]. Most of the aforementioned approaches address testing complete applications based on HTTP interactions or comparable protocols. The testing behavior in this case is often described by a combination of the URL to load, potential request parameters or POST data, and rates or probabilities for these interactions with the tested server. Some tools further allow formulating more complex behaviors of virtual users with multiple sequential interactions, called sessions [e.g., 18, 27]. However, more flexible declarative behavior specifications can rarely be found, and targeting individual components is out of scope for existing tools.

Fig. 1. Basic structure of the performance testing framework. (Class diagram relating ParameterProvider, ParameterSet, Parameter, TestCase, TestPhase, and Action; the relations are described in Section III.)

public interface Action {
    ReturnType execute(ParameterSet parameters);
}

public class Loop implements Action {
    public Loop(final Action action, final Action iterations);
}

Listing 2. Idealized Action interface and an exemplary implementation.
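As a brief usage sketch combining this idealized interface with the constructs from Listing 1 (that RpcAction and StaticData implement Action is an assumption made here for illustration):

    // Nest the RPC call from Listing 1 inside a Loop action to repeat it;
    // the iteration count is itself supplied as an Action via StaticData.
    final Action rpcCall = new RpcAction(
            new StaticData(new ParticipantValue(remoteServer)),
            new CreateTrainEvent(),
            StaticData.newString("capture"),
            StaticData.newNumber(10000L));
    final Action repeatedCalls = new Loop(rpcCall, StaticData.newNumber(100L));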

While UTP could be applied in such cases, further tooling would be necessary to execute the tests, and the steep learning curve would prevent easy application in the fast-paced robotics research process.

III. PERFORMANCE TESTING FRAMEWORK

In this section we will briefly summarize the structure and concepts of the performance testing framework that are necessary to understand the design of the DSL. Moreover, we will outline key issues regarding code construction and readability that will be addressed with the implemented DSL. For further details, please refer to the framework publication [5].

Fig. 1 visualizes the basic structure of the performance testing framework. Inside this framework, a component is tested using one or several test cases. A test case is separated into several test phases that structure the interactions with the tested component into identifiable parts. Each test phase contains a tree of actions that describe the behavior of the performance test and how it interacts with the tested component. The behavior description is abstract in the sense that no concrete values for arguments like rates, data sizes, or times are specified directly. Instead, actions refer to parameters related to the test case. At test runtime, these parameters will be filled with different values. For this purpose, a parameter provider is responsible for generating values (parameter sets) for the relevant parameters of the test case. The complete test case will be executed for all available parameter sets. This enables a distinct description of the test behavior (actions) and the load profile (parameters), which can even be reconfigured without having to change the behavior description.
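To make the relations from Fig. 1 concrete, the following idealized Java sketch (in the same spirit as Listing 2, with interface and classes shown together for brevity) suggests how these concepts could fit together. The class names follow the figure, but all signatures and bodies are assumptions for illustration only, not the framework's actual API.

    import java.util.List;

    public interface ParameterProvider {
        // Generates the parameter sets for which the complete test case is executed.
        Iterable<ParameterSet> parameterSets();
    }

    public class TestPhase {
        private final String name;
        private final Action root; // root of the action tree executed in this phase

        public TestPhase(final String name, final Action root) {
            this.name = name;
            this.root = root;
        }

        public void execute(final ParameterSet parameters) {
            root.execute(parameters);
        }
    }

    public class TestCase {
        private final List<TestPhase> phases;
        private final ParameterProvider provider;

        public TestCase(final List<TestPhase> phases, final ParameterProvider provider) {
            this.phases = phases;
            this.provider = provider;
        }

        // Runs every phase once for each generated parameter set, keeping the
        // behavior description (actions) separate from the load profile (parameters).
        public void run() {
            for (final ParameterSet parameters : provider.parameterSets()) {
                for (final TestPhase phase : phases) {
                    phase.execute(parameters);
                }
            }
        }
    }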


new ProtobufData( new StaticData