2015 IX Brazilian Symposium on Components, Architectures and Reuse Software
A Robust Software Product Line Architecture for Data Collection in Android Platform
Gustavo M. Waku, Edson R. Bollis, Cecilia M. F. Rubira and Ricardo da S. Torres
University of Campinas (UNICAMP), Institute of Computing, Campinas, São Paulo - Brazil
Email:
[email protected],
[email protected],
[email protected],
[email protected]

Abstract: Android is an open platform, developed by Google and the Open Handset Alliance, targeting mobile devices. The constant evolution and falling cost of these devices have made them suitable for complex applications, especially data collection applications. Data collection is a domain that has evolved to use mobile devices to collect information in different fields of study, including physical and social sciences, humanities, business, demographic surveys, agriculture, biology, and geology. Android usually runs on different hardware and software domains with similar functional and non-functional features. The data collection domain has many sub-domains, creating an opportunity to explore software variability and quality properties such as reliability, availability, and data integrity using Component-Based Development (CBD), fault tolerance techniques, and Software Product Lines (SPL) with Aspect-Oriented Software Development (AOSD). However, the use of mobile applications for data collection poses challenges such as severe hardware restrictions (for example, limited processing power and short battery lifetime), and the use of sophisticated techniques can negatively affect application performance and quality properties. This work addresses these issues by proposing a robust SPL architecture called Robust SPL for Data Collection (R-SPL-DC) and a real application for the data collection domain called E-Phenology Collector, in order to assess the use of fault tolerance techniques, CBD, SPL, and AOSD to ensure availability, reliability, and data integrity without significant impact on the overall performance of the mobile device. The results show that the use of R-SPL-DC is promising and suits the requirements of the data collection domain.
978-1-4673-9630-1/15 $31.00 © 2015 IEEE. DOI 10.1109/SBCARS.2015.14

I. INTRODUCTION

Android is an open platform, developed by Google and the Open Handset Alliance, targeting mobile devices. The increasing processing power and constant cost reduction of these devices have made them very popular, allowing them to be equipped with Wi-Fi, 2G/3G/LTE antennas, cameras, accelerometers, GPS, Bluetooth, and high-speed processors. On the one hand, this ecosystem has grown considerably and attracted many companies to develop applications whose requirements are becoming more sophisticated, especially in terms of quality attributes such as reliability, availability, and data integrity. On the other hand, Android usually runs on different hardware and software domains with similar functional and non-functional features, creating an opportunity to explore software variability, defined as the locations where the software can be configured [1]. Software Product Line (SPL) is an approach to software development that allows variability management by reusing a set of core assets to obtain a family of similar products. Different software engineering techniques, such as Component-Based Development (CBD) and Aspect-Oriented Software Development (AOSD), support SPL development, coping with software complexity by supporting modularity. In the literature, some studies have been conducted using mobile platforms in industrial deployment scenarios based on real user requirements, but none of them discusses an integrated approach with CBD, AOSD, and SPL on the Android platform [2] [3] [4].

Data collection is a domain that has evolved to use mobile devices to collect information. It is defined as the process of gathering and measuring information on variables of interest in an established, systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes. Data collection is common to all fields of study, including physical and social sciences, humanities, business, demographic surveys, agriculture, biology, and geology. The main goal of all data collection is to capture quality evidence that then translates into data analysis and allows the building of credible answers to the questions that have been posed [5]. In the literature, many mobile data collectors have been developed for surveying and monitoring environmental changes [6]. In this context, the SPL approach can support the systematic reuse of software assets to generate applications for these different sub-domains of data collection.

However, the mobile domain poses challenges such as limited processing power and short battery lifetime, while the data collection domain poses other challenges, such as exposure to weather and the risk of data loss. Therefore, the ability to continue operating even in the presence of faults is an essential requirement for mobile mission-critical data collection applications, and redundancy techniques should be properly applied to mobile software architectures in order to support fault tolerance. Moreover, the use of advanced software techniques, such as AOSD, CBD, and SPL, for structuring fault-tolerant mobile applications could produce suboptimal code that negatively affects application performance.
This work proposes a robust SPL architecture for data collection called R-SPL-DC. Robustness [7] is the delivery of a correct service in implicitly-defined adverse situations arising from an uncertain system environment, covering system internal faults and environmental faults in the data collection domain (e.g., hardware breakage). The R-SPL-DC handles quality properties such as availability, data integrity, reliability, and performance. It supports: (i) the use of multiple devices to cope with long data collection sessions, (ii) data synchronization to share the information among devices, (iii) storage redundancy to prevent data loss, and (iv) different storage format types to share the same database with third-party applications. In addition, basic operations (such as loading information from the server and dispatching information to it) are monitored and blocked (if needed) using AOSD techniques in order to prevent data loss.
The proposed solution was used to derive an application for phenological data collection, called E-Phenology Collector, which was deployed in a real environment and used by biologists for performing data collection in the field. Both laboratory and field scenarios have shown that the application is well suited for data collection.

This document is organized as follows. Section II presents the basic concepts used in this work. Section III presents the proposed solution. Section IV presents a case study using the proposed solution. Section V discusses related work. Section VI presents the conclusions and future research directions.

II. BACKGROUND

A. Data Collection Requirements

To establish the data collection requirements, seventeen existing data collection projects were analyzed using reverse engineering to build a document containing a generic set of requirements. Reverse engineering is the process of analyzing a subject system to identify the system’s components and their interrelationships and to create representations of the system in another form or at a higher level of abstraction. In this paper, the major requirements were extracted from the following projects: Maritaca Project [8], the Decision Support System of Dengue (DDSS) (footnote 1), Design software for field data collection [9], and the e-phenology project [10] (footnote 2).

Footnote 1: http://www.cs.colostate.edu/ddss/index.html (As of July 2015).
Footnote 2: http://www.recod.ic.unicamp.br/ephenology (As of July 2015).

The functional requirements (FR) of data collection are, in most cases, directly linked to the collection activity itself, for example:
[FR1]: The system must store data collected in the field;
[FR2]: The Global Positioning System (GPS) location must be used to show the place where the data collection occurred;
[FR3]: The system must identify the user;
[FR4]: The system must record the date and hour of the data collection;
[FR5]: The system must allow loading information about which data will be collected;
[FR6]: The system must persist the information;
[FR7]: The system must export data to other systems or create an external database representation, such as spreadsheets;
[FR8]: The system must allow the validation of the collected data;
[FR9]: The system must allow the visualization of the current map of the data collection area.

The non-functional requirements (NFR) of data collection are related to quality attributes, such as reliability, availability, data integrity, and performance, which allow the system to run smoothly in this mission-critical environment. The non-functional requirements are:
[NFR1]: The system needs to prevent data loss and data collection service failure, because if the application database is lost or corrupted, the major goal of the system is affected and the system cannot ensure data integrity and availability;
[NFR2]: The system must not be affected by a low battery level on the mobile device, because a typical data collection expedition can take several hours, and while the battery level is low the system will not be available;
[NFR3]: The system must ensure that data collection service failures do not occur, because if the system cannot store and retrieve data correctly it will not be reliable, and if the service is not running the system will not be available;
[NFR4]: The system must ensure that basic data collection operations, such as storing and retrieving information, are performed in less than a second; otherwise, users cannot collect data quickly and the overall performance is compromised.

The major scenarios that must be handled are the following:
[SC1] If a low battery level is reached, the functions that handle data storage must be blocked to avoid data loss. When this happens, a battery exception is thrown (see Section II-B).
[SC2] If data loss caused by an environmental accident involving the phone occurs, or a low battery level is reached, the data collected so far must be transferred to another mobile device using an SD card or Bluetooth and imported into the application, so that the data collection process can be resumed.
[SC3] If the application detects a data collection service failure and stops working, the data collected so far must be stored in another format so that third-party applications can edit these data and the data collection can continue.
[SC4] If the application detects a data collection service failure and provides erroneous data, the collected data stored in another format is imported to provide the correct data to the user.

B. Fault Tolerance

A fault is the root cause of an error, an error is a part of the system state that occurs in the presence of a fault, and a failure is a deviation from the specified service [11]. Fault tolerance is the ability to provide service complying with the specification in spite of faults. Fault tolerance techniques are designed to allow a system to tolerate faults that remain in the system after its development. A key supporting concept for fault tolerance is redundancy, that is, additional resources that would not be required if fault tolerance techniques were not implemented. Redundancy can take several forms: hardware, information, software, and time [11].

There are several types of redundancy, such as Hardware Redundancy (HR) and Data Redundancy (DR). Hardware redundancy is the inclusion of replicated and supplementary hardware in the system. An example of hardware redundancy occurs when a low battery level is reached and two or more mobile devices are available: the user stops the data collection on one device, the data is transferred to and imported into another device, and the data collection then continues. Data redundancy is the duplication of the same data while recording it and can be classified into two types: Data Storage Redundancy (DSR) and Redundancy in Data Structure (RDS). DSR records the same information in different locations, and RDS is the use of different data format types (i.e., data representations) to store data [11] (for example, XML and CSV, all of them representations coming from the same source).
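As a rough illustration of the two kinds of data redundancy described in this section, the following Python sketch writes the same records to two locations (Data Storage Redundancy) and in two representations, CSV and JSON (Redundancy in Data Structure). The function names, file names, directory names, and record fields are hypothetical and are not taken from R-SPL-DC.

```python
import csv
import json
import tempfile
from pathlib import Path

FIELDS = ["individual", "phenophase", "intensity"]  # hypothetical record fields


def save_redundant(records, locations):
    """Write the same records to every location, as CSV and as JSON."""
    written = []
    for location in locations:  # DSR: same data, different places
        location.mkdir(parents=True, exist_ok=True)
        csv_path = location / "collection.csv"
        with csv_path.open("w", newline="") as handle:  # RDS: representation 1
            writer = csv.DictWriter(handle, fieldnames=FIELDS)
            writer.writeheader()
            writer.writerows(records)
        json_path = location / "collection.json"
        json_path.write_text(json.dumps(records))  # RDS: representation 2
        written += [csv_path, json_path]
    return written


def load_first_readable(locations):
    """Recover records from the first JSON copy that can still be parsed."""
    for location in locations:
        try:
            return json.loads((location / "collection.json").read_text())
        except (OSError, ValueError):
            continue  # copy missing or corrupted: try the next location
    raise RuntimeError("no readable copy left")


def demo():
    """Corrupt the 'internal' copy and recover from the 'sdcard' copy."""
    records = [{"individual": "A1", "phenophase": "flowering", "intensity": 2}]
    with tempfile.TemporaryDirectory() as root:
        internal, external = Path(root, "internal"), Path(root, "sdcard")
        save_redundant(records, [internal, external])
        (internal / "collection.json").write_text("{corrupted")
        return load_first_readable([internal, external])
```

Calling `demo()` returns the original records even though the first copy was damaged, which is the effect redundancy is meant to provide; a real mobile application would use the platform storage APIs instead of temporary directories.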
Another way to handle faults and exceptional situations satisfactorily in software is exception handling. Exception handling is the ability to detect errors and return to normal operation. Examples of exceptions that require correct handling in the data collection domain are: save data exception (thrown when errors occur while saving data); battery exception (thrown when the application does not have sufficient battery to finish a particular functionality); and load data exception (thrown when loading data from the server does not finish correctly).
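The three domain exceptions named above could be organized as a small hierarchy with a handler that returns the application to normal operation. The Python sketch below is only an illustration: the class names, the `collect` function, the status strings, and the 10 percent default threshold are assumptions made for the example, not the authors' implementation.

```python
class DataCollectionError(Exception):
    """Hypothetical base class for the domain exceptions named in the text."""


class SaveDataError(DataCollectionError):
    """Errors while saving data."""


class BatteryError(DataCollectionError):
    """Not enough battery to finish a particular functionality."""


class LoadDataError(DataCollectionError):
    """Loading data from the server did not finish correctly."""


def save_record(store, record, battery_percent, minimum_percent=10):
    """Save one record, raising a domain exception when it cannot be done."""
    if battery_percent < minimum_percent:
        raise BatteryError(f"battery at {battery_percent}%")
    if "individual" not in record:
        raise SaveDataError("record has no individual identifier")
    store.append(record)


def collect(store, record, battery_percent):
    """Handle domain exceptions and report a status instead of crashing."""
    try:
        save_record(store, record, battery_percent)
        return "saved"
    except BatteryError:
        return "blocked: switch device"
    except SaveDataError:
        return "rejected: fix record"
```

Because every domain exception derives from one base class, a caller that does not care about the specific cause can also catch `DataCollectionError` alone.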
There are properties related to the use of systems that fault tolerance methods help to improve. Data integrity is a property of the system related to the non-occurrence of improper data changes [12], that is, the ability to store data correctly, making sure it is neither lost nor incompletely stored. Availability is the property related to readiness for usage [12], i.e., the system should be operable most of the time in the field. Reliability is the property of a system providing the correct requested service [12]. The use of fault tolerance techniques increases the processing required and may affect the quality attribute called performance, which is defined as the timing behavior of the system [13].

C. Fault Model for Mobile Domain

A fault model gathers the main faults of a domain into a hierarchical structure of fault classes [14]. The fault model for the mobile domain is defined as follows:
(a) Low Battery Level: It reduces system availability while providing the desired service. In the case of data collection, a low battery level is a problem that prevents the usage of certain functions. The faults in this class are the following: data cannot be collected, and operations such as data synchronization, data loading from the server, and data dispatching to the server cannot be performed;
(b) Data Collection Service Failure: Sometimes, operations contain errors that stop the system and reduce its availability, or process the data incorrectly and reduce its reliability. This disrupts data collection and can make the system unusable. In this case, the fault is part of the system and may appear under specific conditions. The faults in this class are: the system stops working because of a failure, the system cannot access the data, the system generates a database in an invalid state, and the system processes data erroneously;
(c) Data Loss: Data loss can occur through a software error or an accident in the field. Mobile devices such as tablets and mobile phones are prone to data loss, because they were not meant to work in hostile environments. The faults in this class are: breakage of the mobile device (e.g., the device falls into water) and corruption of the database.

D. Aspect-Oriented Software Development

Aspect-Oriented Software Development (AOSD) is a software development technique that modularizes crosscutting concerns (i.e., features scattered in different places, for example, logging, exception handling, and authentication) through the use of abstractions called aspects. Aspects work with three mechanisms: joinpoints, pointcuts, and advices. Joinpoints are identifiable points of code execution, pointcuts are sets of joinpoints, and advices represent the implementation that is activated when the code reaches the pointcuts [15]. In this work, the notion of concern is equivalent to feature. Aspect-oriented implementations exist for several programming languages, such as AspectJ and CaesarJ for Java and AspectC++ for C++. The Android platform uses Java as its primary language; however, its development tools do not officially support AspectJ, so a custom modification of the build environment is required to allow AOSD on this platform. This modification can be done using Ant scripts or by modifying the Eclipse project to support the AspectJ plugin.

E. Software Product Line Development

1) Terminology: A Software Product Line (SPL) is a set of software-intensive systems sharing a common, managed set of features that satisfy the specific needs of a particular market segment or mission and that are developed from a common set of core assets in a prescribed way [16]. Large-scale reuse is achieved through a Product Line Architecture (PLA) that is common to a variety of similar products in terms of their architectural elements [16]. The SPL lifecycle is divided into two major activities: domain engineering and application engineering. Domain engineering focuses on the creation of the core assets that will be reused in the applications. Application engineering is responsible for adapting the core assets and deriving application products from the core assets developed during domain engineering. Different software engineering techniques support SPL development, such as component-based and aspect-oriented development.

2) Domain Analysis: In domain engineering, a critical task is obtaining the core assets. To do this, Kang et al. [17] proposed the use of feature models to identify commonalities and variabilities of a family of systems in a particular domain. The feature model is a hierarchical structure that represents the features of a particular domain. Features are distinguishable characteristics of a concept (e.g., system, component, and so on) that are relevant to some stakeholder of the concept [18]. Kang et al. [17] proposed creating a feature model using the following steps: (1) collecting source documents, (2) feature identification, (3) feature grouping and model building, (4) feature classification, and (5) model validation. In (1), all sources (including web pages, documents, and manuals) are consulted; in this work, documents from other similar projects were collected as a source. In (2), features are identified. In (3), name conflicts are resolved and features are grouped hierarchically in a consists-of relationship; for example, the Dispatching feature consists of two features: Single Server Dispatch and Multiple Server Dispatch. Based on this relationship, the feature model is built for the data collection domain. In (4), features are classified according to whether they are activated at compile time, load time, or runtime. In our model, all the features are activated at compile time. In (5), experts and users should be consulted to validate the model.
Fig. 1: Partial Feature Model for the Robust SPL for Data Collection.

3) Aspect-Oriented Analysis: AO-Analysis [19] is a method that aims to identify crosscutting concerns through the use of use cases and traceability matrices. The output of this method is a complementary view of the feature model called the aspect-feature view. Use cases that are frequently “extended” or “included” by others are candidates for crosscutting concerns. The traceability matrix looks for the impact of crosscutting concerns on existing software features and other crosscutting concerns. Based on this matrix, an aspect-feature view is created, in which each base software feature is drawn as a square and each crosscutting feature is drawn as a diamond (see Figure 2).

4) Architectural Design: Aspect-Oriented Feature Architecture Mapping (AO-FArM) [19] is a method to obtain the software architecture from the feature model and the aspect-feature view. It proposes four transformations of the feature model: (T1) removal of non-architecture-related features and resolution of quality features, (T2) transformations based on architectural requirements, (T3) transformations based on interacts relations, and (T4) transformations based on hierarchy relations. In (T1), all non-architecture-related features are removed, for example, the top-level feature model name. In (T2), if there is an architectural requirement such as SSL security, it is added to the feature model. In (T3), the feature model is analysed considering the dependencies between features, i.e., interacts relations. An interacts relation can express that a certain feature A uses, modifies, contradicts, crosscuts, or precedes another feature B. If A contradicts B, they must be implemented together. If a feature C uses another feature D, this is reflected in the architecture with component C requiring an interface from component D. If a feature C crosscuts a feature D, then in the architecture component C provides an interface with behavior (i.e., advices, see Section II-D) and component D provides an interface specifying which methods will be crosscut (i.e., pointcuts, see Section II-D). In (T4), all the hierarchy relations are checked. Each must be one of the following: a feature that specializes, is part of, or is an alternative to another. To preserve fine-grained mapping, the number of features implemented in a component should be low. At this point, all the major components and aspects have been identified, and the interfaces of each component are then specified.

5) Architectural Implementation: COSMOS*-VP [20] [21] is a component model for realizing the architectural design that combines AOSD and components while preserving component information hiding. With this model, the component is still used as a black box, but it defines two types of interfaces: AbstractAspects and XPIs. AbstractAspects are aspects that hold the implementation (advice), i.e., the behavior that will be activated, whereas XPIs hold the points inside the component (i.e., pointcuts) at which the code inside AbstractAspects can be activated. In other words, the XPI acts as an interface, allowing the execution of AbstractAspects but only at specific points. To bind the AbstractAspects (behavior) to the XPIs (pointcuts), the model uses Connector-VPs, architectural connectors that specify which behavior (AbstractAspect) is bound to which pointcuts (XPI). Architectural connectors bind interfaces from one component to another, working similarly to the Adapter design pattern. It is important to note that AbstractAspects and XPIs are patterns for structuring the code in an existing aspect-oriented programming language (e.g., AspectJ on the Android platform).
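The division of roles between XPIs (pointcuts), AbstractAspects (advice), and Connector-VPs (binding) can be imitated in a few lines of Python. This is only an analogy under stated assumptions: COSMOS*-VP is realized with AspectJ, whereas the sketch below uses plain method wrapping, and the names `Loading`, `LOADING_XPI`, `logging_advice`, and `connect` are invented for the example.

```python
import functools


class Loading:
    """Base component; it knows nothing about the aspects applied to it."""

    def load_from_server(self):
        return "loaded"

    def describe(self):
        return "Loading component"


# XPI role: the component publishes which of its methods may be crosscut.
LOADING_XPI = {"load_from_server"}


# AbstractAspect role: behaviour (advice) that does not know where it runs.
def logging_advice(log):
    def advice(method_name):
        log.append(f"before {method_name}")
    return advice


# Connector-VP role: binds one advice to the pointcuts exposed by one XPI.
def connect(component_class, xpi, advice):
    for name in xpi:
        original = getattr(component_class, name)

        # Default arguments freeze the current method for this wrapper.
        def wrapped(self, *args, _original=original, _name=name, **kwargs):
            advice(_name)  # run the advice first, then the original method
            return _original(self, *args, **kwargs)

        setattr(component_class, name, functools.wraps(original)(wrapped))
```

After `connect(Loading, LOADING_XPI, logging_advice(log))`, calls to `load_from_server` are logged while `describe`, which the XPI does not expose, is left untouched. This mirrors the point that advice may run only at the points the component chooses to publish.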
III. THE PROPOSED SOLUTION

The proposed solution is divided into two parts: (a) generation of the feature model and the aspect-feature view (described in Sections II-E2 and II-E3), and (b) design of the Robust SPL architecture using the AO-FArM method (described in Section II-E4).

A. Feature Model and Aspect-Feature View

Based on the data collection requirements (described in Section II-A), the fault model (Section II-C) is created. The data collection feature model is then devised based on functional and non-functional requirements, integrating some possible exceptions of the domain and fault tolerance techniques. After that, the aspect-feature view is obtained to identify crosscutting concerns in the domain using AO-Analysis (Section II-E3).

Figure 1 presents the feature model for the data collection domain. The Mobile Number feature represents the number of mobile devices that are taken to the field. It depends on the number of hours that the system needs to work and on the hardware configuration of the mobile device. The Loading feature is responsible for loading information from previous data collections. The data can optionally be edited (Data Edition), and loading can occur in three different categories (i.e., types): Identified Individual with Previous History, Mission (a collection of identified individuals), or Expedition (a collection of missions). The Dispatching feature can be performed in two ways: using a Single Server or using Multiple Servers. The Data Storage feature concentrates the rules for recording data. It always records data in the internal memory (using SQLite) and optionally records the information in the external memory (SDCard) in a specific format (i.e., using the Storage Format Type feature, which represents the file formats in which the information can be stored, for example, Excel, CSV, JSON, or XML). The Login feature authenticates the user. The Data Synchronization feature represents the data synchronization that is needed in the field, i.e., the information shared among devices, to allow the user to switch from one device to another. It can be realized using Wi-Fi, Bluetooth, or data stored on another device (by means of the Storage Format Type feature). The Data Collection feature represents the data acquisition. It may use the optional Data Validation feature, which checks data consistency according to the business rules (for example, whether the data is outdated or whether it makes sense in a given context).

The Data Collection feature uses the Data Storage feature to record (i.e., store) data in either the phone memory or the external memory. Visualization of Collected Data is an optional feature that implements different formats to visualize the data, for example, using maps.
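To make the constraints of such a feature model concrete, the sketch below checks a product configuration against a small, assumed subset of the model. The text does not state every constraint, so the choice of mandatory features, the "exactly one dispatch mode" rule, and the requires relations are assumptions made for illustration; the names `MANDATORY`, `ALTERNATIVES`, `REQUIRES`, and `check_product` are hypothetical.

```python
# Assumed subset of the feature model; these rules are illustrative only.
MANDATORY = {"Data Collection", "Data Storage", "SQLite"}

# If the parent is selected, exactly one of its alternatives must be selected.
ALTERNATIVES = {"Dispatching": {"Single Server", "Multiple Servers"}}

# Selecting the key requires every feature in the value.
REQUIRES = {
    "SDCard": {"Storage Format Type"},
    "Data Validation": {"Data Collection"},
}


def check_product(selected):
    """Return a list of problems; an empty list means the product is valid."""
    selected = set(selected)
    problems = []
    for feature in sorted(MANDATORY - selected):
        problems.append(f"missing mandatory feature: {feature}")
    for parent, options in sorted(ALTERNATIVES.items()):
        if parent in selected and len(options & selected) != 1:
            problems.append(f"{parent} needs exactly one of {sorted(options)}")
    for feature, needed in sorted(REQUIRES.items()):
        missing = needed - selected
        if feature in selected and missing:
            problems.append(f"{feature} requires {sorted(missing)}")
    return problems
```

A check of this kind would run at product derivation time, which is consistent with all features of the model being bound at compile time.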
In the data collection domain, two crosscutting concerns were identified: Block Usage and Battery Monitor. Block Usage is a crosscutting feature that blocks the usage of some base-level features (such as data collection and loading data from the server) depending on a configurable battery level (e.g., 10 percent). If the battery level is lower than this value, these features are not allowed to execute and a battery exception is thrown. Battery Monitor logs the current battery level into the internal database after each “loading” or “pushing data” operation involving the server. Figure 2 shows the impact of the crosscutting features (Block Usage and Battery Monitor) on the base features (Loading, Dispatching, Data Collection, Visualization of Collected Data).

Fig. 2: Aspect-Feature View of the Robust SPL for Data Collection.

Figures 1 and 2 show the mapping of fault tolerance techniques to one or more features in order to ensure quality properties. For instance, Hardware Redundancy is implemented by the Mobile Number, Data Synchronization, Block Usage, and Battery Monitor features to ensure availability. Data Storage Redundancy is implemented by the Data Storage and Data Synchronization features to ensure data integrity. Redundancy in Data Structure is applied in the Storage Format Type feature to ensure reliability. The exception handlers are implemented in the base features to tolerate domain exceptions and ensure availability and reliability: the Loading, Dispatching, Data Storage, and Data Collection features use the battery exception handler; the Data Storage and Data Synchronization features use the save data exception handler; and the Loading feature uses the load data exception handler.

B. Robust SPL Architecture

Figure 3 shows the R-SPL-DC. Each component has a specific responsibility and maps closely to the feature model. The Mobile Number feature does not have a corresponding component in the architecture, because it represents a choice of the number of hardware devices used in the data collection. The Data Collection feature was mapped to the Data Collection component. The Data Collection feature uses the Data Validation feature, which was mapped to the Data Validation component; therefore, the Data Collection component uses the Data Validation component. Note that the Data Validation feature is optional, which means that it can be added or removed depending on the requirements of a specific application.

Fig. 3: Robust SPL Architecture for Data Collection Domain.

The Data Storage, Internal Phone Memory, SQLite, Storage Redundancy, and SDCard features were mapped to the Data Storage component, which uses the Storage Format component since the Data Storage feature uses the Storage Format feature. The Dispatching feature and its sub-features were mapped to the Dispatching Data component. The Data Synchronization, Mobile Number, Wi-Fi, Bluetooth, and Storage Redundancy features were mapped to the Data Synchronization component. The Loading feature and its sub-features were mapped to the Loading Data component. The Visualization of Collected Data feature was mapped to the Visualization of Collected Data component. The Login feature was mapped to the Login component.
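The behavior of the two crosscutting features, Block Usage and Battery Monitor, can be approximated with Python decorators wrapped around a base operation. This is a sketch under assumptions, not the AspectJ code used in R-SPL-DC: the `Device` class stands in for the platform battery API, the log is a plain list rather than the internal database, logging happens when the function is activated, and all names are hypothetical. The 10 percent default follows the example threshold given in the text.

```python
import functools


class BatteryError(Exception):
    """Raised when an operation is blocked because the battery is too low."""


class Device:
    """Hypothetical stand-in for the platform battery API."""

    def __init__(self, percent):
        self.percent = percent


def battery_monitor(device, log):
    """Crosscutting behaviour: record the battery level on each activation."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            log.append((func.__name__, device.percent))
            return func(*args, **kwargs)
        return wrapper
    return decorate


def block_usage(device, minimum_percent=10):
    """Crosscutting behaviour: refuse to run below the configured level."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if device.percent < minimum_percent:
                raise BatteryError(f"{func.__name__} blocked at {device.percent}%")
            return func(*args, **kwargs)
        return wrapper
    return decorate


def build_dispatch(device, log):
    """Assemble a base operation with both crosscutting behaviours applied."""
    @battery_monitor(device, log)
    @block_usage(device)
    def dispatch(records):
        return len(records)  # placeholder for sending records to the server
    return dispatch
```

The base `dispatch` function contains no battery logic at all; both concerns are attached from outside, which is the separation that the aspect-feature view is meant to capture.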
initially registered in paper spreadsheets. Now it needs to be collected using mobile devices. The data is collected in a remote location, which is subject to the influence of network availability, creating good conditions to derive an application to address these problems and exercise our proposed architecture.
The crosscutting feature Block Usage from the aspectfeature view (Figure 2) was mapped to Block Usage component, which is responsible for blocking functions such as loading, collecting, and dispatching data to the server. Another crosscutting feature from the same image: Battery Monitor was mapped to Battery Monitor component, which logs the current battery level when a function is activated and estimate the remaining usage time.
A. Research Questions This case study has three major objectives. The first objective is to exercise the proposed architecture, deriving an application for data collection domain that has genuine requirements and users, showing that the proposed solution works in real world environment. The second objective is to understand if components and aspects will cause an overhead in performance in Android that will prevent the user to perform any of the basic operations for data collection. The third objective is to understand if the applied fault-tolerance techniques will effectively mitigate the faults in the field, preserving the information and allowing the user to switch from one mobile device to another and continue data collection from this point on.
In the R-SPL-DC, the Data Storage component provides two APIs: one providing the Data Storage Redundancy, and another not providing Data Storage Redundancy. Therefore the variability of fault tolerance depends on the component client, i.e., it is up to the component client to decide to use or not the redundancy. For some applications, the data collection occurs in the city with high network availability, therefore the application does not need to store with redundancy in mobile. The application could store data locally only and use the Dispatching component to send data to the server in real time.
The research questions that are needed to be understood are the following: 1) Since Android has limited resources, can aspects, components, and fault-tolerance techniques be used in this environment without degrading the experience of the user? In Android, if the user experiences slow operations a message will pop-up in the screen asking the user to close the application. 2) Based on the proposed architecture, can one generate applications with variability in fault tolerance? 3) Are the fault-tolerance techniques applied satisfactory to allow the users to collect data in the field? Typically the data collection takes 8 hours, during this period, will the system be ready for use? 4) How can one integrate fault tolerance techniques to generate a mobile software architecture?
C. Robust SPL Implementation details The SPL was implemented using the COSMOS*-VP model, where each component and each architectural connector were mapped to a package. The Data Storage component creates internal mobile database using SQLite and delegates to the Storage Format component the creation of the CSV files (using SuperCSV3 opensource library). The Synchronization component takes advantage of the native core Android framework to share files among applications. The Loading and Dispatching components use JSON (JavaScript Object Notation4 ) protocol (with Jersey5 RESTful webservices library) to exchange information with the server. The BatteryMonitor and BlockUsage components “inject” code in components (Loading, Data Collection, Dispatching and Visualization of Collected Data, see figure 3) using AspectJ (through the use of COSMOS*-VP, i.e., AbstractAspects and XPIs - described in section II-E5). IV.
A CASE STUDY: E-PHENOLOGY COLLECTOR
The e-phenology project is a multidisciplinary project combining research in Computer Science and Phenology. It aims to study periodic animal and plant life-cycle events and how these are influenced by seasonal variations in climate. Its objective is to use technology to study local environmental changes. The project has three main goals: the use of new technologies to monitor phenology; the creation of a protocol for monitoring phenology in Brazil; and the provision of studies about the collected data [22]. The collected data include on-the-ground observations of phenological properties of about 2,000 plant individuals, observed on a monthly basis since 1994. These data have been
B. Preparation
The on-the-ground observations are collected in expeditions. These expeditions are planned on a monthly basis and typically consist of a small group of 3 to 6 biologists who go to the field and collect the data by observing each plant and registering the intensity of phenophases (e.g., leafing, budding, flowering, ripening) in a spreadsheet.
The planned software development cycle was a mix of prototyping [23] followed by waterfall [23], using the proposed architecture. Therefore, two expeditions were planned. The first expedition aimed to understand most of the functional and non-functional requirements of the e-phenology domain and to identify potential problems in the field, using a prototype that does not have the architecture but has a very well-defined interface. The second expedition aimed to use an application built with the proposed architecture. Later, a defined set of tests validating the major scenarios listed in section II-A was applied to verify whether the resulting software addresses the data collection requirements properly.
3 http://super-csv.github.io/super-csv/index.html (As of July 2015).
4 https://json.org/ (As of July 2015).
5 https://jersey.java.net/ (As of July 2015).
C. Execution and Results
After the first expedition, one finding concerned battery usage: most of the battery consumption was due to screen brightness. Another interesting finding is that storage redundancy was really needed, since the expedition took 8 hours to complete and the battery of a single mobile device did not last that long [NFR2]. These findings were identified through code instrumentation and the native Android monitoring tool. Each operation was instrumented to log battery values before and after its execution, and the average battery value did not decrease significantly due to the execution of these operations. Moreover, the native Android monitoring tool showed that the screen brightness process had the highest battery consumption compared with the other processes.
The second application implemented the proposed architecture. To generate this application, the features from the feature model that could handle all the requirements for this particular domain were first selected. All the features from the feature model were selected, except for the Data Validation and Visualization of Collected Data features. Therefore, the application architecture (Figure 4) has almost the same components as listed in Figure 3, except for the Data Validation component. The rest of the components were implemented.
After the implementation of the second application, the performance requirements were verified using code instrumentation. The major performance constraint was that operations should not last more than 1 second [NFR4], and the average time to perform these operations was lower than 1 s. The application thus satisfied this requirement, even with the presence of fault tolerance techniques and aspects, answering Question 1.
Changes related to fault tolerance variability were implemented, and the application was adjusted according to the requirements of the field expedition, such as data storage redundancy using removable media, synchronization and export of the database between distinct mobile devices, and storage in CSV file format. Using this strategy, if the application is in a failure state, the user can continue data collection using third-party software [NFR1, NFR3]. This variability of fault tolerance can be adjusted using the connectors, easing the creation of other applications with different fault tolerance methods. This answers Question 2 and Question 3.
The second expedition also validated the implemented fault tolerance techniques using four major test scenarios: 1) If the battery level is low, the functions should be blocked to avoid data loss. The data saved so far should be “importable” to another mobile device [SC1]; 2) Save the data on the individuals collected so far, transfer it (using Bluetooth), and import it to another mobile device [SC2]; 3) Save the data in CSV format and continue editing the data in another application [SC3]; 4) Save the data in an external database (SD card), import it into the application, and continue collecting data [SC4].
Fig. 4: Application Architecture for the E-phenology Collector.
These tests have shown that the performance and the use of multiple devices (e.g., two) are satisfactory. The redundancy in storage and synchronization was considered satisfactory and helped to keep the system collecting data while it was being used. This addresses Question 3.
As for the last question (Question 4), this work proposed an integrated solution, shown in Section III.
D. Threats to Validity
For this work, the major threat to validity is the simulation, in the field, of the faults of the fault model. Faults such as database corruption and domain exceptions are difficult to emulate in the field. To mitigate this threat, tests simulating these faults were performed in the laboratory, providing evidence that the fault tolerance methods can handle these situations properly. One example is the deletion of the primary database of the E-Phenology Collector and the subsequent exposure of the failure to the user. When this occurs, the data can be imported from the secondary database (SD card) and data collection can be resumed. To simulate the exceptions, code was inserted to throw exceptions and to observe whether the handlers (i.e., methods) did what was required.
V.
RELATED WORK
A. Software Product Lines in Mobile Platforms
Figueiredo et al. [24] present a comparative study of an SPL called MobileMedia, developed in J2ME.⁶ Two SPLs were implemented using different programming paradigms: Object-Oriented Programming and Aspect-Oriented Programming. Later, different metrics were extracted to evaluate the change impact, cohesion, coupling, and size of the program. The authors concluded that an Aspect-Oriented SPL tends to have a more stable design when implementing optional and alternative features.
Tizzei et al. [20] extend the work of Figueiredo et al. [24] by adding an analysis of pure component-based SPLs and of a hybrid approach (with components and aspects). The combination of aspects and components created an SPL that is more stable in terms of change impact, measured in a predefined set of scenarios.
6 J2ME - Java Platform Micro Edition
The work of Alves et al. [25] presents a method that uses incremental and extractive approaches to game evolution in mobile environments. The authors built an SPL in J2ME that creates product-specific configurations (such as screen size, APIs, and functions). Three game applications were refactored to create a product line that uses aspects to isolate functions and represent specific configurations.
B. Fault Tolerance Applied to Mobile Platforms
Acker et al. [26] propose fault injection software for mobile devices based on a fault model proposed by Cristian [27]. This fault model specifies the communication faults for distributed systems. Acker et al. [26] use message reception and message interference to cause faults in the communication between the server and mobile devices, in order to test the faults specified for distributed systems. As future work, they propose to generate a fault model for mobile devices and to study device behavior based on implementations of fault tolerance techniques.
C. Applications for Data Collection
Hartung et al. [9] describe general-purpose software for field data collection that offers a rapid way to build applications. This software can be used on a variety of mobile operating systems and supports many communication languages for exporting and importing data. That work is related because it presents a general way to generate a product for data collection; however, it does not discuss fault tolerance or methods to protect the collected data.
Dos Santos et al. [8] describe the Maritaca Project, a web-based environment for creating mobile applications that collect data in the field based on questionnaires. The user creates customizable online questionnaires, and the environment automatically generates an application for them. It allows the collection of a wide range of data types, such as sounds, videos, GPS locations, text, and images. The Maritaca Project has a client-server architecture to send and receive data from the mobile application using web services. This work does not use any fault tolerance technique.
There are studies discussing SPLs, fault tolerance, and data collection in mobile platforms, but none of them proposes an integrated SPL for developing fault-tolerant applications that deal with the challenges of data collection. The application product uses the proposed architecture for the phenology domain in the context of the e-phenology project. In addition, none of the SPL studies tried the use of AOSD on the Android platform.
VI.
CONCLUSIONS AND FUTURE WORK
This paper presented a solution to generate a Robust SPL, a Robust SPL Architecture for Data Collection, and an application for the phenology domain called E-Phenology Collector. The SPL architecture can be used to derive various data collection applications. It was created based on existing data collection applications and contains most of the basic features of the data collection domain; therefore, it can generate applications using different combinations of the proposed components.
It can also be configured to allow different levels of fault tolerance depending on network availability, since this configuration is concentrated in the architectural connectors. For example, one can derive an application that uses storage redundancy and another that does not by modifying the architectural connectors.⁷
The application for the phenological domain is a proof of concept providing evidence that the proposed architecture is usable and fit for the purpose of dealing with data collection requirements. This application is in use and was validated by real users (i.e., biologists) performing their daily work. The application source code, documentation, and demo video are publicly available.⁸
As future work, there is an opportunity to understand battery levels in the field while collecting data. This information will allow future benchmarks to create precise models for battery level prediction. In addition, the authors plan to compare the prototype without the architecture and the application that implements the architecture with respect to maintainability, i.e., the lines of code that are modified when a specific change in the requirements occurs (and other metrics, similar to the study of Figueiredo et al. [24]).
7 Definition in section II-E5
8 Available at http://gustavowaku.com/e-phenology.html (As of July 2015)
ACKNOWLEDGMENTS
This research was supported by CNPq, CAPES, FAPESP, and Microsoft Research. This project is partially funded by the FAPESP-Microsoft Virtual Institute (grant 2013/50155-0). The authors would like to thank Professor Leonor Patricia Cerdeira Morellato for the opportunity to apply their ideas in the e-phenology project. They are also grateful for the participation in the European DEVASSES Project (DEsign, Verification and VAlidation of large scale, dynamic Service SystEmS - http://www.devasses.eu/). In addition, they would like to thank the numerous anonymous reviewers who helped to review and complement the ideas of this paper.
REFERENCES
[1] K. Pohl, G. Böckle, and F. J. van der Linden, Software product line engineering: foundations, principles and techniques. Springer Science & Business Media, 2005.
[2] H.-Y. Chen, Y.-H. Lin, and C.-M. Cheng, “Coca: Computation offload to clouds using aop,” in Cluster, Cloud and Grid Computing (CCGrid), 2012 12th IEEE/ACM International Symposium on. IEEE, 2012, pp. 466–473.
[3] Y. Falcone and S. Currea, “Weave droid: aspect-oriented programming on android devices: fully embedded or in the cloud,” in Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering. ACM, 2012, pp. 350–353.
[4] F. Lettner and C. Holzmann, “Sensing mobile phone interaction in the field,” in Pervasive Computing and Communications Workshops (PERCOM Workshops), 2012 IEEE International Conference on. IEEE, 2012, pp. 877–882.
[5] Wikipedia, “Data collection.” [Online]. Available: https://en.wikipedia.org/wiki/Data_collection
[6] H. Pundt, “Field data collection with mobile GIS: Dependencies between semantics and data quality,” GeoInformatica, vol. 6, no. 4, pp. 363–380, 2002.
[7] B. Lussier, R. Chatila, F. Ingrand, M.-O. Killijian, and D. Powell, “On fault tolerance and robustness in autonomous systems,” in Proceedings of the 3rd IARP-IEEE/RAS-EURON joint workshop on technical challenges for dependable robots in human environments, 2004, pp. 351–358.
[8] B. G. dos Santos, A. H. Mamani-Aliaga, and J. V. Sánchez, “Projeto maritaca: Arquitetura e infraestrutura para coleta móvel de dados usando smartphones,” in 31 Simpósio Brasileiro de Redes de Computadores e Sistemas Distribuídos (SBRC), 2013, pp. 1076–1083.
[9] C. Hartung, A. Lerer, Y. Anokwa, C. Tseng, W. Brunette, and G. Borriello, “Open data kit: tools to build information services for developing regions,” in Proceedings of the 4th ACM/IEEE International Conference on Information and Communication Technologies and Development. ACM, 2010, p. 18.
[10] Y. Yu, Y. Wang, J. Mylopoulos, S. Liaskos, A. Lapouchnian, and J. C. S. d. P. Leite, “Reverse engineering goal models from legacy code,” in Requirements Engineering, 2005. Proceedings. 13th IEEE International Conference on. IEEE, 2005, pp. 363–372.
[11] L. L. Pullum, Software fault tolerance techniques and implementation. Artech House, 2001.
[12] J.-C. Laprie, “Dependable computing: Concepts, limits, challenges,” in Special Issue of the 25th International Symposium On Fault-Tolerant Computing, 1995, pp. 42–54.
[13] L. Bass, Software architecture in practice. Pearson Education India, 2007.
[14] F. C. Gärtner, “Fundamentals of fault-tolerant distributed computing in asynchronous environments,” ACM Computing Surveys (CSUR), vol. 31, no. 1, pp. 1–26, 1999.
[15] R. Laddad, Aspectj in action: enterprise AOP with spring applications. Manning Publications Co., 2009.
[16] P. Clements and L. Northrop, “Software product lines: practices and patterns,” 2002.
[17] K. C. Kang, S. G. Cohen, J. A. Hess, W. E. Novak, and A. S. Peterson, “Feature-oriented domain analysis (foda) feasibility study,” DTIC Document, Tech. Rep., 1990.
[18] K. Czarnecki and U. W. Eisenecker, “Generative programming: methods, tools, and applications,” 2000.
[19] L. Tizzei, C. Rubira, and J. Lee, “An aspect-based feature model for architecting component product lines,” in Software Engineering and Advanced Applications (SEAA), 2012 38th EUROMICRO Conference on, 2012, pp. 85–92.
[20] L. P. Tizzei, M. Dias, C. M. F. Rubira, A. Garcia, and J. Lee, “Components meet aspects: Assessing design stability of a software product line,” Information and Software Technology, vol. 53(2), pp. 121–136, 2010.
[21] M. O. Dias, L. Tizzei, C. M. Rubira, A. F. Garcia, and J. Lee, “Leveraging aspect-connectors to improve stability of product-line variabilities,” in VAMOS’10: Proceedings of the 4th International Workshop on Variability Modelling of Software-Intensive Systems, 2010, pp. 21–28.
[22] P. Morellato, B. Alberton, J. Almeida, J. Alex, G. Mariano, and R. Torres, “e-phenology: monitoring leaf phenology and tracking climate changes in the tropics,” in EGU General Assembly Conference Abstracts, vol. 16, 2014, p. 12020.
[23] I. Sommerville, W. Boggs, M. Boggs, B. Bruegge, A. H. Dutoit, W. Boggs, and M. Boggs, “Software engineering 9th edition,” 2011.
[24] E. Figueiredo, N. Cacho, C. Sant’Anna, M. Monteiro, U. Kulesza, A. Garcia, S. Soares, F. Ferrari, S. Khan, F. C. Filho, and F. Dantas, “Evolving software product lines with aspects: An empirical study on design stability,” in ICSE ’08. ACM/IEEE 30th International Conference on Software Engineering, 2008, pp. 261–270.
[25] V. Alves, P. Matos Jr, L. Cole, P. Borba, and G. Ramalho, “Extracting and evolving mobile games product lines,” in Software Product Lines. Springer, 2005, pp. 70–81.
[26] E. V. Acker, T. S. Weber, and S. L. Cechin, “Injeção de falhas para validar aplicações em ambientes móveis,” in Workshop de Testes e Tolerância a Falhas, vol. 11, no. 2010, 2010, pp. 61–74.
[27] F. Cristian, “Understanding fault-tolerant distributed systems,” Communications of the ACM, vol. 34, no. 2, pp. 56–78, 1991.