
A Protocol for Systematic Literature Review on Architecture-Centric Software Evolution Research

Pooyan Jamshidi, Mohammad Ghafari, Aakash Ahmad, Claus Pahl

This document describes the protocol for the systematic literature review on architecture-centric software evolution research. For more detail please visit: http://www.computing.dcu.ie/~pjamshidi/SLR-ACSE.html

Lero - The Irish Software Engineering Research Institute
Faculty of Engineering and Computing, School of Computing
Dublin City University
10/1/2012

Overview

Version 1.0: 2011/09/01
Version 2.0: 2012/10/01

Citation: P. Jamshidi, M. Ghafari, A. Ahmad, C. Pahl, A Protocol for Systematic Literature Review on Architecture-Centric Software Evolution Research, Technical Report, Lero - The Irish Software Engineering Research Centre, Dublin City University, Oct. 2012.

Auxiliary materials and data: http://www.computing.dcu.ie/~pjamshidi/SLR-ACSE.html

An essential element in conducting a systematic review is to establish a protocol for the study. The protocol defines in advance how the systematic review is to be conducted. Such a definition is necessary to structure the work and important to avoid bias. This document describes the protocol for the systematic literature review on architecture-centric software evolution (ACSE) research, which started in Sept. 2011. The protocol description is organized in 3 sections as follows. In Section 1 we discuss the background and justify our study. Section 2 gives a general overview of the research methodology. In Section 3 we explain the research planning, which consists of the questions we address, the search strategy and scope we define, the data items that need to be collected and, finally, a brief description of the data synthesis we perform.

Contents

Overview
1. BACKGROUND AND JUSTIFICATION
2. RESEARCH METHODOLOGY
3. PLANNING OVERVIEW
3.1 Research Questions
3.2 The Scope of Study
3.3 Data Extraction and Synthesis
3.4 The Framework Entities and Attributes
Acknowledgements
4. REFERENCES

1. BACKGROUND AND JUSTIFICATION

As illustrated in Figure 1, a causally connected architecture model, specified as S, conceptually represents a running software system A embedded in an environment with domain assumptions D, such that the requirements R are satisfied. This can be formally expressed as $S, D \models R$. Software evolution deals with violations of this correctness criterion after A is embedded in the environment and starts to operate [1]. A violation may occur as a result of (i) a deviation of the system from the specification S, (ii) a deviation of the environment behavior from the specification D, or (iii) a deviation of the system objective from the requirement R. Traditionally, a software system is taken down so that a patch can be applied offline during maintenance [2]. However, this scenario cannot satisfy the requirements of emerging mission-critical software systems, which must operate continuously and remain capable of on-the-fly changes to the running system in response to a violation [3]. Based on the classification in the context of Figure 2, we are interested in reasoning about the evolutionary aspects enabled by formally expressing system models based on architectural specifications. A software architecture model therefore provides the required abstraction and facilitates reasoning about the evolution of a system expressed by a model (S, D) in a formal theory (e.g. a Markov model) and about requirements specified in a formal language (e.g. temporal logic). To put the above concepts into a concrete definition, we consider ACSE to be a collection of operational and analytical activities that evolve a software system from an older version to a new version and that are enabled by architecture changes.

Among the various distinctive approaches that use architecture descriptions to operationalize evolution [4], we are interested in those that utilize a formal theory enabling analytical reasoning, either to verify specification-time evolution or to reason about system properties at run-time when the system is dynamically reconfigured.

Figure 1. Correlation of the conceptual framework and our classification criteria of ACSE

2. RESEARCH METHODOLOGY

Our study followed the principles of a quasi-systematic review, complemented by a seminal guideline and the documented lessons learned in [5]. However, we extended this approach so that it can accommodate a classification scheme comprising quantitative and qualitative data points. We then needed to map the selected studies into the framework comprising these data points. In doing so, we took into account the recommended steps for thematic analysis in software engineering [6]. Figure 2 shows an overview of the three-phased methodology we applied in the study.

Figure 2. A layered view of our research methodology

A. Literature Extraction and Investigation: Four researchers were involved in the literature study. In review planning (Planning in Fig. 1), the review protocol was defined. It includes the research questions, the search strategy, and the initial version of the classification scheme, from which the data items to be collected were derived. The protocol was defined interactively and iteratively by the team of reviewers. The research questions express the research topics of interest in this literature review. As the search strategy, we combined automatic with manual search. Automatic search was defined as a two-step process for which two categories of search strings were defined. The first category aims to select the studies on architectural constraints that have been formally specified, and the second category aims to filter the studies on architecture-based evolution. For the manual search, inclusion and exclusion criteria were defined. Next, the classification framework was defined, and for each theme of the framework some sub-themes and attributes were defined. The definition of data items was based on information derived from literature sources, specifically the works of [7, 8, 2, 3] and the work of [9] for consolidating the framework, and on experience with a preceding systematic review whose data is available in [10]. For some of the data items, additional attributes were introduced during a trial run on 10 relevant and comparatively heterogeneous papers, which helped us to iteratively validate the taxonomical scheme and to synchronize the understanding of concepts among the researchers involved. We then exercised the search strings against these 10 papers as a validation set to test whether the strings capture all of them. The internal validation process took roughly one month. The protocol was cross-checked by an external reviewer, and the feedback was used to make small adaptations. For instance, the reviewer pointed out that “we need to consider the data items regarding research method and evaluation settings of the included paper in our report because it can reflect the quality of the classified papers”.

Subsequently, two of the four researchers conducted the review and the mapping to the scheme (Executing in Fig. 1). Before proceeding to the review, the two researchers finalized the scheme based on the data collected during the trial run in the previous phase. Then, studies were automatically selected based on the search criteria defined in Phase 1. One reviewer was responsible for the automatic search. Manual search was performed by the other researcher, who checked each paper independently against the inclusion/exclusion criteria to select the studies for answering the research questions of the study.
Once the primary studies were selected, each study was read in detail by one reviewer, who extracted the data according to the data model of the classification framework (item C) and entered them into data items with predefined lists of options in an Excel sheet. Collected data items were cross-checked by the other reviewer, and any disagreements and conflicts were resolved. When no agreement could be reached, independent research in similar literature was the main strategy for reaching a consensus between the two reviewers. Note that in this case, the lists of options of the disputed data items were usually extended. Finally, the data derived from the primary studies was synthesized, collated, classified and summarized to answer the 5 research questions. One of the reviewers coordinated the writing of this review report (Reporting in Fig. 1), in close consultation with the other reviewers.

B. Data Validation and Synthesizing: The classification scheme evolves during data extraction, for example by adding new themes/attributes or by merging and splitting existing ones. In this step, we used Excel sheets to document the data extraction process. When the reviewers entered the data of a paper into the scheme, they provided a short rationale for why the paper should be in a certain category (for example, why the paper contributes to runtime evolution). This rationale is used for internal validation purposes. The external validation was conducted by a researcher outside the working group, who provided constructive feedback on the classification scheme and the initial review data. From the final scheme, the frequencies of publications in each category can be calculated. The analysis of the results focuses on presenting these frequencies. This makes it possible to see which categories have been emphasized in past research and thus to identify gaps and possibilities for future research. The quantitative attributes are used for thematic classification analysis and the qualitative attributes for narrative comparative analysis.

C. ACSE Taxonomical Classification: We utilized a combination of pre-existing ACSE classifications and thematic analysis to reduce the time needed to develop the classification scheme and to ensure that the scheme takes the existing studies into account. First, the reviewers read the abstracts of the 10 initially selected papers most related to the research questions and looked for segments of text, keywords and concepts that reflect the contribution of each paper. While doing so, the reviewers also identified the context of the research. Next, the sets of keywords from the different papers were labeled, overlaps were reduced, and the keywords were combined to develop a high-level understanding of the nature and contribution of the research. Classification reasoning and intuition were used iteratively to determine which data “make sense to be with each other” when grouping them together. This helps the reviewers arrive at a set of recurrent keywords that is representative of the underlying population. When the quality of an abstract is too poor to allow meaningful keywords to be chosen, reviewers can also study the introduction or conclusion sections of the paper. Once a final set of keywords has been chosen, the keywords can be clustered to create a model of higher-order themes. Precise definitions of the terminology used for both headings and selectable options were provided to mitigate the risk of misunderstanding or bias.
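The frequency analysis described in step B can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling (the protocol used Excel sheets): the function names and the four coded papers are invented for the example, while the sub-theme and option names are taken from the classification framework.

```python
from collections import Counter


def category_frequencies(coded_papers, sub_theme):
    """Count how many papers were coded with each option of one sub-theme."""
    counts = Counter()
    for paper in coded_papers:
        # A paper may carry several options; a set counts each option once per paper.
        counts.update(set(paper.get(sub_theme, [])))
    return counts


def uncovered_options(counts, options):
    """Options that no paper was coded with, i.e. candidate research gaps."""
    return [option for option in options if counts[option] == 0]


# Hypothetical coded papers, invented for illustration only.
coded_papers = [
    {"id": "P1", "Time of Evolution": ["Run-Time"]},
    {"id": "P2", "Time of Evolution": ["Specification-Time"]},
    {"id": "P3", "Time of Evolution": ["Run-Time"]},
    {"id": "P4", "Level of Automation": ["Manual"]},
]

counts = category_frequencies(coded_papers, "Time of Evolution")
print(dict(counts))
print(uncovered_options(counts, ["Specification-Time", "Run-Time"]))
```

Options with a zero count are the ones a reviewer would inspect first when looking for under-explored categories.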

3. PLANNING OVERVIEW

During planning, the protocol for the review is defined, which includes three main steps: (i) specify research questions, (ii) define search strategy and scope, and (iii) define data items.

3.1 Research Questions

We formulated the general goal of the study through PICOC (Population, Intervention, Comparison, Outcome and Context) perspectives [11]:

Population: existing research efforts in ACSE, categorized as “Type of Evolution”, “Type of Specification”, “Type of Constraints”, “Runtime Issues” and “Infrastructure”;
Intervention: taxonomical classification criteria for ACSE research, internal/external validation of the review protocol including the classification framework, extraction of the contributory elements of the literature and their mapping to the classification scheme, and synthesis on the populated scheme;
Comparison: a comparison among the population (5 categories) by mapping the literature to the classification framework;
Outcome: a classification framework, a mapping of the literature to the scheme, and future research opportunities for the ACSE community;
Context: a systematic investigation to consolidate the academic, peer-reviewed research for ACSE researchers.

The central research question translates to five concrete questions:

RQ1: What types of evolution are supported in ACSE? This question seeks insight into the types of evolution proposed or considered by researchers in ACSE, following four perspectives of evolution: “what”, “when”, “where”, and “why”.

RQ2: What degrees of formalism and expressiveness are required to specify software architecture to enable ACSE? This question seeks insight into the usage of formal methods by researchers in ACSE. It aims to assess which languages, and at what level of expressiveness, have been used for modeling architectural constraints, verifying properties and supporting automation.

RQ3: What types of architectural constraints are specified in architectural models in ACSE? This question seeks to understand why formal methods have been used in ACSE approaches. Concretely, we aim to assess the types of architectural constraints considered in existing ACSE research.

RQ4: What types of execution environments and mechanisms are needed to enable run-time aspects of ACSE? This question investigates what kinds of execution environments or dynamic reconfiguration functionalities have been used in the ACSE literature.

RQ5: What tool support is available for ACSE? This question investigates what kinds of automation facilities have been used to provide guarantees about concerns in ACSE.

3.2 The Scope of Study

Once we had defined the entities and attributes of interest for the framework, the literature extraction process was driven by search terms structured into a logical expression. These search terms were combined using the Boolean OR and AND operators, which resulted in a total of 14 × 11 = 154 search strings for finding relevant studies.
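The count of 154 search strings is consistent with pairing every term of a 14-term group with every term of an 11-term group. The sketch below illustrates that reading only; the actual search terms are not listed in this protocol, so the placeholder terms, the use of AND between the two groups, and the function name are all assumptions.

```python
from itertools import product


def build_search_strings(first_group, second_group):
    """Pair every term of the first group with every term of the second using AND."""
    return [f'"{a}" AND "{b}"' for a, b in product(first_group, second_group)]


# Placeholder terms: the real terms are not given in the protocol.
constraint_terms = [f"constraint-term-{i}" for i in range(1, 15)]  # 14 terms (assumed group)
evolution_terms = [f"evolution-term-{i}" for i in range(1, 12)]    # 11 terms (assumed group)

search_strings = build_search_strings(constraint_terms, evolution_terms)
print(len(search_strings))
print(search_strings[0])
```

Generating the strings programmatically keeps the set complete and free of duplicates, which matters when the same strings are applied to several databases.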

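| No | Name | Results |
|----|------|---------|
| 1 | ACM | 663 |
| 2 | IEEE | 1149 |
| 3 | Science Direct | 194 |
| 4 | Web Of Science | 480 |
| 5 | Springer Link | 445 |
| 6 | Pro-Quest | 231 |
| 7 | Google Scholar | 460 |
| 8 | Wiley | 516 |
|   | Total | 4138 |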

Figure 3. A summary of search strings and results

After applying the 154 search strings on Google Scholar, IEEE, ACM DL, Springer Link, Science Direct, Wiley Inter-Science, Pro-Quest, and ISI Web of Science, we extracted 4138 manuscripts (Table 2). Because we applied our search criteria to "title and abstract", the search results contained a relatively high number of irrelevant studies. These databases were chosen because they provide the most important and highest-impact full-text journals and conference proceedings, covering the software engineering field in general. Once the initial set of publications had been identified, duplicate publications were removed. The publications were then checked against the inclusion and exclusion criteria. Removing irrelevant and duplicate publications left 235 publications. We recorded the selected papers using the Zotero bibliographic management tool (http://www.zotero.org/).

After further filtering by reading titles and abstracts, 154 publications were left for full-text screening. We ranked them by investigating whether their contents focused on:

- formal software architecture description and structures;
- architectural constraints that are formally represented as a part of the description;
- software architecture constraint evolution and constraints that are represented during evolution;
- formal analysis related to architectural constraint preservation;

and whether the data analysis of the study is rigorous and based on evidence or theoretical reasoning instead of non-justified or ad-hoc evaluations. Additionally, all the references of the 154 studies were checked to make sure that no relevant paper was missed. Some authors report similar results in multiple publications; we eliminated such repeats to avoid biasing the results. In the end, 60 studies with the highest ranks were identified as the final studies after the first search process.

There were some general inclusion/exclusion criteria: the studies must be peer reviewed, published in English-language conferences and journals, and should not be a letter, editorial, short paper, position paper, or white paper. We explicitly defined the following inclusion/exclusion criteria in the study protocol; additional details are available in [10]:

Inclusion criterion 1: The study formalizes (at least a part of) the architecture description as well as architectural properties in terms of constraints which need to be analyzed during evolution of the system. Rationale: we might otherwise have included studies which employ some formal terms, but do not actually employ formal methods for a particular purpose related to ACSE.

Inclusion criterion 2: The study must be evaluated rigorously and based on evidence or theoretical reasoning instead of non-justified or ad-hoc validations. Rationale: we identified a number of approaches which are in their initial stages of development and are mostly based on intuitive examples or preliminary user-based evaluation.

Exclusion criterion 1: A study that is an editorial, abstract or a short paper. Rationale: these studies do not provide a reasonable amount of information.

Exclusion criterion 2: A study that focuses on formal techniques themselves, rather than on the use of formal methods for ACSE concerns. Rationale: these studies do not provide information regarding the research questions.

Exclusion criterion 3: A study that is a review paper, whether a systematic review, a mapping study or a narrative review. Rationale: these studies do not propose any specific approach from which we can extract the data items.
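The explicit inclusion and exclusion criteria above can be read as a simple decision procedure. The following Python sketch is one possible encoding under stated assumptions: the `Study` fields, the `screen` function and the order in which criteria are checked are hypothetical, and in the protocol the judgments behind each flag were made manually by a reviewer.

```python
from dataclasses import dataclass


@dataclass
class Study:
    # Hypothetical fields; each flag records a reviewer's manual judgment.
    formalizes_architecture_and_constraints: bool  # inclusion criterion 1
    rigorously_evaluated: bool                     # inclusion criterion 2
    publication_type: str = "full paper"           # checked by exclusion criterion 1
    formal_technique_only: bool = False            # exclusion criterion 2
    is_review: bool = False                        # exclusion criterion 3


EXCLUDED_TYPES = {"editorial", "abstract", "short paper"}


def screen(study):
    """Return (included, reason); exclusion criteria are checked first (an assumption)."""
    if study.publication_type in EXCLUDED_TYPES:
        return False, "exclusion criterion 1"
    if study.formal_technique_only:
        return False, "exclusion criterion 2"
    if study.is_review:
        return False, "exclusion criterion 3"
    if not study.formalizes_architecture_and_constraints:
        return False, "inclusion criterion 1 not met"
    if not study.rigorously_evaluated:
        return False, "inclusion criterion 2 not met"
    return True, "included"


print(screen(Study(True, True)))
print(screen(Study(True, True, is_review=True)))
```

Returning the reason alongside the decision mirrors the protocol's practice of recording a short rationale for each classification choice.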

Table 1. ITEMS TO RANK STUDY QUALITY

General items: G [12, 13, 14]

1. Problem definition of the study. Options are:
   (2) The authors provide an explicit problem description for the study.
   (1) The authors provide a general problem description.
   (0) There is no problem description.
2. Environment in which the study was carried out. Options are:
   (1) The authors provide an explicit description of the environment in which this research was performed (e.g., lab setting, as part of a project, in collaboration with industry, etc.).
   (0.5) The authors provide some general words about the environment in which this research was performed.
   (0) There is no description of the environment.
3. Research design of the study refers to the way the study was organized. Options are:
   (2) The authors explicitly describe the plan (different steps, timing, etc.) they have used to perform the research, or the way the research was organized.
   (1) The authors provide some general words about the research plan or the way the research was organized.
   (0) There is no description of how the research was planned / organized.
4. Contributions of the study refer to the study results. Options are:
   (2) The authors explicitly list the contributions/results of the study.
   (1) The authors provide some general words about the study results.
   (0) There is no description of the research results.
5. Insights derived from the study. Options are:
   (2) The authors explicitly list insights/lessons learned from the study.
   (1) The authors provide some general words about insights/lessons learned from the study.
   (0) There is no description of the insights derived from the study.
6. Limitations of the study. Options are:
   (2) The authors explicitly list the limitations/problems with the study.
   (1) The authors provide some general words about limitations/problems with the study.
   (0) There is no description of the limitations of the study.

Specific items: S

1. Main focus of the paper. Options are:
   (2) Application of formal theory to enable architectural property analysis
   (1) Architecture-centric evolution and property analysis
   (0.5) Architectural evolution
   (0) Formal theory to specify architecture without any evolutionary purposes
2. The architecture descriptions presented in the studies. Options are:
   (2) Formally specified by a formal theory
   (1) Specified by an ADL
   (0) Specified informally
3. Architectural properties in terms of constraints. Options are:
   (2) Formally specified by a formal language
   (1) Specified by a constraint language
   (0) Specified informally
4. Evaluation of the study. Options are:
   (2) Theoretical reasoning is part of the evaluation
   (1) Some informal evidence is provided
   (0) Non-justified or ad-hoc validations

Ranking formula = ⌈ ( … ) ( … ) ⌉

4 = Quality papers; 3 = Acceptable; 2, 1, 0 = Excluded
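The scoring options of Table 1 lend themselves to a small validation helper. The sketch below is an illustration only: the item keys and function names are hypothetical, and because the terms that combine the general (G) and specific (S) scores into the final rank are not reproduced here, the code stops at the two subtotals and the stated mapping from rank to decision.

```python
# Allowed scores per item, transcribed from Table 1 (item keys are hypothetical names).
GENERAL_OPTIONS = {
    "problem_definition": {0, 1, 2},
    "environment": {0, 0.5, 1},
    "research_design": {0, 1, 2},
    "contributions": {0, 1, 2},
    "insights": {0, 1, 2},
    "limitations": {0, 1, 2},
}
SPECIFIC_OPTIONS = {
    "main_focus": {0, 0.5, 1, 2},
    "architecture_description": {0, 1, 2},
    "constraints": {0, 1, 2},
    "evaluation": {0, 1, 2},
}


def subtotal(scores, options):
    """Validate one paper's scores against the allowed options and return their sum."""
    if set(scores) != set(options):
        raise ValueError("scores must cover exactly the items of the table")
    for item, value in scores.items():
        if value not in options[item]:
            raise ValueError(f"{value} is not an allowed score for {item}")
    return sum(scores.values())


def decision(rank):
    """Map a final rank (0..4) to the decision stated with the ranking formula."""
    if rank == 4:
        return "quality paper"
    if rank == 3:
        return "acceptable"
    if rank in (0, 1, 2):
        return "excluded"
    raise ValueError("rank must be an integer between 0 and 4")


best_general = {item: max(allowed) for item, allowed in GENERAL_OPTIONS.items()}
best_specific = {item: max(allowed) for item, allowed in SPECIFIC_OPTIONS.items()}
print(subtotal(best_general, GENERAL_OPTIONS), subtotal(best_specific, SPECIFIC_OPTIONS))
```

Rejecting scores outside the predefined options catches data-entry errors before they distort the ranking.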

3.3 Data Extraction and Synthesis

The data extraction and synthesis process was carried out by reading each of the 60 papers thoroughly and extracting relevant data, which were managed through Zotero. To keep the information consistent, the data extraction for the 60 studies was driven by the framework outlined in Table 1 and maintained in Excel spreadsheets (submitted as an appendix file). The general bibliographic data was extracted automatically by applying an extraction algorithm to the Zotero database, which made the extraction process less laborious and more precise. One of the authors initially reviewed the included papers, extracted the necessary information and populated the framework sheets; the other authors then thoroughly evaluated the result. The results of the synthesis will be described in the subsequent sections.

3.4 The Framework Entities and Attributes

The starting point of the classification scheme was the bottom-up thematic analysis described in Section 2; however, the discussion in Section 5.1 can be taken as a top-down justification of the framework. Based on that discussion, we understand that ACSE has different dimensions. As conceptually outlined in Figure 2, the architecture of the system needs to be specified formally (Type of Specification), and for analytical purposes the properties of the system are also specified, as a number of constraints, as part of that architecture description (Type of Constraints). An evolution mechanism (Type of Evolution) can analyze the specified changes and apply them either at design-time or at runtime (Runtime Issues). For reasons of scalability, performance and economy, these activities require automation (Tool Support), as illustrated in Figure 3. As far as the mentioned goals are concerned, and by referring to Figure 4, we divide the basic characteristics of ACSE approaches into the following five themes, as presented in Table 1:

(i) Type of Evolution. Characterizes taxonomical classification of the when, what, why, and how aspects of ACSE.
(ii) Type of Specification. Concentrates on linguistic and description aspects of ACSE.
(iii) Type of Constraints. Concentrates on reasoning and analytical aspects in ACSE activities.
(iv) Runtime Issues. Concentrates on runtime mechanisms and infrastructures to enable ACSE.
(v) Tool Support. Concentrates on the automation support to enable ACSE.

Table 2. The ACSE classification framework

| Taxonomical Classification (Higher-Order Theme) | Subclassifications (Sub-themes) | Coded Attributes (Domain Knowledge, Standards, Keywords) |
|---|---|---|
| Type of Evolution | Need for Evolution | Corrective, Perfective, Adaptive, Preventive, All applicable |
| Type of Evolution | Mean of Evolution | Transformation, Refactoring, Refinement, Restructuring, Adaptation, Reconfiguration, Pattern |
| Type of Evolution | Time of Evolution | Specification-Time, Run-Time |
| Type of Evolution | Support Activity | Consistency checking, Impact analysis, Evolution test, Propagation, Versioning |
| Type of Evolution | Stage of Evolution | Analysis & Design, Implementation, Integration & provisioning, Deployment, Execution, Evolution |
| Type of Specification | Level of Formalism | Informal, Semi-Formal, Formal |
| Type of Specification | Type of Formalism | Modeling language: Graph, Petri-net, Ontology, ADL, Description logic, π-calculus, Prog. lang., Domain-specific, Alloy, Larch, State machine, Z, Constraint automata, Process algebra, Type systems, FSP, CSP, CHAM, Archface, Model-based; Constraint specification language: OCL, CCL, FOL, Grammars, Temporal logics, Rules |
| Type of Specification | Description Language | Process algebra: Darwin, Wright, LEDA; Standards: UML, Ex-UML, SysML, AADL; Others: ACME, Aesop, C2, MetaH, Rapide, SADL, UniCon, Weaves, Koala, xADL, ADML, AO-ADL, xAcme |
| Type of Specification | UML Specification | Activity, State, Sequence, Class, Component, Object, Transition, Communication |
| Type of Specification | Description Aspect | Structural, Behavioral, Semantic |
| Type of Constraint | Mean of Constraint | Pattern, Architectural style, Primitive, Metaphor, Component-level invariant, Cross-component invariant, Micro-structure, Crosscutting concern, Clue, Meta-model, Pre-Post-condition, Variability, Coordination, Temporal invariant, Coding rules, Cardinality |
| Type of Constraint | Intent of Constraint | Specify, Preserve, Change, Enforce, Match, Discover, Analyze |
| Type of Constraint | Analysis of Constraint | Consistency checking after evolution, Model checker, Pattern conformance, Graph-based refinement, Constraint enforcement |
| Run-Time Issue | Environment of Runtime Evolution | Fractal, KOALA, MS COM, OSGi, SOFA 2.0, CCM, EJB, JavaBeans, OpenCOM, KobrA, .NET, SIENA, SOA middleware, Reo Engines |
| Run-Time Issue | Mean of Runtime Evolution | Reflection, Introspection, Constraint injection, State transfer, Intercession, Reification, Causal connection, Safe stopping, Runtime binding |
| Tool Support | Need for Tool Support | Business case, Creating architecture, Documenting, Analyzing, Evolving |
| Tool Support | Analysis Usage of Tool Support | Simulation, Dependence analysis, Model checking, Conformance testing, Interface consistency, Inspection and review-based |
| Tool Support | Level of Automation | Fully automated, Partially automated, Manual |
| Research Method | Research Motivation | A particular problem/challenge, Overview or Survey, Formalism for constraint specification, Formalism for architectural analysis, Formalism for arch. evolution, Formalism for code generation |
| Research Method | Application Domain | SPL, OO, SOA, CBS, Embedded, Ubiquitous, Mission-critical, Real-time, Process-aware, Distributed, Event-based, Concurrent, Mechatronic, Mobile, Robotic, Cloud computing, Smart-*, Autonomic computing, Grid computing |
| Research Method | Evaluation Method | Case study, Mathematical proof, Example application, Experience report |
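Because every data item has a predefined list of options, the framework can be represented as a nested mapping and used to check extracted data. The sketch below covers only a small excerpt of the framework; the `FRAMEWORK` layout, the record format keyed by (theme, sub-theme) pairs and the function name are assumptions made for illustration.

```python
# Excerpt of the classification framework: theme -> sub-theme -> allowed coded attributes.
FRAMEWORK = {
    "Type of Evolution": {
        "Need for Evolution": {"Corrective", "Perfective", "Adaptive", "Preventive", "All applicable"},
        "Time of Evolution": {"Specification-Time", "Run-Time"},
    },
    "Type of Specification": {
        "Level of Formalism": {"Informal", "Semi-Formal", "Formal"},
    },
    "Tool Support": {
        "Level of Automation": {"Fully automated", "Partially automated", "Manual"},
    },
}


def invalid_entries(record):
    """Return (theme, sub_theme, value) triples that the framework excerpt does not allow."""
    problems = []
    for (theme, sub_theme), values in record.items():
        allowed = FRAMEWORK.get(theme, {}).get(sub_theme)
        for value in values:
            if allowed is None or value not in allowed:
                problems.append((theme, sub_theme, value))
    return problems


# Hypothetical extraction record for a single paper.
record = {
    ("Type of Evolution", "Time of Evolution"): ["Run-Time"],
    ("Type of Specification", "Level of Formalism"): ["Formal", "Rigorous"],
}
print(invalid_entries(record))
```

A value flagged here is either a data-entry error or a sign that the list of options needs to be extended, which is how disputed data items were handled during the review.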

The groupings and their dimensions are shown in Table 1. Note that the proposed classification is used for the comparison of ACSE approaches based on our focus, which was discussed in the previous section, and should not be adopted without justification or verification. Firstly, we deliberately did not include all aspects of software architecture evolution. For example, the who question, which identifies the stakeholders involved in software architecture change, is not covered in the current classification because it reflects human aspects of evolution, whereas our focus on rigorous formal theory, analytical reasoning based on architectural constraints, and runtime issues aims to lessen human intervention. The where question is also excluded since it is too detailed for our aim. More importantly, it has been thoroughly covered in [2] in terms of specification-time evolution and in [3] in terms of runtime reconfiguration. Secondly, the proposed classification provides only one of the many ways in which ACSE can be grouped. Most importantly, the classification is subject to continuous evolution and extension, since the elements that it classifies continue to mature due to scientific and technological advances in the field of software engineering.

An external validation of the classification schema (by active researchers and domain experts) suggested that: (i) data items regarding the research method and evaluation settings of the included papers should be included in the classification, even though this is a rather general category that is orthogonal to the technical classification; (ii) some of the higher-order categories and their sub-classifications need to be renamed to properly reflect the whole category and to convey a standard definition; (iii) coded attributes with a comparatively large number of options should be grouped into a manageable number of options, so that they are more comprehensible and orthogonal to each other, with the least commonality.

Acknowledgements

This research work was supported, in part, by Science Foundation Ireland grant 10/CE/I1855 to Lero, the Irish Software Engineering Research Centre (www.lero.ie). The authors would like to thank Dr. Jim Buckley for evaluating the review protocol as an external researcher and domain expert.

4. REFERENCES

[1] Calinescu, R., Ghezzi, C., Kwiatkowska, M., Mirandola, R. Self-Adaptive Software Needs Quantitative Verification at Runtime, doi:10.1145/2330667.2330686.
[2] Medvidovic, N., Taylor, R. N. 2000. A Classification and Comparison Framework for Software Architecture Description Languages. IEEE Trans. Software Eng. 26(1): 70-93.
[3] Bradbury et al. 2004. A Classification of Formal Specifications for Dynamic Software Architectures.
[4] Mens, T., Magee, J., Rumpe, B. 2010. Evolving software architecture descriptions of critical systems. Computer, 43(5): 42–48.
[5] Brereton, P., Kitchenham, B., Budgen, D., Turner, M., Khalil, M. 2007. Lessons from applying the systematic literature review process within the software engineering domain. Journal of Systems and Software, 80.
[6] Cruzes, D. S., Dyba, T. 2011. Recommended Steps for Thematic Synthesis in Software Engineering, ESEM.
[7] Breivold, H. P., Crnkovic, I., Larsson, M. 2011. A systematic review of software architecture evolution research, Journal of Information and Software Technology, doi: 10.1016/j.infsof.2011.06.002.
[8] Williams, B. J., Carver, J. C. 2010. Characterizing software architecture changes: A systematic review. Information & Software Technology 52(1): 31-51.
[9] Buckley, J., Mens, T., Zenger, M., Rashid, A., Kniesel, G. 2005. Towards a taxonomy of software change. Journal of Software Maintenance 17(5): 309-332.
[10] Jamshidi, P., Ghafari, M., Ahmad, A., Pahl, C. A Framework for Classifying and Comparing Architecture-Centric Software Evolution Research [Online: Auxiliary Materials], http://www.computing.dcu.ie/~pjamshidi/SLR-ACSE.html.
[11] Petticrew, M., Roberts, H. 2006. Systematic reviews in the social sciences: a practical guide. Oxford: Blackwell.
[12] Dyba, T., Dingsøyr, T. 2008. Empirical studies of agile software development: A systematic review, IST, vol. 50, pp. 833–859.
[13] Weyns, D., Iftikhar, U., Malek, S., Andersson, J. 2012. Claims and supporting evidence for self-adaptive systems, Software Engineering for Adaptive and Self-Managing Systems.
[14] Riaz, M., Mendes, E., Tempero, E. 2009. A Systematic Review of Software Maintainability Prediction and Metrics, ESEM.