2015 12th Working IEEE/IFIP Conference on Software Architecture
Towards Architecting for Continuous Delivery
Lianping Chen
Technology Department, Paddy Power, Dublin, Ireland
[email protected]
Abstract—Continuous Delivery (CD) has emerged as an auspicious software development discipline, with the promise of providing organizations the capability to release valuable software continuously to customers. Our organization has been implementing CD for the last two years. Thus far, we have moved 22 software applications to CD. I observed that CD has created a new context for architecting these applications. In this paper, I will try to characterize such a context of CD, explain why we need to architect for CD, describe the implications of architecting for CD, and discuss the challenges this new context creates. This information can provide insights to other practitioners for architecting their software applications, and provide researchers with input for developing their research agendas to further study this increasingly important topic.
Paddy Power is a rapidly growing company in the bookmaking industry, with a turnover of approximately €6 billion and 4,000 employees. It offers its services in regulated markets, through betting shops, telephone, and the Internet. The company heavily relies on an increasingly large number of custom software applications. These applications include websites, mobile applications, trading and pricing systems, feeds distribution systems, and software used in the betting shops. These applications are developed using a wide range of technology stacks, including Java, Ruby, PHP, and .Net. To run these applications, the company has an IT infrastructure, which consists of thousands of servers in different geographical locations.
Keywords—software architecture; continuous delivery; continuous deployment; continuous software engineering; quality attributes; architecturally significant requirements; non-functional requirements; DevOps.
I. INTRODUCTION
These applications are developed and maintained by the Technology Department, which employs approximately 400 people. The size of each software development team varies from two to 26 members, depending on the size and complexity of the application. The majority of the teams have four to eight people.
Continuous Delivery (CD) is a software engineering discipline in which teams keep producing valuable software incrementally in short cycles and ensure that the software can be reliably released at any time [1]. It has emerged as an auspicious approach to provide an organization the ability to rapidly, efficiently, and reliably bring service improvements to market, and eventually stay a step ahead of the competition [2].
The release cycle for each application also varies. Typically, each software application used to have fewer than six releases a year. For each release cycle, the requirements were gathered at the beginning of the cycle. Engineers worked on development for months. Towards the end of the cycle, there was extensive testing and bug fixing. After this, the software was handed over to operations engineers for deployment to production. The deployment process involved many manual activities.
Because of its promising benefits, CD is attracting increasing attention and recognition. Implementing CD is one of the key initiatives for many enterprises. It is also becoming an increasingly popular topic in the research community [3]. We have been implementing CD at Paddy Power, a large bookmaking company, for the last two years. We have moved 22 software applications to CD. I observed that CD has created a new context for architecting these applications.
With this release model, features completed early in the release cycle were artificially delayed. The value that would be generated by these features was therefore lost, and early feedback on these features was not available either.
In this paper, I make an initial attempt to answer the following questions: 1) what are the characteristics of this CD context; 2) why do we want to architect for CD; and 3) what does architecting for CD imply? I will also discuss the challenges that this new context creates.
Furthermore, many releases were a stressful experience, because the release process was not often practiced and there were many error-prone manual activities. It was not uncommon to get priority 1 incidents that were caused by manual configuration mistakes.
This information can provide fellow practitioners with insights for architecting their software applications for CD, and provide researchers with input for developing their research agendas that contribute to solving the various challenges associated with CD.
978-1-4799-1922-2/15 $31.00 © 2015 IEEE DOI 10.1109/WICSA.2015.23
II. ORGANIZATIONAL CONTEXT
Before diving into technical details, I first provide a brief overview of the context of our organization in which the observations were made.
In addition, the release activities were not efficient. It could take up to three weeks just to set up a testing environment.
To improve the situation, an initiative to adopt CD was started. A dedicated team of eight people was established. The team has been working on this for more than two years. So far, we have moved 22 applications to CD. They are developed by one of the largest software development groups. Their main users are business people in the company.
The main reason is the substantial benefits we have observed after moving 22 software applications to CD. I summarize these benefits below; they are described in more detail elsewhere [1].
I have observed that this movement has created a new context for architecting the software applications. In the next section, I characterize this context in terms of its essential practices and principles.
IV. WHY DO WE ARCHITECT FOR CONTINUOUS DELIVERY?
Before discussing the implications of CD to the architecting of software applications, I first address the question: why do we need to architect the applications for CD?
A. Accelerated Time to Market
The release frequency has dramatically increased from once every one to six months to once a week on average. Some applications were released multiple times a day when needed. The cycle time from a user story’s conception to production has been reduced from several months to two to five days.
III. CHARACTERISTICS OF CONTINUOUS DELIVERY
In our company, we define CD as a software engineering discipline in which teams keep producing valuable software incrementally in short cycles and ensure that the software can be reliably released at any time [1]. This definition reflects the characteristics of CD we observed. I describe these characteristics below.
CD enables us to deliver the business value inherent in new software releases to our customers more quickly. This capability helps us to stay a step ahead of the competition, in today’s competitive economic environment.
A. Releasable at Any Time / Frequent Releases
Teams that practice CD ensure the software is releasable at any time. They usually make frequent releases, as often as multiple times a day. This is quite different from the past, where many finished user stories were not ready for release until a big release date. To request an unplanned release, people had to give many days or weeks of notice.
B. Build the Right Product
We observed that frequent releases enable the application development teams to get faster feedback from the users of the applications. The feedback enables the teams to work only on useful features. When a feature is found not to be useful, no further effort is spent on it. This helps the team to build the right product. Previously, it was not uncommon for a team to hear from customers that a feature they had spent months building was not useful.
This characteristic implies the capability to get fast feedback on the release-readiness of the software application, whenever changes are made to it. This also implies that once the software passes all stages of the CD pipeline [1], it is guaranteed to have sufficient quality for release.
C. Improved Productivity and Efficiency
Significant improvement in productivity and efficiency was also observed. This can be clearly seen in the savings achieved in the following areas.
B. Reliable/Automated Release
If a release involves deploying an application to a production environment, the deployment should be reliable. Deployment failures rarely happen; when one does occur, it is carefully managed so that it does not affect the users of the application, since that could result in revenue loss.
Developers used to spend 20% of their time on setting up and maintaining their test environments. Now, the CD pipeline automatically sets up the environments for them. Similarly, test engineers used to spend a lot of effort on setting up test environments. With the CD pipeline, they no longer need to do this, either.
To achieve reliable release, the deployment is usually automated. Before executing the production deployment, the deployment process and scripts have usually been exercised several times in different testing environments.
Operations engineers used to take several days’ effort to release an application to production. Now, they only need to click a button. The CD pipeline automatically releases the application to production.
C. Delivering Valuable Software
The team continuously makes sure the software being developed provides value to customers. This implies the ability to quickly gather users’ feedback on a feature once it is delivered. This also implies that a big feature will be decomposed into smaller ones, so that each one can be quickly delivered to get feedback early.
Furthermore, developers and operations engineers used to spend a lot of effort on troubleshooting and fixing issues caused by the old release practice. Now, the CD pipeline has eliminated these issues. The effort that would otherwise be spent on fixing them is saved and can be used for more valuable activities.
D. Improved Product Quality
Significant product quality improvement was observed. The number of open bugs for the applications has been reduced by more than 90%. In addition, the number of priority 1 incidents in production has been reduced significantly.
D. Small Size
The size of a user story is usually sufficiently small so that it can be finished within a week. The number of user stories in a single release is also small. Keeping such increments small helps to reduce the cycle time and release risks.
attackers (hackers) get more opportunities to attack the software during its start-up time.
E. Improved Customer Satisfaction
The users of the applications are internal customers in a different department. Before the applications moved to CD, tension and a lack of trust existed between this department and the software development teams, due to previous quality and release issues. Managers have commented that the relationship between these two departments has improved. Trust has been established.
I observed that traditionally less attention is paid to this area. Software architects should not ignore this area when architecting software for CD.
C. Loggability
With CD, one of the most commonly used deployment styles is to first deploy the new version of an application on a set of newly created machines and then, once the new version is proven to work, destroy the machines running the old version. Each version is usually replaced by a new version once or twice a week.
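The deployment style described above can be illustrated with a minimal Python sketch. It is only a model under stated assumptions, not the tooling used in our pipeline: `Fleet`, `release`, `provision`, and `is_working` are hypothetical names, and "destroying" machines is represented by emptying a list.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Fleet:
    """A set of machines that all run one version of the application."""
    version: str
    machines: List[str] = field(default_factory=list)


def release(current: Fleet,
            new_version: str,
            provision: Callable[[str, int], List[str]],
            is_working: Callable[[Fleet], bool]) -> Fleet:
    """Return the fleet that serves traffic after the release attempt."""
    # Step 1: deploy the new version on newly created machines.
    candidate = Fleet(new_version, provision(new_version, len(current.machines)))
    # Step 2: switch over only once the new version is proven to work.
    if is_working(candidate):
        # Step 3: destroy the machines running the old version.
        current.machines.clear()
        return candidate
    # The new version failed its checks: discard it and keep the old fleet.
    candidate.machines.clear()
    return current
```

Because the old machines are discarded in step 3, anything stored only on them is lost, which is why the logging concerns discussed in this subsection arise.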
Motivated by these major benefits, the company has decided to move all software applications to CD. In other words, it is these benefits that motivated us to architect our software applications for CD.
V. WHAT DOES ARCHITECTING FOR CONTINUOUS DELIVERY IMPLY?
With this deployment style, any information on the old machines is no longer accessible after a release. Any useful information about the execution of the software has to be logged properly, so that we can use log aggregation techniques to move these logs to a log server for analysis. The logs should contain sufficient information for diagnosis and troubleshooting when issues arise.
The most salient implications of CD for the architectures of software applications concern Architecturally Significant Requirements (ASRs). ASRs are those requirements that have a measurable impact on a software system’s architecture [4]. It is this set of requirements that shapes the architecture of the software applications. To be able to practice CD effectively and gain the maximum benefits from CD, software applications have to meet a set of ASRs. I describe each of these ASRs below.
At the same time, storing and managing logs costs time and money, and storing and maintaining more logs means more cost. From this perspective, the logs should be concise enough to not log anything that does not justify its cost.
A. Deployability
Since we moved to CD, a software application is typically deployed to several testing environments multiple times a day and deployed to the production environment once or twice a week.
Meeting both of the above requirements is challenging. When we move applications to CD, we do see applications that do not provide sufficient information for diagnosing and debugging issues. We also see applications that produce too many logs and cost a significant amount of extra money. Architecture principles should be put in place to guide the logging of the applications. The architecture principles can include rules on what to log, the log format, the logging mechanism, etc. Each of these aspects is important. For example, the log content will determine what information is available from the logs; the log format can affect how easily the logs can be analyzed; the logging mechanism can affect the logging performance.
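To make the three aspects above concrete (log content, log format, and logging mechanism), the following is a minimal sketch using Python's standard `logging` module. It assumes a one-JSON-object-per-line format, which is one common choice for log aggregation and not necessarily the format we use; `JsonFormatter`, `make_logger`, and the `app_version` field are hypothetical names introduced for illustration.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Log format: render every record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            # Log content: a small, fixed set of fields.
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Hypothetical field: which release produced this line.
            "app_version": getattr(record, "app_version", "unknown"),
        }
        return json.dumps(entry)


def make_logger(name: str, level: int = logging.INFO) -> logging.Logger:
    """Logging mechanism: one stream handler; the level caps log volume."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid attaching duplicate handlers
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger
```

A call such as `make_logger("pricing").info("feed connected", extra={"app_version": "1.4.2"})` then emits a single machine-parsable line. Raising the level to `logging.WARNING` is one simple way to trade diagnostic detail against storage cost.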
A level of deployability that is acceptable for a traditional multi-month release model may not be acceptable for CD. When releasing twice a year, we can take the system down for a release, assign a significant task force to perform the deployment, and be less concerned with the actual time of the deployment as long as it happens within some time window. However, all of the above are unacceptable with the high frequency of releases in CD. For example, it is generally not feasible to take the system down for a release, because this would dramatically affect the availability of the software system. To maintain the availability of the system, zero-downtime release is required.
D. Modifiability
In CD, each user story is usually of a small size, allowing it to be released quickly to get fast user feedback. With fast feedback, the team can make sure that they are always working on things that offer value to customers, and eventually to the company. An application should be architected in a way that supports this style of software development. In general, the software application should exhibit a high degree of modifiability to allow the constant, incremental addition of small new features.
I observed that, apart from deployment techniques, the architecture of a software application has an impact on whether and how easily we can achieve a reliable, quick, and zero-downtime release. A software application has to be architected with such deployment requirements in mind; otherwise, it will be very challenging to move the application to CD. For example, we have an application that requires many weeks to build a testing environment for it. Moving this application to CD is extremely challenging.
In CD, the concern of modifiability also extends to how a modification can be deployed. An example we observed is about SQL database schema changes. If the team does not manage the database schema changes properly, automating the deployment of the schema changes is usually very challenging. When architecting an application for modifiability, the ease with which modifications can be deployed should also be considered and evaluated.
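One common way to make schema changes deployable by automation is to express them as ordered, versioned migrations that a script applies idempotently. The sketch below illustrates that general technique with Python's built-in `sqlite3` module. It is an assumption-laden example, not a description of how our teams manage schemas: the `bets` table, the `schema_version` table, and the `migrate` function are hypothetical.

```python
import sqlite3

# Hypothetical migrations: (version, SQL statement), in ascending order.
MIGRATIONS = [
    (1, "CREATE TABLE bets (id INTEGER PRIMARY KEY, stake REAL NOT NULL)"),
    (2, "ALTER TABLE bets ADD COLUMN currency TEXT NOT NULL DEFAULT 'EUR'"),
]


def migrate(conn: sqlite3.Connection, migrations=MIGRATIONS) -> int:
    """Apply migrations not yet recorded; return the resulting schema version."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    current = row[0] or 0  # MAX() yields NULL on an empty table
    for version, statement in sorted(migrations):
        if version > current:
            conn.execute(statement)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (version,)
            )
            current = version
    conn.commit()
    return current
```

Because already-applied versions are skipped, the same script can run on every deployment to every environment, which is what a deployment pipeline needs from a schema change.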
B. Security
With more frequent releases, the application goes down and up more frequently. The security vulnerabilities of the application during the start-up time become more important, as
E. Monitorability
Monitoring is important for practicing CD. We rely on monitoring to get immediate feedback on a deployment, especially when things are broken, so that we can take remedial actions before those broken things have a major impact on our customers. Apart from putting in proper monitoring tools, the software application itself should be architected in a way that is amenable to monitoring.
from CD. Moving these applications to CD is challenging. Is it viable to improve the architectures of these applications to make them more aligned with what CD requires? What is the best route to take for making the changes? More research is needed to provide satisfactory answers.
VII. RELATED WORK
Not much research has been done on architecture in the context of CD. Bass et al. [5] present extensive discussions of architecture in relation to CD. The discussion focuses on ASRs of the tools for supporting CD, rather than the ASRs that the software applications need to meet. Bellomo et al. [6] provide an in-depth description of deployability (covering deployability goals, and design decisions and tactics to meet those goals). The description focuses on deployability only. Our work is the first attempt to provide a comprehensive discourse on CD’s implications for application architectures, in terms of ASRs.
With CD, we sometimes need applications to expose additional monitoring interfaces to facilitate CD. For example, we need each application to provide a monitoring interface to tell whether it is fully running and operational after a deployment. This monitoring is essential for coordinating the starting sequence of different components of an application and for implementing a zero-downtime release. We need the additional interface because an application usually consists of several components running on different machines, and the standard monitoring facilities will not give accurate insights as to whether an application is fully up.
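A minimal sketch of such an application-level monitoring interface is shown below. It assumes each component can be probed by a function returning a boolean; the function name `application_status` and the shape of the result are hypothetical, chosen only to show that the application reports itself as ready only when every component does.

```python
from typing import Callable, Dict


def application_status(checks: Dict[str, Callable[[], bool]]) -> dict:
    """Aggregate per-component probes into one application-level answer."""
    components = {}
    for name, check in checks.items():
        try:
            components[name] = bool(check())
        except Exception:
            # A probe that raises is treated as "component not ready".
            components[name] = False
    # Ready only if there is at least one component and all of them are up.
    ready = bool(components) and all(components.values())
    return {"ready": ready, "components": components}
```

A deployment script could poll this result after starting each component and proceed with the starting sequence, or switch traffic to the new version, only once `ready` is true.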
VIII. CONCLUSION AND FUTURE WORK
Apart from traditional monitoring of the technical metrics of the system, monitoring business metrics regarding user feedback becomes more important. Gathering feedback quickly right after a release is very important for the team to make sure that they are always working on things that bring value to the users. We noticed that this is a particularly important area. However, it remains relatively unexplored.
Based on our experiences of moving 22 software applications to Continuous Delivery (CD), I observed that, to effectively practice CD, these applications should meet a set of ASRs. The list of ASRs reported here is not intended to be exhaustive. Rather, I believe this paper can evoke further discussions and more research activities on this increasingly important topic.
F. Testability
A CD pipeline consists of several stages [1]. Each stage of the pipeline serves as a quality gate. When an application passes all the stages of the pipeline, the application should be ready for release.
Possible future work includes: compiling a comprehensive list of ASRs by incorporating other practitioners’ observations; investigating whether customized general scenario templates [7] can be created to help architects elicit and document these ASRs; and investigating techniques for better addressing these ASRs in a CD context.
To ensure the readiness for release, the CD pipeline heavily relies on tests that are hooked into the stages. Through these tests, the team ensures that when a code change passes all the stages, it is ready for release. Good testability should be architected into the software application, so that developing these tests is feasible and cost effective.
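The idea of stages acting as quality gates can be sketched in a few lines of Python. This is an illustrative model only, assuming each stage is a function that takes a change identifier and returns whether its tests passed; the names `Stage` and `run_pipeline` are hypothetical and do not refer to any real pipeline tool.

```python
from typing import Callable, List, Tuple

# A stage is a (name, gate) pair; the gate runs that stage's tests.
Stage = Tuple[str, Callable[[str], bool]]


def run_pipeline(change_id: str, stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order; stop at the first gate that rejects the change."""
    passed: List[str] = []
    for name, gate in stages:
        if not gate(change_id):
            # The change is not release-ready; later stages are skipped.
            return False, passed
        passed.append(name)
    # Every gate accepted the change, so it is considered ready for release.
    return True, passed
```

The sketch also shows why testability matters: the pipeline's verdict is only as trustworthy as the tests hooked into each gate.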
ACKNOWLEDGMENT
I thank my colleagues and Klaas-Jan Stol for their help and thoughtful comments. The article represents only my own views and does not necessarily reflect those of my employer.
REFERENCES
VI. DISCUSSION
I do not claim that the above ASRs are not important for software applications in a traditional context. What I observed is that architects usually can trade off these ASRs for other ASRs in a traditional software development context. In the context of CD, the priority of these ASRs becomes higher. To effectively practice CD, the architects cannot trade them off lightly. The application should meet these ASRs as well as the normal ASRs driven by the user requirements. This adds extra challenges for architecting these applications. Additional cost may be incurred as well. In general, the assumption is that the benefits of CD will justify the extra cost. However, more research is needed to study whether this assumption is true in different situations. At the same time, more research is needed to develop architecture technologies (e.g., patterns and tactics) in the CD context to make meeting these ASRs easier.
Another challenge we have seen is that many existing applications that do not meet these ASRs also want to benefit
[1] L. Chen, "Continuous Delivery: Huge Benefits, but Challenges Too," IEEE Software, vol. 32, 2015.
[2] J. Humble and D. Farley, Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation. Addison-Wesley Professional, 2010.
[3] B. Fitzgerald and K.-J. Stol, "Continuous software engineering and beyond: trends and challenges," in Proceedings of the 1st International Workshop on Rapid Continuous Software Engineering, Hyderabad, India, 2014.
[4] L. Chen, M. Ali Babar, and B. Nuseibeh, "Characterizing Architecturally Significant Requirements," IEEE Software, vol. 30, pp. 38-45, 2013.
[5] L. Bass, I. Weber, and L. Zhu, DevOps: A Software Architect's Perspective. Addison-Wesley, 2015.
[6] S. Bellomo, N. Ernst, R. Nord, and R. Kazman, "Toward Design Decisions to Enable Deployability: Empirical Study of Three Projects Reaching for the Continuous Delivery Holy Grail," in 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2014, pp. 702-707.
[7] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice, 2nd ed. Addison-Wesley, 2003.