Invited Paper
Distributed Agile Software Development for the SKA

Andreas Wicenec^a, Rebecca Parsons^b, Slava Kitaeff^a, Kevin Vinsen^a, Chen Wu^a, Paul Nelson^c, David Reed^d

^a ICRAR/UWA, M468/35 Stirling Hwy., Perth, Australia;
^b ThoughtWorks Inc., 200 E Randolph Street, 25th Floor, Chicago, IL 60601, USA;
^c ThoughtWorks Inc., Suite 600, 15455 Dallas Parkway, Addison, TX 75001, USA;
^d ThoughtWorks Inc., 303 Collins Street, Melbourne, VIC 3000, Australia
ABSTRACT

The SKA software will most probably be developed by many groups distributed across the globe and coming from different backgrounds, like industry and research institutions. The SKA software subsystems will have to cover a very wide range of different areas, but they still have to react and work together like a single system to achieve the scientific goals and satisfy the challenging data flow requirements. Designing and developing such a system in a distributed fashion requires proper tools and the setup of an environment that allows for efficient and timely detection and tracking of interface and integration issues. Agile development can provide much faster feedback mechanisms and also much tighter collaboration between the customer (scientist) and the developer. Continuous integration and continuous deployment, on the other hand, can provide much faster feedback of integration issues from the system level to the subsystem developers. This paper describes the results obtained from trialling a potential SKA development environment based on existing science software development processes like ALMA's, the expected distribution of the groups potentially involved in the SKA development, and experience gained in the development of large scale commercial software projects.

Keywords: Software development, Agile, Square Kilometre Array, SKA
1. INTRODUCTION This paper consists of four sections in addition to this introduction. Section two provides an overview of large scale, distributed software development, both in scientific and in commercial areas. The third section gives an overview of the agile software development paradigm and covers the practical areas like support and operations, platforms, continuous integration and continuous delivery, testing infrastructure and maintenance. The fourth section focuses on the structure of the software stack and discusses topics like common software layer, subsystems, integration and middleware, logging and alerts and data transfer. The last section discusses the relationship between software and hardware system development.
2. OVERVIEW OF EXISTING SYSTEMS

2.1 Science Systems

Over the past few decades science projects have become increasingly larger, more complex and more international. Typical big science projects very often span continents; the cornerstone projects are global endeavours and require management of very complex political, sociological and technical boundary requirements and conditions. Often these projects have extended development, integration and deployment time scales. Almost always these projects present significant technological challenges to the consortia involved in them, and in general this is also reflected in the requirements posed on the software infrastructure and functionality. In particular the so-called super-science projects like the Large Hadron Collider (LHC), ITER (originally the International Thermonuclear Experimental Reactor), the Atacama Large Millimeter and Submillimeter Array (ALMA) and ESO's Very Large Telescope (VLT) are good examples of such science projects on various scales. In all these projects software plays an absolutely critical role, but very often receives only a very small fraction of the total budget.
Typical commercial projects of similar size reserve up to 25% of the total budget for software effort; for science projects this ratio is typically far less than 10%, and in the case of ALMA the original share was less than 5%.1 This situation calls for a more rigid, formal and pragmatic approach to software development, but that would be in sharp contrast to the prototypical and research nature of science projects, where technical and scientific requirements are very hard to gather and freeze, and where operational concepts for the resulting facilities are being developed in parallel with the deployment of the systems.
2.2 Large Scale Commercial Distributed Agile Development

The first example is an e-commerce site that serves a primarily European market. The site has tremendous performance and load requirements and significant integration challenges. The code base is several hundred thousand lines of Java code spread across hundreds of classes. The team supporting and enhancing this system is located in two different countries at four different locations. At present, the team is about 100 strong; at its peak, the team was in excess of 200 people. The size of both the team and the system necessitates heavy use of automation in testing and builds. Continuous integration is critical to keep the large number of developers working together. Since so many features are under development at any point in time, the probability of conflicting changes is high. Continuous integration, including frequent execution of the functional tests, helps spot any conflicts early, allowing for quick resolution. Given the size of the code, on-boarding of new team members is challenging. The unit and functional tests provide reliable documentation of the intended behavior of the code, whereas written documentation generally never gets kept up to date. The value of the test suite as documentation is simple: the documentation is provably accurate since the tests are all passing. Another critical success factor was the overall program management function. With the number of different systems involved and the size of the development effort, keeping the different teams working smoothly was essential. While program management evokes a sense of "heavyweight" processes, Agile Program Management is essential.2 Agile and planning are not mutually exclusive.

The second example highlights the way to work with a community of stakeholders. The system is a point of sale system for a large retail group with multiple brands. The system needed to serve the needs of the individual brands while maintaining sufficient consistency for reporting and financial purposes. A critical role in this effort was the individual tasked with reconciling the needs of the different brands. This person accepted requirements from the different stakeholders and rationalized them into a coherent set for the development team. A critical enabler for the success of this role was the relationship formed with both the stakeholders and the development team, since this person's job was at times to push back on some stakeholders and on the team.

With the software complexity of the SKA systems and the geographic and organizational distribution of the development team, the Agile practices of test automation and continuous integration will be crucial in allowing the different teams to progress independently and safely. The practices will also support the efficient on-boarding of team members in the different locations. The management of the different stakeholders of the SKA software components will also be crucial. Frequent releases are very helpful in reassuring stakeholders of progress. Strong stakeholder management will still be critical, but the frequent showcasing of progress will make that task simpler.
3. AGILE SOFTWARE DEVELOPMENT FOR SCIENTIFIC COMPUTING

The Agile software development methodology consists of a collection of principles and practices that influence the way teams organize, the tools they use, and the way individual team members interact with each other. This overview attempts to condense over a decade of research, insights and evolution into a few pages. The term "Agile" as referring to a specific collection of software development methodologies first appeared in the Agile Manifesto, which included a set of value statements as well as a set of guiding principles. These values and principles arose from a discussion about the commonalities found amongst a large number of software development approaches being used at the time. These different approaches, including XP,3 Scrum,4 and others, address issues of software project planning as well as specific software development and engineering practices.
The main aspects of Agile software development considered here are the specific practices and principles most relevant to the SKA project specifically and scientific computing more generally. The principles of Agile include rapid feedback and complete transparency. While not a principle, Agile relies heavily on automation of testing, builds, and deployments, following the notion that anything done frequently should be automated. Focusing on the Agile principles helps when applying Agile to different situations. While scientific computing shares many characteristics with traditional business software development, there are differences. These differences manifest themselves in variations to some of the standard practices. The practices outlined here include the following: automated unit testing for code correctness, automated functional testing for behavioral correctness and documentation of expected behavior, frequent releases to demonstrate progress of development, more granular requirements to facilitate tracking of real progress, and continuous integration to facilitate collaboration. While these practices, and the principles that inspire them, do not make up the totality of what constitutes Agile, the set does include many critical enablers to Agile software development for scientific computing and particularly for teams that are geographically distributed. In addition to these practices, Agile software teams organize themselves differently than many traditional teams and adopt a working style that mirrors that often adopted by multi-disciplinary research teams. This section first introduces the relevant practices and provides references for further pursuit. Next, two actual projects are described that share characteristics with the SKA software effort and demonstrate the role Agile practices can fulfill in addressing the issues any software effort such as SKA faces. Finally, we discuss how community and open source development, common in the scientific computing world, affects agile software development.
3.1 The Role of Testing

Two aspects of testing,5 and more specifically automated testing, that are relevant for scientific computing are unit testing and automated functional testing. Agile developers often practice test driven development,6 usually abbreviated to TDD. There are two very different aspects to TDD: the tests that result from TDD, and the impact of TDD on the design of the software. Indeed, some argue that a more accurate name for the practice is Test Driven Design. The basic idea behind TDD is simple: no code is ever written without a failing automated test. Once the test fails, the developer writes code to make the test pass. The result is two-fold: a suite of tests that represents the desired behavior of the code, and code that is generally smaller, simpler, and easier to understand. The suite of tests provides a safety net that encourages refactoring and more general updating of the code, since the tests tell us when some piece of code no longer behaves as expected. Unit test coverage is a metric frequently tracked to monitor the health of a code base.

The second relevant aspect of testing is the use of automated functional acceptance tests. These test suites are valuable for communicating the expected behavior of a system as well as for speeding up regression testing as new functionality is added. These tests should focus on the expected system behavior rather than on the implementation of that behavior.
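To make the red-green cycle of TDD concrete, the following is a minimal sketch in Python using the standard unittest module; the running_mean function and the test names are purely illustrative and are not part of any SKA code base. The test is written first and fails, and only then is just enough code written to make it pass.

import unittest


def running_mean(values, window):
    """Return the running mean of `values` over a sliding `window`.
    Written only after the tests below were seen to fail (TDD)."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


class RunningMeanTest(unittest.TestCase):
    # Step 1: this test is written first and fails while running_mean()
    # does not yet exist or is incomplete.
    def test_simple_window(self):
        self.assertEqual(running_mean([1, 2, 3, 4], window=2),
                         [1.5, 2.5, 3.5])

    # Step 2: further tests drive out edge-case behaviour.
    def test_invalid_window_rejected(self):
        with self.assertRaises(ValueError):
            running_mean([1, 2, 3], window=0)


if __name__ == "__main__":
    unittest.main()

The tests remain in the suite afterwards and act as the safety net described above: any later refactoring that changes the behaviour is flagged immediately.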
3.2 Granularity of Requirements and Releases

Important agile principles are rapid feedback and transparency. These principles manifest themselves in the practices of frequent releases and small requirements. Frequent releases provide for earlier feedback from the users of systems. However, achieving releases every month or so means that features must be small enough to be completed in that time. These two practices thus reinforce each other. Generally, agile teams like the individual feature requests, often called stories, to be small enough to be completed in at most two days, and the target is often only one day. If a particular feature is too complex to be developed that quickly, the feature is decomposed into smaller stories. A release is made up of a collection of stories. Agile development often uses the notion of an iteration, which is simply a short time box of one or two weeks. At the end of an iteration, the users of the system have the opportunity to provide feedback to the developers on the newly implemented features. The end users in the case of scientific computing may be the scientists developing the system or they may be a community of scientists. Regardless, the value of frequent releases derives from the transparency into real progress. In Agile development, there is an unambiguous definition of done. Code can never be 80 percent done. It is either done, meaning it
passes all the tests, or it isn't finished. Since individual stories are small, this all-or-nothing status metric isn't an onerous burden over the course of a software effort.

One aspect of scientific computing might seem at odds with Agile development. Many development efforts in scientific computing are exploratory in nature, or possibly even pure research. The feasibility of requirements of this form for scientific computing is a legitimate question. While a full discussion of this topic is beyond the scope of this paper, the issue must be addressed. One key is to separate the correctness of the implementation from the validity of the hypothesis being tested with the code. For example, the requirements for some pattern matching code should focus on whether the pattern matching code works; the ability of the pattern matching code to detect planets is not something appropriate for testing in this context.
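As an illustration of this separation, the sketch below tests only the correctness of a hypothetical match_template routine on synthetic data; whether such a matcher could actually detect planets is left to scientific validation outside the unit test suite. All names and the simple sliding dot-product implementation are assumptions made purely for the example.

import unittest


def match_template(signal, template):
    """Return the offset at which `template` best matches `signal`
    (simple sliding dot product, no normalisation)."""
    best_offset, best_score = 0, float("-inf")
    for offset in range(len(signal) - len(template) + 1):
        score = sum(s * t for s, t in
                    zip(signal[offset:offset + len(template)], template))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset


class MatchTemplateTest(unittest.TestCase):
    def test_recovers_known_offset(self):
        # Correctness of the implementation: a template injected at a
        # known position in synthetic data must be recovered exactly.
        template = [1.0, 3.0, 1.0]
        signal = [0.0] * 10
        signal[4:7] = template
        self.assertEqual(match_template(signal, template), 4)

    # Whether such a matcher can actually detect planets in real survey
    # data is a scientific question, validated against reference data
    # sets outside this unit test suite.


if __name__ == "__main__":
    unittest.main()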
3.3 Continuous Integration and Continuous Delivery

A critical aspect of Agile, and one that is particularly important for distributed development, is continuous integration.7 The basic tenet of continuous integration is that developers should check their code into source control frequently and ensure the code passes all the tests before proceeding. This practice isn't as important if there is only one developer, since there are limited code integration issues in that case. As the size and geographic distribution of a team grows, however, continuous integration is crucial. By integrating different parts of the code frequently, feedback on incompatible changes comes in a more timely fashion, making the issues much easier to fix. This practice helps keep different parts of the development team synchronized, even across time zones.

Continuous Delivery8 takes continuous integration further by focusing on the automation of deployments and environment provisioning. The goal of continuous delivery is to make deployments boring by making them reliable and fast. Automated environment provisioning enables the creation of new environments quickly to support new development teams or test environments. Deployment automation reduces the opportunities for mistakes in deployments, as well as speeding them up. The practices that make up continuous delivery will be important to support the complex software landscape of the SKA.
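As a sketch of what this discipline looks like in practice, the following Python script is of the kind a continuous integration server might invoke on every check-in: it runs the fast unit tests before the slower functional suite and fails the build on the first error. The directory layout and the pytest invocation are assumptions for illustration, not SKA conventions.

#!/usr/bin/env python3
"""Minimal continuous-integration gate: run the fast unit tests first,
then the slower functional suite, and fail the build on any error.
Paths and commands are illustrative only."""

import subprocess
import sys

STAGES = [
    ("unit tests", ["python", "-m", "pytest", "tests/unit", "-q"]),
    ("functional tests", ["python", "-m", "pytest", "tests/functional", "-q"]),
]


def main():
    for name, cmd in STAGES:
        print(f"== running {name} ==")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Fail fast: a broken build is reported immediately so the
            # developer who checked in can fix it while the change is fresh.
            print(f"BUILD FAILED in stage: {name}")
            return result.returncode
    print("BUILD OK - safe to integrate")
    return 0


if __name__ == "__main__":
    sys.exit(main())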
3.4 Role of the Scientific Community

A major role in an Agile software project is that of the customer. The customer, or end user, interacts frequently with the development team, specifies the requirements of the system in the form of stories with acceptance criteria, and participates in the frequent showcases of the software, providing feedback on the implemented features. The stakeholders for the SKA, however, are too numerous to interact with in this way. The SKA governing bodies will be interested in tracking progress against commitments. The scientific community will be interested in understanding the capabilities that the SKA will make available to them. The consortia working on the new hardware components necessary for the SKA will need to know how and when they will integrate with the software components. While there won't be a specific customer in this context, the engagement between the different stakeholders and the SKA software development consortia should still follow the customer model. Showcases, for example, could be made available to the broader science community through video and webcasts, providing opportunities for feedback to the software teams. The existence of the automated tests will, as another example, simplify eventual community contributions to the SKA suite of analysis tools.
4. STRUCTURE OF A SOFTWARE STACK SUPPORTING THE AGILE PROCESS FOR THE SKA

4.1 Common Software

In a distributed software development environment it is important to define rules, standards and procedures. Defining these solely in the form of documents and agreements is one way, but carries a fairly high risk and also requires additional infrastructural work to verify compliance. A common software layer, on the other hand, provides programmatic contracts which all parts of the system have to fulfil in order to be able to deliver a working system. Checking compliance is part of the normal testing and integration. The common software layer can be fairly light-weight or very inclusive, providing a wide variety of functionality. The various subsystems
have to use the building blocks and frameworks to implement their respective functionality. This can substantially decrease the amount of code that has to be written inside the subsystems. This approach has been proven to be successful in a number of big science projects like the VLT, ALMA and in some accelerator projects. It was even possible to re-use common software across disciplines by keeping generic and domain specific parts of the common software layer clearly separated. In the case of telescopes and accelerators, common parts include infrastructural services like subsystem communication middleware, logging and alert systems, as well as sensor setup, control and communication. They also include common user interface frameworks and re-usable and re-locatable widgets. In addition, the software development environment, build, test and integration systems, as well as tools like the versioning control system and the ticketing system, should be defined on a project level and might be part of the common software, but could also be part of a separate software engineering effort. The usage of an inclusive common software layer will also support the agile development, since the subsystem developers can concentrate on the implementation of the functionality rather than the details around it. Moreover, the integration of the complete system will benefit from the standardized module integration and interface definition. For the common software, the subsystem developers will occupy the customer role. The ALMA Common Software9 (ACS) was based on a very clean architecture (Figure 1) which could be used as a blueprint for the SKA common software layer as well. The following sections describe a few of the more obvious common software components. The implementation of these common software elements could follow the Service Oriented Architecture (SOA)10 paradigm, where the customers of the services are the other subsystems.

Figure 1. Package diagram of the ACS architecture showing 4 layers. The implementation choices like the usage of CORBA and ACE or the Container-Component model are not that important in the current context and could indeed be replaced by other technologies. The important point here is the services and systems provided by ACS. Diagram from the ACS Architecture document.9

4.1.1 Integration/Middleware Strategy

The major services most existing common software layers provide are based on some communication or messaging middleware like CORBA,11 ICE12 or DDS.13 There are numerous such middleware packages available, addressing very generic but also more specialised and language specific needs. Most important within the current context is the fact that these packages impose certain patterns on the implementation of interfaces, intercommunication and software components. This helps to minimize the risk of broken interfaces and integrations by introducing a formal definition of the contracts between the software components. In particular in a large, distributed development environment this enables the agile approach of continuous integration with substantially less effort.
Interface code and documentation can be automatically generated from the interface definition language and thus getting a complete system up and running just requires some additional stubs mimicking the functionality of the components.
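The following minimal Python sketch illustrates the idea of a programmatic contract and a stub: an abstract base class stands in for what would normally be generated from the interface definition language, and a stub component mimics a subsystem so that the rest of the system can be integrated against it. The CorrelatorControl interface and its methods are hypothetical and chosen only for illustration.

from abc import ABC, abstractmethod


class CorrelatorControl(ABC):
    """Hypothetical contract a correlator subsystem must fulfil.
    In a CORBA/ICE/DDS based system this would be generated from an
    interface definition language (IDL) file."""

    @abstractmethod
    def configure(self, n_channels: int, integration_time: float) -> None:
        ...

    @abstractmethod
    def start_observation(self, scan_id: str) -> None:
        ...

    @abstractmethod
    def status(self) -> str:
        ...


class CorrelatorStub(CorrelatorControl):
    """Stub returning canned responses so that other subsystems can be
    integrated and tested before the real correlator software exists."""

    def __init__(self):
        self._configured = False

    def configure(self, n_channels, integration_time):
        self._configured = True

    def start_observation(self, scan_id):
        if not self._configured:
            raise RuntimeError("configure() must be called first")

    def status(self):
        return "READY" if self._configured else "IDLE"

Because the stub implements the same contract, it can later be replaced by the real component without changes to the calling code.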
4.1.2 Logging and Alerts

Both logging and alerts are specialised forms of the more generic messages provided by the middleware. A very clear definition of not just the software interfaces, but also the formatting, prioritisation and propagation of log and error messages is a crucial pre-requisite for the usability of the logs and alerts (a minimal sketch of such a standardized record is given at the end of this section). There is some overlap and commonality with the hardware monitoring, since a subset of the log messages can be used and interpreted as software monitoring, some other parts as data flow monitoring or operator logs, and very often logging is (mis-)used for debugging as well. Thus a clear separation of the scope of the messages is required.

4.1.3 Data Transfer and Persistence Middleware

One of the less obvious areas where a project wide approach could potentially be beneficial, in particular for the SKA, is data transfer and persistent storage. The SKA processing requirements will hit the exascale both in floating point operations and in data volume. Many of the core algorithms will be dominated by I/O14 and thus optimization of the data flow, data movements and persistence requirements plays a vital role; otherwise the SKA data system will either be extremely expensive or not feasible at all. The design and implementation of SKA internal data transfer and persistence services can help mitigate this risk, or at least help to get a global understanding of the I/O, storage and database requirements of the SKA.
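Referring back to the logging and alert discussion in Sect. 4.1.2, the following sketch shows one possible shape of a project-wide, structured log/alert record that every subsystem would emit in the same machine-parseable form; the field names and severity levels are assumptions for illustration only.

import json
import time
from dataclasses import dataclass, asdict

# Hypothetical severity levels agreed upon project wide; the point is
# that every subsystem emits the same, machine-parseable structure.
SEVERITIES = ("DEBUG", "INFO", "WARNING", "ALERT")


@dataclass
class LogRecord:
    timestamp: float      # UTC seconds since epoch
    subsystem: str        # e.g. "correlator", "archive"
    severity: str         # one of SEVERITIES
    scope: str            # "software", "dataflow", "operator", ...
    message: str

    def to_json(self) -> str:
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity {self.severity!r}")
        return json.dumps(asdict(self))


# Example usage: the same record type can be routed to the operator
# display, the monitoring database or an alert handler.
record = LogRecord(time.time(), "archive", "ALERT", "dataflow",
                   "ingest queue above 90% capacity")
print(record.to_json())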
4.2 Subsystems

The proper splitting of the overall system design into manageable subsystems is a crucial task. In a complex landscape like the SKA there will be a number of non-functional requirements trying to influence the distribution or setup of subsystems across the various members of the consortium. A related problem arises from the allocation of resources to the individual subsystems. Since the requirements are not known in any detail at the time when the subsystems are defined, it is crucial to define at least their scope in order to be able to estimate the resource allocation. In ALMA the subsystem resource allocation was based on the science requirements, while the scope was defined by the computing architecture. Because many additional system requirements were assigned to some subsystems, their scope was much wider than what the science requirements suggested, and thus the allocated resources did not match the work to be done. The scale of the SKA data flow will most probably dominate the whole development, and in many cases it will be crucial to optimize where and how to execute a certain functionality, rather than just implementing it.
4.3 Operations, Support and Maintenance

While the majority of attention to a software system focuses on the initial development of the system, the support and maintenance costs and effort frequently dwarf the initial investment. As such, planning for the support phase is critical. Aspects to consider for supporting a system that was developed using Agile methods include the role of tests, metrics and continuous delivery. As described previously, tests contribute to the on-boarding of new team members by providing accurate documentation about both the behavior of the system and the expectations about how the various parts of the system fit together. Over time, these tests are critical in maintaining the organizational knowledge of the system, since people move on and forget the details. Software internal quality metrics are increasingly being recognized as important to understanding and improving the maintainability of a system. By incorporating these metrics into the build process, the system quality can be monitored over time and corrective action can be taken to address creeping quality issues (a minimal sketch of such a build-time quality gate is given at the end of this section). Continuous delivery provides the foundation for supporting the continued upgrading and release of a system. Once in a maintenance phase, there might seem to be less pressure to release quickly. However, the ability to quickly address a critical issue is no less important when a system is in maintenance mode. The staffing approach to support varies depending on ownership of the code. Open source communities often provide the support resources for systems; these resources include support forums where people can get help, in addition to the more obvious bug fixing or feature requests. Support systems generally require a mechanism for getting questions answered, and some way of recording defects and feature requests. The health of a community
is often assessed by how active its forum is. The SKA software consortia will need to provide a similar set of capabilities. Agile methods apply equally well during the support phase, although the level of interaction may be different. The same testing, build, and release approaches should continue through the support phase of a system. The same attention to test maintenance and code quality must also continue, or the software asset will atrophy. Systems become outdated even if there is nothing wrong with them, as new technologies and new approaches enter the industry. Thus, on-going maintenance of large software systems needs to be addressed, even though it is often given far less attention than it deserves.
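As one possible way of wiring an internal quality metric into the build process, as discussed above, the sketch below rejects a build when unit test coverage falls below an agreed threshold. The use of the coverage.py command line tool and the 80% figure are illustrative choices, not project requirements.

#!/usr/bin/env python3
"""Build-time quality gate: fail the build if unit-test coverage drops
below an agreed threshold. Commands and threshold are illustrative."""

import subprocess
import sys

THRESHOLD = 80  # minimum acceptable line coverage in percent


def main():
    # Run the unit tests under the coverage tool.
    if subprocess.run(["coverage", "run", "-m", "pytest", "tests/unit"]).returncode:
        print("tests failed - build rejected")
        return 1
    # 'coverage report --fail-under' returns non-zero below the threshold.
    if subprocess.run(["coverage", "report", f"--fail-under={THRESHOLD}"]).returncode:
        print(f"coverage below {THRESHOLD}% - build rejected")
        return 1
    print("quality gate passed")
    return 0


if __name__ == "__main__":
    sys.exit(main())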
4.4 Testing Framework

Testing, and in particular automated testing, plays a prominent role in Agile Software Development. As such, a number of tools and frameworks have been developed, both proprietary and open source, to support this level of testing. Arguably the best known is JUnit, the unit testing tool that supports Java applications. The analog in the C#/.NET world is NUnit. These open source frameworks provide the mechanism to support unit testing, including integration with the development environments (IDEs). These tools provide mechanisms to support the expression of a variety of assertions, failure of which fails the build.

Moving up the chain, we come to tools to support automated functional and acceptance testing. Here, the implementation mechanism is again important. There are web testing frameworks like Selenium, Sahi, and White. There are frameworks like JMeter that tend to be used more for load and performance testing. There are tools like Go and Cucumber that allow the expression of tests in a domain specific language, making it easier to involve users in test creation. Tools at this level can also drive tests of integrated systems by supplying the "triggering event" that propagates through the various systems. Checking the success of such a test is more difficult, since the response is likely to be delayed in time. Tests of this nature will be a major testing component for the SKA.

Of course, specifying the test and expected results is only one aspect of automated testing. A critical component is the provisioning of test environments. Access to test instances of specialized hardware is a serious issue for integration testing. Virtualization technology15 can help some, but it doesn't address issues like performance testing in situations like scientific computing where compute time is a significant factor. One approach to addressing this issue is to utilize stub and mock systems that can mimic the behavior of systems that are hard to access in test environments. In general, stub systems simply replay canned responses to requests, whereas mocks tend to have more intelligence built into them. A critical issue with this approach is ensuring that the stubs and mocks do not get out of sync with the systems they are mimicking. To determine the continued validity of the mocks and stubs, we generally have automated tests that run against the real systems to ensure the behavior hasn't changed. This approach to contract testing is also useful in situations where cooperating systems are being developed by disparate teams. Each team provides the other with a test suite that shows the assumptions being made. A failure in one of these tests triggers a conversation between the development teams to address the issue. Large scale distributed development requires significant investment in testing to ensure the various parts of the system work together as needed. The tools that have matured significantly due to the emphasis on testing from the Agile development community will play a critical role in the success of the SKA software components.

4.4.1 System Development under Data Simulation

For a complex system, module level simulators are usually not sufficient. The full system is supposed to perform rather high level tasks and provide the best data for scientific exploitation. In order to verify whether the goals are met, system level development has to employ scientists to verify the output data and at least some of the intermediate products.
A good approach would be to develop a set of science reference projects (SRPs) for the SKA and work against those. Although this process has already been initiated,16 we propose to take this further and implement the SRPs as system level integration test cases following the SKA data flow, i.e. turn them into observing proposals, create the scheduling units and execute them to produce simulated data. Ideally this data simulation would go down to the actual data collected by the receivers. This simulated data is then pushed through and processed
by the various data pipelines and QC steps up to the science archive. Thus the SRPs are used throughout the whole data flow, and at the end scientists can compare the output of the whole system with their expectations. The SKA will be under construction for a very long time, while already being operated with reduced capabilities. Thus there should also be SRPs available to test the system with reduced capabilities. This will also allow a clearer identification of what functionality can be offered to users at what stage of the project. Simulated data can only represent a science case to a certain degree. For testing it might even be better to simulate a simplified science case first, where the results are totally predictable, and then add complexity until the full science case is reached. An example of such a complexity hierarchy: point sources without noise; add noise; add extended sources; use observed data from other instruments; full scale sky simulation.

4.4.2 Scalability Tests

For the SKA it is important to add specific tests for data volume and data rate. Data volume tests should be implemented in the form of data challenges, showing that the system is able to handle 10%, 25%, 50% and 100% of the projected data rate. These data challenges should be very high level milestones during the development of the SKA and are really system level tests. That means that the whole system has to demonstrate the ability to handle the amount of data specified for the particular milestone. The precise definition of what is meant by, for example, a 10% data rate has to be given well before the actual milestone, broken down into specific requirements for the individual modules and agreed upon by the various groups (a minimal sketch of such a parameterised milestone test is given at the end of this section). The preparation of the milestone tests and simulation data represents a fair amount of work and has to be taken into account accordingly. Using existing cloud systems and/or clusters to carry out these tests requires proper planning and implementation.

4.4.3 Failure Robustness Tests

Failures will be normal behaviour in a system with many hundreds of thousands of line replaceable units (LRUs); thus robustness against failures and failure recovery has to be an integral part of the system design and implementation. Systems engineering, together with science, has to define the expected availability of the whole system upfront and analyse the impact of this definition on the availability of the various sub-systems, modules and finally the LRUs. The identification of single points of failure is a critical task, but has to be done at a very high level in order to reach the correct level of availability throughout the whole system. Testing the behaviour of the system in cases where some components are not available, or fail in the course of execution, is even more complicated than testing nominal execution. However, if the failure case has been anticipated throughout the whole development in every single module, the behaviour of the system should be a lot more predictable.
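To indicate how the data-rate milestones of Sect. 4.4.2 could be expressed as automated, parameterised system tests, the following sketch iterates over the 10%, 25%, 50% and 100% milestones and asserts that the data are processed within the challenge window. The projected rate figure and the ingest stub are placeholders and do not represent SKA requirements or components.

import time
import unittest

PROJECTED_RATE_GBPS = 100.0  # placeholder figure, not an SKA requirement
MILESTONES = (0.10, 0.25, 0.50, 1.00)


def ingest(volume_gb):
    """Stand-in for the real ingest pipeline; in a data challenge this
    would push simulated visibilities through the deployed system."""
    time.sleep(0.01)  # pretend to do work
    return volume_gb


class DataChallengeTest(unittest.TestCase):
    def test_milestone_rates(self):
        duration_s = 1.0  # length of the simulated challenge window
        for fraction in MILESTONES:
            with self.subTest(fraction=fraction):
                target_gb = PROJECTED_RATE_GBPS * fraction * duration_s / 8.0
                start = time.time()
                processed = ingest(target_gb)
                elapsed = time.time() - start
                # The system must keep up with the offered rate for this
                # milestone, i.e. process the data within the window.
                self.assertGreaterEqual(processed, target_gb)
                self.assertLessEqual(elapsed, duration_s,
                                     msg=f"{fraction:.0%} milestone too slow")


if __name__ == "__main__":
    unittest.main()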
5. RELATIONSHIP BETWEEN SOFTWARE AND HARDWARE DEVELOPMENT

The SKA will consist of thousands of antennas and hundreds of thousands of hardware components, which have to be controlled and monitored. The efficient and seamless integration of software and hardware will thus be essential in order to be able to deliver an operational and safe system. The pace of hardware and software development in general does not match, and software development has to start before the hardware components are available. Often software is being developed at locations where, in particular, big or very expensive pieces of hardware are not accessible. To overcome potential deadlock situations many projects rely on very rigid interface control documents and on hardware simulators. As with the data simulations above, this cannot replace real hardware/software integration, which should happen as often as possible, but it can at least mitigate some of the risks by applying continuous integration between software and hardware components using hardware simulators.
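A minimal sketch of the hardware simulator approach is given below: a simulated drive exposes the same interface as a (hypothetical) real antenna drive, so control software and its tests can be continuously integrated on machines without hardware access. All class names and the environment variable are assumptions made for the example.

import os
import random


class AntennaDrive:
    """Driver talking to the real antenna pedestal (not shown here)."""

    def slew_to(self, az_deg, el_deg):
        raise NotImplementedError("requires access to real hardware")

    def position(self):
        raise NotImplementedError("requires access to real hardware")


class SimulatedAntennaDrive(AntennaDrive):
    """Software-only stand-in exposing the same interface, so control
    code and its tests run unchanged on machines without hardware."""

    def __init__(self):
        self._az, self._el = 0.0, 90.0

    def slew_to(self, az_deg, el_deg):
        # Instantaneous move plus a small pointing error to exercise
        # the calling code's tolerance handling.
        self._az = az_deg + random.uniform(-0.01, 0.01)
        self._el = el_deg + random.uniform(-0.01, 0.01)

    def position(self):
        return self._az, self._el


def make_drive():
    """Factory used by tests and CI: the real driver is selected only
    where the hardware is actually reachable."""
    if os.environ.get("SKA_HW_AVAILABLE") == "1":   # hypothetical flag
        return AntennaDrive()
    return SimulatedAntennaDrive()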
6. CONCLUSION

Agile software development has been successfully applied to large scale and distributed commercial software projects. The discussion presented in this paper highlights the applicability of the agile paradigm to large scale scientific software projects in general and to the SKA in particular. The main argument given here is the fact that the SKA requirements will be in flux throughout the design phase, but also during the construction phase. This requires a dynamic environment that can more easily cope with changing requirements without losing control of the deliveries. A properly implemented agile development environment can achieve a faster roll-out of the most critical functionality to the target users. This in turn will provide much faster feedback to the developers and thus naturally a more agile environment.
ACKNOWLEDGMENTS

ICRAR is a joint venture between Curtin University and the University of Western Australia and has received grants from the Western Australian Government. The Pawsey Centre is funded from Australian federal and Western Australian state grants.
REFERENCES

1. B. Glendenning and G. Raffi, "The ALMA computing project - Update and management approach," in ICALEPCS, E. C. Abstracts, ed., 29J, 2005.
2. K. Beck and M. Fowler, Planning Extreme Programming, Addison-Wesley Professional, 2001.
3. K. Beck and C. Andres, Extreme Programming Explained: Embrace Change, Addison-Wesley Professional, 2004.
4. M. Cohn, Succeeding with Agile: Software Development Using Scrum, Addison-Wesley Professional, 2009.
5. L. Crispin and J. Gregory, Agile Testing: A Practical Guide for Testers and Agile Teams, Addison-Wesley Professional, 2009.
6. K. Beck, Test Driven Development: By Example, Addison-Wesley Professional, 2002.
7. P. Duvall, S. Matyas, and A. Glover, Continuous Integration: Improving Software Quality and Reducing Risk, Addison-Wesley Professional, 2007.
8. J. Humble and D. Farley, Continuous Delivery: Reliable Software Releases through Build, Test and Deployment Automation, Addison-Wesley Professional, 2010.
9. G. Chiozzi, H. Sommer, and J. Schwarz, "ALMA common software architecture," tech. rep., European Southern Observatory, 2009. Available as http://www.eso.org/projects/alma/develop/acs/OnlineDocs/ACSArchitecture.pdf.
10. M. Bell, Service-Oriented Modeling (SOA): Service Analysis, Design, and Architecture, Wiley, 2008.
11. Object Management Group, "The CORBA standard," tech. rep., OMG, 2012. Available as http://www.omg.org/spec/CORBA/Current/.
12. ZeroC, "The ICE manual," tech. rep., ZeroC, 2012. Available as http://doc.zeroc.com/display/Ice/Ice+Manual.
13. TwinOaks, "Data distribution service (DDS) brief," tech. rep., TwinOaks, 2011. Available as http://www.omg.org/cgi-bin/doc?omg/11-08-01.pdf.
14. B. Humphreys and T. Cornwell, "Memo 132: Analysis of Convolutional Resampling Algorithm Performance," Tech. Rep. 132, SKA, 2011. Available as http://www.skatelescope.org/uploaded/59116_132_Memo_Humphreys.pdf.
15. D. Kusnetzky, Virtualization: A Manager's Guide, O'Reilly, 2011.
16. "The Square Kilometre Array design reference mission: SKA phase 1," tech. rep., SKA Science Working Group, 2011. Available as http://www.skatelescope.org/uploaded/18714_SKA1DesRefMission.pdf.