Feature: Software Development
Two-Layer Wrapping for COTS Software Integration: An Experience with Matlab
Emilio García-Roselló, Jacinto G. Dacosta, María J. Lado, Arturo J. Méndez, and Baltasar G. Perez-Schofield, University of Vigo
// A two-layer wrapping approach to integrating Matlab into real engineering software development projects shows better integration of domain-specific functionality and notable effort savings overall. //
Commercial off-the-shelf (COTS) software is now present in nearly all engineering domains, and some COTS products, such as Matlab, AutoCAD, Rational, and Excel, are widely used. Although these products are designed for use as stand-alone applications, their broad domain functionality makes them good candidates for integration in larger systems, where they can significantly reduce development efforts, enhance software reliability, and improve quality, which explains their increasing utilization [1]. Nevertheless, some disadvantages can make this integration difficult: for example, the dependency on providers, the difficulty of estimating integration efforts, and the need to develop specific adaptation code [2].

Because COTS products often include APIs, it’s tempting to consider them fairly integrable. However, APIs are sometimes complex and don’t lend themselves easily to integration efforts, which require precision. This may explain why research often describes the use of COTS products either “as is” or extended by integrating new modules, whereas research on their use as components inside larger systems is scarce. Most of the available research proposes broadly applicable solutions to COTS software integration [3, 4]. However, these solutions are inherently limited by the heterogeneous nature of both COTS products and possible application domains. Limited research addresses more specific integration problems in particular application domains, although these problems are among the costliest COTS software integration efforts [2].

From 2004 to 2010, we worked on the IMO.Net Matlab-Based Academic Tools project at the University of Vigo to develop software for the engineering domain. The project aimed to simplify COTS software integration in a single domain, hypothesizing that this would facilitate application development. We conducted the project as a case study centered on Matlab, which has extensive functionality in the engineering domain, making it an excellent candidate for integration in a wide range of applications. Over the course of the project, we designed a two-layer wrapping approach to improve Matlab integrability specifically in the engineering domain, which we tested with success in several development projects.

IEEE Software | Published by the IEEE Computer Society
0740-7459/12/$31.00 © 2012 IEEE

COTS Integrability: The Matlab Case
Measuring a software artifact’s reusability (which includes its integrability) strongly depends on the artifact’s specific characteristics, reuse context, and target domain [5, 6]. COTS software can make this measurement more difficult because it often offers no more than basic compliance with some platforms [7]. Therefore, the IMO.Net project’s first step was to apply a specific case-study approach to assess Matlab integrability regarding our particular development aims. We also had to decide whether we needed to enhance its capabilities regarding those aims.

Matlab is well suited for use as an end-user application. However, there are also several possibilities for extending its functionality, for example, by using toolboxes or integrating external components. Matlab also supports integration through either APIs for the C and Fortran languages or its Component Object Model (COM) Automation server capabilities. Both methods allow an application to open a Matlab session to exchange data and execute commands. The C and Fortran APIs obviously limit the use of other languages. They can also be complex because they lack type checking and require pointers and explicit dynamic memory management. The COM Automation interface offers almost the same capabilities but with the advantages of language independence and automated memory management. However, neither option offers a true object-oriented design, so developers must write explicit code that constructs command strings, passes them to the Matlab engine, and collects the results. This task can quickly become laborious, complex, and error-prone, requiring developers to know all the COTS software details and considerably increasing the integration effort. So neither mechanism offered good integrability in relation to our needs, mainly because they involve too low an abstraction level.

Matlab Builder is another option. It’s a package for building Java, .NET, and COM components from functions written in the Matlab language, which helps in building specific components that can be easily implemented using Matlab capabilities. However, it’s not a way of integrating Matlab as a whole. Furthermore, the resulting components don’t support some valuable Matlab capabilities, such as persistent work sessions and runtime interpretation, two features that are particularly useful in academic software [8, 9].

In summary, none of the native possibilities seemed satisfactory. In terms of the distinction between a component’s utility and usability [5], we saw that Matlab had a potentially large utility regarding our needs but lacked usability and, consequently, integrability. COTS integrability features can differ significantly from one domain to another, so this conclusion isn’t generalizable. However, for our project, we elected to design our own solution to improve Matlab integrability.

IMO.Net Project Approach
The IMO.Net project aimed to make it easier to integrate Matlab into academic engineering software development. We adopted a wrapping approach to reduce integration efforts and support an abstraction level adequate to academic engineering and its subdomains.

Making Matlab Generic Functionality More Reusable
We started with the assumption that reuse of a software asset implicitly indicates reuse of a domain model [5]. Furthermore, reuse probably indicates the usefulness of the domain model that the asset implements. The real challenge is to make this model easier to use, not to rebuild it. Therefore, our first step was to capture the Matlab model in an object-oriented model. Figure 1 is a UML diagram of the model. It shows a MatlabSession class that represents a Matlab work session, in which we can create variables of different types. We modeled the variables as a MatlabVar class, which is the base class of a set of classes for each basic Matlab data type. The basic way to perform operations in Matlab is by calling functions, so we modeled operations in a MatlabFunctionCaller class. Finally, we used a MatlabFigure class to model figure graphics objects that can be created and handled in a Matlab session.

This approach is different from a specific adaptation process. Domain requirements, not particular application requirements, guide the design and are captured by the model that’s implicit in the COTS software. It’s similar to a wrapper façade that provides concise object-oriented interfaces for encapsulating low-level COTS software functionality and data structures [10]. The main difference is the design approach. A wrapper façade begins by identifying abstractions and relationships among API functions, whereas our approach captures the COTS software’s implicit domain model, which is sometimes poorly represented by its API. We implemented this design as a component library called IMO.Net
Basic Library for Matlab. It simplifies Matlab integration by hiding the underlying complexity, allowing type checking, and reducing the need to handle string commands in code. This leads to shorter, less error-prone adaptation code. The library clearly improved the integration of Matlab functionality compared to native solutions.

Figure 1. A UML class diagram of the Matlab domain model. A MatlabSession class represents a Matlab work session for creating variables modeled as a MatlabVar class. A MatlabFunctionCaller class models operations, and a MatlabFigure class models figures that can be created in a MatlabSession. (Classes shown: MatlabSession, MatlabFigure, MatlabFunctionCaller, MatlabVar, MatlabBoolArray, MatlabStructArray, MatlabObject, MatlabNumericArray, MatlabCharArray, MatlabCellArray, and MatlabString; the associations carry 0..* multiplicities.)

Because academic software is frequently GUI-based, we complemented this solution with the IMO.Net DataViewers library, which includes visual viewers and editors for Matlab basic data types that can be connected to the IMO.Net Basic Library components. Developers can integrate all the components in a Visual Studio.Net environment at design time. (Technical details of the libraries are available elsewhere [11].)

Different developers have used both libraries to build several applications, for example, a picoamperimeter data-capturing application and a water-quality assessment system [11]. Another example is R-Interface, an application that provides an alternative GUI for Matlab, designed for educational purposes. The underlying Matlab engine supports all its functionality, which is easily integrated through the IMO.Net Basic Library. The GUI design was also dramatically simplified thanks to the IMO.Net DataViewers controls.

ModelLab, a modeling application oriented toward environmental studies, also used both libraries. It originated from software that our research team developed earlier in a laborious process that involved building the equation evaluation from scratch. In ModelLab, we reengineered the software, implementing this functionality simply by integrating the Matlab engine through the IMO.Net Basic Library. This simplified development considerably while also adding computational power.

IEEE Software | July/August 2012
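The first-layer idea described above can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the IMO.Net libraries target .NET, so none of the signatures below are the library’s actual API. The class names MatlabSession, MatlabVar, MatlabNumericArray, and MatlabFunctionCaller are borrowed from Figure 1; FakeEngine, its two toy functions, and every method name are assumptions introduced only so the example can run without Matlab.

```python
import re
from typing import Callable, Dict, List, Sequence


class FakeEngine:
    """Hypothetical stand-in for a command-string engine; not a Matlab API."""

    _FUNCS: Dict[str, Callable[[List[float]], List[float]]] = {
        "sum": lambda xs: [sum(xs)],
        "cumsum": lambda xs: [sum(xs[: i + 1]) for i in range(len(xs))],
    }
    _PATTERN = re.compile(r"^(\w+) = (\w+)\((\w+)\);$")

    def __init__(self) -> None:
        self.workspace: Dict[str, List[float]] = {}
        self.commands: List[str] = []

    def put(self, name: str, values: Sequence[float]) -> None:
        self.workspace[name] = [float(v) for v in values]

    def get(self, name: str) -> List[float]:
        return list(self.workspace[name])

    def execute(self, command: str) -> None:
        # Low-level style: the caller must supply a well-formed command string.
        self.commands.append(command)
        match = self._PATTERN.match(command)
        if match is None:
            raise ValueError(f"unsupported command: {command!r}")
        target, func, source = match.groups()
        self.workspace[target] = self._FUNCS[func](self.workspace[source])


class MatlabVar:
    """Typed handle to a variable living in a session (base class)."""

    def __init__(self, session: "MatlabSession", name: str) -> None:
        self.session = session
        self.name = name


class MatlabNumericArray(MatlabVar):
    @property
    def values(self) -> List[float]:
        return self.session.engine.get(self.name)


class MatlabSession:
    """Owns the engine and hands out typed variables."""

    def __init__(self, engine: FakeEngine) -> None:
        self.engine = engine
        self._count = 0

    def new_name(self) -> str:
        self._count += 1
        return f"v{self._count}"

    def numeric_array(self, values: Sequence[float]) -> MatlabNumericArray:
        if not all(isinstance(v, (int, float)) for v in values):
            raise TypeError("a numeric array accepts numbers only")
        name = self.new_name()
        self.engine.put(name, values)
        return MatlabNumericArray(self, name)


class MatlabFunctionCaller:
    """Builds the command string so application code never has to."""

    def __init__(self, session: MatlabSession, function: str) -> None:
        self.session = session
        self.function = function

    def call(self, arg: MatlabNumericArray) -> MatlabNumericArray:
        if not isinstance(arg, MatlabNumericArray):
            raise TypeError("expected a MatlabNumericArray")
        out = self.session.new_name()
        self.session.engine.execute(f"{out} = {self.function}({arg.name});")
        return MatlabNumericArray(self.session, out)


session = MatlabSession(FakeEngine())
data = session.numeric_array([1, 2, 3])
total = MatlabFunctionCaller(session, "sum").call(data)
print(total.values)  # prints [6.0]
```

Application code here never assembles a command string or tracks variable names by hand; the wrapper does both and rejects arguments of the wrong type before anything reaches the engine.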
Enhancing Subdomain Functionality Reuse
Over the course of developing the generic part of our wrapping solution, it became clear that reusability was still lacking in the development of applications that required subdomain-specific Matlab functionalities, such as toolboxes. Modules such as Maple toolboxes or Excel add-ons that integrate more specific functionality for particular subdomains are a common feature of large COTS software with broad domain functionality. In Matlab’s case, a large toolbox collection in many subdomains offers quite complex functionalities, so the ability to easily integrate this functionality in other software applications could provide significant benefits.

IMO.Net Basic Library offered some benefits in these cases, but clearly less than it did for applications with more generic requirements regarding Matlab. This isn’t surprising because a component rapidly loses reusability as requirements become more domain specific [5]. As far as we know, the research literature hasn’t explicitly addressed subdomain functionality until now. We opted to make this functionality easier to integrate in other applications by reapplying a wrapper façade-based process similar to the previous wrapping process but with significant differences in the design and implementation.

Second-layer design. At first, we tried to proceed as we did before, capturing the implicit subdomain model, typically contained in a single toolbox, and using it to design a wrapper that makes reuse easier. However, this approach proved less suitable for subdomain functionality, mainly because toolboxes are developed on top of the hosting COTS functionality, so they inherit its behavioral style. With Matlab, such modules are commonly oriented to interactive use and often have a weakly typed and function-based style. Combining these features with specific, complex functionality can make it difficult to integrate the implicit subdomain model. For example, Figure 2a shows the domain model for the Matlab Neural Networks Toolbox. It relies on a complex data structure that lets subobjects change dynamically to model different types of artificial neural networks
(ANNs) and a large function collection to handle this structure. It seemed clear from an object-oriented viewpoint that a class-based hierarchy with the appropriate methods for each ANN type would be far easier to reuse. Moreover, a higher degree of encapsulation should be an advantage, given that users interested in reusing these subdomain functionalities probably don’t need or want to manage lower-level aspects such as the underlying Matlab session or variables. So for this wrapping process, we applied a stricter wrapper façade approach, using the low-level functions and data structures of the piece to be wrapped to define higher-level abstractions [10]. Figure 2b shows the resulting class structure.

Figure 2. UML class diagrams: (a) the domain model structure of the Matlab Neural Networks Toolbox and (b) the structure resulting from the wrapping façade design. Although the Matlab Neural Networks Toolbox domain model relies on a single complex structure that can change dynamically to model different types of neural networks, the wrapper façade result offers a more object-oriented class-based hierarchy that’s easier to reuse. (Labels in panel (a): Matlab Object; Matlab Neural Network; Inputs, Outputs, Layers, Biases, Input Weights, and Layer Weights SubObject. Labels in panel (b): IMOMatlabNeuralNetBaseClass, IMOMatlabNeuralNetworkSOM, IMOMatlabNeuralNetworkPerceptron, IMOMatlabNeuralNetworkLinealFilters, IMOMatlabNeuralNetworkRBFNetworks, and IMOMatlabNeuralNetworkBackpropagation.)

Second-layer implementation. We implemented the subdomain library by reusing the wrapper library developed earlier to access the COTS generic functionality. This resulted in a two-layer wrapping that’s more affordable than developing a library either from scratch or for two different COTS products. Furthermore, using the same underlying COTS software for all wrapper libraries can dramatically enhance functionality integration by allowing data sharing and interaction features.

We used this approach to develop the IMO.Net Neural Networks library [12] and other subdomain-specific component libraries based on the Matlab Wavelets Toolbox, notably the IMO.Net Matrix Library and IMO.Net Wavelets [13]. The IMO.Net Wavelets library wraps and models the large collection of Matlab Wavelets Toolbox functions using only two classes: one for encapsulating 1D data and the other for 2D data. The classes offer the corresponding wavelet methods for each data type, provide type checking, and hide lower-level details such as session or command handling, making development easier.

Several software development projects have fruitfully integrated these component libraries, including parts of a water-quality assessment model, an application to predict some chemical properties using ANNs, an empowered version of an ANN learning environment [14], and an intuitive GUI to use wavelets [13].
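The second-layer arrangement just described, a subdomain class that exposes typed methods while delegating to the first-layer wrapper, can be sketched as follows. This is an illustration in Python under stated assumptions rather than the IMO.Net Wavelets API: FirstLayerSession, Wavelet1D, decompose, and the toy "pairwise_step" function (pairwise averages and differences, computed locally instead of by a toolbox) are all hypothetical names chosen so the example runs on its own.

```python
from typing import List, Optional, Sequence, Tuple


class FirstLayerSession:
    """Hypothetical stand-in for the first-layer wrapper over the COTS engine."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def call(
        self, function: str, samples: Sequence[float]
    ) -> Tuple[List[float], List[float]]:
        # A real first layer would forward this call to the COTS engine; one
        # toy function is computed locally here so that the sketch can run.
        self.calls.append(function)
        if function != "pairwise_step":
            raise ValueError(f"unknown function: {function}")
        pairs = list(zip(samples[0::2], samples[1::2]))
        averages = [(a + b) / 2 for a, b in pairs]
        differences = [(a - b) / 2 for a, b in pairs]
        return averages, differences


class Wavelet1D:
    """Second-layer class: exposes subdomain methods and hides the session."""

    def __init__(
        self,
        samples: Sequence[float],
        session: Optional[FirstLayerSession] = None,
    ) -> None:
        if len(samples) == 0 or len(samples) % 2 != 0:
            raise ValueError("expected a non-empty, even number of samples")
        self._samples = [float(x) for x in samples]
        # The session is created on demand, so callers never have to manage it.
        self._session = session if session is not None else FirstLayerSession()

    def decompose(self) -> Tuple[List[float], List[float]]:
        """One-level split into pairwise averages and differences."""
        return self._session.call("pairwise_step", self._samples)


signal = Wavelet1D([4, 2, 6, 8])
approximation, detail = signal.decompose()
print(approximation, detail)  # prints [3.0, 7.0] [1.0, -1.0]
```

A 2D counterpart would follow the same pattern with its own validation and methods. Because the second layer only talks to the first layer, several subdomain libraries could share one session, which is the data-sharing benefit noted above.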
Overall Process Schema
Figure 3 schematizes the overall process and shows the solution architecture developed for Matlab. Dashed arrows relate a process step to a corresponding element in the architecture. As the diagram shows, the process can follow different paths, depending on COTS integration features as well as the target application domain.
Figure 3. (a) The overall commercial off-the-shelf (COTS) wrapping process diagram and (b) architecture of the developed solution for Matlab. The process considers COTS integrability to decide whether to apply different wrapper approaches at different levels (generic domain and subdomains). (Panel (a) labels: Assess COTS generic integrability; Adequate?; Capture COTS domain model; Develop first-layer wrapper guided by COTS domain; Subdomain functionality needed; Develop second-layer wrappers using a wrapper façade process and reusing first-layer wrapper; Develop software integrating the COTS. Panel (b) labels: Matlab COTS and toolboxes; IMO.Net Basic Library for Matlab, first-layer wrapper façade; IMO.Net Artificial Neural Networks library and IMO.Net Wavelets library, second-layer wrapping façade; software applications integrating COTS, such as R-interface, ModelLab, and Neurolab.)
Some COTS products, such as Maple (through OpenMaple) or Excel, offer comprehensive, object-oriented APIs that might be adequate for integrating their generic functionality in a wide range of application domains. Such APIs are equivalent to our approach’s first-layer wrapping. On the other hand, applying a process similar to our second-layer wrapping could simplify the integration of subdomain functionality, such as a Maple toolbox. Developers can repeat this part of the process for different subdomains, as we did in the IMO.Net project, to generate several specific second-layer wrappers. Software applications that integrate COTS products can use first- or second-layer wrapping services, depending on the required functionality.
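One way to read the process described in this section is as a small decision procedure. The sketch below encodes that reading in Python; the function name plan_wrapping and its parameters are hypothetical, the step labels are taken from Figure 3, and the branch logic is an interpretation of the text (an adequate native API stands in for the first layer) rather than a transcription of the diagram.

```python
from typing import List


def plan_wrapping(native_api_adequate: bool, subdomains: List[str]) -> List[str]:
    """Return the ordered process steps for integrating one COTS product."""
    steps = ["Assess COTS generic integrability"]
    if not native_api_adequate:
        # No adequate object-oriented API: build the first layer from the
        # domain model that is implicit in the COTS product.
        steps.append("Capture COTS domain model")
        steps.append("Develop first-layer wrapper guided by COTS domain")
    for subdomain in subdomains:
        # Repeated once per subdomain, always on top of the generic layer.
        steps.append(f"Develop second-layer wrapper for {subdomain}")
    steps.append("Develop software integrating the COTS")
    return steps


for step in plan_wrapping(False, ["neural networks", "wavelets"]):
    print(step)
```

Under this reading, the Matlab case corresponds to native_api_adequate=False with one second-layer wrapper per toolbox, whereas a product with a comprehensive object-oriented API would skip directly to the second-layer steps.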
Reuse Analysis Results
To measure our approach’s benefits, we used the classic Gaffney and Durek reuse metric, considered one of the best economic reuse models [15]. This metric defines the relative effort, C, of a software development as

$$ C = \left(b + \frac{E}{n} - 1\right) R + 1, $$

where R is the proportion of reused code in the software, b is the relative effort of reusing code, E is the relative effort of developing code for reuse, and n is the number of times a piece of code is reused. Because all the IMO.Net development projects integrated Matlab, we used it as the neutral state for our measures, assuming R = 0 in this case and C = 100 percent. With these values, R will consider only the wrapper layers’ reuses, and any C value lower than 100 percent will indicate that the reuse saves development effort. For the formula constants, we used the most widely accepted values: b = 0.2 and E = 1.5 [15].

Tables 1 and 2 present our results. We estimated the E/n ratio in Table 1 for each developed component library according to the number of times we reused it as part of other software. The total count of 22 reuses (most instances have been reused more than once) for the four libraries matches the total number of times that these libraries show values greater than 0 in Table 2’s columns representing the reuse percentage in the different development projects. We used this data together with the E/n ratios from Table 1 to calculate a weighted
mean in Table 2 and, finally, the relative effort of software development C and corresponding savings (1 − C).

Table 1. E/n ratio estimation for each developed IMO.Net component library.

Component library            Number of reuses (n)   E/n ratio
IMO.Net Basic Library        11                     13.6%
IMO.Net Data Viewers         7                      21.4%
IMO.Net Matrix Library       0                      Not applicable
IMO.Net Neural Networks      3                      50.0%
IMO.Net Wavelets             1                      150.0%

Table 2. Relative software development effort for each developed project. The four library columns give the percent of code reused from each IMO.Net component library; Mean E/n is the mean E/n ratio of reused code; C is the relative effort of software development; 1 − C is the relative savings in software development.

Software development                 Basic Library    Data Viewers   Neural Networks   Wavelets   Mean E/n         C        1 − C
IMO.Net Basic Library                Not applicable   0%             0%                0%         Not applicable   100%     0%
IMO.Net Data Viewers                 100%             0%             0%                0%         13.6%            78.8%    21.2%
IMO.Net Matrix Library               74%              0%             0%                0%         13.6%            67.1%    32.9%
IMO.Net Artificial Neural Networks   81%              0%             0%                0%         13.6%            80.7%    19.3%
IMO.Net Wavelets                     71%              0%             0%                0%         13.6%            63.0%    37.0%
R-interface                          100%             100%           0%                0%         18%              49.0%    51.0%
pAmperimeter data capture            59%              5%             0%                0%         14.2%            61.9%    38.1%
Chemical property prediction         49%              8%             18%               0%         23.1%            63.0%    37.0%
Water quality assessment             59%              6%             21%               0%         23.3%            77.4%    22.6%
ModelLab                             49%              4%             0%                0%         14.3%            86.3%    13.7%
Visual NeuralNet                     85%              7%             100%              0%         33%              69.2%    30.8%
Visual WaveletLab                    74%              9%             0%                96%        87%              103.9%   −3.9%

Table 2 shows notable savings in most of the projects, with an average savings of 25 percent. Additionally, most component library development efforts benefited from integrating the lower-level wrapper, as we pointed out earlier. The only current exceptions are IMO.Net Matrix Library, which is scheduled for an upcoming project but hasn’t yet been reused, and the IMO.Net Wavelets component library, which has been reused only once (n = 1), leading to a high E/n ratio that explains the negative savings value of Visual WaveletLab. But these two results are temporary. As soon as those wrapper libraries are regularly reused in new projects, the E/n ratio will decrease, which in turn will increase savings.

Beyond the direct benefits of our approach, we also expect benefits
indirectly from an increase in COTS reuse, as developers see that it’s easier to integrate.

About the Authors

Emilio García-Roselló is an associate professor in the University of Vigo’s computer science department. His research interests include reusability and component-based software engineering. García-Roselló received his PhD in computer science from the University of Vigo. Contact him at [email protected].

Jacinto G. Dacosta is an associate professor in the University of Vigo’s computer science department. His research interests focus on educational software reuse. Dacosta received his PhD in computer science from the University of Vigo. Contact him at [email protected].

María J. Lado is an associate professor in the University of Vigo’s computer science department. Her research interests include computer-aided diagnosis and learning environments. Lado received her PhD in physics from the University of Santiago de Compostela. Contact her at [email protected].

Arturo J. Méndez is an associate professor in the University of Vigo’s computer science department. His research interests include computer-aided diagnosis and educational software. Méndez received his PhD in physics from the University of Santiago de Compostela. Contact him at [email protected].

Baltasar G. Perez-Schofield is an associate professor in the University of Vigo’s computer science department. His research interests include persistence, object-orientation, and dynamic languages. Perez-Schofield received his PhD in computer science from the University of Vigo. Contact him at [email protected].
The positive results we achieved with our two-layer wrapping suggest that it could improve COTS integrability and reduce development efforts in other software projects. The approach’s generality is inherently limited, but like any case-study approach, it contributes to the scientific knowledge in a domain and can be transposed, fully or partially, to other similar situations.
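For readers who want to check the figures in the reuse analysis, the Gaffney and Durek relative effort C = (b + E/n − 1) × R + 1 can be evaluated with a few lines of Python. The constants b = 0.2 and E = 1.5 and the reuse counts come from the article. The function names are hypothetical, and the weighting scheme is an assumption: the mean E/n ratio is taken to be weighted by the Table 2 percentages, which reproduces the values reported for Visual NeuralNet (33%) and Visual WaveletLab (87%). The article does not report R per project, so the value of R used at the end is purely illustrative.

```python
from typing import Dict

B = 0.2  # relative effort of reusing code (value used in the article)
E = 1.5  # relative effort of developing code for reuse (value used in the article)

# Number of reuses n per library, from Table 1.
REUSES: Dict[str, int] = {
    "Basic Library": 11,
    "Data Viewers": 7,
    "Neural Networks": 3,
    "Wavelets": 1,
}


def e_over_n(n: int) -> float:
    """E/n ratio for a library reused n times (n must be at least 1)."""
    return E / n


def weighted_mean_ratio(weights: Dict[str, float]) -> float:
    """Mean E/n ratio of the reused code, weighted by the given percentages."""
    total = sum(weights.values())
    return sum(w * e_over_n(REUSES[lib]) for lib, w in weights.items()) / total


def relative_effort(r: float, mean_ratio: float) -> float:
    """Gaffney and Durek relative effort C for a reused-code proportion r."""
    return (B + mean_ratio - 1.0) * r + 1.0


# Table 2 percentages for Visual NeuralNet.
visual_neuralnet = {"Basic Library": 85, "Data Viewers": 7, "Neural Networks": 100}
ratio = weighted_mean_ratio(visual_neuralnet)
print(f"mean E/n = {ratio:.0%}")  # prints mean E/n = 33%

# Illustrative only: R = 0.5 is an assumed value, not taken from the article.
print(f"C = {relative_effort(0.5, ratio):.1%}")
```

With R = 0 the function returns exactly 1, which matches the neutral state of 100 percent used as the baseline.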
Acknowledgments
Xunta de Galicia has partially supported this work under grant PGIDIT06SIN30501PR.
References
1. B. Boehm et al., “Composable Process Elements for Developing COTS-based Applications,” Proc. 2003 Int’l Symp. Empirical Software Eng. (ISESE 03), IEEE CS, 2003, pp. 8–17.
2. C. Abts, B. Boehm, and E.B. Clark, “COCOTS: A COTS Software Integration Lifecycle Cost Model,” Proc. European Software Control and Metrics Conf. Software Certification Programme in Europe (ESCOM-SCOPE) 2000 Conf., Shaker Publishing, 2000, pp. 325–333.
3. A. Egyed and R. Balzer, “Integrating COTS Software into Systems through Instrumentation and Reasoning,” J. Automated Software Eng., vol. 13, no. 1, 2006, pp. 41–64.
4. L. Mariani and M. Pezzè, “Dynamic Detection of COTS Component Incompatibility,” IEEE Software, vol. 24, no. 5, 2007, pp. 76–85.
5. H. Mili et al., Reuse-Based Software Engineering, John Wiley & Sons, 2002.
6. J. Poulin, “Measuring Software Reusability,” Proc. 3rd Int’l Conf. Software Reuse, IEEE, 1994, pp. 126–138.
7. B. Boehm and C. Abts, “COTS Integration: Plug and Pray?” Computer, vol. 32, no. 1, 1999, pp. 135–138.
8. M. Knudsen, J. Nielsen, and J. Østergaard, “A Computer Based Instruction in Matlab,” Proc. 4th Int’l Conf. Computer Based Learning in Science (CBLIS 99), Univ. South Bohemia, 1999, pp. 10–17.
9. M.J. Lado et al., “R-Interface: An Alternative GUI for Matlab,” Computer Applications in Eng. Education, vol. 14, no. 4, 2006, pp. 313–320.
10. D. Schmidt, “Wrapper Facade. A Structural Pattern for Encapsulating Functions within Classes,” C++ Report, vol. 11, no. 2, 1999, pp. 40–50.
11. E. García-Roselló et al., “A Component Framework for Reusing a Proprietary Computer-Aided Engineering Environment,” Advances in Eng. Software, vol. 38, no. 4, 2007, pp. 256–266.
12. A. Méndez et al., “Integrating Matlab Neural Networks Toolbox Functionality in a Fully Reusable Software Component Library,” Neural Computing & Applications, vol. 16, no. 4, 2007, pp. 471–479.
13. E. García-Roselló et al., “Visual WaveletLab: An Object-Oriented Library and a GUI Application for the Study of the Wavelet Transform,” Computer Applications in Eng. Education, vol. 20, no. 2, 2011; doi:10.1002/cae.20524.
14. E. García-Roselló et al., “Neuro-Lab: A Highly Reusable Software-Based Environment to Teach Artificial Neural Networks,” Computer Applications in Eng. Education, vol. 11, no. 2, 2003, pp. 93–102.
15. J. Poulin, Measuring Software Reuse, Addison-Wesley, 1997.
Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.