Technical Report TR06-025 Department of Computer Science Univ. of North Carolina at Chapel Hill
SPQR: Formal Foundations and Practical Support for the Automated Detection of Design Patterns from Source Code
Jason McC. Smith Department of Computer Science University of North Carolina Chapel Hill, NC 27599-3175
[email protected]
Dec 12, 2005
SPQR: Formal Foundations and Practical Support for the Automated Detection of Design Patterns From Source Code
Jason McColm Smith
A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science.
Chapel Hill
2005

Approved by:
P. David Stotts, Professor
Siddhartha Chatterjee, Reader
David Plaisted, Reader
Jan Prins, Reader
Albert H. Segars, Reader
© 2005
Jason McColm Smith
ALL RIGHTS RESERVED
ABSTRACT

JASON MCCOLM SMITH: SPQR: Formal Foundations and Practical Support for the Automated Detection of Design Patterns From Source Code.
(Under the direction of P. David Stotts.)

Maintenance costs currently comprise the majority of total costs in producing software. While object-oriented techniques and languages appear to have assisted in the production of new code, there is little evidence to support the theory that they have helped lower the high cost of maintenance. In this dissertation, I describe the current problem and provide a system ultimately aimed at reducing this cost.

The System for Pattern Query and Recognition, or SPQR, consists of: the rho-calculus, a formal foundation for conceptual relationships in object-oriented systems; a suite of Elemental Design Patterns that capture the fundamentals of object-oriented programming and their expressions in the rho-calculus; an XML Schema, the Pattern/Object Markup Language, or POML, to provide a concrete method for expressing the formalisms in a practical manner; an example mapping from the C++ programming language to POML; and an implementation that ties the above components together into a practical tool that detects instances of design patterns directly from source code using the Otter automated theorem prover.

I will discuss each of the components of the system in turn, and relate them to previous research in the area, as well as provide a number of future research directions. Using the results of SPQR, a system can be more easily documented and understood. The major contribution of SPQR is the flexible detection of design patterns using the formalisms of rho-calculus instead of static structural cues.

Building on SPQR, I propose: a suite of metrics utilizing the Minimum Description Length principle that capture the salient conceptual features of source code as expressed in design patterns nomenclature as a method for measuring comprehensibility of code; and an approach for mapping these metrics to cost-based metrics from current management principles. This combination should prove to be effective in facilitating communication between technical and managerial concerns in a manner that allows for the most efficient allocation of resources during maintenance of software systems.
ACKNOWLEDGMENTS
My deepest thanks to all those friends and family who have provided me with support, and patience...

...my advisor and friend, Dr. David Stotts. Here’s to many more years of fruitful collaboration.

...my readers, editors, and all who have contributed to the discussions that proved to be so important in this endeavor.

...Dr. Marjorie Olmstead. Sometimes a simple compliment really does make all the difference in a life.

...Leandra Vicci. Your support was instrumental in my being here.

...but most of all to my wife, Leah, who has stood by me in each of the above ways, and so many more.
CONTENTS

LIST OF FIGURES
LIST OF TABLES

1 Introduction
    1.1 Introduction
    1.2 Maintenance
    1.3 Difficulties of Maintenance
        1.3.1 Technical
        1.3.2 Managerial
        1.3.3 Psychological Limitations
    1.4 Object-Oriented Programming
    1.5 Patterns
        1.5.1 Formalization of Patterns
        1.5.2 Documentation Divide
    1.6 Metrics
        1.6.1 Syntactic metrics
    1.7 Concepts Programming
        1.7.1 From code to comprehension
        1.7.2 Pattern Compilers
    1.8 Conceptual metrics
        1.8.1 Design metrics
        1.8.2 Malignant pattern detection
        1.8.3 Multi-scale analysis
        1.8.4 Relationships
    1.9 SPQR
    1.10 Advantages
        1.10.1 Technical
        1.10.2 Managerial
        1.10.3 Human limitations
    1.11 Unresolved issues
        1.11.1 No silver bullet
        1.11.2 False positives
        1.11.3 Maintenance
    1.12 Summary

2 Related Work
    2.1 Decomposition of patterns
        2.1.1 Refactoring Approaches
        2.1.2 Fragments
        2.1.3 Minipatterns
        2.1.4 Structural Analyses
        2.1.5 Conceptual Relationships
        2.1.6 Other Work
    2.2 Analysis techniques
        2.2.1 Cohesion and coupling analysis
    2.3 Quantitative metrics
    2.4 Formalization of Patterns
        2.4.1 LePUS
        2.4.2 Other approaches
    2.5 Pattern Detection and Validation
        2.5.1 Pattern Enforcing Compiler
        2.5.2 FUJABA
        2.5.3 RML and CrocoPat

3 Sigma Calculus
    3.1 Basics of ς-calculus
        3.1.1 Objects, method, fields and types
        3.1.2 Selection and update
        3.1.3 Reduction rules
        3.1.4 Classes and other constructs
    3.2 Inflexibility of ς-calculus

4 Rho Calculus
    4.1 Notation Conventions
    4.2 Direct Definitions
4.3
4.2.1
Method-body-based reliance operators: