Convergence. ⢠Unified by transformations. ⢠mapping of programs to programs (or models to models). ⢠MDE. â map
Program Kubes
Don Batory Department of Computer Sciences University of Texas at Austin
1
Introduction • Future of software development (SWD) lies in automation • automate rote, time-consuming, error-prone tasks • three th ttechnologies h l i th thatt automate t t such h ttasks k will ill converge
• Model Driven Engineering (MDE) • specify target program by a set of high high-level level models which are easy to understand, write, and maintain • program is synthesized by transforming models to executables
• Refactoring R f t i • reorganize programs/models using transformations to improve structure
• Software Product Lines (SPL) • create a family of related programs from a common set of assets • automatically synthesize an SPL member from a declarative specification
2
Convergence • Unified by transformations • • • •
mapping of programs to programs (or models to models) MDE – map models of one type to another Refactoring – restructure models SPL – elaborate models by adding more detail
• E Emerging i Science S i off Automated A t t d SWD from f experiences i off practitioners • compositional p – based on function ((transformation)) composition p • fundamental mathematical structures provide an informal language to express program/model designs • SPL modeled by categories (POPL 2007, ICSE 2007, GPCE 2008) • theoretical basis for future tools and models for synthesis
3
Why Transformations? •
Java and C# programs use methods to update and translate objects • “programming in the small”
•
In SWD, objects are programs and methods are transformations • “programming p g g in the large” g
•
Transformations provide a fundamental way to understand SWD • foundation for MDE MDE, refactorings refactorings, and software product lines
•
Thinking mathematically leads to unusual design techniques that take time to understand • this talk is about one particular example
4
Long, Long Ago… •
In 2001, we (Lopez-Herrejon, et al) discovered a multi-dimensional structure to express interacting dimensions of variability in SPLs
•
Called “Origami” • design of a program P was defined by a matrix of transformations • rows, columns l were ffeatures t • matrix was folded in precise ways (thereby composing transformations) until a single term was produced • this term is an expression p ((composition p of transformations)) that synthesizes y P
P = •
technique that others will use in the future
Essential concept in building ATS (250K+ LOC) 5
“This Works???” • Take a “physics” approach – small number of principles (in mathematical form) that hold this entire universe together • Origami is sophisticated technique – taken years to connect seeming unrelated topics that it integrates • • • •
data cubes tensors, categories expression problem feature interactions
database technology mathematics programming languages software design g
• This is a modeling talk aimed at practitioners • no special mathematical background • program synthesis th i and d variability i bilit expressed d by b modern d mathematics th ti • raise the level on how to think about SWD and variability
• Your tour g guide through g a strange g universe… 6
Background from Databases: Data Cubes 7
Data Cubes • Data Cubes (Gray 1997) are a multi-dimensional array visualization of relational tables for data warehouses
Dimension Di i Attributes
Measures M Attribute
Items
Time
Tires Bikes Tools
Jan
Items
Location
Time
QtySold
t l tools
usa
f b feb
4
F b Feb
tires
spain
jan
3
Mar
bikes
france
may
30
Apr
…
…
…
…
May
USA
Spain
France Greece
Location 8
Data Cubes • Data Cubes (Gray 1997) are a multi-dimensional array visualization of relational tables for data warehouses • tuples are cube entries
Dimension Di i Attributes
M Measure Attributes
Items
Location
Time
QtySold
t l tools
usa
f b feb
4
4
tires
spain
jan
3
3
bikes
france
may
30 30
…
…
…
Items
Time
Tires Bikes Tools
Jan F b Feb Mar Apr
…
May
USA
Spain
Greece France
Location 9
Cube Queries • Subcubes restrict dimensional values • count # of bikes, tools sold in Europe in January, March, April
It Items
Time
Tires Bikes Tools
Jan Feb eb Mar Apr May
USA
Spain
France Greece
Location
10
Data Cube Operations • Projection eliminates unnecessary elements • count # of bikes, tools sold in Europe in January, March, April
It Items
Time
Tires Bikes Tools
Jan Feb eb Mar Apr May
USA
Spain
France Greece
Location
11
Data Cube Operations • Aggregate or roll-up elements to produce a scalar • scalar totals the # of bikes, tools sold in Europe in January, …
Items Bikes Tools
Time Jan
421
Mar a Apr
Spain
France Greece
Location
Order in which dimensions are aggregated doesn’t matter 12
The Kube Data Type
13
Kubes: N-Dimensional Arrays • Borrow terminology of tensors • • • •
rank of kube is its dimensionality scalar l iis a kkube b off rank k0 vector is a kube of rank 1 matrix is a kube of rank 2
• Notation: # of indices = rank of kube S
– kube of rank 0
(scalar)
Vk
– kube of rank 1
(vector – linear array)
Mij
– kube k b off rank k2
( t i – 2D array)) (matrix
Cijk
– kube of rank 3
(cube – 3D array)
14
Kube Operations • Roll-up is contraction • summing across dimensions i, j for kube Cijk:
Vk =
Σij Cijk
• Interested (in this talk) on contracting to scalars
S=
Σijk Cijk
order in which dimensions are contracted does not matter
• Projection limits values of indices
S =
Σi∈I, j∈J, k∈K Cijk 15
Previous Example Items
Time
Tires Bikes Tools
Jan Feb Mar
Cilt =
Apr May
USA
Spain
France Greece
Location
S = Σi∈{{Bikes,Tires}} l∈ {Spain,France,Greece} t∈{Jan,Mar,Apr} Cilt 16
Software Synthesis • Kube expresses multidimensional models of variability • Data cubes aggregate numbers, program synthesis composes transformations • element is a program transformation • dimensions we will see are defined by feature models F
G
F1 F2 F3
G1 G2
P Program =
Σ
G3 G4 G5
H1
H2
H3
H4
H 17
Example: AHEAD Tool Suite • Language, tool extensible IDE • synthesize tools for dialects of Java • optional extensions to Java
St t State_machine hi SM { States g, h, i; T Transition iti e1 1 : g Æ h ...;
refines class CL { ... }
Transition e2 : h Æ i ...; refines interface IN { ... }
... }
state machine
refines 18
AHEAD Tool Suite • Language, tool variability expressed as a 2D kube • ASE 2002 and SIGSOFT/FSE 2003
Lang Java D Ds Sm
Java J Language Features
Refines Quote
Tool
Parse Harvest Reduce Doclet
Pretty
Tool Features 19
AHEAD Tool Suite • AHEAD Tool specified by 2 sets of features • Jedi – javadoc tool for a dialect of Java
Jedi =
Lang
Σ
Java D Ds Sm Refines Quote
Tool
Parse Harvest Reduce Doclet
Pretty
20
Synthesize Jedi • Project matrix
Jedi =
Lang
Σ
Java D Ds Sm Refines Quote
Tool
Parse Harvest Reduce Doclet
Pretty
21
Synthesize Jedi • Contract matrix • Defines composition of transformations to synthesize Jedi
Jedi =
Σ
Lang Jedi
Java S Sm Refines
Tool
Parse Harvest Doclet
Most AHEAD Tools ((250K+ LOC)) are built by y contracting g 2D, 3D kubes; see ASE 2002, FSE 2003 papers 22
Here’s What’s Odd… • Lots of other ways to contract matrix – Each contraction p produces a different expression p – Function composition (unlike integer addition) is not commutative
Jedi =
Σ
Lang Jedi
Java S Sm Refines
Tool
Parse Harvest Doclet
23
Back to Basics: Transformations in SWD 24
Basics • Fact: no program is created spontaneously • no MDE model, Word document, etc… • created by extending simpler programs • and simpler programs come from simpler programs, recursively
0 P1
P2
P3
direction of time 25
Incremental Development • Going forward in time explains how a program’s design was developed p incrementally, y, in logical g steps p • incremental development • hallmark of automated program synthesis
0 P1
P2
P3
direction of time 26
Timelines • Think of arrows as transformations • mappings from one program to another • path from 0 to a program is its timeline • when arrows are given semantic meaning and are reusable they are called features
0 P1
P2
P3
direction of time 27
Example: Expression Trees • Parse tree of an expression
+ 2
• print() operation on trees:
3
+ print
= “2 + 3” 2
• eval() l() operation ti on trees: t
3
+ eval
=5 2
3
28
Expression Tree Program •
Arrow either defines base program or adds operations
Exp
0
shell
Exp
Exp
print()
print()
Int
Plus
Times
eval()
eval
print Int print()
Plus print()
Int print() eval()
Times print()
Plus print() eval()
Times print() eval()
Methods Product Line 29
Properties of Arrows x
• Arrows are composable
z y
Exp
Exp p print()
Int
print Plus
Int print() Plus print()
Times
Times print()
30
Software Product Lines •
Family of related programs
•
Features are increments in program functionality (arrows) • SPL building blocks
•
C Conceptualize t li SPL as a di directed t d graph h • nodes are programs, arrows are features • paths from 0 are program timelines • superimpose the timelines of all programs in an SPL
0
shell
print
eval
Methods Product Line Graph 31
Typically • Product line graphs are trees • one timeline for every program in a product line
32
Categories • Add identity arrow to each node
0
shell
print
eval
Category of the Methods Product Line
• Resulting graph (& arrow composition rules) is a category • object is a domain, arrow maps domain to co-domain co domain
• Software product line = micro (trivial) category • each domain contains one element 33
Typical Implementations of SPL categories and arrows
34
Feature Models • Standard representation of SPLs (Kang 1990) • and-or tree with cross-tree constraints • defines legal combinations of features
Car feature diagram
Cruise
Engine
and
Transmission
alternative choose1
or: 1+
Gasoline
Electric
cross-tree constraints
Body
Manual
Automatic
Cruise Æ Automatic 35
Ordered Feature Models • Order in which features (arrows) are composed matters! • Ordered feature model is a “GenVoca” GenVoca grammar • tokens are features • productions define sentences • order in which tokens appear in sentences = order of composition
car
car : [cruise] engine+ trans body; engine : gasoline | electric;
Cruise
Engine
Transmission
Body
trans : manual | automatic; Gasoline
Electric
Manual
Automatic
## cruise Æ automatic;
GenVoca Grammar
Cruise Æ Automatic
Feature Model 36
Encode Graph of SPL • As an ordered feature model
shell
0
print
eval
Methods Category/Graph
Ops: [eval] [print] shell;
Ops
## eval
print
shell
eval Æ print; eval Æ print
GenVoca Grammar
Feature Diagram 37
Feature Implementations • Arrows can change at least the following: • add new classes, packages • add new members (fields, methods) to existing classes • wrap existing methods …
• Lots of prototype technologies to do this….
Exp
0
shell
Exp
Exp
print()
print() eval() l()
eval
print Int
Plus
Times
Int print()
Plus print()
Times print()
Int print() eval()
Plus print() eval()
Times print() eval()
Methods Product Line 38
SPL Implementation • Collection of arrows (features) whose compositions produce programs p g of an SPL is a GenVoca Model Fi = [ Fn … F3, F2, F1 ]
// vector
• Program P of an SPL is a particular composition of arrows P = F8 + F4 + F2 + F1
// P’s timeline
where + denotes feature (arrow) composition
39
Connection to Kubes • Given
P = F8 + F4 + F2 + F1
• Rewrite P as:
P = Σ i∈(8,4,2,1) Fi • Program synthesis is projection and contraction of 1D kube • GenVoca grammar defines legal kube projections
40
Recap So Far… • Conceptualize a SPL as a micro category • quirk: programs and objects are computed
• Implement by ordered pair:
SPL = ( Grammar, Kube ) • 1 dimension of variability Î Kubes are vectors
41
Another way to derive program ET precursor to 2nd dimension of variability 42
Data Type Extensibility • Variability in the Methods SPL were methods a class supported pp • set of Exp subclasses (Int, Plus, Times) were fixed
• Another A th variability i bilit iis d data t ttypes • fix the set of methods • vary the set of subclasses of Exp Exp print()
exp
0
eval()
Exp
Exp
Exp
print()
print()
print()
eval()
int
eval()
plus
Int print() eval() ()
eval()
times Int print() eval() ()
Plus print() eval() ()
Int print() eval() ()
Plus print() eval() ()
Times print() eval() ()
Types Product Line 43
Types Product Line
0
exp
int
plus
times
Types Category/Graph
Types: [times] [plus] [int] exp;
Types
## plus Æ int; times Æ plus;
GenVoca Grammar
times
plus
int
exp
plus Æ int times Æ plus
Feature Diagram
44
Expression Problem classic example of 2D variability 45
Expression Problem • Fundamental problem of program design and variability • • • •
1975 J. Reynolds 1990 W. Cook 1998 S. Krishnamurthi, M. Felleisen, D. Friedman 1998 P. Wadler
• how to add new methods and data types in a type safe manner • SPL of 30-40 line programs
• ASE 2002, SIGSOFT/FSE 2003 • R. Lopez-Herrejon, p j , D. Batory, y, J-P. Martin,, J. Liu,, J. Sarvela • how ideas scale to synthesize large programs • SPL of 30-40K line programs (1000×) • how we synthesize the AHEAD Tool Suite 46
0
Exp
Exp
Exp
Exp Int
Int
Plus
Int
Plus
Times
print p Exp print()
Int
Plus
Times
print()
print()
print()
eval Exp print() eval() Int print() eval()
Plus print() eval()
Times print() eval()
0
Exp
Exp
Exp
Exp Int
Int
Plus
Int
Plus
Times
Exp print() Exp print() p () Int print()
Plus print()
Times print()
exp
Exp print() eval()
int Int print() eval()
Exp
Exp
Exp
print() eval()
print() eval()
print() eval()
plus Int print() eval()
Plus print() eval()
times Int print() eval()
Plus print() eval()
Times print() eval()
0
Expression Product Line (EPL) Exp
Exp
Exp
Exp Int
Int
Exp print()
Plus
Int
Plus
Exp
Exp
print()
print()
Times
Exp print() p () Int
Int
Plus
Int
Plus
Times
print()
print()
print()
print()
print()
print()
Exp
Exp
Exp
Exp
print() eval()
print() eval()
print() eval()
print() eval()
0
Int print() eval()
Int print() eval()
Plus print() eval()
Int print() eval()
Plus print() eval()
Times print() eval()
Commuting Diagrams in Categories Exp
Exp
Exp
Exp Int
Int
Plus
Int
Plus
Exp
Exp
Exp
print()
print()
print()
Times
Exp print() p () Int print()
Int print()
Exp print() eval()
Int print() eval()
Plus print()
Int print()
Plus print()
Exp
Exp
Exp
print() eval()
print() eval()
print() eval()
Int print() eval()
Plus print() eval()
Int print() eval()
Plus print() eval()
Times print()
Times print() eval()
To Explain Origami, Few More Questions •
What is the EPL feature model? 0 Exp
Exp
Exp
Exp Int
Int
Plus
Int
Times
Exp print()
Exp print()
Exp print()
Plus
Exp print() Int print() ()
Int print() p ()
Exp print() eval()
• • •
Int print()
Int print() eval()
Plus print() eval()
Plus print()
Times print()
Exp print() eval()
Exp print() eval()
Exp print() eval() Int print() eval()
Plus print() ()
Int print() eval()
Plus print() eval()
Times print() eval()
What is the origin of the EPL graph? What arrows are stored? How is EPL related to kubes? 51
Observations
52
Observation • Identify any program in EPL by a pair of axis timelines
0 0
exp
int
plus
times
Types
0
Exp
Exp
shell
Exp
Exp Int
Int
print
Plus
Int
Plus
Exp
Exp
Exp
print()
print()
print()
Times
Exp print() Int print()
Int print()
eval Exp print() eval()
Plus print()
Exp
Exp
Exp
print() eval()
print() eval()
print() eval()
Int print() eval()
Int print() eval()
Methods
Int print()
Plus print()
Int print() eval()
Plus print() eval()
Plus print() eval()
Times print()
Times print() eval()
53
Observation • Identify any program in EPL by a pair of axis timelines
0 0
exp
int
plus
times
Types
0
Exp
Exp
shell
Exp
Exp Int
Int
print
Plus
Int
Plus
Exp
Exp
Exp
print()
print()
print()
Times
Exp print() Int print()
Int print()
eval Exp print() eval()
Methods
Int print() eval()
Plus print()
Int print()
Plus print()
Exp
Exp
Exp
print() eval()
print() eval()
print() eval()
Int print() eval()
Plus print() eval()
Int print() eval()
Plus print() eval()
Times print()
Times print() eval()
54
Boundary Cases • Postulate extra null programs and arrows between them
0 0
0
exp
int
plus
0exp
times
0int
0plus
0shell
print
Int
Int
Plus
Exp
Exp
Exp
print()
print()
print()
Times
print()
eval
Methods
Plus
Exp Int print()
Int print()
0eval
Exp
Exp Int
0print
0times
Exp
Exp
shell
Types
Exp print() eval()
Int print() eval()
Plus print()
Int print()
Plus print()
Exp
Exp
Exp
print() eval()
print() eval()
print() eval()
Int print() eval()
Plus print() eval()
Int print() eval()
Plus print() eval()
Times print()
Times print() eval()
55
Last Pieces to Puzzle • Look at two seemingly unrelated topics • cross products of structures (SPLs) • feature interactions
• Putting them together explains Origami
56
Cross Products
57
Cross Products in Mathematics •
A common way to scale 1D structures to higher dimensions is by taking the cross product of 1D structures • 2D structure • nD structure
•
= cross product of a pair of 1D structures = cross product of n 1D structures
SPL structure is an ordered pair: SPL = ( Grammar, Kube )
•
Look at cross product of Methods and Types product lines Methods × Types
=
( Gmethods, Vmethods ) × ( Gtypes, Vtypes )
=
( Gmethods × Gtypes, Vmethods # Vtypes ) 58
EPL Feature Model is… • Union of Methods and Types feature models with extra rule
Methods: [eval] [print] shell; %% eval Æ print
×
Types : [times] [plus] [int] exp; %% plus Æ int times Æ plus
extra grammar rule
EPL : Methods Types ;
Methods: [eval] [print] shell; Types : [times] [plus] [int] exp; %% eval Æ print plus Æ int ti times Æ plus l 59
EPL Graph • tensor product (×) of Methods and Types graphs • shape we want • remember: programs/objects are computed, so too are the arrows 0 0 shell
exp
int
plus
times
Types
0 = Types × Methods
print
eval E Exp eval() print()
Methods Int print() eval()
Plus Times print() eval() eval() print()
60
EPL Graph • tensor product (×) of Methods and Types graphs • shape we want • remember: programs/objects are computed, so too are the arrows 0 0
exp
int
times
plus
Types
0
shell
= Types × Methods
print
eval E Exp eval() print()
Methods Int print() eval()
Plus Times print() eval() eval() print()
61
Kube of EPL • Interaction cross product (#) of the Methods &Types kubes • another tensor product • GenVoca model of EPL is a matrix (2D kube) EPLmethods,types methods types
but what are these entries?
=
Vmethods # Vtypes
=
[ shell, print, eval ] # [ exp, int, plus, times ] (shell # exp)
(shell # int)
(shell # plus)
(shell # times)
(print # exp) (p p)
(print # int)) (p
(print # p (p plus))
(print # times)) (p
(eval # exp)
(eval # int)
(eval # plus)
(eval # times)
=
62
Products of Features and Feature Interactions 63
Products & Interactions •
Features change a design
•
Structural feature interactions are changes to a design that are added conditionally based on the features that are present
•
p ((Kyo y Kang) g) Classical example • flood control, fire control • must add extra arrow Æ to control their interaction, i.e., so that they work together correctly
•
flood fire fire
fire’ flood
flood # fire
flood’
flood × fire = flood#fire + flood + fire
Taking the “square” of features
64
From EPL: int × print Exp Exp
int Int
print
Exp p print() Exp print() Int print()
65
From EPL: int × print Exp Exp
int Int
print Exp
print
print()
Int
print # int
int
Exp p print()
Exp print() Int print()
66
From EPL: eval × times Exp
Exp
print()
print()
times Int
Plus
Int
Plus
Times
print()
print()
print()
print()
print()
Exp print() eval()
eval
Exp
Int
Plus
Times
print() eval()
print() eval()
print()
eval # times Exp
print() p ()
print() p ()
eval()
eval()
Int print()
Plus print()
Int print()
Plus print()
Times print()
eval()
eval()
eval()
eval()
eval()
67
GMethods × GTypes Types : [times] [plus] [int] exp; %% plus Æ int times Æ plus
0
Methods : [eval] [print] shell; Types : [times] [plus] [int] exp; %% eval Æ print plus Æ int times Æ plus Methods # Types EPL: Methods Types;
Methods: [eval] [print] shell; %% eval Æ print
Methods: [eval] [print] shell; Types: [times] [plus] [int] exp; %% eval l Æ print i plus Æ int times Æ plus
Taking the Square of Features • Fundamental relationship among features & interactions • The changes features make always commute • Interactions are added to coordinate features correctly
g f f
f’
g
f#g g’
69
Adding Interactions • Recall the EPL graph • Expose interaction arrows • EPL Matrix = VMethods # VTypes 0 ( h ll # exp)) (shell
( h ll # int) (shell i t)
( h ll # plus) (shell l )
( h ll # times) (shell ti )
(print # exp)
(print # int)
(print # plus)
(print # times)
(eval # exp)
(eval # int)
(eval # plus)
(eval # times)
70
Important Property • Given the EPL matrix and boundaries, anyy arrow and anyy program p g of EPL can be computed p • Proof sketch • given f, g, f#g • can complete square
0
g f f
f’
g
f#g g’
71
Another Important Property • Aggregating adjacent squares sums interactions
g
h+g
h
f
f f#g
f#h
f#h + f#g
72
Finally, Why Origami Works
Any program of EPL can be computed by projecting & contracting EPL matrix
73
Projection • Projection of matrix = projection of category, yields a commuting y g diagram g • any path from 0 to P will synthesize P
0
P
commuting category diagram
matrix 74
Contract by Columns • Contraction aggregates adjacent squares and sums interaction arrows
0 +
+
+
+
P
commuting diagram
vector matrix 75
Contract by Rows • Contraction aggregates adjacent squares and sums interaction arrows
0
+++ P
commuting diagram
vector scalar 76
Final Step/Square • Boundary programs are 0s • 0Æ0 arrows add NOTHING NOTHING, and so too their compositions • interaction arrow defines an expression that synthesizes P • this is the arrow that is computed by matrix contraction
0
0 +
0 0 P
commuting diagram
vector scalar 77
Aside: Commuting Paths • • • •
Start from desired program Aggregate by row moves vertically up Aggregation by columns moves vertically left Final square any path will do
0 + +
+ +
+
P
commuting diagram
matrix 78
Generality of Arguments • Nothing special about how we contracted the matrix • any contraction (i.e., path) would do
• Nothing special about program P we selected • any program in the SPL would do
• Nothing special about this matrix • any SPL matrix would do – ex. AHEAD 2D kube
• Nothing special about 2D kubes • same ideas apply to higher-dimensions • e.g. a square becomes a cube with a single interaction arrow
• Result that can be applied to SPL in general • Origami g is a fundamental structure of SPLs
79
So What?
and other final thoughts thoughts…
80
Perspective •
Programming in the Small – Java and C# objects, methods
•
P Programming i in i the th Large L – objects bj t are programs, methods th d are xforms f
•
Transformations provide a fundamental way to understand SWD • foundation f d ti ffor MDE, MDE refactorings, f t i and d software ft product d t lines li
•
Working toward a Science of Automated Design • start from experiences and abstract to a theory • belief: few principles hold this universe together • principles assume a mathematical form as in other sciences and engineering disciplines that manipulate structures • a promising alternative to the ad hoc design techniques in use today
81
Dimensions of Variability •
Preplanned variability is key to automated software design • much like design patterns helped novices design like experts • SPLs help novices create customized programs like experts
•
Common in SPLs to have orthogonal and interacting sets of features • EPL – method variability vs vs. type variability • AHEAD – tool variability vs. language variability
•
Origami O g expresses p multiple p dimensions of variability y • powerful and elegant, it scales
•
Expose p new and basic relationships p in mathematical form • • • •
projection and contraction of kubes – database technology cross product of features (feature interactions) cross product of SPLs use of n-D kubes to represent n-dimensions of variability 82
Final Comments • Clear that ideas are being reinvented in different contexts • not accidental – evidence we are working toward general paradigm • modern mathematics is a simple language to express these ideas • maybe others may be able to find deeper connections
• At the earliest stages • Advice: think in terms of arrows, think in terms of kubes • if you look for kubes, you’ll find them… • if you y don’t look,, you y won’t find them…
83