with lecture slides and reading material. ♢ One exam question. 2. Main books. ♢
J. C. Mitchell. Concepts in programming languages. Cambridge University Press
...
Concepts in Programming Languages Practicalities Marcelo Fiore
Course web page:
Computer Laboratory University of Cambridge
with lecture slides and reading material. One exam question.
Easter 2011
2
1
Topics
Main books
I. Introduction and motivation.
J. C. Mitchell. Concepts in programming languages.
II. The first procedural language: FORTRAN (1954–58).
Cambridge University Press, 2003.
III. The first declarative language: LISP (1958–62). T. W. Pratt and M. V. Zelkowitz. Programming Languages:
Design and implementation (3 RD 1999.
EDITION).
Prentice Hall,
IV. Block-structured procedural languages: Algol (1958–68), Pascal (1970).
V. Object-oriented languages — Concepts and origins: ⋆ M. L. Scott. Programming language pragmatics (2 ND EDITION). Elsevier, 2006.
Simula (1964–67), Smalltalk (1971–80).
VI. Types in programming languages: ML (1973–1978).
R. Sethi. Programming languages: Concepts & constructs
(2 ND
EDITION).
Addison-Wesley, 1996.
VII. Data abstraction and modularity: SML Modules (1984–97). VIII. The state of the art: Scala (2007)
3
4
g Topic I g
Goals
Introduction and motivation References:
Critical thinking about programming languages.
? What is a programming language!?
Chapter 1 of Concepts in programming languages by
Study programming languages.
J. C. Mitchell. CUP, 2003. Chapter 1 of Programming languages: Design and
implementation (3 RD EDITION) by T. W. Pratt and M. V. Zelkowitz. Prentice Hall, 1999.
EDITION)
Be familiar with basic language concepts.
Appreciate trade-offs in language design.
Trace history, appreciate evolution and diversity of ideas.
Chapter 1 of Programming language pragmatics
(2 ND
Be prepared for new programming methods, paradigms.
by M. L. Scott. Elsevier, 2006. 5
Why study programming languages?
6
What makes a good language?
To improve the ability to develop effective algorithms. Clarity, simplicity, and unity.
To improve the use of familiar languages.
Orthogonality.
To increase the vocabulary of useful programming
constructs.
Naturalness for the application.
To allow a better choice of programming language.
Support of abstraction.
To make it easier to learn a new language.
Ease of program verification.
To make it easier to design a new language.
Programming environments.
To simulate useful features in languages that lack them. Portability of programs.
To make better use of language technology wherever it
appears. 7
8
What makes a language successful? Expressive power.
Cost of use.
Cost of execution.
Ease of use for the novice.
Cost of program translation.
Ease of implementation.
Cost of program creation, testing, and use.
Open source.
Cost of program maintenance.
Excellent compilers. Economics, patronage, and inertia.
9
10
Applications domains Influences
Era
Application
Major languages
Other languages
1960s
Business
COBOL
Assembler
Scientific
FORTRAN
A LGOL, BASIC, APL
System
Assembler
JOVIAL, Forth
AI
LISP
SNOBOL
Business
COBOL, SQL, spreadsheet
C, PL/I, 4GLs
Scientific
FORTRAN, C, C++ Maple, Mathematica
BASIC, Pascal
System
BCPL, C, C++
Pascal, Ada, BASIC, MODULA
AI
LISP, Prolog
Publishing
TEX, Postscript, word processing
Process
UNIX shell, TCL, Perl
Marvel, Esterel
New paradigms
Smalltalk, SML, Haskell, Java Python, Ruby
Eifell, C#, Scala
Computer capabilities. Applications.
Today
Programming methods. Implementation methods. Theoretical studies. Standardisation.
11
12
?> 89 :; Motivating application in language design=
=< 89 Classification of programming languages:;
Imperative
Examples:
procedural
C,Ada,Pascal,Algol,FORTRAN, . . .
object oriented
Scala, C#,Java, Smalltalk, SIMULA, . .
scripting
Perl, Python, PHP, . . .
Formal-language theory. Automata theory. Algorithmics. λ-calculus.
Declarative
functional
Haskell, SML, Lisp, Scheme, . . .
Semantics.
logic
Prolog
Formal verification.
dataflow
Id, Val
Type theory.
constraint-based
spreadsheets
Complexity theory.
template-based
XSLT
Logic. 17
18
Language standardisation
Standardisation
Consider: int i; i = (1 && 2) + 3 ; Proprietary standards.
? Is it valid C code? If so, what’s the value of i?
Consensus standards.
? How do we answer such questions!?
ANSI (American National Standards Institute)
IEEE (Institute of Electrical and Electronics Engineers)
BSI (British Standard Institute)
! Try it and see!
ISO (International Standards Organisation)
! Read the ANSI C Standard.
19
! Read the reference manual.
20
Language-standards issues
Language standards PL/1
Timeliness. When do we standardise a language? Conformance. What does it mean for a program to adhere to a standard and for a compiler to compile a standard?
?
What does the following 9 + 8/3 mean? − 11.666... ?
Ambiguity and freedom to optimise — Machine dependence — Undefined behaviour.
− Overflow ?
Obsolescence. When does a standard age and how does it get modified?
− 1.666... ?
Deprecated features.
21
22
For 9 + 8/3 we have: DEC(p,q) means p digits with q after the decimal point. Type rules for DECIMAL in PL/1: DEC(p1,q1) + DEC(p2,q2) => DEC(MIN(1+MAX(p1-q1,p2-q2)+MAX(q1,q2),15),MAX(q1,q2) DEC(p1,q1) / DEC(p2,q2) => DEC(15,15-((p1-q1)+q2))
23
DEC(1,0) + DEC(1,0)/DEC(1,0) => DEC(1,0) + DEC(15,15-((1-0)+0)) => DEC(1,0) + DEC(15,14) => DEC(MIN(1+MAX(1-0,15-14)+MAX(0,14),15),MAX(0,14)) => DEC(15,14) So the calculation is as follows 9 + 8/3 = 9 + 2.66666666666666 = 11.66666666666666 - OVERFLOW = 1.66666666666666 - OVERFLOW disabled
24
History 1951–55: Experimental use of expression compilers.
1981–85: Smalltalk-80, Prolog, Ada 83.
1956–60: FORTRAN, COBOL, LISP, Algol 60.
1986–90: C++, SML, Haskell.
1961–65: APL notation, Algol 60 (revised), SNOBOL, CPL.
1991–95: Ada 95, TCL, Perl.
1966–70: APL, SNOBOL 4, FORTRAN 66, BASIC, SIMULA, Algol 68, Algol-W, BCPL.
1996–2000: Java. 2000–05: C#, Python, Ruby, Scala.
1971–75: Pascal, PL/1 (Standard), C, Scheme, Prolog. 1976–80: Smalltalk, Ada, FORTRAN 77, ML.
25
26
Language groups Things to think about
Multi-purpose languages
Scala, C#, Java, C++, C
Haskell, SML, Scheme, LISP
Perl, Python, Ruby
What makes a good language? The role of
1. motivating applications,
Special-purpose languages
2. program execution,
UNIX shell
3. theoretical foundations
SQL
in language design.
LATEX
Language standardisation.
27
28
FORTRAN = FORmula TRANslator
g Topic II g
FORTRAN : A simple procedural language References:
Chapter 10(§1) of Programming Languages: Design and
implementation (3 RD EDITION) by T. W. Pratt and M. V. Zelkowitz. Prentice Hall, 1999.
(1957) Developed in the 1950s by an IBM team led by John Backus. The first high-level programming language to become widely used. At the time the utility of any high-level language was open to question! The main complain was the efficiency of compiled code. This heavily influenced the designed, orienting it towards
The History of FORTRAN I, II, and III by J. Backus. In
providing execution efficiency.
History of Programming Languages by R. L. Wexelblat. Academic Press, 1981.
Standards:
1966, 1977 (FORTRAN 77), 1990 (FORTRAN 90). 29
30
John Backus Overview Execution model
As far as we were aware, we simply made up the language as we went along. We did not regard language design as a difficult problem, merely a simple prelude to the real problem: designing a compiler which could produce efficient programs.a
FORTRAN program = main program + subprograms
Each is compiled separate from all others.
Translated programs are linked into final executable form during loading.
All storage is allocated statically before program execution
begins; no run-time storage management is provided. Flat register machine. No stacks, no recursion. Memory In R. L. Wexelblat, History of Programming Languages, Academic Press, 1981, page 30. a
31
arranged as linear array.
32
Overview
Overview
Compilation
Data types
FORTRAN program
Numeric data: Integer, real, complex, double-precision
real.
Compiler
Boolean data.
called logical
Incomplete machineUlanguage
Library routines
UUUU UUUU UUUU *
o ooo o o w o
Linker
Arrays.
of fixed declared length
Character strings.
of fixed declared length
Files.
Machine language program
33
34
Example Overview Control structures 10 100
FORTRAN 66
Relied heavily on statement labels and GOTO statements. FORTRAN 77
Added some modern control structures (e.g., conditionals). 999
35
PROGRAM MAIN PARAMETER (MaXsIz=99) REAL A(mAxSiZ) READ (5,100,END=999) K FORMAT(I5) IF (K.LE.0 .OR. K.GT.MAXSIZ) STOP READ *,(A(I),I=1,K) PRINT *,(A(I),I=1,K) PRINT *,’SUM=’,SUM(A,K) GO TO 10 PRINT *, "All Done" STOP END 36
Example Commentary C SUMMATION SUBPROGRAM FUNCTION SUM(V,N) REAL V(N) SUM = 0.0 DO 20 I = 1,N SUM = SUM + V(I) 20 CONTINUE RETURN END
Columns and lines are relevant. Blanks are ignored (by early FORTRANs). Variable names are from 1 to 6 characters long,
begin with a letter, and contain letters and digits. Programmer-defined constants. Arrays: when sizes are given, lower bounds are
assumed to be 1; otherwise subscript ranges must be explicitly declared. Variable types may not be declared: implicit naming
convention. 37
38
Data formats. FORTRAN 77 has no while statement.
On syntax
Functions are compiled separately from the main program.
Information from the main program is not used to pass information to the compiler. Failure may arise when the loader tries to merge subprograms with main program.
A misspelling bug . . . do 10 i = 1,100
vs.
do 10 i = 1.100
Function parameters are uniformly transmitted by
. . . that is reported to have caused a rocket to explode upon launch into space!
reference (or value-result). Recall that allocation is done statically. DO loops by increment. A value is returned in a FORTRAN function by assigning a
value to the name of a function. 39
40
Types
Storage
FORTRAN has no mechanism for creating user types.
Representation and Management
Static type checking is used in FORTRAN, but the
Storage representation in FORTRAN is sequential.
checking is incomplete.
Only two levels of referencing environment are provided,
Many language features, including arguments in subprogram calls and the use of COMMON blocks, cannot be statically checked (in part because subprograms are compiled independently).
global and local. The global environment may be partitioned into separate common environments that are shared amongst sets of subprograms, but only data objects may be shared in this way.
Constructs that cannot be statically checked are ordinarily left unchecked at run time in FORTRAN implementations.
42
41
COMMON
The sequential storage representation is critical in the definition of the EQUIVALENCE and COMMON declarations.
The global environment is set up in terms of sets of variables and arrays, which are termed COMMON blocks.
EQUIVALENCE
This declaration allows more than one simple or subscripted variable to refer to the same storage location.
A COMMON block is a named block of storage and may contain the values of any number of simple variables and arrays.
? Is this a good idea?
COMMON blocks may be used to isolate global data to only a few subprograms needing that data.
Consider the following:
? Is the COMMON block a good idea?
REAL X INTEGER Y EQUIVALENCE (X,Y)
Consider the following:
43
COMMON/BLK/X,Y,K(25)
in MAIN
COMMON/BLK/U,V,I(5),M(4,5)
in SUB 44
/. ()
-,
Aliasing*+ /. ()
Aliasing occurs when two names or expressions
-, *+
Parameters
refer to the same object or location.
There are two concepts that must be clearly distinguished. Aliasing raises serious problems for both the user
The parameter names used in a function declaration
and implementor of a language.
are called formal parameters.
Because of the problems caused by aliasing, new
When a function is called, expressions called actual
language designs sometimes attempt to restrict or eliminate altogether features that allow aliases to be constructed.
parameters are used to compute the parameter values for that call.
46
45
The language specifies that if a formal parameter is
FORTRAN subroutines and functions Actual parameters may be simple variables, literals,
array names, subscripted variables, subprogram names, or arithmetic or logical expressions.
assigned to, the actual parameter must be a variable, but because of independent compilation this rule cannot be checked by the compiler. Example: SUBROUTINE SUB(X,Y) X = Y END
The interpretation of a formal parameter as an array is done by the called subroutine. Each subroutine is compiled independently and no
checking is done for compatibility between the subroutine declaration and its call.
CALL SUB(-1.0,1.0) Parameter passing is uniformly by reference.
47
48
J. McCarthy. Recursive functions of symbolic expressions
g Topic III g
and their computation by machine. Communications of the ACM, 3(4):184–195, 1960.a
LISP : functions, recursion, and lists
J. McCarthy. History of LISP. In History of Programming
References:
Languages. Academic Press, 1981.
Chapter 3 of Concepts in programming languages by
J. C. Mitchell. CUP, 2003. Chapters 5(§4.5) and 13(§1) of Programming languages:
Design and implementation (3 RD EDITION) by T. W. Pratt and M. V. Zelkowitz. Prentice Hall, 1999.
Available on-line recursive.html>. a
from
’h’, First => 24); ? What about in ML? Can it be simulated?
the time that the actual parameter is evaluated, and
This is commonly used together with default parameters:
the location used to store the parameter value.
procedure Proc(First: Integer := 0; Second: Character := Proc(Second => ’o’);
NB: The location of a variable (or expression) is called its L-value, and the value stored in this location is called the R-value of the variable (or expression).
77
/. ()
78
-,
Parameter passing*+
/. ()
Pass/Call-by-value
-,
Parameter passing*+
In pass-by-value, the actual parameter is evaluated. The
value of the actual parameter is then stored in a new location allocated for the function parameter.
Pass/Call-by-reference
In pass-by-reference, the actual parameter must have
an L-value. The L-value of the actual parameter is then bound to the formal parameter.
Under call-by-value, a formal parameter corresponds to
the value of an actual parameter. That is, the formal x of a procedure P takes on the value of the actual parameter. The idea is to evaluate a call P(E) as follows: x := E; execute the body of procedure P; if P is a function, return a result. 79
Under call-by-reference, a formal parameter becomes
a synonym for the location of an actual parameter. An actual reference parameter must have a location.
80
Example: program main; begin function f( var x: integer; y: integer): integer; begin x := 2; y := 1; if x = 1 then f := 1 else f:= 2 end;
The difference between call-by-value and call-by-reference is important to the programmer in several ways: Side effects. Assignments inside the function body
may have different effects under pass-by-value and pass-by-reference. Aliasing. Aliasing occurs when two names refer to the
same object or location. Aliasing may occur when two parameters are passed by reference or one parameter passed by reference has the same location as the global variable of the procedure.
var z: integer; z := 0; writeln( f(z,z) ) end 81
82
/. ()
-,
Parameter passing*+
Efficiency. Pass-by-value may be inefficient for large
structures if the value of the large structure must be copied. Pass-by-reference maybe less efficient than pass-by-value for small structures that would fit directly on stack, because when parameters are passed by reference we must dereference a pointer to get their value.
83
Pass/Call-by-value/result Call-by-value/result is also known as copy-in/copy-out because the actuals are initially copied into the formals and the formals are eventually copied back out to the actuals. Actuals that do not have locations are passed by value. Actuals with locations are treated as follows: 1. Copy-in phase. Both the values and the locations of the actual parameters are computed. The values are assigned to the corresponding formals, as in call-by-value, and the locations are saved for the copy-out phase. 2. Copy-out phase. After the procedure body is executed, the final values of the formals are copied back out to the locations computed in the copy-in phase. 84
Examples: Ada supports three kinds of parameters:
A parameter in Pascal is normally passed by value. It is
passed by reference, however, if the keyword var appears before the declaration of the formal parameter. procedure proc(in: Integer; var out: Real); The only parameter-passing method in C is call-by-value;
however, the effect of call-by-reference can be achieved using pointers. In C++ true call-by-reference is available using reference parameters.
1. in parameters, corresponding to value parameters; 2. out parameters, corresponding to just the copy-out phase of call-by-value/result; and 3. in out parameters, corresponding to either reference parameters or value/result parameters, at the discretion of the implementation.
86
85
/. ()
Block structure
-, *+
Parameter passing Pass/Call-by-name
In a block-structured language, each program or
subprogram is organised as a set of nested blocks.
The Algol 60 report describes call-by-name as follows: 1. Actual parameters are textually substituted for the formals. Possible conflicts between names in the actuals and local names in the procedure body are avoided by renaming the locals in the body. 2. The resulting procedure body is substituted for the call. Possible conflicts between nonlocals in the procedure body and locals at the point of call are avoided by renaming the locals at the point of call. 87
A block is a region of program text, identified by begin and end markers, that may contain declarations local to this region. In-line (or unnamed) blocks are useful for restricting the
scope of variables by declaring them only when needed, instead of at the beginning of a subprogram. The trend in programming is to reduce the size of subprograms, so the use of unnamed blocks is less useful than it used to be.
88
Nested procedures can be used to group statements that are executed at more than one location within a subprogram, but refer to local variables and so cannot be external to the subprogram. Before modules and object-oriented programming were introduced, nested procedures were used to structure large programs. Block structure was first defined in Algol. Pascal contains
Each declaration is visible within a certain region of program text, called a block.
When a program begins executing the instructions contained in a block at run time, memory is allocated for the variables declared in that block.
When a program exits a block, some or all of the memory allocated to variables declared in that block will be deallocated.
An identifier that is not declared in the current block is considered global to the block and refers to the entity with this name that is declared in the closest enclosing block.
nested procedures but not in-line blocks; C contains in-line blocks but not nested procedures; Ada supports both. Block-structured languages are characterised by the
following properties:
New variables may be declared at various points in a program.
90
89
Algol The main characteristics of the Algol family are:
had a major effect on language design
the familiar semicolon-separated sequence of statements,
block structure,
functions and procedures, and
static typing.
The Algol-like programming languages evolved in parallel
with the LISP family of languages, beginning with Algol 58 and Algol 60 in the late 1950s. The most prominent Algol-like programming languages
are Pascal and C, although C differs from most of the Algol-like languages in some significant ways. Further Algol-like languages are: Algol 58, Algol W, Euclid, etc.
91
92
Algol 60 Features
Algol 60
Simple statement-oriented syntax.
Designed by a committee (including Backus, McCarthy,
Perlis) between 1958 and 1963.
Block structure.
Intended to be a general purpose programming language,
Recursive functions and stack storage allocation.
with emphasis on scientific and numerical applications. Fewer ad hoc restrictions than previous languages Compared with FORTRAN, Algol 60 provided better ways
to represent data structures and, like LISP, allowed functions to be called recursively. Eclipsed by FORTRAN because of the lack of I/O statements, separate compilation, and library; and because it was not supported by IBM.
(e.g., general expressions inside array indices, procedures that could be called with procedure parameters). A primitive static type system, later improved in
Algol 68 and Pascal.
94
93
Algol 60 Some trouble spots Algol 60 was designed around two parameter-passing
The Algol 60 type discipline had some shortcomings.
mechanisms, call-by-name and call-by-value.
For instance:
Automatic type conversions were not fully specified (e.g., x := x/y was not properly defined when x and y were integers—is it allowed, and if so was the value rounded or truncated?).
The type of a procedure parameter to a procedure does not include the types of parameters.
An array parameter to a procedure is given type array, without array bounds.
95
Call-by-name interacts badly with side effects; call-by-value is expensive for arrays. There are some awkward issues related to control
flow, such as memory management, when a program jumps out of a nested block.
96
Algol 60 pass-by-name Copy rule
Algol 60 procedure typesa In Algol 60, the type of each formal parameter of a procedure must be given. However, proc is considered a type (the type of procedures). This is much simpler than the ML types of function arguments. However, this is really a type loophole; because calls to procedure parameters are not fully type checked, Algol 60 programs may produce run-time errors. Write a procedure declaration for Q that causes the following program fragment to produce a run-time type error: proc P ( proc Q ) begin Q(true) end;
real procedure sum(E,i,low,high); value low, high; real E; integer i, low, high; begin sum:=0.0; for i := low step 1 until high do sum := sum+E; end integer j; real array A[1:10]; real result; for j:= 1 step 1 until 10 do A[j] := j; result := sum(A[j],j,1,10)
P(Q); where true is a Boolean value. Explain why the procedure is statically type correct, but produces a run-time type error. (You may assume that adding a Boolean to an integer is a run-time error.) Exercise 5.1 of Concepts in programming languages by J. Mitchell, CUP, 2003. a
By the Algol 60 copy rule, the function call to sum above is equivalent to: begin sum:=0.0; for j := 1 step 1 until 10 do sum := sum+A[j]; end 98
97
Algol 68
Algol 60 pass-by-namea The following Algol 60 code declares a procedure P with one pass-by-name integer parameter. Explain how the procedure call P(A[i]) changes the values of i and A by substituting the actual parameters for the formal parameters, according to the Algol 60 copy rule. What integer values are printed by the program? And, by using pass-by-value parameter passing? begin integer i; i:=1; integer array A[1:2]; A[1]:=2; A[2]:=3;
Algol 60 and to improve the expressiveness of the language. It did not entirely succeed however, with one main problem being the difficulty of efficient compilation (e.g., the implementation consequences of higher-order procedures where not well understood at the time). One contribution of Algol 68 was its regular, systematic
procedure P(x); integer x; begin i:=x; x:=1 end P(A[i]); end
Intended to remove some of the difficulties found in
type system.
print(i,A[1],A[2])
Exercise 5.2 of Concepts in programming languages by J. Mitchell, CUP, 2003. a
99
The types (referred to as modes in Algol 68) are either primitive (int, real, complex, bool, char, string, bits, bytes, semaphore, format, file) or compound (array, structure, procedure, set, pointer). 100
Type constructions could be combined without restriction. This made the type system seem more systematic than previous languages.
Algol innovations Use of BNF syntax description. Block structure.
Algol 68 memory management involves a stack for local
variables and heap storage. Algol 68 data on the heap are explicitly allocated, and are reclaimed by garbage collection. Algol 68 parameter passing is by value, with
Scope rules for local variables. Dynamic lifetimes for variables. Nested if-then-else expressions and statements. Recursive subroutines.
pass-by-reference accomplished by pointer types. (This is essentially the same design as that adopted in C.)
Call-by-value and call-by-name arguments. Explicit type declarations for variables.
The decision to allow independent constructs to be
Static typing.
combined without restriction also led to some complex features, such as assignable pointers.
Arrays with dynamic bounds. 102
101
Pascal Designed in the 1970s by Niklaus Wirth, after the design
A Pascal program is always formed from a single
main program block, which contains within it definitions of the subprograms used.
and implementation of Algol W. Very successful programming language for teaching, in
part because it was designed explicitly for that purpose. Also designed to be compiled in one pass. This hindered language design; e.g., it forced the problematic forward declaration. Pascal is a block-structured language in which static
Each block has a characteristic structure: a header giving the specification of parameters and results, followed by constant definitions, type definitions, local variable declarations, other nested subprogram definitions, and the statements that make up the executable part.
scope rules are used to determine the meaning of nonlocal references to names.
103
104
A restriction that made Pascal simpler than Algol 68: Pascal is a quasi-strong, statically typed programming
procedure Allowed( j,k: integer );
language. An important contribution of the Pascal type system is the rich set of data-structuring concepts: e.g. enumerations, subranges, records, variant records, sets, sequential files. The Pascal type system is more expressive than the
Algol 60 one (repairing some of its loopholes), and simpler and more limited than the Algol 68 one (eliminating some of the compilation difficulties).
procedure AlsoAllowed( procedure P(i:integer); j,k: integer ); procedure NotAllowed( procedure MyProc( procedure P( i:integer ) ) );
105
106
Pascal was the first language to propose index checking. Pascal uses a mixture of name and structural
Problematically, in Pascal, the index type of an array is
part of its type. The Pascal standard defines conformant array parameters whose bounds are implicitly passed to a procedure. The Ada programmig language uses so-called unconstrained array types to solve this problem. The subscript range must be fixed at compile time permitting the compiler to perform all address calculations during compilation. procedure Allowed( a: array [1..10] of integer ) ; procedure NotAllowed( n: integer; a: array [1..n] of integer ) ; 107
equivalence for determining if two variables have the same type. Name equivalence is used in most cases for determining if formal and actual parameters in subprogram calls have the same type; structural equivalence is used in most other situations. Parameters are passed by value or reference.
Complete static type checking is possible for correspondence of actual and formal parameter types in each subprogram call.
108
Pascal variant records Variant records have a part common to all records of that type, and a variable part, specific to some subset of the records. type kind = ( unary, binary) ; type { datatype } UBtree = record { ’a UBtree = record of } value: integer ; { ’a * ’a UBkind } case k: kind of { and ’a UBkind = } unary: ^UBtree ; { unary of ’a UBtree } binary: record { | binary of } left: ^UBtree ; { ’a UBtree * } right: ^UBtree { ’a UBtree ; } end end ;
Variant records introduce weaknesses into the type system for a language. 1. Compilers do not usually check that the value in the tag field is consistent with the state of the record. 2. Tag fields are optional. If omitted, no checking is possible at run time to determine which variant is present when a selection is made of a field in a variant.
109
Summary The Algol family of languages established the
command-oriented syntax, with blocks, local declarations, and recursive functions, that are used in most current programming languages.
110
g Topic V g
Object-oriented languages : Concepts and origins SIMULA and Smalltalk References:
The Algol family of languages is statically typed, as each
expression has a type that is determined by its syntactic form and the compiler checks before running the program to make sure that the types of operations and operands agree.
111
⋆ Chapters 10 and 11 of Concepts in programming languages by J. C. Mitchell. CUP, 2003. Chapters 8, and 12(§§2 and 3) of Programming
languages: Design and implementation (3 RD EDITION) by T. W. Pratt and M. V. Zelkowitz. Prentice Hall, 1999. 112
Objects in ML !? Chapter 7 of Programming languages: Concepts &
constructs by R. Sethi (2 ND 1996.
EDITION).
Addison-Wesley,
Chapters 14 and 15 of Understanding programming
languages by M Ben-Ari. Wiley, 1996. ⋆ B. Stroustrup. What is “Object-Oriented Programming”? (1991 revised version). Proc. 1st European Conf. on Object-Oriented Programming. (Available on-line from .)
exception Empty ; fun newStack(x0) = let val stack = ref [x0] in ref{ push = fn(x) => stack := ( x :: !stack ) , pop = fn() => case !stack of nil => raise Empty | h::t => ( stack := t; h ) }end ; exception Empty val newStack = fn : ’a -> {pop:unit -> ’a, push:’a -> unit} ref
113
114
val BoolStack = newStack(true) ; IntStack0 := !IntStack1 ;
val BoolStack = ref {pop=fn,push=fn} : {pop:unit -> bool, push:bool -> unit} ref
val it = () : unit
val IntStack0 = newStack(0) ;
#pop(!IntStack0)() ;
val IntStack0 = ref {pop=fn,push=fn} : {pop:unit -> int, push:int -> unit} ref
val it = 1 : int #push(!IntStack0)(4) ;
val IntStack1 = newStack(1) ;
val it = () : unit
val IntStack1 = ref {pop=fn,push=fn} : {pop:unit -> int, push:int -> unit} ref
115
116
Basic concepts in object-oriented languagesa
map ( #push(!IntStack0) ) [3,2,1] ; val it = [(),(),()] : unit list
Four main language concepts for object-oriented languages:
map ( #pop(!IntStack0) ) [(),(),(),()] ; val it = [1,2,3,4] : int list
1. Dynamic lookup. 2. Abstraction.
NB: ! The stack discipline for activation records fails!
3. Subtyping.
? Is ML an object-oriented language?
4. Inheritance.
! Of course not!
Notes from Chapter 10 of Concepts in programming languages by J. C. Mitchell. CUP, 2003. a
? Why?
118
117
/.
-,
()Dynamic lookup*+
Dynamic lookup means that when a message is sent to an
object, the method to be executed is selected dynamically, at run time, according to the implementation of the object that receives the message. In other words, the object “chooses” how to respond to a message.
Dynamic lookup is sometimes confused with
overloading, which is a mechanism based on static types of operands. However, the two are very different. ? Why?
The important property of dynamic lookup is that different objects may implement the same operation differently, and so may respond to the same message in different ways.
119
120
/. ()
-, *+
Abstraction
/.
inside a program unit with a specific interface. For objects, the interface usually consists of a set of methods that manipulate hidden data. Abstraction based on objects is similar in many ways to
abstraction based on abstract data types: Objects and abstract data types both combine functions and data, and abstraction in both cases involves distinguishing between a public interface and private implementation. Other features of object-oriented languages, however, make abstraction in object-oriented languages more flexible than abstraction with abstract data types.
-,
()Subtyping*+
Abstraction means that implementation details are hidden
Subtyping is a relation on types that allows values of
one type to be used in place of values of another. Specifically, if an object a has all the functionality of another object b, then we may use a in any context expecting b. The basic principle associated with subtyping is
substitutivity: If A is a subtype of B, then any expression of type A may be used without type error in any context that requires an expression of type B.
122
121
/. ()
-,
Inheritance*+
The primary advantage of subtyping is that it permits
uniform operations over various types of data.
Inheritance is the ability to reuse the definition of one
For instance, subtyping makes it possible to have heterogeneous data structures that contain objects that belong to different subtypes of some common type.
kind of object to define another kind of object. The importance of inheritance is that it saves the effort
Subtyping in an object-oriented language allows
functionality to be added without modifying general parts of a system.
123
of duplicating (or reading duplicated) code and that, when one class is implemented by inheriting from another, changes to one affect the other. This has a significant impact on code maintenance and modification.
124
History of objects SIMULA and Smalltalk
Inheritance is not subtyping
Objects were invented in the design of SIMULA and
refined in the evolution of Smalltalk.
Subtyping is a relation on interfaces,
SIMULA: The first object-oriented language.
inheritance is a relation on implementations. One reason subtyping and inheritance are often confused is that some class mechanisms combine the two. A typical example is C++, in which A will be recognized by the compiler as a subtype of B only if B is a public base class of A. Combining subtyping and inheritance is an elective design decision.
The object model in SIMULA was based on procedures activation records, with objects originally described as procedures that return a pointer to their own activation record. Smalltalk: A dynamically typed object-oriented language.
Many object-oriented ideas originated or were popularised by the Smalltalk group, which built on Alan Kay’s then-futuristic idea of the Dynabook. 126
125
A generic event-based simulation program
SIMULA
Q := make_queue(initial_event); repeat select event e from Q simulate event e place all events generated by e on Q until Q is empty
Extremely influential as the first language with classes
objects, dynamic lookup, subtyping, and inheritance. Originally designed for the purpose of simulation by
O.-J. Dahl and K. Nygaard at the Norwegian Computing Center, Oslo, in the 1960s.
naturally requires:
A data structure that may contain a variety of kinds of events. ; subtyping
The selection of the simulation operation according to the kind of event being processed. ; dynamic lookup
Ways in which to structure the implementation of related kinds of events. ; inheritance
SIMULA was designed as an extension and modification
of Algol 60. The main features added to Algol 60 were: class concepts and reference variables (pointers to objects); pass-by-reference; input-output features; coroutines (a mechanism for writing concurrent programs).
127
128
Objects in SIMULA
SIMULA
Class: A procedure returning a pointer to its activation record.
Object-oriented features Objects: A SIMULA object is an activation record
produced by call to a class.
Object: An activation record produced by call to a class, called an instance of the class.
; a SIMULA object is a closure
SIMULA implementations place objects on the heap. Objects are deallocated by the garbage collector (which
Classes: A SIMULA class is a procedure that returns
a pointer to its activation record. The body of a class may initialise the objects it creates. Dynamic lookup: Operations on an object are selected
deallocates objects only when they are no longer reachable from the program that created them).
from the activation record of that object.
130
129
Abstraction: Hiding was not provided in SIMULA 67 but Subtyping: Objects are typed according to the classes
was added later and used as the basis for C++. SIMULA 67 did not distinguish between public and private members of classes. A later version of the language, however, allowed attributes to be made “protected”, which means that they are accessible for subclasses (but not for other classes), or “hidden”, in which case they are not accessible to subclasses either.
131
that create them. Subtyping is determined by class hierarchy. Inheritance: A SIMULA class may be defined, by class
prefixing, as an extension of a class that has already been defined including the ability to redefine parts of a class in a subclass.
132
SIMULA Sample code
SIMULA Further object-related features Inner, which indicates that the method of a subclass
should be called in combination with execution of superclass code that contains the inner keyword. Inspect and qua, which provide the ability to test the type
of an object at run time and to execute appropriate code accordingly. (inspect is a class (type) test, and qua is a form of type cast that is checked for correctness at run time.)
a
CLASS POINT(X,Y); REAL X, Y; COMMENT***CARTESIAN REPRESENTATION BEGIN BOOLEAN PROCEDURE EQUALS(P); REF(POINT) P; IF P =/= NONE THEN EQUALS := ABS(X-P.X) + ABS(Y-P.Y) < 0.00001; REAL PROCEDURE DISTANCE(P); REF(POINT) P; IF P == NONE THEN ERROR ELSE DISTANCE := SQRT( (X-P.X)**2 + (Y-P.Y)**2 ); END***POINT*** See Chapter 4(§1) of SIMULA begin (2nd edition) by G. Birtwistle, O.-J. Dahl, B. Myhrhug, and K. Nygaard. Chartwell-Bratt Ltd., 1980. a
133
CLASS LINE(A,B,C); REAL A,B,C; COMMENT***Ax+By+C=0 REPRESENTATION BEGIN BOOLEAN PROCEDURE PARALLELTO(L); REF(LINE) L; IF L =/= NONE THEN PARALLELTO := ABS( A*L.B - B*L.A ) < 0.00001;
134
COMMENT*** INITIALISATION CODE REAL D; D := SQRT( A**2 + B**2 ) IF D = 0.0 THEN ERROR ELSE BEGIN D := 1/D; A := A*D; B := B*D; C := C * D; END; END***LINE***
REF(POINT) PROCEDURE MEETS(L); REF(LINE) L; BEGIN REAL T; IF L =/= NONE and ~PARALLELTO(L) THEN BEGIN ... MEETS :- NEW POINT(...,...); END; END;***MEETS*** 135
136
SIMULA
Example:
Subclasses and inheritance POINT CLASS COLOREDPOINT(C); COLOR C; BEGIN BOOLEAN PROCEDURE EQUALS(Q); REF(COLOREDPOINT) Q; ...; END***COLOREDPOINT**
SIMULA syntax for a class C1 with subclasses C2 and C3 is CLASS C1 ; C1 CLASS C2 ; C1 CLASS C3 ;
REF(POINT) P; REF(COLOREDPOINT) CP; P :- NEW POINT(1.0,2.5); CP :- NEW COLOREDPOINT(2.5,1.0,RED);
When we create a C2 object, for example, we do this by first creating a C1 object (activation record) and then appending a C2 object (activation record).
NB: SIMULA 67 did not hide fields. Thus, CP.C := BLUE; changes the color of the point referenced by CP. 138
137
Examples: 1. CLASS A;
SIMULA
REF(A) a;
Object types and subtypes
A CLASS B; REF(B) b;
a :- b; COMMENT***legal since B is ***a subclass of A ...
All instances of a class are given the same type. The
name of this type is the same as the name of the class.
b :- a; COMMENT***also legal, ***run time to ***a points to ***as to avoid
The class names (types of objects) are arranged in a
subtype hierarchy corresponding exactly to the subclass hierarchy.
but checked at make sure that a B object, so a type error
2. inspect a when B do b :- a otherwise ... 139
140
3. An error in the original SIMULA type checker surrounding the relationship between subtyping and inheritance: CLASS A;
REF(A) a;
REF(B) b;
SIMULA also uses the semantically incorrect principle that, if B e) : γ
γ ≈ α −> β
α0 ≈ α1 −> α2 ,
α2 ≈ α3 −> α4 ,
α7 ≈ α8 −> α6 ,
α5 ≈ α6 −> α4 ,
α7 ≈ α1 ,
α5 ≈ α1
α8 ≈ α3
Solution: α0 = (α3 −> α3) −> α3 −> α3 177
/.
-,
()Polymorphism*+
Polymorphism, which literally means “having multiple forms”, refers to constructs that can take on different types as needed. Forms of polymorphism in contemporary programming languages: Parametric polymorphism. A function may be applied to any arguments whose types match a type expression involving type variables. Parametric polymorphism may be: Implicit. Programs do not need to contain types; types and instantiations of type variables are computed.
178
Explicit. The program text contains type variables that determine the way that a construct may be treated polymorphically. Explicit polymorphism often involves explicit instantiation or type application to indicate how type variables are replaced with specific types in the use of a polymorphic construct. Example: C++ templates. Ad hoc polymorphism or overloading. Two or more implementations with different types are referred to by the same name. Subtype polymorphism. The subtype relation between types allows an expression to have many possible types.
Example: SML. 179
180
let-polymorphism
Polymorphic exceptions
The standard sugaring
let val x = v in e end
7→
Example: Depth-first search for finitely-branching trees. datatype ’a FBtree = node of ’a * ’a FBtree list ; fun dfs P (t: ’a FBtree) = let exception Ok of ’a; fun auxdfs( node(n,F) ) = if P n then raise Ok n else foldl (fn(t,_) => auxdfs t) NONE F ; in auxdfs t handle Ok n => SOME n end ;
(fn x => e)(v)
does not respect ML type checking. For instance let val f = fn x => x in f(f) end type checks, whilst
(fn f => f(f))(fn x => x) does not. Type inference for let-expressions is involved, requiring
type schemes.
val dfs = fn : (’a -> bool) -> ’a FBtree -> ’a option 182
181
g Topic VII g
When a polymorphic exception is declared, SML ensures that it is used with only one type. The type of a top level exception must be monomorphic and the type variables of a local exception are frozen.
Data abstraction and modularity SML Modulesa
Consider the following nonsense:
References: Chapter 7 of ML for the working programmer (2 ND
exception Poly of ’a ; (*** ILLEGAL!!! ***) (raise Poly true) handle Poly x => x+1 ;
EDITION)
by L. C. Paulson. CUP, 1996.
Largely based on an Introduction to SML Modules by Claudio Russo . a
183
184
The Core and Modules languages SML consists of two sub-languages:
The Standard ML Basis Library edited by E. R. Gansner
and J. H. Reppy. CUP, 2004.
The Core language is for programming in the small, by
supporting the definition of types and expressions denoting values of those types.
[A useful introduction to SML standard libraries, and a good example of modular programming.]
The Modules language is for programming in the large,
by grouping related Core definitions of types and expressions into self-contained units, with descriptive interfaces. The Core expresses details of data structures and algorithms. The Modules language expresses software architecture. Both languages are largely independent. 185
186
The Modules language Writing a real program as an unstructured sequence of Core definitions quickly becomes unmanageable. type nat = int val zero = 0 fun succ x = x + 1 fun iter b f i = if i = zero then b else f (iter b f (i-1)) ... (* thousands of lines later *) fun even (n:nat) = iter true not n The SML Modules language lets one split large programs into separate units with descriptive interfaces. 187
SML Modules Signatures and structures An abstract data type is a type equipped with a set of operations, which are the only operations applicable to that type. Its representation can be changed without affecting the rest of the program. Structures let us package up declarations of related
types, values, and functions. Signatures let us specify what components a structure
must contain. 188
Structures
The dot notation
In Modules, one can encapsulate a sequence of Core type and value definitions into a unit called a structure.
One can name a structure by binding it to an identifier. structure IntNat = struct type nat = int ... fun iter b f i = ... end
We enclose the definitions in between the keywords struct ... end. Example: A structure representing the natural numbers, as positive integers. struct type nat = int val zero = 0 fun succ x = x + 1 fun iter b f i = if i = zero then b else f (iter b f (i-1)) end
Components of a structure are accessed with the dot notation. fun even (n:IntNat.nat) = IntNat.iter true not n NB: Type IntNat.nat is statically equal to int. Value IntNat.iter dynamically evaluates to a closure. 189
Nested structures
190
Concrete signatures
Structures can be nested inside other structures, in a hierarchy. structure IntNatAdd = struct structure Nat = IntNat fun add n m = Nat.iter m Nat.succ n end ... fun mult n m = IntNatAdd.Nat.iter IntNatAdd.Nat.zero (IntNatAdd.add m)
Signature expressions specify the types of structures by listing the specifications of their components.
The dot notation (IntNatAdd.Nat) accesses a nested structure.
A signature expression consists of a sequence of component specifications, enclosed in between the keywords sig . . . end. sig type nat = int val zero : nat val succ : nat -> nat val ’a iter : ’a -> (’a->’a) -> nat -> ’a end
Sequencing dots provides deeper access (IntNatAdd.Nat.zero).
This signature fully describes the type of IntNat.
Nesting and dot notation provides name-space control.
The specification of type nat is concrete: it must be int. 191
192
Opaque signatures
Example: Polymorphic functional stacks.
On the other hand, the following signature sig type nat val zero : nat val succ : nat -> nat val ’a iter : ’a -> (’a->’a) -> nat -> ’a end specifies structures that are free to use any implementation for type nat (perhaps int, or word, or some recursive datatype).
signature STACK = sig exception E type ’a reptype (* ’a reptype -> ’a reptype val pop: ’a reptype -> ’a reptype val top: ’a reptype -> ’a end ;
This specification of type nat is opaque.
194
193
structure MyStack: STACK = struct exception E ; type ’a reptype = ’a list val new = [] ; fun push x s = x::s ; fun split( h::t ) = ( h , | split _ = raise E ; fun pop s = #2( split s ) fun top s = #1( split s ) end ;
val MyEmptyStack = MyStack.new ; val MyStack0 = MyStack.push 0 MyEmptyStack ; val MyStack01 = MyStack.push 1 MyStack0 ; val MyStack0’ = MyStack.pop MyStack01 ; MyStack.top MyStack0’ ;
;
val val val val val
t ) ; ;
195
MyEmptyStack = [] : ’a MyStack.reptype MyStack0 = [0] : int MyStack.reptype MyStack01 = [1,0] : int MyStack.reptype MyStack0’ = [0] : int MyStack.reptype it = 0 : int
196
Named and nested signatures
Signature inclusion
Signatures may be named and referenced, to avoid repetition: signature NAT = sig type nat val zero : nat val succ : nat -> nat val ’a iter : ’a -> (’a->’a) -> nat -> ’a end
To avoid nesting, one can also directly include a signature identifier: sig include NAT val add: nat -> nat ->nat end NB: This is equivalent to the following signature. sig type nat val zero: nat val succ: nat -> nat val ’a iter: ’a -> (’a->’a) -> nat -> ’a val add: nat -> nat -> nat end
Nested signatures specify named sub-structures: signature Add = sig structure Nat: NAT (* references NAT *) val add: Nat.nat -> Nat.nat -> Nat.nat end 197
Signature matching Q:
When does a structure satisfy a signature?
A:
The type of a structure matches a signature whenever it
198
Properties of signature matching The components of a structure can be defined in a
different order than in the signature; names matter but ordering does not.
implements at least the components of the signature. •
The structure must realise (i.e. define) all of the opaque
•
The structure must enrich this realised signature,
type components in the signature.
A structure may contain more components, or
components of more general types, than are specified in a matching signature.
component-wise: ⋆
every concrete type must be implemented equivalently;
⋆
every specified value must have a more general type scheme;
⋆
every specified structure must be enriched by a
Signature matching is structural. A structure can match
many signatures and there is no need to pre-declare its matching signatures (unlike “interfaces” in Java and C#). Although similar to record types, signatures actually
substructure.
play a number of different roles. 199
200
Using signatures to restrict access The following structure uses a signature constraint to provide a restricted view of IntNat:
Subtyping Signature matching supports a form of subtyping not found in the Core language: A structure with more type, value, and structure
components may be used where fewer components are expected.
structure ResIntNat = IntNat : sig type nat val succ : nat->nat val iter : nat->(nat->nat)->nat->nat end NB: The constraint str:sig prunes the structure str according to the signature sig:
A value component may have a more general type
scheme than expected.
ResIntNat.zero is undefined; ResIntNat.iter is less polymorphic that IntNat.iter. 202
201
Transparency of
SML Modules
:
Information hiding Although the : operator can hide names, it does not conceal the definitions of opaque types. Thus, the fact that ResIntNat.nat = IntNat.nat = int remains transparent. For instance the application ResIntNat.succ(~3) is still well-typed, because ~3 has type int . . . but ~3 is negative, so not a valid representation of a natural number!
203
In SML, we can limit outside access to the components of a structure by constraining its signature in transparent or opaque manners. Further, we can hide the representation of a type by means of an abstype declaration. The combination of these methods yields abstract structures.
204
The actual implementation of AbsNat.nat by int is
Using signatures to hide the identity of types
hidden, so that AbsNat.nat 6= int.
With different syntax, signature matching can also be used to enforce data abstraction: structure AbsNat = IntNat :> sig type nat val zero: nat val succ: nat->nat val ’a iter: ’a->(’a->’a)->nat->’a end
AbsNat is just IntNat, but with a hidden type representation. AbsNat defines an abstract datatype of natural numbers:
the only way to construct and use values of the abstract type AbsNat.nat is through the operations, zero, succ, and iter. E.g., the application AbsNat.succ(~3) is ill-typed: ~3 has type int, not AbsNat.nat. This is what we want, since ~3 is not a natural number in our representation. In general, abstractions can also prune and specialise components.
The constraint str :> sig prunes str but also generates a new, abstract type for each opaque type in sig.
206
205
2. abstypes 1. Opaque signature constraints structure MyOpaqueStack :> STACK = MyStack ; val MyEmptyOpaqueStack = MyOpaqueStack.new ; val MyOpaqueStack0 = MyOpaqueStack.push 0 MyEmptyOpaqueStack val MyOpaqueStack01 = MyOpaqueStack.push 1 MyOpaqueStack0 val MyOpaqueStack0’ = MyOpaqueStack.pop MyOpaqueStack01 MyOpaqueStack.top MyOpaqueStack0’ ; val val val val val
MyEmptyOpaqueStack = - : ’a MyOpaqueStack.reptype MyOpaqueStack0 = - : int MyOpaqueStack.reptype MyOpaqueStack01 = - : int MyOpaqueStack.reptype MyOpaqueStack0’ = - : int MyOpaqueStack.reptype it = 0 : int 207
structure MyHiddenStack: STACK = struct exception E ; abstype ’a reptype = S of ’a list (* unit val pop: unit -> unit val top: unit -> itemtype end ;
structure IntNatAdd = AddFun(IntNat) structure AbsNatAdd = AddFun(AbsNat) The actual argument must match the signature of the formal parameter—so it can provide more components, of more general types. Above, AddFun is applied twice, but to arguments that differ in their implementation of type nat (AbsNat.nat 6= IntNat.nat).
214
213
exception E ; functor Stack( T: sig type atype end ) : STACK = struct type itemtype = T.atype val stack = ref( []: itemtype list ) fun push x = ( stack := x :: !stack ) fun pop() = case !stack of [] => raise E | _::s => ( stack := s ) fun top() = case !stack of [] => raise E | t::_ => t end ;
structure intStack = Stack(struct type atype = int end) ; structure intStack : STACK intStack.push(0) ; intStack.top() ; intStack.pop() ; intStack.push(4) ; val val val val 215
it it it it
= = = =
() : unit 0 : intStack.itemtype () : unit () : unit 216
Why functors ? Functors support:
map ( intStack.push ) [3,2,1] ; map ( fn _ => let val top = intStack.top() in intStack.pop(); top end ) [(),(),(),()] ;
Code reuse. AddFun may be applied many times to different structures, reusing its body.
val it = [(),(),()] : unit list val it = [1,2,3,4] : intStack.itemtype list
Code abstraction. AddFun can be compiled before any argument is implemented. Type abstraction. AddFun can be applied to different types N.nat.
217
g Topic VIII g
218
A Scala Tutorial for Java Programmers by M. Schinz
The state of the art Scala < www.scala-lang.org >
and P. Haller. Programming Methods Laboratory, EPFL, 2008.
References: Scala By Example by M. Odersky. Programming
Methods Laboratory, EPFL, 2008. An overview of the Scala programming language by
M. Odersky et al. Technical Report LAMP-REPORT-2006-001, Second Edition, 2006. 219
220
Scala (I)
Scala has been designed to work well with Java and
C#.
Scala has been developed from 2001 in the Programming
Every Java class is seen in Scala as two entities, a class containing all dynamic members and a singleton object, containing all static members.
Methods Laboratory at EPFL by a group lead by Martin Odersky. It was first released publicly in 2004, with a second version released in 2006.
Scala classes and objects can also inherit from Java classes and implement Java interfaces. This makes it possible to use Scala code in a Java framework.
Scala is aimed at the construction of components and
component systems. One of the major design goals of Scala was that it should be flexible enough to act as a convenient host language for domain specific languages implemented by library modules.
Scala’s influences: Beta, C#, FamilyJ, gbeta, Haskell,
Java, Jiazzi, ML≤ , Moby, MultiJava, Nice, OCaml, Pizza, Sather, Smalltalk, SML, XQuery, etc.
222
221
A procedural language ! def qsort( xs: Array[Int] ) { def swap(i: Int, j:Int) { val t = xs(i); xs(i) = xs(j); xs(j) = t } def sort(l: Int, r: Int) { val pivot = xs( (l+r)/2 ); var i = l; var j = r while (i E
{ def f ( x1: T1 , ... , xn: Tn ) = E ; f }
Scala uses the usual block-structured scoping rules.
where f is a fresh name which is used nowhere else in the program.
229
230
Classes and objects Parameter passing classes provide fields and methods. These are
Scala uses call-by-value by default, but it switches to call-by-name evaluation if the parameter type is preceded by =>.
accessed using the dot notation. However, there may be private fields and methods that are inaccessible outside the class.
Imperative control structures A functional implementation of while loops: def whileLoop( cond: => Boolean )( comm: => Unit ) { if (cond) comm ; whileLoop( cond )( comm ) }
231
Scala, being an object-oriented language, uses dynamic dispatch for method invocation. Dynamic method dispatch is analogous to higher-order function calls. In both cases, the identity of the code to be executed is known only at run-time. This similarity is not superficial. Indeed, Scala represents every function value as an object.
232
Every class in Scala has a superclass which it extends.
A class inherits all members from its superclass. It may also override (i.e. redefine) some inherited members.
Methods in Scala do not necessarily take a parameter
If class A extends class B, then objects of type A may be used wherever objects of type B are expected. We say in this case that type A conforms to type B. Scala maintains the invariant that interpreting a value of a
subclass as an instance of its superclass does not change the representation of the value. Amongst other things, it guarantees that for each pair of types S x case Sum(l,r) => eval(l) + eval(r) case Prod(l,r) => eval(l) * eval(r) }
Case classes implicitly come with nullary accessor
methods which retrieve the constructor arguments. Case classes allow the constructions of patterns which
refer to the case class constructor.
If none of the patterns matches, the pattern matching expression is aborted with a MatchError exception.
243
244
Generic types Generic types and methods
Variance annotations The combination of type parameters and subtyping poses some interesting questions.
Classes in Scala can have type parameters.
abstract class Set[A] { def incl( x: A ): Set[A] def contains( x: A ): Boolean }
?
If T is a subtype of a type S, should Array[T] be a subtype of the type Array[S]?
!
No, if one wants to avoid run-time checks!
Example: val x = new Array[String](1) val y: Array[Any] = x // disallowed in Scala // because Array is not // covariant y.update( 0 , new Rational(1,2) )
Scala has a fairly powerful type inferencer which allows
one to omit type parameters to polymorphic functions and constructors.
246
245
Scala uses a conservative approximation to verify soundness of variance annotations: a covariant type parameter of a class may only appear in co-variant position inside the class. Hence, the following class definition is rejected:
In Scala, generic types like the following one: class Array[A] { def apply( index: Int ): A ... def update( index: Int, elem: A ) ... }
have by default non-variant subtyping. However, one can enforce co-variant subtyping by prefixing a formal type parameter with a +. There is also a prefix - which indicates contra-variant subtyping.
247
class Array[+A] { def apply( index: Int ): A ... def update( index:Int , elem: A ) ... }
248
Functions are objects Recall that Scala is an object-oriented language in that every value is an object. It follows that functions are objects in Scala. Indeed, the function type ( A1, ..., Ak ) => B is equivalent to the following parameterised class type:
Since function types are classes in Scala, they can be further refined in subclasses. An example are arrays, which are treated as special functions over the type of integers.
new Function1[Int,Int] { def apply( x:Int ): Int = x+1 } Conversely, when a value of a function type is applied to some arguments, the apply method of the type is implicitly inserted; e.g. for f and object of type Function1[A,B], the application f(x) is expanded to f.apply(x).
abstract class Functionk[-A1,...,-Ak,+B] { def apply( x1:A1,...,xn:Ak ): B }
The function x => x+1 would be expanded to an instance of Function1 as follows:
NB: Function subtyping is contra-variant in its arguments whereas it is co-variant in its result. ? Why?
249
250
Generic types Type parameter bounds
Generic types
trait Ord[A] { def lt( that: A ): Boolean } case class Num( value: Int ) extends Ord[Num] { def lt( that: Num ) = this.value < that.value }
trait def def def }
View bounds One problem with type parameter bounds is that they require forethought: if we had not declared Num as a subclass of Ord, we would not have been able to use Num elements in heaps. By the same token, Int is not a subclass of Ord, and so integers cannot be used as heap elements.
Heap[ A o.lt(pivot,x)) ) ) }
256
The principal idea behind implicit parameters is that arguments for them can be left out from a method call. If the arguments corresponding to implicit parameters are missing, they are inferred by the Scala compiler.
Mixin-class composition Every class or object in Scala can inherit from several traits in addition to a normal class. trait AbsIterator[T] { def hasNext: Boolean def next: T }
NB: View bounds are convenient syntactic sugar for implicit parameters. Implicit conversions
As last resort in case of type mismatch the Scala compiler will try to apply an implicit conversion.
trait RichIterator[T] extends AbsIterator[T] { def foreach( f: T => Unit ): Unit = while (hasNext) f(next) }
implicit def int2ord( x: Int ): Ord[Int] = new Ord[Int] { def lt( y: Int ) = x < y } Implicit conversions can also be applied in member selections.
258
257
object Test { def main( args: Array[String] ): Unit = { class Iter extends StringIterator(args(0)) with RichIterator[Char] val iter = new Iter iter.foreach(System.out.println) } }
class StringIterator( s: String ) extends AbsIterator[Char] { private var i = 0 def hasNext = i < s.length def next = { val x = s charAt i; i = i+1; x } } Traits can be used in all contexts where other abstract classes appear; however only traits can be used as mixins.
259
The class Iter is constructed from a mixin composition of the parents StringIterator (called the superclass) and RichIterator (called a mixin) so as to combine their functionality.
260
Language innovations The class Iter inherits members from both StringIterator and RichIterator.
Flexible syntax and type system.
NB: Mixin-class composition is a form of multiple inheritance!
Pattern matching over class hierarchies unifies
functional and object-oriented data access. Abstract types and mixin composition unify
concepts from object and module systems.
261
262