Concepts in Programming Languages Practicalities Main books ...

16 downloads 163 Views 344KB Size Report
V. Object-oriented languages — Concepts and origins: ... What is a programming language!? ♢ Study .... Chapter 10(§1) of Programming Languages: Design and ..... x := E; execute the body of procedure P; if P is a function, return a result. 79.
Concepts in Programming Languages Practicalities Marcelo Fiore

 Course web page:

Computer Laboratory University of Cambridge

with lecture slides and reading material.  One exam question.

Easter 2011

2

1

Topics

Main books

I. Introduction and motivation.

 J. C. Mitchell. Concepts in programming languages.

II. The first procedural language: FORTRAN (1954–58).

Cambridge University Press, 2003.

III. The first declarative language: LISP (1958–62).  T. W. Pratt and M. V. Zelkowitz. Programming Languages:

Design and implementation (3 RD 1999.

EDITION).

Prentice Hall,

IV. Block-structured procedural languages: Algol (1958–68), Pascal (1970).

V. Object-oriented languages — Concepts and origins: ⋆ M. L. Scott. Programming language pragmatics (2 ND EDITION). Elsevier, 2006.

Simula (1964–67), Smalltalk (1971–80).

VI. Types in programming languages: ML (1973–1978).

 R. Sethi. Programming languages: Concepts & constructs

(2 ND

EDITION).

Addison-Wesley, 1996.

VII. Data abstraction and modularity: SML Modules (1984–97). VIII. The state of the art: Scala (2007)

3

4

g Topic I g

Goals

Introduction and motivation References:

 Critical thinking about programming languages.

? What is a programming language!?

 Chapter 1 of Concepts in programming languages by

 Study programming languages.

J. C. Mitchell. CUP, 2003.  Chapter 1 of Programming languages: Design and

implementation (3 RD EDITION) by T. W. Pratt and M. V. Zelkowitz. Prentice Hall, 1999.

EDITION)

Be familiar with basic language concepts.



Appreciate trade-offs in language design.

 Trace history, appreciate evolution and diversity of ideas.

 Chapter 1 of Programming language pragmatics

(2 ND



 Be prepared for new programming methods, paradigms.

by M. L. Scott. Elsevier, 2006. 5

Why study programming languages?

6

What makes a good language?

 To improve the ability to develop effective algorithms.  Clarity, simplicity, and unity.

 To improve the use of familiar languages.

 Orthogonality.

 To increase the vocabulary of useful programming

constructs.

 Naturalness for the application.

 To allow a better choice of programming language.

 Support of abstraction.

 To make it easier to learn a new language.

 Ease of program verification.

 To make it easier to design a new language.

 Programming environments.

 To simulate useful features in languages that lack them.  Portability of programs.

 To make better use of language technology wherever it

appears. 7

8

What makes a language successful?  Expressive power.

 Cost of use. 

Cost of execution.

 Ease of use for the novice.



Cost of program translation.

 Ease of implementation.



Cost of program creation, testing, and use.

 Open source.



Cost of program maintenance.

 Excellent compilers.  Economics, patronage, and inertia.

9

10

Applications domains Influences

Era

Application

Major languages

Other languages

1960s

Business

COBOL

Assembler

Scientific

FORTRAN

A LGOL, BASIC, APL

System

Assembler

JOVIAL, Forth

AI

LISP

SNOBOL

Business

COBOL, SQL, spreadsheet

C, PL/I, 4GLs

Scientific

FORTRAN, C, C++ Maple, Mathematica

BASIC, Pascal

System

BCPL, C, C++

Pascal, Ada, BASIC, MODULA

AI

LISP, Prolog

Publishing

TEX, Postscript, word processing

Process

UNIX shell, TCL, Perl

Marvel, Esterel

New paradigms

Smalltalk, SML, Haskell, Java Python, Ruby

Eifell, C#, Scala

 Computer capabilities.  Applications.

Today

 Programming methods.  Implementation methods.  Theoretical studies.  Standardisation.

11

12

?> 89 :; Motivating application in language design=
=< 89 Classification of programming languages:;

 Imperative

Examples:

procedural

C,Ada,Pascal,Algol,FORTRAN, . . .

object oriented

Scala, C#,Java, Smalltalk, SIMULA, . .

scripting

Perl, Python, PHP, . . .

 Formal-language theory.  Automata theory.  Algorithmics.  λ-calculus.

 Declarative

functional

Haskell, SML, Lisp, Scheme, . . .

 Semantics.

logic

Prolog

 Formal verification.

dataflow

Id, Val

 Type theory.

constraint-based

spreadsheets

 Complexity theory.

template-based

XSLT

 Logic. 17

18

Language standardisation

Standardisation

Consider: int i; i = (1 && 2) + 3 ;  Proprietary standards.

? Is it valid C code? If so, what’s the value of i?

 Consensus standards.

? How do we answer such questions!?



ANSI (American National Standards Institute)



IEEE (Institute of Electrical and Electronics Engineers)



BSI (British Standard Institute)

! Try it and see!



ISO (International Standards Organisation)

! Read the ANSI C Standard.

19

! Read the reference manual.

20

Language-standards issues

Language standards PL/1

Timeliness. When do we standardise a language? Conformance. What does it mean for a program to adhere to a standard and for a compiler to compile a standard?

?

What does the following 9 + 8/3 mean? − 11.666... ?

Ambiguity and freedom to optimise — Machine dependence — Undefined behaviour.

− Overflow ?

Obsolescence. When does a standard age and how does it get modified?

− 1.666... ?

Deprecated features.

21

22

For 9 + 8/3 we have: DEC(p,q) means p digits with q after the decimal point. Type rules for DECIMAL in PL/1: DEC(p1,q1) + DEC(p2,q2) => DEC(MIN(1+MAX(p1-q1,p2-q2)+MAX(q1,q2),15),MAX(q1,q2) DEC(p1,q1) / DEC(p2,q2) => DEC(15,15-((p1-q1)+q2))

23

DEC(1,0) + DEC(1,0)/DEC(1,0) => DEC(1,0) + DEC(15,15-((1-0)+0)) => DEC(1,0) + DEC(15,14) => DEC(MIN(1+MAX(1-0,15-14)+MAX(0,14),15),MAX(0,14)) => DEC(15,14) So the calculation is as follows 9 + 8/3 = 9 + 2.66666666666666 = 11.66666666666666 - OVERFLOW = 1.66666666666666 - OVERFLOW disabled

24

History 1951–55: Experimental use of expression compilers.

1981–85: Smalltalk-80, Prolog, Ada 83.

1956–60: FORTRAN, COBOL, LISP, Algol 60.

1986–90: C++, SML, Haskell.

1961–65: APL notation, Algol 60 (revised), SNOBOL, CPL.

1991–95: Ada 95, TCL, Perl.

1966–70: APL, SNOBOL 4, FORTRAN 66, BASIC, SIMULA, Algol 68, Algol-W, BCPL.

1996–2000: Java. 2000–05: C#, Python, Ruby, Scala.

1971–75: Pascal, PL/1 (Standard), C, Scheme, Prolog. 1976–80: Smalltalk, Ada, FORTRAN 77, ML.

25

26

Language groups Things to think about

 Multi-purpose languages 

Scala, C#, Java, C++, C



Haskell, SML, Scheme, LISP



Perl, Python, Ruby

 What makes a good language?  The role of

1. motivating applications,

 Special-purpose languages

2. program execution,



UNIX shell

3. theoretical foundations



SQL

in language design.



LATEX

 Language standardisation.

27

28

FORTRAN = FORmula TRANslator

g Topic II g

FORTRAN : A simple procedural language References:

 Chapter 10(§1) of Programming Languages: Design and

implementation (3 RD EDITION) by T. W. Pratt and M. V. Zelkowitz. Prentice Hall, 1999.

(1957)  Developed in the 1950s by an IBM team led by John Backus.  The first high-level programming language to become widely used.  At the time the utility of any high-level language was open to question! The main complain was the efficiency of compiled code. This heavily influenced the designed, orienting it towards

 The History of FORTRAN I, II, and III by J. Backus. In

providing execution efficiency.

History of Programming Languages by R. L. Wexelblat. Academic Press, 1981.

 Standards:

1966, 1977 (FORTRAN 77), 1990 (FORTRAN 90). 29

30

John Backus Overview Execution model

As far as we were aware, we simply made up the language as we went along. We did not regard language design as a difficult problem, merely a simple prelude to the real problem: designing a compiler which could produce efficient programs.a

 FORTRAN program = main program + subprograms 

Each is compiled separate from all others.



Translated programs are linked into final executable form during loading.

 All storage is allocated statically before program execution

begins; no run-time storage management is provided.  Flat register machine. No stacks, no recursion. Memory In R. L. Wexelblat, History of Programming Languages, Academic Press, 1981, page 30. a

31

arranged as linear array.

32

Overview

Overview

Compilation

Data types

FORTRAN program

 Numeric data: Integer, real, complex, double-precision

real. 

Compiler

 Boolean data.

called logical



Incomplete machineUlanguage

Library routines

UUUU UUUU UUUU *

o ooo o o w o

Linker

 Arrays.

of fixed declared length

 Character strings.

of fixed declared length

 Files. 

Machine language program

33

34

Example Overview Control structures 10 100

 FORTRAN 66

Relied heavily on statement labels and GOTO statements.  FORTRAN 77

Added some modern control structures (e.g., conditionals). 999

35

PROGRAM MAIN PARAMETER (MaXsIz=99) REAL A(mAxSiZ) READ (5,100,END=999) K FORMAT(I5) IF (K.LE.0 .OR. K.GT.MAXSIZ) STOP READ *,(A(I),I=1,K) PRINT *,(A(I),I=1,K) PRINT *,’SUM=’,SUM(A,K) GO TO 10 PRINT *, "All Done" STOP END 36

Example Commentary C SUMMATION SUBPROGRAM FUNCTION SUM(V,N) REAL V(N) SUM = 0.0 DO 20 I = 1,N SUM = SUM + V(I) 20 CONTINUE RETURN END

 Columns and lines are relevant.  Blanks are ignored (by early FORTRANs).  Variable names are from 1 to 6 characters long,

begin with a letter, and contain letters and digits.  Programmer-defined constants.  Arrays: when sizes are given, lower bounds are

assumed to be 1; otherwise subscript ranges must be explicitly declared.  Variable types may not be declared: implicit naming

convention. 37

38

 Data formats.  FORTRAN 77 has no while statement.

On syntax

 Functions are compiled separately from the main program.

Information from the main program is not used to pass information to the compiler. Failure may arise when the loader tries to merge subprograms with main program.

A misspelling bug . . . do 10 i = 1,100

vs.

do 10 i = 1.100

 Function parameters are uniformly transmitted by

. . . that is reported to have caused a rocket to explode upon launch into space!

reference (or value-result). Recall that allocation is done statically.  DO loops by increment.  A value is returned in a FORTRAN function by assigning a

value to the name of a function. 39

40

Types

Storage

 FORTRAN has no mechanism for creating user types.

Representation and Management

 Static type checking is used in FORTRAN, but the

 Storage representation in FORTRAN is sequential.

checking is incomplete.

 Only two levels of referencing environment are provided,

Many language features, including arguments in subprogram calls and the use of COMMON blocks, cannot be statically checked (in part because subprograms are compiled independently).

global and local. The global environment may be partitioned into separate common environments that are shared amongst sets of subprograms, but only data objects may be shared in this way.

Constructs that cannot be statically checked are ordinarily left unchecked at run time in FORTRAN implementations.

42

41

 COMMON

The sequential storage representation is critical in the definition of the EQUIVALENCE and COMMON declarations.

The global environment is set up in terms of sets of variables and arrays, which are termed COMMON blocks.

 EQUIVALENCE

This declaration allows more than one simple or subscripted variable to refer to the same storage location.

A COMMON block is a named block of storage and may contain the values of any number of simple variables and arrays.

? Is this a good idea?

COMMON blocks may be used to isolate global data to only a few subprograms needing that data.

Consider the following:

? Is the COMMON block a good idea?

REAL X INTEGER Y EQUIVALENCE (X,Y)

Consider the following:

43

COMMON/BLK/X,Y,K(25)

in MAIN

COMMON/BLK/U,V,I(5),M(4,5)

in SUB 44

/. ()

-,

Aliasing*+ /. ()

Aliasing occurs when two names or expressions

-, *+

Parameters

refer to the same object or location.

There are two concepts that must be clearly distinguished.  Aliasing raises serious problems for both the user

 The parameter names used in a function declaration

and implementor of a language.

are called formal parameters.

 Because of the problems caused by aliasing, new

 When a function is called, expressions called actual

language designs sometimes attempt to restrict or eliminate altogether features that allow aliases to be constructed.

parameters are used to compute the parameter values for that call.

46

45

 The language specifies that if a formal parameter is

FORTRAN subroutines and functions  Actual parameters may be simple variables, literals,

array names, subscripted variables, subprogram names, or arithmetic or logical expressions.

assigned to, the actual parameter must be a variable, but because of independent compilation this rule cannot be checked by the compiler. Example: SUBROUTINE SUB(X,Y) X = Y END

The interpretation of a formal parameter as an array is done by the called subroutine.  Each subroutine is compiled independently and no

checking is done for compatibility between the subroutine declaration and its call.

CALL SUB(-1.0,1.0)  Parameter passing is uniformly by reference.

47

48

 J. McCarthy. Recursive functions of symbolic expressions

g Topic III g

and their computation by machine. Communications of the ACM, 3(4):184–195, 1960.a

LISP : functions, recursion, and lists

 J. McCarthy. History of LISP. In History of Programming

References:

Languages. Academic Press, 1981.

 Chapter 3 of Concepts in programming languages by

J. C. Mitchell. CUP, 2003.  Chapters 5(§4.5) and 13(§1) of Programming languages:

Design and implementation (3 RD EDITION) by T. W. Pratt and M. V. Zelkowitz. Prentice Hall, 1999.

Available on-line recursive.html>. a

from

’h’, First => 24); ? What about in ML? Can it be simulated?

 the time that the actual parameter is evaluated, and

This is commonly used together with default parameters:

 the location used to store the parameter value.

procedure Proc(First: Integer := 0; Second: Character := Proc(Second => ’o’);

NB: The location of a variable (or expression) is called its L-value, and the value stored in this location is called the R-value of the variable (or expression).

77

/. ()

78

-,

Parameter passing*+

/. ()

Pass/Call-by-value

-,

Parameter passing*+

 In pass-by-value, the actual parameter is evaluated. The

value of the actual parameter is then stored in a new location allocated for the function parameter.

Pass/Call-by-reference

 In pass-by-reference, the actual parameter must have

an L-value. The L-value of the actual parameter is then bound to the formal parameter.

 Under call-by-value, a formal parameter corresponds to

the value of an actual parameter. That is, the formal x of a procedure P takes on the value of the actual parameter. The idea is to evaluate a call P(E) as follows: x := E; execute the body of procedure P; if P is a function, return a result. 79

 Under call-by-reference, a formal parameter becomes

a synonym for the location of an actual parameter. An actual reference parameter must have a location.

80

Example: program main; begin function f( var x: integer; y: integer): integer; begin x := 2; y := 1; if x = 1 then f := 1 else f:= 2 end;

The difference between call-by-value and call-by-reference is important to the programmer in several ways:  Side effects. Assignments inside the function body

may have different effects under pass-by-value and pass-by-reference.  Aliasing. Aliasing occurs when two names refer to the

same object or location. Aliasing may occur when two parameters are passed by reference or one parameter passed by reference has the same location as the global variable of the procedure.

var z: integer; z := 0; writeln( f(z,z) ) end 81

82

/. ()

-,

Parameter passing*+

 Efficiency. Pass-by-value may be inefficient for large

structures if the value of the large structure must be copied. Pass-by-reference maybe less efficient than pass-by-value for small structures that would fit directly on stack, because when parameters are passed by reference we must dereference a pointer to get their value.

83

Pass/Call-by-value/result Call-by-value/result is also known as copy-in/copy-out because the actuals are initially copied into the formals and the formals are eventually copied back out to the actuals. Actuals that do not have locations are passed by value. Actuals with locations are treated as follows: 1. Copy-in phase. Both the values and the locations of the actual parameters are computed. The values are assigned to the corresponding formals, as in call-by-value, and the locations are saved for the copy-out phase. 2. Copy-out phase. After the procedure body is executed, the final values of the formals are copied back out to the locations computed in the copy-in phase. 84

Examples:  Ada supports three kinds of parameters:

 A parameter in Pascal is normally passed by value. It is

passed by reference, however, if the keyword var appears before the declaration of the formal parameter. procedure proc(in: Integer; var out: Real);  The only parameter-passing method in C is call-by-value;

however, the effect of call-by-reference can be achieved using pointers. In C++ true call-by-reference is available using reference parameters.

1. in parameters, corresponding to value parameters; 2. out parameters, corresponding to just the copy-out phase of call-by-value/result; and 3. in out parameters, corresponding to either reference parameters or value/result parameters, at the discretion of the implementation.

86

85

/. ()

Block structure

-, *+

Parameter passing Pass/Call-by-name

 In a block-structured language, each program or

subprogram is organised as a set of nested blocks.

The Algol 60 report describes call-by-name as follows: 1. Actual parameters are textually substituted for the formals. Possible conflicts between names in the actuals and local names in the procedure body are avoided by renaming the locals in the body. 2. The resulting procedure body is substituted for the call. Possible conflicts between nonlocals in the procedure body and locals at the point of call are avoided by renaming the locals at the point of call. 87

A block is a region of program text, identified by begin and end markers, that may contain declarations local to this region.  In-line (or unnamed) blocks are useful for restricting the

scope of variables by declaring them only when needed, instead of at the beginning of a subprogram. The trend in programming is to reduce the size of subprograms, so the use of unnamed blocks is less useful than it used to be.

88

Nested procedures can be used to group statements that are executed at more than one location within a subprogram, but refer to local variables and so cannot be external to the subprogram. Before modules and object-oriented programming were introduced, nested procedures were used to structure large programs.  Block structure was first defined in Algol. Pascal contains



Each declaration is visible within a certain region of program text, called a block.



When a program begins executing the instructions contained in a block at run time, memory is allocated for the variables declared in that block.



When a program exits a block, some or all of the memory allocated to variables declared in that block will be deallocated.



An identifier that is not declared in the current block is considered global to the block and refers to the entity with this name that is declared in the closest enclosing block.

nested procedures but not in-line blocks; C contains in-line blocks but not nested procedures; Ada supports both.  Block-structured languages are characterised by the

following properties: 

New variables may be declared at various points in a program.

90

89

Algol  The main characteristics of the Algol family are:

had a major effect on language design



the familiar semicolon-separated sequence of statements,



block structure,



functions and procedures, and



static typing.

 The Algol-like programming languages evolved in parallel

with the LISP family of languages, beginning with Algol 58 and Algol 60 in the late 1950s.  The most prominent Algol-like programming languages

are Pascal and C, although C differs from most of the Algol-like languages in some significant ways. Further Algol-like languages are: Algol 58, Algol W, Euclid, etc.

91

92

Algol 60 Features

Algol 60

 Simple statement-oriented syntax.

 Designed by a committee (including Backus, McCarthy,

Perlis) between 1958 and 1963.

 Block structure.

 Intended to be a general purpose programming language,

 Recursive functions and stack storage allocation.

with emphasis on scientific and numerical applications.  Fewer ad hoc restrictions than previous languages  Compared with FORTRAN, Algol 60 provided better ways

to represent data structures and, like LISP, allowed functions to be called recursively. Eclipsed by FORTRAN because of the lack of I/O statements, separate compilation, and library; and because it was not supported by IBM.

(e.g., general expressions inside array indices, procedures that could be called with procedure parameters).  A primitive static type system, later improved in

Algol 68 and Pascal.

94

93

Algol 60 Some trouble spots  Algol 60 was designed around two parameter-passing

 The Algol 60 type discipline had some shortcomings.

mechanisms, call-by-name and call-by-value.

For instance: 

Automatic type conversions were not fully specified (e.g., x := x/y was not properly defined when x and y were integers—is it allowed, and if so was the value rounded or truncated?).



The type of a procedure parameter to a procedure does not include the types of parameters.



An array parameter to a procedure is given type array, without array bounds.

95

Call-by-name interacts badly with side effects; call-by-value is expensive for arrays.  There are some awkward issues related to control

flow, such as memory management, when a program jumps out of a nested block.

96

Algol 60 pass-by-name Copy rule

Algol 60 procedure typesa In Algol 60, the type of each formal parameter of a procedure must be given. However, proc is considered a type (the type of procedures). This is much simpler than the ML types of function arguments. However, this is really a type loophole; because calls to procedure parameters are not fully type checked, Algol 60 programs may produce run-time errors. Write a procedure declaration for Q that causes the following program fragment to produce a run-time type error: proc P ( proc Q ) begin Q(true) end;

real procedure sum(E,i,low,high); value low, high; real E; integer i, low, high; begin sum:=0.0; for i := low step 1 until high do sum := sum+E; end integer j; real array A[1:10]; real result; for j:= 1 step 1 until 10 do A[j] := j; result := sum(A[j],j,1,10)

P(Q); where true is a Boolean value. Explain why the procedure is statically type correct, but produces a run-time type error. (You may assume that adding a Boolean to an integer is a run-time error.) Exercise 5.1 of Concepts in programming languages by J. Mitchell, CUP, 2003. a

By the Algol 60 copy rule, the function call to sum above is equivalent to: begin sum:=0.0; for j := 1 step 1 until 10 do sum := sum+A[j]; end 98

97

Algol 68

Algol 60 pass-by-namea The following Algol 60 code declares a procedure P with one pass-by-name integer parameter. Explain how the procedure call P(A[i]) changes the values of i and A by substituting the actual parameters for the formal parameters, according to the Algol 60 copy rule. What integer values are printed by the program? And, by using pass-by-value parameter passing? begin integer i; i:=1; integer array A[1:2]; A[1]:=2; A[2]:=3;

Algol 60 and to improve the expressiveness of the language. It did not entirely succeed however, with one main problem being the difficulty of efficient compilation (e.g., the implementation consequences of higher-order procedures where not well understood at the time).  One contribution of Algol 68 was its regular, systematic

procedure P(x); integer x; begin i:=x; x:=1 end P(A[i]); end

 Intended to remove some of the difficulties found in

type system.

print(i,A[1],A[2])

Exercise 5.2 of Concepts in programming languages by J. Mitchell, CUP, 2003. a

99

The types (referred to as modes in Algol 68) are either primitive (int, real, complex, bool, char, string, bits, bytes, semaphore, format, file) or compound (array, structure, procedure, set, pointer). 100

Type constructions could be combined without restriction. This made the type system seem more systematic than previous languages.

Algol innovations  Use of BNF syntax description.  Block structure.

 Algol 68 memory management involves a stack for local

variables and heap storage. Algol 68 data on the heap are explicitly allocated, and are reclaimed by garbage collection.  Algol 68 parameter passing is by value, with

 Scope rules for local variables.  Dynamic lifetimes for variables.  Nested if-then-else expressions and statements.  Recursive subroutines.

pass-by-reference accomplished by pointer types. (This is essentially the same design as that adopted in C.)

 Call-by-value and call-by-name arguments.  Explicit type declarations for variables.

 The decision to allow independent constructs to be

 Static typing.

combined without restriction also led to some complex features, such as assignable pointers.

 Arrays with dynamic bounds. 102

101

Pascal  Designed in the 1970s by Niklaus Wirth, after the design

 A Pascal program is always formed from a single

main program block, which contains within it definitions of the subprograms used.

and implementation of Algol W.  Very successful programming language for teaching, in

part because it was designed explicitly for that purpose. Also designed to be compiled in one pass. This hindered language design; e.g., it forced the problematic forward declaration.  Pascal is a block-structured language in which static

Each block has a characteristic structure: a header giving the specification of parameters and results, followed by constant definitions, type definitions, local variable declarations, other nested subprogram definitions, and the statements that make up the executable part.

scope rules are used to determine the meaning of nonlocal references to names.

103

104

A restriction that made Pascal simpler than Algol 68:  Pascal is a quasi-strong, statically typed programming

procedure Allowed( j,k: integer );

language. An important contribution of the Pascal type system is the rich set of data-structuring concepts: e.g. enumerations, subranges, records, variant records, sets, sequential files.  The Pascal type system is more expressive than the

Algol 60 one (repairing some of its loopholes), and simpler and more limited than the Algol 68 one (eliminating some of the compilation difficulties).

procedure AlsoAllowed( procedure P(i:integer); j,k: integer ); procedure NotAllowed( procedure MyProc( procedure P( i:integer ) ) );

105

106

 Pascal was the first language to propose index checking.  Pascal uses a mixture of name and structural

 Problematically, in Pascal, the index type of an array is

part of its type. The Pascal standard defines conformant array parameters whose bounds are implicitly passed to a procedure. The Ada programmig language uses so-called unconstrained array types to solve this problem. The subscript range must be fixed at compile time permitting the compiler to perform all address calculations during compilation. procedure Allowed( a: array [1..10] of integer ) ; procedure NotAllowed( n: integer; a: array [1..n] of integer ) ; 107

equivalence for determining if two variables have the same type. Name equivalence is used in most cases for determining if formal and actual parameters in subprogram calls have the same type; structural equivalence is used in most other situations.  Parameters are passed by value or reference.

Complete static type checking is possible for correspondence of actual and formal parameter types in each subprogram call.

108

Pascal variant records Variant records have a part common to all records of that type, and a variable part, specific to some subset of the records. type kind = ( unary, binary) ; type { datatype } UBtree = record { ’a UBtree = record of } value: integer ; { ’a * ’a UBkind } case k: kind of { and ’a UBkind = } unary: ^UBtree ; { unary of ’a UBtree } binary: record { | binary of } left: ^UBtree ; { ’a UBtree * } right: ^UBtree { ’a UBtree ; } end end ;

Variant records introduce weaknesses into the type system for a language. 1. Compilers do not usually check that the value in the tag field is consistent with the state of the record. 2. Tag fields are optional. If omitted, no checking is possible at run time to determine which variant is present when a selection is made of a field in a variant.

109

Summary  The Algol family of languages established the

command-oriented syntax, with blocks, local declarations, and recursive functions, that are used in most current programming languages.

110

g Topic V g

Object-oriented languages : Concepts and origins SIMULA and Smalltalk References:

 The Algol family of languages is statically typed, as each

expression has a type that is determined by its syntactic form and the compiler checks before running the program to make sure that the types of operations and operands agree.

111

⋆ Chapters 10 and 11 of Concepts in programming languages by J. C. Mitchell. CUP, 2003.  Chapters 8, and 12(§§2 and 3) of Programming

languages: Design and implementation (3 RD EDITION) by T. W. Pratt and M. V. Zelkowitz. Prentice Hall, 1999. 112

Objects in ML !?  Chapter 7 of Programming languages: Concepts &

constructs by R. Sethi (2 ND 1996.

EDITION).

Addison-Wesley,

 Chapters 14 and 15 of Understanding programming

languages by M Ben-Ari. Wiley, 1996. ⋆ B. Stroustrup. What is “Object-Oriented Programming”? (1991 revised version). Proc. 1st European Conf. on Object-Oriented Programming. (Available on-line from .)

exception Empty ; fun newStack(x0) = let val stack = ref [x0] in ref{ push = fn(x) => stack := ( x :: !stack ) , pop = fn() => case !stack of nil => raise Empty | h::t => ( stack := t; h ) }end ; exception Empty val newStack = fn : ’a -> {pop:unit -> ’a, push:’a -> unit} ref

113

114

val BoolStack = newStack(true) ; IntStack0 := !IntStack1 ;

val BoolStack = ref {pop=fn,push=fn} : {pop:unit -> bool, push:bool -> unit} ref

val it = () : unit

val IntStack0 = newStack(0) ;

#pop(!IntStack0)() ;

val IntStack0 = ref {pop=fn,push=fn} : {pop:unit -> int, push:int -> unit} ref

val it = 1 : int #push(!IntStack0)(4) ;

val IntStack1 = newStack(1) ;

val it = () : unit

val IntStack1 = ref {pop=fn,push=fn} : {pop:unit -> int, push:int -> unit} ref

115

116

Basic concepts in object-oriented languagesa

map ( #push(!IntStack0) ) [3,2,1] ; val it = [(),(),()] : unit list

Four main language concepts for object-oriented languages:

map ( #pop(!IntStack0) ) [(),(),(),()] ; val it = [1,2,3,4] : int list

1. Dynamic lookup. 2. Abstraction.

NB:  ! The stack discipline for activation records fails!

3. Subtyping.

 ? Is ML an object-oriented language?

4. Inheritance.

! Of course not!

Notes from Chapter 10 of Concepts in programming languages by J. C. Mitchell. CUP, 2003. a

? Why?

118

117

/.

-,

()Dynamic lookup*+

 Dynamic lookup means that when a message is sent to an

object, the method to be executed is selected dynamically, at run time, according to the implementation of the object that receives the message. In other words, the object “chooses” how to respond to a message.

 Dynamic lookup is sometimes confused with

overloading, which is a mechanism based on static types of operands. However, the two are very different. ? Why?

The important property of dynamic lookup is that different objects may implement the same operation differently, and so may respond to the same message in different ways.

119

120

/. ()

-, *+

Abstraction

/.

inside a program unit with a specific interface. For objects, the interface usually consists of a set of methods that manipulate hidden data.  Abstraction based on objects is similar in many ways to

abstraction based on abstract data types: Objects and abstract data types both combine functions and data, and abstraction in both cases involves distinguishing between a public interface and private implementation. Other features of object-oriented languages, however, make abstraction in object-oriented languages more flexible than abstraction with abstract data types.

-,

()Subtyping*+

 Abstraction means that implementation details are hidden

 Subtyping is a relation on types that allows values of

one type to be used in place of values of another. Specifically, if an object a has all the functionality of another object b, then we may use a in any context expecting b.  The basic principle associated with subtyping is

substitutivity: If A is a subtype of B, then any expression of type A may be used without type error in any context that requires an expression of type B.

122

121

/. ()

-,

Inheritance*+

 The primary advantage of subtyping is that it permits

uniform operations over various types of data.

 Inheritance is the ability to reuse the definition of one

For instance, subtyping makes it possible to have heterogeneous data structures that contain objects that belong to different subtypes of some common type.

kind of object to define another kind of object.  The importance of inheritance is that it saves the effort

 Subtyping in an object-oriented language allows

functionality to be added without modifying general parts of a system.

123

of duplicating (or reading duplicated) code and that, when one class is implemented by inheriting from another, changes to one affect the other. This has a significant impact on code maintenance and modification.

124

History of objects SIMULA and Smalltalk

Inheritance is not subtyping

 Objects were invented in the design of SIMULA and

refined in the evolution of Smalltalk.

Subtyping is a relation on interfaces,

 SIMULA: The first object-oriented language.

inheritance is a relation on implementations. One reason subtyping and inheritance are often confused is that some class mechanisms combine the two. A typical example is C++, in which A will be recognized by the compiler as a subtype of B only if B is a public base class of A. Combining subtyping and inheritance is an elective design decision.

The object model in SIMULA was based on procedures activation records, with objects originally described as procedures that return a pointer to their own activation record.  Smalltalk: A dynamically typed object-oriented language.

Many object-oriented ideas originated or were popularised by the Smalltalk group, which built on Alan Kay’s then-futuristic idea of the Dynabook. 126

125

 A generic event-based simulation program

SIMULA

Q := make_queue(initial_event); repeat select event e from Q simulate event e place all events generated by e on Q until Q is empty

 Extremely influential as the first language with classes

objects, dynamic lookup, subtyping, and inheritance.  Originally designed for the purpose of simulation by

O.-J. Dahl and K. Nygaard at the Norwegian Computing Center, Oslo, in the 1960s.

naturally requires: 

A data structure that may contain a variety of kinds of events. ; subtyping



The selection of the simulation operation according to the kind of event being processed. ; dynamic lookup



Ways in which to structure the implementation of related kinds of events. ; inheritance

 SIMULA was designed as an extension and modification

of Algol 60. The main features added to Algol 60 were: class concepts and reference variables (pointers to objects); pass-by-reference; input-output features; coroutines (a mechanism for writing concurrent programs).

127

128

Objects in SIMULA

SIMULA

Class: A procedure returning a pointer to its activation record.

Object-oriented features  Objects: A SIMULA object is an activation record

produced by call to a class.

Object: An activation record produced by call to a class, called an instance of the class.

; a SIMULA object is a closure

 SIMULA implementations place objects on the heap.  Objects are deallocated by the garbage collector (which

 Classes: A SIMULA class is a procedure that returns

a pointer to its activation record. The body of a class may initialise the objects it creates.  Dynamic lookup: Operations on an object are selected

deallocates objects only when they are no longer reachable from the program that created them).

from the activation record of that object.

130

129

 Abstraction: Hiding was not provided in SIMULA 67 but  Subtyping: Objects are typed according to the classes

was added later and used as the basis for C++. SIMULA 67 did not distinguish between public and private members of classes. A later version of the language, however, allowed attributes to be made “protected”, which means that they are accessible for subclasses (but not for other classes), or “hidden”, in which case they are not accessible to subclasses either.

131

that create them. Subtyping is determined by class hierarchy.  Inheritance: A SIMULA class may be defined, by class

prefixing, as an extension of a class that has already been defined including the ability to redefine parts of a class in a subclass.

132

SIMULA Sample code

SIMULA Further object-related features  Inner, which indicates that the method of a subclass

should be called in combination with execution of superclass code that contains the inner keyword.  Inspect and qua, which provide the ability to test the type

of an object at run time and to execute appropriate code accordingly. (inspect is a class (type) test, and qua is a form of type cast that is checked for correctness at run time.)

a

CLASS POINT(X,Y); REAL X, Y; COMMENT***CARTESIAN REPRESENTATION BEGIN BOOLEAN PROCEDURE EQUALS(P); REF(POINT) P; IF P =/= NONE THEN EQUALS := ABS(X-P.X) + ABS(Y-P.Y) < 0.00001; REAL PROCEDURE DISTANCE(P); REF(POINT) P; IF P == NONE THEN ERROR ELSE DISTANCE := SQRT( (X-P.X)**2 + (Y-P.Y)**2 ); END***POINT*** See Chapter 4(§1) of SIMULA begin (2nd edition) by G. Birtwistle, O.-J. Dahl, B. Myhrhug, and K. Nygaard. Chartwell-Bratt Ltd., 1980. a

133

CLASS LINE(A,B,C); REAL A,B,C; COMMENT***Ax+By+C=0 REPRESENTATION BEGIN BOOLEAN PROCEDURE PARALLELTO(L); REF(LINE) L; IF L =/= NONE THEN PARALLELTO := ABS( A*L.B - B*L.A ) < 0.00001;

134

COMMENT*** INITIALISATION CODE REAL D; D := SQRT( A**2 + B**2 ) IF D = 0.0 THEN ERROR ELSE BEGIN D := 1/D; A := A*D; B := B*D; C := C * D; END; END***LINE***

REF(POINT) PROCEDURE MEETS(L); REF(LINE) L; BEGIN REAL T; IF L =/= NONE and ~PARALLELTO(L) THEN BEGIN ... MEETS :- NEW POINT(...,...); END; END;***MEETS*** 135

136

SIMULA

Example:

Subclasses and inheritance POINT CLASS COLOREDPOINT(C); COLOR C; BEGIN BOOLEAN PROCEDURE EQUALS(Q); REF(COLOREDPOINT) Q; ...; END***COLOREDPOINT**

SIMULA syntax for a class C1 with subclasses C2 and C3 is CLASS C1 ; C1 CLASS C2 ; C1 CLASS C3 ;

REF(POINT) P; REF(COLOREDPOINT) CP; P :- NEW POINT(1.0,2.5); CP :- NEW COLOREDPOINT(2.5,1.0,RED);

When we create a C2 object, for example, we do this by first creating a C1 object (activation record) and then appending a C2 object (activation record).

NB: SIMULA 67 did not hide fields. Thus, CP.C := BLUE; changes the color of the point referenced by CP. 138

137

Examples: 1. CLASS A;

SIMULA

REF(A) a;

Object types and subtypes

A CLASS B; REF(B) b;

a :- b; COMMENT***legal since B is ***a subclass of A ...

 All instances of a class are given the same type. The

name of this type is the same as the name of the class.

b :- a; COMMENT***also legal, ***run time to ***a points to ***as to avoid

 The class names (types of objects) are arranged in a

subtype hierarchy corresponding exactly to the subclass hierarchy.

but checked at make sure that a B object, so a type error

2. inspect a when B do b :- a otherwise ... 139

140

3. An error in the original SIMULA type checker surrounding the relationship between subtyping and inheritance: CLASS A;

REF(A) a;

REF(B) b;

SIMULA also uses the semantically incorrect principle that, if B e) : γ

γ ≈ α −> β

α0 ≈ α1 −> α2 ,

α2 ≈ α3 −> α4 ,

α7 ≈ α8 −> α6 ,

α5 ≈ α6 −> α4 ,

α7 ≈ α1 ,

α5 ≈ α1

α8 ≈ α3

Solution: α0 = (α3 −> α3) −> α3 −> α3 177

/.

-,

()Polymorphism*+

Polymorphism, which literally means “having multiple forms”, refers to constructs that can take on different types as needed. Forms of polymorphism in contemporary programming languages: Parametric polymorphism. A function may be applied to any arguments whose types match a type expression involving type variables. Parametric polymorphism may be: Implicit. Programs do not need to contain types; types and instantiations of type variables are computed.

178

Explicit. The program text contains type variables that determine the way that a construct may be treated polymorphically. Explicit polymorphism often involves explicit instantiation or type application to indicate how type variables are replaced with specific types in the use of a polymorphic construct. Example: C++ templates. Ad hoc polymorphism or overloading. Two or more implementations with different types are referred to by the same name. Subtype polymorphism. The subtype relation between types allows an expression to have many possible types.

Example: SML. 179

180

let-polymorphism

Polymorphic exceptions

 The standard sugaring

let val x = v in e end

7→

Example: Depth-first search for finitely-branching trees. datatype ’a FBtree = node of ’a * ’a FBtree list ; fun dfs P (t: ’a FBtree) = let exception Ok of ’a; fun auxdfs( node(n,F) ) = if P n then raise Ok n else foldl (fn(t,_) => auxdfs t) NONE F ; in auxdfs t handle Ok n => SOME n end ;

(fn x => e)(v)

does not respect ML type checking. For instance let val f = fn x => x in f(f) end type checks, whilst

(fn f => f(f))(fn x => x) does not.  Type inference for let-expressions is involved, requiring

type schemes.

val dfs = fn : (’a -> bool) -> ’a FBtree -> ’a option 182

181

g Topic VII g

When a polymorphic exception is declared, SML ensures that it is used with only one type. The type of a top level exception must be monomorphic and the type variables of a local exception are frozen.

Data abstraction and modularity SML Modulesa

Consider the following nonsense:

References:  Chapter 7 of ML for the working programmer (2 ND

exception Poly of ’a ; (*** ILLEGAL!!! ***) (raise Poly true) handle Poly x => x+1 ;

EDITION)

by L. C. Paulson. CUP, 1996.

Largely based on an Introduction to SML Modules by Claudio Russo . a

183

184

The Core and Modules languages SML consists of two sub-languages:

 The Standard ML Basis Library edited by E. R. Gansner

and J. H. Reppy. CUP, 2004.

 The Core language is for programming in the small, by

supporting the definition of types and expressions denoting values of those types.

[A useful introduction to SML standard libraries, and a good example of modular programming.]

 The Modules language is for programming in the large,



by grouping related Core definitions of types and expressions into self-contained units, with descriptive interfaces. The Core expresses details of data structures and algorithms. The Modules language expresses software architecture. Both languages are largely independent. 185

186

The Modules language Writing a real program as an unstructured sequence of Core definitions quickly becomes unmanageable. type nat = int val zero = 0 fun succ x = x + 1 fun iter b f i = if i = zero then b else f (iter b f (i-1)) ... (* thousands of lines later *) fun even (n:nat) = iter true not n The SML Modules language lets one split large programs into separate units with descriptive interfaces. 187

SML Modules Signatures and structures An abstract data type is a type equipped with a set of operations, which are the only operations applicable to that type. Its representation can be changed without affecting the rest of the program.  Structures let us package up declarations of related

types, values, and functions.  Signatures let us specify what components a structure

must contain. 188

Structures

The dot notation

In Modules, one can encapsulate a sequence of Core type and value definitions into a unit called a structure.

One can name a structure by binding it to an identifier. structure IntNat = struct type nat = int ... fun iter b f i = ... end

We enclose the definitions in between the keywords struct ... end. Example: A structure representing the natural numbers, as positive integers. struct type nat = int val zero = 0 fun succ x = x + 1 fun iter b f i = if i = zero then b else f (iter b f (i-1)) end

Components of a structure are accessed with the dot notation. fun even (n:IntNat.nat) = IntNat.iter true not n NB: Type IntNat.nat is statically equal to int. Value IntNat.iter dynamically evaluates to a closure. 189

Nested structures

190

Concrete signatures

Structures can be nested inside other structures, in a hierarchy. structure IntNatAdd = struct structure Nat = IntNat fun add n m = Nat.iter m Nat.succ n end ... fun mult n m = IntNatAdd.Nat.iter IntNatAdd.Nat.zero (IntNatAdd.add m)

Signature expressions specify the types of structures by listing the specifications of their components.

The dot notation (IntNatAdd.Nat) accesses a nested structure.

A signature expression consists of a sequence of component specifications, enclosed in between the keywords sig . . . end. sig type nat = int val zero : nat val succ : nat -> nat val ’a iter : ’a -> (’a->’a) -> nat -> ’a end

Sequencing dots provides deeper access (IntNatAdd.Nat.zero).

This signature fully describes the type of IntNat.

Nesting and dot notation provides name-space control.

The specification of type nat is concrete: it must be int. 191

192

Opaque signatures

Example: Polymorphic functional stacks.

On the other hand, the following signature sig type nat val zero : nat val succ : nat -> nat val ’a iter : ’a -> (’a->’a) -> nat -> ’a end specifies structures that are free to use any implementation for type nat (perhaps int, or word, or some recursive datatype).

signature STACK = sig exception E type ’a reptype (* ’a reptype -> ’a reptype val pop: ’a reptype -> ’a reptype val top: ’a reptype -> ’a end ;

This specification of type nat is opaque.

194

193

structure MyStack: STACK = struct exception E ; type ’a reptype = ’a list val new = [] ; fun push x s = x::s ; fun split( h::t ) = ( h , | split _ = raise E ; fun pop s = #2( split s ) fun top s = #1( split s ) end ;

val MyEmptyStack = MyStack.new ; val MyStack0 = MyStack.push 0 MyEmptyStack ; val MyStack01 = MyStack.push 1 MyStack0 ; val MyStack0’ = MyStack.pop MyStack01 ; MyStack.top MyStack0’ ;

;

val val val val val

t ) ; ;

195

MyEmptyStack = [] : ’a MyStack.reptype MyStack0 = [0] : int MyStack.reptype MyStack01 = [1,0] : int MyStack.reptype MyStack0’ = [0] : int MyStack.reptype it = 0 : int

196

Named and nested signatures

Signature inclusion

Signatures may be named and referenced, to avoid repetition: signature NAT = sig type nat val zero : nat val succ : nat -> nat val ’a iter : ’a -> (’a->’a) -> nat -> ’a end

To avoid nesting, one can also directly include a signature identifier: sig include NAT val add: nat -> nat ->nat end NB: This is equivalent to the following signature. sig type nat val zero: nat val succ: nat -> nat val ’a iter: ’a -> (’a->’a) -> nat -> ’a val add: nat -> nat -> nat end

Nested signatures specify named sub-structures: signature Add = sig structure Nat: NAT (* references NAT *) val add: Nat.nat -> Nat.nat -> Nat.nat end 197

Signature matching Q:

When does a structure satisfy a signature?

A:

The type of a structure matches a signature whenever it

198

Properties of signature matching  The components of a structure can be defined in a

different order than in the signature; names matter but ordering does not.

implements at least the components of the signature. •

The structure must realise (i.e. define) all of the opaque



The structure must enrich this realised signature,

type components in the signature.

 A structure may contain more components, or

components of more general types, than are specified in a matching signature.

component-wise: ⋆

every concrete type must be implemented equivalently;



every specified value must have a more general type scheme;



every specified structure must be enriched by a

 Signature matching is structural. A structure can match

many signatures and there is no need to pre-declare its matching signatures (unlike “interfaces” in Java and C#).  Although similar to record types, signatures actually

substructure.

play a number of different roles. 199

200

Using signatures to restrict access The following structure uses a signature constraint to provide a restricted view of IntNat:

Subtyping Signature matching supports a form of subtyping not found in the Core language:  A structure with more type, value, and structure

components may be used where fewer components are expected.

structure ResIntNat = IntNat : sig type nat val succ : nat->nat val iter : nat->(nat->nat)->nat->nat end NB: The constraint str:sig prunes the structure str according to the signature sig:

 A value component may have a more general type

scheme than expected.

 ResIntNat.zero is undefined;  ResIntNat.iter is less polymorphic that IntNat.iter. 202

201

Transparency of

SML Modules

:

Information hiding Although the : operator can hide names, it does not conceal the definitions of opaque types. Thus, the fact that ResIntNat.nat = IntNat.nat = int remains transparent. For instance the application ResIntNat.succ(~3) is still well-typed, because ~3 has type int . . . but ~3 is negative, so not a valid representation of a natural number!

203

In SML, we can limit outside access to the components of a structure by constraining its signature in transparent or opaque manners. Further, we can hide the representation of a type by means of an abstype declaration. The combination of these methods yields abstract structures.

204

 The actual implementation of AbsNat.nat by int is

Using signatures to hide the identity of types

hidden, so that AbsNat.nat 6= int.

With different syntax, signature matching can also be used to enforce data abstraction: structure AbsNat = IntNat :> sig type nat val zero: nat val succ: nat->nat val ’a iter: ’a->(’a->’a)->nat->’a end

AbsNat is just IntNat, but with a hidden type representation.  AbsNat defines an abstract datatype of natural numbers:

the only way to construct and use values of the abstract type AbsNat.nat is through the operations, zero, succ, and iter. E.g., the application AbsNat.succ(~3) is ill-typed: ~3 has type int, not AbsNat.nat. This is what we want, since ~3 is not a natural number in our representation. In general, abstractions can also prune and specialise components.

The constraint str :> sig prunes str but also generates a new, abstract type for each opaque type in sig.

206

205

2. abstypes 1. Opaque signature constraints structure MyOpaqueStack :> STACK = MyStack ; val MyEmptyOpaqueStack = MyOpaqueStack.new ; val MyOpaqueStack0 = MyOpaqueStack.push 0 MyEmptyOpaqueStack val MyOpaqueStack01 = MyOpaqueStack.push 1 MyOpaqueStack0 val MyOpaqueStack0’ = MyOpaqueStack.pop MyOpaqueStack01 MyOpaqueStack.top MyOpaqueStack0’ ; val val val val val

MyEmptyOpaqueStack = - : ’a MyOpaqueStack.reptype MyOpaqueStack0 = - : int MyOpaqueStack.reptype MyOpaqueStack01 = - : int MyOpaqueStack.reptype MyOpaqueStack0’ = - : int MyOpaqueStack.reptype it = 0 : int 207

structure MyHiddenStack: STACK = struct exception E ; abstype ’a reptype = S of ’a list (* unit val pop: unit -> unit val top: unit -> itemtype end ;

structure IntNatAdd = AddFun(IntNat) structure AbsNatAdd = AddFun(AbsNat) The actual argument must match the signature of the formal parameter—so it can provide more components, of more general types. Above, AddFun is applied twice, but to arguments that differ in their implementation of type nat (AbsNat.nat 6= IntNat.nat).

214

213

exception E ; functor Stack( T: sig type atype end ) : STACK = struct type itemtype = T.atype val stack = ref( []: itemtype list ) fun push x = ( stack := x :: !stack ) fun pop() = case !stack of [] => raise E | _::s => ( stack := s ) fun top() = case !stack of [] => raise E | t::_ => t end ;

structure intStack = Stack(struct type atype = int end) ; structure intStack : STACK intStack.push(0) ; intStack.top() ; intStack.pop() ; intStack.push(4) ; val val val val 215

it it it it

= = = =

() : unit 0 : intStack.itemtype () : unit () : unit 216

Why functors ? Functors support:

map ( intStack.push ) [3,2,1] ; map ( fn _ => let val top = intStack.top() in intStack.pop(); top end ) [(),(),(),()] ;

Code reuse. AddFun may be applied many times to different structures, reusing its body.

val it = [(),(),()] : unit list val it = [1,2,3,4] : intStack.itemtype list

Code abstraction. AddFun can be compiled before any argument is implemented. Type abstraction. AddFun can be applied to different types N.nat.

217

g Topic VIII g

218

 A Scala Tutorial for Java Programmers by M. Schinz

The state of the art Scala < www.scala-lang.org >

and P. Haller. Programming Methods Laboratory, EPFL, 2008.

References:  Scala By Example by M. Odersky. Programming

Methods Laboratory, EPFL, 2008.  An overview of the Scala programming language by

M. Odersky et al. Technical Report LAMP-REPORT-2006-001, Second Edition, 2006. 219

220

Scala (I)

 Scala has been designed to work well with Java and

C#.

 Scala has been developed from 2001 in the Programming

Every Java class is seen in Scala as two entities, a class containing all dynamic members and a singleton object, containing all static members.

Methods Laboratory at EPFL by a group lead by Martin Odersky. It was first released publicly in 2004, with a second version released in 2006.

Scala classes and objects can also inherit from Java classes and implement Java interfaces. This makes it possible to use Scala code in a Java framework.

 Scala is aimed at the construction of components and

component systems. One of the major design goals of Scala was that it should be flexible enough to act as a convenient host language for domain specific languages implemented by library modules.

 Scala’s influences: Beta, C#, FamilyJ, gbeta, Haskell,

Java, Jiazzi, ML≤ , Moby, MultiJava, Nice, OCaml, Pizza, Sather, Smalltalk, SML, XQuery, etc.

222

221

A procedural language ! def qsort( xs: Array[Int] ) { def swap(i: Int, j:Int) { val t = xs(i); xs(i) = xs(j); xs(j) = t } def sort(l: Int, r: Int) { val pivot = xs( (l+r)/2 ); var i = l; var j = r while (i E

{ def f ( x1: T1 , ... , xn: Tn ) = E ; f }

Scala uses the usual block-structured scoping rules.

where f is a fresh name which is used nowhere else in the program.

229

230

Classes and objects Parameter passing  classes provide fields and methods. These are

Scala uses call-by-value by default, but it switches to call-by-name evaluation if the parameter type is preceded by =>.

accessed using the dot notation. However, there may be private fields and methods that are inaccessible outside the class.

Imperative control structures A functional implementation of while loops: def whileLoop( cond: => Boolean )( comm: => Unit ) { if (cond) comm ; whileLoop( cond )( comm ) }

231

Scala, being an object-oriented language, uses dynamic dispatch for method invocation. Dynamic method dispatch is analogous to higher-order function calls. In both cases, the identity of the code to be executed is known only at run-time. This similarity is not superficial. Indeed, Scala represents every function value as an object.

232

 Every class in Scala has a superclass which it extends.

A class inherits all members from its superclass. It may also override (i.e. redefine) some inherited members.

 Methods in Scala do not necessarily take a parameter

If class A extends class B, then objects of type A may be used wherever objects of type B are expected. We say in this case that type A conforms to type B.  Scala maintains the invariant that interpreting a value of a

subclass as an instance of its superclass does not change the representation of the value. Amongst other things, it guarantees that for each pair of types S x case Sum(l,r) => eval(l) + eval(r) case Prod(l,r) => eval(l) * eval(r) }

 Case classes implicitly come with nullary accessor

methods which retrieve the constructor arguments.  Case classes allow the constructions of patterns which

refer to the case class constructor.

If none of the patterns matches, the pattern matching expression is aborted with a MatchError exception.

243

244

Generic types Generic types and methods

Variance annotations The combination of type parameters and subtyping poses some interesting questions.

 Classes in Scala can have type parameters.

abstract class Set[A] { def incl( x: A ): Set[A] def contains( x: A ): Boolean }

?

If T is a subtype of a type S, should Array[T] be a subtype of the type Array[S]?

!

No, if one wants to avoid run-time checks!

Example: val x = new Array[String](1) val y: Array[Any] = x // disallowed in Scala // because Array is not // covariant y.update( 0 , new Rational(1,2) )

 Scala has a fairly powerful type inferencer which allows

one to omit type parameters to polymorphic functions and constructors.

246

245

Scala uses a conservative approximation to verify soundness of variance annotations: a covariant type parameter of a class may only appear in co-variant position inside the class. Hence, the following class definition is rejected:

In Scala, generic types like the following one: class Array[A] { def apply( index: Int ): A ... def update( index: Int, elem: A ) ... }

have by default non-variant subtyping. However, one can enforce co-variant subtyping by prefixing a formal type parameter with a +. There is also a prefix - which indicates contra-variant subtyping.

247

class Array[+A] { def apply( index: Int ): A ... def update( index:Int , elem: A ) ... }

248

Functions are objects Recall that Scala is an object-oriented language in that every value is an object. It follows that functions are objects in Scala. Indeed, the function type ( A1, ..., Ak ) => B is equivalent to the following parameterised class type:

Since function types are classes in Scala, they can be   further refined in subclasses. An example are arrays,   which are treated as special functions over the type of integers.

new Function1[Int,Int] { def apply( x:Int ): Int = x+1 } Conversely, when a value of a function type is applied to some arguments, the apply method of the type is implicitly inserted; e.g. for f and object of type Function1[A,B], the application f(x) is expanded to f.apply(x).

abstract class Functionk[-A1,...,-Ak,+B] { def apply( x1:A1,...,xn:Ak ): B } 

The function x => x+1 would be expanded to an instance of Function1 as follows:



NB: Function subtyping is contra-variant in its arguments whereas it is co-variant in its result. ? Why?

    249

250

Generic types Type parameter bounds

Generic types

trait Ord[A] { def lt( that: A ): Boolean } case class Num( value: Int ) extends Ord[Num] { def lt( that: Num ) = this.value < that.value }

trait def def def }

View bounds One problem with type parameter bounds is that they require forethought: if we had not declared Num as a subclass of Ord, we would not have been able to use Num elements in heaps. By the same token, Int is not a subclass of Ord, and so integers cannot be used as heap elements.

Heap[ A o.lt(pivot,x)) ) ) }

256

The principal idea behind implicit parameters is that arguments for them can be left out from a method call. If the arguments corresponding to implicit parameters are missing, they are inferred by the Scala compiler.

Mixin-class composition Every class or object in Scala can inherit from several traits in addition to a normal class. trait AbsIterator[T] { def hasNext: Boolean def next: T }

NB: View bounds are convenient syntactic sugar for implicit parameters.  Implicit conversions

As last resort in case of type mismatch the Scala compiler will try to apply an implicit conversion.

trait RichIterator[T] extends AbsIterator[T] { def foreach( f: T => Unit ): Unit = while (hasNext) f(next) }

implicit def int2ord( x: Int ): Ord[Int] = new Ord[Int] { def lt( y: Int ) = x < y } Implicit conversions can also be applied in member selections.

258

257

object Test { def main( args: Array[String] ): Unit = { class Iter extends StringIterator(args(0)) with RichIterator[Char] val iter = new Iter iter.foreach(System.out.println) } }

class StringIterator( s: String ) extends AbsIterator[Char] { private var i = 0 def hasNext = i < s.length def next = { val x = s charAt i; i = i+1; x } } Traits can be used in all contexts where other abstract classes appear; however only traits can be used as mixins.

259

The class Iter is constructed from a mixin composition of the parents StringIterator (called the superclass) and RichIterator (called a mixin) so as to combine their functionality.

260

Language innovations The class Iter inherits members from both StringIterator and RichIterator.

 Flexible syntax and type system.

NB: Mixin-class composition is a form of multiple inheritance!

 Pattern matching over class hierarchies unifies

functional and object-oriented data access.  Abstract types and mixin composition unify

concepts from object and module systems.

261

262