Course Overview
PART I: overview material
  1 Introduction
  2 Language processors (tombstone diagrams, bootstrapping)
  3 Architecture of a compiler
PART II: inside a compiler
  4 Syntax analysis
  5 Contextual analysis
  6 Runtime organization
  7 Code generation
PART III: conclusion
  8 Interpretation
  9 Review

The “Phases” of a Compiler
Source Program
  => Syntax Analysis (Error Reports)
Abstract Syntax Tree
  => Contextual Analysis (Error Reports)
Decorated Abstract Syntax Tree
  => Code Generation
Object Code
Contextual Analysis (Chapter 5)
Multi Pass Compiler
A multi pass compiler makes several passes over the program. The output of a preceding phase is stored in a data structure and used by subsequent phases.

Dependency diagram of a typical Multi Pass Compiler:
Compiler Driver
  calls Syntactic Analyzer: input Source Text, output AST
  calls Contextual Analyzer (this chapter): input AST, output Decorated AST
  calls Code Generator: input Decorated AST, output Object Code

Recap: Contextual Constraints
Syntax rules alone are not enough to specify the format of well-formed programs.
Example 1:
  let const m~2
  in putint(m + x)    (x: Undefined!)
Contextual Analysis
Scope Rules => Identification
Type Rules => Type checking
• What do we mean by a “typical” programming language? Do there exist programming languages that are not typical, for which scope and/or type rules are not verified at compile time?

Recap: Contextual Analysis -> Decorated AST
Annotations:
  • result of identification
  • :type = result of type checking
[Figure: decorated AST of a Program. A LetCommand contains a SequentialDeclaration (VarDecl nodes, each with an Ident and a SimpleT, e.g. n : Integer) and a SequentialCommand (AssignCommand nodes, each with a SimpleV and a BinaryExpr annotated :int).]
Example 2:
  let const m~2 ; var n:Boolean
  in begin n := m ...

... Attribute)
– Find an entry for an identifier

– Monolithic block structure: e.g. BASIC, COBOL (single block)
– Flat block structure: e.g. Fortran (partition program into blocks)
– Nested block structure: e.g. C, C++, Java, Algol, Pascal, Scheme, … (as in modern “block-structured” programming languages, each block might contain other blocks)

block = an area of text in the program that corresponds to some kind of boundary for the visibility of identifiers.
block structure = the textual relationship between blocks in a program.
Different kinds of Block Structure... a picture
[Figure: three sketches of program text, labelled Monolithic, Flat and Nested.]

Monolithic Block Structure
A language exhibits monolithic block structure if the only block is the entire program.
=> Every identifier is visible throughout the entire program.
Very simple scope rules:
• No identifier may be declared more than once.
• For every applied occurrence of an identifier I there must be a corresponding declaration.
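The two monolithic scope rules can be checked with a single set of declared names. The following is only a minimal sketch: the class `MonolithicScopeChecker` and its method names are hypothetical, and it assumes that all declarations are visited before the applied occurrences that refer to them.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical checker for the two monolithic scope rules. */
class MonolithicScopeChecker {
    private final Set<String> declared = new HashSet<>();
    private final List<String> errors = new ArrayList<>();

    /** Rule 1: no identifier may be declared more than once. */
    void declare(String id) {
        if (!declared.add(id)) {          // add() returns false if already present
            errors.add("duplicate declaration: " + id);
        }
    }

    /** Rule 2: every applied occurrence needs a corresponding declaration. */
    void apply(String id) {
        if (!declared.contains(id)) {
            errors.add("undeclared identifier: " + id);
        }
    }

    List<String> errors() {
        return errors;
    }

    public static void main(String[] args) {
        MonolithicScopeChecker checker = new MonolithicScopeChecker();
        checker.declare("m");
        checker.apply("m");
        checker.apply("x");    // x was never declared
        checker.declare("m");  // second declaration of m
        System.out.println(checker.errors());
    }
}
```

Because there is only one block, no notion of scope level is needed; the set alone decides both rules.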
Flat Block Structure
A language exhibits flat block structure if the program can be subdivided into several disjoint blocks.
There are two scope levels: global or local.
Typical scope rules:
• a globally defined identifier may be redefined locally
• several local definitions of a single identifier may occur in different blocks (but not in the same block)
• For every applied occurrence of an identifier there must be either a local declaration within the same block or a global declaration.

Nested Block Structure
A language exhibits nested block structure if blocks may be nested one within another (typically with no upper bound on the level of nesting that is allowed).
There can be any number of scope levels (depending on the level of nesting of blocks).
Typical scope rules:
• no identifier may be declared more than once within the same block (at the same level).
• for any applied occurrence there must be a corresponding declaration, either within the same block or in a block in which it is nested.
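For the flat case, two scope levels are enough, so a sketch needs only one global map and one local map that is replaced on entry to each block. The class `FlatScopeTable`, its method names, and the use of a `String` as the stored information are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-level table for a language with flat block structure. */
class FlatScopeTable {
    private final Map<String, String> globals = new HashMap<>();
    private Map<String, String> locals = null; // null while outside any block

    void enterBlock() {
        locals = new HashMap<>();  // each block starts with no local names
    }

    void leaveBlock() {
        locals = null;             // local names die with their block
    }

    /** Returns false when id is already declared at the current level. */
    boolean declare(String id, String info) {
        Map<String, String> level = (locals != null) ? locals : globals;
        return level.putIfAbsent(id, info) == null;
    }

    /** Looks in the current block first, then globally; null if undeclared. */
    String lookup(String id) {
        if (locals != null && locals.containsKey(id)) {
            return locals.get(id);
        }
        return globals.get(id);
    }
}
```

A local declaration of `n` hides a global `n` only until `leaveBlock()` is called, which matches the rule that a global identifier may be redefined locally.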
Identification Table
For a typical programming language, i.e. a statically scoped language with nested block structure, we can visualize the structure of all scopes within a program as a kind of tree.
[Figure: the blocks Global, A, A1, A2, A3 and B drawn as nested blocks and as a scope tree. Arrows = “direction” of identifier lookup; the highlighted path is the lookup path for an applied occurrence in A3.]
At any one time (in analyzing the program) only a single path on the tree is accessible.
=> We don’t necessarily need to keep the whole “scope” tree in memory all the time.

Identification Table
  public class IdentificationTable {

    /** Add an entry to the identification table, associating
        identifier id with attribute attr at the current level */
    public void enter(String id, Attribute attr) { ... }

    /** Retrieve a previously added entry. Returns null
        when no entry for this identifier is found */
    public Attribute retrieve(String id) { ... }

    /** Add a new deepest nesting level to the
        identification table */
    public void openScope( ) { ... }

    /** Remove the deepest scope level from the table,
        and delete all entries associated with it */
    public void closeScope( ) { ... }
    ...
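One possible way to fill in the elided method bodies is a stack of maps, one map per open scope level. This is a sketch under assumptions, not the course's implementation: the `Attribute` placeholder with a `description` field is invented here, and `enter` does not check for duplicate declarations within a level.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Placeholder for whatever is stored per identifier (hypothetical). */
class Attribute {
    final String description;

    Attribute(String description) {
        this.description = description;
    }
}

class IdentificationTable {
    // One map per open scope level; the head of the deque is the deepest level.
    private final Deque<Map<String, Attribute>> levels = new ArrayDeque<>();

    IdentificationTable() {
        openScope(); // level 1, the outermost scope
    }

    /** Associate id with attr at the current (deepest) level. */
    public void enter(String id, Attribute attr) {
        levels.peek().put(id, attr);
    }

    /** Search from the deepest level outwards; null when id is not found. */
    public Attribute retrieve(String id) {
        for (Map<String, Attribute> level : levels) { // iterates deepest first
            Attribute attr = level.get(id);
            if (attr != null) {
                return attr;
            }
        }
        return null;
    }

    /** Add a new deepest nesting level. */
    public void openScope() {
        levels.push(new HashMap<>());
    }

    /** Remove the deepest level together with all of its entries. */
    public void closeScope() {
        levels.pop();
    }
}
```

Only the currently accessible path of the scope tree is held in memory: `closeScope()` discards a level as soon as analysis of its block is finished.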
Identification Table: Example
  let var a: Integer; var b: Boolean
  in begin
    ...
    let var b: Integer; var c: Boolean
    in begin ... end
    ...
    let var d: Boolean; var e: Integer
    in begin
      let const x~3 in ...
    end
  end

Table contents in the outer let:
  Level  Ident  Attr
  1      a      (1)
  1      b      (2)

In the first inner let:
  Level  Ident  Attr
  1      a      (1)
  1      b      (2)
  2      b      (3)
  2      c      (4)

In the second inner let:
  Level  Ident  Attr
  1      a      (1)
  1      b      (2)
  2      d      (5)
  2      e      (6)

In the innermost let:
  Level  Ident  Attr
  1      a      (1)
  1      b      (2)
  2      d      (5)
  2      e      (6)
  3      x      (7)

Attributes
  public void enter(String id, Attribute attr) { ... }
  public Attribute retrieve(String id) { ... }

What are these attributes? (Or in other words: What information do we need to store about identifiers?)
To understand what information needs to be stored, we must first understand what the information will be used for!
• Checking Scope Rules
• Checking Type Rules
What information is required by each of these two sub-phases and where does it come from?
Attributes
Type Rules
Example 2:
  let const m~2 ; var n:Boolean
  in begin n := m ...
... Attribute = type information.
This may be sufficient for simple languages, but not for more complex languages.
Attributes: Example 1: Mini-Triangle attributes
Mini Triangle is very simple: there are only two kinds of declarations
  single-Declaration ::= const Identifier ~ Expression
                       | var Identifier : Type-denoter
... and only two types of values: BOOL or INT
  public class Attribute {
    public static final byte
      CONST = 0, VAR = 1,   // two kinds of declaration
      BOOL = 0, INT = 1;    // two types

    byte kind;   // either CONST or VAR
    byte type;   // either BOOL or INT
  }

Attributes: Example 2: Triangle attributes
Triangle is more complex than Mini Triangle => more kinds of declarations and types
  public abstract class Attribute { ... }
  public class ConstAttribute extends Attribute {...}
  public class VarAttribute extends Attribute {...}
  public class ProcAttribute extends Attribute {...}
  public class FuncAttribute extends Attribute {...}
  public class TypeAttribute extends Attribute {...}

  public abstract class Type {...}
  public class BoolType extends Type {...}
  public class CharType extends Type {...}
  public class IntType extends Type {...}
  public class ArrayType extends Type {...}
  public class RecordType extends Type {...}
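To see how the two Mini Triangle fields are used, here is a small sketch of a check on an assignment such as n := m. The class is renamed `MiniAttr` only to keep the example self-contained, and `AssignCheck` with its error messages is hypothetical. The check that the left-hand side must be a variable is an assumption about the language rules, not something stated on the slide.

```java
/** Mini Triangle attribute, mirroring the two-field class on the slide. */
class MiniAttr {
    static final byte CONST = 0, VAR = 1; // two kinds of declaration
    static final byte BOOL = 0, INT = 1;  // two types

    final byte kind; // either CONST or VAR
    final byte type; // either BOOL or INT

    MiniAttr(byte kind, byte type) {
        this.kind = kind;
        this.type = type;
    }
}

/** Hypothetical check for an assignment "target := source". */
class AssignCheck {
    /** Returns null when the assignment is acceptable, else an error message. */
    static String check(MiniAttr target, MiniAttr source) {
        if (target.kind != MiniAttr.VAR) {
            return "left-hand side is not a variable"; // assumed rule
        }
        if (target.type != source.type) {
            return "type mismatch";
        }
        return null;
    }

    public static void main(String[] args) {
        MiniAttr m = new MiniAttr(MiniAttr.CONST, MiniAttr.INT); // const m~2
        MiniAttr n = new MiniAttr(MiniAttr.VAR, MiniAttr.BOOL);  // var n: Boolean
        System.out.println(check(n, m));
    }
}
```

The `kind` field answers whether the identifier may be assigned to; the `type` field is what the type checker compares.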
Attributes: Pointers to Declaration AST’s
Mini Triangle is very simple, but in a more realistic language the attributes can be quite complicated (many different kinds of identifiers and many different types of values)
=> The implementation of “attributes” can become much more complex and tedious.
Observation: The declarations of identifiers provide the necessary information for attributes.
=> For some languages, a practical way to represent attributes is simply as pointers to the AST-subtree of the actual declaration of an identifier.

Attributes as pointers to Declaration AST’s
[Figure: AST of a Program. An outer LetCommand has a SequentialDecl with VarDecl nodes for x (int) and a (bool); a nested LetCommand has a VarDecl for y (int). Each Attr entry of the Id table points to the corresponding VarDecl subtree.]

Id table:
  Level  Ident  Attr
  1      x      •
  1      a      •
  2      y      •
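The pointer representation can be sketched by letting the table map each identifier directly to its declaration node. The classes `Declaration`, `VarDecl` and `DeclPointerTable` below are hypothetical stand-ins for real AST classes, and scope levels are left out to keep the sketch short.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical AST node classes; real declaration nodes would carry more. */
abstract class Declaration { }

class VarDecl extends Declaration {
    final String ident;    // declared identifier, e.g. "x"
    final String typeName; // declared type, e.g. "int"

    VarDecl(String ident, String typeName) {
        this.ident = ident;
        this.typeName = typeName;
    }
}

class DeclPointerTable {
    // The "attribute" of an identifier is a reference to its declaration node.
    private final Map<String, Declaration> entries = new HashMap<>();

    void enter(String id, Declaration decl) {
        entries.put(id, decl);
    }

    Declaration retrieve(String id) {
        return entries.get(id);
    }

    /** Reads the type straight off the declaration; null if not a VarDecl. */
    static String typeOf(Declaration decl) {
        if (decl instanceof VarDecl) {
            return ((VarDecl) decl).typeName;
        }
        return null;
    }

    public static void main(String[] args) {
        DeclPointerTable table = new DeclPointerTable();
        VarDecl a = new VarDecl("a", "bool");
        table.enter(a.ident, a);
        System.out.println(typeOf(table.retrieve("a")));
    }
}
```

No separate attribute classes are needed: whatever the checker wants to know about an identifier is read from the declaration subtree the table entry refers to.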