Compiler Design 1. Introduction to Programming. Language Design and to
Compilation. Compiler Design 1 (2011). 2. Administrivia. • Lecturer: – Kostis
Sagonas ...
Administrivia • Lecturer:
– Kostis Sagonas (Hus 1, 352)
Compiler Design 1
• Course home page: http://user.it.uu.se/~kostis/Teaching/KT1-11
• If you want to be enrolled in the course, send mail with your name and UU account to:
Introduction to Programming Language Design and to Compilation
• Assistant:
[email protected]
– Stavros Aronis (
[email protected]) – responsible for the lessons and the assignments
Compiler Design 1 (2011)
Course Structure
2
Course Literature
• Course has theoretical and practical aspects • Need both in programming languages! • Written examination = theory (4 points) • Assignments = practice (1 point)
– Electronic hand-in to the assistant before the corresponding deadline
Compiler Design 1 (2011)
3
Compiler Design 1 (2011)
4
Academic Honesty
The Compiler Project
• For assignments you are allowed to work in pairs (but no threesomes/foursomes/...) • Don’t use work from uncited sources
• A follow-up course • that will be taught by Sven-Olof Nyström • in period 3
– Including old assignments
PLAGIARISM
Compiler Design 1 (2011)
5
Compiler Design 1 (2011)
How are Languages Implemented?
Language Implementations
• Two major strategies:
• Batch compilation systems dominate
– Interpreters (older, less studied) – Compilers (newer, much more studied)
– gcc
• Some languages are primarily interpreted
• Interpreters run programs “as is”
– Java bytecode – Postscript
– Little or no preprocessing
• Some environments (e.g. Lisp) provide both
• Compilers do extensive preprocessing
Compiler Design 1 (2011)
6
– Interpreter for development – Compiler for production
7
Compiler Design 1 (2011)
8
(Short) History of High-Level Languages
FORTRAN I
• 1953 IBM develops the 701
• 1954 IBM develops the 704 • John Backus
• Till then, all programming done in assembly
– Idea: translate high-level code to assembly – Many thought this impossible • Had already failed in other projects
• Problem: Software costs exceeded hardware costs!
• 1954-7 FORTRAN I project • By 1958, >50% of all software is in FORTRAN • Cut development time dramatically
• John Backus: “Speedcoding” – An interpreter – Ran 10-20 times slower than hand-written assembly Compiler Design 1 (2011)
– (2 weeks → 2 hours)
9
Compiler Design 1 (2011)
FORTRAN I
The Structure of a Compiler
• The first compiler
1. 2. 3. 4. 5. 6.
– Produced code almost as good as hand-written – Huge impact on computer science
• Led to an enormous body of theoretical work • Modern compilers preserve the outlines of the FORTRAN I compiler
Compiler Design 1 (2011)
10
Lexical Analysis Syntax Analysis Semantic Analysis IR Optimization Code Generation Low-level Optimization The first 3, at least, can be understood by analogy to how humans comprehend English.
11
Compiler Design 1 (2011)
12
Lexical Analysis
More Lexical Analysis
• First step: recognize words.
• Lexical analysis is not trivial. Consider: ist his ase nte nce
– Smallest unit above letters
This is a sentence.
• Plus, programming languages are typically more cryptic than English: *p->f ++ = -.12345e-5
• Note the – Capital “T” (start of sentence symbol) – Blank “ ” (word separator) – Period “.” (end of sentence symbol)
Compiler Design 1 (2011)
13
Compiler Design 1 (2011)
And More Lexical Analysis
Parsing
• Lexical analyzer divides program text into “words” or “tokens” if (x == y) then z = 1; else z = 2;
• Once words are understood, the next step is to understand the sentence structure
14
• Parsing = Diagramming Sentences – The diagram is a tree
• Units: if, (, x, ==, y, ), then, z, =, 1, ;, else, z, =, 2, ;
Compiler Design 1 (2011)
15
Compiler Design 1 (2011)
16
Diagramming a Sentence (1) This
line
article
is
a
Diagramming a Sentence (2) longer
sentence
noun verb article adjective
noun
article
noun phrase noun phrase
This
line
is
a
longer
noun verb article adjective
subject
verb phrase
sentence noun
object sentence
sentence Compiler Design 1 (2011)
17
Compiler Design 1 (2011)
Parsing Programs
Semantic Analysis
• Parsing program expressions is the same • Consider: If (x == y) then z = 1; else z = 2; • Diagrammed:
• Once sentence structure is understood, we can try to understand its “meaning”
x
==
y
relation predicate
z
=
1
assignment then-stmt
z
=
18
– But meaning is too hard for compilers
• Most compilers perform limited analysis to catch inconsistencies
2
assignment • Some optimizing compilers do more analysis to improve the performance of the program
else-stmt
if-then-else Compiler Design 1 (2011)
19
Compiler Design 1 (2011)
20
Semantic Analysis in English
Semantic Analysis in Programming Languages
• Example: Jack said Jerry left his assignment at home. What does “his” refer to? Jack or Jerry?
• Programming languages define strict rules to avoid such ambiguities
• Even worse: Jack said Jack left his assignment at home? How many Jacks are there? Which one left the assignment?
• This C++ code prints “4”; the inner definition is used
Compiler Design 1 (2011)
21
{ int Jack = 3; { int Jack = 4; cout