{note the period}
- An example derivation (leftmost; => reads as "derives"): sentence => noun-phrase verb-phrase . =>
CSE 130
Programming Language Principles & Paradigms
Lecture #
4
Context Free?
• In the previous example you might wonder about the idea of context.
• In a context-free grammar, a replacement can be applied wherever its non-terminal appears; no surrounding context can forbid it. For example, you might want pets to be allowed as a verb only when girl is the subject
– The dog pets the girl = wrong
– The girl pets the dog = ok
• A rule like that works only in certain contexts, so a grammar enforcing it would not be “context free”
• Adding more productions may work around simple issues, but be careful: we are starting to confuse syntax and semantics, and some things are not possible no matter how many productions we add.
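Formal Methods of Describing Syntax (Continued)
- Another example grammar, this time from the book:
<program> → <stmts>
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d
<expr> → <term> + <term> | <term> - <term>
<term> → <var> | const
- An example derivation:
<program> => <stmts> => <stmt> => <var> = <expr> => a = <expr> => a = <term> + <term> => a = <var> + <term> => a = b + <term> => a = b + const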
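Formal Methods of Describing Syntax (Continued)
- Yet another example grammar:
expr → expr + expr | expr * expr | ( expr ) | number
number → number digit | digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
- An example derivation:
number => number digit => number digit digit => digit digit digit => 2 digit digit => 23 digit => 234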
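The derivation of 234 can be replayed mechanically. The sketch below is only an illustration: it assumes the non-terminal names number and digit used on the later ambiguity slides, and the helper names (PRODUCTIONS, leftmost_step, derive) are hypothetical. Each step rewrites the leftmost non-terminal with a chosen right-hand side and rejects choices that are not productions of the grammar.

```python
# Productions for the number part of the grammar (names assumed: number, digit).
PRODUCTIONS = {
    "number": [["number", "digit"], ["digit"]],
    "digit": [[d] for d in "0123456789"],
}

def leftmost_step(form, rhs):
    """Rewrite the leftmost non-terminal in `form` using right-hand side `rhs`."""
    for i, symbol in enumerate(form):
        if symbol in PRODUCTIONS:
            if rhs not in PRODUCTIONS[symbol]:
                raise ValueError(f"{symbol} -> {' '.join(rhs)} is not a production")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no non-terminal left to expand")

def derive(start, choices):
    """Apply each chosen right-hand side in order; return all sentential forms."""
    forms = [[start]]
    for rhs in choices:
        forms.append(leftmost_step(forms[-1], rhs))
    return forms

choices = [
    ["number", "digit"],  # number => number digit
    ["number", "digit"],  # => number digit digit
    ["digit"],            # => digit digit digit
    ["2"], ["3"], ["4"],  # => 2 digit digit => 2 3 digit => 2 3 4
]
steps = derive("number", choices)
print(" => ".join(" ".join(form) for form in steps))
```

Changing the order of the choices while still always expanding the leftmost non-terminal is not possible here, which is why a leftmost derivation is a convenient canonical form.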
Parse Trees and Abstract Syntax Trees
• Syntax establishes structure, not meaning
• However, the meaning of a sentence (or program) must be related to its syntax.
• Given expr_result → expr + expr, we expect to add the values of the two expr on the right-hand side to get the value of the left-hand side.
– We just attached meaning to a syntax rule; this is called syntax-directed semantics (semantics-directed syntax?)
Formal Methods of Describing Syntax (Continued)
- A parse tree is a hierarchical representation of a derivation
[Figure: parse tree for a = b + const; the leaves, read left to right, are a, =, b, +, const]
Parse Tree Example
• Given “the girl sees a dog.”
sentence
  noun-phrase
    article
      the
    noun
      girl
  verb-phrase
    verb
      sees
    noun-phrase
      article
        a
      noun
        dog
  .
Parse Tree Notes
• A parse tree is labeled by non-terminals at interior nodes and terminals at leaves
– Interior nodes = production steps in the derivation
• All terminals and non-terminals in a derivation are included in a parse tree
• Not everything is necessary to determine syntactic structure; leaving out some details gives an abstract syntax tree (AST), or just a syntax tree
– Useful in compilers and to understand HTML/XML markup
Abstract Syntax Tree Example
• For 3 + 4 * 5 we have the parse tree
expr
  expr
    number
      digit
        3
  +
  expr
    expr
      number
        digit
          4
    *
    expr
      number
        digit
          5
Abstract Syntax Tree Example Contd.
• As an AST we just need
+
  3
  *
    4
    5
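One possible way to hold the AST for 3 + 4 * 5 in a program is sketched below. The class names Num and BinOp and the function evaluate are hypothetical, chosen only for illustration. The point is that the AST keeps just the operators and operands, and that a value can be computed by walking it bottom-up.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Num:
    value: int

@dataclass
class BinOp:
    op: str          # "+" or "*"
    left: "Node"
    right: "Node"

Node = Union[Num, BinOp]

def evaluate(node):
    """Compute the value of an AST by evaluating children before parents."""
    if isinstance(node, Num):
        return node.value
    left, right = evaluate(node.left), evaluate(node.right)
    if node.op == "+":
        return left + right
    if node.op == "*":
        return left * right
    raise ValueError(f"unknown operator {node.op!r}")

# The AST from the slide: + at the root, with children 3 and (* 4 5).
ast = BinOp("+", Num(3), BinOp("*", Num(4), Num(5)))
print(evaluate(ast))  # 23
```

The intermediate expr, number and digit nodes of the parse tree have no counterpart here, since they add nothing once the structure is known.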
Ambiguity
• Two different derivations can lead to the same parse tree; this is fine, and the grammar is still unambiguous
• Given 234 we have different derivations
– number => number digit => number 4 => number digit 4 => number 3 4 => digit 3 4 => 234
– number => number digit => number digit digit => digit digit digit => 2 digit digit => 2 3 digit => 234
Ambiguity Contd.
• However the parse tree is the same in either case
number
  number
    number
      digit
        2
    digit
      3
  digit
    4
Ambiguity Contd.
• This isn’t always the case. Consider 3+4*5: we might have two different simplified parse trees
expr
  expr
    3
  +
  expr
    expr
      4
    *
    expr
      5
OR
expr
  expr
    expr
      3
    +
    expr
      4
  *
  expr
    5
• The first tree groups the input as 3 + (4 * 5) = 23, the second as (3 + 4) * 5 = 35, so we now have a precedence problem!
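To see the ambiguity concretely, the sketch below enumerates every simplified parse tree that the rule expr → expr + expr | expr * expr | number allows for a token list. It is an illustration under stated assumptions: tokens are a well-formed alternating list of integers and operators, and the function names parse_trees and evaluate are hypothetical.

```python
def parse_trees(tokens):
    """Return every tree the ambiguous grammar allows for `tokens`.

    A tree is either an int leaf or a tuple (operator, left_tree, right_tree).
    """
    if len(tokens) == 1:
        return [tokens[0]]
    trees = []
    # Operators sit at odd positions; each one can be the root of a tree.
    for i in range(1, len(tokens), 2):
        for left in parse_trees(tokens[:i]):
            for right in parse_trees(tokens[i + 1:]):
                trees.append((tokens[i], left, right))
    return trees

def evaluate(tree):
    """Evaluate a tree built by parse_trees."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    l, r = evaluate(left), evaluate(right)
    return l + r if op == "+" else l * r

trees = parse_trees([3, "+", 4, "*", 5])
for tree in trees:
    print(tree, "=", evaluate(tree))
```

For 3 + 4 * 5 this yields two trees with different values, which is exactly the problem a disambiguating rule has to remove.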
Removing Ambiguity
• A grammar for which some string has two or more distinct parse trees is considered ambiguous
• We can try to revise the grammar and introduce a disambiguating rule to establish which of the trees we want
• In the previous example we want multiplication to take precedence over addition, so we write grammar rules that form a precedence cascade, forcing the * to a lower point in the tree
Removing Ambiguity Contd.
• To remove the ambiguity you might add a term non-terminal
– expr → expr + expr | term
– term → term * term | ( expr ) | number
• This doesn’t quite do it because 3 + 4 + 5 can be (3 + 4) + 5 or 3 + (4 + 5)
– Addition can come out left or right associative. This isn’t so bad with addition, but associativity can be a problem with other operators (8 - 3 - 2 is 3 grouped to the left but 7 grouped to the right). We can fix this with some new rules, using the fact that
• Left recursive rules become left associative
• Right recursive rules become right associative
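Why associativity matters for operators other than addition can be checked with a small experiment. The sketch below uses subtraction as the example operator (an assumption, since the slide grammar only has + and *); left_assoc and right_assoc are hypothetical helper names.

```python
from functools import reduce

def left_assoc(operands):
    """Group to the left: ((a - b) - c) - ..."""
    return reduce(lambda acc, x: acc - x, operands)

def right_assoc(operands):
    """Group to the right: a - (b - (c - ...))"""
    *init, last = operands
    return reduce(lambda acc, x: x - acc, reversed(init), last)

print(left_assoc([8, 3, 2]))   # (8 - 3) - 2
print(right_assoc([8, 3, 2]))  # 8 - (3 - 2)
```

The two groupings give 3 and 7 respectively, so the grammar has to commit to one of them.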
Removing Ambiguity Contd.
• The revised grammar is as follows
expr → expr + term | term
term → term * factor | factor
factor → ( expr ) | number
number → number digit | digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
• This should be unambiguous; try it and see with some derivations and parse trees
Formal Methods of Describing Syntax (Continued)
- Extended BNF (just abbreviations):
1. Optional parts are placed in brackets ([])
<proc_call> -> ident [ ( <expr_list> ) ]
2. Put alternative parts of RHSs in parentheses and separate them with vertical bars
<term> -> <term> (+ | -) const
3. Put repetitions (0 or more) in braces ({})
<ident> -> letter {letter | digit}
- BNF:
<expr> → <expr> + <term> | <expr> - <term> | <term>
<term> → <term> * <factor> | <term> / <factor> | <factor>
- EBNF:
<expr> → <term> {(+ | -) <term>}
<term> → <factor> {(* | /) <factor>}
- There are even more BNF-like forms out there if you look around, such as Augmented BNF (http://www.ietf.org/rfc/rfc2234.txt)
- You may also see people using basic RegExes for at least portions of languages
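The EBNF form maps quite directly onto a hand-written recursive-descent parser: each non-terminal becomes a function and each {...} repetition becomes a loop. The sketch below is one possible illustration, not a reference implementation. It assumes a factor rule of the form factor → ( expr ) | number, and the names tokenize, Parser and evaluate are hypothetical. The tokenizer silently skips characters it does not recognize.

```python
import re

def tokenize(text):
    """Split input into integer literals, operators and parentheses."""
    return re.findall(r"\d+|[-+*/()]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # expr -> term {(+ | -) term}; the loop groups to the left
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term -> factor {(* | /) factor}; binds tighter than expr
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        # factor -> ( expr ) | number   (assumed rule)
        token = self.advance()
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value

print(evaluate("3 + 4 * 5"), evaluate("8 - 3 - 2"), evaluate("(3 + 4) * 5"))
```

Because term is called from inside expr, * and / end up lower in the tree than + and -, and because each loop accumulates from the left, all four operators come out left associative.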
Formal Methods of Describing Syntax (Continued)
- Syntax Graphs: put the terminals in circles or ellipses and the nonterminals in rectangles; connect them with lines with arrowheads
- Example: Pascal type declarations
[Figure: syntax graph for a Pascal type, with three alternative paths: type_identifier; a parenthesized, comma-separated list of identifiers; and constant .. constant]
Attribute Grammars (AGs) (Knuth, 1968)
- CFGs cannot describe all of the syntax of programming languages
- AGs are additions to CFGs that carry some semantic info along through parse trees
- Primary value of AGs:
1. Static semantics specification
2. Compiler design (static semantics checking)
- Def: An attribute grammar is a CFG G = (S, N, T, P) with the following additions:
1. For each grammar symbol x there is a set A(x) of attribute values
2. Each rule has a set of functions that define certain attributes of the nonterminals in the rule
3. Each rule has a (possibly empty) set of predicates to check for attribute consistency
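The three additions in the definition can be sketched for a single hypothetical rule, an assignment whose right-hand side is var + var. Everything in the sketch is an assumption made for illustration: the attribute names actual_type and expected_type, the type names "int" and "real", and the helper names are not taken from the slides. Attributes are stored on the nodes, a semantic function computes an attribute from the children, and a predicate checks consistency.

```python
from dataclasses import dataclass

@dataclass
class Var:
    name: str
    actual_type: str  # attribute value, e.g. looked up in a symbol table

@dataclass
class Add:
    left: Var
    right: Var

def synthesize_type(node: Add) -> str:
    """Semantic function: the type of the sum is computed from the children."""
    if node.left.actual_type == "int" and node.right.actual_type == "int":
        return "int"
    return "real"

def check_assignment(expected_type: str, rhs: Add) -> bool:
    """Predicate: the rule is consistent only if the two types agree."""
    return synthesize_type(rhs) == expected_type

ok = check_assignment("int", Add(Var("a", "int"), Var("b", "int")))
bad = check_assignment("int", Add(Var("a", "int"), Var("c", "real")))
print(ok, bad)
```

A context-free grammar alone accepts both assignments; the predicate is what rejects the second one, which is the static semantics checking mentioned above.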
Attribute Grammars (continued)
- Let X0 → X1 ... Xn be a rule.
- Functions of the form S(X0) = f(A(X1), ... A(Xn)) define synthesized attributes
- Functions of the form I(Xj) = f(A(X0), ... , A(Xn)), for i