David M. Keil
3. PDAs and CFLs
Theory of Computing
David M. Keil, Framingham State University
CSCI 460 Theory of Computing
3. Pushdown automata
1. Pushdown automata (stack machines)
2. Context-free grammars
3. Derivations and parsing
4. Expressiveness of PDAs
Theory of Computing
3. Pushdown automata Spring 2015
Inquiry
• How do compilers work?
• Can a stack enable state-transition systems to solve more problems than DFAs do?
• Are there limits to what PDAs compute?
Topic objective 3. Explain the relation between pushdown automata and context-free grammars, with applications, proving relevant expressiveness results
Subtopic objectives
3.0 Describe the stack data structure*
3.1 Explain the pushdown-automaton model**
3.2a Write a linear CFG*
3.2b Define a context-free grammar
3.3a Perform a derivation using a context-free grammar*
3.3b Describe an application of CFLs to compilation
3.4 Give an expressiveness proof for stack machines*
PDAs and CFLs
1. Pushdown automata (stack machines)
• Problem: Modify the DFA/NFA model to accept $0^n1^n$
– To accept: $\varepsilon$, 01, 0011, 000111
– To reject: 0, 1, 10, 011, 101
• Can any DFA solve this problem?
• What does the automaton need to store as it reads input?
Subtopic objectives 3.0 Describe the stack data structure* 3.1 Explain the pushdown-automaton model**
Stacks and queues
• Specialized collections with restricted access
• A queue works on the first-in, first-out (FIFO) principle
• A stack is last-in, first-out (LIFO)
[Diagrams: D. Harel, The Science of Computing]
Reverse Polish notation and stack
• Postfix notation places the operator after its operands
• Example: 2 3 * 4 + (for infix 2 * 3 + 4)
• Steps to evaluate 2 3 * 4 +:
1. Read 2, push 2
2. Read 3, push 3
3. Read *, pop 3, pop 2, multiply 2 * 3, push 6
4. Read 4, push 4
5. Read +, pop 4, pop 6, add 6 + 4, push 10
• Why calculate this way? It is easier to code than evaluating infix notation
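The evaluation steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `eval_rpn` and the choice of supported operators are assumptions.

```python
def eval_rpn(expression):
    """Evaluate a space-separated postfix expression using a stack."""
    # Hypothetical operator table; only +, - and * are supported in this sketch.
    operators = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    stack = []
    for token in expression.split():
        if token in operators:
            right = stack.pop()  # the operand pushed last is popped first
            left = stack.pop()
            stack.append(operators[token](left, right))
        else:
            stack.append(int(token))  # operands are pushed as they are read
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


print(eval_rpn("2 3 * 4 +"))  # 10
```

No parentheses or precedence rules are needed: the order of the tokens alone determines the order of the operations.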
PDA example A
• This (deterministic) PDA reads zeroes, storing a ‘#’ on the stack for each
• Then it pops one ‘#’ for each one that it reads
• The stack implements memory
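Formal definition of PDAs
• PDA $A = \langle Q, \Sigma, \Gamma, \delta, q_0, F \rangle$
• As with DFAs, $Q$, $\Sigma$, $q_0$, $F$ are the state set, alphabet, start state, and accept state set, respectively
• $\Gamma$ (Gamma) is a set of stack symbols
• $\delta \subseteq (Q \times (\Sigma \cup \{\varepsilon\}) \times (\Gamma \cup \{\varepsilon\})) \times (Q \times (\Gamma \cup \{\varepsilon\}))$ is the transition relation
• In one step, a PDA reads an input symbol and pops a symbol from the stack, then changes state and pushes a symbol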
Transition relation $\delta$ (delta)
• $(q, a, x) \to_A (q', y)$ means that in state $q$, on input symbol $a$, and popping $x$, $A$ can go into state $q'$ and push $y$ onto the stack
• A pop-$\varepsilon$ transition pops nothing
• A push-$\varepsilon$ transition pushes nothing
• PDAs are nondeterministic in general; they may make transitions without consuming input
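Modes of acceptance for PDAs
• Two types of PDAs exist:
– Accepting an input string requires an empty stack on termination of input
– Accepting an input string requires being in an accept state on termination of input
• A given PDA may accept different languages under the two modes, but the two types of (nondeterministic) PDA define the same class of languages, since each type can be converted to the other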
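PDA example A (Formal definition)
Let $A = \langle Q, \Sigma, \Gamma, \delta, q_0, F \rangle$, where
$Q = \{q_0, q_1, q_2\}$
$\Sigma = \{0, 1\}$
$\Gamma = \{\#\}$
$\delta = \{(q_0, 0, \varepsilon, q_1, \#),\ (q_1, 0, \varepsilon, q_1, \#),\ (q_1, 1, \#, q_2, \varepsilon),\ (q_2, 1, \#, q_2, \varepsilon)\}$
$F = \{q_0, q_2\}$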
Computations with PDA A (stack contents shown after each symbol read)
• Input: $\varepsilon$. Accepted
• Input: 1. No transitions, rejected
• Input: 0011. Read 0: #; read 0: ##; read 1: #; read 1: empty stack. Accepted
• Input: 001. Read 0: #; read 0: ##; read 1: #. Nonempty stack, rejected
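The computations above can be reproduced with a small simulator. This is a sketch under two stated assumptions: each 0 is taken to push one ‘#’ without popping, so that the stack grows as in the traces, and a string is accepted only if the PDA ends in an accept state with an empty stack. The names `DELTA`, `ACCEPTING` and `run_pda_a` are illustrative.

```python
EPS = ""  # stands for the empty string: pop nothing / push nothing

# (state, input symbol, popped symbol) -> (next state, pushed symbol)
# Assumption: reading 0 pushes one '#' and pops nothing, matching the traces.
DELTA = {
    ("q0", "0", EPS): ("q1", "#"),
    ("q1", "0", EPS): ("q1", "#"),
    ("q1", "1", "#"): ("q2", EPS),
    ("q2", "1", "#"): ("q2", EPS),
}
ACCEPTING = {"q0", "q2"}


def run_pda_a(text):
    """Return (accepted, trace); trace lists the stack after each symbol read."""
    state, stack, trace = "q0", [], []
    for symbol in text:
        top = stack[-1] if stack else None
        if top is not None and (state, symbol, top) in DELTA:
            stack.pop()
            state, push = DELTA[(state, symbol, top)]
        elif (state, symbol, EPS) in DELTA:
            state, push = DELTA[(state, symbol, EPS)]
        else:
            return False, trace  # no transition applies: reject
        if push != EPS:
            stack.append(push)
        trace.append("".join(stack))
    # Assumed acceptance condition: accept state and empty stack.
    return state in ACCEPTING and not stack, trace


for word in ["", "1", "0011", "001"]:
    print(repr(word), run_pda_a(word))
```

Because this PDA is deterministic, the simulator never has to backtrack; a general PDA would need a search over alternative transitions.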
Not all languages are regular
• Theorem: $(\exists L \subseteq \Sigma^*)\,(L \notin RL \wedge (\exists \text{ PDA } A)\; L = L(A))$
• Proof:
– The PDA on the previous slide accepts the language $0^n1^n$
– This language, by the Pumping Lemma, is not regular
• Definition: A language is context-free if there exists a PDA that accepts it on empty stack
Another PDA example
What is ? What is L(B)? (Note: not a regular language)
Nondeterminism and PDAs • Nondeterminism is an essential aspect of PDAs and CFL recognition • Deterministic PDAs are a weaker model of computation than PDAs • Some CFLs have no deterministic PDA that accepts them
Deterministic PDAs
• Effectively parseable; equivalent to LR(k) grammars
• Have unique transitions; i.e., $\delta(q, a, x)$ has at most one return value
• Accept some but not all CFLs; e.g., $\{ww^R \mid w \in \Sigma^*\}$ has no DPDA (with a center marker $c$, $\{wcw^R\}$ does have one)
• $Reg(L) \Rightarrow (\exists M \in DPDA)\; L(M) = L$
• Thm: $RL \subset \{L(M) \mid M \in DPDA\} \subset CFL$
• Thm: If $L$ is accepted by a DPDA, then $L$ has an unambiguous CFG
• Example of a DPDA’s language: $0^n1^n$
2. Context-free grammars
• What is syntax?
• How is syntax of regular languages expressed?
• Can regular expressions generate CFLs?
• How is English syntax defined?
• Java syntax?
Subtopic objectives 3.2a Write a linear CFG* 3.2b Define a context-free grammar
CFG for $0^n1^n$
$S \to \varepsilon$
$S \to 0S1$
• This grammar defines a sentence as a null string or else as a 0, followed by a sentence, followed by a 1
Palindrome example (B)
• Palindromes may be defined inductively:
Base: $\varepsilon, 0, 1 \in PAL$
Induction: $(x \in PAL) \Rightarrow (0x0, 1x1 \in PAL)$
• CFG for PAL:
$S \to \varepsilon \mid 0 \mid 1$
$S \to 0S0 \mid 1S1$
• For any $x$, we may prove or disprove by induction that $x \in PAL$
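The inductive definition translates directly into a recursive membership test. The sketch below is illustrative only; the function name `in_pal` is an assumption, and the alphabet is taken to be {0, 1} as on the slide.

```python
def in_pal(x):
    """Decide membership in PAL over {0, 1} by following the inductive definition."""
    if x in ("", "0", "1"):  # base: S -> ε | 0 | 1
        return True
    # induction: S -> 0S0 | 1S1 (same symbol at both ends, PAL in between)
    if len(x) >= 2 and x[0] == x[-1] and x[0] in "01":
        return in_pal(x[1:-1])
    return False


print(in_pal("1001"), in_pal("10"))  # True False
```

Each recursive call strips one symbol from each end, mirroring one application of $S \to 0S0$ or $S \to 1S1$ in reverse.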
Definition of CFG syntax
• CFG $G = \langle \Sigma, NT, R, S \rangle$, where
- $\Sigma$ is a set of terminal symbols
- $NT$ is a set of nonterminals (names)
- $R \subseteq NT \times (\Sigma \cup NT)^*$ is a set of production rules
- $S \in NT$ is the start symbol
• Production rules are of the form $X \to Y Z \ldots$ where $X \in NT$ and $Y, Z, \ldots \in (\Sigma \cup NT)$
• The CFGs generate precisely the CFLs
Example of structural induction
• Let $S = \{()\} \cup \{(x) \mid x \in S\} \cup \{yz \mid y, z \in S\}$
• Theorem: Every element of $S$ has equal numbers of left and right parentheses
• Proof, where $P(x)$ means that $x$ has equal numbers of left and right parentheses:
1. Base: ( ) is balanced
Induction:
2. $P(x) \Rightarrow P((x))$: adding one left and one right parenthesis preserves balance
3. $P(x) \wedge P(y) \Rightarrow P(xy)$: concatenating strings, each with equal left and right, yields balance
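The theorem can be spot-checked (not proved) by generating elements of $S$ with a bounded number of rule applications and counting parentheses. The function name `generate` and the depth bound are assumptions made for this sketch.

```python
def generate(depth):
    """Elements of S obtainable with at most `depth` rounds of the inductive rules."""
    current = {"()"}  # base case
    for _ in range(depth):
        wrapped = {"(" + x + ")" for x in current}               # rule (x)
        joined = {y + z for y in current for z in current}       # rule yz
        current = current | wrapped | joined
    return current


elements = generate(3)
# Every generated element has equally many left and right parentheses.
print(all(s.count("(") == s.count(")") for s in elements))  # True
```

A finite check like this only tests the claim on some elements; the induction argument is what covers all of $S$.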
Right-linear and left-linear CFGs generate RLs
• Example: $0^* \mid 10^*$ is generated by
$S \to X \mid 1X$
$X \to \varepsilon \mid 0X$
• This is a right-linear CFG: each production body has at most one nonterminal, which is the rightmost symbol of the body (e.g., $S \to aS$)
• Right-linear or left-linear CFGs generate regular languages; every RL has such a CFG
• A star component of a regular expression follows the pattern for $X$ above
Linear CFG example
• Regular expression: $(11)^* \mid (00)^*$
• Grammar:
$S \to X \mid Y$
$X \to \varepsilon \mid 00X$
$Y \to \varepsilon \mid 11Y$
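One way to gain confidence that the grammar and the regular expression describe the same language is to compare them on all short strings. The sketch below encodes each right-linear production as a (terminal prefix, next nonterminal) pair; this encoding and the names `GRAMMAR`, `generates` and `agrees_with_regex` are assumptions, and the recognizer assumes the grammar has no cycle of productions that consume no input.

```python
import re
from itertools import product

# Right-linear productions: nonterminal -> list of (terminal prefix, next nonterminal).
# A next nonterminal of None means the production body ends with the prefix.
GRAMMAR = {
    "S": [("", "X"), ("", "Y")],
    "X": [("", None), ("00", "X")],
    "Y": [("", None), ("11", "Y")],
}


def generates(symbol, text):
    """True iff `text` can be derived from `symbol` in GRAMMAR."""
    for prefix, nxt in GRAMMAR[symbol]:
        if text.startswith(prefix):
            rest = text[len(prefix):]
            if nxt is None:
                if rest == "":
                    return True
            elif generates(nxt, rest):
                return True
    return False


def agrees_with_regex(max_len):
    """Compare the grammar with (11)*|(00)* on all binary strings up to max_len."""
    pattern = re.compile(r"(11)*|(00)*")
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            s = "".join(chars)
            if generates("S", s) != bool(pattern.fullmatch(s)):
                return False
    return True


print(agrees_with_regex(6))  # True
```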
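Propositional-logic formulas
• Formulas in propositional logic are a CFL
• Terminals: truth values (true, false); variable names; and the operators (, ), $\neg$, $\wedge$, $\vee$, $\Rightarrow$
• Grammar: $S \to \text{true} \mid \text{false} \mid \text{ID} \mid (S) \mid \neg S \mid S \wedge S \mid S \vee S \mid S \Rightarrow S$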
Java grammar fragment
statement $\to$ { statement-list }
statement $\to$ ID = expression ;
statement-list $\to$ statement statement-list
statement-list $\to$ $\varepsilon$
expression $\to$ ID
expression $\to$ num-literal
expression $\to$ ( expression )
expression $\to$ num-literal + expression
num-literal $\to$ digit | digit num-literal
Terminals are shown in red on the original slide
Natural-language grammars
• The nonterminals are “parts of speech”
• In English, a sentence is a noun phrase followed by a verb phrase
• A noun phrase may be a noun, or an adjective followed by a noun phrase
• Etc.
HTML and XML • Tags define formatted or semantic text elements • HTML example:
It is very windy
• The grammar uses productions such as text text text
• XML example: Tags define semantics of text, such as by marking field names: 19.95
Normal forms for CFGs
• A normal form of a language (such as a logic or a way to express CFGs) is a restricted format that has some advantages
• A CFG is in Chomsky normal form (CNF) iff all productions are of the form $A \to BC$ or $A \to a$, where $A$, $B$, $C$ are nonterminals and $a$ is a terminal
• Theorem: $\{L(G) \mid G \text{ is in CNF}\} = \{L \mid L \text{ is CF}\}$, up to the empty string, which CNF admits only through a special rule $S \to \varepsilon$
PDA and CFG conversions
Linear-time:
• CFG $\to$ PDA
• PDA that accepts by final state $\to$ PDA that accepts by empty stack
$O(n^3)$:
• PDA $\to$ CFG
$O(n^2)$:
• CFG $\to$ Chomsky Normal Form
3. Derivations and parsing
• What is the structure of a Java program?
• Of an English sentence?
• How is this structure diagrammed?
Subtopic objectives 3.3a Perform a derivation using a context-free grammar* 3.3b Describe an application of CFLs to compilation
Parsing
• An application of PDAs is parsing, e.g., in program compilation
• A derivation process demonstrates that a string is in the language of a certain grammar
• Parse trees diagram the results of derivations
• Parsing is used to build the structure of an utterance, as in compiling a program
Derivation of a string
• Production rules are applied repeatedly, starting with the start symbol; each step replaces one occurrence of the nonterminal on the left side of a rule with that rule's right side
• A derivation continues until no nonterminals remain
• Example, with PAL as the start symbol $S$:
PAL $\Rightarrow$ 1 PAL 1 (rule $S \to 1S1$)
$\Rightarrow$ 10 PAL 01 (rule $S \to 0S0$)
$\Rightarrow$ 1001 (rule $S \to \varepsilon$)
Derivation steps
• Suppose $\alpha$, $\beta$ are sequences of nonterminal and terminal symbols in a grammar
• $\alpha \Rightarrow_G \beta$ denotes one derivation step, applying one production rule
• $\alpha \Rightarrow^*_G \beta$ denotes that $\beta$ can be derived from $\alpha$ in zero or more steps
• If $x \in L(G)$, and $S$ is $G$’s starting symbol, then $S \Rightarrow^*_G x$
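Derivation example
• $G = \langle \Sigma, NT, R, S \rangle$
$\Sigma = \{0, 1\}$
$NT = \{Bal\}$
$R = \{Bal \to \varepsilon,\ Bal \to 0\,Bal\,1\}$
$S = Bal$
• Example: Derive 0011 from Bal, showing that 0011 is in $L(G)$
Bal $\Rightarrow$ 0 Bal 1 (rule 2)
0 Bal 1 $\Rightarrow$ 00 Bal 11 (rule 2)
00 Bal 11 $\Rightarrow$ 00 $\varepsilon$ 11 (rule 1)
00 $\varepsilon$ 11 = 0011 (definition of $\varepsilon$)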
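The same derivation can be replayed mechanically. In this sketch the rules are numbered as on the slide (rule 1 is the null production, rule 2 is 0 Bal 1); the names `RULES` and `derive`, and the choice to always rewrite the leftmost occurrence of the nonterminal, are assumptions made for illustration.

```python
# Rule number -> (head nonterminal, body as a list of symbols)
RULES = {
    1: ("Bal", []),                 # Bal -> ε
    2: ("Bal", ["0", "Bal", "1"]),  # Bal -> 0 Bal 1
}


def derive(start, rule_numbers):
    """Apply the numbered rules in order to the leftmost matching nonterminal.

    Returns the list of sentential forms, one per derivation step.
    """
    form = [start]
    steps = [" ".join(form)]
    for number in rule_numbers:
        head, body = RULES[number]
        position = form.index(head)  # raises ValueError if the head is absent
        form = form[:position] + body + form[position + 1:]
        steps.append(" ".join(form) if form else "ε")
    return steps


print(" => ".join(derive("Bal", [2, 2, 1])))
# Bal => 0 Bal 1 => 0 0 Bal 1 1 => 0 0 1 1
```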
Parsing with a CFG
• A parse tree is a diagram of a derivation
• Leaves of the tree are terminals (or $\varepsilon$); internal nodes are nonterminals
• Parse tree for previous example, in bracketed form: Bal(0 Bal(0 Bal($\varepsilon$) 1) 1)
• A given element of a language generated by a CFG may have multiple parse trees
Parse trees
• Compilers build a program structure from tokens
• Root may be program
• Lexemes are leaves of tree
• Nonterminal syntax elements (e.g., expression, factor) are internal tree nodes or the root
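Ambiguous grammars
• A grammar is ambiguous if some string in its language has two or more different parse trees (equivalently, two or more leftmost derivations)
• Having a choice of rules at some step does not by itself make a grammar ambiguous; one parse tree usually corresponds to many derivations that differ only in the order of rule applications
• Always expanding the leftmost nonterminal produces a leftmost derivation; always expanding the rightmost nonterminal produces a rightmost derivation
• Restricting attention to leftmost (or rightmost) derivations gives exactly one derivation per parse tree, but it does not remove ambiguity from the grammar itself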
Parsing algorithms
• Table-driven (bottom-up) parsing: used in many compilers and parser generators, e.g., YACC, Bison
• Top-down (recursive-descent) parsing: one recursive method recognizes each element corresponding to one production
– This method parses a string, consuming its symbols as it proceeds
– Option: method may build a parse tree for the production recognized
Top-down parsing example
• Productions: $Bal \to \varepsilon$, $Bal \to 0\,Bal\,1$
• Parsing algorithm (returns true iff string x is in the language of PDA example A, else false):
Parse-Bal(x)
    If x = $\varepsilon$
        return true
    if x[1] = ‘0’ and x[length(x)] = ‘1’
        return Parse-Bal(x[2..length(x) – 1])
    else return false
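A direct Python rendering of Parse-Bal might look as follows. It is a sketch only: the pseudocode uses 1-based indexing, while Python strings are 0-based, so x[1] becomes `x[0]` and x[length(x)] becomes `x[-1]`; the name `parse_bal` is an assumption.

```python
def parse_bal(x):
    """Return True iff x is derivable from Bal, i.e. x = 0^n 1^n for some n >= 0."""
    if x == "":                          # production Bal -> ε
        return True
    if x[0] == "0" and x[-1] == "1":     # production Bal -> 0 Bal 1
        return parse_bal(x[1:-1])        # parse what lies between the outer 0 and 1
    return False


for word in ["", "01", "0011", "001", "10"]:
    print(repr(word), parse_bal(word))
```

The recursion has one method for the single nonterminal Bal, which is the recursive-descent pattern in its simplest form.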
Parsing and compilation
• Lexical analyzer separates a sentence or program into tokens (lexemes)
• Compiler checks syntax of a program (structure of lexemes)
• Parser builds parse tree from sequence of lexemes
• Compiler generates assembler, machine, or byte code from a program’s parse tree
CFG problems
Define a CFG for L =
1. $0^n1^{2n}$
2. $\{x : n_0(x) = n_1(x)\}$
3. $\{x \mid n_1(x) > n_0(x)\}$
4. $\{0^m1^n \mid n > m\}$
5. $x123x^R$
4. Expressiveness of the PDA model
• Properties of CFLs and CFGs
• Does a PDA exist for every language?
• Do PDAs and CFGs define the same set of languages?
Subtopic objective 3.4 Give an expressiveness proof for stack machines*
All regular languages are CF
• Theorem: $CFL \supset RL$
• Proof:
1. ($\supseteq$) Construct PDA $A$ from DFA $M$, $L = L(M)$, where $\delta$ of $A$ is $\delta$ of $M$, i.e., without any stack operations
2. ($\neq$) Some CFLs are not regular; e.g., $0^n1^n$
• PDA model is more expressive than DFA
Languages of CF Grammars
• Language generated by a grammar: $L(G) = \{w \in \Sigma^* \mid S \Rightarrow^*_G w\}$
• Set of languages of all CF grammars: $\mathcal{L}(CFG) = \{L \mid \exists G \text{ s.t. } L = L(G),\ G \text{ is CF}\}$
• To prove about CFGs and CFLs:
– $\mathcal{L}(CFG) = CFL = \{L \mid L = L(M),\ M \in PDA\}$
– i.e., that CFGs define the same set of languages as PDAs
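$\{L \mid L = L(G),\ G \text{ is CF}\} = CFL$
Theorem: The languages defined by CFGs are precisely the CFLs (languages accepted by some PDA)
Proof:
1. $(\forall G \in CFG)(\exists M \in PDA)\; L(M) = L(G)$
CFG $\to$ PDA construction:
Construct a PDA $M$ that starts with the start symbol of the CFG on its stack and then works as follows:
• When a nonterminal $X$ is on top of the stack, nondeterministically choose a production $X \to Y Z \ldots$ of the CFG, pop $X$, and push $Y Z \ldots$ (with $Y$ on top); if $X$ has no production, reject
• When a terminal is on top of the stack, get the next input symbol $a$; if $a$ matches that terminal, pop the stack; otherwise reject the input
• Accept if the stack is empty when the input is exhausted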
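The construction can be sketched as a simulation in which the nondeterministic choice of production is explored by depth-first search. Several things here are assumptions for illustration: the function name `pda_accepts`, the dictionary encoding of the grammar, acceptance on empty stack at end of input, and a pruning rule that abandons any branch whose stack holds more terminals than there is input left. The search is not guaranteed to terminate for every grammar (for example, left-recursive ones); it does for the Bal grammar used in the demonstration.

```python
def pda_accepts(grammar, start, text):
    """Simulate the PDA built from a CFG: expand nonterminals, match terminals.

    grammar maps each nonterminal to a list of bodies (tuples of symbols).
    """
    def step(stack, position):
        # stack is a tuple whose first element is the top of the stack
        if not stack:
            return position == len(text)  # empty stack: accept iff input is used up
        top, rest = stack[0], stack[1:]
        if top in grammar:  # nonterminal: pop it and push one of its bodies
            for body in grammar[top]:
                new_stack = body + rest
                terminals = sum(1 for s in new_stack if s not in grammar)
                # Prune: every terminal on the stack must still be matched.
                if terminals <= len(text) - position and step(new_stack, position):
                    return True
            return False
        # terminal: it must match the next input symbol
        if position < len(text) and text[position] == top:
            return step(rest, position + 1)
        return False

    return step((start,), 0)


BAL = {"Bal": [(), ("0", "Bal", "1")]}  # Bal -> ε | 0 Bal 1
print(pda_accepts(BAL, "Bal", "0011"), pda_accepts(BAL, "Bal", "001"))  # True False
```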
L(PDA) = L(CFG)
2. $(\forall M \in PDA)(\exists G \in CFG)\; L(M) = L(G)$
Reverse the process used in the proof of CFG $\to$ PDA, expressing
– stack content as the variable part of a production
– input as the prefix in a production
Example: Construct $G$ from $A$ (PDA Example A) as $S \to 0S1$, $S \to \varepsilon$
Properties of CFLs
• A class of languages is said to be closed under an operation if all results of applying the operation to members of the class are also in the class
• Closures:
- Kleene star: $(L \in CFL) \Rightarrow (L^* \in CFL)$
- Concatenation: $(L_1, L_2 \in CFL) \Rightarrow CF(L_1L_2)$
- Union: $(L_1, L_2 \in CFL) \Rightarrow CF(L_1 \cup L_2)$
• CFLs are not closed under intersection or complement
CFL is closed under concatenation
Proofs:
• Construct a PDA from two PDAs, similarly to the DFA construction that shows closure of RLs under concatenation
or
• Construct a CFG with the production $S \to S_1 S_2$, where $S_1$, $S_2$ are the start symbols of grammars that generate $L_1$ and $L_2$
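The grammar-based proof is short enough to write out as a construction. In this sketch a grammar is a dictionary from nonterminals to lists of production bodies; that encoding, the function name `concatenate`, and the two example grammars are assumptions, and the two grammars are required to use disjoint nonterminal names.

```python
def concatenate(g1, start1, g2, start2, new_start="S"):
    """Build a grammar for L1 L2 from grammars for L1 and L2.

    Each grammar maps a nonterminal to a list of bodies (tuples of symbols).
    """
    if set(g1) & set(g2) or new_start in g1 or new_start in g2:
        raise ValueError("nonterminal names must be distinct")
    combined = {new_start: [(start1, start2)]}  # the new production S -> S1 S2
    combined.update(g1)
    combined.update(g2)
    return combined


G1 = {"S1": [(), ("0", "S1", "1")]}  # generates 0^n 1^n
G2 = {"S2": [(), ("1", "S2")]}       # generates 1*
G = concatenate(G1, "S1", G2, "S2")
print(G["S"])  # [('S1', 'S2')]
```

Closure under union follows the same pattern with the two productions $S \to S_1$ and $S \to S_2$ in place of $S \to S_1 S_2$.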
Decidable properties of CFLs
• The following are algorithmically decidable, based on PDA or CFG representation:
– $Empty(L(M))$, in $O(n)$ time
– $x \in L(M)$, by the CYK (table) algorithm in $O(n^3)$ time
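A compact version of the CYK table algorithm is sketched below. It assumes the grammar is already in Chomsky normal form and is supplied as two lookup tables; that representation, the function name `cyk`, and the example CNF grammar for $0^n1^n$ with $n \ge 1$ (S → A B | A T, T → S B, A → 0, B → 1) are assumptions chosen for illustration, not taken from the slides.

```python
def cyk(word, unary, binary, start):
    """CYK membership test for a grammar in Chomsky normal form.

    unary:  dict terminal -> set of nonterminals A with a rule A -> terminal
    binary: dict (B, C)   -> set of nonterminals A with a rule A -> B C
    """
    n = len(word)
    if n == 0:
        return False  # rules of the forms A -> BC and A -> a derive no empty string
    # table[i][j] holds the nonterminals that derive word[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, symbol in enumerate(word):
        table[i][0] = set(unary.get(symbol, ()))
    for length in range(2, n + 1):            # substring length
        for i in range(n - length + 1):       # substring start
            for split in range(1, length):    # length of the left part
                for b in table[i][split - 1]:
                    for c in table[i + split][length - split - 1]:
                        table[i][length - 1] |= binary.get((b, c), set())
    return start in table[0][n - 1]


# Hypothetical CNF grammar for { 0^n 1^n | n >= 1 }
UNARY = {"0": {"A"}, "1": {"B"}}
BINARY = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}

print(cyk("0011", UNARY, BINARY, "S"), cyk("001", UNARY, BINARY, "S"))  # True False
```

The three nested loops over length, start and split point give the cubic dependence on the length of the input string.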
Pumping lemma for CFLs
• $(L \in CFL,\ L \text{ infinite}) \Rightarrow (\exists n)(\forall z \in L)\,(|z| \ge n \Rightarrow (\exists u, v, w, x, y)\,(z = uvwxy,\ |vwx| \le n,\ vx \ne \varepsilon,\ (\forall i \ge 0)\; uv^iwx^iy \in L))$
• In plain language, any sufficiently long string in an infinite CFL may be expressed in five parts, the second and fourth of which may be pumped with the resulting string also in L
• Example: Any string in $0^n1^n$ is easily segmented so that a string of zeros and a string of ones may be pumped in tandem
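As a worked instance of the example, one possible segmentation of a string of $0^m1^m$ is written out below. The choice of 2 as the pumping constant $n$ and the particular split are assumptions made for illustration; other splits work as well.

```latex
\[
z = 0^m 1^m \ (m \ge 1), \qquad
u = 0^{m-1},\quad v = 0,\quad w = \varepsilon,\quad x = 1,\quad y = 1^{m-1}
\]
\[
|vwx| = 2 \le n, \qquad vx = 01 \ne \varepsilon, \qquad
uv^iwx^iy = 0^{m-1+i}\,1^{m-1+i} \in L \quad \text{for all } i \ge 0
\]
```

Pumping adds one 0 and one 1 per repetition, so the counts stay equal and the zeros stay ahead of the ones.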
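About pumping lemma for CFLs
• Proof uses the fact that a grammar has finitely many nonterminals: in the parse tree of a sufficiently long string, some root-to-leaf path must repeat a nonterminal, and the part of the tree between the two occurrences can be repeated or removed
• Application: the pumping lemma for CFLs may be used to show that a language like $0^n1^n2^n$ (where $n \ge 1$) is not CF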
Example of use of PL for CFLs
• Example: $L = \{a^nb^nc^n\}$ is not CF
• Proof: Suppose $L$ is CF with pumping length $p$, and let string $s = a^pb^pc^p$, with $s$ in $L$, $|s| \ge p$
• Pumping lemma states that $CF(L), s \in L \Rightarrow s = uvxyz$ with $uv^ixy^iz$ in $L$ for all $i$
• Cases:
– $v$, $y$ (pumped) are each uniform (all a, all b, or all c): pumping changes the counts of at most two of the three symbols, hence even $uv^2xy^2z \notin L$
– $v$ or $y$ contains two different symbols; so pumping it produces bs before as or cs before bs
– Either case produces a contradiction, hence $L$ is not CF
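For one fixed value of p, the case analysis can be confirmed by brute force: enumerate every split of s into u, v, x, y, z that meets the length conditions and check whether pumping keeps the string in L. This is a sketch, not a proof, since the lemma's constant is unknown and the check covers only the chosen p; the function names `in_l` and `pumpable_splits` and the choice p = 4 are assumptions.

```python
def in_l(s):
    """Membership in L = { a^n b^n c^n | n >= 0 }."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n


def pumpable_splits(s, p):
    """Yield every split s = u v x y z with |vxy| <= p and vy nonempty
    for which u v^i x y^i z is still in L for both i = 0 and i = 2."""
    for start in range(len(s) + 1):
        for end in range(start, min(start + p, len(s)) + 1):
            for i1 in range(start, end + 1):
                for i2 in range(i1, end + 1):
                    u, v, x = s[:start], s[start:i1], s[i1:i2]
                    y, z = s[i2:end], s[end:]
                    if v + y and all(in_l(u + v * i + x + y * i + z) for i in (0, 2)):
                        yield (u, v, x, y, z)


p = 4
s = "a" * p + "b" * p + "c" * p
print(list(pumpable_splits(s, p)))  # []
```

The empty result says that no admissible split of this particular s survives pumping, which is what the two cases on the slide argue in general.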
Implications of the pumping lemma for CFLs
• Lemma is similar to that for RLs; any sufficiently long $x \in L$ for CFL $L$ may be rewritten $uvwyz$, with $uv^kwy^kz \in L$ for any $k \ge 0$
• Some non-CF languages like $a^nb^nc^n$ can be recognized by simple algorithms
• That a decidable language is not CF implies that a computational hierarchy of at least three levels exists: to recognize RLs, CFLs, and other decidable languages
References
Daniel I. A. Cohen. Introduction to Computer Theory, 2nd Ed. Wiley, 1997.
J. Hopcroft, R. Motwani, J. Ullman. Introduction to Automata Theory, Languages, and Computation, 3rd Ed. Addison-Wesley, 2007.
Peter Linz. An Introduction to Formal Languages and Automata, 4th Ed. Jones and Bartlett, 2006.
Michael Sipser. Introduction to the Theory of Computation. PWS, 2013.