CSE303 - Introduction to the Theory of Computation Sample Solutions for Exercises on Context-Free Languages and Pushdown Automata
1. Exercise 3.1.4

Let Σ be the alphabet {a, b, (, ), ⊘, ∪, ⋆}. The grammar G = (V, Σ, R, S) generates all strings that are regular expressions over {a, b}, where V = Σ ∪ {S} and

    R = {S → ⊘, S → a, S → b, S → (SS), S → (S ∪ S), S → S⋆}.

2. Exercise 3.1.9

(a) A suitable grammar is one with start symbol S and rules

    S → aSb
    S → aS
    S → e

(b) A suitable grammar is one with start symbol A1 and rules

    A1 → e        A2 → e        A3 → e        A4 → e
    A1 → aA1d     A2 → aA2c     A3 → bA3d     A4 → bA4c
    A1 → A2       A2 → A4       A3 → A4
    A1 → A3
    A1 → A4
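The grammar in part (b) can be checked mechanically on short strings. The following is a minimal sketch, not part of the original solution: it enumerates leftmost derivations breadth-first and compares the result with the set {a^m b^n c^p d^q : m + n = p + q} (the language this grammar is cited for in Exercise 3.5.1 (c)). The one-character encoding of A1 to A4 as "1" to "4" and the function names `generated` and `expected` are assumptions made for illustration.

```python
from collections import deque
from itertools import product

# Rules of the grammar from part (b); "" stands for the empty string e.
# Nonterminals A1..A4 are encoded as the characters "1".."4" (an encoding
# chosen here for illustration only).
RULES = {
    "1": ["", "a1d", "2", "3", "4"],
    "2": ["", "a2c", "4"],
    "3": ["", "b3d", "4"],
    "4": ["", "b4c"],
}

def generated(max_len):
    """Return all terminal strings of length <= max_len derivable from A1."""
    seen, words = {"1"}, set()
    queue = deque(["1"])
    while queue:
        form = queue.popleft()
        # Locate the leftmost nonterminal, if any.
        pos = next((i for i, ch in enumerate(form) if ch in RULES), None)
        if pos is None:
            words.add(form)
            continue
        for rhs in RULES[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            # Terminals never disappear, so forms with too many can be pruned.
            if sum(ch not in RULES for ch in new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

def expected(max_len):
    """All a^m b^n c^p d^q with m + n = p + q and total length <= max_len."""
    out = set()
    for m, n, p, q in product(range(max_len + 1), repeat=4):
        if m + n == p + q and m + n + p + q <= max_len:
            out.add("a" * m + "b" * n + "c" * p + "d" * q)
    return out
```

Comparing `generated(8)` with `expected(8)` is a quick way to catch a mistyped rule; the search terminates because every sentential form of this grammar contains at most one nonterminal.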
(d) A suitable grammar is one with start symbol S and rules

    S → ab        T → aa        T → aTb
    S → aTb       T → ab        T → bTa
    S → bTb       T → aTa       T → bTb
(f) A suitable grammar is one with start symbol S and rules

    S → aaSb
    S → aSb
    S → Sb
    S → e

3. Exercise 3.3.1

Let M be the given pushdown automaton.

(a) We first trace all possible sequences of transitions of M on input aba.

    State   Unread input   Stack   Comments
    s       aba            e
    s       ba             a
    s       a              aa
    s       e              aaa     Does not accept

    s       aba            e
    s       ba             a
    s       a              aa
    f       e              aa      Does not accept

    s       aba            e
    f       ba             e       Does not accept
(b) Part (a) shows that the string aba is not accepted by M. In a similar way one can show that the strings aa and abb are not accepted by M either. The string baa, on the other hand, is accepted, as the following computation shows:

    State   Unread input   Stack   Comments
    s       baa            e
    s       aa             a
    f       a              a
    f       e              e       Accepts

In a similar way one can show that the strings bab and baaaa are elements of L(M).
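The traces in parts (a) and (b) can be reproduced by a small simulator. The sketch below is an illustration under stated assumptions: the transition table `DELTA` is reconstructed from the verbal description of M in part (c), since the exercise statement is not reproduced in these solutions, and the names `accepts` and `check` are hypothetical. The stack is written with its top at the left.

```python
from collections import deque
from itertools import product

# Transition relation of M as described in part (c) (an assumption, since
# the exercise statement is not reproduced here). Each entry maps
# (state, input symbol, string popped) to (new state, string pushed);
# "" plays the role of e.
DELTA = [
    (("s", "a", ""), ("s", "a")),
    (("s", "b", ""), ("s", "a")),
    (("s", "a", ""), ("f", "")),
    (("f", "a", "a"), ("f", "")),
    (("f", "b", "a"), ("f", "")),
]
FINAL = {"f"}

def accepts(word):
    """Breadth-first search over configurations (state, unread input, stack)."""
    start = ("s", word, "")
    seen, queue = {start}, deque([start])
    while queue:
        state, rest, stack = queue.popleft()
        if state in FINAL and rest == "" and stack == "":
            return True
        for (p, sym, pop), (q, push) in DELTA:
            if p == state and rest.startswith(sym) and stack.startswith(pop):
                # Every transition of this M reads one symbol, so the search ends.
                nxt = (q, rest[len(sym):], push + stack[len(pop):])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False

def check(max_len):
    """Compare the simulator with the characterization given in part (c)."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if accepts(w) != (n % 2 == 1 and w[n // 2] == "a"):
                return False
    return True
```

With this table, `accepts("aba")` returns False and `accepts("baa")` returns True, in line with parts (a) and (b).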
(c) The automaton M accepts all strings of odd length with an a in the middle, i.e., L(M) = {uav ∈ {a, b}∗ : |u| = |v|}. Note that in state s the automaton M may read a symbol, a or b, and push one a onto the stack while remaining in state s (“phase 1”). Similarly, in state f the automaton M may read a symbol, a or b, and pop one a from the stack while remaining in state f (“phase 2”). In addition, M may change from s to f when reading an a, without changing the stack. The automaton can only end up with an empty stack if the number of symbols pushed in phase 1 is the same as the number of symbols popped in phase 2. This is only possible if the input string is of odd length and M changes from s to f when reading the middle symbol of the input, which consequently must be an a.

4. Exercise 3.3.2

(c) The language {w ∈ {a, b}∗ : w = w^R} is accepted by the pushdown automaton M = ({s, f}, Σ, Γ, ∆, s, {f}), where Σ = Γ = {a, b} and

    ∆ = {((s, a, e), (s, a)), ((s, b, e), (s, b)), ((s, e, e), (f, e)),
         ((s, a, e), (f, e)), ((s, b, e), (f, e)),
         ((f, a, a), (f, e)), ((f, b, b), (f, e))}.

(d) The language {w ∈ {a, b}∗ : w has twice as many b’s as a’s} is accepted by the pushdown automaton M = ({s, q, r, f}, Σ, Γ, ∆, s, {f}), where Σ = {a, b}, Γ = {a, b, c} and

    ∆ = {((s, e, e), (q, c)),
         ((q, a, e), (q, bb)), ((q, b, b), (q, e)), ((q, b, c), (r, bc)),
         ((r, a, bb), (r, e)), ((r, a, bc), (q, bc)), ((r, a, c), (q, bbc)),
         ((r, b, e), (r, b)),
         ((q, e, c), (f, e)), ((r, e, c), (f, e))}.

The symbol c is used to mark the bottom of the stack. The marker is put on the stack initially, when M changes from state s to state q (and before any input symbols have been read). It can be removed only when M changes to a final state. A nonempty stack will be of the form b^i c, where i ≥ 0. During the processing of the input the automaton is either in state q or in state r. If it is in state q, then 2n_a ≥ n_b, where n_σ denotes the number of occurrences of the symbol σ that have been read. The stack height in that case will be 2n_a − n_b + 1. If it is in state r, then 2n_a ≤ n_b, and the stack height will be n_b − 2n_a + 1.

5. Exercise 3.5.1

(a) The language L = {a^m b^n : m ≠ n} is the union of the two languages {a^m b^n : m > n} and {a^m b^n : m < n}, both of which are context-free (cf. Exercise 3.1.9 (a) for a very similar problem). Hence L is also context-free.

(c) The language L = {a^m b^n c^p d^q : n = q or m ≤ p or m + n = p + q} is the union of three languages, {a^m b^n c^p d^q : n = q}, {a^m b^n c^p d^q : m ≤ p}, and {a^m b^n c^p d^q : m + n = p + q}. To show that {a^m b^n c^p d^q : n = q} is context-free one can design a corresponding context-free grammar, using ideas similar to those in the solution of Exercise 3.5.14 (a) below. A suitable grammar for {a^m b^n c^p d^q : m ≤ p} consists of start symbol S and rules

    S → TD
    T → aTc
    T → Tc
    T → B
    B → bB
    B → e
    D → dD
    D → e
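As a quick sanity check of the grammar for {a^m b^n c^p d^q : m ≤ p}, here is one derivation written out step by step. The string abccd (m = 1, n = 1, p = 2, q = 1) is chosen purely as an illustration and is not part of the original solution.

```latex
S \Rightarrow TD \Rightarrow TcD \Rightarrow aTccD \Rightarrow aBccD
  \Rightarrow abBccD \Rightarrow abccD \Rightarrow abccdD \Rightarrow abccd
```

The rule T → Tc supplies the surplus c’s, and T → aTc pairs each a with a c, which is why m can never exceed p.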
The set {a^m b^n c^p d^q : m + n = p + q} was shown to be context-free in Exercise 3.1.9 (b).

(e) The language L = {w ∈ {a, b}∗ : w = w^R} is the union of the three languages {ww^R : w ∈ {a, b}∗}, {waw^R : w ∈ {a, b}∗}, and {wbw^R : w ∈ {a, b}∗}, all of which are context-free (cf. Exercises 3.1.3 (a) and (b)). Hence L is also context-free.

6. Exercise 3.5.2

(b) We prove the assertion by contradiction. Suppose the set L = {a^(n^2) : n ≥ 0} is context-free. By the version of the Pumping Theorem discussed in class, there exists an integer K > 0 such that every string w ∈ L of length at least K can be divided into five parts, w = uvxyz, where |vy| > 0, |vxy| ≤ K, and for each i ≥ 0, uv^i xy^i z ∈ L.

Let K be such an integer for L and let w be the string a^(K^2). We have w ∈ L and |w| = K^2 ≥ K. Suppose w is written as uvxyz, where |vy| > 0 and |vxy| ≤ K, and take the string w′ = uv^2 xy^2 z. Since |w′| = |w| + |vy| and 0 < |vy| ≤ |vxy| ≤ K, we get

    K^2 < |w′| ≤ K^2 + K = K(K + 1) < (K + 1)^2.

Thus |w′| lies strictly between two consecutive squares, so w′ ∉ L, which contradicts the condition that for each i ≥ 0, uv^i xy^i z ∈ L. We may infer that the set L is not context-free.

(d) See Example 3.5.4 for a solution.

7. Exercise 3.5.14

(a) The language L = {a^m b^n c^p : m = n or n = p or m = p} is the union of three sets L1, L2, and L3, where L1 is the set {a^m b^n c^p | m = n}, L2 the set {a^m b^n c^p | n = p}, and L3 the set {a^m b^n c^p | m = p}. The set L1 can be generated by a grammar with start symbol S and rules

    S → BC
    B → aBb
    B → e
    C → cC
    C → e
Similar context-free grammars can be designed for the sets L2 and L3. Thus, L is the union of context-free languages, and hence is context-free.
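The closure argument used here (a union of context-free languages is context-free) can be made concrete: given grammars for the parts, add a fresh start symbol with one rule per part. The sketch below is an illustration only. The dictionary representation, the names `union_grammar`, `G1`, `G2`, `G3`, and the particular grammars chosen for L2 and L3 are assumptions, not taken from the solutions.

```python
def union_grammar(grammars, new_start="S0"):
    """Combine (start, rules) pairs into one grammar for the union.

    Each rules dict maps a nonterminal to a list of right-hand sides, and a
    right-hand side is a list of symbols ([] stands for e). Nonterminals of
    the i-th grammar get the suffix _i so the rule sets cannot interfere.
    """
    combined = {new_start: []}
    for i, (start, rules) in enumerate(grammars, 1):
        rename = {nt: f"{nt}_{i}" for nt in rules}
        for nt, bodies in rules.items():
            # Terminals are not in `rename`, so they are left unchanged.
            combined[rename[nt]] = [
                [rename.get(x, x) for x in body] for body in bodies
            ]
        # One new rule per component: new_start -> start symbol of grammar i.
        combined[new_start].append([rename[start]])
    return new_start, combined

# L1 = {a^m b^n c^p : m = n}, the grammar given in the text.
G1 = ("S", {"S": [["B", "C"]], "B": [["a", "B", "b"], []], "C": [["c", "C"], []]})
# One possible grammar for L2 = {a^m b^n c^p : n = p} (an assumption).
G2 = ("S", {"S": [["A", "B"]], "A": [["a", "A"], []], "B": [["b", "B", "c"], []]})
# One possible grammar for L3 = {a^m b^n c^p : m = p} (an assumption).
G3 = ("S", {"S": [["a", "S", "c"], ["B"]], "B": [["b", "B"], []]})

START, RULES = union_grammar([G1, G2, G3])
```

Renaming the nonterminals before merging matters: all three component grammars use S and B, and without the suffixes their rules would mix and could generate strings outside the union.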
(b) The language {a^m b^n c^p : m ≠ n or n ≠ p or m ≠ p} is context-free as it is the union of three context-free sets, L1 = {a^m b^n c^p | m ≠ n}, L2 = {a^m b^n c^p | n ≠ p}, and L3 = {a^m b^n c^p | m ≠ p}. To see that these languages are context-free, note that L1 can be generated by a context-free grammar with start symbol S and rules

    S → TC
    T → aTb
    T → A
    T → B
    A → aA
    A → a
    B → bB
    B → b
    C → cC
    C → e
A similar grammar can be designed for L2, whereas a suitable grammar for L3 uses start symbol S and rules

    S → aSc
    S → AB
    S → BC
    A → aA
    A → a
    B → bB
    B → e
    C → cC
    C → c
(c) The set {a^m b^n c^p : m = n and n = p and m = p} is the same as the set {a^i b^i c^i : i ≥ 0}, which is not context-free; see Example 3.5.2.

(d) The specification of a set as consisting of all strings in {a, b, c}∗ that do not contain equal numbers of occurrences of a, b, and c is slightly ambiguous. Denoting by n_σ(w) the number of occurrences of a symbol σ in a string w, the intended set is probably

    {w ∈ {a, b, c}∗ : n_a(w) ≠ n_b(w) or n_a(w) ≠ n_c(w) or n_b(w) ≠ n_c(w)}.

This set is the union of three sets (cf. part (b) of this exercise), all of which are recognized by pushdown automata and hence are context-free. But the description may also be interpreted as denoting the set

    {w ∈ {a, b, c}∗ : n_a(w) ≠ n_b(w) and n_a(w) ≠ n_c(w) and n_b(w) ≠ n_c(w)},

which is not context-free. This can be proved by applying a stronger version of the Pumping Theorem known as “Ogden’s Lemma”; see the references for Chapter 3. For a specific solution see, for example, p. 130 in Introduction to Automata Theory, Languages, and Computation by John Hopcroft and Jeffrey Ullman.