Advanced Features of DCG

Lecture 14

Adding Extra Arguments 







The grammar rules considered so far are of a restricted kind. One useful extension allows a phrase type to take extra arguments.
One way to resolve the problem of number agreement is to duplicate the grammar rules for singular and plural under different names, that is, to say that there are two kinds of sentences:
– singular sentence
– plural sentence

For example:

    sentence  --> sing_sent.
    sentence  --> plur_sent.
    sing_sent --> sing_np, sing_vp.
    sing_np   --> s_det, sing_noun.
    sing_vp   --> sing_verb, np.
    sing_vp   --> sing_verb.
    np        --> sing_np.
    np        --> plur_np.

The rules for plur_sent are defined similarly. This is clearly not an elegant way of handling singular and plural sentences, since almost every rule has to be written twice.








These sentences have a lot of structure in common. A better way is to associate an extra argument with each phrase type, indicating whether it is singular or plural. In the grammar shown below, the argument M corresponds to the number of the entire sentence and M1 to the number of the noun phrase inside the verb phrase, which need not agree with M. The grammar incorporating number agreement arguments is rewritten as follows:

    sentence       --> sentence1(M).
    sentence1(M)   --> np(M), vp(M).
    np(M)          --> det(M), noun(M).
    vp(M)          --> verb(M).
    vp(M)          --> verb(M), np(M1).
    det(singular)  --> [a].
    det(singular)  --> [an].
    det( _ )       --> [the].
    noun(singular) --> [boy].
    noun(singular) --> [girl].
    noun(singular) --> [apple].
    noun(plural)   --> [apples].
    noun(plural)   --> [girls].
    noun(singular) --> [song].
    verb(singular) --> [sings].
    verb(plural)   --> [sing].
    verb(singular) --> [eats].
    verb(plural)   --> [eat].

Goal:

?- sentence([the, girl, sing, a, song], []).
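The goal above uses the plural verb 'sing' with the singular subject 'the girl', so it is expected to fail. The following is a minimal runnable sketch of the number-agreement grammar, assuming a Prolog system with the usual DCG translation and the phrase/2 predicate (for example SWI-Prolog); phrase(sentence, L) has the same effect as calling sentence(L, []). The variables _M and _M1 are written with a leading underscore only to avoid singleton warnings.

```prolog
% Number-agreement grammar from the text.
sentence     --> sentence1(_M).
sentence1(M) --> np(M), vp(M).
np(M)        --> det(M), noun(M).
vp(M)        --> verb(M).
vp(M)        --> verb(M), np(_M1).   % the object's number is independent of M

det(singular) --> [a].
det(singular) --> [an].
det(_)        --> [the].

noun(singular) --> [boy].
noun(singular) --> [girl].
noun(singular) --> [apple].
noun(plural)   --> [apples].
noun(plural)   --> [girls].
noun(singular) --> [song].

verb(singular) --> [sings].
verb(plural)   --> [sing].
verb(singular) --> [eats].
verb(plural)   --> [eat].

% ?- phrase(sentence, [the, girl, sings, a, song]).    succeeds
% ?- phrase(sentence, [the, girl, sing, a, song]).     fails (number mismatch)
% ?- phrase(sentence, [the, girls, sing, a, song]).    succeeds
```

Because the subject noun phrase and the verb phrase share the same variable M, unification alone enforces agreement; no separate singular and plural rule sets are needed.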






Note that adding an extra argument has introduced context sensitivity into a context-free grammar. A grammar of this type is called a definite clause grammar (DCG): unlike in a CFG, its nonterminal symbols can have arguments. Further arguments can be introduced to express other important information as well, such as:
– an extra argument that returns a parse structure for a syntactically correct sentence rather than simply answering 'yes' or 'no'.

Construction of Parse Structure 

Consider the following parse structure tree of the correct sentence 'the girl sings a song'.

    sent
        n_p
            d: the
            n: girl
        v_p
            verb: sings
            n_p
                d: a
                n: song

Coding of Parse Structure 

In Prolog, the above parse tree is coded as

    sent(n_p(d(the), n(girl)), v_p(v(sings), n_p(d(a), n(song))))

– Here sent, n_p, v_p, n, v, d are user-defined functor names representing sentence, noun phrase, verb phrase, noun, verb and determiner (the node labelled verb in the tree is written v).
– These names can be the same as the predicate names, but for the sake of clarity we use different names.


The parse structure tree P of a sentence is constructed as follows:
P = sent(NP, VP),
– where NP and VP are the parse structures of the noun phrase and the verb phrase respectively.
NP = n_p(D, N),
– where D and N are the parse structures of the determiner and the noun respectively.
VP = v_p(V, NP),
– where V and NP are the parse structures of the verb and the noun phrase in the verb phrase.



The grammar rules with argument P for the parse structure tree and M for the number of a sentence are given below:

    sentence(P)                --> sentence(M, P).
    sentence(M, sent(NP, VP))  --> np(M, NP), vp(M, VP).
    np(M, n_p(D, N))           --> det(M, D), noun(M, N).
    vp(M, v_p(V))              --> verb(M, V).
    vp(M, v_p(V, NP1))         --> verb(M, V), np(M1, NP1).
    det(singular, d(a))        --> [a].
    det( _ , d(the))           --> [the].
    noun(singular, n(girl))    --> [girl].
    noun(plural, n(girls))     --> [girls].
    noun(singular, n(song))    --> [song].
    verb(singular, v(sings))   --> [sings].
    verb(plural, v(sing))      --> [sing].

Goal: ?- sentence(P, [the, girl, sings, a, song], []).

Search tree (successful branch only; D1 and N1 are the fresh variables of the second use of the np rule):

?- sentence(M, P, [the, girl, sings, a, song], []).
    P = sent(NP, VP)
?- np(M, NP, [the, girl, sings, a, song], Z), vp(M, VP, Z, []).
    NP = n_p(D, N)
?- det(M, D, [the, girl, sings, a, song], Z1), noun(M, N, Z1, Z), …..
    D = d(the), Z1 = [girl, sings, a, song]
?- noun(M, N, [girl, sings, a, song], Z), vp(M, VP, Z, []).
    M = singular, N = n(girl), Z = [sings, a, song]
?- vp(singular, VP, [sings, a, song], []).
    VP = v_p(V, NP1)
?- verb(singular, V, [sings, a, song], Y), np(M1, NP1, Y, []).
    V = v(sings), Y = [a, song]
?- np(M1, NP1, [a, song], []).
    NP1 = n_p(D1, N1)
?- det(M1, D1, [a, song], X), noun(M1, N1, X, []).
    M1 = singular, D1 = d(a), X = [song]
?- noun(singular, N1, [song], []).
    N1 = n(song)
succeeds

Answer: P = sent(n_p(d(the), n(girl)), v_p(v(sings), n_p(d(a), n(song))))
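The grammar with parse-structure arguments can be tried out directly. The sketch below restates the rules in runnable form and assumes a Prolog system that provides phrase/2 (for example SWI-Prolog); _M and _M1 carry a leading underscore only to avoid singleton warnings.

```prolog
% Grammar with number (first argument) and parse structure (second argument).
sentence(P)               --> sentence(_M, P).
sentence(M, sent(NP, VP)) --> np(M, NP), vp(M, VP).
np(M, n_p(D, N))          --> det(M, D), noun(M, N).
vp(M, v_p(V))             --> verb(M, V).
vp(M, v_p(V, NP1))        --> verb(M, V), np(_M1, NP1).

det(singular, d(a)) --> [a].
det(_, d(the))      --> [the].

noun(singular, n(girl)) --> [girl].
noun(plural, n(girls))  --> [girls].
noun(singular, n(song)) --> [song].

verb(singular, v(sings)) --> [sings].
verb(plural, v(sing))    --> [sing].

% ?- phrase(sentence(P), [the, girl, sings, a, song]).
% P = sent(n_p(d(the), n(girl)), v_p(v(sings), n_p(d(a), n(song)))).
%
% ?- phrase(sentence(P), [the, girls, sing]).
% P = sent(n_p(d(the), n(girls)), v_p(v(sing))).
```

The first vp rule is tried first and fails for the transitive sentence because it leaves [a, song] unconsumed; Prolog then backtracks into the second vp rule, which is the branch followed in the search tree.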

Adding Extra Tests 







So far we have seen that the grammar rule translator (DCG handler) adds two extra arguments to each nonterminal of a rule when converting DCG grammar rules to Prolog clauses. Sometimes it is desirable to specify Prolog subgoals in DCG grammar rules. This is easily achieved by putting the Prolog subgoals inside curly brackets. At the time of conversion, the DCG handler leaves subgoals enclosed in { } unchanged and removes the brackets.
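As a small illustration of an extra test, the sketch below defines a hypothetical nonterminal digit//1 (not part of the original grammar) that accepts a single token only if it is an integer from 0 to 9. The second clause, digit_plain/3, is a hand-written Prolog clause of the kind the translator produces; the exact clause generated by a particular Prolog system may differ in form while having the same effect.

```prolog
% Hypothetical DCG rule with an extra test in curly brackets.
digit(D) --> [D], { integer(D), D >= 0, D =< 9 }.

% Hand-written equivalent: the two extra arguments are the input list
% and the remainder; the subgoals from { } appear unchanged in the body.
digit_plain(D, [D | Rest], Rest) :- integer(D), D >= 0, D =< 9.

% ?- phrase(digit(D), [7]).         D = 7.
% ?- phrase(digit(D), [42]).        fails (test in { } rejects 42)
% ?- digit_plain(D, [7, 8], R).     D = 7, R = [8].
```

The test runs after the token has been taken from the input, so D is already bound when integer/1 and the comparisons are called.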


This is useful when defining a lexicon. Suppose we want to add new nouns such as banana, apples and orange to the grammar specified earlier; we would write the following noun rules:

    noun(singular, n(banana)) --> [banana].
    noun(plural, n(apples))   --> [apples].
    noun(singular, n(orange)) --> [orange].

We notice that a lot of information has to be specified for each noun, even though we know that every noun occupies only one element of the input list and gives rise to a small parse tree with the functor 'n'.






A much more economical way is to express the common information about all nouns in one place and the information about each particular word somewhere else. The grammar keeps one abstract rule per word class, and the details of individual words go into the lexicon. The lexicon may be stored in a separate file, which may grow or shrink according to need. This file is loaded into the main program containing the grammar rules at execution time by the consult predicate.

Abstract DCG rule for noun:

    noun(M, n(N)) --> [N], { is_noun(M, N) }.

Equivalent Prolog rule:

    noun(M, n(N), [N | X], X) :- is_noun(M, N).

– Here, is_noun is a normal Prolog predicate used to express an individual word.
– The argument M represents the number of a noun and N represents the noun word.
– Curly brackets indicate that the subgoals inside them remain unchanged after translation from DCG rule to Prolog rule.

Lexicon Coding 

The nouns in the lexicon are specified as follows:

    is_noun(singular, banana).
    is_noun(plural, apples).
    is_noun(singular, orange).



Similarly, the abstract DCG rule for verbs and the lexicon for verbs are defined as follows:

    verb(M, v(V)) --> [V], { is_verb(M, V) }.

Equivalent Prolog rule:

    verb(M, v(V), [V | X], X) :- is_verb(M, V).

Lexicon for verbs:

    is_verb(singular, eats).
    is_verb(plural, eat).
    is_verb(singular, sings).
    is_verb(plural, sing).
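The abstract noun and verb rules can be exercised on their own. This is a minimal sketch using the lexicon entries given in the text, assuming a Prolog system that provides phrase/2.

```prolog
% Abstract lexical rules: one rule per word class.
noun(M, n(N)) --> [N], { is_noun(M, N) }.
verb(M, v(V)) --> [V], { is_verb(M, V) }.

% Lexicon: one fact per word form.
is_noun(singular, banana).
is_noun(plural, apples).
is_noun(singular, orange).

is_verb(singular, eats).
is_verb(plural, eat).
is_verb(singular, sings).
is_verb(plural, sing).

% ?- phrase(noun(M, T), [apples]).    M = plural,   T = n(apples).
% ?- phrase(verb(M, T), [eats]).      M = singular, T = v(eats).
% ?- phrase(noun(M, T), [eats]).      fails (eats is not in the noun lexicon)
```

Adding a word now means adding one is_noun or is_verb fact; the grammar rules themselves stay untouched.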






Here we notice that each noun or verb is still specified separately as singular and plural, even though the two tokens differ only by a few characters added or removed at the end, e.g., banana / bananas, eat / eats. Conversion from singular to plural or vice versa could be handled by morphological rules; in that case one needs to specify only one form of each token. For the sake of simplicity we include both forms in the lexicon.
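One possible shape for such morphological rules is sketched below. It is deliberately naive and is not part of the grammar developed in this lecture: it assumes that only one stem per word is stored, in the hypothetical predicates noun_stem/1 and verb_stem/1, and that the other form is always obtained by appending 's'. Irregular forms are not handled.

```prolog
% Hypothetical stem tables: one entry per word.
noun_stem(banana).
noun_stem(girl).
noun_stem(song).

verb_stem(eat).
verb_stem(sing).

% Nouns: the stem is singular, stem + s is plural.
is_noun(singular, N) :- noun_stem(N).
is_noun(plural, P)   :- noun_stem(N), atom_concat(N, s, P).

% Verbs: the stem is plural, stem + s is (third person) singular.
is_verb(plural, V)   :- verb_stem(V).
is_verb(singular, S) :- verb_stem(V), atom_concat(V, s, S).

% ?- is_noun(M, girls).    M = plural.
% ?- is_verb(M, eats).     M = singular.
% ?- is_noun(M, girl).     M = singular.
```

Since these definitions have the same name and arity as the lexicon facts, they could stand in for the facts without any change to the abstract noun and verb rules.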

Complete DCG grammar with abstract rules

    sentence(P)                --> sentence(M, P).
    sentence(M, sent(NP, VP))  --> np(M, NP), vp(M, VP).
    np(M, n_p(D, N))           --> det(M, D), noun(M, N).
    vp(M, v_p(V))              --> verb(M, V).
    vp(M, v_p(V, NP1))         --> verb(M, V), np(M1, NP1).
    noun(M, n(N))              --> [N], { is_noun(M, N) }.
    verb(M, v(V))              --> [V], { is_verb(M, V) }.
    det(M, d(D))               --> [D], { is_det(M, D) }.



Lexicon (can be stored separately in a file); facts for the same predicate are kept together:

    is_noun(singular, girl).
    is_noun(plural, girls).
    is_noun(singular, song).
    is_noun(singular, banana).
    is_noun(plural, apples).
    is_noun(singular, orange).

    is_det( _ , the).
    is_det(singular, a).
    is_det(singular, an).

    is_verb(singular, eats).
    is_verb(plural, eat).
    is_verb(singular, sings).
    is_verb(plural, sing).
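Putting the abstract rules and the lexicon together gives a small program that can be loaded and queried. The sketch below keeps everything in one file for convenience and assumes a Prolog system that provides phrase/2; _M and _M1 carry a leading underscore only to avoid singleton warnings.

```prolog
% Grammar with abstract lexical rules.
sentence(P)               --> sentence(_M, P).
sentence(M, sent(NP, VP)) --> np(M, NP), vp(M, VP).
np(M, n_p(D, N))          --> det(M, D), noun(M, N).
vp(M, v_p(V))             --> verb(M, V).
vp(M, v_p(V, NP1))        --> verb(M, V), np(_M1, NP1).
noun(M, n(N))             --> [N], { is_noun(M, N) }.
verb(M, v(V))             --> [V], { is_verb(M, V) }.
det(M, d(D))              --> [D], { is_det(M, D) }.

% Lexicon.
is_noun(singular, girl).
is_noun(plural, girls).
is_noun(singular, song).
is_noun(singular, banana).
is_noun(plural, apples).
is_noun(singular, orange).

is_det(_, the).
is_det(singular, a).
is_det(singular, an).

is_verb(singular, eats).
is_verb(plural, eat).
is_verb(singular, sings).
is_verb(plural, sing).

% ?- phrase(sentence(P), [the, girls, eat, an, orange]).
% P = sent(n_p(d(the), n(girls)), v_p(v(eat), n_p(d(an), n(orange)))).
%
% ?- phrase(sentence(P), [a, girls, sing]).
% fails: the determiner a is singular, the noun girls is plural.
```

Note that the lexicon treats a and an alike, so this grammar checks number agreement only and would also accept 'a orange'.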