THE LEXICON-GRAMMAR OF PORTUGUESE PREDICATE NOUNS WITH SER DE IN PORT4NOOJ
Cristina Mota1 Jorge Baptista1,2 Anabela Barreiro1
1 INESC-ID, Lisbon 2 Universidade do Algarve
NooJ International Conference - Palermo, Italy, 20-22 June 2018
Port4NooJ 3.1
eSPERTo Smart Paraphrasing System

Port4NooJ Genesis 2009: OpenLogos
✓ Bilingual PT-EN
✓ Morpho-syntactic Relations
✓ Semantico-syntactic Properties (SAL)

✓ Derivational Relations
✓ Support Verb Constructions
✓ Semantic Relations
✓ SentiLex 2016
✓ Stencil NER 2016

LG of Human Intransitive Adjectives 2015
LG of Support Verb Fazer 2017
LG of Support Verb ser de 2018
eSPERTo Paraphrasing Applications
QA and Summarization Applications
• Edgar, Virtual QA Agent (Fialho et al. 2013)
• SSNT Summarization (Ribeiro 2011)
eSPERTo Interface https://esperto.l2f.inesc-id.pt/esperto/esperto/demo.pl
Interactive language learning application – helps learners produce and revise their texts
Predicate nouns with Vsup ser de
• Lexicon-grammar of 2,085 predicate nouns that co-occur in constructions with the support verb ser de ('be of') (Baptista, 2005)
  – Many correspond to adjectival predicative constructions; in those cases they are linked to the corresponding adjective in the LG table
  – Classified into 9 classes according to:
    • the number of arguments (1 or 2) selected by the predicate noun
    • the syntactic (sentential/nominal) constraints
    • the distributional (semantic) selection constraints on the nominal argument slots (human/non-human)
    • Two special classes were established for:
      – nouns selecting a body-part noun as their subject
      – symmetrical constructions
• The process of integrating the LG of predicates with Vsup ser de into Port4NooJ was very similar to that of integrating the LG of predicates with Vsup fazer
Predicate nouns with Vsup ser de
Classification Criteria: 9 classes

Ser de
  N0 ser de N               N0 ser de N Prep N1
  N0 =: Nhum                N0 =: Nhum
  N0 =: Nnhum               N0 =: Nnhum
  N0 =: Que F0              N0 =: Que F0
  N0 =: Npc de Nhum         N1 =: Que F
  N1 ser de N Prep N0
Predicate nouns with Vsup ser de
Classes and number of predicate nouns per class

  N0 ser de N               N0 ser de N Prep N1
  SdH1    328   19%         SdH2     54    3%
  SdNH1   363   17%         SdNH2    30    1%
  SdQ0    820   39%         SdQ1    308   15%
  SdNPC    30    1%         SdQ2     37    2%
  SdSIM    55    3%
Transformations based on noun predicates with Vsup ser de

Negation: paraphrases
[in-N]
O Pedro é de uma certa intolerância à lactose
Pedro is of a certain intolerance to lactose
[falta de N] ([lack of N])
O Pedro é de uma certa falta de tolerância à lactose
Pedro is of a certain lack of tolerance to lactose

Vsup substitution: paraphrases
[Vsup=ser de]
A Ana é de uma alegria contagiante
Ana is of a contagious happiness
[Vsup=ter]
A Ana tem uma alegria contagiante
Ana has a contagious happiness
[Vsup=haver]
Há na Ana uma alegria contagiante
There is a contagious happiness in Ana
[Vsup=ser de]
A Ana é de uma alergia ao pó impressionante
Ana is of an impressive allergy to dust
[Vsup=fazer]
A Ana faz uma alergia ao pó impressionante
Ana makes an impressive allergy to dust
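The support-verb substitution can be illustrated with a small template sketch. This is not the NooJ grammar used in Port4NooJ; the function name, the contraction table and the fixed third-person present verb forms are assumptions made only for illustration, while the property names Vsupter and Vsuphaver come from the dictionary entries in this deck.

```python
# Hypothetical helper: generate Vsup paraphrases for "Det Subj é de NP".
# Verb forms are hard-coded (3rd person singular, present) as a simplification.
CONTRACTIONS = {("em", "a"): "na", ("em", "o"): "no"}  # em + article


def vsup_paraphrases(det, subject, noun_phrase, props):
    """Return paraphrases licensed by the Vsupter / Vsuphaver properties."""
    paraphrases = []
    if "Vsupter" in props:
        paraphrases.append(f"{det} {subject} tem {noun_phrase}")
    if "Vsuphaver" in props:
        prep = CONTRACTIONS[("em", det.lower())]
        paraphrases.append(f"Há {prep} {subject} {noun_phrase}")
    return paraphrases


result = vsup_paraphrases(
    "A", "Ana", "uma alegria contagiante",
    {"Vsupter": "ter", "Vsuphaver": "haver"},
)
print(result)
```

A noun without the Vsuphaver property would only yield the ter paraphrase, which is how the LG properties constrain the output.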
Integration of LG of Vsup ser de – Major challenges
• 50% of the predicate nouns already exist in Port4NooJ
  → Old news: partly addressed since we started integrating the LG with Vsup fazer
    - consolidate information from old entry and LG table
    - solution is far from perfect
    - needs thorough revision
• In 55% of the cases where the predicate noun has an equivalent adjectival construction, the adjective is a homograph of a human intransitive adjective (HIA) already formalized in the LG of HIA
  → New problem! Adjectives equivalent to predicate nouns are being treated by derivation. It is not yet clear how to harmonize those derived entries with the HIA entries.
Integration of LG of PT Vsup ser de – From LG tables to NooJ dictionaries
• Mostly done automatically with various scripts (different from the scripts used to integrate the tables of Human Intransitive Adjectives)

Inputs:
• LG tables
• Port4NooJ
  – Current version (CV)
  – Version before removing Npred that derive from verbs (OV)

Steps:
✓ Check if noun exists in Port4NooJ:
  ✓ If noun is neither in CV nor in OV
    § Create new entry
  ✓ If noun in CV and (not in OV or CV=OV entry)
    § Merge1 the LG properties with current entry
  ✓ If noun in OV only or (CV≠OV entry)
    § Merge2 the LG properties with old entry
    § Remove nominalization from CV
✓ Create FLX and DRV codes and corresponding rules as needed
✓ Check for missing FLX and DRV codes

Output: npred_vsupserde
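The three-way decision above can be written as a small function. This is a minimal sketch, not the authors' scripts: it assumes each noun has at most one entry string per dictionary version (None when absent), and the name choose_action is hypothetical.

```python
from typing import Optional


def choose_action(cv_entry: Optional[str], ov_entry: Optional[str]) -> str:
    """Pick the integration step for one LG noun.

    cv_entry: the noun's entry in the current version (CV), or None.
    ov_entry: the noun's entry in the old version (OV), or None.
    """
    if cv_entry is None and ov_entry is None:
        return "create"   # new entry built from the LG table
    if cv_entry is not None and (ov_entry is None or cv_entry == ov_entry):
        return "merge1"   # add LG properties to the current entry
    # Noun in OV only, or CV and OV entries differ:
    return "merge2"       # merge with old entry, remove nominalization from CV


print(choose_action(None, None))
print(choose_action("aprumo,N+FLX=ANO", "aprumo,N+FLX=ANO"))
print(choose_action(None, "capricho,N+FLX=ANO+Npred"))
```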
Integration of LG of PT Vsup ser de – From LG tables to NooJ dictionaries
• Representation of LG table properties
+Npred+Vsup=ser+Table=SdH1
  (Grammars take care of de)
+N0Nhum
+PfxNeg=in
+Det…
+Vsup…
+Vsupter=ter
+VsupteroNdeVinf0w=ter
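To make the property notation concrete, the sketch below splits a dictionary line into lemma, category and properties. It assumes the simple "lemma,CATEGORY+prop+prop=value" shape seen in the entries of this deck and is not NooJ's own parser; parse_entry is a hypothetical name, and the sample line is a shortened version of the airosidade entry.

```python
def parse_entry(line):
    """Split 'lemma,CAT+flag+key=value...' into (lemma, category, properties)."""
    lemma, _, rest = line.partition(",")
    category, *features = rest.split("+")
    properties = {}
    for feature in features:
        key, sep, value = feature.partition("=")
        # Bare flags such as +Npred are stored as True.
        properties[key] = value if sep else True
    return lemma, category, properties


lemma, category, props = parse_entry(
    "airosidade,N+FLX=CASA+Npred+Vsup=ser+Table=SdH1+N0Nhum+DRV=N2A5:ALTO"
)
print(lemma, category, props["Table"], props["DRV"])
```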
Integration of LG of PT Vsup ser de – From LG tables to NooJ dictionaries
• Representation of LG table properties
+Prep…
+N1…
+DRV=N2A5:ALTO
• The DRV code is determined and formalized automatically by finding the radical shared by the noun and the verb or adjective, which are listed in a separate file
  activ(idade) => N2A5 = o/A
• The FLX code of the derived word is determined by consulting Port4NooJ
  activo,A+FLX=ALTO+AV+state+EN=brisk+DRV=AVDRV01:RAPIDAMENTE
• If the derived form does not exist, its code is assigned automatically
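Finding the radical can be sketched as a longest-common-prefix computation. The function name and the returned tuple are assumptions for illustration; how the resulting rule is numbered (N2A5, N2A16, ...) is not covered here.

```python
import os


def derivation_rule(noun, derived, category):
    """Return (radical, noun ending to drop, 'suffix/CATEGORY' to add)."""
    # os.path.commonprefix compares character by character.
    radical = os.path.commonprefix([noun, derived])
    dropped = noun[len(radical):]
    added = derived[len(radical):]
    return radical, dropped, f"{added}/{category}"


rule = derivation_rule("actividade", "activo", "A")
print(rule)
```

For actividade and activo this yields the radical activ, the dropped ending idade and the rule body o/A, matching the example on the slide.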
Integration of LG of PT Vsup ser de – From LG tables to NooJ dictionaries
• Integration with eSPERTo dictionary entries
① Noun not in Port4NooJ (old or current):
✓ Create new entry:
  ✓ FLX code is assigned automatically given the ending of the word
  ✓ Entries are checked for missing FLX codes and reviewed by a linguist
  ✓ All other properties come from the LG table
✓ Add entry to new standalone dictionary npred_vsupserde.dic

airosidade,N+FLX=CASA+Npred+Vsup=ser+Table=SdH1+N0Nhum+N0Npabst+N0Nclasspessoa+DetEModif+DetUMModif+Vsupter=ter+VsupteroNdeVinf0w=ter+DRV=N2A5:ALTO
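Assigning an FLX code from the word ending can be sketched as a longest-suffix lookup. The mapping below is an illustrative assumption built only from the example entries in this deck (airosidade/CASA, avidez/LUZ, aprumo and capricho/ANO); the real scripts presumably use a much larger table.

```python
# Hypothetical ending-to-paradigm table, inferred from the slide examples only.
ENDING_TO_FLX = {"idade": "CASA", "ez": "LUZ", "o": "ANO"}


def guess_flx(word, table=ENDING_TO_FLX):
    """Return the FLX code of the longest matching ending, or None."""
    for ending in sorted(table, key=len, reverse=True):
        if word.endswith(ending):
            return table[ending]
    return None  # left for the missing-code check and linguist review


for noun in ("airosidade", "avidez", "capricho"):
    print(noun, guess_flx(noun))
```

Returning None rather than a default keeps unassigned nouns visible for the review step.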
Integration of LG of PT Vsup ser de – From LG tables to NooJ dictionaries
• Integration with eSPERTo dictionary entries
② Noun exists both in current and old Port4NooJ
A. If entries are the same do Merge 1:
✓ Blindly add additional properties as specified by the LG tables to current entries
✓ Add merged entries to npred_vsupserde.dic

Current entry:
aprumo,N+FLX=ANO+AB+prop+EN=aplomb
Properties added from the LG table:
+Npred+Vsup=ser+Table=SdH1+Negfaltade+N0Nhum+N0Npc+N0Npabst+N0Nclasspessoa+DetEModif+DetUMModif+Vsupter=ter+VsupteroNdeVinf0w=ter+Vsuphaver=haver+DRV=N2A16:ALTO
Merged entry:
aprumo,N+FLX=ANO+AB+prop+EN=aplomb+Npred+Vsup=ser+Table=SdH1+Negfaltade+N0Nhum+N0Npc+N0Npabst+N0Nclasspessoa+DetEModif+DetUMModif+Vsupter=ter+VsupteroNdeVinf0w=ter+Vsuphaver=haver+DRV=N2A16:ALTO
Integration of LG of PT Vsup ser de – From LG tables to NooJ dictionaries
• Integration with eSPERTo dictionary entries
② Noun exists both in current and old Port4NooJ
B. If entries are not the same do Merge 2 with old entries, as in case ③:
✓ Remove previous Npred-related properties
✓ Blindly add additional properties as specified by the LG tables to old entries
✓ Add merged entries to npred_vsupserde.dic
✓ Remove nominalization from CV

Entries in CV:
avidez,N+FLX=LUZ+AB+qual+EN=avidity
avidez,N+FLX=LUZ+AB+qual+EN=greed
Entries in OV:
avidez,N+FLX=LUZ+AB+strvb+Npred+Nom+EN=acquisitiveness+VRB=ansiar
Properties added from the LG table:
+Npred+Vsup=ser+Table=SdQ0+N0Nhum+N0NpreddeN+N0NopQueF+N0RestrNopQueF+N0QueFconj+N0OfactodeVinf0w+N0N0Vinf0w+N0RestrVinf0w+N0Nclass+N0Nclasspessoa+DetEModif+DetUMModif+Vsupter=ter+Vsuphaver=haver+DRV=N2A18:ALTO
Merged entry:
avidez,N+FLX=LUZ+AB+strvb+Npred+Nom+EN=acquisitiveness+VRB=ansiar+Npred+Vsup=ser+Table=SdQ0+N0Nhum+N0NpreddeN+N0NopQueF+N0RestrNopQueF+N0QueFconj+N0OfactodeVinf0w+N0N0Vinf0w+N0RestrVinf0w+N0Nclass+N0Nclasspessoa+DetEModif+DetUMModif+Vsupter=ter+Vsuphaver=haver+DRV=N2A18:ALTO
Integration of LG of PT Vsup ser de – From LG tables to NooJ dictionaries
• Integration with eSPERTo dictionary entries
③ Noun exists only in old Port4NooJ
✓ Do Merge 2 with old entries as shown in Case 2-B:
  ✓ Remove previous Npred-related properties
  ✓ Blindly add additional properties as specified by the LG tables to old entries
  ✓ Add merged entries to npred_vsupserde.dic
  ✓ Remove nominalization from CV

Old entry:
capricho,N+FLX=ANO+AB+strvb+Npred+Nom+EN=caprice+VRB=caprichar
Old entry after removing the previous Npred-related properties:
capricho,N+FLX=ANO+AB+strvb+EN=caprice
Properties added from the LG table:
+Npred+Vsup=ser+Table=SdH1+N0Nhum+N0Npabst+N0Nclasspessoa+DetE+Vsupter=ter+VsupserumNdpdNhum=ser+Vsuphaver=haver+DRV=N2A25:ALTO
Merged entry:
capricho,N+FLX=ANO+AB+strvb+EN=caprice+Npred+Vsup=ser+Table=SdH1+N0Nhum+N0Npabst+N0Nclasspessoa+DetE+Vsupter=ter+VsupserumNdpdNhum=ser+Vsuphaver=haver+DRV=N2A25:ALTO
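Merge 2 on the capricho entry can be reproduced with a short sketch. The set of properties treated as "Npred related" (Npred, Nom, VRB) is an assumption read off this one example, and merge2 is a hypothetical name rather than the authors' script.

```python
def merge2(old_entry, lg_properties, npred_related=("Npred", "Nom", "VRB")):
    """Strip Npred-related properties from the old entry, then append LG ones."""
    head, *features = old_entry.split("+")
    kept = [f for f in features if f.partition("=")[0] not in npred_related]
    return "+".join([head] + kept) + lg_properties


old = "capricho,N+FLX=ANO+AB+strvb+Npred+Nom+EN=caprice+VRB=caprichar"
lg = ("+Npred+Vsup=ser+Table=SdH1+N0Nhum+N0Npabst+N0Nclasspessoa+DetE"
      "+Vsupter=ter+VsupserumNdpdNhum=ser+Vsuphaver=haver+DRV=N2A25:ALTO")
merged = merge2(old, lg)
print(merged)
```

The LG property string is appended as-is, in the spirit of the "blindly add" step; no consistency check against the remaining old properties is attempted.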
Integration of LG of PT Vsup ser de – From LG to NooJ grammars
• New grammars to paraphrase constructions based on specific LG properties, such as the paraphrase of negative constructions
  → Use of more than one LG property: PfxNeg & Negfaltade

O Pedro é de uma grande falta de sinceridade
Pedro is of a great lack of sincerity
Paraphrases:
Pedro is not of a great sincerity
Pedro is of a great insincerity
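How the two properties combine can be sketched at the level of the noun phrase. This is only an illustration: the function name is hypothetical, the prefix is concatenated without any spelling adjustment, and the actual paraphrases are produced by NooJ grammars rather than string operations.

```python
def negation_variants(noun, props):
    """List the negative forms of a predicate noun licensed by its properties."""
    variants = []
    if "PfxNeg" in props:
        # e.g. PfxNeg=in: sinceridade -> insinceridade
        variants.append(props["PfxNeg"] + noun)
    if props.get("Negfaltade"):
        # falta de N ('lack of N')
        variants.append("falta de " + noun)
    return variants


variants = negation_variants("sinceridade", {"PfxNeg": "in", "Negfaltade": True})
print(variants)
```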
Integration of LG of PT Vsup ser de – From LG to NooJ grammars
• New grammars to paraphrase constructions based on specific LG properties, such as restructuring with a possessive pronoun
  → Unidirectional paraphrase: rephrasing the possessive with the appropriate noun phrase needs larger context and more complex analysis

(Fazer isso) é do interesse do Pedro
(To do this) is of interest to Pedro
Paraphrase:
(To do this) is of his interest
Integration of LG of PT Vsup ser de – From LG to NooJ grammars
• New grammars to paraphrase constructions based on specific LG properties that also exist in other LG grammars, such as the substitution of the support verb by another verb
  → Likely to be common to the three lexicon-grammars, or at least to have shared sub-graphs

O Pedro é de um certo altruísmo
Pedro is of a certain altruism
Preliminary Results
• 2132 predicate nouns with Vsup ser de (1376 different noun lemmas)
  – An additional 797 entries await revision of the inflectional codes of derived adjectives, or have format problems, before they can be added to the final dictionary
• 450 new derivational paradigms, but there might be overlap with paradigms created when integrating the LG of Vsup fazer
• Example grammars for the syntactic parser
• Half of the nouns already existed in Port4NooJ (50%)
  → 6% increase in nominal entries and 20% increase in predicate nouns

Table  Example                                                        In Port4NooJ   New  % In
SdH1   O Zé é de uma alegria contagiante                                       183   208   47%
SdH2   O Zé é da confiança da Ana                                               41    14   75%
SdNH1  Este molho é de uma acidez exagerada                                    153   211   42%
SdNH2  Esta substância é de uma total indissolubilidade em água                 16    15   52%
SdNPC  O rosto da Ana era de uma palidez doentia                                 7    24   23%
SdQ0   Essa medida é de grande abrangência                                     309   512   38%
SdQ1   O Zé foi de uma agressividade desproporcionada para com a Ana           162   147   52%
SdQ2   O Zé é de uma grande habilidade para tratar das roseiras                 22    16   58%
SdSIM  O Zé e a Ana são de um companheirismo exemplar                           34    22   61%
Total                                                                          927  1169   50%
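The per-table "% In" figures appear to be the share of each table's nouns that were already in Port4NooJ, that is, In / (In + New). The sketch below recomputes the nine per-table rows under that assumption; the variable and function names are hypothetical.

```python
# (already in Port4NooJ, new) per LG table, copied from the results table.
COUNTS = {
    "SdH1": (183, 208), "SdH2": (41, 14), "SdNH1": (153, 211),
    "SdNH2": (16, 15), "SdNPC": (7, 24), "SdQ0": (309, 512),
    "SdQ1": (162, 147), "SdQ2": (22, 16), "SdSIM": (34, 22),
}


def percent_in(existing, new):
    """Share of nouns already in the dictionary, as a rounded percentage."""
    return round(100 * existing / (existing + new))


shares = {table: percent_in(*pair) for table, pair in COUNTS.items()}
for table, share in shares.items():
    print(f"{table}: {share}%")
```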
Next Steps
Consolidate Dictionaries
Review Review Review
Build LG Paraphrasing Grammars
Integrate new LG
Thank you! Grazie!