Semantic Model for Improving the Performance of Natural Language Interfaces to Databases

Rodolfo A. Pazos R.1, Juan J. González B.1, Marco A. Aguirre L.1

1 Instituto Tecnológico de Ciudad Madero, Mexico
[email protected], [email protected], [email protected]

Abstract. Although many Natural Language Interfaces to Databases (NLIDBs) have been developed since the late 60s, many problems persist that prevent the translation from natural language to SQL from being totally successful. Some of the main problems that have been encountered relate to 1) achieving domain independence, 2) the use of words or phrases of different syntactic categories for referring to tables and columns, and 3) semantic ellipsis. This paper introduces a new method for modeling databases that includes relevant information for improving the performance of NLIDBs. The method is intended to help solve many problems found in the translation from natural language to SQL by using a database model that contains linguistic information and therefore provides more semantic information than conventional database models (such as the extended entity-relationship model) and the models used in previous NLIDBs.

Keywords: Natural Language Interfaces to Databases, semantic modeling.

1 Introduction

In the last decades information has played an important role in our daily life; most people request information before making an important decision. Currently, the largest sources of information are stored in databases. In order to obtain information from a database, a user needs to formulate a query in such a way that the computer can interpret it and generate the correct answer (usually in a query language such as SQL). Unfortunately, only computing professionals can formulate queries in this way. The normal way in which people request information is through questions in natural language, but computers cannot directly understand this kind of language. Therefore, natural language interfaces to databases (NLIDBs) emerge as an alternative to several problems that occur in systems for obtaining information, since they permit users to access information stored in databases through a query in natural language. The need for NLIDBs has become important nowadays, because many users request access to information from different types of systems [1] [2].

Most of the NLIDBs that have achieved good results (around 95% of queries correctly translated) have been domain dependent; i.e., they can be used to query only one database. Since the 90s much effort has been devoted to developing domain-independent NLIDBs; i.e., those that can be used for different databases. However, their limitations and deficiencies are worth noticing: they are reflected in a success rate of 80-90%, which is unsatisfactory for critical applications. Some commercial NLIDBs exemplify this situation: LanguageAccess (developed by IBM) was discontinued, English Query (developed by Microsoft) was included for the last time in SQL Server ver. 8.0, released in 2000, and English Wizard (developed by Linguistic Technology Corporation) was discontinued several years ago [3].

A survey of the literature on NLIDBs, as well as tests of some prototypes, reveals that the systems developed up to now have limitations in two aspects [1]: it is difficult to port them from one domain to another (i.e., from one database to another), and they fail with some queries, either because they misinterpret the queries or because they are unable to respond even though an answer to the query exists. Understanding a query in natural language is undoubtedly a hard task for a computer due to several problems, such as semantic ellipsis and the use of words of different syntactic categories (nouns, verbs, adjectives and prepositions) for referring to tables and columns of the databases.

In this paper we propose a new semantic model for databases based on an analysis of queries to three databases (ATIS [4], Northwind and Pubs), from which many problems present in those queries were identified and classified (Section 3). The modeling methodology on which our model is based is called semantically enriched model (SEM), and it aims at including semantic information (besides structural information of the database) that helps the semantic analysis of queries.
This new model will constitute a key element for designing a data dictionary for a new version of a NLIDB developed by us (whose description can be found in [5] [6]).

2 Background

The use of natural language interfaces for accessing databases dates back to the late 60s and the early 70s. Though many dozens of NLIDBs have been developed since then, in this survey we consider only the most recent NLIDBs that include/exclude three characteristics: domain independence, database modeling and learning capacity.

CoBase [7] generates candidate queries based on the user input through a search algorithm in a semantic graph. This search is organized according to a metric of probabilistic information. This system uses an incremental method that helps users to formulate complex queries through a series of simple queries.

NLPQC [8] is a NLIDB that is used in the virtual library CINDI. This system consists of a preprocessor and a run-time module. The preprocessor constructs a conceptual knowledge base of the database schema using WordNet, which is used at run time for a semantic analysis of the input according to several predefined templates and for constructing the SQL expression.

WYSIWYM [9] uses a semantic graph inspired by the one described in [7]. The NLIDB helps users to formulate queries using predefined frames through a query interface where the user completes the frame through a list of search values for the columns involved in the query.

C-PHRASE [10] has an authoring tool for editing the data dictionary of the NLIDB in order to customize it for a specific domain. The query analysis centers mainly on recognizing nominal phrases. C-PHRASE uses a semantic grammar based on rules and patterns, which are used for translating the query to SQL.

In [11] a NLIDB that uses Sequence and Tree Kernels (STKs) and some variants is presented. In this system data sets are constructed that consist of pairs of queries in natural language and SQL expressions. The NLIDB represents these pairs through syntactic trees, and uses kernel functions and support vector machines for carrying out the translation from natural language to SQL.

From the surveyed literature, it can be observed that most of the NLIDBs have been tested with one or a few databases, which does not fully prove their domain independence. On the other hand, only a few NLIDBs use dialogue managers for clarifying queries and learning capabilities, which are features that may improve the performance of the NLIDB. Most of the NLIDBs developed up to now have focused on solving existing complex problems; however, for solving them developers have implemented NLIDBs that involve some degree of difficulty in their use, and very few have taken care in defining a formal model of the database that helps achieve a better translation process.

Table 1 shows a comparison of the NLIDBs described above with respect to five characteristics: domain independence, use of a formal DB modeling process, learning capability, dialogues for clarifying queries and success rate (i.e., the percentage of correctly translated queries).
The second column shows the characteristics that the new version of our NLIDB aims at.

Table 1. Comparison of characteristics of important NLIDBs

Characteristics | New version of our NLIDB | CoBase (1999) | NLPQC (2005) | WYSIWYM (2006) | C-PHRASE (2008) | STK (2010)
Success rate | 95% | ? | ? | ? | 86% | 76%

The remaining rows of the table (domain independence, DB modeling, learning capability and clarification dialogues) mark each system as including the feature, lacking it, having it to a limited degree (L), or offering no evidence of it (?).

At this point it is important to mention that the current version already has the following characteristics: domain independence, clarification dialogues and a success rate of 79-89% [6]. With the inclusion of the new features we aim at increasing the success rate to 95%.

The preceding and current versions of our NLIDB are described in [5] and [6]. The main characteristic of the system described in [5] is that it deals with domain independence. In order to achieve this, it uses a preprocessor that automatically generates a domain dictionary and a translation technique that involves nouns, prepositions and conjunctions. It is important to mention that this version did not have a data dictionary with semantic information (such as the one proposed here). The second version, described in [6], includes domain-independent dialogue processes, which were designed for use with any relational database, and were based on a typification of problems in queries that involves most of the cases found. In order to solve the problems present in those two versions, a new database model has been proposed that includes the information necessary for solving the remaining problems (those described in Section 3), which is complemented with an architecture based on functional layers. Despite the competitive performance of our NLIDB (compared with other state-of-the-art systems), it still does not successfully deal with some problems (such as those mentioned in the following section), which made it necessary to consider a major overhaul of the system based on a new data dictionary and a layered architecture.

3 Problems in Queries

From an analysis carried out on query corpora involving three databases, four general types of problems (i.e., those that usually are found in queries for most databases) were identified and classified:

1. The use of words or phrases of different syntactic categories (such as nouns, verbs, adjectives, and prepositions) for referring to tables or columns of the database.
2. Semantic ellipsis, which occurs when words that are necessary for clearly understanding the query are omitted.
3. Covering of the capabilities of SQL, such as involving several tables of the database and the use of aggregate functions.
4. Other types of problems related to human errors, such as nonexistent information in the database, words that indicate imprecise values, etc.

The classification obtained and some examples are shown in Table 2.

Table 2. Types of problems that occur in queries

Case | Problem
1 | Use of words or phrases of different syntactic categories
1.1 | Use of nouns or nominal phrases for referring to tables or columns. Example: List number of seats on D9S.
1.2 | Use of verbs or verbal phrases for referring to tables or columns. Example: What time does flight 102136 leave ATL to DFW?
1.3 | Use of prepositions or prepositional phrases for referring to tables or columns. Example: Give me an economy class flight from DFW to BWI one-way.
1.4 | Use of adjectives or adjectival phrases for referring to tables or columns. Example: How fast can the Concorde fly?
1.5 | Use of temporal adverbs. Example: List fares for all flights leaving after twelve o'clock noon from BOS to BWI.
1.6 | Use of conjunctions. Example: Flights exiting Fort Worth and entering Dallas.
2 | Semantic ellipsis
2.1 | Lacking information of tables or columns. Example: List fares for all flights leaving after twelve o'clock noon from BOS to BWI. Note: there exist two columns related to the word "fare": "one_way_cost" and "rnd_trip_cost", thus it is not clear which of these columns is being referred to.
2.2 | Lacking information of tables or columns referred to by some value. Example: How much is Delta flight 539?
2.3 | Lacking information of tables or columns about the information requested. Example: All flights from ATL to SFO on Delta first class.
3 | Covering of the capability of SQL
3.1 | Queries that involve several tables. Example: Give me an economy class flight from DFW to BWI one-way. Note: the columns referred to by "economy", "flight" and "one-way" belong to different tables.
3.2 | Queries that involve aggregate functions. Example: Which flight from Philadelphia to Dallas has the cheapest fare?
4 | Other types of problems
4.1 | Search values that involve two or more columns. Example: Give me the hire date of the employee Margaret Peacock. Note: the value "Margaret Peacock" involves two columns: "FirstName" and "LastName."
4.2 | Search values constituted by two or more words. Example: Give me the postal code and city of the supplier "Exotic Liquids."
4.3 | Incomplete search values. Example: What is the name of the store where is "the busy."
4.4 | Nonexistent tables or columns. Example: Get me a date on flight 294 leaving ATL to Washington. Note: the table or column "date" does not exist in the database.
4.5 | Spacing, punctuation and formatting mistakes. Example: Flights between SFO and Dallas between noon and 5:00 P.M. Note: the ATIS database uses the military format for time.
4.6 | Imprecise search values. Example: Show me the Atlanta to Dallas flights in the morning.

It is important to mention that the classification carried out according to the problems found in the corpora can be applied to most databases. The proposed model is based on including the information needed for solving the aforementioned problems in order to improve the performance of NLIDBs.

4 Semantically Enriched Database Modeling

Existing data models (such as the Extended Entity-Relationship model (EER), the Unified Modeling Language (UML), etc.) include information about the real world and provide graphical techniques that are used for database design. Unfortunately, for some applications such as NLIDBs, their usefulness is insufficient because they do not include enough semantic information for an effective translation of queries from natural language to SQL.

Some of the main problems that have been found in NLIDBs relate to domain independence (system portability), grammatical problems and semantic ellipsis (concerning query translation). When a NLIDB is ported from one domain to another, it needs different domain information and grammatical structures; therefore, it must somehow acquire this information for constructing the domain dictionary that is necessary for the translation process.

Considering the limitations of the data models and the difficulty of solving the problems faced by NLIDBs, we propose the design of a software architecture based on a new database model. This new model is not intended to replace existing database models; rather, it enriches them with semantic information that makes it possible to solve the problems faced by NLIDBs in query translation.

The semantically enriched model (SEM) represents the knowledge of a set of data in a specific domain. It is formally defined as SEM = (C, L), where C is a set of concepts that belong to the database schema (entities, attributes and relationships) and L is the set of links among such concepts.

4.1 Grammatical Descriptors

In the analysis carried out with three query corpora, it has been observed that the words used in queries for referring to entities or attributes belong to four syntactic categories; therefore, the inclusion of this information in the model will benefit the translation process. The categories that have been detected are the following:

- Nouns are used mainly to refer to entities or attributes.
- Verbs are used mainly to refer to relationships among entities.
- Prepositions are used to refer to attributes (mainly those that indicate locations or times).
- Adjectives qualify nouns (in Spanish they can occasionally be used instead of nouns).

As a result of this observation, a grammatical descriptor has been defined, which permits representing semantic information classified according to syntactic categories. Grammatical descriptors G are syntactic categories that can be used for representing some of the schema concepts (entities or attributes). These tags can be of four types:

- Verbs or verbal phrases, V.
- Nouns or nominal phrases, N.
- Prepositions or prepositional phrases, P.
- Adjectives or adjectival phrases, Adj.

4.2 Entities

As in the EER model, an entity (e) is defined as a concrete or abstract object that identifies an animate or inanimate being that exists in the real world. Unlike the EER model, in our model the representation of an entity includes a grammatical tag of type N, which defines the noun or nominal phrase that is usually used in queries for referring to the entity.

4.3 Attributes

As in the EER model, an attribute (a) is defined as a descriptive property of an entity. Each entity class Ei has a set of attributes Ai = (ai1, ai2,..., aim). In the proposed model attributes are named using nouns and are associated with the entity, and they are referred to by a grammatical descriptor of type N. It is important to mention that both entities and attributes may optionally have grammatical tags of types V, P and Adj.

Attributes have a domain, which is defined as the set of possible values that an attribute may adopt: vij ∈ Domij, where vij is the value of attribute aij, and Domij is the set of atomic values permitted for the attribute. In the set of attributes of each entity class Ei, there exists at least one attribute aij that is the identifier (or primary key) of the entity, which will be denoted by pki.

From the EER model only the following types of attributes will be included in our model: simple attributes, composite attributes, univalued attributes and derived attributes. An attribute can be simple or composite. A simple attribute is one that cannot be divided into smaller components; conversely, a composite attribute is one that can be divided into several simple attributes. Most attributes are univalued, which means that they have a single value. The values of some attributes may be derived, which means that they are calculated from the values of other attributes.

4.4 Relationships

A relationship is an association between a pair of entity classes that is established through a link R. R is defined as the set of links ri, where each one of these associates a pair of entities [ei, ej]. Typically, relationships are referred to by a grammatical descriptor of type V. At this point it is important to remark that relationships are defined differently in our model than in the EER model: in the EER model relationships may involve n entities (where n = 1, 2, 3, ...), whereas for the purposes of our model relationships are defined only for pairs of entities.

A role is the function played by an entity in a relationship. According to this concept, entities can be of two types. An agent entity is an entity that normally plays the role of the subject in a sentence (in active voice) that describes the interaction between the two entities involved. A patient entity is an entity that normally plays the role of the object in a sentence (in active voice) that describes the interaction between the two entities involved. Generally, entities that represent people are considered agent entities, while those that represent inanimate things are considered patient entities.

Unlike the EER model, in our model relationships always have direction. Normally a relationship implies a forward direction (i.e., where the relationship points to) defined by the role played by its entities: from the agent entity to the patient entity. Additionally, for every direction between a pair of entities there exists an inverse direction that goes from the patient entity to the agent entity. Associated with this inverse direction, a sentence in passive voice can be defined that describes the interaction between the two entities involved. Occasionally, in a relationship there might exist two patient entities, and in this case one has to be chosen as agent for defining a forward direction.

A recursive relationship is an association between an entity class and itself. In this case the entity class plays different roles.

A relationship that associates a pair of entity classes [Ei, Ej] involves a primary key of entity Ei and some corresponding attribute of entity Ej. This attribute is called foreign key and is denoted by fkj. Ei is called the referenced entity and its primary key is called the referenced attribute, while Ej is called the referencing entity and its foreign keys are called referencing attributes.

Similarly to the EER model, we define the notion of cardinality as the number of entities that may participate in a relationship. According to cardinality, relationships are divided into three types: one to one (1:1), one to many (1:N), and many to many (N:M).
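
The definitions of Sections 4.1 to 4.4 can be made concrete with a small data-structure sketch. The following Python code is only an illustration written for this text, not the data dictionary of the NLIDB: the class names, field names and the Northwind-like example ("Employee", "Order", "takes", "is taken by") are assumptions chosen for readability.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Descriptor types of Section 4.1 and cardinalities of Section 4.4.
DESCRIPTOR_TYPES = {"N", "V", "P", "Adj"}
CARDINALITIES = {"1:1", "1:N", "N:M"}


def check_descriptors(owner: str, descriptors: Dict[str, List[str]]) -> None:
    """Reject unknown descriptor types and require at least one type-N phrase."""
    unknown = set(descriptors) - DESCRIPTOR_TYPES
    if unknown:
        raise ValueError(f"{owner}: unknown descriptor types {sorted(unknown)}")
    if not descriptors.get("N"):
        raise ValueError(f"{owner}: a descriptor of type N is required")


@dataclass
class Attribute:
    name: str
    descriptors: Dict[str, List[str]]  # descriptor type -> phrases
    is_primary_key: bool = False

    def __post_init__(self) -> None:
        check_descriptors(self.name, self.descriptors)


@dataclass
class Entity:
    name: str
    descriptors: Dict[str, List[str]]
    attributes: List[Attribute] = field(default_factory=list)

    def __post_init__(self) -> None:
        check_descriptors(self.name, self.descriptors)
        # Every entity class needs at least one identifier attribute (pk).
        if not any(a.is_primary_key for a in self.attributes):
            raise ValueError(f"{self.name}: an identifier attribute is required")


@dataclass
class Relationship:
    agent: Entity    # subject of the active-voice sentence
    patient: Entity  # object of the active-voice sentence
    forward: str     # type-V descriptor, agent -> patient
    inverse: str     # type-V descriptor, patient -> agent (passive voice)
    cardinality: str = "1:N"

    def __post_init__(self) -> None:
        if self.cardinality not in CARDINALITIES:
            raise ValueError(f"unknown cardinality {self.cardinality}")

    def describe(self, inverse: bool = False) -> str:
        """Return the sentence associated with the chosen direction."""
        if inverse:
            return f"{self.patient.name} {self.inverse} {self.agent.name}"
        return f"{self.agent.name} {self.forward} {self.patient.name}"


# Hypothetical example: names are illustrative, not taken from a real schema.
employee = Entity(
    "Employee",
    {"N": ["employee"]},
    [Attribute("EmployeeID", {"N": ["employee number"]}, is_primary_key=True)],
)
order = Entity(
    "Order",
    {"N": ["order"]},
    [Attribute("OrderID", {"N": ["order number"]}, is_primary_key=True)],
)
takes = Relationship(employee, order, forward="takes", inverse="is taken by")
```

In this sketch the forward direction reads "Employee takes Order" and the inverse direction reads "Order is taken by Employee", which mirrors the active-voice and passive-voice sentences associated with the two directions of a relationship.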

4.5 Specialization and Generalization

Specialization and generalization are defined similarly to those in the EER model, the only difference being in their representation. In our model, specialization is related to its entities using the verbal phrase "it is a", and generalization is related to its entities using the verbal phrase "it is a type of".

5 SEM Representation

This section defines the graphical representation of the main concepts of our proposed semantic modeling method. An entity is graphically represented by a rectangle with two sections: the upper section is used for indicating the entity name, and the lower section indicates the attribute names. Each entity must have a unique name that distinguishes it from the other entities. Entities must have one grammatical descriptor, which will usually be of type N, and optionally grammatical descriptors of other types.

An attribute represents some property of the entity that occurs in all the instances of the class. Each attribute must have one descriptor of type N, and optionally it may have grammatical descriptors of other types. In the graphical representation, names of attributes that are primary keys are underlined, while those of attributes that are foreign keys are written with a double underline. Names of derived attributes are written with a dashed underline. The names of composite attributes are written in italics, and the constituent attributes are written as a list, each preceded by a bullet. Figure 1 shows an example of the graphical representation of entities and attributes.

Relationships are represented by a line that joins the entities involved with one arrow head that points to the referenced entity. There exist two alternatives for considering the direction of a relationship: one defined by the agent and patient entities involved and the other defined by the referenced entity and the referencing entities. In our model we use the first alternative because it is more useful for the semantic analysis of queries. The relationship description includes a box that contains a grammatical descriptor of type V and an arrow that indicates the forward direction of the relationship, and another grammatical descriptor of type V with an arrow for the inverse direction. Additionally, there is an arrow that links the box to its corresponding relationship. Finally, the referencing and the referenced attributes of the relationship are written on the sides of the box and they are linked to it by lines. Figure 2 shows an example of the representation of a relationship.

Fig. 1. Example of representation of entities and attributes

Fig. 2. Example of representation of a relationship

The cardinality of relationships is represented as follows: a one to one (1:1) relationship is represented by writing the number one at each end of the line that represents the relationship, a one to many (1:N) relationship is represented by writing the number one at the end of the line that has the arrow head and the letter N at the other end, and a many to many (N:M) relationship is represented by writing the letter M at one end of the line that represents the relationship and the letter N at the other end. Figure 3 shows examples of representations of relationships with these types of cardinalities.

Fig. 3. Representation of cardinalities: a) one to one, b) one to many, and c) many to many

Grammatical descriptors are one of the main components of the new semantic model. The representation of the four types of descriptors is shown in Figure 4.

Fig. 4. Representation of descriptors: a) verbal, b) nominal, c) prepositional and d) adjectival

Grammatical descriptors provide the NLIDB with information that is useful during the semantic analysis of a query, since they make it possible to relate query words to the database columns and tables. Figure 5 shows a fragment of the ATIS database schema that exemplifies the use of grammatical descriptors.

Fig. 5. Fragment of the SEM diagram for the ATIS database

6 Case Study

In order to show the usefulness of the SEM model, the following paragraphs describe its application to three of the most common problems found in the corpus for the ATIS database.

Case #1. Use of words or phrases of different syntactic categories. Although tables and columns are usually referred to by nouns in queries, our study of query corpora revealed that a significant percentage of queries include verbs, prepositions and adjectives for referring to tables and columns. The following query is an example of this type of problem: How much does it cost to fly from Boston to Oakland in one-way? The analysis of this query shows that words and phrases such as “cost”, “to fly from”, “to”, and “one-way” refer to columns of table “Fare”. Figure 6 shows the semantic information included (modeled by suitable descriptors) for table “Fare”. These descriptors contain the information required during the semantic analysis of the query for identifying the columns and table referred to in the query.

Fig. 6. Grammatical descriptors for columns of table “Fare”
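
The lookup described in Case #1 can be sketched as a longest-match search of the query against the descriptor phrases. The Python code below is a minimal illustration under stated assumptions, not the translation technique of the NLIDB: the descriptor table is hypothetical (the column names "from_airport" and "to_airport" and the descriptor types assigned to each phrase are invented for the example; only "one_way_cost" and "rnd_trip_cost" are mentioned in this paper), and the matching strategy is a simple greedy scan.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical descriptors for table "Fare": phrase -> (descriptor type, columns).
FARE_DESCRIPTORS: Dict[str, Tuple[str, List[str]]] = {
    "cost": ("N", ["one_way_cost", "rnd_trip_cost"]),
    "to fly from": ("V", ["from_airport"]),   # assumed column name
    "to": ("P", ["to_airport"]),              # assumed column name
    "one-way": ("Adj", ["one_way_cost"]),
}

QUERY = "How much does it cost to fly from Boston to Oakland in one-way?"


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep words, numbers and hyphenated words."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def match_descriptors(
    query: str, descriptors: Dict[str, Tuple[str, List[str]]]
) -> List[Tuple[str, str, List[str]]]:
    """Scan the query left to right, preferring the longest descriptor phrase."""
    tokens = tokenize(query)
    phrases = {tuple(tokenize(p)): p for p in descriptors}
    max_len = max(len(key) for key in phrases)
    matches = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in phrases:
                phrase = phrases[key]
                kind, columns = descriptors[phrase]
                matches.append((phrase, kind, columns))
                i += n  # skip the tokens consumed by this phrase
                break
        else:
            i += 1  # no descriptor starts at this token
    return matches


for phrase, kind, columns in match_descriptors(QUERY, FARE_DESCRIPTORS):
    print(f"{phrase!r} ({kind}) -> {columns}")
```

Preferring the longest phrase keeps "to fly from" from being split into the shorter prepositional descriptor "to"; the second "to" (before "Oakland") is still matched on its own. Note that "cost" maps to two columns, so a later step would still have to choose between them.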

Case #2. Semantic Ellipsis. In natural language communication, people usually omit words that might be crucial for the semantic interpretation of a query. The following query shows an example of this problem: All flights and fares from ATL to SFO on 539. This query has three types of problems: regarding “All flights” the query does not specify the information requested, “fares” might refer to two columns, and “539” is a search value for some unspecified column. For the first problem, it follows from the model that “flights” refers to table “Flight”, so the NLIDB can display a list of the columns of table “Flight” from which the user can select the ones he/she wishes. Concerning the second problem, “fares” refers to two columns (“one_way_cost” and “rnd_trip_cost”) of table “Fare” (Fig. 6), which can be displayed to the user so he/she can choose one or both. Concerning the third problem, the model contains information about the data type of each column; with this information the NLIDB can display a list of the columns of table “Flight” that contain numeric data, so the user can indicate the unspecified column.

It is important to mention that semantic ellipsis is an extremely complex problem; therefore, the intervention of the user is necessary for choosing among the different alternative interpretations that arise during the semantic analysis. For example, in the preceding query, there is no way that the NLIDB can determine if the user refers to one-way cost or round-trip cost, unless it receives this information from the user through a dialogue.

Case #3. Covering of the capability of SQL. The following query shows an example that involves an aggregate function: Find the cheapest one-way fare from Pittsburgh to Oakland first class. In addition to the structures mentioned in Sections 4 and 5, the model includes a structure for this purpose that contains information on the aggregate functions supported by the NLIDB (Figure 7). Each of the aggregate functions has a link to each column it can be applied to, and the link has a descriptor that specifies the word or phrase for referring to such function in the wording of a query. This information is useful during the semantic analysis for identifying the aggregate function referred to by the word “cheapest” in the query.

Fig. 7. Information for associating aggregate functions to columns
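
The structure described in Case #3 can be sketched as a list of links, each joining an aggregate function, a column and the phrase that refers to the function in a query. The following Python code is a hypothetical illustration, not the contents of Figure 7: the class `AggregateLink`, the function `candidate_aggregates` and the phrase "most expensive" are assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class AggregateLink:
    function: str  # SQL aggregate function, e.g. MIN
    column: str    # column the function can be applied to
    phrase: str    # descriptor: wording that refers to the function

    def to_sql(self) -> str:
        return f"{self.function}({self.column})"


# Hypothetical links for the two cost columns of table "Fare".
AGGREGATE_LINKS = [
    AggregateLink("MIN", "one_way_cost", "cheapest"),
    AggregateLink("MIN", "rnd_trip_cost", "cheapest"),
    AggregateLink("MAX", "one_way_cost", "most expensive"),
    AggregateLink("MAX", "rnd_trip_cost", "most expensive"),
]

QUERY = "Find the cheapest one-way fare from Pittsburgh to Oakland first class."


def candidate_aggregates(
    query: str,
    links: Iterable[AggregateLink],
    columns: Optional[Set[str]] = None,
) -> List[AggregateLink]:
    """Return the links whose phrase occurs in the query as whole words.

    If `columns` is given (e.g. columns already identified through other
    descriptors), only links to those columns are kept.
    """
    text = query.lower()
    found = []
    for link in links:
        pattern = r"\b" + re.escape(link.phrase) + r"\b"
        if re.search(pattern, text) and (columns is None or link.column in columns):
            found.append(link)
    return found


# Without further information "cheapest" is ambiguous between two columns.
print([link.to_sql() for link in candidate_aggregates(QUERY, AGGREGATE_LINKS)])

# If "one-way" has already been resolved to one_way_cost, one candidate remains.
resolved = candidate_aggregates(QUERY, AGGREGATE_LINKS, columns={"one_way_cost"})
print([link.to_sql() for link in resolved])
```

Keeping one link per function and column, rather than one descriptor per function, follows the description above: the same word ("cheapest") may apply to several columns, and the set of columns identified elsewhere in the query can narrow the candidates.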

7 Final Remarks and Future Work

Although many NLIDBs have been developed since the late 60s, many problems persist that prevent the translation process from being totally successful, i.e., error free. Most of the NLIDBs developed have focused on solving existing complex problems, but have overlooked a key aspect of NLIDB design: a formal and comprehensive model that includes enough semantic information to facilitate the semantic analysis of queries for achieving success rates close to 100%. From the experience acquired during the development of the previous version of our NLIDB, we realized the extreme complexity of the problems involved in translating correctly from natural language to SQL, which prevents domain-independent NLIDBs from attaining success rates above 90%. Thus, we concluded that for solving these problems, it is necessary to apply design techniques for highly complex systems. From database management systems (DBMS), we borrowed the idea that a NLIDB should have a powerful data dictionary (similar to those found in relational DBMSs), with the difference that our data dictionary, besides the structural information of the database, should be enriched with enough semantic information to facilitate the translation process from natural language to SQL.

This paper presents a new modeling method for databases that includes relevant information in order to improve the performance of NLIDBs. The new model is based on the analysis of problems that occur in the corpora of three databases (ATIS, Northwind and Pubs), from which it is intended to obtain all the information needed by NLIDBs to solve such problems. We are currently designing the data dictionary for the new version of our NLIDB, based on the SEM modeling method presented in this paper. We claim that in order to implement a successful translation process for a NLIDB, it is necessary to design an architecture for the translation process applying another design technique used in complex systems: an architecture based on functionality layers (similar to the OSI model for communications systems). We have already designed a layered architecture for the new version of our NLIDB, which will be published elsewhere. We think that in order to attain success rates close to 95%, our NLIDB should have learning capability. This feature will permit the NLIDB to learn from its interaction with users (via an extended version of our dialogue module [6]) in order to modify information stored in its data dictionary for inserting new semantic information or correcting possible inaccuracies. We are currently implementing the new version of the NLIDB. For evaluating the performance of the new system, it will be tested with a corpus that includes queries that involve the problems mentioned in Table 2. The results of these tests will be compared with those from the best state-of-the-art systems such as C-PHRASE [10] and STK [11], as well as ELF [3] which is one of the best existing commercial NLIDBs.

References

1. Cimiano, P., Haase, P., and Heizmann, J.: Porting Natural Language Interfaces Between Domains: an Experimental User Study with the ORAKEL System. In: Proc. 12th International Conference on Intelligent User Interfaces, Honolulu, Hawaii, USA (2007) 180–190
2. BBC News. Bill Gates Says: Mouse is Out, Touch Screen and Natural Language Interface Are In, http://news.bbc.co.uk/player/nol/newsid_7170000/newsid_7174300/7174330.stm?bw=bb&mp=wm&asb=1&news=1&bbcws=1
3. ELF Software, http://www.elf-software.com/FaceOff.htm
4. DARPA Air Travel Information System (ATIS0), http://www.ldc.upenn.edu/Catalog/readme_files/atis/sdtd/trn_prmp.html
5. Pazos, R.A., Pérez, J., et al.: A Domain Independent Natural Language Interface to Databases Capable of Processing Complex Queries. Lecture Notes in Artificial Intelligence, Vol. 3789. Springer-Verlag (2005) 833–842
6. Pazos, R.A., Rojas, J.C., et al.: Dialogue Manager for a NLIDB for Solving the Semantic Ellipsis Problem in Query Formulation. Lecture Notes in Artificial Intelligence, Vol. 6277. Springer-Verlag (2010) 203–213
7. Zhang, G., Chu, W.W., Meng, F., Kong, G.: Query Formulation from High-Level Concepts for Relational Databases. In: Proc. User Interfaces to Data Intensive Systems (1999) 64–74
8. Stratica, N.: Using Semantic Templates for a Natural Language Interface to the CINDI Virtual Library. Data & Knowledge Engineering, Vol. 55. Elsevier Science Publishers, The Netherlands (2005) 4–19
9. Hallett, C.: Generic Querying of Relational Databases using Natural Language Generation Techniques. In: Proc. INLG’06. Nottingham, United Kingdom (2006) 88–95
10. Minock, M., Olofsson, P., and Näslund, A.: Towards Building Robust Natural Language Interfaces to Databases. In: Proc. of the 13th International Conference on Natural Language and Information Systems, Berlin, Heidelberg (2008) 187–198
11. Giordani, A., Moschitti, A.: Semantic Mapping Between Natural Language Questions and SQL Queries via Syntactic Pairing. In: Proc. International Conference on Applications of Natural Language to Information Systems (2010) 207–221
