Multilingual Database and Crosslingual Interrogation in a Real

From: AAAI Technical Report SS-97-05. Compilation copyright © 1997, AAAI (www.aaai.org). All rights reserved.

Multilingual database and crosslingual interrogation in a real internet application
Architecture and problems of implementation

Christian Fluhr, Dominique Schmit, Faïza Elkateb, Philippe Ortet, Karine Gurtner
Commissariat à l'Energie Atomique, Direction de l'Information Scientifique et Technique
CEA/Saclay, 91191 Gif sur Yvette cedex, France
E-mail: [email protected],fr

Abstract

The EMIR European project demonstrated the feasibility of crosslingual interrogation of fulltext databases using monolingual and bilingual general reformulation of the query. New developments have since been made to handle multilingual databases, even when a single document contains more than one language. Problems of implementing full-scale applications are discussed.

Background

Our crosslingual text retrieval technology is based on the results of the EMIR (European Multilingual Information Retrieval) project (Fluhr 1994), carried out within the framework of the ESPRIT European Program. The project covered three languages (English, French and German), with partners from France, Germany and Belgium. It ran from October 1990 to April 1994.

The project had as technological support an already existing multi-monolingual (French and English) text retrieval software: SPIRIT (Syntactic and Probabilistic Indexing and Retrieval of Information in Texts). SPIRIT has been marketed since 1980 on IBM mainframes, since 1985 on various platforms (from PCs to mainframes), and since 1993 in a client-server architecture.

At this time, part of the EMIR results have been introduced into the SPIRIT system. Applications are mainly for the French-English pair. A new language, Russian, has been added, and Dutch is on the way.

Main principles of the approach of crosslingual interrogation

The EMIR approach is based on the use of a general transfer dictionary as a set of reformulation rules. That means that all possible translations are proposed as possible keys for retrieving documents. This approach is the opposite of translating the query and then performing a monolingual interrogation.

There are three differences. First, the reformulation is based on the translation of the concepts in the query and does not need to build a syntactically and semantically correct translated query. Second, the translation of concepts is used only to obtain documents: if the word inferred in the target language is not the right translation but a relevant document is nevertheless obtained, we can consider that the system works properly (for example, inference of a hyperonym or of a word of a different part of speech). Third, the translation and retrieval processes are mixed, so that the fulltext database is used as semantic knowledge for solving translation ambiguities, which are the main problem to solve in this approach of domain-independent reformulation.

One of the main results of this approach is that if an answer to the query exists in the database, the system can, in most cases, select the right translation by looking in the most relevant documents.
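The idea of proposing every translation as a retrieval key, rather than committing to one translated query, can be illustrated with a minimal Python sketch. It is not the SPIRIT or EMIR implementation: the transfer dictionary entries, the toy documents and the function names `reformulate` and `retrieve` are all hypothetical, and documents are ranked simply by how many distinct query concepts they cover.

```python
# Hypothetical French-to-English transfer dictionary (illustrative entries only).
TRANSFER = {
    "avocat": ["lawyer", "avocado"],
    "cour": ["court", "courtyard"],
}

# Toy monolingual (English) database: document id -> text.
DOCS = {
    "d1": "the lawyer addressed the court",
    "d2": "avocado salad recipe",
    "d3": "a quiet courtyard garden",
}


def reformulate(query_words, transfer):
    """Map each source word to ALL of its candidate translations (no disambiguation)."""
    return {word: transfer.get(word, []) for word in query_words}


def retrieve(query_words, transfer, docs):
    """Rank documents by the number of distinct query concepts they cover."""
    keys = reformulate(query_words, transfer)
    scores = {}
    for doc_id, text in docs.items():
        tokens = set(text.split())
        # A concept is covered if any one of its translations occurs in the document.
        covered = sum(1 for word in query_words if tokens & set(keys[word]))
        if covered:
            scores[doc_id] = covered
    # Best coverage first; ties broken by document id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranking = retrieve(["avocat", "cour"], TRANSFER, DOCS)
print(ranking)
```

In this toy setting the document that covers both concepts ranks first, which is how the database itself ends up resolving the ambiguity of "avocat" without a separate translation step.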

Principle of the method

We suppose that the database is monolingual; the problems specific to multilingual databases are discussed in the following paragraphs.

Filtering by the database lexicon is not sufficient to eliminate all wrong translations. It is therefore possible to keep only the translations contained in the most relevant documents (that is, the ones that contain the largest number of query words, especially the ones where the words have the same dependency relations as in the query).
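The two-stage selection discussed in this section, lexicon filtering followed by keeping only the translations found in the most relevant documents, can be sketched as follows. This is an illustrative Python sketch under simplifying assumptions: the data and the function names `filter_by_lexicon` and `select_translations` are hypothetical, relevance is reduced to the count of covered query words, and dependency relations between words are ignored.

```python
def filter_by_lexicon(candidates, lexicon):
    """Drop candidate translations that never occur in the database."""
    return {src: [t for t in targets if t in lexicon]
            for src, targets in candidates.items()}


def select_translations(candidates, docs):
    """Keep, for each source word, the translations found in the best-covering documents."""
    coverage = {}
    for doc_id, text in docs.items():
        tokens = set(text.split())
        # Count how many source words have at least one translation in this document.
        coverage[doc_id] = sum(1 for targets in candidates.values()
                               if tokens & set(targets))
    best = max(coverage.values(), default=0)
    if best == 0:
        return {src: [] for src in candidates}
    # Pool the vocabulary of all documents that reach the best coverage.
    top_tokens = set()
    for doc_id, text in docs.items():
        if coverage[doc_id] == best:
            top_tokens |= set(text.split())
    return {src: [t for t in targets if t in top_tokens]
            for src, targets in candidates.items()}


# Toy data (hypothetical): candidate translations and a two-document database.
docs = {
    "d1": "the lawyer addressed the court",
    "d2": "avocado salad recipe",
}
candidates = {
    "avocat": ["lawyer", "avocado", "barrister"],
    "cour": ["court", "courtyard"],
}
lexicon = {token for text in docs.values() for token in text.split()}

filtered = filter_by_lexicon(candidates, lexicon)
selected = select_translations(filtered, docs)
print(filtered)
print(selected)
```

Here "avocado" survives the lexicon filter because it does occur in the database, which mirrors the point that lexicon filtering alone is not sufficient; it is only discarded once the selection is restricted to the document covering the most query words.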

It is necessary before performing this optimization to be quite sure that the