MatchPlus: A CONTEXT VECTOR SYSTEM FOR DOCUMENT RETRIEVAL

Stephen L Gallant, Principal Investigator
William R. Caid, Project Manager

HNC, Inc.
5501 Oberlin Drive
San Diego, CA 92121

PROJECT GOALS
There are two primary goals for MatchPlus. First, we want to incorporate into the system a notion of similarity of use. For example, if a query deals with 'autos' we want to be able to recognize as relevant a document with many mentions of 'cars'.

Second, we want to apply machine learning algorithms to improve both ad-hoc retrieval and routing performance. Several different algorithms come into play here:

• a "bootstrap" algorithm develops context vector representations for stems so that similar stems have similar vector representations
• neural network algorithms produce routing queries from initial queries and lists of relevant and nonrelevant documents
• clustering algorithms help generate a word-sense disambiguation subsystem (being implemented)
• neural network algorithms interactively improve ad-hoc user queries (being implemented)
• clustering algorithms can also speed retrieval algorithms using a new "cluster tree" pruning algorithm (planned)

A context vector representation is central to all MatchPlus system capabilities. Every word (or stem), document (part), and query is represented by a fixed-length vector with about 300 real-valued entries. For any two of these items, we can easily compute a similarity measure by taking a dot product of their respective vectors. This gives a built-in, generalized thesaurus capability to the system.

RECENT RESULTS

• We have built a system for 800,000 documents (2 GB of text). This system takes Tipster topics, automatically generates queries, and performs retrievals.
• Hand-entered queries may be given using a simple syntax. Terms, paragraphs, and documents can comprise a query (all optionally weighted), along with an (optional) Boolean filter. Documents are always returned in order by estimated likelihood of relevance.
• Documents may be "highlighted" to show hotspots, or areas of maximum correspondence with the query.
• We have implemented routing using neural network learning algorithms. This resulted in a 20-30% improvement compared with the automated ad-hoc system.
• Lists of stems closest to a given stem provide useful and interesting insight into the system's vector representations.
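
The dot-product similarity and the closest-stem lists described above can be illustrated with a small sketch. It is not the MatchPlus implementation: the four-dimensional vectors, the stem values, the function names, and the choice to build document and query vectors as normalized weighted sums of stem vectors are all assumptions made for illustration. Vectors are scaled to unit length here so that the dot product behaves like a cosine; the source only states that a dot product is used.

```python
import math

# Toy 4-entry context vectors (the text describes about 300 entries).
# Every stem and value here is invented for illustration.
STEM_VECTORS = {
    "auto":  [0.9, 0.1, 0.0, 0.1],
    "car":   [0.8, 0.2, 0.1, 0.0],
    "truck": [0.7, 0.0, 0.3, 0.1],
    "bank":  [0.0, 0.9, 0.1, 0.2],
    "loan":  [0.1, 0.8, 0.0, 0.3],
}


def dot(u, v):
    """Similarity measure: dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length (assumption; not stated in the source)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm else list(v)


def weighted_sum(items):
    """Combine (weight, vector) pairs into one unit-length vector."""
    total = [0.0] * len(items[0][1])
    for weight, vec in items:
        for i, x in enumerate(vec):
            total[i] += weight * x
    return normalize(total)


def closest_stems(stem, k=3):
    """Return the k stems whose vectors are most similar to the given stem."""
    target = normalize(STEM_VECTORS[stem])
    scored = [
        (dot(target, normalize(vec)), other)
        for other, vec in STEM_VECTORS.items()
        if other != stem
    ]
    return [other for _, other in sorted(scored, reverse=True)[:k]]


def rank_documents(query_vec, doc_vectors):
    """Order document ids by decreasing dot product with the query."""
    scored = [(dot(query_vec, vec), doc_id) for doc_id, vec in doc_vectors.items()]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Hypothetical documents, each represented by the stems it mentions.
DOC_VECTORS = {
    "d_cars": weighted_sum([(2.0, STEM_VECTORS["car"]), (1.0, STEM_VECTORS["truck"])]),
    "d_finance": weighted_sum([(1.0, STEM_VECTORS["bank"]), (1.0, STEM_VECTORS["loan"])]),
}
QUERY = weighted_sum([(1.0, STEM_VECTORS["auto"])])

print(closest_stems("auto", k=2))          # ['car', 'truck']
print(rank_documents(QUERY, DOC_VECTORS))  # ['d_cars', 'd_finance']
```

In this toy data the query contains only 'auto', yet the document that mentions 'car' and 'truck' ranks first, which is the thesaurus-like behavior the text attributes to the representation.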
PLANS FOR THE COMING YEAR
We have been running many bootstrap learning experiments, and some variations have resulted in significant improvements to performance. We expect that this improvement will carry over to all aspects of the system, including routing.

Currently we are implementing word sense disambiguation. We hope that this will give performance improvements, possibly even eliminating the need for phrase processing. This module should also be able to serve as a stand-alone package, providing help for machine translation and speech understanding systems.

We plan to apply learning algorithms for automated interactive query improvement in a manner similar to our approach with routing. It seems likely that this will give a significant boost to ad-hoc query performance.

Finally, we are performing additional learning experiments to improve routing.
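
As a rough illustration of learning a routing query from an initial query plus lists of relevant and nonrelevant documents, the sketch below adjusts a query vector with a perceptron-style rule. The text does not specify the network architecture or training procedure, so the single linear unit, the update rule, the learning rate, and the function name `learn_routing_query` are all assumptions and should not be read as the MatchPlus method.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def learn_routing_query(initial_query, relevant, nonrelevant, rate=0.1, epochs=20):
    """Perceptron-style refinement of a query vector (illustrative assumption).

    Relevant documents should score above zero and nonrelevant documents
    below zero; each misclassified document pulls the query toward it
    (relevant) or pushes the query away from it (nonrelevant).
    """
    query = list(initial_query)
    examples = [(doc, 1) for doc in relevant] + [(doc, -1) for doc in nonrelevant]
    for _ in range(epochs):
        mistakes = 0
        for doc, label in examples:
            if label * dot(query, doc) <= 0:
                query = [q + rate * label * d for q, d in zip(query, doc)]
                mistakes += 1
        if mistakes == 0:
            break  # every example is on the correct side
    return query


# Toy 3-entry document vectors, invented for illustration.
relevant_docs = [[0.2, 0.9, 0.1]]
nonrelevant_docs = [[0.9, 0.0, 0.4]]
routing_query = learn_routing_query([1.0, 0.0, 0.0], relevant_docs, nonrelevant_docs)

print(dot(routing_query, relevant_docs[0]) > 0)     # relevant document scores positive
print(dot(routing_query, nonrelevant_docs[0]) < 0)  # nonrelevant document scores negative
```

The same loop could in principle be driven by interactive user feedback instead of a fixed relevance list, which is the kind of reuse the plan for ad-hoc query improvement suggests.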