The Second International Workshop AIS-ADM-07 June 3-5, 2007, St. Petersburg, Russia
Ontos Solutions for Semantic Web: Text Mining, Navigation and Analytics
Vladimir Khoroshevsky, Computer Center RAS, 40 Vavilov str, GSP-1 Moscow, Russia
Irina Efimenko, Grigory Drobyazko, Polina Kananykina, Victor Klintsov, Dmitry Lisitsin, Viacheslav Seledkin, Anatoli Starostin, Vyacheslav Vorobyov, Ontos AG, 84/2 Vernadskogo Av., 119606 Moscow, Russia
Agenda
• Introduction
  • Semantic Technologies Umbrella
• Ontos Solutions for Semantic Web
  • General View
  • Ontology Driven Text Mining & Web Mining
  • MAS Based Information-To-Knowledge Transformation
  • Semantic Navigation and Analytics
• Ontos Semantic Services Demos
  • Portal MedTrust
  • Semantic RSS and Semantic Navigation
• Conclusion
  • Challenges and Future Trends
AIS-ADM-07, St. Petersburg, Russia, June 3 - 5, 2007
Introduction. Semantic Technologies Umbrella
MAS
“The Semantic Web will globalize KR, just as the WWW globalised hypertext” Tim Berners-Lee
Semantic Web
Introduction. Semantic Web and MAS
Ontos Solutions for Semantic Web. General View
Daniel Hladky, CEO
Ontos International AG, Mittelstrasse 24, 2560 Nidau
[email protected]
Mobile: +41 79 353 50 43, Tel.: +41 32 332 82 70, Fax: +41 32 332 92 52
Main R&D in Domain
• Ontology Driven Text Mining & Web Mining
• MAS Based Information-to-Knowledge Transformation
• RDF-storage Development and Implementation
• Semantic Navigation & Analytics
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
• Objectives
  • Combining AI and IT experience within the NLP domain
  • R&D in knowledge management
• Approaches
  • Use of IE technologies in NL text processing
  • Enrichment of IE techniques with NLP on the basis of special linguistic models
  • Representation of the meaning of NL texts in the form of cognitive maps
• Results
  • A new generation of MIE systems
  • Practical use of the results in commercial and non-commercial organizations
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
Basic Principles
• Process those constructions that can be processed correctly, and leave unprocessed those that cannot yet be processed correctly
• Develop reusable components for multi-platform implementation
• Provide domain ontology-driven analysis
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
Main Requirements
• Work with multilingual document collections (English, German and Russian texts)
• Work with monothematic document collections (first of all, the so-called “Business Duties” domain)
• Adequate processing of relevant objects and relations, according to the specific ontology
• Representation of processing results in the form of a cognitive map, which is a kind of semantic network
• Multi-platform implementation of all systems of the family
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining Domain Ontology “Business Duties”
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining Domain Ontology “MedTrust”
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
General Scheme of NL-texts Processing
[Figure: Web (doc, xls, pdf) → Crawler → filters → plain text → OntosMiner™ → RDF-Store]
RDF-Store options:
• Oracle RDF Store
• MS SQL Server 2005
• InMemory DB
• IBM DB2
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
GATE Solution vs. Ontos Solution
Architecture
  GATE Solution: Creole-based Chain
  Ontos Solution: Domain-based Chain
Main Modules
  GATE Solution: Tokenizer, SentSplitter, POS-tagger, Gazetteer, NE-transducer, Coreferencer, OrthoMatcher (hardcoded)
  Ontos Solution: Tokenizer, SentSplitter, Morph-tagger, POS-tagger, Gazetteer, Morph Gazetteer, NP-chunker, VP-chunker, Anaphora Resolver, NE-transducer, OrthoMatcher (rule-based), UnknownMatcher (rule-based), Minimizer, Semantic Tagger (on text), Semantic Tagger (by sent), XML-generator (model-driven)
Additional Technological Components
  GATE Solution: Plugins
  Ontos Solution: Dix (Dictionary SDK for developers), minDix (user-oriented Dictionary SDK), Cross-lingua XML-generator
Applications
  Ontos Solution: Digester, Summarizer, LightOntos, Report Generator, Semantic Navigator, Semantic RSS
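The idea of a processing chain, in which each module adds its annotations to a shared document and passes it on, can be illustrated with a minimal sketch. It is not the OntosMiner or GATE implementation: the `Document` class, the three toy modules, and the one-entry gazetteer dictionary are hypothetical names introduced only for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical container: raw text plus annotations added by modules."""
    text: str
    annotations: Dict[str, list] = field(default_factory=dict)


# A module takes a document, adds annotations, and returns the document.
Module = Callable[[Document], Document]


def tokenizer(doc: Document) -> Document:
    # Toy tokenizer: runs of word characters only.
    doc.annotations["tokens"] = re.findall(r"\w+", doc.text)
    return doc


def sentence_splitter(doc: Document) -> Document:
    # Toy splitter: a sentence ends at a full stop.
    doc.annotations["sentences"] = [
        s.strip() for s in doc.text.split(".") if s.strip()
    ]
    return doc


def gazetteer(doc: Document) -> Document:
    # Toy gazetteer (assumed data): looks up tokens in a fixed dictionary.
    known = {"Google": "Organization"}
    doc.annotations["lookups"] = [
        (t, known[t]) for t in doc.annotations["tokens"] if t in known
    ]
    return doc


def run_chain(doc: Document, chain: List[Module]) -> Document:
    """Apply the modules in order; later modules may read earlier annotations."""
    for module in chain:
        doc = module(doc)
    return doc


result = run_chain(
    Document("Google hired an engineer. He moved to Zurich."),
    [tokenizer, sentence_splitter, gazetteer],
)
print(result.annotations["lookups"])
```

Because the chain is just an ordered list, swapping a module or adding one (for example a chunker after the gazetteer) does not require changes to the other modules, as long as the annotations each one reads are produced earlier in the list.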
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
Extracted Named Entities Types
• People
• Organizations of various types (commercial, educational, etc.)
• Elections
• Mass Media and Media holdings
• Parties
• Government Structures
• Titles and JobTitles
• Scientific degrees
• Several kinds of Addresses
• Money
• Percent
• URL, e-mail, phone (international style)
• Locations
• Dates and Periods of Time
• Medicines, diseases, treatment methods, etc.
• Etc.
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
Extracted Semantic Relations Types
• Affiliate
• Buy-Sell
• Employ
• Found
• Graduate
• Invest
• JointVenture
• Own
• Rival
• LocatedIn
• EarnDegree
• Reside
• Belong
• Investigate
• Lobby
• Participate
• BeFriend
• BeRelative
• BePartner
• Support
• Medical Rels (be_indicated, etc.)
• Etc.
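One way to picture how typed entities and relations of this kind could be checked against an ontology and turned into subject-predicate-object triples is sketched below. The argument signatures in `SIGNATURES` are assumptions made for the example; the slides do not say which entity types each relation accepts, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Entity:
    name: str
    type: str  # e.g. "Person", "Organization", "Location"


@dataclass(frozen=True)
class Relation:
    kind: str  # e.g. "Employ", "Found", "LocatedIn"
    subject: Entity
    obj: Entity


# Assumed (subject type, object type) signatures; not taken from the slides.
SIGNATURES = {
    "Employ": ("Organization", "Person"),
    "Found": ("Person", "Organization"),
    "LocatedIn": ("Organization", "Location"),
}


def is_valid(rel: Relation) -> bool:
    """A relation is kept only if its argument types match the signature."""
    return SIGNATURES.get(rel.kind) == (rel.subject.type, rel.obj.type)


def to_triples(relations: List[Relation]) -> List[Tuple[str, str, str]]:
    """Render valid relations as (subject, predicate, object) triples."""
    return [(r.subject.name, r.kind, r.obj.name) for r in relations if is_valid(r)]


acme = Entity("Acme", "Organization")
alice = Entity("Alice", "Person")
relations = [
    Relation("Employ", acme, alice),  # matches the assumed signature
    Relation("Found", acme, alice),   # rejected: argument types are swapped
]
print(to_triples(relations))
```

Triples of this shape map directly onto an RDF-style store, and a set of them forms the kind of semantic network that the slides call a cognitive map.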
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining Google management Team
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining Russian Text Processing Results
Ontos Solutions for Semantic Web. MAS Based Information-To-Knowledge Transformation
Information Processing Technological Cycle
[Figure: a cycle of five stages]
• Resources Crawling
• Text Mining
• Objects & Relations Identification & Merging
• Data & Metadata Storing
• Semantic Navigation & Intelligent Analytics
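The identification and merging stage can be illustrated with a small sketch that groups mentions of the same object and unions their attributes. The matching rule used here (same type and same name after lowercasing and removing full stops) is an assumption chosen for brevity; the slides do not describe the actual identification criteria, and all names below are hypothetical.

```python
from typing import Dict, List, Tuple


def normalize(name: str) -> str:
    """Assumed identity key: lowercase, no full stops, single spaces."""
    return " ".join(name.lower().replace(".", "").split())


def merge_mentions(mentions: List[Tuple[str, str, dict]]) -> Dict[tuple, dict]:
    """Merge (name, type, attributes) mentions that share a normalized key."""
    merged: Dict[tuple, dict] = {}
    for name, etype, attrs in mentions:
        key = (normalize(name), etype)
        obj = merged.setdefault(key, {"names": set(), "type": etype, "attrs": {}})
        obj["names"].add(name)
        for attr, value in attrs.items():
            # On conflict the first value seen wins (a simplification).
            obj["attrs"].setdefault(attr, value)
    return merged


mentions = [
    ("Acme Corp.", "Organization", {"city": "Zurich"}),
    ("acme corp", "Organization", {"founded": "1999"}),
    ("Acme Corp.", "Person", {}),  # same string, different type: kept apart
]
for key, obj in merge_mentions(mentions).items():
    print(key, sorted(obj["names"]), obj["attrs"])
```

Keeping the entity type in the key prevents objects of different kinds that happen to share a name from being merged.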
Ontos Solutions for Semantic Web. MAS Based Information-To-Knowledge Transformation
Ontos SOA General View
[Figure components: Internet; CRAWLERs; OntosMiner (three instances); NLP SERVICE; IDENTIFY & MERGE AGENTs; RDF-STORE; APPL SERVICEs]
Ontos Solutions for Semantic Web. MAS Based Information-To-Knowledge Transformation
MAS Architecture
[Figure labels, grouped by kind]
• Storage and sources: Knowledge Base, Doc Storage, Vocabs, Internet
• Brokers: Content Extraction Broker, Broker of Doc Preprocessing, Broker of Doc Processing, Broker of Doc Postprocessing, Broker of Syntax, Broker of NE Transducing, Broker of Semantic Tagger, Broker of Merging
• Agents (several instances of most types): Web Crawler, Store Agent, Tokenizer Agent, Fragmentator Agent, Vocab Agent, Morph Tagger Agent, NP Chunk Agent, VP Chunk Agent, NE Agent, Coref Agent, Ortho Matcher Agent, SR Agent, OWL Generator Agent, Merge Agent
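The broker-and-agents arrangement can be sketched as a broker that hands each incoming task to one of several interchangeable agents of the same type. The slides do not state the dispatch policy, so the round-robin scheduling below is an assumption, and the `Agent` and `Broker` classes are hypothetical names for illustration.

```python
from itertools import cycle
from typing import Callable, Iterable, Tuple


class Agent:
    """A worker that wraps one processing function."""

    def __init__(self, name: str, handler: Callable[[str], object]):
        self.name = name
        self.handler = handler
        self.handled = 0  # number of tasks processed so far

    def process(self, task: str) -> object:
        self.handled += 1
        return self.handler(task)


class Broker:
    """Distributes tasks over a pool of agents (round-robin, by assumption)."""

    def __init__(self, agents: Iterable[Agent]):
        self._agents = list(agents)
        self._next = cycle(self._agents)

    def dispatch(self, task: str) -> Tuple[str, object]:
        agent = next(self._next)
        return agent.name, agent.process(task)


# Two tokenizer agents behind one broker; str.split stands in for a tokenizer.
pool = [Agent("tokenizer-1", str.split), Agent("tokenizer-2", str.split)]
broker = Broker(pool)
for text in ["first document", "second one here", "third"]:
    print(broker.dispatch(text))
```

With several agents per broker, throughput can be raised by adding agents to the pool without changing the code that submits tasks, which is one motivation for a multi-agent design when NLP processing is time-consuming.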
Ontos Solutions for Semantic Web. MAS Based Information-To-Knowledge Transformation Objects & Relationships Merging
Ontos Solutions for Semantic Web. MAS Based Information-To-Knowledge Transformation Semantic Indexing
Ontos Solutions for Semantic Web. Semantic Navigation and Analytics
• Semantic Navigation
• On-the-Fly Digesting of Document Collections
• On-the-Fly Summarization of Documents & Document Collections in a Specified Target Language
• Visual Representation of the Meaning of Documents & Document Collections, and On-the-Fly Reporting on Demand
Ontos Solutions for Semantic Web. Semantic Navigation
Work Flow of Semantic Web Client
[Figure: state diagram with four states (State-1 to State-4) and the following actions]
• Choose an object of interest
• Open Object Navigation Card
• Open Relation Navigation Card
• Look through references
• Generate summary
• Hide/Visualize object attributes
• Change rank threshold
• Show navigation history, Generate Digest
• Forward to Next point
• Backward to Previous point of semantic navigation
• Return to Start point
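The navigation actions in this work flow (forward, backward, return to start, show history) can be modelled with a small history structure. The sketch below assumes browser-like behaviour, in which visiting a new point after stepping back discards the old forward branch; the slides do not specify this, and the `NavigationHistory` class and the point labels are hypothetical.

```python
from typing import List


class NavigationHistory:
    """Tracks the points visited during semantic navigation."""

    def __init__(self, start: str):
        self._points: List[str] = [start]
        self._pos = 0  # index of the current point

    @property
    def current(self) -> str:
        return self._points[self._pos]

    def visit(self, point: str) -> None:
        """Open a new card; any forward branch is discarded (assumption)."""
        del self._points[self._pos + 1:]
        self._points.append(point)
        self._pos += 1

    def backward(self) -> str:
        """Backward to the previous point, if there is one."""
        if self._pos > 0:
            self._pos -= 1
        return self.current

    def forward(self) -> str:
        """Forward to the next point, if there is one."""
        if self._pos < len(self._points) - 1:
            self._pos += 1
        return self.current

    def return_to_start(self) -> str:
        self._pos = 0
        return self.current

    def history(self) -> List[str]:
        """Show navigation history as an ordered list of points."""
        return list(self._points)


nav = NavigationHistory("start")
nav.visit("object card")
nav.visit("relation card")
nav.backward()
print(nav.current, nav.history())
```

A digest could then be generated from the list returned by `history()`, since it records exactly which objects and relations the user looked at.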
Ontos Solutions for Semantic Web. Semantic Navigation
Object card
Object relations (How interacts With, Indicated To, Contraindicated To,...)
Relevant Docs
Ontos Solutions for Semantic Web. Cross Lingua Digesting & Summarization
General Work Flow
[Figure: English and Russian texts → OntosMiner → Language-Independent Internal Representation (RDF Storage) → Semantic Digester, Semantic Summarizer; example topic: Google Management]
Ontos Solutions for Semantic Web. Semantic Digesting Surf for Digesting
Ontos Solutions for Semantic Web. Semantic Digesting On Fly Digesting of Document Collections
Ontos Solutions for Semantic Web. Semantic Summarization “Report” summarization mode
Ontos Solutions for Semantic Web. Semantic Summarization Summary of Documents Collection by Demand
Ontos Semantic Services Demos. Portal MedTrust
Within the MedTrust portal, users can order directly from a connected partner (drug store). A mashup function is built into the semantic navigation card.
Ontos Semantic Services Demos. Semantic RSS and Semantic Navigation
Conclusion. Main Challenges
• Challenges of “Multi”
  • Multi-source (adoption of the Semantic Web paradigm, e.g. blogs, wikis, etc., because of their inherent multi-source nature)
  • Multi-lingua (the collection under processing consists of texts in different languages)
  • Multilingual (the text under processing consists of fragments in different languages)
• Challenges of “Mono”
  • Native-Language-Only Users (need information presented in different languages but understand only their native language)
• Challenges of “Customizing”
  • Domains (user needs relate to different domains)
  • Ontologies (different domains and users need related ontologies)
  • Adaptation (based on extension of ontologies and/or NLPs)
• Challenges of “Back-End”
  • Results representation (different domains and users need different representations of results)
  • User-friendly interface
• Challenges of “Dimension”
  • Volume (terabytes of documents should be processed)
  • Performance (NLP processing is a time-consuming task)
Conclusion. Future Trends
• Solutions in “Multi” and “Mono”
  • Multi-source, Multilingual Content Extraction
  • Genre-Driven Content Extraction
  • Cross-lingua Digesting and Summarization
• Solutions in “Customizing”
  • Ontologies Merging and Alignment
  • User-Driven Acquisition and Management of Lexicons & Processing Rules
  • Computer-Aided Language Processing
• Solutions in “Back-End” and “Dimension”
  • Representation of Results as Cognitive Maps
  • Knowledge-Driven Semantic Navigation
  • Multi-agent Architecture for Language Processing & Knowledge Management
  • Grid Platform Usage
Conclusion. Outlook
• Semantic Navigation with Analytics is available for a specified domain ontology
  • At the moment it is rather difficult to imagine a “world ontology” that covers all possible topics and is supported by a super hardware infrastructure
• Volume handling and performance
  • Social bookmarking / manual annotation
  • Pilots with first customers show that the frequency can be improved by such new technologies
• Adding functions to the navigation card
  • Social bookmarking
  • Mashups, e.g. calling up pictures for a named entity
  • Semantic Search
  • Extending domain ontologies to cover a wider range of people and areas of interest
Thank You! Any Questions?