7 Oct 2011
MATHM-57300 Semantic Techniques and Applications
Adjunct Professor Ossi Nykänen, [email protected]
Tampere University of Technology, Dept. of Mathematics, Hypermedia Laboratory
Slides for the Autumn 2011 course, 3 cu, one period of lectures and assignments. Everyone must register in Moodle; the course material (in Moodle) provides an outline of the topic. Also seek examples of the topics from the Web.
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
1 Introduction
Problem statement
Basic ideas
Course overview
Opportunities and challenges
Examples to start with
Looking forward
Semantic Techniques and Applications /1 Introduction (ON 2011)
1.1 The quest(ion)
"I want computers to do more for us"
"Ok, watch this!" ...Bäng! Bäng! Bäng! ...
"Oops, my bad; let me rephrase... (well, the hammer application indeed looks cool in a way)"
  o See the YouTube video by ruggedtabletpc at http://www.youtube.com/watch?v=5DD0G6S5Kf0&feature=player_embedded (just watch, don't try)
1.2 The question rephrased...
I want computers to do smarter things than mere paper sketchbooks & mail – I want computers (and people) exchanging "information", not just "documents"
For instance: I want to plug (information from) my camera into my calendar, it into my email, it into my trip planner, it into my ... (or yours)
(Painting: The Construction of the Tower of Babel by Hendrick van Cleve)
1.3 The answer
Short answer: Do (all) things in (the same) machine-processable form
  o Enrich data to do more
  o Longer answer follows...
Some other approaches:
  o Develop more perceptive systems with common sense (e.g. reinforcement learning)
  o Wait until ... [add your favourite advancement here]
  o Downsize wish-list (perfectly happy with old office apps?)
  o Ignore the question (e.g. let people do the computing)
Not exclusive (and sometimes computers outperform people)
1.4 About this course (beyond this intro)
Resource Description Framework (RDF) and the related standards
The SPARQL query language
Schemas, ontologies (and notes about rules)
Basic logical ideas (modelling & reasoning)
Basics of SemWeb programming
Vocabulary and application examples
Observations, design notes
The aim is to see a "big picture", so not all topics are covered in depth; assignments also play an important role here
1.5 "Semantic ..." – what?
Information systems (e.g. databases) are associated with
  o Instance data (tuples, records)
  o Terminology (fields, vocabularies)
  o Schemas (ontologies, datatypes)
Semantic applications work beyond mere instance data
  o The trend is making better or "smarter" applications by enriching data (cf. smart data vs. smart code)
  o Semantics may be utilised in program logic (custom apps, components), data, (user) interface, or user concepts
Beware: "semantics" lies in the eye of the beholder
Related: Semantic computing, Linked Data, Web 3.0, ...
1.6 A nice example to start with: RSS
Really Simple Syndication (or RDF Site Summary)
Family of (XML etc.) web feed formats
Providers publish data feeds, requestors read/syndicate
Pros
  o Integrate & consume lots of data sources
  o Concrete applications, all sorts of (meta) data available
Cons
  o Both syntax and semantics(!) need to be widely agreed
  o Finding quality feeds, validating/evaluating information
  o Archiving and versioning, scalability
Notes: Data model, format, vocabularies, data & tools, people, social apps, ... "critical mass" needed for being useful
1.7 About (term) "Semantics"
Formal logic:
  o Analysing valid inference (interpretation function)
Linguistics:
  o Meaning of sentences (that pragmatics may still change)
Semantic Computing:
  o Useful descriptions for purposes of machine processing, "machine-understandable" data (with user intentions)
"Everyday" interpretation
  o Often vaguely used as "meaning" in general
[Add your favourite definition here]
1.8 Abstract use cases of semantic tech/apps
Knowledge externalisation (descriptions) and communication
Modelling and information analysis
Interface design, information integration
Search and navigation, visualisation
Semantic computing
Logical and heuristic inference (also knowledge validation)
Integrated, ontology-based (etc.) applications
...
Related: business intelligence, unified information spaces, (software/web) agents, semantic services, artificial intelligence (?), etc. [add your favourite buzzword here]
1.9 Challenges
Legacy aspects, business aspects, legal framework(s), small vs. large
Theory vs. engineering, expertise, interoperability and concrete data
Note: Picture retrieved from the web, original author unknown
Not everything "important" is (or will be?) machine-readable (cf. privacy, business models)
Overlapping, missing, and poor-quality data, junk/sabotage
Complexity, scalability, "meaning" may change over time
"Hard problems" (e.g. combinatorial explosion, common sense reasoning, human understandability, management & versioning)
Vague expectations: "I dislike computers because they simply follow orders – and people because they don't."
1.10 Semantic web (twisting semantics from wire)
A graph-based data model with annotated nodes and edges
When distributed, using common "names" may link subgraphs
Details matter for computers (model primitives and granularity, [file] format(s), names, physical distribution, retrieving, ...)
Interpretation is based on the understanding of the (standard) names and structures (names, types, classes, properties, ...)
Interpretation can also be partly formalised by adding more standard structures (e.g. "is-a" means class membership)
1.11 Semantic Web, really
When talking about semantic web, people usually mean...
  o Some restricted variant of the generic data model,
  o The W3C Semantic Web – specifications and standard-based models and tools, and/or
  o Systems that (claim to) incorporate all sorts of semantic descriptions and techniques (also non-W3C)
Again, the list is not exclusive
Standards provide a crucial backbone for authors and developers, but end-users mainly care about working applications with good content (cf. TV)
1.12 Semantic web applications
Software engineering, configuration, medicine, digital libraries, Web-based information systems
Web portals, multimedia collections, corporate web sites, design documentation, agents and services, ubiquitous computing
Visions ranging from eGovernment and global business rules to personal knowledge bookkeeping
[add your favourite domain/app here]
Typical problem: Nice demonstration X does not "work" in serious production use (cf. business proposition, legacy issues, critical mass, scaling, policies, ...)
1.13 Semantic web applications (Cont'd)
"Commonly cited" Semantic web (?) applications:
  o http://www.kulttuurisampo.fi/
  o http://semantic-mediawiki.org/wiki/Semantic_MediaWiki
  o http://images.search.yahoo.com/
  o http://www.evri.com/
  o http://linkeddata.org/
  o http://www.trueknowledge.com/
  o http://www.tripit.com/
  o http://www.zoominfo.com/
  o http://www.foaf-project.org/
  o http://dbpedia.org/sparql
More technical examples follow (not only Web applications)
1.14 Psst. Come back to these later...
Once RDF and OWL feel like a piece of cake, remember to visit these as well:
  o http://planetrdf.com/guide/, http://swoogle.umbc.edu/, http://owl.cs.manchester.ac.uk/repository/, http://www.hakia.com/, http://zitgist.com/, ...
...so that you don't have to develop, e.g., your models and applications from scratch
1.15 Three architectural ideas to start with
"Semantic Web" complements "Document Web" (for now)
Typical applications adapt/include legacy content with data retrieved as files, queried interactively, or accessed via some dedicated platform(s)
Some challenges:
  o Common semantics?
  o Access management
  o Legacy systems IO (Update?)
Missing something? (semantic services)
1.16 Conclusion
The "catch" lies in the machine-readable semantics, by which we mean the descriptions added for computers to enable more processing (e.g. the type and format of things)
The purpose of "metadata" is similar; we will see differences (self-referencing) when looking at techniques in more detail
Standards are needed for data interoperability and tools – social and business aspects also play a major role, legacy rules and large apps thus need a critical mass (iterations expected?)
Let us dive deeper... (more technical stuff to follow)
1.17 Wait! What about killer applications?
Feel free to disagree:
  o The killer application of the Web 1.0 was/is the Web browser (server)
  o The killer application in between was/is the web search engine (crawler)
  o The killer application of the Web 2.0 was/is social content (.)
  o The killer application for Semantic web/Web 3.0/... is [add your app here] (No-silos? Agents? Daily me? Webised Finland? Holodeck? Something we still call user agent?)
Trends: webised applications (travel, shops, eGov), mobility and location-awareness, sensors, clouds, "browsers" taking the role of operating systems, new devices and interfaces, ...
2 Semantic Web Concepts
General architecture
Syntaxes & RDF vocabulary
First modelling concepts & logical ideas
SemWeb and Linked data
2.1 W3C Semantic Web technology "layer cake"
A mixture of vision and planned work
Standardisation underway but not "finished"
Lots of "ready" technologies (specs), some still missing, and "grey areas" (logic, trust & access management)
+ Distributed vocabulary development
2.2 Basic RDF Concepts
Graph data model (of statements; Triple(s,p,o))
  o Subject, Predicate (Property), & Object
URI-based vocabulary (URIrefs & Blank nodes)
Datatypes
  o Lexical space, value space, and lexical-to-value mapping
  o rdf:XMLLiteral
Literals (plain [+optional language tag] or typed)
XML serialization syntax & expression of simple facts
(Graph) entailment
  o "(P and Q) gives P", "foo(baz) gives (exists (?x) foo(?x))"
  o More formally: "S entails E iff a subgraph of S is an instance of E" (Interpolation lemma)
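The interpolation lemma above can be tried out with a small brute-force check. The following is only a sketch under stated assumptions: triples are plain Python tuples of strings, any term starting with "_:" is treated as a blank node, and the names ex:baz, ex:Foo, ex:Bar and ex:name are hypothetical. It is not how a production RDF library tests entailment.

```python
from itertools import product

# A triple is a (subject, predicate, object) tuple of strings.
# Assumption for this sketch: terms starting with "_:" are blank nodes.
def is_blank(term):
    return term.startswith("_:")

def simply_entails(s, e):
    """Return True if graph s simply entails graph e.

    Brute force: try every mapping of e's blank nodes to terms of s and
    check whether the resulting instance of e is a subgraph of s.
    """
    blanks = sorted({t for triple in e for t in triple if is_blank(t)})
    terms = sorted({t for triple in s for t in triple})
    for values in product(terms, repeat=len(blanks)):
        mapping = dict(zip(blanks, values))
        instance = {tuple(mapping.get(t, t) for t in triple) for triple in e}
        if instance <= s:
            return True
    return False

# Hypothetical data: "P and Q"
S = {("ex:baz", "rdf:type", "ex:Foo"), ("ex:baz", "ex:name", '"Baz"')}
P_only = {("ex:baz", "rdf:type", "ex:Foo")}    # "(P and Q) gives P"
exists_x = {("_:x", "rdf:type", "ex:Foo")}     # "exists ?x . foo(?x)"
unrelated = {("_:x", "rdf:type", "ex:Bar")}    # nothing in S is an ex:Bar

print(simply_entails(S, P_only))
print(simply_entails(S, exists_x))
print(simply_entails(S, unrelated))
```

The search is exponential in the number of blank nodes, which is acceptable for slide-sized graphs only.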
2.3 Basic modelling principles
Distributed knowledge
No Unique Names Assumption
Open World Assumption (cf. monotonicity)
2.4 Basic modelling principles (Cont'd)
Data interchange is based on the common resource description framework ("anyone can describe anything")
  o Descriptions must follow the common data model
  o There is no separate "metadata" layer; descriptions can refer to descriptions (e.g. schema languages use this)
Uniform Resource Identifiers (URI) are used for e.g. resource and predicate names (also helps in decentralisation)
  o Locators (URL) may enable retrieval; besides document resources, URIs may also point to abstract things, such as concepts, or concrete things, such as physical objects
Reusing names enables semantic linkage & interoperability
No "global access", description quality is an issue
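To make "reusing names enables semantic linkage" concrete, here is a minimal sketch in which two independently written descriptions become linked simply because they use the same URI. All URIs and the ex: property names are hypothetical, and graphs are modelled as plain Python sets of triples.

```python
# Two independently published descriptions, each a set of
# (subject, predicate, object) triples. All names are hypothetical.
camera_graph = {
    ("http://example.org/photo/42", "ex:takenAt", "http://example.org/place/tampere"),
}
calendar_graph = {
    ("http://example.org/event/7", "ex:location", "http://example.org/place/tampere"),
}

# Merging graphs is set union (blank nodes would need renaming first;
# none are used in this sketch).
merged = camera_graph | calendar_graph

def linked_subjects(graph, resource):
    """Return the subjects in the graph that point at the given resource."""
    return sorted(s for (s, p, o) in graph if o == resource)

# The shared place URI now connects the photo and the calendar event.
print(linked_subjects(merged, "http://example.org/place/tampere"))
```

Had the two sources used different names for the same place, the merged graph would contain two disconnected subgraphs, which is the interoperability problem the slide refers to.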
2.5 RDF Hello World!
    _:x "Hello world!" .
Syntax, serialisation
Meaning (intended/intuitive/formal)
Storage
Usage
Correctness, data quality, provenance
Tools (authoring, validation, query, reasoning, application development, program logic, ...)
Community & business models
Legal framework
...
2.6 More examples (RDF Primer)
"There is a Person identified by http://www.w3.org/People/EM/contact#me, whose name is Eric Miller, whose email address is [email protected], and whose title is Dr."
Visual notation vs. serialisations vs. (entailed) statements
Formal meaning vs. "intended meaning" (underlying logical interpretation)
RDF is defined in several specifications (see SemWeb pubs)
2.7 Serialisation in RDF/XML
Eric Miller Dr.
Abbreviations, namespaces, XML markup (@xml:lang)
Subgraphs & merging may seem trivial at first (but...)
2.8 Serialisation in N3
    @prefix rdf: .
    @prefix contact: .
    @prefix em: .
    em:me rdf:type contact:Person .
    em:me contact:fullName "Eric Miller" .
    em:me contact:mailbox .
    em:me contact:personalTitle "Dr." .
Again, various abbreviations (i.e. naive parsing does not work)
While people usually talk vaguely about "N3", different versions of the triple syntax do exist
  o Turtle (Terse RDF Triple Language), N-Triples, Notation 3 (N3), SPARQL
2.9 Machine-Readability: "Authoring" vs. "modelling"
[Figure: authoring (XML/HTML/etc. serialisation) vs. modelling (e.g. persons:ossi misc:favourite ...)]
2.10 Revisit the RSS example
Graph syntax & serialisation
Vocabularies
Parsing (why Atom may seem attractive...)
Problems with certain primitive structures (is using e.g. rdf:Seq really such a great idea?)
Please note that RSS is "just" an example to start with
  o Some other easy-to-understand (vocabulary/application) examples: Dublin Core, FOAF, RDFa (we'll come back to these later, if time permits)
2.11 The standard (core) RDF vocabulary
rdf:type, rdf:Property, rdf:XMLLiteral, rdf:nil, rdf:List, rdf:Statement, rdf:subject, rdf:predicate, rdf:object, rdf:first, rdf:rest, rdf:Seq, rdf:Bag, rdf:Alt, rdf:_1, rdf:_2, ..., rdf:value
RDF interpretations include the RDF axiomatic triples:
    rdf:type rdf:type rdf:Property .
    rdf:subject rdf:type rdf:Property .
    rdf:predicate rdf:type rdf:Property .
    rdf:object rdf:type rdf:Property .
    rdf:first rdf:type rdf:Property .
    rdf:rest rdf:type rdf:Property .
    rdf:value rdf:type rdf:Property .
    rdf:_1 rdf:type rdf:Property .
    rdf:_2 rdf:type rdf:Property .
    ...
    rdf:nil rdf:type rdf:List .
2.12 Misc logic notes (compare with Logic 101 course)
Recall the two tasks of logic ("in general")
  o Descriptive task, deductive task
Syntax, semantics, proofs, object language vs. meta language
But for RDF...
  o No type categories, no (direct) negation, no (direct) disjunction, no n-ary predicates (best practices), ...
  o Order of statements (cf. First order logic!)
  o Reification, containers, standard notions for "classes", etc.
  o In most (?) applications only RDF subsets are used (description logic)
Beware: Logical interpretation is always there, but some RDF apps use only the syntactic part in their interpretation(s)...
2.13 Useful comparisons (Hebeler et al., 2009)
    Feature               | WWW                     | Semantic Web
    ----------------------+-------------------------+------------------------------
    Fundamental component | Unstructured content    | Formal statements
    Primary audience      | Humans                  | Applications
    Links                 | Indicate location       | Indicate location and meaning
    Primary vocabulary    | Formatting instructions | Semantics and logic
    Logic                 | Informal/nonstandard    | Description logic
"But":
SemWeb is a part/aspect of the WWW
SemWeb is not the only way to express semantics and logic
Not all SemWeb applications actually adopt description logic
2.14 Useful comparisons, cont'd (Hebeler et al., 2009)
    Feature                 | Relational database           | Knowledgebase
    ------------------------+-------------------------------+------------------------
    Structure               | Schema                        | Ontology statements
    Data                    | Rows                          | Instance statements
    Administration language | DDL                           | Ontology statements
    Query language          | SQL                           | SPARQL
    Relationships           | Foreign keys                  | Multidimensional
    Logic                   | External of database/triggers | Formal logic statements
    Uniqueness              | Key for table                 | URI
"But":
Knowledgebase may also include rules
Expressing uniqueness may vary, standards required
2.15 Conclusion (RDF)
Graph model & concepts, RDF (core) vocabulary
Tool support, special RDF parser required (no standard canonical syntax)
Communities & applications dictate which (other) vocabularies actually do become popular
  o See some resource guide
Competition (?): Topic Maps, UML, ...
2.16 Wait! What about linked (open) data?
"Best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF."
Tim Berners-Lee's (TBL) four rules for linked data:
  1. Use URIs as names for things.
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
  4. Include links to other URIs, so that they can discover more things.
2.17 (TBL's) five star (linked) data?
★ Available on the web (whatever format), but with an open licence
★★ Available as machine-readable structured data (e.g. Excel instead of image scan of a table)
★★★ As (2), plus non-proprietary format (e.g. CSV instead of Excel)
★★★★ All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff
★★★★★ All the above, plus: link your data to other people's data to provide context
2.18 Research challenges (Bizer et al., to appear)
User Interfaces and Interaction Paradigms
Application Architectures
Schema Mapping and Data Fusion
Link Maintenance
Licensing
Trust, Quality and Relevance
Privacy
3 Query Basics
Introduction
Query use cases
SPARQL
Towards reasoning
3.1 How to use RDF data?
RDF data can be utilised in several ways, including
  o End-user use cases:
      Through mainstream applications (e.g. Kulttuurisampo)
      Part of existing file formats (.pdf ...RDF, XMP)
      As yet another new file format(s) (e.g. foaf.rdf)
  o Developer use cases:
      Through (specialised) applications (e.g. Protégé)
      Ad hoc programmatic processing (troubles ahead!)
      RDF/XML parsers/frameworks (RDFLib, Jena, ...)
      Via query engines
The query approach is nice: clean interface, familiar from database world in general, extends for reasoning
3.2 SPARQL (1.0)
SPARQL Query Language for RDF (weird "acronym")
Specifications for
  o Query syntax (basic I/O)
  o Results in XML and JSON
  o Protocol (i.e. as a web service, in WSDL 2.0)
Good for
  o Accessing data (also in chunks)
  o Very simple conditional processing/computations
  o Standard-based reasoning (extensions/plugins)
Standards still missing
  o Update, negation, functions, well-defined extensions, ...
  o SPARQL 1.1 underway...
3.3 A simple example
Data:
    @prefix foaf: .
    _:a foaf:name "Johnny Lee Outlaw" .
    _:a foaf:mbox .
    _:b foaf:name "Peter Goodguy" .
    _:b foaf:mbox .
    _:c foaf:mbox .
Query:
    PREFIX foaf:
    SELECT ?name ?mbox
    WHERE
    {
      ?x foaf:name ?name .
      ?x foaf:mbox ?mbox
    }
3.4 A simple example (Cont'd)
Result (in tabular form):
    name                | mbox
    --------------------+------
    "Johnny Lee Outlaw" |
    "Peter Goodguy"     |
Notes:
  o Query string structure
  o Variables in graph pattern matching
  o Implicit source was assumed
  o In reality, one might rely on result set iterator
3.5 Notes about SPARQL tools
Some obvious use cases
  o Command-line & GUI tools (e.g. ARQ for Jena, Twinkle)
  o Frameworks and libraries (e.g. RDFLib, Jena, Sesame)
  o Web apps (e.g. ARQ SPARQLer - An RDF Query Demo, OpenLink Virtuoso SPARQL Query)
  o Servers (e.g. Joseki)
  o Embedded tools (e.g. SPARQL query panel in Protégé)
Tool-specific extensions available, some being standardised as we speak
3.6 More than meets the eye
SPARQL is not too different from SQL, but based on a network data model
However, with SPARQL, one can
  o Make queries based on purely syntactic structures
  o ...also including entailed statements due to the standard knowledge representation languages (RDFS, OWL)
  o ...also including some domain-specific vocabularies (such as in spatial or temporal reasoning)
As a consequence, an "empty" model may well yield (a potentially infinite set of axiomatic) triples
3.7 Basic concepts
Query processor, (source) data (graph(s)), query (string), solution, (query) results, application
Queries are intuitively based on graph pattern matching
SPARQL is essentially a low-level, text-based protocol
Enables many kinds of applications, including the commonly cited semantic search, but additional efforts needed for:
  o GUI
  o Intuitive query interface (e.g. facets) and end-user concepts (query language)
  o Data and context management and quality control
  o Computations and domain-specific heuristics
3.8 Query forms (not just tabular results...)
SELECT
CONSTRUCT
ASK
DESCRIBE
3.9 CONSTRUCT example
Query:
    PREFIX foaf:
    PREFIX vcard:
    CONSTRUCT { vcard:FN ?name }
    WHERE { ?x foaf:name ?name }
Result:
    @prefix vcard: .
    vcard:FN "Alice" .
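The idea behind CONSTRUCT (match a pattern, then instantiate a template once per solution) can be sketched as follows. This is an assumption-laden toy: it supports a single triple pattern only, the template reuses ?x as its subject (the subject IRI in the slide's template is not legible), and ex:alice and the mailto: value are hypothetical.

```python
def construct(graph, where, template):
    """Instantiate template once per triple matching the single where-pattern."""
    result = set()
    for triple in graph:
        binding = {}
        ok = True
        for p, t in zip(where, triple):
            if p.startswith("?"):
                # Bind the variable, or reject on a conflicting binding.
                if binding.setdefault(p, t) != t:
                    ok = False
            elif p != t:
                ok = False
        if ok:
            # Replace variables in the template by their bound values.
            result.add(tuple(binding.get(term, term) for term in template))
    return result

# Hypothetical source data.
graph = {
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:alice", "foaf:mbox", "<mailto:alice@example.org>"),
}
out = construct(
    graph,
    where=("?x", "foaf:name", "?name"),
    template=("?x", "vcard:FN", "?name"),
)
print(out)
```

Unlike SELECT, the result is itself a graph (a set of triples), so it can be merged into another graph or queried again.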
3.10 ASK example
Data:
    @prefix foaf: .
    _:a foaf:name "Alice" .
    _:a foaf:homepage .
    _:b foaf:name "Bob" .
    _:b foaf:mbox .
Query:
    PREFIX foaf:
    ASK { ?x foaf:name "Alice" }
Result:
    yes
3.11 DESCRIBE example
Query:
    PREFIX ent:
    DESCRIBE ?x WHERE { ?x ent:employeeId "1234" }
Result:
    @prefix foaf: .
    @prefix vcard: .
    @prefix exOrg: .
    @prefix rdf: .
    @prefix owl: .
    _:a exOrg:employeeId "1234" ;
        foaf:mbox_sha1sum "ABCD1234" ;
        vcard:N [ vcard:Family "Smith" ;
                  vcard:Given "John" ] .
    foaf:mbox_sha1sum rdf:type owl:InverseFunctionalProperty .
3.12 Basic SPARQL language constructs
Literals, blank nodes, lists, RDF collections
Graph patterns
  o Grouping, FILTER, OPTIONAL, UNION (alternatives)
Datasets
  o FROM (default graph)
  o FROM NAMED + GRAPH (named graph)
Solution sequences and modifiers
  o Order, Projection, Distinct, Reduced, Offset, Limit
3.13 Operators
Unary operators: !A, +A, -A
Tests: bound(A), isIRI(A), ...
Accessors: STR(A), LANG(A), ...
Connectives: A||B, A&&B
XPath tests: A=B, A!=B, ...
XPath arithmetic: A*B, A/B, ...
SPARQL tests: A=B, A!=B, sameTerm(A,B), langMatches(A,B), REGEX(STRING, PATTERN, FLAGS?)
3.14 About protocols and serialisations
SPARQL endpoints (SPARQL Protocol for RDF)
XML result syntax (SPARQL Query Results XML Format)
3.15 SPARQL (1.1) update
Extension to
  o Insert new triples to an RDF graph
  o Delete triples from an RDF graph
  o Perform a group of update operations as a single action
  o Create a new RDF Graph in a Graph Store
  o Delete an RDF graph from a Graph Store
Examples
    LOAD [ INTO ]
    DELETE [ FROM ]* { template } [ WHERE { pattern } ]
    INSERT [ INTO ]* { template } [ WHERE { pattern } ]
    CLEAR [ GRAPH ]
    DROP [ SILENT ] GRAPH
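The effect of the update operations listed above can be pictured on a toy graph store. This is a rough sketch under stated assumptions, not SPARQL Update itself: the store is a Python dict from graph names to sets of triples, WHERE patterns are not modelled, and all names (ex:book1, dc:title, ex:archive) are hypothetical.

```python
def delete_insert(graph, delete=(), insert=()):
    """Apply deletions, then insertions, as one action; returns a new graph."""
    return (set(graph) - set(delete)) | set(insert)

# A toy graph store: graph name -> set of triples (hypothetical data).
store = {"default": {("ex:book1", "dc:title", '"Old title"')}}

# Delete and insert triples as a single action.
store["default"] = delete_insert(
    store["default"],
    delete=[("ex:book1", "dc:title", '"Old title"')],
    insert=[("ex:book1", "dc:title", '"New title"')],
)

# Create a new (empty) graph in the store, then drop it again.
store["ex:archive"] = set()
del store["ex:archive"]

print(store)
```

Returning a new set rather than mutating in place makes it easy to keep the old graph around if the group of operations has to be rolled back.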
3.16 Reasoning?
(Choosing appropriate) method of entailment defines the search results – this might be called "reasoning"
Entailments typically fall into the following order
  o Simple entailment (graph structure)
  o RDF entailment (RDF vocabulary entailment; recall the RDF axiomatic triples)
  o RDFS entailment (standard schema constructs)
  o D-entailment (datatype entailment, ...)
  o "OWL 2 entailments" (standard ontology constructs)
3.17 Using a reasoner
Using a reasoner may depend on the specific toolkit (separate reasoners available)
Jena example (above figure from Jena2 documentation):
  o Create a Model (request InfModel from ModelFactory associated with a Reasoner, e.g. RDFS Reasoner)
  o Query the Model; the Results (may) now also include the entailed subgraphs
(Same pattern applies also to programmatic Model access)
3.18 Conclusion
SPARQL is nice, but essentially a low-level protocol – the intuitively "bigger" Semantic Search needs more
Naively reading 3rd party RDF graphs may introduce severe problems
  o Exceptions and (syntax) error management?
  o Problems of telepathic communication?
  o Database statistics? (E.g. how "big" is the database?)
  o Versioning and "metadata"?
One strategy to (try to) escape these problems is to execute (SPARQL) queries "within code" (and use the cmd-line tools "only for debugging")
Other query languages also exist, but we ignore them for now
4 Knowledge Representation Basics
Introduction to knowledge representation
RDF Schema
Modelling basics (and issues)
4.1 Introduction
How should one interpret RDF data with proprietary names?
    _:x ex:email .
Alternatives for interpretation, based on...
  o "informal meaning" (and structure),
  o narrative definition (e.g. a spec in English), or
  o interpretation with respect to a (formal) Knowledge Representation system (usually the above and then some)
A Knowledge Representation (KR) system provides a standard way to describe, e.g.:
  o Concepts, attributes, taxonomies, relationships, functions, axioms, instances, and other knowledge components
4.2 Example
"ex:email is a Property that allows capturing the email address of a given Person."
Notes
  o What does it mean?
  o Is it precise? (How to give, e.g., the email address?)
  o What elements were used in the (analytical) definition? (What do "Property" and "Person" mean?)
  o How to communicate it?
  o What if we ignore it? (cf. "[validity] assertions")
4.3 KR use cases
(Informal) communication
Instance validation (and analysis)
Knowledge model validation (and analysis)
Derived specifications
Inference
...and the related applications (query, semantic integration, reasoning, ...)
4.4 Overview of KR for the Semantic Web
The standard core includes several KR languages or related systems:
  1. RDF Vocabulary Description Language 1.0: RDF Schema (RDFS)
  2. Simple Knowledge Organization System (SKOS)
  3. Web Ontology Language (OWL)
Historically, these were defined in the order 1, 3, 2 (+revised)
4.5 RDF Schema (RDFS)
The basic concepts of RDF Schema (RDF Vocabulary Description Language 1.0) are classes and properties
  o (Class names are usually capitalised, properties are not)
RDFS is a semantic extension of RDF (itself also an RDF vocabulary)
  o Refines the definition of the RDF (core) vocabulary
  o Also provides some utility definitions
Note that RDFS interpretation is actually defined in the (a bit formal) RDF Semantics specification (but the RDFS spec is more readable for most people)
4.6 New RDFS classes
New classes:
  o rdfs:Resource
  o rdfs:Class
  o rdfs:Literal
  o rdfs:Datatype
Also, recall the ones from the RDF vocabulary:
  o rdf:XMLLiteral
  o rdf:Property
Example:
    e:ed rdf:type e:Bugist .
    e:Bugist rdf:type rdfs:Class .
4.7 New RDFS properties
New predicates:
  o rdfs:range (think of a relation as $\mathrm{Rel}: \mathrm{Domain} \to \mathrm{Range}$)
  o rdfs:domain
  o rdfs:subClassOf (thinking subsets may help...)
  o rdfs:subPropertyOf (think as: $\mathrm{Rel}_1 \subseteq \mathrm{Rel}_2$)
New (utility) predicates
  o rdfs:label, rdfs:comment, rdfs:seeAlso, rdfs:isDefinedBy, rdf:value
Also, recall from the RDF vocabulary:
  o rdf:type
rdfs:subClassOf and rdfs:subPropertyOf are transitive:
    $\mathrm{subClassOf}(A, B) \land \mathrm{subClassOf}(B, C) \Rightarrow \mathrm{subClassOf}(A, C)$
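Transitivity of rdfs:subClassOf means a reasoner has to compute a closure over the asserted pairs. A minimal sketch of that step follows, assuming the hierarchy is given as a set of (subclass, superclass) pairs; e:Vehicle is a hypothetical class added only to give the chain a third element.

```python
def transitive_closure(pairs):
    """Return the transitive closure of a set of (a, b) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over snapshots so the set can grow inside the loop.
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# e:Vehicle is a hypothetical superclass for illustration.
sub_class_of = {
    ("e:Van", "e:MotorVehicle"),
    ("e:MotorVehicle", "e:Vehicle"),
}
closure = transitive_closure(sub_class_of)
print(sorted(closure))
```

The same function works unchanged for rdfs:subPropertyOf pairs, since the rule has the same shape.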
4.8 Other RDFS vocabulary
New concepts:
  o rdfs:Container, rdfs:ContainerMembershipProperty, rdfs:member
Also, recall from the RDF vocabulary:
  o rdf:Bag, rdf:Seq, rdf:Alt, rdf:List, rdf:first, rdf:rest, rdf:nil, rdf:Statement, rdf:subject, rdf:predicate, rdf:object
4.9 Examples (assume prefixes)
    e:StaffMember   rdfs:subClassOf    e:Person .
    e:Van           rdfs:subClassOf    e:MotorVehicle .
    e:MotorVehicle  rdfs:seeAlso       dict:Car .
    e:primaryDriver rdfs:subPropertyOf e:driver .
    e:primaryDriver rdfs:domain        e:MotorVehicle .
    e:primaryDriver rdfs:range         e:StaffMember .
    e:primaryDriver rdfs:comment       "The first driver according to the company default car insurance form." .
    ac:ed           rdf:type           e:StaffMember .
    ac:ed           rdfs:label         "Ed Wellersby" .
    ac:HU-71        rdf:type           e:Van .
    ac:HU-71        e:primaryDriver    ac:ed .
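What an RDFS reasoner could derive from these example statements can be sketched with a handful of the standard entailment rules (rdfs2, rdfs3, rdfs5, rdfs7, rdfs9, rdfs11). This is a simplified illustration, not a complete RDFS reasoner: axiomatic triples are ignored, the literal-valued statements (rdfs:label, rdfs:comment) and rdfs:seeAlso are left out, and the super-property is assumed to be named e:driver.

```python
TYPE, SUBCLASS, SUBPROP = "rdf:type", "rdfs:subClassOf", "rdfs:subPropertyOf"
DOMAIN, RANGE = "rdfs:domain", "rdfs:range"

def rdfs_closure(graph):
    """Apply a subset of the RDFS entailment rules until nothing new appears."""
    g = set(graph)
    while True:
        new = set()
        for (s, p, o) in g:
            for (s2, p2, o2) in g:
                if p2 == DOMAIN and s2 == p:       # rdfs2: domain typing
                    new.add((s, TYPE, o2))
                if p2 == RANGE and s2 == p:        # rdfs3: range typing
                    new.add((o, TYPE, o2))
                if p2 == SUBPROP and s2 == p:      # rdfs7: super-property
                    new.add((s, o2, o))
                if p == TYPE and p2 == SUBCLASS and s2 == o:      # rdfs9
                    new.add((s, TYPE, o2))
                if p == SUBCLASS and p2 == SUBCLASS and s2 == o:  # rdfs11
                    new.add((s, SUBCLASS, o2))
                if p == SUBPROP and p2 == SUBPROP and s2 == o:    # rdfs5
                    new.add((s, SUBPROP, o2))
        if new <= g:
            return g
        g |= new

graph = {
    ("e:StaffMember", SUBCLASS, "e:Person"),
    ("e:Van", SUBCLASS, "e:MotorVehicle"),
    ("e:primaryDriver", SUBPROP, "e:driver"),   # assumed prefix e:
    ("e:primaryDriver", DOMAIN, "e:MotorVehicle"),
    ("e:primaryDriver", RANGE, "e:StaffMember"),
    ("ac:ed", TYPE, "e:StaffMember"),
    ("ac:HU-71", TYPE, "e:Van"),
    ("ac:HU-71", "e:primaryDriver", "ac:ed"),
}
inferred = rdfs_closure(graph) - graph
for triple in sorted(inferred):
    print(triple)
```

Among the derived statements are that ac:ed is an e:Person, that ac:HU-71 is an e:MotorVehicle, and that ac:ed is an e:driver of ac:HU-71, none of which were asserted directly.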
4.10 Model considerations
Good models encode practically useful information
Model components
  o Terminology
  o (Instance data) Assertions
Typically several physical components
[Figure: the model split into two files. company_drivers_schema.rdf ("Terminology") holds e:driver, e:primaryDriver, e:MotorVehicle, e:Person, e:Van and e:StaffMember, connected by rdfs:subPropertyOf, rdfs:domain, rdfs:range and rdfs:subClassOf. acme_drivers.rdf ("Instance data assertions") holds ac:HU-71 and ac:ed, connected by e:primaryDriver and typed via rdf:type.]
4.11 Useful modelling principles (Russell & Norvig, 1997)
Decide what to talk about
Decide on a vocabulary of predicates, functions, and constants
Encode general knowledge about the domain
Encode a description of the specific problem instance
Pose queries to the inference procedure and get answers
4.12 Notes
Note that the notion of a "class" is a bit different in SemWeb and in, e.g., Object-Oriented design (properties are universal)
Properties do not need to be "functional", etc.
A resource may belong to several classes (or properties)
A resource may be "simultaneously" a class, a property, and an instance, and it may have several names
RDFS includes axiomatic triples; these allow e.g. inferring types of objects
Intuitively, RDFS follows the monotonic reasoning paradigm
Later, we will see that OWL defines the following concepts: owl:Thing, owl:Nothing
4.13 Notes (Cont'd)
As a KR, however, RDFS is quite weak
  o Intuitively rich enough for defining simple taxonomies
  o Useful also for documenting purposes
But since it is simple and nicely introduces many of the SemWeb KR basic concepts – and later reappears in OWL in a restricted form – knowing it is essential
In addition, there are also some well-known variants with more practical value in between the RDFS and OWL languages, such as RDFS-Plus (RDFS + some OWL concepts; see Allemang et al., 2008)
4.14 Some highlights from the RDFS axiomatic triples
rdf:type rdfs:domain rdfs:Resource .
rdf:type rdfs:range rdfs:Class .
...
(rdfs:Resource rdf:type rdfs:Class .)
(rdfs:Class rdf:type rdfs:Class .)
...
rdfs:domain rdfs:domain rdf:Property .
...
rdf:first rdfs:domain rdf:List .
rdf:rest rdfs:domain rdf:List .
rdfs:seeAlso rdfs:domain rdfs:Resource .
rdfs:isDefinedBy rdfs:domain rdfs:Resource .
...
4.15 Summary of RDF classes (see the spec)
rdfs:Resource, rdfs:Literal, rdf:XMLLiteral, rdfs:Class, rdf:Property, rdfs:Datatype, rdf:Statement, rdf:Bag, rdf:Seq, rdf:Alt, rdfs:Container, rdfs:ContainerMembershipProperty, rdf:List
4.16 Summary of RDF properties (see the spec)
rdf:type, rdfs:subClassOf, rdfs:subPropertyOf, rdfs:domain, rdfs:range, rdfs:label, rdfs:comment, rdfs:member, rdf:first, rdf:rest, rdfs:seeAlso, rdfs:isDefinedBy, rdf:value, rdf:subject, rdf:predicate, rdf:object
4.17 Ok: implications for modelling?
Modelling = representing phenomena with respect to some modelling language plus appropriate requirements/constraints (now a KR system + suitable terminology)
With (logical) RDFS this means
  1. Developing (practical) instance data assertions, and
  2. Developing (useful, commonly used) terminologies
  3. ...using the RDF primitives and the RDF and RDFS vocabularies
Considering what was presented above, we see that, strictly speaking, RDF modelling comes in two (related) flavours: modelling of graph structures (1) and modelling of terminologies (2)
4.18 Therefore: some modelling patterns are better than others
The distinction between modelling graph structure and modelling terminology may seem subtle, but from the perspective of applications and KR systems, graph structures should be chosen so that
  1. they can be queried, and
  2. the standard primitives of the given KR system are applicable (revisit this question when familiar with OWL)
RDFS already demonstrates that this can be tricky... (Things improve when richer KRs are introduced, but the basic problem setting remains!)
4.19 Instance data examples: Relational data
Relational data ("table" about multiple vehicles)
e:Vehicle rdf:type e:Dataset .
_:x1 rdf:type e:Vehicle .
_:x1 e:regPlate "HU-71" .
_:x1 e:weight "3000" .
_:x1 e:license "B" .
_:x2 rdf:type e:Vehicle .
_:x2 e:regPlate "AU-30" .
_:x2 e:weight "5000" .
_:x2 e:license "C" .
...
4.20 Instance data examples: N-ary relations
Modelling N-ary relations ("N-valued properties") requires some thinking (for N=3, think "(a,b,c)", "a(b,c)", etc.)
Approach 1: Explicit resource representing the relation
e:HU-71 e:hasDrivers _:driverRelation .
_:driverRelation e:primaryDriver e:ed .
_:driverRelation e:secondaryDriver e:tim .
_:driverRelation e:insurance e:InsuranceComp1 .
Approach 2: RDF collections (when order matters)
@prefix e: .
e:HU-71 e:hasDrivers (
  [ e:primaryDriver e:ed ]
  [ e:secondaryDriver e:tim ]
  [ e:insurance e:InsuranceComp1 ] ) .
4.21 "Issues" with RDFS Intuitively, we would like to give context to assertions --- but RDFS class and property assertions are universal Also, according to RDF Semantics, RDFS is for asserting additional information about the domain (not for "validating") To highlight the latter, let us next consider two examples: o Issues with domain specifications o Issues with class specifications
4.22 Objects of the same type by accident? Part 1
Consider the following statements:
ex:weight rdfs:range xsd:decimal .
ex:weight rdfs:domain ex:Book .
ex:weight rdfs:domain ex:MotorVehicle .
_:x ex:weight "85" .
Conclusion: _:x is both a book and a motor vehicle (!)
Potential solution(s):
  o Property hierarchy (different "weights" for different classes, each a subproperty of "the" concept of weight)
  o Model as little as possible (range is more interesting?)
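The "accidental typing" above follows from the RDFS domain entailment rule: if p has rdfs:domain C and s p o holds, then s rdf:type C is entailed. The following is only a minimal sketch of that single rule in plain Java, not a reasoner; the class name DomainEntailment, the method inferTypes, and the representation of triples as lists of three strings are assumptions made for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of Jena): one pass of the RDFS domain rule.
final class DomainEntailment {

    /** (p rdfs:domain C) and (s p o) entail (s rdf:type C). */
    static Set<List<String>> inferTypes(List<List<String>> triples) {
        Set<List<String>> inferred = new LinkedHashSet<>();
        for (List<String> schema : triples) {
            if (!schema.get(1).equals("rdfs:domain")) {
                continue; // only domain declarations drive this rule
            }
            for (List<String> data : triples) {
                if (data.get(1).equals(schema.get(0))) {
                    inferred.add(List.of(data.get(0), "rdf:type", schema.get(2)));
                }
            }
        }
        return inferred;
    }

    public static void main(String[] args) {
        List<List<String>> graph = List.of(
            List.of("ex:weight", "rdfs:domain", "ex:Book"),
            List.of("ex:weight", "rdfs:domain", "ex:MotorVehicle"),
            List.of("_:x", "ex:weight", "85"));
        // Prints one rdf:type triple per domain declaration of ex:weight.
        for (List<String> t : inferTypes(graph)) {
            System.out.println(String.join(" ", t) + " .");
        }
    }
}
```

Both domain declarations fire for the same subject, so _:x receives both types. Nothing is "validated" or rejected, which is the point of the slide.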
4.23 Objects of the same type by accident? Part 2
Consider the following statements:
ex:Professor rdfs:subClassOf ex:Teacher .
_:x rdf:type ex:Professor .
Conclusion: _:x is both a professor and a teacher, but what if the concepts do not overlap completely?
Potential solution(s):
  o Again, a class hierarchy (add a common class, e.g., Staff, which is the superclass of both Professor and Teacher)
  o Model as little as possible (what was the use case of the subclass anyway?)
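To compare the original model with the suggested "common superclass" fix, the sketch below follows rdfs:subClassOf links upwards from an asserted class. It is a hedged illustration in plain Java; the names SubclassEntailment and typesOf, the map-based schema, and the class ex:Staff are assumptions, not taken from any library.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: types entailed by one rdf:type assertion plus rdfs:subClassOf.
final class SubclassEntailment {

    /** Returns the asserted class and every class reachable via rdfs:subClassOf. */
    static Set<String> typesOf(String assertedClass, Map<String, List<String>> superClasses) {
        Set<String> types = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(assertedClass);
        while (!pending.isEmpty()) {
            String c = pending.pop();
            if (types.add(c)) { // visit each class once, so cycles terminate
                for (String sup : superClasses.getOrDefault(c, List.of())) {
                    pending.push(sup);
                }
            }
        }
        return types;
    }

    public static void main(String[] args) {
        // Model from the slide: every professor is entailed to be a teacher.
        Map<String, List<String>> slideModel =
            Map.of("ex:Professor", List.of("ex:Teacher"));
        System.out.println(typesOf("ex:Professor", slideModel));
        // prints [ex:Professor, ex:Teacher]

        // Suggested alternative: a common superclass (ex:Staff is an assumed name).
        Map<String, List<String>> staffModel = Map.of(
            "ex:Professor", List.of("ex:Staff"),
            "ex:Teacher", List.of("ex:Staff"));
        System.out.println(typesOf("ex:Professor", staffModel));
        // prints [ex:Professor, ex:Staff]
    }
}
```

With the common superclass, a professor is still a staff member but is no longer forced to be a teacher.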
4.24 Conclusion
Knowledge representation systems/languages introduce common concepts with which semantics may be communicated
RDF Schema (RDFS) essentially introduces the concepts of class and property
However, since classes and properties are universal and are meant for inferring additional facts about the domain, naive models may introduce unwanted statements (remember monotonicity)
Other Semantic Web KR systems introduce more useful concepts
5 Practical Matters (Programming) Motivation Part I: Simple Knowledge Organization System (SKOS) Part II: Simple RDF programming with Java/Jena
5.1 Introduction
Entailments can be tricky; sometimes a simple "skeleton" for concepts/objects is sufficient (without too much logic)
  o ...SKOS
The query pattern is so powerful that many applications can be built solely on SPARQL, but sometimes hands-on RDF manipulation is required (e.g. modifying models using sophisticated computations)
  o ...accessing the RDF graph, e.g., with Java
These ideas typically emerge in practical "Semantic Web programming" (which complements the logical definition)
5.2 Part I: Concept structures "Concepts are the units of thought – ideas, meanings, or (categories of) objects and events – which underlie many knowledge organization systems. As such, concepts exist in the mind as abstract entities which are independent of the terms used to label them." (SKOS Primer)
5.3 Simple Knowledge Organization System (SKOS)
SKOS Simple Knowledge Organization System Reference & Primer (W3C standard & note, 2009): SKOS provides a model for expressing the basic structure and content of concept schemes such as
  o thesauri,
  o classification schemes,
  o subject heading lists,
  o taxonomies,
  o folksonomies,
  o and other similar types of controlled vocabulary
5.4 Start simple and grow as needed
In basic SKOS, conceptual resources (concepts) are identified with URIs, labelled with strings in one or more natural languages, documented with various types of note, semantically related to each other in informal hierarchies and association networks, and aggregated into concept schemes.
In advanced SKOS, conceptual resources can be mapped across concept schemes and grouped into labeled or ordered collections. Relationships can be specified between concept labels. Finally, the SKOS vocabulary itself can be extended to suit the needs of particular communities of practice or combined with other modelling vocabularies.
5.5 Example: A simple animal thesaurus
ex:animalThesaurus rdf:type skos:ConceptScheme ;
  dct:title "Simple animal thesaurus" ;
  skos:hasTopConcept ex:animals ;
  dct:creator ex:antoineIsaac .
ex:animals rdf:type skos:Concept ;
  skos:prefLabel "animals"@en ;
  skos:narrower ex:mammals ;
  skos:inScheme ex:animalThesaurus .
ex:mammals rdf:type skos:Concept ;
  skos:prefLabel "mammals"@en ;
  skos:broader ex:animals ;
  skos:inScheme ex:animalThesaurus .
ex:dogs rdf:type skos:Concept ;
  skos:prefLabel "dogs"@en ;
  skos:broader ex:mammals ;
  skos:inScheme ex:animalThesaurus .
app:someDogArticle rdf:type foaf:Document ;
  dct:subject ex:dogs .
5.6 Notes
Concept schemes may also have an identity and descriptions
One can refer to (reuse) existing definitions made by others (URIRefs)
While, e.g., skos:broader and skos:narrower allow heuristic reasoning, SKOS does not declare them transitive (cf. rdfs:subClassOf, which is transitive)
  o If transitivity is needed, use skos:broaderTransitive and skos:narrowerTransitive instead/in addition
Applications typically create their concept backbone with SKOS and extend from there
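The difference between the direct skos:broader links and their transitive reading can be sketched without any RDF library. The example below hard-codes the skos:broader assertions of the animal thesaurus; the class SkosBroader and its methods are assumed names for illustration, and a real application would read the links from an RDF model instead.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: direct vs. transitive "broader" over a fixed thesaurus.
final class SkosBroader {

    // skos:broader assertions from the animal thesaurus example.
    static final Map<String, List<String>> BROADER = Map.of(
        "ex:dogs", List.of("ex:mammals"),
        "ex:mammals", List.of("ex:animals"));

    /** Directly asserted broader concepts only (no transitivity). */
    static List<String> broader(String concept) {
        return BROADER.getOrDefault(concept, List.of());
    }

    /** All ancestors, i.e. what a transitive reading of the links would give. */
    static Set<String> broaderTransitive(String concept) {
        Set<String> result = new LinkedHashSet<>();
        collect(concept, result);
        return result;
    }

    private static void collect(String concept, Set<String> seen) {
        for (String b : broader(concept)) {
            if (seen.add(b)) { // skip concepts already collected
                collect(b, seen);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(broader("ex:dogs"));           // prints [ex:mammals]
        System.out.println(broaderTransitive("ex:dogs")); // prints [ex:mammals, ex:animals]
    }
}
```

An application that wants "all documents about animals" to include app:someDogArticle needs the second, transitive reading.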
5.7 SKOS vocabulary at a glance... skos:Concept, skos:ConceptScheme, skos:inScheme, skos:hasTopConcept, skos:topConceptOf, skos:altLabel, skos:hiddenLabel, skos:prefLabel, skos:notation, skos:changeNote, skos:definition, skos:editorialNote, skos:example, skos:historyNote, skos:note, skos:scopeNote, skos:broader, skos:broaderTransitive, skos:narrower, skos:narrowerTransitive, skos:related, skos:semanticRelation, skos:Collection, skos:OrderedCollection, skos:member, skos:memberList, skos:broadMatch, skos:closeMatch, skos:exactMatch, skos:mappingRelation, skos:narrowMatch, skos:relatedMatch
5.8 Part II: RDF with Java
"Implementing too quickly, without first understanding the RDF data model, leads to frustration and disappointment. Yet studying the data model alone is dry stuff and often leads to tortuous metaphysical conundrums. It is better to approach understanding both the data model and how to use it in parallel. Learn a bit of the data model and try it out." (Jena documentation)
5.9 First architectural notes about development... Semantic Web Software Development Environment (Hebeler et al., 2009):
5.10 Jena (See jena.sourceforge.net)
Jena is a Java framework for building Semantic Web applications. It provides a programmatic environment for RDF, RDFS, OWL, and SPARQL, and includes a rule-based inference engine.
Jena is open source and grew out of work with the HP Labs Semantic Web Programme.
The Jena Framework includes:
  o An RDF API
  o Reading and writing RDF in RDF/XML, N3 and N-Triples
  o An OWL API
  o In-memory and persistent storage
  o A SPARQL query engine
5.11 A complete example (from Jena Tutorial03.java)
Create a model from scratch, add triples, and print it out:
package jena.examples.rdf ;

import com.hp.hpl.jena.rdf.model.*;
import com.hp.hpl.jena.vocabulary.*;

public class Tutorial03 extends Object {
  public static void main (String args[]) {
    // some definitions
    String personURI = "http://somewhere/JohnSmith";
    String givenName = "John";
    String familyName = "Smith";
    String fullName = givenName + " " + familyName;

    // create an empty model
    Model model = ModelFactory.createDefaultModel();

    // create the resource
    // and add the properties cascading style
    Resource johnSmith = model.createResource(personURI)
        .addProperty(VCARD.FN, fullName)
        .addProperty(VCARD.N,
            model.createResource()
                .addProperty(VCARD.Given, givenName)
                .addProperty(VCARD.Family, familyName));

    // list the statements in the graph
    StmtIterator iter = model.listStatements();

    // print out the subject, predicate and object of each statement
    while (iter.hasNext()) {
      Statement stmt = iter.nextStatement();     // get next statement
      Resource subject = stmt.getSubject();      // get the subject
      Property predicate = stmt.getPredicate();  // get the predicate
      RDFNode object = stmt.getObject();         // get the object

      System.out.print(subject.toString());
      System.out.print(" " + predicate.toString() + " ");
      if (object instanceof Resource) {
        System.out.print(object.toString());
      } else {
        // object is a literal
        System.out.print(" \"" + object.toString() + "\"");
      }
      System.out.println(" .");
    }
  }
}
5.12 Notes
RDF models captured as Java objects
  o fields & methods (class/interface Model)
  o .getX(), .setX(), .read(), .write(), iterators, etc.
  o exceptions (!)
Reasoner services available (with options!)
Effectively, one can programmatically do pretty much anything within the constraints of the RDF abstract data model...
5.13 More examples (See the Jena Introduction) Reading RDF Writing RDF Navigating a Model Querying a Model Operations on Models
5.14 SPARQL query in Java (see the ARQ docs)
Performing a simple SELECT query:
import com.hp.hpl.jena.query.* ;
import com.hp.hpl.jena.rdf.model.* ;  // Model, RDFNode, Resource, Literal

Model model = ... ;
String queryString = " .... " ;
Query query = QueryFactory.create(queryString) ;
QueryExecution qexec = QueryExecutionFactory.create(query, model) ;
try {
  ResultSet results = qexec.execSelect() ;
  for ( ; results.hasNext() ; ) {
    QuerySolution soln = results.nextSolution() ;
    RDFNode x = soln.get("varName") ;        // Get a result variable by name.
    Resource r = soln.getResource("VarR") ;  // Get a result variable; it must be a resource
    Literal l = soln.getLiteral("VarL") ;    // Get a result variable; it must be a literal
  }
} finally {
  qexec.close() ;  // always release the query execution
}
5.15 Conclusion
RDF is essentially a logical modelling system: the RDF (RDFS, ...) entailments "are always there"
In simple applications, however, a more syntactic perspective is often more convenient/practical to start with
A wise developer does not use the "technical" perspective to misuse/ignore/"override" the underlying logical machinery (this will surely backfire later when/if integrating semantic data)
When in doubt, use SPARQL...
6 Web Ontologies Basic concepts and some logic preliminaries Introduction to the Web Ontology Language (OWL) Editor(s) (Protégé) and repositories
6.1 Introduction
By a dictionary definition, ontology means the study of "being; that which is", or "basic categories of being" (cf. metaphysics, epistemology)
In applications, however, the term is used in a more restricted manner ...
  o "Ontology is a formal, explicit specification of a shared conceptualisation" (Gruber, 2002, etc.); or
  o "...The world can be described in terms of individuals and relationships between individuals. An ontology is a commitment to what exists in any particular task domain." (Poole et al., 1998)
6.2 An intuitive, "broad" ontology example
Top-level ontology of the world (Russell & Norvig 1995):
In most (?) applications, however, ontologies are used to describe more restricted domains...
6.3 The ontology or several (domain) ontologies?
By content and usage, one can differentiate different kinds of ontologies (Gómez-Pérez, Fernández-López & Corcho 2004):
  o Knowledge representation (KR) ontologies
  o General or common ontologies
  o Top-level or Upper-level ontologies
  o Domain ontologies
  o Task ontologies
  o Domain-Task ontologies
  o Method ontologies
  o Application ontologies
Also combinations (usually Upper-level + Domains, using KR)
6.4 Some logic preliminaries, part 1/3
For modelling, different logics exist which fall into various computability classes (Poole et al., 1998); decidability is nice, but efficiency is needed in applications (polynomial complexity)
[Figure: logics arranged by computability class. Class labels: undecidable + Turing equivalent, decidable, NP-hard, polynomial. Logics shown: FOPC, clausal form, Horn clauses, definite clauses, function-free FOPC, Datalog, propositional calculus, propositional clauses, 3-CNF, 2-CNF, propositional-definite, propositional database.]
6.5 Some logic preliminaries, part 2/3
Description logics (DL) establish decidable subset(s) of first-order predicate logic (but these can still be way too inefficient)
The nature of a particular description logic depends on its definition; the basis is typically AL (Attributive Language)
C, D →  A      |  (atomic concept)
        ⊤      |  (universal concept, cf. owl:Thing)
        ⊥      |  (bottom concept, cf. owl:Nothing)
        ¬A     |  (atomic negation)
        C ⊓ D  |  (intersection)
        ∀R.C   |  (value restriction)
        ∃R.⊤      (limited existential quantification).
6.6 Some logic preliminaries, part 3/3
Abstract DL introduces notation close to set theory, which allows relatively intuitive analytical definitions of classes (compare with OWL complex class definitions later)
Examples (in the ALCN language, where C ≡ UE)
- Parent ≡ Person ⊓ ∃hasChild.⊤ ~ a parent is a person with at least one child
- Person ⊓ ∀hasChild.⊥ ~ persons with no children
- Person ⊓ ∀hasChild.Female ~ persons with (only) female children
- Person ⊓ (≥1 hasChild ⊔ (≤3 hasChild ⊓ ∃hasChild.Female)) ~ persons with at least one child, or that have at most three children, one of which is female
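The set-theoretic reading of such class expressions can be tried out on a tiny, made-up interpretation. The sketch below evaluates three of the expressions over four individuals; the individuals (ann, bob, eve, tom), the class name DlSemantics, and its methods are all assumptions for illustration, and the code is a direct evaluation over closed data, not a DL reasoner.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

// Hypothetical, closed interpretation used only to illustrate set semantics.
final class DlSemantics {

    static final Set<String> PERSON = Set.of("ann", "bob", "eve", "tom");
    static final Set<String> FEMALE = Set.of("ann", "eve");
    static final Map<String, List<String>> HAS_CHILD = Map.of(
        "ann", List.of("eve"),
        "bob", List.of("eve", "tom"));

    static List<String> children(String x) {
        return HAS_CHILD.getOrDefault(x, List.of());
    }

    /** Person AND (EXISTS hasChild.TOP): persons with at least one child. */
    static Set<String> parents() {
        return select(x -> !children(x).isEmpty());
    }

    /** Person AND (FORALL hasChild.BOTTOM): persons with no children. */
    static Set<String> childless() {
        return select(x -> children(x).isEmpty());
    }

    /** Person AND (FORALL hasChild.Female): every child, if any, is female. */
    static Set<String> onlyFemaleChildren() {
        return select(x -> FEMALE.containsAll(children(x)));
    }

    private static Set<String> select(Predicate<String> condition) {
        Set<String> result = new TreeSet<>(); // sorted for stable output
        for (String x : PERSON) {
            if (condition.test(x)) {
                result.add(x);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parents());            // prints [ann, bob]
        System.out.println(childless());          // prints [eve, tom]
        System.out.println(onlyFemaleChildren()); // prints [ann, eve, tom]
    }
}
```

Note that the value restriction is satisfied vacuously by childless persons, so the third set also contains eve and tom. This is why the gloss puts "only" in parentheses.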
6.7 Web Ontology Language (OWL) 2
"The W3C OWL 2 Web Ontology Language (OWL) is a Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things. OWL is a computational logic-based language such that knowledge expressed in OWL can be reasoned with by computer programs either to verify the consistency of that knowledge or to make implicit knowledge explicit. OWL documents, known as ontologies, can be published in the World Wide Web and may refer to or be referred from other OWL ontologies." (OWL 2 Primer)
6.8 Web Ontology Language (OWL) 2
Set of W3C SemWeb standards
  o Intuitively, a KR system for expressing domain ontologies
  o Technically builds onto the RDF and RDFS vocabularies, but may conceptually be considered a KR system of its own
  o Version 1 standardised in 2004, version 2 in 2009
Potential confusion may arise from the fact that, depending on the context, the term "(OWL) ontology" may refer to the OWL 2 standard (i.e. as a general-purpose KR system), or to some particular domain ontology (e.g. a food and wine ontology)
6.9 A simple OWL ontology (in Turtle or N3)
@prefix : .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
rdf:type owl:Ontology .
:Woman rdfs:subClassOf :Person .
:Mother rdfs:subClassOf :Woman .
:Person owl:equivalentClass :Human .
[] rdf:type owl:AllDisjointClasses ;
   owl:members ( :Woman :Man ) .   # "Woman ⊓ Man ⊑ ⊥"
:Mary rdf:type :Person .
:Mary rdf:type :Woman .
:John :hasWife :Mary .
Looks quite familiar (cf. SKOS & RDFS), but the story does not end with simply introducing new standard names...
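Unlike the RDFS examples earlier, a disjointness axiom such as the one on :Woman and :Man lets a reasoner detect contradictions. The sketch below shows the idea on asserted types only; DisjointnessCheck and violations are assumed names, and a real OWL reasoner would also take inferred types into account.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: checks asserted types against one disjointness axiom.
final class DisjointnessCheck {

    /** Individuals asserted to belong to two or more mutually disjoint classes. */
    static Set<String> violations(Map<String, Set<String>> typesByIndividual,
                                  List<String> disjointClasses) {
        Set<String> bad = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : typesByIndividual.entrySet()) {
            long hits = disjointClasses.stream()
                                       .filter(e.getValue()::contains)
                                       .count();
            if (hits > 1) {
                bad.add(e.getKey());
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        List<String> disjoint = List.of(":Woman", ":Man");

        // Types asserted in the ontology above: consistent.
        Map<String, Set<String>> asserted =
            Map.of(":Mary", Set.of(":Person", ":Woman"));
        System.out.println(violations(asserted, disjoint)); // prints []

        // Adding ":Mary rdf:type :Man" (an assumed extra assertion) clashes.
        Map<String, Set<String>> extended =
            Map.of(":Mary", Set.of(":Person", ":Woman", ":Man"));
        System.out.println(violations(extended, disjoint)); // prints [:Mary]
    }
}
```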
6.10 Web Ontology Language (OWL) 2
Conceptually, OWL builds onto RDFS, enabling more entailments from RDF graphs
Two alternative ways to assign meaning to OWL 2 ontologies:
  o OWL 2 DL (direct model-theoretic semantics)
  o OWL 2 Full (RDF-based semantics)
Every OWL 2 DL ontology is an OWL 2 Full ontology but not vice versa
Note: OWL 2 Full mainly exists for specification completeness (every RDF structure is allowed), and is not typically used for any concrete (reasoning) applications; in applications, subsets of DL are used (RDF integration problems await...)
6.11 OWL 2 Vocabulary (+RDF, RDFS; see the specs) owl:AllDifferent, owl:AllDisjointClasses, owl:AllDisjointProperties, owl:allValuesFrom, owl:annotatedProperty, owl:annotatedSource, owl:annotatedTarget, owl:Annotation, owl:AnnotationProperty, owl:assertionProperty, owl:AsymmetricProperty, owl:Axiom, owl:backwardCompatibleWith, owl:bottomDataProperty, owl:bottomObjectProperty, owl:cardinality, owl:Class, owl:complementOf, owl:DataRange, owl:datatypeComplementOf, owl:DatatypeProperty, owl:deprecated, owl:DeprecatedClass, owl:DeprecatedProperty, owl:differentFrom, owl:disjointUnionOf, owl:disjointWith, owl:distinctMembers, owl:equivalentClass, owl:equivalentProperty, owl:FunctionalProperty, owl:hasKey, owl:hasSelf, owl:hasValue, owl:imports, owl:incompatibleWith, owl:intersectionOf, owl:InverseFunctionalProperty, owl:inverseOf, owl:IrreflexiveProperty, owl:maxCardinality, owl:maxQualifiedCardinality, owl:members, owl:minCardinality, owl:minQualifiedCardinality, owl:NamedIndividual, owl:NegativePropertyAssertion, owl:Nothing, owl:ObjectProperty, owl:onClass, owl:onDataRange, owl:onDatatype, owl:oneOf, owl:onProperty, owl:onProperties, owl:Ontology, owl:OntologyProperty, owl:priorVersion, owl:propertyChainAxiom, owl:propertyDisjointWith, owl:qualifiedCardinality, owl:ReflexiveProperty, owl:Restriction, owl:sameAs, owl:someValuesFrom, owl:sourceIndividual, owl:SymmetricProperty, owl:targetIndividual, owl:targetValue, owl:Thing, owl:topDataProperty, owl:topObjectProperty, owl:TransitiveProperty, owl:unionOf, owl:versionInfo, owl:versionIRI, owl:withRestrictions
6.12 Other important OWL 2 names Datatype names xsd:anyURI, xsd:base64Binary, xsd:boolean, xsd:byte, xsd:dateTime, xsd:dateTimeStamp, xsd:decimal, xsd:double, xsd:float, xsd:hexBinary, xsd:int, xsd:integer, xsd:language, xsd:long, xsd:Name, xsd:NCName, xsd:negativeInteger, xsd:NMTOKEN, xsd:nonNegativeInteger, xsd:nonPositiveInteger, xsd:normalizedString, rdf:PlainLiteral, xsd:positiveInteger, owl:rational, owl:real, xsd:short, xsd:string, xsd:token, xsd:unsignedByte, xsd:unsignedInt, xsd:unsignedLong, xsd:unsignedShort, rdf:XMLLiteral
Facet names rdf:langRange, xsd:length, xsd:maxExclusive, xsd:maxInclusive, xsd:maxLength, xsd:minExclusive, xsd:minInclusive, xsd:minLength, xsd:pattern
Prefixes (include namespaces...) owl: http://www.w3.org/2002/07/owl#, rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#, rdfs: http://www.w3.org/2000/01/rdf-schema#, xsd: http://www.w3.org/2001/XMLSchema#
6.13 Three OWL 2 (DL) profiles: EL, QL, and RL
Three OWL 2 DL subsets exist, optimised for different use cases with some limitations on expressiveness (all, e.g., disallow negation and disjunction)
  o OWL 2 EL (the name comes from the EL family of DLs, which allows conjunction and existential restrictions)
  o OWL 2 QL (the name reflects implementation in a relational query language)
  o OWL 2 RL (the name reflects implementation in a standard rule language)
  o More profiles may be defined later
For computational properties etc., refer to the OWL 2 Profiles spec
6.14 Technical notes
In our treatment, we shall not make special remarks about the EL, QL, and RL (DL) profiles
Also, a general-purpose OWL 2 DL reasoner can be used with all the profiles, but a profile-specific reasoner should on average outperform it (meaning faster or smaller execution)
In general, reasoning with a (complex) OWL DL ontology can be too slow to be of practical use (an NP-hard problem is still decidable)
When in doubt, use OWL 2 EL or QL
6.15 OWL 2 basic concepts
According to the OWL notions, ontologies include
  o Axioms: the basic statements that an OWL ontology expresses
  o Entities: elements used to refer to real-world objects
  o Expressions: combinations of entities to form complex descriptions from basic ones
Examples:
# An axiom referring to three entities (the atomic constituents of the statement):
:Mary rdf:type :Mother .
# A class expression ("Mother ≡ Woman ⊓ Parent"):
:Mother owl:equivalentClass [
  rdf:type owl:Class ;
  owl:intersectionOf ( :Woman :Parent )
] .
6.16 OWL 2 DL specific restrictions
An OWL ontology is a set of RDF statements (=axioms), using terms from the RDF, RDFS, XSD, and OWL namespaces
  o ...with restrictions concerning which OWL Full structures are in fact applicable
  o ...with constraints on how types can be applied (e.g. object, datatype, and annotation properties must be disjoint, and classes and datatypes must be disjoint)
OWL supports several syntaxes (!):
  o Functional-style, RDF/XML, Turtle, Manchester, OWL/XML
6.17 A tour of OWL examples... (See OWL 2 Primer)
4 Classes, Properties, and Individuals – And Basic Modeling With Them
  4.1 Classes and Instances
  4.2 Class Hierarchies
  4.3 Class Disjointness
  4.4 Object Properties
  4.5 Property Hierarchies
  4.6 Domain and Range Restrictions
  4.7 Equality and Inequality of Individuals
  4.8 Datatypes
5 Advanced Class Relationships
  5.1 Complex Classes
  5.2 Property Restrictions
  5.3 Property Cardinality Restrictions
  5.4 Enumeration of Individuals
6 Advanced Use of Properties
  6.1 Property Characteristics
  6.2 Property Chains
  6.3 Keys
7 Advanced Use of Datatypes
8 Document Information and Annotations
  8.1 Annotating Axioms and Entities
  8.2 Ontology Management
  8.3 Entity Declarations
6.18 OWL 2 Notes
It usually makes sense to first study OWL ontologies developed by others
The ontology use cases should be considered carefully (using a query engine with reasoner support): ask questions
Remember that unless explicitly defined otherwise,
  o two different names might refer to the same individual
  o two classes with different names might not be disjoint
  o etc.
When logical entailments or reasoning seem too complicated, it might make sense to "roll back" to SKOS (do not use OWL "as SKOS"; be mindful of the logical consequences)
6.19 Editing OWL ontologies with Protégé
Developing and maintaining an ontology can be a challenging task; things may improve when using an ontology editor
Perhaps the best-known (easy local installation) ontology editor is Protégé (see protege.stanford.edu)
  o Free, open-source ontology editor and knowledge-base framework
  o Supports two main ways of modelling ontologies, via the Protégé-Frames and Protégé-OWL editors. Protégé ontologies can be exported into a variety of formats including RDF(S), OWL, and XML Schema
  o Based on Java, is extensible, and provides a plug-and-play environment that makes it a flexible base for rapid prototyping and application development
6.20 Some highlights (see Protégé home page)
Visual editing of ontologies (both terminology and instance data assertions) with nice documentation
Ontology import/export
Annotation & visualisation
Plug-ins/views (reasoning, consistency checking, queries, rules, ...)
Known problems: stability (save often and back up), some plug-ins require a specific Protégé version, OWL knowledge still required...
6.21 Ontology repositories
Developing an ontology from scratch can be a tedious business
When possible, it makes sense to reuse or extend an existing ontology
Ontology repositories enable publishing, finding, browsing, and downloading (citing) ontologies for fun and profit; see
  o TONES (http://owl.cs.manchester.ac.uk/repository/)
  o BioPortal (http://bioportal.bioontology.org/)
  o Swoogle (http://swoogle.umbc.edu/)
  o SemWebCentral (http://semwebcentral.org/)
  o etc.
6.22 Well-known ontology examples
Some commonly known upper-level (foundational) ontologies include (Hebeler et al., 2009):
  o Basic Formal Ontology (BFO)
  o Cyc and OpenCyc
  o Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE)
  o Dublin Core Metadata Initiative
  o Friend of a Friend (FOAF)
  o GeoRSS
  o Suggested Upper Merged Ontology (SUMO)
  o OWL Time
6.23 Conclusion
Even if OWL may seem a bit complicated, it is here to stay
Eventually, the ratio of developers of new ontologies to adopters of existing ontologies should approach 0 (but we are far from there yet)
Some open challenges:
  o training, infrastructure, tools (with reasoning support)
  o versioning, ontology mapping and alignment, provenance
  o policy issues, legal framework, IPR, business models, ...
  o the rest of the SemWeb stack (e.g. trust, access management)
In applications, ontologies are yet to be enriched with rules...
7 Conclusion The things we did not say (much about) Concluding remarks Further reading
7.1 So much to do, so little time
Not for the acronymically challenged, though...
  o GRDDL, RDFa, POWDER
  o SWRL & RIF
  o SAWSDL (WSDL 2.0 RDF, OWL-S, ...)
  o FOAF, CC/PP, ...
"And by calling immediately to the number appearing on the screen, you may also, without any extra fee, learn about..."
  o Semantic wikis etc. (e.g., http://semantic-mediawiki.org/)
  o SemWeb software architectures (in-memory vs. DB)
  o Search and navigation, best practices, semantic pipes and processes, scalability and metrics, other forms of reasoning,...
7.2 Semantics revisited?
Semantic technologies are the future of ICT
  o The reason is simple: content is king, and nobody has the energy or time to go through everything manually
Today, the de facto W3C Semantic Web specifications and standards define not only what the concrete technologies are, but also how we perceive semantics in applications
  o In particular, RDF provides a common and fruitful (albeit at times unnecessarily complex) semantic technology for doing business, development, and research
  o However, as is always the case with highly influential ideas, thinking of "semantics" solely as the Semantic Web is a double-edged sword
7.3 Looking for a bright future SemWeb might turn out to be something quite different than anticipated today, eventually dominated by simple and intuitive end-user applications – but not completely different o The reason is simple: legacy rules
7.4 Mainstream paradigms
The competing technologies/paradigms may eventually challenge some of the Semantic Web ideas (e.g. the open world assumption and binary predicates), placing, e.g., RDF/XML somewhere "between AI and WAP". This, however, will take place simply by merging ("stealing") the SemWeb ideas into the mainstream
  o The reason is simple: Just think what happened, e.g., to expert systems, object-oriented design, agents, etc.
7.5 Some further reading etc.
This presentation has been influenced by way too many documents and people to be explicitly cited here (thank you)
Besides the publications of the W3C Semantic Web Activity, and the documents of the related applications and tools, the references mentioned in the slides (should) include:
  o Allemang, D. & Hendler, J. 2008. Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL. Morgan Kaufmann.
  o Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., & Patel-Schneider, P.F. (eds.) 2007. The Description Logic Handbook: Theory, Implementation, and Applications, second edition. Cambridge University Press.
7.6 Some further reading etc. (Cont'd)
  o Bizer, C., Heath, T., & Berners-Lee, T. To appear. Linked Data - The Story So Far. International Journal on Semantic Web and Information Systems (IJSWIS).
  o Gómez-Pérez, A., Fernández-López, M., & Corcho, O. 2004. Ontological Engineering: with examples from the areas of Knowledge Management, e-Commerce and the Semantic Web. Springer.
  o Hebeler, J., Fisher, M., Blace, R., & Perez-Lopez, A. 2009. Semantic Web Programming. Wiley.
  o Poole, D., Mackworth, A., & Goebel, R. 1998. Computational Intelligence: A Logical Approach. Oxford University Press.
  o Russell, S., & Norvig, P. 1995. Artificial Intelligence: A Modern Approach. Prentice Hall.
7.7 The end
If the above references (newer editions are no doubt available) sound too [something], you might also wish to try (really)
  o Pollock, J.T. 2009. Semantic Web for Dummies. Wiley.
Any mistakes and inconsistencies appearing in these slides are, of course, due to yours truly; sincere apologies for these (no warranty, though)
Thank you and good luck! --Ossi (at Tampere, Finland, 2011)