Temporally Enhanced Ontologies in OWL: A Shared Conceptual Model and Reference Implementation

Sven de Ridder and Flavius Frasincar

Erasmus University Rotterdam
PO Box 1738, NL-3000 DR Rotterdam, the Netherlands
[email protected], [email protected]

Abstract. The temporal dimension has been recognized as an integral feature of many Semantic Web applications, but there are significant differences in how ontology authors choose to represent changes in time. We present a temporal conceptual model for OWL DL ontologies that allows the expression of fluent properties, i.e., properties that change in time, and that is both representation-agnostic and serializable to the various available representation schemes. We also provide Kala, a reference implementation developed in Java, that can be used to generate temporal ontologies, convert between temporal ontologies in different representation schemes, and develop new applications such as temporal querying or visualization tools.

Keywords: Semantic Web, OWL, time, fluents, Kala

1 Introduction

The ability to identify trends and make predictions is of critical importance for successful trading in financial markets. The increase in prominence of sophisticated, low-latency algorithmic trading systems has spurred development of technologies such as news analytics for the timely extraction of information that is relevant to the identification of market opportunities [24]. The Semantic Web, and OWL in particular, provide the technology to represent, manage, share, and reason over self-describing data, but these representations tend to be synchronic, i.e., they lack the crucial time dimension.

One reason for the lack of temporality in existing ontologies is that, while much effort has gone into providing support for temporal features at the representational level, there seems to be a lack of a shared, representation-agnostic model at the conceptual level, which is where user requirements commonly need to be met. Here, we consider the conceptual level to be the level at which humans may express and interpret information that closely relates to the perceived real world, the level that captures the essential semantics of temporal ontologies. In contrast, we consider the representational level to describe the organization of the information for representation and storage as computer data, typically as entities and relationships between entities, supported by a representation-specific vocabulary to express the individual data items. We shall refer to the conceptualization of temporal ontologies as the temporal conceptual model, and to the specification on the representational level as the temporal representation scheme.

The focus on representation schemes has several consequences. Firstly, existing representation schemes are generally not directly compatible, which greatly reduces the interoperability of the various temporal implementations. Given the high conversion barriers, the ability to share and reuse data, a core objective of the Semantic Web, suffers. Secondly, the authors and users of temporal ontologies are directly exposed to the details of the particular representation scheme, which makes the development and use of temporal ontologies a cumbersome, complex, and error-prone process. Lastly, the focus on specific representation schemes results in applications that operate on temporal ontologies becoming tightly bound to a particular logical structure and implementation. This discourages the development of such applications, because their potential audience will be limited to the users of a particular representation scheme. Examples of these applications are temporally-enhanced reasoning, querying, and visualization.

Our focus will be on the introduction of concepts that form the building blocks of the semantics for temporally enhanced ontologies. The temporal conceptual model is designed to be mappable to selected representation schemes in OWL DL; that is, these representation schemes can be expressed in the SHOIN(D) description logic, and are fully compatible with the SROIQ(D) description logic employed by OWL 2.
The model is itself composed of two orthogonal parts: one describes the time domain, and the other describes fluent properties. Fluent properties, first described in the earliest literature on computer learning and artificial intelligence by McCarthy and Hayes [21], represent properties and relationships that may change with time. Fluent properties form a suitable focal point for the exploration of a temporal conceptual model for a number of reasons. Firstly, the concept is immediately familiar: one does not have to stretch the imagination to think of examples of properties and relationships that change over time; a person’s address, employer, and even favorite soccer team are all subject to such change. In fact, it may be more difficult to conceive examples of properties and relationships that categorically do not change over time! Secondly, fluent properties are conceptually simple: in effect, they represent ternary relations that simply extend the familiar binary relations with a fixed role for the third operand, the time interval. Thirdly, they are useful; as we have argued above, fluent properties allow for the evolution of an ontology to be expressed, examined, queried, and visualized. Lastly, fluent properties are already supported, in some form, through existing representation schemes.

This paper is structured as follows. Section 2 presents background on temporal models in general and the current state of temporal ontologies in particular. After this, in Sect. 3, we present the formal description of the proposed temporal conceptual model. An example implementation is provided in Sect. 4, followed by an evaluation of the implementation in Sect. 5. Lastly, we give our concluding remarks and identify possible future work in Sect. 6.
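
To make the ternary reading of fluent properties more concrete, the following is a minimal Java sketch. It is not taken from Kala: the names Interval, FluentAssertion, and valuesAt are hypothetical, and time is simplified to integer ticks with half-open intervals, which is an assumption made only for illustration.

```java
import java.util.List;

// Hypothetical sketch: all names and the integer time line are illustrative assumptions.
record Interval(long start, long end) {          // half-open interval [start, end)
    Interval {
        if (start >= end) throw new IllegalArgumentException("start must precede end");
    }
    boolean contains(long instant) { return start <= instant && instant < end; }
}

// A fluent assertion: the familiar binary relation plus a third operand, the time interval.
record FluentAssertion(String subject, String property, String object, Interval validTime) {
    boolean holdsAt(long instant) { return validTime.contains(instant); }
}

final class FluentSketch {
    // Returns the objects related to `subject` through `property` at the given instant.
    static List<String> valuesAt(List<FluentAssertion> facts, String subject,
                                 String property, long instant) {
        return facts.stream()
                .filter(f -> f.subject().equals(subject) && f.property().equals(property))
                .filter(f -> f.holdsAt(instant))
                .map(FluentAssertion::object)
                .toList();
    }

    public static void main(String[] args) {
        List<FluentAssertion> facts = List.of(
                new FluentAssertion("alice", "worksFor", "AcmeCorp", new Interval(2005, 2010)),
                new FluentAssertion("alice", "worksFor", "Initech", new Interval(2010, 2014)));
        System.out.println(valuesAt(facts, "alice", "worksFor", 2008)); // [AcmeCorp]
        System.out.println(valuesAt(facts, "alice", "worksFor", 2012)); // [Initech]
    }
}
```

The sketch deliberately says nothing about how such an assertion is stored; that choice belongs to the representation scheme, not to the conceptual level.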

2 Representations of Temporality

Data temporality has been studied extensively: the scientific literature is rich in discussions of temporality, from philosophical treatises on time to discussions of temporal infrastructures and reasoning. Historically, this interest stems from the importance of time in many real-world applications, from logging and scheduling systems to biomedical databases and algorithmic trading. Much of the early research on data temporality has focused on the field of temporal databases, a topic with similarities to temporal ontologies, and one that is considerably more mature. In fact, past reports on the field of temporal databases bear a striking resemblance to the current state of temporal ontologies. Pissinou et al., in their report [27] on a 1992 ARPA/NSF workshop that was aimed specifically at identifying problems within the field of temporal database technology, conclude that two factors are primary reasons for the reduced adoption of temporal database technology: the many different custom extensions to the relational model, each intended to serve very specific user needs regarding temporal support, and the resulting lack of a common terminology, infrastructure, and conceptual model for temporal databases. We find that the field of temporal ontologies faces the same issues. The researchers and participants also identify the ad-hoc nature of many applications extended to include temporal information, and the understandable resistance to replacing existing applications with full-fledged temporal database technology, as obstacles in the development and adoption of a standard for temporal databases, and conclude that there is a need for open architectures that allow for easy conversions between different representations.
In response to such findings, a consensus temporal query language specification, TSQL2 [28], was developed, but the specification, despite strong initial ISO interest, failed to catch on: by the time that SQL:1999 was formally published, the specification had failed to meet the committee’s requirements and could not be included in the language standard, and SQL vendor interest waned. Eventually, however, a number of key ideas from TSQL2 found their way into the SQL:2011 specifications [22]. Of these ideas, the concepts of valid-time (“application-time tables”) and transaction-time (“transaction-time tables”) have proved particularly useful: valid-time marking the time that a fact is held to be true, and transaction-time marking the time that a fact is known in the database. The approach to temporality in databases, then, typically resolved to marking tuples with valid-time and transaction-time timestamps, and this formed the basis for the general temporal database model (see, e.g., the conceptual model by Jensen et al. [16], the survey of temporal databases by Özsoyoğlu and Snodgrass [26], and the survey of temporal entity-relationship models by Gregersen and Jensen [11]).
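
The distinction between valid-time and transaction-time can be illustrated with a small Java sketch of tuples carrying both timestamps. This is not SQL:2011 or TSQL2 syntax; the names Period, BitemporalRow, and lookup, as well as the integer timestamps, are assumptions made for illustration only.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: field names and integer timestamps are illustrative assumptions.
record Period(long from, long to) {              // half-open period [from, to)
    boolean contains(long t) { return from <= t && t < to; }
}

// A tuple marked with both valid-time and transaction-time.
record BitemporalRow(String key, String value, Period validTime, Period transactionTime) {}

final class BitemporalSketch {
    static final long OPEN = Long.MAX_VALUE;     // period has no known end yet

    // The value of `key` held to be true at `validAt`, as known to the database at `knownAt`.
    static Optional<String> lookup(List<BitemporalRow> rows, String key,
                                   long validAt, long knownAt) {
        return rows.stream()
                .filter(r -> r.key().equals(key))
                .filter(r -> r.validTime().contains(validAt))
                .filter(r -> r.transactionTime().contains(knownAt))
                .map(BitemporalRow::value)
                .findFirst();
    }

    public static void main(String[] args) {
        List<BitemporalRow> rows = List.of(
                // Recorded at transaction time 100, superseded at 200.
                new BitemporalRow("bob.city", "Rotterdam", new Period(2000, OPEN), new Period(100, 200)),
                // Correction recorded at transaction time 200: bob moved in 2010.
                new BitemporalRow("bob.city", "Rotterdam", new Period(2000, 2010), new Period(200, OPEN)),
                new BitemporalRow("bob.city", "Delft", new Period(2010, OPEN), new Period(200, OPEN)));
        System.out.println(lookup(rows, "bob.city", 2012, 150)); // Optional[Rotterdam]
        System.out.println(lookup(rows, "bob.city", 2012, 250)); // Optional[Delft]
    }
}
```

The two lookups ask about the same valid-time instant but differ in transaction-time, so they return what the database believed before and after the correction.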

With the development of the Semantic Web and its primary languages, the Resource Description Framework (RDF) and the Web Ontology Language (OWL), came efforts to represent temporal information in these languages. The Temporal RDF [12] language extension and its related query language T-SPARQL [10] form the main approach to introducing time to RDF. Unfortunately, Temporal RDF is not compatible with OWL DL because of its use of RDF reification. Compounding this problem are the snapshot-based entailment mechanism of Temporal RDF, which expands any temporal statement defined over a time interval into a series of temporal statements defined over each time instant contained by the interval, and the lack of available serializations, for example to RDF/XML.

Another approach is ontology versioning [17], in which “snapshots” of the ontology are created for each state of the ontology during its development. Unsurprisingly, this comes at the cost of significant data redundancy. Moreover, its support for particular classes of queries (e.g., “when is fact S true in the database?”) is limited. However, the approach may also be used to model transaction-time, and may then be considered to be completely orthogonal to other (generally valid-time) approaches discussed here. The application of ontology versioning, therefore, may be appropriate in some cases where transaction-time needs to be modeled in addition to valid-time, but, perhaps, at low enough resolution so as to reduce the impact of data redundancy.

There have also been proposals to extend description logics with valid-time; see, e.g., the surveys by Artale and Franconi [2] and Lutz et al. [20]. Such temporal description logics are generally based on the ALC description logic [8]. These extensions are generally not compatible with OWL: the decidability of temporal description logics is compromised when the language is extended to the full description logics of SHOIN(D) for OWL DL or SROIQ(D) for OWL 2 [3].
Opting for temporal description logics would also mean giving up on the rich toolset developed for the OWL language, such as editors and reasoners.

Representation schemes for modeling temporality in OWL DL ontologies generally follow either the reification¹ approach or the 4D-fluents approach. In the reification approach, a property instance is reified, that is, converted into a proper instance, and the original property’s subject and object instances, or subject instance and datatype value, are then related to the newly reified relation through conventional property assertions to retain the information expressed by the original ⟨subject, property, object⟩ or ⟨subject, property, value⟩ triples. However, since we are now able to specify the reified property as the subject or object of additional triples, we effectively gain the ability to express properties that are ternary, quaternary, or generally n-ary in nature. The general approach of reification is, therefore, appropriately named n-ary relations [25]. At first glance, the reification approach seems appropriate for adding a third component, e.g., valid time, to any property assertion, and for developing temporal ontologies based on temporally qualified properties. The reification approach is not without problems, however. One problem is the proliferation of objects, namely one for each reified property assertion; related to this is the problem of providing meaningful names to the reified properties, or, alternatively, dealing with objects that may not have meaningful names. Another problem is the reduction of OWL reasoning capabilities over ontologies with reified properties; property semantics such as inverses or cardinalities are difficult or impossible to describe for reified properties in a general reasoning context.

¹ Note that the reification representation scheme is not the same as RDF reification, the latter not being available in OWL DL.

In contrast to reification, the 4D-fluents approach [29] does not associate property assertions with valid-time intervals directly, but instead opts to have temporal properties hold between timeslices of entities, a timeslice being defined as the temporal facet of some entity as it “occupies” some interval in time. In order to be consistent, both subject and object timeslices must be compatible, that is, occupy the same interval in time. An important advantage of the 4D-fluents approach over the reification approach is that properties retain their semantics in reasoning contexts: for example, we may trivially define the inverse of a fluent property, as well as symmetry and transitivity, something that is not straightforward in the reification approach. The 4D-fluents approach, however, suffers from worse object proliferation than the reification approach in the general case, since a fluent assertion may require a new timeslice for both its subject and its object.

The 4D-fluents approach has inspired several implementations. tOWL [23] employs the 4D-fluents approach and combines it with concrete domains and Allen’s interval algebra [1] to allow the expression of temporal restrictions. SOWL [5], a spatio-temporal representation, uses the 4D-fluents approach as its temporal component. A reinterpretation of the 4D-fluents approach is implemented by the MUSING project [19], which focuses on adoption of the approach in the context of a reasoning architecture. Baratis et al. propose the 4D-fluents approach, combined with Allen’s interval algebra, as the basis for TOQL [4].
Both approaches employ strategies that force conceptual concessions that conflict with intuitive understanding: the reification representation scheme models properties as classes, property assertions as instances, and prevents the user from specifying qualifiers for property semantics; the 4D-fluents representation scheme retains the property semantics, but requires the user to view instances as “spacetime-worms” and accept the conceptual implications that such a view necessitates. Converting a synchronic ontology with only static properties to a temporal ontology with dynamic properties is thus a cumbersome, error-prone process, as is the conversion between representation schemes. The lack of work on conceptual models for temporal ontologies in the literature indicates the need for improvements in this area.
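
The structural difference between the two representation schemes can be sketched by generating the triples that a single fluent assertion might produce under each. The following Java sketch is an illustration under stated assumptions: the ex: vocabulary (ex:hasSubject, ex:hasObject, ex:hasValidTime, ex:TimeSlice, ex:timeSliceOf, ex:hasInterval) and the naming of generated individuals are hypothetical and do not reproduce the exact vocabulary of any of the cited implementations.

```java
import java.util.List;

// Hypothetical sketch: the ex: vocabulary and generated names are illustrative assumptions.
record Triple(String subject, String predicate, String object) {}

final class SchemeSketch {
    // Reification (n-ary relation) scheme: the assertion itself becomes an individual
    // whose class stands for the original property.
    static List<Triple> reify(String id, String s, String p, String o, String interval) {
        return List.of(
                new Triple(id, "rdf:type", p + "Relation"),
                new Triple(id, "ex:hasSubject", s),
                new Triple(id, "ex:hasObject", o),
                new Triple(id, "ex:hasValidTime", interval));
    }

    // 4D-fluents scheme: the property holds between two timeslices that share one interval.
    static List<Triple> fourD(String s, String p, String o, String interval) {
        String sSlice = s + "_ts";
        String oSlice = o + "_ts";
        return List.of(
                new Triple(sSlice, "rdf:type", "ex:TimeSlice"),
                new Triple(sSlice, "ex:timeSliceOf", s),
                new Triple(sSlice, "ex:hasInterval", interval),
                new Triple(oSlice, "rdf:type", "ex:TimeSlice"),
                new Triple(oSlice, "ex:timeSliceOf", o),
                new Triple(oSlice, "ex:hasInterval", interval),
                new Triple(sSlice, p, oSlice));   // the original property is kept as a property
    }

    public static void main(String[] args) {
        reify("ex:r1", "ex:alice", "ex:worksFor", "ex:acme", "ex:i1").forEach(System.out::println);
        fourD("ex:alice", "ex:worksFor", "ex:acme", "ex:i1").forEach(System.out::println);
    }
}
```

In this sketch the reified form introduces one new individual and loses the property as a property, whereas the 4D-fluents form introduces two new individuals but keeps ex:worksFor as an ordinary object property between timeslices, which mirrors the trade-off described above.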

3 The Temporal Conceptual Model

In this section we describe the proposed temporal conceptual model. Section 3.1 describes the time model. Section 3.2 builds on the time model to present the fluents model.

3.1 The Time Model

The time model extends the OWL model by introducing time instants and time intervals (the so-called temporal primitives), as well as assertions that relate these primitives to one another or assign to them discrete timestamp values. These temporal primitives and assertions form the building blocks for time models of varying expressive power and complexity.

The time model allows primitives to be declared explicitly through primitive declarations. Such declarations may or may not translate to OWL class membership declarations when serializing to a representation scheme R, depending on whether R represents time instants or time intervals as individuals or, instead, represents them directly as datatype values. The following axioms declare the anonymous individual _:t1 to be a time instant, and the named individual period2013Q1 to be an interval. We use a syntax that closely resembles the OWL Abstract Syntax in order to concisely express concepts in a familiar way.

    TimeDeclaration(TimeInstant(_:t1))
    TimeDeclaration(TimeInterval(period2013Q1))
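
How a primitive declaration may or may not surface in a serialization can be sketched in Java as follows. This is a hedged illustration, not Kala's API: the types TemporalPrimitive, TimeInstant, and TimeInterval, the serialize method, the ex: class names, and the boolean flag describing the scheme R are all hypothetical, and the output string merely imitates an OWL 2 functional-style class assertion.

```java
import java.util.List;

// Hypothetical sketch: type names, the ex: classes, and the scheme flag are assumptions.
interface TemporalPrimitive { String id(); }
record TimeInstant(String id) implements TemporalPrimitive {}
record TimeInterval(String id) implements TemporalPrimitive {}

final class TimeDeclarationSketch {
    // Serializes a primitive declaration for a representation scheme R.
    // If R models temporal primitives as individuals, a class membership axiom is emitted;
    // if R models them directly as datatype values, nothing needs to be declared.
    static List<String> serialize(TemporalPrimitive p, boolean primitivesAreIndividuals) {
        if (!primitivesAreIndividuals) {
            return List.of();
        }
        String cls = (p instanceof TimeInstant) ? "ex:TimeInstant" : "ex:TimeInterval";
        return List.of("ClassAssertion(" + cls + " " + p.id() + ")");
    }

    public static void main(String[] args) {
        System.out.println(serialize(new TimeInstant("_:t1"), true));
        System.out.println(serialize(new TimeInterval("period2013Q1"), true));
        System.out.println(serialize(new TimeInterval("period2013Q1"), false));
    }
}
```

The point of the sketch is only that the same conceptual declaration can yield different, or no, axioms depending on the target scheme.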

The “before” relation between time instants t1 and t2 can be explicitly expressed in the temporal conceptual model, as shown below: TimeInstantRelationAssertion(_:t1 _:t2