Sharing Epigraphic Information as Linked Data

Volume Editors Salvador Sánchez-Alonso Universidad de Alcalá, Edificio Politécnico despacho O-246, Ctra. Meco s/n, 28871 Alcalá de Henares, Spain E-mail: [email protected] Ioannis N. Athanasiadis IDSIA/USI-SUPSI, Galleria 2, 6928 Manno, Lugano, Switzerland E-mail: [email protected]

Library of Congress Control Number: 2010936639
CR Subject Classification (1998): H.4, H.3, I.2, H.2.8, H.5, D.2.1, C.2
ISSN 1865-0929
ISBN-10 3-642-16551-6 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-16551-1 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. springer.com © Springer-Verlag Berlin Heidelberg 2010 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper SPIN: 06/3180 543210


Fernando-Luis Álvarez¹, Elena García-Barriocanal¹, and Joaquín-L. Gómez-Pantoja²

¹ Information Engineering Research Unit, Computer Science Dept., University of Alcalá, Ctra. Barcelona km. 33.6 – 28871 Alcalá de Henares (Madrid), Spain
{fernandol.alvarez,elena.garciab}@uah.es
² Dept. of History, University of Alcalá, C/ Colegios 2, E-28801 Alcalá de Henares (Madrid), Spain
[email protected]

Abstract. The diffusion of epigraphic data has evolved in recent years from printed catalogues to indexed digital databases shared through the Web. Recently, the open EpiDoc specifications have resulted in an XML-based schema for the interchange of ancient texts that uses XSLT to render typographic representations. However, these schemas and representation systems still do not provide a way to encode computational semantics and semantic relations between pieces of epigraphic data. This paper sketches an approach to bring these semantics into an EpiDoc-based schema using the Web Ontology Language (OWL) and following the principles and methods of information sharing known as “linked data”. The paper describes the general principles of the OWL mapping of the EpiDoc schema and how epigraphic data can be shared in RDF format via dereferenceable URIs that can be used to build advanced search, visualization and analysis systems.

Keywords: Epigraphy, EpiDoc, OWL, linked data, Semantic Web.

S. Sánchez-Alonso and I.N. Athanasiadis (Eds.): MTSR 2010, CCIS 108, pp. 222–234, 2010. © Springer-Verlag Berlin Heidelberg 2010

1 Introduction

The digital representation of information has become a goal for modern society, especially for the preservation and dissemination of cultural heritage objects, even though recent measures show that many digitization projects are still waiting to be realized (Poll, 2010).

An epigraph is a text cut or scratched into a hard surface such as stone, metal or pottery. The text can range from a few letters to the length of an actual booklet. Most inscriptions were cut for permanence and durability, hence their use as public records of legal or government regulations, as well as on cultic altars and private tombstones; at the same time, there are instances of impromptu writing, such as graffiti, ownership labels and the like. Inscriptions are a welcome source when documenting ancient cultures, since they provide actual samples of how long-gone languages such as Gaulish or Punic were written and spelled. For the better-known Greek and Latin, inscriptions are valued as witnesses of words, idioms, and personal and place names not attested in any other historical source, as well as testimony of everyday life: family relationships feature prominently on tombstones, as does the life expectancy of common people; military gravestones attest the deployment of the Army all over the Roman Empire, and milestones bear witness to its extensive and well-maintained road network; and the actual wording of most Roman laws is known because copies of them were cut on bronze tablets hung on public buildings.

Digital epigraphy projects have for years been considered a key technique for the study and dissemination of ancient texts (Manuelian, 1998). A number of public and private institutions have sponsored the development and operation of epigraphic databases that allow scholars and the general public to access the documented inscriptions of the Roman provinces from any Web browser, making research easier and quicker (Pasqualis, 2005). For example, Crowther (2002) describes the need for scanning documents and creating large databases, and Alkhoven (2005) stresses the importance of training.

Recently, the open EpiDoc specification (Cayless, 2003) has been developed, and several systems using this proposed standard have already been deployed (Bodard, 2008). An EpiDoc file is an XML representation of the edition of one inscription or a group of inscriptions. At a minimum the file will contain a text in Greek or Latin, probably with editorial sigla. It may also contain apparatus, translation, commentary, place of finding, description and dating of the text or object, among several other information elements that are normally published in a scholarly edition. However, EpiDoc does not provide a way to encode computational semantics, as it relies mostly on structured metadata with text fields. Such computational semantics would give epigraphic information added value that could be exploited in Semantic Web applications.
Also, they would provide the basis for sharing epigraphic information following the ideas of open linked data (Berners-Lee, 2006), which opens new possibilities for automated analysis, search or visualization applications. It should be noted that the linked data approach has been considered an important element in the agenda of the European digital library Europeana¹.

This paper reports preliminary work on providing computational semantics to epigraphic information by using ontology representations and sharing these enhanced information representations following the linked data approach. The rest of this paper is structured as follows. Section 2 provides background information on digital epigraphy projects and proposed schemas specific to the domain of ancient texts and museums. Then, Section 3 describes the rationale and design of an EpiDoc-based ontology expressed in W3C OWL. Section 4 sketches a design for sharing RDF-based representations of epigraphic information following the open linked data approach. Finally, conclusions and outlook are provided in Section 5.

2 Background

In recent years, ancient epigraphy has jumped wholeheartedly onto the digital bandwagon, and the number of operative databases and data repositories keeps growing². This enthusiasm arises from reasons of history, economy and convenience. First of all, there is a long tradition (going back as far as the 16th century) of collecting and editing inscriptions in great corpora or collections put together according to geographical, chronological or thematic criteria. But those projects are always expensive, both in terms of scholarly work and in the cost of publishing specialized volumes that need to be brought up to date often and target a restricted group of readers; furthermore, retrieving and searching data in printed copy is a tricky proposition, because compiling indexes by hand is such a cumbersome task that it limits the number of search keys allowed. An unexpected benefit of digital databases, even the simpler ones, is that they naturally provide notable searching capabilities, while the volumes from which the data was taken often lack indexes³.

Most current epigraphic repositories are based on relational databases (Harl and Schaller, 2004), and in general searching in these systems becomes a complex task. A first important problem is that the objects were monuments made to be read and seen; hence, it is important to provide a detailed description of both their decoration and the place where they were set up. Second, the text could have been written in Latin, in Greek, in a combination of both, or in other scripts such as Punic, Celtiberian or Egyptian, and there are no established standards for mixing scripts in a regular search pattern. And, last, any word in an inscription may have different relevance from a scholar’s point of view.

¹ Europeana White Paper No. 1: Knowledge = Information in Context, by Prof. Stefan Gradmann, available at http://version1.europeana.eu/web/europeana-project/whitepapers
² For a selection of those dealing with Latin inscriptions, see: http://eda-bea.es/, http://www.manfredclauss.de/

Fig. 1. An example simple Search in the EDH Portal

³ For a detailed account of computers and epigraphy, see Gómez-Pantoja (2010), forthcoming.


To overcome those difficulties, existing databases allow for complex search patterns, as the search mask (Figure 1) of the Epigraphische Datenbank Heidelberg (EDH)⁴ shows. Figure 1 shows a fragment of the data needed to search for a particular inscription or set of inscriptions. Figure 2 shows a query formulated in the Clauss database⁵. As can be seen, the two queries are completely different, although the results of the query in Figure 2 include the same inscription as the query in Figure 1. Although the output in these cases is the same (as shown in Figure 3), the search is performed using different search fields and different strings. Relational databases allow data from a variety of tables to be linked in order to relate the required information. The problem is that current databases do not normalize key entities such as people or places; instead, these are provided as free text in metadata fields. This hampers efficient data analysis, for which a full normalization of the main information entities is required.

Fig. 2. An example simple search on Clauss/Slaby Data Base

Fig. 3. Result of queries in Figure 1 and Figure 2

⁴ http://www.uni-heidelberg.de/institute/sonst/adw/edh/index.html
⁵ http://oracle-vm.ku-eichstaett.de:8888/epigr/epigraphik_en


A first step towards solving the problems of normalization of digital information in epigraphy was proposed at the end of the 1990s by Elliott⁶. He promoted a new standard (EpiDoc)⁷ to represent and store epigraphic information. EpiDoc is not a database and has no rules of association; each element (inscription) is described in an XML file formatted following conventions based on the TEI guidelines⁸. EpiDoc uses XSLT to display stored items: through an XSLT template, the browser transforms the XML into an HTML page that displays the required information. There are several examples of EpiDoc-based systems available online, including Inscriptions of Aphrodisias⁹ (Reynolds, 1982; Bodard, 2008) and Vindolanda Tablets Online¹⁰ (Bowman, 1994; Bowman, 1983; Terras, 2006). Some of the advantages of this design are portability, as it is based on XML, and ease of management, as each inscription is stored as an independent file. In consequence, it can be said that “the use of structured mark-up, like XML, increases precision in location and retrieval of content and opens up the possibility of reuse.” (Breure, 2006). However, representing information in XML files is only practical for small collections, and by itself it does not provide further entity normalization.

Advancing towards increased normalization of information entities can be achieved by engineering an ontology that takes the schema and guidelines of EpiDoc as its point of departure. There are examples of standardized ontologies of cultural contents, such as CIDOC CRM¹¹, which are valid for a general and wide cultural context; however, none of them comes close enough to the specificities of epigraphy to be usable without extension. Developing an ontological schema that allows for a more normalized information structure has the added benefit of preparing epigraphic data to be shared on the Web via linked data, which opens new possibilities for relating information currently dispersed across several databases.

⁶ http://homepages.nyu.edu/~te20/
⁷ http://EpiDoc.sourceforge.net/
⁸ http://www.tei-c.org/index.xml
⁹ http://insaph.kcl.ac.uk/iaph2007/index.html
¹⁰ http://vindolanda.csad.ox.ac.uk/
¹¹ http://www.cidoc-crm.org/

3 Representing EpiDoc in OWL

The semantic representation for epigraphic information reported here is based on a mapping of the EpiDoc schema to an ontological representation expressed in the W3C OWL language. The mapping is not merely a change of format (from EpiDoc XML to OWL), which would be rather straightforward, but a re-engineering of the schema that tries to represent all the important entities as OWL classes. Each concrete information element then becomes an instance of an Inscription concept, which relates to other concepts such as Individual, Material or Letters. A series of properties is used to formally represent the links; for example, has_civilization, whose domain is Inscription and whose range is Civilization, establishes the relationship between the instances of both concepts. It should be noted that this adds possibilities for mapping to other knowledge organization systems. For example, civilization appears in the Art & Architecture Thesaurus (AAT)¹² of the Getty Foundation, associated with one of its facets. Even though the AAT does not provide a terminology for civilizations, many art styles and periods can easily be mapped to them, providing a way to relate epigraphic information to them.

EpiDoc has adopted some controlled vocabularies, e.g. the “EAGLE/EpiDoc Object Type Vocabulary” describing types of monuments and objects bearing inscribed texts. These vocabularies have been translated into ontology modules that are reusable separately, and enriched with subsumption and mereological predicates where appropriate. For example, the following vocabulary elements are mereologically related: columna (column) has as parts “columna, basis” (column base), “columna, imoscapus” (column shaft, lower) and “columna, summoscapus” (column shaft, upper). Also, part of the EpiDoc header section is intended to provide classificatory and contextual information about the text, such as its subject matter, the situation in which it was produced, or the individuals described by or participating in producing it. This opens many possibilities for encoding semantically rich information such as the Civilization mentioned above, or the key persons referred to in the inscription.

Endowing each inscription with a series of properties and inference rules allows more precise searching and relating of information. For example, if we look for “M Messio Abascanto”, a textual search returns only those inscriptions containing that character string. But if we search in our ontology, we can distinguish the parts of the tria nomina and look for the nomen, praenomen, cognomen or filiation. Also, genealogical trees or the geographical distribution of names can be used to infer related information. Continuing the same example, Roman citizens, that is, those who possessed all the rights, such as voting in the elections of the city of Rome, received three names.
The praenomen was given to the child at birth, somewhat like a modern first name, and coincided with the name of his father or of an ancestor. The nomen referred to the gens or family. The cognomen was a sort of nickname by which the individual was known. In our representation, each part of the name is a datatype property associated with the inscription, as is the filiation. In this way, each one is assigned a series of labels as attributes and properties, which give it useful semantics for information retrieval. The following rule infers gens affiliations between citizens named in the inscriptions: if the reasoner finds two individuals with the same nomen, it can establish a relationship between their inscriptions.

Inscription(?i1) ^ has_person(?i1, ?p1) ^
Inscription(?i2) ^ has_person(?i2, ?p2) ^
has_nomen(?p1, ?n1) ^ has_nomen(?p2, ?n2) ^ sameAs(?n1, ?n2)
→ has_same_gens(?i1, ?i2)
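The effect of the rule above can be illustrated outside a reasoner with a minimal Python sketch. It is not the EpiOnt implementation: the record layout and the function name infer_same_gens are assumptions made for illustration, and all sample data other than the identifier iComp000001 is invented. Unlike the rule as written, which also relates an inscription to itself and yields both orderings of a pair, the sketch returns each unordered pair of distinct inscriptions once.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: inscription id -> persons as (praenomen, nomen, cognomen).
# Only the identifier iComp000001 appears in the paper; the rest is invented.
INSCRIPTIONS = {
    "iComp000001": [("Marcus", "Messius", "Abascantus")],
    "iComp000002": [("Lucius", "Messius", "Rufus")],
    "iComp000003": [("Gaius", "Valerius", "Flaccus")],
}


def infer_same_gens(inscriptions):
    """Return the unordered pairs of inscriptions that name a person with the same nomen."""
    by_nomen = defaultdict(set)
    for inscription_id, persons in inscriptions.items():
        for _praenomen, nomen, _cognomen in persons:
            by_nomen[nomen].add(inscription_id)

    pairs = set()
    for ids in by_nomen.values():
        # Every pair of distinct inscriptions sharing this nomen is related.
        for first, second in combinations(sorted(ids), 2):
            pairs.add((first, second))
    return pairs


if __name__ == "__main__":
    for first, second in sorted(infer_same_gens(INSCRIPTIONS)):
        print(f"{first} has_same_gens {second}")
```

Grouping by nomen first avoids comparing every inscription with every other one, which matters once a collection holds many thousands of records.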

As an example, we sketch in what follows the OWL mapping for the inscription AE 1987, 635¹³ to illustrate the differences between an EpiDoc XML file and the corresponding OWL representation. The inscription is shown in Figure 4.

¹² http://www.getty.edu/research/conducting_research/vocabularies/aat/
¹³ http://edh-www.adw.uni-heidelberg.de/EDH/inschrift/012116


Fig. 4. The AE 1987,00635 inscription

The text that can be read on it is:
“DIS MAN M MESSIO ABASCANTO SEGONTINO IVLIA SCINTILA MARITO PIENTISSIMO ET SIBI”

The transcription (following a standard notation) is:
Dis Man(ibus) / M(arco) Messio Abascanto Segontino Iulia Scinti[l]la marito / pientissimo et sibi

Which can be translated as: “To the sacred spirits of the departed. Marco Messio Abascanto of Segontia. Iulia Scintilla for her loving husband and for herself”.

The following is an extract of the corresponding description in an EpiDoc XML file:

. . .
UAH - 2009
. . .
Dis Manibus / Marco Messio Abascanto Segontino Iulia Scintilla marito / pientissimo et sibi
. . .
To the sacred spirits of the departed. Marco Messio Abascanto of Segontia. Iulia Scintilla for her loving husband and for herself
Guadalajara / Complutum
AE 1987,00635 entre otros



As can be seen, all the information is grouped into one file, formatted according to the standard, whose markup tags the information contained in it. In what follows, we discuss some relevant parts of the corresponding mapping, using the Manchester OWL Syntax:

Class: Civilization

ObjectProperty: has_civilization
    Domain: Inscription
    Range: Civilization
    InverseOf: is_civilization_of

Individual: Roman_Civilization
    Types: Civilization
    Facts: is_civilization_of iComp000001,
        is_civilization_of iComp000002,
        is_civilization_of iComp000003,
        is_civilization_of iComp000004

Individual: iComp000001
    Types: Inscription
    Facts: has_Last_Recorded_Location Complutum_Last_location,
        has_Letters Roman_Capital_Letters,
        has_Original_Location Complutum_Orig_location,
        has_civilization Roman_Civilization,
        has_editor UAH,
        has_material Sandstone,
        has_pers_name Abascanto,
        has_pers_name Marco,
        has_pers_name Messio,
        bibliography "AE 1987,00635",
        depth 0.3f,
        edition_Text "Dis Man(ibus) /M(arco) Messio Abascanto/ Segontino/Iulia Scintil(l)a marito / pientissimo et sibi",
        height 1.8f,
        lettersWidth 0.0f,
        translation "A los Dioses Manes. Marco Messio Abascanto, de Segontia a Iulia Scintilla de su marido piadosisimo y a si mismo..."@es,
        width 1.2f

Individual: Marco
    Types: PraeNomen
    Facts: praenomen "Marcus"

Individual: Messio
    Types: Nomen
    Facts: nomen "Messio"

Individual: Abascanto
    Types: Cognomen
    Facts: cognomen "Abascanto"

In the OWL representation, information is expressed through classes and properties, avoiding free-text strings where possible. It should be noted that each element of the ontology (whether class, property or data) is identified by its URI, through which each entry is connected with others. The class schema is shown in Figure 5.

Fig. 5. A fragment of the structure of the EpiOnt ontology classes


Figure 5 shows a fragment of the graph of classes and their relations in the ontology EpiOnt. Following the edges, any element contained in the system can be accessed by traversal, establishing the required semantic relationships among the inscriptions.

4 Exporting Epigraphic Linked Data

The idea of linked data is to serve information as RDF files or through SPARQL endpoints, and its philosophy is “you can find other” (Berners-Lee, 2006). A fundamental feature of this vision is that the graphs are decentralised: there is no single server of statements; instead, anyone can contribute statements by making them available on the Web. According to Bizer (2009), “technologically, the core idea of Linked Data is to use HTTP URLs not only to identify Web documents, but also to identify arbitrary real world entities.” This idea brings a new dimension to the sharing of epigraphic information, allowing software agents to consume RDF information for specific purposes, complementing interfaces oriented to human use. Each of the elements described in the previous section will be referenced by a unique address, a URI (Uniform Resource Identifier), which enables an unambiguous, consistent and permanent identification of inscriptions and all their associated information items. In our approach, information is already stored in OWL-RDF, so the main additional requirements are a consistent URI design for the information items and the deployment of the services that provide them.

In what follows, the main elements of the linked data approach taken are introduced by example. Consider an Inscription instance from the database Hispania Epigraphica (HE)¹⁴. The URI will have the following appearance:

http://www.eda-bea.es/4134

This simply refers to the object with HE code 4134. As is common in epigraphic catalogues, the same object is assigned several codes, one per organization system. In consequence, the suffix of the URI can be replaced by any other reference; in the example, CIL_II_2309¹⁵ is the Corpus Inscriptionum Latinarum (CIL)¹⁶ reference for the same object. With this minimal convention, the main objects are properly referenceable and links are easily formed.
Following a similar convention, the main resources also exposed as linked data are all the separate entities in the model, e.g. Civilization, Letters, Locations, Material, Place_Name, Last_Recorded_Location, Original_Location, etc.
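The URI convention just described can be sketched in a few lines of Python. The sketch is illustrative only: the base address is taken from the example above, while the function names, the alias table and the path layout used for the other entities (base, class name, local name) are assumptions, since the text does not fix them.

```python
from urllib.parse import quote

# Base address used in the examples of this section.
BASE_URI = "http://www.eda-bea.es/"

# Hypothetical alias table: alternative catalogue references -> HE code.
ALIASES = {"CIL_II_2309": "4134"}


def inscription_uri(reference, base=BASE_URI):
    """Build the URI of an inscription from any of its catalogue references."""
    return base + quote(reference, safe="")


def canonical_he_code(reference, aliases=ALIASES):
    """Resolve an alternative reference to its HE code, if one is known."""
    return aliases.get(reference, reference)


def entity_uri(class_name, local_name, base=BASE_URI):
    """Build the URI of another entity (Civilization, Material, ...).

    The <base>/<Class>/<name> layout is an assumption of this sketch.
    """
    return f"{base}{quote(class_name, safe='')}/{quote(local_name, safe='')}"


if __name__ == "__main__":
    print(inscription_uri("4134"))
    print(inscription_uri("CIL_II_2309"))
    print(inscription_uri(canonical_he_code("CIL_II_2309")))
    print(entity_uri("Civilization", "Roman_Civilization"))
```

Keeping the alias table separate from URI construction lets a server answer requests for any catalogue reference while still describing a single underlying resource.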

We will use the following example to show how data from a concrete inscription is exported: inscription AE 1987, 635¹⁷.

¹⁴ http://www.eda-bea.es/
¹⁵ http://www.eda-bea.es/pub/record_card_1.php?refpage=/pub/search_select.php&quicksearch=CIL_II_2309&rec=4134
¹⁶ http://www2.uah.es/imagines_cilii/
¹⁷ http://edh-www.adw.uni-heidelberg.de/EDH/inschrift/012116


In the ontology this inscription will be a resource identified by an internal identifier, say iComp000001. However, the URI for exporting the data will be http://www.eda-bea.es/108 (or the same URI with the HE identifier replaced by another normalized one), according to the convention described above. To illustrate the data exported via these URIs, consider the inscription HE 108 in RDF Turtle notation¹⁸:

@prefix epiont: <…> .
@prefix rdfs: <…> .
@prefix he: <…> .

<…> a epiont:Inscription ;
    epiont:has_same_gens he:619 ;
    epiont:has_same_gens he:4587 ;
    ...
    epiont:has_Editor he:UAH ;
    epiont:has_Civilization epiont:Roma ;
    epiont:has_Person he:MarcoMessioAbascanto ;
    epiont:translation "To the sacred…"^^xsd:string ;
    rdfs:seeAlso <…> .
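An export of this kind could be generated along the following lines. The sketch uses only the Python standard library and writes N-Triples, a line-based subset of Turtle, to keep the serialization simple. The ontology namespace http://example.org/epiont# is a placeholder chosen for illustration, not the actual EpiOnt namespace, and the function names are likewise hypothetical.

```python
# Placeholder namespace for the ontology; the real EpiOnt namespace may differ.
EPIONT = "http://example.org/epiont#"
# Base of the inscription URIs, following the convention of this section.
HE = "http://www.eda-bea.es/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text, lang=None):
    """Serialize a string literal, escaping backslashes, quotes and newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"' + (f"@{lang}" if lang else "")


def inscription_triples(he_code, same_gens=(), translation=None):
    """Yield N-Triples lines describing one inscription."""
    subject = iri(HE + he_code)
    yield f"{subject} {iri(RDF_TYPE)} {iri(EPIONT + 'Inscription')} ."
    for other in same_gens:
        yield f"{subject} {iri(EPIONT + 'has_same_gens')} {iri(HE + other)} ."
    if translation is not None:
        yield f"{subject} {iri(EPIONT + 'translation')} {literal(translation, 'en')} ."


if __name__ == "__main__":
    for line in inscription_triples(
        "108",
        same_gens=["619", "4587"],
        translation="To the sacred spirits of the departed.",
    ):
        print(line)
```

In a deployed service an RDF library would normally take care of serialization; the hand-written version is used here only so that the example has no external dependencies.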

In this inscription, Marcus, as praenomen, is the first name, while the nomen, Messius, identifies the individual's gens or family. This way, we can infer some relationships with other inscriptions, in this case relating those of the same family (has_same_gens). RDF triples can be used to establish individual links among resources, connecting instances among themselves and allowing navigation among different information elements. The predicate rdfs:seeAlso is used to point to alternate descriptions of the same inscription in other databases; in this case, it points to a (fictitious) location in another database in which the inscription has a different internal number. An agent taking the RDF above could then ask for the RDF description corresponding to MarcoMessioAbascanto.

Thanks to the content negotiation facility of the HTTP protocol, the client can tell the server which representation of a resource it prefers. Thus, if the server cannot serve the requested resource directly, it is able to redirect the client to the correct one. The following example explains the operation. A system user requests an HTML representation of the resource AE 1987, http://www.inscom.com/resource/AE1987:

GET /resource/108 HTTP/1.1
Host: inscom.com
Accept: text/html;q=1

The server would return:

HTTP/1.1 302 Found
Location: http://www.eda-bea.es/pub/record_card_1.php?refpage=/pub/search_select.php&quicksearch=abascanto&rec=108

¹⁸ http://www.w3.org/TeamSubmission/n3/


The user requests a resource from the given host in HTML format. The parameter q=1 gives HTML the highest preference and, since no other media type is listed, HTML is the only format the user asks for (listing further types with lower q values would let the server fall back to them when the resource is not available in HTML). If the user had instead preferred to access the resource in RDF, the request would be:

GET /resource/108 HTTP/1.1
Host: inscom.com
Accept: text/html;q=0.5, application/rdf+xml

The server interprets q=0.5 as a lower preference for HTML: the client wants RDF whenever it is available, since a media type listed without an explicit q value defaults to q=1. The server therefore returns:

HTTP/1.1 303 See Other
Location: http://www.inscom.com/data/108
Vary: Accept

The client then follows the redirect and receives the corresponding RDF, from which it can traverse the dereferenceable URIs to reach the resources connected with it. The linked URIs can be at any location, which enables cross-linking of the existing databases.
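The negotiation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used by any of the databases mentioned: the function names and the /page/ path for the HTML document are hypothetical, only the /resource/ and /data/ paths appear in the example, and wildcard media types such as */* are ignored for brevity.

```python
# Hypothetical mapping from media type to the document URI served for it.
# "/data/{id}" follows the example; "/page/{id}" is an assumed placeholder.
REPRESENTATIONS = {
    "application/rdf+xml": "http://www.inscom.com/data/{id}",
    "text/html": "http://www.inscom.com/page/{id}",
}


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a media type without an explicit q value defaults to 1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so types with equal q keep their header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(resource_id, accept_header):
    """Return (status, headers) for a request to /resource/<resource_id>."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        template = REPRESENTATIONS.get(media_type)
        if template:
            location = template.format(id=resource_id)
            return 303, {"Location": location, "Vary": "Accept"}
    # None of the requested media types is available.
    return 406, {"Vary": "Accept"}


if __name__ == "__main__":
    print(negotiate("108", "text/html;q=0.5, application/rdf+xml"))
```

With the Accept header of the second request, the sketch picks application/rdf+xml because its implicit q of 1 outranks the 0.5 given to text/html, and it answers with a 303 redirect to the /data/ document.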

5 Conclusions and Outlook

Epigraphic information has evolved from printed catalogues to digital databases in recent years. Recently, the EpiDoc guidelines for encoding epigraphic information have provided an opportunity to share that information using a common syntax. However, EpiDoc still relies on a metadata schema that uses mostly text fields and lacks the computational semantics needed to build advanced analysis, search and browsing applications. This paper has described the main elements of an OWL ontology based on the XML EpiDoc format, which allows for a formal modeling of domain and descriptive concepts related to epigraphy, including referencing and linking to existing thesauri and controlled vocabularies. Inference can be used to derive or relate information, as has been described in the examples. The ontology and the mapping from EpiDoc presented here are preliminary and will be revised, extended or modified in future work. However, they serve as the basis for going a step further in creating rich metadata for epigraphic resources.

The main elements of a linked data approach for the open sharing of epigraphic databases have also been described. The linked data approach is based on the OWL representation, so that the benefits of semantics and inference are exploited in enriching the information returned about the inscriptions.

Future work will be directed to evaluating the proposed OWL mapping by piloting it in existing epigraphic databases. The approach will also be promoted among existing communities and databases to foster adoption of linked data. Eventually, data integrators that combine different databases using linked data would be developed to allow integrated access to the distributed, shared databases providing epigraphic information.

Acknowledgements

The work presented here has been co-financed by the project “eCultura: Desarrollo de una Plataforma Semántica para la Preservación y Explotación de Contenido Cultural”, funded by the Spanish Ministry of Industry, grant number TSI-020501-2008-53.


F.-L. Álvarez, E. García-Barriocanal, and J.-L. Gómez-Pantoja
