Visual Navigation Over The Linked Open Data Cloud Using The Ontodia Library
Daniil Razdyakonov1, Dmitry Pavlov1, Yury Emelyanov1, Alexey Morozov1, Olga Belyaeva1, Dmitry Mouromtsev2, and Gerhard Wohlgenannt2
1 VISmart Ltd., St. Petersburg, Russia {daniil.razdyakonov, dmitry.pavlov, yury.emelyanov, alexey.morozov, olga.belyaeva}@vismart.biz
2 ITMO University, St. Petersburg, Russia
[email protected],
[email protected]
Abstract. We present a live demo3 of a web tool designed for incremental visual navigation and exploration of the Linked Open Data (LOD) cloud. The demo provides a seamless user experience when moving from one data source to another within the LOD cloud. We demonstrate a method for creating an internal unified data model and an architecture in which contextual information about external resources and linked ontologies is converted and presented as an interactive, expandable graph. The Ontodia library4 uses IRIs to generate HTTP requests to LOD servers and converts the responses into semantic graphs for storing and visualizing the results.
Keywords: semantic data visualization, linked open data navigation, linked open data cloud, RDF visualization, ontology visualization.
1 Motivation
The LOD cloud is an unprecedented attempt to provide structured, machine- and human-readable data that can be used for developing apps, answering general-knowledge questions, bringing additional context to queries, etc. Despite this undeniable potential, the absence of convenient and practical end-user tools for effective consumption of LOD is one of the main blockers of its massive uptake [3]. We enable the Ontodia library to work with most LOD resources in order to lower the entry barrier for regular users who want to engage with the knowledge available in the LOD cloud.
2 Related Work
QueryVOWL [4]. VOWL is a visual query language, and QueryVOWL is a tool that implements it as a web-based application. In comparison with our demo, the main focus of this tool is query construction rather than data exploration.

Lodlive [2]. Lodlive demonstrates an appealing approach to graph visualization based on a node-link display pattern. DBpedia and Wikidata5 can be explored with the tool, but in the case of Wikidata the property and resource labels were not resolved into meaningful text, which prevents the user from extracting any knowledge. In our demo we devoted much time to rendering the data from the most popular datasets properly.

Discovery Hub [5]. Discovery Hub is an advanced tool for knowledge presentation and exploration. The link provided by the authors in their publications currently returns an error. Compared to our demo, the tool provides more textual information, which is sometimes more convenient for the user than a diagram. Like QueryVOWL, however, Discovery Hub is limited to knowledge from DBpedia.

To the best of our knowledge, there are currently no visualization tools for convenient and seamless navigation and exploration of external linked datasets.

3 http://wikidata-lod.apps.vismart.biz/composite.html
4 https://github.com/ontodia-org/ontodia
3 The End-User Application
Fig. 1. The architecture of the LOD navigation solution based on the Ontodia library
The Ontodia library [6] is a utility for visualizing LOD as an interactive, expandable graph, with an emphasis on keeping the user aware of the structure of the data and its knowledge context. For data sources hosted on reliable premises, the tool can use their SPARQL endpoint to query the data incrementally as users expand nodes on a diagram.
5 https://www.wikidata.org
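The incremental querying mentioned above can be illustrated with a small sketch. It is not Ontodia's actual query or API: the function name `build_expand_query`, the result variable names, and the paging parameters are assumptions made for illustration. It only shows how expanding one node might translate into a bounded SPARQL request for that node's incoming and outgoing links.

```python
import re

# Characters that the SPARQL grammar forbids inside an IRI reference (<...>).
_FORBIDDEN_IN_IRI = re.compile(r'[<>"{}|^`\\\x00-\x20]')


def build_expand_query(node_iri: str, limit: int = 50, offset: int = 0) -> str:
    """Build a SPARQL query for one page of links around ``node_iri``.

    Hypothetical helper: the query shape is an assumption, not the query
    that Ontodia itself sends.
    """
    # Reject anything that could break out of the <...> delimiters.
    if not node_iri or _FORBIDDEN_IN_IRI.search(node_iri):
        raise ValueError(f"not a safe IRI: {node_iri!r}")
    if limit <= 0 or offset < 0:
        raise ValueError("limit must be positive and offset non-negative")
    return (
        "SELECT ?link ?neighbor ?direction WHERE {\n"
        # Outgoing links: keep only IRI-valued objects, literals are attributes.
        f"  {{ <{node_iri}> ?link ?neighbor . FILTER(isIRI(?neighbor))\n"
        '    BIND("out" AS ?direction) }\n'
        "  UNION\n"
        # Incoming links: other resources that point at this node.
        f"  {{ ?neighbor ?link <{node_iri}> .\n"
        '    BIND("in" AS ?direction) }\n'
        "}\n"
        f"LIMIT {limit} OFFSET {offset}"
    )


if __name__ == "__main__":
    print(build_expand_query("http://example.org/resource/a", limit=10))
```

Requesting one page at a time keeps each expansion step small; a client would raise `offset` only when the user asks for more neighbors of the same node.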
For this demonstration we have constructed a special RDFDataProvider (see the architecture diagram in Fig. 1) for pulling external data from the LOD cloud. It makes HTTP requests to LOD sources to fetch data in an RDF serialization. If content negotiation succeeds and the results are parsed as RDF, Ontodia employs the rdf-ext JavaScript library6 to transform the data into an internal model comprising only six basic concepts: Elements (vertices), Links (edges), Properties (attributes of vertices), ElementTypes (types of vertices), LinkTypes (types of edges), and PropertyTypes (types of attributes). Once the data is prepared, it is passed on for graph rendering. The user does not notice the leap from one data source to another: the Ontodia graph behaves exactly as it would with hosted data. With these improvements, Ontodia expands the graph by fetching resources that are referenced but not defined in the initial dataset, specifically: 1) linked ontologies, which provide relevant metadata such as multilingual labels and class hierarchies, and 2) referenced external resources, according to the LOD principles [1].

3.1 Data Volume Problem
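Before turning to the measurements, the mapping from fetched triples to the internal model described above can be sketched as follows. This is an illustration under stated assumptions rather than Ontodia's implementation (which is written in JavaScript on top of rdf-ext): the class names, the tuple layout of a triple, and the rule that `rdf:type` statements become element types, literal objects become properties, and IRI objects become links are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass
class Element:
    """A vertex of the graph, with its types and literal-valued properties."""
    iri: str
    types: set[str] = field(default_factory=set)
    properties: dict[str, list[str]] = field(default_factory=dict)


@dataclass(frozen=True)
class Link:
    """A directed edge between two elements."""
    source: str
    link_type: str
    target: str


@dataclass
class GraphModel:
    """Hypothetical container for the six concepts named in the text."""
    elements: dict[str, Element] = field(default_factory=dict)
    links: list[Link] = field(default_factory=list)
    element_types: set[str] = field(default_factory=set)
    link_types: set[str] = field(default_factory=set)
    property_types: set[str] = field(default_factory=set)

    def element(self, iri: str) -> Element:
        # Create the element on first mention, reuse it afterwards.
        return self.elements.setdefault(iri, Element(iri))


def triples_to_model(triples) -> GraphModel:
    """Convert (subject, predicate, object, object_is_literal) tuples."""
    model = GraphModel()
    for subject, predicate, obj, is_literal in triples:
        element = model.element(subject)
        if predicate == RDF_TYPE and not is_literal:
            element.types.add(obj)
            model.element_types.add(obj)
        elif is_literal:
            element.properties.setdefault(predicate, []).append(obj)
            model.property_types.add(predicate)
        else:
            model.element(obj)  # the target may not be described yet
            model.links.append(Link(subject, predicate, obj))
            model.link_types.add(predicate)
    return model
```

Link targets that are mentioned but not yet described end up as bare elements, which corresponds to the "referenced, but not defined" resources that a later request would fill in.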
Table 1. Requests sent and data transferred at each navigation step

Step                                                         No. of requests (produced / timed out)   MB transferred
1. Cranberries (BBC Things)                                  17 / 0                                   0.2
2. Cranberries (Wikidata)                                    26 / 0                                   0.8
3. Cranberries (MusicBrainz7)                                23 / 0                                   0.4
4. Limerick (location of formation property of Wikidata)     11 / 0                                   0.1
5. Limerick (GeoNames8)                                      14 / 0                                   0.6
6. Cranberries (DBpedia9)                                    142 / 10                                 18.7
To illustrate this problem, we start navigation from BBC Things10. We take the band "Cranberries" as a search phrase and locate its IRI. We drag and drop the IRI onto the Ontodia canvas, and Ontodia immediately pulls all the data related to this entity, including the BBC Things ontology. We click the Cranberries entity node on the canvas and then the navigation icon located to the right of the node. From the list of properties we select "same as" and choose the item named Q483810 (Ontodia cannot display the label adequately because at this point it has not yet received the data needed to resolve it). By repeating several similar steps we generate the diagram presented in Fig. 2. Table 1 shows how the amount of data grows with the number of generated requests (data taken from the browser's console). We plan to overcome this complication by creating a set of filters that prevent the loading of irrelevant data.
6 https://github.com/rdf-ext/rdf-ext
10 http://www.bbc.co.uk/things/
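The planned filters are not specified further in this paper, so the following is only one possible shape for them. All names (`Triple`, `language_filter`, `predicate_blocklist`, `apply_filters`) are hypothetical. The sketch drops literals in unwanted languages and statements with unwanted predicates before they reach the graph model.

```python
from typing import Callable, Iterable, Iterator, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    lang: Optional[str] = None  # language tag of a literal, None otherwise


TripleFilter = Callable[[Triple], bool]


def language_filter(allowed: Iterable[str]) -> TripleFilter:
    """Keep untagged values and literals whose language tag is allowed."""
    allowed_set = set(allowed)

    def keep(triple: Triple) -> bool:
        return triple.lang is None or triple.lang in allowed_set

    return keep


def predicate_blocklist(blocked: Iterable[str]) -> TripleFilter:
    """Drop statements whose predicate is considered irrelevant."""
    blocked_set = set(blocked)

    def keep(triple: Triple) -> bool:
        return triple.predicate not in blocked_set

    return keep


def apply_filters(
    triples: Iterable[Triple], filters: Iterable[TripleFilter]
) -> Iterator[Triple]:
    """Yield only the triples accepted by every filter."""
    active = list(filters)
    for triple in triples:
        if all(accept(triple) for accept in active):
            yield triple
```

Filters of this kind only reduce what is stored and rendered. Reducing the transferred volume reported in Table 1 would additionally require deciding which IRIs not to dereference in the first place.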
Fig. 2. The diagram generated in Section 3.1
4 Conclusion
In this demo we present an approach for convenient navigation over distributed LOD datasets. The main contributions are (i) developing an open-source prototype of the system, and (ii) identifying usability issues based on actual user experience and suggesting ways to solve them.
References
1. Bizer, C., Heath, T., Berners-Lee, T.: Linked data: Principles and state of the art. In: World Wide Web Conference. pp. 1–40 (2008)
2. Camarda, D.V., Mazzini, S., Antonuccio, A.: Lodlive, exploring the web of data. In: Proceedings of the 8th Int. Conf. on Semantic Systems. pp. 197–200. ACM (2012)
3. Freitas, A., Curry, E., Oliveira, J.G., O'Riain, S.: Querying heterogeneous datasets on the linked data web: challenges, approaches, and trends. IEEE Internet Computing 16(1), 24–33 (2012)
4. Haag, F., Lohmann, S., Siek, S., Ertl, T.: Visual querying of linked data with QueryVOWL. In: SumPre-HSWI@ESWC (2015)
5. Marie, N., Gandon, F., Ribière, M., Rodio, F.: Discovery hub: on-the-fly linked data exploratory search. In: Proceedings of the 9th Int. Conf. on Semantic Systems. pp. 17–24. ACM (2013)
6. Mouromtsev, D., Pavlov, D., Emelyanov, Y., Morozov, A., Razdyakonov, D., Galkin, M.: The simple web-based tool for visualization and sharing of semantic data and ontologies. In: Int. Semantic Web Conf. (Posters & Demos) (2015)