Int. J. Metadata, Semantics and Ontologies, Vol. 6, Nos. 3/4, 2011
Accessing learning resources described in semantically enriched weblogs

I. Ruiz-Rube, C.M. Cornejo and J.M. Dodero*
Department of Computer Languages and Systems, University of Cádiz, C/Chile 1, 11002 Cádiz, Spain
Fax: +34 956 015 139
E-mail: [email protected]
E-mail: [email protected]
E-mail: [email protected]
*Corresponding author

Abstract: In this paper, we describe how to design learning activities that dynamically provide thematic Web resources from a linked-data repository. The aim is to enable teachers to share a variable set of resources related to a given subject and postpone the actual resource delivery to the deployment or enactment of the course. We have proposed a Learning Activity Management System (LAMS) tool that provides an interface to automatically select the related resources to be delivered to the students who are running a learning activity. The thematic resource repository is a linked-data extension of the WordPress blog engine. This extension allows enriching text-based and video information contained in blog entries with RDF triples that can be externally managed and exploited. Our approach allows discovering learning resources from external linked-data sets and enriching blog contents with linked data, independently of the underlying conceptual model.

Keywords: linked-data; learning design; LMS integration; learning activity management system; RDF.

Reference to this paper should be made as follows: Ruiz-Rube, I., Cornejo, C.M. and Dodero, J.M. (2011) ‘Accessing learning resources described in semantically enriched weblogs’, Int. J. Metadata, Semantics and Ontologies, Vol. 6, Nos. 3/4, pp.175–184.

Biographical notes: I. Ruiz-Rube is a Researcher at the University of Cádiz (UCA), Spain. He has an MSc in Software Engineering and Technology from the University of Seville. His fields of research are software process improvement, semantic web and model-driven engineering. He has published several papers in these fields. Previously, he worked as a Software Engineer in several consulting companies such as EVERIS Spain, S.L. and SADIEL S.A.

C.M. Cornejo currently works at the UCA as a Researcher. He has a BSc in Computer Science from the University of Seville. His research interests are related to semantic web and e-learning technologies, in which he has published several papers. Formerly, he worked as a software developer for the Technology Innovation Center of AtSistemas S.L.

J.M. Dodero is an Associate Professor at the UCA. He has a PhD in Computer Science and Engineering from the University Carlos III of Madrid. He worked as an R&D Engineer at iSOCO S.A. and as a Lecturer at the University Carlos III of Madrid. His fields of research are software and web engineering and semantics, with a special focus on technology-enhanced learning applications. He is the author of numerous publications in international conferences and journals. He is a founder and member of the Spanish Chapter of the ACM SIGCSE.
Copyright © 2011 Inderscience Enterprises Ltd.

1 Introduction

The life-cycle of a learning activity in a course consists of a number of steps, including authoring (i.e., the creation, packaging and distribution of learning resources), deployment (i.e., allocating the course elements such as users, resources, activities, applications or services to the learning activity, according to the actual members, roles and structure of the course) and enactment (i.e., starting the interaction with the actually available resources and services as designed) (De la Fuente et al., 2008). In a web-based learning environment, perhaps the simplest kind of learning activity that can be designed is to share among students the URLs of a set of web-based resources or applications about a given subject or theme. Authoring, deployment and enactment of even the simplest activity usually come coupled, meaning that if the teacher wants to share a given resource in the course, he or she must explicitly know its URL and include it as part of the authoring phase, and then prepare its deployment to a specific group of learners, before it is enacted in the actual run of the course.

In this work, we present a dynamic approach to the provision of thematic resources in learning activity design, which enables teachers to share a variable set of learning resources and applications about a subject, and to postpone the delivery of the actual resources to either deployment or enactment time. Under this perspective, a new kind of learning activity tool has been designed for an LAMS (Dalziel, 2003). This tool makes it possible to dynamically provide thematic web resources from a linked-data repository. To feed the repository with semantic information, we have built a tool for text and video annotation based on existing, standard ontologies. This tool is a linked-data extension to the well-known WordPress blog engine. Our approach enables the automatic discovery of learning resources from metadata extracted from blog posts that are annotated according to a linked-data model. However, some issues have emerged owing to the use of linked-data technologies. To cope with these issues, we have proposed several solutions concerning the usability of the annotation process and its performance, using extended reasoning capabilities.
2 Linked learning resource integration

Our approach to the dynamic generation of thematic educational resources is based on the linked-data paradigm. Linked data refers to describing and publishing the information of web pages in a structured manner that can be easily processed by software programs (Bizer et al., 2009a). Semantic web techniques are used to integrate information from different sources, obtaining linked-data-based learning resources. The provision of linked data aims at two objectives:

• the automatic generation of learning contents, which can be easily updated without needing to re-author and deliver the course
• keeping learning activities independent from the annotations used to describe the subjects of the resources.

Figure 1 depicts a general view of our proposal. To deliver a variable set of thematic web resources to the students, perhaps the simplest way is to ask them to do a Google search for a number of terms. They usually get thousands of web resources. To focus the search results, a refinement is to perform the search on a learning resource repository for a given subject, according to a set of metadata that must be specified in the query. Our approach is based on the latter, using enriched resources that contain RDF information describing their subject or theme. The implementation of the learning activity can then query a semantic repository that exposes its data via an SPARQL endpoint. The endpoint provides access to the thematic ontology, which makes the activity definition independent from the configured ontology.

Figure 1 Conceptual diagram of our proposal, based on a producer/consumer model (see online version for colours)
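As a minimal sketch of how a consumer application might ask such a repository for the resources related to a concept, the following Python fragment builds a generic SPARQL query and encodes it as an HTTP GET request using the `query` parameter defined by the SPARQL protocol. The endpoint URL, the concept URI and the function names are hypothetical placeholders, and the query shape is only one plausible choice rather than the one used by the tool described here.

```python
from urllib.parse import urlencode

# Hypothetical placeholders: neither URL comes from the paper.
ENDPOINT = "http://example.org/sparql"
CONCEPT = "http://example.org/resource/flamenco"


def build_query(concept_uri: str, limit: int = 50) -> str:
    """Return a SPARQL query listing every IRI directly linked from the concept."""
    return (
        "SELECT DISTINCT ?property ?resource WHERE { "
        f"<{concept_uri}> ?property ?resource . "
        "FILTER(isIRI(?resource)) "
        f"}} LIMIT {limit}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as a GET request with the standard 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


query = build_query(CONCEPT)
url = build_request_url(ENDPOINT, query)
print(url)
```

Because the query only mentions the concept URI and generic variables, nothing in it depends on a particular ontology, which is the kind of independence the approach aims for.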
The approach has been built on top of a learning systems integration architecture that uses semantic and linked-data technologies to decouple learning resources and services from the Virtual Learning Environment (VLE) or Learning Management System (LMS) that actually manages the course activities. The learning activity tool has been connected to the semantic content repository of the eCultura platform (www.ecultura.org). This platform hosts both producer and consumer web applications built around a linked RDF metadata repository. The repository is fed with the Music Ontology, FOAF and CIDOC CRM1 ontologies. The activity tool is built onto an LAMS instance, which acts as a consumer application of the eCultura platform. This way, thematic resources related to musical and cultural concepts can be automatically generated (Cornejo et al., 2012). Learning resources can be created through the other linked-data producer applications that feed the resource repository. To this end, we have built a tool for text and video annotation based on standard ontologies. The tool has been designed as an extension for WordPress blog contents. Properly annotated blog posts can be exploited to enhance users’ collaboration while examining the information held in the blog.
2.1 Designing dynamic learning activities

To design the consumer application, we extended the LAMS Share Resource tool, which enables a teacher to share a predefined list of URLs with the students who will execute the activity.
2.1.1 Share thematic resource tool design

The new Share Thematic Resource (STR) tool is developed using an LAMS tool contract Application Programming Interface (API), through which an LMS can connect its core services (namely administration, author, learner and monitor) with any external application. The STR tool makes it easy to select the resources that are to be presented to the students by means of a hyperbolic interface for navigating through the tree of thematic concepts and relationships, according to a given ontology. After the desired concept is selected, the tool automatically selects the related web resources that are to be delivered to the students. The Add Resource option of the STR tool was extended to allow setting up an SPARQL endpoint. Figure 2 depicts items from the MO and FOAF vocabularies. Any other RDF-based ontology can also be managed. The navigable interface, implemented with the JavaScript InfoVis toolkit, enables the creation of interactive data visualisations of the thematic information. Whenever the instructor selects a concept, a query against the endpoint is generated and the STR tool handles the response. From this response, the tool automatically delivers the set of web resources that are to be shared among students. The communication with the server is implemented through jQuery, which offers a suite of tools for RDF processing and JSON parsing.

Figure 2 Interactive hyperbolic navigation through the concepts of the ontology
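The response handling can be illustrated with a short sketch. The STR tool itself does this in JavaScript with jQuery; the Python fragment below is only an analogous illustration, assuming that the endpoint returns the standard SPARQL JSON results format. The sample response, the variable name `resource` and the function `extract_uris` are made up for the example.

```python
import json
from typing import List

# Hand-written sample in the SPARQL JSON results format (illustrative data only).
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["resource"]},
  "results": {"bindings": [
    {"resource": {"type": "uri", "value": "http://example.org/page/soleares"}},
    {"resource": {"type": "literal", "value": "A flamenco style"}},
    {"resource": {"type": "uri", "value": "http://example.org/video/42"}},
    {"resource": {"type": "uri", "value": "http://example.org/page/soleares"}}
  ]}
}
"""


def extract_uris(response_text: str, variable: str = "resource") -> List[str]:
    """Collect the distinct URI values bound to one variable, in response order."""
    bindings = json.loads(response_text)["results"]["bindings"]
    uris: List[str] = []
    for row in bindings:
        cell = row.get(variable)
        # Literals are skipped: only dereferenceable URIs can be shared as resources.
        if cell and cell.get("type") == "uri" and cell["value"] not in uris:
            uris.append(cell["value"])
    return uris


resources = extract_uris(SAMPLE_RESPONSE)
print(resources)
```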
2.1.2 Generating dynamic learning resources

The STR tool is based on the automatic discovery of resources from explicit linked-data resources. After the author selects the concept that he or she wants to present to the students, the system automatically collects all the interesting URIs for such resources (see Figure 3). Subsequently, the system presents the activity author with a window showing the types of relationships supported by the selected concept. Additionally, the system displays some of the URIs associated with the concept for each type of relationship, in a query-by-example fashion. The user can then select the types of relationships that he/she is interested in and obtain related learning resources. After that, the system offers the possibility of selecting the HTTP content types that he/she is interested in, according to standard MIME types (e.g., text/html, application/pdf, application/rdf+xml, etc.). Thus, the system can gather the preferences of the author of the activity: the thematic concepts to deal with, the possible relationships and the content types to publish.

Figure 3 Visual interface to configure how thematic resources must be generated and delivered to the students (see online version for colours)

After the activity has been designed and built into any LAMS learning activity sequence, it is ready for deployment and enactment to learners. When a student executes the activity, the system launches a background query to the SPARQL endpoint. This query discovers all the new URIs associated with the concept selected by the instructor, taking into account the preferences indicated in the authoring phase of the activity. The system also provides the ability to store the set of URIs that are relevant for the selected concept as a snapshot at a given time. Thus, it is possible to reduce latency on the server by avoiding a query to the ontology each time an LAMS activity is run. This is also appropriate in those cases where changes in the set of resources are not frequent or the designers require a fixed set of resources.

2.2 Linked-data weblog enrichment
We have extended the WordPress blog engine with an add-on to enrich posts on the basis of concepts and relationships defined by standard ontologies. To avoid changing the blog engine data model, annotations are combined with the blog post contents, using the RDFa specification. The result is a new version of the blog engine named LinkedBlog. An important objective of LinkedBlog is to keep annotations independent from the underlying ontology models used to describe web resources. For that reason, the add-on can work with any ontological model available through a query endpoint compatible with the SPARQL protocol for RDF.
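The idea of combining annotations with post contents through RDFa can be sketched as follows. This is a simplified illustration and not the markup actually produced by LinkedBlog: the function `annotate` and the subject URI are hypothetical, and the `foaf` and `mo` prefixes are assumed to be declared elsewhere in the page.

```python
from html import escape


def annotate(text: str, subject_uri: str, type_curie: str, property_curie: str) -> str:
    """Wrap a text selection in a span that carries RDFa attributes.

    'about' names the subject, 'typeof' states its class and 'property'
    turns the enclosed text into a literal value of that subject.
    """
    return (
        f'<span about="{escape(subject_uri)}" '
        f'typeof="{escape(type_curie)}" '
        f'property="{escape(property_curie)}">'
        f"{escape(text)}</span>"
    )


markup = annotate(
    "Chano Lobato",
    "http://example.org/artist/chano-lobato",  # hypothetical URI
    "mo:MusicArtist",
    "foaf:name",
)
print(markup)
```

Since the annotation lives in attributes of the post's own HTML, the blog engine stores it like any other post content and its data model is left untouched.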
2.2.1 Text annotation

The text annotation procedure is similar to, for example, the way a user transforms a selected text into a hyperlink. First, the user must select a portion of text and then select the concept to be applied to that text. In the Concept Selector pop-up window (see Figure 4) the user can associate an instance of a concept (using the hyperbolic interface of Figure 2) with the selected text.

Figure 4 Selecting a musician using the Concept Selector window (see online version for colours)

Once the user has selected the right concept or concept type, he or she can insert or edit annotations. In the pop-up window (see Figure 5) the user can indicate whether he or she wants to define a direct relationship (i.e., an RDF property having the concept as its domain) or an inverse relationship (i.e., an RDF property having the concept as its range).

Figure 5 Identifying the Wikipedia page of a musician through the Insert/Edit Annotation window

Listing 1 shows the HTML-embedded RDFa that is generated by the add-on after annotating a post about the flamenco singer Chano Lobato. In this example, his name is annotated with the foaf:firstName (i.e., Sebastián) and foaf:surname (i.e., Ramírez Sarabia) datatype properties, his discography with the mo:discography object property, and the Wikipedia page about him with mo:wikipedia.

Listing 1 RDFa code that is generated from annotating the text of the example

2.2.2 Video annotation

LinkedBlog enables annotating videos that are included in a blog post. Annotations can be made on YouTube videos, since YouTube is the most popular video provider integrated with WordPress. The procedure to describe the information held in the clip is as follows. First, the user must add the video to the post. After that, the user clicks on it and selects the Insert/Edit Video Annotations option. To describe the information contained in a part of the video (see Figure 6), the user must define a time interval along it. For that interval, the user can specify the desired concept or concept type, as well as the required annotations. To achieve this, the user can set direct/inverse relationships with literal values or other concepts. Listing 2 shows the HTML-embedded RDFa code generated by the add-on after annotating a blog post that includes a video of the flamenco artists Chano Lobato and Juan Carmona (also known as Juan Habichuela) performing a Soleá (i.e., a flamenco style). In our example, we have defined a single sample interval, which describes the musical performance concept (mo:Performance). There are also two mo:performed inverse relationships with the artists Juan Carmona and Chano Lobato (i.e., mo:MusicArtist). The RDFa snippet includes both concepts from the Music Ontology domain and the concepts of an upper ontology needed to specify time intervals.
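To make the difference between direct and inverse relationships concrete, the following sketch models an annotated video interval and expands it into subject-predicate-object triples. It is an assumption-laden illustration, not the add-on's data model: the class names, the `ex:startsAt` and `ex:endsAt` predicates and all `ex:` identifiers are invented, whereas `mo:Performance` and `mo:performed` are taken from the example in the text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Relationship:
    property: str
    target: str
    inverse: bool = False  # True: the annotated concept is the object of the triple


@dataclass
class IntervalAnnotation:
    node: str            # identifier of the annotated concept instance
    concept_type: str    # e.g. "mo:Performance"
    start_seconds: int
    end_seconds: int
    relationships: List[Relationship] = field(default_factory=list)

    def to_triples(self) -> List[Triple]:
        """Expand the annotation into triples, swapping roles for inverse relationships."""
        triples: List[Triple] = [
            (self.node, "rdf:type", self.concept_type),
            (self.node, "ex:startsAt", str(self.start_seconds)),  # hypothetical predicate
            (self.node, "ex:endsAt", str(self.end_seconds)),      # hypothetical predicate
        ]
        for rel in self.relationships:
            if rel.inverse:
                triples.append((rel.target, rel.property, self.node))
            else:
                triples.append((self.node, rel.property, rel.target))
        return triples


performance = IntervalAnnotation(
    node="ex:solea-performance",
    concept_type="mo:Performance",
    start_seconds=0,
    end_seconds=95,
    relationships=[
        Relationship("mo:performed", "ex:chano-lobato", inverse=True),
        Relationship("mo:performed", "ex:juan-carmona", inverse=True),
    ],
)
triples = performance.to_triples()
for triple in triples:
    print(triple)
```

With `inverse=True`, each artist becomes the subject of `mo:performed` and the performance its object, which matches the definition of an inverse relationship as a property having the annotated concept as its range.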
Listing 2 RDFa code generated for the video annotation example
Figure 6 Setting the singers of a musical performance via the Insert/Edit Video Annotations window
3 Discussion

During the development of our system, a number of integration issues arose. In the following, we explain these issues and propose approaches to overcome them.

3.1 Learning resource generation strategies

When generating the actual resources that are to be delivered to the student, we can pose the following questions: What knowledge sources can be extracted from the thematic ontology? How can we extract resources of interest to students from a given concept of the ontology? We have identified several strategies to extract actual learning resources.

Underlying ontological knowledge. The set of axioms and assertions stored in an ontology represents an important source of knowledge. However, the definition of ontologies is not intended for human consumption. Thus, we might consider as a learning resource the representation of the ontology in a friendly format (e.g., a visual graph). In our case, this can be done over the subset of the ontology that results from mapping those elements that are related to the selected concepts.

Annotation properties. Another approach is to use the rdfs:seeAlso and rdfs:isDefinedBy properties of the ontology items. These are present in all resources of the ontology and provide additional information about them. However, there is no standard agreement on the use of these properties, which complicates their adoption as an effective and general exploitation method.

Explicitly linked resources. Ontologies allow defining relationships between concepts in multiple ways. For example, in the Music Ontology, the property mo:wikipedia is used to link, for instance, a musical genre to its corresponding Wikipedia page. In the FOAF ontology, the property foaf:homepage relates something to its homepage. These object properties provide direct access to web resources that describe these same concepts. There is an important drawback, however, because accessing such resources depends on the specific ontology. Therefore, it is not known which ontology properties are more likely to publish their assertions as learning resources.

Automatic discovery of resources. From a given concept, we can automatically collect all the related resources available. To this end, all the resources available through the axioms of xsd:anyURI datatype properties and the axioms of object properties can be collected. This alternative presents a major problem. Since it involves an exhaustive search, it is possible to obtain URIs of scarce interest to learners or URIs that return content that is not suitable for human consumption.

All these strategies have pros and cons. We have chosen to use a mixed strategy between automatic discovery of resources and explicitly linked resources, as we explained earlier.
3.2 Non-functional aspects of linked-data integration

Security. LinkedBlog includes an integrated RDFa editor that lets users annotate web contents using the existing concepts in the ontological model. This model can be queried through an SPARQL endpoint. Typically, content management systems, as is the case with WordPress, provide authentication and authorisation mechanisms. However, the ontological data repository has to supply its own access control mechanisms, since a user who is authorised to modify blog contents might not have enough privileges to work with the repository.

Annotation trustworthiness. Another aspect to be considered arises when two or more authorised users are annotating the same concept and contradict each other with respect to the ontological model, or even when information comes from a dubious provenance. This issue raises the need for an annotation review protocol or a publication workflow carried out by an actor with specific domain knowledge.

Evolutionary tracking. Web contents often evolve over time, i.e., once the content has been created, it can be updated or even removed. It is common for annotations that exist at a given time to cease to be valid, for example, owing to an error in the annotation that leads to a later update of the post. Thus, an issue about the persistence of annotations in the ontological model arises. When should annotations be persisted? Possible times include, among others, the moment of annotation, the submission of the post, an asynchronous update task, or user demand.

We have not addressed the security and trustworthiness of annotations in this work. Regarding evolutionary tracking, we have chosen to store metadata on user demand.
3.3 Usability of annotations

One important issue to consider is the usability of the tools, especially in LinkedBlog. Since the extension is designed to be used with any ontological model, its user interface must be completely decoupled from that model. This feature, which a priori can be an advantage because of its generality, might not be attractive enough to end-users, owing to the lack of adaptation to a particular knowledge domain. Therefore, an intensive usability review process was conducted. We identified several problems, which were addressed by the following actions:

Interlinked annotations. The RDFa recommendation allows making annotations while the HTML content is being written, i.e., defining relationships between concepts that are not stored yet but are visible on the posted content. This issue can be solved by persisting annotations as they are provided. This might cause, however, inconsistencies in the ontological model if the post that is being annotated is not eventually saved. Consequently, our solution to this issue relies on the AJAX capabilities of the editor to retrieve and save concepts that are not yet consolidated.

Recent annotations. When annotating concepts, it is usual to share the same set of properties. Thus, the annotation interface has been improved to highlight the most frequently used metadata. Likewise, we improved the visual interface by providing a textual description for each annotation that the user is making. This is possible because all axioms contained in ontologies are self-descriptive (using the property rdfs:label).

Hiding URIs. According to the linked-data paradigm, any individual, property, assertion or axiom must be identified by a URI. However, the management of URIs is rather complex and not very intuitive for the end-user. Therefore, the annotation tool has been adapted to shield the user from working with URIs, showing the rdfs:label values associated with classes or properties instead of URIs whenever they are available.

URI design. There is an additional problem concerning the best way to identify concepts in the web of linked data and how these concepts should be organised in an overall directory of URIs. The idea of managing a flexible and sustainable URI space is beyond the scope of our approach. Nevertheless, a scheme was incorporated to automatically generate URLs for new resources (i.e., individuals) identified in the blog posts.

Annotation complexity. The amount of metadata needed to completely annotate contents always depends on the ontological model used. With certain ontologies, the number of RDF statements needed to define even the simplest concepts can overcomplicate the RDFa annotations. For instance, OWL Time is a standard W3C ontology for defining temporal concepts on web pages and Web services. In the beginning, we planned to annotate the time intervals of the videos according to this model. The large number of RDF statements and the amount of RDFa code required to define a simple time interval, however, made it unmanageable. Listing 3 shows an example of how a simple time interval can be described through the Time Ontology.

Listing 3 Sample interval in OWL functional-style syntax
3.4 Reasoning and performance

Both the content annotation process and the delivery of learning resources from metadata require a set of operations on the axioms and assertions of the ontological model. The number and complexity of the SPARQL queries needed to support the user in this process will, therefore, depend on the complexity of the model itself. For instance, complex models with different hierarchy levels require the execution of multiple SPARQL queries to collect all the data. This is because the mechanisms that exploit the potential benefits inherent to ontologies, such as class inheritance or inference, might not be natively supported by the system. This issue can be mitigated by using semantic reasoners, which can infer logical consequences from a set of axioms and asserted facts of an ontology. Nonetheless, semantic reasoning comes at the cost of significantly decreasing performance.

For example, in LinkedBlog, when the Insert/Edit Annotation window is shown, the repository is queried for those object or data properties whose domain matches the current concept type or any of its supertypes. Since our repository contains the FOAF and Music Ontologies (see Figure 7), finding the properties applicable to mo:SoloMusicArtist requires a pre-processing step that collects the set of nearby classes (e.g., foaf:Person and mo:MusicArtist) and more distant superclasses (e.g., foaf:Agent). Subsequently, the properties applicable to that set of classes are looked up. Using a semantic reasoner such as Pellet (Sirin et al., 2007) makes it possible to obtain all the properties that can be applicable to a class (e.g., up to 86 properties for mo:SoloMusicArtist instances) by running a single SPARQL query, because the reasoner is able to infer over the class hierarchy. Therefore, our add-on provides specific endpoint interfaces for queries enhanced with semantic reasoning capability. These implementations drastically reduce the number and complexity of SPARQL queries.

Figure 7 Excerpt from Music Ontology and FOAF Ontology (see online version for colours)
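The pre-processing step that is needed when no reasoner is available can be sketched with a toy, in-memory class hierarchy. The subclass links below follow the example in the text, but the property-to-domain table is illustrative data only and has not been checked against the actual ontologies; the function names are likewise hypothetical.

```python
from typing import Dict, List, Set

# Toy excerpt of the class hierarchy, following the example in the text.
SUBCLASS_OF: Dict[str, List[str]] = {
    "mo:SoloMusicArtist": ["mo:MusicArtist", "foaf:Person"],
    "mo:MusicArtist": ["foaf:Agent"],
    "foaf:Person": ["foaf:Agent"],
}

# Illustrative domain assignments (assumptions, not verified against MO or FOAF).
PROPERTY_DOMAIN: Dict[str, str] = {
    "foaf:firstName": "foaf:Person",
    "foaf:surname": "foaf:Person",
    "mo:discography": "mo:MusicArtist",
    "mo:performed": "mo:MusicArtist",
    "foaf:mbox": "foaf:Agent",
    "mo:track": "mo:Record",
}


def superclasses(cls: str, subclass_of: Dict[str, List[str]]) -> Set[str]:
    """Collect nearby and more distant superclasses by walking the hierarchy."""
    seen: Set[str] = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for parent in subclass_of.get(current, []):
            if parent not in seen:
                seen.add(parent)
                pending.append(parent)
    return seen


def applicable_properties(cls: str) -> List[str]:
    """Return the properties whose domain is the class or any of its superclasses."""
    classes = superclasses(cls, SUBCLASS_OF) | {cls}
    return sorted(p for p, domain in PROPERTY_DOMAIN.items() if domain in classes)


print(applicable_properties("mo:SoloMusicArtist"))
```

Against a remote repository, each step of the walk would typically be a separate SPARQL query. A reasoner-backed endpoint removes that loop because the inferred superclass memberships are already available to a single query.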
4 Related work

In this section, we briefly review the related work concerning, first, learning systems’ integration architectures and, second, the approaches to manage the linked-data infrastructure.

4.1 Learning systems’ integration architectures

LMSs are used to store, manage and track web-based learning courses and events. A 2009 survey (Ellis, 2009) yields the following functional features as the most valuable in an LMS: reporting (52%), tracking (46%), assessment (45%), content management (29%), course catalogue (28%), authoring (19%), analytics (17%) and collaboration tool integration (15%). In addition, 37% of respondents identified content integration as the biggest challenge in implementing an LMS. LMSs usually have to store and manage the web contents and applications as a part of their responsibilities. More modern virtual environments aim at decoupling the management of resource contents and web applications from an LMS. In such systems, web-based learning resources, applications and services have to be integrated with an LMS, which must keep the functions of managing and tracking the learning process (Yueh and Hsu, 2008). Web-based resources, applications and services must be externally provided, managed and integrated with the learning system.

Learning resource integration was first approached by defining how contents are packaged and delivered to make them shareable in an open format (e.g., SCORM), which, when properly tagged with metadata (e.g., LOM), makes it possible to describe the educational contents they hold (Devedzic et al., 2007). Resources and metadata are usually kept in educational repositories, from which they can then be imported into any learning environment or LMS (Geser, 2007). After enriching content with metadata, Educational Modelling Languages (EMLs) are used to extend the content-based learning course model with formal descriptions of the activities that the course contains (Torres et al., 2010). Some LMSs have been extended with software engines that enable running EML descriptions of the learning activities based on the IMS Learning Design (LD) specification (Olivier and Tattersall, 2005), such as CopperCore (Vogten et al., 2007) and Grail (De la Fuente et al., 2008). Other learning design environments, such as LAMS (Dalziel, 2003), have proposed their own playable model of learning activity sequences along with their associated users, activities and resources, among other items. When linking a learning resource required for a learning activity, all these LD specifications and environments explicitly hard-code the URI of the resource. They do not even exploit resource metadata capabilities to provide dynamic resources to the learning environment.
For instance, with IMS LD, resource URIs have to be part of the activity definition or be explicitly included in the environment of the learning unit. With an LAMS you can use the Add Resource tool to provide students with a set of resources, but their URIs also have to be hard-coded in the learning activities.

To meet the needs of next-generation LMSs, integration software architectures have to be defined that either focus on platforms and software applications implementing service-based learning environments (Briscoe and De Wilde, 2006) or evolve towards web service-based architectures and protocols (De la Fuente et al., 2008; Dodero and Ghiglione, 2008). Indeed, new versions of the most widespread LMSs, such as Moodle, are integrated with external applications through service-based extensions (García-Peñalvo et al., 2011). There is a proposal for a web-based learning system integration architecture that aims at decoupling LMS responsibilities from external learning resources and applications, such as Content Management Systems (CMSs), social networks and so forth. A first level of integration deals with the protocol required to interact with the external learning resources. ReST-based architectural styles have been used to achieve the raw protocol-based integration of resources in an LMS (Dodero and Ghiglione, 2008). A second level of integration aims at further decoupling an LMS and the learning resources through a semantic web services layer (Dodero et al., 2010).
4.2 Approaches for managing linked-data

The linked-data paradigm presents a distinctive feature: its implementation is somewhat the reverse of the traditional data modelling approach. While in classical data modelling the data model is first designed and then populated with data instances, the linked-data approach is to provide the structured data schema and access facilities after publishing data instances on the web. Providing such a structure means enriching web pages with annotations, which are usually compliant with W3C standards such as OWL and RDF(S). Enriching web data to expose their underlying concepts, structure or schema depends on how the data sources are designed. There is common ground between structured database schemas and ontology modelling. Relational database models make it easier to define an RDF/OWL class that holds the same properties as a table in a relational database. Nevertheless, since data on the web are often published as regular text, it is more difficult to automatically map such data to a structured ontology model.

The issue of extending web contents with formal metadata semantics is a challenge for the Semantic web community (Kiryakov et al., 2004). Natural language processing techniques are often used to automatically annotate text-based web sources. It is increasingly common, however, to have multimedia resources, such as images and videos, embedded in actual Web applications. It would also be interesting to describe and annotate the information held in these resources. Since real users are both producers and consumers of web resources, they can play the role of annotator, publisher or reviewer of their contents (Allen, 2008). Web resources have to be enriched as linked-data resources to be machine-understandable. When it comes to the enrichment of web resources with metadata by users, a number of issues have to be considered.
In this work we focus on the standardisation of metadata models and ontologies and on how and where such metadata are stored and managed. Linked-data resources are usually held in semantically enhanced CMSs that often have the functionality and interface of wikis – e.g., DBpedia (Bizer et al., 2009b) and Freebase (Bollacker et al., 2008) – and blogs – e.g., Zemanta (Solc, 2008). These systems provide APIs (Dotsika, 2010) to manage and exploit the linked-data they host. On the wiki side, Freebase and DBpedia can extract information from Wikipedia and publish it as RDF(S), according to their own schemas. With Freebase, special-purpose linked-data applications
can be programmed – e.g., Thinkbase (Hirsch et al., 2008) – and hosted on its platform. It can pull Wikipedia and DBpedia data to augment its linked contents. But when resource annotations have to be linked to a standard ontology, Freebase schemas require an adaptation or transformation step before integration (Pinto and Martins, 2001). This can be seen in Freebase’s RDF output, in which a great number of owl:sameAs identifiers point to DBpedia. On the blogging side, the Zemanta assistant uses the English Wikipedia to capture knowledge from different areas of interest (such as literature, politics and sports) and automatically enrich blog posts with in-text hyperlinks. At the core of the system, there is a collection of semantic entities that represent concepts. These are not endorsed, however, by a standard RDF schema or an OWL ontology. Like our LinkedBlog tool, Zemanta allows the user to annotate free text in a blog, but using only the RDF formats of Freebase or DBpedia. In contrast, LinkedBlog supports any ontological model for annotating blog contents. Although LinkedBlog has been devised as an add-on for WordPress, the tool has been built as an extension to the JavaScript/HTML TinyMCE editor (Note 2). Hence, it can be readily integrated in other content management systems, in a similar way to server-side Zemanta plugins.
In relation to the storage and management of annotations, a distributed approach has been provided by Annotea (Schroeter et al., 2007). Annotea is a Web-based shared annotation system based on a general-purpose open RDF infrastructure. Annotations are external to the documents and can be stored in special annotation servers (Kahan et al., 2002). It also provides several plugins for the client side (e.g., Annozilla (Note 3) or Amaya (Note 4)). Google SideWiki (Note 5) is another browser extension that allows website viewers to add text comments about web contents.
The relevance of these user-generated comments is determined by Google’s ranking algorithms; consequently, they are not grounded in an ontological model. Our annotation tool stores annotations in the CMS itself. The contents are enriched with embedded RDFa, which avoids altering the data model of the system.
Multimedia annotation approaches focus on acquiring metadata descriptions to facilitate indexing, search and retrieval. Feng et al. (2004) have introduced an automatic image and video annotation technique for retrieval, based on textual queries, where the images that form the sequence are partitioned into regions. Their motivation has been to reduce the costly task of manual annotation by users (e.g., librarians), so they propose a statistical generative model that uses a set of annotated training images. The model performs well on closed-domain collections, but the quality of the approach critically depends on the training set used and can hardly be extrapolated to other general domains. In this vein, some software tools have been developed for the semantic annotation of videos using multimedia ontologies (Sicilia et al.,
2011) for educational purposes, based on standard metadata, which adapt existing video sequences as ready-to-use learning resources. They reference domain ontology elements inside metadata elements, using concepts from the Gene Ontology (Note 6) that enrich multimedia resources as part of the learning sequence. Time aspects of such sequences are represented by using the MPEG-7 ontology (Hunter, 2001), which classifies video as continuous annotated segments. The MPEG-7 ontology is linked to a set of predefined concepts of multimedia resources (e.g., dates and time), thus losing the generality that W3C standard ontologies such as the Time Ontology (Note 7) may offer, for instance, to relate date-time values to the temporal dimension of elements from other domains. With regard to multimedia annotations, unlike the proposal of Feng et al. (2004), LinkedBlog uses linked-data technologies; however, it relies on a manual annotation procedure, as in Sicilia et al. (2011). Furthermore, LinkedBlog is agnostic with respect to the ontology used, allowing concepts to be identified in time intervals and relations to be established between them.
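To make the idea of ontology-agnostic annotation of time intervals more concrete, the following sketch shows one possible encoding. It is an illustration under stated assumptions, not the markup that LinkedBlog actually emits, which is not detailed in this section: the function names `media_fragment` and `rdfa_link` are hypothetical, the interval is addressed with a temporal fragment in the style of the W3C Media Fragments URI syntax (`#t=start,end`), and the annotation uses RDFa 1.1-style `about`, `rel` and `resource` attributes with full IRIs so that any vocabulary can be plugged in.

```python
from html import escape


def media_fragment(video_url: str, start_s: int, end_s: int) -> str:
    """Build a URI for the [start_s, end_s) interval of a video, in seconds,
    using a temporal fragment of the form #t=start,end."""
    if not 0 <= start_s < end_s:
        raise ValueError("expected 0 <= start_s < end_s")
    return f"{video_url}#t={start_s},{end_s}"


def rdfa_link(text: str, about: str, rel: str, resource: str) -> str:
    """Wrap a text fragment in a span stating the triple
    <about> <rel> <resource>. All three IRIs are supplied by the caller,
    so no particular ontology is assumed."""
    return (
        f'<span about="{escape(about)}" rel="{escape(rel)}" '
        f'resource="{escape(resource)}">{escape(text)}</span>'
    )


if __name__ == "__main__":
    # Hypothetical video and concept IRIs, used only for illustration.
    interval = media_fragment("http://example.org/v.mp4", 10, 25)
    print(rdfa_link("flamenco", interval,
                    "http://purl.org/dc/terms/subject",
                    "http://dbpedia.org/resource/Flamenco"))
```

Because the markup is embedded in the post content itself, a sketch like this requires no change to the data model of the hosting CMS.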
5 Conclusions
In this paper, we have presented a proposal for dynamically sharing a set of linked-data resources as part of a learning course. An STR LAMS tool makes it possible to design a learning activity whose learning contents can be updated without re-authoring and re-delivering the course. Such resources are selected from an RDF-enriched digital repository that can be fed by a number of content producer applications. In our case, resources are selected from a blog application endowed with linked-data capabilities. The LinkedBlog system makes it possible to semantically enrich blog contents using the RDFa specification, independently of the underlying conceptual model and loosely coupled with the application that hosts the contents. The solution has been developed as an add-on to the common WordPress engine, but can be easily integrated in other kinds of CMS. For testing purposes, linked-data annotations are based on concepts selected from a semantic repository that uses thematic ontologies on the cultural domain, such as the CIDOC CRM, Music Ontology and FOAF. This set of ontologies can be easily extended to additional domains that are more suitable for other educational purposes.
As a result of the work, some strategies for retrieving linked resources from semantic repositories have been explained. We have discussed non-functional aspects of information integration that have to be dealt with, concerning security, trustworthiness and tracking of information. In addition, we have discussed some issues and solutions concerning the usability of the annotation process and its performance, motivated by the use of the extended reasoning capabilities that semantic technologies provide.
The RDF and RDFa specifications are difficult to understand for a mid-level user. Therefore, developing an annotation tool that hides the details of the specifications from users was a reasonable option. Despite improvements in usability, fluent annotation of contents depends on the complexity of the underlying ontological model. It would, therefore, be interesting to study how to work with different views or perspectives of the ontological model. Simplified views would be used for annotation by users, while the more complex ones would be used for automatic annotation. To bridge the gap between the complex and simplified views of the ontological models, domain-specific languages can provide a reasonable approach based on model transformations.

Acknowledgement
This work has been sponsored by a grant from the ASCETA project (P09-TIC-5230) of the Andalusian Government.

References
Allen, M. (2008) ‘Web 2.0: An argument against convergence’, First Monday, Vol. 13, No. 3, online. http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm
Bizer, C., Heath, T. and Berners-Lee, T. (2009a) ‘Linked data – the story so far’, International Journal on Semantic Web and Information Systems, Vol. 5, No. 3, pp.1–22.
Bizer, C., Lehmann, J., Kobilarov, G., Auer, S., Becker, C., Cyganiak, R. and Hellmann, S. (2009b) ‘DBpedia – a crystallization point for the web of data’, Web Semantics: Science, Services and Agents on the World Wide Web, Vol. 7, pp.154–165.
Bollacker, K., Evans, C., Paritosh, P., Sturge, T. and Taylor, J. (2008) ‘Freebase: a collaboratively created graph database for structuring human knowledge’, SIGMOD 08: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, ACM, New York, NY, USA, pp.1247–1250.
Briscoe, G. and De Wilde, P. (2006) ‘Digital ecosystems: Evolving service-oriented architectures’, IEEE Int. Conf. on BIONETICS, Cavalese, Italy, pp.1–6.
Cornejo, C., Ruiz-Rube, I. and Dodero, J. (2012) ‘Semantic management of digital contents for the cultural domain’, Recent Trends in Information Reuse and Integration, Springer, Berlin Heidelberg.
Dalziel, J. (2003) ‘Implementing learning design: The learning activity management system (LAMS)’, Proceedings of 20th ASCILITE, Adelaide, Australia, pp.593–596.
Devedzic, V., Jovanovic, J. and Gasevic, D. (2007) ‘The pragmatics of current E-learning standards’, IEEE Internet Computing, Vol. 11, No. 3, pp.19–27.
De la Fuente, L., Miao, Y., Pardo, A. and Delgado Kloos, C. (2008) ‘A supporting architecture for generic service integration in IMS learning design’, Design, Lecture Notes in Computer Science, Springer-Verlag, Vol. 5192/2008, pp.467–473.
Dodero, J.M., Ghiglione, E. and Torres, J. (2010) ‘Engineering the life-cycle of semantic services-enhanced learning systems’, International Journal of Software Engineering and Knowledge Engineering, Vol. 20, No. 4, pp.499–519.
Dodero, J.M. and Ghiglione, E. (2008) ‘REST-based web access to learning design services’, IEEE Transactions on Learning Technologies, Vol. 1, No. 3, pp.190–195.
Dotsika, F. (2010) ‘Semantic APIs: scaling up towards the semantic web’, International Journal of Information Management, Vol. 30, pp.335–342.
Ellis, R. (2009) Field Guide to Learning Management Systems, Technical Report, American Society for Training and Development.
Feng, S., Manmatha, R. and Lavrenko, V. (2004) ‘Multiple Bernoulli relevance models for image and video annotation’, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.1002–1009.
García-Peñalvo, F.J., Conde, M.A. and Alier, M. (2011) ‘Opening learning management systems to personal learning environments’, Journal of Universal Computer Science, Vol. 17, No. 9, pp.1222–1240.
Geser, G. (Ed.) (2007) Open Educational Practices and Resources, OLCOS Roadmap 2012, Salzburg Research.
Hirsch, C., Grundy, J. and Hosking, J. (2008) ‘Thinkbase: A visual semantic wiki’, Proceedings of the 7th International Semantic Web Conference (ISWC2008), Karlsruhe, Germany.
Hunter, J. (2001) ‘Adding multimedia to the semantic web: Building an MPEG-7 ontology’, Proceedings of the International Semantic Web Working Symposium (SWWS), Stanford, USA, pp.261–283.
Kahan, J., Koivunen, M-R., Prud’hommeaux, E. and Swick, R.R. (2002) ‘Annotea: an open RDF infrastructure for shared web annotations’, Computer Networks, Vol. 39, No. 5, pp.589–608.
Kiryakov, A., Popov, B., Terziev, I., Manov, D. and Ognyanoff, D. (2004) ‘Semantic annotation, indexing, and retrieval’, Journal of Web Semantics, Vol. 2, No. 1, pp.49–79.
Olivier, B. and Tattersall, C. (2005) ‘The learning design specification’, in Koper, R. and Tattersall, C. (Eds.): Learning Design, A Handbook on Modelling and Delivering Networked Education and Training, Springer, Berlin, pp.21–40.
Pinto, H. and Martins, J. (2001) ‘A methodology for ontology integration’, Proceedings of the 1st International Conference on Knowledge Capture, pp.131–138.
Schroeter, R., Hunter, J. and Newman, A. (2007) ‘Annotating relationships between multiple mixed-media digital objects by extending annotea’, Proceedings of the 4th European Semantic Web Conference, Springer, Innsbruck, Austria, pp.533–548.
Sicilia, M., Sánchez-Alonso, S. and Lytras, M. (2011) ‘Semantic annotation of video fragments as learning objects: a case study with YouTube videos and the gene ontology’, Interactive Learning Environments, Vol. 19, pp.25–44.
Sirin, E., Parsia, B., Cuenca Grau, B., Kalyanpur, A. and Katz, Y. (2007) ‘Pellet: a practical OWL-DL reasoner’, Journal of Web Semantics, Vol. 5, No. 2, pp.51–53.
Solc, T. (2008) Automatic Generation of in-text Hyperlinks in Web Publishing, Technical Report, Zemanta Ltd, London, UK.
Torres, J., Cárdenas, C., Dodero, J.M. and Juárez, E. (2010) Educational Modelling Languages and Service-Oriented Learning Process Engines, In-Tech, Chapter 2, pp.17–38.
Vogten, H., Martens, H., Nadolski, R., Tattersall, C., Van Rosmalen, P. and Koper, R. (2007) ‘CopperCore service integration’, Interactive Learning Environments, Vol. 15, No. 2, pp.171–180.
Yueh, H-P. and Hsu, S. (2008) ‘Designing a learning management system to support instruction’, Communications of the ACM, Vol. 51, No. 4, pp.59–63.
Notes
1 http://www.cidoc-crm.org/official_release_cidoc.html
2 http://www.tinymce.com/
3 http://annozilla.mozdev.org/
4 http://www.w3.org/Amaya/
5 http://www.google.com/sidewiki/
6 http://www.geneontology.org/
7 http://www.w3.org/TR/owl-time/