• if text is selected, a tag with corresponding attributes will be created around the selection (e.g. Figure 5.12, line 1 or 3).
• if no text is selected, a new tag with corresponding attributes will be created (e.g. Figure 5.12, line 17 or 18).
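As an illustration of these two cases, the following JavaScript fragment sketches how a TinyMCE-based plugin such as RDFaCE could realize them. It is a minimal sketch, not the actual RDFaCE source; the function name annotateSelection and the use of a Microdata itemprop attribute as the "corresponding attribute" are illustrative assumptions.

// Minimal sketch, not the actual RDFaCE source. `editor` is a TinyMCE
// editor instance; `itemprop` stands for the attribute chosen by the user.
function annotateSelection(editor, itemprop) {
  var selected = editor.selection.getContent({ format: 'text' });
  if (selected) {
    // Text is selected: wrap it in a new annotated tag.
    editor.selection.setContent(
      '<span itemprop="' + itemprop + '">' + selected + '</span>');
  } else {
    // No text is selected: insert a new, empty annotated tag at the caret.
    editor.insertContent('<span itemprop="' + itemprop + '"></span>');
  }
}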
5.6. Usability Evaluation

Since releasing RDFaCE, the tool (WordPress plugin and independent TinyMCE plugin) has been downloaded over 3,000 times and the online demo page has received more than 5,000 unique visits. We were also able to collect considerable feedback from RDFaCE end-users on the Social Web.
Figure 5.12.: Example of Microdata annotations generated by RDFaCE (the annotated example is a recipe page, "Strawberry Cake", with author and date, ingredients, preparation time 10 mins, cooking time 30 min, ready in 40 min, and 393 kcal).
For a concrete evaluation of RDFaCE usability, we conducted an experiment with 16 participants of the ISSLOD 2011 summer school on Linked Data (http://lod2.eu/Article/ISSLOD2011). For the experiment, we developed a usability test platform (available online at http://rdface.aksw.org/usability). The experiment consisted of the following steps: First, some basic information about semantic content authoring, along with a demo showcasing different RDFaCE features, was presented to the participants as a video. Then, participants were asked to use RDFaCE to annotate three text snippets – a wiki article, a blog post and a news article (News#2, Blog#4 and Wiki#3 from our sample articles). For each text snippet, a timeslot of five minutes was available to use different features of RDFaCE for annotating occurrences of persons, locations and organizations with suitable entity references (i.e. Linked Data URIs). Subsequently, a survey was presented to the participants where they were asked questions about their experience while working with RDFaCE. The questions targeted six factors of usability [Lauesen, 2005], namely Fit for use, Ease of learning, Task efficiency, Ease of remembering, Subjective satisfaction and Understandability.
Figure 5.13.: Using RDFaCE to annotate recipes based on Schema.org.

Skill / Level            heard of it   basic    advanced   expert
Skill in Semantic Web    6.25%         37.50%   37.50%     18.75%
Skill in RDFa            18.75%        37.50%   37.50%     6.25%
Table 5.2.: Participants' level of knowledge.

The results of the individual user annotations as well as the results of the survey were carefully analyzed to extract objective and subjective usage characteristics of RDFaCE, respectively. In the following we report on the results of this experiment:
Participants. Participants included students (85%) and researchers (15%) working on different aspects of computer science and information systems. As shown in Table 5.2, they had different levels of knowledge of the Semantic Web and in particular RDFa, varying from basic to expert.
Usability Factors. During the experiment we collected considerable qualitative feedback from the end-users. As shown in Table 5.3, the overall feedback of the users was positive, and they provided constructive feedback for enhancing the usability of RDFaCE. They frequently told us that they were impressed with the functionality of RDFaCE to support their desired tasks. They found the UI easy to learn, but in some cases had difficulties distinguishing between property and subject suggestions. Some users suggested changing triple insertion to property insertion, and some suggested improving the visualization of the URI suggestion results so that the appropriate one can be chosen more easily. Most of the users found the UI easy to remember, and a few suggested changing some RDFaCE toolbar icons to more descriptive ones.
Usability Factor / Grade   Poor     Fair      Neutral   Good      Excellent
Fit for use                0%       12.50%    31.25%    43.75%    12.50%
Ease of learning           0%       12.50%    50%       31.25%    6.25%
Task efficiency            0%       0%        56.25%    37.50%    6.25%
Ease of remembering        0%       0%        37.50%    50%       12.50%
Subjective satisfaction    0%       18.75%    50%       25%       6.25%
Understandability          6.25%    18.75%    31.25%    37.50%    6.25%
Table 5.3.: Usability factors derived from the survey.

Annotations. Figure 5.14 shows the number of annotations (triples) as well as the annotation time per user for each of the text fragments. From the results we can see that almost all users (with two exceptions) were able to create semantic text content. The annotation time for the last text snippet decreased for most of the users, which indicates the users' increasing familiarity with RDFaCE.
5.7. Comparison of RDFaCE to Existing SCA Tools

There are already many Semantic Content Authoring (SCA) systems available. RADiFy (http://duncangrant.co.uk/radify/), WYMeditor (http://www.wymeditor.org), DataPress [Benson et al., 2010], Loomp [Luczak-Roesch, 2009] and FLERSA [Navarro-Galindo and Samos, 2010] are some examples of SCA systems which adopt the bottom-up approach. We can also mention RDFauthor [Tramp et al., 2010] and SAHA 3 [Frosterus et al., 2011] as two examples which adopt the top-down approach for semantic authoring. OntosFeeder (http://wordpress.org/extend/plugins/ontos-feeder/) and Epiphany (http://projects.dfki.uni-kl.de/epiphany/) are two partially related tools. They do not provide editing functionality for the generated RDFa content, but they can be used as complementary tools to RDFaCE, delivering a set of initial RDFa annotations to be edited and extended later on with RDFaCE. Another related work is Named Entity Recognition and Disambiguation (NERD) [Rizzo and Troncy, 2011] (http://nerd.eurecom.fr/), an evaluation framework which records and analyzes ratings of named entity extraction and disambiguation tools. The main difference between RDFaCE and NERD is that RDFaCE employs a voting approach to combine the results of NLP APIs for automatic annotation, whereas NERD expects a human to manually compare the results of different NLP APIs and choose the right one for annotation.
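The following JavaScript sketch illustrates the voting idea under simplifying assumptions (it is not the actual RDFaCE implementation): each NLP API returns annotations as objects with a text span and an entity URI, and an annotation is accepted if at least a minimum number of APIs agree on it.

// Illustrative sketch of majority voting over NLP API results; the
// annotation shape {start, end, uri} is an assumption, not the RDFaCE API.
function combineByVoting(apiResults, minVotes) {
  const votes = new Map();
  for (const annotations of apiResults) {
    for (const a of annotations) {
      const key = a.start + '|' + a.end + '|' + a.uri;
      const entry = votes.get(key) || { annotation: a, count: 0 };
      entry.count += 1;
      votes.set(key, entry);
    }
  }
  // Keep only the annotations confirmed by at least minVotes APIs.
  return [...votes.values()]
    .filter(e => e.count >= minVotes)
    .map(e => e.annotation);
}

// For example, accept whatever at least 2 of 3 APIs agree on:
// combineByVoting([alchemyResults, openCalaisResults, extractivResults], 2);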
Figure 5.14.: Results of the usability test: (top) number of annotations per user; (bottom) annotation time per user.

Figure 5.15 provides a comparison between three popular SCA systems (RDFauthor, SAHA 3 and Loomp) and RDFaCE based on the quality attributes discussed in Chapter 3. Here we have compared the tools based on the quality attributes that were already addressed during the development of RDFaCE. RDFauthor is a tool for editing RDFa content. The RDFauthor approach is based on the idea of making arbitrary XHTML views with integrated RDFa annotations editable [Tramp et al., 2010]. RDFauthor converts an RDFa-annotated view directly into an editable form, thereby hiding the RDF and related ontology data models from novice users. It is backend-independent to some extent and supports two different types of storage engines. Although RDFauthor shares with RDFaCE the goal of making RDFa editing simple by abstracting from the details of RDFa authoring, both differ in two crucial aspects: Firstly, RDFauthor assumes that the RDFa content already exists, while RDFaCE also supports creating new RDFa annotations. Secondly, instead of using forms to edit RDFa content, RDFaCE employs inline editing of content by providing a rich semantic text editor.
RDFauthor
  Usability: single point of entry UI, inline editing
  Customizability: -
  Proactivity: resource suggestion, concept reuse
  Automation: -
  Scalability: storage strategy: backend independent (MySQL, Virtuoso)
SAHA 3
  Usability: single point of entry UI, inline editing
  Customizability: -
  Proactivity: resource suggestion, concept reuse, real-time validation
  Automation: -
  Scalability: storage strategy: server-side triple store
Loomp
  Usability: single point of entry UI, faceted viewing
  Customizability: -
  Proactivity: resource suggestion, concept reuse
  Automation: -
  Scalability: storage strategy: server-side triple store
RDFaCE
  Usability: single point of entry UI, inline editing
  Customizability: semantic views (WYSIWYM, WYSIWYG, triple view, source code view)
  Proactivity: resource suggestion
  Automation: automatic annotation (NLP APIs)
  Scalability: storage strategy: on-the-fly client-side triple storage

Figure 5.15.: Comparison of RDFauthor, SAHA 3, Loomp and RDFaCE according to the quality attributes.
SAHA 3 is another metadata editor which is very similar to RDFauthor but additionally supports real-time validation (see Section 3.8.2). Loomp is an editor which allows annotating words and phrases with references to ontology concepts (see Section 3.8.3). It supports a faceted viewing feature, which highlights user-selected annotations in the Web browser. The main difference between Loomp and RDFaCE is that Loomp relies on the functionality of a server managing the semantic content, while RDFaCE provides client-side annotation for modifying RDFa content directly. Moreover, Loomp uses a triple store on the server side, whereas in RDFaCE triples are created on the fly in the user's browser. The main advantages of RDFaCE compared to other tools are twofold: providing different views for authoring semantic documents and supporting automatic content annotation, which improve customizability and automation remarkably. Furthermore, since RDFaCE processes annotations client-side within the user's browser and does not require any central storage backend, it is highly scalable.
5.8. Conclusions

This chapter addressed the research question RQ3 (cf. Section 1.3) of integrating semantic authoring features into current tools on the Social Web. With RDFaCE we presented the approach and implementation of a WYSIWYM editor that complements the classical WYSIWYG view with three additional views on the semantic representations. We showed that with RDFaCE, semantic annotation and enrichment can be easily integrated into the content authoring pipelines commonly found in many content-centric scenarios.
Chapter 6
WYSIWYM for Lightweight Text Analytics

"Simplicity is the ultimate sophistication." — Leonardo da Vinci
In this chapter we present a text analytics architecture of participation, which employs the WYSIWYM UI model to allow ordinary people with no or limited programming knowledge to use sophisticated NLP techniques for analyzing and visualizing their content, be it a blog, Twitter feed, website or article collection. Different exchangeable components can be plugged into this architecture, making it easy to tailor to individual needs. We evaluate the usefulness of our approach by comparing both the effectiveness and efficiency of end-users within a task-solving setting. Moreover, we evaluate the usability of our approach using a questionnaire-driven approach. The chapter is structured as follows: Section 6.1 describes the current analytical information imbalance. In Section 6.2, we introduce conTEXT for democratizing NLP usage. We show that conTEXT fills a gap in the space of related approaches in Section 6.3. The general workflow and interface design is presented in Section 6.4. The different visualizations and views supported by conTEXT are discussed in Section 6.5 before we present our implementation in Section 6.6. We show the results of a qualitative and quantitative user evaluation in Section 6.7 before we conclude in Section 6.8. The contents of this chapter have been published as [Khalili et al., 2014].
6.1. Analytical Information Imbalance

Currently, there seems to be an imbalance on the Web. Hundreds of millions of users continuously share stories about their lives on social networking platforms such as Facebook, Twitter and Google Plus. However, the conclusions that can be drawn from analyzing the shared content are rarely shared back with the users of these platforms. The social networking platforms, on the other hand, exploit the results of analyzing user-generated content for targeted placement of advertisements, promotions, customer studies etc. One basic principle of data privacy is that every person should be able to know what personal information is stored about herself in a database (cf. the OECD privacy principles, http://oecdprivacy.org/#participation). We argue that this principle does not suffice anymore and that there is an analytical information imbalance. People should be able to find out what patterns can be discovered and what conclusions can be drawn from the information they share.
Let us look at the case of a typical social network user, Judy. When Judy updates her social networking page regularly over years, she should be able to discover the main topics she shared with her friends, which places, products or organizations are related to her posts, and how these things she wrote about are interrelated. Currently, the social network Judy uses analyzes her and other users' data in a big data warehouse. Advertisement customers of the social networking platform can place targeted ads for users interested in certain topics. Judy, for example, is a sneaker aficionado. She likes to wear colorful sports shoes with interesting designs, follows the latest trends and regularly shares her current favorites with her friends on the social network. Increasingly, advertisements for sportswear are placed within her posts. Being able to understand what conclusions can be drawn by analyzing her posts would give Judy back at least some of the power she has lost in recent years to Web giants analyzing big user data.
6.2. conTEXT: A Text Analytics Architecture of Participation

In order to mitigate the current analytical information imbalance, we created conTEXT – a text analytics architecture of participation, which allows end-users to use sophisticated NLP techniques for analyzing and visualizing their content, be it a weblog, Twitter, Facebook, G+ or LinkedIn feed, website or article collection. (We chose the name conTEXT since our approach performs analyses with (Latin 'con') text and provides contextual visualizations for discovered entities in text.) With almost no effort, users can analyze the information they share and obtain similar insights as social networking sites. The conTEXT architecture comprises interfaces for information access, natural language processing (currently mainly NER) and visualization. Different exchangeable components can be plugged into this architecture. Users are empowered to provide manual corrections and feedback on the automatic text processing results, which directly increases the semantic annotation quality and serves as input for attaining further automatic improvements. An online demo of conTEXT is available at http://context.aksw.org. conTEXT empowers users to answer a number of questions which were previously impossible or very tedious to answer. Examples include:
• Finding all articles or posts related to a specific person, location or organization.
• Identifying the most frequently mentioned terms, concepts, people, locations or organizations in a corpus.
• Showing the temporal relations between people or events mentioned in the corpus.
• Discovering typical relationships between entities.
• Identifying trending concepts or entities over time.
• Finding posts where certain entities or concepts co-occur.

conTEXT lowers the barrier to text analytics by providing the following key features:
• No installation and configuration required.
• Access content from a variety of sources.
• Instantly show the results of analysis to users in a variety of visualizations.
• Allow refinement of automatic annotations and take feedback into account.
• Provide a generic architecture where different modules for content acquisition, natural language processing and visualization can be plugged together.

Figure 6.1.: Flexibility of user interfaces and targeted user groups as well as genericity (circle size) and degree of structure (circle color) for various analytics platforms.
6.3. Classification of Existing Text Analysis Tools

Analytics (i.e. the discovery and communication of meaningful patterns in data) is a broad area of research and technology. Involving research ranging from NLP and machine learning to the Semantic Web, this area has been very vibrant in recent years.
Existing tools in the domain of analytics can be roughly categorized according to the following dimensions:
• Degree of structure. Typically, an analytics system extracts patterns from a certain type of input data. The type of input data can vary between unstructured (e.g. text, audio, videos), semi-structured (e.g. text formats, shallow XML, CSV) and structured data (e.g. databases, RDF, richly structured XML).
• Flexibility of user interface. Analytics systems provide different types of interfaces to communicate the found patterns to users. A flexible UI should support techniques for exploration and visualization as well as feedback and refinement of the discovered patterns. This dimension also covers the interactivity of UIs, the diversity of analytical views and the capability to mix results.
• Targeted user. An analytics system might be used by different types of users, including non-programmers, novice programmers and expert programmers.
• Genericity. This dimension assesses an analytics system in terms of genericity of architecture and scalability. These features enable the reuse of components as well as adding new functionality and data at minimal effort.
Figure 6.1 provides an abstract view of the state of the art in analytics according to these dimensions. Text analysis development environments usually provide comprehensive support for developing customized text analytics workflows for extracting, transforming and visualizing data. Typically they provide a high degree of genericity and interface flexibility, but require users to be expert programmers. Examples include the IBM Content Analytics platform (http://www-03.ibm.com/software/products/us/en/contentanalyticssearch), GATE [Cunningham et al., 2011] and Apache UIMA [Ferrucci and Lally, 2004]. Text analysis tools provide a higher level of abstraction (thus catering to more novice users) at the cost of genericity. Yang et al. [Yang et al., 2013] recently published an extensive text analytics survey from the viewpoint of the targeted user and introduced a tool called WizIE which enables novice programmers to perform different text analysis tasks. Examples include Attensity (http://www.attensity.com), Thomson Data Analyzer (http://thomsonreuters.com/thomson-data-analyzer/), Trendminer [Preotiuc-Pietro et al., 2012] and MashMaker [Ennals et al., 2007]. Business intelligence (BI) tools are applications designed to retrieve, analyze and report mainly highly structured data to facilitate business decision making. BI tools usually require some form of programming or at least proficiency in query construction and report designing.
Examples include Zoho Reports (http://www.zoho.com/reports/), SAP NetWeaver (http://sap.com/netweaver), Jackbe (http://jackbe.com/) and RapidMiner [Jungermann, 2009]. Spreadsheet-based tools are interactive applications for the organization and analysis of data in tabular form. They can be used without much programming skill, are relatively generically applicable and provide flexible visualizations. However, spreadsheet-based tools are limited to structured, tabular data and cannot be applied to semi-structured or textual data. Examples include Excel, DataWrangler [Kandel et al., 2011], Google Docs Spreadsheets and Google Refine. NLP APIs are web services providing natural language processing (e.g. named entity recognition and relation extraction) for analyzing web pages and documents. The use of these APIs requires some form of programming, and flexible interfaces are usually not provided. Examples include Alchemy, OpenCalais and Apache OpenNLP (a complete list of NLP APIs is available at http://nerd.eurecom.fr/). Linked Data analysis tools support the exploration and visualization of Linked Data (LD). Examples include Facete (http://aksw.org/Projects/Facete) for spatial and CubeViz (http://aksw.org/Projects/CubeViz) for statistical data. Dadzie and Rowe [Dadzie and Rowe, 2011] present a comprehensive survey of approaches for visualizing and exploring LD. They conclude that most of the tools are designed only for technical users and do not provide overviews of the data. Social media analysis tools such as SRSR, TweetDeck (http://tweetdeck.com/), Topsy (http://topsy.com/), Flumes (http://www.flumes.com/) and Trendsmap (http://trendsmap.com/) focus, in comparison to conTEXT, primarily on content aggregation across large repositories (e.g. Twitter as a whole) and perform popularity and trend analysis. conTEXT, on the other hand, aims at providing different exploration and visualization means for more specific types of content, exploiting the extracted semantics. When comparing these different analytics tool categories according to the dimensions genericity, UI flexibility, targeted users and degree of structure, we discovered a lack of tools dealing with unstructured content, catering to non-expert users and providing flexible analytics interfaces. The aim of developing the text analytics tool conTEXT is to fill this gap.
6.4. Workflow and Interface Design

Workflow. Figure 6.2 shows the process of text analytics in conTEXT. The process starts by collecting information from the Web or the Social Web. conTEXT utilizes standard information access methods and protocols such as RSS/Atom feeds, SPARQL endpoints and REST APIs, as well as customized crawlers for SlideWiki, WordPress, Blogger and Twitter, to build a corpus of information relevant for a certain user.
Figure 6.2.: Text analytics workflow in conTEXT: collecting (RSS/Atom/RDF feeds, REST APIs, SPARQL endpoints, Web crawlers), processing and mixing, enriching (BOA), annotation refinement and feedback (RDFaCE), and exploring & visualizing (Exhibit, D3.js).
The assembled text corpus is then processed by NLP services. While conTEXT can integrate virtually any NLP service, it currently implements interfaces for DBpedia Spotlight [Mendes et al., 2011] and the Federated knOwledge eXtraction Framework (FOX) [Ngomo et al., 2011] for discovering and annotating named entities in the text. DBpedia Spotlight annotates mentions of DBpedia resources in text, thereby linking unstructured information sources to the Linked Open Data cloud through DBpedia. FOX is a knowledge extraction framework that utilizes a variety of different NLP algorithms to extract RDF triples of high accuracy from text. Unlike DBpedia Spotlight, which supports all DBpedia resource types, FOX is limited to the Person, Location and Organization types. On the other hand, since FOX uses ensemble learning to merge different NLP algorithms, it achieves higher precision and recall (see [Ngomo et al., 2011] for details). The processed corpus is then further enriched by two mechanisms (see the sketch after this list):
• DBpedia URIs of the found entities are de-referenced in order to add more specific information to the discovered named entities (e.g. longitude and latitude for locations, birth and death dates for people etc.).
• Entity co-occurrences are matched with pre-defined natural-language patterns for DBpedia predicates provided by BOotstrapping linked datA (BOA) [Gerber and Ngonga Ngomo, 2011] in order to extract possible relationships between the entities.
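As a hedged sketch of the first enrichment mechanism, the following JavaScript fragment de-references a DBpedia entity URI via DBpedia's public JSON serialization and extracts latitude/longitude for a location entity; the function name is illustrative and error handling is omitted.

// De-reference a DBpedia URI (e.g. http://dbpedia.org/resource/Berlin)
// and read the WGS84 coordinates, if present. Sketch only.
async function enrichLocation(entityUri) {
  const dataUrl = entityUri.replace('/resource/', '/data/') + '.json';
  const graph = await (await fetch(dataUrl)).json();
  const props = graph[entityUri] || {};
  return {
    lat: props['http://www.w3.org/2003/01/geo/wgs84_pos#lat']?.[0]?.value,
    long: props['http://www.w3.org/2003/01/geo/wgs84_pos#long']?.[0]?.value
  };
}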
The processed data can also be joined with other existing corpora in a text analytics mashup. Such a mashup of different annotated corpora combines information from more than one corpus in order to provide users with an integrated view. Analytics mashups help to provide more context for the text corpus under analysis and also enable users to mix diverse text corpora for comparative analysis. For example, a user's WordPress blog corpus can be integrated with corpora obtained from her Twitter and Facebook accounts. The creation of analytics mashups requires dealing with the heterogeneity of different corpora as well as the heterogeneity of the different NLP services utilized for annotation. conTEXT employs NIF [Hellmann et al., 2013] to deal with this heterogeneity. The use of NIF allows us to quickly integrate additional NLP services into conTEXT. The processed, enriched and possibly mixed results are presented to users in different views for exploration and visualization of the data. Exhibit [Huynh et al., 2007] (http://simile-widgets.org/exhibit3/) for structured data publishing and D3.js [Bostock et al., 2011] (Data-Driven Documents, http://d3js.org/) are employed for realizing a dynamic exploration and visualization experience. Additionally, conTEXT provides an annotation refinement user interface based on the RDFa Content Editor (RDFaCE) discussed in Chapter 5 to enable users to revise the annotation results. User-refined annotations are sent back to the NLP services as feedback for the purpose of learning in the system.
Progressive crawling and annotation. The process of collecting and annotating a large text corpus can be time-consuming. Therefore it is very important to provide users with immediate results and to inform them about the progress of the crawling and annotation task. For this purpose, we have designed special user interface elements to keep users informed until the complete results are available. The first indicator interface is an animated progress bar, which shows the percentage of the collected/annotated results as well as the currently downloaded and processed item (e.g. the title of the blog post). The second indicator interface is a real-time tag cloud, which is updated while the annotation is in progress. We logged all crawling and processing timings during our evaluation period. Based on these records, the processing of a Twitter feed with 300 tweets takes on average 30 seconds, and the processing of 100 blog posts approx. 3-4 minutes on a standard server with an Intel i7 CPU (with parallelization and hardware optimizations, further significant acceleration is possible). This shows that for typical crawling and annotation tasks, the conTEXT processing can be performed in almost real-time, thus providing instant results to the users.
Annotation refinement interfaces. A lightweight text analytics as implemented by conTEXT provides direct incentives for users to adopt and revise semantic text annotations: users obtain more precise results as they refine annotations. On the other hand, NLP services can benefit from these manually revised annotations to learn the right annotations. conTEXT employs RDFaCE within the faceted browsing view and thus enables users to edit existing annotations while browsing the data.
Parameter     Description
text          the annotated text
entityUri     the identifier of the annotated entity
surfaceForm   the name of the annotated entity
offset        position of the first letter of the entity
feedback      indicates whether the annotation is correct or incorrect
context       indicates the context of the annotated corpus
isManual      indicates whether the feedback is sent by a user or by other NLP services
senderIDs     identifier(s) of the feedback sender

Table 6.1.: NLP feedback parameters.
Figure 6.3.: Screenshots of the conTEXT WYSIWYM interface (T2 indicates the inline editing UI, V1 – the framing of named entities in the text, V2 – text margin formatting for visualizing hierarchy, V7 – line connectors to show the relation between entities, V9 – a callout showing additional type information, X2 – faceted browsing, H3 – recommendation for NLP feedback).

The WYSIWYM interface as depicted in Figure 6.3 enables the integrated visualization and authoring of unstructured and semantic content (i.e. annotations encoded in RDFa). The manual annotations are collected and sent as feedback to the corresponding NLP service (DBpedia Spotlight feedback API: http://spotlight.dbpedia.org/rest/feedback; FOX feedback API: http://139.18.2.164:4444/api/ner/feedback). The feedback encompasses the parameters specified in Table 6.1.
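The following sketch shows how such a feedback message could be posted using the parameters from Table 6.1; the exact parameter names and wire format of the feedback APIs are an assumption here, not their documented contract.

// Hedged sketch of sending one user refinement as feedback.
async function sendFeedback(a) {
  const params = new URLSearchParams({
    text: a.text,                      // the annotated text
    entityUri: a.entityUri,            // identifier of the annotated entity
    surfaceForm: a.surfaceForm,        // name of the annotated entity
    offset: String(a.offset),          // position of the first letter
    feedback: a.correct ? 'correct' : 'incorrect',
    context: a.context,                // context of the annotated corpus
    isManual: 'true',                  // sent by a user, not by an NLP service
    senderIDs: 'conTEXT'               // identifier of the feedback sender
  });
  await fetch('http://spotlight.dbpedia.org/rest/feedback', {
    method: 'POST',
    body: params
  });
}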
Figure 6.4.: Example of real-time semantic analysis in conTEXT.

Exploration and visualization interfaces. The dynamic exploration of content indexed by the annotated entities facilitates faster and easier comprehension of the content and provides new insights. conTEXT creates a novel entity-based search and browsing interface for end-users to review and explore their content. In addition, conTEXT provides different visualization interfaces which present, transform and convert semantically enriched data into a visual representation, so that users can explore and query the data efficiently. The visualization UIs are supported by noise-removal algorithms which tune the results for better representation and highlight the peaks and trends in the visualizations. For example, we use a frequency threshold when displaying single resources in interfaces. In addition, a threshold based on the Dice similarity is used in interfaces which display co-occurrences. By these means, we ensure that information overload is reduced and that the information shown to the user is the most relevant. Note that the user can choose to deactivate or alter any of these thresholds.
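For the co-occurrence interfaces, the Dice-based filter can be sketched as follows (the threshold value is illustrative; the actual conTEXT defaults may differ):

// Dice coefficient of two entities: 2·|A∩B| / (|A| + |B|), where counts
// are numbers of articles mentioning each entity (or both).
function diceCoefficient(coOccurrences, countA, countB) {
  return (2 * coOccurrences) / (countA + countB);
}

// Show only entity pairs whose Dice coefficient exceeds the threshold.
function filterPairs(pairs, threshold = 0.1) {
  return pairs.filter(p =>
    diceCoefficient(p.coOccurrences, p.countA, p.countB) >= threshold);
}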
Linked Data interface for Search Engine Optimization (SEO). As discussed in Section 5.5.3, the Schema.org initiative provides a collection of shared schemas that Web authors can use to mark up their content in order to enable the enhanced search and browsing features offered by major search engines. A direct feature of the Linked Data based text analytics with conTEXT is the provisioning of an SEO interface. conTEXT encodes the results of the content annotation (automatic as well as user revisions) in the JSON-LD format (cf. Section 3.3.1), which can be directly exposed to schema.org-aware search engines. This component employs the current mapping from the DBpedia ontology to the Schema.org vocabularies (http://schema.rdfs.org/mappings.html). Thus, the conTEXT SEO interface enables end-users to benefit from better exposure in search engines (e.g. through Google's Rich Text Snippets) with very little effort.
Real-time semantic analysis. In addition to its normal functionality, conTEXT also supports real-time content analysis for streaming data like Twitter streams. Figure 6.4 shows an example of real-time semantic analysis for Twitter streams on specific hashtags (an online demo of the real-time semantic analysis for Twitter is available at http://context.aksw.org/resa). This way, users can see the live progress of different analytics views on incoming data and can thereby quickly follow the trends currently on social media. Real-time analytics is also useful for companies and businesses to gain competitive advantage and to improve their customer relationships by monitoring users' feedback on social media websites.
6.5. Views for Text Analytics

A key aspect of conTEXT is to provide intuitive exploration and visualization options for the annotated corpora. For that purpose, conTEXT allows plugging in a variety of different exploration and visualization modules, which operate on the conTEXT data model capturing the annotated corpora. By default, conTEXT provides the following views for exploring and visualizing the annotated corpora (a sketch of the data preparation behind the trend view follows this list):
• Faceted browsing allows users to quickly and efficiently explore the corpus along multiple dimensions (i.e. articles, entity types, temporal data) using the DBpedia ontology. The faceted view enables users to drill a large set of articles down to a set adhering to certain constraints.
• Matrix view shows the entity co-occurrence matrix. Each cell in the matrix reflects entity co-occurrence by entity type (color of the cell) and by frequency of co-occurrence (color intensity).
• Trend view shows the occurrence frequency of entities in the corpus over time. The trend view requires a corpus whose articles have a timestamp (such as blog posts or tweets).
• Image view shows a picture collage created from the entities' Wikipedia images. This is an alternative to the tag cloud which reflects the frequent entities in the corpora by using different image sizes.
• Tag cloud shows entities found in the corpus in different sizes depending on their prevalence. The tag cloud helps to quickly identify the most prominent entities in the corpora.
• Chordal graph view shows the relationships among the different entities in a corpus. The relationships are extracted based on the co-occurrence of the entities and their matching to a set of predefined natural language patterns.
• Places map shows the locations and the corresponding articles in the corpus. This view allows users to quickly identify the spatial distribution of the locations referred to in the corpus.
• People timeline shows the temporal relations between people mentioned in the corpus. For that purpose, references to people found in the corpus are enriched with birth and death dates found in DBpedia.
• Sentiment view shows the overall sentiment of the corpus as well as the sentiment of the individual articles in the corpus.

Figure 6.5.: Different views on an analyzed corpus: 1) faceted browser, 2) matrix view, 3) sentiment view, 4) image view, 5) tag cloud, 6) chordal graph view, 7) map view, 8) timeline, 9) trend view.
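As an example of the data preparation behind these views, the following sketch computes the input of the trend view from timestamped, annotated articles; the article shape follows the NIF/JSON annotations of Figure 6.7 plus an assumed timestamp field.

// Monthly occurrence counts of one entity across a corpus (sketch).
function trendSeries(articles, entityUri) {
  const buckets = new Map();
  for (const article of articles) {
    const month = article.timestamp.slice(0, 7); // "YYYY-MM"
    const hits = article.resources
      .filter(r => r['@id'] === entityUri).length;
    buckets.set(month, (buckets.get(month) || 0) + hits);
  }
  // e.g. [["2013-01", 4], ["2013-02", 7], ...]
  return [...buckets.entries()].sort();
}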
Information access
  Components: RSS/Atom feeds, RDF/SPARQL endpoints, REST APIs, custom crawlers & scrapers
  Input: textual or semi-structured Web resources
  Output: corpus with metadata (e.g. temporal annotations)
Named Entity Recognition
  Components: DBpedia Spotlight, FOX
  Input: corpus
  Output: semantically annotated corpus
Enrichment, authoring & feedback
  Components: BOA, RDFaCE
  Input: semantically annotated corpus
  Output: automatically and manually enriched semantic annotations
Visualization & exploration
  Components: faceted browsing, map view, timeline view, tag cloud, chordal graph view, matrix view, sentiment view, trend view
  Input: semantically annotated and enriched corpus
  Output: exploration and visualization widgets leveraging various semantic annotations

Table 6.2.: conTEXT's extensible architecture supports a variety of pluggable components for various processing and interaction stages.
6.6. Implementation

conTEXT is a Web application implemented in PHP and JavaScript using a relational database backend (MySQL). The application makes extensive use of the Model-View-Controller (MVC) architecture pattern and relies heavily on the JSON format as input for the dynamic client-side visualization and exploration functionality.
Figure 6.6.: conTEXT data model.

Figure 6.6 shows the conTEXT data model, which comprises Corpus, Article, Entity and Entity Type tables to represent and persist the data for text analytics. A corpus is composed of a set of articles or a set of other corpora (in the case of a mixed corpus). Each article includes a set of entities represented by URIs and an annotation score. The Entity Type table stores the type(s) of each entity. As described in Section 6.4, conTEXT employs NIF for interoperability between different NLP services as well as different corpora. Figure 6.7 shows a sample NIF annotation stored for an article. In order to create the required input data structures for the different visualization views supported by D3.js and Exhibit, we implemented a data transformer component. This component processes, merges and converts the stored NIF representations into the appropriate input formats for the visualization layouts (e.g. the D3 matrix layout or the Exhibit map layout). After the transformation, the converted visualization input representations are cached on the server side as JSON files to increase the performance of the system in subsequent runs. One of the main design goals during the development of conTEXT was modularity and extensibility. Consequently, we realized several points of extensibility in the implementation. For example, additional visual analysis views can easily be added, and additional NLP APIs and data collectors can be registered (cf. Table 6.2). The faceted browser based on Exhibit can be extended in order to synchronize it with other graphical views implemented by D3.js and to improve the scalability of the system. Support for localization and internationalization can be added to the user interface as well as to the data processing components.
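To make the transformer step concrete, here is a hedged sketch of converting stored NIF/JSON annotations (cf. Figure 6.7 below) into the co-occurrence matrix input used by a D3 matrix layout; the function is illustrative, not the actual conTEXT component.

// Build an entity co-occurrence matrix from annotated articles (sketch).
function nifToMatrix(articles) {
  const entities = [...new Set(
    articles.flatMap(a => a.resources.map(r => r['@id'])))];
  const index = new Map(entities.map((uri, i) => [uri, i]));
  // matrix[i][j] counts the articles mentioning both entity i and entity j.
  const matrix = entities.map(() => entities.map(() => 0));
  for (const article of articles) {
    const uris = [...new Set(article.resources.map(r => r['@id']))];
    for (const u of uris) {
      for (const v of uris) {
        if (u !== v) matrix[index.get(u)][index.get(v)] += 1;
      }
    }
  }
  return { entities, matrix };
}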
{
  "@article": "http://blog.aksw.org/2013/dbpedia-swj",
  "@context": "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#",
  "resources": [
    {
      "@id": "http://dbpedia.org/resource/DBpedia",
      "anchorOf": "DBpedia",
      "beginIndex": "1144",
      "endIndex": "1151",
      "@confidence": "0.9",
      "@type": "DBpedia:Software"
    },
    {
      "@id": "http://dbpedia.org/resource/Freebase_(database)",
      "anchorOf": "Freebase",
      "beginIndex": "973",
      "endIndex": "981",
      "@confidence": "0.9",
      "@type": "DBpedia:Misc"
    },
    ...
  ]
}
Figure 6.7.: Generated semantic annotations represented in NIF/JSON.
Figure 6.8.: conTEXT task evaluation platform: Left – task view showing the tasks assigned to an evaluation subject, Right – individual task.
6.7. Evaluation

The goal of our evaluation was twofold. First, we wanted to provide quantitative insights into the usefulness of conTEXT. To this end, we carried out a task-driven usefulness study where we measured the improvement in efficiency and effectiveness that results from using conTEXT. Second, we aimed to evaluate the usability of our approach.
6.7.1. Usefulness study

Experimental Setup. To achieve the first goal of our evaluation, we carried out controlled experiments with 25 users (20 PhD students having different backgrounds from computer software to life sciences, 2 MSc students and 3 BSc students with a good command of English) on a set of 10 questions pertaining to knowledge discovery in corpora of unstructured data. For example, we asked users the following question: "What are the five most mentioned countries in Bill Gates' tweets?". The 10 questions were determined as follows: We collected a set of 61 questions from 12 researchers of the University of Leipzig. These questions were regarded as a corpus and analyzed using conTEXT. After manually removing questions that were quasi-duplicates, we chose 10 questions that we subdivided into 2 sets of 5 questions. Each of the users involved in the evaluation was then asked to solve one set of questions with conTEXT and the other one without the tool. In all cases, the users were given access to the corpus from which the question was extracted. While answering the questions with conTEXT, the users used the analysis abilities of conTEXT; otherwise, they were allowed to use all digital search media of their choice except conTEXT. To ensure that we did not introduce any bias in the results due to the distribution of hard questions across the two sets, one half of the users was asked to solve the first set of questions with conTEXT while the others did the same with the second set, and vice versa. We evaluated the users' efficiency by measuring the time they required to answer the questions. Note that the users were asked to terminate any task that required more than 5 minutes to solve. In addition, we measured the users' effectiveness by comparing the answers of each user to a gold standard which was created manually by the authors. Given that the answers to the questions were sets, we measured the similarity of the answers A provided by each user to the gold standard G by using the Jaccard similarity of the two sets, i.e., |A ∩ G| / |A ∪ G|. The platform (available at http://context.aksw.org/app/evaluation) provided users with a short tutorial on how to perform the tasks using conTEXT and how to add their responses to the questions (cf. Figure 6.8).
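The effectiveness measure is straightforward to restate in code; a minimal sketch of the Jaccard computation used for scoring:

// Jaccard similarity between a user's answer set A and the gold standard G.
function jaccard(A, G) {
  const a = new Set(A), g = new Set(G);
  const intersection = [...a].filter(x => g.has(x)).length;
  const union = new Set([...a, ...g]).size;
  return union === 0 ? 1 : intersection / union;
}

// jaccard(["Egypt", "Iran"], ["Iran", "Iraq"]) === 1/3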
Figure 6.9.: Avg. Jaccard similarity index for answers with and without conTEXT.
Figure 6.10.: Avg. time spent (in seconds) for finding answers with and without conTEXT.
Results. The results of our first series of evaluations are shown in Figures 6.9 and 6.10. On average, the users required 136.4% more time without conTEXT than when using the tool. A fine-grained inspection of the results suggests that our approach clearly enables users to perform tasks akin to the ones provided in the evaluation in less time. Especially complex tasks, such as "Name a middle-eastern country that has never been spoken of in the AKSW blog", are carried out more than three times faster using conTEXT. In some cases, conTEXT even enables users to carry out tasks that seemed out of reach before. For example, the question "What are the five most mentioned countries in Bill Gates' tweets?" (Q10) was deemed impossible to answer in reasonable time using normal search tools by several users. A look at the effectiveness results suggests that those users who tried to carry out this task without conTEXT failed, as they achieved an average Jaccard score of 0.17 on this particular task, while users relying on conTEXT achieved 0.65. The overall Jaccard score with conTEXT lies around 0.57, which suggests that the tasks in our evaluation were non-trivial. This is confirmed by the overall score of 0.19 without conTEXT. Interestingly, the average effectiveness results achieved by users with conTEXT are always superior to those achieved without conTEXT, especially on task Q8, where users without conTEXT never found the right answer. Moreover, in all cases, the users are more time-efficient when using conTEXT than without the tool.
6.7.2. Usability study

Experimental Setup. The goal of the second part of our evaluation was to assess the usability of conTEXT. To achieve this objective, we used the standardized, ten-item, Likert-scale-based System Usability Scale (SUS) questionnaire [Lewis and Sauro, 2009] and asked each person who took part in our usefulness evaluation to also take part in the usability evaluation. The questions were part of a Google questionnaire and can be found at http://goo.gl/JKzgdK.
Results. The results of our study (cf. Figure 6.11) show a mean usability score of 82, indicating a high level of usability according to the SUS score. The responses to question 1 suggest that our system is adequate for frequent use (average score for question 1 = 4.23 ± 0.83) by users of all types (average score of 4.29 ± 0.68 for question 7). While a small fraction of the functionality is deemed unnecessary by some users (average scores of 1.7 ± 0.92 for question 2, 1.88 ± 1.05 for question 6 and 1.76 ± 1.09 for question 8), the users deem the system easy to use (average score of 4.3 ± 0.59 for question 3). Only one user suggested that he/she would need a technical person to use the system, while all other users were fine without one.
Figure 6.11.: Results of the conTEXT usability evaluation using the SUS questionnaire.

The modules of the system were deemed to be well integrated (average score of 4.23 ± 0.66 for question 5). Overall, the output of the system seems to be easy to understand (score of 4.11 ± 1.05 for question 9), and even users without training consider themselves capable of using the system (1.52 ± 0.72 for question 10). These results corroborate the results of the first part of our evaluation, as they suggest that conTEXT is not only easy to use but also provides useful functionality.
6.8. Conclusion

This chapter addressed the research question RQ4 (cf. Section 1.3) of exploiting semantically enriched content for content analysis. With conTEXT, we showcased an innovative text analytics application for end-users which integrates a number of previously disconnected technologies. In this way, conTEXT makes NLP technologies more accessible, so that they can be easily and beneficially used by arbitrary end-users. With regard to RQ4.1, conTEXT provides users with instant benefits for manual content annotation by empowering them to gain novel insights and to complete tasks which previously required substantial development effort.
Chapter 7
WYSIWYM for Authoring of E-Learning Content

"There are three ingredients in the good life: learning, earning and yearning." — Christopher Morley
In this chapter we present an application called SlideWiki for the collaborative authoring of semi-structured educational content. SlideWiki employs the WYSIWYM concept for user-friendly authoring of semi-structured e-learning content – in particular presentations, slides, diagrams and self-assessment tests. In order to support collaboration and crowdsourcing, SlideWiki utilizes our proposed data model called WikiApp. Two use cases of SlideWiki – as a platform for authoring OpenCourseWare and as a tool for the elicitation and sharing of corporate knowledge – are also described in this chapter. The rest of the chapter is organized as follows: Section 7.1 describes our proposed WikiApp data model for supporting collaboration and crowdsourcing. In Section 7.3, we introduce SlideWiki as an implementation of the WikiApp data model, together with two use cases. Section 7.4 elaborates on the architecture and technical implementation details of the SlideWiki application. In Section 7.5, we provide a comparison between SlideWiki and existing presentation management systems. The results of our usability evaluation are reflected in Section 7.6. Finally, we conclude the chapter in Section 7.7. The contents of this chapter have been published as [Khalili et al., 2012b, Tarasowa et al., 2013, Auer et al., 2013, Tarasowa et al., 2014]; some parts were written jointly by Darya Tarasowa (http://aksw.org/DaryaTarasowa).
7.1. WikiApp Data Model

Ward Cunningham's Wiki paradigm [Leuf and Cunningham, 2001] is mainly applied to unstructured, textual content, thus limiting content structuring, repurposing and reuse. More recently, with the appearance of Semantic Wikis, the concept was also applied and extended to semantic content [Schaffert et al., 2008]. There are currently two types of Semantic Wikis. Semantic text wikis, such as Semantic MediaWiki [Krötzsch et al., 2007] or KiWi [Schaffert et al., 2009], are based on semantic annotations of the textual content. Semantic data wikis, such as OntoWiki [Auer et al., 2006], are based on the RDF data model in the first place. Both types of Semantic Wikis, however, suffer from two disadvantages.
Figure 7.1.: Schematic view of the WikiApp data model.

Firstly, their performance and scalability are restricted by current triplestore technology (cf. Section 2.5), which is still an order of magnitude slower when compared with relational data management, as regularly confirmed by SPARQL benchmarks such as BSBM [Bizer and Schultz, 2009]. Secondly, Semantic Wikis are generic tools which are not particularly adapted to specific domains, thus substantially increasing the usage complexity for users. The latter problem was partially addressed by OntoWiki components such as the Erfurt API (https://github.com/AKSW/Erfurt), RDFauthor (http://aksw.org/Projects/RDFauthor) and Semantic Pingback (http://aksw.org/Projects/SemanticPingback), which evolved OntoWiki into a framework for Web application development [Heino et al., 2009]. In many potential usage scenarios, the content to be managed by a wiki is neither purely textual nor fully semantic. Often, (semi-)structured content (e.g. presentations, educational content, laws, skill profiles etc.) should be managed, and the collaboration of large user communities around such content should be effectively facilitated. In this section we introduce the fundamental WikiApp concept. The WikiApp concept is based on the following principles:
• Provenance. The origin and creation context of all information in a WikiApp implementation should be preserved and well documented.
• Transparency, openness and peer-review. Content in a WikiApp implementation should be visible and easily observable for the largest possible audience, thus facilitating review and quality improvements.
• Simplicity. WikiApp implementations should be simple to build and use.
• Social collaboration. Following other users, watching the evolution of content, as well as reusing and re-purposing content in social collaboration networks is at the heart of WikiApp.
• Scalability. WikiApp implementations should be scalable and implementable according to established Web application development practices (such as the MVC pattern).
The aim of the WikiApp concept is to provide a framework for implementing these principles, similarly to Ward Cunningham's Wiki concept for traditional text wikis. However, due to the increased complexity of (semi-)structured content and the operations on this content, a mere high-level description of principles is not sufficient to support the creation of domain-specific WikiApp implementations. By devising a formal WikiApp concept, we aim to provide a clear and consistent description of the approach, which simplifies the creation of concrete WikiApp instantiations and can be used as a basis for integrating WikiApp support into engineering methodologies, development frameworks as well as model-driven code generators. In the sequel, we present a formal description of the WikiApp data model and then describe the base operations on this data model.
7.1.1. Data Model

The WikiApp data model is a refinement of the traditional Entity-Relationship (ER) data model. It adds some additional formalisms in order to make users as well as ownership, part-of and derived-from relationships first-class citizens of the data model. We illustrate the WikiApp model in Figure 7.1 and formally define it as follows:

Definition 4 (WikiApp data model). The WikiApp data model WA can be formally described by a triple WA = (U, T, O) with:
• U a set of users;
• T a set of content types, with associated property types P_t having this content type as their domain;
• O = {O_t : t ∈ T} with O_t being a set of content objects for each content type t ∈ T. Each O_t consists of content objects o_{t,i} = (P_{t,i}, b_{t,i}, u_{t,i}, c_{t,i}) with:
  – i ∈ I_T being a suitable identifier set for the content objects in O_t;
  – properties P_{t,i} = Attr_{t,i} ∪ Rel_{t,i} ∪ Part_{t,i}, with Attr_{t,i} being a set of literal, possibly typed attributes, Rel_{t,i} being a set of relationships with other content objects, and Part_{t,i} being a set of part-of and has-part relationships referring to other content objects;
  – b_{t,i} ∈ O_t ∪ {NULL} referring to the base content object from which this content object was derived;
  – u_{t,i} ∈ U referring to the user owning this content object;
  – c_{t,i} containing the creation timestamp of object o_{t,i}.

The WikiApp data model assumes that all content objects are versioned using the timestamp c_{t,i} and the base content object relation b_{t,i}. In practice, however, usually only a subset of the content objects is required to be versioned. For auxiliary content (such as user profiles, preferences etc.) it is usually sufficient to omit the base content object relation. For reasons of simplicity of presentation and space restrictions, we have omitted a separate consideration of such content here. However, this is in fact just a special case of the general WikiApp data model, where the base content object relation b_{t,i} is empty for a subset of the content objects. The WikiApp data model is compatible with both the relational data model and the RDF data model. When implemented as relational data, content types correspond to tables and content objects to rows in these tables. Functional attributes and relationships as well as the owner and base-content-object relationships can be modeled as columns in these tables (the latter three representing foreign-key relationships). For 1-n and m-n relationships and non-functional attributes, suitable helper tables have to be created. The implementation of the WikiApp data model in RDF is slightly more straightforward: content types resemble classes, and content objects are instances of these classes. Attributes and relationships can be attached to the classes via rdfs:domain and rdfs:range definitions and directly used as properties of the respective instances. For reasons of scalability, we expect the WikiApp data model to be mainly used with relational backends. However, using techniques such as Triplify [Auer et al., 2009] or other RDB2RDF mapping techniques [Sahoo et al., 2009], a Linked Data interface can easily be added to any WikiApp implementation (cf. Section 7.4).

Example 7.1 (SlideWiki data model). For our SlideWiki example application (whose implementation is explained in detail in Section 7.4), the data model consists of individual slides (consisting mainly of HTML snippets and some metadata), decks (being ordered sequences of slides and sub-decks), media assets (which are used within slides) as well as themes (which are associated as default styles with decks and users):
• T = {deck, slide, media, theme}
• Attr_deck = {title → text, abstract → text, license → {CC-BY, CC-BY-SA}}, Rel_deck = {default_theme → theme}, Part_deck = {deck_content → deck ∪ slide}
• Attr_slide = {content → text, speaker_note → text, license → {CC-BY, CC-BY-SA}}, Rel_slide = {uses → media}, Part_slide = {}
• Attr_media = {type → {image, video, audio}, uri → string, license → {CC-BY, CC-BY-SA}}, Rel_media = {}, Part_media = {}
• Attr_theme = {title → string, css_definition → text}, Rel_theme = {}, Part_theme = {}
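To make the formal model concrete, a slide revision and a containing deck revision from this example could look as follows as plain JavaScript objects (all values are illustrative, not taken from SlideWiki):

// Illustrative instances of Example 7.1 (values are made up).
const slide42 = {
  type: 'slide', id: 42,
  content: '<h1>Introduction</h1>',
  speaker_note: 'Greet the audience.',
  license: 'CC-BY-SA',
  uses: [],                 // media relations
  basedOn: null,            // no base revision: this is an original
  owner: 'user:jane', createdAt: '2012-07-08T10:00:00Z'
};

const deck7 = {
  type: 'deck', id: 7,
  title: 'Semantic Web', abstract: '...', license: 'CC-BY-SA',
  default_theme: 'theme:default',
  deck_content: [{ type: 'slide', id: 42 }], // part-of relations
  basedOn: null,
  owner: 'user:jane', createdAt: '2012-07-08T10:05:00Z'
};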
7.1.2. Operations

Having introduced the WikiApp data model, we now describe the main operations on it. In the spirit of the Wiki paradigm, there is no deletion or updating of existing, versioned content objects. Instead, new revisions of content objects are created and linked to their base objects via the b_{t,i} relation.

Definition 5 (WikiApp operations). Five base operations are defined on the WikiApp data model:
• create(u, t, p) : U × T × P_t → O_t creates a new content object of type t with owner u and properties p.
• newRevision(u, t, i, p) : U × T × I_T × P_t → O_t creates a copy of an existing content object o_{t,i} of type t, potentially with a new owner u and overriding existing properties with p.
• getRevision(t, i) : T × I_T → O_t ∪ {false} returns the existing content object o_{t,i} of type t including all its properties, or false in case a content object of type t with identifier i does not exist.
• isWatching(u, t, i) : U × T × I_T → {true, false} returns true if the user u is watching the content object of type t with identifier i, and false otherwise. Following users is a special case where the content object type is set to user.
• watch(u, t, i) : U × T × I_T → {true, false} toggles user u watching the content object of type t with identifier i and returns the new watch status.

All operations have to be performed by a specific user, and newly created content objects will have this user associated as their owner. In addition, when a new revision of an existing content object is created and the original content object is indicated (by the distinguished part-of relations) to be part of another content object, the creation of a new revision of the containing content object has to be triggered as well. In our Example 7.1 this is, for instance, triggered when a user creates a new revision of a slide being part of a deck. If the user is not the owner of the containing deck, a new deck revision is automatically created, so as to not implicitly modify other users' decks.
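The revision-propagation rule can be sketched as follows, over a hypothetical in-memory store (the code generated by Wikifier, described in Section 7.2, differs in detail):

// Sketch of newRevision with part-of propagation (Definition 5).
function newRevision(store, user, type, id, changes) {
  const base = store.get(type, id);
  const revision = {
    ...base, ...changes,
    id: store.nextId(type),
    owner: user,           // the acting user owns the new revision
    basedOn: id,           // link to the base content object
    createdAt: Date.now()  // creation timestamp
  };
  store.add(type, revision);
  // Every container holding the old object gets a new revision whose
  // part-of list points to the new revision, so other users' decks are
  // never modified implicitly.
  for (const c of store.containersOf(type, id)) {
    newRevision(store, user, c.type, c.id, {
      parts: c.parts.map(p =>
        p.type === type && p.id === id ? { type, id: revision.id } : p)
    });
  }
  return revision;
}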
Figure 7.2.: Instantiation of the WikiApp DSL representing the SlideWiki model.
7.2. Model-driven generation of WikiApp implementations

Using a model-driven Web application engineering approach, developers are able to easily and quickly implement WikiApp applications. We devised a Domain-Specific Language (DSL) based on the WikiApp data model and a transformation approach implemented in a tool called Wikifier (available at http://slidewiki.aksw.org/wikifier/), which receives a WikiApp definition in the DSL and generates the appropriate database (or RDF) schema, classes and methods for interacting with this model, as well as the required SQL (or SPARQL) queries. The Wikifier DSL is dedicated to the specific WikiApp problem representation technique. In essence, it is a YAML-formatted file (YAML Ain't Markup Language: http://yaml.org/) with the definition of content types, their attributes, relations and part-of relations according to the WikiApp data model (cf. Definition 4). Figure 7.2 shows an instantiation of this DSL for our SlideWiki example application. The Wikifier model transformation is integrated into the code generator of the Symfony framework (http://symfony.com/), which is based on the MVC design pattern.
5 Available at: http://slidewiki.aksw.org/wikifier/
6 YAML Ain't Markup Language: http://yaml.org/
Figure 7.3.: Generated database schema by Wikifier.
It transforms the DSL instantiation into the corresponding data models with basic Create-Retrieve-Update-Delete (CRUD) operations and the corresponding views and controllers. The generated models include the following extensions derived from the WikiApp data model: (a) a revision model for each content object with timestamp and based_on properties; (b) a partOf model which includes the identifier properties of the selected revision models; (c) a subscription model which is used for following revision models; (d) a user model which is referenced by each of the generated revision models. The database schema generated by Wikifier for the SlideWiki example is depicted in Figure 7.3. In addition to the generic WikiApp operations (cf. Definition 5), Wikifier creates convenience methods for performing these operations directly from the respective content object classes.
7.3. SlideWiki
In this section, we describe SlideWiki, a concrete WikiApp implementation which we created to demonstrate the effectiveness and efficiency of the WikiApp approach. SlideWiki is publicly available at http://slidewiki.org. The main idea of SlideWiki is to enable crowdlearning, i.e. the crowdsourcing of educational content, in particular presentations. The SlideWiki data model was already introduced in Example 7.1 and shows the relatively complex relationships between decks, slides, media assets and themes. In the sequel, we present two use cases of the SlideWiki application.
7 http://symfony.com/
Figure 7.4.: Crowdlearning strategies in SlideWiki.
7.3.1. Authoring of OpenCourseWare
While there is nowadays a plethora of Learning Content Management Systems (LCMS), the collaborative, community-based creation of rich e-learning content is still not sufficiently well supported. Few attempts have been made to apply crowdsourcing and Wiki approaches to the creation of e-learning content. Wikiversity8, for example, is a Wikimedia Foundation project aiming to leverage standard wiki technology for the creation of hypertext e-learning content. Peer 2 Peer University (P2PU)9 and PlanetMath10 are other examples which employ crowdsourcing to create rich e-learning content. P2PU helps users to navigate the wealth of open education materials and supports the design and facilitation of courses. PlanetMath is a project aiming to become a central repository for mathematical knowledge on the Web, with a pedagogical mission. However, we deem that no real attempt has been made so far to truly apply the concepts behind Wikis and crowdsourcing to develop a specifically tailored technology supporting the creation of (semi-)structured e-learning content. As defined by the Open Education Consortium11, OpenCourseWare (OCW) is a free and open digital publication of high-quality college and university-level educational materials. OCW materials are free and openly licensed, accessible to anyone, anytime via the Internet. As an OCW authoring platform, SlideWiki deals with two types of (semi-)structured learning objects: slide presentations and assessment tests. SlideWiki empowers communities of instructors, teachers, lecturers and academics to create, share and re-use multilingual educational content in a collaborative way. As depicted in Figure 7.4, to support crowdlearning, SlideWiki realizes the following strategies:
8 http://wikiversity.org/
9 http://p2pu.org/
10 http://planetmath.org/
11 http://www.openedconsortium.org
• Standard-compliance. SlideWiki adopts the Sharable Content Object Reference Model (SCORM) standard [ADL, 2011a] and practical recommendations [ADL, 2011b] and extends the standard towards a collaborative model. This decreases the costs associated with building high-quality e-learning content by importing/exchanging content from/between existing LCMSs.
• Semantic structuring. Instead of dealing with large learning objects (often whole presentations or tests), SlideWiki employs the WikiApp data model to decompose learning material into fine-grained learning artifacts.
• Reuse and repurpose. Instead of full redevelopment, content can gradually evolve in SlideWiki. This decreases the cost of content creation, increases the quality of e-learning content and supports the evolution of and adaptation to new requirements.
• Crowdsourcing. There are already vast numbers of amateur and expert users who collaborate and contribute on the Social Web. Harnessing the power of such crowds in SlideWiki can significantly enhance and widen the distribution of e-learning content.
• Social networking. The theoretical foundations for e-Learning 2.0 are drawn from social constructivism [Wang et al., 2012]. It is assumed that students learn as they work together to understand their experiences and create meaning. SlideWiki supports social networking activities (e.g. following, discussing, sharing and rating slides and presentations) to enable students to proactively interact with each other to acquire knowledge.
• Support of multilinguality. SlideWiki enables the crowd-translation of content to promote open e-learning material among different countries.
• Progress evaluation. SlideWiki supports the creation of questions and self-assessment tests based on slide material. This enables users to evaluate their progress while learning.
SlideWiki brings the following benefits for academic learning. It enables educators, lecturers and teachers to
• increase their user base by making content accessible to a world-wide audience.
• get their high-quality e-learning content translated into many different languages.
• engage students in contributing to and discussing the slides.
• create (self-)assessment tests for students.
• involve peer-educators in improving and maintaining the quality and attractiveness of their e-learning content.
• increase their reputation in the community by sharing high-quality e-learning content.
Students can also
• view rich learning content right in a browser.
• discuss particular content (e.g. a slide or question) with other students and instructors.
• contribute additional content, improvements and feedback.
• assess their learning progress using the questionnaires attached to presentations.
7.3.2. Elicitation and Sharing of Corporate Knowledge
In medium and large enterprises and organizations, presentations are crucial elements of corporate knowledge exchange. Such organizations are mostly hierarchically organized, and communication and knowledge flows usually follow corporate hierarchies. In addition to sending emails and documents, meetings where presentations are shown to co-workers, subordinates and superiors are one of the most important knowledge exchange functions. Research conducted by the Annenberg School of Communications at UCLA and the University of Minnesota's Training & Development Research Center shows that executives on average spend 40-50% of their working hours in meetings.12 They also spend a remarkable amount of time collecting the required materials and creating new presentations. The challenges with current organizational presentations can be roughly divided into the following categories:
• Sharing and reuse of presentations. Much of the corporate strategy, direction and accumulated knowledge is encapsulated in presentation files; yet this knowledge is effectively lost because slides are inaccessible and rarely shared. Furthermore, offline presentations are hard to locate. Thereby, executives usually spend their time creating new slides instead of re-using existing material.
• Collaborative creation of presentations. Executives in different departments or countries often unknowingly duplicate their efforts, wasting time and money. To collaboratively create a presentation, the members need to manually download and merge the individual presentations.
• Following/discussing presentations. Finding the most up-to-date presentation is difficult and time-consuming, and therefore costly. Furthermore, discussing the content of presentations in face-to-face meetings or email discussions is not efficient within organizations.
• Tracking/handling changes in presentations. Tracking and handling changes that occur within different presentations is a time-consuming task which requires opening all offline presentations and manually comparing their content. Additionally, there may be hundreds of slide copies to change when an original is modified; such cascading changes are highly expensive each time.
12 http://www.shirleyfinelee.com/MgmtStats
• Handling heterogeneous presentation formats. Presentations can be created in different formats (e.g. Office Open XML, Flash-based, HTML-based or LaTeX-based presentations), making their integration and reuse a cumbersome task for organization members.
• Ineffective skills management and training. Medium and large enterprises are obliged by law to provide means for training and qualification to their employees. This is usually performed through seminars, where training material is prepared in the form of presentations. However, it is usually not possible to provide engaging bi-directional and interactive means of knowledge exchange, where employees contribute to the training material.
• Preserving the organizational identity. Having a consistent template and theme including the logo and brand message of the organization is of great significance in shaping the organizational identity. With offline presentations it is difficult to persistently manage and sustain specific organizational templates. Everyone needs to take care of templates and themes individually, and managing the changes takes a remarkable amount of time.
SlideWiki as a crowdsourcing platform addresses most of the above-mentioned limitations of current presentation tools within organizations. As a tool for knowledge management within organizations, SlideWiki can be applied in the following areas:
Developing a shared mental model within the organization. In organizational learning, learning occurs through shared insights and mental models. In this process, organizations obtain the knowledge that is located in the minds of their members or in epistemological artifacts (maps, memories, policies, strategies and programs) and integrate it with the organizational environment [Valaski et al., 2012]. This shared mental model (a.k.a. organizational memory) is the accumulated body of data, information and knowledge created in the course of an individual organization's existence.
Combining presentations with social approaches for crowdsourcing. Presentations, when combined with crowdsourcing and collaborative social approaches, can help organizations to cultivate innovation by collecting and expressing individuals' ideas within organizational social structures. As discussed in [Blankenship and Ruona, 2009], there are different types of social structures living in the context of organizations: work groups, project teams, strategic communities, learning communities, communities of practice and informal networks, to mention some. These social structures make frequent use of presentations to present and discuss their internal ideas. Therefore, creating an integrated collaborative platform for authoring and sharing presentations will result in exchanging knowledge within and across these social structures (even supporting inter-organizational knowledge transfer).
Figure 7.5.: SlideWiki ecosystem for organizational knowledge sharing.
As a driver for organizational innovation. Presentations are an important driver of organizational innovation, particularly when they are exchanged between social connections that cross functional and organizational boundaries. As discussed in [Fonstad, 2005], improvising is a structured process of innovation that involves responding to changing situations with the resources at hand by creating a production and adapting it continuously. Presentation tools enable the creation of so-called structural referents, i.e. representations one develops about a structure. Structural referents enable communities to collaborate on individuals' ideas and to foster the potential ideas in alignment with the organizational goals. Ghost sliding is a process introduced in [Fonstad, 2005] which utilizes presentation slides as structural referents for collaborative knowledge management. Ghost sliding is an iterative process where consultants draw up quick, rough representations of each slide and discuss them with clients to develop consensus on which statements are going to be included in the final presentation and which data needs to be collected to support those statements. The rationale for ghost sliding is that by developing explicit representations of what a consultant is striving for, the consultant can discuss the hypotheses with others and be more efficient about what kind of data to look for.
As a medium for knowledge exchange and training. As reported in [Cobb and Steele, 2011], PowerPoint presentations are the most-used tool (75.4%) for developing e-learning content within organizations. Presentations contain visualized learning materials, which improve the training of organization members having different levels of knowledge. Enabling users to contribute to these training materials makes it possible to provide engaging bi-directional and interactive means of knowledge exchange.
SlideWiki provides a crowdsourcing platform for the elicitation and sharing of corporate knowledge using presentations. It exploits the wisdom, creativity and productivity of the crowd for the collaborative creation of structured presentations. Figure 7.5 shows the SlideWiki ecosystem for supporting organizational knowledge management. SlideWiki provides a collaborative environment which enables knowledge communities to contribute to the dynamic parts of organizational memory encapsulated in presentations. The dynamic view of the structure of organizational memory [Casey and Olivera, 2011] takes into account the social nature of memory. Rather than viewing memory as knowledge stored in a collection of retention bins, the emphasis is on memory as continually constructed and reconstructed by humans interacting with each other and their organizational environment. In SlideWiki, users from different knowledge communities crossing organizational and functional boundaries can collaboratively create structured online presentations. Users can assign tags and categories for structuring the presentations. The created presentations can be shared and reused to build new synergetic presentations. Users can also track and manage changes occurring within presentations using a revisioning system. Additionally, SlideWiki includes an e-learning component that deals with questionnaires created for each presentation slide. Questionnaires together with evaluation tests facilitate the training of users within organizations. With regard to preserving the organizational identity and branding, SlideWiki supports the creation and sharing of templates and themes. Apart from contributing to the authoring of presentation content, SlideWiki also supports social networking activities such as following presentation decks, slides and users as well as discussing the created content.
7.4. Implementation
The SlideWiki application makes extensive use of the MVC architecture pattern. The MVC architecture enables the decoupling of the user interface, program logic and database controllers and thus allows developers to maintain each of these components separately. As depicted in Figure 7.6, the implementation comprises the following main components: WYSIWYM authoring, change management, search and browsing, styling, e-learning, social networking, import/export, translation as well as a Linked Data interface. We briefly walk through these components in the sequel.
Figure 7.6.: Bird's eye view on the SlideWiki MVC architecture.
WYSIWYM Authoring. SlideWiki employs the WYSIWYM interface model together with an inline HTML5-based WYSIWYG text editor for authoring the presentation slides (cf. Figure 7.7). Using this approach, users see the slideshow output at the same time as they are authoring their slides. The editor is implemented based on the Aloha editor13, extended with additional features such as an image manager, a source manager and an equation editor. The inline editor uses Scalable Vector Graphics (SVG) images for drawing shapes on the slide canvas. Editing SVG images is supported by SVG-edit14, with some predefined shapes which are commonly used in presentations. For the logical structuring of presentations, SlideWiki utilizes a tree structure together with a context menu by which users can append new or existing slides/decks and drag & drop items for positioning. When creating presentation decks, users can assign appropriate tags as well as footer text, a default theme/transition, an abstract and additional meta-data to the deck.
13 http://aloha-editor.org/
14 http://code.google.com/p/svg-edit/
Figure 7.7.: Screenshots of the SlideWiki WYSIWYM interface (V2 – text margin formatting for visualizing content tree, V7 – line connectors to show the relation between slides and decks, X4 – expanding & drilling down to explore content, T4 – drag & drop to change the order of slides and decks, T6 – floating ribbon editing to author slide content, H5 – collaboration and crowdsourcing helper components).
Change management. Revision control is natively supported by the WikiApp data model; on top of it, we merely define rules and restrictions to increase performance. There are different circumstances in SlideWiki under which new slide or deck revisions have to be created. For decks, however, the situation is slightly more complicated, since we wanted to avoid an uncontrolled proliferation of deck revisions. This would happen due to the fact that every change of a slide would otherwise also trigger the creation of a new deck revision for all the decks the slide is a part of. Hence, we follow a more conservative strategy. We identified three situations that have to cause the creation of new revisions:
• The user specifically requests to create a new deck revision.
• The content of a deck is modified (e.g. the slide order is changed, slide content is changed, slides are added to or deleted from the deck, deck content is replaced with new content, etc.) by a user who is neither the owner of the deck nor a member of the deck's editor group.
• The content of a deck is modified by the owner of the deck, but the deck is used somewhere else.
The decision flow is presented in Figure 7.8; a simplified sketch of the logic follows below. In addition, when creating a new deck revision, we always need to recursively spread the change into the parent decks and create new revisions for them if necessary.
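A minimal Python sketch of this decision logic is given below. It assumes, purely for illustration, that a deck knows its owner, its editor group, whether it is reused elsewhere and its parent decks; these attribute names are hypothetical and do not reflect SlideWiki's actual schema:

```python
def needs_new_deck_revision(deck, user, explicit_request=False):
    """Decide whether a content change to `deck` must yield a new revision."""
    if explicit_request:          # the user explicitly asked for a revision
        return True
    if user != deck.owner and user not in deck.editor_group:
        return True               # foreign edits never modify a deck in place
    if deck.used_elsewhere:       # owner edit, but the deck is reused
        return True
    return False

def propagate_revision(app, deck, user):
    """Recursively create revisions for the containing (parent) decks
    if necessary; assumes the part-of hierarchy is acyclic."""
    for parent in deck.parents:
        if needs_new_deck_revision(parent, user):
            app.new_revision(user, type(parent), parent.id)
        propagate_revision(app, parent, user)
```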
Figure 7.8.: Decision flow during the creation of new slide and deck revisions.
Search and Browsing. There are three ways of searching in SlideWiki: by keywords, by metadata and by user (who contributed to or follows certain content). We combined keyword and tag search so that users can either 1) search by keywords and then add a tag filter, or 2) show all slides or decks having a certain tag and then run an additional keyword search on the results. In both cases, an ordering a user might have applied is preserved for subsequent searches. In addition to the deck tree user interface for browsing presentations, a breadcrumb navigation bar is implemented in SlideWiki. Breadcrumbs improve the accessibility of the system by increasing the user's awareness when browsing nested presentations.
Styling. In order to create flexible and dynamic templates and styles for presentations, SlideWiki utilizes the Sass (Syntactically Awesome Stylesheets) language15. Sass extends CSS by providing several mechanisms available in programming languages, particularly object-oriented languages, but not available in CSS3 itself. When a Sass script is interpreted, it creates blocks of CSS rules for various selectors as defined by the Sass file. Using Sass, SlideWiki users can easily create and reuse presentation themes and transitions.
15 http://sass-lang.com/
Figure 7.9.: Editing of a question & test mode in SlideWiki.
E-learning. SlideWiki supports the creation of questions and self-assessment tests based on slide material. Each question has to be assigned to at least one slide. It is important to note that a question is assigned not to a slide revision but to the slide itself. Thus, when a new slide revision appears, it continues to include the full list of previously assigned questions. Questions can be combined into tests. Automatically created tests include the latest question revisions from all the slides within the current deck revision. Manually created tests present a collection of chosen questions (cf. Figure 7.9).
Social Networking. As social software, SlideWiki supports different types of social networking activities. Users can follow items such as decks, slides and other users. They can also rate, tag and discuss decks and slides. Content syndication in multiple formats such as RSS, ATOM, OPML and JSON is provided for created items so that users can subscribe to them. We are currently integrating SlideWiki with popular social networking sites like Twitter, Facebook, GooglePlus and LinkedIn.
Import/Export. The SlideWiki implementation treats interoperability as a first-class citizen. SlideWiki supports the import/export of content from/to existing desktop applications and Learning Objects Repositories
(LORs), thereby allowing users from other LCMSs to access the created content. The main data format used in SlideWiki is HTML. However, there are other popular presentation formats commonly used by desktop application users, such as PowerPoint .pptx presentations, LaTeX and others. We implemented an import of slides from the .pptx format; work on LaTeX format support is in progress.
Translation. Our architecture allowed us to implement a translation feature backed by the Google Translate service. After translation into one of the 54 supported languages, the presentation can be edited independently from the original one.
Linked Data Interface. While sharing and reusing educational data across institutional and national boundaries is a general goal for both the public and the private education sector, the last decade has seen a large amount of research dedicated to Web-scale interoperability. For example, LinkedEducation.org is an open platform which promotes the use of Linked Data for educational purposes. In order to enable the export of SlideWiki content on the Data Web as LORs, we employed the RDB2RDF mapping tool Triplify [Auer et al., 2009] to map SlideWiki content to RDF and publish the resulting data on the Data Web. The Triplify configuration for SlideWiki was created manually according to the IEEE Learning Objects Metadata (LOM) standard and can be changed to support specific LORs. The SlideWiki Triplify Linked Data interface is available via http://slidewiki.org/triplify.
http://slidewiki.aksw.org/main/deck/1#tree-1-slide-1-1-view (the URL parts annotated in Figure 7.10: selection of a controller, selection of a function inside the controller, function parameters, anchor for initializing state in JavaScript)
Figure 7.10.: Mapping of URLs to MVC actions in SlideWiki.
Frontend. In addition to the overall MVC pattern, SlideWiki utilizes a client-side MVC approach (implemented in JavaScript and running inside the user's Web browser) to provide users with a rich and interactive user interface. As depicted in Figure 7.10, there is a hash fragment in the request URL which acts as input for the client-side MVC handler. This fragment consists of an identifier and an action name. The identifier itself has four parts which are combined based on the following pattern: tree-{container_deck_id}-{content_type}-{content_id}-{content_position}. For example, tree-1-slide-5-2-view refers to the view action assigned to the slide with id 5, located at the second position of the deck with id 1.
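The following Python sketch parses such a fragment according to the pattern above; the actual SlideWiki handler is written in JavaScript, so this transcription is only illustrative:

```python
from typing import NamedTuple

class MvcAction(NamedTuple):
    container_deck_id: int
    content_type: str        # e.g. "deck" or "slide"
    content_id: int
    content_position: int
    action: str              # e.g. "view" or "edit"

def parse_fragment(fragment: str) -> MvcAction:
    """Parse 'tree-{deck id}-{type}-{id}-{position}-{action}'."""
    prefix, deck_id, ctype, cid, position, action = fragment.split("-")
    if prefix != "tree":
        raise ValueError("not a SlideWiki tree fragment")
    return MvcAction(int(deck_id), ctype, int(cid), int(position), action)

print(parse_fragment("tree-1-slide-5-2-view"))
# MvcAction(container_deck_id=1, content_type='slide', content_id=5,
#           content_position=2, action='view')
```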
The client-side MVC handler, as a (singleton) controller, listens to the hash fragment and, once a change has occurred, triggers the corresponding actions. Each action has a JavaScript template (implemented using jQuery templates) with the corresponding variable placeholders. For each action an Ajax call is made and the results are returned to the controller in JSON format. Subsequently, the controller fills the templates with the results and renders them in the browser.
7.5. SlideWiki vs. Presentation Management Systems
There are already many Web-based platforms that provide services for the online creation, editing and sharing of presentations. SlideShare.net is a popular Website for sharing presentations16. Compared to SlideWiki, it does not provide any feature to create and reuse the content of presentations. SlideRocket.com and Prezi.com are other related tools which help people to create visually rich, zoomable presentations. In contrast to SlideWiki, they focus more on visualization aspects than on the content of the presentations. Microsoft SharePoint Online17 and SlideBank.com are two commercial solutions which provide slide libraries to allow users to work with PowerPoint slide decks stored in the cloud. Unlike SlideWiki, which is a purely online platform, these tools adopt the Software-as-a-Service approach to enable synchronization between desktop applications and Web service providers.
7.6. Usability Evaluation
SlideWiki has already been used for teaching Business Information Systems and Semantic Web lectures at the Chemnitz Technical University, the University of Leipzig and the University of Bonn. Figure 7.11 shows a screenshot of the Semantic Web lecture series comprising 785 slides collaboratively created by 22 Semantic Web researchers. As of May 2014, there were 3100 decks, 11000 deck revisions, 22000 slides, 41000 slide revisions, 2000 questions and 850 active users on SlideWiki. To evaluate the real-life usability of SlideWiki, we performed a usability user study with 22 subjects. Subjects were drawn from the members of the AKSW research group at the University of Leipzig and MSc students at the Chemnitz Technical University who took the course Business Information Systems. We used the SUS scale to grade the usability of SlideWiki. The results of our survey showed a mean usability score of 69 for SlideWiki, which indicates a reasonable level of usability (cf. Figure 7.12). In addition to the quantitative results, we also collected a number of user suggestions to further improve the SlideWiki platform. For instance, some users suggested providing an autosave feature, supporting more import/export formats, defining user groups, etc.
16 Other examples include authorSTREAM http://www.authorstream.com, SlideServe http://www.slideserve.com, Scribd http://www.scribd.com and slideboom http://www.slideboom.com
17 http://sharepoint.microsoft.com
Figure 7.11.: A screenshot of the Semantic Web lecture series created collaboratively on SlideWiki.
The students worked with SlideWiki for several weeks, and we collected usage statistics for that period. The experiment was not obligatory, but students actively contributed by creating additional questions and fixing mistakes. During that period, they created 252 new slide revisions; some of them were entirely new slides, others were improved versions of the original lecture slides. Originally the whole course had 130 questions, and students changed 13 of them, fixing typos or adding additional options to multiple-choice questions. In total, students performed 287 self-assessment tests. The majority of these used the automatically and randomly created tests covering the whole course material. After the experiment, based on the student grades in the final exam, we could observe that more active SlideWiki users received better marks in the real examination.
Figure 7.12.: Results of the SlideWiki usability evaluation using the SUS questionnaire.
7.7. Conclusion
This chapter addressed the research question RQ5 (cf. Section 1.3): to apply crowdsourcing and collaborative content authoring techniques to the process of semantic content authoring. We presented the SlideWiki platform for the authoring of highly-structured e-learning content. SlideWiki as a crowdlearning platform enables the collaborative authoring of presentations by utilizing the WikiApp data model as well as the WYSIWYM user interface model. The created presentations help to effectively shape rich e-learning materials by utilizing crowd feedback.
Chapter 8
WYSIWYM for Authoring of Semantic Medical Prescriptions
“I always wanted to be somebody, but now I realize I should have been more specific.” — Lily Tomlin
In this chapter, we present how the WYSIWYM model can be employed and customized in specific domains to provide content interoperability. We will introduce the new concept of Semantic Medical Prescriptions as an application of Semantic Web technologies in e-prescription systems. Semantic prescriptions can help to automatically handle the medication errors occurring in prescriptions and can increase the awareness of patients about the prescribed drugs and drug consumption in general. We will also showcase Pharmer, our implemented WYSIWYM interface realizing the creation of semantic prescriptions. The remainder of this chapter is structured as follows: Section 8.1 and Section 8.2 provide a background on basic concepts such as e-prescriptions and LODD. In Section 8.3, we describe Pharmer as a solution to effectively create semantic prescriptions. We then discuss possible use cases of Pharmer in Section 8.4. To better demonstrate the possible stakeholders of the Pharmer system, an example scenario is drawn up in Section 8.5. Section 8.6 reports the results of our usability evaluation and, finally, Section 8.7 concludes the chapter.1
8.1. E-Prescriptions
As reported by MedicineNet [Melissa Conrad Stoppler, 2012], medication errors are the most common type of medical errors in health care. Errors such as an improper dose of medicine, adverse drug interactions, food interactions, etc. often stem from invalid prescriptions and unawareness of the patients. Medication-oriented errors are usually the result of failures during the medication process [González et al., 2011]. Electronic prescriptions, which have recently been gaining attention in the e-health domain, are one of the solutions proposed to address these types of errors. In an e-prescription system, the prescriber electronically sends an accurate, error-free prescription directly to a pharmacy from the point of care.
1 The contents of this chapter have been published as [Khalili and Sedaghati, 2013a]. Some parts of this chapter were written jointly with Bita Sedaghati ([email protected]) from the Institute of Pharmacy, University of Leipzig.
During recent years, the adoption of e-prescriptions has been spreading relatively rapidly. In the US, the so-called Electronic Prescribing Incentive Program is a reporting program that uses a combination of incentive payments and payment adjustments to encourage electronic prescribing by eligible professionals.2 As recently published by [Galanter et al., 2013], hospitals' use of computerized prescriptions prevented 17 million drug errors in a single year in the United States. The Canadian Medical Association (CMA) and the Canadian Pharmacists Association (CPhA) have approved a joint statement on the future of e-prescribing that aims to have all prescriptions for Canadians created, signed and transmitted electronically by 2015. The Australian government removed commonwealth legislative barriers to electronic prescribing starting from 20073. A system called epSOS4, which enables the use of e-prescriptions all around Europe, is currently undergoing an extensive practical testing phase. However, one of the main challenges in current e-prescription systems is dealing with the heterogeneity of available information sources. There already exist different sources of information addressing different aspects of pharmaceutical research. Information about chemical, pharmacological and pharmaceutical drug data, clinical trials, approved prescription drugs, drug activity against drug targets such as proteins, gene-disease-drug associations, adverse effects of marketed drugs, etc. are some examples of this diverse information. Managing these dynamic pieces of information within current e-prescription systems without blurring the borders of the existing pharmaceutical information islands is a cumbersome task. On the other hand, Linked Open Data, as an effort to interlink and integrate these isolated sources of information, is obtaining more attention in the domains of pharmaceutical, medical and life sciences. Combining the best practices from Linked Open Data with e-prescription systems can provide an opportunity for patients, researchers as well as practitioners to collaborate in a synergetic way. A consequence of introducing LD in the health care sector is that it significantly changes the daily duties of its employees. Therefore, the most challenging aspect will not be the technology but rather changing the mind-set of the employees and training them in the new technology [Puustjärvi and Puustjärvi, 2006].
8.2. Linked Open Drug Data (LODD)
In computing, Linked Data (LD) describes a method of publishing structured data so that it can be interlinked and become more useful. It builds upon standard Web technologies such as HTTP and URIs, but rather than using them to serve Web pages for human readers, it extends them to share information in a way that can be read automatically by computers. This enables data from different sources to be connected and queried [Bizer et al., 2009].
2 Electronic Prescribing (eRx) Incentive Program: http://www.cms.gov/erxincentive
3 http://www.medicareaustralia.gov.au/
4 epSOS: the European eHealth Project, http://www.epsos.eu/
Figure 8.1.: Available datasets related to life sciences and pharmaceutical research.
Tim Berners-Lee, the inventor of the Web and LD initiator, suggested a 5-star deployment scheme for Linked Open Data (LOD): 1) make your stuff available on the Web (in whatever format) under an open license, 2) make it available as structured data (e.g., Excel instead of an image scan of a table), 3) use non-proprietary formats (e.g., CSV instead of Excel), 4) use URIs to identify things, so that people can point at your stuff, 5) link your data to other data to provide context. Particularly in the areas of health care and life sciences, with their wealth of available data, large-scale integration projects like Bio2RDF5, Chem2Bio2RDF6 and the W3C HCLS's (Health Care and Life Sciences) Linked Open Drug Data
5 http://bio2rdf.org/
6 Semantic Web in Systems Chemical Biology: http://chem2bio2rdf.wikispaces.com/
(LODD) [Samwald et al., 2011a] have not only significantly contributed to the development of the Linked Open Data effort, but have also made social and technical contributions towards data integration, knowledge management and knowledge discovery. There is already much interesting information on pharmaceutical research available on the Web. The sources of data range from general drug information, interactions and impacts of drugs on gene expression, through to the results of clinical trials. LODD has surveyed publicly available data about drugs, created LD representations of the data sets, and identified interesting scientific and business questions that can be answered once the data sets are connected (cf. Figure 8.1).
LODD Applications in the Medical Domain. There exist a few approaches that address medical and pharmaceutical applications using LODD. TripleMap (http://www.triplemap.com) is a project connecting widely distributed journal articles, patents and numerous databases in pharmaceutical research. As a Web-based application, TripleMap provides a dynamic visual interface to integrated RDF datasets such as the LODD. Showing unexpected associations between entities related to a researcher's interest is the main advantage of TripleMap, enabled by the broadly interconnected data available in the LODD data sets. The goal of the TripleMap project is to deliver and sustain an ‘open pharmacological space’ by using and enhancing state-of-the-art Semantic Web standards and technologies [Samwald et al., 2011b]. Another related project is the Open Pharmacological Space (OPS) of the Open PHACTS (Pharmacological Concept Triple Store, http://www.openphacts.org) project under the European Innovative Medicines Initiative (IMI, http://www.imi.europa.eu/). The goal of this project is the integration of chemical and biological data using LD standards to support drug discovery [Williams et al., 2012]. The Linked Cancer Genome Atlas Database [Saleem et al., 2013], another LD project, aims to create an atlas of genetic mutations responsible for cancer. The project provides an infrastructure for making cancer-related data publicly accessible and enables cancer researchers around the world to make and validate important discoveries. Although these projects address the backend side of creating LODD applications, there has been a clear lack of applications with user-friendly, efficient and effective interfaces to make LD resources accessible to end-users outside the biomedical community. One of the use cases of LODD datasets addressed in this chapter is the authoring of Semantic Prescriptions, i.e. prescriptions enriched by LOD.
Figure 8.2.: Bottom-up semantic enrichment of prescriptions.
8.3. Semantic Authoring of Medical Prescriptions using Pharmer
Semantic Medical Prescriptions are intelligent e-prescription documents enriched with dynamic drug-related meta-data; they thereby know about their content and the possible interactions. As depicted in Figure 8.2, semantic prescriptions are created based on a bottom-up process (cf. Section 3.3.1) in which normal e-prescriptions (unstructured or semi-structured, with a lower level of expressiveness) are enriched with semantic metadata coming from a set of predefined ontologies (with an upper level of expressiveness). In order to showcase the applicability of semantic prescriptions, we implemented an application called Pharmer. The Pharmer implementation is open-source and available for download together with an explanatory video and online demo at http://code.google.com/p/pharmer/. Pharmer provides a platform for the semantic annotation of conventional e-prescriptions. We use the Schema.org MedicalTherapy and Drug vocabularies as our annotation ontologies and utilize existing pharmaceutical linked datasets such as DBpedia, DrugBank7, DailyMed8 and RxNorm9 as our domain ontology.
7 A bioinformatics and cheminformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) data, available at http://www.drugbank.ca
8 Information about marketed drugs, available at http://dailymed.nlm.nih.gov
9 A normalized naming system for generic and branded drugs, available at http://www.nlm.nih.gov/research/umls/rxnorm/
Figure 8.3.: Architecture of the Pharmer system.
8.3.1. Architecture
The Pharmer system architecture is depicted in Figure 8.3 and consists of three layers:
Document Layer. This layer includes the traditional e-prescription document plus two components: Drug Detection and Drug Information Collector. The drug detection component performs the natural language processing of the e-prescription document to detect the terms referring to a drug in the prescription. The component uses the DBpedia Spotlight10 and BioPortal Annotator11 NLP services to parse and analyze the text looking for known drugs. BioPortal Annotator is an ontology-based Web service that annotates public datasets with biomedical ontology concepts based on their textual metadata. The automatic drug detection component is configurable so that users can easily add other existing NLP services for drug detection. While the user is writing the prescription, this component asynchronously performs the drug recognition and adds the related annotations as real-time semantic tags. Another component in this layer is the Drug Information Collector, which gathers all the information regarding a specific drug from LOD. To pursue this, it utilizes datasets such as DrugBank, DailyMed and RxNorm by sending federated SPARQL queries (a sketch follows below).
10 http://spotlight.dbpedia.org/
11 http://bioportal.bioontology.org/annotator
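As an illustration of such a lookup, the following Python sketch retrieves basic drug information from DBpedia's public SPARQL endpoint using the SPARQLWrapper library. The concrete federated queries Pharmer sends, and the endpoints it combines, are not spelled out in the text, so the query below is only a plausible example:

```python
from SPARQLWrapper import SPARQLWrapper, JSON  # pip install sparqlwrapper

def drug_abstract(drug_uri: str) -> list:
    """Fetch the English abstract of a drug resource from DBpedia."""
    sparql = SPARQLWrapper("http://dbpedia.org/sparql")
    sparql.setQuery(f"""
        PREFIX dbo: <http://dbpedia.org/ontology/>
        SELECT ?abstract WHERE {{
            <{drug_uri}> dbo:abstract ?abstract .
            FILTER (lang(?abstract) = 'en')
        }}
    """)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    return [b["abstract"]["value"] for b in results["results"]["bindings"]]

print(drug_abstract("http://dbpedia.org/resource/Captopril"))
```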
Figure 8.4.: Pharmer WYSIWYM implementation (V1 – highlighting of drugs through framing, V9 – additional information about a drug in a callout, T1/T2 – combined form and inline editing of electronic prescriptions).
Semantic Layer. There are two main components in this layer, namely the Annotator and the Authoring UI. The Annotator component handles the automatic annotation and embeds the general information about the drugs as meta-data into the e-prescription; it adopts the RDFa format. The Authoring UI component provides users with a set of input forms to manually embed the meta-data related to prescription instructions into the prescription document.
Application Layer. This layer provides a set of applications on top of the generated semantic prescriptions. The Interaction Finder checks the possible interactions between the prescribed drugs and warns the prescriber about them. The Visualizer is responsible for graphically representing the embedded semantics of a prescription (e.g. as depicted in Figure 8.6). The Fact Extractor generates the RDF/Turtle representation of the semantic prescriptions.
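To give an impression of the kind of RDF/Turtle output the Fact Extractor produces, the following Python sketch builds a tiny drug description with rdflib and the Schema.org Drug vocabulary; the property choices are illustrative and not Pharmer's documented output format:

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

SCHEMA = Namespace("http://schema.org/")

g = Graph()
g.bind("schema", SCHEMA)

drug = URIRef("http://dbpedia.org/resource/Captopril")
g.add((drug, RDF.type, SCHEMA.Drug))
g.add((drug, SCHEMA.name, Literal("Captopril")))
g.add((drug, SCHEMA.dosageForm, Literal("Oral Tablet")))

print(g.serialize(format="turtle"))
# Prints roughly:
# @prefix schema: <http://schema.org/> .
# <http://dbpedia.org/resource/Captopril> a schema:Drug ;
#     schema:dosageForm "Oral Tablet" ;
#     schema:name "Captopril" .
```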
8.3.2. Features
The main features of Pharmer can be summarized as follows:
• WYSIWYM User Interface. As shown in Figure 8.4, Pharmer employs the WYSIWYM concept described in Chapter 4. In Pharmer, users are able to directly manipulate conventional e-prescriptions in order to enrich them with semantics.
The generated annotations can be viewed through different sets of user interfaces which are configurable by users. For example, users can select specific border/background colors to distinguish the annotated drugs in a prescription.
• Providing Different Semantic Views. Semantic views allow the generation of different views on the same metadata schema and aggregations of the knowledge base based on the roles, personal preferences and local policies of the intended users. Pharmer offers two types of views: generic and domain-specific views. Generic views provide visual representations of drug information (e.g., the information view depicted in Figure 8.5 or the graph view in Figure 8.6). Domain-specific views address the requirements of a particular domain user (e.g., a researcher needs specific views for visualizing the atomic structure of chemical compounds).
• Real-time Drug Tagging. Real-time tagging means creating drug annotations while the user is typing. This significantly increases the annotation speed [Heese et al., 2010]. Users are not distracted since they do not have to interrupt their current authoring task. Pharmer has a client-side component which interacts with the server asynchronously to make real-time tagging possible.
• Drug Suggestion. When searching for a drug, Pharmer suggests similar drugs by taking into account the history of search terms and by sending SPARQL queries to the relevant datasets.
• Automatic Drug Annotation. Automatic annotation means the provision of facilities for the automatic mark-up of prescriptions. The automatic annotation process in Pharmer is basically composed of finding drug terms in the prescription using an NLP service, mapping them against an ontology (i.e., DBpedia), and disambiguating common terms (see the sketch below).
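As a sketch of the first step of this process, the following Python snippet performs entity recognition over prescription text via the public DBpedia Spotlight REST API; the endpoint shown is the one hosted today (api.dbpedia-spotlight.org), which differs from the service URL used at the time, so treat it as an assumption:

```python
import requests

def detect_drugs(text: str, confidence: float = 0.5):
    """Annotate free text with DBpedia resources via DBpedia Spotlight."""
    response = requests.get(
        "https://api.dbpedia-spotlight.org/en/annotate",
        params={"text": text, "confidence": confidence},
        headers={"Accept": "application/json"},
    )
    response.raise_for_status()
    return [(res["@surfaceForm"], res["@URI"])
            for res in response.json().get("Resources", [])]

print(detect_drugs("Captopril 50 mg Oral Tablet once daily"))
# e.g. [('Captopril', 'http://dbpedia.org/resource/Captopril')]
```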
8.4. Possible Use Cases of Pharmer
8.4.1. A Ubiquitous Computing Platform for Semantic E-Prescribing
Mobile and ubiquitous computing devices are increasingly present and prevalent in health contexts. This trend brings a number of possibilities for mobile health (m-health) to address critical aspects of health care and health system needs, by virtue of these devices' ubiquity, simplicity and cost-efficiency [Petrucka et al., 2013].
Figure 8.5.: Screenshot of the Pharmer application (top-left: general view, top-right: drug information view, bottom-left: prescription authoring view, bottom-right: drug interaction-finder results).
In particular, in the process of semantic e-prescribing, a mobile application facilitates the creation of semantic medical prescriptions using any device and in any location. The Pharmer mobile application12, as shown in Figure 8.7, provides a mobile user interface for the authoring of semantic prescriptions as well as for accessing multi-dimensional data on medical prescriptions. Current ubiquitous devices are programmable and come with a growing set of facilities including multi-touch screens and cheap, powerful embedded sensors, such as an accelerometer, digital compass, gyroscope, GPS, microphone and camera. Utilizing this rich set of facilities in the context of medical prescriptions enriches the patient's medical prescription with sensor data and thereby improves the quality of e-health services. For example, the location of the user and indicators like blood pressure or heart rate can be received from sensors, based on which Pharmer can suggest suitable drugs located in pharmacies close to the user.
12 Available at http://bitili.com/pharmer/mobile
Figure 8.6.: Graph view in Pharmer.
8.4.2. A Professional Social Network for Health-care Service Providers
Pharmer as a prescribing tool can be incorporated into a health care social network. Such a network is composed of health care professionals and patients who collaboratively write, correct and modify prescriptions in a semantically enriched environment. This social health care network facilitates relations between patients and health care professionals in order to improve Shared Decision Making (SDM). The traditional model of medical decision-making, in which doctors alone make decisions on treatment, is no longer used in modern health care. Instead, the role of the patient in the consultation has been highlighted, mainly through the introduction of ‘patient-centered’ strategies. Therefore, models promoting patients' active involvement in the decision-making procedure are nowadays being developed. A model introduced by Charles et al. [Charles et al., 1997] defines shared decision making by the following four key characteristics:
• both the patient and the doctor are involved.
• both parties share information.
• both parties take steps to build a consensus about the preferred treatment.
• an agreement is reached on the treatment to implement.
Figure 8.7.: Screenshot of the Pharmer mobile application.
Pharmer facilitates the process of SDM through the connection between patient and physician on the one hand and pharmacist on the other hand. Access to LODD enables Pharmer not only to be linked to e-prescribing systems but also to further assist physicians in diagnosis and treatment. With its direct connection to up-to-date information, Pharmer enables physicians to reconfirm their diagnosis and helps them in finding proper treatment approaches. After a general examination, the physician can enter the observed symptoms into the Pharmer network, where, with the wealth of data available, Pharmer can assist in diagnosis and the subsequent therapies.
8.5. Pharmer Stakeholders: Example Scenario
As depicted in Figure 8.8, the Pharmer approach is very versatile and can be applied in a vast number of use cases by different stakeholders. The arrows in the figure can be summarized as follows:
1. The physician diagnoses the disease and writes the corresponding semantic prescription using Pharmer, where the patient's medication history is available.
Figure 8.8.: Pharmer ecosystem.
2. The patient accesses drug information, food interactions and adverse drug reactions via Pharmer.
3. The pharmacist verifies the prescription and considers alternative options suggested by Pharmer.
4. Pharma companies utilize the Pharmer data store in order to balance their production and distribution according to market taste and demand.
5. Researchers easily access the abundant data sources and prescription statistics.
6. Pharmer informs insurance companies so that they can devise fair coverage plans according to covered drugs and the patient's medication history.
All the above stakeholders utilize Linked Open Data as their integrated information source. As a scenario, consider a 63-year-old man with a history of MI (Myocardial Infarction) and type 2 diabetes who visits a heart and coronary specialist complaining about frequent headaches and a feeling of heaviness in the head. The specialist, after a general inspection and monitoring of vital signs, asks for a blood test. He then considers the symptoms, including high blood pressure (sys/dias: 158/95 mmHg) and a high Fasting Blood Sugar (150 mg/dl).
He diagnoses high blood pressure and severe type 2 diabetes. Thereby, the patient profile is defined in Pharmer by the patient's information besides the diagnosis; “no weight loss” is mentioned as a preference in the patient's profile. Regardless of the patient's preferences, the physician would prescribe Metformin as the drug of choice. However, since a major side effect of Metformin is weight loss, the physician replaces Metformin with Rosiglitazone. Considering the medication that the patient took before (Glibenclamide only), the specialist dispenses a new semantic prescription by entering the following drugs:
• Rosiglitazone 4 mg Oral Tablet once daily
• Glibenclamide 5 mg Oral Tablet bid
• Atenolol 50 mg Oral Tablet once daily
He then checks for possible drug interactions by clicking the corresponding button in the Pharmer software. As Pharmer is connected to LODD, it is capable of recognizing the most recently updated drug interactions (available in the DrugBank dataset). He finds out that sulfonylurea-class drugs (here Glibenclamide) are not suitable for co-administration with beta-blockers (here Atenolol). So, he needs to replace the beta-blocker with another drug. Using Pharmer and its connection to Linked Open Data, the physician can find possible alternatives. He then decides to choose Captopril as a replacement. The semantic prescription is then sent to the patient's pharmacy of choice. There, the pharmacist is able to review the semantic prescription and comment on it directly in the system, so that the physician is also aware of the corresponding changes. The pharmacist's comments may cause minor or major modifications to the semantic prescription. For instance, using Pharmer, she is able to check the appropriate dose of each medicine or suggest cheaper alternatives (if possible). In this case, as Rosiglitazone elevates cardiovascular risks, the pharmacist suggests replacing Rosiglitazone with Pioglitazone. This change happens as a realization of the shared decision making between physician, pharmacist and patient. Thereafter, the patient who was referred to the pharmacy takes the prescribed drugs. Before he starts taking the tablets, he logs into the Pharmer system with his patient ID. There, he is able to observe the drug information embedded in the error-free semantic prescription besides the preferred time and drug intake instructions. He is also informed about the possible food interactions. The patient's profile is completed as he visits physicians or asks for refills. Furthermore, he is followed up by the physician and the pharmacist via Pharmer. After two months, the patient visits another specialist for his recurrent symptoms of diabetes. The specialist accesses the patient's medical profile via Pharmer and increases the anti-diabetic drug dose.
A researcher at an academic research institution investigates the effect of Captopril (as an Angiotensin II antagonist) on preventing diabetes recurrence. Having the data from the aforementioned patient's follow-up, along with that of other similar patients, allows the investigator to pursue her goal. In this case, for example, Captopril along with anti-diabetic drugs led to diabetes recurrence. Observing all the corresponding patient profiles will either confirm or reject the research assumption. A pharma company manager needs to determine the compliance rate of Captopril in the market in order to balance production based on market demand. Applying Pharmer allows him to simply access these data and decide how to go on with this product. He is also able to collect evidence which may prevent the further dispensing of Captopril by physicians or its consumption among patients. Pharmer allows insurance companies to customize and individualize their services based on patients' medical records. Using Pharmer, which contains information on insured drugs, the physician can choose the drugs accordingly. In the scenario, the insurance company checks the dispensed medication against the disease and the patient's insurance status and accordingly decides to reimburse the patient.
8.6. Usability Evaluation
In order to determine whether we succeeded in facilitating the creation of semantic prescriptions using Pharmer, we performed a usability user study with 13 subjects. Subjects comprised 3 physicians, 4 pharmacists, 3 pharmaceutical researchers and 3 students. We first showed them a tutorial video on the different features of Pharmer13 and then asked each one to create a prescription with Pharmer. After finishing the task, we asked the participants to fill out a questionnaire which consisted of two parts: feature usage questions and usability experience questions. We used the SUS scale [Lewis and Sauro, 2009] to grade the usability of Pharmer. SUS is a standardized, simple, ten-item Likert scale-based questionnaire giving a global view of subjective assessments of usability. The results of our survey (cf. Figure 8.9) showed a mean usability score of 75 for Pharmer, which indicates a good level of usability. Participants particularly liked the integration of functionality and the ease of learning and use. The confidence in using the system was slightly lower, which we attribute to the short learning phase and the diverse functionality. Of course, this is a very simplified view on usability, and we expect even better results could be achieved by putting more effort into the Pharmer development. However, our goal was to demonstrate that implementations such as Pharmer with good usability characteristics can be created with relatively limited effort. In addition to the quantitative results, we also collected a number of user suggestions. For instance, some users suggested providing a print-friendly document with all the patient's desired information.
13 Available at http://youtu.be/eNbbqO-zLQk
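For reference, SUS scores are computed with the standard scoring procedure: each of the ten responses (on a 1–5 Likert scale) is normalized to a 0–4 contribution, with positively worded (odd-numbered) and negatively worded (even-numbered) items mirrored, and the sum is scaled by 2.5 to the 0–100 range. The minimal Python sketch below illustrates this procedure; the answer sheets are hypothetical and not the actual study data:

```python
def sus_score(responses):
    """Compute the SUS score (0-100) from ten Likert answers coded 1..5."""
    assert len(responses) == 10 and all(1 <= r <= 5 for r in responses)
    # 0-based index i: even index corresponds to an odd-numbered (positive) item
    contributions = [(r - 1) if i % 2 == 0 else (5 - r)
                     for i, r in enumerate(responses)]
    return sum(contributions) * 2.5

# Hypothetical answer sheets of two participants
participants = [
    [4, 2, 5, 1, 4, 2, 5, 2, 4, 2],
    [4, 1, 4, 2, 4, 2, 4, 2, 4, 3],
]
mean_sus = sum(sus_score(p) for p in participants) / len(participants)
print(f"Mean SUS score: {mean_sus:.1f}")
```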
Figure 8.9.: Usability evaluation results for Pharmer.
8.7. Conclusion

This chapter addressed the research question RQ6 (cf. Section 1.3): applying semantic content authoring to a domain-specific use case (i.e. e-prescribing) for achieving content interoperability. Providing a consistent connection between patients, physicians, pharmacists, pharmaceutical researchers and drug companies is a crucial step towards enhancing the quality of knowledge management, and thereby of e-health services, in the pharmaceutical domain. With Pharmer, we presented in this chapter another implementation of the WYSIWYM interface model, realizing semantic prescriptions as intelligent medical prescriptions that improve the integration and interoperability of e-prescribing systems with other e-health services. Semantic prescriptions embed important metadata about the content of a prescription, which increases the awareness of their consumers.
Chapter 9
Conclusions and Future Work

"In the end, people are persuaded not by what we say, but by what they understand."
— John C. Maxwell
This chapter provides an overview of the answers to the research questions behind this thesis and summarizes the main results of this work. It then discusses future directions in which we intend to extend and broaden the research conducted in the contributed areas.
9.1. Answers to Research Questions

In this section, we revisit the research questions discussed in Section 1.3 and provide a summary of the answers and contributions:

RQ1. What are existing approaches for user-friendly semantic content authoring?
Based on the starting point of the authoring process, which can be ontologies (with a higher level of expressiveness) or unstructured content (with a lower level of expressiveness), we can classify the existing approaches for SCA into two categories: top-down and bottom-up. The bottom-up approaches, usually known as semantic annotation or semantic markup techniques, aim to annotate existing documents using a set of predefined ontologies. The top-down approaches, usually called ontology population techniques, aim to create semantic content based on a set of initial ontologies, which are extended during the population process. Tools which employ the bottom-up approach are more appropriate for end-users with no or limited knowledge of the domain to which the annotations or semantic structures are applied. Tools that adopt the top-down approach usually require users to have knowledge of the corresponding domain as well as of ontology concepts.

A predefined set of quality attributes comprising standard UI types and features can be employed to evaluate the strengths and weaknesses of existing SCA systems. These quality attributes address different aspects of designing and developing user-friendly SCA systems: essential, foundational quality attributes for an SCA system are, in particular, usability, generalizability, customizability and evolvability. Support of collaboration, interoperability and scalability are quality attributes required when an SCA system is employed in a community-driven environment with a large number of users, systems and interactions. Automation and proactivity are quality
attributes which facilitate the usability of SCA systems, especially for non-skilled users. Portability and accessibility are, as our survey indicated, not well addressed by the SCA-related literature so far and demand further investigation.

RQ2. How can we bind user interface elements to semantic representation data models?
In order to facilitate the creation of semantically-enriched content, we need to provide suitable UIs which are compatible with the elements of the underlying semantic representation data model. In addition to an in-depth analysis of the elements of existing semantic representation models, such as tree-based, graph-based and hyper-graph-based models, we performed an extensive review of the existing UI elements and techniques for visualization, exploration and authoring of text, images and videos. Finding the possible bindings between the semantic representation data models and the UI techniques for visualization, exploration and authoring of content led us to the development of a novel interface model called WYSIWYM (What You See Is What You Mean). WYSIWYM aims to standardize interfaces for SCA systems. In order to facilitate, enhance and customize the WYSIWYM model, a set of helper components, which implement cross-cutting aspects such as automation, recommendation and collaboration, is integrated into the model.

RQ3. How can we integrate semantic content authoring features into the current authoring tools on the Social Web?
Integrating SCA features into the current content authoring process on the Social Web facilitates the promotion of structured content on the Web to a great extent. WYSIWYG (What You See Is What You Get) text authoring is meanwhile ubiquitous on the Web and part of most content creation and management workflows, such as CMSs, weblogs, wikis, product data management systems and online shops. In this thesis, we introduced the RDFaCE approach as a transition from WYSIWYG to WYSIWYM. The rationale is to provide the user with an environment she is sufficiently familiar with, which at the same time enables her to understand, access and work with semantically-enriched content. We implemented RDFaCE as a plugin for existing WYSIWYG implementations, which can be installed and employed on the Social Web without much additional effort.

RQ4. How can we exploit semantically-enriched content for content analysis?
Semantically-enriched content can be exploited to deal with the current analytical information imbalance (cf. Section 6.1). In this thesis, we introduced conTEXT as a mashup platform for text analytics. conTEXT combines services for NLP (e.g. named entity recognition and relation extraction), sentiment analysis, visualization, exploration and feedback to exploit semantically-enriched content for text analysis. conTEXT employs the WYSIWYM interface model to enable ordinary Web users
to perform sophisticated NLP tasks. The instant benefits provided by the different analytics views in conTEXT act as an incentive for users to adopt semantic annotations and to take NLP feedback into account. Users receive more precise analytics results as they contribute to the refinement of automatically annotated content.

RQ5. How can we apply crowdsourcing & collaborative content authoring techniques to the process of semantic content authoring?
One of the main drivers to increase the amount of structured content on the Web is harnessing the power of the crowds contributing to the Social Web on a daily basis. Addressing the crowdsourcing and collaboration aspects of SCA requires new extensions to our proposed WYSIWYM interface model. By introducing the WikiApp data model in this thesis, we provide a refinement of the traditional entity-relationship data model which considers users and content revisions as its first-class citizens (see the illustrative sketch below). Based on the WikiApp model, we developed a platform called SlideWiki for collaborative authoring of highly-structured e-learning content. SlideWiki implements our proposed WYSIWYM interface together with collaboration helper components for authoring semi-structured content in an implicit and user-friendly manner.

RQ6. How can we apply semantic content authoring to a domain-specific use case for achieving content interoperability?
In this thesis, we introduced Pharmer as a domain-specific implementation of the WYSIWYM model. Pharmer enables physicians to author semantic medical prescriptions as intelligent prescriptions which know about their own content. With Pharmer, we investigated how semantically-enriched content can be applied to facilitate content interoperability in the domain of health-care services. Pharmer utilizes real-time drug tagging for the user-friendly creation of structured medical prescriptions and to enable shared decision-making among physicians, pharmacists, researchers, pharma and insurance companies.
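To give an intuition for the WikiApp model mentioned under RQ5 above (users and immutable content revisions as first-class citizens), the following Python sketch is purely illustrative: edits never overwrite content but append a new revision that points back to its parent. The class and attribute names are ours, not those of the actual SlideWiki code base:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

@dataclass(frozen=True)
class User:
    name: str  # users are first-class citizens of the model

@dataclass(frozen=True)
class Revision:
    """An immutable content revision pointing back to its parent."""
    content: str
    author: User
    created: datetime
    parent: Optional["Revision"] = None

class ContentItem:
    """A content object (e.g. a slide or a deck) as a chain of revisions."""
    def __init__(self, content: str, author: User):
        self.revisions: List[Revision] = [
            Revision(content, author, datetime.utcnow())
        ]

    def edit(self, new_content: str, author: User) -> Revision:
        # editing appends a revision instead of mutating existing content
        rev = Revision(new_content, author, datetime.utcnow(),
                       parent=self.revisions[-1])
        self.revisions.append(rev)
        return rev

    @property
    def head(self) -> Revision:
        return self.revisions[-1]  # the current state of the item
```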
9.2. Summary of the Results

Upon completion of this thesis, the main research question behind this work, namely "How can we enable user-friendly manual and semi-automatic creation of rich semantic content?", needs to be answered. As shown in Figure 9.1, in order to facilitate the authoring of semantically-enriched documents, we need to extend the existing authoring approaches with appropriate user interfaces for semantic content authoring. This extension should be carried out with minimal effort, so that the user is not distracted from the normal process of content authoring. In this thesis, we proposed a semantics-based user interface model which provides a binding between existing semantic representation data models and existing user interfaces for visualization, exploration and authoring of content.
Figure 9.1.: User-friendly manual & semi-automatic creation of rich semantic content.

The proposed model is called WYSIWYM (What You See Is What You Mean) and aims to standardize semantic authoring user interfaces. The WYSIWYM model offers a set of helper components to deal with cross-cutting aspects such as automation, proactivity, accessibility and personalization. In order to deal with communities of users and with content revisions, we enriched WYSIWYM with an appropriate data model called WikiApp, which supports collaboration and crowdsourcing. Furthermore, to incentivize users to adopt semantic content authoring, different views for text analytics are provided to users.

Revisiting our user scenario in Section 1.1, Alice can exploit semantically-enriched job posts (either by annotating unstructured job posts or by authoring structured job posts from scratch) to create different UIs for content exploration and visualization; an illustrative snippet of such structured job-post markup follows below. For example, she will see a taxonomy of all the Data Science-related skills together with their mentions in IT jobs posted on LinkedIn or other job-posting websites. She can then easily extract the most demanded Data Science skills from the collected IT jobs and publish them as structured content in her online magazine, making them reusable by other Web users too.
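To make Alice's workflow more tangible, the snippet below shows the kind of structured annotation a semantically-enriched job post boils down to, here as Schema.org JSON-LD assembled in Python. All values are invented for illustration; the snippet is not taken from an actual job board:

```python
import json

# An illustrative schema.org JobPosting annotation for a job post
# that a tool like conTEXT could aggregate and analyze.
job_post = {
    "@context": "http://schema.org",
    "@type": "JobPosting",
    "title": "Data Scientist",
    "hiringOrganization": {"@type": "Organization", "name": "Example Corp"},
    "jobLocation": {"@type": "Place", "address": "Berlin, Germany"},
    "skills": ["Machine Learning", "SPARQL", "Python"],
    "datePosted": "2014-05-01",
}
print(json.dumps(job_post, indent=2))
```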
9.3. Impact

The main goal of this research was to facilitate and promote semantic content authoring among Web end-users. To achieve this goal, we developed and released several open-source tools and platforms dealing with this task in both general and domain-specific use cases (cf. Appendix A). Most of these tools are already being
actively used by users on the Web. With these tools, we envision the following impacts:

Alleviating the Semantic Web's chicken-and-egg problem. Recently, we could observe a significant increase in the amount of structured data published on the Web. However, this increase can be attributed primarily to article metadata being made available and, to a much lesser extent, to just a few prevalent entity types (people, organizations, products) [Bizer et al., 2013]. As a consequence, we still face a chicken-and-egg problem in truly realizing the vision of a Web where large parts of the information are available in structured formats and semantically annotated. As long as no substantial amount of content is available in semantic representations, search engines will not pick up this information; and without better search capabilities, publishers are not inclined to make the additional effort of providing semantic annotations for their content. The latter is particularly true for unstructured and semi-structured content, which is much more difficult to annotate than structured content from relational databases (where merely some templates have to be adapted in order to provide, e.g., RDFa). RDFaCE and conTEXT can help to overcome this problem, since they provide instant benefits (i.e. SEO and text analytics views) to users for creating comprehensive semantic annotations.

Democratizing NLP usage. With conTEXT, natural language processing technology is made more accessible, so that ordinary users can apply sophisticated text analytics with just a few clicks. RDFaCE and Pharmer allow ordinary users to exploit NLP for automatic content annotation with one click (or even in real time, without any click, while the user is writing). This was achieved by abstracting from particular technologies (e.g. by using the NIF format) and by supporting sophisticated content visualization and exploration employing the WYSIWYM model and the data-driven document metaphor. As a result, ordinary users can observe the power of NLP and semantic technologies with minimal effort. By directly showing the effect of semantic annotations and demonstrating the benefits for improved navigation, exploration and search, users will gain a better understanding of recent technology advances.

Harnessing the power of feedback loops. Thomas Goetz states in his influential WIRED Magazine article [Goetz, 2011]: 'Provide people with information about their actions in real time, then give them a chance to change those actions, pushing them toward better behaviors.' With RDFaCE and conTEXT, we give users direct feedback on what information can be extracted from their works using NLP services. At the same time, we want to incorporate their feedback and revisions of the automatic semantic annotations back into the NLP processing loop. Incorporating user feedback has so far not been much in the focus of the NLP community. With RDFaCE and conTEXT, we aim to contribute to changing this. We argue
that NLP technology achieving, for example, 90% precision, recall or F-measure might not fulfill the requirements of a number of potential use cases. If we can increase the quality of NLP through user feedback, we might be able to substantially extend the range of potential NLP applications. The user feedback here serves two purposes: on the one hand, it directly increases the quality of the semantic annotation; on the other hand, it can serve as input for active learning techniques, which can further boost the precision and recall of the semantic annotation.

Enabling the collaborative creation of structured multilingual educational content. The creation of high-quality educational content is a time- and resource-consuming task. The task requires even more resources if there is a need to offer the content in different languages. With SlideWiki, we propose to exploit the power of the crowd to author semantically structured educational content available in a number of different languages. By employing the WYSIWYM model and by semantically structuring the content, we split the content into reusable elements in such a way that each of them fully covers an individual piece of knowledge, thereby enabling educational content reuse and re-purposing on the Web.
9.4. Limitations and Future Directions

The work presented in this thesis included both research and engineering parts. In the following, we describe the limitations and future work with regard to the main contributions of this thesis:

UIs for semantic content authoring. While there are many benefits to systematic reviews, they also bear some limitations and validity threats originating from human error. The main threats to the validity of our systematic review are twofold: the correct and thorough selection of the studies to be included, as well as the accurate and exhaustive selection of quality attributes together with their corresponding UI features. With the increasing number of works in the area of semantic content authoring, we cannot guarantee to have captured all the material in this area. The scope of our review is restricted to the scientific domain; therefore, some tools or approaches employed in industry might not have been included in our primary studies. Furthermore, since the review process was mainly performed by one researcher, a bias is possible. In order to mitigate a potential subjective bias, the review protocol and results were checked and validated by a senior researcher and other colleagues experienced in the context of the Semantic Web. As future work, we envision strategies to semi-automatically improve the realization of the quality attributes discussed in Section 3.9, for example, using active machine learning for better integration with approaches delivering automatic suggestions. Extending the support for the integration of multi-media and multi-modal semantic annotation (e.g. of images and multimedia content) is also a promising research direction. Addressing open research and technology challenges
such as accessibility, handling complexity in UIs, formal and systematic methods for user interface evaluation, and UIs for ubiquitous devices are other interesting areas for future research.

WYSIWYM model. With regard to the limitations of our proposed WYSIWYM model, although we attempted to make the bindings fairly complete, new UI elements might be developed or additional data models (or variations of the ones considered) might appear. In this case, the bindings would have to be updated. As future work, we envision adopting a model-driven approach to enable the automatic implementation of WYSIWYM interfaces based on user-defined preferences. This will help to reuse, re-purpose and choreograph WYSIWYM UI elements to accommodate the needs of dynamically evolving information structures and ubiquitous interfaces. We also aim to bootstrap an ecosystem of WYSIWYM instances and UI elements to support structure encoded in different modalities, such as images and videos. Creating live and context-sensitive WYSIWYM interfaces, which can be generated on-the-fly based on a ranking of the available UI elements, is another promising research avenue.

Integrating semantic content authoring into the current content authoring tools. Integrating SCA systems into other applications, like speech recognition and question-answering systems, to improve the accuracy and quality of results is an important area of future work in this context. At the moment, intelligent mobile assistants (e.g. Siri1 for the iPhone) only allow the delegation of certain programmed tasks (e.g. making restaurant reservations, getting movie tickets, etc.) by invoking certain predefined web services. Employing semantically-enriched content in the UI of mobile personal agents will extend their capability to query the open Web of Data, thereby achieving more efficient and effective results.

1 http://www.siri.com

Exploiting semantically-enriched content for content analysis and instant user gratification. In the future, we plan to extend the work on conTEXT in several directions:

• Enhancing the NLP feedback. We aim to investigate how user feedback can be used across different corpora. We consider the harnessing of user feedback by NLP services an area with great potential to attain further boosts in annotation quality. In a related direction, we plan to integrate revisioning functionality, where users can manipulate complete sets of semantic annotations instead of just individual ones. In that regard, we envision that conTEXT can assume a similar position for text corpora as data cleansing tools such as OpenRefine have for structured data.

• Creating a flexible end-user NLP ecosystem. At the moment, the conTEXT platform relies mainly on the DBpedia Knowledge Base (KB) to extract
named entities and to provide different views for text analytics. Nevertheless, there are many use cases that require changing or extending the underlying KB to provide more elaborate and domain-specific views on content. Providing users with mechanisms to modify the underlying KB with minimal effort will have a high impact on the conTEXT end-user NLP ecosystem. In the envisioned ecosystem, users can either create their own KBs or reuse existing KBs provided by knowledge engineers and domain experts. Once the underlying KB changes, all other components (e.g. NER tools or analytics views) must adapt to this change. Also in this direction, we plan to provide conTEXT as a composite Web service, where each component of the system, such as input processors, NLP services and analytics views, can be created, shared and reused by Web users in a modular way. In this dynamic and flexible ecosystem, a user's content is continuously ingested and processed, the user is informed about updates, and the semantic representations of the content thus evolve along with the content itself.

Applying crowdsourcing & collaborative authoring techniques to the process of semantic content authoring. Our first direction for future work in this context is to implement a completely SCORM-compliant LCMS and authoring tool based on SlideWiki. This will allow us to exchange content with other SCORM-compliant LCMSs. Also, in a real e-learning scenario, learners come from different environments and have different ages and educational backgrounds. It is crucial to address these heterogeneities in user profiles when enhancing the crowdlearning concept. New approaches should provide the possibility to personalize the learning process; thus, our second direction is to provide personalized content based on initial user assessments. The third direction for future work is to support the annotation of learning objects using standard metadata schemes. We aim to implement the LRMI2 metadata schemes to facilitate end-user search and discovery of educational resources.

Exploiting semantically-enriched content for content interoperability in domain-specific use cases. Regarding future work, we envision extending the Pharmer application towards different modalities, so that the annotation of images and other medical objects is supported. Furthermore, we aim to integrate other existing linked open datasets (e.g. related to publications, laboratories or insurance documents) into Pharmer to extend its range of stakeholders.
2 Learning Resource Metadata Initiative: www.lrmi.net/
Appendix A
Software Release History

The following software releases were made during the thesis:

• RDFaCE (https://bitbucket.org/ali1k/rdface & http://wordpress.org/plugins/rdface/)
  – 0.4 - released 2014-5-11 - bug fixes, compatibility with TinyMCE 4.0 and WordPress 3.9
  – 0.3 - released 2013-4-15 - Schema.org edition
  – 0.2 - released 2012-3-6 - RDFaCE-Lite with support for the rNews vocabulary
  – 0.1 - released 2011-7-8 - initial version
• conTEXT (https://github.com/AKSW/context)
  – 0.3 - released 2014-5-1 - support for replacing the DBpedia ontology with the user's ontology, support for selecting a subgraph of DBpedia
  – 0.25 - released 2014-4-20 - support for real-time analysis of Twitter streams
  – 0.2 - released 2014-4-1 - support for social media sign-in; added Twitter, LinkedIn, Facebook and G+ input sources; added sentiment analysis view
  – 0.1 - released 2014-1-17 - initial version
• SlideWiki (https://github.com/AKSW/SlideWiki)
  – 0.1 - released 2013-9-24 - initial version
• Pharmer (https://code.google.com/p/pharmer)
  – 0.1 - released 2012-12-1 - initial version
Appendix B
Curriculum Vitae

Ali Khalili
Tarostrasse 12/212, 04103 Leipzig, Germany
(+49) 17639002240
[email protected]
http://ali1k.com

Personal Data
Name: Ali Khalili
Birth date: June 26th, 1984
Birth place: Karaj, Iran
Nationality: Iranian
Marital status: Married
Education

2011 – Present: University of Leipzig (Leipzig, Germany)
Ph.D., Faculty of Mathematics and Computer Science, Department of Computer Science.
Thesis title: A Semantics-based User Interface Model for Content Annotation, Authoring and Exploration.

2009 – 2010: VU University Amsterdam (Amsterdam, Netherlands)
Research Assistant, Faculty of Computer Science, Department of Information Management & Software Engineering.
Project title: Studying Organizational Social Structures for Knowledge Sharing in the Process of Service-Oriented Design.

2007 – 2009: Khaje Nasir University of Technology (Tehran, Iran)
M.Sc., Faculty of Industrial Engineering, Department of Information Technology (E-Commerce).
Thesis title: Semi-automatic Creation of Enterprise Mashups using Semantic Descriptions. Excellent grade (20/20)

2003 – 2007: Shahid Beheshti University (Tehran, Iran)
B.Sc., Faculty of Electrical and Computer Engineering, Department of Software Engineering.
Thesis title: Design and Implementation of Conference Management Systems. Excellent grade (20/20)
Research Interests
• Semantic Web
• Human-Computer Interaction
• Text Analytics
• Service-oriented Design, Web Services and Mashups
Selected Publications
1. Ali Khalili and Sören Auer. WYSIWYM – Integrated Visualization, Exploration and Authoring of Semantically Enriched Un-structured Content. Semantic Web Journal, 2014.
2. Ali Khalili, Sören Auer and Axel C.N. Ngomo. conTEXT – Lightweight Text Analytics using Linked Data. 11th Extended Semantic Web Conference (ESWC 2014), pages 628-643, 2014.
3. Darya Tarasowa, Sören Auer, Ali Khalili, and Jörg Unbehauen. Crowdsourcing (semantically) Structured Multilingual Educational Content (CoSMEC). Open Praxis Journal, 6(2), 2014.
4. Timofey Ermilov, Ali Khalili, and Sören Auer. Ubiquitous Semantic Applications: A Systematic Literature Review. International Journal on Semantic Web and Information Systems (IJSWIS), 10(1), 2014.
5. Ali Khalili and Bita Sedaghati. A WYSIWYM Interface for Semantic Enrichment of E-prescriptions using Linked Open Drug Data. International Journal On Advances in Life Sciences, 5(3,4), 2013.
6. Ali Khalili and Sören Auer. User Interfaces for Semantic Authoring of Textual Content: A Systematic Literature Review. Journal of Web Semantics: Science, Services and Agents on the World Wide Web, 22(1), 2013.
7. Ali Khalili and Sören Auer. WYSIWYM Authoring of Structured Content based on Schema.org. The 14th International Conference on Web Information Systems Engineering (WISE 2013), volume 8181 of Lecture Notes in Computer Science, pages 425-438. Springer Berlin Heidelberg, 2013.
8. Sören Auer, Ali Khalili, and Darya Tarasowa. Crowd-sourced OpenCourseWare Authoring with SlideWiki.org. International Journal of Emerging Technologies in Learning (iJET), 8(1), 2013.
9. Darya Tarasowa, Ali Khalili, Sören Auer, and Jörg Unbehauen. CrowdLearn: Crowd-sourcing the Creation of Highly-Structured E-learning Content. 5th International Conference on Computer Supported Education (CSEDU 2013), pages 33-42. SciTePress, 2013.
10. Ali Khalili and Bita Sedaghati. Semantic Medical Prescriptions – Towards Intelligent and Interoperable Medical Prescriptions. In IEEE Seventh International Conference on Semantic Computing (ICSC), 2013, pages 347-354.
11. Darya Tarasowa, Ali Khalili, and Sören Auer. CrowdLearn: Collaborative Engineering of (semi-)Structured Learning Objects. In Proceedings of the International Conference on Knowledge Engineering and Semantic Web (KESW), 2012.
12. Ali Khalili, Sören Auer, Darya Tarasowa, and Ivan Ermilov. SlideWiki: Elicitation and Sharing of Corporate Knowledge using Presentations. The 18th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2012), volume 7603 of Lecture Notes in Computer Science, pages 302-316. Springer Berlin Heidelberg, 2012.
13. Ali Khalili, Sören Auer, and Daniel Hladky. The RDFa Content Editor – from WYSIWYG to WYSIWYM. In Computer Software and Applications Conference (COMPSAC 2012), IEEE 36th Annual, pages 531-540. 2012.
14. Shahriar Mohammadi, Ali Khalili, and Sarah Ashoori. Using an Enterprise Mashup Infrastructure for Just-in-Time Management of Situational Projects. In International Conference on e-Business Engineering (ICEBE), 2009.
15. Ali Khalili and Shahriar Mohammadi. Using Logically Hierarchical Meta Web Services to Support Accountability in Mashup Services. In IEEE Asia-Pacific Services Computing Conference (APSCC), 2008, pages 410-415.
16. Ali Khalili, A.H. Badrabadi, and Farid Khoshalhan. A Framework for Distributed Market Place based on Intelligent Software Agents and Semantic Web Services. In IEEE Congress on Services Part II, 2008, pages 141-148.
17. Shahriar Mohammadi and Ali Khalili. A Semantic Web Service-oriented Model for Project Management. In Computer and Information Technology (CIT) Workshops, 2008.
Honors and Awards
• Best-paper award of the 36th IEEE Signature Conference on Computers, Software, and Applications (COMPSAC) 2012 for the paper "The RDFa Content Editor - From WYSIWYG to WYSIWYM".
• Creative Innovation Project Award 2014 for OpenCourseWare Excellence from OCW Consortium (for developing the SlideWiki OpenCourseWare platform).
• 1st Prize of the AI Mashup Challenge 2014 (for the conTEXT Mashup platform).
• Best-application prize at the WoLE2013 challenge (Doing Good by Linking Entities), WWW2013 workshops (for the Pharmer project).
• Best-poster prize awarded at the Leipzig Research Festival for Life Sciences 2012 (for the Pharmer project).
• Nominated for the best-paper award of the 5th International Conference on Computer Supported Education (CSEDU 2013) for the paper "CrowdLearn: Crowd-sourcing the Creation of Highly-structured E-Learning Content".
• Nominated for the best-paper award of the 18th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2012) for the paper "SlideWiki: Elicitation and Sharing of Corporate Knowledge using Presentations".
• Elsevier Travel Grant to attend the Beyond the PDF 2 conference, Amsterdam, Netherlands (March 2013).
• Awarded a DAAD (German Academic Exchange Service) scholarship for PhD studies in Germany, 2012.
• Awarded LOD2 project funding for one year of research at the University of Leipzig, 2011.
• Awarded an NWO scholarship plus S-Cube project funding for one year of research at VU University Amsterdam, 2010.
• First-ranked student among K.N.Toosi Master students of IT, 2009.
• Awarded by the Iran Research Institute of ICT for the MSc project (Semantic Enterprise Mashups), 2009.
• Rank 33 in the IT Master Entrance Exam among 7,000 participants in Iran, in 2007.
• Awarded by the Computer Society of Iran for the BSc project (Design and Implementation of the CSICC2007 Conference Management System), 2007.
Technical and Programming Skills
• Programming languages:
  – PHP, JavaScript (Professional).
  – .NET, VBScript, NodeJS, Java, Ruby, C/C++ (Intermediate).
• Database systems:
  – MySQL, MongoDB, SQL Server
Projects
• SlideWiki: http://slidewiki.org
  A platform for collaborative authoring of OpenCourseWare.
• conTEXT: http://context.aksw.org
  A platform for lightweight text analytics using Linked Data.
• RDFaCE: http://rdface.aksw.org
  A WYSIWYM interface for authoring of semantic content.
• Pharmer: http://bitili.com/pharmer
  A WYSIWYM interface for authoring of semantic medical prescriptions.
Language Skills
• Persian: Native
• English: Advanced
• German: Intermediate (ZD B1 Certificate)
• Familiar with Arabic, Turkish and Dutch.
Research Community Service
• Program Committee for the Knowledge Engineering and Semantic Web (KESW) conference, the Linked Data on the Web (LDOW) workshop and the NLP-DBpedia workshop
• Reviewer for the ISWC, ESWC, WIMS and LREC conferences, the Semantic Web Journal (SWJ) and the International Journal On Semantic Web and Information Systems (IJSWIS)
List of Abbreviations
API Application Programming Interface, pp. 38, 80, 82–84, 87–89, 91, 92, 95, 99, 106, 107, 109, 114
ATAG Authoring Tool Accessibility Guidelines, p. 59
ATR Automatic Term Recognition, pp. 25, 26
BOA BOotstrapping linked datA, p. 107
CMS Content Management System, pp. 7, 81, 91, 156
CSS Cascading Style Sheets, pp. 63, 83, 95, 134
CURIE Compact URI, p. 18
D3 Data-Driven Document, pp. 108, 114
DOM Document Object Model, pp. 82–84
DSL Domain-Specific Language, pp. 124, 125
EL Entity Linking, pp. 25, 26
ER Entity-Relation, pp. 10, 63, 121
FOAF Friend Of A Friend, pp. 48, 68
FOX Federated knOwledge eXtraction Framework, p. 107
HCI Human Computer Interaction, pp. 35, 75
IRI Internationalized Resource Identifier, p. 18
JSON JavaScript Object Notation, pp. 17, 95, 114, 135, 137
KB Knowledge Base, pp. 26, 27, 161, 162
KE Keyword Extraction, pp. 25, 26
KR Knowledge Representation, p. 4
LCMS Learning Content Management Systems, pp. 126, 127, 136, 162
LD Linked Data, pp. 17, 27, 106, 141–143
LOD Linked Open Data, pp. 141–143, 145, 151
LODD Linked Open Drug Data, pp. 8, 140, 142, 143, 150, 152
LOM Learning Objects Metadata, p. 136
LOR Learning Objects Repository, pp. 135, 136
MVC Model-View-Controller, pp. 114, 121, 125, 131, 136, 137
NER Named Entity Recognition, pp. 10, 25, 26, 103, 162
NIF NLP Interchange Format, pp. 9, 27, 108, 114, 159
NLP Natural Language Processing, pp. 2, 3, 7, 9, 10, 25, 27, 38, 78, 80, 83, 84, 87–89, 95, 99, 102–104, 107–109, 114, 118, 145, 147, 156, 157, 159–162
OCA One Click Annotation, pp. 57, 64
OCW OpenCourseWare, p. 126
OPML Outline Processor Markup Language, p. 135
OWL Web Ontology Language, pp. 27, 35
PbE Programming by Example, p. 43
POS Part-Of-Speech, p. 25
QUIM Quality in Use Integrated Measurement, p. 42
RDBMS Relational Database Management System, p. 25
RDF Resource Description Framework, pp. 8, 12–19, 24, 25, 27, 35, 47, 56–58, 63, 68, 69, 83, 100, 105, 107, 119, 122, 124, 136, 143, 146
RDFa Resource Description Framework in Attributes, pp. 18, 19, 35, 38, 48, 54, 58, 67, 69, 82–84, 86, 87, 92, 95, 96, 98–100, 109, 146, 159
RSS Rich Site Summary, pp. 106, 135
SCA Semantic Content Authoring, pp. 5, 6, 9, 28, 32, 35, 39, 41–49, 52, 54, 59–61, 99, 155–157, 161
SCAUI Semantic Content Authoring User Interface, pp. 35, 58–60
SCORM Sharable Content Object Reference Model, pp. 126, 162
SDM Shared Decision Making, pp. 149, 150
SEO Search Engine Optimization, pp. 8, 110, 111, 159
SIOC Semantically-Interlinked Online Communities, pp. 48, 68
SKOS Simple Knowledge Organization System, pp. 1, 35, 48, 68
SPARQL SPARQL Protocol and RDF Query Language, pp. 9, 12, 24, 25, 35, 107, 120, 124, 145, 147
SUS System Usability Scale, pp. 117, 137, 153
UI User Interface, pp. 3, 5–9, 28, 32, 39, 41–43, 45, 46, 48, 52, 54–66, 68, 71, 73–75, 78, 79, 82, 83, 86, 95, 98, 102, 105, 110, 146, 155, 156, 158, 160, 161
URI Uniform Resource Identifier, pp. 13–15, 18, 24, 27, 46, 68, 78, 83, 84, 97, 98, 107, 114, 141, 142
URL Uniform Resource Locator, pp. 14, 19, 20, 136
W3C World Wide Web Consortium, pp. 12, 13, 20, 24, 142
WCAG Web Content Accessibility Guidelines, p. 59
WKF Wikification, pp. 25, 26
WYSIWYG What You See Is What You Get, pp. 7, 9, 58, 74, 80, 81, 85, 86, 101, 132, 156
WYSIWYM What You See Is What You Mean, pp. 2, 7, 9, 10, 62–67, 78–81, 86, 87, 96, 101, 102, 109, 119, 132, 139, 140, 146, 154, 156–161
List of Tables

2.1. Sample RDF statements. 16
3.1. List of quality attributes together with their corresponding UI features suggested for SCA systems. 40
3.2. Relation between usability factors and criteria ('+' indicates the positive effect of a criteria on usability factors). 42
3.3. User types, domain and authoring approach of the surveyed SCA systems. 51
3.4. User interface evaluation methods. 53
3.5. Comparison of OntoWiki, SAHA 3, Loomp according to the quality attributes. 55
5.1. Recall, Precision and F-score for each API and combined APIs. 90
5.2. Participants level of knowledge. 98
5.3. Usability factors derived from the survey. 99
6.1. NLP Feedback parameters. 109
6.2. conTEXT's extensible architecture supports a variety of plug-able components for various processing and interaction stages. 113
List of Figures

1.1. A simple user scenario to exploit semantically-enriched content.
1.2. Summary of research questions and key contributions.
1.3. Overview of the chapters together with their corresponding research & application artifacts. 11
2.1. Semantic Web technology stack. 13
2.2. RDF statement represented as a directed graph. 14
2.3. Small knowledge base about Ali Khalili represented as a graph. 16
2.4. Sample RDF/XML format. 17
2.5. Sample N3 format. 17
2.6. Sample JSON-LD format. 18
2.7. Sample RDFa format. 19
2.8. Sample Microdata format. 20
2.9. Excerpt of the DBpedia ontology. 21
2.10. Level of expressiveness of ontologies (source: [Schaffert, 2006]). 22
2.11. An example schema (LocalBusiness) from Schema.org. 24
2.12. SPARQL query to get the homepage of Ali Khalili's current project. 25
2.13. Examples of information extraction subtasks (source: [Mendes, 2013]). 26
2.14. An example of NIF integration (source: [Hellmann et al., 2013]). 27
3.1. Steps followed to scope the search results. 30
3.2. The screenshot of the coding software showing the generated list of codes from the primary studies. 31
3.3. Publications per year. 33
3.4. Semantic content authoring ecosystem. 34
3.5. Top-Down and Bottom-Up approaches for semantic content authoring. 36
3.6. Quality attributes dependencies ('+': positive effect, '+-': reciprocal effect). 49
3.7. Screenshot of the OntoWiki instance view with inline editing. 56
3.8. Screenshot of the SAHA 3 inline editing. 57
3.9. Screenshot of the Loomp faceted viewing UI. 58
4.1. Schematic view of the WYSIWYM model. 65
4.2. Comparison of existing visual mapping techniques in terms of semantic expressiveness and complexity of visual mapping. 67
4.3. Screenshots of user interface techniques for visualization and exploration: 1-framing using borders, 2-framing using backgrounds, 3-video subtitle, 4-line connectors and arrow connectors, 5-bar layouts, 6-text formatting, 7-image color effects, framing and line connectors, 8-expandable callout, 9-marking with icons, 10-tooltip callout, 11-faceting. 70
4.4. Possible bindings between user interface and semantic representation model elements. 77
5.1. RDFaCE system architecture. 82
5.2. Annotation user interface. 85
5.3. The four views for semantic text authoring. 86
5.4. RDFaCE WYSIWYM implementation (T6 indicates the RDFaCE menu bar, V1 – the framing of named entities in the text, V9 – a callout showing additional type information, T5 – a context menu for revising annotations). 87
5.5. Generated results of different NLP APIs for article #1. 88
5.6. Avg. Precision, Recall and F-score for each API & their combination. 89
5.7. Screenshot of RDFaCE integrated into WordPress. 92
5.8. Architecture of RDFaCE-Lite. 93
5.9. Screenshot of RDFaCE-Lite with support for rNews. 94
5.10. Configuration steps in RDFaCE Schema.org edition. 95
5.11. Search results improved by rich snippets. A: enhanced recipe, B: normal recipe, C: browsing recipes by ingredients, cook time and calories. 96
5.12. Example of Microdata annotations generated by RDFaCE. 97
5.13. Using RDFaCE to annotate recipes based on Schema.org. 98
5.14. Results of usability test. (top) Number of annotations per user. (bottom) Annotation time per user. 100
5.15. Comparison of RDFauthor, SAHA 3, Loomp and RDFaCE according to the quality attributes. 101
6.1. Flexibility of user interfaces and targeted user groups as well as genericity (circle size) and degree of structure (circle color) for various analytics platforms. 104
6.2. Text analytics workflow in conTEXT. 107
6.3. Screenshots of the conTEXT WYSIWYM interface (T2 indicates the inline editing UI, V1 – the framing of named entities in the text, V2 – text margin formatting for visualizing hierarchy, V7 – line connectors to show the relation between entities, V9 – a callout showing additional type information, X2 – faceted browsing, H3 – recommendation for NLP feedback). 109
6.4. Example of realtime semantic analysis in conTEXT. 110
6.5. Different views on an analyzed corpus: 1) faceted browser, 2) matrix view, 3) sentiment view, 4) image view, 5) tag cloud, 6) chordal graph view, 7) map view, 8) timeline, 9) trend view. 112
6.6. conTEXT data model. 114
6.7. Generated semantic annotations represented in NIF/JSON. 115
6.8. conTEXT task evaluation platform: Left – task view showing the tasks assigned to an evaluation subject, Right – individual task. 115
6.9. Avg. Jaccard similarity index for answers using & without the conTEXT. 117
6.10. Avg. time spent (in seconds) for finding answers using & without the conTEXT. 117
6.11. Result of conTEXT usability evaluation using SUS questionnaire. 118
7.1. Schematic view of the WikiApp data model. 120
7.2. Instantiation of the WikiApp DSL representing the SlideWiki model. 124
7.3. Generated database schema by Wikifier. 125
7.4. Crowdlearning strategies in SlideWiki. 126
7.5. SlideWiki ecosystem for organizational knowledge sharing. 130
7.6. Bird's eye view on the SlideWiki MVC architecture. 132
7.7. Screenshots of the SlideWiki WYSIWYM interface (V2 – text margin formatting for visualizing content tree, V7 – line connectors to show the relation between slides and decks, X4 – expanding & drilling down to explore content, T4 – drag & drop to change the order of slides and decks, T6 – floating ribbon editing to author slide content, H5 – collaboration and crowdsourcing helper components). 133
7.8. Decision flow during the creation of new slide and deck revisions. 134
7.9. Editing of a question & Test mode in SlideWiki. 135
7.10. Mapping of URLs to MVC actions in SlideWiki. 136
7.11. A screenshot of the Semantic Web lecture series created collaboratively on SlideWiki. 138
7.12. Result of SlideWiki usability evaluation using SUS questionnaire. 139
8.1. Available datasets related to life sciences and pharmaceutical research. 142
8.2. Bottom-up semantic enrichment of prescriptions. 144
8.3. Architecture of the Pharmer system. 145
8.4. Pharmer WYSIWYM implementation (V1 – highlighting of drugs through framing, V9 – additional information about a drug in a callout, T1/T2 combined form and inline editing of electronic prescriptions). 146
8.5. Screenshot of the Pharmer application (top-left: general view, top-right: drug information view, bottom-left: prescription authoring view, bottom-right: drug interaction-finder results). 148
8.6. Graph view in Pharmer. 149
8.7. Screenshot of Pharmer mobile application. 150
8.8. Pharmer ecosystem. 151
8.9. Usability evaluation results for Pharmer. 154
9.1. User-friendly manual & semi-automatic creation of rich semantic content. 158
Bibliography

[ADL, 2011a] ADL (2011a). SCORM 2004 4th edition specification. http://www.adlnet.gov/scorm/scorm-2004-4th/.
[ADL, 2011b] ADL (2011b). SCORM users guide for programmers. http://www.adlnet.gov/wp-content/uploads/2011/12/SCORM_Users_Guide_for_Programmers.pdf.
[Adrian et al., 2010] Adrian, B., Hees, J., Herman, I., Sintek, M., and Dengel, A. (2010). Epiphany: Adaptable RDFa generation linking the web of documents to the web of data. In Cimiano, P. and Pinto, H., editors, Knowledge Engineering and Management by the Masses, volume 6317 of Lecture Notes in Computer Science, pages 178–192. Springer Berlin / Heidelberg.
[Ankolekar et al., 2007] Ankolekar, A., Krötzsch, M., Tran, T., and Vrandecic, D. (2007). The two cultures: mashing up web 2.0 and the semantic web. In WWW '07: Proceedings of the 16th international conference on World Wide Web, pages 825–834, New York, NY, USA. ACM Press.
[Araujo et al., 2010] Araujo, S., Houben, G.-J., and Schwabe, D. (2010). Linkator: Enriching web pages by automatically adding dereferenceable semantic annotations. In Web Engineering, volume 6189 of Lecture Notes in Computer Science, pages 355–369. Springer.
[Auer et al., 2012a] Auer, S., Bühmann, L., Dirschl, C., Erling, O., Hausenblas, M., Isele, R., Lehmann, J., Martin, M., Mendes, P., Nuffelen, B., Stadler, C., Tramp, S., and Williams, H. (2012a). Managing the life-cycle of linked data with the LOD2 stack. In Cudré-Mauroux, P., Heflin, J., Sirin, E., Tudorache, T., Euzenat, J., Hauswirth, M., Parreira, J., Hendler, J., Schreiber, G., Bernstein, A., and Blomqvist, E., editors, The Semantic Web – ISWC 2012, Lecture Notes in Computer Science, pages 1–16. Springer Berlin Heidelberg.
[Auer et al., 2012b] Auer, S., Demter, J., Martin, M., and Lehmann, J. (2012b). LODStats – an extensible framework for high-performance dataset analytics. In Teije, A., Völker, J., Handschuh, S., Stuckenschmidt, H., d'Acquin, M., Nikolov, A., Aussenac-Gilles, N., and Hernandez, N., editors, Knowledge Engineering and Knowledge Management, volume 7603 of Lecture Notes in Computer Science, pages 353–362. Springer Berlin Heidelberg.
[Auer et al., 2009] Auer, S., Dietzold, S., Lehmann, J., Hellmann, S., and Aumueller, D. (2009). Triplify: Light-weight linked data publication from relational databases. In WWW2009, Spain. ACM.
[Auer et al., 2006] Auer, S., Dietzold, S., and Riechert, T. (2006). OntoWiki – a tool for social, semantic collaboration. In Cruz, I., Decker, S., Allemang, D., Preist, C., Schwabe, D., Mika, P., Uschold, M., and Aroyo, L., editors, The Semantic Web – ISWC 2006, volume 4273 of Lecture Notes in Computer Science, pages 736–749. Springer Berlin / Heidelberg.
[Auer et al., 2013] Auer, S., Khalili, A., and Tarasowa, D. (2013). Crowd-sourced OpenCourseWare authoring with slidewiki.org. International Journal of Emerging Technologies in Learning (iJET), 8(1).
[Beckett, 2004] Beckett, D. (2004). RDF/XML syntax specification (revised). http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/.
[Benson et al., 2010] Benson, E., Marcus, A., Howahl, F., and Karger, D. (2010). Talking about data: Sharing richly structured information through blogs and wikis. In The Semantic Web – ISWC 2010, volume 6496 of Lecture Notes in Computer Science, pages 48–63. Springer.
[Berners-Lee and Connolly, 2011] Berners-Lee, T. and Connolly, D. (2011). Notation3 (N3): A readable RDF syntax. http://www.w3.org/TeamSubmission/n3/.
[Berners-Lee et al., 2001] Berners-Lee, T., Hendler, J., and Lassila, O. (2001). The semantic web. Scientific American, 284(5):34–43.
[Berners-Lee et al., 2007] Berners-Lee, T., Hollenbach, J., Lu, K., Presbrey, J., Prud'hommeaux, E., and m.c. schraefel (2007). Tabulator redux: Writing into the semantic web. http://eprints.ecs.soton.ac.uk/14773/. Tabulator Redux tech report.
[Bishop et al., 2011] Bishop, B., Kiryakov, A., Ognyanoff, D., Peikov, I., Tashev, Z., and Velkov, R. (2011). OWLIM: A family of scalable semantic repositories. Semantic Web, 2(1):1–10.
[Bizer et al., 2013] Bizer, C., Eckert, K., Meusel, R., Mühleisen, H., Schuhmacher, M., and Völker, J. (2013). Deployment of RDFa, microdata, and microformats on the web – a quantitative analysis. In 12th International Semantic Web Conference, 21-25 October 2013, Sydney, Australia, In-Use track.
[Bizer et al., 2009] Bizer, C., Heath, T., and Berners-Lee, T. (2009). Linked Data – The Story So Far. International Journal on Semantic Web and Information Systems (IJSWIS), 5(3):1–22.
[Bizer and Schultz, 2009] Bizer, C. and Schultz, A. (2009). The Berlin SPARQL benchmark. Int. J. Semantic Web Inf. Syst., 5(2):1–24.
[Blankenship and Ruona, 2009] Blankenship, S. and Ruona, W. (2009). Exploring knowledge sharing in social structures: Potential contributions to an overall knowledge management strategy. Advances in Developing Human Resources, 11(3).
[Bostock et al., 2011] Bostock, M., Ogievetsky, V., and Heer, J. (2011). D3 data-driven documents. Visualization and Computer Graphics, IEEE Transactions on, 17(12):2301–2309.
[Breslin et al., 2009] Breslin, J., Passant, A., and Decker, S. (2009). The Social Semantic Web. Springer-Verlag, Heidelberg.
[Broekstra et al., 2002] Broekstra, J., Kampman, A., and van Harmelen, F. (2002). Sesame: A generic architecture for storing and querying RDF and RDF schema. In ISWC, number 2342 in LNCS, pages 54–68. Springer.
[Buffa et al., 2008] Buffa, M., Gandon, F., Ereteo, G., Sander, P., and Faron, C. (2008). SweetWiki: A semantic wiki. Web Semantics: Science, Services and Agents on the World Wide Web, 6(1):84–97.
[Burel et al., 2009] Burel, G., Cano, A. E., and Lanfranchi, V. (2009). Ozone browser: Augmenting the web with semantic overlays. Volume 449 of CEUR WS Proceedings.
[Camara et al., 1999] Câmara, G., Souza, R. C. M., Monteiro, A. M., Paiva, J., and Garrido, J. (1999). Handling complexity in GIS interface design. In Proceedings of the I Brazilian Workshop on GeoInformatics, Campinas, São Paulo.
[Casey and Olivera, 2011] Casey, A. J. and Olivera, F. (2011). Reflections on organizational memory and forgetting. Journal of Management Inquiry, 20(3):305–310.
[Chang et al., 2013] Chang, K. S.-P., Myers, B. A., Cahill, G. M., Simanta, S., Morris, E., and Lewis, G. (2013). Improving structured data entry on mobile devices. In Proceedings of the 26th Annual ACM Symposium on User Interface Software and Technology, UIST '13, pages 75–84, New York, NY, USA. ACM.
[Charles et al., 1997] Charles, C., Gafni, A., and Whelan, T. (1997). Shared decision-making in the medical encounter: What does it mean? (or it takes at least two to tango). Social Science & Medicine, 44(5):681–692.
[Chen and Babar, 2011] Chen, L. and Babar, M. A. (2011). A systematic review of evaluation of variability management approaches in software product lines. Information & Software Technology, 53(4):344–362.
[Chu et al., 2009] Chu, H.-C., Chen, M.-Y., and Chen, Y.-M. (2009). A semantic-based approach to content abstraction and annotation for content management. Expert Systems with Applications, 36(2, Part 1):2360–2376.
[Clark et al., 2008] Clark, K. G., Feigenbaum, L., and Torres, E. (2008). SPARQL Protocol for RDF. World Wide Web Consortium, Recommendation REC-rdf-sparql-protocol-20080115, http://www.w3.org/TR/2008/REC-rdf-sparql-protocol-20080115.
[Cobb and Steele, 2011] Cobb, J. and Steele, C. (2011). Association learning management systems. http://www.tagoras.com/docs/Tagoras-Association-LMS-Report-Overview.pdf.
[Cunningham et al., 2011] Cunningham, H., Maynard, D., Bontcheva, K., Tablan, V., Aswani, N., Roberts, I., Gorrell, G., Funk, A., Roberts, A., Damljanovic, D., Heitz, T., Greenwood, M. A., Saggion, H., Petrak, J., Li, Y., and Peters, W. (2011). Text Processing with GATE (Version 6).
[Dadzie and Rowe, 2011] Dadzie, A.-S. and Rowe, M. (2011). Approaches to visualising linked data: A survey. Semantic Web, 2(2):89–124.
[d'Aquin et al., 2008] d'Aquin, M., Motta, E., Dzbor, M., Gridinoc, L., Heath, T., and Sabou, M. (2008). Collaborative semantic authoring. Intelligent Systems, IEEE, 23(3):80–83.
[Davis et al., 1993] Davis, R., Shrobe, H. E., and Szolovits, P. (1993). What is a knowledge representation? AI Magazine, 14(1):17–33.
[Deligiannidis et al., 2007] Deligiannidis, L., Kochut, K. J., and Sheth, A. P. (2007). RDF data exploration and visualization. In CIMS 2007, pages 39–46. ACM.
[Di Iorio et al., 2010] Di Iorio, A., Musetti, A., Peroni, S., and Vitali, F. (2010). Ontology-driven generation of wiki content and interfaces. New Review of Hypermedia and Multimedia, 16(1-2, SI):9–31.
[Dyba et al., 2007] Dyba, T., Dingsoyr, T., and Hanssen, G. K. (2007). Applying systematic reviews to diverse study types: An experience report. In Proceedings of the First International Symposium on Empirical Software Engineering and Measurement, ESEM '07, pages 225–234, Washington, DC, USA. IEEE Computer Society.
[Ennals et al., 2007] Ennals, R., Brewer, E. A., Garofalakis, M. N., Shadle, M., and Gandhi, P. (2007). Intel mash maker: join the web. SIGMOD Record, 36(4):27–33.
[Erling and Mikhailov, 2007] Erling, O. and Mikhailov, I. (2007). RDF support in the Virtuoso DBMS. In Auer, S., Bizer, C., Müller, C., and Zhdanova, A. V., editors, CSSW, volume 113 of LNI, pages 59–68. GI.
[Ermilov et al., 2011] Ermilov, T., Heino, N., Tramp, S., and Auer, S. (2011). OntoWiki mobile – knowledge management in your pocket. In 8th Extended Semantic Web Conference (ESWC2011).
[Ferrucci and Lally, 2004] Ferrucci, D. and Lally, A. (2004). UIMA: an architectural approach to unstructured information processing in the corporate research environment. Nat. Lang. Eng., 10(3-4):327–348.
[Fitzpatrick, 1998] Fitzpatrick, R. (1998). Strategies for evaluating software usability. Methods, 353(1).
[Fonstad, 2005] Fonstad, N. O. (2005). Tangible purposes and common beacons: The interrelated roles of identity and technology in collaborative endeavors. In OKLC (Organizational Learning, Knowledge and Capabilities).
[Frosterus et al., 2011] Frosterus, M., Hyvönen, E., and Laitio, J. (2011). DataFinland—a semantic portal for open and linked datasets. In The Semantic Web: Research and Applications, volume 6644 of LNCS, pages 243–254. Springer.
[Galanter et al., 2013] Galanter, W., Falck, S., Burns, M., Laragh, M., and Lambert, B. L. (2013). Indication-based prescribing prevents wrong-patient medication errors in computerized provider order entry (CPOE). Journal of the American Medical Informatics Association, 20:477–481.
[Geiger et al., 2011] Geiger, D., Rosemann, M., and Fielt, E. (2011). Crowdsourcing information systems: a systems theory perspective. In Australasian Conference on Information Systems (ACIS) 2011, Sydney, Australia.
[Gerber and Ngonga Ngomo, 2011] Gerber, D. and Ngonga Ngomo, A.-C. (2011). Bootstrapping the linked data web. In 1st Workshop on Web Scale Knowledge Extraction @ ISWC 2011.
[Glaser and Strauss, 1967] Glaser, B. G. and Strauss, A. L. (1967). The Discovery of Grounded Theory: Strategies for Qualitative Research. Aldine de Gruyter, New York, NY.
[Goetz, 2011] Goetz, T. (2011). Harnessing the power of feedback loops. WIRED Magazine.
[González et al., 2011] González, A. R., García-Crespo, Á., Palacios, R. C., Berbís, J. M. G., and Jiménez-Domingo, E. (2011). Using ontologies in drug prescription: The SemMed approach. IJKBO, 1(4):1–15.
Bibliography [Greenfield, 2006] Greenfield, A. (2006). Everyware: The Dawning Age of Ubiquitous Computing. New Riders Publishing, Berkeley, CA. [Haase et al., 2010] Haase, P., Eberhart, A., Godelet, S., Math¨aß, T., Tran, T., Ladwig, G., and Wagner, A. (2010). The information workbench. interacting with the web of data. In 3rd Future Internet Symposium (FIS2010). [Hachey, 2011] Hachey, G. (2011). Semantic web user interface: A systematic survey. Master’s thesis, Athabasca University. [Haller and Abecker, 2010] Haller, H. and Abecker, A. (2010). imapping: a zooming user interface approach for personal and semantic knowledge management. SIGWEB Newsl., pages 4:1–4:10. [Hasida, 2007] Hasida, K. (2007). Semantic authoring and semantic computing. In Sakurai, A., Hasida, K., and Nitta, K., editors, New Frontiers in Artificial Intelligence, volume 3609 of Lecture Notes in Computer Science, pages 137–149. Springer. 10.1007/978-3-540-71009-7-12. [Heese et al., 2010] Heese, R., Luczak-R¨osch, M., Oldakowski, R., Streibel, O., and Paschke, A. (2010). One click annotation. In Scripting and Development for the Semantic Web (SFSW). [Heflin, 2004] Heflin, J. (2004). OWL Web Ontology Language Use Cases and Requirements. Technical report, W3C. [Heino et al., 2009] Heino, N., Dietzold, S., Martin, M., and Auer, S. (2009). Developing semantic web applications with the ontowiki framework. In Networked Knowledge - Networked Media, volume 221 of Studies in Computational Intelligence, pages 61–77. Springer, Berlin / Heidelberg. [Heino et al., 2011] Heino, N., Tramp, S., and Auer, S. (2011). Managing web content using linked data principles – combining semantic structure with dynamic content syndication. In Proceedings of the 35th Annual IEEE International Computer Software and Applications Conference (COMPSAC 2011). IEEE Computer Society. [Heinrich et al., 2012] Heinrich, M., Lehmann, F., Springer, T., and Gaedke, M. (2012). Exploiting single-user web applications for shared editing: a generic transformation approach. In WWW 2012, pages 1057–1066. ACM. [Heitmann et al., 2009] Heitmann, B., Kinsella, S., Hayes, C., and Decker, S. (2009). Implementing semantic web applications: reference architecture and challenges. In 5th International Workshop on Semantic Web-Enabled Software Engineering.
[Hellmann et al., 2013] Hellmann, S., Lehmann, J., Auer, S., and Brümmer, M. (2013). Integrating NLP using linked data. In 12th International Semantic Web Conference, 21-25 October 2013, Sydney, Australia.

[Herzig and Ell, 2010] Herzig, D. and Ell, B. (2010). Semantic MediaWiki in operation: Experiences with building a semantic portal. In The Semantic Web – ISWC 2010, volume 6497 of Lecture Notes in Computer Science, pages 114–128. Springer. 10.1007/978-3-642-17749-1-8.

[Hong and Chi, 2009] Hong, L. and Chi, E. H. (2009). Annotate once, appear anywhere: collective foraging for snippets of interest using paragraph fingerprinting. CHI '09, pages 1791–1794. ACM.

[Howe, 2006] Howe, J. (2006). The rise of crowdsourcing. Wired Magazine, 14(6).

[Huynh et al., 2003] Huynh, D., Quan, D., and Karger, D. R. (2003). User interaction experience for semantic web information. In King, I. and Máray, T., editors, WWW (Posters).

[Huynh et al., 2007] Huynh, D. F., Karger, D. R., and Miller, R. C. (2007). Exhibit: lightweight structured data publishing. WWW '07, pages 737–746, New York, NY, USA. ACM.

[Johnson, 2014] Johnson, J. (2014). Designing with the Mind in Mind, Second Edition: Simple Guide to Understanding User Interface Design Guidelines. Morgan Kaufmann Publishers Inc.

[Jungermann, 2009] Jungermann, F. (2009). Information extraction with RapidMiner. In Proceedings of the GSCL Symposium Sprachtechnologie und eHumanities.

[Kandel et al., 2011] Kandel, S., Paepcke, A., Hellerstein, J., and Heer, J. (2011). Wrangler: interactive visual specification of data transformation scripts. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '11, pages 3363–3372. ACM.

[Karger et al., 2009] Karger, D. R., Ostler, S., and Lee, R. (2009). The web page as a WYSIWYG end-user customizable database-backed information management application. In UIST 2009, pages 257–260. ACM.

[Karger and Quan, 2005] Karger, D. R. and Quan, D. (2005). What would it mean to blog on the semantic web? Web Semantics: Science, Services and Agents on the World Wide Web, 3(2-3):147–157.

[Khalili and Auer, 2013a] Khalili, A. and Auer, S. (2013a). User interfaces for semantic authoring of textual content: A systematic literature review. Web Semantics: Science, Services and Agents on the World Wide Web, 22(0):1–18.
[Khalili and Auer, 2013b] Khalili, A. and Auer, S. (2013b). WYSIWYM authoring of structured content based on Schema.org. In Lin, X., Manolopoulos, Y., Srivastava, D., and Huang, G., editors, The 14th International Conference on Web Information Systems Engineering (WISE 2013), volume 8181 of Lecture Notes in Computer Science, pages 425–438. Springer Berlin Heidelberg.

[Khalili and Auer, 2014] Khalili, A. and Auer, S. (2014). WYSIWYM – integrated visualization, exploration and authoring of semantically enriched un-structured content. Semantic Web Journal.

[Khalili et al., 2012a] Khalili, A., Auer, S., and Hladky, D. (2012a). The RDFa content editor – from WYSIWYG to WYSIWYM. In 2012 IEEE 36th Annual Computer Software and Applications Conference (COMPSAC), pages 531–540.

[Khalili et al., 2014] Khalili, A., Auer, S., and Ngomo, A.-C. N. (2014). conTEXT – lightweight text analytics using linked data. In 11th Extended Semantic Web Conference (ESWC 2014), pages 628–643. Springer International Publishing Switzerland.

[Khalili et al., 2012b] Khalili, A., Auer, S., Tarasowa, D., and Ermilov, I. (2012b). SlideWiki: Elicitation and sharing of corporate knowledge using presentations. In Teije, A., Völker, J., Handschuh, S., Stuckenschmidt, H., d'Aquin, M., Nikolov, A., Aussenac-Gilles, N., and Hernandez, N., editors, The 18th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2012), volume 7603 of Lecture Notes in Computer Science, pages 302–316. Springer Berlin Heidelberg.

[Khalili and Sedaghati, 2013a] Khalili, A. and Sedaghati, B. (2013a). Semantic medical prescriptions – towards intelligent and interoperable medical prescriptions. In IEEE Seventh International Conference on Semantic Computing (ICSC 2013), pages 347–354.

[Khalili and Sedaghati, 2013b] Khalili, A. and Sedaghati, B. (2013b). A WYSIWYM interface for semantic enrichment of e-prescriptions using linked open drug data. International Journal On Advances in Life Sciences, 5(3,4):204–213.

[Kitchenham, 2004] Kitchenham, B. (2004). Procedures for performing systematic reviews. Technical report, Keele University and NICTA.

[Kiyavitskaya et al., 2009] Kiyavitskaya, N., Zeni, N., Cordy, J. R., Mich, L., and Mylopoulos, J. (2009). Cerno: Light-weight tool support for semantic annotation of textual documents. Data & Knowledge Engineering, 68(12):1470–1492. Including Special Section: 21st IEEE International Symposium on Computer-Based Medical Systems (IEEE CBMS 2008) – Seven selected and extended papers on Biomedical Data Mining.
[Klebeck et al., 2011] Klebeck, A., Hellmann, S., Ehrlich, C., and Auer, S. (2011). OntosFeeder – a versatile semantic context provider for web content authoring. In The Semantic Web: Research and Applications, volume 6644 of Lecture Notes in Computer Science, pages 456–460. Springer.

[Kock et al., 2009] Kock, E. D., Biljon, J. V., and Pretorius, M. (2009). Usability evaluation methods: Mind the gaps. Evaluation, pages 122–131.

[Krötzsch et al., 2007] Krötzsch, M., Vrandečić, D., Völkel, M., Haller, H., and Studer, R. (2007). Semantic Wikipedia. Journal of Web Semantics, 5(4):251–261.

[Kurki and Hyvönen, 2010] Kurki, J. and Hyvönen, E. (2010). Collaborative metadata editor integrated with ontology services and faceted portals. In 1st Workshop on Ontology Repositories and Editors for the Semantic Web.

[Lane et al., 2010] Lane, N., Miluzzo, E., Lu, H., Peebles, D., Choudhury, T., and Campbell, A. (2010). A survey of mobile phone sensing. Communications Magazine, IEEE, 48(9):140–150.

[Lauesen, 2005] Lauesen, S. (2005). User Interface Design: A Software Engineering Perspective. Addison Wesley.

[Leuf and Cunningham, 2001] Leuf, B. and Cunningham, W. (2001). The Wiki Way: Quick Collaboration on the Web. Addison-Wesley, London.

[Lewis and Sauro, 2009] Lewis, J. and Sauro, J. (2009). The factor structure of the System Usability Scale. In Human Centered Design, volume 5619 of LNCS, pages 94–103.

[Lin et al., 2002] Lin, J., Thomsen, M., and Landay, J. A. (2002). A visual language for sketching large and complex interactive designs. CHI '02, pages 307–314. ACM.

[Loecken et al., 2012] Loecken, A., Hesselmann, T., Pielot, M., Henze, N., and Boll, S. (2012). User-centred process for the definition of free-hand gestures applied to controlling music playback. Multimedia Syst., 18(1):15–31.

[Lohmann et al., 2008] Lohmann, S., Heim, P., Auer, S., Dietzold, S., and Riechert, T. (2008). Semantifying requirements engineering – the SoftWiki approach. In Proceedings of the 4th International Conference on Semantic Technologies (I-SEMANTICS '08), J.UCS, pages 182–185.

[Lopez et al., 2011] Lopez, V., Uren, V., Sabou, M., and Motta, E. (2011). Is question answering fit for the semantic web? A survey. Semantic Web – Interoperability, Usability, Applicability, 2(2):125–155.

[Luczak-Roesch, 2009] Luczak-Rösch, M. and Heese, R. (2009). Linked data authoring for non-experts. In WWW Workshop on Linked Data on the Web (LDOW 2009).
[Makhoul et al., 1999] Makhoul, J., Kubala, F., Schwartz, R., and Weischedel, R. (1999). Performance measures for information extraction. In Proceedings of DARPA Broadcast News Workshop, pages 249–252.

[Melissa Conrad Stoppler, 2012] Stoppler, M. C. (2012). http://www.medicinenet.com/script/main/art.asp?articlekey=55234.

[Mendes, 2013] Mendes, P. (2013). Adaptive Semantic Annotation of Entity and Concept Mentions in Text. PhD thesis, Department of Computer Science and Engineering, Wright State University.

[Mendes et al., 2011] Mendes, P. N., Jakob, M., García-Silva, A., and Bizer, C. (2011). DBpedia Spotlight: shedding light on the web of documents. In Proceedings of the 7th International Conference on Semantic Systems, I-Semantics '11, pages 1–8, New York, USA. ACM.

[Miles and Huberman, 1994] Miles, M. B. and Huberman, M. (1994). Qualitative Data Analysis: An Expanded Sourcebook (2nd Edition). Sage Publications, Inc, 2nd edition.

[Morsey et al., 2011] Morsey, M., Lehmann, J., Auer, S., and Ngonga Ngomo, A.-C. (2011). DBpedia SPARQL benchmark – performance assessment with real queries on real data. In ISWC 2011.

[Muller et al., 2011] Muller, W., Rojas, I., Eberhart, A., Haase, P., and Schmidt, M. (2011). A-R-E: The author-review-execute environment. Procedia Computer Science, 4:627–636. ICCS 2011.

[Myers, 1998] Myers, B. A. (1998). A brief history of human-computer interaction technology. interactions, 5(2):44–54.

[Möller et al., 2006] Möller, K., Bojars, U., and Breslin, J. (2006). Using semantics to enhance the blogging experience. In Sure, Y. and Domingue, J., editors, The Semantic Web: Research and Applications, volume 4011 of Lecture Notes in Computer Science, pages 679–696. Springer Berlin Heidelberg.

[Navarro-Galindo and Samos, 2010] Navarro-Galindo, J. L. and Samos, J. (2010). Manual and automatic semantic annotation of web documents: the FLERSA tool. In Proceedings of the 12th International Conference on Information Integration and Web-based Applications & Services, iiWAS '10, pages 542–549, New York, NY, USA. ACM.

[Ngomo et al., 2013] Ngomo, A.-C., Kolb, L., Heino, N., Hartung, M., Auer, S., and Rahm, E. (2013). When to reach for the cloud: Using parallel hardware for link discovery. In Cimiano, P., Corcho, O., Presutti, V., Hollink, L., and Rudolph, S., editors, The Semantic Web: Semantics and Big Data, volume 7882 of Lecture Notes in Computer Science, pages 275–289. Springer Berlin Heidelberg.
[Ngomo et al., 2011] Ngomo, A.-C. N., Heino, N., Lyko, K., Speck, R., and Kaltenböck, M. (2011). SCMS – semantifying content management systems. In ISWC, pages 189–204.

[Nielsen, 2012] Nielsen, J. (2012). Introduction to usability.

[Nielsen and Molich, 1990] Nielsen, J. and Molich, R. (1990). Heuristic evaluation of user interfaces. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Empowering People, CHI '90, pages 249–256, New York, NY, USA. ACM.

[O'Donoghue et al., 2010] O'Donoghue, S. I., Horn, H., Pafilis, E., Haag, S., Kuhn, M., Satagopam, V. P., Schneider, R., and Jensen, L. J. (2010). Reflect: A practical approach to web semantics. Web Semantics: Science, Services and Agents on the World Wide Web, 8(2-3):182–189.

[Oviatt et al., 2000] Oviatt, S., Cohen, P., Wu, L., Vergo, J., Duncan, L., Suhm, B., Bers, J., Holzman, T., Winograd, T., Landay, J., Larson, J., and Ferro, D. (2000). Designing the user interface for multimodal speech and pen-based gesture applications: state-of-the-art systems and future research directions. Hum.-Comput. Interact., 15(4):263–322.

[Patel and Khuba, 2009] Patel, D. R. and Khuba, S. A. (2009). Realization of semantic atom blog. Journal of Computing, 1:34–38.

[Paulheim and Probst, 2010] Paulheim, H. and Probst, F. (2010). Ontology-enhanced user interfaces: A survey. International Journal on Semantic Web and Information Systems (IJSWIS), 6:2.

[Perdrix et al., 2009] Perdrix, F., García, R., Gil, R., Oliva, M., and Macías, J. A. (2009). Semantic web interfaces for newspaper multimedia content management. In New Trends on Human-Computer Interaction, pages 1–10. Springer London. 10.1007/978-1-84882-352-53.

[Petrucka et al., 2013] Petrucka, P., Bassendowski, S., Roberts, H., and James, T. (2013). mHealth: A vital link for ubiquitous health. Online Journal of Nursing Informatics (OJNI), 17:2675.

[Pietriga et al., 2006] Pietriga, E., Bizer, C., Karger, D. R., and Lee, R. (2006). Fresnel: A browser-independent presentation vocabulary for RDF. In ISWC, LNCS, pages 158–171. Springer.

[Power et al., 1998] Power, R., Scott, D., and Evans, R. (1998). What You See Is What You Meant: direct knowledge editing with natural language feedback. In European Conference on Artificial Intelligence (ECAI), pages 677–681.
[Preotiuc-Pietro et al., 2012] Preotiuc-Pietro, D., Samangooei, S., Cohn, T., Gibbins, N., and Niranjan, M. (2012). Trendminer: an architecture for real time analysis of social media text. http://people.eng.unimelb.edu.au/tcohn/papers/trendminer+ramss+2012.pdf.

[Prud'hommeaux and Seaborne, 2008] Prud'hommeaux, E. and Seaborne, A. (2008). SPARQL query language for RDF. http://www.w3.org/TR/rdf-sparql-query/.

[Puustjärvi and Puustjärvi, 2006] Puustjärvi, J. and Puustjärvi, L. (2006). The challenges of electronic prescription systems based on semantic web technologies. In ECEH, pages 251–261.

[Quint and Vatton, 2007] Quint, V. and Vatton, I. (2007). Structured templates for authoring semantically rich documents. In Proceedings of the 2007 International Workshop on Semantically Aware Document Processing and Indexing, SADPI '07, pages 41–48, New York, NY, USA. ACM.

[Riechert et al., 2010] Riechert, T., Morgenstern, U., Auer, S., Tramp, S., and Martin, M. (2010). The Catalogus Professorum Lipsiensis – semantics-based collaboration and exploration for historians. In Proceedings of the 9th International Semantic Web Conference (ISWC 2010), Lecture Notes in Computer Science, Shanghai, China. Springer.

[Rizzo and Troncy, 2011] Rizzo, G. and Troncy, R. (2011). NERD: a framework for evaluating named entity recognition tools in the web of data.

[Ronallo, 2012] Ronallo, J. (2012). HTML5 Microdata and Schema.org. The Code4Lib Journal, (16).

[Ross and Nisbett, 1991] Ross, L. and Nisbett, R. E. (1991). The Person and the Situation: Perspectives of Social Psychology. Temple University Press, Philadelphia.

[Ruiz-Rube et al., 2010] Ruiz-Rube, I., Cornejo, C. M., Dodero, J. M., and García, V. M. (2010). Development issues on linked data weblog enrichment. In Sánchez-Alonso, S. and Athanasiadis, I. N., editors, Metadata and Semantic Research, volume 108 of Communications in Computer and Information Science, pages 235–246. Springer. 10.1007/978-3-642-16552-8-22.

[Sah et al., 2007] Sah, M., Hall, W., Gibbins, N. M., and Roure, D. C. D. (2007). SemPort – a personalized semantic portal. In 18th ACM Conference on Hypertext and Hypermedia, pages 31–32.

[Sahoo et al., 2009] Sahoo, S. S., Halb, W., Hellmann, S., Idehen, K., Thibodeau Jr., T., Auer, S., Sequeda, J., and Ezzat, A. (2009). A survey of current approaches for mapping of relational databases to RDF. http://www.w3.org/2005/Incubator/rdb2rdf/RDB2RDF_SurveyReport.pdf.
[Saleem et al., 2013] Saleem, M., Padmanabhuni, S. S., Ngonga Ngomo, A.-C., Almeida, J. S., Decker, S., and Deus, H. F. (2013). Linked cancer genome atlas database. In Proceedings of I-Semantics.

[Samwald et al., 2011a] Samwald, M., Jentzsch, A., Bouton, C., Kallesøe, C., Willighagen, E., Hajagos, J., Marshall, M., Prud'hommeaux, E., Hassanzadeh, O., Pichler, E., and Stephens, S. (2011a). Linked open drug data for pharmaceutical research and development. Journal of Cheminformatics, 3(1).

[Samwald et al., 2011b] Samwald, M., Jentzsch, A., Bouton, C., Kallesøe, C. S., Willighagen, E., Hajagos, J., Marshall, M. S., Prud'hommeaux, E., Hassanzadeh, O., Pichler, E., and Stephens, S. (2011b). Linked open drug data for pharmaceutical research and development. Journal of Cheminformatics, 3(19).

[Sauer, 2006] Sauer, C. (2006). What you see is Wiki – questioning WYSIWYG in the Internet age. In Proceedings of Wikimania 2006.

[Schaffert, 2006] Schaffert, S. (2006). IkeWiki: A semantic wiki for collaborative knowledge management. In Enabling Technologies: Infrastructure for Collaborative Enterprises, 2006. WETICE '06. 15th IEEE International Workshops on, pages 388–396.

[Schaffert et al., 2008] Schaffert, S., Bry, F., Baumeister, J., and Kiesel, M. (2008). Semantic wikis. IEEE Software, 25(4):8–11.

[Schaffert et al., 2009] Schaffert, S., Eder, J., Grünwald, S., Kurz, T., Radulescu, M., Sint, R., and Stroka, S. (2009). KiWi – a platform for semantic social software. In SemWiki.

[Seaman, 1999] Seaman, C. B. (1999). Qualitative methods in empirical studies of software engineering. IEEE Trans. Software Eng., 25(4):557–572.

[Seffah et al., 2006] Seffah, A., Donyaee, M., Kline, R. B., and Padda, H. K. (2006). Usability measurement and metrics: A consolidated model. Software Quality Control, 14(2):159–178.

[Sheu et al., 2010] Sheu, P., Yu, H., Ramamoorthy, C. V., Joshi, A. K., and Zadeh, L. A. (2010). Semantic Computing. Wiley-IEEE Press.

[Shneiderman, 2000] Shneiderman, B. (2000). Creating creativity: user interfaces for supporting innovation. ACM Trans. Comput.-Hum. Interact., 7(1):114–138.

[Simperl, 2012] Simperl, E. (2012). Crowdsourcing semantic data management: Challenges and opportunities. In Proceedings of the 2nd International Conference on Web Intelligence, Mining and Semantics, WIMS '12, pages 1:1–1:3, New York, NY, USA. ACM.
[Siorpaes and Simperl, 2010] Siorpaes, K. and Simperl, E. (2010). Human intelligence in the process of semantic content creation. World Wide Web: Internet and Web Information Systems, 13(1-2):33–59.

[Spiesser and Kitchen, 2004] Spiesser, J. and Kitchen, L. (2004). Optimization of HTML automatically generated by WYSIWYG programs. In WWW 2004, pages 355–364.

[Tarasowa et al., 2014] Tarasowa, D., Auer, S., Khalili, A., and Unbehauen, J. (2014). Crowd-sourcing (semantically) structured multilingual educational content (CoSMEC). Open Praxis, 6(2).

[Tarasowa et al., 2013] Tarasowa, D., Khalili, A., Auer, S., and Unbehauen, J. (2013). CrowdLearn: Crowd-sourcing the creation of highly-structured e-learning content. In Foley, O., Restivo, M. T., Uhomoibhi, J. O., and Helfert, M., editors, CSEDU, pages 33–42. SciTePress.

[Thórisson et al., 2010] Thórisson, K., Spivack, N., and Wissner, J. (2010). The semantic web: From representation to realization. In Transactions on Computational Collective Intelligence II, volume 6450 of Lecture Notes in Computer Science, pages 90–107. Springer.

[Tramp et al., 2010] Tramp, S., Heino, N., Auer, S., and Frischmuth, P. (2010). RDFauthor: Employing RDFa for collaborative knowledge engineering. In Knowledge Engineering and Management by the Masses, volume 6317 of LNCS, pages 90–104. Springer.

[Treviranus, 2008] Treviranus, J. (2008). Authoring tools. In Harper, S. and Yesilada, Y., editors, Web Accessibility, Human-Computer Interaction Series, pages 127–138. Springer. 10.1007/978-1-84800-050-69.

[Tunkelang, 2009] Tunkelang, D. (2009). Faceted Search (Synthesis Lectures on Information Concepts, Retrieval, and Services). Morgan and Claypool Publishers.

[Uren et al., 2006] Uren, V., Cimiano, P., Iria, J., Handschuh, S., Vargas-Vera, M., Motta, E., and Ciravegna, F. (2006). Semantic annotation for knowledge management: Requirements and a survey of the state of the art. Web Semantics: Science, Services and Agents on the World Wide Web, 4(1):14–28.

[Valaski et al., 2012] Valaski, J., Malucelli, A., and Reinehr, S. (2012). Ontologies application in organizational learning: A literature review. Expert Systems with Applications, 39(8):7555–7561.

[Valkeapaeae et al., 2007] Valkeapää, O., Alm, O., and Hyvönen, E. (2007). An adaptable framework for ontology-based content creation on the semantic web. Journal of Universal Computer Science, 13(12):1835–1853.
[Van Kleek et al., 2007] Van Kleek, M., Bernstein, M., Karger, D. R., and schraefel, m. (2007). Gui — phooey!: The case for text input. In Proceedings of the 20th Annual ACM Symposium on User Interface Software and Technology, UIST '07, pages 193–202, New York, NY, USA. ACM.

[W3C, 2004] W3C (2004). Resource Description Framework (RDF). http://www.w3.org/RDF/.

[W3C, 2009] W3C (2009). W3C semantic web activity. http://www.w3.org/2001/sw/. Last visited 8/6/2010.

[W3Techs, 2011] W3Techs (2011). Usage of content management systems for websites. http://w3techs.com/technologies/overview/content_management/all.

[Wang et al., 2012] Wang, X., Love, P. E., Klinc, R., Kim, M. J., and Davis, P. R. (2012). Integration of e-learning 2.0 with web 2.0. ITcon – Special Issue eLearning 2.0: Web 2.0-based social learning in built environment, 17:387–396.

[Wikipedia, 2013] Wikipedia (2013). SPARQL — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=SPARQL&oldid=544624084. [Online; accessed 31-March-2013].

[Williams et al., 2012] Williams, A. J., Harland, L., Groth, P., Pettifer, S., Chichester, C., Willighagen, E. L., Evelo, C. T., Blomberg, N., Ecker, G., Goble, C., and Mons, B. (2012). Open PHACTS: semantic interoperability for drug discovery. Drug Discovery Today, 17(21-22):1188–1198.

[Yang et al., 2013] Yang, H., Pupons-Wickham, D., Chiticariu, L., Li, Y., Nguyen, B., and Carreno-Fuentes, A. (2013). I can do text analytics!: designing development tools for novice developers. CHI '13, pages 1599–1608, New York, NY, USA. ACM.

[Yu, 2006] Yu, B. (2006). Cognitive aspects of human-GIS interaction: A literature review. Interface, pages 1–17.

[Yu, 2007] Yu, L. (2007). Introduction to Semantic Web and Semantic Web Services. Chapman & Hall/CRC, Boca Raton, FL.

[Lazaruk et al., 2012] Lazaruk, S., Kaczmarek, M., Dzikowski, J., Tokarchuk, O., and Abramowicz, W. (2012). Towards the semantic web – incentivizing semantic annotation creation process. In Teije, A., Völker, J., Handschuh, S., Stuckenschmidt, H., d'Aquin, M., Nikolov, A., Aussenac-Gilles, N., and Hernandez, N., editors, Knowledge Engineering and Knowledge Management, volume 7603 of Lecture Notes in Computer Science, pages 282–291. Springer Berlin Heidelberg.
Selbständigkeitserklärung (Declaration of Independent Work)

I hereby declare that I have prepared the present dissertation independently and without impermissible outside assistance. I have used no sources or aids other than those listed, and I have marked as such all passages taken verbatim or in substance from published or unpublished writings, as well as all statements based on oral information. Likewise, all materials provided or services rendered by other persons have been identified as such.

Leipzig, 26 January 2015
Ali Khalili