Metadata Generation for Learning Objects
An Experimental Comparison of Automatic and Collaborative Solutions

Matthias Bauer
University of Passau
Production and Logistics
Innstrasse 39
94032 Passau (Germany)
[email protected]

Ronald Maier, Stefan Thalmann
University of Innsbruck
Information Systems
Universitaetsstrasse 15
6020 Innsbruck (Austria)
[email protected]; [email protected]
1 Introduction

For administration and exchange of learning objects (LOs), meaningful metadata are required. Typically, learning material is not limited to text, but includes multimedia content such as images, audio and video. Metadata not only describe the content, but also refer to, e.g., didactic methods, the domain of use and relationships to other LOs (Motelet et al. 2006). Many authors agree that dealing with metadata cannot be left entirely to humans (Duval and Hodgins 2004; Cardinaels et al. 2005; Ochoa et al. 2005), arguing that the creation of structured metadata is too difficult, complicated and time-consuming for authors of LOs. Traditionally, a small group of experts categorizes or indexes resources on the basis of an agreed, structured catalogue of keywords, a taxonomy, in order to make resources accessible (McGregor and McCulloch 2006). With the rapidly increasing number of resources, in this case LOs, the time and cost of professional metadata creation are unsustainable for many organizations. Furthermore, experts find it challenging to describe LOs for all kinds of application areas, because they cannot be experts in all domains for which LOs are developed (Shipman and McCall 1994). Therefore, metadata generation generally remains the responsibility of the authors of LOs. While learners or educational professionals may benefit from metadata, the authors themselves rarely do (Motelet et al. 2007). It is thus not surprising that one of the most frequent criticisms of LO metadata is that LO authors are not willing to spend additional effort on adding metadata to their LOs (Duval and Hodgins 2003). Automatic processes can partly resolve this problem by reducing the number of metadata elements that have to be edited by humans (Duval and Hodgins 2004).
Moreover, with the advent of Web technologies that allow large numbers of users to participate in content production, sometimes termed Web 2.0 (O'Reilly 2005), the "collective intelligence" emerging from the contributions of many has been discussed as a promising phenomenon that requires further investigation. With respect to the task considered here, annotation of LOs could be shifted from a few authors to a potentially much larger number of users by means of what has come to be called collaborative tagging. This paper discusses two approaches to the challenge of annotating LOs: automatic metadata generation (section 2) and collaborative tagging (section 3). The paper reports on experiments in both areas, the results of which are discussed with respect to the types of metadata for which each approach seems most useful. Finally, recommendations and an outlook on future developments conclude the paper (section 4).
2 Automatic Metadata Generation

Section 2.1 reviews methods of automatic metadata generation with respect to their suitability for generating the metadata types defined in the learning object metadata (LOM) standard (IEEE 2002). Section 2.2 presents the results of an experiment on key phrase extraction.

2.1 Methods for automatic metadata generation

Metadata can be extracted from different sources available to the authoring system (Cardinaels et al. 2005). Both the resource itself and the context in which the LO is used present opportunities for generating metadata (Motelet et al. 2006). Therefore, resource-based and context-based methods are distinguished.

Metadata harvesting, extraction, classification and propagation are resource-based methods. With these, LOs are analyzed independently of their usage, other LOs or learning management systems (LMSs; Cardinaels et al. 2005). Harvesting means that metadata are automatically collected from metadata already embedded in a resource, e.g., in the header of an HTML document or in the encoding of other resource formats (Greenberg 2004). The number of metadata elements to be found depends on the file format and its metadata schema (Edvardsen 2005). For example, PDF files can include title, author or subject, audio and video files contain the play time, and HTML files can include the metadata elements standardized in the Dublin Core standard (http://dublincore.org/).
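As an illustration of harvesting from HTML headers, consider the following minimal Python sketch. It is an illustration of ours, not code from any of the cited systems, and collects Dublin Core elements from the meta tags of an HTML document using only the standard library:

# Minimal harvesting sketch: collect Dublin Core metadata embedded in the
# head of an HTML document, using only the Python standard library.
from html.parser import HTMLParser

class DublinCoreHarvester(HTMLParser):
    """Collects <meta name="DC.*" content="..."> elements."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name, content = attrs.get("name", ""), attrs.get("content")
        # Dublin Core elements are conventionally prefixed with "DC."
        if name.lower().startswith("dc.") and content is not None:
            self.metadata.setdefault(name[3:].lower(), []).append(content)

html = """<html><head>
<meta name="DC.title" content="Introduction to Metadata">
<meta name="DC.creator" content="J. Doe">
<meta name="DC.language" content="en">
</head><body>...</body></html>"""

harvester = DublinCoreHarvester()
harvester.feed(html)
print(harvester.metadata)
# {'title': ['Introduction to Metadata'], 'creator': ['J. Doe'], 'language': ['en']}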
Extraction is a text-based method: an algorithm automatically extracts metadata from the content of LOs (Greenberg 2004). For this purpose, several techniques such as regular expressions, rule-based parsers and machine learning have been used (Hu et al. 2006). Which metadata can be extracted depends on the genre. For instance, a research paper often starts with a title, followed by authors and an abstract, continues with the body of text and ends with a bibliography (Kim and Ross 2006). Independent work exists on extracting metadata elements from documents within specific genres (e.g., Giles et al. 1998; Han et al. 2003; Hu et al. 2006). Key phrase extraction and automatic summarization hold a special position within metadata extraction because they are independent of the genre. Key phrase extraction aims to find relevant phrases or keywords in the text in order to describe the content of LOs. Both rule-based approaches (Humphreys 2002; Brook Wu et al. 2005) and approaches based on machine learning (Turney 1999; Frank et al. 1999) are available. Automatic summarization is discussed in detail in Jones et al. (2002).

Classification approaches assign metadata values from a controlled vocabulary (Paynter 2005). They have been used to assign language (Cavnar and Trenkle 1994), technical format (Knight 2007), keywords (Brook Wu et al. 2005; Medelyan and Witten 2006) and author (Warner and Brown 2001). Furthermore, classification approaches can assign LOs to standard classifications such as the Library of Congress Classification (LCC) (Paynter 2005).

Generation of metadata based on relationships between LOs is called metadata propagation or metadata inheritance (Hatala and Forth 2003). If a LO contains other LOs as components, the metadata of the components and of the aggregate are closely related; the components' metadata can thus be used to automatically generate metadata for the aggregate. However, not all types of metadata elements behave the same way when propagated within content aggregations. For example, values for metadata elements like keyword and format are accumulated, whereas values for size are summed up (Cardinaels 2007).
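To make these propagation rules concrete, the following sketch, a simplified illustration of ours rather than code from any cited system, derives aggregate metadata from the metadata records of hypothetical component LOs, accumulating keywords and formats while summing sizes:

# Sketch of metadata propagation: derive aggregate metadata from components.
# The record structure is a simplifying assumption; element names loosely
# follow LOM terminology.

def propagate(components: list[dict]) -> dict:
    """Build metadata for an aggregate LO from its components' metadata."""
    aggregate = {"keyword": set(), "format": set(), "size": 0}
    for metadata in components:
        # keyword and format values are accumulated across components ...
        aggregate["keyword"] |= set(metadata.get("keyword", []))
        aggregate["format"] |= set(metadata.get("format", []))
        # ... whereas size is summed up (cf. Cardinaels 2007)
        aggregate["size"] += metadata.get("size", 0)
    return aggregate

components = [
    {"keyword": ["metadata", "LOM"], "format": ["text/html"], "size": 12_000},
    {"keyword": ["tagging"], "format": ["video/mp4"], "size": 950_000},
]
print(propagate(components))  # size sums to 962000; keyword/format are unions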
Using information already available from the context of LOs to assign metadata is frequently discussed in the literature (Cardinaels et al. 2005; Ochoa et al. 2005; Pansanato and Fortes 2005; Motelet et al. 2006). Such metadata can be obtained from internal sources, from external sources or via templates. The authors identified five types of internal sources that can be used for automatic metadata generation: (1) the administration of the executing LMS together with the operating system, (2) log files, (3) user profiles, (4) feedback and (5) other LOs and their metadata records. For example, the identifier (Hörmann 2006), the current version (Hettrich and Koroleva 2003) and the navigation structure of preceding and following material (Ochoa et al. 2005) can be found in the LMS. The operating system keeps track of, e.g., size and location of LOs (Pansanato and Fortes 2005). Log files store, for instance, the dates of creation, last modification and usage of LOs, together with the corresponding user. User profiles contain creator and user information. Feedback means that users may annotate or evaluate LOs they have used, and these annotations can be incorporated into the metadata records. Finally, information about other LOs can also be used to assign metadata (Edvardsen 2005). For example, if a relationship such as "is based on" is described in one LO's metadata, then the target LO's metadata can be updated automatically with the opposite relationship, here "is basis for".

External sources can also provide useful information about published LOs. For instance, when a LO is used in several LMSs, several metadata records about the same LO may exist and can be interchanged.

Templates can be created to provide defaults for many relevant fields (Duval and Hodgins 2003). Pansanato and Fortes (2005) distinguish (1) system, (2) user and (3) resource templates. System templates contain metadata valid for all LOs: when a new LO is created, its metadata record receives the values stored in the system template. User templates contain metadata valid for all LOs created by the same user: when the author of a LO is known, the LO's metadata record receives all values stored in the corresponding user template. Resource templates contain metadata valid for a set of resources: all LOs belonging to the set receive the metadata stored in the template.
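The layering of such templates can be sketched as follows. The record structure and the precedence order (resource over user over system) are assumptions made for this illustration, not prescriptions from the cited work:

# Sketch of template-based metadata defaults: system-wide values are applied
# first and then overridden by user and resource templates. The precedence
# order is an assumption made for this illustration.

SYSTEM_TEMPLATE = {"metametadata.schema": "LOMv1.0", "rights.cost": "no"}

USER_TEMPLATES = {
    "jdoe": {"lifecycle.contribute.author": "J. Doe", "general.language": "en"},
}

RESOURCE_TEMPLATES = {
    "course-101": {"educational.context": "higher education"},
}

def default_metadata(author, resource_set):
    """Assemble default metadata for a new LO from the applicable templates."""
    metadata = dict(SYSTEM_TEMPLATE)                       # valid for all LOs
    if author in USER_TEMPLATES:
        metadata.update(USER_TEMPLATES[author])            # valid per author
    if resource_set in RESOURCE_TEMPLATES:
        metadata.update(RESOURCE_TEMPLATES[resource_set])  # valid per set
    return metadata

print(default_metadata("jdoe", "course-101"))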
Table 1 shows which metadata elements defined in the learning object metadata (LOM) standard (IEEE 2002) may be generated using methods of automatic metadata generation. Some metadata elements, like general.keyword or lifecycle.contribute, can be created by more than one method, whereas other metadata elements, like educational.interactivitylevel or educational.semanticdensity, cannot be generated automatically at present.

Table 1. Comparison of methods for automatic metadata generation. [Matrix not reproduced: its rows are the LOM elements, grouped into the categories general, life cycle, meta-metadata, technical, educational, rights, relation, annotation and classification; its columns are the methods read-out of internal sources, harvesting, extraction, classification and propagation. Legend: AD administration and operating system, LO log files, US user profiles, H HTML + Dublin Core, OL other learning objects, FE feedback, W MS Word, P PDF, A/V Audio/Video.]
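Among the classification approaches listed in Table 1, assigning the language element is perhaps the most compact to illustrate. The sketch below follows the general idea of comparing character n-gram frequency profiles in the spirit of Cavnar and Trenkle (1994), in heavily simplified form: the training snippets and profile size are mere placeholders, since real systems train on large corpora.

# Toy language classifier in the spirit of Cavnar and Trenkle (1994):
# rank character trigrams by frequency and compare rank orders.
from collections import Counter

def profile(text, n=3, size=100):
    """Return the `size` most frequent character n-grams of `text`."""
    text = " ".join(text.lower().split())
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return [g for g, _ in grams.most_common(size)]

def distance(doc_profile, lang_profile):
    """'Out-of-place' measure: sum of rank differences; n-grams missing
    from the language profile receive the maximum penalty."""
    ranks = {g: r for r, g in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(abs(r - ranks[g]) if g in ranks else max_penalty
               for r, g in enumerate(doc_profile))

TRAINING = {  # placeholder snippets; real profiles come from large corpora
    "en": "the quick brown fox jumps over the lazy dog and the cat",
    "de": "der schnelle braune fuchs springt ueber den faulen hund",
}
LANG_PROFILES = {lang: profile(text) for lang, text in TRAINING.items()}

def classify_language(document):
    doc = profile(document)
    return min(LANG_PROFILES, key=lambda lang: distance(doc, LANG_PROFILES[lang]))

print(classify_language("the dog jumps over the fox"))  # -> 'en'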
2.2 Experiment “Key Phrase Extraction Tools”

Of the large array of methods reviewed in section 2.1, key phrase extraction was selected for the experiment because it is independent of both the genre and the context of LOs and can thus best be compared to collaborative tagging of individual resources, which is likewise independent of genre and context. The experiment addresses the questions (1) whether key phrase extraction is generally appropriate for extracting keywords for content descriptions and (2) which key phrase extraction tool achieves the best results. The empirical study was conducted in July 2007 at Martin Luther University of Halle-Wittenberg, with which all authors were affiliated at that time. Four key phrase extraction tools were tested: (1) KEA (Key Phrase Extraction Algorithm) was developed by Frank et al. (1999) and is based on machine learning for training and key phrase extraction. (2) Extractor is presented in Turney (1999) and uses the hybrid genetic algorithm GenEx for training and key phrase extraction. (3) PhraseRate is described in Humphreys (2002) and is part of the iVia software, a virtual library developed by the University of California; it uses a rule-based approach without any training. (4) SAmgI (Simple Automatic Metadata Generation Interface) was developed by ARIADNE and is an implementation of a framework for automatic metadata generation (Cardinaels et al. 2005).
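To convey a flavor of what such tools do, here is a deliberately minimal, frequency-based key phrase extractor. It is an illustration of ours only; none of the four tools above works this simply, and the stopword list is a placeholder:

# Deliberately minimal key phrase extraction: score candidate unigrams and
# bigrams by frequency after stopword filtering. Real tools such as KEA,
# Extractor and PhraseRate use far richer features (position, tf-idf,
# supervised training).
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "or", "to", "in", "is", "are",
             "for", "on", "with", "that", "this", "as", "be", "by", "it"}

def extract_key_phrases(text, k=7):
    """Return the k most frequent stopword-free unigrams and bigrams."""
    words = re.findall(r"[a-z]+", text.lower())
    candidates = Counter(w for w in words if w not in STOPWORDS)
    for w1, w2 in zip(words, words[1:]):
        if w1 not in STOPWORDS and w2 not in STOPWORDS:
            candidates[w1 + " " + w2] += 1
    return [phrase for phrase, _ in candidates.most_common(k)]

sample = ("Metadata generation for learning objects can be automated. "
          "Key phrase extraction finds keywords that describe learning "
          "objects, and metadata generation benefits from such keywords.")
print(extract_key_phrases(sample))  # seven phrases, mirroring the experiment setup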
Twenty-four persons participated in the experiment. Because all of them were graduates in management information systems, they had a similar background. Eight English-language research papers were used as test documents; they were chosen because they fit the participants' background well. The authors' own key phrases were hidden so that they would not influence the participants, and they were likewise removed from the documents so that they would not influence the tools. Each participant received all papers together with all key phrases generated by one tool and had to rate every key phrase on an ordinal scale with respect to its suitability for describing the content of the corresponding paper. Thus, each key phrase was evaluated by six participants. Every key phrase extraction tool extracted seven key phrases, except for SAmgI, which was configured to extract ten. To extract key phrases with KEA, it was first necessary to build a model; for this purpose, 50 research papers on similar subjects were used as training data. The categories of the ordinal scale were weighted to create a ranking ranging from 0 (unsuitable) to 1 (very well suited). Table 2 shows the mean scores for each tool and the resulting ranking. Extractor and PhraseRate achieved the best results with an equal mean score of 0.63, followed by KEA with 0.57. SAmgI received negative ratings overall; its mean score of 0.44 lies below the midpoint of the scale. Moreover, mean scores and the resulting ranking vary between the papers. For example, KEA ranked first for paper 1 but fourth for paper 4. PhraseRate ranked first for papers 1, 4, 5, 7 and 8, but third for paper 3. Based on a t-test, all tools have highly significantly higher scores (