
Proceedings of the 9th International Conference on Information Technology and Applications in Biomedicine, ITAB 2009, Larnaca, Cyprus, 5-7 November 2009

Feeding back learning resources repurposing patterns into the "information loop": opportunities and challenges

D. Giordano, A. Faro, F. Maiorana, C. Pino, C. Spampinato

Abstract-The paper outlines a model for framing the representation and treatment of information gathered from the reuse and repurposing of learning resources from distributed repositories. The model takes into account as sources of information both static user-edited or automatically generated metadata fields and the emerging, dynamic information clouds that surround a learning resource when users comment on it, tag it, or explicitly link it to other learning resources. Coordinating these separate information layers offers two advantages: it reduces the semantic gap that occurs when unanticipated contexts of use must be described by resorting only to predefined vocabularies, and it improves the relevance of the resources retrieved after a query. To achieve this "coordination" it is proposed that the textual descriptions of the repurposing activity, expressed with respect to the intended learning outcomes and pedagogical strategies, are fed to a dynamic unsupervised classification method that operates on the above-mentioned information spaces and that supports exploratory search by suggesting associations. It is argued that the proposed analogical retrieval, as opposed to standard query matching, is better suited to tracking the loci of innovation and sustaining the formation of best practices in the community.

Index Terms- metadata, classification, multimedia analysis and indexing, content sharing, exploratory search

This work was supported in part by the mEducator project, funded by the eContentplus programme, a multiannual Community programme to make digital content in Europe more accessible, usable and exploitable (Grant ECP 2008 EDU 418006). The authors are with the University of Catania, Dipartimento di Ingegneria Informatica e Telecomunicazioni, Viale A. Doria 6, 95125 Catania, Italy (corresponding author phone: +39 095 7382371; fax: +39 095 7382397; e-mail: [email protected]). 978-1-4244-5379-5/09/$26.00 ©2009 IEEE

I. INTRODUCTION

The current research efforts on educational content sharing and repurposing are deeply shaped by a technological scenario that features increasingly mature Web 2.0 technologies and semantic web technologies, and by the change from the single repository model of the learning object (LO) paradigm to multiple, distributed repositories belonging to different organizations. Whereas the complementarity and integration opportunities between Web 2.0 and the semantic web are becoming increasingly apparent, there is still considerable debate on how the learning object should be represented and treated to support the repurposing needs of the community of educators in a given discipline. Topical issues regard whether or not an ontological model should be used to represent the learning object, and the type of information that should be contained in the metadata to ensure interoperability and effective sharing. There is a general agreement that the metadata fields of the adopted standard (e.g., IEEE LOM, Dublin Core) should be selected or extended based on the specifics of the application and, on the other hand, that they should be kept as simple as possible to make the generation process less costly and more user-oriented. Many models have been proposed to enrich the information in the metadata to facilitate the repurposing process, e.g., models of the content [1, 2, 3]; the context [4]; the learner [5]; the pedagogical process [6]; the modification history of the content [7]. Inevitably, the proposed solutions reflect the underlying philosophy about what a learning object is and what the ultimate goal of the repurposing is; the typical perspective on the issue would mention the improvement in the efficiency of production of new materials and training. No wonder that so much effort has focused on the structural properties of LOs; in this line of reasoning we share the concern and criticism expressed in [8], i.e., the lack of learning orientation in learning objects. With the diffusion of social software and technologies for content sharing, it is now common to talk about learning resources (LRs). This change underscores a transition from an instructivist model of learning objects to a constructivist one, where the meaning of the learning object is not a property of the object, but lies in the relation between the object, who uses it, and how.
Also, adopting the theoretical perspective of the community of practice as a learning model [9] makes it clear that the content repurposing activity is actually the reification of a practice into an artifact (the learning object), and that the participation of educators in the community develops through comments on LOs and contributions of new content. More than about efficiency, all this is about providing a medium in which the community itself defines, creates, sustains and propagates its best tools and practices, and the identities of its members. The goal of this paper is to analyze the type of information that can be generated during the repurposing activity and to find ways to coordinate it with all the information spaces that pertain to an LO, in order to improve the search and retrieval capability of the content sharing system. It is proposed that authors of the LO introduce a description of the repurposing activity that is anchored to the type of learning outcome and the intended strategy of use, and that a clustering and association building method be used to support a type of search so far overlooked in LO repositories, i.e., exploratory search.

II. TWO CONCEPTIONS OF REPURPOSING

In the LO community, repurposing refers to the process of using an existing educational resource (either as it is or by modifying part of it) in a different context, where either the original goal or some aspect of the target audience and pedagogical strategy has changed; changes to the LO might regard structure, content, sequencing, format and so forth. In the multimedia community, repurposing of audio, video, and media streams in general refers to the reprocessing of the multimedia elements contained in a digital resource, so that this resource can be delivered to a different platform, taking into account also the characteristics of the network [10]. This type of repurposing is relevant to our goal for two reasons: one is that changes in the delivery platform are a topical scenario, e.g., in the mobile learning context; the other is that considerable progress is being made in the automated segmentation and indexing of multimedia streams [11]. Furthermore, the computationally intensive content-based analysis performed by these methods is able to generate information about the breakdown of a resource into subcomponents (e.g., scenes, audio segments, images, objects in images) and low-level feature descriptions that are a basis for implementing a content-based retrieval system where "queries by example" can be performed [12]. It is easy to imagine that this could be provided as a service by future content sharing platforms; a more immediate application is to employ these methods, as long as they prove robust, to automatically generate metadata and alleviate the burden of manual preparation. The content repurposing process from the perspective of the designer of a learning experience is depicted in figure 1.

[Figure 1: diagram not reproduced. Recoverable labels: "Links space", "repurposing description", "Why", "Activity", "LRn".]

Fig. 1. Typical repurposing activity: LRr has been repurposed from LR1 by modifying one of its elements and by reusing an element of LRn. The author comments on the repurposing, on the intended goals and possible pedagogical strategies.

Tracking the repurposing process could be useful to answer queries such as: Has this Virtual Patient procedure been mapped to my country's regulations? Is there a more recent, improved version of this tutorial? Is there a version of this diagram or illustration where labels have been translated? Am I properly acknowledging sources? Tracking allows backward and forward navigation to resources related by a repurposing relation. It must be noted that one challenge related to the implementation of this relatively simple conceptual model is moving from one repository to many repositories. In fact, whereas in the single repository model the tracking could be easily automated, in the distributed model acknowledging repurposing from various sources and providing location information requires an agreed standard protocol. An example comes from the digital library field, where the COinS and OpenURL 1.0 standards allow the embedding of citations in an HTML page so that the metadata of the resource and its point of access can be located. In fig. 1 we assume as a working hypothesis that the repurposing links are typed, for an immediate high-level identification of the repurposing contexts, and that a suitable mapping to a numerical weight expressing the strength of the relationship can be defined. This will allow us to proceed with the clustering procedure that is part of the method proposed in section IV. Most importantly, the details of the repurposing operation, together with the pedagogical/learning reasons for repurposing the resources, are expressed in the description associated with the repurposed resource. A suitable description would mention the goal that we would like to achieve with that particular type and topic (e.g., engagement? diagnostic reasoning? perceptual discrimination? correct application of the procedure? development of a conceptual model?) and how in practice we want to use the resource (Why & Activity). Fig. 2 depicts the learning resource and the associated information spaces that are generated when it is made available through a content sharing platform with Web 2.0 features.

[Figure 2: diagram not reproduced. Recoverable label: "Reuse & Activity space".]

Fig. 2. Adding the social dimension to the Learning Resource.
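The working hypothesis that repurposing links are typed and mapped to numerical weights, and that tracking supports backward and forward navigation, can be illustrated with a minimal Python sketch. The link type names and the weight values below are hypothetical: the paper only assumes that some such mapping can be defined and does not prescribe one.

```python
from collections import defaultdict

# Hypothetical link types and weights (assumption: not taken from the paper).
LINK_WEIGHTS = {
    "translation": 0.9,
    "update": 0.8,
    "platform_change": 0.6,
    "element_reuse": 0.4,
}


class RepurposingGraph:
    """Directed graph of typed repurposing links between learning resources."""

    def __init__(self):
        self._children = defaultdict(list)  # source -> [(derived, link type)]
        self._parents = defaultdict(list)   # derived -> [(source, link type)]

    def add_link(self, source, derived, link_type):
        """Record that `derived` was repurposed from `source`."""
        if link_type not in LINK_WEIGHTS:
            raise ValueError(f"unknown link type: {link_type}")
        self._children[source].append((derived, link_type))
        self._parents[derived].append((source, link_type))

    def forward(self, resource):
        """Resources derived from `resource`, with the weight of each link."""
        return [(r, LINK_WEIGHTS[t]) for r, t in self._children[resource]]

    def backward(self, resource):
        """Resources that `resource` was repurposed from, with weights."""
        return [(r, LINK_WEIGHTS[t]) for r, t in self._parents[resource]]


graph = RepurposingGraph()
graph.add_link("LR1", "LRr", "update")         # LRr modifies an element of LR1
graph.add_link("LRn", "LRr", "element_reuse")  # LRr reuses an element of LRn

print(graph.backward("LRr"))  # sources of LRr
print(graph.forward("LR1"))   # resources derived from LR1
```

In a distributed setting the resource identifiers would have to be globally resolvable, which is where an agreed protocol for locating the resource and its metadata becomes necessary.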

In the figure, for reasons that will become clear in the following sections, we have not explicitly linked the tag space and the use-reuse (activity) space with the metadata, but it is possible that these spaces are treated as complementary to the metadata space, or as data sources that will be processed to eventually fill the relevant metadata fields.

III. EXPLORATORY SEARCH IN THE REPURPOSING WORKFLOW

Recently it has been acknowledged that search behavior and information needs strongly depend on the user's goals, and the distinction between focused search and exploratory search has been introduced [13]. In focused search the user knows exactly what s/he is looking for, e.g., I'm looking for a very specific content type that will support my expository strategy during a face-to-face class, or that will enrich the content of one section in my online course. In exploratory search I do not know exactly what I'm looking for, although I have a broad concept of it, and the search process is integral to the refinement of the query. Examples of this type of query could be: I would like to know if there is anyone who has considered including in their courses the use of this (or any) quite novel method or piece of equipment; or, how is this topic being taught in other medical schools? What type of resources do they use? This distinction impacts not only what should be in the metadata, but also how we should process the metadata to better serve the end user's aim. In general, metadata support focused search, although they do not alleviate the well-known vocabulary problem, i.e., the user not knowing the terms that are used for indexing. Indexing on the basis of taxonomies with a controlled vocabulary is a solution, but these classification systems can turn out to be too rigid within fast-evolving communities; moreover, in a content sharing community various classification systems coexist, conveniently mapped in folksonomies formed from user-generated tags. This latter scenario is especially relevant in the sharing of learning resources because of the multiplicity of purposes that any resource may fulfill, the only limit being the designer's creativity.
But even if the tag space can facilitate the focused search process, under the assumption that it generally tends to be closer to the end-user language, the key question is whether there can be a method to find relevant resources through implicit associations, and how the repurposing information can be used to this aim.

IV. PROPOSED METHOD

Two main search methodologies can be adopted to find relevant learning objects: one is based on the retrieval of objects indexed by the same terms contained in a search query (we call it query matching retrieval); the other aims at retrieving the objects which may have a role in designing new learning experiences, i.e., which are meaningful for creating a new learning object (analogical retrieval). The main advantage of the former methodology is that retrieval is very precise with respect to the query issued by the user, especially as the number of terms shared between the query and the learning object description increases. However, its precision is also a shortcoming, since many objects potentially relevant for the search are not retrieved (low recall rate). Indeed, we may be interested in reusing objects coming from similar experiences that are not necessarily described by the same terms as our use context. As a consequence, the former methodology does not have a suitable recall for supporting creativity, which in this case is related to innovative uses of LOs and strategies. The latter methodology behaves in the opposite way. It is not very precise, although it is able to recall many objects potentially relevant for the problem at hand. Although the use of ontologies may increase the recall of the query matching retrieval [14], it does not fulfil the requirements of exploratory search; in particular, it does not support the analogical reasoning processes that are at the basis of creative thinking, since the recall is increased by semantic expansion methods that can only identify superclasses, subclasses and siblings of the query term. On the contrary, the latter methodology seems more appropriate for learning object design (through reuse and repurposing), since it is relatively easy to increase its precision by using classic queries to flexibly filter the high number of objects recalled. Of course, the real effectiveness of the analogical retrieval depends on the notion of similarity between learning objects adopted by the method.
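The trade-off between the two methodologies can be made concrete with the standard definitions of precision (fraction of retrieved objects that are relevant) and recall (fraction of relevant objects that are retrieved). The following Python sketch uses invented object IDs and an invented relevance judgment, purely for illustration; it does not report results from the paper.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a collection of retrieved object IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical relevance judgment and result sets (assumptions for illustration).
relevant = {"d1", "d3", "d7", "d8"}
query_matching = {"d1", "d7"}                              # few, exact hits
analogical = {"d1", "d3", "d4", "d7", "d8", "d9", "d10"}   # whole classes

print(precision_recall(query_matching, relevant))  # (1.0, 0.5)
print(precision_recall(analogical, relevant))      # precision 4/7, recall 1.0
```

In this toy case query matching is fully precise but misses half of the relevant objects, while the analogical result recalls all of them at the cost of three irrelevant ones, which a subsequent term filter could remove.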
In the paper we assume that a suitable way to define the semantics of the learning objects, on the basis of either their structural properties or the activities in which they were involved, is to abstract their functionality/role from their descriptions. In particular, two learning objects are considered similar if they play a similar function/role in the use context or have a similar structure. In principle, we may define the object similarity on the basis of the number of terms shared by their descriptions. Thus, if one issues a query by terms, the system should extract from the dataset the learning objects that are nearest to the query, e.g., by using the Euclidean distance between the vectorial representations of the objects and of the query. However, this similarity criterion would lead to the retrieval of lexically coincident or quasi-coincident objects, whereas we are interested in sets of objects that are analogous in wider senses. A way to achieve this aim without excessive effort is to classify the mentioned vectorial descriptions of the objects by an unsupervised method, e.g., the self-organizing map (SOM) proposed by Kohonen [15]. The resulting classes of such a classification will contain a set of objects that exemplify an emergent concept, i.e., the concept described by the terms in common among the objects of the class, which we called in a previous paper the "positive features" of the class [16]. A SOM may then be used to increase the recall by returning to the user the entire class instead of only the object matching the query. Generalizing this way of proceeding, we may define a fruitful two-step strategy for analogical retrieval of learning objects as follows:

First step - Classification process (Fig. 3)
• each learning object is defined by several categories of data, e.g., LOM subsets, user Tags, Use descriptions and Repurposing links.
• the learning objects of the dataset are classified in an unsupervised way from the point of view of each category. Thus, if we have four categories, as in fig. 3, we will execute four classifications by taking into account the relevant vectorial descriptions. In other words, the LOM classification will be obtained by using only the LOM data, the user Tags classification by using only the user Tags descriptions, and so on.
• for each class we extract the positive features FCij, where C denotes the category and i, j respectively the class and the feature name. For example, in fig. 3 class CL1 related to the LOM classification contains the learning objects d1, d3, d4, and d5 and is characterized by the features FL11 = L1 and FL12 = L2, where L1 and L2 are LOM data. Let us note that in the case of link-based classification the positive features are computed as the most frequent use terms shared by the documents in the class.
• the associations between positive features FCij of the different classifications are signalled if their classes share a number of objects above a threshold set by the user. As an example, the association L1, L2 ⇔ T1, T6 in fig. 3 is due to the fact that classes CL1 and CT1 share 75% of objects. The other associations indicated in fig. 3 are derived automatically in the same fashion.
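The last two bullets of the first step can be sketched in Python as follows. The sketch assumes that the classes have already been produced by some unsupervised classifier; the object descriptions, the membership of the Tags class, and the way the shared percentage is normalized (here by the size of the larger class) are assumptions made for illustration, since the text does not fix them.

```python
from itertools import combinations


def positive_features(members, descriptions):
    """Terms shared by every object of the class (its positive features)."""
    term_sets = [descriptions[obj] for obj in members]
    return set.intersection(*term_sets) if term_sets else set()


def shared_fraction(a, b):
    """Fraction of shared objects, relative to the larger class (assumption)."""
    return len(a & b) / max(len(a), len(b))


def associations(classes, features, threshold):
    """Associate feature sets of classes from different classifications.

    classes:  {(category, class_name): set of object IDs}
    features: {(category, class_name): set of positive features}
    """
    found = []
    for k1, k2 in combinations(sorted(classes), 2):
        if k1[0] == k2[0]:
            continue  # skip pairs coming from the same classification
        if shared_fraction(classes[k1], classes[k2]) >= threshold:
            found.append((k1, features[k1], k2, features[k2]))
    return found


# Hypothetical term descriptions per category (assumptions for illustration).
lom_terms = {"d1": {"L1", "L2", "L5"}, "d3": {"L1", "L2"},
             "d4": {"L1", "L2", "L7"}, "d5": {"L1", "L2", "L3"}}
tag_terms = {"d1": {"T1", "T6"}, "d3": {"T1", "T6", "T2"},
             "d4": {"T1", "T6"}, "d8": {"T1", "T6", "T9"}}

classes = {("LOM", "CL1"): {"d1", "d3", "d4", "d5"},
           ("TAG", "CT1"): {"d1", "d3", "d4", "d8"}}
features = {("LOM", "CL1"): positive_features(classes[("LOM", "CL1")], lom_terms),
            ("TAG", "CT1"): positive_features(classes[("TAG", "CT1")], tag_terms)}

for k1, f1, k2, f2 in associations(classes, features, threshold=0.75):
    print(sorted(f1), "<=>", sorted(f2))  # ['L1', 'L2'] <=> ['T1', 'T6']
```

With a threshold of 0.75 the two classes, which share three of their four objects, yield the association between L1, L2 and T1, T6; raising the threshold suppresses it.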

[Figure 3: diagram not reproduced. Panels: LOMs, TAGs, Repurposing, USEs; documents d1 to d10; associations shown include L1, L2 ⇔ T1, T6; L1, L2 ⇔ U3, U5; and U1 ⇔ R4.]

Fig. 3. Classification process. Four classifications are obtained by LOM, User Tags, Use descriptions (Uses) and Repurposing links. Nodes are classes and the numbers are the document IDs. For some classes the extracted positive features are indicated and the resulting associations are pointed out.
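Each of the classifications in Fig. 3 relies on an unsupervised method such as the SOM. The following is a deliberately small, pure-Python sketch of a one-dimensional SOM applied to binary term vectors of a single category. The grid size, learning rate schedule, neighbourhood function, deterministic initialization, and the example vectors are all assumptions chosen for brevity; they are not the configuration used by the authors.

```python
import math


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def best_matching_unit(weights, vector):
    """Index of the node whose weight vector is closest to `vector`."""
    return min(range(len(weights)),
               key=lambda i: squared_distance(weights[i], vector))


def train_som(vectors, n_nodes, epochs=50, lr0=0.5, radius0=1.0):
    """Train a one-dimensional SOM and return the node weight vectors."""
    # Deterministic initialization from the first input vectors (assumption).
    weights = [list(vectors[i % len(vectors)]) for i in range(n_nodes)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs          # linear decay towards zero
        lr = lr0 * decay
        radius = max(radius0 * decay, 1e-9)
        for vector in vectors:
            bmu = best_matching_unit(weights, vector)
            for i, w in enumerate(weights):
                # Gaussian neighbourhood on the 1-D grid of nodes.
                influence = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for k in range(len(w)):
                    w[k] += lr * influence * (vector[k] - w[k])
    return weights


def classify(vectors_by_id, weights):
    """Group object IDs by the node (class) they are mapped to."""
    classes = {}
    for obj_id, vector in vectors_by_id.items():
        node = best_matching_unit(weights, vector)
        classes.setdefault(node, []).append(obj_id)
    return classes


# Hypothetical binary tag vectors over a four-term vocabulary.
vectors_by_id = {
    "d1": [1, 1, 0, 0],
    "d2": [0, 0, 1, 1],
    "d3": [1, 1, 0, 0],
    "d6": [0, 0, 1, 1],
    "d7": [1, 0, 0, 0],
}
weights = train_som(list(vectors_by_id.values()), n_nodes=2)
classes = classify(vectors_by_id, weights)
print(classes)
```

Running the same procedure once per category (LOM, Tags, Uses, Repurposing links) would give the four independent classifications on which positive features and associations are then computed.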

[Figure 4: flow diagram not reproduced. It chains a query matching module (query by terms), analogical retrieval modules, and filter-by-terms modules; intermediate results shown include d1, d7; d1, d3, d4, d8, d9, d10; d1, d3, d8; and d3, d8.]

Fig. 4. An example of how the user may obtain relevant objects by combined use of query matching and analogical retrieval.

Second step - Information Retrieval (IR) and exploratory search (fig. 4)
• the user issues a query by terms.
• a classic query matching module retrieves the objects closest to the query, e.g., d1 and d7 in fig. 4. These might be ordered by standard similarity metrics (e.g., [17]).
• the user's selection of the retrieved objects is given as input to the analogical retrieval module to retrieve the classes that contain such objects, e.g., classes CL1, CT1, CU2 and CR3 in fig. 4. This increases recall. The list of the objects belonging to such classes may be given to the user, if requested. The user may terminate the search; otherwise she/he executes the following steps.
• the user issues another query to filter the above classes and increase precision. As an example, fig. 4 assumes that the terms issued by the user reduce the classes relevant for the user to CL1 and CU1.
• the user passes the above retrieved classes to another analogical retrieval module to know the associations supported by these classes. As an example, fig. 4 points out that the associations due to the classes CL1 and CU1 are: 1) L1, L2 ⇔ T1, T6; 2) L1, L2 ⇔ U3, U5; and 3) U1 ⇔ R4.
• the user passes one of the identified associations, or another association defined by her/him, to another analogical module to extract the relevant objects underlying these associations, and then the last module is used to filter them by another query to increase precision.
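The first part of the second step (query matching, expansion to classes, filtering by terms) can be sketched as a small Python pipeline. The vocabulary, the object descriptions and the class memberships below are invented for illustration, and the classes are assumed to come from a previous classification step; the function names are hypothetical and do not correspond to the module names of an actual implementation.

```python
import math

VOCAB = ("anatomy", "ecg", "quiz", "video")  # hypothetical indexing terms


def vectorize(terms):
    """Binary term vector over VOCAB."""
    return tuple(1 if t in terms else 0 for t in VOCAB)


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def query_matching(query_terms, descriptions, k):
    """IDs of the k objects closest to the query (ties broken by ID)."""
    q = vectorize(query_terms)
    ranked = sorted(descriptions,
                    key=lambda obj: (euclidean(vectorize(descriptions[obj]), q), obj))
    return ranked[:k]


def expand_to_classes(selected, classes):
    """Analogical retrieval: classes containing at least one selected object."""
    chosen = set(selected)
    return {name: members for name, members in classes.items() if members & chosen}


def filter_by_terms(objects, terms, descriptions):
    """Keep only the objects whose description contains all the given terms."""
    return sorted(obj for obj in objects if terms <= descriptions[obj])


descriptions = {
    "d1": {"anatomy", "ecg"},
    "d3": {"ecg", "quiz"},
    "d4": {"anatomy", "quiz"},
    "d7": {"anatomy", "ecg", "quiz"},
    "d8": {"ecg", "video"},
}
# Classes assumed to result from the first step (one per category).
classes = {"CL1": {"d1", "d3", "d4"}, "CT1": {"d1", "d8"},
           "CU2": {"d3", "d7"}, "CR3": {"d4"}}

matched = query_matching({"anatomy", "ecg"}, descriptions, k=2)
expanded = expand_to_classes(matched, classes)       # increases recall
candidates = set().union(*expanded.values())
filtered = filter_by_terms(candidates, {"ecg"}, descriptions)  # restores precision

print(matched)           # ['d1', 'd7']
print(sorted(expanded))  # ['CL1', 'CT1', 'CU2']
print(filtered)          # ['d1', 'd3', 'd7', 'd8']
```

The later steps of the workflow (inspecting associations and retrieving the objects that underlie a chosen association) would reuse the same expansion and filtering operations on the associated classes.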

By this methodological apparatus, the user may obtain relevant objects either from classical query matching or by an exploratory approach. In the latter case she/he has several ways to obtain information, by taking into account not only the structure of the learning objects (from LOM) but also their actual use (from repurposing links), their potential reuse (from use and repurposing descriptions), and how they were commented on by the community. Moreover, this method supports abstract searches by associations. Another important aspect is that this method does not require any commitment on the a priori organization of the metadata and of the sources of information, which may derive from alternative or complementary ways to model the repurposing process in a social platform (e.g., [18]). Instead of favoring the model of a comprehensive metadata record, filled with information extracted from all the selected sources, one could think of keeping different aggregations of metadata and classifying independently on the defined subsets. This approach has the advantage that one can incrementally incorporate other information spaces, such as those that could be generated from relevance feedback and analysis of the query patterns.

V. RELATED WORK

The works that relate to the proposed method are those that, in general, try to facilitate the search of LOs. In [17] several relevance ranking metrics are evaluated with respect to some notions of relevance that apply in a focused search scenario. In [12] a layered similarity metric (by document, by topic and by subtopic) is applied to search by "query by example". The proposed method is an evolution of the one proposed in [19], where the classification of documents was directly based on the network of links. In [13] a clustering approach is used on the results of web searches together with dynamic taxonomies. The novelty of our method lies in the multipartitioning of the information space and in the superimposition of the term-based classifications and the repurposing links classification to obtain associations that are able to capture use contexts and practices that do not share a common vocabulary yet.

VI. CONCLUSION

This work has pointed out the importance of incorporating descriptions of activity and reuse in natural language as part of the raw data collection that includes as sources metadata, social tagging and repurposing links. The proposed analogical retrieval method uses the above information spaces to perform an unsupervised classification and derive its descriptive (positive) features; in addition, it derives feature associations across classes that may capture notions of relevance from unanticipated descriptions of use contexts, or even from contexts that are implied but never explicitly described. It is argued that because of this property the proposed analogical retrieval method, as opposed to standard query matching, is better suited to tracking the loci of innovation and sustaining the formation of best practices in the community. Since the user plays a key role in harnessing the power of the method by interacting with the system, it remains to be seen how the analogical retrieval process could be steered by rules to make it useful for efforts aimed at automated repurposing, where higher precision is desirable. Future work is planned on the evaluation of different metrics to compute the classification, and on the study of suitable user interfaces to handle the complexity of the information to be displayed.

REFERENCES

[1] K. Verbert and E. Duval, "Evaluating the ALOCOM Approach for Scalable Content Repurposing," ed, 2007, pp. 364-377.
[2] S. Nesic, M. Jazayeri, J. Jovanovic, D. Gasevic, "Ontology-based content model for scalable content reuse," Proc. 4th international conference on Knowledge capture, Whistler, BC, Canada, 2007.
[3] D. S. S. Rodrigues, S. W. M. Siqueria, M. H. L. B. Braz, R. N. Melo, "A Strategy for Achieving Learning Content Repurposing," Proc. 1st world summit on The Knowledge Society, Athens, Greece, 2008.
[4] K. Haase, "Context for semantic metadata," presented at the Proceedings of the 12th annual ACM international conference on Multimedia, New York, NY, USA, 2004.
[5] O. Conlan, W. Wade, C. Bruen, M. Gargan, "Multi-model, Metadata Driven Approach to Adaptive Hypermedia Services for Personalized eLearning," Proc. 2nd International Conference on Adaptive Hypermedia and Adaptive Web-Based Systems, 2002.
[6] A. Carle, M. Clancy, J. Canny, "Working with pedagogical patterns in PACT: initial applications and observations," SIGCSE Bull., vol. 39, pp. 238-242, 2007.
[7] M. Meyer, S. Bergstraessr, B. Zimmermann, C. Rensing, R. Steinmetz, "Modeling Modifications of Multimedia Learning Resources Using Ontology-Based Representations," LNCS 4351, Springer-Verlag, pp. 34-43, 2007.
[8] D. Jonassen, D. Churchill, "Is There a Learning Orientation in Learning Objects?" Int. Journal on E-Learning (IJEL) 3:2, 2004.
[9] E. Wenger, "Communities of practice: learning, meaning and identity," Cambridge, UK, 1998.
[10] Z. Obrenovic, D. Starcevic, B. Selic, "A model-driven approach to content repurposing," IEEE Multimedia, Jan-March 2004.
[11] C. Dorai, R. G. Farrell, A. Katriel, G. Kofman, Ying Li, Y. Park, "MAGICAL demonstration: system for automated metadata generation for instructional content," ACM Multimedia 2006: 491-492, 2006.
[12] B. Zaka, N. Kulathuramaiyer, W. Balke, H. Maurer, "Topic-Centered Aggregation of Presentations for Learning Object Repurposing," World Conf. on E-Learning in Corporate, Government, Healthcare, and Higher Education (ELEARN) 2008: 1
[13] P. Papadakos, S. Kopidaki, N. Armenatzoglou and Y. Tzitzikas, "Exploratory Web Searching with Dynamic Taxonomies and Results Clustering," European Conference on Digital Libraries (ECDL'09), Corfu, Greece, Sep-Oct 2009.
[14] S. Angeletou, M. Sabou, E. Motta, "Folksonomy Enrichment and Search," in LNCS 5554, The Semantic Web: Research and Applications, 2009.
[15] T. Kohonen, "Self-organizing maps," Berlin; Heidelberg; New York: Springer, 1995.
[16] A. Faro, D. Giordano, F. Maiorana, "Discovering complex regularities: from tree to semi-lattice classifications," International Journal of Computational Intelligence, Vol. 2, N. 1, pp. 34-39, 2005.
[17] X. Ochoa and E. Duval, "Relevance Ranking Metrics for Learning Objects," IEEE Trans. Learn. Technol., vol. 1, pp. 34-48, 2008.
[18] E. Kaldoudi, et al., "Social Networking for Learning Object Repurposing in Medical Education," The Journal on Information Technology in Healthcare, vol. 7(4), pp. 233-243, 2009.
[19] A. Faro, D. Giordano, "Concept formation from design cases: why reusing experience and why not," Knowledge Based Systems, 11, pp. 437-448, 1998.
[20] B. Zens, P. Baumgartner, "Making efficient use of open educational resources using a multi-layer metadata approach," World Conf. on Educational Multimedia, Hypermedia and Telecommunications (EDMEDIA) 2008: 1