Current Developments in Technology-Assisted Education (2006)
A collaborative tagging system for learning resources sharing
Wen-Tai Hsieh1,2, Wei-Shen Lai*,1, and Seng-Cho T. Chou2
1 Innovative Digitech-Enabled Applications & Services Institute, Institute for Information Industry, 22FL.-A, No.333, Sec.2, Duenhua S. Rd., Taipei 106, Taiwan, R. O. C.
2 Department of Information Management, National Taiwan University, No.1, Sec. 4, Roosevelt Road, Taipei 106, Taiwan, R. O. C.
Web 2.0 has an architecture of participation and sharing that encourages users to add value to an application. This study proposes a refined Collaborative Content Sharing Module that better equips the LCMS to enhance the user experience of content sharing. Within this module, a concept space that supports both tag recommendation and concept-based search is generated from the associations between tags and learning contents. The refined Related Tag Generator provides related tags to users, enabling them to decide whether to use precise tags (lower-level concepts) or fuzzy tags (upper-level concepts). Moreover, the refined Query Analyzer and Ranking algorithm can improve the recall of search results by using concept distance. Additionally, the navigation service is enhanced through a Topic Map generated from the concept space. An experiment demonstrates that the proposed system increases the recall of search results by 141% on average, with an average loss of precision of only 6% as the tradeoff. With this collaborative tagging system, users can better organize and more efficiently discover the required data in the resource sharing environment. The virtuous cycle of reuse and sharing is likely to increase the diversity of learning resources in the very near future.
Keywords Folksonomy; collaborative tagging; web 2.0; LCMS
1. Background
An architecture of participation and sharing that encourages users to add value to the application is one of the fundamental characteristics of a successful Web 2.0 application. Data are the impetus for transforming participants from consumers to producers. A tagging scheme seems to be the easiest way to categorize data when managing the rapid growth of content. The trend of Web 2.0 has introduced a new categorization method called folksonomy. A folksonomy is an Internet-based information retrieval method consisting of collaboratively generated, open-ended labels that categorize content such as Web pages, online photographs and Web links [1]. Vander Wal coined the term folksonomy, and has defined it as follows: “A folksonomy is the result of personal free tagging of information and objects (anything with a URL) on the internet for one’s own retrieval. The tagging is performed in a social environment (shared and open to others). The tagging action is done by the person consuming the information.” [2]
By accumulating the popularity of tags (the usage count of each tag) in a collaborative tagging environment, flickr.com was the first to provide a “tag cloud” for visual navigation while browsing. A tag cloud (Fig. 1) is a visual depiction of the content tags used on a website. More frequently used tags are often depicted in a larger font or otherwise emphasized, while the displayed order is generally alphabetical [1]. The tag cloud has become a very common service on collaborative tagging web sites.
Besides the popularity of tags, other information can also be derived from tagging actions. A tagging action comprises the tagger (the user who provides the tags), the resource, one or more tags, and additional information such as a date, title or note. Each tagging action yields one or more tagging data items (one per tag), each comprising one tagger, one tag and one resource.
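The decomposition of a tagging action and the popularity count behind a tag cloud can be illustrated with a minimal Python sketch. The names `TaggingAction`, `expand`, `popularity` and `cloud_sizes`, the sample data, and the linear mapping from usage count to font size are assumptions made for illustration; the paper does not specify how a tag cloud scales its fonts.

```python
from collections import Counter
from typing import List, NamedTuple, Tuple


class TaggingAction(NamedTuple):
    tagger: str             # user who provides the tags
    resource: str           # tagged resource (e.g. a URL)
    tags: Tuple[str, ...]   # one or more tags given in this action


def expand(action: TaggingAction) -> List[Tuple[str, str, str]]:
    """Split one tagging action into (tagger, tag, resource) items, one per tag."""
    return [(action.tagger, tag, action.resource) for tag in action.tags]


def popularity(triples) -> Counter:
    """Usage count of every tag across all tagging data."""
    return Counter(tag for _, tag, _ in triples)


def cloud_sizes(counts, min_pt: int = 10, max_pt: int = 30):
    """Alphabetical (tag, font size) pairs; more popular tags get larger fonts.

    The linear scaling is an illustrative assumption, not taken from the paper.
    """
    low, high = min(counts.values()), max(counts.values())
    span = (high - low) or 1  # avoid division by zero when all counts are equal
    return [(tag, min_pt + (counts[tag] - low) * (max_pt - min_pt) // span)
            for tag in sorted(counts)]


# Hypothetical sample data: three tagging actions by two users.
actions = [
    TaggingAction("alice", "r1", ("python", "web")),
    TaggingAction("bob", "r1", ("python",)),
    TaggingAction("bob", "r2", ("python", "news")),
]
triples = [t for action in actions for t in expand(action)]
print(cloud_sizes(popularity(triples)))
```

Keeping the data as flat (tagger, tag, resource) items makes it straightforward to aggregate later by tag, by resource or by user.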
* Corresponding author: Wei-Shen Lai, e-mail: [email protected], Phone: +886-2-8732-6222 ext.508, Fax: +886-2-2377-0776
© FORMATEX 2006
Two different relations between any pair of tags can be defined by aggregating tagging data. First, two tags are in a “Co-Resource” relation if they are applied to the same resource. This relation is stronger between tags that share more resources. Second, two tags used by the same user are in a “Co-User” relation. While the “Co-Resource” relation is most appropriate for establishing a public concept hierarchy, the “Co-User” relation is most suitable for establishing a private concept hierarchy. This investigation presents a learning resource sharing platform that contains a cooperative tagging module to make sharing easier. Furthermore, a public domain concept hierarchy among tags is established from the collected tagging data. Finally, a straightforward experiment is performed to demonstrate that this concept hierarchy can help in resource searching and tag recommendation.
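The two relations can be sketched in Python as pair counts over (tagger, tag, resource) items. The function names and sample data are hypothetical. Measuring the strength of a relation as the number of shared resources (or shared users) follows the description above for Co-Resource; applying the same count to Co-User is an assumption.

```python
from collections import defaultdict
from itertools import combinations


def co_relation(triples, key_index):
    """Count, for each tag pair, the distinct keys they share.

    triples are (tagger, tag, resource) tuples.
    key_index=2 groups by resource (Co-Resource);
    key_index=0 groups by tagger (Co-User).
    """
    tags_by_key = defaultdict(set)
    for triple in triples:
        tags_by_key[triple[key_index]].add(triple[1])
    strength = defaultdict(int)
    for tags in tags_by_key.values():
        # every unordered pair of tags seen under the same key is related
        for a, b in combinations(sorted(tags), 2):
            strength[(a, b)] += 1
    return dict(strength)


def co_resource(triples):
    return co_relation(triples, 2)


def co_user(triples):
    return co_relation(triples, 0)


# Hypothetical tagging data.
sample = [
    ("alice", "python", "r1"), ("alice", "web", "r1"),
    ("bob", "python", "r1"), ("bob", "python", "r2"), ("bob", "news", "r2"),
    ("carol", "web", "r3"), ("carol", "news", "r4"),
]
print(co_resource(sample))
print(co_user(sample))
```

In this sample, "news" and "web" are Co-User related through carol but never share a resource, which shows how the two relations can diverge.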
Fig. 1 A snapshot of a tag cloud on del.icio.us.
2. The run of learning object repositories
Conventional repositories and digital libraries focus on content classification. Research in this area attempts to organize content by subject or by suitable age group. Each popular system provides well-organized content for users to browse and search. However, as Web 2.0 grows rapidly, user participation plays an increasingly significant role in content sharing systems. Therefore, this investigation discusses the user-oriented aspects of the information architecture [3] of some popular on-line repositories and digital libraries, namely EtoE, GEM, MERLOT, EdNA and JISC. Information architecture is the practice of structuring information for a purpose. These architectures are often structured according to their context in user interactions or larger databases [1]. Some of these libraries, such as EtoE, GEM and JISC, do not provide a user-oriented architecture. Other systems, including MERLOT and EdNA, provide some user-oriented architecture by incorporating a user rating module. To enhance user participation in such systems, this investigation recommends incorporating a refined cooperative tagging module of the kind that is popular in Web 2.0 content sharing systems.
3. Running Folksonomy
Although the Folksonomy collaborative tagging service is quite successful and popular, it still has some limitations [4, 5, 6, 7, 8]:
1. Homonymy (polysemy): A water filter is a very different subject from Bayesian filtering.
2. Synonyms (including plurals and conjugations): “cats”, “cat”, “feline” and “meowmeow”.
3. Acronyms: MIT can be an acronym for “Made in Taiwan” and for “Massachusetts Institute of Technology”.
4. Spaces, symbols and multiple words: “nyc”, “NewYork”, “newyorkcity” and “new_york”.
5. Meta noise: incorrectly spelled or irrelevant tags.
6. Basic level variation: Documents tagged with “Perl” or “JavaScript” may be too specific for some users, while a document tagged “programming” may be too general for others.
7. Public tags and private tags: The tag “car” on a Porsche car is a public tag. Conversely, tags like “my” or “aim” are private tags because they are used for personal reasons.
The Meta Noise problem can be reduced simply with a tag recommendation mechanism. The 2nd, 3rd, 4th and 6th limitations above have a similar effect on search: they decrease recall. Therefore, this investigation constructs a concept hierarchy of tags, and applies this hierarchy to raise the recall of search without much loss of precision, or to recommend relevant tags to users.
4. Concept relation of tags and related researches
The discussion of tagging systems has been increasing in recent months. Research on obtaining information from tags is increasing as the limitations of tagging systems become clearer. “WWW 2006” held a Collaborative Web Tagging workshop to explore this issue. The most common way to gather additional information in collaborative tagging systems is clustering. At that workshop, Begelman, Keller and Smadja presented an “Automated Tag Clustering” mechanism [9]. They apply the “Co-tags” relation to construct a relation graph of tags, and then recursively run a partition algorithm to construct tag clusters. Similar research was performed in Taiwan in 2006. Tag clustering was also examined in a Master thesis from National Taiwan University in 2005 [10], where a Web-based clustering approach and a graphical approach were employed to build clusters of tags. Schmitz presented a faceted ontology derived from flickr.com with a subsumption-based model and a probabilistic model [11]. Schmitz also noted that images are annotated and most easily retrieved when emphasizing several key facets, namely place, activity and depictions of tags.
In contrast to clustering and facets, this investigation presents a concept space, which is represented as a concept hierarchy. Since the tags are organized hierarchically, a tag is not connected to every concept related to it when it is added to the concept space. Instead, if a tag A is added to the concept hierarchy and is considered to be related to a tag B, then there is no longer any need to determine whether tag A is related to tag B’s ancestor nodes. Additionally, because tags are organized hierarchically, this concept hierarchy can yield further information for recommendation and search. The proposed method is described in the next section.
Fig. 2 System overview. [Figure: user functions (Upload/Remark Learning Resources, Navigation, Tag/Keyword Search, Search results) connect to the Collaborative Content Sharing Module, which comprises the Tag Editing Interface, Related Tag Generator (tag recommendation), Query Analyzer and Ranking (concept based search), Topic Map, Concept Space and Concept Space Generator; the module draws on the tags, SCORM objects and User Remarks stored in the Learning Content Management System Repository.]
5. System Framework
Fig. 2 illustrates an overview of the proposed system. The system includes a tag editing interface that allows users to upload or comment on learning resources. After user remarks are accumulated, a concept space generator is utilized to construct a concept space. Additionally, the other refined modules can provide their services based on the concept space.
5.1 Concept Space Generator
The Concept Space Generator analyzes the tag space and establishes a concept hierarchy among tags. Fig. 3 depicts an example of how to construct a concept hierarchy. The example scenario has four tags and three learning resources. An arrow pointing from a tag to a learning resource indicates that the tag is applied to the resource. A number next to an arrow denotes the number of times that the tag is applied to the resource.
Fig. 3 A scenario of Concept Space Generator
Step 1: Use tag vectors to describe tags. For each tag tj, a vector Tj represents the relation between the tag and all resources. Each element Tj[i] denotes the number of times that tag tj is applied to resource i in the tag space. For instance, the tag “programming” describes the three learning resources 8, 3 and 7 times, so its tag vector is [8, 3, 7].
Step 2: Sort the tag vectors by the following criteria:
(1) The number of learning resources to which a tag is applied.
(2) The total use count, if two tags have the same value for criterion (1).
(3) The first time that a tag is used, if two tags have the same value for criteria (1) and (2).
Step 3: Use the sorted list from Step 2 to establish a concept hierarchy.
(1) Given a tag A, the concept hierarchy is checked to determine whether any tag B exists whose description resource set R_B is related to tag A’s description resource set R_A by:
$$ \frac{|R_A \cap R_B|}{|R_A|} > \lambda \qquad (1) $$
(2) λ denotes a threshold between 0 and 1 that sets the lower bound on how strongly two tags must be related in the concept hierarchy. That is, the closer λ is to 1, the stronger the relation between two tags must be before they are linked. If inequality (1) holds, then tag B is considered to be tag A’s parent node, and the cosine similarity is adopted to determine the concept distance.
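Steps 1 to 3 can be sketched in Python as follows. Several details are assumptions, since the text does not fix them: the sort is taken as descending in resource count and total use count and ascending in first-use time, so that broader tags are placed first; when several placed tags satisfy inequality (1), the one with the highest cosine similarity is chosen as parent; and all tag vectors except the one for “programming”, together with the `first_used` ranks, are hypothetical data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two tag vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def resource_set(vec):
    """Indices of the resources a tag is applied to (R_A in inequality (1))."""
    return {i for i, n in enumerate(vec) if n > 0}


def build_hierarchy(tag_vectors, first_used, lam=0.5):
    """Return {tag: (parent, similarity)}; roots map to (None, None)."""
    # Step 2: sort by resource count, then total use count, then first use.
    # The sort directions are an assumption (broadest tags first).
    order = sorted(
        tag_vectors,
        key=lambda t: (-len(resource_set(tag_vectors[t])),
                       -sum(tag_vectors[t]),
                       first_used[t]),
    )
    hierarchy = {}
    for a in order:
        r_a = resource_set(tag_vectors[a])
        best, best_sim = None, None
        for b in hierarchy:  # only tags already placed can become parents
            r_b = resource_set(tag_vectors[b])
            # Step 3 (1): inclusion ratio |R_A ∩ R_B| / |R_A| > lambda
            if r_a and len(r_a & r_b) / len(r_a) > lam:
                sim = cosine(tag_vectors[a], tag_vectors[b])
                # Assumed tie-break: keep the most similar qualifying tag.
                if best_sim is None or sim > best_sim:
                    best, best_sim = b, sim
        hierarchy[a] = (best, best_sim)
    return hierarchy


# "programming" uses the vector from the text; the rest are hypothetical.
vectors = {
    "programming": [8, 3, 7],
    "python": [0, 2, 6],
    "java": [5, 0, 0],
    "django": [0, 0, 3],
}
first_used = {"programming": 1, "java": 2, "python": 3, "django": 4}
tree = build_hierarchy(vectors, first_used, lam=0.5)
for tag, (parent, sim) in tree.items():
    print(tag, "<-", parent, sim)
```

With these sample vectors, "django" qualifies under both "programming" and "python" but attaches to "python", the more similar tag, so it is reachable from "programming" only through its parent.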
5.2 Related Tags Generator and Query Analyzer
After the concept hierarchy is constructed, its relations can be applied to recommend tags to users when they are posting (or uploading) a resource. Recommendations can be based on the closest tags in the concept hierarchy, or drawn from ancestor nodes, descendant nodes and sibling nodes. The concept hierarchy can also help to refine the search module to improve recall. The related tags can be considered in order to improve the relevance of results, and the concept distance of a relation can help in ranking the results list.
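One way to read this section is sketched below in Python. The hierarchy is given as a hypothetical `edges` mapping from each tag to its parent and the similarity on that link. Expanding a query to its descendant tags and weighting each by the product of similarities along the path is an assumption made for illustration; the paper states only that related tags and concept distance are used, not how they are combined.

```python
def ancestors(edges, tag):
    """Tags on the path from `tag` up to the root, nearest first."""
    found = []
    while edges.get(tag, (None, None))[0] is not None:
        tag = edges[tag][0]
        found.append(tag)
    return found


def children(edges, tag):
    """Direct child tags of `tag`, in alphabetical order."""
    return sorted(t for t, (parent, _) in edges.items() if parent == tag)


def siblings(edges, tag):
    """Other tags that share the parent of `tag`."""
    parent = edges[tag][0]
    return [] if parent is None else [t for t in children(edges, parent) if t != tag]


def expand_query(edges, tag):
    """Weight the query tag 1.0 and each descendant by the product of the
    link similarities on its path (an assumed weighting scheme)."""
    weights = {tag: 1.0}
    stack = [tag]
    while stack:
        current = stack.pop()
        for child in children(edges, current):
            weights[child] = weights[current] * edges[child][1]
            stack.append(child)
    return weights


def search(edges, tagged, query):
    """Rank resources by the best weight among their matching tags."""
    weights = expand_query(edges, query)
    scores = {}
    for resource, tags in tagged.items():
        score = max((weights[t] for t in tags if t in weights), default=0.0)
        if score > 0:
            scores[resource] = score
    return sorted(scores, key=lambda r: (-scores[r], r))


# Hypothetical hierarchy: tag -> (parent, similarity to parent).
edges = {
    "programming": (None, None),
    "python": ("programming", 0.69),
    "java": ("programming", 0.72),
    "django": ("python", 0.95),
}
# Hypothetical resources and the tags applied to them.
tagged = {"r1": {"java"}, "r2": {"python"}, "r3": {"django"}, "r4": {"news"}}

print(ancestors(edges, "django"), siblings(edges, "python"))
print(search(edges, tagged, "python"))
```

A plain tag match on "python" would return only r2; the expanded search also returns r3, which is tagged with the descendant "django", ranked below the exact match.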
6. Experiment
To verify how the proposed concept hierarchy can help in searching, a simple tagging system was built, and ten co-workers in our office were invited to use it. Each co-worker was invited to provide five web sites. After the fifty web sites were collected, the subjects were asked to tag each of them. A total of 413 tags in 1036 tagging actions were gathered after around two weeks. The concept hierarchy was constructed from the tagging data with λ = 0.5. Before the graph was constructed, tags used by only one person or on only one web site were filtered out.
An example of applying this hierarchy to improve searching is presented. The recall when searching on the keyword “新聞” (a Chinese word meaning “News”) rose from 75% (9 of the 12 items that should be found) to 92% (11/12) with little loss of precision (100% to 92%), and the recall when searching on “Web2.0” increased from 58% (7/12) to 83% (10/12), also with a small loss of precision (100% to 91%). These results indicate that the concept hierarchy can be very helpful in improving searching.
However, the next problem is when these search data should be utilized. This investigation found that a refined search module (query analyzer) cannot replace the original search module, and recommends using the module in a search-refinement mode. That is, when a user performs a search, the system shows some related tags for search refinement, enabling the user to decide which tags to use.
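The reported percentages can be checked with a short Python calculation. The counts of relevant items found come from the text; the sizes of the returned result lists (9 and 12 for “新聞”, 7 and 11 for “Web2.0”) are not stated in the paper and are inferred here from the reported precision values, so they should be read as assumptions.

```python
def recall(relevant_found: int, relevant_total: int) -> float:
    """Share of the items that should be found that were actually found."""
    return relevant_found / relevant_total


def precision(relevant_found: int, returned_total: int) -> float:
    """Share of the returned items that are relevant."""
    return relevant_found / returned_total


def percent(value: float) -> int:
    return round(100 * value)


# (query, relevant found, relevant total, returned total [inferred])
runs = [
    ("新聞 before", 9, 12, 9),
    ("新聞 after", 11, 12, 12),
    ("Web2.0 before", 7, 12, 7),
    ("Web2.0 after", 10, 12, 11),
]
for name, found, relevant, returned in runs:
    print(name, percent(recall(found, relevant)), percent(precision(found, returned)))
```

Under these assumptions, 10 relevant items out of 12 gives the 83% recall for “Web2.0”, while 10 relevant items out of 11 returned gives its 91% precision.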
7. Conclusion and Future work
This investigation has proposed a cooperative tagging module and a concept space generator for content sharing. By making sharing easy and searching smarter, the refined searching module and the refined cooperative tagging module can encourage people to contribute more learning resources and to find them more easily. The following future work is planned. Research will be undertaken to identify the personal tags in the entire concept space. The public domain concept hierarchy can be improved by distinguishing between personal and public tags. Additionally, a personal concept hierarchy can be generated for each user. Once the personal and public domain concept hierarchies have been constructed, a further issue is how to integrate the two hierarchies to improve searching and recommendation.
Acknowledgements
This research was supported by the III Innovative and Prospective Technologies Project of the Institute for Information Industry and sponsored by MOEA, ROC.
References
[1] Wikipedia: http://en.wikipedia.org/
[2] Thomas Vander Wal: Folksonomy presentation at the Online Information Conference in London, http://www.onlineinformation.co.uk/ol05/day2.html, (2005)
[3] L. Rosenfeld and P. Morville: “Information Architecture for the World Wide Web”. California: O'Reilly Media, Inc, (2002)
[4] Ulises Ali Mejias: “Tag Literacy”, http://ideant.typepad.com/ideant/2005/04/tag_literacy.html, (2005)
[5] Scott A. Golder, Bernardo A. Huberman: “The Structure of Collaborative Tagging Systems”, Tech Report of Information Dynamics Lab, HP Labs, (2005)
[6] Marieke Guy, Emma Tonkin: “Folksonomies: Tidying up Tags?”, D-Lib Magazine, Volume 12 Number 1, http://www.dlib.org/dlib/january06/guy/01guy.html, (2006)
[7] Peter Merholz: “Metadata for the Masses”, http://www.adaptivepath.com/publications/essays/archives/000361.php, (2004)
[8] Luigi Canali De Rossi: “Folksonomies: Tags Strengths, Weaknesses And How To Make Them Work”, http://www.masternewmedia.org/news/2006/02/01/folksonomies_tags_strengths_weaknesses_and.htm, (2006)
[9] G. Begelman, P. Keller, and F. Smadja: “Automated tag clustering: Improving search and exploration in the tag space”, Collaborative Web Tagging Workshop, 15th International World Wide Web Conference 2006, (2006)
[10] T.Z. Yu, L.F. Chien: “Automatic organization of user-generated tags from the Web”, Master Thesis of National Taiwan University, (2006)
[11] Patrick Schmitz: “Inducing Ontology from Flickr Tags”, Collaborative Web Tagging Workshop, 15th International World Wide Web Conference 2006, (2006)