Pedagogic Metadata - DLESE

Pedagagic Metadata: 1

Pedagogic Metadata
Robby Robson, Oregon State University
[email protected]
http://osu.orst.edu/~robsonr/

THIS IS A DRAFT OF A PAPER THAT IS INTENDED TO BE AN OVERVIEW ARTICLE IN A SPECIAL ISSUE OF INTERACTIVE LEARNING ENVIRONMENTS DEDICATED TO METADATA AND EDITED BY ERIC DUVAL, ROY RADA, AND THE AUTHOR. PLEASE DO NOT REDISTRIBUTE WITHOUT PERMISSION.

Abstract

As more and more online resources become available, finding ones suitable for specific educational purposes is becoming increasingly difficult. Not only must the subject matter be appropriate and the content be accurate, but the resources must also match the educational level and background of the user. Once a suitable Web site, graphic, applet, or other resource has been located, additional problems must be faced if it is to be integrated into a learning environment. There might be software incompatibilities, legal issues, and questions concerning how it will interface with other components. This article offers a non-technical introduction and overview of metadata, an important and fascinating part of the solution to these problems. It gives the definition and examples of metadata, shows how metadata can help non-experts search for online resources, and explains how metadata can assist in the use and re-use of online pedagogic materials. It ends with a discussion of Learning Object Metadata, the IEEE Learning Technology Standards Committee standard that has recently been accepted by the major pedagogic metadata efforts.

Pedagogic Metadata

Introduction

Metadata. The word metadata means data about data. Metadata tells us the contents and properties of a collection of data without requiring us to look through the data themselves. The data, for the purposes of this article, are electronic data, primarily those stored and delivered via the Internet and used for educational purposes. The general term for such data is pedagogic hypermedia. This includes Web pages but also graphics, digitized video, Java applets, and documents written for use with specialized software, e.g., worksheets written within a computer algebra system or multiple choice quizzes that can be used within a particular online learning environment (Looms, 1999).

Two Fundamental Problems. There are two fundamental problems faced by the would-be user of pedagogic hypermedia. First, appropriate material must be found. Second, the material found must function in the user’s environment. The first problem is that of searching. The second is that of reusability.

Experienced Web users are familiar with the problem of searching. It is becoming increasingly difficult to find useful and relevant material on the Web. One problem is that much of the Web is not indexed. This is especially true of educational sites (Thompson, 1999). A second problem is that search engines return large numbers of irrelevant results and often don’t find what is wanted. Metadata cannot help with the first problem but is designed to assist with the second.

The problem of reusability is perhaps less recognized but is equally essential and much harder. Technological requirements and standards pose an obvious barrier to using material found on the Internet. An application written for a Unix platform will usually not run on a Windows machine. Metadata can help with this problem by appropriately labeling software. A more insidious barrier to reusability is that most academic authors write material for local delivery within highly specialized contexts. Simple practices such as hard-coding navigational aids make it impossible to extract a Web page from one site and insert it into another without detailed editing. Awareness of metadata can encourage better practice, but the real contribution of metadata is to support the ability of specialized environments, taken as entities themselves, to communicate effectively with each other.

In this article we will explain, in terms that we hope are accessible to anyone who has used the Internet in a classroom, a little more about how metadata works and what it is intended to do. We will also briefly review the current state of pedagogic metadata without going into too many details. This article is meant as an overview and introduction, not a technical reference.

The Library Catalog: An Example of Metadata

Library Classification Schemes. A useful and germane analogy to searching for hypermedia on the Web is that of finding reference material in a library. A small personal library can be successfully organized in any way that suits the owner, including not being organized at all. Larger collections would be useless were it not for schemes such as the Dewey decimal system and the Library of Congress classification. These schemes are called subject classifications. Labeling a book with its subject classification is an example of metadata. Other metadata that appear on every book in a public or university library include the title, the authors, the ISBN number, the publisher, and the publication date. Notice that all of this information is available without reading a single word of the book and in fact is available in a catalog (usually electronic but formerly consisting of cards) that is physically and conceptually separate from the books themselves.

Key Points About Metadata. The library catalog is an excellent example of metadata. Thinking about the types and uses of the information contained in a catalog serves as a guide and illuminates some of the key points about metadata in the electronic medium:

• The need for metadata is a function of the size of a collection. This observation is particularly relevant to the Web. Estimates of the number of Web pages range from 300 million on publicly available sites (OCLC 1999b) to a total of 800 million (Thompson, 1999). By way of contrast, the Library of Congress (1999) collections include “only” 17 million books, 2 million recordings, 12 million photographs, 4 million maps, and 50 million manuscripts. Moreover, the Web is growing at a rapid exponential rate (Internet Software Consortium, 1999) and projects like JSTOR (1999) are converting legacy print documents to electronic format. The Web is adding at least an order of magnitude to the size of the search problem.

• To be useful, metadata must be standardized. The standards need not be perfect, but they must be agreed upon by the user community. It would be a disaster if every library used its own proprietary subject classification system.

• Metadata describes many aspects of a resource. The subject classification is just one. Authorship, date and place of publication, copyright ownership, and language are also part of the metadata in a card catalog. For electronic media, technological requirements become important, and for educational applications, metadata addressing the pedagogic aspects of a resource play a central role.

• Metadata may be associated with resources but is not necessarily attached directly to them. The library model, as well as the online model, is based on a catalog of pointers to resources. Each pointer is labeled with a description of the resource to which it points. This allows multiple descriptions and multiple cataloging of the same resource. It also permits descriptions to be extended or modified at a later date without changing the resources themselves.

• Metadata is needed for intellectual and commercial property rights. Copyrights depend on the ability to identify the copyright owner and bookstores depend on ISBN numbers and other metadata when placing an order. The online situation is more complex (World Wide Web Consortium, 1998 and 1999) because there are multiple business models (e.g., freeware, shareware, commercial ware, and open source software) and copyrights are harder to enforce because of the ease of reproduction, but this makes the existence of appropriate metadata even more important.
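The idea of a catalog of labeled pointers, kept separate from the resources themselves, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the field names, and the example.org URL are hypothetical and are not taken from any metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One description of a resource; the resource itself is stored elsewhere."""
    pointer: str              # location of the resource (hypothetical URL)
    title: str
    subject: str
    language: str = "en"
    rights: str = "unknown"


@dataclass
class Catalog:
    """A collection of descriptions, physically separate from the resources."""
    entries: list = field(default_factory=list)

    def add(self, entry: CatalogEntry) -> None:
        self.entries.append(entry)

    def describe(self, pointer: str) -> list:
        """Return every description that points at the given resource."""
        return [e for e in self.entries if e.pointer == pointer]


catalog = Catalog()
url = "http://example.org/lunar-phases.html"  # hypothetical resource

# Two independent descriptions of the same resource; the resource is untouched.
catalog.add(CatalogEntry(url, "Phases of the Moon", "astronomy"))
catalog.add(CatalogEntry(url, "Phases of the Moon", "earth science", rights="freeware"))

subjects = sorted(e.subject for e in catalog.describe(url))
print(subjects)
```

Because the catalog holds only pointers, a second cataloger can add or revise a description at any time without editing the page that is being described.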

Despite all of the other metadata associated with a book, the pieces that are most visibly useful are the title and subject classification. The reason is that these facilitate searching. Searching, especially in the online pedagogic setting, is the topic of the next section.

Searching

Search Methods. There are currently two standard methods of searching for online documents. One is a keyword search. A search engine can look for a combination of keywords either in the body of a document or in a separate list of keywords associated with the document. (Keywords associated with a document are a form of metadata.) Generalizations of keyword searches include weighted keyword searches and more sophisticated methods like latent semantic analysis (Deerwester et al., 1990) that reduce a large number of possible keywords to a small set which efficiently describes the search space. The other standard type of search uses a browse structure. A browse structure is a tree through which one can descend, moving from the more general to the more specific. Familiar subject classifications are examples of browse structures: one might start with “science”, descend to “astronomy”, and then descend to “lunar and planetary astronomy”, and so on. Web search engines like Yahoo (1999) and Infoseek (1999) display chains of categories that constitute parts of a browse structure, as do digital libraries like GEM (1999).

Difficulties in the Pedagogic Domain. Both keyword searches and browse structures suffer from the same drawback in the pedagogic domain. To effectively use a keyword search, or to browse a subject classification, the user must be relatively expert in the vocabulary or logical structure of the field of study. Metadata offers at least a partial solution to this. If a metadata description is associated with a pedagogic resource, then a search engine can look for resources with similar descriptions. The user need not specify (or even see) the metadata used for the search.

The Problem of Context. The lack of subject knowledge is only one aspect of a more general problem, that of the user’s context. Finding suitable hypermedia often depends on parameters such as the user’s educational level, the language in which a resource is written, time available for using a resource, available technology, cost of the resource, the copyright status of the resource, and any number of other variables. Parameters that speak to these issues are included in standardized metadata schemes and can enable search engines or search agents to do a far better job of returning useful information than is now possible.

It is important to point out that parameters such as educational level cannot be utilized unless their values are known for the individual user. This means that the search software must negotiate these values in some way. This is exactly what occurs when a student asks a reference librarian for help. In a typical scenario, a student might make some statements about what type of materials are sought and perhaps mention a class or an assignment. For example, a student might ask for journals on astrophysics. The librarian might then ask for what purpose the journals are needed. The student might respond that they are needed for a report due in a first-year topics in science class. On the basis of this response the librarian might then judge that professional journals are far less appropriate than expository sources, despite the student’s initial request. If the Web is to be a useful repository of information used for educational purposes, it will be necessary to develop online environments that mimic this type of human-human interaction.
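A context-aware search of the kind described above might look roughly like the following Python sketch. It assumes that each resource already carries metadata for level, language, and cost, and that the user's context has somehow been established beforehand; all class names, field names, and sample records are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    title: str
    keywords: tuple
    level: str        # hypothetical labels: "first-year", "professional"
    language: str
    cost: float


@dataclass
class UserContext:
    level: str
    language: str
    max_cost: float


def search(resources, keyword, context):
    """Keyword match first, then filter on the user's context."""
    keyword = keyword.lower()
    return [
        r for r in resources
        if keyword in r.keywords
        and r.level == context.level
        and r.language == context.language
        and r.cost <= context.max_cost
    ]


resources = [
    Resource("Research article on stellar spectra", ("astrophysics",),
             "professional", "en", 30.0),
    Resource("An expository introduction to astrophysics", ("astrophysics",),
             "first-year", "en", 0.0),
]

# The student asks for "astrophysics"; the context steers the result
# toward the expository source, as the librarian would.
student = UserContext(level="first-year", language="en", max_cost=0.0)
titles = [r.title for r in search(resources, "astrophysics", student)]
print(titles)
```

The keyword alone matches both records. Only the context parameters, which the user never has to type as search terms, separate the expository source from the professional one.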

Interoperability and Other Roles

Property Rights. Metadata is not just a means to improve searching. It is also needed to address issues of interoperability, academic validity, and property rights. As was pointed out earlier, property rights are more complex for online material than for printed material. This is true of both commercial rights, which depend on business and distribution models, and intellectual property rights, which still are the subject of intense public and legislative debate (Bradley, 1998). As in the case of traditional copyright notices or licenses, metadata does not protect rights but rather exposes them and gives the information needed for the user, or the user’s software, to behave in accordance with them. Examples of this may be found in Educational Object Economies (1999). Educational object economies are collections of online educational resources that can be downloaded under various terms of use and availability. Metadata is used to specify these terms.

Interoperability. Interoperability refers to the ability of hypermedia resources to successfully interface with each other and to be used in contexts other than the specific ones for which they were developed. The two keys to interoperability are standardization and documentation. Metadata depends on standardization and addresses the problem of documentation. A good example of interoperability metadata is that of a MIME type (Hood, 1998). The MIME type of a document consists of a type and a subtype, separated by a slash, that tell an email client or Web browser what type of document is being delivered. For example, a Microsoft Excel spreadsheet has MIME type application/vnd.ms-excel, a GIF image has MIME type image/gif, and a page of HTML has MIME type text/html. The receiving software can take different actions dependent on this information. It might launch an application to view the document, save the document to the user’s hard drive, or process the document directly. This adaptive behavior is keyed by one small piece of metadata. Without MIME types, programs like Web browsers would either have to assume that incoming documents all had the same format or try to deduce the format from the content of the document itself.

Software interoperability of the sort supported by MIME types is important in online learning because of the large number of software packages and interactive applications that are used as educational tools, e.g., see (GEM, 1999). It is highly desirable to share these in the educational community and to spare the effort of constantly re-writing them. A basic prerequisite for sharing is the existence of metadata that allows a learning environment to determine if a particular application can be used. That will handle stand-alone applications. The next and more difficult step is to standardize the interfaces among applications and learning environments. For example, if a Java applet can be used as an assessment tool, it may not suffice to simply know whether a student’s computer can run the applet. In addition, it might be necessary to record that the assessment has been completed or to record the results in an electronic grade book. This requires standardization of the formats in which assessments are recorded.
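The MIME-keyed behavior described earlier in this section can be illustrated with a short Python sketch. The function name and the action labels are hypothetical, and real browsers apply far more elaborate rules; the point is only that the decision is made from the label, without inspecting the document.

```python
def choose_action(mime_type: str) -> str:
    """Pick a handling strategy from the MIME type alone."""
    major, _, minor = mime_type.lower().partition("/")
    if not major or not minor:
        return "save"             # malformed label: fall back to saving
    if (major, minor) == ("text", "html"):
        return "render"           # process the document directly
    if major == "image":
        return "display"
    if major == "application":
        return "launch-helper"    # hand off to an external application
    return "save"                 # unknown family: save to disk


for label in ("text/html", "image/gif", "application/octet-stream"):
    print(label, "->", choose_action(label))
```

A learning environment could apply the same pattern to richer technical metadata, for example to decide whether an applet can run on a student's computer before offering it.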

There are major efforts underway to promote interoperability and create these types of standards. Among these are the Instructional Management Systems project (IMS, 1999), the IEEE Learning Technology Standards Committee (IEEE, 1999), ARIADNE (ARIADNE, 1999), and the ESCOT project (ESCOT, 1999). The first two are concerned primarily with creating a metadata-rich infrastructure to support interoperability, whereas the last two are efforts to create digital libraries of interoperating educational hypermedia that can be exchanged among their users.

Validating Resources. In addition to interoperability and property rights management, metadata is needed to serve another important role, that of validating a resource. In the world of printed documents, the mere publication of a document by an academic publisher carries with it an implied warrant of legitimacy. But there is no mechanism that performs the same function in the online world. Although it does not speak to the core issue of how and by whom content will be branded as trusted, metadata offers the small but significant service of providing a place for the branding to appear. A likely scenario for the validation of academic hypermedia is one in which large digital libraries or “portals” provide de facto validation by including a resource in a digital collection. The digital libraries will want a complete set of pedagogic metadata to be associated with every resource in their collection. The metadata itself will probably first be generated by authors as part of the scholarly process and then reviewed by cataloging librarians. The association of metadata with scholarship will promote its use and, hopefully, cast the inclusion of good metadata into the role of an indicator of quality.

Learning Object Metadata

The Metadata Scene. The final portion of this paper is devoted to reporting on existing standards, on what work has been done, and on what work needs to be done. It will not come as a surprise that a number of separate organizations have developed pedagogic metadata and that they have to some extent taken separate paths. The Dublin Core Metadata Initiative (OCLC, 1999a), which applies to more than pedagogic metadata, held its first workshop in March, 1995, sponsored by the Online Computer Library Center and the National Center for Supercomputing Applications. Subsequent to that, the most significant academically oriented organizations proposing metadata standards have been

• The IEEE Learning Technology Standards Committee (IEEE, 1999), which held its first metadata working group meeting in December, 1996.

• The Instructional Management Systems project (IMS, 1999), which was started by Educom (now Educause) and held its first metadata meeting in May, 1997.

• The Alliance of Remote Instructional Authoring and Distribution Networks for Europe (ARIADNE, 1999), which started in January of 1996.

• Getting Educational Systems Talking Across Leading-Edge Technologies (GESTALT, 1999), which was started in September of 1998.

On the industrial side, the most active organization has been the Aviation Industry CBT Committee (AICC, 1999), and on the government side the biggest project with a stake in pedagogical metadata is the Department of Defense and White House Advanced Distributed Learning network (ADL, 1999).

What is perhaps surprising is that there has been significant convergence of standards. As of August, 1999, the IMS project, ARIADNE, ADL (which uses IMS metadata), and AICC (where applicable) are compatible with a single standard known as Learning Object Metadata (LTSC, 1999) produced by the IEEE Learning Technology Task Force. This standard, in turn, is largely compliant with the more general Dublin Core standards.

Some Details. This is not the place to treat the structure of Learning Object Metadata in great detail but it seems appropriate to discuss its general outline. This will help elucidate the ability of metadata to support the uses discussed above.

Learning Object Metadata, as defined by the IEEE, is a tree structure. The “root” of the tree (or better yet, the “trunk”) is the resource being described. The next level in the tree consists of nine main branches that are called categories:

• General. Context-independent features of the resource.
• Lifecycle. Authorship, ownership, etc.
• Meta-metadata. Describes what metadata scheme(s) are being used.
• Technical. Describes the format and the technical requirements needed to use the resource.
• Educational. Educational and pedagogical features of the resource.
• Rights. Refers to intellectual property rights.
• Relation. Describes the relation of the given resource to other resources.
• Annotation. Allows for comments on the educational use of the resource.
• Classification. Taxonomic classification of the resource. (Could be subject matter, educational objective, accessibility requirements, etc.)

Each category may in turn have its own branches and each branch may branch again, thereby providing increasingly specific information about the resource being described. Figure I shows a part of this tree (going left to right) for a hypothetical document.

Figure I. A Metadata Tree. Reproduced from (Anderson et al., 1999). The information shown tells us that the document (the “root”) is called “Becoming a Metadata Expert”, that this title is in US English, and that it has ISBN number 8-7569-4062. The interested reader is referred to (Anderson et al., 1999) for further information and examples.
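The tree described in the caption can be approximated by a nested Python dictionary. This is a minimal sketch and not the IEEE or IMS binding: the branch names below General ("Title", "Identifier", "Language", "Scheme", "Value"), the "en-US" language code, and the lookup helper are assumptions made for illustration. Only the title, the language, and the ISBN come from the figure description.

```python
# The resource is the root; categories are the first level of branches.
record = {
    "General": {
        "Title": {
            "Language": "en-US",                    # assumed code for US English
            "Value": "Becoming a Metadata Expert",
        },
        "Identifier": {
            "Scheme": "ISBN",
            "Value": "8-7569-4062",
        },
    },
}


def lookup(tree, path):
    """Follow a dotted path such as 'General.Title.Value' down the tree.

    Returns None when any branch along the path is missing.
    """
    node = tree
    for name in path.split("."):
        if not isinstance(node, dict) or name not in node:
            return None
        node = node[name]
    return node


print(lookup(record, "General.Title.Value"))
print(lookup(record, "General.Identifier.Value"))
print(lookup(record, "Rights.Cost"))   # branch not present in this record
```

A dotted path is simply a walk from the trunk out along successive branches, so software can ask for exactly the piece of the description it needs.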

The Elements of Learning Object Metadata. Both the nature and the form of the various metadata elements are specified by the Learning Object Metadata standard (LTSC, 1999). For example, the difficulty of a resource is an element (denoted by Educational.Difficulty) whose values can be very low, low, medium, high, and very high. Some attributes are described using unrestricted lists of keywords while others use numerical values. Some (like MIME type) use words with format restrictions and some use lists of keywords that are envisioned as generated by best practices. The most complex and open-ended element is called a taxonomy and goes under the classification category. As in the case of the Library of Congress Classification, a taxonomy can be used to describe the subject matter of a hypermedia resource, but a taxonomy could equally well be used to classify a different attribute such as pedagogic approach.

Disciplinary Metadata. Although the existing elements of Learning Object Metadata are powerful enough to describe the subject matter, level, difficulty, intended audience, pedagogical approach, and many other educationally oriented attributes of a resource, it must be recognized that no standard is ever complete or perfect. Learning Object Metadata takes this into consideration by allowing for the judicious extension of the standards as dictated by need. Moreover, a general standard cannot and should not specify anything other than a framework. The details of which lists of keywords or which taxonomies to use, as well as the interpretation of terms like “very difficult”, must be answered by specialized communities of users. In the case of pedagogic metadata, these communities are naturally formed by academic disciplines.

A number of academic disciplines are in the process of specifying the details of metadata for their own use. For example, the author of this article is chairing a national effort to define and implement pedagogic metadata standards for the discipline of mathematics (Robson, 1999). Each such attempt potentially reveals strengths and flaws in the structure of Learning Object Metadata and creates needs for extensions. Each such attempt potentially also adds to the understanding of metadata itself. To illustrate the challenges, consider the problem of defining a subject classification taxonomy (in any discipline) that is of use to naïve and non-expert users. This has proved to be a most challenging task. One reason is that searches should be able to move from a given search term to more general and more specific terms, but the notions of “more general” and “more specific” can simultaneously refer to natural language interpretations, technical interpretations, and definitional interpretations of the same terms. This makes it very hard (some would argue impossible) to structure taxonomies in traditional ways, and new ways have yet to be tested in the field.
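The division of labor between a general framework and a disciplinary community can be sketched as follows. The five difficulty values are the ones listed for Educational.Difficulty; everything else, including the "Classification.Subject" element name, the three-term mathematics vocabulary, and the validate function, is a hypothetical illustration rather than part of any standard.

```python
# Restricted vocabulary fixed by the general framework.
FRAMEWORK_VOCABULARIES = {
    "Educational.Difficulty": {"very low", "low", "medium", "high", "very high"},
}

# Extension supplied by a disciplinary community (hypothetical terms).
MATHEMATICS_VOCABULARIES = {
    "Classification.Subject": {"algebra", "calculus", "geometry"},
}


def validate(record, vocabularies):
    """Return one error message per value that falls outside its vocabulary.

    Elements with no registered vocabulary are treated as unrestricted.
    """
    errors = []
    for element, value in record.items():
        allowed = vocabularies.get(element)
        if allowed is not None and value not in allowed:
            errors.append(f"{element}: {value!r} is not an allowed value")
    return errors


# The community's terms are layered on top of the framework's.
vocabularies = {**FRAMEWORK_VOCABULARIES, **MATHEMATICS_VOCABULARIES}

good = {"Educational.Difficulty": "medium", "Classification.Subject": "calculus"}
bad = {"Educational.Difficulty": "very difficult", "General.Title": "Limits"}

print(validate(good, vocabularies))
print(validate(bad, vocabularies))
```

Note that the check can say only that “very difficult” is not a permitted value. Deciding what “high” difficulty means for a calculus resource remains a judgment for the community that uses the vocabulary.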

Conclusion

The intent of this article was to give a non-technical overview of metadata and its pedagogical uses. The uses stressed were information retrieval tailored to the user (adaptive searching) and the use of hypermedia from diverse sources in the same online educational environment (re-usability and interoperability). Metadata has powerful applications in these and other areas of import to online learning. Pedagogic metadata standards are just becoming what might truly be called standards. In the course of implementing these standards, mistakes will undoubtedly be discovered and corrected. This should not, however, serve as a deterrent to including metadata and planning for metadata in pedagogic material developed from this point forward. There is great value in doing so.

References

ADL. (1999). Advanced distributed learning network. Retrieved September 15, 1999 from the World Wide Web: http://www.adlnet.net/
Anderson, T., McArthur, D., Griffin, S., & Wason, T. (1999). IMS learning resource metadata best practices and implementation guide (version 1.0). Retrieved September 15, 1999 from the World Wide Web: http://www.imsproject.org/metadata/index.html
ARIADNE. (1999). The alliance of remote instructional authoring and distribution networks for Europe. Retrieved September 15, 1999 from the World Wide Web: http://ariadne.unil.ch/
AICC. (1999). Aviation industry CBT committee. Retrieved August 15, 1999 from the World Wide Web: http://www.aicc.org/
Bradley, L. (1998, October 13). American library association Washington office newsline. Retrieved September 1, 1999 from the World Wide Web: http://www.ala.org/washoff/alawon/alwn7125.html

Deerwester, S., Dumais, S., Furnas, G., Landauer, T., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41, 391-407.
Educational Object Economies. (1999). Retrieved September 15, 1999 from the World Wide Web: http://www.eoe.org/
ESCOT. (1999). Educational software components of tomorrow. Retrieved September 15, 1999 from the World Wide Web: http://www.escot.org/
GEM. (1999). The Gateway to Educational Materials. Retrieved September 15, 1999 from the World Wide Web: http://www.thegateway.org/
GESTALT. (1999). Getting educational systems talking across leading-edge technologies. Retrieved September 15, 1999 from the World Wide Web: http://www.fdgroup.co.uk/gestalt/
Hood, E. (1998, July 11). Multipurpose Internet Mail Extensions. Retrieved September 19, 1999 from the World Wide Web: http://www.oac.uci.edu/indiv/ehood/MIME/MIME.html
IEEE. (1999). Learning Technology Standards Committee. Retrieved September 19, 1999 from the World Wide Web: http://ltsc.ieee.org/

IMS. (1999). The instructional management system project. Retrieved September 1999 from the World Wide Web: http://www.imsproject.org/
INFOSEEK. (1999). Retrieved September 19, 1999 from the World Wide Web: http://infoseek.go.com/
Internet Software Consortium. (1999). Internet domain survey host count. Retrieved September 22, 1999 from the World Wide Web: http://www.isc.org/dsview.cgi?domainsurvey/hosts.gif
JSTOR. (1999). Journal Storage: Redefining Access to Scholarly Literature. Retrieved September 19, 1999 from the World Wide Web: http://www.jstor.org/
Library of Congress. (1999). Fascinating facts about the Library of Congress. Retrieved September 1, 1999 from the World Wide Web: http://lcweb.loc.gov/today/fascinate.html
Looms, T. (1999). Survey of course and test delivery/management systems for distance learning. Retrieved July 21, 1999 from the World Wide Web: http://tangle.seas.gwu.edu/~tlooms/assess.html

LTSC. (1999). IEEE P1484.12 Learning Objects Metadata Working Group. IEEE Learning Technology Standards Committee. Retrieved August 29, 1999 from the World Wide Web: http://ltsc.ieee.org/wg12/index.html
OCLC. (1999a). Dublin Core Metadata Initiative. Online Computer Library Center. Retrieved September 22, 1999 from the World Wide Web: http://www.oclc.org/oclc/research/projects/core/oldindex.htm
OCLC. (1999b). OCLC research project measures scope of the Web. Online Computer Library Center. Retrieved September 22, 1999 from the World Wide Web: http://www.oclc.org/oclc/press/19990908a.htm
Robson, R. (1999). Welcome to the Mathematics metadata working group. Retrieved September 19, 1999 from the World Wide Web: http://math-classes.orst.edu/metadata/
Thompson, M. J. (1999, July 8). Web overwhelms search engines. PC World Online. Retrieved August 30, 1999 from the World Wide Web: http://www.pcworld.com/heres_how/article/0,1400,11704,00.html
YAHOO. (1999). Retrieved September 19, 1999 from the World Wide Web: http://www.yahoo.com/

World Wide Web Consortium. (1998, November 11). W3C Intellectual Property FAQ. Retrieved September 19, 1999 from the World Wide Web: http://www.w3.org/Consortium/Legal/IPR-FAQ.html
World Wide Web Consortium. (1999). W3C Copyright Document Notice and License. Retrieved September 20, 1999 from the World Wide Web: http://www.w3.org/Consortium/Legal/copyright-documents.html