Applying FRBR Model as a Conceptual Model in Development of Metadata for Digitized Thai Palm Leaf Manuscripts

Nisachol Chamnongsri¹, Lampang Manmart¹, Vilas Wuwongse², and Elin K. Jacob³

¹ Information Studies Program, Faculty of Humanities and Information Science, Khon Kaen University, Thailand
² School of Engineering and Technology, Asian Institute of Technology, Thailand
³ School of Library and Information Science, Indiana University, USA

Abstract. This paper outlines the adaptation of IFLA's Functional Requirements for Bibliographic Records (FRBR) for development of a metadata scheme to represent palm leaf manuscripts (PLMs) and facilitate their retrieval in digital collections. The FRBR model uses a structured, four-level hierarchy to represent an intellectual work with multiple titles, editions or formats. Because FRBR focuses on representation of the conceptual work rather than the physical entity, it must be modified for representation of PLMs. In this modified model, the level of work applies to the physical PLM rather than its conceptual content; expression applies to the languages in which the PLM occurs; manifestation applies to the formats in which each expression is available; and item applies to individual copies of a single format. The modified model has been used to devise a metadata scheme where each level has its own set of elements.

Keywords: Metadata, FRBR Model, Palm Leaf Manuscript, Digital Collection.

S. Sugimoto et al. (Eds.): ICADL 2006, LNCS 4312, pp. 254-263, 2006. © Springer-Verlag Berlin Heidelberg 2006

1 Introduction

In the new, knowledge-based world economic system, the production, dissemination and use of knowledge are crucial factors for enhancing economic growth, job creation, competitiveness and welfare [1]. In Thailand, the 9th National Economic and Social Development Plan, 2002-2006, proposes that Thailand become more competitive through innovation that integrates modern science and technology with “local wisdom”. This approach will lay the foundation of Thailand’s intellectual capital, which can be used to develop competency and strengthen economic and social foundations for long-term, sustainable growth [2]. For example, the One Tambon One Product (OTOP) projects encourage villagers in each district to create original products by applying local wisdom or resources together with modern technology, thereby giving old products added value and new marketplace appeal [3].

“Local wisdom”, which is being promoted as one of the country’s strengths, is a holistic, knowledge-based approach rooted in local circumstances such as the experiences and problem-solving skills of Thai ancestors [4] that are recorded in ancient documents. Palm leaf manuscripts (PLMs) are an ancient document form that comprises a significant documentary heritage of the Isan people of Northeastern Thailand. These manuscripts contain a vast amount of knowledge that can be classified in eight categories: Buddhism (or Religion), Tradition and Beliefs, Customary Law, Economics, Traditional Medicine, Science, Liberal Arts, and History. Seventy percent of the content recorded in PLMs consists of Buddhist stories and doctrine, and 30% records local wisdom in the form of folktales, diaries, poetry, ethics, customary law, rites and rituals [5], [6].

However, the local wisdom recorded in these manuscripts is often difficult to access and use. There are several obstacles to access: PLMs are scattered in many places, such as temples and households in rural areas, making them difficult to collect [7]; they are regarded as holy objects and their owners may not allow access to them; some of the original manuscripts have disappeared or been destroyed [8]; manuscripts that have survived are very fragile and easily damaged [9]; and, perhaps most importantly, the languages in which they are written are either archaic or undergoing shift. Additionally, access to the content of the original manuscripts is problematic because they are written in three archaic orthographies [10], requiring expert translation.

Creating digital collections of PLMs makes it possible to disseminate and provide access to these manuscripts; it is also another way to preserve the original documents. Metadata is required if digitized PLM collections are to be used efficiently. This paper outlines the adaptation of IFLA’s Functional Requirements for Bibliographic Records (FRBR) for development of a metadata scheme to provide effective and efficient access to PLMs in digital collections.

2 Information Representation in Digital Libraries

Because it is recognized that knowledge recorded in PLMs can be used to strengthen Thai economic and social foundations, several research and preservation projects are attempting to collect and register PLMs. In order to preserve the original manuscripts, PLMs are microfilmed, transcribed and digitized to make them easier to use. However, access to the knowledge contained in these manuscripts is limited because there is no plan for systematic management that would establish and maintain effective services for users. Additionally, because individual PLMs may be available in both archaic and modern languages as well as in several different formats, an effective system of information representation is needed to provide efficient access to and use of PLMs. Such a system would allow users to identify the desired PLM in the most appropriate format. At the same time, a well-structured system of representation would also be helpful in collection management, record updating and maintenance.

In the digital environment, different versions of a single intellectual work can be accessed under different titles and in various formats. To support users in searching for and accessing the content of PLMs and to reduce the cost and time involved in maintaining a digital collection, information retrieval systems need a more effective approach to representation than the traditional flat cataloging system. This new approach to cataloging would support a hierarchical catalog where each level in the hierarchy would inherit information from the preceding level, allowing various versions and formats of the same work to be cataloged more quickly and permitting catalog records to be stored and updated more efficiently [11]. To support this approach to cataloging, the description of each version or format of a work requires an explicit statement of the structural relationships between hierarchical levels as well as a clear definition of how each data element in the scheme is linked to the description of a particular work and its available versions or formats. Otherwise, digital materials may be unusable because they are not linked correctly to their related bibliographic records [12].

In addition, a digital format may be composed of several components (text, pictures, etc.) captured in various formats and stored in different places. In order to display information about a PLM (work) on screen in a logical way, structural metadata is required to model the object [13]. Moreover, a representation can be created for any format (item) that someone determines needs metadata access [14]. This raises the question of granularity, or the level at which an item is described [12], which is the conceptual key for understanding representation of information about digital objects. Thus, creating successful document representations (metadata) for a digital library requires a useful model to help clarify what the digital library project is trying to do with metadata, what functions are required, how the metadata record should be structured, and what data elements it should contain [15].
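The inheritance idea behind a hierarchical catalog can be illustrated with a minimal Python sketch. It is not taken from [11]; the class name `CatalogRecord`, the level names, and the field values are assumptions chosen only for illustration. Each record stores only the fields introduced at its own level and resolves everything else through its parent.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CatalogRecord:
    """One level of a hierarchical catalog (hypothetical structure)."""
    level: str
    fields: dict = field(default_factory=dict)
    parent: Optional["CatalogRecord"] = None

    def get(self, name):
        """Look up a field locally, then in each ancestor in turn."""
        record = self
        while record is not None:
            if name in record.fields:
                return record.fields[name]
            record = record.parent
        return None

    def resolved(self):
        """Merge inherited and local fields; local values win."""
        inherited = self.parent.resolved() if self.parent else {}
        return {**inherited, **self.fields}


# Illustrative values only: the title is entered once, at the top level.
work = CatalogRecord("work", {"title": "Building a house"})
version = CatalogRecord("version", {"language": "Thai"}, parent=work)
fmt = CatalogRecord("format", {"carrier": "digital image"}, parent=version)

print(fmt.get("title"))
print(fmt.resolved())
```

Because the title lives only in the top-level record, correcting it there is enough to update every version and format beneath it, which is the maintenance saving the hierarchical approach aims for.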

3 The FRBR Model

The Functional Requirements for Bibliographic Records (FRBR) model was proposed by the International Federation of Library Associations and Institutions (IFLA) in 1998. FRBR is a conceptual model that defines a structured framework and the relationships between metadata records by focusing on the kinds of resources that a data record describes. In order to solve the problem of searching for intellectual works in a digital library, where one work may vary in title, version and/or format, FRBR uses a hierarchical structure that establishes relationships between four levels of representation: a work, its expression(s), an expression’s manifestation(s), and the individual item(s). This approach ensures that the user will be able to select the most appropriate version or format of the desired work. The hierarchical model of FRBR was inspired by the entity relationship model for relational databases and by the concept of inheritance, which ensures that the properties (or data elements) described at superordinate levels of representation are inherited by all the subordinate levels nested under them [15]. FRBR lays the foundation for hierarchical catalog records by recognizing the difference between a particular work, the several expressions of that work, the various formats in which an expression exists, and the particular item [11]:

- Work represents the intellectual concept of a work, identified by its titles and realized through its relationship with expression;
- Expression represents the various versions or revisions of a work, established in its relationship with one or more manifestations;
- Manifestation represents the physical embodiment of an expression; and
- Item represents the individual unit of a manifestation.


The FRBR model uses an entity analysis technique to identify entities and relationships. Analysis begins by isolating key entities to be represented. The attributes associated with each entity are then identified with the emphasis on attributes important in formulating bibliographic searches, interpreting responses to those searches, and navigating the universe of entities described in bibliographic records [16].

Work
  is realized through
Expression
  is embodied in
Manifestation
  is exemplified by
Item

Fig. 1. FRBR model: primary entities and relationships [16]

The FRBR model assists in identifying and defining relationships between key entities, especially for complicated documents such as film, music or museum material, which may have a range of expressions and formats. It is widely accepted and frequently used in digital library projects as a model for analysis of metadata requirements and the development of metadata schemas [17], [18].
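As a small illustration, the primary entities and relationships of Fig. 1 can be written down as subject-relation-object triples and traversed from a superordinate level to a subordinate one. The names `FRBR_CHAIN` and `path` are hypothetical, and the sketch covers only the four primary entities, not the full FRBR entity set.

```python
# Primary FRBR entities and the relationship linking each to the next (Fig. 1).
FRBR_CHAIN = [
    ("work", "is realized through", "expression"),
    ("expression", "is embodied in", "manifestation"),
    ("manifestation", "is exemplified by", "item"),
]


def path(start, end):
    """Return the relationship triples leading from start down to end."""
    steps = []
    current = start
    for subject, relation, obj in FRBR_CHAIN:
        if subject == current and current != end:
            steps.append((subject, relation, obj))
            current = obj
    if current != end:
        raise ValueError(f"no downward path from {start!r} to {end!r}")
    return steps


for subject, relation, obj in path("work", "item"):
    print(f"{subject} {relation} {obj}")
```

Asking for a path that runs upward, such as from item to work, raises an error here because the sketch records only the downward direction shown in the figure.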

4 The Characteristics of Isan Palm Leaf Manuscripts

Isan PLMs vary in size. A standard palm leaf manuscript is generally 5-6 cm in width and 50-60 cm in length, with 48 pages (24 leaves written on both sides). PLMs can be as short as 15 cm or as long as 80 cm and can vary in the number of pages (i.e., leaves). The Isan people use the different sizes in different ways: the longer PLM is used as a textbook to record Buddhist stories and doctrine [19], while the shorter one is used as a notebook to record local wisdom related to daily life [10]. The languages in which PLMs are written are either local or undergoing shift (Pali, Isan, and Khmer) [8]; and the manuscripts are written in three archaic orthographies (Tham-Isan, Thai Noi, and Khmer) [10], requiring expert translation. Because the length of a PLM is determined by its physical dimensions rather than its content, a single manuscript may record many stories or a single story may require more than one manuscript [20]. Finally, a PLM may have pictures in addition to text.

The only access point to the bound manuscript is its title; but this presents a problem for the user who is not already familiar with both the content of a specific PLM and the archaic language in which it is written. Furthermore, access to individual stories is difficult when many stories are recorded in one bound manuscript or when a particular story has different titles in different PLMs. However, because users generally access manuscripts using title or subject, the title (or story) is obviously the most important access point to the knowledge contained in PLMs.

To preserve the knowledge recorded in palm leaf manuscripts and make it accessible to modern users, preservation projects transcribe a PLM into the modern Thai alphabet and language and then reproduce the transcription in a variety of formats. The translation process involves several transformations: from the ancient alphabet to the modern Thai alphabet, from the ancient language to the Isan language, and finally from Isan to modern Thai. The original PLM, its transcriptions and its translations are reproduced in the form of microfilms, photocopies, digital images, PDF files, and text files. This ensures, on the one hand, that the user who is not familiar with ancient alphabets or archaic languages can access the knowledge in these manuscripts; on the other hand, a user or researcher familiar with ancient alphabets and languages can access the original manuscripts or their reproductions. To make each story or subject in a PLM easier to access, some projects separate reproductions or image files by story rather than putting all images in one file in an attempt to maintain the look of the original bound manuscript.

Because of the complicated physical and linguistic characteristics of PLMs, the creation of a digital collection of these manuscripts must address complex issues of description, representation, organization, and use of the knowledge in PLMs. A hierarchical relationship model such as FRBR can help to develop a conceptual framework for metadata that will support access to one work in its various versions and formats; maintain the link between creators or owners; and help to manage the relationship between an original manuscript and the stories it contains.

5 Adapting FRBR to PLMs Metadata Model

When searching for PLMs, users will have many questions: Which PLM records the desired knowledge? What is the title of that PLM? Where was it created? Who was the patron who paid for inscription of the PLM? Who currently owns the PLM? Where is it stored? Where was it found? Who is the owner of each version? Is it available only as an original manuscript? Is it available in translation? Who is the translator? Is there a digital format available? How is the digital version accessed?

The hierarchical relationship established by the FRBR model holds potential for development of a metadata scheme that will enable the user to discover, select, locate, and access PLMs in the most appropriate versions and formats. Moreover, this model can be used to develop metadata that will support collection management, manuscript preservation and use restrictions. However, the concept of work as the top level in the FRBR model is not suitable for representation of PLMs, since FRBR focuses on work as a conceptual entity. Because a PLM exists as a physical rather than a conceptual entity, an effective metadata scheme must consider orthographies and languages, the physical and digital formats in which those expressions are available, and the individual copies of a given manifestation.

Accordingly, the revised FRBR model applies the concept of work to the physical manuscript, which may contain one or more stories or only a part of a story. The expression applies to the alphabets and languages in which PLMs occur: the Khmer language, Pali language, or Isan language in its original archaic alphabet (Khmer, Tham-Isan, or Thai-Noi); the modern Thai alphabet transcription of the PLM’s archaic alphabet, for users familiar with the original language who cannot read the archaic alphabet; and the modern Thai language, using the Thai alphabet, for users who are not familiar with the archaic languages. The manifestation addresses the various formats in which each expression is available: the original bound manuscript, microfilm, photocopy, digital image (formatted for archiving and distribution), and text files in both modern Thai and archaic alphabets. The item applies to individual copies of a single format.

[Figure: entities shown are PLM, Alphabet & language, Format, Item, Location, Person, and Corporate body. Relationship labels shown are: is identified by, is created by, is realized through, is realized by, is available in, is produced by, is exemplified by, is owned by.]

Fig. 2. The Model of relationships between key entities in PLM metadata

In the entity analysis of PLMs, the element location was added to the model, because the intellectual concept (that is, the stories recorded in a PLM) may show little variation across PLMs. What does vary, however, is the treatment of the story, which will reflect local wisdom: because the writing style of a PLM’s author reflects the traditions and beliefs of the author’s community, each story and therefore each manuscript will be unique even if it shares the same content with other PLMs. Thus, location is a key characteristic of the uniqueness of a PLM because it is location that shapes intellectual content.

Each entity (or hierarchical level) in the model was then identified by those characteristics or attributes associated with it which would be important for users in searching, identifying, selecting, accessing, and using PLMs. The metadata elements were established from analyses of (1) the physical structure and content of palm leaf manuscripts, (2) user needs and expectations with respect to these manuscripts, and (3) the requirements for managing collections of palm leaf manuscripts. In order to specify the semantic interpretation of each of these attributes, the metadata elements and their possible values across different collections and uses will be defined in an ontology. This will ensure consistent application of the metadata scheme and provide for interoperability across different collections and different retrieval systems.
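To illustrate why agreed values matter, the following minimal sketch checks a record against a small controlled vocabulary. A plain Python dictionary stands in for the planned ontology, which is an assumption made for brevity; the element names `alphabet`, `language`, and `subject` and the function `validate` are likewise hypothetical. The values themselves are the alphabets, languages, and eight subject categories named in this paper, plus modern Thai.

```python
# Hypothetical controlled vocabulary; a dict stands in for the ontology.
CONTROLLED_VALUES = {
    "alphabet": {"Tham-Isan", "Thai Noi", "Khmer", "Thai"},
    "language": {"Pali", "Isan", "Khmer", "Thai"},
    "subject": {
        "Buddhism", "Tradition and Beliefs", "Customary Law", "Economics",
        "Traditional Medicine", "Science", "Liberal Arts", "History",
    },
}


def validate(record):
    """Return (element, value) pairs that fall outside the vocabulary."""
    problems = []
    for element, value in record.items():
        allowed = CONTROLLED_VALUES.get(element)
        # Elements without a controlled list (e.g. title) are not checked.
        if allowed is not None and value not in allowed:
            problems.append((element, value))
    return problems


good = {"title": "Building a house", "alphabet": "Thai Noi", "language": "Isan"}
bad = {"title": "Building a house", "alphabet": "Thai-Noi", "language": "Isan"}

print(validate(good))
print(validate(bad))
```

The second record fails only because of a hyphen in the alphabet name, the kind of spelling variation between collections that a shared vocabulary is meant to prevent.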

[Figure: the PLM "Building a house" traced through the four levels. PLM: Building a house. Is realized through (Alphabet & language): Building a house (Ancient Alphabet & Ancient Language); Building a house (Thai Alphabet & Ancient Language); Building a house (Thai Alphabet & Thai Language). Is available in (Format): bound manuscript, microform, photocopy, digital image, text file/book. Is exemplified by (Item): master copy, copy1, copy2; B1L1.tiff/jpg through B1L24.tiff/jpg; BUH01-I.doc, BUH01-T.doc, BUH01-I.pdf, BUH01-T.pdf, BUH01-I-Print out, BUH01-T-Print out.]

Fig. 3. FRBR-based framework for palm leaf manuscript collection
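The four-level framework of Fig. 3 can be sketched as nested Python dataclasses with a lookup that filters at each level. This is only an illustrative sketch: the class and function names are hypothetical, the location value is a placeholder, and the assignment of the file names from Fig. 3 to particular expressions and formats is an assumption, since the figure's layout is not fully recoverable.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    identifier: str  # a file name or copy label


@dataclass
class Format:
    name: str  # manifestation level, e.g. "digital image"
    items: list = field(default_factory=list)


@dataclass
class Expression:
    alphabet: str
    language: str
    formats: list = field(default_factory=list)


@dataclass
class PLM:
    title: str
    location: str  # placeholder value below; not given in the source
    expressions: list = field(default_factory=list)


def find_items(plm, alphabet=None, language=None, format_name=None):
    """Walk PLM -> expression -> format -> item, filtering at each level."""
    hits = []
    for expr in plm.expressions:
        if alphabet and expr.alphabet != alphabet:
            continue
        if language and expr.language != language:
            continue
        for fmt in expr.formats:
            if format_name and fmt.name != format_name:
                continue
            hits.extend(item.identifier for item in fmt.items)
    return hits


# Assumed grouping of the Fig. 3 item names, for illustration only.
plm = PLM("Building a house", "unknown", [
    Expression("ancient", "ancient", [
        Format("bound manuscript", [Item("master copy")]),
        Format("digital image", [Item("B1L1.tiff"), Item("B1L24.tiff")]),
    ]),
    Expression("Thai", "ancient", [
        Format("text file", [Item("BUH01-I.doc"), Item("BUH01-I.pdf")]),
    ]),
    Expression("Thai", "Thai", [
        Format("text file", [Item("BUH01-T.doc"), Item("BUH01-T.pdf")]),
    ]),
])

print(find_items(plm, alphabet="Thai", language="Thai"))
print(find_items(plm, format_name="digital image"))
```

A user who reads only modern Thai would filter on alphabet and language, while a researcher who wants page images of the original would filter on format alone.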

[Figure: attributes of each entity for the example "Building a house" (PLM), realized through "How to build a house in Archaic language" (Alphabet & language), available as "Original bound manuscript" (Format), and exemplified by Item. Attributes shown across the levels include: Title, Uniform title, Original Title, Original Creator, Original Patron, Inscriber, Subject, Keyword, Table of Content, Language, Alphabet, Registration no, Place of original creation, Place-found, Date of Inscription, Date of found, Dimension, Number of page, Ink/inscription method, Cover board character, Color of the edge, Wrapper, Condition, Past Owner, Current Owner, Storage place, Use Restriction, Preservation status, Date of last presentation.]

Fig. 4. Attributes of the entity: example

In addition to the resource discovery function of metadata, the scheme will also cover management functions, including preservation, use restrictions, rights management, and file organization and display. These aspects of the metadata will address questions such as: What is the preservation status of each version of the PLM? Who is the person responsible for each version? What preservation technique was used? When should a digital version be migrated? Who is allowed to use the PLM and in which version? Who holds the copyright for each version of the PLM? How should the digital version be displayed?

The process of designing administrative and structural metadata will require identification of the various functions involved in collection management and use. The FRBR model can help in analyzing the level at which each function applies and the relationships between functions, since some functions will not apply at all levels. For example, at the level of expression, the original manuscript will not require structural metadata intended to support digital functions. Thus, the model indicates that structural metadata at each level can be linked across different functions by using the identification number assigned to the unique record.

[Figure: descriptive, structural, and administrative metadata shown side by side across the four levels. PLM is realized through Alphabet & language (Archaic Alphabet & Language; Thai Alphabet, Archaic language, Thai language), which is available as Format (Digital images; Book), which is exemplified by Item (Images: file name/URI; Call number).]

Fig. 5. Functional structure for PLM metadata example
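Linking the three kinds of metadata through a shared identification number can be sketched as follows. The identifiers (`PLM-1`, `EXP-1`, `FMT-1`), the field names, and the stored values are all hypothetical; the only idea taken from the text is that each kind of metadata is kept separately, keyed by the record's identifier, and that a kind which does not apply at a given level is simply absent.

```python
# Three separate stores, each keyed by a record's identification number.
# All identifiers and values below are hypothetical.
DESCRIPTIVE = {
    "PLM-1": {"title": "Building a house"},
    "EXP-1": {"alphabet": "archaic", "language": "archaic"},
    "FMT-1": {"format": "digital images"},
}
STRUCTURAL = {
    # In this sketch only the digital format carries structural metadata.
    "FMT-1": {"files": ["B1L1.tiff", "B1L24.tiff"]},
}
ADMINISTRATIVE = {
    "PLM-1": {"use_restriction": "on request"},
    "FMT-1": {"preservation_status": "master copy archived"},
}


def assemble(record_id):
    """Collect every kind of metadata recorded under one identifier."""
    stores = {
        "descriptive": DESCRIPTIVE,
        "structural": STRUCTURAL,
        "administrative": ADMINISTRATIVE,
    }
    return {
        kind: store[record_id]
        for kind, store in stores.items()
        if record_id in store
    }


print(assemble("EXP-1"))
print(sorted(assemble("FMT-1")))
```

The expression record returns descriptive metadata only, while the digital format record returns all three kinds, mirroring the point that some functions do not apply at every level.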

6 Conclusion and Future Work

Development of a metadata scheme for management of palm leaf manuscript collections will not only increase efficiency in discovering, accessing, and using these manuscripts, but will also support preservation of the original manuscripts and the administration of digital versions. The FRBR model offers a conceptual framework for development of a metadata scheme that will support the main functions of PLM management: resource discovery, access and use; record maintenance; digital preservation; and rights management. FRBR’s four-level hierarchical model allows the metadata record at each level to represent the data applicable to the various expressions and different formats as well as individual items. Moreover, the FRBR model will help in defining the metadata elements required for each function of the digital collection. The fact that PLMs are physical objects is seemingly at odds with FRBR’s notion of work as a conceptual entity. However, by reconceptualizing work as a representation of the original palm leaf manuscript, FRBR’s hierarchical structure provides an effective framework for the design of a metadata scheme that can support the various functions required for access to and management of resources in PLM collections.

This revised FRBR model is based on a preliminary study of user needs that was conducted in 2005 and included a literature review, observation, and unstructured interviews with the staff of four palm leaf manuscript preservation projects at four different institutes, as well as unstructured interviews with four researchers (an anthropologist, a linguist, a sociologist, and a local scholar). Following actual data collection with users and administrators of PLM collections, the current model will be revised to reflect specific needs. To resolve problems of semantics and to support access to PLM collections via the Semantic Web, RDF (Resource Description Framework) will then be used to develop an ontology that establishes controlled vocabularies for the values of PLM metadata elements.

Acknowledgments. We would like to acknowledge grant support from the Center for Research on Plurality in the Mekong Region, Khon Kaen University, and from a Southeast Asia Digital Library Grant of the US Department of Education, administered through Northern Illinois University Library.

References

1. [UNECE] United Nations Economic Commission for Europe. Knowledge-Based Economy. http://www.unece.org/ie/wp8/kbe.htm (2004).
2. Office of National Economic & Social Development Board. 2002. One Tambon One Product. http://ie.nesdb.go.th/gd/html/forms/Projects/TumBonProject/TumBonExPlain/TumBonProjectExPlain.htm. (In Thai)
3. Office of National Economic & Social Development Board. 2003. The main point of the Ninth National Economic and Social Development Plan. http://www.nesd.go.th/interesting_menu/progress_plan/plan9_data. (In Thai)
4. Office of the National Education Commission. 1999. The promoting policy of Thailand’s Knowledge on education. In the conference on management of local information. Mahasarakham: Central Library, Mahasarakham University. (In Thai)
5. Suphon Somchitsripanya. 2001. Perspective from palm leaf manuscript: language and literature. The conference on Study of Knowledge Recorded in Palm Leaf Manuscript in Northeastern Thailand.
6. Phanphen Khlue-Thai. 2003. The studies of inscription and ancient document by Social Research Institute Chaingmai University. In Language and Inscription Volume 9. Nakhon Prathom: Department of Eastern Language, Faculty of Archeology, Sillapakhon University. p. 32-59. (In Thai)
7. Suriya Samutkhup and Pattana Kittiarsa. 2003. Why was a female lower garment used as a wrapper of palm-leaf manuscripts in Northeast Thailand? An anthropology approach to Isan-palm-leaf manuscripts. Art & Culture Magazine. 24(6): 82-95. (In Thai)
8. Northeastern Thai Palm Leaf Manuscript Preservation Project, Mahasarakham University. 2004. http://www.msu.ac.th/BL/bailan/PAG2.ASP. (In Thai)
9. Sineenart Somboon-a-nake. 1998. Palm leaf manuscript: cultural heritage of Lanna people. Library Association Bulletin 42(2): 25-36. (In Thai)
10. Ekawit na Thalang. 2001. Isan’s knowledge. Bangkok: Ammarin. (In Thai)
11. Mimno, David; Crane, Gregory; Jones, Alison. Hierarchical Catalog Records implementing a FRBR Catalog. D-Lib Magazine 11(10) (2005). http://dlib.anu.edu.au/dlib/october05/crane/10crane.html.
12. Devis-Brown, Beth. 1999. Information organization: old concepts, new challenges. In WTEC Panel report on Digital information organization in Japan. International Technology Research Institute, World Technology (WTEC) Division. http://www.wtec.org/loyola/digilibs/toc.htm
13. Proffitt, Merrilee. 2004. Pulling it all together: use of METS in RLG cultural materials service. Library Hi Tech 22(1): 65-68.
14. Taylor, Arlene G. 1999. The organization of information. Englewood, Colorado: Library Association.
15. Coyle, Karen. 2004. Future considerations: the functional library system record. Library Hi Tech 22(2): 166-174.
16. [IFLA] IFLA Study Group on the Functional Requirements for Bibliographic Records. 1998. Functional Requirements for Bibliographic Records: Final Report. UBCIM Publications-New Series. Vol. 19, Munchen: K.G.Saur, (1998). http://www.ifla.org/VII/s13/frbr/frbr.htm.
17. Lin, Simon C. and et al. 2001. A Metadata case study for the FRBR model based on Chinese painting and calligraphy at the National Palace Museum in Taipei. In DC-2001 Proceedings of the international conference on Dublin Core and metadata applications 2001. (2001) (pp. 51-59). Tokyo: NII.
18. Caplan, Priscilla. 2003. Metadata fundamentals for all librarians. Chicago: American Library Association.
19. Surajit Chantharasakha. 2001. Isan palm leaf manuscript. The conference on Study of Knowledge Recorded in Palm Leaf Manuscript in Northeastern Thailand.
20. Wirat Unnatwaranggul. Palm leaf manuscript. In The Royal edition of palm leaf manuscript in Rattanakosin Era. Bangkok: National Library, (1984): p 1-11. (In Thai)
