Understanding Metadata

Grace Agnew
June 7, 2004

METADATA Definitions:
• Data about data
• Data that organizes and manages resources
• Data that enables users to find, interpret, select and obtain resources

Role of Metadata: Support the information and service roles of the organization:
• Enable discovery and use of information
• Safeguard the intellectual property of the organization
• Ensure the durability and integrity of information resources
• Integrate resources with other collections in information collaborations

Metadata May Be
• auto-generated
• automatically harvested from the resource
• human-created

Audience May Be
• end user
• metadata creator/manager
• computer application/program

Two Metadata Strategies
A one-to-one metadata record to information object relationship.
• INTRINSIC: Incorporated within information object
• EXTRINSIC: Located in a separate database with fielded link to information object

The Nature of Information
• Work: Distinct intellectual or artistic creation
• Expression: Intellectual or artistic realization of a work (“interpretation”)
• Manifestation: Physical manifestation of an expression. May differ in physical format, but not in content or interpretation
• Item: Unique physical instance of a manifestation.

[Figure: a Work branching into Expressions, each Expression branching into Manifestations]
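The four levels above can be sketched as nested records. The Python below is a minimal illustration, not part of the original slides: the class names follow the slide's terms, while the field names and the example values are assumptions chosen only to show the containment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    # Unique physical instance of a manifestation (for example, one copy).
    location: str


@dataclass
class Manifestation:
    # Physical form of an expression; format may differ, content does not.
    physical_format: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    # Intellectual or artistic realization ("interpretation") of a work.
    interpretation: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    # Distinct intellectual or artistic creation.
    title: str
    expressions: List[Expression] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Count every item reachable from a work via its expressions and manifestations."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


# Hypothetical example: one work, two expressions, three manifestations.
work = Work(
    title="Example Symphony",
    expressions=[
        Expression("1990 studio recording", [
            Manifestation("CD", [Item("Shelf A"), Item("Shelf B")]),
            Manifestation("WAV file", [Item("Server 1")]),
        ]),
        Expression("2001 live recording", [
            Manifestation("DVD", [Item("Shelf C")]),
        ]),
    ],
)
print(count_items(work))
```

Keeping each level as its own record means a description made at the Work level does not have to be repeated for every copy.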

Within the Archive:
• “Work” – ownership resides with creator
• Origin of information: first generation of information under control of organization
• “Hurricane” test
• Source
• Preservation Master
• Access copy
• Reproduction Master
• Access copy

Types of Metadata
• Structural metadata -- Structured relationship among components of a complex object. Enables a user to “page” through a book, browse table of contents, go directly to a selected chapter, etc.
• Meta metadata -- Metadata that describes and manages the metadata record. Can add “intelligence” to metadata

Administrative Metadata
• Chain of ownership
• Rights of access, use and reproduction
• Retention requirements
• Donor information

Digital Provenance
Audit trail for significant lifecycle events
-- change in metadata version
-- revision to the information object (data)

Technical Metadata
Describes the technical features of the information object or data, e.g.
-- File size
-- Duration
-- Encoding or digitizing standard
-- Color space
-- Resolution (pixels per inch)
-- Data rate (frames per second, bit rate)

Descriptive Metadata
• Find – Discover relevant resources via a search
• Identify – Understand and interpret the record(s) retrieved
• Select – Select the most relevant among competing resources
• Obtain – Obtain the resource described by the metadata

METS: Metadata Encoding & Transport Standard

• Associates administrative, technical, rights, source and descriptive metadata with data objects
• Concatenates multiple versions of an object: source, digital preservation master, access copy
• Associates structure map, file types and behaviors with digital objects to provide “intelligent” complex objects, e.g. an e-journal with machine- and human-recognizable “table of contents,” “abstract,” “citation,” etc.

http://www.loc.gov/mets

METS Metadata Elements
• Structure Map
• Descriptive Metadata
• Source Metadata
• Technical Metadata
• Digital Provenance
• Rights Metadata
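As a rough illustration of how these elements sit together in one document, the sketch below builds an empty METS skeleton with Python's standard library. The element names (dmdSec, amdSec, techMD, rightsMD, sourceMD, digiprovMD, fileSec, fileGrp, structMap, div) come from the METS schema; the ID, USE, TYPE and LABEL values are illustrative assumptions, and the output is not validated against the schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets_skeleton() -> ET.Element:
    root = ET.Element(q("mets"))
    # Descriptive metadata section.
    ET.SubElement(root, q("dmdSec"), ID="DMD1")
    # Administrative metadata groups technical, rights, source and
    # digital provenance metadata.
    amd = ET.SubElement(root, q("amdSec"), ID="AMD1")
    for i, name in enumerate(["techMD", "rightsMD", "sourceMD", "digiprovMD"], 1):
        ET.SubElement(amd, q(name), ID=f"ADM{i}")
    # File section: one file group per version of the object (assumed labels).
    file_sec = ET.SubElement(root, q("fileSec"))
    for use in ["preservation master", "access copy"]:
        ET.SubElement(file_sec, q("fileGrp"), USE=use)
    # Structure map: the navigable outline of the object.
    struct_map = ET.SubElement(root, q("structMap"))
    ET.SubElement(struct_map, q("div"), TYPE="issue", LABEL="Table of contents")
    return root


mets = build_mets_skeleton()
print(ET.tostring(mets, encoding="unicode"))
```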

Implementing Metadata
• Record Structure
• Data Model
• Repository Design
• Data Element Registration
• Dissemination to Users
• Database Population
• Data interchange (other repositories)

Data Model
“Information Worldview” for the data to be managed and described.

WHO
• Who will use the data?
• Who will create and manage the data and metadata?
• Who will the organization collaborate with on data creation and management?

WHAT
• What does the data consist of? -- formats, versions, subjects.
• What metadata standards exist for similar, complementary information that are also important to your users?
• What are the primary ways the users will employ this information (scenarios of use)?

WHEN
• How often will users need the data?
• How often will data be updated with new versions?
• How long must the data be kept and managed?

WHERE
• Where is the data created?
• Where does it need to be stored?
• Where is it used?

HOW
• How do users seek and utilize the data?
• How is the data constructed and organized?
• How do separate but complementary pieces of data need to be linked for context?

[Figure: University Information Universe. Labels: Library Collection; Private colls; “Gray Lit”; Official docs; Descriptive; Rights & Access; Administrative; Technical]

Metadata Structure
• Data Element: Smallest unit of semantic meaning
• Value: Unique information contained within the data element
• Record: Structured grouping of data elements that collectively describe an information resource
• Schema: Provides the rules for using data elements and creating the values within data elements

Metadata Structure
Data Elements
Populated with meaningful information (“values”)
According to rules (“Schema”)

Rules:
• Data element (e.g. controlled vocabulary; formatting; prescribed value list or formatting rule)
• Record structure: data element constraints: (mandatory, recommended, optional) (repeatable) (order of elements)
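One way to picture these rules is as a small validator that checks a record against a schema. The sketch below is an assumption-laden illustration: the schema, element names and value list are hypothetical, and only obligation, repeatability and prescribed values are checked (element order is left out).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class ElementRule:
    obligation: str                            # "mandatory", "recommended" or "optional"
    repeatable: bool
    allowed_values: Optional[Set[str]] = None  # prescribed value list, if any


# Hypothetical schema, for illustration only.
SCHEMA: Dict[str, ElementRule] = {
    "title": ElementRule("mandatory", repeatable=False),
    "subject": ElementRule("recommended", repeatable=True),
    "type": ElementRule("optional", repeatable=False,
                        allowed_values={"text", "image", "sound"}),
}


def validate(record: Dict[str, List[str]], schema: Dict[str, ElementRule]) -> List[str]:
    """Return a list of rule violations; an empty list means the record conforms."""
    problems: List[str] = []
    for name, rule in schema.items():
        values = record.get(name, [])
        if rule.obligation == "mandatory" and not values:
            problems.append(f"{name}: mandatory element is missing")
        if not rule.repeatable and len(values) > 1:
            problems.append(f"{name}: element is not repeatable")
        if rule.allowed_values is not None:
            for value in values:
                if value not in rule.allowed_values:
                    problems.append(f"{name}: value {value!r} is not in the prescribed list")
    for name in record:
        if name not in schema:
            problems.append(f"{name}: element is not defined in the schema")
    return problems


good = {"title": ["Understanding Metadata"], "subject": ["Metadata", "Repositories"], "type": ["text"]}
bad = {"subject": ["Metadata"], "type": ["text", "map"]}
print(validate(good, SCHEMA))
print(validate(bad, SCHEMA))
```

Here a record is simply a mapping from data element name to a list of values, which makes repeatability easy to test.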

Metadata Schema
Creates standardized metadata that is:
• Understandable by organization and user – “shared understanding”
• Shareable across repositories
• Mappable to other schemas to create a different context for the metadata
• Maintained by a standards body for durability

Metadata Schema Examples
• Anglo-American Cataloging Rules
• Dublin Core
• IEEE Learning Object Metadata
• NISO Technical Metadata for Still Images
• MPEG-7 Multimedia Content Description Interface

Dublin Core: 15 OPTIONAL, REPEATABLE ELEMENTS

Content: Title, Subject, Description, Source, Language, Relation, Coverage
Intellectual Property: Creator, Publisher, Contributor, Rights
Instantiation: Date, Type, Format, Identifier

From “Description of Dublin Core Elements” http://purl.oclc.org/metadata/dublin_core_elements
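A short sketch of what a simple Dublin Core record can look like in XML, built with Python's standard library. The namespace URI is the Dublin Core element set namespace; the `record` wrapper element, the function name and the subject values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The fifteen elements, all optional and repeatable.
DC_ELEMENTS = {
    "title", "subject", "description", "source", "language", "relation",
    "coverage", "creator", "publisher", "contributor", "rights",
    "date", "type", "format", "identifier",
}


def build_dc_record(fields: Dict[str, List[str]]) -> ET.Element:
    """Build a record; each element may be absent or repeated."""
    root = ET.Element("record")  # wrapper element name is an assumption
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        for value in values:
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return root


record = build_dc_record({
    "title": ["Understanding Metadata"],
    "creator": ["Grace Agnew"],
    "date": ["2004-06-07"],
    "subject": ["Metadata", "Digital repositories"],
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```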

Role of XML
eXtensible Markup Language - subset of SGML
• Defines metadata: data elements, values and relationships among data elements
• Expresses metadata, for display on the web
• Requires an associated stylesheet (XSL, CSS) to format for web display

Role of XML
Every part of a document/object can have semantic meaning--to a person and to a computer:
The cat in the hat.

Hierarchical relationships supported: cat hat
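To make the idea concrete, the sketch below tags each word of the sentence and then reads the tags back by machine. The slide's own markup is not shown in this text, so every tag name here (sentence, article, noun, preposition) and the nesting of hat inside cat are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the tag names are illustrative assumptions.
SENTENCE = (
    "<sentence>"
    "<article>The</article> "
    "<noun>cat</noun> "
    "<preposition>in</preposition> "
    "<article>the</article> "
    "<noun>hat</noun>."
    "</sentence>"
)

root = ET.fromstring(SENTENCE)

# A program can select parts of the text by meaning, not by position.
nouns = [element.text for element in root.iter("noun")]

# A person still reads the same sentence once the tags are dropped.
plain_text = "".join(root.itertext())

# Hierarchy: one element nested inside another (assumed structure).
nested = ET.fromstring("<cat><hat/></cat>")
children_of_cat = [child.tag for child in nested]

print(nouns, plain_text, children_of_cat)
```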

Selecting a Metadata Schema
Selecting a Schema:
• Supports the organization’s data model
• Supports customization
• Managed by a reputable standards body
• Maps readily to other schema for sharing data and changing context

Customizing Metadata

+
• Support for unique needs of users
• Adds to metadata universe of knowledge
• Supports unique commonalities among a distributed user base

-
• May not be developed robustly
• Interoperability may be compromised

Data Element Registration
• Defines metadata characteristics and formatting requirements
• Supports data sharing with cross-system and cross-organization descriptions of common units of data.
• Provides common understanding of a unit of data’s meaning, representation and identification.

ISO 11179-3 – Metadata Registry Standard: ATTRIBUTES OF DATA ELEMENTS
• Data Element Name
• Data Type
• Permissible Values
• Value Domain - context in which data element and permissible values are used
• Version (number and date)
• Data element obligation (mandatory, optional, recommended)
• Repeatability: yes or no
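The attribute list above maps naturally onto a registry entry. The sketch below is a minimal illustration: the field names paraphrase the slide's list and are not identifiers taken from the standard, and the `Registry` class and the example element are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class DataElement:
    name: str
    data_type: str
    permissible_values: Optional[Tuple[str, ...]]  # None means free text
    value_domain: str                              # context in which the values are used
    version: str
    version_date: str
    obligation: str                                # "mandatory", "optional" or "recommended"
    repeatable: bool


class Registry:
    """Hypothetical in-memory registry keyed by element name and version."""

    def __init__(self) -> None:
        self._elements: Dict[Tuple[str, str], DataElement] = {}

    def register(self, element: DataElement) -> None:
        key = (element.name, element.version)
        if key in self._elements:
            raise ValueError(f"already registered: {key}")
        self._elements[key] = element

    def lookup(self, name: str, version: str) -> DataElement:
        return self._elements[(name, version)]


# Illustrative entry; all values are assumptions.
language = DataElement(
    name="language",
    data_type="string",
    permissible_values=("eng", "fre", "spa"),
    value_domain="Language of the resource content",
    version="1.0",
    version_date="2004-06-07",
    obligation="recommended",
    repeatable=True,
)

registry = Registry()
registry.register(language)
print(registry.lookup("language", "1.0"))
```

Keying entries by name and version keeps older definitions available after an element is revised.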

Example: Moving Image Collections (MIC) Organization Directory DB

INSTITUTIONAL DATA
Columns: Name; Label; Data type (fixed / variable; default (blank) = langstring); Y/N; Y/N; (M, O, R); Formatting Rule; Definition

OrgUID
• Label: Organization Identifier
• Data type: variable with maxlength of 8 char
• Definition: Serves as UID in XML UDDI schema; controlled vocabulary. Use the ISO 15511 ISIL standard (International Standard Identifier for Libraries and Related Organizations)
• Formatting Rule: Use MARC21 Identification code (formerly NUC code), conforming to ISO 15511 requirements. Older Org Codes may need revision to support ISO 15511 requirements. Codes are formally assigned upon request of organization so participating archives without a c

OrgName
• Label: Organization Name
• Data type: variable
• Definition: Full name of organization or unit contributing to the Gateway
• Formatting Rule: Sentence capitalization; Do not abbreviate or use acronym. Can correspond to Vcard ORG element
• Note: Organizations that are part of a larger institution should enter the name of their specific division or unit here, and the larger institution in the field, "Parent Organization." Ex: UCLA Film and Television Archive

Y/N and (M, O, R) values: N, N, Y, M

What is a Digital Repository?
• Provides secure, robust, accessible storage of, and access to, metadata and digital resources
• Provides centralized access to applications and technologies to support access and use of digital resources

The Digital Information Repository
• Persistent Objects: Manage objects through changes to hardware, software, players, search & retrieval systems, etc.
• Persistent Metadata: Manage metadata through schema and data element versioning changes, new metadata formats, I&R changes, hardware & database migrations.

The OAIS Functional Model
http://ssdoo.gsfc.nasa.gov/nost/isoas/
An OAIS is an organization of people and systems that have accepted the responsibility to preserve information and make it available for a designated community.

[Figure: OAIS functional model. External roles: Producer, Consumer, Management. Functional entities: Ingest, Archival Storage, Data Management, Administration, Preservation Planning, Access]

Centralized Repository Design
• Admission policies & procedures
• Web-based
• Controlled entry (database forms)
• Data validity checking
• “meta metadata”

Repository support for Metadata
RUL / NJDH Workflow Management System

[Figure: a Collection Record with two Session Records; each Session Record is shown with two METS documents]

Other Useful Metadata System Requirements:
• Authentication and authorization to create, modify, replace records.
• Support for thesauri and name authority records.
• Ability to use a record as a “template” with overlay for individual fields.

RUL Data Entry Use Model
Metadata Application Design – Data Entry

[Flowchart. Actor: User. Stores: Data Object, Database. Steps and decisions as labeled:]
• User selects schema
• System displays menu (data entry menu)
• User enters data
• User submits entry
• Are required fields filled? (Yes / No)
• System processes the data and creates a data object
• System retrieves data from data object
• System displays entered data
• User approve? (Yes / No)
• System processes data from data object
• System updates database
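The two decision points in this flow (required fields, user approval) can be sketched as a single function. This is a hedged illustration only: the function names, the return strings and the use of a plain list as the "database" are assumptions, and what happens on a "No" branch is reduced to returning a status.

```python
from typing import Callable, Dict, List


def missing_required(entry: Dict[str, str], required: List[str]) -> List[str]:
    """Names of required fields that are absent or blank in the entry."""
    return [name for name in required if not entry.get(name, "").strip()]


def process_entry(
    entry: Dict[str, str],
    required: List[str],
    approve: Callable[[Dict[str, str]], bool],
    database: List[Dict[str, str]],
) -> str:
    # Decision 1: are required fields filled?
    if missing_required(entry, required):
        return "incomplete"
    # System processes the data and creates a data object.
    data_object = dict(entry)
    # Decision 2: system displays the entered data and the user approves or not.
    if not approve(data_object):
        return "rejected"
    # System updates the database from the data object.
    database.append(data_object)
    return "saved"


database: List[Dict[str, str]] = []
required_fields = ["title"]  # hypothetical required field
print(process_entry({"title": ""}, required_fields, lambda obj: True, database))
print(process_entry({"title": "Oral history"}, required_fields, lambda obj: True, database))
```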

Open-Source Repository Applications:
• DSpace – “capture, store, index, preserve, and redistribute the intellectual output of a university’s research faculty in digital formats.” http://www.dspace.org
• Fedora Project – release 1.2 on 12-22-03. METS-based digital repository management system. http://www.fedora.org/info//
• DLSX – Middleware to manage digital objects. Release 11 – http://www.dlsx.org/products/index.html

Open-Source Repository Tools
• Web Server -- e.g., Apache
• Relational Database -- e.g., MySQL
• Object-relational Database -- e.g., PostgreSQL
• Search Engine - Z39.50, XML compliant -- e.g., Zebra, Xpat Lite

Protocol for Data Sharing
OAI - Open Archives Initiative http://www.openarchives.org
HTTP-based protocol for simple searching and data mining across metadata repositories.

Requires:
• Archive ID
• Record ID
• Datestamp (Date Created, Date Modified, Date Deleted)
• Dublin Core Simple in XML export format

Optional:
• Set ID (corresponds roughly to “collection”)
• Flow control (sets limits on number of records to send in a batch and interval between record batches)
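A harvesting request in the OAI protocol (OAI-PMH) is an ordinary HTTP query string. The sketch below builds a ListRecords request: `verb`, `metadataPrefix`, `from`, `until`, `set` and `resumptionToken` are protocol arguments and `oai_dc` is the simple Dublin Core format, while the base URL, the set name and the function name are hypothetical.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    *,
    set_spec: Optional[str] = None,
    from_date: Optional[str] = None,
    until_date: Optional[str] = None,
    resumption_token: Optional[str] = None,
) -> str:
    """Build a ListRecords request URL for simple Dublin Core records."""
    params: Dict[str, str]
    if resumption_token is not None:
        # Flow control: a follow-up request carries only the token
        # returned with the previous batch.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
        if from_date:
            params["from"] = from_date    # selective harvest by datestamp
        if until_date:
            params["until"] = until_date
        if set_spec:
            params["set"] = set_spec      # roughly a "collection"
    return f"{base_url}?{urlencode(params)}"


BASE_URL = "https://repository.example.org/oai"  # hypothetical endpoint
url = list_records_url(BASE_URL, set_spec="films", from_date="2004-01-01")
print(url)
```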

Metadata Crosswalks
• Map data models for metadata schemas
• Map data elements – labels and semantic meaning
• Map values (controlled vocabularies or formatting principles)
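The element and value mappings can be sketched as two lookup tables applied in turn. Everything in the source schema below (element names and local terms) is hypothetical; the targets are Dublin Core element names and, for type, terms from the DCMI Type Vocabulary.

```python
from typing import Dict, List

# Hypothetical local schema mapped to Dublin Core element names.
ELEMENT_MAP: Dict[str, str] = {
    "WorkTitle": "title",
    "Keyword": "subject",
    "MediaKind": "type",
}

# Value mapping per target element (local terms are assumptions).
VALUE_MAP: Dict[str, Dict[str, str]] = {
    "type": {"film": "MovingImage", "photo": "StillImage"},
}


def crosswalk(record: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map element names, then map values where a vocabulary mapping exists."""
    target: Dict[str, List[str]] = {}
    for source_name, values in record.items():
        target_name = ELEMENT_MAP.get(source_name)
        if target_name is None:
            continue  # no equivalent element in the target schema
        vocabulary = VALUE_MAP.get(target_name, {})
        target.setdefault(target_name, []).extend(vocabulary.get(v, v) for v in values)
    return target


local_record = {
    "WorkTitle": ["Harbor at dawn"],
    "Keyword": ["Harbors", "Boats"],
    "MediaKind": ["film"],
    "ShelfCode": ["B-12"],
}
print(crosswalk(local_record))
```

Elements with no equivalent are dropped here; a real crosswalk would need a policy for such losses.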

Putting it All Together

The User Portal
