Building Preservation Environments

Reagan W. Moore
San Diego Supercomputer Center
9500 Gilman Drive, MC-0505
La Jolla, CA 92093-0505
01 858 534 5073
[email protected]

Richard Marciano
San Diego Supercomputer Center
9500 Gilman Drive, MC-0505
La Jolla, CA 92093-0505
01 858 534 8345
[email protected]

ABSTRACT
The preservation of digital entities requires data management technologies that are provided by digital libraries and data grids. Digital libraries provide standard data organization and presentation mechanisms. Data grids provide support for infrastructure independence, the ability to incorporate new technology as it becomes available. Preservation environments integrate these technologies to assure the authenticity and integrity of digital entities. We will describe the concepts behind preservation and illustrate them with three data preservation environments based on the NARA research prototype persistent archive, the NHPRC Persistent Archive Testbed, and the NSF NSDL persistent archive.

Categories and Subject Descriptors
H.3.4 [Information Storage and Retrieval]: Systems and Software – distributed systems.

General Terms
Management, Design, Reliability, Security, Standardization

Keywords
Persistent archives, authenticity, integrity, infrastructure independence.

1. INTRODUCTION
Preservation environments ensure the authenticity and integrity of records for time periods much longer than the lifetime of any individual software or hardware technology. Authenticity is the assertion that the provenance context, which describes the creator, creating institution, and creation procedures, remains associated with the record. Integrity is the assertion that the bits comprising the record have not been corrupted. The management of technology evolution is the assertion that an infrastructure-independent architecture can be created that allows the incorporation of new technologies, formatting standards, and presentation technologies.

2. TUTORIAL
We will conduct a half-day tutorial that covers the preservation concepts, implementation, and lessons learned from building three preservation environments. The topics covered will include:

o Preservation processes
   o Appraisal, accession, arrangement, description, preservation, access
o Preservation architecture based on data grids
   o Authenticity mechanisms – metadata linking, metadata completeness and validation
   o Integrity mechanisms – replication, checksum
   o Logical name space management – registration of web crawls, registration of existing collections
o Access mechanisms based on digital libraries
   o DSpace integration
o Scalability – automation of archival processes
o Standards – choices for archival information packages
o Demonstrations of technology used in NARA, NHPRC, and NSDL collaborations.

3. ACKNOWLEDGMENTS
This technology development was supported by the NSF NPACI ACI-9619020 (NARA supplement), the NSF NSDL/UCAR Subaward S02-36645, and the NHPRC Persistent Archive Testbed. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the National Science Foundation, the National Archives and Records Administration, or the U.S. government.

4. REFERENCES
[1] Moore, R., “Preservation Environments,” NASA / IEEE MSST2004, Twelfth NASA Goddard / Twenty-First IEEE Conference on Mass Storage Systems and Technologies, April 2004.
[2] Smorul, M., J. JaJa, F. McCall, S. F. Brown, R. Moore, R. Marciano, S-Y. Chen, R. Lopez, R. Chadduck, “Recovery of a Digital Image Collection Through the SDSC/UMD/NARA Prototype Persistent Archive,” SDSC Technical Report 2003-06, September 2000.
Copyright is held by the author/owner(s). JCDL’05, June 7–11, 2005, Denver, Colorado, USA ACM 1-58113-876-8/05/0006.