COMPARATIVE STUDY OF OPEN SOURCE DIGITAL LIBRARY SOFTWARE: DSPACE, GREENSTONE AND EPRINTS

By
Ramani Ranjan Sahu, Library Assistant, Pandit Deendayal Petroleum University. [email protected] Mob: 9998191907

Alekha Karadia, Asst. Librarian, Bhima Bhoi College, Rairakhol. [email protected] Mob: 9124037352

Abstract: Organizations have a new option for acquiring and implementing systems, along with new opportunities for participating in open source software projects. Library professionals should be involved in the development of such software, and open source software is preferable for building a digital library under tight economic conditions. An extremely competitive environment, with its demands for zero deficiency and enhanced productivity, makes it essential for an organization to choose carefully if it wants to create a parallel digital library with features not found in a traditional library. Those making the choice should have a basic idea of selection, installation and maintenance. This paper compares the open source software packages DSpace, Greenstone and EPrints from various points of view and discusses how to select open source software for a digital library.

Keywords: Open source software, Digital library, DSpace, Greenstone, EPrints

1-INTRODUCTION: Information and communication technology and digital libraries have changed how all stakeholders access and retrieve key knowledge and relevant information. Digital libraries involve the creation, organization, maintenance, management, access, sharing and preservation of digital document collections. Open source digital library software provides a system for constructing and presenting information collections. It helps in building collections with searching and metadata-based browsing facilities. Open source digital library management software provides extensible features to administrators and allows an organization to showcase its digital archive to a worldwide audience. With full rights to the software available under an open source license such as the GPL, and with the source code provided with the software, organizations can extend its functionality as required for a particular operation.

Due to shrinking budgets and the increasing prices of journals, librarians have to look for new alternatives by which they can collect, store, arrange and disseminate information to users. The concepts of open access and the institutional repository (IR) have evolved as solutions. In building an IR, academic libraries can take the help of open source software (OSS) (Meitei, L.S. & Devi, P. 2009). Organizations therefore need to evaluate and compare popular open source digital library software from various points of view before choosing one for creating an institutional repository.

Open source software

Open source software is computer software whose source code is available under a license that permits users to study, change and improve the software, and to redistribute it in modified or unmodified form. It is often developed in a public, collaborative manner. It is the most prominent example of open source development and is often compared to user-generated content.

Reasons to Use Open Source Software
• It promotes creative development.
• Those who cannot afford proprietary software can download open source programs for free.
• Money saved can be used to purchase other needed materials.
• The software can easily be modified to suit patrons' needs and your own.
• There are little to no upgrade costs.
• There is no more agonizing over software that does not meet your standards: you can create it yourself, based on a close pre-existing piece of software.
• The price (free) makes it easier to change your mind when the software does not live up to expectations.
• There are little to no viruses.

Open source software that can be incorporated into libraries:

Open Source Software for Libraries:
Library Automation: Koha, NewGenLib, Evergreen
Digital Library Software: DSpace, Fedora, Greenstone, Keystone, EPrints, OPUS, etc.
Web Publishing: Joomla, Drupal, WordPress
Other Computer Programs: Ubuntu, Firefox, PDF Creator, Thunderbird, Open Office, GIMPshop, NVU

Digital Library: A digital library is a collection of digital documents or objects. Smith (2001) defined a digital library as an organized and focused collection of digital objects, including text, images, video and audio, together with the methods of access and retrieval and for the selection, creation, organization, maintenance and sharing of the collection. A digital library focuses on digital collections and on preserving their documents.

2-SELECTION CRITERIA OF OPEN SOURCE DIGITAL LIBRARY SOFTWARE

Evaluating open source software is different from evaluating proprietary programs. A key difference is that the information available for open source programs is usually different from that available for proprietary ones: source code, analysis of the program design by others, discussion between users and developers on how well it is working, and so on. In our view, the selection criteria include open source licenses, functional modules, stable releases, developer and user community, user interface, and documentation.

Selection criteria and evaluation questions, grouped by characteristic (Chamili, K. 2012):

Reliability
  Maturity: Is the software new in the market?
  Popularity: Does this software have numerous users?
  Availability: Does this software frequently release new versions?

Usability
  Learnability: How easy is it to learn or understand the software without using the user manual?
  Operability: Is this software easy to operate?
  Accessibility: Is this software easy to access without other third-party software or plug-ins?
  User interface aesthetics: Is the user interface suitable for the software's functionality?

Performance efficiency
  Time behaviour: Is this software easy to install, configure and operate within a short time?
  Resource utilisation: Does this software use minimal or limited resources, or can it be used with existing resources (e.g. server, operating system)?

Functionality
  Functional completeness: Does the software meet the user's expectations and requirements?
  Functional correctness: Does the software provide the correct output that the user expects?
  Functional appropriateness: Does the software function appropriately?

Maintainability
  Modularity: Is the code structured and readable? How well is the software designed?
  Modifiability: How easily can the system be customized to meet the user's requirements?
  Reusability: How easy is it to reuse or extend the code for further extension or integration?
  Testability: Is the software error-free?

Security
  Confidentiality: How secure are the data and the software? How confident can one be that the software is free from vulnerabilities?
  Integrity: Does the software have any control mechanism to ensure system integrity?
  Authenticity: Does the software provide levels of user authentication?

Tangible
  Support: Is there any community or commercial support provided?
  Documentation: Is complete documentation provided? Both technical and user manuals?

Reliability
  Version: Are software versions released at the targeted or expected time, mainly with new functionality?

Responsiveness
  Community: How active is the community for the software?

Assurance
  Competence: Does the community possess the required skill and knowledge?
  Credibility: Do the development team and community have a good track record? How many bugs were fixed in the last 6 months?

Empathy
  Communication: Does the community acknowledge your problems and help in solving them?

Competence
  Skill: How many internal technical staff are skilled with the tools and language used by this software?
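One way to turn the evaluation questions above into a decision aid is a weighted scoring sheet. The following Python sketch is only an illustration and is not taken from Chamili (2012): the weights, the 0 to 5 rating scale, and the function name `weighted_score` are assumptions, and the ratings used in the demonstration are placeholders rather than measurements of any real package.

```python
from typing import Dict

# Hypothetical weights: how much each characteristic matters to one library.
# These numbers are illustrative assumptions, not values from the paper.
WEIGHTS: Dict[str, float] = {
    "reliability": 3,
    "usability": 2,
    "performance efficiency": 1,
    "functionality": 3,
    "maintainability": 2,
    "security": 2,
    "tangible": 1,
}


def weighted_score(ratings: Dict[str, float],
                   weights: Dict[str, float] = WEIGHTS) -> float:
    """Return the weighted mean of per-characteristic ratings (0 to 5)."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    for name in weights:
        if not 0 <= ratings[name] <= 5:
            raise ValueError(f"rating for {name!r} must be between 0 and 5")
    total_weight = sum(weights.values())
    return sum(weights[n] * ratings[n] for n in weights) / total_weight


if __name__ == "__main__":
    # Placeholder ratings for an unnamed candidate; replace with the
    # evaluation team's own answers to the questions in the table.
    candidate = {name: 3 for name in WEIGHTS}
    print(weighted_score(candidate))
```

A weighted mean keeps the result on the same 0 to 5 scale as the individual ratings, which makes candidates easy to compare, but the outcome depends entirely on the weights each institution chooses.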

3. Comparison of open source digital library software

In the following, the open source digital library (DL) software packages are compared based on the characteristics identified in the previous section. The level of support for each characteristic and specific considerations for each DL system are discussed.

Object model

DSpace: The basic entity in DSpace is the item, which contains both metadata and digital content. Qualified Dublin Core (DC) [11] metadata fields are stored in the item, while other metadata sets and digital content are defined as bitstreams and categorized as bundles of the item. The internal structure of an item is expressed by structural metadata, which define the relationships between the constituent parts of an item. DSpace uses globally unique identifiers for items, based on the CNRI Handle System. Persistent identifiers are also used for the bitstreams of every item.

Greenstone: The basic entity in Greenstone is the document, which is expressed in XML format. Documents are linked with one or more resources that represent the digital content of the object. Each document contains a unique document identifier, but there is no support for persistent identifiers of the resources.

EPrints: The basic entity in EPrints is the data object, which is a record containing metadata. One or more documents (files) can be linked with the data object. Each data object has a unique identifier.

Collections and relations support

DSpace: Supports collections of items, and communities that hold one or more collections. An item belongs to one or more collections, but has only one owner collection. Default values can be defined for the metadata fields in a collection. The descriptive metadata defined for a collection are the title and description. There is no support for relations between different items.

Greenstone: A collection in Greenstone defines a set of characteristics that describe its functionality. These characteristics are: indexing, searching and browsing capabilities, file formats, conversion plug-ins and entry points for importing digital content. There are also some characteristics for the presentation of the collection. The representation of hierarchical structure in text documents is supported for chapters, sections and paragraphs. Specific sections in a text document are defined through special XML tags. XLinks in a document can be used to relate it to other documents or resources.

EPrints: There is no notion of collections in EPrints. Data objects are grouped depending on specific fields (subject, year, title, etc.). Relations between documents cannot be defined, except by using URLs in specific metadata fields.
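The DSpace object model described above (items holding qualified DC metadata, bundles of bitstreams, one owner collection plus optional additional collections) can be pictured with a few plain data classes. This is a minimal sketch for illustration only: the class and field names, the sample handle values and the bundle name are assumptions chosen here, not DSpace's actual Java classes or database schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Bitstream:
    name: str
    bitstream_format: str   # e.g. a MIME type
    identifier: str         # persistent identifier (made-up sample value)


@dataclass
class Bundle:
    name: str               # groups related bitstreams of one item
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Item:
    handle: str                       # globally unique identifier
    owning_collection: str            # exactly one owner collection
    metadata: Dict[str, List[str]] = field(default_factory=dict)
    bundles: List[Bundle] = field(default_factory=list)
    mapped_collections: List[str] = field(default_factory=list)

    def collections(self) -> List[str]:
        """All collections the item appears in, owner first."""
        return [self.owning_collection] + self.mapped_collections


def build_example_item() -> Item:
    """Build one illustrative item; every value here is hypothetical."""
    pdf = Bitstream("thesis.pdf", "application/pdf", "123456789/1/bitstream/1")
    return Item(
        handle="123456789/1",
        owning_collection="Theses",
        metadata={
            "dc.title": ["An example thesis"],
            "dc.contributor.author": ["Doe, Jane"],   # qualified DC field
            "dc.date.issued": ["2012"],
        },
        bundles=[Bundle("ORIGINAL", [pdf])],
        mapped_collections=["Department of Physics"],
    )


if __name__ == "__main__":
    item = build_example_item()
    print(item.handle, item.collections())
```

Keeping the owner collection as a separate field from the additional collections mirrors the rule that an item may appear in several collections but has only one owner.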

Metadata and digital content storage

DSpace: DSpace stores qualified DC metadata in a relational database (PostgreSQL or Oracle). Other metadata sets and digital content are represented as bitstreams and are stored on the filesystem. Each bitstream is associated with a specific bitstream format. A support level is defined for every bitstream format, indicating the level of preservation for that file format.

Greenstone: Both documents and resources are stored on the filesystem. Metadata are user-defined and are stored in documents using an internal XML format.

EPrints: Metadata fields in EPrints are user-defined. The data object, containing metadata, is stored in a MySQL database, and the documents (digital content) are stored on the filesystem.
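The split described above for DSpace and EPrints, with metadata in a relational database and content files on the filesystem, can be sketched in a few lines of Python. This is an illustration under stated assumptions: SQLite stands in for PostgreSQL, Oracle or MySQL, and the table layout and the function names `open_store`, `store_object` and `find_by_field` are invented here and do not reflect either system's real schema.

```python
import sqlite3
import tempfile
from pathlib import Path
from typing import List, Tuple


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical metadata tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metadata (object_id INTEGER, field TEXT, value TEXT)")
    conn.execute("CREATE TABLE document (object_id INTEGER, path TEXT)")
    return conn


def store_object(conn: sqlite3.Connection, content_dir: Path, object_id: int,
                 metadata: List[Tuple[str, str]], filename: str,
                 content: bytes) -> Path:
    """Write the file to disk and record its metadata and path in the DB."""
    path = content_dir / f"{object_id}_{filename}"
    path.write_bytes(content)
    conn.executemany(
        "INSERT INTO metadata (object_id, field, value) VALUES (?, ?, ?)",
        [(object_id, name, value) for name, value in metadata])
    conn.execute("INSERT INTO document (object_id, path) VALUES (?, ?)",
                 (object_id, str(path)))
    conn.commit()
    return path


def find_by_field(conn: sqlite3.Connection, name: str,
                  value: str) -> List[Path]:
    """Look up content files through a metadata field stored in the DB."""
    rows = conn.execute(
        "SELECT d.path FROM metadata m "
        "JOIN document d ON d.object_id = m.object_id "
        "WHERE m.field = ? AND m.value = ?", (name, value)).fetchall()
    return [Path(row[0]) for row in rows]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conn = open_store()
        store_object(conn, Path(tmp), 1, [("dc.title", "Sample report")],
                     "report.txt", b"full text of the report")
        print(find_by_field(conn, "dc.title", "Sample report"))
```

The database answers the metadata query and returns only a path, so large content files never pass through the database itself.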

Search and browse

DSpace: Provides indexing for the basic metadata set (qualified DC) by default, using the relational database. Indexing of other defined metadata sets is also provided using the Jakarta Lucene API. Lucene supports fielded search, stemming and stop-word removal. Searching can be constrained to a collection or community. Browsing is offered by default on the title, author and date fields.

Greenstone: Indexing is offered for text documents and for specific metadata fields. Searching is provided within defined sections of a document (title, chapter, paragraph) or across the whole document. Stemming and case-sensitive searching are also available. The open-source Managing Gigabytes (MG) application is used to support indexing and searching. Browsing catalogs can be defined for specific fields using a hierarchical structure.

EPrints: Indexing is supported for every metadata field, using the MySQL database. Full-text indexing is supported for selected fields. Combined fielded search and free-text search are provided to the end-user. Browsing is provided using specified fields (e.g. title, author, subject).

Object management

DSpace: Items in DSpace are created using the web submission user interface or the batch item importer, which ingests XML metadata documents and the constituent content files. In both cases a workflow process may be initiated, depending on the collection configuration. The workflow can be configured to contain from one to three steps in which different users or groups may intervene in the item submission. Collections and communities are created using the web user interface.

Greenstone: New collections and the documents they contain are built using the Greenstone Librarian Interface or the command-line building program.

EPrints: A default web user interface is provided for creating and editing objects. Authority records can be used to help complete specific fields (e.g. authors, title). Objects can also be imported from text files in multiple formats (METS, DC, MODS, BibTeX, EndNote).

User interfaces

DSpace: A default web user interface is provided so that the end-user can browse a collection, view the qualified DC metadata of an item and navigate to its bitstreams. Navigation within an item is supported through the structural metadata, which may determine the ordering of complex content (such as book pages or web pages). A searching interface is provided by default that allows the user to search using keywords.

Greenstone: The default web user interface provides browsing and searching of collections, and navigation within hierarchical objects (such as books) using a table of contents. The presentation of documents or search results may differ depending on the specified XSLTs.

EPrints: The web user interface provides browsing by selected metadata fields (usually subject, title or date). Browsing can be hierarchical for subject fields. The searching environment allows the user to restrict the search query using multiple fields and to select values from lists.

Access control

DSpace: It supports users (e-people) and groups that hold different rights. Authentication is provided through user passwords, X509 certificates or LDAP. Access control rights are kept for each item and define the actions that a user is able to perform. These actions are: read/write the bitstreams of an item, add/remove the bundles of an item, read/write an item, and add/remove an item in a collection. Rights are based on a default-deny policy.

Greenstone: A user in Greenstone belongs to one of two predefined user groups: administrator or collection builder. The first user group has the right to create and delete users, while the second builds and updates collections. End-users have access to all the collections and documents.

EPrints: Registered users in EPrints are able to create and edit objects. Users log in using their username and password pair.

Multiple languages support

All the DL systems use Unicode character encoding, so different languages can be supported. Every system can use multiple languages in the metadata fields and digital content. Keystone and EPrints provide an XML attribute on metadata fields to define the language used for the field value. Greenstone provides ready-to-use multilingual interfaces already translated into many languages.

Interoperability features

All the DL systems support OAI-PMH in order to share the metadata of the DL with other repositories. Greenstone and Keystone also support the Z39.50 protocol for answering queries on specific metadata sets. Fedora and DSpace are able to export digital objects as METS XML files. Both systems also use persistent URIs to access the digital content, providing a unified access mechanism to external services. DSpace also supports the OpenURL protocol, providing links for every item page. EPrints exports data objects in METS and MPEG-21 Digital Item Declaration Language (DIDL) format.
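To make the OAI-PMH support mentioned above more concrete, the following Python sketch builds a `ListRecords` request URL and extracts Dublin Core titles from a response. It is an illustration only: the base URL is a placeholder, the sample response is a hand-written, heavily abbreviated stand-in for what a repository would return, and the helper names `list_records_url` and `extract_titles` are assumptions. No network request is made.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional
from urllib.parse import urlencode

# XML namespace for unqualified Dublin Core elements.
NAMESPACES = {"dc": "http://purl.org/dc/elements/1.1/"}

# Hand-written, abbreviated sample response (not real repository output).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None) -> str:
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec   # optional selective harvesting by set
    return f"{base_url}?{urlencode(params)}"


def extract_titles(xml_text: str) -> List[str]:
    """Return every dc:title value found in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iterfind(".//dc:title", NAMESPACES)]


if __name__ == "__main__":
    # Placeholder endpoint; a real repository publishes its own base URL.
    print(list_records_url("https://repository.example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

Because every compliant repository answers the same verbs with the same XML structure, a harvester written this way would in principle work against DSpace, Greenstone or EPrints alike, which is the point of the shared protocol.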

Level of customization

DSpace: Although DSpace has a flexible object model, it is not so open to constructing very different objects with independent metadata sets, because of its database-oriented architecture. The user interface is fixed and allows only minor changes to presentation. Another disadvantage is that only specific file formats are fully supported as digital content.

Greenstone: It provides customization for the presentation of a collection based on XSLTs and on agents that control specific actions of the DL. The Greenstone architecture provides (i) a back end that contains the collections and the documents as well as services to manage them, and (ii) a web-based front end that is responsible for the presentation of collections, documents and their searching environment.

EPrints: The data objects in EPrints contain user-defined metadata. Plug-ins can be written to export the data objects in different text formats. A core API in Perl is provided for developers who prefer to access basic DL functionality.

Findings

This comparison of five major open source software packages, based on the parameters mentioned above, has resulted in the following findings.
• DSpace is the most popular of the digital library solutions available in the open source domain. It is functionally richer and supports a wide range of object types, including text, sound, images and video. It provides detailed implementation guidelines.
• GSDL (Greenstone) and EPrints are also widely used and are low-cost options for repositories aimed primarily at open access to article pre-prints and post-prints, including digital theses. A range of object types can be uploaded, including video, audio, images and zip files. Educational institutions dominate in the use of these packages.
• Institutions for which EPrints is not quite suitable may find that DSpace and Greenstone more closely meet their needs, without being unnecessarily complex. India is benefiting well from the open source movement.
• DSpace supports community-based content policies and submission processes and accommodates various kinds of digital document formats.
• EPrints is a useful digital library system with a large user community. However, when there is a need for technical support and training in using the software, DSpace was found to be suitable.
• Though many libraries are using Greenstone, EPrints, Fedora and Keystone, the majority of libraries prefer DSpace, as it has several advantages and can support numerous forms and formats. It was also noted that using DSpace opens the possibility of interacting with other libraries in the city for technical support. Moreover, it is open source software and can be customized as per institutional requirements.

Conclusion: Digital Library Management Software (DLMS) presents an easy-to-use, customizable architecture for creating online digital libraries. With it, institutions and organizations can disseminate their research work, manuscripts or any other digital media, both for preservation and for worldwide dissemination of digital items. The software packages discussed above offer different services and architectures. It is difficult to propose one specific DLMS as the most suitable for all cases. This comparative study of open source digital library software can be used as a reference guide by any organization or institute to decide which package will be ideal for creating and showcasing its digital collection. The choice usually depends on the type and format of material, the distribution of material, the software platform, the time frame for setting up a digital library, and so on. (International Journal of Computer Applications (0975 – 8887) Volume 59– No.16, December 2012)

References:

1. C. Lagoze and H. Van de Sompel. The Open Archives Initiative: Building a low-barrier interoperability framework. In Proceedings of the Joint Conference on Digital Libraries (JCDL ’01), 2001.

2. Chamili, Khadijah (2012). Selection Criteria for Open Source Software Adoption in Malaysia. Asian Transactions on Basic and Applied Sciences (ATBAS ISSN: 2221-4291) Volume 02 Issue 02. Retrieved from http://www.asiantransactions.org/Journals/Vol02Issue02/ATBAS/ATBAS-60212027.pdf

3. Goh, D., Razikin, K., Chua, Alton Y.K., Lee, Chei Sian and Foo, Schubert (2009). On the Effectiveness of Social Tagging for Resource Discovery. Handbook of Research on Digital Libraries: Design, Development, and Impact. pp. 251-260. www.irmainternational.org/chapter/effectiveness-social-tagging-resource-discovery/19888/

4. Ibrahim, Ushaman Alhaji. Digitization of Library Resources and Formation of Digital Libraries: A Practical Approach. pp. 2. http://www.library.up.ac.za/digi/docs/alhaji_paper.pdf

5. Kinoshenko, D., Mashtalir, V., Shlyakhov, V. and Yegorova, E. (2012). Nested Partitions Properties for Spatial Content Image Retrieval. Multimedia Storage and Retrieval Innovations for Digital Library Systems. pp. 240-269. www.irma-international.org/chapter/nested-partitions-propertiesspatial-content/64471/

6. Kovacevic, A. and Devedzic, V. (2009). Duplicate Journal Title Detection in References. Handbook of Research on Digital Libraries: Design, Development, and Impact. pp. 235-242. www.irma-international.org/chapter/duplicate-journal-title-detection-references/19886/

7. Meitei, L.S. & Devi, P. (2009). Open source initiatives in digital preservations: The need for an open source digital repository and preservation system. In CALIBER 2009. http://hdl.handle.net/1944/996

8. R. Kahn and R. Wilensky. A Framework for Distributed Digital Object Services. Corporation of National Research Initiative - Reston USA, 1995. Available at http://www.cnri.reston.va.us/k-w.html.

9. Shaoqun Wu and Ian H. Witten (2010). First Person Singular: A Digital Library Collection that Helps Second Language Learners Express Themselves. International Journal of Digital Library Systems. pp. 24-43. http://dblp.uni-trier.de/pers/hd/w/Wu:Shaoqun

10. Tyagi, Sunil (2013). The Concept of Metadata for Digital Information Resources with Special Reference to Dublin Core (DC). Design, Development, and Management of Resources for Digital Library Services. pp. 160-170. www.irma-international.org/chapter/concept-metadatadigital-information-resources/72455/

11. DCMI Metadata Terms. Dublin Core Metadata Initiative. Available at http://www.dublincore.org/documents/dcmi-terms/

12. DSpace Federation. Available at http://www.dspace.org/

13. EPrints for Digital Repositories. Available at http://www.eprints.org/

14. Fedora Project. Available at http://www.fedora.info/

15. Greenstone Digital Library Software. Available at http://www.greenstone.org/

16. Keystone DLS. Available at http://www.indexdata.dk/keystone/

17. METS: An Overview & Tutorial. Library of Congress. Available at http://www.loc.gov/standards/mets/METSOverview.v2.html
