Project progress

DATABASE MANAGEMENT TOOL: HANDLING THE PETABYTES
by T. Kostoulas, C. Tsimpouris, O. Jahn, K. Riede, T. Ganchev, O. Kocsis, and N. Fakotakis

The AmiBio system foresees handling of a huge quantity of data, mostly audio, to be recorded by the monitoring stations and continuously transmitted to the central station for analysis. Handling petabytes of data at the central station imposes many requirements on the database functionality and the corresponding management tools. The preparatory action A.8 in AmiBio was dedicated to the creation of a database repository, which will store the species records in a structured manner, and to the implementation of software tools for semi-automatic annotation of the AmiBio recordings and for database management.

The main purpose of the database management tool is to provide access to the AmiBio database and to the repository of original sound recordings, processed audio signals, and reference recordings, thereby facilitating the management of the audio files and related metadata. Within the AmiBio database, three main repositories have been foreseen:
Unidentified sounds: All recordings for which no annotation information exists.
Reference sound library: All recordings which are linked with annotation/metadata information.
Training library: All recordings which are confirmed as “excellent” ones.

The architectural design of the management tool includes four main processes:
Automatic processing of recordings: The user can process any data residing in the reference sound library or among the unidentified sound signals.
Sound signal verification: For editing the source identification (ID) and for assigning subjective scores for the probability of a correct ID and for the sound quality.
Reference sound library management: Provides a flexible tool for editing the data entries; a history of all changes is generated and saved automatically.
Training library management: For managing the data residing in the training library (the one containing recordings rated as “excellent”).

All the database management tools are executed through a web-based interface, which also enables users to annotate selected recordings. The manual annotation of the recordings is performed using the external, standalone, open-source application Praat (www.praat.org). The specific actions that authorized users can perform are illustrated in Fig. 1.

Fig. 1. Architectural design of the database management tool
[Diagram labels: Automatic processing of recordings; Sound Signal Verification; Playback; See results of Automatic Processing; Revise Metadata; Confirm/Annotate ID; Save Sound Signal; Unidentified Sounds; Annotation/Metadata repository; Reference Sound Library Management; Reference Sound Library; Edit Data Entries; Statistics on the available recordings; Preselecting potential training recordings; Training Library; Editing of training recordings; Statistics on the training recordings]
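As a rough illustration of how the three repositories relate to each other, the following Python sketch assigns a recording to one of them based on its annotation state and quality rating. The class and field names are hypothetical, since the article does not describe the actual AmiBio schema, and the sketch assumes for simplicity that each recording belongs to exactly one repository.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Repository(Enum):
    UNIDENTIFIED_SOUNDS = "Unidentified sounds"
    REFERENCE_SOUND_LIBRARY = "Reference sound library"
    TRAINING_LIBRARY = "Training library"


@dataclass
class Recording:
    # Hypothetical fields; the real AmiBio schema is not given in the article.
    filename: str
    species_id: Optional[str] = None  # annotation/metadata, if any exists
    rating: Optional[str] = None      # e.g. "excellent", set during verification


def repository_for(rec: Recording) -> Repository:
    """Choose a repository from the annotation state and quality rating."""
    if rec.species_id is None:
        # No annotation information exists for this recording.
        return Repository.UNIDENTIFIED_SOUNDS
    if rec.rating == "excellent":
        # Annotated and confirmed as "excellent".
        return Repository.TRAINING_LIBRARY
    return Repository.REFERENCE_SOUND_LIBRARY
```

Under these assumptions, a recording moves from the unidentified sounds to the reference sound library once it is annotated, and on to the training library once it is rated as “excellent”.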

AMIBIO

The web interface for the AmiBio database management has been built on top of Drupal (drupal.org), a highly customizable Content Management System (CMS). This CMS was chosen for the following advantages:
• It is open source, and therefore free to use, customizable, and modular.
• It is supported by an active community of thousands of programmers.
• The AmiBio public web site has also been built on Drupal, so the two can be merged upon user request.
• The administrator can easily create user accounts and assign them to roles, each with different access levels and privileges.
• It has an effective out-of-the-box revisioning system, which is a requirement for AmiBio.
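The last two points, role-based privileges and revisioning, can be pictured with a small Python sketch. It does not use Drupal's API and is not the AmiBio implementation; the role names, permissions, and data layout are all hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical roles and permissions; the article does not list the actual ones.
ROLE_PERMISSIONS = {
    "viewer": {"view"},
    "annotator": {"view", "edit"},
    "administrator": {"view", "edit", "manage_users"},
}


def can(role: str, action: str) -> bool:
    """Return True if the given role is allowed to perform the action."""
    return action in ROLE_PERMISSIONS.get(role, set())


@dataclass
class Entry:
    metadata: dict
    revisions: list = field(default_factory=list)

    def edit(self, user: str, role: str, changes: dict) -> None:
        """Apply changes if permitted, recording the previous values first."""
        if not can(role, "edit"):
            raise PermissionError(f"role {role!r} may not edit entries")
        previous = {key: self.metadata.get(key) for key in changes}
        self.revisions.append(
            {"user": user, "when": datetime.now(timezone.utc), "previous": previous}
        )
        self.metadata.update(changes)
```

Because every successful edit appends a revision record before the metadata is updated, the history of changes is kept automatically and does not depend on the user remembering to save it.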

Fig. 2. Song of Fringilla coelebs

Fig. 2 shows the homepage for authorized users. All audio files can be seen from this page and, depending on access level and privileges, several other options are available:
• Navigate to other pages (top menu):
  • Audio files used For Training: only audio files that are stored for training are listed.
  • Full Audio List: the current page.
  • My Bookmarks: every user can personally bookmark audio files for later use.
  • Revision history: shows all changes that have been made by all users.
• Filter the search results (green form).
• Sort the search results by clicking the column titles (Titles).

Fig. 3 shows all available information for a single audio file, as presented to authorized users. A further menu (red box), enabled according to the access rights of the current user, lists the operations that can be executed on the file: Create copy, Tool1 (automatic processing), Tool2 (automatic processing), listen to the audio file (Hear It), make changes (Edit), or export to other databases (Export to Europeana, Export to GBIF Database). This menu is highly customizable according to user requirements, and new automatic processing tools can easily be added. Moreover, the user can add records to their personal bookmark list (green box) for later use.

Fig. 3. Available information and operations for audio files
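The filtering and sorting of the audio list described above can be sketched in a few lines of Python. The records and column names below are invented for illustration; the actual interface is provided by the Drupal-based web site, whose internals the article does not describe.

```python
from operator import itemgetter

# Hypothetical records and column names, for illustration only.
audio_files = [
    {"title": "rec_003", "species": "Fringilla coelebs", "training": True},
    {"title": "rec_001", "species": None, "training": False},
    {"title": "rec_002", "species": "Fringilla coelebs", "training": False},
]


def filter_and_sort(records, sort_by="title", descending=False, **criteria):
    """Keep records matching every criterion, then sort by one column."""
    matching = [
        r for r in records
        if all(r.get(key) == value for key, value in criteria.items())
    ]
    # The chosen column must hold mutually comparable values (e.g. all strings).
    return sorted(matching, key=itemgetter(sort_by), reverse=descending)
```

For example, `filter_and_sort(audio_files, training=True)` would correspond to the "Audio files used For Training" page, while passing `descending=True` mimics clicking a column title a second time.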

AmiBio

NEWSLETTER With the contribution of the LIFE financial instrument of the European Union

WWW.AMIBIO-PROJECT.EU

LIFE+ NATURE AND BIODIVERSITY

Contact Us Nikos Fakotakis, Project Coordinator Wire Communication Laboratory, University of Patras, 26500 Rion-Patras, Greece E-mail: [email protected] Phone: +30 2610 996 496 http://www.amibio-project.eu/

CONTENTS

Pages 2 & 3 Database Management Tool: Handling the Petabytes

Page 4 AmiBio—Activities in the 2nd Year

LIFE08 NAT/GR/000539

7th Issue, January 2012
