INFORMATION SYSTEM DEVELOPMENT IN THE “E-AGE”

Gábor Magyar, István Szakadát, and Gábor Knapp*

ABSTRACT

Audio-visual content plays an increasing role in cultural and scientific life, yet methods for preserving such objects and making them accessible are still under research and development all over the world. The problem arises from the multidisciplinary and heterogeneous nature of the content on the one hand, and from the continuous, high-speed change in infocommunication technology on the other. Successful co-operation among many different kinds of multicultural, multilingual repositories requires the use of widely accepted open standards and systems, while tailoring the collaboration among the project participants (the institutes and the experts in information technology, media and regulation) is an exciting challenge for project management. The project to initiate the National Audio-visual Archive (NAVA) of Hungary was founded in 1999 by the Ministry of National Cultural Heritage and is planned to be completed in December 2001. The theoretical and modelling aspects have now been clarified and described in detail, a test-bed has been established, and appropriate contacts with co-operating institutes have been made. Further research is needed on extending and propagating the metadata scheme and on the experimental integration of existing archives. This case study emphasises the importance of a manageable system development method; concrete development results and technological problems related to management functions and decisions are also mentioned. The project is an instructive case of information system development at the beginning of the new “e-age”, when fundamental rules become unstable. The paper introduces an extended model of the Controlled Iteration Method. A real example shows that a proper information system can be designed and realised rapidly even if the functional requirements change in the late phases of the project and the specification has to be defined in a very complex space.

* Gábor Magyar, Department of Telecommunications and Telematics, BUTE, Budapest, Hungary. István Szakadát, MATÁVnet Internet Provider and Development Ltd., Budapest, Hungary. Gábor Knapp, Centre of Information Technology, BUTE, Budapest, Hungary.


G. MAGYAR, I. SZAKADÁT, AND G. KNAPP

OBJECTIVES

Conventional memory institutions, such as libraries, archives and museums, have traditions going back hundreds of years in preserving knowledge and maintaining cultural and historical identity. In the analogue world these repositories were distinct and required different knowledge, methods and description techniques. In the previous decade the collections had to be supplemented with audio-visual objects, but this extension did not modify the processing significantly. Nowadays broadcast audio-visual content and the transient electronic medium of the internet play an increasing role; valuable products appear and sometimes, unfortunately, disappear irrecoverably. The preservation of electronic content is urgent, but it needs quite different processing methods and technology. The digital age has made possible not only preservation but, simultaneously, improved accessibility of knowledge. [1, 2]

The objectives arose from the demand to complete the range of cultural heritage to be preserved with broadcast audio-visual content, and to build an open information system that can serve as a nation-wide centre of standardisation, that is, as a methodological and technological interface for interoperability. These tasks could not be completed in one step because of the relevant technological and social aspects, but appropriate definitions and priorities could be established:

1. The most urgent task was to assure the legal, economic and technological framework for preserving broadcast audio-visual content.

2. An open system is to be specified, because the systems to be connected are not expected to use the same technology. Besides providing openness, the archive can serve as a centre of standardisation. The main technical aspects are:
   • streaming technology
   • database interoperability, multilingual, multicultural environment
   • standardised communication protocols (IP, XML, Z39.50)
   • system design method (OO, UML) [3]
   • cataloguing, thesauri (MARC, AACR2R, UDC, DDC, LCC)

3. The system has to be designed for continuous change of technology and customer demand at both the end-user and the production end of the supply chain. These criteria can partly be satisfied by the open architecture described above.

An overall objective of the project was to demonstrate that it is possible to establish, develop and maintain the interoperability of different metadata bases or multimedia databases by relying on easily applicable, widely accepted and widely applied common methods and tools for modelling, cataloguing, classification, information retrieval and searching.

METHOD

We gained important experience in applying the controlled iteration method under the special conditions of the project. [4] According to the classical principle, you must first have a strategy, a clear conceptual space and a clear vision of the role of the planned information system; only then can you define the functionality and derive the plans, design, etc. [5] In the “e-age”, however, it is typical (hopefully only temporarily) that users cannot even imagine the new services that are possible, because a service does not exist at the moment or has not yet been introduced. How, for example, can you specify the user requirements of a public audio-visual archive? Will you derive them from the experience of libraries? A library and an online archive (where you can obtain not only online catalogue data and abstracts but the entire content, in different qualities, using advanced in-depth searching tools, etc.) have totally different functions in information storage and access! The number of permitted concurrent user channels (one aspect of the user requirements) can be 10, 50, 100 or more. It is a question of money, but what is more, it has strong socio-cultural impacts as well: 10 is enough for local users, 50 is suitable for a closed user group, and 100 fits the needs of major library sites, but what about internet access? Computer systems are scalable, but only within a limited range of performance. Besides technology, the decision has cultural, political, financial, legal and other aspects as well. All these factors change rapidly in time; legal conditions (copyright, media regulation, digital signature), for example, hold application developers in a state of uncertainty.

The system development process leading to the specification of the information system of the National Audio-visual Archive [6] has some special properties and several uncertainties, so conventional methods could not be used, because
• there was no similar, fully tested model to follow, no antecedents, no experience at all;
• the investor had no clear vision of the functionality;
• the customer needs were (and still are) hardly known, which is quite common in the early e-economy;
• both investor and customer expected rapid development with fast results;
• even the tools of development were under development, and the components of the system had to be fitted and tested;
• the experts of the project had to continuously enhance and refine their skills, building on the knowledge obtained in previous stages.

The system design method to be selected has to handle all the problems above. To manage these and the other project-specific problems, Controlled Iteration was chosen as the basis of the methodology used for the life cycle of the project. The Controlled Iteration model combines the approaches of managers and software developers, is based on development phases, and manages the tasks as development activities that serve the user requirements. The model provides smooth transitions between the phases (based on a maturity criterion). This feature means higher responsibility for the management, but it ensures flexibility in handling unusual cases. Close and frequent interaction of the requirement, content and plan management functions can be realised. [7] The Maturity Model could be used to control ever-changing circumstances. The technological possibilities change substantially practically every six months: novel technical solutions appear that provide better performance at lower prices and may force a fundamental rethinking of concepts. [8] Given the nature of the task, it was clear without further consideration that the object-oriented approach had to be supported and that Rapid Application Development tools and prototyping techniques were to be used.

[Figure 1. The Controlled Iteration Model. Activities: System description, product definition; Design, Modelling; Implementation, Building system; Verification, Testing. Phases: Inception; Elaboration; Construction; Transition.]

The project management has to co-ordinate the parallel sub-projects (legal, organisational and financial feasibility, audio-visual technology, metadata-base, streaming video database) in each phase. A groupware environment was established to provide controlled, safe and fast information exchange. The critical factor in the model is determining the level of maturity. The initial specification changes during each phase of development, because of the experience gained by the project team and the changes in the technological environment. Who can decide that 80% of the work has been done and the next phase can be started? The purpose was to avoid unlimited project duration and costs while finding a good method of approaching user needs (even if they are unknown or undiscovered in the early stages). The model was therefore supplemented with a conceptual frame: a Board of Experts was established, representing all the consumers and suppliers concerned, including specialists in media, libraries, archives, audio-visual and internet technology, copyright and media law, and informatics. With the introduction of this new layer, the phase-specific conventional iteration between system description, design, implementation and verification can be extended to the conceptual frame. The most crucial questions can be “feature-extracted” and transferred to a higher level of abstraction, while the iterative nature of the development method is not diminished.

[Figure 2. Extended model for the project. The Controlled Iteration Model of Figure 1 (activities: System description, product definition; Design, Modelling; Implementation, Building system; Verification, Testing; phases: Inception; Elaboration; Construction; Transition) is topped by a conceptual frame, the Board of Experts, with conceptual decision points.]

The competent experts cannot all be brought into the project itself, because specialists with the appropriate skills and business background are few, not to mention the cost and organisational difficulty of daily collaboration among very busy professionals. The conceptual frame is therefore a loosely coupled part of the project: the Board of Experts works periodically, and its sessions are held only at phase transitions. Of course, experts are employed by the project itself as far as is necessary to complete its tasks. The use of this extended model was essential for success, and it proved to be a suitable frame for the development. The most critical points of the extended model are:
• the adequate and clear definition of conceptual decision points,
• the proper planning of iteration cycles,
• the determined and consistent management of phases,
• effective (software) support tools.
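The interplay of phase iterations, the maturity criterion and the Board of Experts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 0.8 threshold (taken from the "80% of the work" question above), the iteration cap and the two callbacks are hypothetical and are not part of the project's tooling.

```python
from dataclasses import dataclass
from typing import Callable, List

PHASES = ["Inception", "Elaboration", "Construction", "Transition"]


@dataclass
class PhaseLog:
    phase: str
    iterations: int
    maturity: float
    board_approved: bool


def run_project(assess_maturity: Callable[[str, int], float],
                board_approves: Callable[[str, float], bool],
                threshold: float = 0.8,      # assumed maturity criterion
                max_iterations: int = 5) -> List[PhaseLog]:
    """Iterate inside each phase until it is mature and the board agrees."""
    log: List[PhaseLog] = []
    for phase in PHASES:
        iterations, maturity, approved = 0, 0.0, False
        while iterations < max_iterations and not approved:
            iterations += 1
            # One iteration stands for a pass through all four activities;
            # the work itself is abstracted into the maturity assessment.
            maturity = assess_maturity(phase, iterations)
            if maturity >= threshold:
                # Conceptual decision point: the board session at the
                # phase transition decides whether to move on.
                approved = board_approves(phase, maturity)
        log.append(PhaseLog(phase, iterations, maturity, approved))
        if not approved:
            break  # iteration budget spent without a decision to proceed
    return log


# Hypothetical inputs: maturity grows by 25% per iteration.
steady = run_project(lambda phase, i: min(1.0, 0.25 * i), lambda phase, m: True)
blocked = run_project(lambda phase, i: min(1.0, 0.25 * i), lambda phase, m: False)
```

The iteration cap reflects the stated purpose of avoiding unlimited project duration and costs; when the board withholds approval, the sketch stops instead of iterating forever.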

CURRENT STATUS: RESULTS AND PLANS

The following is a brief summary of the design and implementation tasks. This overview illustrates that the project really had a high number of decision points, even at the conceptual level. A feasibility study and the detailed logical design of the metadata-base were completed; moreover, a pilot digital audio-visual archive was developed. [9]


The feasibility study covers the functional, organisational, legal and financial questions, and summarises the current technical trends as well. Alternatives were argued for the decisions on the scope of the collection, the functionality of the archive (services), the legal status, the necessary investment and operating costs, etc. If we give the multimedia content wide publicity, we have to ensure a fast, easy, understandable and usable searching and navigation process. This is not an easy task. One of the most important consequences of convergence is that the category of "multiculture" has come into existence, but that notion has several meanings: it can mean multilingual, multi-professional or multi-cultural. People who live in different countries, speak different languages, rely on different knowledge bases, and have different professions, skills and value sets can easily communicate with each other in the age of the Internet. Even if we believe that the process of convergence is unstoppable (which means, from a certain point of view, that interconnected online information systems become more and more integrated), we must not suppose that this integration will result in a totally unified information system. The differences between the specific areas remain within the digital world as well, so we cannot expect a single platform to exist. We therefore need (i) protocols that provide communication between the different multimedia contents and databases, and (ii) open communication standards. (Of course, it is an open question which standards are already widely accepted or have a chance of becoming so.) To describe a digital object we need meta information about it on the one hand, and on the other we have to store the information in records, which can be divided into four ingredients (content, structure, layout and context).
In line with the trends of the digital world we use the XML language family as the standard markup language, [10] the Unified Modelling Language (UML) as the standard modelling language, and an object-oriented development method. [11] In the field of meta information systems there are many profession-specific metaschemes, classification systems, thesauri, and structural, administrative and descriptive metadata blocks, and we have to investigate to what extent we can use these solutions for our specific purposes: standards from the area of archives such as ISAD(G), ISAAR(CPF), AED and the IASA Cataloguing Rules; [12] standards and classification systems of librarians such as MARC, AACR2R, UDC, DDC and LCC; communication standards such as Z39.50 [13] and GILS; and standards of the broadcast industry such as the SMPTE Metadata Dictionary, EBU materials and UMID. [14] No existing meta information system satisfies our requirements. There is, however, one metadata system, the Dublin Core initiative, that has a chance of becoming a minimal consensus metadata system accepted by the most important players of the digital world. The Dublin Core will probably be the only commonly accepted and applicable meta information set. From another point of view this means that there will be no convergence in the metadata of the different memory institutions: the standards of librarians cannot be applied to audio-visual archives and vice versa. If the different metadata schemes continue to co-exist in the future (as we believe they will), we need a tool to handle the different meta information systems in a unified way, and we rely on the Resource Description Framework (RDF) standard, which has the same objective. [15]
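The idea of handling co-existing metadata schemes in a unified way can be illustrated with RDF-style statements (subject, predicate, object). The sketch below is plain Python, not an RDF library, and makes several assumptions: the `MARC` namespace URI, the item identifier, the sample values and the two-entry equivalence table are invented for illustration; only the Dublin Core element namespace is the published one.

```python
from typing import List, NamedTuple, Set

DC = "http://purl.org/dc/elements/1.1/"    # Dublin Core element namespace
MARC = "http://example.org/marc-fields/"   # hypothetical namespace for MARC fields


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Illustrative crosswalk: MARC 245 $a (title) and 100 $a (main entry,
# personal name) mapped onto Dublin Core elements.
EQUIVALENT = {
    MARC + "245a": DC + "title",
    MARC + "100a": DC + "creator",
}


def normalise(triples: List[Triple]) -> List[Triple]:
    """Rewrite scheme-specific predicates to their common equivalent."""
    return [Triple(t.subject, EQUIVALENT.get(t.predicate, t.predicate), t.obj)
            for t in triples]


def values(triples: List[Triple], subject: str, predicate: str) -> Set[str]:
    """Collect all values of one property, whichever scheme supplied them."""
    return {t.obj for t in normalise(triples)
            if t.subject == subject and t.predicate == predicate}


ITEM = "urn:example:programme:1"           # hypothetical identifier
STATEMENTS = [
    Triple(ITEM, MARC + "245a", "Example programme (library record)"),
    Triple(ITEM, DC + "title", "Example programme (broadcaster record)"),
    Triple(ITEM, MARC + "100a", "Example Author"),
]
TITLES = values(STATEMENTS, ITEM, DC + "title")
```

Because every statement has the same three-part shape, records from a library catalogue and from a broadcaster can sit in one collection and be queried through the minimal common vocabulary, while predicates without an equivalent are kept unchanged.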


Within the process of modelling the archive we have to investigate which elements we can use from other similar projects, such as the CCSDS Open Archival Information System (OAIS), [16] the Making of America II project of the University of California, Berkeley, [17] or the Audio-visual Prototype Project of the Library of Congress. [18, 19] We also have to focus on another important initiative, MPEG-7 (also referred to as the Multimedia Content Description Interface), because the objectives of that project are very similar to ours. [20] Based on these decisions a series of models, vocabularies and architectures were analysed. For lack of widely accepted standards and models we had to develop an extensive object and context vocabulary.

Table 1. Example: Part of the context vocabulary (columns: unit_hun, unit_eng, description, note, link, rule; empty columns omitted)

unit_hun | unit_eng | link
Képernyőszöveg | monitor text |
Játékfilm | feature film | film
animációs film | animation film | film
Némafilm | silent film | film
hangos film | sound film | film
dokumentumfilm | documentary film | film
fikciós film | fiction film | film
Hipermédia | hypermedia |
Medium | medium |
interaktív média | interactive media |
Média | media |
Multimédia | multimedia |
Hipertext | hypertext |
Fontface | fontface |
Programkód | source code |
adatbázis | database |
modell | model |
Térkép | map |
Újság | newspaper |
Könyv | book |
Folyóirat | journal |
Periodika | periodical |
Üzenet | message |
gyűjteményes dokumentum | collection-like document |

Four further rows of this part carry the link value dokumentum (document).

unit_hun: adáshossz
unit_eng: transmission period
description: adott műsorszám másodpercpontos játszási időtartama valós időben (the playing time of a given programme item in real time, accurate to the second)
note / link: Műsorszámhossz, időpontosság, valós idő

unit_hun: adásidő
unit_eng: transmission time
description: a műsoridő és a műsorszámnak nem minősülő információk együttes időtartamának összege (the combined duration of programme time and of information that does not qualify as a programme item)
note / link: műsoridő, műsorszámnak nem minősülő információ

unit_hun: adáskezdet
unit_eng: transmission start
description: adott műsoradón adott műsornapon adott műsorszám valós idejű kezdési időpontja (the real-time starting time of a given programme item on a given broadcaster on a given broadcast day)
note / link: valós idő, műsorszám
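A bilingual context vocabulary like the one in Table 1 lends itself to a simple lookup structure. The following sketch assumes that each row has a Hungarian unit, an English unit and an optional link to another term; the sample rows are taken from the table, while the `link` values for the film rows are read from the flattened table and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Term:
    unit_hun: str
    unit_eng: str
    link: Optional[str] = None   # assumed to name a related (broader) term


VOCABULARY = [
    Term("Játékfilm", "feature film", "film"),
    Term("Némafilm", "silent film", "film"),
    Term("dokumentumfilm", "documentary film", "film"),
    Term("Hipertext", "hypertext"),
    Term("Térkép", "map"),
]


def build_index(terms: List[Term]) -> Dict[str, Term]:
    """Index every term under both language forms, ignoring case."""
    index: Dict[str, Term] = {}
    for term in terms:
        index[term.unit_hun.casefold()] = term
        index[term.unit_eng.casefold()] = term
    return index


def translate(index: Dict[str, Term], word: str) -> Optional[str]:
    """Return the other-language form of a known term, else None."""
    term = index.get(word.casefold())
    if term is None:
        return None
    if word.casefold() == term.unit_hun.casefold():
        return term.unit_eng
    return term.unit_hun


def linked_to(terms: List[Term], target: str) -> List[str]:
    """List the English forms of all terms whose link points to target."""
    return [t.unit_eng for t in terms if t.link == target]


INDEX = build_index(VOCABULARY)
```

A search interface could use `translate` to accept queries in either language and `linked_to` to widen a query on "film" to its linked terms.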


As a minimal scheme of the metadata-base, the Dublin Core model was selected. The Dublin Core has 15 widely accepted description elements. [21, 22]

Table 2. Example: The minimal scheme – Dublin Core

1. Cím (cím, title)
2. Alkotó: szerző vagy létrehozó (létrehozó, creator)
3. Tárgy- és kulcsszavak (tárgy, subject)
4. Leírás (leírás, description)
5. Kiadó (kiadó, publisher)
6. Egyéb közreműködő (közreműködő, contributor)
7. Dátum (dátum, date)
8. Forrástípus (típus, type)
9. Formátum (formátum, format)
10. Forrásazonosító (azonosító, identifier)
11. Forrás (forrás, source)
12. Nyelv (nyelv, language)
13. Kapcsolat (kapcsolat, relation)
14. Lefedett téma (téma, coverage)
15. Szerzői jog (jog, rights)
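As an illustration of how a record in this minimal scheme might be exchanged, the sketch below validates a record against the fifteen element names of Table 2 and serialises it as XML with Python's standard library. The `record` wrapper element, the function name and the sample values are assumptions made for the example; the element names and the Dublin Core namespace are the published ones.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_xml(record: Dict[str, List[str]]) -> str:
    """Serialise a record whose keys are Dublin Core element names.

    Every element is optional and repeatable, so each key maps to a list.
    """
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")            # hypothetical wrapper element
    for name in DC_ELEMENTS:               # keep the order of Table 2
        for value in record.get(name, []):
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Invented sample values, for illustration only.
SAMPLE = {"title": ["Example programme"], "language": ["hu"], "date": ["2001-05-01"]}
SAMPLE_XML = to_xml(SAMPLE)
```

Rejecting unknown keys keeps the minimal scheme minimal; elements of the extended model would need their own namespace in such a design.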

(The adapted model has, of course, a detailed description in Hungarian.) Meta information elements were divided into three basic groups:
• structural information
• descriptive information
• administrative information

The content and scope of the structural information were analysed first. The “program entity” was chosen as the central object category of our broadcast audio-visual archive. Fig. 3 shows, as an example, the hierarchical structure of this category.

[Figure 3. Example: The hierarchical structure of the Program entity category. Levels shown, from top to bottom: Product, Series, Episode, Segment, Shot.]
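The hierarchy of Fig. 3 can be modelled as a small tree. The sketch below assumes that each level nests directly inside the one above it (Product, Series, Episode, Segment, Shot); the class and method names are hypothetical and only illustrate the structure.

```python
from dataclasses import dataclass, field
from typing import Iterator, List

LEVELS = ["Product", "Series", "Episode", "Segment", "Shot"]


@dataclass
class ProgramEntity:
    level: str
    name: str
    children: List["ProgramEntity"] = field(default_factory=list)

    def add(self, name: str) -> "ProgramEntity":
        """Create a child one level below this entity."""
        depth = LEVELS.index(self.level)
        if depth + 1 >= len(LEVELS):
            raise ValueError(f"a {self.level} cannot contain further entities")
        child = ProgramEntity(LEVELS[depth + 1], name)
        self.children.append(child)
        return child

    def paths(self, prefix: str = "") -> Iterator[str]:
        """Yield the path of this entity and of all its descendants."""
        here = f"{prefix}/{self.name}" if prefix else self.name
        yield here
        for child in self.children:
            yield from child.paths(here)


product = ProgramEntity("Product", "Product")
series_1 = product.add("Series 1")
episode_1 = series_1.add("Episode 1")
segment_1 = episode_1.add("Segment 1")
shot_1 = segment_1.add("Shot 1")
series_1.add("Episode 2")
ALL_PATHS = list(product.paths())
```

Paths of this kind could serve as structural identifiers that let a description attached to a segment or shot be located within its episode and series.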


Later we worked out an extended model and introduced the “digital document” category. Perhaps this sounds easy, but “document” has many different meanings for different professionals. The ontology of the different kinds of document object categories is shown in Fig. 4.

[Figure 4. The ontology of the document object categories. Node labels: still image, voice, motion picture, composition, image, silent film, sound film, sheet music, letter, glyph, word, classification, record, sentence, table, part of text, database, text, graphical representation, letter representation, graph, multimedia, non-linear, chart, writing, linear, speech, hypertext, digital, audiovisual culture, analogue, written culture, oral culture, language. Dimensions: type of information, type of document, time-invariant phenomenon, time-variant phenomenon.]

Descriptive information covers genre, copyright, source, owner, history, etc. An entire typology of descriptive information was developed.


Administrative information covers the access management system, the archive copy history, archiving events, etc.

Table 3. Example: administrative information elements

ID | Name | Description
1 | access_category | Code for the category of access, to be used by the access management system together with the identification of the user. Coding allows for shifts in subcategory treatment.
2 | access_expiration_date | Date that the access category is expected to change.
3 | access_information | Any additional information about access management; not to be confused with information about rights, rightsholders, etc.
4 | access_rights | Pointer to the location that contains a record of the rightsholders, e.g., intellectual property owner, donor, etc., and to related information.
5 | archive_date_time | Date and time of archiving. Associated with a particular archiving event, thus with a particular archive_identifier. Likely to be subsumed within archive_history.
6 | archive_history | Listing of archiving events; in effect, a compendium of all archive_date_time and archive_identifier data (which two fields may in the end simply be subsumed within this field).
7 | archive_identifier | Pointer to the entity that has been archived according to the application of the archiving_profile on a particular archive_date_time. Likely to be subsumed within archive_history.
8 | archive_next_date_time | Date and time for creation of an additional secure copy of this object. Some archiving will be triggered by changes in an entity made by the "producer." This trigger is proposed for situations where there are no producer-warranted changes and would be based on, say, longevity analysis of the storage media or format.
9 | archiving_profile | Identifies the program (or equivalent) used to manage the archiving of this object for users. The program will build the entity to be archived, which will be assigned a handle. Compare to presentation_profile.
10 | associated_file_name | Name of the support file, e.g., 11320.ent, cw005.ram.
11 | associated_file_type | Purpose or function of the support file (e.g., pgi or ent file for SGML).
12 | audio_bits_per_sample | For sound files (where relevant), number of bits per sample. Compare to image_bit_depth.
13 | audio_channel_configuration | Indicator for audio channel configuration (e.g., stereo, mono, surround sound, bilingual, etc.) for the digital file. Not to be confused with similar data for the original within reformatting_documentation.
14 | audio_channel_information | Additional information about audio channel configuration (e.g., which languages, which channels to rear speakers, etc.). Not to be confused with similar data for the original within reformatting_documentation.
15 | audio_sampling_frequency | For sound files (where relevant), number of samples per second. Compare to image_spatial_resolution and video_data_rate.
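A few of the elements in Table 3 can be put together in a small record type to show how they interact. The sketch uses the field names of the table; the data types, the method names and the rule that a review is due once the expiration date is reached are assumptions, and archive_history is modelled as a list of (archive_date_time, archive_identifier) pairs, following the remark that the two fields may be subsumed within it.

```python
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import List, Optional, Tuple


@dataclass
class AdministrativeInfo:
    access_category: str
    access_expiration_date: Optional[date] = None
    archive_next_date_time: Optional[datetime] = None
    # Each archiving event: (archive_date_time, archive_identifier).
    archive_history: List[Tuple[datetime, str]] = field(default_factory=list)

    def record_archiving(self, archive_date_time: datetime,
                         archive_identifier: str) -> None:
        """Add an archiving event and keep the history in time order."""
        self.archive_history.append((archive_date_time, archive_identifier))
        self.archive_history.sort()

    def access_review_due(self, today: date) -> bool:
        """True once the access category is expected to change."""
        return (self.access_expiration_date is not None
                and today >= self.access_expiration_date)

    def archiving_due(self, now: datetime) -> bool:
        """True once the next secure copy is scheduled to be made."""
        return (self.archive_next_date_time is not None
                and now >= self.archive_next_date_time)


# Invented sample values, for illustration only.
info = AdministrativeInfo(
    access_category="restricted",
    access_expiration_date=date(2002, 1, 1),
    archive_next_date_time=datetime(2001, 12, 1, 0, 0),
)
info.record_archiving(datetime(2001, 6, 1, 12, 0), "copy-2")
info.record_archiving(datetime(2001, 3, 1, 12, 0), "copy-1")
```

Keeping the history sorted means the latest archiving event is always the last entry, which is convenient when scheduling archive_next_date_time.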

As a result of the research, the minimal scheme has been realised; it is based on the Dublin Core initiative but slightly modified to fit the domestic practice of production sites and existing archives. A complete pilot system was established that can be used to verify the functionality of applications and of other hardware and software components. Following the decision of the Board, the project continues towards extending the metadata scheme to the maximal model, enhancing the capabilities of the archive for research and educational purposes. Trials are to be made on cross-border and multicultural interoperation.


CONCLUSION

The paper has demonstrated that a proper information system can be designed and realised rapidly even if the functional requirements change in the late phases of the project and the specification has to be defined in a very complex space. For that purpose an extended model of the Controlled Iteration Method was introduced. With the extended method, the architecture and applications were developed rapidly, and the models could be adapted to the continuously changing environment. The main advantage, however, is that all events, progress reports and solutions are known to the expert board, so all decisions can be made by consensus after extensive discussion. All stakeholders (e.g. the actors in the subsequent archiving process) can thus influence the development at specific points of the project. The method has a positive balance: the management effort, including the effort of keeping the board informed and active, pays off in a relatively easy and fast implementation procedure.

REFERENCES

1. Andreas Tirakis: Distributed Audio-Visual Archives: the DiVAN Project. EBU, "Programme Archives", Geneva, January 1999.
2. Guidelines on best practices for using electronic information (How to deal with machine-readable data and electronic documents). DLM Forum, European Communities, 1997.
3. Cantor, Murray: Object-Oriented Project Management with UML. New York: John Wiley & Sons, Inc., 1998.
4. Paulk, Weber, Curtis: The Capability Maturity Model: Guidelines for Improving the Software Process. Addison-Wesley, 1995.
5. Suzanne and James Robertson: Mastering the Requirement Process. Addison-Wesley, 1999.
6. György, P., Knapp, G., Kovács, A.B., Magyar, G., Rozgonyi, K., Szakadát, I.: "National Audiovisual Archive Pilot Project". Poster at the VI. European Conference on Archives, Florence, 2001.
7. Stephens, Luke: Understanding Analysis and Analysis Models: Models and Notations of Formal Object-Oriented Methods. Raleigh, NC: Parc Place-Digitalk, 1996.
8. Raynus: Software Process Improvement with CMM. Artech House, 1998.
9. Knapp, G., Kovács, A.B., Magyar, G.: National Audiovisual Archive Pilot Project and Feasibility Study. Presented at the Europrix – IST CEE Regional Workshop, Budapest, 2001.
10. W3C Consortium: XML Protocol Activity, http://www.w3.org/2000/xp/
11. Fowler, Martin and Kendall Scott: UML Distilled: Applying the Standard Object Modeling Language. Menlo Park, CA: Addison Wesley Longman, 1997.
12. The IASA Cataloguing Rules, http://www.llgc.org.uk/iasa/icat/icat001.htm
13. ANSI/NISO Z39.50-1995 Information Retrieval (Z39.50): Application Service Definition and Protocol Specification.
14. Wilkinson, Cox: An SMPTE Proposed Standard for the Unique Material Identifier (UMID). EBU, "Programme Archives", Geneva, January 1999.
15. Resource Description Framework (RDF) Model and Syntax Specification, W3C Recommendation 22 February 1999, http://www.w3.org/TR/REC-rdf-syntax
16. OSI Reference Model for an Open Archival Information System (OAIS). NASA, Washington, 1998.
17. The Making of America II Testbed Project White Paper, Version 2.0 (September 15, 1998), wp-v2.pdf
18. Digital Repository for Audio-Visual Preservation – The Library of Congress Prototyping Project, http://lcweb.loc.gov/rr/mopic/avprot/avprhome.html
19. Library of Congress "digital library" model, http://lcweb.loc.gov/standards/metadata.html
20. Multimedia catalogue: the RAI experience. EBU, "Programme Archives", Geneva, January 1999.
21. INDECS – Interoperability of Data in E-Commerce Systems, http://www.indecs.org
22. Internet RFC 2413 (The Dublin Core Metadata for Simple Resource Discovery).