
Encoding the ANSI Z39.50 Search and Retrieval Protocol using LOOPN

John William Lamp BSc GradDipSc(IT) CD

Thesis submitted in partial fulfilment of the requirements of the degree of Bachelor of Computing with Honours

Department of Computer Science University of Tasmania

November 1994


Abstract

An examination of the Z39.50 protocol in its context is made, and a methodology for expressing it in Petri nets is proposed and implemented. The resulting place/transition nets are expressed in LOOPN modules and source code. Advantages and difficulties of the use of this particular form of expression are noted, and recommendations for further analysis and work in this area are made.


Acknowledgements

I would like to acknowledge the support of my supervisor, Brian Marriott, especially when it became obvious that my original project was no longer appropriate and a replacement project needed to be found. Charles Lakos and Chris Keen assisted with a number of aspects of this project. Clifford Lynch, Ray Denenberg and Ralph LeVan, members of the Z39.50 Implementors Group, were particularly helpful throughout the year. No project such as this can ever be successful without the support of family members. Bernice, Matthew and Gemma all helped in their own ways.

The opportunity for me to undertake this year as a full time project would not have been possible without the financial support provided by the Higman Prize and Comcare Australia.


Table of Contents

1. Introduction
    Definitions
    Information Management and the Internet
        Information management concepts
        The internet
        Applications for accessing data
        Network management and information management
2. Aims and Objectives
    What is original in the work
    Overview of rest of thesis
3. Theoretical Background
    The OSI Model
        The OSI reference model
    ANSI Z39.50
        Background of Z39.50
        Implementations of Z39.50
        Mechanism of operation
        Future development of ANSI Z39.50
    Petri Nets
    LOOPN
        Language features
4. Experimental Design
    Research Methods
        Top down approach
5. Results and Analysis
    Identification of the services
    Place/Transition Nets
        The Initialisation Service
        The Search, Present, Delete and Resource-report Services
        The Access-control Service
        The Trigger-resource-control Service
        The Resource-control Service
        The IR-abort Service
        The IR-release Service
    ANSI Z39.50 on the Internet
    LOOPN Modules
        The Z39.50 Initialisation/IR-release Service
        The Search, Present, Delete and Resource-report Services
        The Trigger-resource-control Service
        The Access-control Service
        The Resource-control Service
        The IR-abort Service
            First Approach
            Second Approach
            Other Approaches
        Assembly
6. Conclusions
    General conclusions
    Achievements
    Further analysis
    Future work
Bibliography
Glossary
Appendix 1: Z39.50-1994 Ballot Letter
Appendix 2: Using Z39.50 on the internet

Figures

1.1 The Searching Procedure
3.1 The OSI Seven Layer Model
3.2 Simplified Z39.50 Interaction
3.3 The Firing of a Transition
3.4 A Coloured Net
4.1 Sequence of Interactions
4.2 Z39.50 Origin Initialisation Service
4.3 Z39.50 Origin Search, Present, Delete, Resource-report Service
4.4 Z39.50 Origin Access-control Service
4.5 Z39.50 Origin Trigger-resource-control Service
4.6 Z39.50 Origin Resource-control Service
4.7 Z39.50 Origin IR-abort Service
4.8 Z39.50 Origin IR-release Service
4.9 Z39.50 Origin Initialisation/IR-release Service
4.10 Z39.50 Origin Ideal Structure
4.11 Z39.50 Origin init_places
4.12 Z39.50 Origin open_ok
4.13 Z39.50 Origin open_not_ok
4.14 Z39.50 Origin irel_req
4.15 Z39.50 Origin open_Arel_conf
4.16 Z39.50 Origin generic send
4.17 Z39.50 Origin generic Access_control
4.18 Z39.50 Origin generic Resource_control
4.19 Z39.50 Origin partial IR_abort
4.20 Origin Test Net


Tables

3.1 Some Currently Operating Z39.50-1992 Servers
3.2 State Table for Z39.50-1992 Origin

Listings

4.1 Part of defs.l
4.2 init_places.l
4.3 open_ok.l
4.4 open_not_ok.l
4.5 irel_req.l
4.6 open_Arel_conf.l
4.7 init_origin.l
4.8 send.l
4.9 from Z3950_origin.l (a)
4.10 trig_res.l
4.11 acc.l
4.12 from Z3950_origin.l (b)
4.13 rcs.l
4.14 Z3950_origin_abort.l
4.15 abort.l
4.16 send_ab.l


1. Introduction

Definitions

There are many challenges confronting information systems practitioners today. A large number of these relate to the explosion of available information, whether in printed form or in various forms of electronic media. In particular, the need for tools to manage and access this information, and the means by which these tools can be designed and created, are areas of significant research. In this thesis I have looked at a particular area of electronic information, the Internet, and at one of the tools used in that area, the ANSI Z39.50 Search and Retrieval Protocol, and examined:

• the definition of the protocol,

• the degree to which a particular form of information modelling, object-oriented Petri nets, can be used in this area, and

• a methodology for deriving object Petri nets from the protocol specification.

Information Management and the Internet

One of the basic requirements of human society is information.

To do what we must or want to do, we need knowledge. It has become impossible for each of us individually to acquire this by trial and error only. Therefore we must gain our knowledge from others. Thus, information may be defined as: any kind of knowledge about things, facts, concepts, etc., that is exchangeable among users. (Griethuysen, 1989, p206)

Information management concepts

To effectively use the information which an individual or organisation has acquired, it is necessary for this information to be managed, so that, at some time after its acquisition, it is possible to retrieve the information and make it available for use. The nature and extent of the collection of information will determine the magnitude of the information management task. A small collection may only require a simple record of its contents. Larger collections may require more complex management systems including a variety of access points to the collection.


Indexing

Indexing is the process of creating some form of structured list which may be searched in order to locate one or more specific items within a collection. The scope and order of an index will determine the access points to a collection. For example, library collections traditionally use author and title indexes as the two most common methods of access. There are many other values by which a collection can be indexed, and these reflect the properties of the collection being indexed. The directory entries in a file system are an example of an index which can usually offer access alphabetically, by date, by size, or by type.
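As a minimal sketch of the idea that each index supplies one access point to a collection, the following Python builds an author index and a title index over a small collection. The records, the field names and the build_index helper are invented for illustration and are not drawn from any particular library system.

```python
from collections import defaultdict

# Hypothetical collection: each record describes one item that is held.
COLLECTION = [
    {"id": 1, "author": "Adams", "title": "Networks in Practice"},
    {"id": 2, "author": "Baker", "title": "An Atlas of Protocols"},
    {"id": 3, "author": "Adams", "title": "Catalogues and Indexes"},
]

def build_index(records, field):
    """Map each value of `field` to the ids of the records that carry it."""
    index = defaultdict(list)
    for record in records:
        index[record[field]].append(record["id"])
    # Sorting the keys gives the index an order in which it can be browsed.
    return dict(sorted(index.items()))

author_index = build_index(COLLECTION, "author")   # one access point
title_index = build_index(COLLECTION, "title")     # a second access point

print(author_index["Adams"])   # every item by this author
print(list(title_index))       # titles in alphabetical order
```

Adding a further access point (by date, say) would only require another call to build_index on a different field, which is why the properties recorded for each item determine the indexes that can be offered.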

Vocabulary control

When indexing textual material, one common index is by subject type. There are two main approaches to creating indexes based on subject. One approach, which is commonly used in computer databases, is to use an uncontrolled vocabulary extracted from the contents of the document being indexed. In this form of indexing, all non-trivial words in the document are used to create the index. The ease of automating this process is an obvious attraction. The other approach is to use a controlled vocabulary. In this approach, key words or phrases are assigned to each document by an indexer, using a thesaurus, and the document is indexed on these words or phrases alone. While this may seem an unnecessarily cumbersome system at first glance, there are a number of advantages, which mainly result from the many-layered nature of human language:

Antonyms – Antonyms can be important in bringing out information. Some are actual opposites (male, female), others are reciprocals (hardness, softness) and others are reversals (potential, counterpotential).

Synonyms – Having a preferred term for synonyms can increase recall. Consider a search for ‘footpath’: this would not find ‘sidewalk’ or ‘pavement’, where a preferred term would.

Homographs – Homographs are words spelled alike, but having different meanings (violin bow, bow to the orchestra). Vocabulary control can eliminate these confusions of meaning.

The implementation of the controlled vocabulary can be either visible, where the user makes decisions within the framework of the thesaurus, or operating behind the scenes. In the latter case, to use the example of a synonym above, a search for ‘sidewalk’ may be mapped to the approved keyword ‘footpath’ by the system and thus the recall would be increased.
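The behind-the-scenes mapping of a synonym to its preferred term can be sketched in a few lines of Python. The thesaurus entries, the subject index and the function names below are hypothetical; a real thesaurus would be far larger and would also record broader and narrower terms.

```python
# Hypothetical thesaurus: non-preferred term -> preferred term.
THESAURUS = {"sidewalk": "footpath", "pavement": "footpath"}

# Hypothetical subject index, built using preferred terms only.
SUBJECT_INDEX = {
    "footpath": ["doc1", "doc4"],
    "road": ["doc2"],
}

def preferred_term(term):
    """Return the preferred form of a term; unknown terms pass through unchanged."""
    term = term.lower()
    return THESAURUS.get(term, term)

def search(term):
    """Search the subject index using the preferred form of the user's term."""
    return SUBJECT_INDEX.get(preferred_term(term), [])

print(search("sidewalk"))   # finds the documents indexed under 'footpath'
```

Without the mapping step, the search for 'sidewalk' would return nothing, since no document is indexed under that word; with it, recall is the same as for the preferred term.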

Search strategies

The steps in a search strategy are outlined in the following figure. Whether the search is carried out manually or using a computer, the steps are the same.

[Figure 1.1 - The Searching Procedure. Flowchart boxes: User expresses an information need; Request is conceptually analysed; Request is translated into system's index language; A searching strategy is composed; Search is carried out; Is user satisfied? (Yes/No); Are all searching options depleted? (No); Reformation of the request; Search is completed.]
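One reading of the searching procedure in Figure 1.1 is a loop that repeats until the user is satisfied or the searching options are depleted. The Python below is a sketch under that assumption: every callable passed to run_search is a hypothetical stand-in for one box of the figure, and the depletion test is modelled simply as a limit on the number of rounds.

```python
def run_search(request, translate, compose, execute, satisfied, reformulate,
               max_rounds=5):
    """Repeat the search until the user is satisfied or options are depleted."""
    results = []
    for _ in range(max_rounds):          # assumption: depletion = round limit
        query = translate(request)       # request -> system's index language
        strategy = compose(query)        # a searching strategy is composed
        results = execute(strategy)      # search is carried out
        if satisfied(results):           # is the user satisfied?
            break
        request = reformulate(request)   # the request is reformulated
    return results

# Hypothetical index and stand-in steps, for illustration only.
INDEX = {"cat": ["doc1", "doc2"], "dog": ["doc3"]}

results = run_search(
    request="Cats",
    translate=str.lower,
    compose=lambda query: [query],
    execute=lambda terms: [d for t in terms for d in INDEX.get(t, [])],
    satisfied=lambda found: len(found) > 0,
    reformulate=lambda req: req.rstrip("s"),   # crude: try the singular form
)
print(results)
```

In this run the first round finds nothing for 'cats', the request is reformulated to the singular, and the second round succeeds.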

Most searching systems use a system of Boolean logic to make queries, so a query could be made for ‘cats AND drugs AND pneumonia’. Others also offer proximity searching – ‘information NEAR systems’ will find documents where the word information is near the word systems (Cleveland & Cleveland, 1983). The definition of NEAR can usually be specified in terms of the number of words allowed between the two terms for a match to be considered near. ‘Information ADJ systems’ searches for the word ‘information’ adjacent to the word ‘systems’. Depending on the specific implementation of the ADJ operator, it may imply position; that is, it may find ‘information systems’ but not ‘systems information’.
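The three operators can be illustrated with a short Python sketch. The function names and the max_between parameter are assumptions made for this example; tokenisation is a plain whitespace split with no handling of punctuation, search terms are expected in lower case, and ADJ is implemented in its ordered form, which is only one of the possible behaviours described above.

```python
def tokens(text):
    """Lower-case the text and split it into words."""
    return text.lower().split()

def positions(words, term):
    """Return every position at which `term` occurs in `words`."""
    return [i for i, word in enumerate(words) if word == term]

def match_and(text, *terms):
    """Boolean AND: every term must occur somewhere in the text."""
    words = set(tokens(text))
    return all(term in words for term in terms)

def match_near(text, a, b, max_between=3):
    """NEAR: at most `max_between` words lie between `a` and `b`, in either order."""
    words = tokens(text)
    return any(abs(i - j) - 1 <= max_between
               for i in positions(words, a)
               for j in positions(words, b)
               if i != j)

def match_adj(text, a, b):
    """ADJ (ordered form): `a` is immediately followed by `b`."""
    words = tokens(text)
    return any(i + 1 < len(words) and words[i + 1] == b
               for i in positions(words, a))

doc = "Information about office systems"
print(match_and(doc, "information", "systems"))
print(match_near(doc, "information", "systems", max_between=2))
print(match_adj("systems information", "information", "systems"))
```

In the sample document two words separate the terms, so the NEAR query succeeds with max_between=2 and would fail with max_between=1; the last call shows the ordered ADJ rejecting the reversed phrase.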

The internet

Welcome to the Internet! You're about to start a journey through a unique land without frontiers, a place that is everywhere at once -- even though it exists physically only as a series of electrical impulses. You'll be joining a growing community of millions of people around the world who use this global resource on a daily basis. With this book, you will be able to use the Internet to:

• Stay in touch with friends, relatives and colleagues around the world, at a fraction of the cost of phone calls or even air mail.

• Discuss everything from archaeology to zoology with people in several different languages.

• Tap into thousands of information databases and libraries worldwide.

• Retrieve any of thousands of documents, journals, books and computer programs.

• Stay up to date with wire-service news and sports and with official weather reports.

• Play live, "real time" games with dozens of other people at once.

(EFF, 1994)

The internet is vast. Current estimates are that it is expanding at the rate of some 20% per month (EFF, 1994). Most countries have some form of connection to the internet. In Australia the main connection is through AARNET, the Australian Academic and Research Network. This extraordinary growth is probably related to the speed with which the personal computer has become accepted into our lives.

It took the telephone 75 years to reach the level of penetration in offices that the small computer has attained in 10. ... Over half of all American homes are already linked to cable or direct-broadcast satellite systems. (Dizard, 1989, p6)

From the description of the internet above, and the discussion of the need for greater sophistication of information management with increasing size, it can be expected that the internet has major information management needs. The connectivity, immediacy and relatively low cost of placing items on the internet, compared to traditional publishing, have popularised electronic publishing. If further definitions are necessary, electronic publishing may be defined as:

• the use of computers and telecommunications systems to distribute data electronically;

• the use of various storage media to allow the distribution of data on demand.

(Gurnsey, 1990, p 383)

Concurrently with this has been a debate within the library community on the question of ownership versus access: if an item is readily available from the internet, or another electronic source, is there then any need for the local library to have an actual physical copy? The efforts of groups such as Project Gutenberg to place electronic copies of extant material on the net mean that this issue is not restricted to recently published material.

The concept of “ownership versus access” to information has surfaced as an issue for managers of library collections. The current perception of “collection” and “collecting” must also change to fit the new information environment. It is highly unlikely that library collections as we know them today will prevail in the next decade. (Nissley, 1992, p226)

Certainly, there is a visionary outlook on the part of some sections of the internet community, as can be seen from the EFF quote above and the quote from Dizard below.

We are heirs to a great tradition of men and women who saw the organisation of available information as a basic condition of human progress. It is the tradition of Ptolemy I Soter (350 - 283 BC), who founded the great library of Alexandria, the first attempt to gather all of the world’s books in one place and catalogue them scientifically. Its collection of perhaps 700,000 volumes was not matched again until the past century. (Dizard, 1989, p19)

Librarians, too, have made estimations of how the development of electronic publishing will affect their activities in the future.

Here is a possible scenario for the evolution of scholarly journals.

6.1 1991 A.D.

• Paper journals totally dominate the scholarly scene.

• There are some parallel electronic products, mostly the "static" CDROM format.

• Some full text (without graphics) is available online via services such as Dialog and BRS.

• Some mainstream publishers are experimenting with electronic publications.

• There are a variety of options for delivering individual articles via mail and fax.

• The biggest single article suppliers are libraries, via the long-popular and fairly effective interlibrary loan mechanisms.

• Over a thousand scholarly electronic discussion groups exist.

• Under ten scholarly electronic journals exist that are refereed, lightly-refereed, or edited.

• Two institutional preprint services are in development.

• OCLC, a library utility, positions itself through development work for the AAAS as a serious electronic publisher of scientific articles.

6.2 1995 A.D.

• Significant inroads into the paper subscription market, because (1) libraries make heavy journal cancellations due to budget constraints, and they feel "mad as hell" about high subscription prices; and (2) it becomes possible to deliver specific articles directly to the end-user.

• Librarians and publishers squabble over prices--ELECTRONIC prices.

• For the first time, the Association of American Publishers (AAP) sues a research library or university over either electronic copying or paper resource-sharing activities.

• There are over 100 refereed electronic journals produced by academics.

• In collaboration with professional or scholarly societies, university-based preprint services get underway in several disciplines.

• The Net still subsidized.

• Rate of paper journal growth slows.

• Many alternative sources exist for the same article, including publishers and intermediaries.

• Bibliographic confusion and chaos reigns for bibliographic utilities, libraries, and, by extension, scholars.

6.3 2000 A.D.

• Computer equipment and user-sophistication are pervasive, although not ubiquitous.

• Parallel electronic and paper availability for serious academic journals; market between paper journals and alternatives (e.g., electronic delivery) is split close to 50/50.

• Subscription model wanes; license and single-article models wax.

• Secondary services re-think roles; other indexing (machine browsing, artificial intelligence, and full-text or abstract searching) strengthens.

• Net transferred to commercial owners, but access costs are low.

• New niches are created: archive, scanning, re-packaging, and information-to-profile services.

• Publishers without electronic delivery shrink or leave the marketplace.

• Many collaborations, some confusing and unworkable, as publishers struggle with development, conversion, and delivery.

• Major Copyright Law revision continues.

• Stratification of richer and poorer users, universities, and nations.

(Okerson, 1991, pp 19 - 21)

Probably the only pertinent comment on this prediction is that the process is happening faster than predicted.

Applications for accessing data

Any examination of the documentation surrounding the online industry shows a series of peaks and fads, periods when some issues took on crucial significance followed by times when they were virtually ignored. (Gurnsey, 1990, p391)

telnet and rlogin

Telnet: Access to databases, computerized library card catalogs, weather reports and other information services, as well as live, online games that let you compete with players from around the world. (EFF, 1994)

The telnet and rlogin protocols allow the interactive use of remote computers over the internet. Once logged in to the remote computer, applications can be run in accordance with the privileges allowed to the account. The development of client/server applications using protocols such as Z39.50 has reduced the need to establish account facilities on multiple sites, while maintaining functionality.

FTP

FTP: File-transfer protocol -- access to hundreds of file libraries (everything from computer software to historical documents to song lyrics). You'll be able to transfer these files from the Net to your own computer. (EFF, 1994)

Many computers on the internet offer access to information using the FTP protocol. Most of these offer anonymous logon, so that it is not necessary to establish an account on any particular machine. The catch is knowing where the particular files in which you are interested are located. There are accessible databases of available files, called Archie, and it is possible to run limited queries on these databases.

WAIS

WAIS allows a user to do keyword searches for documents, scanning a single server or multiple servers. WAIS responds to a search with a list of documents sorted by a weighted score--the higher the score, the better the match to the search. WAIS is strong in its ability to allow users to sift through a large body of documents stored in servers across the Internet. It is more appropriate for finding a small set of documents among a large set of candidates than for serving as a menuing or browsing tool. (Wiggins, 1993, p 48)
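The idea of returning documents sorted by a weighted score can be sketched as follows. The weighting used by WAIS itself is not described here, so this Python example substitutes a plain count of query-term occurrences as the score; the document names, their contents and the function names are all invented for illustration.

```python
def score(text, query_terms):
    """Stand-in weighting: how often the query terms occur in the text."""
    words = text.lower().split()
    return sum(words.count(term) for term in query_terms)

def ranked_search(documents, query):
    """Return (name, score) pairs for matching documents, best match first."""
    terms = query.lower().split()
    scored = [(name, score(text, terms)) for name, text in documents.items()]
    hits = [(name, s) for name, s in scored if s > 0]   # drop non-matches
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Hypothetical documents held by a server.
DOCUMENTS = {
    "a.txt": "petri nets model concurrent systems",
    "b.txt": "petri nets and coloured petri nets",
    "c.txt": "library catalogues",
}

ranking = ranked_search(DOCUMENTS, "petri nets")
print(ranking)
```

Documents that contain none of the query terms are left out altogether, which reflects the use of such a search for picking a small set of documents out of a large set of candidates.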

WAIS (Wide Area Information Server) databases operate in client/server mode and communicate using the ANSI Z39.50 version 1 protocol.

The choice of the Z39.50 protocol is significant for a number of reasons. It is an open protocol and its adoption by WAIS is likely to encourage others to adopt it as well, promoting the standardization that is needed for the easy exchange of all formats of electronic information. In the future, Z39.50 has the potential to deal with audio, video, and image data in addition to text. Libraries that already employ the Z39.50 protocol have the potential to turn their OPACs into WAIS servers. (Bailey & Rook, 1991, p 18)

This is discussed further in the section on Z39.50.

Gopher

In a nutshell, Gopher offers access to files and interactive systems using a hierarchical menu system. The organization of the menu is defined by the administrator of each server. The resources that each menu item describes, such as ASCII files, Telnet sessions to other computers, and submenus, are called "documents." (Wiggins, 1993, p 7)

Gopher is an application which presents a series of menus, or folders, through which the user can navigate, and perform simple searches on menu titles. The menu items can refer to items held locally or to remote items. It is not necessary for the actual location of the item selected to be known by the user, as this is handled by the application. Gopher is restricted to a hierarchical menuing system, and the available searching utilities (Veronica, Jughead) are restricted to searching on the titles, or file names of items accessible by gopher. Some gopher implementations allow conducting WAIS searches using a gateway.
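A hierarchical menu whose items may live on other machines, together with a search restricted to titles, can be modelled with a small tree. This Python sketch is not the Gopher protocol or the Veronica or Jughead implementation; the MenuItem class, its fields, the host names and the search_titles function are assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class MenuItem:
    title: str
    host: str = "localhost"                        # hypothetical item location
    children: list = field(default_factory=list)   # non-empty for a submenu

def search_titles(item, word, path=()):
    """Yield the menu path of every item whose title contains `word`."""
    here = path + (item.title,)
    if word.lower() in item.title.lower():
        yield " / ".join(here)
    for child in item.children:
        yield from search_titles(child, word, here)

# Hypothetical menu: one submenu holds an item on a remote machine.
ROOT = MenuItem("Root", children=[
    MenuItem("Information services", children=[
        MenuItem("Weather reports", host="remote.example"),
    ]),
    MenuItem("Weather maps"),
])

matches = list(search_titles(ROOT, "weather"))
print(matches)
```

The search looks only at titles, never at the contents of an item, and the user is shown a menu path rather than a host name, which matches the two properties described above.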

World wide web

"O what a tangled WWWeb we weave, When we hyper-link-retrieve" (Fitch, 1994)

The world wide web (WWW or W3) takes the gopher approach one step further. It is a hypermedia based system, and rather than navigating through menus, the links are embedded in the documents viewed by the user. Again there is no requirement for the user to know anything about the machine or item location in order to follow a hypermedia link. WWW is able to navigate gopher menus, and access FTP sites. WAIS searches are also possible using a number of linking systems. There is almost no control over who creates and links gopher and WWW menu pages, or how they do so. Navigational confusion through following circular links, or links to documents which no longer exist, or have been moved, is common even amongst experienced users.
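Both kinds of navigational problem, links to documents which no longer exist and circular links, can be detected mechanically once the links are known. The following Python is a sketch over a hand-written link table; the page names and both function names are hypothetical, and gathering the table from real servers is outside its scope.

```python
def find_dead_links(links):
    """Return (page, target) pairs whose target is not a known page."""
    return [(page, target)
            for page, targets in links.items()
            for target in targets
            if target not in links]

def find_cycle(links, start):
    """Return a circular path reachable from `start`, or None if there is none."""
    path, on_path, done = [], set(), set()

    def visit(page):
        if page in on_path:                      # back on the current path
            return path[path.index(page):] + [page]
        if page in done or page not in links:    # already explored, or dead
            return None
        path.append(page)
        on_path.add(page)
        for target in links[page]:
            cycle = visit(target)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(page)
        done.add(page)
        return None

    return visit(start)

# Hypothetical link table: page -> pages it links to.
LINKS = {"home": ["a", "b"], "a": ["b"], "b": ["home", "gone"]}

dead = find_dead_links(LINKS)
cycle = find_cycle(LINKS, "home")
print(dead)
print(cycle)
```

A circular path is not an error in itself in a hypermedia system; the sketch only makes visible the structure that a user following links one at a time cannot see.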

Network management and information management

In these first decades of computer technology, the machines have been regarded primarily as useful, somewhat exotic storage bins and calculating devices. Their misuse and underuse is characteristic of our technological and social illiteracy. (Dizard, 1989, p14)

The internet was put together by computer scientists who wanted to share information on their field, and who wanted an experimental platform for their various projects. The orientation of the implementors was therefore one which concentrated on the network and sought to manage that side of it, rather than examining the information available on the network and the means of managing it: a technology viewpoint, rather than an information viewpoint.

The technology viewpoints concentrate on technical artifacts (realised components) from which the distributed processing system is built. It must model the hardware and software that comprise the local operating systems, the input/output devices, storage, points of access to communication services, etc. ... The information viewpoints concentrate on information, and information exchange aspects of distributed processing systems. Information structures and information flows are modelled. The rules and constraints that govern manipulation and management of information are identified. These are modelled in terms of information structures, the relationships and constraints between them, and information flows. Both manual and automated aspects of information processing are taken into account. The resulting models results in a “specification” for what must be achieved by distributed processing system. (Griethuysen, 1989, p213)

This is not necessarily a criticism of the people involved. As was said earlier, a small collection of information requires only simple indexing and management. The expansion of the internet has made this a very different matter.

The number of files available has increased out of all proportion and their cover now impinges on every facet of human activity. If this needs emphasis, the Cuadra Directory of online databases – a work which is increasingly becoming the definitive global source – now lists some 4,200 databases, together with over 1,700 producers and 570 (host) service suppliers. ... Listing online services has become a cottage industry in its own right and a large number of directories are now available. (Gurnsey, 1990, p388)

The expansion of the internet means that it is now a meta-network of many component networks. The users range from academic, through business, to hobbyists. A single central authority would be unlikely to be able to impose an information management structure on top of the internet. There are, however, other approaches.

Coordination and cooperation are necessary between separate organisations as well as within a single organisation. Electronic trading, for example, leads to distributed systems spanning traders, customers and banks. This level of integration is much harder, since no single authority can control the entire activity of the system. Instead ways must be found to maximise the ability of independent authorities to interwork, without jeopardising their concerns and interests. (Griethuysen, 1989, p215)

Protocols in this area, that is, agreed ways of searching and accessing the available information, are one approach which meets these needs. The incorporation of business users, in particular, has now resulted in a focus on effectiveness and efficiency on the internet. Access to services is required to be justified in terms of business cases, and the costs of the internet are of concern.

Although the information sector is now larger than the agricultural and industrial sectors combined, the costs of its labour intensive activities may outweigh the productivity gains made possible by more sophisticated technology. (Dizard, 1989, p222)

The expansion of the internet also brings with it the problem of naive users, and of supporting their needs. Better, more intuitive and consistent interfaces are required not only as an end in themselves, but to allow these naive users to use the net effectively, without a large training cost either to them or to the organisations through which they gain access.

Formerly it was assumed that most library users would know how to open and use a book or periodical, even if spines were frequently broken. But can we now assume that the average library user is familiar, and comfortable with, myriad technological devices. ... Experience with online catalogs has suggested that users have little motivation to learn more about the system than is absolutely necessary. And why should they? (Nissley, 1992, pp 229-230)

Librarians, the people with most experience in cataloging the wealth of human knowledge, also have come to the realisation that they are unable to cope with the scale and content of the internet without technological aid. The Internet is a non static space that is host to a variety of information objects. Cataloging rules were not drafted with these objects in mind, and it is difficult to apply them. There has been some work done by computer scientists to name, locate, and describe these objects in machine-driven ways. Librarians can advance their profession by helping to build bridges between the technical, economic, and service issues surrounding access to networked objects. We should actively work to dispel the frustrating idea that human catalogers can ever seize the time, find the funding, or create the tools to handle the Internet all by themselves. (Brugger, 1992)

One of the tools which is expected to provide a powerful support to accession of information is the Z39.50 standard.

Another national effort involves the development of the Z39.50 network protocol standard. Z39.50 exists within the overall OSI protocol layers and will provide the standard network capability for search and retrieval of information between remote computers. Just as the "TELNET" and "FTP" commands of TCP/IP have enabled network access, the Z39.50 standard will hopefully enable sophisticated use of network-based information. (Bailey & Rooks, 1991, p 8)


2. Aims and Objectives

The aims and objectives of the thesis project were to:

• Develop a methodology for prototyping a protocol such as Z39.50 using LOOPN
• Develop an operational version of Z39.50 based on LOOPN, including
  • Creation of Petri nets which model the Z39.50 protocol.
  • Translation of the nets into LOOPN code.
  • Testing the Z39.50 interface using the National Library of Canada software.

What is original in the work

Petri nets have been used for modelling low level protocols, such as FDDI, and have been successful in that area (Razouk, 1982; Lakos and Keen, 1993b; Lakos and Keen, 1991b; Billington, 1982; Berthelot & Terrat, 1982). In this project Petri nets were used to model a high level protocol, ANSI Z39.50. Petri nets have not previously been used to model the ANSI Z39.50 protocol, and the use of LOOPN to model a protocol such as this has not been attempted. Accordingly, the methodologies involved were also examined.

Overview of rest of thesis

The following chapters describe the theoretical background of the elements of the project, the experimental approaches and designs, the results and analysis, and the conclusions drawn from the research.


3. Theoretical Background

The OSI Model

OSI is an abbreviation for Open Systems Interconnection. In the early 1980s, people in a number of standardization committees all over the world felt that the time had come to develop a set of non-proprietary standard protocols. The hope was that one day these protocols would replace most of the vendor-dependent specifications, and that this would clear the way for easy and flexible world-wide computer communication. It took nearly a decade before the first results were produced: a reference model and a set of layer standards from physical cable definitions up to distributed databases and information systems, together with management and security tools.

The OSI reference model

The OSI Reference Model defined in ISO 7498 divides the communication process between two application programs into 7 intermediate layers. Each layer provides a certain kind of service to the next higher layer. This service is provided by communicating with the peer entity in the same layer of the remote host, using the service provided by the next lower layer. Some of the layer entities may be implemented in physical devices, some may be part of the operating system and some may be included in application programs. Most layers provide their service by forwarding protocol data units to the next layer together with an added or removed header, or by performing other functions and state changes.
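The header handling described above can be illustrated with a minimal Python sketch. It is not taken from ISO 7498; the bracketed text headers and the function names `send` and `receive` are assumptions made purely for illustration, and real layers carry binary protocol control information rather than labels.

```python
# Layer names from layer 7 down to layer 1 (illustrative labels only).
LAYERS = [
    "application", "presentation", "session", "transport",
    "network", "data-link", "physical",
]


def send(payload: str) -> str:
    """Pass data down the stack: each layer prepends its own header."""
    pdu = payload
    for layer in LAYERS:
        pdu = f"[{layer}]" + pdu
    return pdu


def receive(pdu: str) -> str:
    """Pass data up the stack: each layer removes its peer's header."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not pdu.startswith(header):
            raise ValueError(f"missing {layer} header")
        pdu = pdu[len(header):]
    return pdu


wire = send("query")
print(wire)           # outermost header belongs to the lowest layer
print(receive(wire))  # the peer application sees the original payload
```

Each layer only inspects the header written by its peer in the remote stack, which is the sense in which a layer communicates with the peer entity while relying on the service of the layer below.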

The usual diagram that is used to describe the OSI Reference Model looks like this (from Kuhn, 1994):

End System                              End System
+-------+                               +-------+
|   7   |                               |   7   |   Application Layer
+-------+                               +-------+
|   6   |                               |   6   |   Presentation Layer
+-------+                               +-------+
|   5   |         Intermediate          |   5   |   Session Layer
+-------+            System             +-------+
|   4   |                               |   4   |   Transport Layer
+-------+         +-----------+         +-------+
|   3   |         |     3     |         |   3   |   Network Layer
+-------+         +-----+-----+         +-------+
|   2   |         |  2  |  2  |         |   2   |   Data Link Layer
+-------+         +-----+-----+         +-------+
|   1   |         |  1  |  1  |         |   1   |   Physical Layer
+-------+         +-----+-----+         +-------+
    |                |     |                |
    +------------+---+     +---+----...-----+

Figure 3.1 - The OSI Seven Layer Model

Layer seven has been further subdivided into several modules (service elements), as some of them may be useful for more than one application. This Application Layer Structure is defined in ISO 9545. The Association Control Service Element (ACSE) (ISO 8649 service definition and ISO 8650 protocol definition) manages the establishment of a connection between two remote applications.

ANSI Z39.50

In its classical meaning, a “protocol” is a formal code of etiquette for dealings between communicating parties. A protocol standard specifies required etiquette to be followed by all parties wishing to communicate. Computer network protocols form a system of hierarchical support. Each protocol provides service operations to the next higher layer in the hierarchy, by using the services of the next lower layer. ... A service specification defines the services provided by the layer, describing only that behaviour visible to the user at the next layer. There should be no description of how the service is actually realised; this is a system-wide view of the layer as a single unit. The protocol specification refines the service specification to define requirements of how each (possibly physically distributed) entity supports the service through interaction with the services of the next lower layer. (Schwartz & Melliar-Smith, 1982, p3)

Z39.50 is an application-layer protocol within the OSI reference model, the model developed by the International Organization for Standardization (ISO). Its purpose is to allow one computer operating in a client mode to perform information retrieval queries against another computer acting as an information server.

The standard provides a uniform procedure for client computers to query information resources, such as server computers supporting online library catalogues. A single client program running on one machine can therefore provide end users with a common means of access to a variety of information resources attached to a computer network. While many of the initial applications of Z39.50 have been for use with bibliographic data (online public access library catalogues, for example), the protocol is actually quite general, and search attribute sets can be defined which allow the protocol to work with most other types of data.

The Scientific and Technical Attribute and Element Set (STAS) supports identification and selection of data elements retrievable from scientific, technical, and related databases, using a Z39.50 Present Request. The STAS Attributes and Elements are also useful within other Z39.50 services, such as SCAN and SORT. (STAS, 1994)

Background of Z39.50

The original 1988 version of Z39.50 (Z39.50-1988) was developed between 1983 and 1987 and approved by ANSI in 1988. Z39.50 is an American national standard. However, there is also an ISO standard called Search and Retrieval (SR), ISO 10162/10163 (service and protocol documents, respectively), which was formally adopted as an International Standard (IS) in 1991. The second version of ANSI Z39.50 (Z39.50-1992) was adopted in 1992. In this document the term ‘Z39.50’ refers to Z39.50-1992. Other versions will be explicitly referred to.

The change from Z39.50-1988 to Z39.50-1992

There are two categories of changes from Z39.50-1988 to Z39.50-1992:

• changes necessary for alignment with SR, and
• features deemed necessary by implementors, to provide sufficient functionality so that implementation can be economically justified.

Changes to align with SR

There were various drafts of SR between 1984 and 1991 (when it was finally approved); the U.S. input to SR was influenced by Z39.50-1988, but Z39.50-1988 wasn't entirely stable during that period. The result was that a few incompatibilities remained between the data elements used. (For example, the search request in SR has small-set and medium-set element set names; Z39.50-1988 had only a single, or global, element set parameter. As another example, Z39.50-1988 did not include Preferred-record-syntax.) Z39.50-1988 did not use ASN.1 (ASN.1 is a standard notation for specifying protocol syntax). Virtually all OSI application protocol data structures are described using ASN.1. Steps are also being taken to combine ASN.1 definitions with formal description techniques (Bochmann & Deslauriers, 1989). Z39.50-1988 was finalised slightly before ASN.1 was stable, and Z39.50-1988 developed a homegrown syntax notation which was abandoned in Z39.50-1992. A serious limitation of Z39.50-1988 is the lack of flexibility to reference objects – attribute-sets, diagnostic-sets, and other objects – which SR references through OSI object identifiers. All of these limitations are corrected in Z39.50-1992.

Implementor initiated features

In the second category are enhancements to security, access control, and attribute sets and related objects; and a new ‘proximity’ query.

Implementations of Z39.50

WAIS

The WAIS software was originally developed by Thinking Machines Inc. Most WAIS systems used over the internet use a form of Z39.50-1988. Thinking Machines Inc have abandoned development of WAIS; however, the Coalition for Networked Information (CNI) have taken over development with their product freewais, which now supports Z39.50-1992.

OPAC

The librarianship community was an instigator of the development of the Z39.50 protocol. This early involvement is reflected in the fact that all librarianship related protocols are in the Z39 group. Until this fact is appreciated, it is hard to see why Z39.50 is grouped together with standards for metal shelving, and the printing of book spines. It was hoped that Z39.50 would become a universal standard for the connection of Online Public Access Catalogues (OPACs). This has not been the case: vendors have clung to their proprietary standards until recently, but are now beginning to adopt Z39.50, at least as an alternative to their proprietary standards.


Organisation                                   Server address
United States Library of Congress              ibm2.loc.gov:2210
AT&T                                           z3950.research.att.com
North Carolina State University                ncsulib.lib.ncsu.edu
Duke University                                ducatalog.lib.duke.edu
University of North Carolina-Greensboro        library.uncg.edu
University of North Carolina-Chapel Hill       unclib.lib.unc.edu
Middlebury College                             myriad.middlebury.edu
Butler University                              ruth.butler.edu
Cleveland Public Library                       clevxg.cpl.org
Grambling State University                     gopac.gram.edu
University Center at Tulsa                     lib.uct.edu
The Research Libraries Group                   rlg.stanford.edu
Pica (Netherlands)                             chico.pica.nl:2100

Table 3.1 - Some currently operating Z39.50-1992 servers

A number of organisations have Z39.50-1992 servers currently operating on the internet (see Table 3.1). The National Library of Canada and the Florida Center for Library Automation have released client and server software into the public domain.

The Online Journal of Current Clinical Trials

The Online Journal of Current Clinical Trials has never appeared in paper form. From its inception it was an electronic journal. Its publishers took a considered decision to use Z39.50 from the first issue of the Journal.

Guidon uses a modified version of Z39.50 (1988) to communicate with the database at OCLC. There are several advantages to the use of this protocol:

• Other user interfaces are planned for other environments and with other capabilities, but using the same protocol.
• Authorization and billing is handled centrally. This both simplifies the interface program and eliminates most security issues about program code that is not under the vendor's control.
• Capabilities will grow as we support the newer Z39.50 standards.
• Z39.50 provides a clear interface standard to work with, rather than having to develop our own.
• It makes it possible for others to use their own interfaces to get to OCLC databases. In the future, we hope to release a programmer's toolkit that will make the development of such interfaces easier.

(Hickey & Noreault, 1992, p 5)


Chemical Abstracts Service - SciFinder

SAN FRANCISCO--October 24, 1994--CAS (Chemical Abstracts Service) today unveiled a new generation research tool to assist scientists and researchers worldwide with access to the organization's databases. SciFinder, which works on Macintosh or Windows desktop computer systems, places information ranging from chemical structures to chemical-related literature at the fingertips of scientists who have no or little online search expertise. This revolutionary client-server application, linked to CAS databases, enhances the creative discovery process by providing a simplified means to answer the majority of the questions routinely asked by research scientists. (Wibberley, 1994)

SciFinder is a commercially available product which uses the Z39.50 protocol as its client/server engine.

Mechanism of operation

The client (origin, in the Z39.50 specification) initiates an intersystem query by converting a local user's query from the local search syntax into the intersystem syntax and directing the search to a server (target, in the Z39.50 specification) specified by the user. An intersystem query contains a search term with associated search attributes, and it may also contain Boolean operator(s) and a result set identifier. A query with one search term and a single list of attributes is considered to be a simple query; one with more than one set of attributes, more than one search term, or one or more Boolean operators is considered to be a complex query. The following diagram is an outline of events in a simple, successful Z39.50 query process.

ORIGIN                                  TARGET

InitRequest         ------------>
                    <------------       InitResponse
SearchRequest       ------------>
                    <------------       SearchResponse
PresentRequest      ------------>
                    <------------       PresentResponse
TerminateRequest    ------------>
                    <------------       TerminateResponse

Figure 3.2 - Simplified Z39.50 interaction
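The distinction between simple and complex queries, and the request/response pairing of Figure 3.2, can be sketched in Python as follows. This is an illustration under stated assumptions, not the Z39.50 query encoding: the class names `SearchTerm` and `IntersystemQuery`, their fields, and the example attribute values are hypothetical, and only the message names come from the figure.

```python
from dataclasses import dataclass, field


@dataclass
class SearchTerm:
    text: str
    attribute_lists: list  # each entry is one list of search attributes


@dataclass
class IntersystemQuery:
    terms: list                                    # SearchTerm objects
    operators: list = field(default_factory=list)  # e.g. ["AND"]
    result_set_id: str = "default"                 # hypothetical default name


def is_simple(query: IntersystemQuery) -> bool:
    """One term, a single attribute list and no Boolean operators."""
    return (
        len(query.terms) == 1
        and len(query.terms[0].attribute_lists) == 1
        and not query.operators
    )


# Request/response pairs in the order shown in Figure 3.2.
EXCHANGE = [
    ("InitRequest", "InitResponse"),
    ("SearchRequest", "SearchResponse"),
    ("PresentRequest", "PresentResponse"),
    ("TerminateRequest", "TerminateResponse"),
]

simple = IntersystemQuery([SearchTerm("petri nets", [["use=title"]])])
complex_query = IntersystemQuery(
    [SearchTerm("petri", [["use=title"]]), SearchTerm("lakos", [["use=author"]])],
    operators=["AND"],
)
print(is_simple(simple), is_simple(complex_query))
```

In this sketch every message in the first column of `EXCHANGE` is sent by the origin and every message in the second column by the target, matching the simplified interaction in the figure.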


Future development of ANSI Z39.50

The National Information Standards Organisation (NISO, an accredited ANSI standards committee) is the standards body responsible for Z39.50, which was originally developed by NISO Committee D (Computer-to-Computer Protocols). That committee was disbanded after Z39.50-1988 was approved, but the Z39.50 Maintenance Agency, administered by the Library of Congress, continued to work on preparing Z39.50-1992. The Maintenance Agency assumes editorship of the standard and technical coordination of its development, and relies on the Z39.50 Implementors Group (ZIG, formed in 1990) for advice on changes. The latest draft of version 3 of Z39.50 has reached the stage of being balloted upon by members of ZIG. It is probable that this will be issued this year, and therefore become known as Z39.50-1994.

Petri Nets

Communication protocols lie at the foundation of all distributed systems. Thus it is imperative that the protocol underlying a system function correctly. But how can we “prove” that a protocol does, in fact, function correctly? Since protocols can exhibit extremely complicated behaviour, an ad hoc verbal verification of the protocol is often incorrect and/or incomplete. What is needed is a formal method for the specification and verification of protocols. (Kurose, 1982, p43)

There have been many methods proposed for describing and modelling protocols (Venkatraman & Piatkowski, 1986), e.g. temporal logic (Sabnani & Schwartz, 1982; Kurose, 1982), general or special purpose languages (Ansart et al, 1982; Watson, 1982; Haas, 1986; Karjoth, 1986; Bochmann et al, 1986; Brinksma, 1986; Linn, 1986), algebraic methods (Holzmann, 1982), general graphical methods (Razouk, 1982) and Petri nets (Billington, 1982; Berthelot & Terrat, 1982). Petri nets are considered to have great modelling power (Kujansu et al, 1982, p311). The foundation of Petri nets was presented by Carl Adam Petri in his doctoral thesis in 1962. The first nets were called Condition/Event Nets (CE-nets). Following this there was a large amount of work on various developments of Petri nets, in particular Place/Transition Nets (PT-nets), which allowed for more than one token in a place, and Predicate/Transition nets (PrT-nets). Coloured Petri Nets (CP-nets) were proposed by Jensen (1981) as a more elegant way of dealing with some technical problems of PrT-nets.

A Petri Net is a directed graph with two types of nodes ... . The place nodes, or places, are usually represented by circles and the transition nodes, or transitions, by short lines or bars. Directed arcs may connect the different types of nodes, but not nodes of the same type. An arc from a place to a transition is called an input arc and an arc from a transition to a place is called an output arc. A place connected to a transition by an input arc is called an input place, and a place connected to a transition by an output arc is called an output place. The places may be occupied by any natural number of markers called tokens, shown as dots. Places and transitions are usually named by an algebraic symbol. When all the input places to a transition contain at least one token, the transition is enabled and may fire. When the transition fires, one token is removed from each of its input places and one token is added to each of its output places. The collection of tokens in a net is called the marking of the net. (Billington, 1982, p97)
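The enabling and firing rule quoted above is small enough to express directly. The following Python sketch is an illustration only; the class name `PTNet`, its methods, and the place and transition names in the example are hypothetical and are not drawn from Billington's paper or from LOOPN.

```python
from collections import Counter


class PTNet:
    """A place/transition net with unit-weight arcs."""

    def __init__(self, marking):
        self.marking = Counter(marking)  # place name -> number of tokens
        self.transitions = {}            # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (list(inputs), list(outputs))

    def enabled(self, name):
        """Enabled when every input place holds at least one token."""
        inputs, _ = self.transitions[name]
        return all(self.marking[place] >= 1 for place in inputs)

    def fire(self, name):
        """Remove one token per input place, add one per output place."""
        if not self.enabled(name):
            raise RuntimeError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for place in inputs:
            self.marking[place] -= 1
        for place in outputs:
            self.marking[place] += 1


# Hypothetical two-step net: a message is sent, then acknowledged.
net = PTNet({"ready": 1})
net.add_transition("send", ["ready"], ["awaiting_ack"])
net.add_transition("ack", ["awaiting_ack"], ["ready"])
net.fire("send")
print(dict(net.marking))
```

After `send` fires, the only token has moved from `ready` to `awaiting_ack`, so `send` is no longer enabled and `ack` is. The marking held in `net.marking` is the whole state of the net.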

Figure 3.3 - The Firing of a Transition (panels: Before firing, After firing)

Since this paper was published, the convention of denoting a transition by a short line or bar has been extended to include representing it by a rectangle. The following description of coloured Petri nets is largely drawn from Jensen (1991).

Coloured Petri nets differ from other Petri nets in that each place may contain several tokens and each of these contains a data value - which may be of arbitrarily complex type (eg a record where the first field is a real, the second a text string, while the third is a list of integer pairs). The data value which is attached to a given token is referred to as the token colour. (Jensen, 1991, p47)


Figure 3.4 - A Coloured Net (Jensen, 1991, p47)

Figure 3.4 illustrates the three different parts of a CP-net: the net structure, the declarations and the net inscriptions. The net structure consists of graphical representations of arcs, places and transitions. The declarations in the upper left corner define the colour sets (U, I, P, and E) and two variables (x and i). The net inscriptions are the text associated with the net structure. Places may have three different types of inscriptions: names, colour sets and initialisation expressions. Transitions may have names and guards, and arcs have arc expressions. Names have no formal meaning and are used descriptively as identifiers of places or transitions. Each place has a colour set which determines the kind of tokens which may reside on that place. Initialisation expressions specify the initial marking of a place and must evaluate to a multi-set over the corresponding colour set. This implies that two tokens in the same place may have the same colour. It is conventional to omit empty sets. Guards are boolean expressions which must be fulfilled before a transition can occur. Again, by convention, guards which always evaluate to true are omitted. The arc expression is an expression that may contain variables, constants, functions and operations that are defined in the declarations (explicitly or implicitly). When the variables of the arc expression are replaced by colours from the corresponding colour sets, it must evaluate to a colour (or set of colours) that belongs to the colour set of the place of the arc.

Hierarchical CP-nets were first described in Huber et al (1991) and introduced the concept of relating individual CP-nets, called pages, through five hierarchy constructs: substitution transitions, substitution places, invocation of transitions, fusion transitions and fusion places.
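A rough Python sketch of how a marking as a multi-set, a guard and an arc expression interact is given below. It does not reproduce the net of Figure 3.4; the function `occur`, the place contents and the pair-shaped colours are assumptions chosen for illustration, with the variables x and i borrowed only as names.

```python
from collections import Counter


def occur(input_place, output_place, guard, arc_expression):
    """Let a transition occur once, for the first binding that passes the guard.

    input_place and output_place are multi-sets (Counter) of token colours.
    Returns the colour that was consumed, or None if no binding is enabled.
    """
    for colour in list(input_place):
        if input_place[colour] > 0 and guard(colour):
            input_place[colour] -= 1
            output_place[arc_expression(colour)] += 1
            return colour
    return None


# A multi-set marking: two tokens share the colour ("p", 0).
waiting = Counter([("p", 0), ("p", 0), ("q", 0)])
done = Counter()

# Guard: x must equal "q". Arc expression: (x, i) becomes (x, i + 1).
consumed = occur(
    waiting,
    done,
    guard=lambda colour: colour[0] == "q",
    arc_expression=lambda colour: (colour[0], colour[1] + 1),
)
print(consumed, dict(done))
```

The guard restricts which bindings of the variables are acceptable, while the arc expression computes the colour of the token that is deposited. A `Counter` is used because a marking is a multi-set, so the two identical `("p", 0)` tokens are both retained.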

LOOPN

A number of languages have been proposed for use in association with Petri nets: P-NUT (Petri Net UTilities) (Morgan & Razouk, 1986), PROTEAN (PROTocol Emulation and ANalysis) (Billington & Wilbur-Ham, 1986) and PROTOB (Baldassari & Bruno, 1988). LOOPN is a compiler, simulator and associated source control system for object-oriented Petri nets. It is described in LOOPN - Language for object-oriented Petri nets (Lakos, 1991) and the LOOPN User Manual (Lakos, 1992). LOOPN is the result of development from a Petri net simulator, PETSI, originally devised in 1987 by Simon Milton as an Honours project (Lakos, 1988). Major development of LOOPN has been undertaken by Dr Charles Lakos and Dr Chris Keen. PETSI was based on the use of Pascal and the Llama parser generator, and produced Pascal code. PETSI had a number of limitations, and required a verbose specification of the Petri net. Despite these limitations, working simulations of a number of classic examples of Petri nets were successfully undertaken in PETSI, eg Dining Philosophers and stop and wait protocols. LOOPN now translates the Petri nets into C source code. This source code is compiled, and the intermediate C files are deleted. It is possible to configure LOOPN to retain these C source code files. LOOPN is actively under development at the University of Tasmania.

Language features

In LOOPN, programs are made up of one or more modules. Each module is specified within a separate file, the name of which is related to the name of the module. Constants and types are declared using a syntax similar to that of Pascal.

Fundamental to LOOPN is the declaration of token types. Token types are extended record declarations which may include fields or functions. Additionally, a token type is always defined as a subtype of another, thereby inheriting the fields and functions of the parent. The parent’s fields and functions may be augmented or overridden. The basic token type, which is inherited by all token types, is null. A null token has no fields, but has the predefined functions: first, last, delay, empty, tail. Place declarations define the Petri net places, and the colour (token type) which is acceptable at that place. The transition declaration defines the Petri net transition, including the input and output places, and the actions which take place at the transition. Complex Petri nets can be built up by instances of other modules within a module, or by inheriting the features of a module as part of the definition of a new module.

In LOOPN, a Petri net is an extension of coloured timed Petri nets:

• token types are classes
  • they consist of both data fields and functions
  • they can be declared by inheriting from other token types
• module or subnet types are classes.
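The idea of token types as classes can be illustrated by analogy in Python. The sketch below is not LOOPN syntax and does not model the predefined functions of the null token; the class names `NullToken`, `Apdu` and `SearchApdu`, their fields, and the `describe` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NullToken:
    """Analogue of the basic token type: it carries no data fields."""


@dataclass(frozen=True)
class Apdu(NullToken):
    """A token type that augments its parent with a field and a function."""
    kind: str

    def describe(self) -> str:
        return f"APDU {self.kind}"


@dataclass(frozen=True)
class SearchApdu(Apdu):
    """A subtype that adds a field and overrides the parent's function."""
    query: str = ""

    def describe(self) -> str:
        return f"{super().describe()} for {self.query!r}"


token = SearchApdu("SearchRequest", "title=petri")
print(token.describe())
print(isinstance(token, NullToken))
```

A place declared to accept `Apdu` tokens in this analogy would also accept `SearchApdu` tokens, since every subtype inherits the fields and functions of its parent.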


4. Experimental Design

Research Methods

A number of papers have been published dealing with methodologies for the development of code using LOOPN (Keen & Lakos, 1993; Lakos & Keen, 1991a; Lakos & Keen, 1991b; Lakos & Keen, 1993b). In all of these cases the papers dealt with greenfield projects, that is, with projects which involved the analysis and design of the particular project from the beginning. In the case of the Z39.50 protocol, the specifications had been predetermined and published, which placed constraints on the processes and data which were to be modelled. On the other hand, this was also an advantage, as the protocol had been published, implemented and revised, and should, therefore, be relatively unambiguous.

Top down approach

The Z39.50 protocol documentation is firmly structured around a number of core services. The strong modular nature of the document, and the apparent discrete nature of these services, suggested that the following steps would provide the basis for a methodology for the development of the LOOPN code.

• Identify the high level services provided by Z39.50
• For each of these services, determine the data processes and states which form the components of the service
• Create a place/transition net for each service
• Create LOOPN modules
• Assemble modules


5. Results and Analysis

Identification of the services

Seven core services of the Z39.50 protocol were identified:

• Initialisation - An Init request by the origin, followed by an Init response from the target.
• Search - A Search request from the origin, followed by a Search response from the target.
• Retrieval - A Present request from the origin, followed by a Present response from the target.
• Result-set-delete - A Delete request from the origin, followed by a Delete response from the target.
• Access Control - An Access-control request by the target (following an Init, Search, Present, Delete or Resource-report request by the origin) followed by an Access-control response from the origin.
• Accounting/Resource Control - consists of three services:
  • Trigger-resource-control - A Trigger-resource-control request from the origin (following an Init, Search, Present, or Delete request by the origin)
  • Resource-control - A Resource-control request from the target (following an Init, Search, Present, or Delete request by the origin) followed (possibly) by a Resource-control response from the origin.
  • Resource-report - A Resource-report request from the origin (following an Init, Search, Present, Delete or Resource-report response by the target) followed by a Resource-report response from the target.

The Trigger-resource-control request is an intriguing service.

An origin system may issue a Trigger-resource-control requests [sic] following an Init, Search, Present, or Delete request, prior to the receipt of the corresponding response. The request serves as a signal to the target system that the origin wishes the target to:

a) simply send a Resource-report – ie issue a Resource-control request with Response-required “off”;
b) invoke full resource control – ie issue a Resource-control request with Response-required “on”; or
c) Cancel the current Init, Search, Present, or Delete operation.

The target is not obligated to take any specific action on receipt of a Trigger-resource-control request. For the purpose of procedure description, there is no response to the request; if the target wishes to issue a Resource-control request it does so unilaterally. (If the origin issues a Trigger-resource-control request and subsequently receives a Resource-control request, the origin cannot necessarily determine whether the latter resulted from the Trigger-resource-control request. However, the target may include Triggered-request-flag in the Resource-control request to so indicate.) (ANSI, 1992, p 15)

• Termination - There are two ways of terminating:
  • IR-abort service - An IR-abort request from either the origin or the target.
  • IR-release - An IR-release request by the origin, followed by an IR-release response by the target.
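The services identified above can be summarised as a small lookup table recording which party issues each request and whether a response follows. The Python sketch below is a reading aid built from the descriptions in this chapter, not a structure defined by the standard; the dictionary name `SERVICES`, the value labels and the helper function are hypothetical.

```python
# service request -> (party issuing the request, whether a response follows)
# "optional" reflects the Resource-control response that "possibly" follows;
# "never" is used where the description above mentions no response.
SERVICES = {
    "Init": ("origin", "always"),
    "Search": ("origin", "always"),
    "Present": ("origin", "always"),
    "Delete": ("origin", "always"),
    "Access-control": ("target", "always"),
    "Trigger-resource-control": ("origin", "never"),
    "Resource-control": ("target", "optional"),
    "Resource-report": ("origin", "always"),
    "IR-abort": ("either", "never"),
    "IR-release": ("origin", "always"),
}


def requests_issued_by(party):
    """List the requests that the given party may issue."""
    return [
        name
        for name, (initiator, _) in SERVICES.items()
        if initiator in (party, "either")
    ]


print(requests_issued_by("target"))
```

Listing the requests by initiator shows that the target originates only the Access-control and Resource-control requests, apart from IR-abort, which either party may issue.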

The IR-abort and IR-Release services map directly onto the A-ABORT and A-RELEASE services (respectively) of the association control service element. (ANSI, 1992, p 4)

It was also stated that

The Information Retrieval protocol assumes service from the Presentation layer and the association control service element. ... The services required are:

1) orderly association release ..., and
2) association abort. (ANSI, 1992, p 27)

A state table for the origin and the target was also provided. The state tables constitute the most precise documented definition of the processes within Z39.50. That being the case, the following section is quoted in full.

4.2.3 State Tables

This section defines two Information Retrieval Protocol Machines (IRPMs) in terms of state tables. One state table is defined for the origin (table 10) and one state table is defined for the target (table 11). Each state table shows the inter-relationship between the state of an Information Retrieval association, the incoming events which occur in the protocol, the actions taken, and, finally, the resulting state of the association. The IRPM state table does not constitute a formal definition of the IRPM. It is included to provide a more precise specification of the protocol procedures. The following conventions are used in the state tables.

State Table Cells

The intersection of an incoming event (row) and a state (column) forms a cell. A blank cell represents the combination of an incoming event and a state that is not defined for the IRPM. A non blank cell represents an incoming event and state that is defined for the IRPM. Such a cell contains one or more actions separated by semicolons (;).

Actions to be Taken by the IRPM

The IRPM state tables define the action to be taken by the IRPM in terms of one or more outgoing events (separated by semicolons) and the resulting state (in parenthesis) of the Information Retrieval association.

Invalid Intersections

Blank cells indicate an invalid intersection of an incoming event and state. The state tables define correct operation only. They do not specify actions to be taken in response to incorrect operation (for example, erroneous protocol control information, incorrect protocol control actions, etc.). Such actions are not within the scope of the protocol although implementations must consider them. (ANSI, 1992, pp 28 - 29)
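The state table conventions quoted above map naturally onto a lookup keyed by state and incoming event. The Python sketch below shows the shape of such a table only. The state names, event names and actions are hypothetical placeholders and are not copied from tables 10 and 11 of the standard.

```python
# (state, incoming event) -> (outgoing events, resulting state)
# Every name below is a placeholder chosen for illustration.
ORIGIN_TABLE = {
    ("closed", "Init request"): (["Init APDU"], "init sent"),
    ("init sent", "Init APDU (accept)"): (["Init confirm"], "open"),
    ("open", "Search request"): (["Search APDU"], "search sent"),
    ("search sent", "Search APDU (response)"): (["Search confirm"], "open"),
}


def step(table, state, event):
    """Look up one cell; a missing key plays the role of a blank cell."""
    cell = table.get((state, event))
    if cell is None:
        # The state tables define correct operation only, so an invalid
        # intersection is reported rather than handled.
        raise LookupError(f"event {event!r} is not defined in state {state!r}")
    outgoing_events, next_state = cell
    return outgoing_events, next_state


state = "closed"
for event in ["Init request", "Init APDU (accept)", "Search request"]:
    actions, state = step(ORIGIN_TABLE, state, event)
    print(event, "->", actions, state)
```

Raising an error for a blank cell is one possible design choice. As the quoted text notes, the standard leaves the handling of incorrect operation to implementations.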

The lack of precision in the state table became obvious when the services were examined in detail, and will be further commented on in the next section dealing with the creation of the place/transition nets. The state table for the origin is included on the following page.

A number of matters were not specified within the Z39.50 protocol.

• establishment of a session: It is assumed that prior to APDU’s being exchanged the Information Retrieval service user will handle the association control services required to establish an association with an application context encompassing the Information Retrieval service. (ANSI, 1992, p 27)

• the authentication identifier: The origin and the target agree, outside the scope of the standard, whether or not this parameter is to be supplied by the origin, and if so, what value it is. (ANSI, 1992, pp 4 - 5)

• aspects of the access control service: The specific content of the Access-control request and response parameters are outside the scope of this standard. ... Security-challenge and Security-challenge-response The contents of these two parameters are outside the scope of this standard and must be established by prior agreement between a given target/origin system pair. (ANSI, 1992, p 13)

At present these aspects are handled by the parties controlling the origin and target, either by exchange of e-mail or as part of the service contract between the two parties. (Hinnebush, 1994; Zeeman, 1994) The current version of Z39.50-1994 which is being balloted on by the ZIG at present contains several access control formats in Appendix ACC, and recommended formats for the IdAuthentication parameter in the ASN.1 for InitializeRequest. (Wibberley, 1994)


Figure 4.1 - State Table for Z39.50-1992 Origin (ANSI, 1992, p 31)

Place/Transition Nets

The following nets are for the origin part of the Z39.50 protocol. In drawing the nets, I have used the term origin agent to denote the user of the origin, origin for the part of the net containing the Z39.50 protocol itself, and net to denote the networking layers below Z39.50.


The Initialisation Service

‘Begin at the beginning,’ the King said gravely, ‘and go on till you come to the end: then stop.’ (Lewis Carroll, Alice in Wonderland, Ch 11)

It seemed logical at the time to begin the analysis of the origin with the initialisation service. This immediately brought to light two problems in dealing with the state tables. Firstly, the state tables do not group all the events belonging to a particular service together. Secondly, knowledge of the abstract implementation-independent concepts of service-user, service-provider and service primitive is necessary to properly interpret the state tables. The following figure shows the sequence of request → PDU → indication → response → PDU → confirmation, which lies at the heart of such protocols.

[Figure: origin agent: (1) Request, (6) Confirmation. origin to target: (2) Protocol Message. target agent: (3) Indication, (4) Response. target to origin: (5) Protocol Message.]

Figure 4.1 - Sequence of Interactions (ANSI, 1992, p 28)

With this background knowledge the place/transition net at figure 4.2 was constructed for the initialisation service. Initially the Z39.50 origin is in the closed state. When an initialisation request (Init req) is received, the state changes to Init sent to reflect this, and an initialisation protocol data unit (Init PDU) is passed to the network layers. If the initialisation is accepted by the target, an appropriate response PDU (Init resp + PDU) is received by the origin, the state of the origin changes to open and the confirmation (Init conf +) is notified to the user. If the connection is not accepted by the target, the response PDU (Init resp - PDU) is received by the origin, a release request (Arel req) is sent to the net by the origin, the user is notified (Init conf -) and the origin’s state changes to Rlease sent to await confirmation of the release. The confirmation of the release is by an Arel conf message from the target. On receipt of this the origin returns to the closed state, and the user is advised that the release has been confirmed by the target.

[Figure: place/transition net with columns origin agent, origin and net. Places: closed, Init sent, open, Rlease sent. Agent side: Init req, Init conf +, Init conf -, Irel conf. Net side: Init PDU, Init resp + PDU, Init resp - PDU, Arel req, Arel conf.]

Figure 4.2 - Z39.50 Origin Initialisation service
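The behaviour described for the initialisation net can be traced with a very small place/transition simulator. This is a sketch in Python rather than LOOPN, and it is not the implementation developed in this thesis. The class `Net` and the transition names `accept`, `reject` and `released` are hypothetical labels introduced for illustration; the place and message names follow the text. Messages are modelled as tokens in places of the same name, which is an assumption of the sketch.

```python
from collections import Counter


class Net:
    """Minimal place/transition net: a marking plus named transitions."""

    def __init__(self, transitions, marking):
        self.transitions = transitions      # name -> (input places, output places)
        self.marking = Counter(marking)     # place -> number of tokens

    def enabled(self, name):
        inputs, _ = self.transitions[name]
        return all(self.marking[p] >= n for p, n in Counter(inputs).items())

    def fire(self, name):
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        self.marking.subtract(Counter(inputs))
        self.marking.update(Counter(outputs))
        self.marking = +self.marking        # discard places holding no tokens


# Initialisation service of the origin, as described in the text.
INIT_NET = {
    "send_init": (["closed", "Init req"], ["Init sent", "Init PDU"]),
    "accept": (["Init sent", "Init resp + PDU"], ["open", "Init conf +"]),
    "reject": (["Init sent", "Init resp - PDU"],
               ["Rlease sent", "Arel req", "Init conf -"]),
    "released": (["Rlease sent", "Arel conf"], ["closed", "Irel conf"]),
}
```

Starting from a marking with one token in closed and one in Init req, firing `send_init` leaves tokens in Init sent and Init PDU; which of `accept` or `reject` can fire next depends only on which response token arrives from the net.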

The Search, Present, Delete and Resource-report Services

One of the more interesting features of the analysis of the Z39.50 protocol was the observation that the Search, Present, Delete and Resource-Report Services all have the same basic net structure. This has implications for the use of inheritance in the creation of the LOOPN modules. In all of these cases, the origin must be in the open state, before a request can be sent. On receipt of a request (req), the state of the origin changes to sent and a request PDU (req PDU) is passed to the net. When a response PDU (resp PDU) is received by the origin, the state returns to open and a confirmation (conf) is passed to the user.

[Figure: place/transition net with columns origin agent, origin and net. Places: open, sent. Agent side: req, conf. Net side: req PDU, resp PDU.]

Figure 4.3 - Z39.50 Origin Search, Present, Delete and Resource-Report services
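The observation that four services share one net structure can be illustrated with ordinary class inheritance. The sketch below is in Python, not LOOPN, and does not reproduce the LOOPN modules of this thesis; the class names and the method names `request` and `response_pdu` are hypothetical. Each subclass changes only the service name, which is the point of the observation.

```python
class RequestResponseService:
    """Generic origin-side structure: open -> sent on req, sent -> open on resp PDU."""

    name = "generic"

    def __init__(self):
        self.state = "open"

    def request(self):
        # A request is only accepted while the origin is open.
        if self.state != "open":
            raise RuntimeError(f"{self.name} req not allowed in state {self.state!r}")
        self.state = "sent"
        return f"{self.name} req PDU"       # passed to the net

    def response_pdu(self):
        # A response PDU is only expected after a request has been sent.
        if self.state != "sent":
            raise RuntimeError(f"{self.name} resp PDU not expected in state {self.state!r}")
        self.state = "open"
        return f"{self.name} conf"          # passed to the user


class Search(RequestResponseService):
    name = "Search"


class Present(RequestResponseService):
    name = "Present"


class Delete(RequestResponseService):
    name = "Delete"


class ResourceReport(RequestResponseService):
    name = "Resource-report"
```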

The Access-control Service

The target may send an Access-control request following an Init, Search, Present, Delete or Resource-report request by the origin. For simplicity, this multiple state will be represented on the place/transition net as a single shaded place. In this case, the state table supplied as part of the ANSI standard resorts to an exceptional strategy to cope with the fact that this service is seeking to affect the state of another service in the Z39.50 protocol. It introduces the operations stsk and popst. The stsk operation saves the state of the system, and the popst operation restores it. While this strategy has the desired effect, it deviates significantly from the place/transition concept. In the figure below the stsk and popst operations are shown against the transitions in which they would occur.

[Figure: place/transition net with columns origin agent, origin and net. Places: sent (shaded), Acctrl recvd. Agent side: Acc ind, Acc resp. Net side: Acc PDU, Acc resp PDU. Operations marked on transitions: (stsk), (popst).]

Figure 4.4 - Z39.50 Origin Access-control service

What really is required, however, is the greyed arc to return to the state before the Acc PDU message was received. A way of achieving this will be shown in the section on developing the LOOPN modules.
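The effect of stsk and popst can be sketched as a saved-state stack. The following Python fragment is an illustration only: the standard does not prescribe this data structure, the reading of stsk and popst as push and pop is an assumption based on their names and the description above, and the class `Origin` and its method names are hypothetical.

```python
class Origin:
    """Origin whose current state can be saved and restored around access control."""

    def __init__(self, state):
        self.state = state
        self._saved = []                    # assumed stack behind stsk/popst

    def stsk(self):
        self._saved.append(self.state)      # save the current state

    def popst(self):
        self.state = self._saved.pop()      # restore the most recently saved state

    def acc_pdu_received(self):
        # Access-control request arrives from the net while a request is outstanding.
        self.stsk()
        self.state = "Acctrl recvd"
        return "Acc ind"                    # indication passed to the origin agent

    def acc_resp(self):
        # The origin agent answers; the interrupted state is resumed.
        self.popst()
        return "Acc resp PDU"               # response passed to the net
```

The sketch makes the difficulty visible: the state that follows `acc_resp` is not fixed by the structure of the net but by a value stored outside it, which is why the operation sits uneasily with the place/transition concept.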

The Trigger-resource-control Service

In order to use the Trigger-resource-control service, the origin must have sent an Init, Search, Present, or Delete request. For simplicity, this multiple state will be represented on the place/transition net as a shaded place. When a Trigger-resource-control request (Trigrc req) is received, a Trigger-resource-control request PDU is sent, but there is no change in state.

[Figure: place/transition net with columns origin agent, origin and net. Places: sent (shaded). Agent side: Trigrc req. Net side: Trigrc PDU.]

Figure 4.5 - Z39.50 Origin Trigger-resource-control service


The Resource-control Service

In order to act upon a Resource-control request from the target, the origin must have sent an Init, Search, Present, or Delete request. For simplicity, this multiple state will be represented on the place/transition net as a shaded place. The target request can require no response, or it can require a response. In the former case, the target is providing a resource report to the origin - the state of the origin does not change. Otherwise, a response is required. As with the Access-control service, this service uses the stsk and popst operations and a greyed arc has been placed in the net to show the actual state change in the syntax of a place/transition net.

[Figure: place/transition net with columns origin agent, origin and net. Places: sent (shaded), rsctl recd. Agent side: Rsc ind (for each kind of request), Rsc resp. Net side: Rsc PDU (Noresp), Rsc PDU (Resp), Rsc resp PDU. Operations marked on transitions: (stsk), (popst).]

Figure 4.6 - Z39.50 Origin Resource-control service

The IR-abort Service

The IR-abort service is able to handle an IR-abort request from either the origin or the target. The Aab ind message is issued as a result of an abort request from the target. The APab ind message is issued by the OSI service provider as a result of an abnormal condition within the underlying protocol layers (ISO, 1988a, p 9). Both are handled identically by the Z39.50 origin.

[Figure: place/transition net with columns origin agent, origin and net. Places: closed. Agent side: Iab req, Iab ind. Net side: Aab req, Aab ind, APab ind.]

Figure 4.7 - Z39.50 Origin IR-abort service

The IR-release Service

Once the origin is open, an IR-release request sends an Arel req to the net, and the origin moves into the Rlease sent state. The target responds with an Arel conf and the origin moves back to a closed state.

[Figure: place/transition net with columns origin agent, origin and net. Places: open, closed. Agent side: Irel req. Net side: Arel req, Arel conf.]

Figure 4.8 - Z39.50 Origin IR-release service

At this point it is worth noting that the initialisation service handles the reception of an Arel conf message which may be received as part of the graceful termination of an unsuccessful attempt to establish connection with a target, in an identical fashion. It therefore makes sense to modify the Initialisation service to include the handling of the IR-release service.


[Figure: place/transition net with columns origin agent, origin and net. Places: closed, Init sent, open. Agent side: Init req, Init conf +, Init conf -, Irel req. Net side: Init PDU, Init resp + PDU, Init resp - PDU, Arel req, Arel conf.]

Figure 4.9 - Z39.50 Origin Initialisation/IR-release service

ANSI Z39.50 on the Internet

Many institutions implementing this protocol chose, for various reasons, to layer the protocol directly over TCP/IP rather than to implement it in an OSI environment or to use the existing techniques that provide full OSI services at and above the OSI Transport layer on top of TCP connections (as defined in RFC 1006 [7] and implemented, for example, in the ISO Development Environment software). These reasons included concerns about the size and complexity of OSI implementations, the lack of availability of mature OSI software for the full range of computing environments in use at these institutions, and the perception of relative instability of the architectural structures within the OSI applications layer (as opposed to specific application layer protocols such as Z39.50 itself). Most importantly, some of these institutions were concerned that the complexity introduced by the OSI upper layers would outweigh the relatively meager return in functionality that they were likely to gain. Thus, for better or worse, the decision was taken to implement the Z39.50 protocol directly on top of TCP (with the understanding that this decision might be revisited at some point in the future). (Lynch, 1994)

A number of the quotes in this section are taken from Clifford Lynch’s document Using the Z39.50 Information Retrieval Protocol in the Internet Environment (Lynch, 1994). This document is an Internet Engineering Task Force draft, which specifically states in the introduction:

Internet Drafts are draft documents valid for a maximum of six months. Internet Drafts may be updated, replaced, or obsoleted by other documents at any time. It is not appropriate to use Internet Drafts as reference material or to cite them other than as a "working draft" or "work in progress." (Lynch, 1994)

The reader’s attention is drawn to this caution. The expiration date for the document in question is the first of February, 1995. The version of the document cited in this thesis was current as at the date of submission. That version of the document is appended as Appendix 2.

TCP/IP is a suite of protocols that was developed by the US Department of Defense and is used on the Internet. Software supporting TCP/IP is part of nearly every UNIX distribution today. TCP/IP is not an OSI protocol and does not fit in the OSI reference model. Specifically, the Association Control Service Element of the Application Layer and the other Presentation Layer services relied upon by the ANSI Z39.50 definition are absent.

During 1991-1993, the Coalition for Networked Information (CNI) convened a group of Z39.50 implementors to form the Z39.50 Interoperability Testbed project (ZIT). The main aim was to work towards interoperable Z39.50 implementations running over TCP/IP on the Internet.

The absence of a presentation layer has required a modification to the technique used with the Basic Encoding Rules (BER).

The standard Basic Encoding Rules (BER) [11] are applied to the ASN.1 structures defined by the Z39.50 protocol to produce a byte stream that can be transmitted across a TCP/IP connection. The only restriction on the use of BER to produce this byte stream is that direct, rather than indirect, references must be used for EXTERNAL objects. (Lynch, 1994)

It should be noted that this means that a Z39.50 BER stream produced by a TCP/IP site can be read by an OSI site, but the reverse is not always the case.

TCP Port 210 has been assigned to Z39.50 by the Internet Assigned Numbers Authority. To initiate a Z39.50 connection to a server under TCP/IP, an origin simply opens a TCP connection to port 210 on the target and then, as soon as the TCP connection is established, transmits a Z39.50 Init PDU using the BER encoding of that Init PDU. Some sites still use other port numbers, e.g. 2210 or 2100.

The original versions of the Wide Area Information Server (WAIS) employed Z39.50-1988 with some extensions. Z39.50-1988 did not use BER encoding and Z39.50-1988 Init PDUs look very different from the Init PDUs of Z39.50-1992. Implementations of Z39.50 should be prepared to detect and appropriately reject or translate WAIS Init PDUs.

The IR-release service is directly mapped to a TCP CLOSE. The IR-abort service in the TCP/IP environment is implemented by terminating the TCP connection either via TCP ABORT or TCP CLOSE. While this can create some ambiguity in the nature of the closing of the connection, this reflects the inability of some TCP/IP implementations to distinguish a TCP reset from the other side of the connection from other events. The transmission of data by origin and target is mapped to TCP transmit and receive operations.
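The mapping onto TCP described above can be sketched with the Python standard library. This is an illustration under stated assumptions, not code from this thesis or from any Z39.50 implementation: the function names `open_association` and `release` are hypothetical, and the Init PDU is assumed to have been BER-encoded elsewhere, since no encoding is attempted here.

```python
import socket

Z3950_PORT = 210  # port assigned to Z39.50, as stated in the text


def open_association(host, init_pdu, port=Z3950_PORT, timeout=10.0):
    """Open a TCP connection and immediately transmit an already encoded Init PDU.

    `init_pdu` must be the BER encoding of the Init PDU as bytes; producing
    that encoding is outside the scope of this sketch.
    """
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        sock.sendall(init_pdu)
    except OSError:
        sock.close()
        raise
    return sock


def release(sock):
    """IR-release is mapped directly to a TCP close."""
    sock.close()
```

The `port` parameter is left overridable because, as noted above, some sites use port numbers other than 210.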

LOOPN Modules

This section describes the development of LOOPN modules for a Z39.50 origin. The decision was taken at the outset to initially work on the origin. The ASN.1 data structures could under some implementations be as large as 32 kbyte. LOOPN would best handle this size of element by using pointers to external C data. The tokens used in the nets would then consist of control information used in the LOOPN nets, e.g. the kind of token and other distinguishing data. In the development stage a string data element was also attached to tokens to assist in debugging the net models.

To preserve the concept of a layered structure, the decision was also taken to attempt to produce a final net resembling the standard sequence diagram (Figure 4.1). That is, a transition with four places, as follows:

[Figure: a single transition, Z39.50 origin, connected to four places: request, confirmation, to_net, from_net.]

Figure 4.10 - Z39.50 Origin Ideal Structure

A module defining common types was also declared. The types defined also included values for the service types. The ACSE and presentation layer services were defined in terms of negative numbers, as these layers do not exist under TCP/IP, and no values were defined for them in the Z39.50 protocol, as they are defined outside this protocol.

CONST
  initRequest                   = 20;
  initResponse                  = 21;
  searchRequest                 = 22;
  searchResponse                = 23;
  presentRequest                = 24;
  presentResponse               = 25;
  deleteResultSetRequest        = 26;
  deleteResultSetResponse       = 27;
  accessControlRequest          = 28;
  accessControlResponse         = 29;
  resourceControlRequest        = 30;
  resourceControlResponse       = 31;
  triggerResourceControlRequest = 32;
  resourceReportRequest         = 33;
  resourceReportResponse        = 34;
  Arel_req                      = -1;
  Arel_ind                      = -2;
  Arel_resp                     = -3;
  Arel_conf                     = -4;
  Irel_req                      = -5;
  Irel_ind                      = -6;
  Irel_resp                     = -7;
  Irel_conf                     = -8;
  Aab_req                       = -9;
  Aab_ind                       = -10;
  Aab_resp                      = -11;
  Aab_conf                      = -12;
  Iab_req                       = -13;
  Iab_ind                       = -14;
  Iab_resp                      = -15;
  Iab_conf                      = -16;
  APab_ind                      = -17;

TYPE
  kinds = APab_ind .. resourceReportResponse;
  request_type = TOKEN null WITH
    kind : kinds;
    contents : string;
  END;
  confirmation_type = TOKEN null WITH
    kind : kinds;
    contents : string;
  END;
  PDU = TOKEN null WITH
    kind : kinds;
    contents : string;
  END;
  str = TOKEN null WITH
    contents : string;
  END;

Listing 4.1 - part of defs.l
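For readers unfamiliar with LOOPN, the intent of these declarations can be mirrored in Python. The sketch below is not part of defs.l. It assumes the reading of Listing 4.1 in which the fifteen Z39.50 PDU kinds take the values 20 to 34 and the seventeen ACSE, release and abort primitives take the values -1 to -17, which is consistent with the subrange `APab_ind .. resourceReportResponse`. The names `Kind`, `Token` and `is_z3950_pdu` are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum

# Names in the order they appear in the listing.
_Z3950_NAMES = [
    "initRequest", "initResponse", "searchRequest", "searchResponse",
    "presentRequest", "presentResponse", "deleteResultSetRequest",
    "deleteResultSetResponse", "accessControlRequest", "accessControlResponse",
    "resourceControlRequest", "resourceControlResponse",
    "triggerResourceControlRequest", "resourceReportRequest",
    "resourceReportResponse",
]
_OTHER_NAMES = [
    "Arel_req", "Arel_ind", "Arel_resp", "Arel_conf",
    "Irel_req", "Irel_ind", "Irel_resp", "Irel_conf",
    "Aab_req", "Aab_ind", "Aab_resp", "Aab_conf",
    "Iab_req", "Iab_ind", "Iab_resp", "Iab_conf", "APab_ind",
]

# Assumed numbering: Z39.50 kinds from 20 upward, the rest from -1 downward.
_values = {name: 20 + i for i, name in enumerate(_Z3950_NAMES)}
_values.update({name: -(i + 1) for i, name in enumerate(_OTHER_NAMES)})
Kind = IntEnum("Kind", _values)


@dataclass(frozen=True)
class Token:
    """Counterpart of request_type, confirmation_type and PDU: a kind plus debug text."""
    kind: Kind
    contents: str = ""


def is_z3950_pdu(kind):
    """True for kinds defined by Z39.50 itself, False for the negative-valued kinds."""
    return kind > 0
```

Using a single ordered range for all kinds keeps one token type usable on every place of the net, at the cost of relying on the sign of the value to tell the two groups apart.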


The Z39.50 Initialisation/IR-release Service

LOOPN allows for inheritance of previously defined modules. The syntax indicates that a module declaration may be in the form:

MODULE moduleid1 = moduleid2, moduleid3 ...

This indicates that moduleid1 is the name of the module being defined, while moduleid2, moduleid3, … are its parent modules (with generic as the default parent). A module inherits all the features of the parents, and it may augment those features. In addition, it may override the declaration of places, transitions, instances and access functions. (Lakos, 1992, p 17)

Accordingly, it was decided to create a module which defined the places used in the Initialisation/IR-release service, and then to create modules for the components of the service, and finally instantiate these to consolidate them as the service module.

[Figure: places request, confirmation, Init sent, open, closed, rls sent, to_net, from_net.]

Figure 4.11 - Z39.50 Origin init_places

MODULE init_places = defs (
  INPUT  request      : request_type;
  OUTPUT confirmation : confirmation_type;
  OUTPUT to_net       : PDU;
  INPUT  from_net     : PDU;
  INOUT  open         : null;
  INOUT  closed       : null );

PLACE
  init_sent : null;
  rls_sent  : null;

TRANSITION initial;
END MODULE

Listing 4.2 - init_places.l

The first component defined was the sequence defining the sending of an initialisation request where the target accepts the connection. This module was called open_ok.

[Figure: the init_places net (places request, confirmation, Init sent, open, closed, rls sent, to_net, from_net) with transitions send_init and rec_PDU (OK).]

Figure 4.12 - Z39.50 Origin open_ok

MODULE open_ok = init_places ;
TRANSITION initial;
TRANSITION send_init;
INPUT x
