Experiments with the Z Interchange Format and SGML

Daniel M. German and D.D. Cowan
Dept. of Computer Science, University of Waterloo
Waterloo Ont. N2L 3G1
[email protected] and
[email protected]
Abstract. Standards, if widely accepted, encourage the development of tools and techniques to process objects conforming to that standard. This paper describes a number of experiments using available tools to process text containing Z specifications adhering to the existing Z Interchange Format. The experiments resulted in tools that could be used in specific programming environments where Z was used to describe software systems.
1 Introduction

The benefits of a standard language for representing specifications and programs, and of a standard method of electronically interchanging documents containing these representations, are clear. If we all use the same language, communication of concepts, ideas and descriptions is greatly simplified. Further, if the same electronic format is used and there are enough users of a language, then organizations are willing to make the investment required to create tools for processing and transforming these documents. Standard representations produce another benefit: they allow tools to be linked to produce more powerful toolkits.

The specification language Z is no exception to these observations. Z is a language that uses a wide range of mathematical symbols. Unfortunately, not all those symbols are available in ASCII or any other commonly accessible character set. Although this lack of availability of symbols has not been an impediment to the creation of Z documents, their interchange between products and platforms has been seriously hindered. For instance, someone can create Z documents under Microsoft Windows² using the public domain and shareware fonts available for Z symbols. However, if these documents are processed using UNIX-based tools they will appear unintelligible, as there is no transparent way to translate the Z files from one system to the other.

LaTeX [1] has become the de facto standard for the interchange of Z documents, and many Z tools can process LaTeX documents using zed.sty [2], fuzz.sty [3], or oz.sty [4]. Although LaTeX is a sufficient solution, it is not an optimal one. For example, a LaTeX Z document might include formatting instructions that are not relevant to the Z specification. Inclusion of this extra material makes parsing of LaTeX documents more complex than necessary. In order to import those documents a parser has to understand the complete syntax of LaTeX and zed.sty, and might still fail with some complicated structures that involve complex macros and formatting operations. In addition, LaTeX is not well known outside academia.

An argument can be made that the use of LaTeX provides an adequate solution to the problem of a standard interchange format. However, a tagging scheme such as the one supported by the Z Interchange Format [5] and the Standard Generalized Markup Language (SGML)³ represents a simpler and more elegant solution.

The standardization committee decided to create an interchange format that would facilitate the sharing of Z documents between tools and platforms. This format should not have any character set dependencies, and should be easy to parse and generate, so that tools to manipulate the format can be easily constructed. These criteria led to the choice of SGML as the underlying technology for the interchange format. SGML is an ISO standard for the definition of languages that describe the structure of text [6]. SGML relies on the inclusion of tags in the text of the document to define its structure. In addition, the use of SGML is spreading rapidly, and there are many commercial and public-domain tools available to author and process SGML documents.

The research described in this paper was supported by IBM Canada, the Information Technology Research Centre of Ontario and Inforium Technologies Inc.
² Microsoft Windows is a trademark of Microsoft Corporation.
2 The Z Interchange Format

Appendix D of the draft of the Z Standard [5] includes the description of the Z Interchange Format. The definition states that:

    The Z Interchange Format defines a portable representation of Z, allowing Z documents to be transmitted between different machines.⁴

We believe that the definition should be extended to include sharing of Z documents between different products. To make the paper self-contained, the remainder of this section contains a very brief description of SGML and the Z Interchange Format.
³ See Section 2.1 for a description of SGML.
⁴ Taken from [5], page 171.

2.1 SGML

SGML is the abbreviation for "Standard Generalized Markup Language", and is a meta-language to define markup languages that specify the structure of a document. The markup language does not specify how the document is processed. For example, tags in a markup language could specify the author or the title of an article, but not the type of font or its size when it is printed or displayed.

When a document is marked up using an SGML tag set, the text is tagged with information about its structure. The elements of a tagging language defined using SGML, and the relationships among them, determine a class of documents. The definition of such a class is known as a DTD (Document Type Definition) in SGML jargon. Any SGML document should have a corresponding DTD.

Normally, SGML documents use "<tag>" to specify the start of an element, and "</tag>" to specify its end. Elements are the logical units of SGML documents. Entities are similar to macros: they define names that will eventually be replaced⁵. Uncommon special symbols (such as many mathematical symbols used in Z) are normally unavailable in typical character sets. SGML entities allow the inclusion of those symbols without committing to any particular character set.

Figure 1 is an example of a small SGML document, namely a tagged memo:
John Doe Bob Smith The party is today!
Fig. 1. A Memo encoded using SGML

From Figure 1 we can infer that a memo is composed of from, to and body elements, and that body is composed of a sequence of paragraph elements (in this case a single one).
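The structure described for the memo can be made concrete with a small sketch. The element names memo, from, to and body follow the description above; the paragraph tag name `p` is our assumption, as is the exact layout of the document. Python's `html.parser.HTMLParser` is used here only as a lenient tag scanner; it is not a validating SGML parser and does not consult a DTD.

```python
from html.parser import HTMLParser

# A memo tagged in the spirit of Figure 1. The tag name "p" for a
# paragraph is an assumption made for this illustration.
MEMO = """<memo>
<from>John Doe</from>
<to>Bob Smith</to>
<body><p>The party is today!</p></body>
</memo>"""


class OutlineParser(HTMLParser):
    """Records (element path, text) for every non-blank run of text."""

    def __init__(self):
        super().__init__()
        self.stack = []    # names of the currently open elements
        self.outline = []  # collected (path, text) pairs

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Assumes well-nested input: the closing tag matches the top of the stack.
        self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.outline.append(("/".join(self.stack), text))


def outline(document):
    """Return the list of (element path, text) pairs found in the document."""
    parser = OutlineParser()
    parser.feed(document)
    parser.close()
    return parser.outline
```

Calling `outline(MEMO)` yields one pair per text run, with the path showing how body contains the paragraph element, which in turn contains the text.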
2.2 The SGML representation of Z

The function of the Z Interchange Format is to facilitate the interchange of Z documents and not necessarily to represent their full structure. The DTD for the Z Interchange Format is similar but not identical to the grammar of Z. In particular, at the lowest level of the DTD for the interchange format there are sequences of identifiers and operators, while in the grammar of Z there are specific rules on how those identifiers and operators can be joined together. The declaration of entities is implementation dependent and therefore is not specified in the standard.

⁵ This definition of entities is not complete, but is sufficient in the context of the Z Interchange Format.
2.3 An example of a declaration of an element

Figure 2 shows the declaration of the element axdef (axiomatic definition). The semantics of this declaration are similar to those of a BNF production: axdef generates a decpart (declaration part) followed by an optional axpart (axiom part).
<!ELEMENT axdef - - (decpart, axpart?) >
Fig. 2. The element declaration for axdef
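The BNF-like reading of this declaration can be illustrated by translating the content model into a regular expression over the names of the child elements. This is a minimal sketch under our own assumptions: it handles only comma-separated sequences of plain names with the occurrence indicators `?`, `*` and `+`, which is a small subset of SGML content models, and the function names are ours.

```python
import re


def model_to_regex(model):
    """Translate a simple SGML content model into a regular expression.

    Only comma-separated sequences of plain element names, each with an
    optional occurrence indicator (?, * or +), are supported.
    """
    parts = []
    for item in model.strip("() ").split(","):
        item = item.strip()
        match = re.fullmatch(r"(\w+)([?*+]?)", item)
        if match is None:
            raise ValueError(f"unsupported content model item: {item!r}")
        name, occurrence = match.groups()
        # Each child name is followed by a space so names cannot run together.
        parts.append(f"(?:{name} ){occurrence}")
    return re.compile("".join(parts))


def conforms(children, model):
    """Check whether a sequence of child element names matches the model."""
    sequence = "".join(child + " " for child in children)
    return model_to_regex(model).fullmatch(sequence) is not None
```

With the content model of axdef, `conforms(["decpart"], "(decpart, axpart?)")` and `conforms(["decpart", "axpart"], "(decpart, axpart?)")` hold, while an axdef containing only an axpart is rejected.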
2.4 The final result

Figure 3 shows the SGML encoding of a complete schema, and Figure 4⁶ shows its normal representation. Notice the use of "&id;" to represent the special symbols of Z in SGML. They are called entities and avoid the use of special character symbols in the SGML file.
Update ΔCheckSys a? : ADDR p? : PAGE working' = working ⊕ { a? ↦ p? }
Fig. 3. The SGML representation of a schema

The Z Interchange Format is simple enough that a person can enter Z in SGML directly. However, the representation is not very readable.

⁶ This example was taken from [7].
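┌─ Update ─────────────────────────
│ ΔCheckSys
│ a? : ADDR
│ p? : PAGE
├──────────────────────────────────
│ working′ = working ⊕ {a? ↦ p?}
└──────────────────────────────────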
Fig. 4. A Z schema in its normal representation
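The role of entities can be sketched as a simple lookup: each "&id;" reference is replaced by a symbol chosen by the receiving system. Since the declaration of entities is implementation dependent, the entity names in the table below (Delta, oplus, map) are hypothetical, as is the choice of Unicode characters as the target representation.

```python
import re

# Hypothetical entity table: the names are assumptions for this sketch,
# not the names fixed by any particular implementation.
ENTITIES = {
    "Delta": "\u0394",  # Greek capital delta
    "oplus": "\u2295",  # circled plus (override)
    "map": "\u21a6",    # rightwards arrow from bar (maplet)
}

# An entity reference is "&", a name, and a terminating ";".
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")


def expand_entities(text, table=ENTITIES):
    """Replace known entity references; leave unknown ones untouched."""

    def replace(match):
        return table.get(match.group(1), match.group(0))

    return ENTITY_REF.sub(replace, text)
```

A different platform would supply a different table (for example, one mapping to font-specific character codes) without any change to the document itself.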
3 Some Applications of the Z Interchange Format

The Z Interchange Format has several advantages, including:

- conceptual simplicity,
- reliance on the ASCII character set,
- ease of parsing, and
- use of a standard (SGML).
We would expect these features to be present in an interchange format, but the inclusion of SGML has produced some interesting side effects. Our research group is interested in the application of document systems incorporating SGML to various aspects of software engineering. When we became aware of the Z Interchange Format, we decided to conduct some experiments to assess its usefulness. Our experiments were oriented in two directions: building an environment for the editing of Z documents, and including Z specifications in literate programming. In each case we tried to use available tools that supported SGML.
3.1 Building an environment to author Z documents using SGML tools

Preparing documents containing specifications written in the Z Interchange Format requires an environment that can display all the mathematical symbols used in Z. LaTeX is an example of such an environment. Since the release of shareware and public domain fonts for Apple Macintosh and Microsoft Windows, it is possible to write Z using word processors such as WordPerfect⁷ or Microsoft Word⁸. SGML editors are available for several different operating environments, and we believed that one of these editors in conjunction with the Z Interchange Format could be used as the foundation for an editor for Z documents. This editor, called ZEdit, was produced for Microsoft Windows using Rita [8], a structured editor for documents, the Live PAGE⁹ SGML Document Browser [9], and Visual Basic. The two tools were connected through the file system and used Dynamic Data Exchange (DDE) for communication.

⁷ WordPerfect is a trademark of WordPerfect Corp.
⁸ Word is a trademark of Microsoft Corp.
⁹ Live PAGE is a trademark of Inforium Technologies Inc.

Rita is a structured editor for documents [8] that is capable of reading and saving SGML documents in ASCII format. Rita is grammar directed, and understands the structure of a given DTD. The user is restricted by the DTD and can only insert valid SGML constructs into a document; hence, the user can only create documents that comply with a given document grammar. In addition, Rita uses a menu to prompt the user as to which structures can be inserted in a document at any given point. In simple terms we can view Rita as a word processor that only allows the user to produce "correct" SGML documents, that is, documents that conform to a given DTD.

The Live PAGE SGML Document Browser is one of a series of tools developed by Inforium Technologies Inc. to manipulate documents stored in SGML format. The Browser takes as input a DTD, an SGML file and a style-sheet that specifies how the SGML file is to be typeset. The output of the Live PAGE Browser is a typeset document on the screen. A printed typeset document can also be produced.

Figure 5 illustrates an editing session using Rita and the Live PAGE Browser; Rita is on the left in the figure. Since we are using a regular keyboard we enter the entities for the special symbols rather than the symbols themselves. Rita is almost WYSIWYG, but it only has one font for display purposes, and is not capable of showing the special Z character fonts. Since the Live PAGE Browser has a flexible style sheet we use it to display the document. Once a modification is saved, the Browser on the right in the figure automatically displays the changes using the complete Z symbols. Using these combined tools the user does not need to know that ZEdit is tagging the Z specifications with SGML. In fact, users never need to examine the source files.

Fig. 5. A session of ZEdit

We also developed a converter from the Z Interchange Format to the zed.sty LaTeX style. This converter provides quite strong evidence that the Z Interchange Format is easy to process: the converter is a simple lex program.¹⁰

To create ZEdit it was only necessary to create a style-sheet that typesets the Z Interchange Format in a common representation. Since the amount of work was minimal, it demonstrates some of the benefits of using SGML.
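To suggest why a converter of this kind can be a simple scanner, the following sketch rewrites tags and entity references token by token, in the manner of a lex specification. It is not the converter mentioned above: the tag names (schema, name, where) and the entity names are hypothetical, and only a handful of tokens are covered. The LaTeX output uses the schema environment and \where command of zed.sty together with the standard \Delta, \oplus and \mapsto symbols.

```python
import re

# Hypothetical markup on the left, zed.sty/LaTeX text on the right.
TAGS = {
    "<schema>": r"\begin{schema}",
    "</schema>": r"\end{schema}",
    "<name>": "{",
    "</name>": "}",
    "<where>": r"\where",  # assumed to be an empty element
}
ENTITIES = {
    "&Delta;": r"\Delta ",
    "&oplus;": r"\oplus",
    "&map;": r"\mapsto",
}

# One token class for tags and one for entity references; all other
# text is copied through unchanged, as with a lex default rule.
TOKEN = re.compile(r"</?\w+>|&\w+;")


def to_zed(text):
    """Rewrite known tags and entities; pass everything else through."""

    def translate(match):
        token = match.group(0)
        if token in TAGS:
            return TAGS[token]
        return ENTITIES.get(token, token)

    return TOKEN.sub(translate, text)
```

For instance, `to_zed("<schema><name>Update</name>")` produces the opening of a zed.sty schema named Update. Because no rule needs more context than the token itself, the whole translation is a table lookup.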
¹⁰ This program is available at ftp://csg.uwaterloo.ca/pub/dmg/zif2tex.tar

3.2 Literate Programming and Z

Knuth coined the term Literate Programming [10] to describe the technique of combining compilable source code with descriptive prose in a single master document, called a literate program. He developed WEB as an implementation of his ideas. WEB was a tagging scheme; the user had to incorporate cryptic tags in the document to specify how to process text and code fragments. The tags used in WEB allowed a programmer to produce documentation using the TeX document formatting system and to process source code using a Pascal compiler. Two different programs called Weave and Tangle are used to process this tagged master document. Weave produces the program documentation that can be processed by TeX; Tangle produces Pascal source code capable of being processed by a compiler.

Unfortunately there were no tools other than standard editors to assist the programmer in creating literate programs. Thus, the programmer had to remember numerous tags, and their correct placement, to obtain a syntactically correct WEB document. Of course the resulting document was unreadable without further processing by TeX.

Ryman proposed an extension to Knuth's ideas by incorporating formal methods into the literate programming environment [11]. Mixing formal specifications, source code, and plain text in one document has numerous advantages. Code can be described both in natural language and using formal specifications, making the code easier to understand while reducing potential ambiguities. Since all information about the software is contained in one document, changes in formal specifications are more likely to be incorporated in the code, and vice versa. Storing the three components in the same environment is likely to reduce inconsistency and redundancy.
We decided to create a literate programming environment using an SGML-based tagging scheme that incorporates formal specifications, source code and documentation, including multimedia entities such as pictures, diagrams, sound and video clips. The environment was built by "gluing" together available tools.
WEB uses tags to mark up the components that constitute the structure of the literate program. Tangle and Weave convert the WEB file into a Pascal program and a TeX file respectively. We replaced the original markup tags with SGML-based tags, and produced a DTD for literate programming. Although the new environment is not conceptually better, it has several advantages. Available SGML tools support literate programming in a more interactive environment. For example, we could use the ZEdit environment described in the previous section to isolate authors from the details of SGML tags and the production of syntactically correct documents¹¹. This approach is certainly better than forcing the programmer to edit literate programs containing raw WEB code with embedded cryptic markup.

¹¹ Notice that this process creates documents compliant with the DTD for the Z Interchange Format; such documents do not necessarily conform to concrete Z, since the former does not mimic the latter.

Another benefit of SGML markup is the ability to include many different types of digital information. Pictures, images and sounds can be easily inserted and played with SGML viewers. Even Z formal specifications could be explained using sound clips embedded in the document. For a complete description of the tools and the issues involved see [12].

Constructing a literate programming environment that includes formal specifications is straightforward. We just add the Z Interchange Format to our DTD for literate programming, and merge the style-sheet from ZEdit with one for literate programming. This modified environment then easily supports the insertion of Z specifications into our literate programs. Furthermore, the Z specifications can be easily extracted from the literate programming document and converted to another format to be further processed by a type-checker. Figure 6 shows a partial literate program generated with the prototype.

Fig. 6. A literate program that includes Z specifications

Since formal specifications play an important role in software development, we believe they must be easily integrated with other software documents. This experiment with literate programming, SGML, and the Z Interchange Format demonstrates the ease with which such integration can be achieved.
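The extraction step described above, in which either the source code or the Z specifications are pulled out of the literate document, can be sketched as a filter on element names. The tag names used here (section, text, zspec, code) and the entity names are hypothetical and do not reproduce our DTD for literate programming; Python's `html.parser.HTMLParser` again serves only as a lenient tag scanner.

```python
from html.parser import HTMLParser


class ChunkExtractor(HTMLParser):
    """Collects the raw content of every element with a given name."""

    def __init__(self, wanted):
        # Keep entity references such as &oplus; unexpanded so that the
        # extracted Z text can be handed to another tool as-is.
        super().__init__(convert_charrefs=False)
        self.wanted = wanted
        self.depth = 0    # nesting depth inside wanted elements
        self.chunks = []  # one string per extracted element

    def handle_starttag(self, tag, attrs):
        if tag == self.wanted:
            self.depth += 1
            if self.depth == 1:
                self.chunks.append("")

    def handle_endtag(self, tag):
        if tag == self.wanted and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks[-1] += data

    def handle_entityref(self, name):
        if self.depth:
            self.chunks[-1] += f"&{name};"


def extract(document, element):
    """Return the contents of all elements named `element`, in order."""
    parser = ChunkExtractor(element)
    parser.feed(document)
    parser.close()
    return parser.chunks


# A hypothetical literate fragment mixing prose, Z and code.
DOC = (
    "<section><text>Update the page table.</text>"
    "<zspec>working' = working &oplus; {a? &map; p?}</zspec>"
    "<code>table[a] = p;</code></section>"
)
```

Here `extract(DOC, "code")` plays the role of Tangle, while `extract(DOC, "zspec")` yields the Z text, still in interchange form, ready to be converted for a type-checker.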
4 Word Processors, SGML, Publishing, the World-Wide Web and Z

There is growing interest in and support for SGML, because of the pressure to standardize the representation of documents. A number of different groups have recognized that standards are beneficial and are promoting the use of SGML. For example, some editors recommend that their writers use SGML to create their manuscripts, and some organizations require that the documentation of their contracted systems be delivered in SGML. Because of this growing support, some word processors such as FrameMaker¹² already implement some support for SGML, while other companies have promised that new versions of their products will support SGML. Specifically, the vendors of Microsoft Word and WordPerfect have recently announced that they will provide SGML support in their products. Using these tools with the Z Interchange Format will make it easier to author Z specifications. Furthermore, the specifications will be easily portable to Z tools to have them verified.

The World-Wide Web [13] and its clients (Netscape, Mosaic, etc.) present another example of SGML-compliant tools that might be used in conjunction with the Z Interchange Format. The World-Wide Web is a distributed repository of hypertext documents located on the Internet. Documents on the World-Wide Web are tagged with HTML. HTML is a markup language defined with a DTD (similarly to the way the Z Interchange Format is defined). Version 2 of HTML does not include most of the entities required in the Z Interchange Format, but it is possible that in the future it will be augmented for that purpose. Thus, documents containing Z specifications could be stored and browsed on the World-Wide Web¹³.

¹² FrameMaker is a trademark of Frame Technologies.
5 Conclusions

Our limited experiments with available tools strongly support the thesis that the Z Interchange Format fulfills its objectives: it is an effective way to interchange Z documents between different tools and platforms. Furthermore, the results of these experiments indicate that the Z Interchange Format can be used to create editors for Z using standard SGML tools, and that Z specifications can be easily included in other SGML documents. The Z Interchange Format will make it easier to share specifications, as well as make them accessible through inclusion in books, documentation and the World-Wide Web. Furthermore, by choosing SGML for the Z Interchange Format, the processing of Z documents can increasingly benefit from current and future SGML technology.
Acknowledgments Many thanks to our anonymous referees for their constructive comments.
References

1. L. Lamport, LaTeX User's Guide & Reference Manual. Reading, Massachusetts, USA: Addison-Wesley Publishing Company, 1986.
2. J. Spivey, "A guide to the zed style option." Oxford University Computing Laboratory, December 1990.
3. J. Spivey, The fuzz Manual. Computing Science Consultancy, 2 Willow Close, Garsington, Oxford OX9 9AN, UK, 2nd ed., 1992.
4. P. King, "Printing Z and Object-Z LaTeX documents." Department of Computer Science, University of Queensland, May 1990.
5. S. M. Brien and J. E. Nicholls, "Z base standard," Technical Monograph PRG-107, Oxford University Computing Laboratory, 11 Keble Road, Oxford, UK, Nov. 1992. Accepted for standardization under ISO/IEC JTC1/SC22.
6. C. Goldfarb, SGML Handbook. Oxford University Press, 1990. (0-19-853737-9).
7. J. M. Spivey, The Z Notation: A Reference Manual. Prentice Hall International Series in Computer Science, 2nd ed., 1992.
8. D. Cowan, E. Mackie, G. Pianosi, and G. d. V. Smit, "Rita - An Editor and User Interface for Manipulating Structured Documents," Electronic Publishing, Origination, Dissemination and Design, vol. 4, pp. 125-150, September 1991.
9. Inforium Inc., Waterloo Ont. Canada, LivePage Browser Tutorial and Reference Manual, Version 2.0, 1994.
10. D. E. Knuth, "Literate programming," The Computer Journal, vol. 27, pp. 97-111, May 1984.
11. A. Ryman, "Formal Methods and Literate Programming," in Proceedings of the Third IBM Software Engineering ITL, June 1993.
12. D. Morales-German, "An SGML based Literate Programming Environment," in Proceedings of the 1994 CAS Conference, pp. 42-49, November 1994.
13. T. Berners-Lee, R. Cailliau, A. Luotonen, H. F. Nielsen, and A. Secret, "The World Wide Web," Communications of the ACM, vol. 37, pp. 76-82, August 1994.

¹³ Visit http://www.comlab.ox.ac.uk/archive/z/html-z.html on the World-Wide Web for more information.