Teaching Network Programming with Ada and Lower Layer

Jesus M. Gonzalez-Barahona, Jose Centeno-Gonzalez, Pedro de las Heras-Quiros,
Francisco J. Ballesteros-Camara, Luis Lopez-Fernandez

Grupo de Sistemas y Comunicaciones (GSyC)
Universidad Carlos III de Madrid
Butarque 15, E-28911 Leganes (Madrid), Spain
Phone no: +(34-91) 624-9497. Fax no: +(34-91) 624-9430
e-mail: [email protected]

July 24, 1998

Abstract
After some years of using a classical approach to teach network programming (C and BSD sockets), we decided to try a different approach, based on Ada as the programming language and Lower Layer as the communication library. After two years of using this new approach in several undergraduate courses, we present here our experience and the lessons learned. In short, the new approach has proven to offer an environment easier to study, to understand, and to use for implementing protocols. Some of these advantages are due to the benefits of using Ada, while others are due to the cleaner interface of the Lower Layer library, compared to BSD sockets.
1 Introduction

Undergraduate students attending courses related to Computer Networks in our University are required to do several practical assignments. Our classical approach for the design of the assignments of the first of these courses was based on programming on top of BSD sockets [LFJ+ 86], using C [KR88]. We used this approach with our students for three years. After that experience we decided that these tools were inappropriate for teaching purposes, due both to the language and to the inherent complexity of the BSD sockets interface. Students wasted valuable time because of their common misuse of the C language and the oddity of the BSD sockets interface. This shortened the time students could devote to learning issues specific to communication protocols, which is the subject of the courses. It is worth mentioning that remaining close to the network layer was also a prominent goal. We wanted the abstraction level provided by BSD sockets, but not the programming interface they offer.

In order to help students improve their learning of network programming we planned to replace both the C programming language and the BSD sockets library. As language, Ada [ISO95] was chosen, since we believe it might be one of the best choices to help students avoid the classical problems with C. Students have previously attended only one semester on Programming, using mainly Pascal. As communication library, we have been using the Lower Layer library [GBdlHQCG97a]. Lower Layer has been designed to offer a clean programming interface, adequate for building communication protocols on top of it, achieving independence from the actual underlying communication system. For our students, the implementation of Lower Layer on top of UDP BSD sockets has been used.

Two years later, this new approach has proven to be even more successful than we expected. Students now finish their assignments earlier and reach a deeper degree of network programming skills.

In this paper we want to share our experience, and highlight the benefits we have obtained from it. The structure of this paper is as follows: in the following section, the major problems we detected using C and BSD sockets are described. In section 3, an overview of the Lower Layer system is presented. In section 4, a set of examples of using Lower Layer for teaching is shown. Results from this teaching experience are discussed in section 5. Other approaches for teaching communication protocols and related work are enumerated in section 6. Section 7 concludes the paper and gives some directions for future work.

Note: a version of this document was presented at the TriAda'97 Conference, St. Louis, Missouri, USA, November 1997, and published in the corresponding proceedings. The ACM holds the copyright of that version.
2 The C language, the BSD sockets interface, and their problems

The BSD sockets library over the TCP/IP stack [Ste90] is perhaps the most widely chosen tool to teach network programming in practice. Our students learned to program mainly over the UDP transport protocol [Pos80], and designed and implemented their own protocols based on UDP datagrams. Since UDP is unreliable, datagram-oriented and functionally similar to raw IP [Pos81a], it is especially well suited for teaching how to implement protocols similar to TCP [Pos81b]. The BSD sockets library and its interface are written in C, and students also programmed using this language. As an example, this is the piece of code that our students used to send a single message from a client to a server process, and to receive its reply:

    /* open a UDP socket (Internet datagram socket) */
    if ( (sockfd = socket (AF_INET, SOCK_DGRAM, 0) ) < 0)
        err_dump ("client: can't open datagram socket");

    /* bind our local address so that server can send to us */
    bzero ( (char *) &cli_addr, sizeof (cli_addr) );
    cli_addr.sin_family      = AF_INET;
    cli_addr.sin_addr.s_addr = htonl (INADDR_ANY);
    cli_addr.sin_port        = htons (0);
    if (bind (sockfd, (struct sockaddr *) &cli_addr, sizeof (cli_addr) ) < 0)
        err_dump ("client: can't bind local address");

    /* construct server address so that we can send to it */
    bzero ( (char *) &serv_addr, sizeof (serv_addr) );
    serv_addr.sin_family      = AF_INET;
    serv_addr.sin_addr.s_addr = inet_addr (SERV_HOST_ADDR);
    serv_addr.sin_port        = htons (SERV_UDP_PORT);
    servlen = sizeof (serv_addr);

    /* copy message to buffer */
    strcpy (buffer1, "Hello world\n");

    /* send buffer contents */
    if (sendto (sockfd, buffer1, strlen (buffer1), 0,
                (struct sockaddr *) &serv_addr, servlen) != strlen (buffer1))
        err_dump ("client: sendto error");
    printf ("client: string sent=%s\n", buffer1);

    /* receive into buffer, leaving room for a terminating '\0'
       (this assumes buffer2 holds MAXMESG bytes) */
    n = recvfrom (sockfd, buffer2, MAXMESG - 1, 0,
                  (struct sockaddr *) &serv_addr, &servlen);
    if (n < 0)
        err_dump ("client: recvfrom error");
    buffer2[n] = '\0';

    /* print buffer contents */
    printf ("client: string received=%s\n", buffer2);

    /* close socket */
    close (sockfd);
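The fragment above leaves out the declarations and the server side. For readers who want to run something similar, the following C sketch plays both roles in a single process over the loopback interface. It is an illustration only, not the code handed to students: the function names `open_loopback_udp` and `udp_echo_round_trip`, the 512-byte server buffer and the two-second receive timeout are all assumptions, and `memset` replaces the older `bzero`.

```c
#define _POSIX_C_SOURCE 200112L
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Opens a UDP socket bound to 127.0.0.1 on a port chosen by the kernel.
 * Returns the descriptor (or -1) and stores the bound address in *addr. */
int open_loopback_udp(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    struct timeval timeout = { 2, 0 };   /* so recvfrom never blocks forever */
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = htons(0);           /* 0 = let the kernel pick a port */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout)) < 0 ||
        bind(fd, (struct sockaddr *)addr, sizeof(*addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Sends `request` from a client socket to a server socket, lets the server
 * echo it back, and copies the reply into `reply` (always NUL-terminated).
 * Returns the reply length, or -1 on any error. */
int udp_echo_round_trip(const char *request, char *reply, size_t reply_size)
{
    struct sockaddr_in server_addr, client_addr, peer;
    socklen_t peer_len = sizeof(peer);
    char buffer[512];
    ssize_t n;
    int result = -1;
    int server, client;

    assert(request != NULL && reply != NULL && reply_size > 0);
    server = open_loopback_udp(&server_addr);
    client = open_loopback_udp(&client_addr);
    if (server < 0 || client < 0)
        goto done;

    /* client -> server */
    if (sendto(client, request, strlen(request), 0,
               (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0)
        goto done;

    /* the server learns the client's address from recvfrom */
    n = recvfrom(server, buffer, sizeof(buffer), 0,
                 (struct sockaddr *)&peer, &peer_len);
    if (n < 0)
        goto done;

    /* server -> client: echo the datagram back */
    if (sendto(server, buffer, (size_t)n, 0,
               (struct sockaddr *)&peer, peer_len) < 0)
        goto done;

    /* the client receives the reply, keeping one byte for the terminator */
    n = recvfrom(client, reply, reply_size - 1, 0, NULL, NULL);
    if (n < 0)
        goto done;
    reply[n] = '\0';
    result = (int)n;

done:
    if (server >= 0)
        close(server);
    if (client >= 0)
        close(client);
    return result;
}
```

Calling `udp_echo_round_trip("Hello world\n", reply, sizeof(reply))` with a 64-byte `reply` array should return the length of the request and leave a copy of it in `reply`. Even in this reduced form, most of the lines deal with address structures, casts and byte order rather than with the exchange itself.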
Having used this approach for teaching purposes for three years, we have detected several problems, which fall into one of the following three categories:

- Students enroll in communication courses without any knowledge of the C programming language, since it is not taught in the courses on Programming. Therefore, students must learn the non-trivial programming tricks of C together with the actual communication matters. This leads them to build incorrect programs, usually because of the misuse of C. Inadequate management of C pointers and dynamic data structures is a clear example, along with the absence of strong type checking.

- The BSD sockets library was not designed for teaching, but for efficiency and flexibility. There is a pocketful of recipes for using sockets that students just "cut and paste" from books into their code, without understanding them at all. BSD sockets provide a connection-oriented interface, which fits well with reliable stream protocols such as TCP, but not with datagram-oriented ones. The UDP interface of BSD sockets suffers from this problem. Nevertheless, teaching how to build protocols using datagrams seems to be more convenient.

- Using these tools, students learned (at most) how to program the BSD sockets library, but not the rationale of programming communication protocols over a datagram-oriented interface, which is the real goal of the course.
3 The Lower Layer library

Lower Layer is the lowest level of the Simple Com system [GBCGdlHQ97], a tool-box designed for building and using unicast and multicast protocols offering different qualities of service. It is written entirely in Ada. Lower Layer provides the services that enable Simple Com to work on top of different communication systems. Currently there are implementations of Lower Layer over TCP sockets, UDP sockets, and Transis [DM96].

The main design goal of Lower Layer is simplicity. It tries to hide from its users the peculiarities of the underlying communication system, providing a simple datagram-oriented interface. Lower Layer is implemented making aggressive use of object-oriented programming techniques and child libraries. This eases the extension of the library to include access to new communication systems and protocols, while maintaining a uniform interface for all of them.
3.1 Communication Model
The communication model provided by Lower Layer is datagram-oriented. The system is viewed as a set of processes interconnected by a message-passing network. This network enables any process to send data to, and receive data from, any other process. A process can receive messages sent to any proper address, that is, any unicast address corresponding to the host machine where the process resides, or any multicast address. The operation used for indicating the willingness to receive data directed to a given address is called `binding'. A process can also refuse to accept messages directed to a given address by executing an `unbind' operation. At any given instant, a process can be bound to any number of addresses.

A process can always send data to any address, just by specifying it as the destination address. However, it can only receive messages sent to addresses to which it is currently bound. The `receive' operation blocks the caller if no messages for the specified address are currently available. A complementary mechanism is also provided for registering handlers, subprograms which will be called asynchronously when a message for a given address is received. If handlers are used, they are specified when binding the corresponding addresses.

With respect to qualities of service, the basic model offers none (or best effort, which is just an optimistic way of saying the same). Only if the underlying transport system provides some quality of service should the implementation of Lower Layer for that system also provide it, if possible.
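The rules of this model can be tried out in a few lines. The C sketch below is a toy, single-process model of them (bind, unbind, send, receive, handlers), not the Lower Layer implementation, which is written in Ada. Integer addresses, the one-message queue per address, the non-blocking receive and every name in the listing are assumptions made to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BINDINGS 8
#define MAX_MESSAGE  64

typedef void (*handler_fn)(int address, const char *data);

struct binding {
    int        in_use;
    int        address;
    handler_fn handler;              /* NULL: messages wait for model_receive */
    int        has_pending;
    char       pending[MAX_MESSAGE];
};

static struct binding bindings[MAX_BINDINGS];

/* Example handler: only counts how many messages it was given. */
int handled_count = 0;
void counting_handler(int address, const char *data)
{
    (void)address;
    (void)data;
    handled_count++;
}

static struct binding *find_binding(int address)
{
    int i;
    for (i = 0; i < MAX_BINDINGS; i++)
        if (bindings[i].in_use && bindings[i].address == address)
            return &bindings[i];
    return NULL;
}

/* Declares willingness to receive on `address`. Returns 0, or -1 if the
 * address is already bound or the table is full. */
int model_bind(int address, handler_fn handler)
{
    int i;
    if (find_binding(address) != NULL)
        return -1;
    for (i = 0; i < MAX_BINDINGS; i++) {
        if (!bindings[i].in_use) {
            bindings[i].in_use = 1;
            bindings[i].address = address;
            bindings[i].handler = handler;
            bindings[i].has_pending = 0;
            return 0;
        }
    }
    return -1;
}

/* Stops accepting messages for `address`. Returns 0, or -1 if unbound. */
int model_unbind(int address)
{
    struct binding *b = find_binding(address);
    if (b == NULL)
        return -1;
    b->in_use = 0;
    return 0;
}

/* Sending is always allowed. Returns 1 if the message was delivered and
 * 0 if it was silently dropped (best effort: nobody bound, queue full,
 * or message too long). */
int model_send(int address, const char *data)
{
    struct binding *b = find_binding(address);
    assert(data != NULL);
    if (b == NULL || strlen(data) >= MAX_MESSAGE)
        return 0;
    if (b->handler != NULL) {        /* handler registered at bind time */
        b->handler(address, data);
        return 1;
    }
    if (b->has_pending)
        return 0;
    strcpy(b->pending, data);
    b->has_pending = 1;
    return 1;
}

/* Returns 1 and copies the pending message, 0 if none is pending (the
 * real operation would block here), or -1 if `address` is not bound. */
int model_receive(int address, char *out, size_t out_size)
{
    struct binding *b = find_binding(address);
    assert(out != NULL && out_size >= MAX_MESSAGE);
    if (b == NULL)
        return -1;
    if (!b->has_pending)
        return 0;
    strcpy(out, b->pending);
    b->has_pending = 0;
    return 1;
}
```

In this model a message sent to an address nobody has bound is simply lost, a message sent to a bound address without a handler waits for `model_receive`, and a message sent to an address bound with `counting_handler` is consumed immediately. Receiving on an unbound address is reported as an error, which corresponds to the unbound-address condition described in the text.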
3.2 Library Specification

We provide here the specification of Lower Layer that our students use. It should be noted that Lower Layer is a set of hierarchical libraries, with child packages for every communication protocol over which it has an actual implementation. To avoid exposing our students to some unnecessarily complex Ada constructions, we provide them with this more simply written interface, specialized for the UDP protocol.

    with Lower_Layer.Inet.UDP.Uni;
    with Ada.Streams;

    package Lower_UDP_Uni is

       -- Subtype for declaring buffer variables:
       subtype Buffer_Type is Lower_Layer.Stream;

       -- Resets a buffer, emptying all data out of it
       procedure Reset(A_Buffer: in out Buffer_Type);

       -- Subtype for declaring communication end-points
       subtype Address_Type is Lower_Layer.Address_CA;

       -- Type for declaring accesses to procedures which will
       -- handle reception of incoming data
       type Handler_Type is access
         procedure (To:   in     Address_Type;
                    Data: access Buffer_Type);

       -- Builds an Address from an IP address and a port number
       function Build(IP: String; Port: Natural) return Address_Type;

       -- Binds to an Address in order to be able to receive
       -- data on it.
       -- (a) If no handler is provided, a call to Receive procedure
       --     is needed to receive the incoming data
       -- (b) If A_Handler is provided, such procedure will be
       --     asynchronously called when some data for the
       --     specified address arrives.
       procedure Bind(An_Addr:   in Address_Type;
                      A_Handler: in Handler_Type := null);

       -- Builds an Address for the local machine, and binds to it
       -- to be able to receive data on it.
       -- (a) If no handler is provided, a call to Receive procedure
       --     is needed to receive the incoming data
       -- (b) If A_Handler is provided, such procedure will be
       --     asynchronously called when some data for the
       --     specified address arrives.
       procedure Bind_Any(An_Addr:   out Address_Type;
                          A_Handler: in  Handler_Type := null);

       -- Disables the binding to an Address.
       procedure Unbind(An_Addr: in Address_Type);

       -- Sends buffer contents (accessed by Data) to To address
       procedure Send(To:   in     Address_Type;
                      Data: access Buffer_Type);

       -- Retrieves data sent to To address, and places it into the
       -- buffer accessed by Data.
       procedure Receive(To:   in     Address_Type;
                         Data: access Buffer_Type);

       -----------------------------------------------------
       -- Exceptions.
       -----------------------------------------------------

       -- Error while binding.
       Binding_Error: exception;

       -- Error while unbinding.
       Unbinding_Error: exception;

       -- Error sending.
       Sending_Error: exception;

       -- Error receiving.
       Receiving_Error: exception;

       -- An operation which requires binding was issued on an
       -- unbound address.
       Unbound_Address: exception;

       -- Internal error, probably due to a bug.
       Internal_Error: exception;

       -- An Address_Type was null.
       Null_Address: exception;

       -- Incorrect format for an address.
       Bad_Address: exception;

       -- Forbidden call to subprogram
       Forbidden_Call: exception;

    end Lower_UDP_Uni;
4 Using Lower Layer

Lower Layer offers a simple datagram-oriented programming interface, consisting of two abstractions and a collection of subprograms. Addresses are used to identify the destinations of communications. Buffers, an abstraction based on Ada Streams, carry the data to be sent or received. There are also subprograms for sending from and receiving into buffers, for building addresses, for preparing addresses for reception, and for releasing addresses prepared to receive.

The procedure to follow in order to send data to another process is:

1. Build the destination address from its basic components (in the case of IP protocols, the address consists of the IP number and the port).
2. Store the data to be sent into a buffer. For this task we use Ada streams, which provide support for automatic marshalling/unmarshalling.
3. Send the buffer contents to the destination address.

The procedure a process must follow in order to receive data from another one is:

1. Build and bind a local address for receiving data.
2. Receive data directed to that local address into a buffer.
3. Extract the received data from the buffer.
As an example, this is the piece of code our students can use to send a single message from a client to a server process and to receive its response:

    begin
       -- build server end point
       Server_Address := Build(Server_Ip_Addr, Server_Port);

       -- prepare a local address to receive
       Bind_Any(Client_Address);

       -- put local end point into buffer so server can reply to us
       Address_Type'Output(Buffer_1'Access, Client_Address);

       -- put request on the buffer
       String'Output(Buffer_1'Access, Request);

       -- send buffer contents
       Send(Server_Address, Buffer_1'Access);
       Put_Line("client: string sent: " & Request);

       -- receive into a buffer
       Receive(Client_Address, Buffer_2'Access);

       -- extract reply from buffer
       Reply := String'Input(Buffer_2'Access);
       Put_Line("client: string received: " & Reply);

       -- unbind local address
       Unbind(Client_Address);

    exception
       when Binding_Error =>
          Put_Line("client: can't bind");
       when Sending_Error =>
          Put_Line("client: sending error");
       when Receiving_Error =>
          Put_Line("client: receiving error");
       when others =>
          Put_Line("client: unexpected exception");
    end;
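To appreciate what the 'Output and 'Input attributes save, it may help to see the same kind of marshalling written by hand. The C sketch below packs two strings into a byte buffer and extracts them again. The wire format (a 16-bit big-endian length followed by the characters), the `struct buffer` type and the function names are assumptions chosen for illustration; this is not the encoding used by Ada streams or by Lower Layer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct buffer {
    uint8_t bytes[256];
    size_t  used;       /* bytes written so far */
    size_t  read_pos;   /* next byte to extract */
};

/* Empties the buffer. */
void buffer_reset(struct buffer *b)
{
    assert(b != NULL);
    b->used = 0;
    b->read_pos = 0;
}

/* Appends a 16-bit big-endian length followed by the characters of `s`.
 * Returns 0 on success, -1 if the string does not fit. */
int buffer_put_string(struct buffer *b, const char *s)
{
    size_t len;

    assert(b != NULL && s != NULL);
    len = strlen(s);
    if (len > UINT16_MAX || b->used + 2 + len > sizeof(b->bytes))
        return -1;
    b->bytes[b->used++] = (uint8_t)(len >> 8);
    b->bytes[b->used++] = (uint8_t)(len & 0xFF);
    memcpy(b->bytes + b->used, s, len);
    b->used += len;
    return 0;
}

/* Extracts the next string into `out`, NUL-terminated.
 * Returns its length, or -1 if the buffer is exhausted, malformed,
 * or `out` is too small. */
int buffer_get_string(struct buffer *b, char *out, size_t out_size)
{
    size_t len;

    assert(b != NULL && out != NULL);
    if (b->used - b->read_pos < 2)
        return -1;
    len = ((size_t)b->bytes[b->read_pos] << 8) | b->bytes[b->read_pos + 1];
    if (len > b->used - b->read_pos - 2 || len + 1 > out_size)
        return -1;
    memcpy(out, b->bytes + b->read_pos + 2, len);
    out[len] = '\0';
    b->read_pos += 2 + len;
    return (int)len;
}
```

A client in the style of the example would call `buffer_put_string` twice, once with a textual form of its own address and once with the request, and the server would call `buffer_get_string` twice in the same order. Every bounds check in the listing is one the programmer must remember to write.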
5 Results from experience

We have been using Lower Layer and Ada to teach network programming during the last two years. We used programming assignments similar to those proposed during the previous three years with BSD sockets and the C language. Students do not have any previous knowledge of the Ada language, having received only a six-month CS1-level course on the basics of programming using Pascal. We use the first two weeks of our course to introduce Ada to the students, covering only those language topics that they will need in their assignments.

To illustrate the kind of assignments we propose, let us explain one of the most significant: implementing TFTP on top of BSD UDP sockets or Lower Layer (also using UDP). TFTP (Trivial File Transfer Protocol) [Sol92] is a simple protocol for transferring files on top of UDP that uses a simple stop-and-wait scheme to achieve reliability. Students can optionally add improvements such as support for multiple concurrent clients.

The results of the experience can be summarized as follows:

- Students learn Lower Layer basics in around one third of the time they used to need to learn the basics of BSD sockets.

- They finish their implementation of TFTP over Lower Layer in about one sixth of the time they spent with BSD sockets.

- We are asked a smaller number of questions by students during their assignments. Most questions are now related to protocol implementation details, whereas in previous years most questions were about the sockets library and C pointer handling.

- Final implementations are quite robust and reliable. BSD sockets versions usually crashed, mainly due to incorrect usage of pointers.

- The Ada compiler now catches most of the usual problems. Most runtime errors are now due to the misuse of the Buffer abstraction (implemented with Ada Streams). These errors are usually quickly located and fixed by the students themselves, with the help of the compiler. With BSD sockets, the most frequent runtime errors were caused by bad pointer handling. They were difficult to locate, and persisted even in completely finished programs.

- Errors in the protocol logic are clearly isolated from incorrect use of Lower Layer and Ada, which helps in their analysis and correction. When using C, students usually found it difficult to decide whether the wrong behavior of their programs was due to errors in the coding or in the logic of the protocol they used.

From the students' point of view, end-of-course surveys reflected the following aspects:

- The most frequently mentioned disappointment about assignments in general is compiling time. In previous years it was the C language (as a whole).

- The answer to the question "What is the relation between what you have learned in Theory classes and what you do in assignments?" is now "strong" or "very strong". With C and BSD sockets they used to answer "weak" for the same assignments. This probably reflects the fact that they now spend a larger fraction of the time dealing with the protocols, not with the programming language.

- Students consider that the "difficulty to complete assignments" is now "normal", while in past years they considered it "hard" or "very hard".

From the equipment point of view, students have complained about slow compilations (a phenomenon not perceived when using the C compiler). We have traced this problem to the lack of enough RAM in the computers. From experimentation, it seems that 32 MB is a good minimum for our environment, but the computers in our lab are currently equipped with just 16 MB. An upgrade is currently in the works.

Summarizing, we have found that students now learn more, better and more quickly about protocols, and that they find their work more related to theory, which is one of the goals of the assignments.
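For readers unfamiliar with the TFTP assignment mentioned in this section, the core of its stop-and-wait scheme can be sketched as follows. The packet layout follows RFC 1350 [Sol92]: a DATA packet carries opcode 3, a 16-bit block number and up to 512 bytes, an ACK carries opcode 4 and the block number, and a DATA packet shorter than 512 bytes ends the transfer. The rest is an assumption made for illustration: the network is replaced by a callback that says whether a given transmission and its acknowledgement got through, and all function names are hypothetical. It is written in C, not Ada, and is not taken from the students' solutions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TFTP_OP_DATA    3u
#define TFTP_OP_ACK     4u
#define TFTP_BLOCK_SIZE 512u

/* Writes a DATA packet (opcode, block number, payload) and returns its size. */
size_t tftp_build_data(uint8_t *packet, uint16_t block,
                       const uint8_t *data, size_t len)
{
    assert(packet != NULL && len <= TFTP_BLOCK_SIZE);
    packet[0] = 0;
    packet[1] = TFTP_OP_DATA;
    packet[2] = (uint8_t)(block >> 8);
    packet[3] = (uint8_t)(block & 0xFF);
    if (len > 0)
        memcpy(packet + 4, data, len);
    return 4 + len;
}

/* Writes an ACK packet for `block` and returns its size (always 4). */
size_t tftp_build_ack(uint8_t *packet, uint16_t block)
{
    assert(packet != NULL);
    packet[0] = 0;
    packet[1] = TFTP_OP_ACK;
    packet[2] = (uint8_t)(block >> 8);
    packet[3] = (uint8_t)(block & 0xFF);
    return 4;
}

/* Returns nonzero if `packet` is a well-formed ACK for `block`. */
int tftp_is_ack_for(const uint8_t *packet, size_t len, uint16_t block)
{
    return len == 4 && packet[0] == 0 && packet[1] == TFTP_OP_ACK &&
           packet[2] == (uint8_t)(block >> 8) &&
           packet[3] == (uint8_t)(block & 0xFF);
}

/* Simulated network: nonzero if the n-th transmission and its ACK arrive. */
typedef int (*round_trip_fn)(unsigned transmission);

int perfect_channel(unsigned transmission)
{
    (void)transmission;
    return 1;
}

int drop_every_third(unsigned transmission)
{
    return transmission % 3 != 2;
}

/* Sends `file` block by block, retransmitting each block until it is
 * acknowledged or `max_tries` is reached. Returns the total number of
 * DATA transmissions, or -1 if some block was never acknowledged. */
long tftp_stop_and_wait(const uint8_t *file, size_t file_len,
                        round_trip_fn round_trip_ok, unsigned max_tries)
{
    uint8_t data_packet[4 + TFTP_BLOCK_SIZE];
    uint8_t ack_packet[4];
    size_t offset = 0;
    uint16_t block = 1;               /* TFTP data blocks start at 1 */
    unsigned transmissions = 0;

    assert(file != NULL && round_trip_ok != NULL);
    for (;;) {
        size_t chunk = file_len - offset;
        unsigned tries = 0;
        int acked = 0;

        if (chunk > TFTP_BLOCK_SIZE)
            chunk = TFTP_BLOCK_SIZE;
        tftp_build_data(data_packet, block, file + offset, chunk);

        while (!acked && tries < max_tries) {
            tries++;
            if (round_trip_ok(transmissions++)) {
                /* stands in for the receiver's reply */
                size_t ack_len = tftp_build_ack(ack_packet, block);
                acked = tftp_is_ack_for(ack_packet, ack_len, block);
            }
            /* otherwise a timeout would fire and the block is resent */
        }
        if (!acked)
            return -1;
        offset += chunk;
        if (chunk < TFTP_BLOCK_SIZE)  /* a short block ends the transfer */
            return (long)transmissions;
        block++;
    }
}
```

With `perfect_channel`, a 1000-byte file needs two transmissions (512 and 488 bytes), while a 1024-byte file needs three, because a final empty block is required to signal the end. With `drop_every_third` the third transmission is lost and repeated, so the same 1024-byte file needs four.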
6 Related Work

Practical assignments about network programming usually fall into one of these categories:

- design of protocols using protocol simulators
- actually programming over a real network

The former approach can be followed with simulators like the one described in [Tan96]. Students must understand the behavior of the simulated protocols, and then modify them in order to achieve different goals. We use similar written exercises in Theory classes, but we prefer actual protocol coding for assignments.

In real network programming, the BSD sockets library over the TCP/IP stack is commonly chosen. Another functionally similar library is the Unix System V TLI [OMI86]. Both of them suffer from the same problem when compared with Lower Layer: the latter offers a cleaner and more intuitive programming interface to a datagram-oriented service.

Building a higher abstraction level library over BSD sockets is also frequent. Using such a library for teaching purposes is the case of [Tol95], written in C++. Other toolkits that could be used for this purpose are the ACE toolkit [Sec94] (also written in C++), and that of [Arn95] (written in C). We prefer the Lower Layer approach as it remains as close to the network as UDP BSD sockets are.

Some work has been done to provide thin Ada bindings to the BSD sockets library, such as Distributed Communications [CKF90] and PARADISE [Cou94] (both written in Ada 83). However, they usually offer just a C-like interface written in Ada. For our teaching purposes, only the problems related to the C language would be solved if this approach were used. What is more, these bindings tend to show the whole complexity of the BSD sockets interface, which turns out to be unnecessary (and sometimes even inconvenient) for teaching purposes.

Also, no implementation of the Ada Distributed Systems Annex [Int95] was chosen, because of its high-level programming interface, although we currently plan to use one in a more advanced course on these matters. The same applies to Garlic [KPT95], the Partition Communication Subsystem of the Glade implementation of the Ada Distributed Systems Annex.
7 Conclusion, future work and availability

With Lower Layer and Ada students seem to learn faster and more easily, and they gain deeper knowledge than with BSD sockets and C. Moreover, Ada has proven to be an easier language to learn than C for them, at least for programming the communication protocols we teach.

We are currently developing a graphical tool written in Ada for visualizing the runtime internals of Simple Com and Lower Layer. We will use it to visualize the protocols the students are developing, in order to help them in debugging. We are also exploring the idea of teaching communication protocols using the whole Simple Com system in an advanced network programming course.

Lower Layer and Simple Com are available at ftp://ftp.gsyc.inf.uc3m.es/pub/simple_com under the GPL. They have been tested with the GNAT Ada compiler on the NetBSD and Linux operating systems.
A Environment

The laboratory used for assignments has 20 i486 PCs connected by an Ethernet, each one equipped with 16 MBytes of RAM. They are fully connected to the Internet, and run NetBSD, a free BSD Unix-like operating system. We recommend that our students use XEmacs as an integrated development environment. They use the emacs ada-mode for syntax highlighting and other language-oriented features. They also invoke the GNAT compiler from XEmacs. The basic debugging tool (also integrated with XEmacs) is GNU gdb, with Ada patches. As an alternative, students can choose DDD, a graphical front-end to gdb. All this software is free. A full review of this environment [GBdlHQCG97b] is available at: http://www.gsyc.inf.uc3m.es/~jgb/ada/ada_tools/ada_tools.html
References

[Arn95] David M. Arnow. Xdp: A simple library for teaching a distributed programming module. In Curt M. White, editor, Twenty-Sixth SIGCSE Technical Symposium on Computer Science Education, volume 27. ACM SIGCSE, March 1995.

[CKF90] Joe Cross, Mike Kamrad, and Sylvester Fernandez. Distributed communications. Ada Letters, Fall 1990.

[Cou94] N. Courtel. PARADISE: Package of Asynchronous Real-Time Ada Drivers for Interconnected Systems Exchange. GNU, 3.4 edition, January 1994.

[DM96] Danny Dolev and Dalia Malki. The Transis approach to high availability cluster communication. Communications of the ACM, 39(4), April 1996.

[GBCGdlHQ97] Jesus M. Gonzalez-Barahona, Jose Centeno-Gonzalez, and Pedro de las Heras-Quiros. Overview of the Simple Com system. In V Jornadas de Concurrencia, Vigo, Spain, June 1997.

[GBdlHQCG97a] Jesus M. Gonzalez-Barahona, Pedro de las Heras-Quiros, and Jose Centeno-Gonzalez. Lower Layer: A family of interfaces to transport communication protocols. Submitted for publication, 1997.

[GBdlHQCG97b] Jesus M. Gonzalez-Barahona, Pedro de las Heras-Quiros, and Jose Centeno-Gonzalez. A development environment for Ada programming. In VI Jornada Tecnica de Ada-Spain. Ada-Spain, February 1997.

[Int95] Intermetrics. Ada 95 Rationale, chapter E Distributed Systems, page E.1. Intermetrics, January 1995.

[ISO95] ISO. Ada Language Reference Manual. International Standards Organization, 1995.

[KPT95] Y. Kermarrec, L. Pautet, and S. Tardieu. Garlic: Generic Ada reusable library for interpartition communication. In Proceedings of Tri-Ada'95 Conference, Anaheim, California, USA, November 1995.

[KR88] Brian W. Kernighan and Dennis M. Ritchie. The C Programming Language. Prentice Hall, second edition, 1988.

[LFJ+ 86] Samuel J. Leffler, Robert S. Fabry, William N. Joy, Phil Lapsley, Steve Miller, and Chris Torek. An advanced 4.3BSD interprocess communication tutorial. In 4.3BSD Programmer's Supplementary Documents, volume PS1 of 4.3 Berkeley Software Distribution. O'Reilly, Sebastopol, California, USA, 1986.

[OMI86] D.J. Olander, G.J. McGrath, and R.K. Israel. A framework for networking in System V. In Proceedings of the 1986 Summer USENIX Conference, pages 38-45, Atlanta, Ga., USA, 1986. USENIX.

[Pos80] J.B. Postel. User Datagram Protocol. RFC 768, August 1980.

[Pos81a] J.B. Postel. Internet Protocol. RFC 791, September 1981.

[Pos81b] J.B. Postel. Transmission Control Protocol. RFC 793, September 1981.

[Sec94] Stuart Sechrest. An introductory 4.4BSD interprocess communication tutorial. In 4.4BSD Programmer's Supplementary Documents, chapter 20. O'Reilly, Sebastopol, California, USA, April 1994.

[Sol92] K. Sollins. The Trivial File Transfer Protocol. RFC 1350, July 1992. Revision 2.

[Ste90] W. Richard Stevens. Unix Network Programming. Prentice Hall, 1990.

[Tan96] Andrew S. Tanenbaum. Computer Networks. Prentice Hall, 3rd edition, 1996.

[Tol95] William E. Toll. Socket programming in the data communications laboratory. In Curt M. White, editor, Twenty-Sixth SIGCSE Technical Symposium on Computer Science Education, volume 27. ACM SIGCSE, March 1995.