IADIS International Conference Information Systems 2009
DEMOLISHING MYTHS IN MICROCOMPUTING LAB: BUILD A LINUX DEVICE DRIVER
Michael Grivas and Dimitris Kehagias
Department of Informatics, TEI of Athens
Ag. Spyridonos str, Athens, Greece
ABSTRACT
In Computer Science and Informatics courses, students usually learn programming in a highly abstract way. High-level programming, however, does not reveal what the computer really does when it executes a program. Embedded systems courses, on the other hand, cover the lowest levels of computing extensively, but they usually belong to Computer Engineering curricula, where higher-level languages and the respective abstractions are not studied in depth. In the Department of Informatics of the Technological Educational Institute (TEI) of Athens, in the context of the course ‘microcomputers and applications’, we focus on providing the low-level side of programming to students who usually learn high-abstraction structures and programming methods. We found that most students build a perception of low-level computing matters that resembles tribal fetishes. Hence, we built some projects that help students understand the inner parts of an operating system and approach low-level programming and the hardware in a simple way.
KEYWORDS
Microcomputing education, educational device driver
1. INTRODUCTION
In most information technology or computer science studies, students gain a better understanding of programming as an abstract, mathematical procedure than as a set of operations that a computer performs. The lab sessions reveal the distance between “learn”, which is closely related to abstract thinking, and “do”, which is closer to what programmers actually work with. This extends to the point where students become a mere audience to what computing is and how it really works. Furthermore, we found that most students build a distorted perception of common issues such as:
• Hardware is alien to software: They see hardware as a platform on which software runs independently. They do not seem to comprehend the vital bond between the two and the influence that hardware exerts.
• Programming languages arrange things: They believe that modern IDEs take care of everything. They miss the point that things happen in a combination of hardware and software, such as stacks.
• Operating Systems are strange to programming: The usual application-level programming takes the operating system for granted. Students do not clearly grasp that it is simply a set of programs.
• System programming is a foreign term: Extending the previous point, they seem to see system programming as a kind of weird code that special people write.
• Device drivers are a bit of magic: When the discussion comes to device drivers, students seem to consider the subject far beyond their powers, somewhat religiously mystic and totally cryptic.
The solution that has been proposed several times is hands-on experience with operating systems programming [2,4,7,8,9,15]. Unfortunately, only a few computing curricula include such real interaction, and most of those aim at hardware in general. In hardware courses that cover embedded systems, hardware design (VLSI etc.), communications, robotics or programming of devices, system programming is a must [1,3,6,9,10,11,12,16].
The course “Microcomputers and applications” offered at our department aims to clarify the details of computing by studying the parts of a computer and their functionality in terms of software. That includes the inner parts as well as interfacing to the external world (I/O). The primary tool for this purpose is the i386 assembly language, at a beginner's level. By the end of the course, students should have a clear understanding of the computer parts, their functionality and communication, and should be able to program and directly control a microprocessor.
ISBN: 978-972-8924-79-9 © 2009 IADIS
In our lab session, we introduced Linux kernel modules as a project that spans over half of the semester. Kernel modules are a mechanism that allows one to build part of the operating system (or, better said, add-ons to the operating system) without digging into the kernel's code or touching the kernel at all. It does not even require recompiling the kernel. The first results of the project were fairly positive. In part 2 we present the syllabus of the course, including where the misconceptions come from, how they are built and how our course syllabus works against them. In part 3, we present the solution we propose. The conclusion summarizes the way we try to moderate the misconceptions.
2. ABSTRACTION: THE MYTH AND AN ANTITHESIS
By taking various programming courses, students have already become familiar with programming at several levels and depths, such as algorithms, data structures and programming. Although those subjects build a clear understanding and some experience in programming, they normally hide all the lower-level matters that constitute the execution of code in a computer. Explaining all the drawbacks of such a lack of understanding would be too long for this paper and out of its scope. For clarification purposes, we only mention a couple of examples: (a) Thinking of a pointer solely as a data structure hides its actual relation to the computer's memory and does not foster the understanding of common programming problems, such as memory leaks or segment access violations, that may happen even in the most abstract forms of development, like Java. (b) Real-time systems cannot waste even a single cycle of execution, because they are bound to instant response. In such systems, the usual abstract methodology has caused serious problems in the past, since what is considered good programming practice, such as some structured programming rules or object-oriented programming prerequisites, may waste many execution cycles. The same applies to embedded devices, where resources are extremely limited and wasting them is a luxury that cannot be afforded.
At this point, it should be mentioned that abstraction and a mathematical approach are quite common practice in most, if not all, computer science or information technology studies. This practice applies to any curriculum regarding software. Only embedded systems and other computer engineering courses, for obvious reasons, focus on, or at least study in considerable depth, the details of execution on computing devices [3,6,11].
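Example (a) can be made concrete with a few lines of C. The sketch below is only an illustration and is not taken from the course material; the function names element_stride and read_by_address are hypothetical. It shows that a pointer is a memory address on which plain arithmetic can be done, which is exactly the relation that a purely abstract view of pointers hides.

```c
#include <stddef.h>

/* Hypothetical helper: distance in bytes between two neighbouring
 * elements of an int array, computed from their byte addresses. */
size_t element_stride(const int *array)
{
    const char *first  = (const char *)&array[0];
    const char *second = (const char *)&array[1];
    return (size_t)(second - first);
}

/* Hypothetical helper: read element i by doing the address arithmetic
 * by hand, which is what the expression array[i] stands for. */
int read_by_address(const int *array, size_t i)
{
    const unsigned char *base = (const unsigned char *)array;
    return *(const int *)(base + i * sizeof(int));
}
```

Calling element_stride on any int array gives sizeof(int), and read_by_address(v, 2) returns the same value as v[2]. Passing an index outside the array makes the same arithmetic point at memory the program does not own, which is how the access violations mentioned above arise.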
2.1 Microcomputing Syllabus: Counteracting
The main purpose of the course “Microcomputers and applications” is to understand the internals of a computer: the basic units that constitute a computing device, the interconnection of those units and the communication with the external world. In addition, the course provides an in-depth explanation of programming and controlling the units of computing devices through low-level languages, including Assembly. An important consideration in the course is to give students all the necessary knowledge about microcomputer hardware and about writing programs that directly control microprocessors and peripherals. The course presents in detail the architecture of a microprocessor (specifically the 80x86 family), the main memory and its address decoding, the buses, DMA and microcontrollers, as well as functional issues such as I/O from the CPU to peripherals (serial, parallel) and interrupts (hardware and software).
The course is divided into two distinct parts: theory and lab. The laboratory part focuses on providing hands-on experience with the subjects presented in the theory lectures, as well as with the low-level details of programming, including user and system programming. To fulfil its purpose, the laboratory part consists mostly of projects that are mainly done and then presented within lab sessions. Short presentations by the lab instructor are explanatory rather than instructional, aiming to clarify the projects' requirements or details that remain obscure to the majority of the participants.
The lab part of the course used to consist of user-level programs that studied several issues of the underlying system rather macroscopically, by direct hardware I/O in 16-bit 8086 programming or using
operating system calls. The programs that the students built were of moderate difficulty and rather small size, and were mostly finished within the lab session. Each program was accompanied by a small lab report that explained what the students did, reported any problems, difficulties or special issues they came across, and gave their responses to the questions attached to each project. However, this kind of project had three major drawbacks: (a) students' motivation was not stimulated, as they considered the tasks outdated, boring or even trivial, while Assembly seemed a burden for no reason; (b) the projects were done in 16-bit 8086 assembly for simplicity and focus, which reinforced that feeling; and (c) they did not allow students to approach the implementation of the actual infrastructure, the inner parts of the operating system.
3. ERGOTHERAPY AGAINST MYTH
ergo + therapy, from Greek: έργο (= work, labor) + θεραπεία (= therapy)
As explained earlier, one of the major problems is that the teaching of programming relates it mostly to theory and leaves students far from the “real thing”, that is, the computing devices. Furthermore, common studies are based more on theoretical courses and reading than on hands-on classes that provide experience together with knowledge. Probably the most pertinent phrase for the problem is: “Students often feel that education is something that is done to them, rather than something they are actively doing for themselves because they are not encouraged to think critically” [1]. The lab part is usually the opposite. The idea behind it is that students work on actual material. Extending that, our lab part makes students acquire experience with what they feel alienated from. The effect is that the acquired experience eliminates the negative feelings that students usually had towards hardware and operating system programming. Additionally, it offers a very clear and practical understanding of what an operating system is, how it works and what it does. To the best of our knowledge, kernel modules have not been used in microcomputer courses before, nor have they ever been used as a weapon against myths and dogmatic cultures in computing.
3.1 Approaches
In general, as seen in existing curricula, there are three approaches to showing engineering students how the hardware and low-level software work [2]:
• User-level programs: This is probably the most common solution. It consists of (usually small) programs that run at user level and try to exhibit the characteristics of operating systems through system calls. It is exactly the approach we used to work with. It has many administrative advantages, including simplicity and security [7]. However, it does not allow students to see the inner parts and the actual functionality of an operating system, nor does it exhibit the problems that may arise within an operating system. Furthermore, as our experience shows, since the programs have to be quite small and easy, they also tend to be boring, diminishing any excitement or stimulation for new thoughts that the course should provoke.
• Simulations: These are ready-made applications. They have been discussed many times, both in the international arena of computing education [1] and within the department. Normally, simulation programs tend to be very educational, since they are built for that purpose. They are also simple user-level applications that can run on top of any system, without any special administrative requirements. However, these simulators often come with their own learning curve. In addition, they offer no more excitement than building a user-level program (or even less) [4], nor do they offer a clear view of the underlying system, leaving the rest to the students' imagination.
• Actual system programming: From an administrative point of view, real system-level programming causes the greatest difficulties, for a number of reasons: (a) when working with an operating system as is, the code that must be referenced is enormous and quite advanced; (b) programs and programmers must work as the system's superuser (root), which defeats any security scheme and leaves systems totally vulnerable; (c) the system may (and normally will) become unstable and/or cease functioning if the program (or programmer) does something really wrong; (d) the system may collapse beyond hope of revival or restoration if the user deletes parts of the operating system; (e) issues that were out of scope, such as malware or changes to critical filesystem parts, come into play; and last but not least, (f) there must
be a common root password that all students will use, further exposing the whole lab to misuse. However, programming at the system level is a major advantage from the educational perspective: it (a) allows students to see real responses to their code, (b) exhibits all the issues that may arise in low-level programming, (c) allows students to go further in depth or breadth into the operating system, (d) shows how programming is done in the operating system, (e) motivates them to search for more information and (f) improves their self-esteem by making them feel that they do something different, more difficult and of greater merit than common programming, which in turn stimulates further involvement and research [4,8,10].
In our lab session for the course “Microcomputers and applications”, we took into account the pros and cons of each solution. We concluded that Linux kernel modules could be a plausible solution. A survey of other curricula in the areas of microcomputing [6,9,10,15], embedded systems [1,11,12,16] and operating systems [2,4,7,13] revealed that the use of Linux kernel modules is a clear winner from the educational perspective.
Kernel modules effectively allow an addition to a running kernel, dynamically, without recompiling or even rebooting. A kernel module can be loaded and unloaded by root at any time with a simple command. It is then dynamically linked to the executing kernel [5,8,14]. It should be noted that most parts of a contemporary Linux kernel are loaded as modules at boot time or later, without any difference. Most device drivers nowadays are built as kernel modules, instead of being statically linked into the kernel. As for its functionality, a module becomes a part of the operating system after loading, while it remains a somewhat isolated piece of code.
It shares global structures of the running Linux kernel when they are exported, but it cannot modify things that are not specifically made available, nor can it replace parts of the executing kernel. Hence, it functions as an add-on to the kernel itself. As for its programming, a kernel module is a fairly simple piece of code with a start-up and a shutdown function, isolated from the kernel’s gigantic tree of source code files [5]. Hence, (a) kernel modules allow students to see as much of the real operating system code as they like, but (b) they do not have to study or read any part of it, while (c) they may focus and confine their work to one part, regardless of the rest, and finally, (d) they can easily build a working module as a small piece of code. So a piece of work can be isolated and small but still part of the operating system, or as bloated and interwoven as wanted. On the other hand, from the administrator's perspective, kernel modules do not affect the installed operating system. Even in the worst case, that of a malfunction or even a system “crash”, a reboot restores everything to the initial working state [2,4,5,14]. The threats introduced by the use of root as the user can be eliminated by employing virtual machines, as explained later.
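The "start-up and shutdown function" structure described above can be sketched as follows. This is a minimal illustration and not the code handed to students: the names hello_mod, hello_init, hello_exit and the LOG macro are hypothetical. The __KERNEL__ guard is our own addition; the kernel build system defines that symbol, so the same file builds as a module there and as ordinary user-space C elsewhere, which allows a dry run of the two functions without root.

```c
/* hello_mod.c: minimal skeleton of a loadable module (illustrative names). */
#ifdef __KERNEL__
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#define LOG(...) printk(KERN_INFO __VA_ARGS__)
#else
/* Dry-run build outside the kernel tree: log with printf instead. */
#include <stdio.h>
#define LOG(...) printf(__VA_ARGS__)
#endif

/* Start-up function: runs once when the module is loaded.
 * Returning 0 reports success; a non-zero value makes loading fail. */
static int hello_init(void)
{
    LOG("hello_mod: loaded\n");
    return 0;
}

/* Shutdown function: runs once when the module is unloaded. */
static void hello_exit(void)
{
    LOG("hello_mod: unloaded\n");
}

#ifdef __KERNEL__
/* Tell the kernel which functions to call on load and unload. */
module_init(hello_init);
module_exit(hello_exit);
MODULE_LICENSE("GPL");
#endif
```

Built against the kernel headers, the resulting module would be loaded with insmod and removed with rmmod, both run as root; the messages then appear in the kernel log rather than on the terminal. Nothing in the file refers to the rest of the kernel source tree, which is the isolation the text describes.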
3.2 Indulge with deep knowledge
First, we should point out that no printed material is given to the students, nor is any extended bibliography provided. Instead, they are required to search the WWW to find the necessary information, examples of code and details of programming device drivers in Linux. The coursework consists mainly of two projects, each of which contains its own sub-projects. Each project is submitted as a whole, but each sub-project has its own deadline. Projects consist of the program and a report that presents the algorithm on which the program was based, describes problems, errors, peculiarities or difficulties encountered, explains the procedure and methodology followed and responds to any question that may be included in the assignment.
The first project mainly has a two-fold introductory role. First, it works as a “warming-up” session. Its first part is based on 16-bit 8086 assembly and C and helps students recall low-level programming (i.e. system calls, registers etc.), while it also introduces inter-language function calls. This task is fairly easy and therefore has a very short and strict deadline. The next part applies the same task to 32-bit Linux assembly and C. Here, students have to learn how Assembly (gas or nasm) and C work together on Linux, on a 32-bit architecture. Its importance stems from being preparatory for the next project. That second part was dictated by the difficulties students had with those tasks in the first semester. The project is usually allotted no more than 3 weeks.
The second project takes the rest of the semester's lab sessions and is the main contribution to the students' knowledge, experience and myth devastation. It consists of building a very small kernel module that drives the serial port of a personal computer, i.e. the UART. The kernel module is basically written in C. It consists of 3 parts:
• The init function: It is called when the module is loaded and is responsible for initialization. That task includes (a) clearing the UART, (b) setting the communication details (speed, parity etc.) and (c) registering the interrupt handling routine by issuing the appropriate kernel calls.
• The clean function: It restores the previous interrupt handler, complementing the init function, and performs any other procedures required of a kernel module.
• The int_handler: It is a function written totally, or at least mostly, in assembly, and it is the important part. The handler reads any received data from the UART, checks whether the data are erroneous and then sends them back. This procedure is fairly simple; however, it exhibits all the vital issues regarding device drivers and interrupt handling.
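The logic of the handler and of the speed setting can be tried out in user space before touching real hardware. The sketch below is written in C rather than assembly and runs against a simulated device; the structure fake_uart and the functions uart_divisor and uart_echo_once are hypothetical names, not part of the assignment. The register bits follow the 16550-style UART found in PCs. Dropping an erroneous byte instead of echoing it is our assumption, since the text only says that the data are checked.

```c
#include <stdint.h>

/* Line status register bits of a 16550-style UART. */
#define LSR_DATA_READY  0x01u  /* a received byte is waiting */
#define LSR_OVERRUN_ERR 0x02u
#define LSR_PARITY_ERR  0x04u
#define LSR_FRAMING_ERR 0x08u
#define LSR_ERROR_MASK  (LSR_OVERRUN_ERR | LSR_PARITY_ERR | LSR_FRAMING_ERR)

/* Simulated device standing in for the real I/O ports (hypothetical). */
struct fake_uart {
    uint8_t  lsr;      /* line status register        */
    uint8_t  rbr;      /* receiver buffer register    */
    uint8_t  thr;      /* transmitter holding register */
    unsigned sent;     /* bytes echoed so far          */
    unsigned dropped;  /* erroneous bytes discarded    */
};

/* Divisor for the baud-rate generator: the PC UART tops out at
 * 115200 baud and lower speeds divide that figure.
 * Returns 0 for an invalid request of 0 baud. */
uint16_t uart_divisor(uint32_t baud)
{
    if (baud == 0)
        return 0;
    return (uint16_t)(115200u / baud);
}

/* Body of the echo handler, called once per simulated interrupt.
 * Returns 1 if a byte was echoed, 0 if nothing was waiting and
 * -1 if the byte arrived with an error and was dropped. */
int uart_echo_once(struct fake_uart *u)
{
    uint8_t status = u->lsr;
    uint8_t byte;

    if (!(status & LSR_DATA_READY))
        return 0;

    byte = u->rbr;
    /* Real hardware clears these bits when the registers are read;
     * the simulation has to clear them by hand. */
    u->lsr = (uint8_t)(u->lsr & ~(LSR_DATA_READY | LSR_ERROR_MASK));

    if (status & LSR_ERROR_MASK) {
        u->dropped++;
        return -1;
    }
    u->thr = byte;  /* send the byte back */
    u->sent++;
    return 1;
}
```

For 9600 baud uart_divisor returns 12. In an actual module the three register fields would be replaced by port reads and writes (inb and outb on Linux), and one common route in current kernels is to register the handler with request_irq in the init function and release it with free_irq in the clean function.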
3.3 Taking Care of the Details
The validity of the code can be verified in several ways. The simplest is to connect the serial ports of two PCs with a null-modem cable, where one PC runs the assignment's code and the other a common serial terminal emulation program. Although this is our way, we offer the students several alternatives, such as using two serial ports or redirecting the virtual machine's port to a socket or a file.
Regarding the administrative issues mentioned earlier, our labs have been built in a quite secure way: only a light Linux is loaded at the beginning. It then starts a VMware virtual machine that automatically boots either another Linux or Windows, depending on the username and password. That way, independently of what the user does, the important hosting system is secure. In addition, our administration staff built a new user and a corresponding virtual machine that loads a full Linux operating system, with the sources and everything needed to run our projects. Students work as root. However, the disks of the inner (guest) system are actually files in a single directory; restoring it to its initial state is a matter of copying a directory. Hence, although the user is root, he cannot cause any trouble: even if the system collapses, a remotely invoked command can restore everything within minutes. Another advantage is that the virtual machine is available to the students and, hence, they may have at home exactly the same environment as in the lab.
3.4 Outcome
Our experience of using Linux device drivers in the lab session of the course for two semesters proved very valuable for both teachers and students. The project was introduced in autumn 2007, after a discussion with the students, since it would be on a quite experimental basis. We should stress that, since these projects have only been offered for two semesters, it is not possible to fully assess the impact on student learning. Further assessment with questionnaires and interviews will be possible when these projects mature as an educational element of our syllabus.
From the educators' point of view, the results are at least satisfactory. First, the course syllabus is fulfilled: the project (a) introduces low-level programming, (b) exhibits the low-level details of communication between functions and even among different programming languages, (c) provides a means to see and understand system programming, (d) clarifies most of the details regarding I/O and communication among different hardware parts of the computer and with the external world and (e) presents the programming and control of external devices. Regarding the misconceptions, this project (a) clarifies system programming to an extent that students can comprehend, (b) allows a close view of operating system internals and functions, diminishing any dogmatic aversion, (c) offers experience in systems programming, which opens new horizons of involvement with computer science, and (d) reveals the algorithmic and methodological face of the infrastructure, as a common set of programs. This is important not only for our course, but also for operating systems courses, as has been discussed elsewhere [2,13,15]. Students get a clear, practical view of material they usually learn only from books.
From the students' side, the results were more than positive.
In early and later discussions with students who were taking the course and others who had passed it, we got the clear impression that they consider it an advancement in their studies and an important addition to the coursework. Furthermore, as we saw in the lab sessions during the semester, they clearly took initiatives to learn more, beyond the sessions' material and into individual projects of their own interest, while they followed
the course at a self-driven pace. Another positive surprise was that in the next semester there was a significant increase in the number of people attending the lab, which is not compulsory, in the fourth semester. It also seems to have initiated a discussion among students, since it appeared in several student fora, with generally positive comments. In the future, we plan to improve the project by adding synthetic interrupt handling, devices for communication with user-level programs and other, vastly different kinds of device handling.
4. CONCLUSION
We introduced the building of Linux device drivers as a new approach to teaching the “microcomputers and applications” course, in order to eliminate earlier misconceptions about programming. The overall experience of helping students approach low-level programming and the hardware in a simple way was fairly positive. Students expressed their interest and their positive surprise, and they took initiatives to learn more. Last but not least, the project provided the basis and food for thought for final-year projects related to hardware and operating systems, such as a bar-code reader support system, a modified Ethernet driver and a study of a car's microcontroller programming.
REFERENCES
[1] Beer, R. D. et al (1999), 'Using autonomous robotics to teach science and engineering', Communications of the ACM 42(6), 85-92.
[2] Bower, T. (2006), 'Using Linux Kernel Modules for Operating Systems Class Projects', American Society for Engineering Education.
[3] Calvez, J. & Pasquier, O. (1998), 'Training Engineers in Real-Time Systems Design: An Integrated Curriculum', Real-Time Systems Education Workshop, IEEE, 8-15.
[4] Claypool, M. et al (2001), An open source laboratory for operating systems projects, in 'ITiCSE '01: Proceedings of the 6th annual conference on Innovation and technology in computer science education', ACM, New York, NY, USA, pp. 145-148.
[5] Corbet, J. et al (eds.), (2005), Linux Device Drivers, O'Reilly.
[6] Crespo, A. et al (1998), 'Real-Time Education in a Control Engineering Curriculum', Real-Time Systems Education Workshop, IEEE, 112.
[7] Davoli, R. (2004), Teaching operating systems administration with user mode linux, in 'ITiCSE '04: Proceedings of the 9th annual SIGCSE conference on Innovation and technology in computer science education', ACM, New York, NY, USA, pp. 112-116.
[8] Gaspar, A. & Godwin, C. (2006), 'Root-kits & loadable kernel modules: exploiting the Linux kernel for fun and (educational) profit', J. Comput. Small Coll. 22(2), 244-250.
[9] Kornecki, A. et al (1998), Teaching Device Drivers Technology in a Real-Time Systems Curriculum, in 'Real-Time Systems Education Workshop, IEEE', IEEE Computer Society, Los Alamitos, CA, USA, pp. 42-48.
[10] Lotenberg, R. & Tyszberowicz, S. (1998), 'Student Projects in Reactive and Real-Time Systems Course', Real-Time Systems Education Workshop, IEEE, 57-63.
[11] Motus, L. (1998), 'Teaching Software-intensive Embedded Systems at Tallinn Technical University', Real-Time Systems Education Workshop, IEEE, 30-35.
[12] Nooshabadi, S. & Garside, J. (2006), Modernization of teaching in embedded systems design: an international collaborative project, 'IEEE Transactions on Education', pp. 254-262.
[13] Rogers, M. P. (2000), Working Linux into the CS curriculum, in 'Proceedings of the seventh annual consortium on Computing in small colleges midwestern conference', Consortium for Computing Sciences in Colleges, USA, pp. 85-91.
[14] Rusling, D. A. (2001), The Linux Kernel, (available on internet).
[15] Striegel, A. & Rover, D. T. (2002), Problem-Based Learning in an Introductory Computer-engineering Course, in '32nd ASEE/IEEE Frontiers in Education Conference'.
[16] Tempelmeier, T. (1998), 'Embedding Practical Real-Time Education in a Computer Science Curriculum', Real-Time Systems Education Workshop, IEEE, 149-153.