An Introduction to Python Programming - Curve - Coventry University

An Introduction to Python Programming James Shuttleworth and Sarah Mount 2005

Preface

Why learn to program?

Why bother to learn a programming language when you can already use a computer just by pointing and clicking on your desktop icons? Whenever you use a computer in the conventional way, you are restricted by what some other programmer has allowed you to do. If you want to really be in control of your computer, you need to be able to write your own code. Not only will you be able to develop applications from scratch, but more and more software packages also allow you to extend their capabilities by writing plugins. The ray tracer PovRay and the GIMP raster graphics package are examples of extensible applications.

Programming is also an essential skill in the computing industry. Pretty much any job you can think of involves at least some programming. Learning the fundamental concepts of programming languages and systems, as well as the facilities of one or more languages, will put you ahead of the game in terms of employability.

It’s also worth considering that programming languages tend to rise and fall in popularity. Python is on the way up, but at some point it’ll be overtaken by the Next Big Thing. In these notes we’ll be teaching you as much about programming in general as about Python specifically. At the end of the module, you will also go on to study other languages, such as Java. These experiences will give you the flexibility to learn new languages whenever they become popular, which will keep you in a job much longer than those people who learn a single language (COBOL?!) and have difficulty retraining in years to come.

This introduction to programming was written specifically for students who have no prior experience of writing software. We’ve tried to teach everything from scratch and make it fun and interesting along the way. Last year we saw some fantastic results (and great code) from students, almost all of whom had never written a program before. The External Examiner said about our approach:

The institution’s approach to the teaching of programming is novel and is very successful: it gives Coventry University students a head start compared to approaches commonly used elsewhere in the UK.


Why Python?

In 2005 we decided to start using Python for introductory programming modules at Coventry. We are not the only University to make this switch – Leeds University and the University of Oregon are among those beginning to use Python. However, many Universities use other languages and we thought it might be helpful to explain why we believe that Python is such a great choice.

Python is easy to learn

The syntax of Python is based on a language called ABC which was specifically developed to be easy to learn and understand. As Python has developed to become an industrial strength language, its creator, Guido van Rossum, has gone to some lengths to make sure that the language remains as easy to use as possible. Programmers often describe the way Python code looks as “clean” and “simple”. So, we think you’ll be able to pick it up more quickly than other languages and be able to write quite sophisticated programs early on in your studies.

From Python’s earliest days, it was hoped that it would become a language widely used in education. For that reason, Python has a lot of built-in support for learners – especially the turtle graphics module that you will meet in Chapter 1 of this book and use in your first Studio.

Python is used in different sorts of applications

Like XML, scripting was extremely useful as both a mod tool and an internal development tool. If you don’t have any need to expose code and algorithms in a simple and safe way to others, you can argue that providing a scripting language is not worth the effort. However, if you do have that need, as we did, scripting is a no brainer, and it makes complete sense to use a powerful, documented, cross-platform standard such as Python. Python, like many good technologies, soon spreads virally throughout your development team and finds its way into all sorts of applications and tools. In other words, Python begins to feel like a big hammer and coding tasks look like nails.
– Mustafa Thamer of Firaxis Games, talking about Civilization IV. Quoted on page 18 of the August 2005 Game Developer Magazine http://www.gdmag.com/

Python is capable of implementing pretty much any sort of program you might wish to write. It is used in web applications, scientific programming, business applications, image processing, games, database programming and probably anything else you can think of. The number of useful things you can do quickly with Python seems to be expanding all the time, as more and more people and companies are using the language.


Python has fantastic library support and an active user community

I have the students learn Python in our undergraduate and graduate Semantic Web courses. Why? Because basically there’s nothing else with the flexibility and as many web libraries.
– Prof. James A. Hendler, University of Maryland

Python has fantastic online support, in the form of documentation, tutorials and even books. Many of these are listed in our “Recommended reading” section, and many of them are free. So, you will be well supported in your learning, both inside the University and outside.

Python is increasingly used in industry

Python has been an important part of Google since the beginning, and remains so as the system grows and evolves. Today dozens of Google engineers use Python, and we’re looking for more people with skills in this language.
– Peter Norvig, director of search quality at Google, Inc.

NASA is using Python to implement a CAD/CAE/PDM repository and model management, integration, and transformation system which will be the core infrastructure for its next generation collaborative engineering environment. We chose Python because it provides maximum productivity, code that’s clear and easy to maintain, strong and extensive (and growing!) libraries, and excellent capabilities for integration with other applications on any platform. All of these characteristics are essential for building efficient, flexible, scalable, and well-integrated systems, which is exactly what we need. Python has met or exceeded every requirement we’ve had.
– Steve Waterbury, Software Group Leader, NASA STEP Testbed.

Whilst we want you to become competent and well-educated programmers, we realise that at some point many of you will earn a living in the Computer industry. Almost any job you might find will require some programming skills. System administration, system analysis, programming(!), online content creation and games development all require programming skills of some sort. Animation also often requires some scripting and many graphics and animation packages now come with scripting capabilities built-in. Knowledge of Python should serve you well whatever you choose to do later on.

Python runs on many platforms

Python can run on PCs and Macs, under Windows, Linux, Mac OS or pretty much any other operating system. Recently, Nokia released a Python SDK (http://www.forum.nokia.com/main/0,,034-821,00.html) with which you can write Python programs for mobile phones running the Symbian OS.


Python is Free!

Python is free in two senses. Firstly, you don’t have to pay any money to get it. You can download as many copies of Python as you like from http://www.python.org/download/. If you have a computer at home and you want to work there, we suggest you get a copy for yourself. In this book, we have used Python 2.4.1, but any version of Python which starts with 2.4 should be fine (see footnote 2).

More importantly, Python is free software, which means that anyone can get the Python source code, improve it and distribute it to anyone who wants a copy. This means that thousands of developers around the globe have been able to improve the Python language and environment, which is partly why the language is so robust, well-written and popular. You can read more about free software on the Free Software Foundation’s website http://www.fsf.org.

How to read this book

Together with the recommended reading, this book should be all you need in terms of notes for your module. Different topics in this book give rise to different kinds of challenges, so one Chapter in this book will not necessarily represent one lecture on your module. To get the most out of your learning experience, you should read ahead at least a Chapter at a time, before the lectures. When you have attended a lecture, you need to do all of the exercises at the end of the relevant Chapter, either in your Studio class or in your own time. These exercises will be assessed and for your portfolio you should hand in at least all of the exercises marked “key assignment”. Feel free to hand in more than that – you might consider it particularly useful to hand in any exercises you found very difficult, so we can give you good feedback and advice on how to approach the tasks we set.

If you have difficulty understanding the work covered in lectures and you’ve read through the relevant notes at least once, then the reading listed in the “Recommended reading” section or at the end of each Chapter should be the first place you look. Of course, in your Studios you will be supported by members of staff, PhD students and Proctors (second and third year students) who can answer your questions. We also encourage you to use WebCT to discuss your thoughts and ideas with other students and with us.

Chapter structures

Each chapter in this book has the same structure:

Learning outcomes – tell you what you will be able to do when you have finished working through the chapter.

Content – the main Sections of the Chapter, where you will learn all about programming and Python.

Further reading – a list of material you can use to learn more about the contents of the chapter. Wherever possible, we have tried to make use of the amazing amount of free, online material that is available to help you learn Python. In particular, http://www.python.org is the place that professional programmers go to for official Python documentation and tutorials. It’s important that you get used to reading around the subject and finding out more for yourself. Many professional programmers are expected to learn new languages and systems quickly, without outside help and training. The faster you are able to learn to make use of the free reading material around you, the more you will know about Python and the sharper your employability skills will be!

Glossary – a list of the new words you have learned in the Chapter and their meanings.

Homework exercises – a list of assessed exercises for you to do in your Studios and in your own time. You should have these signed-off by your Personal Tutor each week and bring them to your Studios.

Footnote 2: Version 2.3 won’t work with the latest versions of Pygame and some other libraries, although you might be OK with it up to Chapter 13.

Book structure

We’ve been very careful to structure these notes in such a way that you should be able to understand each new concept. We want to avoid saying “this works, but we won’t tell you why for a few weeks”, which we think makes programming much harder to learn. In fact, this is one of the reasons why we chose to use Python – it’s much easier to understand as a beginner programmer than languages such as C and Java! Of course, where you will see a particular topic again, in a different context, we’ve given you forward references to coming Chapters. With that in mind, it’s important that you read each Chapter of this book in order, otherwise it won’t make much sense. We’ve tried to make sure that you will be writing reasonably challenging and interesting programs all the way through the module – even in the first week – so where we’ve chosen to take a difficult concept quite slowly, we hope to keep your attention!

The following is a rough guide to what’s in each Chapter, and why it should be interesting to you:

Getting started with Python – here, you will learn how to run Python and use its most basic features. We’ve made use of Python’s turtle module, which means you will be writing programs which draw pictures in your first week.

Python basics – in this Chapter you will learn some of the very basic concepts of programming and programming languages. Specifically, you will learn about expressions, statements, commands, literals, variables and types. These will all be covered in more detail in later Chapters, but here you will learn enough to get you started in writing simple programs.

Boolean algebra – this is a concept you will also meet in modules on hardware and mathematics. “Booleans” help us represent the concepts of “true” and “false” in programs. In programming, Boolean algebra makes it possible to write programs which make choices between alternatives.

Choice – in this Chapter you will make use of Boolean algebra and learn how to write programs which choose between alternatives. You’ll see an example of how to randomly choose a signature to add to the end of your emails.

Repetition: recursion – recursion is one of two ways to write programs which repeat some of their actions. In this Chapter you will learn how to use recursion to draw fractal curves. You will also draw one or more fractals of your own, which must be included in your portfolio of work.

Repetition: iteration – iteration is the other way of writing programs which repeat some of their actions. In this Chapter you will see again some of the examples you saw in the previous Chapter on Recursion. This should help you learn how to “do” repetition in different ways and to choose which language features to use for a particular task. One of the examples you’ll see will implement a simple form of encryption called the Cæsar Cipher, which was invented by Julius Cæsar to prevent his instructions to troops from being read by his enemies.

State – this Chapter is all about controlling the information stored in your programs. You will learn about finite state machines and a particular sort of state machine called a lexer.

Compound types – in this Chapter you will learn how to represent and process complex pieces of information in Python. Some of the concepts from the Chapter on Python basics will be covered in more detail, especially strings. You will also learn about regular expressions, with which you can perform powerful text processing.

Searching and sorting – programming is all about algorithms – ways of solving problems. Algorithms to search and sort data are among the most fundamental in Computer Science. Here, you will learn about the most useful algorithms for searching and sorting, how to compare their efficiency (using “Big-O notation”) and how to implement them in Python.

Functions and modules – functions and modules are two of the basic building blocks of Python, along with objects, which are covered in the Object Oriented Python Chapter. You will already have met functions in the Chapter on Python Basics, but here you will study them in more detail. You will learn about professional programming practice, including how to usefully document your programs and how to write code for other programmers to use.

Input and output – most useful programs read data in from somewhere (a file, the keyboard, the Internet, ...) and produce some output (in a file, on the screen, etc). In this Chapter you’ll learn all about input and output, including how to save data in your programs so that you can use it again. We’ll also cover issues about platforms – how running your program on Linux might be different to running it on Windows or a Mac.

Object oriented Python – object oriented programming languages gained enormous popularity in the 1990s and have become almost a de facto standard in industry. In the 2000s, as languages like Python have become more popular, many people have begun to think that it’s good to have a choice about whether to use object oriented programming (OOP) or not, and about how to solve some of the problems that OOP has created. Because OOP has become so important, we want you to have a good understanding of what it is, when and why it’s useful and how to write programs using OO techniques.

Python extensions – gives an introduction to some libraries which do not come with Python when you download it. This gives you a chance to find out how professional programmers make use of software that comes from a variety of sources. You will learn:
• how to use pygoogle to write programs which search the Internet with the Google search engine;
• how to do image processing in Python, using the Python Image Library and
• how to write arcade games using pygame.
You will start writing your arcade game in your Studios and this must be included in your portfolio of work.

Index – at the end of the book there is an index of terms which you can use to quickly access information about a particular concept (or Python keyword, which will be in bold type) covered in the book.

Teaching schedule

We will devote as much time to each Chapter as we think is necessary for you to fully understand it. You will always have one lecture a week and the following list tells you which Chapters are covered in which weeks of the lecture schedule:

• Lecture 1: Getting Started.
• Lecture 2: Python Basics.
• Lecture 3: Boolean Algebra.
• Lecture 4: Choice.
• Lectures 5-6: Repetition: Recursion.
• Lecture 7: Repetition: Iteration.
• Lectures 8-9: Compound Types.
• Lectures 10-11: Searching and Sorting.
• Lecture 12: Functions and Modules.

End of Autumn term

• Lecture 13: Functions and Modules (continued).
• Lecture 14: Input / Output.
• Lectures 15-17: Object Oriented Python.
• Lectures 18-19: Python Extensions.
• Lectures 20-24: Other languages! This section isn’t covered in this book, but it will involve transferring the skills you’ve learned to some new languages (such as Java and C).

End of Spring term

Note that for this year we’re skipping Chapter 8 (State), but we’ve left it in the notes because you might still find it useful.


Conventions used in this book

To help you find your way around this book and to keep the text consistent, we have adopted a number of conventions, listed below.

Font conventions

This book uses the following conventions:
• Italic is used to introduce new words.
• Fixed width is used for Python code, keywords, filenames, etc.

Code listings

On WebCT you will find copies of all the Python programs in this book, as well as the images and music that we’ve developed. The listings/ directory contains directories for each of the chapters in the book as well as directories called images/ and sounds/. Each chapter folder contains all of the code listings from that Chapter. We have used Python 2.4.1 for the development of all the programs in this book. We have not tried our code with any other version of Python, although most things outside the Python Extension Chapters should work on versions beginning with “2”.

In the book, we have two ways of displaying Python programs. Short, simple pieces of code will look like this:

    # This is a code listing which will not appear in the
    # list of listings
    ...

Longer or more complete programs will have a caption and will also appear in the List of Listings on page xxviii:

Listing 1: An example code listing which will appear in the list of listings.

    #!/bin/env python

    print "Hello World!"

Figures

Figures are numbered and appear with a caption, like Figure 1. All Figures are listed in the List of Figures on page xxvi.

Figure 1: An example figure which will appear in the List of Figures

Mathematics

There isn’t much Mathematics in this book, but we have included some of the most important and fundamental mathematical concepts in programming. You should find these useful (in the sense that they help you write better programs) and relevant to what you will be learning. Where there is a choice between using Python’s notation and using a more mathematical notation, we have used Python’s notation. For example, we write x ** y (meaning “x to the power of y”) rather than xʸ, and we write and rather than ∧. We hope that this will help you make the link between the mathematics we have used and the programming you will be doing. Equations will be written like this:

    x = (a + 5)/(b ** 6)

Asides

Asides contain interesting pieces of information which we think you would like to know, but which won’t be assessed as part of the course.

An example aside . . . and an aside looks like this.
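To make the Mathematics convention above concrete, here is a minimal sketch of the same notation typed into Python. The values given to a, b and y are made up purely for illustration, and the code is written so that it runs on a current Python 3 interpreter as well as on the Python 2 versions used in this book.

```python
# Hypothetical values, chosen only so the arithmetic is easy to check.
a = 3
b = 2
y = 10

# "b to the power of y" is written with two asterisks.
power = b ** y

# The example equation x = (a + 5)/(b ** 6).
# float() makes sure the division is not rounded down on Python 2.
x = (a + 5) / float(b ** 6)

# Logical "and" is written as a word rather than as a wedge symbol.
both_positive = (a > 0) and (b > 0)

print(power)
print(x)
print(both_positive)
```

Here power is 2 multiplied by itself ten times, and x works out as 8 divided by 64.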

Modules 110CR, 112CR, 171CS and 159CS

Learning outcomes

The specific Learning Outcomes for your module are available online at http://mid.coventry.ac.uk/midhome.html. In general, at the end of any module based on these notes, you should be able to:

1. Design and implement interactive programs using appropriate linguistic features of a given programming language.
2. Demonstrate an understanding of imperative, declarative and object oriented language features and know when it is appropriate to use each.
3. Specify the functionality of an algorithm (for example, formally by stating its pre- and post-conditions or informally, by describing it in English).
4. Employ appropriate facilities in a given programming language for data abstraction and encapsulation (for example, functions, modules, packages and classes).
5. Write programs which make use of external libraries, APIs, etc. and apply their skills to a new programming language.
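As an illustration of Learning Outcome 3, the sketch below shows one possible way to state pre- and post-conditions for a small algorithm and to check them with Python’s assert statement. The function name largest is our own invention for this example and does not come from the module materials; checking conditions with assert is only one of several reasonable approaches.

```python
def largest(numbers):
    """Return the largest item in a list of numbers.

    Pre-condition:  numbers contains at least one item.
    Post-condition: the result is an item of numbers, and no item
                    of numbers is greater than the result.
    """
    # Check the pre-condition before doing any work.
    assert len(numbers) > 0, "pre-condition: the list must not be empty"

    result = numbers[0]
    for n in numbers[1:]:
        if n > result:
            result = n

    # Check the post-condition before handing the result back.
    assert result in numbers
    for n in numbers:
        assert n <= result
    return result


print(largest([3, 9, 2]))
```

The docstring is the informal specification in English, while the assert lines are a more formal version that Python checks every time the function runs.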

Assessment

This book forms part or all of the work for the following Level 1 programming modules:

• 110CR Introduction to Programming – all
• 112CR Programming Concepts and Practice – all
• 171CS Systematic Programming – part – plus the Java component of 105CR
• 159CS/159CSEVE Computer Programming – all

The module mark weighting and pass criteria for those modules are as follows:

• 110CR: 100% coursework. You must have a module mark of at least 40% for a Pass.
• 112CR: 100% coursework. You must have a module mark of at least 40% for a Pass.
• 171CS: Your coursework mark must be at least 35%, exam mark must be at least 35% and module mark must be at least 40% for (Double) Pass. For a Single Pass, your coursework mark must be at least 35% and your module mark must be at least 30%.
• 159CS: 100% coursework. You must have a module mark of at least 40% for a Pass.

The coursework component for each module will be obtained by marking work from your portfolio. Your portfolio must include the “Key Assignment” in every set of exercises for all the Chapters which appear in the lecture schedule. You may also include any other exercises you have attempted and code you have written outside of this module, if you wish. In addition to the exercises in this book, the series of lectures on learning new languages (at the end of the Spring term) will also include exercises and Key Assignments.

It remains your responsibility to ensure that you have included enough work in your portfolio to convince us that you have met the Learning Outcomes for your module. You should include in your portfolio a list of these Outcomes and a brief paragraph for each, explaining which of the pieces of code you are submitting are intended to meet that Outcome.

The Chapter exercises should be completed by you, in Studios or in your own time, and signed-off weekly by your Personal Tutor. You should bring in your current work to each Studio on your schedule. A component of the mark will be awarded for doing the work in a timely way. You may expect work put into your portfolio at the last minute to score less than work which was done when it was set.

You should hand in your Portfolio of work at Reception in the Armstrong Siddeley Building by 17:00 on the following dates:
• Wednesday 13th December, 2006.
• Wednesday 21st March, 2007.

Modal assessment criteria

For a pass mark (≥ 40%), you should be able to:
• Specify an algorithm both formally and informally.
• Use appropriate Python language structures to implement a formally or informally specified algorithm.
• Test a program and note where it may be deficient.
• Appreciate that some of Python’s language features can be used to create equivalent code (e.g. for and while loops can both be used to implement iteration). Be able to translate programs using one language feature into those using an equivalent feature (e.g. translate a for loop into an equivalent while loop or vice versa).
• Demonstrate a basic understanding of the syntax and semantics of at least one language other than Python.
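The criterion about equivalent language features can be illustrated with a small sketch. The two functions below (the names are ours, invented for this example) add up a list of numbers, once with a for loop and once with an equivalent while loop.

```python
def total_with_for(numbers):
    # The for loop visits each item of the list in turn.
    total = 0
    for n in numbers:
        total = total + n
    return total


def total_with_while(numbers):
    # The while loop does the same job, but we have to keep track
    # of the current position in the list ourselves.
    total = 0
    i = 0
    while i < len(numbers):
        total = total + numbers[i]
        i = i + 1
    return total


marks = [1, 2, 3, 4]
print(total_with_for(marks))
print(total_with_while(marks))
```

Both calls give the same answer. The while version needs an explicit counter and an explicit test for the end of the list, which the for loop handles for you.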

For a first class mark (≥ 70%), you should be able to:
• Consider several possible choices for the design and implementation of an algorithm.
• Make defensible choices about which of Python’s facilities for encapsulation (functions, modules, objects, etc) to use.
• Improve a program, having tested it.
• Determine which of an API’s facilities to use to perform a given task, based on the professional documentation for that API.
• Write programs which take into account the needs of users. For example, it should be clear to users of an arcade game how the game controls work.
• Demonstrate competence in more than one language other than Python.


Recommended reading

These notes form the only essential reading for our introductory programming modules. However, there is a wealth of good material available on Python programming and we want you to look around and use whatever you feel most comfortable with. One of the professional skills that good programmers have is the ability to use tutorials and manuals to teach themselves whatever they need to know to get their job done. Whilst we will be giving you much more guidance than that, we want you to aim to acquire this skill yourself.

As well as the tutorials and introductions that we’ve listed below, you should also make use of the Python module index. This might only become useful to you part-way through your studies, but it provides documentation for all the modules which come free with Python. It’s very useful, and something that most Python programmers have open whenever they are coding. So, make sure you bookmark it now in your favourite browser: http://docs.python.org/modindex.html

Python material for novice programmers

• Beginner’s guide to Python http://wiki.python.org/moin/BeginnersGuide
• Josh Cogliati, Non-programmers tutorial for Python http://www.honors.montana.edu/~jjc/easytut/easytut/
• Magnus Lie Hetland, Instant hacking http://www.hetland.org/python/instant-hacking.php
• Swaroop C H, A byte of Python http://www.byteofpython.info/ http://www.ibiblio.org/g2swap/byteofpython/read/index.html
• Alan Gauld, Learning to program http://www.freenetpages.co.uk/hp/alan.gauld/
• Allen B. Downey, Jeffrey Elkner and Chris Meyers, How to think like a Computer Scientist http://www.ibiblio.org/obp/thinkCSpy/
• Hans Petter Langtangen, Scripting for computational science http://www.ifi.uio.no/in228/lecsplit/
• John M. Zelle (2004), Python Programming: An Introduction to Computer Science. Published Franklin Beedle & Associates. ISBN 1887902996
• Michael Dawson (2003), Python Programming for the Absolute Beginner. Published Premier Press. ISBN 1592000738

Python material for experienced programmers

The following material has been written for experienced programmers. It will probably only be useful to you if you can already program competently in at least one programming language:

• Guido van Rossum, Python tutorial http://docs.python.org/tut/tut.html
• Mark Pilgrim, Dive into Python http://diveintopython.org
• Aaron R. Watters, The what, why, who and where of Python http://www.networkcomputing.com/unixworld/tutorial/005/005.html
• Magnus Lie Hetland, Instant Python http://www.hetland.org/python/instant-python.php
• Bruce Eckel, Thinking in Python http://www.mindview.net/Books/TIPython
• Jacek Artymiak, Python programming for beginners http://www.linuxjournal.com/article/3946
• Richard P. Muller, Python short course http://www.wag.caltech.edu/home/rpm/python_course/
• Magnus Lie Hetland (2002), Practical Python. Published Apress. ISBN 1-59059-006-6
• Mark Lutz and David Ascher (1999), Learning Python. Published O’Reilly. ISBN 1-56592-464-9
• Mark Lutz (2001), Programming Python. Published O’Reilly. ISBN 0-596-00085-5
• Bradley N. Miller & David L. Ranum, Problem Solving with Algorithms and Data Structures Using Python. Published Franklin, Beedle and Associates. ISBN 1-59028-053-9

Recommended software

The following is a list of all the software we have made use of in this book, with version numbers. You can download everything listed here, free of charge, for Windows, Linux or Mac (at least!).

• python-2.4.1 http://www.python.org/download/
• pygame-2.6 http://www.pygame.org/download.shtml
• pygoogle-0.6 http://pygoogle.sourceforge.net/
• Imaging-1.1.5 http://www.pythonware.com/products/pil/
• googleapi http://www.google.com/apis/download.html
• numarray-1.3.3 http://www.stsci.edu/resources/software_hardware/numarray
• Numeric-23.1 http://sourceforge.net/projects/numpy
• SDL-1.2.7-8 http://www.libsdl.org/download-1.2.php
• SDL-devel-1.2.7-8 http://www.libsdl.org/download-1.2.php
• SDL_net-1.2.5-2 http://www.libsdl.org/projects/SDL_net/
• SDL_net-devel-1.2.5-2 http://www.libsdl.org/projects/SDL_net/
• SDL_image-1.2.3-6 http://www.libsdl.org/projects/SDL_image/
• SDL_image-devel-1.2.3-6 http://www.libsdl.org/projects/SDL_image/
• SDL_mixer-1.2.5-4 http://www.libsdl.org/projects/SDL_mixer/
• SDL_mixer-devel-1.2.5-4 http://www.libsdl.org/projects/SDL_mixer/
• SDL_ttf-2.0.6-1 http://www.libsdl.org/projects/SDL_ttf/
• smpeg-0.4.4-1 http://www.lokigames.com/development/smpeg.php3

Acknowledgements

These notes were produced with emacs, LaTeX (using the fantastic Listings package), ImageMagick and make.


Contents

1 Getting started with Python
    1.1 Installing Python
    1.2 Using the Python interpreter
        1.2.1 Some basic python commands
    1.3 Turtle graphics
        1.3.1 Basic shapes
        1.3.2 Other functions in the turtle module
    1.4 Algorithms
        1.4.1 Designing an algorithm: square spirals
    1.5 Python programs in files
        1.5.1 Editing files
        1.5.2 #!
        1.5.3 chmod
        1.5.4 Executing the program
    1.6 Some basic Python
    1.7 Further reading
    1.8 Glossary
    1.9 Introductory Studio
        1.9.1 Key Assignment

2 Python basics
    2.1 Statements
    2.2 Commands
    2.3 Literals
    2.4 Type
    2.5 Expressions
        2.5.1 Order of precedence
    2.6 Variables and assignment
    2.7 Currency conversion
    2.8 Functions
        2.8.1 Currency conversion done properly
    2.9 Further reading
    2.10 Glossary
    2.11 Exercises
        2.11.1 Key Assignment

3 Boolean algebra
    3.1 True and False
    3.2 Boolean algebra
        3.2.1 and
        3.2.2 or
        3.2.3 not
        3.2.4 Expressions with more terms
        3.2.5 De Morgan’s laws
        3.2.6 Notation
    3.3 Boolean algebra in Python
        3.3.1 Comparisons
        3.3.2 Between
        3.3.3 A note on inequality
    3.4 Further reading
    3.5 Glossary
    3.6 Exercises
        3.6.1 Key Assignment

4 Choice
    4.1 if
        4.1.1 Diving in
        4.1.2 Example
        4.1.3 Nesting
        4.1.4 Randomisation: signatures
        4.1.5 else
        4.1.6 elif
    4.2 Further reading
    4.3 Glossary
    4.4 Exercises
        4.4.1 Key assignment

5 Repetition: recursion
    5.1 Recursive functions
        5.1.1 Another recursive function
    5.2 Base cases, induction cases and termination
    5.3 Efficiency of recursive functions
    5.4 Visual recursion: fractal curves
    5.5 Further reading
    5.6 Glossary
    5.7 Homework exercises
        5.7.1 Key Assignment

6 Repetition: iteration
    6.1 Iteration
    6.2 for loops
        6.2.1 Examples of for loops
    6.3 while loops
        6.3.1 Examples of while loops
    6.4 String operations
    6.5 More complex loop constructs
        6.5.1 Example of complex loop constructs
    6.6 Further reading
    6.7 Glossary
    6.8 Homework exercises
        6.8.1 Key Assignment

7 State
    7.1 Variables, assignment and substitution
        7.1.1 Substitution
        7.1.2 Simultaneous assignment
    7.2 State
    7.3 Changing state
    7.4 Finite state machines
    7.5 Lexers
    7.6 Programs which manage state
    7.7 Further reading
    7.8 Glossary
    7.9 Homework exercises
        7.9.1 Key Assignment

8 Compound types
    8.1 Introduction
    8.2 Strings
        8.2.1 Indexing
        8.2.2 Slicing
    8.3 Tuples
        8.3.1 Distance finding with tuples
        8.3.2 Adding to immutables
    8.4 Lists
        8.4.1 Greenfly reproduction
        8.4.2 Cellular automata
        8.4.3 Towers of Hanoi
    8.5 Dictionaries
        8.5.1 Parsing Roman numerals
        8.5.2 IMDB-style database
    8.6 Further reading
    8.7 Glossary
    8.8 Exercises
        8.8.1 Key Assignment

9 Searching and sorting
    9.1 Introduction
    9.2 Searching
        9.2.1 Linear search
        9.2.2 Binary search
    9.3 Sorting
        9.3.1 Selection sort
        9.3.2 Bubble sort
        9.3.3 Merge sort
        9.3.4 Quicksort
    9.4 Further reading
    9.5 Glossary
    9.6 Exercises
        9.6.1 Key Assignment

10 Functions and modules
    10.1 Functions
        10.1.1 Scope
        10.1.2 Example: verifying ISBN checksums
    10.2 Functional programming with Python
        10.2.1 Using lambda to create anonymous functions
        10.2.2 map and filter for automatic list processing
        10.2.3 An example using lambda, map and filter
    10.3 Preconditions and postconditions for functions
    10.4 Modules
        10.4.1 Special data for modules
    10.5 Sets: an example module with preconditions and postconditions
        10.5.1 Documentation for the module
    10.6 Further reading
    10.7 Glossary
    10.8 Exercises
        10.8.1 Key Assignment
        10.8.2 Challenge

11 Input and output
    11.1 Simple keyboard input
    11.2 Command-line Arguments
    11.3 Files
        11.3.1 Text files
        11.3.2 Pickling and shelving
        11.3.3 Using shelve
    11.4 Regular expressions
        11.4.1 Finding substrings with regular expressions
        11.4.2 Creating regular expressions
        11.4.3 Example: Finding calls to input()
    11.5 Further reading
    11.6 Glossary
    11.7 Exercises
        11.7.1 Key Assignment

12 Object oriented Python
    12.1 Writing classes and using objects
        12.1.1 The film database with objects
    12.2 Inheritance
    12.3 Polymorphism
        12.3.1 Shapes example
        12.3.2 Expression evaluator example
    12.4 Abstract data types
        12.4.1 Sets
        12.4.2 Overloading built in operators and functions
    12.5 Exceptions
    12.6 Unit testing with PyUnit
        12.6.1 A test harness for the sets class
    12.7 Further reading
    12.8 Glossary
    12.9 Homework exercises
        12.9.1 Key Assignment

13 Python extensions
    13.1 PyGoogle
        13.1.1 Googlewhacking
    13.2 PIL
        13.2.1 Viewport
        13.2.2 The RGB colour model
        13.2.3 Colour to greyscale
        13.2.4 Colour to negative
        13.2.5 Swapping colour bands
        13.2.6 Filters: edge enhancement and embossing
        13.2.7 Pixel by pixel transformations: colour to sepia tone
    13.3 Pygame
        13.3.1 Bouncing ball animation
        13.3.2 Game functions module
        13.3.3 Old skool arcade games: Snake
    13.4 Further reading
    13.5 Glossary
    13.6 Homework exercises
    13.7 Key Assignment

List of Figures

1 An example figure which will appear in the List of Figures
1.1 Running the demo() function in the turtle module
1.2 Result of calling turtle.reset()
1.3 Drawing a square
1.4 Drawing half a circle
1.5 Not really a square!
1.6 A square spiral
1.7 Part of a square spiral
1.8 Drawing a square spiral with the turtle
1.9 House shape
5.1 A von Koch curve
5.2 The Torn Square fractal
5.3 Building up to the Torn Square fractal
7.1 Some Python variables
7.2 Python variables after re-assignment
7.3 Simultaneous assignment: line 1
7.4 Simultaneous assignment: line 2
7.5 A state transition diagram for a light switch
7.6 A state transition diagram for a set of (British) traffic lights
7.7 A state transition diagram for a simple vending machine
7.8 A lexer for integers
13.1 RGB colour representation. Adding red to blue gives magenta, adding blue to green gives cyan, adding green to red gives yellow. Adding red, blue and green yields white.
13.2 Colour to greyscale: original image and result
13.3 Colour to negative: original image and result
13.4 Swapping colour bands: original image and result
13.5 Edge enhancement: original image and result
13.6 Embossing: original image and result
13.7 Colour to sepia: original image and result
13.8 Bouncing ball animation
13.9 Snake: press any key to start
13.10 Snake: game over!

Listings

1 An example code listing
1.1 Hello World! in a file
1.2 Birthday calculator
4.1 Random signatures 1
5.1 A function to draw a square
5.2 Recursively raising a number to a power
5.3 Recursive factorial
5.4 Recursively raising a number to a power, more efficiently
5.5 A von Koch curve
6.1 Iteratively raising a number to a power with for
6.2 Iterative factorial with for
6.3 Iteratively raising a number to a power with while
6.4 Iterative factorial with while
6.5 Cæsar cipher
6.6 Prime number filter using for loops
7.1 Light switch
7.2 Traffic Light
7.3 A lexer for integers
8.1 Test for palindromes
8.2 Greenfly Reproduction
8.3 Cellular Automata
8.4 Towers of Hanoi
8.5 A parser for Roman numerals
8.6 A simple film database
9.3 Linear Search
9.6 Binary Search
9.8 Selection Sort
9.11 Bubble Sort
9.12 Improved Bubble Sort
9.13 Merge Sort
9.14 Quick Sort
10.1 Verifying ISBN checksums
10.2 A lazy list for generating the integers
10.3 A primes filter with functional programming
10.4 A "forall" function.
10.5 A module for managing sets represented by lists
10.6 Documentation for the sets module
11.1 Command-line arguments
11.2 Writing text files
11.3 Appending to text files
11.4 Reading text files
11.5 Persistent film database
11.6 Using regular expressions to search a file
12.1 Game of craps with contracts
12.2 Documentation for the craps module
12.3 Film database with objects
12.4 Film database in use
12.5 Points: inheritance as extension
12.6 Shapes: inheritance in use
12.7 A simple expression evaluator
12.8 A class to model sets
12.9 A test harness for the sets class
13.1 Viewport
13.2 Colour to greyscale
13.3 Colour to negative
13.4 Swapping red and blue colour bands
13.5 Edge enhancement
13.6 Using the PIL emboss filter
13.7 Colour to sepia
13.8 Bouncing ball animation
13.9 Documentation from the game functions module
13.10 Full listing of the Snake program

Chapter 1 Getting started with Python

Learning outcomes

At the end of this Chapter, you will be able to:

• Start the Python interpreter and write simple programs to print output to the terminal.
• Use the import statement to use some of the code that comes with Python.
• Define the word algorithm.
• Define the terms bug and debug.
• Use the turtle module to draw simple graphics.
• Call functions from inside an imported module.
• Write a Python program in a stand-alone file and run your program from the interpreter.

1.1 Installing Python

Before going any further, you should have access to Python. If you’re using computers administered by someone else (as university students usually are), ask your administrator to install it if it isn’t there already. Windows users can find an installer and installation documentation at http://www.python.org/2.4.1/. Linux users probably have Python installed already, but should check that it is version 2.4 or newer. If you don’t have it installed, or you have an older version, it can be obtained from http://www.python.org/download/ or through your distribution’s package management system.

1.2 Using the Python interpreter

The Python interpreter, or shell, is executed by the operating system whenever it is asked to run a Python program. You can find the location of your Python shell with the which command:

    $ which python
    /usr/bin/python

If you’re using some flavour of Unix, you probably know what a shell is - it’s the thing that lets you give text commands, such as cp, mv or ls. Windows users have a similar set of commands (copy, move, dir, etc.) that can be typed into the DOS shell. As well as interpreting Python programs stored in files, the Python shell can be used interactively, like bash. Just enter python at the command prompt and press Enter.

    $ python
    Python 2.3.4 (#1, Feb 20 2005, 04:19:21)
    [GCC 3.3.5 (Gentoo Linux 3.3.5-r1, ssp-3.3.2-3, pie-8.7.7.1)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>>

After some version and environment information, we are presented with the Python prompt (>>>). Notice the difference between the two prompts - bash uses $, while Python uses >>>. Both of these signify that the current shell is ready for commands. The banner has already told us about a few commands, so let’s try one of them:

    >>> credits
    Thanks to CWI, CNRI, BeOpen.com, Zope Corporation and a cast of thousands
    for supporting Python development. See www.python.org for more information.
    >>>

This just gives us a list of acknowledgements and the URL of the Python website. license gives us licensing information and help gives us a very extensive collection of Python help. Remember this if you ever want more information.

1.2.1 Some basic python commands

Now that you have the Python interpreter running interactively, you can type in Python commands and have them executed instantly. Try typing print "Hello Python!":

    >>> print "Hello Python!"
    Hello Python!
    >>>

You shouldn’t be surprised by what it does. Now try:

    >>> print 4+9
    13
    >>>

This is a less obvious output - why did it print 13 and not 4+9? This will all be made clear in Chapter 2, but for now compare this result with:

    >>> print "4+9"
    4+9
    >>>

And what would you expect the result of print 2+3*4 to be? 20? 14? Try it and see before a full explanation in Section 2.5.1.
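The experiments above can also be collected into a short script. This is only a sketch: it assumes a Python 3 interpreter, where print is a function and needs brackets, rather than the Python 2 shown in the sessions, and the variable names are made up for illustration.

```python
# Assumes Python 3, where print is a function and needs brackets.
sum_result = 4 + 9        # an expression: Python evaluates it before printing
text_result = "4+9"       # a string: printed exactly as written

print(sum_result)
print(text_result)

# Brackets make the order of evaluation explicit.
multiply_first = 2 + (3 * 4)
add_first = (2 + 3) * 4
print(multiply_first)
print(add_first)
```

Comparing the output of print(2 + 3 * 4) with the last two lines shows which grouping Python chooses when the brackets are left out.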

1.3 Turtle graphics

One of the things that makes Python so powerful is the amount of code that comes with the language. This large library of code can be used by any programmer, at any time, and means that you don’t have to write every program from scratch. For example, if you are writing a program to manage a database, Python already comes with libraries which allow you to create and manipulate databases, so you don’t have to write your own. If you want to build a graphical user interface, Python comes with code to do that too. In fact, Python has code which can help you in almost any situation: in communicating over the Internet, writing a webpage, drawing pictures, and most other things you can think of.

Rather than having all these libraries of code available all the time (which would use lots of memory!), programmers can ask Python to fetch just the code they want to use. Python keeps each piece of available code in a module, and you can give Python a command (just like the print command) to tell Python to find and load the module you want. This command is called import.
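As a small illustration of import, the sketch below loads the standard math module, chosen here only because it needs no graphics window. It assumes Python 3 syntax for print.

```python
import math    # ask Python to find and load the math module

# Names that live inside a module are reached as module.name.
root = math.sqrt(16)
print(root)
print(math.pi)
```

Until the import line has run, the name math means nothing to Python, which is why import always comes first.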

Seymour Papert’s Logo

In 1966 Seymour Papert and Wally Feurzeig designed Logo, a simple language intended for studying artificial intelligence. The earliest turtles were radio-controlled robots (called “Irvine”) which had a pen, a bell and touch sensors, and could move forwards and backwards and rotate left and right. Logo proved to be a fun language to write programs in and one which was very easy for beginners to pick up. Papert went on to use Logo as an educational tool, first at Muzzey Jr. High, Lexington, MA and later at other schools and institutions. There are currently many implementations of Logo which you can download and try out, including UCBLogo and MSWLogo. Some variants, including MIT Starlogo, allow you to create thousands of turtles at once. When Guido van Rossum wrote Python, he wanted to include a module that replicated Logo and was equally fun and simple to use. This became the turtle module (introduced in Python 1.5.2) that we’re using in this chapter.

In this Section, we will be using a module called turtle to draw simple pictures by moving a robot (called a turtle) around the screen. The listing below shows how to do this. We just start the Python interpreter and give Python the command import turtle. Python does whatever is necessary to import the turtle module, then gives us a prompt so that we can keep giving instructions:

    Python 2.4 (#1, Nov 30 2004, 11:25:14)
    [GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2
    Type "help", "copyright", "credits" or "license" for ...
    >>> import turtle

Next, we want to do something exciting with the turtle module. Rather than drawing our own pictures, we can ask the turtle to show us what it can do. We can do this by typing turtle.demo() in the interpreter:

    Python 2.4 (#1, Nov 30 2004, 11:25:14)
    [GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2
    Type "help", "copyright", "credits" or "license" for ...
    >>> import turtle
    >>> turtle.demo()

and Figure 1.1 shows the result. As you can see, the turtle can draw lines of different thickness, in different colours, as well as write text and fill in shapes.

Figure 1.1: Running the demo() function in the turtle module

Running the demo in the turtle module is not like issuing print and import commands. Firstly, we had to say turtle.demo() rather than just demo(), which told Python that demo is in the turtle module, not anywhere else. When we tell the turtle to do other things for us, we will also have to say turtle.DOSOMETHING to tell Python that it’s the turtle that should be acting for us and not any other module. Secondly, we had to say turtle.demo() rather than turtle.demo. The brackets tell us that turtle.demo() is not a command like print and import. In fact, turtle.demo() is a function. The brackets give us space to tell turtle.demo() any extra information it needs to do its job. In this case, turtle.demo() doesn’t need any extra information (called arguments); it just runs a demo and finishes. However, there are lots of functions in the turtle module which do take arguments and we’ll meet some of these next.
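The difference between naming a function and calling it can be seen without opening a graphics window. In this sketch the standard os module stands in for turtle, which is an assumption made purely so the example runs anywhere. It assumes Python 3, and the variable names are illustrative.

```python
import os

# Assumes Python 3. The os module stands in for turtle here so that the
# example runs without opening a graphics window.
named_only = os.getcwd          # no brackets: refers to the function, runs nothing
current_dir = os.getcwd()       # empty brackets: call it with no arguments
program = os.path.basename("/usr/bin/python")   # one argument inside the brackets

print(named_only)
print(current_dir)
print(program)
```

The first print shows a description of the function itself, while the other two show the values that the calls gave back.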

1.3.1 Basic shapes

So, let’s get the turtle to draw a shape for us. First of all, we need to clear the screen to get rid of all the pictures that the demo drew. To do this we call another function called turtle.reset(). Like turtle.demo(), turtle.reset() is a function that takes no arguments:

    Python 2.4 (#1, Nov 30 2004, 11:25:14)
    [GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2
    Type "help", "copyright", "credits" or "license" for ...
    >>> import turtle
    >>> turtle.demo()
    >>> turtle.reset()

and the clear turtle window is shown in Figure 1.2.

Figure 1.2: Result of calling turtle.reset()

Now we have a clear screen, we can draw some pictures of our own. Let’s start with something simple: a square.

    >>> turtle.forward(100)
    >>> turtle.left(90)
    >>> turtle.forward(100)
    >>> turtle.left(90)
    >>> turtle.forward(100)
    >>> turtle.left(90)
    >>> turtle.forward(100)

To draw a square, we have used two functions with arguments: turtle.forward() and turtle.right(), which move the turtle forwards and turn it clockwise. The argument to the turtle.right() function tells the turtle how many degrees to rotate by. So, if we want the turtle to rotate 45° clockwise then we say turtle.right(45). The argument to the turtle.forward() function tells the turtle how far to move forward. This number is measured in pixels. In computer graphics, monitors and other displays are divided into small squares called pixels. Common sizes for computer desktops are 800×600, or 1024×768 pixels. You can see how big a pixel is on your computer by clearing the turtles screen (by calling the turtle.reset() function) then telling the turtle to move 1 pixel forward. As you might have guessed, the turtle module also provides us with functions called turtle.backward() and turtle.left() so that we can move the turtle backwards and turn it anti-clockwise. However, it also provides us with lots of other useful functions and we will look at some of these next.


Figure 1.3: Drawing a square

1.3.2

Other functions in the turtle module

First of all, we can change the colour of the turtle’s pen by calling the turtle.color() function. Notice that this is spelt in the American style: “color”, not “colour”. Then we need to move forward a little to make sure the change of colour has worked:

    >>> turtle.reset()
    >>> turtle.color("Red")
    >>> turtle.forward(10)

The turtle also allows us to move the pen up and down, as if we were lifting a real pen off a piece of paper. Sensibly, these functions are called turtle.up() and turtle.down(). We can use turtle.up() and turtle.down() to draw a dashed line:

    >>> turtle.up()
    >>> turtle.forward(10)
    >>> turtle.down()
    >>> turtle.forward(10)
    >>> turtle.up()
    >>> turtle.forward(10)
    >>> turtle.down()
    >>> turtle.forward(10)

Although we have to draw other shapes by hand, the turtle can also draw circles for us automatically, using the turtle.circle() function:

    >>> turtle.reset()
    >>> turtle.circle(50)


turtle.circle() is the first function we have looked at that can take more than one argument. The first argument (we used 50 in the example above) tells the turtle what the radius of the circle should be. Try experimenting with different values to see how the radius changes the size of the circle. The second argument, which is optional, tells the turtle how much of the circle to draw. If you don’t specify a number (like we didn’t above) then the turtle assumes that you want all 360° of the circle. However, we can tell the turtle to only give us, say, half a circle, like this:

    >>> turtle.reset()
    >>> turtle.circle(50, 180)

Figure 1.4: Drawing half a circle

Figure 1.4 shows the result. Try experimenting with different values and see what results you get.

1.4

Algorithms

An algorithm is a sequence of instructions which can be used to carry out a specific task. You already know all about algorithms from your daily life. For example, if you ask someone how to get from your home to the nearest console games shop, the answer will be an algorithm. This sort of algorithm might be something like “go down the road, turn left at the traffic lights, take the third exit at the roundabout and the shop is on your left”. It is important that you follow the instructions in the correct order. If you try to take the third exit at the roundabout before you turn left at the traffic lights, you’ll be very confused - probably because the roundabout won’t be where you expect!


Algorithms

The term algorithm is named after the 9th Century Persian mathematician Abu Abdullah Muhammad bin Musa al-Khwarizmi. Originally, algorithms (or algorisms as they were called) were rules for performing arithmetic using Arabic numerals. By the 18th Century, the word algorism had become algorithm, which refers to a general sequence of commands to carry out a specific task. The first algorithms written for a computer were Ada Byron’s programs for Charles Babbage’s Analytical Engine, in 1842. Ada Byron is often thought of as the first computer programmer and a language called Ada was named after her.

The order of instructions matters just as much in the algorithms you will write in Python. If we had tried to call turtle.reset() before we told Python to import the turtle module, we would have received this error:

    Python 2.4 (#1, Nov 30 2004, 11:25:14)
    [GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2
    Type "help", "copyright", "credits" or "license" for ...
    >>> turtle.reset()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    NameError: name 'turtle' is not defined
    >>>

Here, Python is telling us that it doesn’t have anything called turtle to refer to, so it can’t run the turtle.reset() function for us. Sometimes, however, getting an algorithm wrong might not result in Python giving us an error message - we might just get unexpected results. One algorithm we’ve already looked at is the sequence of instructions needed to draw a square:

    Python 2.4 (#1, Nov 30 2004, 11:25:14)
    [GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2
    Type "help", "copyright", "credits" or "license" for ...
    >>> import turtle
    >>> turtle.demo()
    >>> turtle.reset()
    >>> turtle.forward(100)
    >>> turtle.left(90)
    >>> turtle.forward(100)
    >>> turtle.left(90)
    >>> turtle.forward(100)
    >>> turtle.left(90)
    >>> turtle.forward(100)


What would happen if we had entered these instructions in the wrong order? The following program is a re-arranged version of our square algorithm:

    Python 2.4 (#1, Nov 30 2004, 11:25:14)
    [GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2
    Type "help", "copyright", "credits" or "license" for ...
    >>> import turtle
    >>> turtle.forward(100)
    >>> turtle.left(90)
    >>> turtle.forward(100)
    >>> turtle.left(90)
    >>> turtle.forward(100)
    >>> turtle.forward(100)
    >>> turtle.left(90)

Here, we have all the right instructions, just in the wrong order! Figure 1.5 shows the result. We haven’t had an error message from Python, but the shape we have drawn isn’t a square. This is sometimes called a semantic error, that is, an error in the meaning of the program. In this case, our instructions to Python ought to mean “draw a square”, but they don’t, because we wrote them out in the wrong order.

Figure 1.5: Not really a square!
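One way to see why the order matters, without opening a turtle window, is to trace on paper (or in code) where the turtle ends up. The sketch below is written for Python 3 and is not part of the turtle module: the trace() helper and the (name, amount) way of writing commands are made up for illustration. It assumes the turtle starts at (0, 0) facing along the x axis and that left() turns anti-clockwise. It also uses a loop and a function definition, which are covered properly in later chapters.

```python
import math

def trace(commands):
    """Return the points visited, starting at (0, 0) and facing east."""
    x, y, heading = 0.0, 0.0, 0.0
    points = [(0, 0)]
    for name, amount in commands:
        if name == "forward":
            # Move `amount` pixels in the direction we are facing.
            x += amount * math.cos(math.radians(heading))
            y += amount * math.sin(math.radians(heading))
            points.append((round(x), round(y)))
        elif name == "left":
            # Turn anti-clockwise by `amount` degrees.
            heading = (heading + amount) % 360
    return points

# The correct square: forward, left, forward, left, forward, left, forward.
square = [("forward", 100), ("left", 90)] * 3 + [("forward", 100)]

# The same instructions in the re-arranged order used above.
reordered = [("forward", 100), ("left", 90), ("forward", 100), ("left", 90),
             ("forward", 100), ("forward", 100), ("left", 90)]

print(trace(square)[-1])     # (0, 0): back where it started
print(trace(reordered)[-1])  # (-100, 100): the shape never closes
```

Both lists contain exactly the same instructions, and neither produces an error; only the final position reveals that the second one no longer means “draw a square”.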

For a small algorithm that draws a square, it should be obvious where errors have occurred and how we can correct our code. However, by the time you get halfway through the chapters in this book, the algorithms you will be writing will be far more complicated and difficult to fix (or debug). Therefore, it is extremely important to plan your programs carefully, and make sure you know what they ought to do, before you write any code. Next, we’ll go through the development of a simple algorithm, to show you how this can be done.

Origins of bug and debug

Admiral Grace Hopper was an early pioneer of computing and invented the compiler in 1952 and later the programming language COBOL. Hopper used to tell a story (which she did not witness) about a technician who fixed a “bug” in the Harvard Mark II machine by pulling an insect out of the contacts of one of its relays. Since then, the terms bug, meaning a mistake in a computer program, and debug, meaning to fix a bug, have become widely used. For many years the logbook associated with the incident and the bug in question (a moth) were displayed at the Naval Surface Warfare Center (NSWC). However, the term bug was also used in Edison’s time to mean a fault in electrical apparatus and even in the early days of telegraphy.

1.4.1

Designing an algorithm: square spirals

Figure 1.6: A square spiral

Figure 1.6 shows a square spiral. Our task is to draw a similar spiral using the Python turtle module. Firstly, we need to fully understand the result we want to achieve. This might seem obvious, but it’s amazing how many software projects fail because the engineers writing the product never fully understood what their client wanted. We also know from our experience as teachers that some students don’t take the time to understand their coursework briefs and exam questions, which causes them to lose marks.

Figure 1.6 shows the shape we’re aiming to produce. We know we want to draw a spiral, but what does this involve? If we look closely at Figure 1.6 we can see that a spiral is made up of straight lines and corners. Are these left-hand corners or right-hand corners? Look carefully at the centre of the spiral and trace it round and you should see that all the corners are left-hand turns. So, our program is going to be made up of calls to the turtle.forward() and turtle.left() functions. We can see that all the corners are right-angles, so the turtle needs to turn left by 90° on each corner.

How far forward should the turtle move? Looking carefully at the spiral, we can see that each part of it is almost like a square. If we increase the side length of each “square” too quickly, the shape will spiral out faster than we want. If we increase the side length too slowly, the lines of the spiral will meet and we will end up with a square, not a spiral at all! So, we need to think carefully about how to create the shape we want. The easiest way to do this is to just draw some shapes on a piece of paper, and see what we get (or, if we have more time, to experiment with the turtle). Try this now before you read on any further and see if you can work out for yourself what the square spiral algorithm might be
(and if your algorithm turns out to be different from mine, try yours out with the turtle and see what shape it makes). Figure 1.7 shows three sides of a square. Each side is the same length. If we trace round this shape (starting from the lower left hand corner) we can see that the next side to be drawn will meet the first one and form a square. We don’t want that, so the third side needs to be longer than the first two, to make a spiral.

Figure 1.7: Part of a square spiral

So, our final square spiral algorithm will be something like this:

    Move forward by some amount.
    Turn left 90°
    Move forward by some amount.
    Turn left 90°
    Move forward by a bit more than last time
    Turn left 90°
    ... keep going

Or we might like to write this a bit more concisely:

    n = 10
    Repeat:
        Move forward by n pixels.
        Turn left 90°
        Move forward by n pixels.
        Turn left 90°
        n = n + 10

Here, we have said that we will move forward by n pixels, where n starts off being 10, then we turn left, move forward by n pixels again, turn left, add 10 to n (so now n is 20), then repeat that whole process. This algorithm gives us a good idea of what our Python code will look like. A Python version of our algorithm is given below and the output of this program is given in Figure 1.8.

    Python 2.4 (#1, Nov 30 2004, 11:25:14)
    [GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2
    Type "help", "copyright", "credits" or "license" for ...
    >>> import turtle
    >>> turtle.forward(10)
    >>> turtle.left(90)
    >>> turtle.forward(10)
    >>> turtle.left(90)
    >>> turtle.forward(20)
    >>> turtle.left(90)
    >>> turtle.forward(20)
    >>> turtle.left(90)
    >>> turtle.forward(30)
    >>> turtle.left(90)
    >>> turtle.forward(30)
    >>> turtle.left(90)
    >>> turtle.forward(40)
    >>> turtle.left(90)
    >>> turtle.forward(40)
    >>> turtle.left(90)
    >>> turtle.forward(50)
    >>> turtle.left(90)
    >>> turtle.forward(50)
    >>> turtle.left(90)
    >>> turtle.forward(60)

Figure 1.8: Drawing a square spiral with the turtle

Notice that the Python implementation of our algorithm seems to be a lot longer than the informal description of the algorithm! In Chapters 5 and 6 we will be looking at how Python can repeat instructions for us and help us cut down on our typing. Also, although we have started moving forward by 10 pixels and have added 10 after every second move forward, we could have chosen other numbers instead. In the exercises at the end of this chapter, you will be asked to play with this and see what sort of spirals you can draw using different numbers. Then in Section 2.6 we will be looking at how Python can remember numbers for us (so we don’t have to keep typing them in) and in Chapter 10 we will see how to write our own functions, which will make our programs much smaller and reusable, just like the functions we’ve used in the turtle module.

As you continue learning Python, you will see many more examples of algorithm development, usually for much more complicated algorithms! Pay attention to these, because you will soon be writing your own algorithms, even in the exercises to this first chapter. The more practice you have at designing different programs to solve different problems, the easier you will find it.
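As a preview of where those later chapters lead, here is a minimal sketch of the concise “Repeat” version of the spiral algorithm, written for Python 3. The names spiral_sides() and draw_spiral() and their parameters are made up for this illustration; they are not part of the turtle module. The first function only works out the list of forward distances, so it can be run and checked without a display.

```python
def spiral_sides(first=10, step=10, pairs=5):
    """List the forward distances for a square spiral.

    Each length is used twice (with a left turn after every move)
    before it grows by `step`, as in the informal algorithm.
    """
    sides = []
    n = first
    for _ in range(pairs):
        sides.append(n)   # Move forward by n pixels, turn left
        sides.append(n)   # Move forward by n pixels, turn left
        n = n + step      # n = n + 10
    return sides

def draw_spiral(sides):
    """Draw the spiral with the turtle (needs a display, so not called here)."""
    import turtle
    for length in sides:
        turtle.forward(length)
        turtle.left(90)

print(spiral_sides())  # [10, 10, 20, 20, 30, 30, 40, 40, 50, 50]
```

On a machine with a screen, calling draw_spiral(spiral_sides()) should produce a spiral like the one in the interactive session; changing first and step is one way to try the different numbers mentioned above.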

1.5

Python programs in files

So far, we’ve been typing out programs line by line and having them executed as we go. Now imagine a program with hundreds or thousands of lines - we obviously need a way to avoid typing these programs into the interpreter every time we want to run them. Storing a python program for later execution is actually very simple in Unix:

1. We write the instructions into a file as text, exactly as we typed them into the python shell.
2. We add a line to the top of the file that informs the OS of how to execute it.
3. We tell the OS that we want to be able to execute the file.

We’ll look at each of these in turn.

1.5.1

Editing files

Program files in python are simply text files containing python statements. A compiled language would then translate (compile) these instructions into a second file (and possibly intermediate files) that could be executed. Python is an interpreted language, which means that there is no separate compilation step for you to run - python translates the file as it is executed. So editing python source files is just like editing any plain text document - pick an editor and start typing. Unix and python don’t really care what you call the file itself, but we generally call it something sensible and end it with .py so we know that it’s a python file.

1.5.2

#!

While this might look like a disguised expletive, it is actually the start of the line that tells the OS which interpreter should be used to execute the rest of the file.

How to say #! Most people pronounce these two characters as “sh-bang”, as in “The whole #!”.

The first line of a stored python program will look something like:

    #!/usr/bin/python

or

    #!/usr/bin/perl

or

    #!/usr/local/bin/perl

...depending on the interpreter you want to use (we want python, obviously) and its location. How can we find out where our interpreter is? Type the following into the shell (not the Python shell):

    $ which python

You should see the path to the python interpreter displayed on the line below. In my case, this is /usr/bin/python, but it could be different on your machine. There is a slight drawback to this: on a different machine, the python interpreter might be in a different place. On most modern Unix systems, there’s a much more portable way to achieve the same result. The program env can be used to run the python interpreter without explicitly giving its location. If you have this program (which env will tell you if you have it, and where), then the first line can be written as:

    #!/bin/env python

If which env reports a different location, such as /usr/bin/env, use that path instead. So, our file should look something like this:

Listing 1.1: Hello World! in a file

    #!/bin/env python

    print "Hello World!"

...and saved with a sensible name, like hello.py.

1.5.3

chmod

Now we need to tell the operating system that our file is executable. Assuming you’ve saved the file as hello.py, you would type:

    $ chmod u+x hello.py

The chmod utility changes the permissions of a file. The u+x part of the command states that the owner of the file (u for user) should be able to execute it (x for eXecute). For more information on chmod, try man chmod in your normal shell.
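To make the meaning of u+x a little more concrete, the sketch below does the same thing from inside Python 3 using the standard os and stat modules. It assumes a Unix-like system, and it works on a throwaway temporary file standing in for hello.py rather than on a real program. It is only an illustration of what the permission change means; the chmod command above is all you actually need.

```python
import os
import stat
import tempfile

# Create a throwaway file to stand in for hello.py.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
    handle.write('print("Hello World!")\n')
    path = handle.name

before = os.stat(path).st_mode

# Equivalent of `chmod u+x`: keep the existing permission bits
# and switch on the owner-execute bit.
os.chmod(path, before | stat.S_IXUSR)
after = os.stat(path).st_mode

# stat.filemode() renders the bits in the same style as `ls -l`.
print(stat.filemode(before), "->", stat.filemode(after))

os.remove(path)  # tidy up the temporary file
```

The only difference between the two printed permission strings should be an x appearing in the owner’s position; the group and other permissions are left alone, which is exactly what the u in u+x asks for.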


1.5.4


Executing the program

At the prompt, simply enter:

    $ ./hello.py

The ./ prefixing the file name just tells the OS that the file is right here. You should see the output of your program. Running the program again will give you the same results, unless you change the contents of the source file.

1.6

Some basic Python

Now that we’ve covered storing and executing programs, try typing, saving and executing Listing 1.2, the birthday calculator. Read the code and try to see how it works, but don’t worry, we’ll be covering all of the concepts used in this program later on. For now, just start getting used to how python looks.

Listing 1.2: bday.py

    #!/bin/env python

    from time import gmtime

    # Ask for the date of birth, one number at a time
    print "Enter the year of your birth (e.g. 1979):",
    y = input()
    print "Enter the month of your birth (e.g. 2 for February):",
    m = input()
    print "Enter the day of your birth (e.g. 20):",
    d = input()

    # The first three values from gmtime() are today's year, month and day
    nowy, nowm, nowd = gmtime()[0:3]

    # Differences in years, days and months
    ny = nowy - y
    nd = nowd - d
    nm = nowm - m

    # Rough total: treat every month as 30 days and every year as 365
    nd = nd + (nm*30) + (ny*365)

    print d, "/", m, "/", y, "was roughly", nd, "days ago."
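The word “roughly” in the output is there because bday.py pretends that every month has 30 days and every year has 365. As a hedged comparison, the Python 3 sketch below puts that estimate next to an exact count made with the standard datetime module. The function names rough_days() and exact_days() are made up for this example, and a fixed “today” is used instead of the real clock so that the result is repeatable.

```python
from datetime import date

def rough_days(born, today):
    """The estimate used in bday.py: 30-day months and 365-day years."""
    return ((today.day - born.day)
            + (today.month - born.month) * 30
            + (today.year - born.year) * 365)

def exact_days(born, today):
    """Subtracting two dates gives a timedelta; .days is the exact count."""
    return (today - born).days

born = date(1979, 2, 20)
today = date(2005, 2, 20)   # fixed date, chosen only for illustration

print(rough_days(born, today))  # 9490
print(exact_days(born, today))  # 9497
```

The two answers differ by 7 days here because seven leap days (1980 to 2004) fall between the two dates, and the rough formula ignores them.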

1.7

Further reading

• Sections 1 and 3 from the Python tutorial http://www.python.org/doc/tut/tut.html
• Wikipedia article on Seymour Papert’s Logo language http://en.wikipedia.org/wiki/Logo_programming_language
• Wikipedia article on algorithms http://en.wikipedia.org/wiki/Algorithm
• Jargon file entry on the term bug http://catb.org/~esr/jargon/html/B/bug.html
• Guido van Robot - a very simple introduction to programming in a Python-like language http://gvr.sourceforge.net/

1.8

Glossary

Algorithm: a sequence of instructions carried out to perform a specific task.
Bug: a mistake or error in a program.
Command: an instruction that tells Python to do something for you (like print something to the terminal or import a module). Unlike functions (such as turtle.reset()) commands are not followed by brackets.
Debug: to fix the mistakes in a program.
Function: a piece of code which performs a specific task. For example, turtle.reset() resets the turtle window. Unlike commands, functions always end with brackets and some functions can be given extra information by placing values inside their brackets. For example, calling turtle.circle(100, 270) causes the turtle to draw 270° of a circle with radius 100 pixels.
import: commands Python to find and load a module.
Interpreter: something which interprets your Python programs and carries out your instructions.
Module: an independent piece of code which you can ask Python to load for you using the import command. To run functions in a module, you need to tell Python the name of the module, followed by a dot, then the name of the function (with any arguments). For example turtle.circle(100).
Pixel: the smallest possible square of space on a monitor. Each pixel has its own brightness and colour. On most displays, pixels are so small that they appear to merge into a smooth image.
print: commands Python to print something to the terminal.
Semantic error: a mistake in the meaning of a program.
Shell: an interface to the functions of an operating system, application or language.

1.9

Introductory Studio

1. Draw a red circle, then a yellow circle underneath it and a green circle underneath that.
2. Use the turtle module to draw a regular hexagon. Remember that regular hexagons have six sides, of equal length, which meet at an angle of 120°.
3. The turtle module provides a function to draw circles. If you were implementing this function yourself, what algorithm would you use to draw a circle? Hint: remember there are 360° in a circle!
4. Try drawing your own square spiral. Don’t move forward by ten (then ten more after two moves forward) like I did, try out different values. Do all values make a spiral? Which values make the nicest looking spiral?
5. Type, save and execute bday.py.

1.9.1

Key Assignment

Look at the house shape in Figure 1.9. Can you use the turtle to draw this shape again without going over any of the lines more than once? It will help you to figure out what all the lengths and angles in the house shape should be, before you start. Notice that the triangle that forms the roof is a right-angled triangle. So, to work out the lengths of the two slopes which form the roof you can use Pythagoras’ theorem: $\text{hypotenuse} = \sqrt{x^2 + y^2}$, where hypotenuse is the longest side of the triangle and x and y are the lengths of the other two sides. To find the square root of a number in Python, you first need to import a module called math, then call the function math.sqrt(). So, if you want to find the square root of the number 2, you would import the math module then say math.sqrt(2).
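As a small worked example of the formula (deliberately not using the measurements of the house), here is Pythagoras’ theorem applied to a triangle whose shorter sides are 3 and 4. The sketch is written for Python 3, and the side lengths are simply chosen because they give a whole-number answer.

```python
import math

# The two shorter sides of a right-angled triangle (illustrative values).
x = 3
y = 4

# hypotenuse = square root of (x squared + y squared)
hypotenuse = math.sqrt(x * x + y * y)

print(hypotenuse)     # 5.0
print(math.sqrt(2))   # roughly 1.414
```

Notice that math.sqrt() always gives back a number with a decimal point, even when the answer is a whole number.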


Figure 1.9: House shape

Chapter 2

Python basics

This chapter gives you an introduction to the basics of programming and python. We’ll be covering many new concepts and new terms, but don’t worry - they are all fairly simple. We’ll be typing the commands directly into the Python shell even though this isn’t exactly how we write a program. The commands are the same, but a program is something stored that we can execute. Later, once the commands have been introduced, we’ll start writing reusable programs as discussed in Section 1.5.

Learning outcomes

After reading this chapter and completing the exercises, you will be able to:

• Understand statements, expressions, types and variables.
• Assign values to variables and use them in expressions.
• Create a simple program and enter it into the interpreter.

2.1

Statements

A statement is an instruction. Usually, this is one line of code. Each statement is executed in order until the end of the program. Later, we’ll see how we can partition our programs and group statements to do more complex tasks, but for now, we’re just writing statements. Here is a statement in Python:

    print "Programming is fun."

This is a single instruction to Python. If we typed this into the python shell, we’d see:

    Programming is fun.


2.2

Commands

Commands are one of the components of statements. In the printing example, the “print” part of the statement is the command - it tells python to print whatever follows it to the screen.

2.3

Literals

Literals are the pieces of our statements that are literally what we type in. In the printing example, this was the text “Programming is fun.” So, our first line of code is a statement made up of a command and a literal. Other literals might be 12, 144.3 or “A”.

2.4

Type

Programming languages are usually “typed”. This doesn’t refer to them being typed in at a keyboard. Data, to most programming languages, can be of one of a number of types. Our example statement had a literal that was of type string - it is a string of characters. String literals in Python are surrounded by double or single quote marks, so the statements:

    print "James"
    print 'James'

are both the same. The literal 120 is of type integer. That is, it’s a whole number. Notice that there are no quotes around this literal. Why?[1] Finally, for now, there is the float type. Floats are floating-point numbers, like 1.2, 0.3333 and 99.9. Again, no quotes.

Python gives us a way to determine the type of something. Entering type(12) will display <type 'int'>. This tells us that 12 is an integer. Try this on the other types we have covered.
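The short sketch below tries type() on one literal of each kind mentioned so far. It is written for Python 3, where print is a function and the answers are displayed as <class 'int'> rather than the <type 'int'> shown by Python 2.4; the types themselves are the same.

```python
print(type(12))        # <class 'int'>
print(type(1.2))       # <class 'float'>
print(type("James"))   # <class 'str'>

# Single and double quotes make exactly the same kind of value.
same_type = type("James") is type('James')
print(same_type)       # True

# Putting quotes around a number turns it into a string.
print(type("120"))     # <class 'str'>
```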

Other types

Everything in Python, except commands, has a type associated with it. The output from type("abcde"), for example, has a type. Find out what it is by entering type(type("abcde")).

[1] Answer: because being surrounded by quotes would make it a string.


2.5


Expressions

Going back to our print statement, we could now say that the format or syntax for a print statement is something like:

    print <literal>

But this is not quite correct. We can print more than just literals. The syntax is (simplified):

    print <expression>

An expression is something that can be evaluated to give a single value. The simplest expressions are just literals, so our example print statement still follows this rule. Putting it another way, the literal “Yadda yadda yadda” evaluates to the string “Yadda yadda yadda” and the literal 12 evaluates to 12. So why have two names? Well, expressions are not always literals. Here are some more expressions:

    12+9                    (Evaluates to 21)
    "Leaping"+"Lizards"     (Evaluates to "LeapingLizards")
    91.5+8.5
    1*10+3
    1+10*3
    "43"+"7"
    "43+7"

2.5.1

Order of precedence

What do you expect to be the output of the following statement?

    print 1+2*3

If you think 7, then you probably already know what this section is about. The reason the answer is not 9 is because of the order of precedence. It is likely that you learnt the acronym “BODMAS” at school as a way to remember the order of precedence: Brackets, Order, Division, Multiplication, Addition, Subtraction. This means that brackets will always be evaluated before subtraction, or multiplication before addition, for example. If we wanted to force our addition to be evaluated first, making the answer 9, then we could put brackets around it:

    print (1+2)*3

Python has more operators than these six, however. Python’s order of precedence is listed in the reference manual, available on-line at the URL given in the further reading at the end of the chapter.
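To check your predictions, the sketch below evaluates the example expressions from this section, including the two precedence examples. It is written for Python 3, so print is called with brackets; the values are the same as in Python 2.4.

```python
# Numbers are added; strings are joined end to end.
print(12 + 9)                  # 21
print("Leaping" + "Lizards")   # LeapingLizards
print(91.5 + 8.5)              # 100.0
print("43" + "7")              # 437 (two strings joined, not a sum)
print("43+7")                  # 43+7 (a single string literal)

# Multiplication is done before addition, wherever it appears.
print(1 * 10 + 3)              # 13
print(1 + 10 * 3)              # 31
print(1 + 2 * 3)               # 7

# Brackets force the addition to happen first.
print((1 + 2) * 3)             # 9
```

Note the difference between "43"+"7" and "43+7": the first is an expression that joins two strings, while the second is just one string that happens to contain a plus sign.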


2.6

Variables and assignment

We can now do some mathematical expressions, concatenate strings and print things. But what if we wanted to store the result of an expression for later use? We don’t want to have to remember it and then type it in again. Variables allow us to hold a value and assign a name to it. Here is the syntax for assigning a value to a variable:

    <variable> = <expression>

And here is an example:

    a = 12

This statement assigns the value 12 to the variable named “a”. Here’s another example:

    b = 44/4

Which leaves the variable “b” holding the value 11. Now for the good part: variables can be used in expressions! So, this:

    a = 10
    print a

Prints “10” on the screen. The important thing to note is that variables in an expression evaluate to the value they hold. Here’s a more interesting example:

    value = 5
    square = value * value
    print value, "squared is", square

Notice how we’ve put 3 expressions after the print command, separated by commas. This causes the print command to print all three evaluated expressions with spaces between them. Here is a piece of code that does the same thing:

    value = 5
    print value, "squared is", value * value

And so does this:

    print 5, "squared is", 25

Which is even smaller, but less useful. Why? Imagine if we wanted to change the value that is squared - we’d need to recalculate the answer ourselves. Can you think of another not-very-useful way to write this?
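The sketch below shows why the version with variables is the more useful one: changing a single number is enough, and Python recalculates the rest. It is written for Python 3, which also brings one difference worth knowing about: there, 44/4 gives the float 11.0, whereas Python 2.4 gives the integer 11 described above. The variable name c is just for this illustration.

```python
value = 5
square = value * value
print(value, "squared is", square)   # 5 squared is 25

value = 7                  # change one number...
square = value * value     # ...and let Python recalculate the answer
print(value, "squared is", square)   # 7 squared is 49

b = 44 / 4     # 11.0 in Python 3 (a float)
c = 44 // 4    # 11, a whole-number division
print(b, c)
```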


2.7


Currency conversion

Now for an example that puts all of these new concepts together to do something useful. We’re going to convert British pounds into American dollars. Remember that our job as a programmer is to decide on the process required to achieve the required result, so let’s be clear on what that is:

1. Assign some numeric value to a variable called “pounds”.
2. Assign the value 1.75 to the variable “conv”.
3. Convert the pounds to dollars, by multiplying the two variables, and assign the result to the variable “dollars”.
4. Print the result to the screen as part of a sensible message.

Now let’s see how that is done in python:

    1 pounds = 10
    2 conv = 1.75
    3 dollars = pounds * conv
    4 print pounds, "GBP is", dollars, "USD."

And the output should be:

    10 GBP is 17.5 USD.

Which is correct. Does this mean that the program is completely correct?

2.8

Functions

Now that we’ve seen plenty of the basic building blocks of a program, let’s quickly look at a way to group sets of instructions together into repeatable actions. We’ll look at functions here in just enough detail to let you use them for simple tasks. Later, in Chapter 10, we’ll cover them in much more detail. Our currency conversion application is quite useful. In fact, it’s so useful that we might want another one to convert to Hong Kong dollars, one for Euros, one for Rupees, etc. We could just write the same code many times, but that would be a waste of effort. Instead, we should take the important piece of code (the actual conversion) and make it reusable. In fact, once we do this, we can use the code in more than just simple conversion applications - we might write a calculator application that could also use this functionality, or maybe a web app for finding holiday destinations.


2.8.1


Currency conversion done properly

The important code in the currency converter should be something like this:

    dollars = pounds * conv
    print pounds, "GBP is", dollars, "USD."

Looking at this piece of code, we can see that there are two parts that are specific to converting pounds to dollars. Let’s change the dollars variable to result and replace the string "USD." with the variable target_c (target currency). Now we have:

    result = pounds * conv
    print pounds, "GBP is", result, target_c

This assumes that pounds, conv and target_c have values assigned to them. This is where functions become useful. We can put this code into a function that requires these pieces of information.

    def convert(pounds, conv, target_c):
        result = pounds * conv
        print pounds, "GBP is", result, target_c

The code above defines a function called convert, which requires a value in pounds, a conversion rate and a name for the target currency. Provided with these three pieces of information, it will print a message containing the conversion. Let’s see how this is used:

    def convert(pounds, conv, target_c):
        result = pounds * conv
        print pounds, "GBP is", result, target_c

    convert(10, 1.75, "USD")

The biggest benefit of moving the code into the function is that we can reuse it at any time. Look at the example below, which prints a table of conversions for travellers:

    def convert(pounds, conv, target_c):
        result = pounds * conv
        print pounds, "GBP is", result, target_c

    convert(1, 1.75, "USD")
    convert(2, 1.75, "USD")
    convert(5, 1.75, "USD")
    convert(10, 1.75, "USD")
    convert(20, 1.75, "USD")
    convert(50, 1.75, "USD")
    convert(100, 1.75, "USD")

Notice how we have used the same piece of code to convert many values. We could use this same function to convert to any currency.
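To illustrate that last point, here is a hedged Python 3 variation on convert. It differs from the function above in one deliberate way: it hands the message back with return instead of printing it, which makes the result easier to reuse and to check. The euro rate of 1.5 is made up purely for the example and is not a real exchange rate.

```python
def convert(pounds, conv, target_c):
    """Return the conversion message instead of printing it."""
    result = pounds * conv
    return "{} GBP is {} {}".format(pounds, result, target_c)

# Same function, different currencies: only the arguments change.
print(convert(10, 1.75, "USD"))   # 10 GBP is 17.5 USD
print(convert(2, 1.5, "EUR"))     # 2 GBP is 3.0 EUR (made-up rate)
```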


2.9


Further reading

• An Informal Introduction to Python. Part of the Python tutorial, available at: http://docs.python.org/tut/node5.html. Read sections 3.1 and 3.2.
• The Python order of precedence. http://www.python.org/doc/current/ref/summary.html.

2.10

Glossary

Statement: A statement is one complete program instruction.
Command: A command, such as print, used to perform a task. Commands may be part or the entirety of a statement.
Literal: An expression that evaluates as it is read. The literal 220 evaluates to 220, for example.
Type: The ‘type’ of some thing describes the nature of that thing - the range of values it may have and the operations that may be performed on it.
Integer: A whole number. Shortened to ‘int’ in python and many other languages.
Float: A floating point number.
String: A string of characters - some text.
Expression: Part of a statement that can be evaluated to give a result.
Syntax: The rules governing the construction of statements.
Variable: A name for a value.
Assignment: The process of putting a value to a name.
Function: A reusable block of code. A better definition will be given in the functions chapter (Chapter 10).

2.11

Exercises

1. Explain the difference between the following two statements:

    print 12

    print "12"

2. Highlight the literals in the following code:

    a = 3
    b = '1'
    c = a-2
    d = a-c
    e = "dog"
    f = ', went to mow a meadow.'
    g = 'man'
    print b,g,"and his",e,f,a,g,",",d,g,",",c,g,"and his",e,f

3. What is the output of the above code?
4. What is the result of 4+4/2+2?
5. Use brackets to make the answer:
   • 2
   • 5
   • 6

2.11.1

Key Assignment

Modify the currency converter program (the one that doesn’t use functions) so that instead of converting from GB pounds to US dollars, it converts from dollars to pounds. First change line 4, so that the correct text is displayed, but with incorrect results. Now it is possible to change either line 2 or line 3 to make the conversion correct. What change is required for each line? Look back at the conversion function. Alter the function so that it can be used to convert from currencies other than GBP.

Chapter 3 Boolean algebra

In this chapter, we’re going to look at a new type: boolean. We’ll see how to construct boolean expressions and how to evaluate them. In Chapter 4, we will look at how it is possible to alter the behaviour of our programs based on the result of a boolean expression.

Learning outcomes

After completing the exercises for this chapter, you should be able to:
• Understand simple boolean logic
• Work with simple boolean algebra
• Work with the boolean type in Python

3.1

True and False

We looked at numeric types in Section 2.4, and how we can construct expressions that evaluate to a number. Sometimes, we don’t want to calculate a number, but instead we want to know whether some condition, or a number of conditions, is met. For example, we might want to know if someone’s age is over 21, or if they have a shoe size greater than 10. For this kind of data, we use a boolean type, which can contain true or false. Assignment is exactly the same as with other types we’ve seen, so:

    over18 = True
    isMale = False

assigns the value True to the boolean variable over18 and False to the boolean variable isMale. This small example is valid Python, but before we continue with Python’s boolean type, we’ll cover boolean algebra from a mathematical perspective.


3.2


Boolean algebra

Just as we have operators that apply to numbers (+, -, etc.), we have a number of operators that apply to booleans. In this section, we will cover these operators and how we can use and manipulate them.

3.2.1

and

The and operator is applied to two boolean values, and evaluates to true only if both of the operands are true; otherwise it evaluates to false. So:

    true and true = true
    false and true = false
    true and false = false
    false and false = false

3.2.2

or

The or operator evaluates to true if either of its operands is true (or if both are):

    true or true = true
    false or true = true
    true or false = true
    false or false = false

3.2.3

not

While and and or take two operands, the not operator takes just one. That is, it is a unary operator rather than a binary operator. The not operator evaluates to true if its operand is false and to false if its operand is true:

    not true = false
    not false = true

3.2.4

Expressions with more terms

Just as we can combine arithmetic operators to create more complex expressions, we can combine boolean operators. For example, if P and Q are both booleans, then we can write an expression to determine the value of R such that R is true if either P or Q is true, but not both:

    R = (P or Q) and (not (P and Q))

Notice the use of brackets to ensure that there is no ambiguity, and the use of names (P, Q, R) to denote unknown values - variables. This operation is called an exclusive or and is often written as xor. So, R = P xor Q has the equivalent meaning.
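The exclusive-or expression above can be tried out directly in Python. The following is only a sketch: the function name xor is our own choice (Python has no xor keyword for booleans), and the list-building line uses a feature not yet covered in these notes. It is written so that it runs under both Python 2 and Python 3.

```python
# Hypothetical helper: "xor" is our own name, not a Python keyword.
def xor(p, q):
    """Return True if exactly one of p and q is True."""
    return (p or q) and (not (p and q))

# Build the full truth table as (P, Q, P xor Q) rows.
table = [(p, q, xor(p, q)) for p in (True, False) for q in (True, False)]

for row in table:
    print(row)
```

Each printed row holds P, Q and then P xor Q, so the last value is True only in the two rows where P and Q differ.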

    A and B    A or B    not A
    A ∧ B      A ∨ B     ¬A
    A · B      A + B     Ā

Table 3.1: Notations for boolean algebra

3.2.5

De Morgan’s laws

De Morgan’s laws give us the relationship between the three basic boolean operators: and, or and not. If A and B are boolean variables, then de Morgan’s laws state:

    not (A and B) = (not A) or (not B)
    not (A or B) = (not A) and (not B)

These laws are useful if we want to rearrange an expression. You will see these laws being used later on in this book, but they are presented here to show that boolean logic is a complete algebra, and has rules and operations similar to those you learnt at school.
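Because A and B can each only be true or false, de Morgan’s laws can be checked by trying all four combinations. The sketch below does this in Python; the variable names are our own, and it uses list-building syntax and the built-in all function, neither of which has been introduced in these notes yet. It runs under both Python 2 and Python 3.

```python
# Try every combination of A and B and record whether each law holds.
values = (True, False)

first_law = [(not (a and b)) == ((not a) or (not b))
             for a in values for b in values]
second_law = [(not (a or b)) == ((not a) and (not b))
              for a in values for b in values]

# all(...) is True only if every entry in the list is True.
print(all(first_law))
print(all(second_law))
```

Both lines print True, since each law holds for every one of the four combinations.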

3.2.6

Notation

Here, we’ve used words to describe operators (and, or, etc.), and words to represent values (true and false), but there are other descriptions that could be used. In one notation, for example, the symbol ∨ represents or, while ∧ represents and and ¬ is not. In this notation, de Morgan’s laws would be written:

    ¬(A ∧ B) = ¬A ∨ ¬B
    ¬(A ∨ B) = ¬A ∧ ¬B

Table 3.1 lists the most popular boolean algebra notations. Whichever notation is used, the algebra is the same and de Morgan’s laws and other axioms still apply.

3.3

Boolean algebra in Python

Now that we have covered the theory, we can examine how boolean algebra is expressed and evaluated in Python. We already saw the simplest of Python boolean expressions at the start of the chapter:

    over18 = True
    isMale = False

The variables over18 and isMale both have the type “boolean”, or, in Python, bool for short.


The values assigned to them are the results of the simplest boolean expressions: True or False. These are boolean literals, just as 12 is an integer literal. Python uses and, or and not for the boolean operations of the same name. Examine the following Python program:

    1 over18 = True
    2 isMale = False
    3 canEnter = over18 and not isMale

The first two lines have already been discussed. Line 3 assigns a boolean value to the variable canEnter. The expression over18 and not isMale will evaluate to true if over18 is true and isMale is false. Perhaps it is the entry condition for a girls-only nightclub. Now examine the next piece of code:

    over18 = False
    withAdult = True
    canEnter = over18 or withAdult

In this example, canEnter will be true only if over18 is true or withAdult is true. This could be determining who gets into a theme park.

3.3.1

Comparisons

Now that we have examined operators that take booleans as operands, we can look at other operators that produce a boolean result. Looking back at our example of theme park entry, we are required to have answers to two questions (is the person over 18? is the person with an adult?), before we can determine the result. A more realistic example would have the person’s age stored in an integer variable, called age for example. How can this be used to determine if the person is allowed to enter? Examine the following code:

    1 age = 17
    2 over18 = age >= 18
    3 withAdult = True
    4 canEnter = over18 or withAdult

In this example, over18 is assigned the result of the expression age >= 18 (line 2). The expression uses a comparison operator, which will return true if age is greater than, or equal to, 18. Now examine the next example.

    1 BAD_DEBT_CODE = 100
    2 income = 20000
    3 age = 23
    4 code = 0
    5 giveMortgage = (age >= 21) and (income >= 18000) \
    6     and (code != BAD_DEBT_CODE)

In this example, giveMortgage will be true only if age is greater than or equal to 21, income is greater than or equal to 18,000 and code is not equal to the value held in BAD_DEBT_CODE. The only new things in this example are the != operator, which returns true only if the two operands are not equal, and the continuation of line 5 using the backslash (\). It is often useful to be able to split a statement across multiple lines. In Python, this is achieved by placing the backslash character at the break in the line and continuing on the next line. Lines split this way will be interpreted as a single statement. Table 3.2 lists the numerical comparison operators available in Python. There are other comparisons that can be made, on different types, which will be discussed as those types are covered. Finally, here is a more complex example:

    1 BAD_DEBT_CODE = 100
    2 income = 20000
    3 age = 23
    4 code = 0
    5 depositPercent = 0.8
    6 giveMortgage = (code != BAD_DEBT_CODE) and (\
    7     (depositPercent >= 0.1 and age >= 21 and income >= 18000) or \
    8     (income / 1000 >= age) \
    9     )

Under which circumstances will someone be given a mortgage? Let’s break the important expression down into smaller terms and examine them separately. The expression on lines 6 to 9 contains a number of boolean operators as well as brackets (really parentheses, but let’s not be picky). The order of precedence (section 2.5.1) requires that terms within brackets are evaluated first, so let’s look at these:
• (code != BAD_DEBT_CODE)
• ((depositPercent >= 0.1 and age >= 21 and income >= 18000) or (income / 1000 >= age))

If we call the first of these A and the second B, then the complete expression is simply: A and B. The first of these is trivial, but the second contains more brackets. Let’s divide it up:
• (depositPercent >= 0.1 and age >= 21 and income >= 18000)
• (income / 1000 >= age)

    Operator    Meaning
    A < B       True if A is less than B
    A > B       True if A is greater than B
    A <= B      True if A is less than or equal to B
    A >= B      True if A is greater than or equal to B
    A == B      True if A is equal to B
    A != B      True if A is not equal to B
    A <> B      True if A is not equal to B

Table 3.2: Numeric comparison operators in Python

Let’s call these C and D. So, B is now written as: C or D. The whole expression, then, is: A and (C or D). To qualify for a mortgage, then, term A must evaluate to true (code is not equal to BAD_DEBT_CODE) and either C or D must be true. Term C is true for people aged 21 or over who have a deposit of 10% or more and an income of at least £18,000. Term D is true for people who earn at least their age, in thousands. That is, a 19 year old must earn at least £19,000.
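The breakdown into terms A, C and D can be written out in Python by giving each term its own variable. This is only a sketch of the same calculation, using the values from the example; the names A, C and D are taken from the discussion above, not from the original listing. Note that / behaves differently for whole numbers in Python 2 and Python 3, but with these values the comparison gives the same answer in both.

```python
BAD_DEBT_CODE = 100
income = 20000
age = 23
code = 0
depositPercent = 0.8

# Name each sub-term as in the discussion above.
A = code != BAD_DEBT_CODE
C = depositPercent >= 0.1 and age >= 21 and income >= 18000
D = income / 1000 >= age

# The whole expression is A and (C or D).
giveMortgage = A and (C or D)

print(giveMortgage)
```

With these values A and C are true and D is false (20 is less than 23), so giveMortgage is True. Splitting a long expression into named terms like this can make it easier to see which condition decided the result.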

3.3.2

Between

On paper, it is common to see written an expression of the form “a < b < c”, which simply states that b must be greater than a and less than c. Many programming languages do not allow this kind of expression because the evaluation becomes complex: do we evaluate “a < b” or “b < c” first? Whichever we decide, we are left with a less than operator with operands of incorrect types. Luckily for us, Python does understand this kind of expression, and so rather than write age>0 and age=18 and income >=18000: 4 print " Your mortgage application has been accepted . "

4.1.2

Example

Now let’s look at a more useful example. This program prints out some useful advice for people who get lost while visiting Coventry University. The variable “place” contains the name of the place they wish to find.

    place = "reception"
    if place == "reception":
        print "From the steps of the cathedral,"
        print "reception is directly opposite"
    if place == "humber":
        print "The Humber lecture theatre is"
        print "next to the Hillman lecture theatre"
    if place == "hillman":
        print "The Hillman lecture theatre is next"
        print "to the Humber lecture theatre"

In this example, we have a series of if statements. Each one will be evaluated, in order, and if any of the boolean expressions evaluate to true, then the associated block of code will be executed.

4.1.3

Nesting

First, another example.

    1 age = 21
    2 sex = "male"
    3 if age >= 18:
    4     print "Welcome to Monty's Flying Nightclub,"
    5     print "please drink responsibly."
    6     if sex == "female":
    7         print "Since it's ladies night, you get free drinks."


CHAPTER 4. CHOICE

In this example, our block of code contains another if statement. Notice that the second if statement is indented relative to the first. Line 7 is only executed if the boolean expressions for both if statements evaluate to true. We could write the following instead:

    age = 21
    sex = "male"
    if age >= 18 and sex == "male":
        print "Welcome to Monty's Flying Nightclub,"
        print "please drink responsibly."
    if age >= 18 and sex == "female":
        print "Welcome to Monty's Flying Nightclub,"
        print "please drink responsibly."
        print "Since it's ladies night, you get free drinks."

The output is the same, but we have to do a lot more typing.

4.1.4

Randomisation: .signatures

Most e-mail software has the facility to attach a “signature” to the end of all outgoing messages. On Unix systems, this is often done by having a file called “.signature” in your home directory, which the e-mail software will look for. Because the signature is read from a file, many people have some software that changes the contents of the file weekly, daily or even hourly. Something called a “cron daemon” is used to execute a program at certain intervals. The output of this program is placed into the .signature file. We aren’t interested [1] in how to set up cron, or how to get the output from your program into a file, but we are interested in the program that generates the random signature. The example below will display one of five possible pieces of text. Which one is displayed depends on a random number, generated by the “randint” function. All of the pieces of text are taken from the Unix fortune file - a pretty huge set of interesting snippets of text.

[1] If you are interested, try the Internet for information or, if you’re at Coventry University, ask Sarah or James.

Listing 4.1: Random Signatures

    #!/bin/env python

    # Import the function for generating random numbers
    from random import randint

    # generate a number between 0 and 4 inclusive
    val = randint(0, 4)
    if val == 0:
        print "Every morning I read the obituaries;"
        print "if my name's not there, I go to work."
    if val == 1:
        print "Nothing takes the taste out of peanut"
        print "butter quite like unrequited love."
        print "-- Charlie Brown"
    if val == 2:
        print "Join the march to save individuality!"
    if val == 3:
        print "If you love someone, set them free."
        print "If they don't come back, then call them"
        print "up when you're drunk."
    if val == 4:
        print "The best things in life go on"
        print "sale sooner or later."

Try this program and make sure you understand its operation before you move on to the next section.

4.1.5

else

Let’s go back to our club bouncer code. Rather than completely ignore people under 18, we should display a polite message. We’re going to go through a few iterations of refining and altering this, but here’s a start:

    age = 21
    if age >= 18:
        print "Welcome to Monty's Flying Nightclub, please drink responsibly."
    if age < 18:
        print "I'm very sorry, but you must be over 18 to enter Monty's."

This works, but there is a better way. First, let’s change the literal 18 to a variable, min_age. This is good because if we ever need to change the minimum age (from 18 to 21 for example), we only have to change one value. This minimises our chances of getting it wrong, because we know that both if statements use the same value. It also makes it less likely that we’ll mess it up if we make a change. Finally, it also means that we can have the age set somewhere near the top of our program, while these two if statements might be buried somewhere hundreds of lines down. The second change we want to make is one that allows us to pair the blocks of code that at the moment are guarded by two separate if statements. It is very easy to make a mistake when writing code as we have done in the first example: you might use a greater-than for the first and less-than for the second, leaving people of exactly 18 years without any response. You might accidentally mistype one of the conditions. Have a look at the new version:

    age = 21
    min_age = 18
    if age >= min_age:
        print "Welcome to Monty's Flying Nightclub, please drink responsibly."
    else:
        print "I'm very sorry, but you must be over 18 to enter Monty's."

We now have an if/else statement in place of the two if statements.
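The mistake described above (a greater-than in one test and a less-than in the other) can be seen by comparing the two styles side by side. In this sketch the function names two_ifs and if_else, the constant MIN_AGE and the shortened messages are all our own, chosen for illustration. Functions are used so that the result can be inspected, and the code runs under both Python 2 and Python 3.

```python
MIN_AGE = 18  # assumed name, playing the role of min_age in the text

def two_ifs(age):
    """Two separate tests, with the easy-to-make mistake of using > and <."""
    message = None
    if age > MIN_AGE:
        message = "welcome"
    if age < MIN_AGE:
        message = "sorry"
    return message

def if_else(age):
    """One test; the else block catches every age the test rejects."""
    if age >= MIN_AGE:
        return "welcome"
    else:
        return "sorry"

print(two_ifs(18))
print(if_else(18))
```

For someone aged exactly 18, two_ifs gives no message at all (None), while if_else gives "welcome". With if/else there is only one condition to get right, and exactly one of the two blocks always runs.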

4.1.6

elif

Sometimes, we have more than two possible results that need to be tested for. With the example above, we either meet the condition or not, but it is possible to do something more complex. The elif block comes before the else block (if there is one) and has another condition. Just as with the else block, this block can only be executed if the first is not, but the new condition must also be met. An example should make this clear:

    age = 21
    sex = "male"
    if age >= 18:
        print "Welcome to Monty's Flying Nightclub,"
        print "please drink responsibly."
        if sex == "female":
            print "Since it's ladies night, you get free drinks."
        else:
            print "We were only kidding about drinking responsibly."
    elif age < 16:
        print "What are you doing out on a school night?"

Here, we’ve added an elif block to give a different message when age is less than 16. There’s also an else block attached to the indented if, just to give another example of indentation. Here’s an example with else and elif associated with the same if:

    age = 21
    sex = "male"
    if age >= 18:
        print "Welcome to Monty's Flying Nightclub,"
        print "please drink responsibly."
        if sex == "female":
            print "Since it's ladies night, you get free drinks."
        else:
            print "We were only kidding about drinking responsibly."
    elif age < 16:
        print "What are you doing out on a school night?"
    else:
        print "Sorry, friend, you're not old enough"

We can add as many elif blocks as we like or need.
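To see which block of an if/elif/else chain runs for a given age, the outer chain from the last example can be wrapped in a function and tried with several values. This is a sketch only: the function name door_message is our own, the welcome message is shortened, and the for loop used to try several ages is not covered until a later chapter. It runs under both Python 2 and Python 3.

```python
def door_message(age):
    """Return the message chosen by the if/elif/else chain."""
    if age >= 18:
        return "Welcome to Monty's Flying Nightclub"
    elif age < 16:
        return "What are you doing out on a school night?"
    else:
        return "Sorry, friend, you're not old enough"

# Try ages on either side of both boundaries.
for age in (15, 16, 17, 18):
    print(door_message(age))
```

Only one block runs for each age: 15 reaches the elif block, 16 and 17 fall through to the else block, and 18 is handled by the first if.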

4.2

Further reading

• Wikipedia article on choice - this is a general article, not Python-specific. http://en.wikipedia.org/wiki/Control_flow#Choice
• Control flow in the Python tutorial. http://docs.python.org/tut/node6.html

4.3

Glossary

Compound statement. A compound statement collects more than one statement into a block.
Control flow. Statements that allow the programmer to alter the order of execution of code. This includes choice, repetition, etc.
if allows the conditional execution of a code block.
elif can follow an if block to add another, mutually exclusive, conditional block.
else is used to provide a “default” behaviour. Code in this block will execute if none of the other associated blocks (if or elif) have been executed.

4.4

Exercises

1. Under what conditions would the following code snippet print “B”?

    if thing1 > 10:
        print 'A'
    else:
        print 'B'

2. Under what conditions would the following code snippet print “B”?

    if thing1 > 10:
        print 'A'
    elif thing1 > 200:
        print 'B'

3. Under what conditions would the following code snippet print “B”?

    if thing1 > 10:
        print 'A'
    if thing1 > 200:
        print 'B'

4. Under what conditions would the following code snippet print “B”?

    if thing1 > 10 and thing1 < 10:
        print 'A'
    else:
        print 'B'

4.4.1

Key assignment

A year is a leap year if it is divisible by 4, unless the year is a century, in which case it is only a leap year if it is divisible by 400. Write a function which takes a year as an argument and prints a message to the screen saying whether or not that year is a leap year. Here’s an outline to get you started:

    def isLeapYear(year):
        ...
    # Testing
    ...
    isLeapYear(400)
    isLeapYear(800)
    isLeapYear(1600)
    isLeapYear(1601)
    isLeapYear(2004)
    ...

Chapter 5 Repetition: recursion

You have already seen and invented algorithms that require some sort of repetition. For example, the square spiral in Section 1.4.1 of Chapter 1 required us to repeat a sequence of steps to get the turtle to draw the shape we wanted. In this Chapter, you will learn about recursion, which is one of two ways of getting Python to repeat steps. In the next Chapter we will look at iteration, which is the second method. In most programming languages you can use either technique, but each is appropriate for solving different sorts of problems. By getting lots of practice and seeing many examples, you should develop an intuition for when recursion might be more appropriate than iteration.

Learning outcomes

When you have finished this Chapter you will be able to:
• Define the term recursive function.
• Design and implement recursive functions which terminate correctly.
• Identify the base case and induction case of a recursive function.
• Code using top-down design.
• Draw fractal curves (such as the von Koch curve) using the turtle module.

5.1

Recursive functions

You have already seen how to define a function:

    >>> def print_hello(name):
    ...     print "Hello", name
    ...
    >>> print_hello("James")
    Hello James
    >>>

. . . and you probably also know that functions can call other functions, like this:

Listing 5.1: A function to draw a square

    import turtle
    def draw_square(side):
        turtle.forward(side)
        turtle.left(90)
        turtle.forward(side)
        turtle.left(90)
        turtle.forward(side)
        turtle.left(90)
        turtle.forward(side)

    draw_square(100)

In this Chapter, we’ll be talking about recursive functions – functions that call themselves. That might seem a bit of a weird idea, but it’s actually very straightforward and very easy to do in Python. Let’s dive in and write one and see how it works:

Listing 5.2: Recursively raising a number to a power

     1 >>> def power(n, pow):
     2 ...     if pow == 0:
     3 ...         return 1
     4 ...     else:
     5 ...         return n * power(n, pow-1)
     6 ...
     7 >>> power(2, 5)
     8 32
     9 >>> power(100, 0)
    10 1
    11 >>> power(3, 4)
    12 81
    13 >>> power(9, 2)
    14 81
    15 >>>

So, the function power raises a number to a given power. If you look through lines 7-14 you’ll see we’ve called it to find the values of 2⁵, 100⁰, 3⁴ and 9². power takes two arguments: firstly the base (the number we want to raise to a power) called n and secondly the exponent, the power we want to raise it to, called pow. Next, we have an if/else statement which handles two cases. If the exponent is 0, then we return 1 (remember, n⁰ = 1 from your GCSE maths). If the exponent is greater than 0, the function returns n*power(n,pow-1). So, what happens when we call the function with a value of pow which isn’t zero? Let’s consider the function call power(2,3) as an example. That function call returns 2*power(2,3-1), which is 2*power(2,2). Next Python evaluates the function call power(2,2), giving us 2*2*power(2,2-1), which is 2*2*power(2,1). Evaluating that gives 2*2*2*power(2,1-1) or 2*2*2*power(2,0) which evaluates to 2*2*2*1. Now Python just has to multiply those values together to give us the answer: 8.
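One way to watch this chain of calls happen is to make the function report each call and each return value. The sketch below is the power function from Listing 5.2 with an extra parameter, depth, which is our own addition and is used only to indent the trace. It runs under both Python 2 and Python 3.

```python
def power(n, pow, depth=0):
    """power from Listing 5.2, with a hypothetical depth parameter for tracing."""
    indent = "  " * depth
    print(indent + "power(%d, %d) called" % (n, pow))
    if pow == 0:
        result = 1                               # base case
    else:
        result = n * power(n, pow - 1, depth + 1)  # recursive call
    print(indent + "power(%d, %d) returns %d" % (n, pow, result))
    return result

answer = power(2, 3)
```

The trace shows four calls nested inside each other, from power(2, 3) down to power(2, 0), followed by the return values working their way back out until the outermost call returns 8.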

5.1.1

Another recursive function

The factorial function is used widely by mathematicians in combinatorics and statistics. Most programming languages don’t have a built-in factorial function, so it’s useful to be able to define one. The definition of a factorial is as follows:

    fact(0) = 1
    fact(n) = n × fact(n − 1)

To understand that better, let’s work through a simple example – fact(3). So, using the induction case,

    fact(3) = 3 × fact(2)
            = 3 × 2 × fact(1)
            = 3 × 2 × 1 × fact(0)
            = 3 × 2 × 1 × 1
            = 6

This is a recursive definition – it has a base case: fact(0) and an induction case: fact(n). Let’s write that out in Python:

Listing 5.3: Recursive factorial

    >>> def fact(n):
    ...     if n < 2:
    ...         return 1
    ...     else:
    ...         return n * fact(n-1)
    ...
    >>> fact(0)
    1
    >>> fact(1)
    1
    >>> fact(2)
    2
    >>> fact(3)
    6
    >>> fact(4)
    24
    >>> fact(5)
    120
    >>> fact(6)
    720
    >>> fact(7)
    5040
    >>> fact(12)
    479001600
    >>>

One of the nice things about this simple program is that you can see how close the Python code is to the original mathematical definition. Not every language is so straightforward! In the next Chapter, we’ll be looking at other ways to define and implement factorial functions and the other programs you’ll see here.
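A quick way to gain confidence in a hand-written function is to compare it with a trusted implementation. More recent versions of Python than the one these notes were written for include a factorial function in the math module, so the sketch below checks fact against math.factorial for every n from 0 to 12. The comparison itself is our own addition, and the code runs under Python 2.6 or later and Python 3.

```python
import math

def fact(n):
    """Recursive factorial, as in Listing 5.3."""
    if n < 2:        # base case: fact(0) = fact(1) = 1
        return 1
    else:            # induction case
        return n * fact(n - 1)

# Compare with the library version for n = 0, 1, ..., 12.
matches = [fact(n) == math.factorial(n) for n in range(13)]
print(all(matches))
```

This prints True, meaning the two functions agree on all thirteen values.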


5.2


Base cases, induction cases and termination

What would have happened if we’d written the power function like this:

    def power_crazy(n, pow):
        return n * power_crazy(n, pow-1)

Imagine if we asked Python to evaluate the function call power_crazy(2,1). Python would reduce this to 2*power_crazy(2,0) then 2*2*power_crazy(2,-1) then 2*2*2*power_crazy(2,-2) and so on, forever. This situation is called infinite recursion – recursion which never terminates. Well, actually it wouldn’t go on forever. Python would run out of room for storing all those unfinished multiplications (it limits how deeply function calls can be nested) and would give us an error message. The first implementation of power worked because it had a base case which handled the case where pow is 0. Remember how the code was structured:

    def power(n, pow):
        if pow == 0: # Base case
            return 1
        else: # Induction case
            return n * power(n, pow-1)

Most recursive functions are structured in this way. They have one or more base cases, like pow==0, which every function call should eventually reduce to. This ensures that the function always terminates (or stops). Then, there are one or more induction cases which tell Python what should happen if the base case hasn’t been reached. Recursive functions are based on a mathematical technique called proof by induction, which you have probably seen at A level. This is a method of proving theorems about the integers. Actually, it can be extended to be used on other structures (like Python’s lists) and this is sometimes called structural induction. This is one of many techniques that Computer Scientists use to verify their programs.
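The error that Python gives for runaway recursion can be caught and inspected. The sketch below defines a version of power_crazy that calls itself with no base case, and uses try/except (which these notes have not covered yet) to catch the error. In Python 2 the error is a RuntimeError; in recent versions of Python 3 it is a RecursionError, which is a kind of RuntimeError, so the same code works for both. The variable name outcome is our own.

```python
def power_crazy(n, pow):
    # No base case: every call makes another call.
    return n * power_crazy(n, pow - 1)

try:
    power_crazy(2, 1)
    outcome = "finished"
except RuntimeError:
    # Raised when calls are nested more deeply than Python allows.
    outcome = "recursion limit reached"

print(outcome)
```

The function never gets as far as doing a multiplication: each call is still waiting for the next one to return when Python gives up.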

5.3

Efficiency of recursive functions

One idea we haven’t mentioned yet in this book is efficiency – how fast or slow your programs are or (sometimes) how much memory they need to run. Most of the time this isn’t really important. Python generally runs pretty fast and it’s more important to write a correct program than an efficient one. Consider that for a minute – it’s trivially easy to write a program that runs blindingly fast, so long as you don’t need it to be correct! Having said that, there are a few applications where efficiency really is important. Later, we will write some animations and arcade games (and you will write an arcade game in your Studios) and we’ll look at some simple techniques for making animations run fast. For now, let’s keep things a bit simpler and look at a more efficient implementation of the power function:

Listing 5.4: Recursively raising a number to a power, more efficiently

    >>> def power(n, pow):
    ...     if pow == 0: # Base case (anything to the power 0 is 1)
    ...         return 1
    ...     elif pow == 1: # Base case
    ...         return n
    ...     elif pow % 2 == 0: # Induction case (pow even)
    ...         return power(n*n, pow//2)
    ...     else: # Induction case (pow odd)
    ...         return n * power(n*n, pow//2)
    ...
    >>> power(2, 5)
    32
    >>> power(100, 0)
    1
    >>> power(3, 4)
    81
    >>> power(9, 2)
    81
    >>>

Here, the induction case has been split into two parts: one for even values of pow and one for odd values of pow. This works because:

    x¹ = x
    x²ⁿ = (x²)ⁿ
    x²ⁿ⁺¹ = x × (x²)ⁿ

. . . which you should remember from your GCSE maths. The previous implementation of power in Section 5.1 needed one multiplication for every step down from pow to 0, so pow multiplications in all. In this implementation, at most 2 lg pow multiplications are needed, where lg is the logarithm to the base 2. This is a good example of how even a very basic understanding of mathematics can help improve code enormously. In Chapter 9 we’ll be taking a more formal approach to efficiency and looking at how to tell roughly how fast your algorithms are and how to compare them.
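The difference in the number of multiplications can be measured by making each function return a count alongside its answer. The sketch below is an illustration rather than the listings themselves: the names power_slow and power_fast are our own, both functions return a pair (answer, number of multiplications), and power_fast includes a base case for pow equal to 0 and uses // for whole-number division so that it runs under both Python 2 and Python 3.

```python
def power_slow(n, pow):
    """One multiplication per step, in the style of Section 5.1."""
    if pow == 0:
        return 1, 0
    answer, count = power_slow(n, pow - 1)
    return n * answer, count + 1

def power_fast(n, pow):
    """Repeated squaring, in the style of this section."""
    if pow == 0:
        return 1, 0
    elif pow == 1:
        return n, 0
    elif pow % 2 == 0:
        answer, count = power_fast(n * n, pow // 2)
        return answer, count + 1        # one multiplication: n * n
    else:
        answer, count = power_fast(n * n, pow // 2)
        return n * answer, count + 2    # two: n * n and n * answer

slow = power_slow(2, 100)
fast = power_fast(2, 100)
print(slow[1])
print(fast[1])
```

Raising 2 to the power 100 takes 100 multiplications the slow way and 8 the fast way, and both functions return the same answer.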

5.4

Visual recursion: fractal curves

Top-down design is a popular method for designing and implementing programs. The idea behind it is to start off with a very abstract plan for how your program should work then fill in the details as you go. In programming, a good way to leave things abstract is to write stubs, which are functions without a body, like this:

    def do_something():
        return

or to make calls to functions that you haven’t written yet:

    do_something()

This style of design and implementation tends to suit people who enjoy abstract thinking and spot patterns easily. We’re going to use this technique in our next example, and you’ll see it appear several more times in the notes.

Figure 5.1: A von Koch curve

Figure 5.1 shows a von Koch curve. This is one of a number of fractal curves, which Mathematicians call self-similar. If you look at the overall shape of one of the sides of the triangle, it looks like a line with a triangle in the middle: _/\_ in ASCII art! If you look closer at each part of the line, it looks just the same. The property of fractals is that however closely you “zoom” in on them, they look the same, hence “self-similarity”. In this section, we’re going to draw a von Koch fractal curve with Python’s turtle module, which you used in Chapter 1. So, what do we want to be able to do when we’ve finished writing the Koch program? Well, obviously we want to be able to draw a Koch curve. In Python, then, we want to be able to make a function call which does this for us:

    koch()

Does this function need any extra information? Well, we probably want to say how large each line should be, so the curve doesn’t stretch off the edge of the screen, or get so small it’s unreadable. Of course, we want to avoid infinite recursion, so we need to be able to tell Python when to use the base case. There are various ways we could say this, but one that seems plausible is to say how big (in pixels) the smallest straight line should be. This means that we can set sensible limits, like saying that no line should be shorter than one pixel long. So, our final function call should look something like this:

    koch(500, 10)

which should come from a function like this:

    def koch(line_length, smallest):
        # do something here!
        return

Next, we need to think carefully about what should go in the function. Looking at the shape of the curve an abstract idea of what’s going on should be reasonably clear. To draw a Koch curve, we need to draw a “Koch line” (however that happens), then turn right by 120°, draw a second “Koch line”, turn right by 120° then draw the third “Koch line”. In Python that’s going to look like this:

    def koch(line_length, smallest):
        """Draw a whole von Koch curve."""
        koch_line(line_length, smallest)
        turtle.right(120)
        koch_line(line_length, smallest)
        turtle.right(120)
        koch_line(line_length, smallest)
        return

Notice the string at the top of the function. This is documentation. If you type the function into the Python interpreter you can ask Python to give you this documentation by typing help(koch), which gets this response:

    koch(line_length, smallest)
        Draw a whole von Koch curve.

OK, so we’ve already done quite a lot of the work for this program, without even thinking about recursion or how to draw each bit of the lines. All we have left to do is to fill in the koch_line function and we’re done. In fact, that’s pretty simple too. We just need to trace round the _/\_ shape and for each of the four parts of the line we should either:
• move forward a certain amount – base case; or
• draw a “Koch line” that’s smaller than the one we’re currently drawing – induction case.
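The top-down plan so far can already be run if koch_line is written as a stub. The sketch below is one way to try this without opening a turtle window: instead of drawing, each command is recorded in a list. The list actions, the stand-in function right and the stub body of koch_line are all our own assumptions for illustration and are not part of the turtle module. It runs under both Python 2 and Python 3.

```python
# Hypothetical stand-ins: commands are recorded rather than drawn.
actions = []

def right(angle):
    """Stand-in for turtle.right: just note the turn."""
    actions.append("right %d" % angle)

def koch_line(line_length, smallest):
    """Stub: pretend to draw one side; the real body comes later."""
    actions.append("koch_line %d" % line_length)

def koch(line_length, smallest):
    """Draw a whole von Koch curve."""
    koch_line(line_length, smallest)
    right(120)
    koch_line(line_length, smallest)
    right(120)
    koch_line(line_length, smallest)

koch(500, 10)
print(actions)
```

The recorded list shows three “Koch lines” separated by two 120° turns, which confirms that the outline of the program is right before any of the detailed drawing code has been written.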


Once you’ve figured out what all the angles in the shape are and the relative sizes of the lines this is very straight forward. We won’t go over the geometry here but this is the resulting code (or my version – try this program out with slightly different angles and line lengths and see what you get!): 1 def koch_line ( line_length , smallest ): 2 """ 3 - - -/\ - - 4 """ 5 third = line_length /3 6 if ( third > range (10) 2 [0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9] 3 >>> range (1 , 5)

6.2. FOR LOOPS


    [1, 2, 3, 4]
    >>>

Optionally, we can tell range that we want to count up in steps other than 1. So, if we just want odd numbers ranging from 1 to 9 we can say this:

    >>> range(1, 10, 2)
    [1, 3, 5, 7, 9]
    >>>

The last part of the for loop follows the colon (:) and is called the body of the loop – these are the instructions that Python should perform repeatedly. Statements and expressions in the body should be indented by one tab to show that they belong inside the loop. So, in general, a for loop looks like this:

    for <counter> in <sequence>:
        <body>

Notice that we said the counter is a new variable name. Its scope (where it’s visible and accessible in code) is the body of the loop and the code which follows it. Have a look at the following code, which demonstrates the scope of counters in for loops:

    >>> i = 1
    >>> print i
    1
    >>> for i in range(5, 11):
    ...     print i
    ...
    5
    6
    7
    8
    9
    10
    >>> print i
    10
    >>> for j in range(6, 12):
    ...     print j
    ...
    6
    7
    8
    9
    10
    11

62

CHAPTER 6. REPETITION: ITERATION

24 >>> print j 25 11 26 >>> Here we have created a variable called i, used it later in a for loop as a counter, then printed it again afterwards. Even outside the loop it still contains the last variable it held inside the loop. Next, we have a loop with a counter called j and even though j is new (it was created inside the loop) we can still use it outside the loop. So, it should be intuitively clear to you how the for loop works, but it’s worth looking through it in detail. Let’s take this short example: 1 2 3 4 5 6 7 8

>>> for z in range (0 , 2): ... print z ... 0 1 >>> print " foobar " foobar >>> Here, we have created a new counter variable inside the for loop called z. First, Python assigns it a number which is the first number in the sequence generated by range – in this case 0. Next the body of the loop is executed, in this case we print z, which currently holds 0. Next Python jumps back to the top of the loop and changes the z to hold the next value in the sequence – in this case 1. The body of the loop is then executed with the new value, so we print out 1. Jumping back up to the top, we’ve already used up all the values in the sequence, so the value of z stays at 1. Our loop is now finished, so “control” returns to the end of the loop, and whatever comes after it is executed. In this case we have the statement print "foobar".
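The step-by-step description above can be mimicked by hand. The sketch below is only an illustration: the helper name trace_for is made up for this example, and the built-in functions iter and next stand in for whatever Python really does internally when it runs a for loop. It is written so that it should run under both Python 2 and Python 3.

```python
def trace_for(sequence):
    """Mimic `for z in sequence:` by hand and record each value z takes."""
    seen = []
    iterator = iter(sequence)      # ask the sequence for its values one by one
    while True:
        try:
            z = next(iterator)     # "jump back to the top" and fetch the next value
        except StopIteration:      # the sequence is used up, so the loop is finished
            break
        seen.append(z)             # this line plays the part of the loop body
    return seen


print(trace_for(range(0, 2)))      # the same sequence as in the example above
```

Each trip round the while loop corresponds to one "jump back to the top" in the walk-through; the loop ends when there are no values left to hand out.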

6.2.1 Examples of for loops

Now you know how the for loop works it’s time to look at a few examples. Last Chapter you saw some algorithms for raising a number to a power. Let’s try that again using iteration:

Listing 6.1: Iteratively raising a number to a power with for

    >>> def power(n, pow):
    ...     answer = 1
    ...     for i in range(1, pow + 1):
    ...         answer = answer * n
    ...     return answer
    ...
    >>> power(1, 0)
    1
    >>> power(2, 5)
    32
    >>> power(2, 10)
    1024
    >>> power(3, 9)
    19683
    >>> power(9, 3)
    729
    >>>

Here, we’ve created two variables. The first is answer, which holds the intermediate values while we’re still working out what value we want to return. The other variable is i, which is a loop counter. To figure out what n to the power of pow is we just need to go round a loop pow times, multiplying answer by n.

We’ve also looked at the Factorial function, which, last Chapter, we defined as:

    fact(0) = 1
    fact(n) = n × fact(n − 1)

Of course, this is a recursive definition: we’ve defined the factorial function in terms of itself! It would probably help to have an iterative definition, which might look something like this:

    fact(n) = 1 ∗ 2 ∗ . . . ∗ n

Or, if we want to be fancy with our notation, like this:

    fact(n) = \prod_{1 \le i \le n} i

With iteration, this is simple to implement: we just need a counter to go from 1 to n and somewhere to store our intermediate values:

Listing 6.2: Iterative factorial with for

    >>> def fact(n):
    ...     fact = 1
    ...     if not n
    >>> fact(1)
    1
    >>> fact(2)
    2
    >>> fact(3)
    6
    >>> fact(4)
    24
    >>> fact(5)
    120
    >>> fact(6)
    720
    >>> fact(7)
    5040
    >>> fact(8)
    40320
    >>> fact(9)
    362880
    >>>
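As a rough sketch of the iterative definition fact(n) = 1 ∗ 2 ∗ . . . ∗ n, the function below uses a counter that runs from 1 to n and a variable for the intermediate values. The names fact_for and result are chosen for this example only, and the sketch assumes n is a non-negative integer; it does no checking of its argument.

```python
def fact_for(n):
    """Return n! for a non-negative integer n, using a for loop."""
    result = 1                   # the product of no numbers at all is 1
    for i in range(1, n + 1):    # i takes the values 1, 2, ..., n
        result = result * i
    return result


print(fact_for(5))
```

When n is 0 the range is empty, the body never runs, and the function returns 1, which agrees with fact(0) = 1 in the recursive definition.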

6.3 while loops

for loops are one way of implementing iteration in Python. The second syntactic construct is a while loop. This is similar to a for loop (and can do anything a for loop can do), but does not have an explicit counter or sequence. Generally, Python programmers seem to use while less often than for. One reason for this is that it’s very clear when a for loop terminates (when it gets to the end of the sequence), but while loops can be a little trickier. Still, almost all imperative programming languages (ones with variables and assignment) have a while loop, so you need to be able to use them well. while loops are structured like this:

    while <condition>:
        <body>

This is like saying “while this condition remains true, keep repeating this action”. Here’s an example:

    >>> while not raw_input() == "9":
    ...     print "Enter 9 to exit."
    ...
    1
    Enter 9 to exit.
    2
    Enter 9 to exit.
    3
    Enter 9 to exit.
    4
    Enter 9 to exit.
    9
    >>>

Remember the function raw_input grabs a value from the keyboard and (unlike input) returns that value as a string. Our while loop starts off by calling raw_input and checking if the value it returns is "9". If it isn’t, we print out "Enter 9 to exit.". Next, we jump back to the top of the loop, and test the condition not raw_input() == "9" again. We keep doing this until the condition is false, when we jump straight out of the loop and execute whatever is underneath it.

6.3.1 Examples of while loops

As we’ve said above, anything that can be done with a for loop can be done with a while loop and vice versa. So, let’s have another look at the power and fact functions, this time using while.

Listing 6.3: Iteratively raising a number to a power with while

    >>> def power(n, pow):
    ...     answer = 1
    ...     c = 1
    ...     while c
    >>> power(1, 0)
    1
    >>> power(2, 5)
    32
    >>> power(2, 10)
    1024
    >>> power(3, 9)
    19683
    >>> power(9, 3)
    729
    >>>

Here our answer is slightly different to the version with for. We have created an explicit counter variable called c, which starts with the value 1. The condition we test for each time we go round the loop is whether c is less than or equal to pow. The difference here is that rather than having Python automatically assign values to c, we have to do this ourselves in the line c = c + 1.
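The description above translates fairly directly into code. The sketch below is one plausible reading of it rather than the original listing: the name power_while is chosen for this example, and the loop body is assumed to multiply answer by n exactly as the for version does.

```python
def power_while(n, pow):
    """Return n raised to the power pow (a non-negative integer), using while."""
    answer = 1
    c = 1                    # explicit counter, starting at 1
    while c <= pow:          # keep going while c is less than or equal to pow
        answer = answer * n
        c = c + 1            # we must move the counter on ourselves
    return answer


print(power_while(2, 10))
```

If the line c = c + 1 were left out, the condition would never become false and the loop would never terminate, which is one reason while loops need a little more care than for loops.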

The fact function is similar. Here, we have an explicit counter, this time called i, and each time we go round the loop we have to update both our intermediate answer (called fact) and the counter:

Listing 6.4: Iterative factorial with while

    >>> def fact(n):
    ...     fact = 1
    ...     i = 2
    ...     while i
    >>> fact(1)
    1
    >>> fact(2)
    2
    >>> fact(3)
    6
    >>> fact(4)
    24
    >>> fact(5)
    120
    >>> fact(6)
    720
    >>> fact(7)
    5040
    >>> fact(8)
    40320
    >>> fact(9)
    362880
    >>>
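A minimal sketch of a factorial written with while, following the description above: an explicit counter i starting at 2, with both the intermediate answer and the counter updated on every trip round the loop. The names fact_while and result, and the condition i <= n, are assumptions made for this example.

```python
def fact_while(n):
    """Return n! for a non-negative integer n, using a while loop."""
    result = 1
    i = 2                    # multiplying by 1 changes nothing, so start at 2
    while i <= n:
        result = result * i  # update the intermediate answer...
        i = i + 1            # ...and the counter
    return result


print(fact_while(6))
```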

6.4 String operations

The last examples all used numbers as counters and in sequences, apart from the example using raw_input. It is possible to use strings as sequences in for loops. Python will assign each character of your string to your counter variable in turn, like this:

    >>> for c in "foobar":
    ...     print c
    ...
    f
    o
    o
    b
    a
    r
    >>>

This is very convenient and means that we can perform sophisticated string manipulation very easily. Here’s quite a simple example: when Julius Cæsar wanted to send orders to his troops, he was concerned that they would be intercepted by the enemy. So, he invented a system – now called the Cæsar cipher – to encrypt his commands. Suetonius said:

    If he had anything confidential to say, he wrote it in cipher, that is, by so changing the order of the letters of the alphabet, that not a word could be made out. If anyone wishes to decipher these, and get at their meaning, he must substitute the fourth letter of the alphabet, namely D, for A, and so with the other.
    – Suetonius, Life of Julius Cæsar, 56

So, to encipher a message, we need to take each letter in it and move that letter up or down the alphabet. We’ll say the key to the message is the number of places around the alphabet we move each letter. Obviously, two people need to share the same key to understand one another’s messages.

We need to know a little more about strings to do this. How can we “move letters around the alphabet”? Well, the simplest way to do this is to convert each character to a numeric value (usually its ASCII value), then add a number to it, then convert back to a string. Python has some built-in functions that can help us here. ord converts a character to its ASCII value and chr converts an integer (presumed to be an ASCII value) back to a string. Like this:

    >>> ord("a")
    97
    >>> ord("A")
    65
    >>> chr(97)
    'a'
    >>> chr(65)
    'A'
    >>> chr(97-32)
    'A'
    >>>

So, now our enciphering should be simple. We just need to go through each character in the message, convert it to its ASCII value, add the key to it and convert back to a string. Here’s some code which does this:

    >>> def encipher(s, key):
    ...     cipher = ""
    ...     for c in s:
    ...         new = chr(ord(c) + key)
    ...         cipher = cipher + new
    ...     return cipher
    ...
    >>>

Notice that we return the entire enciphered message. We can do this with a + which, when applied to strings, concatenates them:

    >>> print "foo" + "bar" + "!"
    foobar!
    >>>

Next we need to be able to decipher messages, so we can read them back in English. Well, that should be simple, we just subtract the key from each ASCII value:

    >>> def decipher(s, key):
    ...     plain = ""
    ...     for c in s:
    ...         new = chr(ord(c) - key)
    ...         plain = plain + new
    ...     return plain
    ...

Putting this together, we have both functions and some testing at the end. Of course, we mainly want to check that if we encipher a message then decipher it, we get the original message back!

Listing 6.5: Cæsar cipher

    >>> def encipher(s, key):
    ...     cipher = ""
    ...     for c in s:
    ...         new = chr(ord(c) + key)
    ...         cipher = cipher + new
    ...     return cipher
    ...
    >>> def decipher(s, key):
    ...     plain = ""
    ...     for c in s:
    ...         new = chr(ord(c) - key)
    ...         plain = plain + new
    ...     return plain
    ...
    >>> plain = "You are surrounded by geeks. Retreat."
    >>> key = 15
    >>> encipher(plain, key)
    'h~\x84/p\x81t/\x82\x84\x81\x81~\x84}sts/q\x88/vttz\x82=/at\x83\x81tp\x
    >>> decipher(encipher(plain, key), key)
    'You are surrounded by geeks. Retreat.'
    >>>
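The enciphered text above contains characters such as \x84 because adding the key can push a letter past the end of the alphabet. The sketch below shows one possible variation that wraps “around the alphabet” instead, so that letters always become letters. It is not part of the original listing: the names shift_letter, encipher_wrapped and decipher_wrapped are made up for this example, and it assumes that only the unaccented letters a to z and A to Z should be moved.

```python
def shift_letter(c, key):
    """Shift one letter key places round the alphabet; leave other characters alone."""
    if "a" <= c <= "z":
        base = ord("a")
    elif "A" <= c <= "Z":
        base = ord("A")
    else:
        return c                      # spaces, digits and punctuation are unchanged
    # Position in the alphabet (0 to 25), moved by key, wrapped with the % operator
    return chr((ord(c) - base + key) % 26 + base)


def encipher_wrapped(s, key):
    cipher = ""
    for c in s:
        cipher = cipher + shift_letter(c, key)
    return cipher


def decipher_wrapped(s, key):
    # Deciphering is just enciphering with the key reversed
    return encipher_wrapped(s, -key)


plain = "You are surrounded by geeks. Retreat."
print(encipher_wrapped(plain, 15))
print(decipher_wrapped(encipher_wrapped(plain, 15), 15))
```

With a key of 3 this gives the substitution Suetonius describes, with D standing for A.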

6.5 More complex loop constructs

Python also gives you some extra constructs to help control your repetition. Note that you can have a loop inside a loop. This is called nested looping and might look like this:

    for i in ...:
        # Outer loop
        for j in ...:
            # Inner loop

The break statement jumps out of the innermost loop, as if the loop had finished. continue jumps straight up to the top of the innermost loop, without completing the rest of the statements in the body of the loop. Here’s a quick example:

    >>> for i in range(1, 101):
    ...     if raw_input() == "9":
    ...         break
    ...     else:
    ...         print str(i) + "th iteration"
    ...         continue
    ...
    a
    1th iteration
    s
    2th iteration
    d
    3th iteration
    f
    4th iteration
    9
    >>>

Here, if the user enters 9 at the keyboard, we break out of the loop. Otherwise we print how many iterations have already occurred and keep going.

while and for statements may also end in an else clause, which is only executed if the loop terminated without a break statement. Here’s a simple example:

    >>> for i in range(1, 5):
    ...     if raw_input() == "9":
    ...         break
    ...     else:
    ...         print str(i) + "th iteration"
    ...         continue
    ... else:
    ...     print "got through the sequence without pressing 9"
    ...
    a
    1th iteration
    s
    2th iteration
    d
    3th iteration
    f
    4th iteration
    got through the sequence without pressing 9
    >>>

6.5.1 Example of complex loop constructs

The primes function (listed below) prints out all the prime numbers up to and including a given integer. A prime number is a whole number greater than 1 that cannot be divided exactly by any number other than 1 and itself. Remember that the % operator gives remainders on division. So the result of a%b is the remainder of a/b. Like this:

    >>> for i in range(1, 10):
    ...     print str(i%3)
    ...
    1
    2
    0
    1
    2
    0
    1
    2
    0
    >>>

The code works quite simply, although it looks a bit odd! All we’re doing here is looking through the integers from 2 to n. For each of these, we see if it has any factors, by looking at the remainder on division of the number we’re looking at (called i in the program) with every number from 2 to i-1. If the remainder on division is zero, we know that the number has a factor and isn’t a prime, so we break out of the inner loop and see if the next i is prime. If, in the inner loop, we get all the way through 2 to i-1 and don’t find any factors, then we end the inner loop without a break statement. In this case, we know that i is prime, and the else clause attached to the inner loop prints a message to the terminal.

Listing 6.6: Prime number filter using for loops

    >>> def primes(n):
    ...     """Prints all prime numbers up to and including n.
    ...     """
    ...     for i in range(2, n + 1):
    ...         for j in range(2, i):
    ...             if i % j == 0:
    ...                 break
    ...         else:
    ...             print i, "is prime."
    ...
    >>> primes(50)
    2 is prime.
    3 is prime.
    5 is prime.
    7 is prime.
    11 is prime.
    13 is prime.
    17 is prime.
    19 is prime.
    23 is prime.
    29 is prime.
    31 is prime.
    37 is prime.
    41 is prime.
    43 is prime.
    47 is prime.
    >>>
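The same loop structure can collect the primes instead of printing them, which makes the result easier to reuse and to check. This is only a sketch: the name primes_up_to and the list called found are inventions for this example, and the algorithm is deliberately the same simple trial division as in the listing.

```python
def primes_up_to(n):
    """Return a list of all primes from 2 up to and including n."""
    found = []
    for i in range(2, n + 1):
        for j in range(2, i):
            if i % j == 0:       # j divides i exactly, so i is not prime
                break
        else:                    # the inner loop ended without a break
            found.append(i)
    return found


print(primes_up_to(50))
```

Note that the else lines up with the inner for, not with the if: it belongs to the loop and runs only when no factor was found.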

6.6 Further reading

• Iteration (and more) in the Python tutorial: http://docs.python.org/tut/node6.html
• Iteration in Wikipedia: http://en.wikipedia.org/wiki/Iteration
• Cæsar cipher on Wikipedia: http://en.wikipedia.org/wiki/Caesar_cipher

6.7 Glossary

break
    jump out of the innermost loop.
continue
    skip back to the top of the innermost loop.
Counter
    a variable which holds the number of times the body of a loop has been executed, so far.
for
    a syntactic construct for performing iteration: for <counter> in <sequence>: <body> else: <body>
Iterate
    to use iteration.
Iteration
    repetition with counters (as opposed to recursion).
Loop
    an iteration statement in a programming language.
Nested loop
    a loop inside another loop.
Scope
    where a particular variable is visible and accessible in the code.
Terminate
    stop.
while
    a syntactic construct for performing iteration: while <condition>: <body> else: <body>

6.8 Homework exercises

1. Implement an iterative Python function which returns the sum of the first n integers.

2. while and for loops are equivalent: whatever you can do with one you can do with the other. In the following programs, convert the while loops into for loops and the for loops into while loops. Of course, your new loops should give the same results as the old ones!

(a)
    >>> for i in range(1,100):
    ...     if i % 3 == 2:
    ...         print i, "mod", 3, "= 2"
    ...

(b)
    >>> for i in range(10):
    ...     for j in range(i):
    ...         print '*',
    ...     print ''
    ...

    *
    * *
    * * *
    * * * *
    * * * * *
    * * * * * *
    * * * * * * *
    * * * * * * * *
    * * * * * * * * *
    >>>

(c)
    >>> i = 0
    >>> while i

    >>> char = ""
    >>> print "Press Tab Enter to stop and Enter to keep going..."
    Press Tab Enter to stop and Enter to keep going...
    >>> iteration = 0
    >>> while not char == "\t" and not iteration > 99:
    ...     print "Keep going?"
    ...     char = raw_input()
    ...     iteration += 1
    ...
    Keep going?
    Keep going?
    Keep going?
    Keep going?
    >>>

3. How many times will Python execute the code inside the following while loops? Try to figure this question out without using the interpreter! Make sure you justify your answers.

(a)
    i = 0
    while i < 0 and i > 2:
        print "still going..."
        i = i+1

(b)
    i = 25
    while i < 100:
        print "still going..."
        i = i - 1

(c)
    i = 1
    while i < 10000 and i > 0 and 1:
        print "still going..."
        i = 2 * i

(d)
    i = 1
    while i < 10000 and i > 0 and 0:
        print "still going..."
        i = 2 * i

(e)
    while i < 2048 and i > 0:
        print "still going..."
        i = 2 * i

(f)
    i = 1
    while i < 2048 and i > 0:
        print "still going..."
        i = 2 * i

(g)
    for i in []:
        print "foobar!"

6.8.1 Key Assignment

Implement an iterative Python function to generate numbers in the Fibonacci sequence:

    fib(0) = 1
    fib(1) = 1
    fib(n) = fib(n − 1) + fib(n − 2)

Chapter 7

State

This Chapter is all about state, which is a fundamental concept in programming. You will learn more about how variable assignment works, what state is and why it is important in your programs. Practically, you will learn about writing Python programs which manage their state, particularly an important class of programs called lexers.

Learning outcomes

At the end of this Chapter, you will be able to:

• Describe how assignment works by substitution.
• Apply substitution to determine the result of an expression.
• Describe the terms state, state space, state transition and finite state machine and their relationship to Python variables.
• Draw state transition diagrams for finite state machines.
• Define the term lexer and write simple lexers in Python (e.g. to lex integers and decimal numbers).

7.1 Variables, assignment and substitution

Although this Chapter is all about something called state, before we get to that we need to revisit assignment (which you first met in Section 2.6 of Chapter 2) and explain in more detail how that works. Have a look at the following Python code:

    >>> a = 5
    >>> b = 6
    >>> c = 7
    >>> print (a * b)/(c + a)
    2
    >>>

Figure 7.1: Some Python variables

It should be obvious to you by now what this does: it creates some variables – a, b and c – and then prints out the result of an arithmetic expression. In this case, the number printed will be 2 (remember the result will be an integer). Figure 7.1 shows the variables in our little program. When Python executes this code, it stores the numbers 5, 6 and 7 in variables named a, b and c. What happens when we change the values stored in these variables, as in the code below?

    >>> a = 5
    >>> b = 6
    >>> c = 7
    >>> print (a * b)/(c + a)
    2
    >>> c = 5
    >>> print (a * b)/(c + a)
    3
    >>>

Here, a and b still store the values 5 and 6, but now c stores 5. So, the result of our calculation will now be 3. Figure 7.2 shows what’s going on here. The c variable now stores 5, but the value 7 has disappeared. Notice that c has its own copy of the number 5, so if we change a it won’t affect c, or vice-versa.

Figure 7.2: Python variables after re-assignment

7.1.1 Substitution

When you are writing Python programs and – more importantly – when you are reading programs that other people have written, sometimes you will find it difficult to understand what the code in front of you means. It’s very important to have a way of working through code, on paper, to work out what Python will do with your program when it runs. Programmers call this a walk-through, because you are “walking” through each line of code in turn.

Much of the work that programmers do in a walk-through involves understanding what values are assigned to variables at any point in the execution of a program. Although the programs we are considering here are very simple, the techniques you will learn in this Section are applicable anywhere and will help you with much more complicated programs.

So, back to our first program. Its fourth line prints out the result of this expression:

    (a ∗ b)/(a + c)

But what does Python do next? Internally, Python looks back through the information it has collected to see what values have been stored in a, b and c. It then substitutes those values for the variable names in the expression, before it can work out the result of that expression. Let’s look at how this works for our little program. The technique we’ll use here doesn’t quite reflect the complicated internals of the Python interpreter, but it gives us an understandable way of figuring out what our programs do.

We will use the notation [x \ 1] to mean “1 is substituted for x”. In this case, we can make substitutions for a, b and c in any order. We can substitute 5 for a:

    (a ∗ b)/(a + c)[a \ 5]

Which gives us:

    (5 ∗ b)/(5 + c)

Then we can substitute 6 for b:

    (5 ∗ b)/(5 + c)[b \ 6]

Which gives us:

    (5 ∗ 6)/(5 + c)

And lastly we need to substitute 7 for c:

    (5 ∗ 6)/(5 + c)[c \ 7]

Leaving us with:

    (5 ∗ 6)/(5 + 7)

Which evaluates to the answer:

    2
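The walk-through above can be imitated with a few lines of Python that treat the expression as a piece of text. This is purely an illustration of the pen-and-paper technique, not a description of how the interpreter works: the function name substitute is made up for this example, the expression is written with // so that the division gives a whole number under both Python 2 and Python 3, and eval is used only to work out the final arithmetic.

```python
import re


def substitute(expression, name, value):
    """Return expression with every whole-word occurrence of name replaced by value."""
    pattern = r"\b" + re.escape(name) + r"\b"
    return re.sub(pattern, str(value), expression)


expression = "(a * b) // (a + c)"
step1 = substitute(expression, "a", 5)   # [a \ 5]
step2 = substitute(step1, "b", 6)        # [b \ 6]
step3 = substitute(step2, "c", 7)        # [c \ 7]

print(step1)
print(step2)
print(step3)
print(eval(step3))                       # only numbers are left, so this is plain arithmetic
```

Changing the order of the three substitutions changes the intermediate steps but not the final expression.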

7.1.2 Simultaneous assignment

As a short cut, Python allows you to make several assignments on the same line of code, like this:

    >>> a, b, c = 5, 6, 7

This is called simultaneous assignment and it’s very useful in cutting down the number of lines of code you have to write. This means less typing, which is great, but of course it is important for your code to remain clear and easy to understand. One thing to remember though: when Python runs this line of code, it will evaluate all the expressions on the right hand side of the “=” before making the assignments. Can you think of a line of code for which this might be a problem? There’s an exercise at the end of the Chapter that requires you to answer this question with a bit of help from the Python interpreter.

There is one idiom that makes use of simultaneous assignment that you’ll need to know about. It might seem a bit odd at first, but it does make sense and you are very likely to see it in other people’s code. Here it is:

    >>> a, b = 4, 5
    >>> a, b = b, a
    >>> print a
    5
    >>> print b
    4
    >>>

On the first line we made a simple assignment – storing 4 in a and 5 in b. Figure 7.3 shows the result of this. On the second line, we do something that looks a bit strange – we assign a to b and b to a. This is actually perfectly legal Python code, although some languages don’t allow you to make such an assignment.

Figure 7.3: Simultaneous assignment: line 1

This works because Python evaluates all of the right hand side of the expression before making the assignments. When Python executes this code, on the right-hand side of the assignment the “old” values for a and b are used. So, Python makes this substitution:

    a, b = b, a[a \ 4, b \ 5]

which becomes:

    a, b = 5, 4

which is equivalent to:

    a = 5
    b = 4

Figure 7.4 shows this graphically.

Figure 7.4: Simultaneous assignment: line 2
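To see why the swap works, it can help to split it into the two stages Python goes through, using the values 4 and 5 from the interpreter session above. The variable names right_hand_side, x, y and temp are introduced only for this sketch; the second half shows the longer route that is needed when simultaneous assignment is not used.

```python
# Stage by stage: evaluate the whole right-hand side first, then assign.
a, b = 4, 5
right_hand_side = (b, a)     # uses the old values, so this is (5, 4)
a, b = right_hand_side       # only now are a and b changed
print(a)
print(b)

# The same swap without simultaneous assignment needs a temporary variable.
x, y = 4, 5
temp = x                     # remember the old value of x before overwriting it
x = y
y = temp
print(x)
print(y)
```

Without temp, the line x = y would destroy the old value of x before it could be copied into y.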

7.2 State

The state of something is the configuration it is in. For example, your light switch may be “On” or your mobile phone might be “Ringing”. In programming, state is usually represented by the contents of variables, hence our excursion into assignment, above.

The state space of a device or program is the set of all states that the device or program can be in. So, a light switch might have the state space: {On, Off}. A mobile phone might have a state space such as: {Ringing, Locked, On hold, Connected, . . .}

Notice that some things might have an infinite state space, or one so large that it can’t sensibly be represented in a program. For example, it might be nice for Physicists to model the Universe in a computer simulation by modelling the state of every atom in the Universe. However, this is such a huge amount of data that it’s unlikely to ever be possible.

The state of a program or device usually changes over time. So, a light switch can change its state from “On” to “Off” and a mobile phone might transition from “Ringing” to “Connected”. It is up to the designer of a program or device to make sure that their creation never ends up in an undesirable state, such as one in which your phone hangs or your ATM reboots and swallows your debit card.

It is also good practice to avoid inconsistent state. This is where two or more variables hold inconsistent information. For example, if you are storing someone’s date of birth, the current year and their age, then these need to match up. The following variables would be inconsistent:

    >>> year = 2005
    >>> age = 27
    >>> dob = "16 Feb 1970"
    >>>

Much of the job of building a computer or writing a program involves managing its state in some way and it’s important to try and think clearly about this.
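One common way to avoid this kind of inconsistency is to store only the facts that cannot be worked out from the others, and to calculate the rest when needed. The sketch below takes that approach for the age example. It is an illustration only: the function age_in, its had_birthday parameter and the variable birth_year are names invented here, and the year of birth is stored as a number rather than as a date string to keep the arithmetic simple.

```python
def age_in(year, birth_year, had_birthday=True):
    """Work out an age from the current year and the year of birth.

    had_birthday says whether this year's birthday has already happened.
    """
    age = year - birth_year
    if not had_birthday:
        age = age - 1
    return age


year = 2005
birth_year = 1970
age = age_in(year, birth_year)   # derived from the other two, so it cannot disagree
print(age)
```

Because age is calculated rather than typed in separately, there is no way for it to drift out of step with year and birth_year.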

7.3 Changing state

Figure 7.5: A state transition diagram for a light switch

The state space that a program can occupy and the transitions between its states are usually represented in a state transition diagram. Figure 7.5 shows a state transition diagram for a light switch. The states of the system are represented by circles and the name of each state is written inside its circle. Transitions between states are represented by directed arcs and the sideways “V” shape in the “Off” state shows that “Off” is the state in which the system should start. We can represent the same state changes in Python, using assignment and choice:

Listing 7.1: Light switch

    >>> def change_state(l):
    ...     if l == "off":
    ...         l = "on"
    ...     elif l == "on":
    ...         l = "off"
    ...     else:
    ...         l = "off"
    ...     return l
    ...
    >>> change_state("off")
    'on'
    >>> change_state("on")
    'off'
    >>> change_state("zklsdjaskldj")
    'off'
    >>>

Note that in the Python code we have an else clause which sensibly resets the system if it has fallen into an unknown state. On resetting, we move the system into its starting state of “Off”. This is one aspect of good practice in programming with state which you should aim to incorporate into your own work.

7.4 Finite state machines

The light switch system is an example of a finite state machine (sometimes called a finite state automaton). A finite state machine is a system which transitions between states in its state space, like our light switch moving from “On” to “Off” then “On” again, and so on. Another example is a set of traffic lights. Figure 7.6 gives the state transition diagram for this system – it should be familiar to you, but if not we hope you don’t drive!

Figure 7.6: A state transition diagram for a set of (British) traffic lights

Again, we can write these state transitions out in Python code. If the system gets into an unknown state, we can transition it back to “Red”, which we know to be safe:

Listing 7.2: Traffic lights

    >>> def change_state(s):
    ...     if s == "red":
    ...         s = "red and amber"
    ...     elif s == "red and amber":
    ...         s = "green"
    ...     elif s == "green":
    ...         s = "amber"
    ...     elif s == "amber":
    ...         s = "red"
    ...     else:
    ...         s = "red"
    ...     return s
    ...
    >>> change_state("red")
    'red and amber'
    >>> change_state("green")
    'amber'
    >>> change_state("amber")
    'red'
    >>> change_state("aasdnhasjkd")
    'red'
    >>>

In both the light switch and traffic light systems the next state is only determined by the previous state. So, for our light switch, if the machine was in its “Off” state, then the next state has to be “On”. If the traffic light system is in its “Red” state the next transition has to be to “Red/Amber”. In some state machines, however, the next state is determined by the current state and an input.

As an example, consider the vending machine in Figure 7.7. Here, a vending machine can dispense every programmer’s favourite drinks – cola and coffee. Since coffee is superior to cola, cola costs one token and coffee costs two tokens. Some of the time the system is in the “Idle” state, but once a token has been placed in the machine it enters the “1 token” state.

Figure 7.7: A state transition diagram for a simple vending machine

Notice that the arcs in Figure 7.7 are labelled. The arc going from the “Idle” to the “1 token” states is labelled with “token”, meaning that if the machine is in the “Idle” state an input of “token” will cause it to move into the “1 token” state. If the “cola button” is pressed while the machine is in the “1 token” state it will output a cola. The syntax “cola button / vend cola” means that on that particular state transition, “cola button” is an input to the machine and “vend cola” is an output from it.

Following the state transitions round, you can work out how a consumer can use the vending machine. For example, there are two ways of buying two colas, given that the machine starts in its “Idle” state. Firstly you can put two tokens in consecutively (putting the machine in its “2 tokens” state) then press the “cola button” twice. Otherwise, you can put one token in (moving the machine into its “1 token” state) then press the “cola button”, and repeat those actions. Follow the state transition diagram round some more. What happens if the consumer puts three tokens into the machine without pressing any buttons?
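A machine whose next state depends on both the current state and an input can be written down as a table. The sketch below is a guess at a table for the vending machine, built only from the transitions described in the text: the state and input names are spelled as in the prose, the names TRANSITIONS, step and run are invented here, and the rule that any input not in the table is simply ignored is an assumption of this example, so the real Figure 7.7 may behave differently.

```python
# (current state, input) -> (next state, output); None means "no output"
TRANSITIONS = {
    ("idle", "token"): ("1 token", None),
    ("1 token", "token"): ("2 tokens", None),
    ("1 token", "cola button"): ("idle", "vend cola"),
    ("2 tokens", "cola button"): ("1 token", "vend cola"),
    ("2 tokens", "coffee button"): ("idle", "vend coffee"),
}


def step(state, event):
    """Return (next_state, output). Unlisted pairs leave the state unchanged (an assumption)."""
    return TRANSITIONS.get((state, event), (state, None))


def run(events, state="idle"):
    """Feed a list of inputs to the machine; return the final state and all outputs."""
    outputs = []
    for event in events:
        state, output = step(state, event)
        if output is not None:
            outputs.append(output)
    return state, outputs


# The two ways of buying two colas described above
print(run(["token", "token", "cola button", "cola button"]))
print(run(["token", "cola button", "token", "cola button"]))
```

Keeping the transitions in one table makes it easy to compare the code against the diagram arc by arc.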

7.5 Lexers

Lexers are an important class of state machine. Their job is to take a string and output a set of tokens which represent the words, numbers or symbols in that string. Why is that useful? Well, in a compiler or interpreter (such as the Python runtime), when you type in a line of code, it is the job of the compiler or interpreter to make sense of what you’ve given it – or to issue an error message. The first thing the compiler or interpreter does is to split up your line of code into tokens. So, if you entered this line into the Python interpreter:

    >>> sum, diff = (x + y), (x - y)

Python’s lexer might issue something like the following set of tokens:

    NAME "sum"
    COMMA
    NAME "diff"
    EQUALS
    LPAREN
    NAME "x"
    PLUS
    NAME "y"
    RPAREN
    COMMA
    LPAREN
    NAME "x"
    MINUS
    NAME "y"
    RPAREN

The next thing that Python does is to try and figure out what those tokens mean. This process is called parsing, which you’ll see in Section 8.5.1 of Chapter 8.

Looking through the list of tokens in the Listing above, you’ll probably notice a couple of things. Firstly, there’s no whitespace (spaces, tabs, carriage returns, etc). Usually these aren’t important, although in Python, of course, some tabs and returns need to be kept because they mark out blocks of code (as with loops and choice statements). Secondly, the lexer has figured out that sum and diff are both names, by looking at the spaces and symbols that separate the words sum and diff from other tokens. Of course, variable names can be as long as we like in Python, so the lexer has to be smart enough to figure out that a name is a name, without trying to guess how long it might be. Equally, we might have an int that is very long (1234567...) and the lexer should still recognise that it is a number and convert it to an int (or some other appropriate type) for the parser to handle.

Figure 7.8 shows how a lexer for integers could work. The state transition diagram shows that once the lexer has read one digit it is in its “digit” state, and it’ll stay in that state whilst its input is a character which is a digit (i.e. 0,1,2,3,4,5,6,7,8 or 9).

Figure 7.8: A lexer for integers

So, how should we implement this lexer? Well, first we need to use our state machine to scan in all the consecutive digits in the input. There are a couple of useful Python built-in functions to know about here:

raw_input() grabs input from the keyboard and returns it as a string (type str). It will grab every character that is input from the keyboard up to a carriage return (or “enter”).

isdigit() can be called on a string and returns True if that string is not empty and contains only digits, and False otherwise. You can use it like this: "abc".isdigit(), "123".isdigit() or c.isdigit(), where c is of type str.

We’ll assume that the input at least starts with a digit and that we should stop lexing when we encounter a non-digit, as we’ve represented in the state transition diagram in Figure 7.8. The code below does this. Have a look through and make sure you can see where each part of the code matches the diagram. Which part of the diagram does the for loop correspond to?

    input = raw_input() # Grab raw keyboard input
    int_string = ""

    for c in input:
        if c.isdigit(): # state: digit
            int_string = int_string + c
        else: # state: accept
            break

So, when this code has been executed, the int_string variable is holding a string of digits which were entered at the keyboard. Next, we need to convert that string into an integer. It’s probably easiest to see how this might be done by considering a few concrete examples before considering the general case. If the int_string variable holds the string "123" then the integer we want to generate is 123. We can do this with the following arithmetic expression: 100*1+10*2+1*3. If the int_string variable holds "54321" then the corresponding int could be made like this: 10000*5+1000*4+100*3+10*2+1*1.

Say there are m + 1 characters in the string, then we can represent it like this:

    n_m n_{m-1} n_{m-2} \ldots n_2 n_1 n_0

where each n_i is a digit and the subscript tells us whereabouts that digit occurs in the string. Now, we need to multiply each digit by the relevant power of ten and add it to the rest, like this:

    n_m \times 10^m + n_{m-1} \times 10^{m-1} + n_{m-2} \times 10^{m-2} + \ldots + n_2 \times 10^2 + n_1 \times 10^1 + n_0 \times 10^0

This is simple to do in Python: we just use a for loop to iterate over the digits, multiply them by the right power of ten and add them onto the output variable. At the end we can print out the result:

    power = len(int_string) - 1
    output = 0
    for c in int_string:
        output = output + int(c) * 10 ** power
        power = power - 1
    print output

Putting all that together, we get the full lexer:

Listing 7.3: A lexer for integers

    #!/bin/env python2.4

    input = raw_input() # Grab raw keyboard input
    int_string = ""

    for c in input:
        if c.isdigit(): # state: digit
            int_string = int_string + c
        else: # state: accept
            break

    # Evaluate string and print out the integer result
    power = len(int_string) - 1
    output = 0
    for c in int_string:
        output = output + int(c) * 10 ** power
        power = power - 1
    print output
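The same two stages can also be wrapped in a function, which makes the lexer easier to try on several inputs. The sketch below is only an illustration: the name lex_int is ours rather than part of the listing, and the text to lex is passed in as an argument instead of being read with raw_input().

```python
def lex_int(text):
    """Scan the leading digits of text and return them as an integer."""
    # Scanning stage: collect consecutive digits (state: digit)
    int_string = ""
    for c in text:
        if c.isdigit():
            int_string = int_string + c
        else:
            break  # state: accept

    # Evaluation stage: multiply each digit by its power of ten
    power = len(int_string) - 1
    output = 0
    for c in int_string:
        output = output + int(c) * 10 ** power
        power = power - 1
    return output
```

Calling lex_int("123abc") scans the digits "123" and stops at the first letter. If the text does not start with a digit, no digits are collected and this sketch returns 0; the listing itself assumes that case never happens.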

7.6 Programs which manage state

Programs that manage state are often called imperative and languages that support them (like Python, Java and C) are called imperative programming languages. Making sure that your programs are always in a consistent and desirable state, no matter what happens, is a tough job. In Chapter 10 we’ll look at pre- and post-conditions which will also help you write correct programs. However, it’s worth considering some small improvements to your programs which will help to keep them in a consistent state. For example, in the light switch system we used the state values “On” and “Off”. In the code, we just used strings to hold these values:

    if state == "On":
        state = "Off"

It might be easier to manage the state of this program if the possible values of the state variable were held in a list. Then we could at least ensure that whenever we assign something to state, we assign a value from that list.
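One way this idea might look in code is sketched below. The names STATES, check_state and toggle are hypothetical and do not appear in the light switch program; the point is only that every new value passes through a check against the list before it is stored in state.

```python
# Hypothetical name: the list of every value that state may take
STATES = ["On", "Off"]

def check_state(value):
    """Return value unchanged if it is a legal state, otherwise raise an error."""
    if value not in STATES:
        raise ValueError("not a valid state: %r" % (value,))
    return value

def toggle(state):
    """Switch the light from On to Off, or from Off to On."""
    if state == "On":
        return check_state("Off")
    else:
        return check_state("On")

state = check_state("On")
state = toggle(state)  # state now holds "Off"
```

A misspelt value such as "on" is then rejected as soon as it is assigned, rather than leaving the program quietly in an inconsistent state.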

7.7 Further reading

• State on FOLDOC: http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?state
• Finite state machines on FOLDOC: http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?finite+state+machine
• Finite state machines on Wikipedia: http://en.wikipedia.org/wiki/Finite_State_Machine
• Lexical analysis on Wikipedia: http://en.wikipedia.org/wiki/Lexical_analyser
• Robert Newman, Elena Gaura, Dominic Hibbs (2002) Computer systems architecture. Published by Crucial, ISBN 1-903337-07-0. Sequential logic and finite state machines, pp. 28-29
• Frank Giannasi and Robert Low (1995) Maths for computing and information technology. Published by Longman, ISBN 0-582-23654-1. Finite State Automata, Chapter 5, pp. 82-99

7.8 Glossary

Finite state automaton: another name for a finite state machine.

Finite state machine: a theoretical machine which transitions between states in its state space.

Imperative programming language: one which provides facilities to manage state (such as assignment). Python, C, C++, Java and Perl are all examples of imperative programming languages.

Inconsistent state: an undesirable configuration for a program or machine to be in. In programming, this usually means that some variable or set of variables holds inconsistent data.

Lexer: a special sort of state machine which takes a string as an input and outputs a series of tokens which represent the words and symbols in the original string. For example, if the input to a lexer is:

    foo = bar + 100

the resulting tokens might be:

    NAME "foo"
    EQUALS
    NAME "bar"
    PLUS
    INT 100

Lexical token: another name for a token.

Simultaneous assignment: assigning values to two or more variables at once, such as:

    x, y, z = 1, 2, 2**3

State: the configuration of something. In programming, the state of a program at a particular time is usually the values held in the variables in that program.

State space: the set of all possible states that a program or machine can be in.

State transition: the transition of a program or machine from one state to another. In programming this usually happens by making an assignment to a variable.

Substitution: replacing a variable name with its value, as part of a walk-through. This is usually written [x \ 1] meaning “substitute 1 for x”.

Token: the output of a lexer.

Walk-through: looking through a program and determining what each line of code means (without the help of a computer). This is often done by teams of programmers to check the quality of code and find bugs.

7.9 Homework exercises

1. Type the following line of code into a Python interpreter:

    x, y, z = 1, 2, x**3

What error message do you get? Why do you think this happens? Re-write the code to eliminate the error.

2. For each of lines 1-4 in the following code, write down the values associated with each variable name in the program:

    1 x, y, z = 1, 2, 3
    2 z, y, x = x, z, y
    3 x, y, z = (z+1), (x-2), (y*3)
    4 y, z, x = (y/x), (x*y), (z**x)
    5 print (x+y)/(z-y)

Use substitution to work out what Python will print when it executes line 5. Do this explicitly, like we did in Section 7.1.1.

3. Write down the state space of the following devices:
(a) An MP3 (or Ogg Vorbis) player.
(b) A microwave oven.

4. Draw a state transition diagram for a two-bit binary counter. A binary counter is a digital circuit which has a clock input and an output which gives the number of clock cycles which have been counted. A two-bit binary counter only has two output wires, each of which can read either 1 or 0. This means that a two-bit counter can count from 0 to 2^2 − 1 and back to 0 again.

5. Write out Python code which manages the state of your two-bit binary counter (like we did in Sections 7.3 and 7.4 for the light switch and traffic light examples).

6. Draw a state transition diagram for a PIN (Personal Identification Number) recogniser. If the user enters “1” followed by “8” followed by “5” followed by “0” followed by “enter”, the machine should enter a state called “accept”. If the user enters any other combination of numbers, followed by “enter”, the machine should enter a state called “reject”. Is this program a lexer?

7. Look at the lexer for integers from Section 7.5. Notice that the lexer can only lex non-negative integers. Improve the script by adding code to lex negative numbers.

7.9.1 Key Assignment

Write a lexer to read in decimal numbers. Think hard about how to do this. If your input is the string “12345.9876”, what do you have to do to each character in the string to generate your float? How are the numbers before the decimal point different to the numbers after the decimal point? Make sure you have an algorithm that works correctly on paper before you start coding and draw a state transition diagram for the scanning part of your lexer.

Chapter 8

Compound types

Learning outcomes

At the end of this Chapter, you will be able to:

• Use compound sequence types in Python
• Understand the use of strings, lists, tuples and dictionaries, their similarities and differences
• Use compound types

8.1 Introduction

Types were introduced in Section 2.4. As a quick recap, remember that the type of something indicates the values it can hold and the operations that can be performed on it. The types we have used so far have been simple types, with one exception. A simple type stores a single value - an integer, a float, a boolean, etc. The exception is the string type. String is actually a compound type, since it holds a number of values. That is, a string is made up of single characters. It can hold 0 or more characters and, as we saw in Section 7.5, it is possible to examine each element individually. We will begin this chapter with a more detailed look at strings, before we move on to new compound types.

8.2 Strings

We have used strings repeatedly, but their compound nature only became apparent in Section 7.5, where we discovered that we could iterate over the individual elements (the characters) that make up the string.

Access to the elements of a string is not restricted to loops, however. It is possible to access single elements or ranges of elements as if they were separate variables. To do this, we use Python’s index and slice notation. Since indexing and slicing are used for more than just strings, they are each given their own subsection here.

8.2.1 Indexing

Indexing allows us to get at the individual elements of a compound type. For example, if we wanted to know the first letter of the string variable a, we could use a[0]. Here’s a quick example:

    a = 'This is some text'
    print a[0]
    print a[1]
    print a[6]

Which produces the following output:

    T
    h
    s

Note that the first element has index 0, so if the length is n, then the last element has index n − 1. Using this, we could iterate over the characters in the string like this:

    a = 'This is some text'
    for i in range(len(a)):
        print a[i]

The output is:

    T
    h
    i
    s
     
    i
    s
     
    s
    o
    m
    e
     
    t
    e
    x
    t

As we saw in Section 6.4, we can get the same result with:

    a = 'This is some text'
    for i in a:
        print i

But using indices allows us more flexibility. Consider the following:

    a = 'Assistant poppy sniffer'
    for i in range(len(a)):
        if i > 0:
            if a[i] == a[i-1]:
                print '*2',
            else:
                print ',' + a[i],
        else:
            print a[i],

You should be able to determine the output without executing it. What is the output? [1] Now try to do this without indexing. That is, using for c in a:.

[1] A,s*2,i,s,t,a,n,t,,p,o,p*2,y,,s,n,i,f*2,e,r

Negative indices

It is valid in Python to use negative indices. At first this may seem nonsensical, since there are no elements that come before the first one. Negative indices in Python are used to refer to elements in reverse order. That is, for a string, a, of length n, the first element can be addressed as a[0] or a[-n]. The last element can be accessed as either a[n-1] or a[-1]. Here is an example program that tests for palindromes: words or numbers that read the same forwards or backwards.

Listing 8.1: Test for palindromes

    # Tests for palindromes

    # Read some input from the user
    print "Enter text:",
    text = raw_input()

    # Look at each character
    for i in range(len(text)):

        # And compare it with the character at the same distance from the
        # opposite end
        if text[i] != text[-(i+1)]:
            print text, "is not a palindrome."
            break
    else:
        # This else belongs to the for loop: it runs only if the loop
        # finished without reaching break
        print text, "is a palindrome."
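The palindrome test can also be written as a function that returns True or False. This is a sketch rather than a replacement for the listing: the name is_palindrome is ours, and it only compares the first half of the text with the second half, because checking the remaining positions would repeat the same comparisons in the opposite direction.

```python
def is_palindrome(text):
    """Return True if text reads the same forwards and backwards."""
    for i in range(len(text) // 2):
        # Compare with the character at the same distance from the other end
        if text[i] != text[-(i + 1)]:
            return False
    return True
```

For example, is_palindrome("level") compares text[0] with text[-1] and text[1] with text[-2]; the middle character never needs checking.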

8.2.2 Slicing

Slicing is similar to indexing, but it allows us to extract more than one element. Here is a simple example:

    a = 'ABCDEFGHIJKLMNOP'
    print a
    print a[0:4]
    print a[2:7]

The output of this is:

    ABCDEFGHIJKLMNOP
    ABCD
    CDEFG

So, we need two numbers to perform a slice. a[m:n] gives us n − m items in total, beginning with the element at index m and ending with the element at index n − 1. We can also leave out either number, or use negative numbers, like this:

    # Print from index 4 to the end of the string
    print a[4:]

    # Print up to, but not including, index 6
    print a[:6]

    # Print the last 2 characters
    print a[-2:]

    # Print all but the last 10 characters
    print a[:-10]

    # Print the whole string
    print a[:]
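The slices above can be checked by storing each one in a variable instead of printing it. The variable names below are ours and are used only for illustration.

```python
a = 'ABCDEFGHIJKLMNOP'
m, n = 2, 7

piece = a[m:n]             # elements at indices 2, 3, 4, 5 and 6
count = len(piece)         # n - m elements in total
from_four = a[4:]          # index 4 to the end
first_six = a[:6]          # indices 0 to 5
last_two = a[-2:]          # the final two characters
drop_last_ten = a[:-10]    # everything except the final ten characters
whole = a[:]               # the whole string
```

Since a holds 16 characters, dropping the last ten leaves the same six characters as a[:6].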

8.3 Tuples

A tuple in Python is a series of items of any type. A tuple is immutable: it cannot be changed once it is created. Here is an example of a tuple:

    stuff = (12, 'James', True, 0.66)
    for i in stuff:
        print i, 'has type', type(i)

The output of this code is:

    12 has type <type 'int'>
    James has type <type 'str'>
    True has type <type 'bool'>
    0.66 has type <type 'float'>

So, we can put arbitrary items in a tuple and access them with indexing or slicing. This is extremely useful, as it lets us group information together. The example below shows how useful this is.

8.3.1 Distance finding with tuples

The distance between two points, (x_1, y_1) and (x_2, y_2), is calculated as:

    d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}

We could write a function to do this calculation that takes four arguments: the x and y coordinates of the pair of points. It makes more sense, however, to write one that just takes two points as arguments. The two points would be tuples, each containing two numbers. If we call the points p1 and p2, then the x coordinate of p1 would be p1[0]. Here is our example, with a few simple tests:

    import math

    def distance(p1, p2):
        return math.sqrt((p1[0]-p2[0])**2 +
                         (p1[1]-p2[1])**2)

    print '(1,1) to (1,10) =', distance((1,1), (1,10))
    print '(0,0) to (1,1) =', distance((0,0), (1,1))

8.3.2 Adding to immutables

We have already said that tuples and strings are immutable, and yet this:

    a = (1, 2, 3)
    a = a + (12, 0.1)
    print a

Prints (1, 2, 3, 12, 0.1). Does this mean we are changing a tuple? In fact, it doesn’t. Python creates a second tuple that is the same as the concatenation of the two given tuples, and this is assigned to a. This tends to make the operation slower than with lists, which we discuss next.
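A small experiment makes the difference between changing a tuple and creating a new one visible. The names original, same_object and changed are ours, chosen only for this sketch.

```python
a = (1, 2, 3)
original = a           # a second name for the same tuple

a = a + (12, 0.1)      # builds a new tuple and rebinds the name a

same_object = a is original    # False: a now refers to a different tuple

# Trying to change an element in place is refused with a TypeError
try:
    original[0] = 15
    changed = True
except TypeError:
    changed = False
```

After this runs, original still holds (1, 2, 3): the concatenation left the first tuple untouched and only changed which tuple the name a refers to.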

8.4 Lists

Python’s lists behave in the same way as tuples, except they are not immutable.

    # Tuple
    stuff = (12, 'James', True, 0.66)

    # Causes an error...
    stuff[0] = 15

    # List
    stuff = [12, 'James', True, 0.66]

    # Assigns 15 to the first element of 'stuff'
    stuff[0] = 15

    # Concatenate lists
    stuff = stuff + [12, 'James', True]

8.4.1 Greenfly reproduction

This example models the reproduction of greenfly. Greenfly reproduce asexually, so it only takes one greenfly to breed. Starting with one greenfly, we can use the average number of eggs hatched per day and the time it takes those eggs to hatch, to work out how many greenfly we can expect after a given number of days.

Listing 8.2: Greenfly Reproduction

    # Set up some variables...
    # Total days in the study
    total_days = 15

    # Eggs laid per greenfly per day
    eggs_per = 4

    # Days for eggs to hatch
    hatch_in = 5

    # Make enough room for calculations
    days = [0] * (total_days + hatch_in)

    # One greenfly to start
    days[0] = 1

    for i in range(total_days):
        # Lay eggs - add to population in future
        days[i+hatch_in] += days[i] * eggs_per
        # Add today's population to tomorrow
        days[i+1] += days[i]

    print days
    print "Greenfly after", total_days, "days =", days[total_days-1]
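To experiment with different breeding rates, the same calculation could be placed in a function. This is a sketch under the same model as Listing 8.2; the function name greenfly and its parameter order are our own choices.

```python
def greenfly(total_days, eggs_per, hatch_in):
    """Return the expected population on the last day of the study."""
    # Make enough room for eggs that hatch after the study ends
    days = [0] * (total_days + hatch_in)
    days[0] = 1  # one greenfly to start
    for i in range(total_days):
        days[i + hatch_in] += days[i] * eggs_per  # eggs hatch later
        days[i + 1] += days[i]                    # today's greenfly carry over
    return days[total_days - 1]
```

A useful sanity check is greenfly(10, 0, 3): with no eggs laid, the population should stay at the single greenfly we started with.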

8.4.2 Cellular automata

Cellular automata are simple but interesting models of ecologies. The “space” that we look at is divided into cells. Each of these cells is dead or alive. For each iteration, the state of a cell is determined by its neighbours. Here we look at a simple one-dimensional ecology, with a very simple rule. The rule is that a cell is alive if the number of cells in its vicinity that were alive in the last iteration was one or two. Here is the code.

Listing 8.3: Cellular Automata

    import random

    def life_init(l):
        space = range(l+2)
        for i in range(1, l+1):
            space[i] = False
            if random.random() >= 0.5:
                space[i] = True
        space[0] = space[l+1] = False
        return space

    def life_display(space):
        out = ""
        for i in range(1, len(space)-1):
            if space[i]:
                out = out + "#"
            else:
                out = out + " "
        print "|" + out + "|"

    def life_step(space):
        space2 = [False] * len(space)
        for i in range(1, len(space)-1):
            if 1

[...]

        elif i > 0 and numerals[input[i-1]] < numerals[input[i]]:
            pass # Already dealt with this numeral

Otherwise, we just add this to the total:

        else:
            output = output + numerals[input[i]]

So, now we’ve parsed the numeral, all we have to do is print out the result for the user to see:

    print output

Putting those stages together gives us the complete script:

Listing 8.5: A parser for Roman numerals

    #!/bin/env python2.4

    input = raw_input()

    # Lookup-table for Roman numerals
    numerals = {"I" : 1,
                "i" : 1,
                "V" : 5,
                "v" : 5,
                "X" : 10,
                "x" : 10,
                "L" : 50,
                "l" : 50,
                "C" : 100,
                "c" : 100,
                "D" : 500,
                "d" : 500,
                "M" : 1000,
                "m" : 1000}

    output = 0

    for i in range(len(input)):
        if i < len(input) - 1 and numerals[input[i]] < numerals[input[i+1]]:
            output = output + numerals[input[i+1]] - numerals[input[i]]
        elif i > 0 and numerals[input[i-1]] < numerals[input[i]]:
            pass # Already dealt with this numeral
        else:
            output = output + numerals[input[i]]

    print output

Because we only look one step ahead, we cannot use IIX to get 8. Our program will add 1 to the total (I), and then 9 (IX), so we get 10. To get 8 we need VIII. Another oddity with this is that we can use IIIIIIIIIIII, for example, to get 12.
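The behaviour described above is easier to explore if the parser is turned into a function. The sketch below follows the same three-way test as Listing 8.5; the names NUMERALS and roman_to_int are ours, and as a shortcut it converts the text to upper case rather than storing both cases in the table.

```python
# Lookup table for Roman numerals (upper case only, to keep the sketch short)
NUMERALS = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(text):
    """Parse a Roman numeral by looking one symbol ahead and one behind."""
    text = text.upper()
    output = 0
    for i in range(len(text)):
        if i < len(text) - 1 and NUMERALS[text[i]] < NUMERALS[text[i + 1]]:
            # A smaller symbol before a larger one: add the difference
            output = output + NUMERALS[text[i + 1]] - NUMERALS[text[i]]
        elif i > 0 and NUMERALS[text[i - 1]] < NUMERALS[text[i]]:
            pass  # Already dealt with this numeral
        else:
            output = output + NUMERALS[text[i]]
    return output
```

With this version, roman_to_int("IIX") gives 10 and roman_to_int("VIII") gives 8, matching the oddities noted in the text.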

8.5.2 IMDB-style database

Dictionaries are great for database-style records because items can be named. Here’s an example of a simple database application:

Listing 8.6: A simple film database

    # A list of entries
    # Each element will be a dictionary
    records = []

    # User input
    user = ""

    while user != "0":
        # display menu
        print
        print "Film Database"
        print "-----------------"
        print "1 -- Add entry"
        print "2 -- List entries"
        print "0 -- Exit"
        print "-----------------"
        print len(records), "entries."
        print
        print "Enter option:",

        # Get user input
        user = raw_input()

        if user == "1":
            # Add to database
            # Create empty dictionary
            item = {}
            print "Enter title:",
            item["Title"] = raw_input()
            print "Enter director:",
            item["Director"] = raw_input()
            print "Enter year:",
            item["Year"] = raw_input()
            records.append(item)

        elif user == "2":
            # Display database
            print "\t" + "-"*5
            for r in records:
                print "Title:\t\t", r["Title"]
                print "Year:\t\t", r["Year"]
                print "Director:\t", r["Director"]
                print "\t" + "-"*5
            print

        elif user != "0":
            # "0" is the exit option, so it is not reported as unknown
            print "Unknown option"

8.6 Further reading

• WikiBooks Python Dictionaries chapter: http://en.wikibooks.org/wiki/Programming:Python_Dictionaries
• Data structures, from the Python documentation: http://docs.python.org/tut/node7.html

8.7 Glossary

Compound type: a type that contains more than one piece of data.

Tuple: a type that stores an immutable ordered list of objects.

List: a type that stores a mutable ordered list of objects.

Dictionary: stores named items.

LUT: Look-Up Table, a way to store values and use a key to retrieve them.

Key: data identifying a value.

Index: a position in a sequence, or the act of retrieving data by index.

Immutable: fixed, unchangeable.

8.8 Exercises

1. Are Python’s strings immutable?

8.8.1 Key Assignment

Extend the functionality of the film database by:

1. Allowing a single film to be displayed by index
2. Allowing a film to be removed by index

Chapter 9

Searching and sorting

Learning outcomes

• Understand the concepts involved in searching and sorting
• Understand and implement the basic algorithms for searching and sorting
• Select the appropriate algorithm for a task

9.1 Introduction

Searching and sorting are two activities that computers carry out regularly: sorting the addresses in your mail client, sorting directory entries, looking for a file, finding a piece of text in a document, and so on. So, it shouldn’t be much of a surprise that searching and sorting have been analysed extensively, and that there are a number of algorithms to accomplish each one. Here, we’re going to look at the most common algorithms and their Python implementations.

You might wonder why we need to understand these algorithms, or be able to write them, since there are many implementations available as libraries for practically any language you care to choose. Python has the ability to sort lists already, for example. We look at these algorithms because they are so often used, and used in many different ways. Perhaps you need to sort data in a slightly different way, or sort a non-standard type of data. Understanding these algorithms gives you the ability to customise them for your needs. They are also very good as exercises in computer science.

Python’s list type is extremely useful for implementing these algorithms, since it allows us to very simply append or slice ranges of items. Compare, for example, a Java implementation of the quick sort algorithm with a Python one.

9.2 Searching

Searching is the act of finding a piece of information. There are a number of methods for doing this, and we will examine a few of the most common ones here.

9.2.1 Linear search

A linear search is the most basic of searching algorithms. If we have N items of data, then we search from 1 to N until we find the data we’re looking for. If we’re doing an exhaustive search, one that will find every matching record, then we must always look at every item.

Here, we have a simple program that searches for a record. Records are stored in a list, called records, and each one is a dictionary. The keys for the dictionary are name, age, and course. In this example, we’re looking for people on the course “111”.

Obviously, in a real-world program we’d have real-world data. For our examples, we’re going to use some generated data. The function we will use to generate the data is listed below. Calling this function creates a list of a given size filled with names, ages and courses. Notice the use of the random.choice() function, which makes a random selection from a given list.

Listing 9.1: Generating random records

    def makeRecords(number):
        names1 = ["Alice", "Bob", "Clare", "Dave", "Emma", "Frank"]
        names2 = ["Smith", "Jones", "Davis", "Wardly", "Bright"]
        courses = [111, 101, 110, 171, 159]
        records = []
        for i in range(number):
            item = {}
            name = random.choice(names1) + " " + random.choice(names2)
            item['name'] = name
            item['course'] = random.choice(courses)
            item['age'] = int(random.uniform(18, 50))
            records.append(item)
        return records

Now let’s take a look at the linear search function, lSearch. This takes a list of records as input, along with the key to check and the value to check it for:

Listing 9.2: Linear Search

    def lSearch(key, value, records):
        found = []
        for r in records:
            if r[key] == value:
                found.append(r)
        return found

We could have hard-coded the key and value that we’re looking for, but that would make our function too specific to this problem. As it is, the only requirement is that we are searching a list containing dictionaries. Here is the complete code, showing the function call and listing of results:

Listing 9.3: Linear Search

    import random

    def makeRecords(number):
        names1 = ["Alice", "Bob", "Clare", "Dave", "Emma", "Frank"]
        names2 = ["Smith", "Jones", "Davis", "Wardly", "Bright"]
        courses = [111, 101, 110, 171, 159]
        records = []
        for i in range(number):
            item = {}
            name = random.choice(names1) + " " + random.choice(names2)
            item['name'] = name
            item['course'] = random.choice(courses)
            item['age'] = int(random.uniform(18, 50))
            records.append(item)
        return records

    def lSearch(key, value, records):
        found = []
        for r in records:
            if r[key] == value:
                found.append(r)
        return found

    records = makeRecords(30)
    found = lSearch('course', 111, records)
    for i in found:
        print i['name'] + " (" + str(i['age']) + ") studies " + str(i['course'])
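lSearch is exhaustive: it always looks at every record. When only the first match is needed, the loop can stop early. The variant below is a sketch of that idea; lSearchFirst and the hand-written sample records are ours and are not part of the listings, and the convention of returning -1 for "no match" is an assumption made for this example.

```python
def lSearchFirst(key, value, records):
    """Return the position of the first matching record, or -1 if none match."""
    for position in range(len(records)):
        if records[position][key] == value:
            return position  # stop as soon as a match is found
    return -1

# Hand-written records so that the result is predictable
sample = [
    {'name': 'Alice Smith', 'age': 19, 'course': 101},
    {'name': 'Bob Jones', 'age': 23, 'course': 111},
    {'name': 'Clare Davis', 'age': 31, 'course': 111},
]

first_on_111 = lSearchFirst('course', 111, sample)
```

Here the search stops at position 1 and never examines the third record, even though it also matches.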

9.2.2 Binary search

While the linear search will search up to N records to find the item needed, we can search more efficiently if our data meets a few criteria:

1. We are searching for a single item
2. The data is sorted by this item

The binary search process is very simple, and is one that we might use in real life when looking through the library to find a book by the author’s name. Knowing that the data is ordered, we begin in the middle of the list. If the value in the middle is greater than the one we’re after, we know that ours must lie in the first half of the list. If the value in the middle is less than the desired value, we know that we must look in the second half of the list. We now have the same problem to solve, but with half of the data removed. We repeat the process until we have only one element left.

First, let’s define a function to create some data. This function builds a sorted list of names:

Listing 9.4: Generating Random Names

    def makeRecords():
        names1 = ["Alice", "Bob", "Clare", "Dave", "Emma", "Frank"]
        names2 = ["Smith", "Jones", "Davis", "Wardly", "Bright"]
        records = []
        for i in names1:
            for j in names2:
                name = i + " " + j
                records.append(name)
        records.sort()
        return records

Each combination of first and last name is created and added to the list. Notice that we’ve used the sort() function to ensure that our data is in order.

We begin our search with the start and end points being the start and end of the list. We calculate the midpoint and check that value. By moving either the start or end point, depending on the value at the midpoint, we can repeat the search until we find our person, or we are sure they don’t exist.

Listing 9.5: Binary Search

    def bSearch(value, records):
        name = ("Not found", 0)
        start = -1
        end = len(records)
        while end - start > 1:
            midpoint = (end - start)/2 + start
            if records[midpoint] == value:
                name = (value, midpoint)
                break
            elif records[midpoint] > value:
                end = midpoint
            else:
                start = midpoint

        return name

This function returns a tuple containing the name, or a “Not found” message, and the position of that name in the list. If the two ends meet, then the name was not found. Here is a complete listing:

Listing 9.6: Binary Search

    import random

    def makeRecords():
        names1 = ["Alice", "Bob", "Clare", "Dave", "Emma", "Frank"]
        names2 = ["Smith", "Jones", "Davis", "Wardly", "Bright"]
        records = []
        for i in names1:
            for j in names2:
                name = i + " " + j
                records.append(name)
        records.sort()
        return records

    def bSearch(value, records):
        name = ("Not found", 0)
        start = -1
        end = len(records)
        while end - start > 1:
            midpoint = (end - start)/2 + start
            if records[midpoint] == value:
                name = (value, midpoint)
                break
            elif records[midpoint] > value:
                end = midpoint
            else:
                start = midpoint

        return name

    records = makeRecords()
    print bSearch("Dave Bright", records)

To show the movement of the start and end points, you may wish to use the modified search function below. Each iteration shows the positions in the list, plus the start and end points, shown as |.

Listing 9.7: Visual Binary Search

    def bSearch_show(value, records):
        name = ("Not found", 0)
        start = -1
        end = len(records)
        while end - start > 1:
            o = ""
            for i in range(len(records)):
                o = o + str(i % 10)
            print o

            print " "*(start) + "|" + " "*(end-start-1) + "|"

            midpoint = (end - start)/2 + start
            if records[midpoint] == value:
                name = (value, midpoint)
                break
            elif records[midpoint] > value:
                end = midpoint
            else:
                start = midpoint

        return name

Here’s some example output, showing the partitioning required to search for “Clare Jones”:

    012345678901234567890123456789
    | |
    012345678901234567890123456789
    | |
    012345678901234567890123456789
    | |
    012345678901234567890123456789
    | |
    012345678901234567890123456789
    | |
    ('Clare Jones', 12)
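For readers trying this on a newer Python, note that the listings rely on / performing integer division, which in Python 3 produces a float that cannot be used as a list index. The sketch below uses // instead so that the midpoint is always a whole number. The name binary_search, the short test list, and the choice to return just an index (or -1 when nothing is found) are ours; otherwise it follows the same start, end and midpoint scheme as bSearch.

```python
def binary_search(value, records):
    """Return the index of value in the sorted list records, or -1."""
    start = -1             # just before the first candidate
    end = len(records)     # just after the last candidate
    while end - start > 1:
        midpoint = (end - start) // 2 + start
        if records[midpoint] == value:
            return midpoint
        elif records[midpoint] > value:
            end = midpoint      # value can only be in the first half
        else:
            start = midpoint    # value can only be in the second half
    return -1

names = sorted(["Dave", "Alice", "Frank", "Bob", "Emma", "Clare"])
position = binary_search("Clare", names)
```

Each pass through the loop discards about half of the remaining candidates, which is why the list must be sorted before the search begins.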

9.3 Sorting

Here, we’re going to look at some of the most common sorting algorithms. We’ll look at sorting simple lists of figures, as well as more complex data, such as a list of dictionaries.

9.3.1 Selection sort

The selection sort is probably the simplest sorting algorithm, and is very easily understood and implemented. To sort our data, we simply look for the smallest item in the list and move it to the first position. Then we look for the next smallest and move it to the second position and so on.

We’ll use the randomly generated data from the linear search example to test it. Again, we’ll write a function that takes a key as an argument as well as the list to be sorted. This way, we can re-use our function for sorting by other keys. Here’s our implementation:

Listing 9.8: Selection Sort

    import random

    def makeRecords(number):
        names1 = ["Alice", "Bob", "Clare", "Dave", "Emma", "Frank"]
        names2 = ["Smith", "Jones", "Davis", "Wardly", "Bright"]
        courses = [111, 101, 110, 171, 159]
        records = []
        for i in range(number):
            item = {}
            name = random.choice(names1) + " " + random.choice(names2)
            item['name'] = name
            item['course'] = random.choice(courses)
            item['age'] = int(random.uniform(18, 50))
            records.append(item)
        return records

    def selSort(key, records):
        # A loop for the position we
        # want to place the smallest into
        for start in range(len(records)):
            smallest_location = start
            # A loop for the records that follow start
            for compare in range(start+1, len(records)):
                if records[compare][key] < \
                   records[smallest_location][key]:
                    smallest_location = compare
            # Swap
            records[start], records[smallest_location] = \
                records[smallest_location], records[start]

    a = makeRecords(30)
    selSort("age", a)

    for r in a:
        print r['age'],
    print

The first loop increments the start variable, and represents the current start of the search for the smallest item. When the smallest is found, it is placed here and we move to the next position.

The smallest_location variable stores the position in the list of the current smallest item. For each search of the list, this is first set to the beginning of the search area and then altered as we encounter smaller items.

The swapping of the two items has been split onto two lines because of its length:

Listing 9.9: Selection Sort - swapping

            # Swap
            records[start], records[smallest_location] = \
                records[smallest_location], records[start]

In some languages, we may need a temporary variable to swap two values, but Python’s multiple assignment means we can do this more elegantly.

Once the data is sorted, we have a simple test that shows the ages from the records in the order they appear in the list. Simply reading the numbers confirms that our sorting was successful.

Listing 9.10: Selection Sort - result

    for r in a:
        print r['age'],

The output of a run may look like this:

    18 19 21 22 22 24 24 26 27 43 44 44 46 48 48 48 49 49

But of course this depends on the randomly generated data and length of the list of records.
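The same algorithm also works on a simple list of figures, where no key is needed. The following sketch assumes a plain list of numbers; the name selection_sort and the sample data are ours.

```python
def selection_sort(values):
    """Sort a list of numbers in place, smallest first."""
    for start in range(len(values)):
        smallest_location = start
        # Search the rest of the list for anything smaller
        for compare in range(start + 1, len(values)):
            if values[compare] < values[smallest_location]:
                smallest_location = compare
        # Swap the smallest remaining item into position
        values[start], values[smallest_location] = \
            values[smallest_location], values[start]

figures = [44, 18, 27, 49, 22, 18]
selection_sort(figures)
```

Because the list is changed in place, the function returns nothing; after the call, figures itself holds the sorted values.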

9.3.2

Bubble sort

The bubble sort is another fairly trivial sorting algorithm. The list is repeatedly scanned, with consecutive pairs being compared. Any pairs out of order are swapped. So, positions

9.3. SORTING

115

0 and 1 are compared first, followed by 1 and 2, etc. This is repeated until the list is in order. The high values “bubble” to the end of the list. Here is a python implementation: Listing 9.11: Bubble Sort 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29

import random def makeRecords ( number ): names1 =[ " Alice " ," Bob " , " Clare " , " Dave " , " Emma " , " Frank " ] names2 =[ " Smith " , " Jones " , " Davis " , " Wardly " , " Bright " ] courses =[111 ,101 ,110 ,171 ,159] records =[] for i in range ( number ): item ={} name = random . choice ( names1 )+ " " + random . choice ( names2 ) item [ ’ name ’ ]= name item [ ’ course ’ ]= random . choice ( courses ) item [ ’ age ’ ]= int ( random . uniform (18 ,50)) records . append ( item ) return records def bubSort ( key , records ): for i in range ( len ( records )): for j in range ( len ( records ) -1): if records [ j ][ key ] > records [ j +1][ key ]: # Swap records [ j ] , records [ j +1]= records [ j +1] , records [ j ] a = makeRecords (30) bubSort ( " age " ,a ) for r in a : print r [ ’ age ’] , print Notice that we repeatedly scan the list, since we can only be sure of one correctly placed value each iteration. If we assume only one item is placed in the correct position for each iteration, then we also know that we need a maximum of N iterations of the scan, for N elements. Also, since each iteration will ensure the next-largest value is placed into order, we could slightly improve our efficiency by terminating the second loop after one fewer iterations per scan.


Alternatively (or additionally), we can make a note during each scan of whether a swap was made. If a scan finishes without re-ordering any values, the list is already in order and we can terminate our sorting. Here’s an alternative bubble sort function with that improvement:

Listing 9.12: Improved Bubble Sort

    def bubSort2(key, records):
        for i in range(len(records)):
            swap = False
            for j in range(len(records) - 1):
                if records[j][key] > records[j+1][key]:
                    # Swap
                    records[j], records[j+1] = records[j+1], records[j]
                    swap = True
            if not swap:
                break

9.3.3 Merge sort

The merge sort and the quick sort are known as “divide and conquer” algorithms. That is, they divide up the problem into smaller problems until the solution is trivial. Both of these algorithms can be much more efficient than the selection and bubble sorts.

The merge sort divides the list into two halves, sorts each half separately and then merges the sorted lists together. Because each half is also sorted by the merge sort, we eventually reach a trivial case of single-element lists. As you would expect, this is usually implemented recursively, since the base and induction cases are obvious.

So, the “divide” part of the algorithm is very simple. The real core of the algorithm, however, is the merging of the two sorted halves. There is more than one way to do this, but a common one is as follows:

1. Look at the first elements in each list, which we will call i and j. Remember that the two lists are sorted.
2. Whichever item is smaller is placed into the next empty position of a new list.
3. If element i was selected, increment i. If element j was selected, increment j.
4. Repeat from step 2 until i or j reaches the end of its list.
5. If either list has elements left, they are copied to the end of the new list, since they must be greater than any in the current result, and are already sorted.

A Python implementation is shown below. Note that there are two functions: one for dividing and one for merging.

Listing 9.13: Merge Sort

    import random

    def makeRecords():
        names1 = ["Alice", "Bob", "Clare", "Dave", "Emma", "Frank"]
        names2 = ["Smith", "Jones", "Davis", "Wardly", "Bright"]
        records = []
        for i in names1:
            for j in names2:
                name = i + " " + j
                records.append(name)
        # Shuffle the records so that there is something to sort
        random.shuffle(records)
        return records

    def merge(records):
        # create a NEW list to put sorted data in
        merged = [0] * len(records)
        length = len(records)
        mid = length // 2
        i = 0
        j = mid
        pos = 0
        # Merge
        while i < mid and j < length:
            if records[i] < records[j]:
                merged[pos] = records[i]
                i = i + 1
            else:
                merged[pos] = records[j]
                j = j + 1
            pos = pos + 1
        # Copy any leftovers
        if i < mid:
            merged[pos:] = records[i:mid]
        if j < length:
            merged[pos:] = records[j:length]
        # Copy data back into records
        for i in range(len(records)):
            records[i] = merged[i]

    def mergeSort(records):
        length = len(records)
        if length > 1:
            mid = length // 2
            # A slice is a copy, so sort each half and write it back
            left = records[:mid]
            right = records[mid:]
            mergeSort(left)
            mergeSort(right)
            records[:mid] = left
            records[mid:] = right
            # merge works in place on records and returns nothing
            merge(records)

    a = makeRecords()
    mergeSort(a)
    print len(a), "records"
    for r in a:
        print r

Also of interest is that the merge function is not given two separate lists. Instead, i and j refer to the start and the middle of one list whose two halves have already been sorted. Because slicing a Python list makes a copy, mergeSort sorts a copy of each half and writes the result back into the original list before merging. It is also possible to execute the merge algorithm in-place, saving space, but the algorithm becomes less clear.
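For comparison, here is a sketch of the same idea written so that nothing is changed in place: the merging function takes two separate sorted lists and returns a new one, exactly as in the numbered steps. The names merge_lists and merge_sorted are made up for this illustration and do not appear in the listings.

```python
def merge_lists(left, right):
    """Merge two already-sorted lists into one new sorted list."""
    merged = []
    i = 0
    j = 0
    # Repeatedly take the smaller of the two front elements
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these slices is non-empty: copy the leftovers
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sorted(items):
    """Return a new sorted list, leaving items unchanged."""
    if len(items) <= 1:
        return list(items)          # base case: nothing to sort
    mid = len(items) // 2
    return merge_lists(merge_sorted(items[:mid]), merge_sorted(items[mid:]))


names = ["Emma Jones", "Bob Smith", "Alice Davis", "Dave Bright"]
result = merge_sorted(names)
```

This version uses more memory, because every call builds new lists, but the base case and the recursive case are easy to see.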

9.3.4 Quicksort

The quick sort, another divide and conquer technique, is also usually implemented recursively. The quick sort algorithm is:

1. Pick an element in the list, which we will call pivot.
2. Create a list, which we will call low, and place all of the elements lower than pivot into it.
3. Create a second list, high, and place all of the elements higher than pivot into it.
4. We now have two lists and a single element between them.
5. Sort the low and high lists.
6. Concatenate: low + pivot + high.
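The steps can be sketched in a few lines of Python. This is a minimal illustration under two assumptions that the steps leave open: the first element is used as the pivot, and elements equal to the pivot are placed in high so that duplicates are not lost. The name quick_sketch is made up for this example.

```python
def quick_sketch(records):
    """Return a new list with the elements of records in ascending order.

    Assumptions of this sketch: the pivot is the first element, and
    elements equal to the pivot go into the high list.
    """
    if len(records) <= 1:
        return list(records)        # nothing to sort
    pivot = records[0]
    rest = records[1:]
    low = [r for r in rest if r < pivot]
    high = [r for r in rest if r >= pivot]
    # Sort each part, then put the pivot between them
    return quick_sketch(low) + [pivot] + quick_sketch(high)


unsorted = [27, 18, 43, 18, 50, 22]
in_order = quick_sketch(unsorted)
```

Joining the parts as low, then pivot, then high gives ascending order.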


Again, since we repeatedly divide the list, we eventually have single-element lists, which require no sorting. Here is a Python implementation of the quick sort:

Listing 9.14: Quick Sort

    import random, sys

    def makeRecords():
        names1 = ["Alice", "Bob", "Clare", "Dave", "Emma", "Frank"]
        names2 = ["Smith", "Jones", "Davis", "Wardly", "Bright"]
        records = []
        for i in names1:
            for j in names2:
                name = i + " " + j
                records.append(name)
        records.sort()
        return records

    def quick(records):
        if len(records) [...]

    >>> def foobar(n):
    ...     x = 10
    ...     return n + 10
    ...
    >>> foobar(15)
    25
    >>> print x
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    NameError: name 'x' is not defined
    >>>


In the example above we created a variable x inside the function foobar and when we tried to print the value of x outside of foobar Python complained that we hadn’t defined a variable called x. There is a way for us to make x available outside the scope of foobar, using the global keyword:

    >>> def foobar(n):
    ...     global x
    ...     x = 10
    ...     return n + x
    ...
    >>> foobar(15)
    25
    >>> print x
    10
    >>> x = 25
    >>> print x
    25
    >>>

It’s important to understand how Python manages scope, so that whenever you refer to a variable (or function) name you know which name Python will be using. Python programmers refer to the “LGB rule” to explain this. Before we get into details, you should know that these rules don’t work in quite the same way for names containing a ., such as math.sqrt. We’ll discuss that in more detail in Chapter 12.

Python has three sorts of scope:

• local (names inside a function);
• global (names outside a function in the interpreter or in a file); and
• built-in names that come with Python (e.g. range, len, etc.).

By default, assignment always creates or changes local names (unless you use the global keyword or there’s a . in the name). Whenever you use a name (without a .) inside a function, Python searches for it first in the local scope, then the global scope and lastly the built-in scope, hence the “LGB rule”. For example:

    >>> x = 10
    >>> def foobar():
    ...     x = 100
    ...     print x  # x from local scope
    ...
    >>> foobar()
    100
    >>> print x  # x from global scope
    10
    >>>
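The example shows the local and global steps of the search. The sketch below adds the built-in step, and also shows that a local name can hide a built-in one. All of the names here (limit, local_wins and so on) are made up for this illustration.

```python
limit = 10                      # a global name


def local_wins():
    limit = 100                 # assignment creates a new local name
    return limit                # L: found in the local scope


def global_found():
    return limit                # no local 'limit', so G: the global one


def builtin_found():
    return len("abc")           # not local, not global, so B: built-in


def shadow_builtin():
    len = lambda seq: -1        # a local name hides the built-in len
    return len("abc")
```

After all four functions have been called, the global limit is still 10, because the assignment in local_wins only ever touched a local name.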

10.1.2 Example: verifying ISBN checksums

ISBN stands for International Standard Book Number and is the ten digit unique identifier you see on the back of books (usually above the bar code). It’s important to have some way of verifying that a given ten digit number is a valid ISBN and this is done by making the last digit of the number a checksum. Checksums can be used in many ways, but for ISBNs if the original ten digit number is given by the letters s0 to s9 like this:

    s0 s1 s2 s3 s4 s5 s6 s7 s8 s9

then this formula needs to be true for the ISBN to be valid:

    10*s0 + 9*s1 + 8*s2 + 7*s3 + 6*s4 + 5*s5 + 4*s6 + 3*s7 + 2*s8 + s9 = 0 mod 11

Note that here we’ve called the checksum s9 because it is the last digit in the sequence. Also, we’re dividing the sum on the left (which includes the checksum) by 11 and looking at the remainder, which must be 0. What if the checksum needed to make the remainder 0 is 10? In this case, the ISBN standard states that the checksum should be written as “X” or “x” to keep the whole number down to ten characters.

For example, the ISBN for Kafka’s Metamorphosis (published by Penguin Modern Classics) is 0141182520. The last digit, 0, is the checksum. So, to validate this ISBN we need to check that:

    10*0 + 9*1 + 8*4 + 7*1 + 6*1 + 5*8 + 4*2 + 3*5 + 2*2 + 0 = 0 mod 11

or:

    0 + 9 + 32 + 7 + 6 + 40 + 8 + 15 + 4 + 0 = 0 mod 11

or:

    121 = 0 mod 11

Which is true, as 121 = 11 * 11. So, 0141182520 is a valid ISBN.

So, how can we write this in Python? This should be an easy job by now: the main part of the work will just be a loop which iterates over the ISBN creating the summation. Here’s a first draft:

    isbn = "0141182520"  # Kafka: Metamorphosis
    mult = 10
    total = 0
    for i in isbn[:len(isbn)-1]:
        total += int(i) * mult
        mult -= 1

    if isbn[len(isbn)-1] == "X" or isbn[len(isbn)-1] == "x":
        checksum = 10
    else:
        checksum = int(isbn[len(isbn)-1])

    # The weighted sum plus the checksum must be divisible by 11
    if (total + checksum) % 11 == 0:
        print isbn, 'is a valid ISBN.'
    else:
        print isbn, 'is an invalid ISBN.'

This looks fine and it does work, so what’s the problem? Well, the code above only works for the ISBN stored in the isbn variable. If we want to test another ISBN we need to change the variable at the top and run the whole thing again (and possibly type it all out again if we’re using the interpreter). The solution is to use a function to encapsulate the ISBN validation. Encapsulation means that the algorithm for performing the validation is wrapped up in the function and cannot be altered (i.e. broken!) by other code. So here’s the second draft of our ISBN validation using functions:

    def validate_isbn(isbn):
        mult = 10
        total = 0
        for i in isbn[:len(isbn)-1]:
            total += int(i) * mult
            mult -= 1

        if isbn[len(isbn)-1] == "X" or isbn[len(isbn)-1] == "x":
            checksum = 10
        else:
            checksum = int(isbn[len(isbn)-1])

        # The weighted sum plus the checksum must be divisible by 11
        if (total + checksum) % 11 == 0:
            print isbn, 'is a valid ISBN.'
        else:
            print isbn, 'is an invalid ISBN.'

    kafka = "0141182520"  # Metamorphosis
    validate_isbn(kafka)


This is pretty neat now, but it could be even better. Right now validate_isbn doesn’t return a value (of course, really it returns the special value None) but it does print out a statement saying whether or not its argument is a valid ISBN. This isn’t very helpful when we come to use the code in a larger program: if we want to know whether an ISBN is valid, we’d have to get a human to check whether a message had been printed! To make our code usable by other programmers, it’s more helpful to have validate_isbn return a Boolean and to do the printing in a separate function. This separates out our two concerns (validation and reporting to the user) nicely. Here’s the finished program:

Listing 10.1: Verifying ISBN checksums

    #!/bin/env python2.4

    def validate_isbn(isbn):
        mult = 10
        total = 0
        for i in isbn[:len(isbn)-1]:
            total += int(i) * mult
            mult -= 1

        if isbn[len(isbn)-1] == "X" or isbn[len(isbn)-1] == "x":
            checksum = 10
        else:
            checksum = int(isbn[len(isbn)-1])

        # The weighted sum plus the checksum must be divisible by 11
        if (total + checksum) % 11 == 0:
            return True
        else:
            return False

    def test_isbn(isbn):
        if validate_isbn(isbn):
            print isbn, 'is a valid ISBN.'
        else:
            print isbn, 'is an invalid ISBN.'

Can we break this program down into smaller functions? Well, yes we probably could, for example by splitting the validate_isbn function up into smaller functions. The question really is whether this would be useful. For example, would the following function ever be used anywhere other than the validate_isbn function?

    def get_checksum(isbn):
        if isbn[len(isbn)-1] == "X" or isbn[len(isbn)-1] == "x":
            return 10
        else:
            return int(isbn[len(isbn)-1])

I would guess it probably wouldn’t be useful elsewhere. If that’s true and that code is only ever needed once, then there’s no point in making it a separate function.
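A function that returns a Boolean is also easy to test automatically. The sketch below is a compact variant written for illustration only. The name isbn10_is_valid and the extra checks on the length and characters of the input are assumptions of this example, not part of the listing above. It uses the standard ISBN-10 rule that the weighted sum, including the check digit, must be divisible by 11.

```python
def isbn10_is_valid(isbn):
    """Return True if isbn is a ten-character ISBN with a correct checksum.

    isbn10_is_valid is a hypothetical name used for this sketch only.
    """
    if len(isbn) != 10:
        return False
    body, check = isbn[:9], isbn[9]
    if not body.isdigit():
        return False
    if check in "Xx":
        checksum = 10               # "X" stands for a checksum of 10
    elif check.isdigit():
        checksum = int(check)
    else:
        return False
    # Weights run 10, 9, ..., 2 over the first nine digits
    total = sum(int(d) * w for d, w in zip(body, range(10, 1, -1)))
    return (total + checksum) % 11 == 0


kafka_ok = isbn10_is_valid("0141182520")
```

Because the function only returns True or False, a caller can decide for itself whether to print a message, raise an error or carry on silently.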

10.2 Functional programming with Python

Functional programming (sometimes called declarative programming) is essentially programming without assignment. Many programming languages exist which pretty much only support functional programming, such as Hope, Miranda, LISP, Scheme, Haskell, SML, Caml and Ocaml. Few of these languages have gained widespread use, but the concepts found in them are fundamental to Computer Science. However, while functional languages are not widely used as general purpose programming languages, languages like SQL (for database processing) and the language used to perform sophisticated calculations in Excel spreadsheets are both examples of domain-specific functional languages. Equally, functional programming has proved very useful in Computer Science research.

Functional programs are referentially transparent, meaning that the same function, given the same parameters, will always return the same result. This means that it is easy to reason about functional programs mathematically and to prove that they are correct. On the other hand, languages with state (like Python) are not referentially transparent. So, in Python, a function might not return the same result if you call it twice with the same parameters. For example:

    >>> x = 10
    >>> def addX(y):
    ...     return x + y
    ...
    >>> addX(100)
    110
    >>> x = 35
    >>> addX(100)
    135
    >>>

In this example, the state of x had an effect on the result of the function addX. Functions can also have side effects on the state around them, meaning that a function can change variables declared outside its scope. In Python, if a function does this, it needs to explicitly state that a variable name it refers to is not local to the function. This is done with the keyword global. Here’s an example:

    >>> x = 10
    >>> def addX(y):
    ...     global x
    ...     x = x + y
    ...     return x * y
    ...
    >>> addX(10)
    200
    >>> addX(10)
    300
    >>> addX(10)
    400
    >>>

You can probably see from this example why imperative programs (those with state and side effects) can be difficult to understand. Here, to figure out how the value x changes as the program executes, we have to track how addX is called and possibly other functions with side effects. For a large program with hundreds of thousands, or even millions of lines of code, keeping track of the program state can be incredibly difficult.

So, for various reasons functional programming is interesting and (because of referential transparency) it is sometimes easier to think about and use than imperative programming. The designers of both Perl and Python have included facilities for functional programming in their languages. Of course, Python has both imperative and functional language features. As a practicing programmer it is important that you have experience of both so that you can determine when it is appropriate to use each. In Python the most widely used functional programming features are the keyword lambda and the built-in functions map and filter.
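To make the contrast concrete, here is a small sketch that puts a referentially transparent function next to one with a side effect. The names add_pure, add_stateful and total are made up for this illustration.

```python
def add_pure(x, y):
    # Depends only on its arguments: same inputs, same result, every time
    return x + y


total = 10                      # state that lives outside the function


def add_stateful(y):
    global total
    total = total + y           # side effect: changes the global state
    return total * y


pure_first = add_pure(10, 10)
pure_second = add_pure(10, 10)

stateful_first = add_stateful(10)
stateful_second = add_stateful(10)
```

The two calls to add_pure give the same answer, so either could be replaced by its value without changing the program. The two calls to add_stateful look identical but give different answers, because the first call changed total.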

10.2.1 Using lambda to create anonymous functions

Python provides a keyword called lambda which can be used to create anonymous functions. lambda expressions look like this:

    lambda x, y, z: x + y + z

In the example above, x, y and z are arguments to the function we’re creating. The part after the : describes what should happen when the function is called. lambda is an expression, not a statement. This means that it returns a value, which is a new function. def, on the other hand, is a statement. It doesn’t return anything, it just assigns the function it defines to the name of that function. If we want to use the function returned by a lambda, we need to assign a name to it ourselves. This isn’t always necessary: for example, if the function created by lambda is the argument to a function, it doesn’t need a name. However, if we do want to give a name to the new function, we just use assignment as we would in any other context:

    >>> add = lambda x, y, z: x + y + z
    >>> add(1, 2, 3)
    6
    >>>


Practically, you will probably use lambda most often when writing a named function just seems like too much trouble. In Chapter 13 we’ll use lambda to implement a function which calculates the negative of a pixel. We’ll pass that anonymous function to another function which will apply it to every pixel in an image, returning the negative of the original image. Like this:

    negative = image.point(lambda pixel: 255 - pixel)

Without lambda we would write that code like this:

    def negate(pix):
        return 255 - pix

    # Pass the function itself, not the result of calling it
    negative = image.point(negate)

which is a lot more work. Lutz and Ascher’s book Learning Python (page 113) has an equally nice example of creating a GUI button that prints out the text “Hello world!” when it’s pressed. With lambda and the Tkinter GUI toolkit this can all be done on one line of code:

    import sys
    from Tkinter import Button
    widget = Button(text="Press me!",
                    command=lambda: sys.stdout.write("Hello World!"))

However, there are much more interesting things that you can do with lambda. For example, a lazy list is a cunning way of creating an infinite list. Of course, an infinite list cannot be stored in a finite amount of memory (like you’ve got in your PC), so we have to be a bit clever to make this work. Thanks to lambda we can use something called lazy evaluation, which means evaluating the result of a function as late as possible. A lazy list consists of a value (which is one element of the list) and a function which generates the next element in the list. To generate the integers, we need to store the last integer that we generated and a function to produce the next one. We can wrap all that up in a function:

    >>> def lazy_list(x):
    ...     return (x, lambda: lazy_list(x+1))
    ...
    >>>

Now, when we call the lazy_list function we’ll get back a tuple containing the argument and a function to generate the next integer. If we do this in the interpreter Python will display the (anonymous) function by telling us what type it is (type function) and the address in memory where Python has stored the function. This looks a bit odd if you’ve never seen it before:

    >>> def lazy_list(x):
    ...     return (x, lambda: lazy_list(x+1))
    ...
    >>> lazy_list(0)
    (0, <function <lambda> at 0xb7f33f44>)
    >>>

Next, we want to call the function. In the interpreter, the last value printed can be referred to as _. This is useful for us, as we haven’t given the result of lazy_list a name. So, the function that is the second part of the tuple above can be referred to as _[1] using the usual indexing for sequences. To call that function we can just say _[1](). Here’s a full interpreter session:

Listing 10.2: A lazy list for generating the integers

    >>> def lazy_list(x):
    ...     return (x, lambda: lazy_list(x+1))
    ...
    >>> lazy_list(0)
    (0, <function <lambda> at 0xb7f33f44>)
    >>> _[1]()
    (1, <function <lambda> at 0xb7f33e64>)
    >>> _[1]()
    (2, <function <lambda> at 0xb7f33f44>)
    >>> _[1]()
    (3, <function <lambda> at 0xb7f33e64>)
    >>> _[1]()
    (4, <function <lambda> at 0xb7f33f44>)
    >>> _[1]()
    (5, <function <lambda> at 0xb7f33e64>)
    >>> _[1]()
    (6, <function <lambda> at 0xb7f33f44>)
    >>> _[1]()
    (7, <function <lambda> at 0xb7f33e64>)
    >>> _[1]()
    (8, <function <lambda> at 0xb7f33f44>)
    >>> _[1]()
    (9, <function <lambda> at 0xb7f33e64>)
    >>> _[1]()
    (10, <function <lambda> at 0xb7f33f44>)
    >>>
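Typing _[1]() over and over is fine in the interpreter, but in a program we would want a function to do the stepping for us. The helper below is a sketch of one way to do that. The name take is made up for this illustration and is not part of the listing.

```python
def lazy_list(x):
    return (x, lambda: lazy_list(x + 1))


def take(n, lazy):
    """Collect the first n values of a lazy list into an ordinary list.

    take is a hypothetical helper written for this sketch.
    """
    values = []
    for count in range(n):
        value, rest = lazy          # unpack the (value, function) tuple
        values.append(value)
        lazy = rest()               # the next cell is only built here
    return values


first_five = take(5, lazy_list(0))
```

Only as many cells as we step through are ever built, which is why a finite program can work with an “infinite” list.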

10.2.2 map and filter for automatic list processing

map and filter are built-in Python functions which you can use to manage lists (rather than using iteration or explicit recursion). map applies a function to every value in a list. For a function f, if the list argument to map is [l0, l1, l2, ..., ln] then map will return [f(l0), f(l1), f(l2), ..., f(ln)]. As you might guess, it’s quickest to use lambda to create your functions as arguments to map. Here’s a quick example:

    >>> integers = range(1, 21)
    >>> map(lambda x: x*2, integers)
    [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40]
    >>> map(lambda x: (x+1)/3, integers)
    [0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7]
    >>> vowels = ["a", "e", "i", "o", "u"]
    >>> map(lambda x: x*2, vowels)
    ['aa', 'ee', 'ii', 'oo', 'uu']
    >>> map(lambda x: x + "foobar", vowels)
    ['afoobar', 'efoobar', 'ifoobar', 'ofoobar', 'ufoobar']
    >>>

filter works similarly. It takes a function and a list as arguments. The function should return a Boolean (Mathematicians would call this a predicate) and filter will return a list of values for which the predicate returned True. Here’s a quick example:

    >>> integers = range(1, 21)
    >>> filter(lambda x: x%3 == 0, integers)
    [3, 6, 9, 12, 15, 18]
    >>> passwords = ["foo", "bar", "verylongpassword"]
    >>> filter(lambda s: len(s) > 6, passwords)
    ['verylongpassword']
    >>>
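It can help to see map and filter side by side with the list comprehensions that do the same job. The sketch below is for illustration only. The list(...) wrappers are there so that it also runs under Python 3, where map and filter return iterators rather than lists; under Python 2 they are harmless.

```python
integers = list(range(1, 21))

# map: apply a function to every value
doubled_map = list(map(lambda x: x * 2, integers))
doubled_comp = [x * 2 for x in integers]

# filter: keep only the values for which the predicate is True
threes_filter = list(filter(lambda x: x % 3 == 0, integers))
threes_comp = [x for x in integers if x % 3 == 0]

# The two can be combined: double only the multiples of three
combined = list(map(lambda x: x * 2,
                    filter(lambda x: x % 3 == 0, integers)))
```

Which form to use is largely a matter of taste; map and filter read well when the function already has a name.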

10.2.3 An example using lambda, map and filter

As a slightly larger example, let’s consider yet another way to write a primes filter, this time using functional programming. Like other algorithms we’ll design this top-down. So, when we’re finished we’ll want to have a function called primes which takes an integer and returns a list of all the primes less than that integer. How can we do this functionally? Well, perhaps the simplest algorithm would take a predicate called something like is_prime, which returns True if its argument is prime, and use that to filter a list of integers. Of course, we can generate the integers with the built-in function range. Here’s the code for that:

    def primes(n):
        return filter(is_prime, range(2, n))

So, now we just need to write the is_prime function. Notice that we’ve given this a name because it looks like it might be complicated enough to put in a named function. How should is_prime work? Well, one thing we can do with filter is to generate a list of factors for a given integer. So, for the number 100, we can generate all the numbers from 2 to 99 and determine which of those numbers are factors of 100, like this:

    >>> filter(lambda x: 100 % x == 0, range(2, 100))
    [2, 4, 5, 10, 20, 25, 50]
    >>> filter(lambda x: 97 % x == 0, range(2, 97))
    []
    >>>

If, as in the case with 97 above, there are no factors, then the list returned by filter will be empty. This means that the number is prime. How can we use this? Well, the built-in function bool converts its argument to a Boolean. In the case of lists, bool will return True if the list is non-empty. This is equivalent to the expression not len(mylist) == 0. So our final is_prime function looks like this:

    def is_prime(num):
        return not bool(filter(lambda x: num % x == 0, range(2, num)))

And our functional primes filter is finished:

Listing 10.3: A primes filter with functional programming

    >>> def is_prime(num):
    ...     """Returns True if num is prime and False otherwise.
    ...     Creates a list of factors of num and returns True
    ...     if that list is empty.
    ...     """
    ...     return not bool(filter(lambda x: num % x == 0,
    ...                            range(2, num)))
    ...
    >>> def primes(n):
    ...     """Returns a list of all primes less than n."""
    ...     return filter(is_prime, range(2, n))
    ...
    >>> primes(100)
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47,
    53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
    >>>
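The filter above tries every number below num as a possible factor. A common refinement, sketched here as an illustration rather than as part of the listing, is to stop at the square root of num, since any factor larger than the square root must be paired with one smaller than it. The names is_prime_sqrt and primes_sqrt are made up for this example, and the list(...) wrappers are there so that it also runs under Python 3.

```python
def is_prime_sqrt(num):
    """Return True if num is prime, testing factors only up to sqrt(num)."""
    if num < 2:
        return False
    # int(num ** 0.5) is fine for the small numbers used here
    candidates = range(2, int(num ** 0.5) + 1)
    return not list(filter(lambda x: num % x == 0, candidates))


def primes_sqrt(n):
    """Return a list of all primes less than n."""
    return list(filter(is_prime_sqrt, range(2, n)))


small_primes = primes_sqrt(30)
```

The structure is unchanged: a predicate and a filter. Only the range of candidate factors is smaller.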

10.3 Preconditions and postconditions for functions

Much of the work of a professional programmer involves either using code that other people have written or writing code for others to use. You’ve already had some experience of using other people’s code when using modules like the turtle module. In both these cases, it is extremely important that code which is intended for reuse is clearly documented so that it can be understood by others. In fact, even if you are only writing programs for yourself, you will almost certainly find that it’s very difficult to understand your own code after a few weeks if it isn’t clearly documented.

Preconditions and postconditions provide a way of reasoning about and documenting functions so that it is clear to programmers what sort of arguments should be passed to the function and what sort of result (and change of global state) can be expected in return. This is particularly useful when there are strict constraints on function arguments. For example, a function that takes the age of a customer and returns the price of a cinema ticket will want to ensure that “age” is non-negative and not too large. The following code shows a precondition for this function which tells developers what sort of value might make an acceptable age argument. Notice that we’ve written the precondition as a Boolean expression just like any other Boolean expression in Python.

    def get_price(age):
        """Returns the price of a ticket given the
        age of the customer.

        pre::
            type(age)==type(0) and 0 < age and age [...]

        [...] 1
        post::
            forall(__return__,
                   lambda i: i < n and forall(range(2, i),
                                              lambda j: not i % j == 0))
        """
        return filter(is_prime, range(2, n))

We write __return__ to represent the value returned by a function. Also, we’re using a function here which isn’t defined in Python called forall. This is a predicate which takes a list and a function and means “return True if the function returns True when applied to every element in the list”. Mathematicians would know this predicate as the ∀ symbol. If we wanted to implement the forall function, we could do so very simply using filter:

Listing 10.4: A “forall” function

    forall = lambda lst, f: len(lst) == len(filter(f, lst))
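Preconditions and postconditions written in a docstring are only documentation, but nothing stops us from checking them while the program runs. The sketch below checks the postcondition of the primes filter with an assert statement. This is one possible approach, shown for illustration: the name checked_primes is made up, and forall is written with def and a list(...) wrapper so that it also runs under Python 3.

```python
def forall(lst, f):
    """Return True if f returns True for every element of lst."""
    return len(lst) == len(list(filter(f, lst)))


def is_prime(num):
    return not list(filter(lambda x: num % x == 0, range(2, num)))


def checked_primes(n):
    """Return all primes less than n, checking the postcondition.

    checked_primes is a hypothetical name used for this sketch only.
    """
    result = list(filter(is_prime, range(2, n)))
    # Postcondition: every result is below n and has no factor
    # between 2 and itself
    assert forall(result,
                  lambda i: i < n and forall(range(2, i),
                                             lambda j: not i % j == 0))
    return result


below_twenty = checked_primes(20)
```

If the function ever returned a wrong answer, the assert would stop the program with an AssertionError instead of letting the mistake spread.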

10.4 Modules

Although you have used modules (such as the turtle module) before, you probably haven’t realised that you have also written modules. In Python a module is a piece of Python code in a file. So, the turtle module is stored somewhere on your computer in a file called turtle.py. You have already seen that to use code in a module you need to import it using Python’s import statement. Then, every function (or piece of data) can be referred to as <module>.<name>, such as turtle.forward(100). You can also choose to import only some of the features of a module, using from, whose syntax is from <module> import <names>, such as from turtle import forward, right, left. If you do this, the names you have imported can be referred to directly, without the <module>. prefix. Lastly, if you need to re-import a module, don’t use import more than once, use the function reload, as in reload(turtle).
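The two forms of import can be seen side by side in a short sketch. It uses the standard math module rather than turtle, simply so that it can be run without opening a drawing window; that choice is ours and is not taken from the text above.

```python
import math                     # names must be written as math.<name>
from math import sqrt, floor    # these names can be used directly

qualified = math.sqrt(16)       # module name, dot, function name
direct = sqrt(16)               # no prefix needed after 'from ... import'
whole = floor(2.7)
```

Both forms call exactly the same function; the difference is only in how the name is written at the point of use.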

10.4.1 Special data for modules

Python also provides some built-in names that you can assign to in your modules to help with documentation. These are:

__author__
    the main author of the module.

__credits__
    anyone who has contributed work to the module.

__date__
    when the module was last modified.

Also, the name of the module is stored in __name__. If the module is being run from the command line (rather than being imported into another module) then __name__ will be assigned the value __main__. This is particularly useful for testing, as we’ll see in the next example.

10.5 Sets: an example module with preconditions and postconditions

This is the full listing for the sets module:

Listing 10.5: A module for managing sets represented by lists

    #!/bin/env python2.4

    """
    Set module for python.
    Sets are modelled as lists. Operations are implemented
    functionally -- i.e. we don't use variable assignment.
    """

    __author__ = 'Sarah Mount'
    __credits__ = 'Sarah Mount'
    __date__ = 'November 2005'

    empty = []  # The empty set

    def isempty(set):
        """Returns True if set is empty and false otherwise.
        pre::
            True
        post::
            __return__ == (len(set) == 0)
        """
        return not bool(set)

    def ismem(e, set):
        """Returns True if e appears in set and False otherwise.
        pre::
            not set or type(set) == type([])
        post::
            __return__ == e in set
        """
        return bool(filter(lambda x: x == e, set))

    def addmem(e, set):
        """Adds element e to set and returns the new set.
        pre::
            not set or type(set) == type([])
        post::
            ismem(e, __return__)
        """
        if not ismem(e, set):
            return [e] + set
        else:
            return set

    def intersect(set1, set2):
        """Returns the intersection of set1 and set2.
        That is, every element that is in both set1 and set2.
        pre::
            type(set1) == type(set2) == type([])
        post::
            forall(__return__,
                   lambda e: ismem(e, set1) and ismem(e, set2))
        """
        return filter(lambda x: ismem(x, set2), set1)

    def difference(set1, set2):
        """Returns the difference of set1 and set2.
        That is, every element that is in set1 and not in set2.
        pre::
            type(set1) == type(set2) == type([])
        post::
            forall(__return__,
                   lambda e: ismem(e, set1) and not ismem(e, set2))
        """
        return filter(lambda x: not ismem(x, set2), set1)

    def symmetric_difference(set1, set2):
        """Returns the symmetric difference of set1 and set2.
        That is, every element that is in set1 or set2,
        but not in both.
        pre::
            type(set1) == type(set2) == type([])
        post::
            forall(__return__,
                   lambda e: ismem(e, set1) xor ismem(e, set2))
        """
        return difference(set1, set2) + difference(set2, set1)

    def union(set1, set2):
        """Returns the union of set1 and set2.
        That is, every element that is in either set1 or set2.
        pre::
            type(set1) == type(set2) == type([])
        post::
            forall(__return__,
                   lambda e: ismem(e, set1) or ismem(e, set2))
        """
        if not set1:
            return set2
        elif not set2:
            return set1
        elif ismem(set1[0], set2):
            return union(set1[1:], set2)
        else:
            return union(set1[1:], [set1[0]] + set2)


    if __name__ == '__main__':
        # Test data.
        set1 = addmem(1, addmem(2, addmem(3, addmem(4, addmem(5, empty)))))
        set2 = addmem(4, addmem(5, addmem(6, addmem(7, addmem(8, empty)))))
        # Testing.
        print 'empty:', empty
        print 'set1:', set1
        print 'set2:', set2
        print 'isempty(empty):', isempty(empty)
        print 'isempty(set1):', isempty(set1)
        print 'ismem(1, set1):', ismem(1, set1)
        print 'ismem(9, set1):', ismem(9, set1)
        print 'addmem(3, set1):', addmem(3, set1)
        print 'addmem(0, set1):', addmem(0, set1)
        print 'intersect(set1, set2):', intersect(set1, set2)
        print 'difference(set1, set2):', difference(set1, set2)
        print 'symmetric_difference(set1, set2):',
        print symmetric_difference(set1, set2)
        print 'union(set1, set2):', union(set1, set2)
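One way to gain confidence in a module like this is to compare its results with something we already trust. The sketch below repeats three of the functions and checks them against Python’s built-in set type. It is an illustration only: the parameter is called s rather than set so that the built-in set stays available, and the list(...) wrappers are there so that it also runs under Python 3.

```python
def ismem(e, s):
    return bool(list(filter(lambda x: x == e, s)))


def intersect(set1, set2):
    return list(filter(lambda x: ismem(x, set2), set1))


def difference(set1, set2):
    return list(filter(lambda x: not ismem(x, set2), set1))


set1 = [1, 2, 3, 4, 5]
set2 = [4, 5, 6, 7, 8]

# Convert to built-in sets so that the order of elements does not matter
same_intersection = set(intersect(set1, set2)) == (set(set1) & set(set2))
same_difference = set(difference(set1, set2)) == (set(set1) - set(set2))
same_symmetric = (set(difference(set1, set2) + difference(set2, set1))
                  == (set(set1) ^ set(set2)))
```

All three comparisons come out True for this test data. A check like this does not prove the module correct, but it catches many simple mistakes.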

10.5.1 Documentation for the module

This is the documentation for the sets module, as displayed in the interpreter. Notice how the built-in data has been displayed:

Listing 10.6: Documentation for the sets module

    Help on module sets:

    NAME
        sets

    FILE
        /home/snim2/Desktop/pybook/listings/functions/sets.py

    DESCRIPTION
        Set module for python.
        Sets are modeled as lists. Operations are implemented
        functionally -- i.e. we don't use variable assignment.

    FUNCTIONS
        addmem(e, set)
            Adds element e to set and returns the new set.
            pre::
                not set or type(set) == type([])
            post::
                ismem(e, __return__)

        difference(set1, set2)
            Returns the difference of set1 and set2.
            That is, every element that is in set1 and not in set2.
            pre::
                type(set1) == type(set2) == type([])
            post::
                forall(__return__,
                       lambda e: ismem(e, set1) and not ismem(e, set2))

        intersect(set1, set2)
            Returns the intersection of set1 and set2.
            That is, every element that is in both set1 and set2.
            pre::
                type(set1) == type(set2) == type([])
            post::
                forall(__return__,
                       lambda e: ismem(e, set1) and ismem(e, set2))

        isempty(set)
            Returns True if set is empty and false otherwise.
            pre::
                True
            post::
                __return__ == (len(set) == 0)

        ismem(e, set)
            Returns True if e appears in set and False otherwise.
            pre::
                not set or type(set) == type([])
            post::
                __return__ == e in set

        symmetric_difference(set1, set2)
            Returns the symmetric difference of set1 and set2.
            That is, every element that is in set1 or set2,
            but not in both.
            pre::
                type(set1) == type(set2) == type([])
            post::
                forall(__return__,
                       lambda e: ismem(e, set1) xor ismem(e, set2))

        union(set1, set2)
            Returns the union of set1 and set2.
            That is, every element that is in either set1 or set2.
            pre::
                type(set1) == type(set2) == type([])
            post::
                forall(__return__,
                       lambda e: ismem(e, set1) or ismem(e, set2))

    DATA
        __author__ = 'Sarah Mount'
        __credits__ = 'Sarah Mount'
        __date__ = 'November 2005'
        empty = []

    DATE
        November 2005

    AUTHOR
        Sarah Mount

    CREDITS
        Sarah Mount


CHAPTER 10. FUNCTIONS AND MODULES

10.6

Further reading

• Mark Pilgrim on why map and filter are important: http://diveintopython.org/functional_programming/data_centric.html
• Wikipedia on sets: http://en.wikipedia.org/wiki/Set
• Wikipedia on functional programming: http://en.wikipedia.org/wiki/Functional_programming
• John Hughes’ famous paper on Why Functional Programming Matters: http://www.math.chalmers.se/~rjmh/Papers/whyfp.html
• Wikipedia on ISBNs: http://en.wikipedia.org/wiki/ISBN
• Modules from the Python tutorial: http://docs.python.org/tut/node8.html

10.7

Glossary

__author__: a special variable in a module to hold details of the main module author.

__credits__: a special variable in a module to hold details of anyone who has contributed to developing a module.

__date__: a special variable in a module to hold the date that it was last modified.

__name__: a special variable in a module which is automatically assigned either the module’s own name, if the module has been imported, or __main__, if the module has been run from the command line.

bool: a built-in Python function which converts its argument to a Boolean. On lists, bool returns True if the list is non-empty. This is the same as the expression not len(mylist)==0.

Declarative programming: another name for functional programming.

Encapsulation: containing a series of statements within a namespace so that the statements can be executed anywhere and cannot be altered by client code.

filter: applies a predicate to every value in a list and returns a list of all those values for which the predicate returned True.

from: import particular names from a module:

    from turtle import forward, left
    ...
    forward(100)
    left(90)

Function: the simplest way of encapsulating code.

Functional programming: programming without state.

global: used inside functions to tell Python that a particular name is declared at module level, not in local scope.

import: imports all the names in a given module:

    import turtle
    ...
    turtle.forward(100)
    turtle.left(90)

lambda: used to create anonymous functions, e.g. lambda x: x+1

Lazy evaluation: evaluating the result of an expression only when it is needed.

Lazy list: an “infinite” list, usually implemented as a value from the list and a function which returns the next value.

LGB rule: when looking for a name (without a .), Python searches first the local scope, then the global scope, then the builtin functions.

map: applies a function to every value in a list and returns the list of results.

Module: code encapsulated within a file.

Namespace: see scope.

Postcondition: a predicate which a function is committed to meeting when it returns.

Precondition: a predicate which should be True when a function is called. This should be met by the code calling the function.

Predicate: a function which returns a Boolean.

reload: re-import a module which has already been imported:

    import turtle
    ...
    turtle.forward(100)
    turtle.left(90)
    ...
    reload(turtle)
    ...
    turtle.forward(100)
    turtle.left(90)

Scope: the area of a program where a particular variable can be referred to and modified.

Separation of concerns: breaking up your code into functions or modules (or classes, which we’ll see in Chapter 12) which do not overlap.

Set: a collection of items without repetition. For example: {1, 2, 3, 4}, {"Alice", "Bob", "Charlie"}, {True, False}.

Side effect: a function has side effects if it alters the state outside its scope.

10.8

Exercises

1. Add a function to the sets module called issubset with a function signature issubset(set1, set2). Your function should return True if every element in set1 appears in set2. Make sure you document your function fully and remember to include preconditions and postconditions. Also, include some testing for your new function.

10.8.1

Key Assignment

Add a function to the sets module called isequal with a function signature isequal(set1,set2). Your function should return True if every element in set1 appears in set2 and every element in set2 appears in set1. Make sure you document your function fully and remember to include preconditions and postconditions. Also, include some testing for your new function.

10.8.2

Challenge

Write a lazy list which generates prime numbers.

Chapter 11

Input and output

Learning outcomes

After completing this chapter, you should be able to:

• Read and write text files in Python
• Load and save compound data structures
• Use regular expressions to search strings

11.1

Simple keyboard input

Already we have used simple text input in our programs. You should be familiar with the raw_input() function, since it’s been used extensively from the beginning. So far, we have been writing things like:

    print 'Enter some text: ',
    t = raw_input()

Which is fine. We can do this a little more elegantly, though:

    t = raw_input('Enter some text: ')

The raw_input() function optionally takes an argument that is the string to print in order to “prompt” the user into typing. This wasn’t introduced earlier because we hadn’t yet covered functions and argument passing.

You may also have noticed another input function, input(), used in the birthday example from Section 1.6. This function was used in very much the same way, and can also take a string argument for a prompt. So, what is the difference between the two? The following code will highlight the difference:

    print type(input('Enter a number: '))
    print type(raw_input('Enter a number: '))

Execute the program and enter two numbers. Notice that the type of the return value is different for each function. raw_input() returns a string while input() returns an integer. raw_input() always returns a string, but input() doesn’t. Try entering [1,2] instead of a number into the program. Notice that raw_input() still returns a string (’[1,2]’), but input() now returns a list.

The difference is that the input() function evaluates the user input as if it were Python code. So, when we enter a number, it is evaluated as a literal and the number is returned with the correct type. To enter strings, we put them in quotes, just like string literals in Python code. And, as we just saw, lists are evaluated as lists.

It may appear at first that we should use raw_input() to get strings from the user and input() for other types, but this is not recommended. Because the text entered is evaluated, it is possible that the behaviour of your program can be accidentally modified by user input. Worse still, a malicious user could cause your program to wreak havoc on their behalf: delete all your files, unleash a flood of pings on another computer, or gain increased privileges on the system.

How, then, would we read types other than strings from the user? We simply cast the return value from raw_input(), like this:

    a = int(raw_input('Enter a number: '))
    aq = a**3
    print a, 'cubed is', aq
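Casting with int() fails with a ValueError when the text is not a whole number, so a program will usually want to catch that. Below is a minimal sketch of the idea; the helper name parse_int is hypothetical, and the code is written to run unchanged under Python 3 as well as Python 2. Because the text is only ever passed to int(), nothing the user types is evaluated as code.

```python
def parse_int(text):
    """Return text converted to an int, or None if it is not a whole number."""
    try:
        return int(text.strip())
    except ValueError:
        # int() refuses anything that is not a plain integer literal.
        return None


good = parse_int("42")
bad = parse_int("[1,2]")
```

A caller could keep prompting with raw_input() until parse_int returns something other than None.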

11.2

Command-line Arguments

A very useful method of getting information into a program is through command-line arguments. In Unix systems, the ls command can be made to show more detailed information with the -l switch. The DOS command edit takes an argument that is the name of the file to edit, such as edit example.txt.

The sys module in Python contains a list of command-line arguments. The first element in the list is always the name of the program that was executed. This is handy if you have different names for the same program, and expect slightly different behaviour depending on which one is used. The example program below simply prints out the list of command-line arguments.

Listing 11.1: Command-line arguments

    import sys

    # Print the list
    print sys.argv

    # Print enumerated list
    count = 0
    for i in sys.argv:
        print count, "-", i
        count += 1

Here’s an example run of the program:

    >cmdline.py hello this is a test "lahlah lah"
    ['listings/input_output/cmdline.py', 'hello', 'this', 'is', 'a', 'test', 'lahlah lah']
    0 - cmdline.py
    1 - hello
    2 - this
    3 - is
    4 - a
    5 - test
    6 - lahlah lah

Notice that the quoted string is seen as a single argument, even though there is a space in it.
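The counting loop can also be written with the builtin enumerate(), which pairs each item with its position. This is only a sketch of an alternative, not the listing’s own approach; the function name describe_args is hypothetical, and the single-argument print() form is used so the code runs under Python 3 too.

```python
import sys


def describe_args(argv):
    """Return one 'index - value' string per command-line argument."""
    return ["%d - %s" % (index, arg) for index, arg in enumerate(argv)]


if __name__ == "__main__":
    for line in describe_args(sys.argv):
        print(line)
```

Keeping the formatting in a function that takes the list as a parameter makes it easy to try out with a hand-written list instead of real command-line arguments.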

11.3

Files

Reading and writing files is a common requirement of programs. Here, we look at simple text file reading and writing, and then writing and reading files containing more complex data. There are more ways to deal with text files in Python than are presented here, but these are the most common and useful. Readers are encouraged to view the Python documentation for other operations.

11.3.1

Text files

Writing

Python’s support for reading and writing text files is very simple. The following example writes some text to a file:

Listing 11.2: Writing text files

    1 datafile = open("myfile", "w")
    2 datafile.write("This is some text.\n")
    3 datafile.write("It's in a file.\n")
    4 datafile.write("Dumdedumdum.\n")
    5 datafile.close()

Line 1 opens the file, called “myfile”, and assigns the result to datafile. This assignment stores a “file handle” in the variable, through which we can access the file. The second parameter to open is the string “w”, which tells Python that we wish to write to the file.

Using the file handle, on lines 2-4, we write text into the file. Notice the addition of a newline character at the end of each line (\n). Without this, there would be no line break between the pieces of text. We do not have to write literals, and could write the contents of variables containing text, such as datafile.write(surname).

Finally, on line 5, we close the file. This is an important step, as it tells Python, and the operating system, that we are finished. Most operating systems will “buffer” writes to disk, and until we explicitly close the file, the contents might not have been written. The contents of “myfile” are now:

    This is some text.
    It's in a file.
    Dumdedumdum.

Appending to files

No matter how many times we run the above example, the contents of “myfile” will be the same. This is because when we open a file for writing, it is initialised as an empty file. If we wished to add to the file, we would open it for appending. This is achieved by entering “a” as the second argument to the open function. If the file did not exist before we asked to append to it, it will be created as an empty file. The program below appends to “myfile” instead of overwriting it.

Listing 11.3: Appending to text files

    datafile = open("myfile", "a")
    datafile.write("This is some text.\n")
    datafile.write("It's in a file.\n")
    datafile.write("Dumdedumdum.\n")
    datafile.close()

After executing this new program twice (and assuming that we ran the first one beforehand), the contents of “myfile” should be:

    This is some text.
    It's in a file.
    Dumdedumdum.
    This is some text.
    It's in a file.
    Dumdedumdum.
    This is some text.
    It's in a file.
    Dumdedumdum.

Reading

Reading from files is just as simple as writing them. The open function is used again, with “r” as the second parameter. To read from a file handle, we can use the readline() function, which returns the current line in the file. Repeated calls return successive lines. We could also use the readlines() function to read every line of the file, returning a list of lines to process. This method is used below, in the regexp example. We can also use a file handle in a for loop in place of a list. The following example reads “myfile” and prints each line:

Listing 11.4: Reading text files

    1 datafile = open("myfile", "r")
    2 for l in datafile:
    3     print l[:-1]
    4 datafile.close()

Notice on line 3 that we are printing everything except the last character of the line. Remember that the print command will append a newline character to whatever it prints. Our file contains a newline character at the end of every line already, so if we did not remove it, we would see a blank line after every line. Slice notation was discussed in Section 8.2.2.
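The three modes described above (“w”, “a” and “r”) can be seen together in one short round trip. This is a sketch under a few assumptions that are not in the listings: it uses the with statement, available in more recent Python versions than the one used in these notes, which closes the file automatically; it writes to a throwaway temporary directory rather than to “myfile” in the current directory; and it uses the function form of print so that it also runs under Python 3.

```python
import os
import tempfile

# Hypothetical location: a file inside a fresh temporary directory,
# so that nothing real is overwritten.
path = os.path.join(tempfile.mkdtemp(), "myfile")

# "w" starts from an empty file each time.
with open(path, "w") as datafile:
    datafile.write("This is some text.\n")

# "a" keeps the existing contents and adds to the end.
with open(path, "a") as datafile:
    datafile.write("It's in a file.\n")

# Looping over the file handle yields one line at a time, newline included.
with open(path, "r") as datafile:
    lines = [line.rstrip("\n") for line in datafile]

print(lines)
```

Using rstrip("\n") rather than the slice l[:-1] also copes with a final line that has no trailing newline.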

11.3.2

Pickling and shelving

Writing text files is very useful, and is something you’ll probably find yourself doing again and again. If you want to write a list of integers into a file, you could write each element, in order, on a separate line. Then for loading, you read each line, cast it as an int and append it to a list. We turn our data structure into something simple and linear. However, we can create some pretty complex data structures in Python, and translating all of this into plain text for saving and then back to data structures would be time consuming, error-prone and, let’s face it, boring. Luckily, Python provides us with a way to convert types like list and dictionary into things we can write to files. The process of turning things into simple strings for storing or transmitting is called serialisation. Converting a string back into its in-memory equivalent is called de-serialisation. Python has a module for dealing with this, called pickle. In Python lingo, we pickle and unpickle types. Pickling gives us a serialised version of a type so that we can move it out of memory on to disk, perhaps, or across the network.


There is another module in python, called shelve, that provides functions for writing types to files and reading them from files. The shelve functions do all of the pickling and unpickling for us, so rather than look at pickling in great detail, we will move straight on to shelving.
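Although shelve hides the details, it can help to see one pickling round trip on its own. The sketch below uses the standard pickle functions dumps and loads; the record dictionary is made-up sample data, and the code is written so that it also runs under Python 3.

```python
import pickle

# Hypothetical data: a nested structure that would be tedious to save line by line.
record = {"names": ["Alice", "Bob"], "sales": [120, 85]}

serialised = pickle.dumps(record)    # pickle: object -> serialised string
restored = pickle.loads(serialised)  # unpickle: serialised string -> new object
```

The restored value is equal to the original but is a separate object, which is exactly what we want when the data is later loaded back by a different run of the program.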

11.3.3

Using shelve

After importing shelve, we must open a file. The following function invocation opens the shelf file called “datafile”. This is the simplest way to open a shelf file, and the only one we will discuss here. Interested readers are recommended to read the Python shelving documentation for information on options for opening shelf files.

    file = shelve.open('datafile')

If “datafile” did not exist before the function call, it will be created. The variable file is now a “shelf” for storing pickled things - I’m sure the analogy is now obvious, if it wasn’t from the outset.

A shelf behaves like a dictionary in that it has keys that uniquely identify data and has a similar set of functions. So, if we wished to store an item on our shelf, we assign a key to it and add it to the dictionary. For example, say we have two lists; one containing names and one containing sales figures. We might pick the keys ’names’ and ’sales’ to identify them, and they would be written to our shelf (in this case, called file) like so:

    file['names'] = namelist
    file['sales'] = salesfigs

Retrieving an item from the shelf also works like retrieving an item from a dictionary. Let’s retrieve the dictionary identified by the key ’departments’:

    deps = file['departments']

Of course, this assumes that there is a key called ’departments’. Just as with dictionaries, if the key does not exist, there will be a run-time error. A safer way to do this would be:

    deps = {}
    if file.has_key('departments'):
        deps = file['departments']

This ensures that after this section of code, there will be a dictionary assigned to the variable deps, whether it is a newly created dictionary, or one loaded from the shelf.

If we make changes to an item loaded from the shelf, we must reshelve it. For example, if we have modified deps, then we will need to reshelve it so that the changes are stored:

    file['departments'] = deps

And, finally, closing a shelf is simple:

    file.close()
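The steps above (open, store, look up safely, reshelve, close) fit together as in the following sketch. It is written in Python 3 syntax, where the in operator replaces has_key(); the shelf lives in a throwaway temporary directory, and the names and departments stored are made-up sample data.

```python
import os
import shelve
import tempfile

# Hypothetical shelf location inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "datafile")

# First session: store a list under the key 'names'.
shelf = shelve.open(path)
shelf["names"] = ["Alice", "Bob"]
shelf.close()

# Second session: load what exists, fall back to an empty dictionary otherwise.
shelf = shelve.open(path)
deps = shelf["departments"] if "departments" in shelf else {}
deps["sales"] = ["Alice"]
shelf["departments"] = deps  # reshelve, otherwise the change is not stored
names = shelf["names"]
shelf.close()

# Third session: confirm that the reshelved dictionary persisted.
shelf = shelve.open(path)
stored_deps = shelf["departments"]
shelf.close()
```

Closing and reopening the shelf between steps stands in for separate runs of a program.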


A more useful film database - persistence vs. volatility

Back in Section 8.5.2, we created a database program for storing simple film information. The problem with this program was that the data was volatile - it was lost after the program finished. To be a useful database, we should have the same data when we run the program as when we last ended it. This is called persistence. Below is a modified version of the database program that uses shelving (and, therefore, implicitly also uses pickling) to create a persistent database of films.

Listing 11.5: Persistent film database

    import shelve

    # File for database storage
    filename = "films.dat"

    file = shelve.open(filename)

    records = []

    # If we have some data already,
    # the 'films' key will exist
    if file.has_key('films'):
        records = file['films']

    # User input
    user = ""

    while user != "0":
        # Display menu
        print
        print "Film Database"
        print "-----------------"
        print "1 -- Add entry"
        print "2 -- List entries"
        print "0 -- Exit"
        print "-----------------"
        print len(records), "entries."
        print
        print "Enter option:",

        # Get user input
        user = raw_input()

        if user == "1":
            # Add to database
            # Create empty dictionary
            item = {}

            print "Enter title:",
            item["Title"] = raw_input()

            print "Enter director:",
            item["Director"] = raw_input()

            print "Enter year:",
            item["Year"] = raw_input()

            records.append(item)

        elif user == "2":
            # Display database
            print "\t" + "-" * 5
            for r in records:
                print "Title:\t\t", r["Title"]
                print "Year:\t\t", r["Year"]
                print "Director:\t", r["Director"]
                print "\t" + "-" * 5
            print

        elif user == "0":
            # Save the records and close the shelf before exiting
            file['films'] = records
            file.close()

        else:
            print "Unknown option"

11.4

Regular expressions

Regular expressions are strings conforming to a standard format that can be used for matching text. Regular expressions, or regexps, are powerful and flexible and are supported in many languages and by many applications. Linux users will probably be aware of the “grep” command, which searches for matches within files based on a given regexp. In this section, we will examine basic regular expressions and how to use them in Python. Further information on regular expressions and Python’s support of them can be found in the Python documentation.

Imagine we need to find instances of the word “is” in the following string:

    “This is a string. It contains some text. Is this enough? I think it is, but maybe you disagree.”

We could use the find() function of the string type to locate “is”, but that would also match the end of “This”. It would also miss the second instance, because the first letter is capitalised. “Aha!”, you might think. Using the upper() function, we could convert the whole string to uppercase, and then search for “ IS ” - case is no longer a problem, and we will only find the letters when surrounded by a space, so “This” will not match. But what about the third instance of “is”? This one’s followed by a comma, so we still won’t match all of them. Of course, we could go through the string character by character, analysing the surrounding letters for punctuation, etc., but this gets complicated.

11.4.1

Finding substrings with regular expressions

Here is how we would find “is” using regexps:

    # import the regexp module
    import re

    s = 'This is a string. It contains some text. ' + \
        'Is this enough? I think it is, but maybe you disagree.'

    # Create a regexp
    exp = re.compile('(?
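One way to find all three instances of “is” is sketched below. Note the assumptions: this is not necessarily the pattern used in the listing above. It relies on the word-boundary marker \b and the re.IGNORECASE flag, both standard features of the re module, and uses the function form of print so that it also runs under Python 3.

```python
import re

s = ('This is a string. It contains some text. '
     'Is this enough? I think it is, but maybe you disagree.')

# \b matches at a word boundary, so the "is" inside "This", "this" and
# "disagree" is skipped; re.IGNORECASE lets the pattern match "Is" too.
exp = re.compile(r'\bis\b', re.IGNORECASE)

matches = [m.group() for m in exp.finditer(s)]
starts = [m.start() for m in exp.finditer(s)]
print(matches)
```

Because the comma after the third “is” is not a word character, the boundary test succeeds there without any special handling of punctuation.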

11.5

Further reading

• File objects - Python documentation: http://docs.python.org/lib/bltin-file-objects.html
• Regular Expressions - Python documentation: http://docs.python.org/lib/module-re.html
• Shelve - Python documentation: http://docs.python.org/lib/module-shelve.html
• Regular Expressions HOWTO - Extended documentation for Python regexps: http://www.amk.ca/python/howto/regex/

11.6

Glossary

File handle: An object that gives us access to a specific file.

Regular expression: A pattern for matching text.

Buffer: A space for temporary storage, used to collect up data and perform quicker block operations instead of many small operations.

Pickle: To serialise data.

Serialise: Convert complex structures into representations that are simple to save, transmit, etc.

Shelve: Serialise and store items.

Command-line argument: Switches, options, data, etc., passed to a program when executed.

Volatile: (In programming) Remaining only while the program is running.

Persistent: (In programming) Stored between sessions.

11.7

Exercises

1. Write a program that accepts a filename as a command-line parameter. The program will search the given file for each letter in the alphabet and print statistics on the number of occurrences.

11.7.1

Key Assignment

Create a program that looks through a list of given files for quoted strings. Print the line number of the quoted strings and the strings themselves.

Chapter 12

Object oriented Python

Object oriented programming (OOP) was invented in the 1960s with the Simula programming language but really became popular in the 1990s with the rise of C++ and Java. Most modern languages have some sort of facility for using the key features of OOP (classes, objects and inheritance), and Python is no exception. No matter what sort of programming you go on to do after this module, chances are that you’ll need objects. One thing you will notice about OOP is the vast amount of jargon that goes with it. Don’t worry if you get confused: just refer to the glossary at the end of this chapter.

Learning outcomes

• Write classes to implement new types.
• Document classes with class invariants (and methods with preconditions and postconditions).
• Appropriately use single and multiple inheritance to implement polymorphism.
• Overload builtin functions using operator overloading methods.
• Implement exceptions.
• Use Python’s unit testing framework (PyUnit) to test code.

12.1

Writing classes and using objects

As with functions, objects are entities which you have met before, although we haven’t discussed them in detail until now. Here’s an example use of objects:

    >>> import random
    >>> r = random.Random()
    >>> r.random()
    0.35504954450565551
    >>> r.random()
    0.76348784528130653
    >>> r.random()
    0.46062079064093542
    >>> r.random()
    0.43806239719483753
    >>>

In the example above, the name r refers to an object. The idea of objects is very simple. Usually, when you create a new value it has some built-in type such as an int, float or list. Any program you write can only use data made up of some combination of the built-in types (such as a list of strings, or a dictionary of tuples, etc.). Object oriented programming is one way of creating new types, which can be as complicated as you like. So, in the example above r had the type Random. A Random is a random number generator which holds a bunch of data, some of which is Python code to manipulate other data. For example, r.random() looks like a function call. In fact it isn’t, it’s a call to a method, which is like a function, but usually just returns a value or manipulates state held within the object r.

By this point, you have probably realised that there’s a lot of jargon involved in object oriented programming. Most of this is just names for things you’ve already used. It’s very important that you learn to use this jargon correctly as when you read books or documentation, or (much more importantly) when you talk to other programmers, you’ll find this jargon is used everywhere. As a programmer yourself you are expected to know it! So, in the example above, r is an object of type Random. We can also say that it is an instance of Random and that when we created it (r=random.Random()) we instantiated r. Don’t worry if this seems very confusing now, the more we use these new words the easier it will be to understand them.

To create new types of your own you need to write a class. Here’s a very quick example. Notice the indentation which, as usual in Python, tells us about the scope of a class.

    import random

    class Die:
        """A class to represent a die -- used in games of chance.

        inv::
            self.spots in Die.sides
            not self.random == None
            len(Die.sides) > 0
        """
        sides = [1, 2, 3, 4, 5, 6]  # Sides on a die

        def __init__(self):
            """
            pre::
                True
            post::
                self.spots in Die.sides
                not self.random == None
            """
            self.spots = Die.sides[0]
            self.random = random.Random()
            return

        def roll(self):
            """Roll the die.

            pre::
                True
            post::
                self.spots in Die.sides
            """
            self.spots = self.random.choice(Die.sides)
            return

We can use our new class and create (or instantiate) objects with it like this:

    >>> d = Die()
    >>> d.roll()
    >>> d.spots
    2
    >>> d.roll()
    >>> d.spots
    4
    >>> d.roll()
    >>> d.spots
    5
    >>> d.roll()
    >>> d.spots
    1
    >>>

So, what on earth is going on here? Well, when we instantiate our new object, d, Python finds our Die class in its global scope and uses the type definition there to create d. Notice that we say d=Die(), almost as if creating d calls a method. In fact, Python does execute a method for us whenever an object is created. It is always the special method called __init__(), which is known as a constructor.


There are some other new words in the example too. self means “this object”. (If you’ve ever used Java you will know self as the Java keyword this.) So, for our example object d, self always means d. Notice that self is an argument to every method within Die, but we didn’t pass any arguments at all to __init__() or to roll(). Just as Python implicitly called __init__() without us having to call it explicitly, self is always passed to any method which is called using the syntax object.method(). So, when we write d.roll() Python automatically ensures that self==d inside the method.

Lastly, our Die example tells us something about how scope works in objects. Remember we said in the code:

    class Die:
        sides = [1, 2, 3, 4, 5, 6]  # Sides on a die

        def __init__(self):
            self.spots = Die.sides[0]
            self.random = random.Random()
            return

Notice how some names (spots and random) are always written self.spots and self.random. This means that every object instantiated from the Die class contains its very own spots and random names. This is why we can say:

    >>> d.spots
    1

The same is true of the method roll(), which we refer to as d.roll(). On the other hand, sides is never called self.sides, instead we call it Die.sides. This means that there is only one copy of the value called sides which is shared between all of the objects of type Die. Here’s an example to illustrate this:

    >>> d1 = Die()
    >>> d2 = Die()
    >>> d1.roll()
    >>> d2.roll()
    >>> d1.spots
    4
    >>> d2.spots
    5
    >>> Die.sides
    [1, 2, 3, 4, 5, 6]
    >>>

Here, notice that d1 and d2 each have their own spots variable which (in the example) hold the values 4 and 5.
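The difference between the shared sides list and the per-object spots value can be checked directly. The sketch below repeats a cut-down Die class (without the contracts) so that it is self-contained, and is written in Python 3 syntax; spots is set by hand rather than rolled so that the outcome does not depend on chance.

```python
import random


class Die:
    sides = [1, 2, 3, 4, 5, 6]  # class attribute: one copy, shared by all dice

    def __init__(self):
        self.spots = Die.sides[0]  # instance attribute: one per object
        self.random = random.Random()

    def roll(self):
        self.spots = self.random.choice(Die.sides)


d1 = Die()
d2 = Die()
d1.spots = 4  # changes d1 only; d2 keeps its own value
print(d1.spots, d2.spots)
```

Looking up sides through an object (d1.sides) finds the same list as Die.sides, because the name is not stored on the object itself.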


Lastly, we’ve added preconditions and postconditions to the methods in the Die class. We have also added one other bit of documentation which is new. Here it is:

    class Die:
        """A class to represent a die -- used in games of chance.

        inv::
            self.spots in Die.sides
            not self.random == None
            len(Die.sides) > 0
        """
        sides = [1, 2, 3, 4, 5, 6]

The inv:: condition lets us list some boolean expressions which must always remain True, whatever happens. It’s part of our contract to whoever might be using the Die class that these invariants will always be True. In this example, we just want to say that self.spots always holds some value from the Die.sides list; that self.random holds a value (not the dummy value None) and that the length of the Die.sides list is greater than zero (otherwise what would we put in self.spots?).

So far we have a nice class to represent a die, which we can roll. Let’s do something interesting with that and create a new type to implement a game of craps. This is going to be a very simple game; we won’t worry about sophisticated casino-style betting! In our game, we’ll bet a certain amount, roll two dice and see if the dice land how we guessed. If they do, we win, otherwise we lose. There’s nothing really new in this example, except that it’s a slightly more sophisticated one, just because it makes use of objects instantiated from the Die class:

    class Craps:
        def __init__(self):
            """Initialise a game of Craps.

            pre::
                True
            post::
                not self.die1 == None
                not self.die2 == None
                self.winnings == 0
            """
            self.die1 = Die()
            self.die2 = Die()
            self.winnings = 0
            return

        def play_round(self, bet, spots1, spots2):
            """Make a bet and roll the dice.

            pre::
                spots1 in Die.sides
                spots2 in Die.sides
                bet > 0
            post::
                if spots1 == self.die1.spots and spots2 == self.die2.spots:
                    self.winnings += bet
            """
            self.die1.roll()
            self.die2.roll()
            if self.die1.spots == spots1 and self.die2.spots == spots2:
                self.winnings += bet

        def menu(self):
            """Print the user menu.

            pre::
                True
            post::
                True
            """
            print "1. Keep playing."
            print "2. New game."
            print "3. Exit."

        def get_side(self):
            """Get a number from the user which could be a die side.

            pre::
                True
            post::
                __return__ in Die.sides
            """
            d = int(raw_input())
            while not d in Die.sides:
                print "Please enter another number:"
                d = int(raw_input())
            return d

        def play(self):
            """Play a game of craps."""
            self.menu()
            menu = int(raw_input())
            valid = [1, 2, 3]
            while not menu == 3:
                if menu == 2:
                    # A new game resets the running total
                    self.winnings = 0
                print "How much would you like to bet?"
                bet = int(raw_input())
                print "What will the first die roll?"
                s1 = self.get_side()
                print "What will the second die roll?"
                s2 = self.get_side()
                print "Rolling ..."
                self.play_round(bet, s1, s2)
                print str(self.die1.spots), str(self.die2.spots)
                if self.die1.spots == s1 and self.die2.spots == s2:
                    print "You win!"
                else:
                    print "You lose!"
                print "Winnings so far:", self.winnings
                self.menu()
                menu = int(raw_input())

Putting this together, here’s our finished craps module:

Listing 12.1: Game of craps with contracts

    #!/bin/env python2.4

    """
    Classes to implement a game of craps.
    """

    __author__ = 'Sarah Mount'
    __credits__ = 'Sarah Mount'
    __date__ = 'November 2005'

    import random

    class Die:
        """A class to represent a die -- used in games of chance.

        inv ::
            self.spots in Die.sides
            not self.random == None
            len(sides) > 0
        """
        sides = [1, 2, 3, 4, 5, 6]  # Sides on a die

        def __init__(self):
            """
            pre ::
                True
            post ::
                self.spots in Die.sides
                not self.random == None
            """
            self.spots = Die.sides[0]
            self.random = random.Random()
            return

        def roll(self):
            """Roll the die.

            pre ::
                True
            post ::
                self.spots in Die.sides
            """
            self.spots = self.random.choice(Die.sides)
            return

    class Craps:
        def __init__(self):
            """Initialise a game of Craps.

            pre ::
                True
            post ::
                not self.die1 == None
                not self.die2 == None
                self.winnings == 0
            """
            self.die1 = Die()
            self.die2 = Die()
            self.winnings = 0
            return

        def play_round(self, bet, spots1, spots2):
            """Make a bet and roll the dice.

            pre ::
                spots1 in Die.sides
                spots2 in Die.sides
                bet > 0
            post ::
                if spots1 == self.die1.spots and spots2 == self.die2.spots:
                    winnings += bet
            """
            self.die1.roll()
            self.die2.roll()
            if self.die1.spots == spots1 and self.die2.spots == spots2:
                self.winnings += bet

        def menu(self):
            """Print the user menu.

            pre ::
                True
            post ::
                True
            """
            print "1. Keep playing."
            print "2. New game."
            print "3. Exit."

        def get_side(self):
            """Get a number from the user which could be a die side.

            pre ::
                True
            post ::
                __return__ in Die.sides
            """
            d = int(raw_input())
            while not d in Die.sides:
                print "Please enter another number:"
                d = int(raw_input())
            return d

        def play(self):
            """Play a game of craps."""
            self.menu()
            menu = int(raw_input())
            valid = [1, 2, 3]
            while not menu == 3:
                if menu == 2:
                    self.winnings = 0
                print "How much would you like to bet?"
                bet = input()
                print "What will the first die roll?"
                s1 = self.get_side()
                print "What will the second die roll?"
                s2 = self.get_side()
                print "Rolling..."
                self.play_round(bet, s1, s2)
                print str(self.die1.spots), str(self.die2.spots)
                if self.die1.spots == s1 and self.die2.spots == s2:
                    print "You win!"
                else:
                    print "You lose!"
                print "Winnings so far:", self.winnings
                self.menu()
                menu = int(raw_input())

    if __name__ == '__main__':
        Craps().play()

And here’s how Python represents the documentation for that module:

Listing 12.2: Documentation for the craps module

    Help on module craps:

    NAME
        craps - Classes to implement a game of craps.

    FILE
        /home/snim2/Desktop/pybook/listings/oo_python/craps.py

    CLASSES
        Craps
        Die

        class Craps
         |  Methods defined here:
         |

         |  __init__(self)
         |      Initialise a game of Craps.
         |
         |      pre ::
         |          True
         |      post ::
         |          not self.die1 == None
         |          not self.die2 == None
         |          self.winnings == 0
         |
         |  get_side(self)
         |      Get a number from the user which could be a die side.
         |
         |      pre ::
         |          True
         |      post ::
         |          __return__ in Die.sides
         |
         |  menu(self)
         |      Print the user menu.
         |
         |      pre ::
         |          True
         |      post ::
         |          True
         |
         |  play(self)
         |      Play a game of craps.
         |
         |  play_round(self, bet, spots1, spots2)
         |      Make a bet and roll the dice.
         |
         |      pre ::
         |          spots1 in Die.sides
         |          spots2 in Die.sides
         |          bet > 0
         |      post ::
         |          if spots1 == self.die1.spots and spots2 == self.die2.spots:
         |              winnings += bet

        class Die
         |  A class to represent a die -- used in games of chance.
         |
         |  inv ::
         |      self.spots in Die.sides
         |      not self.random == None
         |      len(sides) > 0
         |
         |  Methods defined here:
         |
         |  __init__(self)
         |      pre ::
         |          True
         |      post ::
         |          self.spots in Die.sides
         |          not self.random == None
         |
         |  roll(self)
         |      Roll the die.
         |
         |      pre ::
         |          True
         |      post ::
         |          self.spots in Die.sides
         |
         |  ------------------------------------------------------
         |  Data and other attributes defined here:
         |
         |  sides = [1, 2, 3, 4, 5, 6]

    DATA
        __author__ = 'Sarah Mount'
        __credits__ = 'Sarah Mount'
        __date__ = 'November 2005'

    DATE
        November 2005

    AUTHOR
        Sarah Mount

    CREDITS
        Sarah Mount
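The help page above is assembled from the docstrings written in the module. As a minimal sketch of where that text lives, the snippet below reads the docstrings directly through the __doc__ attribute. The Coin class is made up for illustration, and print is called with brackets so that the snippet also runs under Python 3.

```python
class Coin(object):
    """A class to represent a coin -- hypothetical example.

    inv ::
        self.face in Coin.faces
    """
    faces = ['heads', 'tails']

    def __init__(self):
        """
        pre ::
            True
        post ::
            self.face in Coin.faces
        """
        self.face = Coin.faces[0]

# help(Coin) would print a page in the style of Listing 12.2.
# The raw text it is built from is stored in __doc__.
print(Coin.__doc__)
print(Coin.__init__.__doc__)
```

Anything written in a docstring, including contracts, therefore reaches whoever asks for help on the class.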

12.1.1 The film database with objects

As a last example, let’s return to the film database from Chapter 8. There, we represented each film as a dictionary (with keys like “director” and “title”) and the film database was just a list of dictionaries. This works, but every time we add a new film, or access an old one, we have to remember how the dictionary should be structured. If we want to make sure that (say) we always print out our films in the same way, we have to check every place in the program where we’ve done this, or move that job into a function. Of course, for a program as small as our film database that’s no big deal, but for anything much bigger it means we really have to work hard to keep track of all our code.

Instead, we can create a new type to represent films and give that type the capability to convert itself to a string so we can print it out neatly. In fact, this turns out to be really easy in Python. We’ve already seen how to create a new type using the class keyword. In order to define a way of printing out our new types we can write a special method called __repr__ with the signature __repr__(self). Once we’ve done that, we can use print or str() just as we would for any built-in type and Python will know how to convert the object to a string: it just calls __repr__. Here’s a simple example:

    >>> class Foobar:
    ...     def __repr__(self):
    ...         return "foobar"
    ...
    >>> f = Foobar()
    >>> print f
    foobar
    >>> str(f)
    'foobar'
    >>> print str(f)
    foobar
    >>>

Below is our new database program. Notice that we have one new type (class) to represent films and one to represent databases. The rule is that every class should represent one particular “thing” and contain all the code and algorithms related to that thing. So, in our craps example, the Die class defined how to “roll” a die, not the Craps class. And the Craps class defined how to play a game of craps, not the Die class. That probably all sounds like common sense, but it’s really worth trying to keep your classes simple and self-contained. That way your code will be much clearer.

Listing 12.3: Film database with objects

    #!/bin/env python

    class FilmEntry:
        """Represents a film in an IMDB style database.
        """
        def __init__(self, title, director, year):
            self.title = title
            self.director = director
            self.year = year
            return

        def __repr__(self):
            s = 'Title:\t' + self.title + '\n'
            s += 'Director:\t' + self.director + '\n'
            s += 'Year:\t' + self.year + '\n'
            return s

    class IMDB:
        """IMDB style database of film entries.
        """
        def __init__(self):
            self.imdb = []
            return

        def __len__(self): return self.imdb.__len__()

        def addFilm(self, title, director, year):
            self.imdb.append(FilmEntry(title, director, year))
            return

        def __repr__(self):
            s = 'Film Database:\n'
            s += '-------------\n'
            j = 0
            for i in self.imdb:
                s += 'Film ' + str(j) + ':\n'
                s += i.__repr__()
                j += 1
            return s

    if __name__ == '__main__':
        db = IMDB()
        # User input
        user = ''
        while user != '0':
            # Display menu
            print
            print
            print 'Film Database'
            print '-----------------'
            print '1 -- Add entry'
            print '2 -- List entries'
            print '0 -- Exit'
            print '-----------------'
            print len(db), 'entries.'
            print 'Enter option:',

            # Get user input
            user = raw_input()
            if user == '1':
                # Add to database
                print 'Enter title:',
                title = raw_input()
                print 'Enter director:',
                director = raw_input()
                print 'Enter year:',
                year = raw_input()
                db.addFilm(title, director, year)
            elif user == '2':
                # Display database
                print db.__repr__()
                print
            elif user == '0':
                break
            else:
                print 'Unknown option'

And here’s an example user session with the new database:

Listing 12.4: Film database in use

    $ ./imdb.py


    Film Database
    -----------------
    1 -- Add entry
    2 -- List entries
    0 -- Exit
    -----------------
    0 entries.
    Enter option: 1
    Enter title: Fight Club
    Enter director: David Fincher
    Enter year: 1999


    Film Database
    -----------------
    1 -- Add entry
    2 -- List entries
    0 -- Exit
    -----------------
    1 entries.
    Enter option: 1
    Enter title: Apocalypse Now
    Enter director: Francis Ford Coppola
    Enter year: 1979


    Film Database
    -----------------
    1 -- Add entry
    2 -- List entries
    0 -- Exit
    -----------------
    2 entries.
    Enter option: 1
    Enter title: The Shining
    Enter director: Stanley Kubrick
    Enter year: 1980


    Film Database
    -----------------
    1 -- Add entry
    2 -- List entries
    0 -- Exit
    -----------------
    3 entries.
    Enter option: 2
    Film Database:
    -------------
    Film 0:
    Title:  Fight Club
    Director:       David Fincher
    Year:   1999
    Film 1:
    Title:  Apocalypse Now
    Director:       Francis Ford Coppola
    Year:   1979
    Film 2:
    Title:  The Shining
    Director:       Stanley Kubrick
    Year:   1980




    Film Database
    -----------------
    1 -- Add entry
    2 -- List entries
    0 -- Exit
    -----------------
    3 entries.
    Enter option: 0
    $
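Because FilmEntry defines __repr__, a film can be handed straight to print or str(); there is no need to call __repr__ by hand. The sketch below shows this with a cut-down class, called Film here to keep it apart from Listing 12.3. Wrapping the year in str() is an addition that is not in the listing; it is there so that a numeric year works as well as a string. print is called with brackets so that the snippet also runs under Python 3.

```python
class Film(object):
    """Cut-down version of FilmEntry, for illustration only."""
    def __init__(self, title, director, year):
        self.title = title
        self.director = director
        self.year = year

    def __repr__(self):
        # str() lets year be either a string or a number.
        return ('Title:\t' + self.title + '\n' +
                'Director:\t' + self.director + '\n' +
                'Year:\t' + str(self.year) + '\n')

film = Film('Fight Club', 'David Fincher', 1999)

# str() uses __repr__ because Film defines no __str__ of its own.
text = str(film)
print(text)
```

The same idea would let the main loop of the database say print db rather than print db.__repr__().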

12.2 Inheritance

One reason to use object oriented programming is that it provides a way of reusing your code. These days it is common for programs to consist of well over a million lines of code and to be written by several programming teams, sometimes from different companies. Any useful method of code reuse is bound to cut down on development time and costs. In fact, even in a relatively small program, being able to reuse code can make your programs easier to read and maintain and quicker to write.

This method of code reuse is known as inheritance and it is probably the single biggest reason for the popularity of object oriented programming. With inheritance, one type can “inherit” all the data and methods of its parent. That way, if you have several classes that are all rather similar, instead of having to retype all the code to define some of their methods, you can use inheritance to say “this type has all the data and methods of its parent and these new ones as well”. In object oriented programming the “parent” class is called the superclass and the “child” class is called the subclass. We can tell Python that we want a new class to have a particular superclass using this syntax:

    class <name>(<superclass>):
        ...

For example, in the following code the class Child inherits all the capabilities of its superclass Parent:

    >>> class Parent:
    ...     def foo(self):
    ...         print "foo"
    ...
    >>> class Child(Parent):
    ...     def bar(self):
    ...         print "bar"
    ...
    >>> c = Child()
    >>> c.bar()
    bar
    >>> c.foo()
    foo
    >>> p = Parent()
    >>> p.foo()
    foo
    >>> p.bar()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: Parent instance has no attribute 'bar'
    >>>

Notice that the superclass (Parent) doesn’t have any of the methods that are only defined in the subclass. Also, if we don’t specify a superclass, as with the Parent class (or Die, Craps and our other examples), then in Python 2 we get what is called a classic class, which has no superclass at all. To inherit from the special built-in superclass called object we have to name it explicitly, as the Point class below does.

In real programs, inheritance is useful wherever several types need to share the same capabilities. For example, many graphical programs (games, for example) deal with points in space. These might be two or three dimensional and will probably have methods to do things like finding the distance between the point and the origin, finding the distance between the point and another point, and so on. Here’s a small module with classes to represent two and three dimensional points:

Listing 12.5: Points: inheritance as extension

    #!/bin/env python

    class Point(object):
        def __init__(self, x, y):
            self.x = x
            self.y = y
            return
        def dist(self):
            import math
            return math.sqrt(self.x ** 2 + self.y ** 2)
        def __repr__(self):
            return 'Point(' + str(self.x) + ',' + str(self.y) + ')'

    class Point3d(Point):
        def __init__(self, x, y, z):
            Point.__init__(self, x, y)
            self.z = z
            return
        def dist(self):
            import math
            return math.sqrt(self.x ** 2 + self.y ** 2 + self.z ** 2)
        def __repr__(self):
            return 'Point(' + str(self.x) + ',' + str(self.y) + ',' + str(self.z) + ')'

    if __name__ == '__main__':
        # Basic testing!
        p1 = Point(16, 16)
        p2 = Point3d(25, 25, 25)
        print p1.__repr__(), 'is:', p1.dist(), 'from the origin'
        print p2.__repr__(), 'is:', p2.dist(), 'from the origin'

and here is the output of that program:

    $ ./point.py
    Point(16,16) is: 22.627416998 from the origin
    Point(25,25,25) is: 43.3012701892 from the origin
    $

Notice that the Point3d class is a subclass of Point. If you’ve done a lot of maths that might seem odd, because you will probably think of three dimensional points as being more general than two dimensional points. However, in object oriented programming a subclass is an extension of its superclass: the subclass adds more functionality to the superclass. Our three dimensional point has one more datum (we’ve called it self.z) than our two dimensional point, so Point3d inherits from Point.
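The text mentions finding the distance between one point and another, which Listing 12.5 does not implement. Here is a minimal sketch of how that could be written once and inherited. The method names coords and dist_to are assumptions made for this example, not part of the listing, and print is called with brackets so that the snippet also runs under Python 3.

```python
import math

class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def coords(self):
        # Hypothetical helper: the coordinates as a list.
        return [self.x, self.y]

    def dist_to(self, other):
        # Hypothetical method: straight-line distance to another point.
        total = 0
        for a, b in zip(self.coords(), other.coords()):
            total += (a - b) ** 2
        return math.sqrt(total)

class Point3d(Point):
    def __init__(self, x, y, z):
        Point.__init__(self, x, y)   # reuse the superclass constructor
        self.z = z

    def coords(self):
        # Extend the superclass result with the extra datum.
        return Point.coords(self) + [self.z]

print(Point(0, 0).dist_to(Point(3, 4)))             # prints 5.0
print(Point3d(1, 2, 3).dist_to(Point3d(1, 2, 5)))   # prints 2.0
```

Point3d only describes what is new about a three dimensional point; dist_to itself is written once, in Point, and works for both classes.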

12.3 Polymorphism

In the example above about points, the Point and Point3d classes both had a method called dist. As it happened, Point3d had its own definition of dist, but even if it didn’t, objects of type Point3d would still have had the dist method from the Point superclass. So, even though instances of Point and Point3d have different types, they still have the same methods. This property is known as polymorphism, which is Greek for “many shapes”. It is useful because we might have a number of objects and, if they all share a superclass, we can call methods which we know will be in all of our objects, regardless of which class they instantiate. For example, if we have a list of points then we can find the distance between each one and the origin, without having to worry about whether each particular object is of type Point or type Point3d:

    >>> from point import *
    >>> p = []
    >>> p.append(Point(23, 24))
    >>> p.append(Point3d(22, 22, 22))
    >>> p.append(Point3d(0, 0, 100))
    >>> p.append(Point(1000, 1000))
    >>> p.append(Point3d(-100, 0, 100))
    >>> p.append(Point3d(25, -50, 25))
    >>> p.append(Point(-99, 45))
    >>> for i in p:
    ...     print i.dist()
    ...
    33.2415402772
    38.1051177665
    100.0
    1414.21356237
    141.421356237
    61.2372435696
    108.747413762
    >>>
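The claim that a subclass keeps the superclass method even when it defines none of its own can be checked with a small sketch. LabelledPoint is a hypothetical subclass invented for this example; it adds a label and nothing else. print is called with brackets so that the snippet also runs under Python 3.

```python
import math

class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def dist(self):
        return math.sqrt(self.x ** 2 + self.y ** 2)

class LabelledPoint(Point):
    """Hypothetical subclass: adds a label but no dist method of its own."""
    def __init__(self, x, y, label):
        Point.__init__(self, x, y)
        self.label = label

points = [Point(3, 4), LabelledPoint(6, 8, 'home')]

# Both objects answer dist(), although only Point defines it.
distances = [p.dist() for p in points]
print(distances)   # prints [5.0, 10.0]
```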

12.3.1 Shapes example

Here’s another example of polymorphism. This time we have some classes representing shapes (the sort of thing that might be useful for a vector graphics illustrator). Each sort of shape can determine its own area and perimeter, so later on we can iterate through a list of them and call the methods common to all shape types.

Listing 12.6: Shapes: inheritance in use

    #!/bin/env python

    class Shape:
        def __init__(self):
            return
        def getArea(self):
            return 0.0
        def getPerimeter(self):
            return 0.0

    class Rectangle(Shape):
        def __init__(self, side1, side2):
            Shape.__init__(self)
            self.side1 = side1
            self.side2 = side2
            return
        def getArea(self):
            return self.side1 * self.side2
        def getPerimeter(self):
            return (2 * self.side1) + (2 * self.side2)

    class Square(Shape):
        def __init__(self, side):
            Shape.__init__(self)
            self.side = side
            return
        def getArea(self):
            return self.side * self.side
        def getPerimeter(self):
            return 4 * self.side

Again, we can create a list of Shape objects and make use of any of the methods that the Shape class provides, without finding out which subclass of Shape each object is an instance of:

    >>> from shapes import *
    >>> myshapes = [Rectangle(1, 2), Square(77), Rectangle(86, 99),
    ...             Square(6), Square(64)]
    >>> for i in myshapes:
    ...     print "Area:", i.getArea()
    ...     print "Perimeter:", i.getPerimeter()
    ...
    Area: 2
    Perimeter: 6
    Area: 5929
    Perimeter: 308
    Area: 8514
    Perimeter: 370
    Area: 36
    Perimeter: 24
    Area: 4096
    Perimeter: 256
    >>>
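One benefit of this arrangement is that code written against Shape keeps working when a new kind of shape is added later. The sketch below illustrates this with a Circle class and a total_area function; both are hypothetical additions for this example and do not appear in Listing 12.6. The classes are repeated in cut-down form so that the snippet runs on its own, under Python 2 or Python 3.

```python
import math

class Shape(object):
    def getArea(self):
        return 0.0
    def getPerimeter(self):
        return 0.0

class Square(Shape):
    def __init__(self, side):
        self.side = side
    def getArea(self):
        return self.side * self.side
    def getPerimeter(self):
        return 4 * self.side

class Circle(Shape):
    """Hypothetical extra shape, not part of Listing 12.6."""
    def __init__(self, radius):
        self.radius = radius
    def getArea(self):
        return math.pi * self.radius ** 2
    def getPerimeter(self):
        return 2 * math.pi * self.radius

def total_area(shapes):
    """Hypothetical helper: add up the areas of any mixture of shapes."""
    total = 0.0
    for s in shapes:
        total += s.getArea()
    return total

print(total_area([Square(2), Square(3)]))   # prints 13.0
print(total_area([Square(2), Circle(1)]))   # 4 plus pi
```

total_area never asks which class it is dealing with, so it did not need to change when Circle was introduced.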

12.3.2 Expression evaluator example

Finally, we’ll look at an example of a non-linear datatype, that is, one where we have data that isn’t held in a list, array, tuple or other linear storage. In Chapter 7 we saw a lexer, which took an input string from the user and converted it into an integer. Lexing is the first part of a compiler or interpreter, like the Python interpreter you’ve been using throughout the course. Lexers take a string (from a file or the interpreter shell) and convert it into tokens, each of which represents one bit of syntax in the original string. So, if you entered this line into the Python interpreter:

    >>> sum, diff = (x + y), (x - y)

Python’s lexer might issue something like the following set of tokens:

    NAME "sum"
    COMMA
    NAME "diff"
    EQUALS
    LPAREN
    NAME "x"
    PLUS
    NAME "y"
    RPAREN
    COMMA
    LPAREN
    NAME "x"
    MINUS
    NAME "y"
    RPAREN

Next, the list of tokens is parsed. That is, the compiler (or interpreter) figures out what the grammar (or syntax) of the input program is supposed to be and whether it contains any syntax errors. In this example, we’ll look at what might be the result of parsing expressions in a small calculator language. In our language, we won’t have any variables, or even brackets; we’ll just have numbers and operations on them. So, this is the grammar of our little language:

    <digit>  :: (0|1|2|3|4|5|6|7|8|9)
    <number> :: <digit>+ ("." <digit>*)?
    <expr>   :: <number>
              | <expr> "+" <expr>
              | <expr> "-" <expr>
              | <expr> "*" <expr>
              | <expr> "/" <expr>

Notice that we’re writing the grammar as if it were a regular expression. So, our language has numbers, which are like floats (a string of digits followed, optionally, by a decimal point and another string of digits) and expressions. An expression can be a number, or two expressions added (or subtracted, or multiplied or divided) together. So, the following are all valid programs in our little language:

    1000
    1 + 3
    1 * 2 + 3 / 4 - 5

and the following are invalid:

    1000 +
    1 * / 3
    1 && 2

Notice that we haven’t included brackets and we’re not bothering about order of precedence in this example. To represent programs in our language with objects, we’ll have a superclass called Expr which is a template for all our different sorts of expression. The subclasses of Expr will be Num, Plus, Minus, Times and Div, one for each type of expression. Our Expr objects will be able to do two useful things:

• return their string representation (with a __repr__ method); and
• evaluate themselves (with a method called eval).

Evaluating an expression object here means working out the result of the particular arithmetic expression that the object represents. So, if we have a Plus object which represents 1+2 then that object should be able to evaluate itself and return 3. Here’s the code to do this:

Listing 12.7: A simple expression evaluator

    #!/bin/env python

    class Expr(object):
        def eval(self):
            return 0
        def __repr__(self):
            return ''

    class Num(Expr):
        def __init__(self, num):
            self.num = num
            return
        def __repr__(self):
            return str(self.num)
        def eval(self):
            return self.num

    class Plus(Expr):
        def __init__(self, left, right):
            self.left = left
            self.right = right
            return
        def __repr__(self):
            return self.left.__repr__() + '+' + self.right.__repr__()
        def eval(self):
            return self.left.eval() + self.right.eval()

    class Minus(Expr):
        def __init__(self, left, right):
            self.left = left
            self.right = right
            return
        def __repr__(self):
            return self.left.__repr__() + '-' + self.right.__repr__()
        def eval(self):
            return self.left.eval() - self.right.eval()

    class Times(Expr):
        def __init__(self, left, right):
            self.left = left
            self.right = right
            return
        def __repr__(self):
            return self.left.__repr__() + '*' + self.right.__repr__()
        def eval(self):
            return self.left.eval() * self.right.eval()

    class Div(Expr):
        def __init__(self, left, right):
            self.left = left
            self.right = right
            return
        def __repr__(self):
            return self.left.__repr__() + '/' + self.right.__repr__()
        def eval(self):
            return self.left.eval() / self.right.eval()

    if __name__ == '__main__':
        # Testing!
        def test(expr):
            print expr, 'Evaluates to:', expr.eval()
            return

        expr = []
        expr.append(Plus(Num(1), Num(2)))
        expr.append(Minus(Plus(Num(1), Num(2)), Times(Num(5), Num(6))))
        expr.append(Div(Minus(Plus(Num(1), Num(2)), Times(Num(5), Num(6))),
                        Num(2)))

        map(test, expr)

Running the script gives us the results of the tests that we’ve placed at the end of the module (after the if __name__ ... statement):

    $
    1+2 Evaluates to: 3
    1+2-5*6 Evaluates to: -27
    1+2-5*6/2 Evaluates to: -14
    $
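The last result may look surprising. The tree groups the expression as (1+2-5*6)/2, which is -27 divided by 2, or -13.5, yet the evaluator prints -14. The reason is that the Num objects hold integers, and in Python 2 the / operator applied to two integers performs floor division, which rounds down. A minimal sketch of the difference follows; it uses the // operator, which floors in every version of Python, so the snippet behaves the same under Python 3.

```python
# The value of the left-hand subtree: 1 + 2 - 5 * 6.
numerator = 1 + 2 - 5 * 6

# Floor division rounds down (towards minus infinity). This is what
# Python 2 does when / is given two integers.
floored = numerator // 2

# Dividing by a float gives the exact quotient instead.
exact = numerator / 2.0

print(floored)   # prints -14
print(exact)     # prints -13.5
```

Building the tree with Num(2.0) in place of Num(2) would therefore make the evaluator return -13.5.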

12.4 Abstract data types

12.4.1 Sets

In Section 10.5 we defined some functions to implement set operations. In that example we modeled sets as lists and each set operation took one or more lists as arguments. For example:

    >>> import sets
    >>> sets.intersect([1, 2, 3], [3, 4, 5])
    [3]
    >>>

There are a couple of drawbacks to this approach. One is that anyone using this module might pass in values which aren’t lists to our functions. If this happens the results will all depend on how we’ve implemented the module. If someone passes in values which can behave like lists (maybe tuples or strings) then perhaps our functions will be able to return meaningful results. If not, then Python might give us some sort of error message.

Object oriented programming can help us overcome this problem by allowing us to define a new type for sets which implements the set operations that we’re interested in (such as ismem and union). This way, we’re keeping the data that defines a particular set (still in a list) together with the operations that act upon it. This makes a lot of sense, and has the advantage that all our code to do with sets is collected in one place. It should also serve to remind anyone using our code that sets need to interact with other sets and that it wouldn’t make sense to see if a set “intersected” with, say, a float.

The implementation of the set type that we’re about to create is an example of an abstract data type. Here, abstract means that the implementation of the type (i.e. using lists to represent sets) is hidden from anyone who is creating an instance of the set type. We could exchange this implementation (for example, replacing lists with dictionaries) and no one would ever notice: only the code in the Set class would need to be changed.

Here’s the first part of our Set class. In this example, we’ll use imperative programming (not functional programming) to implement all the set operations, but we’ll keep the preconditions and postconditions in the documentation, just to remind you of what the different operations should do.

    class Set:
        """An abstract data type for sets.
        Sets are modeled as lists.

        inv ::
            type(self.set) == type([])
        """
        empty = []  # The empty set.

        def __init__(self):
            self.set = []
            return

The constructor here just creates an empty list in which we can store set elements. Next, we need to think about the methods which our class will provide.

Since the Set class is intended to be an abstract data type, our intention is that anyone creating a Set object will make use of the methods provided but probably won’t access self.set directly. Of course, there’s nothing to stop people doing so, like this:

    >>> s = Set()
    >>> s.set.append(1)
    >>>

but we’re assuming that authors won’t do this. The reason for this is that we want to maintain any class invariants we have, and our methods should be careful to keep the promises made in their postconditions. It may be that modifying data like self.set directly could break a class invariant. Or it may be that the method to add an element to the set needs to do more than just call self.set.append() to meet its postconditions. In fact, this is exactly the situation we have with the Set class. Someone accessing self.set directly can do this:

    >>> s = Set()
    >>> s.set.append(1)
    >>> s.set.append(1)
    >>>

i.e. they can add a value to the set more than once. Once a “set” contains more than one instance of a particular value then by definition it is no longer a set. So, to add elements to sets we really need a method that first checks to see if the element we want to add already appears in the set. Here are some methods to do this:

        def ismem(self, e):
            """Returns True if e appears in self.set and False otherwise.
            pre ::
                True
            post ::
                __return__ == e in self.set
            """
            return e in self.set

        def addmem(self, e):
            """Adds element e to self.set.
            pre ::
                True
            post ::
                ismem(e, self.set)
            """
            if not self.ismem(e):
                self.set.append(e)
            return

        def __repr__(self): return 'Set: ' + repr(self.set)

Next, we can implement the set operations which combine sets in different ways. Below is an intersect method which returns the intersection of two sets. Notice that when we return the intersection, we are returning a new instance of Set:

        def intersect(self, set):
            """Returns the intersection of self.set and set.
            That is, every element that is in both self.set and set.
            pre ::
                type(set) == type(Set.empty)
            post ::
                forall(__return__, lambda e: ismem(e, self.set) and ismem(e, set))
            """
            res = Set()
            for i in self.set:
                if set.ismem(i):
                    res.addmem(i)
            return res

The other methods in Set are similar, so we won’t discuss them in detail. Here is the code for the whole class:

Listing 12.8: A class to model sets

    class Set:
        """An abstract data type for sets.
        Sets are modeled as lists.

        inv ::
            type(self.set) == type([])
        """

        empty = []  # The empty set.

        def __init__(self):
            self.set = []
            return

        def __repr__(self): return 'Set: ' + repr(self.set)

        def isempty(self):
            """Returns True if self.set is empty and False otherwise.
            pre ::
                True
            post ::
                __return__ == (len(self.set) == 0)
            """
            return self.set == Set.empty

        def ismem(self, e):
            """Returns True if e appears in self.set and False otherwise.
            pre ::
                True
            post ::
                __return__ == e in self.set
            """
            return e in self.set

        def addmem(self, e):
            """Adds element e to self.set.
            pre ::
                True
            post ::
                ismem(e, self.set)
            """
            if not self.ismem(e):
                self.set.append(e)
            return

        def intersect(self, set):
            """Returns the intersection of self.set and set.
            That is, every element that is in both self.set and set.
            pre ::
                type(set) == type(Set.empty)
            post ::
                forall(__return__, lambda e: ismem(e, self.set) and ismem(e, set))
            """
            res = Set()
            for i in self.set:
                if set.ismem(i):
                    res.addmem(i)
            return res

        def difference(self, set):
            """Returns the difference of self.set and set.
            That is, every element that is in self.set and not in set.
            pre ::
                type(set) == type(Set.empty)
            post ::
                forall(__return__, lambda e: ismem(e, self.set) and not ismem(e, set))
            """
            res = Set()
            for i in self.set:
                if not set.ismem(i):
                    res.addmem(i)
            return res

        def symmetric_difference(self, set):
            """Returns the symmetric difference of self.set and set.
            That is, every element that is in self.set or set, but not in both.
            pre ::
                type(set) == type(Set.empty)
            post ::
                forall(__return__, lambda e: ismem(e, self.set) xor ismem(e, set))
            """
            res = self.difference(set)
            res2 = set.difference(self)
            for i in res2.set:
                res.addmem(i)
            return res

        def union(self, set):
            """Returns the union of self.set and set.
            That is, every element that is in either self.set or set.
            pre ::
                type(set) == type(Set.empty)
            post ::
                forall(__return__, lambda e: ismem(e, self.set) or ismem(e, set))
            """
            res = Set()
            for i in self.set:
                res.addmem(i)
            for i in set.set:
                res.addmem(i)
            return res

So, now we can create new sets and make use of the methods that they provide:

    >>> import sets
    >>> s = sets.Set()
    >>> s.addmem(1)
    >>> s.addmem(2)
    >>> s.addmem(3)
    >>> m = sets.Set()
    >>> m.addmem(3)
    >>> m.addmem(4)
    >>> m.addmem(5)
    >>> i = s.intersect(m)
    >>> print s
    Set: [1, 2, 3]
    >>> print m
    Set: [3, 4, 5]
    >>> print i
    Set: [3]
    >>>
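The text says that the list inside Set could be exchanged for a dictionary without users of the type noticing. As a rough sketch of what that might look like, here is a hypothetical DictSet class that offers the same ismem, addmem and intersect methods but stores its elements as dictionary keys. The class name and the members helper are inventions for this example, and one assumption is worth stating: dictionary keys must be hashable, so unlike the list version this sketch could not hold a list as an element. print is called with brackets so that the snippet also runs under Python 3.

```python
class DictSet(object):
    """Hypothetical alternative to Set: elements are stored as dictionary keys."""
    def __init__(self):
        self.set = {}

    def ismem(self, e):
        return e in self.set

    def addmem(self, e):
        # Storing an existing key again changes nothing, so no check is needed.
        self.set[e] = True

    def intersect(self, other):
        res = DictSet()
        for e in self.set:
            if other.ismem(e):
                res.addmem(e)
        return res

    def members(self):
        # Hypothetical helper: the elements as a sorted list.
        return sorted(self.set.keys())

s = DictSet()
for e in [1, 2, 3, 3]:
    s.addmem(e)
m = DictSet()
for e in [3, 4, 5]:
    m.addmem(e)

print(s.members())                # prints [1, 2, 3]
print(s.intersect(m).members())   # prints [3]
```

Code that only ever calls ismem, addmem and intersect works unchanged with either class, which is the point of keeping the representation hidden.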

12.4.2 Overloading built-in operators and functions

You have probably noticed already that many of the built-in functions and operators that Python provides can be applied to data of many different types. For example, we can add together two numbers, two strings or two lists:

    >>> 1 + 2
    3
    >>> 1 + 2.5
    3.5
    >>> 2.5 + 3.5
    6.0
    >>> "abc" + "def"
    'abcdef'
    >>> [1, 2, 3] + [1, 2, 3]
    [1, 2, 3, 1, 2, 3]
    >>> 1 * 2
    2
    >>> "a" * 2
    'aa'
    >>> [1] * 2
    [1, 1]
    >>>

This is called operator overloading. Now that we can create new types with the class keyword, we can also define what should happen when one of the built-in functions or operators is applied to an object of our new type. In fact, we have already done this with the __repr__ method, which we used to overload the str and repr functions:

    >>> class Foobar:
    ...     def __repr__(self):
    ...         return "foobar"
    ...
    >>> f = Foobar()
    >>> str(f)
    'foobar'
    >>> repr(f)
    'foobar'
    >>> print f
    foobar
    >>>

Python provides several of these “special” names with which to implement operator overloading. Lists of these names can be found online, here: http://docs.python.org/ref/specialnames.html. Some examples are:

    __and__        overloads the & operator.
    __or__         overloads the | operator.
    __xor__        overloads the ^ operator.
    __contains__   overloads the in operator.
    __len__        overloads the len function.

We can write these methods into our class and so define what all these operators and functions should do when they are applied to instances of our new type:

    class Set:
        """An abstract data type for sets.
        Sets are modeled as lists.

        inv::
            type(self.set) == type([])
        """

        empty = []  # The empty set.

        def __init__(self, values=None):
            self.set = []
            return

        # Overload some operators and functions.
        def __repr__(self): return 'Set: ' + repr(self.set)
        def __len__(self): return len(self.set)
        def __and__(self, other): return self.intersect(other)
        def __or__(self, other): return self.union(other)
        def __xor__(self, other): return self.symmetric_difference(other)
        def __contains__(self, e): return self.ismem(e)
        ...

Here’s an example of the operators and functions in use:

    >>> from sets import *
    >>> s = Set()
    >>> s.addmem(1)
    >>> s.addmem(2)
    >>> s.addmem(3)
    >>> s.addmem(4)
    >>> s.addmem(5)
    >>> m = Set()
    >>> m.addmem(4)
    >>> m.addmem(5)
    >>> m.addmem(6)
    >>> m.addmem(7)
    >>> m.addmem(8)
    >>> s.intersect(m)
    Set: [4, 5]
    >>> s.union(m)
    Set: [1, 2, 3, 4, 5, 6, 7, 8]
    >>> s.difference(m)
    Set: [1, 2, 3]
    >>> s.symmetric_difference(m)
    Set: [1, 2, 3, 6, 7, 8]
    >>> s.ismem(0)
    False
    >>> s & m
    Set: [4, 5]
    >>> s | m
    Set: [1, 2, 3, 4, 5, 6, 7, 8]
    >>> s ^ m
    Set: [1, 2, 3, 6, 7, 8]
    >>> 0 in s
    False
    >>> 1 in s
    True
    >>> len(s)
    5
    >>> repr(s)
    'Set: [1, 2, 3, 4, 5]'
    >>> print s
    Set: [1, 2, 3, 4, 5]

One criticism of operator overloading is that it can be confusing for anyone using your types. How will they know what a particular operator or function does? In our example we’ve tried to be quite intuitive about what the operators should mean. len usually returns the length of its argument, so we’ve got it returning the number of elements in a set. & usually means “bitwise and”, so we’ve defined it to return all the elements that are in one set and another. In general, when you use operator overloading, it’s sensible to follow these rules:

• Document your methods well;
• try to make your implementations intuitive; and
• if in doubt, don’t implement it!
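As a further illustration of the special names above, here is a minimal sketch. It is written for Python 3 (where print is a function) rather than the Python 2 used in these notes, and the class Bag is a hypothetical stand-in, not part of the sets module. All it tries to show is how __len__, __contains__, __and__ and __or__ route the built in operators to ordinary code:

```python
class Bag:
    """A tiny list-backed collection (hypothetical; not the notes' Set class)."""

    def __init__(self, values=()):
        self.items = []
        for v in values:
            if v not in self.items:   # ignore duplicates
                self.items.append(v)

    def __repr__(self):               # used by repr(), str() and print()
        return 'Bag: ' + repr(self.items)

    def __len__(self):                # called by len(b)
        return len(self.items)

    def __contains__(self, e):        # called by `e in b`
        return e in self.items

    def __and__(self, other):         # called by a & b
        return Bag(v for v in self.items if v in other)

    def __or__(self, other):          # called by a | b
        return Bag(self.items + other.items)


a = Bag([1, 2, 3])
b = Bag([3, 4])
print(a & b)                    # Bag: [3]
print(a | b)                    # Bag: [1, 2, 3, 4]
print(len(a), 2 in a, 9 in a)   # 3 True False
```

Notice that each operator keeps its conventional meaning here, which is the point of the rules listed above.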

12.5 Exceptions

Although we’ve talked about writing preconditions, postconditions and class invariants to document contracts between different parts of programs, we haven’t yet covered what should happen when a contract is broken. While using Python, you will already have seen different sorts of exceptions, raised when Python complained that you had written some code that doesn’t make sense. For example, the following semantic error (i.e. an error in the meaning of a program) generates the Python exception ZeroDivisionError:

    >>> 1 / 0
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ZeroDivisionError: integer division or modulo by zero
    >>>

And you’ve also seen other errors related to names and types:

    >>> [1, 2, 3, 4] + 'abcde'
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: can only concatenate list (not "str") to list
    >>> print foobar
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    NameError: name 'foobar' is not defined
    >>>


You can, however, create your own exceptions, decide when to “raise” them and tell Python what to do if it encounters an exception. Remember that in the film database example we had the following code to get input from the user to choose between items in a menu:

    while user != '0':
        ...
        user = raw_input()
        if user == '1':
            # Add to database
            print 'Enter title:',
            title = raw_input()
            print 'Enter director:',
            director = raw_input()
            print 'Enter year:',
            year = raw_input()
            db.addFilm(title, director, year)
        elif user == '2':
            # Display database
            print db.__repr__()
            print
        elif user == '0':
            break
        else:
            print 'Unknown option'

Here, we haven’t bothered to check whether or not the user has entered a number at all; we just say “Unknown option” if the user input doesn’t match ’0’, ’1’ or ’2’. This isn’t very helpful to the user; it would be better to give clearer feedback. Instead, we could try to convert the user input to an int, then handle any errors that the conversion raises. We can test this out in the interpreter to see exactly what exception Python might raise if we try to convert a non-digit character to an int:

    >>> int('a')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: invalid literal for int(): a
    >>>

So Python raises the exception ValueError. Here’s some code to do this:

    while user != 0:
        ...
        while True:
            try:
                user = int(raw_input())
                break
            except ValueError:
                print 'That was not a number!'
                print 'Please enter a number between 0 and 2'
        if user == 1:
            # Add to database
            print 'Enter title:',
            title = raw_input()
            print 'Enter director:',
            director = raw_input()
            print 'Enter year:',
            year = raw_input()
            db.addFilm(title, director, year)
        elif user == 2:
            # Display database
            print db.__repr__()
            print
        elif user == 0:
            break
        else:
            print 'Please enter a number between 0 and 2'

Here, we’ve used some new syntax:

    try:
        <statements>
    except <name>:
        <statements>
    else:
        <statements>

When Python executes code like this, firstly the try block is executed (all the code between the words try and except). If no exception is raised, then Python looks for an else clause. If one is present, then the statements in that else clause are executed. If there is no else clause (as in the film database example above) then Python just skips to the next block of code.

If an exception has been raised in the try block, then Python looks for a matching except block. For example, in the film database code, we had an except block which caught errors of type ValueError: except ValueError: .... So, if an exception is raised in a try block and Python finds a matching except block, then the statements in that except block are executed. If no except block matches, the exception is passed on to whatever code surrounds the try statement.
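The control flow just described can be sketched as a small function. This is an illustration only, written for Python 3 rather than the Python 2 used in these notes, and the name parse_choice is our own invention rather than part of the film database code:

```python
def parse_choice(text):
    """Return text converted to an int, or None if it is not a whole number."""
    try:
        value = int(text)              # may raise ValueError
    except ValueError:
        # Runs only if the try block raised a ValueError.
        print('Not a number:', repr(text))
        return None
    else:
        # Runs only if the try block raised no exception at all.
        print('You entered the number:', value)
        return value


parse_choice('abcd')   # takes the except branch
parse_choice('1000')   # takes the else branch
```

Keeping the try block down to the single statement that might fail makes it clear which error the except clause is meant to catch.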


Here’s a short example of the whole syntax in action:

    >>> while True:
    ...     try:
    ...         x = int(raw_input('Please enter a number: '))
    ...     except ValueError:
    ...         print 'Fool! That was not a number!'
    ...     else:
    ...         print 'You entered the number:', x
    ...         break
    ...
    Please enter a number: abcd
    Fool! That was not a number!
    Please enter a number: @
    Fool! That was not a number!
    Please enter a number: &&*;;;
    Fool! That was not a number!
    Please enter a number: 1000
    You entered the number: 1000
    >>>

There may be times when you want to handle several different errors in the same way, in which case you can put them all in the same except clause, like this:

    except (RuntimeError, TypeError, NameError):
        pass

Also, you might just want to catch any old error that might be thrown, in which case you can just omit the error name from the except clause (this example assumes that the sys module has been imported):

    except:
        print 'Unexpected error!'
        sys.exit(1)

Lastly, you can raise exceptions yourself, any time you like, including exceptions which you’ve defined for yourself. The syntax:

    raise <name>

raises an exception. To invent your own exceptions, you just need to subclass the built in Exception class. For example:

    class NotASet(Exception):
        def __init__(self, value):
            self.value = value
            return
        def __repr__(self):
            return self.value.__repr__()

    class Set:
        ...
        def union(self, other):
            try:
                if not isinstance(other, Set):
                    raise NotASet(other)
            except NotASet:
                print 'Cannot find the union of a Set and a non-Set.'
            ...
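Here is a self-contained variation on the same idea, written for Python 3 rather than the Python 2 used in these notes. The Set class below is a cut-down stand-in with a constructor of our own devising, not the class from the sets module. Unlike the listing above, this sketch lets the exception travel out of union so that the calling code decides how to handle it; that is one possible design, not the only one:

```python
class NotASet(Exception):
    """Raised when a Set operation is given something that is not a Set."""


class Set:
    """A cut-down stand-in for the notes' Set class (assumed, for illustration)."""

    def __init__(self, values=()):
        self.set = list(values)

    def union(self, other):
        if not isinstance(other, Set):
            raise NotASet(repr(other))      # signal the broken contract
        res = Set(self.set)
        for i in other.set:
            if i not in res.set:
                res.set.append(i)
        return res


try:
    Set([1, 2]).union([2, 3])               # a plain list, not a Set
except NotASet as err:
    print('Cannot find the union of a Set and a non-Set:', err)
```

Raising in the method and catching in the caller means that union never returns a half-finished result when its precondition is broken.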

12.6 Unit testing with PyUnit

In the early days of software engineering, programmers used to define requirements for programs, write code to implement those requirements, then test their code. Or that was, at least, what software engineering students were told to do. In modern software development many engineers prefer to design and implement their tests first, before they’ve written the program which needs testing. The idea of test-first programming is that writing test cases will help you clarify in your own mind how your program should work. In this sense, unit testing is similar to writing preconditions, postconditions and class invariants. Both of these activities help you to decide how to break your program into smaller pieces (modules, classes, functions, etc.) and to be clear about what those smaller pieces of code should do. Equally, both activities help to document what your program ought to do. This should be helpful to you, when you come back and look at your code some time after you’ve written it, and to anyone who has to maintain your code, or just make use of it in a larger program.

Proponents of modern software development strategies (such as “agile programming” and “extreme programming”) say that testing is as much about developing code as it is about catching bugs and getting your code ready to ship:

    The most common misconception about unit testing frameworks is that they are only testing tools. They are development tools same as your editor and compiler. Don’t keep this powerful development tool in reserve until the last month of the project, use it through out.
    http://www.extremeprogramming.org/rules/unittestframework.html

In object oriented programming a common framework for unit testing has emerged, called xUnit. This is a set of helpful objects and methods which you can use to structure your testing. xUnit is based on Kent Beck’s unit testing framework for the Smalltalk language. Now, most object oriented languages have a similar library. Python has PyUnit, Java has JUnit, .NET languages have NUnit, and frameworks exist for ActionScript, VB, Perl and pretty much any other modern language you can name. There are other unit testing frameworks for Python apart from PyUnit, but PyUnit comes as standard with the Python distribution. Also, it has gained some popularity in industry:


    PyUnit is cool. It’s helping us find lots of bugs.
    – Jim Fulton, Digital Creations

    I am really impressed with the effect of unit testing on the quality of my code since I started using PyUnit about two months ago. I know that the extra effort has saved me literally days of looking for subtle bugs. Writing the unit tests first is also a very good inoculation against over-engineering.
    – Terrel Shumway

12.6.1 A test harness for the sets class

As an example of how to use PyUnit, we’ll create a test suite for the sets example from Section 12.4.1. To begin using the PyUnit framework we just need to import the unittest module and any code we want to test:

    import unittest
    from sets import *

Next, any test cases we write will be subclasses of the TestCase class that the unittest module provides:

    class SetTestCase(unittest.TestCase):

Now that we’ve subclassed unittest.TestCase we can use any of the data and methods provided by it. To get a list of these we can open the Python interpreter, import the unittest module and use the help() function to get some detailed documentation about the class:

    >>> import unittest
    >>> help(unittest.TestCase)
    Help on class TestCase in module unittest:

    class TestCase(__builtin__.object)
     |  A class whose instances are single test cases.
     |
    ...

For our purposes, we won’t be using many of the methods in TestCase, but you might want to be much more adventurous when you come to use unittest in the exercises. Most importantly, you need to get used to using professional documentation, like the information which comes with PyUnit.

When PyUnit runs the tests in any subclass of unittest.TestCase you have written, it calls the method setUp (if you have provided one) before each test method. This can be used to create data in your TestCase object, or to open files (or sockets, or databases . . . ) and so on. There is also an analogous method called tearDown, which is run after each test method, and you can use that to close files, databases, sockets and so on.


To test our Set class, we just need to create some sets which we can use to test our various set methods:

    def setUp(self):
        # set1 contains 1, 2, 3, 4, 5
        self.set1 = Set()
        self.set1.addmem(5)
        self.set1.addmem(4)
        self.set1.addmem(3)
        self.set1.addmem(2)
        self.set1.addmem(1)
        # set2 contains 4, 5, 6, 7, 8
        self.set2 = Set()
        self.set2.addmem(8)
        self.set2.addmem(7)
        self.set2.addmem(6)
        self.set2.addmem(5)
        self.set2.addmem(4)
        return

Since we aren’t using any files or other resources on disk or over the network we won’t need to write a tearDown method. So, all we have to do now is to write some methods to test the various facilities that the sets.Set class provides. In PyUnit, we can do this by writing methods in our subclass of unittest.TestCase whose names begin with the characters test. PyUnit will “know” (using something called reflection) that method names beginning with test denote some testing to be done. For example, to test the ismem method we probably want to test:

• That ismem can correctly report that an element is in a set; and
• that ismem can correctly report that an element is not in a set.

Here’s some code to implement this testing:

    def testIsmem(self):
        self.failUnless(self.set1.ismem(1))
        self.failUnless(not self.set1.ismem(100))
        return

Notice that we’re making extensive use of the method failUnless which is provided by the unittest.TestCase class. This method takes a boolean argument and will cause the test case to fail (by raising an exception) if its argument does not evaluate to True.

Next, we can start testing some of the methods which combine sets in different ways. Testing for the intersect method probably needs to check that:


• The intersection of set1 and set2 contains the elements we would expect (i.e. 4 and 5); and
• the intersection doesn’t contain any other elements (i.e. it only contains 2 elements).

Here’s the code to do this:

    def testIntersect(self):
        inter = self.set1.intersect(self.set2)
        self.failUnless(inter.ismem(4))
        self.failUnless(inter.ismem(5))
        self.failUnless(len(inter) == 2)
        return

We won’t discuss each of the remaining test cases in detail, as the other methods in the SetTestCase class are all very similar. The last thing we need to do to finish off our unit testing for the sets module is to turn our testsets module into an executable script. To do this, we use the usual Python idiom to check that the file is being run from the command line (i.e. if __name__ == ’__main__’: ...) and call the unittest.main() function to manage the testing and output for us:

    if __name__ == '__main__':
        unittest.main()

Here is the entire script:

Listing 12.9: A test harness for the sets class

    #!/bin/env python

    __author__ = 'Sarah Mount'
    __credits__ = 'Sarah Mount'
    __date__ = 'November 2005'

    import unittest
    from sets import *

    class SetTestCase(unittest.TestCase):

        def setUp(self):
            # set1 contains 1, 2, 3, 4, 5
            self.set1 = Set()
            self.set1.addmem(5)
            self.set1.addmem(4)
            self.set1.addmem(3)
            self.set1.addmem(2)
            self.set1.addmem(1)
            # set2 contains 4, 5, 6, 7, 8
            self.set2 = Set()
            self.set2.addmem(8)
            self.set2.addmem(7)
            self.set2.addmem(6)
            self.set2.addmem(5)
            self.set2.addmem(4)
            return

        def testIsempty(self):
            self.failUnless(Set.empty == [])
            self.failUnless(not self.set1.isempty())
            self.failUnless(not self.set2.isempty())
            return

        def testIsmem(self):
            self.failUnless(self.set1.ismem(1))
            self.failUnless(not self.set1.ismem(100))
            return

        def testIntersect(self):
            inter = self.set1.intersect(self.set2)
            self.failUnless(inter.ismem(4))
            self.failUnless(inter.ismem(5))
            self.failUnless(len(inter) == 2)
            return

        def testDifference(self):
            diff = self.set1.difference(self.set2)
            self.failUnless(diff.ismem(1))
            self.failUnless(diff.ismem(2))
            self.failUnless(diff.ismem(3))
            self.failUnless(len(diff) == 3)
            return

        def testSymmetricdifference(self):
            diff = self.set1.symmetric_difference(self.set2)
            self.failUnless(diff.ismem(1))
            self.failUnless(diff.ismem(2))
            self.failUnless(diff.ismem(3))
            self.failUnless(diff.ismem(6))
            self.failUnless(diff.ismem(7))
            self.failUnless(diff.ismem(8))
            self.failUnless(len(diff) == 6)
            return

        def testUnion(self):
            union = self.set1.union(self.set2)
            self.failUnless(union.ismem(1))
            self.failUnless(union.ismem(2))
            self.failUnless(union.ismem(3))
            self.failUnless(union.ismem(4))
            self.failUnless(union.ismem(5))
            self.failUnless(union.ismem(6))
            self.failUnless(union.ismem(7))
            self.failUnless(union.ismem(8))
            self.failUnless(len(union) == 8)
            return

    if __name__ == '__main__':
        unittest.main()

Running the test cases

Now that we’ve implemented all our test cases, we can run the testsets.py script from the command line:

    $ ./testsets.py
    ......
    ---------------------------------------------------
    Ran 6 tests in 0.001s

    OK
    $

and as you can see, the script has reported back that six test cases were run in 0.001 seconds and no errors were found. What would the output look like if we did have a bug in the code? Here’s an example:

    $ ./testsets.py
    ..F...
    ===================================================
    FAIL: testIsempty (__main__.SetTestCase)
    ---------------------------------------------------
    Traceback (most recent call last):
      File "./testsets.py", line 29, in testIsempty
        self.failUnless(not Set.empty == [])
    AssertionError

    ---------------------------------------------------
    Ran 6 tests in 0.002s

    FAILED (failures=1)
    $

Here, PyUnit is reporting back that it ran six tests and that one of them failed. It also tells us that the failure was found in the testIsempty test case and gives us some details about the exact exception that was raised:

    Traceback (most recent call last):
      File "./testsets.py", line 29, in testIsempty
        self.failUnless(not Set.empty == [])
    AssertionError

From this we can see that the test that failed was the assertion to test whether Set.empty is []. So, PyUnit has given us pretty much all the information we need to fix this bug. We know where in the code it is (the Set.empty field) and which test triggered it (the assertion on Set.empty == [] in testIsempty). Now we can go back to the sets module and debug it.

12.7 Further reading

• Wikipedia on object oriented programming: http://en.wikipedia.org/wiki/Object-oriented_programming
• A list of object oriented jargon: http://en.wikipedia.org/wiki/List_of_object-oriented_programming_terms
• Michele Simionato’s essay on method resolution order: http://www.python.org/2.3/mro.html
• Special method names (like __and__): http://docs.python.org/ref/specialnames.html
• Criticisms of object oriented programming:
    – Richard P. Gabriel’s essay “Objects have failed”: http://dreamsongs.com/ObjectsHaveFailedNarrative.html
    – Richard Mansfield’s article “OOP is much better in theory than in practice”: http://www.devx.com/opinion/Article/26776
    – Wiki discussion on “Arguments against OOP”: http://www.devx.com/opinion/Article/26776
    – Object oriented programming is oversold! http://www.geocities.com/tablizer/oopbad.htm
• The Python Tutorial on errors and exceptions: http://docs.python.org/tut/node10.html
• A list of Python’s built in exceptions and errors: http://docs.python.org/lib/module-exceptions.html
• PyUnit homepage: http://pyunit.sourceforge.net
• Mark Pilgrim’s Chapter on unit testing in his Dive Into Python book: http://diveintopython.org/unit_testing/index.html
• Python’s unittest module: http://docs.python.org/lib/module-unittest.html
• Kent Beck’s original paper on unit testing: http://www.xprogramming.com/testfram.htm
• Unit testing frameworks:
    – Unit testing frameworks: http://www.extremeprogramming.org/rules/unittestframework.html
    – Unit testing for Actionscript: http://asunit.sourceforge.net/
    – Unit testing for Actionscript 2: http://www.as2unit.org/
    – PyUnit project page: http://pyunit.sourceforge.net/
    – Unit testing for Java: http://www.junit.org/index.htm
    – Unit testing for C++: http://opensourcetesting.org/unit_c.php
    – Unit testing for .NET languages: http://www.nunit.org/
    – Unit testing for Ruby: http://www.ruby-doc.org/stdlib/libdoc/test/unit/rdoc/


12.8 Glossary

__and__
    a method to overload the & operator.

__contains__
    a method to overload the in operator.

__init__
    the constructor method which is executed automatically when a new object is created.

__len__
    a method to overload the len function.

__or__
    a method to overload the | operator.

__repr__
    a method to tell Python how to represent objects as strings (used, for example, when an object is printed or passed as an argument to the str function).

__xor__
    a method to overload the ^ operator.

ADT
    see abstract data type.

abstract data type
    a type whose internal data structures are hidden behind a set of access functions or methods. Instances of the type may only be created and inspected by calls to methods. This allows the implementation of the type to be changed without changing any code which instantiates that type.

attribute
    data inside an object.

class
    a new type defined with the keyword class, which has the following syntax:

        class <name>(<superclass-name>):
            <statements>

class invariant
    a boolean expression which should remain true throughout the lifetime of an object. Usually documented with a class to help programmers understand how a new type works.

constructor
    see __init__.

design by contract
    using preconditions, postconditions and class invariants to describe how a new type should behave.

inheritance
    reusing code in one class (the superclass) by extending it. For example:

        >>> class Super:
        ...     def foobar(self):
        ...         print "foobar!"
        ...
        >>> class Sub(Super):
        ...     def barfoo(self):
        ...         print "barfoo!"
        ...
        >>> s = Sub()
        >>> s.barfoo()
        barfoo!
        >>> s.foobar()
        foobar!
        >>>

instance
    a value of a particular type. For example, 1 is an instance of int and Die() is an instance of class Die.

instantiate
    to create a new object which is an instance of some class. In the example below we instantiate the class Foobar:

        >>> class Foobar:
        ...     def __repr__(self):
        ...         return "foobar"
        ...
        >>> obj = Foobar()

isinstance
    a function to determine whether a value is an instance of a particular type. For example:

        >>> isinstance(1, int)
        True
        >>> isinstance(1.0, int)
        False
        >>> isinstance(Set(), Set)
        True
        >>>

method
    a method is like a function which is inside the name space of an object. Placing code inside objects like this means that all the code needed to handle a particular sort of data can be placed in the class where that data is defined. This means that related code is kept together, which makes programs easier to read and maintain. To call a method, the syntax <object>.<method>() is used. Here’s an example:

        >>> class Foo:
        ...     def my_method(self):
        ...         print "Foobar!"
        ...
        >>> f = Foo()
        >>> f.my_method()
        Foobar!
        >>>

    Note that one difference between a method and a function is that methods always take the special name self as an argument.

multiple inheritance
    where a subclass has more than one superclass. In Python multiple superclasses can be listed as comma separated names in class definitions, like this:

        class <name>(<superclass1>, <superclass2>, ...):
            <statements>

    One criticism of multiple inheritance is to do with name resolution. If two superclasses implement methods with the same name then which method is executed if that name is called in the subclass? For example:

        class Super1:
            def foobar(self):
                ...

        class Super2:
            def foobar(self):
                ...

        class Sub(Super1, Super2):
            def barfoo(self):
                ...
                self.foobar()
                ...  # which foobar?

    In Python the search for the names in superclasses happens left to right (i.e. in the order that the superclasses are listed in the subclass definition). So, in the example above Python will look for foobar first in Super1 and then in Super2.

None
    a special object which is generally used to signify the absence of a value. For example, if a function or method does not explicitly return a value, Python will make sure it returns None.

object
    an instance of a class. In the following example, obj is an instance of Foobar:

        >>> class Foobar:
        ...     def __repr__(self):
        ...         return "foobar"
        ...
        >>> obj = Foobar()
        >>> print obj
        foobar
        >>>

object oriented programming
    programming with objects, classes and inheritance.


OOP
    see object oriented programming.

overload
    define how an existing function or operator should behave when applied to a new type. In Python this is done by providing some special methods in a new class, for example:

        >>> class Foobar:
        ...     def __init__(self):
        ...         self.foo = "foobar"
        ...     def __add__(self, other):
        ...         return self.foo + other.foo
        ...
        >>> f = Foobar()
        >>> g = Foobar()
        >>> f + g
        'foobarfoobar'
        >>>

override
    to re-implement a method which has already been defined in a superclass. In the following example,

polymorphism
    being able to interact with values in the same way because they are all instances of classes which share a superclass. See Section 12.3.

PyUnit
    Python’s unit testing framework. The module which implements the framework is called unittest. Documentation for PyUnit can be found here: http://docs.python.org/lib/module-unittest.html

self
    the name self in an object refers to the object itself. When writing classes, self can be used to access data and methods. For example:

        >>> class Num:
        ...     def __repr__(self):
        ...         return str(self.num)
        ...     def __init__(self, num):
        ...         self.num = num
        ...
        >>> n1 = Num(1)
        >>> n2 = Num(2)
        >>> print n1
        1
        >>> print n2
        2
        >>>

single inheritance
    where every class has exactly one superclass.


subclass
    a subclass inherits all the data and methods of its superclass. It may override some of those (by declaring new data and methods with the same names or signatures) and it may add new data and methods.

superclass
    a class from which subclasses are derived.

unit testing
    a method of testing where each part of a program (often each class in a program) is tested separately. Usually, testing is implemented before the program is written and test cases are run every time the program is modified. In Python, unit testing can be implemented with the unittest module.
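The glossary entry on multiple inheritance asks which foobar is called when two superclasses both define it. A quick way to check is to ask Python directly. The sketch below is written for Python 3 rather than the Python 2 used in these notes; the method bodies are our own, added so that each class announces itself, and __mro__ is the attribute in which Python records the order it searches:

```python
class Super1:
    def foobar(self):
        return 'Super1.foobar'


class Super2:
    def foobar(self):
        return 'Super2.foobar'


class Sub(Super1, Super2):
    def barfoo(self):
        return self.foobar()      # which foobar?


print(Sub().barfoo())                       # Super1.foobar
print([c.__name__ for c in Sub.__mro__])    # ['Sub', 'Super1', 'Super2', 'object']
```

Swapping the order of the superclasses in the definition of Sub would make Super2.foobar win instead.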

12.9 Homework exercises

1. Write a class called BankAccount which stores the current balance, interest rate and account number of a bank account. Your class should provide methods to withdraw, deposit and add interest to the account. The user should not be allowed to withdraw money if they are overdrawn. Make sure you test your class using the unittest module.

2. Create two subclasses of your BankAccount class:

    CreditAccount    where the user is charged a certain amount for every withdrawal that is made. If the user is overdrawn, the withdrawal charge is doubled.
    StudentAccount   where new accounts start off with a balance of £500 and an overdraft of up to £3000 is allowed, with no charges for withdrawal.

   Make sure you test your classes with Python’s unittest module.

12.9.1 Key Assignment

Implement and test (using PyUnit) a class to represent stacks. A stack is an abstract data type in which the last value placed on the stack is always the first value to be removed (Last In First Out). Your Stack class should have the following methods:

    push   place a value on the “top” of the stack.
    pop    remove a value from the “top” of the stack and return the value.
    peek   return the value at the “top” of the stack (but leave it on the stack).

Use exceptions where necessary. Wikipedia has a longer description of stacks here: http://en.wikipedia.org/wiki/Stack_(data_structure)

Chapter 13 Python extensions

One of the nice things about the Python programming language is that so many very powerful libraries have been written for it. In this chapter, we’ll be looking at three of these libraries, which we hope will give you a good idea of the scope of Python’s capabilities and encourage you to go out and find more libraries to play with. The libraries we have chosen to use here are PyGoogle (an interface to the popular Google search engine), PIL (an image processing library) and Pygame (for creating games with Python). There should be enough examples here to show you what each library can do for you and give you some idea of how professional programmers might use them in practice. However, we expect you to read this chapter in conjunction with the documentation for each library, which you can find in the Further Reading Section on Page 248. This will give you a better idea of how each library is put together and how you might learn more about the features of PyGoogle, PIL and Pygame that we don’t have time to cover in this Chapter.

Learning outcomes

At the end of this Chapter, you will be able to:

• Define the term API.
• Use the Python google module to perform web searches and check spelling.
• Explain the RGB colour model.
• Define the term lookup table.
• Use the Python Imaging Library to:
    – perform colour transformations on images;
    – apply filters to images; and
    – perform pixel-by-pixel transformations on images.
• Describe the term blitting.
• Describe and use dirty rect animation.
• Use the Pygame module to write simple animations and games.

13.1 PyGoogle

The popular search engine, Google, provides APIs (Application Programming Interface, or sometimes Application Program Interface) to its services. An API is a set of functions or methods in one program, which any other program can call. The Google APIs are not written in Python, but the google module provides access to them. To use the google module, you also need a copy of the Google API and a license key from Google, both of which you can obtain from http://www.google.com/apis/. The google module will look in several places (in this order) for the license key:

• The license_key argument of each function;
• the module-level LICENSE_KEY variable (call setLicense() once to set it);
• an environment variable called GOOGLE_LICENSE_KEY;
• a file called .googlekey in the current directory;
• a file called googlekey.txt in the current directory;
• a file called .googlekey in your home directory;
• a file called googlekey.txt in your home directory;
• a file called .googlekey in the same directory as google.py; or
• a file called googlekey.txt in the same directory as google.py.
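To make the idea of “looking in several places, in order” concrete, here is a rough sketch of how such a lookup could be written. This is not the google module’s own code: the function find_license_key and its module_dir parameter are hypothetical, the sketch is written for Python 3 rather than the Python 2 used in these notes, and it leaves out the module-level LICENSE_KEY step for brevity:

```python
import os


def find_license_key(license_key=None, module_dir=None):
    """Return the first license key found, or None (hypothetical helper)."""
    if license_key:                                   # 1. explicit argument
        return license_key
    env_key = os.environ.get('GOOGLE_LICENSE_KEY')    # 2. environment variable
    if env_key:
        return env_key
    # 3. key files: current directory, home directory, then module directory
    dirs = [os.getcwd(), os.path.expanduser('~')]
    if module_dir:
        dirs.append(module_dir)
    for d in dirs:
        for name in ('.googlekey', 'googlekey.txt'):
            path = os.path.join(d, name)
            if os.path.isfile(path):
                with open(path) as f:
                    return f.read().strip()
    return None                                       # nothing found anywhere


print(find_license_key(license_key='abc123'))
```

Searching from the most specific source (an argument) to the most general (a file next to the module) lets you override a shared default without editing any files.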

The main function in the google module that you will want to use is doGoogleSearch(), which instructs Google to perform a web search. doGoogleSearch() takes a string argument, which is the query to be passed to Google, and a number of other named arguments. Most of the time, you won’t want to use these, but it’s important to know what they are so you can see what the module can do for you. Here is the full list:

    q=''                Search string.
    start=0             Index of the first result that doGoogleSearch should return (starting at zero).
    maxResults=10       The maximum number of results that doGoogleSearch should return.
    filter=1            Filter out similar results.
    restrict=''         Restrict results by country or topic.
    safeSearch=0        Turn safe searching on.
    language=''         Return only results in a particular language (defaults to English).
    inputencoding=''    Set input encoding.
    outputencoding=''   Set output encoding.
    license_key=None    Use a given license key.
    http_proxy=None     Use a given HTTP proxy.

doGoogleSearch() returns the results from Google as a single object of type SearchResultValue. A SearchResultValue object contains just two instance variables: meta and results. meta contains meta data for the results of the query and results is a list containing the results themselves. Each element in the results list is an instance of SearchResult, which contains the following instance variables:

    URL                         URL of the result.
    title                       Title of the result (in HTML).
    snippet                     HTML showing query context.
    cachedsize                  Size of result in Google cache (in Kb).
    relatedInformationPresent   Is there related information?
    hostName                    Used when filtering occurs.
    directoryCategory           Open directory category for this result.
    directoryTitle              Open directory title for this result.
    summary                     Open directory summary for this result.

Some of these entries might seem strange, but if you look closely at a Google web search, you’ll see that almost all of them are visible. The google module might seem a bit complicated at first, but at its simplest, you can just call doGoogleSearch() and iterate over the list of results:


    $ python
    Python 2.4 (#1, Nov 30 2004, 11:25:14)
    [GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2
    Type "help", "copyright", "credits" or "license" for ...
    >>> import google
    >>> hits = google.doGoogleSearch('query')
    >>> for r in hits.results:
    ...     print r.URL

One last thing before we look at an example program using google. On all of my machines I get the following warning message when importing google:

    $ python
    Python 2.4 (#1, Nov 30 2004, 11:25:14)
    [GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2
    Type "help", "copyright", "credits" or "license" for ...
    >>> import google
    /usr/lib/python2.4/site-packages/pygoogle/google.py:58: DeprecationWarning: SOAPpy not imported. Trying legacy SOAP.py
      import GoogleSOAPFacade
    >>>

This is saying that the google module has tried to import a module called SOAPpy, which it couldn’t find, so instead it has imported another module called SOAP.py. The DeprecationWarning means that SOAP.py is now out of date and shouldn’t be used. None of this affects the work we are doing here with google. We won’t be using SOAP.py directly and all the functions in google will work. So, don’t worry if you get this message too: it is a warning, not an error.
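To see how a message like this comes about, here is a minimal sketch of the “try the new module, fall back to the old one” pattern using the standard warnings module. It is an illustration under our own assumptions, not the code in google.py: the helper import_with_fallback and the module name no_such_soap_module are made up, and json simply stands in for the legacy module.

```python
import warnings


def import_with_fallback(preferred, legacy):
    """Import `preferred`; if that fails, warn and import `legacy`."""
    try:
        return __import__(preferred)
    except ImportError:
        warnings.warn('%s not imported. Trying legacy %s'
                      % (preferred, legacy), DeprecationWarning)
        return __import__(legacy)


# 'no_such_soap_module' is a made-up name that should not exist;
# 'json' stands in for the legacy module.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    module = import_with_fallback('no_such_soap_module', 'json')

print(module.__name__)
print(caught[0].category.__name__)
```

The import still succeeds, which is why the program carries on working after the warning has been printed.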

13.1.1 Googlewhacking

A Googlewhack is a Google query of two words which has only one Google hit. You might have seen Dave Gorman’s stage show, DVD or book on the subject, but if not you can find out more from Dave’s website about his Googlewhack adventure: http://www.davegorman.com/googlewhack.htm. In this example, we’ll write a very simple Googlewhack game. The player will supply two words and the program will find out if they are a Googlewhack. If not, we can either give a list of results or (if there are no search results at all) perform a spelling check. So, the player will use the script like this:

    $ ./googlewhack.py pbthon disdotdat
    Did you mean: python?

In our program, we first need to import the google module, then perform a Google query with the command line arguments to the script (like pbthon and disdotdat in the example above) and then see how many search results Google has given us. Remember that the sys module gives you access to command line arguments in a list called sys.argv. However, sys.argv[0] is the name of the program we’re running (googlewhack.py), so the two command line arguments we want will be held in sys.argv[1] and sys.argv[2].

    import google, sys

    __author__ = 'Sarah Mount'

    print 'GoogleWhacking...\n'
    query = sys.argv[1] + ' ' + sys.argv[2]
    res = google.doGoogleSearch(query)
    hits = len(res.results)

Next, we need to see how many search results Google gave us, and act accordingly. So, if we got no hits at all, we can do some spell checking using the google.doSpellingSuggestion() function. Remember that if Google has no spelling suggestions for us, google.doSpellingSuggestion() will return None:

    if hits < 1:
        # No hits, so check spelling
        s1 = google.doSpellingSuggestion(sys.argv[1])
        s2 = google.doSpellingSuggestion(sys.argv[2])
        for s in s1, s2:
            if s is not None:
                print 'Did you mean: %s?' % s

If we have exactly one hit, then the player has found a Googlewhack:

    elif hits == 1:
        print 'Congratulations! "%s" is a GoogleWhack :)\n' % query
        print res.results[0].title, '\n\t', res.results[0].URL

Lastly, if we have more than one hit, we can just give the user the set of results that Google has given us. Remember from the description of the google module above that the maximum number of search results returned by default is ten, so we don’t have to worry too much about our program output scrolling on for ages!

    else:
        print '%i results for "%s" :(' % (hits, query)
        print '\nGoogle Results for "%s":\n' % query
        for r in res.results:
            print r.title, '\n\t', r.URL
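The three-way decision above can be tried out without a license key or a network connection. The sketch below is our own stand-in, not part of the google module: googlewhack_report is a hypothetical helper that takes the results as a plain list of (title, URL) pairs and returns the lines the game would print.

```python
def googlewhack_report(query, results, suggestions=()):
    """Return the lines the game would print for a list of (title, URL) results."""
    hits = len(results)
    lines = []
    if hits < 1:
        # No hits: offer any spelling suggestions that are not None.
        for suggestion in suggestions:
            if suggestion is not None:
                lines.append('Did you mean: %s?' % suggestion)
    elif hits == 1:
        # Exactly one hit: the player has found a Googlewhack.
        title, url = results[0]
        lines.append('Congratulations! "%s" is a GoogleWhack :)' % query)
        lines.append('%s\n\t%s' % (title, url))
    else:
        # More than one hit: just list what came back.
        lines.append('%i results for "%s" :(' % (hits, query))
        for title, url in results:
            lines.append('%s\n\t%s' % (title, url))
    return lines


print(googlewhack_report('pbthon disdotdat', [], ['python', None]))
```

Separating the decision from the search call like this also makes the logic easy to test with made-up results.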

13.2 PIL

In this section, we’ll see how the Python Imaging Library (PIL) can be used for image processing. In the listings/images/ directory you can find some images which you can use to try out the programs here, which you can find in listings/py_extensions/. Before we dive into PIL, we need to look at two things:

1. The viewport.py script written for this Chapter. This can be used to display images in a window. PIL does provide a method to do this (Image.show()) but that is only intended to be used for debugging and isn’t portable to all platforms.
2. The RGB colour model which PIL uses to represent colour in images. This is a popular colour model and we’ll be using it extensively in the examples which follow.

13.2.1 Viewport

In the listings for this chapter there is a script called viewport, which we will be using throughout this section. This is a simple program using the Python Tkinter module, which is designed for building graphical user interfaces with the multi-platform Tk toolkit. viewport is designed to display images on the screen, and can be called from the command line, from the Python interpreter or from a Python program.

Listing 13.1: Viewport

    #!/bin/env python2.4
    """
    Displays a PIL image in a Tk window.
    """
    import Image, ImageTk
    import Tkinter

    __author__ = 'Sarah Mount'

    def display_image(pil_image, title):
        """Take a PIL image and display it in a GUI."""
        root = Tkinter.Tk()
        root.title(title)
        im_width, im_height = pil_image.getbbox()[2:4]
        canvas = Tkinter.Canvas(root,
                                width=im_width,
                                height=im_height)
        canvas.pack(side=Tkinter.LEFT,
                    fill=Tkinter.BOTH,
                    expand=1)
        photo = ImageTk.PhotoImage(pil_image)
        item = canvas.create_image(0, 0,
                                   anchor=Tkinter.NW,
                                   image=photo)
        Tkinter.mainloop()

    if __name__ == '__main__':
        import Image, sys
        filename = sys.argv[1]
        image = Image.open(filename, 'r')
        display_image(image, 'Tk Viewport')

Listing 13.1 shows the contents of the viewport module. It has one function called display_image, which takes as arguments an image and a string. The image has to be created by the Python Imaging Library (PIL), which has a special module called ImageTk especially for creating images that can be used in Tk interfaces. The module can also be used from the command line, like this:

    $ ./viewport mypicture.png

The listings directory has a folder called images which contains photographs for you to use with the viewport program and the other scripts we will be writing in this Section. viewport looks for a command line argument which should be an image, uses the Python Imaging Library to open the image and calls display_image. We aren’t covering Tk in any detail here, but we will discuss this script in detail, so that you have some idea how simple GUIs can be created in Python.

In viewport the root variable holds the result of the Tkinter.Tk() function. This creates a window on which any “widgets” can be placed. A widget is an element of a user interface, such as a button, text entry field, scroll bar, menu item and so on. In this case, we only want to place a canvas on the root window, which we will use to display our image. First we use PIL to find out how big the image is, so we can size the viewport correctly:

    im_width, im_height = pil_image.getbbox()[2:4]

Note that we’re getting the image size by calling a method called getbbox(). A bounding box is a rectangle in which all the image data is held. So, if all but the centre of an image is blank, the bounding box will surround the image data in the centre, but not include the blank parts around the edge. In PIL, bounding boxes are represented by four-tuples containing the left, upper, right and lower pixel coordinates of the box. The first two elements give the top left hand corner of the bounding box and the last two give its bottom right hand corner. PIL always places the origin of an image at its top left hand corner, so for an image whose data starts at the origin, the last two elements are also its width and height, which is what the slice [2:4] picks out. (The size attribute of an image holds the width and height of the whole image directly.) In the display_image function, once we know the dimensions of the image, we need to create a canvas to put it on:

    canvas = Tkinter.Canvas(root,
                            width=im_width,
                            height=im_height)
    canvas.pack(side=Tkinter.LEFT,
                fill=Tkinter.BOTH,
                expand=1)

This is done in two stages. Firstly we create a canvas, giving its size and which window we want to put it on. Next we tell Tkinter to “pack” the canvas, meaning that it should place the canvas on the root window. We also pass in some arguments to say how we want the canvas to be positioned on the root window, whether we want it to expand to fill all the available space, and so on. Next, we need to create the image and place it on the canvas. Lastly, we call the Tkinter mainloop() function which places the GUI on the screen:

    photo = ImageTk.PhotoImage(pil_image)
    item = canvas.create_image(0, 0,
                               anchor=Tkinter.NW,
                               image=photo)
    Tkinter.mainloop()
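To make the idea of a bounding box concrete, here is a small pure-Python sketch that does not need PIL. It treats an image as a list of rows of numbers, with zero meaning “blank”, and returns a (left, upper, right, lower) tuple in the same spirit as getbbox(). The function name bounding_box is ours, and we are assuming that the right and lower values lie one past the last non-blank column and row.

```python
def bounding_box(rows):
    """Return (left, upper, right, lower) around the non-zero values in
    a list of equal-length rows, or None if every value is zero."""
    xs = [x for row in rows for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(rows) if any(row)]
    if not xs:
        return None
    # right and lower are one past the last non-zero column and row.
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 7, 7, 0],
    [0, 0, 7, 0, 0],
    [0, 0, 0, 0, 0],
]
left, upper, right, lower = bounding_box(image)
print((left, upper, right, lower))
print((right - left, lower - upper))  # width and height of the box
```

Here the blank border is excluded, so the box is smaller than the image itself; only when the data reaches the top left hand corner do the last two elements equal the width and height.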

13.2.2 The RGB colour model

Many of the examples we’re going to look at here are to do with colour. So, it’s important that before we start looking at PIL, we describe how colour is commonly represented. There are many different ways to do this, but the colour model we will use here is called RGB (Red, Green, Blue). This model is popular because it is very simple to understand and use, it is based on how the human eye responds to light and most computer monitors display colour using RGB. An RGB colour is a triplet giving the amount of red, green and blue light that is in the colour being represented. Typically (and in PIL) 8 bits are used for each of the three components, giving 24 bits for each colour. That means that each component ranges from 0 to 255 (that is, from 0 to 2^8 − 1). Below is a list of some common colours represented in the RGB colour model (you can also see these in Figure 13.1):

• (0, 0, 0) is BLACK.
• (255, 255, 255) is WHITE.
• (255, 0, 0) is RED.
• (0, 255, 0) is GREEN.
• (0, 0, 255) is BLUE.
• (255, 255, 0) is YELLOW.
• (0, 255, 255) is CYAN.
• (255, 0, 255) is MAGENTA.

Figure 13.1: RGB colour representation. Adding red to blue gives magenta, adding blue to green gives cyan, adding green to red gives yellow. Adding red, blue and green yields white.
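The colour arithmetic described above is easy to check in plain Python. In this sketch the helper names add_colours and pack_rgb are our own; the first adds two RGB triples component by component (capping each at 255), and the second shows why three 8-bit components make a 24-bit colour.

```python
BLACK = (0, 0, 0)
RED = (255, 0, 0)
GREEN = (0, 255, 0)
BLUE = (0, 0, 255)


def add_colours(first, second):
    """Add two RGB triples component by component, capping at 255."""
    return tuple(min(a + b, 255) for a, b in zip(first, second))


def pack_rgb(colour):
    """Pack an RGB triple into one 24-bit integer (8 bits per component)."""
    red, green, blue = colour
    return (red << 16) | (green << 8) | blue


print(add_colours(RED, BLUE))                      # magenta
print(add_colours(add_colours(RED, GREEN), BLUE))  # white
print(hex(pack_rgb((255, 255, 0))))                # yellow as one number
```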

13.2.3 Colour to greyscale

The first script we’ll write using PIL will just convert a colour image to greyscale. See Figure 13.2 to see the results. As with all our PIL scripts, you can call this one from the command line, like this (note that pwd is a BASH command which tells you which directory you are in):

    $ pwd
    .../listings/py_extensions
    $ ./greyscale.py ../images/long_beach.jpg

The PIL module you will probably use the most is called Image. This allows you to load images into Python and perform various operations on them. When an image is loaded, it is represented by a Python object, which contains various information such as the filename of the original image, its file format, “mode” (meaning colour model) and size (in pixels):

    $ python
    Python 2.4 (#1, Nov 30 2004, 11:25:14)
    [GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2
    Type "help", "copyright", "credits" or "license" for ...
    >>> import Image
    >>> img = Image.open('long_beach.jpg')
    >>> print img.filename, img.format, img.mode, img.size
    long_beach.jpg JPEG RGB (1000, 750)
    >>>

Figure 13.2: Colour to greyscale: original image and result

In the listing above, we saw that the long_beach.jpg image was imported in RGB mode. Other PIL modes are L (greyscale; “L” stands for “luminance”) and CMYK (Cyan, Magenta, Yellow, Key). To convert an image from one mode to another, you can just call its convert() method. This is all that our first script does: load an image, call convert() then pass the image to the viewport.display_image() function:

Listing 13.2: Colour to greyscale

    #!/bin/env python2.4
    """
    Converts a colour image into greyscale.
    """
    import Image, sys, viewport

    __author__ = 'Sarah Mount'

    filename = sys.argv[1]
    title = 'Colour -> Grayscale Using Python Image Library'
    image = Image.open(filename, 'r')
    image = image.convert('L')
    viewport.display_image(image, title)
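It can help to see what a conversion from RGB to greyscale involves for a single pixel. The PIL documentation gives a weighted sum, L = R × 299/1000 + G × 587/1000 + B × 114/1000, for converting RGB to mode L. The sketch below applies that formula in plain Python; the function name rgb_to_grey is ours, and the exact rounding inside PIL may differ from the integer division used here.

```python
def rgb_to_grey(pixel):
    """Weighted sum of an (R, G, B) triple, using integer arithmetic."""
    red, green, blue = pixel
    return (red * 299 + green * 587 + blue * 114) // 1000


for name, pixel in [('white', (255, 255, 255)),
                    ('red', (255, 0, 0)),
                    ('green', (0, 255, 0)),
                    ('blue', (0, 0, 255))]:
    print('%s -> %i' % (name, rgb_to_grey(pixel)))
```

Notice that green contributes far more to the grey level than blue does, which reflects how bright each primary colour appears to the eye.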

13.2.4 Colour to negative

Some images have very nice negatives, such as the one shown in Figure 13.3. The negative of an image has the amount of red, green and blue in each pixel inverted. For example, the negative of a red pixel (255, 0, 0) would be a cyan pixel (0, 255, 255). The negative of a black pixel (0, 0, 0) would be a white pixel (255, 255, 255), and so on. So, to create the negative of an image, we just have to subtract the current red, green and blue values of each pixel from 255.

Figure 13.3: Colour to negative: original image and result

In PIL we can do this very easily. Two things help us. Firstly, PIL stores the red, green and blue pixel values of an image as separate list elements (rather than storing a list of three-tuples). We’ll see in Section 13.2.5 that PIL will also let us split a single image into three images, one for each colour band. Secondly, each PIL image has a method called point() which allows us to apply a transformation to each pixel value in the image. point() can be used in two ways: you can either pass it a function or lambda expression to apply to each pixel value, or you can pass it a lookup table. Here, we’ll be using a lambda expression, but in Section 13.2.7 we will revisit point() and pass it a lookup table.

The listing for the negative.py script is given below in Listing 13.3. Almost all of the script should make sense to you, except that before we perform the image transformation (by calling point()) we have called a method called load(). This explicitly loads all pixel data into memory, so that we can apply point() to it. If we do not call load(), PIL will delay loading pixel data into memory and we get the following error message:

    Traceback (most recent call last):
      File "./negative.py", line 14, in ?
        negative = image.point(lambda pixel: 255 - pixel)
      File "/usr/lib/python2.4/site-packages/PIL/Image.py", line 1039, in point
        lut = map(lut, range(256)) * self.im.bands
    AttributeError: 'NoneType' object has no attribute 'bands'

Listing 13.3: Colour to negative

    #!/bin/env python2.4
    """
    Produces the negative of a positive colour image.
    """
    import Image, sys, viewport

    __author__ = 'Sarah Mount'

    title = 'Colour -> Negative Using Python Image Library'
    filename = sys.argv[1]
    image = Image.open(filename, 'r')
    image.load()
    negative = image.point(lambda pixel: 255 - pixel)
    viewport.display_image(negative, title)

Figure 13.4: Swapping colour bands: original image and result

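The traceback in Section 13.2.4 gives away how point() treats a function: it evaluates the function once for each value from 0 to 255 and repeats the resulting table for every band. The sketch below imitates that idea for the negative in plain Python; build_table and apply_table are our own names, and real PIL does this work internally on its own image storage.

```python
def build_table(function, bands):
    """Evaluate `function` once for every value 0..255, repeated per band."""
    return [function(value) for value in range(256)] * bands


def apply_table(pixels, table):
    """Map each (R, G, B) pixel through the 256-entry slice for its band."""
    return [tuple(table[band * 256 + value]
                  for band, value in enumerate(pixel))
            for pixel in pixels]


table = build_table(lambda value: 255 - value, 3)
pixels = [(255, 0, 0), (0, 0, 0), (10, 200, 30)]
print(apply_table(pixels, table))
```

The function is called only 256 times however many pixels the image has, which is the same saving that an explicit lookup table offers.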
13.2.5 Swapping colour bands

PIL allows us to split an image into its component colour bands using a method called split(), which returns a tuple of three single-band images, representing the red, green and blue pixel values for an image. We can also merge bands back together, using a function in the Image module called merge(). Putting these two together, we can swap the colour bands of an image around. Figure 13.4 shows the result of swapping the red and blue colour bands of a photograph.

Notice how the colour of the sky has changed and particularly the golden colour inside the sculpture. The hedge has also changed colour, even though we haven’t touched the green colour band. Why do you think that is? The listing for the swap.py script is below:

Listing 13.4: Swapping red and blue colour bands

    #!/bin/env python2.4
    """
    Swaps the red and blue colour bands of an image.
    """
    import Image, sys, viewport

    __author__ = 'Sarah Mount'

    title = 'Swapping Red and Blue bands Using Python Image Library'

    if __name__ == '__main__':
        filename = sys.argv[1]
        image = Image.open(filename, 'r')
        (r, g, b) = image.split()
        image = Image.merge('RGB', (b, g, r))
        viewport.display_image(image, title)
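Splitting and merging can be imitated on a plain list of pixel triples, which makes it easy to see exactly what the swap does to individual pixels. In this sketch split_bands and merge_bands are hypothetical helpers of our own; they work on Python lists, not on PIL images.

```python
def split_bands(pixels):
    """Turn a list of (R, G, B) pixels into three lists, one per band."""
    red, green, blue = zip(*pixels)
    return list(red), list(green), list(blue)


def merge_bands(first, second, third):
    """Zip three equal-length band lists back into a list of pixel triples."""
    return list(zip(first, second, third))


pixels = [(255, 0, 0), (200, 180, 20), (10, 120, 240)]
r, g, b = split_bands(pixels)
swapped = merge_bands(b, g, r)  # blue goes first, red goes last
print(swapped)
```

Swapping twice gets you back to the original list, which is a handy check that nothing has been lost along the way.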

13.2.6 Filters: edge enhancement and embossing

PIL also defines a number of filters which can be used to perform transformations on images. These are all kept in the ImageFilter module. To use them, you just need to create an instance of one of the filter classes in ImageFilter and pass it to the filter() method of an image object. Here, we have two scripts which are examples of using PIL’s built-in filters. The first enhances the edges in an image (see Figure 13.5):

Listing 13.5: Edge enhancement

    #!/bin/env python2.4
    """
    Enhance the edges of an image.
    """
    import Image, ImageFilter, sys, viewport

    __author__ = 'Sarah Mount'

    title = 'Edges Enhanced Using Python Image Library'

    if __name__ == '__main__':
        filename = sys.argv[1]
        image = Image.open(filename, 'r')
        image = image.filter(ImageFilter.EDGE_ENHANCE_MORE())
        viewport.display_image(image, title)

Figure 13.5: Edge enhancement: original image and result

and the second creates an embossed version of the image, where the edges look raised above the flat surface of the image:

Listing 13.6: Using the PIL emboss filter

    #!/bin/env python2.4
    """
    Emboss an image.
    """
    import Image, ImageFilter, sys, viewport

    __author__ = 'Sarah Mount'

    title = 'Image Embossed Using Python Image Library'

    if __name__ == '__main__':
        filename = sys.argv[1]
        image = Image.open(filename, 'r')
        image = image.filter(ImageFilter.EMBOSS())
        viewport.display_image(image, title)

Figure 13.6: Embossing: original image and result

Figure 13.7: Colour to sepia: original image and result
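Filters such as the ones in Section 13.2.6 are usually built on a small grid of weights, called a kernel, which is slid across the image so that each new pixel is a weighted sum of its neighbours. The sketch below shows the idea on a greyscale image held as a list of rows. Everything in it is our own illustration: convolve3x3 is a hypothetical helper, the kernel values are chosen to give an emboss-like effect, and we are not claiming these are the weights PIL uses.

```python
def convolve3x3(rows, kernel, scale=1, offset=0):
    """Apply a 3x3 kernel to the interior of a greyscale image (list of rows).
    Edge pixels are copied unchanged to keep the sketch short."""
    height, width = len(rows), len(rows[0])
    result = [list(row) for row in rows]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total = 0
            for ky in range(3):
                for kx in range(3):
                    total += kernel[ky][kx] * rows[y + ky - 1][x + kx - 1]
            value = total // scale + offset
            result[y][x] = max(0, min(255, value))  # clamp to 0..255
    return result


# An emboss-style kernel (illustrative values): each output pixel is the
# difference between a pixel and its upper-left neighbour, shifted to mid-grey.
EMBOSS_LIKE = [[-1, 0, 0],
               [0, 1, 0],
               [0, 0, 0]]

flat = [[100] * 4 for _ in range(4)]
step = [[0, 0, 200, 200] for _ in range(4)]
print(convolve3x3(flat, EMBOSS_LIKE, offset=128)[1][1])
print(convolve3x3(step, EMBOSS_LIKE, offset=128)[1])
```

A flat area comes out as mid-grey, while a sudden change in brightness comes out much lighter, which is what makes edges look raised.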

13.2.7 Pixel by pixel transformations: colour to sepia tone

Many digital cameras can create sepia tone photographs, which look soft and aged. In this section, we’ll see how to do this in post processing; Figure 13.7 shows an example.

Sepia tone

Sepia tone is a monochrome image, like greyscale, except the picture appears to be in shades of brown, rather than black. In traditional photography, sepia images were produced by adding a pigment made from the Sepia cuttlefish to the print. Sepia tone photographs tend to last longer than others because the pigment converts metallic silver to a sulphide which degrades slowly over time.

In Section 13.2.4 we saw how the point() method could be used to apply a pixel-by-pixel transform on an image. We also said that point() could be used with a lookup table, rather than a function. A lookup table (often called a LUT in image processing) is a pre-computed table of function results. The idea of using a LUT is that it should be faster to access a list element than to compute the value of a function. This is important in time-critical applications such as computer graphics and games.

The structure of the sepia.py script is slightly different to the other scripts in this Chapter. We will create a class called Sepia whose constructor pre-computes a lookup table (the generate_lut() step), then call the point() method to map the pixel data in our image to its new values. Remember that sepia toned images are monochrome, so, before we create the new image, we convert the original to greyscale (just like in Section 13.2.3). So, the last part of our Sepia class will look like this:

            self.blue.append(generate_value(2, i))
        self.lut = self.red + self.green + self.blue

    def sepia(self, image):
        """Convert an RGB image to sepia, using a lookup table."""
        image = image.convert('L')
        return image.point(lut=self.lut, mode='RGB')

Notice that the point() method takes named parameters, which are the LUT and the mode in which the processed image should be returned. In our case, the sepia toned image will need to be in colour, so we’ve used the RGB mode. Next we need to write the generate_lut() step, which will do the hard(ish!) work of creating our lookup table. The first part of this is simple. We will eventually need an array to hold our LUT, but in PIL lookup tables are created by concatenating tables for red, green and blue values. So, we can start off with empty arrays for each primary colour. Also, as we’ll be scaling colours from white to brown in our new monochrome image, we need to know which colour triples to use for the colours “white” and “brown”. Since client code might want to vary the values of “white” and “brown”, we’d better make these easy to change. Better still, we can help the authors of client code by giving them sensible default values, here as class attributes:

    class Sepia(object):
        brown = (150, 80, 30)
        white = (255, 255, 255)

Lastly, we need to fill the three arrays to create the LUT. To understand how the LUT should be created, it’s worth finding out how PIL will use it. When the point() method is executed, it will look at each pixel in the original image and use the LUT to work out what the colour values for that pixel should be in the new image. Remember that the LUT is constructed by concatenating individual LUTs for red, green and blue. So, if the first pixel in the original image has an RGB value of (29, 38, 127), that pixel will become (lut[29], lut[38+256], lut[127+512]). If we wanted to leave all the colours in an image unchanged, we could create and use a LUT like this:

    lut = range(256) + range(256) + range(256)
    image.point(lut=lut)

in which case, our LUT would be this array:

    [0, 1, 2, 3, ... 255, 0, 1, 2, 3, ... 255, 0, 1, 2, 3, ... 255]

Of course, for our sepia toned images we don’t want to leave all the colours unchanged. Unchanged, each set of colour values would range from 0 to 255. However, we want each one to range from the red, green or blue part of the brown tuple to the red, green or blue part of the white tuple (the brown and white attributes of the Sepia class). The equations below describe how to do this. If it isn’t clear how they work, fire up an interactive Python session and get Python to show you what these arrays will look like.

    brown = (150, 80, 30)
    white = (255, 255, 255)
    red[i]   = brown[0] + i × (white[0] − brown[0]) / 255
    green[i] = brown[1] + i × (white[1] − brown[1]) / 255
    blue[i]  = brown[2] + i × (white[2] − brown[2]) / 255

These equations are quite simple to translate into Python. Here, we’ve used a helper function (generate_value()) to simplify the code a bit.

        def __init__(self):
            """Initialise lookup tables for colours."""
            self.red = []; self.green = []; self.blue = []
            def generate_value(ind, i):
                col = Sepia.brown[ind]
                col += i * (Sepia.white[ind] - Sepia.brown[ind]) / 255
                return abs(col)
            for i in range(256):
                self.red.append(generate_value(0, i))
                self.green.append(generate_value(1, i))
                self.blue.append(generate_value(2, i))
            self.lut = self.red + self.green + self.blue

Putting this all together gives us the final script:

Listing 13.7: Colour to sepia

    #!/bin/env python2.4
    """
    Converts a colour image to sepia.
    """
    import Image, sys, viewport

    __author__ = 'Sarah Mount'

    class Sepia(object):
        brown = (150, 80, 30)
        white = (255, 255, 255)

        def __init__(self):
            """Initialise lookup tables for colours."""
            self.red = []; self.green = []; self.blue = []
            def generate_value(ind, i):
                col = Sepia.brown[ind]
                col += i * (Sepia.white[ind] - Sepia.brown[ind]) / 255
                return abs(col)
            for i in range(256):
                self.red.append(generate_value(0, i))
                self.green.append(generate_value(1, i))
                self.blue.append(generate_value(2, i))
            self.lut = self.red + self.green + self.blue

        def sepia(self, image):
            """Convert an RGB image to sepia, using a lookup table."""
            image = image.convert('L')
            return image.point(lut=self.lut, mode='RGB')

    if __name__ == '__main__':
        title = 'Colour -> Sepia Using Python Image Library'
        filename = sys.argv[1]
        image = Image.open(filename, 'r')
        s_obj = Sepia()
        s = s_obj.sepia(image)
        viewport.display_image(s, title)
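If you don’t have PIL to hand, the lookup table itself can still be explored in plain Python. The sketch below is a variation on the script rather than a copy of it: generate_lut and sepia_pixel are our own function names, the brown and white triples are passed as parameters with default values, and integer division (//) is used so that the table holds whole numbers whichever version of Python you run it with.

```python
BROWN = (150, 80, 30)
WHITE = (255, 255, 255)


def generate_lut(brown=BROWN, white=WHITE):
    """Return a 768-entry table: 256 red, then 256 green, then 256 blue values."""
    lut = []
    for band in range(3):
        for i in range(256):
            # Integer division keeps every entry a whole number.
            lut.append(brown[band] + i * (white[band] - brown[band]) // 255)
    return lut


def sepia_pixel(grey, lut):
    """Look up the sepia (R, G, B) triple for one greyscale value."""
    return (lut[grey], lut[grey + 256], lut[grey + 512])


lut = generate_lut()
print(sepia_pixel(0, lut))    # darkest grey maps to brown
print(sepia_pixel(255, lut))  # lightest grey maps to white
```

The offsets 256 and 512 are the same ones used in the (lut[29], lut[38+256], lut[127+512]) example: each band simply reads from its own third of the table.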

13.3 Pygame

13.3.1 Bouncing ball animation

Figure 13.8: Bouncing ball animation

A ball bouncing around a screen seems to be the standard first example in any tutorial on animation programming. So, following tradition, this is the first example we have chosen to explore in our tour of Pygame, Python’s premier library for writing games. This example is an adapted version of a program in Pete Shinners’ Python Pygame Introduction. To start with, we’ll need to initialise some variables. We want a nice image for our ball, a sound file to be played when the ball hits the sides of the screen, a caption for the GUI containing the animation, the screen size, and so on:

    import pygame, pygame.mixer, sys
    ...
    if __name__ == '__main__':

        ballimg = '../images/beachball.png'
        ballsnd = '../sounds/bong.wav'
        caption = 'Bouncing ball animation using PyGame.'
        screen_size = width, height = 600, 500
        black = 0, 0, 0

The pygame.init() function initialises all the Pygame modules and must be called at the start of any Pygame program. We can now create the screen on which we’ll show our animation and this is done with the pygame.display.set_mode() function. Pygame automatically finds the best graphics mode to use for the hardware on which the animation is run, so we don’t have to worry about any platform-specific issues. set_mode() returns an object of type pygame.Surface that represents the graphics that you will see on your monitor when you run the bounce script: any drawing done on this surface will be visible on your monitor.

        pygame.init() # has to be called
        screen = pygame.display.set_mode(screen_size)

Next, we can create some objects which we will use in the animation. We want to load the image that will represent the ball and the sound that will be played when the ball hits the sides of the screen. Notice that after the image is loaded, we call its convert() method. This converts the image to the pixel format of the display, which is the format Pygame uses to represent colours on the screen surface. If we didn’t call convert() then this conversion would be done every time we draw our image on the screen (in this case every frame of the animation) and this would slow down our program considerably. Lastly, we find out where Pygame has placed the image by asking for its rect. A rect in Pygame is an object, a bit like a bounding box, which gives the coordinates of an object on the screen.

        sound = pygame.mixer.Sound(ballsnd)
        ball = pygame.image.load(ballimg).convert()
        ballrect = ball.get_rect()

Next, we can start writing the animation. We want to keep bouncing forever, so we can place our animation logic in an infinite while loop. If the user intervenes by pressing a key or a mouse button, or by moving the mouse or joystick, this is represented in Pygame by an event. We can ask Pygame to give us all the events that have happened (since we last checked!) by calling pygame.event.get(). It is important to do this often, so that the program feels responsive to the user, especially when writing games! In this case, we check for events every time we draw a new frame of the animation. We are only really interested in events that would cause our program to exit. The user might click the close box (usually marked with an “X” at the top right-hand corner of the window) or press Alt+F4, or perform some platform-specific action. Thankfully, we don’t have to check for each of these events individually. Pygame has a constant called pygame.QUIT that is the event type for all such quit events:

        while True:
            for event in pygame.event.get():
                if event.type == pygame.QUIT:
                    sys.exit()

Next, we can start thinking about the animation. Firstly, we check to see if the ball has “hit” the sides of the screen. Of course, the ball might have actually moved beyond the edge of the screen. So, rather than expecting the ball to exactly collide with the edge of the screen, we just check to see if the ball has moved beyond it, and for our purposes this seems to be accurate enough. We can do this by examining the instance variables in the rect which represents the ball’s position on the screen. If the ball has “hit” the edge of the screen, we need to do two things: firstly, play the sound (just by calling the play() method) and secondly, change the direction that the ball is moving in. In this simple example, we just reverse the component of the ball’s speed along the axis in which it hit the edge. This is hardly a sophisticated model of Physics, and a more complex game or animation would need to be more realistic.

            if ballrect.left < 0 or ballrect.right > width:
                sound.play()
                speed[0] = -speed[0]
            if ballrect.top < 0 or ballrect.bottom > height:
                sound.play()
                speed[1] = -speed[1]

Lastly, we need to move the ball and redraw it. Computer animation works by drawing a series of images in sequence, hopefully fast enough that humans watching the animation are fooled into seeing smooth motion. In this program, we will animate the ball very simply. In the Snake example below, we will use a faster and more sophisticated technique called dirty rect animation. Here, we first move the ball by calling the move() method of the ballrect object. This moves the rectangle holding the ball data relative to where it is currently situated, by given values in the x and y directions. So, we can move the ball left by one pixel like this:

    ballrect = ballrect.move([-1, 0])

and down one pixel like this:

    ballrect = ballrect.move([0, 1])

This should make sense to you when you remember that Pygame places the origin of the screen at its top left hand corner.
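The edge test and the move can be tried out without opening a window. In the sketch below, Box is a tiny stand-in of our own for a Pygame rect (it is not pygame.Rect and supports only what we need here), and step is a hypothetical helper that performs one frame’s worth of the logic above: check the edges, reverse the speed if necessary, then move.

```python
class Box(object):
    """A tiny stand-in for a rect: top-left corner plus width and height."""

    def __init__(self, left, top, width, height):
        self.left, self.top = left, top
        self.width, self.height = width, height

    @property
    def right(self):
        return self.left + self.width

    @property
    def bottom(self):
        return self.top + self.height

    def move(self, offset):
        """Return a new Box shifted by (dx, dy); y grows downwards."""
        return Box(self.left + offset[0], self.top + offset[1],
                   self.width, self.height)


def step(box, speed, width, height):
    """Reverse a speed component at any edge crossed, then move one frame."""
    if box.left < 0 or box.right > width:
        speed[0] = -speed[0]
    if box.top < 0 or box.bottom > height:
        speed[1] = -speed[1]
    return box.move(speed)


speed = [2, 2]
ball = Box(592, 10, 10, 10)  # right edge is at 602, just past a 600-wide screen
ball = step(ball, speed, 600, 500)
print((ball.left, ball.top, speed))
```

Only the horizontal part of the speed is reversed here, so the ball carries on moving down the screen while it heads back to the left.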


Once we have moved the rectangle, we need to get rid of the ball data in the old rectangle. We can do this by filling the whole screen with black pixels, using the fill() method in the screen object. When the old rectangle has been removed, we need to draw the ball in its new position. Remember that we already know that the ball should be placed in the position represented by the ballrect object. To draw the ball, we just need to copy this data onto the screen, which in computer graphics is called blitting. So, to blit the ball to the screen, we call the blit() method in the screen object, passing in the ball (containing the ball’s pixel data) and the ballrect which tells blit() where to place the data.

Lastly, we need to update the whole screen and make our changes visible to the viewer. We can do this by calling pygame.display.flip(). This manages a double buffer containing the changes we have made to the screen. Without buffering, the user would see partly incomplete areas of the screen as they are being drawn.

    ballrect = ballrect.move(speed)
    screen.fill(black)
    screen.blit(ball, ballrect)
    pygame.display.flip()

Putting this all together, here is the final program:

Listing 13.8: Bouncing ball animation

    #!/bin/env python2.4

    """
    Bouncing ball animation.
    """

    import pygame, pygame.mixer, sys

    __author__ = 'Pete Shinners <pete@shinners.org>'
    __date__ = 'August 2005'
    __credits__ = ('From Pete Shinners Python Pygame Intro. ' +
                   'Adapted by Sarah Mount')

    if __name__ == '__main__':
        ballimg = '../images/beachball.png'
        ballsnd = '../sounds/bong.wav'
        caption = 'Bouncing ball animation using PyGame.'
        screen_size = width, height = 600, 500
        black = 0, 0, 0
        speed = [2.0, 2.0]

        pygame.init()  # has to be called
        screen = pygame.display.set_mode(screen_size)
        pygame.display.set_caption(caption)
        sound = pygame.mixer.Sound(ballsnd)
        ball = pygame.image.load(ballimg).convert()
        ballrect = ball.get_rect()

        while True:
            for event in pygame.event.get():
                if event.type == pygame.QUIT:
                    sys.exit()
            # Bounce: reverse the speed on the axis where the ball left the screen.
            if ballrect.left < 0 or ballrect.right > width:
                sound.play()
                speed[0] = -speed[0]
            if ballrect.top < 0 or ballrect.bottom > height:
                sound.play()
                speed[1] = -speed[1]
            # Move the ball, then redraw the whole screen.
            ballrect = ballrect.move(speed)
            screen.fill(black)
            screen.blit(ball, ballrect)
            pygame.display.flip()
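The idea of double buffering mentioned above can be pictured with a toy example. The ToyDisplay class below is entirely hypothetical and is not how Pygame is implemented; it is only a sketch of the principle that drawing happens on a hidden buffer and the viewer sees the result in one step when flip() is called.

```python
class ToyDisplay(object):
    """A one-row 'screen' with a hidden back buffer (conceptual only)."""

    def __init__(self, width):
        self.visible = ['.'] * width   # what the viewer currently sees
        self.back = ['.'] * width      # where the drawing actually happens

    def fill(self, char):
        """Overwrite every cell of the back buffer."""
        self.back = [char] * len(self.back)

    def blit(self, char, x):
        """Copy a one-cell 'image' onto the back buffer at position x."""
        self.back[x] = char

    def flip(self):
        """Make the finished frame visible in a single step."""
        self.visible = list(self.back)


display = ToyDisplay(5)
display.fill('.')
display.blit('o', 2)
half_drawn = ''.join(display.visible)   # the viewer still sees the old frame
display.flip()
finished = ''.join(display.visible)     # now the new frame is visible
```

Until flip() is called the viewer sees only the old frame, so a half-finished drawing never appears on the screen.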

13.3.2 Game functions module

Pygame is quite a low-level games library, and it is a good idea to define a set of functions which you can use in a number of similar games. In our case, there are a few simple reusable functions we can create to be used in old-fashioned arcade games, such as Pong, Space Invaders and Pacman. We’ve put these in a module called game_functions. Once you know how Pygame works (and you will by the end of this Chapter!) these functions will be pretty easy to understand. There isn’t time to go through the whole module in detail, but the documentation for it is listed below. In the exercises, you will be asked to add a function to this module which implements a High Score table. That would be a good opportunity for you to look through the other functions in game_functions and make sure that you understand them all.

Listing 13.9: Documentation from the game functions module

    Help on module game_functions:

    NAME
        game_functions - Functions common to many games built with pygame.

    FILE
        .../listings/py_extensions/game_functions.py

    FUNCTIONS
        display_score(screen, score, fgcol=(255,255,255), bgcol=(0,0,0))
            Displays the current score.

        load_png(filename)
            Load a PNG image and return an image object and its rect.

        loading(screen, fgcol=(255,255,255))
            Tells the user that the game is loading.

        lose(screen, fgcol=(255,255,255), credits=None, fade=1500)
            Called when the player loses the game. Fades out any music
            that might be playing and (optionally) displays author
            credits.

        start(screen, game='', fgcol=(255, 255, 255), bgcol=(0, 0, 0))
            Asks the user to start the game.

13.3.3 Old skool arcade games: Snake

In this Section we will build a version of the game Snake. Usually in Snake there is a sprite, representing a snake, which grows longer every time it “eats” other sprites. To keep this example reasonably simple, we’ve written a version of the game where the snake simply grows longer at regular intervals (every 50 frames in our code). In the exercises for this Chapter, we will ask you to adapt this code into the traditional game.

You’ll notice as we go through the code that we’ll be explaining some of it slightly out of order, so the excerpts jump around the file a bit. This explanation of the code follows quite closely the thought pattern of its author. Listing 13.10, at the end of this Section, gives the whole file.

Whilst you’re reading through it would be sensible to have a copy of the Pygame documentation to refer to, so that you can look up methods and classes in the Pygame API. Reading API documentation is an important professional skill for any programmer to master, so this is good practice for you. You can find the Pygame documentation at http://www.pygame.org/docs/

Game variables

The Snake code is quite carefully structured and is similar to many Pygame programs, particularly arcade games. At the top of the file we put all the variables we might want to change later: the size of the window we want the game to run in, colours we’ll need later, filenames of images and sounds, text for various purposes, and so on:


Figure 13.9: Snake: press any key to start

    ### Global variables
    fgcol = pygame.color.Color('white')  # Foreground colour
    bgcol = pygame.color.Color('black')  # Background colour
    screen_size = (500, 500)
    head_img = '../images/snake-head.png'
    body_img = '../images/snake-body.png'
    soundtrack = '../sounds/ATT.ogg'
    caption = 'Snake: built with Pygame'
    credits = 'Sarah Mount (code), James Shuttleworth (music)'
    fps = 30  # Frames per second

Figure 13.10: Snake: game over!

Overall structure

After that, we have a bunch of classes which describe complicated objects that we’ll use in the game, in this case parts of the snake. After that, we’ll write a function which describes how the game is animated and played. That function (usually called main()) will handle events like user input, manage drawing all our game objects to the screen and so on. Lastly, we’ll have the usual condition if __name__ == '__main__': which will be true if the program is run from the command line (or by double-clicking on an icon). So, the overall structure of Snake will look like this:

    ### Global variables
    ...

    ### Game classes
    class SnakePart(pygame.sprite.Sprite):
        ...

    class SnakeHead(SnakePart):
        ...

    class SnakeBodyPart(SnakePart):
        ...

    ### Game logic
    def main():
        ...
        # Event loop
        while True:
            clock.tick(fps)
            frame += 1
            # Handle events
            for event in pygame.event.get():
                ...
            ...

    # Run the game
    if __name__ == '__main__':
        main()

There are probably millions of ways of structuring a Snake program, some simpler than others. The separation of classes and functions we’re using here is probably slightly more complex than it needs to be, but it should give you a good idea of how to structure a more complex arcade game well. Most importantly, we’re keeping related code together. So, all the event handling goes in one place, all the drawing goes in one place, the sprites are kept together and so on. This should make the code easy to read and easy to maintain later on.

Game classes

The three game classes we’ll write for Snake will be:

• SnakePart which will be a superclass containing some information that any part of a snake would need to know;
• SnakeHead which will represent the first segment of the snake which can be moved around the screen by the user; and
• SnakeBodyPart which will represent the parts of the snake which follow the head. These won’t be controllable by the user. In the game animation they will just follow the path of the head as it moves around the screen.

The SnakePart class inherits from Pygame’s class pygame.sprite.Sprite. In games and animation a sprite is just a two dimensional, pre-rendered (drawn) image which is used as part of a larger scene. You can read more about Pygame’s Sprite class on the Pygame documentation pages: http://www.pygame.org/docs/ref/sprite.html

In our case, our SnakePart class only really needs to hold its location on the screen and the rectangle in which it is drawn. Like the previous Pygame program, we can make use of Pygame’s Rect class here. Lastly, we want a way of representing the current direction in which the snake is moving. Only the SnakeHead needs to move, but here we want to create some values to represent the four directions of motion: up, down, left and right. In this case we’re using the following line of code:

    class SnakePart(pygame.sprite.Sprite):
        UP, DOWN, LEFT, RIGHT = [0, -1], [0, 1], [-1, 0], [1, 0]

Notice first that this code lies outside any method and we haven’t said self.UP=..., so these four variables are not part of an object: they are part of the SnakePart class. To refer to these class variables we can say SnakePart.UP, SnakePart.DOWN and so on. Remember that the top left hand corner of the screen is the origin. So, our four values representing direction tell us something about movement relative to the x and y axes. Think of each list as being an x, y pair. So, [0,-1] means zero motion in the x direction and negative motion in the y direction, which means that pair represents upwards motion. There are other ways of representing movement simply, but later on we’ll see how this makes some of our algorithms extremely simple. Here’s the full listing for the SnakePart class:

    ### Game classes.
    class SnakePart(pygame.sprite.Sprite):
        """Segment of a snake."""
        UP, DOWN, LEFT, RIGHT = [0, -1], [0, 1], [-1, 0], [1, 0]

        def __init__(self, position):
            pygame.sprite.Sprite.__init__(self)
            # Subclasses set self.image and self.rect before calling this.
            self.rect.center = position
            self.area = pygame.display.get_surface().get_rect()
            return

Next we’ll look at the SnakeBodyPart class which represents a segment of the snake which follows the head. This is a very simple class: all we need to do is to be able to draw a body part in a particular place on the screen. To keep things really simple we won’t move the SnakeBodyPart sprites. Instead, each time the snake moves we’ll destroy the old body and create a new one. Notice that here we’re using a PNG image to represent each body part on the screen. We use different images for the snake head and the body parts.
Here’s the code:

    class SnakeBodyPart(SnakePart):
        """Body part of a snake.
        Body parts are not expected to be updated.
        """
        def __init__(self, position):
            self.image, self.rect = game.load_png(body_img)
            super(SnakeBodyPart, self).__init__(position)
            return

Next we need to write the SnakeHead class. This one is a bit more complicated because we need to make sure that a snake head can move around the screen. Instead of creating a new snake head each time a new frame of the animation is drawn, we’ll move the existing snake head along. This will mean we need a bit more logic in our class to tell Python how that should happen. First of all, though, we need a constructor method for SnakeHead. This needs to hold a few things:

• The current position of the snake head on the screen;
• the current direction in which the snake head is moving;
• the current size of the snake’s body (so we know how many SnakeBodyParts to draw); and
• a list of the positions the SnakeHead has been drawn in (so we know where to draw all the sprites in the body).

Here’s the code for that:

    class SnakeHead(SnakePart):
        """Snake head.
        The head of a snake can change direction and update
        its position.
        """
        def __init__(self, screen):
            self.image, self.rect = game.load_png(head_img)
            self.screen = screen
            start_pos = (10, 10)
            super(SnakeHead, self).__init__(start_pos)
            self.direction = SnakePart.DOWN
            # Keep track of previous positions.
            self.trail = [self.rect.center]
            # Length of the body behind the snake head.
            self.bodysize = 0
            return

Next, we want the SnakeHead to be able to change direction when the user presses a key. We’ve already said that event handling will go in a function called main() and not in this class, so all the SnakeHead has to do is set its self.direction attribute. The code for this couldn’t be simpler:

        def change_direction(self, dir):
            """Change the direction in which the snake head
            is moving.
            """
            self.direction = dir
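If the difference between class variables and instance variables still feels slippery, the following cut-down sketch may help. It uses no Pygame at all: Part and Head are hypothetical stand-ins for SnakePart and SnakeHead, kept only so that the example runs on its own.

```python
class Part(object):
    """Stand-in for SnakePart, without any Pygame dependency."""
    # Class variables: one copy, shared by the class and all its objects.
    UP, DOWN, LEFT, RIGHT = [0, -1], [0, 1], [-1, 0], [1, 0]


class Head(Part):
    """Stand-in for SnakeHead: it only remembers its direction."""

    def __init__(self):
        # Instance variable: every Head object gets its own copy.
        self.direction = Part.DOWN

    def change_direction(self, new_direction):
        self.direction = new_direction


first, second = Head(), Head()
first.change_direction(Part.LEFT)   # only 'first' is affected
```

After this runs, first.direction is [-1, 0] while second.direction is still [0, 1], yet both objects (and both classes) share the very same UP, DOWN, LEFT and RIGHT lists.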


Before we write the update() method, which is the last one for this class, we need one other small method to help us out. This just multiplies the contents of two lists together, item by item, and returns a new list. For example, if we wanted to perform this multiplication:

    [1, 2, 3, 4] * [5, 6, 7, 8]

we would expect the answer:

    [1*5, 2*6, 3*7, 4*8]

which evaluates to:

    [5, 12, 21, 32]

The code to do this is simple: we just create an empty list, write a loop to populate it with the right numbers and return it. The code is below, but if it doesn’t make sense immediately you should type it into a Python interpreter and play with it until you understand how it works. There are a couple of brief things to notice here to do with the name of the method: _dotproduct. Firstly this starts with a _ symbol. By convention, this tells other programmers that the method should only be used by code inside the same class (Python itself does not enforce this). Secondly, we have named the method after “dot products”, which you may remember from studying vectors and matrices in maths at school. Strictly speaking, a dot product goes one step further and adds the products together to give a single number; our method stops at the item-by-item multiplication.

        def _dotproduct(self, l1, l2):
            """Vector dot product."""
            product = []
            for i in range(len(l1)):
                product.append(l1[i] * l2[i])
            return product

Lastly for this class we can write the update() method which updates the head on the screen whenever a new frame is drawn. This method needs to do several things:

• Update the position of the head on the screen;
• manage the list of positions that the head has occupied; and
• check if the snake head has hit the sides of the screen, in which case the game is over and we can call the lose() function in the game_functions module.
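As a preview of how these three steps fit together, here is a Pygame-free sketch. Everything in it is an assumption made for illustration: rectangles are plain (x, y, width, height) tuples, and elementwise(), contains() and this stand-alone update() are hypothetical helpers, not the methods we are about to write.

```python
def elementwise(l1, l2):
    """Multiply two equal-length lists item by item."""
    return [a * b for a, b in zip(l1, l2)]


def contains(area, rect):
    """True if rect lies entirely inside area; both are (x, y, w, h)."""
    ax, ay, aw, ah = area
    rx, ry, rw, rh = rect
    return ax <= rx and ay <= ry and rx + rw <= ax + aw and ry + rh <= ay + ah


def update(rect, direction, trail, bodysize, area):
    """Move rect one step; return (new_rect, still_on_screen)."""
    x, y, w, h = rect
    dx, dy = elementwise(direction, (w, h))        # step 1: move the head
    rect = (x + dx, y + dy, w, h)
    trail.append((rect[0] + w // 2, rect[1] + h // 2))  # step 2: store centre
    if len(trail) > bodysize + 1:                  # keep the trail short
        del trail[0]
    return rect, contains(area, rect)              # step 3: edge check


area = (0, 0, 100, 100)                 # a tiny 100 x 100 'screen'
rect, trail = (0, 0, 20, 20), [(10, 10)]
results = []
for _ in range(5):                      # move down five times
    rect, on_screen = update(rect, [0, 1], trail, 2, area)
    results.append(on_screen)
```

The first four moves stay on the screen and the fifth does not, and trail never holds more than bodysize + 1 positions. Note also that sum(elementwise([1, 2, 3, 4], [5, 6, 7, 8])) would give 70, which is the true dot product of those two vectors.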

Let’s take these one at a time. To move the head we need to decide where the new position of the head should be, then tell Pygame to move the rectangle that the head is drawn in. To make moving the snake simple we’ll move the snake head by its own width or height, depending on whether it’s moving horizontally or vertically. If the snake head is the same size as each of the snake body parts then all we have to do to know where the snake body should be drawn is to store the centres of each rectangle the snake head occupied. So, if we want to move the snake head up, we need to move it by 0 in the x direction and −height in the y direction, where height is the height of the snake head. Or, in our code:


    self._dotproduct(SnakePart.UP, self.rect.size)

Remember that SnakePart.UP is [0,-1] and self.rect.size is a pair containing just the width and height of the rectangle in which the snake head is drawn. So, to calculate how far to move the snake head we just need this line of code:

    movepos = self._dotproduct(self.direction, self.rect.size)

Unlike the bouncing ball animation we won’t use the move() method from the Rect class to move the snake head; instead we’ll use a similar method called move_ip(). The Pygame documentation for move_ip says:

    Rect.move_ip
      moves the rectangle, in place
      Rect.move_ip(x, y): return None

      Same as the Rect.move - moves the rectangle method, but
      operates in place.

So, instead of creating a new Rect object at the new location and throwing the old one away, move_ip() just changes the coordinates of the original Rect. This should be slightly faster. Generally, we aren’t too worried about the speed of programs, but when a program is running on an embedded device or has a lot of user interaction (like a game or a user interface) speed and responsiveness can be an important part of making software usable and useful.

        def update(self):
            """Update the position of the snake head."""
            movepos = self._dotproduct(self.direction, self.rect.size)
            self.rect.move_ip(movepos)

To manage the list of positions that the head has been drawn in we first need to append the new position to the self.trail list. Then we want to make sure that if the size of the snake’s body is quite small, we don’t keep an enormous list of positions that we don’t really need. So, we can check the length of the self.trail list against the size of the snake body, held in self.bodysize, and delete an element in the trail if the list is bigger than it needs to be. Here’s the code:

            self.trail.append(self.rect.center)
            # Keep the self.trail list small.
            if len(self.trail) > (self.bodysize + 1):
                del self.trail[0]

In the bouncing ball animation we checked for the ball hitting the sides of the screen by comparing the coordinates of the rectangle the ball was drawn in with the height and width of the screen. In this example we’ve done the same thing slightly differently. Remember from the SnakePart class that self.area is a rectangle covering the drawable surface of the whole screen. We can use one of the methods in Pygame’s Rect class (called contains()) to check if the self.area rectangle still contains the rectangle used to draw the snake head:

            # Check if head has hit the side of the screen.
            if not self.area.contains(self.rect):
                game.lose(self.screen, credits=credits)
            return

Here’s the full update() method:

        def update(self):
            """Update the position of the snake head."""
            movepos = self._dotproduct(self.direction,
                                       self.rect.size)
            self.rect.move_ip(movepos)
            self.trail.append(self.rect.center)
            # Keep the self.trail list small.
            if len(self.trail) > (self.bodysize + 1):
                del self.trail[0]
            # Check if head has hit the side of the screen.
            if not self.area.contains(self.rect):
                game.lose(self.screen, credits=credits)
            return

Game functions

The main() function contains all the code to animate the game and handle events. As with the bouncing ball example or any other Pygame program we need to initialise Pygame’s internal variables and (while we’re setting things up) we can set a caption on the title bar of the application window:

    ### Game logic.
    def main():
        """Play Snake."""
        pygame.init()
        screen = pygame.display.set_mode(screen_size)  # Blit buffer.
        pygame.display.set_caption(caption)

Next we’ll fill the screen with black pixels and call the loading() function in the game_functions module to tell the user that the game is currently busy. We’ll be loading some large files which will take some time, and without giving the user any feedback it might look as if the program has crashed.


As before, to fill the screen we’ll ask Pygame for a surface on which we can draw that is the same size as the application window (we’ve called that background). Then we’ll fill that surface with the background colour, blit (or copy) the surface onto the screen and call pygame.display.flip() to write the whole screen out to the graphics card. Remember that this is a time consuming process because we’re updating every single pixel in the application window. When we draw the game animation we’ll be a bit smarter about how we manage drawing.

        # Draw the game background.
        background = (pygame.Surface(screen.get_size())).convert()
        background.fill(bgcol)
        screen.blit(background, (0, 0))
        pygame.display.flip()
        game.loading(screen)

We will next initialise all our game objects. In this program we have the SnakeHead class to instantiate and, although we won’t start off with any SnakeBodyPart objects, we’ll create a group of sprites to put the body in. One of the advantages of using sprites is that they can be stored in groups, which means you can render all the sprites at once (rather than having to write loops to do this) and you can use Pygame’s facilities for detecting collisions between sprites. Lastly, we can load the soundtrack for the game, create a clock which will automatically manage the frame rate of the game’s animation for us, and start a counter to tell us how many frames have been drawn. Note that you can find the Clock class listed in the documentation for the pygame.time module.

        # Initialise the snake.
        head = SnakeHead(screen)
        headsprite = pygame.sprite.RenderPlain(head)
        body = pygame.sprite.Group()
        bodysprite = pygame.sprite.RenderPlain(body)

        music = pygame.mixer.Sound(soundtrack)
        clock = pygame.time.Clock()
        frame = 0  # Number of frames we have drawn so far.

Once we’ve loaded our game objects we can start the game. First we’ll start playing the soundtrack. The play() method in the pygame.mixer.Sound class will play the music [1]. Passing -1 to the method will make sure that the music loops indefinitely. We also need to redraw the background to cover up the message that told the player that the game is loading. Then we can call the start() function from the game_functions module which will tell the player to “Press any key to start playing Snake”.

[1] In this game the music we’re using is James’ excellent piece “All The Trips” which you can download from his website http://dis-dot-dat.net/

        # Ask the player to start the game.
        music.play(-1)
        screen.blit(background, (0, 0))
        pygame.display.flip()
        pygame.event.clear()  # Clear event queue.
        game.start(screen, game='Snake')
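We created a clock a moment ago to manage the frame rate. To get a rough feel for what a frame-rate clock does, here is a sketch of one written in plain Python. This is not Pygame’s implementation: FrameClock, its now and sleep parameters and the fake timer used to demonstrate it are all hypothetical, chosen so that the example runs instantly and predictably.

```python
import time


class FrameClock(object):
    """Hypothetical frame-rate limiter (not Pygame's own code)."""

    def __init__(self, now=time.time, sleep=time.sleep):
        self._now = now
        self._sleep = sleep
        self._last = now()

    def tick(self, fps):
        """Wait until at least 1/fps seconds have passed since the last tick."""
        target = 1.0 / fps
        elapsed = self._now() - self._last
        if elapsed < target:
            self._sleep(target - elapsed)
        self._last = self._now()


# Demonstrate with a fake timer, so that nothing really sleeps.
fake_time = [0.0]
naps = []


def fake_now():
    return fake_time[0]


def fake_sleep(seconds):
    naps.append(seconds)
    fake_time[0] += seconds


clock = FrameClock(now=fake_now, sleep=fake_sleep)
for _ in range(3):
    fake_time[0] += 0.01    # pretend that drawing the frame took 10 ms
    clock.tick(50)          # 50 frames per second allows 20 ms per frame
```

Each frame takes 10 ms to “draw”, so the clock sleeps for roughly another 10 ms each time to fill the 20 ms slot. A faster computer would simply sleep for longer, which is why the game plays at the same speed everywhere.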

The event loop is a little different to the one you saw in the bouncing ball animation. In that example we just looped the animation as fast as possible. In this game, we want to be able to control the speed of the animation so that we can choose the most playable speed for the player. Have a go at varying the fps variable at the top of the file and see what happens. In Pygame, animation speed is controlled by a clock which we’ve already instantiated from the pygame.time.Clock class. We can ask the clock to “tick” at the right time and update our frame counter like this:

        # Event loop
        while True:
            clock.tick(fps)
            frame += 1

Next we need to handle user events. Like the bounce animation, we need to check to see if the user has asked the application to quit, in which case our main() function can return. We also need to enable the user to move the head of the snake. Events of type KEYDOWN are posted whenever the user presses a key. The pygame.locals module contains constants representing (among other things) keys on the keyboard which the user can press. Look through the help for that module and note that the constants whose names begin with K_ are keyboard keys, those starting with JOY represent joystick movements, and so on. For this game we’ll use the arrow keys to control the snake, although if you play games regularly you may be used to other key combinations, such as A, S, W and D. When the user has requested a change of direction, we need to make a note of the new vector and instruct the snake head to redirect its motion. Here’s the code:

            # Handle events.
            for event in pygame.event.get():
                if event.type == QUIT:
                    return
                elif event.type == KEYDOWN:
                    if event.key == K_UP:
                        dir = SnakePart.UP
                    elif event.key == K_DOWN:
                        dir = SnakePart.DOWN
                    elif event.key == K_LEFT:
                        dir = SnakePart.LEFT
                    elif event.key == K_RIGHT:
                        dir = SnakePart.RIGHT
                    head.change_direction(dir)

Next, we can check whether the snake head has collided with any of the sprites in the snake body, in which case the player has lost the game. Since we’ve used sprites, we can make use of Pygame’s spritecollideany function (in the pygame.sprite module) which detects whether one sprite has collided with a group of sprites:

            # Check if the head has collided with the body.
            if pygame.sprite.spritecollideany(head, body):
                game.lose(screen, credits=credits)

Before we deal with the details of the animation, we need to increment the bodysize field in the head object. This is the number that tells us how many sprites should be in the body of the snake and, as we said earlier, it grows as the game goes on. Here we add one to it every 50 frames (a little under two seconds at 30 frames per second). So, we just need to check the frame counter and increment when appropriate:

            # Add a new SnakePart object every 50 frames.
            if frame % 50 == 0:
                head.bodysize += 1

Dirty rect animation

In the bouncing ball animation we created the illusion of a moving object by drawing the ball on the screen, then creating a new image with the ball slightly moved, filling the screen with black pixels, then blitting the updated image onto the screen. This is particularly inefficient: every pixel on the screen has to be updated every time something moves! Better, surely, to simply update those rectangles of the screen that change between each frame. This technique is known as dirty rect animation. It’s very simple to implement. We just need to keep a list of all the rectangles (the dirty rectangles) which need updating, then blit the updates for those rectangles straight onto the screen, leaving the remaining pixels unchanged. So, first of all, we need to keep a list of all those dirty rects:

            dirty = []  # Dirty rects.

Next we can add to the list all of the rectangles containing parts of the snake: the snake head and all the body parts. We can also fill those rectangles with background colour to blank them out. It’s important to remember when you read the next chunk of code that we are dealing with two things, screen and pygame.display. screen is just an image in memory that we write to (the blit buffer), whereas pygame.display is the Pygame module that controls the actual image in the graphics card that the user will see on his or her screen.

            # Draw over the old snake.
            dirty.append(copy.copy(head.rect))
            screen.fill(bgcol, head.rect)
            for b in body.sprites():
                dirty.append(b.rect)
                screen.fill(bgcol, b.rect)

Now we’ve got a list of all the old positions of the snake parts, and we’ve drawn over them, we can move the snake head and draw in all the new snake parts. Again, we only draw on the blit buffer and we’ll make sure these rectangles also appear in the dirty rect list, so we know to update them on the display. First, we’ll deal with the snake head:

            # Move the snake head.
            head.update()
            dirty.append(head.rect)

...and now the body. Remember, the snake head was moved “in place”, whereas we decided to destroy all the snake body parts on every frame. So, here we need to create new SnakeBodyPart objects and place them in the body sprite group. The head.trail list holds all the locations on the screen where we ought to be drawing body parts. As with everything else, these rectangles need to go in the dirty rect list so that we remember to update them on the display:

            # Delete old sprites and create new snake body.
            body.empty()
            for i in range(head.bodysize):
                new_part = SnakeBodyPart(head.trail[i])
                body.add(new_part)
                dirty.append(new_part.rect)

Now, we’re almost there. We have a list of every rectangle which needs to be updated on the display and we’ve created all of the sprites that are to be drawn on the blit buffer. As with the first frame of the animation, we need to make sure the sprites are rendered correctly by wrapping each sprite (or sprite group) in a RenderPlain group and calling its draw() method. We’ll also want to update the score:

            # Render the current score and the snake.
            game.display_score(screen, head.bodysize)
            headsprite = pygame.sprite.RenderPlain(head)
            bodysprite = pygame.sprite.RenderPlain(body)
            headsprite.draw(screen)
            bodysprite.draw(screen)

Last of all, we need to blit all our dirty rects to the display:

            # Update dirty rects.
            pygame.display.update(dirty)
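To get a feel for how much work dirty rect animation can save, we can do a quick back-of-the-envelope calculation. The screen size below comes from our global variables, but the 20 pixel sprite size, the body length of 10 and the dirty_pixels() helper are assumptions made only for this sketch (and we ignore the small rectangle used for the score).

```python
def dirty_pixels(rects):
    """Upper bound on pixels redrawn: the sum of the areas of the rects.

    Each rect is an (x, y, width, height) tuple. Overlapping rects are
    counted twice, so the true figure can only be smaller.
    """
    return sum(w * h for (x, y, w, h) in rects)


screen_w, screen_h = 500, 500   # screen_size from the global variables
part = 20                       # hypothetical sprite size in pixels
bodysize = 10                   # hypothetical length of the snake's body

# Old and new positions of the head plus each body part, in one column.
old = [(10, 10 + part * i, part, part) for i in range(bodysize + 1)]
new = [(10, 10 + part * (i + 1), part, part) for i in range(bodysize + 1)]
dirty = old + new

full = screen_w * screen_h      # pixels touched by a full redraw
partial = dirty_pixels(dirty)   # pixels touched by the dirty rects
share = 100.0 * partial / full  # as a percentage of the whole screen
```

Under these assumptions a full redraw touches 250000 pixels, whereas the dirty rects cover at most 8800, which is about 3.5% of the screen.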


Complete listing for Snake Putting all that work together, this is the complete listing for snake: Listing 13.10: Full listing of the Snake program 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38

# !/ bin / env python2 .4 """ Snake made with pygame . """ import copy , math , pygame from pygame . locals import * import game_functions as game if not pygame . font : print ’ Warning , fonts disabled ’ if not pygame . mixer : print ’ Warning , sound disabled ’ __author__ = ’ Sarah Mount ’ # ## Global variables fgcol = pygame . color . Color ( ’ white ’) # Foreground colour bgcol = pygame . color . Color ( ’ black ’) # Background colour screen_size = (500 , 500) head_img = ’ ../ images / snake - head . png ’ body_img = ’ ../ images / snake - body . png ’ soundtrack = ’ ../ sounds / ATT . ogg ’ caption = ’ Snake : built with Pygame ’ credits = ’ Sarah Mount ( code ) , James Shuttleworth ( music ) ’ fps = 30 # Frames per second # ## Game classes . class SnakePart ( pygame . sprite . Sprite ): " " " Segment of a snake . " " " UP , DOWN , LEFT , RIGHT = [0 , -1] , [0 ,1] , [ -1 ,0] , [1 ,0] def __init__ ( self , position ): pygame . sprite . Sprite . __init__ ( self ) self . rect . center = position self . area = pygame . display . get_surface (). get_rect () return class SnakeHead ( SnakePart ): " " " Snake head . The head of a snake can change direction and update its position .

244

CHAPTER 13. PYTHON EXTENSIONS

    """
    def __init__(self, screen):
        self.image, self.rect = game.load_png(head_img)
        self.screen = screen
        start_pos = (10, 10)
        super(SnakeHead, self).__init__(start_pos)
        self.direction = SnakePart.DOWN
        # Keep track of previous positions.
        self.trail = [self.rect.center]
        # Length of the body behind the snake head.
        self.bodysize = 0
        return

    def update(self):
        """Update the position of the snake head."""
        movepos = self._dotproduct(self.direction,
                                   self.rect.size)
        self.rect.move_ip(movepos)
        self.trail.append(self.rect.center)
        # Keep the self.trail list small.
        if len(self.trail) > (self.bodysize + 1):
            del self.trail[0]
        # Check if head has hit the side of the screen.
        if not self.area.contains(self.rect):
            game.lose(self.screen, credits=credits)
        return

    def change_direction(self, dir):
        """Change the direction in which the snake head
        is moving.
        """
        self.direction = dir

    def _dotproduct(self, l1, l2):
        """Element-wise product of two vectors (despite the method's name)."""
        product = []
        for i in range(len(l1)):
            product.append(l1[i] * l2[i])
        return product


class SnakeBodyPart(SnakePart):
    """Body part of a snake.
    Body parts are not expected to be updated.
    """
    def __init__(self, position):
        self.image, self.rect = game.load_png(body_img)
        super(SnakeBodyPart, self).__init__(position)
        return


### Game logic.
def main():
    """Play Snake."""
    pygame.init()
    screen = pygame.display.set_mode(screen_size)  # Blit buffer.
    pygame.display.set_caption(caption)

    # Draw the game background.
    background = (pygame.Surface(screen.get_size())).convert()
    background.fill(bgcol)
    screen.blit(background, (0, 0))
    pygame.display.flip()
    game.loading(screen)

    # Initialise the snake.
    head = SnakeHead(screen)
    headsprite = pygame.sprite.RenderPlain(head)
    body = pygame.sprite.Group()
    bodysprite = pygame.sprite.RenderPlain(body)

    music = pygame.mixer.Sound(soundtrack)
    clock = pygame.time.Clock()
    frame = 0  # Number of frames we have drawn so far.

    # Ask the player to start the game.
    music.play(-1)
    screen.blit(background, (0, 0))
    pygame.display.flip()
    pygame.event.clear()  # Clear event queue.
    game.start(screen, game='Snake')

    # Event loop
    while True:
        clock.tick(fps)
        frame += 1

        # Handle events.
        for event in pygame.event.get():
            if event.type == QUIT:
                return
            elif event.type == KEYDOWN:
                if event.key == K_UP:
                    dir = SnakePart.UP
                elif event.key == K_DOWN:
                    dir = SnakePart.DOWN
                elif event.key == K_LEFT:
                    dir = SnakePart.LEFT
                elif event.key == K_RIGHT:
                    dir = SnakePart.RIGHT
                else:
                    # Ignore any other key, so that dir is never used
                    # before it has been given a value.
                    continue
                head.change_direction(dir)

        # Check if the head has collided with the body.
        if pygame.sprite.spritecollideany(head, body):
            game.lose(screen, credits=credits)

        # Add a new SnakePart object every 50 frames.
        if frame % 50 == 0:
            head.bodysize += 1

        dirty = []  # Dirty rects.

        # Draw over the old snake.
        dirty.append(copy.copy(head.rect))
        screen.fill(bgcol, head.rect)
        for b in body.sprites():
            dirty.append(b.rect)
            screen.fill(bgcol, b.rect)

        # Move the snake head.
        head.update()
        dirty.append(head.rect)

        # Delete old sprites and create new snake body.
        body.empty()
        for i in range(head.bodysize):
            new_part = SnakeBodyPart(head.trail[i])
            body.add(new_part)
            dirty.append(new_part.rect)

        # Render the current score and the snake.
        game.display_score(screen, head.bodysize)
        headsprite = pygame.sprite.RenderPlain(head)
        bodysprite = pygame.sprite.RenderPlain(body)
        headsprite.draw(screen)
        bodysprite.draw(screen)

        # Update dirty rects.
        pygame.display.update(dirty)


if __name__ == '__main__':
    main()

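The way SnakeHead.update() moves the head and keeps self.trail short can be tried out without Pygame. The sketch below is only an illustration of that idea: CELL, DOWN, RIGHT, elementwise_product() and step() are names invented for this example, and plain tuples stand in for rect.move_ip() and rect.center in the listing.

```python
# A Pygame-free sketch of how the head moves and how the trail is kept short.
# CELL, DOWN, RIGHT and step() are hypothetical names used only here.

CELL = (20, 20)    # Assumed size of one snake part, in pixels (width, height).
DOWN = (0, 1)      # Direction vectors, in the spirit of SnakePart.DOWN.
RIGHT = (1, 0)


def elementwise_product(l1, l2):
    """Multiply two vectors component by component."""
    return [a * b for a, b in zip(l1, l2)]


def step(position, direction, trail, bodysize):
    """Move one cell in the given direction and record the new position."""
    dx, dy = elementwise_product(direction, CELL)
    position = (position[0] + dx, position[1] + dy)
    trail.append(position)
    # Keep only the head position plus one position per body part.
    if len(trail) > bodysize + 1:
        del trail[0]
    return position


position = (10, 10)
trail = [position]
bodysize = 2
for direction in (DOWN, DOWN, RIGHT, RIGHT):
    position = step(position, direction, trail, bodysize)

print(position)   # Where the head is now.
print(trail)      # The last bodysize + 1 positions, oldest first.
```

The last item in the trail is always the head itself, so the first bodysize items are exactly the positions where the event loop creates its SnakeBodyPart objects.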

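The event loop in the listing collects a list of dirty rects, draws over the old snake, draws the new one and then passes the list to pygame.display.update(). The same bookkeeping can be sketched on a tiny text "screen". This is a simulation under stated assumptions, not Pygame code: the two buffers, the (x, y, width, height) rectangles and the functions fill() and update() are all made up for this example.

```python
# A Pygame-free sketch of dirty rect bookkeeping on a tiny text "screen".
# WIDTH, HEIGHT, fill() and update() are hypothetical names for this example.

WIDTH, HEIGHT = 8, 4
BACKGROUND, SPRITE = '.', '#'

back_buffer = [[BACKGROUND] * WIDTH for _ in range(HEIGHT)]  # Where we draw.
display = [[BACKGROUND] * WIDTH for _ in range(HEIGHT)]      # What is visible.


def fill(buffer, rect, value):
    """Fill the rectangle (x, y, width, height) in buffer with value."""
    x, y, w, h = rect
    for row in range(y, y + h):
        for col in range(x, x + w):
            buffer[row][col] = value


def update(dirty):
    """Copy only the dirty rectangles to the display; return cells copied."""
    copied = 0
    for x, y, w, h in dirty:
        for row in range(y, y + h):
            for col in range(x, x + w):
                display[row][col] = back_buffer[row][col]
                copied += 1
    return copied


old_rect = (1, 1, 2, 1)   # Where the sprite was in the last frame.
new_rect = (2, 1, 2, 1)   # Where it is in this frame.
fill(back_buffer, old_rect, SPRITE)
update([old_rect])        # Frame 1: show the sprite in its first position.

dirty = []
fill(back_buffer, old_rect, BACKGROUND)   # Draw over the old sprite.
dirty.append(old_rect)
fill(back_buffer, new_rect, SPRITE)       # Draw the sprite in its new place.
dirty.append(new_rect)
copied = update(dirty)

print(''.join(display[1]))
print(copied, 'of', WIDTH * HEIGHT, 'cells copied')
```

Only the cells inside the two rectangles are copied, rather than the whole screen. In the listing the old head rect is stored with copy.copy() because move_ip() changes the rect in place; without the copy the dirty list would hold the new position twice and the old one would never be redrawn.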


13.4 Further reading

• Google APIs: http://www.google.com/apis/
• Dave Gorman’s Googlewhack Adventure: http://www.davegorman.com/googlewhack.htm
• PyGoogle website: http://pygoogle.sourceforge.net/
• PyGoogle documentation: http://pygoogle.sourceforge.net/dist/doc/public/google-module.html
• Wikipedia entry on the RGB colour model: http://en.wikipedia.org/wiki/RGB/
• PIL website: http://www.pythonware.com/
• PIL handbook: http://www.pythonware.com/library/pil/handbook/
• Pygame website: http://www.pygame.org/
• Pete Shinners’ Pygame introduction: http://www.pygame.org/docs/tut/intro/intro.html
• Tom Chance’s Making games with Pygame tutorial: http://www.tomchance.uklinux.net/pygame/
• David Clark’s Newbie guide to Pygame: http://www.pygame.org/docs/tut/newbieguide.html

13.5 Glossary

API: the set of functions or methods that a program makes available to other programs.

blitting: copying pixel data into memory so that it becomes visible on the user’s screen.

bounding box: a rectangular area of data inside an image. In PIL this is represented by a four-tuple (left, upper, right, lower): the first two values are the x and y coordinates of the top left hand corner of the box, and the last two are the x and y coordinates of its bottom right hand corner.

dirty rect animation: a technique where in each frame of an animation only the rectangles which need to be drawn are blitted to the screen. This improves the speed and efficiency of animations, as the whole screen isn’t redrawn at once.

lookup table: a list (or dictionary) containing pre-computed values from a function. Lookup tables are usually used because it is faster to access a value in a list than to call a function every time a value is needed. In image processing a lookup table is often called a LUT.

RGB: a colour model where each pixel is represented by a three-tuple containing its red value, green value and blue value. In a 24 bit RGB model each value ranges from 0 to 255.

sprite: a two dimensional, usually pre-rendered, image which is used as part of a larger animation or game.

widget: an element in a graphical user interface, such as a button, menu item, text field or scroll bar.
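
The lookup table and RGB entries above can be tried together in a few lines. The sketch below uses plain Python tuples rather than PIL images, and the pixel values and the names NEGATIVE_LUT, negative() and swap_red_blue() are invented for this example.

```python
# A sketch of a lookup table (LUT) applied to 24 bit RGB pixels.
# The pixel values and function names are made up for this example;
# no imaging library is used.

# Pre-compute the negative of every possible band value once.
NEGATIVE_LUT = [255 - value for value in range(256)]


def negative(pixel):
    """Return the negative of a (red, green, blue) three-tuple."""
    red, green, blue = pixel
    return (NEGATIVE_LUT[red], NEGATIVE_LUT[green], NEGATIVE_LUT[blue])


def swap_red_blue(pixel):
    """Return the pixel with its red and blue values exchanged."""
    red, green, blue = pixel
    return (blue, green, red)


pixels = [(255, 0, 0), (0, 128, 255), (10, 20, 30)]
print([negative(p) for p in pixels])
print([swap_red_blue(p) for p in pixels])
```

For a function as cheap as 255 - value the table saves very little. The technique pays off when the function is expensive and the same 256 inputs are looked up for every pixel in a large image.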

13.6 Homework exercises

1. Define the term API.
2. Use the google module to find out approximately how many Python tutorials are available on the web. Why might your estimate be inaccurate?
3. Why is it a good idea to have the origin of an image at its top left hand corner? Why not put the origin in the centre?
4. Describe the RGB colour model.
5. When we swapped the red and blue bands of the listings/images/san_diego.jpg photograph, we didn’t touch the green band. So, why did the green hedge in the foreground and the green awnings in the background change colour?
6. The image listings/images/union_station.jpg is blurred. Find and apply a PIL filter to sharpen it.
7. Look at the listings/py_extensions/sepia.py script. Comment out the line image = image.convert('L') and run the script on the listings/images/long_beach.jpg photograph. The result looks quite different to a sepia toned photograph! Why is that?
8. Describe the term blitting.
9. Briefly describe the technique of dirty rect animation.
10. Change the snake program so that the snake only grows in length when the head collides with randomly placed sprites (snake food!), which should be the same size as a SnakePart. Make sure that after a random period of time, new food is made available to the snake.
11. Augment the game_functions module with a high score table. Use either the shelve or pickle module to store the scores.
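
As a possible starting point for exercise 11, the sketch below shows only the pickle round trip: saving a list of scores to a file and reading it back. The file name, the (name, score) layout and the functions save_scores(), load_scores() and add_score() are assumptions made for this example; none of them is part of the game_functions module, and fitting a table into the game is left to the exercise.

```python
# A starting point for exercise 11: saving and loading a score table with
# pickle. The file name, the (name, score) layout and the function names
# are assumptions made for this example, not part of game_functions.
import os
import pickle
import tempfile


def save_scores(filename, scores):
    """Write the list of (name, score) pairs to filename."""
    with open(filename, 'wb') as score_file:
        pickle.dump(scores, score_file)


def load_scores(filename):
    """Read the score list back, or return an empty list if there is none."""
    if not os.path.exists(filename):
        return []
    with open(filename, 'rb') as score_file:
        return pickle.load(score_file)


def add_score(scores, name, score, size=5):
    """Return a new table with the score added, best first, at most size long."""
    table = sorted(scores + [(name, score)], key=lambda entry: entry[1],
                   reverse=True)
    return table[:size]


filename = os.path.join(tempfile.mkdtemp(), 'scores.pickle')
scores = load_scores(filename)            # Empty the first time round.
scores = add_score(scores, 'sarah', 12)
scores = add_score(scores, 'james', 7)
save_scores(filename, scores)
print(load_scores(filename))
```

A temporary directory is used here so that the example leaves no files behind; a real game would keep its score file somewhere permanent.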


13.7 Key Assignment

Use the game_functions module and the animation skills you have learned in this chapter to implement a single player arcade game, such as Space Invaders (see Wikipedia for a description: http://en.wikipedia.org/wiki/Space_Invaders). The exact game you implement is up to you: you don’t have to choose an old game, and you could invent one of your own. We expect you to make good use of the facilities in Pygame, including those which handle collisions.

You should aim to produce a program of several hundred lines. Single-player Pong might be a bit too short; Space Invaders is about right, but you probably want to stick to one game level only. For several levels, you would probably want to write code to parse some configuration files describing what should happen on each level. Of course, you’re welcome to do this, but we don’t expect you will have enough time!

You should create any images and sounds you need to use in the game yourself. We recommend the GIMP (http://www.gimp.org/) or Inkscape (http://www.inkscape.org) for graphics (both available in AS116 and AS121), but you are welcome to use any software you have available to you. If you are also registered on 102CR you may wish to hand in any graphics and/or music you write in your portfolio so that your work will count for both modules. You should also consider the playability of your game; any work you do towards that could count towards 106CR or any other Usability module that you attend.

