Programming Methods and Tools
Programming Methods and Tools Paweł Lula, Jan Madej, Janusz Stal, Ryszard Tadeusiewicz, Janusz Tuchowski
Editorial Committee (Komitet redakcyjny): prof. dr hab. Tadeusz Grabiński, dr Lesław Piecuch
Reviewer (Recenzent): dr hab. inż. Krzysztof Boryczko, prof. AGH
Scientific Editor (Redakcja naukowa): Jan Madej
Authors (Zespół autorów): Paweł Lula, Jan Madej, Janusz Stal, Ryszard Tadeusiewicz, Janusz Tuchowski
The Project Leader (Lider projektu): Cracow University of Economics (Uniwersytet Ekonomiczny w Krakowie) ul. Rakowicka 27, 31-510 Kraków tel.:+48 12 293 57 00 lub +48 12 293 52 00 faks: +48 12 293 50 10 www.uek.krakow.pl Project Partner (Partner projektu): Association ‘Education for Entrepreneurship’ (Stowarzyszenie „Edukacja dla Przedsiębiorczości”) ul. Orląt Lwowskich 2/2, 31-518 Kraków tel.:+48 12 430 18 11 of
[email protected] www.edp.org.pl
© Copyright by Cracow University of Economics, Cracow 2014
ISBN 978-83-64509-07-0
Free of charge (Publikacja bezpłatna)
Editorial and graphic preparation, typesetting and layout on behalf of the Cracow University of Economics (Opracowanie wydawnicze, graficzne, skład i łamanie na zlecenie Uniwersytetu Ekonomicznego w Krakowie): Grupa Wydawnicza LUMINA Sp. z o.o.
Content and language editing of the English translation (Korekta merytoryczno-językowa tłumaczenia na język angielski): Michał Jezierski, Lingua Lab, www.lingualab.pl
The publication, prepared within the project "Launching a unique faculty Applied Informatics in response to labour market demand", is co-financed by the European Union within the framework of the European Social Fund. (Wydawnictwo przygotowane w ramach projektu „Uruchomienie unikatowego kierunku studiów Informatyka Stosowana odpowiedzią na zapotrzebowanie rynku pracy” jest współfinansowane ze środków Unii Europejskiej w ramach Europejskiego Funduszu Społecznego.)
Table of contents

Introduction

THEORETICAL FOUNDATIONS OF PROGRAMMING
1.1. Basics of algorithmics
1.2. The history of algorithms and the concept of recursion
1.3. The concept of a Turing machine
1.4. A finite automaton
1.5. Automaton with variable states – a sequential system
1.6. Formal languages and grammars
1.7. Automata and grammars
1.8. A few additional comments on grammars and languages
1.9. Properties of Turing machines
1.10. An example of the construction of a Turing machine and its operation
1.11. Bibliography

PRACTICE OF IMPERATIVE STRUCTURED PROGRAMMING
2.1. Programming paradigms
2.2. Java programming language
  2.2.1. Program development
  2.2.2. Compiling and running the program
  2.2.3. Runtime environment
  2.2.4. Conventions applied
  2.2.5. Input and output
2.3. Data types, variables and operators
  2.3.1. Data types
  2.3.2. Literals
  2.3.3. Constants and variables
  2.3.4. Operators
  2.3.5. Type conversion and casting
  2.3.6. Character strings
  2.3.7. Arrays
  2.3.8. Enumerated type
2.4. Controlling the flow of execution of the program
  2.4.1. Statement block
  2.4.2. Conditional statement
  2.4.3. Multiple choice statement
  2.4.4. Indefinite loops
  2.4.5. Definite loops
  2.4.6. Interrupting the control statement
2.5. Program decomposition
  2.5.1. Primitive and class type variables
  2.5.2. Creating and deleting objects
  2.5.3. Creating and calling methods
  2.5.4. Using library classes
  2.5.5. Wrapper classes
2.6. Questions

OBJECT-ORIENTED PROGRAMMING PARADIGM
3.1. Basic concepts
  3.1.1. Definition of a class
  3.1.2. Creating objects
  3.1.3. Components of a class
3.2. Composition and inheritance
  3.2.1. Composition
  3.2.2. Inheritance
  3.2.3. Polymorphism
  3.2.4. Packages
  3.2.5. Additional concepts
3.3. Interfaces and abstract classes
  3.3.1. Interfaces
  3.3.2. Abstract classes and methods
  3.3.3. Final classes and methods
3.4. Streams, files and error handling
  3.4.1. Exceptions and their structure
  3.4.2. Handling exceptional situations
  3.4.3. Data streams
  3.4.4. Files and directories
3.5. Graphical User Interface
  3.5.1. Main application window
  3.5.2. Containers and components
  3.5.3. Event handling
  3.5.4. Graphic elements
3.6. Applets
3.7. Questions
3.8. Bibliography

PROGRAMMING OF MOBILE DEVICES
4.1. Rules for creating mobile sites
  4.1.1. Definition and content of a Webpage
  4.1.2. Creating navigation and references
  4.1.3. Interaction with the user
  4.1.4. Testing the Website
4.2. Introduction to the development of mobile applications
4.3. User interface
  4.3.1. High-level interface
  4.3.2. Program information
  4.3.3. Graphical User Interface
  4.3.4. Uniform user interface
4.4. Methods of communication with the surroundings
  4.4.1. Establishing and closing connections
  4.4.2. Multithreading
  4.4.3. Practical implementation of communication
  4.4.4. Sending and receiving text and binary messages
  4.4.5. Sending and receiving multimedia messages
  4.4.6. Programmed message receipt
4.5. Saving and loading data
  4.5.1. Record Management System
  4.5.2. Practical use of the RMS capabilities
4.6. Questions
4.7. Bibliography

ALGORITHMS AND DATA STRUCTURES
5.1. Complexity of algorithms
  5.1.1. Complexity classes
  5.1.2. Determining the computational complexity of an algorithm
5.2. Recursion and its applications
5.3. Array algorithms
  5.3.1. Inversion of elements in a vector
  5.3.2. Sieve of Eratosthenes
  5.3.3. Selection sort
  5.3.4. Insertion sort
  5.3.5. Bubble sort
  5.3.6. Quick sort
  5.3.7. Merge sort
5.4. Combinatorial algorithms
  5.4.1. Combinations without repetition
  5.4.2. Permutations without repetition
5.5. Dynamic structures
  5.5.1. Sets and their implementation
  5.5.2. List structures
  5.5.3. An associative array
  5.5.4. Trees
  5.5.5. Graphs
5.6. Bibliography

List of tables
List of pictures
Introduction
Since the inception of computers (or, more broadly, calculating machines), their creators have aimed to solve problems or accelerate processes whose “traditional” implementation would take much longer and require considerable resources. However, to make this possible, the thought that it could be done had to come first. This thought had to be turned into an idea of how to achieve it (an algorithm). The idea had to be presented in a form that could be converted into commands for a computer (the source code in a programming language) and then translated into a set of machine statements that a computer could execute (an executable program). Unfortunately, this process, although summarised in the three sentences above, is not simple. Each stage requires special knowledge and skills.

This textbook, Programming Methods and Tools, is a collective attempt to look at the process of developing computer programs. There are many publications available on the market describing algorithms, data structures and programming languages, but the intention of the authors is to present the entire issue in one textbook: from the history of algorithms, through their analysis, to object-oriented programming and programming of mobile devices. Such an approach was not easy. Among the wealth of issues related to creating algorithms, building and using data structures and writing programs, the authors had to select those whose knowledge is essential to understanding the entire subject, while allowing readers to gain knowledge that is sufficient for effective programming.

The textbook consists of five chapters. The first chapter, Theoretical Foundations of Programming, contains basic information about algorithms. It presents the most important definitions, a brief history of algorithms, their properties and methods of representation. Next, the concept of a Turing machine is presented – a machine which is an abstract model of the calculation process. The chapter also discusses issues related to formal languages and grammars. Finally, the construction and operation of a sample Turing machine is presented.

Because the Java Standard Edition programming language is used as a tool to illustrate the issues presented in the later sections, the beginning of the second chapter, Practice of Imperative Structured Programming, discusses the working environment and rules for preparing applications in that language. The reader will have the opportunity to become familiar with the basic concepts, a set of tools and guidelines for developing a program (writing, compiling, and running) using an IDE and the command line. Subsequent sections of the chapter discuss issues related to the definition of data types, the declaration of variables and constants, and the performance of operations using the available set of operators. The section on controlling the flow of a program presents ways of executing program statements repeatedly and discusses the conditional execution and termination of a statement block contained in the program.

The third chapter, Object-Oriented Programming Paradigm, analyses issues that are the basis for programming in any object-oriented language. It includes the definition of a class, a discussion of objects and methods, and demonstrates the rules for their creation. Java SE is again used as a tool to help illustrate the presented issues. It serves as an example to discuss, among others, simple variable types, class variable types, creating and removing objects, creating and calling methods, access modifiers and method overloading, and the use of library classes. The language features which characterise object-oriented programming are discussed in the latter part of the chapter, in particular inheritance, polymorphism, and encapsulation. The following sections deal with the creation of abstract classes and the use of interfaces, as well as the communication of the application with the environment (input-output operations and error handling). Matters related to creating a graphical user interface are presented at the end of the chapter.

The fourth chapter, Programming of Mobile Devices, was inspired by the observation of phenomena that take place in our everyday lives. In recent years, we have witnessed the rapid development of wireless technology for Internet access. This creates new opportunities both in entertainment and in business. It should be noted, however, that the expansion of mobile infrastructure makes it necessary to design and create relevant applications, which are often different from “normal” computer programs. This chapter contains information on the principles of creating applications in Java Micro Edition. The chapter presents the low-level and high-level user interface, which has a major impact on the shape of the created applications.

The fifth chapter, Algorithms and Data Structures, begins with a discussion of the computational complexity of algorithms. Following that, recursion is discussed – one of the most important mechanisms used in programming, yet very difficult for novice programmers. After presenting its definition and discussing its application, selected recursive algorithms are presented (the determination of Fibonacci numbers, the determination of the factorial, and the Tower of Hanoi). In the following sections of the chapter, single- and multi-dimensional arrays are presented, and the main groups of algorithms operating on these types of structures are characterised (including sorting algorithms and algorithms for determining combinations and permutations). Dynamic data structures, such as stacks, queues, sets, lists, associative arrays, trees and graphs, are presented at the end of the chapter.

Although many issues had to be omitted because of the limited size of the textbook, we hope that this publication will prove helpful when learning how to program and will make it easier to understand the rules for developing algorithms useful in solving various problems.
1
Theoretical Foundations of Programming
Ryszard Tadeusiewicz
1.1. BASICS OF ALGORITHMICS

The programming process consists of two components. The first is the development of an algorithm, i.e. creating a method of solving a given problem – a method which a computer can use. The second is describing this algorithm in a chosen programming language: designing the data structures to be used, finding the right libraries of ready-to-use subroutines or classes that can be built into the program, compiling, linking, implementing and running (debugging¹ and testing) the program, and finally putting it into operation along with appropriate training of its users.

This section discusses the first of these components. More specifically, it presents basic information about algorithmics, the field of knowledge about algorithms and their general characteristics. It provides a theoretical introduction to the specific matters covered in the Algorithms and Data Structures chapter, where the theory presented here is supplemented with practical information that forms the basis for the professional conduct of a programmer creating a given program.

Before we start our discussion of algorithmics, as defined, e.g., in the classic book (Harel, 1992), we must try, at least preliminarily, to determine what an algorithm is. It should be emphasised, however, that at this point it will only be a preliminary definition, a sample of what is discussed more fully in the chapter Algorithms and Data Structures.

¹ The term “debugging” means searching for and removing the technical errors that almost always exist in newly written code. Debugging differs from program testing in the level of detail at which the program is checked. The test stage consists of checking how the person using its services sees the program. The debugging stage consists of checking how the program is seen by the computer performing all the actions foreseen in the program. Debugging precedes testing, because before the program is shown to the user, who will evaluate it and request any corrections, we must find out whether the program works at all, which is generally not that obvious. Hence, debugging detects and removes all technical faults that hinder the program at its most basic technical level. For this purpose, special programs (debuggers) are normally used. They act as a kind of “working wrapper” for the program, allowing it to run safely even if it contains errors that could have catastrophic consequences for the computer, and allowing precise observation and control of its actions, including the execution of the tested program step by step. During debugging, there is usually a need to put special “traps” (breakpoints) in the tested program, as well as to view changes in the values of specific variables and the state of some technical elements of the computer (the contents of microprocessor registers, signals transmitted on the bus, messages coming from the peripheral devices, etc.).
For the purpose of this discussion, we shall be content with the definition quoted in the course book of the Academy of Economics (the predecessor of today’s University of Economics) issued in 1974 (Kulik, Tadeusiewicz, 1974):

(…) an algorithm is a set of descriptions of simple steps and strict rules for their execution.

Taking into account its use in calculation techniques², this definition is refined in the same textbook (published in 1974!) (p. 234) in the following way:

(…) an algorithm is a specific rule defining the computational process and leading from input data, which show variability, to the required result.

However, given the more general definition quoted below (from the same book), which specifies an algorithm as

(…) a set of rules, statements, descriptions of successive operations or constraints that determine the sequence of execution of individual operations in order to obtain a certain result.

it was stated therein that:

(…) both a mathematical formula, a manufacturing recipe or an operator manual of some device, as well as, for example, development programme of an organism contained in the genetic code can be regarded as an algorithm³.

² The term “computer science” did not exist back in those days, so the term “calculation techniques” was used.

³ This was written exactly thirty years before the genetic codes of various organisms, including that of humans, were read.

The above definitions and terms, formulated some time ago but still applicable, define an algorithm as a recipe for proceeding which should lead to the achievement of a particular purpose. However, not every recipe can be called an algorithm. To be called an algorithm, a recipe has to meet four requirements (Tadeusiewicz, 1998).

The first one is the requirement of uniqueness. This requirement stems from the fact that we can use different ways of defining algorithms and different ways of expressing them. Examples of these methods will be extensively described in further sections of this course book, because different programming languages are simply different ways of expressing algorithms. However, we must ensure that, regardless of who or what executes the algorithm (that is, regardless of whether the algorithm directs the action of a person or a computer), the result of the action is exactly the same. This means that the description of the actions to be carried out during the execution of an algorithm cannot contain any ambiguities or implicit elements (“because it goes without saying”). Also, the rules determining the order of these operations, and the conditions for any change to this order, must be defined in a way that leaves no room for arbitrary interpretation or discretionary decisions. The executor of an algorithm must be in a situation resembling a tram on tracks and not a car that freely selects its route.

Let us consider an example. If there is a rule stating that a student will be given a scholarship for academic performance if their average grade in the previous academic year is greater than or equal to 4.8, a rigid algorithm is at work here. Regardless of who makes the decision, those who meet the specified condition will be awarded the scholarship, and those who do not meet it have no chance of a positive decision, because the formulated criterion is unequivocal (Figure 1).
Figure 1. Unequivocality of a criterion guarantees a fair award of scholarships
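The scholarship rule above can be written down as a predicate whose answer depends only on the average grade, never on who evaluates it. The following is a minimal sketch of that idea; the class name `ScholarshipRule`, the method `qualifies` and the constant `MIN_AVERAGE` are illustrative assumptions, not names used in the textbook.

```java
// Minimal sketch: the scholarship criterion as an unambiguous rule.
// All names here are hypothetical and chosen only for illustration.
class ScholarshipRule {

    // Threshold taken from the example in the text.
    static final double MIN_AVERAGE = 4.8;

    // Returns true exactly when the average grade is greater than or equal to 4.8.
    static boolean qualifies(double averageGrade) {
        return averageGrade >= MIN_AVERAGE;
    }

    public static void main(String[] args) {
        double[] averages = {4.79, 4.8, 5.0};
        for (double average : averages) {
            System.out.println(average + " -> " + qualifies(average));
        }
    }
}
```

Whoever runs this check, a person with a list of grades or a computer, the same input leads to the same decision, which is what the requirement of uniqueness asks for.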
If, on the other hand, we imagine a situation (an abstract one, because a situation like this does not happen, but we can still imagine it) in which social scholarships are awarded based on a vague and ambiguous criterion stating that they should be awarded to students from low-income families, then the decision becomes unpredictable. In this case, it is highly likely that one of two students in exactly the same financial situation will receive a scholarship and the other will not (Figure 2). Of course, such a distribution of benefits has nothing to do with an algorithm!
Figure 2. Lack of unequivocality of criteria leads to unfair decisions
The second requirement is related to the discrete nature of an algorithm. The word “discrete” in this context should be regarded as the opposite of the term “continuous”. A rule can aspire to be an algorithm if it describes the required actions as a sequence of consecutive basic steps. A “basic step” means a simple task whose execution does not require any additional explanation. When creating an algorithm, we must therefore choose which basic steps should be performed in what order, and under what conditions we should choose these and not other actions, but we do not need to explain the actions themselves. Once some of them have been indicated and selected, it is obvious what we should do. The discrete nature of an algorithm may in some cases be in conflict with the continuous nature of the world in which we live. Let us consider a very simple example. Imagine a vessel filled with water (or another fluid) with a hole near the bottom (Figure 3).
Figure 3. An object used as an example for the discussion of the difference between continuous reality and a discrete algorithmic description
Initially, the hole is plugged and the water level in the vessel is at H0. When the hole is unplugged, water flows out and its level in the vessel drops. Because the water flows out under the pressure exerted by the column of water in the vessel, the speed of the outflow is proportional to this pressure, and the pressure is proportional to the water level H. Water leaving the vessel causes the level H to decrease. The rate of this decrease, expressed using the derivative dH/dt, is proportional to H. We can therefore write the equation:

$$\frac{dH}{dt} = -kH \qquad (1)$$

which expresses this relationship, where k is the coefficient of the rate of water leaving the vessel (the larger the hole through which water flows, the greater the coefficient). The solution of equation (1), showing how the water level in the vessel will change over time, is known and has the following form:

$$H(t) = H_0 e^{-kt} \qquad (2)$$

It is a continuous function whose graph is shown in Figure 4. The course of this function accurately represents the situation known from everyday experience: the level in the vessel decreases, and this decrease is initially rapid and becomes slower as less water remains.
Figure 4. The course of the actual process taking place in the object in Figure 3 is continuous by its very nature
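The text states that the solution of equation (1) is known. As a short sketch of where it comes from, assume that equation (1) reads dH/dt = -kH (a rate of decrease proportional to H, as described above) with the initial level H(0) = H_0. Separating the variables and integrating then gives the exponential form:

```latex
\begin{align*}
\frac{dH}{dt} &= -kH, \qquad H(0) = H_0 \\
\frac{dH}{H} &= -k\,dt \\
\ln H &= -kt + C \\
H(t) &= e^{C} e^{-kt} = H_0 e^{-kt}
\end{align*}
```

The constant C is fixed by the initial condition: setting t = 0 gives e^C = H_0. The exponential decay matches the description of Figure 4, a rapid initial drop that slows down as the level falls.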
The course of the actual process is continuous by its very nature, that is, at any time t it is possible to specify a particular value of the water level H(t). However, if we want to describe this process in the form of an algorithm (for example, if we want the computer to simulate the flow of water), we cannot refer to any continuous process; everything must be done in a discrete manner. Therefore, a discrete time scale has to be introduced: instead of the continuous variable t, which represents time in physics, we introduce discrete numbers of time moments n, separated from each other by a finite time step (time interval) Δ, and discrete values of the water level H(n) at precisely these time moments. We decide that we will consider time moments over a fixed range: n = 1, 2, …, N.

In addition, let us agree that the character “=” in the representation of the algorithm is not an equality sign in the mathematical sense, but a command to calculate the value appearing on the right side of the character and to assign this value to the variable on its left side. The scheme of this operation (generally known in computer science as an assignment statement) is as follows:

variable_to_which_the_value_will_be_assigned = formula_calculating_the_value    (3)

Let us examine the scheme presented above, as a thorough understanding of it will prevent us from being surprised by a statement such as:

n = n + 1    (4)
If the above statement were considered a mathematical equation, it would be a contradiction, because whatever value we assigned to the variable n, it is impossible for this value to equal itself … plus one! However, if we take into account the interpretation given in formula (3), the statement in formula (4) becomes clear: we simply take the current value of the variable n, increase it by 1, and use the result as the new value of the variable n.

After these explanations, the algorithm we could use looks like this:

1. Set n = 1 and H(n) = H(1) = H0.
2. Calculate the decrement in water level U(n) caused by the water that flows out of the vessel during the time interval Δ between time moment n and time moment n+1. The decrement U(n) can be calculated using the formula: U(n) = H(n) · k · Δ
3. Calculate the water level at time moment n+1 by subtracting the decrement U(n), which occurred between time moment n and time moment n+1, from the level H(n), using the formula: H(n+1) = H(n) - U(n)
4. Change the number of the time moment under consideration, that is, increase the variable n by one: n = n + 1
5. If n < N, repeat the above steps starting from step 2.
6. This point is reached when n attains the assumed value N. This means that the calculations are complete and the algorithm stops.

As a result of executing this algorithm, we obtain a series of discrete values giving the water level in the vessel at selected discrete time moments (Figure 5), which, however, differ quite significantly from the continuous course illustrating the real course of events (see Figure 4).
Figure 5. Set of discrete values that approximate, using the algorithm, the course of emptying the vessel
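The six steps of the algorithm can be tried out directly. The following is only a minimal sketch in Python, not the authors' program: the function name simulate_water_level and the sample values of H0, k and Δ are assumptions chosen for illustration, and N is assumed to be at least 2.

```python
def simulate_water_level(h0, k, delta, n_steps):
    """Return the discrete water levels [H(1), H(2), ..., H(N)]."""
    levels = [h0]                      # step 1: n = 1, H(1) = H0
    n = 1
    while n < n_steps:                 # step 5: repeat as long as n < N
        u = levels[-1] * k * delta     # step 2: U(n) = H(n) * k * delta
        levels.append(levels[-1] - u)  # step 3: H(n+1) = H(n) - U(n)
        n = n + 1                      # step 4: assignment, not an equation
    return levels                      # step 6: n has reached N, so we stop


# Hypothetical sample data: H0 = 1.0, k = 0.5, delta = 1.0, N = 4
print(simulate_water_level(1.0, 0.5, 1.0, 4))
```

Note that the statement n = n + 1 appears here exactly in the sense of formula (4): the old value of n is read, increased by one, and stored back in n.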
The discrete nature of the algorithm does not have to be a significant obstacle to obtaining practically useful results. By interpolating the discrete points obtained using algorithmic simulation, we can plot the course (shown as a dotted line in Figure 6), which approximates the continuous reality quite well. However, this fundamental distinction between a continuous (Figure 4) and a discrete (Figure 5) representation of the same phenomena and processes must be kept in mind in any attempt to use an algorithm.
Figure 6. Reconstruction of the approximation of a continuous course from the series of discrete values using interpolation
The third requirement which a recipe has to meet in order to be regarded as an algorithm concerns its finite character. Finite means guaranteeing that the execution of a recipe is completed after a finite number of steps. For example, the finite character of the algorithm discussed above was guaranteed by giving the maximum value of the variable n (step number), whose achievement (or exceeding) resulted in stopping the calculation. In the general case, it may be impossible to determine this limiting number of steps in advance. Moreover, for many algorithms, a very large number of steps may be needed before the algorithm is completed. However, an algorithm must be constructed so as to exclude the possibility that subsequent steps will be carried out infinitely. Such a situation, where an algorithm counts and counts and counts and cannot finish, is always associated with an incorrect construction of the algorithm, and such never-ending action is called the looping of an algorithm. A humorous illustration of the dangers of waiting for the result of calculations after an algorithm has looped is shown in Figure 7. This is obviously a joke, but it shows clearly why we believe that a good algorithm must not loop, regardless of what happens.
Figure 7. Humorous commentary on the effects of the lack of finite character of an algorithm Source: http://1.bp.blogspot.com/__vUx979RBaY/TKT1gMAe_hI/AAAAAAAAAjs/zK-JP5aFy_I/s1600/dead-at-computer.jpg [retrieved 2011-04-14]
One important fact should be noted here. The purpose to be fulfilled by an algorithm can sometimes be unattainable. The creator of the algorithm must therefore be prepared for both success and failure. If success is achieved, the very fact of obtaining the desired final result stops every correctly built algorithm, because obviously there is no need for further action once a solution is found. The case of failure is different. Sometimes we can quickly find out that a solution is not achievable by checking some criterion in the algorithm. When solving a quadratic equation of the following form:
$a x^2 + b x + c = 0$ (5)
if it is determined that the discriminant of the equation is negative:
$b^2 - 4 a c < 0$ (6)
we know that there are no solutions of this equation in the domain of real numbers. (The solutions are a pair of complex conjugate roots, but such a solution is not always accepted.) So, checking the sign of the discriminant allows the algorithm to be stopped with a failure signal, and the algorithm is still finite, although it does not provide the desired solutions. However, failure more often manifests itself in the fact that the executor of an algorithm (let us say, a computer) repeatedly attempts to find a solution, and the result is still negative. There should be an “emergency brake” built into the structure of the algorithm, which works when all other methods of ending its operation fail. The algorithm then makes an emergency stop, signalling failure, but it does not run in a fatal loop.
The last requirement we set when we want to check whether a given recipe should be regarded as an algorithm is the requirement of generality. An algorithm should allow a certain class of problems to be solved or a certain group of goals to be achieved, and not just one problem or one goal. The easiest way to explain this is with an example. The following mathematical formula fulfils three of the four conditions that are associated with the concept of an algorithm:
X = 2 + 3 (7)
This formula is unequivocal (regardless of who uses it, they will always get the same result); it is also discrete (it is necessary to take two elementary steps: add two numbers and assign the resulting value to the variable X), and it is finite (we will certainly obtain the result after a finite time). However, this formula cannot be accepted as an algorithm, because it solves only one pre-defined problem. Whatever we do, the result is always the same: the variable X will be given a value of 5. However, the following representation, very similar to formula (7):
X = A + B (8)
is a proper algorithm, because it can solve a whole CLASS of tasks by substituting different values for the variables A and B.
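Both ways of ending an algorithm with a failure signal, discussed above in connection with the quadratic equation (5) and the “emergency brake”, can be illustrated with a short sketch in Python. The function names, the use of None as the failure signal and the search over non-negative integers are assumptions made only for this illustration; the coefficient a is assumed to be non-zero.

```python
import math


def solve_quadratic(a, b, c):
    """Return the real roots of a*x^2 + b*x + c = 0, or None as a failure signal."""
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return None                    # no real solutions: stop with failure
    root = math.sqrt(discriminant)
    return ((-b - root) / (2 * a), (-b + root) / (2 * a))


def find_integer_root(f, max_steps):
    """Try x = 0, 1, 2, ... and give up after max_steps attempts."""
    for x in range(max_steps):         # the "emergency brake": a hard step limit
        if f(x) == 0:
            return x                   # success stops the algorithm
    return None                        # emergency stop, signalling failure


print(solve_quadratic(1, -3, 2))                      # two real roots
print(solve_quadratic(1, 0, 1))                       # negative discriminant
print(find_integer_root(lambda x: x * x - 49, 100))   # found before the limit
print(find_integer_root(lambda x: x * x + 1, 100))    # never found, brake works
```

In both functions the parameters play the role of A and B in formula (8): the same recipe serves a whole class of tasks, and in every case it ends after a finite number of steps.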
1.2. THE HISTORY OF ALGORITHMS AND THE CONCEPT OF RECURSION
Like many other achievements of civilisation and culture, algorithms were invented in ancient Greece, reaching Europe again after the “Dark Ages” thanks to Arab scholars. Among Greek mathematicians, Euclid is best known as the inventor and creator of algorithms. Although some historians argue that Eudoxus of Cnidus was the real creator of the concept of algorithms, most books on algorithms (Harel, 1992) connect this concept with the work of Euclid. His name in the original Greek was written as Εὐκλείδης, which is worth quoting in case anyone reading this book wants to place it somewhere prominent on their desktop. And this would be worthwhile, because statues should be built of this man for his memorable work. The algorithm for finding the greatest common divisor of two numbers that he described in a work entitled Elements (original Στοιχεῖα, Stoicheia) has been quoted for over two thousand years as a model algorithmic solution to a really non-trivial problem. This algorithm refers to recursion, which is a really respectable achievement of its author, because recursion is a difficult concept, and even today’s programmers are sometimes baffled by it. Recursion occurs when, in solving a given problem using an algorithm, a reference to the algorithm itself is placed inside this algorithm and is viewed as a ready-to-use part of the solution. The essence of recursion is very well shown in the picture of a Droste cocoa box found in Wikipedia, shown in Figure 8. The picture shows a nurse carrying … a Droste cocoa box on a tray. The box carried by the nurse also shows the same nurse carrying Droste cocoa on a tray, etc. In theory, there could be an infinite number of such “nestings”; in practice, however, the imperfection of printing rather quickly puts a stop to the subsequent images of the nurse on the box.
In the Euclidean algorithm, recursion refers to a mathematical problem whose goal is to find the greatest common divisor (GCD) of two numbers. Let us denote these two numbers A and B and assume that A > B.
1. Divide A by B; we are only interested in the remainder of the division. The remainder is denoted C.
2. Replace A with the number B, and B with the number C.
3. If B = 0, then the GCD searched for equals A; otherwise go to step 1.
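The three steps above can be written down almost literally. The sketch below, in Python, shows them twice: once as a loop that follows the numbered steps, and once in a recursive form, where the function refers to itself as a ready-to-use part of the solution. The function names are our own, and both versions assume positive integers.

```python
def gcd_iterative(a, b):
    """Follow steps 1-3 literally, using a loop."""
    while True:
        c = a % b          # step 1: only the remainder matters
        a, b = b, c        # step 2: A takes the value of B, B the value of C
        if b == 0:         # step 3: finished when B = 0
            return a


def gcd_recursive(a, b):
    """The same idea expressed with recursion."""
    if b == 0:
        return a
    # The GCD of (A, B) is the GCD of (B, remainder of A divided by B).
    return gcd_recursive(b, a % b)


print(gcd_iterative(48, 18))
print(gcd_recursive(48, 18))
```

For A = 48 and B = 18, the successive pairs are (18, 12), (12, 6) and (6, 0), so both versions return 6.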
Figure 8. A drawing using the principle of recursion. Explanation in the text Source: http://upload.wikimedia.org/wikipedia/commons/6/62/Droste.jpg (retrieved 2011-04-14)
The intellectual achievements of antiquity (of the ancient Greeks in particular) were forgotten during the Middle Ages. Therefore, the concept of an algorithm as such, and in particular the Euclidean algorithm (which is indeed only an example), would be unknown if not for the Arabs, who had taken over the Hellenistic heritage and developed it for many centuries in isolation from the scientific trends occurring in Europe. The Persian mathematician Muhammad ibn Mūsā al-Khwārizmī deserves particular credit for the preservation and development of algorithms. He lived in the ninth century in Baghdad, contributed to the development of the decimal numeration system, and laid the foundations of algebra and trigonometry. It was from his mangled last name that algorithms, and the body of knowledge about them, called algorithmics, have taken their names.
1.3. THE CONCEPT OF A TURING MACHINE
In describing algorithms above, we have repeatedly emphasised that they make sense only if an executor can be found who will execute everything that has been stored in the algorithm.
But who could this executor be? Executing algorithms by humans is theoretically possible, but very inconvenient. Using a computer as an executor of the algorithms under consideration is impractical, because each computer has a lot of complicated technical details in its structure, and exploring and analysing them just to see how one algorithm or another works is, again, very impractical. Thus, it is appropriate to use a formal model, simplified enough so that its own characteristics do not distract from the operation and properties of the analysed algorithms, but efficient enough to be able to implement any algorithm. The so-called Turing machine, which is an abstract model of the calculation process, is a well-known and respected model of this kind. A Turing machine (Figure 9) consists of a so-called finite automaton A (which is a simplified model of the microprocessor operating in each computer) and a memory in the form of a tape divided into frames, on which a special head G writes and reads selected symbols $s_i$, at the same time moving to a neighbouring frame (one position to the left or right) or staying in the same place (if the algorithm requires just that at the given point). The tape serves as a model of a computer’s mass storage, whereas the feature distinguishing the Turing machine from a real computer is the fact that the tape in a Turing machine has an unlimited length.
Figure 9. Diagram of a Turing machine
The symbols $s_i$ on the tape may have various meanings, whereas their set Σ, called the alphabet of the Turing machine, is arbitrarily defined by the creator of the machine as needed:
$\Sigma = \langle s_1, s_2, \ldots, s_n \rangle$ (9)
Because of the similarity to the operation of the digital components included in a typical computer, binary Turing machines are sometimes considered, where the set of possible symbols consists entirely of zeros and ones, $\Sigma = \langle 0, 1 \rangle$. Such a Turing machine is discussed in the famous book by Roger Penrose, The Emperor’s New Mind. However, fairly long strings of symbols are needed to enable sensible operation of a binary Turing machine, so in this chapter we shall use richer sets of symbols and consider the operation of sample Turing machines using shorter strings of more meaningful symbols.
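To make the description of the tape, the head and the finite automaton more tangible, here is a minimal sketch of a Turing machine simulator in Python. Everything specific in it is an assumption made for illustration only: the blank symbol "_", the state names, the rule table INVERT_RULES (a hypothetical machine that replaces zeros with ones and ones with zeros) and the step limit max_steps, which plays the role of an “emergency brake”.

```python
BLANK = "_"  # hypothetical symbol standing for an empty frame

# (current state, symbol read) -> (symbol written, head move, new state)
INVERT_RULES = {
    ("invert", "0"): ("1", "R", "invert"),
    ("invert", "1"): ("0", "R", "invert"),
    ("invert", BLANK): (BLANK, "N", "halt"),
}


def run_turing_machine(rules, tape_text, state="invert",
                       halt_state="halt", max_steps=1000):
    """Run the machine and return (tape contents, final state)."""
    # The tape is a dictionary, so it can grow without limit in both directions.
    tape = dict(enumerate(tape_text))
    head = 0
    for _ in range(max_steps):
        if state == halt_state:
            break
        symbol = tape.get(head, BLANK)                 # the head reads a frame
        written, move, state = rules[(state, symbol)]
        tape[head] = written                           # the head writes a symbol
        head += {"L": -1, "R": 1, "N": 0}[move]        # left, right or stay
    if not tape:
        return "", state
    frames = range(min(tape), max(tape) + 1)
    return "".join(tape.get(i, BLANK) for i in frames).strip(BLANK), state


print(run_turing_machine(INVERT_RULES, "1011"))
```

The dictionary of rules corresponds to the finite automaton A, and the dictionary tape corresponds to the tape of unlimited length; a different rule table gives a different machine without changing the simulator.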
1.4. A FINITE AUTOMATON
As mentioned above, the automaton A shown in Figure 9 is the “heart” of the Turing machine. An automaton is in general a theoretical construct, but in technical practice a number of practical implementations of automata are used, because they are highly useful for various purposes. In the general case, an automaton has an input and an output. Signals appear on the automaton’s input (informing the automaton about the state of its surroundings), as well as on its output, where they are the replies of the automaton to the action of the environment. In general, input and output signals can be arbitrary, but the automata used in a Turing machine (and in many other applications) have input and output signals in the form of symbols from specific alphabets. Since each alphabet consists of a FINITE number of symbols, automata using such alphabets are called finite automata. Their theory is the most developed, and their practical applications are the most common. The alphabet of input signals of a finite automaton may be different from the alphabet of output signals, but in our case, where the automaton A is the “heart” of a Turing machine, both input and output characters belong to the same alphabet Σ, because these are the characters read from the tape or written to the tape by the head. The rule binding the input and output signals (characters) is called the output function of an automaton and, for a finite automaton, it can be conveniently stored in the form of a table with as many columns as there are possible input symbols. As an example, let us look at a finite automaton used for encrypting messages. One method of encryption is the principle of substituting, in place of the real characters forming a given message, their counterparts selected in accordance with a given rule (the encryption key). One of the most famous ciphers of this type is called the Caesar cipher.
Its name comes from Julius Caesar, who encrypted his correspondence with Cicero in such a way that instead of writing each letter of the alphabet, he wrote the letter appearing three places further on. There are many letters in the alphabet, and the principle that we want to show is very simple. Therefore, in order to illustrate the operation of a finite automaton executing this type of encryption, it is sufficient to show the table binding the input and output signals (characters) only for the set of digits, and not for all symbols (letters, digits, spaces, punctuation, etc.). The output function of such an automaton can be expressed in a table (Table 1).
Table 1. Output function of a selected finite automaton
| Input symbol  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---------------|---|---|---|---|---|---|---|---|---|---|
| Output symbol | 7 | 8 | 9 | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
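The output function from Table 1 is simply a lookup table, so it can be sketched in a few lines of Python. The names OUTPUT_FUNCTION, encrypt and decrypt are our own; the sample string of digits is hypothetical.

```python
# Output function copied from Table 1: input symbol -> output symbol
OUTPUT_FUNCTION = dict(zip("0123456789", "7890123456"))


def encrypt(digits):
    """Pass each input symbol through the automaton's output function."""
    return "".join(OUTPUT_FUNCTION[d] for d in digits)


def decrypt(digits):
    """The owner of the machine reads the table in the opposite direction."""
    inverse = {out: inp for inp, out in OUTPUT_FUNCTION.items()}
    return "".join(inverse[d] for d in digits)


print(encrypt("1234"))
print(decrypt(encrypt("1234")))
```

The same input symbol always yields the same output symbol here, because the automaton has no internal state.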
The operation of this type of automaton is shown in Figure 10. If a string of digits is input to the automaton, another sequence of digits is output, which misleads an unauthorised person but is easy for the owner of the encryption machine to interpret correctly. Using this type of finite automaton, we could encrypt, for example, a PIN or the telephone number of a friend and be sure that an accidental onlooker will not be able to steal our secret.
Table 2. Output function of a selected finite automaton with internal states
| Internal state ↓ / Input symbol → | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|-----------------------------------|---|---|---|---|---|---|---|---|---|---|
| Monday                            | 7 | 8 | 9 | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
| Tuesday                           | 6 | 7 | 8 | 9 | 0 | 1 | 2 | 3 | 4 | 5 |
| Wednesday                         | 5 | 6 | 7 | 8 | 9 | 0 | 1 | 2 | 3 | 4 |
| Thursday                          | 4 | 5 | 6 | 7 | 8 | 9 | 0 | 1 | 2 | 3 |
| Friday                            | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 | 1 | 2 |
| Saturday                          | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 | 1 |
| Sunday                            | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 |
A finite automaton which always assigns the same output characters to the same input characters is quite a primitive device with limited capabilities. However, we can introduce a more sophisticated design, in which the automaton’s behaviour is modified according to its so-called internal state. For example, imagine an automaton that encrypts the input symbols (in our case, digits) depending on the day of the week on which the encryption takes place. Such a cipher is harder to break, so this procedure can be very reasonable. A finite automaton whose behaviour depends on two pieces of information, the signal at the input and the internal state (here, the day of the week), is a suitable cipher machine. The table describing the output function of such an automaton is shown in Table 2. Input signals (the digits to be encoded) are given in the column headers, internal states (days of the week) are given in the row labels, and the values in the table define the output signals produced by the encryption automaton.
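An output function that depends on both the input symbol and the internal state can be sketched as a table with one row per state. In the Python sketch below, the rows are copied from Table 2; the names ROWS, output_function and encrypt are assumptions made for this illustration.

```python
# One row of Table 2 per internal state; position i holds the output for digit i.
ROWS = {
    "Monday":    "7890123456",
    "Tuesday":   "6789012345",
    "Wednesday": "5678901234",
    "Thursday":  "4567890123",
    "Friday":    "3456789012",
    "Saturday":  "2345678901",
    "Sunday":    "1234567890",
}


def output_function(state, symbol):
    """Output symbol for a given internal state and input symbol."""
    return ROWS[state][int(symbol)]


def encrypt(digits, state):
    """Encrypt a string of digits; the state is imposed from the outside."""
    return "".join(output_function(state, d) for d in digits)


print(encrypt("1234", "Monday"))
print(encrypt("1234", "Thursday"))
print(encrypt("1234", "Sunday"))
```

The same input string produces a different output on each day, which is exactly the effect of the internal state described above.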
1.5. AUTOMATON WITH VARIABLE STATES – A SEQUENTIAL SYSTEM
The example of a finite automaton described in Table 2 has internal states imposed from the outside (by the calendar defining the current day of the week). However, we can build an automaton that changes its internal states by itself. Such an automaton has to have two functions defined: the output function described above, which sets the signal produced by the automaton at the output when it receives a specified input signal while being in a particular state, and the transition function, which specifies the new state to which the automaton will move if it receives a specified input signal while being in a particular state. Note that the transition function depends on the same information as the output function (the new state also depends on the current state and the input signal). This means that both functions can be presented in the same table, if we enter two values in each table field: the first one is the output signal produced by the automaton, and the second one is the new state to which the automaton will move following the action of the transition function. Let us examine the example of a simple finite automaton having three internal states (denoted only by numbers for the sake of simplicity) and using only three symbols at the input and output: A, B and C. Table 3 shows both the output function and the transition function of this automaton.
Table 3. Output function and transition function of a sample automaton
| State ↓ / Input → | A   | B   | C   |
|-------------------|-----|-----|-----|
| 1                 | B,2 | A,3 | C,1 |
| 2                 | C,1 | B,3 | A,2 |
| 3                 | A,3 | C,2 | B,1 |
The definition of an automaton also includes its initial state. It is the internal state in which the automaton begins its operation. In many automata, including computers and other complex digital systems such as GPS receivers or more advanced GSM phones, the initial state is restored after cycling power or after pressing the “reset” button. State number 1 is the initial state in our example and for this reason is indicated in bold in Table 3. We will soon look at how this automaton works. However, let us begin with the following general remark: automata with internal states and defined transition functions are characterised by being able to yield different output signals in response to the same input signals. This results from the fact that prior to the response to a given signal at a given moment of time, the automaton could have wandered through different sequences of previous internal states, and its current operation depends on its history. Therefore, automata of this type are sometimes called automata with a memory or sequential automata, because their behaviour and properties can be described by taking into account the sequences (strings arranged in time) of output signals produced in response to sequences of input signals. Let us also add that “regular” automata, which have no internal states and whose output is completely determined by the signals currently appearing on the input, are called combinational circuits. We will track the operation of a sequential circuit using the example cited above. Imagine that the following sequence of symbols is input to the automaton defined in Table 3:
ABBACACA
Figure 10. The first step in the operation of a sample automaton (sequential circuit); see text
In the beginning, the situation is as shown on the left in Figure 10. The automaton is in state 1 and receives A as an input signal. As can be seen from the items highlighted in the table shown in the centre of the drawing, the symbol B appears on the automaton’s output as the first element of the output sequence, and the state of the automaton changes (to number 2, according to the table). This situation is shown in the diagram of the automaton on the right side of the drawing. The next step is shown in Figure 11. As you can see, the automaton, now in state 2, receives the symbol B on the input, produces the symbol B on the output as a result (as defined in the table), and enters state 3.
Figure 11. Second step in the operation of a sequential circuit
You may want to try the next steps of the operation of the automaton on your own (this requires considerable attention!). As a result, you will see that in answer to the sequence:
ABBACACA
the automaton will produce the following output sequence of symbols:
BBCCCBAC
and it will be in state 1 when ending its operation. As you can see, when considering the behaviour of a sequential circuit (for example, the one described in Table 3), we have to consider the sequence of input signals, in response to which the automaton produces a sequence of output signals. This is done entirely automatically (in line with the colloquial understanding of the word “automaton”). It is important for further considerations that we can associate the principle of converting the input sequence into the output sequence with a certain algorithm, as a result of which we can say that the automaton is the executor of this algorithm. Although every algorithm can be presented in such a form that the essence of its operation is based on converting an input sequence of selected symbols into a sequence of output symbols, not every algorithm can be executed using an automaton, because there are algorithms for which it is impossible to build an automaton. We will discuss a machine that can execute any algorithm a bit further on, but at present, we will briefly discuss the relationships that exist between automata and languages.
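The step-by-step tracing described above can be left to a short program. The following Python sketch stores the output function and the transition function of Table 3 in one dictionary; the names TABLE_3 and run_automaton are our own.

```python
# (state, input symbol) -> (output symbol, new state), copied from Table 3
TABLE_3 = {
    (1, "A"): ("B", 2), (1, "B"): ("A", 3), (1, "C"): ("C", 1),
    (2, "A"): ("C", 1), (2, "B"): ("B", 3), (2, "C"): ("A", 2),
    (3, "A"): ("A", 3), (3, "B"): ("C", 2), (3, "C"): ("B", 1),
}


def run_automaton(inputs, state=1):
    """Feed the input sequence to the automaton, starting in the initial state 1.

    Returns the output sequence and the state in which the automaton ends.
    """
    outputs = []
    for symbol in inputs:
        out, state = TABLE_3[(state, symbol)]   # output and transition together
        outputs.append(out)
    return "".join(outputs), state


print(run_automaton("ABBACACA"))
print(run_automaton("AA"))
```

The second call shows the memory of a sequential automaton at work: the same input symbol A gives B the first time and C the second time, because the state has changed in between.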
1.6. FORMAL LANGUAGES AND GRAMMARS
The sequence of input symbols ABBACACA, examined above when discussing the operation of a sequential machine, can be considered a string in a certain language. Of course, we may be interested in what the string means in this language; this issue is dealt with by a field of science known as semantics. However, in this chapter we are not interested in the content and meaning of strings in different languages, as this depends on the purposes for which the strings are prepared, and is outside the scope of the theoretical foundations of programming. We can also ask, however, whether the string is correctly built in a given language, and this is what we will focus on. The correctness of language structures is dealt with by syntax. It turns out that it is syntax that is most important in computer science, because the entire practice of using artificial languages (the so-called algorithmic languages), which are the basis of modern methods of programming, is based on it. Grammar is the key concept of the syntactic approach to language. Grammars of natural languages were formed during the development of communication practices, when the users of a language began to spontaneously follow certain rules and principles (initially not written down or codified in any way) in order to use the language efficiently and effectively to communicate. With time, these rules and principles were recorded in the form of specific recommendations, which defined how to construct correct sentences in the language, and thus normative grammars were created, specifying the correct way to build sentences and statements in a given language.
Because of the way “natural” grammars emerged, they are usually complicated and not entirely logical, as well as full of exceptions; in a word, they are poorly suited to the world of computers, a fact well known to computer scientists working in the area known as computational linguistics or NLP (natural language processing). The situation is quite different when it comes to artificial languages adapted to communication between humans and machines. For the most part, this means computer programming languages, although currently the domain of languages for communication with robots and other automated systems is also in development. The construction of these artificial languages is accompanied by the construction of their grammars, which are strongly formalised from the beginning. Thanks to this, the translation process, meaning the automatic translation of a program designed to be convenient for humans (that is, one expressed in an algorithmic language which is clear and understandable to the programmer) into the so-called internal code (binary strings controlling the operation of a machine, mainly a microprocessor), can be conducted quickly and efficiently.
The theory of formal grammars and languages is a very broad domain, whose elements can be explored by referring to the titles listed in the bibliography at the end of this chapter. Here, we will only provide the basic facts and the simplest examples. We will describe the so-called formal grammar, which allows us to decide mathematically whether a given sentence⁴ is properly constructed in the language under consideration. By definition, we assume that the empty sentence, denoted by the character ε, belongs (among others) to every language. This sentence contains no characters, so it is seemingly useless in practice, but it is necessary for the closure of various theoretical considerations. From the mathematical point of view, a formal grammar consists of four elements in a fixed order:
G = (Σ, N, S, P) (10)
where:
Σ denotes the set of input symbols (also called terminal symbols). It is not without reason that the same sign as in formula (9) is used here. The strings that could be considered elements of the language under consideration must be composed solely of terminal symbols. This is a necessary condition, but it is not sufficient, because not every juxtaposition of terminal symbols is a valid element of the language, so the grammar has to include additional rules that distinguish correct strings from incorrect ones.
N is the set of nonterminal symbols, also known as ancillary symbols, or simply nonterminals. Nonterminals are used during the generation of a word, but are removed from it before the generation is finished.
S ∈ N is a special initial nonterminal, or axiom. It starts the generation of each word.
P is a finite set of the so-called grammar productions.
The key concept among these is the concept of a production, whose name is somewhat misleading and which must therefore be looked at more carefully. Productions are rules that allow nonterminals to be replaced with certain strings containing (as needed) nonterminals, and possibly also terminal symbols. By using productions, we are guaranteed not to go beyond the language defined by the grammar. So if we start with a string belonging to the language (for example, the axiom) and then use any production from the set P any number of times, the strings that we create will always belong to the language.
⁴ In natural languages, sentences consist of words, and these words are written using specific symbols (letters). In formal languages, including programming languages, it is assumed that the symbols used, even if they are single characters, have the role of words and not of the letters contained in words. As a result, elements of those languages which have a specific meaning but are in practice written using a number of letters, such as keywords or longer variable names, are treated as single indivisible symbols. For example, in the C language, + is a symbol, but ++ is also a single symbol, and so is the word while, as well as name_of_the_variable_storing_data.
Let us consider a basic example. Let the set Σ consist of only two letters: a and b (Σ = {a, b}). This means that the language under consideration will consist solely of strings containing the letters a and b in different combinations, and the appearance of a string that contains even a single different letter will indicate immediately that this string does not belong to the language defined by this grammar. The grammar which we will build is very simple, so we need only one nonterminal. Of course, this will be the axiom S (N = {S}). The production set P will also be very simple. It will contain two rules:
S → aSb (11)
S → ab (12)
The first one is used to build correct expressions in the language considered here, and the second is needed so that the construction of any such expression can be completed. If we have a string of terminal symbols and we want to know whether it is a valid sentence of the language defined by the grammar under consideration, we can use one of two methods. A human, executing such a check “manually”, will most often use the generation method suggested above in the description of the role of the productions P. For example, let us prove that the string aaabbb belongs to the language generated by the grammar considered here. We start with the axiom S. By definition, it belongs to the language. Using production (11) twice and then production (12), we can successively perform the following transformations:
S → aSb → aaSbb → aaabbb (13)
The sequence of transformations shown in formula (13) begins with an element belonging to the language (the axiom of that language), is carried out using only the productions contained in the grammar of the language, and ends with the examined string aaabbb, which proves that the string belongs to the language generated by the grammar considered here.
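The generation method can be imitated by a few lines of Python. The sketch below is only an illustration under our own assumptions: the function name derive and its parameter (how many times production (11) is applied before production (12) closes the derivation) are not part of the grammar itself.

```python
def derive(times_rule_11):
    """Return every string of a derivation that starts with the axiom S.

    Production (11), S -> aSb, is applied the given number of times,
    and then production (12), S -> ab, completes the derivation.
    """
    steps = ["S"]
    for _ in range(times_rule_11):
        steps.append(steps[-1].replace("S", "aSb"))   # production (11)
    steps.append(steps[-1].replace("S", "ab"))        # production (12)
    return steps


print(" -> ".join(derive(2)))
```

With production (11) applied twice, the printed derivation is the one shown in formula (13).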
A similar series of transformations cannot be built if the string (consisting of correct terminal symbols) has an incorrect structure. For example, it is impossible to build a sequence of transformations starting with the axiom of the language and ending with the string aab, which proves that this string is incorrect in the grammar under consideration. The strings abab and bbbaaa are also incorrect. The reader can easily discover the principle for deciding which strings will be treated as correct in this grammar and which as incorrect: a careful look at the structure of production (11) and its operation during the construction of the derivation (13) makes it possible to generalise this rule to strings of any length that the given language can generate. The generation method, shown above as an example, is suitable if the proof of grammatical correctness is carried out by humans. Humans have the ability to choose the most appropriate production based on an intelligent overview of the string that is sought and the string which is currently available (at each step of building the derivation) as the result of starting the proving process from the axiom of the language and using several pre-selected productions. Such behaviour is inconvenient for a machine, because it requires ingenuity and creativity. Therefore, the most commonly used proof of the correctness of strings in a given language (for example, computer programs) uses the reduction method. This method starts with the string whose correctness is to be verified and sequentially goes through the available productions, used in “reverse”: the analysed string is examined for the occurrence of the sequence of symbols on the right side of a production, which is then replaced by the symbol on the left side.
As a rule, productions are constructed so that the strings on the left are significantly shorter than the strings on the right; therefore, the use of a production in the “from right to left” scheme is accompanied by the systematic shortening of the analysed string, hence the name reduction method. If, after successive reductions, the analysed string is reduced to the single axiom of the language (S), we can conclude that the analysed string has been proven correct. However, if such a reduction cannot be done, it proves the incorrectness of the string.
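For the grammar with productions (11) and (12), the reduction method can be sketched in Python as follows. This is a simplified illustration, not a general parser: the names PRODUCTIONS and is_correct are our own, the first applicable production is always chosen (which is sufficient for this particular grammar), and the empty sentence ε is not treated specially.

```python
# (left side, right side) of productions (11) and (12)
PRODUCTIONS = [("S", "aSb"), ("S", "ab")]


def is_correct(string):
    """Check a string of terminal symbols by reducing it towards the axiom S."""
    if any(symbol not in "ab" for symbol in string):
        return False                   # a symbol from outside the alphabet
    current = string
    reduced = True
    while reduced:                     # every reduction shortens the string,
        reduced = False                # so the loop must end
        for left, right in PRODUCTIONS:
            if right in current:
                # Use the production in reverse: right side -> left side.
                current = current.replace(right, left, 1)
                reduced = True
                break
    return current == "S"


for candidate in ["aaabbb", "aab", "abab", "bbbaaa"]:
    print(candidate, is_correct(candidate))
```

For aaabbb, the successive strings are aaSbb, aSb and S, so the string is accepted; for abab, the reduction stops at SS, which is not the single axiom, so the string is rejected.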
1.7. AUTOMATA AND GRAMMARS
One can determine the grammatical correctness or incorrectness of a particular string using an automaton. This relationship between the two areas discussed above, formal languages and grammars on the one hand and finite automata on the other, is the subject of this subsection. The practical value of the information presented here must be emphasised. The use of programming languages, which is now the norm in computer programming, relies on the automatic translation of programs written in a programming language into a computer-executable form. This form consists of a sequence of codes that control the work of the microprocessor and is usually called an executable or binary code. Writing a binary program by hand is certainly possible, but it is very time-consuming, tedious and highly error-prone, hence virtually no one programs this way today. However, translators that convert programs from various algorithmic languages into binary code need to analyse the program text thoroughly in terms of its grammatical correctness. Moreover, it turns out that the productions of the grammar used in a reductive analysis of the program instructions point unequivocally to the activities that the microprocessor has to perform in order to fulfil the requests of the programmer and execute the required algorithm. Therefore, the analysis of grammatical correctness is also the key to producing an executable program, which is of great practical importance.
The part of the translator which checks the grammatical correctness of strings expressed in a particular programming language is called a parser, and a typical implementation of a parser takes the form of a finite automaton. A simple example shows how such a parser can operate. Let us examine a grammar of the following form:
Σ = {a} (14)
N = {S} (15)
S = S (16)
P = {S = aaS, S = aa} (17)
It is easy to see that this grammar defines a language in which the only correct strings are those composed of an even number of repetitions of the symbol a. Therefore, the strings aaaa and aaaaaa are correct, while the string aaa is incorrect.
Table 4. Output function and transition function of an automaton accepting words in the analysed grammar
State ↓ / Input → | a | ε
1 | ?, 2 | ACCEPTANCE, STOP
2 | ?, 1 | REJECTION, STOP
STOP | END, STOP | END, STOP
The analysed grammar is regular, which means that it is possible to build an automaton that examines the correctness of any string in that language. Such an automaton reads the consecutive characters of the analysed string, passing from state to state according to the transition table, until it encounters the empty symbol ε, which stops its operation (the automaton then enters the state denoted STOP, which no input signal can change). State number 1 is the starting state of the automaton. During a transition from one state to another, the automaton may, but does not have to, send output signals (here they are ignored and therefore marked with the symbol ?), but when it reaches the STOP state, it outputs the ACCEPTANCE or REJECTION signal, depending on whether the considered string was a correct “sentence” in the analysed language or not.
As mentioned above, it is not possible to build such an automaton for every grammar. If it is possible, the grammar is called a regular grammar. If it is impossible, the grammar is irregular. For example, the grammar considered in Section 1.6, which generates words of the form a^n b^n, namely those where the symbol a is repeated n times and then the symbol b is repeated n times (i.e., the same number of times!), is an irregular grammar in spite of its simplicity. As we shall see, it is possible to build a system with significantly richer capabilities than a finite automaton. It is the Turing machine, the best-known example of a general system that executes any algorithm, but we will return to it in a moment.
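The automaton of Table 4 can be written down as a short Java sketch. The class and method names are hypothetical, and the end of the input string is assumed to play the role of the empty symbol ε.

```java
class EvenAutomatonSketch {
    /** Returns true if the automaton of Table 4 would output ACCEPTANCE. */
    static boolean accepts(String word) {
        int state = 1;                              // state 1 is the starting state
        for (char symbol : word.toCharArray()) {
            if (symbol != 'a') {
                throw new IllegalArgumentException("symbol outside the alphabet: " + symbol);
            }
            state = (state == 1) ? 2 : 1;           // column a: 1 -> 2, 2 -> 1
        }
        // Column ε: state 1 gives ACCEPTANCE, state 2 gives REJECTION
        return state == 1;
    }
}
```

The automaton needs only a fixed number of states, however long the input is. This is exactly what fails for words of the form a^n b^n, where the number of symbols a read so far would have to be remembered.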
1.8. A FEW ADDITIONAL COMMENTS ON GRAMMARS AND LANGUAGES
The examples of languages analysed in Sections 1.6 and 1.7 are of course trivial; such languages make it possible to show all the important properties of the concepts discussed here without entangling the reader in an excessive number of unnecessary details. But in order not to leave the impression that the theory of formal grammars and languages is concerned only with barren juggling of meaningless symbols, let us now look at a simple example of a grammar which has features of practical usability. Let us analyse a grammar that examines the correctness of simple algebraic expressions, such as
x + y * (z + x) (18)
We will describe this grammar, which defines the language used to distinguish correct words from incorrect ones, using the so-called Backus-Naur Form (denoted in the literature as BNF). BNF was first used in 1960 to describe the grammar of the programming language Algol (from Algorithmic Language) and is still used today to describe programming languages. In BNF notation, elements of the set N (nonterminals) are words written in italics rather than individual letters. This makes their role in the structure of the language easier to understand. Furthermore, large grammars often contain series of productions with the same nonterminal symbol on the left side and various combinations of symbols on the right side, so BNF uses a notation that places all of these alternatives in a single formula. For example, instead of writing
S = a, S = b, S = c (19)
we can write
S = a | b | c (20)
where the symbol | (vertical bar) shall be read as “or”. BNF (and especially its newer version EBNF, Extended Backus-Naur Form) offers a few more clever shortcuts which would also be useful in defining the grammars we need, but their explanation would take too much space and is therefore omitted here.
The analysed grammar will use only seven terminal symbols:
Σ = { x, y, z, +, *, (, ) } (21)
There could be more symbols (we could include a larger number of variables or allow numbers to occur in expressions), but these symbols are sufficient to build an interesting example.
The set N will include nonterminal symbols written, in accordance with BNF, as words:
N = {expression, component, factor} (22)
The axiom of the language will be the nonterminal expression:
S = expression (23)
and finally, most importantly, the productions:
P = {expression = component | expression + component, (24)
component = factor | component * factor, (25)
factor = x | y | z | (expression)} (26)
To demonstrate that the string (18) is built correctly in this grammar, we can create a deduction, which looks like this:
expression = expression + component =
= expression + component * factor =
= expression + component * (expression) =
= expression + component * (expression + component) =
= x + y * (z + x) (27)
The last step combines several applications of productions (24)-(26), which replace the remaining nonterminals with the symbols x, y, z and x.
You can see for yourself that it is impossible to build a proper deduction for expressions such as
x+*y
)z + y(
xy + z
z+
therefore, all these expressions are treated as invalid (and rightly so).
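A common way to check such expressions by program is a recursive-descent parser, with one method per nonterminal. The sketch below is only an illustration under stated assumptions: the left-recursive alternatives of productions (24) and (25) are rewritten as repetition (expression = component { + component }, component = factor { * factor }), which accepts the same strings, and blanks are skipped because they are not terminal symbols. All names are hypothetical.

```java
class ExpressionParserSketch {
    private final String input;
    private int position = 0;

    private ExpressionParserSketch(String text) {
        this.input = text.replace(" ", "");       // blanks are not part of the alphabet
    }

    /** Returns true if the whole text can be derived from the axiom expression. */
    static boolean isValid(String text) {
        ExpressionParserSketch parser = new ExpressionParserSketch(text);
        return parser.expression() && parser.position == parser.input.length();
    }

    // expression = component { + component }
    private boolean expression() {
        if (!component()) return false;
        while (peek() == '+') {
            position++;
            if (!component()) return false;
        }
        return true;
    }

    // component = factor { * factor }
    private boolean component() {
        if (!factor()) return false;
        while (peek() == '*') {
            position++;
            if (!factor()) return false;
        }
        return true;
    }

    // factor = x | y | z | (expression)
    private boolean factor() {
        char symbol = peek();
        if (symbol == 'x' || symbol == 'y' || symbol == 'z') {
            position++;
            return true;
        }
        if (symbol == '(') {
            position++;
            if (!expression()) return false;
            if (peek() != ')') return false;
            position++;
            return true;
        }
        return false;
    }

    // Current symbol, or the NUL character when the input is exhausted
    private char peek() {
        return position < input.length() ? input.charAt(position) : '\0';
    }
}
```

The repetition form is used because a method that called itself first, as the left-recursive production expression = expression + component suggests, would never consume any input.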
1.9. PROPERTIES OF TURING MACHINES
The interest of computer scientists in the design and operation of Turing machines comes from the fact that as early as the 1930s, before the first computers were built, the so-called Church-Turing thesis was formulated, stating that:
The Turing machine can solve any problem for whose solution an algorithm can be formulated.
In other words, according to this thesis, the set of all algorithmic problems can be equated with the set of all solvable problems, which has far-reaching practical and theoretical consequences. Philosophers study the relationship of the Church-Turing thesis with the general theory of the mind, drawing from it various conclusions about what is rationally knowable and what lies outside the possibilities of rational cognition. Physicists also use the Church-Turing thesis when trying to decide whether the universe is cognizable, or whether there will always be aspects of the universe which we cannot get to know despite hundreds of years of research and the continuing development of knowledge about the world. In this textbook, which aims to provide readers with knowledge useful for practical applications of computer science (mainly those associated with economics), we can regard the Church-Turing thesis as a kind of curiosity. However, for anyone who watches the rapid development of computer science and wonders what it will lead to, this thesis is an essential ingredient of such considerations and of the conclusions drawn from them.
Without delving too far into the theory, we can note a few more interesting facts. First, the Church-Turing thesis has never been formally proven. It stands because for more than half a century no one has been able to falsify it, that is, to present a problem for which an algorithm could be built but for which it would be impossible to build a Turing machine executing this algorithm. The second interesting observation is that the Turing machine is a more universal system for solving computing problems than any actually existing computer.
The difference is the availability of memory (tape) with unlimited capacity in the Turing machine, which, in spite of the continuous progress of technology, cannot be implemented in practice in existing computers. No Turing machine has ever used an infinite length of tape (all algorithms are finite, that is, they end after a finite number of steps, and it is impossible to use an infinite length of tape when executing a finite number of steps). Nevertheless, the fact that a real computer can sometimes run out of memory, while a Turing machine never will, proves that the set of computations executable by real computers is narrower than the set of all computations that a Turing machine can execute. One more theoretical aspect (the last one quoted here) is that there are calculations for which no algorithm can be created. The best-known problem of this kind is the Busy Beaver problem, which defines a certain task (generating the longest possible series of ones on the tape of a Turing machine, after which the machine should stop) for which it can be proven that no algorithm solving it in the general case can be built.
Figure 12. Illustration of the Busy Beaver problem
Source: http://www.catonmat.net/blog/wp-content/uploads/2009/10/busy-beaver-turing-machine.jpg, (retrieved 2011-04-17)
These facts are an invitation to become familiar with Turing machines and to gain at least a superficial knowledge of them.
1.10. AN EXAMPLE OF THE CONSTRUCTION OF A TURING MACHINE AND ITS OPERATION
While discussing the automata that can demonstrate the grammatical correctness of specific strings, we had to admit that such an automaton cannot be built for every grammar. In particular, we could not build an automaton accepting words in the grammar described in formulas (11) and (12), hence we had to describe this simple and nice grammar as an irregular one. In contrast, a Turing machine can do what a finite automaton cannot, thus proving its superiority. Let us consider a Turing machine constructed according to the scheme shown in Figure 9, with the automaton defined as shown in Table 5. The rows of this table correspond to the states of the automaton (there are 10 of them). One of these states is denoted A (acceptance); reaching it stops the Turing machine and states that the word written on the tape is correct in the analysed grammar. Another is denoted B (error); it also stops the operation, because at this point it is already known that the word is certainly incorrect. The columns of the table correspond to the input symbols (a, b, or ε, which denotes an empty cell on the tape). The individual cells of the table contain the values of the output function, which take the form of a pair: the symbol that the head will write in the current cell of the tape, and the direction of the head movement (L = one cell to the left, R = one cell to the right, N = leaving the head in the same cell). Of course, each time after reading and writing a symbol on the tape and moving the head, the automaton changes its state; the state which the automaton moves to is shown in the cells of the table as the third symbol of each entry.
Table 5. Output function and transition function of an automaton, which is the main element of the analysed Turing machine
State ↓ / Input → | a | b | ε
0 | aR0 | bR0 | εL1
1 | aNB | εL2 | εL1
2 | aL3 | bL2 | εNB
3 | aL3 | bNB | εR4
4 | εR5 | bNB | εNB
5 | aR5 | bR6 | εL7
6 | aNB | bR6 | εL1
7 | aNB | bNB | εRA
A (acceptance) | aNA | bNA | εNA
B (error) | aNB | bNB | εNB
Let us trace how it works. The machine starts its operation with the automaton in the initial state (denoted 0); the analysed string is on the tape, and the head is positioned somewhere to the right of the first symbol of the analysed string (Figure 13).
Figure 13. Start of the Turing machine’s operation
The first step of the machine’s operation is to check whether it was properly initialised. If there is a symbol (not a blank cell) under the head at the start, this means that the head is not properly located (it is inside the string and not on its right side). The machine then stays in state 0 and moves the head to the right until it finds the first blank cell. When the head finds an empty cell, the machine switches to state 1 and moves to the left until it finds the first (rightmost) character of the string (Figure 14).
Figure 14. The machine finds the first symbol in the string
If the character is an a, the machine stops, moving to state B (error). Otherwise, it deletes the encountered symbol b, goes into state 2 and starts moving to the left (without disturbing the characters b found in the cells) to find the first symbol of the a series (Figure 15). When it finds it, it moves to state 3 and continues searching to the left until it finds an empty cell (Figure 16).
Figure 15. The next step of the operation of a Turing machine
Figure 16. The moment of reversal in the operation of a Turing machine
After finding it, the machine reverses: it begins to move to the right, cutting off the leftmost symbol a on its way. The reader can trace the further operation of the machine, which alternately moves along the string to its right end and then to its left end, systematically “cutting off” the symbols a and b at the ends of the string, until the properly constructed string is completely erased. However, if the numbers of symbols a and b are not equal, at least one symbol a or b is not erased and an error is signalled. The operation of the Turing machine shown here proves that it can perform operations which a simple automaton could not perform. However, it also proves that the design of a Turing machine for a particular application requires considerable ingenuity, and tracing the operation of the machine while it solves this or another problem requires inexhaustible patience. Readers who wish to “practice” playing with this interesting machine should refer to the text on the web page http://goodmath.blogspot.com/2006/03/playing-with-mathematical-machines.html where you can see how a Turing machine can perform calculations in the so-called unary arithmetic.
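Tracing the machine by hand is tedious, so a small simulator may help. The Java sketch below encodes Table 5 directly; it is an illustration only, with hypothetical names. It assumes that the character _ stands for the empty cell ε, that the head starts on the first symbol of the word, and that a step limit (maxSteps) guards against endless runs.

```java
import java.util.HashMap;
import java.util.Map;

class TuringMachineSketch {
    private static final char BLANK = '_';              // stands for the empty cell ε
    private static final String STATES = "01234567AB";  // row order of Table 5
    private static final String SYMBOLS = "ab_";        // column order of Table 5

    // Each entry: symbol to write, head movement (L, R, N), next state.
    private static final String[][] TABLE = {
        {"aR0", "bR0", "_L1"},   // state 0
        {"aNB", "_L2", "_L1"},   // state 1
        {"aL3", "bL2", "_NB"},   // state 2
        {"aL3", "bNB", "_R4"},   // state 3
        {"_R5", "bNB", "_NB"},   // state 4
        {"aR5", "bR6", "_L7"},   // state 5
        {"aNB", "bR6", "_L1"},   // state 6
        {"aNB", "bNB", "_RA"},   // state 7
        {"aNA", "bNA", "_NA"},   // state A (acceptance)
        {"aNB", "bNB", "_NB"},   // state B (error)
    };

    /** Runs the machine on the word and reports whether it stops in state A. */
    static boolean accepts(String word, int maxSteps) {
        if (!word.matches("[ab]*")) {
            throw new IllegalArgumentException("only the symbols a and b are allowed");
        }
        Map<Integer, Character> tape = new HashMap<>();
        for (int i = 0; i < word.length(); i++) {
            tape.put(i, word.charAt(i));
        }
        int head = 0;            // assumption: the head starts on the first symbol
        char state = '0';
        int steps = 0;
        while (state != 'A' && state != 'B') {
            if (steps++ >= maxSteps) {
                throw new IllegalStateException("step limit reached");
            }
            char read = tape.getOrDefault(head, BLANK);
            String entry = TABLE[STATES.indexOf(state)][SYMBOLS.indexOf(read)];
            tape.put(head, entry.charAt(0));              // write the symbol
            if (entry.charAt(1) == 'L') head--;           // move the head
            else if (entry.charAt(1) == 'R') head++;
            state = entry.charAt(2);                      // change the state
        }
        return state == 'A';
    }
}
```

For example, accepts("aabb", 1000) follows the steps described above and ends in state A, while accepts("aab", 1000) ends in state B. With an empty word the table keeps the machine in state 1, moving left over blank cells, so the sketch stops at the step limit instead of running forever.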
1.11. BIBLIOGRAPHY
Aho A.V., Hopcroft J.E., Ullman J.D. [2003], Projektowanie i analiza algorytmów. Helion, Gliwice
Alagić S., Arbib M.A. [1982], Projektowanie programów poprawnych i dobrze zbudowanych. WNT, Warszawa
Ben-Ari M. [2005], Logika matematyczna w informatyce. WNT, Warszawa
Cormen T.H., Leiserson C.E., Rivest R.L. [2001], Wprowadzenie do algorytmów. WNT, Warszawa
Dembiński P., Małuszyński J. [1981], Matematyczne metody definiowania języków programowania. WNT, Warszawa
Harel D. [1992], Rzecz o istocie informatyki – Algorytmika. WNT, Warszawa
Hopcroft J.E., Motwani R., Ullman J.D. [2005], Wprowadzenie do teorii automatów, języków i obliczeń. PWN, Warszawa
Kulik C., Tadeusiewicz R. [1974], Elementy cybernetyki ekonomicznej. Wydawnictwo Akademii Ekonomicznej, Kraków
Tadeusiewicz R., Moszner P., Szydełko A. [1998], Teoretyczne podstawy informatyki. Wydawnictwo Naukowe WSP, Kraków
2
Practice of Imperative Structured Programming
Janusz Stal, Janusz Tuchowski
2.1. PROGRAMMING PARADIGMS
As defined in the PWN Dictionary of Polish, a paradigm is “the accepted view of reality in the given discipline, doctrine”. In programming, this term refers to the style of programming used in a given period of the development of computer science or in certain situations. Each paradigm is characterised by a set of mechanisms used by the programmer to create programs and by the way in which these programs are executed by the computer. In this chapter we characterise the principles of imperative programming and, in the next chapter, those of object-oriented programming.
Imperative programming is a programming method in which the program consists of variables and sequences of statements that modify the values of those variables. This concept is extended by structured programming, which hierarchically divides the statements of the program into blocks, ensuring the clarity of the code, and by procedural programming, which divides the code into smaller fragments carrying out specific tasks. A typical example of a language that follows the imperative paradigm is the machine code of a computer. Subsequent high-level programming languages (Fortran, Pascal, C) eliminated the disadvantages of programming in machine code. Another such language is Java, which, although it primarily implements the object-oriented programming paradigm, contains a number of classic structures related to imperative programming. Therefore, this language is used here as a tool to illustrate the matters discussed.
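The three notions mentioned above can be seen together in one short fragment. The following Java sketch is only an illustration; the class and method names are hypothetical.

```java
class ImperativeSketch {
    // Procedural programming: a fragment of code that carries out one task
    static int sumOfSquares(int limit) {
        int sum = 0;                            // variable holding the current result
        for (int i = 1; i <= limit; i++) {      // structured programming: a block of statements
            sum = sum + i * i;                  // imperative statement modifying a variable
        }
        return sum;
    }
}
```

Calling sumOfSquares(3) adds 1, 4 and 9 in turn, so the variable sum passes through the values 1, 5 and 14.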
2.2. JAVA PROGRAMMING LANGUAGE
Java is a fast-growing high-level programming language created by Sun Microsystems™ [5], as well as a platform for running applications. Its main features include:
• Being fully object-oriented
• Independence from the architecture (of the operating system, CPU)
• Functionality – automatic memory management, exception handling, multithreading, network handling, application development using the GUI (Graphical User Interface), creating applets (applications running in a Web browser)
• Reliability and security of created applications
A more extensive discussion of the properties of both the language and the environment in which applications are created and executed is available in “The Java Language Environment” [6] and “The Java Language Specification” [7]. In practice, a number of concepts and names are associated with developing and running applications in the Java programming language. The most common ones are presented below [8]:
• JDK (Java Development Kit) [9] – the set of tools for creating programs, which includes a compiler, a bytecode interpreter, a browser that runs applets, and additional tools.
• compiler – the program converting source code to the code understood by the Java Virtual Machine (bytecode).
• JVM (Java Virtual Machine) – a “virtual computer” that runs programs written in Java.
• JRE (Java Runtime Environment) – the Java Virtual Machine along with a set of standard classes. The JRE is necessary to run any program created in Java.
• API (Application Programming Interface) – the collection of ready-to-use components. In the case of Java, it is a set of classes and interfaces grouped into appropriate packages.
• IDE (Integrated Development Environment) – used for the development, modification and testing of programs.
• console, command line – elements allowing the user to communicate with the computer using text commands.
[5] In 2009 the company was acquired by Oracle Corporation™.
[6] The Java Language Environment, http://java.sun.com/docs/white/langenv/, (retrieved 2011-04-14).
[7] Gosling J., Joy B., Steele G., Bracha G. [2005], The Java Language Specification. Third Edition, Addison-Wesley, Boston, http://java.sun.com/docs/books/jls/download/langspec-3.0.pdf, (retrieved 2011-04-14).
[8] See also: Unraveling Java Terminology, http://java.sun.com/new2java/programming/learn/unravelingjava.html, (retrieved 2011-04-14).
[9] Depending on the environment where applications created in Java will run, a number of language specifications have been defined: SE (Standard Edition), ME (Micro Edition), EE (Enterprise Edition).
2.2.1. Program development
Preparing an application requires tools that enable or support the process of creating the source code, compiling it and running it. The necessary tools include:
• Java SE Development Kit [10]
• A text editor (e.g. notepad, notepad++, vi, jedit) or an integrated development environment (e.g. JCreator, NetBeans, Eclipse)
• Documentation containing a detailed description of all the classes available with Java (Java API) [11]
The source code of the program can be edited in any text editor. A source file created in Java has the .java extension (e.g. Test.java).
[10] Java Standard Edition, http://www.oracle.com/technetwork/java/javase/downloads/, (retrieved 2011-04-14).
[11] Java SE Technical Documentation, http://download.oracle.com/javase/index.html, (retrieved 2011-04-14).
2.2.2. Compiling and running the program
The task of the compiler is to convert the source code into code that can be executed. As a result of compilation, a file with the .class extension is created (e.g. Test.class); it contains the bytecode executed by the Java Virtual Machine. Running the program boils down to issuing the java command with the name of the program (e.g. java Test). The program is then launched by the Java Virtual Machine. An application created in Java can include multiple source files. In each of them we can distinguish a number of components that determine the future operation of the program. Typical elements of each file include:
• An optional initial comment, containing the author, a description of the program, or how it is run
• Package declaration instructions and/or instructions for importing the classes used
• Interface or class declarations
Figure 17. Stages of development and running an application (diagram labels: Source text, Compiler, Application (bytecode), Other OS)
/*
 * FirstProgram
 * author: Jan Kowalski (c) 2009
 *
 * Display the string on the console.
 *
 * Run: java FirstProgram
 *
 */
public class FirstProgram {
    public static void main(String[] args) {
        // Display a simple string
        System.out.println("First Java program");
    }
}
The foregoing program includes a class named FirstProgram containing the main() method, which marks the beginning of the program operation. Its purpose is to output a sample text to the console (display it on the monitor) using the System.out.println() method, which takes the string of characters to be displayed as an argument. The header of a method in Java contains information about its name, access modifier, the type of the returned value and the arguments passed when it is called. The sequence of statements contained in the method is enclosed in curly brackets, and each statement ends with a semicolon. The individual elements of this program are:
• /*...*/ – comment block (usually spans several lines), ignored during compilation
• public – access modifier [12] (specifies the scope of visibility of the class)
• class – the beginning of the class definition
• FirstProgram – class name (consistent with the name of the source file)
• static – the category of a method indicating that it can be called without having to create an object of the FirstProgram class
• void – the type of value returned by the method (in this case the method returns no value)
• main() – the name of the method which begins the operation of the program [13], as defined in the FirstProgram class
• (String[] args) – the arguments of the main() method; the method code is always enclosed in curly brackets
• // ... – comment line [14], ignored when the program is compiled
• System.out – the use of the System class, along with the standard output stream (associated with the monitor screen by default)
• println() – the method placing a string of characters in the output stream
[12] Access modifiers are discussed in detail later in this textbook.
[13] The method which begins the program has the header public static void main(String[] args).
[14] There is another kind of comment, the so-called documentation comment.
2.2.3. Runtime environment
The program code can be prepared, compiled and run from the command line or using an integrated development environment (IDE). In the first case:
• Create the source code file in any text editor: notepad FirstProgram.java
• Compile the source code (this creates a file with the .class extension): javac FirstProgram.java
• Run the program: java FirstProgram
An integrated development environment makes it possible to perform all the activities of the program development process, from creating the source code through compilation to running the program. In the JCreator IDE, taken as an example, this process goes as follows:
• Create a source file (File → New → File → Java Classes → Empty Java File)
• Specify the name and location of the file
• Edit the program code in the editor window
• Compile the source code (Build → Build File)
• Run the program (Run → Run File)
Program 1. Determining the JDK version
One of the first actions when creating an application is to find out which version of the tools is available on the computer used to create the program. To this end, invoke the command line mode [15], then run the compiler (javac [16]) and the Java Virtual Machine (java). Successful execution of these programs should display the list of available parameters. Using the -version parameter displays the version number of the available compiler (javac) or interpreter (java).
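The version of the runtime can also be read from inside a running program. The sketch below is a complement to the command line check, not a replacement for it: it assumes only the standard system property java.version, reports the version of the Java Virtual Machine that runs the program (not of the compiler), and uses a hypothetical class name.

```java
class VersionSketch {
    /** Returns the version of the Java runtime executing this program. */
    static String runtimeVersion() {
        // "java.version" is a standard system property of the Java runtime
        return System.getProperty("java.version");
    }
}
```

A statement such as System.out.println(VersionSketch.runtimeVersion()); placed in a main() method prints the version string to the console.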
[15] In Windows (2000, XP, Vista), select Run from the Start menu, then enter the cmd command and press Enter.
[16] A missing or incorrect definition of the appropriate path in the PATH environment variable can make the compiler inaccessible from the command line.
2.2.4. Conventions applied
When developing a program, it is good to follow some simple rules for naming classes, methods and variables. This facilitates the subsequent analysis of the source code. The names used should clearly identify the elements of the program code, but should not be overly concise [17]. Below are selected rules to follow when writing program code.
Names of classes and interfaces:
• Begin with an uppercase letter
• Do not contain an underscore (“_”)
• Consecutive words that make up the name start with a capital letter
• Examples: Customer, BankCustomer
Names of variables and methods:
• Begin with a lowercase letter
• Do not contain an underscore
• Consecutive words that make up the name start with a capital letter
• The name of a method should specify the action (usually a verb-noun pair of words is used, or just a verb)
• Examples: price, priceOfGoods, add(), addEmployee()
Names of constants:
• Consist exclusively of uppercase letters
• Consecutive words are separated by an underscore
• Examples: TAX, VAT_TAX
Detailed information about the conventions used in the source code of Java programs (e.g. indentation, comments, etc.) is available on the Internet [18].
[17] Note that the Java syntax is case sensitive.
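The rules listed above can be seen side by side in a small class. This is a sketch only: the names follow the examples from the lists, while the method addGoods(), the method priceWithTax() and the tax rate 0.23 are hypothetical values chosen for the illustration.

```java
class BankCustomer {                          // class: every word starts with a capital letter
    static final double VAT_TAX = 0.23;       // constant: uppercase words joined by an underscore
    private double priceOfGoods;              // variable: starts with a lowercase letter

    void addGoods(double price) {             // method: verb-noun pair
        priceOfGoods = priceOfGoods + price;
    }

    double priceWithTax() {
        return priceOfGoods * (1 + VAT_TAX);
    }
}
```

Because the Java syntax is case sensitive, priceOfGoods and PriceOfGoods would be two different names, which is one more reason to apply the conventions consistently.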
2.2.5. Input and output
The standard input and output are associated with the application in which commands are issued (the terminal window). Data can be passed to the program using additional parameters listed on the command line when the program is started, or using the available classes which receive data from the user at runtime (e.g. the Scanner class). Information is displayed using classes related to the standard output stream. The methods primarily used here include print(), println() and printf() (e.g. System.out.print("Java")).
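A minimal sketch of these mechanisms is shown below. The class and method names are hypothetical; the Scanner is passed in as a parameter so that the same code can read from the keyboard (new Scanner(System.in)) or from a prepared string.

```java
import java.util.Locale;
import java.util.Scanner;

class InputOutputSketch {
    /** Reads two integers from the scanner and builds a formatted line. */
    static String describeSum(Scanner scanner) {
        int first = scanner.nextInt();
        int second = scanner.nextInt();
        // String.format uses the same format syntax as printf()
        return String.format(Locale.ROOT, "%d + %d = %d", first, second, first + second);
    }

    /** Shows the three output methods on the standard output stream. */
    static void printSum(Scanner scanner) {
        System.out.print("Result: ");                 // no line break after the text
        System.out.println(describeSum(scanner));     // line break after the text
        System.out.printf("Done (%s)%n", "printf");   // formatted output
    }
}
```

In a real program the call would typically be InputOutputSketch.printSum(new Scanner(System.in)); the numbers are then typed by the user in the terminal window.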
2.3. DATA TYPES, VARIABLES AND OPERATORS
A computer program consists of a sequence of statements executed by the computer in order to carry out a task. Statements include expressions that operate on data represented by variables and literals. The type of operation performed is specified by operators.
[18] See: Code Conventions for the Java Programming Language, http://www.oracle.com/technetwork/java/codeconv-138413.html, (retrieved 2011-04-14).
2.3.1. Data types
The data type determines the kind of information and the range of acceptable values of a constant, variable, expression argument, parameter and method result. Java defines simple (primitive) and complex (object) types. In the former case, we can distinguish [19]:
• Integer (int, short, long, byte)
• Floating point (float, double)
• Character (char)
• Boolean (boolean)
• Empty (void)
The specification of the language defines the kind of information and the acceptable range of values for each type. It is worth mentioning that the representation of data is platform-independent, hence the range of acceptable values remains the same regardless of the platform on which the program is run.
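The ranges of the integer and character types can be read from constants of the standard wrapper classes (Byte, Short, Integer, Long, Character). The sketch below, with a hypothetical class name, collects them in one text.

```java
class PrimitiveRangesSketch {
    /** Builds a text listing the ranges of selected primitive types. */
    static String describe() {
        return "byte: " + Byte.MIN_VALUE + " .. " + Byte.MAX_VALUE + "\n"
             + "short: " + Short.MIN_VALUE + " .. " + Short.MAX_VALUE + "\n"
             + "int: " + Integer.MIN_VALUE + " .. " + Integer.MAX_VALUE + "\n"
             + "long: " + Long.MIN_VALUE + " .. " + Long.MAX_VALUE + "\n"
             // char values are shown as numbers, hence the cast to int
             + "char: " + (int) Character.MIN_VALUE + " .. " + (int) Character.MAX_VALUE;
    }
}
```

Since these ranges are fixed by the language specification, the text produced by describe() is the same on every platform.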
2.3.2. Literals
A literal is a string of characters representing a value stored directly in the program code. Note the use of additional symbols to determine the type of a literal precisely (293L, 293F). Below are some sample values:
"maroon"    // character string
'K'         // "char" type value
'\u0041'    // "char" type value in the "unicode" standard (prefix \u)
293         // "integer" type decimal value
49.6        // "double" type real value
0251        // "integer" type octal value (prefix 0)
0x25        // "integer" type hexadecimal value (prefix 0x)
400L        // "long" type value (suffix L)
2.45F       // "float" type value (suffix F or f)
3.06D       // "double" type value (suffix D or d - default)
9.72E4      // exponential notation (symbol E or e)
true        // Boolean value
Java also provides a set of escape sequences: \b, \t, \f, \r, \n, \", \', \\, which can be used directly in the code:
System.out.println("Cracow University\nof Economy");
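The meaning of the prefixes and suffixes is easiest to see when the literals are assigned to named constants. The sketch below reuses sample values from the list above; the class and constant names are hypothetical.

```java
class LiteralsSketch {
    static final int OCTAL = 0251;            // prefix 0: 2*64 + 5*8 + 1
    static final int HEX = 0x25;              // prefix 0x: 2*16 + 5
    static final char LETTER = '\u0041';      // Unicode escape for the letter A
    static final long BIG = 400L;             // suffix L: long value
    static final float SINGLE = 2.45F;        // suffix F: float value
    static final double EXPONENT = 9.72E4;    // 9.72 times 10 to the 4th power
    static final String TWO_LINES = "Cracow University\nof Economy";  // contains a line break
}
```

Printing OCTAL and HEX with println() shows their decimal values, 169 and 37, because the notation of a literal affects only how the number is written in the source code.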
[19] The complete specification of data types found in the programming language is available in the paper titled “Primitive Data Types”, http://download.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html, (retrieved 2011-04-14).
Program 2. Output of results to the console

The effect of the program operation is the presentation of results. They may be directed to a console or a printer. The following example shows the method of outputting information to the console using the println method.

    public class BasicOperations {
        public static void main(String[] args){
            int i = 7;
            System.out.println("A novel \" War and Peace\" by Leo Tolstoy");
            System.out.println("64kB is" + 0xFFFF + " bits");
            System.out.println("Is 32 > 15 ? " + (32>15 ? "yes" : "no") );
            System.out.println("\\u0042 (unicode) represents the character \u0042");
            System.out.println("7 * (2 to the 3rd power) = " + (7 >>=

• Arithmetic23: + - * / %
• Unary: + - ++ -- !
• Relational: == != > >= < >>> & ^ |

It should be noted that the way of using operators depends on the type of argument(s), while the order of operations results from the priority of operators. The order of operations can be changed using parentheses. The following expressions illustrate the use of operators:

    portableComputer = true;
    desktopComputer = !portableComputer;
    distance -= 24;
    i++;
    productOfSum = (1+2)*(3+4);
    delta = b*b - 4*a*c;
    circleArea = 3.14 * r * r;
    averageGrade = (biologyGrade + physicsGrade + historyGrade) / 3;
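The effect of operator priority, parentheses and operand types can be checked with a short sketch. The class OperatorSketch and its method names are hypothetical, chosen only for this example.

```java
// Sketch: operator priority and the influence of operand types on division.
// OperatorSketch and its methods are illustrative names.
class OperatorSketch {
    static int intDivision() { return 7 / 2; }                   // both operands int: 3
    static double realDivision() { return 7 / 2.0; }             // one operand double: 3.5
    static int remainder() { return 7 % 2; }                     // remainder of division: 1
    static int withParentheses() { return (1 + 2) * (3 + 4); }   // 3 * 7 = 21
    static int withoutParentheses() { return 1 + 2 * 3 + 4; }    // 1 + 6 + 4 = 11

    public static void main(String[] args) {
        System.out.println("7 / 2   = " + intDivision());
        System.out.println("7 / 2.0 = " + realDivision());
        System.out.println("7 % 2   = " + remainder());
        System.out.println("(1+2)*(3+4) = " + withParentheses());
        System.out.println("1+2*3+4     = " + withoutParentheses());
    }
}
```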
Program 4. Basic arithmetic operations

Data can be passed to the program in several ways. They can be read from a file, entered by the user using a keyboard, or placed on the command line when starting the program. In the last case, the data is accessed by reading the contents of the args array, whose consecutive cells contain the consecutive values placed on the command line. The following program demonstrates the use of the command line to pass data to the program. When run, it calculates the sum of two integers given on the command line.
22 See: Operators, http://download.oracle.com/javase/tutorial/java/nutsandbolts/operators.html (retrieved 2011-04-14).
23 The division operator yields an integer value (without remainder) if both arguments are integers. In all other cases, the value is a real number.
    /*
     * Sample program execution:
     * java ArithmeticOperations 523 -68
     */
    public class ArithmeticOperations {
        public static void main(String[] args){
            // Read and convert numbers into a numeric form
            int numberA = Integer.parseInt(args[0]);
            int numberB = Integer.parseInt(args[1]);
            // Display the results on the console
            System.out.print("A = " + numberA);
            System.out.println(", B = " + numberB);
            System.out.println("A + B = " + (numberA+numberB));
        }
    }
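The program above assumes that exactly two valid integers are supplied. A possible defensive variant, given here only as a sketch, checks the number of arguments and catches the NumberFormatException thrown by Integer.parseInt() for non-numeric text. The class name SafeSum and the method sum() are hypothetical.

```java
// Sketch: summing two command-line arguments with basic input checking.
// SafeSum and sum() are illustrative names, not part of Program 4.
class SafeSum {
    // Returns the sum of the first two arguments,
    // or null when they are missing or are not integers.
    static Integer sum(String[] args) {
        if (args.length < 2) {
            return null;
        }
        try {
            return Integer.parseInt(args[0]) + Integer.parseInt(args[1]);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Integer result = sum(args);
        if (result == null) {
            System.out.println("Usage: java SafeSum <integerA> <integerB>");
        } else {
            System.out.println("A + B = " + result);
        }
    }
}
```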
2.3.5. Type conversion and casting

If the arguments of an expression do not have the same type, type conversion is needed. When it can be carried out without loss of information, it is performed implicitly. If it may lead to loss of information, it is necessary to perform an explicit cast:

    byte x = 17;
    int y = x;          // different types (implicit conversion)
    double a = 15.4;
    int b = (int) a;    // different types (possible loss of information,
                        // casting required)
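What "loss of information" means in practice can be seen in a small sketch. The class CastSketch and its methods are hypothetical names used for illustration only.

```java
// Sketch: implicit widening versus explicit narrowing conversions.
// CastSketch and its methods are illustrative names.
class CastSketch {
    // int to long: no information can be lost, so no cast is needed.
    static long widen(int value) {
        return value;
    }

    // double to int: the fractional part is discarded.
    static int truncate(double value) {
        return (int) value;
    }

    // int to byte: only the lowest 8 bits are kept.
    static byte narrow(int value) {
        return (byte) value;
    }

    public static void main(String[] args) {
        System.out.println(widen(17));        // 17
        System.out.println(truncate(15.4));   // 15
        System.out.println(truncate(-15.9));  // -15
        System.out.println(narrow(300));      // 44
    }
}
```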
Program 5. Type casting

One of the attributes of data appearing in programming languages is their type. It directly relates to the method of physical data representation, where a sequence of bits corresponds to the presented value. When creating a program, it is often necessary to perform an operation on different types of data or to treat data differently. These operations can be achieved using type conversion.
The example below is a program converting a temperature expressed in degrees Celsius to degrees Fahrenheit and Kelvin. The temperature in Celsius is entered from the keyboard as a whole number. The temperatures in K and F are presented as integers, obtained by casting to the int type, which discards the fractional part.

    import java.util.Scanner;

    public class TemperatureConverter {
        public static void main(String[] args){
            Scanner sc = new Scanner(System.in);
            // Read the temperature in degrees Celsius
            System.out.print("Enter the temperature in degrees Celsius: ");
            double tempC = sc.nextInt();            // Celsius
            // Calculate the temperature in other scales
            double tempF = 32 + (9/5d)*tempC;       // Fahrenheit
            double tempK = tempC + 273.15;          // Kelvin
            // Display the results of calculations, truncated to whole degrees
            System.out.println("* temperature in degrees C: " + (int)tempC);
            System.out.println("* temperature in degrees F: " + (int)tempF);
            System.out.println("* temperature in degrees K: " + (int)tempK);
        }
    }
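Two details of the converter deserve a closer look: the d suffix in 9/5d, which forces real division, and the difference between casting and rounding. The sketch below illustrates both; the class FahrenheitRounding and its methods are hypothetical names, and Math.round() is shown as one possible alternative to the cast, not as the method used in the program above.

```java
// Sketch: truncation by casting versus rounding with Math.round().
// FahrenheitRounding and its methods are illustrative names.
class FahrenheitRounding {
    static double toFahrenheit(double celsius) {
        return 32 + (9 / 5d) * celsius;   // 9/5d is a real division (1.8)
    }

    static int integerRatio() {
        return 9 / 5;                     // integer division yields 1, not 1.8
    }

    static int truncated(double value) {
        return (int) value;               // fractional part discarded
    }

    static long rounded(double value) {
        return Math.round(value);         // nearest whole number
    }

    public static void main(String[] args) {
        double f = toFahrenheit(37);      // approximately 98.6
        System.out.println("truncated: " + truncated(f));
        System.out.println("rounded:   " + rounded(f));
        System.out.println("9 / 5 as integers: " + integerRatio());
    }
}
```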
2.3.6. Character strings

A string is a sequence of characters contained in an object or a literal of the String type24:

    String person = "Jan Kowalski";    // Declaration and initialisation of a String type variable
The number of characters in a String variable or literal can be determined using the length() method:

    String productName = "Sony-Ericsson K800i mobile phone";
    System.out.println("Number of characters: " + productName.length());
    System.out.println("Number of characters: " + "Distributed programming".length());
The java.lang.String class contains methods that make it possible to manipulate characters in a string, compare strings25, search, extract a substring, create a copy of characters, convert characters to uppercase/lowercase letters based on the Unicode26 standard, or convert numeric values to a character string.

    String result;
    String address = "Szeroka St.";
    result = "Kwiatowa St.".equals("Kwiatowa St.") ? "right address" : "other address";
    System.out.println(result);
    result = "Kwiatowa St.".equals(address) ? "right address" : "other address";
    System.out.println(result);

24 It is also possible to create a String object using the new operator (see the available constructors of the String class).
25 Because strings are objects and the == operator compares references rather than contents, it is recommended to use the equals() method instead of the comparison operator == in order to compare character strings.
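The difference between == and equals(), together with a few of the String methods mentioned above, can be sketched as follows. The class StringSketch and the helpers sameObject() and sameText() are hypothetical names introduced for this example.

```java
// Sketch: reference comparison versus content comparison of strings,
// plus a few standard String methods. Names are illustrative.
class StringSketch {
    static boolean sameObject(String a, String b) {
        return a == b;          // compares references
    }

    static boolean sameText(String a, String b) {
        return a.equals(b);     // compares contents
    }

    public static void main(String[] args) {
        String literal = "Kwiatowa St.";
        String created = new String("Kwiatowa St.");  // a separate object with the same text
        System.out.println(sameObject(literal, created));  // false
        System.out.println(sameText(literal, created));    // true
        System.out.println(literal.substring(0, 8));       // Kwiatowa
        System.out.println(literal.indexOf("St."));        // 9
        System.out.println(String.valueOf(49.6));          // 49.6
    }
}
```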
Program 6. Operations on character strings

In contrast to int, char, or double, the String type is an object type. It provides a large collection of methods for performing operations on character strings. The characteristics of these methods are available in the Java API. Examples of the use of some of them are included in the program below. It converts any sequence of characters read from the console27 to uppercase letters.

    import java.util.Scanner;

    public class UppercaseLetters {
        public static void main(String[] args){
            // Get the text from the console
            Scanner sc = new Scanner(System.in);
            System.out.print("Enter any text: ");
            String text = sc.nextLine();
            // Display the text by converting to uppercase
            System.out.println(text.toUpperCase());
        }
    }
26 Computer industry standard for encoding characters used in the majority of written languages throughout the world.
27 Data can also be read from the console using other methods, for example the java.io.Console class (see: I/O from the Command Line, http://download.oracle.com/javase/tutorial/essential/io/cl.html (retrieved 2011-04-14)).
2.3.7. Arrays

An array is a container used to store a fixed number of values of a single type. The number of array items is determined at the time of its creation and cannot be changed during runtime. Each element has a consecutive number (index28) that allows reading and writing of data:

    //# One-dimensional arrays
    // Declaration of an array variable
    double[] productPrice;
    // Declaration of an array variable (element type and name)
    String[] characterTrait;
    // Declaration of an array variable
    int[] diceRoll;
    // Create an array (specify the number of its elements)
    productPrice = new double[50];
    characterTrait = new String[3];
    diceRoll = new int[8];
    // Simultaneous declaration of an array variable and creation of an array
    String[] meansOfTransport = new String[2];
    // Simultaneous declaration of a variable, creation of a 3-element array and assigning a value
    String[] trafficLight = {"red", "yellow", "green"};
    // Access to array items (assignment and reading of values)
    productPrice[17] = 328.50;
    productPrice[25] = 29.0;
    double total = productPrice[17] + productPrice[25];
    diceRoll[7] = 3;
    characterTrait[0] = "communicative";
    characterTrait[2] = "assertive";
    meansOfTransport[1] = "bike";
    System.out.println(trafficLight[2]);

    //# Multi-dimensional arrays
    // Declaration and initialisation of a two-dimensional array
    int[][] multiplicationTable = {{1,2,3}, {2,4,6}, {3,6,9}};
    System.out.println(multiplicationTable[2][0]);
In order to determine the number of array items, read the value in the length field:
28 The numbering of array elements starts with index 0.
    String[] rainbowColour = {"red", "orange", "yellow", "green", "blue", "violet"};
    int numberOfRainbowColours = rainbowColour.length;
Copying data between arrays is possible using the java.lang.System.arraycopy() method. There is also a considerable number of methods for manipulating arrays, such as sorting or searching the elements. These methods are grouped in the java.util.Arrays class.

Program 7. Command line parameters

As already mentioned, the values of parameters provided on the command line when calling the program are available in the args array passed as an argument of the main() method. The following example outputs to the console the data (last name, first name and address) that were passed to the program on the command line.

    /*
     * Sample program execution:
     * java CommandLineParameters Kowalewski Eugeniusz "Szeroka St. 15 apt. 3"
     */
    public class CommandLineParameters {
        public static void main(String[] args){
            System.out.println("last name: " + args[0]);
            System.out.println("first name: " + args[1]);
            System.out.println("address: " + args[2]);
        }
    }
Program 8. Operations on arrays

Writing or reading data from individual array cells is possible by using an index pointing to a specific cell. These steps are illustrated in the following program, which includes an array of character strings containing the basic units of the SI system (units[]). The result of the application is the output of units of temperature, weight, and length to the console.
    public class SIUnits {
        public static void main(String[] args){
            String[] units = {"kg", "cd", "s", "A", "K", "mol", "m"};
            System.out.println("SI system basic units");
            System.out.println("=================================");
            System.out.println("Unit of temperature (Kelvin): " + units[4]);
            System.out.println("Unit of mass (kilogram): " + units[0]);
            System.out.println("Unit of length (metre): " + units[units.length-1]);
            System.out.println("---------------------------------");
            System.out.println("Number of units: " + units.length);
        }
    }
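Besides access by index, section 2.3.7 mentions System.arraycopy() and the java.util.Arrays class. The sketch below combines them: it copies an array, sorts the copy and searches it. The class ArrayTools and the method sortedCopy() are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;

// Sketch: copying, sorting and searching an array with standard methods.
// ArrayTools and sortedCopy() are illustrative names.
class ArrayTools {
    // Returns a sorted copy; the source array is left unchanged.
    static int[] sortedCopy(int[] source) {
        int[] copy = new int[source.length];
        System.arraycopy(source, 0, copy, 0, source.length);
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        int[] diceRoll = {4, 1, 6, 3};
        int[] sorted = sortedCopy(diceRoll);
        System.out.println(Arrays.toString(diceRoll));       // original order
        System.out.println(Arrays.toString(sorted));         // ascending order
        // binarySearch requires a sorted array; it returns the index of the value
        System.out.println(Arrays.binarySearch(sorted, 4));
    }
}
```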
2.3.8. Enumerated type

An enumerated type consists of a fixed, finite set of constant values (instances of the type). Each of the instances remains the same during the program runtime. Thus, by convention, identifiers of instances should be formed using uppercase letters and the underscore character.

    enum Day {MON,TUE,WED,THU,FRI,SAT,SUN};
    Day workDay = Day.TUE;
    Day dayOff = Day.SAT;
    System.out.println(workDay!=dayOff);
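An enumerated type can also be listed with the values() method and used in a switch statement. The following is a minimal sketch; the enclosing class EnumSketch and the method isWeekend() are hypothetical names, while Day repeats the enumeration from the example above.

```java
// Sketch: iterating over enum instances and branching on them.
// EnumSketch and isWeekend() are illustrative names.
class EnumSketch {
    enum Day { MON, TUE, WED, THU, FRI, SAT, SUN }

    static boolean isWeekend(Day day) {
        switch (day) {
            case SAT:
            case SUN:
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        // values() returns all instances in declaration order
        for (Day day : Day.values()) {
            System.out.println(day + " (position " + day.ordinal() + "), weekend: " + isWeekend(day));
        }
    }
}
```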
2.4. CONTROLLING THE FLOW OF EXECUTION OF THE PROGRAM

Statements included in a computer program are usually executed sequentially, starting with the first and ending with the last one. However, complex computing tasks often require combining the statements into groups, processing them conditionally, or repeating them multiple times in order to perform the task. Therefore, almost every programming language includes a set of control statements designed to allow the construction of the program in accordance with the implemented algorithm29. Java distinguishes five control statements: while, do..while, for, if, switch, whose purpose is to allow solving of multiple problems faced by the program developer.
29 An algorithm is, in simple terms, a finite ordered sequence of actions necessary to perform the task.
2.4.1. Statement block

Programming practice shows that there is often a need to process a group of program statements as a whole. This functionality is provided by a statement block. It allows grouping the statements by placing them in curly braces {}:

    {
        statements;
    }
Variable declarations can be placed inside the block. It should be noted, however, that these variables are visible only inside the block: it is not possible to refer to them outside of the block where they are declared. Also, an attempt to declare a variable inside a block with a name identical to that of a local variable already declared in the enclosing block will result in a compilation error.

    {
        String telephoneNo = "555 302 197";
        System.out.println(telephoneNo);
    }
    System.out.println(telephoneNo);    // Compilation error: variable not available outside of the statement block
2.4.2. Conditional statement

This kind of control statement allows the conditional execution of a single statement or a block of program statements depending on the value of a Boolean expression. If the Boolean expression is true (takes the value of true), the program statements occurring within the if statement block30 will be executed. If the Boolean expression takes the value of false, none of the statements contained in the statement block will be executed.

    if (Boolean_expression) {
        statements;
    }
Both program statements contained in the statement block in the following example will be executed only if the value of the variable x is different from 0.

    int x = 5;
    int y;
    if (x != 0) {       // Boolean expression takes the value true
        x *= 3;         // Therefore, both statements are executed
        y = x + 4;
    }

30 When the if conditional statement contains only one statement of the program, then using a statement block (inside a pair of curly braces {}) is not required.
There is also a version of the conditional if statement which takes the alternative situation into account. An additional statement block following the else clause is executed only if the Boolean expression takes the value false.

    if (Boolean_expression) {
        statements1;
    } else {
        statements2;
    }
The following example demonstrates the use of an else clause. The statement block that follows the else clause is executed only if the value of x is equal to or less than 0.

    int x = 0;
    int y;
    if (x > 0) {        // Boolean expression takes the value false,
        x = 2*x+4;      // therefore the statement will not be executed
    } else {
        x = x - 5;      // Both statements are executed
        y = 2*x;
    }
If the number of conditions to be checked is large, it is possible to use nested conditional statements:

    int x = 5;
    if (x > 0) {
        System.out.println("Positive value of x");
    } else if (x < 0) {
        System.out.println("Negative value of x");
    } else {
        System.out.println("The value of x is 0");
    }
Program 9. Checking the value of a number

The program uses the if conditional statement to check whether the natural number read from the console is even.

    import java.util.Scanner;

    public class EvenNumber {
        public static void main(String[] args) {
            int naturalNumber; // Analysed value
            /* Get the number from the console */
            Scanner sc = new Scanner(System.in);
            System.out.print("Enter any natural number: ");
            naturalNumber = sc.nextInt();
            /* Check whether the number is even (divisible by 2 without remainder) */
            boolean evenNumber = naturalNumber % 2 == 0;
            /* Display the results on the console */
            if (evenNumber) {
                System.out.printf("Number %d is even", naturalNumber);
            } else {
                System.out.printf("Number %d is odd", naturalNumber);
            }
        }
    }
Program 10. Determining the roots of a quadratic equation

A quadratic equation is an algebraic equation with one unknown, in the form ax^2 + bx + c = 0. The purpose of the program is to determine the roots of a quadratic equation. The sqrt() method of the java.lang.Math class is used here to calculate the square root.
    import java.util.Scanner;

    public class QuadraticEquation {
        public static void main(String[] args) {
            double x1, x2; // Roots of the equation
            // Read the equation coefficients from the console
            Scanner sc = new Scanner(System.in);
            System.out.println("Coefficients of the quadratic equation");
            System.out.println("in the form of ax^2 + bx + c = 0");
            System.out.print("a = ");
            double a = sc.nextDouble();
            System.out.print("b = ");
            double b = sc.nextDouble();
            System.out.print("c = ");
            double c = sc.nextDouble();
            // One root of the equation (linear case, a = 0)
            if (a == 0) {
                x1 = -c/b;
                System.out.printf("Root of the equation: x = %s", x1);
                return;
            }
            // Determine the discriminant of the equation
            double delta = b*b - 4*a*c;
            // Determine the roots of the equation
            if (delta > 0) {
                x1 = (-b - Math.sqrt(delta))/(2*a);
                x2 = (-b + Math.sqrt(delta))/(2*a);
                System.out.printf("Roots of the equation: x1 = %s, x2 = %s", x1, x2);
            } else if (delta == 0) {
                x1 = -b / (2*a);
                System.out.printf("Root of the equation: x = %s", x1);
            } else {
                System.out.println("The equation has no roots!");
            }
        }
    }
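When the calculation is separated from console input and output, it becomes easier to check with known values. The sketch below is one possible arrangement under the assumption that a is different from 0; the class QuadraticRoots and the method solve() are hypothetical names, not part of Program 10.

```java
// Sketch: the root calculation isolated in a method that returns an array.
// QuadraticRoots and solve() are illustrative names; a != 0 is assumed.
class QuadraticRoots {
    // Returns the real roots of ax^2 + bx + c = 0 as an array
    // holding zero, one or two values (in ascending order when a > 0).
    static double[] solve(double a, double b, double c) {
        double delta = b * b - 4 * a * c;
        if (delta < 0) {
            return new double[0];
        }
        if (delta == 0) {
            return new double[] { -b / (2 * a) };
        }
        double root = Math.sqrt(delta);
        return new double[] { (-b - root) / (2 * a), (-b + root) / (2 * a) };
    }

    public static void main(String[] args) {
        double[] roots = solve(1, -3, 2);   // x^2 - 3x + 2 = 0
        for (double x : roots) {
            System.out.println("x = " + x);
        }
    }
}
```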
2.4.3. Multiple choice statement

If the number of conditions to be checked is large, using an if-else statement can become cumbersome. We can use the multiple-choice switch statement in such a situation. Its action consists in determining the value of an expression and then comparing this value to the values of the constants specified after the case clauses. In case of equality, control is transferred to the statements appearing after the case clause whose constant is equal to the value of the expression. If no equality occurs,
the statements which occur after the keyword default are executed, or further program statements if this clause is not used. The break statement stops the execution of the switch statement and transfers control to the program statements that occur after the switch statement.

    switch (expression) {
        case value1:
            statements1;
            break;
        case value2:
            statements2;
            break;
        case valueN:
            statementsN;
            break;
        default:            // Optional part of the switch statement
            statements;
    }
The following example illustrates the use of a multiple-choice statement. Depending on the value of the season variable, an appropriate string will be output to the console.

    // The symbols have two characters, so a String is used instead of char
    // (a switch statement on String values requires Java 7 or later)
    String season = "Wi";
    switch(season) {
        case "Sp":
            System.out.println("Spring");
            break;
        case "Su":
            System.out.println("Summer");
            break;
        case "Au":
            System.out.println("Autumn");
            break;
        case "Wi":                              // The constant matches the expression
            System.out.println("Winter");       // The statement will be executed
            break;                              // The statement will be executed
        default:
            System.out.println("Incorrect symbol");
    }
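When a break statement is omitted, execution continues with the statements of the next case clause. This behaviour can be used deliberately to group several constants, as in the sketch below. The class DaysInMonth and the method days() are hypothetical names used only for illustration.

```java
// Sketch: grouping case clauses by leaving out break between them.
// DaysInMonth and days() are illustrative names.
class DaysInMonth {
    // Returns the number of days in the given month (1..12),
    // or -1 when the month number is not valid.
    static int days(int month, boolean leapYear) {
        switch (month) {
            case 4:
            case 6:
            case 9:
            case 11:
                return 30;
            case 2:
                return leapYear ? 29 : 28;
            case 1:
            case 3:
            case 5:
            case 7:
            case 8:
            case 10:
            case 12:
                return 31;
            default:
                return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("April: " + days(4, false));
        System.out.println("February in a leap year: " + days(2, true));
    }
}
```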
Program 11. Representation of numerical values in words

Grading is a conventional way of qualifying the progress of a pupil or student. It can be expressed in a symbolic notation (e.g. numbers 1 to 6) or in words. The following program presents a grading notation using words
(excellent, very good, good, satisfactory, poor, unsatisfactory) as a representation of its numeric value. The verbal representation of a grade is output to the console using a multiple-choice statement.

    import java.util.Scanner;

    public class GradeInWords {
        public static void main(String[] args) {
            // Read data from the console
            Scanner sc = new Scanner(System.in);
            System.out.print("Grade (1..6): ");
            int n = sc.nextInt();
            // Display the grade
            System.out.print("Grade " + n + " ");
            switch(n) {
                case 1:
                    System.out.println("Unsatisfactory");
                    break;
                case 2:
                    System.out.println("Poor");
                    break;
                case 3:
                    System.out.println("Satisfactory");
                    break;
                case 4:
                    System.out.println("Good");
                    break;
                case 5:
                    System.out.println("Very good");
                    break;
                case 6:
                    System.out.println("Excellent");
                    break;
                default:
                    System.out.println("Incorrect!");
            }
        }
    }
2.4.4. Indefinite loops

Iterative statements (loops) allow the multiple execution of a single statement or statement block. If the number of repetitions is not known in advance, indefinite loops are used. The while statement is one of them; it is based on the value of a Boolean expression. If this expression takes the value of true, the statement block occurring after the expression is executed, and then the value of the Boolean expression is recalculated. The statement block will
be executed repeatedly until the Boolean expression takes the value of false. Then, control is transferred to the next statement in the program, occurring after the while statement.

    while (Boolean_expression) {
        statements;
    }
The following code shows an example of the use of the iterative statement while. The statement block of the program will be executed until the variable x takes a value lower than 0 (in this case the statement block is executed eight times). It should be noted that it is usually necessary to modify, inside the statement block, the values used in the Boolean expression, so that the while statement can terminate after a finite number of repetitions.

    int x = 7;
    while (x >= 0) {                // Both statements included in
        x--;                        // the statement block
        System.out.println(x);      // will be executed eight times
    }
A variation of the while statement is the indefinite loop do..while, whose action is similar. The only difference is that the value of the Boolean expression is determined after executing the statement block. Thus, in contrast to the while statement, the statement block is executed at least once.

    do {
        statements;
    } while (Boolean_expression);
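The difference between the two indefinite loops shows up when the Boolean expression is false from the start. The sketch below counts how many times the statement block runs in each case; the class LoopComparison and its methods are hypothetical names introduced for this example.

```java
// Sketch: counting the repetitions of while and do..while loops.
// LoopComparison and its methods are illustrative names.
class LoopComparison {
    // Number of times the block of a while loop is executed.
    static int whileRuns(int start) {
        int runs = 0;
        int x = start;
        while (x >= 0) {
            x--;
            runs++;
        }
        return runs;
    }

    // Number of times the block of a do..while loop is executed.
    static int doWhileRuns(int start) {
        int runs = 0;
        int x = start;
        do {
            x--;
            runs++;
        } while (x >= 0);
        return runs;
    }

    public static void main(String[] args) {
        System.out.println("while, x = 7:     " + whileRuns(7));
        System.out.println("do..while, x = 7: " + doWhileRuns(7));
        System.out.println("while, x = -1:     " + whileRuns(-1));
        System.out.println("do..while, x = -1: " + doWhileRuns(-1));
    }
}
```

For a starting value of 7 both loops behave identically; for -1 only the do..while block is executed, exactly once.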
Program 12. Determining the number of coins

Assuming that coins of denominations PLN 1, PLN 2 and PLN 5 are in circulation, the following code presents any amount, expressed as an integer in the range 1..1000, using the smallest possible number of coins. The do..while iterative statement used here checks the value input by the user and repeats the prompt until it fits in the given range.
    import java.util.Scanner;

    public class Coins {
        public static void main(String[] args) {
            int amount;
            Scanner sc = new Scanner(System.in);
            // Get the amount from the console
            do {
                System.out.print("Enter the amount in PLN (1..1000): ");
                amount = sc.nextInt();
            } while (amount < 1 || amount > 1000);
            // Determine the smallest possible number of coins
            System.out.println("-----------------------");
            System.out.println("PLN 5 coins: " + amount/5);
            System.out.println("PLN 2 coins: " + (amount%5)/2);
            System.out.println("PLN 1 coins: " + (amount%5)%2);
        }
    }
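The same calculation can be written once for a whole list of denominations by keeping them in an array and repeating the division and remainder steps. This is only a sketch: the class CoinChange, the constant DENOMINATIONS and the method split() are hypothetical names. Taking the largest coin first gives the smallest number of coins for the denominations 5, 2 and 1, but this should not be assumed for arbitrary sets of denominations.

```java
// Sketch: splitting an amount into coins, largest denomination first.
// CoinChange, DENOMINATIONS and split() are illustrative names.
class CoinChange {
    // Denominations in descending order, as in Program 12.
    static final int[] DENOMINATIONS = {5, 2, 1};

    // Returns the number of coins of each denomination,
    // in the same order as DENOMINATIONS.
    static int[] split(int amount) {
        int[] counts = new int[DENOMINATIONS.length];
        int rest = amount;
        for (int i = 0; i < DENOMINATIONS.length; i++) {
            counts[i] = rest / DENOMINATIONS[i];   // how many coins fit
            rest = rest % DENOMINATIONS[i];        // what remains to be paid
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = split(18);
        for (int i = 0; i < counts.length; i++) {
            System.out.println("PLN " + DENOMINATIONS[i] + " coins: " + counts[i]);
        }
    }
}
```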
2.4.5. Definite loops

If the number of repetitions is known in advance, it is possible to use the iterative statement for. Its action consists in determining the initial value defining the beginning of the iteration. The execution of the statement block depends on the value of the Boolean expression. If it is true (has the value of true), the statement block is executed and then the expression modifying the initial value is executed. In the next step, the value of the Boolean expression is re-checked and if it is true, the statement block is executed again. The described actions are repeated until the Boolean expression takes the value false. Then, program control is passed to the statements occurring after the for statement.

    for (initial_value; Boolean_expression; modifying_expression) {
        statements;
    }
In this example, the statement block within the iterative statement for is executed eight times, each time displaying the value of variable i on the console. for (int i=5; i