Selection of Computer Programming Languages for Developing Distributed Systems

Shadman Salih
Software Technology Research Paper
De Montfort University, Faculty of Technology
The Gateway, Leicester LE1 9BH, UK
May 15, 2014

This dissertation is submitted to the Department of Software Engineering, De Montfort University, in partial fulfillment of the requirements for the degree of Master of Science in Software Engineering.

Copyright © 2014 De Montfort University. All rights reserved

Attestation

I understand the nature of plagiarism, and I am aware of the University’s policy on this. I certify that this project is entirely my own work and that this dissertation, except where otherwise indicated, describes original research carried out by me during my university project period.

Shadman Q. Salih

May 15, 2014

Acknowledgments

I would never have been able to finish my dissertation without the help and support of the most Gracious and Merciful (ALLAH). I would like to express my thanks to Him for giving me the ability to complete this work successfully.

I would like to express my deepest appreciation and special gratitude to my advisor, Dr. Ali Al-Bayatti, for his endless support, patience, guidance, and positive comments. He continually encouraged me to work hard and to improve the quality of my research project by providing useful advice and valuable feedback. His advice on both my research and my career has been priceless. The only thing that I can say to him is “Thank you very much” for everything.

I must express my special thanks to my beloved family and my dearest friend “Hawre” for their great and endless support during my two-year period of study at De Montfort University.

~Thank You~ Leicester, United Kingdom, 2014 Shadman Salih

Table of Contents

Attestation
Acknowledgments
Table of Contents
List of Figures
List of Tables
Abstract

1. Introduction
    1.1 Statement of the Problem
    1.2 Research Aims and Objectives
    1.3 Research Significance
    1.4 Methodology
    1.5 Project Outline

2. Background and Related Researches
    2.1 Introduction to distributed systems
    2.2 Background
    2.3 Basics of Distributed Systems
        2.3.1 Definition of distributed systems
        2.3.2 Characteristics of distributed systems
        2.3.3 Benefits and Drawbacks of Distributed Systems
    2.4 Types of Distributed Systems
        2.4.1 Distributed Computing Systems
        2.4.2 Distributed Information Systems
        2.4.3 Distributed Pervasive/Embedded Systems
    2.5 Middleware
    2.6 Summary

3. Distributed Programming Languages
    3.1 History of Distributed Programming Languages
    3.2 A brief History of High-Level Programming Languages
    3.3 Classification of Distributed Programming Languages
        3.3.1 Imperative programming languages
        3.3.2 Functional programming languages
        3.3.3 Object-Oriented programming languages
        3.3.4 Logical programming languages
    3.4 Distributed Programming Techniques
        3.4.1 RPC (Remote Procedure Call)
        3.4.2 Message Passing Interface (MPI)
        3.4.3 Common Object Request Broker Architecture (CORBA)
        3.4.4 Java Remote Method Invocation (Java RMI)
        3.4.5 Distributed Component Object Model (DCOM)
        3.4.6 .Net Framework Remoting
    3.5 Criteria for Language Selection

4. Samples of distributed programming languages
    4.1 C++ as a first sample of distributed programming languages
        4.1.1 Introduction to C++ Programming Language
        4.1.2 Characteristics of C++ Language
        4.1.3 Features of C++ as a Distributed Language
        4.1.4 How C++ Relates to Java and C#?
        4.1.5 The Role of C++ and CORBA in Distributed Systems
    4.2 Java as a second sample of distributed programming languages
        4.2.1 Introduction and historical perspective of Java
        4.2.2 The Java Language Features
        4.2.3 The Importance of Java in Distributed Systems
        4.2.4 What does CORBA offer Java programmers?
    4.3 C# as a Third Sample of Distributed Programming Languages
        4.3.1 An Introduction to C#
        4.3.2 Features of the C# Programming Language
            4.3.2.1 Object-orientation
            4.3.2.2 Type Safety
            4.3.2.3 Memory Management
            4.3.2.4 Platform Support
        4.3.3 The Significance of C# in Distributed Systems
    4.4 Summary

5. Comparison & Findings
    5.1 Related Work
    5.2 Language Comparison Criteria
        5.2.1 Concurrency
        5.2.2 Reliability
        5.2.3 Scalability
        5.2.4 Security
        5.2.5 Portability/Platform
        5.2.6 Simplicity and Usage
        5.2.7 Efficiency
        5.2.8 High integrity
        5.2.9 Reusability
        5.2.10 Maintainability
    5.3 Comparing Candidate Languages (C++, Java and C#) Against Selected Criteria
        5.3.1 C++, Java and C# vs. Concurrency
        5.3.2 C++, Java and C# vs. Reliability
        5.3.3 C++, Java and C# vs. Scalability
        5.3.4 C++, Java and C# vs. Security
        5.3.5 C++, Java and C# vs. Portability/Platform
        5.3.6 C++, Java and C# vs. Simplicity and Usage
        5.3.7 C++, Java and C# vs. Efficiency
        5.3.8 C++, Java and C# vs. High integrity
        5.3.9 C++, Java and C# vs. Reusability
        5.3.10 C++, Java and C# vs. Maintainability
    5.4 Results and Evaluation
    5.5 Summary

6. Conclusions & Recommendations
    6.1 Project Summary
    6.2 Critical Review of This Research
    6.3 Weak Points of This Study
    6.4 Conclusion
    6.5 Future Work & Recommendations

Bibliography

List of Figures

Figure 2.1 An Example of Cluster Computing
Figure 2.2 The OSI Model and Middleware
Figure 3.1 History of High Level Programming Languages Evolution
Figure 3.2 Language Classification for Distributed Programming
Figure 4.1 Characteristics of C++ Programming Language
Figure 5.1 The Rate of Concurrency in C++, Java and C#
Figure 5.2 The Rate of Reliability in C++, Java and C#
Figure 5.3 The Rate of Scalability in C++, Java and C#
Figure 5.4 The Rate of Security in C++, Java and C#
Figure 5.5 The Rate of Portability/Platform in C++, Java and C#
Figure 5.6 The Rate of Simplicity & Usage in C++, Java and C#
Figure 5.7 The Rate of Efficiency in C++, Java and C#
Figure 5.8 The Rate of High Integrity in C++, Java and C#
Figure 5.9 The Rate of Reusability in C++, Java and C#
Figure 5.10 The Rate of Maintainability in C++, Java and C#
Figure 5.11 Comparison of Candidate Languages Against Selected Criteria

List of Tables

Table 5.1 Language Comparison Against Selected Criteria

ABSTRACT

Programming languages and distributed systems have long influenced each other. Every programming language has its strengths and weaknesses, so it can be difficult to decide which language should be chosen for a software project. Yet selecting the right programming language can be crucial to the success of a project or software system. This research project compares C++, Java and C# in an open distributed systems environment with respect to the following technical and economic criteria: concurrency, scalability, reliability, security, portability or platform, simplicity and usage, efficiency, high integrity, reusability and maintainability. The three candidate languages are compared against these criteria in order to find out how best a programming language can be selected for a project based on distributed systems. Finally, the evaluation and findings are presented in the form of a comparison table and a bar chart, providing evidence and analysis of why Java is better than, or has an advantage over, C++ and C# according to some criteria.

Chapter – 1

Introduction

1. Introduction

Programming languages and distributed systems have long influenced each other. Implementing any distributed system requires a programming language, and many languages are now available for building distributed systems, such as C++, Java and C#. Every language has its own strengths and weaknesses, so it can be difficult to decide which language should be chosen for a software project. Yet selecting the right programming language can be crucial to the success of a project or software system. This research therefore compares three programming languages against ten specified criteria in order to determine the strengths, weaknesses, applicability and suitability of each language for each criterion, and to explore and evaluate the associated features of the chosen languages. The main goal is to find the most suitable programming language that could be recommended for a software project in an open distributed systems environment.

1.1 Statement of the Problem

Distributed systems have become large-scale, complex systems that are quite difficult to develop, and constructing and programming distributed applications remains a difficult task for programmers. Several programming languages and operating systems have been proposed to simplify the construction and programming of open distributed systems. However, new requirements and even new application areas keep emerging, which challenge current programming languages and drive their evolution. From time to time, new programming languages are also developed to keep up with changing programming paradigms and to meet new user requirements. Because so many languages exist, choosing which one to use has become a further problem for software engineers and programmers. This dissertation aims to address these issues by comparing three high-level programming languages (C++, Java and C#) against ten criteria in an open distributed systems environment, in order to find out how best a programming language can be selected for a project based on distributed systems.

1.2 Research Aims and Objectives

The current literature and available research make it clear that the comparison of programming languages is not a static topic and that innovations are happening at a rapid pace. Hence this research project aims to find out how best a programming language can be chosen for a project based on distributed systems, through the following topics:

(a) History of distributed programming languages
(b) Classification of distributed programming languages
(c) Distributed programming languages: theory and techniques
(d) Samples of distributed programming languages
(e) Criteria for language selection

In order to achieve this aim, the following sub-objectives need to be investigated:

- Identify the main characteristics of distributed systems, in relation to programming languages, that can affect the construction of distributed applications.
- Identify samples of distributed programming languages for comparison, as well as the main criteria for language selection.
- Compare the selected distributed languages (C++, Java and C#) against the chosen criteria in an open distributed systems environment.
- Finally, evaluate and analyse the results of the comparison process to discover the applicability and suitability of the languages for each criterion.

1.3 Research Significance

A distributed system, or any software system, built with the wrong programming language may run into serious problems. Development of the new system can take a very long time, and maintenance can be very difficult. Sometimes a new software system has to be rebuilt because of poor performance, reliability or efficiency, which is a huge waste of time, human effort and money. This study helps software engineers to avoid these problems, or to reduce their likelihood, by recommending a suitable programming language. The aim is to improve the overall performance of the software system or distributed application and to support the development of large volumes of software for various application activities.

1.4 Methodology

The methodology adopted to fulfill the objectives of this research and to achieve the aim of this study is briefly outlined in this section. It is divided into four work packages, as follows:

The first work package will be the literature review. In this work package, multiple research studies, IEEE standards, reports, theses, scholarly papers and articles will be reviewed to understand the current body of knowledge related to distributed systems. The relevant research introduces the wider field of distributed systems and how to develop them, which helps to increase the feasibility of this study.

The second work package will be a review of distributed programming languages. This work package is also part of the literature review. Various research papers will be analysed and reviewed to collect background knowledge about the history, classification and techniques of distributed programming languages. Several criteria for language selection will also be introduced, based on previous studies, in order to understand and identify the criteria that relate to both programming languages and distributed systems.

The third work package will cover samples of distributed programming languages. In this section, three distributed programming languages (C++, Java and C#) will be reviewed in detail to find out their features, their characteristics and their role in developing distributed systems.

The fourth work package will be the comparison of distributed programming languages with respect to ten criteria. In this work package, the requirements of the language comparison process are clarified by reviewing several research papers and related works in this area. This helps to identify the methods or modules used for comparison, as well as the gaps and limitations of previous studies. Furthermore, the comparison goes beyond earlier research by selecting ten criteria related to distributed systems and comparing C++, Java and C# against them in detail, with a proper evaluation of the outcomes.

1.5 Project Outline

This dissertation is divided into six chapters. The preceding sections, which together form Chapter – 1, introduced the project, the statement of the problem, the research aims and objectives, the significance of the study and the methodology adopted. The rest of the research is structured into the following chapters:

Chapter – 2: Background and related researches (literature review). This chapter provides an introduction to distributed systems and covers their background and basics, including the definition, characteristics, and advantages and disadvantages of open distributed systems. It also describes three types of distributed systems: distributed computing systems, distributed information systems and distributed pervasive systems. Finally, it briefly discusses middleware as an example of distributed systems technology.

Chapter – 3: An overview of distributed programming languages. This chapter explores the history of distributed programming languages and the classification and techniques of distributed programming. It also discusses some language comparison criteria in relation to distributed systems.

Chapter – 4: Samples of distributed programming languages (C++, Java and C#). In this chapter, C++, Java and C# are taken as three different samples of distributed programming languages. It explores the historical perspective, features and characteristics of each language, as well as their role in developing distributed applications.

Chapter – 5: Comparison and findings. This chapter compares C++, Java and C# against ten specified criteria derived from the previous chapters. It also presents the outcomes of the comparison process and recommends a language that can be selected for a software project or distributed system.

Chapter – 6: Conclusions, recommendations and future study. This chapter summarises the work described in this research and identifies the weak points or limitations of the study. It also proposes some possible solutions and recommendations for improving the current study that can be applied to future studies in this area.

Chapter – 2

Background and Related Researches

2. Background and Related Researches

2.1 Introduction to distributed systems

Since the birth of the first computing machines, computing has passed through many transformations and developments. The need to perform complex calculations and data processing quickly motivated the advent of computers. The initial solutions were based on a model known as centralized systems, in which all incoming requests are processed by a single computer with one or more Central Processing Units (CPUs). However, this model became less attractive because of the separate nature of the divisions, cost and reliability issues. An alternative model, called distributed systems, was proposed to address these issues. In a distributed system, instead of a single powerful computer, multiple computers are linked together and communicate with each other through a shared network. The distributed, independent and heterogeneous nature of these computers underlies the importance of having distributed system software that provides a common view of the system (Tari & Bukhres 2001).

2.2 Background

The 1990s were the decade of distributed systems. In the distributed systems of that time, the user could not distinguish between local and remote operations. Programs were run with a command, but they did not necessarily execute on the user's workstation. A single file system was shared by all users, and processors were allocated dynamically wherever they were needed most (Mullender 1989). Since then, distributed systems have become more common and can be found everywhere. They have become large-scale, complex systems that are quite difficult to develop. Various programming languages and platforms have been proposed to simplify the programming and construction of distributed systems (Hutchinson 1987).

A distributed system is a system with many processing elements and storage devices connected together via a common network. Potentially, this makes distributed applications more powerful. Likewise, two important properties give a distributed system the potential to be more reliable. First, in a distributed system functions can be replicated several times. If one processor fails, another can take over the work because many processors are available, and information is not lost if a disk crashes because files can be saved on several disks. Second, more work can be done in the same amount of time because many computations can be carried out in parallel. These two properties, known as fault tolerance and parallelism, make a distributed system more powerful than a traditional operating system (Mullender 1989).
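The two properties can be illustrated with a minimal Java sketch. It is only an in-process illustration under stated assumptions: the class and method names (`ReplicationSketch`, `Replica`, `readWithFailover`, `parallelSum`) are hypothetical and do not come from the cited sources, replicas are modelled as in-memory objects rather than remote disks, and worker threads stand in for separate processors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; the names below are assumptions, not an API from the cited works.
class ReplicationSketch {

    // A replica models one stored copy of a file; it may have crashed.
    static final class Replica {
        private final String content;
        private boolean crashed;

        Replica(String content) { this.content = content; }

        void crash() { crashed = true; }

        String read() {
            if (crashed) throw new IllegalStateException("replica unavailable");
            return content;
        }
    }

    // Fault tolerance: try each replica in turn and return the first successful read.
    static Optional<String> readWithFailover(List<Replica> replicas) {
        for (Replica replica : replicas) {
            try {
                return Optional.of(replica.read());
            } catch (IllegalStateException e) {
                // This copy is unavailable; fall through to the next one.
            }
        }
        return Optional.empty(); // every copy has failed
    }

    // Parallelism: split the sum 1..n into chunks computed by a pool of worker threads.
    static long parallelSum(int n, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            int chunk = (n + workers - 1) / workers; // ceiling division
            for (int w = 0; w < workers; w++) {
                final int from = w * chunk + 1;
                final int to = Math.min(n, (w + 1) * chunk);
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int i = from; i <= to; i++) sum += i;
                    return sum;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) total += part.get();
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel sum failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Replica> replicas = Arrays.asList(new Replica("report-v1"), new Replica("report-v1"));
        replicas.get(0).crash();                                              // one disk fails
        System.out.println(readWithFailover(replicas).orElse("unavailable")); // prints report-v1
        System.out.println(parallelSum(100, 4));                              // prints 5050
    }
}
```

In a real distributed system the replicas would sit on different machines and would have to be kept consistent with each other, which this sketch deliberately leaves out.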

2.3 Basics of Distributed Systems

2.3.1 Definition of distributed systems

Numerous definitions have been given for distributed systems. Put simply, a distributed system is a collection of autonomous computers connected by a network and equipped with distributed system software. Tanenbaum and van Steen, in Distributed Systems: Principles and Paradigms, define a distributed system as a collection of independent computers that appears to its users as a single coherent system. Two important aspects follow from these definitions. The first, which concerns hardware, is that a distributed system consists of autonomous components such as computers and servers. The second, which typically concerns software, is that a distributed system appears as a single system, so that users feel they are dealing with one system. Both aspects are essential, and the autonomous components need to collaborate (Tanenbaum & Steen 2007). A distributed system is the opposite of a centralized system, which is composed of a single computer with one or more powerful CPUs that process all incoming requests. Distributed system software enables the interconnected computers to share system resources and to coordinate their activities. Well-developed distributed system software can provide the illusion of a single, integrated environment even though it is implemented by multiple computers in different locations. In other words, the distributed system software provides distribution transparency to the whole system (Tari & Bukhres 2001).
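The idea of distribution transparency can be sketched in Java as follows. This is a simplified, single-process illustration under stated assumptions: `TransparencySketch`, `FileService`, `Node` and `DistributedFileService` are hypothetical names introduced here, and each node is an in-memory object, whereas a real system would reach remote machines over a network through middleware.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; the names below are assumptions, not an API from the cited works.
class TransparencySketch {

    // The single interface that users program against.
    interface FileService {
        String read(String path);
    }

    // One autonomous computer holding some of the files (returns null if it lacks the file).
    static final class Node implements FileService {
        private final Map<String, String> files = new HashMap<>();

        Node store(String path, String content) {
            files.put(path, content);
            return this;
        }

        @Override
        public String read(String path) {
            return files.get(path);
        }
    }

    // The distributed system software: it hides which node actually holds a file.
    static final class DistributedFileService implements FileService {
        private final List<Node> nodes;

        DistributedFileService(List<Node> nodes) {
            this.nodes = nodes;
        }

        @Override
        public String read(String path) {
            for (Node node : nodes) {
                String content = node.read(path);
                if (content != null) return content; // the caller never learns the location
            }
            throw new IllegalArgumentException("no such file: " + path);
        }
    }

    public static void main(String[] args) {
        FileService service = new DistributedFileService(Arrays.asList(
                new Node().store("/docs/a.txt", "alpha"),
                new Node().store("/docs/b.txt", "beta")));
        // The same call works regardless of which node stores the file.
        System.out.println(service.read("/docs/a.txt")); // prints alpha
        System.out.println(service.read("/docs/b.txt")); // prints beta
    }
}
```

The caller depends only on `FileService`, so nodes could be added, removed or relocated without changing client code, which is the sense in which the system appears as a single coherent whole.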

2.3.2 Characteristics of distributed systems Several key characteristics of distributed systems can make it reliable and deliver the greatest performance for the users. These characteristics are also acquired to achieve a careful design and utmost implementation. To accomplish these features a distributed system must possess the following characteristics: -‐

Resource Sharing: The main motivation for constructing distributed systems is sharing of resources. In distributed systems all resources can be accessed by clients and managed by servers. They also can be accessed by other user objects and encapsulated as objects. The Internet or Web is a clear example of resource sharing (Coulouris et al. 2005). Software of distributed system provides interfaces to enable the resources to be manipulated and managed in a very simple way in order to achieve effective sharing. A resource manager, which is a software module, manages a particular type of resources to perform its duty based on some methods and a set of management policies (Tari & Bukhres 2001).

- Openness: This characteristic determines whether the system can be extended and re-implemented in different ways. Openness is primarily measured by the degree to which new resource-sharing services can be added and made available for use by a variety of client programs. Achieving openness depends on the availability of the specification and documentation of the software interfaces of the system components. Accordingly, all key software interfaces of the components of a system must be available to software developers (Coulouris et al. 2005). However, publishing the software interfaces is only the starting point for adding and extending services in distributed systems. The challenge for designers is to tackle the complexity of a system made up of many components produced by different people. A closed distributed system is the opposite of an open one: the set of facilities and features it provides remains static over time, and new features and facilities cannot be added, which prevents the system from providing new resources. Openness can be viewed from two main perspectives, hardware extensibility and software extensibility. The first is the ability to add hardware components from various vendors to a distributed system, whereas the second is the ability to add new software components or modules from different vendors (Tari & Bukhres 2001).

- Scalability: Distributed systems are scalable in size; they can operate efficiently and effectively at several different scales. Scalability means that the system and its application software remain stable and do not need to be changed when the number of resources, the number of users and the overall scale of the system increase significantly. This property is extremely important because the number of requests processed by a distributed system tends to grow, rather than decrease, over time. Additional software and/or hardware might then be required to handle the increase. However, simply adding more hardware and software components may not resolve scalability issues: if the system is not scalable, it will not be able to use the additional components to process requests efficiently. A system is described as scalable if it is flexible enough to accommodate growth in its size or scale (Coulouris et al. 2005, Tari & Bukhres 2001).

- Concurrency: In distributed systems both applications and services can offer resources that are shared by clients, and many clients may attempt to access a shared resource at the same time (Coulouris et al. 2005). Concurrency in distributed systems can be described as the ability to process multiple requests or tasks at the same time. Typically a distributed system includes many computers, each of which can have one or more processors. The existence of multiple processors can be exploited to perform many requests or tasks in the same period of time, which is crucial for improving the overall performance of the system. For instance, a mainframe should be able to handle requests sent by several users at the same time. Furthermore, the software is responsible for ensuring that clients accessing the same resource at the same time do not conflict with one another. Concurrency is therefore very important for performance: without it, each request would have to be processed sequentially and performance would suffer (Tari & Bukhres 2001).

- Transparency: Application programmers and users should perceive a distributed system as a whole rather than as a collection of autonomous components. The user should not need to be aware of the location of services or of the transfer of data between a local machine and a remote one (Mantena 1998). The separation of components should be invisible, so that the system appears to its users and application programmers as a single entity rather than a set of independent components. The implications of transparency have a major influence on the design of the system software (Coulouris et al. 2005). Because a distributed system is separated by nature, transparency is required to hide all unnecessary details of this separation from clients. Implementing a distributed system may raise a number of issues because of its complexity, but this complexity must be hidden from users, who should not have to worry about it. Various forms of transparency have been proposed for distributed systems. There are mainly eight types: access transparency, location transparency, concurrency transparency, replication transparency, failure transparency, migration transparency, performance transparency and scaling transparency (Tari & Bukhres 2001).

- Fault Tolerance: Fault tolerance is the characteristic by which a distributed system appropriately handles the errors that might occur in it. A distributed system with good fault tolerance mechanisms can provide a high degree of availability, which is measured by the proportion of time that the system is available for use. Better fault tolerance increases availability, and it can be achieved through two approaches: software recovery and hardware redundancy. The former means designing software that can recover from faults when they are detected. The latter means tolerating hardware failures by means of redundant components (Coulouris et al. 2005, Tari & Bukhres 2001).
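To illustrate the concurrency characteristic, the following minimal Java sketch simulates many clients requesting the same shared resource at once. The class names SharedStock and ConcurrencySketch, the pool size and the stock figures are assumptions made for illustration and do not come from the cited sources. The synchronized methods stand in for the software's responsibility, described above, to prevent conflicting accesses.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical shared resource: a stock of items that many clients try to take.
class SharedStock {
    private int remaining;

    SharedStock(int initial) {
        this.remaining = initial;
    }

    // synchronized makes "check then decrement" one indivisible step,
    // so two clients can never take the same last item.
    synchronized boolean takeOne() {
        if (remaining > 0) {
            remaining--;
            return true;
        }
        return false;
    }
}

class ConcurrencySketch {
    // Runs `clients` concurrent requests against one shared resource
    // and returns how many of them were served.
    static int serve(int initialStock, int clients) {
        SharedStock stock = new SharedStock(initialStock);
        AtomicInteger served = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(8); // assumed pool size
        for (int i = 0; i < clients; i++) {
            pool.execute(() -> {
                if (stock.takeOne()) {
                    served.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return served.get();
    }

    public static void main(String[] args) {
        System.out.println("served = " + serve(100, 1000));
    }
}
```

With a stock of 100 items and 1,000 concurrent requests, exactly 100 requests are served. Without the synchronized keyword the check and the decrement of different threads could interleave, and more clients than items might be served.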

In Section 2.3.2 above, the main characteristics of distributed systems have been reviewed and discussed on the basis of previous research. Some of these characteristics, such as concurrency, scalability, reliability, performance and portability, can be chosen as criteria for a comparative study of three distributed programming languages: Java, C++ and C#. In addition, other features of distributed systems can also be selected as comparison criteria, such as high integrity, flexibility, security, ease of use and simplicity. In this research project some of these characteristics and features will be chosen as the main criteria for language comparison, in order to find out how best a programming language can be selected for a project based on distributed systems.

2.3.3 Benefits and Drawbacks of Distributed Systems

Distributed systems have several advantages, but they also have some drawbacks. This section discusses both.

Benefits of distributed systems:

• Shareability: Sharing of resources is one of the main characteristics of distributed systems; a good example of a shared distributed system is the Internet. Shareability can be defined as the ability of the comprising systems to use each other’s resources. This sharing takes place over a computer network to which each system is connected, using a shared protocol that manages communication between the systems. On the Internet, for example, each computer that wishes to share resources needs to be connected to the network and implement TCP/IP.

• Expandability/Scalability: Distributed systems are scalable in size. Additional hardware and/or software components can easily be added, and sometimes an entire new system can be added as a member of the overall system. This ability to admit a new system into the overall system is defined as the expandability of distributed systems. However, a distributed system might sometimes leave resources unused on under-utilised machines, which wastes money and time. This can be avoided by adding shared resources only when they are really required.

• Local Autonomy: A distributed system is responsible for managing its resources; in other words, it gives each of its member systems local autonomy over its own resources. Each system can therefore apply its own settings, local policies and access controls to its services and resources. This makes distributed systems ideal for organisations that consist of independent entities located in different places.

• Improved Performance: In a distributed system, response time starts to degrade as the number of clients accessing a particular resource increases. Several techniques can be used to improve performance, for instance replication, which allows the same resource to be copied, and load balancing, which distributes the access requests among these copies. In addition, resources may reside on different machines, in which case the separate nature of distributed systems helps to improve performance.

• Improved Availability and Reliability: In a distributed system, a disruption does not stop the entire system from providing its services and resources. Some resources might become unavailable, but the others remain available and accessible to clients. This is because a distributed system uses multiple computers, each of which manages its own separate resources. If these resources are also replicated, a disruption will have only a minimal impact on the system.

• Potential Cost Reductions: Cost effectiveness is the main advantage of distributed systems over centralized systems. For instance, suppose there are two CPUs, X and Y, where the performance of CPU X is four times that of CPU Y and CPU X can be acquired at twice the cost of CPU Y. Instead of paying more for a single CPU, an effective way to achieve a better price-to-performance ratio is to harness a large number of CPUs to process requests. Another way to reduce cost is to share a single distributed system for request processing among multiple organisations (Tari & Bukhres 2001).
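The replication and load-balancing technique mentioned under Improved Performance can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions: the class RoundRobinBalancer, the replica names and the round-robin policy are hypothetical choices, not a method prescribed by the cited sources, and real systems often use more elaborate policies.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical balancer that spreads requests over replicas of one resource.
class RoundRobinBalancer {
    private final List<String> replicas;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> replicas) {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("at least one replica is required");
        }
        this.replicas = new ArrayList<>(replicas);
    }

    // Returns the replicas in turn; floorMod keeps the index valid
    // even after the counter overflows.
    String pick() {
        int index = Math.floorMod(next.getAndIncrement(), replicas.size());
        return replicas.get(index);
    }

    // Sends `requests` requests through a fresh balancer and counts
    // how many each replica receives.
    static Map<String, Integer> distribute(List<String> replicas, int requests) {
        RoundRobinBalancer balancer = new RoundRobinBalancer(replicas);
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < requests; i++) {
            counts.merge(balancer.pick(), 1, Integer::sum);
        }
        return counts;
    }
}
```

Nine requests distributed over three replicas give each replica three requests, so no single copy of the resource carries the whole load.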


Drawbacks of distributed systems:

• Network Reliance: Distributed systems are network-based. All components are usually connected through a common network and rely on it to communicate with each other. Consequently, any problem or disruption in the network may disrupt all activity in the system. This is especially true of physical faults, such as failed routers and bridges or broken network cables.

• Complexities: The software of a distributed system is quite difficult to develop. It must be capable of dealing with errors that may occur in the components (computers and servers) of which the distributed system is made up. It must also be able to manipulate the resources of components that exhibit a wide range of heterogeneity.

• Security: Another disadvantage of distributed systems is security. Security issues may arise because collaboration and resource sharing between system components are so easy. This convenient access between components can become a security problem if proper security mechanisms are not provided. For instance, private resources may be exposed to a wider range of potential hackers, with unauthorised accesses launched from any computer linked to the system. In fact, a centralized system can normally be made more secure than a distributed system (Tari & Bukhres 2001).

2.4 Types of Distributed Systems

Distributed systems can mainly be classified into three types: distributed computing systems, distributed information systems and distributed pervasive/embedded systems. The first type is concerned with offering computation in a distributed way, while the second deals with providing information in a distributed manner. The last type is ubiquitous in nature and is known as the next generation of distributed systems. This section discusses these three types in order to distinguish between them.

2.4.1 Distributed Computing Systems

Distributed computing systems provide computation in a distributed context and are used for high-performance computing tasks. Nowadays computing power is required in many industries, such as finance, banking, life science and manufacturing. Computing architectures developed a decade ago will therefore probably need reconsideration, owing to fast technological progress in computing generally and in networking specifically. Distributed computing systems can be divided into two subgroups: cluster computing and grid computing. In a cluster computing system, a collection of similar workstations or PCs is connected by a very high-speed local area network such as Gigabit Ethernet or, more recently, InfiniBand. This kind of computing became popular when the price/performance ratio of workstations and personal computers improved. Homogeneity is a characteristic feature of cluster computing: the computers are largely the same, they usually run the same operating system and they are all connected via the same high-speed network.

Figure 2.1: An Example of Cluster Computing

In grid computing, however, the situation is rather different. Grid systems are frequently constructed as a federation of computer systems, each of which falls under a different administrative domain and may differ greatly in hardware, software and deployed network technology. In contrast to cluster computing, grid computing systems have a high degree of heterogeneity: no assumptions can be made regarding networks, operating systems, hardware, security policies or administrative domains (Belapurkar et al. 2009, Tanenbaum & Steen 2007).


2.4.2 Distributed Information Systems

Distributed information systems provide information, and store and retrieve data, in a distributed manner. An important goal of this type of system is to provide reliable access to shared data that can be distributed and accessed concurrently (Belapurkar et al. 2009). In many cases, a networked application simply consists of a server running the application, often including a database, and making it available to remote programs called clients. A client sends a request to the server to execute a specific operation, and after a while the response is sent back to the client. As applications became more sophisticated and were gradually separated into autonomous components, distinguishing database components from processing components, it became clear that applications should be allowed to communicate with each other directly to support integration. This led industry to concentrate on Enterprise Application Integration (EAI) (Tanenbaum & Steen 2006).

Many communication models have been deployed, but two are the most common. The first is the Remote Procedure Call (RPC): a procedure call to a remote server, which is often encapsulated in a transaction and is then known as a transactional RPC. The second is Remote Method Invocation (RMI), which allows calls to remote objects and was developed as the popularity of object technology increased. RMI is essentially similar to RPC; the only difference is that RMI operates on objects whereas RPC operates on applications. Moreover, distributed information systems are transaction-based systems.
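The idea behind a remote method invocation can be sketched in Java as follows. This is an in-process simulation under stated assumptions and does not use the java.rmi API: the names Account, AccountImpl, Request, Skeleton and RemoteCallSketch are hypothetical, and a direct method call stands in for the network round trip. The client-side stub packs the method name and arguments into a request message, and the server-side skeleton unpacks it and invokes the method on the target object.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical remote interface shared by client and server.
interface Account {
    int deposit(int amount);
}

// The "remote" object that holds the state on the server side.
class AccountImpl implements Account {
    private int balance;

    @Override
    public int deposit(int amount) {
        balance += amount;
        return balance;
    }
}

// The message that would be marshalled and sent over the network.
class Request {
    final String methodName;
    final Class<?>[] paramTypes;
    final Object[] args;

    Request(String methodName, Class<?>[] paramTypes, Object[] args) {
        this.methodName = methodName;
        this.paramTypes = paramTypes;
        this.args = args;
    }
}

// Server side: unpacks a request and invokes the named method on the target.
class Skeleton {
    private final Object target;

    Skeleton(Object target) {
        this.target = target;
    }

    Object dispatch(Request request) throws Exception {
        Method method = target.getClass().getMethod(request.methodName, request.paramTypes);
        return method.invoke(target, request.args);
    }
}

class RemoteCallSketch {
    // Client side: a stub that looks like a local Account but forwards every call.
    static Account stubFor(Skeleton skeleton) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object[] actual = (args == null) ? new Object[0] : args;
            Request request = new Request(method.getName(), method.getParameterTypes(), actual);
            return skeleton.dispatch(request); // stands in for the network round trip
        };
        return (Account) Proxy.newProxyInstance(
                Account.class.getClassLoader(), new Class<?>[] { Account.class }, handler);
    }

    public static void main(String[] args) {
        Account account = stubFor(new Skeleton(new AccountImpl()));
        System.out.println(account.deposit(50));
        System.out.println(account.deposit(25));
    }
}
```

The caller only sees the Account interface, which is the access and location transparency that such middleware aims to provide. Successive deposits of 50 and 25 return balances of 50 and 75, showing that the state lives in the server-side object rather than in the stub.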
The characteristic feature of transactions is that they are all-or-nothing: either all of the operations are executed or none of them is. This is one of the four characteristic properties of transactions, which are referred to by their initial letters as ACID:

Atomic (Atomicity): Transactions are atomic and happen indivisibly. Atomicity ensures that each transaction either executes completely or does not execute at all; if it happens, it happens as a single indivisible and instantaneous action.

Consistent (Consistency): Transactions are consistent and do not violate system invariants. If the system has invariants that must always hold, and they held before the transaction, then they will hold afterwards as well.

Isolated (Isolation): Transactions are isolated, or serializable. If two or more transactions are running at the same time, they do not interfere with each other, and the final result appears as though they had run sequentially in some system-dependent order. For instance, while a transaction T is in progress, other transactions T1 and T2 appear to happen either before T or after T, never interleaved with it.

Durable (Durability): Transactions are durable. Once a transaction commits, its results become permanent regardless of what happens afterwards; no failure after the commit can undo the results or cause them to be lost (Belapurkar et al. 2009, Tanenbaum & Steen 2007).

2.4.3 Distributed Pervasive/Embedded Systems

Distributed pervasive/embedded systems can be defined as a new paradigm for next-generation distributed systems. They are ubiquitous in nature and can be found everywhere. The main goal of this type of system is to make services and data readily available to users at any time and in any place, with the computers or devices hidden in the background of the everyday activities the user wishes to perform. Another goal of distributed pervasive systems is to provide consistent access to shared data that can be distributed and accessed concurrently. Compared with traditional distributed systems, the main characteristic of distributed embedded systems is instability.
Distributed pervasive systems include computing devices such as laptops, mobile devices, PDAs, smart boards and wireless sensors, which developers can configure and use to build a pervasive system that assists people in their daily lives. For example, a smart interactive room system in a hospital can be designed to assist both the care team and patients, improving care and simplifying the management of the patient’s daily life in hospital. Several examples of pervasive distributed systems are available nowadays, for instance smart home systems, augmented reality, context capture and pervasive healthcare (Siewe 2014, Belapurkar et al. 2009).
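Returning to the transactional behaviour described in Section 2.4.2, the all-or-nothing (atomicity) and consistency properties can be illustrated with a minimal in-memory Java sketch. The Bank class and its account names are hypothetical, and the copy-then-swap technique is only one assumed way of obtaining atomic behaviour within a single process; durability and distribution across machines are not addressed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory bank used to illustrate an all-or-nothing transfer.
class Bank {
    private Map<String, Integer> balances = new HashMap<>();

    synchronized void open(String account, int amount) {
        balances.put(account, amount);
    }

    synchronized int balance(String account) {
        return balances.get(account);
    }

    // The transfer works on a private copy and publishes it only if every
    // step succeeds, so observers see either the old or the new state.
    // synchronized keeps concurrent transfers from interleaving (isolation
    // within this one process only).
    synchronized boolean transfer(String from, String to, int amount) {
        Map<String, Integer> working = new HashMap<>(balances);
        Integer source = working.get(from);
        Integer target = working.get(to);
        if (source == null || target == null || from.equals(to)
                || amount <= 0 || source < amount) {
            return false; // abort: the committed balances were never touched
        }
        working.put(from, source - amount);
        working.put(to, target + amount);
        balances = working; // commit: a single reference swap
        return true;
    }
}
```

A transfer that would overdraw the source account is rejected and leaves both balances unchanged, while a valid transfer updates both accounts together. The invariant that the total amount of money stays constant holds before and after every transfer.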


In Section 2.4 above, three different types of distributed systems have been presented and briefly explained. Each is a wide area within the field of distributed systems that could be studied as a subject in its own right. However, their characteristics and features are mostly similar. This project will therefore focus on distributed systems in general and make a comparative study of three distributed programming languages in relation to these three types of distributed systems, to discover how best a programming language can be selected for a project based on distributed systems.

2.5 Middleware

One of the most popular technologies in distributed systems is middleware. Middleware is defined as a type of distributed software that connects different kinds of applications while providing distribution transparency to them. It can be used to bridge the heterogeneities that occur in the system. Middleware is divided into several categories based on significant standards or products in the market: sockets, Remote Procedure Call (RPC), Remote Method Invocation (RMI), Distributed Computing Environment (DCE), Distributed Component Object Model (DCOM) and Common Object Request Broker Architecture (CORBA). The details of these categories are not given here, because this section only aims to familiarise the reader with middleware technology in distributed systems.

Figure 2.2: The OSI Model and Middleware


Figure 2.2 above shows that the application, presentation and session layers of the OSI (Open Systems Interconnection) model can be replaced with middleware in distributed systems (Belapurkar et al. 2009, Tanenbaum & Steen 2007).

2.6 Summary

A distributed system is a group of independent computers connected by a high-speed network and equipped with distributed software. Distributed systems may have several different characteristics depending on their design and implementation. Some of these, such as openness, resource sharing and transparency, are crucial and play an important role in addressing the heterogeneity of the system. Despite providing the advantages of shareability, local autonomy, expandability, improved performance, reliability and availability, distributed systems rely heavily on a shared network for communication. Their most troublesome feature is complexity, which makes it difficult to develop software that can exploit many computers while hiding the heterogeneity of the system. Further, the sharing of resources and the easy access between system components may cause security issues, which is another disadvantage of distributed systems. Finally, distributed systems use middleware to link many different types of applications together and to offer distribution transparency between them.


Chapter – 3

An Overview of Distributed Programming Languages


3. Distributed Programming Languages

3.1 History of Distributed Programming Languages

Almost as soon as the first computers were built, it became clear that a tool was required to make them usable. Without this tool computers would not be the powerful means of assisting human beings in so many applications that they are today. This tool is the programming language, which makes a computer usable and enables the user to communicate with it. Many programming languages are available and in use today. Most of them have evolved from older languages such as C and C++, and they are likely to continue evolving. At the beginning of computer development, programming meant working with 0s and 1s. Later, in the 1940s, the first such system, called Plankalkül, was designed. It used symbols instead of zeros and ones, which was better than machine language and easier for humans to understand. Since then several generations of programming languages have been invented and developed:

First generation: The first generation of programming languages, the original computer programming languages, is known as machine language. Machine language programs the functions of a computer using sequences of 0s and 1s. It suited the computers of the time but was unattractive to humans, because it consists of many very simple instructions, all of which the programmer had to use. Another troublesome feature of machine language is that it is obscure and hard for human beings to understand, so programmers needed to spend a great deal of time and energy on it.

Second generation: The difficulty of understanding machine language motivated researchers to find other ways of programming. They therefore invented the second generation of programming languages, called assembly language. Assembly language was the first step in translating sequences of 0s and 1s into human words such as “add” or “subtract”, which were much simpler for programmers to understand. An assembly program is turned back into machine code (0s and 1s) by a special program called an assembler.

Third generation: Third-generation programming languages are known as high-level languages (HLLs). They were introduced to make programming easier for humans. The first high-level language was FORTRAN (FORmula TRANslator). FORTRAN was invented by John Backus at IBM in 1954 and released commercially in 1957. It was the most popular and most widely used programming language of its time. FORTRAN caught on very quickly because programmers were more productive with it. However, FORTRAN programs had to be compiled, that is, translated into machine language, before execution, and they did not run as fast as hand-written machine language programs. FORTRAN is still used today for programming mathematical and scientific applications. Many other high-level languages are available to programmers nowadays, such as C, C++, Java and C#.

Fourth generation: Languages of this generation are called fourth-generation programming languages (4GLs); SQL is a well-known example. Compared with high-level languages, 4GLs are closer to human language and are mostly used to access databases. For instance, FIND ALL RECORDS WHERE NAME IS “JOHN” is a typical 4GL command, which is very similar to human language and easy to understand. This makes 4GLs very simple to learn and makes programmers who use them more productive. Although 4GLs are easy to understand and learn, they are not as commonly used by programmers as third-generation languages.

Fifth-generation programming languages also exist nowadays, but the dominant programming ideas of this generation are still not clear. Many experts have investigated this issue, but their conclusions remain uncertain.
The possible candidates for this generation in the future are function-oriented programming, logic-oriented programming and object-oriented programming. Since this dissertation focuses only on C++, Java and C#, which are the most popular high-level languages in use today, only these three third-generation programming languages will be discussed in this project (MacLennan 1999, Chen 2003, Chen et al. 2005).

3.2 A brief History of High-Level Programming Languages

High-level programming languages started with FORTRAN (FORmula TRANslation) in 1954. It was developed by IBM for scientific and numeric purposes. The big burden on the language was what it inherited from earlier languages. LISP and ALGOL were both developed in 1958. LISP (LISt Processing) was designed for the manipulation of symbols and patterns and is one of the most popular languages for Artificial Intelligence (AI). ALGOL (ALGOrithmic Language), initially ALGOL 58, was mainly designed to be the successor of FORTRAN. Although ALGOL was a very powerful and strongly structured language, it was not as widely accepted by programmers as FORTRAN. In 1960 ALGOL 58 was revised and named ALGOL 60. In the same year another language, COBOL (COmmon Business Oriented Language), was developed for administrative purposes. It is still used by many businesses and government organisations (MacLennan 1999, Chen 2003, Chen et al. 2005).

In 1964, BASIC (Beginner’s All-purpose Symbolic Instruction Code) was designed and developed at Dartmouth College in the USA. BASIC was designed to let programmers experiment with programming and solve small, simple problems. After that PL/I, a combination of FORTRAN, ALGOL 60 and COBOL, was provided for both scientific and business purposes. It was not as efficient as FORTRAN but was better than COBOL. SIMULA, designed in 1967 for modelling and simulation, was the first language with object-oriented features and could also be used as a general-purpose language. PASCAL was designed between 1968 and 1969 and published in 1970 by Niklaus Wirth. PASCAL was an efficient language intended to encourage good programming practice through structured programming and data structuring. Later, in 1985, PASCAL was redesigned for object-oriented programming. In 1983 the US Department of Defense (DoD) developed the language Ada. The main goal was a common and powerful language that could be used by all US government institutions. Ada was a very powerful and well-structured programming language with many features not found in previous languages. However, it was large and quite complex (MacLennan 1999, Chen 2003, Chen et al. 2005).

Smalltalk combines object orientation with a graphical environment. It is a purely object-oriented language whose dynamic and interpreted nature enables programmers to work quickly and simply. However, it was never as popular or as widely used as C++. The concepts behind Smalltalk were very important to the continued development of object-oriented programming languages such as Java. The C language descends from CPL, which was developed jointly by the universities of Cambridge and London, and ultimately from ALGOL; C itself was designed at Bell Labs to develop the UNIX operating system. C was a language designed by programmers for their own use. It is flexible, portable, well structured and powerful, and can be used for many purposes such as operating systems, databases, computer animation and games. By adding object-oriented facilities from Simula to C, another popular language, C++, was introduced. After C++ a series of object-oriented languages were designed that are widely used by programmers today, such as Java and C#. The details of C++, Java and C# will be discussed in other chapters of this project (MacLennan 1999, Chen 2003, Chen et al. 2005).

Figure 3.1 below illustrates the evolution of third-generation, or high-level, programming languages (HLLs) from 1954 to 2002. Some of the most popular languages were explained in Section 3.2 above. However, this dissertation chooses only three of them (C++, Java and C#) as samples of distributed programming languages for comparison based on distributed systems in general.


Figure 3.1: History of the evolution of high-level programming languages from 1954 to 2002


3.3 Classification of Distributed Programming Languages

Distributed programming languages can be classified in various ways, for example by programming model, by typing, by mode of execution and by modularity. Classification by programming model is the most popular: programming languages are typically classified according to the paradigm that they support. The following are the main programming models in general use today for expressing a computation:

• Imperative programming languages
• Functional programming languages
• Object-oriented programming languages
• Logical programming languages

3.3.1 Imperative programming languages

In this class of programming languages, programs are decomposed into computation steps (instructions, statements or commands), reflecting the step-wise execution of programs on traditional hardware. Procedures or routines, also called sub-programs, are used to modularise the program. Imperative programs give an exact description of how to solve a given problem. FORTRAN, ALGOL 60, C and PASCAL are examples of imperative programming languages (Fernandez 2004, Beaudouin 1994).

3.3.2 Functional programming languages

Programs in this class are functions, which can be composed to create new functions in the same way as functions are built in the mathematical theory of functions. Functional languages are also called declarative languages, because they focus on what should be computed rather than on how it should be computed. Declarative languages emphasise the use of expressions, which are evaluated by simplification. Examples of functional languages are Haskell, Erlang, SML (Standard ML) and Caml (Fernandez 2004, Beaudouin 1994).
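The contrast between the imperative "how" and the functional "what" can be shown with one small computation written in both styles. The sketch uses Java, which is not a purely functional language, so the second method only approximates the functional style through the streams API; the class and method names are illustrative assumptions.

```java
import java.util.List;

class ParadigmSketch {
    // Imperative style: spells out how to compute the result,
    // step by step, by updating a mutable variable.
    static int sumOfEvenSquaresImperative(List<Integer> numbers) {
        int total = 0;
        for (int n : numbers) {
            if (n % 2 == 0) {
                total += n * n;
            }
        }
        return total;
    }

    // Functional style: states what is wanted by composing
    // functions (filter, map, sum) without explicit mutable state.
    static int sumOfEvenSquaresFunctional(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .mapToInt(n -> n * n)
                .sum();
    }
}
```

For the list 1, 2, 3, 4, 5 both methods return 20 (4 + 16), but only the first one prescribes the order of the individual steps.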


3.3.3 Object-Oriented programming languages

Object-oriented programs consist of a set of objects, organised in a hierarchy, that are accessed through the methods or operations defined on them. An obvious example of an object-oriented language is Java. Object-oriented design focuses on combining data (fields) and operations (methods) into single entities. Some research shows that object orientation is a general approach to programming rather than a specific, easily classifiable type of language. Object orientation is therefore sometimes considered a feature of imperative languages, but it can also be found in functional languages and combined with logic languages (Fernandez 2004, Beaudouin 1994).

3.3.4 Logical programming languages

Languages in this class define the problem to be solved rather than describing an algorithmic implementation. The main focus is on specifying the problem instead of stating the way in which it is solved, so logical languages are also declarative. The most popular logical programming language is Prolog, which was designed in the early 1970s by Colmerauer, Kowalski and Roussel. Modern logic programming languages combine logic programming with constraint solving (Fernandez 2004, Beaudouin 1994).

The classification discussed above is based on the programming model. Distributed programming languages can also be classified by a simpler scheme that divides them into logically distributed and logically non-distributed languages:

- Logically distributed languages: Parallel computations, such as processes, communicate by passing messages to each other. Because the address spaces of different computations do not overlap, the address space of the whole program can be distributed.

- Logically non-distributed languages: Parallel units communicate through data stored in an address space that is logically shared between all of them.
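The two communication styles can be contrasted in a short C++ sketch. This is only an illustration under stated assumptions: both variants run as threads inside one process, so the message-passing variant merely simulates non-overlapping address spaces by letting workers touch nothing but a channel. The `Mailbox` class and both function names are hypothetical.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Logically non-distributed style: parallel units communicate through
// data in a shared address space, protected here by a mutex.
int sum_with_shared_variable(const std::vector<int>& values) {
    int shared_total = 0;  // visible to every worker thread
    std::mutex guard;
    std::vector<std::thread> workers;
    for (int v : values) {
        workers.emplace_back([&shared_total, &guard, v] {
            std::lock_guard<std::mutex> lock(guard);
            shared_total += v;
        });
    }
    for (auto& w : workers) w.join();
    return shared_total;
}

// Hypothetical channel: data moves between the two sides only as messages.
class Mailbox {
public:
    void send(int message) {
        {
            std::lock_guard<std::mutex> lock(guard_);
            messages_.push(message);
        }
        ready_.notify_one();
    }
    int receive() {  // blocks until a message is available
        std::unique_lock<std::mutex> lock(guard_);
        ready_.wait(lock, [this] { return !messages_.empty(); });
        int message = messages_.front();
        messages_.pop();
        return message;
    }
private:
    std::mutex guard_;
    std::condition_variable ready_;
    std::queue<int> messages_;
};

// Logically distributed style: each worker keeps its value private and
// sends it as a message; only the receiver owns the running total.
int sum_with_messages(const std::vector<int>& values) {
    Mailbox mailbox;
    std::vector<std::thread> workers;
    for (int v : values) {
        workers.emplace_back([&mailbox, v] { mailbox.send(v); });
    }
    int total = 0;  // private to the receiving side
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += mailbox.receive();
    }
    for (auto& w : workers) w.join();
    return total;
}
```

In the first function every worker updates the same variable, so access must be synchronised. In the second, no worker ever sees the total, which is the property that allows the units to be placed on separate machines.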


Languages based on logically shared data have nevertheless been implemented on distributed computing systems whose components have no shared primary memory. The two categories are further partitioned into classes according to their communication mechanisms. The first category includes synchronous message passing, asynchronous message passing, rendezvous, RPC (Remote Procedure Call), multiple communication primitives, objects and atomic transactions. The second category distinguishes between implicit communication through function results, as used in parallel functional languages, shared logical variables in parallel logic languages, and distributed data structures. The classification is shown in Figure 3.2 (Bal et al. 1989, Bal 1992).

Figure 3.2: Language classification for distributed programming


3.4 Distributed Programming Techniques

3.4.1 RPC (Remote Procedure Call)

The remote procedure call (RPC) is one of the most popular and widely used techniques of distributed programming. It increases the portability, flexibility and interoperability of applications, and it reduces the complexity of developing applications that span several network protocols and operating systems by shielding the application developer from the details of the different network interfaces and platforms. With RPC, users work with remote procedures as if they were local procedures. Routines inside the RPC protocol define the remote procedure calls, and each call message must be matched to its corresponding reply message. Because RPC is a message protocol, it can also support call-back procedures from the server side. In RPC, each server provides a program, which represents a set of procedures. A particular remote procedure is identified precisely by the combination of server address, program number and procedure number. In the RPC model, the client calls a procedure that sends a data packet to the server. When the packet arrives, the server calls a dispatching routine, carries out the request and sends the response back to the client. The procedure call then returns the result to the client process. The RPC interface can be used for communication between processes located on different computers on a shared network, but it works equally well between processes running on the same computer (Golub et al. 2005, AIX 2004, Vondrak 2004).
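The call, dispatch and reply cycle described above can be sketched in C++ as follows. This is a simplified in-process model and not the API of any real RPC library: the message layout, the class names and the program and procedure numbers are all assumptions made for illustration, and the "network" is replaced by a direct function call.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <stdexcept>
#include <utility>
#include <vector>

// A call message names the remote procedure and carries its arguments.
// The transaction id lets a reply be matched to its call.
struct CallMessage {
    std::uint32_t transaction_id;
    std::uint32_t program;
    std::uint32_t procedure;
    std::vector<int> arguments;
};

struct ReplyMessage {
    std::uint32_t transaction_id;  // copied from the matching call
    int result;
};

// Server side: procedures are registered under (program, procedure) and
// a dispatching routine selects the one named in the call message.
class RpcServer {
public:
    using Procedure = std::function<int(const std::vector<int>&)>;

    void register_procedure(std::uint32_t program, std::uint32_t procedure,
                            Procedure body) {
        table_[std::make_pair(program, procedure)] = std::move(body);
    }

    ReplyMessage dispatch(const CallMessage& call) const {
        auto entry = table_.find(std::make_pair(call.program, call.procedure));
        if (entry == table_.end()) {
            throw std::runtime_error("unknown remote procedure");
        }
        return ReplyMessage{call.transaction_id, entry->second(call.arguments)};
    }

private:
    std::map<std::pair<std::uint32_t, std::uint32_t>, Procedure> table_;
};

// Assumed identifiers for a hypothetical "adder" program.
constexpr std::uint32_t kAdderProgram = 1;
constexpr std::uint32_t kAddProcedure = 1;

// Client side: the stub makes the remote procedure look like a local
// function. A real system would marshal the message and send it over a
// transport instead of calling the server object directly.
class AdderStub {
public:
    explicit AdderStub(const RpcServer& server) : server_(server) {}

    int add(int a, int b) {
        CallMessage call{next_id_++, kAdderProgram, kAddProcedure, {a, b}};
        ReplyMessage reply = server_.dispatch(call);
        if (reply.transaction_id != call.transaction_id) {
            throw std::runtime_error("reply does not match call");
        }
        return reply.result;
    }

private:
    const RpcServer& server_;
    std::uint32_t next_id_ = 0;
};
```

After a server has registered an addition procedure under `kAdderProgram` and `kAddProcedure`, client code simply calls `stub.add(2, 3)` and never sees the messages. This is the local-call illusion that RPC aims to provide.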


3.4.2 Message Passing Interface (MPI)

MPI defines sub-routines for sending and receiving messages and for performing collective operations. Owing to its wide use in the scientific community, MPI is recognised as a de facto standard for message-passing programming. Portability is the main advantage of MPI over older message-passing libraries. In this programming model, a computation consists of one or more processes that communicate with each other by calling library routines to send and receive messages. The number of processes participating in the computation is normally fixed for the duration of the program run. Single Program Multiple Data (SPMD) is the default programming model of MPI programs. MPI cannot be used to implement a dynamic client/server infrastructure (that is, one in which clients execute independently of the server), because it offers no facility for remote procedure access or for creating interfaces. However, most applications developed in a dedicated and constrained environment employ some form of client/server model (Golub et al. 2005).

3.4.3 Common Object Request Broker Architecture (CORBA)

CORBA is middleware that provides the standardisation, integration and interoperability required in today's heterogeneous world. Most modern enterprise applications are distributed across heterogeneous environments that include different operating systems, hardware platforms, network protocols and databases. They usually contain components written in several different programming languages and are integrated with legacy applications that might be very expensive to rewrite or port.

CORBA supports software development for heterogeneous environments by introducing a standard concept of distributed objects and by clearly separating the implementation of these objects from their interfaces through an Interface Definition Language (IDL). CORBA was developed by the Object Management Group (OMG). It was initially developed to support a number of operating systems, networks and programming languages, and it later became a standard for calling object methods over a network. Programming languages such as Java support CORBA through an IDL mapping, which allows their methods to be called easily by other software. IDL is a language-neutral notation, so any programming language that has an IDL mapping can use it (Brose et al. 2001, Golub et al. 2005).

3.4.4 Java Remote Method Invocation (Java RMI)

Another technique of distributed programming is Java RMI. Java RMI is an effective and robust solution for developing distributed applications whose programs are written in Java, and it offers a simple, easy-to-use framework. The primary goal of the RMI designers was to enable programmers to develop distributed Java programs with the same syntax and semantics that are used for non-distributed Java programs. To achieve this, the class and object model of a single Java Virtual Machine (JVM) had to be mapped very carefully onto the new model of a distributed environment with multiple JVMs. Java RMI does not require an Interface Definition Language (IDL), because RMI operates in a homogeneous, Java-only environment (Reilly & Reilly 2002, Golub et al. 2005).

3.4.5 Distributed Component Object Model (DCOM)

The Distributed Component Object Model is a high-level network protocol that relieves the user of writing the network code needed to control and maintain communication between distributed components interacting over a shared network. It supports communication between objects located on different computers on a LAN, a WAN or the Internet. Microsoft extended the Component Object Model (COM) specifically to provide this support. Because DCOM is a seamless evolution of COM, existing investment in COM-based tools, applications, components and knowledge can be reused when moving into standards-based distributed computing. Microsoft later introduced the Object Remote Procedure Call (ORPC), a new set of low-level call interfaces. ORPC extends the programming model to accommodate distributed objects and is layered on top of the standard Distributed Computing Environment RPC (DCE RPC) (Hoang 2004, Golub et al. 2005).


3.4.6 .NET Framework Remoting

.NET Remoting provides a framework that enables objects to interact with each other across application domains. It offers a number of services, including communication channels responsible for delivering messages to remote applications, and support for object activation and lifecycle management. Formatters encode and decode messages before they are transferred through the communication channel. Applications can use binary encoding when performance is crucial, or XML encoding when interoperability with other distributed technologies is essential. XML encoding uses the Simple Object Access Protocol (SOAP) to transport messages between application domains. .NET Remoting was designed with security in mind, so it provides several points at which messages and the serialised data stream can be accessed before the stream is sent over the channel. Managing the lifecycle of remote objects would be extremely difficult without built-in framework support. .NET Remoting offers two kinds of activation: Client Activated Objects (CAOs) and Server Activated Objects (SAOs). CAOs are controlled by a lease-based lifecycle manager, which ensures that an object is destroyed when its lease expires. For SAOs, developers can choose between the singleton and the single-call model. The lifecycle of a singleton object is also controlled by a lease (McLean et al. 2003, Rammer 2002, Golub et al. 2005).
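The lease idea can be illustrated with a small C++ sketch. It is not the .NET Remoting API. The `LeaseManager` class, its method names and the use of integer ticks as a clock are all assumptions chosen to keep the example deterministic, and renewing the lease on every call is one plausible policy rather than a description of the framework's exact behaviour.

```cpp
#include <map>
#include <string>

// Hypothetical lease manager: every activated object holds a lease, and
// an object whose lease has expired is destroyed at the next sweep.
class LeaseManager {
public:
    explicit LeaseManager(int lease_ticks) : lease_ticks_(lease_ticks) {}

    // Activating an object grants it a lease that starts at time `now`.
    void activate(const std::string& object_id, int now) {
        expiry_[object_id] = now + lease_ticks_;
    }

    // A call on a live object renews its lease. Returns false when the
    // object is unknown or its lease has already expired.
    bool call(const std::string& object_id, int now) {
        auto entry = expiry_.find(object_id);
        if (entry == expiry_.end() || entry->second <= now) {
            return false;
        }
        entry->second = now + lease_ticks_;
        return true;
    }

    // Destroys every object whose lease has expired; returns how many.
    int collect_expired(int now) {
        int destroyed = 0;
        for (auto it = expiry_.begin(); it != expiry_.end();) {
            if (it->second <= now) {
                it = expiry_.erase(it);
                ++destroyed;
            } else {
                ++it;
            }
        }
        return destroyed;
    }

    // True while the object has not yet been destroyed by a sweep.
    bool exists(const std::string& object_id) const {
        return expiry_.count(object_id) != 0;
    }

private:
    int lease_ticks_;
    std::map<std::string, int> expiry_;  // object id -> expiry time
};
```

With a lease of five ticks, an object that is called at tick 3 survives a sweep at tick 6, while an object that was activated at tick 0 and never called is destroyed by that sweep.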

3.5 Criteria for Language Selection

Many programming languages exist today, including C, C++, Java and C#, and each has its strengths and weaknesses. This abundance can make it difficult for software engineers to decide which language to use for a project. Sometimes a language is selected because the developers know it and like it; sometimes it is chosen because it is the newest or the most widely used. Such decisions may be reasonable. However, a language should be selected on the basis of its ability to solve the problem at hand and of relevant technical and economic criteria. Language selection is therefore an important decision in software development. The technical criteria on which software engineers or programmers can base this decision are:


• Efficiency: Can the selected language be implemented efficiently? Some aspects of object-oriented programming, such as dynamic dispatch and class tags, involve run-time overheads. Run-time checks and garbage collection are both costly, and garbage collection can slow the program down at unpredictable times. Interpretive code is roughly ten times slower than native machine code. If the program contains critical parts that must be highly efficient, does the language allow them to be tuned by resorting to low-level coding or by calling procedures written in a lower-level language? (Kulkarni et al. 2008, Findlay & Watt 2004).

• Reliability: Is the language designed so that programming errors can be detected and eliminated as quickly as possible? Errors detected by compile-time checks are guaranteed to be absent from the running program. Errors detected by run-time checks are guaranteed to cause no harm other than throwing an exception. Errors that are not detected at all, however, can cause unlimited harm (for example, data corruption) before the running program crashes (Kulkarni et al. 2008, Findlay & Watt 2004).

• Portability/Platform: Can programs written in the language run on multiple operating systems or platforms? In other words, does the language allow programmers to write portable code that can be moved from one platform to another without major changes? (Aldrawiesh 2009, Kulkarni et al. 2008).

• Readability: Code should be written so that other programmers can read and modify it. Does the chosen language help or hinder good programming practice? Writing readable code is very difficult if the language enforces cryptic syntax, very short identifiers, default declarations or an absence of type information. This matters because code is read, by its author and by other programmers, more often than it is written (Findlay & Watt 2004, Kulkarni et al. 2008).

• Reusability: Does the selected language support effective reuse of program units? If it does, the project can be developed faster by reusing program units that have already been tried and tested. New program units may also be developed in a form suitable for future reuse (Kulkarni et al. 2008).

• Scale: Does the selected language support the orderly development of large-scale programs? The language should allow programs to be built from compilation units that have been coded and tested separately, possibly by different programmers (Findlay & Watt 2004).

• Modularity: Does the selected language support the decomposition of programs into suitable program units, so that what a program unit is to do can be clearly distinguished from how it will be coded? This separation of concerns is an essential intellectual tool for managing the development of large-scale programs (Findlay & Watt 2004).

• Level: Does the language encourage programmers to think in terms of high-level abstractions oriented to the project, or does it force them to think most of the time about low-level details such as bits and pointers? Low-level code is essential in some parts of a project, but it is notoriously error-prone, especially when pointers are involved (Findlay & Watt 2004).

• Availability of tools and compilers: Are high-quality compilers and tools available for the selected language? A good-quality compiler enforces the language's syntax and type rules, generates correct and efficient object code, generates run-time checks to trap errors that cannot be detected at compile time, and reports errors clearly and accurately. Is a good-quality integrated development environment (IDE) also available? An IDE improves productivity by combining a program editor, compiler, linker, debugger and related tools into a single integrated system (Findlay & Watt 2004, Kulkarni et al. 2008).

• Other criteria: Further criteria for language selection include familiarity, data modelling and process modelling, writability, simplicity and ease of use, extensibility, security and continuity. For brevity, these criteria are not discussed in this dissertation.

This section has explained several criteria for language selection. These are the main criteria that professional software engineers or programmers should rely on when selecting a language, rather than choosing on poor grounds such as commercial pressure, fashion, fear of change, inertia, prejudice, conformism or fanaticism. With so many criteria available, it is not easy to decide which to apply. Because this dissertation is a comparative study of C++, Java and C# for distributed systems, only those criteria that apply to all three languages and are most closely associated with distributed systems are used to determine which language is more suitable for a distributed systems project. The selected criteria, together with the evaluation and findings, are presented in the comparison chapter [Chapter 5].


Chapter – 4

Samples of Distributed Programming Languages (C++, Java and C#)


4. Samples of distributed programming languages

This chapter describes three programming languages that are among the most popular for developing distributed applications today. It is divided into three sections: section 4.1 discusses C++ as the first sample of a distributed programming language, section 4.2 explores Java as the second sample, and section 4.3 discusses C# as a further, more modern sample.

4.1 C++ as a first sample of distributed programming languages

4.1.1 Introduction to C++ Programming Language

C++ is a complex, object-oriented, high-level programming language. It was designed in 1979 by Bjarne Stroustrup and developed at AT&T Bell Labs in the 1980s. It then evolved gradually, gaining virtual functions, subclasses, overloading and several other features. The internationally recognised standard version of C++ was agreed in 1998. C++ is derived from the C language, and its syntax is similar to that of C, with many extensions and additional keywords required to support object-oriented features such as classes and inheritance. Although C++ has inherited some faults from C, it is one of the most popular and suitable languages for developing applications on PCs and UNIX systems. C++ offers a broad range of object-oriented programming (OOP) features, including dynamic memory management, multiple inheritance, overloading, strong typing, polymorphism, templates and exception handling. It also supports the usual features expected of an application language, for instance type conversions, full input/output facilities and a variety of data types including arrays, structures and strings. The Standard Template Library (STL) of C++ provides a set of collection and abstract data type facilities. The C++ language was unified in 1989 and finally standardised by the American National Standards Institute (ANSI) and the International Organisation for Standardisation (ISO) (Kulkarni et al. 2008, Findlay & Watt 2004, Aldrawiesh 2009).


4.1.2 Characteristics of C++ Language

Because C++ is an extension of C, it has all the features and characteristics of the C programming language in addition to its own. The characteristics that C++ shares with C are:

- Modularity: Universally usable modular programs
- Efficiency: Efficient, close-to-the-machine programming
- Portability: Portable programs for several operating systems or platforms

C++ can be described as a hybrid language that includes the full functionality of C, so it is not a purely object-oriented programming (OOP) language. For that reason, the large body of existing C source code can be used in C++ programs. In addition, C++ supports the concepts of object-oriented programming, which are:

- Data encapsulation: controlled access to object data
- Data abstraction: the creation of classes to define objects
- Polymorphism: the implementation of instructions that can have varying effects during program execution
- Inheritance: the creation of derived classes, including multiply derived classes

These are the main characteristics of C++. Breedlove and Randal (2008), in the book C++: An Active Learning Approach, state three further characteristics of the C++ programming language:

- Simplicity and ease of use: C++ is an enhanced and easier-to-use version of the C language. For example, the syntax for reading data is much simpler in C++ than in C.
- Object orientation: C++ provides strong support for powerful object-oriented mechanisms such as inheritance and polymorphism.
- Data abstraction by means of classes: C++ provides classes to support object-oriented programming.

In extending C, C++ also added several language elements, for instance templates, exception handling and references. These elements are very important for implementing programs efficiently, but they are not strictly object-oriented programming features (Prinz & Kirch-Prinz 2002, Randal & Breedlove 2008).
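The object-oriented concepts listed above, together with one of the non-object-oriented additions (templates), can be seen side by side in a short sketch. The shape classes and the `total_area` function are illustrative assumptions, not examples taken from the cited books.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Data abstraction: the class defines what every shape object can do.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;  // polymorphic operation
    const std::string& name() const { return name_; }

protected:
    explicit Shape(std::string name) : name_(std::move(name)) {}

private:
    std::string name_;  // encapsulation: reachable only through name()
};

// Inheritance: Rectangle and Triangle are derived from Shape.
class Rectangle : public Shape {
public:
    Rectangle(double width, double height)
        : Shape("rectangle"), width_(width), height_(height) {}
    double area() const override { return width_ * height_; }

private:
    double width_;
    double height_;
};

class Triangle : public Shape {
public:
    Triangle(double base, double height)
        : Shape("triangle"), base_(base), height_(height) {}
    double area() const override { return 0.5 * base_ * height_; }

private:
    double base_;
    double height_;
};

// Template: a generic function, not an object-oriented feature. It works
// with any container of pointer-like handles to shapes.
template <typename Container>
double total_area(const Container& shapes) {
    double total = 0.0;
    for (const auto& shape : shapes) {
        total += shape->area();  // polymorphism: resolved at run time
    }
    return total;
}
```

A container holding a 2 by 3 rectangle and a triangle with base 4 and height 1 gives a total area of 8. The same `area()` call has a different effect depending on the object it reaches.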

Figure 4.1: Characteristics of C++ Programming Language

4.1.3 Features of C++ as a Distributed Language

C++ has many features as a programming language. It is a compiled language that is translated directly to machine code, which makes optimised C++ one of the fastest languages available. It is a strongly typed but unsafe language: C++ expects programmers to know what they are doing and, in return, allows a very high degree of control. It supports both static and dynamic type checking, because type conversions can be checked either at compile time or at run time, which provides flexibility. C++ also supports both manifest and inferred typing, so verbosity can be avoided where desired, which adds a further degree of flexibility. C++ offers several paradigm choices, with strong support for the procedural, generic and object-oriented programming paradigms, among others. It is a portable language, and as an open language it remains one of the most frequently used programming languages. A wide range of compilers is available for it on many different platforms. C++ is an open language standardised in 1998 by the ISO committee for the C++ programming language. It is also upwards compatible with the C language and has extensive library support (Cplusplus 2002).

The features described above are mostly general language features. As a distributed programming language, C++ also has features that are well suited to programming distributed systems, such as exception handling, abstract base classes, virtual functions and virtual inheritance. These features help to separate object interfaces from their implementations. C++ also lacks some features that would help: support for garbage collection would greatly reduce the complexity of memory management, and before and after methods would help with the marshalling and demarshalling of parameters passed to remote method calls. The lack of such features may increase the complexity of developing concise and robust distributed applications (Schmidt & Vinoski 1995).

Based on the characteristics and features discussed above, C++ may be a suitable language for developing distributed systems, and it is chosen as a sample for comparison because several of its features and characteristics, such as portability, efficiency, simplicity and ease of use, are closely related to distributed systems. These features and characteristics can also serve as criteria for language comparison, and they are the main reason for choosing C++ as the first sample in this comparative study. However, C++ may not have all the features required for developing distributed applications, and it may not be the best choice for a given project. This is examined in the comparison chapter [Chapter 5].
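Two of the language features mentioned in this section, conversions checked at compile time or at run time and manifest or inferred typing, can be shown in a few lines. The sketch assumes a C++11 or later compiler, because `auto` type inference is not part of the 1998 standard. The message types and function names are hypothetical.

```cpp
#include <string>
#include <utility>

struct Message {
    virtual ~Message() = default;
};

struct TextMessage : Message {
    explicit TextMessage(std::string t) : text(std::move(t)) {}
    std::string text;
};

struct PingMessage : Message {};

// Run-time (dynamic) checking: dynamic_cast inspects the actual object
// and yields nullptr when it is not of the requested type.
std::string describe(const Message& message) {
    if (const auto* text = dynamic_cast<const TextMessage*>(&message)) {
        return "text: " + text->text;
    }
    return "not a text message";
}

// Compile-time (static) checking: static_cast is validated by the
// compiler and performs no run-time test.
int truncate_to_int(double value) {
    return static_cast<int>(value);
}

// Manifest typing: the type of the variable is written out.
int doubled_manifest(int x) {
    int result = x * 2;
    return result;
}

// Inferred typing: the compiler deduces the type from the initialiser.
int doubled_inferred(int x) {
    auto result = x * 2;
    return result;
}
```

Passing a `PingMessage` to `describe` is safe because the failed conversion is detected at run time, whereas an incorrect `static_cast` between unrelated class types would be rejected by the compiler.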


4.1.4 How C++ Relates to Java and C#?

C++ can be regarded as the parent of both Java and C#. Although Java and C# added, removed or modified many features of C++, their syntax is, overall, nearly the same as that of C++, and their object model is similar to the one used by C++. For this reason, it is often said that Java and C# borrowed their syntax and object model from C++. However, each language was designed for a different type of computing environment, and their roles in building distributed systems differ accordingly. C++ was designed to produce high-performance programs for a specific type of central processing unit (CPU) and operating system. For example, C++ might be the best language for writing a high-performance program that runs on an Intel Pentium under the Windows operating system. Java and C# were both developed in response to the particular programming needs of the highly distributed, networked environment that typifies much of contemporary computing. Java was designed to allow the creation of portable code that can run on many different platforms across the Internet, so a Java program can run on a wide range of CPUs, operating systems and environments and can move freely on the Internet. C# was designed for Microsoft's .NET Framework, which supports mixed-language, component-based code that works in a networked environment. Although both Java and C# enable the creation of portable code for highly distributed environments, this portability comes at the cost of efficiency, and Java and C# programs generally execute more slowly than C++ programs.

Therefore, C++ is suited to creating high-performance software, while Java and C# are suited to creating highly portable software. The important point is that C++, Java and C# were designed to solve different sets of problems. The question is not which language is best in general, but which one is right for the project or job at hand (Schildt 2003).


4.1.5 The Role of C++ and CORBA in Distributed Systems

C++ can be used to simplify distributed applications. When used properly, it is well suited to the construction of distributed object support systems and of the object components themselves. C++ combines high-level abstractions with the efficiency of a low-level language such as C. Many of the environments and frameworks that have emerged for distributed object computing are based on C++ because of its widespread appeal and the availability of commercial CORBA tools and of freely available software such as ACE (the Adaptive Communication Environment), which supports object-based distributed programming in C++. However, C++ as standardised in 1998 has no built-in support for parallelism and no keyword primitives for it. Nothing within that version of the language specifies that multiple statements may execute in parallel, whereas other languages use built-in parallelism as a selling point. For the most part, the 1998 ISO standard does not mention multithreading (Hughes & Hughes 2003, Schmidt & Vanderbilt 2008). A standard threading library was added only in the 2011 revision of the standard (C++11).

CORBA, the Common Object Request Broker Architecture, is middleware designed to allow application programs to communicate with each other regardless of their programming languages, their hardware and software platforms, the networks over which they communicate and their implementers. This standard, adopted by the Object Management Group (OMG), enables interoperability between distributed applications in heterogeneous environments irrespective of their location. CORBA's main objective is to automate common network programming tasks such as object registration, location and activation, request demultiplexing, framing and error handling, parameter marshalling and demarshalling, and operation dispatching. The software intermediary that performs this automation on behalf of the application is known as the ORB (Object Request Broker). Various ORBs have been implemented to support a variety of programming languages, including C++ and Java (Tari & Bukhres 2001, Coulouris et al. 2005).

CORBA is designed to support many programming languages, and the most popular language for deploying CORBA objects is C++. The Interface Definition Language (IDL) of CORBA is used to define the interfaces to the objects needed in a distributed system. It allows the programmer to define interfaces to objects without specifying how those interfaces are implemented. To implement a CORBA IDL interface, the programmer defines a C++ class that can be accessed through the interface and then creates objects of that class within a server application. The mapping from IDL to C++ supports programmers during the development of CORBA applications in C++. CORBA offers C++ programmers several benefits when building distributed applications: interoperability across platforms and programming languages, location transparency, legacy integration, programmer productivity, open standardisation, vendor independence and the reuse of CORBA services and facilities (Henning & Vinoski 1999, Vogel et al. 1999, de Paraga 2008).
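The separation between an interface and the C++ class that implements it can be sketched without any CORBA library. In the example below an abstract base class stands in for the code that an IDL compiler would generate. The `Account` interface, the `AccountImpl` class and the `Server` registry are hypothetical names, and none of the real IDL-to-C++ mapping types or ORB calls are used.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Stand-in for what would be generated from a hypothetical IDL interface:
//   interface Account { long balance(); void deposit(in long amount); };
// Clients see only this interface, never the implementation.
class Account {
public:
    virtual ~Account() = default;
    virtual long balance() const = 0;
    virtual void deposit(long amount) = 0;
};

// Implementation class written by the server programmer.
class AccountImpl : public Account {
public:
    long balance() const override { return balance_; }
    void deposit(long amount) override {
        if (amount < 0) {
            throw std::invalid_argument("negative deposit");
        }
        balance_ += amount;
    }

private:
    long balance_ = 0;
};

// Toy server application: it creates implementation objects and hands
// out references typed by the interface only. In CORBA an ORB would sit
// between this server and its remote clients.
class Server {
public:
    void create_account(const std::string& name) {
        objects_[name] = std::make_shared<AccountImpl>();
    }

    std::shared_ptr<Account> resolve(const std::string& name) const {
        auto entry = objects_.find(name);
        if (entry == objects_.end()) {
            throw std::runtime_error("no such object");
        }
        return entry->second;
    }

private:
    std::map<std::string, std::shared_ptr<Account>> objects_;
};
```

Client code that obtains a reference through `resolve` depends only on `Account`, so `AccountImpl` could be replaced, or moved behind a network proxy, without changing the client.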

4.2 Java as a second sample of distributed programming languages 4.2.1 Introduction and historical perspective of Java In the early 1990s Java was initially designed by a team at Sun Microsystems for embedded system applications. It was originally named as “Oak” by its developer James Goslin. Java is relatively young programming language that had very humble beginnings. At the beginning, it was mainly designed to develop the software for embedding programs into customer’s electronic handheld devices such as Personal Digital Assistant (PDA), home security systems and microwave ovens. Although Java was initially designed only for embedded system applications, it evolved quickly into a suitable language for Web applications. James Goslin modified a C++ compiler and then from that modification created a new powerful language rather than adapting C++. This language is called Java that has many interesting features such as efficiency, flexibility and portability. However, in the case of using Java improperly several unpleasant side effects can happen such as, run-time errors including cross pointers, memory leaks and slow performance (Morelli & Walde 2012, Reilly & Reilly 2002, Findlay & Watt 2004). Later, Java has become a portable object-oriented programming language that carried several advantages from C++ language and borrowed many ideas from other objectoriented languages such as, objective-C, Mesa and Modula-3. However, some disadvantages of Java were born. After that the Oak team tried to incorporate the Java language into a Web browser, because the focus from customer electronics devices is changed to online services. From that time the name of the language changed to Java and the first Web browser with the capability of running Java software was generated. This


new browser was called “HotJava”. It was released in March 1995 and changed the way people looked at the Web. Instead of only static pages and pages generated dynamically on the server side, the Web today also has active documents in the form of executable Java applets. Both Microsoft and Netscape licensed Java applet technology for use in their Web browsers, and that led to the success of the Java language (Reilly & Reilly 2002, Kulkarni et al. 2008).

4.2.2 The Java Language Features

The Java language was designed with a number of interesting features because of its original intended role as a language for programming embedded applications and its later support for Internet applications and the World Wide Web (WWW). Several of these features are also present in other programming languages. The rapid adoption and sheer popularity of Java among programmers indicate that Sun Microsystems found the right mix of functionality and sophistication. The following are some interesting features of Java:

• Java is a distributed programming language: Java programs can be designed to run across computer networks. The language comes with an extensive collection of classes and code libraries that facilitate the building of distributed applications. Java is regarded as well suited to applications on corporate networks, mainly because Java software can be used directly for these types of applications, which makes it particularly easy to build software systems for the Internet and the World Wide Web.

• Java is platform independent (portable): Java programs can run on different platforms or operating systems such as Windows and Macintosh. Java’s slogan is “Write once, run anywhere”, which means that Java programs can run without changes on different types of computers. This degree of portability, which is rarely found in other high-level programming languages, makes Java well suited to WWW applications.

• Java is an object-oriented language: Object-oriented languages divide programs into separate modules, called objects, which encapsulate the data and operations of the program. Object-oriented programming (OOP) and object-oriented design are therefore a preferred approach for organising programs and for building complex distributed systems. OOP also offers programmers several advantages: objects are often simpler to work with than procedures, and writing code in an OOP language such as Java is more productive. In addition, visibility modifiers and class inheritance make OOP languages safer and easier to work with than older procedural languages.

• Java is robust: Robust means that errors in Java programs usually do not affect the whole program or lead to system crashes, whereas errors in languages such as C++ more often cause crashes. Java also detects many potential errors at compile time, before the program is run.

• Java is a secure language: Java was designed with security in mind. It contains features that protect against untrusted code, which might carry a virus or corrupt the program and the system in some way. For instance, web-based Java programs that are downloaded to a user’s browser can be prevented from reading and writing information on the user’s PC.

• Java is simple and easy to use: Compared with C++, Java is much simpler to learn and easier to use. A programmer can access objects only via object references; direct memory access is not possible because Java has no pointers. Java also does not allow multiple inheritance of classes: a class can inherit from only one class. This makes coding much simpler, which is important for many types of applications, especially networking applications. The simplicity of Java’s design and the easy access to its libraries make Java far simpler than many other object-oriented languages.

• Java is a multi-threaded programming language: As a multi-threaded language, Java supports concurrent processing with shared memory for application data and code. This allows threads to conserve memory and, where needed, to work collaboratively by interacting with each other. This feature makes Java an attractive choice for almost any type of programming. Other programming languages such as C++ and C# also support multithreading, but in the form of operating system calls or an add-on API. Java, however, was designed from the ground up to support multi-threaded programming and offers several keywords that simplify the writing of threads and thread-safe code.

Besides the significant features described above, Java can also be chosen as an introductory programming language. Perhaps the main reason is Java’s potential for bringing fun and excitement into learning how to program. Even novice programmers can write a simple computer game or a graphical application that can easily be distributed on a Web page. In addition, automatic garbage collection in the Java Virtual Machine (JVM), Internet awareness, the simplicity of the language design and easy access to its libraries have made Java one of the most successful and widely used languages (Morelli & Walde 2012, Reilly & Reilly 2002).

4.2.3 The Importance of Java in Distributed Systems

Many researchers now believe that object-oriented programming in distributed systems can no longer be imagined without Java. Since its first beta version in 1995, it has experienced a boom unmatched by other programming languages. There are good reasons for Java’s popularity, such as its portability, its ability to run in Web browsers and its security concepts. With these three aspects alone, Java provides a combination of advantages not previously found in other programming languages, which makes it appear an ideal language for the needs of the Internet today. Another significant aspect is that Java helps to reduce complexity in the field of distributed systems by integrating some important mechanisms for programming distributed applications directly into the language, and its platform independence helps to overcome heterogeneity.
The programming of distributed systems can also be simplified by an appropriate use of Java. Thus, with the help of Java RMI and CORBA, Java can be said to cover all three aspects that exist in any distributed system: distribution, persistence and concurrency (Boger 2001, Kulkarni et al. 2008).
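The concurrency aspect, and the thread-related keywords mentioned in section 4.2.2, can be illustrated with a minimal Java sketch. The class name `SharedCounter` and the method `countWithThreads` are hypothetical names introduced here for illustration only; the sketch assumes nothing beyond the standard `Thread` class and the `synchronized` keyword.

```java
// Hypothetical demo class: several threads update one shared counter.
class SharedCounter {
    private int value = 0;

    // synchronized makes each increment atomic with respect to the other
    // synchronized methods of the same object, so no update is lost.
    synchronized void increment() {
        value++;
    }

    synchronized int get() {
        return value;
    }

    // Starts `threads` threads that each increment one shared counter
    // `perThread` times, waits for all of them and returns the final value.
    static int countWithThreads(int threads, int perThread) {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // 4 threads x 10000 increments each
        System.out.println(countWithThreads(4, 10_000)); // prints 40000
    }
}
```

Without the `synchronized` modifier the threads could overwrite each other's updates and the final value could be lower than expected, which is the kind of shared-memory hazard that these keywords are meant to guard against.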


4.2.4 What does CORBA offer Java programmers?

Java was not originally designed to support the development of distributed systems. Before the advent of Java Remote Method Invocation (RMI), the only direct support in Java for implementing distributed applications was the network library classes in the package java.net. The java.net classes offer an Application Programming Interface (API) to sockets and for the handling of URLs. The URL API provides high-level access to Web resources, while sockets are low-level abstractions that provide access to transport protocols such as TCP/IP and UDP/IP. However, the socket API provides neither connection management nor distribution transparency. In contrast, the Java language binding for the Object Management Group Interface Definition Language (OMG IDL) offers application programmers the high-level distributed object paradigm of CORBA:

- Access to objects that are implemented in other programming languages
- Access to objects irrespective of their location (location transparency)
- Interfaces defined independently of implementations
- Access to standard CORBA facilities and services
- Automatic code generation to deal with remote invocations (Brose et al. 2001).
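To make the contrast with the low-level socket API described at the start of this section more concrete, the following is a minimal sketch of a one-shot echo exchange over a loopback TCP socket using only `java.net` and `java.io`. The class `LoopbackEcho` and its helper methods are hypothetical names chosen for illustration, and the sketch assumes that the local machine allows loopback connections.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one line is sent to a temporary echo server.
class LoopbackEcho {

    // Wraps the socket's input stream in a line-based UTF-8 reader.
    static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(
                new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    // Wraps the socket's output stream in an auto-flushing UTF-8 writer.
    static PrintWriter writer(Socket s) throws IOException {
        return new PrintWriter(
                new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    // Sends one line to a one-shot echo server and returns the reply.
    static String echoOnce(String line) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the operating system choose any free port.
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread serverThread = new Thread(() -> {
                // The connection is managed by hand: accept, read, reply, close.
                try (Socket peer = server.accept();
                     BufferedReader in = reader(peer);
                     PrintWriter out = writer(peer)) {
                    out.println(in.readLine());
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.start();

            try (Socket client = new Socket(loopback, server.getLocalPort());
                 PrintWriter out = writer(client);
                 BufferedReader in = reader(client)) {
                out.println(line);
                String reply = in.readLine();
                serverThread.join();
                return reply;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the echo server", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(echoOnce("hello"));
    }
}
```

Even in this small example the programmer has to know the address and port of the peer, open and close the connection and define the wire format (here, one text line). These are the concerns that the distributed object paradigm listed above is intended to hide.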

The attractive features discussed in section 4.2.2 and the role of Java in implementing distributed applications are the main reasons for choosing Java as a sample distributed programming language, in order to find out how suitable it is for developing distributed systems with respect to various criteria related to such systems. Despite its portability, object orientation, security, automatic garbage collection, simplicity, ease of use and other good properties, Java might not be suitable for every project. In some cases C++ or C# may be better than Java for developing distributed applications. For that reason a comparative study of these three languages is crucial to discover which one is best suited to a given distributed systems project.


4.3 C# as a Third Sample of Distributed Programming Languages

4.3.1 An Introduction to C#

C# is a simple, modern, type-safe, general-purpose object-oriented programming language designed by Microsoft Corporation in December 1998 for implementing a wide range of enterprise applications that run on the .NET Framework. It is standardised by ECMA International as the ECMA-334 standard and by ISO/IEC as the ISO/IEC 23270 standard. Microsoft’s C# compiler for the .NET Framework is a conforming implementation of both standards. The syntax of C# is quite similar to that of C++ and has its roots in the C family of languages, so C, C++ and Java programmers can quickly become familiar with this modern language. Programmer productivity is the main goal of the C# language. C# inherently takes advantage of the .NET Framework features and hides much of the framework’s detail while still permitting access to system-level functions. It also allows programmers to develop applications with the power of C++ and the ease of Visual Basic. Microsoft built C# from the ground up with Object Oriented Programming (OOP) and the .NET Framework in mind. Because it is a strongly typed language in which everything is an object, it sometimes extends object-oriented programming even beyond the concepts of C++ (Hejlsberg et al. 2010, Aldrawiesh et al. 2009).

Although C# is an object-oriented language, it also includes support for component-oriented programming. Modern software design relies increasingly on software components in the form of self-contained and self-describing packages of functionality. These components present a programming model with properties, methods and events. They also have attributes that provide declarative information about the component, and they incorporate their own documentation. C# provides language constructs that directly support these concepts, making it a natural language in which to create and use software components.
The basic features of C# are derived from the C, C++ and Java languages and help in the construction of robust and durable applications. These fundamental features of the C# language are discussed in section 4.3.2 below (Hejlsberg et al. 2010, Microsoft 2006).


4.3.2 Features of the C# Programming Language

4.3.2.1 Object-orientation

C# defines every type in the language as an object in order to unify the type system. Thus, developers can treat a struct, an array or a class as an object. The C# language is a rich implementation of the object-orientation paradigm, which comprises encapsulation, inheritance and polymorphism. Encapsulation is the boundary created around an object to separate its external behavior from its internal implementation details. The distinctive features of C# from an object-oriented perspective are:

• Unified type system: In C#, all types, including primitive types such as double and int, inherit from a single root object type. Consequently, all types share a set of common operations, and values of any type can be stored, transported and operated upon in a consistent way. In addition, C# supports both user-defined reference types and value types, which allows dynamic allocation of objects.

• Interfaces and classes: In the pure object-oriented paradigm the only kind of type is the class. C#, however, offers several other kinds of types, such as interfaces, which are similar to interfaces in Java. The difference between an interface and a class is that an interface is only a definition of a type, not an implementation. This is very useful when multiple inheritance is needed, because C#, unlike C++, does not support multiple inheritance of classes, but a class can implement several interfaces.

• Events, methods and properties: In the pure object-oriented paradigm all functions are methods. In C#, methods are only one kind of function member; properties and events are function members as well. Properties are function members that encapsulate a piece of an object’s state, while events are function members that simplify acting on changes of object state (Albahari & Albahari 2007, Microsoft 2006).
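Because the text likens C# interfaces to those of Java, the role of interfaces as a substitute for multiple inheritance of classes can be sketched in Java. This is an illustrative sketch only: `Named`, `Pingable`, `Node` and `InterfaceDemo` are hypothetical names, and the C# syntax for the same idea differs in detail.

```java
// Hypothetical demo: one class takes part in two type definitions.
class InterfaceDemo {
    public static void main(String[] args) {
        Node node = new Node("server-1");
        // The same object can be used through either interface type.
        Named asNamed = node;
        Pingable asPingable = node;
        System.out.println(asNamed.name() + ": " + asPingable.ping());
    }
}

// An interface only defines a type; it carries no implementation here.
interface Named {
    String name();
}

interface Pingable {
    String ping();
}

// A class may extend at most one class, but it may implement several
// interfaces, which covers many uses of multiple inheritance.
class Node implements Named, Pingable {
    private final String name;

    Node(String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public String ping() {
        return "pong from " + name;
    }
}
```

Code that depends only on `Pingable` does not need to know the concrete class, which is the separation of interface and implementation that the bullet above describes.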


4.3.2.2 Type Safety

C# is primarily a type-safe programming language, which means that types can interact with each other only via the protocols they define, so the internal consistency of each type is ensured. C# supports static typing: the language enforces type safety at compile time, before the program is run. It also enforces type safety dynamically at run time. The main advantage of static typing is that it eliminates a large class of errors before the program runs, by shifting the burden away from run-time unit tests onto the compiler, which verifies that all the types in a program fit together correctly. This makes large programs more robust, more predictable and much easier to manage. C# is also called a strongly typed language because its strict type rules make it impossible to read from uninitialized variables, to index arrays beyond their bounds or to perform unchecked type casts (Albahari & Albahari 2007).

4.3.2.3 Memory Management

C# relies on the run-time to perform automatic memory management. The Common Language Runtime (CLR) includes a garbage collector that executes as part of the program and reclaims the memory of objects that are no longer referenced. This frees programmers from explicitly de-allocating the memory of an object and eliminates the problem of corrupt pointers encountered in languages such as C++. C# does not eliminate pointers; it merely makes them unnecessary for almost all programming tasks. Pointers may be used for performance-critical hotspots and for interoperability, but only in blocks that are explicitly marked unsafe (Albahari & Albahari 2007).

4.3.2.4 Platform Support

C# is a compiled language that is typically used to write code that runs on Windows platforms. Although Microsoft has standardised both the C# language and the Common Language Runtime (CLR) through ECMA, with approval by ISO, the total amount of resources dedicated to supporting C# on non-Windows platforms such as Linux and Mac is relatively small. Accordingly, when multi-platform support is required for an application, Java is a sensible choice. However, C# can also be used to write cross-platform code in the following scenarios:


- C# code may run on a run-time environment other than Microsoft’s Common Language Runtime (CLR). The most common example is the Mono project, which provides its own compiler and runtime for Linux and Mac OS.
- C# code may run on the server and send DHTML that can be displayed on any platform. This is the case for ASP.NET.
- C# code may run on a host that supports Microsoft Silverlight, which is available for Windows and Mac OS (Albahari & Albahari 2007).

Several other features of the C# language also help in the construction of durable and robust applications. Garbage collection automatically reclaims memory occupied by unused objects. Exception handling provides a structured and extensible approach to error detection and recovery. In addition, versioning is built into the design of C# to help ensure that C# programs and libraries evolve over time in a compatible way. Together with versioning, the following characteristics are the main reasons why C# has become a professional language that is widely used by programmers today: it is a general-purpose, object- and component-oriented language; it is easy to learn and use; it is a structured language that produces efficient programs; it is part of the .NET Framework, whose managed code improves reliability; and it can be compiled on a variety of platforms. These are also the reasons for choosing C# for comparison with C++ and Java against several criteria of distributed applications.

4.3.3 The Significance of C# in Distributed Systems

With the assistance of the .NET Framework runtime environment and its rich class library, C# simplifies the development and deployment of distributed systems and modern component-based distributed applications. The .NET Remoting API, which is the C# equivalent of Java RMI, enables objects on the client and the server side to communicate with each other. The low-level socket protocol, which would otherwise have to be managed by the programmer, is thereby abstracted away, so the programmer can work at a much higher and simpler level of abstraction. The .NET Framework also offers a wide range of features such as user interface prototyping, multi-threading, database connectivity, service-oriented application development and Web applications. Moreover, in C# the Common Language Runtime (CLR) can provide full


support for remote object calls, so using distributed objects in C# may not require interfaces or stubs as in Java. In addition to its integrated support for building distributed applications, C# supports all the common abstractions and concepts that exist in C++ and Java (Drayton et al. 2003, Zurich 2004, Rabah et al. 2010).

4.4 Summary

To summarize, the roles of C++, Java and C# in developing distributed systems are broadly similar, but each language has its own strengths and weaknesses, and it can be difficult to decide which one to use for a software project. The decision should be based on the work in hand: the best language for a particular distributed application is the one whose features are most relevant to that application. Features such as portability, efficiency, simplicity and ease of use, reliability and object orientation can be used to compare the three languages against various criteria of distributed systems and to find out how powerful the selected language is for developing a particular project. The role of each language in developing distributed applications is also a crucial factor in distinguishing between them. The next chapter examines the power of C++, Java and C# for developing distributed systems by comparing them on language features and on criteria that are shared among those languages and distributed systems.


Chapter – 5

Comparison and Findings


5. Comparison & Findings

This chapter identifies economic and technical criteria, drawn from the previous chapters, for a comparative study of the three programming languages discussed in chapter 4 in the context of distributed systems. It also illustrates how these criteria can be used to evaluate the candidate languages (C++, Java and C#) for a particular distributed project and clarifies which language is more suitable for building distributed systems.

5.1 Related Work

The comparison of programming languages has become a common topic among software engineers and programmers. Numerous programming languages have been specified, designed and implemented in order to adapt to changes in programming paradigms, new technologies and hardware evolution, and extensive research has been done in this area. The most popular recent work is “Selecting the best object-oriented programming language for distributed computing systems” by Aldrawiesh and Ajlan (2009). In this paper the authors studied and compared C++, Java and C# against six criteria in an open distributed systems environment. Based on their evaluation, they found Java to be more suitable for developing distributed systems than the other languages. In the paper titled “A comparative study between computer programming languages for developing distributed systems in Web environment”, Aldrawiesh et al. presented a view of the capabilities of the ANSI C++, C++, C# and Java programming languages for developing distributed systems and Web services. As comparison criteria the authors used distributed systems, reliability, simplicity and usage, platform, high integrity, maintainability and concurrency. In the 2008 article “Programming Languages: A Comparative Study”, P. Kulkarni et al. presented four programming languages (Perl, C++, Java and Lisp) and compared them with respect to portability, reusability, readability, reliability, familiarity, availability of compilers and tools, and efficiency. They discussed each language and the related criteria and presented an evaluation based on run-time efficiency, programming effort, memory consumption, program length and reliability as primary criteria.


In the paper “Comparative Studies of 10 Programming Languages Within 10 Diverse Criteria”, J. Li et al. discussed ten programming languages, including Java, C++, C#, PHP, JavaScript, AspectJ, Haskell, Scheme and Scala. They summarised and compared these languages against ten criteria such as Web application development, object-oriented abstraction and reflection. At the end, they provided evidence and analysis of why some languages are more suitable than others. Although several studies have been carried out in this area, only the first two papers discussed above compared programming languages on criteria related to distributed systems. Even so, their comparison missed many criteria that need to be considered during language comparison, and their discussion, evaluation and outcomes were not presented thoroughly. Based on the gaps and weak points found in previous work, this study attempts to fill those gaps: it increases the number of criteria to ten, gives more detail about the chosen languages and criteria, and explains the comparison for each criterion using charts, with a proper evaluation and more accurate outcomes.

5.2 Language Comparison Criteria

With so many existing criteria for language comparison, it is difficult to decide which ones to choose. However, selecting the right criteria and programming language is crucial to the success of a project. Because this dissertation focuses on how best to select a programming language for a project based on distributed systems, only those criteria found to be the most important, and shared among the three candidate languages in relation to distributed systems, are chosen. The following criteria were derived from this research project and should be considered during the language selection process for developing software projects or distributed applications:


• Concurrency
• Reliability
• Scalability
• Security
• Portability/Platform
• Simplicity and Usage
• Efficiency
• High integrity
• Reusability
• Maintainability

5.2.1 Concurrency

Concurrency is fundamental to the overall performance of distributed systems and has always been considered essential for managing resources that are shared by a number of tasks or processes. Without concurrency, each request in a distributed system would have to be processed sequentially and performance would suffer. For that reason it is crucial to choose a programming language that directly supports concurrency and can carry out multiple operations at the same time. Concurrency is also important for program efficiency and behavior. It is a main source of complication in program design, analysis and verification, and it is particularly valuable in program modeling. Thus, support for concurrency is a principal requirement of modern high-integrity programming languages used for developing distributed systems (Boger 2001, Findlay & Watt 2004, Aldrawiesh 2009).

5.2.2 Reliability

Reliability is always important in distributed systems and becomes essential in safety-critical systems. A reliable distributed application should be designed to be as fault tolerant as possible so that errors occurring in the system can be handled. Selecting a good programming language plays a major role in making distributed systems reliable and helps to detect and eliminate errors as quickly as possible. It also increases the automatic discovery of programming errors and helps to avoid them or repair them when they occur. Failures of distributed systems can range from easily repairable programming errors to catastrophic breakdowns. Therefore, the chosen programming language should be able to


support the evolution of reliable programs. Some redundancy can exist in high-level programming languages such as C++, Java and C#, but there should be no duplicated specifications in programs (Findlay & Watt 2004, Aldrawiesh 2009).

5.2.3 Scalability

One important goal of an open distributed system is to be scalable: as the scale of the entire system increases, users, resources and additional hardware or software can be added without changing the system or application software. Scalability also allows the system to be managed properly and easily even when it is spread across several different organisations, and keeps communication between new and existing users and resources flexible. Accordingly, scalability can be regarded as a significant factor in the construction of distributed systems that use multi-threading in client-server applications. The chosen programming language should therefore strongly support multi-threading and allow programs to be constructed from compilation units that are coded and tested separately by different programmers (Deshpande & Kamalapur 2009, Findlay & Watt 2004, Vanier 2011).

5.2.4 Security

Security is one of the most troublesome and demanding challenges associated with distributed systems. However, it is essential for providing confidentiality, integrity and availability. A distributed system should make it easy for its components to collaborate and share resources. It should also enable users, programmers and resources on different computers to communicate with each other under the necessary security arrangements. Consequently, a well-suited security mechanism is required to prevent hackers or unauthorized users from accessing the system and to deal with any security issues that occur. A secure programming language that provides safety and supports a good security mechanism is therefore essential for developing distributed systems (Tari & Bukhres 2001, Thampi 2009, Belapurkar et al. 2009).


5.2.5 Portability/Platform

Distributed systems should be designed so that the systems they comprise can use each other’s resources. That means they should be written in a portable programming language whose programs can run on different platforms or operating systems such as Windows, Linux and Mac OS. Portability is a significant advantage in open distributed systems and modern programming languages: it allows a program to run on several dissimilar platforms even though it may have been written for a specific one. Thus, portability allows distributed applications to be moved from one software or hardware platform to another without major changes (Tari & Bukhres 2001, Aldrawiesh 2009).

5.2.6 Simplicity and Usage

Complexity is one of the main issues of distributed systems. The main reason is the difficulty of developing the software of a distributed system, which has to deal with all the errors that could occur in any of the components or computers on which the system is built. It is therefore important to select a programming language that keeps code complexity to a minimum. In addition, the language should have a uniform semantic structure that reduces the number of underlying concepts, and it should be easy to learn, use and teach (Tari & Bukhres 2001, Aldrawiesh et al. 2009).

5.2.7 Efficiency

Distributed systems should have a simple and efficient implementation to reduce the cost of unused components. Efficiency can be improved by using protection domains and by grouping objects hierarchically according to the operations they provide. The selected programming language should therefore be designed with restrictions that allow it to be implemented very efficiently. It should also avoid run-time overheads and garbage collection, both of which are costly and slow the program down at unpredictable times (Chen 2003, Kulkarni et al. 2008, Tanenbaum & Steen 2007).


5.2.8 High integrity

High integrity in distributed systems plays a significant role in transportation, protection systems, communications and power management. A high-integrity distributed system supports data integrity and can provide improved functionality and design management, lower production costs and greater flexibility. It also offers multiple choices in developing object-oriented programming languages. Thus the selected programming language should support high integrity while data is sent and received between system components, in order to ensure data integration and increase performance (Aldrawiesh et al. 2009).

5.2.9 Reusability

Reusability plays a major role in developing distributed systems quickly, by reusing program units that have already been tried and tested. This helps programmers to accelerate the project and to develop new features or program units that may themselves be reused in the future. Therefore, it is important to choose a programming language that allows the reuse of program units such as classes, generic units, abstract types and packages (Kulkarni et al. 2008).

5.2.10 Maintainability

Maintainability in distributed systems refers to how easily and simply a failed distributed system can be repaired. A system with high maintainability can still provide its resources or services even when it contains disruptions or faults. This criterion is crucial during program compilation, tracing and debugging. Programs written in the selected language should therefore be easy to maintain, consistent and clear, as well as easy for programmers to understand. It is also important that the chosen programming language encourages programmers to write documentation while programming distributed applications (Aldrawiesh et al. 2009, Cao 2014).


5.3 Comparing Candidate Languages (C++, Java and C#) Against Selected Criteria
In this part, C++, Java and C# are compared against the language comparison criteria, namely Concurrency, Reliability, Scalability, Security, Portability/Platform, Simplicity and Usage, Efficiency, High integrity, Reusability and Maintainability, in order to explore the strengths and weaknesses of the nominated languages in an open distributed systems environment. The findings and the evaluation of the candidate languages against each criterion are also presented. A number from zero to ten (0 to 10), with a graphical representation, is used to show the rate of each criterion and to indicate whether a language can be recommended for a project with regard to its strengths, weaknesses, requirements and features. Each rate is given out of 10, where 0 represents the lowest level of satisfaction and 10 the highest. For simplicity the rating is as follows:

0: Lowest level of satisfaction
1-2-3: Not satisfactory
4-5-6: Low level of satisfaction
7-8-9: High level of satisfaction
10: Highest level of satisfaction

The numbers given are subjective and based on the analysis and review of previous research in this area. No precise testing or simulation has been carried out to prove that the number given to each criterion is accurate and shows the exact ability of a language to support that criterion. However, the numbers are classified into three levels (not satisfactory, low level of satisfaction and high level of satisfaction) in order to give the most accurate results possible.
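The rating bands above can be expressed as a small lookup. The following is only a sketch of the scale as stated in this section; the class name `RatingScale` and the method `band` are hypothetical, and the sample ratings in `main` are arbitrary inputs rather than results.

```java
final class RatingScale {
    /** Maps a rating from 0 to 10 to the satisfaction band defined in Section 5.3. */
    static String band(int rating) {
        if (rating < 0 || rating > 10) {
            throw new IllegalArgumentException("rating must be between 0 and 10: " + rating);
        }
        if (rating == 0) return "Lowest level of satisfaction";
        if (rating <= 3) return "Not satisfactory";
        if (rating <= 6) return "Low level of satisfaction";
        if (rating <= 9) return "High level of satisfaction";
        return "Highest level of satisfaction";
    }

    public static void main(String[] args) {
        // Arbitrary sample ratings, one from each band.
        for (int rating : new int[] {0, 3, 5, 8, 10}) {
            System.out.println(rating + ": " + band(rating));
        }
    }
}
```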


5.3.1 Concurrency
Many distributed systems struggle with problems of concurrency. Concurrency means performing multiple independent activities in parallel at the same time. In programming languages, concurrency is the support for multithreaded programs. A multithreaded language should support concurrent processing and allow programmers to write portable multithreaded code with parallel processing or multiple threads of control. This makes it possible to write multithreaded programs without depending on platform-specific extensions. C++, Java and C# each support concurrency at a different level:

C++: Because C++ was not designed from the ground up to support multithreading, C++ programs do not support concurrency directly. However, C++ does have multithreading support in the form of operating system calls or an add-on Application Programming Interface (API). Therefore, the concurrency support in C++ can be rated as (3), which is not satisfactory (Williams 2012).
Rating: 3

Java: In contrast to C++, Java was designed from the start to support concurrent programming, which means that concurrent execution of multiple threads is fully supported. In Java, concurrency is supported directly in the form of lightweight processes, independently of the operating system and the hardware on which a Java Virtual Machine (JVM) is running. The JVM itself is a good example of a multithreaded program. JVM threads perform several tasks that are necessary for the successful execution of Java programs. The garbage collector thread is one of the JVM threads and is used to automatically reclaim the memory taken up by objects that are no longer used in the program. This support for concurrency is very useful and makes Java an attractive choice for almost any type of programming. Thus the concurrency support in Java can be rated as (8), which is a high level of satisfaction (Boger 2001, Goetz et al. 2011, Morelli & Walde 2012).
Rating: 8

C#: C# supports parallel code execution through multithreading. C# programs run on the .NET Framework, which contains a rich collection of class libraries and a multi-language execution engine that provides a multithreaded execution environment with synchronisation based on locks associated with objects. In addition,


numerous traditional concurrency control primitives can be implemented with the C# .NET libraries. The .NET Framework also offers higher-level infrastructure for building distributed systems and services, such as remote method calls (RMC). Consequently, it is possible to say that C# is very powerful in supporting concurrency within the .NET Framework environment, and it can be rated as (7), which is a high level of satisfaction (Benton et al. 2002).
Rating: 7

Graphical Representation for Concurrency Criteria:

[Bar chart: Concurrency. Vertical axis: Comparison Criteria Range (0 to 10); horizontal axis: Candidate Languages (C++, Java, C#).]
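The direct thread support attributed to Java in Section 5.3.1 can be illustrated with a minimal sketch that uses only the standard `Thread` class and the `synchronized` keyword. The class `CounterDemo` and the method `runWorkers` are hypothetical names chosen for this illustration; the thread count and iteration count are arbitrary.

```java
final class CounterDemo {
    private int count = 0;

    // The lock associated with this object guards every access to count.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    /** Starts threadCount threads that each increment a shared counter, then waits for all of them. */
    static int runWorkers(int threadCount, int incrementsPerThread) {
        CounterDemo counter = new CounterDemo();
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until this worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 10_000)); // prints 40000
    }
}
```

No platform-specific extension is involved: thread creation, locking and joining are all part of the language and its standard library. Without `synchronized`, the increments of the four threads could interleave and the final count would not be guaranteed.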

Figure 5.1: The Rate of Concurrency in C++, Java and C#

Figure 5.1 illustrates the comparison of the candidate languages against the concurrency criterion. It presents the rate of concurrency in each language based on its support for multithreaded programs. As shown in the figure above, the rate of C++ is 3, which is at the not satisfactory level because the language does not support concurrent programming directly. The rates of Java and C# are 8 and 7 respectively, both a high level of satisfaction, because Java fully supports multiple threads of control and C# strongly supports concurrency, particularly in the .NET Framework environment. Accordingly, it is possible to say that both Java and C# are better than C++ in supporting concurrency and


they can be recommended for a project. However, Java might be better than C#, as it works on many different platforms and strongly supports concurrency, whereas C# is only powerful in the .NET Framework environment.

5.3.2 Reliability
When selecting a programming language for a project, it is important to consider how reliable it is, to make sure that the chosen language supports the required level of system safety and reliability. In a reliable system, errors can be predicted and detected easily before the system crashes. However, increasing reliability might be more costly. The level of reliability support differs between C++, Java and C#:

C++: Although C++ supports reliability through its object-oriented features, it is not reliable enough and its programs are less reliable than those of both Java and C#. The object-oriented features of C++ increase the likelihood of compile-time detection of some types of errors, but the language does little checking at run-time. C++ also supports separate specifications and system reliability by improving on the characteristics of C with features such as improved expression and encapsulation (Chen 2003, Findlay & Wat 2004).
Rating: 5

Java: Java offers several features that support system reliability. It performs full compile-time checks that automatically detect a large number of programming errors, and automatic run-time checks that ensure that other errors, such as out-of-range array indexing, are discovered and eliminated before they do serious harm that may crash the whole system. However, to make a program reliable Java needs specification information such as type specifications, the omission of which can make a program unreliable (Findlay & Wat 2004, Aldrawiesh 2009).
Rating: 9

C#: C# and Java are quite similar in supporting reliability. Like Java, C# has a number of run-time checks and debugging aids to identify errors quickly and eliminate them by means of a good exception handling plan. In addition, most of the issues related to reliability and safety in C# are addressed by the .NET Framework environment (Microsoft 2006, Aldrawiesh 2009).
Rating: 6
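The automatic run-time check for out-of-range array indexing mentioned for Java can be observed with a short sketch. The class `BoundsCheckDemo` and the method `readOrDefault` are hypothetical; the example only assumes that the program prefers a fallback value to a crash.

```java
final class BoundsCheckDemo {
    /** Returns data[index], or fallback when the run-time check rejects the index. */
    static int readOrDefault(int[] data, int index, int fallback) {
        try {
            return data[index]; // the JVM checks the index on every array access
        } catch (ArrayIndexOutOfBoundsException e) {
            // The error is detected at the faulty access and can be handled here.
            return fallback;
        }
    }

    public static void main(String[] args) {
        int[] readings = {10, 20, 30};
        System.out.println(readOrDefault(readings, 1, -1)); // prints 20
        System.out.println(readOrDefault(readings, 5, -1)); // prints -1
    }
}
```

The out-of-range access raises an exception at the point of the error instead of silently reading adjacent memory, so the program can recover before further harm is done.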


Graphical Representation for Reliability Criteria:

[Bar chart: Reliability. Vertical axis: Comparison Criteria Range (0 to 10); horizontal axis: Candidate Languages (C++, Java, C#).]

Figure 5.2: The Rate of Reliability in C++, Java and C#

Figure 5.2 demonstrates the comparison of the candidate languages against the reliability criterion. As shown in the figure above, the rate of reliability in C++ is 5, which is a low level of satisfaction due to its poor support for reliability. The rate of Java is 9, a very high level of satisfaction, because of its very good support for reliability. The rate of C# is 6, which is also satisfactory but at a lower level than Java. That means C++ is less reliable than both Java and C#. Therefore both Java and C# can be recommended for developing distributed systems, but the priority of Java is much higher than that of C# because of its great support for reliability.


5.3.3 Scalability
A scalable programming language should support the development of large-scale programs and allow programs to be constructed from separately coded and tested compilation units. By using a scalable programming language, the difficulty of managing the complexity of a program can be reduced, which helps to manage the scale of the program easily.

C++: Unfortunately, the absence of garbage collection (which dramatically increases scalability), the use of pointers to manage the reclamation of memory and the presence of some odd features that interact in particular ways make C++ a non-scalable language. However, the generic and object-oriented features that C++ adds to C significantly improve the abstraction level of its programs, and this makes the language slightly scalable (Vanier 2011).
Rating: 4

Java: Java uses garbage collection rather than pointers to reclaim memory and strongly supports object-oriented programming. This makes Java programs easy to manage and helps them scale well. It also has some interesting mechanisms for achieving platform independence, and its JVM has become more scalable through the use of J2EE patterns. However, the abstraction level of Java is comparatively weaker than that of C++. Therefore, many of the scalability features of Java are not part of the language itself but of the environments created around it (Vanier 2011).
Rating: 6

C#: Similar to Java, C# has garbage collection and supports object-oriented programming. Besides interesting mechanisms for achieving platform independence, C# also includes a good mechanism for using unsafe code, which must be encapsulated in specially marked unsafe modules. The .NET Framework allows a substantial amount of inter-language interaction, which makes C# programs more scalable. However, the abstraction level of C# is also weaker than that of C++ and not nearly as good as that of languages that support both functional and object-oriented programming. Thus C# is not very good in this respect either, but it is somewhat better than C++ because of its garbage collection and the support of its .NET Framework environment, and it might be rated the same as Java (Vanier 2011).
Rating: 6


Graphical Representation for Scalability Criteria:

[Bar chart: Scalability. Vertical axis: Comparison Criteria Range (0 to 10); horizontal axis: Candidate Languages (C++, Java, C#).]

Figure 5.3: The Rate of Scalability in C++, Java and C#

Figure 5.3 exhibits the comparison of the candidate languages against the scalability criterion. As shown in the figure above, all three languages (C++, Java and C#) have a low level of satisfaction in supporting scalability. This is because of the absence of garbage collection in C++ and the lack of support for functional programming in both Java and C#, compared with other programming languages such as Eiffel, which is a far more scalable language than C++, Java and C#. Therefore, choosing any of these three languages for developing distributed systems might raise scalability issues. However, Java and C# can be recommended equally on the basis of their environments' support for scalability.

5.3.4 Security
Application security is a major issue associated with open distributed systems and networked applications. Because a distributed system architecture comprises several different nodes and resources, it is essential to choose a secure programming language that can handle security when data is transferred between those nodes and resources over public networks.


C++: C++ was not designed from the bottom up as a secure language. The weakest point in the security evaluation of C++ is memory safety. Most C++ security flaws originate from unchecked memory manipulation and buffer overflow attacks, because C++ permits memory manipulation through memory pointers (Qahtani et al. 2010).
Rating: 3

Java: The architecture of Java for distributed applications was designed to take numerous security requirements into consideration. Java was originally written as a secure language and is built on a type-safe model that does not allow the boundaries of buffers to be exceeded. Java also includes two main types of security for distributed systems: the ability to engage in secure remote transactions and a secure local run-time environment. In addition, many errors can be caught at compile-time and the use of direct memory pointers is not allowed. The use of the JVM and the Java sandbox security model allows Java code to be implicitly trusted for execution without causing any security breach or damage, which makes Java an attractive choice for network programs (Moreno 2002).
Rating: 8

C#: C# is a safe programming language that has fewer run-time errors. C# code is managed, and the Common Language Runtime (CLR) performs several tasks such as type-safety checks, garbage collection, memory management and protection against memory overwrites. This can reduce memory leaks and related issues, keeps code from accessing memory directly, reduces crashes and eliminates pointers. Another interesting feature of C# is type safety, which promotes safe and robust programs (Microsoft 2006, Turner 2013).
Rating: 7
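The statement that Java does not allow the boundaries of a buffer to be exceeded can be illustrated with the standard `java.nio.ByteBuffer` class. This is only a sketch: the class `BufferSafetyDemo` and the method `tryWrite` are hypothetical names, and the buffer sizes are arbitrary.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

final class BufferSafetyDemo {
    /** Attempts to copy payload into target; returns false instead of overrunning the buffer. */
    static boolean tryWrite(ByteBuffer target, byte[] payload) {
        try {
            target.put(payload); // rejected as a whole if payload does not fit
            return true;
        } catch (BufferOverflowException e) {
            // Nothing was written; memory outside the buffer is never touched.
            return false;
        }
    }

    public static void main(String[] args) {
        ByteBuffer small = ByteBuffer.allocate(4);
        System.out.println(tryWrite(small, new byte[8])); // prints false
        System.out.println(small.position());             // prints 0
        System.out.println(tryWrite(small, new byte[3])); // prints true
    }
}
```

An oversized write that could corrupt adjacent memory in a language with unchecked pointers is turned here into an exception that the program can handle.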


Graphical Representation for Security Criteria:

[Bar chart: Security. Vertical axis: Comparison Criteria Range (0 to 10); horizontal axis: Candidate Languages (C++, Java, C#).]

Figure 5.4: The Rate of Security in C++, Java and C#

Figure 5.4 shows the comparison of the candidate languages against the security criterion. In networked programming, security is clearly a significant problem for programmers and software developers who develop distributed systems. Therefore, security is an important criterion that should be considered when choosing a programming language for a project. As can be seen in the chart above, the security of C++ is rated as 3, which means not satisfactory, due to security issues such as code that is vulnerable to buffer overflow, no type-safety checking at compile-time and unchecked memory access. The security of Java is rated as 8 and that of C# as 7, because the two languages have similar security mechanisms and cryptographic functionality. Both support a wide variety of authentication services, such as user-defined services, provide modules for authentication and permit role-based security through the provided model. The differences between them lie in the expectations of the two languages with regard to their framework targets (Simpleprogrammer 2014, Turner 2013).


5.3.5 Portability/Platform
Portability is a significant advantage in open distributed systems and modern programming languages. A portable language can run on various operating systems or platforms. Accordingly, portable code can be written once in a portable language and then run on many dissimilar operating systems.

C++: C++ is designed to be a portable language in the sense that a program can execute in the same environment in which it was written, such as Windows, UNIX, etc., and as a structured language it allows for encapsulation and the separation of data and code. The widespread availability of C++ tools such as CORBA also makes the language more suitable for many platforms. However, it has several features and unspecified behaviours that make programs non-portable (Reinhardt 2004).
Rating: 6

Java: Java has many good features and characteristics that make it exceptionally portable. For instance, the same web application can run on several different machines and platforms. Java programs are extremely portable, which makes it possible to move object code from one platform to another without major changes. The Java compiler produces platform-independent byte code, which the JVM transforms into machine code at run-time (Einarsson 2005, Qahtani et al. 2010).
Rating: 9

C#: Although C# is a modern programming language and works very well on the MS-Windows platform in the .NET Framework environment, it is only available on Windows and so far can only be implemented in the .NET environment. That makes C# a non-portable language. However, efforts are currently under way to make C# portable and to port it to other platforms and operating systems such as UNIX, MacOS and Linux; this porting is already being done as part of the Mono Project (Aldrawiesh 2009, Microsoft 2006).
Rating: 3


Graphical Representation for Portability/Platform Criteria:

[Bar chart: Portability/Platform. Vertical axis: Comparison Criteria Range (0 to 10); horizontal axis: Candidate Languages (C++, Java, C#).]

Figure 5.5: The Rate of Portability/Platform in C++, Java and C#

Figure 5.5 displays the comparison of the candidate languages against the Portability/Platform criterion. Portability can be regarded as one of the fundamental requirements for distributed applications, since it allows an application to be transferred from one platform or server to another. As presented in the chart, Java has the highest rate, 9, due to its ability to run on various platforms and its excellent support for portability. Because C++ can execute in the same environment in which a program was written and encourages the encapsulation of dependencies, a feature that facilitates portability, it is rated as 6, which is still a low level of satisfaction. However, because C# is specifically designed to work in Microsoft's .NET Framework environment, runs only on the Windows platform and is implemented only in the .NET environment, it is rated as 3, which is in the not satisfactory range. Hence, based on the evaluation, it is possible to say that both C++ and Java are better than C# in terms of portability, and that Java has a higher rate even than C++.


5.3.6 Simplicity and Usage
Language simplicity should be the main goal of language design. A programming language should be simple in itself and involve minimal code complexity, in order to help programmers solve problems easily and express solutions naturally. If the language is complicated, programs become brittle and distracting. Complexity also creates difficulties for programmers who develop large-scale applications.

C++: C++ is a complicated object-oriented programming language. The syntax of C++ programs is fairly complex compared with that of Java programs, which makes the language quite difficult to learn and use, especially for programmers who use it for the first time. However, it is faster than Java and is still one of the most popular languages among programmers (Kulkarni et al. 2008, Aldrawiesh & Ajlan 2009).
Rating: 5

Java: Java is a simple object-oriented programming language. Compared with C++ and C#, the syntax of Java programs is much simpler and easier for programmers to learn and use. A programmer can access objects only through object references, and there are no pointers for accessing memory directly. In addition, a Java class can inherit from only one class and not from a second, which avoids multiple inheritance. Thus, Java is far simpler than other programming languages (Reilly & Reilly 2002, Aldrawiesh & Ajlan 2009).
Rating: 7

C#: In terms of simplicity, it is fair to say that C# is quite similar to C++ and different from Java. This is because C# programs are very complicated and difficult for novice programmers to learn and use. The language uses several constructs, extra keywords and control structures that make it more complex, and there will always be a price for complexity in its programs (Aldrawiesh & Ajlan 2009).
Rating: 5
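The two simplifications described for Java, access through object references and inheritance from a single class, can be seen in a minimal sketch. All names (`InheritanceDemo`, `Node`, `ServerNode`, `Pingable`, `Describable`) are hypothetical. The use of interfaces to combine several roles is not discussed in this section and is added here only as an assumption about how such a design is usually completed in Java.

```java
final class InheritanceDemo {
    // Interfaces declare behaviour without state; a class may implement several of them.
    interface Pingable {
        String ping();
    }

    interface Describable {
        String describe();
    }

    static class Node {
        protected final String name;

        Node(String name) {
            this.name = name;
        }
    }

    // Exactly one superclass (Node); listing a second class after 'extends' would not compile.
    static class ServerNode extends Node implements Pingable, Describable {
        ServerNode(String name) {
            super(name);
        }

        @Override
        public String ping() {
            return "pong from " + name;
        }

        @Override
        public String describe() {
            return "server node " + name;
        }
    }

    public static void main(String[] args) {
        // 'node' is an object reference; there is no pointer arithmetic or manual deallocation.
        ServerNode node = new ServerNode("alpha");
        System.out.println(node.ping());     // prints pong from alpha
        System.out.println(node.describe()); // prints server node alpha
    }
}
```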


Graphical Representation for Simplicity Criteria:

[Bar chart: Simplicity and Usage. Vertical axis: Comparison Criteria Range (0 to 10); horizontal axis: Candidate Languages (C++, Java, C#).]

Figure 5.6: The Rate of Simplicity and Usage in C++, Java and C#

Figure 5.6 presents the comparison of the candidate languages against the simplicity and usage criterion. The simplicity of an object-oriented programming language can make software development much easier. Indeed, the complexity of a programming language may slow this process down or cause it to fail. As shown in the chart above, C++ and C# have the same rate, 5, due to their complexity and the complicated syntax of their programs. C++ is renowned for its complexity, because it has some irritating features that make it more complex, such as multiple inheritance, direct memory access via pointers and the need to explicitly allocate and deallocate memory for the storage of data structures and objects. C#, on the other hand, is complex in itself because of its extra keywords, control structures and constructs. However, compared with both C++ and C#, Java is a far simpler language to learn and use, because it does not use pointers to access memory. Instead, it uses object references to access other objects. Multiple inheritance is also not allowed in Java, so a class can inherit from only one class and not from a second. All these good features make Java


programs simpler, which is extremely important in networked applications and all types of distributed applications.

5.3.7 Efficiency
Efficiency plays a major role in the choice of language design pattern implementation. The efficiency of a programming language is sometimes essential in order to reduce or avoid execution costs, to maximise the number of safe optimisations available to interpreters and to ensure that constant and unused portions of programs do not add to execution costs. The design of a programming language should be kept simple, in order to aid the production of efficient programs, by choosing features that have a simple and efficient implementation.

C++: A better understanding among programmers of how to use the language, better compilers and the use of compile-time options that increase efficiency improved the efficiency of C++ a decade ago. However, pointer aliasing and pointer arithmetic in C++ prohibit some code optimisations, and the language has implicit conversion operations that may be activated in situations that are not easy for its users to recognise (Chen 2003, Einarsson 2005).
Rating: 7

Java: The garbage collection of Java raises questions about guaranteed timing and efficiency, particularly in real-time and distributed systems. This is because Java's garbage collection can set aside more memory for efficiency reasons even when the lifetime of the allocated data is very short. In addition, because Java relies on the JVM, it is frequently considered memory intensive and very slow (Prechelt 1999).
Rating: 4

C#: Similar to Java, garbage collection and some memory management issues raise questions about the efficiency and guaranteed timing of C#. However, features such as generics, delegates and other good properties of C# can be used to produce a readable implementation with an acceptable efficiency overhead (Bishop & Horspool 2008).
Rating: 6


Graphical Representation for Efficiency Criteria:
