Implementation of the real-time functional language ... - Springer Link

Implementation of the real-time functional language Erlang on a massively parallel platform, with applications to telecommunications services

Beshar Zuhdy, Peter Fritzson, Kent Engström

PELAB, Dept. of Computer and Information Science, Linköping University, S-581 83 Linköping, Sweden
Email: [email protected]; Fax: +46-13-282666

Abstract. Many real-time systems need large amounts of computational power. This may soon provide a larger market for parallel computers than the scientific computing area where most of them are used today. Examples of new and interesting areas are telephone switching systems, image recognition, real-time databases, multi-media services and traffic guidance systems. Programming parallel computers for these new applications is often complex and error-prone. To alleviate this problem, Ericsson has developed a new non-lazy functional programming language called Erlang. This new language, which has already been used in several large projects, was designed to provide a good environment for building large fault-tolerant real-time applications with explicit concurrency. Existing Erlang implementations run on SISD computers. Together with Ericsson, we have developed a MIMD version of Erlang, initially for the Parsytec GC/PowerPlus. This is one of the first implementations of a functional language used in industry on a MIMD computer. To benchmark the parallel Erlang version, we are using a telecommunications application developed by Ericsson.

1 Introduction

In the near future, parallel computers will be used for scalable multiprocessing real-time applications, where there is an increasing need for computational power. Examples are computerized real-time driving assistant systems in cars, multi-media services, real-time databases, picture processing and image recognition, and telephone switching systems. The potential market for such applications will be much larger than the rather narrow area of scientific computing which currently dominates the parallel processing business. The implementation of Erlang for massively parallel MIMD platforms is ultimately aimed at this new multiprocessing real-time application area.

Erlang is a new functional language designed for efficient programming of concurrent, real-time, distributed fault-tolerant systems. The language assumes no shared memory, and is thus suitable for MIMD implementation. However, most current Erlang implementations run on single-processor workstations where all light-weight processes are time-shared. There is also a recent distributed version that runs on networks of workstations, but no implementation is available for parallel MIMD multiprocessors. All interaction between Erlang processes is by asynchronous message passing. Distributed systems can easily be built in Erlang. Applications written for a single processor can be ported to run on networks of processors. Erlang was developed at the Ericsson and Ellemtel Computer Science Laboratories, and is now marketed by Erlang Systems AB. It has already been used to implement non-trivial telephone switching applications by Ericsson. The language is described in [1].

The objective of this work is to facilitate software development for parallel MIMD distributed-memory platforms by implementing Erlang efficiently for a massively parallel distributed-memory architecture, and to demonstrate its usability for realistic soft real-time applications.

2 Existing Erlang implementations

Three Erlang implementations currently exist: a byte-code interpretive version, a threaded code interpretive version which is faster, and a compiled version which generates C code with some special tricks. The compiled version produces the fastest executing code, which however takes much more space than the interpretive code. All three versions use roughly the same run-time system for message passing, I/O, error handling, etc.

[Figure: an Erlang master node connected to the Sparc 20 frontend, Erlang slave nodes, and two Sparc 10 file storage nodes.]

Fig. 1. A section of the Parsytec GC/PowerPlus in Linköping, which has a total of 64 computational nodes, 128 processors, and 6 parallel file system nodes. In the Erlang implementation, the user communicates directly with the master node, but all Erlang nodes can communicate with each other independently.

2.1 Run-time system for the parallel implementation

The major portion of the work in implementing Erlang on the parallel platform concerns the run-time system. The message passing interface, input/output, handling of lightweight processes, etc. are all part of the run-time system. Communication between processor nodes on the parallel machine and the frontend workstation also needs to be handled. The communication module, which previously used TCP/IP for distributed Erlang on workstations, has been replaced by a communication module suitable for internal communication within the MIMD machine. In addition, parts of the Erlang run-time library have been restructured for use on the PARIX message-based operating system, which lacks UNIX-like features such as signals and fork/join-style process semantics.

2.2 Interprocess communication and file I/O

In the Unix Erlang implementation, an important component called the Erlang port mapper daemon (epmd) runs on each workstation. This is essentially a name server that provides file descriptors for TCP/IP and socket-based communication between Erlang processes on different workstations. This daemon has been eliminated in the parallel implementation. Instead, message passing has been implemented using efficient synchronous communication over virtual links using the non-blocking Parix primitives Send and Receive.

Node names used in our implementation are of the form "node32@Parsytec" (for node 32, or with some other number), which is compatible with the name format of the Unix Erlang implementation. The number part of the node name is extracted and used to place the message at the corresponding node.

Erlang has built-in support for authentication by magic cookies. This is used to ensure authorized interaction between distributed nodes. In our parallel implementation, all nodes are owned by the same user. Therefore we provide a default cookie (parixcookie) for all nodes at startup time. This ensures complete access rights between all nodes on the Parsytec machine and eliminates further checking.

In the Unix Erlang implementation, file I/O is handled by a separate process. This rather inefficient solution has been replaced by direct calls to the underlying Parix operating system in our parallel implementation. Parix then either dispatches the I/O to the frontend through remote procedure calls or sends it to the parallel file system nodes.
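The node-name routing described above can be illustrated with a minimal Python sketch. It is not the paper's run-time code: the regular expression, the helper names `node_index` and `route`, and the dictionary of mailboxes are assumptions made only for illustration. The source states just that the numeric part of a name such as "node32@Parsytec" selects the destination node.

```python
import re

# Assumed name pattern: "node<number>@<host>", e.g. "node32@Parsytec".
_NODE_NAME = re.compile(r"^node(\d+)@(\w+)$")

def node_index(node_name: str) -> int:
    """Return the numeric node index embedded in a node name."""
    match = _NODE_NAME.match(node_name)
    if match is None:
        raise ValueError(f"not a valid node name: {node_name!r}")
    return int(match.group(1))

def route(node_name: str, message, mailboxes: dict) -> None:
    """Place a message in the mailbox of the node selected by the name.

    The dict of lists is a stand-in for the per-node message queues;
    the real implementation sends over Parix virtual links instead.
    """
    mailboxes.setdefault(node_index(node_name), []).append(message)

mailboxes = {}
route("node32@Parsytec", "hello", mailboxes)
print(mailboxes)  # {32: ['hello']}
```

Because the destination is computed directly from the name, no name server lookup is needed, which is consistent with the removal of epmd in the parallel implementation.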

3 An application

Traditional telephone switching networks are not intelligent. They have one primary task: to connect telephone A to telephone B. During the 1960s and 1970s, customers began to demand additional services. Customers want multi-party calls, flexible billing (e.g. the called party pays for the call), mass calling, telephone voting, universal personal phone numbers, and many other services.

To be able to provide better services now and in the future, telecommunication companies have developed the Intelligent Network (IN) concept. The goal is to create a distributed system where new services can be implemented swiftly, and where the services can be tailored for specific users. Intelligent networks consist of different kinds of nodes. The idea is that switches should do only the switching, while higher-level functions should be delegated to other nodes more suited for the task. For more information about IN technology in general, see [2] and [3].

The application program selected for porting to the parallel Erlang implementation is a simulator for an intelligent network service control system (see Fig. 2), previously implemented in Erlang by Ericsson Telecommunicatie B.V. in the Netherlands. An example of a service provided by such an intelligent network service control is telephone voting, where thousands of calls must suddenly be serviced at almost the same time. This is a typical example where the scalable processing power provided by MIMD multiprocessor systems is needed. For this application, the simulator for the intelligent network service control runs together with real switches or load generator programs to test the system, e.g. for situations which create a massive influx of phone calls.

[Figure: several streams of incoming calls arriving at the Intelligent Network Service Control, which runs on a scalable parallel computer for availability of high-capacity service.]

Fig. 2. A scalable Erlang application: a simulator for an Intelligent Network Service Control.

3.1 Parallelizing the simulator

As described above, intelligent networks consist of a number of nodes. Together, they implement the advanced features desired in the telecommunications network by executing "programs" called Flexible Service Profiles. These profiles are designed and verified using graphical tools. They are typically invoked when the user requests a service. The three most important node types in the simulator are the SSF, the SCF and the SDF. The SSF, Service Switching Function, controls the operation of a telephone switch. Basic call processing is handled completely by the SSF, while more advanced services are handled by asking the SCF to take control of the call. The SCF, Service Control Function, handles services by executing the Flexible Service Profiles. The SDF, Service Data Function, stores the profiles and other data needed when executing the profiles, e.g. customer data. Fig. 3 shows a simplified view of the interaction between nodes. The SSF calls upon the SCF when a service is invoked, and the SCF fetches and updates data items by contacting the SDF. In the current simulator, each SDF may serve many SCF:s, and each SCF may have many connected SSF:s. However, there may be only one SCF known by each SSF, and only one SDF for each SCF.

[Figure: the SSF connected to the SCF, and the SCF connected to the SDF.]

Fig. 3. Simplified view of node interaction.

In order to adapt the IN simulator for execution on a parallel computer, one could parallelize the SCF internally, e.g. by having a master SCF node that distributes connections from SSF:s to a number of worker SCF nodes. A simpler solution, which will also be more reliable, is to remove the restriction that each SSF only knows about one SCF. If each SSF is allowed to connect to any one of a number of different SCF nodes, load balancing is achieved without a major rewrite of the SCF code. Different ways to distribute calls between the SCF:s will be investigated (e.g. round-robin, random, and selection based on load data piggy-backed onto other communication).

With many more SCF nodes than before, there is a risk that the communication between the SCF nodes and the SDF node becomes a bottleneck. To avoid this, caching will be used. Each SCF caches data fetched from the SDF. For the profiles and other data that are not updated often, this works well. Data modified more often by SCF:s could however be a problem. Fortunately, reads are much more frequent than writes, so a simple cache invalidation protocol can be used: new or modified data is sent to the SDF, which then tells all other SCF:s to invalidate their copies.
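The cache invalidation protocol outlined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and method names (`SDF`, `SCF`, `get`, `put`, `invalidate`, `register`) and the `sdf_reads` counter are hypothetical, and direct method calls stand in for the messages that the Erlang nodes would actually exchange.

```python
class SDF:
    """Service Data Function: the authoritative store for profiles and data."""

    def __init__(self):
        self.store = {}
        self.scfs = []  # every SCF that caches data from this SDF

    def register(self, scf):
        self.scfs.append(scf)

    def read(self, key):
        return self.store[key]

    def write(self, key, value, writer):
        self.store[key] = value
        # Tell all *other* SCFs to drop their now-stale copies.
        for scf in self.scfs:
            if scf is not writer:
                scf.invalidate(key)


class SCF:
    """Service Control Function with a local read cache (illustrative only)."""

    def __init__(self, sdf):
        self.sdf = sdf
        self.cache = {}
        self.sdf_reads = 0  # counts round trips to the SDF
        sdf.register(self)

    def get(self, key):
        if key not in self.cache:  # cache miss: fetch from the SDF
            self.cache[key] = self.sdf.read(key)
            self.sdf_reads += 1
        return self.cache[key]

    def put(self, key, value):
        self.cache[key] = value
        self.sdf.write(key, value, writer=self)

    def invalidate(self, key):
        self.cache.pop(key, None)


sdf = SDF()
a, b = SCF(sdf), SCF(sdf)
a.put("profile:vote", "v1")
print(b.get("profile:vote"), b.sdf_reads)  # v1 1
print(b.get("profile:vote"), b.sdf_reads)  # v1 1 (served from the cache)
a.put("profile:vote", "v2")                # invalidates b's copy
print(b.get("profile:vote"), b.sdf_reads)  # v2 2
```

Repeated reads cost no SDF traffic, while each write costs one message per other SCF. That trade-off is reasonable only because, as noted above, reads are much more frequent than writes.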

3.2 A Scalable Toy Application

In order to get some results before the adaptation of the industrial application is done, we have created a small example. This toy application uses task farming to sum a series of floating point numbers (from 1.0 to 8 000 000.0 with step 1.0 in this case) by dividing the interval between the worker processes. This is truly a trivial task, but the worker code could easily be replaced with code that does more useful work. As Fig. 4 shows, this application scales linearly in the beginning, but the speedup levels off when the amount of calculation becomes too small compared to the amount of communication.
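The decomposition used by the toy application can be sketched as follows. This is a Python illustration, not the Erlang program that was benchmarked: the function names are hypothetical, and a thread pool stands in for Erlang worker processes on separate nodes. It therefore shows how the interval is divided and the partial sums combined, not the speedup itself.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(first: int, last: int) -> float:
    """Worker task: sum first, first+1, ..., last as floating point numbers."""
    total = 0.0
    for i in range(first, last + 1):
        total += float(i)
    return total

def split_interval(first: int, last: int, workers: int):
    """Divide [first, last] into contiguous, nearly equal sub-intervals."""
    count = last - first + 1
    base, extra = divmod(count, workers)
    chunks, start = [], first
    for w in range(workers):
        size = base + (1 if w < extra else 0)
        if size:  # skip empty chunks when there are more workers than items
            chunks.append((start, start + size - 1))
        start += size
    return chunks

def farm_sum(first: int, last: int, workers: int) -> float:
    """Master: hand one sub-interval to each worker, then add the results."""
    chunks = split_interval(first, last, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda chunk: partial_sum(*chunk), chunks)
        return sum(partials)

print(farm_sum(1, 1000, 4))  # 500500.0
```

Each worker receives only two numbers and returns one. The communication cost per worker is therefore fixed while the computation per worker shrinks as workers are added, which matches the levelling off of the speedup described above.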

Procs   Elapsed time (ms)   Speedup
    1              92 478      1.00
    2              46 106      2.00
    4              23 108      4.00
    8              11 642      7.94
   12               7 852      11.8
   16               5 972      15.5
   24               4 149      22.3
   32               3 294      28.1

[Figure: plot of speedup (vertical axis, 0 to 35) against number of processors (horizontal axis, 0 to 35).]

Fig. 4. Speedup versus used processors for the summing example using the MIMD Erlang on the Parsytec GC/PowerPlus.
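The speedup figures reported with Fig. 4 follow from the elapsed times as T(1)/T(p). The short Python sketch below recomputes them from the measured times. The parallel efficiency (speedup divided by processor count) is not reported in the paper and is added here only as an assumption-free derived quantity to make the levelling off easier to see.

```python
# Elapsed times in milliseconds, copied from the measurements given with Fig. 4.
elapsed_ms = {1: 92478, 2: 46106, 4: 23108, 8: 11642,
              12: 7852, 16: 5972, 24: 4149, 32: 3294}

def speedup(procs: int) -> float:
    """Speedup relative to the single-processor run: T(1) / T(p)."""
    return elapsed_ms[1] / elapsed_ms[procs]

def efficiency(procs: int) -> float:
    """Parallel efficiency: speedup per processor (derived, not in the paper)."""
    return speedup(procs) / procs

for p in elapsed_ms:
    print(f"{p:>2} procs: speedup {speedup(p):5.2f}, efficiency {efficiency(p):.2f}")
```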

4 Conclusions

Parallel computer usage will soon be dominated by scalable multiprocessing real-time applications, where there is an increasing need for computational power. Since development of real-time software is especially time-consuming and expensive, it is imperative that such development is facilitated. The work described here, implementing and extending Erlang and telecommunications applications for massively parallel platforms, is an important step in this direction. It is also one of the first examples of making functional language technology usable for large-scale industrial real-time applications on parallel platforms.

References

1. Joe Armstrong, Robert Virding, Mike Williams: Concurrent Programming in ERLANG. Prentice Hall, 1993.

2. Ronnie Lee Bennett, George E. Policello II: Switching Systems in the 21st Century. IEEE Communications Magazine, March 1993.

3. James J. Garrahan, Peter A. Russo, Kenichi Kitami, Robert Kung: Intelligent Network Overview. IEEE Communications Magazine, March 1993.
