Designing a Programming Language for Active ... - Semantic Scholar

3 downloads 354 Views 148KB Size Report
Jun 29, 1998 - mobile users. ... sign decisions on the programming language. ... with a language for the safe programming of Active Networks: A communi-.
Designing a Programming Language for Active Networks Ian Wakeman

Alan Jeffrey

Rory Graves

Tim Owen

June 29, 1998

1 Why Active Networks Most networks currently have a topology where ‘smart’ hosts sit at the edges of the network, and are connected by ‘dumb’ routers. In reality, these routers H Active nodes

H

H

R

R

Host

H Passive nodes

R

H

R

Router

H

Figure 1: Passive Network Elements are not so dumb: they have the capability to provide as much computational power as the hosts, but this computing power is mostly used to push packets around. The router managers decide what software should run on the nodes, so adding new features to the network requires the agreement of all the router manufacturers and managers. An alternative model is to allow users access to the computing power in the routers, and to run their own software. The network has some dumb routers and some smart routers, and we can run a virtual network to hide the dumb routers from the user. The creation of virtual networks has allowed the introduction of new services between islands in the network, whilst not requiring the intermediate routers to be changed. Examples include: The MBone The deployment of multicast routing technology has used typical workstations to create the MBone, using encapsulated IP to tunnel through the multicast-unaware routers between MBone sites:

1

A Active nodes

A

A

A

A

Host

A

A

A

A

Figure 2: An Active Network

S Active nodes

B

R

R

B

Multicast Source

R

Relay

S

Subscriber

S

R

S

Figure 3: Mbone

2

Virtual Private Networks Users can run their own virtual private intranet by setting up their own firewall processes at the gateway to tunnel their packets between sites:

Active nodes

T

F

F

T

Multicast Node

F

Firewall

U

Untrusted Router

T

U

Figure 4: Firewall

The software providing these services must be manually installed and configured on each node, and can require large amounts of valuable time. If the router manager makes their node active, and we have an agreed infra-structure which can download and run code over packets, then they can offer more services to users without costly negotiations over each protocol. Most services do not require intelligence within the network, especially if they are based on single sender and receiver pairs. Providing high bandwidth and engineering routers so that packet transmission is of low latency will generally prove the most effective solution for most distributed systems problems. However, networks will remain heterogeneous in the bandwidth available, for three major reasons: because of physics, such as in the radio spectrum; because of the expense of replacing legacy systems in poorer organisations or countries; and because of congestion. When there is an “impedance mismatch”1 between the bandwidths of regions of a network, placing some intelligence at the border of the regions can help in improving efficiency, as in the use of proxies for mobile users. When the communication involves multiple peers, then strategic placement of intelligence within the graph formed by the peers can reduce communication costs. When information needs to be synchronised in some way across a group, the latency of the synchronisation can be much reduced if the synchronisation happens at the centre of the group rather than going to each node at the edge. In particular, we see this as an excellent way to improve the communication latency of shared virtual reality spaces. For these reasons, networks will become increasingly programmable, at least at the borders of networks suffering from impedance mismatch. If this happens in an ad hoc manner, without thought applied to the inter-dependence between network nodes and connectivity, then our networks may become more fragile, falling over whenever a badly written program is pushed into the network. This is the motivation for much of the current research into Active Net1

Thanks to Jon Crowcroft for this term

3

works (ANs). If we can produce a model that allows routers within the network to execute code on behalf of some granularity of packet2 which is properly matched to the peculiarities of packet-switched networks, then we will hopefully improve the integrity of the network. In this paper, we describe our approach to designing active networks. By combining recent work on the semantics of computation with a pragmatic description of the processes of packet communication, we have produced a programming model that both protects the connectivity of the network and encourages programmers to think about AN programming in the “right” way. By using a strongly typed programming environment which embodies policies about safety within the type system, we can statically prove programs are safe, and improve efficiency by removing many run-time checks. In the next section we describe the interdisciplinary nature of our design process, showing how we have approached the design of our programming language from the perspectives of formal computation and of systems engineering of packet communications, defining safety and security policies to shape our design decisions. We derive safety and security requirements from our model. In the succeeding sections we explain the contribution of formal semantics in instantiating our safety and security policies, and meeting the derived requirements. We give an example of how the programming model can be used to produce a multicast algorithm, and we conclude with a comparison of our work to other AN activities.

2 The Design Activity This project has been a classic example of interdisciplinary work, involving the authors learning enough of each other’s fields to be able to make sensible design decisions on the programming language. In multidisciplinary work, each of the experts in the team makes their decisions based upon requirements filtered into their own language. Interdisciplinary work requires the designers to learn to evaluate requirements within the discipline in which they are captured [1, 2]. Although much more demanding upon the designers, the design space becomes much richer since choices are available from any of the fields, and the ability to switch and communicate from any perspective reduces the likelihood that design problems will emerge later down the development line. The design of the language has used three major viewpoints to come up with a language for the safe programming of Active Networks: A communications view; a programming view; and a safety view, as in Figure 5. By discussing and exploring from the three viewpoints, we have attempted to evolve a model for active networks that enables the novel applications people are hoping ANs to provide, yet protects the common resources of the network from both malicious attacks and buggy programs. In particular, we have asked ourselves the following questions:

 

What components of modern packet switched networks should be protected from AN programs? How does the communication model affect the programming model?

2 The

exact degree of granularity is a design point in the engineering of ANs.

4



What features of modern programming languages can we use to protect the network, and encourage the correct programming model? Safety and Security

Design Process

Communications Model

Programming Model

Figure 5: Design Process

3 The Communications Model A network enables entities to connect and send information between each other. Thus the most important feature of any network is the ease with which entities can connect with each other. The Internet has exhibited one of the fastest rates of connection of any telecommunications network. What is it about the Internet and in particular the Internet Protocol that encourages this fast growth? The packet forwarding process in the Internet can be modelled as shown in Figure 6. Packets arrive from a link onto a router, information is retrieved from tables about where a packet should be routed to based on the destination address, then the packet is placed on the appropriate output link. Currently in the Internet, the tables are maintained either by distributed routing protocols such as OSPF or BGP, or statically by the router manager.

switch

dst . .

processor

link . .

routing table pkt

dst

input link

output links

Figure 6: Basic router packet forwarding engine More importantly, the various router administrations have clean and appropriate interfaces to export and import routing information. These interfaces are reflected in the management processes between the router administrations, which only have to ensure that links are working and sufficient routing information is exchanged to keep the network working. Adding a new machine is a local matter of assigning an address, which can then reach out globally. Since

5

the return address is carried within the packet, the connection can be two way immediately. As long as connectivity is maintained, TCP/IP will ensure reliable delivery of data. In order to keep a network working, any new services should not disrupt the connectivity of the network, and should not overload the router administrators by requiring that they hand install new programs. The key component in maintaining connectivity of any packet switched network is thus the routing infra-structure. After many years of struggle, the Internet has evolved “correct” routing protocols, and enough intra- and interorganisational process so that the routing generally works (see Paxson in [3] for a discussion of just how well it works). The routing protocols distribute local knowledge about which links are available to a sufficient area of the network so that connectivity is maintained. The routing is also dynamic, in that changes in the availability of links may affect the routing tables of other nodes. From the above, we have derived the following communication requirements in the programming model: Requirement 3.1 Given a destination, a program should only be able to discover the next hop on the route to the destination. Requirement 3.2 Routing is asymmetric: x is a neighbour of y does not imply y is a neighbour of x Requirement 3.3 Routing is not static: if x is a neighbour of y now, there is no guarantee x will remain a neighbour of y. Since AN programs will probably only be effective at the impedance mismatch points wihtin the network, we believe that AN nodes will not exist at every router but at strategic points in the network, such as routinely congested links across oceans, firewalls or boundaries between radio and terrestial links. It is important that the communication model should not require ubiquitous AN nodes, but should allow an arbitrary number of routers to be transparently in between AN nodes. Transparency is easily obtained by adding a generic AN bit to the packet header. A packet belonging to a flow operated upon by an AN program will set this AN bit within the header. A non-AN router can just ignore the bit and forward the packet normally. When an AN router sees the AN bit within a packet, the packet is routed to be processed by the Active Node processor. The runtime system on the Active Node will select the correct program to be run over the packet, and allow access to the shared heap across the programs, as in Figure 7. We assume that we have a homogeneous AN architecture. By separating the active node from the main forwarding path, the active node can be on a separate card or even on a separate machine to the router. This requirement forces us to use a virtual network model of communication between AN nodes, where the AN nodes are overlaid onto the actual communications network. Requirement 3.4 The next AN hop should be transparent to the actual number of non-AN routers in between the two AN nodes. As is well known, the Internet Protocol is a best effort protocol, allowing packets to be dropped duplicated or misordered. ANs should account for this in the design of the programming langauge, so we have the following requirements: 6

AN switch Programs Runtime Active Node

Heap & other memory

dst . .

processor

link . .

routing table pkt a dst

input link

output links

Figure 7: Active router packet forwarding engine

Requirement 3.5 Communication between nodes can be lost Requirement 3.6 Communication between nodes can be misordered Requirement 3.7 Communication between nodes can be replicated one or more times Nodes in the network are also subject to arbitrary reboots, or can disappear from the network and then reappear. State installed in the nodes can thus disappear and then reappear. This is known in the literature as soft state, which is re-installed as required. There are thus the following requirements on the state installed in nodes. Requirement 3.8 The state installed in a node must be re-installable or valid when reappearing after a temporary absence. To make judgements about the state of the network, so as to decide upon possible courses of action, the network must provide estimates for the bandwidth, loss and delay likely to be incurred upon any next hop. Requirement 3.9 Network conditions to any “next hop” should be available.

4 Safety and Security in Active Networks Networks are shared communication resources, used by all to commincate with each other. To ensure the integrity of the network, we must ensure that each of the components of the network continue to work. The problem is that the resources of the various component parts of the network (bandwidth, memory and processor time) are shared by all. If we allow arbitrary code to execute on behalf of some user, we must ensure that all the resources required by other users of the network are protected from abuse by misbehaving or malicious programs. 7

4.1 Bandwidth The bandwidth of the links is the key resource of a network. Any AN program injected into the network must not be able to attempt a denial of service attack by generating unbounded numbers of packets by multiplying packet generating entities within the network. Requirement 4.1 The resulting generation of packets from sending a packet into the network must be bounded. Network queues should be protected from short term congestion caused by large packet bursts being injected into the network. Requirement 4.2 The number of packets generated per unit time from a single node must be bounded.

4.2 Processing time Even though we posit that the active node will be separate from the main router, the active node processor is shared amongst all AN programs. Therefore, we must ensure that no single program is able to consume all of the processing time on the active node. Analagous to the fast path processing of a normal router, we believe there should be fast path processing for packet forwarding code and a different form of processing for control programs. In particular, when a program is using packet forwarding code, the processing required for the packet should be completed in one gobbet, without interruption. We’d like this to happen so that we can minimise the overhead of context switching and take advantage of natural dispatch points such as packet sending. For this to happen, we make the following requirement on the code: Requirement 4.3 Packet Forwarding Code must not be able to enter an infinite loop. In the control portion of a program, the computation can be arbitrary, and thus we cannot statically determine the path of computation. Instead we need to provide some means of pre-empting threads from the processor and ensuring that the scheduling of threads by the run-time is fair. Requirement 4.4 There will be a bound on the amount of processing time per unit time a thread can consume Often a control thread will be awakened periodically by a timer. Each timer uses memory, and loads the processor. In addition, if the granularity of the timer is small, then the frequency of the timers could generate too many interrupts on the processor. If a thread can arbitrarily set timers, then it can deny processor time to other threads. Requirement 4.5 There will be a bound on the number of timers per unit time a thread can set.

4.3 Memory Within the control section of a program, a program can spawn threaads to run. Each thread uses the equivalent of a thread control block, taking up memory. 8

Requirement 4.6 There shall be a bound on the number of threads a program can generate The memory usage of a program must be controlled. If a program allocates all of the shared heap to itself, then it denies access to the resource to other programs. Requirement 4.7 There will be a bound on the amount of memory a program upon a node can use. To protect programs from each other, programs cannot be allowed to arbitrarily generate memory accesses. Requirement 4.8 Programs cannot generate arbitrary references to memory. For packet forwarding code, we would prefer that there was no time spent allocating heap, nor any time spent running the garbage collector. If we don’t allow heap to be allocated during execution, then there will be no garbage to collect. Requirement 4.9 Packet forwarding code shall not allocate heap. As described above in Section 3, the routing table is the most sensitive data structure of a network. We believe that direct access to the routing table must be forbidden to protect the network. Requirement 4.10 The AN program cannot directly manipulate the routing table of the router.

4.4 Security Issues The requirements listed above are intended to prevent misbehaving programs from trashing the network and in preventing obvious denial of service attacks from malicious programs. However, the difficulty in security is in anticipating how the malicious agents will attack the system. Our security model is based around a set of trusted nodes, and trusted code, which is protected using cryptographic techniques. Principals are trusted to either provide code or recommend good coded. Requirement 4.11 It must be possible to trace a chain of trust from a given code segment to a trusted principal. Requirement 4.12 The chain of trust associating code and principals must not be forgeable.

5 The Programming Model Implementing safety and security policies in current computing environments requires run-time checks. Typical examples include comparing of virtual memory addresses against bounds in the segment and page tables to ensure that memory accesses are not out of bounds and checking user identifiers against access control lists to ensure that a particular call can be made. 9

These run-time checks add to the overhead in executing a program, and for functions with strict performance constraints, the resulting delays may be unacceptable. For AN technologies in particular, the need to compute, manipulate and switch packets in real time means we need to minimize the number of run-time checks whilst still obeying the strict safety requirements. In our design we have used recent (and not so recent) advances in the semantics of programming languages to define an AN language which replaces dynamic run-time checks with static checks which can be carried out when the active code is downloaded. Static checks are usually more complex then dynamic checks, but they only have to be carried out once, during the initialization phase, so the packet-handling phase can execute more efficiently. We have taken the safety policies required of an AN language and built the type system to support these policies. If we can prove that the type system embodies the safety policies, and that the language satisfies subject reduction, we need only check that the program is well-typed before running the program. A common approach taken in the formal definition of concurrent or distributed programming languages (originating with Milner’s CCS [4] and Hoare’s CSP [5]) is to define a grammar representing a network. For example a network with two threads t1 and t1 running at location l1 , and a thread t2 running at location l2 is written: 0

l1 [t1] j l2 [t2] j l1 [t1 ] 0

The threads are described using a grammar based on a core programming language, for example a program which spawns two sub-threads is: l1 [spawn ft1g; spawn at l2 ft2 g; t1 ] 0

We can then formally define an reduction relation on networks defining how execution of threads may take place, for example: l1 [spawn ft1 g; spawn at l2 ft2 g; t1 ] ! l1[t1] j l1[spawn at l2 ft2g; t1] ! l1[t1] j l2[t2] j l1[t1] 0

0

0

We can also formally define a type system for programs, and hence for networks, with judgements N : network. A distributed language satisfies subject reduction if well-typed networks stay well-typed, that is:



if N : network and N ! N then N : network. 0

0

Typed programming languages and subject reduction have long been investigated, using programs as syntactic presentations of logical proofs (see [6] for example). The first full programming language to be formally specified was Standard ML [7]. One common feature of type-safe programming languages is that they provide an abstraction barrier between the programmer and the run-time environment. For example, direct pointer arithmetic, or arbitrary casting are not allowed.

5.1 Virtual Networks and Best Effort Programming We believe that the correct model for programming ANs is to view the program as a form of virtual network creation and maintenance. The virtual network is 10

made up of a number of nodes at the edge and within the network. In the packet forwarding part of the program, the program at each node runs packet forwarding code to send data to the necessary places. If some exceptional event occurs, such as a new node joining or leaving the network, then the program will run code at each node to communicate and update state between each of the nodes. A network of nodes communicating with each other is very close to the use of communication channels in concurrent languages with dynamic channel creation. Languages based upon the creation of unique channel names such as PICT [8] or Concurrent ML [9] build on the original work of Milner, Parrow and Walker’s π-calculus [10]. These unique channel names have been generalised to locations in Ambient [11] and distributed π-calculi [12, 13]. In our language, each node in the virtual network has a unique identifer. The unique identifiers can only be tested for equality, and are collected in a set forming a type of location. To allow for the construction of a graphs of nodes, we define a function route that when given a destination location, will return the next hop on the route to the destination, as the type location. We also have a subtype of location of here which is the location of the current node where the program is executing. In the π-calculus, communication between threads is based on channels with unique names. We use the same model of communication, allowing the programmer to generate a unique name and then bind a new resource to this unique name. These resources can be local (corresponding to heap-allocated objects), global (corresponding to registered objects) or classes. If a program is on the node on which a resource is registered, then it can access the resource. The sending of packets is modelled by spawning threads at neighbour nodes with associated bindings of the free variables within the thread code. In particular, if the name of a resource is unbound within the thread code, then it will be bound from a name within the sending program. In this way, we can communicate resources across the network so that they can be used from other nodes. The bindings of the free variables is the closure of the thread code, so that it is a self contained chunk of code to be run at the neighbouring node. After a thread is spawned at a remote node, the spawning node can only communicate by spawning a new thread back on the original node with new data in the thread closure. This returning thread can then access resources registered in the original program to return values. The typing rules for the spawn operation force the programmer to calculate the next hop to the destination at that instant, by using the route function eg spawn at route dst f some code g In the case when the programmer does not care about which node is the neighbour, they can simply spawn forwarding code as shown below. This obviously satisfies Requirement 3.4. When the programmer wishes to keep track of the topology of the network so as to implement some form of application routing, then they can store the location values returned by the route function. But before these values can be reused, the programmer must test their validity before spawning threads at these supposed neighbours, since the route may have changed between storing the value and actually using it eg.

11

let myNextHop : location = route dst; : : : some time later : : : if (myNextHop == route dst ) f spawn at route dst f some code g g else f fix up the state

g

In particular, since the route function is the only way of generating neighbour values, the program has to interrogate the local node to discover where to send packets (Requirement 3.1 3.2). Since there is no requirement on the route to be consistent about the values returned, then the programmer is forced to recognise the non-stationary nature of routing topology (requirement 3.3) and maintain soft state about the actual topology of the virtual network, reinstalling the program state if the route changes (Requirement 3.8). A normal packet network is thus transformed from one in which just data is sent around into a virtual active network in which threads are sent around with the necessary resource names of places along their routes so that they can send threads in the reverse direction to communicate computed information. The spawning of threads at a neighbour is analogous to sending a packet. Since the state of threads cannot be discovered except by sending another thread after it, we model best effort packets (Requirements 3.5 and 3.6). If the program requires the thread to return some values, then the programmer must surround the thread with a timer to repeat the attempt. Since the new route for the second attempt at respawning may not be the same, the programmer must again follow the soft state model (Requirement 3.8), reinstalling resources if the virtual graph is reformed. For example, a program which delivers a packet to a remote stream is: interface Stream hPacketi f method handle ( p : Packet);

g

class Forward hPacketi f static method forward ( p : Packet; l : location; s : at l Stream hPacket i) f if (l == localhost) f s:handle ( p); g else f spawn at route l f Forward:forward ( p; l ; s);

g g

g g

5.2 Packet Forwarding Code Resource allocation is of primary importance to AN code: allocating heap is an expensive operation, and we would like to ensure that packet-forwarding can happen without the overhead of heap allocation. To achieve this, we can distinguish between three categories of program:

12

  

values that just immediately return a result, packet-forwarding code, which cannot allocate heap, and control code, which can perform arbitrary computation.

These categories are reflected in the type system, which can enforce the restriction that packet-forwarding code cannot allocate heap (Requirement 4.9). This separation of categories of computation is commonly used in semantics [14, 15, 16] to distinguish between expressions with no computational significance such as: 37 1 + 2 and those with computational significance: while (true) f g print("hello world") get(r) This separation into categories of computation can be presented using monadic type systems where we distinguish between:

 

values: functions guaranteed to terminate, no side-effects,. . . computations: functions may not terminate, side-effects,. . .

This system generalizes, for example it is the basis of side-effects in the Haskell programming language, which can distinguish using types between side-effectfree computation, usage of I/O, concurrent programs, memory allocation, and others. Once we have distinguished between different categories of computation, we can make engineering decisions about which facilities should be available in which category. For example, we could allow while-loops in control code, but restrict to for-loops during packet-forwarding, to ensure that packetforwarding code is guaranteed to terminate (Requirement 4.3).

5.3 Security Mechanisms Nominal calculi have not only modelled communication networks; they have also been used to model security processes [17, 18, 19]. We will be using the nonces created in nominal calculi to model the chains of trust formed from principals to their code, and in the use of secure certificates.

5.4 Accounting Mechanisms Recent work on typed languages [20, 21] has shown that more subtle safety properties can be verified by a type system. For example, we wish to ban a denial-of-service attack where a packet is bounced backwards and forwards between two nodes forever. The usual solution is to add a hop-count to each packet, so we add a new parameter to the spawn command, which is a counter to decrement whenever a new thread is created, for example:

13

class Forward hPacketi f static method forward ( p : Packet; l : location; s : at l Stream hPacket i; c : int) f if (l == localhost) f s:handle ( p); g else f spawn (c = c ? 1) at route l f Forward:forward ( p; l ; s; c);

g g

g g

The semantics of spawn is now: l [spawn (c = n ? 1) at m ft1 g t2 ] ! m[let c = n ? 1; t1] j l [t2]

(when

n > 0)

If the counter ever drops to zero, then the packets are dropped. This method works for trusted routing code, but an unscrupulous agent can flood the network with packets by always setting the hop-count to one: class Attack f static method attack (l1 : location; l2 : location) f if (l1 == localhost) f attack (l2 ; l1); g else f let d : int = 1; spawn (d = d ? 1) at route l1 f Attack:attack (l1 ; l2); Attack:attack (l1 ; l2);

g g

g g

This creates an infinite loop between the two sites l1 and l2 , creating exponentially many processes along the way. This denial-of-service attack is possible because we have allowed active code to treat integers as hop-counts. One possible solution is to introduce a new type for hop-counters, which is a subtype of int, so counts can be cast to ints, but not vice versa. If we use this new type rather than int to represent hop-counts then we can still forward packets, but the denial-of-service attack fails to type-check since int is not a subtype of count. This simple approach stops unbounded use of the network, but still allows exponential blow-up, for example:

14

class Attack f static method attack (l1 : location; l2 : location; c : count) f if (l1 == localhost) f attack (l2 ; l1); g else f spawn (c = c ? 1) at route l1 f Attack:attack (l1 ; l2; c); Attack:attack (l1 ; l2; c);

g g

g g

Although this process will eventually terminate, it does so after using resources exponential in the size of c. Fortunately, there is a standard type-theoretic solution to such problems, which is to add a restriction that the count type is linear, that is can only be used once. Linear types have a logical basis in linear logic [20], and form the basis of linear lambda-calculi [21, 22]. We can use linear counts as the basis for determining how much of the processor a thread can use on a given node, how many threads can be spawned, how many packets sent, how much heap or resources can be allocated and how many timers can be set. Since linear types can only be used once, they also offer the possibility of detecting replicated threads. Yet another use of linear types is in setting the maximum burst a packet can send into the network (Requirements 4.4, 4.5, 4.6, 4.7, 3.7, 4.1, 4.2). Although the core language will include the linear counts as parameters to the resourced operations, a programmer will not normally need to do anything sophisticated with the counts. Therefore we will hide the details of the counts beneath abstractions that use the counts in appropriate ways and remove the burden of managing the counts from the programmer. We have used these abstractions in the examples and in Section 6.

5.5 The Runtime System The runtime system is responsible for accepting incoming threads and running the appropriate code. It will act as the interface between the route function in the code and the routing table, and provide access to other network information, such as the MIBs of various network components, or estimates of network state (Requirement 3.9). Full details of the runtime system will be detailed in a following report.

6 Building Multicast Trees - an example The example shown is based loosely upon the multicast routing algorithm in sparse mode PIM [23]. The factory class runs at a well-known node, which is equivalent to the rendezvous point. To create a new group, the build method of the factory class is called. This generates a new class with a new name, equivalent to the multicast group address and some local state, such as the rendezvous location. 15

To join and be able to send and receive to and from the group, a node has to load the generated group class. Discovering the name of the group class and some place to load it from is equivalent to discovering a multicast address from a tool such as a session directory. When the group class is loaded, it spawns a thread at the neighbour en route to the rendezvous node that will subscribe the current location. This is the analogue of the join packet going upstream. The subscribeLocation method invoked checks to see if the routing is symmetric, and if so, subscribes the downstream host. If the routing is asymmetric, it adds the hop back to the list of subscribed downstream neighbours and launches a thread back to the original location to subscribe on the way down. Since a method in the class is invoked on the way downstream, the class will be loaded, and the upstream node subscribed automatically. To receive packets, an entity hands a packet handling Stream to the subscribe method of the group class. To send a packet, the entity calls handle. This then invokes the sendDownstream and sendUpstream methods to spawn threads to send the packets up to the rendezvous and then back downstream. Ther are various points to note. We can only spawn at neighbours. A real implementation would need to add an unsubscribe method, allow nodes to leave the tree, and would periodically check to make sure its upstream neighbour was still alive.

16

interface MulticastStream hPacketi extends Stream hPacketi f method subscribe (s : Stream hPacketi);

g

interface Factory hBuilti f method build () : Built;

g

class MulticastFactory hPacket i implements Factory hMulticastStream hPacketii f method build () : MulticastStream hPacketi f let rendezvous : location = localhost; class Group implements MulticastStream hPacketi f let current : location = localhost; let downstream : Vector hlocationi = new Vector hlocationi (); let subscribers : Vector hStream hPacketii = new Vector hStream hPacket ii (); if (localhost ! = rendezvous) f spawn at route rendezvous f Group:subscribeLocation (current ); g

g

static method subscribeLocation (l : location) f if (route l == l ) f downstream:addElement (l ); g else f spawn at route l f Group:subscribeLocation (l );

g

g

g

static method sendUpstream ( p : Packet) f if (localhost == rendezvous) f Group:sendDownstream ( p); g else f spawn at route rendezvous f Group:sendUpstream ( p);

g

g

static method sendDownstream ( p : Packet) f for (n : location) in downstream f spawn at route n f Group:sendDownstream ( p);

g

for (s : Stream hPacket i) in subscribers f s:handle ( p);

g

g

method subscribe (s : Stream hPacketi) f subscribers:addElement (s);

g

method handle ( p : Packet) f Group:sendUpstream ( p);

g g

return new Group;

g g

17

g

g

7 Optimisations The programming language is designed to match the communications model of networks, and is generally hop by hop. However, the implementation of the runtime system is not constrained to follow the language. In particular, we will implement library routines defined in the Safetynet language for common programming needs such as rpc or the registering of remote resources using more appropriate protocols which go straight to the destination, rather than enforcing the hop by hop nature of the semantics. We will be investigating the use of static analysis in the compiler to optimise the communication and processing costs.

8 Comparison with other Active Network Research As far as we are aware, there is only one other project which is attempting to bring together modern programming technologies and Active Networks. The Switchware project [24] at Penn is producing a two level architecture based on strongly typed languages. The PLAN language [25] is based upon a subset of ML with the addition of primitives to support remote execution. The language provides only primitive recursion and so guarantees termination, but doesn’t allow the accumulation of state except through use of special system primitives. We believe that the type systems defined herein can protect against unsafe heap usage. Netscript from Columbia is a dataflow language to glue together data streams into the appropriate filter primitives [26]. It is an object oriented scripting language and so would be more akin to TCL than to a strongly typed language. MIT [27] are basing their work upon Java with its attendant level of safety. Bhattacharjee et al. at GSU [28] are providing hooks into router supplied and controlled functions, rather than providing full programming environment. BBN has defined two programming languages, “Sprocket” a high-level language rooted in C that compiles to “Spanner” an intermediate language that assembles into a small footprint, so that the programs can sit inside a single packet [29]. Spanner is a CISC-style stack based assembly language with high level types for the operands. Both Sprocket and Spanner perform type checking at compile and assembly time, respectively. Additional safety relies on cryptographic authentication of the incoming packet, run-time authorization of the program, and a “sandbox” virtual machine environment which checks the types of arguments before instruction execution and limits the program’s access to node resources. Other work outside of the AN scene of relevance includes Spin [30] which is an extensible operating system which allows users to download code into the kernel. An operating system faces many challenges common to those facing ANs. The language chosen by the spin designers is Modula-3 which is a strongly typed language with its automatic garbage collection, and known denotational semantics for a Modula-3 subset [31]. The type checking of the Modula-3 compiler is used to ensure domains of protection at the level of binding names for access to objects. One of the seminal works in the field of ANs

18

was the thesis of Christian Tschudin [32], in which he used a form of Pascal as the basis for developing the active packets.

9 Current Status The language is being specified, and a prototype compiler is currently being built for the first phase of the language. The first pass of the compiler will compile down to Java classes, so that we can bootstrap upon the existing AN Java infra-structure [27], and use our AN simulator [33]. The compiler will be finished by September 1998. Prototype applications are planned to appear by Autumn 1998, such as self organising web caches. We plan to extend the work by designing a well-typed, formally defined virtual machine, rather than using Java bytecode.

10 Conclusion We have described the process of designing a language for Active Networks. This has been an inter-disciplinary work, basing the semnatics of the programming language upon the communication model of the Internet and using results from the programmming languages community to reduce the overhead of run-time checks.Whilst we would not claim that proection can be achieved solely through the use of language level techniques, they can go a long way to removing the costly runb-time checks that bedevil a language like Java, such as array bounds checking, checking the class details for any method invocation, tracing the type hierarchies to ensure assignment compatibility and so forth.

References [1] S. Kim. The Art of Human Computer Interface Design, chapter Interdisciplinary Collaboration, pages 31–45. Addison Wesley, 1990. [2] M. Scaife, E. Curtis, and C. Hill. Interdisciplinary collaboration: a case study of software development for fashion designers. Interacting with Computers, 6(4):395–410, 1994. [3] Vern Paxson. Measurement in the Internet. PhD thesis, University of California, Berkeley, 1997. [4] R. Milner. Communication and Concurrency. Prentice-Hall, 1989. [5] C. A. R. Hoare. Communicating Sequential Processes. Prentice-Hall, 1985. [6] Jean-Yves Girard, Yves Lafont, and Paul Taylor. Proofs and Types, volume 7 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, 1989. [7] Robin Milner, Mads Tofte, and Robert Harper. The Definition of Standard ML. MIT Press, 1990.

19

[8] B.C. Pierce and D.N. Turner. Pict: A programming language based on the pi-calculus. Technical Report CSCI 476, Computer Science Department, Indiana University, 1997. To appear in Proof, Language and Interaction: Essays in Honour of Robin Milner, G. Plotkin, C. Stirling, and M. Tofte, eds, MIT Press. [9] J. Reppy. A higher-order concurrent langauge. In Proc. SIGPLAN 91, pages 294–305, 1991. [10] R. Milner, J. Parrow, and D. Walker. A calculus of mobile proceses. Inform. and Comput., 100(1):1–77, 1992. [11] Luca Cardelli and Andrew D. Gordon. Mobile ambients. Unpublished manuscript, 1997. [12] Peter Sewell. Global/local subtyping for a distributed π-calculus. Technical Report 435, Computer Laboratory, University of Cambridge, 1997. [13] J. Riely and M. Hennessy. Distributed processes and location failures. Technical Report 2/97, Univ. Sussex, 1997. Also presented at ICALP ’97. [14] E. Moggi. Notions of computation and monad. Inform. and Comput., 93:55– 92, 1991. [15] Philip Wadler. Comprehending monads. In Proc. 1990 ACM Conf. Lisp and Functional Programming, pages 61–78, 1990. [16] P. Hudak, S. L. Peyton Jones, P. Wadler, et al. A report on the functional language Haskell. SIGPLAN Notices, 1992. [17] A. Gordon and M. Abadi. A calculus for cryptographic protocols: The spi calculus. Research Report 149, Digital Equipment Corporation Systems Research Center, 1998. To appear in Information and Computation. [18] Andrew D. Gordon. A calculus for cryptographic protocols: The spi calculus. In Proc. ACM Conf. on Computer and Communications Security, pages 36–47. ACM Press, 1997. [19] Andrew D. Gordon. Nominal calculi for security and mobility. In Proc. DARPA Workshop on Foundations for Secure Mobile Code, pages 10–14, 1997. [20] J.-Y. Girard. Linear logic. Theoret. Comput. Sci., 50:1–102, 1987. [21] N. Benton, G. Bierman, V. de Paiva, and M. Hyland. A term calculus for intuitionistic linear logic. In Proc. Int. Conf. Typed Lambda Calculi and Applications, number 664 in LNCS. Springer-Verlag, 1993. [22] N. Benton. A mixed linear and non-linear logic: Proofs, terms and models. In Proc. Computer Science Logic ’94, number 933 in LNCS. Springer-Verlag, 1995. [23] S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C.-G. Liu, and L. Wei. An architecture for wide-area multicast routing. In Proceedings ACM SIGCOMM, pages 126–135, London, UK, August 1994. ACM.

20

[24] Carl A. Gunter, Scott M. Nettles, and Jonathan M. Smith. The switchware active network architecture. To be published in IEEE Network special issue on Active Nets. [25] Michael Hicks, Pankaj Kakkar, Jonathan T. Moore, Carl A. Gunter, and Scott Nettles. Plan: A programming language for active networks. submitted to PLDI’98. [26] Y. Yemini and S. da Silva. Towards programmable networks. In IFIP/IEEE International Workshop on Distributed Systems: Operations and Management,, L’Aquila, Italy, October 1996. [27] David J. Wetherall, John Guttag, and David L. Tennenhouse. Ants: a toolkit for building and dynamically deploying network protocols. In Proceedings of first IEEE Conference on Open Architectures for Signalling, San Francisco, Ca, April 1998. [28] Samrat Bhattacharjee, Ken Calvert, and Ellen W. Zegura. An architecture for active networking. In High Performance Networking (HPN’97), White Plains, NY„ April 1997. [29] Alden W. Jackson and Craig Partridge. Smart packets. Presented at 2nd Active Nets Workshop, March 1997. [30] Brian Bershad, Stefan Savage, Przemyslaw Pardyak, Emin Gun Sirer, David Becker, Marc Fiuczynski, Craig Chambers, and Susan Eggers. Extensibility, safety and performance in the spin operating system,. In Proceedings of the 15th ACM Symposium on Operating System Principles (SOSP15), pages 267–28, Copper Mountain, CO, 1996. [31] Martin Abadi. Baby modula-3 and a theory of objects. Journal of functional programming, 4(2), April 1994. [32] Christian Tsuchudin. On the Structuring of Computer Communication. PhD thesis, University of Geneva, Switzerland, 1993. [33] Rory Graves and Ian Wakeman. Experiences of implementing a jvm for simulation. Technical report, School of Cognitive and Computing Sciences, Uni of Sussex, 1998.

21