Towards a Logical Addressing and Routing Sublayer for Internet Multicasting Jean-Jacques Pansiot * (
[email protected]) Dominique Grad † (
[email protected]) Stella Marc-Zwecker * (
[email protected]) * Université Louis Pasteur, LSIIT and Computer Science Department, 7 rue René Descartes F67084 Strasbourg Cedex, France Phone: (33) 88 41 64 28 Fax: (33) 88 41 66 28 † Université Robert Schuman, IUT Strasbourg-Sud, Département d’Informatique 72, Rte du Rhin F67400 Illkirch-Graffenstaden, France
ABSTRACT
We consider the internetwork layer for multicast communication and show that addressing and routing problems differ substantially from the case of point-to-point communications. In particular, internetwork group addresses are highly dynamic objects that are created by applications, and multicast routing must handle scalability and security. We show that these new problems could best be handled in a new sublayer above the Internet sublayer, called Logical Addressing and Routing. We then give some hints on how such an architecture could work, in particular for dynamic group address allocation, incremental diffusion tree construction and routing control.

I Introduction
Today's large networks, and in particular the Internet, are based on an architecture designed for point-to-point communications. Network addresses are basically geographical locations, or routes towards such locations. Routing, though dynamic, is essentially concerned with finding the best route, given (static) routing policies and a (static) configuration with possible component failures. Objects (addresses, routes) handled in the Internet layer, say IP, are thus mainly static and independent of the user applications. The first layer that depends on applications is the Transport layer, where for example a connection is dynamically created by a calling user process, say telnet. Moreover, except for Application gateways, the Transport layer is involved only at the two endpoints of the communication.
As we will show, multipoint communications (or group communications) are very different. In particular, a group may be dynamically created by the Application layer, and its address must be handled at intermediate nodes, for example for the creation of a multicast tree. We propose to define a sublayer handling logical addressing and routing for best effort delivery of multicast datagrams sent to a (logical) group address. This sublayer makes use of the normal unicast Internet sublayer, e.g. IP, IPng or CLNP, and associated routing protocols. In section II we give some basic definitions and the main differences between unicast and multicast at the Internet layer. In section III we propose our two layered approach and give some indications on how it could work. In section IV we conclude on some benefits and problems with this architecture. II
Point to Point and Multipoint communications in the Internet layer. A point to point (or unicast) communication involves only two hosts, and each packet is sent
to only one destination. In a multicast (or multipoint, or group) communication, a packet may be sent to a group of destinations. This means that the packet must be duplicated somewhere. In this paper we are concerned with the Internet layer, and our examples will be taken mainly from the TCP/IP world and its multicast extensions. We think that the situation would be basically the same in other internetwork architectures. The current approach is to deal with internetwork multicasting at the same level as the usual unicasting. The basic idea is to construct a tree such that the sender is a node of the tree, and the packet is forwarded along the tree until it reaches all nodes, including all destinations. Current research includes algorithms to build efficiently such trees while optimizing some parameters such as their diameter. Also for a group with several sources, one may define one tree per source as in DVMRP [RFC1075] or only one tree for all sources as in CBT [BFC93]. We will look in turn at addressing, routing, duplication and switching. II.1
Addressing
Network addressing in a unicast communication is based on hierarchical addresses pointing to a geographical location, or sometimes on routes towards such a location. The correspondence between a host and a Network address is static, except for mobiles, which are not directly handled in the current architecture (see VIP [TUS94]). Moreover, because of the hierarchical structure of addresses, intermediate entities need not know individual addresses, but only subnetworks and networks. These (sub)network addresses are even more static than host ones, and are basically allocated "by hand". Uniqueness of addresses is rather easy to guarantee.

A group address, for example a class D IP address for IP multicast [RFC1112], does not correspond to a geographical location. The correspondence between a host and a group address is dynamic: it is established while this host is a member of this group. Moreover the lifetime of a group address may itself be limited, because groups may be dynamically created and destroyed. Clearly there must be an automatic system to allocate Network group addresses if computer supported cooperative work [BB91] is to be widely deployed. Moreover global (flat) group addresses, such as class D addresses, are difficult to allocate in a distributed manner. Finally, one may observe that a unicast Network address concerns a physical host, whereas a multicast address is a logical address corresponding to some instance of an application.

II.2
Routing
By routing we mean here the information necessary to route a packet toward its destination(s), and the protocols used to obtain such information. In the unicast case, routing information concerns individual hosts, or more generally subnets or nets. Because of the rapidly increasing number of nets, higher levels of aggregation have been defined, such as in CIDR [RFC1519]. Therefore, in a routing table, the further away a net is, the less information has to be kept about it. Routing tables are generally dynamically constructed using routing protocols. However this evolution is rather slow and is due to:
• component failures or recoveries, hopefully not so frequent, at least on backbones,
• modifications of network architecture such as new links,
• connection of new networks.
It is shown in [Chi93] that, except for component failures at the periphery of the internet, routing tables are quite stable. Moreover if we consider interdomain routing, propagation of routing information is strictly filtered by the routing policy of each routing domain. To summarize, routing information is stable, depends on routing policy and geography, and is independent of the user applications currently running.

In the multicast case, routing information is associated with group addresses, which are basically independent of location, and possibly with the source address. The principle of aggregation seems hard to use. If group communications such as those needed for cooperative work are to be widely used, the number of required group addresses is potentially unlimited, since any set of users in the Internet may form a group at a given time. In fact the size of these routing tables depends not only on the number of groups whose communication must get through a given router, but also on the number of sources in each group. This is the case in particular when one multicast tree is defined for each source. Moreover, these routing tables must evolve fast, because their contents depend on the creation or deletion of groups and, worse, on the insertion and deletion of group members. The lifetime of a group might be measured in hours for example, but in large groups, membership could change as fast as once every minute. This means that routing information must change much more often than in the unicast case. Another difficulty with group addresses, for highly open and dynamic groups, is to dynamically define a routing policy for them.

To summarize, multicast routing tables are highly dynamic, may reach a size basically independent of the internet, and depend on user applications. One would expect for example that the size of routing tables would grow at peak hours in the same fashion as traffic. Therefore, scalability is an important issue for multicast routing. As a last remark we observe that multicast routing may use information from unicast routing protocols. Some proposed Internet multicast protocols rely on a specific unicast routing protocol, such as MOSPF [RFC1584]; others, such as PIM [DEF94], are independent of the unicast routing protocol.

II.3
Switching and Duplication
By switching we mean the operations performed on a received datagram before it is delivered. In the unicast case, switching consists mainly of a lookup in the routing table, the lookup key being the destination address, plus possibly other header fields such as Type of Service and Source Route. The packet is sent only once. In the multicast case, the table lookup must generally make use of the sender address, at least in groups with multiple senders. The packet may then be duplicated. This is because a multicast packet with N destinations must be duplicated at some points in the network to generate N copies. It should be noted that duplication can be done in two ways:
• explicit duplication by a node in the Network layer. This is the only solution when the underlying service at the Link level is only point to point,
• implicit duplication by the medium in the Link or MAC layer. For example over Ethernet, the Network layer only has to pass a single copy of the packet, with a multicast destination. One copy of the packet will be read simultaneously by several destinations.
We see that duplication is the main difference between unicast and multicast. But duplication is a serious security hole. This is well known for the implicit duplication at the MAC layer: anybody can spy on data sent on an Ethernet network. Uncontrolled multicast routing would just make the whole Internet one big Ethernet network as far as security is concerned. Because of this, it should be impossible for a new duplication to be set up in some node without the knowledge of upper layers. Since duplication in a node is just a consequence of routing information, this means that dynamic routing of a group should be controlled by upper layers. This is totally different from the unicast case.

II.4
Conclusion
From the points discussed above, we may draw some conclusions:
• multicast deals with dynamic objects created by upper layers and users,
• multicast puts big constraints on a router in terms of scalability, frequency of routing updates, and security,
• these constraints should be imposed only on routers where they cannot be avoided, that is on routers where duplication takes place,
• multicast may make use of unicast routing for constructing routing tables.
From these remarks we propose that multicasting be handled in a separate sublayer above the usual (unicast) Internet layer. This new Logical Addressing and Routing (LAR) layer would make it possible to separate problems of different nature and to handle multicast for a group only where it is unavoidable, that is where duplication takes place. In the next section we present our proposal for a two-layered approach.

III
The two-layered approach
III.1
General principles
III.1.1 The LAR sublayer
We propose the insertion of a Logical Addressing and Routing (LAR) sublayer between the Internet sublayer (which will be more generally called the Network layer in the following sections) and the Transport layer. This new sublayer is the uppermost one that can be present at intermediate nodes (except for Application gateways), while being the lowest one that handles information with Application dependent semantics. In particular it makes it possible to identify the members of a group independently of their location (logical addressing), and to convey information between them (routing). The LAR sublayer's message delivery service is supplied to a group of entities that can be widely spread across an interconnection of networks. This service can be seen as connection-oriented (members must join the group in an initial phase and leave it in a final phase), although it provides only a best effort message delivery, without flow control or error control.

III.1.2 Logical addressing principle
We define LAR addressing by the following characteristics:
(i) LAR addresses provide (logical) identification of dynamic Application objects (i.e. groups), and are totally independent of any geographical location, as opposed to Network layer addresses [PR90], [RFC791]. For example logical group addresses could be derived from an identifier of the organization to which the group's creator belongs.
(ii) Logical addresses are hierarchically allocated to easily guarantee their uniqueness. Recall that flat allocation of IP multicast addresses does not provide this property.
(iii) The LAR sublayer allows nodes to hierarchize address analysis. This is because only Network PDUs (NPDUs) whose Network destination address explicitly identifies the LAR entity of a given node will reach the LAR sublayer at this node. Therefore only nodes that play an active part in the multicast operation for a given group will be addressed at Network level and will analyze LAR level (group) information. An NPDU that encapsulates a LAR PDU then transparently crosses all the intermediate nodes that are not identified at LAR level by Network addresses, i.e. non-duplicating nodes: these intermediate nodes act in fact as passive relays and do not have any knowledge about the group (see Fig. 1). This principle compares favorably with IP multicast, where every intermediate node that is reached by a Network PDU must analyze its IP multicast destination address, even if it does not take an active part in the multicast operation (no duplication). Consequently such a node has to maintain entries to process the multicast address of each group that passes through it.

The selection of the nodes that are relevant for a given group (called LAR nodes) is made possible by address conversion. This mechanism performs a mapping between a LAR address and the associated Network addresses. It allows a LAR entity to decide, after analysis of the LAR address, whether it has to deliver the LAR PDU to the upper layer (member node) and/or forward it to other LAR entities (duplicating node). The Network destination addresses that are mapped from the LAR address identify in fact those next LAR entities. Note that the same conversion mechanism is executed by IP routers to perform the mapping between Network and Link or MAC addresses.

Remark: We say "conversion" because at each intermediate duplicating LAR node, source and destination Network addresses are changed (see Fig. 1).
III.1.3 LAR Architecture
In a group communication, copies of a packet are propagated towards members by successive duplications made by strategic nodes. Those duplicating nodes will be chosen in such a way that multiple copies of one packet will not be sent on the same link. Fig. 1 depicts an example of our architecture, where NSDU (@NS, @ND, LAR PDU) represents a Network SDU, and @NS and @ND are the source and destination Network addresses that identify respectively the SDU's sending and receiving LAR entities. NSDU1 is sent from LAR node S and received by LAR node A, which duplicates it and sends NSDU2 to LAR node D1 and NSDU3 to LAR node D2.
[Figure 1: protocol stacks (lower layers, Network sublayer, LAR sublayer, upper layers) of duplicating LAR node A, intermediate nodes B and C (Network-level routing only) and member LAR node D2. A receives (1) from S and duplicates it in the LAR sublayer into (2), sent to D1, and (3), sent to D2.]
(1) NSDU1 (@NS, @NA, LAR PDU)
(2) NSDU2 (@NA, @ND1, LAR PDU)
(3) NSDU3 (@NA, @ND2, LAR PDU)
Fig. 1: LAR Framework

In our architecture, we choose to build a unique center-based tree instead of source-specific trees. Building the tree means selecting duplicating nodes and members as the tree vertices. A tree edge such as (A,D2) is supported by the Network layer as a point to point communication between vertices. Intermediate nodes (like B and C) do not activate the LAR sublayer for this group. Only members such as D2 deliver SDUs to upper layers.

III.1.4 Duplication
Duplicating nodes carry out duplications by sending copies to individual nodes, and/or by sending a single copy to a set of destinations via a shared broadcast medium (called multicast environment in the sequel). Such an environment could be a LAN providing a Link level broadcast service, or an Autonomous System using a Network layer multicast protocol [BFC93] [DEF94] [RFC1075] [RFC1584]. This routing protocol could multicast easily, and only the border nodes would have to implement the LAR sublayer. Fig. 2 illustrates the different types of duplication, where grey nodes belong to the LAR tree and white ones do not:
[Figure 2 legend: AS, LAN, Duplication, Intermediate Node, Duplicating Node, Duplicating Node & Member, Member]
Fig. 2: Tree Duplication Points

Remark: Nodes with degree 2 in classical multicast trees do not have to belong to the LAR tree if they do not have a duplication function. Nevertheless, only nodes that are members or have a neighboring multicast environment should have a degree ≤ 2 in the LAR tree. We will make use of duplication by a multicast environment when possible (see Fig. 3): medium A is used for duplication between three nodes, but medium B remains a point to point edge, because it is used for communication between only two LAR nodes.
[Figure 3: a Classical Multicast Tree and the corresponding LAR Tree over media A and B. Legend: Duplicating Node and/or Member, Duplication by Multicast Environment, Relay Node]
Fig. 3: Multicast Distribution Trees
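The remark on degree-2 nodes can be illustrated by a minimal sketch, given here under simplifying assumptions: the classical multicast tree is an undirected edge list, multicast environments are ignored, and every non-member node of degree 2 is treated as a pure relay. The function name `lar_tree` and the node names are hypothetical and serve only this illustration.

```python
from collections import defaultdict

def lar_tree(edges, members):
    """Contract relay nodes (degree 2, not members) out of a classical tree.

    Simplifying assumption: multicast environments are not modelled, so every
    non-member node of degree 2 is replaced by a direct LAR edge.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for node in list(adj):
        if node not in members and len(adj[node]) == 2:
            a, b = adj.pop(node)      # the relay's two tree neighbours
            adj[a].discard(node)
            adj[b].discard(node)
            adj[a].add(b)             # a point to point LAR edge replaces the relay
            adj[b].add(a)
    return {frozenset((u, v)) for u in adj for v in adj[u]}

# S sends through relay R1 to duplicating node A, which reaches D1 via relay R2.
classical = [("S", "R1"), ("R1", "A"), ("A", "R2"), ("R2", "D1"), ("A", "D2")]
tree = lar_tree(classical, members={"S", "D1", "D2"})
```

In this example A keeps its place in the LAR tree because it has degree 3 (it duplicates), while R1 and R2 disappear and are only crossed at Network level.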
III.2. LAR sublayer addressing We first describe how the LAR group addresses are managed, and then we show the relationship between LAR addresses and adjacent layers addresses. III.2.1 LAR addresses definition Since we consider the context of group communications, we present the management of logical addresses within a group. III.2.1.1 Obtaining logical addresses We suppose that groups are identified by logical hierarchized names, similar to DNS (Domain Name Service [RFC1034]) ones (e.g. "perf.net.u-strasbg.fr" could denote a performance analysis working group, organized by the network research team of Strasbourg university). The association between a group name and the corresponding unique LAR address, can be performed by a DNS-like hierarchized directory, that we call LAR directory. The additional difficulty for this directory lies in the need to provide dynamic address allocation. This mechanism must be automated and actually faster than DNS's, that generally relies on manual modifications. Note that address deallocation must also be automatic, but with less time constraints. To allow LAR communications we must define, not only group destination LAR addresses (noted @LD), but also source LAR addresses (noted @LS), so that the originator of a LAR message can be identified. The identification of a message sender must be done indeed at LAR level, because the Network address of the sender is lost after the traversal of the next LAR node (see conversion mechanism in Fig. 1). The source LAR address (@LS) can be used, if needed, to individually identify the message originator. Therefore individual LAR addresses could be either: • directly derived from the originator's Network address (not good for layer independence), • or derived from the group logical address (@LD), for example by adding a selector, • or independently assigned as suggested for Universal Transport Addresses in [RFC1705]. 
Remark: Every LAR node should also have an individual LAR address to perform layer management, in the same way as Internet routers have their own IP addresses.
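As an illustration of hierarchical allocation, the following toy directory is a sketch under stated assumptions and not the LAR directory protocol itself. It assumes that a group LAR address is simply the tuple of name labels in hierarchy order, and that individual addresses are derived from the group address by appending a numeric selector (the second of the three options listed above). The class and method names are hypothetical.

```python
import itertools

class LarDirectory:
    """Toy LAR directory; the address format is an assumption of this sketch."""

    def __init__(self):
        self._groups = {}     # group name -> group LAR address (@LD)
        self._selectors = {}  # group name -> counter yielding member selectors

    def create_group(self, name):
        # Uniqueness follows from the uniqueness of hierarchical names.
        if name in self._groups:
            raise ValueError(f"group name already allocated: {name}")
        address = tuple(reversed(name.split(".")))
        self._groups[name] = address
        self._selectors[name] = itertools.count(1)
        return address

    def resolve(self, name):
        # Raises KeyError if the group does not exist (or was deallocated).
        return self._groups[name]

    def allocate_member(self, name):
        # Individual address (@LS) = group address plus a selector.
        return self._groups[name] + (next(self._selectors[name]),)

    def delete_group(self, name):
        # Automatic deallocation when the group disappears.
        del self._groups[name]
        del self._selectors[name]

directory = LarDirectory()
group = directory.create_group("perf.net.u-strasbg.fr")
member = directory.allocate_member("perf.net.u-strasbg.fr")
```

A real directory would be distributed along the name hierarchy, as DNS is; the single in-memory table here only shows why hierarchical names make uniqueness easy to guarantee.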
III.2.1.2 Group creation
The group manager (in the sequel, we will assume that the creator is also the manager of the group) sends a request to the LAR directory with the proposed name. The directory responds with the LAR address created for the group, as well as the manager's individual LAR address. After this, the manager can publish the group name in a group directory similar to the Session Directory (sd) used in the Mbone [Eri94].

III.2.1.3 Joining a group
Any candidate wishing to join a group must send a request to the manager, whose address has been obtained from the group directory. The manager then executes a group access control phase for this potential new member and sends a failure or success response. In case of success, the manager starts the execution of a distributed algorithm for adding the new member to the LAR tree (see III.4.1). Following this step, the manager sends to the new member:
• the group's LAR address,
• the new member's individual LAR address,
• the addressing information needed by the new member to join the LAR tree. This information, obtained during the previous tree modification process, consists of the Network address(es) of the LAR tree node(s) that is (are) going to be the direct neighbor(s) of the new member.
Therefore the new member will be able to perform conversion between the group's LAR address and the Network address(es) of the node(s) that attach it to the LAR tree.

Remark: Case of open groups
The proposed architecture makes it possible to handle the case where the originator of a packet addressed to a group does not itself belong to the group. This external sender must first contact the group manager (found through the group directory) by sending it a request. The manager then performs an "open" group access control phase, and informs the external candidate of the failure or success of its request. In case of success, the manager sends to the external member:
• the group's LAR address,
• the external member's LAR address: this can be a constant LAR address designating any external sender, in case of anonymous identification. Otherwise it is defined as in III.2.1.1,
• the Network address of the LAR tree node of attachment.
Note that, in contrast to a member's attachment, the aforementioned attachment is unidirectional: it allows an external member to send messages to the group, but not to receive messages from the group, because the external member's address does not appear in its attachment node's routing table.

III.2.1.4 Leaving a group
When any member other than the manager leaves the group, its individual LAR address is deallocated. When the manager, which we suppose to be the last member, leaves the group, both its individual LAR address and the group's LAR address and name are deallocated. This leads to the modification of both the LAR directory and the group directory.
Remark: When the LAR protocol detects the silent failure of a member, it must reclaim the member's LAR address.

III.2.2 Relationship with Network and Transport addresses
We first present the relationship between LAR addresses and the addresses of adjacent layers. Then we show that the concept of multicast addresses at Network level remains useful, and we suggest a new hierarchized structure for such addresses.

III.2.2.1 Building of adjacent layer addresses
LAR addresses are not built from Network addresses, since a logical group (or member) identifier is independent of any location. The link between LAR and Network addresses for a group member is computed dynamically during the group joining phase. On the other hand, the LAR address can serve as a basis for the hierarchical building of a Transport address, by adding a selector [PR90]. For instance, TCP port numbers constitute selectors that are added to IP addresses.

III.2.2.2 Multicast Network addresses
As shown in section III.1.4, the LAR sublayer must be able to send a single packet to a set of destinations over a multicast environment.
Consequently, multicast addresses are needed at Network level. That means that the conversion of a LAR destination address into Network addresses must supply a single multicast Network address, instead of a set of unicast Network addresses that would lead to redundancy. However, in contrast to current IP multicast addresses, it would be better to have hierarchical Network multicast addresses. We can observe indeed that in the current IP addressing scheme, unicast and broadcast addresses are hierarchical. Moreover one may send a broadcast in a (sub)network, even a distant one: this is sometimes called a directed broadcast. Current flat IP multicast addresses do not allow such a mechanism. Therefore, in the case of a (sub)network covered by an implicit diffusion medium, a hierarchical multicast Network addressing scheme would make it possible:
• to limit the spread of diffusion within a (sub)network,
• to avoid unnecessary duplications at the Link level,
• to provide a directed multicast mechanism.
Note that the multicast semantics of such addresses need not be known outside of the identified network, since only the network's prefix is analyzed outside of it. We may observe that such multicast addressing at Network level would only concern a localized set of machines sharing a common diffusion medium: that is perfectly suited to the geographic semantics of Network addresses.
Remark: To use this type of address, we need a way to dynamically allocate multicast Network addresses within a LAN. [Hay93] describes such a mechanism for the dynamic allocation of MAC addresses.

III.3
Switching
For a given group, only the duplicating nodes and/or members maintain an entry in their LAR sublayer routing table. Each logical address matches a tree neighbor descriptor list:
@LD1 → (A10, A11, A12, A13, A14)
@LD2 → (A20, A21, A22)
...... → ...............
Each descriptor Aij indicates a Network address value, whether this address is unicast or multicast, and whether it identifies a passive member (receiver only), an external member (sender only) or a general group member (sender and receiver). Each address value can belong to one of three types:
(i) the reserved indicator @ME, if the node itself is a member,
(ii) a unicast Network address identifying one neighbor in the tree,
(iii) a multicast Network address for a multicast environment.
For instance, in a node which is only a member (a tree leaf), the list is composed of two addresses: the indicator @ME and the Network address of its tree attachment point (see E in Fig. 4).
[Figure 4: example network; labels A, B, C, D, E, F and M, where @NM is the multicast Network address of environment M]
in B: @LD → (@NA (ii), @NM (iii), @NC (ii))
in C: @LD → (@NB (ii), @ME (i), @ND (ii))
in E: @LD → (@ME (i), @NM (iii))
Fig. 4: LAR Routing Table entries for group G

On a tree node, the switching process has to forward a datagram copy along each contiguous edge except the incoming one. To do this, all multicast algorithms analyze destination and source addresses. The LAR switching algorithm has to distinguish a unicast from a multicast destination Network address to identify the incoming edge (see switching algorithm below):

Let {@LD} be the address list of the group identified by @LD
On reception of ( @NS, @ND, @LS, @LD, msg )
    IF @ND is a unicast address
    THEN  // sender is a neighbour in the tree
          // sender source address @NS is in {@LD}
        IncomingEdge = @NS
    ELSE  // received from a contiguous multicast environment
          // multicast destination address @ND is in {@LD}
        IncomingEdge = @ND
    ENDIF
    RecipientList = { {@LD} \ IncomingEdge }
    FOR each @NDest in RecipientList  // sending copy or delivering to upper layer
    DO
        IF @NDest is @ME
        THEN  // to upper layer (member)
            Delivering of ( @LS, @LD, msg ) to upper layer
        ELSE  // towards Network layer (duplication)
            Sending of ( My@N, @NDest, @LS, @LD, msg )
        ENDIF
    DONE
For instance, this algorithm works as follows with the network of Fig. 4:

On reception of ( @NA, @NB, @LS, @LD, msg ) in B
    @NB is unicast
    IncomingEdge = @NA
    RecipientList = {@LD} \ @NA = { @NM, @NC }
    → Sending of ( @NB, @NM, @LS, @LD, msg )
    → Sending of ( @NB, @NC, @LS, @LD, msg )

On reception of ( @NB, @NC, @LS, @LD, msg ) in C
    @NC is unicast
    IncomingEdge = @NB
    RecipientList = {@LD} \ @NB = { @ME, @ND }
    → Delivering of ( @LS, @LD, msg ) to the upper layer
    → Sending of ( @NC, @ND, @LS, @LD, msg )

On reception of ( @NB, @NM, @LS, @LD, msg ) in E
    @NM is multicast
    IncomingEdge = @NM
    RecipientList = {@LD} \ @NM = { @ME }
    → Delivering of ( @LS, @LD, msg ) to the upper layer
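The switching algorithm and the three traces above can be reproduced with a short sketch. It is only an illustration under assumptions: addresses are plain strings, the routing table is a dictionary from a LAR group address to its descriptor list, and the unicast or multicast nature of a Network address (which the Network service interface is expected to indicate) is modelled by a set of known multicast addresses. The function name `switch` and the `("deliver", ...)` / `("send", ...)` action tuples are hypothetical.

```python
ME = "@ME"  # reserved indicator: the node itself is a member

def switch(table, my_addr, multicast_addrs, pdu):
    """Return the actions taken by a LAR node for one received PDU.

    table           -- dict: LAR group address -> list of neighbour descriptors
    my_addr         -- Network address of this node (My@N in the pseudocode)
    multicast_addrs -- assumed set of Network addresses known to be multicast
    pdu             -- tuple (@NS, @ND, @LS, @LD, msg)
    """
    ns, nd, ls, ld, msg = pdu
    # Incoming edge: the multicast environment if @ND is multicast,
    # otherwise the tree neighbour that sent the PDU.
    incoming = nd if nd in multicast_addrs else ns
    actions = []
    for dest in table[ld]:
        if dest == incoming:
            continue  # never send back along the incoming edge
        if dest == ME:
            actions.append(("deliver", (ls, ld, msg)))
        else:
            actions.append(("send", (my_addr, dest, ls, ld, msg)))
    return actions

# Routing table entries of Fig. 4 for nodes B, C and E.
MULTICAST = {"@NM"}
TABLE_B = {"@LD": ["@NA", "@NM", "@NC"]}
TABLE_C = {"@LD": ["@NB", ME, "@ND"]}
TABLE_E = {"@LD": [ME, "@NM"]}

at_b = switch(TABLE_B, "@NB", MULTICAST, ("@NA", "@NB", "@LS", "@LD", "msg"))
at_c = switch(TABLE_C, "@NC", MULTICAST, ("@NB", "@NC", "@LS", "@LD", "msg"))
at_e = switch(TABLE_E, "@NE", MULTICAST, ("@NB", "@NM", "@LS", "@LD", "msg"))
```

Returning a list of actions instead of performing I/O keeps the sketch testable; a real LAR entity would hand each "send" tuple to the Network layer and each "deliver" tuple to the upper layer.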
III.4 Routing and tree evolution
Nodes which will undergo a modification are successively determined according to the evolution of group membership. A new member needs one more duplication in the multicast tree. The member insertion can be done in four different ways (see Fig. 5):
• adjunction of an edge contiguous to an existing node (A),
• adjunction of a duplicating node of degree 3 by splitting an existing edge into two parts (B),
• adjunction of three nodes contiguous to a multicast environment, which then becomes duplicating (C),
• without any modification if the new member is connected to a multicast environment already duplicating in the tree (D) (or if the new member is already a duplicating node).
[Figure 5 legend: nodes added to LAR tree; nodes belonging initially to LAR tree; nodes not belonging to LAR tree. The insertion cases are labelled A, B, C and D.]
Fig. 5: Adjunction of A, B, C, D to a LAR Tree.
A member disconnection yields a symmetrical modification of the tree.
III.4.1 Tree construction
We are currently working on a simple incremental algorithm based on the center of the tree. Recall that the eccentricity of a node is its distance to the furthest node, and the center is the node with minimal eccentricity. The algorithm inserts (according to one of the four situations of Fig. 5) new edges and/or duplicating LAR nodes at the point where the route from the center to the new node departs from a tree node or edge. After each modification, the new center can be recomputed in a distributed manner.

III.4.2 Robustness
Failures should not appear on tree edges, because intermediate node or link failures are handled by the Network layer. We could use a well known « keep alive » mechanism to detect member or duplicating node failures, as in EGP [RFC904]. A mechanism common to all groups could establish a set of neighbors and send a Hello message to each of them. An edge shared by many LAR trees would need only one Hello message. When a neighbor fails, all trees containing this node should be updated.

III.4.3 Failure and tree reconstruction
In case of failure of a member without a duplication function, the neighbor node should delete the tree edge which connects it to the member. It should also inform the manager to ensure the deallocation of the member's logical address (see III.2.1.4). A node detecting a duplicating node failure would notify the manager, who could either respond by giving the address of another neighbor, or ask for tree reconstruction.
Remark: Following such a failure, the manager could be flooded by notifications. Nevertheless, the number of these messages cannot exceed the node degree, which should be limited by the number of node interfaces.
The incremental evolution of the tree requires periodic monitoring of the tree weight and diameter by the manager. The manager can ask for a complete tree reconstruction, which could be performed for instance by computing the shortest-path tree rooted at the tree center.
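The definitions of eccentricity and center recalled in III.4.1 can be made concrete with a small centralized sketch. It is not the distributed recomputation mentioned there: it assumes the whole tree is known as an adjacency dictionary and that distance is measured in hops, whereas a real LAR tree would more likely use Network-level distances. The function names are hypothetical.

```python
from collections import deque

def eccentricity(adj, start):
    """Hop distance from `start` to the furthest node (breadth-first search)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

def center(adj):
    """Node with minimal eccentricity; ties are broken by node name."""
    return min(sorted(adj), key=lambda node: eccentricity(adj, node))

# A path A - B - C - D - E: the middle node C is the center.
path = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
root = center(path)
```

The same routine gives the root that a complete reconstruction, as a shortest-path tree rooted at the center, would start from.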
Remark 1: Case of non-LAR nodes: During tree construction, a LAR node has to reach the next LAR node on the route to a given node. This address cannot be supplied by the Network layer when non-LAR nodes are present. Each LAR node could solve this problem in two different ways: statically, by keeping the addresses of all its LAR neighbors, or dynamically, by successively querying the intermediate nodes on the route to some node. An address list of these nodes should be supplied by the Network layer.
Remark 2: Problems related to a multicast environment: In a multicast environment, all contiguous LAR nodes should cooperate. A designated node will be in charge of maintaining, for each group, a list of members and duplicating nodes [Dee91]. This information will be used in case of node failure.
III.5 Network service interface and primitives
The Network service interface should indicate whether an address is unicast or multicast. It should also deliver the source and destination Network addresses (needed for switching) with each LAR message. Finally, it should indicate whether two nodes are in the same multicast environment.
The Network service should provide:
• a reserved Network address AllLarNodes, which reaches all LAR nodes, including those not activated for a given group,
• a Network multicast address dynamically associated with each LAR address in a multicast environment (see III.2.2.2),
• the Network address of the next hop towards a given node and the distance to any node; this information allows tree construction (see III.4.1),
• a route (address list) towards any node, needed only when non-LAR nodes are present.
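The primitives listed above can be summarized as an abstract interface. The Python sketch below is one possible reading, not a specification: all class and method names are invented for illustration, addresses are plain strings, and the table-backed implementation exists only to make the example runnable.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List

ALL_LAR_NODES = "AllLarNodes"   # reserved address named in the text; the string value is a placeholder


@dataclass(frozen=True)
class Indication:
    """A LAR message as handed up by the Network layer, with both Network addresses."""
    source: str
    destination: str
    payload: bytes


class NetworkService(ABC):
    """Primitives the LAR sublayer expects from the Network layer (hypothetical names)."""

    @abstractmethod
    def is_multicast(self, address: str) -> bool: ...

    @abstractmethod
    def same_multicast_environment(self, a: str, b: str) -> bool: ...

    @abstractmethod
    def multicast_address_for(self, lar_address: str) -> str: ...

    @abstractmethod
    def next_hop(self, destination: str) -> str: ...

    @abstractmethod
    def distance(self, destination: str) -> int: ...

    @abstractmethod
    def route(self, destination: str) -> List[str]: ...


class TableNetworkService(NetworkService):
    """Toy implementation backed by static tables, for illustration only."""

    def __init__(self, routes: Dict[str, List[str]],
                 environment: Dict[str, str],
                 multicast_map: Dict[str, str]):
        self._routes = routes                # destination -> hops in order, destination last
        self._environment = environment      # node -> identifier of its multicast environment
        self._multicast_map = multicast_map  # LAR address -> Network multicast address

    def is_multicast(self, address):
        return address in self._multicast_map.values()

    def same_multicast_environment(self, a, b):
        env = self._environment.get(a)
        return env is not None and env == self._environment.get(b)

    def multicast_address_for(self, lar_address):
        return self._multicast_map[lar_address]

    def next_hop(self, destination):
        return self._routes[destination][0]

    def distance(self, destination):
        return len(self._routes[destination])   # hop count used as the distance here

    def route(self, destination):
        return list(self._routes[destination])


service = TableNetworkService(
    routes={"h9": ["r1", "r2", "h9"]},
    environment={"h1": "lan1", "h2": "lan1", "h9": "lan2"},
    multicast_map={"G": "m1"},
)
print(service.next_hop("h9"), service.distance("h9"))
print(service.same_multicast_environment("h1", "h2"))
```

Using the hop count as the distance is a choice made for this example; any metric maintained by the unicast routing protocol could play that role.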
IV Benefits and problems with a two-layered architecture
Our proposal can be summarized in two main points: create a specific sublayer to handle Logical Addressing and Routing, and use this sublayer only where necessary.
IV.1 Separation into two sublayers
We have seen that multicast addresses are very different in nature from unicast ones. Our architecture separates these two types of objects and leaves the Network layer almost unchanged. The main goal of the Network layer remains the computation of good routes towards basically static addresses, in the presence of potential link failures. This sublayer remains independent of users and applications. The main goal of the LAR sublayer is to dynamically associate group members with dynamically allocated group addresses, while enforcing group membership policies and computing a good multicast tree. The independence of these two (sub)layers should allow the corresponding routing protocols to be independent, multicast routing being computed on top of unicast routing. In this respect, the architecture can be seen as a generalization of tunneling.
We believe that the LAR sublayer will be useful not only for multicasting, but whenever logical addresses are necessary. We can mention a few such cases:
• communication with mobiles: for example, the VIP protocol [TUS94] uses a separate virtual addressing sublayer,
• anycasting [RFC1546], where the problem is to reach a service provider using a generic logical address,
• conversion when different addressing schemes or protocols are used at the internetwork layer.
This last problem will be increasingly important with the future coexistence of IP, IPng and CLNP. The LAR sublayer would make it possible to build LAR-level routers, used in the same way as Network layer routers are today, which translate addresses and headers between heterogeneous Network layers. This means that the LAR sublayer could be used in normal point-to-point communication. This can be seen as a drawback, since it calls for more header overhead. However, it has already been suggested that the Transport layer should be modified to include Universal Transport Addresses [RFC1705]. The LAR sublayer would be one way to implement these addresses.
IV.2 Presence of the LAR sublayer only where necessary
This feature has at least three benefits, in terms of resources, control and applicability.
IV.2.1 Resources
For a given group, the LAR sublayer needs to be activated only at duplication points, saving resources at intermediate points where no duplication takes place. This saving depends on the group structure. We can see our LAR tree as a tree whose edges are in fact point-to-point routes, and whose nodes are each either a single node or a (Network layer) multicast cloud. The benefits are readily apparent if the average length of the point-to-point routes is greater than one. To illustrate this point, we constructed a classical shortest-path tree interconnecting the maildrops of 19 persons working on the same project (in fact, members of the editorial board of a journal). This group is representative of what a group supporting cooperative work could be. We got the following results: 19 nodes for members (leaves), 86 internal nodes with degree 2 (intermediate nodes), and 13 internal nodes with degree at least 3 (duplicating nodes). Therefore, among the 99 internal nodes, 86 (about 87%) could get rid of routing information concerning this group. Of course this is only one example, and we plan to make more systematic measurements on this type of group, which may become more and more common.
IV.2.2 Control
Since duplication points are potential security holes, it is better to have as few of them as possible. In our proposal, one could say that it is the tree that goes towards the new member, not the other way around, because an external node should not by itself modify the routing tables of a LAR tree node. Moreover, the tree construction algorithm could allow duplication points only in routing domains known for their level of security. As a special case, duplication could be allowed only in domains containing members. Note that control must also be considered from the point of view of transit routing domain policies. One could imagine that some routing domain would not allow duplication points for some groups even if point-to-point communication between some of their members is allowed. To see this, consider the following situation:
- group G consists of members A, B and C,
- routing domain D allows transit routes from A to B but not from A to C.
If a duplication point for G is installed in D, then communications from A to C could be sent using G, and thus go through D, in violation of the unicast routing policy of D.
IV.2.3 Applicability
Since a general multicast protocol is not yet available, it will be a long time before all routers implement a common Network layer multicast protocol. This is the problem currently solved by tunneling in the MBone. Things will get even worse, since at the same time the unicast Network layer will change with the deployment of IPng. It seems easier to separate the evolution of the unicast and multicast problems at the Internet layer. Moreover, as already mentioned, the LAR sublayer need not be implemented on all routers, and may play a conversion role, making the transition phase simpler.
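The resource argument of IV.2.1 rests on classifying the nodes of a multicast tree by their degree. The following Python sketch shows that classification on a small made-up tree and then applies the ratio to the counts reported in IV.2.1 (86 intermediate and 13 duplicating nodes). The function names and the toy topology are assumptions; the original 19-member tree is not reproduced here.

```python
from collections import Counter


def classify_tree_nodes(edges, members):
    """Split the nodes of a tree into duplicating and intermediate nodes.

    edges   -- iterable of (node, node) pairs describing the tree
    members -- group members, which always need group state
    Returns (duplicating, intermediate): nodes of degree >= 3, and
    non-member nodes of degree exactly 2.
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    members = set(members)
    duplicating = {n for n, d in degree.items() if d >= 3}
    intermediate = {n for n, d in degree.items() if d == 2 and n not in members}
    return duplicating, intermediate


def saving(n_intermediate, n_duplicating):
    """Fraction of internal nodes that need no per-group LAR state."""
    return n_intermediate / (n_intermediate + n_duplicating)


# Toy tree: members A, B, C joined through routers r1, r2, r3 and one branch point X.
edges = [("A", "r1"), ("r1", "r2"), ("r2", "X"), ("X", "r3"), ("r3", "B"), ("X", "C")]
duplicating, intermediate = classify_tree_nodes(edges, {"A", "B", "C"})
print(sorted(duplicating))    # ['X']
print(sorted(intermediate))   # ['r1', 'r2', 'r3']
print(saving(len(intermediate), len(duplicating)))   # 0.75

# Counts reported in IV.2.1.
print(f"{saving(86, 13):.1%}")   # 86.9%
```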
IV.3 Remaining problems and perspectives
The main problems we are now considering are:
• incremental construction of a good multicast tree using only information given by the Network layer,
• good recovery algorithms in case of a LAR node failure,
• a dynamic directory for the allocation of logical addresses and of local multicast Network addresses inside a domain,
• recovery in case of failure of the group manager.
More generally, a study of group structure and dynamics is also necessary.
REFERENCES
BB91
Studies in Computer Supported Cooperative Work: Theory, Practice and Design, Eds. J.M. Bowers and S.D. Benford, Elsevier, 1991.
BFC93
T. Ballardie, P. Francis, J. Crowcroft, Core Based Trees (CBT): An Architecture for Scalable Inter-Domain Multicast Routing, SIGCOMM'93, Ithaca, N.Y., USA, Sept. 93.
Chi93
B. Chinoy, Dynamics of Internet Routing Information, SIGCOMM'93, Ithaca, N.Y., USA, Sept. 93, p. 45-52.
Dee91
S. Deering, Multicast Routing in a Datagram Internetwork, PhD thesis, Stanford University, California, U.S.A., 1991.
DEF94
S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C.G. Liu, L. Wei, Protocol Independent Multicast (PIM): Motivation and Architecture, Internet Draft, draft-ietf-idmr-pim-arch-01.ps, Oct. 94, 19 p.
Eri94
H. Eriksson, MBone: The Multicast Backbone, CACM, Vol 37 No 8, Aug. 94.
Hay93
C. P. Hayden, LAN Addressing for Digital Video Data, Digital Technical Journal, Vol 5 No 2, Spring 93.
PR90
A. Patel, V. Ryan, Introduction to Names, Addresses and Routes in an OSI Environment, Computer Communications, Vol 13 No 1, Feb. 90.
RFC791
J. Postel, Internet Protocol, Sept. 81, 45 p.
RFC904
D. Mills, Exterior Gateway Protocol Formal Specification, Apr. 84, 30 p.
RFC1034
P. Mockapetris, Domain names - concepts and facilities, Nov. 87, 55 p.
RFC1075
D. Waitzman, C. Partridge, S. Deering, Distance Vector Multicast Routing Protocol, Nov. 88, 24 p.
RFC1112
S. Deering, Host extensions for IP multicasting, Aug. 89, 17 p.
RFC1519
V. Fuller, T. Li, J. Yu, K. Varadhan, Classless Inter-Domain Routing (CIDR): an Address Assignment and Aggregation Strategy, Sept. 93, 24 p.
RFC1546
C. Partridge, T. Mendez, W. Milliken, Host Anycasting Service, Nov. 93, 9 p.
RFC1584
J. Moy, Multicast Extensions to OSPF, Mar. 94, 102 p.
RFC1705
R. Carlson, D. Ficarella, Six Virtual Inches to the Left: The Problem with IPng, Oct. 94, 27 p.
TUS94
F. Teraoka, K. Uehara, H. Sunahara, J. Murai, VIP: A Protocol Providing Host Mobility, CACM, Vol 37 No 8, Aug. 94, p. 67-75.