PHOTONIC BURST SWITCHING (PBS) ARCHITECTURE FOR HOP AND SPAN-CONSTRAINED OPTICAL NETWORKS
Shlomo Ovadia, Christian Maciocco, and Mario Paniccia, Intel Corporation
Ramesh Rajaduray, University of California Santa Barbara

ABSTRACT
A new architecture called photonic burst switching (PBS), with variable time slot provisioning, supporting high-speed bursty data transmission within hop and span-constrained networks, is presented. First, the edge and switching node architectures for in-band and out-of-band signaling are defined, together with the optical switch fabric performance parameters. Second, the edge router regions of operation for burst assembly are studied via simulations. Third, we introduce:
• A GMPLS-based PBS software architecture, in terms of control and data plane operations, that is extended to enable the PBS optical interfaces, with the software building blocks for edge and switching nodes
• An adaptive PBS MAC layer functionality and framing of multiple generic payloads
The integration of the proposed PBS network architecture with low-cost optical switching fabrics and a GMPLS-based software architecture should provide the means for robust and efficient optical transport of bandwidth-demanding applications within enterprise networks.

INTRODUCTION
The optical burst switching (OBS) scheme is emerging as a promising solution to support high-speed bursty data traffic over wavelength-division multiplexed (WDM) optical networks [1–4]. The OBS scheme offers a practical middle ground between current optical circuit switching and the emerging all-optical packet switching technologies. It has been shown that, under certain conditions, the OBS scheme achieves high bandwidth utilization and class of service (CoS) by eliminating the electronic bottlenecks that result from optical-electrical-optical (OEO) conversion at switching nodes, and by using a one-way end-to-end bandwidth reservation scheme with variable time slot duration provisioning scheduled by the ingress nodes. Optical switching fabrics are attractive because they offer at least one or more orders of magnitude lower power consumption with a smaller form factor than comparable OEO switches. However, most of the recently published work on OBS networks focuses on next-generation backbone data networks (i.e., metro or Internet-wide networks) using high-capacity (i.e., 1 Tb/s) WDM switch fabrics with a large number of input/output ports (32 × 32 or more), optical channels (eight wavelengths or more), and requiring extensive buffering [1–4]. These WDM switches tend to be complex, bulky, and very expensive to manufacture. Beyond the backbone data networks, there is a growing demand to support a different class of bandwidth-demanding applications, such as storage area networks (SANs) and multimedia multicasting, at low cost for both local and wide area networks. To realize optical switching in enterprise networks, a flexible network architecture with affordable hardware implementation, operation, maintenance, and management is required. Consequently, in this work we propose to adapt the OBS scheme to future high-speed optical enterprise networks with limited span and number of hops. These networks are based on fast (< 100 ns) optical switch fabrics with a limited number of input/output ports (i.e., ≈ 8 × 8, 12 × 12) and with no or limited optical buffering. Preliminary analysis indicates that such optical switch fabrics can be developed using complementary metal oxide semiconductor (CMOS)-compatible technology, where the cost per switched bit per second is expected to be at least tenfold lower than that of conventional high-capacity WDM switches. Although conceptually similar to backbone-based OBS networks, the design, operating constraints, and performance requirements of these high-speed hop and span-constrained enterprise networks are different. Thus, in this article we refer to these optical enterprise networks as photonic burst-switched (PBS) networks to distinguish them from conventional OBS networks.

The article is organized as follows. First, we introduce the PBS architecture for hop and span-constrained networks, including the PBS edge and switching node architectures. Second, the edge router regions of operation for burst assembly are studied via simulations. In addition, a set of ideal performance requirements for the optical switching fabric is defined. Then we discuss the GMPLS-based PBS software architecture for both ingress/egress nodes and switching nodes to support bursty traffic. This is followed by a discussion of the PBS medium access control (MAC) layer functionality and framing to transport multiple generic data payloads.

FIGURE 1. Photonic burst-switched network architecture.

PHOTONIC BURST SWITCHING ARCHITECTURE
We propose to segment an enterprise network into small islands of high-performance, mesh-architecture PBS networks with peer-to-peer signaling, where network performance is balanced against implementation complexity. Such network segmentation into PBS islands considerably simplifies the design, implementation, operation, and management of the optical switching nodes, resulting in reduced overall PBS network costs. Figure 1 shows the proposed PBS network architecture, where an edge device, which can be either a label edge router (LER) or a multiserver system located at the edge node, is optically connected through a PBS interface, via PBS switches, to other network devices such as other server systems, LERs, and storage arrays, which are also equipped with PBS interfaces. Each PBS switching node can be connected to multiple edge devices. PBS-to-PBS network connectivity is provided through an edge-node LER using either a conventional interface or a PBS interface (PBS-to-PBS network routing is discussed later). Specifically, PBS networks have a limited physical span, typically less than 10 km, and a limited number of optical channels, typically fewer than eight, from ingress to egress nodes. The limited span plays to the advantage of the PBS network, since it reduces the guard band uncertainty due to propagation and processing time delays [3]. PBS networks are also hop-constrained due to the limited optical power budget of a lower-cost network implementation. Although the maximum size of a PBS network is still under investigation, analysis indicates that a typical PBS network has about five to 15 switching nodes with about three to four hops along a given optical label-switched path (OLSP). The provisioning, administration, maintenance, and operation costs of these networks are essential to the adoption of the PBS architecture. The rest of this section focuses on the edge and switching node architectures; edge node simulation results are presented later [5].

EDGE NODE ARCHITECTURE
Figure 2 shows, for example, the architecture of a high-speed optical input/output interface within a modular reconfigurable multiserver system located at the edge node. Internal data communication between each server card and the PBS interface, as well as among the different server cards, is done using the electrical backplane fabric of the multiserver blade system. At the ingress node, the server system receives multiple data flows from local or wide area networks (LANs/WANs) via its electrical 1 Gb/s Ethernet (GbE) interfaces. It classifies these flows and statistically multiplexes them to form photonic control and data bursts. A data burst is a collection of IP packets and/or Ethernet frames with the same classification, such as the same destination address, quality of service (QoS) parameters, and transmission time window.

FIGURE 2. Multiserver blade system architecture with a block diagram of the PBS network interface card.

To enable low-cost PBS networking, for example, a modified 10 GbE interface card at the ingress/egress node can be used. The modified 10 GbE card consists of dual high-end network processors (NPs), one for the outgoing bursts and one for the incoming bursts, onboard memory, a physical layer framer, and a 10 GbE optical transceiver. The onboard DRAM cluster allows one to temporarily buffer multiple scheduled data bursts in case of data burst loss and retransmit requests. The burst assembly and framing, burst scheduling, and control, which are part of the PBS MAC layer, and related tasks are performed by the NPs.

NPs are very powerful processors with flexible micro-architectures that are suitable to support a wide range of packet processing tasks, including classification, metering, policing, congestion avoidance, and traffic scheduling. For example, the Intel IXP2800 NP, which has 16 microengines, can support the execution of up to 1493 microengine instructions per packet at a packet rate of 15 million packets/s for 10 GbE and a clock rate of 1.4 GHz [6].

EDGE NODE SIMULATIONS
In span-constrained networks, the LER plays a critical role in determining network performance, since the LER total latency (which includes the aggregation and scheduling latencies) dominates over the propagation delays throughout the network. Consequently, before the PBS network performance can be analyzed, one needs to understand the operation of the LER. Specifically, the impact of the burst assembly process at the LER on the burst size and latency was investigated via simulations [5]. The LER model consisted of a sorting router (SR) stage at the input, a burstifier (stage 2), a burst scheduling and transmission section (stage 3), and an optical output transmission section (stage 4). The input traffic to the LER consisted of 10 variable-length random IP streams, where each stream was operating at 1 Gb/s. Figure 3 shows the cumulative packet length probability distribution obtained from traffic measurements from an Intel data center [7]. The mean packet length was 608.1 bytes with a standard deviation of 71.325 bytes. In addition, the interpacket timing gap process was assumed to be exponentially distributed with a mean interpacket gap of 4.865 µs. The SR had seven output IP queues, and its scheduler used the iSLIP algorithm, where the output data rate was sped up to 10 Gb/s [8].

FIGURE 3. The measured cumulative packet length probability distribution function from an Intel data center.

Two burst assembly algorithms were considered. In algorithm 1, the burst is assembled between a minimum burst size (Bmin) and a time window of T ms normalized to the maximum amount of receivable data (Dmax) in time T. If the assembled burst is larger than Bmin and smaller than Dmax, the burst is sent to the burst queue. Algorithm 2 is similar to algorithm 1, except that if the burst length exceeds Bmax < Dmax, the burst is sent to the next stage. The last packet can either be kept with this burst, as was done in this article, or used to start a new burst. In stage 3 of the LER, the oldest-burst-first scheduling algorithm is used, where transmission priority is given to the oldest burst among contending bursts. A sketch of the two burst assembly policies is given at the end of this section. Let us define Fmin = Bmin/Dmax and Fmax = Bmax/Dmax. Figures 4a and 4b show the normalized LER average burst size and total latency vs. the input channel utilization (ICU). Three regions of LER operation can be identified. For ICU/7 < Fmin (region 1), no burst is assembled, resulting in a large total LER latency. For Fmin < ICU/7 < Fmax (region 2), the burst size increases with increased ICU, and approaches Fmax at high ICU values. The total LER latency is reduced due to the reduction in the average waiting time of a packet in the IP queue until the burst is assembled. In region 3 (ICU/7 > Fmax), the burst size is clamped to Fmax and the normalized LER total latency approaches one. From the LER simulation results, it is clear that the total LER latency is significantly reduced by selecting a smaller Fmin (< 0.1). PBS network simulations using the defined LER model with self-similar traffic are currently underway, quantifying its performance according to various metrics such as end-to-end latency, burst loss probability, and bandwidth utilization, and are planned to be published later.

FIGURE 4. The normalized LER average burst size (a) and total latency (b) vs. the input channel utilization (ICU), for various combinations of Fmin and Fmax.
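The two burst assembly policies above can be made concrete with a short model. The following Python sketch is illustrative only (it is not the simulator used in [5]); the timer handling when a window closes below Bmin, and the parameter values in the usage lines, are assumptions.

```python
# Minimal sketch of the two burst assembly algorithms described above.
# Algorithm 1: at the end of each window T, emit the burst if Bmin <= size <= Dmax.
# Algorithm 2: additionally emit early as soon as the burst size exceeds Bmax (< Dmax),
# keeping the last packet with the burst, as in the article.

class BurstAssembler:
    def __init__(self, b_min, d_max, window_t, b_max=None):
        self.b_min = b_min            # minimum burst size (bytes)
        self.d_max = d_max            # maximum receivable data in window T (bytes)
        self.window_t = window_t      # assembly time window T (seconds)
        self.b_max = b_max            # algorithm 2 early-send threshold, or None
        self.packets, self.size, self.window_start = [], 0, None

    def add_packet(self, length, now):
        """Add one packet; return the assembled burst (list of lengths) or None."""
        if self.window_start is None:
            self.window_start = now
        self.packets.append(length)
        self.size += length

        # Algorithm 2: early send once the burst exceeds Bmax.
        if self.b_max is not None and self.size > self.b_max:
            return self._emit()

        # End of window T: send if the burst lies between Bmin and Dmax; otherwise
        # keep accumulating into the next window (an assumption, not specified above).
        if now - self.window_start >= self.window_t:
            if self.b_min <= self.size <= self.d_max:
                return self._emit()
            self.window_start = now
        return None

    def _emit(self):
        burst, self.packets, self.size, self.window_start = self.packets, [], 0, None
        return burst


# Example operating point: Fmin = 0.025, Fmax = 0.1, with T = 1 ms at a 10 Gb/s
# output rate, so Dmax = 1.25 MB (all values illustrative).
D_MAX = 1_250_000
assembler = BurstAssembler(b_min=0.025 * D_MAX, d_max=D_MAX, window_t=1e-3,
                           b_max=0.1 * D_MAX)
```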

SWITCHING NODE ARCHITECTURE
A simplified block diagram of the PBS switching node architecture is shown in Fig. 5. The intelligent node consists of a strictly nonblocking optical switch fabric, an NP, glue logic, optical multiplexers/demultiplexers, and optical transceivers. The optical switch fabric has a strictly nonblocking space-division architecture with fast (< 100 ns) switching times and a limited number of input/output ports (i.e., ≈ 8 × 8, 12 × 12). Each of the incoming or outgoing fiber links typically carries only one data burst wavelength.


The switch fabric, which has no or limited optical buffering, performs statistical burst switching within a variable-duration time slot between the input and output ports. If needed, optical switch buffering can be implemented using fiber delay lines (FDLs) on several unused ports. The specific optical buffering architecture, such as feed-forward or feedback, needed for the PBS switch fabric is still under investigation [9]. However, the amount of optical buffering is expected to be relatively small compared to a conventional packet switching fabric, since these FDLs can carry multiple data burst wavelengths. Another possible contention resolution scheme that is relatively simple to implement is deflection routing. The PBS network can operate with a relatively small number of control wavelengths (λ'0, λ0), since they can be shared among many data wavelengths. Furthermore, the PBS switch fabric can also operate with a single wavelength using multiple fibers, but this case will not be discussed here.

FIGURE 5. PBS switching node architecture.

The control bursts can be sent either in band (IB) or out of band (OOB) on separate optical channels (Fig. 5). For the OOB case, the optical data bursts are statistically switched at a given wavelength between the input and output ports within a variable time duration by the PBS fabric, based on the reserved switch configuration set dynamically by the NP in the control interface unit. The NP is responsible for extracting the routing information from the incoming control bursts, providing fixed-duration reservation of the PBS switch resources for the requested data bursts, and forming the new outgoing control bursts for the next PBS switching node on the path to the egress node. In addition, the NP provides overall PBS network management functionality based on an extended GMPLS framework. For the IB case, both the control and data bursts are transmitted to the PBS switch fabric and control interface unit. However, the NP ignores the incoming data bursts based on the burst payload header information. Similarly, the transmitted control bursts are ignored at the PBS fabric, since the switch configuration has not been reserved for them. The advantage of the IB approach is that it is simpler and costs less to implement, since it reduces the number of required wavelengths. However, it also leads to lower bandwidth utilization, since there are larger timing gaps between successive control bursts, which must be processed by the NP at each of the PBS switching nodes. Another approach to IB signaling is to use different modulation formats for the control bursts and data bursts. For example, the control bursts are non-return-to-zero (NRZ) modulated while the data bursts are return-to-zero (RZ) modulated. Thus, only the NRZ control bursts are demodulated at the receiver in the PBS control interface unit, while the RZ data bursts are ignored [10]. The NRZ/RZ in-band control/data channel approach is more complicated to implement, since the data channel must be engineered for RZ data transmission, which includes dispersion management and monitoring of the RZ pulses. The specific OOB or IB control signaling scheme to be selected is application-dependent.

The optical switch fabric must meet a set of performance requirements to operate effectively within PBS networks. The key optical switch parameters include switch configuration, insertion loss, switching speed and latency, optical crosstalk, and polarization-dependent loss (PDL). The optical switch configuration must be strictly nonblocking and needs only a small number of input/output ports (i.e., ≈ 8 × 8) to limit PBS network complexity. The insertion loss of the optical switch fabric is also critical to network operation. For example, assuming 10 GbE interfaces at the edge nodes, the total available optical power budget is limited to about 20 dB using standard p-i-n receivers. Consequently, the maximum possible number of switching nodes along a given lightpath is limited by the total insertion loss at each switching node. This means, for example, that with 5 dB insertion loss at each switching node, a PBS network with up to four switching nodes along a given OLSP can be constructed. As previously discussed, when the number of possible switching nodes is reduced, the PBS network becomes simpler to operate and manage. However, the available bandwidth and network resources may be reduced due to suboptimal lightpath routing. From a practical point of view, the minimum number of hops within a PBS network is two, corresponding to about 10 dB optical switch insertion loss. Optical switching speed and latency are also critical to network operation. They must be sufficiently fast that the accumulated time delay for the transmitted data bursts with no contention (about 0.4 µs, assuming four switching nodes with 100 ns switching time) throughout the PBS network is a relatively small fraction of the control plane's processing delay and the end-to-end propagation delay (about 50 µs for a 10 km fiber link). The optical switch fabric must be carefully designed to limit the optical crosstalk at each switching node, since the accumulated crosstalk along the lightpath can lead to excessively high bit error rates at the egress nodes. Another key parameter is the optical switch fabric PDL. Most of the demonstrated guided-wave optical switches using LiNbO3 or semiconductors can only operate at either transverse-electric (TE) or transverse-magnetic (TM) polarization, or exhibit a large amount of PDL, which requires the use of polarization-maintaining fibers [11, 12]. This optical switch characteristic is undesirable and may limit the practical use of these guided-wave switches. Finally, the optical switch must be designed to be fully integrated with high-speed logic and to operate over a wide temperature range (0–50°C) with relatively low power consumption even at high switching rates. A promising technology for implementing such an optical switch fabric is based on the optical beam steering method, where silicon-based optical phase shifters are used to modulate the optical mode in the silicon waveguide. Fast refractive index modulation in silicon can be achieved through the free carrier plasma dispersion effect. Such free carrier optical effects have been experimentally shown to provide an efficient refractive index modulation capable of use in these photonic devices [13]. The silicon-based optical switch fabric should enable integration with the necessary driver circuits, potentially all on the same silicon die, with low power consumption (≈ 10 W for an 8 × 8 switch configuration). Table 1 summarizes some of the ideal performance requirements for the optical switch fabric operating within a PBS switching node.
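The hop and delay budgets quoted above follow from simple arithmetic, reproduced below as a sanity check. The numbers (20 dB budget, 5 dB insertion loss per node, 100 ns switching time, 10 km span) are taken from the text; the ~5 µs/km fiber propagation delay and the helper names are assumptions for illustration.

```python
# Back-of-the-envelope check of the power and delay budgets discussed above.
# Input figures come from the text; ~5 us/km fiber delay is an assumed constant.

def max_switching_nodes(power_budget_db=20.0, insertion_loss_db=5.0):
    """Maximum switching nodes along a lightpath allowed by the optical power budget."""
    return int(power_budget_db // insertion_loss_db)

def accumulated_switching_delay(nodes=4, switching_time=100e-9):
    """Total switching delay (seconds) with no contention."""
    return nodes * switching_time

def propagation_delay(span_km=10.0, delay_per_km=5e-6):
    """End-to-end fiber propagation delay (seconds)."""
    return span_km * delay_per_km

print(max_switching_nodes())              # 4 nodes for a 20 dB budget at 5 dB/node
print(accumulated_switching_delay())      # 4e-07 s, i.e., ~0.4 us
print(propagation_delay())                # 5e-05 s, i.e., ~50 us for a 10 km link
```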

TABLE 1. Ideal performance requirements of the optical switch fabric at each PBS switching node.

Switch parameter              Requirement
Configuration                 ≈ 8 × 8
Switching speed               ≤ 100 ns
Latency                       ≤ 100 ns
Operating wavelength bands    1310 nm or 1550 nm
Total insertion loss          ≤ 10 dB
Optical crosstalk             < 30 dB (typical)
PDL                           ≈ 0 dB
Optical return loss           < 40 dB

GMPLS-BASED PBS SOFTWARE ARCHITECTURE
To enable PBS networking within enterprise networks, it is advantageous to extend the GMPLS-based protocol suite to recognize the proposed PBS optical interfaces at both ingress/egress nodes and switching nodes [14, 15]. Under the GMPLS framework, the PBS MAC layer is tailored to perform the different PBS operations while still incorporating the MPLS-based traffic engineering features and functions for control burst switching of coarse-grained (from seconds to days or longer) optical flows established using a reservation protocol and represented by a PBS label.

DATA PLANE OPERATION
Figure 6 shows an integrated data and control plane PBS software architecture with the key building blocks at the ingress/egress nodes. On the data path, packets from legacy interfaces (i.e., IP packets or Ethernet frames) are classified, based on n-tuple classification, into forwarding equivalence classes (FECs) at the ingress/egress node. Specifically, the adaptive PBS MAC layer at the ingress node typically performs data burst assembly and scheduling, control burst generation, and PBS logical framing, while deframing, defragmentation, and flow demultiplexing are performed at the egress node. The PBS MAC layer features are discussed later.
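To make the ingress-side data path concrete, the fragment below sketches how incoming packets might be grouped into FECs by an n-tuple key and steered into per-FEC assembly queues. The 3-tuple key and the dictionary-based queueing are illustrative assumptions, not the actual PBS MAC implementation.

```python
# Illustrative n-tuple classification of incoming packets into forwarding
# equivalence classes (FECs) ahead of burst assembly. The key fields and the
# per-FEC queue structure are assumptions for illustration only.

from collections import defaultdict

def fec_key(pkt):
    """FEC key: packets sharing destination, QoS class, and egress node are
    assembled into the same data burst."""
    return (pkt["dst_addr"], pkt["qos_class"], pkt["egress_node"])

fec_queues = defaultdict(list)        # one burst assembly queue per FEC

def classify(pkt):
    fec_queues[fec_key(pkt)].append(pkt)

classify({"dst_addr": "10.0.0.7", "qos_class": 2, "egress_node": "LER-B",
          "payload": b"\x00" * 512})
```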

FIGURE 6. PBS software architecture and building blocks at the ingress/egress nodes.

CONTROL PLANE OPERATION
Figure 7 illustrates the PBS control plane software architecture with the key building blocks at the switching nodes. The transmitted PBS control bursts, which are processed electronically by the PBS NP, undergo the following steps:

• The control burst is deframed and classified according to its priority, and the bandwidth reservation information is processed. If an optical flow has been signaled and established, this flow label is used to look up the relevant control information.
• The PBS switch configuration settings for the reserved bandwidth on the selected wavelength and network resources at a specific time are either confirmed or denied.
• PBS contention resolution is performed in case of a PBS switch configuration conflict. One of the two possible contention resolution schemes, namely FDL-based buffering and deflection routing, can be selected. If neither of these schemes is available, the incoming data bursts are dropped until the PBS switch becomes available, and a negative acknowledgment message is sent to the ingress node to request retransmission.
• A new control burst is generated based on the updated network resources from the resource manager and scheduled for transmission.
• The new control burst is framed and placed in the output queue for transmission to the next node.
A simplified sketch of this per-node processing, together with the PBS label format of Fig. 8, is given at the end of this subsection.

FIGURE 7. GMPLS-based control plane software architecture at a PBS switching node.

The key control plane software components interacting with the PBS network on a control channel are:

Label signaling: Coarse-grained lightpaths are signaled end to end and assigned a unique PBS label. The PBS label has only lightpath segment significance, not end-to-end significance. Figure 8 shows, for example, the PBS label format with its corresponding fields. The signaling of PBS labels for lightpath setup, teardown, and maintenance is done through an extension of the IETF Resource Reservation Protocol with traffic engineering extensions (RSVP-TE). The PBS label, identifying the data burst input fiber, wavelength, and lightpath segment, is used on the control path to make a soft reservation request of the network resources (through a RESV message). If the request is fulfilled (through the PATH message), each switching node along the selected lightpath commits the requested resources, and the lightpath is established with the appropriate segment-to-segment labels. Each switching node is responsible for updating the initial PBS label through the signaling mechanism, which indicates to the previous switching node the label for its lightpath segment. If the request cannot be fulfilled or an error occurs, a message describing the condition is sent back to the originator to take the appropriate action (i.e., to select other lightpath characteristics). Thus, the establishment of a PBS label through signaling enables an efficient MPLS-type lookup for control burst processing. This improvement in control burst processing at each switching node reduces the required offset time between the control and data bursts, resulting in improved PBS network throughput.

FIGURE 8. PBS label format with its corresponding fields: input fiber port, input wavelength, lightpath segment ID, and reserved bits.

Link management: This component is responsible for providing PBS network transport link status information such as link up/down, loss of light, and so on. The component runs its own link management protocol on the control channel. (The Internet Engineering Task Force (IETF) Link Management Protocol (LMP) is extended to support PBS interfaces.)

Link protection and restoration: This component is responsible for computing alternate optical paths among the various switching nodes, based on various user-defined criteria, when a link failure is reported by the link management component.

Routing: This component provides routing information to establish the route for control and data burst paths to their final destination. For PBS networks with bufferless switch fabrics, this component also plays an important role in making PBS networks a more reliable transport network by providing backup route information that is used to reduce contention. Each PBS network behaves like an autonomous system, employing an interior gateway protocol (IGP) such as a modified Open Shortest Path First (OSPF). PBS-to-PBS network routing within a larger enterprise network is done using a modified exterior gateway protocol (EGP) to determine the best available route to a particular PBS network when multiple lightpaths are available. The route selection by the EGP is done via the associated attributes of the specific PBS network.

Operation, administration, management, and provisioning (OAM&P): This component is responsible for performing various administrative tasks such as device provisioning.
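A rough Python sketch of the label handling and the per-node control burst steps listed above follows. The 8/32/16/8-bit field widths are read off the label diagram in Figure 8 and, like the resource manager and NACK interfaces, should be treated as illustrative assumptions rather than the published implementation.

```python
# Sketch of PBS label packing/unpacking and the per-node control burst steps above.
# Field widths (8-bit input fiber port, 32-bit input wavelength, 16-bit lightpath
# segment ID, 8-bit reserved) are inferred from Figure 8; the resource-manager and
# contention-resolution hooks are illustrative assumptions.

import struct

def pack_pbs_label(fiber_port, wavelength, segment_id, reserved=0):
    return struct.pack("!BIHB", fiber_port, wavelength, segment_id, reserved)

def unpack_pbs_label(label_bytes):
    fiber_port, wavelength, segment_id, reserved = struct.unpack("!BIHB", label_bytes)
    return {"fiber_port": fiber_port, "wavelength": wavelength,
            "segment_id": segment_id, "reserved": reserved}

def process_control_burst(burst, resource_mgr, fdl_available, deflection_route):
    """Simplified per-node handling of one incoming control burst."""
    # 1. Deframe/classify and look up the flow by its lightpath segment label.
    label = unpack_pbs_label(burst["label"])
    flow = resource_mgr.lookup(label["segment_id"])

    # 2. Confirm or deny the switch configuration for the requested reservation.
    if resource_mgr.reserve(label, burst["start_time"], burst["duration"]):
        action = "reserved"
    # 3. Contention resolution: FDL buffering, then deflection routing, else drop + NACK.
    elif fdl_available:
        action = "fdl_buffer"
    elif deflection_route is not None:
        action = "deflect"
    else:
        resource_mgr.send_nack(flow)       # ingress node will retransmit
        action = "drop"

    # 4-5. Generate, frame, and queue the updated control burst for the next node.
    next_label = pack_pbs_label(label["fiber_port"], label["wavelength"],
                                resource_mgr.next_segment_id(label), 0)
    return {"action": action, "label": next_label}
```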
PBS MAC LAYER FEATURES
The adaptive PBS MAC layer at the ingress nodes performs data burst assembly, scheduling, control burst generation, and PBS MAC layer framing. It should be pointed out that the adaptive burst assembler at the ingress nodes should be designed to guarantee acceptable limits on data burst end-to-end latency and throughput at various network traffic loads to satisfy different CoS. Preliminary PBS simulation results with self-similar traffic indicate that if both Fmin and Fmax are properly selected, the total LER latency is bounded without a significant change in throughput [5]. Thus, burst length prediction, the burst timeout period, and the need to dynamically monitor the lightpath bandwidth utilization at different traffic loads are key to efficient PBS network operation.

FIGURE 9. PBS MAC layer framing for control and data bursts. The generic burst header carries the version number (VN), payload type (PT: 0 = data, 1 = control), priority level (CP), in-band/out-of-band signaling flag (IB), label present flag (LP), and header HEC present flag (FH), together with the PBS burst length, PBS burst ID, and an optional generic burst header HEC; the control burst payload additionally carries the control and data channel wavelengths, the PBS label, and the data burst length, start time, TTL, priority, and destination address, followed by an optional payload FCS.

The PBS MAC scheduling component schedules the outgoing data bursts using a variety of known algorithms such as just-in-time and just-enough-time [2, 1]. However, these burst scheduling algorithms can further reduce the burst loss probabilities at the switching nodes in hop and span-constrained PBS networks if the network topology and the bandwidth utilization of different lightpath segments are taken into account. Consequently, from a practical point of view, an optimized PBS network design can be achieved by balancing network complexity and implementation and operating costs against acceptable limits on end-to-end latency, burst loss probability, and throughput.

The role of the adaptive PBS MAC layer framing is to enable the following features:
• Concatenation of multiple payloads (i.e., IP packets or Ethernet frames) within the same PBS data burst frame
• Adaptive PBS data burst segmentation and reassembly based on available network processing resources and transport protocol characteristics
As an example, for TCP/IP traffic the adaptive burst assembler sets the maximum PBS burst assembly period to match the TCP window size, allowing higher throughput, especially until TCP reaches its optimal window size [16].

Figures 9 and 10 show the generic PBS framing format for both control and data bursts. The generic PBS burst frame has the following fields:
• A PBS generic payload header common to all types of PBS payload (i.e., control burst or data burst). The Payload Type (PT) field of this header identifies the payload carried by the burst.
• A PBS burst payload with either a control or data payload having:
–A specific payload header
–Payload data
–An optional payload frame check sequence (FCS), set to 0 if not used
A sketch of this generic framing is given at the end of this subsection. Figure 10 also illustrates the encapsulation of existing LAN/WAN traffic such as Ethernet (10/100 Mb/s, 1 Gb/s, 10 GbE) over the PBS network. When framing Ethernet MAC frames, one must be careful to take into account the interframe gap (IFG) requirement. The IFG is usually a 12-byte (≈ 9.6 ns for 10 GbE) timing gap between frames that allows the receiving MAC device to update its internal counters, calculate the frame FCS, and so on. There are two ways to take the IFG into account:
• Include the IFG bytes in front of the encapsulated Ethernet frame, at the expense of wasted network bandwidth
• Rely on the receiving device to offset the data, at the expense of extra processing
The selected IFG method is signaled in the PBS control burst. Thus, the key advantage of the PBS MAC layer framing is that it takes care of the PBS requirements, such as segmentation/reassembly, scheduling, and control, and enables flexible mapping of multiple generic payloads such as Ethernet frames and/or IP packets within PBS frames in order to satisfy different CoS requirements under various traffic loads.
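The generic framing just described can be illustrated with a short packing routine. Figure 9's exact bit positions are not fully legible in this reproduction, so the flag widths and positions below are illustrative assumptions; only the overall structure (a flags word, burst length, burst ID, optional header HEC, then the payload) follows the figure.

```python
# Sketch of PBS generic burst header framing. The header is modeled as four 32-bit
# words (flags, PBS burst length, PBS burst ID, optional generic header HEC) per
# Figure 9; individual flag bit positions are illustrative assumptions.

import struct
import zlib

def pack_generic_header(burst_len, burst_id, payload_type, priority=0,
                        in_band=0, label_present=0, use_hec=False, version=1):
    flags = (version & 0xF) << 28          # VN: version number (assumed 4 bits)
    flags |= (payload_type & 0x1) << 27    # PT: payload type (0 = data, 1 = control)
    flags |= (priority & 0x3) << 25        # CP: priority level (assumed 2 bits)
    flags |= (in_band & 0x1) << 24         # IB: in-band / out-of-band signaling
    flags |= (label_present & 0x1) << 23   # LP: label present
    flags |= (1 if use_hec else 0) << 22   # FH: header HEC present
    header = struct.pack("!III", flags, burst_len, burst_id)
    hec = zlib.crc32(header) if use_hec else 0   # CRC-32 as a stand-in for the HEC
    return header + struct.pack("!I", hec)

def frame_burst(payload, burst_id, payload_type):
    """Prepend the generic burst header to a control or data payload."""
    header = pack_generic_header(len(payload), burst_id, payload_type, use_hec=True)
    return header + payload

frame = frame_burst(b"\x00" * 64, burst_id=42, payload_type=0)   # a 64-byte data burst
```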

FIGURE 10. PBS MAC layer framing example of multiple Ethernet frames within a PBS data burst frame (each encapsulated Ethernet MAC frame is preceded by a PBS burst payload length field).
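As a concrete companion to Figure 10 and the two IFG options above, the sketch below concatenates several Ethernet frames into one PBS data burst payload; the 4-byte per-frame length field and the helper names are illustrative assumptions.

```python
# Sketch of encapsulating multiple Ethernet MAC frames in one PBS data burst payload
# (cf. Figure 10). Each frame is preceded by a length field, and the 12-byte
# interframe gap (IFG) is either inserted explicitly or left for the receiving
# device to restore, as signaled in the PBS control burst.

import struct

IFG_BYTES = 12   # ~9.6 ns at 10 Gb/s

def encapsulate(frames, include_ifg):
    payload = bytearray()
    for i, frame in enumerate(frames):
        if include_ifg and i > 0:
            payload += bytes(IFG_BYTES)            # option 1: carry the gap explicitly
        payload += struct.pack("!I", len(frame))   # per-frame PBS burst payload length
        payload += frame
    return bytes(payload)                          # option 2: receiver restores the IFG

burst_payload = encapsulate([b"\xaa" * 72, b"\xbb" * 1514], include_ifg=False)
```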

CONCLUSIONS
We have proposed a new intelligent PBS architecture for hop and span-constrained optical enterprise networks to support high-speed bursty traffic. The overall control and operation of the PBS switching nodes and the ingress/egress nodes are performed by the NPs for both IB and OOB signaling. Three different regions of LER operation for the burst assembly process were identified via simulation results. To reduce the LER total latency, Fmin < 0.1 should be used. In addition, a set of ideal optical switch fabric performance requirements was defined. The GMPLS-based PBS software architecture, with the key building blocks for both ingress/egress nodes and switching nodes, was introduced to recognize the PBS interfaces at the edge nodes and to define the PBS label space with the associated signaling. This enables the development and deployment of IP/Ethernet over WDM integration. Under the GMPLS framework, the adaptive PBS MAC layer is tailored to perform various PBS operations, such as data burst assembly, segmentation, reassembly, concatenation of multiple payloads within the same PBS data burst frame, control burst generation, scheduling, and framing, while still incorporating the MPLS-based traffic engineering features and functions for control burst switching and network management. The PBS MAC layer framing enables the mapping of multiple generic payloads such as Ethernet frames and/or IP packets into PBS data burst frames.

Looking forward, there are still many technical challenges for these hop and span-constrained PBS networks that must be solved. For example, the availability and scalability of a low-cost, fast (< 100 ns), strictly nonblocking photonic switching fabric that can be integrated with high-speed electronics remains one of the key obstacles. A research program developing a silicon optical switch fabric based on the defined performance requirements for a PBS network is currently underway at Intel. Other key issues, such as an optimal optical buffering scheme that can be integrated with the PBS switching fabric, as well as the balancing of network complexity and implementation costs with traffic performance, must also be addressed. Successful integration of the PBS network architecture with low-cost optical switching fabrics based on CMOS-compatible technology and a GMPLS-based software architecture should provide the means for robust and efficient transport of bandwidth-demanding applications on optical enterprise networks.

ACKNOWLEDGMENTS
The authors would like to thank Prof. Dan Blumenthal for collaboration and technical discussions of this work.

REFERENCES
[1] C. Qiao, "Labeled Optical Burst Switching for IP-over-WDM Integration," IEEE Commun. Mag., vol. 38, no. 9, 2000, pp. 104–14.
[2] J. Y. Wei and R. I. McFarland Jr., "Just-In-Time Signaling for WDM Optical Burst Switching Networks," IEEE J. Lightwave Tech., vol. 18, 2000, pp. 2019–37.
[3] J. S. Turner, "WDM Burst Switching for Petabit Data Networks," Tech. Dig. OFC, 2000.
[4] M. Düser and P. Bayvel, "Analysis of a Dynamically Wavelength-Routed Optical Burst Switched Network Architecture," IEEE J. Lightwave Tech., vol. 20, 2002, pp. 564–85.
[5] R. Rajaduray, D. J. Blumenthal, and S. Ovadia, "Impact of Burst Assembly Parameters on Edge Router Latency in an Optical Burst Switching Network," IEEE/LEOS Annual Meeting, Tucson, AZ, Oct. 26–30, 2003.
[6] M. Adiletta et al., "The Next Generation of Intel IXP Network Processors," Intel Tech. J., vol. 6, no. 3, 2002, pp. 6–18.
[7] F. Hady, "Network Processor Focused Internet Traffic Characteristics," Intel private pub., 2001.
[8] N. McKeown, "The iSLIP Scheduling Algorithm for Input-Queued Switches," IEEE/ACM Trans. Networking, vol. 7, 1999, pp. 188–201.
[9] L. Xu, H. G. Perros, and G. Rouskas, "Techniques for Optical Packet Switching and Optical Burst Switching," IEEE Commun. Mag., vol. 39, no. 1, 2001, pp. 136–42.
[10] D. J. Blumenthal et al., "All-Optical Label Swapping Networks and Technologies," IEEE J. Lightwave Tech., vol. 18, 2000, pp. 2058–75.
[11] E. J. Murphy et al., "16×16 Strictly Nonblocking Guided-Wave Optical Switching System," IEEE J. Lightwave Tech., vol. 14, 1996, pp. 352–58.
[12] G. Wenger et al., "Completely Packaged Strictly Nonblocking 8×8 Optical Switch Matrix on InP/InGaAsP," IEEE J. Lightwave Tech., vol. 14, 1996, pp. 2332–37.
[13] C. K. Tang and G. T. Reed, "Highly Efficient Optical Phase Modulator in SOI Waveguides," IEE Elect. Lett., vol. 31, 1995, pp. 454–55.
[14] A. Banerjee et al., "Generalized Multi-Protocol Label Switching: Overview of Routing and Management Enhancements," IEEE Commun. Mag., vol. 39, no. 1, 2001, pp. 144–50.
[15] A. Banerjee et al., "Generalized Multiprotocol Label Switching: An Overview of Signaling Enhancements and Recovery Techniques," IEEE Commun. Mag., vol. 39, no. 7, 2001, pp. 144–51.
[16] X. Cao, J. Li, Y. Chen, and C. Qiao, "Assembling TCP/IP Packets in Optical Burst Switched Networks," Proc. IEEE GLOBECOM, Taiwan, 2002.

BIOGRAPHIES
SHLOMO OVADIA [SM] earned his B.Sc. in physics from Tel-Aviv University in 1978, and his M.Sc. and Ph.D. in optical sciences from the Optical Sciences Center, University of Arizona, in 1982 and 1984, respectively. After two years as a postdoctoral fellow at the Electrical Engineering Department, University of Maryland, he joined IBM at East Fishkill as an optical scientist developing various IBM optical communications and storage products. He joined Bellcore in 1992, where he developed an HFC testbed and studied the transmission performance of multichannel AM/QAM video transmission systems. In 1996 he joined General Instruments as a principal scientist developing next-generation communications products such as digital set-top boxes and cable modems. In 2000 he joined Intel's cable network operation in San Jose, California, as a principal system architect developing communication products such as CPU-controlled cable modems. He joined Intel Research in 2001 as principal optical architect focusing on the architecture, design, and development of optical burst switching in enterprise networks based on silicon photonics components. He is the author of a recently published book titled Broadband Cable TV Access Networks: From Technologies to Applications (Prentice Hall, 2001). He is a member of OSA with more than 60 technical publications and conference presentations. He also serves on the technical committees of many IEEE/LEOS conferences, and is a regular reviewer for various IEEE publications such as Photonics Technology Letters and the Journal of Lightwave Technology. He is the inventor of 29 patents, of which six have been issued and the others are pending; his personal biography is included in the Millennium edition of Who's Who in Science and Engineering (2000/2001).

CHRISTIAN MACIOCCO is a senior staff architect at Intel. He works on software architecture that will bring flexible radio communications to wireless computing devices. He joined Intel in 1994 and has held positions in research, architecture development, and engineering management. He has worked on a number of emerging technologies at Intel, including media streaming over the Internet, defining DTV data broadcast support on the PC, and optical networking. He has also represented Intel in various standards and industry organizations, including the ATM Forum, ATSC, IETF, OIF, and SDR Forum. He earned a Dip.-Ing. from Ecole Speciale de Mecanique et Electricité, SUDRIA, Paris, France.

MARIO PANICCIA received his B.Sc. in physics in 1988 from the State University of New York at Binghamton and his Ph.D. in solid state physics from Purdue University in 1994. He joined Intel in 1995, where he led an effort to develop an optical testing technology (now called the Laser Voltage Probe) for use in speed-path debug of C4-packaged flagship microprocessors. In 1998 he started a research program focused on developing technology in the area of optical interconnects and optical clocking for future generation microprocessors. He is currently director of optical technology development for Intel's Communication and Interconnect Technology Laboratory, where he leads a research team focused on developing silicon-based photonic devices using standard CMOS processing for use in next-generation enterprise and data center communication networks. He has won numerous Intel awards, including the Innovators Day award in 1997, and was part of a team receiving an Intel Achievement Award in 1998. He has published numerous papers and conference proceedings, and has 33 issued patents and 16 additional patents pending.

RAMESH RAJADURAY received his Bachelor of Engineering (electrical and electronic engineering) with First Class Honors from the University of Western Australia in 1998, and an M.S. in electrical engineering from the University of California at Santa Barbara (UCSB) in 2000. He is currently pursuing his Ph.D. in electrical engineering at UCSB. His research interests include optical burst switching and subcarrier multiplexing.
