

HEARW¡P: EnHanced MEdium Access ContRol Protocol for Wireless Voice over IP

Sinem Coleri, Mustafa Ergen, Ahmad Bahai
{csinem,ergen}@eecs.berkeley.edu, [email protected]

OBJECTIVE

This paper describes a MAC enhancement protocol for improving voice communication quality over wireless networks. The protocol, named HEARW¡P, is based on the IEEE 802.11 standard.

I. INTRODUCTION

Voice over IP (VoIP) is the transport of speech signals over an IP network. What makes VoIP challenging is that IP networks were not designed to serve real-time applications. The real-time aspects of a conversation must be respected: the overall delay between both ends of the conversation should be low enough to avoid irritatingly long gaps of silence. This requirement is stricter than for broadcast programs, which are not interactive and can therefore tolerate more delay.

VoIP is an alternative to the telephony network and can be seen as a replacement technology. A computer network may replace the telephony network by installing a VoIP system, and the only connection to the telephony network might be a gateway that translates addresses from IP to phone number and vice versa. The real benefit lies in capacity, since the capacity of a computer network can be utilized better and bandwidth is cheap compared to the telephony network. Long distance calls become possible along with applications such as whiteboarding, application sharing, file transfer and video [1].

Conversation within a LAN is possible without effort, but once a WAN is involved delay matters. End-to-end delay is important, and when it gets too large the conversation is distorted. This can happen because of heavily loaded routes and packets lost during routing inside the WAN. A high percentage of packet loss also decreases the quality of the conversation. Moreover, the delay with which packets arrive at the receiver varies, because the congestion level of the network changes over time.

The rest of the report is organized as follows. Section II explains the components of VoIP communication.
Section II-A gives compression techniques that are currently available for VoIP applications. Section III shows the range of delay in each part of the voice communication. Section IV gives the basics of IEEE 802.11a and 802.11e. Section V explains a new protocol called HEARW¡P that can improve the delay characteristics of VoIP applications in WLANs. Section VI gives the implementation results of HEARW¡P. Section VII concludes the report.

II. VOICE COMMUNICATION

The components of VoIP are grabbing/reconstruction, compression/decompression, and transmission/reception over IP at the sender and the receiver. The analog speech signal is first encoded into a digital representation so that it can be transferred over the IP network. At regular small intervals, blocks of digitized speech information are sent over the network from the transmitter to the receiver. On the receiver side, each digitized block is transformed back into an audio signal, which is then output to the speakers.

Digitization of voice data includes sampling and quantization. The sampling rate and the number of bits used in quantization determine the data rate before compression. Human speech can contain frequencies beyond 12 kHz. However, high quality communication is attained in the telephone system by transmitting only frequencies below 4 kHz. The Nyquist theorem then suggests that a sampling rate of 8 kHz is enough for the digitization of speech. When uniform quantization is used, at least 12 bits are needed to cover the range of amplitudes of the voice signal. A non-uniform quantization with 8 bits also gives telephone quality. The required bandwidth for a telephone quality conversation is therefore 64 kbps, which is 8 kHz times 8 bits per sample.

To absorb delay jitter, which is the variation in packet arrival times, a buffer is used at the receiver. Instead of playing the voice data immediately after a packet is received, the packets are buffered. Although the buffer slightly increases the delay, it increases the probability of playing the packets consecutively without interruption.

The number of bits that each packet contains is very important in terms of delay and packet loss. To reduce the amount of lost information, a packet should contain only a small amount of the voice signal, so that a lost packet removes only a small fraction of the conversation. However, as the length of the data in each packet decreases, the overhead in the network increases, because the header fields become large in proportion to the data. The bandwidth used by the digitized information can be reduced by compression schemes.
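The bit rate and packet-size trade-off described above can be sketched numerically. The snippet below is a minimal illustration, not part of the protocol: the function names are hypothetical, and the 44-byte header total is an assumption taken from the RTP, UDP and IP header sizes quoted later in Section III.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample


def payload_bytes(bit_rate_bps: int, packet_ms: int) -> int:
    """Voice bytes carried by one packet covering packet_ms of speech."""
    return bit_rate_bps * packet_ms // (8 * 1000)


def header_overhead(payload: int, header: int) -> float:
    """Fraction of each packet that is header rather than voice data."""
    return header / (header + payload)


# Assumption: 16 (RTP) + 8 (UDP) + 20 (IP) bytes, as quoted in Section III.
ASSUMED_HEADER_BYTES = 44

if __name__ == "__main__":
    rate = pcm_bit_rate(8000, 8)  # 8 kHz sampling, 8 bits per sample
    print("PCM rate (bps):", rate)
    for packet_ms in (10, 20, 40):
        size = payload_bytes(rate, packet_ms)
        share = header_overhead(size, ASSUMED_HEADER_BYTES)
        print(packet_ms, "ms ->", size, "payload bytes, header share", round(share, 3))
```

Shorter packets lose less speech per dropped packet, but the header share grows as the payload shrinks, which is the tension the paragraph describes.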
Compression falls into one of two categories: general compression techniques, and compression techniques that exploit the fact that we are dealing with voice information.

A. Compression Techniques

Compression schemes fall into one of two categories: waveform coding and vocoding. Waveform coding encodes the waveform itself, whereas vocoding encodes information about how the speech signal is produced by the human vocal system.

The simplest form of waveform coding is PCM. Differential PCM (DPCM) exploits the fact that the value of a sample of the audio signal can be predicted from previous values. DPCM calculates a prediction of the sampled value and uses a fixed number of bits to store only the difference between the predicted and actual values of the PCM signal. Adaptive DPCM (ADPCM) uses some of those bits to adjust the resolution of the difference, in contrast to DPCM, which uses all the bits to encode the difference itself.

Vocoding is a combination of "voice" and "coding". Instead of trying to encode the waveform itself, vocoding techniques try to determine the parameters describing how the speech was created and use these parameters to encode the signal. To reconstruct the signal, these parameters are fed into a model of the vocal system that outputs a speech signal. Since the vocal tract and the excitation signal change relatively slowly, the signal to be analyzed is split into several short pieces, and each piece is examined in turn. If the piece is voiced, the pitch period is determined and the excitation signal is modelled as a series of periodic pulses. If the piece is unvoiced, the excitation is modelled as noise. The effect of the vocal tract is recreated through the use of a linear filter. This filter contains parameters that have to be determined by the vocoder. Several types of vocoders exist. The main difference between them is the vocal tract model used.
Linear predictive coding (LPC) models the vocal tract as a series of concatenated acoustic tubes. It examines the input and estimates the parameters to use in the vocal tract filter. It then applies the inverse of this filter to the signal. The result is called the residue signal, and it describes which excitation signal should be used to model the speech signal as closely as possible. The parameters of the filter are found from a difference equation which describes each sample as a linear combination of the previous ones. Such an equation is called a linear predictor.

Waveform coders do not perform well at data rates below 16 kbps. Vocoders produce intelligible communication at very low data rates, usually below 4.8 kbps. However, the reproduced speech often sounds quite synthetic and the speaker is often unrecognizable. Hybrid coders try to exploit the advantages of both techniques. Residual Excited Linear Prediction (RELP) uses the residual signal directly as the excitation for speech synthesis, instead of checking whether the signal is voiced or unvoiced and trying to model the excitation signal. Codebook Excited Linear Prediction (CELP) allows a wide range of excitation signals, which are all captured in the CELP codebook. To determine which excitation signal to use, the coder performs an exhaustive search. The excitation signal is then encoded by the index of the corresponding codebook entry.

Standards for voice communication are established to make interoperability between applications possible. The most widely known standards in the VoIP domain are the G. standards of the ITU-T. Table I gives the list of the standards.

TABLE I
CODEC SPECIFICATIONS FOR THE STANDARDS

Codec      Bit rate (Kbps)   Framing interval (ms)   Payload (bytes)   Packets/sec
GSM 6.10   13.2              20                      33                50
G.711      64                20                      160               50
G.723.1    5.3/6.3           30                      20/24             33
G.726-32   32                20                      80                50
G.729      8                 10                      10                50

B. Transmission Protocols

Since an IP network only offers best effort service, no guarantee on the delay can be provided. Similarly, the number of lost packets can be very high during congestion. The sender also needs some way of knowing whether the receiver can handle the incoming stream. Over an IP network, UDP is used together with the Real-Time Transport Protocol (RTP) and the RTP Control Protocol (RTCP). TCP cannot be used, since waiting for retransmitted packets adds extra delay to the communication. TCP also has no support for multicasting, which can be used to distribute speech data to several users at the same time.
UDP is not enough by itself to support the transmission of real-time data, since it provides no mechanism for synchronization and no support for flow or congestion control. RTP and RTCP add extra information to the speech data and use UDP to distribute this control and speech information. RTP includes a sequence number field, used to deliver received packets to the application in the correct order, and a timestamp field, which carries the synchronization information for a stream of packets. RTCP, on the other hand, periodically sends RTCP packets to track the number of participants in each session and to provide feedback on the quality of data distribution. By themselves, however, these protocols provide no mechanism to ensure timely delivery and so cannot give any QoS guarantees.

Both IPv4 and IPv6 have a way to specify the priority of a datagram. In the IPv4 header, some level of QoS can be specified in the Type of Service (TOS) field, whereas the IPv6 header uses the traffic class field for the same purpose. If all routers take such priorities into account, this can help real-time data to be delivered with low delay. The only thing that needs to be done is to configure the routers so that they take the priorities of packets into account. These mechanisms, however, can only help to give a better service; they cannot give any guarantees. If all the packets in the network are of high priority, the quality will still be poor.

Protocols such as the Resource Reservation Protocol (RSVP) can be used to reserve resources in the network so as to give some guarantees on the delay. If a host is going to transmit data which should arrive with a certain QoS, it periodically sends a path message to the destination of the data. The path message contains information about the characteristics of the traffic generated by the sender. The exact reverse path is then followed in response to the path message to make the reservations.

III. DELAY COMPONENTS

The total delay falls into one of two categories: fixed delay and variable delay. Fixed delay is the total delay introduced by sampling, compression/decompression, transmission and jitter buffering. Variable delay, on the other hand, is caused by packet queueing at the routers.

Compression delay can be divided into two parts. The first part is caused by the calculations that need to be done, and depends strongly on the capabilities of the system performing the compression. The second part is the time spent waiting for all of the speech that will be included in one block of speech data. For instance, if the vocoder operates on a 20 ms segment of speech, an additional 20 ms of delay is incurred.

Network delay is the delay to transport this data over the network. For example, a 20 ms segment at 8 kbps is 160 bits = 20 bytes. Adding a 16-byte RTP header, an 8-byte UDP header and a 20-byte IP header gives a 64-byte datagram, which takes 8 ms to transmit at 64 kbps and 512 µs at 1 Mbps.
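The datagram-size arithmetic above can be reproduced with a short calculation. This is only a sketch: the function names are illustrative, and the header sizes are simply the figures quoted in the text rather than values derived from the protocol specifications.

```python
# Header sizes in bytes, exactly as quoted in the text above.
HEADERS = {"RTP": 16, "UDP": 8, "IP": 20}


def payload_bytes(bit_rate_bps: int, frame_ms: int) -> int:
    """Speech bytes produced by the codec in one frame."""
    return bit_rate_bps * frame_ms // (8 * 1000)


def datagram_bytes(bit_rate_bps: int, frame_ms: int) -> int:
    """Payload plus RTP, UDP and IP headers."""
    return payload_bytes(bit_rate_bps, frame_ms) + sum(HEADERS.values())


def transmission_delay_s(size_bytes: int, link_bps: float) -> float:
    """Time to clock size_bytes onto a link of link_bps bits per second."""
    return size_bytes * 8 / link_bps


if __name__ == "__main__":
    size = datagram_bytes(8000, 20)  # 20 ms of speech at 8 kbps
    print("datagram bytes:", size)
    print("delay at 64 kbps (ms):", transmission_delay_s(size, 64e3) * 1e3)
    print("delay at 1 Mbps (us):", transmission_delay_s(size, 1e6) * 1e6)
```

Changing the frame length or the link rate in the calls above shows how quickly the per-packet transmission time shrinks on faster links, while the header bytes stay constant.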
It is not possible to make a general claim about the transmission delay in IP networks, although one-way transmission delays rarely exceed 100 ms. However, it is possible for this delay to exceed 200 ms, which is the maximum tolerable value. Jitter buffering is typically set to 10-20 ms.

IV. IEEE 802.11 PROTOCOL

Because it is extremely unusual for a wireless device to be able to receive and transmit simultaneously, the IEEE 802.11 MAC uses collision avoidance rather than the collision detection of IEEE 802.3. It is also unusual for all wireless devices in a LAN to be able to communicate directly with all other devices. For this reason, the IEEE 802.11 MAC implements a network allocation vector (NAV). The NAV is a value that indicates to a station the amount of time that remains before the medium will become available. Even if the medium does not appear to be carrying a transmission according to the physical carrier sense, the station may avoid transmitting. The NAV, then, is a virtual carrier sensing mechanism. By combining the virtual carrier sensing mechanism with the physical carrier sensing mechanism, the MAC implements the collision avoidance portion of the CSMA/CA access mechanism.

We first give the packet structures in Section IV-A, which are common to most IEEE 802.11 standards. Then we explain the extra functionality related to packet structure and MAC protocol for IEEE 802.11b and 802.11e in Sections IV-C and IV-D respectively.

A. Frame Format

The MAC accepts MSDUs from higher layers and adds headers and trailers to create a MAC Protocol Data Unit (MPDU). The MAC may fragment MSDUs into several frames, increasing the probability of each individual frame being delivered successfully. The header, MSDU and trailer together contain the following information:
• addressing information
• IEEE 802.11-specific protocol information
• information for setting the NAV
• frame check sequence for verifying the integrity of the frame.

The MAC frame format comprises a set of fields that occur in a fixed order in all frames. The fields Address 2, Address 3, Sequence Control, Address 4 and Frame Body are only present in certain frames. The general MAC frame format is as follows:

Field:   FC   D/ID   Addr. 1   Addr. 2   Addr. 3   Seq. Cont.   Addr. 4   Frame Body   FCS
Bytes:   2    2      6         6         6         2            6         0-2312       4

FC - Frame Control: 16 bits
1) Protocol Version (2 bits): identifies the version of the IEEE 802.11 MAC protocol; currently set to zero.
2) Frame Type (2 bits) and Subtype (4 bits): identify the function of the frame and which other MAC header fields are present in the frame. There are three frame types: control, data and management. Within each frame type there may be several subtypes.
3) To DS (1 bit) and From DS (1 bit): To Distribution System (DS) is set in every data frame sent from a mobile station to the AP, and is zero in all other frames. From DS is set in data frames sent from the AP to a mobile station. When both are zero, the frame is a direct communication between two mobile stations. Both are set in the special case where an IEEE 802.11 WLAN is being used as the DS, referred to as a wireless DS; the frame is then being sent from one AP to another over the wireless medium.
4) More Fragments Subfield (1 bit): indicates that this frame is not the last fragment of a data or management frame.
5) Retry Subfield (1 bit): when zero, the frame is being transmitted for the first time; otherwise it is a retransmission.
6) Power Management Subfield (1 bit): the mobile station announces its power management state; 0 means the station is in active mode and 1 means the station will enter power management mode. The subfield should keep the same value throughout the frame exchange in order for the mobile station to change its power management mode. A frame exchange is a 2-way or 4-way frame handshake including the ACK.
7) More Data Subfield (1 bit): used by the AP to indicate to a mobile station that there is at least one frame buffered at the AP for that station. A mobile station polled by the PC during a Contention Free Period (CFP) in PCF mode may also use this subfield to indicate to the PC that there is at least one more frame buffered at the mobile station to be sent to the PC. For multicast, the AP may also set it to indicate that there are more multicast frames.
8) WEP Subfield (1 bit): 1 indicates that the frame body of the MAC frame has been encrypted using the WEP algorithm (only in frames of type data, and in management frames of subtype authentication).
9) Order Subfield (1 bit): indicates that the content of the data frame was provided to the MAC with a request for strictly ordered service; it provides information to the AP and DS to allow this service to be delivered.

Duration/ID Field (D/ID): 16 bits. It contains either information for the NAV or a short ID (association ID, AID) used by a mobile station to retrieve its buffered frames from the AP. Only the power-save poll (PS-Poll) frame contains the AID; in that case the two most significant bits are set to 1 and the remaining bits contain the AID. All values larger than 2007 are reserved. When bit 15 is zero, the remaining bits (14-0) represent the remaining duration of a frame exchange, used to update the NAV. The value is set to 32,768 (bit 15 = 1 and all other bits 0) in all frames transmitted during the CFP, so that a station which missed the beginning of the CFP recognizes that it is in the middle of a CFP session and sets its NAV to a higher value.

Address Fields: there are 4 address fields. Besides the 48-bit address (as in IEEE 802.3), additional address fields (TA, RA, BSSID) are used to filter multicast frames and to allow transparent mobility in IEEE 802.11.
1) The IEEE 48-bit address comprises three fields:
• a single-bit Individual/Group field: when set to 1, the address is that of a group. If all bits are 1, the address is the broadcast address.
• a single-bit Universal/Local field: when zero, the address is global and unique; otherwise it is locally administered and may not be unique.
• a 46-bit address field.
2) BSS Identifier (BSSID): unique identifier for a particular BSS. In an infrastructure BSS it is the MAC address of the AP. In an IBSS it is random and locally administered by the starting station, which also gives uniqueness. In probe request frames, a group address can be used.
3) Transmitter Address (TA): MAC address of the station that transmits the frame onto the wireless medium. Always an individual address.
4) Receiver Address (RA): address to which the frame is sent over the wireless medium. Individual or group.
5) Source Address (SA): MAC address of the station that originated the frame. Always an individual address. It may not match the TA because of the indirection performed by the DS of an IEEE 802.11 WLAN. The SA field is the one considered by higher layers.
6) Destination Address (DA): final destination. Individual or group. It may not match the RA because of the indirection.

Sequence Control Field: 16 bits, consisting of a 4-bit fragment number and a 12-bit sequence number. It allows the receiving station to eliminate duplicate received frames.
1) Sequence Number Subfield (12 bits): each MSDU has a sequence number, which stays constant for all fragments of that MSDU. It is incremented sequentially for the following MSDUs.
2) Fragment Number Subfield (4 bits): assigned to each fragment of an MSDU.
The first fragment is assigned zero and the number is incremented sequentially for each following fragment.

Frame Body Field: contains the information specific to the particular data or management frame. Variable length. An application may send 2048 bytes with 256 bytes of upper layer headers.

Frame Check Sequence Field: 32 bits, computed with the CCITT CRC-32 polynomial:

G(x) = x^{32} + x^{26} + x^{23} + x^{22} + x^{16} + x^{12} + x^{11} + x^{10} + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1

The frame check sequence is an IEEE 802 LAN standard and is generated in the same way as in IEEE 802.3.

B. Control Frame Subtypes

The subfields within the Frame Control field of control frames are set as follows:

Bits     Subfield            Value in control frames
B0-B1    Protocol Version    Protocol Version
B2-B3    Type                Control
B4-B7    Subtype             Subtype
B8       To DS               0
B9       From DS             0
B10      More Frag           0
B11      Retry               0
B12      Pwr Mgt             Pwr Mgt
B13      More Data           0
B14      WEP                 0
B15      Order               0
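The bit layout of the Frame Control field lends itself to a small decoding example. The sketch below assumes that B0 is the least significant bit of the 16-bit value; the class and function names are hypothetical, and the sample value 0x00B4 is chosen only for illustration.

```python
from typing import NamedTuple


class FrameControl(NamedTuple):
    protocol_version: int  # B0-B1
    frame_type: int        # B2-B3
    subtype: int           # B4-B7
    to_ds: bool            # B8
    from_ds: bool          # B9
    more_frag: bool        # B10
    retry: bool            # B11
    pwr_mgt: bool          # B12
    more_data: bool        # B13
    wep: bool              # B14
    order: bool            # B15


def _bit(value: int, position: int) -> bool:
    """Return True if bit `position` (0 = least significant) is set."""
    return bool((value >> position) & 1)


def parse_frame_control(fc: int) -> FrameControl:
    """Split a 16-bit Frame Control value into its subfields."""
    if not 0 <= fc <= 0xFFFF:
        raise ValueError("Frame Control is a 16-bit field")
    return FrameControl(
        protocol_version=fc & 0b11,
        frame_type=(fc >> 2) & 0b11,
        subtype=(fc >> 4) & 0b1111,
        to_ds=_bit(fc, 8),
        from_ds=_bit(fc, 9),
        more_frag=_bit(fc, 10),
        retry=_bit(fc, 11),
        pwr_mgt=_bit(fc, 12),
        more_data=_bit(fc, 13),
        wep=_bit(fc, 14),
        order=_bit(fc, 15),
    )


if __name__ == "__main__":
    # Illustrative value: type 1, subtype 11, all flag bits zero.
    print(parse_frame_control(0x00B4))
```

For a control frame, the table above implies that every flag except Pwr Mgt should decode as zero, which gives a simple sanity check on received frames.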


Request to Send (RTS, 20 bytes):

Field:    Frame Control   Duration   RA   TA   FCS
Octets:   2               2          6    6    4

The purpose of this frame is to transmit the duration to stations so that they update their NAV, preventing transmissions from colliding with the data or management frame that is expected to follow. The duration information conveyed by this frame is a measure of the amount of time required to complete the four-way frame exchange:

Duration (µs) = CTS + Data or management frame + ACK + 3 SIFS

Clear to Send (CTS, 14 bytes):

Field:    Frame Control   Duration   RA   FCS
Octets:   2               2          6    4

For the update of the NAV:

Duration (µs) = Data or management frame + ACK + 2 SIFS

Acknowledge (ACK, 14 bytes):

Field:    Frame Control   Duration   RA   FCS
Octets:   2               2          6    4

• Frame Control Field
• Duration/ID Field (µs): the duration is zero if the ACK acknowledges a data or management frame whose More Fragments subfield is zero. If the ACK acknowledges a data or management frame where the More Fragments subfield of the Frame Control field is one, the duration is the time to transmit the subsequent data or management frame, an ACK frame, and two SIFS intervals.
• RA: individual address, taken from the Address 2 field of the data, management or PS-Poll frame.
• FCS

The purpose of this frame is two-fold. First, the ACK frame tells the sender of the immediately previous data, management, or PS-Poll frame that the frame was received correctly. Second, the ACK frame is used to transmit the duration information for a fragment burst, as in CTS.

Association Request frame format:

Order   Information
1       Capability Information
2       Listen Interval
3       SSID
4       Supported Rates

Association Response frame format:

Order   Information
1       Capability Information
2       Status Code
3       Association ID (AID)
4       Supported Rates


C. IEEE 802.11b

1) Frame Format: The frame formats of IEEE 802.11b for RTS, CTS, DATA and ACK packets are the same as in IEEE 802.11.

2) Timing Intervals: There are five timing intervals:
1) the short interframe space (SIFS), determined by the PHY,
2) the slot time, determined by the PHY,
3) the priority interframe space (PIFS),
4) the distributed interframe space (DIFS),
5) and the extended interframe space (EIFS).
The SIFS is the shortest interval, followed by the slot time, which is slightly longer. The PIFS is equal to the SIFS plus one slot time. The DIFS is equal to the SIFS plus two slot times. The EIFS is much larger than any of the other intervals. It is used when a frame that contains errors is received by the MAC, allowing the MAC frame exchange to complete correctly before another transmission is allowed. Both the DCF and the PCF are implemented through these five timing intervals.

3) MAC Protocol: The basic 802.11 MAC protocol is the DCF, based on CSMA. Stations deliver MAC Service Data Units (MSDUs) of arbitrary lengths up to 2304 bytes, after detecting that there is no other transmission in progress on the channel. However, if two stations detect the channel as free at the same time, a collision occurs. The 802.11 standard defines a Collision Avoidance (CA) mechanism to reduce the probability of such collisions. Before starting a transmission, a station has to keep sensing the channel for an additional random time after detecting the channel as idle for a minimum duration called DIFS, which is 34 µs for the 802.11a PHY. Only if the channel remains idle for this additional random time period is the station allowed to initiate its transmission.

Figure 1 represents the finite state machine of DCF operation. When the station has a packet to transmit, it senses the channel by Physical Carrier Sense (PCS) and Virtual Carrier Sense (VCS). PCS notifies the MAC layer if there is a transmission going on, and VCS is the NAV procedure: if the NAV is set to a nonzero value, the station waits until it counts down to zero. After carrier sensing, the station backs off and then transmits the data. If there is a collision, the corresponding retry counter is incremented and the backoff interval increases. The station backs off before every transmission; this is written into the standard in order to provide fairness among the stations.

[Fig. 1. MACRO Finite State Representation. States: Idle, PCS/VCS Wait, Backoff, Tx, Sequence & Retry. Transition labels: idle for IFS time; pre-Tx backoff successful; post-Tx backoff successful; busy during backoff; busy during Tx; medium not busy during Tx attempt; finish Tx; just transmitted ACK or CTS; all other transmitted frames whether successful or not; still in sequence and last step successful.]


1) When the MAC receives a request to transmit a frame, a check is made of the physical and virtual carrier sense mechanisms.
2) If the medium is not in use for an interval of DIFS (or EIFS if the previously received frame contained errors), the MAC may begin transmission of the frame.
3) If the medium is in use during the DIFS interval, the MAC selects a backoff and increments the retry counter.
4) The MAC decrements the backoff value each time the medium is detected to be idle for an interval of one slot time.
5) If there is a collision, the contention window is doubled and a new backoff interval is selected, as shown in Figure 2.

[Fig. 2. BACKOFF Procedure Finite State Representation. System fields: PAV, NAV, currentTime, BC (backoff counter), TS = 1 slot time = 20 µs (802.11b), 9 µs (802.11a). Flowchart elements: PCS/VCS wait; idle for IFS time; enter backoff; CW = 2^N - 1; BC = Rand() & CW; BC == 0?; wait 1 TS; MAX(PAV, NAV) < currentTime - TS; BC--; leave backoff.]
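The backoff steps listed above can be sketched as a few small functions. This is a simplified illustration under stated assumptions, not the standard's state machine: the helper names are hypothetical, the medium is modelled as a per-slot idle/busy trace, and the DIFS wait that follows each busy period is ignored.

```python
import random
from typing import Optional, Sequence


def contention_window(n: int) -> int:
    """CW = 2^N - 1, as labelled in Fig. 2."""
    return (1 << n) - 1


def draw_backoff(cw: int, rng: random.Random) -> int:
    """BC = Rand() & CW. Because CW is a mask of ones, BC is uniform on [0, CW]."""
    return rng.getrandbits(16) & cw


def double_window(cw: int, cw_max: int) -> int:
    """Double the contention window after a collision, capped at cw_max."""
    return min(2 * (cw + 1) - 1, cw_max)


def slots_until_transmit(bc: int, idle_trace: Sequence[bool]) -> Optional[int]:
    """Count BC down over a per-slot trace (True = idle slot).

    The counter is decremented only in idle slots and frozen in busy ones.
    Returns the slot index at which BC has reached zero, or None if the
    trace ends first.
    """
    for slot, idle in enumerate(idle_trace):
        if bc == 0:
            return slot
        if idle:
            bc -= 1
    return len(idle_trace) if bc == 0 else None


if __name__ == "__main__":
    rng = random.Random(1)
    cw = contention_window(4)  # 15, an assumed initial window
    bc = draw_backoff(cw, rng)
    print("CW:", cw, "BC:", bc)
    print("CW after one collision:", double_window(cw, 1023))  # 1023 is an assumed cap
```

Freezing the counter during busy slots, rather than redrawing it, is what lets a deferring station keep its place in the contention, as the example in the following figure also shows.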

An example of DCF operation is shown in Figure 3.

[Fig. 3. Timing of the 802.11 DCF for stations 1 to 6, showing DIFS and SIFS intervals, RTS, CTS, DATA and ACK frames, NAV settings and random backoffs (for example, a station that defers keeps its remaining backoff counter of 2 slots). In this example, station 6 cannot detect the RTS frame of the transmitting station 2, but can detect the CTS frame of station 1.]
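The relations between the interframe spaces given in Section IV-C can be checked with a short calculation. The sketch below uses hypothetical names; the 802.11a values (SIFS 16 µs, slot 9 µs) and the 802.11b slot time of 20 µs appear in this paper's figures, while the 802.11b SIFS of 10 µs is an assumption not stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhyTiming:
    """PHY-determined timing values, in microseconds."""
    sifs_us: int
    slot_us: int

    @property
    def pifs_us(self) -> int:
        """PIFS = SIFS + one slot time."""
        return self.sifs_us + self.slot_us

    @property
    def difs_us(self) -> int:
        """DIFS = SIFS + two slot times."""
        return self.sifs_us + 2 * self.slot_us


# Values quoted in this paper for the 802.11a PHY.
PHY_11A = PhyTiming(sifs_us=16, slot_us=9)
# Slot time from this paper; the SIFS value is an assumption.
PHY_11B = PhyTiming(sifs_us=10, slot_us=20)

if __name__ == "__main__":
    for name, phy in (("802.11a", PHY_11A), ("802.11b", PHY_11B)):
        print(name, "PIFS:", phy.pifs_us, "us, DIFS:", phy.difs_us, "us")
```

For the 802.11a values this reproduces the 25 µs PIFS and 34 µs DIFS used elsewhere in the paper.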

D. IEEE 802.11e

IEEE 802.11e is an extension of the 802.11 Wireless Local Area Network (WLAN) standard for the provisioning of Quality of Service (QoS). The new standard provides the means of prioritizing radio channel access within an infrastructure Basic Service Set (BSS) of the IEEE 802.11 WLAN. A BSS that supports the new priority schemes of 802.11e is referred to as a QoS supporting BSS (QBSS). The enhancements to the 802.11 MAC currently under discussion, called 802.11e, introduce the Enhanced DCF (EDCF) and the Hybrid Coordination Function (HCF). Stations which operate under 802.11e are called QoS stations, and a QoS station which works as the centralized controller for all other stations within the same QBSS is called the Hybrid Coordinator (HC). A QBSS is a BSS which includes an 802.11e-compliant HC and QoS stations. The HC will typically reside within an 802.11e AP. In the following, by a station we mean an 802.11e-compliant QoS station. The EDCF is the contention-based channel access mechanism of the HCF. We will only describe EDCF, since only DCF is practical today.

1) Frame Formats: The frame format for IEEE 802.11e contains some extra fields such as QoS Control. In this standard, the fields Address 2, Address 3, Sequence Control, Address 4, QoS Control and Frame Body are only present in certain frame types and subtypes. The frame format is as follows:

Field:   FC   D/ID   Addr. 1   Addr. 2   Addr. 3   Seq. Cont.   Addr. 4   QoS Control   Frame Body   FCS
Bytes:   2    2      6         6         6         2            6         2             0-2310       4

• QoS Control Field (16 bits): identifies the TC or TS to which the frame belongs, together with various other QoS-related information about the frame that varies with frame type and subtype. The field comprises 5 subfields, as defined for the particular sender and frame type and subtype. Typical data frames sent by WSTAs have the following structure:

Field:   TID   0   Ack Policy   Reserved   TXOP duration
Bits:    4     1   2            1          8

– TID field: identifies the TC or TS to which the corresponding MSDU or fragment in the frame body field belongs or, in the case of QoS null, the TC or TS of the traffic for which a TXOP is being requested.
– Ack Policy field: identifies the ACK policy that shall be followed upon delivery of the MPDU, such as normal ACK, no ACK or group ACK.
– TXOP duration: indicates the duration, in units of 32 µs, which the sending station desires for its next TXOP. The range of time values is 32 to 8160 µs. The TXOP duration field is present in QoS null frames sent by WSTAs associated with a QBSS, with bit 4 of the QoS control field set to 0. A TXOP duration requested for a particular TID supersedes any prior TXOP duration requested for that TID.

2) Timing Intervals:

3) MAC Protocol: EDCF is based on differentiating the priorities at which traffic is to be delivered. The QoS support is realized with the introduction of Traffic Categories (TCs).
The differentiation is performed by varying the amount of time a station would sense the channel idle, the length of the contention window during a backoff or the duration a station may transmit once it has the channel access. These parameters are defined by AIFS[TC], CWmin[TC], CWmax[TC], TXOPLimit[TC], TXOPBudget[TC] and Load[TC]. These parameters are not fixed by PHY as for DIFS, CWmin and CWmax in DCF but are fixed by a management entity. MSDUs are now delivered through multiple backoff instances within one station, each backoff instance parameterized with TC-specific parameters. In the CP, each TC within the stations contends for a TXOP and independently starts a backoff after detecting the channel being idle for an Arbitration Interframe space (AIFS), each backoff sets a counter to a random number drawn from the interval [1, CW + 1]. The minimum size (CW min[T C]) of the CW is another parameter dependent on the TC. Priority over legacy stations is provided by setting CW min[T C] < 15 (in case of 802.11a PHY) and AIF S = DIF S. See Figure 4 for illustration of the EDCF parameters. As in the legacy DCF, when the medium is determined busy before the counter reaches zero, the backoff has to wait for the medium being idle for AIFS again, before continuing to count down the counter. A big difference from the legacy DCF is that when the medium is determined busy before the counter reaches zero, the backoff has to wait for the medium being idle for AIFS again, before continuing to count down the counter. A big difference from the legacy DCF is that when the medium is determined as being idle for the period of AIFS, the backoff counter is reduced

11

AIFS[TC]

low priority TC

AIFS[TC] AIFS[TC] (=DIFS)

With 802.11a slot: 9us SIFS: 16us PIFS: 25us DIFS: 34us AIFS: >=34us

backoff

medium priority TC

backoff

PIFS

time SIFS

ACK

SIFS

high priority TC

RTS SIFS

DATA

Contention Window (counted in slots, 9us)

defer access

CTS

count down as long as medium is idle, backoff when medium gets busy again

Fig. 4. Multiple parallel backoffs of MSDUs with different priorities. Note that AIFS may be smaller than DIFS. In that case the CW starts at 1 rather than 0, which is the same as AIFS=DIFS.

by one beginning the last slot interval of the AIFS period. Note that with the legacy DCF, the backoff counter is reduced by one beginning the firs slot interval after the DIFS period. After any unsuccessful transmission attempt a new CW is calculated with the help of the persistence factor P F [T C] and another uniformly distributed backoff counter out of this new, enlarged CW is drawn, to reduce the probability of a new collision. Whereas in legacy 802.11 CW is always doubled after any unsuccessful transmission (equivalent to PF=2), 802.11e uses the PF to increase the CW different for each TC: newCW [T C] >= ((oldCW [T C] + 1) ∗ P F ) − 1 The CW never exceeds the parameter CW max[T C], which is the maximum possible value for CW. A single station may implement up to eight transmission queues, for each user priorities (UPs) realized as virtual stations inside a station, with QoS parameters that determine their priorities. If the counters of two or more parallel TCs in a single station avoids the virtual collision. The scheduler grants the TXOP to the TC with highest priority, out of the TCs that virtually collided within the station, as illustrated in Figure 5. There is then still a possibility that the transmitted frame collides at the wireless medium with a frame transmitted by other stations. Legacy: one priority

Fig. 5. Virtual backoff of eight traffic categories (TC 0 to TC 7); a scheduler resolves virtual collisions by granting the TXOP to the highest priority TC. (1) Left: legacy DCF with one priority, close to EDCF with AIFS = 34 us, CWmin = 15, PF = 2; (2) right: 802.11e EDCF with up to 8 independent backoff instances, AIFS[TC] >= 34 us, CWmin[TC] = 0-255, PF[TC] = 1-16.
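The contention window rule and the virtual collision scheduler described above can be sketched in a few lines of Python. This is a minimal illustration rather than an implementation of the standard: the class and function names are hypothetical, the smallest CW that satisfies the inequality is taken as the new CW, and a larger TC index is assumed to mean higher priority.

```python
import random
from dataclasses import dataclass


@dataclass
class TrafficCategory:
    tc: int        # TC index; a larger value is assumed to mean higher priority
    cw_min: int    # CWmin[TC]
    cw_max: int    # CWmax[TC]
    pf: int        # persistence factor PF[TC]
    cw: int = 0
    counter: int = 0

    def __post_init__(self):
        # The contention window starts at its minimum size.
        self.cw = self.cw_min

    def draw_backoff(self, rng):
        # The counter is drawn uniformly from the interval [1, CW + 1].
        self.counter = rng.randint(1, self.cw + 1)
        return self.counter

    def on_failure(self, rng):
        # Smallest CW satisfying newCW >= ((oldCW + 1) * PF) - 1,
        # never exceeding CWmax[TC]; then a new counter is drawn.
        self.cw = min((self.cw + 1) * self.pf - 1, self.cw_max)
        return self.draw_backoff(rng)


def resolve_virtual_collision(categories):
    """Return the TC that is granted the TXOP, or None if no counter is zero."""
    ready = [c for c in categories if c.counter == 0]
    if not ready:
        return None
    return max(ready, key=lambda c: c.tc)
```

With PF = 2 and CWmin = 7, repeated failures give CW = 15, 31, ... until CWmax is reached, which matches the doubling behaviour of legacy 802.11 mentioned in the text.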

Another important part of the 802.11e MAC is the Transmission Opportunity (TXOP). A TXOP is an interval of time when a station has the right to initiate transmissions, defined by a starting time and a maximum duration. TXOPs are acquired via contention (EDCF-TXOP) or granted by the HC via polling (polled TXOP). The duration of an EDCF-TXOP is limited by a QBSS-wide TXOP limit distributed in beacon frames, while the duration of a polled TXOP is specified by the duration field inside the poll frame. Although the poll frame is a new frame introduced by the upcoming 802.11e, legacy stations also set their NAVs upon receiving it. During an EDCF-TXOP won by a TC, a QSTA may initiate multiple frame exchange sequences to transmit MSDUs within the same TC or a higher-valued TC. The duration of this EDCF-TXOP is bounded by the EDCF TXOPLimit[TC].

V. HEARW¡P

The aim of the protocol called HEARW¡P is to improve the performance of voice over WLANs operating on IEEE 802.11a and IEEE 802.11e. We first describe the problems that the current standards have in supporting the requirements of VoIP applications. We then propose HEARW¡P, which tries to solve these problems.

A. VoIP Problems in IEEE 802.11a and 802.11e

The main problems of VoIP over WLAN are low VoIP capacity and increasing delay. The number of VoIP sessions that can be supported in a WLAN is much lower than it would be if the protocol overhead were excluded. We have seen in Section II-A that each VoIP session typically requires a bandwidth of around 10 Kbps. In principle, a WLAN operated at 11 Mbps should be able to support more than 500 VoIP sessions. In reality, the number is expected to be much lower because of the overhead of the MAC protocol, which consists of the packet overhead and the backoff that precedes each transmission. Waiting for the random backoff time also increases the delay in the network. As the number of users in the network increases, the backoff time increases severely.

B. Design Requirements

The main design requirement for HEARW¡P is compatibility with current implementations of IEEE 802.11b and 802.11e. We want the nodes that support HEARW¡P to perform better than those that do not while both operate concurrently in the network.
Since PCF is not supported in most 802.11 products, while DCF is the access method mostly used in IEEE 802.11b, we focus on DCF and EDCF in IEEE 802.11b and 802.11e respectively.

C. Definition of HEARW¡P

The HEARW¡P protocol has two main functionalities: sending packets back-to-back and dropping packets as needed. The first aims at decreasing the delay overhead at the beginning of each packet and the overhead of the packet header. The second aims at decreasing the load in the network with the goal of decreasing the delay of the remaining packets.

1) Sending Packets Together: The problem in the DCF and EDCF modes is that the AP has the same access to the channel as any other node in the network, although it is expected to have as many packets as all the nodes in the network combined, under the assumption that the communication between the nodes and the network is symmetric. The Sending Packets Together functionality aims at decreasing the backoff overhead at the beginning of each packet transmission by allowing the AP to acquire the channel and send the packets back-to-back to the other nodes. It can also decrease the header overhead of small VoIP packets by joining them in multicast packets. By simulation, we can plot the average wait time during backoff for a successful transmission. It increases considerably as the number of nodes increases. Figure ???? from Mustafa Ergen's work on modelling delay behavior in WLAN will be included soon. The packet overhead is also considerable, as shown in Section III. There is a 44 byte header overhead for 20 bytes of data, which corresponds to a 20 ms segment at 8 kbps. Sending packets in a multicast can be included in both IEEE 802.11b and 802.11e as an additional layer on top of the MAC protocol. It decreases both the packet and the backoff overhead. However, it requires extra functionality in the network. An alternative is to send the packets back-to-back. This can be realized very easily with slight changes in the IEEE 802.11e MAC, although it eliminates only the backoff overhead, not the packet overhead.

Implementation of Multicast on Top of IEEE 802.11 MAC: Multicasting VoIP packets requires an additional layer on top of the MAC, called the "VoIP Multicast Support Layer (VMSL)". The nodes that have this functionality subscribe to a multicast address for which the AP is the transmitter. Upon reception of packets destined to these nodes, the AP stores the payload and length of the packets in order to send them in one packet. It sends the combined packet upon reception of a certain number of packets or after a certain time, whichever comes first. If a mobile has VMSL, it informs the AP of this functionality by using the reserved bits in the association request frame. Upon reception of this request, if the AP also has VMSL, it informs the user by using the reserved bits inside the association response frame. It then sends another packet to this user that includes the multicast address associated with this node. At the end of this registration process, the user is subscribed to the multicast address, so that it transfers packets whose destination address is this multicast address to the VMSL layer. The AP also includes the address of the user in the list of nodes subscribed to the multicast address. Figure 6 shows this process.

Fig. 6. Handshaking in VMSL for registering to the multicast address: the user sends a registration for multicast to the AP, the AP subscribes the user to the multicast address, and the AP returns a response for multicast.
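The registration handshake of Figure 6 can be modelled as a small exchange between two objects. The sketch below is only illustrative: the class names, the boolean `vmsl_bit` standing in for the reserved bits, and the example group address are all assumptions, and the association response and the follow-up packet carrying the multicast address are collapsed into a single return value.

```python
from dataclasses import dataclass, field

# Hypothetical multicast group address, chosen only for illustration.
MULTICAST_ADDR = "01:00:5e:00:00:01"


@dataclass
class AccessPoint:
    supports_vmsl: bool = True
    members: set = field(default_factory=set)  # users subscribed to the group

    def handle_association_request(self, user_addr, vmsl_bit):
        """Return (vmsl_bit_in_response, multicast_address_or_None)."""
        if not (vmsl_bit and self.supports_vmsl):
            # Either side lacks VMSL: plain association, no subscription.
            return False, None
        self.members.add(user_addr)
        return True, MULTICAST_ADDR


@dataclass
class Station:
    addr: str
    supports_vmsl: bool = True
    subscribed: set = field(default_factory=set)

    def associate(self, ap):
        # The reserved bit in the request advertises VMSL support.
        ok, group = ap.handle_association_request(self.addr, self.supports_vmsl)
        if ok:
            # Packets addressed to this group are handed to the VMSL layer.
            self.subscribed.add(group)
        return ok
```

A station without VMSL still associates in the usual way in this model; it is simply never added to the multicast member list, which is how compatibility with nodes that lack the extension is kept.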

When the AP receives a packet for a node that is in the list of nodes in the multicast group, it first checks whether it is a VoIP packet based on the length of the packet. VoIP packets usually contain 20-40 ms of voice data, which corresponds to a 44 byte IP+RTP+UDP header plus a 20-40 byte data payload, 64-84 bytes in total. Data packets are expected to be much larger than these packets. If the node is in the multicast group and the packet is a voice packet, the complete IP packet is stored in an array. If this is the first packet to be included in the multicast packet, the AP also starts a timer. Then, after a certain time or once the maximum number of node packets that fit inside one packet has been received, the AP sends a packet with its own address as the source address, the multicast address as the destination, and all the stored IP packets in the data payload, as shown in Table II.

TABLE II
MULTICAST PACKET SENT FROM AP TO THE USERS SUBSCRIBED TO VMSL

version (4) | IHL | ToS | 16-bit total length
16-bit identification | flags | 13-bit fragment offset
TTL | protocol | 16-bit header checksum
source address: AP address
destination address: multicast address
voice packet 1
voice packet 2
...
voice packet n

Implementation in IEEE 802.11e: The Sending Packets Together functionality can be implemented in a simpler way in IEEE 802.11e. In IEEE 802.11e, a TXOP is an interval of time when a station has the right to initiate transmissions, defined by a starting time and a maximum duration. Therefore, once a node acquires the channel, it can send packets to different nodes during the TXOP duration. The implementation of the functionality in IEEE 802.11e is similar to the implementation of VMSL, except that the IEEE 802.11 association request and response frames do not need to use any of the reserved bits. When the AP receives a packet destined to one of the users associated with it, it first checks whether it is a voice packet based on the length of the packet, as described for VMSL. It then stores the packets in the same way. If this is the first packet to be stored for one TXOP, the AP also starts a timer. Then, after a certain time or once the maximum number of node packets that fit inside the TXOP limit has been received, the AP acquires the channel and sends the packets back-to-back during the TXOP duration, as shown in Figure 7.

Fig. 7. Sending packets back-to-back in one TXOP duration (packet 1, packet 2, ..., packet n).
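Both the VMSL and the IEEE 802.11e variants rely on the same buffering rule at the AP: classify by length, start a timer on the first stored voice packet, and release the batch on a count or time limit, whichever comes first. The following sketch shows that rule under stated assumptions: the 84 byte threshold is taken from the 64-84 byte range above, while `max_packets`, `max_wait` and all names are hypothetical parameters that the text does not fix.

```python
from dataclasses import dataclass, field
from typing import Optional

VOICE_MAX_LEN = 84  # bytes; upper end of the 64-84 byte voice packet range


@dataclass
class VoiceAggregator:
    max_packets: int   # assumed limit: packets fitting in one multicast packet or TXOP
    max_wait: float    # assumed timer value, in seconds
    buffer: list = field(default_factory=list)
    timer_start: Optional[float] = None

    def on_packet(self, packet: bytes, now: float):
        """Return a batch to transmit now, or None if the packet was buffered."""
        if len(packet) > VOICE_MAX_LEN:
            return [packet]             # treated as data: forwarded unchanged
        if not self.buffer:
            self.timer_start = now      # first stored voice packet starts the timer
        self.buffer.append(packet)
        if len(self.buffer) >= self.max_packets:
            return self.flush()         # count limit reached
        return None

    def on_tick(self, now: float):
        """Return the buffered batch if the timer has expired, else None."""
        if self.buffer and now - self.timer_start >= self.max_wait:
            return self.flush()
        return None

    def flush(self):
        batch, self.buffer = self.buffer, []
        self.timer_start = None
        return batch
```

In the VMSL case the returned batch would become the payload of one multicast packet; in the 802.11e case it would be sent as individual frames back-to-back within one TXOP.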

2) Dropping Packets as Needed: Normally, in both IEEE 802.11a and 802.11e, a node does not decrement its backoff value unless the channel has been idle for the last IFS time. Also, if a transmission is not successful, the node keeps increasing the size of the contention window until it reaches CWmax. These protocols are therefore designed to deliver packets successfully without regard to delay. In HEARW¡P, we propose to change the protocols so that a node decides, after some time, whether or not to send the packet. In VoIP applications, there is a maximum tolerable RTT of 200 ms. If the RTT is greater than 200 ms, the packet is dropped at the receiver. Therefore, if the packet delay increases too much while contending for the channel in the WLAN, it is better to drop the packet beforehand without increasing the load of the network any further. The idea is to drop packets with some probability if the node cannot get the opportunity to transmit them. The delay in getting a transmission opportunity is expected to result from the increasing number of users contending for the channel in the WLAN. It has been shown that in that case the throughput of the network starts decreasing and the delay in the network increases. In order to decrease the delay of a specific percentage of the packets, we claim that we should drop another percentage of the packets generated in the network as they wait to acquire the channel. These percentages are adjusted based on how long the packets have waited in the queue. The algorithm is as follows: the node decides not to send the packet with some probability p1 at the end of each time period, as shown in Figure 8. Also, if the packet transmission is not successful, the node decides whether to start the retransmission with some other probability p2. The implementation of this functionality in IEEE 802.11b and 802.11e is described below.
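The two drop decisions can be illustrated with a short Python sketch. It is a simplified model, not the protocol itself: function names are hypothetical, p1 is held constant instead of being adapted to the queueing time, and p2 is interpreted here as the probability of dropping (rather than retrying) after a failed transmission, which the text leaves open.

```python
import random


def keep_after_period(p1, rng):
    """Return False if the waiting packet is dropped at the end of a period."""
    return rng.random() >= p1


def keep_after_failure(p2, rng):
    """Return False if the packet is dropped instead of being retransmitted.

    Assumption: p2 is read as a drop probability.
    """
    return rng.random() >= p2


def survives_wait(p1, periods, rng):
    """Simulate a packet that waits `periods` periods before it may transmit."""
    for _ in range(periods):
        if not keep_after_period(p1, rng):
            return False
    return True


def survival_probability(p1, periods):
    """Probability that the packet is still queued after `periods` periods."""
    return (1.0 - p1) ** periods
```

Under this model a packet that waits k periods is still queued with probability (1 - p1)^k, so packets that have waited longest are the most likely to have been dropped, which is the intended effect for voice packets that would otherwise exceed the delay budget.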
Fig. 8. Dropping packets periodically: while the backoff value is not zero, the packet is dropped with probability p1 at the end of each period.

Implementation in IEEE 802.11b: Data packets in IEEE 802.11b cannot be dropped, since they have to be delivered. Voice packets, however, are not useful after a certain time. In IEEE 802.11b, channel access is not differentiated by traffic type. If the voice sources drop their packets to keep the WLAN throughput from decreasing any further, the data packets can get the channel and increase the delay of the VoIP traffic further while also increasing the packet loss. Therefore, this functionality may not be successful in IEEE 802.11b.

Implementation in IEEE 802.11e: In IEEE 802.11e, each traffic category (TC) has its own CWmin, CWmax, TXOPLimit and AIFS. Since TCs achieve higher priority by choosing smaller window sizes, we may not need to worry about low priority traffic such as TCP connections, and we can implement the Dropping Packets as Needed functionality. Before starting a transmission, a station has to keep sensing the channel for an additional random time after detecting the channel as idle for a minimum duration called AIFS. If the channel is busy most of the time, there are a lot of active users in the network. The node may therefore choose to drop packets to give the right of transmission to the other nodes. Periodically, the node decides whether to wait any longer or to drop the packet with some probability p1. This way, if the channel is busy most of the time, the expected number of transmitted packets decreases. This may be preferable, since the delay of these packets would increase so much that they would be dropped anyway. If there is a collision, the corresponding retry counter is incremented and the backoff interval increases. A station backs off before every transmission; this is part of the standard in order to provide fairness among the stations. However, it also increases the delay considerably, and it may not be worth retransmitting the packet. Therefore, for each retransmission, the node decides whether to continue trying to retransmit or to drop the packet with some probability p2.

VI. PERFORMANCE ANALYSIS

A. Experimentation

We experimented¹ with VoIP in the current system. The VoIP delay analysis consists of the delay in the wired domain and the delay in the wireless domain. We focus on the wireless domain since the tuning parameters are limited.
In order to get a good estimate for the wireless domain, we located the source and destination in the same subnet. They are all attached to the same access points. As a result, in this setting, the inter-arrival delay is the delay budget for the wireless domain. We look at a station and take the difference between the receiving times of consecutive ACKs. Figure 9 shows the instantaneous values; as can be seen, the jitter is around 30 ms. This measurement was taken in the department network, which included all the other ongoing traffic. Figure 10 shows the distribution of these inter-arrival times.

¹ Thanks to Bill H. in EECS for his help.

Fig. 9. Inter-arrival time of a node in reception (x-axis: packet; y-axis: interarrival time in seconds).
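The measurement described above, taking the difference between consecutive receive times and then looking at the distribution, can be reproduced with a few lines of Python. This is a sketch with hypothetical function names and made-up sample timestamps; it is not the script used for the figures. Timestamps are given as integer milliseconds so that the binning is exact.

```python
def interarrival_times(receive_times):
    """Differences between consecutive receive timestamps (same unit as input)."""
    return [later - earlier
            for earlier, later in zip(receive_times, receive_times[1:])]


def histogram(values, bin_width):
    """Count values per bin; bin k covers [k * bin_width, (k + 1) * bin_width)."""
    counts = {}
    for value in values:
        index = int(value // bin_width)
        counts[index] = counts.get(index, 0) + 1
    return counts


# Illustrative timestamps in milliseconds (not measured data).
times_ms = [0, 30, 50, 110]
gaps = interarrival_times(times_ms)   # [30, 20, 60]
bins = histogram(gaps, 20)            # {1: 2, 3: 1}
```

Plotting `gaps` against the packet index gives a trace of the kind shown in Figure 9, and plotting `bins` gives a distribution of the kind shown in Figure 10.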

Another test we have done is with the Skype VoIP software. This software works peer to peer: when a station wants to make a call, it contacts the Skype server, gets the IP address of the destination, and then calls that IP address directly. In our setting, calls are therefore confined within a subnet, so the wired domain part of the call is very limited and negligible. Figures 11 and 12 show the case in which a pair of laptops running Skype are connected in a VoIP call. We can now observe jitter below 30 ms, although values around 30 ms still make up the majority of the interarrival times. Each call traverses the wireless domain twice: from the sender to the access point and from the access point to the receiver. The HEARW¡P algorithm removes one wireless domain access and reduces the time significantly.

VII. CONCLUSION

This paper introduces a wireless VoIP algorithm and some experimentation with the current system. The HEARW¡P algorithm is explained in detail, and experimentation with the Skype software on an 802.11b network is presented.

REFERENCES

[1] Liesenborgs, J., Voice over IP in networked virtual elements. Ph.D dissertation, University of Maastricht, 2000.
[2] IEEE 802.11 WG, "Reference number ISO/IEC 8802-11:1999(E) IEEE Std 802.11, 1999 edition. International Standard [for] Information Technology-Telecommunications and information exchange between systems-Local and metropolitan area networks-Specific Requirements-Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications," 1999.

Fig. 10. Distribution of interarrival time for a node (x-axis: interarrival time; y-axis: probability distribution).

Fig. 11. Inter-arrival time of two nodes (x-axis: packet; y-axis: interarrival time in seconds).

Fig. 12. Distribution of interarrival time for a node (x-axis: interarrival time; y-axis: probability distribution).
