Distributed-Fountain Network Code (DFNC) for Content Delivery in Vehicular Networks

Chong Li, Iowa State University, Ames, IA, USA ([email protected])
Jubin Jose, Qualcomm Research, Bridgewater, NJ, USA ([email protected])
Xinzhou Wu, Qualcomm Research, Bridgewater, NJ, USA ([email protected])
ABSTRACT

Vehicle-to-vehicle (V2V) communication utilizing dedicated short range communication (DSRC) has already been tested in field trials and is ready for potential deployment. This deployment would enable the possibility of large-scale content delivery over small and large geographical areas. While content delivery in such vehicular networks can take advantage of the broadcast nature of DSRC, it has to cope with new challenges presented by dynamic topology, unpredictable erasures and the lack of acknowledgements. Random linear network coding (RLNC) can address these challenges in theory, but its high decoding complexity limits its applicability in practice, especially for large-scale content delivery. Motivated by this, a new network coding scheme for vehicular networks, distributed-fountain network code (DFNC), that has low encoding, re-encoding and decoding complexity is presented in this paper. DFNC uses a fountain code at the source and re-encoding at intermediate vehicles that approximates a fountain code. Re-encoding at intermediate vehicles comprises an innovative approach of low-complexity degree reduction and random linear combination of degree-reduced packets to satisfy the degree distribution of the fountain code. Low-complexity belief propagation (BP) decoding is applied at the destinations. Through extensive simulations on two mobility models, random waypoint and the Boston urban area, this paper establishes that DFNC performance is close to RLNC performance at order-wise lower complexity. DFNC significantly outperforms other relevant epidemic algorithms in the literature.

Categories and Subject Descriptors
C.2.1 [Network Architecture and Design]: Wireless Communication

Keywords
Vehicular network, Content distribution, Network coding, Fountain code

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
VANET'13, June 25, 2013, Taipei, Taiwan
Copyright 2013 ACM 978-1-4503-2073-3/13/06 ...$15.00.

1. INTRODUCTION

Vehicles are already equipped with wireless communication devices that connect to cellular networks. These connected vehicles can take advantage of cellular infrastructure for connectivity. However, with the emergence of direct vehicle-to-vehicle (V2V) communication techniques such as dedicated short range communications (DSRC), there will be a paradigm shift in the nature of connectivity. Vehicles (e.g., cars, trucks, buses) will be able to exchange messages with each other and with roadside units (RSUs). Even though the deployment of V2V communications is primarily driven by safety, it is possible to envision a future where cellular infrastructure and RSUs utilize V2V communications to deliver content that is destined for vehicles, homes or fixed delivery locations. DSRC [1] has already allocated several channels for non-safety services, ranging from office-on-wheels to peer-to-peer (P2P) file sharing and navigation software updates. Thus, content delivery in vehicular networks equipped with DSRC has the unique advantage of utilizing DSRC spectrum. For example, in the United States, 75 MHz of spectrum in the 5.9 GHz band has been allocated for DSRC to be used by Intelligent Transportation Systems (ITS), and in Europe, the European Telecommunications Standards Institute (ETSI) has allocated 30 MHz of spectrum in the 5.9 GHz band for ITS. Other advantages include support for high mobility and potential ubiquitous deployment. An important application is the distribution of the Certificate Revocation List (CRL) created by the certificate authority to all vehicles [2, 3, 4]. Similar to other content delivery problems, this problem can be captured by the following example. A file consisting of many packets is generated at a source. The objective is to distribute the entire file to all vehicles in a geographical region. This file can be delivered separately to each vehicle using costly cellular spectrum. Instead, an alternate approach is to simply seed the file into the vehicular network for a short duration of time and utilize V2V communications to distribute the file to all vehicles. This paper focuses on efficient content delivery algorithms for this alternate approach, which offers a significant reduction in communication cost compared to using cellular technology.

Content delivery in highly mobile vehicular networks has many challenges: (i) A vehicular network has a highly dynamic topology. Hence, it is not feasible to use techniques that require global information, such as creating a multicast tree. (ii) Packet erasures are unpredictable due to the lack of coordinated transmissions. Hence, fixed-rate codes may not be sufficient for error correction. (iii) Packet acknowledgements are not part of the DSRC broadcast protocol. Hence, it is not straightforward to use typical ARQ or hybrid-ARQ notions. (iv) Security constraints could limit the utilization of local information as well. For example, neighbor discovery or feedback could violate the security restrictions required by DSRC. Due to these challenges, reliable and efficient content delivery in vehicular networks cannot utilize the majority of existing P2P protocols and techniques developed for computer networks.

Random linear network coding (RLNC) is a powerful technique to address the above challenges [5, 6, 7]. However, the high decoding complexity (O(k³), where k is the number of packets in the file) limits its application in vehicular networks. In this paper, we develop a decentralized, efficient coding scheme, distributed-fountain network code (DFNC), for content delivery in vehicular networks. In addition to addressing the challenges in vehicular networks, DFNC has low encoding, re-encoding and decoding complexity (O(k log³ k)). Through extensive simulations on two mobility models, random waypoint and the Boston urban area, this paper establishes that DFNC performance is close to RLNC performance at order-wise lower complexity. DFNC significantly outperforms other relevant epidemic algorithms in the literature (see Section 1.1 for details).

In this paper, first, we design a general coding framework, constrained random linear network coding (C-RLNC), for relaying at intermediate vehicles. The basic idea of C-RLNC is to add degree constraints to random linear combinations of encoded packets. This is expected to reduce the decoding complexity since belief propagation (BP) decoding can be used instead of Gaussian elimination. C-RLNC is only an intermediate technical step and does not result in low re-encoding complexity. Next, we develop DFNC as an approximation of C-RLNC with low re-encoding complexity. DFNC has several distinguishing properties: (i) DFNC can efficiently handle transmission loss due to mobility and interference. (ii) DFNC removes the scheduling problem at intermediate nodes, e.g., peer selection and packet selection. (iii) DFNC almost fully exploits the broadcast nature of DSRC.

1.1 Related Work

Related work can be broadly categorized into two classes. One class mainly investigates the benefits and limitations of using network coding in vehicular ad hoc networks (VANETs). In [8], the authors study the application of network coding in VANETs and argue that their network-coding-based protocol successfully deals with typical mobile network issues such as dynamic topology, intermittent connectivity and unreliable channels. In [9, 10], it is shown that network coding can further mitigate the peer and packet scheduling problems at the intermediate nodes and effectively handle transmission loss due to unreliable channels and interference. Implementation issues associated with network coding in VANETs, such as CPU consumption, memory access and disk I/O, have been analyzed in [11]. The resource constraints can have a significant impact on network coding performance [11].

The other class of related work focuses on reducing the complexity of the random linear network coding (RLNC) technique. A popular approach is to limit encoding, decoding and re-encoding to chunks (also called generations), each of which is a subset of all available packets [12]. Thus, packets are not allowed to mix between chunks. Even though chunk-based linear network coding reduces the complexity, it introduces other issues such as the scheduling of chunks [13, 14, 15, 16]. An extension of chunk-based network coding is the batched sparse (BATS) code [17], which combines the ideas of fountain codes and chunk-based network codes. However, even with a small batch size, the computational complexity can be prohibitive for implementation, and batch scheduling has to be handled separately. Furthermore, the performance with a small batch size may not be comparable to RLNC. Fountain codes (e.g., the Luby Transform (LT) code [18] and the Raptor code [19]) are rateless erasure codes that have low decoding complexity (e.g., belief propagation (BP) decoding). Significant research has been done to extend point-to-point fountain codes to network codes by allowing re-encoding at intermediate nodes. In [20], a Raptor-code-based network coding technique is used where the intermediate nodes re-encode packets by using a specific re-encoding matrix. However, the decoding complexity is the same as RLNC due to the Gaussian-elimination decoding scheme. In [21], a line network is considered and a stacked LT coding scheme is used to relay information. In this coding scheme, each relay node performs on-the-fly LT encoding based on the currently available packets. In the end, the sink node performs a successive decoding technique to peel off each LT code appended by each relay node. Although this coding scheme achieves the capacity of the line network, it is hard to extend to general networks as it is specific to the network topology. In [22], the authors use growth codes with a different decoding objective: instead of aiming to recover the whole file in the shortest time, growth codes aim to maximize the expected number of decodable packets during all transmissions. An important paper in reducing complexity using fountain codes, due to its simple and efficient encoding, re-encoding and decoding schemes, is the LT network code (LTNC) [23]. However, LTNC is not well suited for VANETs as it does not fully take advantage of the broadcast nature (made precise later). We show that the coding scheme developed in this paper, DFNC, outperforms LTNC at similar computational complexity.

2. PROBLEM SETUP AND BACKGROUND

A source (e.g., a cellular base station or roadside unit) has a file that consists of k packets. Each packet is a length-l binary vector. The source intends to send the same file to multiple destinations through a wirelessly connected vehicular network. In this paper, the destinations are all the vehicles¹. The source encodes native packets using an LT code and then injects these coded packets into the vehicular network for a limited duration. The objective is to design a low-complexity coding scheme for intermediate nodes (i.e., vehicles) that achieves a good file-broadcasting delay distribution. Each vehicle is equipped with a DSRC device and communicates with others using DSRC. The physical layer is abstracted as an erasure channel with a packet erasure probability of γ. This erasure probability may be time-varying and is not known to the vehicles. The medium access layer (MAC) is modeled by slotted carrier sensing (CSMA/CA). A transmission is successful if the intended transmitter is within a certain range (transmission range) and all other transmitters are beyond a certain range (interference range) from the receiver. All the vehicles move around in a geographical region based on a mobility model. More details on the mobility models used in the simulations are given in Section 5.

¹The results apply even if only a random subset of the vehicles is chosen as destinations.

Remark 2.1. The code design in this paper is not specific to any physical layer, MAC layer or mobility model.

Next, we provide a brief background on the LT code [18] and RLNC.
2.1 LT Code

(i) Encoding Process: Consider k native packets. First, an integer degree d is randomly selected from the Robust Soliton (RS) distribution (see [18]). Then, a set of d packets, uniformly selected at random, are XOR-ed to produce an encoded packet. The selected d packets are called neighbors of the encoded packet. Each encoded packet is produced using the above procedure. The average degree of the RS distribution is on the order of O(log k).

(ii) Decoding Process: The decoder performs belief propagation (BP) on n (n > k) received LT-coded packets. In [18], it is shown that n = k + O(√k log²(k/δ)) (δ ∈ [0, 1]) coded packets are sufficient to retrieve the k native packets with at most δ probability of decoding failure. Due to space limits, we refer interested readers to [18] for details. Note that the RS distribution on the degree of LT-coded packets and the uniform selection of the constituent native packets are the two key statistical properties that guarantee efficient BP decoding.

2.2 Random Linear Network Code (RLNC)

Let the original file be segmented into k packets {b_i}_{i=1}^{k}. At the source node or an intermediate node, the node independently and randomly selects a set of coding coefficients {c_i}_{i=1}^{m} in the Galois field GF(2) (in [7], the ratio m/k is referred to as density). Then, this node randomly selects a set of coded packets [y_1, y_2, ..., y_m] out of the coded packets it has received so far, and produces a newly re-encoded packet ỹ = Σ_{i=1}^{m} c_i y_i for relaying. To decode, as soon as a node receives k linearly independent packets {ỹ_i}_{i=1}^{k}, it forms a k × k matrix A using the coefficients embedded in {ỹ_i}_{i=1}^{k}. Then, the original packets can be recovered by [b_1, b_2, ..., b_k]^T = A^{-1} [ỹ_1, ỹ_2, ..., ỹ_k]^T, where the inverse of A is computed by Gaussian elimination with computational complexity O(k³). In summary, RLNC aims to produce independently coded packets at intermediate nodes. This allows "close to optimal" throughput, i.e., RLNC can fully utilize the broadcast nature of wireless networks.

3. CONSTRAINED RANDOM LINEAR NETWORK CODE

In this section, we design a coding framework, constrained random linear network code (C-RLNC), for intermediate nodes. Let us consider the motivating example shown in Figure 1. Assume that there are no erasures on the broadcast links. In each odd time slot, the source broadcasts one packet to N intermediate relay nodes. In each even time slot, a relay node, selected randomly, transmits one packet to the destination. These transmissions have a non-zero probability of packet erasure. In this example, all relay nodes have an identical set of packets.

Figure 1: A motivating example

Clearly, RLNC achieves the min-cut capacity of this network without any coordination, but with high decoding complexity. If we apply an LT code at the source and make relay nodes forward encoded packets in the right time sequence, the capacity can be achieved for large k. However, the timing information among intermediate nodes implies coordination, which is not possible in vehicular networks. Therefore, our basic idea herein is to take the benefits of "random linear combination"² from RLNC and the benefits of BP decoding from the LT code. In particular, it is very desirable for the re-encoded packets at each relay node to look like independently coded LT packets. A somewhat related idea is presented in [23]. However, it does not try to produce independently coded packets. In other words, the packets from different relays could be correlated as they are generated from the same set of packets with a refinement objective that tries to equalize the number of native packets in the re-encoded packets. In this motivating example, the LTNC approach in [23] results in a significant number of duplicate packets at the destination. With a limited number of packets at the relays (compared to the file size), it is impossible to generate independently coded LT packets at all the relays. However, it is fairly easy to generate packets with a given degree. Therefore, in order to facilitate BP decoding, this degree can be selected from the RS distribution. Next, in order to minimize duplications, for any given degree d, the encoded packet can be selected from the set of all possible degree-d packets. Furthermore, a weight distribution can be used to reduce packet duplications from a given relay. This approach is formalized using a two-step coding framework, denoted by C-RLNC(v1), for intermediate nodes.

1. Construction of coding set A(d): Given received encoded packets {y_i}_{i=1}^{r}, an intermediate node generates degree-d coding sets A(d) for all d, defined as

   A(d) = { z | deg(z) = d, z = Σ_{i=1}^{r} δ_i y_i, δ_i ∈ {0, 1} }.

2. LT re-encoding: Select a degree d according to the RS distribution. Then, randomly select a packet from A(d) according to a distribution W_d. If A(d) is empty, a packet is selected at random from the received packets.

²Random linear combination is essential to fully exploit the broadcast nature of DSRC in vehicular networks.

The first step is of utmost importance and is the backbone of our code design. Notice that the degree-d coding set is the largest set of degree-d packets that can be constructed from the available encoded packets. When an intermediate node needs to blindly broadcast a degree-d packet to its neighbors, randomly choosing a packet from A(d) minimizes the probability of broadcasting redundant information. The second step (in conjunction with the first step) aims to preserve the two statistical properties of an LT code so that BP decoding can be applied at the destinations. The distribution W_d not only helps preserve the statistical properties of the LT code but also helps reduce the transmission of redundant information from the same relay. The construction of the sets A(d) is not straightforward in general. For example, given {y_i}_{i=1}^{r}, an exhaustive search requires 2^r operations to construct all coding sets, which is intractable as r increases. Next, we present an alternate formulation of C-RLNC(v1). Even though the complexity of this alternate approach is also high, it provides a starting point for the design of the low-complexity distributed-fountain network code (DFNC).

3.1 Degree Reduction-based Formulation

First, we define the degree-reduced set.

Definition 3.1. (Degree-reduced Set) Given a set of encoded packets {y_i}_{i=1}^{r}, the degree-reduced set {ỹ_i}_{i=1}^{r} is a set satisfying the conditions:

1. deg(ỹ_i) = min_{δ_{i,j} ∈ {0,1}} deg(Σ_{j=1, j≠i}^{r} δ_{i,j} ỹ_j ⊕ ỹ_i),

2. for each y_i, there exists a set {δ_{i,j}}_{j=1}^{r} such that y_i = Σ_{j=1}^{r} δ_{i,j} ỹ_j.

The first condition states that further degree reduction on a packet in the degree-reduced set is not possible. The second condition implies that degree reduction should not result in any information loss. The following lemma shows the existence of a degree-reduced set.

Lemma 3.2. For any given encoded packets {y_i}_{i=1}^{r}, there exists a degree-reduced set {ỹ_i}_{i=1}^{r} as defined in Definition 3.1.

Proof. For any given {y_i}_{i=1}^{r}, the following approach can be used to construct a degree-reduced set {ỹ_i}_{i=1}^{r} that satisfies Definition 3.1. First, let ỹ_i = y_i for i = 1, 2, ..., r. Then, for each i, if there exists a set {δ_j}_{j=1, j≠i}^{r} such that the degree of ỹ_i can be reduced, namely,

   deg(Σ_{j=1, j≠i}^{r} δ_j ỹ_j ⊕ ỹ_i) < deg(ỹ_i),

we update

   ỹ_i = Σ_{j=1, j≠i}^{r} δ_j ỹ_j ⊕ ỹ_i.

The above process, denoted by D_r, is invertible and thus no information is lost. Repeat the process D_r until the degree of no ỹ_i can be reduced further. The resulting {ỹ_i}_{i=1}^{r} is a degree-reduced set. For the process D_r to proceed, there must exist at least one encoded packet ỹ_i whose degree can be reduced further. Since the total degree is positive (and hence bounded below), the convergence of the process D_r is guaranteed. As a consequence, the first condition of Definition 3.1 is satisfied. Furthermore, if we record {ỹ_i}_{i=1}^{r} and {δ_j}_{j=1, j≠i}^{r} for each iteration, it is straightforward to construct a set {δ_j}_{j=1}^{r} such that y_i = Σ_{j=1}^{r} δ_j ỹ_j for each y_i. Therefore, the second condition of Definition 3.1 is also satisfied. Note that the above approach is simply a proof technique and is not an efficient algorithm for constructing a degree-reduced set. □

Remark 3.3. Given a set of encoded packets {y_i}_{i=1}^{r}, the degree-reduced set {ỹ_i}_{i=1}^{r} may not be unique. For example, let y_1 = x_1 ⊕ x_2 ⊕ x_3 and y_2 = x_2 ⊕ x_3 ⊕ x_4; then we have two possible degree-reduced sets {ỹ_1 = x_1 ⊕ x_4, ỹ_2 = y_2} and {ỹ_1 = y_1, ỹ_2 = x_1 ⊕ x_4}.

Definition 3.4. Given a degree d and a degree-reduced set {ỹ_i}_{i=1}^{r}, the degree-reduced coding set B(d) is defined as

   B(d) = { z | deg(z) = d, z = Σ_{i=1, deg(ỹ_i)≤d}^{r} δ_i ỹ_i, δ_i ∈ {0, 1} }.

In contrast to A(d), for a given degree d, the coding set B(d) is constructed by combining only packets with degree less than or equal to d. As we show later, this property is very useful in degree-constrained packet construction. Next, we provide an important result that leads to the alternate formulation.

Theorem 3.5. Given a set of encoded packets {y_i}_{i=1}^{r}, A(d) = B(d) for all d.

Proof. For a set of encoded packets {y_i}_{i=1}^{r} and a given degree d, we have the induced degree-reduced set {ỹ_i}_{i=1}^{r} and the set B(d). Clearly, B(d) ⊆ A(d). We next show that A(d) ⊆ B(d). Let z ∈ A(d), i.e., there exists {δ_i}_{i=1}^{r} such that z = Σ_{i=1}^{r} δ_i y_i and deg(z) = d. Based on Definition 3.1, each non-trivial contributor y_i (i.e., δ_i = 1) can be represented by the sum of a set of ỹ's. Therefore, there exists a set {ỹ_i}_{i=1}^{h} such that z = Σ_{i=1}^{h} ỹ_i. Next, we show that deg(ỹ_i) ≤ d for each i through contradiction. Let ỹ_m be a packet with the highest degree in {ỹ_i}_{i=1}^{h}. Suppose deg(ỹ_m) > d; then

   d = deg(z) = deg(Σ_{i=1, i≠m}^{h} ỹ_i ⊕ ỹ_m) < deg(ỹ_m).

This clearly contradicts the first condition in Definition 3.1. Therefore, we have deg(ỹ_i) ≤ d for all i = 1, 2, ..., h. Hence, z ∈ B(d) and A(d) ⊆ B(d). This completes the proof. □

An alternate representation of C-RLNC using degree reduction, denoted by C-RLNC(v2), is given next.

1. Degree reduction: Each intermediate node constructs a degree-reduced set from the received encoded packets.

2. Degree-constrained re-encoding: Select a degree d according to the RS distribution and then randomly select a set of degree-reduced packets with degree less than or equal to d, according to a certain distribution W, such that the sum degree equals d. Note that this distribution is on the degree-reduced packets. More details on the design of the distribution W are given in the next section.
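As a toy illustration (ours, not the paper's algorithm), the degree-reduction idea can be sketched with coded packets represented as sets of native-packet indices, where XOR over GF(2) becomes symmetric difference. For brevity, this sketch only tries single-packet combinations rather than the full subset search of Definition 3.1:

```python
def degree_reduce(packets):
    """Greedy pairwise degree reduction over XOR-coded packets.

    Each packet is a set of native-packet indices; XOR-ing two coded
    packets is the symmetric difference of their index sets. Only
    single-packet combinations are tried here, a simplification of
    the subset search in Definition 3.1."""
    pkts = [set(p) for p in packets]
    changed = True
    while changed:
        changed = False
        for i in range(len(pkts)):
            for j in range(len(pkts)):
                if i == j:
                    continue
                reduced = pkts[i] ^ pkts[j]
                # accept only strict, non-trivial degree reductions
                if 0 < len(reduced) < len(pkts[i]):
                    pkts[i] = reduced
                    changed = True
    return pkts
```

On the example of Remark 3.3, y_1 = x_1 ⊕ x_2 ⊕ x_3 and y_2 = x_2 ⊕ x_3 ⊕ x_4, one of the two packets is reduced to the degree-2 packet x_1 ⊕ x_4, matching one of the two possible degree-reduced sets. Since the total degree strictly decreases with every accepted update and is bounded below, the loop terminates.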
Now, we have an alternate approach to re-encode packets at the intermediate nodes. However, this alternate approach could be complex, as performing complete degree reduction is not easy. Therefore, next, we provide efficient algorithms for approximate degree reduction and other optimizations.

4. DISTRIBUTED-FOUNTAIN NETWORK CODE (DFNC)

DFNC consists of two steps: (i) low-complexity degree reduction, and (ii) degree-constrained re-encoding. Degree-constrained re-encoding further consists of (a) distribution-based degree-d packet construction and (b) packet diversification. Before proceeding to describe each of these steps, we provide some background on equivalence classes.

Figure 2: DFNC performed at an intermediate node

4.1 Background

Definition 4.1. (Equivalence class) Given a set of encoded packets {y_i}_{i=1}^{r}, an induced equivalence class E is a set of unique native packets satisfying the following condition: if x_a, x_b ∈ E, then there exists a set of degree-2 encoded packets indexed by I ⊆ {1, ..., r} such that x_a ⊕ x_b = Σ_{i∈I} y_i.

A given set {y_i}_{i=1}^{r} may induce several equivalence classes. Note that an equivalence class is defined using degree-two packets only. For maintaining equivalence classes, we introduce the following data structure (also given in [23]). Each native packet x is mapped to an integer via f(x) such that x_i, x_j ∈ E if and only if f(x_i) = f(x_j), where f(·) can be thought of as a mapping to the index of the leader of the connected component. Initially, set f(x_i) = i for i = 1, 2, ..., k. Given a set of encoded packets {y_i}_{i=1}^{r} generated from native packets {x_i}_{i=1}^{k}, if there exists an encoded packet y equal to x_i, f(x_i) is set to 0. When an encoded packet y has degree two, that is, y = x_i ⊕ x_j for 1 ≤ i < j ≤ k, f(x_v) is set to f(x_j) for all x_v with f(x_v) = f(x_i). Using this data structure for equivalence classes, it takes only O(1) to determine whether a packet with degree two can be generated from existing degree-two packets.

4.2 Low-Complexity Degree Reduction

We are interested in degree reduction with low complexity. Hence, we utilize the decoded set, which is the set of decoded native packets denoted by D, and equivalence classes to perform degree reduction. Let the current set of degree-reduced packets be {ỹ_i}_{i=1}^{r}. The decoded set and equivalence classes are defined with respect to these degree-reduced packets. Now, when a new packet y_{r+1} is received, all native packets in y_{r+1} that belong to the decoded set D are removed. Next, the intermediate node checks if any two of the native packets in y_{r+1} belong to the same equivalence class. If so, this pair of native packets is removed by adding the same pair generated from the corresponding equivalence class. This process is repeated until no such pair of native packets in y_{r+1} can be removed. Now, if the degree-reduced packet ỹ_{r+1} has degree less than or equal to two, the decoded set D and equivalence classes are updated as follows. If deg(ỹ_{r+1}) = 1, proceed further with BP decoding; then the decoded set and equivalence classes are updated. If deg(ỹ_{r+1}) = 2, the equivalence classes are updated. This degree reduction algorithm is described in Algorithm 1. Figure 3 shows that, after degree reduction, encoded packets with high degree become packets with lower degree.

Consider the example in Figure 2. A new encoded packet y_7 is received. After degree reduction using the decoded set D, we have y_7 = x_3 ⊕ x_5 ⊕ x_9. Then, after degree reduction using an equivalence class, we have y_7 = x_5. Since y_7 has degree one, BP decoding is applied to reduce the degree of the other packets. The degrees of y_4 and y_7 are reduced, while the sizes of D and E_1 are increased.

The above degree reduction algorithm removes all packet duplications with degree one or two. However, some of the packet duplications with higher degree may not be detected. Hence, in DFNC, we use an efficient sliding window mechanism to remove some fraction of these duplications.
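The f(·) bookkeeping of Section 4.1 behaves like a small union-find structure over native-packet indices. A minimal sketch, under our own conventions (0-based indices, with label 0 reserved for decoded packets) and with naive O(k) merging rather than the constant-time structure of [23]:

```python
class EquivClasses:
    """Equivalence classes induced by degree-1 and degree-2 packets.

    f[i] is the class label of native packet x_i; label 0 marks a
    decoded packet. Merging scans all labels, O(k), for clarity."""

    def __init__(self, k):
        # each native packet starts in its own class (labels 1..k,
        # so that label 0 stays free for decoded packets)
        self.f = list(range(1, k + 1))

    def decode(self, i):
        # a degree-1 packet y = x_i was recovered
        self.f[i] = 0

    def add_pair(self, i, j):
        # a degree-2 packet y = x_i XOR x_j merges the two classes
        old, new = self.f[i], self.f[j]
        for v in range(len(self.f)):
            if self.f[v] == old:
                self.f[v] = new

    def same_class(self, i, j):
        # O(1) test: can x_i XOR x_j be built from degree-2 packets?
        return self.f[i] == self.f[j]
```

For example, after observing x_0 ⊕ x_1 and x_1 ⊕ x_2, the query `same_class(0, 2)` succeeds, because x_0 ⊕ x_2 is the XOR of the two stored degree-2 packets.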
Figure 3: Degree distribution of packets: the left plot shows the Robust Soliton distribution (k = 100, δ = 0.1, c = 0.1), while the right plot shows the degree distribution of packets in the buffer after degree reduction, once 80 packets have been received. The circled high-degree fractions are removed by degree reduction.
Algorithm 1 Low-Complexity Degree Reduction
selected with degree less than or equal to d. Let this degree be d . Next, a packet is randomly selected with degree less than or equal to d − d . This process continues till the target degree is reached or a maximum number of steps. Note that the degree of the sum of two encoded packets need not be the sum of their respective degrees. While selecting a packet, degree reduction using equivalence classes is applied. The reason for applying this degree reduction is that the decoded set and the equivalence classes could have drastically changed compared to the time while this packet was received. The packets with degree less than or equal to a target degree d are randomly selected according to a pre-assigned distribution. This distribution is generated by normalizing weight assignments in the set S(d) = {y ∈ Y|deg(y) ≤ d}. Let z be the newly re-encoded packet. The basic idea of weight assignment is to give a weight to each y ∈ z such that these encoded packets are selected with low/zero probability in future transmissions. In particular, the initial value of W(y) is set to one (i.e., y has never been used to produce any re-encoded packet). Then, if y is selected to construct z (i.e., y ∈ z), W(y) is updated to W(y)e−c for some constant c. In our simulations, we use a simple version of this algorithm. We set c = ∞ at the beginning and c = 0 after a threshold on the fraction of packets decoded. In particular, we consider Size(D), the number of decoded packets, and set Size(D) = 14 k as the threshold. The intuition behind this strategy is that when an intermediate node recovers a fraction of native packets, it is capable of generating independent LT-like packets and this weight assignment can introduce undesirable time dependency. A pseudo-code of the distribution-based packet construction is presented in Algorithm 2.
E Input: yr+1 , {yi }ri=1 , D and {Ei }N i=1 NE r+1 Output: {yi }i=1 , D and {Ei }i=1 E 1: yr+1 ← reduce the degree of yr+1 by D and {Ei }N i=1 . 2: if deg(yr+1) = 1 then 3: {yi }r+1 i=1 ← BP decoding E 4: update {Ei }N i=1 and D 5: end if 6: if deg(yr+1) = 2 then E 7: update {Ei }N i=1 8: end if
window size be Sw (Sw = 50 is used in simulations). Then, DFNC compares a newly received packets with latest Sw received packets to avoid duplications. Although this detection mechanism is very conservative, it is efficient and powerful especially during the early period of content delivery. This is only an additional optimization and is not that crucial for large file sizes.
4.3 Degree-Constrained Re-encoding The degree-constrained re-encoding process includes two important steps. (i) In distribution-based packet reconstruction, a target degree d is drawn from RS distribution. Then, according to a weight distribution, a set of packets with degree less than or equal to d are selected such that the sum of these packets has degree equal to the target degree d. While a packet is selected, degree reduction is performed. Weights (used for the weight distribution) are updated based on the selected packets. (ii) In packet diversification, each native packet in the newly re-encoded packet is replaced randomly by another native packet in the decoded set D or the corresponding equivalence class. Next, we provide more details on these two steps.
Remark 4.2. Algorithm 2 creates more diversity in newly re-encoded packet z than the LTNC algorithm in [23]. For example, consider received packets as {y1 = x2 , y2 = x7 , y3 = x3 ⊕ x5 , y4 = x1 ⊕ x2 ⊕ x5 }. If d = 3 is selected twice, LTNC algorithm in [23] would build a fresh re-encoded packed z = y4 twice. This degrades the performance of the network code. In contrast, Algorithm 2 would build z = y4 , z = y1 ⊕ y3 , z = y2 ⊕ y3 or z = y3 ⊕ y4 .
4.3.1 Distribution-based Packet Construction When an intermediate node intends to produce a new packet to broadcast, it first picks a target degree d at random from RS distribution. Next, the node constructs a new packet with degree d as follows. First, a packet is randomly
36
Table 1: Complexity analysis of DFNC

DFNC                          Complexity (per symbol)
Source: LT encoding           O(log k)
Re-encoding: Algorithm 1      O(log^2 k)
Re-encoding: Algorithm 2      O(log^3 k)
Re-encoding: Algorithm 3      O(log^2 k)
Destination: BP decoding      O(log k)

Algorithm 2 Distribution-based Packet Construction
Input: k, d, S(d), W, i_max, D and {E_i}_{i=1}^{N_E}
Output: z, W
1: z ← ∅, i ← 0
2: while deg(z) < d and i < i_max do
3:   i ← i + 1
4:   if S(d) ≠ ∅ then
5:     z′ ← randomly select a packet y from set S(d) according to W
6:     z′ ← further degree reduction on z′ by equivalence classes {E_i}_{i=1}^{N_E}
7:     if deg(z) ≤ deg(z ⊕ z′) ≤ d then
8:       d ← d − deg(z ⊕ z′)
9:       z ← z ⊕ z′
10:    end if
11:  else
12:    break // re-select a degree d with S(d) ≠ ∅
13:  end if
14: end while
15: C_y(z) ← {y ∈ Y | y ∈ z} // used for weight update
16: if Size(D) ≤ (1/4)k then
17:   for all y ∈ C_y(z) do
18:     W(y) ← W(y)·e^{−c} // weight update
19:   end for
20: end if
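A simplified sketch of the construction loop follows; the candidate handling, weight map, and stopping rule are simplified relative to Algorithm 2 (notably, a strict degree-increase test is used to guarantee progress), and packets are again modeled as frozensets of native indices:

```python
import random

def construct_packet(d, candidates, W, i_max=100, rng=random):
    """Degree-constrained construction, simplified from Algorithm 2:
    repeatedly pick a weighted-random candidate (assumed already
    degree-reduced) whose XOR with the accumulator z strictly increases
    deg(z) without exceeding the target d, and merge it in. This is an
    illustrative sketch, not the paper's exact procedure (no weight
    update, no degree re-selection)."""
    z = set()
    for _ in range(i_max):
        if len(z) == d:
            break  # target degree reached
        pool = [y for y in candidates
                if len(z) < len(set(z) ^ set(y)) <= d]
        if not pool:
            break  # in DFNC, a new target degree d would be re-drawn
        y = rng.choices(pool, weights=[W[y] for y in pool], k=1)[0]
        z ^= set(y)  # merge over GF(2)
    return frozenset(z)
```

On the four packets of Remark 4.2 with d = 3, every random path reaches a degree-3 packet.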
Table 2: Complexity comparison (per symbol): DFNC vs. RLNC

                       RLNC     DFNC
Source encoding        O(k)     O(log k)
Re-encoding            O(k)     O(log^3 k)
Destination decoding   O(k^2)   O(log k)
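To make the asymptotic gap in Table 2 concrete, rough per-symbol operation counts for a 1000-packet file (order-of-magnitude proxies only; all constants ignored):

```python
import math

def per_symbol_ops(k):
    """Order-of-magnitude proxies for the Table 2 entries (constants
    ignored); purely illustrative arithmetic, not measured costs."""
    lg = math.log2(k)
    return {
        "RLNC decode":    k ** 2,       # Gaussian elimination
        "DFNC decode":    lg,           # BP decoding
        "RLNC re-encode": k,
        "DFNC re-encode": lg ** 3,
    }

ops = per_symbol_ops(1000)
# RLNC decoding is ~10^6 per symbol while DFNC is ~10: this is why the
# simulations could run DFNC at k = 1000 but not RLNC.
```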
4.3.2 Packet Diversification

Since degree reduction in Algorithm 1 is only partial, the packet construction process in Algorithm 2 could benefit from further packet diversification. The idea of diversification is to replace a native packet x in the newly re-encoded packet z with another native packet x′. This replacement can be performed efficiently if x and x′ are in the decoded set or in the same equivalence class (denoted by x′ ∼ x). In order to maximize packet diversity, the native packet x′ in the decoded set or the equivalence class is selected uniformly at random. For x′ chosen from the equivalence class, x ⊕ x′ can be generated by using currently available degree-2 encoded packets. Let the equivalence chain C_y(x, x′) of (x, x′) be a set of packets whose sum equals x ⊕ x′. Finding this equivalence chain for a pair (x, x′) is essentially a tree search, which is discussed later in the complexity analysis. Algorithm 3 captures the relevant steps.

Algorithm 3 Packet Diversification
Input: z, D and {E_i}_{i=1}^{N_E}
Output: z
1: for all x ∈ z do
2:   A ← {x′ ∈ X | x′ ∼ x and x′ ∉ z}
3:   if A ≠ ∅ then
4:     x′ ← uniformly at random select an x′ from A
5:     C_y(x, x′) ← {y ∈ Y | x ⊕ x′ = ⊕_{i=1}^{h} y_i, deg(y_i) ≤ 2} // construct equivalence chain
6:     z ← z ⊕ C_y(x, x′)
7:   end if
8: end for

In Figure 2, z = y1 ⊕ y3 = x1 ⊕ x3 ⊕ x4 is the constructed packet. Then, say, x9 is selected at random from equivalence class E1 in order to replace x4. As y5 = x4 ⊕ x9, we have z = y1 ⊕ y3 ⊕ y5 = x1 ⊕ x3 ⊕ x9 after diversification. In this example, C_y(x4, x9) = {y5}.

4.4 Complexity Analysis

Since LT encoding/decoding complexity is O(log k) per symbol, the source encoding and destination decoding complexity of DFNC is O(log k) per symbol. Next, we need to understand the complexity of re-encoding. First, we analyze the complexity of Algorithm 1. Let G(E_i) be an undirected graph constructed from an equivalence class E_i, whose vertices are the indexes of the native packets in E_i, with an edge between x and x′ whenever an encoded packet y = x ⊕ x′ is in the buffer. Since Algorithm 1 implicitly removes duplications of degree-2 packets on the fly, the constructed G(E_i) is an acyclic graph with average height O(log k) [24]. Therefore, at most O(log k) operations are needed to find the path (i.e., equivalence chain) between any pair of vertices in G(E_i) using a lowest-common-ancestor algorithm. Now, assume y_{r+1} with degree d is received. First, we need to find all pairs of native packets included in y_{r+1} that belong to an equivalence class. Based on the data structure used for constructing equivalence classes, this can be done by sorting the indexes of the native packets. The extreme case of degree reduction is that degree d is reduced to zero using the equivalence classes {E_i}_{i=1}^{N_E}; that is, d/2 tree searches are performed with total complexity (d/2)·O(log k). As y_{r+1} has an average degree of O(log k), the overall average complexity is upper bounded by O(log^2 k). Similarly, in Algorithm 2, the degree reduction step has average complexity O(log^2 k). The worst case for constructing a packet with average degree O(log k) is to merge only degree-1 packets, each of which is obtained by further degree reduction; therefore, the total complexity is O(log^3 k). In Algorithm 3, the extreme case is that every native packet in the re-encoded packet z is replaced by another native packet in the corresponding equivalence class. Then, the complexity is at most O(log^2 k), as the average degree of z is O(log k) and the complexity of finding an equivalence chain is O(log k). Table 2 summarizes the complexity of DFNC and compares it with RLNC.

5. PERFORMANCE EVALUATION
In this section, we evaluate the performance of DFNC through numerical simulations. We use a set of vehicle traces generated from two mobility models: Boston urban area and 2-D random walk. In order to show the advantages of DFNC, we compare it with RLNC and other epidemic algorithms used for content delivery. The simulation results show that DFNC has performance very close to RLNC and significantly outperforms the other algorithms.

[Figure 4: Map of Boston urban area (X and Y in meters)]

5.1 Simulation Setup

We evaluate DFNC using the following two mobility models. (i) Boston urban area model (Figure 4): We consider 923 vehicles moving with an average velocity of 20 m/s, where each vehicle has a time-invariant velocity selected uniformly at random from [15, 25] m/s. Each vehicle's movement at every intersection follows a Markov chain P = [P_ij], where P_ij is the probability of switching from directed road segment i to directed road segment j. We calibrate P using the real daily traffic volume data reported for each major road segment in Boston's Cambridge area [25]. (ii) Random-walk model: We consider a 1000 m × 1000 m 2-D area with a single RSU at the center. There are 300 vehicles moving with an average velocity of 20 m/s. Each vehicle's direction of motion is chosen uniformly at random every 20 seconds.

The physical layer is abstracted as an erasure channel with a packet erasure probability of γ. The medium access control (MAC) layer is modeled by continuous-time carrier sensing (CSMA/CA). A transmission is successful if the intended transmitter is within a certain range (transmission range of 200 m) and all other transmitters are beyond a certain range (interference range of 300 m) from the receiver.

We consider the distribution of a certificate revocation list (CRL), which is an important application for providing security in DSRC. Besides the distribution of CRLs, more delay-sensitive applications (e.g., on-road emergency messages and traffic jam notifications) fit this simulation setup as well. Based on the FBI's Uniform Crime Reports (2005) summarized in [2], the size of a CRL including the identifiers of all stolen vehicles in the top ten U.S. metropolitan areas is assumed to be above 300 KB. The packet size is assumed to be 1.5 KB. Therefore, we assume a file size of k ≥ 200 packets. We consider a typical broadcast interval of 100 ms and assume that only 10 ms is used for CRL distribution. The simulations follow the IEEE 802.11p MAC layer protocol. A coding rate of 11 Mbps is used. The erasure probability of packet transmission is γ = 0.1. The finite seeding period of the RSUs is assumed to be the time required for seeding three times the file size. All coding schemes operate in GF(2). As large storage memory is affordable, we do not consider any buffer storage constraint. In our simulations, we compare the performance of DFNC with the following relevant epidemic algorithms.

1. Random Linear Network Coding (RLNC): The RSU generates encoded packets by randomly combining native packets. Vehicles produce re-encoded packets by randomly combining up to a certain number of the encoded packets received so far. Limiting the number of combinations reduces the re-encoding complexity without loss of performance. For large k, this number can be set to log k + 20 [7]. However, as we consider small file sizes (k ≤ 1000) in our simulations, we set this number to k/2. Gaussian elimination is applied for decoding.

2. Luby Transform Network Coding (LTNC): LT encoding and decoding are applied at the source and destinations, respectively. The re-encoding and redundancy-detection techniques are implemented as described in [23]. Since it is not practical to assume feedback channels from neighbors in vehicular networks, we do not use the feedback scheme given in [23].

3. Luby Transform Random Forwarding (LTRF): The RSU generates LT encoded packets. Vehicles remove packet duplications in the buffer and select packets to broadcast at random. Since vehicles do not re-encode packets, there are many efficient algorithms to remove packet duplications. For example, tag each LT encoded packet, and an intermediate node removes a newly received packet if it carries the same tag as a packet already in the buffer.

4. Load-Balancing Random Forwarding (LBRF): As an extension of LTRF, in this scheme vehicles select packets at random according to a weight distribution. When a packet is selected for broadcast, a weight is assigned to it so that it is less likely to be selected again. Thus, in the long run, packets are selected uniformly.

5. Non-Coding Random Forwarding (NCRF): The sources inject native packets into the network. Similar to LTRF, vehicles first remove all packet duplications in the buffer and then randomly select native packets from the buffer to broadcast.
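The weighted selection behind LBRF can be sketched as follows; the damping constant `c` and the multiplicative update are illustrative assumptions (mirroring the e^{-c} weight update used in Algorithm 2), not parameters specified by the paper:

```python
import math
import random

def lbrf_select(buffer, weights, c=1.0, rng=random):
    """Load-balancing random forwarding (LBRF) selection, sketched:
    pick a packet with probability proportional to its current weight,
    then down-weight it (multiply by e^{-c}) so it is less likely to be
    selected again. Over many rounds, selections even out."""
    pkt = rng.choices(buffer, weights=[weights[p] for p in buffer], k=1)[0]
    weights[pkt] *= math.exp(-c)  # reduce chance of re-selection
    return pkt
```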
[Figure 5: Performance comparison: Single source, Boston model, 300-packet file (x-axis: time in seconds; y-axis: fraction of file-retrieved vehicles; curves: RLNC, DFNC, LBRF, LTRF, LTNC, NCRF)]

[Figure 6: Performance comparison: Four sources, Boston model, 300-packet file]

[Figure 7: Performance comparison: Four sources, Boston model, 1000-packet file (no RLNC curve)]

[Figure 8: Performance comparison: Single source, random-walk model, 300-packet file]
Note that, in our simulations, vehicles generate independent LT coded packets after the whole file is retrieved.
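A vehicle that has retrieved the whole file thus acts as a fresh LT source. Generating one such coded packet can be sketched as below (the target degree would be drawn from the RS distribution; here it is passed in directly):

```python
import random

def lt_encode(payloads, degree, rng=random):
    """Generate one LT-coded packet from a fully retrieved file:
    pick `degree` distinct native packets uniformly at random and XOR
    their payloads bytewise over GF(2). Returns the coefficient support
    (set of native indices) together with the coded payload."""
    idx = rng.sample(range(len(payloads)), degree)
    out = bytearray(len(payloads[0]))
    for i in idx:
        for j, b in enumerate(payloads[i]):
            out[j] ^= b
    return set(idx), bytes(out)
```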
5.2 Simulation Results

First, we plot the performance of the different epidemic algorithms using the Boston urban area mobility model. The Boston urban area with the roads and sources is shown in Figure 4. In Figure 5 and Figure 6, we consider the scenarios of seeding from a single RSU (solid circle) and four RSUs (four circles), respectively. The x-axis shows time (in seconds) and the y-axis shows the proportion of vehicles that have successfully retrieved the whole file. The decoding delay distribution of DFNC is slightly (< 10%) worse than that of RLNC, but DFNC significantly outperforms the other coding and non-coding epidemic algorithms. Interestingly, we see that LTNC and LTRF have similar performance although LTNC implements a re-encoding process while LTRF does not. This is because LTRF can efficiently remove all duplications in the buffer, as it does not re-encode the received packets. Recall that, in [23], the redundancy detection of LTNC is only implemented on received packets with degree less than or equal to three. Therefore, this detection is not efficient, especially during the early seeding period. As expected, LBRF slightly outperforms LTRF due to the load-balancing approach. Another observation is that the performance of RLNC (or DFNC) does not improve much when more RSUs are deployed, while that of the other algorithms does. This is due to the significant coding benefits obtained from RLNC (or DFNC). Hence, by exploiting the coding benefits of RLNC (or DFNC), no extra RSUs are necessary for content delivery, which minimizes the infrastructure cost.

Figure 7 shows the performance comparison with a large file size of 1000 packets. Four RSUs seed LT coded packets and the Boston mobility model is used. In this case, we could not obtain the performance of RLNC due to its very high decoding complexity. Since DFNC does not have such high complexity, it was possible to obtain its performance through simulations. DFNC significantly outperforms the other algorithms. Figure 8 shows the performance comparison using the random-walk mobility model and a file size of 300 packets. Similar to the results in the Boston model, DFNC offers large throughput benefits at low complexity. This also demonstrates that the performance of DFNC is not limited to specific mobility models.

6. CONCLUSION

Motivated by the emerging potential for content delivery using DSRC in vehicular networks, we develop a novel coding scheme, distributed-fountain network code (DFNC). DFNC uses the idea of fountain codes and builds on ideas from random linear network coding (RLNC) to extend the application of fountain codes to broadcasting or multicasting with relaying. DFNC reduces the content dissemination delay at order-wise lower complexity compared to RLNC. It uses a novel approach of degree reduction of received packets and degree-constrained linear combination of these degree-reduced packets. Through extensive simulations, we demonstrate that DFNC performance is very close to RLNC performance and significantly outperforms other relevant epidemic coding schemes. Even though this paper focuses on content delivery for vehicular networks, we believe that DFNC can be used in other important applications as well.

Acknowledgment

The authors would like to thank Bo (Rambo) Tan, Sundar Subramanian and Tom Richardson (Qualcomm Research), Zhiyuan Yan (Lehigh University), and Sanjay Shakkottai (University of Texas at Austin) for the insightful discussions.

7. REFERENCES

[1] "Standard Specification for Telecommunications and Information Exchange Between Roadside and Vehicle Systems - 5 GHz Band Dedicated Short Range Communications (DSRC) Medium Access Control (MAC) and Physical Layer (PHY) Specifications", Sept. 2003.
[2] P. Papadimitratos, G. Mezzour, and J.-P. Hubaux, "Certificate Revocation List Distribution in Vehicular Communication Systems", in Proc. VANET, 2008.
[3] K. P. Laberteaux, J. J. Haas, and Y.-C. Hu, "Security Certificate Revocation List Distribution for VANET", in Proc. VANET, 2008.
[4] M. Nowatkowski and H. Owen, "Scalable Certificate Revocation List Distribution in Vehicular Ad Hoc Networks", in Proc. SWiM, pp. 54-58, 2010.
[5] R. Ahlswede, N. Cai, S.-Y. Li, and R. Yeung, "Network Information Flow", IEEE Transactions on Information Theory, vol. 46, no. 4, pp. 1204-1216, Jul. 2000.
[6] T. Ho, M. Médard, R. Koetter, D. Karger, M. Effros, J. Shi, and B. Leong, "A Random Linear Network Coding Approach to Multicast", IEEE Transactions on Information Theory, vol. 52, no. 10, pp. 4413-4430, Oct. 2006.
[7] M. Wang and B. Li, "How Practical Is Network Coding?", in Proc. IWQoS, pp. 274-278, 2006.
[8] U. Lee, J.-S. Park, J. Yeh, G. Pau, and M. Gerla, "CodeTorrent: Content Distribution using Network Coding in VANETs", in Proc. MobiShare, 2006.
[9] D. M. Chiu, R. W. Yeung, J. Huang, and B. Fan, "Can Network Coding Help in P2P Networks?", in Proc. NetCod, 2006.
[10] J.-S. Park, M. Gerla, D. S. Lun, Y. Yi, and M. Médard, "CodeCast: a Network-Coding-Based Ad Hoc Multicast Protocol", IEEE Wireless Communications, vol. 13, no. 5, pp. 76-81, Oct. 2006.
[11] S.-H. Lee, U. Lee, K.-W. Lee, and M. Gerla, "Content Distribution in VANETs using Network Coding: The Effect of Disk I/O and Processing O/H", in Proc. SECON, pp. 117-125, 2008.
[12] P. A. Chou, Y. Wu, and K. Jain, "Practical Network Coding", in Proc. Annual Allerton Conference on Communication, Control and Computing, pp. 40-49, 2003.
[13] P. Maymounkov, N. J. A. Harvey, and D. S. Lun, "Methods for Efficient Network Coding", in Proc. Annual Allerton Conference on Communication, Control and Computing, 2006.
[14] D. Silva, W. Zeng, and F. R. Kschischang, "Sparse Network Coding with Overlapping Classes", in Proc. NetCod, pp. 74-79, 2009.
[15] A. Heidarzadeh and A. H. Banihashemi, "Overlapped Chunked Network Coding", in Proc. ITW, 2010.
[16] Y. Li, E. Soljanin, and P. Spasojevic, "Effects of the Generation Size and Overlap on Throughput and Complexity in Randomized Linear Network Coding", IEEE Transactions on Information Theory, vol. 57, no. 2, pp. 1111-1123, Feb. 2011.
[17] S. H. Yang and R. W. Yeung, "Coding for a Network Coded Fountain", in Proc. ISIT, pp. 2647-2651, 2011.
[18] M. Luby, "LT Codes", in Proc. FOCS, 2002.
[19] A. Shokrollahi, "Raptor Codes", IEEE Transactions on Information Theory, vol. 52, no. 6, pp. 2551-2567, Jun. 2006.
[20] N. Thomos and P. Frossard, "Raptor Network Video Coding", in Proc. MV, pp. 19-24, 2007.
[21] R. Gummadi and R. Sreenivas, "Relaying a Fountain Code Across Multiple Nodes", in Proc. ITW, pp. 149-153, 2008.
[22] A. Kamra, V. Misra, J. Feldman, and D. Rubenstein, "Growth Codes: Maximizing Sensor Network Data Persistence", in Proc. ACM SIGCOMM, pp. 255-266, 2006.
[23] M. Champel, K. Huguenin, A. Kermarrec, and N. Le Scouarnec, "LT Network Codes", in Proc. ICDCS, pp. 536-546, 2010.
[24] F. Chung and L. Lu, "The Average Distances in Random Graphs with Given Expected Degrees", Proc. National Academy of Sciences, pp. 15879-15882, 2002.
[25] "MassDOT", Internet: www.mhd.state.ma.us, 2012.