The SDN Approach for the Aggregation/Disaggregation of ... - MDPI

5 downloads 0 Views 4MB Size Report
Jun 25, 2018 - Figure 1 illustrates the simplified IoTtalk network architecture. ..... G/D/1 and the G/M/1 queues built in [18–21], and the details are omitted.
sensors Article

The SDN Approach for the Aggregation/Disaggregation of Sensor Data Yi-Bing Lin *

ID

, Shie-Yuan Wang, Ching-Chun Huang and Chia-Ming Wu

Department of Computer Science, National Chiao Tung University, Hsinchu 300, Taiwan; [email protected] (S.-Y.W.); [email protected] (C.-C.H.); [email protected] (C.-M.W.) * Correspondence: [email protected]; Tel.: +886-3-513-1500 Received: 23 May 2018; Accepted: 21 June 2018; Published: 25 June 2018

 

Abstract: In many Internet of Things (IoT) applications, large numbers of small sensor data are delivered in the network, which may cause heavy traffics. To reduce the number of messages delivered from the sensor devices to the IoT server, a promising approach is to aggregate several small IoT messages into a large packet before they are delivered through the network. When the packets arrive at the destination, they are disaggregated into the original IoT messages. In the existing solutions, packet aggregation/disaggregation is performed by software at the server, which results in long delays and low throughputs. To resolve the above issue, this paper utilizes the programmable Software Defined Networking (SDN) switch to program quick packet aggregation and disaggregation. Specifically, we consider the Programming Protocol-Independent Packet Processor (P4) technology. We design and develop novel P4 programs for aggregation and disaggregation in commercial P4 switches. Our study indicates that packet aggregation can be achieved in a P4 switch with its line rate (without extra packet processing cost). On the other hand, to disaggregate a packet that combines N IoT messages, the processing time is about the same as processing N individual IoT messages. Our implementation conducts IoT message aggregation at the highest bit rate (100 Gbps) that has not been found in the literature. We further propose to provide a small buffer in the P4 switch to significantly reduce the processing power for disaggregating a packet. Keywords: aggregation; disaggregation; Internet of Things; programmable switch; P4; sensor data; Software Defined Networking

1. Introduction Many Internet of Things (IoT) technologies have been used in applications for people flow, traffic flow, logistics flow, money flow, smart home, interactive art design, and so on. To promote IoT applications on campus, National Chiao Tung University (NCTU) is deploying several smart campus applications. These applications are created based on IoTtalk [1–4], an IoT application management platform that can be installed on top of IoT protocols such as AllJoyn [5], OM2M [6], OpenMTC [7], and an arbitrary proprietary protocol. IoTtalk allows a designer to quickly establish connections and meaningful interactions between IoT devices. Figure 1 illustrates the simplified IoTtalk network architecture. In this client-server architecture, the IoT devices (the clients) are connected to the IoTtalk engine (the server; Figure 1(1)) in the Internet through wireline or wireless technologies. When an IoT device is connected to the IoTtalk server, the server automatically creates the network application (Figure 1(2)) for the device to interact with other devices. The IoT device can be a group of sensors (such as color sensor and temperature sensor) or controllers (such as switches and buttons), which is called an input device (Figure 1(3)). The device is installed as a device application (DA; Figure 1(4)) that collects the sensor data generated by the sensor/controller of the device. The DA then sends the sensor data to the IoTtalk Sensors 2018, 18, 2025; doi:10.3390/s18072025

www.mdpi.com/journal/sensors

Sensors Sensors 2018, 2018, 18, 18, xx FOR FOR PEER PEER REVIEW REVIEW

22 of of 15 15

temperature sensor) or temperature or controllers controllers (such (such as as switches switches and and buttons), buttons), which which is is called called an an input input device device Sensors 2018, 18, sensor) 2025 2 of 15 (Figure 1(3)). The device is installed as a device application (DA; Figure 1(4)) that collects (Figure 1(3)). The device is installed as a device application (DA; Figure 1(4)) that collects the the sensor sensor data data generated generated by by the the sensor/controller sensor/controller of of the the device. device. The The DA DA then then sends sends the the sensor sensor data data to to the the IoTtalk IoTtalk for processing (see path (4) → (2) in Figure 1). Similarly, an IoT device is called an output device server for processing (see path (4) → (2) in Figure 1). Similarly, an IoT device is called server for processing (see path (4) → (2) in Figure 1). Similarly, an IoT device is called an an output output (Figure 1(5)) if it is a group of actuators (e.g., robot arm). The DA of the output device receives data device (Figure 1(5)) if it is a group of actuators (e.g., robot arm). The DA of the output device receives device (Figure 1(5)) if it is a group of actuators (e.g., robot arm). The DA of the output device receives fromfrom the IoTtalk engine (see path (2) →(2) (6)→ in(6) Figure 1) to 1) drive its actuators. data the engine (see in its data from the IoTtalk IoTtalk engine (see path path (2) → (6) in Figure Figure 1) to to drive drive its actuators. actuators.

Figure Figure 1. 1. Simplified Simplified IoTtalk IoTtalk network network architecture. architecture.

Based IoTtalk, we have developed outdoor applications using LoRA-based PM2.5 detection using LoRA-based PM2.5 detection [8] Based on on IoTtalk, IoTtalk,we wehave havedeveloped developedoutdoor outdoorapplications applications using LoRA-based PM2.5 detection [8] and NB-IoT-based parking (Figure 2), an emergency button, and dog tracking [9]. Besides outdoor and NB-IoT-based parking (Figure 2), an emergency button, and dog tracking [9]. Besides [8] and NB-IoT-based parking (Figure 2), an emergency button, and dog tracking [9]. Besides outdoor applications, applications, we we have have also also created created indoor indoor IoT IoT applications applications at at the the guest guest houses, houses, the the student student dorms, dorms, applications, we have also created indoor IoT applications at the guest houses, the student dorms, and research laboratory called Room 311 in the MIRC building (Figure 3). More than 100 sensors a research laboratory called Room 311 in the MIRC building (Figure 3). More than 100 sensors and and a research laboratory called Room 311 in the MIRC building (Figure 3). More than 100 sensors and actuators are installed in Room 311, and data are collected every day for machine-learning actuators are installed in Room 311, and data aredata collected every day for machine-learning analysis. and actuators are installed in Room 311, and are collected every day for machine-learning analysis. In addition, the smart home appliances are being installed in the newly built Graduate In addition, the smart home appliances are being installed in the newly built Graduate Dormitory 3. analysis. In addition, the smart home appliances are being installed in the newly built Graduate Dormitory 3. This student dorm will accommodate 1200 graduate students. A large amount of data This student dorm will accommodate 1200 graduate students. A large amount of data (IoT messages) Dormitory 3. This student dorm will accommodate 1200 graduate students. A large amount of data (IoT have collected in 311, and Dormitory 3. IoT have messages) been collected Room 311, MIRC, and Graduate Dormitory 3. These IoT messages are UDP (IoT messages) haveinbeen been collected in Room Room 311, MIRC, MIRC, and Graduate Graduate Dormitory 3. These These IoT messages are UDP packets with small payloads. For example, inside an elevator car, we have installed packets with small payloads. For example, inside an elevator car, we have installed a 3D accelerometer, messages are UDP packets with small payloads. For example, inside an elevator car, we have installed aa barometer, 3D aa barometer, and thermometer. Outside the the is to and a thermometer. Outside car, the motor is attached to a biaxial vibration accelerator. 3D accelerometer, accelerometer, barometer, and aathe thermometer. Outside the car, car, the motor motor is attached attached to aa biaxial vibration accelerator. volume of by about 12 The volume of data producedThe by these sensors is produced about 12 MB/minute. Thereis about 100 elevator biaxial vibration accelerator. The volume of data data produced by these these sensors sensors isare about 12 MB/minute. MB/minute. There are about 100 elevator cars on campus, which produce data at the speed of 20 MB/second. cars on campus, which produce data at the speed of 20 MB/second. There are over 20 IoT applications There are about 100 elevator cars on campus, which produce data at the speed of 20 MB/second. There There are over applications in and these applications generate large volumes of exercised in IoT NCTU, and theseexercised applications generate large volumes of small-packet path are over 20 20 IoT applications exercised in NCTU, NCTU, and these applications generatedata largethrough volumes of small-packet data through path (4) → (2) in Figure 1. (4) → (2) in Figure 1. small-packet data through path (4) → (2) in Figure 1.

Figure Smart parking Figure 2. 2. Smart Smart parking in in NCTU NCTU campus. campus.

delivered from devices to To To reduce reduce the the number number of of sensor sensor data data (IoT (IoT message) message) delivered delivered from the the IoT IoT devices devices to the the IoT IoT promising to aggregate server, a approach is several small IoT messages into a large packet server, a promising approach is to aggregate several small IoT messages into a large packet before before they are When the the packets packets arrive arrive at at the the destination destination (in (in our our they are delivered delivered through through the the network network [10]. [10]. When When the packets arrive at the destination (in example, example, the the IoTtalk IoTtalk server server is is installed installed in in the the NCTU’s NCTU’s private private cloud cloud at at the the computer computer center), center), they they are are In this paper, IoT data can be effectively disaggregated into the original IoT messages. In this paper, we show how IoT data can be effectively disaggregated into the original IoT messages. In this paper, we show how IoT data can be effectively aggregated and disaggregated. Specifically, we consider the software defined networking (SDN) approach to design IoT data aggregation and disaggregation for IoTtalk.

Sensors 2018, 18, x FOR PEER REVIEW

3 of 15

aggregated Sensors 2018, 18,and 2025disaggregated.

Specifically, we consider the software defined networking (SDN) 3 of 15 approach to design IoT data aggregation and disaggregation for IoTtalk.

Figure 3. Room 311 311 in in the the MIRC MIRC building. building. Figure 3. Room

The paper is organized as follows. Section 2 introduces the programmable SDN switches. The paper is organized as follows. Section 2 introduces the programmable SDN switches. Section Section 3 describes how programmable SDN switches can be utilized for sensor data aggregation and 3 describes how programmable SDN switches can be utilized for sensor data aggregation and disaggregation. Section 4 shows how to improve disaggregation performance for the SDN switches. disaggregation. Section 4 shows how to improve disaggregation performance for the SDN switches. Section 5 compares our solution with the previous studies. Specifically, we show an aggregation Section 5 compares our solution with the previous studies. Specifically, we show an aggregation solution based on SDN controllers. Finally, Section 6 concludes our work and points out a future solution based on SDN controllers. Finally, Section 6 concludes our work and points out a future research direction. research direction. 2. Programmable SDN Switch 2. Programmable SDN Switch To resolve the sensor data delivery issue, Software Defined Networking (SDN) is considered as a To resolve the sensor data delivery issue, Software Defined Networking (SDN) is considered as promising solution. SDN decouples the data and the control planes [11]. Recently, programmability of a promising solution. SDN decouples the data and the control planes [11]. Recently, programmability the data plane has become one of the most desirable SDN features, which allows the user to describe of the data plane has become one of the most desirable SDN features, which allows the user to how to process the packets in an SDN switch. This task can be achieved by using tools such as describe how to process the packets in an SDN switch. This task can be achieved by using tools such Programming Protocol-Independent Packet Processor (P4). P4 is a reconfigurable, multi-platform, as Programming Protocol-Independent Packet Processor (P4). P4 is a reconfigurable, multi-platform, protocol and target-independent packet processing language, which is used to facilitate dynamically protocol and target-independent packet processing language, which is used to facilitate dynamically programmable and extensible packet processing in the SDN data plane [12–14]. In SDN, a switch uses programmable and extensible packet processing in the SDN data plane [12–14]. In SDN, a switch uses a set of “match+action” flow tables to apply rules for packet processing, and P4 provides an efficient a set of “match+action” flow tables to apply rules for packet processing, and P4 provides an efficient way to configure the packet processing pipelines to achieve this goal. In a typical communications way to configure the packet processing pipelines to achieve this goal. In a typical communications network, a packet consists of a packet header and the payload, and the header includes several fields network, a packet consists of a packet header and the payload, and the header includes several fields defined by the network protocol. A P4 program describes how packet headers are parsed and their defined by the network protocol. A P4 program describes how packet headers are parsed and their fields are processed by using the flow tables, in which the matched operations may modify the header fields are processed by using the flow tables, in which the matched operations may modify the header fields of packets or the content of the metadata registers. fields of packets or the content of the metadata registers. Figure 4 illustrates the P4 abstract forwarding model elaborated in [15] (the conference paper Figure 4 illustrates the P4 abstract forwarding model elaborated in [15] (the conference paper version of this paper), and the details are reiterated here for the reader’s benefit. In this model, version of this paper), and the details are reiterated here for the reader’s benefit. In this model, a a configuration file describes all components declared in the P4 program, including the parse graph configuration file describes all components declared in the P4 program, including the parse graph (Figure 4(1)), the table configuration (Figure 4(2)), and an imperative control program that defines (Figure 4(1)), the table configuration (Figure 4(2)), and an imperative control program that defines the order of the tables to be applied in the processing flow (Figure 4(3)). This configuration file is the order of the tables to be applied in the processing flow (Figure 4(3)). This configuration file is loaded into the switch to specify the parser (Figure 4(4)) followed by the flow tables in the ingress loaded into the switch to specify the parser (Figure 4(4)) followed by the flow tables in the ingress and the egress pipelines (Figure 4(5),(6)). The ingress pipeline generates an egress specification that and the egress pipelines (Figure 4(5),(6)). The ingress pipeline generates an egress specification that determines the set of ports (and number of packet instances for each port) to which the packet will be determines the set of ports (and number of packet instances for each port) to which the packet will sent. Between these two pipelines, there is a traffic manager (TM; see Figure 4(7)) that performs the be sent. Between these two pipelines, there is a traffic manager (TM; see Figure 4(7)) that performs queueing functions for the packets. At the TM, the packets are queued before being sent to the egress the queueing functions for the packets. At the TM, the packets are queued before being sent to the pipeline, through which the packet header may be further modified. At the deparser (Figure 4(8)), egress pipeline, through which the packet header may be further modified. At the deparser (Figure the headers are assembled back to a well-formed packet. 4(8)), the headers are assembled back to a well-formed packet. The parser extracts the header fields and the payload of an incoming packet following the parse The parser extracts the header fields and the payload of an incoming packet following the parse graph defined in the P4 program. In P4, the parser state transitions are driven by the values of the graph defined in the P4 program. In P4, the parser state transitions are driven by the values of the fields in the headers. fields in the headers. Based on the above P4 forwarding model, we propose a novel approach that utilizes the header Based on the above P4 forwarding model, we propose a novel approach that utilizes the header manipulation of the P4 switch to aggregate small packets into large packets and disaggregate large manipulation of the P4 switch to aggregate small packets into large packets and disaggregate large packets back into small packets. We describe our approach through the software switch behavior packets back into small packets. We describe our approach through the software switch behavior model bmv2, and actually implemented our algorithm in two EdgeCore P4 switches, which use barefoot’s Tofino P4 chips. The bmv2 framework enables the developers to implement P4 applications

Sensors Sensors 2018, 2018, 18, 18, xx FOR FOR PEER PEER REVIEW REVIEW

44 of of 15 15

Sensors 2018, 18, 2025

4 of 15

model bmv2, and actually implemented our algorithm in two EdgeCore P4 switches, which use barefoot’s Tofino P4 chips. The bmv2 framework enables the developers to implement P4 on a softwareon switch. This framework is framework the de-factoisarchitecture forarchitecture most developers, as it is roughly applications a software switch. This the de-facto for most developers, as it is roughly equivalent the abstract forwarding model illustrated in Figure TheisTofino is equivalent to the abstract to forwarding model illustrated in Figure 4. The Tofino 4. chip based chip on the based onIndependent the ProtocolSwitch Independent Switch Architecture (PISA) and canusing be programmed P4. Protocol Architecture (PISA) and can be programmed P4. With theusing Capilano With the Capilano Software Development (SDE) P4 the compiler toolchain, the developers Software Development Environment (SDE) Environment P4 compiler toolchain, developers can compile and run can compile and P4 programs Tofino the line per rateport. up toThere 100 Gbps perports port.inThere P4 programs at arun Tofino P4 switchat atathe line P4 rateswitch up to at 100 Gbps are 64 a P4 are 64which ports in a P4 chip, sum upchip, to 6.5which Tbps.sum up to 6.5 Tbps.

Figure Figure 4. 4. The The P4 P4 abstract abstract forwarding forwarding model. model. Figure 4. The P4 abstract forwarding model.

3. Performance of IoT Packet Aggregation and Disaggregation in in the the P4 P4 Switch Switch The architecture architecture will will be illustrated in We have integrated IoTtalk the SDN SDN environment. environment. The have integrated IoTtalk in in the Section 5 later. In this section, Figure 5 illustrates a simplified general network architecture for Section 5 later. In this section, Figure 5 illustrates a simplified general network architecture for IoT IoT In this this architecture, architecture, K K IoT devices (Figure 5(1)) send small packet aggregation and disaggregation. disaggregation. In small messages) to the IoT server server (Figure (Figure 5(4)) 5(4)) though though the the P4 P4 switches switches (Figure (Figure 5(2),(3)). 5(2),(3)). packets (the IoT messages) When N consecutive IoT messages arrive at the first P4 switch, they are aggregated into a large packet. When N consecutive IoT messages arrive at the first P4 switch, they are aggregated into a large packet. After aggregation, aggregation, the the first first P4 P4 switch switch sends sends the the aggregated aggregated packet packetto to the the second second P4 P4 switch. switch. The The second second switchthen thendisaggregates disaggregates packet original IoT messages, and sends them to the IoT P4 switch thethe packet intointo the the original IoT messages, and sends them to the IoT server. server. Between these two P4 switches, a wide areanot network shown in the figure. Between these two P4 switches, there maythere be a may widebe area network shownnot in the figure.

Figure disaggregation. Figure 5. 5. The The P4 P4 switch switch network network architecture architecture for and disaggregation. disaggregation. for IoT IoT packet packet aggregation aggregation and and

Sensors 2018, 18, 2025

5 of 15

In our approach, the first P4 switch aggregates N IoT messages into one packet. Denote the aggregated packet as an N-packet. Figure 6 shows the packet formats of an IoT message and an N-packet for N = 8. For illustration purposes, we assume that an IoT message (Figure 6a) is a UDP packet with a 16-byte payload. In our design, the 16-byte payload is treated as a header called msg (Figure 6b). Between the UDP header and the payload is a 6-byte flag header (Figure 6c), of which the “type” field is used to indicate if the packet is an IoT message or an N-packet. The flag header is declared as header_type flag_t { fields { type: 8; padding: 40; } } header flag_t flag; Note that the flag header has a 40-bit “padding” field to make the minimum length of an Ethernet frame 64 bytes. To meet this requirement, we purposely pad the length of the flag header to be 6 bytes so that together with a 14-byte Ethernet header, a 20-byte IP header, an 8-byte UDP header, and a 16-byte msg, the total length of an IoT message is 64 bytes. For an N-packet (Figure 6d), we consider the payloads aggregated from the N consecutive IoT messages as a header called agg (Figure 6e). That is, the n-th header field msgn is the payload of the n-th IoT message (in this example, 1 ≤ n ≤ N = 8). Therefore, the agg header is declared as header_type agg_msg_t { fields { msg1: 128; msg2: 128; msg3: 128; msg4: 128; msg5: 128; msg6: 128; msg7: 128; msg8: 128; } } header agg_msg_t agg; We treat the payloads of a packet as a header and showed how to use the registers and the pipelines of the first P4 switch to conduct header operations to transform the payloads of N IoT messages (Figure 6b) into the payload of an aggregated packet (Figure 6e). When an IoT message arrives, the payload is parsed as a header. At Figure 4(4), the parser processes the Ethernet, the IP, and the UDP header (Figure 6f). When the flag_t header (Figure 6c) is encountered, the parser checks the value of the “type” field, then the parser extracts either the msg header (if “type” is 0xfa) or the agg header (if “type” is 0xfc). The P4 switch checks if the incoming packet is an IoT message (i.e., whether it has a valid msg header). If so, the payload of the message is considered as a header field that is saved in the register. The IoT message is then dropped. The P4 switch continues to collect the payloads of the incoming IoT messages in the register. When N payloads have been collected, these saved payloads are considered as header fields to produce the aggregated payload (Figure 6e). The aggregated packet is then sent out of the switch.

Sensors 2018, 18, x FOR PEER REVIEW

6 of 15

Sensors 2018, 18, 2025 Sensors 2018, 18, x FOR PEER REVIEW

6 of 15 6 of 15

Figure 6. The packet formats of an IoT message and an N-packet (N = 8). Figure Thedevice packetformats IoT message and N-packet Figure packet ofof anan IoT message and anan N-packet (N(N = =8).8).process with the In Figure 5, for 1 ≤ k6.6. ≤The K, kformats generates the packets following an arbitrary rate k . Then, the net arrivals of the packets to the first P4 switch form a Poisson process with the InInFigure 1≤ Figure5,5,for for 1 ≤k k≤≤ K, K, device devicekkgenerates generatesthe thepackets packetsfollowing followingan anarbitrary arbitraryprocess processwith withthe the K  the  arrival . Note that the Poisson property of merged input streams is observed in rate Then, net arrivals of the packets to the first P4 switch form a Poisson process with the  k rateλk.rate . Then, the net arrivals of the packets to the first P4 switch form a Poisson process with the k 1 k K arrival rate λ = λ . Note that the Poisson property of merged input streams is observed in the ∑ k =IoT 1K k traffic in Room 311, which is consistent with the superposition property of multiple sources of  ofIoT k . Note arrival rate that the Poisson property of merged input streams is observedofinthe the multiple sources Room 311, which is consistent with the superposition k[16]. 1 traffic the Poisson process We in first show that the inter-arrival times of the N-packets property arriving at the Poisson process [16]. first show that the311, inter-arrival timesthe of the N-packets secondof multiple sources of We IoTan traffic in Room which is consistent with the/ superposition property second P4 switch follow N-stage Erlang distribution with mean N : arriving at the P4the switch follow an N-stage Erlang with the mean N/λ: Poisson process [16]. We first distribution show that the inter-arrival times of the N-packets arriving at the  t N second P4 switch follow an N-stage Erlang distribution the mean N /  :  N t N 1ewith f N  t N  λ N t NNN −1 e−λt N f N (t N ) =  1 !  NN − ( N t N N 11e)! tN fN  tN   and Laplace transform transform is is  N  1 ! and its its Laplace N

and its Laplace transform is f * s Z∞  f t e  stN dt    λ  N (1)   tN 0f NN(tNN)e −stN dtNN = s    ∗ (1) fN (s)N =  s +λ N t N =0      st N * fN  s    f  t N  e dt N   (1) Based on the Poisson property of merged sources, we have conducted discrete event t N  0 N input  have  s we Based on the Poisson property of merged input sources, conducted discrete event simulation to model the packet departures of the first P4 switch to see if the packet departures (i.e., simulation toon model the packet departures of the first P4sources, switch towe seehave if theconducted packet departures (i.e., Basedarrivals theofPoisson property of merged event the packet the second P4 switch) followinput the Erlang-N distribution. The firstdiscrete P4 switch is the packet arrivals of the second P4 switch) follow the Erlang-N distribution. The first P4 switch simulation to of the first P4 to see if the packet (i.e., represented bymodel either the an packet M/M/1departures or an M/G/1 queue in switch the simulation. Figure 7 departures compares the isthe represented by either an second M/M/1P4orswitch) an M/G/1 queue in the simulation. Figure compares theis packetresults arrivals of histogram the the Erlang-N distribution. The 7first switch simulation (the bars) againstfollow the Erlang density function (the curves) inP4 which the simulation results (the histogram bars) against the Erlang density function (the curves) in which represented by either an M/M/1 or an M/G/1 queue in the simulation. Figure 7 compares the service times of the P4 switch are exponentially distributed (Figure 7a–c) or fixed (Figure 7d). In the service times of(the the histogram P4 switch bars) are exponentially distributed (Figure 7a–c) orcurves) fixed (Figure 7d). simulation results against the Erlang density function (the in which the Figure 7a, the service rate   1.1 and N = 5. The figure indicates that the departure process does Inservice Figuretimes 7a, theofservice µ = are 1.1λexponentially and N = 5. The figure indicates that the departure process7d). doesIn the P4rate switch distributed (Figure fixed (Figure  1.1λ from 1.17a–c)  (Figure to or2.2  (Figure 7b) the or fit Erlang-5distribution. distribution. Ifwewe increase the service rate fitFigure Erlang-5 If increase the service rate µ from to 2.2λ 7b) or increase 7a, the service rate   1.1 and N = 5. The figure indicates that the departure process does increase the factor aggregation N from7c), 5 tothe8 departure (Figure 7c), the departure process fits the Erlang aggregation N fromfactor 5 to 8 (Figure process fits the Erlang distribution better.  departure from 1.1fits  to 2.2  (Figure 7b) or fit Erlang-5better. distribution. If we increase the service ratethe distribution Also, if the service times are fixed, then the Erlang distribution Also, if the service times are fixed, then the departure fits the Erlang distribution better (Figure 7d). increase thefor aggregation factor N same from 5 toIoT 8 packets (Figure 7c), the departure fitspaper), the better (Figure 7d). Wekind noteof that for of (e.g., thepaper), IoT messages in this the We note that same packets (e.g.,kind the messages in this the process service times are Erlang fixed distribution better. Also, if the service times are fixed, then the departure fits the Erlang distribution service times are fixed in a P4 switch. in a P4 switch. better (Figure 7d). We note that for same kind of packets (e.g., the IoT messages in this paper), the service times are fixed in a P4 switch.

(a) (a)

(b) Figure 7. Cont.

(b)

Sensors 2018, 18, 2025 Sensors 2018, 18, x FOR PEER REVIEW

7 of 15 7 of 15

Sensors 2018, 18, x FOR PEER REVIEW

7 of 15

(d)

(c)

(d) exponential (  1.1 , N (c) Figure Figure 7. 7. Validation Validation of of the the Erlang Erlangdeparture departureprocess processof ofthe thefirst firstP4 P4switch: switch:(a) (a) exponential (µ = 1.1λ, =N 5); (b)(b) exponential (of ,1.1λ, N = 8); (c)8); exponential (of the N =switch: 5); (d) fixed ,= N 1.1λ, = 5).  1.1   2.2 =, 2.2λ,  1.1  1.1 = 5); exponential (µ = N= (c)process exponential (µfirst Nand = 5); and (d) (fixed Figure 7. Validation the Erlang departure P4 (a) exponential ((µ ,N  N = 5). = 5); (b) exponential (  1.1 , N = 8); (c) exponential (   2.2  , N = 5); and (d) fixed (  1.1 , N = 5).

The switches typically handle the packets without queueing, and its maximum processing The typically handle packets without its maximum processing capability per output port is given bythe its “line rate”. When queueing, the packetsand arrive the line rate, no The switches switches typically handle the packets without queueing, and itswithin maximum processing capability per output port is given by its “line rate”. When the packets arrive within the line rate, packet is dropped at the switch. We have actually implemented our algorithm in the EdgeCore P4 capability per output port is given by its “line rate”. When the packets arrive within the line rate, no no packet is dropped at the switch. We have actually implemented our algorithm in the EdgeCore switches based on Barefoot Tofino chip (see Figure 8a). packet is dropped at the switch. We have actually implemented our algorithm in the EdgeCore P4 P4 switches based on Barefoot Tofino chip (see Figure 8a). switches based on Barefoot Tofino chip (see Figure 8a).

(b) (a)

(b)

(a)environment with 2 EdgeCore P4 switches and the TestCenter: (a) physical Figure 8. The experimental connection of the EdgeCoreenvironment switch and the TestCenter andP4 (b) functional diagram. (a) physical Figure8. 8. The The experimental with EdgeCore switches andblock the TestCenter: TestCenter: Figure experimental environment with 22 EdgeCore P4 switches and the (a) physical connectionof ofthe theEdgeCore EdgeCoreswitch switchand andthe theTestCenter TestCenterand and(b) (b)functional functionalblock blockdiagram. diagram. connection Our study shows that for any N value, aggregation of the N-packets does not slow down the line rate of thestudy first shows P4 switch (Figure 8b(2)). However, packet disaggregation at not theslow second P4 the switch Our that ofof the N-packets does down Our study shows that for forany anyNNvalue, value,aggregation aggregation the N-packets does not slow down line the (Figure 8b(3)) slows down packet processing of the switch. This phenomenon is due to the fact that rate of the first P4 switch (Figure 8b(2)). However, packet disaggregation at the second P4 switch line rate of the first P4 switch (Figure 8b(2)). However, packet disaggregation at the second P4 switch the pipeline architecture is used, and to disaggregate an N-packet requires goingisthrough thefact ingress (Figure 8b(3)) slowsdown down packet processing ofthe theswitch. switch. This phenomenon phenomenon due to to the the that (Figure 8b(3)) slows packet processing of This is due fact that pipeline for N times. thepipeline pipelinearchitecture architectureisisused, used,and andto to disaggregate disaggregatean anN-packet N-packet requires requiresgoing goingthrough throughthe theingress ingress the On the other hand, the packet header overhead decreases as N increases. Specifically, for the pipeline for N times. pipeline for N times. example in 6, the size of an N-packet is On the theFigure other hand, hand, the packet packet header overhead overhead decreases decreases as as NNincreases. increases. Specifically, Specifically, for for the the On other the header examplein inFigure Figure6,6,the thesize sizeof ofan anN-packet N-packetSisis  48  16 N example (2) N

SN  48  16 N in which 48 bytes come from the Ethernet S(14 bytes), the IPv4 (20 bytes), the UDP (8 bytes), and (2) the 48 + 16N (2) N = flag (6 bytes) headers, and every agg header contributes 16 bytes (Figure 6). From (2), the payload to in which 48 bytes come from the Ethernet (14 bytes), the IPv4 (20 bytes), the UDP (8 bytes), and the in which 48 bytes come from the Ethernet (14 bytes), the IPv4 (20 bytes), the UDP (8 bytes), and the packet ratio flag (6 bytes) headers, and every agg header contributes 16 bytes (Figure 6). From (2), the payload to flag (6 bytes) headers, and every agg header contributes 16 bytes (Figure 6). From (2), the payload to packet ratio N 16 N packet ratio pN   (3) SNN 4 N 16N NN 16 p NpN= = (3) (3) SSNN 4+ N N delivered through the switch. If the throughput In (3), pN is the ratio of the useful information of a P4 X,the then we of can define the goodput asdelivered through the switch. If the throughput ratio the useful information Inswitch (3), pNis is of a P4 switch is X, then we can define the goodput as

Sensors 2018, 18, 2025

8 of 15

SensorsIn2018, FOR PEER REVIEW (3),18, p x is the ratio of the

8 of of 15 useful information delivered through the switch. If the throughput a P4 switch is X, then we can define the goodput as XN  pN X for N  1 (4) X N = p N X for N ≥ 1 (4) The network architecture in Figure 5 is actually implemented in Figure 8, in which two EdgeCore The network architecture in as Figure 5 is actually in Figure 8, in which two P4 switches are connected to serve P4 switches 1 and 2implemented in Figure 8b(2),(3). The Spirent TestCenter EdgeCore P4 switches connected to 8b(1)) serve as switches 1 and 2 inIoT Figure 8b(2),(3). Spirent [17] provides the trafficare sources (Figure to P4 pump the combined message trafficThe to the first TestCenter [17] providesSpirent the traffic sources (Figure to pump the combined message traffic to EdgeCore P4 switch. TestCenter is an 8b(1)) end-to-end testing solutionIoT that delivers high the first EdgeCore P4 switch. Spirent TestCenter solution delivers high performance with deterministic answers. We useisitan to end-to-end serve as thetesting 100 Gbps trafficthat source, which is performance with deterministic answers. We use it to serve as the 100 Gbps traffic source, which is the the highest speed in the world. The TestCenter also serves as the IoT server (Figure 8(4)), which highest speed in the world. The TestCenter also serves as theP4 IoT serverWith (Figure which receives receives the packets disaggregated at the second EdgeCore switch. the8(4)), Spirent TestCenter, the are packets theGbps second P4 switch. the Spirent TestCenter, we are able we able disaggregated to pump up toat 100 lineEdgeCore rate of packets in theWith experiments. to pump upon tothe 100throughput Gbps line rate of packets inXthe experiments. Based measurements of our experiments, Figure 9 shows the goodput XN Based on the throughput measurements X of our experiments, Figure 9 shows the goodput X N against the net packet arrival rate  from the IoT devices (Figure 5(1)) for different N values. The against the net packet arrival rate λ from the IoT devices (Figure 5(1)) for different N values. The figure figure indicates that as N increases, the goodput increases. However, the XN saturates as  indicates that as N increases, the goodput increases. However, the X N saturates as λ increases. For a faster. This is due toconstraint the hardware constraint of increases. a large N, XN saturates large N, XFor This phenomenon isphenomenon due to the hardware of the P4 switch. N saturates faster. the switch. The line of rate a portisof100 theGbps. switch is 100 Gbps. TheP4 line rate for a port thefor switch N

Figure9.9.Goodput Goodputvs. vs. packet packet arrival arrivalrates ratesfor forthe theP4 P4switch switchconfiguration configuration(1 (1≤ ≤N Figure N ≤≤8). 8).

The service service rate rate µNN for fordisaggregating disaggregating an an N-packet N-packet cannot cannot be directly measured from the P4 switch. Instead, Instead, by queue switch. by using using the the arrival arrival rate rateand andthe thegoodput goodputininFigure Figure9,9,we wecan canuse usethe theG/D/1/1 G/D/1/1 queue to model the P4 switch to derive the disaggregation processing time for an N-packet. We use the to model the P4 switch to derive the disaggregation processing time for an N-packet. We use the goodput when when there there is is no no packet packet loss loss and Observing from from the the goodput and without without buffering buffering as as the the service service rate. rate. Observing measurements of packet disaggregation of the P4 switch implementing our mechanism, we have measurements of packet disaggregation of the P4 switch implementing our mechanism, we have 11 α N ≈  N µN µ1 N

(5) (5)

1

in which 1.14. otherwords, words,the theprocessing processingtime time for for disaggregating disaggregating an an N N-packet is 0.85≤  αNN ≤1.14 in which 0.85 . InInother -packet is roughly the same as that for processing N IoT messages. roughly the same as that for processing N IoT messages. 4. Packet Disaggregation in the P4 Switch with Queueing 4. Packet Disaggregation in the P4 Switch with Queueing This section shows how the queueing mechanism can effectively reduce the computing power of This section shows how the queueing mechanism can effectively reduce the computing power the P4 switch required to perform disaggregation of the IoT packets. of the P4 switch required to perform disaggregation of the IoT packets. The previous section considers the case where the switch is operated without the input buffers, which is a typical design of switch. However, in our design for the disaggregation process, an N-

Sensors 2018, 18, 2025

9 of 15

The previous section considers the case where the switch is operated without the input buffers, which is a typical design of switch. However, in our design for the disaggregation process, an N-packet needs to be looped back to the input port N times through the pipeline (Figure 4(5),(6)). Each time, an IoT message is extracted from the N-packet and sent out of the second P4 switch (Figure 4(8)). Due to this design, the resulting data rate to an input port is N times of the data rate of the N-packets. If the resulting data rate is greater than the line rate of the input port during some time intervals and there is no buffer in the input port, the looped N-packets will have a high chance of being dropped. In contrast, if some buffers can be provided in the input port, then packet drops during the disaggregation process can be reduced. With buffers provided at the input port, some queueing effects at the input port are observed as follows. Based on the queueing theory, the second P4 switch can be designed to accommodate the traffic ρ N = λ/( Nµ N ) if there is no queueing effect when the packets arrive at the rate λ/N. In the real world, the queueing effect cannot be avoided at the switch if the packets arrive at random. Therefore, we say that the second P4 switch can accommodate traffic ρ N if the expected waiting time E[tW ] = β/µ N for a small β value (e.g., β = 0.1). Denote the service rate (the line rate) of the second P4 switch as µ1 . If the packets are not aggregated, then the packet arrivals to the second P4 switch are a Poisson process with the rate λ. If the packets are aggregated into N-packets, then the packet arrival rate to the second switch is λ/N, and the packet processing rate µ N is expressed in (5). With a small buffer time β/µ N or the buffer size d βe, we expected a smaller µ N to keep the same goodput as that for µ1 , and therefore the saved computing power can be allocated for other P4 traffics. To validate the analytic and the simulation models, let the processing time of the second P4 switch for an N-packet have an exponential distribution with the rate µ N . After we have validated the simulation model, the exponential assumption is relaxed to accommodate fixed processing time that fits the real P4 switch operation. The µ N value obtained in the previous section is used. The study in the previous section indicates that the inter-arrival times of the N-packets to the second P4 switch have the Erlang-N distribution with the rate λ/N. Then, the second P4 switch can be modeled as an EN /M/1 queue by assuming that the packet processing time for an N-packet is exponentially distributed. This model is used to validate the simulation experiments. Then, we use the validated simulation to investigate the configuration in Figure 5 (i.e., Figure 8b) with the fixed processing times in the second EdgeCore P4 switch. Let t R and tW be the response and the waiting times of an N-packet at the second P4 switch, respectively. Then, from [16] we have E [ tW ] =

x 1 1 and E[t R ] = E[tW ] + = µ N (1 − x ) µN µ N (1 − x )

(6)

From (1), x is expressed as ∗ x = fN (µ N (1 − x )) =



λ µ N (1 − x ) + λ

N



=

θ 1−x+θ

N (7)

and θ = λ/µ N . For N = 2, (7) is solved to yield p √ µ2 + 2λ − µ2 (µ2 + 4λ) 1 + 2θ − 1 + 4θ x= = 2 2µ2

(8)

Substitute (8) to (6) to yield µ2 +2λ−



µ2 (µ2 +4λ) 2µ2

p µ2 + 2λ − µ2 (µ2 + 4λ)  = h i E [ tW ] =  √ p µ +2λ− µ2 (µ2 +4λ) µ2 µ2 − 2λ + µ2 (µ2 + 4λ) µ2 1 − 2 2µ2

(9)

Sensors 2018, 18, 2025

10 of 15

For N = 3, (7) is rewritten as x 4 − 3(1 + θ ) x 3 + 3(1 + θ )2 x 2 − (1 + θ )3 x + θ 3 = 0 By extracting the factor (x − 1) of the above equation, we have h   i ( x − 1) x3 − (2 + 3θ ) x2 + 1 + 3θ + 3θ 2 x − θ 3 = 0 which solves to yield x

p p √ √ 3 3 = (2+33θ ) + V + 2 V 2 + W 3 + V − 2 V 2 + W 3 p p √ √ 3 3 = 32 + µλ3 + V + 2 V 2 + W 3 + V − 2 V 2 + W 3

in which V

= = =

−(2+3θ )(1+3θ +3θ 2 ) 6     2 − 2+ µ9λ +27 µλ 3

2µ3

and W

(2+3θ )3 27

+

+

(10)

θ3 2

3

54 2 3 +27λ

2 +9λµ

54µ3 2

(1+3θ +3θ 2 ) = − 3 3θ +1 =− 9 = − 3λ9µ+3µ3

(2+3θ )2 9

Note that from the Galois’s theorem, (7) does not have a close form for N > 3 and must be computed numerically. As we pointed out previously, the second P4 switch can accommodate traffic ρ N if we provide a small buffer space β to satisfy the requirement E[tW ] = β/µ N . In other words, if the second P4 switch can accommodate ρ N , its serving rate µ N (computing power) must satisfy E [ tW ] ≤

β µN

(11)

Suppose that the original design of the second P4 switch targets at the line rate µ1 , in which the packets are not aggregated, and every packet is processed at a fixed service time 1/µ1 . We also consider exponential service time with the same rate 1/µ1 for the comparison and validation purpose. For an arbitrary packet arrival rate λ, if we want to keep the line rate µ1 to the switch without packet aggregation/disaggregation, then the inequality (11) must be satisfied, that is, from the response time of an M/M/1 queue, we have λ µ1 ( µ1 − λ )



β or µ1 ≥ µ1



 β+1 λ β

(12)

If the packet processing time is a fixed value, then from the response time of an M/D/1 queue, we have   λ β 2β + 1 ≤ or µ1 ≥ λ (13) 2µ1 (µ1 − λ) µ1 2β Let µ1∗ ( β) be the minimal µ1 that satisfies (12) and (13). Then we have     β+1 λ for Exponential service times  β  µ1∗ ( β) =  2β+1 λ for fixed service times 2β

(14)

Sensors 2018, 18, 2025

11 of 15

Equation (14) indicates that the system with the fixed service times can process packets with less computing power than that for exponential service times. We have developed event-driven simulation to model the second P4 switch’s behavior. The event driven simulation is the same as those for the G/D/1 and the G/M/1 queues built in [18–21], and the details are omitted. Equation (14) is used to validate the simulation model. Our experiments indicate that the discrepancies between the analytic and the simulation results are within 0.4%. Now, we consider the case in which the packets are aggregated into 2-packets at the first P4 switch, and are sent to the second P4 switch for disaggregation. That is, for N = 2, substitute (9) into (11) to yield p µ2 + 2λ − µ2 (µ2 + 4λ) β h i ≤ p µ2 µ µ − 2λ + µ (µ + 4λ) 2

2

2

2

which is simplified as  λ−

βµ2 − 1+β

q

β2 + β



µ2 1+β

 λ−

βµ2 + 1+β

q

β2 + β



µ2 1+β



≤0

The above inequality implies βµ2 − λ− 1+β

q

β2





µ2 1+β



≤0

which is solved to yield λ≤

β+

! β2 + β µ2 1+β p

Let µ2∗ ( β) be the minimal µ2 that satisfies the above inequality. Then, µ2∗ ( β) =

(1 + β ) λ p β + β2 + β

(15)

Similarly, for N = 3, substitute (10) into (11) to yield  p p √ √ 3 3 2 2 V + V2 + W3 + V − V2 + W3 (2µ3 + 3λ) + 3µ3 β h p i ≤ p √ √ 3 3 2 2 µ 2 3 2 3 3 µ3 µ3 − 3λ − 3µ3 V + V +W + V − V +W Let µ3∗ ( β) be the minimal µ3 that satisfies the above inequality. Then, µ3∗ ( β) =

( β − 2) − 3(1 + β )

p 3

3(1 + β ) λ  p √ √ 3 2 2 V + V2 + W3 + V − V2 + W3

(16)

Equations (15) and (16) are used to validate the simulation model again. Our experiments indicate that the discrepancies between the analytic and the simulation results are within 0.9%. After validation by (15) and (16), the simulation model, together with the measured µ∗N ( β) obtained from Figure 9 and Equation (5), are used to compute µ∗N ( β), the minimal µ N that satisfies (11). Figure 10 plots µ∗N ( β) against the aggregation factor N and the queueing factor β, in which µ∗N ( β) is normalized by µ∗N (0). The figure indicates that a queueing mechanism with a small buffer can significantly reduce the computation power required to handle disaggregation of the IoT packets. Specifically, when β ≤ 0.02, increasing β will significantly reduce µ∗N ( β). On the other hand, for β ≥ 0.05, increasing the buffer size does not improve the performance. Therefore, a buffer of size d βe = 1 is good enough to reduce the processing rate to µ∗N ( β), which is not difficult to implement in the P4 chip (i.e., to create a buffer at Figure 4(9)).

can significantly reduce the computation power required to handle disaggregation of the IoT packets. Specifically, when   0.02 , increasing  will significantly reduce  N* (  ) . On the other hand, for

  0.05 , increasing the buffer size does not improve the performance. Therefore, a buffer of size     1 is good enough to reduce the processing rate to  N* (  ) , which is not difficult to implement

  2018, 18, 2025 Sensors

12 of 15

in the P4 chip (i.e., to create a buffer at Figure 4(9)).

Figure 10.10.µN ∗   against N and  . Figure N ( β ) against N and β. *

Aggregation/Disaggregation Based Based on on SDN SDN Controllers Controllers 5. Related Work and Aggregation/Disaggregation

aggregation/disaggregation is performed by software run atrun the In the the existing existingsolutions, solutions,packet packet aggregation/disaggregation is performed by software CPUs the servers. As we show in this section, software on the CPU-based at the of CPUs of the servers. As we show in thisthe section, thesolutions softwarebased solutions based on the architecture architecture will result inwill degraded goodput (andgoodput therefore(and degraded throughput). studies CPU-based result in degraded therefore degradedMany throughput). [10,22,23] described thedescribed benefits the of benefits packet aggregation in savingintransmission overhead in the Many studies [10,22,23] of packet aggregation saving transmission overhead networks. However, none none of them investigate the the processing costcost of ofa anetwork in the networks. However, of them investigate processing network node node for aggregation/disaggregation. This aggregation/disaggregation cost aggregation/disaggregation. This section section shows shows the aggregation/disaggregation cost in in the servers architecture. based on the CPU architecture. The original design of SDN separates the control plane and the data plane of legacy switches so that the network nodes (e.g., SDN switches) are only responsible for data forwarding, which makes convenient. The SDN controller called the OpenFlow the network management simpler and more convenient. Controller (OFC) utilizes OpenFlow [24] to interact with the SDN switches called the OpenFlow Switches (OFSs). SDN network management introduces a flow control application that maintains the routing rules rulesof of through theInOFC. In this makes more the network more routing thethe OFSsOFSs through the OFC. this way, SDN way, makesSDN the network programmable, programmable, flexible, and dynamically manageable. Specifically, the SDN can dynamically adjust flexible, and dynamically manageable. Specifically, the SDN can dynamically adjust the data routing the data routing path tocongestion. avoid network congestion. 11 shows an aggregation/disaggregation path to avoid network Figure 11 shows Figure an aggregation/disaggregation mechanism for mechanism foron IoTtalk based on the SDN architecture, which similar to the one illustrated in Figure IoTtalk based the SDN architecture, which is similar to theisone illustrated in Figure 5, except that 5, except that theare P4 replaced switches by arethe replaced by theswitches OpenFlow switches (which cannot be programmed the P4 switches OpenFlow (which cannot be programmed to perform to perform packet aggregation/disaggregation), and theControllers OpenFlow are Controllers In this packet aggregation/disaggregation), and the OpenFlow involved.are In involved. this architecture, K IoT devices (Figure 11(1)) are installed in Building 311 of the MIRC Building. The DAs architecture, K IoT devices (Figure 11(1)) are installed in Building 311 of the MIRC Building. The DAs of the devices collect the sensor data and send small packets (the IoT messages) to an OpenFlow Switch (OFS1; Figure 11(2)). OFS1 connects to an OpenFlow Controller (OFC1; Figure 11(3)) that is responsible for packet aggregation. OFC1 sends the aggregated packets to OFS1. These aggregated packets are sent to another OpenFlow Switch (OFS2; Figure 11(4)) through one or more SDN networks. OFS2 sends the packets to its controller (OFC2; Figure 11(5)) for packet disaggregation and outputs them to the IoTtalk server (Figure 11(6)). Both OFC1 and OFC2 are servers responsible for processing the packets.

responsible for Figure packet11(2)). aggregation. OFC1 sends aggregated packets (OFC1; to OFS1.Figure These11(3)) aggregated Switch (OFS1; OFS1 connects to anthe OpenFlow Controller that is packets are sent to another OpenFlow Switch (OFS2; Figure 11(4)) through one or more SDN responsible for packet aggregation. OFC1 sends the aggregated packets to OFS1. These aggregated networks. OFS2 the packets to its controller packet disaggregation and packets are sentsends to another OpenFlow Switch (OFC2; (OFS2; Figure Figure 11(5)) 11(4))for through one or more SDN outputs them to sends the IoTtalk serverto(Figure 11(6)). (OFC2; Both OFC1 and OFC2 servers responsibleand for networks. OFS2 the packets its controller Figure 11(5)) forare packet disaggregation processing the packets. outputs them to the IoTtalk server (Figure 11(6)). Both OFC1 and OFC2 are servers responsible Sensors 2018, 18, 2025 13 offor 15 processing the packets.

Figure 11. Sensor data aggregation/disaggregation for IoTtalk based on the SDN controllers (the CPUbased Figurearchitecture). 11. data aggregation/disaggregation for IoTtalk based on the SDN controllers (the CPUFigure 11.Sensor Sensor data aggregation/disaggregation for IoTtalk based on the SDN controllers basedCPU-based architecture). (the architecture).

Figure 12 shows the goodput of aggregation/disaggregation processed by Figure 11(3),(5). The Figure 12 shows shows thegoodput goodput aggregation/disaggregation processed by Figure 11(3),(5). goodput indicate that of inofaggregation/disaggregation the CPU-based architecture (i.e.,bythe OFC servers), Figurecurves 12 the processed Figure 11(3),(5). The The goodput curves indicate that in the CPU-based architecture (i.e., the OFC aggregation/disaggregation are significantly higher(i.e., than those of the servers), pipeline goodput curves indicate processing that in overheads the CPU-based architecture the OFC servers), aggregation/disaggregation processing overheads are significantly significantly higher than of pipeline P4 architecture shown (Figure 9). Specifically, the goodputs for N ≥ higher 2 are worse than those N=1 aggregation/disaggregation processing overheads are than those those of the thefor pipeline P4 architecture shown (Figure 9). Specifically, the goodputs for N ≥ 2 are worse than those for N = N ≥ 2 are much better than those for N = 1 in Figure in Figure 12. On the other hand, the goodputs for P4 architecture shown (Figure 9). Specifically, the goodputs for N ≥ 2 are worse than those for N1=in1 Figure 12. 12. OnOn thethe other hand, thethe goodputs for for N ≥N2≥are much better than those for for N =N1 =in1 Figure 10. 10. 2 are much better than those in Figure in Figure other hand, goodputs 10.

Figure12. 12.Goodput Goodputvs. vs. packet packet arrival arrival rates ratesfor for the theSDN SDNcontroller controllerconfiguration configuration(1 (1≤≤ N Figure N ≤≤8). 8). Figure 12. Goodput vs. packet arrival rates for the SDN controller configuration (1 ≤ N ≤ 8).

6. Concluding Remarks

This paper proposed to utilize the P4 switch for quick packet aggregation and disaggregation. We designed and developed novel P4 programs for aggregation and disaggregation, which are implemented in commercial P4 switches. Our study indicated that packet aggregation can be achieved in a P4 switch with its 100 Gbps line rate (without extra packet processing cost). On the other hand, to disaggregate a packet that combines N IoT messages, the processing time is about the same as processing N individual IoT messages. We further proposed to provide a small buffer at the input ports of the P4 switch to significantly reduce the packet drop probability when disaggregating an N-packet. Also note that our implementation conducted IoT message aggregation at the highest bit rate (100 Gbps) that has not been found in the literature. As for the future work, we are considering

Sensors 2018, 18, 2025

14 of 15

new techniques to improve the performance of packet disaggregation. One possible technique is the multicast mechanism supported by the EdgeCore switch. As a final remark, if the aggregation and the disaggregation P4 switches are indirectly connected through a network with bandwidth lower than their line rates, due to network congestion, when packets travel from the first P4 switch to the second P4 switch, some packets may be dropped while the aggregated N-packets are safely delivered to the second switch. In this case, together with the disaggregation process performed at the second P4 switch, the aggregation process performed at the first P4 switch effectively reduces the probability of IoT messages lost over the lower-bandwidth network. Author Contributions: Y.-B.L. is responsible for Analytic model; S.-Y.W. is responsible for Design and Data curation; C.-C.H. is responsible for Simulation and Data analysis; C.-M.W. is responsible for P4 switch programming. Funding: This research was funded by Ministry of Education: Center for Open Intelligent Connectivity from The Featured Areas Research Center Program within the framework of the Higher Education Sprout Project. Conflicts of Interest: The authors declare no conflict of interest.

References 1.

2. 3. 4. 5. 6. 7.

8.

9. 10.

11. 12.

13. 14. 15.

Lin, Y.-B.; Lin, Y.-W.; Chih, C.-Y.; Li, T.-Y.; Tai, C.-C.; Wang, Y.-C.; Lin, F.J.; Kuo, H.-C.; Huang, C.-C.; Hsu, S.-C. EasyConnect: A Management System for IoT Devices and Its Applications for Interactive Design and Art. IEEE Internet Things J. 2015, 2, 551–561. [CrossRef] Lin, Y.-B.; Lin, Y.-W.; Huan, C.-M.; Chih, C.-Y.; Lin, P. IoTtalk: A Management Platform for Reconfigurable Sensor Devices. IEEE Internet Things J. 2017, 4, 1552–1562. [CrossRef] Lin, Y.-W.; Lin, Y.-B.; Yang, M.-T.; Lin, J.-H. ArduTalk: An Arduino Network Application Development Platform based on IoTtalk. IEEE Syst. J. 2017, PP, 1–9. [CrossRef] Lin, Y.-W.; Lin, Y.-B.; Hsiao, C.-Y.; Wang, Y.-Y. IoTtalk-RC: Sensors as Universal Remote Control for Aftermarket Home Appliances. IEEE Internet Things J. 2017, 4, 1104–1112. [CrossRef] Open Connectivity Foundation. Available online: https://openconnectivity.org/ (accessed on 22 May 2018). oneM2M. Standards for M2M and the Internet of Things. Available online: http://www.onem2m.org/ (accessed on 22 May 2018). TS 102 921—V1.3.1—Machine-to-Machine Communications (M2M); mIa, dIa and mId Interfaces. Available online: http://www.etsi.org/deliver/etsi_ts/102900_102999/102921/01.03.01_60/ts_ 102921v010301p.pdf (accessed on 22 May 2018). Wang, S.-Y.; Chen, Y.-R.; Chen, T.-Y.; Chang, C.-H.; Cheng, Y.-H.; Hsu, C.-C.; Lin, Y.-B. Performance of LoRa-based IoT Applications on Campus. In Proceedings of the IEEE Vehicular Technology Conference, Toronto, ON, Canada, 24–27 September 2017. Lin, Y.-B.; Lin, Y.-W.; Hsiao, C.-Y.; Wang, S.-Y. Location-based IoT applications on campus: The IoTtalk approach. Pervasive Mob. Comput. 2017, 40, 660–673. [CrossRef] Koike, A.; Ohba, T.; Ishibashi, R. IoT Network Architecture Using Packet Aggregation and Disaggregation. In Proceedings of the IIAI International Congress on Advanced Applied Informatics, Kumamoto, Japan, 10–14 July 2016. ONF SDN Evolution. Available online: http://opennetworking.wpengine.com/wp-content/uploads/2013/ 05/TR-535_ONF_SDN_Evolution.pdf (accessed on 7 June 2018). Bosshart, P.; Daly, D.; Gibb, G.; Izzard, M.; McKeown, N.; Rexford, J.; Schlesinger, C.; Talayco, D.; Vahdat, A.; Varghese, G.; et al. P4: Programming protocol-independent packet processors. SIGCOMM Comput. Commun. Rev. 2014, 44, 87–95. [CrossRef] P4 Language Spec Version 1.0.4. Available online: https://p4lang.github.io/p4-spec/p4-14/v1.0.4/tex/p4. pdf (accessed on 22 May 2018). P4. Available online: http://www.p4.org/ (accessed on 22 May 2018). Wu, C.-M.; Wang, S.-Y.; Lin, Y.-B.; Huang, C.-C. Design and Implementation of Packet Aggregation by P4 Switches. In Proceedings of the IEEE Global Communications Conference, Abu Dhabi, UAE, 9–13 December 2018. under review.

Sensors 2018, 18, 2025

16. 17. 18.

19. 20. 21. 22. 23. 24.

15 of 15

Nelson, R. Probability, Stochastic Process, and Queueing Theory; Springer: New York, NY, USA, 1995; ISBN 978-0-387-94452-4. Spirent TestCenter—Verifying Network and Cloud Evolution. Available online: https://www.spirent.com/ Products/TestCenter (accessed on 22 May 2018). Huang, D.-W.; Lin, P.; Gan, C.-H. Design and performance study for a mobility management mechanism (WMM) using location cache for wireless mesh network. IEEE Trans. Mob. Comput. 2008, 7, 546–556. [CrossRef] Hong, C.-Y.; Pang, A.-C. 3-approximation algorithm for joint routing and link scheduling in wireless relay networks. IEEE Trans. Wirel. Commun. 2009, 8, 856–861. [CrossRef] Lin, P.; Wu, S.-H.; Chen, C.-M.; Liang, C.-F. Implementation and performance evaluation for a ubiquitous and unified multimedia messaging platform. ACM Wirel. Netw. 2009, 15, 163–176. [CrossRef] Tsai, S.-Y.; Sou, S.-I.; Tsai, M.-H. Reducing energy consumption by data aggregation in M2M networks. Wirel. Pers. Commun. 2014, 74, 1231–1244. [CrossRef] Yousefi, H.; Yeganeh, M.-H.; Alinaghipour, N.; Movaghar, A. Structure-free real-time data aggregation in wireless sensor networks. Comput. Commun. 2012, 35, 1132–1140. [CrossRef] Fan, K.-W.; Liu, S.; Sinha, P. Structure-Free Data Aggregation in Sensor Networks. IEEE Trans. Mob. Comput. 2007, 6, 929–942. [CrossRef] OpenFlow® Switch Specification Ver 1.3.5. Available online: http://opennetworking.wpengine.com/wpcontent/uploads/2014/10/openflow-switch-v1.3.5.pdf (accessed on 7 June 2018). © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Suggest Documents