IoT Data Processing at the Edge with Named Data Networking

Marica Amadeo, Claudia Campolo, Antonella Molinaro, Giuseppe Ruggeri
University “Mediterranea” of Reggio Calabria - DIIES Department
Email: {name.surname}@unirc.it

Abstract—Processing high volumes of raw Internet of Things (IoT) data at the network edge is becoming a popular solution to guarantee lower-latency interactions compared to traditional computing in the remote cloud. The synergy between the networking and computing domains is today enabled by innovative paradigms such as Named Data Networking (NDN) and its recent extensions, which support the request of computing services “by name” and their distributed execution right inside the network nodes, instead of in purpose-built servers. In this paper, we extend the NDN architecture to turn the network edge into a dynamic computing environment for running user applications relying on IoT data stream processing and analytics. Novel naming and forwarding mechanisms are defined to properly guide service requests towards edge computing nodes, as close as possible to the IoT data sources, in order to offer low latency and prevent raw IoT data from flooding the network. Through simulations with ndnSIM, we analyse the performance of our proposal in terms of data traffic volume and service delivery time.

Index Terms—Information Centric Networking, Internet of Things, Named Data Networking, Mobile Edge Computing
I. INTRODUCTION

The future Internet will be dominated by massive data generated by billions of heterogeneous Internet of Things (IoT) devices serving a variety of applications, including augmented reality, interactive gaming, and event monitoring in smart building, smart transportation and Industry 4.0 domains. Offloading the back-end computing tasks from servers in the cloud to the network edge can not only reduce the IoT data traffic in the Internet backbone, but also provide services with lower latency and better resilience than the traditional cloud paradigm can offer [1]. Whereas there is wide consensus on the potential and the benefits of moving computation to the edge, it is still unclear which networking model best suits this paradigm change. The traditional host-centric IP networking model is still the dominant solution, where the end-to-end communication is managed between the IoT data sources and fixed purpose-built servers deployed at the network edge, each one identified by an IP address [2]. This model is too ossified and barely exploits the mushrooming distributed capabilities of in-network nodes, which are gradually shifting from simple forwarders to more capable computing and storage platforms. In this paper, we argue in favour of innovative information-centric networking (ICN) solutions to improve IoT data processing at the network edge. ICN communication is connectionless and driven “by names”, which are directly used at the
network layer for data retrieval. Data retrieval is originated by a consumer, which transmits in an Interest packet its request for a desired content; the original content producer, or any in-network node storing a valid copy of that content, can answer with a named Data packet. ICN naturally matches the content-centric pattern of many IoT applications, which care about the data themselves rather than the node that produces them [3], [4]. ICN also simplifies the system design, since caching and security can be implemented at the network layer by design. A very popular ICN instantiation, called Named Data Networking (NDN) [5], has been considered to support sensing data collection and actuation tasks in IoT scenarios like smart home [6], [7] and smart building [8]. NDN has also been promoted as an enabler of network cloudification, as originally presented in the Named Function Networking (NFN) proposal [9], an NDN extension that delivers on-demand results of (generic) computations. In NFN, a consumer requests a computation over a given content (e.g., a video compression), and the network helps to find a node for the execution and finally returns the result. NFN is at a very early stage of development, and so far it has been applied to support generic tasks, without a specific forwarding strategy for selecting the execution location. In this paper, we extend NDN to effectively and efficiently support IoT data processing at the edge. Our proposal, called IoT-Named Computing Networking (IoT-NCN), is specifically designed to manage computation requests over IoT contents, with minor modifications to the legacy NDN model and low additional overhead. IoT-NCN goes beyond existing solutions in that it not only overhauls the semantics of the NDN packets used to request/deliver services, but also provides the high-level and algorithmic design of the forwarding strategy.
The latter is conceived to resolve the requests as close as possible to the IoT sources, in order to limit the raw data traffic across the network and to reduce the service provisioning latency. Each service request can be managed by any in-network node with the required computation capability, while legacy NDN nodes, or nodes without the requested capability, simply treat the request as a legacy Interest. The performance of the IoT-NCN proposal has been evaluated in terms of data traffic volume and service delivery time with the ndnSIM network simulator [10]. The remainder of the paper is organized as follows. Section II presents the NDN pillars and briefly surveys the closest related works. The proposed solution is extensively discussed
in Section III. Results are reported in Section IV prior to concluding in Section V with hints on future works.

II. BACKGROUND AND MOTIVATIONS

Basics. NDN communication is driven by the consumer, which sends named Interest packets to retrieve the content transmitted in named Data packets. The node model consists of the following data structures:
- the Content Store (CS) caches Data packets that traverse the node, based on configured caching/replacement policies;
- the Pending Interest Table (PIT) records all incoming unsatisfied Interests, waiting to be consumed when the corresponding Data packet arrives;
- the Forwarding Information Base (FIB), used to forward the Interests, records the name prefix and a collection of outgoing faces (i.e., lower-level network interfaces and application faces) towards the content producer(s);
- the Routing Information Base (RIB) records routing information that is used to fill the FIB; it is registered and updated by different parties, e.g., the routing protocol and the application services.

On Interest reception, each NDN node first looks in the CS for a name match. If the content is stored, the Data packet is forwarded back. Otherwise, a PIT lookup is performed to check whether the same request is already pending. If so, the PIT is just updated with the new incoming Interest face. Otherwise, the Interest is forwarded according to the FIB lookup and a new PIT entry is created. The Data packet, sent by the content producer or by a caching node, is forwarded towards the consumer(s) by following the chain of PIT entries, which record the face(s) from which the Interest arrived.

Related works. NFN [9] was the pioneering proposal that extended the role of NDN from information access to information processing. In NFN, the Interests carry, in the name field, expressions involving named data as well as named functions, and the network is in charge of computing the result by interlacing expression resolution with name-based forwarding.
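The CS/PIT/FIB pipeline described above can be sketched in a few lines. This is a minimal, hypothetical model (plain strings for names and faces, dictionaries for the tables) meant only to make the lookup order concrete; it omits the RIB, lifetimes, and caching policies:

```python
# Sketch of the legacy NDN Interest pipeline: CS lookup, then PIT
# aggregation, then FIB longest-prefix match. All structures are
# simplified; names and faces are plain strings.

class NdnNode:
    def __init__(self):
        self.cs = {}    # Content Store: name -> Data packet
        self.pit = {}   # Pending Interest Table: name -> set of incoming faces
        self.fib = {}   # Forwarding Information Base: name prefix -> outgoing face

    def longest_prefix_match(self, name):
        """Return the FIB face for the longest matching prefix, or None."""
        components = name.split("/")
        for i in range(len(components), 0, -1):
            prefix = "/".join(components[:i])
            if prefix in self.fib:
                return self.fib[prefix]
        return None

    def on_interest(self, name, in_face):
        # 1) CS lookup: a cached Data packet satisfies the Interest at once.
        if name in self.cs:
            return ("data", self.cs[name], in_face)
        # 2) PIT lookup: aggregate duplicate requests for the same name.
        if name in self.pit:
            self.pit[name].add(in_face)
            return ("aggregated", None, None)
        # 3) FIB lookup: create a PIT entry and forward upstream.
        out_face = self.longest_prefix_match(name)
        if out_face is None:
            return ("drop", None, None)
        self.pit[name] = {in_face}
        return ("forward", name, out_face)

    def on_data(self, name, data):
        # Consume the PIT entry and return the faces the Data goes back on.
        faces = self.pit.pop(name, set())
        self.cs[name] = data    # cache according to the CS policy
        return faces
```

Note how the PIT both suppresses duplicate upstream Interests and records the reverse path that the Data packet later follows.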
The NDN forwarding machine is augmented with a λ-expression resolution engine, which processes all Interests that have the postfix name component /NFN. Instead of using λ-functions, which are limited in expressiveness, the Named Function as a Service (NFaaS) proposal [11] supports more sophisticated processing with lightweight virtual machines in the form of named unikernels. A Kernel Store in the nodes stores the unikernel code, while statistics from observed function execution requests are included in a Measurement Table. A score function ranks the unikernels according to their popularity and decides which of them to download and make available and which of them to delete; nodes advertise the availability of their services in the domain by using a routing protocol. The first work customizing NFN to the IoT domain is PIoT (Programmable IoT) [12], an application-layer solution, statically installed in the more capable nodes of the network. In [13], to complement PIoT, the same authors propose a protocol
that enables IoT devices to offer simple in-network computing operations based on their own capabilities. The proposed framework, however, moves away from the information-centric vision: it requires that specific hosts in the network perform the management/repository functionalities, and they must be queried for the execution of any service.

Contributions. Unlike [12], [13], the proposed IoT-NCN performs distributed in-network IoT data processing at the network edge, by relying on NDN augmented with awareness of named computations. Similar to [11], IoT-NCN lets edge nodes dynamically execute services, according to the request popularity or to instructions from the network manager or the service operator. IoT-NCN entails only minor modifications of legacy NDN, to keep backward compatibility with it. To this purpose, it uses a naming scheme that identifies IoT contents and services without affecting NDN routing. Unlike [11], where nodes advertise the services they can execute, IoT-NCN does not perform any publishing procedure, to prevent service advertisements (in addition to legacy content advertisements) from wasting precious network resources. Indeed, broadcast advertising messages would need to be updated frequently, since the service availability at a given node is highly dynamic: services can migrate and the available node resources can change. Instead, we adopt a simple reactive discovery mechanism with low overhead that piggybacks on Interest forwarding and involves only edge nodes. Specifically, a customized forwarding strategy identifies the service executors, i.e., nodes with the capabilities to execute the service, with the aim of confining raw IoT data as much as possible at the network edge, thus reducing the traffic volume and offering low-latency data collection and interactions. This is especially critical for bandwidth-consuming and/or real-time IoT applications.

III.
IOT-NCN

As shown in Figure 1, we refer to a general network model, structured in the following levels: the remote data centers domain, the core network domain, and multiple edge and IoT domains. The core domain consists of high-speed routers, which transport named packets between edge domains and perform in-network caching as in legacy NDN, but do not host functions. We make this design choice to maintain a light, high-speed core network. Each edge domain, instead, acts as a distributed platform, where nodes can execute computations. In particular, a hierarchical edge domain is assumed, where multiple nodes separate the core from the IoT domain, resembling, for instance, the backhaul segment of a cellular network. Several applications rely on information received from IoT domains to perform analytics and computing services. Moreover, the same information is often leveraged in several contexts [14]. In principle, IoT services can be highly heterogeneous. Conventional basic services compute the maximum, minimum, mean, parity, or histogram over a pool of sensed values [15]. More sophisticated services span, e.g., spatial and temporal correlations, spectral characteristics of the data, and filtering operations on the raw data [16].
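The conventional basic services mentioned above can be sketched as simple reductions over a pool of sensed values; the dispatch-by-name function below is a hypothetical illustration, not part of the IoT-NCN design:

```python
# Toy dispatch of basic in-network services (max, min, avg, parity)
# over a pool of sensed values, keyed by the service name.

def run_service(service, values):
    services = {
        "max": max,
        "min": min,
        "avg": lambda v: sum(v) / len(v),
        "parity": lambda v: sum(v) % 2,    # parity of the sum of the samples
    }
    return services[service](values)
```

A service executor would apply such a reduction to the samples collected from the IoT sources and return only the (much smaller) result.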
Fig. 1. Reference scenario.
Data streams generated by controllers, sensors, and control plants may require computation-intensive control algorithms in industrial environments [1]. Other services requiring rich computing resources include image compression, video transcoding, and video analytics over different cameras deployed in an urban area (e.g., for vandalism, lost child, and accident detection [1]). IoT-NCN is not tied to a specific service, but its design choices are especially suited for bandwidth-consuming (e.g., video surveillance) and low-latency (e.g., critical industrial automation) IoT services. Moreover, it can work alongside the cloud, which can be used for long-term analysis/statistics and when no edge node is available for the processing. The consumer application could know, according to a pre-defined configuration, which services run in the network and which in the cloud. In the following, we describe the main features of IoT-NCN.

A. Naming

To enable in-network IoT data processing, the naming scheme needs to identify the IoT content(s) and the processing service to be executed on them. The service name can include a limited set of parameters as input for the execution. Following the NDN hierarchical naming scheme, we assume that both contents and services are hierarchically named, with the service name appended after the content name, in order not to affect the legacy NDN forwarding. The tag iotNCN delimits the service name from the content name. For instance, the name /unirc/temp/buildingA/iotNCN/avg/{N=20} indicates that an average operation (i.e., avg is the service name) is required over at least 20 temperature values (i.e., N = 20 is the service input parameter) from building A of the University of Reggio Calabria; the name /unirc/humid/buildingA/iotNCN/max/{N=30} requires finding, in the same building, the maximum value among 30 humidity samples.
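The split around the iotNCN tag can be sketched as follows. This is a minimal parser written for illustration (the helper name and the parameter syntax handling are assumptions of this sketch, not part of the paper's specification):

```python
# Split an IoT-NCN name into content name, service name, and parameters.
# The "iotNCN" tag delimits the service part; components after the service
# name, like "{N=20}", carry its input parameters.

def parse_iot_ncn_name(name):
    components = name.strip("/").split("/")
    if "iotNCN" not in components:
        # No service name: a plain NDN content request.
        return {"content": "/" + "/".join(components), "service": None, "params": {}}
    i = components.index("iotNCN")
    content = "/" + "/".join(components[:i])
    service = components[i + 1] if i + 1 < len(components) else None
    params = {}
    for comp in components[i + 2:]:
        for pair in comp.strip("{}").split(","):
            key, _, value = pair.partition("=")
            params[key] = int(value) if value.isdigit() else value
    return {"content": content, "service": service, "params": params}
```

Because the service suffix sits after the content prefix, routers that only know the main prefix (e.g., /unirc) forward such names exactly as they would a plain content name.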
Techniques like exclusion filters [17] can be used to obtain multiple data under the same name, if the executor does not know the exact data source names. According to the proposed naming scheme, two (or more) requests are the same if they consist of the same service, with the same input parameters, applied over the same Data
set. Intermediate nodes may aggregate Interests with the same name coming from different consumers, so that they are not duplicated on the common links towards a given IoT data source or service executor. This is different from the legacy IP approach which, being data-agnostic, does not allow aggregation of requests [6]. At the same time, nodes can optionally cache the results and make them available to other consumers, without the need to perform the computation again. Core routers maintain information only on the main prefix (i.e., /unirc) and route the Interest packets towards the edge, where the rest of the name is taken into account and decisions are made according to the IoT-NCN strategy. If no service name is reported in the Interest, the request only retrieves the data, as in legacy NDN.

B. Basic architectural choices

Without loss of generality, we assume that each uniquely named service function is implemented through a self-consistent code that runs on top of a virtual machine, a container, or a unikernel [13], [11]. A novel data structure, called Service Table (ST), is added to the NDN architecture of computing nodes to store the names of the available services. It is used during Interest forwarding to quickly check service availability. A boolean field, called ExecAvailability, is added to the PIT entry; it is set to 1 if the node is able to execute the requested service, and 0 otherwise. At this stage of the work, each node installs or removes services according to the request popularity and its available local computing resources, by following the ranking engine of [11]. The Service Table is updated accordingly. To prevent parallel execution of the same service, we avoid the use of broadcast/multicast forwarding strategies for requests including the iotNCN tag: the Interest in the core/edge network is sent only on a single outgoing face, the one with the lowest cost.
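The two architectural additions, the Service Table and the single lowest-cost outgoing face for iotNCN requests, can be sketched as below; the class and field names are assumptions of this sketch:

```python
# Sketch of a computing node's Service Table (ST) and of the
# single-face forwarding rule for iotNCN Interests.

class ComputingNode:
    def __init__(self, services, face_costs):
        self.service_table = set(services)  # names of locally installed services
        self.face_costs = dict(face_costs)  # outgoing face id -> routing cost

    def can_execute(self, service_name):
        # Fast ST check performed during Interest forwarding.
        return service_name in self.service_table

    def select_face(self):
        # iotNCN Interests are never broadcast/multicast: a single
        # lowest-cost face avoids parallel execution of the same service.
        return min(self.face_costs, key=self.face_costs.get)
```

Installing or removing entries in `service_table` would follow the popularity-based ranking engine of [11], which is outside the scope of this sketch.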
Core routers (and, generally, edge nodes without computing capabilities) do not execute services; they (optionally) look in the CS for a cached result (the same computation may have been requested before by a different consumer) and, in case of a cache miss, forward the Interest towards the IoT data sources.

C. The edge forwarding strategy

The IoT-NCN Interest is forwarded towards the IoT data sources until it reaches the so-called branching node, i.e., the last NDN content router in the path towards the IoT data source(s) for a given request, which is able to perform the data collection. The branching node is naturally selected during the forwarding process, by looking at the scope of the content names and the FIB entries for those names. For instance, let A be an NDN router connected to n temperature sensors through n direct wired links. Its FIB includes n dedicated entries, one per sensor name, i.e., domainX/temp/room-1, domainX/temp/room-2, ..., domainX/temp/room-n. Therefore, it will be the branching node for the Interest domainX/temp/iotNCN/max/{N=30}.
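The branching-node check in the example above amounts to asking whether the node's FIB resolves the content prefix into per-source entries. A simplified sketch (the function name is an assumption, and a real check would also confirm that this is the last such router on the path):

```python
# A node is a candidate branching node for a content prefix when its FIB
# holds source-specific entries under that prefix (e.g., one per sensor).

def is_branching_node(fib_prefixes, content_name):
    return any(p == content_name or p.startswith(content_name + "/")
               for p in fib_prefixes)

# FIB of router A from the example: one dedicated entry per sensor.
fib_of_A = ["/domainX/temp/room-1", "/domainX/temp/room-2", "/domainX/temp/room-3"]
```

For the Interest /domainX/temp/iotNCN/max/{N=30}, the content prefix /domainX/temp matches A's per-sensor entries, so A performs the data collection.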
Ideally, to reduce the data traffic, all the computations should be performed by the branching node. However, this may not always be viable, for two main reasons: (i) nodes so close to the IoT data sources may be resource-constrained, e.g., a Raspberry Pi acting as a gateway node; (ii) nodes, even with a reasonable amount of resources, may not bear the whole processing load. Therefore, the IoT-NCN strategy allows the nodes to dynamically put themselves forward as candidates for the service execution, if they have enough capabilities to do so.

Forwarding decision. When an IoT-NCN Interest reaches an edge node C with computing capabilities (see Figure 2), C first looks in the CS for a cached result and, if the check fails, looks in the PIT. If no match is found, C extracts the service name component from the Interest and looks for a match in the ST.
- An ST match means that C is a candidate service executor. Then, by looking for a match between the content name and the FIB entries, C determines whether it is the branching node. If it is the last node of the forwarding chain, it executes the service; otherwise, it forwards the request towards the next hop, to possibly discover another executor, and creates a new PIT entry with the boolean ExecAvailability set to true. This information will be used in case no subsequent executor is found.
- If no ST match is found, C extracts the content name component and looks for a match in the FIB to understand whether it is the branching node. If C is just an intermediate node without the required computing capabilities, it creates a PIT entry with the ExecAvailability field set to false and forwards the Interest according to the FIB information. Otherwise, if C is the branching node, it generates a negative acknowledgement (NACK) packet with the error code no-computation and sends it back. The first node receiving the NACK whose PIT ExecAvailability field is set to true will execute the service (Figure 3).
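The decision steps above can be sketched as pure functions over a minimal node model. This is a hypothetical, simplified rendering of Figures 2 and 3 (class, field, and action names are assumptions of the sketch), not the ndnSIM implementation:

```python
# Sketch of the IoT-NCN forwarding decision at an edge node:
# CS -> PIT -> ST -> FIB, plus the NACK take-over rule.

class EdgeNode:
    """Minimal stand-in for a computing edge node."""
    def __init__(self, services, branching_prefixes):
        self.cs = {}                        # cached results
        self.pit = {}                       # name -> PIT entry
        self.service_table = set(services)  # locally installed services
        self._branching = set(branching_prefixes)

    def is_branching(self, content_name):
        # True when the FIB resolves the content name into per-source entries.
        return content_name in self._branching

def forward_decision(node, name, content_name, service_name):
    if name in node.cs:
        return "reply-from-cache"
    if name in node.pit:
        return "aggregate"
    if service_name in node.service_table:
        # Candidate executor: run the service here only at the branching
        # node; otherwise keep probing downstream for a closer executor.
        if node.is_branching(content_name):
            return "execute"
        node.pit[name] = {"exec_availability": True}
        return "forward"
    if node.is_branching(content_name):
        # Branching node without the service: signal the failure upstream.
        return "nack-no-computation"
    node.pit[name] = {"exec_availability": False}
    return "forward"

def on_nack(node, name):
    # The first upstream node whose PIT entry has ExecAvailability set
    # takes over the execution; plain forwarders pass the NACK along.
    entry = node.pit.get(name)
    if entry and entry["exec_availability"]:
        return "execute"
    return "forward-nack"
```

Read together, `forward_decision` implements the Interest processing of Figure 2 and `on_nack` the take-over rule of Figure 3.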
The envisioned procedure guarantees that, whenever the node closest to the IoT data sources has the requested computing capabilities, it will execute the service. If no service executor is found, e.g., because all the nodes are busy, a NACK is immediately forwarded to the original consumer, which can then request the service execution from the remote cloud.

Service execution and delivery. The service executor collects the IoT content(s) by using the content name and legacy NDN forwarding mechanisms. When the data collection is completed, the processing is performed and the result is sent back with the same name as the original IoT-NCN Interest. Each receiving node consumes the pending Interest in its PIT and forwards the content back. Interests remain pending in the PIT for a pre-defined lifetime. If the service provisioning time exceeds the Interest lifetime, no computation result can be sent back to the requester, since the Interest in the PIT of intermediate nodes is discarded. To address this issue, the consumer can issue the request as a long-lived Interest [18], i.e., a request with a longer lifetime compared with a standard content request.
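One simple way a consumer could size the lifetime of such a long-lived Interest is sketched below. The default value and the margin are assumptions of this sketch, not parameters defined by the paper:

```python
# Hypothetical lifetime policy for long-lived Interests: cover the
# estimated service provisioning time, plus a small safety margin,
# so that the result still finds live PIT entries on the return path.

DEFAULT_LIFETIME = 4.0  # seconds; assumed default Interest lifetime

def interest_lifetime(estimated_provisioning_time, margin=1.0):
    """Return a lifetime long enough for the expected service time."""
    if estimated_provisioning_time + margin <= DEFAULT_LIFETIME:
        return DEFAULT_LIFETIME
    return estimated_provisioning_time + margin
```

A short computation keeps the standard lifetime, while a long-running service gets a correspondingly longer one; the delayed-execution Data packet described next handles the case where even this estimate is exceeded.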
Fig. 2. IoT-NCN Interest processing.
Fig. 3. IoT-NCN NACK processing.
In any case, before starting the execution, the provider checks whether the estimated service provisioning time, Pt, is larger than the Interest lifetime. If so, it sends back an empty Data packet with the same name as the Interest and a code in the payload which specifies that the operation is delayed by t seconds. The consumer will query the node again after that time. If the service is further delayed, the same mechanism can be repeated. If the outcome of the computation does not fit in a single payload, more Data packets are originated and proper techniques can be set up to maintain the Data flow, like long-lived Interests; alternatively, the consumer can simply send more Interests with an increasing sequence number.

IV. PERFORMANCE EVALUATION

A. Main simulation settings

To evaluate the proposed IoT-NCN strategy, we conducted simulations in ndnSIM [10], the official ns-3-based simulator
deployed by the NDN community. We have modified its stock installation to support service requests and execution, and further extended it to incorporate the proposed edge forwarding strategy. As shown in Fig. 1, the edge domain consists of a fat-tree topology, with four levels and 40 IoT data sources for each leaf node. The root of the domain, linked to the core network, is the first computing node that receives the service requests. We consider a variable number of remote consumers Nc, varying from 2 to 10, each one requiring a distinct service¹ during the simulation. Consumers are placed outside the edge domain over which data need to be retrieved. The inter-arrival time between requests from different consumers is 0.1 s. We consider two scenarios with different computation/data collection workloads (i.e., heavy and light). In both scenarios, each service Si, with i ∈ [1, 40], is characterized by the input data size and by the amount of computing resources, aS,i, required for the service execution (i.e., the number of CPU cycles). Specifically, in the first simulation campaign, the simulated services mimic analytics performed over images captured by IoT data sources, similarly to [19]. Each service first requires the collection of input data from three different sources; for the sake of simplicity, we assume that the contents to be collected have the same size, i.e., 500 KB, to resemble an image. In the second simulation campaign, we consider light services, which first require the collection of small input data (5 KB) from forty different sources, and then basic computations over them. After the data collection, the execution is performed, whose duration depends on the capabilities of the node that performs the processing. For each edge node Ej, the CPU working frequency is specified as cE,j, which represents the number of CPU cycles per unit time.
It is reasonable to assume that the root node has higher processing capabilities (i.e., 4 GHz) than the nodes close to the leaves (i.e., 2 GHz). Thus, the execution time is derived as aS,i/cE,j. In both scenarios, we assume that a single Data packet, as service output, is sent back to the consumer after the computation. Long-lived Interests are used to maintain the pending requests in the PIT and allow the forwarding of the resulting Data. Simulation parameters are summarized in Table I. Two metrics are derived: (i) the service provisioning time, computed as the time from when the first Interest is transmitted by a consumer to request the service until the output of the computation is sent back, as a measure of the effectiveness of the approach, and (ii) the number of transmitted Data packets needed to deliver the requested service, to evaluate the efficiency of the proposal. We compare the IoT-NCN framework against a benchmark approach, implemented over NDN, in which the requested services are always executed by the same node, i.e.,

¹ To better understand the performance of the proposal, at this preliminary stage of the work, we decided not to consider the effect of caching over computed data. The impact of caching will be a subject matter of future work.
TABLE I
MAIN SIMULATION SETTINGS

Parameter                        | Value
---------------------------------|-----------------------------------------------
Number of services               | 40
Number of consumers              | 2-10
CPU frequency                    | 4 GHz (root node), 2 GHz (other edge nodes)
Input data size                  | 500 KB by 3 nodes (heavy workload) [19], 5 KB by 40 nodes (light workload)
Number of CPU cycles per service | 6000 Mcycles (heavy workload), 400 Mcycles (light workload)
Data packet payload              | 1 KB
Data rate per link               | 100 Mbps (core network), 50 Mbps (edge links), 20 Mbps (last hop)
the root node in the considered topology, which is the edge node closest to the consumer. This approach resembles the standard anycast NDN approach, where the first node with the capabilities to execute the service (i.e., the root node) retrieves the Data packets to be processed. In the following, it is referred to as the “Legacy” approach.

B. Simulation results

Results achieved for the two considered scenarios are depicted in Figure 4. It can be observed that, not surprisingly, the Legacy approach outperforms the proposal in terms of service provisioning time (Figure 4(b)) when heavy workload services are considered. This is because the more powerful root node is able to ensure the service execution in a shorter time. In particular, differences become more pronounced when more consumers request services, because the branching node saturates its available capacity and the service execution is delegated to nodes which are further from the IoT data sources, thus increasing the service provisioning time. However, the improvements of the Legacy approach are achieved at the expense of a higher overhead in terms of Data packets (Figure 4(a)), since raw data are delivered from IoT sources towards the root node. In our proposal, instead, the computation is executed close to the IoT data sources, with a reduction of the exchanged data traffic. A different trend can be observed when light workload services are considered, confirming the viability of pushing computations close to the IoT data sources to reduce both the overall service provisioning time (Figure 4(d)) and the volume of traffic (Figure 4(c)). As the number of consumers increases, the service provisioning time achieved by IoT-NCN gets closer to the one experienced by the Legacy approach, since the less powerful edge nodes get overloaded when handling requests from multiple consumers.

V.
DISCUSSION AND CONCLUSIONS

In this paper, we leverage the NDN philosophy to propose a distributed solution to orchestrate computing service execution over IoT data at the network edge. Compared to traditional host-centric IP approaches, it does not rely on purpose-built servers and on an ossified matching between services and
computing nodes, but allows any node in the path towards the IoT data sources to perform the requested computation. The proposal has the virtue of simplicity: it can run while ensuring backward compatibility with legacy NDN, since the forwarding logic and the semantics of exchanged packets are not violated. Achieved results show that the proposal is always more efficient than the benchmark approach, saving precious network resources, and outperforms it when considering light processing services, which will characterize many IoT applications. Moreover, the results clearly highlight the need for smarter strategies that distribute in-network service execution and ensure efficient utilization of network resources while accounting for: the computing resources of nodes and their current load; the workload needs of the services; and the delay requirements of the applications. Thus, multi-objective in-network service workload placement will be targeted as future work.

Fig. 4. Performance metrics for the two considered scenarios: (a) number of transmitted Data packets (heavy workload services); (b) service provisioning time (heavy workload services); (c) number of transmitted Data packets (light workload services); (d) service provisioning time (light workload services).

REFERENCES
[1] S. Yang, “IoT stream processing and analytics in the Fog,” IEEE Communications Magazine, vol. 55, no. 8, pp. 21–27, 2017.
[2] D. Alessandrelli, M. Petracca, and P. Pagano, “T-Res: Enabling reconfigurable in-network processing in IoT-based WSNs,” in IEEE DCOSS, 2013.
[3] W. Shang, A. Bannis, T. Liang, Z. Wang, Y. Yu, A. Afanasyev, J. Thompson, J. Burke, B. Zhang, and L. Zhang, “Named data networking of things,” in IEEE IoTDI, 2016.
[4] M. Amadeo et al., “Information-centric networking for the internet of things: challenges and opportunities,” IEEE Network, vol. 30, no. 2, pp. 92–100, 2016.
[5] L. Zhang et al., “Named data networking,” ACM SIGCOMM Computer Communication Review, vol. 44, no. 3, pp. 66–73, 2014.
[6] M. Amadeo, C. Campolo, A. Iera, and A. Molinaro, “Information centric networking in IoT scenarios: The case of a smart home,” in IEEE ICC, 2015.
[7] M. Amadeo, O. Briante, C. Campolo, A. Molinaro, and G. Ruggeri, “Information-centric networking for M2M communications: Design and deployment,” Computer Communications, vol. 89, pp. 105–116, 2016.
[8] W. Shang, Q. Ding, A. Marianantoni, J. Burke, and L. Zhang, “Securing building management systems using named data networking,” IEEE Network, vol. 28, no. 3, pp. 50–56, 2014.
[9] M. Sifalakis, B. Kohler, C. Scherb, and C. Tschudin, “An information centric network for computing the distribution of computations,” in ACM International Conference on Information-Centric Networking, 2014.
[10] S. Mastorakis, A. Afanasyev, I. Moiseenko, and L. Zhang, “ndnSIM 2.0: A new version of the NDN simulator for NS-3,” NDN, Technical Report NDN-0028, 2015.
[11] M. Król and I. Psaras, “NFaaS: Named function as a service,” in ACM Conference on Information-Centric Networking, 2017.
[12] Y. Ye, Y. Qiao, B. Lee, and N. Murray, “PIoT: Programmable IoT using information centric networking,” in IEEE/IFIP NOMS, 2016.
[13] Q. Wang, B. Lee, N. Murray, and Y. Qiao, “CS-Man: Computation service management for IoT in-network processing,” in IEEE ISSC, 2016.
[14] A.-C. G. Anadiotis, G. Morabito, and S. Palazzo, “An SDN-assisted framework for optimal deployment of MapReduce functions in WSNs,” IEEE Transactions on Mobile Computing, vol. 15, no. 9, pp. 2165–2178, 2016.
[15] A. Giridhar and P. R. Kumar, “Computing and communicating functions over sensor networks,” IEEE JSAC, vol. 23, no. 4, pp. 755–764, 2005.
[16] P. Vyavahare et al., “Optimal embedding of functions for in-network computation: Complexity analysis and algorithms,” IEEE/ACM Transactions on Networking, vol. 24, no. 4, pp. 2019–2032, 2016.
[17] M. Amadeo, C. Campolo, and A. Molinaro, “Multi-source data retrieval in IoT via Named Data Networking,” in ACM International Conference on Information-Centric Networking, 2014.
[18] J. Wang, R. Wakikawa, and L. Zhang, “DMND: Collecting data from mobiles using named data,” in IEEE VNC, 2010, pp. 49–56.
[19] X. Chen and J. Zhang, “When D2D meets cloud: Hybrid mobile task offloadings in fog computing,” in IEEE ICC, 2017.