Future Generation Computer Systems



Contents lists available at ScienceDirect

Future Generation Computer Systems journal homepage: www.elsevier.com/locate/fgcs

Optimization of real-time traffic network assignment based on IoT data using DBN and clustering model in smart city

Jiachen Yang a, Yurong Han a, Yafang Wang a, Bin Jiang a, Zhihan Lv b,*, Houbing Song c

a School of Electrical and Information Engineering, Tianjin University, Tianjin, China
b School of Data Science and Software Engineering, Qingdao University, Qingdao, China
c Department of Electrical, Computer, Software, and Systems Engineering, Embry-Riddle Aeronautical University, Daytona Beach, FL, United States

highlights

• A dynamic traffic network is established using GIS and the Internet of multimedia.
• An improved K-means that takes transportation as a weight is proposed.
• A DBN model that pre-processes large real-time IoT data is built.

article info

Article history: Received 22 May 2017; Received in revised form 1 December 2017; Accepted 5 December 2017; Available online xxxx

Keywords: Smart city; Dynamic transportation assignment; IoT; Large data; Real-time online stream; DBN

abstract

With the rapid development of the information age, the smart city has gradually become the mainstream of urban construction. Dynamic transportation assignment has attracted growing interest in smart city construction in the new era of the Internet of Things (IoT), because urban road traffic lies at the heart of problems in many fields, such as city congestion and the planning of processing center systems. In this paper, we analyze the economic indexes of processing centers and optimize the dynamic transportation network assignment based on a continuous stream of big IoT input data, and we propose a high performance computing model for dynamic traffic planning. Specifically, whereas previous methods exploited the geographical information system (GIS) or K-means separately, the proposed transportation planning is based on real-time IoT and GIS data processed by a DBN and K-means, so that the final solution is close to practice and meets the requirements of high performance computing and of economic cost, which is regarded as the key target index. Moreover, considering the large-data characteristics of the real-time online stream, a deep belief network (DBN) model is built to preprocess the data and improve the clustering effect of K-means. This study uses the hotel service center problem in Tianjin as an example case to evaluate the optimal dynamic traffic network planning result. The experiments show that, with high performance computing, the model helps to plan the traffic network optimally under real-time mass data and at low cost, and thereby promotes the construction and development of the smart city. © 2017 Published by Elsevier B.V.

* Corresponding author.

https://doi.org/10.1016/j.future.2017.12.012 0167-739X/© 2017 Published by Elsevier B.V.

Please cite this article in press as: J. Yang, et al., Optimization of real-time traffic network assignment based on IoT data using DBN and clustering model in smart city, Future Generation Computer Systems (2017), https://doi.org/10.1016/j.future.2017.12.012.

1. Introduction

Nowadays, the smart city has gradually become the mainstream of urban construction. A smart city can be understood as a technology-intensive network that uses advanced technology to link the population, information, vehicles [1], and other urban factors tightly into a whole. The data provided by a smart city network are big data streams. Now that the Internet has penetrated into people's lives, the real-time data are mainly Internet data streams, and the inconsistency problem affecting data stream processing [2] has to be considered. Transportation is a critical factor for most customers and industries in a smart city. An optimal dynamic transportation solution based on intelligent transportation techniques, advanced route planning algorithms, high performance computing systems and so on can not only help to relieve urban traffic congestion but also bring great economic benefits [3–6]. However, traffic assignment is a complex process affected by many factors, such as traffic patterns, applied areas, mass data processing and dynamic data collection. In the past few years, many topics have been researched in depth to achieve optimal transportation resource allocation: an approach using multiple origin-destination (O-D) pairs to solve the system optimal dynamic traffic assignment problem for networks [7], the Lighthill–Whitham–Richards (LWR) model employed to describe traffic and capture queue spillback [8], research on designing task paths and optimally selecting lanes in a transportation network [9], and so on. In recent years, more and more attention has been paid to intelligent transportation. Hamid et al. describe possible intelligent transport system (ITS) applications that can use unmanned aerial vehicles (UAVs), and highlight the potential and challenges of UAV-enabled ITS for next-generation smart cities [10]. Zhu et al. [11] advocate a paradigm of transportation systems named public vehicles (PVs), in which both the number of vehicles and the required parking spaces are significantly reduced. By examining Smart Card data, Hong et al. observed a set of transit behaviors of metro passengers from the time intervals that identify boarding, transferring, or alighting at a station [12]. Whether the research concerns the best route, computational power, or the optimal assignment of transportation tasks, each study addresses its own target so that the transportation network approaches the optimum in real time, mostly on the basis of existing end points or spots. This paper proposes a system optimal dynamic traffic assignment plan based on big IoT data using a DBN and a clustering model in the smart city. The best configuration of the centers can be proposed by finding the right locations and the optimal transportation network solution that make the whole cost lowest. For a fast-growing enterprise or even a developing country, as more branches are established, transportation naturally becomes the main concern for enterprise expansion, or for government planning of a smart city or of certain industries.
In some industries, advanced algorithms have been implemented, such as a fast greedy heuristic for clustering and a fuzzy clustering-based hybrid [13–16]. Most of these cases are studied as mathematical problems, and quantitative results are obtained. However, because traffic conditions in the city change rapidly, it is difficult to design a universally applicable road planning program [17]. Therefore, in this paper, real-time traffic data are taken as the primary data, and the network is optimized based on the updated data. To bring the algorithm closer to the case studied, calculation based on GIS data and real transportation data from information technology is an essential approach [18,19]. Meanwhile, thanks to the latest technology and the related information and computing research [20–22] on intelligent transportation, it is possible to apply intelligent transportation technology and machine learning to advanced dynamic planning algorithms, and it is easier to set up an effective system for dynamic traffic within a complex mass data processing system. Therefore, this paper combines the K-means method with GIS information that incorporates the suppliers, the basic centers' operation cost and other parameters related to the optimization planning of hotel centers. The processing center is a key component of the whole supply chain in many industries. A processing center collects raw materials from the suppliers in real time, carries out further processing or manufacturing, and then returns the materials to the suppliers for sale or for the next manufacturing step [23–26]. For example, coal mines transport coal to thermal power stations for electricity generation; cash processing centers of banks process banknotes from the bank branches and return the qualified banknotes to each branch.
The cost of this kind of processing center comes mainly from the transportation between the center and the supplier points. Moreover, in planning a processing center, the required accuracy of the algorithm and the processing of huge real-time data make the calculation difficult. Heavy computation greatly weakens the advantage of an algorithm, so it has become a focus of algorithm design [27–29]. In addition, if the volume of data is big enough, hidden statistical information and potential correlations are revealed by the data set itself. On this basis, much research has been carried out, such as the simultaneous perturbation stochastic approximation (SPSA) algorithm for the calibration of dynamic traffic assignment models [30], heuristics based on the method of successive averages (MSA) [31] and tile-based reservation [32]. However, these algorithms are shallow traffic models, and their planning results need to be more reasonable and effective. This inspires us to extend and apply the concept of deep learning, which not only learns deep characteristics but also processes large amounts of data quickly. The DBN in this paper is used to process the big data and obtain pre-classification results, from which the initial clustering centers can be selected for better clustering accuracy and lower transportation cost. How to plan the processing centers properly and achieve the best results in the least time then needs deep analysis and research [33,34]:
(1) In order to digest all of the suppliers' material, how should the sub transportation network be set up by dividing the region into several parts?
(2) In each region, where should the center be located to make the transportation cost as low as possible?
(3) To cut the costs, how should the algorithm be designed using high performance computing and advanced technology?
In fact, solving the planning problem requires massive data to produce a precise result, such as the suppliers' quantity and geographic locations, the frequency of transportation to each supplier spot, the average vehicle cost, the driving route situation, etc. At the moment, however, there is no platform or system that integrates all the related data for the planning problem.
In fact, a system with high computational performance is necessary for a government or company group that wants to plan the transportation network and the processing centers systematically for future extension in a given region. Inspired by these problems and the previous research, this paper builds a high performance computing system for the dynamic transportation network to find the right locations and the optimal transportation network solution with the lowest whole cost. The innovations of this paper are as follows:
(1) This paper combines GIS information with the Internet of multimedia to obtain the big raw data that the K-means clustering algorithm needs, and establishes a dynamic traffic network to process real-time data.
(2) Compared with traditional K-means, an improved K-means that takes transportation as a weight is proposed to divide the region into several parts and to locate the centers so as to obtain the lowest comprehensive cost.
(3) A DBN model is built to obtain pre-classification results from the large real-time data; these results are used as the input of K-means to improve the accuracy of clustering and the computational performance.
The rest of this paper is organized as follows. Section 2 presents the setup of the system. The overall clustering algorithm is described in Section 3. Section 4 presents the discussion of the real application and the performance analysis of the proposed model. Finally, Section 5 concludes the paper with a discussion and an outlook on future work.

2. System setup

The dynamic transportation network for the processing center planning system is a complicated integrated system that combines intelligent transportation, database, network, and computing technologies. The algorithmic model and planning computing work as the core of the system, and the real-time GIS data and other online and offline figures serve as the input of the algorithmic module and computing [35–37]. The final target of the system is to utilize all the existing real-time data to plan how to configure the processing centers so as to achieve the following:
(1) Generate a dynamic optimal number of processing centers;
(2) Computing performance is better than that of the traditional model;
(3) The processing centers cover all the supplier spots;
(4) A clear dynamic transportation network with optimal transportation routes transfers the material between the supplier points and the corresponding processing center;
(5) A real-time optimal configuration of the transportation network and the processing centers accounts for both the centers' operation cost and the transportation cost.
The system is shown in Fig. 1 with its critical components and structure.

Fig. 1. System setup.

2.1. Algorithmic model

It mainly includes the DBN classification algorithm, the supplier spots clustering algorithm, the center locating algorithm, the vehicle routing algorithm and the multi-factor operation cost algorithm. This algorithmic model is the base of the computing system. Because applications differ, for example in the number of suppliers, the cost accounting approach, the weighting factor for each supplier, and whether transportation is charged per hour or per kilometer in a certain region, the algorithmic model needs to be adjusted according to the updated data before the system proceeds to the dynamic algorithmic planning computing [38,39].

2.2. Data base

It stores all the real-time data that the algorithmic calculation needs, including the GIS data. The data are always updated via the internet or managed by an administrator to guarantee that the planning result is based on the latest information. The data base is fundamental for the system, and includes the traffic situation, supplier capacity, location figures, transportation requests, vehicle specification, unit transportation cost, specified charges along special routes, GIS information of the region that covers the suppliers, the basic centers' operation cost and related parameters, etc.

2.3. GIS component

GIS is a computer system for the acquisition, storage, management, query, simulation and analysis of geographic information, which can process and analyze large amounts of geographic data and general geographic information [40]. For the transportation network planning of processing centers, GIS is the foundation that provides the traffic situation, the route network and the related figures to the algorithmic planning computing. All the data provided can be used for computing the centers' locations, vehicle transportation routing, transportation cost, etc.

3. Algorithmic model

In a certain region, it is assumed that there are n supplier spots that need to transport their material for processing. If only one processing center is established in the region, plenty of vehicles need to be equipped for supply, and vehicles cost a lot, especially for supplier spots that are far away from the processing center. If n processing centers are distributed in this region, the transportation cost is very low, but the total operation cost of the n centers is extremely high. How to configure the supply transportation network at a good balance is a problem to be specifically studied. The K-means algorithm is highly dependent on the initial cluster centers, which often results in a rather unstable clustering result and a large increase in the number of iterations. Traditional K-means selects the cluster centers randomly, which leaves much to chance. Therefore, in the iterative updating process, sometimes the efficiency of the


method is high, and sometimes the result does not converge. Moreover, K-means is based on objective functions, for which a gradient method is generally used to find the optimal values. Because the gradient method searches along the decreasing direction, the algorithm may fall into a local minimum when the initial values are improper, and sometimes the result does not converge. Therefore, the initial cluster centers should be chosen cautiously. To make the problem easier and the convergence faster, the system first divides the region into several small-scale zones using the DBN under certain constraint conditions, then calculates the optimal transportation network configuration cost within each separate zone, and finally tries to obtain the final approximation [41]. Normally the number of supplier spots is large, and the transportation between two points in the real world is measured not as the linear distance but along many connected road sections. In this research, a clustering algorithm [42–45] is used to cluster n data objects (dimensions) into K clusters (K < n), making the similarity of the objects within one cluster as high as possible and the similarity between different clusters as low as possible. Owing to these clustering features, the use of the DBN classification together with the clustering algorithm makes the zones more compact and the result closer to practice.

3.1. DBN algorithm

The system can classify the obtained data at random, but then the calculation is not accurate and its efficiency is low. A DBN can classify the data accurately and efficiently according to their characteristics, so the system can cluster the data based on this classification. In this paper, because of the large amount of data obtained, a DBN with two hidden layers is used.

The steps of the DBN classification approach in this paper can be summarized as follows: the first RBM receives the input from its nodes and models it on the hidden1 layer. Then the modeled input on the hidden1 layer is passed to the nodes in the hidden2 layer. This modeling and passing continues to the last layer, called the output layer, which produces the initial classification centers. To reduce the training time complexity, the DBN model is based on the restricted Boltzmann machine (RBM). It consists of a set of visible variables $X = \{x_i\}$ ($i$ indexes the nodes of the visible layer) and two sets of hidden-layer variables $H_1 = \{h_j^{(1)}\}$ and $H_2 = \{h_p^{(2)}\}$ ($j$ indexes the nodes of hidden layer 1 and $p$ the nodes of hidden layer 2; after repeated tests, the hidden-layer sizes selected are $n_1 = n/5$ and $n_2 = 5k$). The framework of the DBN model is shown in Fig. 2.

Fig. 2. Network structure of DBN.

In an RBM, the energy is expressed as:

\[ E(x, h) = b^{T} x + c^{T} h + h^{T} W x \tag{1} \]

where $W$ is the weight matrix, and $b$ and $c$ are the offset vectors of the input layer and the hidden layer, respectively. The distribution probability of a network node is determined by the distribution of the energy, and a smaller energy corresponds to a larger probability of occurrence. Therefore, the distribution probability is:

\[ p(x, h) = \frac{1}{z}\, e^{-E(x, h)} = \frac{e^{-E(x, h)}}{\sum_{i} e^{-E(x_i, h)}} \tag{2} \]

where $z$ is the normalization constant. Because there are no connections within the same layer of an RBM, the conditional distributions can be calculated as:

\[ p\left(h_j^{(1)} = 1 \mid x\right) = f\Big(b_i + \sum_{i} w_{ij} x_i\Big) \tag{3} \]

\[ p\left(h_j^{(1)} = 0 \mid x\right) = 1 - p\left(h_j^{(1)} = 1 \mid x\right) \tag{4} \]

\[ p\left(x_i = 1 \mid h^{(1)}\right) = f\Big(c_j + \sum_{j} w_{ij} h_j^{(1)}\Big) \tag{5} \]

\[ p\left(x_i = 0 \mid h^{(1)}\right) = 1 - p\left(x_i = 1 \mid h^{(1)}\right) \tag{6} \]

\[ p\left(h_p^{(2)} = 1 \mid h^{(1)}\right) = f\Big(d_j + \sum_{j} w_{jp} h_j^{(1)}\Big) \tag{7} \]

\[ p\left(h_p^{(2)} = 0 \mid h^{(1)}\right) = 1 - p\left(h_p^{(2)} = 1 \mid h^{(1)}\right) \tag{8} \]

\[ p\left(h_j^{(1)} = 1 \mid h^{(2)}\right) = f\Big(e_p + \sum_{p} w_{jp} h_p^{(2)}\Big) \tag{9} \]

\[ p\left(h_j^{(1)} = 0 \mid h^{(2)}\right) = 1 - p\left(h_j^{(1)} = 1 \mid h^{(2)}\right) \tag{10} \]

where $f$ is the sigmoid function, $\sigma(z) = \frac{1}{1 + e^{-z}}$. The parameters are updated according to Eqs. (11)–(15):

\[ \Delta w_{ij} = \varepsilon \left( \langle x_i h_j \rangle_{\mathrm{data}} - \langle x_i h_j \rangle_{\mathrm{model}} \right) \tag{11} \]

\[ \Delta w_{jp} = \varepsilon \left( \langle h_j h_p \rangle_{\mathrm{data}} - \langle h_j h_p \rangle_{\mathrm{model}} \right) \tag{12} \]

\[ \Delta x_i = \varepsilon \left( \langle x_i^{2} \rangle_{\mathrm{data}} - \langle x_i^{2} \rangle_{\mathrm{model}} \right) \tag{13} \]

\[ \Delta h_j^{(1)} = \varepsilon \left( \Big\langle \big(h_j^{(1)}\big)^{2} \Big\rangle_{\mathrm{data}} - \Big\langle \big(h_j^{(1)}\big)^{2} \Big\rangle_{\mathrm{model}} \right) \tag{14} \]

\[ \Delta h_p^{(2)} = \varepsilon \left( \Big\langle \big(h_p^{(2)}\big)^{2} \Big\rangle_{\mathrm{data}} - \Big\langle \big(h_p^{(2)}\big)^{2} \Big\rangle_{\mathrm{model}} \right). \tag{15} \]
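As an illustration of how the conditional probabilities of Eqs. (3) and (5) and the weight update of Eq. (11) fit together for a single RBM, the following minimal sketch may help. It is not the authors' implementation. It assumes that `b` holds the visible-layer offsets and `c` the hidden-layer offsets (following the definition given with Eq. (1)), and it approximates the model expectation with one step of contrastive divergence (CD-1), a technique the paper does not name. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Logistic function f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def hidden_probs(x, W, c):
    """p(h_j = 1 | x) for every hidden node j; W[i][j] links visible i to hidden j."""
    return [sigmoid(c[j] + sum(W[i][j] * x[i] for i in range(len(x))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    """p(x_i = 1 | h) for every visible node i."""
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]


def cd1_weight_update(x, W, b, c, eps, rng):
    """Return new weights after one CD-1 step (assumed approximation of Eq. (11)).

    <x_i h_j>_data uses the input x; <x_i h_j>_model uses a one-step reconstruction.
    """
    ph_data = hidden_probs(x, W, c)
    # Sample binary hidden states from their conditional probabilities.
    h_sample = [1.0 if rng.random() < p else 0.0 for p in ph_data]
    x_recon = visible_probs(h_sample, W, b)
    ph_model = hidden_probs(x_recon, W, c)
    return [[W[i][j] + eps * (x[i] * ph_data[j] - x_recon[i] * ph_model[j])
             for j in range(len(c))]
            for i in range(len(x))]
```

A call such as `cd1_weight_update(x, W, b, c, 0.0005, random.Random(0))` performs one update for one sample; a second RBM stacked on the hidden1 activations would be trained the same way with its own weights.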

The DBN algorithm frame is as follows:
(1) Initialize the number of execution times ($g_n$), the weight array ($W = [w_{ij}, w_{jp}]$) of the DBN network, the numbers of nodes in hidden layer 1 ($n_1$) and hidden layer 2 ($n_2$), and the number of categories used for clustering ($K$);
(2) Input the real-time data that the algorithmic calculation needs and normalize them to [0, 1];
(3) Train the DBN network model. The learning rate of the DBN model is set to 0.0005, and the batch size is set to 5 due to the large number of training samples. The purpose of the training is to minimize the cost function given the training set $(f_1, s_1), (f_2, s_2), \ldots, (f_N, s_N)$, where $f_i$, $i = 1, 2, \ldots, N$, is the output and $s_i$, $i = 1, 2, \ldots, N$, is the desired output. The cost function is expressed as

\[ \hat{\varphi} = \arg\min \sum_{i=1}^{N} \left( s_i - \varphi(f_i) \right)^{2} \tag{16} \]

where $N$ is the number of samples. The solution that minimizes the cost function is

\[ \hat{\varphi} = \sum_{i=1}^{N} \omega_i K(f, f_i) + b \tag{17} \]

where $\omega_i$ is the weight matrix for the $i$th sample, $b$ is the bias value, and $K(\cdot)$ is the reproducing kernel. The optimal $\omega_i$ and $b$ of the objective function are obtained by error back-propagation, and the initial data pre-classification is finally obtained after several epochs, with a maximum of 100 epochs;
(4) Output the clustering result and assign the category label to each cluster;
(5) Use the DBN model to run the test data set samples and define the category of each sample according to the clustering label of the output layer.
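Step (2) and the batch size quoted in step (3) can be illustrated with a short data-preparation sketch. It assumes that each sample is a list of numeric features and that normalization is a per-feature min-max scaling; the paper states only that the data are normalized to [0, 1], so the scaling rule and the function names are assumptions made for illustration.

```python
def min_max_normalize(rows):
    """Scale every feature column of `rows` to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(col) for col in cols]
    spans = [max(col) - min(col) for col in cols]
    return [[(v - lo) / sp if sp > 0 else 0.0
             for v, lo, sp in zip(row, lows, spans)]
            for row in rows]


def mini_batches(samples, batch_size=5):
    """Yield consecutive batches; 5 is the batch size quoted in step (3)."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]
```

The normalized rows would then be fed to the network batch by batch, for example `for batch in mini_batches(min_max_normalize(data)): ...`.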

Fig. 3. Clustering distribution.

3.2. Clustering algorithm

In traditional methods, K-means [46–48] first chooses K points as the initial clustering centers; in the iterative updating process, sometimes the efficiency of the method is high, and sometimes the result does not converge. After pre-processing by the DBN, which makes the problem easier and the convergence faster, we obtain more reasonable initial K clustering centers. In this paper, we first classify all the IoT data into K classes and determine the center of each set of data, then calculate the distance between the samples and the clustering centers, and finally assign each sample to the cluster whose center is closest to it. The typical clustering criterion function is defined as:

\[ J_c = \sum_{k=1}^{K} \sum_{x_i \in c_k} d(X_i, Z_k). \tag{18} \]

$Z_k$ is the center of cluster $k$, $d(X_i, Z_k)$ is the Euclidean distance between a sample and the corresponding clustering center, and $J_c$ is the sum of the distances from each sample to its corresponding clustering center. This criterion function is meant to find the division into K parts that minimizes the variance function and makes the clusters compact and independent. The algorithm frame is as follows:
(1) Use the DBN model to divide the n samples into K classes and choose the initial clustering centers $Z_k(I)$, $I = 1$;
(2) Calculate the distance from each sample to each clustering center, $d(X_i, Z_k(I))$, $i = 1, 2, 3, \ldots, n$; $k = 1, 2, 3, \ldots, K$, where the sum runs over the coordinates of the sample:

\[ d(X_i, Z_k(I)) = \sqrt{\sum \left( X_i - Z_k(I) \right)^{2}}. \tag{19} \]

If $Z_j(I)$ is the center closest to $X_i$, then

\[ X_i \in \omega_j. \tag{20} \]

(3) Calculate the new clustering centers:

\[ Z_k(I+1) = \frac{1}{n_k} \sum_{i=1}^{n_k} X_i^{(k)}, \quad k = 1, 2, \ldots, K; \tag{21} \]

(4) Determine: if

\[ Z_k(I+1) = Z_k(I), \tag{22} \]

the algorithm ends; else, if

\[ Z_k(I+1) \neq Z_k(I), \quad k = 1, 2, \ldots, K, \tag{23} \]

then set $I = I + 1$ and return to step (2).

3.3. Algorithm model and application

The research tries to divide the region into several parts based on the DBN model and the K-means algorithm, and to cluster based on GIS, as illustrated in Fig. 3. The target of the system is to utilize all the existing real-time data to find the optimal dynamic transportation network configuration as well as the lowest whole cost. When calculating the transportation network, including the processing centers' locations, the total transportation cost is the evaluation index that determines the number of processing centers and their independent locations in the K-means algorithm. The algorithm target index model is as below:

\[ \mathrm{Min}\,TC = \sum_{G_i \subseteq S} \sum_{j \in G_i} \left( \lambda_{ij} \delta_{ij} D_{ij} + C_i \right), \quad \forall j \in S. \tag{24} \]

See Table 1 for the meaning of the parameters in Eq. (24). Among them, $\delta_{ij}$ is obtained from the database and is calculated from the supplier's production plan and capacity. $\lambda_{ij}$ is extracted from the system database, which is also linked with the GIS data; it reflects the different average costs, with different road charges or tolls, incurred when a vehicle travels from the supplier point to the corresponding processing center through different road sections, highways, and so on.
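Steps (1) to (4) of the clustering frame in Section 3.2 can be sketched as follows. This is a minimal illustration and not the authors' code: the initial centers are passed in as an argument (standing in for the centers produced by the DBN pre-classification), the distance is the plain Euclidean distance rather than a GIS route distance, and the iteration cap `max_iter` and the rule for empty clusters are assumptions added so that the sketch always terminates.

```python
import math


def euclid(p, q):
    """Euclidean distance between two points given as coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_means(samples, init_centers, max_iter=100):
    """Iterate assignment and center update until the centers stop changing."""
    centers = [tuple(c) for c in init_centers]
    labels = []
    for _ in range(max_iter):
        # Step (2): assign each sample to its nearest center.
        labels = [min(range(len(centers)), key=lambda k: euclid(x, centers[k]))
                  for x in samples]
        # Step (3): recompute each center as the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [x for x, lab in zip(samples, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # assumption: an empty cluster keeps its center
        # Step (4): stop when no center has moved.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels
```

For example, `k_means([(0, 0), (0, 2), (10, 0), (10, 2)], [(1, 1), (9, 1)])` groups the first two and the last two points and returns their means as centers.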


Fig. 4. Example of the supplier spots and the processing center.

Table 1
Meaning of the parameters in Eq. (24).

Parameter    Meaning
S            all the supplier points
G_i          independent zone marked No. i
D_ij         the distance between processing center i and supplier point j
δ_ij         transportation request frequency of supplier point j
λ_ij         transportation cost coefficient from independent zone i to supplier point j
C_i          the operation cost of processing center i
TC           the total cost of all the processing centers, including the transportation cost and the operation cost
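With the parameters of Table 1, the target index of Eq. (24) can be evaluated as in the sketch below. Eq. (24) as printed places $C_i$ inside the inner sum; this sketch instead assumes that the operation cost of a center is charged once per zone, which matches the description of $C_i$ in Table 1 but is an interpretation, not something the paper states. The dictionary layout and the function name are hypothetical.

```python
def total_cost(zones, lam, delta, dist, op_cost):
    """Total cost TC over all zones.

    zones:   {zone i: list of supplier points j assigned to it}
    lam:     {(i, j): transportation cost coefficient lambda_ij}
    delta:   {(i, j): transportation request frequency delta_ij}
    dist:    {(i, j): distance D_ij between center i and supplier j}
    op_cost: {zone i: operation cost C_i}, charged once per zone (assumption)
    """
    tc = 0.0
    for i, suppliers in zones.items():
        transport = sum(lam[i, j] * delta[i, j] * dist[i, j] for j in suppliers)
        tc += transport + op_cost[i]
    return tc
```

Minimizing this quantity over candidate zone divisions and center locations is what the clustering procedure aims at.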

Fig. 5. Schematic flow chart of the calculation.

During the K-means calculation, each round finds a new centroid for each existing cluster, and a weight factor is applied in the new centroid calculation:

\[ X_k = \sum_{i=k_1}^{k_n} x_i \cdot \frac{T_i}{T_k}, \qquad Y_k = \sum_{i=k_1}^{k_n} y_i \cdot \frac{T_i}{T_k}, \tag{25} \]

\[ T_k = \sum_{i=k_1}^{k_n} T_i. \tag{26} \]
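The weighted centroid of Eqs. (25) and (26) amounts to a cost-weighted average of the member coordinates. The sketch below is a minimal illustration under the assumption that each supplier spot in the cluster carries one non-negative transportation cost $T_i$; the function name and argument layout are hypothetical.

```python
def weighted_centroid(points, costs):
    """Return (X_k, Y_k) as the average of `points` weighted by `costs`.

    points: list of (x_i, y_i) coordinates of the spots in one cluster
    costs:  list of transportation costs T_i, one per spot
    """
    t_k = sum(costs)  # Eq. (26): total weight of the cluster
    if t_k == 0:
        raise ValueError("total transportation cost of the cluster is zero")
    x_k = sum(x * t / t_k for (x, _), t in zip(points, costs))
    y_k = sum(y * t / t_k for (_, y), t in zip(points, costs))
    return x_k, y_k
```

A spot with a higher transportation cost pulls the center toward itself; with equal costs the result reduces to the ordinary mean of Eq. (21).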

In Eqs. (25) and (26), $T_i$ is the transportation cost from each supplier spot to its corresponding clustering center; it is used as a weight factor to help the algorithm converge faster while complying with the target index. In this research, the distance used in the algorithm is not the linear distance between two points but the distance based on the GIS. The real distance between two points in GIS is the shortest route among all the possible roads. Finding the shortest route under real-time conditions is the basic problem in minimizing the transportation cost. The related parameters are illustrated in Fig. 4.

3.4. Determination for transportation network of processing centers distribution configuration

Before determining the final transportation network and processing center distribution for the period of time, the range of the number of processing centers is needed first. After analyzing the clustering of the processing centers within this range, the corresponding transportation cost is obtained, and the result helps to find the best balance between the operation cost and the transportation cost. The final solution is more convincing, reasonable, and closer to practice owing to the deep and detailed algorithm calculation.
(1) Use the DBN network model to classify all the real-time IoT data and choose k initial spots as the clustering centers;
(2) Calculate the distance between each supplier spot and the k clustering centers based on the GIS system. $(X_k, Y_k)$ is the clustering center and $(X_i, Y_i)$ is the supplier spot. The coordinates $(X, Y)$ are expressed in latitude and longitude, which are stored in the GIS database;
(3) For each supplier spot, choose the minimum distance and the corresponding clustering center, and assign the supplier spot to this cluster;
(4) During the K-means algorithm, calculate the transportation cost with the vehicle transportation coefficient and the transportation request frequency of the supplier spots. Obtain the k clusters that give the lowest transportation cost. If the data are updated, the system returns to step (1) and reruns;
(5) Repeat step (1) to generate different numbers of clusters within the range of the clustering quantity, together with the corresponding best transportation configuration. The cost with different numbers of clusters is $CT_k$;
(6) Calculate the processing center operation cost under each clustering configuration, $CO_k$, and combine $CT_k$ and $CO_k$ to reach the final conclusion, including the number of processing centers, their locations, and the best transportation configuration to each supplier spot.
The algorithm flow chart is shown in Fig. 5.

Fig. 6. Hotels distribution. (a) Sample locations. (b) Samples with the coordinates.
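The selection in steps (5) and (6) of Section 3.4, where the transportation cost $CT_k$ and the operation cost $CO_k$ are combined over a range of cluster counts, can be sketched as a simple sweep. The two cost callables are hypothetical placeholders for the clustering run and the operation cost model; they are assumptions of this sketch, not interfaces described in the paper.

```python
def best_configuration(transport_cost, operation_cost, k_range):
    """Return (k, total) minimizing CT_k + CO_k over the candidate counts.

    transport_cost(k): CT_k, best transportation cost found with k clusters
    operation_cost(k): CO_k, operation cost of running k processing centers
    """
    totals = {k: transport_cost(k) + operation_cost(k) for k in k_range}
    best_k = min(totals, key=totals.get)
    return best_k, totals[best_k]
```

In a full system `transport_cost(k)` would run the DBN-initialized, weighted K-means for `k` centers on the current data; the sweep itself only compares the resulting totals.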

4. Applications and case study

In this section, we present a case study showing how to solve the transportation network planning problem for a large number of supplier spots. The application is the configuration of hotel service centers in Tianjin City. A service center provides the hotels in the city with daily supplies and services, and vehicles are sent out every day to exchange supplies between the service centers and the hotels. With the help of the system and the algorithm, the best configuration of the centers can be proposed.

4.1. Set up and module input data

The planning area lies within the Out-Ring of Tianjin city; the geographical distribution of the hotels in the region is shown in Fig. 6. The objective of the model is to find the optimal configuration of the service centers and the transportation network. The optimal location is where the sum of the transportation costs from the service centers to the hotels, CT_k, is minimal. To find the number of service centers with the minimal overall transportation and operation cost, the clustering has to be calculated for several scenarios; in this analysis, it is calculated for 5 to 30 service centers within the region. The clustering algorithm is based on the geographic distribution of the hotels and the traffic situation, and the transportation distances are derived from real geographic road routing. The basic data includes the geocoded data of all the 618 hotel locations within the Out-Ring of Tianjin city. The following data is used by the algorithm:

(1) The route calculation is based on the digital map of the OpenStreetMap project and the real-time traffic situation from the Internet;
(2) The geocoded data of 400 hotel locations is selected from OpenStreetMap;
(3) The frequency of each hotel's supply requests is acquired through an Internet questionnaire;
(4) The transportation calculation parameters are shown in Table 2.

4.2. Clustering and the transportation network result

The clustering algorithm is run 26 times, with the initial clustering centers obtained from the DBN model, dividing the region independently into 5 to 30 service centers, as shown in Fig. 7. For the transportation network resulting from each clustering, the annual transportation cost is calculated as shown in Fig. 8.

4.3. Operation cost model

For the operation cost model, the average annual operation cost is used in this case.

(1) The cost of store rent is based on the number of hotels the service center serves;
(2) Employee cost is estimated at approximately RMB 500,000 per year;
(3) Machine cost is allocated according to the number of hotels the service center serves, with machines depreciated over 10 years;
(4) The projected operation cost is classified by the size of the service center and its corresponding service volume.

4.4. Result of the transportation network and the operation model

The optimal number of service centers derived from this model within the Out-Ring of Tianjin is 25. The circle in Fig. 9 indicates the minimal total cost. The result shows only the transportation optimum of the mathematical model and does not take into consideration existing facilities (buildings), current locations of service centers, land use plans, land prices, etc.
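The operation cost items of Section 4.3 can be sketched as a small function. This is only an illustration under stated assumptions: rent and machine cost are taken to scale linearly with the number of hotels served, the employee cost of RMB 500,000 is taken to apply per service center, and the rate parameters `rent_per_hotel` and `machine_price_per_hotel` are hypothetical because the paper does not report them. The size classification of item (4) is not modeled.

```python
def annual_operation_cost(hotels_served, rent_per_hotel, machine_price_per_hotel,
                          employee_cost=500_000.0, depreciation_years=10):
    """Average annual operation cost of one service center, in RMB.

    rent_per_hotel and machine_price_per_hotel are hypothetical rates;
    employee_cost and depreciation_years follow the figures in the text.
    """
    rent = rent_per_hotel * hotels_served
    # Straight-line depreciation: spread the machine purchase price over its lifetime.
    machines = machine_price_per_hotel * hotels_served / depreciation_years
    return rent + employee_cost + machines


# Hypothetical rates, for illustration only.
print(annual_operation_cost(25, rent_per_hotel=1_000, machine_price_per_hotel=20_000))
```

Summing this quantity over all service centers of a configuration would give one possible reading of CO_k.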


Table 2 Detailed performance comparison.

Definition                  Item                                    Value
Transportation cost (RMB)   Cost per hour (including personnel)     30
                            Cost per kilometer                      0.8
Time restrictions           Breaks and loading time per tour        30 min
                            Maximum working time                    9 h
                            Average service time at hotels          15 min
Driving speed               Within InRing                           25 km/h
                            Out of InRing                           60 km/h
                            Highway and interurban roads            80 km/h
                            Freeway                                 110 km/h
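To show how the parameters of Table 2 could enter a tour calculation, the following sketch computes the duration and cost of a single delivery tour. It assumes that the hourly rate applies to the whole tour time (driving, loading and service) and that the per-kilometer rate applies to the driven distance; the function name `tour_cost`, the road-type keys and the example tour are hypothetical.

```python
# Parameters taken from Table 2.
SPEED_KMH = {"in_ring": 25.0, "out_of_in_ring": 60.0, "highway": 80.0, "freeway": 110.0}
COST_PER_HOUR = 30.0   # RMB, including personnel
COST_PER_KM = 0.8      # RMB
LOADING_H = 0.5        # breaks and loading time per tour
SERVICE_H = 0.25       # average service time per hotel
MAX_WORKING_H = 9.0    # maximum working time


def tour_cost(segments, hotel_stops):
    """Return (hours, cost_rmb, feasible) for one tour.

    segments: list of (distance_km, road_type) pairs along the route.
    Assumption: the hourly cost is charged on the full tour duration.
    """
    km = sum(dist for dist, _ in segments)
    driving_h = sum(dist / SPEED_KMH[road] for dist, road in segments)
    hours = driving_h + LOADING_H + SERVICE_H * hotel_stops
    cost = COST_PER_HOUR * hours + COST_PER_KM * km
    return hours, cost, hours <= MAX_WORKING_H


# Hypothetical tour: 10 km inside the InRing, 30 km outside it, 4 hotels served.
hours, cost, feasible = tour_cost([(10, "in_ring"), (30, "out_of_in_ring")], 4)
print(round(hours, 2), round(cost, 2), feasible)
```

In the study, the segment distances would come from the road routing on the digital map rather than from hand-entered values.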


Fig. 7. Clustering models result with 5, 10, 20, 25 service centers. (a) Network with 5 service centers. (b) Network with 10 service centers. (c) Network with 20 service centers. (d) Network with 25 service centers.

4.5. Discussion

In this research, convergence speed is one evaluation index for the clustering algorithm, measured by the number of re-clustering calculations. Table 3 shows that using the transportation cost as the weight when finding the new centroid helps the algorithm converge, saving 5–8 re-clustering calculations. K-means with the transportation cost as the target index therefore converges faster than traditional K-means, and it converges faster still when DBN is used to pre-process the big data. This does not introduce a deviation into the final solution, because the transportation cost is also the target index throughout the clustering calculation.
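A minimal sketch of a K-means loop with a weighted centroid update is given below. It is not the authors' implementation: it assumes that each point carries a transportation weight (for example, a supply request frequency) and that the new centroid is the weighted mean of its members. The `cost` argument stands in for the GIS routing cost and defaults to the Euclidean distance only so that the sketch is self-contained; initial centers are drawn at random rather than from a DBN.

```python
import math
import random


def weighted_kmeans(points, weights, k, cost=math.dist, max_iter=100, seed=0):
    """K-means sketch: assign by `cost`, update centroids as weighted means.

    points: list of coordinate tuples; weights: one non-negative weight per point.
    Returns (centroids, labels, n_iter), where n_iter counts re-clustering passes.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centers (assumption)
    labels = None
    for n_iter in range(1, max_iter + 1):
        # Assignment step: each point joins the center with the lowest cost.
        new_labels = [min(range(k), key=lambda j: cost(p, centroids[j])) for p in points]
        if new_labels == labels:       # no change: converged
            return centroids, labels, n_iter
        labels = new_labels
        # Update step: weighted mean of the members of each cluster.
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            total = sum(weights[i] for i in members)
            if total > 0:              # keep the old centroid if the cluster is empty
                centroids[j] = tuple(
                    sum(weights[i] * points[i][d] for i in members) / total
                    for d in range(len(points[0]))
                )
    return centroids, labels, max_iter


pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels, n_iter = weighted_kmeans(pts, [1] * 6, k=2)
print(labels, n_iter)
```

The returned `n_iter` corresponds to the kind of re-clustering count compared in Table 3, although the figures there come from the authors' own experiments.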


Table 3 Convergence comparison.

Clustering number   Algorithm                                                                          Convergence speed (re-clustering calculations)
5                   Traditional KM                                                                     30
                    KM with transportation as weight for new centroid                                  25
                    KM using DBN to pre-process data with transportation as weight for new centroid    22
15                  Traditional KM                                                                     38
                    KM with transportation as weight for new centroid                                  33
                    KM using DBN to pre-process data with transportation as weight for new centroid    30
20                  Traditional KM                                                                     40
                    KM with transportation as weight for new centroid                                  32
                    KM using DBN to pre-process data with transportation as weight for new centroid    30
25                  Traditional KM                                                                     28
                    KM with transportation as weight for new centroid                                  23
                    KM using DBN to pre-process data with transportation as weight for new centroid    20
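The savings quoted in the discussion can be checked directly against Table 3. The short sketch below assumes the three values per clustering number are listed in the order traditional KM, KM with transportation weight, and KM with DBN pre-processing plus transportation weight; the names `TABLE_3` and `savings` are introduced here for illustration.

```python
# Re-clustering calculations from Table 3:
# (traditional KM, KM with transportation weight, KM with DBN + transportation weight)
TABLE_3 = {5: (30, 25, 22), 15: (38, 33, 30), 20: (40, 32, 30), 25: (28, 23, 20)}


def savings(table):
    """Per clustering number: calculations saved by each variant vs. traditional KM."""
    return {k: (trad - weighted, trad - dbn) for k, (trad, weighted, dbn) in table.items()}


for k, (by_weight, by_dbn) in savings(TABLE_3).items():
    print(f"k={k}: weight saves {by_weight}, DBN + weight saves {by_dbn}")
```

Under this reading, the transportation weight alone saves 5 to 8 calculations, which matches the range stated in the text, and adding DBN pre-processing saves 8 to 10.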

Fig. 8. Annual transportation cost scenario.

Fig. 9. Transportation network model cost.

During the clustering calculation, different initial clustering centers may lead to different final clusterings, because the calculation can converge early to a second-best result. To address this, we run enough iterations to reduce the gap to the true optimum. We use the DBN model to divide the data into a number of categories, then choose the initial clustering centers and run the clustering iterations. In the case study, for 15 and 20 service centers, we ran the clustering 100 times, automatically switching the initial clustering centers, and took the best of the 100 final values. As Fig. 10 shows, the optimal value appears at the 52nd and the 76th iteration, respectively.

Fig. 10. Iteration calculation. (a) Service number is 15. (b) Service number is 20.

In this research, the GIS-based transportation cost is taken as the target index during the K-means calculation. Another new approach is as follows: first calculate the clustering with traditional K-means, using the Euclidean distance as the target index, and then calculate the real transportation cost for the resulting clustering. The optimal number of service centers derived from this approach is 27, as shown in Fig. 11. Comparing the costs for each number of service centers under the two approaches, the value at each point in this research is lower than the one obtained with traditional K-means. This indicates that directly taking the cost as the target index during clustering is more reasonable and closer to practice in this application, and it also removes the workload of a second, additional cost calculation.

Fig. 11. Transportation network model cost. (a) Cost with new approach. (b) Cost comparison between the 2 approaches.

5. Conclusion

In this paper, we analyzed the key points of processing centers and their relationship to dynamic transportation network assignment based on a large IoT database. A platform is presented as the necessary base to support the planning algorithm and to collect and update data. The transportation cost is always the key index in the clustering planning, including the generation of new centroids. In this research, all transportation planning and calculation is based on real-time IoT data and GIS data, so that the final solution is close to practice and meets the requirements of high performance computing. In our case study of the hotel service center distribution problem within the Out-Ring of Tianjin City, we follow an algorithm based on the DBN model and K-means, with an improved calculation that always takes transportation into account. The result shows that the solution is optimal, closer to practice, and fast to converge. This study not only opens a new direction for city traffic planning but can also produce large economic benefits, and thus accelerate the construction of the smart city network based on the IoT data set. The effect should be even clearer when the approach is applied to a larger region such as states, countries or continents, because longer or more complicated transportation has a greater influence on the operation of this kind of processing center. This research can also be extended to other transportation business models, such as the traditional supply chain between factories, suppliers, and customers; the algorithm and the target index would have to be updated accordingly and need more detailed analysis.

Acknowledgments

This work was supported in part by National Natural Science Foundation of China (No. 61471260), Natural Science Foundation of Tianjin (16JCYBJC16000), and Shandong Provincial Natural Science Foundation (ZR2017QF015).
[4] Bryan Raney, Kai Nagel, Iterative route planning for large-scale modular transportation simulations, Future Gener. Comput. Syst. 20 (7) (2004) 1101–1118.

)



[5] Wei Ni, Weigang Wu, Keqin Li, A message efficient intersection control algorithm for intelligent transportation in smart cities, Future Gener. Comput. Syst. (2016). [6] Yi Hou, Praveen Edara, Carlos Sun, Traffic flow forecasting for urban work zones, IEEE Trans. Intell. Transp. Syst. 16 (4) (2015) 1761–1770. [7] Kien Doan, Satish V. Ukkusuri, Dynamic system optimal model for multi-OD traffic networks with an advanced spatial queuing model, Transp. Res. C 51 (2015) 41–65. [8] Ke Han, Hongcheng Liu, Vikash V. Gayah, Terry L. Friesz, Tao Yao, A robust optimization approach for dynamic traffic signal control with emission considerations, Transp. Res. C 70 (2015) 3–26. [9] Yun Fei Fang, Feng Chu, Saïd Mammar, Meng Chu Zhou, Optimal lane reservation in transportation network, IEEE Trans. Intell. Transp. Syst. 13 (2) (2012) 482–491. [10] Hamid Menouar, Ismail Guvenc, Kemal Akkaya, A. Selcuk Uluagac, Abdullah Kadri, Adem Tuncer, UAV-enabled intelligent transportation systems for the smart city: Applications and challenges, IEEE Commun. Mag. 55 (3) (2017) 22– 28. [11] Ming Zhu, Xiao Yang Liu, Feilong Tang, Meikang Qiu, Ruimin Shen, Wennie Shu, Min You Wu, Public vehicles for future urban transportation, IEEE Trans. Intell. Transp. Syst. PP (99) (2016) 1–10. [12] Sung Pil Hong, Yun Hong Min, Myoung Ju Park, Kyung Min Kim, Suk Mun Oh, Precise estimation of connections of metro passengers from smart card data, Transportation 43 (5) (2016) 749–769. ¯ Beatriz Sousa Santos, Using cluster[13] Sérgio Barreto, Carlos Ferreira, Jos Paixao, ing analysis in a capacitated location-routing problem, European J. Oper. Res. 179 (3) (2007) 968–977. [14] Lev Kazakovtsev, Alexander Antamoshkin, Genetic algorithm with fast greedy heuristic for clustering and location problems, Informatica (38) (2014) 229– 240. [15] Şakir Esnaf, Tarık Küçükdeniz, A fuzzy clustering-based hybrid method for a multi-facility location problem, J. Intell. Manuf. 20 (2) (2009) 259–265. 
[16] John Mittenthal, Capacitated hierarchical clustering heuristic for multi depot location-routing problems, Int. J. Logist. 16 (5) (2013) 433–444. [17] Thomas Liebig, Nico Piatkowski, Christian Bockermann, Katharina Morik, Dynamic route planning with real-time traffic predictions, Inf. Syst. (2016). [18] Jurij Kotikov, Pavel Kravchenko, Optimizing transport-logistic cluster freight flows of a port megacity on the basis of GIS, Appl. Mech. Mater. 725–726 (2) (2015) 403–415. [19] Deepak Paramashivan Kaundinya, P. Balachandra, N.H. Ravindranath, Veilumuthu Ashok, A GIS (geographical information system)-based spatial data mining approach for optimal location and capacity planning of distributed biomass power generation facilities: A case study of Tumkur district, India, Energy 52 (2) (2013) 77–88. [20] G. Fusco, C. Colombaroni, L. Comelli, N. Isaenko, Short-term traffic predictions on large urban traffic networks: Applications of network-based machine learning models and dynamic traffic assignment models, in: Models and Technologies for Intelligent Transportation Systems, 2015. [21] Jiachen Yang, Ru Xu, Jing Cui, Zhiyong Ding, Robust visual tracking using adaptive local appearance model for smart transportation, Multimedia Tools Appl. (2016) 1–14. [22] A. Vishwanath, H. Gan, S. Kalyanaraman, S. Winter, Personalized public transportation: A mobility model and its application to melbourne, IEEE Intell. Transp. Syst. Mag. 7 (4) (2015) 37–48. [23] Sanjay Melkote, Mark S. Daskin, An integrated model of facility location and transportation network design, Transp. Res. A 35 (6) (2001) 515–538. [24] Reza Zanjirani Farahani, Elnaz Miandoabchi, W.Y. Szeto, Hannaneh Rashidi, A review of urban transportation network design problems, European J. Oper. Res. 229 (2) (2013) 281–302. [25] Nour Eddin El Faouzi, Henry Leung, Ajeesh Kurian, Data fusion in intelligent transportation systems: Progress and challenges — a survey, Inf. Fusion 12 (1) (2011) 4–10. 
[26] Junping Zhang, Fei Yue Wang, Kunfeng Wang, Wei Hua Lin, Xin Xu, Cheng Chen, Data-driven intelligent transportation systems: A survey, IEEE Trans. Intell. Transp. Syst. 12 (4) (2011) 1624–1639. [27] A. Koulakezian, H. Abdelgawad, A. Tizghadam, B. Abdulhai, Robust network design for roadway networks: Unifying framework and application, IEEE Intell. Transp. Syst. Mag. 7 (2) (2015) 34–46. [28] Seong Woo Kim, Wei Liu, Cooperative autonomous driving: A mirror neuron inspired intention awareness and cooperative perception approach, IEEE Intell. Transp. Syst. Mag. 8 (3) (2016) 23–32. [29] J. Yang, Z. Ding, F. Guo, H. Wang, N. Hughes, A novel multivariate performance optimization method based on sparse coding and hyper-predictor learning, Neural Netw. 71 (C) (2015) 45–54. [30] Lu Lu, Yan Xu, Constantinos Antoniou, Moshe Ben-Akiva, An enhanced SPSA algorithm for the calibration of Dynamic Traffic Assignment models, Transp. Res. C 51 (2015) 149–166. [31] Michael W. Levin, Matt Pool, Travis Owens, Natalia Ruiz Juri, S. Travis Waller, Improving the convergence of simulation-based dynamic traffic assignment methodologies, Netw. Spat. Econ. 15 (3) (2015) 1–22. [32] Michael W. Levin, Stephen D. Boyles, Intersection auctions and reservationbased control in dynamic traffic assignment, Transp. Res. Rec. 2497 (2015) 35– 44.

Please cite this article in press as: J. Yang, et al., Optimization of real-time traffic network assignment based on IoT data using DBN and clustering model in smart city, Future Generation Computer Systems (2017), https://doi.org/10.1016/j.future.2017.12.012.

J. Yang et al. / Future Generation Computer Systems ( [33] Kajal Lahiri, Vincent Wenxiong Yao, Economic indicators for the US transportation sector, Transp. Res. A 40 (10) (2006) 872–887. [34] Anna Nagurney, Jon Loo, June Dong, Ding Zhang, Supply chain networks and electronic commerce: A theoretical perspective, Netnomics 4 (2) (2002) 187– 220. [35] G.F. Bonham-Carter, Geographic information systems for geoscientists-model ing with GIS, Comput. Methods Geosci. 13 (1994). [36] Harvey J. Miller, Yi Hwa Wu, Gis software for measuring space-time accessibility in transportation planning and analysis, GeoInformatica 4 (2) (2000) 141–159. [37] Tao Qu, Steven T. Parker, Bin Ran, Large scale intelligent transportation system traffic detector data archiving, 2015. [38] L.I. Qiang, Carl E. Kurt, GIS-based itinerary planning system for multimodal and fixed-route transit network, in: Mid-Continent Transportation Symposium 2000, 2000. [39] Andreas Klose, Andreas Drexl, Facility location models for distribution system design, European J. Oper. Res. 162 (1) (2005) 4–29. [40] R. Sherlock, P. Mooney, A. Winstanley, J. Husdal, Shortest path computation: a comparative analysis, Proc. Gisruk (2002). [41] J. Current, H. Min, D. Schilling, Multiobjective analysis of facility location decisions, European J. Oper. Res. 49 (3) (1990) 295–307. [42] Jie Yu, Gang Len Chang, H.W. Ho, Yue Liu, Variation based online travel time prediction using clustered neural networks, in: International IEEE Conference on Intelligent Transportation Systems, 2008, pp. 85–90. [43] Zhi Hua Fan, Da Ke Huang, Juan Zi Li, Ke Hong Wang, Morphing cluster dynamics: time series clustering with multipartite graph, in: International Conference on Machine Learning and Cybernetics. 2004, vol.3, pp. 1749–1754. [44] Xiaozhe Wang, A. Wirth, Liang Wang, Structure-based statistical features and multivariate time series clustering. 2007, pp. 351–360. [45] B. Higgs, M. 
Abbas, A two-step segmentation algorithm for behavioral clustering of naturalistic driving styles, in: International IEEE Conference on Intelligent Transportation Systems. 2013, pp. 857–862. [46] M. Emre Celebi, Hassan A. Kingravi, Patricio A. Vela, A comparative study of efficient initialization methods for the k-means clustering algorithm, Expert Syst. Appl. 40 (1) (2013) 200–210. [47] Adam Coates, Andrew Y. Ng, LeArning Feature Representations with K -Means, Springer, Berlin, Heidelberg, 2012, pp. 561–580. [48] Abdolreza Hatamlou, Salwani Abdullah, Hossein Nezamabadi-Pour, A combined approach for clustering based on K -means and gravitational search algorithms, Swarm Evol. Comput. 6 (2012) 47–52.

Jiachen Yang received the M.S. and Ph.D. degrees in communication and information engineering from Tianjin University, Tianjin, China, in 2005 and 2009, respectively. He is currently a professor at Tianjin University. He is also a visiting scholar with the Department of Computer Science, School of Science, Loughborough University, U.K. His research interests include stereo camera, stereo vision research, pattern recognition, stereo image displaying, and quality evaluation.

Yurong Han is an M.S. student with the School of Electrical and Information Engineering, Tianjin University, Tianjin, China. Her research interests include deep learning, pattern recognition and stereo vision research.


Yafang Wang is an M.S. student with the School of Electrical and Information Engineering, Tianjin University, Tianjin, China. Her research interests include big data, intelligent transportation systems, machine learning and stereo vision research.

Bin Jiang received the B.S. and M.S. degrees in communication and information engineering from Tianjin University, Tianjin, China, in 2013 and 2016, respectively. He is currently pursuing the Ph.D. degree at the School of Electronic Information Engineering, Tianjin University, Tianjin, China. His research interests include pattern recognition, stereo vision research, stereo image displaying and quality evaluation.

Zhihan Lv received his Ph.D. degree from the Ocean University of China in 2012. He worked at CNRS (France) as a Research Engineer, at Umea University (Sweden) as a Postdoctoral Fellow, and at Fundacion FIVAN (Spain) as an Experienced Researcher. He was a Marie Curie Fellow in the European Union's Seventh Framework Programme LANPERCEPT. He has also been an assistant professor at the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China, since 2012. His research mainly focuses on Virtual Reality, Augmented Reality, Multimedia, Computer Vision, 3D Visualization, Graphics, Serious Games, Human Computer Interaction, Networks, Big Data, and Software Engineering.

Houbing Song received the Ph.D. degree in electrical engineering from the University of Virginia, Charlottesville, VA, in 2012. In 2012, he joined the Department of Electrical and Computer Engineering, West Virginia University, Montgomery, WV, where he is currently an Assistant Professor and the founding director of both the West Virginia Center of Excellence for Cyber-Physical Systems (WVCECPS) sponsored by the West Virginia Higher Education Policy Commission and the Security and Optimization for Networked Globe Laboratory (SONG Lab). His research interests lie in the areas of cyber-physical systems, internet of things, connected vehicles, wireless communications and networking, optical communications and networking, and digital image processing. Dr. Song’s research has been supported by the West Virginia Higher Education Policy Commission. Dr. Song is a senior member of IEEE and a member of ACM.
