Article

Resource Symmetric Dispatch Model for Internet of Things on Advanced Logistics

Guofeng Qin *, Lisheng Wang and Qiyan Li

Department of Computer Science and Technology, Tongji University, Shanghai 201804, China; [email protected] (L.W.); [email protected] (Q.L.)
* Correspondence: [email protected]; Tel.: +86-21-6598-1423

Academic Editor: Yuhua Luo
Received: 16 December 2015; Accepted: 31 March 2016; Published: 5 April 2016

Symmetry 2016, 8, 20; doi:10.3390/sym8040020

Abstract: Business applications in advanced logistics service are highly concurrent. In this paper, we propose a resource symmetric dispatch model for the concurrent and cooperative tasks of the Internet of Things. In the model, the terminals receive and deliver commands, data, and information over mobile networks, wireless networks, and sensor networks. The data and information are classified and processed by the clustering servers in the cloud service platform. The cluster service, resource dispatch, and load balancing cooperate to manage and monitor every application case during the logistics service lifecycle. In order to support high-performance cloud service, a resource symmetric dispatch algorithm among clustering servers and a load balancing method among the multi-cores of one server, built on NIO (Non-blocking Input/Output) and RMI (Remote Method Invocation), are used to dispatch the cooperating computation and service resources.

Keywords: Internet of Things; network integration; symmetric resource dispatch; cooperative systems; logistics service lifecycle

1. Introduction

The Internet of Things is a hot topic in theory and engineering because it involves the complexity of big data, massive amounts of information, the concurrency of operation and application, and real-time services. The Internet of Things is an auto-cooperative system and platform that integrates the Internet, Intranet, wide area networks (WAN), local area networks (LAN), 3G/4G, and sensor networks such as radio-frequency identification (RFID) networks, Bluetooth, the controller area network bus (CANBUS), video collection networks, and so on. For high performance and reliable response, a hybrid cluster-based (HC) wireless sensor network (WSN) architecture was proposed by Huang et al. [1], which can transmit emergency data packets efficiently during an emergency. An insect organization model was set up in sensor networks by Ma and Krings [2], which justified how and why insect sensory systems might promote computation and communication ability. A distributed detection method for a binary target, in which wireless sensors send local decisions to a fusion center, was considered by Kim et al. [3], who proposed two new concepts, detection outage and detection diversity, for long-term system performance. Luis and Sebastia [4] considered that the strong components of the system could provide a cost-effective, real-time, and high-resolution means of wildfire prevention in a sensor network. It is very important to study component structure for high performance of systems with low power. Saaty and Shih [5] considered that a network structure must satisfy two requirements: first, similar sensors should be identified and grouped together; second, the relationships among them should be kept accurately according to the flow of the networks. A new algorithm in the IP address range on ABC-type networks based on the QoS of the user was proposed by Haydar et al. [6] in order to achieve a better resource distribution for operators. Alexandros et al. [7] verified through many experiments that the core diameter size should change and adapt with the lightweight and scalability of tasks. A cooperative information integration platform was studied by Qin and Li et al. [8–11], which consists of intelligent mobile terminals and software systems and integrates 3G/4G, including the global positioning system (GPS), general packet radio service (GPRS) or code division multiple access (CDMA), Internet (Intranet), remote video monitor, and M-DMB (mobile digital media broadcast) networks. Miguel et al. [12] proposed a collaborative decision method to process tasks at lower power.

During the advanced logistics service lifecycle, application tasks from cloud terminals are highly concurrent with big data and information. For real-time customer service, dynamic symmetric resource dispatch and load balancing on multi-cores play a very important role for high-performance cluster computing ability at low power [13].

According to this motivation, a resource symmetric dispatch model for the Internet of Things over the advanced logistics service lifecycle is studied. In Section 2, a symmetric integration cooperative system structure is proposed; in Section 3, resource symmetric dispatching among clustering servers is studied, including load balancing among multi-cores; in Section 4, an experiment and an analysis of the load balancing algorithm are provided. Finally, an application case using a prototype system for the cloud platform on advanced logistics service is reported.

2. The Level Symmetric Structure for Logistics Business with Internet of Things
According to the logistics business application service process, the vehicles' GPS information, driver request messages, sensor information (for example RFID, temperature, pressure, stress, angle), and application tasks in all the terminals are sent to the cloud platform. On the other hand, the control center commands are sent to the application terminals by the integrated WAN over 3G/4G and the Internet (Intranet). Different types of information, including messages, data files, and video streams, can be sent and received freely and safely via the Internet. The details of the logistics business structure can be seen in Figure 1. The logistics business structure includes the Web service & electronic business layer, management layer, operation layer, and device execution layer. Communication among the different layers depends on the information and data bus, which receives and delivers information and data from the different business modules in the cloud service platform.

Figure 1. Business structure on advanced logistics service.


In the business layer structure, the users can do anything with client personal computers or cloud terminals, including mobile phones and PADs. The users are vehicle drivers, operators, and managers from groups, manufacturers, factories, branch companies, procurement buyers, suppliers, banks, steel factories, management departments, warehouse centers, etc. The users can dispatch vehicles to transport the goods, monitor and manage the status of the goods, and finish the financial settlement among the different organizers in the product life cycle at any time and anywhere through all available networks in the cloud platform.

For any-time and any-where business service, a symmetric service platform consists of a large set of intelligent mobile cloud terminals with software systems, integrated 3G/4G units, remote video monitors, and different kinds of communication networks such as GPS, GPRS (CDMA), Internet (Intranet), RFID, CANBUS, wireless network, Bluetooth, and so on. This symmetric method, moving from business service to data and information service, has greatly promoted business-to-business development and has sped up the circulation of commodities among customers.

2.1. Symmetric Structure for the Internet of Things

In the cloud platform, there are management, operation, and device execution layers in the application layer. There are mobile cloud terminals, B/S (Browser/Server structure) clients, and C/S (Client/Server structure) clients in the application layer. The network layer comprises the wireless communication network, Internet, Intranet, and wireless sensor network. The mobile cloud terminals and executive devices are connected to the cloud service platform with 3G or 4G and the Internet. The B/S clients are connected to the cloud platform through the Internet and Intranet, and the C/S clients are connected to the cloud platform through the Intranet. The logical layer includes the platform group tool-wares and middle-wares to deal with parallel application cases among servers. There are software tools for communication, application, the geographic information system (GIS), the Web server, and the interface in the cloud platform. There are two important supports of reliability and real-time performance in the platform: one is the support tools, and the other is resource symmetric dispatch among clustering servers, including load balancing among the multi-cores in one server. The support tools are composed of the XML data protocol transfer ware, Java database connectivity (JDBC), NIO (Non-blocking Input/Output), RMI (Remote Method Invocation), the load balancer, and so on. The data layer includes all the databases, which contain data, video, and audio. The details can be seen in Figure 2.

Figure 2. Layer structure of logistics service cloud platform.
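Since RMI is listed among the support tools of the logical layer, the sketch below shows one way a clustering server could expose a dispatch entry point over RMI. The interface name, method signature, and registry port are illustrative assumptions, not the platform's actual API.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical remote interface for submitting an application task to a clustering server.
interface TaskDispatchService extends Remote {
    String submitTask(String taskType, byte[] payload) throws RemoteException;
}

// Server-side implementation registered in the RMI registry so that the dispatcher
// or other servers can invoke it remotely.
public class TaskDispatchServer implements TaskDispatchService {
    @Override
    public String submitTask(String taskType, byte[] payload) throws RemoteException {
        // In the real platform this would enqueue the task for the local scheduler.
        return "accepted:" + taskType + ":" + payload.length + " bytes";
    }

    public static void main(String[] args) throws Exception {
        TaskDispatchServer server = new TaskDispatchServer();
        TaskDispatchService stub =
                (TaskDispatchService) UnicastRemoteObject.exportObject(server, 0);
        Registry registry = LocateRegistry.createRegistry(1099); // assumed registry port
        registry.rebind("TaskDispatchService", stub);
        System.out.println("Dispatch service bound and waiting for remote calls.");
    }
}
```

A client would obtain the stub with `registry.lookup("TaskDispatchService")` and call `submitTask` as if it were a local method.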


2.2. Hierarchical Business Task Dispatch in the Internet of Things

There are many application tasks from smart mobile phones, C/S & B/S clients, and auto-hardware devices, including mobile applications (APP), electronic business applications, and enterprise resource management applications. Because information is collected from auto-hardware devices and commands are delivered to execution devices, the procedures of data and information are highly concurrent and the throughput capacity is very large at any time. In order to improve the service performance of the cloud platform just in time, a hierarchical structure for business task dispatch is proposed: the first layer is used for assigning application tasks among the servers in the same service cluster; the second layer is charged with dispatching computation resources for one task among the multi-cores in the same server. The details of the hierarchical structure for application task dispatch can be seen in Figure 3.


Figure 3. Hierarchical structure for application task dispatch.

In order to improve parallel processing performance, the cloud service platform makes use of the reactor design pattern, and events are selected by the selector at set intervals. Because a non-blocking method dealing with Input/Output (I/O) operations may return without waiting for the I/O to finish in the thread pool, the platform achieves higher performance and greater parallel processing capacity. The details of the work mechanism of NIO can be seen in Figure 4. A ServerSocketChannel is first registered to the selector as an acceptor, i.e., a request handler. Then the acceptor handles the client requests passed from the selector and spawns SocketChannels to receive or send data. The selector polls all channels and dispatches client requests or I/O operations at set intervals. Of course, under a multi-core architecture, a thread pool is often used to contain many threads.

Figure 4. Work mechanism of NIO in clustering servers.
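A minimal, self-contained sketch of the reactor mechanism in Figure 4 is given below, assuming a single selector thread and a trivial echo-style handler; the class name and port are illustrative. In the platform itself, the read/decode/compute/encode/send stages would be handed to a thread pool rather than run on the selector thread.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Minimal single-threaded reactor: one selector dispatches accept and read events.
public class NioReactorSketch {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel acceptor = ServerSocketChannel.open();
        acceptor.bind(new InetSocketAddress(9000));              // assumed port
        acceptor.configureBlocking(false);
        acceptor.register(selector, SelectionKey.OP_ACCEPT);     // acceptor role

        while (true) {
            selector.select(500);                                // poll channels at set intervals
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    // The acceptor spawns a SocketChannel for the new client request.
                    SocketChannel client = ((ServerSocketChannel) key.channel()).accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    // read -> decode -> compute -> encode -> send, without blocking the selector.
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(1024);
                    int n = client.read(buf);
                    if (n > 0) {
                        buf.flip();
                        client.write(buf);                       // echo stands in for the real handler
                    } else if (n < 0) {
                        client.close();
                    }
                }
            }
        }
    }
}
```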


3. Resource Symmetric Dispatching among Clustering Servers in Cloud Platforms

For high-performance service just in time, there are clustering servers in the cloud platform. Resource symmetric dispatch and load balancing by the multiple-thread service are the key technical solution in the communication server cluster, the Web-service application server cluster, and the database server cluster. In order to symmetrically dispatch resources and balance load, there are two layers: one is resource dispatch among the clustering servers, and the other is load balancing over the multi-cores of a server; together they balance the tasks of the clustering servers and the computation resources among the multi-cores of each server.

3.1. Resource Dispatching Algorithm among Servers

Hypothesis: N is the number of computing units in the communication servers, E is the whole time cost without load on one unit, and L is the load of only one unit; the work load of the other units is 0. If the communication server can be parallelized, the time cost is E/N without load in the system [21,22]. If there is a load L on one unit, then its computing ability is 1/L, so if a load of 1 is added to the unit, its computing ability becomes 1/(L + 1), and the whole time cost C is as follows:

C = E / ((N − 1) + 1/(1 + L)) = E(1 + L) / (NL + N − L)    (1)

Equation (1) is a theoretical result under an idealized condition. In fact, the whole time cost C1 under the load-balanced condition of the cluster server can be approximated as follows:

C1 = C + δ(N) + Tcomm(N)    (2)

Tcomm(N) is the whole communication time, which can be determined by experience, and δ(N) is the estimated number of units. The details of the resource symmetric dispatch algorithm can be expressed as follows:

Algorithm:
Input: Graph G(V, E) and its sub-graph Gcluster(Vclusterp, Eclusterp), where V are the logic processors (LPs), the E's are empty links, Vclusterp is the set of LPs assigned to all processors, s ∈ Clusterp, and Eclusterp is the set of empty links belonging to each processor or q ∈ Clusterp; Clusterp is the set of neighbor-processors of p; LSTv, Tminu, Tsminu, v, Tw, u are such that u ∈ Clusterp, v ∈ V, and w ∈ E(V − Vclusterp).
Output: time-of-next Tu,v, and select an LP to process a new event.

Begin
    PQ is initialized and empty;                 /* PQ is the priority queue, a data structure */
    For all (u,v), u ∈ Vclusterp, v ∈ (V − Vclusterp) do
        Tv = 0;
    For all v ∈ Vclusterp do
        Temp = Min over w ∈ E(V − Vclusterp) of (Tminv, Tw, u);
        Tv = Max(Temp, LSTv);
        Tu = Min over v ∈ PQ of (Tv);            /* LST is the simulated time, T is the min timestamp of an LP */
        Insert(v, PQ);
    End for;
    While (not finished) do
        If (Non-blocking(Prj))                   /* unblock any process in Prj */
            Select an LP to process a new event  /* set up a logic processor to process a new event */
        End if
    End while
End
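As a rough illustration of how Equations (1) and (2) can support dispatching among servers, the sketch below estimates the adjusted cost C1 for each candidate server and sends the next task to the server with the smallest estimate. The ServerState fields, the overhead values standing in for δ(N) and Tcomm(N), and the example numbers are assumptions for illustration, not the platform's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: estimate the adjusted time cost C1 of Equations (1) and (2)
// for each server and dispatch the next task to the minimum.
public class SymmetricDispatchSketch {

    static class ServerState {
        final String name;
        final int units;        // N: number of computing units
        final double baseCost;  // E: time cost without load on one unit
        final double load;      // L: current load on the loaded unit

        ServerState(String name, int units, double baseCost, double load) {
            this.name = name;
            this.units = units;
            this.baseCost = baseCost;
            this.load = load;
        }

        // Equation (1): C = E(1 + L) / (N*L + N - L)
        double theoreticalCost() {
            return baseCost * (1 + load) / (units * load + units - load);
        }

        // Equation (2): C1 = C + delta(N) + Tcomm(N); delta and tComm are passed in
        // as experience-based overhead estimates (assumed values).
        double adjustedCost(double delta, double tComm) {
            return theoreticalCost() + delta + tComm;
        }
    }

    // Dispatch the next task to the server with the smallest estimated C1.
    static ServerState pickServer(List<ServerState> cluster, double delta, double tComm) {
        ServerState best = null;
        for (ServerState s : cluster) {
            if (best == null || s.adjustedCost(delta, tComm) < best.adjustedCost(delta, tComm)) {
                best = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<ServerState> cluster = new ArrayList<ServerState>();
        cluster.add(new ServerState("server-1", 4, 100.0, 2.0));
        cluster.add(new ServerState("server-2", 8, 100.0, 5.0));
        cluster.add(new ServerState("server-3", 4, 100.0, 0.5));
        System.out.println("Dispatch next task to " + pickServer(cluster, 1.5, 3.0).name);
    }
}
```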


Because the data and information of application tasks arriving at the clustering servers from mobile APPs, electronic business applications, resource applications, and auto-hardware devices are highly concurrent, this algorithm is put forward to dispatch applications and balance the work load from the different application terminals among the servers in the same cluster. This method can promote the ability and efficiency of the cluster and make the servers run efficiently at low power.

3.2. Load Balancing among Multi-Cores

More and more servers use a multi-core architecture. However, there exist some problems in utilizing multi-core processors. Firstly, it is well known that I/O operation is much slower than CPU processing, so traditional blocking I/O keeps the CPU waiting instead of doing any practical jobs [14,15]. Secondly, tasks are distributed unevenly to threads, which leads to workload unbalancing among multi-cores [16]. In order to solve these problems, a multi-core load balancing model based on the Java NIO (Java New I/O) framework is proposed, which utilizes both the high-performance non-blocking I/O of the Java NIO framework and the parallel processing ability of multi-core processors. In the JDK 1.4 API, Java NIO was designed to provide access to the low-level operations of modern operating systems, and the intent is to facilitate an implementation that can directly use the most efficient operations of the underlying platform [17,18]; it provides buffer-based, multi-channel, non-blocking I/O methods in the Java language. The work load is balanced among multi-cores in a user-level way rather than in a kernel way, including the type of load balancing involved in the Java Fork/Join framework, which is the basis of the proposed model, namely, a multi-core load balancing model and an overall framework with the NIO mechanism [19,20].

There exist two problems of multi-core load unbalancing because tasks are distributed unevenly to multi-threads [21]. Firstly, since JDK 1.2, a Java thread has been implemented by a native thread, which means that how the operating system supports native threads decides the implementation of Java Virtual Machine (JVM) threads. Both the Windows version and the Linux version of the Sun Java Development Kit (JDK) use a 1:1 mapping to implement Java threads, which means a Java thread is mapped to an LWP (light-weight process) and can be regarded as a native thread. This problem is studied in Linux; of course, it can be extended to other operating systems. Secondly, after comparing different combinations of CPU workload factors, Kunz pointed out that the number of tasks in the running queue is the best factor for evaluating the CPU workload in Linux [22]. In Linux, a task is a thread. As mentioned, a Java thread is a native thread, i.e., a kernel thread; therefore, a 1-slice Java thread and a 10-slice Java thread are equivalent when scheduled to cores by the Linux kernel. Thus, multi-core load unbalancing emerges. When developing an application under a multi-core architecture, distributing tasks evenly to cores by using multi-threading techniques is an effective way to enhance system performance [23]. A task scheduling model is proposed to solve the problem, which is based on the Java Fork/Join framework in JDK 1.7. The Fork/Join framework is a classic way of dealing with parallel programming. Though it cannot solve all of these problems, it is able to utilize multi-cores and make them cooperatively finish heavy tasks within its applicable scope.
In the Fork/Join framework, if a task can be divided into subtasks and the result can be obtained by combining the results of these subtasks, then the task is fit to be solved with the Fork/Join pattern, as in Figure 5. The task depends on its subtasks at a lower level, so task 0 cannot get the result until all the subtasks return. Other problems relating to parallelism, such as load balancing and synchronization, can also be solved with the framework.

Figure 5. Fork/Join Framework on NIO.
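A minimal sketch of the Fork/Join pattern of Figure 5 follows: a RecursiveTask computes a Fibonacci number by forking subtasks, the same kind of workload used in the experiment of Section 4. The sequential-computation threshold is an assumed tuning value.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Task 0 forks into Task 0-1 and Task 0-2, which fork further until the work is small enough.
public class FibTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 12;    // assumed cutoff for sequential computation
    private final int n;

    public FibTask(int n) { this.n = n; }

    @Override
    protected Long compute() {
        if (n <= THRESHOLD) {
            return fibSequential(n);
        }
        FibTask left = new FibTask(n - 1);
        FibTask right = new FibTask(n - 2);
        left.fork();                            // schedule the first subtask asynchronously
        long rightResult = right.compute();     // compute the second subtask in this thread
        return left.join() + rightResult;       // the parent returns only after the subtasks return
    }

    private static long fibSequential(int n) {
        return n <= 1 ? n : fibSequential(n - 1) + fibSequential(n - 2);
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool();               // one worker per available core by default
        System.out.println("fib(40) = " + pool.invoke(new FibTask(40)));
    }
}
```

A ForkJoinPool created with the no-argument constructor targets one worker per available processor, and its work-stealing deques keep the per-core subtask counts roughly even, which is consistent with the distribution reported in Table 1.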

3.3. Balanced Task Scheduling Algorithm among Multi-Cores

Service requests from clients are likely to include both heavy tasks and lightweight tasks. If one heavy task is distributed to one thread, it causes an unbalanced load. Analogously, too many lightweight tasks, each occupying a thread, increase the system cost [24,25]. Thus, the tasks executed in threads should be controlled to a suitable size. A method of task size controlling is used to balance the load of the multi-cores by parallelizing heavy tasks and serializing lightweight (LW) tasks, as in Figure 6. Because serialization of the lightweight tasks is relatively easy to attain (it just waits for enough tasks to arrive and fetches a thread from the thread pool to execute them; a small batching sketch follows Figure 6), we only discuss heavy task parallelization here. The task scheduling algorithm is based on the Java Fork/Join framework and a classic task scheduling algorithm.

Figure 6. Task parallelization and serialization.
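The serialization branch of Figure 6 can be sketched as a simple batcher that merges lightweight requests into fixed-size batches before a pooled thread executes them; the batch size and pool sizing below are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Collects lightweight tasks and runs each full batch on one pooled thread,
// instead of giving every small task its own thread.
public class LightweightTaskBatcher {
    private static final int BATCH_SIZE = 32;   // assumed batch size
    private final ExecutorService pool = Executors.newFixedThreadPool(
            Runtime.getRuntime().availableProcessors());
    private final List<Runnable> pending = new ArrayList<Runnable>();

    public synchronized void submit(Runnable lightweightTask) {
        pending.add(lightweightTask);
        if (pending.size() >= BATCH_SIZE) {
            final List<Runnable> batch = new ArrayList<Runnable>(pending);
            pending.clear();
            // The merged task is serialized: one worker runs the whole batch in order.
            pool.execute(new Runnable() {
                public void run() {
                    for (Runnable task : batch) {
                        task.run();
                    }
                }
            });
        }
    }
}
```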

The task scheduling algorithm in Figure 7 is described as follows:

Condition: (1) The server has n processors P1, P2, ..., Pn, and each processor Pi has a running task queue Qi on which the subtasks run; (2) there are also n worker threads W1, W2, ..., Wn, and Wi executes the tasks on Qi; (3) there are k tasks T1, T2, ..., Tk whose priority ranges strictly from high to low. Tj can be divided into m parallel subtasks tj,1, tj,2, ..., tj,m (m may differ for different tasks), which inherit the priority of their parent task through the fork operation.

The steps of the algorithm are the following:

I. For task Tj, the parallel subtasks tj,1, tj,2, ..., tj,m are distributed evenly to all task queues, so each queue has m/n parallel subtasks. Tasks with higher priorities are queued earlier than other tasks. Thus the distribution of parallel tasks is obtained and is presented in Figure 8.
II. If Qi is not empty, Wi dequeues a task from it to execute; else go to Step III.
III. Wi searches the system for the processor that has the longest task queue, locates the task that has the highest priority, and migrates it to Qi. Go back to Step II.


IV. If the migration in Step III fails, Wi tries to migrate lower-priority tasks or to search other processors and repeat the migration. If all of these fail and Qi is still empty, Wi blocks and waits for a new task to awaken it.
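A compact sketch of Steps I to IV under simplifying assumptions (plain FIFO per-core queues, integer priorities, coarse synchronization); the class is illustrative and omits the blocking and wake-up of an idle worker in Step IV.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Per-core FIFO queues receive subtasks round-robin; an idle worker migrates the
// highest-priority task from the longest queue.
public class PerCoreSchedulerSketch {
    static class Subtask {
        final int priority;                     // higher value = higher priority (assumed encoding)
        Subtask(int priority) { this.priority = priority; }
        void run() { /* the real work of t(j,i) would go here */ }
    }

    private final List<Deque<Subtask>> queues = new ArrayList<Deque<Subtask>>();

    PerCoreSchedulerSketch(int cores) {
        for (int i = 0; i < cores; i++) {
            queues.add(new ArrayDeque<Subtask>());
        }
    }

    // Step I: distribute the m subtasks of one task evenly over the n FIFO queues.
    synchronized void distribute(List<Subtask> subtasks) {
        for (int i = 0; i < subtasks.size(); i++) {
            queues.get(i % queues.size()).addLast(subtasks.get(i));
        }
    }

    // Steps II-IV: take local work first; otherwise migrate the highest-priority
    // task from the longest queue (returns null if every queue is empty).
    synchronized Subtask nextFor(int core) {
        Subtask local = queues.get(core).pollFirst();        // Step II
        if (local != null) return local;

        Deque<Subtask> longest = null;                       // Step III: find the longest queue
        for (Deque<Subtask> q : queues) {
            if (!q.isEmpty() && (longest == null || q.size() > longest.size())) longest = q;
        }
        if (longest == null) return null;                    // Step IV: nothing to migrate, worker waits

        Subtask best = null;                                 // migrate its highest-priority task
        for (Subtask t : longest) {
            if (best == null || t.priority > best.priority) best = t;
        }
        longest.remove(best);
        return best;
    }
}
```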

Figure 7. Dispatch of subtasks on multi-cores.

Figure 8. Service task delivering procedure of servers.

On the one hand, because of the FIFO task queue structure, later high-priority tasks are queued after earlier low-priority tasks, which assures that low-priority tasks get a chance to execute. On the other hand, task migration gets high-priority tasks executed as soon as possible.

3.4. Application Service Delivering Procedure of Servers

The improved task scheduling algorithm with Java NIO is proposed to keep the load among the overall clustering servers balanced. The procedure is presented in Figure 8.

The service task assigning module plays a very important role for system performance. Some important details of the module are presented as follows:

a. Task type: generally, the task type determines the task size and how tasks are handled, namely, parallelized or serialized. When receiving a task, the server judges its task type first and decides what to do next. For example, a file access task or a mathematical calculation task has more work to do than a submitted-form task, so such tasks are better parallelized.

b. Message priority queue: in order to handle messages with different priorities, a data structure of message priority queues is introduced to classify the messages. When these queues are polled, more messages in high-priority queues are chosen and forked.

c. I/O: Java NIO is used in both network I/O and native I/O to minimize the time spent waiting for processors in the system architecture.

d. CPU affinity: as a worker thread takes charge of a processor's task queue, it should be bound to that processor. However, the Java language does not provide CPU affinity, thus we use JNI (Java Native Interface) and set the CPU affinity of a worker thread by invoking a low-level C dynamic library.
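A minimal sketch of the message priority queues of item b, assuming three priority levels and illustrative per-round weights; in the platform, the drained batch would then be forked into the worker thread pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Weighted polling over per-priority message queues: high-priority queues contribute
// more messages per polling round. Priority levels and weights are assumed.
public class MessagePriorityQueues {
    public enum Priority { HIGH, MEDIUM, LOW }

    private final ConcurrentLinkedQueue<String> high = new ConcurrentLinkedQueue<String>();
    private final ConcurrentLinkedQueue<String> medium = new ConcurrentLinkedQueue<String>();
    private final ConcurrentLinkedQueue<String> low = new ConcurrentLinkedQueue<String>();

    public void enqueue(String message, Priority p) {
        switch (p) {
            case HIGH:   high.offer(message);   break;
            case MEDIUM: medium.offer(message); break;
            case LOW:    low.offer(message);    break;
        }
    }

    // One polling round: take up to 4 high, 2 medium, 1 low message (assumed weights);
    // the returned batch would be forked as tasks by the worker thread pool.
    public List<String> pollRound() {
        List<String> batch = new ArrayList<String>();
        drain(high, 4, batch);
        drain(medium, 2, batch);
        drain(low, 1, batch);
        return batch;
    }

    private static void drain(ConcurrentLinkedQueue<String> q, int max, List<String> out) {
        for (int i = 0; i < max; i++) {
            String m = q.poll();
            if (m == null) break;
            out.add(m);
        }
    }
}
```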

4. Experiment and Analysis

In order to verify system performance, an experiment is done to test the task scheduling algorithm in balancing load and utilizing multi-core processors. It compares the results of task execution with one core, two cores, and four cores. We adopt three types of classic tasks for the test: the Fibonacci program with argument 40 (Fib), quick sorting of 20,000 integers (Sort), and multiplication of a 512-bit integer by a 512-bit integer (Mul). All three types have the same priority. For each type of task, the number of subtasks executed on the cores and the total execution time were gathered. The tasks are executed 10 times and the results are averaged. The results are presented in Tables 1 and 2 and Figure 9.

Figure 9. Execution time of tasks.

Experiment environment: Intel Core i7 Q720 1.6 GHz, Linux 2.6.32, JDK 1.7. We use VMware Workstation 8.0 as the running environment to configure the number of cores.

Table 1. Number of subtasks on cores.

Test Program | 1 Core (CPU0) | 2 Cores (CPU0 / CPU1) | 4 Cores (CPU0 / CPU1 / CPU2 / CPU3)
Fib          | 35,421        | 17,806 / 17,635       | 8868 / 8990 / 8820 / 8742
Sort         | 66,670        | 33,020 / 33,754       | 16,253 / 16,879 / 17,120 / 16,445
Mul          | 87,381        | 43,912 / 43,469       | 21,769 / 21,773 / 21,964 / 21,874

Table 2. Execution time of tasks (unit: ms).

Test Program | 1 Core | 2 Cores | 4 Cores
Fib          | 6300   | 5895    | 4496
Sort         | 9788   | 8772    | 6857
Mul          | 11,302 | 10,129  | 7994


Table 1 shows that the subtasks are evenly distributed and executed on the cores. From Table 2 we can see that, as the number of cores increases, the time cost of each type of task decreases, but it does not scale strictly inversely with the number of cores (for example, from Table 2 the Fib task runs only about 1.4 times faster on four cores than on one core). According to Figure 9, the larger the task size, the higher the utilization of the multi-core processors can be.

5. An Application Case

In the cloud platform on the Internet of Things for advanced logistics service, many cameras are integrated into a mobile cloud vehicle terminal with a CANBUS or USB interface. The platform can provide service for about a million mobile cloud terminals and two thousand personal computer clients according to its design parameters; currently there are three hundred mobile cloud terminals and forty PC clients in the cloud service platform. The video stream data in JPG image format is collected and sent to the platform, where it is transferred to a video file, and the recognition software deals with the images. The security emulation software analyzes the status of the vehicle and raises alarms according to the deviation index of the lane line at any time. The day, week, and month reports of goods, warehouses, vehicles, employees, customers, finance, and so on are output in the platform.

The details of the logistics business cloud service and the algorithm application can be seen in Figure 10. There are logistics business operation and management service functions on the left, the geographic information system (GIS) map in the middle, and map layer management on the right in Figure 10a. There are real-time video and alarm status on the left, and vehicle running status reports on the right, in Figure 10b. The management and monitor service requests from operators and sensors arrive with high concurrency and must be dealt with by the task scheduling algorithm among the clustering servers and the load balancing algorithm among the multi-cores in the cloud service platform.

Figure 10. (a) Logistics business cloud service; (b) Status & security for vehicle in algorithm application.

6. Conclusions

Many flexible methods should be used to obtain data and information from sensors and terminals, in order to collect and deliver the structure and bio-structure data with the NIO mechanism.


There exist two layers of computing resource dispatch for high-performance application service: one is resource symmetric dispatch among clustering servers for application service, and the other is a load balancing method among the multi-cores in one server; they are in charge of delivering business applications to the clustering servers equally and balancing the task load to the multi-core processors effectively. A load balancing algorithm for cloud service was utilized to balance the computation resources among the multi-cores. By experiment, the cloud model and the algorithm were tested and proved to be feasible and efficient. An advanced logistics cloud service platform application case verified the high performance of symmetric resource dispatch. In the future, the necessary work will be to promote the system dependability, scalability, and other qualities of service.

Acknowledgments: This research is supported by the national 863 program of the Ministry of Science and Technology of P.R. China, project number 2013AA040302.

Author Contributions: The corresponding author, Guofeng Qin, works in the Department of Computer Science and Technology as an associate professor, Tongji University, Shanghai 201804, P.R. China; he designed the framework of the cloud platform on advanced logistics and the resource symmetric dispatch algorithm. He received a BA from Hunan University in 1995, an MA from the Huazhong University of Science and Technology in 2001, and a Ph.D. from Tongji University in 2004. Lisheng Wang and Qiyan Li are professors in the Department of Computer Science and Technology, Tongji University, Shanghai 201804, P.R. China; they provided the test tools and environment for the resource symmetric dispatch algorithm and analyzed the test data on the different test programs.

Conflicts of Interest: We declare no conflict of interest with other people or organizations.

References

1. Huang, M.; Liu, H.; Hsieh, W. A Hybrid Protocol for Cluster-based Wireless Sensor Networks. In Proceedings of the 13th Asia-Pacific Computer Systems Architecture Conference, Hsinchu, Taiwan, 4–6 August 2008; pp. 16–20.
2. Ma, Z.; Krings, A.W. Insect sensory systems inspired computing and communications. Ad Hoc Netw. 2009, 7, 742–755. [CrossRef]
3. Kim, H.-S.; Wang, J.; Cai, P.; Cui, S. Detection Outage and Detection Diversity in a Homogeneous Distributed Sensor Network. IEEE Trans. Signal Process. 2009, 57, 2875–2881.
4. Vicente-Charlesworth, L.; Galmes, S. On the development of a sensor network-based system for wildfire prevention. In Proceedings of the 8th International Conference on Cooperative Design, Visualization, and Engineering, Hong Kong, China, 11–14 September 2011; pp. 53–60.
5. Saaty, T.L.; Shih, H.-S. Structures in decision making: On the subjective geometry of hierarchies and networks. Eur. J. Oper. Res. 2009, 199, 867–872. [CrossRef]
6. Haydar, J.; Ibrahim, A.; Pujolle, G. A New Access Selection Strategy in Heterogeneous Wireless Networks Based on Traffic Distribution. In Proceedings of the 1st IFIP Wireless Days Conference, Dubai, United Arab Emirates, 24–27 November 2008; pp. 295–299.
7. Tsakountakis, A.; Kambourakis, G.; Gritzalis, S. A generic accounting scheme for next generation networks. Comput. Netw. 2009, 53, 2408–2426. [CrossRef]
8. Qin, G.; Li, Q. An Information Integration Platform for Mobile Computing. In Proceedings of the Third International Conference on Cooperative Design, Visualization, and Engineering, Mallorca, Spain, 17–20 September 2006; pp. 123–131.
9. Qin, G.; Li, Q. Strategies for resource sharing and remote utilization in communication servers. In Proceedings of the 4th International Conference on Cooperative Design, Visualization, and Engineering, Shanghai, China, 16–20 September 2007; pp. 331–339.
10. Qin, G.; Li, Q. Dynamic Resource Dispatch Strategy for WebGIS Cluster Services. In Proceedings of the 4th International Conference on Cooperative Design, Visualization, and Engineering, Shanghai, China, 16–20 September 2007; pp. 349–352.
11. Qin, G.; Wang, X.; Wang, L.; Li, L.; Li, Q. Remote Video Monitor of Vehicles in Cooperative Information Platform. In Proceedings of the 6th International Conference on Cooperative Design, Visualization, and Engineering, Luxembourg, Luxembourg, 20–23 September 2009; pp. 208–215.
12. Garcia, M.; Lloret, J.; Sendra, S.; Rodrigues, J.J.P.C. Taking cooperative decisions in group-based wireless sensor networks. In Proceedings of the 8th International Conference on Cooperative Design, Visualization, and Engineering, Hong Kong, China, 11–14 September 2011; pp. 61–65.
13. Wang, Y.; Qin, G. A multi-core load balancing algorithm based on Java NIO. Telkomnika Indones. J. Electr. Eng. 2012, 10, 1490–1495.
14. Wikipedia. Asynchronous I/O. Available online: http://en.wikipedia.org/wiki/Asynchronous_I/O (accessed on 28 March 2015).
15. Li, S.; Liu, N.; Guo, J. Multi-threading workload balance under multi-core architecture. Comput. Appl. 2008, 28, 138–140.
16. Wikipedia. New I/O. Available online: http://en.wikipedia.org/wiki/New_I/O (accessed on 10 September 2014).
17. Zhou, Z. Understanding the JVM: Advanced Features and Best Practices; China Machine Press: Beijing, China, 2011; pp. 336–337.
18. Kunz, T. The influence of different workload description on a heuristic load balance scheme. IEEE Trans. Softw. Eng. 1991, 17, 725–730. [CrossRef]
19. Lea, D. A Java Fork/Join Framework. In Proceedings of the ACM 2000 Conference on Java Grande, San Francisco, CA, USA, 3–5 June 2000; pp. 36–43.
20. Steven, H.; Costin, I.; Filip, B. Load Balancing on Speed. In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Bangalore, India, 9–14 January 2010; pp. 124–157.
21. Vivek, K.; William, G. Load Balancing for Regular Meshes on SMPs with MPI. In Proceedings of the 17th European MPI Users Group Meeting, Stuttgart, Germany, 12–15 September 2010; pp. 229–238.
22. Shameem, A.; Jason, R. Multi-core Programming: Increasing Performance through Software Multi-Threading; China Machine Press: Beijing, China, 2007; pp. 12–13.
23. Gun, Z.; Dai, X. The Fork/Join Framework in JDK 7. Available online: http://www.ibm.com/developerworks/cn/java/j-lo-forkjoin/ (accessed on 23 August 2007).
24. Boukerche, A.; Tropper, C. Parallel Simulation on the Hypercube Multiprocessor. Distrib. Comput. 1995, 8, 181–190. [CrossRef]
25. Fujimoto, R.M. Parallel Discrete Event Simulation. Commun. ACM 1989, 33, 30–53. [CrossRef]

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).
