IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 14, NO. 2, FEBRUARY 2018
Guest Editorial Special Section on Engineering Industrial Big Data Analytics Platforms for Internet of Things
Over the last few years, a large number of Internet of Things (IoT) solutions have come to the IoT marketplace. Typically, each of these IoT solutions is designed to perform a single task or a minimal number of tasks (primary usage). For example, a smart sprinkler may be activated only if the soil moisture level in the garden goes below a certain level. Further, smart plugs allow users to control electronic appliances (including legacy appliances) remotely or create automated schedules. Undoubtedly, such automation not only brings convenience to the owners but also reduces resource wastage. However, these IoT solutions act as independent systems. The data collected by each of these solutions are used by that solution and stored in access-controlled silos. After primary usage, data are either thrown away or locked down in independent data silos. We believe that a significant amount of knowledge and insight is hidden in these data silos and can be used to improve our lives; such data include our behaviors, habits, preferences, life patterns, and resource consumption. To discover such knowledge, we need to acquire and analyze these data together at a large scale.
To discover useful information and derive conclusions that support efficient and effective decision making, an industrial IoT platform needs to support a variety of data analytics processes such as inspecting, cleaning, transforming, and modeling data, especially in a big data context. IoT middleware platforms have been developed in both academic and industrial settings to facilitate IoT data management tasks, including data analytics. However, engineering these general-purpose, industrial-grade big data analytics platforms requires addressing many challenges.
The European Commission predicts that the present “Internet of PCs” will move toward an “Internet of Things” in which 50 to 100 billion devices will be connected to the Internet by 2020. Data analytics requirements in IoT vary greatly with the applications that a given platform has to support. Data types also vary widely, from text (e.g., Twitter analysis) to images to audio/video (e.g., elderly care, crime detection) to numerical values (e.g., smart energy). In summary, IoT platforms will be required to support different types of analytics such as batch processing, interactive, real-time, and predictive analytics. Some of the data can be processed in batches; however, some data analytics need to be carried out in real time in order to derive real value, due to their time-sensitive nature (e.g., fraud detection, intelligent traffic management).
Broadly, data analytics can be categorized into four main types: prescriptive, predictive, diagnostic, and descriptive. Prescriptive analysis reveals what actions should be taken. This is the most valuable kind of analysis and usually results in rules and recommendations for next steps. Predictive analytics tries to determine the most likely scenario by forecasting future outcomes. Diagnostic analysis looks at past performance to determine what happened and why. The result of diagnostic analysis is often an analytic dashboard. Big data analytics helps organizations harness their data and use it to identify new opportunities. It allows businesses to introduce new products and services, reduce costs, and make faster, better decisions. However, preserving users’ privacy and protecting user data using strong security measures are important challenges that need to be addressed by IoT platforms.
We accepted six manuscripts out of 24 submissions for this special section (25% acceptance rate) after a strict peer-review process. Each manuscript was blindly reviewed by at least three external reviewers before a decision was made.
The Industrial Internet of Things (IIoT) has distinct security requirements, one of which is studied by the authors of “Certificateless searchable public key encryption scheme for industrial internet of things.” Specifically, they study the problem of searchable encryption and explain the need for an efficient scheme for IIoT applications, before presenting their SCF-MCLPEKS scheme (secure channel free certificateless searchable public key encryption with multiple keywords). A security scheme needs to be proved secure mathematically or using formal methods; accordingly, the authors prove the security of SCF-MCLPEKS in a widely accepted security model. The authors also evaluate the performance of their proposed scheme.
Digital Object Identifier 10.1109/TII.2017.2788080
1551-3203 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
The authors of “Optimal decision making for big data processing at edge-cloud environment: An SDN perspective” address the problem of big data processing at the edge-cloud interface by designing an SDN-based solution. Because a huge amount of data flows from different Internet-enabled devices across the edge-cloud boundary, the network may be overloaded by an exponential increase in service requests from end users and devices. Because response time is a vital parameter for measuring the performance of any application in this environment, some applications may be executed at the edge-cloud interface through workload distribution, which is controlled and managed by an efficient SDN-based controller. The cost of data movement across multiple data centers is controlled by a game-theoretic approach. Finally, the obtained results demonstrate the effectiveness of the designed solution in the edge-cloud environment.
In “Deep convolutional computation model for feature learning on big data in internet of things,” the convolutional neural network is extended from the vector space to the tensor space for big data feature learning. With the deployment of various sensors in industry, the large volume of heterogeneous data collected poses a challenge for data analysis. To address this problem, a deep convolutional computation model is designed based on the tensor representation model. To make full use of the local features and topologies contained in big data, a tensor convolution operation is defined to prevent overfitting and improve training efficiency. Furthermore, a high-order backpropagation algorithm is proposed to train the parameters of the deep convolutional computation model in the high-order space. The experimental results validate the presented model and show that it achieves superior learning performance compared with existing methods.
In “Bilateral LSTM: A two-dimensional long short-term memory model with multiply memory units for short term cycle time forecasting in re-entrant manufacturing systems,” the authors propose a variant of the long short-term memory (LSTM) network to forecast the short-term cycle time (CT) of wafer lots, which are used in electronics for the fabrication of integrated circuits. It is crucial to predict the CTs of wafer lots in order to help manufacturers maintain rationalized production and high delivery reliability. CT prediction facilitates production control such as rebalancing work in progress and changing dispatching rules and priorities.
However, there are two main challenges in forecasting the CT of wafer lots: the constraints on machine dedication and mask setup. Regarding the machine dedication constraint, each layer of a wafer lot should be processed on the same machine so that the circuits printed on each layer can later be correctly connected. Therefore, the CTs of different re-entrances of the same wafer lot are correlated, which the authors define as “layer correlation.” On the other hand, for the circuits on the same layers of different wafer lots of the same kind, the machine needs calibration, which is referred to as the mask setup constraint. The calibration takes time, which affects the CT of a re-entrance of a wafer lot to the machine. The processing times of two layers that have the same re-entry time to the machine and belong to two different wafer lots are correlated. This correlation is referred to as “wafer correlation” by the authors. To take both “layer correlation” and “wafer correlation” into account in the prediction of CTs, the authors propose a two-dimensional LSTM neural network architecture with a multiply constant error carousel, which they call “bilateral LSTM.” Their results show that the bilateral LSTM is more accurate and stable than existing methods.
In “Leveraging analysis of user behavior to identify malicious activities in large scale social networks,” the authors propose a user behavior analysis model to detect malicious behavior in online social networks. The authors assess the performance of their model using data collected from a significant number of Twitter and YouTube user profiles, together with 13 million channel activities. The behavior analysis model has four layers: the social sensing layer, the data acquisition and preparation layer, the data storage management layer, and the analysis representation layer. The social sensing layer acts as an interface between the social interactions of end users and the data acquisition layer, allowing the end user to enter requests to assess certain user profiles in order to detect malicious users. Having entered a request, the user can monitor the data collection process. The social service application programming interface (API) manager uses the APIs of Twitter and YouTube to formulate and execute the requests, and finally passes the response to the data acquisition and preparation layer, which is responsible for data acquisition and cleaning. The information harvested from the social networks is transformed into a suitable format and transferred to the data storage management layer. In this way, the data can be updated whenever needed. All parallel requests, responses, and data manipulation processes are handled concurrently in this layer, which also ensures the coordination of incoming and outgoing streams of data. The analysis representation layer extracts some features and also derives further features through offline calculations. The features can be categorized as member-based, content-based, system-based, and composite characteristics. The analysis of these features can answer queries related to topic behavior, activities and interactions, and network behavior. To detect topic behavior, the behavior analysis model employs a text mining algorithm, which is an updated version of latent Dirichlet allocation.
Regarding activities and interactions, the actions of users are explained by deduction from the mass of their activities collected over time. Regarding network behavior, data are collected and analyzed for two types of networks, namely social and interactive networks. To classify whether a user is malicious, five different classifiers were assessed: random forests, the J48 decision tree, classification by regression, a support vector machine, and an iterative regression model proposed by the authors. The performance results obtained using a significant amount of Twitter and YouTube data indicate that the classifier engine in the proposed user behavior analysis model achieves a good balance between sensitivity and specificity.
The Internet of Vehicles (IoV) is an example of the Internet of Things (IoT) that supports ubiquitous information exchange and content sharing among vehicles. One promising way of implementing IoV is device-to-device (D2D) communication, which allows data transmission between two mobile entities in proximity to each other without the data having to go through the base station (BS). D2D communication in IoV has the potential to improve spectral efficiency and reduce latency. However, successful implementation of such communication is challenging due to the limited frequency spectrum, constrained battery capacity, and vehicular users’ diverse preferences for multimedia-rich content such as route planning, collision warning, online games, and traffic monitoring. In “Social big-data-based content dissemination in internet of vehicles,” the authors use the term “device-to-device vehicle-to-vehicle (D2D-V2V) communication” to refer to D2D communication in IoV. They propose a solution for the joint problem of peer discovery, power control, and channel selection in D2D-V2V communication. The proposed solution combines a physical layer model with similarities among vehicular users’ multimedia content selections. The physical layer model consists of a channel model and connection probability estimation, while the similarities among vehicular users’ multimedia content selections are estimated by processing big data obtained from two online social networks that are popular in China, namely Sina Weibo and Youku.
C. PERERA, Guest Editor
School of Computing Science
Newcastle University
Newcastle upon Tyne NE1 7RU, U.K.
[email protected]
A. V. VASILAKOS, Guest Editor
Department of Computer Science, Electrical and Space Engineering
Luleå University of Technology
Luleå 971 87, Sweden
[email protected]
G. CALIKLI, Guest Editor
Chalmers University of Gothenburg
Göteborg 412 58, Sweden
[email protected]
Q. Z. SHENG, Guest Editor
Department of Computing
Macquarie University
Sydney, NSW 2109, Australia
[email protected]
K.-C. LI, Guest Editor
Department of Computer Science and Information Engineering (CSIE)
Providence University
Taichung City 43301, Taiwan
[email protected]
Charith Perera (M’15) received the B.Sc. (Hons.) degree in computer science from Staffordshire University, Stoke-on-Trent, U.K., in 2009, the MBA degree from the University of Wales, Cardiff, U.K., in 2012, and the Ph.D. degree in computer science from the Australian National University, Canberra, ACT, Australia, in 2015. He is currently a Researcher with Newcastle University, Newcastle upon Tyne, U.K. Previously, he worked at The Open University, U.K. Before that, he worked with CSIRO, where he was involved in the EU FP7-funded OpenIoT project. He has authored or coauthored a number of research papers in the IoT domain, especially in the areas of IoT middleware and mobile data analytics. His research interests include the Internet of Things, sensing as a service, usable privacy, sensing infrastructures, and architectures. Dr. Perera is a member of ACM. For more information: www.charithperera.net.
Athanasios V. Vasilakos received the Ph.D. degree from the University of Patras, Greece, in 1998. He is a Professor with the Luleå University of Technology, Luleå, Sweden. Dr. Vasilakos has served or is serving as an Editor for many technical journals, such as the IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, the IEEE TRANSACTIONS ON CLOUD COMPUTING, the IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, the IEEE TRANSACTIONS ON CYBERNETICS, the IEEE TRANSACTIONS ON NANOBIOSCIENCE, the IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE, ACM Transactions on Autonomous and Adaptive Systems, and the IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS. He is also the General Chair of the European Alliance for Innovation (www.eai.eu).
Gul Calikli received the B.S. degree in mechanical engineering and the M.Sc. and Ph.D. degrees in computer engineering from Bogazici University, Istanbul, Turkey, in 2000, 2004, and 2012, respectively. She is currently an Assistant Professor with Chalmers University of Technology, Gothenburg, Sweden. Previously, she was a Postdoctoral Research Fellow with The Open University, Milton Keynes, U.K., and a Postdoctoral Research Fellow with the Data Science Laboratory at Ryerson University, Toronto, ON, Canada. Her research interests include software engineering for privacy-aware adaptive systems, empirical software engineering, human aspects in software engineering, mining software repositories, and applications of artificial intelligence to building recommendation systems for software engineering. Dr. Calikli is a member of the IEEE Computer Society and ACM.
Quan Z. Sheng (M’06) received the Ph.D. degree in computer science from the University of New South Wales, Sydney, NSW, Australia, in 2006. He is a Full Professor and Head of the Department of Computing, Macquarie University, Sydney, NSW, Australia. He is the author of more than 310 publications. His research interests include the Internet of Things, big data analytics, service-oriented architectures, distributed computing, Internet computing, and the Web of Things. Dr. Sheng was the recipient of the ARC Future Fellowship in 2014, the Chris Wallace Award for Outstanding Research Contribution in 2012, and the Microsoft Research Fellowship in 2003. He is a member of the ACM.
Kuan-Ching Li (SM’07) received the Licenciatura in mathematics, and the M.S. and Ph.D. degrees in electrical engineering from the University of Sao Paulo, Sao Paulo, Brazil, in 1984, 1996, and 2001, respectively. He is currently a Professor with the Department of Computer Science and Information Engineering, Providence University (PU), Taichung, Taiwan. Prior to joining PU in 2003, he was a Postdoctoral Scholar with the University of California, Irvine, Irvine, CA, USA. He has been actively involved in many major conferences and workshops in program/general/steering conference chairman positions and as a program committee member, and has organized numerous conferences related to high-performance computing and computational science and engineering. His research interests include GPU/manycore computing, Big Data, and Cloud. He has authored or coauthored more than 200 journal and conference papers, and is the author or editor of more than 10 books published by CRC Press, Springer, McGraw-Hill, and IGI Global. Dr. Li was a recipient of awards and funding support from a number of agencies and industrial companies, and has also received guest and distinguished chair professorships from universities in China and other countries. He is the Editor-in-Chief of the International Journal of Computational Science and Engineering, International Journal of Embedded Systems, and International Journal of High-Performance Computing and Networking (Inderscience). He is a member of AAAS and a Fellow of the IET.