Privacy-enhancing Federated Middleware for the Internet of Things

Paul Fremantle
School of Computing, University of Portsmouth, Portsmouth, UK


ABSTRACT The Internet of Things (IoT) offers new opportunities, but alongside those come many challenges for security and privacy. Most IoT devices offer no choice to users of where data is published, which data is made available and what identities are used for both devices and users. The aim of this work is to explore new middleware models and techniques that can provide users with more choice as well as enhance privacy and security. This paper outlines a new model and a prototype of a middleware system that implements this model.

CCS Concepts •Security and privacy → Embedded systems security;

Keywords Internet of Things; Security and Privacy; Cloud; Sensors and Actuators

1. INTRODUCTION AND RELATED WORK

IoT devices publish data and actuate based on commands, in both cases talking to Internet-based Cloud Services. Typically these services are tied to a specific Cloud Service Provider (CSP), often the manufacturer of the device. CSPs may not be trustworthy: they may misuse or overuse data, or they may be hacked. Some CSPs have closed down, causing their devices to stop operating. A common attack on CSPs has been to steal credentials and publish users’ passwords. One of the core issues with hardware devices is that they are vulnerable to physical attacks, which often aim to steal secrets or keys, and it is a challenge to provide devices with secure identities during manufacturing. The Web is moving to models such as OAuth and OpenID Connect where users can choose secure providers to supply identities, reducing the number of passwords they need to maintain, but there are no similar moves to allow IoT devices to choose their identity providers or CSPs.

Middleware Doctoral Symposium ’16, December 12-16 2016, Trento, Italy. © 2016 Copyright held by the owner/author(s). Publication rights licensed to ACM.

ISBN 978-1-4503-4665-8/16/12 ... $15.00
DOI: http://dx.doi.org/10.1145/3009925.3009929

In a structured literature review, we identified 152 papers, and from these 22 IoT middleware systems; the review is published in [1]. Only 12 of these systems offered any detailed security model, only three of those offered a model that supported federated identity, and none used federated identity at the device level. The most complete model is that of Webinos [6], which introduces the concept of a Personal Zone Hub (PZH), a middleware runtime dedicated to a specific user. However, the instantiation of the PZH is not addressed, and therefore the system is only suitable for developers or other experts who can run their own PZH. Nor did Webinos examine the cost of running a PZH.

Other systems do not provide a full middleware for IoT, but provide specific identity and security capabilities. These systems extend federated identity to the device level. IoT-OAS [4] addresses the use of OAuth2 with the CoAP protocol. The mapping of the OAuth2 Token API to support IoT devices using the CoAP protocol is being formalised in [10]. The authors of [12] demonstrate the OAuth1 protocol with MQTT, favouring OAuth1 over OAuth2 for IoT devices. The authors of [7] propose a “security manager” that performs some of the functions of the Device IdP in our model.

One interesting area is anonymous identities for IoT. In [13] a capability-based access system is described that allows anonymous identities to be used, and [2] provides an Architecture Reference Model for an approach that supports anonymous identities. Neither of these systems separates the provision of anonymous identities from the data-sharing middleware.

Two further gaps were identified in the current approaches. No system addresses how to securely configure identifiers into devices or how users can claim ownership of devices in an effective way. In addition, none of the systems examined allows users to choose identity providers and cloud providers independently, so that they can select the most secure provider of each. We address these problems in our approach below.

2. APPROACH AND RESULTS

The approach is to create an improved middleware model and to implement this model in a prototype, including implementing archetypal devices and cloud services. There have been multiple iterations of both the model and the prototype, based on feedback from presenting the model at conferences and to peers. The model is driven by the requirements of simple User Stories [5], a methodology used in Agile development. For example, one user story is that a User would like to choose the cloud provider to which their data is sent by the device. Another user story is that the User would like to use their existing Web identity (for example their Google, Facebook, Twitter or Github identity) to log in, but would like to protect this identity from the device and from those accessing device information. The User Stories themselves have been motivated strongly by two works in this area, Privacy By Design [3] and Engineering Privacy [14].

In order to satisfy these and other user stories that have been identified, we aim to decouple the various parts of the system: the identity provision of users; the identity provision of devices; the propagation of data and commands; and the evaluation of access policies. This federation approach gives users a much wider choice, allowing more secure systems to replace less secure ones. In addition, it makes life harder for attackers, as the system is designed so that multiple systems must be compromised in order to infringe privacy.

Figure 1: Proposed model

2.1 Work to date

The first work undertaken was in providing Federated Identity to IoT devices, using the OAuth2 model [8]. This was extended in [9] to provide some simple experimental data as well as to use the Dynamic Client Registration (DCR) API to give each device a unique OAuth2 client identifier, which removed a number of the security challenges and threats present in the earlier work. However, that work did not address how devices are configured with this identifier during manufacturing. These works addressed only identity and, unlike the current model, did not aim to provide a wider middleware solution.
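As an illustration of per-device client identifiers, the following minimal sketch builds a client-metadata document using field names defined for OAuth2 Dynamic Client Registration (RFC 7591) and registers it with an in-memory stand-in for a registration endpoint. The class `InMemoryRegistrationEndpoint`, the chosen grant types, the redirect URI and the device labels are assumptions made for this example; they are not taken from the prototype described in [9].

```python
import json
import secrets


def build_registration_request(device_label):
    """Build an RFC 7591 style client-metadata document for one device."""
    return {
        "client_name": device_label,  # human-readable label (hypothetical)
        "grant_types": ["authorization_code", "refresh_token"],  # assumed
        "redirect_uris": ["https://device.example/callback"],  # placeholder
        "token_endpoint_auth_method": "client_secret_basic",
    }


class InMemoryRegistrationEndpoint:
    """Stand-in for an authorization server's registration endpoint."""

    def __init__(self):
        self.clients = {}  # client_id -> stored client record

    def register(self, metadata):
        # Each registration yields a fresh, unguessable identifier and secret,
        # so no two devices share OAuth2 client credentials.
        client_id = secrets.token_urlsafe(16)
        client_secret = secrets.token_urlsafe(32)
        record = dict(metadata, client_id=client_id, client_secret=client_secret)
        self.clients[client_id] = record
        return record


endpoint = InMemoryRegistrationEndpoint()
first = endpoint.register(build_registration_request("sensor-a"))
second = endpoint.register(build_registration_request("sensor-b"))

# Show the registration response without the secret.
print(json.dumps({k: v for k, v in first.items() if k != "client_secret"}, indent=2))
```

Because every device holds its own identifier and secret, extracting the credentials from one device does not expose the credentials of any other.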

2.2 Model

The heart of the model is a Device Identity Provider (DIdP), which allows devices to be registered by the manufacturer and then claimed by a user. Once a user has claimed a device, the device communicates with an Intelligent Gateway (IG) that works in conjunction with a container cloud infrastructure to create Personal Cloud Middleware (PCM) instances, which share data between devices and cloud services only as explicitly consented to by users. The PCM is similar in concept to the Webinos PZH. The current model is shown in Figure 1.

The model addresses the question of privacy by federating the required processes across different logical participants. The User Identity Provider (UIdP) is an existing Internet service (such as Google, Facebook, Github, or OpenID Connect providers). The UIdP is only aware of which DIdP the user is using.

A key part of secure IoT is the provision of a secure identity to devices and the management of these identities. We propose a Device Identity Provider that issues secure identities to devices and is used to manage policies on what data can be shared from a given device. The DIdP does not participate in any of the data sharing. Therefore hacking the DIdP can help identify the devices a user owns, but cannot tie them to the data or commands sent to or from the device. The DIdP issues a new random pseudonym for the user, which provides a level of anonymity.

An important part of the model is a clear registration process for devices, which has two stages. At manufacture time, the device is issued with a Secure Device ID and Secret, which are independent of any hardware identity such as the MAC address. In the second stage, the owner of the device takes part in a flow in which they consent to the device acting on their behalf. This process results in a token being issued to the device. The token itself contains no user-identifiable information, and as a result an attacker who compromises the device cannot deduce its owner.

The DIdP sees a limited user profile as provided by the UIdP, but does not store it. Instead we issue each user with a new secure random pseudonym. If the DIdP is hacked and its data stolen, the attacker could possibly use a dictionary attack to tie our pseudonym to the third-party identities at the UIdP. However, since the OAuthing system does not store passwords, device data or any user data, this would not in itself give the attacker any significant personal information. The DIdP is aware of which third-party applications (TPAs) a user is subscribed to, but does not know which devices are interacting with which TPAs.
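The two-stage registration process can be sketched as a small in-memory service. This is an illustrative sketch only: the class `DeviceIdentityProvider`, its method names, and the choice to key pseudonyms by a SHA-256 hash of the UIdP subject are assumptions made for the example, and the real DIdP performs the second stage through an OAuth2 consent flow rather than a direct method call.

```python
import hashlib
import secrets


class DeviceIdentityProvider:
    """Toy in-memory DIdP covering the two registration stages."""

    def __init__(self):
        self._device_secrets = {}  # Secure Device ID -> Secret
        self._owner_of = {}        # Secure Device ID -> user pseudonym
        self._pseudonyms = {}      # hash of UIdP subject -> pseudonym (assumed)
        self._tokens = {}          # opaque device token -> Secure Device ID

    def register_at_manufacture(self):
        """Stage 1: issue a random ID and Secret, unrelated to the MAC address."""
        device_id = secrets.token_hex(16)
        device_secret = secrets.token_urlsafe(32)
        self._device_secrets[device_id] = device_secret
        return device_id, device_secret

    def pseudonym_for_user(self, uidp_subject):
        """Map a UIdP subject to a random pseudonym, creating it on first use."""
        key = hashlib.sha256(uidp_subject.encode("utf-8")).hexdigest()
        if key not in self._pseudonyms:
            self._pseudonyms[key] = secrets.token_hex(16)
        return self._pseudonyms[key]

    def claim(self, device_id, device_secret, uidp_subject):
        """Stage 2: the logged-in user claims the device; return its token."""
        expected = self._device_secrets.get(device_id)
        if expected is None or not secrets.compare_digest(expected, device_secret):
            raise PermissionError("unknown device or wrong secret")
        if device_id in self._owner_of:
            raise PermissionError("device already claimed")
        self._owner_of[device_id] = self.pseudonym_for_user(uidp_subject)
        # The token is random: it carries no user-identifiable information.
        token = secrets.token_urlsafe(32)
        self._tokens[token] = device_id
        return token

    def pseudonym_for_token(self, token):
        """Privileged lookup, intended for the gateway and not for devices."""
        return self._owner_of[self._tokens[token]]


didp = DeviceIdentityProvider()
dev_id, dev_secret = didp.register_at_manufacture()
token = didp.claim(dev_id, dev_secret, "alice@uidp.example")
print("token issued to device:", token)
```

The unsalted hash in this sketch is exactly the kind of mapping that a dictionary attack on stolen DIdP data could reverse, which is why the pseudonym table on its own should not lead to any personal data.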
Once the device is registered, users can authorize third-party applications (TPAs) to access the device’s sensor data and actuators. The policy language is at present very basic, and this is an area for improvement in the current model. The TPA redirects the user to the DIdP, which in turn redirects the user to a UIdP. The device and the TPA are not aware of the user’s identity: they only see tokens and do not have the authorization to call the APIs that convert those tokens to pseudonyms. Therefore the TPA cannot deduce the user’s identity from the OAuthing system. Users may of course choose to share their identity with the TPA outside of this model.

A key part of the model, not yet implemented, is that the PCM can perform data summarisation and filtering. For example, a user may choose to share only the rolling 24-hour average of data instead of the raw data. This is especially important because it prevents many fingerprinting attacks that rely on access to raw device data, for example [11].

Once the device and the TPA are both authorized, they connect to the Intelligent Gateway (IG), which uses the provided tokens to identify the pseudonym of the user. Using this pseudonym, the IG ensures that a unique middleware instance (the PZH) is running on behalf of the user, and routes the traffic to it. This ensures that data is only shared within a unique container, protecting against multi-tenant attacks. The device data is only visible to the device, the PZHs and authorized TPAs.
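The rolling 24-hour average mentioned as a summarisation example could be computed along the following lines. Since this feature is described as not yet implemented, the sketch is purely hypothetical: the class name `RollingAverage`, the use of timestamps in seconds, and the assumption that readings arrive in time order are all choices made for the example.

```python
from collections import deque


class RollingAverage:
    """Average of the readings seen within a trailing time window."""

    def __init__(self, window_seconds=24 * 60 * 60):
        self.window_seconds = window_seconds
        self._readings = deque()  # (timestamp, value) pairs, oldest first
        self._total = 0.0

    def add(self, timestamp, value):
        """Record a reading and return the average over the trailing window.

        Assumes timestamps are non-decreasing across calls.
        """
        self._readings.append((timestamp, value))
        self._total += value
        # Drop readings that are a full window or more older than this one.
        # The reading just added is never dropped, so the deque stays non-empty.
        while self._readings[0][0] <= timestamp - self.window_seconds:
            _, old_value = self._readings.popleft()
            self._total -= old_value
        return self._total / len(self._readings)


# Hourly temperature readings; only the smoothed value would be shared.
summary = RollingAverage()
for hour, temperature in enumerate([20.0, 21.0, 22.0]):
    shared_value = summary.add(hour * 3600, temperature)
print(shared_value)
```

A TPA that receives only `shared_value` never sees the timing or magnitude of individual raw readings.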

Figure 2: Information Visibility Matrix

Figure 4: The Cloud environment for the demonstration and tests

Figure 5: One second client IGNITE vs Mosquitto

Figure 3: Device publishing data to TPA

If a third party successfully attacks the IG or PZH, then they have access to the data in flight, but do not inherently know the owner of the device. The manufacturer of the device is not party to any of the data or identities after manufacturing. The manufacturer only knows the original device identity (e.g. MAC address) and the Secure Device ID that was issued at manufacturing time. If the manufacturer wishes to offer a service to the user, this can be run as a TPA, which ensures that the user must consent to any sharing of data with the manufacturer. In Figure 2 we identify each participant and show what access they have to credentials, identities and data. A summarised sequence diagram showing how data is transferred from a device to a TPA is shown in Figure 3.
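The gateway's role in this flow, resolving a token to a pseudonym and routing traffic to a per-user middleware instance, can be sketched as below. This is a simplified illustration and not the IGNITE implementation: `IntelligentGateway`, `InMemoryPCM`, the injected `resolve_pseudonym` and `start_container` callables, and the token table are all hypothetical, and a real deployment would start an isolated container rather than a Python object.

```python
class InMemoryPCM:
    """Stand-in for one user's isolated middleware container."""

    def __init__(self, pseudonym):
        self.pseudonym = pseudonym
        self.inbox = []

    def deliver(self, message):
        self.inbox.append(message)


class IntelligentGateway:
    """Routes each connection to the middleware instance of the token's owner."""

    def __init__(self, resolve_pseudonym, start_container):
        self._resolve_pseudonym = resolve_pseudonym  # token -> pseudonym
        self._start_container = start_container      # pseudonym -> instance
        self._containers = {}                        # pseudonym -> instance

    def route(self, token, message):
        pseudonym = self._resolve_pseudonym(token)
        # Start the user's middleware the first time any of their tokens is seen.
        if pseudonym not in self._containers:
            self._containers[pseudonym] = self._start_container(pseudonym)
        container = self._containers[pseudonym]
        container.deliver(message)
        return container


# Hypothetical token table: a device and a TPA of one user, plus another user.
token_table = {"tok-device": "p-123", "tok-tpa": "p-123", "tok-other": "p-456"}
gateway = IntelligentGateway(token_table.__getitem__, InMemoryPCM)

pcm_a = gateway.route("tok-device", "temperature=21")
pcm_b = gateway.route("tok-tpa", "subscribe temperature")
pcm_c = gateway.route("tok-other", "humidity=40")
print(pcm_a is pcm_b, pcm_a is pcm_c)
```

The gateway handles only tokens and pseudonyms, which matches the observation above that an attacker who compromises the IG sees data in flight but not the owner's identity.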

2.3 Results

To validate the model, we built prototypes of all the components, including the OAuthing DIdP, the IGNITE IG, the PZH, the devices, the cloud services and the manufacturing servers. The system uses existing UIdPs. In addition, a test harness was created to test the system in the cloud. Figure 4 shows the test environment that is used for both performance testing and demonstration. The experimental results include data on the device overhead of running this model, the cost of running PZHs per user, and the throughput and latency performance of the DIdP and IG.

On our sample device (based on the ESP8266 platform¹), supporting secure communication with the DIdP and IG used less than 10% of the available program memory. The prototype IG was able to instantiate more than 400 PCM middleware servers, each running in a separate isolated container, on a cloud server costing US$20 per month, giving an annual cost per user of US$0.60. Figure 5 compares the latency from publishing data to receiving it with that of a standard MQTT server². Each concurrent client publishes one message per second, and the additional latency of the system is around 1 ms. The additional time to instantiate the PZH, which happens the first time any of a user’s devices connects, is less than 1.5 seconds. Once the PZH is running, the initiation of an MQTT connection takes 36 ms, compared to 25 ms in a standard Mosquitto system.
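The per-user cost figure follows directly from the numbers reported above, as this short worked calculation shows. The constant names are chosen for the example; the figure of 400 users per server is the reported lower bound, so the result is an upper bound on the per-user cost.

```python
SERVER_COST_PER_MONTH_USD = 20.0
USERS_PER_SERVER = 400  # "more than 400" instances, so this is a lower bound

# Twelve monthly payments shared between all users on the server.
annual_cost_per_user = SERVER_COST_PER_MONTH_USD * 12 / USERS_PER_SERVER

CONNECT_MS_SYSTEM = 36     # MQTT connection setup through the IG
CONNECT_MS_MOSQUITTO = 25  # MQTT connection setup on plain Mosquitto
connect_overhead_ms = CONNECT_MS_SYSTEM - CONNECT_MS_MOSQUITTO

print(f"annual cost per user: US${annual_cost_per_user:.2f}")
print(f"extra MQTT connection setup time: {connect_overhead_ms} ms")
```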

3. DISCUSSION AND CONCLUSIONS

The experimental results show that the system is workable and effective. The automation of device registration during manufacturing, together with a simple user registration flow, demonstrates that identity provision can be successfully federated and decoupled from the manufacturer. Earlier work has already shown that federated identity tokens can be used with low-cost IoT devices. This work extends that to show that low-cost devices can also support TLS and user registration flows. A key concern around the PCM model is the cost. The prototype shows that the costs are reasonable, even with an unoptimized prototype system.

The OAuthing model and prototype demonstrate that devices can be connected to TPAs without inherently leaking the user’s identity to either system. Users may choose to provide TPAs with their identity, but that becomes a positive consent of the user rather than the default. In this model, all sharing of data from the user’s devices is authorized by user consent. In addition, users can bring pre-existing identities to the system rather than being required to create new credentials, which reduces the chances of password theft and gives users a choice of identity provider.

There are issues with this model, which reflect the ongoing process of research. In particular, in the current incarnation, the manufacturer could still choose to use a DIdP that is not trustworthy, and users do not have the option to move away from it. This is an area that we are looking at improving.

Compared to the related works, we have identified a number of contributions. This model provides a clear registration process where the user can take ownership of devices without inherently leaking their identity. None of the related works clearly separates the concerns of user identity, device identity and data sharing to provide a model whereby multiple successful attacks are required to compromise privacy. The existing systems that use anonymous identities do not provide the separation between identity provision and data sharing, and are thus susceptible to attacks on the central system. The existing research into the PZH/PCM model did not look at how to instantiate and run these on behalf of the user and did not validate the cost or practicability of this approach.

¹ https://en.wikipedia.org/wiki/ESP8266
² Mosquitto - https://mosquitto.org/

3.1 Ongoing and further research

There remain a number of unexplored aspects of this model. Firstly, we aim to extend the model and system to support multiple co-existing DIdPs, and to allow users to migrate from one DIdP to another, which is currently not implemented. We believe this will provide users with complete choice, and allow them to migrate away from services that do not provide the correct level of trust. Secondly, we aim to provide more fine-grained access control policies for devices, improving the policy language to give users more control while retaining simplicity and ease of use.

One concern with our model is that many device manufacturers’ business models are based around collecting user data, and therefore this system may be unattractive precisely because it improves user privacy. However, there are many specific areas, such as medical and healthcare IoT, where enhanced privacy and consent are highly attractive. We also believe that as privacy breaches for IoT become more publicised, the demand for this and similar approaches will grow. In particular, we argue that it is easier to start with a strong privacy model and lessen the controls than vice versa.

Further areas of research include exploring scenarios where devices provide services for more than one owner, and supporting devices that communicate via gateways (e.g. Bluetooth devices talking to a phone or hub). One significant area of research is to allow further decentralisation by utilizing distributed ledger technologies and blockchains with the DIdP to provide a completely independent model of identity to devices, a policy provisioning model, and/or a place to securely record that users have given cloud services consent to use data.

Acknowledgments Thanks to Benjamin Aziz, Jacek Kopecký, Frank Leymann and Philip Scott for excellent supervision and advice.

4. REFERENCES

[1] B. Aziz, A. Arenas, and B. Crispo. Engineering secure Internet of Things systems. Institution of Engineering and Technology, 2016.
[2] J. B. Bernabe, J. L. Hernández, M. V. Moreno, and A. F. S. Gomez. Privacy-preserving security framework for a social-aware internet of things. In International Conference on Ubiquitous Computing and Ambient Intelligence, pages 408–415. Springer, 2014.
[3] A. Cavoukian, S. Taylor, and M. E. Abrams. Privacy by design: essential for organizational accountability and strong business practices. Identity in the Information Society, 3(2):405–413, 2010.
[4] S. Cirani, M. Picone, P. Gonizzi, L. Veltri, and G. Ferrari. IoT-OAS: An OAuth-based Authorization Service Architecture for Secure Services in IoT Scenarios. 2015.
[5] M. Cohn. User stories applied: For agile software development. Addison-Wesley Professional, 2004.
[6] H. Desruelle, J. Lyle, S. Isenberg, and F. Gielen. On the challenges of building a web-based ubiquitous application platform. In Proceedings of the 2012 ACM Conference on Ubiquitous Computing, pages 733–736. ACM, 2012.
[7] S. Emerson, Y.-K. Choi, D.-Y. Hwang, K.-S. Kim, and K.-H. Kim. An oauth based authentication mechanism for iot networks. In Information and Communication Technology Convergence (ICTC), 2015 International Conference on, pages 1072–1074. IEEE, 2015.
[8] P. Fremantle, B. Aziz, J. Kopecký, and P. Scott. Federated identity and access management for the internet of things. In Secure Internet of Things (SIoT), 2014 International Workshop on, pages 10–17. IEEE, 2014.
[9] P. Fremantle, J. Kopecký, and B. Aziz. Web api management meets the internet of things. In European Semantic Web Conference, pages 367–375. Springer, 2015.
[10] IETF. Authentication and authorization for constrained environments (ace) - documents. https://datatracker.ietf.org/wg/ace/documents/. (Accessed on 30th August 2016).
[11] T. Kohno, A. Broido, and K. C. Claffy. Remote physical device fingerprinting. IEEE Transactions on Dependable and Secure Computing, 2(2):93–108, 2005.
[12] A. Niruntasukrat, C. Issariyapat, P. Pongpaibool, K. Meesublak, P. Aiumsupucgul, and A. Panya. Authorization mechanism for mqtt-based internet of things. In 2016 IEEE International Conference on Communications Workshops (ICC), pages 290–295. IEEE, 2016.
[13] D. Rotondi, C. Seccia, and S. Piccione. Access control & iot: Capability based authorization access control system. In 1st IoT International Forum, Berlin, November, 2011.
[14] S. Spiekermann and L. F. Cranor. Engineering privacy. IEEE Transactions on software engineering, 35(1):67–82, 2009.