Security Concepts for Flexible Wireless Automation in Real-Time Environments
Albert Treytl¹, Thilo Sauter¹, Heiko Adamczyk², Svilen Ivanov³, Henning Trsek⁴
¹ Austrian Academy of Sciences, Institute for Integrated Sensor Systems, Wiener Neustadt, Austria
² Institut f. Automation und Kommunikation Magdeburg, Magdeburg, 39106, Germany
³ rt-solutions.de GmbH, Oberländer Ufer 190a, 50968 Cologne, Germany
⁴ inIT – Institut Industrial IT, OWL University of Applied Sciences, 32657 Lemgo, Germany
[albert.treytl, thilo.sauter]@oeaw.ac.at, [email protected], [email protected], [email protected]
Abstract
Wireless technologies in industrial automation increase flexibility, but pose challenges to the design of security strategies. The typical requirements regarding integrity protection, authentication, and availability need to be transferred from the traditional wired network domain to wireless networks without impairing the benefits of wireless communication. This article presents results of the security concept for flexible wireless automation in real-time environments devised in the flexWARE project. It includes the results of the security requirement analysis, a description of the procedural design approach towards security, and innovative communication protection mechanisms for wireless domains, including seamless handover and novel location-based security services.
1. Introduction
Modern industrial automation systems require flexible, real-time communication networks. To increase flexibility, wireless technologies have become an area of interest, in particular for applications involving mobile devices, transmission of data onto moving parts of machinery, or highly distributed devices in environments where cabling is difficult, expensive, or even impossible [1]. However, industrial wireless networks pose particular security challenges. The typical requirements regarding integrity, authentication, and availability need to be transferred from the traditional wired network domain to wireless networks without impacting the applications' real-time requirements. In comparison to wired installations, which obviously benefit from physical access limitations, for wireless transmission it is much harder to limit the area of "network" availability because attackers can intrude into the system at arbitrary locations.
This article focuses on the integration of security in a wireless automation system, using the example of the architecture developed within the European project flexWARE¹. The scientific contributions of this paper are:
• A procedural approach to security that allows handling the trade-off between security and real-time behavior as well as the inclusion of new application requirements during deployment.
• Extension of the 802.11 measures for security in real-time applications through location-based fast handover.
• Innovative security measures for location-based access in wireless networks similar to physical security in wired networks.
2. The flexWARE system concept
The primary aim of flexWARE is to develop a field-level network architecture based on both wired and wireless technologies, integrated tightly enough that the real-time properties are maintained. A particular goal is to enable mobility of wireless nodes under real-time communication constraints. In larger automation systems, the interface between wireless segments and the wired backbone infrastructure cannot be accomplished by a single access point (AP). Rather, the area has to be covered by an adequate number of APs that nonetheless belong to a uniform domain.
¹ The work presented in this paper is funded by the European Commission in the 7th Framework Program under the ICT-224350 flexWARE – Flexible Wireless Automation in Real-Time Environments project.
Within this domain, communication should be possible without restrictions. This means that a mobile node must be able to roam between access points seamlessly, which is not guaranteed in standard wireless network technologies, especially with respect to the real-time constraints of industrial applications. The main innovation of the flexWARE approach to achieve seamless handover under real-time requirements is a localization middleware that determines the position and movement of mobile nodes and exploits this information for in-advance real-time bandwidth reservations before the actual handover occurs. Notably, the nodes do not need any dedicated localization hardware for their position to be detected.
Figure 1. flexWARE system architecture and target of inspection covering the wireless area managed by a controller
Apart from the (standard) mobile nodes, the flexWARE system consists of enhanced access points, including the localization infrastructure, that build the wireless infrastructure, and flexWARE controllers, which coordinate a group of access points to guarantee real-time behavior (see Figure 1). The controllers are further connected via a wired field-level backbone (typically a real-time Ethernet network) to cover all mobility domains of the whole plant.
3. The flexWARE approach towards security
Security is an important part of the flexWARE concept that also facilitates the localization middleware. In order to plan security measures properly, the security requirements and the application requirements that influence them must be investigated. Additionally, the designed security measures might conflict with real-time requirements, e.g., maximum delay requirements vs. longer execution times of strong security measures, resulting in a re-configuration or re-design. In order to handle these mutual interdependencies, a structured process for security is required.
Organizational measures account for 80% of security; they form a framework that allows security technologies to work properly [10]. Within the project, two methodologies have been chosen to manage security: The VDI guideline 2182 was introduced to specify and develop a secure technical system. Furthermore, flexWARE considers the organization-wide security management process based on the ISO 27000 standard series.
3.1. VDI guideline 2182 – a generic procedure model
The VDI guideline 2182 [14] provides a uniform, feasible procedure model for ensuring information (IT) security throughout the entire life cycle. The life cycle shall extend to the manufacturer, integrator, and operator (asset owner). flexWARE covers all aspects because it offers an infrastructure that particularly covers the manufacturer's position. The process consists of 8 procedures (see Figure 2), each of which is characterized by initial information, an action, and an output. On the one hand, these procedures represent a systematic approach; on the other hand, they allow customizing security to the level of protection required by the application.
The use of the model was the beginning of the systematic approach, including all aspects ranging from physical to communication and operational security. The initial structure analysis specifies the IT security target and also shows the corresponding assets of the flexWARE system (Figure 1). The next steps are an analysis of threats and vulnerabilities and a risk assessment. These phases are especially important, since the system developer has to anticipate the needs of users (system integrators and operators) and also has to design its products in such a way that changes of the (security) requirements at a later stage can be included efficiently.
The VDI guideline 2182 makes an important contribution to this desired collaboration between manufacturers, integrators, and end users by setting return points for this analysis in the development process. Because it follows this guideline, the design of the flexWARE system eases its integration into the organizational security management process of the end users.
3.2. Support of an organizational security management process
Organizational security management is defined in a set of international standards (namely ISO 17799 and ISO 27000), which form the basis for a company-wide Information Security Management System (ISMS). Although often of a purely organizational nature, some parts of an ISMS are related to the technical implementation.
[Figure 2 labels: start (structure analysis); identify assets; analyse threats; determine relevant security objectives; analyse and assess risks; identify measures and assess effectiveness; select overall solution; implement and use overall solution; perform audit; documentation; cyclic trigger]
Figure 2. Generic VDI 2182 procedure model
The basis of the ISMS process is a Plan-Do-Check-Act (PDCA) cycle, of which the check phase is relevant for flexWARE security. The flexWARE engineering appliance supports this phase by:
1. An inventory of assets identifying applications and traffic patterns
2. Capacity planning and admission control performing an engineering-time schedulability test, which evaluates the risk level for the threat due to increased system utilization.
Additionally, the flexWARE system supports this task by providing detailed information about its own performance with regard to availability, integrity, and service levels. The system constantly monitors and reports security incidents on the wireless medium correlated with location information, software malfunctions, run-time capacity, etc. Furthermore, the security process described in section 3.1 supports the ISMS demand for a re-evaluation of the imposed risks in case of re-configuration, e.g., due to trading performance for security and vice versa. This allows for flexible risk management in the overarching ISMS.
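The engineering-time admission control mentioned above can be illustrated with a minimal sketch. The paper does not specify the schedulability test, so the utilization-based test below, the `Flow` fields, the limit of 0.7, and the risk thresholds are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """A periodic real-time flow (all fields are illustrative assumptions)."""
    name: str
    airtime_ms: float  # medium occupancy per period, incl. security overhead
    period_ms: float   # update period of the application


def utilization(flows):
    """Fraction of the wireless medium occupied by the given flows."""
    return sum(f.airtime_ms / f.period_ms for f in flows)


def admit(existing, candidate, limit=0.7):
    """Admit the candidate only if total utilization stays within the limit."""
    return utilization(list(existing) + [candidate]) <= limit


def risk_level(u, medium=0.5, high=0.7):
    """Map utilization to a coarse risk level (thresholds are hypothetical)."""
    if u >= high:
        return "high"
    if u >= medium:
        return "medium"
    return "low"
```

A stronger security measure would show up here as a larger `airtime_ms` for the affected flows, so the same test can be re-run whenever performance is traded for security or vice versa.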
4. Security target definition and requirement analysis
According to the chosen model (VDI 2182), the main starting point is the structure analysis, which must be carried out before the procedure model is applied. This analysis includes a specification (as detailed as possible) of the IT security target and the corresponding assets, on the one hand, and a specification of the environment of use, on the other hand.
4.1. Specification of IT-Security Target and identification of Assets
Figure 1 depicts the overall flexWARE system architecture, which contains, from the security viewpoint, the target of inspection (TOI) comprising the flexWARE assets, each having its own specific security needs. The black rectangle in Figure 1 represents the boundaries of the TOI, which is characterized by two wired interfaces. Additionally, the wireless medium acts like an interface from an untrusted area to the TOI. These interfaces are the connection to the outside world with a certain probability of threats and are therefore the relevant points for the consideration of possible IT security threats (section 4.3).
Taking into account the given use case scenarios, it is essential for a threat analysis to define the assets and the relevant security objectives. The flexWARE controller is the major asset in the flexWARE architecture since it is the central element executing the complete real-time and security planning and scheduling. Furthermore, the flexWARE access points are also classified as primary assets since they are the critical elements in the infrastructure that execute the security measures and perform security monitoring. Finally, taking into account all the different use case scenarios, the main security objectives for the given assets are availability, integrity, authentication and occasionally confidentiality.
4.2. Specification of the environment
The specification of the environment of use concentrates mostly on identifying influencing variables. These are characteristic values relating to topography (building, environment) which have a direct or indirect effect on the target of inspection. This information is derived from the use case scenario, e.g., the smart ware-house with automated storage and retrieval systems shown in Figure 3. In particular, the following assumptions are made:
- Indoor factory shop floor of approximately 100 m length, 30 m width and 15 m height
- Multiple entry points
- Server rooms are additionally locked
- Radio signal cannot be precisely limited to the shop floor and can also cover publicly accessible areas.
4.3. flexWARE specific requirements
Specific requirements of flexWARE have been set up to achieve the highest application coverage. The first level of input is the flexWARE end user group (EUG), comprising small, medium, and big enterprises representing a very broad industry sector. At the second level, this information is analyzed and amended by the security experts within the project. Finally, flexWARE defined the following six use cases that allow the definition of specific requirements which are relevant for important application classes:
• Automated guided vehicles or conveyors
• Smart ware-house with automated storage and retrieval systems
• Industrial lift
• Hoists and cranes
• Wireless terminals for robot cells
• Maintenance in airports
This paper focuses on the use case "Smart ware-house with automated storage and retrieval systems" (Figure 3).
[Figure 3 labels: storage magazines, rail, PLC]
Figure 3. Use case: smart ware-house with automated storage and retrieval systems.
In this use case, the communication between the remote human machine interface (HMI) and the robot shall transfer safety-related data coming from the emergency stop buttons. The wireless emergency push buttons, which can be either mobile or fixed anywhere in the warehouse, therefore represent the major application-specific IT security asset of this example scenario. For safety communication profiles (IEC 61784-3), the main IT security objectives derived are integrity, authorization, and also confidentiality. Confidentiality is required since some existing safety applications use plaintext passwords to ensure authorization. In general, confidentiality shall also be guaranteed for all other passwords, e.g., for configuration management, and for sensitive configuration data (e.g., address information).
Within the factory floor, the massive use of embedded systems sets additional requirements. First, it must be considered that embedded systems might not have the capabilities to execute all security measures due to their limited resources. A possible solution for this trade-off between appropriate IT security measures and the real-time requirements is the introduction of different application classes. Second, the integration of existing devices must be considered. flexWARE addresses these issues by concentrating security in infrastructure components (access points and controllers) and demanding no special modifications at mobile nodes. Hence, the majority of components can be reused, and security can be enhanced by replacing only the limited number of infrastructure components.
It can be anticipated that flexWARE has to consider mainly the trade-off between appropriate IT security measures and the real-time requirements. In fact, technical IT security measures are usually resource and time consuming. The key requirements therefore are:
R1. IT security of the communication network: to warrant availability, integrity, authentication, and confidentiality
R2. Matching of IT security and application requirements: definition of appropriate, adjustable IT security measures and classification to different application classes (different real-time needs)
R3. Integration of devices with limited security: resource-limited devices and existing mobile nodes should be integrated into the system.
5. Built-in security measures of wireless technologies
The built-in security measures of WLAN address the flexWARE security requirements R1 and R2, yet problems exist because R2 addresses real-time communication, which should not suffer under the security mechanisms used. Finally, a requirement and major design goal of the flexWARE system is to rely as much as possible on existing standards to ensure interoperability with a wide range of products and openness to many applications (R3).
In general, flexWARE wireless communication media are IEEE 802.15.1, IEEE 802.15.4, and IEEE 802.11, since these are among the most promising candidates to achieve long-term wide usage in industrial environments. They are already specified by user organizations (e.g., International Profibus User Organization [12]) to be used as the wireless counterparts of real-time Ethernet standards. All technologies mentioned above offer built-in security services on the data link layer, such as data confidentiality, data integrity, authentication, and replay protection, although with different levels of complexity and robustness. The main focus in flexWARE is put on 802.11 and its security services, as it is today's most widespread technology and the most suitable one for the identified use cases.
Wired equivalent privacy (WEP) initially provided security services in 802.11 WLANs. However, many potential and critical flaws were rapidly discovered, followed by the complete break in 2001. Therefore, the 802.11i amendment was created, which specifies a robust security network (RSN) for authentication and encryption. Establishing a robust security network association (RSNA) consists of three steps: the mandatory open system authentication, the subsequent association, and the final extensible authentication protocol (EAP) authentication. Three entities are involved: the client, called the supplicant; the AP, called the authenticator; and the authentication, authorization, and accounting (AAA) server. The authenticator communicates with the AAA server by means of EAP authentication protocols, called methods, which support a variety of credentials, e.g., shared keys, passwords, or certificates. After a successful authentication, the derived 512-bit master session key (MSK) is sent to the AP and the first 256 bits are used as the pairwise master key (PMK). In the third step, the PMK is used to derive transient keys for the encryption during the four-way handshake, after which the secure data exchange can start.
The duration of the whole authentication is in the range of approx. 400 ms. This causes problems for many industrial applications with high real-time requirements and the need for a handover from one AP to another. The first authentication can still follow the full IEEE 802.11i-specified protocol, whereas subsequent authentications within the same extended service set have to be shortened to provide a fast handover, i.e., the interruption of data exchange has to be as short as possible.
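The key derivation described above can be sketched in a few lines. This is an illustrative sketch rather than the flexWARE implementation: it assumes the HMAC-SHA1-based pseudo-random function and the 384-bit pairwise transient key layout (KCK, KEK, TK) that 802.11i defines for CCMP, and the function names are chosen here for illustration.

```python
import hashlib
import hmac


def prf(key: bytes, label: bytes, data: bytes, n_bits: int) -> bytes:
    """802.11i-style PRF: concatenate HMAC-SHA1 blocks over a running counter."""
    out = b""
    counter = 0
    while len(out) * 8 < n_bits:
        block = label + b"\x00" + data + bytes([counter])
        out += hmac.new(key, block, hashlib.sha1).digest()
        counter += 1
    return out[: n_bits // 8]


def derive_pmk(msk: bytes) -> bytes:
    """The PMK is the first 256 bits of the 512-bit MSK."""
    if len(msk) != 64:
        raise ValueError("MSK must be 512 bits (64 bytes)")
    return msk[:32]


def derive_ptk(pmk: bytes, aa: bytes, spa: bytes,
               anonce: bytes, snonce: bytes) -> bytes:
    """Transient key from the PMK, both MAC addresses and both handshake nonces."""
    data = (min(aa, spa) + max(aa, spa)
            + min(anonce, snonce) + max(anonce, snonce))
    return prf(pmk, b"Pairwise key expansion", data, 384)


def split_ptk(ptk: bytes):
    """Split a 384-bit PTK into confirmation, encryption and temporal keys."""
    return ptk[:16], ptk[16:32], ptk[32:48]
```

Because the inputs are sorted with `min` and `max`, supplicant and authenticator arrive at the same transient key regardless of which side evaluates the function. This step needs only the PMK, which is why caching or pre-distributing PMKs removes the EAP exchange from the handover path.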
6. flexWARE security measures
Current development of security measures focuses on two aspects: the communication security of WLAN links and a new approach towards location-based security services, which addresses problems of real-time communication over wireless networks in automation scenarios and the integration of physical access protection.
6.1. Protection of the WLAN links
For mutual authentication and protection of the wireless links, 802.11i security measures have been chosen. Access control is realized on the basis of these services. The 802.11i mechanisms fulfill the first identified requirement (R1). The real-time trade-offs related to the second security requirement (R2) are discussed in detail in this subsection. Two major issues have to be taken into account for the properties to be offered by flexWARE:
1. Handover properties for real-time connections
2. Efficiency of cryptographic functions to reduce the additional line delay for real-time services
The handover is a very critical aspect since the initial authentication of a node with the access point causes delays that violate the deadlines of many classes of real-time traffic. Typical connection setup times are in the range of 800 ms, exceeding the limit of a couple of milliseconds demanded by the application. Particularly affected are the use cases introduced in section 4, with update times of approx. 50 ms, and motion control and safety applications with even smaller cycle times of around 1 ms.
Within flexWARE, three approaches have been identified to solve this conflict. First, key caching stores the PMKs which are created during a full EAP authentication. This reduces the handoff latency since no EAP authentication is required during handover. Controlled by the flexWARE controller, APs possess the necessary master keys to skip the authentication process and directly start the 4-way handshake to establish the encryption keys.
The second solution is to use pre-authentication, which completely removes the EAP authentication and verification phase from the critical real-time path, since the node performs the authentication procedure in advance. Possible schemes are Frequent Handoff Region (FHR) [3], Proactive Neighbor Caching (PNC) [4], Selective Neighbor Caching (SNC), or AP-initiated handover [5].
As a third approach, the mechanisms recently specified in the amendment 802.11r [6] for a faster handover, while still providing 802.11i security, could be used. 802.11r defines a mobility domain, which is a set of APs within the same extended service set (ESS). The result of the first authentication is an MSK, which is used to establish the 802.11r key hierarchy (different PMKs). Then the PMKs are distributed to all APs belonging to the same mobility domain, which eliminates the EAP authentication during the handoff. Furthermore, the 4-way handshake and any resource reservation are integrated into the standard 802.11 open system authentication/(re)association.
Table 1. Typical handover durations
              Open Systems Authentication   EAP Authentication   4-Way Handshake   Total
Standard      ~ 15 ms                       ~ 300 ms             ~ 25 ms           ~ 340 ms
Key Caching   ~ 15 ms                       --                   ~ 25 ms           ~ 40 ms
Pre-Auth.     ~ 15 ms                       --                   ~ 25 ms           ~ 40 ms
802.11r       ~ 15 ms                       --                   --                ~ 15 ms
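The totals of Table 1 and their relation to the update times mentioned above can be reproduced with a short calculation. The phase durations are taken from Table 1, with phases marked "--" modeled as 0 ms; the helper names are illustrative, and the client-side search phase is not included.

```python
import math

# Infrastructure-side phase durations in ms, as listed in Table 1.
PHASES_MS = {
    "Standard":    {"open_auth": 15, "eap": 300, "four_way": 25},
    "Key Caching": {"open_auth": 15, "eap": 0,   "four_way": 25},
    "Pre-Auth.":   {"open_auth": 15, "eap": 0,   "four_way": 25},
    "802.11r":     {"open_auth": 15, "eap": 0,   "four_way": 0},
}


def total_ms(scheme: str) -> int:
    """Sum of all phases on the critical handover path."""
    return sum(PHASES_MS[scheme].values())


def cycles_spanned(scheme: str, cycle_ms: float) -> int:
    """Number of application update cycles touched by the interruption."""
    return math.ceil(total_ms(scheme) / cycle_ms)


def fits_in_cycle(scheme: str, cycle_ms: float) -> bool:
    """True if the handover completes within a single update cycle."""
    return total_ms(scheme) <= cycle_ms
```

Under these numbers, only the shortened schemes complete within one 50 ms update cycle, and none of them completes within a 1 ms motion control cycle.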
From the flexWARE point of view, the 802.11r mechanisms are the most promising because they combine both previous approaches and include resource reservation as an additional option. Since the flexWARE controllers need to do pre-planning to fulfill the real-time requirements in any case, this pre-planning is also done for security as an integral part of the scheduling process. An overview of typical handover times is provided in Table 1, showing only the time needed on the infrastructure side for establishing a secure connection [11].
Regarding the protection of the actual real-time runtime data, special attention has to be paid to suitable efficiency and determinism of the cryptographic functions. However, another aspect, which specifically applies to handover scenarios, has to be considered in this context. The handover time consists of four phases altogether. The last three phases (open authentication, association, and RSNA) depend on the infrastructure, whereas the first phase (search phase) depends only on the client. In the search phase, the connection loss has to be detected by the client and a scan for potential APs on all available channels is performed. As a result, this phase accounts for a relatively large share of the overall handover duration.
In order to decrease the handover latency, one of the flexWARE features is used: the ability to localize mobile nodes supports the handover. In such a scenario, the flexWARE controller acts as the handover coordinator, having knowledge about the position of each mobile node and about the coverage of the APs. Hence, the location information available in flexWARE is used to trigger the handover. Furthermore, the decision about the new AP is also taken based on the current position of the node, leading to an increased handover performance of the client. However, finding a reasonable trade-off between cryptographic efficiency and handover performance is still essential.
In addition, security services should be transparent to the field devices. A key goal is to deny malicious nodes any access to the network infrastructure: the flexWARE controller takes notice of potential security threats and can take appropriate action against them, e.g., a part of the real-time control (quality of service (QoS) monitoring) performs resource monitoring (MAC filtering, timestamping of packets, packet retransmission and detection of deadline violations).
6.2. Location-based security services
The defense-in-depth concept of flexWARE is not limited to electronic or IT security measures. Industrial automation facilities are usually located in restricted areas, such as shop floor buildings, warehouses or fenced areas.
Location-based security services use these restrictions for access control and authentication based on the physical location of a node. Using a localization scheme as a basis for security offers the following advantages. In comparison to wired installations, which obviously benefit from physical access limitations, for wireless transmission it is much harder to limit the area of "network" availability. When using wireless communication, attackers can intrude into the system at arbitrary points. Areas covered by wireless links must be assumed to be untrusted areas. Typically, the usage of directional antennas, the reduction of sending power, or even the disabling of the low data rate modes of WLAN are used to limit the range of the WLAN signal. The problem of these measures is their limited accuracy and their dependence on environmental influences. Additionally, they contradict the goal of having a high signal-to-noise ratio (SNR) and, therefore, a high data rate and availability in the last corner of a building or the shop floor. Even a shielding of rooms has been suggested, which does not seem feasible for industrial environments due to their size and cannot be installed for outdoor industrial facilities such as refineries or large-scale chemical plants.
Besides the precise limitation of network access, another important advantage of location-based security is the capability of using wireless nodes with old, and therefore limited or outdated, weak security. Since industrial hardware has a very long lifetime in order to achieve full return on investment, a security system must be able to handle and integrate such devices. A typical example could be today's 802.11b-connected barcode scanners offering only weak WEP security. Location-based security can increase their security level and, hence, allow reliable use of these devices.
flexWARE location-based security services are based on a TDoA (Time Difference of Arrival) system since it requires no modifications of the sender, unlike alternative localization technologies such as ultrasonic systems. Being able to use unmodified nodes is vital since security then relies on infrastructure components under full control and not on elements that are at risk of being taken over by an attacker. It also brings vital economic advantages ranging from cost and market availability to interoperability issues.
The TDoA localization method calculates the position of a node by measuring the differences between the reception times² of incoming WLAN packets at different receivers [9]. A TDoA value (time difference between a pair of receivers) defines a hyperbola which comprises all possible locations of the sending node for this TDoA value. The intersection of all hyperbolas from different pairs of receivers finally defines the position of the sender. Figure 4 shows a typical scenario: packets of the legitimate node A arrive at times ta1 to ta4 at the various receivers.
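The TDoA principle can be illustrated with a small planar sketch. It is not the flexWARE algorithm: it assumes ideal line-of-sight propagation, perfectly synchronized receivers, and a brute-force grid search in place of an analytic hyperbola intersection. The receiver coordinates in the usage note are hypothetical and follow the 100 m by 30 m shop floor of section 4.2.

```python
import math

C = 299_792_458.0  # propagation speed in m/s (1 ns of timing error is about 0.3 m)


def arrival_times(sender, receivers, t_send=0.0):
    """Reception time of one packet at each receiver (line of sight assumed)."""
    return [t_send + math.dist(sender, r) / C for r in receivers]


def tdoa(times):
    """Time differences of arrival relative to the first receiver."""
    return [t - times[0] for t in times[1:]]


def locate(times, receivers, area, step=0.5):
    """Grid search for the point whose predicted TDoA values fit best."""
    measured = tdoa(times)
    (x_min, y_min), (x_max, y_max) = area
    nx = round((x_max - x_min) / step)
    ny = round((y_max - y_min) / step)
    best, best_err = None, math.inf
    for i in range(nx + 1):
        for j in range(ny + 1):
            p = (x_min + i * step, y_min + j * step)
            predicted = tdoa(arrival_times(p, receivers))
            err = sum((m - q) ** 2 for m, q in zip(measured, predicted))
            if err < best_err:
                best, best_err = p, err
    return best
```

With receivers at the four corners of the floor, `locate(arrival_times((40.0, 12.5), receivers, t_send=1.234), receivers, ((0.0, 0.0), (100.0, 30.0)))` recovers the sender position without knowing the send time, since only time differences enter the fit. This also shows why the sender needs no modification.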
Based on the arrival times ta1 to ta4, a position inside the restricted area is calculated and therefore access to the network is granted. The position calculated for the illegitimate node RE (alias real evil) is outside, and this node is denied access. Investigations within the project indicate a precision of 1 to 3 meters, which is sufficient to guard industrial areas indoors and outdoors.
The time of arrival of a signal can be measured at every access point (AP). Yet, for a location infrastructure as depicted in Figure 5, the system would run out of frequency ranges due to interference. Hence, flexWARE introduced the concept of Smart Timing Repeaters (STR) that only implement the functionality of a TDoA receiver to prevent interference within the cell and with neighboring cells: within one cell there is only one AP and multiple STRs. Since STRs contain many components also required by an AP, it seems favorable that one of the STRs of a cell also acts as the AP. If all STRs are equipped with AP functionality, other STRs can take over in case of AP failure and hence increase the reliability of the communication system. In this way, a novel security service granting authorization and access control based on the location is created, addressing requirements R2 and R3.
Secure (3D) localization in this scheme is based on two requirements:
1. The same packet is detected by at least five STRs.
2. The area should be chosen in such a way that "flat" intersections of the hyperbolas are avoided.
Since the localization relies on the arrival time of a single packet, an adversary able to manipulate the signal runtimes (and therefore the arrival times) is able to claim a wrong position. Figure 4 shows the attack for a planar situation with four STRs.
² For precise location calculation, clocks at access points must be synchronized in the nanosecond range, requiring special hardware. This synchronization is achieved using the IEEE 1588 protocol and the security extension specified in Annex K [7], amended by dedicated security modules introducing no jitter [8].
[Figure 4 labels: RE with directional antennas and delay lines; claim of new position; Position = f(t_arr,STR1, t_arr,STR2, …)]
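The planar attack can be made concrete with a short sketch. It assumes, as in the Figure 4 scenario, that the adversary can reach each STR with a separate directional antenna and delay the signal for each STR independently; the function names and the rectangular restricted area are illustrative assumptions.

```python
import math

C = 299_792_458.0  # propagation speed in m/s


def spoof_delays(real_pos, claimed_pos, strs):
    """Per-STR delays that make a sender at real_pos appear at claimed_pos."""
    raw = [(math.dist(claimed_pos, s) - math.dist(real_pos, s)) / C
           for s in strs]
    offset = min(raw)  # delay lines can only add delay, so shift to >= 0
    return [d - offset for d in raw]


def observed_tdoa(pos, strs, delays=None):
    """TDoA values seen by the STRs, relative to the first STR."""
    delays = delays or [0.0] * len(strs)
    times = [math.dist(pos, s) / C + d for s, d in zip(strs, delays)]
    return [t - times[0] for t in times[1:]]


def inside_restricted_area(pos, area):
    """Access decision for an axis-aligned rectangular area."""
    (x_min, y_min), (x_max, y_max) = area
    return x_min <= pos[0] <= x_max and y_min <= pos[1] <= y_max
```

The common offset cancels in the time differences, so the TDoA values produced from the real position with these delays equal those of an honest sender at the claimed position. The localization alone then cannot distinguish the two cases, which is why the attack depends on knowing the STR positions and on reaching each STR individually.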
be increased by additional measures in order to make an attack more costly than the assets gained by it. According to Kerckhoffs' principle [13], the positions of the STRs should be assumed to be known to the attacker. Nevertheless, in practical situations STR locations are hidden. Since these devices do not transmit data, their position also cannot be measured. Without knowledge of the STR positions, the attack cannot be performed successfully. Also, STRs in close vicinity to each other and arranged in a rather circular pattern require very narrowly focused directional antennas in order to precisely aim at one STR only.
As a next level of security, additional information can be retrieved from the signal. The simplest property, although very unreliable in terms of protection, is the signal strength of the incoming signal, which can be used to verify the calculated distance. Advanced attackers might be able to forge this with suitable amplifiers. In the same way, yet harder to circumvent, the angle of arrival could be determined, which requires suitable (phased) antenna arrays at the STRs. With proper placement of the STRs and taking the large dimensions of industrial areas into account, the effort for an attack can be drastically increased. Finally, although in our opinion only theoretically, reflections of the signal due to multipath effects can be used to determine whether the signal was coming from the claimed position. For industrial environments, a practical use of this method seems rather unrealistic.
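The signal strength check mentioned above could look like the following sketch. The paper does not name a propagation model, so the log-distance path-loss model and all parameter values (transmit power, reference loss at 1 m, path-loss exponent, tolerance) are assumptions made purely for illustration.

```python
import math


def expected_rssi_dbm(distance_m, tx_power_dbm=20.0,
                      ref_loss_db=40.0, exponent=3.0):
    """Expected received power under an assumed log-distance path-loss model."""
    d = max(distance_m, 1.0)  # the model is referenced to 1 m
    return tx_power_dbm - ref_loss_db - 10.0 * exponent * math.log10(d)


def rssi_plausible(measured_dbm, claimed_pos, str_pos, tolerance_db=10.0):
    """Check whether the measured strength fits the TDoA-derived distance."""
    distance = math.dist(claimed_pos, str_pos)
    return abs(measured_dbm - expected_rssi_dbm(distance)) <= tolerance_db
```

As the text notes, this is a weak check: an attacker with an adjustable amplifier can tune the received power for each STR, and the tolerance has to be wide enough to absorb fading in an industrial environment.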
[Figure 4 labels: Node A; Controller; restricted area; STR2; STR4; arrival times ta2, ta4, tre, tre4; points C1 to C7]
Figure 4. Location-based security services and planar attack scenario. The adversary node RE is equipped with very narrowly focused directional antennas aiming at the smart timing repeaters; delay elements d1 to d4 are installed in front of these antennas. Setting these delay elements correctly – in the given case 0