2013 First International Black Sea Conference on Communications and Networking (BlackSeaCom)
Interoperability of Secure VoIP Terminals M. Ümit Uyar The City College of the City University of New York New York, NY 10031, USA
Recep Talha Küyük, Hasan Basri Çelebi, İbrahim Hökelek, Özgür Ören, Ayhan Yeni, Aydın Sarıbudak, Fatih Kara, Gökhan Vıcıl TÜBİTAK, BİLGEM, Gebze, Kocaeli, Turkey
[email protected]
{recep.kuyuk,ibrahim.hokelek,ozgur.oren}@tubitak.gov.tr
To automate the interoperability tests, formal test models are needed for MİLSEC-4 terminals. These models must be detailed enough to include all the testable features needed for interoperability of MİLSEC-4 terminals. Minimum-length, maximum-coverage test sequences can then be derived automatically from the test models using the test generation techniques from test theory [4]. These test sequences can be converted into executable scripts to run in the BİLGEM interoperability testbed shown in Figure 1. A special-purpose software unit developed at BİLGEM, called Shadow Unit, is placed into each MİLSEC-4 terminal to automate the application of user inputs to the terminal (e.g., pushing buttons, answering incoming calls, etc.). Shadow Unit can also capture the screens of MİLSEC-4 terminals and send them to the coordinator, where they are compared with the reference (expected) screens for each state.
Abstract: An interoperability test automation system has been implemented at the Wireless Communication Technologies Research Laboratory (KİTAL) at TÜBİTAK BİLGEM for secure IP communication terminals. Designed and manufactured at BİLGEM, MİLSEC-4 terminals provide VoIP signaling using SIP and end-to-end secure communication using SCIP protocols. This test system, called Shadow Coordinator, aims to verify the interoperability of terminals implementing SCIP, SIP and other related protocols. A simplified EFSM model for the expected behavior of a terminal implementing SCIP is introduced. A more advanced version of this model is used to generate minimum-length and maximum-coverage test sequences.
Keywords: Interoperability, test automation, VoIP, SCIP, SIP.
I. INTRODUCTION
We present an interoperability testbed implemented at the Wireless Communication Technologies Research Laboratory (KİTAL) at TÜBİTAK BİLGEM for secure IP communication terminals. These terminals, called MİLSEC-4, are designed and manufactured at BİLGEM. MİLSEC-4 terminals use SIP (Session Initiation Protocol) [1] for VoIP communication signaling and provide secure communication over IP by implementing SCIP (Secure Communication Interoperability Protocol) [2] as a set of application layer protocols.
To improve the effectiveness of interoperability tests, MİLSEC-4 test models were kept large, with hundreds of states and thousands of state transitions. The length of the tests for various features of MİLSEC-4, such as open and secure calls, hold and conference services, is over 4,000 steps.
SCIP has been accepted by the NATO member nations as the mechanism to establish and maintain secure end-to-end sessions. For each secure session, the SCIP signaling mechanism [3] negotiates an application type such as voice or data, a cryptographic algorithm type such as Suite A or Suite B, and traffic encryption keys. SCIP can operate over any network as long as the minimum requirement of a 2400 bps synchronous data channel between two terminals is provided. It is independent of the underlying network and can allow for higher data rate voice coding algorithms, or higher rate data transmissions, if the network supports them.
Figure 1 depicts the TÜBİTAK BİLGEM interoperability testbed used for performing the interoperability tests among BİLGEM's secure communication terminals such as secure VoIP terminals (MİLSEC-4), secure cell phones (MİLCEP-K2) and secure PSTN terminals (MİLSEC-1A and MİLSEC-2). The testbed also includes various commercial wired and wireless terminals, servers, and gateways for both open and secure communications.
Two terminals implementing SCIP and using the same certificate to generate the cryptographic keys should interoperate with each other regardless of the underlying network used for each terminal and where they are manufactured. Therefore, interoperability among SCIP terminals implemented by different NATO nations is expected.
978-1-4799-0857-8/13/$31.00 ©2013 IEEE
Figure 1 TÜBİTAK interoperability test system
II. SCIP TEST MODEL FOR MİLSEC-4
A. EFSM Models
There are many IP-related protocols running in MİLSEC-4 terminals (such as SIP, SCIP, RTP, TCP, UDP and others)
Figure 2 Simplified EFSM model for secure call signaling (many state transitions are not shown due to the lack of space)
States shown: Null_Null, RBSec_Null, RingSec_Null, RBSec_Ring, RBSec_RingSec, RingSec_Ring, RingSec_RingSec, RingSec_Act, RingSec_ActSec, Act_Null, Null_Act, Act_Ring, Act_RingSec, ActSec_Null, Null_ActSec, ActSec_Ring, ActSec_RingSec.
Transition labels shown: A_callSec_B, B_callSec_A, C_call_A, C_callSec_A, A_answer_B, A_answer_C, B_answer_A, B_hangup, C_hangup, A_push_Open, A_push_Sec, B_push_Open, B_push_Sec, C_push_Open, C_push_Sec.
where each state has two variables, F1 and F2: F1 corresponds to the state of the call between A and B, and F2 to the state of the call between A and C. The possible values for F1 or F2 are the seven states defined above and the additional states from the open calls which are not included here (e.g., hold and conference states). The initial state of this simple EFSM model is Null_Null, indicating that there are initially no calls at the terminal. If, for example, A calls B, the EFSM model for terminal A moves to the RB_Null state (i.e., the call from A to B is in the ringback state, whereas there is no call between A and C). If B answers the incoming call from A, the EFSM moves to the Act_Null state. Many different actions may take place at this point. For example, A or B can switch to a secure call (i.e., the ActSec_Null state), or C may make a clear or secure call to A, moving the EFSM to the Act_Ring or Act_RingSec state, respectively. Parts of this simplified EFSM are shown in Figure 2, which includes only some of the states and state transitions due to limited space. Several design rules are defined for the implementation of SCIP in MİLSEC-4 with respect to handling secure calls: (i) if there is an active secure call in a terminal, there cannot be any other active or held calls, (ii) an active secure call cannot be put on hold, nor can it participate in a conference call, (iii) when there is one secure call, all signaling for a clear or secure call is allowed over the other call except for moving into the Act or ActSec state, (iv) when in the RBSec_Ring or RBSec_RingSec state, the EFSM moves to the Null_Act or Null_ActSec state, respectively, if A chooses to answer the incoming call from C.
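The two-variable state and the design rules above can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: the table lists just the transitions named in the text, the event name A_call_B is formed by analogy with the labels in Figure 2, and "Hold" stands in for the open-call states that the simplified model leaves out. It is not the test model used in the testbed.

```python
# Per-call states named in the text. "Hold" is an assumed placeholder for the
# open-call states that the simplified model omits.
CALL_STATES = {"Null", "RB", "Ring", "Act", "RBSec", "RingSec", "ActSec", "Hold"}

# Partial transition table: (composite state "F1_F2", event) -> next state.
# Only transitions described in the text are listed; A_call_B is an assumed label.
TRANSITIONS = {
    ("Null_Null", "A_call_B"): "RB_Null",
    ("RB_Null", "B_answer_A"): "Act_Null",
    ("Act_Null", "A_push_Sec"): "ActSec_Null",
    ("Act_Null", "B_push_Sec"): "ActSec_Null",
    ("Act_Null", "C_call_A"): "Act_Ring",
    ("Act_Null", "C_callSec_A"): "Act_RingSec",
    ("RBSec_Ring", "A_answer_C"): "Null_Act",        # design rule (iv)
    ("RBSec_RingSec", "A_answer_C"): "Null_ActSec",  # design rule (iv)
}


def is_allowed(state):
    """Design rule (i): an active secure call excludes any other active or held call."""
    f1, f2 = state.split("_")
    if f1 not in CALL_STATES or f2 not in CALL_STATES:
        raise ValueError(f"unknown composite state: {state}")
    busy = {"Act", "ActSec", "Hold"}
    return not ((f1 == "ActSec" and f2 in busy) or (f2 == "ActSec" and f1 in busy))


def run(events, state="Null_Null"):
    """Apply a sequence of events to the partial model and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise KeyError(f"no transition modeled for {event} in {state}")
        state = TRANSITIONS[key]
    return state
```

For instance, run(["A_call_B", "B_answer_A", "A_push_Sec"]) walks the path described in the text from Null_Null to ActSec_Null, and is_allowed("ActSec_Act") is False because rule (i) forbids a second active call next to an active secure call.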
which make the interoperability testing even more challenging. Interoperability tests must include all actions initiated using the hard buttons available to the user and the actions initiated by touching the screen through the graphical user interfaces (GUIs). The complexity of MİLSEC-4 terminals makes manual test generation and execution techniques infeasible and necessitates the employment of formal test generation techniques reported in the literature [4]. One of the most popular models to represent the behavior of an implementation under test (IUT) is the extended finite state machine (EFSM) [4]. EFSMs can be represented as directed graphs where the vertices and the edges of the directed graph correspond to the states and the state transitions of the EFSM model. The states, state transitions, inputs and outputs for the test model are selected for a given system such that the behavior of an IUT can be tested most effectively and efficiently.
B. Simplified EFSM model for Secure Call Processing
Between two users communicating with SCIP-enabled terminals, a user can initiate a secure call at any time and can switch from a secure call to a clear call at any time, as defined by the SCIP specification [2][3]. Let us now consider a SCIP-enabled terminal such as MİLSEC-4 (let us refer to the user of this terminal as user A). In MİLSEC-4, two calls can appear on the screen. Following the SIP tradition of naming the users, let us refer to the other users as B and C. Let us now construct a simple EFSM model representing the secure call processing in MİLSEC-4. First, let us identify a few states for a call between A and B: Null (no call is present), RB (A calls B, where A gets ringback as B's terminal is ringing), Ring (B calls A, where A is ringing as B gets ringback), Act (A and B are talking), RBSec (A makes a secure call to B), RingSec (B makes a secure call to A), and ActSec (A and B have a secure call). Now, without loss of generality, let us define an EFSM
C. Test Generation from EFSM Model
An important requirement in designing test models is that the inputs applied by the tester must be controllable and the outputs (or the states) of the model must be observable by the tester. In black-box test models, an IUT is considered as a
black-box such that (i) internal implementation details (such as variables, internal interfaces, special timers, etc.) are not available to the tester, (ii) inputs can only be applied to an IUT by externally available means (e.g., buttons, digits, handsets, etc.), and (iii) only the external outputs or GUIs are observable by the tester (e.g., internal variables or timeouts cannot be observed). These requirements make the black-box approach a suitable method for interoperability testing, where many different pieces of equipment from different vendors are tested for their simultaneous operation without access to their internal implementation details.
Figure 3 Example states defined for the test model of MİLSEC-4: (a) Hold_Ring_C2_OnHook state, (b) Hold_Active_C2_OnHook state
Obtaining a tour of a directed graph model that is minimum in length while covering all of the modeled state transitions of the EFSM model is called the Rural Chinese Postman (RCP) problem [4]. The Unique Input/Output (UIO) sequence method is an effective and efficient tool for verifying the current state of an IUT [5]. The combination of the RCP and UIO sequences has been successfully applied to generate conformance and interoperability tests for many complex industrial and military systems including VoIP terminals and switches, ISDN terminals and switches, MIL-STD-188-220 communication devices, and PBXs [8].
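To illustrate what covering all modeled state transitions with a single tour means, the sketch below builds a transition tour over a toy single-call model. It is a greedy heuristic (always walk to the nearest uncovered transition), not an RCP solver, so the tour is not guaranteed to be minimum in length, and it does not append UIO sequences. The graph, the state names and the function names are assumptions made for this example only.

```python
from collections import deque

# Toy model: state -> {input: next state}. Every state must appear as a key.
GRAPH = {
    "Null": {"A_call_B": "RB", "B_call_A": "Ring"},
    "RB": {"B_answer_A": "Act", "A_hangup": "Null"},
    "Ring": {"A_answer_B": "Act", "B_hangup": "Null"},
    "Act": {"A_hangup": "Null"},
}


def path_to_uncovered(graph, start, uncovered):
    """Breadth-first search for the shortest path that ends with an uncovered edge."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        for inp, nxt in graph[state].items():
            edge = (state, inp, nxt)
            if edge in uncovered:
                return path + [edge]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [edge]))
    return None


def transition_tour(graph, start):
    """Greedy tour covering every transition at least once (not length-optimal)."""
    uncovered = {(s, i, n) for s, outs in graph.items() for i, n in outs.items()}
    tour, state = [], start
    while uncovered:
        path = path_to_uncovered(graph, state, uncovered)
        if path is None:
            raise ValueError("some transitions are unreachable from the current state")
        for edge in path:
            uncovered.discard(edge)
            tour.append(edge)
        state = path[-1][2]
    return tour
```

Calling transition_tour(GRAPH, "Null") returns a list of (state, input, next state) triples in which each triple starts where the previous one ended and every transition of GRAPH appears at least once. An RCP-based generator would additionally minimize the number of repeated transitions.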
D. Interoperability Test Model for SCIP
Each state in the SCIP test model is represented by four variables: F1F2F3F4. The states of Calls 1 and 2 are represented by F1 and F2, the selected call on the screen by F3, and the state of the handset by F4. In SIP specifications, the users are traditionally referred to as A, B and C. The F1 variable corresponds to the state of the call between A and B, whereas F2 is for the state of the call between A and C (recall that a simplified version of the states corresponding to F1 and F2 is shown in Figure 2). On the MİLSEC-4 screen, the user can select either Call 1 or Call 2 by touching the corresponding line on the screen. F3 shows which call is currently selected (i.e., in focus) on the terminal. Finally, F4 indicates whether the handset is off-hook or on-hook.
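As a minimal sketch of this four-variable encoding, the fragment below maps the variables F1 to F4 onto a state name of the form used in Figure 3 (for example Hold_Ring_C2_OnHook). The class name, the field names, and the values "C1" and "OffHook" are assumptions chosen by analogy with "C2" and "OnHook"; the actual naming scheme of the full test model may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TerminalState:
    call1: str    # F1: state of Call 1 (between A and B)
    call2: str    # F2: state of Call 2 (between A and C)
    focus: str    # F3: selected call on the screen, "C1" or "C2"
    handset: str  # F4: "OnHook" or "OffHook"

    def __post_init__(self):
        if self.focus not in ("C1", "C2"):
            raise ValueError(f"unknown focus: {self.focus}")
        if self.handset not in ("OnHook", "OffHook"):
            raise ValueError(f"unknown handset state: {self.handset}")

    def name(self):
        """Join the four variables into a single state name."""
        return "_".join((self.call1, self.call2, self.focus, self.handset))

    @classmethod
    def parse(cls, name):
        """Split a state name back into its four variables."""
        parts = name.split("_")
        if len(parts) != 4:
            raise ValueError(f"expected four fields in state name: {name}")
        return cls(*parts)
```

With this encoding, TerminalState("Hold", "Ring", "C2", "OnHook").name() gives "Hold_Ring_C2_OnHook", and parse() recovers the four variables from such a name.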
MİLSEC-4 users can initiate actions by pushing hard buttons (e.g., digits, mute button, etc.), picking up or hanging up the handset, or touching soft buttons displayed on GUIs (e.g., menu, silent, hold, resume, conference and other soft keys appearing on the screen). Therefore, the MİLSEC-4 test model must allow for a tester initiating actions and observing outputs and/or states of an IUT by using both the buttons and the touch screen inputs. Each correct (or expected) output (such as dial tone, ringing, or active voice connection) and/or state (e.g., the GUI screen for an incoming, held or secure active call) will imply that the underlying protocols running in MİLSEC-4 such as SIP, SCIP, RTP and others are behaving as expected.
Figure 3(a) shows the MİLSEC-4 screen which corresponds to the state called Hold_Ring_C2_OnHook. In this state, Call 1 between A and B is on hold, whereas C is calling A in Call 2. C2 indicates that the focus on the screen is on Call 2, and the handset is in the on-hook position (i.e., the Speaker or Hands Free button is pushed). The MİLSEC-4 terminal may reach this state by taking many different paths. For example, Call 1 (which is on hold) may be initiated either by A or B. Once active, Call 1 may be put on hold by A before or after C starts calling A on Call 2. Similar possibilities are also valid for screen focus and handset position. For example, A may push the Speaker (or Hands Free) button before or after C starts calling A on Call 2. When the terminal is in the Hold_Ring_C2_OnHook state, A may push the Reply button to answer the incoming call from C, which moves the IUT into the Hold_Active_C2_OnHook state. The expected screen for Hold_Active_C2_OnHook is shown in Figure 3(b). This state transition may also take place in different ways. For example, A may push the Reply button, or directly touch the screen where Call 2 is displayed.
In MİLSEC-4, a total of 35 user inputs are implemented: 25 hard buttons (digits, back arrow, mute, speaker, secure call), one handset (on-hook or off-hook), and 9 touchable fields on the screen (not counting the edit screens, menu items, address books, etc., where many fields can be edited by a user). The EFSM model for MİLSEC-4 interoperability tests has several hundred states and thousands of state transitions. The length of the MİLSEC-4 interoperability tests obtained using RCP and UIO sequences for clear and secure calls, hold and conference services is over 4,000 steps. A step in this context is defined as an input (or a group of inputs) applied by the tester to an IUT and the observation of the correct state. For example, A_dial_B, A_press_secure_call, and A_press_hold are among the steps defined in the MİLSEC-4 test model. Instead of verifying the individual output messages generated by SIP, SCIP and RTP (which are checked during the unit and conformance testing stages of the development cycle before interoperability tests are run), the screens of the IUT are checked at each step. For the MİLSEC-4 interoperability test model, the states correspond to the distinct screens of the GUI observed in an IUT. To verify that the screen of an IUT GUI is what it should be, special software has been developed at TÜBİTAK which is capable of extracting the screen information from an IUT and comparing it with the expected screen.
All these different scenarios correspond to different state transitions in the EFSM model of MİLSEC-4. Scenarios manually prepared by testers cannot be effective in complex systems such as MİLSEC-4 since the number of possible scenarios is prohibitively large. In our experience, the coverage of manually prepared test scenarios is limited to only a few percent of the test space. The examples given in SIP documents for a conference call have 6 steps, whereas EFSM model based test sequences have approximately 800 steps. EFSM models are not only indispensable for test coverage, but they are also very convenient for automating test sequence generation and test execution.
III. INTEROPERABILITY TEST SYSTEM
commands will then virtually push the buttons of the IUT and capture and send the screen shots as defined by the test sequence (Figure 5). Figure 6 shows a sample GUI for Shadow Coordinator captured during interoperability testing of MİLSEC-4.
An effective and efficient automated interoperability test system for MİLSEC-4 terminals must possess capabilities to (i) automatically apply user inputs to an IUT as dictated by a test plan, and (ii) automatically check the outputs generated by an IUT for their validity with respect to the expected outputs. These features provide controllability and observability of test cases so that large volumes of test cases can be prepared and, at the same time, their execution can be automated. To satisfy these requirements, a special software unit called Shadow Unit, which runs in MİLSEC-4 terminals, has been implemented by the developers at TÜBİTAK (Figure 4). Shadow Unit, using a separate Ethernet access, allows the effect of the following inputs to be applied to an IUT without user intervention: (i) pushing any button on MİLSEC-4, (ii) touching any given area of the MİLSEC-4 screen, and (iii) off-hook and on-hook operations with the handset.
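The three kinds of input listed above, together with the screen capture request, can be pictured as a small command vocabulary. The sketch below is purely hypothetical: the paper does not describe the Shadow Unit command format, so the class names, the button name "SECURE", the coordinates, and the text encoding are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PushButton:
    button: str  # hypothetical button identifier, e.g. "SECURE" or "5"


@dataclass(frozen=True)
class TouchScreen:
    x: int  # hypothetical screen coordinates of the touched area
    y: int


@dataclass(frozen=True)
class Hook:
    off_hook: bool  # True = pick up the handset, False = hang up


@dataclass(frozen=True)
class CaptureScreen:
    pass


def encode(command):
    """Serialize a command as one line of text (assumed format, not the real protocol)."""
    if isinstance(command, PushButton):
        return f"PUSH {command.button}"
    if isinstance(command, TouchScreen):
        return f"TOUCH {command.x} {command.y}"
    if isinstance(command, Hook):
        return "OFFHOOK" if command.off_hook else "ONHOOK"
    if isinstance(command, CaptureScreen):
        return "CAPTURE"
    raise TypeError(f"unsupported command: {command!r}")


def encode_step(commands):
    """Encode the group of inputs that makes up one test step, then request the screen."""
    return [encode(c) for c in commands] + [encode(CaptureScreen())]
```

For example, encode_step([Hook(off_hook=True), PushButton("SECURE")]) yields the three lines "OFFHOOK", "PUSH SECURE" and "CAPTURE", reflecting the idea that every test step ends with an observation of the screen.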
As can be seen in Figure 6, the Shadow Coordinator GUI displays each test step (obtained from the test sequence generated from the EFSM model using RCP and UIO methods), the expected state of the IUT after each test step is executed, the captured screen shot from the IUT, and the result of the comparison of the captured screen shot with the reference (i.e., the expected) screen. If the captured and the expected screens match, a verdict of Pass is displayed on the Shadow Coordinator GUI. If, however, they do not match, the differences are displayed in red so that the tester can observe the fault. In this case a Fail verdict is assigned to the test step. Shadow Coordinator can run all test steps automatically or manually (i.e., step by step), skip one or more steps, push any button, and touch any part of the screen, depending on the circumstances of the IUT during testing. User-defined timers indicate if there is no action from an IUT within a certain amount of time. In addition, Shadow Coordinator captures all the packets sent to and received from the IUT for diagnostic purposes. To the best of our knowledge, there is no commercially available test system containing the capabilities of Shadow Coordinator.
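The Pass/Fail decision described above can be sketched as a comparison of two screens that also reports where they differ, so the differences could be highlighted. The representation of a screen as a list of pixel rows and the function name are assumptions of this sketch; the actual comparator used by Shadow Coordinator is not described at this level of detail.

```python
def compare_screens(captured, expected):
    """Return (verdict, diffs) for two equal-sized screens given as lists of pixel rows.

    diffs lists the (row, column) positions where the screens disagree; these are
    the positions a GUI could mark in red.
    """
    same_shape = len(captured) == len(expected) and all(
        len(c_row) == len(e_row) for c_row, e_row in zip(captured, expected)
    )
    if not same_shape:
        raise ValueError("captured and expected screens differ in size")
    diffs = [
        (r, c)
        for r, (c_row, e_row) in enumerate(zip(captured, expected))
        for c, (c_pix, e_pix) in enumerate(zip(c_row, e_row))
        if c_pix != e_pix
    ]
    verdict = "Pass" if not diffs else "Fail"
    return verdict, diffs
```

A step whose captured screen equals the reference screen gets ("Pass", []); a single differing pixel at row 1, column 1 gives ("Fail", [(1, 1)]). A practical comparator would probably need to mask screen regions that legitimately change between runs, such as a clock field, but that is beyond what the text specifies.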
Figure 5 MİLSEC-4 interoperability testing architecture
Since the outputs generated by an IUT during interoperability tests are encrypted, instead of checking the details of the output messages (which are already tested during unit and conformance testing stages), we decided to check the screens generated by an IUT. Almost all screens have unique features that indicate the state of the call, and hence checking them indirectly verifies the correctness of the signaling. Shadow Unit can send a copy of the IUT screen when requested by an external tester. Note that placing Shadow Unit in the MİLSEC-4 implementation does not conform to the principles of the black-box testing method. However, Shadow Unit will not remain in the final product. After many iterations of interoperability tests are conducted during the product development cycle, the MİLSEC-4 implementation will become mature; at that point Shadow Unit will be removed from the IUT, and the interoperability tests will be repeated (most likely manually, since automation will no longer be possible without Shadow Unit). These tests are called operational interoperability tests, as discussed later.
Figure 4 Shadow Unit in MİLSEC-4 terminal
To control the automation of interoperability tests via Shadow Unit (i.e., to push the buttons and to obtain screen shots from an IUT at desired moments), a test automation system, called Shadow Coordinator, has been implemented. Shadow Coordinator converts a given test sequence into executable commands and sends them to Shadow Unit. These
Figure 6 Sample GUI from Shadow Coordinator
IV. INTEROPERABILITY WITH DIFFERENT TERMINALS, SERVERS AND GATEWAYS
terminals due to the lack of Shadow Unit in MİLSEC-4. Figure 7 shows an example operational interoperability testbed setup where the test steps are run manually for MİLSEC-4 terminals, MİLCEP-K2 cell phones and MİLSEC-1A terminals.
A. Interoperability of MİLSEC-4
The tests generated for MİLSEC-4 using the EFSM model of MİLSEC-4 and the RCP and UIO methods can be used for different levels of interoperability. Terminals from various vendors equipped with clear and secure voice and data communications using SCIP can be tested for their interoperability with MİLSEC-4 terminals. In this case, since the vendor terminals will not have the capability of Shadow Unit, the interoperability tests will be only partially automated: the actions of MİLSEC-4 will be controlled by Shadow Coordinator, whereas the actions of the vendor terminals will be run manually as specified by the interoperability test sequence. For example, consider the secure cell phones called MİLCEP-K2, manufactured by TÜBİTAK for clear and secure voice and data communication over GSM networks. Suppose that the tester assigns MİLCEP-K2 as caller C. Since MİLCEP-K2 does not possess Shadow Unit, the actions defined for caller C in the test sequence will be executed manually by a tester, while callers A and B are controlled and observed in an automated fashion by Shadow Coordinator.
Figure 7 Operational interoperability test for various TÜBİTAK terminals
V. CONCLUDING REMARKS
We presented a test automation system developed at TÜBİTAK for interoperability of VoIP terminals with clear and secure voice and data communication capabilities over wired and wireless networks. Using EFSM models to represent MİLSEC-4 behavior for both secure and clear calls, and with the help of the RCP and UIO sequence techniques from test theory, a large number of interoperability tests were generated automatically. By inserting a special software entity, called Shadow Unit, into MİLSEC-4, the interoperability tests are executed in a fully automated fashion without any human intervention. After MİLSEC-4 becomes mature, Shadow Unit will be removed and a subset of tests, called operational interoperability tests, will be run manually to ensure that no errors are introduced by the removal operation. Future versions of the test automation system will target testing the interoperability of various communications systems developed at TÜBİTAK.
In another level of interoperability, different servers and gateways can be employed to connect MİLSEC-4 terminals. In this case, interoperability tests can be run fully automated if all terminals are MİLSEC-4, or partially automated if some of the terminals are not MİLSEC-4 and/or are manufactured by different vendors.
B. Operational Interoperability Tests for MİLSEC-4
As mentioned earlier, placement of Shadow Unit into MİLSEC-4 terminals for testing purposes does not follow the principles defined for black-box testing that are required for most interoperability tests. However, Shadow Unit was a necessity in order to automate the large number of tests defined for MİLSEC-4 interoperability. After the terminals are tested for many iterations and the MİLSEC-4 implementation finally becomes mature, Shadow Unit and its access point have to be removed from the terminal. Since this removal requires software and hardware changes (no matter how small), there is still a non-zero probability that errors may be injected into the terminal during the removal. Therefore, the terminal has to be tested again following the removal of Shadow Unit and its access point. Since no automation will be possible for running the tests, the interoperability tests must be run manually. Automated evaluation of the test steps will also not be possible, since there is no way to obtain the screen shots from the IUT without Shadow Unit, so the screens must be validated manually. Typically, these tests are selected as a subset of the full interoperability test suite (e.g., 50% of the interoperability tests can be selected for manual operation). This subset of tests, which is executed manually, is often called operational interoperability tests.
REFERENCES
[1] J. Rosenberg et al., "SIP: Session Initiation Protocol," IETF RFC 3261, June 2002.
[2] J. S. Collura, "Secure Communications Interoperability Protocols (SCIP)," IEEE Military Communications, pp. 19-1 – 19-10, 2006.
[3] General Dynamics, "SCIP Signaling Plan (SCIP-210) Revision 3.2," Available Online: http://cryptome.org/2012/07/nsa-scip.pdf
[4] S. S. Batth, M. U. Uyar, Y. Wang and M. A. Fecko, "Fault Masking by Multiple Timing Faults in Timed EFSM Models," Computer Networks, Vol. 53, Issue 5, pp. 596-612, Apr. 2009.
[5] A. S. Kalaji, R. M. Hierons and S. Swift, "An integrated search-based approach for automatic testing from extended finite state machine models," Information and Software Technology, 53(12), 2011.
[6] H. Ural and Z. Xu, "An EFSM-Based Passive Fault Detection," Testing of Software and Communicating Systems, 4581:335–350, 2007.
[7] M. U. Uyar, S. Batth, J. Allen, W. Chriss and D. Somers, "Testing Industrial VoIP Implementations," Proc. 4th IEEE Int'l. Conf. on Standardization, Innovation in Information Technology, Switzerland, Sept. 2005.
C. Operational Interoperability with Various Terminals
At the last stage of the operational interoperability tests, the MİLSEC-4 terminal will be interconnected with various secure terminals, SIP servers and gateways. Again, the user actions and the validation of screens have to be manual for all