
Command Line or Pretty Lines? Comparing Textual and Visual Interfaces for Intrusion Detection

Ramona Su Thompson1, Esa M. Rantanen1, William Yurcik3, Brian P. Bailey1,2
Human Factors Division1, Department of Computer Science2, NCSA3
University of Illinois Urbana, IL 61801 U.S.A.
{ramonasu, rantanen, bpbailey}@uiuc.edu, [email protected]

ABSTRACT

Intrusion detection (ID) is one of network security engineers’ most important tasks. Textual (command-line) and visual interfaces are two common modalities used to support engineers in ID. We conducted a controlled experiment comparing a representative textual and visual interface for ID to develop a deeper understanding about the relative strengths and weaknesses of each. We found that the textual interface allows users to better control the analysis of details of the data through the use of rich, powerful, and flexible commands while the visual interface allows better discovery of new attacks by offering an overview of the current state of the network. With this understanding, we recommend designing a hybrid interface that combines the strengths of textual and visual interfaces for the next generation of tools used for intrusion detection.

Author Keywords

Intrusion detection, network security, user study, textual interfaces, visual interfaces

ACM Classification Keywords

H.5.2 [Information interfaces and presentation]: User Interfaces -- evaluation/methodology and interaction styles.

INTRODUCTION

Network security engineers (a.k.a. system or security administrators) provide many critical and other valued services for companies, research laboratories, and universities while protecting their internal computer networks from malicious attacks. Much of engineers’ time is spent on reactive services, such as incident handling and attending to alerts [18, 25].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CHI 2007, April 28–May 3, 2007, San Jose, California, USA. Copyright 2007 ACM 978-1-59593-593-9/07/0004...$5.00.

At the core of the reactive services is the task of incident analysis or intrusion detection (ID). ID can be divided into the following four subtasks: (1) pre-process information, (2) monitor the network for attacks, (3) analyze potential attacks, and (4) respond to them [9, 10, 23]. While automated intrusion detection systems (IDSs) and firewall applications provide some support in this task, successful ID still relies heavily on the expertise and knowledge of human engineers because they are able to adapt to changes in their environment and quickly integrate pertinent information about the network from a variety of sources [9].

Even for expert engineers, however, ID remains a very difficult and challenging task. Ethnographic studies of network security engineers have shown that they frequently experience information and cognitive overload while using multiple information resources [10-12, 16, 19, 23, 25], especially in the persistent subtasks of monitoring and analysis. The overload is further intensified for those individuals who serve multiple roles in an organization (e.g., system administrators) or who are less experienced.

Currently, there are two types of interfaces being used, textual and visual. The textual interface, which typically provides a command line, is used to manipulate textual data (e.g., network logs and system logs). This textual data is often the main resource for network security engineers because it provides detailed information about the evolving use of the network [10, 16, 23]. The commands used in the interface allow engineers to quickly customize the data into a form where they can analyze the information (e.g., by quickly filtering out insignificant data using the grep command). These commands are rich, expressive, flexible, and powerful and are often chained into longer, complex sequences. Yet, the sheer amount of textual data to be examined often results in overload and makes the already complex task of ID more difficult and cognitively intensive.

Research in data visualization, or visual interfaces, is emerging in the network security community. Researchers have sought to leverage the benefits of visualization to reduce the time and workload associated with the ID task [3, 5, 7, 9, 10, 12, 19, 21, 24], but few solutions have been successfully adopted. This may be due to the existing interfaces: they tend to require detailed interaction with the interface, provide only simple filtering and manipulation of the data, and are highly specialized and hence limited in the scope of information they provide.

There has been little empirical understanding about each of these interfaces. In this work, we empirically compare a textual (command-line) interface with a representative visual interface for the ID task, focusing on understanding the tradeoffs between the two and understanding how to design more effective interfaces.

In our study, users were asked to investigate a series of IDS alerts by identifying different types of security attacks on a simulation of a realistic network. This task was based upon the monitoring and analysis phase of ID, the most time-consuming and cognitively challenging subtask in ID [9, 10, 23]. We collected and analyzed performance, observational, and subjective data from users. Results from the study offer an important step toward understanding the tradeoffs between textual and visual interfaces for ID as well as how these two modalities could be effectively combined in future tools.

RELATED WORK

In this section, we provide an overview of the intrusion detection (ID) task of network security engineers and of visualization tools for network security.

Intrusion Detection

Intrusion detection poses high cognitive demands on network security engineers [16]. Given the complexity of the task and the wide range of resources available to them, there are few detailed guidelines for how the task should be performed. By a general consensus in the literature, the task can be divided into three main phases: (1) pre-processing, (2) monitoring and analysis, and (3) response [10, 23].

The pre-processing phase requires the engineer to configure a system to match new network data with the known patterns of malicious behavior and then generate alerts to notify the user [2]. Usually, this system is the IDS. This initial alert is critical as it provides a starting point for investigations; thus, engineers would like to minimize the number of false alarms without missing any true alerts [23].

The monitoring and analysis phase requires engineers to identify a potential “true” alert. Specifically, engineers sift through the alerts and make initial judgments on the alerts’ “trueness” by using their expertise coupled with the resources about the network. Once a “true” alert is identified, engineers prioritize its importance and determine if any further investigation is needed [9, 10, 23].

When a real attack has been identified, the task moves into the response phase. The type of response depends on the attack, but it may be as simple as electronically cleaning a computer using anti-virus software or as severe as shutting down the entire network.

From the ethnographic studies and cognitive task analysis reported in the literature [9, 10, 23], the monitoring and analysis phase is shown to be the most cognitively intensive and taxing task. This task requires significant allocation of attentional resources to monitor the thousands of incoming alerts (of which up to 99% may be false alarms) and requires retrieval of information from network resources and expert knowledge about the network to identify alerts indicating actual attacks [9-12, 15, 16, 23]. The design of effective tools can help reduce the workload that engineers face during this phase. Our research seeks to understand the strengths and weaknesses of existing tools of different modalities, textual and visual, to help inform the design of more effective tools for network security engineers.

Visualization Tools in Network Security

Information visualization research has sought to help users efficiently discover and analyze information through visual exploration. This can be especially useful for large datasets in which the textual information is overwhelming and finding information is difficult. A variety of design paradigms have been used to address this problem in these visual interfaces: overview first with details on-demand [4, 14, 17, 22]; focus plus context [6]; and overview, zoom, and filter [13].

Recent research has sought to leverage the benefits of information visualization by applying these paradigms in the creation of a variety of visual interfaces for network security: VisflowConnect [24], NVisionIP [20], IDS Rainstorm [3], TNV [11], and PortVis [21]. Each of these interfaces was designed with the hope of reducing the engineer’s workload in the task of ID. While these tools have been designed using a user-centered approach, very few have been empirically evaluated in the task of ID. In our research, we compare one of these visual interfaces with the prevailing textual interface in order to uncover the strengths and weaknesses of each in the task of ID.

USER STUDY

The underlying assumption in the development of visualization tools appears to be that a textual interface is less effective than a visual interface, as can be seen from the large number of visual interfaces created for network security engineers by the research community. The purpose of this study was to understand how the modality of the interface affects the monitoring and analysis phase of ID. Specifically, the study was designed to answer two central questions:

1. How does the modality of the interface affect users’ performance, efficiency, information processing, and integration of information for the task of ID?

2. What are the relative strengths and weaknesses of each modality of interface for the task of ID?

Interfaces

In a controlled experiment, we compared a representative interface from each modality: a command-line interface as the textual interface and VisflowConnect [24] as the visual interface.

Figure 1. (a) Sample of Netflow data which includes the following information about each network traffic flow (from left to right): start time, end time, source IP address with the port, destination IP address with the port, number of source packets, number of destination packets, source byte amount, destination byte amount, and status of flow. (b) Screenshot of the Main View of VisflowConnect, the visual interface that visualizes the data seen in (a). The axes, from left to right, are the external domain sender, internal host, external domain receiver. Lines between axes indicate network traffic flowing between the external domains and internal hosts. In this screenshot, an external domain is selected which can be seen with the pink (darkened) highlighted triangle on the left. (c) A detailed view of the external domain selected in (b), where the axes now represent the external hosts. The selected external host, the triangle highlighted in pink, is portscanning the internal network. (d) Screenshot of the shell window, the textual interface. A user typed the above command to filter the raw Netflow data from (a). The command is used to determine the unique IPs on the internal network that are being connected to by the external IP address in question. The attack being investigated is the same portscan, with different IP addresses, as seen in (b) and (c).

Textual Interface

The textual interface was composed of two parts: textual data and commands on the command line. The textual data was the Netflow data, which contains details of the network traffic flows and is described in the Netflows section. Text-based commands entered on the command line are used to manipulate the data. Filtering is a frequent form of manipulation, for example with the grep and AWK commands: grep allows a user to search through files for specific strings in the text, while AWK is a programming language used for complex text-processing tasks. Engineers also used custom scripts to look at the network data for anomalous behaviors [10]. A screenshot of the textual data and the command line can be seen in Figures 1a and 1d.
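As an illustration only, and not the command that users typed in the study, the kind of filter described for Figure 1d (finding the unique internal IPs contacted by one external IP) could be sketched in Python. The sketch assumes a hypothetical log layout in which each record is one whitespace-separated line, timestamps are single tokens, and addresses are written as `ip:port`; the function name, the sample addresses, and the `10.` internal prefix are likewise assumptions made for the example.

```python
from typing import Iterable, Set


def unique_internal_targets(
    lines: Iterable[str],
    external_ip: str,
    internal_prefix: str = "10.",  # assumed internal address range
) -> Set[str]:
    """Collect the internal IPs that `external_ip` connected to."""
    targets: Set[str] = set()
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed records
        # Assumed column order: start, end, src ip:port, dst ip:port, ...
        src_ip = fields[2].rsplit(":", 1)[0]
        dst_ip = fields[3].rsplit(":", 1)[0]
        if src_ip == external_ip and dst_ip.startswith(internal_prefix):
            targets.add(dst_ip)
    return targets


# Hypothetical records in the assumed layout.
sample = [
    "09:00:01 09:00:02 203.0.113.7:51000 10.0.0.1:42 1 0 60 0 S",
    "09:00:01 09:00:02 203.0.113.7:51001 10.0.0.2:42 1 0 60 0 S",
    "09:00:03 09:00:09 198.51.100.4:443 10.0.0.2:50122 12 10 9000 800 F",
    "09:00:04 09:00:05 203.0.113.7:51002 10.0.0.2:42 1 0 60 0 S",
]

hits = unique_internal_targets(sample, "203.0.113.7")
print(sorted(hits))  # ['10.0.0.1', '10.0.0.2']
```

A shell user would typically express the same idea as a chain of filtering, field-extraction, and de-duplication commands; the Python form is used here only to make each step explicit.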

In the study, users were allowed to use standard text-based commands found in the Unix shell to perform the task. Paper manuals of some of the more commonly used text-based commands were provided to the user for reference, though a background questionnaire revealed that all of the users were familiar with the use of these commands.

Visual Interface

We selected VisflowConnect [24] (Figure 1b-c) as the visual interface because (1) it uses the same data type, Netflows, which allows for comparison between the interfaces; (2) it is a simple interface with a core set of functionality, which reduces the learning curve for users; (3) it was designed from a user-centered point of view; and (4) it follows the information visualization design paradigm of overview first with details on-demand [4, 14, 17, 22].

The interface provides the following functions for visualizing the data: highlighting flows for a specific domain or IP address and ports entered by the user; viewing host statistics of selected domains and IP addresses, which include the number of incoming and outgoing bytes; viewing traffic that occurs only between internal hosts; and viewing traffic between hosts for a highlighted external domain. Users were provided a manual of VisflowConnect and its features as a reference during the task.

Netflows

We used Netflow data as the main resource. Netflows are network-based logs that are collected from a network’s router. They provide records of the flows that occur on the network. A Netflow record contains a wide variety of useful information about the traffic in a given flow (e.g., timestamps and bytes transferred) [1]. Netflows were used because they provide operators with rich details about the traffic flows while remaining relatively concise, and because they serve as a common data type for both the textual and visual interfaces. While Netflows are commonly used to investigate potential security threats on the network [23], the data primarily provides connectivity information. Security analysis is thus behavior-based and must generally be verified with corroborating evidence. The Netflow data used in the experiment (Figure 1a) was anonymized from a real network. There were three sets of data, one for each attack, for a given interface. Each set of data was then altered by changing the IP addresses and shifting the timestamps of the flows to create another three sets of data for the other interface. This randomized the data while allowing us to compare results between interfaces.
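The derivation of a second data set from the first can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: the `Flow` fields mirror the columns listed in the Figure 1a caption, while the record type, the one-to-one `ip_map`, the constant `time_shift`, and the function name `make_variant` are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Flow:
    """One Netflow record; fields follow the Figure 1a column order."""
    start: int          # start time (assumed to be seconds)
    end: int            # end time (assumed to be seconds)
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    src_packets: int
    dst_packets: int
    src_bytes: int
    dst_bytes: int
    status: str


def make_variant(flows: List[Flow], ip_map: Dict[str, str], time_shift: int) -> List[Flow]:
    """Rename IPs consistently and shift all timestamps by a constant.

    Ports, packet and byte counts, and flow durations are untouched,
    so the traffic pattern is preserved while the surface details differ.
    """
    return [
        replace(
            f,
            start=f.start + time_shift,
            end=f.end + time_shift,
            src_ip=ip_map.get(f.src_ip, f.src_ip),
            dst_ip=ip_map.get(f.dst_ip, f.dst_ip),
        )
        for f in flows
    ]


# Hypothetical example: one scan probe, remapped and shifted by an hour.
original = [Flow(100, 101, "203.0.113.7", 51000, "10.0.0.1", 42, 1, 0, 60, 0, "S")]
ip_map = {"203.0.113.7": "198.51.100.9", "10.0.0.1": "10.0.5.1"}
variant = make_variant(original, ip_map, time_shift=3600)
```

Because only addresses and absolute times change, the same attack pattern appears in both data sets, which is what makes a comparison between the two interfaces possible.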

Attack Type

We selected three types of realistic attacks for the study based upon a list of attacks that had recently been identified by the network security engineers at our institution. These attacks were sanitized for use in our study. The attacks were chosen because they are common and often trigger an alert to be generated by an IDS. The specific attacks were:

• Peer-to-peer. A peer-to-peer computer network is a network in which computers can connect directly with each other without the assistance of a third party network or server. Use of peer-to-peer applications is usually a violation of network usage policy, and thus considered an attack. The pattern of this attack in the Netflow appeared as numerous computers, both on the internal and external network, connecting to an internal computer with IP address on a high end port, e.g., 47579.

• Portscan. A portscan involves a computer scanning for multiple listening ports on a single target host or one port on an entire network’s subnet. This is often considered dangerous because it indicates a remote computer is conducting reconnaissance on the internal network. The pattern of the portscan used in the study appeared as a single external host scanning a subnet of the internal network on port 42.

• No attack. This scenario served as a baseline trial in the study and represents a false alarm alert.

Experimental Design

A repeated measures within-subjects design was used with Interface (Textual, Visual) and Attack Type (Portscan, Peer-to-Peer, No Attack) as factors. The conditions were ordered using a Latin Square to minimize learning effects.

Users

Twelve undergraduate and graduate students from various engineering departments at our institution, all with a basic understanding of computer networks and networking concepts, participated in the study. The students ranged in their knowledge of network security, from no knowledge to numerous years of experience as an amateur in network security (e.g., maintaining security of their home network). In addition to the students, we had two network security engineers, each with over three years of experience, participate in the study. Together, these 14 users are representative of current and future network security engineers. The ages ranged from 18 to over 40.

Task Description

We structured our task around the monitoring and analysis phase found in Goodall et al. [9, 10] and Thompson et al. [23] because of its commonality and difficulty [23]. Specifically, the task was broken into the following subtasks, as indicated by Thompson et al. [23]:

1. Receiving an alert. At the beginning of the task, the users were given an alert that signaled some anomalous behavior on the network of the form: “There are over 100 connections to IP address X.”, where X may have been an internal IP address or external IP address. An alert is not synonymous with an attack; it indicates a pattern of behavior on the network that may be associated with an attack.

2. Determining the cause of the alert. Users needed to investigate the alert by looking at the Netflow data in a given interface, and specifically, gather information about what was occurring on the internal network for the IP address listed in the alert.

For the textual interface, users executed the text-based commands on the command line to examine and filter the Netflow data presented in log file form (Figure 1a). The complexity of the types of commands used can be seen in Figure 1d. For the visual interface, users used VisflowConnect to examine and filter the Netflow data (Figure 1b-c). Users were allowed to use any of the features in the interface during their investigation.

3. Deciding if further investigation is necessary by assessing whether an attack had occurred. When the users had gathered enough information about the alert from the Netflow data, they needed to assess whether there was an attack occurring on the network. The amount of investigation and data gathering needed varied for each individual due to variance in network and network security knowledge and experience.

If users suspected an attack, they were required to identify the type of attack associated with the alert. In ID, engineers verify their hypothesis by using other resources, but our users were not asked to do this as they were limited to one resource.

To aid their assessment and to give them some minimal expertise in the area of network security, users were given a list of potential security attacks on the network. On the reference sheet, twelve attacks were described along with a corresponding description of the attack profile.

Hardware and Software

Users performed the task on a laptop running Windows XP and equipped with Secure Shell Client, which was used to connect to a Unix machine for the textual interface. VisflowConnect was installed on the machine and was used as the visual interface. We used Camtasia to record the user’s screen interaction and voice for comments.

Procedures

When the user arrived at the lab, we went through an informed consent process with the user. The user then filled out a demographic questionnaire, was given an explanation and demonstration of the two interfaces, and was allowed time to practice with each of them. The reference manual for VisflowConnect and the manual of common commands for the textual interface were then provided. The user was informed that s/he was not limited to the commands in the manual, and could use any of the standard Unix commands. In addition, we described the format of the Netflow data for the textual interface (Figure 1a), provided the users with a reference sheet of the column descriptions, and provided the list of potential security events that could occur on the network.

The user was then given an alert and asked to investigate and identify the attack, if it existed, as accurately and quickly as possible. The user was allowed to consult any of the reference materials as well as ask the experimenter for clarification on the use of the interfaces. The task ended when the user was confident of the diagnosis of the security event or confident there was no security event. After the task was completed, the user was asked to complete a short questionnaire. The user completed six tasks, one for each interface and attack combination. The user was not aware that the simulated attacks were repeated. When the user had completed all the tasks, s/he was asked to fill out a post-evaluation questionnaire.

Measurements

In this study, the dependent measures were:

• Time on task (TOT). The time on task was measured from the time when the task was started to the time at which the user made a final identification of the type of attack. Measurements were made from the analysis of the timestamps in the screen interaction videos.

• Total number of commands used. In the visual interface, the total number of commands included any interaction with the interface’s controls (buttons, menu items, etc.). In the textual interface, the commands included any command typed by the user. This measure was used as a means of measuring efficiency of the interface.

• Accuracy. This was measured by the correctness of two identifications: (1) whether there was an actual attack, and (2) the specific type of attack. It is important for engineers to be accurate in their assessment so that true attacks can be identified and addressed.

• Confidence. Users rated their confidence in their identification of an attack on a Likert scale (1=Low, 7=High). The less confident an engineer is about their hypothesized assessment, the more time is required on the task, which can decrease productivity.

• Discovery of other problems. This was measured by the number of other investigations, not associated with the alert, that users made during the primary task. This shows whether an interface may help engineers identify other attacks that the IDS does not.

• Rating of Interfaces. Users rated their preference of each interface on a Likert scale (1=Low Rating, 7=High Rating).

In addition, we collected qualitative data by reviewing the users’ screen interaction videos and follow-up comments.

RESULTS

A general linear model analysis (GLMA) was used on the dependent measures, with the threshold of significance set to 0.05. All post-hoc analysis tests were done using the Bonferroni adjustment unless indicated otherwise.

Time on Task

As shown in Figure 2a, users were able to identify an attack more quickly with the textual interface (μ=540.2, s.e.=75.9) than with the visual interface (μ=863.8, s.e.=90.9) (F(1,13)=10.55, p