
Exploring Covert Channels

Jason Jaskolka and Ridha Khedri
Department of Computing and Software, Faculty of Engineering, McMaster University
{jaskolj, khedri}@mcmaster.ca

Abstract—Covert channels pose a threat to system security for many reasons. One of the most significant security concerns surrounding the use of covert channels in computer and information systems involves confidentiality and the ability to leak confidential information from a high level security user to a low level one covertly. There are many differing views surrounding the ideas of covert channels and steganography with debates igniting over the existence of a relationship between the two concepts. This debate can be resolved with a model to provide a perception of covert channel communication to yield a better understanding of covert channels. In this paper, we propose a model to perceive covert channel communication. We use the proposed model to explore the relationship between covert channels, steganography and watermarking. The intent is to provide a better understanding of covert channel communication in an attempt to develop investigative support for confidentiality.

This research is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), Grant Number RGPIN227806-09.

I. INTRODUCTION

With the ever-growing popularity and sophistication of computer systems, computer and information security is becoming more important than ever. Computers are used in virtually every workplace in some form or another. Because of this widespread use and the variety of application domains, security concerns have varying implications and priorities from one domain to another. This paper focusses particularly on concerns regarding the leak of confidential information via covert channel communication from users/agents/processes having high security levels to others with lower security levels.

The concept of covert channel communication goes back to the work of Lampson [1]. A covert channel is commonly defined as any communication channel that can be exploited to transfer information in a manner that violates the system’s security policy [2]. This means that channels that may be hidden from the view of system components, such as system monitors and other agents, are allowed to exist in a computer system provided they do not violate the security policy employed by the system.

Covert channels pose a threat to system security for a number of reasons. The first reason is, of course, a confidentiality concern, as covert channels can be used to pass confidential information secretly. This is a particular concern to large organizations that wish to maintain confidentiality regarding company secrets, as covert channel communication allows this secret information to flow into, or out of, the organization. This is the case with the more recent idea of cloud computing. As organizations are beginning to store huge amounts of data in the “cloud”, they must ensure that the cloud is secure. From a confidentiality perspective, organizations must use prevention and detection mechanisms to protect their data and secrets from any sort of attack or data leakage. Moreover, covert communication channels pose an economic concern, as they provide a means of transmitting information using an existing system without paying for the service provided. This is often the case when a system is infected by a Trojan Horse. For these reasons, among others, covert channel analysis has become part of the evaluation criteria for the classification of secure systems by the United States Department of Defense (DoD) and the National Computer Security Center (NCSC), as outlined in [2] and [3].

According to the literature [4]–[11], there has been debate as to whether a relationship between steganography and covert channel communication exists. In [5], Bidou and Raynal suggest that there is a distinction between steganography and covert channel communication. It is argued that, in the case of steganography, the communication channel is known, so it is concluded that there is no relationship with covert channel communication. Steganography merely hides information in some form of cover such as images, audio, video, etc. However, in the case of covert channel communication, not only is the information hidden in some form of cover, but the communication channel itself is also hidden in some way. The counterargument is also present in the literature, arguing that steganography is simply a special case of covert channel communication. In [8], Patel et al. claim that steganography is an example of covert channel communication. Also, in [11], Zander et al. discuss the similarities of covert channel communication and steganography.
Both concepts require some form of cover to hide information and to carry the information to its destination. The debate as to whether there is a relationship between steganography and covert channel communication is ongoing; however, we intend to show that a relationship between the two concepts does in fact exist.

The debate regarding the relationship between covert channel communication and steganography is an indication that there is no concrete understanding of covert channel communication in general. Although there are mathematical models for specific covert channels such as the Z-channel [12], as far as we know, there is no global model that gives a general view of the morphology of covert channels. The literature shows that there are differing views on what constitutes a covert channel. Therefore, there is a need for a better understanding of covert channel communication and of the implications of the existence of covert channels as a security concern.

The objective of this paper is to propose a graphical model that helps to better understand covert channel communication. We aim to develop a model to perceive covert channel communication and to explore the relationship between covert channel communication, steganography and watermarking. We think that it would help to develop a new and better understanding of covert channels, which is a prerequisite for developing effective and efficient mechanisms for detecting and preventing the use of covert channels to leak confidential information in computer systems.

In Section II, we survey the literature, discuss ways to establish covert communication channels, and look at existing techniques for mitigating covert channel use. In Section III, we construct a model to explore how convoluted covert channels can be and to illustrate examples of covert channels. In Section IV, we present a clear articulation of our model and discuss its impact and application in the development of more effective mechanisms for covert channel detection and prevention. Finally, we conclude and highlight current and future research.

II. SURVEY OF THE LITERATURE

In this section, we survey the literature to discuss the large body of existing work in the area of covert channels. We look at how covert channels can be established through common protocols. We also discuss the existing techniques for mitigating the use of covert channels.

A. Establishing Covert Channels

In this subsection, we aim to provide some insight as to how covert channels can be created and used to smuggle confidential information into, or out of, an organization. We conjecture that understanding how covert channels are established and used to secretly communicate information is a significant step in developing a model to better understand covert channel communication as a whole. We give a brief, non-exhaustive summary of representative techniques to establish covert channels over several common protocols. This survey will help us in introducing our model in Section III, where we use the covert channelling techniques presented here to support the premise of the proposed model.

Covert channelling techniques which employ network protocols as covert message carriers are very popular and are among the easiest to use. The Internet Protocol (IP) is the most commonly used protocol in the network layer today. It is used to move all traffic across the Internet. IP is widely distributed across the globe, providing both public and private networking connectivity over packet-switching networks [13]. IP is an interesting carrier for covert channels as data can be hidden in the header of an IP datagram. For example, in [14], it is shown that since the value for the IP Identification field should be chosen at random, it is possible to choose a non-random value for the IP Identification field without interrupting the IP mechanism. By using a non-random value for the IP Identification field, 16 bits of covert data can be sent to any other networked system.

The Transmission Control Protocol (TCP) is one of the core protocols of the Internet Protocol Suite. TCP is a transport-layer protocol that allows for reliable data transmission. TCP is a suitable carrier for covert channels since data can be hidden in the TCP segment header. One way to exploit TCP in order to establish a covert channel is to use the SYN/ACK bounce technique. Bouncing techniques take advantage of third party equipment that relays information between the source and destination. Furthermore, the bouncing host is not necessarily aware that it is being used in such a way. According to [5], the principle of bouncing can be applied to synchronization (SYN) segments and acknowledgement (ACK) segments under the TCP mechanism. A TCP connection is established by a three-way handshake as described in [15], where:
1) The client transmits a segment with the SYN bit but not the ACK bit set. The segment contains the client’s initial sequence number x.
2) The server acknowledges the SYN segment by transmitting a segment with both the SYN and ACK bits set. The segment contains the server’s initial sequence number y. The segment’s acknowledgment number is x + 1.
3) The client acknowledges the SYN/ACK segment by transmitting a segment with the ACK bit but not the SYN bit set. The segment’s acknowledgment value is y + 1.
According to [5], because of this standard behaviour, it is possible to transmit information by using any server and spoofing the source IP address so that it points to the intended receiver on the other end of the covert channel.
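The SYN/ACK bounce can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: no packets are sent, the `Segment` class and the three helper functions are hypothetical names introduced for illustration, and the covert value is assumed to ride in the initial sequence number x, which the description above implies but does not state explicitly.

```python
from dataclasses import dataclass

MOD = 2 ** 32  # TCP sequence numbers are 32-bit values


@dataclass
class Segment:
    """Hypothetical, simplified view of a TCP segment (illustration only)."""
    src: str
    dst: str
    syn: bool
    ack: bool
    seq: int
    ack_num: int = 0


def forge_syn(covert_value: int, receiver: str, bounce: str) -> Segment:
    # Sender: the source address is spoofed to be the receiver's,
    # and the initial sequence number x carries the covert value.
    return Segment(src=receiver, dst=bounce, syn=True, ack=False,
                   seq=covert_value % MOD)


def bounce_reply(syn: Segment, server_isn: int) -> Segment:
    # Bounce host: ordinary step 2 of the handshake, addressed to the
    # (spoofed) source, with acknowledgment number x + 1.
    return Segment(src=syn.dst, dst=syn.src, syn=True, ack=True,
                   seq=server_isn % MOD, ack_num=(syn.seq + 1) % MOD)


def recover(syn_ack: Segment) -> int:
    # Receiver: the acknowledgment number is x + 1, so subtract one.
    return (syn_ack.ack_num - 1) % MOD


message = b"hi!!"                      # 4 bytes fit in one 32-bit number
x = int.from_bytes(message, "big")
reply = bounce_reply(forge_syn(x, receiver="10.0.0.2", bounce="192.0.2.80"),
                     server_isn=12345)
recovered = recover(reply).to_bytes(4, "big")
```

The bounce host only performs the standard handshake step; the information reaches the receiver because the reply is addressed to the spoofed source.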
In [14], Smeets and Koot also describe how data can be hidden in the header of a TCP segment.

The Internet Control Message Protocol (ICMP) is a network-layer protocol that is used for generating error, test, and informational messages related to IP-based communication on the Internet. The availability of ICMP is critical for the diagnosis of network problems and for regular IP-based networking. The ICMP Error Bounce is another bouncing technique for establishing a covert channel. For the ICMP destination/port unreachable, time exceeded, and redirect error message types, it is specified in [15] that the data field of the packet must contain the IP header as well as the first 48 bytes of data from the packet that generated the error. According to [5], if two agents, A and B, wish to establish a covert channel by bouncing off of an agent C, the channel can be established by the following protocol:
1) A sends a packet that will generate an error. The source address of this packet is spoofed to be that of B, and the covert data to be transmitted can be either in the IP header or in the first 48 bytes of the IP data.
2) C identifies the error and sends the appropriate ICMP message to the source address, which is now the address of B.
3) B receives the packet and gets the data.

Data can also be hidden in ICMP echo request/reply messages. The field of interest is the Optional Data field, which is listed as optional because it is a variable length field that contains data to be returned to the sender. Since this field is variable in length, [16] suggests that it can be used to conceal information other than the data that ICMP would normally return to the sender.

The Hypertext Transfer Protocol (HTTP) is an application-layer protocol used to transfer information across the Internet. HTTP is a formidable choice for a covert message carrier in that, for many organizations, HTTP is the only protocol that is allowed from the internal network to the Internet since it is required for web browsing [17]. HTTP request messages may contain multiple headers, and it is shown in [14] that, along with the common header lines in an HTTP request, covert information may be sent via HTTP request messages by including arbitrary headers. Similar to HTTP request messages, HTTP response messages may contain multiple headers. As such, in the same way that HTTP request headers can be used to transmit covert information, HTTP response headers can be used. In [14], Smeets and Koot state that combining an HTTP request message covert channel with an HTTP response message covert channel allows for synchronous communication over the covert channel.

The Domain Name System (DNS) is an application-layer protocol that is used for storing and querying information about domain names in distributed databases. Rather than using 32-bit integer IP addresses to identify machines, DNS allows for the identification of machines by pronounceable, easily remembered names.
According to [14], the fields of a DNS message header can be used to transmit covert information if an agent can gain control over a rogue DNS server. It is important to note that if an agent spoofs any of the Number of Questions, Number of Answers, Number of Authority, or Number of Additional fields, then the corresponding sections must match the given number so as not to raise any suspicion of tampering.

The File Transfer Protocol (FTP) is a commonly used network protocol that allows for the exchange and manipulation of files over a TCP/IP based network. The client software for FTP integrates a function guaranteeing that at least one command will be sent in a fixed period of time. This function is referred to as the idle prevention scheme [16]. Covert channels based on FTP exploit the use of this idle prevention scheme. A covert channel can be established by using a special sequence of the NOOP, ABOR, and ALLO commands to encode covert information. For example, in [16], Zou et al. describe a covert channel that utilizes the NOOP, ABOR, and ALLO command sequence to perform a hierarchical encoding. We can divide N bits of covert information into two parts: the high order M bits and the low order N − M bits. The high order bits are encoded with the number of ALLO commands sent in the command sequence and the low order bits are encoded with the number of NOOP commands sent in the command sequence. The ABOR commands in the command sequence mark the beginning of every N bits of information. According to [16], on average 16 commands are required to transmit 8 bits of covert information.

Although covert channelling techniques exploiting network protocols are common, it is also possible to use other forms of cover such as document formats and host platform data structures used in cloud computing. It is also possible to use the arrival time of packets to encode information.

The Portable Document Format (PDF) is very popular nowadays. In [18], Lee and Tsai describe a technique for using PDF as a carrier for covert information by viewing a message as a string of bits or characters, encoded in binary by a special ASCII code embedded at the between-word locations in the text of a PDF file. The study shows that the ASCII code A0, which is used as a non-breaking space, becomes invisible in the windows of several popular PDF readers, including Adobe Reader, when embedded in a string of characters. It is suggested that there exist two ASCII codes which appear as exactly the same white space, those codes being A0 and 20. Therefore, A0 and 20 can be used interchangeably as a between-word space to encode a message bit b according to the following encoding technique:
• if b = 1, then replace 20 between two words by A0
• if b = 0, then make no change
This is just one of likely many ways to employ PDF as a carrier for covert information. Covert channels using PDF are often viewed as steganographic in nature since data is embedded into a form of known cover, i.e., a document which can be viewed as a two-dimensional array of characters.
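The between-word encoding rule above can be sketched in a few lines. This is a minimal illustration that operates on a plain Python string rather than on a real PDF file; the function names `embed` and `extract` are hypothetical, and the cover text is assumed to contain no A0 characters of its own.

```python
SPACE = "\x20"  # ordinary between-word space (code 20)
NBSP = "\xa0"   # code A0, which looks like the same white space


def embed(cover: str, bits: str) -> str:
    """Hide one bit per between-word space: 1 -> A0, 0 -> leave 20 unchanged."""
    capacity = cover.count(SPACE)
    if len(bits) > capacity:
        raise ValueError(f"cover holds {capacity} bits, message needs {len(bits)}")
    # Spaces beyond the end of the message are left as 20, i.e. read back as 0.
    remaining = iter(bits.ljust(capacity, "0"))
    return "".join(
        (NBSP if next(remaining) == "1" else SPACE) if ch == SPACE else ch
        for ch in cover
    )


def extract(stego: str) -> str:
    """Read every between-word space back as a bit."""
    return "".join("1" if ch == NBSP else "0"
                   for ch in stego if ch in (SPACE, NBSP))


cover = "the quick brown fox jumps"   # four between-word spaces
stego = embed(cover, "101")
bits_read = extract(stego)            # includes one padding bit for the unused space
```

Because unused spaces decode as 0, the two parties would also need to agree on the message length or on a terminator; that agreement is outside this sketch.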
Covert channels can also be established using host platform data structures. For example, Xen is one of the most popular virtualization systems in the shared hosting world. In [19], Salaün describes how the shared pseudo-physical memory table (containing the addresses of other guests) can be used to put data in place of an address. In order to achieve this, it is assumed that the guests initially know of each other and that it is possible to create a “chat room” between accomplice guests.

In [20], Cabuk et al. discuss a type of timing channel which they call a network covert channel. A network covert channel operates by altering the timing of otherwise legitimate network traffic so that the arrival times of packets encode confidential information. There are two possible operations for this type of covert channel. The first is the encoding of information using a constant time interval, t. If a packet arrives in the time interval t, then we have a 1; otherwise, if no packets arrive in t, then we have a 0. The second is the encoding of information using varying time intervals, such that each interval encodes some information, i.e., (t1, 0), (t2, 1), etc.
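The first operation (a constant time interval t) can be sketched as follows. This is a simplified illustration, not the scheme of [20] itself: the function names are hypothetical, times are in milliseconds, and the sender and receiver are assumed to share a start time, the interval length, and the number of bits.

```python
def schedule(bits: str, t_ms: float) -> list[float]:
    """Sender: emit one packet in the middle of every interval that encodes a 1."""
    return [(i + 0.5) * t_ms for i, bit in enumerate(bits) if bit == "1"]


def decode(arrivals_ms: list[float], t_ms: float, n_bits: int) -> str:
    """Receiver: an interval containing at least one arrival is a 1, otherwise a 0."""
    occupied = {int(arrival // t_ms) for arrival in arrivals_ms}
    return "".join("1" if i in occupied else "0" for i in range(n_bits))


secret = "10110"
send_times = schedule(secret, t_ms=200)          # packets at 100, 500 and 700 ms
# Moderate network jitter keeps each packet inside its interval.
jittered = [120.0, 480.0, 760.0]
received = decode(jittered, t_ms=200, n_bits=len(secret))
```

Sending in the middle of the interval is a design choice of this sketch: it leaves the largest margin for delay variation before a packet slips into a neighbouring interval.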

In [12], Moskowitz et al. discuss the capacity of the timed Z-channel. They propose a mathematical model for only this type of channel in order to establish its capacity (bits per time unit).

In understanding how covert channels, in general, can be established, we gain a better understanding of how we can model covert channel communication and develop investigative support for confidentiality.

B. Mitigating Covert Channels

The techniques used to detect covert channels can give us an idea about inherent features of covert channels in general. These features should be captured in a useful model for covert channel communication. When it comes to eliminating the use of covert channels in computer systems, a variety of approaches have been proposed. Some approaches look at detecting the use of covert channels and some look at preventing it, while very few aim to recover from the effects of covert channel use. Each approach has its own strengths and weaknesses, and some are more applicable in real world scenarios than others. In this section, we discuss and assess the strengths and weaknesses of existing approaches to covert channel elimination, and provide insight as to why there is a need for a better understanding of covert channel communication.

In [21], Nagatou and Watanabe present a technique for detecting the use of covert channels at run time. Security policies are enforced through flow control and access control mechanisms. The flow control mechanism compares the result of each system call on a system resource with the result produced by an emulator. If the results are different, then it is considered that a covert channel occurred in the system and the monitor terminates the process that invoked the offending system call. This technique is only able to enforce non-interference and non-inference policies.
In [22], Kemmerer describes a technique for detecting the use of covert channels in computer systems based on shared resources called the Shared Resource Matrix (SRM). The motivation for the SRM technique lies within the knowledge that the use of covert channels requires the collusion between an agent with the authorization to signal or leak information to an unauthorized agent and that the authorization is granted on system objects which may include file locks, device busy flags, the passing of time, etc. A matrix is constructed where the attributes of all shared resources are indicated in the row headings and the operation primitives, (i.e., Write File, Read File, Lock File, etc.), are indicated in the column headings. After all of the row and column headings are determined, one must determine for each attribute (each row) whether the primitive indicated by the column heading modifies and/or references that attribute. This is done by carefully reviewing the description for each of the primitives, whether it is stated in natural language, formal specification, or implementation code. The generated matrix is then used to determine whether any covert channels exist.
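A toy version of such a matrix may make the construction more concrete. This is only a sketch: the attributes, the primitives beyond Write File, Read File and Lock File, and the `candidate_attributes` helper are hypothetical, and the check shown is a simplified screening step (an attribute that some primitive modifies and some primitive references), not Kemmerer's full analysis.

```python
# Rows: attributes of shared resources. Columns: operation primitives.
# "R" = the primitive references the attribute, "M" = it modifies it.
shared_resource_matrix = {
    "file.lock_state": {"lock_file": "RM", "unlock_file": "RM", "open_file": "R"},
    "file.contents": {"write_file": "M", "read_file": "R"},
    "file.owner_id": {"open_file": "R"},
}


def candidate_attributes(srm):
    """Keep attributes that at least one primitive modifies and one references."""
    candidates = {}
    for attribute, row in srm.items():
        modifiers = sorted(p for p, flags in row.items() if "M" in flags)
        readers = sorted(p for p, flags in row.items() if "R" in flags)
        if modifiers and readers:
            candidates[attribute] = (modifiers, readers)
    return candidates


candidates = candidate_attributes(shared_resource_matrix)
for attribute, (modifiers, readers) in candidates.items():
    print(f"{attribute}: modified by {modifiers}, referenced by {readers}")
```

Here `file.owner_id` drops out because no primitive modifies it, while the lock state and the file contents remain as attributes an analyst would examine further.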

Kemmerer provides the following minimum criteria which must be satisfied in order to have a covert channel:
1) The sending and receiving agents must have access to the same attribute of a shared resource.
2) There must be some means by which the sending agent can force the shared attribute to change.
3) There must be some means by which the receiving agent can detect the attribute change.
4) There must be some mechanism for initiating the communication between the sending and receiving processes and for sequencing the events correctly.
If all of these criteria are satisfied, then a covert channel exists. The advantages of the SRM technique include the ability to quickly discard attributes that do not meet the preliminary criteria of being modified or referenced by an agent and the ability to provide a graphical design for developers in all stages of software design. However, the SRM technique is quite tedious and somewhat ad hoc in that the analyst must work out scenarios in which the criteria might be satisfied.

Another technique for detecting covert channels in computer systems is Covert Flow Trees (CFTs). Presented by Kemmerer and Porras in [23] and [24], CFTs attempt to identify operation sequences that support either the direct or indirect ability of an agent to detect when an attribute has been modified. This means that CFTs aid in recognizing when system attributes have been changed in some way by a sequence of operations. Covert Flow Trees can be constructed automatically using the algorithm described in [23]. Once the CFT is constructed, the tree can be traversed to develop all possible operation sequences of the system. These operation sequences can then be analyzed by developing hypothetical agents and system states that could use the operation sequences for covert communication. The analyst may assume that the sender and receiver share some mechanism whereby they can synchronize communication.
CFTs are able to generate a comprehensive list of scenarios that could potentially support covert communication. The downfall of CFTs lies in the size of the trees that are generated and the scalability of the approach.

Hélouët et al., in [25] and [7], propose a method for detecting potential covert channels using scenarios. The use of scenarios has several advantages. Scenarios are often the first information one can obtain about a system’s behaviour since they are used to describe system requirements. Several recommendations [2], [3] ask to document the use of covert channels with such models. The idea is that, from a scenario description of a system, a covert channel is modelled as a game where a pair of corrupted users, sender and receiver (S, R), try to send information while the rest of the protocol attempts to prevent the information from being communicated. This scenario-based approach only reveals “potential covert channels”, the existence of which needs to be tested on a real implementation of the protocol.

According to [26], a system is separable (i.e., multilevel secure) if and only if it is behaviourally equivalent to a collection of single level systems that do not interact. In [26], Browne presents an approach called Mode Security. The idea is to organize the state transitions of a multilevel state machine into distinct sets called modes. The aim is to create a separable system. In essence, each machine mode is considered totally secure when considered in isolation from all other modes. This means that covert channels can only occur when the machine makes a transition from one mode to another. Therefore, by reducing the number of mode transitions in the system, one can reduce the number of potential covert channels in the system.

Similarly, in [27], Jacob proposes a technique for detecting covert channels where the idea is to begin by making a list of all channels in a system. From this list, a new system is produced by “cutting” known channels from the system. This new system is checked for separability. If the new system is separable, then there are no covert channels; otherwise, at least one covert channel exists. The downfall of Jacob’s technique is that it does not detect covert channels that are completely dependent on known channels. The major concern with approaches for mitigating covert channels based on separability is that these approaches are not universally applicable to all systems.

In [28], Andrews and Reitman provide an axiomatic definition for information flow in sequential programs, with particular emphasis on proof rules for programs containing assignment, alternation, iteration, composition, and procedure calls. The definition provided closely resembles Hoare’s deductive system for functional correctness found in [29]. The axiomatic approach of Andrews and Reitman analyzes programs, looking for information flows which violate the security policy of the system. A similar approach was taken by Sabri et al. in [30], where an amended version of Hoare logic was used to verify the satisfiability of security policies in communication protocols. Since the confinement notion introduced by Lampson in [1], more and more approaches to detect illegal information flows have been proposed.
A short while after Lampson, in [31], Goguen and Meseguer defined the existence of covert channels through non-interference properties. Numerous approaches to non-interference have been proposed. For example, in [32], Volpano and Smith describe non-interference through typing, where a system contains interference if it cannot be correctly typed, and in [33], Lowe describes non-interference using process algebra. The notion of non-interference is questioned in [34] since the transfer of a single bit of information causes a non-interference violation. According to [7], it is often the case that non-interference approaches attempt to classify data and processes of a system according to two security levels: high and low. However, there may be more than two security levels, which leads to a fundamental restriction on the use of non-interference properties to define the existence of covert channels in a system.

A wide variety of prevention schemes for the use of covert channels in computer systems have been proposed. One such approach is through bandwidth or capacity analysis. In [33], [12], and [35], mechanisms for computing the bandwidth of covert channels in computer systems are presented, where the idea is that if the bandwidth of a covert channel can be reduced to a reasonably small rate, then the channel is rendered unusable as a means of effectively transferring information. The guidelines outlined in [2] and [3] state that covert channels with bandwidths of less than one bit per second are usually considered acceptable, while a bandwidth of more than 100 bits per second is considered unacceptable.

One such method, developed by the United States Naval Research Laboratory, is called the Pump. Described in [36] and [37], the Pump lets information pass from a low level system to one at a higher level. The motivation comes from the idea that acknowledgements are required for reliable communication. If a higher level system passed acknowledgements directly to a lower level system, then the higher level system could pass high information by altering acknowledgement delays. In order to minimize such a covert channel, the Pump decouples the acknowledgement stream by inserting random delays. With the overall performance of the system in mind, the Pump uses statistical averages to compute the delay time which it inserts into the communication stream. It is admitted in [37] that this method cannot handle a large state space, which proves to be its major flaw.

A number of additional prevention schemes take probabilistic approaches to covert channel mitigation. For example, in [6], Grusho et al. assume that, for a secure transmission, covert channels will exploit a manipulation of the probability distribution parameters of the sent message sequence.

Although many techniques already exist which aid in the fight against covert channels, there seems to be no single technique which can handle any type of covert channel in any type of system. We aim to construct a model that can encompass covert channel communication in general.

III. CONSTRUCTION OF THE MODEL

We begin with the observation that, using any of the covert channelling techniques outlined in Section II, the information or markers¹ are hidden in a data structure of some kind. At the receiver end, the pieces of information transmitted through the covert channel are put together to build the confidential information that has been leaked. In timing channels, the markers sent are used to determine the time between relevant packets. Then, these time intervals are used to build the confidential information. For instance, the techniques which employ the DNS protocol use the DNS message header format in order to hide information, and the techniques which use the FTP protocol utilize a data structure constructed from the usages of FTP commands to hide information.

¹ A marker is not confidential information itself, but it helps to obtain the confidential information.

Suppose that a user wishes to use a covert channel in order to leak confidential information to an accomplice. The user and the accomplice agree to use one of the techniques given in Section II to establish the covert communication channel. For example, they agree on a timing channel using the IP protocol where relevant datagrams are marked by embedding information into the IP Identification field. The relevant datagrams are used to encode confidential information based on constant timing intervals. If a marked datagram arrives in the interval, a 1 is encoded; otherwise, a 0 is encoded. Using this example as a guide, we can begin to construct our model for covert channel communication.

The user and the accomplice have chosen to exploit the IP protocol and the IP Identification field to mark relevant datagrams. Now, the IP header format can be viewed as a two-dimensional data structure such as a table or matrix, as shown in Figure 1. The user is interested in using the IP Identification field to embed the confidential information, so in terms of the covert channel, we are actually only concerned with a single field of the data structure, which is represented in Figure 1 as the grey area.
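To make the data-structure view concrete, the sketch below packs a 20-byte IPv4 header (no options) and reads back the one field that matters to the covert channel. It is an illustration under stated assumptions: the helper names, addresses, and marker value are hypothetical, the checksum is left at zero, and nothing is transmitted.

```python
import socket
import struct

# Version/IHL, TOS, total length, Identification, flags/fragment offset,
# TTL, protocol, header checksum, source address, destination address.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def build_header(identification: int, src: str, dst: str) -> bytes:
    """Pack a header whose Identification field carries a 16-bit marker."""
    version_ihl = (4 << 4) | 5          # IPv4, header length of 5 32-bit words
    return IPV4_HEADER.pack(
        version_ihl, 0, IPV4_HEADER.size,
        identification & 0xFFFF,        # the single field of interest
        0, 64, 17,
        0,                              # checksum left at zero in this sketch
        socket.inet_aton(src), socket.inet_aton(dst),
    )


def read_identification(header: bytes) -> int:
    """Extract only the Identification field, ignoring the rest of the header."""
    return IPV4_HEADER.unpack(header[:IPV4_HEADER.size])[3]


MARKER = 0xC0DE                          # hypothetical agreed-upon marker
header = build_header(MARKER, "10.0.0.1", "10.0.0.2")
is_marked = read_identification(header) == MARKER
```

Everything in the header other than bytes 4 and 5 is irrelevant to the receiver in this example, which mirrors the idea of a single highlighted field within the larger structure.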

Fig. 1. A view of the data structure representing the data that is being transmitted in a covert channel.

The user must embed the marking information into a series of IP datagrams. In order to obtain the confidential information on the side of the accomplice, another dimension must be incorporated into the model; that dimension is time. Since the agreed upon covert channel scheme involves constant timing intervals, the arrival time of each marked packet is very important. Time encompasses all of the transmitted data structures. We must add the dimension of time in order to model the fact that the information that is sent in the communication is sent as a stream of discrete data packets over time. With the addition of the dimension of time, our model now expands to a three-dimensional construction as shown in Figure 2. We can view what is shown in Figure 2 as a stream of data.

The user and the accomplice are now sending IP datagrams over a period of time. However, an important part of the model is still missing. How is the information getting from the user to the accomplice? We must add the dimension of the communication medium (i.e., channel) to the model. The communication channel is the means by which the information can flow from the user to the accomplice. Now, in order to view the communication channel as the fourth dimension of the model, we can view the stream of data from Figure 2 as being encapsulated inside a communication channel as shown in Figure 3. In the case of our example, the communication channel for the user and the accomplice would be viewed as the IP mechanism whereby the IP protocol specifies how the IP datagrams are transmitted from the source IP address (the user) to the destination IP address (the accomplice).

Fig. 2. A view of a stream of data (a series of data structures sent over time).

Fig. 3. A view of a stream of data being transmitted in a communication channel.

Our model can now be used to represent a single covert channel. However, we wish for our model to be more versatile: we want to be able to represent multiple covert channels and to highlight how convoluted covert channels can be in computer and information systems. Suppose that the user and the accomplice are trying to make their communication as covert as possible. Instead of sending the confidential information using only the IP Identification field technique, suppose that they agree to establish multiple covert channels, with the user sending marked packets using a combination of the covert channelling techniques from Section II. We can see this as an extension of the channel dimension of the model. In a system containing covert channels, it is quite possible that multiple channels are involved, each transmitting its own stream of data, which may be entirely different from the data transmitted on any other channel. According to a pre-established scheme, the arrangement of all the data sent on all the channels gives the confidential information. This idea is illustrated in Figure 4.

At this point, we have a complete perception of covert channel communication. Our model (Figure 4) represents multiple communication channels, each transmitting a stream of data, which in turn is a series of data structures sent over a period of time. This yields a four-dimensional model for covert channel communication. We will see in the next section how our model applies to the covert channelling techniques presented in Section II, as well as to steganography and watermarking.

Fig. 4. A model for covert channel communication.

IV. ARTICULATION OF THE MODEL

The model constructed in Section III was built using a specific example. We will see in this section, however, that our model is applicable to general cases of covert channel communication as well. In the example given in Section III, the user and the accomplice were exploiting the IP protocol and the IP Identification field of the IP datagram header. Our model also applies when another protocol and/or header field is used. The header format that is used can simply be abstracted as a data structure of some dimension, and the field that is used can be represented as an element of that data structure. The data structure is not restricted to two dimensions: we can have data structures of n dimensions, and the model would still be able to accurately represent a stream of data being transmitted over the communication channels. Usually, through marshalling data before sending it on a channel and unmarshalling it at the channel destination, we are concretely sending a stream of bytes. However, abstractly and without taking the marshalling into account, we can see that we are sending complex data structures of dimension n.

The dimensions of time and channel remain applicable regardless of the technique employed by the user and the accomplices; the model for covert channel communication requires both. Hence, for any data structure of n dimensions, we have an (n + 2)-dimensional model: the more complex the data being transmitted, the more complex the model becomes. This is a testament to the overall complexity of the issue of covert channels in computer systems and the difficulty in finding suitable detection and prevention mechanisms. The model may be simplified when a single communication channel is used or when only a single data structure is transmitted.

Having the ability to account for the complexity of covert channels allows for a model which can lead to a better understanding of covert channel communication and its implications as a threat to computer and information security. In this context, complexity is a measure of the dimensionality of the model. Based on the number of dimensions represented by the model, we can attain an understanding of how each varying aspect of covert channel communication can simplify or further complicate the covert communication mechanism. For example, if a covert communication uses only one communication channel to convey its message, then we can view it as less complex than a covert communication using many communication channels. This is due to the simplification of the dimension representing the communication medium, which is the difference between Figure 3 and Figure 4.
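To make the (n + 2)-dimensional view concrete, the model can be sketched as a collection of observations, each recording a channel, a time, and a transmitted data structure. The following Python sketch is only one possible encoding and rests on assumptions not found in the paper: the names `Observation`, `structure_dimension`, and `model_dimension` are hypothetical, and the nesting depth of Python lists is used as a stand-in for the dimension n of a data structure.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Observation:
    channel: str    # channel dimension
    time: float     # time dimension
    structure: Any  # transmitted data structure of some dimension n


def structure_dimension(structure: Any) -> int:
    """Nesting depth of a list-based structure (a scalar has dimension 0)."""
    depth = 0
    while isinstance(structure, list):
        depth += 1
        structure = structure[0] if structure else None
    return depth


def model_dimension(observations: List[Observation]) -> int:
    """n + 2: the largest structure dimension plus the time and channel dimensions."""
    n = max((structure_dimension(o.structure) for o in observations), default=0)
    return n + 2


# A header viewed as a two-dimensional table (values are made up).
header_table = [[4, 5, 0, 40], [0xBEEF, 0, 0, 0]]
stream = [Observation("ip", 0.0, header_table), Observation("ip", 1.0, header_table)]
print(model_dimension(stream))  # 4
```

With a two-dimensional header table the sketch yields 4, matching the four-dimensional model of Section III.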
The proposed model allows for the perception of covert channel communication. A clear perception of a problem often leads to a better understanding of it. A better understanding of covert channel communication enables us to use our model to represent a variety of covert channel usages in computer and information systems. For example, the proposed model is able to represent simple cases of covert channel use, such as the use of a single communication channel, as well as more complex cases, such as one where confidential information is leaked in parts over many communication channels. It captures the ability to combine two covert channels, say Channel n and Channel n + 1.

The proposed model is also able to account for the perception of timing channels. In Section II-A, we saw a type of covert timing channel called network covert channels. With the proposed model, we are able to represent each of the encoding schemes for network covert channels, as shown in Figure 5. Since our model contains a dimension of time, we can view the arrival of each packet in the series as falling in one of the intervals $t_1, \ldots, t_n$. For the first operation scheme for a network covert channel, we simply assume $t_1 = t_2 = \cdots = t_n$. Any data structure that is transmitted in an interval can be represented as a 1 according to the given scheme. For the second operation scheme, we have the intervals, and we can map any data structure sent in the time intervals according to the given scheme.

Fig. 5. A model for timing channels.

We can also extend the idea of timing channels using our model. Consider a scheme where two agents use marked packets to indicate the packets for which the timing information is important. Using our model, we can view the information embedded in the data structure as one level of encoding, based on the contents of the data structure, and the timing information as a second level of encoding. Our model is able to capture this idea, since we are able to perceive the data structures as containing some kind of information and as being transmitted at some point in time. Therefore, our model is able to represent both levels of encoding. We can also come up with a more complex timing channel, such as one where packets are transmitted on different communication channels. The packets may be sent at different times on each channel, and the scheme may be such that when the packet arrival times on each communication channel coincide (± δ time), the information embedded in the packets is relevant to the scheme for building up information. This is illustrated in Figure 6.
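The coincidence scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `coincident_payloads`, the representation of a packet as an (arrival time, embedded information) pair, and the sample values are all hypothetical, and only two channels are considered.

```python
from typing import List, Tuple

Packet = Tuple[float, str]  # (arrival time, embedded information)


def coincident_payloads(channel_1: List[Packet], channel_2: List[Packet],
                        delta: float) -> List[Tuple[str, str]]:
    """Pair the payloads of packets whose arrival times on the two
    channels differ by at most delta; all other packets are ignored."""
    relevant = []
    for t1, info1 in channel_1:
        for t2, info2 in channel_2:
            if abs(t1 - t2) <= delta:
                relevant.append((info1, info2))
    return relevant


# Made-up arrival times: only the packets near t = 4 and t = 9 coincide.
ch1 = [(1.0, "a"), (4.0, "b"), (9.0, "c")]
ch2 = [(2.5, "x"), (4.1, "y"), (9.05, "z")]
print(coincident_payloads(ch1, ch2, delta=0.2))  # [('b', 'y'), ('c', 'z')]
```

The nested loop is quadratic in the number of packets, which is acceptable for a sketch; sorted arrival lists would allow a linear merge instead.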

Fig. 6. A covert channel scheme where the coincidence of packet arrival times indicates embedded relevant information. [Plot: packets on Channel #1 and Channel #2 against arrival time, 1 to 15.]

The proposed model is able to perceive the covert channel utilizing the scheme outlined in Figure 6. Again, since our model is able to account for the information that is embedded into each transmitted data structure, the arrival time of each transmitted data structure, and the use of multiple communication channels, we are able to perceive the operation of this type of covert channel.

Oftentimes, the damage of covert timing channels is measured in terms of capacity. According to [38], capacity is the maximum rate at which a communication channel can reliably transmit information. Capacity is often measured in bits per unit of time. Although our model does not offer empirical calculations of capacity, it does offer intuitive information regarding channel capacity. Since our model contains information regarding the data structures being transmitted, we are able to obtain an intuition of the number of bits that can be transmitted per data structure, and since we have information regarding the arrival time of each transmitted data structure, we can view the number of bits per unit time, which is the channel capacity. The proposed model is able to account for many different types of covert channels and covert channel concepts. This is a testament to its versatility and expressiveness for graphically representing covert channel communication.

Furthermore, we can use the proposed model to generate a perception of steganography. In steganography, information is embedded into some form of cover, e.g., images, audio, or video. It is clear that we can view the form of cover as a data structure of some dimension. For example, an image containing hidden information is simply a two-dimensional data structure, more specifically a pixel matrix. The image needs to be transmitted to its destination in some way, meaning that it would need to be sent at some point in time, or at a series of points in time, and through one or more communication channels. In its simplest form, the model for steganography as a form of covert channel communication would appear as in Figure 7, where we have a streaming image containing hidden information being sent over some type of steganographic communication channel.

Fig. 7. A model for steganography as a form of covert channel communication.

If we take an acrostic, which is a widely used method of linguistic steganography, we can see it as a data structure that is a list of strings. For example, in the following acrostic, we hide the word SKY.

So nice and blue,
Keeps the stars from falling below,
You should seek its guidance on where to go.

The acrostic is a list of verses, each of which starts with a letter from the hidden message SKY. One can think of putting the letters of the hidden word in the second position or any other agreed-upon position. Our data is then of two dimensions. The message can also be hidden in several acrostics written at different times, which adds a time dimension. Furthermore, several poets can collude to send a message, which adds the channel dimension to the problem. Many current steganographic systems put the elements of the hidden message at random positions (already agreed upon by the sender and the receiver). One can think about doing the same on the time dimension and on the channel dimension as well, which further obscures the message’s trail.

As presented in the introduction, some argue that steganography is different from covert channels because covert channel communication uses unknown channels. According to the proposed model, a message can be sent on several channels. In this case, it is hard for an information leakage monitor to be aware of all the channels used. Therefore, we can have a covert channel while the monitor is not aware of all the channels used. Hence, using the monitor’s awareness of the channel(s) used as the criterion for distinguishing between steganography and covert channels is untenable.

These ideas can be transposed to watermarking. Although the object of watermarking is not the secrecy of the hidden message but its resistance to removal by a pirate, one can see the usefulness of this perspective of multidimensional information hiding in increasing the robustness of the watermarking by making the watermark less detectable. For instance, in a video, one can distribute the watermark message over several frames according to a function of time. The ability to represent steganography as a form of covert channel communication yields an inherent relationship between the two concepts.
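Returning to the acrostic example, the view of an acrostic as a list of strings with an agreed-upon letter position can be sketched in Python as follows. The function name `read_acrostic` and its `position` parameter are hypothetical, and counting only alphabetic characters when locating the position is an assumption made for this sketch.

```python
from typing import List


def read_acrostic(verses: List[str], position: int = 0) -> str:
    """Collect the letter at the agreed position of each verse,
    counting alphabetic characters only."""
    hidden = []
    for verse in verses:
        letters = [c for c in verse if c.isalpha()]
        if position < len(letters):
            hidden.append(letters[position].upper())
    return "".join(hidden)


verses = [
    "So nice and blue,",
    "Keeps the stars from falling below,",
    "You should seek its guidance on where to go.",
]
print(read_acrostic(verses))  # SKY
```

Spreading the verses over several messages sent at different times, or over several senders, would add the time and channel dimensions discussed above; the extraction step itself would stay the same.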
Since we are able to model both steganography and covert channel communication using the same model, we can argue that steganography is a special case of covert channel communication. The versatility and flexibility of the proposed model allow it to be a powerful representation of covert channel communication in general, which enables building better detection and prevention mechanisms for covert channels in computer systems.

V. CONCLUSION AND FUTURE WORK

In this paper, we present a novel, comprehensive model for covert channel communication. Our main contribution is the construction of such a model to aid in the perception of covert channels, in an attempt to provide a better understanding of covert channel communication and the security concerns surrounding it. The proposed model is about the morphology of covert channels. It helps to think about the structure of covert channels, which helps designers to construct covert channels that are challenging to detect or that have high capacities (for instance, when several limited-capacity channels are used to build a global one). As with all models, it leaves aside some aspects of the modelled reality that are either negligible or whose effect the model is not designed to study directly. For example, although the proposed model gives the designer ideas about how to combine several channels to potentially increase the capacity for sending information, it does not directly help in establishing the upper bound of the capacity of a channel. For that, one would need a model such as the one proposed in [12] to establish the capacity of a timed Z-channel.

We have also presented an argument that steganography can be viewed simply as a specific case of covert channel communication. We described how the proposed model can be used to view steganography, yielding an inherent relationship between covert channels and steganography, since both problems can be represented by the same model. The ability to model a variety of covert channel scenarios gives a simple overall view of covert channel communication, which is not offered by many existing models of specific covert channels. The proposed model is able to graphically represent the morphology of covert channels, allowing for quick understanding and perception of the covert communication mechanisms being used. With this understanding of covert channel communication, we can develop new, more effective and efficient mechanisms for detecting and preventing the use of covert channels to leak confidential information. Indeed, as discussed in the previous section, we extend the idea of timing channels to propose a new scheme for sending information.

In our recent work [39], we used our model for covert channel communication to develop investigative support for confidentiality. This involves looking at covert channel analysis in a digital forensics context [40], [41]. We provide investigative tests using a mathematical framework based on relational algebra in order to detect the leak of confidential information through covert channelling techniques.
In the future, we intend to formalize the model presented in this paper, moving from a graphical representation to a parametrized mathematical one. With a mathematical model, we expect to be able to develop metrics to support the assertions made regarding the complexity and the capacity of covert channel communication. Each of the agents involved in a covert channel has knowledge of the structure of the environment in which the confidential information is leaked. In our earlier work [42], [43], we proposed mathematical models for capturing agents’ knowledge using a mathematical structure called Information Algebra. We intend to explore the use of information algebras [44], [45] to propose a comprehensive mathematical model for covert channels. The work presented in this paper aims at a general understanding of the structure of covert channels that would help us in articulating the detailed mathematical model.

ACKNOWLEDGEMENTS

The authors are thankful to the anonymous reviewers. Their comments helped us considerably improve the quality of the paper.

REFERENCES

[1] B. Lampson, “A note on the confinement problem,” Communications of the ACM, vol. 16, no. 10, pp. 613–615, October 1973.
[2] U. S. A. Department of Defense, Department of Defense Trusted Computer System Evaluation Criteria, ser. Defense Department Rainbow Series. Fort George G. Meade, Maryland: Department of Defense / National Computer Security Center, December 1985, no. DoD 5200.28-STD.
[3] U. S. A. National Computer Security Center, A Guide to Understanding Covert Channel Analysis of Trusted Systems, ser. NSA/NCSC Rainbow Series. Fort George G. Meade, Maryland: Department of Defense / National Computer Security Center, November 1993, no. NCSC-TG-030.
[4] H. Berghel, “Hiding data, forensics, and anti-forensics,” Communications of the ACM, vol. 50, no. 4, pp. 15–20, April 2007.
[5] R. Bidou and F. Raynal, “Covert channels,” November 2005. [Online]. Available: http://www.iv2-technologies.com/ rbidou/CovertChannels.pdf
[6] A. Grusho, A. Kniazev, and E. Timonina, “Detection of illegal information flow,” in Proceedings of the Third International Workshop on Mathematical Methods, Models, and Architectures for Computer Networked Security, MMM-ACNS 2005, ser. Lecture Notes in Computer Science, no. 3685, Berlin, Germany, 2005, pp. 235–244.
[7] L. Hélouët, M. Zeitoun, and A. Degorre, “Scenarios and covert channels: Another game...” Electronic Notes in Theoretical Computer Science, vol. 119, pp. 93–116, 2005.
[8] A. Patel, M. Shah, R. Chandramouli, and K. Subbalakshmi, “Covert channel forensics on the internet: Issues, approaches, and experiences,” International Journal of Network Security, vol. 5, no. 1, pp. 41–50, July 2007.
[9] F. Petitcolas, R. Anderson, and M. Kuhn, “Information hiding - a survey,” Proceedings of the IEEE, vol. 87, no. 7, pp. 1062–1078, July 1999.
[10] B. Sartin, “Anti-forensics - distorting the evidence,” Computer Fraud and Security, vol. 2006, no. 5, pp. 4–6, May 2006.
[11] S. Zander, G. Armitage, and P. Branch, “Covert channels and countermeasures in computer network protocols,” IEEE Communications Magazine, vol. 45, no. 12, pp. 136–142, December 2007.
[12] I. Moskowitz, S. Greenwald, and M. Kang, “An analysis of the timed Z-channel,” IEEE Transactions on Information Theory, vol. 44, no. 7, pp. 3162–3168, November 1998.
[13] Z. Trabelsi and I. Jawhar, “Covert file transfer protocol based on the IP record route option,” Journal of Information Assurance and Security, vol. 5, no. 1, pp. 64–73, 2010.
[14] M. Smeets and M. Koot, “Research report: Covert channels,” Master’s thesis, University of Amsterdam, February 2006.
[15] D. Comer, Internetworking with TCP/IP, 5th ed. Prentice Hall, 2005, vol. 1.
[16] X. Zou, Q. Li, S. Sun, and X. Niu, “The research on information hiding based on command sequence of FTP protocol,” in Proceedings of 9th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, ser. Lecture Notes in Computer Science, vol. 3683. Springer Berlin / Heidelberg, 2005, pp. 1079–1085.
[17] M. Van Horenbeeck, “Deception on the network: Thinking differently about covert channels,” in Proceedings of the 7th Australian Information Warfare and Security Conference, December 2006, pp. 174–184.
[18] I. Lee and W. Tsai, “A new approach to covert communication via PDF files,” Signal Processing, vol. 90, no. 2, pp. 557–565, 2010.
[19] M. Salaün, “Practical overview of a Xen covert channel,” Journal in Computer Virology, pp. 1–12, 2009.
[20] S. Cabuk, C. Brodley, and C. Shields, “IP covert channel detection,” ACM Transactions on Information and Systems Security, vol. 12, no. 4, April 2009.
[21] N. Nagatou and T. Watanabe, “Run-time detection of covert channels,” in Proceedings of the First International Conference on Availability, Reliability and Security, ARES 2006, Vienna, Austria, 2006, pp. 577–584.
[22] R. Kemmerer, “Shared resource matrix methodology: An approach to identifying storage and timing channels,” ACM Transactions on Computer Systems, vol. 1, no. 3, pp. 256–277, August 1983.
[23] R. Kemmerer and P. Porras, “Covert flow trees: A visual approach to analyzing covert storage channels,” IEEE Transactions on Software Engineering, vol. 17, no. 11, pp. 1166–1185, November 1991.
[24] P. Porras and R. Kemmerer, “Covert flow trees: A technique for identifying and analyzing covert storage channels,” in Proceedings of the 1991 IEEE Computer Society Symposium on Research in Security and Privacy, Los Alamitos, CA, USA, 1991, pp. 36–51.
[25] L. Hélouët, M. Zeitoun, and C. Jard, “Covert channels detection in protocols using scenarios,” in Proceedings of Security Protocols Verification, SPV’03, 2003, pp. 21–25.
[26] R. Browne, “Mode security: An infrastructure for covert channel suppression,” in Proceedings of the 1994 IEEE Computer Society Symposium on Research in Security and Privacy, Los Alamitos, CA, USA, 1994, pp. 39–55.
[27] J. Jacob, “Separability and the detection of hidden channels,” Information Processing Letters, vol. 34, no. 1, pp. 27–29, February 1990.
[28] G. Andrews and R. Reitman, “An axiomatic approach to information flow in programs,” ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 2, no. 1, pp. 56–76, January 1980.
[29] C. Hoare, “An axiomatic basis for computer programming,” Communications of the ACM, vol. 12, no. 10, pp. 576–580, October 1969.
[30] K. Sabri, R. Khedri, and J. Jaskolka, “Verification of information flow in agent-based systems,” in Proceedings of the 4th International MCETECH Conference on e-Technologies, ser. Lecture Notes in Business Information Processing, G. Babin, P. Kropf, and M. Weiss, Eds., vol. 26. Springer Berlin / Heidelberg, May 2009, pp. 252–266.
[31] J. Goguen and J. Meseguer, “Security policies and security models,” in Proceedings of the 1982 Symposium on Security and Privacy, New York, NY, USA, 1982, pp. 11–20.
[32] D. Volpano and G. Smith, “Eliminating covert flows with minimum typings,” in Proceedings of the 10th Computer Security Foundations Workshop, Los Alamitos, CA, USA, 1997, pp. 156–168.
[33] G. Lowe, “Quantifying information flow,” in Proceedings of the 15th IEEE Computer Security Foundations Workshop, CSFW-15, Los Alamitos, CA, USA, 2002, pp. 18–31.
[34] P. Ryan, J. McLean, J. Millen, and V. Gligor, “Non-interference: Who needs it?” in Proceedings of the 14th IEEE Workshop on Computer Security Foundations. Washington, DC, USA: IEEE Computer Society, 2001, pp. 237–238.
[35] S. Shieh and A. Chen, “Estimating and measuring covert channel bandwidth in multilevel secure operating systems,” Journal of Information Science and Engineering, vol. 15, no. 1, pp. 91–106, 1999.
[36] M. Kang and I. Moskowitz, “A pump for rapid, reliable, secure communication,” in Proceedings of the 1st ACM Conference on Computer and Communications Security, Fairfax, VA, USA, 1993, pp. 119–129.
[37] R. Lanotte, A. Maggiolo-Schettini, S. Tini, A. Troina, and E. Tronci, “Automatic covert channel analysis of a multilevel secure component,” in Proceedings of the 6th International Conference, ICICS 2004, ser. Lecture Notes in Computer Science, no. 3269, Berlin, Germany, 2004, pp. 249–261.
[38] S. Gianvecchio, H. Wang, D. Wijesekera, and S. Jajodia, “Model-based covert timing channels: Automated modeling and evasion,” in Proceedings of the 11th International Symposium of Recent Advances in Intrusion Detection (RAID 2008), Boston, MA, September 2008, pp. 211–230.
[39] J. Jaskolka, “Modeling, analysis, and detection of information leakage via protocol-based covert channels,” Master’s thesis, McMaster University, 2010.
[40] R. Leigland and A. W. Krings, “A formalization of digital forensics,” International Journal of Digital Evidence, vol. 3, no. 2, Fall 2004.
[41] C. Wonnemann, R. Accorsi, and G. Muller, “On information flow forensics in business application scenarios,” in Proceedings of the 33rd Annual IEEE International Computer Software and Applications Conference (COMPSAC 2009). Piscataway, NJ, USA: IEEE Computer Society, July 2009, pp. 324–328.
[42] K. E. Sabri, R. Khedri, and J. Jaskolka, “Specification of agent explicit knowledge in cryptographic protocols,” in Proceedings of the World Academy of Science, Engineering and Technology (WASET), vol. 34, October 2008, pp. 447–454.
[43] K. E. Sabri, R. Khedri, and J. Jaskolka, Advanced Technologies. IN-TECH, October 2009, ch. 13: Algebraic Model for Agent Explicit Knowledge in Multi-agent Systems, pp. 224–250.
[44] J. Kohlas and R. Stärk, “Information algebras and consequence operators,” Logica Universalis, vol. 1, no. 1, pp. 139–165, January 2007.
[45] K. Sabri, “Algebraic framework for the verification of confidentiality properties,” Ph.D. dissertation, McMaster University, Hamilton, Ontario, Canada, March 2010.