A Preliminary Look at the Privacy of SSH Tunnels
Maurizio Dusi, Francesco Gringoli, Luca Salgarelli
DEA, Università degli Studi di Brescia, via Branze, 38, 25123 Brescia, Italy
E-mail: @ing.unibs.it
Abstract: Secure Shell (SSH) tunnels are commonly used to provide two types of privacy protection to clear-text application protocols. First and foremost, they aim at protecting the privacy of the data being exchanged between two peers, such as passwords, details of monetary transactions and so on. Second, they are supposed to protect the privacy of the behavior of end-users, by preventing an unauthorized observer from detecting which application protocol is being transported by an SSH tunnel. In this paper we introduce a GMM-based (Gaussian Mixture Model) technique that, under a set of reasonable assumptions, can be used to identify which application is being tunneled inside an SSH session by simply observing the stream of encrypted packets. This technique can therefore break the presumption of privacy in its second incarnation as described above. Although still preliminary, experimental results show that the technique can be quite effective, and that the standards bodies might need to take this approach into consideration when designing new obfuscation techniques for SSH.
This work was supported in part by a grant from the Italian Ministry for University and Research (MIUR), under the PRIN project RECIPE.
I. INTRODUCTION
The use of encryption to protect the privacy of traffic on the Internet is quite common. One of the techniques ordinarily used to secure traffic flows is known as cryptographically protected tunneling. In this case, an IP-level or TCP-level stream of packets is used to tunnel application-layer segments coming from various software programs, while a cryptographic algorithm protects their privacy. One of the most common of such mechanisms is implemented by the Secure Shell (SSH) [1] protocol, which can be set up to cryptographically protect with a tunnel any client-server application that works on TCP.
Cryptographically protecting data streams with SSH tunnels has two objectives. On one hand, it aims at preserving the privacy of the users’ data, such as passwords and details of monetary transactions, that would otherwise be exposed by a plain-text protocol such as POP3, SMTP, HTTP, etc. On the other hand, there is usually a secondary, but equally important objective: the protection of the privacy of the users’ behavior. In other words, tunneling applications over SSH should also keep the behavior of the user private from any snooping network manager on the path of the SSH tunnel, in terms of the protocols that the user is tunneling and the sites that they are visiting. The reason for this second type of protection is that, by observing the protocols that somebody is using, an attacker could gain access to sensitive information that should remain private, and that could help the attacker answer some critical questions about the user, such as: “is
Linda using e-mail at this time?”. Or, “is Paul browsing the web, rather than debugging that remote router like I told him to do?”...
In this paper we show that, albeit under some restrictive assumptions, a dedicated attacker could, by observing an encrypted SSH session tunneling another application protocol, gain enough information to detect with good accuracy which protocol is being tunneled. The technique we use is based on a mechanism commonly used in pattern recognition, i.e., the Gaussian Mixture Model (GMM), here applied to simple features of the encrypted traffic flow, such as the length and direction of each packet. We show that such a technique can work under a few reasonable assumptions, such as the fact that the SSH tunnel carries one application-layer flow at a time, and that there are no further obfuscation techniques in place besides the standard ones adopted by SSH. Although we are far from generalizing this technique so that it can be applied to SSH connections that multiplex several applications in a single tunnel, the results should be enough to warrant new developments to the SSH specifications for the inclusion of stronger obfuscation techniques in the base standards.
The rest of the paper is organized as follows. In Section II we report on related work. In Section III we introduce the technique we used to model SSH tunnels. Section IV describes our GMM-based approach to the classification of SSH-tunneled traffic flows. Section V shows the experimental results we achieved by applying the technique to recorded traffic traces. In Section VI we discuss the assumptions behind the approach. In particular, we show how the SSH signaling can be exploited to detect the boundaries of an encrypted flow when the tunnel is carrying only one application at a time. Finally, Section VII concludes the paper.
II.
RELATED WORK
The attempt to apply statistical (behavioral) traffic classification techniques to the characterization of encrypted flows is relatively recent. Levine et al. [2] proposed a statistical technique that can partially violate the privacy of encrypted HTTP streams. Given a web-site, the authors collect the lengths of the packets composing the web request: the mean value of each packet length traces the size profile of the site. In the same way, they derive the time profile from the interarrival times of the same packets. They compare the profiles of several web-sites to a generic web-request trace: by means of a cross-correlation measure they infer the similarity of such a trace to each given profile and show how it is possible
Accepted for publication on the proceedings of the 17th IEEE International Conference on Computer Communications and Networks (ICCCN'08), U.S. Virgin Islands, 3-7 Aug. 2008.
to assess the destination address of the corresponding flow. Liberatore et al. [3] follow a similar approach, evaluating the technique with a larger data set of web-profiles. Another work that deals with the privacy of encrypted data streams is the one by Wright et al. [4], where it is shown that encrypted IPSec tunnels which carry only a single application protocol leak enough information about the flows in the tunnel to allow their number to be assessed precisely. The main implication of these works is that encryption is not enough to protect privacy when statistical mechanisms are used to analyze traffic flows.
Bernaille et al. [5] show that the information about packet size and packet direction can be exploited for classifying encrypted traffic (i.e., assigning each flow to a specific application class). Given an encrypted flow, they extract the data field and remove the number of bytes that are due to cryptography. The authors choose to assign a constant value to the number of bytes to remove, depending on the chosen encryption algorithm. Once the packet size is purged of the bytes due to encryption, the proposed technique assigns the flow to one of several clusters, achieving a good degree of accuracy on several application protocols. Wright et al. [6] describe a technique for the classification of encrypted traffic based on a Hidden Markov Model (HMM). The authors evaluate the technique on clear-text traffic that was previously encrypted in an “artificial way”: in fact, the size of each packet is rounded up to the next multiple of the block-cipher size. The classification results achieved on several application protocols are promising.
In our previous work [7] we described a statistical classification technique able to detect when an SSH session is used to tunnel other protocols rather than for remote administration or remote copy. Here we continue the work in two directions.
First, we show that a statistical technique based on a Gaussian Mixture Model (GMM) can infer the kind of application protocol being encrypted over an SSH channel. Second, we introduce a very preliminary technique to estimate the flow boundaries within an SSH tunneling session.
III. MODELING SSH TUNNELS
In this paper we show how a Gaussian Mixture Model can be applied to the problem of the characterization of traffic encrypted by an SSH tunnel. In this preliminary work we apply the mechanism to clear-text traffic that was recorded on a real network, after artificially encrypting it using the SSH model described in this Section. In practice, we follow an approach that has been validated by the scientific community in these cases. For example, Wright et al. adopt a similar technique for their experimental evaluation in [6]. The reason for using clear-text traffic encrypted using a model of the SSH tunnel channel, as opposed to recording traffic from actual SSH tunnels, is simple, and it has been described previously by many others who have faced similar issues. The collection of traffic traces encrypted over SSH sessions for the purpose of testing the validity of a traffic classification technique is quite a hard task. It involves the capture of the
traffic before and over the encrypted channel, and the ability to correlate the two. Furthermore, observing the SSH session only, it is not possible, without knowing the keys, to separate the flows of different application protocols being encrypted. Therefore, we start by describing the SSH model that we used to replicate its behavior, “artificially” generating SSH tunnels starting from clear-text traffic traces.
A. An introduction to SSH tunnels
The SSH protocol [1], [8], [9] allows any client and server programs built to the protocol’s specifications to communicate securely over an insecure network. Furthermore, it allows the tunneling (port forwarding) of any TCP connection on top of SSH, so as to cryptographically protect any application that uses clear-text protocols. The type of cryptographic protection is specified by a set of options that are negotiated by the peers at the beginning of the connection, after host and user authentication have been completed. Such options include the way to handle public keys, the type and mode of symmetric encryption, and which message authentication algorithm should be used. Although SSH is usually run on top of TCP, it has been designed to be used on top of any reliable transport layer: for this reason the protocol defines an “SSH packet” which can span several layer-4 segments. It is left to the transport layer to deliver the reconstructed packet to the SSH application at the receiver side. The format of an SSH packet is shown below:
field               length
pkt-length          4B
padding-length      1B
payload + padding   variable
MAC                 16B
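As a rough illustration of the layout above, the following Python sketch computes the amount of padding and the resulting on-the-wire length of one SSH packet. It assumes the padding rule of the SSH transport specification (at least four bytes of padding, with the first three fields adding up to a multiple of the block size) and the 16-byte MAC assumed in this paper; the function names are illustrative and do not come from any SSH implementation.

```python
def padding_length(payload_len, block_size=8, min_padding=4):
    """Bytes of random padding appended to a payload of payload_len bytes.

    Assumed rule: pkt-length (4B) + padding-length (1B) + payload + padding
    must be a multiple of block_size, with at least min_padding bytes of padding.
    """
    header_len = 4 + 1  # pkt-length + padding-length fields
    pad = block_size - (header_len + payload_len) % block_size
    if pad < min_padding:
        pad += block_size  # too short: add one more full block
    return pad


def wire_length(payload_len, block_size=8, mac_len=16):
    """Total size of the SSH packet, including the (unencrypted) MAC."""
    pad = padding_length(payload_len, block_size)
    return 4 + 1 + payload_len + pad + mac_len
```

Here `payload_len` counts the whole payload field, i.e., the signaling bytes plus the application data.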
Each time some data from the application that sits on top of the SSH channel is available, a new packet is created and the sender SSH process fills up the payload field of the structure. Before encryption takes place, this field is padded with a mandatory random sequence of at least four bytes so that the cumulative length of the first three fields is a multiple of the block-cipher size. Once the packet is delivered to the receiver side, the peer verifies its integrity using the information that the sender encoded in the MAC field: although the length of this field is decided during the connection setup and depends on the negotiated hash algorithm, we assume, without loss of generality, a length of 16 bytes. The receiver SSH process decrypts the data and uses the values of the first two fields to determine the boundary between the actual payload and the appended padding.
The data carried inside the payload field of each SSH packet is used by the SSH peers to provide upper applications with secure transport channels multiplexed into a single encrypted tunnel. To this end, nine bytes at the beginning of the field are reserved for signaling and are used to decode the data being carried. The SSH packet header is given by the first two fields plus these nine bytes. When an upper application requests a new channel, its SSH peer sends an SSH_MSG_CHANNEL_OPEN message to
the other side specifying the channel type, such as interactive login session, forwarded TCP/IP connection, etc. Once the peer acknowledges the request, a new channel is allocated: in the following we will focus on channels dedicated to the forwarding of TCP/IP connections (tunneling). This kind of channel is allocated each time a connection comes to a port for which remote forwarding has been requested: in this case a channel is opened to forward the port to the other side, which will then connect to the actual destination on behalf of the original client socket. Such encrypted SSH channels are completely transparent to the upper application.
Assuming that i) the SSH protocol exposes some information about when a forwarded connection starts and ii) that connection is the only channel inside the tunnel, the objective of this paper is to demonstrate that analyzing the encrypted packets can lead an attacker to discover what kind of application is being carried by the SSH tunnel. Let us assume for the moment that both assumptions hold: we will discuss them in depth in Section VI.
B. A model for SSH channels
We now introduce a model for an SSH channel that describes how clear-text TCP packets that need to be carried by an SSH tunnel are transformed according to the SSH specifications. In the analysis that follows, we ignore packets that do not carry TCP payload, since they do not introduce any additional information useful for classification purposes. In fact, empty TCP packets carry only signaling at the transport layer, and they are not forwarded over the SSH channel. For the same reason we also ignore empty SSH tunneling packets.
We define a clear-text half-flow as the uni-directional ordered sequence of non-empty packets carrying the same pair of source and destination tuples of IP addresses and TCP port numbers: each TCP session is always composed of a pair of half-flows, one made of packets carrying the TCP segments generated by the client, and the other carrying the server data. Suppose that, with no tunneling in place, we can capture the packets exchanged by a pair of peers during a TCP session on a generic router on the path: we would see the two associated half-flows multiplexed in a bi-directional clear-text flow that we represent with the following feature vector $x_C$:
\[ x_C = \{C_1, C_2, C_3, \ldots, C_N\}, \]
where N is the number of packets composing the flow, and each $C_i$ is the size of the i-th clear-text packet. Imagine now that SSH tunneling is in place and the tunneled flow crosses the same router as above. We can expect that:
• as each half-flow enters the SSH tunnel, its segments are converted into SSH packets and sent over the tunnel;
• the tunneled packets have different lengths than the corresponding clear-text segments;
• the pair of encrypted half-flows is multiplexed into a new bi-directional encrypted flow that we represent with the following feature vector:
\[ x = \{E_1, E_2, E_3, \ldots, E_N\}. \]
Since we ignore signaling packets in the analysis, we can expect that the way packets are intermixed inside $x_C$ depends on the application protocol that runs on top of the tunnel: consecutive segments generated by a peer should be thought of as a reaction to the preceding block of segments sent by the other peer. If this holds, the mixing order inside $x$ should be the same as in $x_C$. We now have to determine how the length of the packets changes going from $x_C$ to $x$. To this end we indicate with B the block-cipher size, with M the MAC size and with H the size of the SSH header. Given a clear-text packet C, the size of the encrypted packet E is computed as follows:
\[ E = \left\lceil \frac{C + H}{B} \right\rceil \cdot B + M + B \cdot \mathbf{1}_I, \tag{1} \]
where $\lceil \cdot \rceil$ is the ceil operator and $\mathbf{1}_I$ is the characteristic function:
\[ \mathbf{1}_I = \begin{cases} 1 & \text{if } \exists k \in \{0, 1\} \text{ s.t. } (C + kH) \equiv_B 0, \\ 0 & \text{otherwise.} \end{cases} \]
We built our model following the architecture of OpenSSH [10], which implements the SSH standards. As is common practice, adding a variable amount of random padding may help thwart traffic analysis. OpenSSH pads one more B if the quantity (C + kH) is a multiple of the block-cipher size, and the characteristic function accounts for this situation. The model does not take packet fragmentation into account. Due to encryption, the resulting size of the encrypted packet could exceed the Maximum Transmission Unit (MTU) of the channel. For instance, on 802.3 segments the MTU, including the TCP/IP header, is 1500 bytes. In this preliminary analysis, we choose to clamp the maximum size of the encrypted payload to 1460 bytes, so that:
\[ E = \min(E, MTU). \tag{2} \]
This means that this version of the model can only take into account cases where the end-user applications send and receive relatively small packets. In other words, we will be able to apply it only to clear-text traces that do not present packets large enough so that the resulting encrypted packet is larger than 1500 bytes. Although the development of a more accurate SSH channel model is left as future work, we will discuss this and other simplifications we adopted for this work in Section VI. Finally, to take into account the information about the direction of packets, each packet size is added to a constant K = 1000 and weighted with the sign function:
\[ E = \begin{cases} +E + K & \text{if pkt sent by client,} \\ -E - K & \text{if pkt sent by server.} \end{cases} \tag{3} \]
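Equations (1)-(3) can be put together in a few lines of Python. The sketch below is only an illustration of the model, not the code used for the experiments: it assumes H = 14 bytes (the 4-byte and 1-byte fields plus the nine signaling bytes described in Section III-A), M = 16 bytes, and the 1460-byte clamp mentioned in the text; the function and parameter names are illustrative.

```python
def encrypted_size(c, b=8, h=14, m=16):
    """Equation (1): size of the SSH packet carrying a clear-text packet of c bytes.

    b: block-cipher size, h: SSH header size, m: MAC size (all in bytes).
    """
    blocks = -(-(c + h) // b)  # integer ceil((c + h) / b)
    # Characteristic function: one extra block if c or c + h is a multiple of b.
    extra = b if (c % b == 0 or (c + h) % b == 0) else 0
    return blocks * b + m + extra


def feature(c, from_client, b=8, k=1000, max_size=1460):
    """Equations (2)-(3): clamped size, shifted by k and signed by direction."""
    e = min(encrypted_size(c, b), max_size)
    return e + k if from_client else -(e + k)


def encrypt_flow(flow, b=8):
    """Map a clear-text flow [(size, from_client), ...] to the feature vector x."""
    return [feature(c, from_client, b) for c, from_client in flow]
```

For instance, a 100-byte clear-text packet sent by the client with B = 8 gives ⌈114/8⌉ · 8 + 16 = 136 bytes, hence a feature value of 1136; the same packet sent by the server gives −1136.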
The constant K serves to separate counter-propagating packets, and we take advantage of this separation in building the Gaussian Mixture Model. The resulting SSH model, outlined in Figure 1, can be applied to clear-text traffic traces to mimic, with a perfect approximation under the assumptions we discussed, what we would obtain by capturing the corresponding packets in the middle of the SSH tunnel. In the next section we will show how to set up a trained classification approach that can exploit these data to detect the application class that is running over an encrypted SSH tunnel.
clear-text pkt C -> SSH encryption process -> encrypted pkt E
while(C):
    compute E as in Equation 1;
    process E according to Equations 2-3;
    return E;
Fig. 1. Model of the SSH channel.
IV. A GMM APPROACH TO THE CLASSIFICATION OF SSH TRAFFIC
The Gaussian Mixture is a parametric model generally used to estimate a multivariate probability density function. Its distribution is of the form:
\[ f(x|\Phi) = \sum_{i=1}^{L} a_i N(x|\theta_i), \]
where L is the number of mixture components, $a_i \geq 0$ are the mixing proportions ($\sum_{i=1}^{L} a_i = 1$) and $\Phi = \{a_i, \theta_i\}_{i=1}^{L}$ is the set of all distribution parameters. Since N refers to the Gaussian distribution, $\theta_i$ is represented by a mean vector and a symmetric, positive semi-definite covariance matrix. Given a set of training observations $T_S = (x_1, \ldots, x_n)$, the distribution parameters Φ are computed using the Expectation Maximization (EM) algorithm [11], which is an iterative method that estimates the parameters of a parametric distribution by maximizing the likelihood function:
\[ l(\Phi, T_S) = \prod_{j=1}^{n} \sum_{i=1}^{L} a_i N(x_j|\theta_i). \]
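To make the two formulas above concrete, the following Python sketch evaluates the mixture density and the likelihood in the log domain, which avoids numerical underflow when many small probabilities are multiplied. It is a simplified illustration: it assumes diagonal covariance matrices (the model above allows full covariance matrices), it does not implement the EM step, and all names are illustrative.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log-density of a Gaussian with diagonal covariance (per-dimension variances)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_logpdf(x, weights, means, variances):
    """log f(x | Phi) = log sum_i a_i N(x | theta_i), computed with log-sum-exp."""
    terms = [
        math.log(a) + gaussian_logpdf(x, m, v)
        for a, m, v in zip(weights, means, variances)
        if a > 0
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def log_likelihood(samples, weights, means, variances):
    """log l(Phi, T_S): the product over samples becomes a sum of log-densities."""
    return sum(gmm_logpdf(x, weights, means, variances) for x in samples)
```

Maximizing the log-likelihood is equivalent to maximizing the likelihood itself, since the logarithm is monotonic.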
The procedure requires that L is determined before running the EM routine. We will describe in the next section the criteria we have adopted to tune the overall parameters of the model.
A. Training the model
We follow a supervised approach to gather the Gaussian Mixture Model of each of the application protocols we are interested in detecting. We select a set composed of encrypted flows that we are sure belong to the same application protocol and we split it into two training sets $T_S^{(1)}$ and $T_S^{(2)}$. Suppose the number of Gaussian components L is fixed. We run the EM algorithm on the first set: the algorithm returns the distribution parameters Φ that maximize the likelihood function on $T_S^{(1)}$. After that, we use the returned Φ to compute the value of the likelihood function on $T_S^{(2)}$.
We repeat the procedure, letting the parameter L vary in the range [1 → 40], thus ending with forty pairs $(\Phi, l(\Phi, T_S^{(2)}))$. Finally, the model (i.e., the pair) that best fits the behavior of the application protocol is selected by running a minimum description length routine: intuitively, the routine looks for the best trade-off between the value of the likelihood function and the number of parameters of the model. From here on we indicate with $\omega_i$ the model describing the i-th application protocol and with Ω the set of considered models.
The number of mixture components L has a key role in the training process: if L is too low the trained model cannot describe the training set with a high level of accuracy, while if L is too large there are over-fitting effects, i.e., the resulting model is bound too tightly to the training examples. The effects of this problem can be reduced if the minimum description length routine works on the likelihood function evaluated over a set other than the one used to gather the model and, in our case, $T_S^{(2)}$ plays this role. In our experiments, we look for Gaussian components that span a four-dimensional space. This implies that only encrypted flows that are at least four packets long can be detected by this preliminary version of our technique.
B. Classification algorithm
Given an unknown flow x, we compute for each class the conditional probability $p(x|\omega_i)$ that x belongs to the model $\omega_i$ and we take the maximum so that:
\[ \omega_t = \arg\max_{\omega_i} \{p(x|\omega_i)\}, \tag{4} \]
where we refer to $\omega_t$ as the candidate class. In this work we train each class with the same number of samples, i.e., we do not make any assumptions on the a-priori probability of occurrence of each application in the training. A threshold-based rejection scheme states whether or not the flow has been generated by the candidate application protocol:
\[ x \in \begin{cases} \omega_t & \text{if } p(x|\omega_t) > T, \\ \text{unknown} & \text{otherwise.} \end{cases} \]
The threshold value lies in the range [0, 1]. It is computed during the training phase by evaluating the conditional probability $p(x|\omega_i)$ for any $x \in T_S^{(2)}$, for any $\omega_i \in \Omega$ and by maximizing the function:
\[ \max_T \{TP - FP\}, \]
where TP and FP refer to the true-positive (TP) and false-positive (FP) ratios, respectively. In our context, TP are the flows correctly assigned to the class that generated them. FP are the flows incorrectly assigned to another class.
V. EXPERIMENTAL RESULTS
We tested the validity of the classification technique with traffic traces collected at the edge gateway of our faculty’s campus network. This networking infrastructure comprises several 1000Base-TX segments routed through a Linux-based dual-processor box and includes about a thousand workstations
with different operating systems. The network is connected to the Internet through a single 100Mb/s link, over which we captured all the traces. During a period of three weeks, a total of 50GB of clear-text traffic was collected by running Tcpdump [12] for fifteen minutes every hour. We applied pattern-matching mechanisms to assess the actual application that generated each TCP flow, in some cases with the addition of manual inspection. Because of this, we consider both the training and evaluation sets derived from these captures relatively reliable with respect to the pre-classification information, i.e., with respect to knowing, independently from our classifier, which application generated each flow (“ground truth”).
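The decision rule of Section IV-B, i.e., Equation (4) followed by the threshold-based rejection, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each model is represented by a hypothetical scoring function returning $p(x|\omega_i)$, a single global threshold is used for all classes, and the candidate thresholds are supplied by the caller; none of these names come from the original experiments.

```python
def classify(x, models, threshold):
    """Equation (4) plus rejection: best-scoring class, or "unknown"."""
    scores = {name: score(x) for name, score in models.items()}
    best = max(scores, key=scores.get)  # candidate class
    return best if scores[best] > threshold else "unknown"


def pick_threshold(validation, models, candidates):
    """Choose the candidate threshold that maximizes TP - FP.

    validation: list of (flow, true_label) pairs, e.g. drawn from the
    second training set.
    """
    def tp_minus_fp(t):
        tp = fp = 0
        for x, label in validation:
            guess = classify(x, models, t)
            if guess == label:
                tp += 1  # flow assigned to the class that generated it
            elif guess != "unknown":
                fp += 1  # flow assigned to another class
        return (tp - fp) / len(validation)

    return max(candidates, key=tp_minus_fp)
```

Rejected flows count neither as true positives nor as false positives in this sketch, so raising the threshold trades accepted flows for fewer misassignments.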
[Figure 2: scatter-plot; legend: POP3, HTTP, FTP; axes range from −2500 to 2500.]
Fig. 2. Scatter-plot of the packet-size feature of the HTTP, POP3 and FTP protocols after being encrypted by an SSH tunnel. The first 3 packets of each session are shown. A block cipher with an 8-byte block size is used during encryption.
A. Datasets used in the analysis
The selected TCP sessions are encrypted following exactly the procedure outlined in Section III and separated into two sets. We use these sets to train and to evaluate our classification technique, respectively. The training and the evaluation sets are collected in two different, and consecutive, time frames during the course of several weeks. The training set is composed of one thousand flows for each of the six protocols we consider: HTTP, SMTP, POP3, FTP, BitTorrent (BT), MSN. We consider the same number of flows for each of the trained classes since we do not make any assumption on the a-priori probability of each class.
Protocol   sessions
HTTP       5000
SMTP       19500
POP3       20000
FTP        14500
BT         7400
MSN        1000
OTHER      7900
TABLE I. PROTOCOLS AND NUMBER OF FLOWS COMPOSING THE EVALUATION SET.
The evaluation set instead consists of the six protocols mentioned above and another set of protocols, named OTHER. This last set is used to verify the classifier’s ability to recognize protocols different from those used during the training phase. The OTHER set includes sessions generated by several application protocols, such as IMAP, SMB, eDonkey, NNTP and Gnutella, on which we applied pattern-matching mechanisms to assess their ground truth. The number of flows composing the evaluation set is reported in Table I.
B. The impact of the features on the model
We first study whether the packet size continues to represent a discriminating feature of each application after its packet flow has been encrypted by an SSH tunnel. Since the SSH channel performs a quantization on the packet size, it might reduce the differences among the application protocols. Figure 2 shows a scatter-plot of the packet size of three different protocols, HTTP, POP3 and FTP, after they have
been encrypted according to the SSH model introduced in Section III. As the plot suggests, the protocols lie in different regions: this fact supports the idea that the information related to the packet size is still very useful to characterize a session tunneled over an SSH channel.
C. Numerical results
We then follow the mechanism described in Section IV-A to gather the Gaussian Mixture Model for the application protocols composing the training set. The Gaussian distributions are gathered by looking at the first four packets of each session. The size of the block cipher is negotiated during the setup of the SSH channel. To evaluate its impact on the classification results, we repeated the tests by emulating two scenarios, one where the block cipher uses an 8-byte block size, and another one where the block size is set to 16 bytes. In both cases, we gather the Gaussian mixture model and perform the classification, after the threshold value has been computed as described in Section IV-B.
Protocols   HTTP    SMTP    POP3    FTP     BT      MSN     Unknown
HTTP        90.28   –       –       –       –       –       9.72
SMTP        –       99.83   0.02    0.01    –       –       0.14
POP3        –       2.47    97.45   0.07    –       –       0.01
FTP         –       0.19    59.28   39.83   –       –       0.70
BT          0.12    –       –       –       96.33   0.24    3.31
MSN         –       –       –       –       0.39    97.19   2.42
OTHER       16.57   0.20    0.03    –       0.16    0.90    82.14
TABLE II. CLASSIFICATION RESULTS (%), BLOCK-CIPHER SIZE = 8 BYTES. ROWS: ACTUAL PROTOCOL; COLUMNS: ASSIGNED CLASS. TRUE POSITIVE RATES ARE ON THE DIAGONAL.
Tables II and III report the classification results when applying the GMM model to the traffic traces encrypted following the model of the SSH channel described in Section III with block sizes of 8 and 16 bytes, respectively. Independently of the block-cipher size, in almost all cases the classifier assigns more than 90% of the encrypted sessions to the correct class, with a peak close to 100% in the SMTP case. Even the detection of the OTHER traffic, which the
Protocols   HTTP    SMTP    POP3    FTP     BT      MSN     Unknown
HTTP        91.38   –       –       –       –       –       8.62
SMTP        –       99.65   0.32    0.01    –       –       0.03
POP3        –       1.45    98.16   0.37    –       –       0.02
FTP         –       0.40    58.11   40.84   –       –       0.65
BT          0.18    –       –       –       95.44   2.76    1.62
MSN         –       –       –       –       0.77    96.13   3.10
OTHER       13.20   0.19    0.13    –       0.38    0.24    85.86
TABLE III. CLASSIFICATION RESULTS (%), BLOCK-CIPHER SIZE = 16 BYTES. ROWS: ACTUAL PROTOCOL; COLUMNS: ASSIGNED CLASS. TRUE POSITIVE RATES ARE ON THE DIAGONAL.
classifier has not received training for, is over the 82% mark in the worst case. The only disappointing result is related to the classification of FTP (command): only around 40% of this traffic is detected correctly, while around 60% of FTP flows are incorrectly assigned to the POP3 class. This is actually an expected result: Figure 2 reveals that the features of these two protocols behave very similarly, and evidently our preliminary technique is not sophisticated enough to discriminate between the two with enough precision. With regard to the traffic of the OTHER class, the classifier assigns it incorrectly to the HTTP class 16.57% of the time, in the worst case. This fact can be partially explained by observing that the packets composing HTTP sessions have a size close to the MTU of the channel (see Figure 2). Analogously, the OTHER set is composed of protocols, such as eDonkey, that exhibit the same behavior. Finally, the results show that the classifier’s precision is hardly affected by the block-cipher size, which indicates that it is fairly robust to the quantization process that SSH tunnels implement on the traffic.
VI. DISCUSSING THE ASSUMPTIONS: LIMITATIONS AND COUNTERMEASURES
A. Detecting the boundaries of consecutive flows
The analysis we have reported relies on the assumption that the attacker can isolate a TCP session tunneled over SSH without observing the payload of the packets flowing into the channel. In the presence of clear-text traffic, we can perform this operation on each packet by looking at the information of the network and transport levels: the tuple of IP addresses and TCP port numbers identifies the session the packet belongs to, while the TCP flags indicate the state of the connection. On the contrary, such information is encrypted when the packets traverse the SSH tunnel. Therefore, the question is how a tunneled session can be isolated within the SSH session. Assuming that only one single application is tunneled in each SSH channel at any given time (more on this in the following), the way SSH signaling works can help the attacker in detecting when each consecutive tunneled flow starts and ends within a single, long-lived tunnel. In order to forward a new TCP session over SSH, the tunnel end-points must allocate an encrypted channel over an existing SSH connection. All the packets belonging to the tunneled
session carry the same channel identifier in their SSH header. The tunnel end-points exploit such information to demultiplex the packets belonging to different sessions and forward them to the actual destination. When an upper application requests a new channel, the pair of messages SSH_MSG_CHANNEL_OPEN and SSH_MSG_CHANNEL_OPEN_CONFIRMATION are exchanged by the tunnel end-points. Note that the request to open a channel is always sent by the tunnel entry-point, i.e., the end-point which the actual client connects to. Instead, the closure of a channel can be initiated by either of the tunnel end-points and involves the exchange of three signaling messages. The first end-point notifies the other side with an SSH_MSG_CHANNEL_EOF, and waits for the SSH_MSG_CHANNEL_EOF_CLOSE response. The channel is definitely closed only when the first end-point replies with the message SSH_MSG_CHANNEL_CLOSE.
Obviously, all the signaling messages are encrypted over the SSH session and they are not directly observable without the decryption key. However, we can take advantage of some characteristics that these messages exhibit. First, they always have a fixed size and a predictable direction. For instance, when the channel is opened, the first message is sent by the entry-point. Second, if a channel is running at a given time, we observe that during the channel opening or closure no other messages of this type are exchanged. Based on these considerations, we designed a simple algorithm to estimate the boundaries of an encrypted session. We define the sequence of (size, direction) pairs of the opening procedure as the opening pattern. Similarly, the sequence of (size, direction) pairs of the closure procedure represents the closure pattern. In the latter case, we have to deal with two patterns, since either end-point can close the channel. The algorithm then looks for the opening and closure patterns and returns the boundaries of the encrypted sessions.
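The pattern search described above can be sketched in Python as follows. This is only an illustration of the idea, not the algorithm used in the experiments: the packet sizes in the patterns are hypothetical placeholders (the actual sizes depend on the negotiated cipher and MAC), the direction is encoded as +1 for packets sent by the tunnel entry-point and −1 otherwise, and the function name is illustrative.

```python
def find_sessions(packets, opening, closures):
    """Return (start, end) index pairs of the tunneled sessions.

    packets:  observed trace as a list of (size, direction) tuples
    opening:  opening pattern, a list of (size, direction) tuples
    closures: list of closure patterns (one per end-point that may close)
    Indices delimit the data packets: packets[start:end].
    """
    sessions = []
    start = None  # index of the first data packet of the open session
    i = 0
    while i < len(packets):
        if start is None:
            if packets[i:i + len(opening)] == opening:
                i += len(opening)
                start = i
            else:
                i += 1
        else:
            closing = next(
                (p for p in closures if packets[i:i + len(p)] == p), None
            )
            if closing is not None:
                sessions.append((start, i))
                start = None
                i += len(closing)
            else:
                i += 1
    return sessions


# Hypothetical patterns, for illustration only.
OPENING = [(96, +1), (48, -1)]
CLOSE_BY_ENTRY = [(32, +1), (32, -1), (32, +1)]
CLOSE_BY_EXIT = [(32, -1), (32, +1), (32, -1)]
```

A data packet sequence that happens to match a closure pattern would produce a false boundary in this sketch; the single-channel assumption and the fixed size of the signaling messages are what make the search practical.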
We verified the validity of this technique by applying it to the detection of the flow boundaries of two hundred POP3 sessions tunneled consecutively over one SSH channel. In all cases the technique succeeded, showing that even this problem might be overcome by a dedicated attacker under some reasonable hypotheses.

B. Multiplexing several application flows on the same SSH tunnel

From what we have shown so far, it should be clear that one immediate countermeasure to prevent an attacker from breaking the privacy of an SSH tunnel is to simultaneously multiplex several flows over an SSH channel. This would invalidate our technique, because it would then be impossible, at least as far as the state of the art goes, to detect which encrypted packet belongs to which flow. However, we argue that many users today employ SSH tunnels to protect a single traffic stream at a time, such as a POP3 or a peer-to-peer session that would otherwise be blocked by the enterprise firewall. In this case, our classification technique together with the boundary-detection mechanism briefly described in Section VI-A would be enough for
an attacker to detect which protocol is being tunneled in the majority of cases.

C. Compression

Another countermeasure that would probably invalidate our classification technique in its current incarnation is compressing the data before encryption. The SSH standards describe this as an optional feature, and indeed many OpenSSH implementations disable compression by default. We leave the analysis of how our technique responds to compression on the SSH channel as future work.

D. Completeness

As explained in Section IV-A, our technique needs at least four packets to train the GMM and to classify an unknown encrypted flow. Clearly this is a problem for the applicability of the mechanism to very short-lived flows. However, for the protocols we considered and the environment in which sessions were captured, we have observed that this is generally a minor issue: the majority of the protocols produce sessions longer than four packets. The only relevant exception is HTTP, for which about 45% of the sessions in our environment end after only two non-empty packets. This is the case when the web server fulfills the client’s GET request with a single packet of data, and it might be of marginal interest for security purposes.

E. Maximum packet size, multi-segment SSH packets and padding

Since the technique exploits a couple of simple features of the encrypted stream, i.e., packet size and direction, it is very sensitive to the manipulation of these values. In fact, the classifier’s precision could be lower if the majority of the traffic consisted of large packets (near or above the MTU), or if SSH were to create a single SSH packet combining two or more application segments. Furthermore, smart padding techniques could clearly further reduce its precision. Although the issues related to MTUs and multi-segment SSH packets are real, we suspect they affect only a minority of traffic.
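The sensitivity to size manipulation can be illustrated with a toy example. The sketch below assumes a hypothetical padding policy that rounds every packet size up to the next multiple of a fixed bucket; the policy, the helper names `pad_to_bucket` and `padded`, and the two sample flows are invented for illustration and are not taken from the SSH standards or from our data set.

```python
from typing import List, Tuple

Packet = Tuple[int, int]  # (size in bytes, direction: +1 or -1)


def pad_to_bucket(size: int, bucket: int) -> int:
    """Round a packet size up to the next multiple of `bucket` bytes."""
    return (size + bucket - 1) // bucket * bucket


def padded(flow: List[Packet], bucket: int) -> List[Packet]:
    """Apply the assumed padding policy to every packet; direction is unchanged."""
    return [(pad_to_bucket(size, bucket), direction) for size, direction in flow]


# Hypothetical first four packets of two different protocols.
flow_a = [(52, +1), (120, -1), (36, +1), (240, -1)]
flow_b = [(60, +1), (100, -1), (44, +1), (200, -1)]

# Fine-grained padding keeps the two flows distinguishable by size...
print(padded(flow_a, 16) != padded(flow_b, 16))    # True
# ...while coarse padding makes their (size, direction) sequences identical.
print(padded(flow_a, 256) == padded(flow_b, 256))  # True
```

With the coarse bucket only the direction sequence still carries information, which suggests why a size-based classifier would lose precision, at the cost of the extra bytes sent on the wire.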
As for intentionally adopting padding techniques, we believe that this would be the only sensible, long-term countermeasure to the types of attacks described in this paper. We plan to investigate these aspects in depth in future work.

VII. CONCLUSIONS

In this paper we have applied a statistical technique to the problem of detecting the application protocol being carried by an encrypted SSH tunnel. The technique relies on two simple properties of each packet, its size and direction, that remain observable after encryption. It is based on a Gaussian Mixture Model, with the addition of a threshold-based classification algorithm that assigns a given encrypted session to one of the trained protocols or to the “unknown” class.
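To make the classification step concrete, the following minimal sketch scores a flow against one mixture per protocol and falls back to “unknown” when the best score is below a threshold. It is a simplification under loud assumptions and does not reproduce the model of Section IV: each packet is reduced to a single signed size (positive in one direction, negative in the other), every mixture is one-dimensional, and the parameters in `MODELS` and the value of `THRESHOLD` are hand-picked placeholders rather than the result of training. Only the four-packet minimum comes from Section VI-D.

```python
import math
from typing import Dict, List, Tuple

Component = Tuple[float, float, float]  # (weight, mean, standard deviation)

MIN_PACKETS = 4  # minimum flow length, as stated in Section VI-D

# Hypothetical mixtures over signed packet size; NOT trained on real data.
MODELS: Dict[str, List[Component]] = {
    "POP3": [(0.5, 40.0, 10.0), (0.5, -90.0, 20.0)],
    "HTTP": [(0.5, 400.0, 80.0), (0.5, -1300.0, 150.0)],
}
THRESHOLD = -8.0  # hand-picked minimum average log-likelihood per packet


def gmm_log_density(x: float, mixture: List[Component]) -> float:
    """Log of the mixture density at x (sum of weighted Gaussian densities)."""
    total = sum(
        w * math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
        for w, mu, sd in mixture
    )
    return math.log(total + 1e-300)  # small constant guards against log(0)


def classify(flow: List[float], models: Dict[str, List[Component]],
             threshold: float) -> str:
    """Return the best-scoring protocol, or "unknown" if no model fits well."""
    if len(flow) < MIN_PACKETS:
        return "unknown"
    scores = {
        name: sum(gmm_log_density(x, mixture) for x in flow) / len(flow)
        for name, mixture in models.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unknown"


print(classify([38, -95, 45, -80], MODELS, THRESHOLD))       # POP3
print(classify([420, -1350, 380, -1200], MODELS, THRESHOLD))  # HTTP
print(classify([800, -50, 900, -40], MODELS, THRESHOLD))      # unknown
```

In a real deployment the mixture parameters would be estimated from training flows, for example with the EM algorithm [11], and the threshold would be tuned on held-out data.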
Following a well-known and scientifically accepted approach (see for example [6]), the experimental evaluation of the mechanism has been carried out by first recording clear-text sessions on a real network, and then emulating their tunneling on SSH channels. To take into account that the packet size is altered by encryption, we built a model of the SSH channel that returns the size of the packets as if they had been captured in the middle of the SSH channel. The resulting sessions have then been used to assess the precision of the classifier. The experiments show that our technique can successfully detect the application protocol behind a given encrypted SSH tunnel in the vast majority of cases, with few notable exceptions (FTP). An attacker could use this technique to invade the privacy of users, assuming they tunnel a single protocol at a time over an SSH session.

Our work in this area is continuing in several directions. Besides considering all the issues raised in Section VI, we are planning experiments to validate the technique on actual SSH tunnels, as opposed to emulated ones. Furthermore, we plan to investigate the applicability of this technique to other types of encrypted tunnels, such as the ones based on IPsec and on Transport Layer Security. Finally, and most important of all, we are also studying which countermeasures would most effectively protect SSH against these types of attacks on the privacy of the users.

REFERENCES

[1] T. Ylonen and C. Lonvick, “The Secure Shell (SSH) Protocol Architecture,” RFC 4251, IETF, Jan. 2006.
[2] G. Bissias, M. Liberatore, D. Jensen, and B. N. Levine, “Privacy Vulnerabilities in Encrypted HTTP Streams,” in Proc. Privacy Enhancing Technologies Workshop (PET 2005), (Dubrovnik, Croatia), May 2005.
[3] M. Liberatore and B. N. Levine, “Inferring the Source of Encrypted HTTP Connections,” in CCS ’06: Proceedings of the 13th ACM Conference on Computer and Communications Security, (Alexandria, Virginia, USA), pp. 255–263, 2006.
[4] C. V. Wright, F. Monrose, and G. M. Masson, “On Inferring Application Protocol Behaviors in Encrypted Network Traffic,” Journal of Machine Learning Research, vol. 7, pp. 2745–2769, Dec. 2006.
[5] L. Bernaille and R. Teixeira, “Early Recognition of Encrypted Applications,” in Proceedings of the 8th Passive and Active Measurement Conference (PAM 2007), (Louvain-la-Neuve, Belgium), Apr. 2007.
[6] C. Wright, F. Monrose, and G. M. Masson, “HMM Profiles for Network Traffic Classification,” in Proceedings of the 2004 ACM Workshop on Visualization and Data Mining for Computer Security, (Washington DC, USA), Oct. 2004.
[7] M. Dusi, M. Crotti, F. Gringoli, and L. Salgarelli, “Detection of Encrypted Tunnels across Network Boundaries,” in Proceedings of the 43rd IEEE International Conference on Communications (ICC 2008), (Beijing, China), May 2008.
[8] T. Ylonen and C. Lonvick, “The Secure Shell (SSH) Transport Layer Protocol,” RFC 4253, IETF, Jan. 2006.
[9] T. Ylonen and C. Lonvick, “The Secure Shell (SSH) Connection Protocol,” RFC 4254, IETF, Jan. 2006.
[10] “OpenSSH.” http://www.openssh.org.
[11] A. Dempster, N. Laird, and D. Rubin, “Maximum Likelihood from Incomplete Data via the EM Algorithm,” Journal of the Royal Statistical Society, vol. 39, no. 1, pp. 1–38, 1977.
[12] “Tcpdump/Libpcap.” http://www.tcpdump.org.