YAN LAYOUT
9/6/06
2:55 PM
Page 6
Information Leak Vulnerabilities in SIP Implementations

Hong Yan and Hui Zhang, Carnegie Mellon University
Kunwadee Sripanidkulchai, NECTEC, Thailand
Zon-Yin Shae and Debanjan Saha, IBM T. J. Watson Research

Abstract

The use of VoIP as a cheaper communications alternative is growing at an astronomical rate. However, potential abuse of the technology may hinder its deployment. One key security concern is the exploitation of implementation vulnerabilities in the form of unauthorized access, worms, viruses, and denial of service attacks, particularly when combined with explicit targeting of implementations that are known to be vulnerable. One way to protect against exploitation of implementation-specific vulnerabilities is “security-by-obscurity,” where a SIP device does not reveal its specific software version. For the same reason, the SIP standard does not encourage announcing the software version in SIP messages. In this article we show that even when SIP messages do not explicitly contain software version information, there is sufficient information leak to determine it. To demonstrate this, we introduce techniques to fingerprint SIP devices and develop a fingerprinting tool called SIPProbe that collects fingerprints and identifies SIP implementations. This type of information leak presents a new security concern, as it can be used by malicious users as a building block to scan SIP devices and launch attacks.
IEEE Network • September/October 2006
0890-8044/06/$20.00 © 2006 IEEE

Voice over IP (VoIP) technology is gaining popularity as an alternative to traditional telephony in homes and enterprises. Deployment in homes is fueled by lower-cost services provided by home broadband Internet service providers, VoIP providers [1], and peer-to-peer networks [2], collectively boasting up to 17.4 million users [3]. Similarly, enterprises are converting their internal private branch exchange (PBX) systems and contact centers to VoIP. The key motivation for this transition is not just the reduced cost, but the ease of integration of voice services with other network-based applications, particularly in the enterprise environment.

The key functionality required to support VoIP is signaling to establish and terminate calls, media transport for transmission of audio and video signals, and user location registration. In this article we focus on VoIP networks that use the Session Initiation Protocol (SIP) [4], an Internet Engineering Task Force (IETF) standard for signaling. Like HTTP, SIP supports open competition and interoperability between vendors. Users can potentially leverage SIP-compliant components, services, or applications from multiple providers. Furthermore, the simplicity of SIP and its ease of implementation have so far resulted in a multitude of SIP-enabled hardware and software components. For example, there are at least 70 distinct SIP softphones [5] and 60 SIP servers [6], mostly with different SIP implementations.

As the network grows in scale and heterogeneity, potential abuse of VoIP technology may hinder its deployment. One key security concern is the exploitation of vulnerabilities in VoIP devices in the form of unauthorized access, worms, viruses, and denial of service attacks, particularly when combined with explicit targeting of certain implementations that are known to be vulnerable. Such vulnerabilities may have disastrous consequences. For example, a vulnerability in Pingtel phones with software version 1.2.5–1.2.7.4 opens holes for remote administrative access and makes it possible to compromise an entire telephony infrastructure [7]. In addition, security testing of SIP implementations indicates that a large number of vendors are vulnerable to attacks that can lead to software crashes, hangs, and restarts [8, 9].

Although developers are making huge strides in terms of security and robustness to protect VoIP devices from malicious users, one way to evade attacks that exploit implementation-specific vulnerabilities is to keep the information about the implementation details and software versions secret. The SIP standard [4] defines a User-Agent field that can be used to announce the software version of a SIP device. The standard, however, warns that revealing a device’s software version might make it more vulnerable to attacks. Thus, the standard requires SIP implementers to make the User-Agent field a configurable option so that users can turn it off and choose not to expose the software version for privacy and security reasons.

In this article we demonstrate that even when the User-Agent field is not available, there is sufficient information leak caused by variations in implementation to determine the software running on a SIP device. We introduce a probing technique, which we call SIP fingerprinting, that probes SIP devices with crafted SIP messages and analyzes the responses to infer which SIP stack is being used. To test the applicability of the technique, we build a SIP fingerprinting tool called SIPProbe and fingerprint 20 distinct SIP devices. We find that with one probe, we can uniquely identify all 20 devices. We raise the issue of information leaks as a security threat that can be used as a building block to scan SIP devices and launch attacks.

This article is organized as follows. We first briefly introduce the SIP protocol, then discuss our SIP-based fingerprinting techniques. We present SIPProbe and the experimental results of applying the fingerprinting tool to 20 SIP devices. We then summarize our findings.
Background

Our SIP fingerprinting techniques are based on analysis of the protocol stack running on the SIP device. In this section we provide a short overview of SIP as a foundation for understanding SIP fingerprinting.
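■ Figure 1. An example SIP infrastructure for voice services. [Figure: user A (SIP hardphone), user B (SIP softphone), and user C (PSTN phone), connected through server X (SIP proxy), server Y (SIP proxy), server Z (SIP voice gateway), and a PSTN switch.]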
SIP-Based System Architecture

SIP [4] is a signaling protocol used to register and locate users, and to establish, maintain, and terminate sessions. While there are many competing signaling standards such as H.323 [10] and MGCP [11], SIP is gaining popularity because of its simplicity and lower overall cost. Furthermore, SIP has the potential to go beyond voice services to enable easy integration of network-based applications such as instant messaging, video, games, Web, calendars, applets, and directory services.

Compared to the traditional telephony network, SIP is a fundamentally more complex environment. SIP is supported by a large number of vendors, including key ones such as Cisco, Avaya, and Microsoft. Various SIP hardware and software components from a variety of vendors may be interconnected in many different ways, depending on the services being offered. Because SIP is an application layer protocol, introducing and integrating new applications into the system is a simple and common process. The network may always be evolving with the flux of such new services.

Figure 1 illustrates a simplified SIP infrastructure that provides basic voice services. End devices or user agents, such as hardware-based phones (hardphones) and software-based phones (softphones), are used to initiate and listen for incoming calls. End devices register their location with a registration server, often collocated with their SIP proxy (in this example, user B is registered with SIP proxy server X). SIP proxy servers participate in call signaling by relaying the signaling messages and resolving SIP addresses into IP addresses. Once a call is established, its media session may be transported directly between the end devices if both are IP-based (e.g., between user A’s hardphone and user B’s softphone), or through a voice gateway (server Z) that translates the media between the IP network and the public switched telephone network (PSTN) (e.g., between user B’s softphone and user C’s PSTN phone).
■ Figure 2. An example of SIP call establishment and teardown. [Figure: message sequence among SIP end-device user X (128.2.181.221), the SIP proxy (sip:proxy.b.com), and SIP end-device user Y (155.98.39.84): INVITE, 200 OK, ACK, media, BYE, 200 OK, with each signaling message relayed through the proxy.]

SIP Message Flow
SIP is the language that SIP devices, including end devices and servers, use to exchange requests and responses. There are several types of requests that provide different functionalities. The simplified workflow depicted in Fig. 2 illustrates how a SIP user uses a SIP INVITE request to establish a call. To uniquely identify SIP users, an addressing scheme based on SIP uniform resource identifiers (URIs) is used. A typical URI has the format sip:user@host. For clarity, only important parts of the exchanged SIP messages are shown. When user X initiates a call to Y, it sends an INVITE request to Y’s proxy server. The request is relayed to the address where Y is located. Y decides to accept the call, and thus responds to the INVITE request with “200 OK,” which is then acknowledged by X. At this point the three-way handshake is complete, and a conversation dialog is established. X and Y exchange media (e.g., voice) until X sends out a BYE request and terminates the session.
SIP Message Formats

As illustrated in the above example, there are two classes of SIP messages: requests and responses. All the SIP requests (REGISTER, INVITE, OPTIONS, BYE, ACK, etc.) share the same format: a request line followed by a list of headers and an optional message body. For example, consider the INVITE message sent by user X in Fig. 2. The request line, INVITE sip:[email protected] SIP/2.0, carries information about the request type, request receiver, and SIP version. The various headers (From: sip:[email protected] etc.) provide additional information regarding the request, such as the sender of the request. A request might also include a message body to convey more user information, but the structure or content of the message body is not defined by the protocol.

The format of a SIP response is similar to a SIP request, except that the start line is a status line which contains a status code indicating the outcome of the request. Consider the corresponding response from the SIP proxy to user X in Fig. 2. As implied by “OK,” the response with status code “200” means that the corresponding request has been understood and successfully performed.
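The request/response distinction described above can be sketched in a few lines of Python. This is a minimal illustration, not SIPProbe's parser; the names `RequestLine`, `StatusLine`, `parse_start_line`, and `split_message` are hypothetical, and the sketch assumes CRLF line endings and ignores header folding.

```python
from typing import NamedTuple, Union


class RequestLine(NamedTuple):
    method: str
    uri: str
    version: str


class StatusLine(NamedTuple):
    version: str
    code: int
    reason: str


def parse_start_line(line: str) -> Union[RequestLine, StatusLine]:
    """Classify the first line of a SIP message as a request or status line."""
    parts = line.strip().split(" ", 2)
    if parts[0].startswith("SIP/"):
        # Status line: SIP-Version, status code, reason phrase.
        reason = parts[2] if len(parts) > 2 else ""
        return StatusLine(parts[0], int(parts[1]), reason)
    # Request line: method, request URI, SIP-Version.
    method, uri, version = parts
    return RequestLine(method, uri, version)


def split_message(raw: str):
    """Split a SIP message into (start line, ordered headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    headers = []
    for header in lines[1:]:
        name, _, value = header.partition(":")
        headers.append((name.strip(), value.strip()))
    return parse_start_line(lines[0]), headers, body
```

Headers are kept as an ordered list rather than a dictionary because, as discussed later in the article, the order in which an implementation emits its fields can itself leak information.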
SIP Fingerprinting

In this section we present a technique called SIP fingerprinting, which exploits implementation differences in the SIP protocol stack to identify SIP devices. The intuition behind SIP fingerprinting is to create SIP requests that elicit different responses from different devices. To fingerprint SIP devices, we manipulate the SIP request headers in RFC-compliant and non-RFC-compliant ways, and observe how the various implementations differ. By crafting SIP requests as probes and simply extracting the status codes returned from the devices, we obtain from each type of SIP device a unique response sequence that we call a fingerprint. We note that most of the information leak is from responses to non-compliant messages for which there is no specified behavior in the RFC. Thus, the differences in responses reflect differences in implementation.

Our fingerprints are based on the OPTIONS request, though our techniques do not exclude the use of other types of SIP messages such as INVITE, CANCEL, and so on. To extract the fingerprint, we look at two sources of information: the returned status code and the values returned in certain response header fields. In general, the status codes we have observed fall under the following categories: “200 OK,” “3xx Redirection,” “4xx Request Failure,” “5xx Server Failure,” and no response.

We implement the probes in a tool called SIPProbe, which we use to collect fingerprints of 20 different SIP devices. We extract status code fingerprints by probing SIP devices with standards-compliant OPTIONS messages and non-compliant messages, as described next. To extract value-based fingerprints, we look at the content returned in the Allow header of the responses, as described later. We also discuss other sources of information leak that can be used for fingerprinting. Note that this is not a complete list, as our focus in this article is to introduce the information leak vulnerability. Other manipulations to extract additional information to be used as fingerprints may be possible.
Responses to Standards-Compliant OPTIONS

A SIP OPTIONS request is used to query for the capabilities of a SIP device. When a SIP device receives an OPTIONS request, it responds with the complete set of supported SIP message types. According to the SIP standard [4], a valid SIP OPTIONS request starts with a request line that contains the keyword “OPTIONS,” a URI that identifies the recipient of the request, and a SIP-Version that indicates the SIP version in use (“SIP/2.0” to be standards-compliant). In addition to the mandatory request line, a SIP OPTIONS request must also contain the following six header fields: Via, CSeq, Call-ID, To, From, and Max-Forwards. The following is an example of a standards-compliant OPTIONS request:

    OPTIONS sip:128.2.181.221 SIP/2.0
    Via: SIP/2.0/UDP 155.98.39.84;branch=z9hG4bKhjhs8as877
    CSeq: 1 OPTIONS
    Call-ID: a84b4c76e66710
    To:
    From: Anonymous ;tag=1928301774
    Max-Forwards: 70

The Via header field indicates the transport used for the message and provides the location where the OPTIONS response is to be sent. In the above example, UDP is used as the transport. The OPTIONS request is to be sent to 128.2.181.221, and the response is expected to be sent back to 155.98.39.84. The Via header is also required to carry a branch parameter that identifies the transaction created by the request.

The CSeq header, which consists of a sequence number and the keyword OPTIONS, serves as a way to identify and order transactions. The RFC requires that the sequence number value be expressible as a 32-bit unsigned integer.

The Call-ID is a unique identifier that is selected when a new request is generated and remains the same in the request-response exchange dialog.

The To header field specifies the recipient of the OPTIONS request, or the address of the request target (128.2.181.221 in the example).

The From header field indicates the sender of the OPTIONS request. The sender can choose to hide its identity, as shown in the example, by using the display name “Anonymous” along with a syntactically correct but otherwise meaningless URI. A tag parameter should be appended to the end of the From header field so that, combined with the Call-ID, it can uniquely identify a conversation.

The Max-Forwards header field contains an integer that limits the number of hops a request can transit on the way to its destination. The integer, set to 70 in our example, is decremented by one at each SIP hop.

The response to an OPTIONS message from the eyeBeam [12] softphone is shown in the following example:

    SIP/2.0 200 OK
    Via: SIP/2.0/UDP 155.98.39.84;branch=z9hG4bKhjhs8as877
    Contact:
    To: ;tag=7c3d124c
    From: Anonymous ;tag=1928301774
    Call-ID: a84b4c76e66710
    CSeq: 1 OPTIONS
    Accept: application/sdp
    Accept-Language: en
    Allow: INVITE,ACK,CANCEL,OPTIONS,BYE,REFER,NOTIFY,MESSAGE,SUBSCRIBE,INFO

The status code returned from the eyeBeam softphone for a compliant probe is “200 OK.” Our fingerprints are obtained by collecting such status codes from responses to the various probes we send. Each device has a distinct fingerprint.
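As a rough illustration of the compliant probe and of reading the Allow header from its response, consider the following Python sketch. It is not the SIPProbe implementation. The function names are hypothetical, and the To and From URIs are assumptions chosen for the example: the To URI follows the text's statement that To carries the request target, and `sip:anonymous@anonymous.invalid` is a placeholder for the "meaningless URI" of an anonymous sender.

```python
def build_options(target: str, local: str,
                  branch: str = "z9hG4bKhjhs8as877",
                  call_id: str = "a84b4c76e66710",
                  tag: str = "1928301774") -> str:
    """Return an OPTIONS request carrying the six mandatory header fields."""
    lines = [
        f"OPTIONS sip:{target} SIP/2.0",
        f"Via: SIP/2.0/UDP {local};branch={branch}",
        "CSeq: 1 OPTIONS",
        f"Call-ID: {call_id}",
        # Assumed URI: the request target, as described in the text.
        f"To: <sip:{target}>",
        # Assumed placeholder URI for an anonymous sender.
        f"From: Anonymous <sip:anonymous@anonymous.invalid>;tag={tag}",
        "Max-Forwards: 70",
    ]
    # Header section ends with an empty line; the probe has no body.
    return "\r\n".join(lines) + "\r\n\r\n"


def allow_methods(response: str) -> tuple:
    """Return the methods of the Allow header in the order sent, or ()."""
    head = response.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "allow":
            return tuple(m.strip() for m in value.split(",") if m.strip())
    return ()
```

Returning the methods as an ordered tuple keeps both pieces of information the article uses later: which methods are listed and in which order.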
Responses to Non-Compliant OPTIONS

Next, we describe different manipulations of the OPTIONS request in non-compliant ways by changing individual header fields. Note that this is not meant to be a comprehensive list.

Invalid Version: The next fingerprint is extracted from the status code returned for an OPTIONS message that has an invalid SIP version. The RFC requires all SIP messages to carry the version information “SIP/2.0.” We replace the RFC-required version number with a nonexistent one (e.g., “99.9”), as in the following example:

    OPTIONS sip:128.2.181.221 SIP/99.9
    Via: SIP/99.9/UDP 155.98.39.84;branch=z9hG4bKhjhs8as877…

While the RFC does not define how the SIP stack should react to invalid versions, the most appropriate status code is “505 Version Not Supported.” However, we find that among the 20 types of SIP devices we probed, two of them respond with “505 Version Not Supported,” while the others either process the request as if it were valid, do not respond, or respond with some different status code. We list all those responses later.

Invalid Via Address: The Via header field contains the address where the OPTIONS response should be sent. Instead of using the correct IP address, we fill the Via header field with the text “localhost.” This tells the probed SIP device to send the response back to itself rather than the person issuing the probe. Upon receiving such requests, again, different SIP implementations react differently.

Incorrect Content-Length: Content-Length is an optional header field that indicates the size of the message body in decimal number of octets. When no body is present in a message, the Content-Length field value must be set to zero. In our “Incorrect Content-Length” probe, we add a nonzero Content-Length value to an OPTIONS request that does not have a body.

Malformed CSeq: A well-formed CSeq header field in an OPTIONS request contains both a single decimal sequence number and the request method keyword “OPTIONS.” However, we remove the “OPTIONS” keyword and only include the sequence number in our probe.

Missing Call-ID: Call-ID is a mandatory field in an RFC-compliant SIP request. Although the RFC requires that user agents respond to requests without the Call-ID field with “400 Bad Request,” different SIP implementations behave differently.

Incompatible Transport Protocol: An RFC-compliant Via header field must include the transport protocol by which the request is sent. The transport protocols supported by SIP are UDP, TCP, TLS, and SCTP. Our “Incompatible Transport Protocol” probe manipulates the Via field to introduce a mismatch between the claimed transport protocol and the one actually used. For example, we use UDP to send an OPTIONS request but claim to have used a TCP transport.

■ Figure 3. A SIPProbe workflow example. [Figure: the SIP fingerprinter sends seven requests to the SIP fingerprintee and matches the collected responses against the SIP fingerprint database.
    Request 1 (RFC-compliant): SIP/2.0 200 OK
    Request 2 (invalid version): no response
    Request 3 (invalid Via address): SIP/2.0 400 bad request
    Request 4 (incorrect Content-Length): SIP/2.0 400 PDU is incomplete
    Request 5 (malformed CSeq): no response
    Request 6 (missing Call-ID): SIP/2.0 400 bad or missing call-ID
    Request 7 (incompatible transport protocol): SIP/2.0 400 via transport inconsistent with actual transport
    Fingerprint match found: Cisco SIP proxy]

Allowed Methods

In addition to looking at status codes in the responses to active probing, we also extract values from the response headers as part of the SIP fingerprint to improve its identification capabilities.

The OPTIONS response message contains information regarding the methods that are supported by the component. Upon receiving a compliant OPTIONS request, a user agent is expected to respond with a list of supported SIP methods in the Allow field. An example of a SIP response with an Allow header can be found earlier. The values in the Allow list provide useful information for inference of the SIP stack because different user agents support different requests. Furthermore, the ordering of the supported requests is also different. However, some user agents may choose not to provide any information about the requests they support, for security purposes. For example, the OPTIONS response from a softphone called sipXphone [13] does not have an Allow field. Nevertheless, its response is valid; it does not mean that sipXphone does not support any method.

Other Information Leaks

Another source of information is the formatting of the SIP messages. For example, we have observed that different user agents order the fields in the INVITE and OPTIONS messages differently [14]. Furthermore, text in messages also leaks information. For example, the text included with the same status code may be different: for responses to invalid SIP versions, the MCI SIP proxy reports “400 Syntax Error In Start Line,” whereas the Cisco voice gateway reports “400 Bad Request - Malformed/Missing VIA OR CSEQ.”

We can also combine other types of fingerprinting with SIP fingerprints. For example, operating system (OS) fingerprints may be effective at identifying special hardware such as voice gateways or hardphones that often run on embedded OSs. We have incorporated OS fingerprinting into SIPProbe, using existing tools [15].
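The six header manipulations described under “Responses to Non-Compliant OPTIONS” can be sketched as small text transformations of one compliant request. This is an illustrative sketch under assumptions, not SIPProbe's template format: the function names, the `PROBES` table, the nonzero Content-Length value of 100, and the To/From URIs in `COMPLIANT` are all hypothetical choices made for the example.

```python
import re

# A compliant base request; the To/From URIs are assumed placeholders.
COMPLIANT = (
    "OPTIONS sip:128.2.181.221 SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 155.98.39.84;branch=z9hG4bKhjhs8as877\r\n"
    "CSeq: 1 OPTIONS\r\n"
    "Call-ID: a84b4c76e66710\r\n"
    "To: <sip:128.2.181.221>\r\n"
    "From: Anonymous <sip:anonymous@anonymous.invalid>;tag=1928301774\r\n"
    "Max-Forwards: 70\r\n"
    "\r\n"
)


def invalid_version(msg: str) -> str:
    # Replace SIP/2.0 with a nonexistent version in the request line and Via.
    return msg.replace("SIP/2.0", "SIP/99.9")


def invalid_via_address(msg: str) -> str:
    # Point the Via sent-by address at "localhost" instead of the prober.
    return re.sub(r"(?m)^(Via: SIP/2\.0/UDP )[^;\r\n]+", r"\g<1>localhost", msg)


def incorrect_content_length(msg: str, claimed: int = 100) -> str:
    # Claim a nonzero body length although the request has no body.
    head, sep, body = msg.partition("\r\n\r\n")
    return f"{head}\r\nContent-Length: {claimed}{sep}{body}"


def malformed_cseq(msg: str) -> str:
    # Keep the sequence number but drop the method keyword.
    return re.sub(r"(?m)^(CSeq: \d+) OPTIONS", r"\g<1>", msg)


def missing_call_id(msg: str) -> str:
    # Remove the mandatory Call-ID header entirely.
    return re.sub(r"(?m)^Call-ID:[^\r\n]*\r\n", "", msg)


def incompatible_transport(msg: str) -> str:
    # Claim TCP in Via; the caller is expected to send the probe over UDP.
    return msg.replace("SIP/2.0/UDP", "SIP/2.0/TCP")


# Probe name -> transformation, in the column order of the fingerprint.
PROBES = {
    "rfc_compliant": lambda m: m,
    "invalid_version": invalid_version,
    "incorrect_via_address": invalid_via_address,
    "incorrect_content_length": incorrect_content_length,
    "malformed_cseq": malformed_cseq,
    "missing_call_id": missing_call_id,
    "incorrect_transport": incompatible_transport,
}
```

Each transformation changes exactly one aspect of the request, so a difference in the response can be attributed to how the implementation handles that single deviation.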
SIP Fingerprints

In this section we describe our SIPProbe tool that implements fingerprinting, and report on the fingerprints obtained by probing 20 different SIP devices.

Component | RFC-compliant | Invalid version | Incorrect Via address | Incorrect Content-Length | Malformed CSeq | Missing Call-ID | Incorrect transport protocol
SIP servers [16]
Cisco Voice Gateway (Cisco-SIPGateway/IOS-12.x) | 200 | 400 | 200 | NR | NR | 400 | 400
Cisco Voice Gateway | 400 | 400 | 400 | NR | NR | 400 | 400
Cisco SIP proxy | NR | NR | 400 | 400 | NR | 400 | 400
SIP Express Router Proxy (iptel.org 0.0.0udpfifo i386/linux) | 404 | NR | 404 | 404 | NR | 404 | NR
Microappliances SIP Proxy (zdots.com MA-1000-2.1) | 403 | 400 | 403 | 403 | NR | NR | 400
3Com SIP Proxy (siphappens.com) | 405 | 405 | NR | 405 | NR | NR | NR
MCI SIP Proxy (sipaccount.mci.com) | 302 | 400 | NR | 302 | NR | NR | 400
Hardphones
Cisco Phone (cisco.com) | 200 | NR | 200 | 400 | NR | 400 | NR
Pingtel Phone (pingtel.com) | 200 | 505 | 200 | 200 | NR | NR | NR
Softphones
WinSip (touchstone-inc.com) | 200 | NR | 200 | 200 | NR | 481 | 481
SJPhone (sjlabs.com) | 405 | NR | 405 | NR | NR | NR | NR
KPhone (wirlab.net) | 200 | 200 | NR | 200 | 200 | 200 | NR
LinPhone (linphone.org) | 200 | 200 | 200 | 200 | NR | NR | NR
Express Talk (nch.com.au) | 200 | 200 | 200 | 200 | NR | 200 | 200
sipXphone (sipfoundry.org) | 200 | 505 | 200 | 200 | NR | NR | 200
Adore Softphone (adoresoftphone.com) | 200 | 481 | NR | 400 | NR | 400 | NR
Yate (yate.null.ro) | 501 | NR | NR | 501 | NR | 501 | 501
eyeBeam (counterpath.com) | 200 | 200 | 200 | 200 | 405 | NR | 200
Phoner (phoner.de) | 200 | 200 | 200 | 200 | NR | NR | 200
Sipps (nero.com) | 200 | 200 | 200 | 200 | 400 | NR | NR

■ Table 1. Fingerprints of various SIP implementations. The three-digit number (e.g., 200) is the response code to a SIP query, and NR denotes no response.
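A fingerprint as listed in Table 1 is simply an ordered tuple of seven status codes, so a lookup can be sketched with a dictionary. The snippet below is a hedged illustration using four rows transcribed from Table 1; the names `PROBE_ORDER`, `FINGERPRINTS`, `identify`, and `candidates` are hypothetical and do not describe SIPProbe's actual database format.

```python
# Column order of Table 1.
PROBE_ORDER = (
    "rfc_compliant", "invalid_version", "incorrect_via_address",
    "incorrect_content_length", "malformed_cseq", "missing_call_id",
    "incorrect_transport",
)

# Four rows transcribed from Table 1 ("NR" = no response).
FINGERPRINTS = {
    ("200", "400", "200", "NR", "NR", "400", "400"):
        "Cisco Voice Gateway (Cisco-SIPGateway/IOS-12.x)",
    ("NR", "NR", "400", "400", "NR", "400", "400"): "Cisco SIP proxy",
    ("200", "505", "200", "200", "NR", "NR", "NR"): "Pingtel Phone",
    ("200", "505", "200", "200", "NR", "NR", "200"): "sipXphone",
}


def identify(observed) -> str:
    """Exact match of a full seven-probe fingerprint."""
    return FINGERPRINTS.get(tuple(observed), "unknown")


def candidates(partial: dict) -> list:
    """Devices consistent with the probes answered so far."""
    matches = []
    for fingerprint, name in FINGERPRINTS.items():
        if all(fingerprint[PROBE_ORDER.index(probe)] == code
               for probe, code in partial.items()):
            matches.append(name)
    return matches
```

With these rows, `candidates({"invalid_version": "505"})` still leaves two devices, and adding the incorrect-transport result narrows the set to one, which mirrors how a small number of probes can suffice.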
SIPProbe Tool

Leveraging the SIP fingerprinting technique, we develop a tool called SIPProbe to identify remote user agents. SIPProbe reads the formats of the manipulated SIP requests from a template file, probes the given remote SIP user agent with the request messages, and then parses the responses and matches them against a SIP fingerprint database (see details next) to identify the SIP component. Besides being used as an active fingerprinter, it can also be configured to passively sniff SIP messages from remote user agents and send back probes to fingerprint them.

Figure 3 shows a SIPProbe workflow example. In this example, SIPProbe sends seven SIP OPTIONS requests as described earlier to the fingerprintee SIP device. The fingerprintee returns “200 OK” only to the RFC-compliant OPTIONS, and returns “400” with different error messages such as “Bad Request” and “PDU is incomplete” to OPTIONS requests with invalid Via address, incorrect Content-Length, missing Call-ID, and incorrect transport protocol. Also, it does not respond to OPTIONS requests with invalid versions or malformed CSeqs. After collecting the probing results (fingerprints), SIPProbe queries the SIP fingerprint database with the fingerprints of the SIP device. The fingerprint database finds a match and reports the fingerprintee to be a Cisco SIP proxy.

Further optimizations to make fingerprinting more efficient may also be useful. For example, we can combine multiple probes into a single probe, rather than sending the individual probes listed above. We plan to add this as part of future work.

■ Figure 4. SIP fingerprint decision tree for 20 different implementations. [Figure: decision tree whose internal nodes are the probes Invalid_SIP_Version, Incorrect_Via_Address, Malformed_CSeq, Missing_call-ID, and Incorrect_transport, whose edges are labeled with the observed responses (e.g., SIP/2.0_200, SIP/2.0_400, SIP/99.9_400, SIP/2.0_505, NR), and whose leaves are the individual implementations.]
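The probe-and-record loop of the workflow in Figure 3 might look roughly like the following Python sketch. It is not the SIPProbe source. The function names are hypothetical, the 2-second timeout is an arbitrary assumption, and 5060 is used as the default because it is the standard SIP port. A timeout or socket error is recorded as "NR", matching the notation of Table 1.

```python
import socket


def status_code(datagram: bytes) -> str:
    """Return the three-digit status code of a SIP response, else 'NR'."""
    first = datagram.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    parts = first.split(" ")
    if len(parts) >= 2 and parts[0].startswith("SIP/") and parts[1].isdigit():
        return parts[1]
    return "NR"


def send_probe(message: str, host: str, port: int = 5060,
               timeout: float = 2.0) -> str:
    """Send one probe over UDP; return its status code or 'NR'."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(message.encode("ascii"), (host, port))
            data, _ = sock.recvfrom(65535)
        except OSError:
            # Covers timeouts as well as network errors.
            return "NR"
    return status_code(data)


def collect_fingerprint(probes, host: str, port: int = 5060) -> tuple:
    """Send the probes in order and return the tuple of status codes."""
    return tuple(send_probe(p, host, port) for p in probes)
```

Note that a probe whose Via address has been rewritten would direct the response away from the prober, so a real tool would need to account for that case separately; this sketch does not.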
Fingerprint Database

We use SIPProbe to collect fingerprints of 20 different SIP components, including voice gateways, SIP proxies, softphones, and hardphones. The SIP infrastructure servers we probe are chosen based on access and availability: we have access to an operational Cisco-based deployment with voice gateways and proxies. The remaining SIP proxies are public SIP servers that are available for testing purposes [16].

Our reasoning for probing the 13 phones is as follows. We use Cisco and Pingtel hardphones for fingerprinting as they are mature products that are widely deployed in enterprises. The 11 softphones, all of which support SIP, are selected from a VoIP softphone list published on the VoIP informational Wiki [5]. These are popular softphones that run on a variety of OS platforms and have diverse features, cost, and vendors. Among the 11 softphones, four (Adore Softphone, Express Talk, Sipps, and WinSip) only run on Windows, two (KPhone and LinPhone) are for Linux, three (Phoner, sipXphone, and Yate) run on both Windows and Linux, and the other two (eyeBeam and SJPhone) have Windows, Mac OS X, and Pocket PC versions. The features supported by those softphones also vary across a large range. For example, Express Talk has the ability to put calls on hold and do call transfer, Adore Softphone and eyeBeam can be used as video phones, eyeBeam and KPhone support instant messaging, and WinSip is more of a bulk call generator and testing tool. Some of the phones, such as LinPhone, KPhone, and sipXphone, are open source; some, such as Phoner, are freeware; and others are commercial products. While most of the softphone vendors are U.S.-based, we also pick softphones based outside the United States. For example, we use Adore Softphone, Yate, and Phoner, based in India, Romania, and Germany, respectively.

The fingerprints are listed in Table 1, with each row corresponding to a component. The columns denote the seven manipulations described previously and the observed status codes to each manipulation. For example, the Cisco voice gateway listed in the first row returns a “200 OK” response to RFC-compliant and incorrect Via address OPTIONS requests, but returns a “400” to invalid version, missing Call-ID, and incorrect transport protocol requests. In addition, it does not respond to OPTIONS messages with incorrect Content-Length or malformed CSeq. Each SIP component has its own distinct fingerprint. Even similar components from the same vendor, such as the two different versions of Cisco voice gateways, have significantly different fingerprints. This indicates that they are not implemented using the same SIP stack. Furthermore, none of the other Cisco components, such as the Cisco SIP proxy or hardphone, have the exact same fingerprint.

To better understand which probes are effective at identifying distinct components, we run the ID3 decision tree algorithm [17] on the fingerprint results. Probes and probe responses are treated as features and attributes of the corresponding SIP component. Given a set of attributes (fingerprints), the decision tree predicts the identity of the user agent and minimizes the number of attributes that need to be used for the classification. Figure 4 depicts the resulting decision tree. For example, to identify the Pingtel phone, we can look at the responses to invalid SIP version (505) and incorrect transport protocol (no response). We find that only five features are required to uniquely identify the 20 SIP components: invalid SIP version, incorrect Via address, incorrect transport protocol, malformed CSeq, and missing Call-ID. The remaining features (RFC-compliant and incorrect Content-Length) do not improve the decision. The most distinguishing feature is the response to invalid SIP version probes, which alone can identify three different components: Microappliances SIP proxy, 3Com SIP proxy, and Adore Softphone. Two softphones, Phoner and Express Talk, are the most difficult to identify, requiring at least four probes: invalid version, malformed CSeq, incorrect transport, and missing Call-ID.

Next, we present the fingerprinting results for the Allow field. Table 2 lists the Allow fields of various user agents. Of the 20 fingerprinted devices, 13 responded with the Allow field. For example, the Cisco voice gateway listed in the first row supports 13 different methods, whereas the Cisco phone supports only eight. Note that the ordering of the supported methods is also different. The first Cisco voice gateway lists INVITE as the first method, but the Cisco phone lists OPTIONS as the first method. For those user agents that populate their Allow fields, the methods together with their ordering are incorporated as part of their SIP fingerprints. By combining the status codes from the probes with the Allow field results, we can efficiently distinguish between the 20 SIP components using only one probe: a compliant OPTIONS request.

We note that as SIP devices become more mature, they will likely support more and potentially all methods. Therefore, all devices will return similar information in the Allow field, and the amount of information leak from this field will decrease over time. However, the ordering of methods may still leak information.
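The ID3 step can be illustrated with a compact Python sketch that picks, at each node, the probe with the highest information gain over the status codes of Table 1. This is an illustration rather than the authors' analysis: the function and variable names are hypothetical, and the input is limited to the bare status codes in Table 1, whereas Figure 4 also distinguishes details such as the version string in the status line. The tree it builds therefore need not match Figure 4.

```python
import math
from collections import Counter, defaultdict

PROBES = ["compliant", "version", "via", "content_length",
          "cseq", "call_id", "transport"]

# Status codes transcribed from Table 1 (NR = no response), in PROBES order.
TABLE1 = {
    "Cisco Voice Gateway (IOS-12.x)": "200 400 200 NR NR 400 400",
    "Cisco Voice Gateway": "400 400 400 NR NR 400 400",
    "Cisco SIP proxy": "NR NR 400 400 NR 400 400",
    "SIP Express Router": "404 NR 404 404 NR 404 NR",
    "Microappliances SIP Proxy": "403 400 403 403 NR NR 400",
    "3Com SIP Proxy": "405 405 NR 405 NR NR NR",
    "MCI SIP Proxy": "302 400 NR 302 NR NR 400",
    "Cisco Phone": "200 NR 200 400 NR 400 NR",
    "Pingtel Phone": "200 505 200 200 NR NR NR",
    "WinSip": "200 NR 200 200 NR 481 481",
    "SJPhone": "405 NR 405 NR NR NR NR",
    "KPhone": "200 200 NR 200 200 200 NR",
    "LinPhone": "200 200 200 200 NR NR NR",
    "Express Talk": "200 200 200 200 NR 200 200",
    "sipXphone": "200 505 200 200 NR NR 200",
    "Adore Softphone": "200 481 NR 400 NR 400 NR",
    "Yate": "501 NR NR 501 NR 501 501",
    "eyeBeam": "200 200 200 200 405 NR 200",
    "Phoner": "200 200 200 200 NR NR 200",
    "Sipps": "200 200 200 200 400 NR NR",
}
ROWS = [(name, dict(zip(PROBES, codes.split())))
        for name, codes in TABLE1.items()]


def entropy(rows):
    """Shannon entropy of the device labels in rows."""
    counts = Counter(name for name, _ in rows)
    return -sum(c / len(rows) * math.log2(c / len(rows))
                for c in counts.values())


def split(rows, probe):
    """Group rows by the status code observed for one probe."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[1][probe]].append(row)
    return groups


def gain(rows, probe):
    """Information gain of splitting rows on one probe."""
    rest = sum(len(g) / len(rows) * entropy(g)
               for g in split(rows, probe).values())
    return entropy(rows) - rest


def build(rows, probes):
    """Return a leaf (list of names) or a (probe, branches) node."""
    names = {name for name, _ in rows}
    if len(names) == 1 or not probes:
        return sorted(names)
    best = max(probes, key=lambda p: gain(rows, p))
    remaining = [p for p in probes if p != best]
    return (best, {code: build(group, remaining)
                   for code, group in split(rows, best).items()})


def classify(tree, fingerprint):
    """Walk the tree with a {probe: code} fingerprint; [] if no branch fits."""
    while isinstance(tree, tuple):
        probe, branches = tree
        tree = branches.get(fingerprint[probe], [])
    return tree
```

Calling `build(ROWS, PROBES)` yields a nested structure that `classify` can walk with an observed fingerprint; because every row of Table 1 is distinct, each device ends in its own leaf.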
| Category | Component | Allow field in OPTIONS response |
|---|---|---|
| SIP servers | Cisco Voice Gateway (Cisco-SIPGateway/IOS-12.x) | INVITE, OPTIONS, BYE, CANCEL, ACK, PRACK, COMET, REFER, SUBSCRIBE, NOTIFY, INFO, UPDATE, REGISTER |
| SIP servers | Cisco Voice Gateway | N/A |
| SIP servers | Cisco SIP proxy | N/A |
| SIP servers | SIP Express Router Proxy (iptel.org 0.0.0udpfifo i386/linux) | N/A |
| SIP servers | Microappliances SIP Proxy (zdots.com MA-1000-2.1) | N/A |
| SIP servers | 3Com SIP Proxy (siphappens.com) | N/A |
| SIP servers | MCI SIP Proxy (sipaccount.mci.com) | N/A |
| Hardphones | Cisco Phone (cisco.com) | OPTIONS, INVITE, BYE, CANCEL, REGISTER, ACK, NOTIFY, REFER |
| Hardphones | Pingtel Phone (pingtel.com) | INVITE, ACK, CANCEL, BYE, REFER, OPTIONS, NOTIFY, REGISTER, SUBSCRIBE |
| Softphones | WinSip (touchstone-inc.com) | INVITE, ACK, BYE, CANCEL, OPTIONS, MESSAGE, INFO |
| Softphones | SJPhone (sjlabs.com) | INVITE, ACK, CANCEL, BYE, REFER, NOTIFY |
| Softphones | KPhone (wirlab.net) | INVITE, OPTIONS, ACK, BYE, MSG, CANCEL, MESSAGE, SUBSCRIBE, NOTIFY, INFO, REFER |
| Softphones | LinPhone (linphone.org) | INVITE, ACK, OPTIONS, CANCEL, BYE, SUBSCRIBE, NOTIFY, MESSAGE, INFO |
| Softphones | Express Talk (nch.com.au) | INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY |
| Softphones | sipXphone (sipfoundry.org) | N/A |
| Softphones | Adore Softphone (adoresoftphone.com) | INVITE, BYE, OPTIONS, MESSAGE, ACK, CANCEL, NOTIFY, SUBSCRIBE, INFO, REFER |
| Softphones | Yate (yate.null.ro) | ACK, INVITE, BYE, CANCEL |
| Softphones | eyebeam (counterpath.com) | INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY, MESSAGE, SUBSCRIBE, INFO |
| Softphones | Phoner (phoner.de) | INVITE, ACK, CANCEL, BYE, NOTIFY |
| Softphones | Sipps (nero.com) | INVITE, ACK, CANCEL, BYE, REFER, OPTIONS, NOTIFY, INFO |

Table 2. “Allow” fields of various SIP implementations. N/A indicates that the “Allow” field is not included in the response.

Summary
In this article we examine the security concerns raised by explicit targeting of SIP implementations with known vulnerabilities that lead to unauthorized access, worms, viruses, and denial of service. In particular, we ask whether it is feasible to identify SIP devices without using the User-Agent field, which may be unavailable for security and privacy reasons. We find that SIP implementations contain information leak vulnerabilities that can be used to identify SIP stacks. We introduce SIP fingerprinting techniques that craft SIP messages to exploit variations in implementation and elicit different responses from each device, and we record those responses as the device’s fingerprint. We use both compliant and non-compliant SIP messages as probes. To demonstrate the viability of SIP fingerprinting, we used our SIPProbe tool to probe 20 different devices, including infrastructure components such as hardware-based voice gateways and SIP proxies, and a multitude of hardphones and softphones. Each device has a unique fingerprint and can be uniquely identified using a few probes.

Furthermore, we discuss other potential sources of information leaks that can also be used for fingerprinting. For example, we analyze headers in the response for additional information and combine SIP fingerprints with other forms of fingerprints such as operating system fingerprinting.

We raise the issue that information leak vulnerabilities are a potential security concern. SIP implementers may need to consider how to contain such leaks. For example, it may be possible to obscure the software version from fingerprinting by not responding to non-compliant messages or by responding with better matching response codes. Randomizing the order of the fields, and of the attributes within the fields, may also help hide information.

In this article we present the information leaks obtained through SIP fingerprinting as a security vulnerability. However, such information may have useful applications that enhance, rather than harm, the security and quality of communications. One useful application of SIP fingerprinting is classification of incoming SIP messages to provide different levels of security or quality of service. Many interesting applications can be built on such classification. For example, messages from voice gateways that are considered to have high cost may be given higher priority than messages from softphones. In addition, messages and content may be customized for specific user agents: one may send a high-quality color image to a hardphone with a color display and a scaled-down grayscale image to a hardphone with a low-resolution display. Lastly, messages suspected of being malicious may be processed more slowly or required to undergo more extensive security tests such as authentication, reverse Turing tests, or more detailed fingerprinting.

Fingerprinting also has uses in system management and diagnosis.
Debugging tools like SIP traceroute may be augmented with more detailed information about the SIP stack along the call path to help isolate problematic components or verify that the call traverses the right set of components. If a particular user agent is causing problems, we can use fingerprints to identify its version, filter its traffic, and alert other users of its presence. Similarly, fingerprints can also identify and alert users of software versions that are not interoperable. Lastly, fingerprinting may be used as a measurement tool to survey user agents that are deployed in the network.
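One of the countermeasures suggested in the summary, randomizing the order of attributes within a field, can be sketched for the Allow field as follows. This is an illustration under stated assumptions rather than a recommended implementation: `randomized_allow`, `ordered_fp`, and `set_fp` are hypothetical helper names, and the method list is the Cisco phone row of Table 2.

```python
import random

# Method list copied from the Cisco phone row of Table 2.
CISCO_PHONE_ALLOW = ["OPTIONS", "INVITE", "BYE", "CANCEL",
                     "REGISTER", "ACK", "NOTIFY", "REFER"]


def randomized_allow(methods, rng=None):
    """Build an Allow header line with the methods in a random order."""
    if rng is None:
        rng = random.Random()
    shuffled = list(methods)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return "Allow: " + ", ".join(shuffled)


def ordered_fp(header_line):
    """Order-sensitive fingerprint: the methods as a tuple."""
    value = header_line.split(":", 1)[1]
    return tuple(m.strip() for m in value.split(","))


def set_fp(header_line):
    """Order-insensitive fingerprint: the methods as a set."""
    return frozenset(ordered_fp(header_line))


header = randomized_allow(CISCO_PHONE_ALLOW)
# The ordering varies between responses, but the method set does not.
print(set_fp(header) == frozenset(CISCO_PHONE_ALLOW))  # True
```

The sketch also shows the limit of this countermeasure: shuffling removes the ordering signal, but the set of supported methods is unchanged, so an order-insensitive comparison can still separate devices until implementations converge on the same methods.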
References
[1] Vonage, http://www.vonage.com
[2] Skype, http://www.skype.com
[3] N. Willing, “VOIP Subscriber Numbers Soar,” http://www.lightreading.com, July 2005.
[4] J. Rosenberg et al., “SIP: Session Initiation Protocol,” IETF RFC 3261, June 2002.
[5] VOIP Phones, http://www.voip-info.org/wiki/view/VOIP+Phones
[6] VOIP PBX and Servers, http://www.voip-info.org/wiki-VOIP+PBX+and+Servers
[7] @stake Inc., “Multiple Vulnerabilities with Pingtel xpressa SIP Phones,” Security Advisory, http://www.atstake.com, July 2002.
[8] C. Wieser, M. Laakso, and H. Schulzrinne, “Security Testing of SIP Implementations,” Tech. rep., Dept. of Comp. Sci., Columbia Univ., 2003.
[9] CERT/CC, CERT Advisory CA-2003-06, “Multiple Vulnerabilities in Implementations of the Session Initiation Protocol (SIP),” http://www.cert.org/advisories/CA-2003-06.html, 2003.
[10] IEC H.323, Web proforum tutorial.
[11] M. Arango et al., “Media Gateway Control Protocol (MGCP) Version 1.0,” RFC 2705, Oct. 1999.
[12] CounterPath, http://www.counterpath.com/
[13] SIPfoundry, http://www.sipfoundry.org/
[14] H. Yan et al., “Incorporating Active Fingerprinting into SPIT Prevention Systems,” 3rd Wksp. Securing Voice over IP, June 2006.
[15] Fyodor, “Nmap Version Scanning,” http://www.insecure.org/nmap/versionscan.html, Sept. 2003.
[16] H. Schulzrinne, “Listing of Public SIP Servers,” http://www.cs.columbia.edu/sip/servers.html
[17] T. Mitchell, Machine Learning, McGraw-Hill and MIT Press, 1997.

IEEE Network • September/October 2006
Biographies
HONG YAN ([email protected]) is a Ph.D. student in the Computer Science Department at Carnegie Mellon University. His research interests include VoIP security, Internet spam detection, and network control systems. He received his B.S. in computer science from Tsinghua University in 2000 and enjoyed working for a network management startup from 2001 to 2003.

KUNWADEE SRIPANIDKULCHAI is a researcher at the National Electronics and Computer Technology Center (NECTEC), Thailand. Her research interests are in the areas of peer-to-peer systems, overlay networks, SIP/VoIP, and distributed systems management. She was a research staff member at the IBM T. J. Watson Research Center in 2004–2005. She received her B.S. from Cornell University in 1997, and her M.S. and Ph.D. in electrical and computer engineering from Carnegie Mellon University in 1999 and 2005, respectively.

HUI ZHANG ([email protected]) is a professor in the Computer Science Department at the School of Computer Science at Carnegie Mellon University. He won the National Science Foundation CAREER Award in 1996 and the Alfred Sloan Fellowship in 2000. He received his Ph.D. degree from the Computer Science Department of the University of California at Berkeley. During 2000–2003, he was the chief technical officer of Turin Networks. He was elected a fellow of ACM in 2005.

ZON-YIN SHAE ([email protected]) is with the IBM T. J. Watson Research Center, Yorktown Heights, New York. He works in the areas of multimedia networking, SIP/VoIP converged networks, multimedia traffic, and data analysis. He holds numerous patents and was an active member of the H.323 and MPEG international standards groups. He received his B.A. and M.S. in electronic engineering from National Chiao-Tung University, Taiwan, and his Ph.D. in electrical engineering from the University of Pennsylvania at Philadelphia.

DEBANJAN SAHA ([email protected]) is currently at IBM T. J. Watson Research Center, where he is a senior manager responsible for research in the area of service delivery infrastructure. At IBM and Bell Labs he has played a leadership role in a number of projects focusing on the design and development of network software and services. He also spent a number of years at Tellium, an optical networking startup that he helped go public in 2001. He is one of the first developers of MPLS and a principal author of the GMPLS standards in the IETF. He has authored numerous papers on different topics in networking and is a co-recipient of the IEEE Communications Society’s William R. Bennett award in 2003 and Fred W. Ellersick award in 2004. He is also a co-author of a book, Optical Network Control: Architecture, Protocols, and Standards. He received a B.Tech. degree in computer science and engineering from the Indian Institute of Technology in 1990, and M.S. and Ph.D. degrees in computer science from the University of Maryland at College Park in 1992 and 1995, respectively.