DNS Authentication as a Service: Preventing Amplification Attacks ∗
†
Amir Herzberg
Department of Computer Science Bar-Ilan University Ramat-Gan, Israel
Fachbereich Informatik Technische Universität Darmstadt Darmstadt, Germany
[email protected]
[email protected]
We present the first defence against DNS-amplification DoS attacks that is compatible with common DNS server configurations and with the important DNSSEC standard. We show that the proposed DNS-authentication system is efficient, and effectively prevents DNS-based amplification DoS attacks that abuse DNS name servers. We present a game-theoretic model and analysis, predicting a widespread adoption of our design, sufficient to reduce the threat of DNS amplification DoS attacks. To further reduce costs and provide additional defences for DNS servers, we show how to deploy our design as a cloud-based service.
Keywords Denial of service attacks, DNS Amplification, DNS Reflection, Source Authentication, DNS Authentication
INTRODUCTION
Amplification Denial of Service (DoS) attacks pose a major threat to the stability of the Internet, to organisations and to critical infrastructures. In the last decade the Internet has experienced a wave of DoS attacks threatening its welfare and stability. For example, consider the recent DoS attacks against Spamhaus (see [12]); Spamhaus contracted the Cloudflare services to resist the attack, to which the attackers responded with even more traffic, clogging the Internet Exchange Points (IXPs) serving Cloudflare (and many other customers). The most prevalent type of such attacks is reflection via different application protocols, particularly those that provide amplification. The Domain Name System (DNS) has a long history of abuse in amplification DoS attacks that flood the victim networks and services, and it is one of the most popular amplifiers. A relatively small DNS request, e.g., up to 100 bytes, results in a much larger DNS response. We performed a study of typical DNS response sizes of Top Level Domains (TLDs) and top-million Alexa domains (www.alexa.com), signed and non-signed; the results are shown in Figure 1. As can be seen, DNS responses signed with DNSSEC result in much larger packets than the traditional (unsigned) responses, and even signed 'non-existing domain (NXD)' responses often exceed the maximal transmission unit (MTU). In the course of a reflection attack, the attacker sends multiple DNS requests (possibly from spoofed source IP addresses) to DNS servers, either via recursive resolvers or directly to the name servers, which elicit much larger DNS answers in response. As a result, the victim is flooded with multiple (large) responses. These attacks also have a detrimental impact on the bandwidth and performance of the DNS name servers and recursive resolvers that are abused as a vector in the amplification DoS attack, due to the excess processing of the numerous (malicious) requests and the generation (or forwarding) of the significantly larger responses.
∗ Part of this research was conducted while the first author was visiting Technische Universität Darmstadt and CISPA, Saarland University. sites.google.com/site/amirherzberg/
† sites.google.com/site/hayashulman/
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ACSAC '14, Dec. 8-12, 2014, New Orleans, LA, USA. Copyright 2014 ACM 978-1-4503-3005-3/14/12 ...$15.00. http://dx.doi.org/10.1145/2664243.2664281.
[Figure 1 legend: DNSKEY (TLDs); NXD signed (TLDs); NXD (TLDs); ANY (TLDs); ANY signed (TLDs); ANY signed (Alexa).]
ABSTRACT
1.
Haya Shulman
[Figure 1 plot: y-axis Domains (%), 0 to 100; x-axis Response Size (bytes), 0 to 5000.]
Figure 1: Length of responses for signed and non-signed Alexa domains and TLDs, for ANY, DNSKEY and A resource records; A requests were sent for random subdomains of the tested domains, and resulted in NXD responses.
In this work we focus on the defence mechanisms that the DNS servers¹ can deploy to foil attempts of the attackers to abuse them as a vector in amplification DoS attacks. Such a defence would not only improve the performance of the DNS servers and reduce the latency for their clients, but would also (indirectly) benefit multiple Internet networks and services, by making it more difficult for attackers to find suitable amplifiers to flood the victims.
Rate limiting is a common defence used to protect the DNS servers from being abused in amplification DoS attacks: it limits the number of requests that a name server or an open recursive resolver (such as Google Public DNS) responds to, and the additional requests, which exceed the imposed threshold, are discarded. However, rate limiting is not an effective defence. Attackers typically circumvent it by distributing the attack among multiple DNS servers, sending a lower volume of requests to each, and thus still achieve the required amplification against a victim network. The attackers also often distribute the attack between multiple compromised hosts that they control, and typically employ IP address spoofing, both to evade detection and to disguise the source of the attack, making it difficult to block. Indeed, amplification DoS attacks are frequently launched by exploiting name servers and (typically open) resolvers.
¹ In this work, we use 'DNS server' to refer both to name servers and to recursive resolvers, since both are frequently abused as a vector in amplification DoS attacks.
Related Work
The challenge-response schemes proposed in prior art [4, 16] are designed to enable the name servers to filter the requests sent from spoofed IP addresses. Identification of spoofed requests typically relies upon standard DNS functionality of the resolvers, and requires support for such filtering on the name server side. Upon a DNS request, the challenge-response authentication schemes on the name servers [4, 16] send referral responses and encode a cookie via CNAME or NS type DNS records. The cookie is the source IP address that appeared in the request. The resolver is then expected to echo the challenge in the subsequent DNS request (asking for the corresponding CNAME or NS records). The protection mechanism verifies that the source IP address in the second request is the same as the one appearing in the first request, and if so, sends another referral response, redirecting the resolver to the real name server (which the resolver eventually asks for the records that it originally needed to obtain).
However, none of the proposed schemes is deployed in practice, and currently the name servers are patched to support rate limiting, i.e., reducing the number of requests that are answered, and ignoring sources that send too many requests. The attackers easily overcome that defence by distributing the requests among multiple hosts, such that each host sends only a few requests.
We show that challenge-response authentication via a random cookie does not constitute an effective defence against reflection amplification DoS attacks. In addition, these defences may often even 'disrupt' the DNS functionality for legitimate clients. We next discuss the problems and limitations of the defences proposed in prior art, and argue that these pose a significant obstacle to their deployment and make them ineffective for protection against attacks.
In Section 2 we report on our measurement results, which show that the defences proposed in prior art are not suitable for common DNS server configurations.
Signed DNS Zones. The challenge-response mechanisms for source authentication [4, 16] are incompatible with DNSSEC signed zones: when receiving a DNS request, the defence mechanism sends a freshly generated cookie (containing the source IP address of the requester) in the response. This requires signing the DNS record on-the-fly. However, most signed zones use offline signing, where the secret signing keys are not kept on the name server, hence the name server is not able to produce a valid signature. As a result, DNSSEC-enabled resolvers would not be able to validate the DNS responses from signed zones. Unfortunately, the signed zones are typically those that are exploited for DoS attacks, since they generate a very large amplification factor (due to cryptographic keys and signatures). Signed zones were already abused in the largest DNS amplification DoS attacks, although most zones do not support DNSSEC, and less than 1% of the top-million Alexa domains are signed [14]. In addition, DNSSEC still faces deployment challenges and suffers from impeded adoption [8]; this should change in the future, when the obstacles are resolved.
Identifying Standard Resolvers. Typical networks support a number of common variations for configuring recursive DNS resolvers, such that a number of different IP addresses participate in the resolution chain of a single DNS request. The challenge-response mechanisms authenticate the source IP address that sent the request, and implicitly assume that the chain of requests will be served by the same resolver throughout the resolution chain; in such configurations the authentication of benign resolvers would therefore fail. In Section 2, we provide evidence, based on our measurement study, showing that an increasing number of requests, in a single resolution chain, are served by multiple resolvers.
Limited to a Single Name Server. The mechanisms are limited to the protection of a single name server (since they generate a dedicated zone for each name server). In addition, the defences are not suitable for the protection of recursive resolvers (which are also often exploited for amplification DoS attacks).
Ineffective Against Advanced Attackers. The mechanisms provide protection only against (naive) attackers, whereas a sophisticated attacker can circumvent the defences. For instance, attackers that can spoof source IP addresses would request the cookie once, and then distribute it among all the attacking hosts, which would send requests from a spoofed source IP address that matches the one in the cookie.
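The cookie-replay weakness described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the cookie is an HMAC over the source IP address under a server-side secret; the schemes in [4, 16] may encode the cookie differently, and the names `make_cookie`, `verify_cookie` and `SERVER_SECRET` are hypothetical. How the attacker obtains the first cookie is outside the scope of the sketch.

```python
import hashlib
import hmac

SERVER_SECRET = b"illustrative-secret"  # assumption: key known only to the name server


def make_cookie(src_ip: str) -> str:
    """Cookie bound only to the source IP address seen in the first request."""
    return hmac.new(SERVER_SECRET, src_ip.encode(), hashlib.sha256).hexdigest()


def verify_cookie(src_ip: str, cookie: str) -> bool:
    """Accept a follow-up request if its cookie matches its claimed source IP."""
    return hmac.compare_digest(make_cookie(src_ip), cookie)


victim_ip = "198.51.100.7"  # documentation-range address standing in for the victim

# The attacker obtains one valid cookie for the victim's address, once.
leaked_cookie = make_cookie(victim_ip)

# Every attacking host then spoofs the victim's address and replays that cookie.
replayed_ok = [verify_cookie(victim_ip, leaked_cookie) for _bot in range(5)]

# The same cookie presented with a different claimed address is rejected.
other_ok = verify_cookie("203.0.113.9", leaked_cookie)
```

Because nothing in the cookie changes between requests, a single leaked value authenticates an unbounded number of spoofed requests in this model.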
Contributions
We propose a DNS authentication system for detecting and preventing amplification DoS attacks. The system effectively and efficiently detects and thwarts amplification DoS attacks that exploit DNS resolvers or name servers, whether the attacker employs IP spoofing or not. In contrast to other proposals, it is compatible with DNSSEC and with clusters of resolvers; it detects attacks that exploit IP spoofing, as well as attacks that employ 'zombie' hosts sending traffic from real IP addresses.
The idea behind our design is to construct a graph of legitimate resolvers. The connected components in that graph correspond to legitimate resolvers, while isolated nodes correspond to (potentially) attacking hosts. Our design also takes into account hosts that may become compromised over time, or new resolvers that may be installed; we accomplish this by using a challenge-response authentication based on the destination IP address to which the cookie is echoed. In particular, our defence system is allocated an address block, and each request needs to be echoed to the correct IP address within that block. We believe that this service could be provided by cloud operators, and propose the concept of DNS Authentication as a Service, together with a design for deployment of our system on cloud platforms. Adoption is a significant challenge to any new technology, and especially for a system whose main benefit is to the entire Internet (and not merely to a group of specific early adopters). We present a game-theoretic model and analysis of the clogging defences and attacks, which predicts rapid adoption and a significant reduction in DNS amplification attack traffic. Our model and analysis may be of value in other scenarios and for other systems.
2.
ILLUSION OF CHALLENGE-RESPONSE AUTHENTICATION
During the recent decade the DNS infrastructure has evolved into complex platforms with multiple resolving hosts. For instance, resolvers that serve multiple clients are organised in a resolver farm configuration. This enables the operators to balance the load of DNS requests among multiple physical machines, which often use a shared cache, e.g., similar to Google Public DNS. We present a number of common configurations and then show why source authentication fails in those settings.
2.1
Complex DNS Platforms
In this section we review common configurations of the resolvers and name servers.
2.1.1
Resolver-behind-Upstream
Multiple resolvers use upstream resolvers for their DNS resolution services, e.g., see [9, 7]. This scenario typically occurs when, e.g., Internet Service Providers (ISPs) apply load-balancing to DNS requests among multiple upstream resolvers. Using an upstream forwarder is also considered to provide additional benefits to resolvers, such as security and availability, and is thus an increasingly common configuration.
2.1.2
Resolver-behind-NAT
Resolvers can sometimes be located behind many-to-many Network Address Translation (NAT) devices. In this case, different DNS requests may be sent by the NAT device with different source IP addresses, from the pool of addresses available to the NAT device. Recent studies [3, 7, 1] observed that a large number of recursive DNS resolvers are located behind NAT devices, and often multiple resolvers reside behind the same NAT device. Furthermore, [11] found that 90% out of 20,000 DSL lines (from a major European ISP) were connected to the Internet via NAT devices.
2.1.3
Common Configurations of Name Servers
Recent works [14, 15] studied different aspects of the DNS name server infrastructure and found a new common configuration, the recursive authoritative name server (RANS), whereby a recursive resolver is registered as an authoritative name server (in the parent domain and in the zone file of the child domain). That (server-side) recursive resolver receives DNS requests for the target domain from the client-side resolvers, and then forwards the requests to the name server hosting the zone file for the target domain. Upon responses from the name server, the server-side resolver caches them and subsequently returns them to the requesting client-side resolver. The setting is illustrated in Figure 2. In addition, [15] found evidence that a large fraction of the resolvers in a RANS configuration support open recursion (hence called ORANSes), i.e., respond to any query from any client. Furthermore, [15] found that 42% of the ORANSes use a chain of recursive resolvers, consisting of at least two intermediate resolvers.
Figure 2: DNS resolution process for a domain supporting a RANS configuration.
2.2
Failure of Source Authentication
The challenge-response mechanisms, that we reviewed in Section 1, perform authentication of the source IP address that sent the request, and implicitly assume that a requests’ chain, initiated by some resolver, will be served by the same resolver throughout. However, this assumption is incorrect, and prior research, [13, 15], shows that an increasing number of requests in a single resolution chain are served by multiple resolvers. Therefore, mechanisms relying on source authentication are not only ineffective, but would also disrupt the DNS availability for benign clients.
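As a rough illustration of why a source-IP equality check fails for benign platforms, the following sketch computes the probability that two consecutive requests from the same platform carry the same source address. It assumes, hypothetically, that each request independently uses an address chosen uniformly from the platform's pool; real load balancers and NAT devices may behave differently, and the function name `pass_probability` is illustrative.

```python
from fractions import Fraction
from itertools import product


def pass_probability(pool: list[str]) -> Fraction:
    """Probability that two requests from the platform share a source address,
    assuming each request independently uses a uniformly chosen pool address."""
    pairs = list(product(pool, repeat=2))
    same = sum(1 for first, second in pairs if first == second)
    return Fraction(same, len(pairs))


# A stand-alone resolver with one address always passes the equality check.
single = pass_probability(["192.0.2.1"])

# A platform with four egress addresses passes only one time in four.
farm = pass_probability(["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"])
```

Under this assumption a platform with k egress addresses passes the check with probability 1/k, so larger benign platforms are rejected more often.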
2.2.1
Protecting Name Servers vs Protecting Zones
The proposed source authentication defences [4, 16] apply their processing to the traffic destined to a name server. The processing increases the latency for all DNS requests, whether they are legitimate and sent from real source IP addresses, or malicious and sent from spoofed source IP addresses as part of a reflection DoS attack on victim networks and services. All defences proposed in [4, 16] add at least one additional round trip time (RTT) of latency, and typically two RTTs; a typical RTT is between 50 and 150 ms, but can also be much larger.
As we showed above, it is common for multiple zones to be hosted on a single name server. However, not all zones are equally abused in reflection attacks, and thus not all may be willing to sacrifice latency for the offered protection against reflection attacks. For instance, zones signed with DNSSEC typically have larger responses (see Figure 1) and are thus more frequently abused in reflection DoS attacks. Such zones would benefit from the source authentication offered by the proposed mechanisms. However, other zones, e.g., those serving plain (unsigned and thus much shorter) DNS records, would not be willing to bear the extra latency. Another relevant issue is that defences applied per name server may not be suitable for the business model of many zones. A name server deploying such a defence may want to raise the fees charged to the zones that it hosts, but unsigned zones may not be willing to share the costs, since they may not be targets of abuse for reflection attacks. We believe that this motivates the inspection of defences on a per-zone, rather than per-name-server, basis.
2.2.2
Circumventing Source Authentication
Our findings show that 8% of the top 25K Alexa domains and 3.9% of the TLDs, out of those that we measured, use ORANSes (illustrated in Figure 2). In this case, the requests are sent to an open resolver (whose IP address is registered as the name server for the target domain), which is configured to forward the requests to the name server hosting the zone file. As a result, when the source authentication mechanism is hosted on the name server network, it can be trivially circumvented: the request is sent to the open resolver, which then forwards it to the name server from its real IP address, and thus source authentication succeeds. The problem is further exacerbated when the open recursive authoritative name server is based on a complex infrastructure, whereby one set of resolvers is responsible for receiving requests, and another set is responsible for querying the name servers, such as the infrastructure supported by Google DNS.
3.
DNS AUTHENTICATION
In this section we present the design of our DNS authentication system. We first introduce the Request-Authentication (Section 3.1), which filters requests sent from spoofed IP addresses, and detects amplification DoS attacks. We then design the Resolver-Authentication (Section 3.2), which identifies standard-compliant resolvers and maintains a list of potentially compromised hosts. The Resolver-Authentication mechanism identifies resolving clusters. Standard-conforming ('good') resolving clusters will be connected components in a graph, such that there will be edges between any two nodes in a component. In contrast, requests sent from spoofed IP addresses will correspond to isolated nodes in the graph, without any incoming edge; and compromised (malicious) hosts will only have incoming edges from themselves or from other compromised hosts.
3.1
Request Authentication
The Request-Authentication uses a challenge-response based authentication (Figure 3), which standard resolvers can conclude correctly. In contrast, the attackers cannot successfully conclude the authentication protocol, since they either lack complete (standard) DNS software and thus cannot respond to the challenge correctly, or send requests from spoofed IP addresses (of the victims) and thus cannot obtain the challenge returned by the challenger in step 1. The Request-Authentication mechanism (Figure 3) consists of the following modules: a challenger, a verifier and a controller.
Challenger. Upon receipt of a DNS request, the challenger returns a (referral type) DNS response, along with a challenge, which is an IP address, randomly selected from the address block allocated to the verifier; step 2, Figure 3. The response from the challenger causes a standard-conforming DNS resolver to issue a subsequent DNS request (step 3 in Figure 3) to the name server that resides on the challenge IP address, i.e., the IP address indicated in the additional section of the DNS response.
Verifier. The verifier is allocated a range of IP addresses a.b.c.d/x (in CIDR notation [RFC4632]). In Section 3.3.1 we propose a technique that artificially increases the range of IP addresses that can be allocated to the verifier. Upon a DNS request (step 4, Figure 3), the verifier confirms that the challenger received a matching request earlier, and that the current request arrives on the correct (challenge) IP address, selected by the challenger. The verifier also checks that the request originates from a standard resolver or a resolving cluster, and was not sent from a spoofed source IP address or from a compromised host.
Controller. The challenger and the verifier communicate with the controller. Upon a request, the challenger and the verifier send the request to the controller, which validates it, stores it in a database, and checks for attacks. In response to valid requests, the challenger receives a challenge from the controller, and the verifier receives an ACK/NAK (indicating whether to forward the request to the protected DNS server, or to discard it).
When the controller receives a request from the challenger, it adds an entry, corresponding to the DNS request, to table T. The entry maps between the query in the DNS request, the source IP address from which the request was sent, the challenge (i.e., the IP address to which the subsequent request should be issued), and the time at which the query arrived. When the controller receives a request from the verifier, it first confirms that there is a corresponding entry in table T, then that the request arrived on the correct challenge IP address (the one provided by the challenger in step 2), and finally that the request originated from a standard resolver and was not sent from a spoofed IP address (see details in Section 3.2); if so, the request is forwarded to the DNS server. When the DNS server returns a response, the verifier forwards it to the requester (after changing the source IP address to its own and the destination to the address of the requester). The controller uses a timer-based mechanism that evicts old entries from table T, τ seconds after arrival; τ is the maximal delay between steps 2 and 4 (Figure 3). The controller also triggers alerts on queries for which attacks are detected; details in Section 3.4. Queries with alerts can be, e.g., served over TCP.
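The bookkeeping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class and method names are hypothetical, table T is an in-memory dictionary rather than a database, the resolver check of Section 3.2 and the attack alerts of Section 3.4 are omitted, and the clock is injectable only so that the eviction logic can be exercised deterministically.

```python
import ipaddress
import random
import time


class Controller:
    """Illustrative controller keeping table T; names are hypothetical."""

    def __init__(self, verifier_block, tau, rng=None, clock=time.monotonic):
        self.network = ipaddress.ip_network(verifier_block)
        self.tau = tau  # maximal delay (seconds) between steps 2 and 4
        self.rng = rng if rng is not None else random.SystemRandom()
        self.clock = clock
        self.table = {}  # T: (query, challenge_ip) -> (src_ip, arrival_time)

    def on_challenger_request(self, query, src_ip):
        """Record a step-1 request and return a random challenge IP address."""
        index = self.rng.randrange(self.network.num_addresses)
        challenge_ip = str(self.network[index])
        self.table[(query, challenge_ip)] = (src_ip, self.clock())
        return challenge_ip

    def on_verifier_request(self, query, dst_ip):
        """Return True (ACK) if the request arrived on a pending challenge IP."""
        self._evict_expired()
        return (query, dst_ip) in self.table

    def _evict_expired(self):
        """Drop entries older than tau seconds."""
        now = self.clock()
        self.table = {
            key: (src, arrived)
            for key, (src, arrived) in self.table.items()
            if now - arrived <= self.tau
        }
```

For example, `Controller("1.2.3.0/24", tau=2.0)` would hand out challenges from the 256 addresses of the verifier block shown in Figure 3; a follow-up request is acknowledged only if it names the same query, arrives on the issued address, and does so within τ seconds.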
3.2
Resolver Authentication
Resolving platforms, whereby more than a single IP address is involved in the resolution of a query, are becoming dominant in the current Internet. In these cases, the request sent in step 3 (Figure 3) will originate from a source IP address different from the one used in the request in step 1.
[Figure 3 diagram: a resolver (8.8.8.8), the challenger (5.6.7.8), the verifier (1.2.3.0/24), the DNS Authentication Controller with its database (DB), and the name server NS.foo.com (7.7.7.7). The resolver sends a request for www.foo.com; the challenger answers with a referral (foo.com NS challenge-domain; challenge-domain A 1.2.3.x); the resolver sends the request to 1.2.3.x; the verifier relays it to NS.foo.com and returns the response. Controller, generate challenge: dst_ip ∈R [0,255]; T[q,dst_ip] = src_ip; if challenge-exhaust: alert & TCP. Controller, verify challenge: if T[q,dst_ip] and the request is from a resolver, relay to NS.foo.com; if challenge-guess: alert & TCP.]
Figure 3: DNS authentication mechanism, consisting of the challenger, the verifier and the controller.
We develop a methodology that allows us to discover IP addresses that belong to the same cluster of DNS resolvers. Furthermore, our methodology allows us to identify potentially compromised IP addresses. Our methodology proceeds in two phases: first it constructs a graph of resolvers; then it parses the graph (periodically) to identify the IP addresses that exhibit the behaviour of standard-compliant resolvers, and classifies which of those IP addresses are really resolvers, and which of them are potentially compromised hosts that emulate the behaviour of standard resolvers. The first phase runs continually and is applied to DNS requests arriving at the challenger and the verifier (in steps 1 and 2).
We apply our own challenge-response mechanism (Figure 3) to construct the resolvers graph. Other options are also possible for the construction of the clusters. For instance, one could use TCP for classifying new resolvers, such that IP addresses that successfully conclude the establishment of a TCP connection would be classified as legitimate. Unfortunately, many resolvers and clients lack support for TCP, and DNS resolution over TCP incurs significantly more failures than resolution over UDP [6, 10]; therefore, to avoid these interoperability problems, we use our own mechanism (which runs over UDP). Another option would be to assume a 'quiet' initialisation phase, without attacks, for the construction of the resolvers graph.
3.2.1
Constructing Resolvers Graph
We construct a graph G = (V, E) of resolvers, as follows: when a request arrives from source IP ip1, in step 1, we add a node to the graph, V = V ∪ {ip1}. When a request in step 3 arrives from source IP ip3, and a corresponding (matching) query was received in step 1, we add a node to the graph, V = V ∪ {ip3}, and we add a directed edge, E = E ∪ {(ip1, ip3)}, to the graph G; if ip1 = ip3, then we add a self edge, i.e., an edge from a node to itself. The nodes and edges are added permanently only if both steps 1 and 3 were completed correctly, namely, a query was sent in step 1 and was followed by a request to the correct challenge IP address in step 3. Otherwise, the node is not added to the graph.
We make the following observation: standard-conforming individual resolvers (with a single IP) will be mapped to nodes with only self edges, and standard-conforming resolving clusters will correspond to connected components in the graph. The distribution of DNS requests, among resolvers belonging to the same cluster, is uniform in the long run. Thus resolvers belonging to the same cluster will all participate, at some point in time, in sending requests in step 1 and in sending requests in step 3, and therefore there will be edges between any two nodes in a component; we elaborate on this in Section 3.2.2. In contrast, requests sent from spoofed IP addresses will correspond to isolated nodes in the graph, or to nodes without any incoming edges from legitimate resolvers, which have only outgoing edges or incoming edges from compromised hosts.
For instance, assume a resolver platform is composed of two IP addresses, 1.2.3.4 and 7.7.7.7; see Figure 4. The first request, sent in step 1 from 1.2.3.4 for query x, is answered with a referral. The subsequent request is sent, from 7.7.7.7, to the challenge IP address that appeared in the referral response. The authentication is successful, since a request from 1.2.3.4 was correctly followed by a request from 7.7.7.7, and both nodes are permanently added to the graph along with an edge from 1.2.3.4 to 7.7.7.7. If, at a later time, a second request, for query y, is sent from 7.7.7.7 and is followed by a request from 1.2.3.4 to the correct challenge IP address, so that the protocol is concluded correctly, an edge is added from 7.7.7.7 to 1.2.3.4. This results in a connected component consisting of two nodes, with edges from each one to the other.
[Figure 4 diagram, over time. Request 1 (query for record x): in step 1, the request for x arrives from 1.2.3.4 and node 1.2.3.4 is added temporarily; in step 3, the request for x arrives from 7.7.7.7 on the correct challenge IP, so 1.2.3.4 and 7.7.7.7 are added together with the edge 1.2.3.4 → 7.7.7.7. Request 2 (query for record y): in step 1, the request for y arrives from 7.7.7.7 (node 7.7.7.7 already exists); in step 3, the request for y arrives from 1.2.3.4 on the correct challenge IP, so the edge 7.7.7.7 → 1.2.3.4 is added.]
Figure 4: Construction of a cluster consisting of a connected component with two nodes, 1.2.3.4 and 7.7.7.7.
Once a graph is constructed, a number of known algorithms, e.g., Dijkstra, can be applied to identify connected components; the vertices of a connected component are then added to form a cluster. The output is a list of clusters, $\{C_i\}_{i=1}^{\infty}$, such that each cluster $C_i$ is a group of DNS resolvers, i.e., a set of IP addresses of resolvers.
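The graph construction and clustering can be sketched as below. This is a simplified illustration under stated assumptions: the function names are hypothetical, only request pairs that completed both steps 1 and 3 are given as input, edge direction is ignored when grouping nodes, and a plain depth-first traversal is used in place of the algorithm named in the text.

```python
from collections import defaultdict


def build_graph(completed):
    """completed: (ip1, ip3) pairs for which steps 1 and 3 both succeeded."""
    nodes, edges = set(), set()
    for ip1, ip3 in completed:
        nodes.update((ip1, ip3))
        edges.add((ip1, ip3))  # directed edge; a self edge when ip1 == ip3
    return nodes, edges


def clusters(nodes, edges):
    """Group nodes into connected components, ignoring edge direction."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, result = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # iterative depth-first traversal from `start`
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        result.append(component)
    return result


# The two-address platform of Figure 4 plus a single-address resolver.
completed = [("1.2.3.4", "7.7.7.7"), ("7.7.7.7", "1.2.3.4"), ("9.9.9.9", "9.9.9.9")]
nodes, edges = build_graph(completed)
found = clusters(nodes, edges)
```

A stricter variant could require edges in both directions between cluster members (strongly connected components), which matches the observation that benign clusters eventually exhibit edges between any two of their nodes.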
3.2.2
Uniform Order of IP Addresses in Requests
The scenarios we discussed in Section 1 show that there are two central settings whereby resolvers may use different IP addresses in consecutive requests. (1) Resolvers apply server selection algorithms when they need to decide which upstream DNS server to forward the request to, e.g., in the case of forwarders. (2) Resolvers can be assigned different IP addresses, e.g., if they are behind a NAT device. We show that, in the long run², resolvers' settings exhibit uniform order and distribution of the IP addresses of the resolvers in DNS requests.
Resolvers Behind NAT Devices. Resolvers behind a many-to-many NAT device will be assigned different available IP addresses each time they send a DNS request, and thus an observation period of a sufficiently large time interval will contain requests in steps 1 and 3 from different IP addresses. Due to load balancing algorithms, the same logic applies to resolvers that use multiple network interface cards, and to resolver pools.
DNS Server Selection. DNS resolvers apply different logic when selecting the DNS server to which they send a request. Some resolvers select a DNS server at random, e.g., WindowsDNS; others assign higher preference to the servers with the lowest latency, e.g., Bind (using the Round Trip Time (RTT) as a metric to compare the latency); but the goal of all server selection mechanisms is to distribute the load among the name servers. Prior art studied server selection behaviour with respect to unpredictability as a measure against cache poisoning attacks, e.g., [5, 17], and showed that even DNS resolvers that apply RTT-based (non-random) server selection still periodically, e.g., every other minute, probe all the servers, including those with high latency and those that are non-responsive. Server selection logic is applied when sending requests to name servers directly, or when forwarding requests via the upstream resolvers.
² The time period depends on the number of DNS resolvers (or alternately IP addresses assigned to the resolvers) and on the rate of requests from the resolvers to our mechanism.
3.3
Configuration and Deployment
The challenger can be configured to protect a DNS server in two ways: (1) updating the delegation NS records of the DNS server in the parent zone to point at the IP address of the challenger³, or (2) configuring the firewall (behind which the protected DNS server resides) to forward all UDP packets, destined to port 53, to the challenger's IP address. The latter option is easier to configure but, depending on the location of the challenger, may add slightly more latency to DNS responses. For security, the verifier should communicate with the protected DNS server over a secure channel protocol, e.g., IPsec [RFC6071] or SSL/TLS.
The firewall of the network that deploys the protection mechanism should be configured to allow only the verifier to communicate with the DNS server. The communication between the firewall and the challenger and verifier should be over a secure channel protocol.
³ Notice that this only applies to name servers; if the protected DNS server is a recursive resolver, a firewall configuration should be supported.
3.3.1
Allocating IP addresses
The verifier should be allocated a set of IP addresses, e.g., by configuring multiple network cards or via virtual IP mapping (allocating multiple IP addresses to the same network card). However, IP addresses are a scarce resource, and allocating multiple addresses to the verifier may not be possible for many networks, where spare addresses are in short supply. We show that allocating new addresses is not essential and that existing addresses can be reused for this purpose. The design is simple: all the IP addresses allocated to the hosts can be reused by the verifier, only for packets destined to port 53. Specifically, the firewall of the network should be configured with a suitable rule, to forward all packets destined to port 53, and with a destination IP address within the address block of the network, to the verifier. We sampled the IP address blocks allocated to networks among the top-million Alexa sites (www.alexa.com), and found that the median network block size is $2^{11}$; see Figure 5.
[Figure 5 plot: y-axis Networks (%), 0 to 1; x-axis Network Block Size (#IP Addresses), log scale, $2^{0}$ to $2^{25}$.]
Figure 5: IP address range sizes.
3.3.2
DNS Authentication as a Cloud Service (DAaaS)
Both the challenger and the verifier can be placed on the cloud's network$^4$. In our evaluations we set up the verifier hosts, the challenger and the controller on virtual machines on Amazon EC2. The database can either be placed on EC2 or, if it grows significantly, on Amazon Simple Storage Service (S3). Amazon hosts are not expensive, and in our evaluations we used a single EC2 account; a single EC2 account provides up to 25 IP addresses. We used micro virtual instances, which are offered free of charge. A further advantage of deploying our design in the cloud is that the cloud periodically changes the IP addresses allocated to the virtual hosts, unless the elastic IP address service is used (in which case a fixed IP address is assigned to the client's host). This provides an artificial inflation of the address block allocated to the verifier hosts. Another source of randomisation is that the IP addresses in the cloud are not allocated from a consecutive address block, which makes it more difficult for the attacker to anticipate the address range from which addresses are assigned to the verifier hosts; notice that in our design the IP addresses do not need to be from a consecutive address block.

We evaluated the overhead added by our mechanism to the resolution of DNS requests; the results are plotted in Figure 6. The evaluation consists of measurements of latencies for resolution of records in TLDs: without our mechanism, with our implementation on Amazon EC2 (US East, Northern Virginia), and with our implementation configured on our university network.

$^4$ The challenger and the verifier can be configured to reside on the same network as the protected name server.
Figure 6: Latency of DNS resolution for records in TLDs. No DNS authentication, DNS authentication in cloud (Amazon EC2) and DNS authentication on university network. [Plot: CDF (%) versus Latencies Variability (ms), log-scale; series: no DNS auth, cloud-based DNS auth, uni network DNS auth.]
3.4 Security Analysis and Evaluation
The analysis of our DNS authentication mechanism follows the steps illustrated in Figure 7. When sending the first request, in step 1 (Figure 3), the attacker can use either a spoofed source IP address or a real one. Typically, to avoid exposure and losing control of compromised hosts, the attacker would prefer not to use an IP address of a host under its control. Indeed, many of the reflection attacks are orchestrated from spoofed source IP addresses. When using a spoofed source IP address in step 1 (Figure 7), the attacker will not be able to obtain the challenge. As a result, it will have to guess the correct destination IP address in step 3 via a challenge exhaust/guess attack. In order to obtain the challenge the attacker may resort to more sophisticated strategies, such as sending packets from a real (not spoofed) IP address, or launching a challenge-intercept attack.
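Figure 7: Attacks' flows. [Diagram: decision tree over the source address used in step 1 and in step 3 (spoofed IP address or real IP address), with outcomes: challenge exhaust/guess, challenge intercept, compromised host (not resolver), not attack.]

3.4.1 Challenge-Exhaust/Guess Attacks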
In a challenge-exhaust attack the attacker triggers multiple DNS requests (step 1, Figure 3), all containing the same query, from spoofed source address(es) within a valid time interval $\tau$. This results in the allocation of a corresponding number of challenge IP addresses. As a result, the attacker can trivially hit a correct challenge destination IP address (in step 3) by sending a DNS request with the correct query, to any destination IP address, from a spoofed source IP address of the victim, thus foiling the defence mechanism. In a challenge-guess attack the attacker tries to guess the challenge: following a request (in step 1) from a spoofed IP address of a victim, the attacker sends requests to multiple random destination addresses (in step 3), hoping to guess the challenge. The concept of this attack is similar to the challenge-exhaust attack, except that in this case the attacker attempts to hit multiple destination IP addresses in the IP address block.

To counter these attacks our design selects the challenge IP address out of a set of IP addresses. If the address range is sufficiently large, the attacker should not be able to circumvent the DNS authentication mechanism and cause the DNS server to send a large response to the victim. Let $N$ be the address block size; then the probability that the attacker hits the correct destination IP address in step 3 (Figure 3) is $\frac{1}{N}$. To increase its success the attacker can try to occupy multiple addresses, following a challenge-exhaust attack, in step 1. Let $n$ be the number of concurrent DNS requests from the same IP address for the same query (in step 1) during an interval of $\tau$ seconds. Then the success probability of guessing the correct IP address in step 3 is $\frac{n}{N}$. If $n = N$ the attacker's success probability of causing amplification is 1. Thus we also impose a limit $\omega$ on the number of requests for the same query and from the same IP address (sending from different IP addresses decreases the success probability) that the attacker can send (trying to reduce the challenge range). Assuming a typical network size of $N = 2^{11}$ (Figure 5), the threshold $\omega$ can be sufficiently large while still allowing detection of attacks.

We next calculate a threshold on the number of requests for the same resource record from the same IP address (or a set thereof) in a time interval. The query rate to the protected DNS server depends on the Time to Live (TTL) $ttl$ of the requested resource record. Any subsequent requests in the time interval $[0, ttl]$ are served from the cache. The threshold in this case is $1/ttl$, namely, one request per TTL. However, during our evaluations we identified legitimate situations in which a query is requested by the same IP address more than once per TTL. This phenomenon is frequent when a number of resolvers are located behind a NAT device or behind a non-caching forwarder. In this case, the same IP address, of the NAT or the forwarder, may forward the request multiple times within the TTL period.

We assume that the requests for a DNS record $q$ follow a Poisson distribution and arrive at a rate of $\lambda$ requests per second. Let $ttl$ be the TTL of record $q$. The average interval between two DNS requests from a specific IP address a.b.c.d for a specific resource record $q$ is $\frac{1}{\lambda}$. Thus requests for a query $q$ will be sent every interval $\lambda \cdot ttl + 1$, at rate $\frac{1}{\lambda \cdot ttl + 1}$. If $n$ resolvers are located behind a single IP address, via NAT or forwarder, then the request rate is $\frac{n}{\lambda \cdot ttl + 1}$. Another parameter impacting the query rate is the popularity of domains [2]; namely, the popularity of a domain is proportional to the request rate for resource records within that domain.
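To make the effect of the limit $\omega$ concrete, the following minimal sketch evaluates the success probability $n/N$ when at most $\omega$ concurrent requests per source and query are honoured. The function name and the example value $\omega = 16$ are assumptions made for illustration; the text does not fix a value for $\omega$.

```python
def guess_success_probability(n: int, block_size: int, omega: int) -> float:
    """Probability that a step-3 request hits an allocated challenge address.

    n          -- concurrent step-1 requests the attacker attempts
    block_size -- N, the number of addresses in the challenge block
    omega      -- limit on concurrent requests per (source IP, query)
    """
    if block_size <= 0 or n < 0 or omega < 0:
        raise ValueError("block_size must be positive; n and omega non-negative")
    # At most min(n, omega) challenge addresses can be occupied,
    # and never more than the block itself contains.
    occupied = min(n, omega, block_size)
    return occupied / block_size


N = 2 ** 11  # median block size reported in Figure 5
print(guess_success_probability(1, N, omega=16))   # single guess: 1/N
print(guess_success_probability(N, N, omega=16))   # exhaust attempt capped at omega/N
print(guess_success_probability(N, N, omega=N))    # no effective limit: probability 1
```

With the hypothetical $\omega = 16$, even an attacker that attempts $N$ concurrent requests is held to a success probability of $16/2048$.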
3.4.2 Challenge-Intercept/Compromised Host
We use our graph-based approach, introduced in Section 3, for clustering and identification of standard-conforming recursive DNS resolvers. This allows us to detect compromised hosts that launched the amplification attack, whether the address used to send the (subsequent) request in step 3 is spoofed or real. If the attacker sends a request with a spoofed source IP address of the victim in step 1 (Figure 3), then it will not be able to correctly echo the challenge in step 3: since the response is sent to the IP address of the victim, which the attacker does not control, the attacker will not receive the response. Thus, neither the IP address of the victim nor that of the attacker is added to the graph. The attacker can launch a challenge-intercept attack. To obtain the challenge, in step 1 (Figure 3), the attacker uses an IP address which it controls. Then the subsequent request, sent in step 3, can be issued from a spoofed source IP address of a victim but to a correct challenge IP address (which was obtained in the earlier step). For instance, the attacker sends a first request, in step 1, from its own IP address, and a subsequent request, in step 3, from the IP address of the victim. In this case, two nodes, corresponding to the IP addresses of the attacker and the victim, are added to the graph (Figure 8), with an edge from the attacker's IP address to the victim's. However, the attacker should not be able to add an opposite edge, from an IP address of the victim to an IP address that it controls. This is because it would need to cause a query from the victim, to obtain the challenge value, and then to send a request from the attacker's IP address to the verifier in step 3 with the correct challenge.
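The edge argument can be illustrated with a small sketch that records an edge from the step-1 source to the step-3 source and lists the edges that lack an opposite edge. This is one possible reading of the argument, offered as an assumption for illustration; it is not the clustering procedure of Section 3.2, and the function name and node labels are hypothetical.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[str, str]  # (source IP of step-1 request, source IP of step-3 request)


def one_way_edges(edges: Iterable[Edge]) -> List[Edge]:
    """Return, in sorted order, the edges (a, b) for which (b, a) was never observed."""
    edge_set = set(edges)
    return sorted((a, b) for (a, b) in edge_set if (b, a) not in edge_set)


observed = [
    ("resolver-1", "resolver-2"),  # two addresses that answer each other's challenges
    ("resolver-2", "resolver-1"),
    ("attacker", "victim"),        # challenge-intercept: no opposite edge can be added
]
print(one_way_edges(observed))
```

In this toy input only the attacker-to-victim edge is reported, matching the observation that the attacker cannot create the opposite edge.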
To detect and prevent the challenge-intercept attack, we use the clustering mechanism (Section 3.2), which enables us to identify whether a request belongs to a legitimate resolver or is exploited by an attacker.

Figure 8: Disconnected components generated by malicious hosts. [Diagram: nodes labelled attacker and victim.]

4. THE CLOGGING GAME
In this section, we present and analyse the clogging game, a simplified game-theoretic model of clogging$^5$, considering the attacker and each of the servers (reflectors) as rational entities with perfect information. We focus on the use of DNS servers as amplifiers, and refer to the amplifiers (reflectors) as 'servers'; we denote the set of servers by $S$. The clogging game is a simple, two-move, complete-information sequential game. In the first move, each of the servers $s \in S$ decides whether to adopt an anti-amplification defence or not; we denote the decision by $\alpha(s) \in \{0, 1\}$. We focus on a simple binary decision, to adopt ($\alpha(s) = 1$) or not, and assume that adoption completely prevents amplification for spoofed requests - as, indeed, is the situation with the request-authentication anti-amplification mechanism. In the second move, the attacker determines whether to attack at all, and, if she attacks, what is the optimal rate of attack traffic $r^*(s)$ to send to each server $s \in S$. Note that while the first move is specific to amplification attacks, and even contains a few aspects specific to DNS reflection/amplification attacks, the second move is general and relevant to any clogging attack.

We first analyse, in subsection 4.1, the second move of the clogging game; then we analyse the first move (of the reflectors/servers), in subsection 4.2. While the model is a simplification of reality, we hope that its analysis still provides useful insights for clogging (bandwidth DoS) attacks, and specifically for the cost/benefit trade-offs for adoption of the proposed DNS request authentication mechanisms, and the implications on the expected adoption of the mechanism and on its impact on DoS victims and attackers. The most interesting aspect is the ability of such mechanisms to reduce the amount of DNS-amplified DoS traffic in the Internet. This is not trivial, since the defence has to be deployed by DNS servers, which are often not the victims of the clogging attack. However, even if the attack is not directed at the DNS servers, they may still have a motivation to defend against it, since the defence may reduce the amount of undesirable traffic they handle (and hence reduce expenses and waste).

Indeed, our analysis shows that this is the case, and that the defence will be adopted by all DNS servers with 'large' amplification factors. Intuitively, this holds since (1) the attacker prefers name servers with maximal amplification factor, and (2) servers which amplify high amounts of traffic will adopt the defence to reduce costs. Hence, all attack traffic will be limited to one single (amplifying) DNS server. Furthermore, we show that typically it becomes more efficient to adopt the defence when the attack traffic exceeds twice the benign traffic of the DNS server. Such a low attack rate implies that the defence essentially eliminates the threat of DNS amplification DDoS attacks.

$^5$ Clogging attacks are bandwidth DoS attacks, based on flooding and/or the use of amplifying reflectors, in particular, DNS servers.

4.1 Analysis of Clogging-Attacker Decisions
Attackers usually control a large set of corrupted machines ('bots'), which they use to send different spoofed DNS requests to different name servers. We assume that the attacker has some 'attack cost' $C_A(r)$ associated with sending requests at total rate $r$ (of spoofed DNS requests per second (rps)). Our assumption is that the cost is only a function of the total traffic rate sent by the attacker to all servers and victims. The sending cost may reflect actual costs paid by an attacker to the owner of the botnet (when the attacker leases the bots), or may reflect the loss of potential alternate profit from the bots (when the attacker 'owns' the bots directly). We adopt the typical economic assumption of increasing costs of resources, i.e., assume that $C_A$ is positive, monotonically increasing and convex, i.e., $C_A(r), C_A'(r), C_A''(r) > 0$ for every $r$.

The attacker has income from generating attack traffic to victim hosts and networks. The attacker's income from the attack traffic may be due to direct interests of the attacker, e.g., to disable competitors or to harm an 'opponent' entity; or the attacker may be paid for the DoS attack by some 'attack-customer'. For simplicity, we assume that the victims are not in the set $S$ (i.e., not servers), and that the attacker is only sending attack traffic (requests) to the servers (and not directly to victims). Let $\vec{r} : S \to$ [...] but $I_A$ holds $I_A(a) > 0$, $I_A$ [...]

The attacker's goal is to select the optimal attack-rate allocation $r^*$ to maximize the attacker utility. The utility is the difference between the attacker's income and the attacker's costs, i.e., $U_A[\vec{r}, \alpha] = I_A(a[\vec{r}, \alpha]) - C_A(r)$, where $r \equiv \sum_{s \in S} \vec{r}(s)$. Namely, $r^* \leftarrow \arg\max_{\vec{r}} \left[ I_A(a[\vec{r}, \alpha]) - C_A\left(\sum_{s \in S} \vec{r}(s)\right) \right]$. Since $\mu(s) > 1$ for every $s \in S$, it follows that the optimal allocation sends the maximal amount of traffic to the non-adopting servers with the highest amplification factors. Assume that all amplification factors differ, i.e., $(\forall s, s' \in S \mid s \neq s')(\mu(s) \neq \mu(s'))$. We obtain:

Lemma 4.1. Assume there exists $s^* \in S$ s.t. $\alpha(s^*) = 0$ and $r_A(s^*) > r^*(s^*) > 0$. Then for every $s \neq s^*$ s.t. $\alpha(s) = 0$: (1) if $\mu(s) < \mu(s^*)$ then $r^*(s) = 0$, and (2) if $\mu(s) > \mu(s^*)$ then $r^*(s) = r_A(s)$.
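The structure stated in Lemma 4.1 corresponds to a greedy fill in decreasing order of amplification factor. The sketch below illustrates that structure under a simplifying assumption that is not part of the model: the total attack rate is taken as given, rather than derived from the income and cost functions $I_A$ and $C_A$. The function name and the example servers are hypothetical.

```python
from typing import Dict, Tuple

# name -> (mu: amplification factor, r_a: maximal effective input rate, adopted)
Servers = Dict[str, Tuple[float, float, bool]]


def allocate_attack_rates(total_rate: float, servers: Servers) -> Dict[str, float]:
    """Split a given total attack rate over non-adopting servers, highest mu first."""
    rates = {name: 0.0 for name in servers}
    remaining = total_rate
    candidates = sorted(
        (name for name, (_, _, adopted) in servers.items() if not adopted),
        key=lambda name: servers[name][0],
        reverse=True,
    )
    for name in candidates:
        if remaining <= 0:
            break
        capacity = servers[name][1]
        rates[name] = min(capacity, remaining)  # fill up to r_A(s), then move on
        remaining -= rates[name]
    return rates


example = {
    "a": (50.0, 10.0, False),
    "b": (30.0, 10.0, False),
    "c": (80.0, 10.0, True),   # adopted the defence: receives no attack traffic
    "d": (10.0, 10.0, False),
}
print(allocate_attack_rates(15.0, example))
```

In this example server b plays the role of $s^*$: it is partially used, the non-adopting server with a higher factor (a) is used at its full rate $r_A$, and the one with a lower factor (d) receives nothing.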
4.2 Analysis of Servers' Decisions
We next analyse the adoption of anti-amplification defences, specifically, DNS request-authentication, by the servers.

If all servers adopt request-authentication, the attack traffic and the utility to the attacker will be significantly reduced - with even more significant reduction in the damages to victims and to other network users. It remains to analyse the expected adoption; for that, we must also consider the incentives of the owners of the servers. In addition to the adversarial amplification factor $\mu(s)$, each server $s \in S$ also has a benign amplification factor $\eta(s)$, namely the average ratio between the size of responses and the size of corresponding benign requests. Note that the benign amplification factor $\eta(s)$ refers to legitimate (non-spoofed) requests, and is independent of the adoption of defences. Clearly, for every server $s \in S$ which does not adopt the anti-spoofing defence, i.e., $\alpha(s) = 0$, it holds that $\mu(s) \geq \eta(s)$ (typically, $\mu(s) \gg \eta(s)$). Server $s$ sends responses to the benign requests at the fixed benign rate $r_B(s) \cdot \eta(s) \leq B(s)$. Hence, the attacker can use this server to amplify only up to total bandwidth $B(s)$, i.e., the maximal effective adversarial-input rate to $s$ is:

$$r^*(s) \leq r_A(s) \equiv \frac{B(s) - r_B(s) \cdot \eta(s)}{\mu(s)}$$

Let $C[r^*, \alpha](s)$ denote the total cost for server $s$, if each server $s'$ makes adoption decision $\alpha(s')$ and the attacker sends to $s'$ at rate $r^*(s')$. We do not model any income for name servers, i.e., a rational name server $s$ would seek to minimize $C[r^*, \alpha](s)$, selecting $\alpha(s)$ accordingly. We assume that the costs of every server $s$ depend only on its own decision $\alpha(s)$ and on the amount of traffic it receives and sends, using a common linear function: $C(s) = F_{\alpha(s)} + \gamma \cdot t(s)$, where $\gamma$ is a constant (price per traffic unit), $F_\alpha$ represents fixed equipment and operational costs, and $t(s)$ is the total traffic into and out of $s$. We assume that $F_1 > F_0$, i.e., the fixed costs are higher when request-authentication is deployed (reflecting the fixed costs of deploying the defence). The traffic $t(s)$ depends on the servers' selection $\alpha$ and on the adversarial rates $r^*$. Recall that the attacker will not send spoofed traffic to servers which adopted the defence, i.e., if $\alpha(s) = 1$ then $r^*(s) = 0$. Hence:

$$t(s) = \begin{cases} (3 + \eta(s)) \cdot r_B(s) & \text{if } \alpha(s) = 1 \\ (1 + \mu(s)) \cdot r^*(s) + (1 + \eta(s)) \cdot r_B(s) & \text{if } \alpha(s) = 0 \end{cases} \tag{1}$$

Notice that the adoption of the defence causes a slight increase in the overhead of handling benign requests ($3 + \eta$ instead of $1 + \eta$), due to the challenge. Server $s$ would rationally adopt the defence, i.e., select $\alpha(s) = 1$, if this leads to lower cost $C(s)$. By substituting the values for $t(s)$, and simple algebraic manipulations, this holds when $r^*(s)$, the rate of request traffic sent by the attacker to $s$, exceeds the profitability threshold rate $r_\alpha(s)$ defined as:

$$r_\alpha(s) \equiv \frac{\gamma \cdot 2 r_B(s) + F_1 - F_0}{\gamma \cdot (1 + \mu(s))} \tag{2}$$
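The algebraic step behind the profitability threshold can be written out as follows, using only the linear cost $C(s) = F_{\alpha(s)} + \gamma \cdot t(s)$ and the two cases of Eq. (1); here $r^*(s)$ denotes the attack rate that $s$ would receive if it did not adopt the defence.

```latex
\begin{align*}
C(s)\big|_{\alpha(s)=1} &< C(s)\big|_{\alpha(s)=0} \\
F_1 + \gamma\,(3+\eta(s))\,r_B(s)
  &< F_0 + \gamma\left[(1+\mu(s))\,r^*(s) + (1+\eta(s))\,r_B(s)\right] \\
F_1 - F_0 + 2\gamma\,r_B(s) &< \gamma\,(1+\mu(s))\,r^*(s) \\
r^*(s) &> \frac{\gamma \cdot 2 r_B(s) + F_1 - F_0}{\gamma \cdot (1+\mu(s))} = r_\alpha(s)
\end{align*}
```

The terms in $\eta(s)$ cancel, which is why the threshold depends on the benign request rate $r_B(s)$ but not on the benign amplification factor.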
Eq. (2) does not yet allow us to determine which servers will adopt the defence, since the servers make the first move in the game, i.e., every server $s$ selects $\alpha(s)$ before the attacker selects $r(s)$; and the adversary's choice depends on the adoption decisions ($\alpha$ values) of the servers. Notice also that in this model all servers decide simultaneously, i.e., server $s$ cannot observe which other servers have adopted the defence.

Namely, each server $s$ has to decide whether to deploy the defence or not, based on its prediction of the value of $r(s)$ and of the (relevant) decisions of other servers, based on the assumption that all servers, as well as the attacker, are rational (and have perfect information, e.g., are aware of the $\mu$ values for all servers). From Lemma 4.1, the optimal amount of traffic that server $s$ receives, $r^*_\alpha(s)$, depends only on the adoption decisions ($\alpha$ values) of itself and of servers $s'$ with higher adversarial amplification factors $\mu(s') > \mu(s)$. Since we assume perfect information, this allows each server $s$ to compute the decisions of all servers $s'$ with higher amplification factor, and thereby to know whether it should deploy ($\alpha(s) = 1$) or not.

Consider the server $s$ with maximal amplification factor $\mu(s)$ that decides not to deploy, i.e., $\alpha(s) = 0$ and $(\forall s' \neq s)\,[\alpha(s') = 1 \vee \mu(s') < \mu(s)]$. Since $s$ does not deploy, surely $r^*_\alpha(s) < r_\alpha(s)$. Assume that $s$ has a lower threshold rate than its feasible attack-request rate (otherwise, $s$ will never amplify, and should not be in $S$); namely, $r_\alpha(s) \leq r_A(s)$. Combining this inequality with $r^*_\alpha(s) < r_\alpha(s)$, we have that $r^*_\alpha(s) < r_A(s)$, i.e., the attacker is not fully utilizing the available attack bandwidth of $s$. Since $s$ is the non-deploying server with the largest adversarial amplification factor $\mu$, it follows from Lemma 4.1 that it is the only server to receive any attack traffic; i.e., for every $s' \neq s$ it holds that $r^*(s') = 0$, and $r^*(s) < r_\alpha(s) \leq r_A(s)$. This implies a very limited amount of clogging, bounded by Eq. (2). Indeed, if we ignore the fixed 'setup costs' $F_1 - F_0$, we get that the maximal attack rate is $\frac{2 r_B(s)}{\mu(s) + 1}$, namely, the amount of clogging traffic is only about $2 r_B(s)$ - twice the benign request rate of the server.
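A short numeric check of this bound follows, as a sketch under stated assumptions: the parameter values are hypothetical, and the clogging traffic is taken here to be the attack-request rate multiplied by the adversarial amplification factor $\mu(s)$, which is our reading of the text above.

```python
def profitability_threshold(r_b: float, mu: float, gamma: float,
                            f1: float, f0: float) -> float:
    """Eq. (2): attack-request rate above which adopting the defence is cheaper."""
    return (gamma * 2 * r_b + f1 - f0) / (gamma * (1 + mu))


def max_clogging_traffic(r_b: float, mu: float, gamma: float,
                         f1: float, f0: float) -> float:
    """Amplified traffic at the threshold rate (assumed to be mu times the request rate)."""
    return mu * profitability_threshold(r_b, mu, gamma, f1, f0)


# Hypothetical server: 1000 benign requests/s, amplification factor 50,
# unit traffic price, and no fixed setup cost difference (F1 = F0).
r_b, mu = 1000.0, 50.0
threshold = profitability_threshold(r_b, mu, gamma=1.0, f1=0.0, f0=0.0)
clogging = max_clogging_traffic(r_b, mu, gamma=1.0, f1=0.0, f0=0.0)
print(threshold)  # equals 2 * r_b / (mu + 1)
print(clogging)   # stays below 2 * r_b
```

For these values the threshold is $2000/51$ requests per second, and the amplified traffic remains below $2 r_B(s) = 2000$, in line with the bound derived above.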
5. CONCLUSION
We design an anti-reflection mechanism which nullifies the amplification factor of DNS responses when they are abused for DoS attacks. As a result, the attacker is required to expend an amount of resources that is equivalent to the impact on the victim network. Hence, our mechanism makes it unprofitable for attackers to abuse DNS servers protected with our mechanism in their reflection attacks. This provides added value not only for the victims but also for the DNS servers, by saving their bandwidth and resources for serving legitimate clients. As a side effect, our mechanism also produces a list of 'good' DNS resolvers and resolvers' platforms as well as potentially compromised (suspect) hosts. We deployed and tested our mechanism in the cloud, and we believe that DNS authentication as a service would benefit a wide range of Internet clients and services. We also developed a game-theoretic model showing the profitability of our mechanism. The model is of independent interest and can be used to analyse defences against different types of DoS attacks.
Acknowledgements This research was supported by grant 1354/11 from the Israeli Science Foundation (ISF), by the Ministry of Science and Technology, Israel and by the German Federal Ministry of Education and Research (BMBF) within EC SPRIDE, and by the Hessian LOEWE excellence initiative within CASED.
6. REFERENCES
[1] H. A. Alzoubi, M. Rabinovich, and O. Spatscheck. The anatomy of LDNS clusters: findings and implications for web content delivery. In Proceedings of the 22nd international conference on World Wide Web, pages 83–94. International World Wide Web Conferences Steering Committee, 2013.
[2] R. P. Doyle, J. S. Chase, S. Gadde, and A. M. Vahdat. The trickle-down effect: Web caching and server request distribution. Computer Communications, 25(4):345–356, 2002.
[3] O. Gudmundsson and S. D. Crocker. Observing DNSSEC Validation in the Wild. In SATIN, March 2011.
[4] F. Guo, J. Chen, and T.-c. Chiueh. Spoof Detection for Preventing DoS Attacks against DNS Servers. In ICDCS, pages 37–37. IEEE Computer Society, 2006.
[5] A. Herzberg and H. Shulman. Security of Patched DNS. In S. Foresti, M. Yung, and F. Martinelli, editors, ESORICS, volume 7459 of LNCS, pages 271–288. Springer, 2012.
[6] A. Herzberg and H. Shulman. DNSSEC: Security and Availability Challenges. In Communications and Network Security (CNS), 2013 IEEE Conference on, pages 365–366. IEEE, 2013.
[7] A. Herzberg and H. Shulman. Vulnerable Delegation of DNS Resolution. In European Symposium on Research in Computer Security, Lecture Notes in Computer Science. Springer, 2013.
[8] A. Herzberg and H. Shulman. Retrofitting security into network protocols: The case of DNSSEC. Internet Computing, IEEE, 18(1):66–71, 2014.
[9] C. Kreibich, N. Weaver, B. Nechaev, and V. Paxson. Netalyzr: illuminating the edge network. In Proceedings of the 10th ACM SIGCOMM conference on Internet measurement, pages 246–259. ACM, 2010.
[10] W. Lian, E. Rescorla, H. Shacham, and S. Savage. Measuring the Practical Impact of DNSSEC Deployment. In Proceedings of USENIX Security, 2013.
[11] G. Maier, F. Schneider, and A. Feldmann. NAT Usage in Residential Broadband Networks. In Passive and Active Measurement, pages 32–41. Springer, 2011.
[12] M. Prince. The DDoS That Almost Broke the Internet. CloudFlare Blog, April 2013.
[13] K. Schomp, T. Callahan, M. Rabinovich, and M. Allman. On measuring the client-side DNS infrastructure. In Proceedings of the 2013 conference on Internet measurement conference, pages 77–90. ACM, 2013.
[14] H. Shulman. Is DNSSEC Ready for Prime Time: The Challenge of Legacy Infrastructure. Technical Report, September 2014.
[15] H. Shulman. The (in)Security of Outsourced DNS. Technical Report, August 2014.
[16] R. Tzakikario, D. Touitou, G. Pazi, et al. DNS anti-spoofing using UDP, Nov. 2009. US Patent 7,620,733.
[17] Y. Yu, D. Wessels, M. Larson, and L. Zhang. Authority Server Selection of DNS Caching Resolvers. ACM SIGCOMM Computer Communication Review, April 2012.