A managed DNS solution might represent the best near-term strategy for protecting a business from the challenges that face today’s networks at the edge. In this chapter, we dive deeper into what benefits it can provide.
At a high level, we can define a managed DNS service as a service sourced through a specialized DNS service provider that enables users not only to manage DNS traffic, but also to access advanced features, including active failover, load balancing, dynamic IP addresses, and geographically targeted DNS.
Each managed DNS service provider brings its own value proposition to users looking for such services. We’ll explore some of the typical provider services later in this chapter. But first, it’s worth discussing where you can use traditional DNS services as a foundation for understanding what a managed DNS provider can offer.
Some of the many uses to which organizations apply their DNS service include the following:
You can use geolocation load balancing for performance optimization, routing each request to the server closest either to the user edge or to the endpoint.
Ratio load balancing allows for a gradual transition to the cloud; you can migrate some traffic to new cloud-hosted environments to test and validate access, and then slowly move more traffic when ready.
Active failover allows you to establish a second endpoint, or multiple alternate endpoints, to which traffic can fail over, ensuring the availability and health of the connection path.
Containers can be published to multiple clouds, but are the clouds themselves load balanced? DNS traffic steering enables users to keep containers highly available, load balanced, and performant.
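As a minimal sketch of the ratio (weighted) load balancing described above, the following Python snippet answers each request according to configured weights. The hostnames and the 90/10 split are invented for illustration:

```python
import random

def pick_endpoint(weights, rng=random):
    """Pick an endpoint name according to its ratio weight.

    weights: dict mapping endpoint name -> integer weight.
    """
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Gradual cloud migration: start by sending roughly 10% of traffic
# to the new cloud-hosted environment, 90% to the existing one.
weights = {"onprem.example.com": 90, "cloud.example.com": 10}
sample = [pick_endpoint(weights) for _ in range(10_000)]
cloud_share = sample.count("cloud.example.com") / len(sample)
print(f"cloud share: {cloud_share:.0%}")
```

Raising the cloud endpoint’s weight over time completes the migration without a hard cutover.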
In the sections that follow, we look a little more closely at three areas: performance, availability, and security. These benefits apply whether the DNS is delivered from a managed DNS service provider or bundled with services from a CDN, a local ISP, a web application firewall (WAF) provider, or even your own data center staff.
Internet traffic is at an all-time high and shows no signs of slowing down. Correspondingly, network infrastructure in most companies is struggling to keep up. In any number of scenarios today, servers (whether physical or virtual) can become overloaded. Any time a server is at or near capacity, it can have a direct negative effect that can ripple throughout the network. By taking advantage of the routing capabilities of DNS, traffic and requests can be routed to alternative systems not experiencing as much load. To make this truly effective, though, there must be a predetermined plan that can be put into action quickly (preferably automated).
Another, less obvious, benefit is being able to direct traffic to test systems for performance testing. Using DNS infrastructure, the developer can run test environments in real time. To move traffic from test to production, developers can change time-to-live (TTL) settings, redirecting traffic to the chosen location.
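The TTL-based cutover just described can be sketched as follows. Here `update_record` is a hypothetical helper standing in for whatever API your DNS provider exposes, and the TTL values are illustrative:

```python
import time

OLD_TTL = 3600   # current TTL on the record (seconds)
LOW_TTL = 60     # short TTL used during the cutover window

def cutover(update_record, name, new_target,
            old_ttl=OLD_TTL, low_ttl=LOW_TTL, wait=time.sleep):
    """Repoint a DNS record with minimal propagation delay."""
    # 1. Lower the TTL well ahead of the switch so resolvers drop
    #    their cached copies quickly once the record changes.
    update_record(name, ttl=low_ttl)
    # 2. Wait out the old TTL: resolvers that cached the record
    #    earlier may hold it for up to old_ttl seconds.
    wait(old_ttl)
    # 3. Repoint the record; propagation now completes within low_ttl.
    update_record(name, target=new_target, ttl=low_ttl)
```

The same pattern works in reverse for rolling traffic back from production to a test environment.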
On a related point, using decentralized DNS nameservers to resolve your queries inherently reduces latency and maintains a smooth user experience. It can also eliminate the need to troubleshoot unidentified performance issues with your telco provider.
Outages can happen at any time. If your business has multiple datacenters or uses a service with multiple datacenters, DNS can divert traffic from the outage area to another location that keeps your customers in business. Where practical, you can also do this on a small scale with other failover/clustering technologies.
You can also utilize DNS to route traffic away from your legacy datacenter during maintenance times. Having the ability to take control at the DNS layer and reroute traffic in a quick, transparent way provides a key advantage in continued availability to your customers.
Security threats at the DNS level are increasing every day. These arise from DDoS attacks, malicious bots, malware, and other application vulnerabilities that propagate via the network back roads. Access via DNS must be guarded, but DNS is also the first place where mitigation of those threats can begin. Often, DNS amplification or reflection attacks are prime suspects during a DDoS event.
DNS has a significant role to play in providing edge resilience, protection, and stability. It sits in close proximity to the edge, and we can also use it to direct traffic where we need it to go—transparently to the customer. Think about how traditional failover occurs at the network or server layer. Deploying an intelligent DNS network that can steer traffic at the apex of a domain is faster than routing to an end node and only then making a decision. This matters for reducing latency and for quickly establishing a new session should a failure occur, so that you can keep serving your clients.
Note that none of the scenarios we have discussed required a managed DNS service—just DNS. With enough staff, planning, and prioritized response, it is possible to use DNS as a shield without additional tooling. Beyond that, it is also possible to automate some aspects of a DNS strategy in-house, assuming resources are available.
Anycast is a one-to-many network routing scheme in which a destination address has multiple routing paths to a variety of endpoints (at least two). Anycast DNS routing allows traffic to be distributed to multiple datacenters, providing global active-active load balancing.
A DNS anycast network offers several benefits. With a DNS anycast network, you can route requests to the closest PoP for the best response; take advantage of the one-to-many relationship between IP addresses and their associated nameservers; distribute traffic from a single IP address to different nameservers based on the origin of the request; and add multiple telco providers to these nameservers, adding another level of redundancy at these PoPs. Why does all this matter?
By routing requests to the closest nameserver, the resolution time is greatly reduced, and users experience improved overall performance. This effect is magnified for websites that require multiple DNS lookups for additional files and assets that need to be loaded before a page completes. Web apps must resolve various components before the experience at the user edge succeeds, and this is where DNS can potentially make or break the online experience. Some organizations believe their cloud provider or telco speeds and feeds drive performance—and this is true—but DNS and CDN networks also contribute.
An intelligent edge positions your organization to deliver continued optimal responses to:
Planned interruptions resulting from routine maintenance or a switch to new cloud services
Unplanned outages due to inclement weather, power failures, or faulty fiberoptic lines
Localized failures, mitigated by redundancy from multiple anycast PoPs or multiple transit providers per PoP
What, then, are the benefits that you can get from a managed DNS service? In simplest terms, you can think of managed DNS as magnifying the capabilities of DNS—putting armor around the edge of your application through automation, scalability, self-service, monitoring, and key services tuned for your business.
To understand more, let’s take a closer look at the kinds of services that are typically offered.
At its most basic level, intelligence at the network edge is the simple act of monitoring the edge to determine whether your resources are available. If there is a problem, decisions and next steps can happen first at the DNS layer, before affecting datacenter and hybrid-cloud environments. For example, if a primary fiber cut occurs, intelligent DNS allows you to see that the environment is not available and make routing decisions at the service edge, before the endpoint is affected.
This action from the edge can save milliseconds in failover response time. That time might seem small, but it does make a difference. According to the Aberdeen Group, a 1-second delay in page load time equals 11% fewer page views and a 16% decrease in customer satisfaction.1
Active failover is a DNS service that moves traffic to a healthy endpoint host in the event of degraded service. During such impacts, active failover enables your website or web-based application to remain reachable. When the system detects an outage, traffic is automatically rerouted to an alternate, predefined endpoint—or even to multiple endpoints in succession. This ensures that your traffic finds a route to a healthy location as quickly as possible.
Active failover checks service endpoint health using HTTP, HTTPS, ping, SMTP, or TCP probes to verify that the site is still responding. When the primary service fails to respond, traffic is redirected to an alternate endpoint. Active failover considers both the endpoint’s ability to serve the user and the condition of the path used to reach that endpoint.
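A simplified sketch of this selection logic, assuming a plain TCP handshake as the health probe (real services also offer HTTP, HTTPS, ping, and SMTP checks); the function names are our own:

```python
import socket

def tcp_healthy(host, port=443, timeout=2.0):
    """One of the probes mentioned above: can we complete a TCP handshake?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def failover_target(endpoints, is_healthy=tcp_healthy):
    """Return the first healthy endpoint, trying alternates in succession."""
    for host in endpoints:
        if is_healthy(host):
            return host
    return None  # no healthy endpoint: raise an alert rather than answer
```

In a real deployment the probe runs continuously, so the DNS answer is already pointing at a healthy endpoint when the query arrives.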
Traffic steering makes intelligent, policy-based decisions on where to send user traffic via DNS. Factors that determine routing include the proximity of the user to the content, node availability, and overall route performance. Using intelligent responses, traffic steering can be adjusted to take a different route. This operates at the apex of the user’s domain.
A basic example of intelligent traffic steering is to “round-robin” traffic across multiple cloud or datacenter locations for load balancing. A more sophisticated example might take into account the geographic location of the user to decide which servers requests should be routed to. For example, a query from London could be routed to a European point of presence (PoP), whereas a request from San Francisco could be routed to a western US location.
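A toy version of that geographic decision might look like the following; the region names and PoP hostnames are invented:

```python
# Steering table mapping a client's region to the PoP that should
# answer its queries. Entries here are purely illustrative.
POPS = {
    "EU": "pop-eu-west.example.net",
    "US-WEST": "pop-us-west.example.net",
    "US-EAST": "pop-us-east.example.net",
}
DEFAULT_POP = "pop-us-east.example.net"

def steer(client_region):
    """Answer a query with the PoP mapped to the client's region,
    falling back to a default for unmapped regions."""
    return POPS.get(client_region, DEFAULT_POP)

print(steer("EU"))       # a London query resolves to the European PoP
print(steer("US-WEST"))  # a San Francisco query resolves to the western US PoP
```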
You can also factor additional information gathered from monitoring, such as availability and load, into traffic steering decisions. These capabilities can also help monitor your digital edge and DNS environment to detect and mitigate threats and anomalies, including routing anomalies. You can use traffic steering to shift traffic away from threats, or away from an outage, before they have a negative impact on your infrastructure.
Using DNS for intelligent traffic steering is also good news from a marketing standpoint. Although some executives might not intuit the details of DNS at the edge, the benefit of being able to target specific content to specific audiences without significant additional investments is clear. For example, routing users to a specific geographic location allows you to serve different content. This capability is particularly beneficial to retailers.
The term “federated” is used to describe different kinds of implementations working together to provide a full solution. In this case, it refers to two different kinds of load balancers working together to provide a comprehensive load-balancing solution for your business. It is a tiered approach that brings together multiple disparate components into a single, unified solution that focuses on steering traffic based on balancing control and asset awareness.
As we noted in Chapter 1, there are two different kinds of edges we can talk about with our business networks: the user edge, where users first come into contact with our network; and the site edge, where the network first comes into contact with our infrastructure that holds the sought-after content or service.
Each of these edges has access to unique information. The load balancers running there have the flexibility to steer traffic along different dimensions. This simple description sums up the differences well:
The user edge is powered by DNS and steers user traffic to destination endpoints based on how the request is resolved.
The site edge or local load balancers are responsible for directing traffic to the most available compute or storage resource to service that request. The local rules take into account resource availability, load, session maintenance, and security.
The DNS load balancers at the user edge are commonly referred to as global load balancers (in contrast to the local load balancers at the site edge). These two types of load balancers complement each other well. Here are some examples:
Global load balancers have the big picture view of the available paths, hence the term global.
Global load balancers ensure that users are routed to the best endpoints or, failing that, to endpoints that are available.
Global load balancers can provide weighted, round-robin, and geography-based routing.
Local load balancers have knowledge about the available site resources.
Local load balancers ensure that the site is operating efficiently and is able to serve up resources.
These two types of load balancers also work together in the federated model to serve end users. For example, suppose a problem occurs at a site and its local load balancer is no longer available or able to steer traffic. If other sites are configured, a DNS-based global load balancer can automatically redirect traffic to an alternate “healthy” site.
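The federated, two-tier selection can be sketched like this; the site and server records are invented, and real global and local load balancers use far richer health and load signals:

```python
def choose_site(sites, client_region):
    """Global (DNS) tier: pick the nearest healthy site."""
    healthy = [s for s in sites if s["healthy"]]
    if not healthy:
        return None
    # Prefer a site in the client's region; otherwise any healthy site.
    local = [s for s in healthy if s["region"] == client_region]
    return (local or healthy)[0]

def choose_server(site):
    """Local tier: pick the least-loaded server inside the chosen site."""
    return min(site["servers"], key=lambda srv: srv["load"])

sites = [
    {"name": "eu-1", "region": "EU", "healthy": False,
     "servers": [{"name": "eu-1a", "load": 0.2}]},
    {"name": "us-1", "region": "US", "healthy": True,
     "servers": [{"name": "us-1a", "load": 0.7},
                 {"name": "us-1b", "load": 0.3}]},
]
site = choose_site(sites, "EU")   # eu-1 is down, so the global tier
server = choose_server(site)      # redirects to us-1; locally the
                                  # least-loaded server wins
```

The global tier never needs to know individual server loads, and the local tier never needs the world map; each tier steers along the dimension it can see.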
Service providers with a global DNS infrastructure can enable you to add a secondary global DNS service. A secondary DNS service can help provide resiliency at the DNS layer for use cases such as when your primary DNS service faces an outage or suffers a malicious attack. In such cases, the redundant service remains fully operational for your users. However, the window is not infinite, because the secondary DNS servers depend on receiving updates from the primary. Latency might also increase because application requests must travel a bit further.
The functionality of a secondary DNS is often misunderstood. A common misconception is that a secondary DNS architecture is for backup only, meaning that it sits idle and begins working only when the primary architecture fails. But do not think of a secondary DNS as a traditional server pool configuration or a virtual router redundancy protocol (VRRP) design. A secondary DNS can actually sit in delegation for an organization, answering queries if it happens to be faster than the primary server. It can be a workhorse in the event of a disruption to the primary nameserver but can also resolve queries in real time.
In a secondary configuration, the primary DNS server holds the “master copy” of the data for a zone, and secondary servers hold copies of this data that they synchronize with the primary server through zone transfers. These zone transfers happen at intervals or when prompted by the primary nameserver. When implementing any sort of secondary DNS, be sure it can receive DNS NOTIFY messages and follow up with, for example, a successful incremental (IXFR) or full (AXFR) zone transfer, to ensure proper zone updates. You can also configure secondary DNS to complement an existing in-house approach. One method, called hidden master, uses your existing DNS behind the firewall for management and configuration and then uses a cloud-based DNS for resolving queries.
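The serial-number check that drives these zone transfers can be sketched as follows; this is a simplified illustration, not a full RFC 1982 implementation:

```python
def needs_transfer(primary_serial, secondary_serial):
    """A secondary refreshes its copy when the primary's SOA serial
    is newer than its own.

    Simplified comparison: DNS serials are 32-bit values that wrap
    around, which is why the difference is taken modulo 2**32.
    """
    if primary_serial == secondary_serial:
        return False
    return ((primary_serial - secondary_serial) % 2**32) < 2**31

def pick_transfer(incremental_available):
    # Prefer an incremental transfer (IXFR) when the primary can supply
    # a diff; otherwise fall back to a full zone transfer (AXFR).
    return "IXFR" if incremental_available else "AXFR"
```

A DNS NOTIFY from the primary simply tells the secondary to run this check immediately instead of waiting for its next scheduled refresh.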
Note that having a secondary DNS does not necessarily mean it sits idle. In some cases, your secondary DNS providers might be able to provide faster local responses than your primary DNS environment. Proper configuration is key to ensuring the best results.
At the provider level, managed DNS has an important role to play in today’s dispersed, multicloud environments. In much the same way that a traffic-steering service routes users to alternate regions for the best experience, intelligent DNS can route connections to an alternative cloud site during an unplanned outage, providing continued service with only the interrupted sessions affected.
You can use this same functionality to take control at the DNS layer for planned outages. For example, if you are aware that your cloud provider has an upcoming maintenance window and you want to steer traffic completely away from that node during the outage window, this mitigation strategy can accomplish that.
An additional use of this kind of functionality can benefit development and deployment processes. When applications are being moved to the cloud, intelligent DNS provides developers with more flexibility at the network edge to control production traffic. By being able to tune the traffic targeted for a new release of an application, developers and IT staff can gather useful information about how it performs and any potential issues under load.
A managed DNS service can provide significant value across multiple dimensions of vulnerability. Adopting one can be an effective way to implement comprehensive edge protection for existing networks. Such a service has multiple tools in its toolbox that you can use to prevent or fix many of the common threats your network might encounter, as well as to build a stronger, more efficient end-to-end path for your customers.
But it must also have the right tools for your needs. Even the most sophisticated and powerful screwdriver is largely ineffective if you need to nail two boards together. Ensuring that you are making the right choice here requires awareness and forethought. To assist with that, we next look at some suggested business criteria for evaluating managed DNS offerings.
1 Aberdeen Group, “The Performance of Web Applications,” p. 4, November 2008, reprinted 2015.