Chapter 1. Introduction to EVPN
A wet California winter and spring had started to give way to sunny summer skies when I was invited to meet with a large financial company. The organization wanted me to critique its data center network design. Its use case revolved around a Layer 3 (L3) network. A Clos-based topology was the basic network architecture it had chosen. Everything was done as nicely as I could suggest. No longer did I have to explain why the company had to move away from bridging as the centerpiece of its data center or why Clos networks were a better fit. One more conversion accomplished. I moved on.
As the summer turned to fall, the company approached me again to discuss a new constraint it had to deal with. The enterprise was going to deploy a new storage cluster solution in the network. This solution expected Layer 2 (L2) connectivity in order to work. Needless to say, the L2 connectivity had to span multiple racks. “Dinesh, how do I fit a solution that expects L2 connectivity in a network that has L3 as its foundation?” engineers at the company asked.
Increasingly that fall, I heard the same refrain over and over again. “How do I deploy an application that requires L2 in an L3 network?”
Another group of companies I spoke to were building new data centers and wanted to embrace the new world of white boxes and Clos networks. Their newer applications either resembled Hadoop or relied on constructs like containers, so the new world was a great fit. Yet another group of companies wanted to upgrade from buggy, difficult-to-maintain, and less reliable L2-heavy networks to the modern, resilient, robust world of Clos topologies. But sooner or later they all had to deal with their legacy applications. Some decided to build a different, smaller, sunset network for these applications. Others wanted to figure out how to make the new network support these older applications. “After all, haven’t you been saying that Clos networks are a Lego building block that can support myriad use cases?” they asked.
Some of these newer applications continue to rely on L2 multicast and broadcast for cluster membership discovery and heartbeat. The other common reliance on bridging comes from the assumption that the IP address of an endpoint stays the same, even when the endpoint is destroyed and re-created elsewhere. There are solutions to pass around /32 routes using either routing from the host or ideas such as redistribute Address Resolution Protocol (ARP). Nevertheless, support concerns and age-old habits limited virtual machine or container mobility to L2. And, of course, the older applications built for the old world could not be rewritten or decommissioned.
In the simplest of terms, Ethernet VPN (EVPN) is a technology that connects L2 network segments separated by an L3 network. EVPN accomplishes this by building the L2 network as a virtual Layer 2 network overlay over the Layer 3 network. It uses Border Gateway Protocol (BGP) as its control protocol.
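To make the idea of an overlay concrete, here is a minimal sketch of what "carrying an L2 segment over an L3 network" means on the wire. It assumes VXLAN as the encapsulation and follows the 8-byte header layout in RFC 7348. The function names are hypothetical, chosen only for this illustration, and the sketch covers the data-plane wrapper alone, not the BGP control plane that EVPN adds.

```python
import struct
from typing import Tuple

VXLAN_UDP_PORT = 4789        # IANA-assigned UDP destination port (RFC 7348)
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field carries a valid value


def vxlan_encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prefix an Ethernet frame with a VXLAN header (hypothetical helper).

    The result is what would travel as the UDP payload between two
    tunnel endpoints across the routed (L3) underlay.
    """
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags (8 bits) + reserved (24 bits)
    # Word 1: VNI (24 bits) + reserved (8 bits)
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decapsulate(payload: bytes) -> Tuple[int, bytes]:
    """Recover the VNI and the original Ethernet frame (hypothetical helper)."""
    if len(payload) < 8:
        raise ValueError("payload shorter than a VXLAN header")
    word0, word1 = struct.unpack("!II", payload[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return word1 >> 8, payload[8:]


# A broadcast frame from a host on virtual segment 10100 (made-up values).
frame = b"\xff" * 6 + b"\x00\x11\x22\x33\x44\x55" + b"\x08\x06" + b"arp"
packet = vxlan_encapsulate(10100, frame)
vni, recovered = vxlan_decapsulate(packet)
```

The inner frame comes out of the tunnel unchanged, which is why endpoints on either side believe they share one L2 segment. Which remote tunnel endpoint should receive a given frame is the question the control protocol answers.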
EVPN is a mature technology that has been available in Multiprotocol Label Switching (MPLS) networks for some time. A draft standard that adapted it to Virtual Extensible LAN (VXLAN) has been available and relatively stable, with multiple vendor implementations. A lot of additional work has been in progress at the Internet Engineering Task Force (IETF), the standards body that governs IP-based technologies. In short, EVPN has slowly been gathering force as the alternative to controller-based VXLAN solutions. And by the summer of 2017, its moment in the data center had come.
Companies adopted VXLAN and the world of network virtualization but wanted native VXLAN routing (or RIOT, as it is often called, for Routing In and Out of Tunnels). Network operators had tried to love the one they were with and failed. Merchant switching silicon with RIOT support started to arrive in volumes to support real deployments. The missing piece was a technology that enabled this new functionality without the use of controllers. EVPN was that missing piece.
What had happened to the promised world of Software-Defined Networking (SDN), where endpoints would set up and control their own membership lists and the network had a single job as the great connector? For one reason or the other, some technical and some not, that play had failed to be the blockbuster it had been promised to be.
So, why should you pick up this book? If you perform a web search for EVPN, I venture that what you’ll uniformly find is something that is very complex to understand. Owing to EVPN’s origins in the service provider (SP) world, the standards document is peppered with terminology that does not make sense in the data center world. Furthermore, the explanations of even the most basic concepts are spread across several documents, leaving the task of piecing it all together to you.
My aim is to explain EVPN in the simplest terms possible—to make the technology accessible so that network operators and architects can understand its use for the cases cited at the beginning of this book. And hopefully, the book does more than that, explaining the concepts and practicalities in a way that helps you to use it in other, novel cases. This is a book that explores the why, not just the how. I remain vendor agnostic in all this to the extent possible.
And I expect you, my reader, to be a network architect or network operator. I assume that you are somewhat familiar with the basics of BGP and Clos networks. If not, I recommend, if a little abashedly, the prequel to this book, BGP in the Data Center (O’Reilly, 2017), for more detailed explanations of these concepts.
The story begins with a study of the two basic building blocks of EVPN: network virtualization, and the adaptation of BGP to the needs of network virtualization. We then explore how bridging and routing work in an EVPN world. After that, we turn to the configuration and management of EVPN networks. We conclude with some considerations for deploying EVPN in larger networks. I do not discuss L3 multicast and the data center interconnect use cases in this book. They’re evolving quickly, from both a standards and a deployment standpoint. Although some early implementations are available, I prefer to see a little more experience before talking about them in more than generic terms.
The hounds of complexity are forever at the gates. EVPN is a complex piece of technology, but one that you can tame, if you refrain from chasing after every single knob and optimization drafted and designed and sold. If you choose perfection as the destination, you savor it but for a moment, as the ever-changing world barges in. If you choose perfection as a journey, you can savor it much longer. One of the key ingredients of success is the KISS principle—Keep it Simple, Stupid—that has made networks, especially in the data center, interesting, scalable, and reliable. Keep your intent simple, and you don’t need to pay others to decipher your intent, often to their benefit, not yours.
If there is one takeaway and one alone, it is that EVPN in the data center can be a far simpler and, dare I say, more attractive beast than its SP cousin.
And, oh yes, the large financial company that I referred to earlier has deployed EVPN.
Software Used in This Book
I have used the open source routing suite FRR as the basis of configuration and examples, largely because it is open source and shows how simple EVPN configuration can be. There is a companion GitHub site to this book that allows you to use Vagrant to build out and play with the topology and configuration described in Chapter 6.