Architecting Modern Data Platforms by Lars George, Paul Wilkinson, Ian Buss, Jan Kunigk

Chapter 14. Basics of Virtualization for Hadoop

In this chapter, we assess virtualization technologies on a basic level. Although virtualized IT infrastructure scales well when stacking individual small to medium-sized applications, scaling virtual compute clusters and distributed systems requires special attention.

We begin with compute virtualization, which means running virtual machines (VMs) in a hypervisor, such as KVM or VMware. This is the most basic and well-defined building block in virtualized infrastructure. (In addition to virtualization on hypervisors, containerization is an emerging and relevant technology for enterprises; we cover it in Chapter 15.)

Even more important to our discussion of Hadoop in the cloud is the subject of storage virtualization, which we look at next. This means abstracting storage devices into containers that are centrally hosted in remote storage arrays based on storage area network (SAN) or object storage technology.

The third layer of virtualization to consider is network virtualization, also referred to as software-defined networking (SDN). As we will see, your choice of virtualization mechanisms will drive the lifecycle model of your clusters in the cloud.

We will cover all of these subjects in this chapter.
