Hadoop and Spark Performance for the Enterprise

Ensuring Quality of Service in Multi-Tenant Environments

Virtually every enterprise depends on big data analysis, but distributed computing environments such as Hadoop and Spark are complicated, to say the least. Multiple users, business units, and workload types often compete for valuable computing resources. Monitoring tools are not well equipped to handle this level of complexity, and typically provide only very high-level and historical information. The lack of fine-grained visibility for making real-time adjustments to running workloads means that high-priority jobs can easily be pushed aside by lower-priority jobs.

It’s time to bring Quality of Service (QoS) to distributed processing in multi-tenant Hadoop environments. This O’Reilly report explains how QoS allows operators to assign priorities to jobs, ensuring that higher-priority tasks get the resources needed to meet critical deadlines. Author Andy Oram examines the critical role of performance in the evolution of operating systems, data warehouses, and distributed processing. He also discusses Quasar (part of Mesos) and Pepperdata, two tools that can help improve performance in distributed computing environments.

You’ll discover how QoS tools can help distributed environments evolve to accommodate:

  • Multiple users contending for resources, much as they do on an operating system
  • Jobs that grow or shrink in hardware usage, so they don’t strain at resource limits or let resources go to waste
  • Jobs of different priorities, including soft real-time requirements that allow them to override lower-priority or ad hoc jobs
  • Performance guarantees, similar to service-level agreements (SLAs)

Andy Oram

Andy Oram is an editor at O'Reilly Media, a highly respected book publisher and technology information provider. An employee of the company since 1992, Andy currently specializes in open source, software engineering, and health IT, but his editorial output has ranged from a legal guide covering intellectual property to a graphic novel about teenage hackers. His work for O'Reilly includes the influential 2001 title Peer-to-Peer, the 2005 ground-breaking book Running Linux, and the 2007 best-seller Beautiful Code. Andy also writes often for O'Reilly's Radar site (http://radar.oreilly.com/) and other publications on policy issues related to the Internet and on trends affecting technical innovation and its effects on society. Print publications where his work has appeared include The Economist, Communications of the ACM, Copyright World, the Journal of Information Technology & Politics, Vanguardia Dossier, and Internet Law and Business. His web site is www.praxagora.com/andyo.