Live Online Training

More Effective DevOps Testing

Real World - Not Just the Idealized Approach

Topic: System Administration
Cliff Berg

Companies like Google and Amazon pioneered DevOps testing methods, using practices such as on-demand test environment provisioning, massive-scale on-demand automated end-to-end testing, automated dependency management, a self-service test tool development group, and autonomous teams. But are all these methods right for you? Not everyone can afford to operate at the scale of Google and Amazon. What mix of practices is right for your organization?

In this course, students learn how to define a holistic approach to digital product testing, so that products can be delivered at a rapid cadence while being fully tested, with all risks managed through tests. This goes beyond the standard “Agile testing” and “pipeline” normally described in Agile and DevOps literature, and reflects what has worked in real-world situations for complex digital platform products requiring tens or hundreds of teams. Topics covered include Behavior-Driven Development (BDD) in the context of integration testing, test isolation, managing behavioral coverage, failure mode testing versus “chaos engineering”, concurrency testing, product-level Definition of Done, shift-left integration testing, when to mock and when not to, managing dependencies, merging after a pull request instead of before, the unique testing needs of highly distributed applications, and defining a holistic product test strategy.

Scope: Testing is a broad topic. This course focuses primarily on the testing of services and end-to-end integration. Tooling specifically for testing user interfaces is not covered for lack of time, although all of the concepts and methods apply. The focus is also on business applications, not embedded or kernel testing. Testing of serverless code is not specifically covered either, although the practices covered are highly applicable.

What you'll learn and how you can apply it

  • How to approach product testing in a holistic manner.
  • How to define a test strategy that manages all risk through automated tests.
  • Why and how testing highly distributed multi-product, multi-component systems (e.g., microservice-based and event-oriented systems) differs from testing monolithic systems.
  • How to manage test coverage for tests other than unit tests.
  • How to approach testing for a complex digital platform that consists of many products, each consisting of many independently deployable components.
  • How to use Behavior-Driven Development (BDD) methods for the above purposes.
  • How to ensure test isolation.
  • How to manage and test the dependencies between the components of a product, and merge changes spanning multiple repos.
  • How to decide whether to use feature toggles or feature branch builds.
  • How to design tests for resiliency, a critical precursor for “chaos” methods.
  • How to “shift left” your integration tests.
  • How to write a product “Definition of Done” (DOD) that utilizes the above practices.

This training course is for you if you are one of the following:

  • Programmers.
  • Test programmers.
  • Solution architects and digital platform architects.
  • Tech leads.
  • Test leads.
  • Technical product leads.
  • DevOps engineers.
  • Agile coaches who have technical expertise.

Prerequisites

  • Some programming experience.
  • Some familiarity with modern digital platforms.
  • Some experience either on an Agile programming team or supporting or interacting with an Agile programming team.

Course Set-up

  • Cliff has detailed instructions for configuring your computer as a developer machine and installing VirtualBox here: https://agilegriffin.com/oreilly. This does not need to be done prior to training, as training will consist of demos, but if you wish to replicate what is done during training, these configurations will be very useful. You do not lose any functionality or compromise your system’s security by doing this; the changes merely make visible features that are present but hidden.

About your instructor

  • Cliff Berg started out in IT as an electrical engineer and then went on to write advanced “synthesis” compilers. He has been a pioneer in Web technology and Agile. In 1995 he co-founded a successful IT services startup, which grew to 200 people and was an early adopter (in 2000) of eXtreme Programming, and he authored one of the first books about Enterprise Java for Sun Microsystems Press. Cliff went on to consult for organizations that were trying to use Agile methods, helping them pinpoint the source of problems they were having with product reliability and security. Cliff’s focus then shifted to DevOps, and he has supported more than ten Agile or DevOps transformations, mostly at large organizations. In 2005 he authored the book High-Assurance Design, which explained how Agile teams could build secure and reliable systems, and more recently he launched the website Transition2Agile.com, which brings together Agilists and organization change experts. Today Cliff’s focus is primarily on helping large organizations adopt DevOps methods.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

  • Overview (15 min)
  • How things have changed since eXtreme Programming (XP): why today’s highly distributed systems need a different set of technical practices (10 min)
  • What tends to go wrong with highly distributed applications, i.e., those that use microservices and events (10 min)
  • Defining a test strategy (15 min)
  • Behavior-driven development (BDD) as integration tests (10 min)

Break (10 min)

  • Gherkin basics (30 min)
  • BDD workflow (15 min)
  • Test isolation (10 min)
  • Test data management and Definition of Ready (DOR) (10 min)

Break (10 min)

  • To mock, or not to mock (5 min)
  • Shift-left testing (10 min)
  • Managing dependencies, the Pull Request, and Definition of Done (DOD) (20 min)
  • Managing behavioral test coverage (5 min)
  • Failure mode tests and chaos engineering (10 min)
  • Concurrency testing (10 min)
  • API testing (15 min)
  • Wrap up (5 min)