In this episode of the O’Reilly Podcast, I talk with Tammy Butow, a site reliability engineer at Gremlin, and Annie Lau, a software engineering manager at Trulia, about creating a culture of learning, how experimentation is important to business, and their careers in tech.

Why chaos engineering is important: Butow explains, “If you don't do chaos engineering, you're just going with your gut, really, and just waiting for the bad things to happen. The idea with chaos engineering is that you're proactively injecting failure when everyone else is away, but all of your experts are there, available to ask questions, to answer questions, to work with each other, instead of it happening at 2:00 a.m. or 3:00 a.m. in the morning.”

The joys of software engineering: Lau says, “I fell in love with software engineering [because] you can actually see your finished product. You do something, and then you see the result. It's just an amazing feeling to see your work in action.”

The importance of learning and sharing knowledge: Lau says, “Technology is always changing. There's always new things on the horizon to help solve complex problems that we are facing today and will face in the future. So, it's very important that as engineers we keep on learning and keep on sharpening our skill sets.”

How learning something new can come in many different forms: Butow reflects, “Often I love to do things like lunch and learns. Share what you've learned across different projects that you've worked on. Get together as a team, a greater team, your whole engineering team. I think that sort of stuff is really important. You need to work with others. Otherwise you're never going to be able to really understand what the deep problems are that you need to fix.”

Trulia’s bug bounty program: Lau explains, “So with the bug bounty, by working with the hackers, we're really able to have that rich data that we can show the business people, and even the engineers who are not as familiar with security, how a vulnerability affects our users and systems. And by incentivizing our hackers to continue to hack us, we'll be able to keep up with the trend of typical vulnerabilities.”

Why incident management programs are essential: Butow explains, “Because the biggest thing that's so important is that everybody in your entire company needs to be able to say, ‘Hey, something's not working as expected.’ And the scenario that I love to use is, imagine if you're in sales, and you're about to go into a demo. You've been working for months. You've been preparing for the demo. You're super excited. You walk in, and everything's not working. Then what do you do?”

