How to Make Things Faster

by Cary Millsap
June 2023
Intermediate to advanced
356 pages
5h 32m
English
O'Reilly Media, Inc.

Chapter 69. Skew

Here’s a profile for a program that took almost 22 hours to run:

                      Duration
Subroutine         seconds       %       Count       Mean
single-block read   60,499   76.8%  10,013,394  0.006 042
other               18,290   23.2%     919,906  0.019 882
Total               78,789  100.0%  10,933,300  0.007 206

From this, you should be able to make some simple predictions:

1. How much time would you save if you could reduce the average “single-block read” call duration from 0.006 seconds to 0.004 seconds?
You would save 10,013,394 calls × (0.006 − 0.004) sec/call ≈ 20,000 seconds.
2. How much time would you save if you could eliminate all 10,013,394 “single-block read” calls?
You would save 60,499 seconds. Almost 17 hours.
3. How much time would you save if you could eliminate half of the “single-block read” calls?
You would save half of 60,499 seconds, right? Actually…probably not. It might work out that way, but it’s unlikely.
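
The arithmetic behind these three answers can be checked with a short Python sketch. The call count and total duration are copied from the "single-block read" row of the profile; the variable names are illustrative only. The third figure is the naive estimate, which holds only if every call costs exactly the mean.

```python
# Figures copied from the "single-block read" row of the profile.
calls = 10_013_394
total_seconds = 60_499

# Question 1: reduce the mean call duration from 0.006 s to 0.004 s
# (rounded values, as used in the text).
saved_q1 = calls * (0.006 - 0.004)

# Question 2: eliminate every call, so the whole row's duration goes away.
saved_q2 = total_seconds

# Question 3, naive version: assumes every call takes exactly the mean,
# so half the calls account for half the time.
saved_q3_naive = total_seconds / 2

print(f"Q1: about {saved_q1:,.0f} seconds")
print(f"Q2: {saved_q2:,} seconds ({saved_q2 / 3600:.1f} hours)")
print(f"Q3 (naive): {saved_q3_naive:,.1f} seconds")
```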

The snag is a data property called skew. Skew is nonuniformity in your data. A tidy little profile like the one I’ve shown here hides something important in that mean column: those ten million calls didn’t all have the same duration. Some took a lot less than 0.006 seconds, and some took a lot more. How much time you save when you eliminate half your calls depends on which calls you eliminate.
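
To see why the choice of calls matters, here is a minimal Python sketch using made-up data, not the measurements from the program above. It assumes 100 hypothetical calls: 90 fast ones at 0.001 seconds and 10 slow ones at 0.051 seconds, which also average out to 0.006 seconds per call. Durations are held as integer microseconds to keep the sums exact.

```python
# Hypothetical call durations in microseconds (illustrative data only).
# 90 fast calls and 10 slow calls; the mean works out to 6,000 us = 0.006 s.
durations_us = [1_000] * 90 + [51_000] * 10

total_us = sum(durations_us)
mean_us = total_us / len(durations_us)
half = len(durations_us) // 2

ordered = sorted(durations_us)

# Naive prediction: half the calls means half the time.
naive_saving_us = total_us // 2

# Eliminating the 50 fastest calls saves very little.
saving_if_fastest_us = sum(ordered[:half])

# Eliminating the 50 slowest calls saves almost everything.
saving_if_slowest_us = sum(ordered[-half:])

print(f"total: {total_us} us, mean: {mean_us:.0f} us")
print(f"naive saving:          {naive_saving_us} us")
print(f"drop 50 fastest calls: {saving_if_fastest_us} us")
print(f"drop 50 slowest calls: {saving_if_slowest_us} us")
```

With this invented data, removing half of the calls saves anywhere from about 8% to about 92% of the total time, even though the mean is the same 0.006 seconds in every case.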

Here is a profile showing “single-block read” calls grouped by duration. It’s a histogram with 11 buckets:

     Range (seconds)              Duration
     {min ≤ duration < max}       Seconds     %     Calls
 1.  0.000 000   0.000 001  ...
ISBN: 9781098147051