Let’s take everything you learned in the previous chapter on measurements and apply it here to write a benchmark function. To reiterate, here’s what such a function should do:
Run the code multiple times to gather measurements. It’s best if we can do 30 runs or more.
Discard the result of the first run to exclude warm-up effects and give caches a chance to fill.
Force GC before each run.
Run each measurement in a freshly forked process to make sure all runs are isolated and don’t interfere with each other.
Store all measurements somewhere (in a file, on S3, etc.) to be processed later.
Calculate and report the average performance and its standard deviation.
This list makes for a pretty detailed spec, so let’s go ahead and write ...
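As one possible shape for it, the spec above can be sketched in Python. The `benchmark` name and the snippet-as-string interface are hypothetical choices, and per-run process isolation is approximated by launching a fresh interpreter with `subprocess` rather than a literal `fork`:

```python
import json
import statistics
import subprocess
import sys
import tempfile

def benchmark(snippet: str, runs: int = 30) -> tuple[float, float]:
    """Benchmark a Python snippet per the spec: isolated runs, forced GC,
    a discarded warm-up run, stored raw timings, and mean/stdev reporting."""
    # Child program: force GC, then time the snippet once and print the result.
    child = (
        "import gc, time\n"
        "gc.collect()\n"                      # force GC before the measured run
        "t0 = time.perf_counter()\n"
        f"{snippet}\n"
        "print(time.perf_counter() - t0)\n"
    )
    timings = []
    for _ in range(runs + 1):                 # +1 for the discarded warm-up run
        result = subprocess.run(              # fresh process = isolated run
            [sys.executable, "-c", child],
            capture_output=True, text=True, check=True,
        )
        # The timing is the last line the child printed.
        timings.append(float(result.stdout.strip().splitlines()[-1]))
    timings = timings[1:]                     # skip the warm-up run

    # Persist the raw measurements for later processing.
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump(timings, f)

    return statistics.mean(timings), statistics.stdev(timings)
```

A call like `benchmark("sum(range(10_000))", runs=30)` then returns the mean and standard deviation of the thirty measured runs, in seconds.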