Knock, knock.
Branch prediction.
Who’s there?
— A classic programming joke
One of the most common bottlenecks in many benchmarks is the CPU. Proper design and analysis of CPU-bound benchmarks require knowledge of the runtime and hardware “features” that can affect performance. Each .NET runtime has many optimizations that can improve (or spoil) the performance of your code. Each CPU microarchitecture has many low-level mechanisms that also affect measurements. If you are not aware of these optimizations and mechanisms, it’s hard to design some benchmarks ...