Appendix 6
Experimental Efficiency of Software Data Preloading and Prefetching for Embedded VLIW
For our experiments, we used a cycle-accurate simulator provided by STMicroelectronics. The astiss simulator can model a nonblocking cache. We fixed the number of miss status holding registers (MSHR), that is, the size of the pending loads queue, at eight. We chose eight because, during experimentation, we observed that instruction-level parallelism (ILP) and register pressure reach a limit at that setting; a larger MSHR does not yield more performance. We use a simulator for our experiments for many reasons: