Because it shapes the 970’s issue queue design, the group formation scheme has some interesting performance implications. Specifically, proper code scheduling matters on the 970 in ways that it normally wouldn’t on the other processors discussed here.
Figure 10-6. The 970's vector issue queues
Instead of trying to explain this point, I’ll illustrate it with an example. Let’s look at an instruction with few group restrictions, the floating-point add. The 970’s group formation rules dictate that the fadd can go into any of the four dispatch slots, and which slot it goes into in ...