The G4e’s very straightforward floating-point implementation has a single FPU that takes a minimum of five cycles to finish executing the fastest floating-point instructions. (Some instructions take many more cycles.) The FPU is served by 48 microarchitectural floating-point registers (32 registers for the PPC ISA and 16 additional rename registers). Finally, single- and double-precision floating-point operations take the same amount of time.
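To make the five-cycle minimum concrete, here is a minimal back-of-the-envelope sketch in Python. The function names are hypothetical, and the second function assumes a fully pipelined unit that can accept one new instruction per cycle, which the text above does not state; treat it as an idealized model rather than a description of the actual hardware.

```python
import math

# Minimum floating-point latency quoted in the text for the G4e.
G4E_MIN_FP_LATENCY = 5


def dependent_chain_cycles(n_ops: int, latency: int) -> int:
    """Cycles for n_ops instructions where each one needs the previous result.

    Each instruction must wait for its predecessor to finish, so the
    latencies simply add up.
    """
    return n_ops * latency


def independent_cycles(n_ops: int, latency: int, units: int = 1) -> int:
    """Cycles for n_ops independent instructions.

    ASSUMPTION (not from the text): every unit is fully pipelined and
    can start one new instruction per cycle.
    """
    if n_ops == 0:
        return 0
    issue_cycles = math.ceil(n_ops / units)  # cycles needed to issue everything
    return latency + issue_cycles - 1        # the last issued op still needs `latency` cycles


if __name__ == "__main__":
    print(dependent_chain_cycles(4, G4E_MIN_FP_LATENCY))  # 20
    print(independent_cycles(4, G4E_MIN_FP_LATENCY))      # 8
```

Under these assumptions, four back-to-back dependent instructions cost 20 cycles on a single five-cycle FPU, while four independent ones would finish in 8. The `units` parameter is there only so the same model can be reused for designs with more than one FPU.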
The 970’s floating-point implementation is almost exactly like the
G4e’s, except there’s twice as much hardware. The 970 has two identical
FPUs, each of which can execute the fastest floating-point instructions
(such as fadd) in six cycles. As with the G4e, single- and double-precision ...