G.1 TMS320C64X PROCESSOR
Another member of the C6000 family of processors is the C64x, which can operate at a much higher clock rate. The C6416 DSK operates at 1 GHz, for a 1.00-ns instruction cycle time. Features of the C6416 architecture include four 16 × 16-bit multipliers (each .M unit can perform two multiplies per cycle), sixty-four 32-bit general-purpose registers, and more than 1 MB of internal memory, consisting of 1 MB of L2 RAM/cache and 16 kB each of L1P program cache and L1D data cache [1–7].
The C64x is based on the VelociTI.2 architecture, which is an extension of VelociTI. The extra registers allow for packed data types to support four 8-bit or two 16-bit operations associated with one 32-bit register, increasing parallelism. For example, the instruction MPYU4 performs four 8-bit multiplications within a single instruction cycle time. Several special-purpose instructions have also been added to handle many operations encountered in wireless and digital imaging applications, where 8-bit data processing is common. In addition, the .M unit (for multiply operations) can also handle shift and rotate operations. Similarly, the .D unit (for data manipulation) can also handle logical operations. The C64x is a fixed-point processor. Existing instructions are available to more units. Double-word load (LDDW) and store (STDW) instructions can access 64 bits of data, with up to two double-word load or store instructions per cycle (read or write 128 bits ...