Chapter 2. System structure and design 41
Data transfer between the CEC memory and attached I/O devices or CPCs is done through
the Memory Bus Adapter. The physical path includes the Channel card (except for STI-connected
CPCs), the Self-Timed Interconnect (STI) bus, possibly an STI extender card, the Storage
Control, and the Storage Data chips.
For more detailed information about I/O connectivity and channel types, see
Section 2.2.12, “I/O subsystem” on page 71.
Dual External Time Reference
The optional ETR connections, although not part of the book design, are found adjacent to
the books on the opposite side of the CEC board. The z990 servers implement an Enhanced
ETR Attachment Facility (EEAF), which provides dual External Time Reference (ETR)
attachments. Two ETR cards are automatically shipped when Coupling Links are ordered. They
provide a dual-path interface to the IBM Sysplex Timers, which are used for timing
synchronization between systems in a Sysplex environment. The dual path allows continued
operation even if a single ETR card fails, and the redundant design also allows concurrent
maintenance.
2.2.3 Processor Unit design
Each PU is optimized to meet the demands of new e-business workloads, without
compromising the performance characteristics of traditional workloads. The PUs in the z990
have a superscalar design.
Superscalar processor
A scalar processor is based on a single-issue architecture, which means that only a single
instruction is executed at a time. A superscalar processor is based on a multi-issue
architecture: it adds resources to the microprocessor to create multiple pipelines, each
working on its own set of instructions, so that instructions can be executed concurrently.
Because multiple instructions can be executed in each cycle, such a processor reaches a
higher level of complexity: an operation in one pipeline may depend on data in another
pipeline. A superscalar design therefore demands careful consideration of which instruction
sequences can successfully operate in a multi-pipeline environment.
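The effect of such dependencies can be illustrated with a small Python sketch. It is a toy model, not a description of the z990 pipeline: the function name cycles_needed, the register names, and the rule that a dependent instruction must wait for the next cycle are all assumptions made for illustration.

```python
def cycles_needed(program, issue_width):
    """Count cycles for in-order issue of `program` (hypothetical toy model).

    program: list of (dest, sources) tuples, one per instruction.
    Assumption: an instruction cannot issue in the same cycle as an
    earlier instruction that writes one of its source registers.
    """
    cycles = 0
    i = 0
    while i < len(program):
        cycles += 1
        written_this_cycle = set()
        issued = 0
        while i < len(program) and issued < issue_width:
            dest, sources = program[i]
            if written_this_cycle & set(sources):
                break  # depends on an instruction issued in this cycle
            written_this_cycle.add(dest)
            issued += 1
            i += 1
    return cycles


# Four instructions with no dependencies between them.
independent = [("r1", ()), ("r2", ()), ("r3", ()), ("r4", ())]
# Four instructions where each one reads the result of the previous one.
dependent = [("r1", ()), ("r2", ("r1",)), ("r3", ("r2",)), ("r4", ("r3",))]

print(cycles_needed(independent, 1), cycles_needed(independent, 2))
print(cycles_needed(dependent, 1), cycles_needed(dependent, 2))
```

Under these assumptions, the independent sequence needs half as many cycles at an issue width of 2, while the dependent chain gains nothing from the second pipeline.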
For example, if the branch prediction logic of the microprocessor makes a wrong prediction,
the instructions already in the parallel pipelines may also have to be removed
(refer to “Processor Branch History Table (BHT)” on page 44 for more details).
There are challenges in creating an efficient superscalar processor. The superscalar design
of the z990 PU has made big strides in avoiding address generation interlock situations.
Instructions that require information from memory locations may suffer multi-cycle delays
while the memory content is fetched. The superscalar design of the z990 PU tries to overcome
these delays by continuing to execute (single-cycle) instructions that do not cause delays.
The technique used is called “out-of-order operand fetching”. This means that some
instructions in the instruction stream are already underway while earlier instructions,
delayed by storage references, take longer. Eventually, the delayed instructions catch up
with the already fetched instructions, and all are executed in the designated order.
The z990 PU gets much of its superscalar performance benefit from avoiding address
generation interlocks.
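The benefit of letting operand fetches overlap while execution stays in program order can be sketched in Python. This is a simplified model under stated assumptions, not the z990 implementation: the function names, the one-cycle execute time, and the rule that one operand fetch can be started per cycle are all hypothetical.

```python
def blocking_fetch_cycles(fetch_latencies):
    """Assumed baseline: each operand fetch starts only after the
    previous instruction has finished executing (1 cycle to execute)."""
    return sum(latency + 1 for latency in fetch_latencies)


def overlapped_fetch_cycles(fetch_latencies):
    """Assumed model: one operand fetch is started per cycle, so fetches
    may complete out of order; execution stays strictly in program order."""
    done = 0  # cycle at which the previous instruction finished executing
    for start, latency in enumerate(fetch_latencies):
        operand_ready = start + latency          # fetch began at cycle `start`
        done = max(operand_ready, done) + 1      # wait for operand and predecessor
    return done


# Operand fetch latencies in cycles: two slow storage references, two fast ones.
latencies = [5, 1, 5, 1]
print(blocking_fetch_cycles(latencies), overlapped_fetch_cycles(latencies))
```

In this model the operands of the second and fourth instructions arrive before those of the slower instructions ahead of them, yet every instruction still executes in its original order, and the delays of the two slow fetches largely overlap instead of adding up.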
The processor is not the only contributor to the successful execution of instructions in
parallel. Given a superscalar design, compilers and interpreters must create code that
benefits optimally from the particular superscalar processor implementation. Work is