A fully pipelined single-precision floating-point unit in the synergistic processor element of a CELL processor

HJ Oh, SM Mueller, C Jacobi, KD Tran, SR Cottier, BW Michael, H Nishikawa, Y Totsuka…
IEEE Journal of Solid-State Circuits, 2006
The floating-point unit (FPU) in the synergistic processor element (SPE) of a CELL processor is a fully pipelined 4-way single-instruction multiple-data (SIMD) unit designed to accelerate media and data streaming with 128-bit operands. It supports 32-bit single-precision floating-point and 16-bit integer operands with two latencies, six and seven cycles, at 11 FO4 delay per stage. The FPU optimizes the performance of critical single-precision multiply-add operations. Because exact rounding, exceptions, and denormalized-number handling are not important to multimedia applications, IEEE correctness for single-precision floating-point numbers is sacrificed for performance and design simplicity. The FPU employs fine-grained clock gating to save power. The design has 768K transistors in 1.3 mm², fabricated in 90-nm SOI technology. Correct operation has been observed up to 5.6 GHz at 1.4 V and 56 °C, delivering 44.8 GFlops. Architecture, logic, circuits, and integration are codesigned to meet the performance, power, and area goals.
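
The 44.8 GFlops figure is consistent with the other numbers in the abstract. The following is a minimal back-of-the-envelope sketch, not taken from the paper: it assumes one 4-way SIMD multiply-add issues every cycle (which "fully pipelined" suggests) and that each multiply-add counts as two floating-point operations. The function and constant names are illustrative.

```python
# Sanity check of the peak throughput quoted in the abstract.
# Assumptions (not stated explicitly in the abstract):
#   - one SIMD multiply-add issues per cycle (fully pipelined unit)
#   - a multiply-add is counted as 2 floating-point operations

SIMD_LANES = 4        # 4-way SIMD: 128-bit operands / 32-bit single precision
FLOPS_PER_FMA = 2     # assumed convention: multiply + add
CLOCK_GHZ = 5.6       # highest frequency with correct operation (1.4 V, 56 °C)


def peak_gflops(lanes: int, flops_per_op: int, clock_ghz: float) -> float:
    """Peak GFlops when one SIMD operation completes every cycle."""
    return lanes * flops_per_op * clock_ghz


def cycle_time_ps(clock_ghz: float) -> float:
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / clock_ghz


if __name__ == "__main__":
    print(f"peak throughput: {peak_gflops(SIMD_LANES, FLOPS_PER_FMA, CLOCK_GHZ):.1f} GFlops")
    print(f"cycle time:      {cycle_time_ps(CLOCK_GHZ):.1f} ps")
```

Under these assumptions the product 4 × 2 × 5.6 reproduces the quoted 44.8 GFlops, and the clock period at 5.6 GHz is roughly 179 ps per pipeline stage.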