Microarchitecture and implementation of the synergistic processor in 65-nm and 90-nm SOI
B Flachs, S Asano, SH Dhong, HP Hofstee, G Gervais, R Kim, T Le, P Liu, J Leenstra…
IBM Journal of Research and Development, 2007
This paper describes the architecture and implementation of the original gaming-oriented synergistic processor element (SPE) in both 90-nm and 65-nm silicon-on-insulator (SOI) technology and introduces a new SPE implementation targeted at the high-performance computing community. The Cell Broadband Engine™ processor contains eight SPEs. Each SPE is a dual-issue, four-way single-instruction multiple-data (SIMD) processor designed to achieve high performance per unit area and per unit power, and it is optimized to process streaming data, simulate physical phenomena, and render objects digitally. Most aspects of data movement and instruction flow are controlled by software to improve the performance of the memory system and the performance density of the core. The SPE was designed as an 11-FO4 processor (a cycle time of 11 fan-out-of-4 inverter delays) using 20.9 million transistors within 14.8 mm² in the IBM 90-nm SOI low-k process. Static CMOS (complementary metal-oxide semiconductor) gates implement the majority of the logic. Dynamic circuits are used in critical areas and occupy 19% of the area outside the static random access memory (SRAM) arrays. Instruction set architecture, microarchitecture, and physical implementation are tightly coupled to achieve a compact and power-efficient design. Correct operation has been observed at up to 5.6 GHz in 90-nm SOI and up to 7.3 GHz in 65-nm SOI technology.
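The figures quoted in the abstract can be related with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the per-FO4 delay it derives assumes that the highest observed frequency corresponds to exactly one 11-FO4 cycle, which the abstract does not state (operating voltage and temperature at those frequencies are not given).

```python
# Figures quoted in the abstract.
TRANSISTORS = 20.9e6   # transistors per SPE (90-nm SOI)
AREA_MM2 = 14.8        # SPE area in mm^2 (90-nm SOI)
FO4_PER_CYCLE = 11     # design target: 11 FO4 inverter delays per cycle
MAX_FREQ_GHZ = {"90-nm SOI": 5.6, "65-nm SOI": 7.3}  # highest observed correct operation


def transistor_density(transistors, area_mm2):
    """Average transistors per mm^2 over the whole SPE."""
    return transistors / area_mm2


def cycle_time_ps(freq_ghz):
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / freq_ghz


def implied_fo4_ps(freq_ghz, fo4_per_cycle=FO4_PER_CYCLE):
    """Per-FO4 delay in ps, ASSUMING the full cycle equals fo4_per_cycle FO4 delays."""
    return cycle_time_ps(freq_ghz) / fo4_per_cycle


if __name__ == "__main__":
    density = transistor_density(TRANSISTORS, AREA_MM2)
    print(f"density: {density / 1e6:.2f} M transistors/mm^2")
    for node, freq in MAX_FREQ_GHZ.items():
        print(f"{node}: cycle {cycle_time_ps(freq):.1f} ps, "
              f"implied FO4 {implied_fo4_ps(freq):.1f} ps")
```

Under these assumptions the density works out to roughly 1.4 million transistors per mm², and the ratio of the two peak frequencies (7.3 / 5.6) is about 1.3.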