
Copyright © National Academy of Sciences. All rights reserved.
The Future of Computing Performance: Game Over or Next Level?
The key observation motivating a CMP design is that to increase
performance when the overall design is power-limited, each instruction
needs to be executed with less energy. The power consumed is the energy
per instruction times the performance (instructions per second). Examination
of Intel microprocessor-design data from the i486 to the Pentium 4,
for example, showed that power dissipation scales as performance raised
to the 1.73 power after technology improvements are factored out. If the
energy per instruction were constant, the relationship would be linear.
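
The scaling argument above can be checked numerically. The following Python sketch is illustrative only: the function names and the choice to normalize the baseline processor to 1 are assumptions made here, not part of the original analysis. It takes the quoted exponent of 1.73 and, using the relation that power equals energy per instruction times performance, shows that energy per instruction must then grow as performance raised to the 0.73 power.

```python
# Illustrative sketch: names and the normalization to a baseline of 1
# are assumptions, not taken from the report.

POWER_EXPONENT = 1.73  # empirical exponent quoted in the text (i486 to Pentium 4)


def relative_power(speedup: float, exponent: float = POWER_EXPONENT) -> float:
    """Power relative to the baseline, assuming power ~ performance ** exponent."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return speedup ** exponent


def relative_energy_per_instruction(
    speedup: float, exponent: float = POWER_EXPONENT
) -> float:
    """Energy per instruction relative to the baseline.

    Because power = (energy per instruction) * performance, dividing the
    relative power by the speedup leaves performance ** (exponent - 1).
    """
    return relative_power(speedup, exponent) / speedup


if __name__ == "__main__":
    for speedup in (1.0, 2.0, 6.0):
        power = relative_power(speedup)
        energy = relative_energy_per_instruction(speedup)
        print(f"speedup {speedup:.1f}x: power {power:.2f}x, "
              f"energy per instruction {energy:.2f}x")
```

Under these assumptions, a sixfold speedup corresponds to roughly 22 times the power and roughly 3.7 times the energy per instruction. Passing `exponent=1.0` reproduces the linear case, in which energy per instruction stays constant.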
Thus, the Intel Pentium 4 is about 6 times faster than the i486 in the same