
IBM Research Division

The 50B Transistor Challenge
Mikko Lipasti
Department of Electrical and Computer Engineering
University of Wisconsin - Madison
IBM T.J. Watson Research Center, July 22 and 23, 2008
July 22, 2008   © 2007 IBM Corporation

50B Transistors on a Chip?
▪ History
  – 1997 IEEE Computer Special Issue, 1B T/chip by 2007
    • 3 papers advocate single fast core – CMU, Michigan, Wisconsin
    • IRAM – Berkeley
    • RAW – MIT
    • SMT – Washington
    • Multicore – Stanford
▪ 11 years later, 50x more transistors
  – We still need faster cores: computation
    • Fundamentally constrained by power
  – Will get more than one core: communication
    • Need efficient interconnects and coherent caches
  – Will get lots of on-chip memory
    • Need to think about new algorithms and new approaches to use it

(1) What Will We Do With 50B Transistors?
▪ 50B transistors/chip dramatically alters data centers
▪ E.g. Nokia moving aggressively into services
  – Google, Yahoo, MSN each provision ~1M servers
  – Now provision for 10x installed base (phone vs. PC)
    • Witness recent problems with iPhone/MobileMe
▪ Impossible to anticipate applications
  – YouTube/Facebook/Flickr/Twitter
  – Unstructured real-world data
  – Organize, search, extract semantic knowledge, mashups, …
▪ Existing and future server apps all benefit

(2) How Will We Design Chips with 50B Transistors?
▪ Three things that processors need to be good at:
  – Computation
  – Communication
  – Storage/Memory
▪ Focus on cost and nature of computation
▪ Focus on cost of communication
▪ Shift emphasis to memory

Cost of Computation
▪ Less than 10% of energy spent on useful work
  – EPI (energy per instruction) overhead has gotten out of hand
  – Need to rethink operand delivery [ICCD’07], queues [ISLPED’07], caches, register files, control, …
▪ Exploit program attributes
  – Solve hard problems via elimination
    • Macro-ops: no single-cycle operations [MICRO’03, HPCA’06]
  – Do the hard parts with narrow values [JILP’07]
▪ Eliminate redundancy, excessive pipelines
  – Clever clock gating [ISLPED’06, ICCD’07]
  – Remove renaming, register file, clocked scheduler, pipelines [submitted]
▪ Goal: reduce EPI by 10x at fixed process technology and MIPS

Cost of Communication
▪ Reduce coherence overhead and speculation
  – Region coherence [ISCA’05, ASPLOS’06, HPCA’08]
▪ Exploit locality of communication patterns
  – Switched circuits [CALetters’07, NOCS’08]
  – On-chip multicasting [ISCA’08]
  – Multicast coherence [submitted]
▪ New technologies
  – Nanophotonic rings [HP Labs collaboration]
  – Massive bandwidth, speed-of-light latency
  – Lots of interesting problems to solve

Emphasis on Memory
▪ In future processes, memory will be easier than logic
  – Reliability, variability: well-known solutions (ECC, sparing)
  – Interesting new technologies (PCRAM, etc.)
  – Not caches: diminishing returns
▪ Return to more regular, “memory-like” devices and logic?
  – Gate array, LUT, PLA
▪ Majority of 50B T must not be switching
  – Remembering is cheaper than computing
    • Revisit value locality/reuse/memoization?
  – New search algorithms:
    • TCAM accelerator [ICCD’08]: logic in memory, but not IRAM!

Unstructured Real-World Data
▪ Internet is exploding with data
  – Text
  – Semantic knowledge
  – Photo, video, audio
▪ It is all in digital form, but all we can do is view and copy it
▪ Algorithms for analysis range from poor to nonexistent
  – Machine learning?
▪ Why not learn from nature?

Brains
(From the MICRO-40 Panel: Computing Beyond Von Neumann, Dec. 3, 2007)
▪ Human brain ≠ Von Neumann machine
  – Face recognition: <500ms
  – Neurons are slow:
    • Critical path is a handful of “gates”
  – Fundamentally different computational model
▪ Made of shoddy, unreliable parts
  “…neurons are noisy, unreliable devices, … the nervous system averages over many cells to compensate for these shoddy components.” - Christof Koch
▪ We can build it. We have the technology.

Brains (2)
▪ Human neocortex:
  – ~20B neurons, ~200T synapses
  – Structurally homogeneous
  – Hypothesis: runs common algorithm
▪ Apply architecture 101?
  – Abstraction layers
  – Hierarchy and replication
  – Simulation/analysis/synthesis
  – Let’s Build Brains!
  – Massively parallel fault-tolerant hardware
▪ Best news: no need for parallel programming
  – Train vs. program

Summary
▪ Computation
  – Reduce cost (EPI) by 10x
  – New algorithms
▪ Communication
  – Streamline coherence protocols, interconnects
  – Exploit new technologies
▪ Storage/Memory
  – Reliability/variability
  – Logic in memory/new algorithms
▪ Brain computing for unstructured real-world data

Questions?
http://www.ece.wisc.edu/~pharm
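Backup note on "Remembering is cheaper than computing" (Emphasis on Memory slide): a minimal software sketch of memoization, in which a result is stored the first time it is computed and looked up on every repeat. This only illustrates the idea; it is not the hardware value-reuse mechanism the slide alludes to. The function `expensive` and the counter `call_count` are hypothetical names introduced here as a stand-in for any costly, repeatable computation.

```python
from functools import lru_cache

call_count = 0  # how many times the expensive body actually executes


@lru_cache(maxsize=None)  # unbounded table of previously computed results
def expensive(x: int) -> int:
    """Hypothetical stand-in for a costly computation (assumption)."""
    global call_count
    call_count += 1
    return sum(i * i for i in range(x))


def run(inputs):
    """Evaluate every input; repeated inputs are served from the table."""
    return [expensive(x) for x in inputs]


# Five requests, but only two distinct inputs.
results = run([1000, 2000, 1000, 1000, 2000])
```

In this run the body executes once per distinct input; the other three requests are answered by a table lookup, trading storage for switching activity.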
