Chapter 18
Multicore Computers
Vy Luong
Multicore Computers
• Chip multiprocessor: combines two or more processors (cores) on a single die.
• Each core consists of: registers, ALU, pipeline hardware, control unit, and L1 cache.
Hardware Performance Issues
• Goal: increase instruction-level parallelism.
• Superscalar: replicate execution resources to enable parallel execution of instructions in parallel pipelines.
• Simultaneous multithreading (SMT): duplicate register banks so that multiple threads can share the use of pipeline resources.
• Problems: managing multiple threads, and power consumption.
Why Multicore?
• Control power density by using more of the chip area for cache memory (instead of logic transistors).
• Potential for near-linear performance improvement as cores are added.
Applications That Benefit From Multicore Systems
• Servers
• Multithreaded native applications
• Multiprocess applications
• Java applications
• Multi-instance applications
Valve Game Software
• Valve reprogrammed its Source engine software to use multithreading, exploiting the power of multicore processor chips from Intel and AMD.
• Twice the performance with coarse threading.
• Hybrid threading approach: combines coarse threading with fine-grained threading.
• Builds scene-rendering lists for multiple scenes in parallel (along with other graphics-related simulation).
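The difference between the two threading styles can be sketched in a few lines. This is only an illustration, assuming that coarse threading means one task per subsystem and fine-grained threading means splitting a single loop across workers; the subsystem names (render, physics, ai) are hypothetical and are not taken from the Source engine.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical subsystems, used only for illustration.
def render(frame):
    return f"render:{frame}"

def physics(frame):
    return f"physics:{frame}"

def ai(frame):
    return f"ai:{frame}"

def coarse_step(frame, pool):
    """Coarse threading: each whole subsystem runs as one task."""
    futures = [pool.submit(fn, frame) for fn in (render, physics, ai)]
    return [f.result() for f in futures]

def fine_grained_sum(values, pool, chunks=4):
    """Fine-grained threading: one loop is split into slices,
    and each slice is summed by a separate worker."""
    size = max(1, (len(values) + chunks - 1) // chunks)
    slices = [values[i:i + size] for i in range(0, len(values), size)]
    return sum(pool.map(sum, slices))
```

A hybrid approach would call both kinds of function from the same pool. Note that CPython threads share a global interpreter lock, so this sketch shows the structure of the two styles rather than a real speedup.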
Multicore Organization
• Variables in a multicore organization:
  • Number of core processors on the chip
  • Number of levels of cache memory
  • Amount of cache memory that is shared
Superscalar or SMT?
• Intel Core Duo: individual cores are superscalar.
• Intel Core i7: cores implement SMT.
• Advantage of SMT: it scales up the number of hardware threads that the system supports.
• A multicore system with four SMT cores, each supporting four simultaneous threads, appears the same as 16 cores at the application level.
• SMT appears to be more attractive than superscalar.
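The 16-core figure is simply cores multiplied by threads per core. A minimal sketch of that arithmetic, where the function name hardware_threads is made up for this example:

```python
def hardware_threads(cores: int, threads_per_core: int) -> int:
    """Number of hardware threads (logical processors) that software sees."""
    if cores < 1 or threads_per_core < 1:
        raise ValueError("cores and threads_per_core must be positive")
    return cores * threads_per_core

# The slide's example: four SMT cores, four simultaneous threads each.
example = hardware_threads(cores=4, threads_per_core=4)
```

On a real machine, Python's os.cpu_count() reports the number of logical CPUs, which already includes any SMT threads.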
Intel Core Duo
• Introduced in 2006.
• Two x86 superscalar processors.
• Separate thermal control units.
• Advanced Programmable Interrupt Controller (APIC)
Intel Core i7
• Introduced in November 2008.
• Four x86 SMT processors
• DDR3
• QuickPath Interconnect
ARM11 MPCore
• Can be configured with up to four processors.
• Distributed interrupt controller (DIC)
• Timer
• Watchdog
Interrupt Handling
• The DIC supports between 0 and 255 hardware interrupt inputs.
• It maintains a list of interrupts, showing their priority and status.
• The DIC satisfies two requirements:
  • Routing an interrupt request to a single CPU or to multiple CPUs, as required.
  • Providing interprocessor communication, so that a thread on one CPU can cause activity by a thread on another CPU.
• From the point of view of a given CPU, an interrupt is in one of three states:
  • Inactive: not asserted, or already fully processed by that CPU, although it may still be pending or active in other CPUs to which it is targeted.
  • Pending: asserted, but processing has not started on that CPU.
  • Active: processing has started on that CPU but is not yet complete.
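The three states can be modelled as a small per-CPU state machine. The sketch below is a toy model, not the ARM register interface: the class ToyDIC, its method names, and the rule that a lower number means higher priority are all assumptions made for illustration.

```python
from enum import Enum

class State(Enum):
    INACTIVE = "inactive"
    PENDING = "pending"
    ACTIVE = "active"

class ToyDIC:
    """Toy interrupt distributor that tracks state per (interrupt, CPU) pair."""

    def __init__(self):
        self.priority = {}  # irq -> priority (assumed: lower value wins)
        self.state = {}     # (irq, cpu) -> State

    def configure(self, irq, priority, targets):
        """Route an interrupt to one or more target CPUs."""
        self.priority[irq] = priority
        for cpu in targets:
            self.state[(irq, cpu)] = State.INACTIVE

    def assert_irq(self, irq):
        """The source asserts the interrupt: Inactive -> Pending on every target."""
        for (i, cpu), s in list(self.state.items()):
            if i == irq and s is State.INACTIVE:
                self.state[(i, cpu)] = State.PENDING

    def acknowledge(self, cpu):
        """A CPU starts its highest-priority pending interrupt: Pending -> Active."""
        pending = [i for (i, c), s in self.state.items()
                   if c == cpu and s is State.PENDING]
        if not pending:
            return None
        irq = min(pending, key=lambda i: self.priority[i])
        self.state[(irq, cpu)] = State.ACTIVE
        return irq

    def end_of_interrupt(self, irq, cpu):
        """The CPU finishes processing: Active -> Inactive for that CPU only."""
        if self.state.get((irq, cpu)) is not State.ACTIVE:
            raise ValueError("interrupt is not active on this CPU")
        self.state[(irq, cpu)] = State.INACTIVE
```

Because state is kept per CPU, an interrupt routed to two CPUs can be Inactive on the one that has finished with it while still Pending on the other, which is the case the Inactive definition describes.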
Cache Coherency
• Snoop control unit (SCU): resolves bottlenecks related to access to shared data.
• The SCU introduces three types of optimization:
  • Direct data intervention: enables copying clean data from one CPU's L1 data cache to another CPU's L1 data cache without accessing external memory.
  • Duplicated tag RAMs: duplicated versions of the L1 tag RAMs, used by the SCU to check for data availability before sending coherency commands to the relevant CPUs.
  • Migratory lines: enables moving dirty data from one CPU to another without writing to L2 and reading the data back in from external memory.
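The benefit of duplicated tag RAMs can be illustrated with a toy filter. This sketch assumes an invalidate-on-write policy, which the slides do not specify, and the class ToySCU and its methods are hypothetical. The point is only that the SCU consults its own copy of the tags and sends a coherency command just to the CPUs that actually hold the line.

```python
class ToySCU:
    """Toy snoop filter: one duplicated tag set per CPU, kept in the SCU."""

    def __init__(self, num_cpus):
        self.dup_tags = [set() for _ in range(num_cpus)]
        self.commands_sent = 0  # coherency commands actually issued

    def record_fill(self, cpu, line_addr):
        """Mirror a line fill into the duplicate tags for that CPU's L1."""
        self.dup_tags[cpu].add(line_addr)

    def record_evict(self, cpu, line_addr):
        """Mirror an eviction from that CPU's L1."""
        self.dup_tags[cpu].discard(line_addr)

    def write(self, writer, line_addr):
        """On a write, invalidate only the CPUs whose duplicate tags
        show the line (assumed policy: invalidate on write)."""
        holders = [cpu for cpu, tags in enumerate(self.dup_tags)
                   if cpu != writer and line_addr in tags]
        for cpu in holders:
            self.dup_tags[cpu].discard(line_addr)
            self.commands_sent += 1
        self.dup_tags[writer].add(line_addr)
        return holders
```

Without the duplicated tags, every write to shared data would have to query every other CPU's L1 cache. With them, a write to a line that nobody else holds sends no commands at all.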