Prelude to Multiprocessing
Detecting CPU and system-board capabilities with CPUID and the MP Configuration Table
CPUID
• Recent Intel processors provide a ‘cpuid’
instruction (opcode 0x0F, 0xA2) to assist
software in detecting a CPU’s capabilities
• If it’s implemented, this instruction can be
executed in any of the processor modes,
and at any of its four privilege levels
• But this ‘cpuid’ instruction might not be
implemented (e.g., 8086, 80286, 80386)
Intel x86 EFLAGS register
bit:  31 ... 22   21   20    19    18   17   16
      0  ...  0   ID   VIP   VIF   AC   VM   RF

bit:  15   14   13-12   11   10   9    8    7    6    5   4    3   2    1   0
      0    NT   IOPL    OF   DF   IF   TF   SF   ZF   0   AF   0   PF   1   CF
Software can ‘toggle’ the ID-bit (bit #21) in the 32-bit EFLAGS register
if the processor is capable of executing the ‘cpuid’ instruction
But what if there’s no EFLAGS?
• The early Intel processors (8086, 80286)
did not implement any 32-bit registers
• The FLAGS register was only 16-bits wide
• So there was no ID-bit that software could
try to ‘toggle’ (to see if ‘cpuid’ existed)
• How can software be sure that the 32-bit
EFLAGS register exists within the CPU?
Detecting 32-bit processors
• There’s a subtle difference in the way the
logical shift/rotate instructions work when
register CL contains the ‘shift-factor’
• On the 80286 and later CPUs (including
the 32-bit 80386+) the value in CL is
truncated to 5 bits, but not so on the
original 8086/8088
• Software can exploit this distinction as a
first check for EFLAGS, though an 80286
also passes it and has only 16-bit FLAGS
Detecting EFLAGS
# Here’s a test for the presence of EFLAGS
    mov   $-1, %ax      # a nonzero value
    mov   $32, %cl      # shift-factor of 32
    shl   %cl, %ax      # do logical shift
    or    %ax, %ax      # test result in AX
    jnz   is32bit       # CL was truncated: EFLAGS may be present
    jmp   is16bit       # AX became zero: EFLAGS absent
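The effect of truncating the shift-factor can be modeled in a few lines of Python. This is only a sketch of the arithmetic, not of real hardware; the function name `shl16` and its `mask_count` parameter are made up for illustration.

```python
def shl16(value, count, mask_count):
    """Model a 16-bit 'shl %cl, %ax'.

    mask_count=True  models a CPU that truncates CL to 5 bits;
    mask_count=False models a CPU that uses the whole of CL.
    """
    if mask_count:
        count &= 0x1F                   # keep only the low 5 bits of CL
    return (value << count) & 0xFFFF    # the result must fit in AX


# The test from the slide: AX = -1 (0xFFFF), CL = 32
print(hex(shl16(0xFFFF, 32, mask_count=True)))    # 32 & 31 == 0, so AX is unchanged
print(hex(shl16(0xFFFF, 32, mask_count=False)))   # all 16 bits are shifted out
```

A nonzero result therefore means the shift-factor was truncated, which is what the `jnz` in the assembly test detects.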
Testing for ID-bit ‘toggle’
# Here’s a test for the presence of the CPUID instruction
    pushfl              # copy EFLAGS contents
    pop   %eax          # to accumulator register
    mov   %eax, %edx    # save a duplicate image
    btc   $21, %eax     # toggle the ID-bit (bit 21)
    push  %eax          # copy revised contents
    popfl               # back into EFLAGS
    pushfl              # copy EFLAGS contents
    pop   %eax          # back into accumulator
    xor   %edx, %eax    # do XOR with prior value
    bt    $21, %eax     # did ID-bit get toggled?
    jc    y_cpuid       # yes, can execute ‘cpuid’
    jmp   n_cpuid       # else ‘cpuid’ unimplemented
How does CPUID work?
• Step 1: load value 0 into register EAX
• Step 2: execute ‘cpuid’ instruction
• Step 3: Verify ‘GenuineIntel’ character-string in registers (EBX,EDX,ECX)
• Step 4: Find maximum CPUID input-value
in the EAX register
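Step 3 can be illustrated by reassembling the string from the three register values. The sketch below does not execute ‘cpuid’; it only shows how the twelve characters are packed, four per register, in EBX, EDX, ECX order. The helper name `cpuid_vendor` is hypothetical.

```python
import struct


def cpuid_vendor(ebx, edx, ecx):
    """Join the registers, in EBX, EDX, ECX order, into 12 ASCII characters."""
    # Each register holds four characters, lowest-order byte first.
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")


# Register values an Intel CPU returns for 'cpuid' with EAX = 0
vendor = cpuid_vendor(0x756E6547, 0x49656E69, 0x6C65746E)
print(vendor)    # GenuineIntel
```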
Version and Features
• load 1 into EAX and execute CPUID
• Processor model and stepping information
is returned in register EAX
bits 27-20   Extended Family ID
bits 19-16   Extended Model ID
bits 13-12   Type
bits 11-8    Family ID
bits 7-4     Model
bits 3-0     Stepping ID
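The fields of the EAX value can be separated with shifts and masks. This is a minimal sketch; the function name `decode_signature` and the sample value are chosen only for illustration.

```python
def decode_signature(eax):
    """Split the EAX value returned by 'cpuid' (with EAX = 1) into bit-fields."""
    return {
        "stepping":   eax & 0xF,            # bits 3-0
        "model":      (eax >> 4) & 0xF,     # bits 7-4
        "family":     (eax >> 8) & 0xF,     # bits 11-8
        "type":       (eax >> 12) & 0x3,    # bits 13-12
        "ext_model":  (eax >> 16) & 0xF,    # bits 19-16
        "ext_family": (eax >> 20) & 0xFF,   # bits 27-20
    }


fields = decode_signature(0x000106A5)       # sample value, for illustration only
for name, value in fields.items():
    print(f"{name:10s} = {value:#x}")
```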
Some Feature Flags in EDX
bit:  28    13    9      3     2    1     0
      HTT   PGE   APIC   PSE   DE   VME   FPU

HTT = HyperThreading Technology (1 = yes, 0 = no)
PGE = Page Global Entries (1 = yes, 0 = no)
APIC = Advanced Programmable Interrupt Controller on-chip (1 = yes, 0 = no)
PSE = Page-Size Extensions (1 = yes, 0 = no)
DE = Debugging Extensions (1 = yes, 0 = no)
VME = Virtual-8086 Mode Enhancements (1 = yes, 0 = no)
FPU = Floating-Point Unit on-chip (1 = yes, 0 = no)
Some Feature Flags in ECX
bit:  5
      VMX
VMX = Virtual Machine Extensions (1 = yes, 0 = no)
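Testing a feature flag is a matter of checking one bit. The sketch below covers only the flags listed on these slides; the names `EDX_FLAGS`, `ECX_FLAGS`, and `features`, and the sample register values, are illustrative assumptions.

```python
# Bit positions taken from the two feature-flag slides
EDX_FLAGS = {"FPU": 0, "VME": 1, "DE": 2, "PSE": 3, "APIC": 9, "PGE": 13, "HTT": 28}
ECX_FLAGS = {"VMX": 5}


def features(edx, ecx):
    """Return the names of the listed flags whose bit is set."""
    names = [name for name, bit in EDX_FLAGS.items() if (edx >> bit) & 1]
    names += [name for name, bit in ECX_FLAGS.items() if (ecx >> bit) & 1]
    return names


# Made-up register values: EDX has bits 28, 9 and 0 set, ECX has bit 5 set
print(features(edx=0x10000201, ecx=0x00000020))
```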
Multiprocessor Specification
• It’s an industry standard, allowing OS software
to use multiple processors in a uniform way
• OS software searches in three regions of the
physical address-space below 1-megabyte for a
“paragraph-aligned” data-structure of length 16 bytes
called the MP Floating Pointer Structure:
– Search in lowest KB of Extended BIOS Data Area
– Search in topmost KB of conventional 640K RAM
– Search in the 128KB ROM-BIOS (0xE0000-0xFFFFF)
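The search over one region might be sketched as follows. Two details come from the MP Specification 1.4 rather than from this slide: the structure begins with the signature `_MP_`, and its 16 bytes sum to zero modulo 256. The function name and the fake memory region are made up for illustration; a real utility would read physical memory instead.

```python
def find_mp_floating_pointer(region):
    """Return the offset of a valid structure within 'region', or None."""
    for offset in range(0, len(region) - 15, 16):      # paragraph-aligned steps
        candidate = region[offset:offset + 16]
        if candidate[:4] == b"_MP_" and sum(candidate) % 256 == 0:
            return offset
    return None


# Build a fake 1KB region with a structure planted at offset 0x40
planted = bytearray(b"_MP_" + bytes(12))
planted[8] = 1                                  # length, in 16-byte units
planted[10] = (-sum(planted)) % 256             # checksum byte: total becomes 0
fake = bytearray(1024)
fake[0x40:0x50] = planted

print(hex(find_mp_floating_pointer(bytes(fake))))
```

The same function would be called once for each of the three regions, stopping at the first match.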
MP Floating Pointer Structure
• This structure may contain an ID-number
for one of a small number of standard SMP
system architectures, or may contain the
memory address for a more extensive MP
Configuration Table having entries that
specify a “customized” system architecture
• The machines in our classroom employ
the latter of these two options
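Choosing between the two options could look like the sketch below. The field layout is taken from the MP Specification 1.4 (signature, 32-bit table address, length, spec revision, checksum, then a feature byte that is zero when a configuration table is present); the function name and the sample table address are hypothetical, and the checksum is not verified here.

```python
import struct


def parse_floating_pointer(raw):
    """Report which option a 16-byte MP Floating Pointer Structure selects."""
    signature, table_addr, length, spec_rev, checksum, config_type = \
        struct.unpack_from("<4sIBBBB", raw)
    if signature != b"_MP_":
        raise ValueError("not an MP Floating Pointer Structure")
    if config_type != 0:
        return ("default", config_type)     # ID-number of a standard architecture
    return ("table", table_addr)            # address of the MP Configuration Table


# Made-up example: spec revision 4, configuration table at 0x000F5B30
raw = b"_MP_" + struct.pack("<IBBBB", 0x000F5B30, 1, 4, 0, 0) + bytes(4)
kind, value = parse_floating_pointer(raw)
print(kind, hex(value))
```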
An example record
• The MP Configuration Table will contain
a record for each logical processor
bytes 16-19   reserved (=0)
bytes 12-15   reserved (=0)
bytes 8-11    Feature Flags
bytes 4-7     CPU signature (stepping, model, family)
byte 3        CPU Flags: BP (bit 1), EN (bit 0)
byte 2        Local-APIC version
byte 1        Local-APIC ID
byte 0        Entry Type (=0)
BP = Bootstrap Processor (1=yes, 0=no), EN = Enabled (1=yes, 0=no)
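A record of this kind can be unpacked as shown in this sketch, which assumes the 20-byte little-endian layout given in the MP Specification 1.4. The function name and all sample values are made up for illustration.

```python
import struct


def parse_processor_entry(raw):
    """Decode one 20-byte processor record from the MP Configuration Table."""
    entry_type, apic_id, apic_version, cpu_flags, signature, feature_flags = \
        struct.unpack_from("<BBBBII", raw)
    if entry_type != 0:
        raise ValueError("not a processor entry")
    return {
        "apic_id":       apic_id,
        "apic_version":  apic_version,
        "enabled":       bool(cpu_flags & 0x01),    # EN is bit 0
        "bootstrap":     bool(cpu_flags & 0x02),    # BP is bit 1
        "signature":     signature,
        "feature_flags": feature_flags,
    }


# Made-up record: Local-APIC ID 1, enabled, not the bootstrap processor
raw = struct.pack("<BBBBII8x", 0, 1, 0x14, 0x01, 0x00000F29, 0xBFEBFBFF)
entry = parse_processor_entry(raw)
print(entry["apic_id"], entry["enabled"], entry["bootstrap"])
```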
Our ‘mpinfo.cpp’ utility
• We created a Linux utility that will display
the system-information contained in the
MP Configuration Table (in hex format)
• You can refer to the ‘MP Specification 1.4’
document (online) to interpret this display
• This utility needs a device-driver ‘dram.c’
to be pre-installed (so that it can directly
access the system’s memory)
A processor’s Local-APIC
• The purpose of each processor’s APIC is to
allow the CPUs in a multiprocessor system to
send messages to one another and to manage
the delivery of the interrupt-requests from the
various peripheral devices to one (or more) of
the CPUs in a dynamically programmable way
• Each processor’s Local-APIC has a variety of
registers, all ‘memory mapped’ to paragraph-aligned
addresses within the 4KB page at
physical-address 0xFEE00000
Local-APIC’s register-space
[Figure: the 4GB physical address-space, with RAM starting at 0x00000000 and the Local-APIC’s register page at 0xFEE00000]
Analogies with the PIC
• Among the registers in a Local-APIC are
these (which had analogues in the older
8259 PIC’s design):
– IRR: Interrupt Request Register (256-bits)
– ISR: In-Service Register (256-bits)
– TMR: Trigger-Mode Register (256-bits)
• For each of these, its 256-bits are divided
among eight 32-bit register addresses
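Finding the bit for a given interrupt ID-number within one of these 256-bit registers is simple arithmetic, sketched below. The register offsets (ISR at 0x100, TMR at 0x180, IRR at 0x200, with 16 bytes between successive 32-bit pieces) are taken from Intel’s documentation, not from this slide; the function name is hypothetical.

```python
APIC_BASE = 0xFEE00000
ISR, TMR, IRR = 0x100, 0x180, 0x200    # offset of each register's first 32 bits


def locate_vector_bit(register_offset, vector):
    """Return (physical address, bit number) holding the bit for 'vector'."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must be in the range 0..255")
    address = APIC_BASE + register_offset + 0x10 * (vector // 32)
    return address, vector % 32


address, bit = locate_vector_bit(IRR, 0x41)
print(hex(address), bit)
```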
New way to do ‘EOI’
• Instead of using a special End-Of-Interrupt
command-byte, the Local-APIC contains a
dedicated ‘write-only’ register (named the
EOI Register) which an Interrupt Handler
writes to when it is ready to signal an EOI
# issuing EOI to the Local-APIC
    mov   $0xFEE00000, %ebx       # address of the cpu’s Local-APIC
    movl  $0, %fs:0xB0(%ebx)      # write any value into EOI register
# Here we assume segment-register FS holds the selector for a segment-descriptor
# for a ‘writable’ 4GB-size expand-up data-segment whose base-address equals 0
Each CPU has its own timer!
• Four of the Local-APIC registers are used
to implement a programmable timer
• It can privately deliver a periodic interrupt
(or one-shot interrupt) just to its own CPU
– 0xFEE00320: Timer Vector register
– 0xFEE00380: Initial Count register
– 0xFEE00390: Current Count register
– 0xFEE003E0: Divider Configuration register
Timer’s Local Vector Table
0xFEE00320
bit 17      MODE: 0=one-shot, 1=periodic
bit 16      MASK: 0=unmasked, 1=masked
bit 12      BUSY: 0=not busy, 1=busy
bits 7-0    Interrupt ID-number
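A value for this register can be composed from its fields as in the following sketch. The function names and the interrupt ID-number 0x40 are chosen only for illustration.

```python
def timer_lvt(vector, periodic, masked):
    """Compose a value for the timer's Local Vector Table register."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("interrupt ID-number must fit in bits 7-0")
    return (int(periodic) << 17) | (int(masked) << 16) | vector


def is_busy(lvt_value):
    """Extract the BUSY bit (bit 12) from a value read back from the register."""
    return bool((lvt_value >> 12) & 1)


print(hex(timer_lvt(vector=0x40, periodic=True, masked=False)))   # 0x20040
```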
Timer’s ‘Divide-Configuration’
0xFEE003E0
bits 31-4   reserved (=0)
bit 2       0
Divider-Value field (bits 3, 1, and 0)
000 = divide by 2
001 = divide by 4
010 = divide by 8
011 = divide by 16
100 = divide by 32
101 = divide by 64
110 = divide by 128
111 = divide by 1
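Because the Divider-Value field skips bit 2, the 3-bit code has to be spread across bits 3, 1, and 0 before it is written. A minimal sketch, with a hypothetical function name:

```python
# 3-bit Divider-Value codes from the table above
DIVIDER_CODES = {2: 0b000, 4: 0b001, 8: 0b010, 16: 0b011,
                 32: 0b100, 64: 0b101, 128: 0b110, 1: 0b111}


def divide_config(divisor):
    """Return the register value that selects 'divisor'."""
    code = DIVIDER_CODES[divisor]
    # The top bit of the code goes to bit 3; the low two bits stay at bits 1-0.
    return ((code & 0b100) << 1) | (code & 0b011)


for divisor in (1, 2, 16, 32, 128):
    print(divisor, hex(divide_config(divisor)))
```

For example, ‘divide by 32’ (code 100) becomes the register value 0x8, not 0x4.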
Initial and Current Counts
0xFEE00380
Initial Count Register (read/write)
0xFEE00390
Current Count Register (read-only)
When the timer is programmed for ‘periodic’ mode, the Current Count counts
down at the CPU’s bus-clock rate divided by the Divide Configuration value,
generates an interrupt when it reaches zero, and is then automatically
reloaded from the Initial Count register
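The Initial Count needed for a given interrupt rate follows from the bus-clock frequency and the selected divisor. The sketch below assumes a 100 MHz bus clock purely for illustration; the real frequency depends on the machine, and the result is only approximate.

```python
def initial_count(bus_hz, divisor, interrupts_per_second):
    """Approximate Initial Count for the requested periodic interrupt rate."""
    return bus_hz // (divisor * interrupts_per_second)


BUS_HZ = 100_000_000        # assumed bus-clock frequency, not a measured value

count = initial_count(BUS_HZ, divisor=16, interrupts_per_second=1000)
print(count)                # 6250

# Doubling the divisor halves the interrupt rate for the same Initial Count
print(BUS_HZ // (32 * count))    # 500
```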
Using the timer’s interrupts
• Setup your desired Initial Count value
• Select your desired Divide Configuration
• Setup the APIC-timer’s LVT register with
your desired interrupt-ID number and
counting mode (‘periodic’ or ‘one-shot’),
and clear the LVT register’s ‘Mask’ bit to
initiate the automatic countdown operation
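The sequence above can be written down as an ordered list of register writes. This sketch only builds the list; it does not touch hardware, and the function name and sample values (count 6250, divider value 0x3, interrupt ID-number 0x40) are hypothetical.

```python
INITIAL_COUNT = 0xFEE00380
DIVIDE_CONFIG = 0xFEE003E0
TIMER_LVT     = 0xFEE00320


def timer_setup_writes(count, divide_value, vector, periodic):
    """Return (address, value) pairs in the order given on the slide."""
    lvt = (int(periodic) << 17) | vector    # MASK (bit 16) is left clear
    return [
        (INITIAL_COUNT, count),             # desired Initial Count
        (DIVIDE_CONFIG, divide_value),      # desired Divide Configuration
        (TIMER_LVT, lvt),                   # ID-number, mode, unmasked
    ]


for address, value in timer_setup_writes(6250, 0x3, 0x40, periodic=True):
    print(f"write {value:#010x} to {address:#010x}")
```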
In-class exercise #1
• Run the ‘cpuid.cpp’ Linux application (on
our course website) to see if the CPUs in
our classroom implement HyperThreading
(i.e., multiple logical processors in a cpu)
• Then run the ‘mpinfo.cpp’ application, to
see if the MP Base Configuration Table
has entries for more than one processor
• If both results hold true, then we can write
our own multiprocessing software in H235!
In-class exercise #2
• Run the ‘apictick.s’ demo (on our CS 630
website) to observe the APIC’s ‘periodic’
interrupt-handler drawing ‘T’s onscreen
• It executes for ten milliseconds (the 8254
is used here to create that timed delay)
• Try reprogramming the APIC’s Divider
Configuration register, to cut the interrupt
frequency in half (or perhaps to double it)