CS 300 – Lecture 21
Intro to Computer Architecture / Assembly Language
Virtual Memory
Next Homework
Sorry – it's not ready yet. It will be in the wiki Friday. The first part will be due the Thursday after break. The second will be due a week later. There will be one more homework after that.
Test Recap
Binary Numbers! Aaaaaargh!
Binary numbers WILL BE BACK on the final!
I pity the fool that can't think in binary!
Binary
Convert the following decimal numbers to 8-bit signed binary numbers.
12
-8
128
63
-128
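A quick way to check answers like these (not part of the original exercise) is to let C truncate the value to 8 bits and print the pattern; the helper below is a minimal sketch, assuming a two's-complement machine, with an illustrative name (print_s8) that is not from the slides. Note that 128 lies outside the signed 8-bit range -128..127, so its pattern reads back as -128.

#include <stdio.h>
#include <stdint.h>

/* Illustrative helper: print the 8-bit two's-complement pattern of v. */
static void print_s8(int v) {
    uint8_t bits = (uint8_t)v;          /* truncate to 8 bits (two's complement) */
    for (int i = 7; i >= 0; i--)
        putchar(((bits >> i) & 1) ? '1' : '0');
    printf("  (%d)\n", (int)(int8_t)bits);  /* value the pattern represents as signed */
}

int main(void) {
    int tests[] = { 12, -8, 128, 63, -128 };
    for (int i = 0; i < 5; i++)
        print_s8(tests[i]);             /* 128 prints as 10000000, i.e. -128: it does not fit */
    return 0;
}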
Add the following 8-bit signed binary numbers; indicate any overflows (but still give a result).
11001111 + 01000001
10000000 + 10000000
01110111 + 01110000
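The rule being tested here: two's-complement addition overflows exactly when both operands have the same sign and the truncated result has the opposite sign. Below is a minimal C sketch (not from the slides; the helper name add8 is illustrative) that applies that rule to one of the sums above.

#include <stdio.h>
#include <stdint.h>

/* Add two 8-bit signed values with wraparound and report overflow:
   overflow occurs when the operands share a sign but the sum does not. */
static int8_t add8(int8_t a, int8_t b, int *overflow) {
    int8_t sum = (int8_t)((uint8_t)a + (uint8_t)b);    /* wraparound 8-bit sum */
    *overflow = ((a >= 0) == (b >= 0)) && ((sum >= 0) != (a >= 0));
    return sum;
}

int main(void) {
    int ov;
    int8_t r = add8((int8_t)0x80, (int8_t)0x80, &ov);  /* 10000000 + 10000000 */
    printf("result = %d, overflow = %d\n", r, ov);     /* -128 + -128 wraps and overflows */
    return 0;
}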
Signed:
11001100 / 8
11110000 * 16
01010101 / 4
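For the shift versions of these: dividing a signed value by a power of two uses an arithmetic right shift, which copies the sign bit in from the left. A minimal C sketch follows (not from the slides; since C leaves >> on negative signed values implementation-defined, the sign bits are replicated by hand, and asr8/show are illustrative helpers).

#include <stdio.h>
#include <stdint.h>

/* Arithmetic right shift on an 8-bit pattern: shift in copies of the sign bit. */
static uint8_t asr8(uint8_t v, int n) {
    uint8_t sign = (v & 0x80) ? (uint8_t)(0xFF << (8 - n)) : 0;
    return (uint8_t)((v >> n) | sign);
}

/* Print an 8-bit pattern and its signed value. */
static void show(const char *label, uint8_t bits) {
    for (int i = 7; i >= 0; i--) putchar(((bits >> i) & 1) ? '1' : '0');
    printf("  = %d  (%s)\n", (int)(int8_t)bits, label);
}

int main(void) {
    show("11001100 / 8",  asr8(0xCC, 3));          /* -52 >> 3 = -7 (shift rounds toward -infinity) */
    show("11110000 * 16", (uint8_t)(0xF0 << 4));   /* -16 * 16 = -256: overflows 8 bits, leaving 0  */
    show("01010101 / 4",  asr8(0x55, 2));          /* 85 >> 2 = 21                                   */
    return 0;
}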
IEEE Float:
0 01111110 10100000000000000000000
Convert the number -8.5 to IEEE floating point (32 bits)
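Worked check for the last item (not part of the original slide): 8.5 = 1000.1 in binary = 1.0001 x 2^3, so for -8.5 the sign bit is 1, the exponent field is 127 + 3 = 130 = 10000010, and the fraction field is 0001 followed by 19 zeros:
1 10000010 00010000000000000000000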
Bit Fiddling
Write a MIPS code sequence which takes an IEEE float in $a0 and places the exponent only, converted to an integer between -128 and 127, in register $v0.
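For reference, the same bit fiddling in C (a sketch only; the exercise wants MIPS, and this version assumes the standard IEEE 754 single-precision layout with an exponent bias of 127; float_exponent is an illustrative name).

#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* Extract the 8-bit exponent field of an IEEE 754 single and remove the bias. */
static int float_exponent(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);        /* reinterpret the float's bit pattern */
    int field = (bits >> 23) & 0xFF;       /* exponent field occupies bits 23..30 */
    return field - 127;                    /* subtract the IEEE 754 bias */
}

int main(void) {
    printf("%d\n", float_exponent(-8.5f)); /* prints 3, since -8.5 = -1.0001b * 2^3 */
    return 0;
}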
More MIPS
*x = *(x+1)+2
Short Answer
If p points to a 64-bit floating point number, to increment p you add _______
(T / F) If you divide by 2 using a right shift, the result is rounded up if the number is odd.
To divide a signed integer by 2 using a right shift, you shift in _______ bits.
When you use a lw instruction, the memory address referenced must end in _________
(T / F) A “lw” instruction may access any word in the MIPS memory.
(T / F) A “j” instruction can jump to any instruction in the MIPS memory.
Short Answer
If p points to an 8-bit character, a ______ instruction fetches the character from memory.
(T / F) A function is free to change the value of $t1 without saving it on the stack.
(T / F) In C, the expression a[i] is the same as *(a+i).
Fast arithmetic on large integers is important today since computers commonly run ___________________________ software.
(T / F) Writing large assembly language programs is likely to cause brain damage.
Bugz
f: add $a0, $a1, $a0
   lw $a0, 4($a0)
   jal put_str          # Oops - clobbers $ra; save/restore it on the stack
   jr $ra
g: lw $t0, 0($s0)
   add $s1, $s1, $t0
   addi $s0, $s0, 1     # Oops - words are 4 bytes apart, so add 4
   addi $s2, $s2, -1
   bne $zero, $s2, g
Bugz
h: addi $sp, $sp, -4
sw $ra, 0($s0) # Oops - $sp
jal f1
addi $a0, $v1, 1
jal f2
la $ra, 0($sp) # Oops - lw
jr $ra
Mipsorama
void put_str(char *s);   /* string-printing routine used elsewhere in the course; declared so this compiles */

int f(char **a, int *b, int c) {
    int sum = 0;
    int i;
    while (c != 0) {
        i = *b;
        while (i != 0) {
            put_str(*a);
            sum++;
            i--;
        }
        b++; a++; c--;
    }
    return(sum);
}
Back to Caches …
Things to know:
* A cache is smaller but faster than the system being cached.
* Shape of the cache determines whether addresses conflict - direct mapped, associative, set (partially) associative (see the address-splitting sketch after this list)
* Replacement policies (LRU)
* Multi-level cache systems
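To make the "shape of the cache" point concrete, here is a minimal C sketch with assumed toy parameters (64 lines of 16 bytes each, a 1KB direct-mapped cache; the names and sample addresses are made up for illustration). Two addresses conflict exactly when they have the same index but different tags.

#include <stdio.h>
#include <stdint.h>

/* Assumed toy parameters: 64 lines of 16 bytes => 1KB direct-mapped cache. */
#define OFFSET_BITS 4                  /* 16-byte lines */
#define INDEX_BITS  6                  /* 64 lines      */

static uint32_t cache_index(uint32_t addr) {
    return (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
}

static uint32_t cache_tag(uint32_t addr) {
    return addr >> (OFFSET_BITS + INDEX_BITS);
}

int main(void) {
    uint32_t a = 0x00001234, b = 0x00005234;   /* same index, different tag */
    printf("a: index %u tag %u\n", cache_index(a), cache_tag(a));
    printf("b: index %u tag %u\n", cache_index(b), cache_tag(b));
    /* a and b map to the same line, so in a direct-mapped cache they evict
       each other; in a 2-way set associative cache they can coexist. */
    return 0;
}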
Dual Caches
One possible cache design is to separate instruction caching from data caching.
There are major differences in the access patterns for instructions & data (I & D):
* No writes to instructions (simplifies cache design)
* Instructions are more sequential – pre-loading is a big issue. A less associative design is possible
* A data cache has to worry about regular access patterns (much array code)
Cache Coherence
This is a problem when more than one party is using a cache.
If two processors use a common memory, their on-chip caches can lose coherence.
How to deal with this?
* Write-through (cache is never out of synch with memory) instead of write-back (avoid writing dirty cache words until replacement); the difference is sketched after this list.
* Invalidation signals: when processor A writes into memory, it must invalidate the corresponding word in processor B's cache (or update it)
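A toy single-line sketch of the write-through / write-back contrast in C (an illustration with made-up structure and names, not a real coherence protocol): write-through updates memory on every store, while write-back only sets a dirty bit and defers the memory update until the line is evicted or invalidated.

#include <stdint.h>

/* One cache line in a toy model; `memory` stands in for main memory. */
typedef struct { int valid, dirty; uint32_t tag, data; } line_t;

static uint32_t memory[1024];

/* Write-through: cache and memory are updated together, so the line is
   never out of synch and needs no dirty bit. */
static void store_write_through(line_t *l, uint32_t addr, uint32_t value) {
    l->valid = 1; l->tag = addr; l->data = value;
    memory[addr % 1024] = value;          /* memory updated on every store */
}

/* Write-back: only the cached copy is updated; memory is written when the
   dirty line is later evicted (or invalidated by another processor). */
static void store_write_back(line_t *l, uint32_t addr, uint32_t value) {
    l->valid = 1; l->dirty = 1; l->tag = addr; l->data = value;
}

static void evict(line_t *l) {
    if (l->valid && l->dirty)
        memory[l->tag % 1024] = l->data;  /* dirty data finally reaches memory */
    l->valid = l->dirty = 0;
}

int main(void) {
    line_t l = {0, 0, 0, 0};
    store_write_through(&l, 7, 41);  /* memory[7] becomes 41 immediately       */
    store_write_back(&l, 7, 42);     /* memory[7] is still 41 (stale) here ... */
    evict(&l);                       /* ... and only becomes 42 on eviction    */
    return (int)memory[7] - 42;      /* 0 if the model behaved as described    */
}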
Current Cache Design Stuff
* Segregated (I/D) L1 cache – small and very fast
* On-chip large L2 cache (in the MB range)
* Off-chip L3 cache in high-end systems
* Set associative designs predominate – 8-way is common.
Overview of Pentium Caching
* Pentium I: 8KB each L1 I and D cache
* Pentium Pro: 256KB L2 cache added
* Pentium II: 16KB L1 I/D cache, 512KB L2 cache
* Pentium IV: up to 1MB L2 cache – cache access time to L1 is just 2 clocks but cache is smaller. L2 cache runs about 10 clocks.
* Pentium D: up to 4MB L2 cache
The Three C's
There are three reasons that cache misses occur:
* Compulsory: data was not used previously so can't be in the cache
* Capacity: the word could have been in the cache but it was too full
* Conflict: the cache is big enough but the shape of the cache precludes keeping the data available
Cache and the Programmer
Most code doesn't care about the cache…
Some algorithms are "cache friendly" (quicksort)
Numeric code is a serious problem! Array access patterns can lead to very poor cache behavior (see the traversal sketch below).
Explicit prefetching of data can achieve significant speedup.
Compilers for RISC machines act as explicit cache managers (for example, by inserting prefetch instructions).
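To illustrate the array-access point above, here is a minimal C sketch (the matrix size is an arbitrary assumption): both functions compute the same sum, but the row-order walk touches memory sequentially while the column-order walk strides a full row per access and, for a matrix this large, misses the cache on almost every reference.

#include <stdio.h>

#define N 1024
static double m[N][N];   /* C stores this row-major: m[i][0..N-1] are contiguous */

/* Row-major traversal: consecutive accesses are adjacent in memory, so each
   cache line that is fetched gets fully used before it is evicted. */
static double sum_by_rows(void) {
    double s = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += m[i][j];
    return s;
}

/* Column-major traversal of the same data: successive accesses are
   N*sizeof(double) bytes apart, so nearly every access touches a different
   cache line and the miss rate soars. */
static double sum_by_cols(void) {
    double s = 0.0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += m[i][j];
    return s;
}

int main(void) {
    printf("%f %f\n", sum_by_rows(), sum_by_cols());  /* same answer, very different cache behavior */
    return 0;
}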