Class 2. Von Neumann, Harvard and MIPS models

Computers’ basic organization
Goal - Teach the following topics:
• Von Neumann model, Harvard model. Differences.
• MIPS programming model
o Registers
o Memory
o ALU
o Control Unit
o I/O
Learning outcome - by the end of this class the students should know the main differences between the
computer models and the main parts of the computer.
Textbook: P&H, Ch. 2.1-2.3, Appendix A.
1. Von Neumann model, Harvard model.
The very simple C console program below has all the components needed to explain a computer's general
organization.
The program performs an arithmetic operation on 2 bytes from a 1024-byte array and writes
the result of the operation back to the array.
The operation code and the array indexes of the operands and the result are read from the
console.
Let's compare this program's components and behavior with the components and behavior of a
simple computer.
We have an array of bytes, which is similar to the computer's random access memory (RAM).
• It holds byte values.
• It can be addressed. In our case, for a 1024-byte memory, the addresses are from 0 to
1023.
• We can write into the array and read from it, as with memory.
Variables are similar to registers:
• They hold temporary values of the operands after they are read from memory, and hold the result
before it is written to memory.
• Operations are done with the variables (as with registers).
• Input and output values are kept in variables.
The switch block decodes the operation, like the Control Unit, and launches the appropriate operation.
The operation statements themselves are similar to the Arithmetic Logic Unit. They perform
operations on the operands (registers) and write the result back to a variable (register).
The printf() and scanf() functions simulate the Input / Output blocks. Input and output are done through
the registers (variables).
#include <stdio.h>

unsigned char array[1024];   // plays the role of memory
unsigned char a, b, c;       // temporary variables
int i1, i2, j;               // array indexes for operands and result
char operation;              // which operation to do

int main()
{
    printf( "Enter the operation sign +, -, /, * \n" );
    scanf( "%c", &operation );
    printf( "Enter the first operand's index in array \n" );
    scanf( "%d", &i1 );
    printf( "Enter the second operand's index in array \n" );
    scanf( "%d", &i2 );
    printf( "Enter the result's index in array \n" );
    scanf( "%d", &j );

    // reject indexes outside the array (valid addresses are 0 to 1023)
    if (i1 < 0 || i1 > 1023 || i2 < 0 || i2 > 1023 || j < 0 || j > 1023)
        return 1;

    a = array[i1];
    b = array[i2];
    c = 0;                   // result if the operation sign is not recognized
    switch (operation)
    {
        case '+': c = a+b;
                  break;
        case '-': c = a-b;
                  break;
        case '/': if (b != 0) c = a/b;   // skip division by zero
                  break;
        case '*': c = a*b;
                  break;
    }
    array[j] = c;
    return 0;
}
In the first electronic computers, programming was done with countless multiposition switches
and cables. Arithmetic was done in a serial decimal system, using 10 vacuum tubes to
represent each decimal digit.
In John von Neumann's computer model the program is represented in digital form in the
computer's memory, along with the data. The clumsy serial decimal arithmetic is replaced by
parallel binary arithmetic.
Von Neumann's basic design, now known as a von Neumann machine, was used in the first
stored program computer and is still the basis for nearly all digital computers.
The original von Neumann machine has five basic parts: the memory, the arithmetic-logic unit
(ALU), the program control unit, and the input and output equipment.
Specifics of von Neumann's architecture
• The program is represented in digital form in the computer's memory, along with
the data.
• Slow serial decimal arithmetic is replaced by parallel binary arithmetic in the ALU.
[Figure: Memory (Program Code, Data); Central Processing Unit containing the Control Unit, the
Arithmetic Logic Unit, Registers (Accumulator) and Clock; Input; Output.]
Picture 1. The original von Neumann machine.
Memory keeps program code and data. The other units periodically access memory to fetch
data or program code, or to write information there.
Control Unit (CU): The control unit fetches instructions from memory and decodes them to
produce signals which control the other parts of the computer. This may cause it to transfer data
between memory and ALU or to activate peripherals to perform input or output.
Arithmetic and Logic Unit (ALU): The ALU performs operations such as addition, subtraction
and multiplication of integers and bit-wise AND, OR, NOT and other Boolean operations.
There are Registers (one of them is called the Accumulator) which are connected with the Memory
and Input/Output. Data exchange between the ALU, Memory and Input/Output is done through
the Registers (Accumulator).
Input and Output devices are used when information must be exchanged with the outside
world.
Today the Control Unit and ALU are combined in a single device called the Central Processing Unit
(CPU).
The meaning of the term Von Neumann model has evolved to mean a stored-program
computer in which an instruction fetch and a data operation cannot occur at the same time
because they share a common bus. This is referred to as the Von Neumann bottleneck and
often limits the performance of the system.
The Harvard architecture is a computer architecture with physically separate storage and signal
pathways for instructions and data. The term originated from the Harvard Mark I relay-based
computer, which stored instructions on punched tape (24 bits wide) and data in electromechanical counters. These early machines had data storage entirely contained within the central
processing unit, and provided no access to the instruction storage as data. Programs needed to
be loaded by an operator; the processor could not boot itself.
Today, most processors implement such separate signal pathways for performance reasons but
actually implement a modified Harvard architecture, so they can support tasks such as loading a
program from disk storage as data and then executing it.
Contrast with von Neumann architectures
Under pure von Neumann architecture the CPU can be either reading an instruction or
reading/writing data from/to the memory. Both cannot occur at the same time since the
instructions and data use the same bus system. In a computer using the Harvard architecture, the
CPU can both read an instruction and perform a data memory access at the same time, even
without a cache. A Harvard architecture computer can thus be faster for a given circuit complexity
because instruction fetches and data access do not contend for a single memory pathway.
Also, a Harvard architecture machine has distinct code and data address spaces: instruction
address zero is not the same as data address zero. Instruction address zero might identify a
twenty-four bit value, while data address zero might indicate an eight bit byte that isn't part of that
twenty-four bit value.
Contrast with modified Harvard architecture
A modified Harvard architecture machine is very much like a Harvard architecture machine, but it
relaxes the strict separation between instruction and data while still letting the CPU concurrently
access two (or more) memory buses. The most common modification includes separate
instruction and data caches backed by a common address space. While the CPU executes from
cache, it acts as a pure Harvard machine. When accessing backing memory, it acts like a von
Neumann machine (where code can be moved around like data, which is a powerful technique).
This modification is widespread in modern processors such as the ARM architecture and x86
processors. It is sometimes loosely called a Harvard architecture, overlooking the fact that it is
actually "modified".
2. MIPS Programming Model.
A programming model is an abstract view of a processor that is appropriate for programming but
omits details that are not needed for that task. It is the view of the machine a programmer uses
when programming.
[Figure: Memory of 2^32 bytes (Program Code, Data); Central Processing Unit containing the Control
Unit, the Arithmetic Logic Unit and the General Purpose 32 bit Registers $0, $1, … , $31; Input; Output.]
Picture 2. MIPS Programming model based on Von Neumann architecture with one CPU.
Based on this model we can suppose the existence of several groups of MIPS instructions:
• Inter Register transfer instructions.
• Memory - register transfer instructions.
• Input / output instructions.
• Arithmetic or Logical instructions with registers.
In this model from the textbook we can see several additional groups of MIPS instructions:
• Integer Multiplication Division instructions.
• Floating point arithmetic instructions.
• System Control instructions
MIPS contains:
• Integer arithmetic processor (with the General Purpose Registers)
• Floating point arithmetic processor (with the Floating Point Registers)
• Whole system's Control Processor
Picture 3. MIPS Central Processing Unit and coprocessors.
Let's describe the different parts of the programming model separately.
2.1. Registers
A register is a part of the processor that can hold a bit pattern. Registers are the memory
with the fastest access for CPU instructions.
On the 32-bit MIPS architecture, a register holds 32 bits. There are many registers in the
processor, but only some of them are visible in assembly language. The others are used by the
processor in carrying out its operations.
The registers that are visible in assembly language are called general purpose registers and
floating point registers. There are 32 general purpose registers. Each general purpose register
holds a 32 bit pattern. In assembly language, these registers are numbered $0, $1, $2, ... , $31.
There are 32 floating point registers of 32 bits each, which can be used in pairs as 16 double
precision registers. These are discussed in a later chapter.
Assembly operands are registers
• A register is a part of the processor that can hold a bit pattern.
• Operations can only be performed on these registers.
• Assembly uses registers instead of local variables.
• Local variables in C and Java can be unlimited in number and can have different lengths.
• Registers in assembly cannot. The hardware should be simple. It is impossible to
create an unlimited number of bit containers of unknown types in hardware.
Benefit: since registers are directly in hardware, they are very fast.
• Registers are the memory with the fastest access for CPU instructions.
Drawback: since registers are in hardware, there is a predetermined number of them.
• Solution: MIPS code must be put together very carefully to use registers efficiently.
• The group of registers visible in MIPS assembly language is called the general purpose registers.
• There are 32 general purpose registers.
• Why 32? More registers mean fewer operations with memory.
• Each general purpose register holds a 32 bit pattern.
• A group of 32 bits is called a word in MIPS.
• In C (and most high level languages) variables are declared first and given a type.
Example:
int fahr, celsius;
char a, b, c, d, e;
• Each variable can ONLY represent a value of the type it was declared as (you cannot mix and
match int and char variables).
• In assembly language, the registers have no type; the operation or the program determines
how register contents are treated.
Register Use Conventions
General purpose registers are those that assembly language programs work with (other than
floating point registers).
The general purpose registers are numbered $0 through $31. However, by convention (and
sometimes by hardware) different registers are used for different purposes.
One of the general purpose registers, Register $0, is hard-wired to always contain the value
0x00000000 (all zero bits).
In addition to a number $0 — $31, registers have a mnemonic name (a name that reminds you of
its use). For example register $0 has the mnemonic name zero.
The table shows the 32 registers and their conventional use.
Registers $0 and $31 are the only two that behave differently from the others. Register $0 is
permanently wired to contain zero bits. Register $31 is automatically used by some subroutine
linkage instructions to hold the return address.
The two groups of registers below have the following meaning:
$16 - $23 ($s0 - $s7)
Saved Registers: closest in meaning to the local variables of C. If you write something to
these registers, you can assume it is unchanged after you call other
functions. While you are in the current function, these registers contain your function's values.
$8 - $15 ($t0 - $t7)
Temporary Registers: closest in meaning to the global variables of C. Every function has the
right to use these registers. If you write something to these registers, you must assume it
could be overwritten after you call other functions. You can keep only temporary
values there, for the short period between function calls.
Using register names instead of numbers makes your code more readable.
Register Number   Mnemonic Name   Conventional Use
$0                zero            Permanently 0
$1                $at             Assembler Temporary (reserved)
$2, $3            $v0, $v1        Value returned by a subroutine
$4-$7             $a0-$a3         Arguments to a subroutine
$8-$15            $t0-$t7         Temporary (not preserved across a function call)
$16-$23           $s0-$s7         Saved registers (preserved across a function call)
$24, $25          $t8, $t9        Temporary
$26, $27          $k0, $k1        Kernel (reserved for OS)
$28               $gp             Global Pointer
$29               $sp             Stack Pointer
$30               $fp             Frame Pointer
$31               $ra             Return Address (automatically used in some instructions)
2.2. Memory
Memory is an array of Cells. Cells keep information. Information can be stored into Cells
from a Register or retrieved from Cells into a Register. Each Cell has a unique address.
In the following example the memory contains 4 Cells. They are addressed by a 2-bit address,
which is enough to select all 4 Cells. A larger memory needs more address bits
to be able to address its Cells.
[Figure: a 2-bit Address Register (XX) selects one of 4 Memory Cells. Load copies the Cell contents
into the Data Register; Store copies the Data Register into the Cell.]
Address   Cell   Memory
00        0      Data A
01        1      Data D
10        2      Data K
11        3      Data A
All possible addresses: 2^2 = 4
• Memory is an array of Cells.
• Memory keeps information in the cells.
• Each Cell has a unique address and can be selected by its Address.
• Information can be loaded from a Memory Cell to a Register or stored from a Register to a
Memory Cell. The source of information in this case is not changed. Only the
destination is overwritten.
In the example below we have an 8-Cell memory. To address 8 Cells we need 8 addresses (0, 1, … , 7),
which can be represented by 3 bits. Memory Cells can contain any number of bits. Usually
they contain a byte (8 bits). Load and Store operations work with all 8 bits.
Address (3 bits)   Memory (8 bits)
000                0001 1000
001                0001 1001
010                0001 0000
011                0110 1000
100                0110 1000
101                0010 0010
110                0001 1100
111                0001 0000
Data Register: 0001 1000
• Address length is 3 bits
• Memory Cell length is 8 bits (1 byte)
• We can load or store 8 bits at once.
• 3 address bits address 2^3 = 8 Memory Cells.
In the case below each Memory Cell contains 16 bits. Load and Store operations work with all 16 bits.
Address (3 bits)   Memory (16 bits)
000                0001 1000 0001 1000
001                0001 0000 0110 1000
010                0110 1000 0010 0010
011                0001 1100 0001 0000
100                0110 1000 0001 1100
101                0010 0010 0010 0010
110                0001 0000 0001 1000
111                0110 1000 0010 0010
Data Register: 0001 1000 0001 1000
• Address length is 3 bits
• Memory Cell length is 16 bits (2 bytes)
• We can load or store only 16 bits at once.
• 3 address bits address 2^3 = 8 Memory Cells.
Memory Model
Modern computer systems nearly always use cache memory and virtual memory. But our
abstract view of memory does not include them. The purpose of virtual memory is to make it
appear as if a program has the full address space available. So our programming model has the
full address space. The purpose of cache is to transparently speed up memory access. So our
programming model does not include cache.
Memory in the programming model is as follows:
DATA:
MIPS memory is an array of 2^32 bytes. Each byte has a 32-bit address. Each byte can
hold an 8-bit pattern, one of the 256 possible 8-bit patterns. The addresses of MIPS main
memory range from 0x00000000 to 0xFFFFFFFF.
The lower half (most of it anyway) is for user programs. User memory is further divided
(by software convention) into text, data, and stack segments.
User programs and data are restricted to the first 2^31 bytes. The last half of the address
space is used for specialized purposes.
OPERATIONS:
The processor chip contains registers, which are electronic components that can store bit
patterns. The processor interacts with memory by moving bit patterns between memory and its
registers.
• Load: a bit pattern starting at a designated address in memory is copied into a register
inside the processor.
• Store: a bit pattern is copied from a processor register to memory at a designated
address.
Bit patterns are copied between the memory and the processor in groups of one, two, four, or
eight contiguous bytes. When several bytes of memory are used in an operation, only the
address of the first byte of the group is specified.
Memory Layout
Load and store operations copy the bit pattern from the source into the destination. The source
(register or memory) does not change. Of course, the pattern at the destination is replaced by the
pattern at the source.
Memory is built to store bit patterns. Both instructions and data are bit patterns, and either of
these can be stored anywhere in memory (at least, so far as the hardware is concerned.)
However, it is convenient for programmers and systems software to organize memory so that
instructions and data are separated. Below is the way MIPS operating systems often lay out
memory.
Although the address space is 32 bits, the addresses from 0x80000000 to 0xFFFFFFFF are not
available to user programs. They are used for the operating system and for ROM. When a MIPS
chip is used in an embedded controller the control program probably exists in ROM in this upper
half of the address space.
[Figure: MIPS memory layout. Address: 32 bits. Memory size: 2^32 bytes. Regions labeled in the
figure: ROM, Device buffers, … (not available for user programs); OS Kernel; Dynamic Data;
Static Data; User program machine code.]
The parts of address space accessible to a user program are divided as follows:
Text Segment: This holds the machine language of the user program (called
the text).
Data Segment: This holds the data that the program operates on. Part of the
data is static. This is data that is allocated by the assembler and whose size
does not change as a program executes. Values in it do change; "static"
means the size in bytes does not change during execution. On top of the static
data is the dynamic data. This is data that is allocated and deallocated as the
program executes. In "C" dynamic allocation and deallocation is done with
malloc() and free().
Stack Segment: At the top of user address space is the stack. With high level
languages, local variables and parameters are pushed and popped on the
stack as procedures are activated and deactivated.
Often data consist of several contiguous bytes. Each computer manufacturer has its own idea of
what to call groupings larger than a byte. The following is used for MIPS chips.
• byte: eight bits.
• word: four bytes, 32 bits.
• double word: eight bytes, 64 bits.
A block of contiguous memory is referred to by the address of its first byte (i.e., the byte with the
lowest address). Most MIPS instructions involve a fixed number of bytes.
Often you need a number of bits other than one of the standard amounts. Use the next largest
standard amount, and remember to be careful. Attempting to use the very minimum number of
bits is more complicated than it is worth and is a rich source of errors in assembly language
programming.
2.3 Registers and the ALU
The arithmetic/logic unit (ALU) of a processor performs integer arithmetic and logical
operations. For example, one of its operations is to add two 32-bit integers. An integer used as
input to an operation is called an operand. One operand for the ALU is always contained in a
register. The other operand may be in a register or may be part of the machine instruction itself.
The result of the operation is put into a general purpose register.
Machine instructions that use the ALU specify four things:
1. The operation to perform.
2. The first operand (often in a register).
3. The second operand (often in a register).
4. The register that receives the result.
The picture shows a 32-bit addition operation. The operands are register $8 and register $9. The
result is put in register $10. Here is how that instruction is written as assembly language:
add $10,$8,$9
add $9,$8,$9
With the help of the C language we can better understand the meaning of the above instructions:
C=A+B;   $10=$8+$9
B=A+B;   $9=$8+$9
The second instruction, add $9, $8, $9, does the same thing but puts the result in place of one of the
operands, register $9. Because the operand has already taken part in the operation, its value
can be overwritten at the end of the operation. This instruction uses fewer registers
than the previous one.
2.4. Control Unit
A computer only understands 1s and 0s. To the computer, add $10,$8,$9 does not mean anything. So
human-readable assembly instructions must be represented for the computer in binary form.
Here is the assembly language for the add instruction:
add $10,$8,$9
Here is the machine code it translates into:
0x01095020
Here is that as a bit pattern (the same machine code):
0000 0001 0000 1001 0101 0000 0010 0000
The program’s binary code is a sequence of machine code instructions.
Sequential Execution
MIPS machine instructions are all 32 bits wide (4 bytes). Normally, instructions are executed
one after another in the same sequence that they have been placed in memory, starting with the
lowest address and going up.
Here, for example, is a program which sets two operands then adds them. The three machine
instructions have been placed at locations 0x00400000, 0x00400004, and 0x00400008, and are
executed in that order.
[Figure: the Program Counter steps through 0x00400000, 0x00400004 and 0x00400008, increasing by 4
each time.]
The computer must somehow remember the address of the instruction it wants to read (fetch) from
memory. There is a special register in the Control Unit for that purpose. It is called the Program Counter
and it holds the address of the next instruction to be fetched from memory. After fetching each
instruction, the computer increments this register's content to point to the next instruction.
After reading the instruction from memory, the computer must run it. So there
must be some place to put the instruction and some hardware to understand (decode) the
instruction in order to run it properly. For these purposes there are the Instruction Register,
which holds the instruction, and the Instruction Decoder logic, which decodes it.
After the instruction is decoded, the appropriate signals are sent to the different blocks of
hardware to perform the necessary action.
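[Figure: the MIPS model with the Control Unit shown in detail. The Control Unit contains the Program
Counter (next instruction's address), the Instruction Register (current instruction, for example
0x01095020) and the Instruction Decoder. The other blocks are Memory (Program Code, Data), the
Arithmetic Logic Unit, the General Purpose 32 bit Registers $0, $1, … , $31, Input and Output.
The Control Unit, ALU and registers form the Central Processing Unit.]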
MIPS Model Machine Cycle
The computer works sequentially, one instruction at a time.
The computer endlessly cycles through three basic steps.
The machine cycle is illustrated at right. Each cycle executes one
machine instruction. Everything the processor does is done by a
sequence of machine operations. So, also, everything any
program does is ultimately done by machine operations.
Fetch the Instruction. The program counter (PC)
contains the address of the current machine instruction.
The instruction is fetched from memory.
Increment the PC. The address in the program counter
is incremented by four. This allows preparation of the next
instruction to begin while the current instruction is
processed.
Execute the Instruction. The machine operation specified by the
instruction is performed. Because MIPS is a RISC design, the
execution step of the cycle should be very simple. So MIPS ALU
operations never directly access memory.
Von Neumann machine operation cycle
1. Fetch instruction - The control unit retrieves the instruction from memory.
2. Increment the address of the instruction.
3. Execute. Because the von Neumann computer behaves like a CISC machine, the execution step
of the cycle is very complex.
• Fetch operands.
o The control unit retrieves the operands from memory.
• The instruction is executed and the necessary operations are performed in the ALU on the
operands.
• The result is written back into memory.
• Input and Output devices are used when there is a need to input or output
information.
Control Flow
A bit pattern that is fetched as an instruction is interpreted as an instruction. The bits
determine what is done in the next machine cycle. If the pattern makes no sense as an
instruction then normal control flow is interrupted. If the pattern can be interpreted as an
instruction then it will be executed, whatever it does.
The control point of an executing program is the address in memory of the instruction being
executed. When an instruction is being executed (in the third step of the machine cycle) the
program counter holds the address of the instruction after the control point.
Normally the control point moves sequentially through the machine instructions. On the MIPS this
means it normally moves through memory in steps of four bytes (32 bits) at a time.
The execution sequence can be changed with a branch or a jump machine instruction.
Usually "control point" is shortened to "control" and the phrase flow of control means how the
control point moves through memory.
If control flow leads to an address in memory, then the four bytes starting at that address are
fetched as a machine instruction. The processor has no other way to distinguish instructions from
data. Whatever bit pattern gets pulled in from memory as an instruction will be executed as an
instruction. It is common for the control point of a buggy program to enter a section of data. This
sometimes leads to mystifying results.
By software convention, data and instructions are placed in different sections of memory. (This
helps prevent mystifying results). But this is not a requirement of the architecture.
[Figure: Control Flow. Labels recovered from the figure: addresses 0x0040000c, 0x00400010, ...,
0x00400020; instructions jump 0x00400020, other instructions, ..., ori $8, $0, 4;
Program Counter 0x00400004; Control Point 0x00400000.]
Computer System Components. Input and Output.
The diagram is a general view of how desktop and workstation computers are organized.
Different systems have different details, but in general all computers consist of components
(processor, memory, controllers, video) connected together with a bus. Physically, a bus consists
of many parallel wires, usually printed (in copper) on the main circuit board of the computer. Data
signals, clock signals, and control signals are sent on the bus back and forth between
components. A particular type of bus follows a carefully written standard that describes the
signals that are carried on the wires and what the signals mean. The PCI standard (for example)
describes the PCI bus used on most current PCs.
The processor continuously executes the machine cycle, executing machine instructions one by
one. Most instructions are for an arithmetical, a logical, or a control operation. A machine
operation often involves access to main storage or involves an i/o controller. If so, the machine
operation puts data and control signals on the bus, and (may) wait for data and control signals to
return. Some machine operations take place entirely inside the processor (the bus is not
involved). These operations are very fast.
Input/output controllers receive input and output requests from the central processor, and then
send device-specific control signals to the device they control. They also manage the data flow to
and from the device. This frees the central processor from involvement with the details of
controlling each device. I/O controllers are needed only for those I/O devices that are part of the
system.
Often the I/O controllers are part of the electronics on the main circuit board (the mother board)
of the computer. Sometimes an uncommon device requires its own controller which must be
plugged into a connector (an expansion slot) on the mother board.