Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Computers’ basic organization Goal - Teach the following topics: Von Neumann model, Harvard model. Differences. MIPS programming model o Registers o Memory o ALU o Control Unit o I/O Learning outcome - by the end of this class the students should know the main differences of the computer models and the main parts of the computer. Textbook:P&H, Ch.2.1-2.3, Appendix A . 1. Von Neumann model, Harvard model. In below very simple C console program we have all components to explain computer’s general organization. The program performs arithmetic operation with 2 bytes from the 1024 byte array and writes back the result of operation to the array. The operation code, the numbers of array elements of operands and result are read from the console. Let’s compare this program components and behavior with the simple computer components and behavior. We have an array of bytes – this is similar to computer random access memory (RAM). It holds bytes’ values. It could be addressed. In our case for 1024 byte memory the addresses are from 0 to 1023. We can write into the array and read from it like with the memory. Variables – are similar to registers: They hold temporary values of operands after reading of the memory and hold the result before writing to memory. Operations are done with the variables (like with the registers). Input output results are kept in variables. Switch block decodes the operation like Control Unit and launches appropriate operation. The operation statements themselves are similar to Arithmetic Logic Unit. It performs operations on the operands (registers) and writes the result back to variable (register). Printf(), Scanf() functions simulate the Input / Output blocks. The input output is done through the registers (variables). #include <stdio.h> unsigned char array[1024]; unsigned char a,b,c; int i1,i2,j; char operation; // temporary variables // array indexes for operands and result // which operation to do int main() { printf( "Enter the operation sign +, -, /, * \n" ); scanf( "%c", &operation ); printf( "Enter the first operand's index in array \n" ); scanf( "%d", &i1 ); printf( "Enter the second operand's index in array \n" ); 1 scanf( "%d", &i2 ); printf( "Enter the result's index in array \n" ); scanf( "%d", &j ); a = array[i1]; b = array[i2]; switch (operation) { case '+': c = a+b; break; case '-': c = a-b; break; case '/': c = a/b; break; case '*': c = a*b; break; } array[j] = c; return 0; } In the first electronic computers the programming was done by countless multiposition switches and cables. Arithmetic operations were done in serial decimal system using 10 vacuum tubes for representing each decimal digit. In John von Neumann’s computer model the program could be represented in digital form in the computer’s memory, along with the data. The clumsy serial decimal arithmetic is replaced by parallel binary arithmetic. Von Neumann’s basic design, now known as a von Neumann machine, was used in the first stored program computer and still the basis for nearly all digital computers, even now. The original von Neumann machine has five basic parts: the memory, the arithmetic-logic unit (ALU), the program control unit, and the input and output equipment. Von Neumann’s architecture specificity The program is represented in digital form in the computer’s memory, along with the data. Slow serial decimal arithmetic is replaced by parallel binary arithmetic in ALU. Memory Program Code Data Input Registers Clock Control Unit (Accumulator) Arithmetic Logic Unit Output Central Processing Unit Picture 1. The original von Neumann machine. 2 Memory keeps program code and data. The other units periodically refer to memory to fetch the data or program or write there some information. Control Unit (CU): The control unit fetches instructions from memory and decodes them to produce signals which control the other parts of the computer. This may cause it to transfer data between memory and ALU or to activate peripherals to perform input or output. Arithmetic and Logic Unit (ALU): The ALU performs operations such as addition, subtraction and multiplication of integers and bit-wise AND, OR, NOT and other Boolean operations. There are Registers (some of them called Accumulator) which are connected with the Memory and Input/Output. The data exchange between ALU, Memory and Input/Output is done through the Registers (Accumulator). Input and Output devices are used when there is need to exchange information with the outside world. Now the Control Unit and ALU are included in single device called Central Processing Unit (CPU). The meaning of the term Von Neumann model has evolved to mean a stored-program computer in which an instruction fetch and a data operation cannot occur at the same time because they share a common bus. This is referred to as the Von Neumann bottleneck and often limits the performance of the system. The Harvard architecture is a computer architecture with physically separate storage and signal pathways for instructions and data. The term originated from the Harvard Mark I relay-based computer, which stored instructions on punched tape (24 bits wide) and data in electromechanical counters. These early machines had data storage entirely contained within the central processing unit, and provided no access to the instruction storage as data. Programs needed to be loaded by an operator; the processor could not boot itself. Today, most processors implement such separate signal pathways for performance reasons but actually implement a modified Harvard architecture, so they can support tasks such as loading a program from disk storage as data and then executing it. 3 Contrast with von Neumann architectures Under pure von Neumann architecture the CPU can be either reading an instruction or reading/writing data from/to the memory. Both cannot occur at the same time since the instructions and data use the same bus system. In a computer using the Harvard architecture, the CPU can both read an instruction and perform a data memory access at the same time, even without a cache. A Harvard architecture computer can thus be faster for a given circuit complexity because instruction fetches and data access do not contend for a single memory pathway. Also, a Harvard architecture machine has distinct code and data address spaces: instruction address zero is not the same as data address zero. Instruction address zero might identify a twenty-four bit value, while data address zero might indicate an eight bit byte that isn't part of that twenty-four bit value. Contrast with modified Harvard architecture A modified Harvard architecture machine is very much like a Harvard architecture machine, but it relaxes the strict separation between instruction and data while still letting the CPU concurrently access two (or more) memory buses. The most common modification includes separate instruction and data caches backed by a common address space. While the CPU executes from cache, it acts as a pure Harvard machine. When accessing backing memory, it acts like a von Neumann machine (where code can be moved around like data, which is a powerful technique). This modification is widespread in modern processors such as the ARM architecture and x86 processors. It is sometimes loosely called a Harvard architecture, overlooking the fact that it is actually "modified". 2. MIPS Programming Model. A programming model is an abstract view of a processor that is appropriate for programming but omits details that are not needed for that task. It is the view of the machine a programmer uses when programming. Memory 232 bytes Program Code Data Input Control Unit Arithmetic Logic Unit General Purpose 32 bit Registers $0, $1, … , $31 Output Central Processing Unit Picture 2. MIPS Programming model based on Von Neumann architecture with one CPU. 4 Based on this model we can suppose the existence of several groups of MIPS instructions: • Inter Register transfer instructions. • Memory - register transfer instructions. • Input / output instructions. • Arithmetic or Logical instructions with registers. In this model from the textbook we can see several additional groups of MIPS instructions: • Integer Multiplication Division instructions. • Floating point arithmetic instructions. • System Control instructions MIPS contains: • • • General Purpose Registers Integer arithmetic processor Floating point arithmetic processor Whole system’s Control Processor Floating Point Registers Picture 3. MIPS Central Processing Unit and coprocessors. Let’s describe different parts of programming model separately. 5 2.1. Registers A register is a part of the processor that can hold a bit pattern. Registers are the memory which have the fastest access for CPU instructions. On the 32 bit architecture MIPS, a register holds 32 bits. There are many registers in the processor, but only some of them are visible in assembly language. The others are used by the processor in carrying out its operations. The registers that are visible in assembly language are called general purpose registers and floating point registers. There are 32 general purpose registers. Each general purpose register holds a 32 bit pattern. In assembly language, these registers are numbered $0, $1, $2, ... , $31. There are 16 floating point registers. These are discussed in a later chapter. Assembly Operands are registers • A register is a part of the processor that can hold a bit pattern • operations can only be performed on these registers. • Assembly uses registers instead of local variables • Local variables in C and Java can be unlimited and can have different length • Registers in Assembly cannot. The Hardware should be simple. It’s impossible to create unlimited amount of unknown types of bit containers in the hardware. Benefit: Since registers are directly in hardware, they are very fast • Registers are the memory which have the fastest access for CPU instructions. Drawback: Since registers are in hardware, there are a predetermined number of them • Solution: MIPS code must be very carefully put together to efficiently use registers • A group of registers that are visible in MIPS assembly language are called general purpose registers. • There are 32 general purpose registers. • Why 32 ? More registers less operations with memory. • Each general purpose register holds a 32 bit pattern. • Groups of 32 bits called a word in MIPS 6 • In C (and most High Level Languages) variables declared first and given a type Example: int fahr, celsius; char a, b, c, d, e; • Each variable can ONLY represent a value of the type it was declared as (cannot mix and match int and char variables). • In Assembly Language, the registers have no type; operation or a program determines how register contents are treated Register Use Conventions General purpose registers are those that assembly language programs work with (other than floating point registers). The general purpose registers are numbered $0 through $31. However, by convention (and sometimes by hardware) different registers are used for different purposes. One of the general purpose registers, Register $0, is hard-wired to always contain the value 0x00000000 (all zero bits). In addition to a number $0 — $31, registers have a mnemonic name (a name that reminds you of its use). For example register $0 has the mnemonic name zero. The table shows the 32 registers and their conventional use. Registers $0 and $31 are the only two that behave differently from the others. Register $0 is permanently wired to contain zero bits. Register $31 is automatically used by some subroutine linkage instructions to hold the return address. Below 2 groups of registers have the following meaning: $16 - $23 $s0 - $s7 Saved Registers – more close to the meaning of the local variables of C. If you write something in these registers then you suppose that they are not changed after you call couple of other functions. While you are in the current function these variables contain your function values. $8 - $15 $t0 - $t7 Temporary Registers – more close to the meaning of the global variables of C. Everybody has right to use these registers. If you write something in these registers then you suppose that they could be damaged after you call couple of other functions. You can keep there only temporary values for a short period between other function calls. The usage of register names instead of numbers make your code more readable. 7 Register Number Mnemonic Name $0 zero Permanently 0 $1 $at Assembler Temporary (reserved) $2, $3 $v0, $v1 Value returned by a subroutine $4-$7 $a0-$a3 Arguments to a subroutine $8-$15 $t0-$t7 Temporary (not preserved across a function call) $16-$23 $s0-$s7 Saved registers (preserved across a function call) $24, $25 $t8, $t9 Temporary $26, $27 $k0, $k1 Kernel (reserved for OS) $28 $gp Global Pointer $29 $sp Stack Pointer $30 $fp Frame Pointer $31 $ra Return Address (Automatically used in some instructions) Conventional Use 8 2.2. Memory Memory is an array of Cells. Cells keep information. That information could be stored in Cells from Register or retrieved from Cells to the Register. Each Cell has a unique address. In the following example the memory contains 4 Cells. They are addressed by 2 bit address which is enough to select all 4 Cells. If the memory is larger, then the address bits have to be more to be able to address more Cells. 2 bit Address Register XX 00-0 01-1 10-2 11-3 All possible addresses 22=4 Memory Data A (in Cell 0) Data D (in Cell 1) Data K (in Cell 2) Data A (in Cell 3) Load Memory is an array of Cells. Memory keeps information in the cells. Each Cell has a unique address and could be selected by its Address. Information could be loaded from Memory Cell to Register or stored from Register to Memory Cell. The source of information in this case is not changed. Only the destination is overwritten. Store Cell contents Data Register In below example we have 8 Cell memory. To address 8 Cells we need 8 addresses (0,1, … ,7) which could be represented by 3 bits. Memory Cells could contain any amount of bits. Usually they contain a byte (8 bits). Load and Store operations work with all 8 bits. 3bit 000 001 010 011 100 101 110 111 Memory 8 bit 0001 1000 0001 1001 0001 0000 0110 1000 0110 1000 0010 0010 0001 1100 0001 0000 Address length is 3 bits Memory Cell length is 8 bits (1 byte) We can load or store 8 bits at once. 3 address bits address 23 = 8 Memory Cells. 0001 1000 Data Register 9 In below case the Memory Cell contains 16 bits. Load and Store operations work with all 16 bits. 3bit Memory 16 bit 000 001 010 011 100 101 110 111 0001 1000 0001 1000 0001 0000 0110 1000 0110 1000 0010 0010 0001 1100 0001 0000 0110 1000 0001 1100 0010 0010 0010 0010 0001 0000 0001 1000 0110 1000 0010 0010 Address length is 3 bits Memory Cell length is 16 bits (2 bytes) We can load or store only 16 bits at once. 3 address bits address 23 = 8 Memory Cells. 0001 1000 0001 1000 Data Register Memory Model Modern computer systems nearly always use cache memory and virtual memory. But our abstract view of memory does not include them. The purpose of virtual memory is to make it appear as if a program has the full address space available. So our programming model has the full address space. The purpose of cache is to transparently speed up memory access. So our programming model does not include cache. Memory in the programming model is as follows: DATA: MIPS memory is an array of 232 bytes. Each byte has a 32-bit address. Each byte can hold an 8-bit pattern, one of the 256 possible 8-bit patterns. The addresses of MIPS main memory range from 0x00000000 to 0xFFFFFFFF. The lower half (most of it anyway) is for user programs. User memory is further divided (by software convention) into text, data, and stack segments. User programs and data are restricted to the last 231 bytes. The last half of the address space is used for specialized purposes. OPERATIONS: The processor chip contains registers, which are electronic components that can store bit patterns. The processor interacts with memory by moving bit patterns between memory and its registers. Load: a bit pattern starting at a designated address in memory is copied into a register inside the processor. Store: a bit pattern is copied from a processor register to memory at a designated address. 10 Bit patterns are copied between the memory and the processor in groups of one, two, four, or eight contiguous bytes. When several bytes of memory are used in an operation, only the address of the first byte of the group is specified. Memory Layout Load and store operations copy the bit pattern from the source into the destination. The source (register or memory) does not change. Of course, the pattern at the destination is replaced by the pattern at the source. Memory is built to store bit patterns. Both instructions and data are bit patterns, and either of these can be stored anywhere in memory (at least, so far as the hardware is concerned.) However, it is convenient for programmers and systems software to organize memory so that instructions and data are separated. Below is the way MIPS operating systems often lay out memory. Although the address space is 32 bits, the addresses from 0x80000000 to 0xFFFFFFFF are not available to user programs. They are used for the operating system and for ROM. When a MIPS chip is used in an embedded controller the control program probably exists in ROM in this upper half of the address space. ROM, Device buffers, … not available for user programs 232 bytes Memory Size Dynamic Data Dynamic Data Static Data User program machine code OS Kernel Address 32 bits The parts of address space accessible to a user program are divided as follows: 11 Text Segment: This holds the machine language of the user program (called the text). Data Segment: This holds the data that the program operates on. Part of the data is static. This is data that is allocated by the assembler and whose size does not change as a program executes. Values in it do change; "static" means the size in bytes does not change during execution. On top of the static data is the dynamic data. This is data that is allocated and deallocated as the program executes. In "C" dynamic allocation and deallocation is done with malloc() and free(). Stack Segment: At the top of user address space is the stack. With high level languages, local variables and parameters are pushed and popped on the stack as procedures are activated and deactivated. Often data consist of several contiguous bytes. Each computer manufacturer has its own idea of what to call groupings larger than a byte. The following is used for MIPS chips. byte — eight bits. word — four bytes, 32 bits. double word — eight bytes, 64 bits. A block of contiguous memory is referred to by the address of its first byte (ie. the byte with the lowest address.) Most MIPS instructions involve a fixed number of bytes. Often you need a number of bits other than one of the standard amounts. Use the next largest standard amount, and remember to be careful. Attempting to use the very minimum number of bits is more complicated than it is worth and is a rich source of errors in assembly language programming. Operations • Load: a bit pattern starting at a designated address in memory is copied into a register inside the processor. • Store: a bit pattern is copied from a processor register to memory at a designated address. 32 • MIPS memory is an array of 2 bytes. • Each byte has a 32-bit address. • Each byte can hold an 8-bit pattern, one of the 256 possible 8-bit patterns. • The addresses of MIPS main memory range from 0x00000000 to 0xFFFFFFFF. 12 2.3 Registers and the ALU The arithmetic/logic unit (ALU) of a processor performs integer arithmetic and logical operations. For example, one of its operations is to add two 32-bit integers. An integer used as input to an operation is called an operand. One operand for the ALU is always contained in a register. The other operand may be in a register or may be part of the machine instruction itself. The result of the operation is put into a general purpose register. Machine instructions that use the ALU specify four things: 1. 2. 3. 4. The operation to perform. The first operand (often in a register). The second operand (often in a register). The register that receives the result. The picture shows a 32-bit addition operation. The operands are register $8 and register $9. The result is put in register $10. Here is how that instruction is written as assembly language: add $10,$8,$9 add $9,$8,$9 With the help of C language we can understand better the meaning of above instructions: C=A+B; $10=$8+$9 B=A+B; $9=$8+$9 13 The next instruction add $9, $8, $9 does the same thing but puts the result in place of one of the operands – register $9. As the operand already has participated in the operation then its value could be overwritten at the end of the operation. This operation uses less amount of registers than the previous one. 2.4. Control Unit Computer only understands 1s and 0s. For computer add $10,$8,$9 doesn’t mean anything. So the assembly human readable instructions should be represented for the computer in binary form. Here is the assembly language for the add instruction: add $10,$8,$9 Here is the machine code it translates into: 0x01095020 Here is that as a bit pattern (the same machine code): 0000 0001 0000 1001 0101 0000 0010 0000 The program’s binary code is a sequence of machine code instructions. Sequential Execution MIPS machine instructions are all 32 bits wide (4 bytes). Normally, instructions are executed one after another in the same sequence that they have been placed in memory, starting with the lowest address and going up. Here, for example, is a program which sets two operands then adds them. The three machine instructions have been placed at locations 0x00400000, 0x00400004, and 0x00400008, and are executed in that order. Program Counter 0x0040 0000 +4 0x0040 0004 +4 0x0040 0008 Computer somehow should remember the instructions address it wants to read (fetch) from the memory. There is a special register in Control Unit for that purpose. It’s called Program Counter and keeps the next instruction’s address to be fetched from the memory. After fetching of each instruction the computer increments this register’s content to point to the next instruction. After reading the instruction from the memory computer should run the instruction. So there should be some place to put the instruction and some hardware to understand (decode) the instruction to be able to run it properly. For these purposes are created the Instruction Register for keeping the instructions and the Instruction Decoder logic to decode the instruction. 14 After the instruction is decoded then the appropriate signals are sent to the different blocks of hardware to perform the necessary action. Control Unit Program Counter Current instruction Memory Next instruction’s address Program Code 0x01095020 Instruction Register Data Instruction Decoder Control Unit Arithmetic Logic Unit General Purpose 32 bit Registers $0, $1, … , $31 Input Output Central Processing Unit MIPS Model Machine Cycle Computer works proceeding sequentially one instruction at a time. The computer endlessly cycles through three basic steps. The machine cycle is illustrated at right. Each cycle executes one machine instruction. Everything the processor does is done by a sequence of machine operations. So, also, everything any program does is ultimately done by machine operations. Fetch the Instruction. The program counter (PC) contains the address of the current machine instruction. The instruction is fetched from memory. Increment the PC. The address in the program counter is incremented by four. This allows to begin the next instruction preparation while the current instruction is processed. Execute the Instruction. The machine operation specified by the instruction is performed. As MIPS has RISC behavior then the cycle’s execution block should be very simple. So MIPS ALU operations never directly access memory. 15 Von Neumann machine operation cycle 1. Fetch instruction - The control unit retrieves instruction from the memory. 2. Increment the address of instruction 3. Execute. As the Von Neumann computer has CISC behavior then the cycle’s execution block is very complex. Fetch operands. o The control unit retrieves operands from the memory. Instructions are executed and they perform necessary operations in ALU with operands. The result is written again into the memory. Input and Output devices are used when there is need to input or output the information. Control Flow A bit pattern that is fetched as an instruction is interpreted as an instruction. The bits determine what is done in the next machine cycle. If the pattern makes no sense as an instruction then normal control flow is interrupted. If the pattern can be interpreted as an instruction then it will be executed, whatever it does. The control point of an executing program is the address in memory of the instruction being executed. When an instruction is being executed (in the third step of the machine cycle) the program counter holds the address of the instruction after the control point. Normally the control point moves sequentially through the machine instructions. On the MIPS this means it normally moves through memory in steps of four bytes (32 bits) at a time. The execution sequence can be changed with a branch or a jump machine instruction. Usually "control point" is shortened to "control" and the phrase flow of control means how the control point moves through memory. If control flow leads to an address in memory, then the four bytes starting at that address are fetched as a machine instruction. The processor has no other way to distinguish instructions from data. Whatever bit pattern gets pulled in from memory as an instruction will be executed as an instruction. It is common for the control point of a buggy program to enter a section of data. This sometimes leads to mystifying results. By software convention, data and instructions are placed in different sections of memory. (This helps prevent mystifying results). But this is not a requirement of the architecture. [0x0040000c] [0x00400010] ... [0x00400020] Control Flow jump 0x0040 0020 other instructions ... ori $8, $0, 4 Program Counter 0x0040 0004 Control Point 0x0040 0000 16 Computer System Components. Input and Output. The diagram is a general view of how desktop and workstation computers are organized. Different systems have different details, but in general all computers consist of components (processor, memory, controllers, video) connected together with a bus. Physically, a bus consists of many parallel wires, usually printed (in copper) on the main circuit board of the computer. Data signals, clock signals, and control signals are sent on the bus back and forth between components. A particular type of bus follows a carefully written standard that describes the signals that are carried on the wires and what the signals mean. The PCI standard (for example) describes the PCI bus used on most current PCs. The processor continuously executes the machine cycle, executing machine instructions one by one. Most instructions are for an arithmetical, a logical, or a control operation. A machine operation often involves access to main storage or involves an i/o controller. If so, the machine operation puts data and control signals on the bus, and (may) wait for data and control signals to return. Some machine operations take place entirely inside the processor (the bus is not involved). These operations are very fast. Input/output controllers receive input and output requests from the central processor, and then send device-specific control signals to the device they control. They also manage the data flow to and from the device. This frees the central processor from involvement with the details of controlling each device. I/O controllers are needed only for those I/O devices that are part of the system. Often the I/O controllers are part of the electronics on the main circuit board (the mother board) of the computer. Sometimes an uncommon device requires its own controller which must be plugged into a connector (an expansion slot) on the mother board. 17