PDP 11
Instructions and addressing modes

• Details of the PDP 11 instruction set and addressing modes are NOT examinable.
  – Details are given here so that you can understand the examples and complete the exercises and assignment
• For the exam
  – The principles of “addressing modes” matter
    • E.g. the need for ‘register’, ‘indirect register’, ‘indexed’, ‘relative’, ‘absolute’, … addressing modes
      – and how these relate to elements in high level languages like C
  – The way subroutine calls use the stack, the way I/O works … these things also matter.

PDP 11
• Instructions and addressing modes
• Programming the PDP 11
• On interrupts
• Transition to OS

Couple of general themes commonly found in CPU architectures

• Keep instructions and addresses short
  – Trying to minimise the number of bytes transferred over the bus, to speed overall processing
  – So it is very common for computer architects to design instruction sets and addressing modes that use short form addresses
    – e.g. byte offsets from the current address rather than a full address
• Position independent code
  – If all addresses are expressed as offsets, then code can be shifted around in memory!
  – This potentially simplifies the task of an operating system that must load programs into main memory.

Data processing

• It was common for CPU architectures (certainly back in the 1970s) to require that data elements be copied into registers before they could be processed in circuits like “add”, …
  – So code was typically of the form
    • Load data into reg-0
    • Load other data into reg-1
    • Some instructions combining data
    • Store result now held in reg-0 back into memory
• The PDP-11 was unusual in that it could take data from memory, route them into a processing circuit, and return the result to main memory
  – Registers were still heavily used, but were not quite as much of a bottleneck as on computers where all data had to be “loaded” before processing and “stored” after processing

Instruction formats

• Several different instruction formats
  – Principal formats
    1. Single operand
    2. Double operand
    3. Register source or destination
    4. Branch
  – The PDP-11 had a few instructions that had unique individual formats (e.g. “sob”, the ‘subtract one and branch if not zero’ instruction used for loop control in some of the examples already illustrated)
• Most of the data manipulation instructions come in two versions: full word data (16-bit) and one-byte data

“Instruction format”: the way that the bits in an instruction word are used to represent the op-code and the information identifying operands.

nabg

“src and dst”; offset

• Src = source and dst = destination
  – Where data are fetched from and written to
• 6 bits get allocated for “src” or for “dst”
  – 3 bits are “mode”
  – 3 bits are register (r0, r1, …, r6 (sp), r7 (pc))
• Offset (used in the branch instructions that form the basis of conditional tests (if … then … else …) and loops)
  – 8 bits used to identify the relative location of the instruction that will be branched to
• Modes?
  – 0 – register mode (the data are in the register)
  – 1 – indirect register mode (the register holds the address of the data (src) or the address for the result (dst), i.e. it’s like a C/C++ pointer)
  – 2 – auto-increment
  – …

Address modes - preview

• Defined using the different addressing modes, e.g.
  – “register address mode + register”
    • Src – the data are already in a register
    • Dst – the result of the operation will go into the register
  – “indirect register address mode + register”
    • Src – the register holds the address of the data that are to be used (i.e. it’s a pointer)
    • Dst – the register holds the address where the result of the operation is to be stored (i.e. it’s a pointer)
• All of the PDP-11’s addressing modes involve reference to a register

Single operand

[Figure: single operand instruction format. Word & byte variants of the instruction (e.g. clear word, clear byte); 0 => word, 1 => byte variant of instruction. DD – 2 octal digits, 6 bits, specifying address mode and register (the “destination”).]

Double operand – 2 address

[Figure: double operand instruction format; 0 => word, 1 => byte variant of instruction. SS – 2 octal digits, 6 bits, specifying address mode and register (src). DD – 2 octal digits, 6 bits, specifying address mode and register (dst).]

Double operand – register and address

• Supplementary set of instructions; you only got these if you paid extra for the EIS (Extended Instruction Set). They always involve data in a register (depending on the instruction this might be “src” or “dst”), so there is no need to specify a mode for that operand; only 9 bits are used for defining operands (3 bits for a register, 6 bits for the address mode and register defining the other operand)
• The product of two 16-bit numbers may be a 32-bit number, so instructions like MUL and DIV are actually designed for 32-bit data held in a pair of registers, R and R+1
• The subroutine call instruction, jsr, also uses this register & address format

Instructions and status bits – used by branch instructions

• All PDP-11 data manipulation instructions update the “status” bits; the following are just examples of the changes possible (not a comprehensive list)
  1. V – set if “add” caused overflow, otherwise cleared
  2. N – set if the result of the operation was –ve (i.e. the leftmost bit was 1, either in the 16-bit word if a word operation, or in the 8-bit byte if a byte operation)
  3. C – set if there was a “carry” (as in the ror (rotate right) instruction illustrated in the bit-counting example)
  4. Z – set if the result was 0
• These status bits can then be tested (individually or in combinations) in immediately following branch instructions

Branch instructions

• Some branch instructions test individual status bits, others test combinations of bits
• Combination tests are designed to be useful
  – When working with 16-bit signed numbers (the usual case)
  – When working with 16-bit unsigned numbers (rarer; these usually occur when simulating multi-precision arithmetic, such as forming a 32-bit sum of many 16-bit entries in a data array)
• C (later C++) incorporated “unsigned int” etc. because the computer used had this feature!

Branch instructions - offset

• Branch instructions have 8 bits for an “offset”
  – 8 bits, interpreted as a signed value -128 to +127
  – This defines the number of words back or forward through the code
    • All instructions must be at word boundaries, so there is no point defining a byte offset
  – Computed relative to the value in the program counter after it has fetched the branch instruction and been incremented to hold the address of the following instruction
• Sounds a bit complex, but it’s sorted out by the assembler program that encodes your instructions.
• The only problem that could occur would be trying to branch too far, more than ~128 words (that would need a jmp – jump instruction). Loops and if … then … code shouldn’t take up large numbers of instructions!
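The offset rule can be checked with a short sketch. Python is used here purely for illustration, and `branch_target` is a hypothetical helper, not part of any PDP-11 tool. It sign-extends the low 8 bits of a branch code word and adds twice that word count to the already-incremented program counter. The sample values (001402 at address 001014, 000774 at address 001020) are the two branch words from the string copy listing in these notes.

```python
def branch_target(instr_addr: int, instr_word: int) -> int:
    """Return the address a PDP-11 branch at instr_addr would transfer to."""
    offset = instr_word & 0o377          # low 8 bits hold the word offset
    if offset & 0o200:                   # sign bit set: negative offset
        offset -= 0o400
    pc = (instr_addr + 2) & 0o177777     # pc already points at the next instruction
    return (pc + 2 * offset) & 0o177777  # offset counts words, so double it


if __name__ == "__main__":
    # beq done, assembled as 001402 at address 001014
    print(oct(branch_target(0o1014, 0o001402)))
    # br l1, assembled as 000774 at address 001020
    print(oct(branch_target(0o1020, 0o000774)))
```

The masking with 0o177777 simply keeps results inside the 16-bit address space; the sketch ignores condition testing and only shows where a taken branch would go.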
Branch instructions - example

[Figure not recovered.]

Branch instructions – for arithmetic on signed and unsigned

[Figure not recovered.]

“Jump and Subroutine”

• Subroutine call
  – JSR – call; RTS – return from subroutine
  – Also MARK
    • Explained later; it’s a very specialized instruction used to tidy up the stack when returning from a subroutine
• JMP
  – Like the unconditional branch instruction BR, but takes complete addressing data to define the destination rather than an offset (so not limited to the branch range of -128 to +127 words)
• SOB
  – That loop count instruction previously illustrated
    • A bit like a branch, but it can only branch back (to a lower address) and only has 6 bits to define how far
  – (It’s in with the JMP/JSR group as it just didn’t fit anywhere else!)

“Jump and Subroutine”

“call dest” and “return” are just assembler defined equivalents for jsr r7,dest and rts r7. Use of r7 (pc) was the most common way of employing the subroutine call and return instructions.

Condition code instructions

• There is a small group of instructions that can be used to explicitly set or clear individual bits in the status register.

Other specialized instructions

• There are a few more specialized instructions
  – Interrupt handling (RTI)
    • For interrupt based I/O
  – TRAP, EMT, IOT
    • Building blocks for an operating system
  – MFPI/MTPI
    • Obscure memory mapping system that allowed the PDP-11 (normally limited to 56 Kbyte of memory) to utilize larger memories
  – HALT, WAIT, RESET

Too complex, too confusing?

• Think that the PDP-11 is complex? Try a modern architecture!
• Learn about the modern Intel architecture on your own (if you really want to)
  http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-2a-manual.pdf
• The Intel manual just listing the instructions is almost 1000 pages long.

Back to PDP-11
Addressing modes – 1
Single operand addressing

[Figure not recovered.]

Addressing modes – 2
Double operand addressing

[Figure not recovered.]

Direct addressing - 1

[Figure not recovered.]

Indexed

• The memory word following that containing the instruction contains the index value

Assembly code for array indexing “min and max”

• References to base address of data array

Indirect addressing - 1

[Figure not recovered.]

Indirect addressing

• Mode 1, register deferred, is simple and common!
  – The register is a pointer; it holds the address of the data element.
• Modes 3, 5, and 7 are less common; they basically involve
  – Pointers to pointers!
  – Double indirection!
  – What?
    • The register doesn’t contain the data; it contains an address (i.e. it’s a pointer)
    • The referenced address doesn’t contain the data either! It again contains an address. It’s the memory word at that 2nd address that actually contains the data.

Double indirection

• Double indirection is not that common a programming construct.
  – Won’t see modes 3, 5, and 7 in many examples
    • (Mode 3 using r7 is a special case that does turn up occasionally; see details in a couple of slides)

Use of pc as register in dd or ss : 1

• What would mode-2, auto-increment, do if pc (r7) was used as the register?
  – E.g. mov (r7)+, r0
  – Fetch, decode, execute interpretation:
    1. Fetch the instruction (say for example it was at location 1020)
    2. Increment (by 2) the pc following the fetch (so it now holds 1022)
    3. Interpret the addressing mode for source
       1. It’s auto increment, so r7 holds the address of the data
       2. Data to be fetched from location 1022
       3. R7 to be incremented by 2 to 1024
    4. Interpret the addressing mode for dest – it’s just register r0
    5. Move the data fetched from 1022 from temporary register into r0
  – Ready to execute next instruction in location 1024

Use of pc as register in dd or ss : 2

• mode-2, auto-increment with r7
  – It basically means grab some immediately following data and use it
    • A very useful mode
    • It gets its own special treatment in assembler
    • “Immediate Addressing”
    • Written as – mov #val,dest
• Assembler must be able to work out the value for val – so constant, octal value, address

Use of pc as register in dd or ss : 3

• What would mode-3, auto-increment deferred, do if pc (r7) was used as the register?
  – E.g. mov @(r7)+, r0
  – Fetch, decode, execute interpretation:
    1. Fetch the instruction (say for example it was at location 1020)
    2. Increment (by 2) the pc following the fetch (so it now holds 1022)
    3. Interpret the addressing mode for source
       1. It’s auto increment deferred, so r7 holds the address of the memory word that holds a data pointer
       2. Pointer value to be fetched from location 1022
       3. R7 to be incremented by 2 to 1024
       4. That pointer value fetched from location 1022 is the address of the real data – so get the real data
    4. Interpret the addressing mode for dest – it’s just register r0
    5. Move the data fetched from the address that was specified in 1022 from temporary register into r0
  – Ready to execute next instruction in location 1024

Use of pc as register in dd or ss : 4

• mode-3, auto-increment deferred with r7
  – It basically means grab some data that was referenced by a pointer in the location following the instruction and use it
    • An occasionally useful mode
    • It gets its own special treatment in assembler
    • “Absolute Addressing”
    • Written as – mov @#val,dest

Use of pc as register in dd or ss : 5

• What would mode-6, indexed, do if pc (r7) was used as the register?
  – E.g. mov Val(r7), r0
  – Fetch, decode, execute interpretation:
    1. Fetch the instruction (say for example it was at location 1020)
    2. Increment (by 2) the pc following the fetch (so it now holds 1022)
    3. Interpret the addressing mode for source
       1. It’s indexed
       2. Data (the index value) to be fetched from location 1022
       3. R7 to be incremented by 2 to 1024
       4. Value of r7 and data fetched are added to get address of real data
       5. Fetch real data
    4. Interpret the addressing mode for dest – it’s just register r0
    5. Move the data fetched from the computed address r7+“index” from temporary register into r0
  – Ready to execute next instruction in location 1024

Use of pc as register in dd or ss : 6

• mode-6, indexed with r7
  – It basically means grab some data whose address is specified relative to the program counter
    • An extremely useful mode
    • It gets its own special treatment in assembler
    • “Relative Addressing”
    • Written as – mov val,dest
• Assembler must be able to work out the value for val – so constant, octal value, address

Relative addressing

• Relative addressing makes possible “Position Independent Code”
  – Rather than absolute addresses, have these relative addresses
  – If the operating system wants to move a segment with code and data, it can.

Deferred relative

• Mode 7 with pc
  – It basically means grab some data whose address is specified relative to the program counter, and use those data as the address of the real data
    • An occasionally useful mode
    • It gets its own special treatment in assembler
    • “Deferred Relative Addressing”
    • Written as – mov @val,dest
• Assembler must be able to work out the value for val – so constant, octal value, address

I think they should have chosen more distinct assembly language forms – I find having @# and @ to be a bit confusing!

Notes on the assembler : 1

• An assembler program sorts out addresses and works out the bit patterns that will represent the instructions.
• The input file with an assembly language program will contain the source code and “assembler directives”
• There were several assemblers for the PDP-11 series
  – Used different directives
  – Some had extra features (such as “macros”)
• The assembler for the simulator is relatively simple
  – No macros
  – Very small set of assembly directives - .blkw, .word, .string, .origin, .end
  – Only octal numbers
  – No expressions
    • Most assemblers allowed expressions, e.g. “loop-2”, “.+4”, where constants were required in the ss or dd part of an address

More PDP-11 Program Examples
Illustrating addressing modes

What is a “macro”?

• A macro (short for "macroinstruction", from a Greek word for 'long') in computer science is a rule or pattern that specifies how a certain input sequence (often a sequence of characters) should be mapped to a replacement input sequence (also often a sequence of characters) according to a defined procedure.
• The mapping process that instantiates (transforms) a macro use into a specific sequence is known as macro expansion.

Many assemblers support predefined macros, and others support programmer-defined macros involving sequences of text lines in which variables and constants are embedded. This sequence of text lines may include opcodes or directives. Once a macro has been defined its name may be used in place of a mnemonic. When the assembler processes such a statement, it replaces the statement with the text lines associated with that macro, then processes them as if they existed in the source code file.

Macros

• You may meet #define macros sometime in your C/C++ studies, e.g.
        #define min(X, Y) ((X) < (Y) ? (X) : (Y))
• In assembly language, macros are commonly used to define sequences of instructions that are often repeated, e.g. to save the data in the registers.
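Macro expansion as described above is essentially text substitution, which a few lines of Python can sketch. Everything here is illustrative: `expand_macros` and the `PUSH01` macro are hypothetical names, and the sketch handles only parameterless macros that occupy a whole line (no arguments, nesting, or labels), which is far less than a real macro assembler offers.

```python
def expand_macros(lines, macros):
    """Replace each line that is exactly a macro name with that macro's body lines."""
    expanded = []
    for line in lines:
        name = line.strip()
        if name in macros:
            expanded.extend(macros[name])  # substitute the stored text lines
        else:
            expanded.append(line)          # ordinary line: pass through unchanged
    return expanded


# Hypothetical macro, defined only for this illustration.
MACROS = {"PUSH01": ["mov r0,-(sp)", "mov r1,-(sp)"]}

if __name__ == "__main__":
    source = ["clr r2", "PUSH01", "inc r2"]
    for text in expand_macros(source, MACROS):
        print(text)
```

After expansion the assembler would process the substituted lines exactly as if the programmer had typed them in the source file.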
Macros for saving and restoring the registers:

        .MACRO SAVE
        mov r0,-(sp)
        mov r1,-(sp)
        mov r2,-(sp)
        mov r3,-(sp)
        mov r4,-(sp)
        mov r5,-(sp)
        .ENDM

Restore previous data

        .MACRO RESTORE
        mov (sp)+,r5
        mov (sp)+,r4
        mov (sp)+,r3
        mov (sp)+,r2
        mov (sp)+,r1
        mov (sp)+,r0
        .ENDM

The programmer could then just code the line “SAVE” or “RESTORE” rather than list all the instructions needed every time the data in the registers needed to be saved.

Notes on the assembler : 2

• Directives:
  – .origin
    • Sets absolute start address for next segment of code or data
      – E.g. .origin 1000
  – .end
    • Marks end of program and identifies start address
      – E.g. .end start
  – .string
    • ASCII text – will be terminated by at least 1 null byte
      – E.g. .string "Hello World"
  – .word
    • Comma separated list of word values
      – E.g. .word 105, 1744, 177776
  – .blkw
    • Reserves a block of memory of specified number of words
      – E.g. .blkw 5

Notes on the assembler : 3

• Numbers:
  – All numbers are interpreted as octal values
    • (and they don’t start with a leading 0 as is required in C/C++ programs)
• Names
  – Labels and constants have names that start with a letter and contain letters, digits, and underscores
  – No length limit
    • Nowadays, plenty of data space
    • Real assemblers would have limited names to 6 or 8 characters
• Comments
  – Start with ; character

That string copying program

Demonstrated in previous lecture segment

Generated code – as shown previously

Columns: address, content (both shown as octal numbers encoding bit patterns), source.

                        ; Program to copy and determine length of string
                                .origin 1000
        001000 012701   start:  mov #msga, r1
        001002 001024
        001004 012702           mov #msgb, r2
        001006 001076
        001010 005000           clr r0
        001012 112122   l1:     movb (r1)+, (r2)+
        001014 001402           beq done
        001016 005200           inc r0
        001020 000774           br l1
        001022 000000   done:   halt
                        msga:   .string "A string whose length is to be determined"
        001024 020101
        001026 072163
        …
        001074 000144
                        msgb:   .string "Different string that should get overwritten"
        001076 064504
        001100 063146
        …
        001150 067145
        001152 000000
                                .end start

This is not written in “position independent” style; there are absolute addresses in the code.

Instructions and addressing modes in copy string example : 1

                                .origin 1000
        001000 012701   start:  mov #msga, r1

• Move word instruction – 4 bits - 01
• Source: immediate addressing mode – mode 2, register 7 – 6 bits for ss – 27
• Destination – register mode – mode 0, register 1 – 6 bits for dd – 01
• Code word 012701
• Next word has address of msga

Instructions and addressing modes in copy string example : 2

        001012 112122   l1:     movb (r1)+, (r2)+

• Move byte instruction – 4 bits - 11
• Source: autoincrement addressing mode – mode 2, register 1 – 6 bits for ss – 21
• Destination – autoincrement addressing mode – mode 2, register 2 – 6 bits for dd – 22
• Code word 112122

Instructions and addressing modes in copy string example : 3

        001014 001402           beq done

• Branch instruction – 8 bits - 014
• Offset – 8 bits for a word count
  – Here the assembler has worked out it’s a branch forward for two words (4 bytes)
• Instruction at 1014 was fetched
• PC (r7) was incremented to 1016
• On execution add 4 to 1016 getting 1022 (yes – that is right, this is octal)
• 1022 is address for “done”

Instructions and addressing modes in copy string example : 4

        001020 000774           br l1

• Branch instruction – 8 bits - 004
• Offset – 8 bits for a word count
  – Here the assembler has worked out it’s a branch back for four words
• Instruction at 1020 was fetched
• PC (r7) was incremented to 1022
• On execution add -10 to 1022 getting 1012
• 1012 is address for “l1”

Offset value is 374; as an 8 bit signed value 11111100, which is -4.

Instructions and addressing modes in Min and max (pointers) example : 1

                .origin 1000
        maxval=77777
        minval=100000
        len=20
        ; Initialize
        start:  mov #minval,max
                mov #maxval,min
                mov #len,r2
                mov #data,r0
        ; Loop
        loop:   cmp @r0,max
                blt notlarge
                mov @r0,max
        notlarge: cmp @r0,min
                bgt notsmall
                mov @r0,min
        notsmall: add #2,r0
                sob r2,loop
                halt
        ; Data
        max:    .word 0
        min:    .word 0
        data:   .word 167776, 317, 4051, 67676, 174210, 74, 7776, 7, 147333, 31410, 172315, 5612, 31013, 23712, 555, 177204
                .end start

Instructions and addressing modes in Min and max (pointers) example : 2

        start:  mov #minval,max

• Again a mov (word) instruction – 4 bits – 01
• Source:
  – Immediate addressing
    • Mode 2, register 7 – 27
  – And a memory word containing the value of minval
• Destination
  – Relative addressing
    • Where is “max” relative to current point in code?
    • It’s 52 bytes further on (52 octal).
  – So mode 6, register 7 (relative addressing) and a memory word containing the relative address 52

Instructions and addressing modes in Min and max (pointers) example : 3

        loop:   cmp @r0,max

• Instruction - cmp – 02
• Source @r0
  – Register deferred – register contains address of operand, i.e. it’s a pointer
  – Mode 1, register 0
  – So 10
• Destination max
  – Relative
  – Mode 6, register 7 and will need a memory word for the relative address of max – 030 bytes on from this point
  – So 67 and 30
• So
  – 021067
  – 000030

Instructions and addressing modes in Min and max (array indexing) example : 1

                …
                …
        loop:   cmp data(r0),max
                blt notlarge
                mov data(r0),max
        notlarge: cmp data(r0),min
                bgt notsmall
                mov data(r0),min
        notsmall: add #2,r0
                sob r2,loop

Instructions and addressing modes in Min and max (array indexing) example : 2

        loop:   cmp data(r0),max

• Instruction - cmp – 02
• Source data(r0)
  – Indexed – address of operand obtained by adding register value to value in following word
    • Here the register is the index; the following word contains the base address of the array data
  – Mode 6, register 0
  – So 60 and a word with base address 1072
• Destination max
  – Relative
  – Mode 6, register 7 and will need a memory word for the relative address of max – 036 bytes on from this point
  – So 67 and 36
• So
  – 026067
  – 001072
  – 000036

A new example

Input – those wait loops again

Input from keyboard

• Keyboard and teletype output printer are independent devices
  – Program has to echo input, sending it to the teletype device
• Program
  – Loop
    • Get character
    • Put character
  – Until character == ‘\n’

Get character / put character

• Get character
  – “Enable” keyboard
  – Loop:
    • Test ‘done’ flag
    • Until flag set
  – Input character
• Put character
  – Send character
  – Loop
    • Test ‘done’ flag
    • Until flag set

Keyboard

[Figure: keyboard device registers; the first is mapped to address 177560.]

Both get character and put character should also test the ‘error’ status bit in the device controller – I’m simply assuming that my devices will always operate without errors!
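Returning to the encoding walk-throughs in this section: the code words 012701, 112122, 021067 and 026067 can be reproduced by packing the fields into a 16-bit word. The sketch below uses Python only for illustration; `encode_double` and the constant names are hypothetical, and it covers only the double operand format (4-bit op-code, 6-bit ss, 6-bit dd), not the extra words that follow an instruction.

```python
MOV, CMP, MOVB = 0o01, 0o02, 0o11   # 4-bit op-codes quoted in the walk-throughs


def encode_double(opcode, src_mode, src_reg, dst_mode, dst_reg):
    """Pack a 4-bit op-code and two 6-bit (mode, register) fields into one word."""
    src = (src_mode << 3) | src_reg   # ss: 3 bits of mode, 3 bits of register
    dst = (dst_mode << 3) | dst_reg   # dd: same layout
    return (opcode << 12) | (src << 6) | dst


if __name__ == "__main__":
    # mov #msga, r1 : immediate (mode 2, r7) to register (mode 0, r1)
    print(format(encode_double(MOV, 2, 7, 0, 1), "06o"))
    # movb (r1)+, (r2)+ : autoincrement on both operands
    print(format(encode_double(MOVB, 2, 1, 2, 2), "06o"))
    # cmp @r0, max : register deferred (mode 1, r0), relative (mode 6, r7)
    print(format(encode_double(CMP, 1, 0, 6, 7), "06o"))
    # cmp data(r0), max : indexed (mode 6, r0), relative (mode 6, r7)
    print(format(encode_double(CMP, 6, 0, 6, 7), "06o"))
```

Because each field is a whole number of octal digits (apart from the leading 4-bit op-code), the octal printout can be read off digit by digit, which is why PDP-11 listings are conventionally shown in octal.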
[Figure: second keyboard register, mapped to address 177562.]

Teleprinter

[Figure: teleprinter device registers, mapped to addresses 177564 and 177566.]

Unix time share system

• Keyboard/teletype as just shown would be the system administrator’s console
• There would be half a dozen other ASR33 teletype devices, each with their own hardwired addresses, for the users of the Unix time share system

Input echo program

        ; input demo
        ; read line from keyboard
        ; count lower case vowels (not checking upper case)
        ; Constants (addresses of devices) get defined
        tks=177560 ; control register for keyboard
        tkb=177562 ; data register for keyboard
        tps=177564 ; control register for console output
        tpb=177566 ; data register for console output
                .origin 1000
        ; this version just reads characters until newline
        start:  call getchar
                call putch
                cmp r0,#15
                bne start
                halt
        ; getchar
        ; wait for flag to set
        ; read the character
        ; (going to assume no errors)
        getchar: inc @#tks      ; enable
        getloop: bit #200,@#tks ; wait for done flag
                beq getloop
                movb @#tkb,r0
                return
        ; putchar - need to echo the character
        putch:  mov r0,@#tpb
        wtc:    tstb @#tps
                bpl wtc
                return
                .end start

• Mainline – calls to getchar and putchar in a loop that terminates when the newline (015) character is entered
• Get character subroutine
• Put character subroutine

Input echo program - mainline

                .origin 1000
        ; this version just reads characters until newline
        start:  call getchar
                call putch
                cmp r0,#15
                bne start
                halt

Call = jsr r7

• Typical subroutine call
  1. Instruction is fetched and pc updated to point to next location.
  2. The instruction is decoded, and the dd destination part is interpreted. Typically the addressing mode is relative addressing, and the location contains the relative address of the subroutine, i.e. of the first instruction of the subroutine. The address data are fetched, and the pc updated to point to the next instruction.
  3. This return address is pushed onto the stack.
  4. The pc is changed to hold the address of the first instruction in the subroutine.
• The next instruction fetch will get the first instruction in the subroutine.

Input echo program - getchar

        ; getchar
        ; wait for flag to set
        ; read the character
        ; (going to assume no errors)
        getchar: inc @#tks      ; enable
        getloop: bit #200,@#tks ; wait for done flag
                beq getloop
                movb @#tkb,r0
                return

The bit instruction checks whether the done bit is set; it will be when the device has the correct bit pattern in the buffer for the key just pressed.

Return = rts r7

• Return from subroutine – simple typical case
  1. sp should point to the word on the stack holding the return address
  2. Pop this address off the stack into the pc
• Next instruction executed would be that following the subroutine call instruction (and any address data associated with the subroutine call instruction)

Addressing mode

        getchar: inc @#tks      ; enable

• Instruction – inc (increment) – 0052
• Destination @#tks
  – Absolute addressing mode
    • Mode 3, register 7
    • A word with an absolute address to follow – tks (tks=177560)
• So
  – 005237
  – 177560

Wait loop I/O

• It’s simple to code, and pretty much the same for all devices; e.g. for input:

        Start device
        Loop:   test device’s “done” flag
                if “done” not set then goto Loop
        deal with data

Movie

Have to click in textarea for input.

Illustration of wait loops in the movie is not that clear because the movie frame rate doesn’t really capture the large number of loop iterations. If you try the code yourself – type slowly!

Block transfer devices

• A simple disk/DECTape device that stores blocks of data (e.g. 128 words) would have the following registers:
  – A control register
    • Bits set in this determined whether it was reading or writing
  – A memory address register
    • This would be loaded with the location in memory where input data are to be placed, or where data are to be taken from and written to storage
      – Once it had started a transfer, the device controller would automatically increment this register
  – A word count
    • If the block size was not hard wired, this register would be loaded with the block size before the transfer was started
    • Otherwise, it would be automatically set when the transfer was started, and then counted down
  – A block number
    • Which block of disk or DECTape is to be used
      – Things get a bit more complex with larger disks that have multiple surfaces, sectors, tracks etc.

Block transfer devices

• The block transfer devices used “direct memory access”

[Figure: six numbered panels, each showing the CPU (PC, IR, flags, ALU, registers), the bus, and the disk controller (block number, byte counter, destination address, flags, disk cache, disk). Panel captions recovered (order approximate): “CPU to disk: load block number XXX and start seek”; “Disk moving heads (seeking)”; “disk to CPU: got it”; “CPU to disk: copy into memory starting at address ******”; “CPU executing other instructions; data transferred "directly" into memory; block of memory to be filled”; “disk to CPU: transfer complete; block of memory now filled”.]

Wait loop coding for block devices

• Code along the following lines:

        ; load disk control with command – read/write etc
                mov #cmdword, @#diskcontrol
        ; load address register with memory location
                mov #membuf, @#diskmemoryaddress
        ; if necessary, load word count
                mov #wordcount, @#diskcount
        ; set disk block address – which incidentally starts the transfer
                mov #blkno, @#diskaddress
        ; now wait for “done” bit to be set in the control
        ; (it will be set once all the words copied to memory)
        diskwait: bit #somepat,@#diskcontrol
        ; disk gets chance to do dma transfers between these two instructions
                beq diskwait
        ; Have data - continue

I/O

• Input and output operations are
  – Fairly easy
  – A little repetitious
  – Time wasting
    • Going around those wait loops thousands of times

Sorting an array

• You must have done something like this in C/C++

[C listing not recovered. Caption: array of short integers initialized with octal values, some of which represent negative numbers.]

Just like C (C++)

Sorting an array

A version of “insertion sort” (and output of array contents before and after sort)

Here adopting the convention that code and data should go in separate memory areas.

Section for code:

        ; sorting (insertion sort)
                .origin 1000
        ; r0 = i in C++ code
        ; r1 = 2*r0 is index as byte offset when accessing data
        start:  mov #1,r0
        loop:   mov r0,r1
                clc
                rol r1
        ; valtoinsert = data[i]
                mov data(r1),valtoinsert
                mov r0,holepos
        ; r1 again used as index, now for holepos
        ; while(holepos>0 && valtoinsert<data[holepos-1])
        ; will use r1 as byte offset for [holepos-1]
        ; and r2 as byte offset for [holepos]
        while:  tst holepos
                beq endwhile
                mov holepos,r1
                mov holepos,r2
                dec r1
        ; change index to byte offset
                clc
                rol r1
                rol r2
                cmp valtoinsert,data(r1)
                bge endwhile
        ; data[holepos] = data[holepos-1]
                mov data(r1),data(r2)
                dec holepos
                br while
        endwhile: mov holepos,r1
                clc
                rol r1
                mov valtoinsert,data(r1)
                inc r0
                cmp r0,num
                blt loop
                halt

Section for data:

                .origin 2000
        valtoinsert: .word 0
        holepos: .word 0
        num:    .word 17
        data:   .word 142714, 66007, 41577, 42070, 132466, 27022, 71231, 154575, 53260, 150016, 61772, 145673, 143333, 25373, 122141
                .end start

The outer loop:

        start:  mov #1,r0
        loop:   mov r0,r1
                clc
                rol r1
        ; valtoinsert = data[i]
                mov data(r1),valtoinsert
                mov r0,holepos
        ; r1 again used as index, now for holepos
        ; while(holepos>0 && valtoinsert<data[holepos-1])
        ; will use r1 as byte offset for [holepos-1]
        ; and r2 as byte offset for [holepos]
        while:  …
                …
                …
        endwhile: mov holepos,r1
                clc
                rol r1
                mov valtoinsert,data(r1)
                inc r0
                cmp r0,num      ; code for the i<num part of the C/C++ for loop
                blt loop
                halt

Assembler and C code

• R0 used for variable i in C code
  – Index into a (zero based) array
• But memory access is byte addressing, so need to convert the index into a byte offset every time

                mov r0,r1
                clc
                rol r1

• Get value, rotate left (x2)
  – I clear the carry bit in case it’s been set by some earlier instruction, for otherwise the carry bit will become the low order bit of the offset, making it an odd byte address (a cautious habit that is worth keeping: an earlier instruction such as cmp can leave the carry bit set)
• Now can use r1 in indexed mode addressing

The inner while loop:

        while:  tst holepos     ; holepos>0
                beq endwhile
                mov holepos,r1  ; need byte offsets for holepos (r2)
                mov holepos,r2  ; and holepos-1 (r1)
                dec r1
        ; change index to byte offset
                clc
                rol r1
                rol r2
                cmp valtoinsert,data(r1)        ; valtoinsert < data[holepos-1]
                bge endwhile
        ; data[holepos] = data[holepos-1]
                mov data(r1),data(r2)
                dec holepos
                br while

Movie – sorts the data and displays dump after execution

Recursion (and division)

Output of a value in decimal format

Remember this?

[Figure not recovered.]

Time to see a recursive subroutine using the stack.

Itoa – integer to ascii

• Not quite the same coding, but similar recursive style
  – Limited to signed 16-bit numbers
• Numbers converted to 32 bits temporarily before using DIV instruction
• Approximate C code for this version:
        #include <stdio.h>

        void itoa1(int num) {
            int quot = num / 10;
            int rem = num % 10;
            if (quot > 0) itoa1(quot);
            char ch = rem + '0';
            putchar(ch);
        }

        void itoa(int num) {
            if (num < 0) {
                putchar('-');
                num = -num;
            }
            itoa1(num);
        }

The complete program:

        ; integer to ascii
        ;
        tps=177564 ; control register for console output
        tpb=177566 ; data register for console output
        val=1234
                .origin 1000
        ; Mainline
        start:  mov #val,-(sp)
                call itoa
                halt
        ; Auxiliary function for integer to ascii: deals with negative numbers
        ; (print - sign and make value positive!), then calls the recursive
        ; function that does the main part of the work
        ; Argument is on stack
        itoa:   tst 2(sp)
                bpl positive
        ; number was negative
        ; put - sign and negate the number
                neg 2(sp)
                movb #55,r0
                call putch
        positive: mov 2(sp),r0
                call itoa1
                return
        ; The recursive part of the function
        ; itoa1 - recursive convert to decimal
        ; argument is in r0 on entry
        ; need local variable (on stack)
        itoa1:  mov r0,r1
                clr r0
                div #12,r0
        ; quotient in r0, remainder in r1
        ; if quotient 0 - finished recursion
                tst r0
                beq done
        ; need recursive call
        ; need to save remainder on stack
                mov r1,-(sp)
                call itoa1
        ; retrieve stacked r1
                mov (sp)+,r1
        ; convert value to character
        done:   mov #60,r0
                add r1,r0
                call putch
                return
        ; Putchar
        putch:  mov r0,@#tpb
        wtc:    tstb @#tps
                bpl wtc
                return
                .end start

Mainline

                .origin 1000
        start:  mov #val,-(sp)  ; push argument onto stack
                call itoa       ; call the subroutine
                halt

itoa

        itoa:   tst 2(sp)
                bpl positive
        ; number was negative
        ; put - sign and negate the number
                neg 2(sp)
                movb #55,r0
                call putch
        positive: mov 2(sp),r0
                call itoa1
                return

• Tests the argument on the stack
  – The stack pointer will be pointing to the location on the stack with the return address; the argument (“num” in the C code) is one word (2 bytes) higher
  – (Just using indexing mode on the sp register)
• If the argument was negative, need to output a - sign (055 octal)
• To slightly simplify the coding, the recursive function has been written to take its argument in register r0

Call = jsr r7

Subroutine call using r7 (pc) as link register (the normal subroutine call for PDP 11)

• Push contents of r7 onto stack
  – r7 will hold the address of the instruction following the subroutine call instruction
• Change r7 to address of subroutine
  – The next instruction executed will be the first instruction of the subroutine

Return = rts r7

Multiplication and division

• If you paid extra for the EIS option (upsize me?), you got MUL and DIV instructions
  – 32 bit numbers

itoa1 subroutine

• On entry r0 holds the 16-bit value
• Convert to 32-bit format in registers r0 & r1
  – R0 will be all zeroes, r1 will hold the value originally in r0
• Can then use DIV instruction on r0
  – Operand will be #12 (12 octal is ten decimal)

itoa1

        itoa1:  mov r0,r1
                clr r0
                div #12,r0
        ; quotient in r0, remainder in r1
        ; if quotient 0 - finished recursion
                tst r0
                beq done
        ; need recursive call
        ; need to save remainder on stack
                mov r1,-(sp)
                call itoa1      ; recursive call
        ; retrieve stacked r1
                mov (sp)+,r1
        ; convert value to character
        done:   mov #60,r0
                add r1,r0
                call putch
                return

Local variable on stack

• The value for “rem” in any one instance of itoa1 must be saved while recursive calls are made.
• The only possible place to save the value is on the stack.
• So in this simple routine, push the value that is to be saved onto stack before the recursive call, and pop it off the stack after the recursive call returns. Recursive call 97 Push and pop 98 Itoa movie 1 • Run till first entry to itoa1 – then view stack ; need to save remainder on stack 001062 010146 mov r1,‐(sp) 001064 004767 call itoa1 001066 177756 ; retrieve stacked r1 001070 012601 mov (sp)+,r1 mov r1,‐(sp) Instruction: mov = 01; Source: register mode (0), register r1, = 01, Destination: auto‐decrement (4), register r6 = 46 mov (sp)+,r1 Instruction: mov = 01; Source: autoincrement (2), register r6, = 026, Destination: register mode (0), register r1 = 01 99 On stack : 1 100 Itoa movie : 2 • Continue to first recursive call of itoa1 • 001234 ‐‐‐ the value to be printed • 001010 ‐‐‐ return address in main • 001044 ‐‐‐ return address in itoa 1st call to itoa1 Call itoa1: Instruction “call” = jsr (004) & register 7 destination itoa1 – relative addressing so mode 6, register 7, and word for relative address 004767 000002 ; works out as 001046 – the start of itoa1 nabg 101 102 17 On stack : 2 • • • • • Itoa movie : 3 • Continue until get leading digit of ascii string passed to putchar routine 001234 ‐‐‐ the value to be printed 001010 ‐‐‐ return address in main 001044 ‐‐‐ return address in itoa 000010 ‐‐‐ bottom digit (as rem in recursive function) 001070 ‐‐‐ return address in itoa1 103 Stack and registers 104 Movie Finish … Stack: 001234 original argument pushed on stack 001010 return address in main 001044 return address in itoa 000010 bottom digit (effectively local variable in itoa1) 001070 return address in itoa1 000006 next digit (local variable in recursive version of itoa1) 001070 return address in itoa1 001104 return address in itoa1 (call to putch) Registers The right answer! 
1234 (octal) = 668 (decimal). First digit as character '6'

Structs and subroutines
More realistic subroutine example with local variables on stack

A C (C++) struct
• An array of those structs, each to occupy 16 bytes (to match the assembly language version)
• A name – at most 7 characters and a null byte (code doesn't check restrictions)
• Short integers for birth data (and a two byte "fill" to make up to 16 bytes)

A simple program using the array of structs

A function that processes the data
• Initialize working variables with values in data[0]

A successful execution of the program

… and in assembler
• What's interesting about this example?
  – Working out address of array element in array of structs
  – Accessing field of struct
  – Accessing arguments on the stack
  – Using the stack for local variables
    • Day, month, year and oldest

Data
        .origin 2000    ; put data in a separate segment
data:   .string "tom    "
        .word 13, 4, 3676
        .word 0
        .string "dick   "
        .word 34, 3, 3673
        .word 0
        .string "harry  "
        .word 23, 7, 3660
        .word 0
        .string "sue    "
        .word 22, 13, 3671
        .word 0

Data
The .string assembler directive arranges characters in successive words. If the quoted string has an odd number of characters, the assembler adds one null byte; if it has an even number of characters, the assembler adds a zero word.
The names have been entered with just the right number of spaces so that each, along with any null fill bytes, occupies exactly 8 bytes.

start:  mov #data,-(sp)
        …
        halt
; puts
; Print chars until get null byte
puts:   mov r0,r1
        …
done:   return
; print character
putch:  mov r0,@#tpb
        …
; oldest
;
oldest: sub #10,sp
        …
        return
; getstruct
;
getstruct: clc
        …
        return

Code
• Mainline
• Puts – a put string function
• Putch – a put char function
• Oldest – the subroutine
• Getstruct – helper function used when indexing into data array

Mainline
; Array of struct
; each struct needs fourteen bytes
; 8 for name - 2 each for day, month, year
; more convenient to make 16 (simplifies turning
; array index into byte offset)
strucsz=20
tps=177564      ; control register for console output
tpb=177566      ; data register for console output
        .origin 1000
start:  mov #data,-(sp)
        mov #4,-(sp)
        call oldest
; and again need to clear stack - those two arguments can go
        add #4,sp
        call puts
        halt
Push arguments onto stack, call function, result in register r0 passed to output string function

puts
;
; puts
; On entry, r0 holds a byte address
; Print chars until get null byte
puts:   mov r0,r1
putl:   clr r0
        movb (r1)+,r0
        beq done
        call putch
        br putl
done:   return
Nothing new here, just consume successive bytes printing them one by one until nul byte at end of string

putch
; print character
putch:  mov r0,@#tpb
wtc:    tstb @#tps
        bpl wtc
        return
Standard wait-loop printing of character

getstruct and oldest
; oldest
; examine those structs
; Going to use some local variables on stack!
; oldest - index into array 6(sp)
; day for oldest 4(sp)
; month for oldest 2(sp)
; year for oldest (sp)
; that is 4 two byte variables
; first must reserve that space by adjusting stack pointer
oldest: sub #10,sp
        clr 6(sp)       ; ndx=0
        clr r0
        mov 14(sp),r1
        call getstruct
        mov 10(r0),r1   ; getday
        mov r1,4(sp)
        mov 12(r0),r1   ; getmonth
        mov r1,2(sp)
        mov 14(r0),r1   ; getyear
        mov r1,(sp)
; now loop through other records to find any that are older
; use r2 for loop counter
        clr r2
oldieloop: inc r2
; num is 12(sp)
        cmp r2,12(sp)
        beq doneloop
        mov r2,r0
        mov 14(sp),r1
        call getstruct
; r0 holds base address of next record
        mov 14(r0),r1   ; r1 holds year for other record
        cmp (sp),r1     ; compare with year for oldest so far
        blt oldieloop   ; current has older year
        bgt change
; same year, compare month
        mov 12(r0),r1
        cmp 2(sp),r1
        blt oldieloop
        bgt change
; same year and month, compare day of month
        mov 10(r0),r1
        cmp 4(sp),r1
        blt oldieloop
; change - this record represents older person
; replace info in 'local' variables on stack
;
change: mov r2,6(sp)
        mov 14(r0),r1
        mov r1,(sp)
        mov 12(r0),r1
        mov r1,2(sp)
        mov 10(r0),r1
        mov r1,4(sp)
        br oldieloop
;
; done - retrieve index of oldest person
doneloop: mov 6(sp),r0
        mov 14(sp),r1
        call getstruct
; r0 holds struct address, 0 offset for string
; so it is the return value
; have to clean up stack - those local variables
        add #10,sp
        return

; getstruct
; on entry r0 is index 'ndx' into array
; r1 holds base address of array data
; on exit r0 is address of data[ndx]
getstruct: clc
        rol r0
        rol r0
        rol r0
        rol r0
        add r1,r0
        return
getstruct works out the address of array element data[r0]:
• r0 has index, must be converted into a byte offset
• Each struct is 16 bytes; so multiply r0 by 16 (decimal)
• Each rol is x2; so 4 rol instructions
• Add on base address of array
Easy!

The oldest subroutine is a bit more complex …

Oldest - 1, Oldest - 2
; oldest
; examine those structs
; Going to use some local variables on stack!
; oldest ‐ index into array 6(sp) ; day for oldest 4(sp) ; month for oldest 2(sp) ; year for oldest (sp) ; that is 4 two byte variables ; first must reserve that space by adjusting stack pointer oldest:sub #10,sp clr 6(sp) Stack: clr r0 address of array data value for num return address back in mainline two bytes for ‘oldest’ two bytes for ‘day’ two bytes for ‘month’ sp two bytes for ‘year’ 123 … clr r0 mov 14(sp),r1 call getstruct mov 10(r0),r1 ; getday mov r1,4(sp) mov 12(r0),r1 ; getmonth mov r1,2(sp) mov 14(r0),r1 ; getyear mov r1,(sp) to Base address for struct 148 offset from base address for struct Indexed address mode Argument “data” is now 14 above current stack pointer m Encoded string \0 13 4 3676 0 integer values 124 Oldest ‐ 3 ; now loop through other records to find any that are older ; use r2 for loop counter clr ,r2 • Using indexed address mode slightly differently than in previous example with array access oldieloop:inc r2 – In array access • Register holds index (converted into byte offset) • Word following instruction provided base address of array – Here using access to a struct • Register holds base address of struct • Word following instruction provides offset – Access into “local stack frame” is similar to struct access so 2(sp), 4(sp) etc 125 nabg Initialize working variables with values in data[0] ; num is 12(sp) cmp r2,12(sp) beq doneloop mov r2,r0 mov 14(sp),r1 Code to get address of data[i] into r0 call getstruct ; r0 holds base address of next record mov 14(r0),r1 ; r1 holds year for other record cmp (sp),r1 ; compare with year for oldest so far blt oldieloop ; current has older year bgt change ; same year, compare month mov 12(r0),r1 cmp 2(sp),r1 blt oldieloop bgt change ; same year and month, compare day of month mov 10(r0),r1 cmp 4(sp),r1 blt oldieloop 126 21 Oldest ‐ 4 Oldest ‐ 5 ; ; done ‐ retrieve index of oldest person doneloop:mov 6(sp),r0 mov 14(sp),r1 call getstruct ; r0 holds struct address, 0 offset for string ; so it is the 
return value
; have to clean up stack - those local variables
        add #10,sp
        return

; change - this record represents older person
; replace info in 'local' variables on stack
;
change: mov r2,6(sp)
        mov 14(r0),r1
        mov r1,(sp)
        mov 12(r0),r1
        mov r1,2(sp)
        mov 10(r0),r1
        mov r1,4(sp)
        br oldieloop
;

See it run
Same answer as C code, must be right

Messy isn't it
• Conceptually it's not much harder than the C program
• But all those fiddly details
  – Got to keep track of relative location of variables on stack
    • Argument 'data' – 14(sp)
    • Argument 'num' – 12(sp)
    • Local year – (sp)
  – Got to remember to claim the local space: oldest:sub #10,sp
  – And give it back when function finishes: add #10,sp
  – And if you push arguments onto stack before calling a function, must remember to correct sp on return

… and it only gets worse!
• The examples haven't involved any calculations that had complex sub-expressions to evaluate
• Complex sub-expressions?
  – You should be familiar with these from mathematical (real arithmetic) examples
    • Remember this (Unit-2 maths wasn't that long ago!)
    • The roots of the equation ax² + bx + c = 0 are given by x = (-b ± √(b² - 4ac)) / 2a
    • If you need to work out something like that you'll have to evaluate b² and save result somewhere (as need to re-use registers), then evaluate 4ac, then …

Typically storing temporary values on stack
• Where calculations get more complicated than those shown in simple examples so far, intermediate results have to be pushed on stack
  – So imagine a subroutine root(a,b,c)
    • On first entry c = 2(sp), b = 4(sp) and a = 6(sp)
    • Then you work out b*b and need to put (two word) result on stack
      – mov r1,-(sp) ; low order bits in product
      – mov r0,-(sp) ; high order bits in product
    • Now you want 4*a*c
      – But the stack pointer has changed!
Now a =12(sp) 131 nabg 132 22 A more sophisticated approach … Stack frame • Use another register to hold • Trying to handle subroutine arguments and local variables with just a stack pointer is – error prone – Place in stack where stuff for current subroutine is stored • Access arguments and local variables by indexing from this “stack frame pointer” – You get those relative offsets wrong too often – Arguments will be at higher locations – Local variables will be at lower locations • Compilers are better at keeping track of things than fallible human programmers who lose count • But it still results in code that is very difficult to work through if debugging – as the changes of addresses make interpretation of code harder • Will still need to claim space for locals by adjusting the stack pointer when enter the function; and will need to reset the stack pointer just before exit; ‐ but now it doesn’t matter if stack pointer changes to allow temporary values to be saved during calculations. • More sophisticated approach – The idea of a “stack frame” 133 134 Extra instructions? PDP‐11 : stack frame • May (these days usually do) have some extra instructions that help programmer to keep track of stack frames and tidy them up on exit from a function (e.g. don’t have to remember to remove those arguments pushed into stack) • Register r5 is “the stack frame pointer” Frame: Location of “old r5” in calling frame Argument n – Not in original 11‐20; added later for something like 11‐40. – Frame‐2 Argument 1 • PDP‐11 has “Mark” instruction Unix and C compiler implemented before Mark instruction available, so don’t take advantage of this feature r5 Mark n Stack Frame‐1 Frame‐3 r5 Return address Local variable 1 r6 Local variable 2 135 136 A “mark” instruction is placed in the stack; e.g. if have 2 arguments will have “mark 2” 006402 placed in stack. During subroutine exit, this instruction will be fetched from the stack and executed. 
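The benefit of a frame pointer can be seen in a small C model. This is only a sketch under stated assumptions, not the PDP-11 mechanism itself: the names `Machine`, `enter_frame`, `at_sp`, `at_fp` and `set_fp` are hypothetical, offsets are counted in words rather than bytes, and the execution of the mark instruction on exit is not modelled. It builds a frame in the order described above (old frame pointer, arguments, mark word, return address, locals) so that frame-relative and stack-relative access can be compared.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 32, NUM_LOCALS = 4 };

/* Hypothetical model: the stack is an array of 16-bit words that grows
   downwards. sp and fp are word indices, not PDP-11 byte addresses. */
typedef struct {
    uint16_t mem[STACK_WORDS];
    int sp;   /* index of the word on top of the stack (models r6) */
    int fp;   /* index of the frame's anchor word (models r5) */
} Machine;

void machine_init(Machine *m) {
    for (int i = 0; i < STACK_WORDS; i++) m->mem[i] = 0;
    m->sp = STACK_WORDS;   /* empty stack */
    m->fp = 0;
}

void push(Machine *m, uint16_t w) {
    assert(m->sp > 0);
    m->mem[--m->sp] = w;
}

/* Build a frame: old frame pointer, two arguments, the mark word,
   then (as the call would) the return address, then space for locals. */
void enter_frame(Machine *m, uint16_t arg1, uint16_t arg2,
                 uint16_t mark, uint16_t ret) {
    push(m, (uint16_t)m->fp);
    push(m, arg1);
    push(m, arg2);
    push(m, mark);
    m->fp = m->sp;         /* frame pointer anchors on the mark word */
    push(m, ret);
    assert(m->sp >= NUM_LOCALS);
    m->sp -= NUM_LOCALS;   /* like sub #10,sp: reserve four words */
}

/* Word at a given offset (in words) from sp or from fp. */
uint16_t at_sp(const Machine *m, int off) { return m->mem[m->sp + off]; }
uint16_t at_fp(const Machine *m, int off) { return m->mem[m->fp + off]; }
void set_fp(Machine *m, int off, uint16_t w) { m->mem[m->fp + off] = w; }
```

In this model, pushing a temporary value moves every `at_sp` offset by one word, while `at_fp(m, 1)` (the second argument) and `at_fp(m, -5)` (the last local) keep referring to the same words. That is the property that makes frame-relative addressing less error prone.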
137 nabg 138 23 139 Just show me … • Example – Same as last one – finding which struct represents the oldest person – This time, use frame pointer and “mark” • Not a major issue here as there weren’t deeply nested function calls leading to multiple frames on the stack • It’s just a simple illustration of using r5 and a mark instruction 141 New main mark2=6402 ; mark instruction with 2 args .origin 1000 start:clr r5 ; just to have an 'old r5' mov r5,‐(sp) ; save old r5 ; now save arguments mov #data,‐(sp) mov #4,‐(sp) ; now save the magic mark instruction for 2 args mov #mark2,‐(sp) mov sp,r5 ; setting frame pointer call oldest call puts halt ; Array of struct ; each struc needs fourteen bytes ; 8 for name ‐ 2 each for day, month, year ; more convenient to make 16 (simplifies turning ; array index into byte offset) strucsz=20 tps=177564 ; control register for console output tpb=177566 ; data register for console output mark2=6402 ; mark instruction with 2 args .origin 1000 start:clr r5 ; just to have an 'old r5' mov r5,‐(sp) ; save old r5 ; now save arguments mov #data,‐(sp) mov #4,‐(sp) ; now save the magic mark instruction for 2 args mov #mark2,‐(sp) mov sp,r5 ; setting frame pointer call oldest call puts halt ; ; puts ; On entry, r0 holds a byte address ; Print chars until get null byte puts:mov r0,r1 putl:clr r0 movb (r1)+,r0 beq done call putch br putl done: return ; print character putch:mov r0,@#tpb wtc: tstb @#tps bpl wtc return ; oldest ; examine those structs ; Going to use some local variables on stack! ; Find their positions, and positions of arguments, relative to the stack frame pointer (r5) ; ; first must reserve that space by adjusting stack pointer ; ; Now stack should be ; old r5 (i.e. 
0) ; first argument ‐ address of array ; second argument ‐ number of elements ; the magic mark instruction <‐‐‐‐‐ r5 pointing to this word ; the return address ; (space for local variables) index of oldest ; day ; month ; year ; so address of array (data) is 4(r5), address of num is 2(r5) ; address of index is ‐4(r5); address of day is ‐6(r5), address of month ‐10(r5) ; and address of year is ‐12(r5) oldest:sub #10,sp clr ‐4(r5) ; ndx=0 clr r0 mov 4(r5),r1 ; address of data in r1 call getstruct ; returns with r0 holding address of data[0] mov 10(r0),r1 ; getday mov r1,‐6(r5) mov 12(r0),r1 ; getmonth mov r1,‐10(r5) mov 14(r0),r1 ; getyear mov r1,‐12(r5) ; now loop through other records to find any that are older ; use r2 for loop counter clr ,r2 oldieloop:inc r2 ; num is 2(r5) cmp r2,2(r5) beq doneloop mov r2,r0 mov 4(r5),r1 call getstruct ; r0 holds base address of next record mov 14(r0),r1 ; r1 holds year for other record cmp ‐12(r5),r1 ; compare with year for oldest so far blt oldieloop ; current has older year bgt change ; same year, compare month mov 12(r0),r1 cmp ‐10(r5),r1 blt oldieloop bgt change ; same year and month, compare day of month mov 10(r0),r1 cmp ‐6(r5),r1 blt oldieloop ; change ‐ this record represents older person ; replace info in 'local' variables on stack ; change:mov r2,‐4(r5) mov 14(r0),r1 mov r1,‐12(r5) mov 12(r0),r1 mov r1,‐10(r5) mov 10(r0),r1 mov r1,‐6(r5) br oldieloop ; ; done ‐ retrieve index of oldest person doneloop:mov ‐4(r5),r0 mov 4(r5),r1 call getstruct ; r0 holds struct address, 0 offset for string ; so it is the return value ; still have to clean up stack ‐ those local variables add #10,sp rts r5 ; return using frame pointer ; getstruct ; on entry r0 is index 'ndx' into array ; r1 is base address of data ; on exit r0 is address of data[ndx] getstruct: clc rol r0 rol r0 rol r0 rol r0 add r1,r0 return .origin 2000 ; put data in a separate segment data: .string "tom " .word 13, 4, 3676 .word 0 .string "dick " .word 34, 3, 3673 
.word 0 .string "harry " .word 23, 7, 3660 .word 0 .string "sue " .word 22, 13, 3671 .word 0 .end start Much of it is the same code as already shown. Data no different. getstruct, puts, putch subroutines no different Changes to mainline – set up the call to oldest using new conventions involving stack frame; Changes to oldest to access arguments and locals variables by reference to stack frame pointer (r5). 142 Revised oldest ‐ 1 Tidying up has been done already 143 nabg 140 ; oldest ; examine those structs ; Going to use some local variables on stack! ; Find their positions, and positions of arguments, relative to the stack frame pointer (r5) ; ; first must reserve that space by adjusting stack pointer ; ; Now stack should be ; old r5 (i.e. 0) ; first argument ‐ address of array ; second argument ‐ number of elements ; the magic mark instruction <‐‐‐‐‐ r5 pointing to this word ; the return address ; (space for local variables) index of oldest ; day ; month ; year ; so address of array (data) is 4(r5), address of num is 2(r5) ; address of index is ‐4(r5); address of day is ‐6(r5), address of month ‐10(r5) 144 ; and address of year is ‐12(r5) 24 Revised oldest ‐ 2 Revised oldest ‐ 3 oldest:sub #10,sp clr ‐4(r5) ; ndx=0 clr r0 mov 4(r5),r1 ; address of data in r1 call getstruct ; returns with r0 holding address of data[0] mov 10(r0),r1 ; getday Initialize working variables mov r1,‐6(r5) with values in data[0] mov 12(r0),r1 ; getmonth mov r1,‐10(r5) mov 14(r0),r1 ; getyear mov r1,‐12(r5) 145 ; now loop through other records to find any that are older ; use r2 for loop counter clr ,r2 oldieloop:inc r2 ; num is 2(r5) cmp r2,2(r5) beq doneloop mov r2,r0 mov 4(r5),r1 call getstruct ; r0 holds base address of next record mov 14(r0),r1 ; r1 holds year for other record cmp ‐12(r5),r1 ; compare with year for oldest so far blt oldieloop ; current has older year bgt change ; same year, compare month mov 12(r0),r1 cmp ‐10(r5),r1 blt oldieloop bgt change ; same year and month, 
compare day of month mov 10(r0),r1 cmp ‐6(r5),r1 blt oldieloop Revised oldest ‐ 4 Move right along – nothing to see here 146 Revised oldest ‐ 5 ; ; done ‐ retrieve index of oldest person doneloop:mov ‐4(r5),r0 mov 4(r5),r1 call getstruct ; r0 holds struct address, 0 offset for string ; so it is the return value ; still have to clean up stack ‐ those local variables add #10,sp rts r5 ; return using frame pointer ; change ‐ this record represents older person ; replace info in 'local' variables on stack ; change:mov r2,‐4(r5) mov 14(r0),r1 mov r1,‐12(r5) mov 12(r0),r1 mov r1,‐10(r5) mov 10(r0),r1 mov r1,‐6(r5) br oldieloop New style return using r5 (to trigger mark instruction etc) 147 148 Execution – well just view dump to see stack when in “oldest” Stack (return address from completed call to getstruct) 0756/0760/0762 – local variables day, month, year 0764 pc Index (0) 0766 Return address 149 nabg 0770 Magical mark instruction “old r5” 0772/0774 0776 Arguments – address of data array 02000 and number of elements 4 150 25 Not conceptually harder • Examples should have convinced you that coding in assembly language isn’t really harder than programming in a high level language like C/C++ Nothing to it! – Same concepts • Expressions – ok spelt out as a sequence of operations • Assignment – just a mov of data into some variable • Loops – things like sob and constructs using conditional branch instructions • Conditionals – just conditional branch instructions • Function calls – jsr • Local variables – tweak the sp to allow space • … It’s easy to code in assembly language isn’t it! 
151 But not something you’d really want to do much • It’s just all those extra details involved in specifying the exact processing steps • All those extra opportunities for errors • You would write about the same number of debugged lines of code per day in C/C++ or in assembler – But you need 5 to 10 times as many lines of assembler as you need lines of C/C++ to do the same computations – Programming in assembler will take you 5‐10 times as long! For the first couple of days, you’d be stumbling around specifying addressing modes – but you’d soon get used to them. 152 Not for data processing • No one has the time to program data processing applications in assembler any more. • The only appropriate use is for I/O control – Fortunately, this has mostly been done for us by the people who wrote the operating system kernel, the device drivers, and the libraries that make I/O available in high level languages. • But you do need to know a little more about I/O and “supervisor calls” (or “traps”) 153 154 Compute while I/O takes place • Why did interrupt driven I/O first get added? • So as to allow costly CPU to get on with calculations while slow peripheral devices transferred data. Interrupt driven I/O • So an example A simple example – Send a message to teleprinter (using interrupts) while “computing” – • in this case going round a small loop incrementing a counter, and checking a variable that will get set when complete message has been sent. 155 nabg 156 26 On interrupt Vectored interrupts 1. Current instruction completes 2. Hardware checks for interrupt comparing priority of interrupt request with priority set in status register – if request has higher priority then interrupt will occur. 3. Program counter and status word are both pushed onto stack. 4. Interrupting device supplies an address in low memory 1. 2. 
The memory location at this address should contain the address of the interrupt handling routine – load into pc
The next memory location should contain a new value for the status register – setting the new priority – load that into status word.
5. Next instruction executed will be first instruction of interrupt handler function.

Vectored interrupts
• Each device – keyboard, teleprinter, clock, disk – has a hard wired interrupt address
• Code in application must initialize such addresses with locations of handler routines and must also set values in status words.

Pseudo-code
Call asynchronousWriteline(msg)
While flag not set
    increment counter
halt
------------------------------
asynchronousWriteline
    store address of message string
    send first character
    return
------------------------------
CharacterSentInterrupt
    test if at end of message (nul byte as next byte)
    if not at end – send next character and return from interrupt
    if at end – set flag and return from interrupt

; interrupts 1
; send a string while doing some computation
        .origin 1000
; This bit is the nascent operating system
ttyaddr=64      ; interrupt entry point for tty - set to address of handler routine
ttysw=66        ; holds status word to load when start handling tty input
tpsw=200        ; value to put in status word
tps=177564      ; control register for console output
tpb=177566      ; data register for console output
osstart: mov #iput,@#ttyaddr
        mov #tpsw,@#ttysw
; now enable interrupts from tty
; need to set bits 6 and 7 in its control register
        mov #300,@#tps
        call application
        halt
; interrupt handling routine
; on entry pc and status will have been saved on stack
;
iput:   tstb @bufptr
        beq msgdone
; There is another character to go
        movb @bufptr,@#tpb
        inc bufptr
        rti
msgdone: inc @flagptr
        rti
; print function
; save in local variable bufptr
; send the first character
print:  mov 4(sp),bufptr
        mov 2(sp),flagptr
        movb @bufptr,@#tpb
        inc bufptr
        return
bufptr: .word 0
flagptr: .word 0
        .origin 1400
; application - arrange for message to be printed, then compute
for a while
;
application: mov #data,-(sp)
        mov #doneflag,-(sp)
        call print
; remove arguments (two words, so four bytes)
        add #4,sp
        clr count
loop:   inc count
        tst doneflag
        beq loop
        mov count,r0
        halt
        .origin 2000
doneflag: .word 0
data:   .string "It works - and about time too!"
count:  .word 0
        .end osstart

Video recording of interrupts program – you can see it spending most of its time in the loop around 01424-01434 updating the counter; the time spent in the interrupt handler (around 01030-01050) is so brief it doesn't show as more than a flicker

Start up code
• Code that initialises the "interrupts vector"
  – Only one device (the teleprinter) so only one entry has to be filled in
    • Memory word 64 is to hold address of routine that handles a teleprinter interrupt (this interrupt basically says "I've finished printing the last character that you gave me")
    • Memory word 66 is to hold the status word to be loaded before handling the interrupt
      – Simply sets the appropriate priority (4)

Start up code
; interrupts 1
; send a string while doing some computation
        .origin 1000
; This bit is the nascent operating system
ttyaddr=64      ; interrupt entry point for tty - set to address of handler routine
ttysw=66        ; holds status word to load when start handling tty input
tpsw=200        ; value to put in status word
tps=177564      ; control register for console output
tpb=177566      ; data register for console output
osstart: mov #iput,@#ttyaddr    ; copies handler address into interrupt vector location
        mov #tpsw,@#ttysw
; now enable interrupts from tty and set 'done' (or 'ready')
; need to set bits 6 and 7 in its control register
        mov #300,@#tps
        call application
        halt

Interrupt handler
• What should we do when we get an interrupt from the teleprinter saying it has printed the last character?
  – If there is another character, send it; otherwise set a flag to indicate message sent.
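The handler's decision (send the next character, or set the flag) and the print function that starts the process can be sketched in C. This is a model under stated assumptions rather than the slides' code: `device_write` and `device_log` are hypothetical stand-ins for a write to the teleprinter's data register, and an "interrupt" is simulated by simply calling `on_char_sent`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the teleprinter: characters "sent" are
   appended here. On the PDP-11 this is a write to the data register. */
char device_log[64];
size_t device_len = 0;

const char *bufptr;      /* next character of the message to send */
volatile int *flagptr;   /* points at the caller's "message sent" flag */

void device_write(char c) {
    assert(device_len + 1 < sizeof device_log);
    device_log[device_len++] = c;
    device_log[device_len] = '\0';
}

/* Like the OS print function: remember where the message and the flag
   are, then send the first character to start the print mechanism. */
void print_async(const char *msg, volatile int *done) {
    assert(msg[0] != '\0');   /* this sketch assumes a non-empty message */
    bufptr = msg;
    flagptr = done;
    device_write(*bufptr);
    bufptr++;
}

/* Like the interrupt handler: called each time a character finishes. */
void on_char_sent(void) {
    if (*bufptr == '\0') {    /* nul byte next: message complete */
        ++*flagptr;
        return;
    }
    device_write(*bufptr);    /* there is another character to go */
    bufptr++;
}
```

A caller would invoke `print_async("Hi", &done)` and then loop, doing other work, until `done` becomes non-zero. In the real program the calls to the handler come from the hardware, not from the loop.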
Interrupt handler
; interrupt handling routine
; on entry pc and status will have been saved on stack
;
iput:   tstb @bufptr
        beq msgdone
; There is another character to go
        movb @bufptr,@#tpb
        inc bufptr
        rti
msgdone: inc @flagptr
        rti
The inc @flagptr instruction is an example of address mode 7, deferred relative: flagptr is a variable holding the address of a "flag" (i.e. it's a pointer), and we are referencing flagptr using relative addressing.

OS's "print function"
• My "Operating System" offers a print function
  – Set bufptr to point to start of message
  – Set flagptr to point to a "flag" variable that is to be set when message all printed
  – Sending first character
    • Just move byte from memory to teleprinter's buffer – that starts the print mechanism
  – I've been lazy! I haven't set this up properly using a stack frame etc, I'm just pushing the arguments onto stack and fetching from there

OS's "print function"
; print function
; save in local variable bufptr
; send the first character
print:  mov 4(sp),bufptr
        mov 2(sp),flagptr
        movb @bufptr,@#tpb      ; copies first byte of message into teleprinter data buffer, so starting the print mechanism
        inc bufptr
        return
bufptr: .word 0
flagptr: .word 0

Application code
• Send the message
  – Push message address, and address of a flag variable onto stack
  – Call the print function
  – Clear arguments from stack
• Compute loop
  – Increment count
  – Test flag
  – (Really counting how long it takes to print the message – as measured in cycles of that compute loop)

Application code
Start an asynchronous operation – check for completion
        .origin 1400
; application - arrange for message to be printed, then compute for a while
;
application: mov #data,-(sp)
        mov #doneflag,-(sp)
        call print
; remove arguments (two words, so four bytes)
        add #4,sp
        clr count
; "Compute loop"
loop:   inc count
        tst doneflag
        beq loop
        mov count,r0
        halt
        .origin 2000
doneflag: .word 0
data:   .string "It works - and about time too!"
count:  .word 0

It works – and about time too!
• Apocryphal • Example code relied on flag getting set when asynchronous operation completes (last character of message sent) • Code tests this flag at regular intervals – In ~1963, Cambridge University’s “Computer Laboratory” (UK) took delivery of a prototype Ferranti Atlas 2 computer that they named Titan – It came without any software – In those days (well it was almost 60 years ago) academic “computer scientists” could write code! • You will meet similar constructs in Java or C# when dealing with things like SOAP requests (a form of network communication covered in some 300‐level subjects) • David Barron, David Hartley, Roger Needham and Barry Landy (all UK pioneer computer scientists) settled down and started to write an operating system (in assembly language) and a compiler for a high level language • Took them longer than they expected • They got rather exasperated • Other academics (scientists and engineers) who had hoped to use the new computer started grumbling • Finally – It works – and about time too! 
– A little program running on the new Titan OS started printing that line repeatedly on the line printer 171 Handling interrupts ‐ reprise The interrupt vector • Hardware automatically saves the “status word” – with all those flags like “carry”, “overflow”, “negative”, … • Each entry in interrupt vector consists of – Last instruction executed may have set these – will want the same values there when resume execution • Hardware automatically saves program counter – so have address of next instruction that would have been executed if there hadn’t been an interrupt • Interrupt handling code will need to save registers – Not needed in 1st example (where just operated on memory locations and device buffers) – Usually, interrupt handling needs to do some data manipulation in registers – and must of course preserve any existing data in those registers • Interrupt handling code would restore any saved register values • rti instruction resets program counter and status word 173 nabg 172 – Address of operating system code that handles that kind of interrupt – A value for the status word – primarily setting the priority • The vector might be loaded by the boot program transferring a preset block of binary data from a disk boot file, or, as in the example, code to initialize the vector would be the start up code in the OS • On first start up, or after a reset operation, no device can interrupt. Devices must be explicitly enabled – they have “enabled” bits in the control registers 174 29 The interrupt vector The interrupt vector • Entries in the vector are hardwired • There are some entries relating to the CPU itself • There are entries for I/O devices, e.g. 
– 04 & 06 – entry if an error occurs while executing an instruction – 010 & 012 – illegal instruction (you’ve done something like jump into your data and loaded the instruction register with a bit pattern that doesn’t represent a known instruction) – 014 & 016 BPT – 020 & 022 IOT – 024 & 026 Power fail – you were supposed to save the CPU registers before the CPU stopped working – 030 & 032 EMT – 034 & 036 TRAP – 060 & 062 – keyboard – 064 & 066 – teleprinter – 070 & 072 – high‐speed paper tape reader – 0100 & 0102 – line clock – 0210 &0212 – RC11 disk – 0214 & 0216 – DECTape controller 175 176 Maybe you would like something a little more sophisticated • You wouldn’t really like coding a system that required you to check that the previous output action was completed before starting a new output. • You prefer something like … Interrupts and buffering cout << “Greetings earthlings” << endl; … cout << “Take me to your leader” << endl; A more sophisticated example where the output handling functions dealt with all the issues of sending data – You simply call the function, passing your data; function will “immediately” return and data output will take place asynchronously 177 Use a circular buffer • This next example illustrates a simple “buffering” technique 1. You invoke a “writeline” function 2. System copies your string into a buffer and arranges for interrupt driven I/O to take characters from buffer and send them to teleprinter 3. If there are more characters in your string than there are spaces in the buffer, the system will wait until it can insert the extra characters. Example has two main purposes – firstly to show a more sophisticated interrupt driven program, something a little closer to how an OS like Linux handles terminal output (Unix/Linux don’t typically use circular buffers, instead using more elaborate “clist” structures – linked 179 lists of small text buffers); secondly, the example also aims to show some of perils of interrupt handling! 
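The circular buffer idea can be sketched in C before looking at the illustrations. This is a minimal model under stated assumptions, not the code of the example program: the names `CircBuf`, `cb_init`, `cb_put` and `cb_get` are hypothetical, and a `count` field is used to tell a full buffer from an empty one, which is one of several possible conventions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { BUF_SIZE = 8 };   /* the illustration uses a buffer of 8 characters */

typedef struct {
    char data[BUF_SIZE];
    int put;     /* index where the next character goes in */
    int get;     /* index where the next character comes out */
    int count;   /* characters currently waiting to be printed */
} CircBuf;

void cb_init(CircBuf *b) {
    assert(b != NULL);
    b->put = b->get = b->count = 0;
}

/* Returns false when the buffer is full; a writeline function would
   then have to wait until the interrupt handler frees a slot. */
bool cb_put(CircBuf *b, char c) {
    if (b->count == BUF_SIZE) return false;
    b->data[b->put] = c;
    b->put = (b->put + 1) % BUF_SIZE;   /* wrap around at the end */
    b->count++;
    return true;
}

/* Returns false when the buffer is empty; the interrupt handler would
   then have nothing to send and the teleprinter becomes idle. */
bool cb_get(CircBuf *b, char *out) {
    if (b->count == 0) return false;
    *out = b->data[b->get];
    b->get = (b->get + 1) % BUF_SIZE;
    b->count--;
    return true;
}
```

In an interrupt driven system `cb_put` would run in the writeline function and `cb_get` in the interrupt handler, so both touch the same state. This sketch ignores that; real code has to make sure an interrupt cannot arrive halfway through an update.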
Circular buffer 1
• Illustration – buffer holding 8 characters
• Two index variables: a put index for putting characters into the buffer; a get index for taking characters out of the buffer.
[Diagram: empty buffer with put and get indexes]

Circular buffer 2
• The teleprinter is idle, the buffer is empty
• You ask to print the greeting “Hi mom”
– System
• Teleprinter idle: send the “H”
• Now put all remaining characters that fit into buffer
• Everything fitted, function returns
– Put index will identify where next character is to go
[Diagram: buffer holding the remaining characters of “Hi mom”]

Circular buffer 3
• The “H” finishes printing, interrupt occurs
• Interrupt handler finds the “i” and sends it
• Get index was updated, system state shows “i” to have gone
• The “i” finishes printing, interrupt occurs
• Interrupt handler finds the space and sends it
– Get index updated, system state shows space to have gone
[Diagrams: buffer after the “i” has gone, then after the space has gone]

Circular buffer 4
• Your code invokes the writeline function to add a newline character
• There was space in buffer so writeline function adds the newline and returns
• Interrupt occurs showing “M” sent
– “o” picked up and dispatched to printer
– Index updated, new state
[Diagrams: buffer with the newline added, then after the “o” has been dispatched]

Circular buffer 5
• Your code invokes the writeline function to add a message “It works – and about time too”
• Writeline function fills up the buffer as far as it is able
• But it cannot put any more characters into buffer, so it waits (doesn’t return to caller)
[Diagram: full buffer holding the end of the first line and the start of “It works”]

Circular buffer 5 (continued)
• Interrupt: the “o” from “Mom” has been printed
• Interrupt handler gets the “m” and sends it, updates index
– Writeline resumes, finds it can fit in one more character, but message still not complete so it waits (doesn’t return to caller)
[Diagrams: buffer before and after the extra character is added]

Process continues
[Diagrams: successive buffer states as characters of “It works – and about time too” are printed and replaced]

Process continues 2
[Diagrams: further buffer states as the rest of the message passes through the buffer]

Eventually
• Eventually, the interrupt handler will remove a character making space for writeline to add the final “.” from the message “It works – and about time too.”
• Writeline will return to main program
• Interrupt handling will slowly empty the buffer while main program gets on with its work

How to know whether to wait? How to wait?
• You have to wait if the next location into which you want to put a character is the same as the next location from which interrupt handler should be fetching a character
– (The put index has caught up with the get index as they work cyclically around through the buffer)
• You could wait using a loop like
    loop:   cmp putndx,getndx
            beq loop

‘wait’ instruction
• That 2-instruction loop is ok but it means that CPU is using bus while fetching instructions
• Now in a real system, with disks etc that work with direct memory access, bus cycles are valuable
• CPU not doing anything useful, just causing congestion on bus
• So some computers have a “wait” instruction
– CPU just rests until an interrupt occurs; interrupt causes the wait instruction to finish and then starts normal interrupt handling

Movie
Program prints a series of strings, doing a little computational work in between each call to the writeline function. (Computation is just a small loop incrementing a counter.) You will see it spending most of the time in the compute loops, with wild forays off into the interrupt handling code. You will also see it seem to hang, at the ‘wait’ instruction in writeline, whenever more characters must be added to a buffer that is full.
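The wait test described above (put index catching up with get index) can be sketched in Python. This is only a model, not the PDP-11 code: the 16-slot size, the starting index values, and the helper names `advance`, `must_wait` and `put_chars` are assumptions made for illustration.

```python
SIZE = 16          # assumed: eight words, i.e. sixteen one-byte slots
MASK = SIZE - 1    # 0o17; keeps an index in the range 0..0o17


def advance(index):
    """Step an index to the next slot, wrapping from 0o17 back to 0."""
    return (index + 1) & MASK


def must_wait(putndx, getndx):
    """The writer has to wait when the put index has caught up with the get index."""
    return putndx == getndx


def put_chars(text, buff, putndx, getndx):
    """Store characters until the text runs out or the writer would have to wait.

    Returns the number of characters stored and the new put index.
    """
    stored = 0
    for ch in text:
        if must_wait(putndx, getndx):
            break
        buff[putndx] = ch
        putndx = advance(putndx)
        stored += 1
    return stored, putndx


buff = [None] * SIZE
putndx, getndx = 0, MASK          # assumed starting state: empty buffer
message = "It works - and about time too"

stored, putndx = put_chars(message, buff, putndx, getndx)
blocked = must_wait(putndx, getndx)   # True: the writer would now wait

# One "character printed" interrupt moves the get index on and frees a slot.
getndx = advance(getndx)
more, putndx = put_chars(message[stored:], buff, putndx, getndx)
```

With these assumptions one slot is always left unused, so the writer stalls after fifteen characters and each interrupt lets exactly one more character in, which matches the "fits in one more character, then waits again" behaviour in the walkthrough.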
Overview! Expanded views and commentary later!

“OS” ‘kernel’ code – set up the I/O handling, interrupt handler

    ; interrupts = circular buffer
    ; send strings while doing some computation
            .origin 1000
    ; This bit is the nascent operating system
    ttyaddr=64      ; interrupt entry point for tty - set to address of handler routine
    ttysw=66        ; holds status word to load when start handling tty input
    tpsw=200        ; value to put in status word
    tps=177564      ; control register for console output
    tpb=177566      ; data register for console output
    ; Setup and call application
    osstart: mov #iput,@#ttyaddr
            mov #tpsw,@#ttysw
    ; now enable interrupts from tty
    ; need to set bit-6 & 7 in its control register
            mov #300,@#tps
            call application
            halt
    ;
    ; Handle interrupt from teleprinter
    ; last character has been sent
    ; increment get index modulo length of buffer
            .origin 1200
    iput:   inc getndx
            bic #177760,getndx
    ; if getndx has caught up with putndx, then there are no more
    ; characters to send
            cmp getndx,putndx
            beq idone
            mov r0,-(sp)
            mov getndx,r0
            movb buff(r0),@#tpb
            mov (sp)+,r0
            rti
    idone:  clr active
            rti

“OS” data structures – data structure for circular buffer
(.origin directives being used to separate out each block of code into a separate area; simplifies examination of “dumps” of memory. Really would have a number of areas: “OS” code, “OS” data, application code, and application data.)

            .origin 1400
    ; circular buffer data structure
    ; 10 (i.e. eight) words, 16 characters
    buff:   .blkw 10
    ; two index values put and get - appropriately initialized
    putndx: .word 0
    getndx: .word 17
    ; active flag
    active: .word 0

“OS” – standard system libraries – writeline function that uses circular buffer

            .origin 1600
    ;
    ; writeline - copy characters into buffer and send them off
    ; of course, if buffer cannot hold all characters will have to
    ; wait
    ; this version of writeline takes address of string as
    ; argument in r0
    ; uses register r1 (doesn't save and restore!)
    writeline: tstb (r0)
            beq endmessage
    ; if printer isn't active - send this byte
            tst active
            bne usebuf
    ; simply send the byte - and set getndx one less than putndx modulo buffer length
    ; and mark printer active
            movb (r0),@#tpb
            inc r0
            mov putndx,getndx
            dec getndx
            bic #177760,getndx
            inc active
            br writeline
    ; usebuf - put character into buffer if we can
    usebuf: cmp putndx,getndx
            bne canput
            wait
            br usebuf
    canput: mov putndx,r1
            movb (r0),buff(r1)
            inc r0
            inc putndx
            bic #177760,putndx
            br writeline
    ; all characters of message sent to buffer, can return
    endmessage: return

Application code and data

            .origin 2000
    application: mov #msg1,r0
            call writeline
            mov #newline,r0
            call writeline
    ; compute a bit
    busy1:  inc r2
            cmp r2,#377
            blt busy1
    ; another message
            mov #msg2,r0
            call writeline
            mov #newline,r0
            call writeline
    ; compute a bit more
            clr r2
    busy2:  inc r2
            cmp r2,#377
            blt busy2
    ; another message
            mov #msg3,r0
            call writeline
            mov #newline,r0
            call writeline
            clr r2
    busy3:  inc r2
            cmp r2,#377
            blt busy3
            mov #msg4,r0
            call writeline
            clr r2
    busy4:  inc r2
            cmp r2,#177
            blt busy4
            return
    ; Application data
            .origin 2200
    msg1:   .string "Hi mom"
    ; acts as a newline string
    newline: .word 15
    ;
    msg2:   .string "It works - and about time too"
    msg3:   .string "Hand, hand, fingers, thumb, one thumb, one thumb, drumming on a drum, rings on fingers, rings on thumb"
    msg4:   .string "dum, dity, dum, ditty, dum, dum, dum"
            .end osstart

Set up code
• Usual things (in a real OS there are similar operations in the code that gets loaded from the “boot” device)
– Set up the interrupt vector (just the teleprinter handler)
– Enable interrupts from teleprinter and set it as “ready” for first character
– Call application
• Code: the block from .origin 1000 down to halt in the listing above (ttyaddr, ttysw, tpsw, tps, tpb, osstart)

Interrupt handler
• Code: iput and idone at .origin 1200 in the listing above
• Send next character: need to use a register while working out address of next character, so save and restore r0

Circular buffer structure
• Circular buffer
– An array, buff, that can hold sixteen characters
– Two index values
– An active flag for teleprinter
• Code: buff, putndx, getndx and active at .origin 1400 in the listing above

The message data
• The .string directive in the assembler doesn’t support escape character sequences like \n, so have a hacked way of entering a “newline” message!

The application
• Interleave “computation” and output of messages
• Code: application at .origin 2000 in the overview listing; busy1 to busy4 are the compute loops
• Message data: msg1 to msg4 and newline at .origin 2200 in the overview listing

The system’s library function
• The writeline function is the most complex part of this example
• Operation:
– Called with r0 containing address of message in user area; all message characters are to be added to circular buffer before return
– Will need to use r1 at some point so save and restore
– Basically a loop
• Until no more characters
– Test for room in circular buffer; if no room wait a bit then repeat the test
– If there is room, add another character and update indexes
– Special case
• If teleprinter is not active, send a character directly and make sure buffer pointers set up correctly
– This will happen on very first character, and subsequently if teleprinter catches up and empties the buffer completely before next writeline request

Working modulo length of buffer
• Buffer here has eight words, 16 characters
• Indexes for get and put will act as byte count relative to start of buffer, running 00 to 017
• Increment counter, then want to ‘mask’ with 017 (‘and’ operation)
– If count reaches 020, the “and” operation will change it back to 0
• PDP-11 doesn’t have an “AND” instruction!
– But it’s got a “bit-clear” instruction, so clear all the top bits: 0177760

The writeline function, with commentary

            .origin 1600
    ;
    ; writeline - copy characters into buffer and send them off
    ; of course, if buffer cannot hold all characters will have to
    ; wait
    ; this version of writeline takes address of string as
    ; argument in r0
    ; uses register r1 (doesn't save and restore!)
    writeline: tstb (r0)
            beq endmessage
    ; if printer isn't active - send this byte
            tst active
            bne usebuf
    ; simply send the byte - and set getndx one less than putndx modulo buffer length
    ; and mark printer active
            movb (r0),@#tpb
            inc r0
            mov putndx,getndx
            dec getndx
            bic #177760,getndx
            inc active
            br writeline
    ; usebuf - put character into buffer if we can
    usebuf: cmp putndx,getndx
            bne canput
            wait
            br usebuf
    canput: mov putndx,r1
            movb (r0),buff(r1)
            inc r0
            inc putndx
            bic #177760,putndx
            br writeline
    ; all characters of message sent to buffer, can return
    endmessage: return

• writeline / endmessage: just a loop until detect a nul byte marking the end of the string; when find that, can return from this writeline subroutine
• tst active down to the first br writeline: code that is supposed to be executed if teleprinter is idle; first character of message isn’t buffered, instead it is sent directly to teleprinter. Then go and get next character in message
• usebuf: wait for space in the buffer
• canput: add character to buffer, then go and get next character in message

Waiting for space
• If the ‘put’ and ‘get’ indexes are equal, then buffer is full; have to wait, and check again later
• Otherwise, there should be some space to put the character

It works!
• Movie: see it run again. It works!

Maybe not
• It’s buggy!
• Just suppose we are checking whether the printer is active
    ; if printer isn't active - send this byte
            tst active
– Variable active was at that moment 1, the printer was busy; so the status bits will be set to show “not equal zero”; and the character will be put in the buffer.
• There is a teleprinter interrupt!
– The handler (iput in the listing) runs: it’s sent a character. It was the last character. It marks the printer as inactive. Return from interrupt.

It doesn’t work 100% reliably
• Resume the interrupted code
– Oh yes, we just tested the printer status and saw that it is busy, so we have to put this character in buffer.
– Back around loop in writeline function
• Get next character from user area; it’s not nul so needs to be sent
• Oooh look, the printer is idle; we can send this character immediately

Characters appear out of order
• The two characters involved in this scenario would be printed out of order.
• IT NEVER HAPPENED – I TELL YOU MY FUNCTION WORKS!
• It will only happen if the interrupt occurs at just the right (or is it wrong) moment!
– Chances of this happening are very small, so of course it didn’t happen in the demonstration recording
• And anyway, you’d have been so pleased to see it working you wouldn’t really check the output that carefully and would probably never notice two letters in the wrong order.
A “race” condition
• Such problems are often referred to as “race” conditions
– It’s a “race” between the code in writeline filling the buffer, and the code in the interrupt handler
– One will be first to get through the “critical region” in the code where important data structures are updated; the correct functioning of the program depends on which one completes first

Race condition & critical region
• A race condition is the behaviour of an electronic or software system where the output is dependent on the sequence or timing of other uncontrollable events. It becomes a bug when events do not happen in the order the programmer intended. The term originates with the idea of two signals racing each other to influence the output first.
• In concurrent programming, a critical section is a piece of code that accesses a shared resource (data structure or device) that must not be concurrently accessed by more than one thread of execution. A critical section will usually terminate in fixed time, and a thread, task, or process will have to wait for a fixed time to enter it. Some synchronization mechanism is required at the entry and exit of the critical section to ensure exclusive use, for example a semaphore.

Critical regions
• The use of interrupts results in these problems where OS data structures may need to be changed both in interrupt handling code and standard “OS” code.
• “Critical regions” for different data structures pervade OS code and code for real-time systems
• If you are aware of the problem, you can write code that carefully avoids potential bugs.

More on critical regions in 200-level
• You will explore issues like critical regions of code in some of your 200-level subjects (CSCI204, CSCI212, and possibly CSCI213)
• There you will have multi-threaded programs where different threads may need to update a shared data structure
– The “threads” library that you use will provide various forms of “lock” to prevent multiple threads entering the critical region
• Here, with the interrupt program, changing the priority of the code can “lock out” the interrupts

Protecting critical regions in the code
• In this example, would want to make sure that didn’t get an interrupt while checking for and deciding how to deal with an idle printer.
• Code can change the priority in the status word
– Can temporarily raise the priority to 7 (disabling all interrupts) or 5 (disabling any interrupt from the teleprinter, which is hardwired to level 4)
– Simply move the new priority value into the status register before entering the “critical region”, and set it back to default priority 0 when leaving the critical region
            bis #240,@#177776   ; set priority 5
            …
            bic #240,@#177776   ; set priority back to zero

Are there other critical regions?
• What about the code working out the values of get and put indexes?
– Don’t think that an interrupt in this bit (in writeline, marking printer as active) would result in errors, but are you sure?
            mov putndx,getndx
            dec getndx
            bic #177760,getndx
            inc active
• (That code should only get executed if printer wasn’t active, in which case shouldn’t be getting any interrupts from it; so there shouldn’t be a problem. But are you sure? Could there be a sequence where some interrupt came from the printer after it was marked as inactive?)
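The octal constants in the priority code above can be checked with a little arithmetic. The sketch below assumes the usual PDP-11 layout in which the priority occupies bits 7 to 5 of the status word, and that a device interrupts only when its level is higher than the current priority; the function names are made up for illustration, with `bis` and `bic` modelling what the two instructions do to a 16-bit word.

```python
PRIORITY_MASK = 0o340    # assumed: priority field is bits 7..5 of the status word
PRIORITY_SHIFT = 5
TTY_LEVEL = 4            # teleprinter interrupt level, as stated in the slides


def bis(mask, word):
    """Bit set: turn on every bit of `mask` in a 16-bit word."""
    return (word | mask) & 0o177777


def bic(mask, word):
    """Bit clear: turn off every bit of `mask` in a 16-bit word."""
    return word & ~mask & 0o177777


def priority(psw):
    """Extract the priority level (0..7) from a status word value."""
    return (psw & PRIORITY_MASK) >> PRIORITY_SHIFT


def tty_can_interrupt(psw):
    """Assumed rule: a device interrupts only if its level exceeds the priority."""
    return priority(psw) < TTY_LEVEL


psw = 0                          # default priority 0
locked = bis(0o240, psw)         # entering the critical region
unlocked = bic(0o240, locked)    # leaving the critical region

print(priority(locked), tty_can_interrupt(locked))      # priority while locked
print(priority(unlocked), tty_can_interrupt(unlocked))  # priority after unlocking
```

Under the same assumptions, the value 200 loaded through the teleprinter vector corresponds to priority 4, which is why the handler itself is not interrupted by another teleprinter interrupt. The same `bic` helper also shows the index masking used in the listing: `bic(0o177760, 0o20)` gives 0.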
(Note on the priority code: machine registers including status word had bus addresses; status word’s address was 177776.)

Real time coding
• Code for handling interrupts is perilous
• These days, the only people who write interrupt handlers are
– Persons developing the kernel of operating systems like Linux
• Multi-cpu systems make it even more challenging to get everything right in situations where vital OS data structures can be updated by different processors
– Operating systems sometimes simply tolerate rarely occurring bugs, like if one process is trying to create a file in some directory while another process is deleting a parent directory, the file system can get corrupted.
– People working on “embedded computer systems” for ‘real time’ applications
• Computer program interacts with real world through a variety of sensors and controls

Real time computing
• Starts with machines like Laboratory Instrument Computer (LINC) ~1961
• In 1960s, get “laboratory/industrial” mini-computers from manufacturers like DEC
– PDP-5, PDP-8, PDP-11
• Start getting micro-computer based systems with processors from Intel ~1972

Embedded computer systems
• Back in the 1970s and 1980s, mini-computers like the PDP-11 were commonly used to control real-time industrial equipment
– Extra I/O devices
• Analog to digital interface
– Gets a voltage (from temperature sensor, neutron flux sensor, pressure gauge, …) and converts it into a number
• Digital to analog interface
– Given a number, uses it to set voltage on some analog control device
• Switch sensors and relays
– Test whether things are switched on or off
– Switch devices on and off
• Ran all kinds of equipment, including nuclear reactors, scientific instruments, rolling mills in steel works, …
“Oooh – the neutron flux and temperature are both going up, best switch on the device that inserts the control rods to calm things down.”

Modern embedded systems
• Nowadays, real-time systems are built using single chip microprocessors with similar I/O devices, and an even wider range of applications
• Controls in cars
• Controls for Martian rovers
• Game consoles
• Phones that double as ‘home entertainment centres’
– Some of the “smart” phones have things like pressure sensors and gyroscopes: analog devices that provide input via specialized analog to digital converters
• Controls for those toy helicopters
• Controls for drone spy planes (and Amazon package delivery drones one day?)
• Etc
• You can learn how to write safe interrupt driven code for such systems in specialised subjects on “Embedded Computers”
– For no rational reason (simply Faculty politics), such subjects are taught by the Electrical Engineering department

A cautionary tale
Feel like trying to write real-time software without specialist training?

Therac 25
• One of the favourite cautionary tales of the software engineers.
– The Therac 25 was an instrument used in hospitals for electron beam therapy (mainly things like skin cancers) and megavolt x-ray therapy (focussing x-rays on some internal tumour)
• Low intensity electron beam aimed directly at skin
• High intensity electron beam aimed at metal target that would emit the x-rays
• Had to be reconfigured to switch treatment mode
– All done semi-automatically by an interrupt driven program on a PDP-11

Race condition on Therac
• Operator entered configuration controls at keyboard, various control character sequences
– Enter sequence at just the right (wrong?) time
– Interrupt handling overlaps with code that is changing a control setting
• (No careful coding of critical regions with interrupts disabled!)
– Invalid data entered in control variable
• Metal target not moved into correct position
• Patient hit by full intensity electron beam
– About 3 patients got fried enough to die

Would be Linux contributors and others …
Few of you will write embedded control systems; but all of you need to know a little more about operating systems; and some of you might one day contribute to the development of future versions of the Linux kernel

A little more on the OS aspects
• Need to look at things like
– OS code
• Kernel code, including all the interrupt handling
• System services: primitive operations like handling I/O with buffering
• Library support: wrapper functions around system services
– Hardware support and features
• “trap” instructions (“supervisor calls”)
• “User-mode”/“supervisor-mode” distinction
– Single process vs multi-process (multi-tasking OS)

Primitive (Single User) Operating Systems
• As noted in the overview, operating systems started with a bunch of useful functions that (hopefully) would always be present in memory and which could be used by the (single) application program currently being run.
– Useful functions?
• Low-level I/O operations
– Console (keyboard & teleprinter): get-line, write-line
– Lineprinter: printline, page throw, …
– Disk: seek block, read block, write block
– Tape: rewind, skip to block, read block, write
– Also maybe
• itoa (integer to ascii), atoi (ascii to integer)
• Mul, div (integer multiply and divide if didn’t have hardware implementation)
• Fload, Fadd, Fmul, Fdiv, Fsub, Fstore: subroutines simulating floating point (real number arithmetic)
OS on main machines of late 1950s, and again on earliest 8-bit microcomputers in the 1970s!

Typical primitive Single User OS
[Diagram: memory layout with the following regions]
• Kernel code
• System functions: wrapper functions getline, writeline, readblock, writeblock
• System data structures, possibly also ‘system stack’
• Standardized start address for application code (value for .origin directive)
• Application code, with calls to system provided subroutines
• Application data
• Simple boot loader
– (PDP-11 slightly different as lowest memory locations always reserved for interrupt vector & system stack; kernel code starts above system stack)

Another example in PDP-11 assembler
• Non-interrupt version of OS using subroutine calls
– Provides readline, writeline, itoa, atoi
• Application
– Your favourite from ~week 2 of CSCI114

Assembler version – the “OS” parts
• Message printing function: similar to those illustrated previously
• A readline function: reads into buffer until get newline; again similar to earlier example
• An “ascii to integer” function
– Takes successive digit characters from a buffer converting to numeric value; using MUL
• An “integer to ascii” function
– This one is non-recursive, fills an in-memory string starting with low order (right-most) digit
• A recursive “integer to ascii” function (as previously illustrated) makes a nice simple illustration of recursion; but very few numeric output library functions are actually implemented that way! Example here is more typical.
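The two conversion routines just described can be sketched in Python to show the idea before reading the assembler. This is an illustrative model only: the 8-character field width is an assumption, the function names simply reuse `atoi` and `itoa`, and like the description it handles non-negative numbers only.

```python
def atoi(text):
    """Convert leading decimal digits to an integer; stop at the first non-digit."""
    value = 0
    for ch in text:
        if not ("0" <= ch <= "9"):
            break
        value = value * 10 + (ord(ch) - ord("0"))
    return value


def itoa(number, width=8):
    """Non-recursive conversion: fill a field with spaces, then place digits
    from the right, starting with the low order digit."""
    field = [" "] * width
    pos = width                      # one past the last slot
    if number <= 0:                  # sketch treats anything non-positive as 0
        pos -= 1
        field[pos] = "0"
    while number > 0:
        number, digit = divmod(number, 10)   # quotient carries on, remainder is the digit
        pos -= 1
        field[pos] = chr(ord("0") + digit)
    return "".join(field)


print(repr(itoa(987)))      # digits are right-aligned in the field
print(atoi("123\r"))        # conversion stops at the carriage return
```

Working from the right means the routine never needs to know in advance how many digits the number has, which is why the iterative form suits a fixed output buffer.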
Application parts
(Annotations from the enlarged views are shown as trailing comments.)

            .origin 2000
    application: mov #msg1,r1       ; Hello world
            call writeline
            mov #newline,r1
            call writeline
            mov #msg2,r1            ; I am a computer …
            call writeline
            mov #newline,r1
            call writeline
            mov #msg3,r1            ; Enter num1 :
            call writeline
            mov #inbuf,r1           ; readline
            call readline
            mov #inbuf,r1           ; Convert to integer
            call atoi
            mov r0,num1
            mov #msg4,r1            ; Enter num2 :
            call writeline
            mov #inbuf,r1           ; readline
            call readline
            mov #inbuf,r1           ; Convert to integer
            call atoi
            mov r0,num2
            add num1,r0             ; sum = num1 + num2
            mov r0,sum
            mov #msg5,r1            ; num1 + num2 =
            call writeline
            mov sum,r0              ; Convert sum to string in ‘numbuf’
            mov #numbuf,r1
            call itoa
            mov #numbuf,r1          ; Print numeric string
            call writeline
            mov #newline,r1
            call writeline
            halt
    ;
            .origin 2400
    ; data
    msg1:   .string "Hello world"
    msg2:   .string "I am a computer, I do arithmetic"
    newline: .word 15
    msg3:   .string "Enter num1 : "
    msg4:   .string "Enter num2 : "
    msg5:   .string "num1 + num2 = "
    inbuf:  .blkw 20
    num1:   .blkw 1
    num2:   .blkw 1
    sum:    .blkw 1
    numbuf: .blkw 5
            .end application

OS code – low level kernel code
Wait loop I/O handlers for printing one character on teleprinter and reading one character from keyboard; essentially same as illustrated earlier.

    ; demo of simplified non-interrupt OS
    ; It just provides Readline, Writeline, itoa, atoi
    ; data for the teleprinter
    tps=177564      ; control register for console output
    tpb=177566      ; data register for console output
    ;
    ; data for the keyboard
    tks=177560      ; control register for keyboard
    tkb=177562      ; data register for keyboard
    ;
            .origin 1000
    ; low level non-interrupt I/O
    ; getchar
    ; wait for flag to set
    ; read the character
    getchar: inc @#tks              ; enable
    getloop: bit #200,@#tks         ; wait for done flag
            beq getloop
            movb @#tkb,r0
            return
    ; putchar - need to echo the character
    putchar: mov r0,@#tpb
    wtc:    tstb @#tps
            bpl wtc
            return

OS code – system library functions, wrappers for low-level I/O (think <stdio.h>)

    ; readline - called with r1 pointing to a buffer in user area where line to be stored
    ; uses r0
    ; read characters (echoing them to teleprinter) and store until newline dealt with
    ; (and add a nul byte for safety before returning)
    readline: call getchar
            call putchar
            movb r0,(r1)+
            cmpb r0,#15
            bne readline
            clr r0
            movb r0,(r1)+
            return
    ; writeline called with r1 pointing to user buffer with message
    ; uses r0
    ; sends characters until get a nul
    writeline: movb (r1)+,r0
            bne more
            return
    more:   call putchar
            br writeline
    ;

OS code – system library functions, utilities (think <unistd.h>, <stdlib.h>)

    ; atoi called with r1 pointing to buffer with characters
    ; processes all digits converting to integer
    ; will use r3 while doing multiplications
    ; assumes only short integers so takes only low order part of product
    atoi:   clr r3
    atoil:  cmpb (r1),#60           ; in essence (‘0’<=ch) && (ch<=‘9’)
            blt atoiend
            cmpb (r1),#71
            bgt atoiend
    ; character is a decimal digit
            movb (r1)+,r0           ; val = (int) ch – (int) ‘0’
            sub #60,r0
    ; r0 now holds numeric value 0-9 for next decimal digit
            mul #12,r3              ; num = num*10 + val
            add r0,r3
            br atoil
    ; return result in r0
    atoiend: mov r3,r0
            return

• Stops processing at first non-digit character in buffer

OS code – iterative integer to ascii

    ; itoa - this is a non-recursive version
    ; on entry r0 is number, r1 is address of a buffer that must be 5 words long - last word 0
    ; will loop filling in characters starting at low order decimal digit
    ; (+ve numbers only)
    ; use r2,r3
    itoa:   mov #10,r2
    ; start by filling buffer with spaces
    itoafill: movb #40,(r1)+
            sob r2,itoafill
    ; now generate digits
            tst r0
            bgt nonzero
    ; simply put 0
            movb #60,-(r1)
            return
    nonzero: clr r2
            mov r0,r3
    itoadiv: tst r3
            beq itoadone
    ; do a division
            div #12,r2
    ; remainder in r3 is next value of next digit to go in buffer
            add #60,r3
            movb r3,-(r1)
    ; quotient in r2
            mov r2,r3
            clr r2
            br itoadiv
    ; itoadone -
    itoadone: return

• itoafill loop: fills 8 bytes with ‘ ’ (space) characters
• Extra word: nul terminates string
• nonzero: turns value into 32 bits in r2,r3 (high order bits in r2 all zeroes); needed for division
• itoadiv: iterative loop through digits

Iterative integer to ascii – example
1. Fill buffer with spaces (r1 will be pointing just past the spaces at end of fill loop)
2. Consider number with decimal value 987
– First division operation leaves quotient 98, remainder 7
– Convert value 7 into character ‘7’
– Place ‘7’ in buffer
– Second division operation leaves quotient 9, remainder 8; place the 8
– Finally, place 9
3. Print number with 5 leading spaces
[Diagrams: buffer contents after each digit is placed, with r1 pointing at the most recent digit]

Simple OS

Working with a primitive OS
• All the examples shown have included both “OS code” and “Application code”
• Of course, the “OS code” wouldn’t be included in each application
– Hopefully, the OS code is always there in memory.
• Application code would be arranged to run in higher memory locations than the OS code (via an .origin directive)
• Application code would have subroutine calls to the functions provided by the primitive OS
– Of course, the addresses of these functions would somehow have to be provided to the assembler!
• More on such issues later

Simple disk based OSs
• Simple operating systems would have used a disk – or, for those that were poorer, DECTapes
• Very simple file organization on disk
  – Boot block(s) (at fixed block number on disk/tape)
    • Has bit patterns representing the memory resident portion of OS with the interrupt vector and low-level device handlers etc
    • Gets loaded into memory directly into appropriate locations (using wait loop I/O code in a simple boot-loader program toggled into high memory)
  – Directory block(s)
    • Directory (in memory, and as binary file on disk) would consist of an array of file structs (name, size, first block), and a bit-map identifying the free/in-use status of all blocks on disk
  – Files for editor, assembler, and any other supplied programs
  – User’s files and free blocks

Simple disk based OSs
• Resident OS code would have additional functions (along with console handling functions similar to those shown and utilities like the example atoi and itoa functions)
  – Read/write block
    • Arguments: unit, disk block number, memory address
  – Lookup file
    • Arguments: unit, filename
    • Fills in a struct with first block number and number of blocks in file
  – Create file
    • Arguments: name, unit, number of blocks required
      – In this kind of simple OS, you would have to specify file size in advance at creation time
    • Returns either first block number or -1 if there wasn’t space on disk for a file of the requested size (directory structure on disk would have been updated)
• (Disk files would have used sequences of consecutive blocks)

Working with primitive OS
• Along with the OS you got a simple “shell” program
  – Just another application starting at .origin 2000
    • Its code would be loaded from a fixed block on disk (or DECTape if you were too poor to afford a disk)
    • It would have a few simple commands
      – PIP – peripheral interchange program
        » Let you copy text/binary files on paper tape to/from files on disk/DECTape
      – Editor
        » Simple editor lets you edit text files read from/written back to disk/DECTape
      – Assembler
        » Convert assembly text to executable form
      – Program loader
      – …
• (A bit like cmd.exe on Windows – you may have used this)

Exit to shell
• The primitive OS (the parts “always” in memory) would have a couple of other functions
  – Load and run program
    • Invoked from shell with disk address and details of file of assembled code ready for execution at address 2000+
      – Loads file blocks directly into memory
      – Jumps to start of code at 2000
  – Exit to shell
    • Reloads the interactive shell program from its fixed location on disk into memory at 2000 (overwriting last program)
    • Jumps to start of code at 2000

Similar to cmd.exe etc
• Use of such simple disk-based single user operating systems is actually quite similar to using cmd.exe or making naïve use of Unix/Linux
  – Terminal interaction similar to following (command names more likely to be cryptic 2-character things! Cf, Ed, As, Rn)…
    • Createfile A:myprog 2
    • Edit A:myprog
    • Assemble A:myprog B:myexe
    • Run B:myexe

Operation
• Diagrams purport to show what would be in memory at various steps
• Start with OS loaded into memory by boot loader and shell loaded into user area
  – Memory regions shown in the diagram
    • Interrupt/trap vector 0-377
    • Stack 400-777
    • Limited OS 1000-1777
    • Simple shell 2000-…
    • Boot loader
    • Memory mapped device registers
  – Boot loader loads from specific disk blocks that hold the binary image of OS and shell (using simple wait loop I/O)
  – Files written to disk or DECTape (units A and B)

Operation - 2
1. Createfile command in shell: Createfile A:myprog 2
   – Code in shell invokes tape handling functions and file manipulation functions of OS to create file on tape
2. Shell command Edit: Edit A:myprog
   1. Invoke operations in OS to read editor program (as binary image) from fixed location on disk; editor program overwrites shell
   2. Editor program runs; data area used to hold text of program
      1. User modifies text being edited
      2. User saves file to tape
   3. On exit from editor – OS loads shell back from disk

Operation - 3
3. Shell command Assemble: Assemble A:myprog B:myexe
   1. Invoke operations in OS to read assembler program (as binary image) from fixed location on disk; assembler program overwrites shell
   2. Assembler program runs; data area used to hold generated symbol table
      1. Reads program text from file on tape unit A, constructing symbol table in memory
      2. Re-reads program text, now generating executable image of program that gets written block by block to file on unit B
   3. On exit from assembler – OS loads shell back from disk
4. Shell command Run: Run B:myexe
   1. Invoke operations in OS to read generated executable (as binary image) from file on tape B; user program overwrites shell
   2. User program runs
   3. On exit from user program – OS loads shell back from disk

Left as an exercise …
• By end of 2nd year, you ought to be sufficiently skilled at programming to be able to write a simple non-interrupt OS and interactive shell for the simulated PDP-11
  – Actually, there are a couple of bits that might be too hard;
    • you would need to either use Schmidt’s code that simulates an RK11 disk, or add code to simulate a less sophisticated disk unit like the RC11 disk or a DECTape unit
      – Follow his example of the RK11 disk – basically, he uses an int[ ] to represent all the words of the disk, and has code that copies subsections of this array to represent read/write transfers of different disk blocks
    • You would need some extra mechanism for loading binary images of your OS, shell, and utility applications onto this disk unit

But subroutine calls to the OS?
• Very early on in the development of operating systems (late 1950s) designers of several different computer systems encountered similar problems with applications using ordinary subroutine calls to invoke systems functions.
  – One minor problem – the subroutine addresses
    • You needed a table giving the start addresses of the different OS routines – essentially a set of constant declarations that would have to be included in the assembly language source of each application
      – Often updated as the OS gets modified
  – A more significant problem – the desire to apply more controls over applications so that it wouldn’t be so easy for a wayward application to destroy the OS or mismanage peripheral devices

Calling the OS
Maybe something ‘more sophisticated’ than a simple subroutine call

Trap, Emt, Svc
• Computer architects independently developed similar solutions
  – New instruction(s)
    • Variously named “trap”, “emt” (emulate), or “svc” (supervisor call)
  – Some hardware assists
• Basically, they all involved a kind of “software interrupt”
  – Not an external interrupt from an I/O device triggering switch into OS code
  – Switch into OS code made at request of program

Software interrupt
• With a vectored interrupt system like that on the PDP-11, a “trap” instruction would
  – Save status word on stack
  – Save program counter on stack (program counter would hold address following the trap instruction)
  – Load program counter with address taken from word in interrupt vector
    • Address is code in the OS
  – Load status word with a specific status word value (setting priority and possibly other status information)
• Next instruction would be taken from OS code
  – Stack holds data needed when want to return to calling program

Trap and Emt!
PDP-11 was odd – it provided two instructions for essentially the same job, trap and emt; the rationale was that ‘emt’ was going to be used with code from DEC (like their OS) while ‘trap’ was for any extensions that programmers chose to implement locally. Unix used trap.

Trap
Bottom 8 bits of instruction used to encode request, e.g. trap 0 – exit; trap 1 – read; trap 2 – write; trap 3 - ….
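The four hardware steps of the “software interrupt” can be modelled in a few lines. This is only a sketch in Python, not part of the course simulator: the class `ToyCpu`, the function `request_number`, the handler address 1000 and the initial stack pointer are illustrative assumptions; the vector location 34/36 and the opcode base 104400 are the values used in the assembly listings.

```python
# Toy model of the PDP-11 "trap" sequence. Memory is a dict mapping a word
# address to a 16-bit value; this is an illustration, not a faithful simulator.

TRAP_VECTOR = 0o34      # new PC is taken from 034, new status word from 036
TRAP_BASE = 0o104400    # opcode of "trap 0"; the low 8 bits carry the request


class ToyCpu:
    def __init__(self):
        self.mem = {}
        self.pc = 0
        self.psw = 0
        self.sp = 0o1000          # stack grows downwards (assumed start value)

    def push(self, value):
        self.sp -= 2
        self.mem[self.sp] = value

    def pop(self):
        value = self.mem[self.sp]
        self.sp += 2
        return value

    def trap(self):
        """Hardware actions for a trap instruction located at self.pc."""
        self.pc += 2                            # PC now holds the address after the trap
        self.push(self.psw)                     # 1. save status word on stack
        self.push(self.pc)                      # 2. save program counter on stack
        self.pc = self.mem[TRAP_VECTOR]         # 3. load PC from the vector
        self.psw = self.mem[TRAP_VECTOR + 2]    # 4. load the new status word

    def rti(self):
        """Return from trap/interrupt: restore PC, then status word."""
        self.pc = self.pop()
        self.psw = self.pop()


def request_number(instruction):
    """The request is encoded in the bottom 8 bits of the trap word."""
    return instruction & 0o377


cpu = ToyCpu()
cpu.mem[TRAP_VECTOR] = 0o1000          # hypothetical address of the OS handler
cpu.mem[TRAP_VECTOR + 2] = 0o40        # status word value: priority 1
cpu.mem[0o2000] = TRAP_BASE + 2        # a "trap 2" sitting at address 2000
cpu.pc = 0o2000
cpu.trap()
```

After `cpu.trap()` the model is “in the OS”: the top of the stack holds the return address 2002 with the old status word above it, which is exactly the data that `rti` needs to get back to the calling program.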
Trap handling code
• The code that handles a trap (or ‘emt’) request is essentially a “dispatcher”
  – Retrieve actual trap instruction
    • You can find its address from the return address on the stack
  – Mask out everything except the bottom bits, which hold the request number
  – Use this as an index into a jump table that contains addresses of the functions that provide the different OS services.

First ‘trap’ style OS
• Redo last example using “trap” rather than subroutine calls
  – OS services of interest to application
    • Readline
    • Writeline
    • Atoi
    • Itoa
  – Low level non-interrupt getchar and putchar hidden from application
• This is in the style of the earliest trap based operating systems which offered an eclectic mix of functions (but no direct access to low level I/O routines)
  – E.g. the Titan OS had “supervisor calls” that included functions that (using floating point) computed sine(x), cosine(x), exp(x) as well as readline, writeline etc

Arguments for traps
• Code illustrated here shows a different approach to passing arguments
  – This alternative approach is commonly (but not universally) used with trap/emt/svc instructions¶
• The locations following the trap instruction will contain addresses of arguments
  – So, application code will typically look like
        ‘trap n’
        Address of arg1
        Address of arg2
        Next instruction
¶Linux has its own rules for trap instructions – all data, including the integer that determines which OS request is being made, must be passed in machine registers.

Traps & arguments
• Exit – trap 0 – 104400
  – No arguments
• Readline – trap 1 – 104401
  – Address of buffer where input line to be stored
• Writeline – trap 2 – 104402
  – Address of buffer with text for output
• Atoi – trap 3 – 104403
  – Address of location that is to store integer result
  – Address of buffer containing some digit characters
• Itoa – trap 4 – 104404
  – Address of location that holds integer value that is to be converted to string
  – Address of buffer that is to hold the string

I’m sure you trust me, but here is a movie demonstrating it running

Much the same application
        .origin 2000
application: writeline msg3     ; traps - calls to OS, each followed by
        readline inbuf          ; the arguments for the trap request
        atoi num1 inbuf
        writeline msg4
        readline inbuf
        atoi num2 inbuf
        mov num2,r0             ; ordinary code
        add num1,r0
        mov r0,sum
        writeline msg5
        itoa sum numbuf
        writeline numbuf
        writeline newline
        exit

        .origin 2400
newline: .word 15
msg3:   .string "Enter num1 : "
msg4:   .string "Enter num2 : "
msg5:   .string "num1 + num2 = "
inbuf:  .blkw 20
num1:   .blkw 1
num2:   .blkw 1
sum:    .blkw 1
numbuf: .blkw 5

Trap based, non-interrupt OS : 1
; demo of simplified non-interrupt OS
; system calls
; exit = os executes halt instruction!
; read = os will read a line from keyboard
;        returning when a newline character read
; write = os will write a line to teletype
;        returning when line all written (nul character at end)
; atoi, and itoa - integer<->string conversions
;
; define the operating system calls as trap instructions
exit=104400
readline=104401
writeline=104402
atoi=104403
itoa=104404
;
; data for the trap instruction
trapaddr=34     ; "interrupt entry point" - start address of request handler
trapsw=36       ; location for status word
opsyssw=40      ; value of status word - priority 1
…

Trap based, non-interrupt OS : 2
…
;
; data for the teleprinter
tps=177564      ; control register for console output
tpb=177566      ; data register for console output
;
; data for the keyboard
tks=177560      ; control register for keyboard
tkb=177562      ; data register for keyboard

Trap based, non-interrupt OS : 3
        .origin 1000
; initialize trap vector, then start the application
osstart: mov #os,@#trapaddr
        mov #opsyssw,@#trapsw
        jmp application
; low level non-interrupt I/O
; (simple non-interrupt wait-loop style handling of keyboard and printer;
; same as in many earlier examples)
; getchar
; wait for flag to set
; read the character
getchar: inc @#tks              ; enable
getloop: bit #200,@#tks         ; wait for done flag
        beq getloop
        movb @#tkb,r0
        return
; putchar - need to echo the character
putchar: mov r0,@#tpb
wtc:    tstb @#tps
        bpl wtc
        return

Trap based, non-interrupt OS : 4
; my micro operating system
; I will be using r0 and r1 (and maybe other registers) so save these
os:     mov r0,-(sp)
        mov r1,-(sp)
; find out which request - pick up return address as saved in stack
        mov 4(sp),r1
; program counter has been incremented - take off 2
        dec r1
        dec r1
; r1 should hold the address of the trap instruction
        mov (r1),r0
; r0 now holds the actual trap instruction that was executed
; bottom 8 bits contain request id - (though typically far fewer
; than 255 calls defined)
…

Trap based, non-interrupt OS : 5
…
; clear the top byte
        bic #177400,r0
; convert index to byte offset
        rol r0
        jmp @osfns(r0)
Rare example of one of those double indirections! Here using indexed deferred addressing:
  – osfns – base address of a little array of code pointers
  – r0 – the index

Trap based, non-interrupt OS : 6
; handle return from os call
; when reach here r0 should hold number of arguments used
; by last os call; need to adjust return address that is on stack
osreturn: clc
        rol r0
        add r0,4(sp)
; and put back registers
        mov (sp)+,r1
        mov (sp)+,r0
        rti
;
; function table for my os
osfns:  exitos
        read
        write
        getint
        putint
Return from a system trap works exactly like a return from interrupt – a trap is simply a “software interrupt”

Services of my OS : Exit (Exit system call), Read (Readline system call)
• Exit
  – Well, as explained before, a real disk OS would implement “exit from application” as loading the command shell back into memory and restarting that.
  – But I haven’t got around to writing a command shell – so I just halt.
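Before looking at the individual services, the dispatch and return path can be modelled in Python. This is a sketch under stated assumptions: `dispatch`, `OSFNS`, `LOG` and the stand-in service functions are hypothetical names that only record what they were asked to do; the addresses in the usage lines are made up. What it mirrors from the assembly is the arithmetic: the trap word sits two bytes before the saved return address, the request is its low byte, the argument words start at the saved return address, and the return address is advanced by two bytes per argument.

```python
# Model of the trap dispatcher and of "osreturn". mem maps word addresses to
# 16-bit values. Each stand-in service returns the number of argument words
# it consumed, like the value left in r0 before "br osreturn".

LOG = []    # records which service ran and with which argument words


def exit_os(mem, arg_ptr):
    LOG.append(("exit",))
    return 0                                    # no arguments


def read(mem, arg_ptr):
    LOG.append(("read", mem[arg_ptr]))          # address of input buffer
    return 1


def write(mem, arg_ptr):
    LOG.append(("write", mem[arg_ptr]))         # address of text to print
    return 1


def getint(mem, arg_ptr):
    LOG.append(("atoi", mem[arg_ptr], mem[arg_ptr + 2]))
    return 2


def putint(mem, arg_ptr):
    LOG.append(("itoa", mem[arg_ptr], mem[arg_ptr + 2]))
    return 2


OSFNS = [exit_os, read, write, getint, putint]  # plays the role of osfns


def dispatch(mem, saved_return):
    """Return the adjusted return address for the trap that was executed."""
    trap_addr = saved_return - 2                # like: dec r1 / dec r1
    instruction = mem[trap_addr]                # like: mov (r1),r0
    request = instruction & 0o377               # like: bic #177400,r0
    nargs = OSFNS[request](mem, saved_return)   # like: jmp @osfns(r0)
    return saved_return + 2 * nargs             # like: rol r0 / add r0,4(sp)


# "atoi num1 inbuf" assembled at 2000, with made-up addresses for num1, inbuf
mem = {0o2000: 0o104403, 0o2002: 0o2450, 0o2004: 0o2400}
new_return = dispatch(mem, 0o2002)
```

Here `new_return` is 2006, the address of the instruction after the two argument words, which is where `rti` must resume the application.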
; exit from os
exitos: nop
        nop
        halt
        br exitos               ; no escape

; read - readline system call
; takes one argument - address of buffer where input to be stored
; (argument will be in location immediately after trap instruction)
; uses r0 and r1
; read characters (echoing them to teleprinter) and store until newline dealt with
; (and add a nul byte for safety before returning)
; on entry r1 still holding address with trap instruction
read:   inc r1
        inc r1
; r1 now holds address that stores address of buffer
; make r1 hold the address of buffer
        mov (r1),r1
readloop: call getchar
        call putchar
        movb r0,(r1)+
        cmpb r0,#15
        bne readloop
        clr r0
        movb r0,(r1)+
; now return from this request - set r0 to 1 as this request had 1 argument
        inc r0
        br osreturn

Services of my OS : Write
Writeline system call
write:  inc r1
        inc r1
        mov (r1),r1
writeloop: movb (r1)+,r0
        bne more
        mov #1,r0
        br osreturn
more:   call putchar
        br writeloop
Loop taking characters from output buffer and sending them to teletype – in C roughly while(*r1) putchar(*r1++);

Services of my OS : getint
atoi system call
; getint
; processes all digits converting
; to integer
; will use r3 while doing multiplications
; (so save and restore)
; assumes only short integers so takes
; only low order part of product
; has a local variable (valptr)
; setting valptr to hold address for result and r1 to start of digit string
getint: mov r3,-(sp)
        inc r1
        inc r1
        mov (r1),valptr
        inc r1
        inc r1
        mov (r1),r1
        clr r3
; loop consuming digits - to getintend when non-digit found
getintl: cmpb (r1),#60
        blt getintend
        cmpb (r1),#71
        bgt getintend
; character is a decimal digit
        movb (r1)+,r0
        sub #60,r0
; r0 now holds numeric value
; 0-9 for next decimal digit
        mul #12,r3
        add r0,r3
        br getintl
; returning result
; result in r3; put it
; where it should go
getintend: mov r3,@valptr
; replace r3 with saved value
        mov (sp)+,r3
; note use of 2 args
        mov #2,r0
        br osreturn
valptr: .word 0

Services of my OS : putint
itoa system call
; putint - this is a non-recursive version
; use r2,r3 and same local variable valptr
putint: mov r2,-(sp)
        mov r3,-(sp)
        inc r1
        inc r1
        mov (r1),valptr
        inc r1
        inc r1
        mov (r1),r1
        mov @valptr,r0
        mov #10,r2
; start by filling buffer with spaces
putintfill: movb #40,(r1)+
        sob r2,putintfill
; now generate digits
        tst r0
        bgt nonzero
; simply put 0
        movb #60,-(r1)
        br putintdone
nonzero: clr r2
        mov r0,r3
putintdiv: tst r3
        beq putintdone
; do a division
        div #12,r2
; remainder in r3 is next value of next digit to go in buffer
        add #60,r3
        movb r3,-(r1)
; quotient in r2
        mov r2,r3
        clr r2
        br putintdiv
; putintdone -
; replace r3 and r2
putintdone: mov (sp)+,r3
        mov (sp)+,r2
; 2 args
        mov #2,r0
        br osreturn
Essentially same as previously illustrated – only real difference is picking up arguments.

No device interrupts?
• Example just shown used wait-loops for handling keyboard and teleprinter
• How about interrupts?
• Actually, many of the simpler single user disk operating systems for micros (and DEC’s OS-8 operating system for the simpler PDP-8 computer) didn’t bother with interrupts – using simple wait loop I/O for all devices.
• Some disk operating systems – including DEC’s systems for the PDP-11 like RSTS-11 – did utilize interrupt driven device I/O

Traps and hardware interrupts
• OK – another version
  – Using interrupt driven I/O for keyboard and teletype
• Readline and Writeline system calls will still only return to application when operations complete
  – Complete line in input buffer
  – Output string all printed

No instant echo
• Keystrokes not echoed
  – As illustrated previously, characters input at the keyboard must be echoed to the teleprinter for them to appear.
  – It’s not done in this version; I felt that it would over-complicate the code (or maybe I just felt lazy)
    • I’d have to configure the teleprinter interrupt handler to work differently when echoing keystrokes and when printing user messages
  – Or, maybe I could hack it by coding the readline function so that it turned off interrupts from the teleprinter and simply copied the input characters directly to the teleprinter buffer – it would work, it would be ugly
• So readline requests should be followed by writeline requests – to get something to appear!
• Writeline is the simple version
  – print from user’s message,
  – no attempt to copy message into a system buffer as was illustrated in the circular buffer example

Interrupts & trap : 1 : setting up interrupt vector - 1
; demo of simplified interrupt OS
; system calls
; exit = os executes halt instruction!
; read = os will read a line from keyboard
;        returning when a newline character read
; write = os will write a line to teletype
;        returning when line all written (nul character at end)
; atoi, and itoa - integer<->string conversions
; define the operating system calls as trap instructions
exit=104400
readline=104401
writeline=104402
atoi=104403
itoa=104404
; data for the trap instruction
trapaddr=34     ; "interrupt entry point" - start address of request handler
trapsw=36       ; location for status word
opsyssw=40      ; value of status word - priority 1

Interrupts & trap : 1 : setting up interrupt vector - 2
; data for the teleprinter
ttyaddr=64      ; interrupt entry point for tty - set to address of handler routine
ttysw=66        ; holds status word to load when start handling tty event
tpsw=200        ; value to put in status word - priority 4
tps=177564      ; control register for console output
tpb=177566      ; data register for console output
; data for the keyboard
kbdaddr=60      ; interrupt entry point for kbd - set to address of handler routine
kbdsw=62        ; holds status word to load when start handling kbd event
kbsw=200        ; value to put in status word
kbdc=177560     ; control register for console input
kbdb=177562     ; data register for console input

Interrupts & trap : 1 : setting up interrupt vector - 3
        .origin 1000
; initialize interrupt and trap vector
osstart: mov #os,@#trapaddr
        mov #opsyssw,@#trapsw
        mov #tput,@#ttyaddr
        mov #tpsw,@#ttysw
        mov #kget,@#kbdaddr
        mov #kbsw,@#kbdsw
; need to enable interrupts from keyboard and teletype
; set 'enable' and 'done' in tty (marks printer as "ready")
; enable only in kbd
        mov #300,@#tps
        mov #100,@#kbdc
; hopefully all is ready
; start the application
        jmp application

Interrupt handlers – keyboard and teleprinter
; handle keyboard interrupt
kget:   movb @#kbdb,ch
        movb ch,@ibufptr        ; store next input in buffer
        inc ibufptr
        cmpb #15,ch
        beq ilinedone
        rti
; ilinedone - add the nul byte,
; set flag saying input ready
ilinedone: clrb @ibufptr
        inc kbdoneflag          ; set done flag when get newline
        rti
; os variables
ibufptr: .blkw 1
kbdoneflag: .blkw 1
ch:     .blkw 1

; handle teleprinter interrupt
tput:   tstb @obufptr
        beq msgdone
; There is another character to go
        movb @obufptr,@#tpb     ; send next character
        inc obufptr
        rti
msgdone: inc printdoneflag      ; set done flag - message all sent
        rti
; os variables
obufptr: .blkw 1
printdoneflag: .blkw 1

OS code – 1
Essentially unchanged from previous
; my micro operating system
; using r0 and r1 so save these
; dispatcher code - determine which os operation is being requested,
; jump to implementation
os:     mov r0,-(sp)
        mov r1,-(sp)
; find out which request
        mov 4(sp),r1
        dec r1
        dec r1
; r1 should hold the address of the trap instruction
        mov (r1),r0
; r0 now holds the actual trap instruction that was executed
; bottom 8 bits contain request id
        bic #177400,r0
        clc                     ; just in case it's set!
        rol r0
        jmp @osfns(r0)
; handle return from os call
; when reach here r0 should hold number of arguments used
; by last os call; need to adjust return address that is on stack
; (return from OS)
osreturn: clc
        rol r0
        add r0,4(sp)
; and put back registers
        mov (sp)+,r1
        mov (sp)+,r0
        rti
; function table for my os (table of "function pointers")
osfns:  exitos
        read
        write
        getint
        putint
exitos, getint, putint – same as before

Readline and Writeline
• These functions will
  1. Set up pointers (for buffers) and flags for their interrupt driven I/O
  2. Start the operation
     1. Writeline – send first character
     2. Readline – activate
  3. Enter a loop –
     1. Test termination flag
     2. If not set, “wait”
  4. Exit that loop when flag set
     1. Writeline – set when last character has been sent
     2. Readline – set when newline character input
• So, essentially waiting in OS code for operations to complete

OS code - 2
Readline – with interrupts
; on entry r1 still holding address with trap instruction
; initializing for interrupt style input
read:   inc r1
        inc r1
; r1 now holds address that stores address of buffer
; make ibufptr hold the address of buffer
        mov (r1),ibufptr
        clr kbdoneflag
        inc @#kbdc
; now wait in OS - interrupt handled keyboard
; will eventually set the 'line done flag'
; (waiting for the line all to be in the input buffer)
kbwait: tst kbdoneflag
        bgt kblinedone
        wait
; returns from wait state after interrupt handled
; go back and re-check if line complete
        br kbwait
; finally - the line has been read; can return to user
; (now can return from OS request back to application)
kblinedone: mov #1,r0
        br osreturn

OS code - 2
Writeline – with interrupts
; initializing for interrupt style output
write:  inc r1
        inc r1
        mov (r1),obufptr
        clr printdoneflag
; send the first character
        movb @obufptr,@#tpb
        inc obufptr
; now wait in os until printdoneflag is set
; (waiting for the contents of the output buffer all to have been sent)
wrtwait: tst printdoneflag
        bgt olinedone
; nothing to do - it's kind of wait loop
        wait
        br wrtwait
; now can return from OS request back to application
olinedone: mov #1,r0
        jmp osreturn

Modified application –
application: writeline msg3     ; Enter num1 :
        readline inbuf          ; (read & echo)
        writeline inbuf
        atoi num1 inbuf         ; convert string to integer
        writeline msg4          ; Enter num2 :
        readline inbuf          ; (read & echo)
        writeline inbuf
        atoi num2 inbuf         ; convert string to integer
        mov num2,r0             ; (perform calculation)
        add num1,r0
        mov r0,sum
        writeline msg5          ; num1 + num2 =
        itoa sum numbuf         ; convert integer to string
        writeline numbuf        ; output result
        writeline newline
        exit                    ; return to OS - and thence to shell

It still works (need to echo those input lines)
It’s no longer “busy waiting” for I/O – so it’s more obvious when the CPU in effect goes to sleep while slow I/O devices complete their work.

Utilising interrupts with OS?
• Interrupt driven programs are essential for real time computing (embedded systems)
  – Must respond to external events within a very limited time
    • Bumper impact sensor fired – must inflate air-bag
    • Steel plate in roller mill has reached sensor 4 – must reverse direction of rollers
    • …
• But they don’t appear to be that beneficial to a single user OS
  – CPU still idle
    • With non-interrupt, it’s “busy wait” checking on status of I/O device
    • With interrupts, it’s “sleep, wake up, check for completion, sleep again”
  – In a single user OS, there often isn’t anything useful for the CPU to do while slow I/O devices complete data transfers

Interrupts?
(Interrupts are not essential for a single user OS – which is why some of the simplest systems didn’t use interrupt driven I/O)

Something useful for the CPU to do?
• It’s very difficult to arrange effective overlap between computation and I/O operations.
• Think about all the programs that you have written in CSCI114/CSCI124
  – Essentially
    1. Read the input data
    2. Compute
    3. Now do all the output
  – There is often no way to overlap CPU and I/O activity if working on a single task.

Multi-tasking
• As mentioned in the overview lecture, the early 1960s was the time when multi-tasking operating systems were conceived and first implemented
  1. Several application programs in memory
  2. CPU can work doing computations for one program while I/O devices are performing transfers previously requested by other programs
     • Transfers use interrupts
     • Completion states defined and represented by OS status variables
       – “line read”, “output buffer contents all written”, “disk/tape block transferred”
     • Interrupt handlers set status variables
  3. When current program requests I/O, OS sets up transfers needed, and then will usually choose to get CPU to work for another program as current program probably will not be able to make effective use of CPU until its requested I/O transfer has been completed
  4. OS code periodically (clock interrupt) checks its status variables and may switch CPU to another program

Memory in a typical multitasking OS
Multi-tasking OS (illustrating a few of the elements that will be present!)
• Each block represents thousands of consecutive memory bytes
• OS blocks
  – System stack
  – Device handlers
  – OS code for I/O management
  – OS buffers for I/O
  – OS data structures for I/O requests
  – OS data structures for processes
  – OS code for scheduling processes
  – OS code for filesystem (files & directories)
  – OS trap handling functions
• User blocks
  – User application 1: code, stack, static data & heap (User 1 – waiting for input)
  – User application 2: code, stack, static data & heap (User 2 – waiting for input)
  – User application 3: code, stack, more stack, static data & heap (User 3 – running on CPU)
  – User application 4: code, stack, static data & heap (User 4 – ready to run)
  – Blocks of memory not in use
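The four multi-tasking steps can be pictured with a toy scheduler. This is a sketch in Python under loud assumptions: `Process`, `ToyScheduler` and the simple first-come queue are illustrative inventions, not how any of the historical systems were implemented. It only shows the bookkeeping: a program that requests I/O is parked as “waiting”, an interrupt handler later marks it “ready”, and a clock tick lets the OS hand the CPU to another program.

```python
from collections import deque


class Process:
    def __init__(self, name):
        self.name = name
        self.state = "ready"            # "ready", "running" or "waiting"


class ToyScheduler:
    def __init__(self, processes):
        self.ready = deque(processes)   # programs able to use the CPU
        self.waiting = {}               # name -> program blocked on I/O
        self.current = None

    def pick_next(self):
        """Give the CPU to the next ready program (None means the CPU idles)."""
        if self.ready:
            self.current = self.ready.popleft()
            self.current.state = "running"
        else:
            self.current = None
        return self.current

    def request_io(self):
        """Current program asked for I/O: park it and run another program."""
        proc = self.current
        proc.state = "waiting"
        self.waiting[proc.name] = proc
        return self.pick_next()

    def io_complete(self, name):
        """What an interrupt handler's status update leads to: ready again."""
        proc = self.waiting.pop(name)
        proc.state = "ready"
        self.ready.append(proc)

    def clock_tick(self):
        """Clock interrupt: running program goes to the back of the queue."""
        if self.current is not None:
            self.current.state = "ready"
            self.ready.append(self.current)
        return self.pick_next()


sched = ToyScheduler([Process("user1"), Process("user2"), Process("user3")])
first = sched.pick_next()       # user1 gets the CPU
second = sched.request_io()     # user1 waits for its transfer; user2 runs
sched.io_complete("user1")      # transfer finished: user1 is ready again
third = sched.clock_tick()      # user2 is preempted; user3 runs
```

The point of the design is step 3 of the list: the CPU is never left spinning on behalf of a program that cannot proceed, as long as some other program is ready.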
(Memory diagram labels, continued: interrupt & trap vector with OS trap dispatcher; boot loader; memory mapped device registers.)

• Multi-tasking operating systems were written in assembly language in those early days
• Wikipedia identifies these as some of the early systems
  – 1960 – KDF9
  – 1961 – CTSS, the operating system for MIT’s timeshare computer; MCP from Burroughs
  – 1962 – GCOS from General Electric
  – 1963 – Titan
  – 1964 – Berkeley Time Sharing, OS-360
  – 1965 – “THE multiprogramming system” (Dijkstra); Multics
  – 1967 – TSS360 (time share on IBM 360s)
  – 1969 – TENEX
  – 1969 – start of work on Unix!

Unix and Linux
• These will form the next main topic¶
• Unix – Bell Labs
  – 1969 Experiments on the PDP-7 by Ken Thompson
    • He needed a computer offering a good interactive development environment, and found use of the time-share Multics system too clumsy and far too costly – so built his own development environment on a little used machine that happened to be in the lab.
  – 1970 Bell Labs gets a PDP-11
    • Thompson continues development of system, which is now officially going to lead to a computerised system which AT&T (the telephone company that owned Bell Labs) would in future use for patent document processing and publication
    • Thompson works on “B” language – a derivative of BCPL that the Cambridge Titan group had created to implement software like editors
    • Dennis Ritchie joins Thompson – they start work on “C”
¶There will be a small diversion first – on C programming. There are differences between C and C++. You need to be familiar with C – most software tools and operating systems are in C not C++.

Unix and C
• Nov 1971 Unix 1st edition
  – Simple multi-user OS, in assembly language, plus
    • shell that has commands like
      – mkdir, ls, ed, mv, ln, chmod, chown, rm, su
    • FORTRAN compiler
    • nroff/troff
      – Document mark up system for phototypesetter
        » You learn about markup languages (HTML, XML, SGML etc) in other subjects; nroff/troff was a markup system that allowed authors to annotate documents with directives that determined details of how they would get printed on a phototypesetter
• Feb 1973 3rd edition
  – C compiler
  – More of the now standard features of Unix shell (e.g. pipes)

Unix and C
• Nov 1973 4th edition
  – Largely re-written in C
    • Some low level code dealing with things like the interrupt vector and device registers remains in assembly language
    • All the parts dealing with filesystem and process management now in (a relatively) high level language
• May 1975 6th edition
  – Multi-user Unix development environment – much as it is today!
  – (Picture labels: user terminals, system console)

Extra hardware support
Extra hardware to support trap/emt/svc feature

But I left something out …
• The programming examples illustrating trap simply showed it as a “software interrupt” – a different way of calling functions provided by an operating system.
• But if you really want a multi-tasking operating system, you require a little more
• The “little more” requires some additional hardware

A few extra aspects - 1
• A system will need to manage memory usage by different processes
  – You don’t want a process to be able to overwrite the OS, or to read or write into memory areas that have been allocated to some other process
• A system will need exclusive control over I/O – you cannot have application programs trying to start devices
  – Actually, on an architecture like the PDP-11 where device control registers map to memory addresses, this requirement is satisfied through controls on memory;
  – On architectures with specific I/O instructions, some mechanism must restrict these to OS code

A few extra aspects - 2
• A system will probably need additional status information in the status register
  – This additional information being used to identify whether the current execution mode is ordinary user or is privileged operating system code
    • You’d like to be able to catch a pesky user program that executes “halt” – probably by having this result in a software “error” interrupt!
• A system will probably need to separate different uses of the stack
  – Sharing a single stack for an application (its possibly recursive function call sequence) and the OS (interrupts) is problematic
    • The stack is too small!
      – Cannot allocate much local space in functions
    • It’s difficult to suspend one process and let another start if they all are to share one stack

PDP 11-40 hardware support
• The PDP 11 (in its later models) did provide some hardware support for multi-process systems¶
  – Bits in the status word to distinguish OS mode and user mode execution
  – A separate system stack (for interrupts) and application stack
  – A rather crude memory management system that supported 18-bit addresses (~248Kbyte memory, and ~8kbyte of device registers); even later versions had 22-bit addresses (4Mbyte)
• More sophisticated approaches to the memory management requirements will be explored in your future studies on operating systems
  – Hardware segmentation, paging, combinations of segmentation and paging … virtual memory etc
¶Actually, the extra hardware support was not provided as standard on most models – you had to pay extra for a “Memory Management” component with these features.

PDP 11
A separate kernel stack pointer

Separate stack
• System stack 0400-0777
  – That is 0400 bytes (two hundred and fifty six bytes for those who think decimal!)
  – Not much
• Your function
        int myfun(int x, int y) {
            // local variables
            short temp[100];
            …
        }
  just consumed almost the entire stack for its local variables!
• Realistic applications will need to use a different area of memory for the application stack – one with more space

PDP 11
Information in status word

PDP 11
Kernel and user modes

PDP 11 Memory management (~PDP 11-40, but simplified)
• Computer can have up to 128k words (256k bytes) of memory
• OS allocates memory in blocks of 4k words (8kbytes)
• A user mode program can have up to 8 blocks of memory
  – In user mode, all memory is just plain old memory
    • It’s only in supervisor mode that the low memory addresses are the interrupt vector and the highest memory addresses are the device registers.
  – Compilers etc just generate addresses starting at 0 for any program

PDP 11 Memory management (~PDP 11-40, but simplified)
• User mode
  – 16 bit “logical” address space
    • 64kbyte for your code, stack, static data, heap¶
    • You sort it out (with help from your C or FORTRAN compiler, or use it yourself with the macro assembler MACRO-11)
  – Logical address mapped into 64kbytes of the total 256kbytes
    • Obviously, never mapped onto lowest or highest addresses which continue to have their hard wired uses of interrupt vector and device registers
    • Unit of allocation – 4kword (8kbyte blocks)
    • Blocks representing consecutive logical addresses not necessarily consecutive in physical memory
¶You think that 64kbytes too small for a program? But that insertion sort program, similar to the programs you write for CSCI114, only required eighty-four bytes in assembly language.

Memory management (simplified)
(Diagram: a 16-bit logical address is translated, via the memory mapping registers, into an 18-bit physical address.)

Early Unix
• OS ~40kbytes – code and data
• Time share system with ~128kword (256kbyte) memory would be able to support ~5 concurrent users
  – ed for editing (it’s still there on Linux)
  – C compiler
  – Shell with cat, ls, mkdir, rm, chmod, …
• Just the same as working on the shared banshee or wumpus servers now.
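Returning to the simplified memory management scheme described earlier (8 kbyte blocks, at most 8 blocks per user program, 16-bit logical addresses mapped into an 18-bit physical space), the translation step can be sketched in Python. The assumptions are deliberate simplifications: each of the 8 mapping registers is modelled as just a physical block number (or `None` if that logical block is not allocated), whereas the real hardware registers carry more detail; `translate` and the `mapping` list are hypothetical names and values.

```python
BLOCK_BITS = 13                         # 8 kbyte block -> 13-bit offset
OFFSET_MASK = (1 << BLOCK_BITS) - 1


def translate(logical, mapping):
    """Map a 16-bit logical address to an 18-bit physical address.

    mapping has 8 entries, one per logical block; each is a physical block
    number (0-31 for 256 kbytes of memory) or None if not allocated.
    """
    if not 0 <= logical < (1 << 16):
        raise ValueError("logical address must fit in 16 bits")
    block = logical >> BLOCK_BITS       # top 3 bits select the mapping register
    offset = logical & OFFSET_MASK      # low 13 bits pass through unchanged
    physical_block = mapping[block]
    if physical_block is None:
        raise MemoryError("logical block %d is not mapped" % block)
    return (physical_block << BLOCK_BITS) | offset


# A made-up user program with three blocks of code/data and one stack block.
# Consecutive logical blocks need not be consecutive in physical memory.
mapping = [5, 12, 3, None, None, None, None, 20]

code_address = translate(0o020100, mapping)     # logical block 1, offset 100
stack_address = translate(0o177776, mapping)    # logical block 7, top word
```

In this example `code_address` is 0o300100 and `stack_address` is 0o517776: the offset within the block is preserved and only the block number changes. An OS using such a scheme would simply never place physical block 0 (interrupt vector) or the top block (device registers) in a user program’s mapping.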