Branch instructions (1)

* Branch :           B{<cond>} label
* Branch with Link : BL{<cond>} sub_routine_label

   31    28 27   25  24  23                  0
  |  Cond  | 1 0 1 | L |       Offset        |

  Cond = Condition field
  L    = Link bit: 0 = Branch, 1 = Branch with link

* The offset for branch instructions is calculated by the assembler:
  • By taking the difference between the target address and the address of the branch instruction, minus 8 (to allow for the pipeline).
  • This gives a 26 bit offset which is right shifted 2 bits (as the bottom two bits are always zero, because instructions are word-aligned) and stored into the instruction encoding.
  • This gives a range of ± 32 Mbytes.

The ARM Instructions - 1    Embedded Systems Lab./Honam University

Branch instructions (2)

* When executing the instruction, the processor:
  • shifts the offset left two bits, sign extends it to 32 bits, and adds it to PC.
* Execution then continues from the new PC, once the pipeline has been refilled.
* The "Branch with link" instruction implements a subroutine call by writing PC-4 into the LR of the current bank.
  • i.e. the address of the next instruction following the branch with link (allowing for the pipeline).
* To return from a subroutine, simply restore the PC from the LR:
  • MOV pc, lr
  • Again, the pipeline has to refill before execution continues.
* The "Branch" instruction does not affect LR.
* Note: Architecture 4T offers a further ARM branch instruction, BX
  • See Thumb Instruction Set Module for details.

Data processing Instruction Format

  [figure: data processing instruction format]

Data processing Instructions

* Largest family of ARM instructions, all sharing the same instruction format.
* Contains:
  • Arithmetic operations
  • Comparisons (no results - just set condition codes)
  • Logical operations
  • Data movement between registers
* Remember, this is a load / store architecture
  • These instructions only work on registers, NOT memory.
* They each perform a specific operation on one or two operands.
  • First operand always a register - Rn
  • Second operand sent to the ALU via barrel shifter.
* We will examine the barrel shifter shortly.

Arithmetic Operations

* Operations are:
  • ADD   operand1 + operand2
  • ADC   operand1 + operand2 + carry
  • SUB   operand1 - operand2
  • SBC   operand1 - operand2 + carry - 1
  • RSB   operand2 - operand1
  • RSC   operand2 - operand1 + carry - 1
* Syntax:
  • <Operation>{<cond>}{S} Rd, Rn, Operand2
* Examples:
  • ADD r0, r1, r2
  • SUBGT r3, r3, #1
  • RSBLES r4, r5, #5

Comparisons

* The only effect of the comparisons is to
  • UPDATE THE CONDITION FLAGS. Thus no need to set S bit.
* Operations are:
  • CMP   operand1 - operand2, but result not written
  • CMN   operand1 + operand2, but result not written
  • TST   operand1 AND operand2, but result not written
  • TEQ   operand1 EOR operand2, but result not written
* Syntax:
  • <Operation>{<cond>} Rn, Operand2
* Examples:
  • CMP r0, r1
  • TSTEQ r2, #5

Logical Operations

* Operations are:
  • AND   operand1 AND operand2
  • EOR   operand1 EOR operand2
  • ORR   operand1 OR operand2
  • BIC   operand1 AND NOT operand2 [i.e. bit clear]
* Syntax:
  • <Operation>{<cond>}{S} Rd, Rn, Operand2
* Examples:
  • AND r0, r1, r2
  • BICEQ r2, r3, #7
  • EORS r1, r3, r0

Data Movement

* Operations are:
  • MOV   operand2
  • MVN   NOT operand2
  Note that these make no use of operand1.
* Syntax:
  • <Operation>{<cond>}{S} Rd, Operand2
* Examples:
  • MOV r0, r1
  • MOVS r2, #10
  • MVNEQ r1, #0

Conditional Execution

* Most instruction sets only allow branches to be executed conditionally.
* However, by reusing the condition evaluation hardware, ARM effectively increases the number of instructions.
  • All instructions contain a condition field which determines whether the CPU will execute them.
  • Non-executed instructions soak up 1 cycle.
    – Still have to complete the cycle so as to allow fetching and decoding of following instructions.
* This removes the need for many branches, which stall the pipeline (3 cycles to refill).
  • Allows very dense in-line code, without branches.
  • The time penalty of not executing several conditional instructions is frequently less than the overhead of the branch or subroutine call that would otherwise be needed.

The Condition Field

  Bits 31-28 of every instruction hold the condition field (Cond):

  0000 = EQ      - Z set (equal)
  0001 = NE      - Z clear (not equal)
  0010 = HS / CS - C set (unsigned higher or same)
  0011 = LO / CC - C clear (unsigned lower)
  0100 = MI      - N set (negative)
  0101 = PL      - N clear (positive or zero)
  0110 = VS      - V set (overflow)
  0111 = VC      - V clear (no overflow)
  1000 = HI      - C set and Z clear (unsigned higher)
  1001 = LS      - C clear or Z set (unsigned lower or same)
  1010 = GE      - N set and V set, or N clear and V clear (> or =)
  1011 = LT      - N set and V clear, or N clear and V set (<)
  1100 = GT      - Z clear, and either N set and V set, or N clear and V clear (>)
  1101 = LE      - Z set, or N set and V clear, or N clear and V set (< or =)
  1110 = AL      - always
  1111 = NV      - reserved.

Using and updating the Condition Field

* To execute an instruction conditionally, simply postfix it with the appropriate condition:
  • For example an add instruction takes the form:
    – ADD r0,r1,r2     ; r0 = r1 + r2 (ADDAL)
  • To execute this only if the zero flag is set:
    – ADDEQ r0,r1,r2   ; If zero flag set then…
                       ; ... r0 = r1 + r2
* By default, data processing operations do not affect the condition flags (apart from the comparisons where this is the only effect).
  To cause the condition flags to be updated, the S bit of the instruction needs to be set by postfixing the instruction (and any condition code) with an "S".
  • For example to add two numbers and set the condition flags:
    – ADDS r0,r1,r2    ; r0 = r1 + r2
                       ; ... and set flags

Conditional Execution cont.

* Compare the condition flags in the CPSR with the condition field of the current instruction.
  • If the condition matches, the current instruction is executed; otherwise it is skipped (treated as a no-op).

Conditional Execution cont.

* Reducing the number of branches
  • MOVS  r0, r1, LSR #1   ; C (flag) := r1[0]
  • MOVCC r0, #10          ; if C=0, then r0 := 10
  • MOVCS r0, #11          ; if C=1, then r0 := 11

  • MOVS  r0, r4           ; if r4==0 then r0 := 0
  • MOVNE r0, #1           ; else r0 := 1

ARM instruction set

* ARM versions.
* ARM assembly language.
* ARM programming model.
* ARM memory organization.
* ARM data operations.
* ARM flow of control.

ARM versions

* ARM architecture has been extended over several versions.
* We will concentrate on ARM7.

ARM assembly language

* Fairly standard assembly language:

          LDR r0,[r8]     ; a comment
  label   ADD r4,r0,r1

ARM programming model

  r0   r1   r2   r3   r4   r5   r6   r7
  r8   r9   r10  r11  r12  r13  r14  r15 (PC)

  CPSR (bits 31..0): N Z C V flags in the top bits

Endianness

* Relationship between bit and byte/word ordering defines endianness:

  little-endian:  bit 31 | byte 3 | byte 2 | byte 1 | byte 0 | bit 0
  big-endian:     bit 0  | byte 0 | byte 1 | byte 2 | byte 3 | bit 31

ARM data types

* Word is 32 bits long.
* Word can be divided into four 8-bit bytes.
* ARM addresses can be 32 bits long.
* Address refers to byte.
  • Address 4 starts at byte 4.
* Can be configured at power-up as either little- or big-endian mode.

ARM status bits

* Every arithmetic, logical, or shifting operation can set CPSR bits (when the S suffix is used):
  • N (negative), Z (zero), C (carry), V (overflow).
* Examples:
  • -1 + 1 = 0: NZCV = 0110.
  • 2^31 - 1 + 1 = -2^31: NZCV = 1001.

ARM data instructions

* Basic format:
    ADD r0,r1,r2
  • Computes r1+r2, stores in r0.
* Immediate operand:
    ADD r0,r1,#2
  • Computes r1+2, stores in r0.

ARM data instructions

* ADD, ADC : add (w. carry)
* SUB, SBC : subtract (w. carry)
* RSB, RSC : reverse subtract (w. carry)
* MUL, MLA : multiply (and accumulate)
* AND, ORR, EOR
* BIC : bit clear
* LSL, LSR : logical shift left/right
* ASL, ASR : arithmetic shift left/right
* ROR : rotate right
* RRX : rotate right extended with C

Data operation varieties

* Logical shift:
  • fills with zeroes.
* Arithmetic shift:
  • a right shift fills with copies of the sign bit (ones for a negative value, zeroes otherwise).
* RRX performs 33-bit rotate, including C bit from CPSR above sign bit.

ARM comparison instructions

* CMP : compare
* CMN : negated compare
* TST : bit-wise test (AND)
* TEQ : bit-wise test for equality (EOR)
* These instructions set only the NZCV bits of CPSR.
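
The status-bit examples and the comparison instructions above can be checked with a small model. This is a minimal Python sketch, not ARM code: the function names `add_flags` and `cmp_flags` are hypothetical, and the model assumes the usual ARM rule that C after a subtraction means "no borrow occurred".

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def _nzcv(n, z, c, v):
    """Pack the four flags into a string such as '0110' (order N, Z, C, V)."""
    return f"{n}{z}{c}{v}"


def add_flags(a, b):
    """Model ADDS: return (32-bit result, NZCV string) for a + b."""
    a &= MASK
    b &= MASK
    full = a + b                       # unbounded Python integer
    res = full & MASK                  # what fits in the register
    n = res >> 31                      # sign bit of the result
    z = int(res == 0)
    c = int(full > MASK)               # carry out of bit 31
    v = ((a ^ res) & (b ^ res)) >> 31  # both inputs differ in sign from result
    return res, _nzcv(n, z, c, v)


def cmp_flags(a, b):
    """Model CMP: compute a - b and keep only the NZCV string."""
    a &= MASK
    b &= MASK
    res = (a - b) & MASK
    n = res >> 31
    z = int(res == 0)
    c = int(a >= b)                    # C set means no borrow (unsigned a >= b)
    v = ((a ^ b) & (a ^ res)) >> 31    # signed overflow on subtraction
    return _nzcv(n, z, c, v)


if __name__ == "__main__":
    print(add_flags(0xFFFFFFFF, 1))    # -1 + 1
    print(add_flags(0x7FFFFFFF, 1))    # 2^31 - 1 + 1
    print(cmp_flags(5, 5))             # equal operands
```

In this model -1 + 1 yields NZCV = 0110, while 2^31 - 1 + 1 yields 1001: the result is negative (N) and the signed range has overflowed (V), with no carry out of bit 31.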
ARM move instructions

* MOV, MVN : move (negated)

    MOV r0, r1   ; sets r0 to r1

ARM load/store instructions

* LDR, LDRH, LDRB : load (half-word, byte)
* STR, STRH, STRB : store (half-word, byte)
* Addressing modes:
  • register indirect : LDR r0,[r1]
  • with second register : LDR r0,[r1,-r2]
  • with constant : LDR r0,[r1,#4]

ARM ADR pseudo-op

* Cannot refer to an address directly in an instruction.
* Generate value by performing arithmetic on PC.
* ADR pseudo-op generates instruction required to calculate address:

    ADR r1,FOO

Example: C assignments

* C:
    x = (a + b) - c;
* Assembler:

    ADR r4,a       ; get address for a
    LDR r0,[r4]    ; get value of a
    ADR r4,b       ; get address for b, reusing r4
    LDR r1,[r4]    ; get value of b
    ADD r3,r0,r1   ; compute a+b
    ADR r4,c       ; get address for c
    LDR r2,[r4]    ; get value of c

C assignment, cont'd.

    SUB r3,r3,r2   ; complete computation of x
    ADR r4,x       ; get address for x
    STR r3,[r4]    ; store value of x

Example: C assignment

* C:
    y = a*(b+c);
* Assembler:

    ADR r4,b       ; get address for b
    LDR r0,[r4]    ; get value of b
    ADR r4,c       ; get address for c
    LDR r1,[r4]    ; get value of c
    ADD r2,r0,r1   ; compute partial result
    ADR r4,a       ; get address for a
    LDR r0,[r4]    ; get value of a

C assignment, cont'd.
    MUL r2,r0,r2   ; compute final value for y (on ARM7, Rd must differ from Rm)
    ADR r4,y       ; get address for y
    STR r2,[r4]    ; store y

Example: C assignment

* C:
    z = (a << 2) | (b & 15);
* Assembler:

    ADR r4,a          ; get address for a
    LDR r0,[r4]       ; get value of a
    MOV r0,r0,LSL #2  ; perform shift
    ADR r4,b          ; get address for b
    LDR r1,[r4]       ; get value of b
    AND r1,r1,#15     ; perform AND
    ORR r1,r0,r1      ; perform OR

C assignment, cont'd.

    ADR r4,z          ; get address for z
    STR r1,[r4]       ; store value for z

Additional addressing modes

* Base-plus-offset addressing:
    LDR r0,[r1,#16]
  • Loads from location r1+16
* Auto-indexing increments base register:
    LDR r0,[r1,#16]!
  • Loads from location r1+16, then writes r1+16 back to r1.
* Post-indexing fetches, then does offset:
    LDR r0,[r1],#16
  • Loads r0 from r1, then adds 16 to r1.

ARM flow of control

* All operations can be performed conditionally, testing CPSR:
  • EQ, NE, CS, CC, MI, PL, VS, VC, HI, LS, GE, LT, GT, LE, AL, NV
* Branch operation:
    B #100
  • Can be performed conditionally.

Example: if statement

* C:
    if (a > b) { x = 5; y = c + d; } else x = c - d;
* Assembler:

    ; compute and test condition
    ADR r4,a       ; get address for a
    LDR r0,[r4]    ; get value of a
    ADR r4,b       ; get address for b
    LDR r1,[r4]    ; get value of b
    CMP r0,r1      ; compare a with b
    BLE fblock     ; if a <= b, branch to false block

If statement, cont'd.
    ; true block
    MOV r0,#5      ; generate value for x
    ADR r4,x       ; get address for x
    STR r0,[r4]    ; store x
    ADR r4,c       ; get address for c
    LDR r0,[r4]    ; get value of c
    ADR r4,d       ; get address for d
    LDR r1,[r4]    ; get value of d
    ADD r0,r0,r1   ; compute y
    ADR r4,y       ; get address for y
    STR r0,[r4]    ; store y
    B after        ; branch around false block

If statement, cont'd.

    ; false block
    fblock ADR r4,c       ; get address for c
           LDR r0,[r4]    ; get value of c
           ADR r4,d       ; get address for d
           LDR r1,[r4]    ; get value of d
           SUB r0,r0,r1   ; compute c-d
           ADR r4,x       ; get address for x
           STR r0,[r4]    ; store value of x
    after  ...

Conditional Instruction Use

    ; Compute and test the condition
    ADR r4,a       ; get address for a
    LDR r0,[r4]    ; get value of a
    ADR r4,b       ; get address for b
    LDR r1,[r4]    ; get value of b
    CMP r0,r1      ; compare a with b
    ; Notice we don't need a branch here

Example: Conditional instruction implementation

    ; true block (executes only if a > b)
    MOVGT r0,#5      ; generate value for x
    ADRGT r4,x       ; get address for x
    STRGT r0,[r4]    ; store x
    ADRGT r4,c       ; get address for c
    LDRGT r0,[r4]    ; get value of c
    ADRGT r4,d       ; get address for d
    LDRGT r1,[r4]    ; get value of d
    ADDGT r0,r0,r1   ; compute y
    ADRGT r4,y       ; get address for y
    STRGT r0,[r4]    ; store y

Conditional instruction implementation, cont'd.

    ; false block (executes only if a <= b)
    ADRLE r4,c       ; get address for c
    LDRLE r0,[r4]    ; get value of c
    ADRLE r4,d       ; get address for d
    LDRLE r1,[r4]    ; get value of d
    SUBLE r0,r0,r1   ; compute c-d
    ADRLE r4,x       ; get address for x
    STRLE r0,[r4]    ; store value of x

Example: switch statement

* C:
    switch (test) { case 0: … break; case 1: … }
* Assembler:

    ADR r2,test             ; get address for test
    LDR r0,[r2]             ; load value for test
    ADR r1,SwitchTable      ; load address for switch table
    LDR r1,[r1,r0,LSL #2]   ; index switch table

    SwitchTable DCD case0
                DCD case1
    ; DCD directive instructs the assembler to reserve a word of
    ; store and initialize it to the right side value
    ...

Example: FIR filter

* C:
    for (i=0, f=0; i<N; i++) f = f + c[i]*x[i];
* Assembler:

    ; loop initiation code
    MOV r0,#0      ; use r0 for i
    MOV r8,#0      ; use separate index for arrays
    ADR r2,N       ; get address for N
    LDR r1,[r2]    ; get value of N
    MOV r2,#0      ; use r2 for f

FIR filter, cont'd.

    ADR r3,c       ; load r3 with base of c
    ADR r5,x       ; load r5 with base of x
    ; loop body
    loop LDR r4,[r3,r8]   ; get c[i]
         LDR r6,[r5,r8]   ; get x[i]
         MUL r4,r6,r4     ; compute c[i]*x[i] (on ARM7, Rd must differ from Rm)
         ADD r2,r2,r4     ; add into running sum
         ADD r8,r8,#4     ; add one word offset to array index
         ADD r0,r0,#1     ; add 1 to i
         CMP r0,r1        ; exit?
         BLT loop         ; if i < N, continue

ARM subroutine linkage

* Branch and link instruction:
    BL foo
  • Copies the return address (the instruction after the BL) to r14.
* To return from subroutine:
    MOV r15,r14
* Build a stack for nested calls and to pass parameters.
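
The BL linkage above rests on the branch encoding from the Branch instructions slides: a 24-bit word offset relative to the branch address plus 8, with the return address (branch address plus 4) written to r14 when the link bit is set. The following is a minimal Python sketch of that arithmetic. The function names `encode_branch` and `execute_branch` are hypothetical, and the model covers only the classic 32-bit ARM B/BL encoding described in this deck.

```python
def encode_branch(addr, target, link=False, cond=0b1110):
    """Build the 32-bit B/BL instruction word placed at `addr`.

    cond defaults to 1110 (AL, always). The offset is measured from
    addr + 8 because the PC runs two instructions ahead in the pipeline.
    """
    offset = target - (addr + 8)
    if offset % 4:
        raise ValueError("target must be word-aligned relative to the branch")
    if not -(1 << 25) <= offset < (1 << 25):
        raise ValueError("target is outside the +/- 32 Mbyte branch range")
    imm24 = (offset >> 2) & 0xFFFFFF      # drop the two zero bits, keep 24
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | imm24


def execute_branch(addr, word):
    """Return (new PC, new LR or None) for a B/BL word fetched from `addr`."""
    imm24 = word & 0xFFFFFF
    if imm24 & 0x800000:                  # sign-extend the 24-bit field
        imm24 -= 1 << 24
    new_pc = (addr + 8 + (imm24 << 2)) & 0xFFFFFFFF
    is_link = (word >> 24) & 1
    new_lr = (addr + 4) & 0xFFFFFFFF if is_link else None
    return new_pc, new_lr


if __name__ == "__main__":
    word = encode_branch(0x1000, 0x2000, link=True)   # BL foo
    print(hex(word), [hex(x) for x in execute_branch(0x1000, word)])
```

A plain B leaves the link register untouched, which the sketch represents by returning `None` for the new LR value.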
Nested subroutine calls

* C:
    void f1 (int a) { f2(a); }
* Nesting/recursion requires coding convention (a full-descending stack with r13 as stack pointer is assumed here):

    f1  LDR r0,[r13]         ; load arg into r0 from stack
        ; call f2()
        STR r14,[r13,#-4]!   ; push f1's return address
        STR r0,[r13,#-4]!    ; push arg to f2 on stack
        BL f2                ; branch and link to f2
        ; return from f1()
        ADD r13,r13,#4       ; pop f2's arg off stack
        LDR r15,[r13],#4     ; pop return address into PC and return

Summary

* Load/store architecture
* Most instructions are RISCy, operate in single cycle.
  • Some multi-register operations take longer.
* All instructions can be executed conditionally.
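
The last summary point, conditional execution, comes down to a fixed predicate over the N, Z, C and V flags for each condition code. The following is a minimal Python sketch of those predicates. The names `CONDITIONS` and `condition_passed` are hypothetical, and the reserved NV code is deliberately left out.

```python
# Each entry maps a condition mnemonic to a predicate over the flags.
CONDITIONS = {
    "EQ": lambda n, z, c, v: z,                  # equal
    "NE": lambda n, z, c, v: not z,              # not equal
    "CS": lambda n, z, c, v: c,                  # unsigned higher or same (HS)
    "CC": lambda n, z, c, v: not c,              # unsigned lower (LO)
    "MI": lambda n, z, c, v: n,                  # negative
    "PL": lambda n, z, c, v: not n,              # positive or zero
    "VS": lambda n, z, c, v: v,                  # overflow
    "VC": lambda n, z, c, v: not v,              # no overflow
    "HI": lambda n, z, c, v: c and not z,        # unsigned higher
    "LS": lambda n, z, c, v: (not c) or z,       # unsigned lower or same
    "GE": lambda n, z, c, v: n == v,             # signed >=
    "LT": lambda n, z, c, v: n != v,             # signed <
    "GT": lambda n, z, c, v: (not z) and n == v, # signed >
    "LE": lambda n, z, c, v: z or n != v,        # signed <=
    "AL": lambda n, z, c, v: True,               # always
}
CONDITIONS["HS"] = CONDITIONS["CS"]  # alternative mnemonics
CONDITIONS["LO"] = CONDITIONS["CC"]


def condition_passed(cond, n, z, c, v):
    """Return True if an instruction with suffix `cond` would execute."""
    predicate = CONDITIONS[cond.upper()]
    return bool(predicate(bool(n), bool(z), bool(c), bool(v)))


if __name__ == "__main__":
    # Flags after CMP 7, 3 are N=0 Z=0 C=1 V=0: the GT block runs, LE does not.
    flags = dict(n=0, z=0, c=1, v=0)
    for cond in ("GT", "LE", "HI", "EQ"):
        print(cond, condition_passed(cond, **flags))
```

Because GT and LE are exact complements, a pair of conditionally executed blocks guarded by them behaves like an if/else without any branch.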