The ARM Instruction Set - Embedded Systems Laboratory

Branch instructions (1)
* Branch : B{<cond>} label
* Branch with Link : BL{<cond>} sub_routine_label
* Instruction encoding:
• Bits 31-28 : Cond (condition field)
• Bits 27-25 : 1 0 1
• Bit 24 : L (link bit): 0 = Branch, 1 = Branch with link
• Bits 23-0 : Offset
* The offset for branch instructions is calculated by the assembler:
• By taking the difference between the target address and the address of
the branch instruction, minus 8 (to allow for the pipeline).
• This gives a 26 bit offset which is right shifted 2 bits (as the
bottom two bits are always zero because instructions are word-aligned)
and stored in the 24-bit Offset field of the instruction encoding.
• This gives a range of ± 32 Mbytes.
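The offset calculation above can be sketched in Python. This is a minimal model, not an assembler: the function names `encode_branch` and `decode_branch_target` are hypothetical, and the condition field defaults to 1110 (AL).

```python
def encode_branch(instr_addr, target_addr, link=False, cond=0b1110):
    """Build the 32-bit word for B (link=False) or BL (link=True)."""
    offset = target_addr - (instr_addr + 8)    # PC reads 8 bytes ahead (pipeline)
    if offset % 4 != 0:
        raise ValueError("target must be word-aligned")
    if not -(1 << 25) <= offset < (1 << 25):
        raise ValueError("target outside the +/- 32 Mbyte range")
    imm24 = (offset >> 2) & 0xFFFFFF           # drop the two always-zero bits
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | imm24


def decode_branch_target(instr_addr, word):
    """Recover the target address from an encoded B/BL word."""
    imm24 = word & 0xFFFFFF
    if imm24 & 0x800000:                       # sign-extend the 24-bit field
        imm24 -= 1 << 24
    return (instr_addr + 8 + (imm24 << 2)) & 0xFFFFFFFF


print(hex(encode_branch(0x8000, 0x8010)))      # forward branch over two words
```

A branch to its own address encodes an offset of -8 bytes, i.e. -2 words, which is why the Offset field of such an instruction is 0xFFFFFE.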
The ARM Instructions - 1
Embedded Systems Lab./Honam University
Branch instructions (2)
* When executing the instruction, the processor:
• shifts the offset left two bits, sign extends it to 32 bits, and adds it to PC.
* Execution then continues from the new PC, once the pipeline has been
refilled.
* The "Branch with link" instruction implements a subroutine call by writing
PC-4 into the LR of the current bank.
• i.e. the address of the next instruction following the branch with link (allowing
for the pipeline).
* To return from the subroutine, simply restore the PC from the LR:
• MOV pc, lr
• Again, pipeline has to refill before execution continues.
* The "Branch" instruction does not affect LR.
* Note: Architecture 4T offers a further ARM branch instruction, BX
• See Thumb Instruction Set Module for details.
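As a rough illustration of the execution steps on this slide, the following Python sketch models what B and BL do to the PC and LR. The function name `execute_branch` is an assumption for illustration; the sketch ignores the condition field and register banking.

```python
def execute_branch(instr_addr, word, lr):
    """Return (new_pc, new_lr) after executing the B/BL word at instr_addr."""
    pc = instr_addr + 8                  # PC value seen by the instruction
    imm24 = word & 0xFFFFFF
    if imm24 & 0x800000:                 # sign-extend the stored offset
        imm24 -= 1 << 24
    if word & (1 << 24):                 # L bit set: Branch with Link
        lr = pc - 4                      # address of the instruction after the BL
    new_pc = (pc + (imm24 << 2)) & 0xFFFFFFFF
    return new_pc, lr


# BL at 0x8000 to 0x8010: LR receives 0x8004, the return address.
new_pc, new_lr = execute_branch(0x8000, 0xEB000002, lr=0)
print(hex(new_pc), hex(new_lr))
```

Returning with MOV pc, lr then amounts to copying `new_lr` back into the PC.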
Data processing Instruction Format
Data processing Instructions
* Largest family of ARM instructions, all sharing the same instruction format.
* Contains:
• Arithmetic operations
• Comparisons (no results - just set condition codes)
• Logical operations
• Data movement between registers
* Remember, this is a load / store architecture
• These instructions work only on registers, NOT memory.
* They each perform a specific operation on one or two operands.
• First operand always a register - Rn
• Second operand sent to the ALU via barrel shifter.
* We will examine the barrel shifter shortly.
Arithmetic Operations
* Operations are:
• ADD : operand1 + operand2
• ADC : operand1 + operand2 + carry
• SUB : operand1 - operand2
• SBC : operand1 - operand2 + carry - 1
• RSB : operand2 - operand1
• RSC : operand2 - operand1 + carry - 1
* Syntax:
• <Operation>{<cond>}{S} Rd, Rn, Operand2
* Examples
• ADD r0, r1, r2
• SUBGT r3, r3, #1
• RSBLES r4, r5, #5
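The six operations can be modelled in a few lines of Python. This is a sketch of the arithmetic only: the helper name `arm_arith` is hypothetical, operands are treated as 32-bit values, and flag updates (the S suffix) are not modelled.

```python
MASK = 0xFFFFFFFF  # results wrap to 32 bits


def arm_arith(op, operand1, operand2, carry=0):
    """Return the 32-bit result of an ARM arithmetic operation."""
    results = {
        "ADD": operand1 + operand2,
        "ADC": operand1 + operand2 + carry,
        "SUB": operand1 - operand2,
        "SBC": operand1 - operand2 + carry - 1,
        "RSB": operand2 - operand1,
        "RSC": operand2 - operand1 + carry - 1,
    }
    return results[op] & MASK


# RSB r4, r5, #5 with r5 = 3 computes 5 - 3.
print(arm_arith("RSB", 3, 5))
```

Note how SBC and RSC subtract one extra when the carry flag is clear, which is how a borrow propagates in multi-word subtraction.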
Comparisons
* The only effect of the comparisons is to update the condition flags,
so there is no need to set the S bit.
* Operations are:
• CMP : operand1 - operand2, but result not written
• CMN : operand1 + operand2, but result not written
• TST : operand1 AND operand2, but result not written
• TEQ : operand1 EOR operand2, but result not written
* Syntax:
• <Operation>{<cond>} Rn, Operand2
* Examples:
• CMP r0, r1
• TSTEQ r2, #5
Logical Operations
* Operations are:
• AND : operand1 AND operand2
• EOR : operand1 EOR operand2
• ORR : operand1 OR operand2
• BIC : operand1 AND NOT operand2 [i.e. bit clear]
* Syntax:
• <Operation>{<cond>}{S} Rd, Rn, Operand2
* Examples:
• AND r0, r1, r2
• BICEQ r2, r3, #7
• EORS r1, r3, r0
Data Movement
* Operations are:
• MOV : operand2
• MVN : NOT operand2
Note that these make no use of operand1.
* Syntax:
• <Operation>{<cond>}{S} Rd, Operand2
* Examples:
• MOV r0, r1
• MOVS r2, #10
• MVNEQ r1, #0
Conditional Execution
* Most instruction sets only allow branches to be executed conditionally.
* However, by reusing the condition evaluation hardware, ARM effectively
increases the number of instructions.
• All instructions contain a condition field which determines whether the CPU
will execute them.
• Non-executed instructions soak up 1 cycle.
– The cycle still has to complete so that the following instructions can be
fetched and decoded.
* This removes the need for many branches, which stall the pipeline (3 cycles to
refill).
• Allows very dense in-line code, without branches.
• The time penalty of not executing several conditional instructions is
frequently less than the overhead of the branch or subroutine call that
would otherwise be needed.
The Condition Field
* Cond occupies bits 31-28 of the instruction.
0000 = EQ - Z set (equal)
0001 = NE - Z clear (not equal)
0010 = HS / CS - C set (unsigned higher or same)
0011 = LO / CC - C clear (unsigned lower)
0100 = MI - N set (negative)
0101 = PL - N clear (positive or zero)
0110 = VS - V set (overflow)
0111 = VC - V clear (no overflow)
1000 = HI - C set and Z clear (unsigned higher)
1001 = LS - C clear or Z set (unsigned lower or same)
1010 = GE - N set and V set, or N clear and V clear (> or =)
1011 = LT - N set and V clear, or N clear and V set (<)
1100 = GT - Z clear, and either N set and V set, or N clear and V clear (>)
1101 = LE - Z set, or N set and V clear, or N clear and V set (< or =)
1110 = AL - always
1111 = NV - reserved.
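One way to read the condition table is as a function of the four flags. The Python sketch below is illustrative only (the name `condition_passed` is an assumption); the signed conditions use the standard rule that GE holds when N equals V and LT when they differ.

```python
def condition_passed(cond, n, z, c, v):
    """Evaluate a 4-bit condition field against the N, Z, C, V flags."""
    tests = {
        0b0000: z,                    # EQ
        0b0001: not z,                # NE
        0b0010: c,                    # HS / CS
        0b0011: not c,                # LO / CC
        0b0100: n,                    # MI
        0b0101: not n,                # PL
        0b0110: v,                    # VS
        0b0111: not v,                # VC
        0b1000: c and not z,          # HI
        0b1001: (not c) or z,         # LS
        0b1010: n == v,               # GE
        0b1011: n != v,               # LT
        0b1100: (not z) and n == v,   # GT
        0b1101: z or n != v,          # LE
        0b1110: True,                 # AL
    }
    if cond not in tests:
        raise ValueError("reserved or invalid condition field")
    return bool(tests[cond])


# After CMP of two equal values Z is set, so EQ passes and GT fails.
print(condition_passed(0b0000, n=0, z=1, c=1, v=0),
      condition_passed(0b1100, n=0, z=1, c=1, v=0))
```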
Using and updating the Condition Field
* To execute an instruction conditionally, simply postfix it with the appropriate
condition:
• For example an add instruction takes the form:
– ADD r0,r1,r2 ; r0 = r1 + r2 (ADDAL)
• To execute this only if the zero flag is set:
– ADDEQ r0,r1,r2 ; If zero flag set then r0 = r1 + r2
* By default, data processing operations do not affect the condition flags (apart
from the comparisons where this is the only effect). To cause the condition
flags to be updated, the S bit of the instruction needs to be set by postfixing
the instruction (and any condition code) with an “S”.
• For example to add two numbers and set the condition flags:
– ADDS r0,r1,r2 ; r0 = r1 + r2 and set flags
Conditional Execution, cont’d.
* The processor checks the condition flags in the CPSR against the condition
field of the current instruction.
• If the condition is satisfied, the current instruction is executed; otherwise,
it is skipped.
Conditional Execution, cont’d.
* Reducing the number of branches
• MOVS r0, r1, LSR #1 ; C(flag) := r1[0]
• MOVCC r0, #10 ; if C=0, then r0 := 10
• MOVCS r0, #11 ; if C=1, then r0 := 11

• MOVS r0, r4 ; if r4==0 then r0 := 0
• MOVNE r0, #1 ; else r0 := 1
ARM instruction set
* ARM versions.
* ARM assembly language.
* ARM programming model.
* ARM memory organization.
* ARM data operations.
* ARM flow of control.
ARM versions
* ARM architecture has been extended over several versions.
* We will concentrate on ARM7.
ARM assembly language
* Fairly standard assembly language:
LDR r0,[r8] ; a comment
label ADD r4,r0,r1
ARM programming model
* Registers r0 to r14 and r15 (PC), each 32 bits wide (bit 31 down to bit 0).
* CPSR holds the N, Z, C, V flags.
Endianness
* Relationship between bit and byte/word ordering defines endianness:
• little-endian: byte 3 | byte 2 | byte 1 | byte 0 (bit 31 on the left, bit 0 on the right)
• big-endian: byte 0 | byte 1 | byte 2 | byte 3 (bit 0 on the left, bit 31 on the right)
ARM data types
* Word is 32 bits long.
* Word can be divided into four 8-bit bytes.
* ARM addresses can be 32 bits long.
* Address refers to byte.
• Address 4 starts at byte 4.
* Can be configured at power-up as either little- or big-endian mode.
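To make the two byte orders concrete, here is a small Python sketch. It assumes the usual definition, in which a little-endian word stores its least-significant byte at the lowest address (byte 0) and a big-endian word stores its most-significant byte there; the example value is arbitrary.

```python
word = 0x11223344

# Index 0 of each bytes object corresponds to the lowest byte address.
little = word.to_bytes(4, "little")   # least-significant byte first
big = word.to_bytes(4, "big")         # most-significant byte first

print(little.hex(), big.hex())

# Reading the same four bytes back with the wrong order gives a different word.
misread = int.from_bytes(little, "big")
print(hex(misread))
```

This is why data shared between systems configured for different modes must agree on a byte order.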
ARM status bits
* Every arithmetic, logical, or shifting operation can set CPSR bits (when the
S suffix is used):
• N (negative), Z (zero), C (carry), V (overflow).
* Examples:
• -1 + 1 = 0: NZCV = 0110.
• 2^31 - 1 + 1 = -2^31: NZCV = 1001.
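The flag values for an addition can be worked out mechanically. The sketch below is a hypothetical helper (`adds_flags`), assuming 32-bit two's-complement operands; it can be used to check the two examples above.

```python
def adds_flags(a, b):
    """Return (result, 'NZCV' string) for a 32-bit flag-setting add."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    total = a + b
    result = total & 0xFFFFFFFF
    n = result >> 31                           # sign bit of the result
    z = int(result == 0)
    c = int(total > 0xFFFFFFFF)                # unsigned carry out of bit 31
    v = ((a ^ result) & (b ^ result)) >> 31    # operands agree in sign, result differs
    return result, f"{n}{z}{c}{v}"


print(adds_flags(-1, 1))
print(adds_flags(2**31 - 1, 1))
```

Carry and overflow are independent: the first case produces a carry without overflow, the second an overflow without a carry.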
ARM data instructions
* Basic format:
ADD r0,r1,r2
• Computes r1+r2, stores in r0.
* Immediate operand:
ADD r0,r1,#2
• Computes r1+2, stores in r0.
ARM data instructions
* ADD, ADC : add (w. carry)
* SUB, SBC : subtract (w. carry)
* RSB, RSC : reverse subtract (w. carry)
* MUL, MLA : multiply (and accumulate)
* AND, ORR, EOR
* BIC : bit clear
* LSL, LSR : logical shift left/right
* ASL, ASR : arithmetic shift left/right
* ROR : rotate right
* RRX : rotate right extended with C
Data operation varieties
* Logical shift:
• fills with zeroes.
* Arithmetic shift:
• right shift fills with copies of the sign bit (ones only for a negative
value); left shift fills with zeroes.
* RRX performs 33-bit rotate, including C bit from CPSR above sign bit.
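The shift and rotate varieties can be sketched in Python on 32-bit values. The function names are illustrative, shift amounts are assumed to lie in the range 0 to 31, and the arithmetic right shift copies the sign bit, so it fills with ones only when the value is negative.

```python
MASK = 0xFFFFFFFF


def lsl(x, n):
    return (x << n) & MASK                     # zeroes enter at the bottom


def lsr(x, n):
    return (x & MASK) >> n                     # zeroes enter at the top


def asr(x, n):
    x &= MASK
    if x & 0x80000000:                         # negative: replicate the sign bit
        return ((x >> n) | (MASK << (32 - n))) & MASK
    return x >> n


def ror(x, n):
    x &= MASK
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK


def rrx(x, carry):
    """33-bit rotate right by one: C enters bit 31, bit 0 becomes the new C."""
    x &= MASK
    return (carry << 31) | (x >> 1), x & 1


print(hex(asr(0x80000000, 4)), hex(lsr(0x80000000, 4)))
```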
ARM comparison instructions
* CMP : compare
* CMN : negated compare
* TST : bit-wise test (AND)
* TEQ : bit-wise test for equality (EOR)
* These instructions set only the NZCV bits of CPSR.
ARM move instructions
* MOV, MVN : move (negated)
MOV r0, r1 ; sets r0 to r1
ARM load/store instructions
* LDR, LDRH, LDRB : load (half-word, byte)
* STR, STRH, STRB : store (half-word, byte)
* Addressing modes:
• register indirect : LDR r0,[r1]
• with second register : LDR r0,[r1,-r2]
• with constant : LDR r0,[r1,#4]
ARM ADR pseudo-op
* Cannot refer to an address directly in an instruction.
* Generate value by performing arithmetic on PC.
* ADR pseudo-op generates instruction required to calculate address:
ADR r1,FOO
Example: C assignments
* C:
x = (a + b) - c;
* Assembler:
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
ADR r4,b ; get address for b, reusing r4
LDR r1,[r4] ; get value of b
ADD r3,r0,r1 ; compute a+b
ADR r4,c ; get address for c
LDR r2,[r4] ; get value of c
C assignment, cont’d.
SUB r3,r3,r2 ; complete computation of x
ADR r4,x ; get address for x
STR r3,[r4] ; store value of x
Example: C assignment
* C:
y = a*(b+c);
* Assembler:
ADR r4,b ; get address for b
LDR r0,[r4] ; get value of b
ADR r4,c ; get address for c
LDR r1,[r4] ; get value of c
ADD r2,r0,r1 ; compute partial result
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
C assignment, cont’d.
MUL r2,r2,r0 ; compute final value for y
ADR r4,y ; get address for y
STR r2,[r4] ; store y
Example: C assignment
* C:
z = (a << 2) | (b & 15);
* Assembler:
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
MOV r0,r0,LSL #2 ; perform shift
ADR r4,b ; get address for b
LDR r1,[r4] ; get value of b
AND r1,r1,#15 ; perform AND
ORR r1,r0,r1 ; perform OR
C assignment, cont’d.
ADR r4,z ; get address for z
STR r1,[r4] ; store value for z
Additional addressing modes
* Base-plus-offset addressing:
LDR r0,[r1,#16]
• Loads from location r1+16
* Auto-indexing increments base register:
LDR r0,[r1,#16]!
* Post-indexing fetches, then does offset:
LDR r0,[r1],#16
• Loads r0 from r1, then adds 16 to r1.
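The difference between the three forms is when, and whether, the base register changes. The Python sketch below models this with a dictionary for memory and one for registers; the function `load` and its `mode` names are assumptions made for illustration.

```python
def load(memory, regs, rd, rn, offset=0, mode="offset"):
    """Model LDR rd,[rn,...] for the three addressing forms on this slide.

    memory: dict mapping address -> word (hypothetical flat memory)
    regs:   dict mapping register name -> value (updated in place)
    """
    base = regs[rn]
    if mode == "offset":          # LDR rd,[rn,#offset]   base unchanged
        address = base + offset
    elif mode == "pre":           # LDR rd,[rn,#offset]!  base updated, then used
        address = base + offset
        regs[rn] = address
    elif mode == "post":          # LDR rd,[rn],#offset   base used, then updated
        address = base
        regs[rn] = base + offset
    else:
        raise ValueError(mode)
    regs[rd] = memory[address]
    return regs


memory = {0x100: 11, 0x110: 22}
print(load(memory, {"r0": 0, "r1": 0x100}, "r0", "r1", 16, "post"))
```

Post-indexing is the natural fit for walking an array: each load fetches the current element and leaves the base pointing at the next one.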
ARM flow of control
* All operations can be performed conditionally, testing CPSR:
• EQ, NE, CS, CC, MI, PL, VS, VC, HI, LS, GE, LT, GT, LE, AL, NV
* Branch operation:
B #100
• Can be performed conditionally.
Example: if statement
* C:
if (a > b) { x = 5; y = c + d; } else x = c - d;
* Assembler:
; compute and test condition
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
ADR r4,b ; get address for b
LDR r1,[r4] ; get value for b
CMP r0,r1 ; compare a with b
BLE fblock ; if a <= b, branch to false block
If statement, cont’d.
; true block
MOV r0,#5 ; generate value for x
ADR r4,x ; get address for x
STR r0,[r4] ; store x
ADR r4,c ; get address for c
LDR r0,[r4] ; get value of c
ADR r4,d ; get address for d
LDR r1,[r4] ; get value of d
ADD r0,r0,r1 ; compute y
ADR r4,y ; get address for y
STR r0,[r4] ; store y
B after ; branch around false block
If statement, cont’d.
; false block
fblock ADR r4,c ; get address for c
LDR r0,[r4] ; get value of c
ADR r4,d ; get address for d
LDR r1,[r4] ; get value for d
SUB r0,r0,r1 ; compute c-d
ADR r4,x ; get address for x
STR r0,[r4] ; store value of x
after ...
Conditional Instruction Use
; Compute and test the condition
ADR r4, a ; get address for a
LDR r0, [r4] ; get value of a
ADR r4, b ; get address for b
LDR r1, [r4] ; get value of b
CMP r0, r1 ; compare a with b
; Notice we don't need a branch here
Example: Conditional instruction implementation
; true block (executed only if a > b)
MOVGT r0,#5 ; generate value for x
ADRGT r4,x ; get address for x
STRGT r0,[r4] ; store x
ADRGT r4,c ; get address for c
LDRGT r0,[r4] ; get value of c
ADRGT r4,d ; get address for d
LDRGT r1,[r4] ; get value of d
ADDGT r0,r0,r1 ; compute y
ADRGT r4,y ; get address for y
STRGT r0,[r4] ; store y
Conditional instruction implementation, cont’d.
; false block (executed only if a <= b)
ADRLE r4,c ; get address for c
LDRLE r0,[r4] ; get value of c
ADRLE r4,d ; get address for d
LDRLE r1,[r4] ; get value of d
SUBLE r0,r0,r1 ; compute c-d
ADRLE r4,x ; get address for x
STRLE r0,[r4] ; store value of x
Example: switch statement
* C:
switch (test) { case 0: … break; case 1: … }
* Assembler:
ADR r2,test ; get address for test
LDR r0,[r2] ; load value for test
ADR r1,SwitchTable ; load address for switch table
LDR r1,[r1,r0,LSL #2] ; index switch table
SwitchTable DCD case0
DCD case1
; DCD directive instructs the assembler
; to reserve a word of store and
; initialize to the right side value
...
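The same table-driven dispatch can be sketched in Python, where a list of functions plays the role of the DCD table of code addresses. The handler bodies and the name `dispatch` are placeholders, since the slide elides the case bodies.

```python
def case0():
    return "handled case 0"    # placeholder body; the slide elides the real one


def case1():
    return "handled case 1"    # placeholder body


# One entry per case value, like the DCD words in SwitchTable. The LSL #2 in
# the assembly scales the index by 4 bytes; a Python list index needs no scaling.
switch_table = [case0, case1]


def dispatch(test):
    """Index the table with the switch value and run the selected case."""
    if not 0 <= test < len(switch_table):
        raise ValueError("switch value outside the table")
    return switch_table[test]()


print(dispatch(1))
```

The bounds check is an addition in this sketch; an indexed load with no such check would read past the table for an out-of-range value.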
Example: FIR filter
* C:
for (i=0, f=0; i<N; i++)
f = f + c[i]*x[i];
* Assembler
; loop initiation code
MOV r0,#0 ; use r0 for i
MOV r8,#0 ; use separate index for arrays
ADR r2,N ; get address for N
LDR r1,[r2] ; get value of N
MOV r2,#0 ; use r2 for f
FIR filter, cont’d.
ADR r3,c ; load r3 with base of c
ADR r5,x ; load r5 with base of x
; loop body
loop LDR r4,[r3,r8] ; get c[i]
LDR r6,[r5,r8] ; get x[i]
MUL r4,r4,r6 ; compute c[i]*x[i]
ADD r2,r2,r4 ; add into running sum
ADD r8,r8,#4 ; add one word offset to array index
ADD r0,r0,#1 ; add 1 to i
CMP r0,r1 ; exit?
BLT loop ; if i < N, continue
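For comparison, here is a Python sketch that mirrors the register-level loop step by step, with variables named after the registers they stand for. It assumes 32-bit (4-byte) array elements and ignores 32-bit overflow of the running sum; the function name `fir` is hypothetical.

```python
def fir(c, x, n):
    """Mirror the assembly loop: r0 = i, r8 = byte offset, r2 = f."""
    i = 0            # r0
    offset = 0       # r8, advances by 4 bytes per element
    f = 0            # r2
    while True:      # the test sits at the bottom of the loop
        f += c[offset // 4] * x[offset // 4]   # LDR, LDR, MUL, ADD
        offset += 4                            # ADD r8,r8,#4
        i += 1                                 # ADD r0,r0,#1
        if not i < n:                          # CMP r0,r1 / BLT loop
            break
    return f


print(fir([1, 2, 3], [4, 5, 6], 3))
```

Because the comparison comes after the body, this loop always runs at least once, unlike the C for loop; both versions agree whenever N is at least 1.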
ARM subroutine linkage
* Branch and link instruction:
BL foo
• Copies the return address (the address of the instruction after the BL) to r14.
* To return from subroutine:
MOV r15,r14
* Build a stack for nested calls and to pass parameters.
Nested subroutine calls
* void f1 (int a) { f2(a); }
* Nesting/recursion requires coding convention (a full descending stack in r13 is assumed here):
f1
LDR r0,[r13] ; load arg into r0 from stack
; call f2()
STR r14,[r13,#-4]! ; push f1’s return address
STR r0,[r13,#-4]! ; push arg to f2 on stack
BL f2 ; branch and link to f2
; return from f1()
ADD r13,r13,#4 ; pop f2’s arg off stack
LDR r15,[r13],#4 ; pop return address into PC and return
Summary
* Load/store architecture
* Most instructions are RISCy, operate in single cycle.
• Some multi-register operations take longer.
* All instructions can be executed conditionally.