Lecture 4 - Number Representations, DSK Hardware, Assembly Programming
James Barnes ([email protected])
Spring 2014
Colorado State University Dept of Electrical and Computer Engineering
ECE423 – 1 / 30
References
The next two lectures will focus on data types, hardware, and programming (C and assembly). References:
● TRETTER: Data types and number representations. Tretter, Chapter 1, on E-reserve at library
● WIKI: Fixed-Point Arithmetic, Wikipedia, http://en.wikipedia.org/wiki/Fixed-point_arithmetic
● CHASSAING: Architecture and Instruction Set of the C6x Processor. Chassaing 2005, Chapter 3
● TI: TMS320C6000 CPU and Instruction Set Reference Guide, http://www.engr.colostate.edu/ECE423/docs/spru189f.pdf
Data Types, Number Representations
C Data types vs C671X Core Data Units
● C specifies a number of data types based on arithmetic type (fixed pt vs float, signed vs unsigned)
● The DSP processor is register-oriented. The data types are based on how much of a register is required to hold the data
● In assembly programming, the only time the arithmetic type is recognized is in arithmetic instructions

C type                       Width  C671X Name  C671X Load Instruction  C671X Arith Instruction
char, signed char, uchar     8      BYTE        LDB (1), LDBU           ABS A0
short, signed short, ushort  16     HWORD       LDH (1), LDHU           ABS A0
int, signed int, uint        32     WORD        LDW, LDWU               ABS A0
–                            40     LONG        (LDB and LDW)           ABS A1:A0
float                        32     float       LDW                     ABSSP A0
double                       64     double      LDDW                    ABSDP A1:A0
(1) sign-extended
Fixed Point Numbers
● C considers all fixed point numbers to be integers (radix pt to right of LSB). For integers, the radix point is just a convenience for humans.
● C type unsigned int:
  ✦ $\mathrm{val}_{dec} = \sum_{n=0}^{31} d_n 2^n$, range $[0, 2^{32} - 1]$
  ✦ Rules for unsigned integers are different than for signed integers. The DSP chip has separate instructions for unsigned integers, for example ADDU
● C type int:
  ✦ $\mathrm{val}_{dec} = -d_{31} \cdot 2^{31} + \sum_{n=0}^{30} d_n 2^n$, range $[-2^{31}, 2^{31} - 1]$
  ✦ $d_{31}$ is the sign bit, 1 ⇒ negative
  ✦ Arithmetic is 2's complement
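The two sums on this slide can be checked directly in C. The sketch below is illustrative only: the function names and the `nbits` width parameter are assumptions added here, not part of the lecture. It evaluates the same bit pattern once with all-positive weights and once with the top bit weighted negatively.

```c
#include <assert.h>
#include <stdint.h>

/* Low `nbits` bits of `bits` read as unsigned: sum over n of d_n * 2^n. */
static uint64_t value_unsigned(uint32_t bits, int nbits) {
    assert(nbits >= 1 && nbits <= 32);
    uint64_t val = 0;
    for (int n = 0; n < nbits; n++) {
        if ((bits >> n) & 1u) val += (uint64_t)1 << n;
    }
    return val;
}

/* Same bits read as 2's complement: the top bit has weight -2^(nbits-1). */
static int64_t value_signed(uint32_t bits, int nbits) {
    assert(nbits >= 1 && nbits <= 32);
    int64_t val = 0;
    for (int n = 0; n < nbits - 1; n++) {
        if ((bits >> n) & 1u) val += (int64_t)1 << n;
    }
    if ((bits >> (nbits - 1)) & 1u) val -= (int64_t)1 << (nbits - 1);
    return val;
}
```

For example, `value_unsigned(0xFFFFFFFFu, 32)` gives 2^32 − 1, while `value_signed(0xFFFFFFFFu, 32)` gives −1 for the same bits.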
Addition Overflow
● Overflow can occur; two criteria (tests)
  1. Result has a different sign than both operands (overflow cannot occur when the operand signs differ).
  2. Carry-out of sign bit is different than carry-in of sign bit.
● DSPs can clamp the result to maximum or minimum ("saturate"), ex. SADD
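Both overflow tests, and clamping in the spirit of SADD, can be sketched in portable C. This is a minimal illustration, not the DSP's implementation; the function names are made up for this example, and the `assert` checks that the two criteria agree.

```c
#include <assert.h>
#include <stdint.h>

/* Test 1: operands share a sign, but the wrapped sum has the other sign. */
static int overflow_by_sign(int32_t a, int32_t b) {
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t sum = ua + ub;                       /* wraps modulo 2^32 */
    return (int)((((~(ua ^ ub)) & (ua ^ sum)) >> 31) & 1u);
}

/* Test 2: carry into the sign bit differs from carry out of it. */
static int overflow_by_carry(int32_t a, int32_t b) {
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t carry_in  = ((ua & 0x7FFFFFFFu) + (ub & 0x7FFFFFFFu)) >> 31;
    uint32_t carry_out = (uint32_t)(((uint64_t)ua + ub) >> 32);
    return (int)(carry_in ^ carry_out);
}

/* Saturating add: clamp to the extreme value instead of wrapping. */
static int32_t sat_add(int32_t a, int32_t b) {
    int overflow = overflow_by_sign(a, b);
    assert(overflow == overflow_by_carry(a, b));  /* the two tests agree */
    if (overflow) return (a < 0) ? INT32_MIN : INT32_MAX;
    return a + b;                                 /* safe: no overflow here */
}
```

With these definitions, `sat_add(INT32_MAX, 1)` stays at `INT32_MAX` rather than wrapping to a negative value.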
Representing Signed Fixed-point Numbers with Integers (Qf Notation)
● Binary point of an N-bit number is f positions to the left of the LSB, where f ∈ [0, N-1] (binary point cannot be to the left of the sign bit)
● User must manage movement of binary point - DSP has no knowledge of binary point
● Qf or Qm.f representation: $\mathrm{val}_{dec} = 2^{-f} \cdot \left(-d_{31} \cdot 2^{31} + \sum_{n=0}^{30} d_n 2^n\right)$
● DSP frequently uses Q15 with 16b words. In Q15, values range over $[-1, 1 - 2^{-15}]$.
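A short C sketch of the Q15 case may help: the stored 16-bit integer is the real value scaled by 2^15. The helper names, the rounding rule (round half away from zero), and the clamping are assumptions chosen for this illustration; the lecture does not prescribe them.

```c
#include <assert.h>
#include <stdint.h>

#define Q15_F 15   /* number of fractional bits */

/* Stored integer -> real value: val = 2^-f * integer. */
static double q15_to_double(int16_t q) {
    return (double)q / (double)(1 << Q15_F);
}

/* Real value -> nearest Q15 integer, clamped to [-1, 1 - 2^-15]. */
static int16_t double_to_q15(double x) {
    assert(x == x);                               /* reject NaN */
    double scaled = x * (double)(1 << Q15_F);
    scaled += (scaled >= 0.0) ? 0.5 : -0.5;       /* round half away from zero */
    if (scaled > 32767.0)  scaled = 32767.0;      /* largest Q15 value */
    if (scaled < -32768.0) scaled = -32768.0;     /* exactly -1 */
    return (int16_t)scaled;                       /* truncation completes the rounding */
}
```

Note that +1.0 is not representable: `double_to_q15(1.0)` clamps to 32767, which is 1 − 2^−15.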
Addition/Multiplication and Normalizing.
● Under addition, the binary point does not move.
  ✦ Addition overflow can occur as with integers.
● Under multiplication, the binary point moves to the left. Ex: multiplying two 16 bit numbers results in a 32 bit number. For Q15, the binary point will be 30 positions to the left of the LSB (to the right of the sign bit) and there will be two sign bits (sign extension).
  ✦ To reduce the answer to a Q15 result, the 32 bit number must be right-shifted by 15 bits and the lower 16 bits used.
● Example with Q0.2 (in class).
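The Q15 multiply-and-normalize step can be sketched in C as below. Two assumptions are worth flagging: the function name is hypothetical, and `>>` on a negative `int32_t` is taken to be an arithmetic shift, which C leaves implementation-defined (common compilers do shift arithmetically). The clamp handles the single case (−1)·(−1), whose true result +1 does not fit in Q15.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two Q15 numbers and return a Q15 result. */
static int16_t q15_mul(int16_t a, int16_t b) {
    int32_t product = (int32_t)a * (int32_t)b;    /* Q30: binary point 30 bits up */
    int32_t shifted = product >> 15;              /* back to Q15 (arithmetic shift assumed) */
    if (shifted > INT16_MAX) shifted = INT16_MAX; /* only (-1)*(-1) reaches here */
    assert(shifted >= INT16_MIN);                 /* cannot underflow */
    return (int16_t)shifted;                      /* keep the lower 16 bits */
}
```

For example, 0.5 × 0.5 is `q15_mul(16384, 16384)`, which yields 8192, i.e. 0.25 in Q15.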
IEEE Floating Point Numbers
● float: 32b single precision - 6-8 decimal places of precision
● double: 64b double precision - 15-17 decimal places of precision
● Organization of float as stored in memory.
Floating Point Number Types
● 5 float number categories

Type      e            f     Value (dec)
Infinity  255          0     $(-1)^s \cdot \infty$
NAN       255          ≠ 0   undefined
Normal    1 ≤ e < 255  any   $(-1)^s \cdot 2^{e-127} \cdot (1 + f)$
Denorm    0            ≠ 0   $(-1)^s \cdot 2^{-126} \cdot f$
Zero      0            0     $(-1)^s \cdot 0$ (note: ±0)

● NORM ("normal" or "normalized")
  ✦ has biased exponent $2^{e-127}$; the "real" range of the exponent is [-126, 127]
  ✦ mantissa has implied 1, "real" range is $[1, 1 + (1 - 2^{-23})]$
  ✦ Range $[2^{-126}, (2 - 2^{-23}) \cdot 2^{127}]$
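The category table can be turned into a small classifier. The sketch below assumes the standard IEEE 754 single-precision layout (sign in bit 31, 8-bit biased exponent e in bits 30-23, 23-bit fraction f in bits 22-0) and a 32-bit `float`; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum float_type { FT_INFINITY, FT_NAN, FT_NORMAL, FT_DENORM, FT_ZERO };

/* Build a float from a raw 32-bit pattern (handy for special values). */
static float float_from_bits(uint32_t bits) {
    float x;
    assert(sizeof x == sizeof bits);    /* assumes 32-bit IEEE 754 float */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Classify a float from its exponent field e and fraction field f. */
static enum float_type classify_float(float x) {
    uint32_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    uint32_t e = (bits >> 23) & 0xFFu;  /* 8-bit biased exponent */
    uint32_t f = bits & 0x7FFFFFu;      /* 23-bit fraction */
    if (e == 255) return (f == 0) ? FT_INFINITY : FT_NAN;
    if (e == 0)   return (f == 0) ? FT_ZERO : FT_DENORM;
    return FT_NORMAL;                   /* 1 <= e < 255, any f */
}
```

`memcpy` is used instead of a pointer cast so the bit inspection stays within the C aliasing rules.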
Precision of Single Precision Floating Point Numbers
● NORMs have the same relative precision over their entire range
  ✦ e=1: $8 \cdot 10^6$ values in range $[2^{-126}, 2^{-125} - 2^{-149}]$, step size $2^{-149}$
  ✦ e=2: $8 \cdot 10^6$ values in range $[2^{-125}, 2^{-124} - 2^{-148}]$, step size $2^{-148}$
  and so forth. For a given mantissa value, as the step size increases, the number value increases proportionately such that the relative precision remains constant ($\frac{1}{8 \cdot 10^6}$).
● DENORMs have reduced precision.
  ✦ The smallest DENORMs have 100% step size ($2^{-149}$ to $2 \cdot 2^{-149}$)
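As a worked check of the NORM figures, using e and f as in the float category table and writing Δx for the step between adjacent values (a symbol introduced here), the step at biased exponent e is the weight of the last fraction bit, $2^{e-127} \cdot 2^{-23}$. Dividing by the value shows that the exponent cancels:

```latex
\frac{\Delta x}{x}
  = \frac{2^{e-127}\cdot 2^{-23}}{2^{e-127}\cdot(1+f)}
  = \frac{2^{-23}}{1+f},
\qquad
\left.\frac{\Delta x}{x}\right|_{f=0} = 2^{-23} = \frac{1}{8\,388\,608} \approx \frac{1}{8\cdot 10^{6}}
```

For e=1 this gives a step of $2^{-149}$ and for e=2 a step of $2^{-148}$, matching the values listed for those two ranges.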
C6713 Hardware
C6713 High-Level Block Diagram
CPU Functional Units
Instruction Fetch
● Instructions are fetched up to eight at a time in 256-bit wide "fetch packets".
● Each instruction occupies a 32-bit slot.
● The human assembly writer, optimizing compiler, or optimizing assembler indicates which instructions can be executed in parallel, and the fetch packet is written to instruction memory with that information.
Reference: TI
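One way to picture how the parallel information travels with the fetch packet is the sketch below. It relies on a detail taken from the TI reference guide rather than from this slide: bit 0 of each 32-bit instruction (the p-bit) is 1 when the next instruction executes in parallel with it. The function name is hypothetical, and the instruction words in any test data are placeholders, not real opcodes.

```c
#include <assert.h>
#include <stdint.h>

#define FETCH_PACKET_WORDS 8   /* 8 slots x 32 bits = 256 bits */

/* Count the groups of instructions that issue together in one fetch packet. */
static int count_execute_packets(const uint32_t packet[FETCH_PACKET_WORDS]) {
    /* Assumed here: a group cannot continue past the end of the fetch packet. */
    assert((packet[FETCH_PACKET_WORDS - 1] & 1u) == 0);
    int groups = 0;
    for (int i = 0; i < FETCH_PACKET_WORDS; i++) {
        if ((packet[i] & 1u) == 0) groups++;   /* p = 0 ends a group */
    }
    return groups;
}
```

A packet whose p-bits are all 0 executes as eight serial instructions; a packet with p-bits 1,1,1,1,1,1,1,0 issues all eight in parallel.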
Register and Bus Architecture
● 16 32b registers in each of the A and B register stacks
● Bus structure
  ✦ Each functional unit can write to any register in its stack
  ✦ Each functional unit can take input operands from any register in its stack
  ✦ There are cross-paths which allow a functional unit to take one operand from the other stack
General vs Special Purpose Registers
Special-purpose registers vs general purpose: any register can be used for computation but...
● Registers A1, A2, B0, B1, and B2 are used for conditionals (example later)
● Input operands to a function call are placed in A4, B4, A6, B6, ...
● Return value of function placed in A4
● Function return address (PC+1 for function call) is placed in B3 WHEN control passes to the function
The processor will automatically save and restore registers A0-A9 and B0-B9 during a "context switch", which happens during a function call.
Special Purpose Register Map
Notice the form of the asm function prototype.
Functional Units
Name  Function(s)                                 Arithmetic Type
D     ALU, memory access                          fixed pt only
L     ALU                                         fixed pt, float
M     Multiply only                               fixed pt, float
S     ALU, bit manipulation, branch instructions  fixed pt, float

Operation              Mnemonic  Functional Unit Latency (clock cycles)  Delay Slots (clock cycles)
Fixed Pt Arith, Logic  –         1                                       0
Multiply               MPY       1                                       1
Load                   LDH, LDW  1                                       5
Branch                 B         1                                       6
DP Multiply            MPYDP     4                                       9
C6713 Assembly Language Programming
Why/Why Not Assembly?
● Plusses
  ✦ Speed
    ■ Compiler is good at finding parallel operations, but humans can do better
    ■ Re-ordering of instructions and re-use of register results can reduce number of LOAD/STORE operations or make better use of NOP wait cycles
  ✦ Smaller program size by better use of registers for scratch memory
  ✦ Good way to learn the hardware
● Minuses
  ✦ Slower development
  ✦ For a complex program, an optimizing compiler may do better than the human
● Linear assembly is a compromise between C and "straight" assembly
  ✦ Assembler assigns registers, chooses functional units, finds instructions that can be executed in parallel, puts in delays (NOPs). User chooses instructions, defines variables and program flow.
Format of Assembly Instruction
Register and Load/Store Cross-Paths
Note:
● The destination must be on the same side (A vs B) as the functional unit
● For LDx instructions, "the same side as" means the side the address pointer register is on.
Note on Addressing Memory
● Memory addresses are 32b wide
● Memory is byte-addressable
● There are two memory addressing modes: linear (used here) and circular (discussed later).
● The pointer post-increment/post-decrement operation does the right thing depending on the type of the LD/ST instruction
  ✦ Ex: LDW .D1 *A0++,A7 will increment A0 by 4
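C pointer arithmetic behaves the same way, which gives a convenient analogy: incrementing a typed pointer advances the byte address by the size of the element. The helper names below are hypothetical, and the functions are a C analogy for `LDW *A0++` and `LDH *A0++`, not a model of the hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit word, then advance the pointer by 4 bytes (like LDW *A0++). */
static int32_t load_word_postinc(const int32_t **ptr) {
    assert(ptr && *ptr);
    return *(*ptr)++;
}

/* Read a 16-bit halfword, then advance the pointer by 2 bytes (like LDH *A0++). */
static int16_t load_half_postinc(const int16_t **ptr) {
    assert(ptr && *ptr);
    return *(*ptr)++;
}
```

After one call to `load_word_postinc` the pointer sits 4 bytes further on; after one call to `load_half_postinc` it sits 2 bytes further on.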
Initializing Pointers with MVKH, MVKL
How do you initialize a pointer to memory?
1. ZERO, but what if you want to access a non-zero location?
2. MVKH, MVKL
   ● Must be applied in right order (see example)
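Why the order matters can be seen by modelling the two instructions in C. The model below follows the behavior described in the TI reference guide: MVKL writes the lower 16 bits of the constant and sign-extends them into the upper half, while MVKH replaces only the upper 16 bits. The C function names are hypothetical stand-ins for the instructions.

```c
#include <assert.h>
#include <stdint.h>

/* MVKL: load the low 16 bits of the constant, sign-extended to 32 bits. */
static uint32_t mvkl(uint32_t constant) {
    uint32_t low = constant & 0xFFFFu;
    return (low & 0x8000u) ? (low | 0xFFFF0000u) : low;
}

/* MVKH: replace the upper 16 bits, leave the lower 16 bits of reg alone. */
static uint32_t mvkh(uint32_t reg, uint32_t constant) {
    return (constant & 0xFFFF0000u) | (reg & 0xFFFFu);
}

/* Correct order: MVKL first, then MVKH repairs the upper half. */
static uint32_t load_address(uint32_t addr) {
    uint32_t reg = mvkl(addr);
    reg = mvkh(reg, addr);
    assert(reg == addr);
    return reg;
}
```

In the reverse order, MVKL runs last and its sign extension overwrites the upper half: for the address 0x00018000 the register ends up holding 0xFFFF8000.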
Calling Assembly Functions (Passing Arguments)
● We saw that function arguments are passed in regs A4, B4, ...
  ✦ long (40b integer) arguments and results occupy two adjacent registers
● When calling ASM from C, the C compiler will put extra instructions in the compiled program to save A0-A9 and restore them after the ASM function call is completed.
Program Flow (Conditionals and Branches, Loops)
● ANY instruction can be made conditional. If the condition (conditional register value) is false (all 0's), the instruction is not executed
● Making a loop requires
  1. A label
  2. A conditional register
  3. A branch statement
(In-class example)
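The three loop ingredients can be mirrored in C with `goto`, purely as an analogy to the assembly structure. In this sketch the variable `a1` plays the role of a conditional register, and the function name and the array-summing task are made up for illustration; this is not the in-class example.

```c
#include <assert.h>
#include <stdint.h>

/* Sum `count` halfwords using a label, a counter "register", and a branch. */
static int32_t sum_array(const int16_t *data, uint32_t count) {
    assert(data || count == 0);
    int32_t  sum = 0;
    uint32_t a1  = count;          /* stands in for a conditional register */
    if (a1 == 0) return sum;
loop:                              /* 1. the label */
    sum += *data++;
    a1 -= 1;                       /* 2. update the conditional register */
    if (a1) goto loop;             /* 3. branch, taken only while a1 is non-zero */
    return sum;
}
```

The `if (a1) goto loop;` line corresponds to a conditional branch: when `a1` reaches all zeros the branch is not executed and control falls through.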
Instructions that Require Wait States (NOP)
● The following instructions that we will use require wait states: LDx, STx, B, MPY
● Register transfers, ADD, SUB are all 1-cycle
● How to use delay slots: instead of NOP, do something useful that does not depend on results of the instruction needing wait states.
(In-class example).
Functional Unit fixed-point Instructions
Functional Unit floating-point Instructions