
Integer Operations

Outline
• Arithmetic operations
  – Overflow
  – Unsigned addition, multiplication
  – Signed addition, negation, multiplication
  – Using shifts to perform power-of-2 multiply/divide
• Suggested reading
  – Chap 2.3

Unsigned Addition
• Operands u and v: $w$ bits each
• True sum $u + v$: $w+1$ bits
• Discard carry: $\mathrm{UAdd}_w(u, v)$ has $w$ bits
• Standard addition function
  – Ignores carry output
• Implements modular arithmetic
  – $s = \mathrm{UAdd}_w(u, v) = (u + v) \bmod 2^w$

$$\mathrm{UAdd}_w(u, v) = \begin{cases} u + v, & u + v < 2^w \\ u + v - 2^w, & u + v \ge 2^w \end{cases} \qquad \text{(p. 67, Eq. 2.9)}$$

Visualizing Unsigned Addition (p. 68, Figure 2.16)
• Wraps around
  – If true sum $\ge 2^w$
  – At most once
[Figure 2.16: surface plot of $\mathrm{UAdd}_4(u, v)$ over 4-bit u and v. The true sum ranges up to $2^{w+1}$; sums of $2^w = 16$ or more overflow and wrap around to the modular sum.]

Unsigned Addition Forms an Abelian Group (p. 68)
• Closed under addition
  – $0 \le \mathrm{UAdd}_w(u, v) \le 2^w - 1$
• Commutative
  – $\mathrm{UAdd}_w(u, v) = \mathrm{UAdd}_w(v, u)$
• Associative
  – $\mathrm{UAdd}_w(t, \mathrm{UAdd}_w(u, v)) = \mathrm{UAdd}_w(\mathrm{UAdd}_w(t, u), v)$
• 0 is additive identity
  – $\mathrm{UAdd}_w(u, 0) = u$
• Every element has an additive inverse
  – Let $\mathrm{UComp}_w(u) = 2^w - u$ for $u > 0$, and $\mathrm{UComp}_w(0) = 0$ (p. 68, Eq. 2.10)
  – $\mathrm{UAdd}_w(u, \mathrm{UComp}_w(u)) = 0$

Signed Addition
• Functionality
  – True sum requires $w+1$ bits
  – Drop the MSB
  – Treat the remaining bits as a two's complement integer

$$\mathrm{TAdd}_w(u, v) = \begin{cases} u + v - 2^w, & \mathrm{TMax}_w < u + v & \text{(PosOver)} \\ u + v, & \mathrm{TMin}_w \le u + v \le \mathrm{TMax}_w \\ u + v + 2^w, & u + v < \mathrm{TMin}_w & \text{(NegOver)} \end{cases} \qquad \text{(p. 70, Eq. 2.12)}$$

  – PosOver: positive overflow
  – NegOver: negative overflow

Signed Addition (p. 70, Figure 2.17)
[Figure 2.17: the true sum ranges from $-2^w$ to $2^w$, while the TAdd result ranges from $-2^{w-1}$ (100…0) to $2^{w-1} - 1$ (011…1). True sums above that range (PosOver) map to negative results; true sums below it (NegOver) map to non-negative results.]

Visualizing 2's Comp. Addition
• Values
  – 4-bit two's complement
  – Range from −8 to +7
• Wraps around
  – If sum $\ge 2^{w-1}$
    • Becomes negative
  – If sum $< -2^{w-1}$
    • Becomes positive
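The addition rules above can be sketched in C. This is a minimal illustration, not code from the slides: the function names (`uadd8`, `uadd8_overflows`, `ucomp8`, `tadd`) are hypothetical, `uadd8` fixes w = 8, and `tadd` evaluates the three-case formula of Eq. 2.12 directly for a small word size w.

```c
#include <assert.h>
#include <stdint.h>

/* UAdd_8(u, v): u and v are promoted to int, so u + v is the true sum;
   converting back to uint8_t reduces it mod 2^8 (drops the carry). */
static uint8_t uadd8(uint8_t u, uint8_t v) {
    return (uint8_t)(u + v);
}

/* The sum wrapped around iff the result is smaller than an operand. */
static int uadd8_overflows(uint8_t u, uint8_t v) {
    return uadd8(u, v) < u;
}

/* UComp_8(u): additive inverse, 0 for u == 0 and 2^8 - u otherwise. */
static uint8_t ucomp8(uint8_t u) {
    return (uint8_t)(0u - u);
}

/* TAdd_w(u, v) from the three-case formula (Eq. 2.12).
   Both operands must already lie in the w-bit two's complement range. */
static int tadd(int w, int u, int v) {
    assert(w >= 2 && w <= 16);
    int tmin = -(1 << (w - 1));
    int tmax = (1 << (w - 1)) - 1;
    assert(u >= tmin && u <= tmax && v >= tmin && v <= tmax);
    int sum = u + v;                        /* true sum, fits in int */
    if (sum > tmax) return sum - (1 << w);  /* PosOver */
    if (sum < tmin) return sum + (1 << w);  /* NegOver */
    return sum;
}
```

For instance, `uadd8(200, 100)` gives 44 (300 mod 256), and with w = 4, `tadd(4, 5, 5)` gives −6 (positive overflow) while `tadd(4, -8, -1)` gives 7 (negative overflow).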
Visualizing 2's Comp. Addition (p. 72, Figure 2.19)
[Figure 2.19: surface plot of $\mathrm{TAdd}_4(u, v)$ for u and v from −8 to 7. In the NegOver region the result wraps to a positive value; in the PosOver region it wraps to a negative value.]

Detecting TAdd Overflow (p. 71)
• Task
  – Given $s = \mathrm{TAdd}_w(u, v)$
  – Determine whether $s$ equals the true sum $u + v$
• Claim
  – Overflow iff either:
    • u, v < 0 and s ≥ 0 (NegOver)
    • u, v ≥ 0 and s < 0 (PosOver)
  – In C: ovf = ((u < 0) == (v < 0)) && ((u < 0) != (s < 0));

Mathematical Properties of TAdd
• Two's complement under TAdd forms a group
  – Closed, commutative, associative, 0 is additive identity
  – Every element has an additive inverse
    • Let $\mathrm{TComp}_w(u) = \begin{cases} -u, & u \ne \mathrm{TMin}_w \\ \mathrm{TMin}_w, & u = \mathrm{TMin}_w \end{cases}$ (p. 73, Eq. 2.13)
    • $\mathrm{TAdd}_w(u, \mathrm{TComp}_w(u)) = 0$
• Isomorphic algebra to UAdd (isomorphic: same structure)
  – $\mathrm{TAdd}_w(u, v) = \mathrm{U2T}(\mathrm{UAdd}_w(\mathrm{T2U}(u), \mathrm{T2U}(v)))$
    • Since both have identical bit patterns
  – $\mathrm{T2U}(\mathrm{TAdd}_w(u, v)) = \mathrm{UAdd}_w(\mathrm{T2U}(u), \mathrm{T2U}(v))$

Negating with Complement & Increment (p. 73)
• In C
  – ~x + 1 == -x
• Complement
  – Observation: ~x + x == 1111…111 == -1
  – Example:
         x   10011101
      + ~x   01100010
      = -1   11111111
• Increment
  – ~x + 1 == ~x + [x + (-x)] + 1
  – == (~x + x) + (-x + 1)
  – == -1 + (-x + 1)
  – == -x
  – So, ~x + 1 == -x

Multiplication (p. 75)
• Computing the exact product of w-bit numbers x, y
  – Either signed or unsigned
• Ranges
  – Unsigned: $0 \le x \cdot y \le (2^w - 1)^2 = 2^{2w} - 2^{w+1} + 1$
    • Up to $2w$ bits
  – Two's complement min: $x \cdot y \ge (-2^{w-1})(2^{w-1} - 1) = -2^{2w-2} + 2^{w-1}$
    • Up to $2w - 1$ bits
  – Two's complement max: $x \cdot y \le (-2^{w-1})^2 = 2^{2w-2}$
    • Up to $2w$ bits, but only for $\mathrm{TMin}_w^2$
• Maintaining exact results
  – Would need to keep expanding the word size with each product computed
  – Done in software by "arbitrary precision" arithmetic packages

Power-of-2 Multiply with Shift
• Operands: $w$ bits ($u$ and $2^k$)
• True product $u \cdot 2^k$: $w + k$ bits
• Discard $k$ bits: $\mathrm{UMult}_w(u, 2^k)$ and $\mathrm{TMult}_w(u, 2^k)$ have $w$ bits, with the low $k$ bits zero
• Operation
  – u << k gives $u \cdot 2^k$
  – Both signed and unsigned
• Examples
  – u << 3 == u * 8
  – (u << 5) - (u << 3) == u * 24 (the parentheses are required in C, because - binds tighter than <<)
  – Most machines shift and add much faster than multiply
    • Compiler will generate this code automatically
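The overflow test, the complement-and-increment negation, and the shift-based multiply can be sketched together in C. This is an illustrative sketch under stated assumptions: the function names are hypothetical, and converting an out-of-range `uint32_t` to `int32_t` is implementation-defined before C23 (two's complement wraparound, as on mainstream compilers, is assumed here). The sum is formed on unsigned values because signed overflow is undefined behavior in C.

```c
#include <assert.h>
#include <stdint.h>

/* TAdd_32: add the bit patterns as unsigned (well-defined wraparound),
   then reinterpret the result as two's complement (assumed conversion). */
static int32_t tadd32(int32_t u, int32_t v) {
    return (int32_t)((uint32_t)u + (uint32_t)v);
}

/* Overflow iff the operands share a sign and the result's sign differs. */
static int tadd32_overflows(int32_t u, int32_t v) {
    int32_t s = tadd32(u, v);
    return ((u < 0) == (v < 0)) && ((u < 0) != (s < 0));
}

/* TComp_32(x) computed as ~x + 1; yields TMin for x == TMin. */
static int32_t tneg32(int32_t x) {
    return tadd32(~x, 1);
}

/* u * 2^k mod 2^32 by shifting; unsigned avoids undefined shifts. */
static uint32_t mul_pow2(uint32_t u, unsigned k) {
    assert(k < 32);
    return u << k;
}

/* u * 24 == u * 32 - u * 8, all mod 2^32. */
static uint32_t times24(uint32_t u) {
    return mul_pow2(u, 5) - mul_pow2(u, 3);
}
```

For instance, `tadd32_overflows(INT32_MAX, 1)` reports overflow, `tneg32(5)` gives −5, and `times24(7)` gives 168.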
Unsigned Power-of-2 Divide with Shift
• Quotient of unsigned by power of 2
  – u >> k gives $\lfloor u / 2^k \rfloor$
  – Uses logical shift
[Diagram: dividing u by $2^k$ places the binary point k bits from the right; the quotient $\lfloor u / 2^k \rfloor$ keeps the bits to the left of the binary point and zero-fills the top k bits.]

2's Comp Power-of-2 Divide with Shift (p. 77)
• Quotient of signed by power of 2
  – u >> k gives $\lfloor u / 2^k \rfloor$
  – Uses arithmetic shift
  – Rounds in the wrong direction when u < 0
[Diagram: same layout as the unsigned case, with the sign bit replicated into the top k bits; the result is RoundDown($u / 2^k$).]

Correct Power-of-2 Divide
• Quotient of negative number by power of 2
  – Want $\lceil u / 2^k \rceil$ (round toward 0)
  – Compute as $\lfloor (u + 2^k - 1) / 2^k \rfloor$
    • In C: (u + (1<<k)-1) >> k
    • Biases the dividend toward 0
• Case 1: No rounding
  – The low k bits of the dividend u are all 0, so adding the bias $2^k - 1$ only sets bits to the right of the binary point
  – Biasing has no effect
• Case 2: Rounding
  – The low k bits of u are not all 0, so adding the bias $2^k - 1$ carries into the bits to the left of the binary point, incrementing them by 1
  – Biasing adds 1 to the final result

Floating Point

Topics
• Fractional binary numbers
• IEEE 754 standard
• Rounding mode
• FP operations
• Floating point in C
• Suggested reading: Chap 2.4

Encoding Rational Numbers (p. 80)
• Form $V = x \times 2^y$
• Very useful when $|V| \gg 0$ (very large) or $|V| \ll 1$ (very close to 0)
• An approximation to real arithmetic
• From the programmer's perspective
  – Uninteresting
  – Arcane and incomprehensible
• Until the 1980s
  – Many idiosyncratic formats: fast, easy to implement, less accurate
• IEEE 754
  – Designed by W. Kahan for Intel processors
  – Based on a small and consistent set of principles: elegant, understandable, hard to make go fast
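Returning to the signed power-of-2 division described earlier, the biased shift can be sketched in C as follows. This is an illustration with hypothetical function names; it assumes that `>>` on a negative `int32_t` is an arithmetic shift, which is implementation-defined in C but holds on mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned quotient floor(u / 2^k): a logical shift is always correct. */
static uint32_t udiv_pow2(uint32_t u, unsigned k) {
    assert(k < 32);
    return u >> k;
}

/* Plain arithmetic shift: floor(x / 2^k), which rounds away from zero
   for negative x (assumes >> is arithmetic for signed operands). */
static int32_t shift_only(int32_t x, unsigned k) {
    assert(k < 31);
    return x >> k;
}

/* Signed quotient rounded toward zero, matching C's x / (1 << k):
   add the bias 2^k - 1 before shifting, but only when x is negative. */
static int32_t div_pow2(int32_t x, unsigned k) {
    assert(k < 31);
    int32_t bias = (x < 0) ? (int32_t)((1u << k) - 1) : 0;
    return (x + bias) >> k;
}
```

For instance, `shift_only(-7, 1)` gives −4, while `div_pow2(-7, 1)` gives −3, the same as `-7 / 2` in C. When the low k bits are already zero, as in `div_pow2(-8, 2)`, the bias has no effect and the result is −2.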
Fractional Binary Numbers
[Diagram: bit string $b_m\, b_{m-1} \cdots b_2\, b_1\, b_0 \,.\, b_{-1}\, b_{-2}\, b_{-3} \cdots b_{-n}$ with weights $2^m, 2^{m-1}, \ldots, 4, 2, 1$ to the left of the binary point and $1/2, 1/4, 1/8, \ldots, 2^{-n}$ to the right.]
• Bits to the right of the "binary point" represent fractional powers of 2
• Represents the rational number $\sum_{i=-n}^{m} b_i \cdot 2^i$ (p. 81, Eq. 2.17)

Fractional Numbers to Binary Bits
    /* x: fraction in [0, 1) to convert (a double); i: an int.
       Produces the first 32 fraction bits, most significant bit first. */
    unsigned result_bits = 0, current_bit = 0x80000000;
    for (i = 0; i < 32; i++) {
        x *= 2;                           /* move the next bit left of the point */
        if (x >= 1) {
            result_bits |= current_bit;   /* that bit is 1 */
            if (x == 1) break;            /* nothing left: exact representation */
            x -= 1;
        }
        current_bit >>= 1;                /* advance to the next lower bit */
    }

Fraction Binary Number Examples
• Value 0.2 has the binary fraction 0.00110011[0011]… (the bracketed pattern repeats)
• Observations
  – The form 0.11111…11 represents numbers just below 1.0, written $1.0 - \epsilon$
  – Binary fractions can only exactly represent numbers of the form $x / 2^k$
  – Others have repeating bit patterns

IEEE Floating-Point Representation (p. 83)
• Numeric form
  – $V = (-1)^s \times M \times 2^E$
    • Sign bit s determines whether the number is negative or positive
    • Significand M is normally a fractional value in the range [1.0, 2.0)
    • Exponent E weights the value by a power of two
• Encoding
  – Fields, from most to least significant bit: s | exp | frac
  – s is the sign bit
  – exp field encodes E
  – frac field encodes M
• Sizes
  – Single precision (32 bits): 8 exp bits, 23 frac bits
  – Double precision (64 bits): 11 exp bits, 52 frac bits

Normalized Values (p. 84)
• Condition
  – exp ≠ 000…0 and exp ≠ 111…1
• Exponent coded as a biased value
  – E = Exp − Bias
    • Exp: unsigned value denoted by exp
    • Bias: bias value
      – Single precision: 127 (Exp: 1…254, E: −126…127)
      – Double precision: 1023 (Exp: 1…2046, E: −1022…1023)
      – In general: Bias = $2^{m-1} - 1$, where m is the number of exponent bits
• Significand coded with an implied leading 1
  – M = 1.xxx…x (binary)
    • xxx…x: bits of frac
    • Minimum when frac = 000…0 (M = 1.0)
    • Maximum when frac = 111…1 (M = $2.0 - \epsilon$)
    • Get the extra leading bit for "free"

Normalized Encoding Examples
• Value: 12345 (hex: 0x3039)
• Binary bits: 11000000111001
• Fraction representation: 1.1000000111001 (binary) × $2^{13}$
• frac (bits of M after the leading 1): 10000001110010000000000
• E = 13, so Exp = 13 + 127 = 140, exp = 10001100
• Binary encoding
  – 0100 0110 0100 0000 1110 0100 0000 0000
  – Hex: 4640E400

Denormalized Values (p. 84)
• Condition
  – exp = 000…0
• Values
  – Exponent value: E = 1 − Bias
  – Significand value: M = 0.xxx…x (binary)
    • xxx…x: bits of frac
• Cases
  – exp = 000…0, frac = 000…0
    • Represents the value 0
    • Note that there are distinct values +0 and −0
  – exp = 000…0, frac ≠ 000…0
    • Numbers very close to 0.0
    • Lose precision as they get smaller
    • "Gradual underflow"

Special Values (p. 85)
• Condition
  – exp = 111…1
• exp = 111…1, frac = 000…0
  – Represents the value ∞ (infinity)
  – Result of an operation that overflows
  – Both positive and negative
  – E.g., 1.0/0.0 = −1.0/−0.0 = +∞, 1.0/−0.0 = −∞
• exp = 111…1, frac ≠ 000…0
  – Not-a-Number (NaN)
  – Represents the case when no numeric value can be determined
  – E.g., sqrt(−1), ∞ − ∞

Summary of Real Number Encodings (p. 85, Figure 2.22)
[Figure 2.22: number line, left to right: NaN | −∞ | −Normalized | −Denorm | −0, +0 | +Denorm | +Normalized | +∞ | NaN]
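The single-precision encoding rules above can be checked with a short C sketch that splits a `float` into its s, exp and frac fields and classifies it by the exp field. The type and function names are hypothetical, and the code assumes `float` is the 32-bit IEEE 754 single-precision format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the bit pattern of a float into a 32-bit unsigned integer. */
static uint32_t float_bits(float f) {
    uint32_t bits;
    assert(sizeof bits == sizeof f);  /* assumes a 32-bit float */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Field extraction: 1 sign bit, 8 exp bits, 23 frac bits. */
static unsigned sign_field(uint32_t bits) { return bits >> 31; }
static unsigned exp_field(uint32_t bits)  { return (bits >> 23) & 0xFF; }
static uint32_t frac_field(uint32_t bits) { return bits & 0x7FFFFF; }

typedef enum { KIND_ZERO, KIND_DENORM, KIND_NORM, KIND_INF, KIND_NAN } fp_kind;

/* Classify by the conditions on exp and frac given in the slides. */
static fp_kind classify(unsigned exp, uint32_t frac) {
    if (exp == 0)    return frac == 0 ? KIND_ZERO : KIND_DENORM;
    if (exp == 0xFF) return frac == 0 ? KIND_INF  : KIND_NAN;
    return KIND_NORM;
}
```

For the worked example, `float_bits(12345.0f)` is `0x4640E400`, with sign 0, Exp 140 (so E = 140 − 127 = 13) and frac `0x40E400`.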
8-bit Floating-Point Representations
• Bit layout: bit 7 is s, bits 6–3 are exp (4 bits), bits 2–0 are frac (3 bits)
• Bias is 7

    Exp   exp    E     2^E
    0     0000   -6    1/64   (denorms)
    1     0001   -6    1/64
    2     0010   -5    1/32
    3     0011   -4    1/16
    4     0100   -3    1/8
    5     0101   -2    1/4
    6     0110   -1    1/2
    7     0111    0    1
    8     1000   +1    2
    9     1001   +2    4
    10    1010   +3    8
    11    1011   +4    16
    12    1100   +5    32
    13    1101   +6    64
    14    1110   +7    128
    15    1111   n/a          (inf, NaN)

Dynamic Range (p. 86, Figure 2.23)

    s  exp   frac   E     Value
    Denormalized numbers
    0  0000  000    -6    0
    0  0000  001    -6    1/8 * 1/64 = 1/512
    0  0000  010    -6    2/8 * 1/64 = 2/512
    …
    0  0000  110    -6    6/8 * 1/64 = 6/512
    0  0000  111    -6    7/8 * 1/64 = 7/512
    Normalized numbers
    0  0001  000    -6    8/8 * 1/64 = 8/512
    0  0001  001    -6    9/8 * 1/64 = 9/512
    …
    0  0110  110    -1    14/8 * 1/2 = 14/16
    0  0110  111    -1    15/8 * 1/2 = 15/16
    0  0111  000     0    8/8 * 1 = 1
    0  0111  001     0    9/8 * 1 = 9/8
    0  0111  010     0    10/8 * 1 = 10/8
    …
    0  1110  110     7    14/8 * 128 = 224
    0  1110  111     7    15/8 * 128 = 240
    Special
    0  1111  000    n/a   inf

Distribution of Representable Values
• 6-bit IEEE-like format
  – k = 3 exponent bits
  – n = 2 significand bits
  – Bias is 3
• Notice how the distribution gets denser toward zero.
[Figure: representable values on a number line from −15 to 15, marked as denormalized, normalized, and infinity, with a close-up of the range −1 to 1.]

Interesting Numbers (p. 88, Figure 2.24)

Special Properties of Encoding
• FP zero is the same as integer zero
  – All bits = 0
• Can (almost) use unsigned integer comparison
  – Must first compare sign bits
  – Must consider −0 = 0
  – NaNs are problematic
    • Will be greater than any other values
  – Otherwise OK
    • Denorm vs. normalized
    • Normalized vs. infinity

Round Mode (p. 89)
• Round down:
  – rounded result is close to but no greater than the true result.
• Round up:
  – rounded result is close to but no less than the true result.
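The 8-bit format tabulated above (1 sign bit, 4 exp bits, 3 frac bits, Bias 7) can be decoded with a few lines of C. This is a sketch with a hypothetical function name; it handles only denormalized and normalized encodings and asserts that exp is not 1111 (infinity and NaN are left out to keep it short).

```c
#include <assert.h>
#include <stdint.h>

/* Value of an 8-bit float: s (bit 7), exp (bits 6-3), frac (bits 2-0). */
static double fp8_value(uint8_t bits) {
    unsigned s    = bits >> 7;
    unsigned exp  = (bits >> 3) & 0xF;
    unsigned frac = bits & 0x7;
    assert(exp != 0xF);            /* inf and NaN are not handled here */

    double m;                      /* significand M */
    int e;                         /* exponent E */
    if (exp == 0) {                /* denormalized: M = 0.frac, E = 1 - Bias */
        m = frac / 8.0;
        e = 1 - 7;
    } else {                       /* normalized: M = 1.frac, E = Exp - Bias */
        m = 1.0 + frac / 8.0;
        e = (int)exp - 7;
    }

    double v = m;                  /* scale by 2^E; exact in double */
    for (; e > 0; e--) v *= 2.0;
    for (; e < 0; e++) v /= 2.0;
    return s ? -v : v;
}
```

For instance, `fp8_value(0x07)` (0 0000 111) is 7/512, the largest denormalized value; `fp8_value(0x08)` (0 0001 000) is 8/512, the smallest normalized value; and `fp8_value(0x77)` (0 1110 111) is 240, the largest finite value.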
Round Mode (p. 90, Figure 2.25)

    Mode                 1.40   1.60   1.50   2.50   -1.50
    Round-to-even        1      2      2      2      -2
    Round-toward-zero    1      1      1      2      -1
    Round-down           1      1      1      2      -2
    Round-up             2      2      2      3      -1

Round-to-Even
• Default rounding mode
  – Historically hard to get any other kind without dropping into assembly (C99 later added fesetround in <fenv.h>)
  – All others are statistically biased
    • Sum of a set of positive numbers will consistently be over- or underestimated
• Applying to other decimal places (p. 89)
  – When exactly halfway between two possible values
    • Round so that the least significant digit is even
  – E.g., round to nearest hundredth
    1.2349999 → 1.23 (less than halfway)
    1.2350001 → 1.24 (greater than halfway)
    1.2350000 → 1.24 (halfway: round up)
    1.2450000 → 1.24 (halfway: round down)

Rounding Binary Number (p. 89)
• "Even" when the least significant bit is 0
• Halfway when the bits to the right of the rounding position = 100… (binary)
• Examples (round to nearest 1/4, i.e. 2 bits right of the binary point)

    Value    Binary     Rounded   Action   Rounded value
    2 3/32   10.00011   10.00     Down     2
    2 3/16   10.0011    10.01     Up       2 1/4
    2 7/8    10.111     11.00     Up       3
    2 5/8    10.101     10.10     Down     2 1/2

Floating-Point Operations
• Conceptual view
  – First compute the exact result
  – Make it fit into the desired precision
    • Possibly overflow if the exponent is too large
    • Possibly round to fit into frac

Mathematical Properties of FP Add
• Compare to those of an Abelian group
  – Closed under addition? YES
    • But may generate infinity or NaN
  – Commutative? YES
  – Associative? NO
    • Overflow and inexactness of rounding
  – 0 is additive identity? YES
  – Every element has an additive inverse? ALMOST
    • Except for infinities & NaNs
• Monotonicity
  – a ≥ b ⇒ a + c ≥ b + c? ALMOST
    • Except for infinities & NaNs

Algebraic Properties of FP Mult (p. 92)
• Compare to a commutative ring
  – Closed under multiplication? YES
    • But may generate infinity or NaN
  – Multiplication commutative? YES
  – Multiplication associative? NO
    • Possibility of overflow, inexactness of rounding
  – 1 is multiplicative identity? YES
  – Multiplication distributes over addition? NO
    • Possibility of overflow, inexactness of rounding
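The "Associative? NO" answers for floating-point addition and multiplication can be observed directly. The sketch below uses hypothetical helper names and assumes `double` is IEEE 754 double precision; each intermediate is stored in a `volatile double` so that it is rounded to double precision before the next operation.

```c
#include <assert.h>
#include <float.h>

/* True for ordinary values, false for infinities and NaN. */
static int is_finite(double x) {
    return x >= -DBL_MAX && x <= DBL_MAX;
}

/* (a + b) + c versus a + (b + c) */
static double add_left(double a, double b, double c) {
    assert(is_finite(a) && is_finite(b) && is_finite(c));
    volatile double t = a + b;
    return t + c;
}
static double add_right(double a, double b, double c) {
    assert(is_finite(a) && is_finite(b) && is_finite(c));
    volatile double t = b + c;
    return a + t;
}

/* (a * b) * c versus a * (b * c) */
static double mul_left(double a, double b, double c) {
    volatile double t = a * b;
    return t * c;
}
static double mul_right(double a, double b, double c) {
    volatile double t = b * c;
    return a * t;
}
```

With a = 3.14, b = 1e20, c = −1e20, `add_left` returns 0.0 because 3.14 is lost when rounding 3.14 + 1e20, while `add_right` returns 3.14. With a = b = 1e200 and c = 1e-200, `mul_left` overflows to infinity in the first product, while `mul_right` stays finite.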
Algebraic Properties of FP Mult (p. 90)
• Monotonicity
  – a ≥ b & c ≥ 0 ⇒ a * c ≥ b * c? ALMOST
    • Except for infinities & NaNs

FP Multiplication
• Operands
  – $(-1)^{s_1} M_1 2^{E_1}$ and $(-1)^{s_2} M_2 2^{E_2}$
• Exact result
  – $(-1)^s M\, 2^E$
  – Sign s: s1 ^ s2
  – Significand M: M1 * M2
  – Exponent E: E1 + E2
• Fixing
  – If M ≥ 2, shift M right, increment E
  – If E out of range, overflow
  – Round M to fit frac precision

FP Addition
• Operands
  – $(-1)^{s_1} M_1 2^{E_1}$ and $(-1)^{s_2} M_2 2^{E_2}$
  – Assume E1 > E2
• Exact result
  – $(-1)^s M\, 2^E$
  – Sign s, significand M:
    • Result of signed align & add
  – Exponent E: E1
• Fixing
  – If M ≥ 2, shift M right, increment E
  – If M < 1, shift M left k positions, decrement E by k
  – Overflow if E out of range
  – Round M to fit frac precision
[Diagram: $(-1)^{s_2} M_2$ is shifted right by E1 − E2 positions to align it with $(-1)^{s_1} M_1$; the aligned values are added to give $(-1)^s M$.]

Floating Point in C: Puzzles and Answers
• int x = …;
• float f = …;
• double d = …;
• Assume neither d nor f is NaN or infinity

    Expression                     Answer
    x == (int)(float) x            No: 24-bit significand
    x == (int)(double) x           Yes: 53-bit significand
    f == (float)(double) f         Yes: increases precision
    d == (float) d                 No: loses precision
    f == -(-f)                     Yes: just changes the sign bit
    2/3 == 2/3.0                   No: 2/3 == 0
    d < 0.0 ⇒ ((d*2) < 0.0)        Yes!
    d > f ⇒ -f < -d                No: negation reverses the order, so -f > -d
    d * d >= 0.0                   Yes!
    (d+f)-d == f                   No: not associative

• C guarantees two levels
  – float: single precision
  – double: double precision
• Conversions
  – Casting between int, float, and double changes numeric values
  – double or float to int
    • Truncates the fractional part
    • Like rounding toward zero
    • Not defined when out of range
      – Commonly yields TMin or TMax, depending on the platform
  – int to double
    • Exact conversion, as long as int has ≤ 53-bit word size
  – int to float
    • Will round according to the rounding mode
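A few of the puzzle answers above can be reproduced with small C helpers. This is a sketch, not part of the original slides: the helper names are hypothetical, and it assumes a 32-bit `int`, IEEE 754 `float` and `double`, and the default round-to-even mode. The `volatile` temporaries force each converted or intermediate value to be rounded to its declared type.

```c
#include <assert.h>
#include <limits.h>

/* x == (int)(float)x ? Fails once x needs more than 24 significant bits. */
static int int_survives_float(int x) {
    /* Stay clear of INT_MAX so that (int)f cannot go out of range. */
    assert(x >= -INT_MAX + 128 && x <= INT_MAX - 128);
    volatile float f = (float)x;
    return x == (int)f;
}

/* x == (int)(double)x ? Always true for a 32-bit int (53-bit significand). */
static int int_survives_double(int x) {
    volatile double d = (double)x;
    return x == (int)d;
}

/* (d + f) - d == f ? Fails when f is lost in rounding d + f. */
static int add_sub_recovers(double d, float f) {
    volatile double t = d + f;
    return (t - d) == f;
}

/* 2/3 == 2/3.0 ? Integer division gives 0, FP division gives 0.666... */
static int int_vs_fp_division(void) {
    return 2 / 3 == 2 / 3.0;
}
```

For instance, `int_survives_float(16777216)` (2^24) is true, but `int_survives_float(16777217)` is false because 2^24 + 1 rounds to 2^24 as a `float`. Similarly, `add_sub_recovers(1e20, 3.14f)` is false, while `add_sub_recovers(1.0, 2.0f)` is true.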
