Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Announcements • Assn 4 is posted. Note that due date is the 12th (Monday) at 7pm. (Last assignment!) • Final Exam on June 15, 9am. Room GOO247. • Assn 3 solution is posted. Spring 2006 CISC101 - Prof. McLeod 1 Last Time • Variable Scope and Lifetime • Sorting Spring 2006 CISC101 - Prof. McLeod 2 Today • Searching – Sequential search – Binary search • Source and effects of round-off error. • Start working on assignment? • Exam prep (mostly Thursday). Spring 2006 CISC101 - Prof. McLeod 3 Searching • Sequential search pseudocode: • Loop through array starting at the first element until the value of target matches one of the array elements. • If a match is not found, return –1. Spring 2006 CISC101 - Prof. McLeod 4 Sequential Search • As code: public int seqSearch(int[] A, int target) { for (int i = 0; i < A.length; i++) if (A[i] == target) return i; return –1; } Spring 2006 CISC101 - Prof. McLeod 5 Sequential Search, Cont. • Works on any kind of dataset. • Finds the first matching element – does not say anything about how many matches there are! • Best case – one comparison. • Worst case (target not found) – n comparisons (n is the same as A.length. • On average - n/2 comparisons. Spring 2006 CISC101 - Prof. McLeod 6 Searching in an Ordered Dataset • How do you find a name in a telephone book? Spring 2006 CISC101 - Prof. McLeod 7 Binary Search • Binary search pseudocode. Only works on ordered sets: a) Locate midpoint of array to search. b) Determine if target is in lower half or upper half of array. • • • Spring 2006 If in lower half, make this half the array to search. If in upper half, make this half the array to search. Loop back to step a), unless the size of the array to search is one, and this element does not match, in which case return –1. CISC101 - Prof. McLeod 8 Binary Search – Cont. public int binSearch (int[] A, int key) { int lo = 0; int hi = A.length - 1; int mid = (lo + hi) / 2; while (lo <= hi) { if (key < A[mid]) hi = mid else if (A[mid] < lo = mid + else return mid; mid = (lo + hi) / } return -1; } 2006 Spring 1; key) 1; 2; CISC101 - Prof. McLeod 9 Binary Search – Cont. • For the best case, the element matches right at the middle of the array, and the loop only executes once. • For the worst case, key will not be found and the maximum number of loops will occur. • Note that the loop will execute until there is only one element left that does not match. • Each time through the loop the number of elements left is halved, giving the progression below for the number of elements: Spring 2006 CISC101 - Prof. McLeod 10 Binary Search – Cont. • Number of elements to be searched progression: n n n n n, , 2 , 3 , ..., m 2 2 2 2 • The last comparison is for n/2m, when the number of elements is one (worst case). • So, n/2m = 1, or n = 2m. • Or, m = log2(n). • So, the algorithm loops log(n) times for the worst case. Spring 2006 CISC101 - Prof. McLeod 11 Binary Search - Cont. • Binary search with log(n) iterations for the worst case is much better than n iterations for sequential search! • Major reason to sort datasets! Spring 2006 CISC101 - Prof. McLeod 12 From Before: • A float occupies 4 bytes • A double occupies 8 bytes of memory • (A byte is 8 bits, where a bit is zero or one.) Spring 2006 CISC101 - Prof. McLeod 13 Storage of Real Numbers • So, a real number can only occupy a finite amount of storage in memory. • This effect is very important for two kinds of numbers: – Numbers like 0.1 that can be written exactly in base 10, but cannot be stored exactly in base 2. – Real numbers (like or e) that have an infinite number of digits in their “real” representation can only be stored in a finite number of digits in memory. • And, we will see that it has an effect on the accuracy of mathematical operations. Spring 2006 CISC101 - Prof. McLeod 14 Storage of “Real” or “Floating-Point” Numbers - Cont. • Consider 0.1: (0.1)10 = (0.0 0011 0011 0011 0011 0011…)2 • What happens to the part of a real number that cannot be stored? • It is lost - the number is truncated. • The “lost part” is called the “roundoff error”. Spring 2006 CISC101 - Prof. McLeod 15 Storage of “Real” or “Floating-Point” Numbers - Cont. • Back to the storage of 0.1: • Compute: 10000 0.1 i 1 • And, compare to 1000. float sum = 0; for (int i = 0; i < 10000; i++) sum += 0.1F; System.out.println(sum); Spring 2006 CISC101 - Prof. McLeod 16 Storage of “Real” or “Floating-Point” Numbers - Cont. • Prints a value of 999.9029 to the screen. • If sum is declared to be a double then the value: 1000.0000000001588 is printed to the screen. • So, the individual roundoff errors have piled up to contribute to a cumulative error in this calculation. • As expected, the roundoff error is smaller for a double than for a float. • On the following charts, the x-axis is the counter, i, from the code shown above. Spring 2006 CISC101 - Prof. McLeod 17 float Accumulation Progress of float Roundoff Error 4.00E-02 2.00E-02 0.00E+00 -2.00E-02 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 -4.00E-02 -6.00E-02 -8.00E-02 -1.00E-01 -1.20E-01 Spring 2006 CISC101 - Prof. McLeod 18 double Accumulation Progress of double Roundoff Error 1.80E-10 1.60E-10 1.40E-10 1.20E-10 1.00E-10 8.00E-11 6.00E-11 4.00E-11 2.00E-11 0.00E+00 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 -2.00E-11 Spring 2006 CISC101 - Prof. McLeod 19 Roundoff Error • This error is referred to in two different ways: • The absolute error: absolute error = |x - xapprox| • The relative error: relative error = (absolute error) |x| Spring 2006 CISC101 - Prof. McLeod 20 Roundoff Error - Cont. • So for the calculation of 1000 as shown above, the errors are: Type Absolute Relative float 0.0971 9.71E-5 double 1.588E-10 1.588E-13 • The relative error on the storage of 0.1 is the absolute error divided by 1000. Spring 2006 CISC101 - Prof. McLeod 21 The Effects of Roundoff Error • Roundoff error can have an effect on any arithmetic operation carried out involving real numbers. • For example, consider subtracting two numbers that are very close together: • Use the function f ( x ) 1 cos( x ) for example. As x approaches zero, cos(x) approaches 1. Spring 2006 CISC101 - Prof. McLeod 22 The Effects of Roundoff Error • Using double variables, and a value of x of 1.0E-12, f(x) evaluates to 0.0. • But, it can be shown that the function f(x) can also be represented by f’(x): 2 sin ( x ) f ' ( x) f ( x) 1 cos( x ) • For x = 1.0E-12, f’(x) evaluates to 5.0E-25. • The f’(x) function is less susceptible to roundoff error. Spring 2006 CISC101 - Prof. McLeod 23 The Effects of Roundoff Error - Cont. • Another example. Consider the smallest root of the polynomial: ax2+bx+c=0: b b 4ac x1 2a 2 • What happens when ac is small, compared to b? • It is known that for the two roots, x1 and x2: c x1 x2 a Spring 2006 CISC101 - Prof. McLeod 24 The Effects of Roundoff Error - Cont. • Which leads to an equation for the root which is not as susceptible to roundoff error: x1 2c b b 4ac 2 • This equation approaches –c/b instead of zero when ac << b2. Spring 2006 CISC101 - Prof. McLeod 25 The Effects of Roundoff Error - Cont. • The examples above show what can happen when two numbers that are very close are subtracted. • Remember that this effect is a direct result of these numbers being stored with finite accuracy in memory. Spring 2006 CISC101 - Prof. McLeod 26 The Effects of Roundoff Error - Cont. • A similar effect occurs when an attempt is made to add a comparatively small number to a large number: boolean aVal = ((1.0E10 + 1.0E-20)==1.0E10); System.out.println(aVal); • Prints out true to the screen • Since 1.0E-20 is just too small to affect any of the bit values used to store 1.0E10. The small number would have to be about 1.0E-5 or larger to affect the large number. • So, keep this behaviour in mind when designing expressions! Spring 2006 CISC101 - Prof. McLeod 27 The Effect on Summations • Taylor Series are used to approximate many functions. For example: ( 1)i 1 xi ln( 1 x) i i 1 • For ln(2): ( 1)i 1 1 1 1 ln( 2) 1 ... i 2 3 4 i 1 Spring 2006 CISC101 - Prof. McLeod 28 The Effect on Summations • Since we cannot loop to infinity, how many terms would be sufficient? • Since the sum is stored in a finite memory space, at some point the terms to be added will be much smaller than the sum itself. • If the sum is stored in a float, which has about 7 significant digits, a term of about 1x10-8 would not be significant. So, i would be about 108 - that’s a lot of iterations! Spring 2006 CISC101 - Prof. McLeod 29 The Effect on Summations - Cont. • On testing using a float, it took 33554433 iterations and 25540 msec to compute! (sum no longer changing, value = 0.6931375) • Math.log(2) = 0.6931471805599453 • So, roundoff error had a significant effect and the summation did not even provide the correct value. A float could only provide about 5 correct significant digits, tops. • For double, about 1015 iterations would be required! (I didn’t try this one…) • So, this series does not converge quickly! Spring 2006 CISC101 - Prof. McLeod 30 The Effect on Summations - Cont. • Here is another way to compute natural logs: 1 1 x 2 i 1 ln x 2 1 x i 0 2i 1 • Using x = 1/3 will provide ln(2). Spring 2006 CISC101 - Prof. McLeod 31 The Effect on Summations - Cont. • For float, this took 8 iterations and <1msec (value = 0.6931472). • Math.log(2) = 0.6931471805599453 • For double, it took 17 iterations, <1 msec to give the value = 0.6931471805599451 • Using the Windows calculator ln(2) = 0.693147180559945309417232121458177 (!!) • So, the use of the 17 iterations still introduced a slight roundoff error. • The second version is much faster!! Spring 2006 CISC101 - Prof. McLeod 32 Numeric Calculations • Error is introduced into a calculation through two sources (assuming the formulae are correct!): – The inherent error in the numbers used in the calculation. – Error resulting from roundoff error. • Often the inherent error dominates the roundoff error. • But, watch for conditions of slow convergence or ill-conditioned matrices, where roundoff error will accumulate and end up swamping out the inherent error. Spring 2006 CISC101 - Prof. McLeod 33 Numeric Calculations - Cont. • Once a number is calculated, it is very important to be able to estimate the error using both sources, if necessary. • The error must be known in order that the number produced by your program can be reported in a valid manner. • This is a non-trivial topic in numeric calculation that we will not discuss in this course. Spring 2006 CISC101 - Prof. McLeod 34