* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download IOI/ACM/Supercom 2004 Training
Survey
Document related concepts
Transcript
IOI 2005 Training Dr Kan Min-Yen Topics and outline Sorting Computer Arithmetic and Algebra Invariants and Number Theory Sorting Problems and Parameters Comparison-based sorting Non-comparison-based www.personaltouchmailing.com What sort of problems? Sorting is usually not the end goal, but a prerequisite (Efficient) searching! Uniqueness / Duplicates Prioritizing (c.f. priority queues) Median / Selection Frequency Counting Set operations Target Pair Two properties of sorting Stable Items with the same value are ordered in the same way after the sort as they were before Important for doing multiple stage sorts E.g., sorting by First name, Last name In-place: sorts the items without needing extra space Space proportional to the size of the input n Comparison-based sort Based on comparing two items Many variants, but not the only way to sort Discuss only the important ones for programming contests Selection, Insertion, Merge and Quick Comparison Sorts Comparison Sort Animation: http://math.hws.edu/TMCM/java/xSortLab/ Selection Algo: find min or max of unsorted portion Heapsort: is selection sort with heap data structure Remember: Minimize amount of swaps Insertion Algo: insert unsorted item in sorted array Remember: Minimize amount of data Comparison Sorts Merge Idea: divide and conquer, recursion Algo: merge two sorted arrays in linear time Remember: Don’t recurse to the base case if using this sort, not inplace Quick Idea: randomization, pivot Algo: divide problem to smaller and larger half based on a pivot Remember: Partition can be used to solve problems too! A problem with sorting as its core may not be best solved with generic tool. Think before using a panacea like Quicksort. Miscellaneous sort Most sorting relies on comparisons between two items. Proven to be Θ(n log n) Example: sort an array of distinct integers ranged 1-k We’ll go over Radix sort Counting sort What is radix? Radix is the same as base. In decimal system, radix = 10. For example, the following is a decimal number with 4 digits 0 3 2 8 1st Digit 2nd Digit 3rd Digit 4th Digit Radix Sort Suppose we are given n d-digit decimal integers A[0..n-1], radix sort tries to do the following: for j = d to 1 { By ignoring digit-1 up to digit-(j-1), form the sorted array of the numbers } Radix Sort (Example) Original 0123 2043 9738 1024 0008 2048 1773 1239 0 Group using 4th digit Sorted Array if we only consider digit-4 1 2 3 0123, 2043, 1773 Ungroup 4 1024 5 6 7 8 9738, 0008, 2048 9 1239 0123 2043 1773 1024 9738 0008 2048 1239 Radix Sort (Example) 0 0008 Original 0123 2043 1773 1024 9738 0008 2048 1239 Group using 3rd digit 1 2 0123, 1024 3 9738, 1239 4 2043, 2048 5 6 7 1773 8 9 Sorted Array if we only consider digits- 0008 3 and 4 0123 Ungroup 1024 9738 1239 2043 2048 1773 Sorted Array if we only consider digit-2, 3, and 4 0 0008, 1024, 2043, 2048 0008 1 0123 1024 2 1239 Radix Sort (Example) Original 0008 0123 1024 9738 1239 2043 2048 1773 Group using 2nd digit 3 4 5 6 7 9738, 1773 8 9 Ungroup 2043 2048 0123 1239 9738 1773 Radix Sort (Example) 1024 2043 2048 0123 1239 9738 1773 Done! 0008 0 0008, 0123 Original 0008 Group using 1st digit 1 1024, 1239, 1773 2 2043, 2048 3 4 5 6 7 8 9 9738 0123 Ungroup 1024 1239 1773 2043 2048 9738 Details on Radix Sort A stable sorting algorithm Done from least signficant to most signficant Can be used with a higher base for better efficiency 1. 2. 3. Decide whether it’s really worth it Works for integers, but not real, floating point 4. But see: http://codercorner.com/RadixSortRevisited.htm The combination of 1 and 2 can be used for combining sorts in general Counting Sort Works by counting the occurrences of each data value. Assumes that there are n data items in the range of 1..k The algorithm can then determine, for each input element, the amount of elements less than it. For example if there are 9 elements less than element x, then x belongs in the 10th data position. These notes are from Cardiff’s Christine Mumford: http://www.cs.cf.ac.uk/user/C.L.Mumford/ Counting Sort The first for loop initialises C[ ] to zero. The second for loop increments the values in C[], according to their frequencies in the data. The third for loop adds all previous values, making C[] contain a cumulative total. The fourth for loop writes out the sorted data into array B[]. countingsort(A[], B[], k) for i = 1 to k do C[i] = 0 for j = 1 to length(A) do C[A[j]] = C[A[j]] + 1 for 2 = 1 to k do C[i] = C[i] + C[i-1] for j = 1 to length(A) do B[C[A[j]]] = A[j] C[A[j]] = C[A[j]] - 1 Counting Sort Demo from Cardiff: http://www.cs.cf.ac.uk/user/C.L.Mumford/trist an/CountingSort.html Counting and radix sort What’s their complexity? Why do they work so fast?? No comparisons are made Both are stable sorts, but not in-place Radix sort: O(dn) = O(n), if d << n Counting sort: 2k + 2n = O(n), if k << n Can you fix them to be in-place? When to use? When you have lots of items in a fixed range Quiz and discussion There are no right answers… Which sort better used for a very large random array? For sorting a deck of cards? For sorting a list of first name, last names? For sorting single English words For sorting an almost sorted set? Computer Arithmetic and Algebra Big Numbers Arbitrary precision arithmetic Multiprecision arithmetic Computer Algebra Dealing with algebraic expressions a la Maple, Mathematica Arithmetic Want to do standard arithmetic operations on very large/small numbers Can’t fit representation of numbers in standard data types What to do? Sassy answer: use java and BigInt class Two representations How to represent: 10 00000 00000 02003 Linked list: 1e16 2e3 3e0 Array: <as above> Dense representation Good for numbers with different or arbitrary widths Where should the head of the linked list point to? Sparse representation Good for problems in the general case Don’t forget to store the sign bit somewhere Which base to choose: 10 or 32? How to represent arbitrary large real numbers? Standard operations Most large number operations rely on techniques that are the same as primary school techniques Adding Subtracting Multiplying Dividing Exponentiation / Logarithms Algorithms for big numbers Addition of bigints x and y Complicated part is to deal with the carry What about adding lots of bigints together? Solution: do all the adds first then worry about the carries Canonicalization and adding Canonicalizing 12E2 + 34E0 => 1E3 + 2E2 + 3E1 + 4E0 Strip coefficient of values larger than B and carry to next value Reorder sparse representation if necessary Adding Iteratively add all Ai Bi … Zi then canonicalize Hazard: Data type for each entry must be able to hold maximum expected data or overflow will occur Algorithms for big numbers Subtraction Like addition but requires borrowing (reverse carry) Place the higher absolute magnitude number at the top Applicable to addition of mixed sign numbers Comparison Start from higher order bits then work backwards Multiplying Given two big integers X and Y in canonical form: the big integer Z = X*Y can be obtained thanks to the formula: Notes: This is the obvious way we were taught in primary school, the complexity is Θ(N2) Canonicalize after each step to avoid overflow in coefficients To make this operation valid, B2 must fit in the coefficient Shift Multiplication If a number is encoded in base B Left shift multiplies by B Right shift multiples by B Yet another reason to use arrays vs. linked lists If your problem requires many multiplications and divisions by a fixed number b, consider using b as the base for the number Karatsuba Multiplication We can split two big numbers in half: X = X0 + B X1 and Y = Y0 + B Y1 Then the product XY is given by (X0 + BX1) (Y0 + BY1) This results in three terms: X0Y0 + B (X0Y1 + X1Y0) + B2(X1Y1) Look at the middle term. We can get the middle term almost for free by noting that: (X00 + +X X11)) (Y (Y00 + X00Y Y00 + X Y11 (X + YY11)) = X + XX11YY00 + XX11Y X00Y Y11 + = - Karatsuba Demonstration 12 * 34 Given: X1 = 1, X0 = 2, Y1 = 3, Y0 = 4 Calculate: X0Y0 = 8, X1Y1 = 3 (X0+X1)(Y0+Y1) = 3*7 = 21 Final middle term = 21 – 8 – 3 = 10 Solution: 8 + 10 * 101 + 3 * 102 = 408 Notes: 1.5) Recursive, complexity is about Θ(n There’s a better way: FFT based multiplication not taught here that is Θ (n log n) http://numbers.computation.free.fr/Constants/Algorithms/fft.html Division Algo: long division (Skiena and Revilla, pg 109): Iterate shifting the remainder to the left including the next digit Subtract off instances of the divisor Demo: http://www.mathsisfun.com/long_division2.html Exponentiation How do you calculate x256? x256 = ((((((((x2)2)2)2)2)2)2) What about x255? Store x, x2, x4, x8, x16, x32, x64, x128 on the way up. Complexity Θ(log n) Computer Algebra Applicable when you need an exact computation of something 1/7 does not exactly equal 0.142857142857 1/7 * 7 = 1 Or when you need consider polynomials with different variables Data structures How to represent x5 + 2x – 1? Dense (array): 15 04 03 02 21 -10 Sparse (linked list): [1,5][2,1][-1,0] What about arbitrary expressions? Solution: use an expression tree (e.g. a+b*c) Simplification: Introduction But CA systems have to deal with equivalency between different forms: So an idea is to push all equations to a canonical form. (x-2) (x+2) X2 – 4 Like Maple “simplify” Question: how do we implement simplify? Simplify - Transforming Negatives Why? Kill off subtraction, negation Addition is commutative To think about: How is this related to big integer computation? Simplify – Leveling Operators Combine like binary trees to n-ary trees Simplify – Rational Expressions Expressions with * and / will be rewritten so that there is a single / node at the top, with only * operators below Simplify – Collecting Like Terms Exact / Rational Arithmetic Need to store fractions. What data structure can be used here? Problems: Keep 4/7 as 4/7 but 4/8 as 1/2 How to do this? Simplification and factoring Addition and subtraction need computation of greatest common denominator Note division of a/b c/d is just a/b d/c Invariants Sometimes a problem is easier than it looks Look for another way to define the problem in terms of indirect, fixed quantities Invariants Chocolate bar problem You are given a chocolate bar, which consists of r rows and c columns of small chocolate squares. Your task is to break the bar into small squares. What is the smallest number of splits to completely break it? Variants of this puzzle? Breaking irregular shaped objects Assembling piles of objects Fly flying between two trains Adding all integers in a range (1 … 1000) Also look for simplifications or irrelevant information in a problem The Game of Squares and Circles Start with c circles and s squares. On each move a player selects two shapes. These two are replaced by one according to the following rule: Identical shapes are replaced with a square. Different shapes are replaced with a circle. At the end of the game, you win if the last shape is a circle. Otherwise, the computer wins. What’s the key? Parity of circles and squares is invariant After every move: if # circles was even, still will be even If # circles was odd, still will be odd Sam Loyd’s Fifteen Puzzle Given a configuration, decide whether it is solvable or not: 8 7 12 15 1 5 2 13 6 3 9 14 4 10 11 Possible? 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Key: Look for an invariant over moves Solution to Fifteen Look for inversion invariance Inversion: when two tiles out of order in a row inverted not inverted 11 10 11 For a given puzzle configuration, let N denote the sum of: 10 the total number of inversions the row number of the empty square. After a legal move, an odd N remains odd whereas an even N remains even. Fifteen invariant a b c d Note: if you are asked for optimal path, then it’s a search problem Story time: Euclid of Alexandria (~325 BC – 265 BC) How many others have been on the bestseller list for over 1,000 years? Greek geometer Led school in Alexandria Probably wroteThe Elements, the definitive math text until the 19th century Developed geometry, number theory and others from a set of 5 postulates Knowledge survived Dark Ages in the Western world as it was translated into Arabic Euclidian Algorithm for GCD Rational calculation require the calculation of the greatest common divisor The Euclidean algorithm is a good way to do this. It’s a recursive procedure: gcd(N,M) = gcd(M, N mod M) Demo: http://www.math.umn.edu/~garrett/crypto/a01/ Euclid.html Sieve of Eratosthenes Initialization: Keep a boolean array [1...n] = PRIME. We will change entries to COMPOSITE as we go on. Algorithm: Set k=1. iterate towards √n Find the first PRIME in the list greater than k. (e.g., 2.) Call it m. Change the numbers 2m, 3m, 4m, ... to COMPOSITE. Set k=m and repeat. Sieve of Eratosthenes Demo from Univ. of Utah (Peter Alfeld): http://www.math.utah.edu/~alfeld/ Eratosthenes.html Notes: If at n in the sieve, we know that all numbers not crossed out under n2 are also prime. Review Time For self-review Other topics related that you should cover on your own Roman number conversion: these problems crop up every once in a while Playing cards and other common games Divisibility criteria and modular arithmetic For Fun: Monty Hall Suppose you're on a game show, and you're given the choice of three doors. Behind one door is a car, behind the others, goats. You pick a door, say number 1, and the host, who knows what's behind the doors, opens another door, say number 3, which has a goat. He says to you, "Do you want to pick door number 2?" Is it to your advantage to switch your choice of doors? Analysis of the Monty Hall Dilemma Door case A B C 1 bad bad good 2 bad good bad 3 good bad bad assume you choose door A: you have a 1/3 chance of a good prize. But (this is key) Monty knows what is behind each door, and shows a bad one. In cases 1 and 2, he eliminates doors B and C respectively (which happen to be the only remaining bad door) so a good door is left: SWITCH! Only in case 3 (you lucked out in your original 1 in 3 chances) does switching hurt you. References Books and websites used to compile this lecture References Skiena, S. and Revilla, M. Programming Challenges Gathen, J. and Gerhard, J. Modern Computer Algebra References Arbitrary-Precision Arithmetic http://www2.toki.or.id/book/AlgDesignManual/BOOK/BOOK4/NODE144.HTM * World of Seven http://www.comp.nus.edu.sg/~stevenha/programming/programming.html Common Mistakes in Online and Real-time Programming Contests http://www.acm.org/crossroads/xrds7-5/contests.html Mathematical Constants and Computation http://numbers.computation.free.fr/Constants/constants.html Euclid’s Elements http://mathworld.wolfram.com/Elements.html Monty Hall Problem http://www.cut-the-knot.org/hall.shtml Cut the Knot http://www.cut-the-knot.org/