EECS 560 Project 2
Due: November 6, 2003

Description:
In this project you will compare several data structures that can be used to implement priority queues. You will randomly generate the data for each structure and then time random series of operations, performing multiple tests on different data sets of the same size, in order to compare the efficiency of the structures for the different operations. The data structures to be tested are heaps, d-heaps, leftist heaps, weight-balanced leftist heaps and pairing heaps. The random generation process is described below. The most important thing in testing is to be sure that the set of data is identical for each type of data structure. Thus, it might be easiest to do the runs on each structure in a separate program.

Part 1:
In this part you are to look at d-heaps for various values of d and compare the efficiency of the insert and deletemin operations for the following d values: 2 (the regular binary heap), 3, 5, and 7. You will build the d-heaps for various values of n by randomly generating a total of n values between 1 and 4n. There will be duplicate values, but that’s O.K. Use the following values of n: 16,000, 32,000, 64,000, 128,000 and 256,000. You may build the d-heaps either by inserting the values one at a time or by using a modification of buildheap. Do not include the time required to build the initial structure in your timing tests.

Part 2:
In the second part of the project you will compare the binary heap, leftist heaps, weight-balanced leftist heaps and pairing heaps (described below). To keep things simple, use buildheap to get the original structure for each of the heap variations. Thereafter use the insert and deletemin operations defined for that structure. The original data is to be generated exactly as in Part 1, again using n = 16,000, 32,000, 64,000, 128,000, and 256,000.
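As an illustration of the d-heap operations named in Part 1, here is a minimal Python sketch of an array-based min d-heap with insert, deletemin, and a buildheap-style constructor. The project does not prescribe a language or an interface, so the class name DHeap, its method names, the seed value, and the small n used in the demo are all assumptions made for this example.

```python
import random


class DHeap:
    """Array-based min d-heap (illustrative sketch).

    The children of the node at index i are at indices d*i + 1 .. d*i + d,
    and the parent of index i > 0 is at (i - 1) // d.
    """

    def __init__(self, d, items=()):
        if d < 2:
            raise ValueError("d must be at least 2")
        self.d = d
        self.a = list(items)
        # buildheap: sift down every internal node, last parent first.
        for i in range((len(self.a) - 2) // d, -1, -1):
            self._sift_down(i)

    def __len__(self):
        return len(self.a)

    def insert(self, x):
        # Append a hole at the end and percolate it up past larger parents.
        a, d = self.a, self.d
        a.append(x)
        i = len(a) - 1
        while i > 0:
            parent = (i - 1) // d
            if a[parent] <= x:
                break
            a[i] = a[parent]
            i = parent
        a[i] = x

    def deletemin(self):
        a = self.a
        if not a:
            raise IndexError("deletemin from an empty heap")
        smallest = a[0]
        last = a.pop()
        if a:
            # Move the last element to the root and restore heap order.
            a[0] = last
            self._sift_down(0)
        return smallest

    def _sift_down(self, i):
        a, d, n = self.a, self.d, len(self.a)
        x = a[i]
        while True:
            first = d * i + 1
            if first >= n:
                break
            # Find the smallest of the (up to d) children.
            child = min(range(first, min(first + d, n)), key=a.__getitem__)
            if a[child] >= x:
                break
            a[i] = a[child]
            i = child
        a[i] = x


# Demo: the same data set (n values between 1 and 4n) for every d.
n = 1000                      # small n for illustration only
rng = random.Random(560)      # arbitrary seed, chosen for this example
data = [rng.randint(1, 4 * n) for _ in range(n)]
heaps = {d: DHeap(d, data) for d in (2, 3, 5, 7)}
```

Because every heap in the demo is built from the same list, each value of d sees identical data, which is the fairness condition the description asks for.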

A weight-balanced leftist heap is a leftist heap in which the weight of the node, rather than its null path length, is used to maintain the structure. The weight of a node is the number of nodes in the subtree rooted at that node. For every node in the weight-balanced leftist heap, the weight of the left child must be greater than or equal to the weight of the right child.

The pairing heap is a heap-ordered M-ary tree for which all operations except deletion take constant worst-case time. To merge two pairing heaps, the heap with the larger root is made the first (leftmost) child of the heap with the smaller root. As with leftist heaps, an insertion is accomplished by doing a merge with a one-element heap. For a deletemin operation, the root of the heap is removed, leaving k trees to be merged back into a heap. This will require k - 1 merges. This should be done in two passes. First, scan left to right, merging the trees in pairs, i.e., merge the first and second trees, then the third and fourth, etc. (If the number of trees to be merged is odd, then the three rightmost trees are merged into one tree at the end of the first pass.) Then scan back from right to left, each time merging the rightmost tree remaining from the first pass with the current merged result. The actual heap implementation is done with a left-child/right-sibling representation as illustrated below.

Using a random number generator
To ensure that the timing tests are “fair,” you must use the same seed for the random number generator for each structure. Note: you only seed the random number generator once, at the beginning of a run. If you reseed it during the timing tests you will effectively be testing the same data set each time.

Structuring the timing tests
After getting the values for the initial heap, here's the way to generate your test data for a random sequence of operations:
1. Generate a random integer between 2n and 5n. This will be the number of operations to perform. Let's call it M.
2. Perform these steps M times:
   a. Generate a random number x such that 0 ≤ x ≤ 1.
   b. If 0 ≤ x < 0.4, perform a deletemin operation. If 0.4 ≤ x ≤ 1, generate a random integer y such that 1 ≤ y ≤ n and insert y.
Then, generate the data to initialize a new structure and repeat the process above. To get the average time, you must run a minimum of 10 timing tests for each structure for each value of n. To get better timing results, it’s a good idea to run even more tests for the smaller data sets. Do all tests on lists of the same size in the same run of a program.

To determine experimentally the complexity of the two individual operations, first generate heaps of the indicated size. Then, instead of randomly generating operations, do the following:
1. Randomly generate n/2 integers between 2n and 5n and insert each into the heap as soon as it is generated.
2. Perform n/2 deletemin operations.
Again, run a minimum of 10 tests for each structure for each value of n. Get separate timing results for each of the two operations in order to try to determine the complexity of each operation for each structure.

Requirements:
1. To ensure that the timing tests are “fair,” use the same seed for the random number generator for each of the different types of heaps. To get accurate timing you will probably need to run more than ten tests on the short lists. Also record and report the average number of deletemin operations and insertions for each group of tests. For example, if you run 20 tests on a heap initially containing 16,000 integers, find the average M value for the 20 tests and the average number of deletemin and insert operations per test.
2. You must clearly show that each of your operations is working correctly. (This should be done before you do the timing.)
It will probably be easiest to do this in a separate program so that additional code can be used to print out information about the heaps, and/or to read from a data file. For example, after each deletemin and insertion, print out the heap. You can use relatively small data sets (< 50) to illustrate the correctness of your implementation.

To turn in:
1. A listing of your well-documented code, and copies of all test files used to verify the correctness of your code, and the results of those tests.
2. A written report that includes:
   a. A complete description of how your testing was done. This should include the number of tests, when the timing was started and stopped, any problems you ran into, etc.
   b. A tabular summary of the data obtained for each set of tests. You do not need to provide the actual data, but may do so if you wish.
   c. An analysis of your data and conclusions you can draw from it. At a minimum this should include graphs to indicate the complexity of the deletemin and insert operations for each of the structures (plot size vs. time).

[Figure: A pairing heap (above) and the left-child right-sibling representation (below)]
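
The pairing heap operations described earlier (merge, insertion by merging with a one-element heap, and the two-pass deletemin) can be sketched on the left-child/right-sibling representation as follows. This is a minimal Python sketch under stated assumptions, not a required implementation: the names Node, merge, merge_siblings and PairingHeap are invented for this example, and the handling of an odd number of trees follows the rule given in the text (the leftover tree is folded into the last pair at the end of the first pass).

```python
class Node:
    """One pairing-heap node in left-child/right-sibling form."""
    __slots__ = ("key", "child", "sibling")

    def __init__(self, key):
        self.key = key
        self.child = None     # leftmost child
        self.sibling = None   # next sibling to the right


def merge(a, b):
    """Merge two detached roots: the larger root becomes the leftmost
    child of the smaller root. Either argument may be None."""
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a
    b.sibling = a.child
    a.child = b
    return a


def merge_siblings(first):
    """Two-pass merge of a sibling list into a single heap."""
    # Detach the k sibling trees from one another.
    trees = []
    while first is not None:
        nxt = first.sibling
        first.sibling = None
        trees.append(first)
        first = nxt
    if not trees:
        return None
    # Pass 1: left to right, merge the trees in pairs.
    paired = [merge(trees[i], trees[i + 1])
              for i in range(0, len(trees) - 1, 2)]
    if len(trees) % 2 == 1:
        if paired:
            # Odd count: fold the leftover tree into the last pair.
            paired[-1] = merge(paired[-1], trees[-1])
        else:
            paired.append(trees[-1])
    # Pass 2: right to left, merge each tree with the running result.
    result = paired[-1]
    for tree in reversed(paired[:-1]):
        result = merge(tree, result)
    return result


class PairingHeap:
    def __init__(self):
        self.root = None
        self.size = 0

    def insert(self, key):
        # Insertion is a merge with a one-element heap.
        self.root = merge(self.root, Node(key))
        self.size += 1

    def deletemin(self):
        if self.root is None:
            raise IndexError("deletemin from an empty heap")
        smallest = self.root.key
        # Removing the root leaves its children as k separate trees.
        self.root = merge_siblings(self.root.child)
        self.size -= 1
        return smallest
```

The sibling list is processed iteratively rather than recursively, so a root with many children (for example, after a long run of insertions) does not depend on Python's recursion limit.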