EECS 560
Project 2
Due: November 6, 2003
Description: In this project you will compare several different data structures that can be
used to implement priority queues. This will require that you generate random sets of data to
compare the efficiencies of the various structures for the different operations. This will require
randomly generating the data for the structure and then timing random series of operations,
performing multiple tests on different data sets of the same size. The data structures to be
tested are heaps, d-heaps, leftist heaps, weight-balanced leftist heaps and pairing heaps.
The random generation process is described below. The most important thing in testing is to
be sure that the set of data is identical for each type of data structure. Thus, it might be
easiest to do the runs on each structure in a separate program.
Part 1: In this part you are to look at d-heaps for various values of d and compare the
efficiency of the insert and deletemin operations for the following d values: 2 (the regular
binary heap), 3, 5, and 7. You will build the d-heaps for various values of n by randomly
generating a total of n values between 1 and 4n. There will be duplicate values, but that’s O.K.
Use the following values of n: 16,000, 32,000, 64,000, 128,000 and 256,000. You may build
the d-heaps either by inserting the values one at a time or by using a modification of buildheap.
Do not include the time required to build the initial structure in your timing tests.
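As a rough illustration of Part 1, here is a minimal array-based d-heap sketch. It assumes Python, which the assignment does not prescribe, and the class name DHeap and its method names are made up for this example. Children of the node at index i are assumed to sit at indices d*i+1 through d*i+d, and the constructor uses one possible modification of buildheap.

```python
class DHeap:
    """Hypothetical min d-heap stored in a list (names are illustrative)."""

    def __init__(self, d, values=()):
        self.d = d
        self.a = list(values)
        # Modified buildheap: sift down every internal node, last parent first.
        for i in range((len(self.a) - 2) // d, -1, -1):
            self._sift_down(i)

    def _sift_down(self, i):
        a, d, n = self.a, self.d, len(self.a)
        while True:
            first = d * i + 1              # index of the leftmost child
            if first >= n:
                return                     # node i is a leaf
            # Find the smallest of the (up to d) children.
            smallest = min(range(first, min(first + d, n)), key=a.__getitem__)
            if a[smallest] >= a[i]:
                return
            a[i], a[smallest] = a[smallest], a[i]
            i = smallest

    def insert(self, x):
        a, d = self.a, self.d
        a.append(x)
        i = len(a) - 1
        # Percolate up: one comparison per level, about log_d(n) levels.
        while i > 0 and a[(i - 1) // d] > a[i]:
            p = (i - 1) // d
            a[i], a[p] = a[p], a[i]
            i = p

    def deletemin(self):
        a = self.a
        top = a[0]                         # raises IndexError on an empty heap
        last = a.pop()
        if a:
            a[0] = last
            self._sift_down(0)             # up to d comparisons per level
        return top
```

With this layout, a larger d makes the tree shallower, so insert does fewer comparisons, while deletemin does more comparisons per level. That trade-off is what the timing tests for d = 2, 3, 5 and 7 should expose.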
Part 2: In the second part of the project you will compare the binary heap, leftist heaps,
weight-balanced leftist heaps and pairing heaps (described below). In order to make things
easy, use the buildheap to get the original structure for each of the heap variations. Thereafter
use the insert and deletemin operations defined for that structure. The original data is to be
generated exactly as in part 1 again using n = 16,000, 32,000, 64,000, 128,000, and 256,000.
A weight-balanced leftist heap is a leftist heap in which the weight of the node rather than its
null path length is used to maintain the structure. The weight of a node is the number of nodes
in the subtree rooted at that node. For every node in the weight-balanced leftist heap, the
weight of the left child must be greater than or equal to the weight of the right child.
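The weight rule above can be sketched as a merge routine. This is one possible recursive formulation, not a required design. The names WNode, weight, merge, insert and deletemin are hypothetical, and Python is assumed only for illustration.

```python
class WNode:
    """Hypothetical node of a weight-balanced leftist heap."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.weight = 1        # number of nodes in the subtree rooted here


def weight(t):
    """Weight of a possibly empty subtree."""
    return t.weight if t is not None else 0


def merge(a, b):
    """Merge two heaps and return the new root."""
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a            # a now holds the smaller root
    a.right = merge(a.right, b)
    # Restore the invariant: weight(left) >= weight(right).
    if weight(a.left) < weight(a.right):
        a.left, a.right = a.right, a.left
    a.weight = 1 + weight(a.left) + weight(a.right)
    return a


def insert(root, key):
    """Insertion is a merge with a one-element heap."""
    return merge(root, WNode(key))


def deletemin(root):
    """Remove the root; return (minimum key, new root)."""
    return root.key, merge(root.left, root.right)
```

Because the left subtree is always at least as heavy as the right one, the right path stays logarithmic in length, so the recursion depth stays small even for n = 256,000.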
The pairing heap is a heap-ordered M-ary tree for which all operations except deletion take
constant worst-case time. To merge two pairing heaps, the heap with the larger root is made
the first (leftmost) child of the heap with the smaller root. As with leftist heaps, an insertion is
accomplished by doing a merge with a one element heap. For a deletemin operation, the root
of the heap is removed leaving k trees to be merged back into a heap. This will require k – 1
merges. This should be done in two passes by first scanning left to right, merging the heaps in
pairs, i.e., merge the first and second trees, then the third and fourth, etc. Then, scan back
from right to left each time merging the rightmost tree remaining from the first merge with the
current merged result. (If the number of trees to be merged is odd, then the three rightmost
nodes are merged into one tree at the end of the first pass.) The actual heap implementation
is done with a left-child/right-sibling representation as illustrated below.
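The merge and two-pass deletemin described above might look like the following sketch. It assumes Python and uses hypothetical names (PNode, child, sibling, merge, insert, deletemin). Each node keeps a pointer to its leftmost child and to its next sibling on the right.

```python
class PNode:
    """Hypothetical pairing-heap node in left-child/right-sibling form."""

    def __init__(self, key):
        self.key = key
        self.child = None      # leftmost child
        self.sibling = None    # next sibling to the right


def merge(a, b):
    """Make the heap with the larger root the leftmost child of the other."""
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a
    b.sibling = a.child
    a.child = b
    return a


def insert(root, key):
    """Insertion is a merge with a one-element heap."""
    return merge(root, PNode(key))


def deletemin(root):
    """Remove the root; return (minimum key, new root)."""
    # Detach the k subtrees hanging off the root.
    trees = []
    t = root.child
    while t is not None:
        nxt = t.sibling
        t.sibling = None
        trees.append(t)
        t = nxt
    # First pass, left to right: merge the trees in pairs.
    paired = []
    for i in range(0, len(trees) - 1, 2):
        paired.append(merge(trees[i], trees[i + 1]))
    if len(trees) % 2 == 1:
        if paired:
            # Odd count: fold the leftover tree into the last pair.
            paired[-1] = merge(paired[-1], trees[-1])
        else:
            paired.append(trees[-1])
    # Second pass, right to left: accumulate into a single heap.
    result = None
    for t in reversed(paired):
        result = merge(t, result)
    return root.key, result
```

The sibling list is copied into a Python list only to keep the two passes easy to read. An implementation that cares about timing might relink the sibling pointers in place instead.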
Using a random number generator
To ensure that the timing tests are “fair,” you must use the same seed for the random number
generator for each structure. Note: you only seed the random number generator once, at the
beginning of a run. If you reseed it during the timing tests you will effectively be testing the
same data set each time.
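A small sketch of the seeding rule, assuming Python's standard random module. The function name data_sets and the seed value 560 are arbitrary choices for this example. The generator is seeded once at the start of a run, and all data sets for that run are drawn from it in order.

```python
import random


def data_sets(n, runs, seed=560):
    """Return `runs` initial data sets of n values each, between 1 and 4n.

    The generator is seeded exactly once, so the data sets within a run
    differ from one another, while a rerun with the same seed (for example,
    for another heap type) reproduces them exactly.
    """
    rng = random.Random(seed)          # seeded once, at the beginning
    return [[rng.randint(1, 4 * n) for _ in range(n)] for _ in range(runs)]


# Two runs with the same seed see identical data.
binary_heap_data = data_sets(16, 3)
leftist_heap_data = data_sets(16, 3)
print(binary_heap_data == leftist_heap_data)
```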
Structuring the timing tests
After getting the values for the initial heap, here's the way to generate your test data for a
random sequence of operations:
1. Generate a random integer between 2n and 5n. This will be the number of operations to
perform. Let's call it M.
2. Perform these steps M times:
a. Generate a random number x such that 0 ≤ x ≤ 1
b. If 0 ≤ x < 0.4, perform a deletemin operation.
If 0.4 ≤ x ≤ 1, generate a random integer y such that 1 ≤ y ≤ n and insert y.
Then, generate the data to initialize a new structure and repeat the process above. To get the
average time, you must run a minimum of 10 timing tests for each structure for each value of n.
To get better timing results, it’s a good idea to run even more tests for the smaller data sets.
Do all tests on lists of the same size in the same run of a program.
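The operation sequence described above could be generated along these lines. This is a sketch under stated assumptions: Python is used for illustration, the names generate_operations and count_operations are hypothetical, and the sequence is built before timing starts so that the same list can be replayed on each structure.

```python
import random


def generate_operations(n, rng):
    """Build one random sequence of M operations, with 2n <= M <= 5n."""
    m = rng.randint(2 * n, 5 * n)
    ops = []
    for _ in range(m):
        x = rng.random()       # note: Python returns 0 <= x < 1, never exactly 1
        if x < 0.4:
            ops.append(("deletemin", None))
        else:
            ops.append(("insert", rng.randint(1, n)))
    return ops


def count_operations(ops):
    """Return (number of deletemins, number of inserts) for the report."""
    deletes = sum(1 for name, _ in ops if name == "deletemin")
    return deletes, len(ops) - deletes


rng = random.Random(560)       # arbitrary seed, set once per run
ops = generate_operations(16, rng)
print(len(ops), count_operations(ops))
```

Generating the list up front keeps the cost of random number generation out of the timed region. Generating each operation on the fly, as the steps literally read, is equally valid as long as the seed handling is the same for every structure.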
To test the individual operations in order to try to experimentally determine the complexity of
the two operations you will first generate heaps of the indicated size. Instead of randomly
generating operations, you will instead do the following:
1. Randomly generate n/2 integers between 2n and 5n and insert each into the tree as soon
as it is generated.
2. Perform n/2 deletemin operations.
Again, run a minimum of 10 tests for each structure for each value of n. Get separate timing
results for each of the two operations in order to try to determine the complexity of each
operation for each structure.
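One way to get separate timings for the two operations is sketched below. It assumes Python and any heap object that exposes insert and deletemin methods. StandInHeap is a placeholder built on the standard heapq module, included only so the example runs. It is not one of the structures to be tested, and the function name time_individual_ops is hypothetical.

```python
import heapq
import random
import time


class StandInHeap:
    """Placeholder heap backed by heapq; swap in the structure under test."""

    def __init__(self, values):
        self.a = list(values)
        heapq.heapify(self.a)

    def insert(self, x):
        heapq.heappush(self.a, x)

    def deletemin(self):
        return heapq.heappop(self.a)


def time_individual_ops(heap, n, rng):
    """Return (average seconds per insert, average seconds per deletemin)."""
    half = n // 2
    insert_time = 0.0
    for _ in range(half):
        y = rng.randint(2 * n, 5 * n)      # generated just before insertion
        start = time.perf_counter()
        heap.insert(y)                      # only the insert call is timed
        insert_time += time.perf_counter() - start
    start = time.perf_counter()
    for _ in range(half):
        heap.deletemin()
    delete_time = time.perf_counter() - start
    return insert_time / half, delete_time / half


n = 16000
rng = random.Random(560)                    # arbitrary seed, set once
heap = StandInHeap(rng.randint(1, 4 * n) for _ in range(n))
print(time_individual_ops(heap, n, rng))
```

Timing each insert call individually adds some clock overhead to the measurement. Timing the whole insertion loop, including number generation, is a simpler alternative, provided the report says which approach was used.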
Requirements:
1. To ensure that the timing tests are “fair,” use the same seed for the random number
generator for each of the different types of heaps. To get accurate timing you will probably
need to run more than ten tests on the short lists. Also record and report the average
number of deletemin operations and insertions for each group of tests. For example, you
run 20 tests on a heap initially containing 16,000 integers. Find the average M value for the
20 tests and the average number of deletemin operations and insert operations for each
test.
2. You must clearly show that each of your operations is working correctly. (This should be
done before you do the timing.) It will probably be easiest to do this in a separate program
so that additional code can be used to print out information about the heaps, and/or to read
from a data file. For example, after each deletemin and insertion, print out the heap. You
can use relatively small data sets (< 50) to illustrate the correctness of your implementation.
To turn in:
1. A listing of your well-documented code, and copies of all test files used to verify the
correctness of your code, and the results of those tests.
2. A written report that includes:
a. A complete description of how your testing was done. This should include the number
of tests, when the timing was started and stopped, any problems you ran into, etc.
b. A tabular summary of the data obtained for each set of tests. You do not need to
provide the actual data, but may do so if you wish.
c. An analysis of your data and conclusions you can draw from it. At a minimum this
should include graphs to indicate the complexity of the deletemin and insert operations
for each of the structures (plot size vs. time).
A pairing heap (above) and the left-child right-sibling representation (below)