Quick-Sort
COMPSCI 355
Fall 2016
Quick-Sort
● Choose an item from the list as the pivot.
● Rearrange the list so that:
  – the value of every item preceding the pivot is less than or equal to the pivot value.
  – the value of every item following the pivot is greater than the pivot value.
● Recursively sort the sublists on both sides of the pivot.
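The three steps above can be sketched in Python. This is a minimal sketch, not the course's implementation: it assumes the first item is the pivot, builds new lists instead of rearranging in place, and the name `quick_sort` is illustrative only.

```python
def quick_sort(items):
    """Return a new sorted list (hypothetical helper, not from the slides)."""
    if len(items) <= 1:
        return list(items)
    # Step 1: choose a pivot (assumption: the item in the first position).
    pivot, rest = items[0], items[1:]
    # Step 2: items <= pivot go before it, items > pivot go after it.
    smaller_or_equal = [x for x in rest if x <= pivot]
    greater = [x for x in rest if x > pivot]
    # Step 3: recursively sort the sublists on both sides of the pivot.
    return quick_sort(smaller_or_equal) + [pivot] + quick_sort(greater)


print("".join(quick_sort(list("EBFGHACIDJ"))))  # ABCDEFGHIJ
```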
Quick-Sort
Any item in the list can be chosen as the pivot. A common convention is to take the item in the first position.
Partitioning around the pivot E (items smaller than the pivot value collect on the left, items greater than the pivot value on the right):
E B F G H A C I D J   (start)
E B D G H A C I F J   (F and D swapped)
E B D C H A G I F J   (G and C swapped)
E B D C A H G I F J   (H and A swapped)
Now what?
A B D C E H G I F J   (pivot E swapped with A)
Sort recursively: A B D C and H G I F J.
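One in-place partition scheme that reproduces the partitioning example above is sketched here. It is an assumption that the slides use exactly this scheme; the names `partition` and `quick_sort_in_place` are hypothetical.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around a[lo]; return the pivot's final index."""
    pivot = a[lo]
    i, j = lo + 1, hi
    while True:
        # Scan right for an item greater than the pivot.
        while i <= j and a[i] <= pivot:
            i += 1
        # Scan left for an item not greater than the pivot.
        while i <= j and a[j] > pivot:
            j -= 1
        if i > j:
            break
        a[i], a[j] = a[j], a[i]
    # Put the pivot between the two regions.
    a[lo], a[j] = a[j], a[lo]
    return j


def quick_sort_in_place(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition(a, lo, hi)
        quick_sort_in_place(a, lo, p - 1)
        quick_sort_in_place(a, p + 1, hi)


letters = list("EBFGHACIDJ")
pivot_index = partition(letters, 0, len(letters) - 1)
print(pivot_index, " ".join(letters))  # 4 A B D C E H G I F J
```

The scan uses `<=` on the left so that items equal to the pivot end up before it, matching the "less than or equal to" rule stated earlier.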
Partition Tree
Each indented line is a sublist produced by partitioning the list above it around its first item.
H F B N A G J C K I D L E M   (pivot H)
  C F B E A G D   (pivot C)
    B A   (pivot B)
      A
    E F G D   (pivot E)
      D
      G F   (pivot G)
        F
  K I J L N M   (pivot K)
    J I   (pivot J)
      I
    L N M   (pivot L)
      N M   (pivot N)
        M
● O(n) steps at each level of recursion.
● How many levels?
A Bad Partition Tree
A B C D E   (pivot A)
  B C D E   (pivot B)
    C D E   (pivot C)
      D E   (pivot D)
        E
How could we avoid this degenerate behavior?
1. Median-of-three partitioning
2. Randomized QuickSort
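To see the effect of the pivot rule on the number of levels, the sketch below counts the levels of the partition tree for an already sorted list under three pivot rules. All names (`tree_depth`, `first_item`, `median_of_three`, `random_item`) are hypothetical, and the median-of-three rule shown (median of the first, middle, and last items) is one common variant, assumed here.

```python
import random


def tree_depth(items, choose_index):
    """Number of levels in the partition tree for `items`."""
    if len(items) <= 1:
        return len(items)
    k = choose_index(items)
    pivot = items[k]
    rest = items[:k] + items[k + 1:]
    left = [x for x in rest if x <= pivot]
    right = [x for x in rest if x > pivot]
    return 1 + max(tree_depth(left, choose_index),
                   tree_depth(right, choose_index))


def first_item(items):
    return 0


def median_of_three(items):
    # Index of the median among the first, middle, and last items.
    candidates = [0, len(items) // 2, len(items) - 1]
    return sorted(candidates, key=lambda i: items[i])[1]


rng = random.Random(355)  # fixed seed so runs are repeatable


def random_item(items):
    return rng.randrange(len(items))


already_sorted = list(range(200))
print(tree_depth(already_sorted, first_item))       # 200: one level per item
print(tree_depth(already_sorted, median_of_three))  # 8
print(tree_depth(already_sorted, random_item))      # depends on the seed
```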
Probability Theory
● A sample space is the set of all possible outcomes of an experiment or random trial.
  – Rolling a die: {1, 2, 3, 4, 5, 6}
  – Flipping a coin: {H, T}
  – Flipping a coin twice: {HH, HT, TH, TT}
  – Flipping a coin until it comes up tails: {T, HT, HHT, HHHT, HHHHT, ... }
Probability Theory
● A subset of outcomes is called an event.
● A probability function associates a real number to every event in such a way that:
  – Pr(∅) = 0
  – Pr(S) = 1
  – 0 ≤ Pr(E) ≤ 1 for any event E
  – If A ∩ B = ∅ then Pr(A ∪ B) = Pr(A) + Pr(B)
● What is the probability function that describes the experiment of rolling a die?
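For a fair die (an assumption; the slide does not say the die is fair), one probability function that satisfies the four conditions above is Pr(E) = |E| / 6. The sketch below checks the conditions on a few events; the names `SAMPLE_SPACE` and `pr` are illustrative.

```python
from fractions import Fraction

SAMPLE_SPACE = frozenset({1, 2, 3, 4, 5, 6})


def pr(event):
    """Probability of an event (a subset of outcomes) for a fair die."""
    event = frozenset(event)
    if not event <= SAMPLE_SPACE:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(SAMPLE_SPACE))


evens, low = {2, 4, 6}, {1}
print(pr(set()), pr(SAMPLE_SPACE), pr(evens))  # 0 1 1/2
# Disjoint events add: Pr(A ∪ B) = Pr(A) + Pr(B).
print(pr(evens | low) == pr(evens) + pr(low))  # True
```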
Probability Theory
● A random variable is a function that maps each possible outcome of an experiment to a real number.
● The expected value of a random variable X is:
  E(X) = Σ_{x ∈ ℝ} x ∙ Pr(X = x)
Expected Value
● Suppose that 1000 raffle tickets are sold for $1 each.
● The grand prize is $300 and there are two second-place prizes of $100.
● We can think of the payoff of a ticket as a random variable.
● The expected value (payoff) is:
  300∙(1/1000) + 100∙(2/1000) + 0∙(997/1000) = 500/1000 = $0.50
Expected Value
● What is the expected value of a die roll?
  1∙(1/6) + 2∙(1/6) + 3∙(1/6) + 4∙(1/6) + 5∙(1/6) + 6∙(1/6)
● That's 21/6 = 3.5 (not surprising).
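Both expected values above can be checked with a few lines of Python. This is a small sketch using exact fractions; `expected_value` is a hypothetical helper that takes a mapping from each value x to Pr(X = x), and the die is assumed to be fair.

```python
from fractions import Fraction


def expected_value(distribution):
    """Sum of x * Pr(X = x) over a {value: probability} mapping."""
    assert sum(distribution.values()) == 1, "probabilities must sum to 1"
    return sum(x * p for x, p in distribution.items())


# Raffle payoff: one $300 prize, two $100 prizes, 997 losing tickets.
raffle = {300: Fraction(1, 1000), 100: Fraction(2, 1000), 0: Fraction(997, 1000)}
# Fair die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expected_value(raffle))  # 1/2
print(expected_value(die))     # 7/2
```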
St. Petersburg Paradox
● The pot starts at $1.
● You repeatedly flip a fair coin and the pot is doubled for every HEADS.
● The game ends when TAILS first appears, and you get to keep what's in the pot.
● How much would you pay to play this game?
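One way to approach the question is to compute partial sums of the expected payoff. With k HEADS followed by TAILS, the payoff is 2^k and the probability is (1/2)^(k+1), so every term contributes 1/2 and the partial sums grow without bound. The sketch below shows this; `truncated_expected_payoff` is a hypothetical name, and cutting the sum off after a fixed number of terms is an assumption made only to keep it finite.

```python
from fractions import Fraction


def truncated_expected_payoff(terms):
    """Sum of payoff * probability over games with fewer than `terms` HEADS."""
    total = Fraction(0)
    for heads in range(terms):
        payoff = 2 ** heads                        # pot after `heads` doublings
        probability = Fraction(1, 2 ** (heads + 1))  # `heads` HEADS, then TAILS
        total += payoff * probability
    return total


for terms in (10, 100, 1000):
    print(terms, truncated_expected_payoff(terms))  # 5, 50, 500
```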
Linearity of Expectation
● If you roll two dice and add them up, what is the expected value of the sum?
  2∙(1/36) + 3∙(2/36) + ⋯ + 12∙(1/36)
● There is an easier way.
Linearity of Expectation
● If X and Y are two random variables, then E(X + Y) = E(X) + E(Y).
● So if X and Y each represent the outcome of rolling a die, then E(X + Y) = 3.5 + 3.5 = 7.
● What is the expected number of coin flips that you would need to get 100 HEADS?
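The two-dice claim can be checked by enumerating all 36 outcomes. For the coin question, linearity suggests adding up 100 waiting times; with a fair coin each HEADS takes 2 flips on average, which gives 200. The sketch below checks the first exactly and estimates the second by simulation. The helper name, the seed, and the number of trials are arbitrary choices for illustration.

```python
import random
from fractions import Fraction
from itertools import product

faces = range(1, 7)
e_single = Fraction(sum(faces), 6)
e_sum = Fraction(sum(x + y for x, y in product(faces, repeat=2)), 36)
assert e_sum == e_single + e_single == 7


def flips_until_heads(target, rng):
    """Flip a fair coin until `target` HEADS have appeared; return the flip count."""
    flips = heads = 0
    while heads < target:
        flips += 1
        heads += rng.random() < 0.5
    return flips


rng = random.Random(2016)  # fixed seed so runs are repeatable
trials = [flips_until_heads(100, rng) for _ in range(2000)]
average_flips = sum(trials) / len(trials)
print(e_sum, average_flips)  # 7 and a value close to 200
```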
Randomized Quick-Sort
● Call a pivot good if it partitions the list into two sublists of size at least n/4 each.
● Probability of a pivot being good...?
● If a node representing a sublist of k items in the partition tree is associated with a good pivot, then its children represent sublists of size at most 3k/4.
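The probability question above can be explored by counting. The sketch assumes n distinct items and a pivot chosen uniformly at random, so every rank is equally likely; `good_pivot_probability` is a hypothetical helper.

```python
from fractions import Fraction


def good_pivot_probability(n):
    """Fraction of pivot ranks that leave both sublists with at least n/4 items."""
    good = 0
    for rank in range(n):  # rank = number of items smaller than the pivot
        left, right = rank, n - 1 - rank
        if 4 * left >= n and 4 * right >= n:
            good += 1
    return Fraction(good, n)


for n in (10, 100, 1000):
    print(n, good_pivot_probability(n))  # 2/5, 1/2, 1/2
```

Under these assumptions the probability is about 1/2 for large n: roughly the middle half of the ranks are good.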
Randomized Quick-Sort
● A path from the root to a leaf has at most log_{4/3}(n) nodes with good pivots.
● What is the expected length of that path?
● The expected running time of randomized QuickSort is...?
Randomized Quick-Sort
● A deeper analysis reveals that the expected number of comparisons made by randomized Quick-Sort is:
  ~ 1.39 n∙log_2(n)
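The estimate above can be compared with exact expected counts for moderate n. The sketch assumes distinct items, a uniformly random pivot, and k − 1 comparisons to partition k items, which gives the recurrence C(k) = (k − 1) + (2/k)∙(C(0) + ⋯ + C(k − 1)). The function name is hypothetical.

```python
import math


def expected_comparisons(n):
    """Expected comparisons for n items under the assumptions stated above."""
    c = [0.0] * (n + 1)
    running_sum = 0.0  # C(0) + ... + C(k - 1)
    for k in range(1, n + 1):
        c[k] = (k - 1) + 2.0 * running_sum / k
        running_sum += c[k]
    return c[n]


for n in (10, 1000, 100000):
    exact = expected_comparisons(n)
    estimate = 1.39 * n * math.log2(n)
    print(n, round(exact), round(estimate), round(exact / estimate, 2))
```

The ratio stays below 1 and rises slowly with n, because the estimate keeps only the leading term and drops a negative term that is linear in n.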
Insertion Sort Decision Tree
a<b?
  true: b<c?
    true: a<b<c
    false: a<c?
      true: a<c<b
      false: c<a<b
  false: a<c?
    true: b<a<c
    false: b<c?
      true: b<c<a
      false: c<b<a
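The depth of each leaf in the decision tree is the number of comparisons insertion sort makes on that input order. The sketch below counts comparisons for all six orderings of three distinct items; `insertion_sort_comparisons` is a hypothetical helper and a textbook insertion sort is assumed.

```python
from itertools import permutations


def insertion_sort_comparisons(items):
    """Sort a copy of `items`; return (sorted list, number of comparisons)."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1
            if a[j - 1] <= a[j]:
                break  # a[j] is already in place
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return a, comparisons


counts = sorted(insertion_sort_comparisons(p)[1] for p in permutations("abc"))
print(counts)  # [2, 2, 3, 3, 3, 3]
```

Six leaves cannot fit in a binary tree of height 2, so at least one input order needs 3 comparisons.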
Theoretical Lower Bounds
● Use Stirling's approximation:
  n! ≥ sqrt(2πn) ∙ (n/e)^n
● Therefore, we have:
  log(n!) = Ω(n log n)
Theoretical Lower Bounds
● Alternatively:
  log n! = log(1) + log(2) + ⋯ + log(n)
         ≥ ∫_{1}^{n} log x dx
         = n log n − n + 1
Theoretical Lower Bounds
● Alternatively:
  – The product n! has n/2 terms greater than n/2.
  – So we have: n! > (n/2)^(n/2)
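The bounds on these slides can be checked numerically for small n. The sketch uses the standard decision-tree argument that a comparison sort must distinguish n! input orders, so some input needs at least ⌈log_2(n!)⌉ comparisons; `min_comparisons` is a hypothetical name, and the integral bound is checked with natural logarithms.

```python
import math


def min_comparisons(n):
    """ceil(log2(n!)): least height of a binary decision tree with n! leaves."""
    return math.ceil(math.log2(math.factorial(n)))


for n in range(1, 101):
    ln_factorial = math.lgamma(n + 1)  # ln(n!)
    # Integral bound: ln(n!) >= n ln n - n + 1 (small tolerance for rounding).
    assert ln_factorial >= n * math.log(n) - n + 1 - 1e-9
    # Product bound: n! > (n/2)^(n/2).
    assert math.factorial(n) > (n / 2) ** (n / 2)

print(min_comparisons(3), min_comparisons(10))  # 3 22
```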
Comparisons
● Insertion-Sort
  – Excellent for short lists (say, ≤ 50 items).
  – Also good for partially sorted data.
● Merge-Sort
  – Asymptotically optimal.
  – In-place implementations are complicated.
  – In practice, tends to be slower than Heap-Sort and Quick-Sort.
  – Excellent when input cannot fit entirely into memory.
    ● Long runs of data processed in blocks.
    ● Minimizes disk block transfers.
Comparisons
● Quick-Sort
  – Commonly used to implement the C library qsort function.
  – Tends to beat Heap-Sort in practice.
  – Quadratic-time worst-case behavior.
● Heap-Sort
  – Best when an in-place algorithm is needed with worst-case guarantees.