Data Structures and Algorithms
Rabie A. Ramadan
[email protected]
Part II
Data Structures and Algorithms
Algorithms and Programs

• Algorithm: a mechanical procedure written in such a way that human beings can understand it (pseudo code).
• Program: a mechanical procedure written in such a way that a computer can execute it.

Given an algorithm, we are led to ask:
• What is it supposed to do?
• Does it really do what it is supposed to do?
• How efficiently does it do it?
Computing the Running Time

Algorithm goals:
• Easy to understand, code, and debug.
• Makes efficient use of the computer's resources; in particular, runs as fast as possible.

The running time depends on factors such as:
• the input to the program, i.e., the size of the input and its nature (e.g., the size of the array to be sorted);
• the quality of the code generated by the compiler used to create the object program;
• the nature and speed of the instructions on the machine used to execute the program; and
• the time complexity of the algorithm underlying the program.
Computing the Running Time

For an input of size n:
• T(n) is the running time.
• Worst case → all of the instructions are executed.
• Best case → the minimum number of instructions is executed.
• The average, Tavg(n), is the best way to measure the running time.
Big-Oh and Big-Omega Notation

Big-Oh:
• We say that T(n) is O(f(n)) if there are constants c and n0 such that T(n) <= c·f(n) whenever n >= n0.
• Defines the upper bound of an algorithm/program.

Big-Omega:
• T(n) is Ω(g(n)) means that there exists a positive constant c such that T(n) >= c·g(n) infinitely often (for an infinite number of values of n).
Example

Find the complexity of the following equation:
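The slide's own equation is not preserved in this transcript, so as an illustrative stand-in, suppose the running time is T(n) = 3n^2 + 5n + 2. Applying the Big-Oh definition above:

\[
T(n) = 3n^2 + 5n + 2 \le 3n^2 + 5n^2 + 2n^2 = 10n^2 \quad \text{for all } n \ge 1,
\]

so choosing c = 10 and n0 = 1 shows that T(n) is O(n^2).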
Group Activity
Find the complexity
Procedure Calls

What is a procedure?

Count all of the calls to this procedure.
Example

function fact ( n: integer ): integer;
{ fact(n) computes n! }
begin
(1)   if n <= 1 then
(2)     fact := 1
      else
(3)     fact := n * fact(n - 1)
end; { fact }

Let T(n) be the running time for fact(n).
The running time for lines (1) and (2) is O(1); for line (3) it is O(1) + T(n-1).
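The slides stop at the recurrence; unrolling it, with c standing for the constant cost of lines (1)–(3), gives the closed form:

\[
T(n) = T(n-1) + c = T(n-2) + 2c = \cdots = T(1) + (n-1)\,c = O(n),
\]

so fact(n) runs in time linear in n.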
Basic Data Structures

• A way of organizing information.
• There are many different data structures.
• Different problems may require different data structures.
• Each data structure has unique properties that make it well suited to give a certain view of the data.
• There are many different ways of creating the same data structure in a computer.
Our Objectives

• Show how data structures are represented in the computer.
• Identify linear and nonlinear data structures.
• Manipulate data structures with basic operations.
• Compare different implementations of the same data structure.
Computer Memory

• Every piece of data that is stored in a computer is kept in a memory cell with a specific address.
• The computer can store many different types of data in its memory.
Computer Memory (Cont.)

• Storing the string 'apple' in the computer's memory might look like this. [figure]
• Storing a list of names might look like this. [figure]

Computer Memory (Cont.)

• Can we represent the following tree the same way? [figure]
• It does not make sense, right?!
Stack (Call Back)

• Last-in, first-out (LIFO).
• Insert → PUSH
• Delete → POP
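Since the slides give no code for the stack, here is a minimal Free Pascal sketch (my own illustration, in the style of the fact example earlier) of an array-based stack; MAXSIZE and the demo values are assumptions.

program StackDemo;
const
  MAXSIZE = 100;
var
  stack: array[1..MAXSIZE] of integer;
  top: integer = 0;   { index of the current top element; 0 = empty }

procedure Push(x: integer);   { insert → PUSH }
begin
  top := top + 1;
  stack[top] := x;
end;

function Pop: integer;        { delete → POP }
begin
  Pop := stack[top];
  top := top - 1;
end;

begin
  Push(1); Push(2); Push(3);
  writeln(Pop);   { prints 3: the last item in is the first out }
  writeln(Pop);   { prints 2 }
end.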
Queue (Call Back)

• First-in, first-out (FIFO).
• Insert → ENQUEUE
• Delete → DEQUEUE
• It has a head and a tail.
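A matching Free Pascal sketch (again my own illustration, not from the slides) of an array-based circular queue; head and tail correspond to the slide's head and tail, and MAXSIZE is an assumed capacity.

program QueueDemo;
const
  MAXSIZE = 100;
var
  q: array[0..MAXSIZE - 1] of integer;
  head: integer = 0;   { index of the front element }
  tail: integer = 0;   { index one past the last element }

procedure Enqueue(x: integer);   { insert → ENQUEUE, at the tail }
begin
  q[tail] := x;
  tail := (tail + 1) mod MAXSIZE;
end;

function Dequeue: integer;       { delete → DEQUEUE, from the head }
begin
  Dequeue := q[head];
  head := (head + 1) mod MAXSIZE;
end;

begin
  Enqueue(1); Enqueue(2); Enqueue(3);
  writeln(Dequeue);   { prints 1: the first item in is the first out }
  writeln(Dequeue);   { prints 2 }
end.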
Priority Queue

An abstract data type that supports the following operations:
• Add an element to the queue with an associated priority.
• Remove the element from the queue that has the highest priority, and return it.
• With an unsorted array or list: O(1) to insert an element and O(n) to return the highest-priority element.
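The O(1)/O(n) bounds above match an unsorted-array implementation; here is a Free Pascal sketch of that choice (my own illustration; a heap-based priority queue would instead give O(log n) for both operations).

program PriorityQueueDemo;
const
  MAXSIZE = 100;
var
  pq: array[1..MAXSIZE] of integer;
  n: integer = 0;

procedure Insert(x: integer);   { O(1): append to the unsorted array }
begin
  n := n + 1;
  pq[n] := x;
end;

function RemoveMax: integer;    { O(n): scan for the highest priority }
var
  i, best, tmp: integer;
begin
  best := 1;
  for i := 2 to n do
    if pq[i] > pq[best] then
      best := i;
  tmp := pq[best]; pq[best] := pq[n]; pq[n] := tmp;   { move it to the end }
  RemoveMax := pq[n];
  n := n - 1;
end;

begin
  Insert(3); Insert(9); Insert(5);
  writeln(RemoveMax);   { prints 9, the element with the highest priority }
end.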
Linked Lists

• Objects are arranged in a linear order.
• The order in a linked list is determined by a pointer in each object.
• Linked lists provide a simple, flexible representation for dynamic sets.

Linked List

• Can be a singly or doubly linked list.
• Can be a circular list.
• Sorted or unsorted.
Linked List Operations

• Searching → O(n)
• Inserting → O(1)
• Deleting → O(1); O(n) if the list is kept sorted, since the node must first be found.
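A Free Pascal sketch of a singly linked list illustrating these bounds (my own example, not from the slides): insertion at the front is O(1), while traversal and searching are O(n).

program LinkedListDemo;
type
  PNode = ^TNode;
  TNode = record
    data: integer;
    next: PNode;   { the pointer that determines the linear order }
  end;
var
  head, p: PNode;

procedure InsertFront(var head: PNode; x: integer);   { O(1) insert }
var
  node: PNode;
begin
  new(node);
  node^.data := x;
  node^.next := head;
  head := node;
end;

begin
  head := nil;
  InsertFront(head, 3);
  InsertFront(head, 2);
  InsertFront(head, 1);
  p := head;
  while p <> nil do        { O(n) traversal; searching works the same way }
  begin
    write(p^.data, ' ');   { prints 1 2 3 }
    p := p^.next;
  end;
  writeln;
end.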
Trees

Binary tree:
• Like a list, but with two pointers in each node → right and left.
Complete Binary Tree

• If the height is h, the number of nodes is at most 2^(h+1) - 1 (exactly that many when the tree is full).
• The missing nodes can only be at the bottom of the tree.
Rooted Trees with Unbounded Branching

• The number of children is k.
• The number of children is not known ahead of time.
• A space problem appears when creating the unbounded tree.
• The solution (the left-child, right-sibling representation, sketched below):
  • left-child[x] points to the leftmost child of node x, and
  • right-sibling[x] points to the sibling of x immediately to the right.
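A Free Pascal sketch of the left-child, right-sibling representation just described (my own illustration; node values are made up). Each node stores only two pointers no matter how many children it has, which is what solves the space problem.

program LeftChildRightSiblingDemo;
type
  PNode = ^TNode;
  TNode = record
    data: integer;
    leftChild: PNode;      { leftmost child of this node }
    rightSibling: PNode;   { sibling immediately to the right }
  end;

function NewNode(x: integer): PNode;
var
  p: PNode;
begin
  new(p);
  p^.data := x;
  p^.leftChild := nil;
  p^.rightSibling := nil;
  NewNode := p;
end;

var
  root, c1, c2, c3, p: PNode;
begin
  { a root with three children, using only two pointers per node }
  root := NewNode(1);
  c1 := NewNode(2); c2 := NewNode(3); c3 := NewNode(4);
  root^.leftChild := c1;
  c1^.rightSibling := c2;
  c2^.rightSibling := c3;
  p := root^.leftChild;    { visit all children of the root }
  while p <> nil do
  begin
    write(p^.data, ' ');   { prints 2 3 4 }
    p := p^.rightSibling;
  end;
  writeln;
end.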
Heaps

A binary tree with the following properties:
• It is a complete binary tree; that is, each level of the tree is completely filled, except possibly the bottom level. At this level, it is filled from left to right.
• It satisfies the heap-order property: the data item stored in each node is greater than or equal to the data items stored in its children.

Heap Example

• A heap. [figure]
• Not a heap → not complete. [figure]
• Not a heap → complete, but does not satisfy the heap-order property. [figure]
Heap Representation

As an array:
• An array A that represents a heap is an array with two attributes:
  – length, the number of elements in the array
  – heap-size, the number of heap elements stored in the array
• Viewed as a binary tree and as an array: the root of the tree is stored at A[0], its left child at A[1], its right child at A[2], and so on.
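With the root at A[0], the parent and child positions follow from simple index arithmetic. A short Free Pascal sketch (my own, matching the 0-based layout above):

program HeapIndexDemo;
{ index arithmetic for a heap stored in a 0-based array }
function Parent(i: integer): integer;
begin
  Parent := (i - 1) div 2;
end;

function LeftChild(i: integer): integer;
begin
  LeftChild := 2 * i + 1;
end;

function RightChild(i: integer): integer;
begin
  RightChild := 2 * i + 2;
end;

begin
  { the root A[0] has children A[1] and A[2], as described above }
  writeln('children of 0: ', LeftChild(0), ' ', RightChild(0));
  writeln('parent of 2:   ', Parent(2));
end.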
Heap Operations

• The height of a node in a tree is the number of edges on the longest simple downward path from the node to a leaf (i.e., the maximum depth below that node).
• The height of an n-element heap based on a binary tree is floor(log n).
• The basic operations on heaps run in time at most proportional to the height of the tree and thus take O(log n) time.
Maintaining the Heap Property

• One of the more basic heap operations is converting a complete binary tree to a heap. Such an operation is called "Heapify".
• Its inputs are an array A and an index i into the array.
• When Heapify is called, it is assumed that the binary trees rooted at LeftChild(i) and RightChild(i) are heaps, but that A[i] may be smaller than its children, thus violating the heap-order property.
• The function of Heapify is to let the value at A[i] "float down" in the heap so that the subtree rooted at index i becomes a heap.
• The action required from Heapify is sketched below.
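The original slide's pseudocode is not preserved, so here is a sketch of a standard max-Heapify in Free Pascal (my own illustration, using 0-based indices to match the array layout above; the sample values are made up, with only A[0] out of place).

program HeapifyDemo;
const
  N = 10;
type
  TArr = array[0..N - 1] of integer;
var
  A: TArr = (1, 16, 10, 14, 7, 9, 3, 2, 8, 4);
  heapSize: integer = N;
  i: integer;

{ Let A[i] "float down" until the subtree rooted at i is a heap.  }
{ Assumes the subtrees rooted at 2i+1 and 2i+2 are already heaps. }
procedure Heapify(var A: TArr; i: integer);
var
  l, r, largest, tmp: integer;
begin
  l := 2 * i + 1;
  r := 2 * i + 2;
  largest := i;
  if (l < heapSize) and (A[l] > A[largest]) then largest := l;
  if (r < heapSize) and (A[r] > A[largest]) then largest := r;
  if largest <> i then
  begin
    tmp := A[i]; A[i] := A[largest]; A[largest] := tmp;
    Heapify(A, largest);   { the swap may disturb the subtree below }
  end;
end;

begin
  Heapify(A, 0);
  for i := 0 to N - 1 do write(A[i], ' ');   { 16 14 10 8 7 9 3 2 1 4 }
  writeln;
end.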
Sorting Algorithms

• Bubble sort     O(n^2)
• Insertion sort  O(n^2)
• Selection sort  O(n^2)
• Shell sort      O(n^2)
• Heap sort       O(n log n)
• Merge sort      O(n log n)
• Quick sort      O(n log n)
Bubble Sort

• The slowest of the sorts listed above.
• Compares each item in the list with the item next to it, swapping them if required.
• The algorithm repeats this process until it makes a pass all the way through the list without swapping any items.

Efficiency

• Pros: simplicity and ease of implementation.
• Cons: horribly inefficient → O(n^2)
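A Free Pascal sketch of the pass-until-no-swaps idea described above (my own illustration; the sample values are arbitrary):

program BubbleSortDemo;
const
  N = 8;
var
  a: array[1..N] of integer = (18, 32, 12, 5, 38, 33, 16, 2);
  i, tmp: integer;
  swapped: boolean;
begin
  repeat   { keep passing through the list until no swaps occur }
    swapped := false;
    for i := 1 to N - 1 do
      if a[i] > a[i + 1] then   { compare each item with its neighbour }
      begin
        tmp := a[i]; a[i] := a[i + 1]; a[i + 1] := tmp;   { swap if required }
        swapped := true;
      end;
  until not swapped;
  for i := 1 to N do write(a[i], ' ');   { 2 5 12 16 18 32 33 38 }
  writeln;
end.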
Heap Sort

• Builds a heap out of the data set.
• Given an array A[1 .. n], the elements in the subarray A[floor(n/2) + 1 .. n] are all leaves.
• Uses 'Heapify' to sort the array, as sketched below.
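A Free Pascal sketch of the whole sort (my own illustration, reusing the Heapify sketch from earlier; it uses a recursive Heapify for brevity, though Heapify can also be written iteratively, and the sample values are arbitrary).

program HeapSortDemo;
const
  N = 8;
type
  TArr = array[0..N - 1] of integer;
var
  A: TArr = (18, 32, 12, 5, 38, 33, 16, 2);
  heapSize, i, tmp: integer;

procedure Heapify(var A: TArr; i: integer);   { as sketched earlier }
var
  l, r, largest, t: integer;
begin
  l := 2 * i + 1; r := 2 * i + 2; largest := i;
  if (l < heapSize) and (A[l] > A[largest]) then largest := l;
  if (r < heapSize) and (A[r] > A[largest]) then largest := r;
  if largest <> i then
  begin
    t := A[i]; A[i] := A[largest]; A[largest] := t;
    Heapify(A, largest);
  end;
end;

begin
  heapSize := N;
  { indices N div 2 .. N-1 are leaves, so Heapify only the internal nodes }
  for i := N div 2 - 1 downto 0 do
    Heapify(A, i);
  { repeatedly move the maximum to the end and shrink the heap }
  for i := N - 1 downto 1 do
  begin
    tmp := A[0]; A[0] := A[i]; A[i] := tmp;
    heapSize := heapSize - 1;
    Heapify(A, 0);
  end;
  for i := 0 to N - 1 do write(A[i], ' ');   { 2 5 12 16 18 32 33 38 }
  writeln;
end.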
Heap Sort

• Pros: in-place and non-recursive, making it a good choice for extremely large data sets → O(n log n)
• Cons: slower than the merge and quick sorts.
Insertion Sort

• Conceptually maintains two lists: the sorted items and the unsorted items.
• An in-place sort is used to save space.
• Based on the technique used by card players to arrange a hand of cards:
  • The player keeps the cards that have been picked up so far in sorted order.
  • When the player picks up a new card, he makes room for the new card and then inserts it in its proper place.
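The card-player analogy translates directly into code. A Free Pascal sketch (my own illustration):

program InsertionSortDemo;
const
  N = 8;
var
  a: array[1..N] of integer = (18, 32, 12, 5, 38, 33, 16, 2);
  i, j, key: integer;
begin
  for i := 2 to N do
  begin
    key := a[i];           { the newly "picked up card" }
    j := i - 1;
    while (j >= 1) and (a[j] > key) do
    begin
      a[j + 1] := a[j];    { make room by shifting larger items right }
      j := j - 1;
    end;
    a[j + 1] := key;       { insert it in its proper place }
  end;
  for i := 1 to N do write(a[i], ' ');   { 2 5 12 16 18 32 33 38 }
  writeln;
end.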
Analysis

• Pros: relatively simple and easy to implement → O(n^2)
• Cons: inefficient for large lists.
Merge Sort

• A divide-and-conquer algorithm.
• Divides the list to be sorted into two equal halves and places them in separate arrays.
• Each array is recursively sorted,
• then merged back together to form the final sorted list.
Example
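The original example figure is not preserved in the transcript; in its place, here is a Free Pascal sketch of the divide-and-merge scheme described above (my own illustration; the auxiliary array tmp is the source of the extra memory cost noted in the analysis below).

program MergeSortDemo;
const
  N = 8;
type
  TArr = array[1..N] of integer;
var
  a: TArr = (18, 32, 12, 5, 38, 33, 16, 2);
  i: integer;

procedure MergeSort(var a: TArr; lo, hi: integer);
var
  tmp: TArr;
  mid, i, j, k: integer;
begin
  if lo >= hi then exit;            { one element or fewer: already sorted }
  mid := (lo + hi) div 2;           { divide into two halves }
  MergeSort(a, lo, mid);            { recursively sort each half }
  MergeSort(a, mid + 1, hi);
  i := lo; j := mid + 1; k := lo;
  while (i <= mid) and (j <= hi) do { merge the two sorted halves }
  begin
    if a[i] <= a[j] then
      begin tmp[k] := a[i]; i := i + 1; end
    else
      begin tmp[k] := a[j]; j := j + 1; end;
    k := k + 1;
  end;
  while i <= mid do begin tmp[k] := a[i]; i := i + 1; k := k + 1; end;
  while j <= hi do begin tmp[k] := a[j]; j := j + 1; k := k + 1; end;
  for k := lo to hi do a[k] := tmp[k];   { copy the merged run back }
end;

begin
  MergeSort(a, 1, N);
  for i := 1 to N do write(a[i], ' ');   { 2 5 12 16 18 32 33 38 }
  writeln;
end.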
Analysis

• Pros: marginally faster than the heap sort for larger sets → O(n log n)
• Cons: at least twice the memory requirements of the other sorts; recursive.
Selection Sort

• Selects the smallest unsorted item remaining in the list,
• then swaps it with the item in the next position to be filled.
• Example:
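The slide's example figure is not preserved; as a stand-in, a Free Pascal sketch of the select-and-swap scheme (my own illustration):

program SelectionSortDemo;
const
  N = 8;
var
  a: array[1..N] of integer = (18, 32, 12, 5, 38, 33, 16, 2);
  i, j, min, tmp: integer;
begin
  for i := 1 to N - 1 do
  begin
    min := i;                       { find the smallest unsorted item }
    for j := i + 1 to N do
      if a[j] < a[min] then min := j;
    tmp := a[i]; a[i] := a[min]; a[min] := tmp;   { swap it into place }
  end;
  for i := 1 to N do write(a[i], ' ');   { 2 5 12 16 18 32 33 38 }
  writeln;
end.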
Analysis

• Pros: simple and easy to implement → O(n^2)
• Cons: inefficient for large lists, and so similar to the more efficient insertion sort that the insertion sort should be used in its place.
Shell Sort

• Shell sort works by comparing elements that are distant from one another, rather than only adjacent elements as in an ordinary array or list sort.
• Shell sort makes multiple passes through a list and sorts a number of equally sized sets using the insertion sort.
Shellsort Examples

Sort: 18 32 12 5 38 33 16 2
There are 8 numbers to be sorted; Shell's increment will be floor(n/2):
floor(8/2) = 4

Increment 4: compare elements four positions apart, i.e., the pairs
(18, 38), (32, 33), (12, 16), and (5, 2):
  18 32 12 5 38 33 16 2

Step 1) Only look at 18 and 38 and sort them in order;
        18 and 38 stay at their current positions because they are in order.
Step 2) Only look at 32 and 33 and sort them in order;
        32 and 33 stay at their current positions because they are in order.
Step 3) Only look at 12 and 16 and sort them in order;
        12 and 16 stay at their current positions because they are in order.
Step 4) Only look at 5 and 2 and sort them in order;
        2 and 5 need to be switched to be in order.
Shellsort Examples (cont'd)

Sort: 18 32 12 5 38 33 16 2
Resulting numbers after the increment-4 pass:
  18 32 12 2 38 33 16 5

floor(4/2) = 2

Increment 2: compare elements two positions apart, i.e., the subsets
(18, 12, 38, 16) and (32, 2, 33, 5):

Step 1) Look at 18, 12, 38, 16 and sort them into their appropriate locations:
  12 32 16 2 18 33 38 5
Step 2) Look at 32, 2, 33, 5 and sort them into their appropriate locations:
  12 2 16 5 18 32 38 33
Shellsort Examples (cont'd)

Sort: 18 32 12 5 38 33 16 2
floor(2/2) = 1

Increment 1: an ordinary insertion sort over the whole list:
  12 2 16 5 18 32 38 33
becomes
  2 5 12 16 18 32 33 38

The last increment, or phase, of Shellsort is basically an insertion sort algorithm.
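The three passes above correspond to gapped insertion sorts with gaps 4, 2, and 1. A Free Pascal sketch of that scheme, using the same numbers (my own illustration):

program ShellSortDemo;
const
  N = 8;
var
  a: array[1..N] of integer = (18, 32, 12, 5, 38, 33, 16, 2);
  gap, i, j, key: integer;
begin
  gap := N div 2;                 { Shell's increments: n/2, n/4, ..., 1 }
  while gap >= 1 do
  begin
    for i := gap + 1 to N do      { insertion sort on items gap apart }
    begin
      key := a[i];
      j := i - gap;
      while (j >= 1) and (a[j] > key) do
      begin
        a[j + gap] := a[j];
        j := j - gap;
      end;
      a[j + gap] := key;
    end;
    gap := gap div 2;
  end;
  for i := 1 to N do write(a[i], ' ');   { 2 5 12 16 18 32 33 38 }
  writeln;
end.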
Analysis

• Pros: efficient for medium-size lists → O(n^2)
• Cons: a somewhat complex algorithm; not nearly as efficient as the merge, heap, and quick sorts.
Quick Sort

• If there are one or fewer elements in the array to be sorted, return immediately.
• Pick an element in the array to serve as a "pivot" point.
• Split the array into two parts: one with elements larger than the pivot and the other with elements smaller than the pivot.
• Recursively repeat the algorithm for both halves of the original array.
Example
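The slide's worked example figure is not preserved; as a stand-in, a Free Pascal sketch of the pivot-and-split scheme (my own illustration, using the middle element as the pivot; the analysis below notes that a randomly chosen pivot avoids the worst case):

program QuickSortDemo;
const
  N = 8;
type
  TArr = array[1..N] of integer;
var
  a: TArr = (18, 32, 12, 5, 38, 33, 16, 2);
  i: integer;

procedure QuickSort(var a: TArr; lo, hi: integer);
var
  pivot, i, j, tmp: integer;
begin
  if lo >= hi then exit;          { one or fewer elements: return at once }
  pivot := a[(lo + hi) div 2];    { the "pivot" point }
  i := lo; j := hi;
  repeat                          { split: smaller left, larger right }
    while a[i] < pivot do i := i + 1;
    while a[j] > pivot do j := j - 1;
    if i <= j then
    begin
      tmp := a[i]; a[i] := a[j]; a[j] := tmp;
      i := i + 1; j := j - 1;
    end;
  until i > j;
  QuickSort(a, lo, j);            { recursively sort both parts }
  QuickSort(a, i, hi);
end;

begin
  QuickSort(a, 1, N);
  for i := 1 to N do write(a[i], ' ');   { 2 5 12 16 18 32 33 38 }
  writeln;
end.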
Analysis

• Its complexity is affected by the choice of the pivot point.
• The worst-case efficiency of the quick sort is O(n^2).
• As long as the pivot point is chosen randomly, the algorithmic complexity is O(n log n) on average.
• Pros: extremely fast.
• Cons: a very complex algorithm; massively recursive.
Bucket Sort

• It assumes that the input is generated by a random process that distributes elements uniformly over the interval [0, 1).
• Divide the interval [0, 1) into n equal-sized subintervals, or buckets, and then distribute the n input numbers into the buckets.
• Simply sort the numbers in each bucket and then go through the buckets in order, listing the elements in each.
• O(n) expected time.
Example
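The slide's example figure is not preserved; here is a Free Pascal sketch of the distribute, sort, and concatenate scheme for inputs in [0, 1) (my own illustration; the five sample values are made up, and each bucket is sorted with a small insertion sort):

program BucketSortDemo;
const
  N = 5;
var
  a: array[1..N] of real = (0.78, 0.17, 0.39, 0.26, 0.72);
  bucket: array[0..N - 1, 1..N] of real;   { n buckets, up to n items each }
  count: array[0..N - 1] of integer;
  i, j, b, k: integer;
  key: real;
begin
  for b := 0 to N - 1 do count[b] := 0;
  for i := 1 to N do           { distribute: x goes to bucket floor(n*x) }
  begin
    b := trunc(N * a[i]);
    count[b] := count[b] + 1;
    bucket[b, count[b]] := a[i];
  end;
  k := 0;
  for b := 0 to N - 1 do       { sort each bucket, then list them in order }
  begin
    for i := 2 to count[b] do  { insertion sort within the bucket }
    begin
      key := bucket[b, i];
      j := i - 1;
      while (j >= 1) and (bucket[b, j] > key) do
      begin
        bucket[b, j + 1] := bucket[b, j];
        j := j - 1;
      end;
      bucket[b, j + 1] := key;
    end;
    for i := 1 to count[b] do
    begin
      k := k + 1;
      a[k] := bucket[b, i];
    end;
  end;
  for i := 1 to N do write(a[i]:0:2, ' ');   { 0.17 0.26 0.39 0.72 0.78 }
  writeln;
end.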