Download Advanced Data Structure

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Control table wikipedia , lookup

Linked list wikipedia , lookup

Bloom filter wikipedia , lookup

Lattice model (finance) wikipedia , lookup

Array data structure wikipedia , lookup

Hash table wikipedia , lookup

Rainbow table wikipedia , lookup

Quadtree wikipedia , lookup

Red–black tree wikipedia , lookup

B-tree wikipedia , lookup

Interval tree wikipedia , lookup

Binary tree wikipedia , lookup

Binary search tree wikipedia , lookup

Transcript
Advanced Data Structure
Hackson Leung
2009-01-10
Outline
• Review
• General Array
• Linked List
• Sorted Array
• Main dish
• Binary Search Tree
• Heap (aka Priority Queue)
• Hash Table
• Dessert?
Review
Something you should have known
Review
• Data Structure – way to store data
• GOOD or BAD?
• No absolute answer
• Operations Supported
•
•
•
•
•
Find an element T inside
Find the minimum element inside
Insert a new element T in it
Remove the Tth element inside
Remove the minimum element inside
Review
• GOOD or BAD?
• No absolute answer
• WHY?
• Bringing small money is troublesome
• Most shops accept money
• Octopus is convenient in payment
• Not many shops accept it
• Money OR Octopus? Depends
• Which structure is the best?
• Also Depends!
Review
• GOOD or BAD?
• No absolute answer IF we consider many
• Definite IF only consider one operation
• For one operation, time complexity
• Still remember that?
• O(1), O(lgN), O(N), O(NlgN), O(N2)...
Review - Array
• If you don’t know that, you won’t be here
• Find an element T
• Search the whole array, O(N)
• Find the minimum
• Search the whole array, O(N)
• Insert an element T
• A[cnt] = T; cnt = cnt + 1; O(1)
• Remove the Tth element
• O(1). Why?
• Remove the minimum element
• O(N). Why?
Review – Linked List
• Find an element T
• Search the whole list, O(N)
• Find the minimum
• Search the whole list, O(N)
• Insert an element T
• O(1). Why?
• Remove the element with pointer T
• O(1), with better implementation.
• Remove the minimum element
• O(N). Why?
Review – Sorted Array
• Find an element T
• Binary search, O(lgN)
• Find the minimum
• Still need to search?!?! No way! O(1)
• Insert an element T
• Where to add? O(lgN)
• Maintain sorted, O(N) shift
• Remove the Tth element
• Delete + Shift, O(N)
• Remove the minimum element
• Delete + Shift, O(N) (Can it be O(1)???)
Review – Summery
Array
Linked List
Sorted Array
Find T
O(N)
O(N)
O(lgN)
Find Min
O(N)
O(N)
O(1)
Add T
O(1)
O(1)
O(N)
Remove Tth/pointer T
O(1)
O(1)
O(N)
Remove Minimum
O(N)
O(N)
O(N) or O(1)
• If we perform a few number of operations, fine
• If we perform a lot......
• NONE ARE EFFICIENT!
Binary Search Tree
I suppose you have learnt tree before
Binary Search Tree
• Make use of Binary Tree
• Binary Tree? At most 2 children!
• One more property
• Left Subtree < Node < Right Subtree
11
8
4
15
9
20
Binary Search Tree
• Insert 11, 8, 15, 9, 20, 4
11
8
4
15
9
20
Binary Search Tree
• Insert 11, 8, 15, 9, 20, 4
11
8
4
15
9
20
Binary Search Tree
• Find 9
11
8
4
15
9
20
Binary Search Tree
• Find 14
11
8
4
15
9
20
Binary Search Tree
• Remove
• Leaf Node  Very easy
• Single Child  Still easy
• Push that child upward
• 2 Children?
• Find Max(Min) at Left(Right) Subtree
• Replace to that node
• For that removal of node...
• Leaf?
• Single?
• 2 Children?
Binary Search Tree
• Remove
• Leaf Node  Very easy
• Single Child  Still easy
• Push that child upward
• 2 Children?
• Replace by Min(Max) at Left(Right)
Subtree
• Either leaf of single child case
• Declare that node to be deleted
instead of actually deleting (Lazy
Deletion)
Binary Search Tree
• Remove
• Declare that node to be deleted
instead of actually deleting (Lazy
Deletion)
• Delete 11
11
8
4
15
9
20
Binary Search Tree
• Remove
• Declare that node to be deleted
instead of actually deleting (Lazy
Deletion)
• Delete 11
8
4
15
9
20
Binary Search Tree
• Summary
• Insert is similar to Find
• Find Minimum: Left Most
• Remove Min: Find Min + Remove
• Complexities
•
•
•
•
•
Find: O(lgN)
Find Min: O(lgN)
Remove Min: O(lgN)
Insert: O(lgN)
Remove: O(lgN)
Binary Search Tree
• World of lgN!
•
•
•
•
It is supposed to be lgN only.
Upper bounds...?
Example...?
Reason...?
• Solution
• AVL Tree, Red Black Tree
• By using rotations
• Rarely used + difficult to implement
Heap
Priority is given to...
Heap (Priority Queue)
• Binary Tree (Usually complete)
• Queue only supports
• Enqueue (Insert)
• Dequeue (Remove Min)
• One property
• Node value is greater(smaller) than
all its descendants
• Max(Min) Heap
Heap (Priority Queue)
•
•
•
•
•
•
How to implement? Array!
Suppose 1-based array
For an node at Tth position
Parent = T/2
Left = 2T
Right = 2T+1
Heap (Priority Queue)
• Enqueue
• Just insert at the end
• Shift up until property is satisfied
Heap (Priority Queue)
• Dequeue
• Replace top element with the last
• Shift down until property is satisfied
Heap (Priority Queue)
• Build a heap
•
•
•
•
Each node’s height: lgN
For each node, do shift down
O(NlgN)???
For a binary tree with N nodes
• There are at most ceil(N/2h+1) nodes
with height h
• Complexity in terms of h
•
• Therefore it is O(N)
= O(2N)
Heap (Priority Queue)
• Summary
• Find is not supported
• O(N) search whole array
• Is remove supported?
• Any subtree is also a heap
• Remove Min on that subtree
• Complexity
•
•
•
•
•
Find: O(N)
Find Min: O(1)
Remove Min: O(lgN)
Insert: O(lgN)
Remove: O(lgN)
Hash Table
Memory is finite, but input range is NOT
Hash Table
• Simple Problem
• Mark Six result: 6 integers in [1..49]
• Check if each number matches my
bet
• Most efficient approach?
• Answer
• Use boolean array with 49 cells
• Check if that cell is set for each
number
• O(1)
Hash Table
• Simple Problem...?
• Mark Six result: 6 integers in [1..232]
• Check if each number matches my
bet
• Most efficient approach?
• Answer...?
• Use boolean array with 232 cells
• 4GB memory already
• Even O(1) checking... Too much
memory required
Hash Table
• Suppose we are not checking
numbers but strings instead...
• Suppose we are not checking
numbers but images instead...
• Solution
• Compress the range by Hash Function
• Convert the range into an integer
Hash Table
• Idea
• We need to store values
between 0 to 99, but we
have only 10 cells
• We can compress the
range [0, 99] to [0, 9] by
taking the modulo 10. It
is called Hash Value
• Insert, Find and Deleting are O(1)
Hash Table
• Problem
• What if inserting both 11 and 31?
• Collision problem
• Solutions
• Chaining (Open Hashing)
• Open Addressing (Close Hashing)
Hash Table - Chaining
• Each cell becomes a linked list
• On average, Insert/Find/Remove
takes O(1+α), where α is the Load
Factor = # stored / # cells
• For random Hash
Function, usually
average case
Hash Table – Open Addressing
• Find until a blank cell is reached
• Remove must use Lazy Deletion or
Find may fail in some cases
Hash Table – Open Addressing
• Previous method is called Linear
Probing
• Quadratic Probing is also popular
• Table size MUST be prime
• In OI, Linear Probing is enough
• Complexities
• Find: O(1 / 1-α)
• Insert: O(1 / 1-α)
• Remove: O(ln(1 / 1- α)/α + 1/α)
Hash Table – Summary
• Find Min and Remove Min is usually
not supported unless you search the
whole tree or array
• Complexities (Chaining)
• Find: O(1 + α)
• Insert: O(1 + α)
• Remove: O(1 + α)
• Complexities (Open Addressing)
• Find: O(1 / 1-α)
• Insert: O(1 / 1-α)
• Remove: O(ln(1 / 1- α)/α + 1/α)
• If α is small (<50%), almost O(1)
Summary
Compare what we have learnt
Summary
Binary Search
Tree
Heap (Priority
Queue)
Hash Table
(Open
Addressing)
Find
O(lgN) in average
O(N)
O(1 - 1/α)
Find Min
O(lgN) in average
O(1)
O(N)
Insert
O(lgN) in average
O(lgN)
O(1 – 1/α)
Remove
O(lgN) in average
O(lgN)
O(ln(1 / 1- α)/α +
1/α)
Remove Min
O(lgN) in average
O(lgN)
O(N+ln(1 / 1α)/α + 1/α)
Summary
• Choice of data structures are
depending on the problem nature
• What operations are intensive is
the critical factor of choosing them
Extras
Read only if you have digested main dishes
Extra Topics
• Double Hashing
• Use of two hash functions
• Binary Indexed Tree
• IOI 2001 Mobiles
• Segment Tree
• IOI 1998 Picture
• Quad Tree
• Rarely used
Tasks
Practise makes Perfect
Tasks
•
•
•
•
•
•
HKOJ 1020 – Left Join
HKOJ 1021 – Inner Join
HKOJ 1019 – Addition II
HKOJ 1090 – Dilligent
HKOJ 3061 – Tappy World
NOI 2004 Day 1 - Cashier
Q&A
Curiosity?