Survey

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Pattern recognition wikipedia , lookup

Inverse problem wikipedia , lookup

Dijkstra's algorithm wikipedia , lookup

Computational phylogenetics wikipedia , lookup

Corecursion wikipedia , lookup

Transcript
Different Tree Data Structures
for Different Problems
Data Structures
• Data Structures have to be adapted for each
problem:
– A specific subset of operations
• (Example: Search is needed, Insert is not)
– A specific set of requirements or usage patterns
• (Example: Search occurs frequently, Insert occurs rarely)
• Special Data Structures
• Hybrid data structures
• Augmented data structures
Problem 1
• Problem: Dictionary, where every key has the
same probability to be searched for and the
structure must be dynamic (inserts and deletes)
– Solution: we have seen that we can use Balanced
Binary Search Trees
Problem 2
• Problem: Dictionary, where every key has a
different, known probability to be searched for; the
structure is not dynamic.
– Example: A dictionary must contain words: bag, cat, dog,
school. It is known that school gets searched 40% of the
time, bag 30% of the time, dog 20% of the time and cat
10% of the time.
– The keys which are more frequent searched for must be
near the root. Balancing for reducing height does not
help here !
– Solution: build the Optimal Binary Search Tree
Optimal Binary Search Tree
0.1
0.4
cat
0.3
bag
school
0.2
0.3
dog
bag
0.2
0.4
dog
school
0.1
cat
• Balanced BST
0.1 * 0 + 0.3*1+ 0.2* 1 + 0.4*2 = 1.3
• Optimal BST
0.4 * 0 + 0.3*1+ 0.2* 2 + 0.1*3 = 1.0
Problem 3
• Problem: A dictionary structure, used to store a
dynamic set of words that may contain common
prefixes.
• Solution:
– We can exploit the common prefixes of the words, and
associate the words with paths in a tree instead of nodes
of a tree
– Solution: Trie trees (Prefix trees, Retrieval trees)
• Multipath trees
• If the alphabet has R symbols, a node may have R children
Trie Tree
B
E
S
E
Y
A
Word set:
BE, BY,
SEA, SEE, SEEN,
SELL, SHE
E
N
H
L
E
L
Problem 4
• Problem: we need a data structure having
features similar to the search trees
(search, insert, delete) but on the
secondary storage (disk).
– What is different with disks?
• Disks are slow, thus read/write happens in bulk
(several items at once, forming a page)
– Solution: Adapt the BST data structure to a
search tree that may contain in a node
several keys, up to fill up a page => B-Trees
B-Tree
8
4
5
6
11
13
17
15
22
25
Minimum degree of a B-tree: t>=2, such that every node, except the root,
must have n keys and n+1 children, where t<=n+1<=2t
2-3-4 trees are actually B-trees of min degree 2
The value of t must be in concordance with the size of the disk page
27
B-Trees – formal definition
•
•
A B-tree T is a rooted tree (whose root is T.root) having the properties:
Every node x has the following attributes:
– x.n, the number of keys currently stored in node x,
– the x.n keys themselves, x.key1, x.key2, …, x.keyx.n, stored in nondecreasing
order, so that x.key1 <= x.key2 <= … <= x.keyx.n
– x.leaf , a boolean value that is TRUE if x is a leaf and FALSE if x is an internal
node.
•
•
•
•
Each internal node x also contains x.n+1 pointers x.c1, x.c2, … x.cx.n+1 to its
children. Leaf nodes have no children, and so their ci attributes are
undefined.
The keys x.keyi separate the ranges of keys stored in each subtree: if ki is
any key stored in the subtree with root x.ci, then k1 <= x.key1 <= k2 <=
x.key2 <= … <= X.keyx.n <= kx.n+1
All leaves have the same depth, which is the tree’s height h.
Nodes have lower and upper bounds on the number of keys they can
contain. We express these bounds in terms of a fixed integer t >=2 called
the minimum degree of the B-tree:
– Every node other than the root must have at least t -1 keys. Every internal node
other than the root thus has at least t children. If the tree is nonempty, the root
must have at least one key.
– Every node may contain at most 2t -1 keys. Therefore, an internal node may
have at most 2t children.
Problem 5
• Problem: we need a data structure combining
the features of both Binary Search Trees and
Heaps, holding nodes that have both a search
key and a priority
– Binary Search Trees and Heaps both come in form of
Binary Trees
– Solution: Treap – a Binary Tree where the nodes
satisfy both the BST property and the Heap property !
• Why is this structure important ?
– Building a Treap by using random generated values as
priorities is a method for building a balanced binary tree
Treap