Introducing
Hashing
Chapter 19
Slides by Steve Armstrong
LeTourneau University
Longview, TX
2007, Prentice Hall
Chapter Contents
• What is Hashing?
• Hash Functions
  - Computing Hash Codes
  - Compressing a Hash Code into an Index for the Hash Table
• Resolving Collisions
  - Open Addressing with Linear Probing
  - Open Addressing with Quadratic Probing
  - Open Addressing with Double Hashing
  - A Potential Problem with Open Addressing
  - Separate Chaining
Carrano, Data Structures and Abstractions with Java, Second Edition, (c) 2007 Pearson Education, Inc. All rights reserved. 0-13-237045-X
What is Hashing? 1
• A technique that determines an index or location for storage of an item in a data structure
• The hash function receives the search key
  - Returns the index of an element in an array called the hash table
  - The index is known as the hash index
• A perfect hash function maps each search key into a different integer suitable as an index to the hash table
What is Hashing? 2
Fig. 19-1 A hash function indexes its hash table.
What is Hashing? 3
• Two steps of the hash function
  - Convert the search key into an integer called the hash code
  - Compress the hash code into the range of indices for the hash table
• Typical hash functions are not perfect
  - They can allow more than one search key to map into a single index
  - This is known as a collision
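The two steps above can be sketched in Java as follows. This is a minimal illustration rather than the textbook's code: the class name HashIndexSketch, the method hashIndex, and the table size 11 are assumptions chosen for the example.

```java
/** Illustrative sketch; names and table size are assumptions, not from the slides. */
final class HashIndexSketch {
    /** Step 1: obtain the key's hash code. Step 2: compress it into 0..tableSize-1. */
    static int hashIndex(Object key, int tableSize) {
        int hashCode = key.hashCode();              // step 1: search key -> integer
        return Math.floorMod(hashCode, tableSize);  // step 2: integer -> valid index
    }

    public static void main(String[] args) {
        int tableSize = 11; // assumed prime table size
        // An Integer's hash code is its own value, so 18 and 29 both map to index 7.
        System.out.println(hashIndex(18, tableSize)); // 7
        System.out.println(hashIndex(29, tableSize)); // 7, which is a collision
    }
}
```

Two different search keys landing on index 7 is exactly the kind of collision the slide describes.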
What is Hashing?
Fig. 19-2 A collision caused by the hash function h
Hash Functions 4
• General characteristics of a good hash function
  - Minimize collisions
  - Distribute entries uniformly throughout the hash table
  - Be fast to compute
Computing Hash Codes 5
• We will override the hashCode method of Object
• Guidelines
  - If a class overrides the method equals, it should override hashCode
  - If the method equals considers two objects equal, hashCode must return the same value for both objects
  - Continued …
Computing Hash Codes
• Guidelines continued …
  - If an object invokes hashCode more than once during the execution of a program on the same data, it must return the same hash code each time
  - An object's hash code during one execution of a program can differ from its hash code during another execution of the same program
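A minimal sketch of a class that follows these guidelines is shown here. The class Name and its fields are hypothetical, introduced only for illustration; the point is that hashCode is computed from exactly the fields that equals compares.

```java
import java.util.Objects;

/** Hypothetical key class used only for illustration. */
final class Name {
    private final String first;
    private final String last;

    Name(String first, String last) {
        this.first = Objects.requireNonNull(first);
        this.last = Objects.requireNonNull(last);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Name)) return false;
        Name that = (Name) other;
        return first.equals(that.first) && last.equals(that.last);
    }

    /** Built from the same fields equals compares, so equal objects share a hash code. */
    @Override
    public int hashCode() {
        return 31 * first.hashCode() + last.hashCode();
    }
}
```

Because the fields are final, repeated calls to hashCode on the same object return the same value during one execution.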
Computing Hash Codes 7
• The hash code for a string, s
• Hash code for a primitive type
  - Use the primitive-typed key itself (for a char, its Unicode value)
  - Manipulate internal binary representations
  - Use folding
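The slide's formula for the string hash code did not survive in this transcript, so the sketch below assumes a common choice: the polynomial scheme documented for java.lang.String.hashCode, with multiplier 31, evaluated by Horner's method. The folding method assumes a 64-bit key whose two halves are combined with exclusive or, which is what Long.hashCode does. The class and method names are hypothetical.

```java
/** Illustrative sketch; the multiplier 31 and the XOR folding are assumed choices. */
final class HashCodeSketch {
    /** Polynomial hash of a string's characters, evaluated with Horner's method. */
    static int stringHash(String s) {
        int hash = 0;
        for (int i = 0; i < s.length(); i++) {
            hash = 31 * hash + s.charAt(i); // each char contributes its Unicode value
        }
        return hash;
    }

    /** Folds a 64-bit key into 32 bits by XOR-ing its upper and lower halves. */
    static int foldLong(long key) {
        return (int) (key ^ (key >>> 32));
    }

    public static void main(String[] args) {
        // Both comparisons check the sketch against the standard library.
        System.out.println(stringHash("Java") == "Java".hashCode());
        System.out.println(foldLong(123456789012345L) == Long.hashCode(123456789012345L));
    }
}
```

Integer overflow in stringHash is intentional: the result simply wraps around, which is acceptable for a hash code.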
Compressing a Hash Code 9
• Must compress the hash code so it fits into the index range
• Typical method for a hash code c is to compute c % n
  - n is a prime number (the size of the table)
  - Index will then be between 0 and n – 1 when c is nonnegative; in Java, c % n is negative for a negative c, so add n in that case
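A minimal sketch of the compression step follows. The class and method names are hypothetical. It assumes hash codes may be negative, which is possible in Java, where the % operator then yields a negative remainder, so the sketch adjusts the result to keep the index in range.

```java
/** Illustrative sketch; names are assumptions, not from the slides. */
final class CompressSketch {
    /** Compresses a hash code into an index between 0 and n - 1. */
    static int compress(int hashCode, int n) {
        int index = hashCode % n; // negative in Java when hashCode is negative
        if (index < 0) {
            index += n;           // shift into the range 0..n-1
        }
        return index;
    }

    public static void main(String[] args) {
        System.out.println(compress(12, 5)); // 2
        System.out.println(compress(-7, 5)); // 3
    }
}
```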
Resolving Collisions 11
• Options when the hash function returns a location already used in the table
  - Use another location in the table
  - Change the structure of the hash table so that each array location can represent multiple values
Open Addressing with Linear Probing 12
• An open addressing scheme locates an alternate location
  - The new location must be open (available)
• Linear probing
  - If a collision occurs at hashTable[k], look successively at locations k + 1, k + 2, …, wrapping around to the start of the table
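Linear probing for an addition can be sketched as below. This is a simplified illustration under stated assumptions: the table holds Integer keys only, null marks an open location, duplicates are not checked, and the names LinearProbingSketch and add are hypothetical.

```java
/** Illustrative sketch of adding with linear probing; names are assumptions. */
final class LinearProbingSketch {
    /** Stores key in the first open location of its probe sequence; returns that index, or -1 if full. */
    static int add(Integer[] table, int key) {
        int n = table.length;
        int k = Math.floorMod(Integer.hashCode(key), n); // original hash index
        for (int step = 0; step < n; step++) {
            int index = (k + step) % n;                  // k, k + 1, k + 2, ... with wraparound
            if (table[index] == null) {
                table[index] = key;
                return index;
            }
        }
        return -1; // every location is occupied
    }

    public static void main(String[] args) {
        Integer[] table = new Integer[7];
        System.out.println(add(table, 3));  // 3
        System.out.println(add(table, 10)); // 4, because index 3 is taken
        System.out.println(add(table, 17)); // 5
    }
}
```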
Open Addressing with Linear Probing 13
Fig. 19-3 The effect of linear probing after adding four entries whose search keys hash to the same index.
Open Addressing with Linear Probing 14
Fig. 19-4 A revision of the hash table shown in Fig. 19-3 when linear probing resolves collisions; each entry contains a search key and its associated value for retrieving
Removals 15
Fig. 19-5 A hash table if remove used null to remove entries.
Removals
Distinguishing among three kinds of locations in the hash table
1. Occupied: the location references an entry in the dictionary
2. Empty: the location contains null and always did
3. Available: the location's entry was removed from the dictionary
Open Addressing with Linear Probing 16
Fig. 19-6 A linear probe sequence (a) after adding an entry;
(b) after removing two entries;
Open Addressing with Linear Probing 16
Fig. 19-6 A linear probe sequence (c) after a search; (d) during the search while adding an entry; (e) after an addition to a formerly occupied location.
Searches that Dictionary Operations Require 16
• To retrieve an entry
  - Search the probe sequence for the key
  - Examine entries that are present; ignore locations in the available state
  - Stop the search when the key is found or null is reached
Searches that Dictionary Operations Require
• To remove an entry
  - Search the probe sequence the same way as for retrieval
  - If the key is found, mark the location as available
• To add an entry
  - Search the probe sequence the same way as for retrieval
  - Note the first available slot
  - If the key is not found, use the first available slot, or the null location that ended the search if no available slot was seen
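The three searches above can be sketched in one small class. This is an illustrative sketch, not the textbook's implementation: the class name ProbingDictionary, the AVAILABLE sentinel object, and the use of linear probing with a fixed-size table are all assumptions.

```java
/** Illustrative dictionary using open addressing with linear probing; names are assumptions. */
final class ProbingDictionary {
    // Sentinel key marking the "available" state: an entry was stored here and later removed.
    private static final Object AVAILABLE = new Object();

    private final Object[] keys;   // null means "empty": never used
    private final Object[] values;

    ProbingDictionary(int primeSize) {
        keys = new Object[primeSize];
        values = new Object[primeSize];
    }

    private int hashIndex(Object key) {
        return Math.floorMod(key.hashCode(), keys.length);
    }

    /** Returns the index holding key, or -1 if the search reaches null or the whole table was probed. */
    private int locate(Object key) {
        int start = hashIndex(key);
        for (int step = 0; step < keys.length; step++) {
            int i = (start + step) % keys.length;
            if (keys[i] == null) return -1;                            // empty: stop searching
            if (keys[i] != AVAILABLE && keys[i].equals(key)) return i; // occupied and matching
            // available, or occupied by another key: keep probing
        }
        return -1;
    }

    Object getValue(Object key) {
        int i = locate(key);
        return i < 0 ? null : values[i];
    }

    /** Marks the key's location as available; returns the removed value, or null if absent. */
    Object remove(Object key) {
        int i = locate(key);
        if (i < 0) return null;
        Object old = values[i];
        keys[i] = AVAILABLE;
        values[i] = null;
        return old;
    }

    /** Adds or replaces an entry; returns the replaced value, or null for a new key. */
    Object add(Object key, Object value) {
        int start = hashIndex(key);
        int firstFree = -1; // first available or empty location seen during the search
        for (int step = 0; step < keys.length; step++) {
            int i = (start + step) % keys.length;
            if (keys[i] == null) {
                if (firstFree < 0) firstFree = i;
                break;                              // key is not in the table
            }
            if (keys[i] == AVAILABLE) {
                if (firstFree < 0) firstFree = i;   // remember it, but keep searching
            } else if (keys[i].equals(key)) {
                Object old = values[i];
                values[i] = value;                  // key already present: replace its value
                return old;
            }
        }
        if (firstFree < 0) throw new IllegalStateException("hash table is full");
        keys[firstFree] = key;
        values[firstFree] = value;
        return null;
    }
}
```

Marking a removed location with the sentinel instead of null keeps later searches from stopping early at that location.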
Open Addressing, Quadratic Probing 18
• Change the probe sequence
  - Given a search key whose hash index is k
  - Probe to k + 1², k + 2², k + 3², …, k + n²
• Reaches at least half of the locations in the hash table if the table size is a prime number
• Avoids primary clustering
  - But can lead to secondary clustering
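A short sketch of the quadratic probe sequence follows, along with a count of how many distinct locations it can reach. The names QuadraticProbe, probeIndex, and reachable, the probe counter j, and the table size 11 are assumptions made for the example.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Illustrative sketch of quadratic probing; names and sizes are assumptions. */
final class QuadraticProbe {
    /** Location examined on the j-th probe for a key whose hash index is k. */
    static int probeIndex(int k, int j, int tableSize) {
        return (int) ((k + (long) j * j) % tableSize); // k + j^2, wrapped into the table
    }

    /** Distinct locations visited by probes j = 0 .. tableSize - 1, in visiting order. */
    static Set<Integer> reachable(int k, int tableSize) {
        Set<Integer> seen = new LinkedHashSet<>();
        for (int j = 0; j < tableSize; j++) {
            seen.add(probeIndex(k, j, tableSize));
        }
        return seen;
    }

    public static void main(String[] args) {
        // With a prime table size of 11, only 6 of the 11 locations are reached.
        System.out.println(reachable(7, 11));        // [7, 8, 0, 5, 1, 10]
        System.out.println(reachable(7, 11).size()); // 6
    }
}
```

Every key with the same hash index follows the same sequence, which is the source of secondary clustering.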
Open Addressing, Quadratic Probing
Fig. 19-7 A probe sequence of length
five using quadratic probing.
Open Addressing with Double Hashing 19
• Resolves a collision by examining locations
  - At the original hash index
  - Plus an increment determined by a second hash function
• Second hash function
  - Different from the first
  - Depends on the search key
  - Returns a nonzero value
• Reaches every location in the hash table if the table size is prime
• Avoids both primary and secondary clustering
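A minimal sketch of a double-hashing probe sequence is given here. The two hash functions, h1(key) = key % 7 and h2(key) = 5 - key % 5, the table size 7, and the restriction to nonnegative integer keys are illustrative assumptions, as are the class and method names.

```java
/** Illustrative sketch of double hashing; functions, size, and names are assumptions. */
final class DoubleHashSketch {
    static final int TABLE_SIZE = 7; // assumed prime table size

    /** First hash function: the original hash index (nonnegative keys assumed). */
    static int h1(int key) {
        return key % TABLE_SIZE;
    }

    /** Second hash function: depends on the key and always returns a value from 1 to 5. */
    static int h2(int key) {
        return 5 - key % 5;
    }

    /** Location examined on the j-th probe: original index plus j increments. */
    static int probeIndex(int key, int j) {
        return (h1(key) + j * h2(key)) % TABLE_SIZE;
    }

    public static void main(String[] args) {
        // For key 16: h1 = 2 and h2 = 4, so the sequence starts 2, 6, 3.
        for (int j = 0; j < 3; j++) {
            System.out.println(probeIndex(16, j));
        }
    }
}
```

Because the increment depends on the key, two keys with the same hash index usually follow different probe sequences.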
Open Addressing with Double Hashing 20
Fig. 19-8 The first three locations in a probe sequence
generated by double hashing for the search key.
Separate Chaining 22
• Alter the structure of the hash table
• Each location can represent multiple values
  - Each location is called a bucket
• Bucket can be a(n)
  - List
  - Sorted list
  - Chain of linked nodes
  - Array
  - Vector
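Separate chaining with buckets that are chains of linked nodes can be sketched as follows. This is an illustration under assumptions: the class name ChainedDictionary, its method names, and the choice to keep distinct, unsorted keys with new nodes placed at the front of a chain are not taken from the slides.

```java
/** Illustrative dictionary using separate chaining; names and policies are assumptions. */
final class ChainedDictionary<K, V> {
    /** One node in a bucket's chain. */
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] buckets; // each array location references a chain, or null

    @SuppressWarnings("unchecked")
    ChainedDictionary(int primeSize) {
        buckets = (Node<K, V>[]) new Node[primeSize];
    }

    private int hashIndex(K key) {
        return Math.floorMod(key.hashCode(), buckets.length);
    }

    /** Adds or replaces an entry; returns the replaced value, or null for a new key. */
    V add(K key, V value) {
        int i = hashIndex(key);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        buckets[i] = new Node<>(key, value, buckets[i]); // new node at the front of the chain
        return null;
    }

    V getValue(K key) {
        for (Node<K, V> n = buckets[hashIndex(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) return n.value;
        }
        return null;
    }

    /** Unlinks the key's node from its chain; returns the removed value, or null if absent. */
    V remove(K key) {
        int i = hashIndex(key);
        Node<K, V> prev = null;
        for (Node<K, V> n = buckets[i]; n != null; prev = n, n = n.next) {
            if (n.key.equals(key)) {
                if (prev == null) buckets[i] = n.next;
                else prev.next = n.next;
                return n.value;
            }
        }
        return null;
    }
}
```

Removal here needs no "available" marker, because unlinking a node does not disturb the other entries in the chain.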
Separate Chaining
Fig. 19-9 A hash table for use with separate chaining;
each bucket is a chain of linked nodes.
Separate Chaining 23
Fig. 19-10 Where new entry is inserted into linked bucket
when integer search keys are (a) duplicate and unsorted;
Separate Chaining
Fig. 19-10 Where new entry is inserted into linked bucket
when integer search keys are (b) distinct and unsorted;
Separate Chaining
Fig. 19-10 Where new entry is inserted into linked bucket
when integer search keys are (c) distinct and sorted