Comparison of Spatial Hashing Algorithms for
Mobile Wireless Network Simulations
Carl Hein
Jon Russo
Lockheed Martin Advanced Technology Laboratories
3 Executive Campus
Cherry Hill, NJ 08002
856-792-9893, 856-792-9887
[email protected], [email protected]
Keywords:
Mobile Network Modeling, Spatial Hashing
ABSTRACT: Mobile wireless network simulations require efficient methods for determining which nodes are
reachable by other nodes. This operation must be performed repeatedly as wireless nodes move. Alternatively,
methods are needed to predict when a node will come within, or go beyond, the range of another node. Such
propagation calculations can dominate the simulation of large networks and ultimately limit a model's scalability.
Several spatial hashing algorithms have been proposed to improve scalability. The algorithms rapidly locate only
nodes within a given range from another node, without involving calculations on all the other nodes. A set of prior
and new algorithms is described and surveyed in this paper. A set of criteria is described for comparing the
methods based on the relative frequency of operations needed by mobile wireless network simulations. Four
algorithms are evaluated, and the results are presented as a function of network size. Their relative cost and
scalability are compared.
1. Introduction
Simulations of mobile networks must calculate the
locations of, and distances between, moving objects
such as vehicles, and their positions relative to
stationary objects.
The trajectories of moving objects may be specified by
lists of way points. Each way point specifies an
object’s position at a given time. The location of the
object can be interpolated for any time between way
points. Optionally, the way point data may contain
higher order components, such as velocity and
acceleration, to more accurately interpolate positions.
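As an illustration of way-point interpolation, the following Python sketch linearly interpolates a position between two time-stamped way points. The names WayPoint and interpolate are hypothetical, and clamping to the first or last way point outside the covered time span is an assumption; higher-order terms such as velocity and acceleration are omitted.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class WayPoint:
    t: float  # time in seconds
    x: float  # east position in meters
    y: float  # north position in meters


def interpolate(waypoints, t):
    """Linearly interpolate (x, y) at time t from time-sorted way points."""
    times = [w.t for w in waypoints]
    # Assumption: hold the end positions outside the covered time span.
    if t <= times[0]:
        return (waypoints[0].x, waypoints[0].y)
    if t >= times[-1]:
        return (waypoints[-1].x, waypoints[-1].y)
    i = bisect_right(times, t)  # index of the first way point strictly after t
    a, b = waypoints[i - 1], waypoints[i]
    f = (t - a.t) / (b.t - a.t)  # fraction of the segment already travelled
    return (a.x + f * (b.x - a.x), a.y + f * (b.y - a.y))
```

For example, halfway between WayPoint(0, 0, 0) and WayPoint(10, 100, 50) the sketch returns the midpoint of the segment.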
In general, the object management services required to
support mobile network simulations are:
A. Instantiating and removing objects
B. Periodically computing their positions (moving
objects)
C. Asynchronously interpolating exact positions
D. Finding near-by objects.
We will refer to the four operations (A, B, C, D)
throughout the remainder of this paper.
For moving objects, the calculations must be
performed repeatedly. Data structures and algorithms
are required that will scale efficiently to support these
services over arbitrary distances, time spans, velocities,
and number of objects. Table 1 shows the intended
ranges to be handled.
Table 1. Intended Range of Distances, Time Spans,
Velocities, and Population
Region Sizes: Urban to whole-hemisphere coverage = 1
km to 10,000 km. Typically 500 km on a side.
Time Spans: 1 minute to several weeks; typically
several hours.
Velocities: 0.0 to ~28,000 km/h (orbital velocity);
typically 50 km/h for ground vehicles and 900 km/h
for aircraft.
Populations: 1 to 100,000 objects; typically 1,000
objects.
We considered the baseline method to be a single,
linear list of all the objects. To locate all objects within
a given distance of every other object would require
O(N^2) operations with this method, where N is the
number of objects, since the whole list must be
traversed for each object. Because this operation must
be performed repeatedly in mobile network
simulations, it could become the dominant computation
and effectively limit the scalability of the model for
large N. More scalable methods are therefore sought.
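The baseline can be sketched as follows, in Python for brevity. This is a minimal illustration, not the code that was benchmarked; find_near_linear and all_neighbors are hypothetical names, and positions are assumed to be planar (x, y) coordinates in meters.

```python
import math


def find_near_linear(positions, center, radius):
    """Scan the whole list for objects within radius of positions[center]."""
    cx, cy = positions[center]
    return [j for j, (x, y) in enumerate(positions)
            if j != center and math.hypot(x - cx, y - cy) <= radius]


def all_neighbors(positions, radius):
    """Run the O(N) scan once per object, giving O(N^2) work in total."""
    return {i: find_near_linear(positions, i, radius)
            for i in range(len(positions))}
```

Each call to find_near_linear touches every object once, so querying on behalf of all N objects performs on the order of N^2 distance tests.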
Spatial hashing methods generate location-based hash
keys that can be used to directly access just the objects
located within given sub-regions. Potentially, access
time can therefore be relatively independent of the total
number of objects being simulated, which could enable
scaling simulations to much larger numbers of objects
(N).
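A minimal sketch of a location-based hash key follows. It assumes a uniform square grid stored in a dictionary keyed by (column, row); the class name GridHash, its methods, and the dictionary storage are illustrative choices and are not taken from the implementations evaluated in this paper.

```python
import math
from collections import defaultdict


class GridHash:
    """Uniform-grid spatial hash (illustrative sketch)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # (col, row) -> ids of objects in that cell
        self.pos = {}                  # id -> (x, y)

    def key(self, x, y):
        """Location-based hash key: the grid cell containing (x, y)."""
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, obj_id, x, y):
        self.pos[obj_id] = (x, y)
        self.cells[self.key(x, y)].add(obj_id)

    def find_near(self, obj_id, radius):
        """Visit only the cells overlapping the search square, then filter."""
        cx, cy = self.pos[obj_id]
        c0, r0 = self.key(cx - radius, cy - radius)
        c1, r1 = self.key(cx + radius, cy + radius)
        found = []
        for col in range(c0, c1 + 1):
            for row in range(r0, r1 + 1):
                for other in self.cells.get((col, row), ()):
                    if other != obj_id:
                        ox, oy = self.pos[other]
                        if math.hypot(ox - cx, oy - cy) <= radius:
                            found.append(other)
        return sorted(found)
```

The number of cells visited depends on the ratio of the search radius to the cell size rather than on N, which is the source of the potential independence from the total number of objects.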
To support the requirements above, several data
structure/algorithm combinations were considered.
They differ in scalability and in how efficiently they
perform the above operations, so the choice among them
depends on how often each operation is
performed. Previous studies [1-11] have suggested
algorithms, but tended to consider one operation, such
as distance-based access, without weighting the relative
cost of the related maintenance operations based on the
expected frequency with which they must be
performed. However, in this study we assumed that for
the intended scenarios, objects are instantiated once
(A) and persist for long periods while being moved and
queried (B, C, D) many thousands or millions of times.
The relative frequencies of the balance of the
operations are uncertain, but we assume that updating
positions (B) occurs more frequently for most objects
(perhaps every 30 seconds for all objects) than
asynchronous queries (C) (perhaps only at specific
moments for a specific object). Likewise, finding near
objects (D) may occur somewhat less frequently and be
called by only a subset of objects as compared to
updating positions (B). So a notional expectation of
frequencies per object is:
A. Once
B. Millions of times
C. Thousands of times
D. Thousands of times
Another issue for (D) finding near objects is the span
of ranges. Some algorithms work well if the maximum
distance-horizon is known ahead of time and is
relatively constant. However, for radio applications, we
know that we need to perform nearness searches within
a given simulation over both short (1-2 km) and long
(hundreds of km) ranges, as well as in between,
especially when combinations of low-power UHF and
high-power HF radios are being simulated
simultaneously. Table 2 shows the data
structure/algorithm combinations that were investigated.
2. Experiment Design
To obtain quantifiable measurements as to how the
methods compare, the following test scenarios were
defined, and the methods were evaluated against them.
Table 2. The Methods, Advantages, and Concerns of Four Data Structure/Algorithms

Method: Linear Linked-List of Objects
Description: Un-ordered linear linked list.
Advantages: Simple, dynamic, low-cost A, B, C.
Concerns: Finding near objects (D) cost grows as square of number of objects
tracked.

Method: 2D Hash Matrix
Description: Space is divided into n x m grid cells. All objects located in a
given grid cell are attached to a linear list of objects within that cell.
Advantages: Finding near objects (D) cost is basically independent of the number
of objects tracked, if cell size can be set optimally for given scenario.
Concerns: Need to define region of operation ahead of time. Some cost for
maintaining structure with movement (B). Finding near objects (D) cost goes up
either for large ranges or small ranges if many nodes. Need to tune cell size to
case, and may not be able to cover ranges efficiently.

Method: Space Hash Tree
Description: A balanced tree is maintained where each node lists the range of
nodes below it. Search is rapid because the tree typically remains only log (N)
deep. Decision on which way to navigate at each node is made by comparing ranges.
Technically not a pure hash method as such, but a related accelerated logical
data structure.
Advantages: Dynamic. Finding near objects (D) cost goes up only as log of the
number of objects tracked (i.e., almost constant, vanishing for large N). Very
general purpose, should be reasonably good at all levels, densities, sizes, with
no bad blowups and without a priori info.
Concerns: Some cost for maintaining structure with movement (B).

Method: 1D Hash Array (proposed after seeing 2D hash matrix results)
Description: Like 2D Hash Matrix above, but space is divided in one direction
only (east-west).
Advantages: Finding near objects (D) cost is basically independent of the number
of objects tracked. A mix of advantages of pure linear linked-list and 2D hash,
possibly with only minimal down-sides of both.
Concerns: Need to define longitude region of operation a priori. Similar
concerns as 2D hash above, but less.
To determine the effects of scalability for each
candidate approach, we ran each test with:
• 2 objects
• 10 objects
• 100 objects
• 1,000 objects
• 10,000 objects
• 100,000 objects
• 1,000,000 objects
The map region was set to 50,000 meters on a side
(North-South/East-West). The objects are initially
instantiated randomly within the 50 km X 50 km
region. Each movement step randomly selects one-quarter (N/4) of the objects and randomly moves each
+/-50 meters North and East.
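The movement step can be sketched as below. Several details are assumptions made for illustration only: offsets are drawn uniformly from [-50, +50] meters on each axis, positions are clamped to the map region, and N/4 is rounded down. The helper cells_changed is hypothetical and counts how many moves would cross a grid-cell boundary, which is when a hash structure needs maintenance (B).

```python
import random


def movement_step(positions, rng, step=50.0, size=50_000.0):
    """Move a random quarter of the objects; return the moved indices."""
    n = len(positions)
    moved = rng.sample(range(n), n // 4)  # assumption: N/4 rounded down
    for i in moved:
        x, y = positions[i]
        # Assumption: uniform offset per axis, clamped to the map region.
        x = min(max(x + rng.uniform(-step, step), 0.0), size)
        y = min(max(y + rng.uniform(-step, step), 0.0), size)
        positions[i] = (x, y)
    return moved


def cells_changed(before, after, cell_size):
    """Count objects whose grid cell differs between two snapshots."""
    def key(p):
        return (int(p[0] // cell_size), int(p[1] // cell_size))
    return sum(1 for a, b in zip(before, after) if key(a) != key(b))
```

With steps of at most 50 meters and cells that are kilometers wide, most moves stay inside the same cell, but every move must still be checked against its cell key.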
To yield run times long enough to measure
accurately, the tests are run iteratively, with the
number of iterations scaled inversely to the number
of objects tracked so that run times are reasonable at
all scales. Where practical, the tests were run with:
Objects        Iterations (Iter)
2              20,000,000
10             2,000,000
100            200,000
1,000          20,000
10,000         2,000
100,000        200
1,000,000      20
Four tests were run for each candidate approach and
scale level:
• Instantiation (A)
• Movement ((N/4) * B)
• Finding Nearby Objects (D)
• All (One A, (N/4) B, and D)
The last case (All) is considered a realistic total
scenario with reasonable mixture of operations. The
prior tests help isolate the scalability of the individual
operations. Computing exact position (C) is not
investigated, since that would be the same for all
methods. It is just the interpolation between two way
points for the current exact time. All methods could
hold a pointer to a given object's way points, or could
reach any object in a similar way through a name-lookup
hash table. Therefore, the
primary concern of this investigation was to determine
a method that can find nearest objects (D) while
maintaining locations with movement (B).
Experiment conditions: All tests were compiled with
“gcc -O” and run on the same AMD PC under Red Hat
Linux Fedora Core 4.
3. Initial Results and Observations
To normalize all results for comparison, the effective
time in microseconds per object per trial is plotted
(Figures 1, 2, and 3). Initially, only the first three
methods were tested. The fourth method was proposed
after analyzing the initial results.
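One plausible reading of this normalization, stated here as an assumption, is total elapsed time divided by the product of object count and iteration count. The Python sketch below shows that arithmetic; usec_per_object_per_trial and run_trial are hypothetical names, and the original measurements were taken on compiled code rather than with this harness.

```python
import time


def usec_per_object_per_trial(run_trial, n_objects, iterations):
    """Time `iterations` calls of run_trial and normalize the result."""
    start = time.perf_counter()
    for _ in range(iterations):
        run_trial()
    elapsed = time.perf_counter() - start  # seconds
    # Assumed normalization: microseconds / (objects x iterations).
    return elapsed * 1e6 / (n_objects * iterations)
```

Under this normalization, a method whose cost per query is independent of N would plot as a flat line, while the O(N^2) baseline would rise linearly with N.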
Key:
Blue = Instantiation (A)
Violet = Move Update (B)
Green = Find Nearest (D)
Red = All (A, B, D), weighted by relative
frequency of operation
Figure 1. Linear-List Benchmark (Baseline)
Figure 2. Matrix Hash
Figure 3. Tree-Hashing Algorithm
The initial results were surprising in some respects.
The linear-linked list was faster and scaled to large
cases better than expected. The matrix hash method
was originally set to 256 grids, which was expected to
be “coarse” for 50 km. However, it fared
poorly in sparse situations due to excessive
visiting of empty grid cells. Therefore, the 2D matrix
was collapsed into a 1D array, such that space is
hashed in the East-West direction only. All latitudes
are stored at a given longitude hash. This was called
1D “array-hash” and was expected to reduce the area
“squared” search to a “linear” search operation.
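A minimal Python sketch of the 1D array-hash idea follows, assuming a fixed east-west extent divided into equal-width strips, with each strip holding every object at that longitude band regardless of latitude. The class name ArrayHash and its methods are illustrative and are not the benchmarked implementation.

```python
import math


class ArrayHash:
    """1D spatial hash over fixed east-west strips (illustrative sketch)."""

    def __init__(self, x_min, x_max, n_strips):
        self.x_min = x_min
        self.width = (x_max - x_min) / n_strips
        self.strips = [dict() for _ in range(n_strips)]  # id -> (x, y)
        self.where = {}                                  # id -> strip index

    def _index(self, x):
        i = int((x - self.x_min) // self.width)
        return min(max(i, 0), len(self.strips) - 1)  # clamp to the region

    def place(self, obj_id, x, y):
        """Insert or move an object (operations A and B)."""
        new = self._index(x)
        old = self.where.get(obj_id)
        if old is not None and old != new:
            del self.strips[old][obj_id]  # maintenance cost on strip change
        self.strips[new][obj_id] = (x, y)
        self.where[obj_id] = new

    def find_near(self, obj_id, radius):
        """Scan only the strips overlapping [x - radius, x + radius] (D)."""
        cx, cy = self.strips[self.where[obj_id]][obj_id]
        lo, hi = self._index(cx - radius), self._index(cx + radius)
        found = []
        for strip in self.strips[lo:hi + 1]:
            for other, (x, y) in strip.items():
                if other != obj_id and math.hypot(x - cx, y - cy) <= radius:
                    found.append(other)
        return sorted(found)
```

A query visits a run of strips along one axis instead of a block of cells along two, at the price of testing every object in those strips regardless of its north-south position.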
4. Secondary Findings
As with the matrix hash, the results of the array hash
varied widely with density (Figures 4, 5, and 6).
Therefore, a much coarser grid was tried: 16 cells
instead of 256. Both the array and matrix methods were
tried with 16 cells. The array-hash algorithm then
began to approach the behavior of the simple linear
linked list, and its performance improved dramatically
but still was not quite as good overall as the linear list.
Figure 4. Array Hash
Figure 5. Matrix Hash with 16 Grids (Coarse)
Figure 6. Array Hash with 16 Grids (Coarse)
5. Conclusion
In this paper, we studied the efficiency of six variant
spatial indexing schemes through their instantiation,
maintenance, and access processes. Under the
expected usage conditions, none of the hash-based
methods tested here consistently exceeded the baseline
linear-linked list method in scalability by any
significant amount. Much of this result is due to the
relatively higher cost of maintaining the hashing
structures when large numbers of nodes are moving. If
the nodes were relatively stationary, the hash-based
methods would certainly offer superior access times, but
if the nodes are stationary, then few accesses would be
required. The
unique aspects of mobile network simulations require
frequent position updates and accesses. Considering its
relative simplicity, the linear list method appears the
best choice of the methods tested so far for the
expected conditions of wireless network simulation.
Further work should be done in finding and testing
other hashing algorithms or designing new spatial
hashing algorithms that exhibit lower maintenance
costs, which might be more advantageous for mobile
network simulations.
6. References
[1] Davis, W. A. and C. H. Hwang, “Organizing and
Indexing for Spatial Data,” 2nd International
Conference on Spatial Data Handling, Seattle,
July 1986.
[2] Harle, Robert K., “Spatial Indexing for
Location-Aware Systems,” The First
International Workshop on Mobile and
Ubiquitous Context Aware Systems and
Applications, Philadelphia, PA, August 6, 2007.
[3] Erin Hastings, Jaruwan Mesit, and Ratan Guha,
“A Scalable Technique for Large Scale,
Real-Time Range Monitoring of Heterogeneous
Clients,” 3rd International Conference on
Testbeds and Research Infrastructures for the
Development of Networks and Communities,
Orlando, Florida, May 2007.
[4] Mao Huaqing and Bian Fuling, “Design and
Implementation of QR+Tree Index Algorithms,”
International Conference on Wireless
Communications, Networking and Mobile
Computing, Shanghai, China, Sept. 2007.
[5] Mathias Eitz and Gu Lixu, “Hierarchical Spatial
Hashing for Real-time Collision Detection,”
Proceedings of the IEEE International
Conference on Shape Modeling and Applications
2007, Lyon, France, May 2007.
[6] Hadjieleftheriou, M., E.G. Hoel, and V.J. Tsotras,
“SaIL: a Library for Efficient Application
Integration of Spatial Indices,” Proceedings of
16th International Conference on Scientific and
Statistical Database Management, Santorini
Island, Greece, June 2004.
[7] Brain, M. and A. Tharp, 1990. Perfect hashing
using sparse matrix packing. Information
Systems, 15(3), 281-290.
[8] Assarsson, U. and T. Möller, 2000. “Optimized
View Frustum Culling Algorithms for Bounding
Boxes,” Journal of Graphics Tools 2000.
[9] Gross M., B. Heidelberger, M. Müller, D.
Pomeranets, and M. Teschner, 2003. “Optimized
Spatial Hashing for Collision Detection of
Deformable Objects,” Vision, Modeling, and
Visualization 2003.
[10] Hastings, E. and R. Guha, 2005. “Real-Time
Range Monitoring Queries on Heterogeneous
Mobile Objects by Spatial Hashing.”
[11] Lo, M. and C. Ravishankar, 1996. “Spatial
Hash-Joins,” ACM SIGMOD International Conference
on Management of Data 1996.