Nearest Neighbor Searching
Under Uncertainty
Wuzhou Zhang
Supervised by Pankaj K. Agarwal
Department of Computer Science
Duke University
Nearest Neighbor Searching (NNS)
Applications
• Pattern Recognition, Data Compression
• Statistical Classification, Clustering
• Databases, Information Retrieval
• Computer Vision, etc.
http://en.wikipedia.org/wiki/Nearest_neighbor_search
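As a baseline, here is a minimal sketch of the classical (certain) problem, purely for illustration; the function name and the toy point set are my own, not from the talk: given a set S of points and a query q, report the point of S closest to q.

import math

# Brute-force nearest neighbor under the Euclidean metric.
# points and q are (x, y) tuples; O(n) per query.
def nearest_neighbor(points, q):
    return min(points, key=lambda p: math.dist(p, q))

S = [(0, 0), (2, 1), (5, 5), (1, 4)]
print(nearest_neighbor(S, (1.5, 1.0)))  # -> (2, 1)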
Nearest Neighbor Searching
Under Uncertainty
Figure: an uncertain point modeled either by a discrete pdf (probability masses such as 0.1, 0.2, 0.3, 0.4 at a few possible locations) or by a continuous pdf.
Nearest Neighbor In Expectation
For an uncertain point $P$ with discrete pdf $\{(a_i, p_i)\}$, the expected distance to a query $q$ is $\mathrm{Ed}(P, q) = \sum_i p_i \, d(a_i, q)$ (and $\int f_P(x)\, d(x, q)\, dx$ in the continuous case). The expected nearest neighbor of $q$ is the uncertain point minimizing $\mathrm{Ed}(P, q)$.
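A minimal sketch of that definition for the discrete case (illustration only; the helper names and the toy pdfs below are mine): an uncertain point is a list of (location, probability) pairs, and the expected NN of q is the one with the smallest expected distance.

import math

def expected_distance(P, q):
    # Ed(P, q) = sum_i p_i * d(a_i, q) for a discrete pdf P = [(a_i, p_i)]
    return sum(p_i * math.dist(a_i, q) for a_i, p_i in P)

def expected_nn(uncertain_points, q):
    # the uncertain point with the smallest expected distance to q
    return min(uncertain_points, key=lambda P: expected_distance(P, q))

P1 = [((0, 0), 0.4), ((1, 0), 0.4), ((5, 5), 0.2)]   # probabilities sum to 1
P2 = [((2, 2), 0.3), ((3, 3), 0.7)]
print(expected_nn([P1, P2], (0, 1)) is P1)           # -> True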
Bisector In Case Of Gaussian
For Gaussian distributions, the bisector is a line! However, the expected distance itself is hard to express with an explicit formula!
Squared Distance Function
Figure: bisector under the squared distance function.
• The bisector is simple and beautiful (see the derivation below)!
• In the case of a discrete pdf, the bisector is also a line!
• In both cases, compute the Voronoi diagram and solve the problem optimally!
• However, the squared distance is not a metric!
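To see why the bisector is a line under the squared distance, here is a short derivation in my own notation (the discrete pdf of $P$ is $\{(a_i, p_i)\}$, with $\bar a = \sum_i p_i a_i$):

$\mathrm{Ed}^2(P, q) = \sum_i p_i \lVert a_i - q \rVert^2 = \lVert q \rVert^2 - 2\langle \bar a, q \rangle + \sum_i p_i \lVert a_i \rVert^2 .$

The quadratic term $\lVert q \rVert^2$ is the same for every uncertain point, so the bisector condition $\mathrm{Ed}^2(P, q) = \mathrm{Ed}^2(P', q)$ reduces to $2\langle \bar a' - \bar a,\, q \rangle = \sum_i p'_i \lVert a'_i \rVert^2 - \sum_i p_i \lVert a_i \rVert^2$, which is linear in $q$, i.e. a line in the plane; the same cancellation works for continuous pdfs with finite second moments. On the other hand, the squared distance violates the triangle inequality, which is why it is not a metric.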
Sampling Continuous Distributions
Sometimes working with continuous distributions is hard…
Lower bounds for other metrics and distributions are also possible…
Let's focus on discrete pdfs, then…
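One simple way to pass from a continuous to a discrete pdf, shown as a hedged illustration (the function name, the Gaussian example, and the equal-weight scheme are my choices, not necessarily the talk's): draw k sample locations and weight each by 1/k.

import random

def discretize_gaussian(mean, sigma, k=50, seed=0):
    # draw k locations from an isotropic 2-D Gaussian and weight each by 1/k
    rng = random.Random(seed)
    return [((rng.gauss(mean[0], sigma), rng.gauss(mean[1], sigma)), 1.0 / k)
            for _ in range(k)]

P = discretize_gaussian((0.0, 0.0), sigma=1.0)
assert abs(sum(p_i for _, p_i in P) - 1.0) < 1e-9   # weights form a discrete pdf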
Expected Nearest Neighbor
In L1 Metric (Manhattan metric)
Expected Nearest Neighbor
In L1 Metric (cont.)
Source: Range Searching on Uncertain Data [P. K. Agarwal et al., 2009]
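As a reading aid, here is my own sketch of the standard coordinate-wise decomposition in the L1 case, using the discrete-pdf notation from above with $a_i = (a_{i,x}, a_{i,y})$:

$\mathrm{Ed}_1(P, q) = \sum_i p_i \lVert a_i - q \rVert_1 = \sum_i p_i \lvert a_{i,x} - q_x \rvert + \sum_i p_i \lvert a_{i,y} - q_y \rvert = f_P(q_x) + g_P(q_y),$

where $f_P$ and $g_P$ are univariate, convex, piecewise-linear functions whose breakpoints are the coordinates of the $a_i$. Finding the expected NN of $q$ therefore amounts to minimizing $f_P(q_x) + g_P(q_y)$ over the uncertain points $P$.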
Geometric Reduction
Building Block: Half-Space Intersection and Convex Hulls
Upper hulls correspond to lower envelopes (an example in 2D).
Source: pages 252-253, Computational Geometry: Algorithms and Applications, 3rd Edition [Mark de Berg et al.]
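A small sketch of that correspondence (my own code, not the book's or the talk's): map each line y = m*x + c to the dual point (m, -c); the lines appearing on the lower envelope are exactly those whose dual points lie on the upper convex hull.

def cross(o, a, b):
    # cross product of vectors o->a and o->b (positive for a counter-clockwise turn)
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def upper_hull(points):
    # upper convex hull via Andrew's monotone chain; points are (x, y) tuples
    hull = []
    for p in sorted(set(points)):
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

def lower_envelope(lines):
    # lines: (m, c) pairs for y = m*x + c; returns the lines on the lower envelope
    dual = [(m, -c) for (m, c) in lines]
    return [(m, -b) for (m, b) in upper_hull(dual)]

# y = 5 never appears on the lower envelope of {y = x + 1, y = 5, y = -x + 1}
print(lower_envelope([(1, 1), (0, 5), (-1, 1)]))   # -> [(-1, 1), (1, 1)]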
Segment-tree Based Data Structures
for Expected-NN In L1 Metric
Segment-tree Based Data Structures
for Expected-NN In L1 Metric (cont.)
• Size of the data structure
• Preprocessing time
• Query time
Summary of the result
Approximate L2 Metric
It's a metric when P is centrally symmetric!
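For concreteness, here is a hedged sketch (my own construction; the talk's polygon P may be chosen differently) of one standard centrally symmetric polygonal norm that approximates L2: take the maximum of |<v, u_i>| over k evenly spaced directions u_i, whose unit ball is a regular 2k-gon, so the induced distance is a genuine metric.

import math

def polygonal_norm(v, k=8):
    # max_i |<v, u_i>| over k evenly spaced unit directions u_i
    vx, vy = v
    return max(abs(vx * math.cos(math.pi * i / k) + vy * math.sin(math.pi * i / k))
               for i in range(k))

def polygonal_distance(p, q, k=8):
    return polygonal_norm((q[0] - p[0], q[1] - p[1]), k)

# the polygonal distance never exceeds L2 and is within a 1/cos(pi/(2k)) factor of it,
# so k = O(1/sqrt(eps)) directions give a (1 + eps)-approximation
p, q = (0.0, 0.0), (3.0, 4.0)
l2 = math.hypot(q[0] - p[0], q[1] - p[1])
approx = polygonal_distance(p, q)
assert approx <= l2 + 1e-9 and l2 <= approx / math.cos(math.pi / (2 * 8)) + 1e-9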
Approximate L2 Metric (cont.)
More complex!
Future Work
• Approximate the expected NN in the L2 metric (work harder in the near future!)
• Study the complexity of the expected Voronoi diagram
• Study the probabilistic case
Thanks!
Main References:
[1] Pankaj K. Agarwal, Siu-Wing Cheng, Yufei Tao, Ke Yi: Indexing Uncertain Data. PODS 2009: 137-146.
[2] Pankaj K. Agarwal, Lars Arge, Jeff Erickson: Indexing Moving Points. J. Comput. Syst. Sci. 66(1): 207-243 (2003).