Efficient Computation of Range Aggregates against Uncertain Location Based Queries
Abstract:
In many applications, including location based services, queries may not be precise.
In this paper, we study the problem of efficiently computing range aggregates in a
multidimensional space when the query location is uncertain. Specifically, for a
query point Q whose location is uncertain and a set S of points in a multidimensional
space, we want to calculate the aggregate (e.g., count, average, and sum) over the
subset S' of S such that for each p ∈ S', Q has probability at least θ of being
within distance γ of p. We propose novel, efficient techniques to solve the
problem following the filtering-and-verification paradigm. In particular, two novel
filtering techniques are proposed to effectively and efficiently remove data points
from verification. Our comprehensive experiments based on both real and synthetic
data demonstrate the efficiency and scalability of our techniques.
Algorithm Used:
Filtering-and-Verification Algorithm:
This motivates us to follow the filtering-and-verification paradigm for the uncertain
aggregate query computation. Particularly, in the filtering phase, effective and
efficient filtering techniques will be applied to prune or validate the points. The
points that cannot be pruned or validated will be verified in the verification phase in
which their falling probabilities are calculated. Therefore, it is desirable to develop
effective and efficient filtering techniques to prune or validate points such that the
number of points being verified can be significantly reduced.
Input: RS : an aggregate R tree on data set S,
Q : uncertain query, F : Filter, γ : query distance,
θ : probabilistic threshold.
Output: |Qθ,γ(S)|
Description:
1: Queue := ∅; cn := 0; C := ∅;
2: Insert root of RS into Queue;
3: while Queue ≠ ∅ do
4: e ← dequeue from the Queue;
5: if e is validated by the filter F then
6: cn := cn + |e|;
7: else
8: if e is not pruned by the filter F then
9: if e is data entry then
10: C := C ∪ {p}, where p is the data point represented by e;
11: else
12: put all child entries of e into Queue;
13: end if
14: end if
15: end if
16: end while
17: for each point p ∈ C do
18: if Pfall(Q, p, γ) ≥ θ then
19: cn := cn + 1;
20: end if
21: end for
22: Return cn
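The pseudocode can be fleshed out as the following Java sketch. The Entry and Filter abstractions and the sample-based estimate of the falling probability Pfall are illustrative assumptions for exposition, not the paper's actual aggregate R-tree implementation.

    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.Deque;
    import java.util.List;

    // Sketch of the filtering-and-verification traversal above. Entry, Filter,
    // and the sample-based query model are illustrative assumptions; the paper
    // works on an aggregate R-tree with the filters described below.
    final class RangeCountQuery {

        interface Filter {
            boolean validates(Entry e);  // every point under e surely qualifies
            boolean prunes(Entry e);     // no point under e can qualify
        }

        static final class Entry {
            final double[] point;        // non-null iff this is a data entry
            final List<Entry> children;  // child entries of an index entry
            final int count;             // number of data points underneath

            Entry(double[] point, List<Entry> children, int count) {
                this.point = point; this.children = children; this.count = count;
            }
            boolean isDataEntry() { return point != null; }
        }

        // Counts the points p in S with Pfall(Q, p, gamma) >= theta.
        static int rangeCount(Entry root, Filter f, double[][] querySamples,
                              double gamma, double theta) {
            Deque<Entry> queue = new ArrayDeque<>();
            List<double[]> candidates = new ArrayList<>();
            int cn = 0;
            queue.add(root);
            while (!queue.isEmpty()) {                       // filtering phase
                Entry e = queue.poll();
                if (f.validates(e)) {
                    cn += e.count;                           // all points qualify
                } else if (!f.prunes(e)) {
                    if (e.isDataEntry()) candidates.add(e.point);
                    else queue.addAll(e.children);
                }
            }
            for (double[] p : candidates)                    // verification phase
                if (fallingProbability(querySamples, p, gamma) >= theta) cn++;
            return cn;
        }

        // Sample-based estimate of Pfall(Q, p, gamma): the fraction of the
        // query's sampled locations lying within distance gamma of p.
        static double fallingProbability(double[][] samples, double[] p, double gamma) {
            int in = 0;
            for (double[] s : samples) {
                double d2 = 0;
                for (int i = 0; i < p.length; i++) { double t = s[i] - p[i]; d2 += t * t; }
                if (d2 <= gamma * gamma) in++;
            }
            return (double) in / samples.length;
        }
    }

Any of the filters described in the Module Description below (MMD, STF, PCR, APF) can be plugged in through the Filter interface without changing the traversal itself.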
System Architecture:
Existing System:
Query imprecision or uncertainty is often inherent in many applications,
including location based services. The existing techniques for processing
location based spatial queries, which assume certain query points and data
points, are not applicable, or are inefficient, when uncertain queries are involved.
Disadvantages:
1. Existing techniques assume certain query locations and hence are not
applicable when the query location is uncertain.
2. Even where they can be applied, they are inefficient for uncertain
location based queries.
Proposed System:
We propose two novel filtering techniques, namely STF and APF. The STF
technique has a decent filtering power and only requires the storage of very limited
pre-computed information. APF provides the flexibility to significantly enhance the
filtering power by demanding more pre-computed information. While we focus on
the problem of range counting for uncertain location based queries in the paper, our
techniques can be immediately extended to other range aggregates.
Advantages:
1. STF achieves decent filtering power while storing only very limited
pre-computed information.
2. APF can significantly enhance the filtering power when more pre-computed
information is available.
3. The number of points requiring expensive verification is significantly reduced.
4. The techniques can be immediately extended from range counting to other
range aggregates.
Module Description:
1. Maximal and Minimal Distance Filtering (MMD)
2. Statistical Filtering (STF)
3. Probabilistically Constrained Regions (PCR)
4. Anchor Point Filtering (APF)
Maximal and Minimal Distance Filtering (MMD):
Since we can immediately apply the algorithm once the minimal and maximal
distances between an entry e and an anchor point are computed, it is
inefficient to decide the accessing order only after computing the distances
between e and all anchor points. Therefore, we employ a simple heuristic to
order anchor points without distance computations. We first consider the
low-dimensional case; the heuristic can be easily extended to higher
dimensions. The maximal/minimal distance filtering technique is employed as a
benchmark to evaluate the efficiency of the other filtering techniques. A
minimal sketch of the prune and validate rules follows.
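In the sketch below, the query's possible locations are summarized by an axis-aligned bounding box [lower, upper]; both this representation and the rules are simplifying assumptions for illustration.

    // Illustrative sketch of maximal/minimal distance filtering against the
    // MBR of the uncertain query region.
    final class MinMaxDistanceFilter {
        final double[] lower, upper;   // MBR of the uncertain query region
        final double gamma;            // query distance

        MinMaxDistanceFilter(double[] lower, double[] upper, double gamma) {
            this.lower = lower; this.upper = upper; this.gamma = gamma;
        }

        // Smallest possible distance from any query location to p.
        double minDist(double[] p) {
            double d2 = 0;
            for (int i = 0; i < p.length; i++) {
                double d = Math.max(Math.max(lower[i] - p[i], p[i] - upper[i]), 0);
                d2 += d * d;
            }
            return Math.sqrt(d2);
        }

        // Largest possible distance from any query location to p.
        double maxDist(double[] p) {
            double d2 = 0;
            for (int i = 0; i < p.length; i++) {
                double d = Math.max(Math.abs(p[i] - lower[i]), Math.abs(p[i] - upper[i]));
                d2 += d * d;
            }
            return Math.sqrt(d2);
        }

        // p is validated when even the farthest query location is within gamma
        // (falling probability 1), and pruned when even the nearest one is
        // beyond gamma (falling probability 0).
        boolean validates(double[] p) { return maxDist(p) <= gamma; }
        boolean prunes(double[] p)    { return minDist(p) > gamma; }
    }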
Statistical Filtering (STF):
We propose a statistical filtering technique, namely STF. After introducing
the motivation for the technique, we present some important statistical
information about the uncertain query and then show how to derive lower and
upper bounds on the falling probability of a point with respect to an
uncertain query Q, distance γ, and probabilistic threshold θ.
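As one plausible illustration (not the paper's actual derivation), the sketch below derives such bounds from only the query's precomputed mean location and variance via Markov-type inequalities; these bounds are assumptions for exposition and may be looser than STF's.

    // Statistical filtering sketch using only the query's mean location m
    // and variance sigma2 = E[||Q - m||^2].
    final class StatisticalFilter {
        final double[] mean;    // E[Q]
        final double sigma2;    // E[||Q - mean||^2]
        final double gamma, theta;

        StatisticalFilter(double[] mean, double sigma2, double gamma, double theta) {
            this.mean = mean; this.sigma2 = sigma2; this.gamma = gamma; this.theta = theta;
        }

        private double distToMean(double[] p) {
            double d2 = 0;
            for (int i = 0; i < p.length; i++) { double t = mean[i] - p[i]; d2 += t * t; }
            return Math.sqrt(d2);
        }

        // Lower bound: P(||Q-p|| <= gamma) >= 1 - E[||Q-p||^2] / gamma^2, where
        // E[||Q-p||^2] = sigma2 + ||mean-p||^2 (Markov on the squared distance).
        boolean validates(double[] p) {
            double d = distToMean(p);
            double lowerBound = 1.0 - (sigma2 + d * d) / (gamma * gamma);
            return lowerBound >= theta;
        }

        // Upper bound: if ||mean-p|| > gamma, then ||Q-p|| <= gamma forces
        // ||Q-mean|| >= ||mean-p|| - gamma, so
        // P(||Q-p|| <= gamma) <= sigma2 / (||mean-p|| - gamma)^2.
        boolean prunes(double[] p) {
            double d = distToMean(p);
            if (d <= gamma) return false;
            double upperBound = sigma2 / ((d - gamma) * (d - gamma));
            return upperBound < theta;
        }
    }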
Probabilistically Constrained Regions (PCR):
Although the statistical filtering technique can significantly reduce the
candidate size in the algorithm, its filtering capacity is inherently limited
because only a small amount of statistics is employed. This motivates us to
develop more sophisticated filtering techniques that further improve the
filtering capacity; that is, we aim to improve the filtering capacity with
more pre-computation (i.e., more information kept by the filter). In this
subsection, the PCR technique proposed in prior work will be modified for
this purpose.
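The sketch below conveys the flavor of a probabilistically constrained region in a simplified, sample-based form; the quantile construction and the pruning rule are assumptions for exposition, not the modified PCR used in the paper.

    import java.util.Arrays;

    // Simplified probabilistically constrained region: PCR(t) trims
    // probability mass t from each side of each dimension of the uncertain
    // query Q, here approximated from sampled query locations.
    final class PcrFilter {
        final double[] low, high;   // per-dimension t-quantile and (1-t)-quantile
        final double trimmed;       // the mass t trimmed off each side

        PcrFilter(double[][] querySamples, double t) {
            int dims = querySamples[0].length, n = querySamples.length;
            low = new double[dims]; high = new double[dims]; trimmed = t;
            for (int i = 0; i < dims; i++) {
                double[] coords = new double[n];
                for (int j = 0; j < n; j++) coords[j] = querySamples[j][i];
                Arrays.sort(coords);
                low[i]  = coords[(int) Math.floor(t * (n - 1))];
                high[i] = coords[(int) Math.ceil((1 - t) * (n - 1))];
            }
        }

        // If the ball B(p, gamma) lies entirely beyond a trimmed quantile in
        // some dimension, then Pfall(Q, p, gamma) <= trimmed, so p can be
        // pruned whenever the query threshold theta exceeds trimmed.
        boolean prunes(double[] p, double gamma, double theta) {
            if (theta <= trimmed) return false;
            for (int i = 0; i < p.length; i++)
                if (p[i] + gamma < low[i] || p[i] - gamma > high[i]) return true;
            return false;
        }
    }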
Anchor Point Filtering (APF):
The anchor (pivot) point technique is widely employed in various applications;
it aims to reduce the query computation cost based on some pre-computed anchor
(pivot) points. In this subsection, we investigate how to apply the anchor
point technique to effectively and efficiently reduce the candidate set size.
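A minimal sketch follows, assuming a single illustrative anchor and a sample-based representation of the query; the prune and validate rules come from the triangle inequality, while the paper's anchor selection and pre-computation are more involved.

    import java.util.Arrays;

    // Anchor point filtering via the triangle inequality. The sorted sample
    // distances d(q, anchor) give an empirical CDF of d(Q, anchor).
    final class AnchorPointFilter {
        final double[] anchor;
        final double[] sorted;   // sorted d(q, anchor) over all query samples q

        AnchorPointFilter(double[] anchor, double[][] querySamples) {
            this.anchor = anchor;
            sorted = new double[querySamples.length];
            for (int j = 0; j < querySamples.length; j++)
                sorted[j] = dist(anchor, querySamples[j]);
            Arrays.sort(sorted);
        }

        static double dist(double[] a, double[] b) {
            double d2 = 0;
            for (int i = 0; i < a.length; i++) { double t = a[i] - b[i]; d2 += t * t; }
            return Math.sqrt(d2);
        }

        // Empirical P(d(Q, anchor) <= x) (strict = false) or < x (strict = true).
        double cdf(double x, boolean strict) {
            int lo = 0, hi = sorted.length;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                boolean left = strict ? sorted[mid] < x : sorted[mid] <= x;
                if (left) lo = mid + 1; else hi = mid;
            }
            return (double) lo / sorted.length;
        }

        // Triangle inequality: |d(Q,a) - d(a,p)| <= d(Q,p) <= d(Q,a) + d(a,p).
        boolean validates(double[] p, double gamma, double theta) {
            double dap = dist(anchor, p);
            // d(Q,a) <= gamma - d(a,p) forces d(Q,p) <= gamma.
            return cdf(gamma - dap, false) >= theta;
        }

        boolean prunes(double[] p, double gamma, double theta) {
            double dap = dist(anchor, p);
            // d(Q,p) <= gamma forces d(a,p) - gamma <= d(Q,a) <= d(a,p) + gamma.
            double upperBound = cdf(dap + gamma, false) - cdf(dap - gamma, true);
            return upperBound < theta;
        }
    }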
System Configuration:
H/W System Configuration:
Processor      : Pentium III
Speed          : 1.1 GHz
RAM            : 256 MB (min)
Hard Disk      : 20 GB
Floppy Drive   : 1.44 MB
Key Board      : Standard Windows Keyboard
Mouse          : Two or Three Button Mouse
Monitor        : SVGA
S/W System Configuration:
Operating System       : Windows 95/98/2000/XP
Application Server     : Tomcat 5.0/6.x
Front End              : HTML, Java, JSP, AJAX
Scripts                : JavaScript
Server side Script     : Java Server Pages
Database Connectivity  : Mysql