Optimizing Query Processing in Sensor Networks
Ross Rosemark

Our Research
• We argue that a sensor network can be viewed as a database in which each node is a table.
– Under this view, a query plan can dictate how to abstract data from the sensor network as a whole, rather than from a single sensor node.

Our Goal
• Given a query Q2, define how to process it:
– What metadata (if any) should be collected
– What query plan a node should use to abstract data from its local sensors
– What the routing infrastructure of the query should be
• Example: given Q2
– Estimate the cost of collecting metadata + execution cost + routing cost
– Estimate the cost of not collecting metadata + execution cost + routing cost
– Choose the lower cost

Idea
• Evaluate multiple different infrastructures for a query.
• Choose the infrastructure that uses the least energy.
• The * operator means aggregation
– Not database aggregation (e.g., Sum, Count), but rather aggregation as discussed in networking.

Research Issue
• We use metadata to evaluate different query plans, so metadata becomes an important research issue:
– Which nodes should send metadata to the access point (AP)?
– What metadata does the AP require?
• We take an on-demand approach to collecting metadata.

Metadata Collection
• Algorithm to collect metadata: only nodes participating in the query send metadata.
• Algorithm:
– The access point routes the query to the spatial area using GPSR.
– Using GPSR, the query is routed around the perimeter of the spatial area.
– Each node on the perimeter of the spatial area floods the message inside the spatial area.
– Each node in the spatial area sends its metadata to the AP using GPSR.

Metadata
• For a given query Q1, the access point initially knows:
– The number of nodes in the network (N)
– The spatial area of the network (SA)
– The query area (QA)
(we only consider spatial queries)
– A histogram representing the selectivity of each attribute (a coarse representation)

Using This Information
• Query Plan 1 (no metadata):
– Estimate the query execution cost if metadata is not collected.
– Estimate the result collection cost if metadata is not collected.
• Query Plan 2 (with metadata):
– Estimate the metadata collection cost.
– Estimate the query execution cost if metadata is collected.
– Estimate the result collection cost.
• If cost(Query Plan 1) > cost(Query Plan 2), choose Query Plan 2; otherwise choose Query Plan 1.

Metadata Collection
• When metadata is collected, each node participating in the query sends:
– The selectivity of each of the query's predicates
– Its longitude and latitude
• Example
– Query 1: Select * From Sensors Where Light > 10
– Each participating node sends the selectivity of the predicate Light > 10.
– Each participating node sends its longitude/latitude (e.g., Longitude = 1002.3, Latitude = 2003.1).

Metadata
• If a query 2 now arrives that covers the same spatial area as query 1 (or a subset of it), we evaluate whether to collect more metadata or to optimize with our current metadata:
– Estimate the metadata collection cost.
– Estimate the query execution cost.
– Estimate the routing cost.
• This is a repeat of the initial problem, but our estimates are now better.

Results
• Mathematically estimated the energy associated with:
– Metadata collection
– Query execution
• Ran simulations to obtain real values for these metrics.
• In the simulations, inserted one spatial query into the network.
• Ran this experiment varying:
– The query (spatial area, predicates)
– Topology (5 different topologies)
– Metadata (5 different metadata distributions)

[Chart: Query Execution — actual vs. estimated energy, with and without metadata collection]
[Chart: Metadata Collection — actual vs. estimated energy]
[Chart: Total — actual vs. estimated energy, with and without metadata collection]

Questions?
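The query-plan cost comparison above can be sketched in a few lines. This is an illustrative sketch only, not the paper's code: the function name and the idea of passing pre-computed energy estimates (e.g., in millijoules) as plain numbers are assumptions made for the example.

```python
# Illustrative sketch (not the paper's implementation) of choosing between
# the two query plans: optimize without metadata (Plan 1) vs. collect
# metadata first and then optimize (Plan 2). Each argument is a
# hypothetical pre-computed energy estimate.

def choose_query_plan(exec_no_md, result_no_md,
                      md_collect, exec_md, result_md):
    """Return the plan with the lower estimated total energy cost."""
    plan1 = exec_no_md + result_no_md          # Plan 1: no metadata
    plan2 = md_collect + exec_md + result_md   # Plan 2: collect metadata
    if plan1 > plan2:
        return "plan2", plan2
    return "plan1", plan1

# Collecting metadata pays off only when it reduces the execution and
# result-collection costs by more than the collection itself costs.
print(choose_query_plan(2.0, 0.5, 0.6, 1.0, 0.4))  # -> ('plan2', 2.0)
```

The same comparison is repeated when a later query covers the same spatial area; only the input estimates change, since the metadata already collected makes them more accurate.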