When processing is cheaper than transmitting
Daniel V Uhlig
Maryam Rahmaniheris
1

How to gather interesting data from
thousands of Motes?
• Tens to thousands of motes
• Unreliable individually

To collect and analyze data
• Long term low energy deployment
• Can use the processing power at each Mote
 Analyze locally before sharing data
2

Transmission of data is expensive compared to
CPU cycles
• 1Kb transmitted 100 meters = 3 million CPU
instructions
• AA power Mote can transmit 1 message per day for
about two months (assuming no other power
draws)
• Power density is growing very slowly compared to
computation power, storage, etc

Analyze and process locally, only transmitting
what is required
3

Minimize communications
◦ Minimize broadcast/receive time
◦ Minimize message size
◦ Move computations to individual nodes
 Nodes pass data in multi-hop fashion
towards a root
 Select connectivity so the graph helps with
processing
 Handle faulty nodes within the network
4
 Max is very simple
 What about Count?
◦ Need to avoid double counting due to redundant
paths
 What about spatial events?
◦ Need to evaluate readings across multiple sensors
 Correlation between events
 Failures of nodes can lose branches of the
tree
6
• Connectivity Graph
– unstructured, or how to structure it
• Diffusion of requests and how to combine
data
• Maintenance messages vs. query messages
• Reliability of results
• Load balancing
– message traffic
– storage
• Storage costs at different nodes
7
S. Madden, M. Franklin, J. Hellerstein, and
W. Hong
Intel Research, 2002
8
• Aggregates values in a low-power, distributed
network
• Implemented on TinyOS Motes
• SQL-like language to search for values or sets
of values
– simple declarative language
• Energy savings
• Tree-based methodology
– the root node generates requests and disseminates
them down to the children
9
• Three functions to aggregate results
– f (merge function)
• Each node runs f to combine partial values
• <z> = f(<x>, <y>)
• EX: <SUM1+SUM2, COUNT1+COUNT2> = f(<SUM1, COUNT1>, <SUM2, COUNT2>)
– i (initialize function)
• Generates the state record at the lowest level of the tree
• EX: <SUM, COUNT>
– e (evaluator function)
• The root uses e to generate the final result
• RESULT = e(<z>)
• EX: SUM/COUNT
• Functions must be preloaded on Motes or
distributed via software protocols
10
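As a minimal sketch of how the three functions fit together for AVERAGE (the names `init`, `merge`, and `evaluate` are illustrative stand-ins for i, f, and e; this is not the paper's TinyOS code):

```python
# Illustrative sketch of TAG's three aggregate functions for AVERAGE.
# init/merge/evaluate stand in for the paper's i, f, and e.

def init(reading):
    """i: build the initial state record <SUM, COUNT> from one reading."""
    return (reading, 1)

def merge(x, y):
    """f: combine two partial state records, <z> = f(<x>, <y>)."""
    return (x[0] + y[0], x[1] + y[1])

def evaluate(z):
    """e: the root turns the final state record into SUM/COUNT."""
    total, count = z
    return total / count

# Three motes report readings 10, 20, 30; states merge up the tree.
z = merge(init(10), merge(init(20), init(30)))
print(evaluate(z))  # 20.0
```

Because the state record carries both SUM and COUNT, any node can merge partial results without knowing how many readings went into each.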
Max via tree
(Figure: Count = 10 aggregated up the tree from per-node partial counts)
11
All searches have different properties that
affect aggregate performance
• Duplicate insensitive – unaffected by double
counting (Max, Min), vs. duplicate sensitive (Count, Average)
– Restricts network properties
 Exemplary – return one value (Max/Min)
– Sensitive to failure
 Summary – computation over values
(Average)
– Less sensitive to failure
12
• Distributive – partial states are the same as the
final state (Max)
• Algebraic – partial states are of fixed size but
differ from the final state (Average: Sum, Count)
• Holistic – partial states contain all sub-records
(Median)
– Unique – similar to Holistic, but partial records may
be smaller than holistic ones
• Content sensitive – size of partial records
depends on content (Count Distinct)
13
 Diffusion of requests, and then collection of
information
 Epochs are subdivided so that each level can
complete its task
◦ Saves energy
◦ Limits the rate of data flow
14
 Snooping – messages are broadcast so that other
nodes can hear them
◦ Rejoin the tree if a parent fails
◦ Listen to other broadcasts and only broadcast if one's
own values are needed
 In the case of MAX, do not broadcast if a peer has
transmitted a higher value
 Hypothesis testing – the root guesses at the value to
minimize traffic
15
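The MAX snooping rule above can be sketched in a few lines (illustrative only; `should_broadcast` is a hypothetical helper, not from TAG's implementation):

```python
# Sketch of the MAX snooping rule: a mote overhears peers' broadcasts
# during the epoch and transmits only if its own value could still
# change the aggregate. Names are illustrative, not from the paper.

def should_broadcast(own_value, overheard_values):
    """Skip the broadcast if a peer already reported a value >= ours."""
    return all(own_value > v for v in overheard_values)

print(should_broadcast(7, [3, 5]))  # True: 7 may still be the max
print(should_broadcast(4, [9]))     # False: a peer already sent 9
```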
 Theoretic results for 2500 nodes
 Savings depend on the function
 Duplicate insensitive and summary
aggregates do best
◦ Distributive helps
 Holistic is the worst
16
• 16-Mote network
• Count the number of motes in 4-second epochs
• No optimizations
• The quality of the count is due to less radio
contention in TAG
• Centralized used 4685 messages vs. TAG's 2330
• 50% reduction, but less than theoretical results
– different loss model and node placement
17
• Loss of nodes and subtrees
– maintenance needed for structured connectivity
• Single message per node per epoch
– message size might increase at higher-level nodes
– the root gets overloaded (does it always matter?)
• Epochs give a method for idling nodes
– snooping not included, due to timing issues
18
S. Nath, P. Gibbons, S. Seshan, Z. Anderson
20
 TAG
◦ Not robust against node or link failure
◦ A single node failure leads to loss of the entire sub-branch's data
 Synopsis Diffusion
◦ Exploits the broadcast nature of the wireless medium to enhance
reliability
◦ Separates routing from aggregation
◦ The final aggregated data at the sink is independent of the
underlying routing topology
◦ Synopsis diffusion can be used on top of any routing structure
◦ The order of evaluations, and the number of times each datum is
included in the result, is irrelevant
21
(Figure: tree aggregation with Count = 10 — one failed link loses an entire subtree)
Not robust against node or link failure
22
 Multi-path routing
◦ Benefits
 Robust
 Energy-efficient
◦ Challenges
 Duplicate sensitivity
 Order sensitivity
(Figure: multi-path Count example — duplicated partial counts inflate the total)
23

A novel aggregation framework
◦ ODI synopsis: small-sized digest of the partial results
 Bit-vectors
 Sample
 Histogram

Better aggregation topologies

Example aggregates

Performance evaluation
◦ Multi-path routing
◦ Implicit acknowledgment
◦ Adaptive rings
24
SG: Synopsis Generation
SF: Synopsis Fusion
SE: Synopsis Evaluation
 The exact definitions of these functions
depend on the particular aggregation
function:
◦ SG(·)
 Takes a sensor reading and generates a synopsis
◦ SF(·,·)
 Takes two synopses and generates a new one
◦ SE(·)
 Translates a synopsis into the final answer
25

Distribution phase
◦ The aggregate query is flooded
◦ The aggregate topology is constructed

Aggregation phase
◦ Aggregated values are routed toward the sink
◦ The SG() and SF() functions are used to create partial results
26
 The sink is in ring R0
 A node is in Ri if it is i hops
away from the sink
 Nodes in Ri-1 can hear the
broadcasts by nodes in Ri
 Loose synchronization between
nodes in different rings
 Each node transmits only once
◦ Energy cost is the same as a tree
(Figure: rings R0–R3 around the sink, with example nodes A, B, C)
27
 Coin tossing experiment CT(x), used in Flajolet and
Martin's algorithm:
◦ For i = 1, …, x-1: CT(x) = i with probability 2^-i
◦ Simulates the behavior of an exponential hash function
◦ Synopsis: a bit vector of length k > log(n)
 n is an upper bound on the number of sensor nodes in the
network
◦ SG(): a bit vector of length k with only the CT(k)-th bit set
◦ SF(): bit-wise Boolean OR
◦ SE(): if i is the index of the lowest-order 0 in the bit vector,
output 2^(i-1) / 0.77 (0.77 is the algorithm's magic constant)
28
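A minimal Python sketch of this count synopsis, assuming a fixed vector length K and the Flajolet–Martin magic constant 0.77351 (function names follow the slide's SG/SF/SE; the rest is illustrative):

```python
import random

K = 32  # bit-vector length; must exceed log2 of the node-count upper bound

def CT(x):
    """Coin-tossing experiment: returns i in 1..x-1 with probability 2^-i
    (returns x if every toss comes up tails)."""
    for i in range(1, x):
        if random.random() < 0.5:
            return i
    return x

def SG():
    """Synopsis generation: a bit vector with only the CT(K)-th bit set."""
    return 1 << (CT(K) - 1)

def SF(s1, s2):
    """Synopsis fusion: bit-wise Boolean OR."""
    return s1 | s2

def SE(s):
    """Synopsis evaluation: with i the 1-based index of the lowest-order
    0 bit, estimate the count as 2^(i-1) / 0.77351."""
    i = 1
    while s & (1 << (i - 1)):
        i += 1
    return 2 ** (i - 1) / 0.77351

# Fuse synopses from 1000 simulated nodes; a single sketch gives a rough,
# order- and duplicate-insensitive estimate of the live-node count.
s = 0
for _ in range(1000):
    s = SF(s, SG())
print(SE(s))
```

Because fusion is just OR, hearing the same synopsis twice changes nothing — the property that makes multi-path routing safe.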
 The number of live sensor nodes, N, is proportional to 2^(i-1),
where i is the index of the lowest-order 0 bit
◦ Intuition: the probability that N nodes all fail to set the i-th bit
is (1 - 2^-i)^N, which is approximately 0.37 when N = 2^i and even
smaller for larger N
(Figure: per-node bit vectors OR-ed together; the lowest-order 0 in the fused vector gives the count estimate)
29
(Figure: an aggregation DAG over readings r1–r5, with SG() applied at the leaves and SF() at interior nodes, and its canonical left-deep tree)
30
Theorem: Properties P1-P4 are necessary and sufficient
properties for ODI-Correctness
◦ P1: SG() preserves duplicates
 If two readings are considered duplicates, the same synopsis
is generated
◦ P2: SF() is commutative
 SF(s1, s2) = SF(s2, s1)
◦ P3: SF() is associative
 SF(s1, SF(s2, s3)) = SF(SF(s1, s2), s3)
◦ P4: SF() is same-synopsis idempotent
 SF(s, s) = s
31
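For the OR-based count synopsis, properties P2–P4 can be spot-checked mechanically (an illustrative sketch; P1 concerns SG(), which must map duplicate readings to the same synopsis):

```python
import random

def SF(s1, s2):
    """OR fusion for the count synopsis (bit vectors as ints)."""
    return s1 | s2

# Spot-check P2-P4 on random synopses.
for _ in range(1000):
    s1, s2, s3 = (random.getrandbits(32) for _ in range(3))
    assert SF(s1, s2) == SF(s2, s1)                  # P2: commutative
    assert SF(s1, SF(s2, s3)) == SF(SF(s1, s2), s3)  # P3: associative
    assert SF(s1, s1) == s1                          # P4: idempotent
print("P2-P4 hold for OR fusion")
```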
 Uniform Sample of Readings
◦ Synopsis: a sample of size K of <value, random number,
sensor id> tuples
◦ SG(): output the tuple <val_u, r_u, id_u>
◦ SF(s, s'): output the K tuples in s ∪ s' with the K largest r_i
◦ SE(s): output the set of values val_i in s
◦ Useful for holistic aggregates
32
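A minimal Python sketch of the uniform-sample synopsis (K and the helper names are illustrative):

```python
import random

K = 3  # sample size (illustrative)

def SG(value, sensor_id):
    """Emit one <value, random number, sensor id> tuple as a synopsis."""
    return [(value, random.random(), sensor_id)]

def SF(s1, s2):
    """Keep the K tuples with the largest random numbers from s1 U s2."""
    return sorted(set(s1 + s2), key=lambda t: t[1], reverse=True)[:K]

def SE(s):
    """Output the sampled values."""
    return [value for value, r, sensor_id in s]

# Fuse singleton synopses from 10 sensors into one K-element sample.
# Taking the set union first makes re-hearing the same tuple harmless,
# which is what makes this fusion ODI-correct.
s = []
for sensor_id in range(10):
    s = SF(s, SG(value=sensor_id * 10, sensor_id=sensor_id))
print(SE(s))  # a uniform sample of 3 of the 10 readings
```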
 Frequent Items (items occurring at least T times)
◦ Synopsis: a set of <val, weight> pairs; the values are unique and
the weights are at least log(T)
◦ SG(): compute weight = CT(k), where k > log(n); if it is at least
log(T), output <val, weight>
◦ SF(s, s'): for each distinct value, discard all but the pair <val,
weight> with the maximum weight; output the remaining pairs
◦ SE(s): output <val, 2^weight> for each <val, weight> pair in s as a
frequent value and its approximate count
◦ Intuition: a value occurring at least T times is expected to have at
least one of its calls to CT() return at least log(T)
 p = 1/T
33
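A Python sketch of the frequent-items synopsis under these definitions (CT() is the coin-tossing experiment from the count synopsis; names and parameter values are illustrative):

```python
import math
import random

K = 32  # bit-vector length parameter, k > log2(n)
T = 16  # threshold: report values occurring at least T times

def CT(x):
    """Coin tossing: return i with probability 2^-i (i = 1..x-1), else x."""
    for i in range(1, x):
        if random.random() < 0.5:
            return i
    return x

def SG(val):
    """Emit <val, weight> only if weight = CT(K) reaches log2(T)."""
    weight = CT(K)
    return {val: weight} if weight >= math.log2(T) else {}

def SF(s1, s2):
    """Per distinct value, keep only the pair with the maximum weight."""
    out = dict(s1)
    for val, w in s2.items():
        out[val] = max(out.get(val, 0), w)
    return out

def SE(s):
    """Report each surviving value with approximate count 2^weight."""
    return {val: 2 ** w for val, w in s.items()}

# A value seen T times is expected to push at least one CT() call to
# log2(T) or beyond, so it survives into the fused synopsis.
```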
 Communication error
◦ Percent contributing
◦ h: height of the DAG
◦ k: the number of neighbors each node has
◦ p: probability of loss
◦ The overall communication error upper bound: 1 - (1 - p^k)^h
◦ If p = 0.1 and h = 10, then the error is negligible with k = 3
 Approximation error
◦ Introduced by the SG(), SF(), and SE() functions
◦ Theorem 2: any approximation error guarantee provided for the
centralized data stream scenario immediately applies to a synopsis
diffusion algorithm, as long as the data stream synopsis is ODI-correct
34
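The bound is easy to check numerically. A small sketch (the reading that a contribution is lost only if all k transmissions toward the next ring fail, at each of the h hops, is our interpretation of the slide):

```python
def comm_error_bound(p, k, h):
    """Upper bound 1 - (1 - p^k)^h on the chance that a node's
    contribution is lost: its synopsis is dropped only if all k
    transmissions toward the next ring are lost, at each of h hops."""
    return 1 - (1 - p ** k) ** h

# The slide's example: p = 0.1, h = 10; the bound collapses as k grows.
for k in (1, 2, 3):
    print(k, comm_error_bound(0.1, k, 10))
```

With k = 3 the bound is below 1%, matching the slide's "negligible" claim.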
 Implicit acknowledgement provided by ODI synopses
◦ Retransmission: high energy cost and delay
◦ Adapting the topology instead
 When the number of times a node's transmission is included in its
parents' transmissions falls below a threshold, assign the node to a
ring where it can have a good number of parents
 Assign a node in ring i with probability p to:
◦ Ring i+1 if n_i > n_(i-1), n_(i+1) > n_(i-1), and n_(i+2) > n_i
◦ Ring i-1 if n_(i-2) > n_(i-1), n_(i-1) < n_(i+1), and n_(i-2) > n_i
35
(Figure: Rings vs. Adaptive Rings topologies)
36
 The algorithms are implemented in the TAG simulator
 600 sensors deployed randomly in a 20 ft × 20 ft grid
 The query node is in the center
 Loss probabilities are assigned based on the distance between nodes
37
(Figures: RMS error, and the percentage of values included in the answer)
38
 Pros
◦ High reliability and robustness
◦ More accurate answers
◦ Implicit acknowledgment
◦ Dynamic topology adaptation
◦ Moderately affected by mobility
 Cons
◦ Approximation error
◦ Low node density decreases the benefits
◦ The fusion functions must be defined for each
aggregation function
◦ Increased message size
39

Is there any benefit in coupling routing with aggregation?
◦ Choosing the paths and finding the optimal aggregation points
◦ Routing the sensed data along a longer path to maximize
aggregation
◦ Finding the optimal routing structure
 Considering energy cost of links
 NP-Complete
 Heuristics (Greedy Incremental)

Considering data correlation in the aggregation process
◦ Spatial
◦ Temporal
 Defining a threshold
 TiNA
40

Could the energy savings gained by aggregation be
outweighed by its cost?
◦ Aggregation function cost
 Storage cost
 Computation cost (Number of CPU cycles)

No mobility
◦ Static aggregation tree

Structure-less or structured? That is the question…
◦ Continuous
◦ On-demand
41

Transmitting large amounts of data on the
internet is slow
◦ Better to process locally and transmit the
interesting parts only
42

How does query rate affect design decisions?

Load balancing between levels of the tree
◦ Overloading the root and main nodes

How will video capabilities of Imote affect
aggregation models?
43