Load Balancing in File Systems
Nadine Amsel
Dr. Carlos Maltzahn
Storage Systems Research Center (SSRC) at UCSC
http://ssrc.cse.ucsc.edu
Introduction
• A new breed of distributed, petabyte-scale file systems uses many Object Storage Devices (OSDs)
• Search in such file systems requires OSDs to store large indices and cope with ever-changing hot spots due to a diverse query stream
• What is the extent of query hot spots? How long do they persist?
Methods
• Time-stamped queries by 500,000 AOL users over 3 months used to determine overload patterns
• Each term in a query maps to one OSD (i.e., assuming a term-distributed index)
• Two questions to answer:
  1. How many OSDs are overloaded?
  2. How long does an OSD stay overloaded?
• OSD address determined by taking the hash of the term and extracting the last n bits (where n is determined by the number of OSDs)
• An OSD’s load is determined by the number of queries it receives per minute
• Query traces analyzed using different numbers of OSDs and overload thresholds:
  • 128, 1K, and 64K OSDs
  • 10, 30, and 50 queries/minute overload thresholds
Results
What is the length of each period of overload time?
Most overload periods last only a few minutes. The distribution of period lengths follows a heavy-tailed power law, so the variance is infinite (there is no stable average).
The median overload length is ~4 minutes for 128 OSDs and ~2 minutes for 1K OSDs. In 99% of all cases, the overload period lasts no longer than an hour.
Will more hardware prevent overload?
Overload occurs all the time. Just one overloaded OSD can slow down the whole storage system.
OSDs      128    1K    64K
Median     18     3      2
Mean       15     2      1
The query workload leads to overload even if distributed
over a large number of nodes. Increasing the number of
nodes is not a solution.
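The analysis steps listed under Methods (hash each term to an OSD, count queries per OSD per minute, measure how long each OSD stays over the threshold) can be sketched roughly as follows. This is a minimal illustration, not the code used for the poster. Several details are assumptions because the poster does not specify them: MD5 as the hash function, whitespace tokenization of queries, counting a query once per OSD it touches, and treating "overloaded" as strictly more queries per minute than the threshold. The function names are hypothetical.

```python
import hashlib
from collections import Counter


def osd_for_term(term, num_osds):
    """Map a term to an OSD address: the last n bits of its hash, with num_osds = 2**n."""
    if num_osds <= 0 or num_osds & (num_osds - 1):
        raise ValueError("num_osds must be a power of two")
    # Assumption: MD5 stands in for the unnamed hash function.
    digest = hashlib.md5(term.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") & (num_osds - 1)


def per_minute_load(queries, num_osds):
    """Count queries per (minute, OSD) from (timestamp_seconds, query_text) pairs."""
    load = Counter()
    for timestamp, text in queries:
        minute = int(timestamp // 60)
        # Assumption: a query counts once for each distinct OSD its terms map to.
        for osd in {osd_for_term(t, num_osds) for t in text.lower().split()}:
            load[(minute, osd)] += 1
    return load


def overload_periods(load, threshold):
    """Return the length in minutes of every run of consecutive overloaded minutes."""
    minutes_by_osd = {}
    for (minute, osd), count in load.items():
        if count > threshold:  # Assumption: strictly above the threshold
            minutes_by_osd.setdefault(osd, []).append(minute)
    lengths = []
    for minutes in minutes_by_osd.values():
        minutes.sort()
        run = 1
        for prev, cur in zip(minutes, minutes[1:]):
            if cur == prev + 1:
                run += 1
            else:
                lengths.append(run)
                run = 1
        lengths.append(run)
    return lengths
```

Calling `per_minute_load(trace, 128)` and then `overload_periods(load, 10)` on a list of `(timestamp, query)` pairs would yield the period lengths for one of the poster's configurations; repeating this for each OSD count and threshold gives the full grid.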
Conclusion
• Index query workloads cannot be effectively addressed
by increasing the number of OSDs.
• A load-balancing mechanism needs to adapt on a minute-by-minute basis; any mechanism that takes longer than an hour to adapt will not be able to keep up with 99% of the workload changes.
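To make the adaptation argument concrete, the share of overload periods that end before a load balancer reacts can be computed from a list of period lengths. The sketch below is illustrative only: `share_missed` is a hypothetical helper, and the sample lengths are made up for the example and are not the measured AOL results.

```python
from statistics import median


def share_missed(period_lengths, adapt_minutes):
    """Fraction of overload periods already over once a balancer with this reaction time responds."""
    if not period_lengths:
        raise ValueError("need at least one overload period")
    missed = sum(1 for length in period_lengths if length <= adapt_minutes)
    return missed / len(period_lengths)


# Hypothetical overload period lengths in minutes (illustration only).
lengths = [1, 2, 2, 4, 4, 5, 9, 30, 61, 200]
print("median length:", median(lengths))
print("missed with 60-minute adaptation:", share_missed(lengths, 60))
print("missed with 1-minute adaptation:", share_missed(lengths, 1))
```

The shorter the reaction time, the smaller the share of periods that are over before the balancer can respond, which is the reasoning behind the minute-by-minute requirement.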
This work was completed as part of UCSC's SURF-IT summer undergraduate research program, an NSF CISE REU Site. This material is based upon work supported by the National Science Foundation under Grant No. CCF-0552688.