Extremely Large Databases in Astronomy: LSST
Extremely Large Databases Workshop
SLAC
October 25, 2007
Kian-Tat Lim
Stanford Linear Accelerator Center
Outline
Why?
What?
How?
What does this mean?
Outline
Why?
What?
How?
What does this mean?
LSST Overview
Proposed telescope to be built in Chile
Telescope
3.2 gigapixel camera
8.4 meter diameter mirror
Survey
Wide
Fast
Deep
Dark Matter and Energy
Photo: J. A. Tyson, W. Colley, E. L. Turner, and NASA
Variable Objects
Transient Objects
Moving Objects
Photo: D. Roddy, Lunar and Planetary Institute
Outline
Why?
What?
How?
What does this mean?
Data Challenge
Images
Database
Data Locations
Mountain
Base Camp
Archive Center
Data Access Center
Non-Image Data In Database
Moving Objects Catalog
Object Catalog
Source Catalog
Metadata
Provenance
Statistics
Summaries
Difference Image Source Catalog
Calibration
Engineering and Facility Database
Sagans of Rows
49 billion objects
2.8 trillion sources
Lots of Columns
269 columns/Object
56 columns/Source
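To put the row and column counts side by side, the short calculation below multiplies them out. Only the four input figures come from the slides; the derived totals are plain arithmetic, and the variable names are illustrative.

```python
# Figures quoted on the slides.
N_OBJECTS = 49_000_000_000        # 49 billion objects
N_SOURCES = 2_800_000_000_000     # 2.8 trillion sources
COLS_PER_OBJECT = 269
COLS_PER_SOURCE = 56

# Derived quantities (arithmetic only, not stated on the slides).
sources_per_object = N_SOURCES / N_OBJECTS
object_values = N_OBJECTS * COLS_PER_OBJECT   # column values in the Object catalog
source_values = N_SOURCES * COLS_PER_SOURCE   # column values in the Source catalog

print(f"sources per object:   {sources_per_object:.1f}")
print(f"Object column values: {object_values:.3e}")
print(f"Source column values: {source_values:.3e}")
```

On average this is about 57 sources per object, and the Source catalog holds roughly twelve times as many column values as the Object catalog despite its narrower rows.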
Denormalization
Load
Query
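A minimal sketch of the load/query trade-off behind denormalization, using SQLite from the Python standard library. The table and column names (`object`, `source`, `source_denorm`, `ra`, `decl`, `flux`) are hypothetical and chosen only for illustration; the slides do not say which LSST columns are denormalized or which database engine is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object (object_id INTEGER PRIMARY KEY, ra REAL, decl REAL);
    CREATE TABLE source (source_id INTEGER PRIMARY KEY, object_id INTEGER, flux REAL);
    -- Denormalized variant: the object position is copied into every source row.
    CREATE TABLE source_denorm (source_id INTEGER PRIMARY KEY, object_id INTEGER,
                                flux REAL, object_ra REAL, object_decl REAL);
""")

# Toy data (hypothetical values).
conn.executemany("INSERT INTO object VALUES (?, ?, ?)",
                 [(1, 10.0, -30.0), (2, 11.5, -29.0)])
conn.executemany("INSERT INTO source VALUES (?, ?, ?)",
                 [(100, 1, 5.0), (101, 1, 5.5), (102, 2, 7.0)])

# Load: the join cost is paid once, when the denormalized table is filled.
conn.execute("""
    INSERT INTO source_denorm
    SELECT s.source_id, s.object_id, s.flux, o.ra, o.decl
    FROM source AS s JOIN object AS o USING (object_id)
""")

# Query: the normalized schema needs a join on every query ...
normalized = conn.execute("""
    SELECT s.source_id, s.flux
    FROM source AS s JOIN object AS o USING (object_id)
    WHERE o.ra > 11.0 ORDER BY s.source_id
""").fetchall()

# ... while the denormalized table answers the same question from one table.
denormalized = conn.execute("""
    SELECT source_id, flux FROM source_denorm
    WHERE object_ra > 11.0 ORDER BY source_id
""").fetchall()

print(normalized == denormalized, denormalized)
```

The denormalized table is wider and costs more to load and store, in exchange for join-free queries.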
Database Size
Grows to 14 PB with indices
Managing The Database
Similar to commercial analytical data warehouses
Comparisons
[Chart comparing database sizes: LSST, Wal-mart, BaBar, SDSS (~0.02), AT&T]
Google and few others are here
We have 12 years to reach today’s commercial limits
Likely growth faster than linear
* All numbers based on publicly available data
Outline
Why?
What?
How?
What does this mean?
Reliability
Duplicate
Buffer
Replicate
Backup
Mirror
Integrity
Never modify raw data
Provenance
Reprocessing
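The three points above can be sketched as a small data model: raw data is immutable, every derived product records where it came from, and reprocessing creates a new product instead of overwriting anything. The class names, fields, and the toy `process` function are all hypothetical; the slides do not describe the actual LSST schema or pipeline.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class RawExposure:
    """Raw data: frozen, so it cannot be modified after creation."""
    exposure_id: int
    pixels: tuple

@dataclass(frozen=True)
class DerivedProduct:
    """Derived value plus the provenance needed to reproduce it."""
    value: float
    source_exposure_id: int
    pipeline_version: str
    parameters: tuple  # (name, value) pairs used for this run

def process(raw, pipeline_version, gain):
    """Toy processing step: scaled mean pixel value (illustrative only)."""
    value = gain * sum(raw.pixels) / len(raw.pixels)
    return DerivedProduct(value, raw.exposure_id, pipeline_version,
                          (("gain", gain),))

raw = RawExposure(exposure_id=42, pixels=(10, 20, 30))
first = process(raw, "v1", gain=1.0)

# Reprocessing yields a second product; the raw data and the first
# product are left untouched.
second = process(raw, "v2", gain=1.5)

try:
    raw.pixels = ()          # any attempt to modify raw data fails
    raw_was_modified = True
except FrozenInstanceError:
    raw_was_modified = False

print(first, second, raw_was_modified)
```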
Scalability
Horizontally scalable
Portable code
Specialized file system
Database servers
Map/Reduce + SQL
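One way to read "Map/Reduce + SQL" is: run the same SQL on every partition of a catalog (map), then combine the partial aggregates (reduce). The sketch below assumes that reading and uses in-memory SQLite databases as stand-ins for separate database servers; the `source` table, the `flux` column, and the function names are hypothetical.

```python
import sqlite3
from functools import reduce

def make_partition(rows):
    """Create one in-memory database standing in for one database server."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE source (source_id INTEGER, flux REAL)")
    conn.executemany("INSERT INTO source VALUES (?, ?)", rows)
    return conn

def map_partition(conn):
    """Map step: run the same SQL on one partition, return (count, sum)."""
    count, total = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(flux), 0.0) FROM source WHERE flux > 1.0"
    ).fetchone()
    return count, total

def combine(a, b):
    """Reduce step: merge two partial (count, sum) aggregates."""
    return a[0] + b[0], a[1] + b[1]

# Three toy partitions; the last one has no matching rows.
partitions = [
    make_partition([(1, 0.5), (2, 2.0)]),
    make_partition([(3, 4.0)]),
    make_partition([(4, 0.1)]),
]

count, total = reduce(combine, map(map_partition, partitions), (0, 0.0))
mean_flux = total / count if count else None
print(count, total, mean_flux)
```

Counts and sums are carried through the reduce step, not per-partition means, because partial means cannot be combined correctly without their counts.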
Outline
Why?
What?
How?
What does this mean?
What Can Vendors Do?
Mine!
Mining
What Can Vendors Do?
Relax
Relax
What Can Vendors Do?
Maintain
Maintenance
Summary
Many issues common to extremely large DBs