PODS  NODS  EOSDIS PO.DAAC
Lessons Learned
Victor Zlotnicki
Jet Propulsion Laboratory
California Institute of Technology
PODS  NODS  EOSDIS PO.DAAC
• 198x: NASA Pilot X Data
Systems
• 1985?: transferred to NASA
Disciplines
• 1994: EOSDIS ‘operational’
• 1997: TRMM & EOSDIS
• 2003-5: EOSDIS ‘evolution’
SERVICE MODEL, PODS, 1980s
Pilot Ocean Data System (SEASAT data)
• Original emphasis: central service (store, subset, browse, deliver browse or a small subset)
• Directory to find data (‘GOLD’ Catalog)
• Beautiful software: modelled orbit, modelled sensor viewing geometry, subset along track, displayed & delivered subsets, etc. VAX 780.
• Jim Brown, Carol Miller, Chuck Klose …
Problems
• Cost of adding each new, derived data set (solved with ‘levels of service’)
• Decreasing price of computer hardware at the user’s end (less necessary to subset centrally)
NASA HQ OCEANS 1984
PODS under NASA OCEANOGRAPHY
• Must provide computing to JPL oceanographers
• JPL scientist must be Group Supervisor of PODS mgr.
• Must be available to reprocess TOPEX, other data (Project would not)
• Managers 1984-2006: J.C. Klose, D. Halpern, D. Collins, P. Liggett
JPL OCEANS GROUP 1983
SERVICE MODEL, EOSDIS
• Satellite Cmd & Ctrl
• Telemetry
• Level 0 processing
• Level 1, 2, 3 processing
• Delivery to Science Users
• Delivery to general public
• Also: GCMD, TRMM (1997), …
Problems
• Handling all these conflicting requirements with the same ‘NASA project’ structure.
• Web use grew without NASA help.
SOME LESSONS ¿LEARNED?
Centralized, large data Projects:
• Live longer than ‘Programs’ that group small tasks
• Are less cost-effective
• Extra funds can be put to good, honest use (data recovery, ‘Pathfinder’ Climate Time Series generation)
• Find it hard to adapt to technological changes
C&C, downlink, and level 0 processing must be decoupled from the rest.
Derived products (e.g. Pathfinder Time Series, reprocessed and streamlined GDRs) are a great way to improve data quality, shrink volume, and make data available to many.
It is still hard to find data today. Need ‘google data’.
Trends in Data Storage
Memory:
• 4 GB mem stick: $120
• 1 GB ECC mem: $500
• 32-bit OS: 4 GB mem max (2^32)
• 64-bit OS: 1 TB mem in PCs (2^64 = 16×10^9 GB theoretical)
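The address-space limits above are just powers of two; a quick check (a sketch of my own, taking one GB as 2^30 bytes, the convention behind the quoted OS limits):

```python
GB = 2 ** 30  # one binary gigabyte

# 32-bit addresses reach 2^32 bytes: the familiar 4 GB ceiling.
print(2 ** 32 // GB)   # 4

# 64-bit addresses reach 2^64 bytes = 2^34 GB,
# ~17 billion GB (the slide rounds this to 16x10^9).
print(2 ** 64 // GB)   # 17179869184
```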
Optical disk:
• 4.6 GB DVD: $0.50
Magnetic disk ($/TB):

Year   Cheapest   Good RAID   Better RAID
1992   1,000k     4,000k      8,000k
1996   92k        366k        732k
2000   8k         33k         67k
2004   0.8k       3k          6k
2006   0.2k       0.9k        1.8k
2010   0.02k      0.08k       0.17k

60% annual increase in density.
Sources: Steve Gilheany, http://www.berghell.com/whitepapers.htm; http://www.pricegrabber.com
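The cheapest-disk column above implies a roughly constant exponential price decline; a quick sanity check in Python (values copied from the table, the 2010 figure being a projection at the time of the talk):

```python
# $/TB for the cheapest magnetic disk, from the table above
cost_per_tb = {1992: 1_000_000, 1996: 92_000, 2000: 8_000,
               2004: 800, 2006: 200, 2010: 20}

first, last = min(cost_per_tb), max(cost_per_tb)
ratio = cost_per_tb[first] / cost_per_tb[last]
annual = ratio ** (1 / (last - first))   # yearly price-drop factor

print(f"{first}-{last}: price falls {annual:.2f}x per year "
      f"(~{1 - 1/annual:.0%} cheaper each year)")
```

A price drop of roughly 45% per year is the cost-side counterpart of the 60% annual density growth quoted above.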
Trends in Network Transfer
• ~1980: dial-up Telemail (0.3, 1.2 kb/s) @ work.
• ~1990: 10 Mb/s @ work, 1.2 kb/s @ home.
• 2006: JPL offices have 100 Mb/s standard, 1 Gb/s if you insist, 10 Gb/s in 2007. 0.5+ Mb/s @ home.
• 2006: NREN 10 Gb/s between Ames & GSFC.
• ABILENE: high-speed optical intercontinental US (10 Gb/s, going to 100 in 2006/2007). Separate from the Internet. Research centers.
• NATIONAL LAMBDA RAIL: 10 Gb/s optical ETHERNET.
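To put these link speeds in data-system terms, a small sketch of my own framing: how long moving one terabyte takes at each rate quoted above.

```python
TB_BITS = 8e12   # one decimal terabyte, in bits

links_bps = {    # rates taken from the bullets above
    "1980 dial-up, 1.2 kb/s": 1.2e3,
    "1990 LAN, 10 Mb/s":      10e6,
    "2006 office, 1 Gb/s":    1e9,
    "NREN/NLR, 10 Gb/s":      10e9,
}

for name, bps in links_bps.items():
    seconds = TB_BITS / bps
    print(f"{name}: {seconds / 86_400:.3g} days per TB")
```

At 10 Gb/s a terabyte moves in about 800 seconds; at 1980s dial-up rates the same transfer would take centuries, which is why early systems had to ship media or subset centrally.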
Trends in Distributed Computing
• GRID computing: split a huge calculation among many disparate, geographically separate, administratively separate computers.
• Goal: solving problems too big for any single supercomputer.
• Computational Grids focus on computationally intensive operations.
• Data Grids: ‘the controlled sharing and management of large amounts of distributed data’.
• Equipment Grids, e.g. a telescope, where the surrounding Grid is used to control the equipment remotely and to analyse the data collected.
Source: Wikipedia, 2006-10
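In miniature, ‘split a huge calculation among many computers’ looks like the sketch below, with local processes from Python’s standard multiprocessing module standing in for geographically separate Grid nodes (function and variable names are my own):

```python
from multiprocessing import Pool

def partial_sum(bounds):
    """One node's share of the work: sum of squares over a sub-range."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

if __name__ == "__main__":
    n, nodes = 1_000_000, 4
    step = n // nodes
    chunks = [(k * step, (k + 1) * step) for k in range(nodes)]

    with Pool(nodes) as pool:                     # fan out, then combine
        total = sum(pool.map(partial_sum, chunks))

    assert total == sum(i * i for i in range(n))  # matches the direct sum
    print(total)
```

A real Grid adds exactly what this toy omits: scheduling, authentication across administrative domains, and moving the data to the computation (or vice versa).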
Trends in Web Services
• Web Service: a software system designed to support interoperable machine-to-machine interaction over a network (Wikipedia, 2006).
• Web Services + Grid Computing: a product is created ‘on the fly’ from data and algorithms scattered ‘out there’. Example: SciFlo (http://sciflo.jpl.nasa.gov)
• Danger: would you write a scientific paper or base policy on untraceable computations?
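One possible answer to the traceability danger is to stamp every on-the-fly product with a provenance record; a minimal sketch (the record fields, names, and granule filename are illustrative, not any SciFlo API):

```python
import hashlib
import json

def provenance(inputs: dict, algorithm: str, version: str) -> str:
    """Hash the exact inputs plus the algorithm's identity, so a derived
    product can later be traced back and recomputed bit-for-bit."""
    payload = json.dumps(
        {"inputs": inputs, "algorithm": algorithm, "version": version},
        sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

tag = provenance({"granule": "sst_199710.nc"}, "subset_along_track", "1.3")
rerun = provenance({"granule": "sst_199710.nc"}, "subset_along_track", "1.3")
assert tag == rerun   # same inputs and code version: same tag, traceable
print(tag[:12])
```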
SUMMARY
• NASA, NOAA, and USGS do have a responsibility to manage satellite data in order to maximize its use and the hardware investment.
• Huge centralized data projects have the advantage of survivability, the disadvantage of inertia.
• Scientific ‘stewardship’, frequent reprocessing, and ‘higher-level’ products are cost-effective ways to improve quality and decrease volume.
• Failure to understand technological trends, and to build a true ‘open architecture’ system, may cause a data system to be built to solve a problem that no longer exists when the system is completed.