New Software for Ensemble Creation in the Spitzer-Space-Telescope Operations Database

Russ Laher and John Rector
2004 ADASS XIV Conference, October 24 - 27, 2004
Russ Laher ([email protected]) and John Rector ([email protected])
Spitzer Science Center, California Institute of Technology, Pasadena, CA 91125

Preface
• About one third of the 230 Spitzer data-processing pipelines require multiple input images (e.g., calibrations, image co-adds & mosaics)
• Motivation is data noise reduction and/or statistical characterization of the data
• Input images are grouped for particular pipeline processing into what we call “ensembles” in the operations database

Outline
• Introduction
  • Background
  • Purpose of Talk
• Database storage of ensembles
• Ensemble-creation rules
• Ensemble-creation software
• Conclusions
• Future Work

URL of long version of paper
http://spider.ipac.caltech.edu/staff/laher/sirtf/NewEnsembleCreation.pdf

Appendices
• A. On-line software tutorial
• B. Spitzer ensemble-creation rules
• C. S/W output, test mode
• D. S/W output, normal mode

Background
• Spitzer rules for ensemble creation are well documented and under version control.
• Spitzer pipeline-operator Ron Beck created the first version of a script for executing the ensemble-creation rules
  • Rules are hard coded (and therefore hard to change)
  • Direct SQL is used for DB access (open/close DB connection for each access)
• New database-design improvements and software have been developed for increased speed and flexibility

Purpose of this Talk
• To acquaint you with SSC methodologies for creating/storing ensembles, including
  • Database design
  • “Ensemble-creation” rules
• Debut our new ensemble-creation software
  • New database tables and schema changes
  • New database stored functions
• Identify general concepts used in creating/storing ensembles (for application to other astronomical missions)

Hierarchy of Spitzer Observations
• Observing campaign (campaignId)
  • 5-7 days in a campaign
  • 30,000-100,000 observed images (DCEs)
  • One instrument per campaign
  • Spitzer instruments: IRAC, MIPS, IRS
• Request i (reqKey)
  • 200-300 in a campaign
• Exposure j (ExposureId)
  • 10-100 in a request
  • In “cluster” mode, there may be multiple exposures per cluster of observations (clusterPosNum)
• Instrument channel k (chanNum or chnlNum)
  • 3 or 4 depending on the instrument
• DCE l, “Data Collection Event” (dceId)
  • 1-10 DCEs in an exposure and channel
  • Each DCE gives a FITS file of observed data (single image or stack of images, depending on the instrument and mode)
• At scheduling time, the “pipeline picker” assigns to each DCE a pipeline for initial processing (initPlScriptId)

Miscellaneous Considerations
• Ensembles can be created in the database after the observations are scheduled (it is not necessary to have received the actual DCEs from the spacecraft)
• Wouldn’t it be nice to store with each ensemble in the database information about the “rule” applied in creating it?

Database Storage of Ensembles
[Diagram of three tables; links between tables are labeled 1 to 1+]
  ensembles (ensId: serial, plScriptId: smallint, dceSetId: integer, expectedInputs: smallint, repDceId: integer, version: smallint, vbest: smallint, ruleId: smallint)
  dceSets (dceId: integer, dceSetId: integer)
  ensembleSets (inEnsId: integer, outEnsId: integer)
• There are three database tables for storing information about how (instances of) ensembles are defined (which DCEs are included and how they are to be processed)
• DCEs are grouped explicitly into DCE sets (via association of dceIds with a dceSetId)
• The type of pipeline ensemble processing to be done is stored with the ensemble (plScriptId is associated with ensId)

Database Storage of Ensembles (cont.)
[Diagram repeated from the previous slide: ensembles, dceSets, and ensembleSets tables]
• A DCE set is stored with one or more ensembles (dceSetId is associated with ensId)
• An ensemble is characterized in the database by dceSetId and plScriptId
• Two or more ensembles can be associated together for processing a set of ensembles by creating a new ensemble with NULL dceSetId and two or more associations in the ensembleSets database table

DB Storage of Ensemble Rules
[Diagram of two tables; the link from ensRules to ensPlScripts is labeled 1 to 1+]
  ensRules (ruleId: smallint, instrument: char(4), sql: lvarchar, make: Boolean, ensOfEns: Boolean, minInputs: smallint, comment: varchar(255), created: datetime, createdBy: varchar(30))
  ensPlScripts (ruleId: smallint, plScriptId: smallint)
• There are two database tables for storing ensemble-creation rules
• The ensRules database table specifies how DCEs are to be grouped
• The ensPlScripts database table specifies how a set of DCEs is to be processed (by one or more different pipelines)

Database Schema for Ensemble Creation
[Schema diagram linking the following tables]
• Rule tables: ensRules, ensPlScripts
• Temporary tables for ensembles of DCEs: ensTempList, ensTempList2, ensTempList3, ensTempListMore, ensTempList3More
• Temporary tables for ensembles of ensembles: ensOfEnsTempList, ensOfEnsTempList2, ensOfEnsTempList3
• Storage tables: ensembles, dceSets, ensembleSets
• Key columns appearing in the temporary tables: groupId, serialId, ruleId, ensPlScriptId, ensId, inEnsId, outEnsId, dceSetId, expectedInputs, dceId
• DCE-attribute columns repeated across the temporary tables: initPlScriptId, chanNum, exposureNum, fowlerNum, waitPeriod, dceNum, primaryField, cycleNum, aperture, clusterPosNum, frameNum, arraycoord

Database Stored Functions for Ensemble Creation
Database stored function                    Return value(s)
getEnsRules()                               All records
getEnsPlScripts()                           All records
getReqMode(reqKey)                          Corresponding reqMode (decoded for instrument name)
deleteAllEnsTempLists()                     None
getEnsGroupsFromEnsTempList(ruleId)         All records for given ruleId
getEnsSetsFromEnsOfEnsTempList3(ruleId)     All records for given ruleId
createEnsembles(ruleId, test)               Basic info for all ensembles created or to be created for given ruleId
createEnsembleSets(ruleId, test)            Basic info for all ensembles and ensembleSets created or to be created for given ruleId

Features of ensembleCreation.pl
• Much faster performance is expected because pre-compiled database stored functions are called
• Efficient architecture: only a single database connection is needed
• Software complexity is encapsulated in the database stored functions
• Database-table-driven specification of ensemble-creation rules makes it flexible
• On-line tutorial (lists options, switches, sample command lines)
• Useful, thoughtfully-organized diagnostic outputs
• Test mode to verify effect of ensemble-creation rule, without actually having to create ensembles in the database
• Post-mortem debugging capability via direct SQL querying of database temporary tables

Flow Chart for createEnsembles.pl
[Flow chart; steps in order]
• Open database (DB) connection
• Delete all records from temporary DB tables
• Read ensRules and ensPlScripts DB tables
• Optionally write ensemble-creation rules to output file
• Execute ensemble-creation-rule SQL statements to pre-load data into ensTempList# temporary DB tables
• Create ensembles and sets of DCEs in temporary DB tables and optionally in ensembles and dceSets DB tables
• Write summary to output file
• Execute ensemble-creation-rule SQL statements to pre-load data into ensOfEnsTempList# temporary DB tables
• Create ensembles and sets of ensembles in temporary DB tables and optionally in ensembles and ensembleSets DB tables
• Write summary to output file
• Close database connection
[Flow-chart legend: Via database stored function; Via dbaccess system-call; File output; Open/close DB connection]

Conclusions
• Increased speed in creating database records for ensembles is achieved by using database stored functions
• Flexibility in adding/changing ensemble-creation rules is achieved by storing the rules in the database
• Several “small improvements” were implemented, as well (e.g., storing the minimum number of DCEs with the ensemble-creation rule, storing the corresponding ruleId with each ensemble in the database)

Future Upgrades
• Add new option to execute selected ensemble-creation rules
  • Specify comma-separated list of ruleIds
  • Application is augmenting existing set of ensembles
• Add new option to create ensembleSets from existing ensembles
  • Specify ruleId and ensPlScriptId
  • Application is linking together existing ensembles (e.g., process the data for all reqKeys in a given 12-hour PAO to flag pixels with latent images)
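
Illustrative sketch (not the SSC software)

The table-driven design summarized in this talk (rules held in ensRules and ensPlScripts, results written to ensembles, dceSets and ensembleSets, plus a test mode that writes nothing) can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: Python's sqlite3 stands in for the operations database and its stored functions, the column types are approximations, and several details are hypothetical. These include the dces table of scheduled DCEs, the convention that a rule's SQL returns (group key, dceId) rows, the skipping of groups smaller than minInputs, the reading of inEnsId as the input ensemble and outEnsId as the new ensemble, and all ID values used in the demo.

```python
import sqlite3
from collections import defaultdict

# SQLite stand-in for the operations database; only a subset of columns is kept.
SCHEMA = """
CREATE TABLE ensRules (ruleId INTEGER PRIMARY KEY, instrument TEXT,
                       sql TEXT, minInputs INTEGER);
CREATE TABLE ensPlScripts (ruleId INTEGER, plScriptId INTEGER);
CREATE TABLE dceSets (dceId INTEGER, dceSetId INTEGER);
CREATE TABLE ensembles (ensId INTEGER PRIMARY KEY AUTOINCREMENT,
                        plScriptId INTEGER,
                        dceSetId INTEGER,  -- NULL for an ensemble of ensembles
                        expectedInputs INTEGER, ruleId INTEGER);
CREATE TABLE ensembleSets (inEnsId INTEGER, outEnsId INTEGER);
-- Hypothetical table of scheduled DCEs that the rule SQL selects from.
CREATE TABLE dces (dceId INTEGER PRIMARY KEY, reqKey INTEGER, chanNum INTEGER);
"""


def create_ensembles(conn, rule_id, test=False):
    """Apply one stored rule; return (plScriptId, dceSetId, dceIds) per ensemble.

    With test=True nothing is written, mirroring the talk's test mode.
    """
    rule_sql, min_inputs = conn.execute(
        "SELECT sql, minInputs FROM ensRules WHERE ruleId = ?", (rule_id,)
    ).fetchone()
    scripts = [row[0] for row in conn.execute(
        "SELECT plScriptId FROM ensPlScripts WHERE ruleId = ? ORDER BY plScriptId",
        (rule_id,))]
    # Assumption: the rule SQL yields (group key, dceId) rows.
    groups = defaultdict(list)
    for key, dce_id in conn.execute(rule_sql):
        groups[key].append(dce_id)
    next_set = conn.execute(
        "SELECT COALESCE(MAX(dceSetId), 0) + 1 FROM dceSets").fetchone()[0]
    planned = []
    for key in sorted(groups):
        dce_ids = groups[key]
        if len(dce_ids) < min_inputs:  # assumption: too-small groups are skipped
            continue
        if not test:
            conn.executemany(
                "INSERT INTO dceSets (dceId, dceSetId) VALUES (?, ?)",
                [(d, next_set) for d in dce_ids])
            for pl in scripts:  # one ensemble per pipeline attached to the rule
                conn.execute(
                    "INSERT INTO ensembles (plScriptId, dceSetId, expectedInputs,"
                    " ruleId) VALUES (?, ?, ?, ?)",
                    (pl, next_set, len(dce_ids), rule_id))
        planned += [(pl, next_set, tuple(dce_ids)) for pl in scripts]
        next_set += 1
    return planned


def create_ensemble_set(conn, pl_script_id, in_ens_ids, rule_id):
    """Link existing ensembles under a new ensemble whose dceSetId is NULL."""
    cur = conn.execute(
        "INSERT INTO ensembles (plScriptId, dceSetId, expectedInputs, ruleId)"
        " VALUES (?, NULL, ?, ?)", (pl_script_id, len(in_ens_ids), rule_id))
    out_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO ensembleSets (inEnsId, outEnsId) VALUES (?, ?)",
        [(e, out_id) for e in in_ens_ids])
    return out_id


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Request 1 has three DCEs in channel 1; request 2 has only one.
    conn.executemany("INSERT INTO dces VALUES (?, ?, ?)",
                     [(101, 1, 1), (102, 1, 1), (103, 1, 1), (104, 2, 1)])
    conn.execute(
        "INSERT INTO ensRules VALUES (1, 'IRAC', ?, 2)",
        ("SELECT reqKey || ':' || chanNum, dceId FROM dces ORDER BY dceId",))
    conn.executemany("INSERT INTO ensPlScripts VALUES (1, ?)", [(10,), (11,)])
    dry = create_ensembles(conn, 1, test=True)   # plan only
    n_after_test = conn.execute("SELECT COUNT(*) FROM ensembles").fetchone()[0]
    made = create_ensembles(conn, 1)             # actually write
    ens_ids = [r[0] for r in conn.execute(
        "SELECT ensId FROM ensembles ORDER BY ensId")]
    parent = create_ensemble_set(conn, 20, ens_ids, 1)
    return conn, dry, n_after_test, made, parent
```

Calling demo() runs the rule once in test mode and once for real, then links the resulting ensembles under a parent ensemble. Changing how DCEs are grouped means editing the row in ensRules rather than the code, which is the flexibility the talk attributes to storing rules in the database.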