Pan-STARRS PS1 Published Science Products Subsystem Critical Design Review
November 5-6, 2007, Honolulu
CDR – Day 2

slide 2 Topics: November 6 / Day 2
• Welcome back (Heasley)
• ODM Continued (JHU team)
• The PSPS Data Retrieval Layer (Referentia team)
• Lunch
• The Web Based Interface (Heasley)
• Pan-STARRS External Interfaces (Heasley)
• PSPS Test Plan (Heasley)
• PSPS Schedule (Heasley)
• System Level Risk Assessment (Heasley)
• Executive Session (Committee only)
• Recap Meeting of Review Panel with Subsystem and component leads
• Adjourn

slide 3 The Object Data Manager System Continued – The Johns Hopkins Team

slide 4 The PSPS Data Retrieval Layer – The Referentia Team

slide 5 Pan-STARRS PS1 Published Science Products Subsystem Critical Design Review
Data Retrieval Layer (DRL), Referentia Systems, Inc.
Matt Shawver, [email protected], 808-423-1900x111
Kenn Yuen, [email protected]
Chris Richmond, [email protected]
Jay Knight, [email protected]
Nov 5-6, 2007

slide 6 Outline
• Software Architecture
• Requirements and Implementation Plans
• Key Design Modifications
• Test Plan
• Development Schedule
• DRL Development Status
• Demo

slide 7 High Level Software Architecture

slide 8 DRL Software Architecture
Components shown in the architecture diagram: Web Based Interface, Test Web Based Interface, Generic Web Service Client, Java Web Service Proxy, Web Service Interface, Login / Session Manager, Tomcat Security, Result Set Persistence (Java Caching System), DM Adaptor, Query Manager Adaptor, PostgreSQL JDBC Adaptor, MySQL JDBC Adaptor, JHU Query Manager, PostgreSQL Driver, MySQL Driver, Data Managers.

slide 9 DRL Requirements
• Query Analysis
• Query Queuing and Execution
• Result Caching
• Result Retrieval
• Administrative
  – Performance Monitoring
  – User Administration
  – Logging
• Support multiple Data Managers
  – JHU Query Manager
  – MySQL
  – SQL Server
  – PostgreSQL

slide 10 Req: Query Analysis
From previous design requirements:
• Syntax validation
• Current DM resource load
• Query processing time estimate
• Schema information

slide 11 Query Analysis Implementation
• For syntax validation, use the SQL Server PARSEONLY command.
• For performance status, use the SQL Server sp_monitor procedure.
• Both are highly dependent on the database implementation.
• Exact prediction of query time is impossible; instead, show the query execution plan.
• Schema information will be retrieved by querying database metadata views and functions.
• For the ODM, use Query Manager functionality when available.

slide 12 Req: Query Queuing and Execution
From previous design requirements:
• Query any of the Data Managers
• Provide status and progress information
• Provide priority level with validation status results
• Set query priority based on validation priority level

slide 13 Query Queuing and Execution Implementation
• Issue: database query execution plans are not always accurate.
• Alternative implementation: treat all queries the same at first.
• Short, medium, and long queues each have their own connections allocated.
• If a short or medium query takes longer than a certain amount of time, it is moved to a longer queue.
• Queue sizes and expiration times will be user configurable.
• If the long queue runs out of query slots, the most recent query will be cancelled and restarted when a slot becomes available (a sketch of this promotion scheme follows below).
• For the ODM, use the Query Manager queuing functionality (the user chooses which queue to use).
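The queue-promotion idea on slide 13 can be illustrated with a short sketch. This is not the Referentia implementation; the class name, pool sizes, and time limits below are hypothetical stand-ins for the user-configurable queue settings, and a real query would be a JDBC statement run against a Data Manager connection rather than a plain Callable.

```java
import java.util.concurrent.*;

/**
 * Minimal sketch of the three-queue promotion scheme described on slide 13.
 * All names and limits here are placeholders; DM connections are not shown.
 */
public class QueuePromotionSketch {

    // User-configurable time limits per queue (seconds), per slide 13.
    private static final long SHORT_LIMIT_S = 10;
    private static final long MEDIUM_LIMIT_S = 120;

    // Each queue has its own allocation of connections / worker threads.
    private final ExecutorService shortPool = Executors.newFixedThreadPool(4);
    private final ExecutorService mediumPool = Executors.newFixedThreadPool(2);
    private final ExecutorService longPool = Executors.newFixedThreadPool(1);

    /** Run a query in the short queue first; promote it if it overruns its limit. */
    public String run(Callable<String> query) throws Exception {
        Future<String> f = shortPool.submit(query);
        try {
            return f.get(SHORT_LIMIT_S, TimeUnit.SECONDS);
        } catch (TimeoutException tooLongForShort) {
            f.cancel(true);                           // cancel and move to the medium queue
            Future<String> m = mediumPool.submit(query);
            try {
                return m.get(MEDIUM_LIMIT_S, TimeUnit.SECONDS);
            } catch (TimeoutException tooLongForMedium) {
                m.cancel(true);                       // cancel and move to the long queue
                return longPool.submit(query).get();  // no limit on the long queue
            }
        }
    }
}
```

In the real system the cancelled database statement would be aborted on the DM before the query is re-submitted to the next queue; the sketch only shows the promotion control flow.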
slide 14 Req: Result Caching
From previous design requirements:
• Query result sets are stored in the DRL Cache until they have been retrieved by a PDC.
• Purge retrieved result sets if space is needed.

slide 15 Result Caching Implementation
• Maintain results in the result set cache as long as possible to allow repeated retrieval of results.
• With a large enough cache (a terabyte), results should typically be held for a week or more.
• Link to past results via the query history.
• Performance of the result set cache is critical for PSPS responsiveness:
  – Hybrid memory / disk cache – LRU for memory and disk
  – In-memory index for fast disk retrieval
  – Retrieval of partial results
  – Efficiently support writing and reading multiple concurrent result sets

slide 16 Result Caching (continued)
Java Caching System (JCS) Implementation
• Web server caching system
• Uses Java serialization
• In-memory storage, with swapping to an indexed file on disk
• Built-in capability for distributing the cache across multiple machines (untested)
• Modified JCS to support synchronous puts when memory is full (wait for space to be freed via a disk write)
• Store a result set as a list of objects, each made up of a block of rows
• Support many result sets (each result set can use as little as one block of rows in memory)
• Adding memory speeds up the cache

slide 17 Req: Result Retrieval
From previous design requirements:
• The PS1 DRL shall return query results in response to a query request.

slide 18 Result Set Retrieval Implementation
• Don't slow down fast queries:
  – Return results immediately if the query is very fast.
• Enable incremental results for queries with large data volumes.
• Status updates with the number of rows retrieved.
• Execution status if supported by the database.
• Support streaming CSV file download:
  – Stream the file directly from the cache rather than creating it on disk.

slide 19 Req: Performance Monitoring
From previous design requirements:
Allow administrators to monitor the performance of the DRL functions
• I/O statistics
• CPU statistics
• memory statistics
• process statistics
at configurable levels of detail.

slide 20 Performance Monitoring Implementation
• JMX over RMI to provide a management interface to all JVM information collected at runtime (a sketch follows below).
  – Does not provide CPU information.
• Use a cross-platform third-party library if more detailed information is required.
  – YourKit (http://www.yourkit.com) is one good third-party option.
  – Tradeoff: non-JVM profiling libraries incur overhead.
• Provide user-configurable logs to a database to store historical information.
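As an illustration of the monitoring approach on slide 20, here is a minimal sketch that reads the standard JVM platform MXBeans locally. The real DRL would expose these beans remotely through a JMX/RMI connector, and the user-configurable logging of the values to a database is not shown.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.OperatingSystemMXBean;
import java.lang.management.ThreadMXBean;

/**
 * Sketch of reading runtime statistics via the platform MXBeans (slide 20).
 * A deployed DRL would register these with a JMX/RMI connector so an
 * administrator can read them remotely.
 */
public class JvmStatsSketch {
    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();

        System.out.println("Heap used (bytes): " + mem.getHeapMemoryUsage().getUsed());
        System.out.println("Live threads:      " + threads.getThreadCount());
        // The base OperatingSystemMXBean reports a load average but no per-process
        // CPU statistics, which is the limitation noted on slide 20.
        System.out.println("System load avg:   " + os.getSystemLoadAverage());
    }
}
```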
slide 21 Req: User Administration
From previous design requirements:
• The PS1 DRL shall provide a computer security system to protect the DRL and PSPS DM Components from unauthorized access.
• The PS1 DRL shall provide and authenticate at least two levels of access.
• The PS1 DRL shall provide and authenticate privileged access for use of the private Administrative API.
• The PS1 DRL shall provide and authenticate standard access for use of the public API.

slide 22 User Administration Implementation
Initial plan: Tomcat Realm JDBC based security with an in-process database
• Straightforward
• Independent of other components
• Allows an administrator to create and modify user accounts through the web service
• Allows association of additional information with a user account
  – Role
  – Query log
  – Available result sets
  – Running queries

slide 23 Req: Logging
From previous design requirements:
Log major system events
• Query events
• Unsuccessful authentication attempts
• Server restarts
• Any errors

slide 24 Logging Implementation
• Log events via JDBC to an in-process database.
• Move to an external database if the DRL is clustered in the future.
• Logs are linked to user accounts (stored in the same database).

slide 25 Key Modifications: DRL – DM ICD Changes
For non-ODM Data Managers, the DRL should use JDBC directly rather than RMI and MBeans, for performance and flexibility reasons:
• JDBC is optimized for transfer of result set data.
• JDBC already abstracts much of the database implementation details.
• Eliminate the RMI step, increase performance, and reduce complexity for database developers.
• Use database security for the DM rather than custom J2EE security.

slide 26 Driver Specifics
• Performance
  – Result set batching
• Data types
• Schema information retrieval
• Performance information retrieval

slide 27 Key Modifications: Caching Changes
Instead of purging results as soon as they are retrieved, associate results with the query history and keep them around as long as possible.

slide 28 Key Modifications: Session Management
• Connection and data persistence across web service calls.
• Get a UUID back on login to identify the session (a sketch appears at the end of this section).
  – UUID generator security (Java randomUUID for cryptographically strong randomness)
• Web services don't usually save state.
  – In this case, the UUID is tied to the JHU Query Manager session.

slide 29 Test Plan
• Initial test plan draft developed.
• Includes more test details than the existing DRL test plan.
• Key realization:
  – Need to define minimum requirements for integration of new Data Managers.
  – Data Manager acceptance testing needed.
• Will be updated as we continue to make design decisions and as software is implemented.

slide 30 Performance Testing
Performance critical components:
• Result Set Persistence
  – Stream large result sets directly to/from disk
• HTTP for Data Transfer
  – Zip data compression
• JDBC Drivers
  – Optimize use of the JDBC driver
• Server Threading
  – Test with many distributed clients downloading
  – Connection persistence across web service calls
• Division of machines / processes

slide 31 Software Delivery
• Implementation will be provided using a version of the SDSS database as an example backend. MySQL and PostgreSQL will also be supported.
• An example WBI will be provided with the software.
• Example Java and .NET client applications will also be provided.
• An automated test suite will also be delivered.

slide 32 Schedule

slide 33 Status
• Completed review of specification; no DRL problems identified.
• Technologies chosen for implementation:
  – Tomcat
  – Axis2 Web Service
  – Java Caching System for result set caching
  – Microsoft JDBC SQL Server Driver
• Initial web service proof of concept developed.
• Draft test plan document.

slide 34 Demo
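Before moving on to the WBI, here is a minimal sketch of the login/session handling described on slide 28: a UUID is handed back on login and used to find the server-side session (and, in the real system, the associated JHU Query Manager session) on later web service calls. The class and method names are hypothetical.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Sketch of slide 28: the login operation returns a UUID, and later calls
 * present that UUID to recover the server-side session state. The contents
 * of SessionState (Query Manager handle, open result sets) are placeholders.
 */
public class SessionManagerSketch {

    static class SessionState {
        final String user;
        SessionState(String user) { this.user = user; }
        // In the real DRL this would also hold the JHU Query Manager session,
        // running queries, and references to cached result sets.
    }

    private final Map<UUID, SessionState> sessions =
            new ConcurrentHashMap<UUID, SessionState>();

    /** Called by the login web service operation after authentication succeeds. */
    public UUID login(String user) {
        UUID token = UUID.randomUUID();           // cryptographically strong, per slide 28
        sessions.put(token, new SessionState(user));
        return token;                              // returned to the PDC in the response
    }

    /** Every subsequent operation looks its session up by the UUID it was given. */
    public SessionState require(UUID token) {
        SessionState s = sessions.get(token);
        if (s == null) throw new IllegalStateException("unknown or expired session");
        return s;
    }
}
```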
slide 35 The PSPS Web Based Interface & User Interface Clients – Jim Heasley

slide 36 The WBI
The WBI provides an interface for a human user to request published science data from the data stores via the DRL. It is one example of a Published Data Client (PDC); note that there can be more than one PDC providing the same functionality. The WBI provides both a Standard User API and an Administrative User API. The WBI is in fact a combination of the infrastructure needed to connect to the DRL and some number of clients that access the PSPS data stores via the DRL.
Driving requirement – SECURITY
• Preventing unauthorized access
• Not about encrypting data for privacy!
REALITY CHECK – it's a web server with some clients attached!

slide 37 WBI Components

slide 38 WBI Components
The WBI Software Design Description is in the SAIC generated document PSDC-630-WBI-SDD. The WBI components are:
• Administrative User View
  – The user interface for an authenticated WBI administrator.
• Administrative Web Service Driver
  – Programming interface that converts a method call to its mapped SOAP message and sends it to the DRL.
  – There is a 1-to-1 mapping of requests handled by the Administrative Service Driver to SOAP messages defined in the DRL WSDL.
  – Documented in PSDC-630-DRL-PDC-Private-ICD.html
• Request Controller
  – Provides stateful management of user requests that may persist longer than a user's WBI session.

slide 39 WBI Components (continued)
• Standard User View
  – Provides the user interface to an authenticated non-administrative WBI user.
• Standard User Web Service Driver
  – Provides a programming interface that converts a method call to its mapped SOAP message and transmits it to the DRL. It also performs the reverse function for responses/faults received from the DRL. There is a 1-to-1 mapping of requests to the SOAP messages defined in the DRL WSDL.
  – Documented in PSDC-630-DRL-PDC-Public-ICD.html
• WBI Account Manager
  – Responsible for authenticating users and granting access permissions to WBI functionality. Users will be identified by a user name and password, which serve as the authentication credential.

slide 40 WBI Components (continued)
• WBI Log Configuration Manager
  – Permits an administrator to define logs, define the level and verbosity of event reporting, and identify events reported to administrators.
• WBI Log Manager
  – Initializes logs on startup as defined in a configuration file.
  – Coordinates logs from multiple WBI components to ensure that only the level of logging specified is done.

slide 41 WBI Detailed Design
Main challenge – negotiation of the Web Services Interface to the DRL. These web services make use of concepts outside the realm familiar to traditional scientific programmers, e.g.:
• XML
• SOAP
• WSDL
• X.509 certificates
• Digital signatures
To simplify access for the WBI and other PDCs, the Standard Web Service Driver has been encapsulated in an optional Java-based component named PDC-Core (an illustration of the raw SOAP plumbing it hides follows after the next slide). Documentation of this reference implementation is provided in the SAIC generated document PSDC-630-WBI-SDD-Addendum-A_Detailed-Design.html

slide 42 WBI User Interfaces
The components described to this point provide the low-level functionality necessary to ask for and return data from the DRL and the data managers connected to it. The astronomers won't interact with them directly. The USER INTERFACES, the web applications that use the web services provided by the WBI and DRL, are the tools with which the astronomers will interact. As mentioned yesterday, we have followed the advice of the PDR committee and are providing access via "recycled" web applications (the SDSS Casjobs web interface, hereafter Query Manager = Qm), reused tools (from the MOPS), and a work-alike clone of another existing web app (IPAC's Gator).
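To make slide 41's point concrete, here is a minimal sketch of what calling the DRL without a driver layer looks like: the client must build the SOAP envelope by hand. The namespace, operation name, and endpoint below are hypothetical placeholders rather than the actual DRL WSDL contents; PDC-Core exists precisely so that WBI and PDC authors never have to write this kind of code.

```java
import javax.xml.namespace.QName;
import javax.xml.soap.MessageFactory;
import javax.xml.soap.SOAPMessage;
import javax.xml.ws.Dispatch;
import javax.xml.ws.Service;
import javax.xml.ws.soap.SOAPBinding;

/** Hand-built SOAP call (JAX-WS Dispatch) against a made-up DRL endpoint. */
public class RawSoapSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical service/port names and endpoint; real values live in the DRL WSDL.
        QName serviceName = new QName("http://drl.psps.example/", "DrlService");
        QName portName    = new QName("http://drl.psps.example/", "DrlPort");
        Service service = Service.create(serviceName);
        service.addPort(portName, SOAPBinding.SOAP11HTTP_BINDING, "https://drl.example/ws");

        Dispatch<SOAPMessage> dispatch =
            service.createDispatch(portName, SOAPMessage.class, Service.Mode.MESSAGE);

        // Building the envelope by hand is exactly the detail PDC-Core hides.
        SOAPMessage request = MessageFactory.newInstance().createMessage();
        request.getSOAPBody()
               .addChildElement(new QName("http://drl.psps.example/", "submitQuery"))
               .addTextNode("SELECT TOP 10 objID FROM Object");

        SOAPMessage response = dispatch.invoke(request);
        response.writeTo(System.out);
    }
}
```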
slide 43 The SDSS Casjobs Web Interface

slide 44 PS1 Casjobs Interface = Qm

slide 45 A PS1 Menu Driven Web Application
Following the PDR, I developed a prototype of a menu driven web application for accessing the tables in the PS1 database, modeled on IPAC's Infrared Science Archive Gator interface. This application was developed using PHP, a server-side, HTML-embedded scripting language; there are PHP APIs available for most major databases. The user interface allows generating SQL commands from scratch in a roll-your-own window, or automated SQL generation from check-box selection of database attributes and user-specified spatial constraints (a sketch of this kind of SQL generation is given at the end of this section). The interface is configured using information stored in a MySQL database, which allows easy modification of schema, help files, etc.

slide 46

slide 47 Menu Driven Web Interface Modules
• Menu driven queries: Main Window, schema, sqlpage, catalogs, collections, makeMenu, glossary
• Roll-your-own SQL queries: generateSQL, submitSQL

slide 48 Crocodile Demo
This first demonstration shows Crocodile set up to use the 2MASS point source and extended source databases, the USNO UCAC astrometric catalog, the UCAC bright star supplemental catalog, the USNO-B catalog, and the Tycho 2 catalog. (Only 1% of the 2MASS PSC is implemented for the demo, along with the 2MASS XSC and the UCAC bright star supplement.) This second demonstration shows an implementation of the Crocodile user interface configured to use an early version of the Pan-STARRS schema. There's no back end database attached to this demo.

slide 50 MOPS Tools
Within the MOPS subsystem the group (in particular Larry Denneau) has developed an extensive set of software tools for interacting with the MOPS DB. As the SSDM will be a copy (and hot spare) of the MOPS DB, these tools can be interfaced to the WBI to provide access to the SSDM for use by astronomers without impacting the MOPS functional DB. As Larry noted yesterday, the MOPS tools have been developed in Perl. The next 3 slides are screen shots of:
• A summary page of a single night's coverage with links to additional information
• A section of the tracklet and linking efficiency page for a lunation
• An a (semimajor axis) vs. e (eccentricity) plot of orbits discovered by MOPS

slide 51 MOPS Tools
slide 52 MOPS Tools
slide 53 MOPS Tools

slide 54 WBI Hardware Plan
The anticipated hardware configuration for the WBI component of the PSPS includes:
• 2 servers to provide a full Windows common infrastructure
• 2 web servers to host the WBI components and possibly copies of the DRL software
• A network switch (over the entire PSPS) that provides load balancing for the web servers
• Mass storage (~several TBytes) to store output from the various DMs
The software implementation will include:
• Common infrastructure for Windows configuration
• Windows 2003 servers
• IIS web server software
• Windows automated patch installation software

slide 55 WBI Risk Assessment
Overall risk assessment for this subsystem is LOW, because:
– The APIs are very well defined and use well known (in the computer community) software definitions.
– We are recycling and reusing existing software and creating a clone of a successful scientific database interface.
– It's only a damn web server system; it's NOT rocket science.
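As a concrete illustration of the check-box-driven SQL generation described for the menu driven application (slide 45), here is a minimal sketch, written in Java rather than the PHP used by the prototype. The table and column names and the form of the spatial constraint are hypothetical; the real interface reads its schema and menu configuration from a MySQL database and would use an indexed spatial search.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Sketch of turning a set of checked attributes plus a user-supplied cone
 * (ra, dec, radius) into an SQL statement, in the spirit of slide 45.
 * Table and column names here are placeholders.
 */
public class GenerateSqlSketch {

    static String generate(String table, List<String> checkedColumns,
                           double raDeg, double decDeg, double radiusDeg) {
        String cols = String.join(", ", checkedColumns);
        // A simple box pre-filter standing in for a real spatial constraint.
        return "SELECT " + cols + " FROM " + table
             + " WHERE ra BETWEEN "  + (raDeg  - radiusDeg) + " AND " + (raDeg  + radiusDeg)
             + " AND dec BETWEEN "   + (decDeg - radiusDeg) + " AND " + (decDeg + radiusDeg);
    }

    public static void main(String[] args) {
        List<String> checked = Arrays.asList("objID", "ra", "dec", "gMag");
        // The generated statement would be shown in the roll-your-own window for editing.
        System.out.println(generate("Object", checked, 180.0, 0.0, 0.1));
    }
}
```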
slide 56 PSPS External Interfaces – Jim Heasley

slide 57 PSPS External Interfaces
The PS1 PSPS will (initially) receive data from the two data managers being developed under the PS1 construction program: the ODM and MOPS. The external interfaces defined for these two subsystems will provide the basic template for connecting future databases and data stores (e.g., image archives) to the PSPS. Actual data transfer will be done via the Data Store mechanism described yesterday morning. For each data store interfaced to a DM we define:
• An Interface Requirements Specification (IRS) that tells what we must transfer.
• An Interface Control Document (ICD) that tells how the transfer takes place and the details of what is actually transferred.

slide 58 PSPS External Interfaces
We have documented these interfaces as follows:
• IPP-PSPS
  – PSDC 930-006: The Interface Requirements Specification
  – PSDC 940-006: The Interface Control Document
• MOPS-PSPS
  – PSDC 930-007: The Interface Requirements Specification
  – PSDC 940-007: The Interface Control Document
For brevity, in this section I will only discuss the interfacing between the PSPS and the IPP because:
• The MOPS-PSPS interfacing is much simpler, in that we are moving a dump of the MySQL database held within the MOPS.
• The IPP-PSPS interface is more complicated than that for MOPS-PSPS.

slide 59 The PS1 IPP-PSPS IRS
Defined IPP Data Products
• Predefined objects determined by the IPP during commissioning and testing
• Sky cell definitions
• Camera configuration information
• Detection attributes common to P2, P4Σ, P4Δ, and image stack sources
• Detection attributes common to P2, P4Σ, P4Δ sources when PSF fitting is insufficient
• Attributes for non-PSF sources in image stacks, i.e., sources that are not well described by a PSF fit
• P2, P4Σ, P4Δ photometry transformation metadata
• Frame descriptor metadata (per exposure)
• P2 image metadata (per OTA per exposure)

slide 60 The PS1 IPP-PSPS IRS
Defined IPP Data Products (continued)
• P4Σ, P4Δ, image stack metadata
• Definitions of alternative source fitting models used on the P2, P4Σ, P4Δ sources that don't fit the PSF
• List of survey programs
• List of photo-z recipes
Defined PSPS (derived) Data Products
• Object attributes/statistics
IPP Data Publication Process
PSPS Communication Process

slide 61 The PS1 IPP-PSPS ICD

slide 62 The PS1 IPP-PSPS ICD
The IPP-PSPS ICD describes the details of the publication process for data moving from the IPP to the PSPS. The IPP defines a one-time transfer of catalog objects from the IPP to the PSPS to initialize the objects in the ODM; this is done to jump start the ODM and the detection-to-object correlation process. The PSPS publication cycle consists of a job, which itself consists of a set of batches of FITS files together with a batch manifest file. The monthly publication process has 4 stages:
• Verification that the FITS files match the batch manifest. If not, send a Batch Manifest Error Notification.
• Verification of the integrity of the FITS files via checksums (sketched below). If not, send a Batch Corruption Error Notification.
• Verification that the content of the FITS files matches the manifest description. If not, send a Batch Content Error Notification.
• Detailed verification of every datum in each FITS file. If not, send a Batch Verification Error Notification.
When the IPP has shipped all the batches for a job it sends the PSPS a job manifest file.
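A minimal sketch of the second verification stage (batch checksum checking) is given below. The manifest parsing is not shown, and the checksum algorithm (MD5) and file layout are assumptions; the actual algorithm and manifest format are specified in the ICD.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.Map;

/**
 * Sketch of the checksum stage of batch verification (slide 62): each FITS file
 * listed in the batch manifest is checksummed and compared against the value the
 * manifest carries. A mismatch would trigger a Batch Corruption Error Notification
 * back to the IPP. MD5 is an assumption here.
 */
public class BatchChecksumSketch {

    /** @param expected map of file name to expected hex digest, parsed from the manifest */
    static boolean verify(Path batchDir, Map<String, String> expected) throws Exception {
        for (Map.Entry<String, String> e : expected.entrySet()) {
            MessageDigest md = MessageDigest.getInstance("MD5");
            try (InputStream in = Files.newInputStream(batchDir.resolve(e.getKey()))) {
                byte[] buf = new byte[8192];
                for (int n; (n = in.read(buf)) > 0; ) {
                    md.update(buf, 0, n);
                }
            }
            String actual = toHex(md.digest());
            if (!actual.equalsIgnoreCase(e.getValue())) {
                System.err.println("Corrupt file in batch: " + e.getKey());
                return false;   // caller sends the Batch Corruption Error Notification
            }
        }
        return true;
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02x", b));
        return sb.toString();
    }
}
```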
The PS1 IPP-PSPS ICD (continued)
If any error notification is sent, the appropriate correction process for the IPP is defined by the ICD. The ICD also describes the situations that could give rise to a broken publication cycle and the actions to be taken to resynchronize the publication process should that occur. Note: the times for the publication cycle are based on an old (SAIC) conservative ingest rate; we now expect to do this much faster and may be able to publish on a faster time scale. Data transfer from the IPP to the PSPS is assumed to occur in a trusted environment; no authentication measures are specified. Data moves as a push from the IPP to the PSPS. The interface between the systems is supplemented by event notifications via email from the PSPS to the IPP.

slide 63 The PS1 IPP-PSPS ICD
The following notifications from PSPS to IPP are defined:
• Batch Manifest Error
• Batch Corruption Error
• Batch Received
• Batch Refused Error (can't accept data)
• Batch Content Error
• Batch Verification Error
• Job Manifest Error (manifest lists batches not found)
• Job Verification Error (data from different batches inconsistent)
• Job Verification Successful
• Job Ingest Failure (data verified but PSPS can't ingest it)
• Job Publication

slide 64 The PS1 IPP-PSPS ICD
Manifest files are simple XML files describing the appropriate structure (e.g., batch or job). The ICD describes the structure and format of the FITS files used to transmit the IPP data products, and describes in detail the FITS "cards" used in the Header Data Units (HDUs) for each component of those FITS files. The data attributes themselves are included in the FITS files as binary FITS tables as defined in the appropriate HDU.

slide 65 The PSPS Test Plan – Jim Heasley

slide 66 Test Plan Overview
SAIC designed a PSPS Software Test Plan (STP) that is presented in PSDC-630-STP. While the document was prepared at a time when SAIC was the primary support for the PSPS development (and hence is worded to reflect their expectation to be doing the code development around an Oracle database system and the subsequent testing), the test plan itself is very generic and database independent. As with the previous discussion of the PSPS external interfaces, we will concentrate here on the testing of the ODM. The primary subsections of the STP cover:
• Test environment
• Test identification
• Test schedule

slide 67 PSPS Software Test Environment

slide 68 Software Test Environment
The software test environment is designed to support:
• Design validation
• Component testing
• Integration testing
• Deployment testing
PSPS test software will include the following tools:
• DM simulator
• DRL simulator
• IPP simulator
• MOPS simulator
• WBI simulator
• WBI user simulator

slide 69 Test Data from IPP
The IPP will need to provide test data sets for all the data coming to the PSPS as defined in the ICD. In particular, for the ODM we will need samples of:
• Alternate source model fits
• Camera configuration
• Image stack (cumulative sky) data
• P2 frame data
• P4Δ high significance data
• P4Δ low significance data
• Photometry transformation information
• Predefined objects from the IPP
• Sky cell definitions
• Survey program definitions

slide 70 Test Data Sets
The STP defines test data sets, jobs and batches built from the IfA-provided data sets, that are used to evaluate the ingest mode of the ODM. In these tests one or more of the sample data sets are corrupted to test whether the software can identify data values that are inappropriate or out of range.

slide 71 Test Identification
Test Levels
• Design validation – Tests to be performed as part of the development process during the design and code/unit test phase.
• Component testing – Tests performed to verify each component meets specifications.
• Integration testing – Tests conducted by JHU, Referentia, and the IfA to confirm the PSPS components work with each other.
• Deployment testing – Tests to verify the PSPS works with the IPP and MOPS.

slide 72 Planned Tests
Test cases describe prerequisite conditions, inputs, expected results, conditions for success, assumptions, constraints, and outputs to a test log when a test procedure is performed. One or more test cases are defined in the Software Test Description (PSDC-630-STD) for every test. The test cases are organized by component/subsystem as follows:
• WBI
• DRL
• DM
• PSPS

slide 73 Real World Testing
Fact of life: the development timelines for the PSPS components differ; in particular, the more complex ODM will be finished later than the others. The good news: we already have a working MOPS DB and Perl based queries, and the DRL is making excellent progress. So, I plan to have the first "full" integration testing of the PSPS done before the ODM's completion, by doing end-to-end testing using the WBI MOPS client, the DRL, and the SSDM (MOPS clone). JHU can provide a MyBestPS1 minisystem for testing purposes at Referentia, and network access to the ODM prototype for testing and integration purposes.

slide 74 The PSPS Implementation Schedule – Jim Heasley

slide 75 Schedule
The three components of the PSPS have rather different time scales for completion. The driving factor for the PSPS implementation schedule is the ODM work at JHU. You've already heard from that team that we anticipate a complete system design to be available and tested by September 2008. The DRL development work by Referentia is due to be completed by April 1, 2008. No detailed schedule is available for the WBI infrastructure software (as Mr. Holmberg hasn't started work yet). As shown in the previous sections, we already have working prototypes of all three web access clients we intend to implement.

slide 76 Schedule – Target Milestones
• Jan. 2008 – Define hardware for Yr 1 configuration
• Jan. 2008 – New software engineers start @ IfA, JHU
• Apr. 2008 – Referentia finishes DRL programming
• Aug. 2008 – JHU finishes ODM design
• Sep. 2008 – Hardware ordered for PSPS
• Oct. 2008 – Install hardware at site
• Nov. 2008 – System integration
• Dec. 2008 – PSPS ORR

slide 77 Risk Evaluation & Mitigation Plan – Jim Heasley

slide 78 Risk
Paraphrasing an American "philosopher": There are the things you know you know, there are the things that you know you don't know, and there are the things that you don't know that you don't know. – D. Rumsfeld

slide 79 Risk
In this section we want to show that:
• We've maximized those things in the first category.
• We've figured out what we need to do about things in the second category.
• And we've hopefully minimized those in the third category!

slide 80 Inter-Subsystem Risks
Telescope, Camera, and Observatory and Telescope Instrumentation Software Subsystems
• No direct risk to PSPS schedule or cost.
• Any delay on their side gives schedule relief to PSPS integration.
• Overall risk to PSPS – LOW

slide 81 Inter-Subsystem Risks
Moving Object Processing Software (MOPS)
• The Solar System Data Manager (SSDM) in the PSPS will import much simpler data products than those provided by the IPP.
• The SSDM will be a clone of the MOPS internal DM.
• MOPS decides what it exports.
• The maturity of the MOPS-PSPS interface does not pose significant risk to PSPS development.
• Overall risks to PSPS – LOW

slide 82 Inter-Subsystem Risks
Image Processing Pipeline (IPP)
• The reality requirement: the PSPS must ingest the data products from the IPP at the nominal monthly processing rate… whatever it turns out to be.
• The IPP is the stressing data source for the PS1 PSPS.
• The IPP is ahead of the PSPS in the design/review process.
  – Some fine tuning of the data attributes and file formats may be needed to reach a final configuration.
• The good news is that the IPP will be processing data for ~6 months before it is "ready" for delivery to the PSPS, so we will have ample opportunity to test with both simulated and real data.
• Risk level – LOW, provided we are able to iterate/refine issues with the IPP.

slide 83 Intra-Subsystem Risks
• WBI – LOW
• DRL – LOW
• SSDM – LOW
• ODM – LOW to MODERATE

slide 84 General Areas of Risk

slide 85 General Areas of Risk
Science Rules
• Over the course of these presentations some points have been left unspecified on purpose, e.g., the radius for detection-object correlation, the number of detections needed to create an object, etc. These will ultimately be determined by experimentation with real data from the IPP.
• We believe this is a LOW risk issue provided the IPP can provide the appropriate data before system deployment so that we can conduct the appropriate experiments and deduce the rules.
Short time scale versions
• This is the problem of getting the same answer as you did from an old query in a database which is constantly being updated (and hopefully improved)!
• As noted previously, we hope to deal with this by providing snapshots of the object table and by providing adequate information in the schema so that one could recover the state the database was in when the query was made.
• Issues generating risk are how often the snapshots are made, how long they persist, and how easy and/or fast the process of rerunning an old query has to be. Risks here involve the cost of storage for the snapshots and perhaps programming effort.
• Risk is LOW unless "unreasonable" demands are placed on the ODM.

General Areas of Risk
Major data recalibrations and versions
• At the end of the 1st year of operation the PS1 team intends to develop new astrometric and photometric calibrations of the data, derived from the data in hand at that time.
• Such new calibrations can potentially change every derived number in the database for both detections and objects.
• Experience with SDSS indicates that astronomers will also want to keep the "DR1" version of the database around, as they will have written papers based upon that calibration. Full versioning of the old tables poses a COST RISK.
• It may be feasible to solve this problem by providing correction tables for new calibrations and presenting the different calibrations to users as database views. While this should work in principle, it has not been tested in practice and hence represents a MODERATE risk.

slide 86 General Areas of Risk
Dealing with crawlers and value added data generators
• The IPP and PSPS have now agreed that the value added products will be calculated by teams from the science consortium and not be computed within the database.
• We have three potential schemes available to serve these products back to users:
  – By providing tables in the ODM into which user provided data could be loaded, if it is deemed especially desirable to serve it this way. These data would be loaded on a schedule that coincides with regular ODM updates.
  – As client web interfaces and DMs attached to the DRL.
  – PS1 Consortium clients could have their own MyDB space associated with the ODM (either within the project storage area or as add-on storage paid for by the client), and their tables could be exported to other PSPS ODM users via the Qm group mechanism.
• Overall risk to PSPS is LOW (after initial efforts to get the process going).

slide 87 General Areas of Risk
Geoplexing
• We need to be able to distribute incremental updates of the ODM to other sites off Maui in order to:
  – Maintain a physically distinct (and hence safe) backup of the ODM in case of catastrophic events at the primary ODM site.
  – Have a mechanism to provide data to PS1 Consortium partners for their ODM mirrors.
• This process is ultimately limited by the bandwidth available between Maui and the alternate sites.
• Risk level is LOW to MODERATE for providing regular incremental updates, but HIGH if one needed to recreate the primary site from one of the remote ones.

slide 88 General Areas of Risk
Location, Location, Location!
• The site for the PSPS has not been determined, with both the IfA buildings on Maui and Oahu being possible choices.
• Strictly speaking this is an operational issue (with some issues that involve the PS1 Consortium) that doesn't need to be addressed at CDR, but this uncertainty does have implications for the design!
• Risk assessment – MODERATE to HIGH?

slide 89 General Areas of Risk
Staffing
• Software development staffing
  – We have adequate personnel to deal with the WBI and DRL development. The ODM situation is tighter but will be helped by the addition of two FTEs provided by the PS Project in January.
  – Risk – LOW to MODERATE
• Operational staffing
  – To date there has been no clear discussion of the requirements for an operational staff to install or administer the PSPS system, largely because until the completion of the CDR we do not know exactly what we're planning for.
  – The JHU team advises that we need to have 2 people trained to perform the hardware installation and system admin functions.
  – Risk – MODERATE, as it's a budget issue, but it is unclear to me whether it belongs to the Project or the Consortium (or both).

slide 90 General Areas of Risk
PSPS stability / trouble shooting / testing / upgrades
• Good engineering practice dictates that one does not do stability testing, trouble shooting, or upgrades on a production system.
• We believe that to be able to address the problems that are almost certainly going to arise with a new system like the ODM we will require a testing system for this purpose.
• This system was not anticipated in the original PSPS plan but should fit within the maximum possible budget available to the subsystem. Without this test system the overall risk to PSPS would be HIGH.

slide 91 (duplicate of slide 90)
slide 92 General Areas of Risk
Budget
• My concern here is not what it was at PDR, where it was unclear what the project was willing to commit to the PSPS development, but rather a realistic worry that if some other critical subsystem suffers a catastrophic problem, funds targeted to the PSPS will be diverted for that purpose. At that point we will have to make some cutbacks in what we want to do with the PSPS system.
• Risk level ???

slide 93 In Conclusion

slide 94 Does Our Design Meet the Top Level Design Requirements?
3.3.0.1 The PSPS shall be able to ingest a total of 1.5x10^11 P2 detections, 8.3x10^10 cumulative sky detections, and 5.5x10^9 celestial objects together with their linkages.
• Yes. Prototyping efforts for the ODM show this target can be achieved.
3.3.0.2 The PSPS shall be able to ingest the observational metadata for up to a total of 1,050,000 observations.
• Yes. Prototyping efforts for the ODM show this target can be achieved.
3.3.0.3 The PS1 PSPS shall be capable of archiving up to ~100 Terabytes of data (TBR).
• Yes. Prototyping efforts for the ODM show this target can be achieved.
3.3.0.4 The PSPS shall archive the PS1 data products.
• Yes. The design provides sufficient redundancy that the products can not only be input into the database but also reliably maintained in case of hardware failures.

slide 95 Does Our Design Meet the Top Level Design Requirements?
3.3.0.5 The PSPS shall possess a computer security system to protect potentially vulnerable subsystems from malicious external actions.
• Yes. Multiple layers of security are built in via the WBI, the DRL, and the Data Stores.
3.3.0.6 The PSPS shall provide end-users access to detections of objects in the Pan-STARRS databases.
• Yes. Multiple WBI interfaces are available to provide interactive user access as well as allowing for machine based access by Published Data Clients.
3.3.0.7 The PSPS shall provide end-users access to the cumulative stationary sky images generated by Pan-STARRS.
• Yes. For PS1 the PSPS will serve as a front end for requests for images from the IPP.
3.3.0.8 The PSPS shall provide end-users with the metadata required to interpret the observational legacy and processing history of the Pan-STARRS data products.
• Yes. The relevant metadata are transferred from the IPP along with the data products, and the schema of the ODM provides for their access.

slide 96 Does Our Design Meet the Top Level Design Requirements?
3.3.0.9 The PSPS shall provide end-users with Pan-STARRS detections of objects in the Solar System for which attributes can be assigned.
• Yes. These are contained in the MOPS DB, and the SSDM component of the PSPS is a "hot spare" of the former.
3.3.0.10 The PSPS shall provide end-users with derived Solar System objects deduced from Pan-STARRS attributed observations and observations from other sources.
• Yes. These are contained in the MOPS DB, and the SSDM component of the PSPS is a "hot spare" of the former.
3.3.0.11 The PSPS shall provide the capability for end-users to construct queries to search the Pan-STARRS data products over space and time to examine magnitudes, colors, and proper motions.
• Yes. This will be possible through multiple mechanisms attached to the DRL through several web clients.
3.3.0.12 The PSPS shall provide a mass storage system with a reliability requirement of 99.9% (TBR).
• Yes. The ODM design provides sufficient redundancy (based on experience with the SDSS database) to reach this level of reliability.
slide 97 Does Our Design Meet the Top Level Design Requirements?
3.3.0.13 The PSPS baseline configuration should accommodate future additions of databases (i.e., be expandable).
• Yes. The basic design of the DRL provides for the possibility of adding both new published data clients and data stores. Further, we have multiple approaches to incorporating "value added" data products within the PSPS design.

slide 98 Does the PSPS Design Meet the CDR Criteria?
Has the design we've presented matured to the CDR level?
• We believe this is the case.
• Indeed, all other PS1 subsystems that passed CDR have experienced continuing development. That's not an excuse, just a statement of fact.
Is the design reliable and maintainable?
• In our opinion, yes.
Is the hardware within the scope of the project budget?
• The current design uses generally available commodity hardware. We continue to seek leveraged hardware acquisition with interested vendors.

slide 99 Does the PSPS Design Meet the CDR Criteria?
Is the test plan well conceived?
• Yes.
Are the interface designs mature?
• The interfaces to other PS1 subsystems are in fact stable and have changed little since PDR. That doesn't mean that some "tweaking" won't be needed.
• The Data Store component is already used by other PS1 subsystems.
Have the risks been identified and mitigation strategies spelled out?
• Yes, we believe so.
• Also note that none of the risks we have identified are fatal to the basic design or cause us to not meet any requirements. These risks are in many ways related to convenience for the PSPS users.

slide 101 Finally
Hopefully the very bright folks I've been working with at JHU and Referentia (and previously at SAIC) have found solutions for the hard problems and anticipated the ones which might become hard. Now, one of your tasks as the CDR panel is to help us by telling us what we don't know that we don't know. And so, for patiently listening to us for the past 2 days...

slide 102 Mahalo!

slide 103