* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Title Page - Chep 2000 Home Page
Survey
Document related concepts
Oracle Database wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Team Foundation Server wikipedia , lookup
Relational model wikipedia , lookup
Functional Database Model wikipedia , lookup
Concurrency control wikipedia , lookup
ContactPoint wikipedia , lookup
Database model wikipedia , lookup
Transcript
NOVA Networked Object-based EnVironment for Analysis P. Nevski, A. Vaniachine, T. Wenaus NOVA is a project to develop distributed object oriented physics analysis components, adaptable to different experiments. A job configuration manager uses a scripting interface to provide web-based editing, submission and cataloguing of analysis jobs, both user-level and experiment-wide, centrally managed in a database. A client/server system distributed over compute nodes provides job submission and monitoring across facilities, which may span several sites. A file catalog records production relationship of data files generated by an experiment. NOVA provides database tools for geometry and parameter object storage. A NOVA web-based browser navigates a relational database storing hierarchically structured dataObjects. Clients may access database information from the code or through a CORBA-specified interface. NOVA components have been tested and deployed in the STAR and ATLAS environments. February 7, 2000 Outline • • • • • • February 7, 2000 Goals Requirements Architecture Components Details Summary CHEP in Padova Motivations • Unprecedented data volume and software complexity in new large High Energy and Nuclear Physics experiments at RHIC (BNL) and LHC (CERN) New approaches to analysis and data handling software Distributed computing environment (DCE) is vital and increasingly powerful Experience in developing DCE solutions for STAR Build on experience to develop DCE tools for use in similarly challenging environments February 7, 2000 CHEP in Padova Goals • Develop software tools for – coordination and control of widely distributed analysis development and physics analysis activity – distributed management and analysis of very large datasets – enhanced robustness, reusability and maintainability of analysis software • For application in many global computing environments (ATLAS, STAR, …) – generic tools not tied to specific implementation choices – select, templatable implementations provided such that NOVA components can be used in a baseline framework February 7, 2000 CHEP in Padova Requirements • Support wide area data intensive analysis • Define middleware services are required to permit analysis applications to effectively run over wide area networks • Provide a rich set of features that applications can select and use to obtain the level of service they need to operate • Define the features and the API's necessary to allow the application and middleware to communicate • Integrate the middleware API's with the applications February 7, 2000 CHEP in Padova Design Approach • Small, modular components; application-neutral interfaces – Can be used as a coherent framework or in isolation to extend existing analysis systems • Focused on support for C++ based analysis – Used for all RHIC, LHC, other large experiments • Emphasis on user participation in iterative development; real-world prototyping and testing (STAR, ATLAS) • Extensive use of existing tools and technologies – Must be readily available, true or de facto standards, well supported, widely used or showing good growth February 7, 2000 CHEP in Padova Component-based Architecture NOVA Architecture nanoDST GCA Query Visualisation Dynamically loaded apps Web browser Mobile Analysis Client Client Data Binder Module Regional Center Analysis Daemon Client Data Binder Module Remote Analysis Remote Clients Offline Control Framework Bug system HyperNews MySQL Client State DB Database Navigator Web Server Server Data Binder Module CVS Code Repository Monitoring Module MySQL Analysis Catalogue Analysis Server State Server Middleware Components Grand Challenge Architecture (GCA) NOVA component Parameters Repository Data Repository Data Management Catalog Interface MySQL Data Catalogue Third party tool customized for and integrated into NOVA Application specific; sample implementation provided Status: Implemented Prototyped Planned Existing third party tool employed by NOVA February 7, 2000 CHEP in Padova Tools and Technologies • Third party tools and technologies used in NOVA: – MySQL: relational database for catalogs, state information and simple objects: C-structs – Perl: Unix scripting and web development tool – Apache: customizable (Perl & PHP) web server for communication and monitoring – CORBA: low-volume interprocess data exchange – ROOT: visualization and analysis tools February 7, 2000 CHEP in Padova Components NOVA components fall into four domains – Regional Center • Central management and execution of analysis – Remote Client • Mobile Analysis – Middleware Components • Data exchange and navigation tools • Client/Server object request brokerage – Data Management • Data repository, catalogue, and interface • Data model for simple objects (C-structs) February 7, 2000 CHEP in Padova Dynamic Binding • Problem: – A user has a new idea that was not foreseen at the beginning. User modifies the structure of one object in his application. Application stores new objects in the database. – Remote applications unaware of a new functionality may request objects in old format. • Solution: – Application: provides metadata request (name, time, selectors...) and the application dataObject dictionary – Database server: provides dataObject and the dictionary – Object Request Broker module: converts dataObject according to the application dictionary February 7, 2000 CHEP in Padova Dynamic Object Broker Application DataObject Application Dictionary Object Request Broker Database DataObject Middleware Services Database Dictionary Remote Application Clients Parameters Repository Central Database Server February 7, 2000 CHEP in Padova Forward Compatibility • Benefits: – Separation of database and analysis applications – Robust interface (via built-in type checking) – Dictionary built from C-header files or IDL-files – Database access is independent of application code version: user can read new dataObjects with an old executable • Usage: – Parameters data management (versioned geometry and reconstruction constants support) February 7, 2000 CHEP in Padova Static Binding • Problem: – Remote application (web browser) navigates current database hierarchy. • Solution: – Object Request Broker at the Regional Center serves dynamic HTML dataObjects in format tailored according to application ID: Netscape or MS Internet Explorer February 7, 2000 CHEP in Padova Static Object Broker NOVA Browser Application DataObject Apache Web Server Application ID Database API Module Database DataObject Middleware Services Database API Call Parameters Repository Regional Center Database Server Remote Application Client February 7, 2000 CHEP in Padova Layered Interface February 7, 2000 CHEP in Padova Data Model Array of parameters Array of structures structure February 7, 2000 relation CHEP in Padova parameter Cataloguing Analysis Workflow Job configuration manager Job monitoring system fileCatalog February 7, 2000 CHEP in Padova Grand Challenge Interface GC System Query Estimator GCA Interface STAR Components gcaClient StIOMaker Query Monitor database Cache Manager FileCatalog fileCatalog Index Builder IndexFeeder tagDB February 7, 2000 CHEP in Padova Limiting Dependencies Experiment-specific & GCA-dependent • IndexFeeder server – IndexFeeder read the “tag database” so that GCA “index builder” can create index • FileCatalog server – FileCatalog queries the “file catalog” database of the experiment to translate fileID to HPSS & disk path • gcaClient interface – Experiment sends queries and get back filenames through the gcaClient library calls February 7, 2000 CHEP in Padova Summary What is NOVA? • Framework components for distributed computing What are NOVA components? • • • • • • Configuration manager for analysis jobs Distributed job submission and monitoring system Analysis workflow catalog Database for versioned dataObjects Brokered extraction of dataObjects Web-based database navigation tool February 7, 2000 CHEP in Padova