JOINT INSTITUTE FOR NUCLEAR RESEARCH
OFF-LINE DATA PROCESSING GRID-SYSTEM MODELLING FOR NICA
Nechaevskiy A.
Dubna, 2012

AGENDA
• NICA off-line data processing parameters
• Tasks for simulation
• Simulation platform choice
• Model efficiency estimation
• First results
• Conclusion

DATA PROCESSING SCHEMA FOR NICA MPD
NICA data flow parameters:
• high event generation rate (up to 6 kHz),
• about 1000 particles are produced in a central Au-Au collision,
• the file with simulated detector data for 100000 events occupies about 5 TB.

MPD parameters
№ | Parameter | Value
--|-----------|------
1 | Speed of data collection from all detector components | 4.7 GB/s
2 | Duration of the statistics collection period within a year | 120 days
3 | Event rate at the output of the facility | 6 kHz
4 | Dead time after an event | 1 cycle (50%)
5 | Average number of tracks per event | 500
6 | Average number of particle collisions | 20
7 | Average number of bytes per collision | 45
8 | Average event reconstruction time on a 1 kSI2K processor | 2 s

SOURCE DATA
Requirements specification for NICA off-line data processing
№ | Requirement | Value
--|-------------|------
1 | Number of events to process per year | 1.87 × 10^10
2 | Total data volume to store per year | 8.4 PB
3 | Total disk space per year with RAID6 storage (+25%) | 10 PB
4 | Minimum number of CPUs in the grid needed to reconstruct events at the rate they are collected, assuming 7000 astronomical hours of operation per year | 1480
5 | Number of grid sites | 20
6 | Minimum data transfer speed from JINR to the sites | 2.5 Gb/s

The expected number of events to be processed is about 19 billion. At a data rate of 4.7 GB/s from the sensors, the total amount of source data is estimated at 30 PB per year, or 8.4 PB after processing.

GRID FOR EXPERIMENTS
A hierarchical grid infrastructure of Tier 0/1/2 computing centres is already used by the ALICE experiment and others. The PANDA experiment also wants to use it.
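The figures in the requirements table can be cross-checked from the MPD parameters with a few lines of arithmetic. The sketch below is only an illustration, not the authors' calculation. It assumes decimal petabytes, reads the per-event size as tracks × collisions × bytes per collision, and reads the working time as 7000 hours per year; all function and constant names are hypothetical.

```python
# Cross-check of the requirement figures, using only values quoted on the slides.
# The way the parameters are combined is an assumption, not the authors' stated method.

EVENTS_PER_YEAR = 1.87e10       # events to process per year
TRACKS_PER_EVENT = 500          # average number of tracks per event
COLLISIONS_PER_TRACK = 20       # "average of particle collisions"; per-track reading is assumed
BYTES_PER_COLLISION = 45        # average bytes per collision
RECO_SECONDS_PER_EVENT = 2.0    # reconstruction time on a 1 kSI2K processor
WORK_HOURS_PER_YEAR = 7000      # assumed reading of the working-time figure
RAID6_OVERHEAD = 0.25           # +25% disk space for RAID6
PB = 1e15                       # decimal petabyte (assumption about units)


def event_size_bytes():
    """Stored size of one event under the assumed reading of the parameters."""
    return TRACKS_PER_EVENT * COLLISIONS_PER_TRACK * BYTES_PER_COLLISION


def annual_volume_pb():
    """Data volume to store per year, in PB."""
    return EVENTS_PER_YEAR * event_size_bytes() / PB


def disk_space_pb():
    """Disk space per year including the RAID6 overhead, in PB."""
    return annual_volume_pb() * (1 + RAID6_OVERHEAD)


def min_cpus():
    """CPUs needed to reconstruct one year of events within the working time."""
    cpu_seconds = EVENTS_PER_YEAR * RECO_SECONDS_PER_EVENT
    return cpu_seconds / (WORK_HOURS_PER_YEAR * 3600)


if __name__ == "__main__":
    print(f"event size    : {event_size_bytes() / 1e3:.0f} kB")
    print(f"annual volume : {annual_volume_pb():.1f} PB")
    print(f"disk (RAID6)  : {disk_space_pb():.1f} PB")
    print(f"minimum CPUs  : {min_cpus():.0f}")
```

Under these assumptions the script gives about 450 kB per event, 8.4 PB per year, 10.5 PB of disk and roughly 1484 CPUs, in line with the 8.4 PB, 10 PB and 1480 quoted in the table.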
QUESTIONS FOR SIMULATION
• Grid infrastructure architecture?
• Number of resource centres?
• Amount of resources?
• Network capacity?
• Resource distribution between user groups?
• etc.
Urgency: recommendations and a specification are needed for creating the NICA grid infrastructure.

SIMULATION TASKS
Task 1.
Task 2.

GRIDSIM SIMULATION PACKAGE
• Simulates various classes of heterogeneous resources, users, applications and brokers
• No restriction on the number of jobs that can be submitted to a resource
• Network capacity between resources can be set
• Supports simulation of static and dynamic schedulers
• Statistics of all or selected operations can be recorded
• Implemented in Java
• Configuration files are used to set the simulation parameters
• Source code is available
• Many examples of GridSim usage are available
• Multilevel architecture makes it easy to add new components
http://www.gridbus.org/gridsim/
[Figure: GridSim Architecture]

MODEL EFFICIENCY ESTIMATION
Parameters of the model efficiency:
a) Average network load per day [%]
b) Number of running/waiting jobs
c) Number of CPUs in use
d) Total data transfers per hour [GB]
e) Total storage usage [%]
f) Cluster usage [%]
g) Refused CPUs [%]

MODEL COMPONENTS
1. User interface (edit/add model)
2. MySQL database to save simulation parameters
3. Simulation system
4. Results visualization tools

TEST SIMULATION
Clusters: 1 machine, 2 CPUs
Users: 1
Jobs: 10

EXAMPLE OF GRAPHIC REPRESENTATION OF THE SIMULATION RESULTS
1. Waiting and running jobs
2. Average cluster usage

DONE!
• The web interface for editing the model has been created, with one test scenario of grid operation;
• the key parameters for evaluating the model have been identified;
• results visualization tools have been created;
• the simulation has passed the debugging and verification phase.
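The test scenario (one cluster with 2 CPUs, one user, 10 jobs) and the "running/waiting jobs" and "cluster usage" metrics can be illustrated with a toy event-driven simulation. This is a minimal sketch in plain Python, not GridSim code and not the authors' model. It assumes all jobs are submitted at time 0, scheduled first-come-first-served, and have a hypothetical length of 10 time units each; the function name `simulate` is made up for this example.

```python
import heapq
from collections import deque


def simulate(job_lengths, n_cpus):
    """FIFO scheduling of jobs submitted at t=0 on n_cpus identical CPUs.

    Returns (timeline, makespan, usage), where timeline holds
    (time, running, waiting) samples taken at t=0 and after every
    job completion, and usage is busy CPU time / available CPU time.
    """
    waiting = deque(job_lengths)   # jobs not yet started, in submission order
    finish_times = []              # min-heap of finish times of running jobs
    now = 0.0
    busy_time = 0.0
    timeline = []

    # Fill the free CPUs at time 0.
    while waiting and len(finish_times) < n_cpus:
        length = waiting.popleft()
        heapq.heappush(finish_times, now + length)
        busy_time += length
    timeline.append((now, len(finish_times), len(waiting)))

    # Advance to the next completion; the freed CPU takes the next waiting job.
    while finish_times:
        now = heapq.heappop(finish_times)
        if waiting:
            length = waiting.popleft()
            heapq.heappush(finish_times, now + length)
            busy_time += length
        timeline.append((now, len(finish_times), len(waiting)))

    usage = busy_time / (n_cpus * now) if now > 0 else 0.0
    return timeline, now, usage


if __name__ == "__main__":
    # Hypothetical job length: 10 time units for each of the 10 jobs.
    timeline, makespan, usage = simulate([10] * 10, n_cpus=2)
    for t, running, queued in timeline:
        print(f"t={t:5.1f}  running={running}  waiting={queued}")
    print(f"makespan={makespan}, cluster usage={usage:.0%}")
```

The `timeline` samples correspond to the "waiting and running jobs" plot and `usage` to the "average cluster usage" plot; a real GridSim model would additionally account for network transfers, storage and multiple sites.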
CONCLUSION
The model will make it possible to:
• evaluate different architectures (parameters) of the data processing system by changing only the input data;
• compare various technical solutions and choose the optimal one, using a library of scenarios (data processing, architectures, others).
Plans:
- user interface development;
- debugging the model in a client-server architecture;
- development of sets of scenarios of grid system operation;
- user editing and adding of grid model parameters.

Questions?