I/O Streaming Evaluation of Batch Queries for Data-Intensive Computational Turbulence
Kalin Kanov, Eric Perlman, Randal Burns, Yanif Ahmad, and Alexander Szalay
Johns Hopkins University

I/O Streaming for Batch Queries
* Based on partial sums
* Allows access to the underlying data in any order and in parts
* Data are streamed from disk in a single pass
* Eliminates redundant I/O
* Over an order of magnitude improvement in performance over direct evaluation of queries

Introduction
* Data-intensive computing breakthroughs have allowed for new ways of interacting with scientific numerical simulations
* Formerly, analysis was performed during the computation, and no data were stored for subsequent examination

Turbulence Database Cluster
* Stores the entire space-time evolution of the simulation
* Two datasets totaling 70 TB; part of the 1.1 PB GrayWulf cluster
* Provides public access to a world-class simulation
* Implements the "immersive turbulence"* approach

*E. Perlman, R. Burns, Y. Li, and C. Meneveau. Data exploration of turbulence simulations using a database cluster. In Supercomputing, 2007.
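The partial-sums idea above can be sketched in a few lines: each query's answer is a weighted sum over data points, so it can be accumulated incrementally while the data stream by in any order. A minimal illustration (the class, keys, and coefficient values are hypothetical, not the paper's implementation):

```python
class StreamingQuery:
    """Accumulates a query result as a partial sum over streamed data points."""

    def __init__(self, weights):
        # weights: dict mapping data-point key -> interpolation coefficient
        self.weights = weights
        self.partial_sum = 0.0
        self.remaining = set(weights)

    def consume(self, key, value):
        """Fold one data point into the partial sum if this query needs it."""
        if key in self.remaining:
            self.partial_sum += self.weights[key] * value
            self.remaining.remove(key)

    @property
    def done(self):
        return not self.remaining

# Data streamed from disk in a single pass, in arbitrary order:
data = {0: 1.0, 1: 2.0, 2: 4.0, 3: 8.0}
q = StreamingQuery({1: 0.5, 2: 0.25})  # hypothetical coefficients
for key, value in data.items():
    q.consume(key, value)

print(q.partial_sum)  # 0.5*2.0 + 0.25*4.0 = 2.0
```

Because each `consume` call is independent of arrival order, the same pass over the disk can serve many queries concurrently, which is what eliminates the redundant I/O of direct evaluation.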
Turbulence Database Cluster: Motivation
* Without I/O streaming:
  - Heavy DB usage slows down the service by a factor of 10 to 20
  - Query evaluation techniques adapted from simulation code do not access data coherently
  - Substantial storage overhead (~42%) is incurred to localize each computation
* Turbulence queries:
  - 95% of queries perform Lagrange polynomial interpolation
  - Can be evaluated in parts

Processing a Batch Query
[Figure: data atoms laid out along a z-order curve; direct evaluation of queries q1-q3 requires multiple disk seeks and incurs redundant I/O.]

Streaming Evaluation Method
* Linear data requirements of the computation allow for:
  - Incremental evaluation
  - Streaming over the data
  - Concurrent evaluation of batch queries

[Figure: with I/O streaming, the atoms needed by q1-q3 are read with sequential I/O in a single pass.]

Lagrange Polynomial Interpolation

    f(x', y') = \sum_{j=1}^{N} l_{y,j}^{N}(y') \sum_{i=1}^{N} l_{x,i}^{N}(x') \, f(x_{n-N/2+i}, y_{p-N/2+j})

where the l terms are the Lagrange coefficients and the f(x_{n-N/2+i}, y_{p-N/2+j}) terms are the data.

Processing a Batch Query
* Input queries are pre-processed into a key-value dictionary
  - Keys are the z-index values of data atoms stored in the DB
  - Entries are lists of queries
* A temp table is created out of the dictionary keys
* A join is executed between the temp table and the data table
* When a data atom is read in, all queries that need data from it are processed and their partial sums are updated

Experimental Evaluation
* Random workloads:
  - across the entire cube space
  - a 128³ subset of the entire space
* Workload from the usage log of the Turbulence cluster
* Compare with direct methods of evaluation:
  - Direct
  - Sorting
  - Join/Order By

3D Workload
* Used for generating global statistics

128³ Workload
* Used for:
  - Examining regions of interest (ROI)
  - Creating visualizations
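The Lagrange coefficients in the interpolation formula above are the standard Lagrange basis polynomials evaluated at the query point. A minimal 1-D sketch (function names and sample values are illustrative, not from the paper):

```python
def lagrange_coeffs(xs, x):
    """Lagrange basis coefficients l_i(x) over sample locations xs."""
    coeffs = []
    for i, xi in enumerate(xs):
        c = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                c *= (x - xj) / (xi - xj)
        coeffs.append(c)
    return coeffs

def interpolate(xs, fs, x):
    """f(x) ~ sum_i l_i(x) * f(x_i): the 1-D analogue of the slide's formula."""
    return sum(c * f for c, f in zip(lagrange_coeffs(xs, x), fs))

# Interpolating f(x) = x^2 from three samples reproduces it exactly,
# since a degree-2 Lagrange interpolant is exact for quadratics.
xs = [0.0, 1.0, 2.0]
fs = [x * x for x in xs]
print(interpolate(xs, fs, 1.5))  # 2.25
```

Because the coefficients depend only on the query location, they can be computed once per query and then applied to data values as they stream by, which is what makes the evaluation decomposable into partial sums.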
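The batch-processing steps above (z-index dictionary, temp-table join, single sequential pass with partial-sum updates) can be modeled in miniature as follows. This is a simplified sketch with hypothetical data structures, not the paper's SQL Server implementation; the sorted iteration over keys stands in for the join between the temp table and the data table:

```python
from collections import defaultdict

def preprocess(queries):
    """Build the key-value dictionary: z-index key -> list of query ids."""
    needed = defaultdict(list)
    for qid, (keys, _) in queries.items():
        for k in keys:
            needed[k].append(qid)
    return needed

def stream_evaluate(queries, needed, table):
    """One sequential pass over the joined data atoms; the partial sum of
    every query interested in an atom is updated as the atom streams by."""
    partial = defaultdict(float)
    for key in sorted(needed):      # join of temp table keys with data table
        value = table[key]          # each atom is read exactly once
        for qid in needed[key]:
            keys, coeffs = queries[qid]
            partial[qid] += coeffs[keys.index(key)] * value
    return dict(partial)

# Toy data: atoms 0..7 with values; two queries with hypothetical coefficients.
table = {k: float(k) for k in range(8)}
queries = {
    "q1": ([1, 2], [0.5, 0.5]),   # needs atoms 1 and 2
    "q2": ([2, 5], [1.0, 1.0]),   # shares atom 2 with q1
}
needed = preprocess(queries)
print(stream_evaluate(queries, needed, table))
```

Note that atom 2 is fetched once yet serves both q1 and q2: the data sharing across the batch is exactly what the dictionary of per-atom query lists exposes.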
Experimental Setup
* Experimental version of the MHD database
* ~300 timesteps of the velocity fields of the MHD simulation
* Two 2.33 GHz dual quad-core Windows 2003 servers with SQL Server 2008 and 8 GB of memory
* Part of the 1.1 PB GrayWulf cluster with an aggregate low-level throughput of 70 GB/sec
* Data tables striped across 7 disks per node

3D Workload
[Figure: runtime comparison of I/O Streaming, Join/Order By, and Sorting.]
* Evaluating the entire batch as a join leads to over an order of magnitude improvement
* I/O Streaming executes more sequential access
* Each atom is read only once
* Effective cache usage

128³ Workload
* Less I/O
* More data sharing
* I/O Streaming alleviates the I/O bottleneck
* Computation emerges as the more costly operation

Future Work
* Extend the I/O streaming technique to other decomposable kernel computations:
  - Differentiation
  - Temporal interpolation
  - Filtering
* Multi-job batch scheduling:
  - Integrate into a batch scheduling framework such as JAWS*

*X. Wang, E. Perlman, R. Burns, T. Malik, T. Budavari, C. Meneveau, and A. Szalay. JAWS: Job-aware workload scheduling for the exploration of turbulence simulations. In Supercomputing, 2010.

Summary
* I/O Streaming method for data-intensive batch queries
* Single pass by means of partial sums
* Effective exploitation of data sharing
* Improved cache locality
* Over an order of magnitude improvement in performance

Questions
Images courtesy of Kai Buerger ([email protected])