Designing a cluster for geophysical fluid dynamics applications
Göran Broström, Dep. of Oceanography, Earth Science Centre, Göteborg University

Our cluster (with Johan Nilsson, Dep. of Meteorology, Stockholm University)
• Grant from the Knut & Alice Wallenberg foundation (1.4 MSEK)
• 48-cpu cluster
• Intel P4 2.26 GHz
• 500 MB 800 MHz RDRAM per node
• SCI cards
• Delivered by South Pole
• Run by NSC (thanks Niclas & Peter)

What we study
Geophysical fluid dynamics
• Oceanography
• Meteorology
• Climate dynamics
These are thin fluid layers with a large aspect ratio, highly turbulent (Gulf Stream: Re ~ 10^12; a back-of-the-envelope estimate is sketched below), and with a large variety of scales. Parameterizations are therefore important in geophysical fluid dynamics.

Timescales
• Atmospheric low pressures: 10 days
• Seasonal/annual cycles: 0.1-1 years
• Ocean eddies: 0.1-1 year
• El Niño: 2-5 years
• North Atlantic Oscillation: 5-50 years
• Turnover time of the atmosphere: 10 years
• Anthropogenic forced climate change: 100 years
• Turnover time of the ocean: 4,000 years
• Glacial-interglacial timescales: 10,000-200,000 years

[Illustration slides: examples of atmospheric and oceanic low pressures; normal, initial, and developed ENSO states; positive and negative NAO phases; temperature in the North Atlantic; ice coverage and sea level.]
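As a rough check of the Re ~ 10^12 quoted for the Gulf Stream, a minimal back-of-the-envelope sketch in Python; the current speed, length scale and viscosity below are illustrative round numbers assumed here, not values taken from the talk.

```python
# Back-of-the-envelope Reynolds number for the Gulf Stream.
# U, L and nu are illustrative assumptions, not figures from the slides.

U = 1.0       # characteristic current speed [m/s]
L = 1.0e6     # along-stream length scale [m] (~1000 km)
nu = 1.0e-6   # molecular kinematic viscosity of seawater [m^2/s]

Re = U * L / nu
print(f"Re ~ {Re:.0e}")   # ~1e12, consistent with the value on the slide
```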
What model will we use?

MIT General circulation model
• General fluid dynamics solver
• Atmospheric and ocean physics
• Sophisticated mixing schemes
• Biogeochemical modules
• Efficient solvers
• Sophisticated coordinate systems: spherical coordinates or a "cubed sphere"
• Automatic adjoint schemes
• Data assimilation routines
• Finite difference scheme
• F77 code
• Portable
[Several slides of example MITgcm configurations.]

Some computational aspects
Some tests on INGVAR (32-node AMD 900 MHz cluster):
• Experiments with 60*60*20 grid points
• Experiments with 120*120*20 grid points
• MM5 regional atmospheric model

Choosing cpus, motherboard, memory, connections
[Charts: SPECfp (swim) results for SGI 3200, AMD XP2800+/XP2600+, Intel P4 (1.7-2.8 GHz), Intel Xeon (2.2-2.8 GHz), Compaq Alpha, HP Itanium 2 (zx6000); run time on different nodes (AMD 2000+, dual AMD, P4 1.7 GHz, P4 and dual P4 2.2 GHz, Xeon 2.2 and 2.26 GHz).]

Choosing the interconnection (requires a cluster to test)
Based on earlier experience we use SCI from Dolphinics (SCALI).

Our choice
• Named Otto
• SCI cards
• P4 2.26 GHz (single cpus)
• 800 MHz RDRAM (500 MB)
• Intel motherboards (the only ones available)
• 48 nodes
• Run by NSC (nicely in the shadow of Monolith)

Scaling
[Plots: scaling of Otto (P4 2.26 GHz) versus Ingvar (AMD 900 MHz).]
Why do we get these results? Time spent on different "subroutines" for 60*60*20 and 120*120*20 grid points; relative time Otto/Ingvar.

Some tests on other machines
• INGVAR: 32 nodes, AMD 900 MHz, SCI
• Idefix: 16 nodes, dual PIII 1000 MHz, SCI
• SGI 3800: 96 processors, 500 MHz
• Otto: 48 nodes, P4 2.26 GHz, SCI
• MIT, LCS: 32 nodes, P4 2.26 GHz, Myrinet
[Plots: comparing the different systems for 120*120*20 and 60*60*20 grid points.]

SCI or Myrinet?
[Plots: 120*120*20 and 60*60*20 grid points, including time spent in the pressure calculation; oops, the ifc compiler was used for these tests, and the Myrinet system may have 1066 MHz RDRAM.]

Conclusions
• Linux clusters are useful in computational geophysical fluid dynamics!
• SCI cards are necessary for parallel runs on more than ~10 nodes.
• For efficient parallelization: more than 50*50*20 grid points per node!
• Few users - great for development.
• Memory limitations: with 48 processors at 500 MB each, about 1200*1200*30 grid points is the maximum (eddy-resolving North Atlantic, Baltic Sea); see the consistency check below.
• For applications similar to ours, go for SCI cards plus a cpu with a fast memory bus and fast memory!
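A quick consistency check of the memory limit quoted in the conclusions, as a minimal sketch assuming double-precision (8-byte) values and that essentially all node memory is available for model fields; the implied number of 3D fields per grid point is a derived estimate, not a figure from the talk.

```python
# Rough consistency check of "1200*1200*30 grid points on 48 x 500 MB".
# Assumes 8-byte (double precision) values; the field count is an
# estimate derived here, not a number from the slides.

nodes = 48
mem_per_node = 500e6          # bytes (500 MB per node)
nx, ny, nz = 1200, 1200, 30   # maximum grid quoted in the conclusions

total_mem = nodes * mem_per_node
points = nx * ny * nz

bytes_per_point = total_mem / points
fields = bytes_per_point / 8  # 8 bytes per double-precision value

print(f"{points/1e6:.1f} M grid points over {total_mem/1e9:.0f} GB")
print(f"~{bytes_per_point:.0f} bytes per grid point, "
      f"i.e. room for roughly {fields:.0f} double-precision 3D fields")
```

At this limit each of the 48 nodes holds 1200*1200*30/48 = 900,000 grid points, comfortably above the 50*50*20 = 50,000 points per node suggested for efficient parallelization.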
Experiment with low resolution (eddies are parameterized)

Thanks for your attention