FABRIC: A pilot study of distributed correlation
Huib Jan van Langevelde, Ruud Oerlemans, Sergei Pogrebenko, and many other JIVErs
NGC Groningen, 29 June 2006

Aim of the project
• Research the possibility of distributed correlation
  • Using the Grid to obtain the CPU cycles
  • Can it be employed for the next generation of VLBI correlation?
• Exercise the advantages of software correlation
  • Using floating-point accuracy and special filtering
• Explore (push) the boundaries of the Grid paradigm
  • "Real time" applications, data transfer limitations
• Lead to a modest-sized demo
  • With some possible real applications:
    • Monitoring EVN network performance
    • A continuously available eVLBI network with a few telescopes
    • Monitoring transient sources
    • Astrometry, possibly of spectral line sources
    • Special correlator modes: spacecraft navigation, pulsar gating
  • Test bed for broadband eVLBI research
• Something to try on the roadmap for the next generation correlator, even if you do not believe it is the solution…

SCARIe & FABRIC
• EC-funded project EXPReS (03/2006)
  • To turn eVLBI into an operational system
  • Plus a Joint Research Activity: FABRIC (Future Arrays of Broadband Radio-telescopes on Internet Computing)
    • One work package on 4 Gb/s data acquisition and transport (Jodrell Bank, Metsähovi, Onsala, Bonn, ASTRON)
    • One work package on distributed correlation (JIVE, PSNC Poznań)
• Dutch NWO-funded project SCARIe (10/2006)
  • Software Correlator Architecture Research and Implementation for eVLBI
  • Uses the Dutch Grid with configurable high connectivity
  • Software correlation with data originating from JIVE
• Complementary projects with matching funding
  • International and national expertise from other partners
    • Poznań Supercomputing Centre
    • SARA and the University of Amsterdam
  • A total of 9 man-years at JIVE, plus some matching from staff
  • Plus a similar amount at the partners

Previous experience with software correlation
• Builds on previous experience at JIVE
  • Regular and automated network performance tests, using the Japanese software correlator from NICT
  • Huygens extreme narrow-band correlation, with a home-grown superFX with sub-Hz resolution

Basic idea
• Use the Grid for correlation
  • CPU cycles on compute nodes
  • The Net could be the crossbar switch?
• Correlation will be asynchronous
• Based on floating-point arithmetic
• Portable code, standard environment

Typical VLBI problems (rough estimates based on XF correlation):

description              N telescopes  N subbands  data rate [Mb/s]  N spect/prod  Tflops
1 Gb/s full array        16            16          1024              16            83.89
typical eVLBI continuum  8             8           128               16            2.62
typical spectral line    10            2           16                512           16.38
FABRIC demo              4             2           16                32            0.16
future VLBI              32            32          4096              256           21474.84
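These Tflops figures follow a simple scaling law: the cost grows with the square of the number of telescopes, with the data rate, and with the number of spectral points per product. Below is a minimal sketch that reproduces the table; the constant of roughly 20 operations per input bit per spectral point is an assumption inferred by fitting the quoted numbers, not a value stated in the talk.

```python
# Rough XF-correlation cost estimate reproducing the table above.
# OPS_PER_BIT_PER_CHANNEL (~20) is an assumption fitted to the quoted
# Tflops values; it is not a number given in the presentation.
OPS_PER_BIT_PER_CHANNEL = 20

def xf_tflops(n_tel, data_rate_mbps, n_spect):
    """All N^2 products (cross plus auto), each input bit driving
    ~20 operations per spectral point."""
    ops_per_s = OPS_PER_BIT_PER_CHANNEL * n_tel**2 * data_rate_mbps * 1e6 * n_spect
    return ops_per_s / 1e12

cases = [
    ("1 Gb/s full array",       16, 1024,  16),
    ("typical eVLBI continuum",  8,  128,  16),
    ("typical spectral line",   10,   16, 512),
    ("FABRIC demo",              4,   16,  32),
    ("future VLBI",             32, 4096, 256),
]
for name, n_tel, rate, n_spect in cases:
    print(f"{name:24s} {xf_tflops(n_tel, rate, n_spect):9.2f} Tflops")
```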
Work packages
• Grid resource allocation
  • Grid workflow management
    • A tool to allocate correlator resources and schedule the correlation
    • Data flow from the telescopes to the appropriate correlator resources
  • Expertise from the Poznań group in Virtual Laboratories
  • Will this application fit on the Grid?
    • It is very data intensive
    • And time-critical, if not real-time
• Software correlation
  • Correlator algorithm design
    • High-precision correlation on standard computing
    • Scalable to cluster computers
    • Portable to Grid computers and interfaced to standard middleware
  • Interactive visualization and output definition
• Collect and merge the data in the EVN archive
  • Standard format and proprietary rights

Workflow management
• Must interact with normal VLBI schedules
• Divide the data, route it to compute nodes, set up the correlation
• Dynamic resource allocation: keep up with the incoming data!
• Effort from Poznań, based on their Virtual Laboratory

Topology
• Slice in time
  • Every node gets a time interval: a "new correlator" for every time slice
  • Employ cluster computers at the nodes
  • Minimizes total data transport
  • Bottleneck at the compute node
    • Probably good connectivity at Grid nodes anyway
  • Scales perfectly
    • Easy to estimate how many nodes are needed
    • Works with heterogeneous nodes
  • But leaves the sorting to the compute nodes
    • Memory access may limit effectiveness
• Slice in baseline
  • Assign a product (or a range of products) to a certain node
    • E.g. two data streams meet in some place
  • Transport bottleneck at the sources (telescopes)
    • Maybe curable with a multicast transport mechanism that forks at network nodes
    • Some advantage when there are local nodes at the telescopes
  • Does not scale very simply
    • Simple schemes exist for ½N² nodes
    • Need to re-sort the output
  • But reduces the compute problem
    • Using the network as the crossbar switch

Broadband software correlation
The processing chain mirrors the EVN Mk4 hardware equivalents, for stations 1 … N:
• Raw data: BW = 16 MHz, Mk4 format on Mk5 disk
• From Mk5 to Linux disk: raw data, 16 MHz, Mk4 format on Linux disk
• Channel extraction (DIM, TRM, CRM equivalents) → extracted data
• Delay corrections (DCM, DMM, FR equivalents), using pre-calculated delay tables (the SU step) → delay-corrected data
• Correlation (correlator chip equivalent) → the SFXC data product
A minimal sketch of this per-slice chain follows below.
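The following is a minimal FX-style sketch of that per-slice processing chain: integer delay correction, FFT, then cross-multiplication and accumulation. It is an illustration under simplified assumptions, not SFXC itself; fractional-delay correction and fringe rotation, which a real VLBI correlator needs, are omitted.

```python
import numpy as np

def correlate_slice(streams, delays, n_chan=256):
    """Correlate one time slice of N station streams (FX style).

    streams: list of equal-length float arrays, already channel-extracted
    delays:  non-negative integer sample delays per station, taken from a
             pre-calculated delay table (the SU step in the Mk4 chain)
    Returns an (N, N, n_chan) array of accumulated cross spectra.
    """
    n_st = len(streams)
    usable = len(streams[0]) - max(delays)
    # Delay correction: shift each stream by its integer delay.
    aligned = [s[d : d + usable] for s, d in zip(streams, delays)]
    n_seg = usable // (2 * n_chan)
    acc = np.zeros((n_st, n_st, n_chan), dtype=complex)
    for k in range(n_seg):
        # The "F" step: Fourier transform each station's segment.
        spec = [np.fft.rfft(a[k * 2 * n_chan : (k + 1) * 2 * n_chan])[:n_chan]
                for a in aligned]
        # The "X" step: cross-multiply all products, autocorrelations included.
        for i in range(n_st):
            for j in range(i, n_st):
                acc[i, j] += spec[i] * np.conj(spec[j])
    return acc / max(n_seg, 1)
```

In the time-sliced topology above, each Grid node would run a routine of this kind on its own interval of the experiment, which is why that scheme scales so directly with the number of nodes.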
Better SNR than Mk4 hardware

Software correlation
• Working on benchmarking
  • Single-core processors so far
  • Different CPUs available
  • Already quite efficient
  • More work needed on memory performance
• Must deploy on cluster computers
  • And then on the Grid
• Organize the output to be used for astronomy
[Plots: "SFX correlator: measuring CPU on single core", auto and cross correlations — CPU time (s) against the number of stations (0 to 44) for the jop32, pcint and cedar machines; and "SFX correlator: CPU contributions" on cedar — total CPU time against the FFT-only, I/O-only, FFT and auto contributions.]
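The upward bend of those benchmark curves is what the arithmetic predicts: the FFT load grows linearly with the number of stations, while the cross-multiplication load grows with the number of products. A toy sketch of that scaling, with purely illustrative cost coefficients:

```python
# Expected scaling behind the single-core benchmarks: FFT work is linear
# in the number of stations N; cross-multiplication work follows the
# number of products N(N+1)/2 (all baselines plus autocorrelations).
# The cost coefficients are purely illustrative, not measured values.

def n_products(n_st: int) -> int:
    """Correlation products for n_st stations, autos included."""
    return n_st * (n_st + 1) // 2

def cpu_model(n_st: int, fft_cost: float = 10.0, mul_cost: float = 1.0) -> float:
    return fft_cost * n_st + mul_cost * n_products(n_st)

for n in (4, 8, 16, 32, 44):
    print(f"{n:2d} stations: {n_products(n):4d} products, model CPU ~ {cpu_model(n):6.1f}")
```

For small arrays the linear FFT term dominates; beyond a few tens of stations the quadratic product term takes over, which is the behaviour the measured CPU-time curves suggest.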
Huygens, software correlation
• Experience with software correlation from Huygens
  • Carrier signal from the Titan lander, recorded on the Mk5 disk system
  • Saved the Doppler data experiment
• Requires extreme narrow-band correlation
  • And a solar system model
• May reveal the 3D trajectory at 1 km accuracy

Goal of the project
Develop methods for high data rate eVLBI using distributed correlation:
• High data rate eVLBI data acquisition and transport
  • Develop a scalable prototype for broadband data acquisition
    • Prototype acquisition system
  • Establish a transport protocol for broadband eVLBI
    • Build it into the prototype; establish the interface to the normal system
  • Interface eVLBI public networks with the LOFAR and e-MERLIN dedicated networks
    • Correlate wideband Onsala data on e-MERLIN
    • Demonstrate LOFAR connectivity
• Distributed correlation
  • Set up data distribution over the Grid
    • Workflow management tool
  • Develop a software correlator
  • Run a modest distributed eVLBI experiment

Two major components
Part 1: Scalable connectivity
• 1.1. Data acquisition
  • 1.1.1. Data acquisition architecture (MRO)
    • A scalable data acquisition system from off-the-shelf components: a new version of PC-EVN?
  • 1.1.2. Data acquisition prototype (MRO)
    • A prototype for 4 Gb/s?
  • 1.1.3. Data acquisition control (MPI)
    • Control of the data acquisition, an interface for the protocol, distributed computing
• 1.2. Broadband datapath
  • 1.2.1. Broadband protocols (JBO)
    • IP protocols, lambda switching, multicasting
  • 1.2.2. Broadband data processor interface (JBO)
    • Data from the public network to the e-MERLIN correlator
  • 1.2.3. Integrate and test (OSO)
    • A 10 Gb/s test environment for OSO to e-MERLIN (and LOFAR?)
  • 1.2.4. Public-to-dedicated interface (ASTRON)

Components (part 2)
Part 2: Distributed correlation
• 2.1. Grid resource allocation
  • 2.1.1. Grid VLBI collaboration (PSNC)
    • Establish the relevant tools for eVLBI
  • 2.1.2. Grid workflow management (PSNC)
    • A tool to allocate correlator resources and schedule the correlation
  • 2.1.3. Grid routing (PSNC)
    • Data flow from the telescopes to the appropriate correlator resources
• 2.2. Software correlation
  • 2.2.1. Correlator algorithm design
    • High-precision correlation on standard computing
  • 2.2.2. Correlator computational core
  • 2.2.3. Scaled-up version for clusters
  • 2.2.4. Distributed version, middleware
    • Deploy on Grid computing
  • 2.2.5. Interactive visualization
  • 2.2.6. Output definition
    • Output data from the individual correlators
  • 2.2.7. Output merge
    • Collect the data in the EVN archive

On distributed computing
• What are the Grid resources?
• Calibrate the required amount of computing
• Is dynamic allocation possible?
  • Interaction with the observing schedule
• Topology of the network
  • Slice the data in frequency, in time, or differently?
• Interface for routing the data
  • Multicast implementation on the acquisition module

Distributed correlation
• Correlator model centrally generated, or calculated at every node?
• Plan for merging the data back together
  • How to get the uvw coordinates into the data
• Monitor progress centrally
A toy sketch of centrally scheduled time slicing closes this section.

Current eVLBI practice
• The observing schedule, in VEX format, drives the field system, which controls the antenna and the acquisition chain: BBCs and samplers, the Mk4 formatter and the Mk5 recorder
• The same schedule, together with the user correlator parameters and the earth orientation parameters, drives the correlator control, including the model calculation
• Mk4 data in Mk5prop form travel over TCP/IP from Mk5 playback to the correlator, which produces the output data

FABRIC components
• The observing schedule in VEX format again drives the field system, which controls the antenna and the acquisition: a DBBC feeding PC-EVN units over VSI
• Data leave the stations as VSIe?? on ?? towards the Grid resources
• Correlator control, including the model calculation, becomes resource allocation and routing, fed by the user correlator parameters and the earth orientation parameters; the Grid delivers the output data
• FABRIC = The GRID
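To make the time-slicing and workflow-management ideas concrete, here is a toy sketch of a central scheduler that cuts an experiment into time slices and assigns each slice to a Grid node. All names and parameters (node names, slice length) are invented for illustration; the real tooling would be the Poznań Virtual Laboratory workflow software.

```python
from dataclasses import dataclass

@dataclass
class Slice:
    start: float  # slice start time in seconds from experiment start
    stop: float   # slice stop time
    node: str     # Grid node acting as the "new correlator" for this slice

def schedule_slices(t_start, t_stop, slice_len, nodes):
    """Toy central scheduler: round-robin assignment of time slices to
    Grid nodes. A real workflow manager must also weigh node speed,
    route the station data, and keep up with the incoming data rate."""
    slices, t, i = [], t_start, 0
    while t < t_stop:
        slices.append(Slice(t, min(t + slice_len, t_stop), nodes[i % len(nodes)]))
        t += slice_len
        i += 1
    return slices

# Hypothetical example: a 2-hour experiment in 5-minute slices on 4 nodes.
for s in schedule_slices(0.0, 7200.0, 300.0, ["node-a", "node-b", "node-c", "node-d"])[:5]:
    print(f"{s.start:6.0f}-{s.stop:6.0f} s -> {s.node}")
```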