Introduction to Farm & Grid

I. PC Farm: clustering local resources

The basic idea of a Farm is to cluster economical commercial PCs together to achieve some of the capabilities of a supercomputer. The three key components of a Farm are the disk server, the nodes, and the network connection:
1) The disk server is used for large and frequent data storage and transfer. A typical capacity is more than 1 TB, assembled by grouping many ~100 GB cheap commercial disks into a single logical disk behind a fast and reliable network link. A high-quality motherboard is required for the disks to survive large and frequent data transfers.
2) The nodes are where users' programs run and data are processed. They are high-performance PCs that together act as an integrated pool of CPU and memory.
3) Fast (gigabit) network links and switches are needed to tie the disk server and the nodes together, and to talk to the outside world for data transfer. The Manchester DØ Farm is an example of such a composition.
In brief, a Farm is a local computer cluster that can act as a supercomputer, with a large disk server and an array of CPU/memory nodes integrated by fast network links.

II. Grid

We scientists are now facing a challenging problem: new generations of science, including particle physics experiments, astronomical satellites and telescopes, genome databases, the digitization of paper archives, and so on, are expected to produce a huge increase in the amount of data to be stored and processed over the next few years by increasingly dispersed groups of scientists and engineers. In particle physics, the LHC (Large Hadron Collider) is due to start operation in 2007-8 to probe fundamental questions such as the origin of mass in the Universe. The two general-purpose detectors at the LHC, ATLAS and CMS, contain over a hundred million individual electronic channels, each producing digital data at a rate of 40 million cycles a second. Even after event selection for interesting physics, the total amount of data produced is likely to be several petabytes (1 PB is a million GB, roughly 1.5 million CDs) per year. Such a huge volume of data must be made available for analysis by hundreds of physicists all over the world looking for a handful of very rare events.

The "Grid" is considered to be the solution to this kind of computational and data-intensive problem. The Grid takes its name from the electricity grid, which provides a ubiquitous supply of electricity through a standard interface (plug and socket) throughout the country and, with suitable conversion, across the world. The complexity of power stations, sub-stations, power lines and so on is hidden from the end user, who simply plugs in his appliance. In a computational Grid, the power stations are clusters ("Farms") of computers and data storage centers, and the power lines are the fiber-optic network links. Special software, called "middleware", provides the interfaces through which users can submit their own programs to these computers and access the data stored there. The user does not need to know or care where his program actually runs or where his data are actually located, as long as he gets his results back as quickly and reliably as possible. Since the computing resources will have many different owners, economic models need to be established, or credits exchanged, within "Virtual Organizations" (VOs) such as the worldwide particle physics community, much as real money is charged for electricity.
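To make the middleware idea concrete, here is a minimal, purely illustrative Python sketch of how a submission interface can hide where a job actually runs. All names (SITES, submit_job, the site and dataset labels) are hypothetical; this is not the API of any real Grid middleware.

    # Toy illustration of the "middleware" idea: the user names a program and a
    # dataset; the middleware picks a Farm that holds the data and has free CPUs.
    # All names are hypothetical: this is not a real Grid middleware API.

    SITES = {
        # site name       -> (datasets held locally, free CPU slots)
        "Manchester-Farm": ({"dzero-run2-muons"}, 40),
        "CERN-Center":     ({"lhc-test-sample"}, 200),
        "USTC-Farm":       ({"dzero-run2-muons"}, 12),
    }

    def submit_job(program, dataset):
        """Pick a site holding `dataset`, preferring the one with the most free CPUs."""
        candidates = [(free, name) for name, (data, free) in SITES.items()
                      if dataset in data and free > 0]
        if not candidates:
            raise RuntimeError("no site currently holds dataset " + dataset)
        free, site = max(candidates)
        # In a real Grid the job would now be shipped to `site`; here we just report it.
        return {"program": program, "dataset": dataset, "runs_at": site}

    print(submit_job("my_analysis.exe", "dzero-run2-muons"))
    # The user never needs to know that the job ended up at, say, Manchester-Farm.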
Although the components of the Grid (computers, disks, networks and so on) have existed for many years, seamlessly integrating thousands of them into one distributed system that looks very much like one enormous PC to its users is a severe challenge. Standard protocols have been defined for the Grid as a means of enabling interoperability and a common infrastructure. A "Grid" is a system that:

1) Coordinates resources that are not subject to centralized control. This is the vital point of the Grid; otherwise we are dealing with a local management system such as a Farm. It would be impracticable for everyone who wants to join the project to put all of his investment in one place, e.g. CERN. The Grid should be the integration of local Farms spread all over the world, together with some large "Regional Centers" (such as CERN for particle physics). Clustering at the local level reflects the normal funding mechanism and also divides the hardware into maintainable chunks. Such clustering ensures that the resources remain under local control and can be switched in and out of the Grid at will without breaking the Grid. Of course, agreements must be set up among the dispersed Farms to build up the VO.

2) Uses standard, open, general-purpose protocols and interfaces. A Grid is built from multi-purpose protocols and interfaces that address such fundamental issues as authentication, authorization, resource discovery and resource access (authorization is illustrated by the toy sketch at the end of this section). It is important that these protocols and interfaces be standard and open; otherwise we are dealing with an application-specific system. Put simply, the Grid would not and should not be some kind of "DØ" or "LHC" computer; although some constituent Farms may focus on these particle physics experiments, the Grid as a whole should be much more general-purpose and open to different users.

3) Delivers nontrivial qualities of service. A Grid allows its constituent resources to be used in a coordinated fashion to deliver various qualities of service relating, for example, to response time, throughput, availability and security, and/or the co-allocation of multiple resource types to meet complex user demands, so that the utility of the combined system is significantly greater than the sum of its parts. In other words, a user has access not only to the data stored on the Grid but also to the hardware capabilities (CPU, memory, etc.) of its constituent Farms, based on pre-established agreements. A very interesting example is the Web: it satisfies the first two criteria, i.e. its open, general-purpose protocols support access to distributed resources, but it fails the last one of delivering nontrivial qualities of service, because you can only access and download the data you want but cannot run programs on the remote machines.

In brief, the Grid is considered to be the next, and even more important, IT revolution after CERN's development of the World Wide Web. The standard protocols are clearly defined; management tools (e.g. "middleware") have been developed and are under test; the particle physics community's hierarchical VO system is being formed, and agreements have been set up among the VO's constituent Farms so that "membership" can be identified, in the same way that the Web identifies machines by IP (Internet Protocol) address. It is becoming more and more clear that the Grid has the potential to bring fundamental changes to the way large-scale computing is handled, both in academia and in industry. We particle physicists will build a Grid to enable us to analyze data from the LHC and other collider experiments, and use it as a test bed for the new technology.
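As a purely illustrative sketch of the VO authentication/authorization issue mentioned in criterion 2, the Python fragment below checks a submitter's certificate subject against a toy VO membership list before mapping him to a shared local account. All names (VO_MEMBERS, LOCAL_ACCOUNTS, authorize) are hypothetical; real Grid middleware relies on X.509 certificates and much more elaborate machinery.

    # Toy sketch of VO-based authorization: a Farm decides whether to accept a job
    # by checking the submitter's certificate subject against the VO member list.
    # Hypothetical names throughout; real middleware uses X.509 proxies, CAs, etc.

    VO_MEMBERS = {
        # certificate subject                  -> role within the VO
        "/C=UK/O=Manchester/CN=Alice Analyst": "analysis",
        "/C=CN/O=USTC/CN=Bob Builder":         "production",
    }

    LOCAL_ACCOUNTS = {"analysis": "dzero_ana", "production": "dzero_prod"}

    def authorize(cert_subject):
        """Return the local account a VO member is mapped to, or None if rejected."""
        role = VO_MEMBERS.get(cert_subject)
        if role is None:
            return None                  # not in the VO: job refused
        return LOCAL_ACCOUNTS[role]      # in the VO: run under a shared local account

    print(authorize("/C=CN/O=USTC/CN=Bob Builder"))   # -> dzero_prod
    print(authorize("/C=US/O=Nowhere/CN=Mallory"))    # -> None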
III. DØ Application for the Grid

DØ is expected to accumulate a data and Monte Carlo (MC) sample in excess of 10 PB over the duration of the current Tevatron running period. Owing to the complex nature of proton-antiproton collisions, the simulation and reconstruction of the events is a complicated and resource-intensive operation. The computing resources required to support this effort are far larger than can be supplied at a single central location (Fermilab). The DØ Collaboration therefore calls for an increase in the amount of production-level computing carried out at off-site regional analysis centers (RAC Farms), including the reconstruction of data, MC generation and the running of analysis software; in this way the Collaboration integrates remote computing resources into the DØ computing model. This realization led to the instigation of the SAM (Sequential data Access via Meta-data, the Tevatron Run II database) project between DØ and the Fermilab Computing Division, to develop an effective, functioning computational grid for the DØ experiment.

By joining the DØ Collaboration, developing a local RAC Farm and deploying Grid technology within China, USTC/China has a good opportunity to learn about, and build, a world first-class computer system:
- to get the help of DØ/Manchester computing experts in building our local DØ RAC Farm;
- to provide a computational Grid for the USTC/China DØ groups;
- to contribute to DØ's overall computing needs and so receive credit from the Grid.

Appendix

Farm: clusters cheap commercial computer resources with fast network links to achieve some of the power of a supercomputer. It is in effect a local, high-capability computational network.

Grid: dispersed Farms around the world constitute a Virtual Organization by committing to some common agreements. To an "ignorant" individual user who just wants to run his code on some specified data, the Grid looks like a hyper-computer with a single pool of hardware resources (CPU, memory, etc.) and all the data stored on it. A member Farm can potentially access any resource (data/information as well as hardware) in the Grid. Farms remain under local control and can switch in and out of the Grid at will.

SAM: Sequential data Access via Meta-data, a DØ-Fermilab Computing Division project to build the first fully functional HEP Data Grid, moving petabytes of data among distributed computing resources worldwide. According to the performance review of the first two years of the Computing Model, it has become apparent that the computing resources required to support the DØ physics program are larger than previously expected; the SAM-based off-site Grid solution is therefore crucial to the DØ Collaboration. A great deal of effort has gone into SAM Grid development, and decisive progress is expected in the near future.
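To illustrate the "access data via metadata" idea behind SAM, the sketch below keeps a toy catalogue mapping file metadata to replica locations and answers a dataset query with the matching files and the sites that hold them. The names (CATALOGUE, query, the file and site labels) are hypothetical and do not show the real SAM interfaces.

    # Toy sketch of metadata-driven data access in the spirit of SAM: users ask for
    # data by physics metadata, not by file paths, and the catalogue replies with
    # the matching files and where replicas live.  Hypothetical names throughout.

    CATALOGUE = [
        {"file": "mu_stream_0001.raw", "trigger": "single_muon", "run": 174201,
         "replicas": ["FNAL", "Manchester-Farm"]},
        {"file": "mu_stream_0002.raw", "trigger": "single_muon", "run": 174202,
         "replicas": ["FNAL"]},
        {"file": "em_stream_0001.raw", "trigger": "single_em", "run": 174201,
         "replicas": ["FNAL", "USTC-Farm"]},
    ]

    def query(**conditions):
        """Return (file, replicas) pairs whose metadata match every condition."""
        return [(f["file"], f["replicas"]) for f in CATALOGUE
                if all(f.get(key) == value for key, value in conditions.items())]

    # "Give me all single-muon files from run 174201" (no file paths needed).
    for name, sites in query(trigger="single_muon", run=174201):
        print(name, "is available at", ", ".join(sites))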