A Survey on Parallel Computing in Heterogeneous Grid Environments
Takeshi Sekiya, Chikayama-Taura Laboratory, M1
Nov 24, 2006

Parallel Computing in Grid Environments
• Increasing opportunities to use multi-cluster environments
– But schemes designed for stand-alone clusters cause problems in grid-like usage: heterogeneous hardware/software, firewalls/NATs, and dynamic changes in CPU/network load
• New mechanisms are needed
– Handling heterogeneity
– Firewall/NAT traversal
– Adaptation to dynamic environments
– Monitoring (complex configuration, maintenance, and failures make it difficult to know what is happening)

Heterogeneous Environments
• Heterogeneous machines
– Binaries differ
– Complex configuration is required when hardware/software differs
• Heterogeneous networks
– Synchronization overhead in parallel applications when links differ in latency/bandwidth
– Firewalls/NATs

Firewall/NAT
• Firewalls and NATs hinder bi-directional connectivity
• Bi-directional TCP/IP connectivity must be provided to support a wide spectrum of applications

Solutions to the Internet Asymmetric-Connectivity Problem
• MPI Environment on Grid with Virtual Machines [Tachibana et al. 2006]
– Xen for VMs and a VPN for the virtual network
– Low-cost VM migration
• ViNe [Tsugawa et al. 2006]
– A host named the Virtual Router
– Overlay-network based
• WOW [Ganguly et al. 2006]

Outline
• Introduction
• WOW
– IPOP: IP over P2P
– Routing IP on the P2P overlay
– Connection setup
– Joining an existing network
– NAT traversal
– Experiments
• Summary

Objective and Approach
• The system is architected to …
– Adapt to heterogeneous environments
• Present end-users a cluster-like environment
– Scale to a large number of nodes
– Facilitate the addition of nodes through self-organization of the virtual network
• Less manual configuration
• Approach: virtualization
– Virtual machines
• Homogeneous software
– Self-organizing overlay network
• All-to-all connectivity

Virtual Machine
• A homogeneous software environment
• Offers opportunities for load balancing and fault tolerance
• Users can use preconfigured systems
– Linux distribution
– Libraries and software
[Figure: a virtual grid cluster and virtual network built with IPOP (IP over P2P) on top of a P2P overlay, which runs over the physical infrastructure through NATs and firewalls]

IPOP [Ganguly et al. 2006]
• Characteristics
– A virtual IP address space
– Self-organizing
• Architecture
– IP tunneling over P2P
– A virtualized network interface (tap) captures virtual IP packets
– Brunet P2P overlay network

Capturing Virtual IP Packets
• The tap appears to applications as an ordinary network interface
• IPOP translates virtual IP addresses to Brunet P2P network addresses
[Figure: an application's IP packet, framed in Ethernet, is captured by the tap, wrapped in a Brunet message by IPOP, tunneled across the overlay, and unwrapped by IPOP on the remote side]

Brunet P2P
• Ring-structured overlay
• Organized connections
– Near: with neighbors on the ring
– Far: across the ring
• 160-bit SHA-1 hash addresses
• Greedy routing
• Each node maintains a constant number of connections
– O(log²(n)) overlay hops
[Figure: a ring of nodes n1–n12; a multi-hop path from n1 to n7]

Connection Setup: Connection Protocol
• Node A wishes to connect to node B
1. A sends a CTM (Connect To Me) request to B over the P2P network; the CTM request contains A's URI
2. When B receives the CTM request, B sends a CTM reply to A; the CTM reply contains B's URI
• URI (Uniform Resource Identifier), e.g. brunet.tcp:192.0.0.1:1024

Connection Setup: Linking Protocol
3. B sends a link request message to A directly over the physical network
4. When A receives the link request, A simply responds with a link reply message
5. Finally, a new connection is established between A and B

Linking Race Condition (1)
• A race condition may occur because the linking protocol can be initiated by both peers
– If both sides send link requests and replies concurrently, both attempts succeed

Linking Race Condition (2)
• When a node receives a link request, it checks whether a connection or an active linking attempt already exists; if so, it answers with a link error
• When a node receives a link error, it restarts the protocol after a random back-off

Joining an Existing Network: Leaf Connection
• A new node N creates a leaf connection to an initial node I by using the linking protocol directly
• I acts as a forwarding agent for N until N reaches its correct position in the ring

Joining an Existing Network: Send CTM Request
• N sends a CTM request addressed to itself over the P2P network
– The CTM request contains N's URI
• The CTM request is received by N's prospective left and right neighbors (L and R), since N is not yet in the ring

Joining an Existing Network: Send CTM Reply
• L and R send CTM replies, including their URIs, to I
• I forwards the CTM replies to N

Joining an Existing Network: Linking Protocol
• The linking protocol starts
• L and R send link request messages to N over the physical network

Joining an Existing Network: Complete Joining
• N forms connections with its neighbors and is now in the ring
• N then acquires "far" connections

Adaptive Shortcut Creation
• High latencies were observed in experiments due to multi-hop overlay routing
• Shortcut creation
– Count IPOP packets sent to each node
– When the number of packets within an interval exceeds a threshold, initiate connection setup to that node
– Because maintaining connections incurs overhead, drop connections that are no longer in use

NAT
• A NAT rewrites the source of packets leaving the private network and translates replies back
– Host a (192.168.0.2, private network) sends Src 192.168.0.2:5000 → Dst 157.82.13.244:80 to host b (157.82.13.244, global network); the NAT (133.11.23.100) rewrites this to Src 133.11.23.100:6000 → Dst 157.82.13.244:80
– NAT table: 192.168.0.2:5000 ⇔ 133.11.23.100:6000
– The reply Src 157.82.13.244:80 → Dst 133.11.23.100:6000 is translated back to Dst 192.168.0.2:5000 and delivered to host a

NAT Traversal: UDP Hole Punching
• Host A sits behind a NAT with global address N; host B sits behind a NAT with global address M
• A sends Src A:a → Dst M:m; A's NAT translates this to Src N:n and installs the mapping A:a ⇔ N:n
• B sends Src B:b → Dst N:n; B's NAT translates this to Src M:m and installs the mapping M:m ⇔ B:b
• Once both mappings exist, each NAT lets the peer's packets through, so A and B can exchange UDP packets directly

Experimental Setup
• 34 compute nodes and 118 P2P router nodes on PlanetLab
• Hosts include: 2.4 GHz Xeon and 2.0 GHz Xeon machines (Linux 2.4.20, VMware GSX), a 1.3 GHz Pentium III (Linux 2.4.21, VMPlayer), and a 1.7 GHz Pentium 4 (Windows XP SP2, VMPlayer)

Experiment 1: Joining and Shortcut Connections
• Node A: an existing IPOP node; node B: a newly joining node
– A and B are in different network domains, each behind a NAT
– B sends ICMP packets to A at 1-second intervals
• Within period 1 (about 3 seconds), B establishes routes to the other nodes
• Within period 2 (about 28 seconds), B establishes a shortcut connection to A

Experiment 2: PVM Parallel Application, FastDNAml (1)
• Parallelized with a PVM-based master-worker model (a master distributes tasks from a task pool to workers)
• FastDNAml has a high computation-to-communication ratio
• Dynamic task assignment tolerates performance heterogeneity among compute nodes

Experiment 2: PVM Parallel Application, FastDNAml (2)

                       Sequential (node #2)   Parallel, 30 nodes
                                              shortcuts disabled   shortcuts enabled
Execution time (sec)   22272                  2033                 1642
Parallel speedup       n/a                    11.0                 13.6

• Execution with shortcuts enabled is 24% faster than with shortcuts disabled
• The parallel speedup is 13.6x
– 23x is reported in previous work on a homogeneous cluster

Summary
• Introduced WOW
– A scalable, fault-resilient, low-management infrastructure
• Future work
– Middleware that is easy to use for heterogeneous, adaptive Grid environments
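The UDP hole-punching exchange described on the NAT-traversal slide can be sketched as a small simulation. This is an illustrative sketch, not WOW or IPOP code: the `Nat` class, its methods, and all addresses (including the 192.0.2.1 rendezvous address standing in for the overlay-mediated endpoint exchange) are hypothetical, and the model is a simple address-restricted cone NAT.

```python
# Sketch of UDP hole punching through two NATs (hypothetical model, not
# WOW/IPOP code). The NAT delivers an inbound packet only if the private
# host previously sent a packet to that remote IP (address-restricted cone).

class Nat:
    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.table = {}       # private (ip, port) -> public port
        self.reverse = {}     # public port -> private (ip, port)
        self.allowed = set()  # (public port, remote ip) holes opened by outbound
        self.next_port = 6000

    def outbound(self, src, dst):
        """Translate a packet leaving the private network; punch a hole."""
        if src not in self.table:
            self.table[src] = self.next_port
            self.reverse[self.next_port] = src
            self.next_port += 1
        port = self.table[src]
        self.allowed.add((port, dst[0]))      # hole toward dst's IP
        return (self.public_ip, port), dst    # translated (src, dst)

    def inbound(self, src, dst_port):
        """Return the private endpoint to deliver to, or None (dropped)."""
        if (dst_port, src[0]) in self.allowed:
            return self.reverse[dst_port]
        return None

nat_n = Nat("133.11.23.100")   # NAT in front of host A
nat_m = Nat("157.82.13.244")   # NAT in front of host B
a = ("192.168.0.2", 5000)      # host A's private endpoint
b = ("10.0.0.9", 7000)         # host B's private endpoint

# Step 0: each host learns its public endpoint (in WOW this information is
# exchanged over the P2P overlay; a dummy rendezvous address is used here).
pub_a = nat_n.outbound(a, ("192.0.2.1", 3478))[0]
pub_b = nat_m.outbound(b, ("192.0.2.1", 3478))[0]

# B sends to A first: A's NAT has no hole for M yet, so the packet is
# dropped -- but the attempt opens B's NAT for packets coming from N.
src, dst = nat_m.outbound(b, pub_a)
assert nat_n.inbound(src, dst[1]) is None

# A now sends to B: this opens A's NAT for M, and the packet gets through
# because B's earlier attempt already opened M for N.
src, dst = nat_n.outbound(a, pub_b)
assert nat_m.inbound(src, dst[1]) == b

# From here on, traffic flows in both directions.
src, dst = nat_m.outbound(b, pub_a)
assert nat_n.inbound(src, dst[1]) == a
```

The sketch shows why a single unsolicited packet fails while a send from each side succeeds: each outbound packet installs exactly the state its own NAT needs to admit the peer's traffic.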