IELM 231: IT for Logistics and Manufacturing

Course Agenda
• Introduction
• IT applications design: Human-Computer Interface
• Fundamental IT tools: sorting, searching
• The Client-Server architecture, Interacting applications
• IT in logistics, Case study 1: web-search
  – Search robots
  – Data processing, Data storage/retrieval (DB, indexes)
  – Data presentation: page ranking techniques
• IT in logistics, Case study 2: web-based auctions
  – How auctions work
  – Web issues: session tracking
  – Web issues: secure communications
  – Web issues: cash transactions

The Internet and Web searching
Part 1. The infrastructure of the web
- The architecture of the Internet
- The ISO/OSI 7-layer communication protocol
- Client-server architecture for applications
Part 2. How to use a web search engine [case study: Google]
- Basic searching
- Advanced searching (Booleans) and filtering (setting up local engines)
- Searching data not accessible to search engines
Part 3. How search engines work
- Crawling the internet
- Indexing of information, database design
- Page ranking

The Internet and Web searching
Part 1. The infrastructure of the web
- The architecture of the Internet
- The ISO/OSI 7-layer communication protocol
- Client-server architecture for applications

Unit of digital communication: Packets
• All data (files, streamed data e.g. music, video) is transmitted using wires/optic-cables/wireless channels
• In most cases, there must be no error in the communication -> the data on the receiving computer is exactly the same as the data on the sending computer
• Why zero-error? [hint: uploaded programs]
• Zero error -> some error detection and correction technology must be used
• When is some error acceptable? [hint: video-conf, streaming]
• Long message -> probability (error) is high
• Need to re-transmit message?
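The claim that a long message has a high error probability can be made concrete with a small calculation. The sketch below assumes that each bit is corrupted independently with the same probability (the bit error rate), an assumption that is not stated on the slide; the function name and the sample values are illustrative only.

```python
def message_error_probability(bit_error_rate: float, n_bits: int) -> float:
    """Probability that at least one of n_bits is corrupted,
    assuming independent bit errors (an illustrative assumption)."""
    return 1.0 - (1.0 - bit_error_rate) ** n_bits


if __name__ == "__main__":
    p = 1e-6  # assumed bit error rate, for illustration only
    for size_bytes in (4096, 1_000_000, 100_000_000):
        prob = message_error_probability(p, size_bytes * 8)
        print(f"{size_bytes:>11} bytes: P(at least one error) = {prob:.4f}")
```

For a fixed bit error rate the probability moves toward 1 as the message gets longer, which is why re-transmitting a whole long message after a single error is costly.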
Unit of digital communication: Packets
• Long message -> p(error) is high
• Need to re-transmit [part] of message
• Solution: break the message into small “packets” and send the packets one by one
[Figure: a long message is split into part 1 of 3, part 2 of 3 and part 3 of 3. Each packet carries a To address, a From address, a part number (1/3, 2/3, 3/3), its share of the data, and an EDC field. The packets are transmitted, and the receiver reconstructs the message from the three parts received.]
• Typical packet size: 2048 - 4096 Bytes
• Question: Why do some web pages load in non-sequential fashion (some pictures load first, others later)?

Network terminology
LAN: Local Area Network
• A network of communicating devices in a small area (e.g. a building, a factory, etc.)
• Common ways of physically connecting computers in a LAN: Cables (wires), Bluetooth, Wi-Fi…
WAN: Wide Area Network
• Two or more LANs connected to each other over a large area, e.g. international communication networks
• Common ways of connecting LANs in a WAN: Telephone networks, Long-distance cables, Satellites

Network topologies
• Suppose N computers need to communicate with each other
• Pairwise connections: How many? Problems?

Network topologies
• Network topology describes how different devices are (physically) connected to each other.
[Figure: (a) Ring topology; (b) Star topology, built around a Central Hub; (c) Mesh topology; (d) Bus topology, showing the Bus, a Tap, a Stub and the Terminators]

What is the internet
• millions of connected computing devices: hosts = end systems
• running network applications
• communication links
  – fiber, copper, radio, satellite
  – transmission rate = bandwidth
• routers: forward packets
[Figure: routers, servers, workstations and mobile hosts connected through a local ISP, a regional ISP and the UST network]

What is the internet..
• protocols control sending, receiving of msgs
  – e.g., TCP, IP, HTTP, FTP, PPP
• Internet: “network of networks”
  – public: Internet
  – private: Intranet
• Internet standards
  – RFC: Request for Comments
  – IETF: Internet Engineering Task Force
[Figure: routers, servers, workstations and mobile hosts connected through a local ISP, a regional ISP and the UST network]

What is a protocol
Protocols define the format and order of msgs sent and received among network entities, and the actions taken on msg transmission and receipt.
[Figure: a human protocol next to a computer protocol, with time running downward.
  Human: “Hi” / “Hi” / “What’s the time?” / “2pm”
  Computer: TCP connection req / TCP connection response / Get http://www.awl.com/kurose-ross / <file>]

A closer look at network structure
• network edge: applications and hosts
• network core:
  – routers
  – network of networks
• access networks, physical media: communication links

The network edge
End systems (hosts):
  – run application programs
  – e.g. Web, email
  – at “edge of network”
Client/server model
  – client host requests, receives service from always-on server
  – e.g. Web browser/server; email client/server

Network edge: connection-oriented service
Goal: data transfer between end systems
• handshaking: setup (prepare for) data transfer ahead of time
  – Hello, hello back human protocol
  – set up “state” in two communicating hosts
e.g. TCP service [RFC 793]
• reliable, in-order byte-stream data transfer
  – loss: acknowledgements and retransmissions
• flow control:
  – sender won’t overwhelm receiver
• congestion control:
  – senders “slow down sending rate” when network congested

Network edge: connection-less service
Goal: data transfer between end systems
e.g.
UDP - User Datagram Protocol:
• connectionless
• unreliable data transfer
• no flow control
• no congestion control
Apps using TCP:
• HTTP (Web), FTP (file transfer), Telnet (remote login), SMTP (email)
Apps using UDP:
• Streaming media, Teleconferencing, DNS, Internet telephony

Network core
• Network core: a mesh of inter-connected routers
• Basic methods to transfer data through the net:
  – Circuit switching: dedicated circuit per call, e.g. telephone net
  – Packet switching: data sent through net in discrete “chunks”

Network core: circuit switching
End-end resources reserved for “call”
• link bandwidth, switch capacity
• dedicated resources: no sharing
• circuit-like (guaranteed) performance
• call setup required

Network core: packet switching
Each end-end data stream divided into packets
• user A, B packets share network resources
• each packet uses full link bandwidth
• resources used as needed
• in contrast to circuit switching: no bandwidth division into “pieces”, no dedicated allocation, no resource reservation
Resource allocation:
• total resource demand can exceed amount available
• congestion: packets queue, wait for link use
• store and forward: packets move one hop at a time
  – node receives complete packet before forwarding

Packet switching: store and forward
[Figure: a packet of L bits travels over three links, each of rate R, passing through two routers]
• Packet length: L bits
• Link transmission rate: R bps
• Time to push packet on link: L/R sec
• Entire packet must arrive at router before it can be transmitted on next link: store and forward
• delay = 3L/R (one L/R per link, three links)
Example:
• L = 7.5 Mbits
• R = 1.5 Mbps
• delay = 3 × 7.5 / 1.5 = 15 sec

Access networks and physical media
Q: How to connect end systems to edge router?
• residential access nets
• institutional access networks (school, company)
• mobile access networks

Residential access: point to point access
• Phone modem
  – up to 56 Kbps direct access to router (often less)
  – Can’t surf and phone at same time: can’t be “always on”
• ADSL: asymmetric digital subscriber line [similar to NOW Broadband]
  – up to 1 Mbps upstream
  – up to 8 Mbps downstream

Residential access: Cable modems
[Figure: cable distribution network (simplified), from the cable headend to the home]

Residential access: Cable modems..
Diagram: http://www.cabledatacomnews.com/cmic/diagram.html

Company access: local area networks
• company/univ local area network (LAN) connects end system to edge router
• Ethernet:
  – shared or dedicated link connects end system and router
  – 10 Mbps, 100 Mbps, Gigabit Ethernet

Wireless access networks
• Shared wireless access network connects end system to router
  – via base station aka “access point”
• Wireless LANs:
  – 802.11b (WiFi): 11 Mbps (good for networks)
  – Bluetooth: 720 Kbps (good for device-to-device)
[Figure: mobile hosts reach the router through a base station]

Home networks
Typical home network components:
• ADSL or cable modem
• router/firewall/NAT
• Ethernet
• wireless access point
[Figure: to/from cable headend, cable modem, router/firewall, Ethernet, wireless access point, wireless laptops]

Internet structure: network of networks
• a packet passes through many networks!
[Figure: local ISPs and a Tier 3 ISP attach to Tier-2 ISPs, which attach to Tier 1 ISPs; the Tier 1 ISPs interconnect, including at a Network Access Point]

Protocol “Layers”
Networks are complex!
• many “pieces”:
  – hosts
  – routers
  – links of various media
  – applications
  – protocols
  – hardware, software

Analogy: Organization of air travel
  ticket (purchase)               ticket (complain)
  baggage (check)                 baggage (claim)
  gates (load)                    gates (unload)
  runway takeoff                  runway landing
  airplane routing (departure)    airplane routing (arrival)
  airplane routing [intermediate air-traffic control points]

Layering of airline functionality
  Layer             Departure airport    Arrival airport
  ticket            ticket (purchase)    ticket (complain)
  baggage           baggage (check)      baggage (claim)
  gate              gates (load)         gates (unload)
  takeoff/landing   runway (takeoff)     runway (land)
  airplane routing  airplane routing     airplane routing
  (intermediate air-traffic control centers perform airplane routing only)
Layers: each layer implements a service
  – via its own internal-layer actions
  – relying on services provided by layer below

Why layering?
Dealing with complex systems:
• explicit structure allows identification, relationship of complex system’s pieces
  – layered reference model for discussion
• modularization eases maintenance, updating of system
  – change of implementation of layer’s service transparent to rest of system
  – e.g., change in gate procedure doesn’t affect rest of system
• layering considered harmful?
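The idea that each layer implements a service while relying on the layer below can be sketched in a few lines. This is a toy model, not a real protocol stack: the class name `Layer`, the helper `build_stack`, the bracketed text headers and the layer names are hypothetical choices made for illustration.

```python
class Layer:
    """A toy layer: adds its own header on the way down and
    removes it on the way up, delegating the rest to the layer below."""

    def __init__(self, name, below=None):
        self.name = name
        self.below = below  # None marks the lowest layer

    def send(self, payload: str) -> str:
        wrapped = f"[{self.name}]{payload}"
        # Rely on the service of the layer below, if there is one.
        return self.below.send(wrapped) if self.below else wrapped

    def receive(self, data: str) -> str:
        # Let the layer below strip its header first.
        inner = self.below.receive(data) if self.below else data
        header = f"[{self.name}]"
        if not inner.startswith(header):
            raise ValueError(f"{self.name}: unexpected data {inner!r}")
        return inner[len(header):]


def build_stack(names):
    """Build a stack from bottom to top and return the top layer."""
    layer = None
    for name in names:
        layer = Layer(name, below=layer)
    return layer


if __name__ == "__main__":
    top = build_stack(["baggage", "ticket"])  # bottom layer first
    on_the_wire = top.send("passenger")
    print(on_the_wire)
    print(top.receive(on_the_wire))
```

Each layer only knows its own header and the interface of the layer directly below it, so changing how one layer wraps its data leaves the others untouched. That is the modularization argument in the list above.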
Internet protocol stack
• application: supporting network applications
  – FTP, SMTP, HTTP
• transport: host-host data transfer
  – TCP, UDP
• network: routing of datagrams from source to destination
  – IP, routing protocols
• link: data transfer between neighboring network elements
  – PPP, Ethernet
• physical: bits “on the wire”

Encapsulation
[Figure: at the source, each layer prepends its header to the unit handed down from the layer above:
  application:  message    M
  transport:    segment    Ht M
  network:      datagram   Hn Ht M
  link:         frame      Hl Hn Ht M
A switch on the path handles the link and physical layers only; a router handles the network, link and physical layers; the destination removes the headers in reverse order until the message M reaches the application.]

The Network Layer: Internet Protocol
• What’s inside a router
• Internet Protocol and IP addresses
• How packets are routed

Internet Protocol (IP)
The Internet Protocol (IP) is a network-layer (Layer 3) protocol that contains addressing information and some control information that enables packets to be routed.

Network layer functions: Forwarding and Routing
• Forwarding: determines which link to take at a specific router
• Routing: plan of a series of forwarding steps that can take the packet from source to destination
[Figure: the routing algorithm fills the router’s local forwarding table, which maps a header value to an output link:
  header value   output link
  0100           3
  0101           2
  0111           2
  1001           1
The value in the arriving packet’s header is 0111, so the packet is forwarded on link 2.]

IP datagram format
[Figure: IP datagram layout, 32 bits wide]
  Row 1: ver | head. len | type of service | length
  Row 2: 16-bit identifier | flgs | fragment offset
  Row 3: time to live | upper layer | Internet checksum
  Row 4: 32 bit source IP address
  Row 5: 32 bit destination IP address
  Row 6: Options (if any)
  Row 7: data (variable length, typically a TCP or UDP segment)
How much overhead with TCP?
• 20 bytes of TCP
• 20 bytes of IP
• = 40 bytes + app layer overhead
Field notes:
• ver: IP protocol version number
• head. len: header length (bytes)
• type of service: “type” of data
• length: total datagram length (bytes)
• 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
• time to live: max number remaining hops (decremented at each router)
• upper layer: upper layer protocol to deliver payload to
• Options (if any): e.g.
timestamp, record route taken, specify list of routers to visit.

Datagram networks
• Packets forwarded using destination host address
  – packets between same source-dest pair may take different paths
[Figure: 1. Send data, from the sending host’s stack (application, transport, network, data link, physical); 2. Receive data, at the receiving host’s stack]
Main router functions:
• run routing algorithms/protocols (RIP, OSPF, BGP)
• forward datagrams from incoming to outgoing link

IP Addressing
• An IP address is a locator that allows one IP device to ‘find’ another IP device
• IP address: 32-bit identifier for a host or router interface (IPv6 addresses, supported in Vista and OS X, are 128 bits)
• interface: connection between host/router and physical link
  – routers: 2 or more interfaces
  – hosts: 1 or more interfaces
  – each interface has an IP address
• 223.1.1.1 = 11011111 00000001 00000001 00000001 (223, 1, 1, 1)
[Figure: example network with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27]

Access networks and physical media
IP address:
  – subnet part (high order bits)
  – host part (low order bits)
What’s a subnet?
  – device interfaces with same subnet part of IP address
  – can physically reach each other without intervening router
[Figure: the same example network, consisting of 3 subnets, each one a LAN]

Subnets
• To determine the subnets, detach each interface from its host or router, creating islands of isolated networks.
• Each isolated network is called a subnet.
• In the example network the subnets are 223.1.1.0/24, 223.1.2.0/24 and 223.1.3.0/24
• Subnet mask: /24

Subnets..
How many?
[Figure: network with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.6, 223.1.3.1, 223.1.3.2, 223.1.3.27, 223.1.7.0, 223.1.7.1, 223.1.8.0, 223.1.8.1, 223.1.9.1, 223.1.9.2]

Subnets…
• Subnets allow us to create sub-collections of IP devices, e.g. a LAN
• An IP address is made of two parts:
  - Network address
  - Host (i.e. device) address
• How many bits (and which ones) of the IP address are used for the Network address, and for the Host?
• Depends on the LAN: e.g.
only 4 devices -> we may only use 2 bits.
• The subnet mask specifies which bits are used for the network address (mask bits set to 1) and which for the host (mask bits set to 0):
  Full Network Address   192.168.5.10    11000000.10101000.00000101.00001010
  Subnet Mask            255.255.255.0   11111111.11111111.11111111.00000000
  Network Portion        192.168.5.0     11000000.10101000.00000101.00000000
  Client Portion         0.0.0.10        00000000.00000000.00000000.00001010

IP Address: how to get one?
How does a host get an IP address?
1. Hard-coded by the system administrator in a file: Control-panel -> Network -> Configuration -> tcp/ip -> properties
or
2. DHCP: Dynamic Host Configuration Protocol: dynamically get address from a server (plug-and-play)

DNS: Domain Name System
People: many identifiers:
  – HKID, name, passport #
Internet hosts, routers:
  – IP address (32 bit) - used for addressing datagrams
  – “name”, e.g., www.yahoo.com - used by humans
Q: map between IP addresses and name?
Domain Name System:
• distributed database implemented in hierarchy of many name servers
• application-layer protocol: hosts, routers, name servers communicate to resolve names (address/name translation)
Suppose a client wants to connect to a host www.amazon.co.uk
(1) the “network” must tell us the IP address of the host www.amazon.co.uk
(2) the client sends a “connect” request to that IP address.

DNS..
How to find IP address of a named host?
• A DB of {Name -> IP address} is stored on computers called Name Servers
[Figure: name-server hierarchy. Labels: resource record (name -> IP, …); zone of authority, managed by a Name Server; high-level Name Server; lower-level Name Server, which can allocate and store names for the computers below it in the hierarchy]

Transport services and Protocols
• provide logical communication between app processes running on different hosts
• Most common transport protocol: TCP
[Figure: the two end hosts run the full stack (application, transport, network, data link, physical); the nodes between them run only network, data link and physical]
TCP provides a reliable, continuous stream of data:
- protocol for automatically requesting missing data
- reordering IP packets that arrive out of order
- converting IP datagrams to a streaming protocol
- routing data within a computer to the correct application

The Application Layer
• Principles of network applications
• Web and HTTP
• FTP
• Electronic Mail
  – SMTP, POP3, IMAP
• DNS
• Socket programming with TCP
• Building a Web server

Creating a network application
Write programs that
  – run on different end systems and communicate over a network
  – Example: Web server software communicates with browser software
No software written for devices in network core
  – Network core devices do not function at application layer
  – This design allows for rapid app development
[Figure: only the end systems run the application and transport layers]

Client-Server Architecture
server:
  – always-on host
  – permanent IP address
clients:
  – communicate with server
  – may be intermittently connected
  – may have dynamic IP addresses
  – do not communicate directly with each other

Addressing processes
• For a process to receive messages, it must have an identifier (i.e.
address)
• A host has a unique 32-bit IP address
• Does the IP address of the host on which the process runs suffice for identifying the process?
• Ans: No, many processes can be running on the same host
• Problem: Most computers have only one “internet connection”, usually a serial port. How to manage multiple processes (e.g. mail, internet, ftp, telnet) sending/receiving packets of data through that line?
• Ans: The Operating system must somehow separate out one channel into multiple channels -> Sockets

Sockets
• process sends/receives messages to/from its socket
• socket analogous to door
  – sending process shoves message out door
  – sending process relies on transport infrastructure on other side of door which brings message to socket at receiving process
[Figure: on each host or server, a process (controlled by app developer) talks through a socket to TCP with buffers, variables (controlled by OS); the two hosts are connected through the Internet]

Sockets.. Addressing
• Identifier includes: IP address of host and port number of the process on host
• socket: a host-local, application-created, OS-controlled interface (a “door”) into which application process can both send and receive messages to/from another application process
• Default port numbers for common apps: HTTP server: 80; FTP: 20, 21; SMTP: 25; Telnet: 23

Socket programming
Goal: how to build client/server applications that communicate using sockets
Socket API
- introduced in UNIX, 1981
- explicitly created, used, released by apps
- client/server paradigm
- two types of transport service via socket API:
  – unreliable datagram
  – reliable, byte stream-oriented

Socket-programming using TCP
Socket: a door between application process and end-end-transport protocol (UDP or TCP)
TCP service: reliable transfer of bytes from one process to another
[Figure: on each host or server, the process and its socket are controlled by the application developer, and TCP with buffers, variables is controlled by the operating system; the hosts are connected through the internet]
Socket programming with TCP
Client must contact server
• server process must first be running
• server must have created socket (door) that ‘listens’ for client’s contact
Client contacts server by:
• creating client-local TCP socket
• specifying IP address, port number of server process
• When client creates socket: client TCP establishes connection to server TCP
• When contacted by client, server TCP creates new socket for server process to communicate with client
  – allows server to talk with multiple clients
  – source port numbers used to distinguish clients
Application viewpoint: TCP provides reliable, in-order transfer of bytes (“pipe”) between client and server

Stream terminology
• On a host, data can come to a port at any time
• The receiving process only listens at the port intermittently (why?)
• What happens to data if this process is not yet listening?
Streams
• A stream is a sequence of characters that flow into or out of a process.
• An input stream is attached to some input source for the process, e.g. keyboard or socket.
• An output stream is attached to an output destination, e.g. monitor or socket.
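The contact sequence described above (the server creates a listening socket, the client connects, and the server gets a new per-client socket) can be sketched with Python's standard `socket` module. This is a minimal sketch, not the course's code: the loopback address, the OS-chosen port, the helper names `serve_once`, `request` and `demo`, and the upper-casing service are all assumptions chosen to keep the example self-contained.

```python
import socket
import threading


def serve_once(welcome_socket: socket.socket) -> None:
    """Accept one client, upper-case one line, reply, and close."""
    # accept() returns a NEW socket dedicated to this client.
    connection_socket, _client_addr = welcome_socket.accept()
    with connection_socket:
        # makefile() gives a buffered input stream on top of the socket.
        with connection_socket.makefile("rb") as in_from_client:
            line = in_from_client.readline()
        connection_socket.sendall(line.upper())


def request(host: str, port: int, text: str) -> str:
    """Connect to the server, send one line, return the reply line."""
    with socket.create_connection((host, port)) as client_socket:
        client_socket.sendall(text.encode("utf-8") + b"\n")
        with client_socket.makefile("rb") as in_from_server:
            reply = in_from_server.readline()
    return reply.decode("utf-8").rstrip("\n")


def demo(text: str = "hello") -> str:
    """Run server and client in one process over the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as welcome_socket:
        # Port 0 asks the OS for any free port (an assumption for the demo).
        welcome_socket.bind(("127.0.0.1", 0))
        welcome_socket.listen()
        port = welcome_socket.getsockname()[1]
        server = threading.Thread(target=serve_once, args=(welcome_socket,))
        server.start()
        reply = request("127.0.0.1", port, text)
        server.join()
    return reply


if __name__ == "__main__":
    print(demo("hello world"))
```

Because `listen()` is called before the client connects, the operating system queues the connection request until the server thread reaches `accept()`. A server that handles many clients would call `accept()` in a loop and hand each returned socket to its own thread or handler.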
Socket programming with TCP
Example client-server app:
1) client reads line from standard input (inFromUser stream), sends to server via socket (outToServer stream)
2) server reads line from socket
3) server converts line to uppercase, sends back to client
4) client reads, prints modified line from socket (inFromServer stream)
[Figure: the client process has an input stream inFromUser from the keyboard, an output stream outToServer into the client TCP socket (clientSocket) to the network, and an input stream inFromServer from the network; results are printed on the monitor]

Client/server socket interaction: TCP
Server (running on hostid):
  1. create socket, port=x, for incoming request: welcomeSocket = ServerSocket()
  2. wait for incoming connection request: connectionSocket = welcomeSocket.accept()
  3. read request from connectionSocket
  4. write reply to connectionSocket
  5. close connectionSocket
Client:
  1. create socket, connect to hostid, port=x: clientSocket = Socket() (TCP connection setup with the waiting server)
  2. send request using clientSocket
  3. read reply from clientSocket
  4. close clientSocket
Note: For VB 6.0 on Windows, similar commands are in the Winsock control

References and Further Reading
Books: Jim Kurose, Keith Ross, Computer Networking: A Top Down Approach Featuring the Internet, 3rd ed., Addison-Wesley, July 2004.
Web sources:
1. Domain Name Systems: Wikipedia DNS
2. Registering your own Domain Name: ICANN, InterNIC, …

Next: Search engines, Google case study