INFO 320 Server Technology I
Week 2: Server architectures
INFO 320 week 2 1 www.ischool.drexel.edu

OS and Server Architecture
• Last week we outlined the basic functions of an operating system
• Since an OS exists to serve as a connection between apps and the hardware, what kind of hardware is available and how it’s used are critical things to consider
• …So what is a server architecture?

Server Architecture
• Key issues in server architecture include
  – What is the extent of centralization or distribution of functions?
    • There are many possible answers, not just A or B
  – The main functions are managing data, performing processing (e.g. running apps), and determining how to display the results to the user
    • i.e. who does what, and where, in your system?

Server Architecture
• Other issues to keep in mind could include
  – Reliability
  – Availability
  – Security
  – Performance

Server Architecture
• So defining the server architecture is a key step in the larger process of designing a network
• Once the architecture is set, we can then work on details such as
  – How many of each server are needed?
  – How big are they (CPUs, RAM, storage)?
  – What kind of links are needed among them?

Centralization
• Centralizing all or some aspects of a system can be good
  – Take advantage of economies of scale
  – Easier to staff support people
  – Easier to control procurement
  – Easier to enforce programming and data structure standards
  – Easier to manage security

Centralization
• We can centralize computers, as was done with mainframes
• Centralization doesn’t necessarily apply to the entire system though
• We could centralize processing
  – Data processing, payroll, apps unique to a given department (CAD) might be centralized
• We could centralize data
  – Big database server(s)

Distributed data processing
• Distributed data processing (DDP) is a possible step away from centralization
  – Servers are distributed throughout the organization in order to meet operational, economic, and/or geographic needs
  – Could still have a larger central facility with satellite facilities, or all peer facilities

Distributed data processing
• DDP advantages include
  – Responsiveness to local needs
  – Higher availability, with more redundancy to minimize the impact of a single system failure
  – Expensive hardware can still be shared
  – Incremental growth is easier
    • Avoids all-or-nothing upgrades
  – More user involvement, control, productivity

Distributed data processing
• DDP operating systems need
  – Good networking capability to exchange data
  – Ability to cluster machines for high availability and high performance
  – To manage processes across the distributed environment

Distributed processing overview
• We’ll look at critical technologies in distributed processing
  – Client/server computing
  – Distributed message passing
  – Remote procedure calls
  – Clusters

Client/server computing
• In a client/server environment, a client requests information from the servers
  – An API (Application Programming Interface), drivers, or other forms of middleware allow communication between them
• Clients present the information in a user-friendly GUI format

Client/server computing
• Servers exist to provide shared services to clients
  – What kind of servers could we see?
• Also keep in mind the network connecting the clients and servers
  – Is it a LAN, WAN, the Internet, or ???
  – We need to be aware of the amount of traffic we expect the network to bear

Client/server characteristics
• A client/server architecture differs from other distributed processing in many ways
  – Strong emphasis on user-friendly apps running on the user’s own system
  – Database, network management, and utility functions are often centralized to control overhead and support costs
  – Open and modular systems are increasingly common, so products from various vendors can be mixed
  – Networking is critical, so a lot of attention goes to network management and security issues
• Client/server apps communicate directly, depending on the network protocols (TCP/IP) to make that possible
  – This works even though the client and server often have different platforms and OS’s
  – Client/server apps look like Internet apps!
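
The point that client/server apps communicate directly over TCP/IP can be illustrated with a minimal sketch. It assumes Python's standard `socket` and `threading` modules; the function names (`serve_once`, `request_reply`, `demo`) and the uppercase "service" are hypothetical choices made only for illustration, not part of the course material.

```python
import socket
import threading


def serve_once(server_sock):
    """Server side: accept one client, read its request, send back a reply."""
    conn, _addr = server_sock.accept()
    with conn:
        request = conn.recv(1024)
        # The "shared service" here is trivial: return the text in uppercase.
        conn.sendall(request.upper())


def request_reply(host, port, message):
    """Client side: connect over TCP, send a request, return the server's reply."""
    with socket.create_connection((host, port)) as client:
        client.sendall(message)
        return client.recv(1024)


def demo():
    """Run the server in a background thread and make one client request."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    worker = threading.Thread(target=serve_once, args=(server_sock,))
    worker.start()
    reply = request_reply("127.0.0.1", port, b"hello server")
    worker.join()
    server_sock.close()
    return reply


if __name__ == "__main__":
    print(demo())
```

Both ends run on one machine here, but nothing in the client depends on that: only the host and port would change if the server lived elsewhere on the network, possibly on a different platform and OS.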

Client/server characteristics
[Figures] Images from (Stallings, 2009)

Client/server database
• A common client/server app is to use a database server
• The DBMS resides on the server, and is called by the application logic
• Part of the app design challenge is to make sure the network isn’t overwhelmed by the data transfer expectations

Client/server database
[Figure]

Client/server database
• The first example is a good use of client/server, since
  – The server has the job of sorting through one million records, at which a desktop system might cringe
  – The network doesn’t have to support moving the entire database across itself

Client/server classes
• Four classes of client/server (C/S) apps
  – Host-based processing, much like a mainframe and dumb terminal, is not really C/S
  – Server-based processing, the most server-heavy class of C/S processing
  – Cooperative processing, where the application processing is split between client and server to take advantage of the strengths of each
  – Client-based processing, the most client-heavy class, where nearly all application processing runs on the client

Client/server classes
[Figure]
• (b) is a “thin” client app
• (c) and (d) are “fat” client apps

Three-tier client/server architecture
• In three-tier C/S, we now have a client, a middle tier server, and a backend server
  – The client is typically a thin client
  – The middle tier is often an application server
    • It acts as a server to the client, and as a client to the backend server
  – The backend server is often one or more database servers
    • The app server chooses which one is needed

File consistency
• Clients and servers often cache files which are frequently used
• When a file or database record is being changed, cached copies can become inconsistent with the current version
• This is often addressed by locking files or records, so the level at which data is locked can be a key performance issue

What is middleware?

Middleware
• Development of C/S apps has outpaced anyone’s ability to make standardized application support tools
• APIs and other programming interfaces help address this, and are generically known as middleware
  – ‘Common definitions are that middleware is the "glue" between software components or between software and the network or it is the slash in Client/Server.’ (from “What is Middleware?”, see References)

Middleware
[Figure]

Middleware
• Middleware describes software that connects two or more software applications so they can exchange data
• There are many types of middleware, hence the confusion
  – Message Oriented Middleware, Object Middleware, RPC Middleware, Database Middleware, Transaction Middleware, Portals

Distributed message passing
• Within one computer, processes can coordinate through shared memory, e.g. via semaphores
• In distributed systems, processes are on different systems that don’t share memory, so that isn’t possible and messages are passed instead
  – One issue is message reliability (did it get there?)
  – Can processing continue before getting a response? (if so, it is called nonblocking or asynchronous)

Distributed message passing
[Figure, slide 29]

Distributed message passing
[Figure, slide 30]

Remote procedure calls
• Remote procedure calls (RPCs) allow distributed systems to communicate as though they were on the same machine
  – A remote interface can have named operations with specific types
    • Allows clearly defined documentation and static error checking
  – Helps generate code automatically, and port code to different platforms and OS’s

Remote procedure calls
[Figure] This expands on image (b) on slide 29.
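
To make the idea of calling a remote operation "as though it were on the same machine" concrete, here is a minimal sketch. It assumes XML-RPC from Python's standard library (`xmlrpc.server` and `xmlrpc.client`) as one possible RPC mechanism; the slides do not name a specific one. The operation `add` and the helper `demo_rpc` are hypothetical names used only for illustration.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """The named remote operation; it executes in the server, not the caller."""
    return a + b


def demo_rpc():
    """Start a local RPC server, call add() through a client proxy, then stop."""
    # Port 0 asks the OS for any free port; logRequests=False keeps output quiet.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    port = server.server_address[1]

    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    try:
        with ServerProxy(f"http://127.0.0.1:{port}") as proxy:
            # Looks like a local call; the arguments are marshalled, sent to
            # the server, and the caller blocks until the result comes back.
            return proxy.add(2, 3)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo_rpc())
```

The call `proxy.add(2, 3)` reads like an ordinary function call, which is the appeal of RPC. In this sketch the arguments travel by value and the call is synchronous.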

Remote procedure calls
• Issues with using RPC include
  – Passing parameters by value or by reference (pointer)
  – Representation of parameters (int, float, $, …)
  – Client/server binding
    • Nonpersistent (always make a new connection)
    • Persistent (keep the same binding until it expires)
  – Asynchronous (the caller continues while the call runs) or synchronous (the caller blocks until the call is done)
  – Object-oriented RPC (see OLE or CORBA)

SMP
• In order to get lots of computational power, symmetric multiprocessing (SMP) was the first option
  – SMP has multiple processors
  – They share main memory (RAM) and I/O
  – They are connected by a bus
  – All processors are the same type (hence the ‘symmetric’ part)

Clustering
• As the need for more computational power grew, clustering was developed
  – What kind of problems need massive CPU power?
• A cluster is a group of interconnected standalone computers working together as one
  – Each computer in a cluster is a node

Clustering
• Clustering has several benefits
  – Absolute scalability: can keep adding more systems to get as much power as you can afford
  – Incremental scalability: you can add a little more power as well, avoiding complex upgrade paths
  – High availability: lots of separate computers means that if one fails it’s not a big deal
  – Superior price/performance, since cheap computers can be clustered
• Clusters can be classified based on whether they share hard disks (among other ways)
  – In the first approach, each server has separate disks, and they communicate via a high-speed link
  – In the second approach, they share a RAID array

Clustering
[Figure, slide 38]

Clustering
• A better approach for cluster classification is by functionality
  – Passive standby
  – Active secondary
  – Separate servers
  – Servers connected to disks
  – Servers share disks

Clustering
• Passive standby
  – A second server takes over if the primary fails
  – Easy to implement
  – Wastes the second server since it’s mostly unused
  – Doesn’t improve performance over a single server
  – Often not considered a true cluster

Clustering
• Active secondary
  – The second server is also used for processing tasks
  – Cheaper since the second server is now used
  – Increased complexity

Clustering
• Separate servers (is (a) on slide 38)
  – Servers have separate disks
  – Data is copied from the primary to the second server
  – Gives high availability and high performance
  – High network and server overhead due to copying

Clustering
• Servers connected to disks
  – Also called the shared-nothing approach
  – Servers are connected to a set of disks, but each server has its own disks in that set
  – Reduces the need for copying among servers
  – Often needs mirroring or RAID in case of disk failure
  – Windows Cluster Server is an example

Clustering
• Servers share disks (is (b) on slide 38)
  – Multiple servers share a set of disks
  – Low network and server overhead
  – Reduced risk of downtime from disk failure
  – Requires lock manager software, plus mirroring and/or RAID

Clustering and the OS
• Clustering produces interesting OS problems
  – Failure management
    • Either a high availability approach or a fault tolerant approach can be used
    • The latter is better at handling partial transactions if a system fails
    • Failover is the function of handing off an app and its data to another system when there’s a failure; failback restores them to the original system once it has been fixed
  – Load balancing
    • How do you balance how much work each system is performing?
    • A load-balancing facility must handle this and schedule tasks accordingly
  – Parallelized computation
    • How is the application run on multiple systems?
      – Could have a parallelizing compiler
      – A parallelized application is written to run on a cluster
      – Parametric computing tools can be used for simulations that require a lot of similar runs with different conditions

Clustering architecture
• A cluster presents itself to the user as a single system, the single-system image
  – This is possible thanks to the clustering middleware
  – The middleware also may perform load balancing and respond to system failures

Clustering architecture
[Figure]

Clustering architecture
• The single-system image ensures that there is a
  – Single entry point
    • The user logs into the cluster, not a machine
  – Single file hierarchy
    • The user sees files in a single file structure
  – Single control point
    • There is a default node used to manage the cluster
  – Single virtual networking
    • Any node can access the rest of the cluster
  – Single memory space
    • Distributed shared memory allows programs to share variables
  – Single job management system
    • A user can submit a job to run without specifying where it runs (which node)
  – Single user interface
    • The same GUI supports users regardless of where they log into the cluster

Clustering architecture
• To improve availability, the OS allows
  – Single I/O space
    • Any node can access any I/O peripheral or disk device no matter where it is
  – Single process space
    • A uniform process identification scheme is used
  – Checkpointing
    • Saves process state and data in case of failure
  – Process migration
    • Enables load balancing

SMP versus clustering
• SMP is a more mature technology, and is easier to manage and configure than a cluster
  – SMP takes less space and power
• Clusters win when scalability, either absolute or incremental, is critical
  – Availability for clusters is also higher

Clustering examples
• Windows Cluster Server is a shared-nothing approach
• Sun Cluster is an object-oriented approach using CORBA
  – The object framework handles calls to other nodes
  – A virtual node (vnode) file system is used

Beowulf
• Beowulf (no, not the epic poem) is one of the oldest clustering approaches, started in 1994 using clustered PCs
  – Most Beowulf clusters use Linux systems, connected by Ethernet (LAN) or via TCP/IP
• Each node runs an autonomous Linux kernel, yet participates in global namespaces

Beowulf
• Key pieces of Beowulf software are
  – BPROC, the distributed process space package, which allows a process to span multiple nodes and can allow a new process to be created on other nodes
  – Ethernet Channel Bonding, which joins multiple local networks into one high-speed network and does load balancing
  – Pvmsync, a programming environment which helps perform synchronization and shares data objects among processes
  – EnFuzion, a set of tools for parametric computing, i.e. creating a lot of jobs with different input parameters or initial conditions

References
• Stallings, William. Operating Systems: Internals and Design Principles, 6th Ed. Pearson/Prentice Hall, 2009. ISBN 0136006329
• What is Middleware? http://www.middleware.org/whatis.html