II B.Sc. ECS – COMPUTER ARCHITECTURE AND ORGANIZATION

UNIT I: MODERN COMPUTER ORGANIZATION
Introduction – Layers in a modern computer – Computer organization – Main memory – CPU operation – Computer types – System performance and measurement – High-performance techniques – Booting sequence – Computer design process – Computer structure – Computer function – Architecture and organization – CISC vs. RISC

UNIT II: PROCESSOR DESIGN AND DATA PATH
Introduction – Processor role – Processor design goals – Processor design process – Data path organization – Main memory interface – Local storage register file – Data path for simple instructions

UNIT III: MEMORY DESIGN AND MANAGEMENT
Introduction – Memory parameters – Classification of memory – Memory technology – Main memory allocation – Static RAM IC – Dynamic RAM – ROM logic – Multiple memory decoding – Memory hierarchy – Cache memory – Principle of cache – Virtual memory concept – Advantages of virtual memory

UNIT IV: COMPUTER PERIPHERALS
Introduction – Keyboard – CRT display monitor – Printer – Magnetic storage devices – Floppy disk drive – Hard disk drive – Special types of disk drives – Mouse and trackball – Modem – CD-ROM drive – Scanner – Digital camera – DVD

UNIT V: ADVANCED SYSTEM ARCHITECTURE
Introduction – High-performance computer architecture – RISC systems – Superscalar architecture – VLIW architecture – EPIC architecture – Multiprocessor systems

TEXT BOOK
1. Govindarajalu, B., "Computer Architecture and Organization: Design Principles and Applications", Tata McGraw-Hill, 2006.

PART-A UNIT-1 MODERN COMPUTER ORGANIZATION
1. Arithmetic and logical operations are performed in the CPU.
2. The abbreviation RISC stands for reduced instruction set computing.
3. Registers are temporary storage areas used during data manipulation.
4. Computers use binary-based codes to represent information.
5. The number of clock cycles per second, measured in MHz, is the clock speed.
6. The memory built into the processor is cache memory.
7. The abbreviation PGA stands for pin grid array.
8. ASCII is the standard code used for text characters.
9. The abbreviation ASCII stands for American Standard Code for Information Interchange.
10. The abbreviation DRAM stands for dynamic random access memory.
11. A microprocessor is an IC that contains a CPU on a single chip.
12. The step-by-step start-up process of a computer is called booting.
13. AT and ATX stand for Advanced Technology and Advanced Technology Extended.

PART-A UNIT-2 PROCESSOR DESIGN AND DATA PATH
1. An IC is an electronic device that contains resistors, capacitors and transistors.
2. A processor is divided into three sections: the control unit (CU), the arithmetic and logic unit (ALU) and the registers.
3. The system crystal (clock crystal) determines the speed of the CPU.
4. The three key factors that determine CPU performance are the clock speed, the address bus width and the data bus width.
5. The Pentium MMX is designed for multimedia applications.
6. The standard Pentium III processor runs at speeds of up to 500 MHz.
7. The Pentium processor uses superscalar execution and RISC-like techniques.
8. The Pentium III processor uses streamlined instruction handling and advanced cache technology.
9. The PowerPC name stands for Performance Optimization With Enhanced RISC – Performance Computing.
10. LIF and ZIF stand for low insertion force and zero insertion force.
11. When handling a CPU, guard against electrostatic discharge and potential pin damage.
12. EMI and RFI stand for electromagnetic interference and radio frequency interference.

PART-A UNIT-3 MEMORY DESIGN AND MANAGEMENT
1. The two types of memory are primary and secondary memory.
2. The abbreviation BIOS stands for basic input/output system.
3. Another name for the BIOS is firmware.
4. PCI stands for Peripheral Component Interconnect.
5. POST stands for power-on self-test.
6. ROM is a non-volatile memory.
7. DRAM uses microscopic transistors and capacitors to store each data bit.
8. SIPP stands for single in-line pin package.
9. SIMM stands for single in-line memory module.
10. SRAM uses flip-flops to store data bits.
11. The original on-board cache is known as the internal cache.
12. The UMA (upper memory area) ranges from 640 KB to 1024 KB.
13. The function of shadow RAM is to hold a copy of the ROM BIOS contents so that they can be accessed faster.
14. The MEM command reports the amount and type of memory available.

PART-A UNIT-4 COMPUTER PERIPHERALS
1. The mouse is a pointing device used with a graphical user interface.
2. The term modem stands for modulator–demodulator.
3. The round, ball-shaped pointing device that is rolled with the fingers is the trackball.
4. The types of keyboard are membrane, capacitive and mechanical keyboards.
5. The device used to bring pictorial information into the computer is the scanner.
6. The capacity of a hard disk is given by its CHS (cylinder–head–sector) values.
7. SCSI stands for Small Computer System Interface.
8. A hard disk sector holds 512 bytes.
9. The 3.5-inch floppy is the industry standard.
10. FDD parameters are stored in CMOS.
11. The term LPT stands for line print terminal.
12. Horizontal-orientation printing is known as landscape.
13. Printer resolution is measured in dots per inch (dpi).
14. CRT stands for cathode ray tube; the CRT monitor works as an output device.
15. A scanner converts photographic information into digital information.
16. Input devices include the mouse, keyboard, joystick, microphone, scanner and CD-ROM drive.
17. Output devices include the printer, monitor, plotter and speakers.
18. Devices that act as both input and output devices include the HDD, FDD, modem and tape drive.

PART-A UNIT-5 ADVANCED SYSTEM ARCHITECTURE
1. The abbreviation RISC stands for reduced instruction set computing.
2. A system that uses more than one processor is called a multiprocessor.
3. The term CISC stands for complex instruction set computing.
4. Superscalar technology in the Pentium uses two instruction pipelines, called U and V.
5. The abbreviation FRC stands for functional redundancy check.
6. The abbreviation DCL stands for data conversion logic.
7. In an 8088-based PC, the RAM logic contains 4 banks of 9 chips.
8. The LOCK signal is used to prevent other bus masters from taking over the system bus.
9. The term DVD refers to digital versatile disc.
10. Software stored permanently in hardware (such as the BIOS in ROM) is called firmware.

PART-B UNIT-1 MODERN COMPUTER ORGANIZATION

1. Define computer.
A computer is a machine that can be programmed to manipulate symbols. Its principal characteristics are: it responds to a specific set of instructions in a well-defined manner; it can execute a prerecorded list of instructions (a program); and it can quickly store and retrieve large amounts of data. Therefore computers can perform complex and repetitive procedures quickly, precisely and reliably. Modern computers are electronic and digital. The actual machinery (wires, transistors and circuits) is called hardware; the instructions and data are called software.
All general-purpose computers require the following hardware components:
Central processing unit (CPU): the heart of the computer, this is the component that actually executes instructions organized in programs ("software") which tell the computer what to do.
Memory (fast, expensive, short-term memory): enables a computer to store, at least temporarily, data, programs and intermediate results.
Mass storage device (slower, cheaper, long-term memory): allows a computer to permanently retain large amounts of data and programs between jobs. Common mass storage devices include disk drives and tape drives.
Input device: usually a keyboard and mouse, the input device is the conduit through which data and instructions enter a computer.
Output device: a display screen, printer, or other device that lets you see what the computer has accomplished.
In addition to these components, many others make it possible for the basic components to work together efficiently. For example, every computer requires a bus that transmits data from one part of the computer to another.

2. Write about the notebook computer.
A notebook computer is an extremely lightweight personal computer. Notebook computers typically weigh less than 6 pounds and are small enough to fit easily in a briefcase. Aside from size, the principal difference between a notebook computer and a personal computer is the display screen. Notebook computers use a variety of techniques, known as flat-panel technologies, to produce a lightweight and non-bulky display screen. The quality of notebook display screens varies considerably.
In terms of computing power, modern notebook computers are nearly equivalent to personal computers. They have the same CPUs, memory capacity and disk drives. However, all this power in a small package is expensive: notebook computers cost about twice as much as equivalent regular-sized computers. Notebook computers come with battery packs that enable you to run them without plugging them in, but the batteries need to be recharged every few hours.

3. Define the workstation computer.
A workstation is a type of computer used for engineering applications (CAD/CAM), desktop publishing, software development, and other applications that require a moderate amount of computing power and relatively high-quality graphics capabilities. Workstations generally come with a large, high-resolution graphics screen, a large amount of RAM, built-in network support and a graphical user interface. Most workstations also have a mass storage device such as a disk drive, but a special type of workstation, called a diskless workstation, comes without a disk drive. The most common operating systems for workstations are UNIX and Windows NT.
Like personal computers, most workstations are single-user computers. However, workstations are typically linked together to form a local-area network, although they can also be used as stand-alone systems.

4. Write short notes on the desktop computer.
N.B.: In networking, "workstation" refers to any computer connected to a local-area network; it could be a workstation or a personal computer.
A desktop model is a computer designed to fit comfortably on top of a desk, typically with the monitor sitting on top of the system unit. Desktop model computers are broad and low, whereas tower model computers are narrow and tall. Because of their shape, desktop model computers are generally limited to three internal mass storage devices. Desktop models designed to be very small are sometimes referred to as slimline models.

5. Write about the palmtop computer.
A palmtop is a small computer that literally fits in your palm. Compared to full-size computers, palmtops are severely limited, but they are practical for certain functions such as phone books and calendars. Palmtops that use a pen rather than a keyboard for input are often called hand-held computers or PDAs. Because of their small size, most palmtop computers do not include disk drives.
However, many contain PCMCIA slots in which you can insert disk drives, modems, memory and other devices. Palmtops are also called PDAs, hand-held computers and pocket computers.

6. What are the requirements of a computer?
All general-purpose computers require the following hardware components:
Central processing unit (CPU): the heart of the computer, this is the component that actually executes instructions organized in programs ("software") which tell the computer what to do.
Memory (fast, expensive, short-term memory): enables a computer to store, at least temporarily, data, programs and intermediate results.
Mass storage device (slower, cheaper, long-term memory): allows a computer to permanently retain large amounts of data and programs between jobs. Common mass storage devices include disk drives and tape drives.
Input device: usually a keyboard and mouse, the input device is the conduit through which data and instructions enter a computer.
Output device: a display screen, printer, or other device that lets you see what the computer has accomplished.

PART-C UNIT-1 MODERN COMPUTER ORGANIZATION

1. Write about the supercomputer and the mainframe.
Supercomputer is a broad term for one of the fastest computers currently available. Supercomputers are very expensive and are employed for specialized applications that require immense amounts of mathematical calculation (number crunching). For example, weather forecasting requires a supercomputer. Other uses of supercomputers include scientific simulations, (animated) graphics, fluid dynamics calculations, nuclear energy research, electronic design, and analysis of geological data (e.g. in petrochemical prospecting). Perhaps the best-known supercomputer manufacturer is Cray Research.
Mainframe was a term originally referring to the cabinet containing the central processing unit, or "main frame", of a room-filling Stone Age batch machine. After the emergence of smaller "minicomputer" designs in the early 1970s, the traditional big-iron machines were described as "mainframe computers" and eventually just as mainframes. Nowadays a mainframe is a very large and expensive computer capable of supporting hundreds, or even thousands, of users simultaneously.
The chief difference between a supercomputer and a mainframe is that a supercomputer channels all its power into executing a few programs as fast as possible, whereas a mainframe uses its power to execute many programs concurrently. In some ways, mainframes are more powerful than supercomputers because they support more simultaneous programs, but supercomputers can execute a single program faster than a mainframe. The distinction between small mainframes and minicomputers is vague, depending really on how the manufacturer wants to market its machines.

2. Write about the CPU in detail.
Central processing unit
[Figure: die of an Intel 80486DX2 microprocessor (actual size 12 × 6.75 mm) in its packaging.]
A central processing unit (CPU), or processor, is an electronic circuit that can execute computer programs. The term has been in use in the computer industry at least since the early 1960s (Weik 1961). The form, design and implementation of CPUs have changed dramatically since the earliest examples, but their fundamental operation has remained much the same. Early CPUs were custom-designed as part of a larger, sometimes one-of-a-kind, computer.
However, this costly method of designing custom CPUs for a particular application has largely given way to the development of mass-produced processors that are made for one or many purposes. This standardization trend generally began in the era of discrete transistor mainframes and minicomputers and has rapidly accelerated with the popularization of the integrated circuit (IC). The IC has allowed increasingly complex CPUs to be designed and manufactured to tolerances on the order of nanometers. Both the miniaturization and the standardization of CPUs have increased the presence of these digital devices in modern life far beyond the limited application of dedicated computing machines. Modern microprocessors appear in everything from automobiles to cell phones to children's toys.

[Figure: EDVAC, one of the first electronic stored-program computers.]
Prior to the advent of machines that resemble today's CPUs, computers such as the ENIAC had to be physically rewired in order to perform different tasks. These machines are often referred to as "fixed-program computers", since they had to be physically reconfigured in order to run a different program. Since the term "CPU" is generally defined as a software (computer program) execution device, the earliest devices that could rightly be called CPUs came with the advent of the stored-program computer.
The idea of a stored-program computer was already present in the design of J. Presper Eckert and John William Mauchly's ENIAC, but was initially omitted so the machine could be finished sooner. On June 30, 1945, before ENIAC was even completed, mathematician John von Neumann distributed the paper entitled "First Draft of a Report on the EDVAC." It outlined the design of a stored-program computer that would eventually be completed in August 1949 (von Neumann 1945). EDVAC was designed to perform a certain number of instructions (or operations) of various types. These instructions could be combined to create useful programs for the EDVAC to run. Significantly, the programs written for EDVAC were stored in high-speed computer memory rather than specified by the physical wiring of the computer. This overcame a severe limitation of ENIAC, which was the large amount of time and effort it took to reconfigure the computer to perform a new task. With von Neumann's design, the program, or software, that EDVAC ran could be changed simply by changing the contents of the computer's memory.[1]
While von Neumann is most often credited with the design of the stored-program computer because of his design of EDVAC, others before him, such as Konrad Zuse, had suggested and implemented similar ideas. Additionally, the so-called Harvard architecture of the Harvard Mark I, which was completed before EDVAC, also utilized a stored-program design using punched paper tape rather than electronic memory. The key difference between the von Neumann and Harvard architectures is that the latter separates the storage and treatment of CPU instructions and data, while the former uses the same memory space for both. Most modern CPUs are primarily von Neumann in design, but elements of the Harvard architecture are commonly seen as well.
Being digital devices, all CPUs deal with discrete states and therefore require some kind of switching elements to differentiate between and change these states. Prior to commercial acceptance of the transistor, electrical relays and vacuum tubes (thermionic valves) were commonly used as switching elements.
Although these had distinct speed advantages over earlier, purely mechanical designs, they were unreliable for various reasons. For example, building direct-current sequential logic circuits out of relays requires additional hardware to cope with the problem of contact bounce. While vacuum tubes do not suffer from contact bounce, they must heat up before becoming fully operational and they eventually stop functioning altogether.[2] Usually, when a tube failed, the CPU would have to be diagnosed to locate the failing component so it could be replaced. Therefore, early electronic (vacuum-tube-based) computers were generally faster but less reliable than electromechanical (relay-based) computers.

3. Write about the clock rate of the CPU.
Clock rate
Most CPUs, and indeed most sequential logic devices, are synchronous in nature.[8] That is, they are designed and operate on assumptions about a synchronization signal. This signal, known as a clock signal, usually takes the form of a periodic square wave. By calculating the maximum time that electrical signals need to propagate through the various branches of a CPU's many circuits, the designers can select an appropriate period for the clock signal. This period must be longer than the amount of time it takes for a signal to move, or propagate, in the worst case. In setting the clock period to a value well above the worst-case propagation delay, it is possible to design the entire CPU and the way it moves data around the "edges" of the rising and falling clock signal. This has the advantage of simplifying the CPU significantly, both from a design perspective and a component-count perspective. However, it also carries the disadvantage that the entire CPU must wait on its slowest elements, even though some portions of it are much faster. This limitation has largely been compensated for by various methods of increasing CPU parallelism (see below).
However, architectural improvements alone do not solve all of the drawbacks of globally synchronous CPUs. For example, a clock signal is subject to the delays of any other electrical signal. Higher clock rates in increasingly complex CPUs make it more difficult to keep the clock signal in phase (synchronized) throughout the entire unit. This has led many modern CPUs to require multiple identical clock signals to be provided, in order to avoid delaying a single signal significantly enough to cause the CPU to malfunction. Another major issue as clock rates increase dramatically is the amount of heat that is dissipated by the CPU. The constantly changing clock causes many components to switch regardless of whether they are being used at that time. In general, a component that is switching uses more energy than an element in a static state. Therefore, as clock rate increases, so does heat dissipation, causing the CPU to require more effective cooling solutions.
One method of dealing with the switching of unneeded components is called clock gating, which involves turning off the clock signal to unneeded components (effectively disabling them). However, this is often regarded as difficult to implement and therefore does not see common usage outside of very low-power designs.[9] Another method of addressing some of the problems with a global clock signal is the removal of the clock signal altogether. While removing the global clock signal makes the design process considerably more complex in many ways, asynchronous (or clockless) designs carry marked advantages in power consumption and heat dissipation in comparison with similar synchronous designs. While somewhat uncommon, entire asynchronous CPUs have been built without utilizing a global clock signal; two notable examples are the ARM-compliant AMULET and the MIPS R3000-compatible MiniMIPS. Rather than totally removing the clock signal, some CPU designs allow certain portions of the device to be asynchronous, such as using asynchronous ALUs in conjunction with superscalar pipelining to achieve some arithmetic performance gains. While it is not altogether clear whether totally asynchronous designs can perform at a comparable or better level than their synchronous counterparts, it is evident that they do at least excel in simpler math operations. This, combined with their excellent power consumption and heat dissipation properties, makes them very suitable for embedded computers (Garside et al. 1999).
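To make the clock-period argument above concrete, here is a minimal sketch (not from the textbook): it picks a clock period from a set of worst-case path delays and derives the corresponding maximum clock frequency. The delay values and the safety margin are invented purely for illustration.

```c
/* Minimal sketch (not from the textbook): choosing a clock period from a
 * worst-case propagation delay, as described in the clock-rate section above.
 * The delay figures below are invented for illustration only. */
#include <stdio.h>

int main(void) {
    /* Hypothetical worst-case propagation delays (in nanoseconds) through
     * three different paths of a CPU's combinational logic. */
    double path_delay_ns[] = {0.55, 0.80, 0.72};
    int n = sizeof path_delay_ns / sizeof path_delay_ns[0];

    /* The clock period must be longer than the slowest (critical) path. */
    double critical_ns = 0.0;
    for (int i = 0; i < n; i++)
        if (path_delay_ns[i] > critical_ns)
            critical_ns = path_delay_ns[i];

    double margin_ns = 0.10;                    /* safety margin */
    double period_ns = critical_ns + margin_ns; /* chosen clock period */
    double f_max_ghz = 1.0 / period_ns;         /* a 1 ns period is 1 GHz */

    printf("critical path: %.2f ns, clock period: %.2f ns, max clock: %.2f GHz\n",
           critical_ns, period_ns, f_max_ghz);
    return 0;
}
```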
4. Write about the design and implementation of the CPU.
Design and implementation: integer range
The way a CPU represents numbers is a design choice that affects the most basic ways in which the device functions. Some early digital computers used an electrical model of the common decimal (base ten) numeral system to represent numbers internally. A few other computers have used more exotic numeral systems such as ternary (base three). Nearly all modern CPUs represent numbers in binary form, with each digit being represented by some two-valued physical quantity such as a "high" or "low" voltage.[6]
[Figure: MOS 6502 microprocessor in a dual in-line package, an extremely popular 8-bit design.]
Related to number representation is the size and precision of the numbers that a CPU can represent. In the case of a binary CPU, a bit refers to one significant place in the numbers a CPU deals with. The number of bits (or numeral places) a CPU uses to represent numbers is often called "word size", "bit width", "data path width", or "integer precision" when dealing with strictly integer numbers (as opposed to floating point). This number differs between architectures, and often within different parts of the very same CPU. For example, an 8-bit CPU deals with a range of numbers that can be represented by eight binary digits (each digit having two possible values), that is, 2^8 = 256 discrete numbers. In effect, integer size sets a hardware limit on the range of integers the software run by the CPU can utilize.[7]
Integer range can also affect the number of locations in memory the CPU can address (locate). For example, if a binary CPU uses 32 bits to represent a memory address, and each memory address represents one octet (8 bits), the maximum quantity of memory that CPU can address is 2^32 octets, or 4 GiB. This is a very simple view of CPU address space, and many designs use more complex addressing methods such as paging in order to locate more memory than their integer range would allow with a flat address space.
Higher levels of integer range require more structures to deal with the additional digits, and therefore more complexity, size, power usage and general expense. It is not at all uncommon, therefore, to see 4- or 8-bit microcontrollers used in modern applications, even though CPUs with much higher range (such as 16, 32, 64, even 128 bits) are available. The simpler microcontrollers are usually cheaper, use less power, and therefore dissipate less heat, all of which can be major design considerations for electronic devices. However, in higher-end applications, the benefits afforded by the extra range (most often the additional address space) are more significant and often affect design choices. To gain some of the advantages afforded by both lower and higher bit lengths, many CPUs are designed with different bit widths for different portions of the device. For example, the IBM System/370 used a CPU that was primarily 32-bit, but it used 128-bit precision inside its floating-point units to facilitate greater accuracy and range in floating-point numbers (Amdahl et al. 1964). Many later CPU designs use a similar mixed bit width, especially when the processor is meant for general-purpose usage, where a reasonable balance of integer and floating-point capability is required.
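The relationship just described between word or address width, representable range and addressable memory can be checked with a short, self-contained sketch. This is only an illustration of the 2^n arithmetic above, not code from the textbook; the widths chosen are the 8-, 16- and 32-bit examples discussed in the text.

```c
/* Minimal illustration (not from the textbook) of how word/address width
 * determines representable range and byte-addressable memory, matching the
 * 8-bit and 32-bit figures discussed above. */
#include <stdio.h>

int main(void) {
    int widths[] = {8, 16, 32};
    for (int i = 0; i < 3; i++) {
        int n = widths[i];
        /* An n-bit word has 2^n distinct bit patterns. */
        unsigned long long patterns = 1ULL << n;
        printf("%2d-bit word: 2^%d = %llu distinct values "
               "(unsigned range 0..%llu)\n",
               n, n, patterns, patterns - 1);
    }
    /* With 32-bit addresses and one octet (byte) per address, the flat
     * address space is 2^32 bytes = 4 GiB, as stated in the text above. */
    unsigned long long bytes = 1ULL << 32;
    printf("32-bit addresses: %llu bytes = %llu GiB\n", bytes, bytes >> 30);
    return 0;
}
```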
PART-B UNIT-2 PROCESSOR DESIGN AND DATA PATH

1. Write about processor design goals.
The first CPUs were designed to do mathematical calculations faster and more reliably than human computers. Each successive generation of CPU might be designed to achieve some of these goals:
- higher performance for a single program or thread
- higher throughput across multiple programs/threads
- less power consumption for the same performance level
- lower cost for the same performance level
- greater connectivity to build larger, more parallel systems
- more specialization to aid specific targeted markets
Re-designing a CPU core to a smaller die area helps achieve several of these goals. Shrinking everything (a "photomask shrink"), resulting in the same number of transistors on a smaller die, improves performance (smaller transistors switch faster), reduces power (smaller wires have less parasitic capacitance) and reduces cost (more CPUs fit on the same wafer of silicon). Releasing a CPU on the same-size die, but with a smaller CPU core, keeps the cost about the same but allows higher levels of integration within one VLSI chip (additional cache, multiple CPUs, or other components), improving performance and reducing overall system cost.

2. Write about the basic architecture of a computer.
Basic architecture of a modern computer/network – abstraction layers:
1. When the machine powers up, this layer tells the central processing unit (CPU) to check memory, etc., and where to go to find out how to "boot up" (see note a).
2. Controls access to almost all reads (sensing the keyboard, disk drive, memory or other inputs) and writes (to memory, printer, screen, speakers) through the CPU, which actually processes the data stream. It also includes the filing system, e.g. where you locate your documents (from papers to music and images), applications, and the like (see note b).
Note a: the very basic interface between hardware and software – where the computer "converses" with all peripheral devices, as well as hard drives, video/sound cards, etc.
Note b: for Windows (up to 2000) and MacOS (up to 9.2), a patched, cobbled-in way of using Internet-standard communications protocols, such as TCP/IP and Ethernet (and its descendants); for Unix and its variants (such as MacOS X and Linux), communications are now embedded in the OS.
3. [One hopes!] A set of knowable "sockets" into which data to and from applications can be fed, and through which a keystroke or other data input is handled by the OS and CPU.
These can be open and publicly known, but are often internal corporate, proprietary (and thus secret) information – a de facto set of "standards".
4. The applications with which you're familiar, for example Netscape, Mulberry, Word, WinAmp, etc. – indeed, the operating environment in which you probably spend most of your time.
5. The applications and systems that allow communication and integration among separate machines (caveat: Unix does this implicitly) for high-level, often Net-based data handling. In theory, these are independent of the specific hardware and software of any PC – they are "cross-platform".
6. Networked processing, with the ability to hand off processing tasks to any CPU able to perform the tasks requested. This is potentially a very rich level, where individual processors are able to negotiate with others, and perhaps develop their own practices of deference to each other.
[Diagram: the architecture of a modern personal computer – basic hardware configuration, showing the input-output buses connecting the CPU to "read-only" devices, "write-only" devices (e.g. a printer via its printer driver) and the video card/"rasterizer".]

3. Write about high-end processor economics.
Developing new, high-end CPUs is a very costly proposition. Both the logical complexity (needing very large logic design and logic verification teams and simulation farms with perhaps thousands of computers) and the high operating frequencies (needing large circuit design teams and access to a state-of-the-art fabrication process) account for the high cost of design for this type of chip. The design cost of a high-end CPU will be on the order of US$100 million. Since the design of such high-end chips nominally takes about five years to complete, to stay competitive a company has to fund at least two of these large design teams to release products at the rate of 2.5 years per product generation.
As an example, the typical loaded cost for one computer engineer is often quoted as US$250,000 per year; this includes salary, benefits, CAD tools, computers, office-space rent, etc. Assuming that 100 engineers are needed to design a CPU and the project takes 4 years, the total cost = $250,000 per engineer-year × 100 engineers × 4 years = $100,000,000. The above amount is just an example; the design teams for modern-day general-purpose CPUs have several hundred team members. Only the personal computer mass market (with production rates in the hundreds of millions, producing billions of dollars in revenue) can support such large design and implementation teams.
As of 2004, only four companies were actively designing and fabricating state-of-the-art general-purpose computing CPU chips: Intel, AMD, IBM and Fujitsu. Motorola has spun off its semiconductor division as Freescale, as that division was dragging down profit margins for the rest of the company. Texas Instruments, TSMC and Toshiba are a few examples of companies that do manufacturing for another company's CPU chip design.

4. Write about general-purpose computing.
The vast majority of revenue generated from CPU sales is for general-purpose computing: that is, desktop, laptop and server computers commonly used in businesses and homes. In this market, the Intel IA-32 architecture dominates, with its rivals PowerPC and SPARC maintaining much smaller customer bases. Yearly, hundreds of millions of IA-32 architecture CPUs are used by this market.
Since these devices are used to run countless different types of programs, these CPU designs are not specifically targeted at one type of application or one function. The demands of being able to run a wide range of programs efficiently have made these CPU designs among the more advanced technically, along with some disadvantages of being relatively costly and having high power consumption.

PART-C UNIT-2 PROCESSOR DESIGN AND DATA PATH

1. What do you mean by system performance analysis?
Because there are too many programs to test a CPU's speed on all of them, benchmarks were developed. The most famous benchmarks are the SPECint and SPECfp benchmarks developed by the Standard Performance Evaluation Corporation and the ConsumerMark benchmark developed by the Embedded Microprocessor Benchmark Consortium (EEMBC). Some important measurements include:
- Instructions per second: most consumers pick a computer architecture (normally the Intel IA-32 architecture) to be able to run a large base of pre-existing, pre-compiled software. Being relatively uninformed about computer benchmarks, some of them pick a particular CPU based on operating frequency (see the megahertz myth).
- FLOPS: the number of floating-point operations per second is often important in selecting computers for scientific computations.
- Performance per watt: system designers building parallel computers, such as Google, pick CPUs based on their speed per watt of power, because the cost of powering the CPU outweighs the cost of the CPU itself.[1][2]
- Performance per dollar: some system designers building parallel computers pick CPUs based on their speed per dollar.
- Deterministic response: system designers building real-time computing systems want to guarantee worst-case response. That is easier to do when the CPU has low interrupt latency and deterministic response (DSP).
- Instruction set: computer programmers who program directly in assembly language want a CPU to support a full-featured instruction set.
- Low power: for systems with limited power sources (e.g. solar, batteries, human power).
- Small size or low weight: for portable embedded systems and systems for spacecraft.
- Environmental impact: minimizing the environmental impact of computers during manufacturing and recycling as well as during use; reducing waste and hazardous materials.

2. Write about CPU design.
CPU design focuses on these areas:
1. datapaths (such as ALUs and pipelines)
2. the control unit: logic which controls the datapaths
3. memory components such as register files and caches
4. clock circuitry such as clock drivers, PLLs and clock distribution networks
5. pad transceiver circuitry
6. the logic-gate cell library used to implement the logic
CPUs designed for high-performance markets might require custom designs for each of these items to achieve frequency, power-dissipation and chip-area goals.
CPUs designed for lower-performance markets might lessen the implementation burden by:
- acquiring some of these items by purchasing them as intellectual property
- using control-logic implementation techniques (logic synthesis using CAD tools) to implement the other components: datapaths, register files, clocks
Common logic styles used in CPU design include:
- unstructured random logic
- finite-state machines
- microprogramming (common from 1965 to 1985, no longer common except for CISC CPUs)
- programmable logic arrays (common in the 1980s, no longer common)
Device types used to implement the logic include:
- transistor-transistor logic small-scale-integration "jelly-bean" logic chips: no longer used for CPUs
- Programmable Array Logic and programmable logic devices: no longer used for CPUs
- emitter-coupled logic (ECL) gate arrays: no longer common
- CMOS gate arrays: no longer used for CPUs
- CMOS ASICs: what is commonly used today; they are so common that the term ASIC is not used for CPUs
- field-programmable gate arrays (FPGAs): common for soft microprocessors, and more or less required for reconfigurable computing
A CPU design project generally has these major tasks:
- defining the programmer-visible instruction set architecture, which can be implemented by a variety of microarchitectures
- architectural study and performance modeling in ANSI C/C++ or SystemC
- high-level synthesis (HLS) or RTL (e.g. logic) implementation
- RTL verification
- circuit design of speed-critical components (caches, registers, ALUs)
- logic synthesis or logic-gate-level design
- timing analysis to confirm that all logic and circuits will run at the specified operating frequency
- physical design, including floorplanning and place-and-route of logic gates
- checking that the RTL, gate-level, transistor-level and physical-level representations are equivalent
- checks for signal integrity and chip manufacturability
As with most complex electronic designs, the logic verification effort (proving that the design does not have bugs) now dominates the project schedule of a CPU.
Key CPU architectural innovations include the index register, cache, virtual memory, instruction pipelining, superscalar execution, CISC, RISC, the virtual machine, emulators, microprogramming and the stack.

3. Write about the role of the processor in the computer system.
The processor plays a significant role in the following important aspects of your computer system:
Performance: the processor is probably the most important single determinant of system performance in the PC. While other components also play a key role in determining performance, the processor's capabilities dictate the maximum performance of a system; the other devices only allow the processor to reach its full potential.
Software support: newer, faster processors enable the use of the latest software. In addition, new processors such as the Pentium with MMX Technology enable the use of specialized software not usable on earlier machines.
Reliability and stability: the quality of the processor is one factor that determines how reliably your system will run. While most processors are very dependable, some are not. This also depends to some extent on the age of the processor and how much energy it consumes.
Energy consumption and cooling: originally, processors consumed relatively little power compared with other system devices. Newer processors can consume a great deal of power. Power consumption has an impact on everything from cooling-method selection to overall system reliability.
Motherboard support: the processor you decide to use in your system will be a major determining factor in what sort of chipset you must use, and hence what motherboard you buy. The motherboard in turn dictates many facets of your system's capabilities and performance.

PART-B UNIT-3 MEMORY DESIGN AND MANAGEMENT

1. Write about the structure of the cache memory.
Structure
A cache row entry usually has the following structure: data blocks, a valid bit and a tag. The data blocks contain the actual data fetched from main memory. The memory address is split (MSB to LSB) into the tag, the index and the displacement (offset), while the valid bit denotes that this particular entry holds valid data. The index length is log2(cache_rows) bits and describes which row the data has been put in. The displacement length is log2(data_blocks) and specifies which of the stored blocks is needed. The tag length is address_length − index_length − displacement_length and contains the most significant bits of the address, which are checked against the tag of the current row (the row having been retrieved by the index) to see whether it is the one we need or another, irrelevant memory location that happened to have the same index bits as the one we want.

2. Write about the associativity of the cache memory.
Associativity determines which memory locations can be cached by which cache locations. The replacement policy decides where in the cache a copy of a particular entry of main memory will go. If the replacement policy is free to choose any entry in the cache to hold the copy, the cache is called fully associative. At the other extreme, if each entry in main memory can go in just one place in the cache, the cache is direct mapped. Many caches implement a compromise in which each entry in main memory can go to any one of N places in the cache; these are described as N-way set associative. For example, the level-1 data cache in an AMD Athlon is 2-way set associative, which means that any particular location in main memory can be cached in either of 2 locations in the level-1 data cache.
Associativity is a trade-off. If there are ten places to which the replacement policy could have mapped a new cache entry, then when the cache is checked for a hit, all ten places must be searched. Checking more places takes more power, chip area and potentially time. On the other hand, caches with more associativity suffer fewer misses (see conflict misses, below), so the CPU spends less time servicing those misses. The rule of thumb is that doubling the associativity, from direct mapped to 2-way, or from 2-way to 4-way, has about the same effect on hit rate as doubling the cache size. Associativity increases beyond 4-way have much less effect on the hit rate and are generally done for other reasons (see virtual aliasing, below). In order of increasing (worse) hit times and decreasing (better) miss rates:
- direct-mapped cache: the best (fastest) hit times, and so the best trade-off for "large" caches
- 2-way set-associative cache
- 2-way skewed-associative cache: "the best tradeoff for .... caches whose sizes are in the range 4K-8K bytes" (André Seznec[3])
- 4-way set-associative cache
- fully associative cache: the best (lowest) miss rates, and so the best trade-off when the miss penalty is very high

3. Write short notes on the pseudo-associative cache.
A true set-associative cache tests all the possible ways simultaneously, using something like a content-addressable memory. A pseudo-associative cache tests each possible way one at a time. A hash-rehash cache is one kind of pseudo-associative cache. In the common case of finding a hit in the first way tested, a pseudo-associative cache is as fast as a direct-mapped cache, but it has a much lower conflict-miss rate than a direct-mapped cache, closer to the miss rate of a fully associative cache.
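The address split and set lookup described in Questions 1 and 2 above can be illustrated with a short sketch. This is only an illustrative model, not code from the textbook: the geometry (64-byte blocks, 128 sets, 2 ways) and the trivial replacement choice are assumptions made for the example.

```c
/* Minimal sketch (not from the textbook) of the tag/index/displacement split
 * and a 2-way set-associative lookup, as described in Questions 1 and 2 above.
 * The cache geometry is invented for illustration only. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 64   /* bytes per block  -> 6 displacement (offset) bits */
#define NUM_SETS   128  /* rows (sets)      -> 7 index bits                 */
#define NUM_WAYS   2    /* 2-way set associative                            */

struct line {
    int      valid;     /* valid bit */
    uint32_t tag;       /* most significant address bits */
    /* the data block itself is omitted in this sketch */
};

static struct line cache[NUM_SETS][NUM_WAYS];

/* Returns 1 on hit, 0 on miss (and then fills one way of the set). */
static int cache_access(uint32_t addr) {
    uint32_t index = (addr / BLOCK_SIZE) % NUM_SETS; /* which row (set)    */
    uint32_t tag   = (addr / BLOCK_SIZE) / NUM_SETS; /* remaining MSB bits */

    for (int w = 0; w < NUM_WAYS; w++)               /* check each way     */
        if (cache[index][w].valid && cache[index][w].tag == tag)
            return 1;                                /* hit                */

    /* Miss: trivially refill way 0 (a real cache would use LRU, etc.). */
    cache[index][0].valid = 1;
    cache[index][0].tag   = tag;
    return 0;
}

int main(void) {
    memset(cache, 0, sizeof cache);
    uint32_t a = 0x12345678;
    printf("first access:  %s\n", cache_access(a) ? "hit" : "miss");
    printf("second access: %s\n", cache_access(a) ? "hit" : "miss");
    printf("same block:    %s\n", cache_access(a + 4) ? "hit" : "miss");
    return 0;
}
```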
4. Write short notes on cache.
A CPU cache is a cache used by the central processing unit of a computer to reduce the average time to access memory. The cache is a smaller, faster memory which stores copies of the data from the most frequently used main memory locations. As long as most memory accesses are to cached memory locations, the average latency of memory accesses will be closer to the cache latency than to the latency of main memory.
When the processor needs to read from or write to a location in main memory, it first checks whether a copy of that data is in the cache. If so, the processor immediately reads from or writes to the cache, which is much faster than reading from or writing to main memory.
[The accompanying diagram, not reproduced here, shows two memories: the cache and main memory.] Each location in each memory has a datum (a cache line), which in different designs ranges in size from 8[1] to 512[2] bytes. The size of the cache line is usually larger than the size of the usual access requested by a CPU instruction, which ranges from 1 to 16 bytes. Each location in each memory also has an index, which is a unique number used to refer to that location; the index for a location in main memory is called an address. Each location in the cache has a tag that contains the index of the datum in main memory that has been cached. In a CPU's data cache these entries are called cache lines or cache blocks.

5. What do you mean by a paging supervisor?
This part of the operating system creates and manages the page tables. If the dynamic address translation hardware raises a page fault exception, the paging supervisor searches the page space on secondary storage for the page containing the required virtual address, reads it into real physical memory, updates the page tables to reflect the new location of the virtual address and finally tells the dynamic address translation mechanism to start the search again. Usually all of the real physical memory is already in use, and the paging supervisor must first save an area of real physical memory to disk and update the page table to say that the associated virtual addresses are no longer in real physical memory but are saved on disk. Paging supervisors generally save and overwrite areas of real physical memory which have been least recently used, because these are probably the areas which are used least often. So every time the dynamic address translation hardware matches a virtual address with a real physical memory address, it must put a time-stamp in the page table entry for that virtual address.
Some systems combine segmentation and paging by dividing each segment into pages. In systems that combine them, such as Multics and the IBM System/38 and IBM System i machines, virtual memory is usually implemented with paging, with segmentation used to provide memory protection.[8][9][10] With the Intel 80386 and later IA-32 processors, the segments reside in a 32-bit linear paged address space, so segments can be moved into and out of that linear address space, and pages in that linear address space can be moved in and out of main memory, providing two levels of virtual memory; however, few if any operating systems do so. Instead, they use only paging.
The difference between virtual memory implementations using pages and those using segments is not only about the memory division into fixed and variable sizes, respectively. In some systems, e.g. Multics, or the later System/38 and Prime machines, the segmentation was actually visible to the user processes, as part of the semantics of the memory model. In other words, instead of a process just having a memory which looked like a single large vector of bytes or words, the memory was more structured. This is different from using pages, which does not change the model visible to the process. This had important consequences.
A segment was not just a "page with a variable length", or a simple way to lengthen the address space (as in the Intel 80286). In Multics, the segmentation was a very powerful mechanism that was used to provide a single-level virtual memory model, in which there was no differentiation between "process memory" and "file system": a process' active address space consisted only of a list of segments (files) which were mapped into its potential address space, both code and data. This is not the same as the later mmap function in Unix, because inter-file pointers do not work when mapping files into semi-arbitrary places. Multics had such an addressing mode built into most instructions; in other words, it could perform relocated inter-segment references, thus eliminating the need for a linker completely. This also worked when different processes mapped the same file into different places in their private address spaces.

6. Write about virtual memory in a computer system.
Virtual memory is a computer system technique which gives an application program the impression that it has contiguous working memory (an address space), while in fact it may be physically fragmented and may even overflow onto disk storage. Systems that use this technique make programming of large applications easier and use real physical memory (e.g. RAM) more efficiently than those without virtual memory. Virtual memory differs significantly from memory virtualization in that virtual memory allows resources to be virtualized as memory for a specific system, as opposed to a large pool of memory being virtualized as smaller pools for many different systems.
Note that "virtual memory" is more than just "using disk space to extend physical memory size"; that is merely the extension of the memory hierarchy to include hard disk drives. Extending memory to disk is a normal consequence of using virtual memory techniques, but it could be done by other means, such as overlays or swapping programs and their data completely out to disk while they are inactive. The definition of "virtual memory" is based on redefining the address space with contiguous virtual memory addresses to "trick" programs into thinking they are using large blocks of contiguous addresses.
All modern general-purpose computer operating systems use virtual memory techniques for ordinary applications such as word processors, spreadsheets, multimedia players, accounting, etc. Older operating systems, such as DOS and Microsoft Windows[1] of the 1980s, or those for the mainframes of the 1960s, generally had no virtual memory functionality, notable exceptions being the Atlas, the B5000 and Apple Computer's Lisa.

PART-C UNIT-3 MEMORY DESIGN AND MANAGEMENT

1. Write in detail about the development of virtual memory.
In the 1940s and 1950s, before the development of virtual memory, all larger programs had to contain logic for managing two-level storage (primary and secondary, today's analogies being RAM and hard disk), for example overlaying techniques. Programs were responsible for moving overlays back and forth from secondary storage to primary storage. The main reason for introducing virtual memory was therefore not simply to extend primary memory, but to make such an extension as easy to use for programmers as possible.[2]
Many systems already had the ability to divide the memory between multiple programs (required for multiprogramming and multiprocessing), provided for example by "base and bounds registers" on early models of the PDP-10, without providing virtual memory. That gave each application a private address space starting at an address of 0, with an address in the private address space being checked against a bounds register to make sure it is within the section of memory allocated to the application and, if it is, having the contents of the corresponding base register added to it to give an address in main memory. This is a simple form of segmentation without virtual memory.
Virtual memory was developed in approximately 1959–1962, at the University of Manchester for the Atlas Computer, completed in 1962.[3] However, Fritz-Rudolf Güntsch, one of Germany's pioneering computer scientists and later the developer of the Telefunken TR 440 mainframe, claims to have invented the concept in 1957 in his doctoral dissertation Logischer Entwurf eines digitalen Rechengerätes mit mehreren asynchron laufenden Trommeln und automatischem Schnellspeicherbetrieb (Logic Concept of a Digital Computing Device with Multiple Asynchronous Drum Storage and Automatic Fast Memory Mode). In 1961, Burroughs released the B5000, the first commercial computer with virtual memory.[4][5] It used segmentation rather than paging.
Like many technologies in the history of computing, virtual memory was not accepted without challenge. Before it could be implemented in mainstream operating systems, many models, experiments and theories had to be developed to overcome the numerous problems. Dynamic address translation required specialized, expensive and hard-to-build hardware; moreover, it initially slowed down access to memory slightly.[2] There were also worries that new system-wide algorithms for utilizing secondary storage would be far less effective than previously used application-specific ones. By 1969 the debate over virtual memory for commercial computers was over:[2] an IBM research team led by David Sayre showed that the virtual memory overlay system consistently worked better than the best manually controlled systems.
Possibly the first minicomputer to introduce virtual memory was the Norwegian NORD-1. During the 1970s, other minicomputers implemented virtual memory, notably VAX models running VMS. Virtual memory was introduced to the x86 architecture with the protected mode of the Intel 80286 processor. At first it was done with segment swapping, which became inefficient with larger segments. The Intel 80386 introduced support for paging underneath the existing segmentation layer; the page fault exception could be chained with other exceptions without causing a double fault.

2. What is the difference between static RAM and dynamic RAM?
A dynamic memory cell stores a bit as a charge of electrons on a capacitor, and such memory must therefore be refreshed periodically, whereas a static cell holds its bit in a flip-flop.
Your computer probably uses both static RAM and dynamic RAM at the same time, but it uses them for different reasons because of the cost difference between the two types. If you understand how dynamic RAM and static RAM chips work inside, it is easy to see why the cost difference is there, and you can also understand the names.
Dynamic RAM is the most common type of memory in use today. Inside a dynamic RAM chip, each memory cell holds one bit of information and is made up of two parts: a transistor and a capacitor. These are, of course, extremely small transistors and capacitors, so that millions of them can fit on a single memory chip. The capacitor holds the bit of information, a 0 or a 1 (see How Bits and Bytes Work for information on bits). The transistor acts as a switch that lets the control circuitry on the memory chip read the capacitor or change its state.
A capacitor is like a small bucket that is able to store electrons. To store a 1 in the memory cell, the bucket is filled with electrons; to store a 0, it is emptied. The problem with the capacitor's bucket is that it has a leak: in a matter of a few milliseconds a full bucket becomes empty. Therefore, for dynamic memory to work, either the CPU or the memory controller has to come along and recharge all of the capacitors holding a 1 before they discharge. To do this, the memory controller reads the memory and then writes it right back. This refresh operation happens automatically thousands of times per second, and it is where dynamic RAM gets its name: dynamic RAM has to be dynamically refreshed all of the time or it forgets what it is holding. The downside of all of this refreshing is that it takes time and slows down the memory.
Static RAM uses a completely different technology. In static RAM, a form of flip-flop holds each bit of memory (see How Boolean Gates Work for detail on flip-flops). A flip-flop for a memory cell takes 4 or 6 transistors along with some wiring, but it never has to be refreshed. This makes static RAM significantly faster than dynamic RAM. However, because it has more parts, a static memory cell takes up a lot more space on a chip than a dynamic memory cell. Therefore you get less memory per chip, and that makes static RAM a lot more expensive.
So static RAM is fast and expensive, and dynamic RAM is less expensive and slower. Therefore static RAM is used to create the CPU's speed-sensitive cache, while dynamic RAM forms the larger system RAM.

3. Write in detail about dynamic memory allocation.
In computer science, dynamic memory allocation is the allocation of memory storage for use in a computer program during the runtime of that program. It can also be seen as a way of distributing ownership of limited memory resources among many pieces of data and code. Dynamically allocated memory exists until it is released, either explicitly by the programmer, by exiting a block, or by the garbage collector. This is in contrast to static memory allocation, which has a fixed duration. An object so allocated is said to have a dynamic lifetime.
Details:
- The task in fulfilling an allocation request is finding a block of unused memory of sufficient size.
- Problems in fulfilling an allocation request include internal and external fragmentation, whose reduction needs special care and thus makes the implementation more complex (see algorithm efficiency), and the allocator's metadata, which can inflate the size of (individually) small allocations; chunking attempts to reduce this effect.
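As a preview of the "fixed-size-blocks allocation" scheme described under Implementations below, here is a minimal sketch (not from the textbook) of a pool allocator that satisfies allocation requests by handing out blocks from a small array of pointers to free blocks. The block size and pool size are invented purely for illustration.

```c
/* Minimal sketch (not from the textbook) of fixed-size-blocks ("memory pool")
 * allocation: equal-sized blocks carved out of one static pool, tracked by a
 * simple stack of pointers to free blocks. Sizes are invented for the example. */
#include <stdio.h>

#define BLOCK_SIZE 64
#define NUM_BLOCKS 8

static unsigned char pool[NUM_BLOCKS][BLOCK_SIZE];
static void *free_blocks[NUM_BLOCKS];
static int   free_top;                 /* number of free blocks remaining */

static void pool_init(void) {
    for (int i = 0; i < NUM_BLOCKS; i++)
        free_blocks[i] = pool[i];      /* every block starts out free */
    free_top = NUM_BLOCKS;
}

static void *pool_alloc(void) {
    if (free_top == 0)
        return NULL;                   /* pool exhausted */
    return free_blocks[--free_top];    /* hand out the last free block */
}

static void pool_free(void *p) {
    free_blocks[free_top++] = p;       /* put the block back on the free stack */
}

int main(void) {
    pool_init();
    void *a = pool_alloc();
    void *b = pool_alloc();
    printf("allocated %p and %p, %d blocks left\n", a, b, free_top);
    pool_free(a);
    printf("freed one block, %d blocks left\n", free_top);
    return 0;
}
```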
Usually, memory is allocated from a large pool of unused memory called the heap (also called the free store). Since the precise location of the allocation is not known in advance, the memory is accessed indirectly, usually via a reference. The precise algorithm used to organize the memory area and to allocate and deallocate chunks is hidden behind an abstract interface and may use any of the methods described below.
Implementations
Fixed-size-blocks allocation: also called memory pool allocation, this uses a free list of fixed-size blocks of memory (often all of the same size). It works well for simple embedded systems.
Buddy blocks: in this system, memory is allocated from a large block that is a power of two in size. If the block is more than twice as large as desired, it is broken in two. One of the halves is selected, and the process repeats (checking the size again and splitting if needed) until the block is just large enough. All the blocks of a particular size are kept in a sorted linked list or tree. When a block is freed, it is compared to its buddy; if both are free, they are combined and placed in the next-largest-size buddy-block list. (When a block is allocated, the allocator starts with the smallest sufficiently large block, avoiding needlessly breaking blocks.)

4. Write about dynamic address translation and paged virtual memory.
Paged virtual memory
Almost all implementations of virtual memory divide the virtual address space of an application program into pages; a page is a block of contiguous virtual memory addresses. Pages are usually at least 4K bytes in size, and systems with large virtual address ranges or large amounts of real memory (e.g. RAM) generally use larger page sizes.
Page tables
Almost all implementations use page tables to translate the virtual addresses seen by the application program into physical addresses (also referred to as "real addresses") used by the hardware to process instructions. Each entry in the page table contains a mapping for a virtual page, either to the real memory address at which the page is stored or to an indicator that the page is currently held in a disk file. (Although most do, some systems may not support the use of a disk file for virtual memory.)
Systems can have one page table for the whole system or a separate page table for each application. If there is only one, different applications running at the same time share a single virtual address space, i.e. they use different parts of a single range of virtual addresses. Systems which use multiple page tables provide multiple virtual address spaces: concurrent applications think they are using the same range of virtual addresses, but their separate page tables redirect to different real addresses.
Paging
Paging is the process of saving inactive virtual memory pages to disk and restoring them to real memory when required. Most virtual memory systems enable programs to use virtual address ranges which in total exceed the amount of real memory (e.g. RAM). To do this they use disk files to save virtual memory pages which are not currently active, and restore them to real memory when they are needed. Pages are not necessarily restored to the same real addresses from which they were saved; applications are aware only of virtual addresses.
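Here is a minimal sketch (not from the textbook) of the page-table translation described above: a virtual address is split into a virtual page number and an offset, and a single-level table either yields the physical address or reports a page fault. The 4 KiB page size, the tiny table and the two mappings are assumptions made purely for the example.

```c
/* Minimal sketch (not from the textbook) of single-level page-table
 * translation: virtual address -> (virtual page number, offset) -> physical
 * frame, or a page fault if the page is not present in real memory. */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096u                /* 4 KiB pages -> 12 offset bits */
#define NUM_PAGES 16u                  /* tiny address space for the example */

struct pte {
    int      present;                  /* 1 if the page is in real memory */
    uint32_t frame;                    /* physical frame number if present */
};

static struct pte page_table[NUM_PAGES] = {
    [0] = {1, 5},                      /* virtual page 0 -> physical frame 5 */
    [1] = {1, 9},                      /* virtual page 1 -> physical frame 9 */
    /* all other pages are marked not present (held on disk) */
};

static int translate(uint32_t vaddr, uint32_t *paddr) {
    uint32_t vpn    = vaddr / PAGE_SIZE;   /* virtual page number */
    uint32_t offset = vaddr % PAGE_SIZE;   /* byte within the page */

    if (vpn >= NUM_PAGES || !page_table[vpn].present)
        return 0;                          /* page fault: a real system would
                                              now invoke the paging supervisor */
    *paddr = page_table[vpn].frame * PAGE_SIZE + offset;
    return 1;
}

int main(void) {
    uint32_t p;
    uint32_t v = 0x1234;                   /* virtual page 1, offset 0x234 */
    if (translate(v, &p))
        printf("virtual 0x%04x -> physical 0x%05x\n", (unsigned)v, (unsigned)p);
    v = 0x5000;                            /* virtual page 5: not present */
    if (!translate(v, &p))
        printf("virtual 0x%04x -> page fault\n", (unsigned)v);
    return 0;
}
```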
Usually, when a page is going to be restored to real memory, the real memory already contains another virtual memory page, which will be saved to disk before the restore takes place.
Dynamic address translation
If, while executing an instruction, a CPU fetches an instruction located at a particular virtual address, fetches data from a specific virtual address or stores data to a particular virtual address, the virtual address must be translated to the corresponding physical address. This is done by a hardware component, sometimes called a memory management unit, which looks up the real address (from the page table) corresponding to the virtual address and passes the real address to the parts of the CPU which execute instructions. If the page tables indicate that the virtual memory page is not currently in real memory, the hardware raises a page fault exception (a special internal signal) which invokes the paging supervisor component of the operating system.
PART-B UNIT-4 COMPUTER PERIPHERALS
1. Write short notes on hard disk.
Hard Disk (ATA / SATA / SCSI)
- Used to store data permanently.
- Comes in different physical sizes (3.5", 2.5", 1.8", Micro Drive).
- Uses different interfaces: ATA / SATA / SCSI (speed: ATA < SATA < SCSI).
- Runs at different mechanical (rotational) speeds (4,200 rpm / 5,400 rpm / 7,200 rpm / 10,000 rpm).
2. Give the details about the mother board.
Main Board / Mother Board (MB)
- Provides a platform for connecting all the devices (keyboard / mouse / power / CPU / memory / hard disk / floppy disk / display card, etc.).
- Many main boards already have a built-in sound card and network card, or even a display card.
3. Give the details of I/O devices and interfaces.
I/O Device & Interface
- ATA / SATA / SCSI (for hard disk)
- Parallel Port or LPT Port (for printer)
- COM Port (for modem)
- RJ45 Socket (for network)
- PS/2 (for keyboard / mouse)
- D-Sub / DVI (for monitor)
- USB (all compatible devices)
4. Write the details of hard disk memory.
Hard Disk (ATA / SATA / SCSI)
- Different built-in buffer memory sizes (2 MB / 8 MB / 16 MB, etc.).
- Different capacities (80 GB to 500 GB, or even 1 TB).
- Small hard disks are increasingly popular because of their portable size.
5. Write short notes on connection types of keyboards.
Connection types
There are several ways of connecting a keyboard using cables, including the standard AT connector commonly found on motherboards, which was eventually replaced by the PS/2 and the USB connection. Prior to the iMac line of systems, Apple used the proprietary Apple Desktop Bus for its keyboard connector. Wireless keyboards have become popular for their increased user freedom. A wireless keyboard often includes a required combination transmitter and receiver unit that attaches to the computer's keyboard port (see Connection types above). The wireless aspect is achieved either by radio frequency (RF) or by infrared (IR) signals sent and received from both the keyboard and the unit attached to the computer. A wireless keyboard may use an industry-standard RF technology called Bluetooth; with Bluetooth, the transceiver may be built into the computer. However, a wireless keyboard needs batteries to work and may pose a security problem due to the risk of data "eavesdropping" by hackers.[6]
6. What is meant by an alternative text-entering method?
Alternative text-entering methods
An on-screen keyboard controlled with the mouse can be used by users with limited mobility.
Optical character recognition (OCR) is preferable to rekeying for converting existing text that is already written down but not in machine-readable format (for example, a Linotype-composed book from the 1940s). In other words, to convert the text from an image to editable text (that is, a string of character codes), a person could re-type it, or a computer could look at the image and deduce what each character is. OCR technology has already reached an impressive state (for example, Google Book Search) and promises more for the future. Speech recognition converts speech into machine-readable text (that is, a string of character codes). The technology has already reached an impressive state and is already implemented in various software products. For certain uses (e.g., transcription of medical or legal dictation; journalism; writing essays or novels) it is starting to replace the keyboard; however, it does not threaten to replace keyboards entirely anytime soon. It can, however, interpret commands (for example, "close window" or "undo that") in addition to text. Therefore, it has theoretical potential to replace keyboards entirely (whereas OCR replaces them only for a certain kind of task). Pointing devices can be used to enter text or characters in contexts where using a physical keyboard would be inappropriate or impossible. These accessories typically present characters on a display, in a layout that provides fast access to the more frequently used characters or character combinations. Popular examples of this kind of input are Graffiti, Dasher and on-screen virtual keyboards. 7. Write short notes on keystroke hacking. Keystroke hacking Keystroke logging (often called keylogging) is a method of capturing and recording user keystrokes. While it is used legitimately to measure employee productivity on certain clerical tasks, or by law enforcement agencies to find out about illegal activities, it is also used by hackers for law-breaking, or other illegal activities. Hackers use keyloggers as a means to obtain passwords or encryption keys and thus bypassing other security measures. Keystroke logging can be achieved by both hardware and software means. Hardware key loggers are attached to the keyboard cable or installed inside standard keyboards. Software keyloggers work on the target computer’s operating system and gain unauthorized access to the hardware, hook into the keyboard with functions provided by the OS, or use remote access software to transmit recorded data out of the target computer to a remote location. Some hackers also use wireless keylogger sniffers to collect packets of data being transferred from a wireless keyboard and its receiver, and then they crack the encryption key being used to secure wireless communications between the two devices. Anti-spyware applications are able to detect many keyloggers and cleanse them. Responsible vendors of monitoring software support detection by anti-spyware programs, thus preventing abuse of the software. Enabling a firewall does not stop keyloggers per se, but can possibly prevent transmission of the logged material over the net if properly configured. Network monitors (also known as reverse-firewalls) can be used to alert the user whenever an application attempts to make a network connection. This gives the user the chance to prevent the keylogger from "phoning home" with his or her typed information. Automatic form-filling programs can prevent keylogging entirely by not using the keyboard at all. 
Most keyloggers can be fooled by alternating between typing the login credentials and typing characters somewhere else in the focus window. [7] Electromagnetic waves released every time key is pressed on the keyboard can be detected by a nearby antenna and interpreted by computer software to work out exactly what was typed. [8] 8. Write about the key switches. Key switches "Dome-switch" keyboards (sometimes incorrectly referred to as a membrane keyboards) are the most common type now in use. When a key is pressed, it pushes down on a rubber dome sitting beneath the key. A conductive contact on the underside of the dome touches (and hence connects) a pair of conductive lines on the circuit below. This bridges the gap between them and allows electric current to flow (the open circuit is closed). A scanning signal is emitted by the chip along the pairs of lines in the matrix circuit which connects to all the keys. When the signal in one pair becomes different, the chip generates a "make code" corresponding to the key connected to that pair of lines. Keycaps are also required for most types of keyboards; while modern keycaps are typically surface-marked, they can also be 2-shot molded, or engraved, or they can be made of transparent material with printed paper inserts Keys on older IBM keyboards were made with a "buckling spring" mechanism, in which a coil spring under the key buckles under pressure from the user's finger, pressing a rubber dome, whose inside is coated with conductive graphite, which connects two leads below, completing a circuit. This produces a clicking sound, and gives physical feedback for the typist indicating that the key has been depressed.[3][4]When a key is pressed and the circuit is completed, the code generated is sent to the computer either via a keyboard cable (using on-off electrical pulses to represent bits) or over a wireless connection. While not nearly as popular as dome-switch keyboards, these "clicky" keyboards have been making a comeback recently, particularly among writers and others who use keyboards heavily.[5] A chip inside the computer receives the signal bits and decodes them into the appropriate keypress. The computer then decides what to do on the basis of the key pressed (e.g. display a character on the screen, or perform some action). When the key is released, a break code (different from the make code) is sent to indicate the key is no longer pressed. If the break code is missed (e.g. due to a keyboard switch) it is possible for the keyboard controller to believe the key is pressed down when it is not, which is why pressing then releasing the key again will release the key (since another break code is sent). Other types of keyboards function in a similar manner, the main differences being how the individual key-switches work. For more on this subject refer to the article on keyboard technology. Certain key presses are special, namely Ctrl-Alt-Delete and SysRq, but what makes them special is a function of software. In the PC architecture, the keyboard controller (the component in the computer that receives the make and break codes) sends the computer's CPU a hardware interrupt whenever a key is pressed or released. The CPU's interrupt routine which handles these interrupts usually just places the key's code in a queue, to be handled later by other code when it gets around to it, then returns to whatever the computer was doing before. The special keys cause the interrupt routine to take a different "emergency" exit instead. 
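The queueing behaviour described above, where the interrupt routine simply stores the key's code for later processing, can be sketched in C as a handler that pushes scan codes into a small ring buffer (illustrative only; read_scan_code_from_controller is a hypothetical stand-in for the real hardware access):

#include <stdint.h>

#define QUEUE_SIZE 64

extern uint8_t read_scan_code_from_controller(void);  /* hypothetical port read */

static volatile uint8_t  queue[QUEUE_SIZE];
static volatile unsigned head, tail;   /* head: next free slot, tail: next unread slot */

/* Called on the keyboard hardware interrupt: read the make or break
 * code from the controller, queue it, and return immediately.        */
void keyboard_interrupt_handler(void)
{
    uint8_t code = read_scan_code_from_controller();

    unsigned next = (head + 1) % QUEUE_SIZE;
    if (next != tail) {                /* drop the code if the queue is full */
        queue[head] = code;
        head = next;
    }
}

/* Called later by other code "when it gets around to it". Returns -1 if empty. */
int next_key_code(void)
{
    if (tail == head)
        return -1;
    uint8_t code = queue[tail];
    tail = (tail + 1) % QUEUE_SIZE;
    return code;
}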
This more trusted route (the "emergency" exit described above) is much harder to intercept. The layout of a keyboard can be changed by remapping the keys. When you remap a key, you tell the computer a new meaning for the pressing of that key. Keyboard remapping is supported at driver level, configurable within the operating system, or through add-ons to existing programs.
9. Write short notes about the system commands.
System commands
The SysRq / Print screen commands often share the same key. SysRq was used in earlier computers as a "panic" button to recover from crashes. The Print screen command used to capture the entire screen and send it to the printer, but now it usually places a screenshot on the clipboard.
The Break key / Pause key no longer has a well-defined purpose. Its origins go back to teletype users, who wanted a key that would temporarily interrupt the communications line. The Break key can be used by software in several different ways, such as to switch between multiple login sessions, to terminate a program, or to interrupt a modem connection. In programming, especially in old DOS-style BASIC, Pascal and C, Break is used (in conjunction with Ctrl) to stop program execution. In addition to this, Linux and variants, as well as many DOS programs, treat this combination the same as Ctrl+C. On modern keyboards, the Break key is usually labeled Pause/Break. In most Windows environments, the key combination Windows key+Pause brings up the system properties.
The Escape key (often abbreviated Esc) is used to initiate an escape sequence. As most computer users are no longer concerned with the details of controlling their computer's peripherals, the task for which the escape sequences were originally designed, the Escape key was appropriated by application programmers, most often to mean Stop. This use continues today in Microsoft Windows's use of Escape as a shortcut in dialog boxes for No, Quit, Exit, Cancel, or Abort. A common application of the Esc key today is as a shortcut key for the Stop button in many web browsers. On machines running Microsoft Windows, prior to the introduction of the Windows key on keyboards, the typical practice for invoking the "start" button was to hold down the Control key and press Escape. This process still works in Windows XP and Windows Vista.
The Menu key or Application key is a key found on Windows-oriented computer keyboards. It is used to launch a context menu with the keyboard rather than with the usual right mouse button. The key's symbol is a small icon depicting a cursor hovering above a menu. This key was created at the same time as the Windows key and is normally used when the right mouse button is not present on the mouse. Some Windows public terminals do not have a Menu key on their keyboard, to prevent users from right-clicking (however, in many Windows applications a similar functionality can be invoked with the Shift+F10 keyboard shortcut).
PART-C UNIT-4 COMPUTER PERIPHERALS
1. Write in detail about the computer structure and the power supply.
Basic Computer Structure
The logical structure of a computer includes:
- BIOS (the basic input/output system)
- CPU (the processor)
- Memory / RAM (temporary storage)
- Hard disk (permanent storage)
- Input / output devices
- Communication channels (e.g. USB)
- Bus (high-speed internal communication)
- Other add-on devices
Power Supply
- The power supply converts the AC mains voltage to the lower DC voltages suitable for the computer.
- Power supplies are classified by their load rating in watts.
- Different types of sockets for different devices.
BIOS
- Basic Input Output System.
- Stores all the parameters needed before the OS loads (for example, hard disk size, memory speed, and turning built-in devices such as the sound card, USB or printer on or off).
- Usually stored in flash memory.
2. Write short notes on the mouse.
Mechanical mouse devices
Operating a mechanical mouse: 1) moving the mouse turns the ball; 2) X and Y rollers grip the ball and transfer movement; 3) optical encoding disks include light holes; 4) infrared LEDs shine through the disks; 5) sensors gather light pulses to convert to X and Y velocities.
Bill English, builder of Engelbart's original mouse,[10] invented the ball mouse in 1972 while working for Xerox PARC.[11] The ball mouse replaced the external wheels with a single ball that could rotate in any direction. It came as part of the hardware package of the Xerox Alto computer. Perpendicular chopper wheels housed inside the mouse's body chopped beams of light on the way to light sensors, thus detecting in their turn the motion of the ball. This variant of the mouse resembled an inverted trackball and became the predominant form used with personal computers throughout the 1980s and 1990s. The Xerox PARC group also settled on the modern technique of using both hands to type on a full-size keyboard and grabbing the mouse when required.
The ball mouse utilizes two rollers rolling against two sides of the ball. One roller detects the forward-backward motion of the mouse and the other the left-right motion. The motion of these two rollers causes two disc-like encoder wheels to rotate, interrupting optical beams to generate electrical signals. The mouse sends these signals to the computer system by means of connecting wires. The driver software in the system converts the signals into motion of the mouse pointer along the X and Y axes on the screen.
Ball mice and wheel mice were manufactured for Xerox by Jack Hawley, doing business as The Mouse House in Berkeley, California, starting in 1975.[12][13] Based on another invention by Jack Hawley, proprietor of the Mouse House, Honeywell produced another type of mechanical mouse.[14][15] Instead of a ball, it had two wheels rotating at off axes. Keytronic later produced a similar product.[16] Modern computer mice took form at the École polytechnique fédérale de Lausanne (EPFL) under the inspiration of Professor Jean-Daniel Nicoud and at the hands of engineer and watchmaker André Guignard.[17] This new design incorporated a single hard rubber mouseball and three buttons, and remained a common design until the mainstream adoption of the scroll-wheel mouse during the 1990s.[18]
Another type of mechanical mouse, the "analog mouse" (now generally regarded as obsolete), uses potentiometers rather than encoder wheels, and is typically designed to be plug-compatible with an analog joystick. The "Color Mouse," originally marketed by Radio Shack for their Color Computer (but also usable on MS-DOS machines equipped with analog joystick ports, provided the software accepted joystick input), was the best-known example.
3. Write short notes on the optical mouse.
Optical mice
An optical mouse uses a light-emitting diode and photodiodes to detect movement relative to the underlying surface, rather than moving some of its parts, as a mechanical mouse does.
Early optical mice
Early optical mice, first demonstrated by two independent inventors in 1980,[19] came in two different varieties:
1. Some, such as those invented by Steve Kirsch of MIT and Mouse Systems Corporation,[20][21] used an infrared LED and a four-quadrant infrared sensor to detect grid lines printed with infrared-absorbing ink on a special metallic surface. Predictive algorithms in the CPU of the mouse calculated the speed and direction over the grid.
2. Others, invented by Richard F. Lyon and sold by Xerox, used a 16-pixel visible-light image sensor with integrated motion detection on the same chip[22][23] and tracked the motion of light dots in a dark field of a printed paper or similar mouse pad.[24]
These two mouse types had very different behaviors, as the Kirsch mouse used an x-y coordinate system embedded in the pad, and would not work correctly when the pad was rotated, while the Lyon mouse used the x-y coordinate system of the mouse body, as mechanical mice do.
4. Write short notes on 3D mice.
3D mice
Also known as bats,[30] flying mice, or wands,[31] these devices generally function through ultrasound. Probably the best-known example would be 3DConnexion/Logitech's Space Mouse from the early 1990s. In the late 1990s Kantek introduced the 3D Ring Mouse. This wireless mouse was worn on a ring around a finger, which enabled the thumb to access three buttons. The mouse was tracked in three dimensions by a base station.[32] Despite a certain appeal, it was finally discontinued because it did not provide sufficient resolution.
A recent consumer 3D pointing device is the Wii Remote. While primarily a motion-sensing device (that is, it can determine its orientation and direction of movement), the Wii Remote can also detect its spatial position by comparing the distance and position of the lights from the IR emitter using its integrated IR camera (since the nunchuk lacks a camera, it can only tell its current heading and orientation). The obvious drawback to this approach is that it can only produce spatial coordinates while its camera can see the sensor bar. In February 2008, at the Game Developers' Conference (GDC), a company called Motion4U introduced a 3D mouse add-on called "OptiBurst" for Autodesk's Maya application. The mouse allows users to work in true 3D with 6 degrees of freedom. The primary advantage of this system is speed of development with organic (natural) movement.
5. Write about PS/2 and its protocol.
PS/2 interface and protocol
For more details on this topic, see PS/2 connector. With the arrival of the IBM PS/2 personal-computer series in 1987, IBM introduced the eponymous PS/2 interface for mice and keyboards, which other manufacturers rapidly adopted. The most visible change was the use of a round 6-pin mini-DIN connector in lieu of the former 5-pin connector. In default mode (called stream mode) a PS/2 mouse communicates motion, and the state of each button, by means of 3-byte packets.[35] For any motion, button press or button release event, a PS/2 mouse sends, over a bi-directional serial port, a sequence of three bytes with the following format:

         Bit 7   Bit 6   Bit 5   Bit 4   Bit 3   Bit 2   Bit 1   Bit 0
Byte 1   YV      XV      YS      XS      1       MB      RB      LB
Byte 2   X movement
Byte 3   Y movement

Here, XS and YS represent the sign bits of the movement vectors, XV and YV indicate an overflow in the respective vector component, and LB, MB and RB indicate the status of the left, middle and right mouse buttons (1 = pressed).
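Based on the packet format above, a small C routine might decode one stream-mode packet as follows (a sketch, not a reference implementation; the struct and function names are invented for illustration):

#include <stdint.h>
#include <stdbool.h>

typedef struct {
    bool left, middle, right;   /* LB, MB, RB          */
    int  dx, dy;                /* signed movement     */
    bool overflow;              /* XV or YV set        */
} mouse_event_t;

/* Decode one 3-byte stream-mode packet as laid out in the table above. */
mouse_event_t decode_ps2_packet(const uint8_t p[3])
{
    mouse_event_t e;

    e.left     = p[0] & 0x01;                 /* LB */
    e.right    = (p[0] >> 1) & 0x01;          /* RB */
    e.middle   = (p[0] >> 2) & 0x01;          /* MB */
    e.overflow = (p[0] & 0xC0) != 0;          /* XV or YV */

    /* XS and YS act as the ninth (sign) bits of the 8-bit movement values. */
    e.dx = p[1] - ((p[0] & 0x10) ? 256 : 0);  /* XS */
    e.dy = p[2] - ((p[0] & 0x20) ? 256 : 0);  /* YS */

    return e;
}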
PS/2 mice also understand several commands for reset and self-test, switching between different operating modes, and changing the resolution of the reported motion vectors. In Linux, a PS/2 mouse is detected as a /dev/psaux device. 6. Explain about in detail about keyboard. Keyboard (computing) In computing, a keyboard is an input device, partially modeled after the typewriter keyboard, which uses an arrangement of buttons or keys, which act as mechanical levers or electronic switches. A keyboard typically has characters engraved or printed on the keys and each press of a key typically corresponds to a single written symbol. However, to produce some symbols requires pressing and holding several keys simultaneously or in sequence. While most keyboard keys produce letters, numbers or signs (characters), other keys or simultaneous key presses can produce actions or computer commands. In normal usage, the keyboard is used to type text and numbers into a word processor, text editor or other program. In a modern computer, the interpretation of keypresses is generally left to the software. A computer keyboard distinguishes each physical key from every other and reports all keypresses to the controlling software. Keyboards are also used for computer gaming, either with regular keyboards or by using keyboards with special gaming features, which can expedite frequently used keystroke combinations. A keyboard is also used to give commands to the operating system of a computer, such as Windows' Control-Alt-Delete combination, which brings up a task window or shuts down the machine. Types Standard Standard keyboards, such as the 101-key US traditional keyboard 104-key Windows keyboards, include alphabetic characters, punctuation symbols, numbers and a variety of function keys. The internationally-common 102/105 key keyboards have a smaller 'left shift' key and an additional key with some more symbols between that and the letter to its right (usually Z or Y).[1] Laptop-size Keyboards on laptops and notebook computers usually have a shorter travel distance for the keystroke and a reduced set of keys. As well, they may not have a numerical keypad, and the function keys may be placed in locations that differ from their placement on a standard, fullsized keyboard. The keyboards on laptops such as this Sony VAIO have a shorter travel distance and a reduced set of keys. Gaming and multimedia Keyboards with extra keys, such as multimedia keyboards, have special keys for accessing music, web and other oft-used programs, a mute button, volume buttons or knob and standby (sleep) button. Gaming keyboards have extra function keys, which can be programmed with keystroke macros. For example, 'ctrl+shift+y' could be a keystroke that is frequently used in a certain computer game. Shortcuts marked on color-coded keys are used for some software applications and for specialized uses including word processing, video editing, graphic design and audio editing. Thumb-sized Smaller keyboards have been introduced for laptops, PDAs, cellphones or users who have a limited workspace. The size of a standard keyboard is dictated by the practical consideration that the keys must be large enough to be easily pressed by fingers. To reduce the size of the keyboard, the numeric keyboard to the right of the alphabetic keyboard can be removed, or the size of the keys can be reduced, which makes it harder to enter text. Another way to reduce the size of the keyboard is to reduce the number of keys and use chording keyer, i.e. 
pressing several keys simultaneously. For example, the GKOS keyboard has been designed for small wireless devices. Other two-handed alternatives more akin to a game controller, such as the AlphaGrip, are also used as a way to input data and text. Another way to reduce the size of a keyboard is to use smaller buttons and pack them closer together. Such keyboards, often called a "thumbboard" (thumbing) are used in some personal digital assistants such as the Palm Treo and BlackBerry and some Ultra-Mobile PCs such as the OQO. Numeric Numeric keyboards contain only numbers, mathematical symbols for addition, subtraction, multiplication, and division, a decimal point, and several function keys (e.g. End, Delete, etc.). They are often used to facilitate data entry with smaller keyboard-equipped laptops or with smaller keyboards that do not have a numeric keypad. Non-standard or special-use types Chorded A keyset or chorded keyboard is a computer input device that allows the user to enter characters or commands formed by pressing several keys together, like playing a "chord" on a piano. The large number of combinations available from a small number of keys allows text or commands to be entered with one hand, leaving the other hand free to do something else. A secondary advantage is that it can be built into a device (such as a pocket-sized computer) that is too small to contain a normal sized keyboard. A chorded keyboard designed to be used while held in the hand is called a keyer. Virtual Main article: Virtual keyboard Virtual keyboards, such as the I-Tech Virtual Laser Keyboard, project an image of a full-size keyboard onto a surface. Sensors in the projection unit identify which key is being "pressed" and relay the signals to a computer or personal digital assistant. There is also a virtual keyboard, the On-Screen Keyboard, for use on Windows. The On-Screen Keyboard is an image of a standard keyboard which the user controls by using a mouse to hover over the desired letter or symbol, and then clicks to enter the letter. The On-Screen Keyboard is provided with Windows as an accessibility aid, to assist users who may have difficulties using a regular keyboard. The iPhone uses a multi-touch screen to display a virtual keyboard. 7. Explain about the control processor of keyboard. Control processor The modern PC keyboard has more than just switches. It also includes a control processor and indicator lights to provide feedback to the user about what state the keyboard is in. Depending on the sophistication of the controller's programming, the keyboard may also offer other special features. The processor is usually a single chip 8048 microcontroller variant. The keyboard switch matrix is wired to its inputs and it processes the incoming keystrokes and sends the results down a serial cable (the keyboard cord) to a receiver in the main computer box. It also controls the illumination of the "caps lock", "num lock" and "scroll lock" lights. A common test for whether the computer has crashed is pressing the "caps lock" key. The keyboard sends the key code to the keyboard driver running in the main computer; if the main computer is operating, it commands the light to turn on. All the other indicator lights work in a similar way. The keyboard driver also tracks the shift, alt and control state of the keyboard. When pressing a keyboard key, the key "bounces" like a ball against its contacts several times before it settles into firm contact. 
When released, it bounces some more until it reverts to the uncontacted state. If the computer were watching for each pulse, it would see many keystrokes for what the user thought was just one. To resolve this problem, the processor in a keyboard (or computer) "debounces" the keystrokes by aggregating them across time to produce one "confirmed" keystroke that (usually) corresponds to a solid contact.
Some low-quality keyboards suffer problems with rollover (that is, when multiple keys are pressed in quick succession); some types of keyboard circuitry will register only a maximum number of keys at one time. This is undesirable for games (designed for multiple keypresses, e.g. casting a spell while holding down keys to run) and undesirable for extremely fast typing (hitting new keys before the fingers can release previous keys). A common side effect of this shortcoming is called "phantom key blocking": on some keyboards, pressing three keys simultaneously sometimes resulted in a fourth keypress being registered. Modern keyboards prevent this from happening by blocking the third key in certain key combinations, but while this prevents phantom input, it also means that when two keys are depressed simultaneously, many of the other keys on the keyboard will not respond until one of the two depressed keys is lifted. With better keyboard designs, this seldom happens in office programs, but it remains a problem in games even on expensive keyboards, due to wildly different and/or configurable key/command layouts in different games.
PART-B UNIT—5 ADVANCED SYSTEM ARCHITECTURE
1. Write in short about VLIW architecture.
Very-Long Instruction Word (VLIW) Computer Architecture
ABSTRACT
VLIW architectures are distinct from traditional RISC and CISC architectures implemented in current mass-market microprocessors. It is important to distinguish instruction-set architecture (the processor programming model) from implementation (the physical chip and its characteristics). VLIW microprocessors and superscalar implementations of traditional instruction sets share some characteristics: multiple execution units and the ability to execute multiple operations simultaneously. The techniques used to achieve high performance, however, are very different, because the parallelism is explicit in VLIW instructions but must be discovered by hardware at run time by superscalar processors. VLIW implementations are simpler for very high performance. Just as RISC architectures permit simpler, cheaper high-performance implementations than do CISCs, VLIW architectures are simpler and cheaper than RISCs because of further hardware simplifications. VLIW architectures, however, require more compiler support.
INTRODUCTION AND MOTIVATION
Currently, in the mid 1990s, IC fabrication technology is advanced enough to allow unprecedented implementations of computer architectures on a single chip. Also, the current rate of process advancement allows implementations to be improved at a rate that is satisfying for most of the markets these implementations serve. In particular, the vendors of general-purpose microprocessors are competing for sockets in desktop personal computers (including workstations) by pushing the envelopes of clock rate (raw operating speed) and parallel execution. The market for desktop microprocessors is proving to be extremely dynamic.
In particular, the x86 market has surprised many observers by attaining performance levels and price/performance levels that many thought were out of reach. The reason for the pessimism about the x86 was its architecture (instruction set). Indeed, with the advent of RISC architectures, the x86 is now recognized as a deficient instruction set. Instruction-set compatibility is at the heart of the desktop microprocessor market. Because the application programs that end users purchase are delivered in binary (directly executable by the microprocessor) form, the end users' desire to protect their software investments creates tremendous instruction-set inertia.
There is a different market, though, that is much less affected by instruction-set inertia. This market is typically called the embedded market, and it is characterized by products containing factory-installed software that runs on a microprocessor whose instruction set is not readily evident to the end user. Although the vendor of the product containing the embedded microprocessor has an investment in the embedded software, just like end users with their applications, there is considerably more freedom to migrate embedded software to a new microprocessor with a different instruction set. To overcome this lower level of instruction-set inertia, all it takes is a sufficiently better set of implementation characteristics, particularly absolute performance and/or price-performance. This lower level of instruction-set inertia gives the vendors of embedded microprocessors the freedom and initiative to seek out new instruction sets. The relative success of RISC microprocessors in the high end of the embedded market is an example of innovation by microprocessor vendors that produced a benefit large enough to overcome the market's inertia. To the vendors' disappointment, the benefits of RISCs have not been sufficient to overcome the instruction-set inertia of the mainstream desktop computer market.
Because of advances in IC fabrication technology and advances in high-level language compiler technology, it now appears that microprocessor vendors are compelled by the potential benefits of another change in microprocessor instruction sets. As before, the embedded market is likely to be first to accept this change. The new direction in microprocessor architecture is toward VLIW (very long instruction word) instruction sets. VLIW architectures are characterized by instructions that each specify several independent operations. This is compared to RISC instructions, which typically specify one operation, and CISC instructions, which typically specify several dependent operations. VLIW instructions are necessarily longer than RISC or CISC instructions, thus the name.
2. Write in short about the comparison of RISC and CISC.
IMPLEMENTATION COMPARISON: SUPERSCALAR CISC, SUPERSCALAR RISC, VLIW
The differences between CISC, RISC, and VLIW architectures manifest themselves in their respective implementations. Comparing high-performance implementations of each is the most telling. High-performance RISC and CISC designs are called superscalar implementations. Superscalar in this context simply means "beyond scalar", where scalar means one operation at a time. Thus, superscalar means more than one operation at a time. Most CISC instruction sets were designed with the idea that an implementation will fetch one instruction, execute its operations fully, then move on to the next instruction. The assumed execution model was thus serial in nature.
RISC architects were aware of the advantages and peculiarities of pipelined processor implementations, and so designed RISC instruction sets with a pipelined execution model in mind. In contrast to the assumed CISC execution model, the idea for the RISC execution model is that an implementation will fetch one instruction, issue it into the pipeline, and then move on to the next instruction before the previous one has completed its trip through the pipeline.
3. Write short notes on the advantages of VLIW.
SOFTWARE INSTEAD OF HARDWARE: IMPLEMENTATION ADVANTAGES OF VLIW
A VLIW implementation achieves the same effect as a superscalar RISC or CISC implementation, but the VLIW design does so without the two most complex parts of a high-performance superscalar design. Because VLIW instructions explicitly specify several independent operations (that is, they explicitly specify parallelism), it is not necessary to have decoding and dispatching hardware that tries to reconstruct parallelism from a serial instruction stream. Instead of having hardware attempt to discover parallelism, VLIW processors rely on the compiler that generates the VLIW code to explicitly specify parallelism.
Relying on the compiler has advantages. First, the compiler has the ability to look at much larger windows of instructions than the hardware. For a superscalar processor, a larger hardware window implies a larger amount of logic and therefore chip area. At some point, there simply is not enough of either, and window size is constrained. Worse, even before a simple limit on the amount of hardware is reached, complexity may adversely affect the speed of the logic, thus the window size is constrained to avoid reducing the clock speed of the chip. Software windows can be arbitrarily large; thus, looking for parallelism in a software window is likely to yield better results. Second, the compiler has knowledge of the source code of the program. Source code typically contains important information about program behavior that can be used to help express maximum parallelism at the instruction-set level. A powerful technique called trace-driven compilation can be employed to dramatically improve the quality of code output by the compiler. Trace-driven compilation first produces a suboptimal, but correct, VLIW program. The program has embedded routines that take note of program behavior. The recorded program behavior (which branches are taken, how often, etc.) is then used by the compiler during a second compilation to produce code that takes advantage of accurate knowledge of program behavior.
4. Write short notes on RISC systems.
Reduced instruction set computer
The acronym RISC (pronounced as "risk"), for reduced instruction set computer, represents a CPU design strategy emphasizing the insight that simplified instructions that "do less" may still provide for higher performance if this simplicity can be utilized to make instructions execute very quickly. Many proposals for a "precise" definition[1] have been attempted, and the term is being slowly replaced by the more descriptive load-store architecture. Well-known RISC families include Alpha, ARC, ARM, AVR, MIPS, PA-RISC, Power Architecture (including PowerPC), SuperH, and SPARC. Being an old idea, some aspects attributed to the first RISC-labeled designs (around 1975) include the observations that the memory-restricted compilers of the time were often unable to take advantage of features intended to facilitate coding, and that complex addressing inherently takes many cycles to perform.
It was argued that such functions would better be performed by sequences of simpler instructions, if this could yield implementations simple enough to cope with really high frequencies, and small enough to leave room for many registers[2], factoring out slow memory accesses. Uniform, fixed length instructions with arithmetic’s restricted to registers were chosen to ease instruction pipelining in these simple designs, with special load-store instructions accessing memory. 5. Write about the characteristics. Typical characteristics of RISC For any given level of general performance, a RISC chip will typically have far fewer transistors dedicated to the core logic which originally allowed designers to increase the size of the register set and increase internal parallelism. Other features, which are typically found in RISC architectures are: Uniform instruction format, using a single word with the opcode in the same bit positions in every instruction, demanding less decoding; Identical general purpose registers, allowing any register to be used in any context, simplifying compiler design (although normally there are separate floating point registers); Simple addressing modes. Complex addressing performed via sequences of arithmetic and/or load-store operations; Few data types in hardware, some CISCs have byte string instructions, or support complex numbers; this is so far unlikely to be found on a RISC. Exceptions abound, of course, within both CISC and RISC. RISC designs are also more likely to feature a Harvard memory model, where the instruction stream and the data stream are conceptually separated; this means that modifying the memory where code is held might not have any effect on the instructions executed by the processor (because the CPU has a separate instruction and data cache), at least until a special synchronization instruction is issued. On the upside, this allows both caches to be accessed simultaneously, which can often improve performance 6. Write about the comparison of RISC and x86 systems. RISC and x86 However, despite many successes, RISC has made few inroads into the desktop PC and commodity server markets, where Intel's x86 platform remains the dominant processor architecture (Intel is facing increased competition from AMD, but even AMD's processors implement the x86 platform, or a 64-bit superset known as x86-64). There are three main reasons for this. 1. The very large base of proprietary PC applications are written for x86, whereas no RISC platform has a similar installed base, and this meant PC users were locked into the x86. 2. Although RISC was indeed able to scale up in performance quite quickly and cheaply, Intel took advantage of its large market by spending vast amounts of money on processor development. Intel could spend many times as much as any RISC manufacturer on improving low level design and manufacturing. The same could not be said about smaller firms like Cyrix and NexGen, but they realized that they could apply pipelined design philosophies and practices to the x86architecture — either directly as in the 6x86 and MII series, or indirectly (via extra decoding stages) as in Nx586 and AMD K5. 3. Later, more powerful processors such as Intel P6 and AMD K6 had similar RISC-like units that executed a stream of micro-operations generated from decoding stages that split most x86 instructions into several pieces. Today, these principles have been further refined and are used by modern x86 processors such as Intel Core 2 and AMD K8. 
The first available chip deploying such techniques was the NexGen Nx586, released in 1994 (while the AMD K5 was severely delayed and released in 1995). While early RISC designs were significantly different from contemporary CISC designs, by 2000 the highest performing CPUs in the RISC line were almost indistinguishable from the highest performing CPUs in the CISC line.[12][13][14]
PART-C UNIT—5 ADVANCED SYSTEM ARCHITECTURE
1. Compare the differences between RISC and CISC.
The simplest way to examine the advantages and disadvantages of RISC architecture is by contrasting it with its predecessor: CISC (Complex Instruction Set Computer) architecture.
Multiplying Two Numbers in Memory
Consider the storage scheme of a generic computer: the main memory is divided into locations numbered from (row) 1 : (column) 1 to (row) 6 : (column) 4. The execution unit is responsible for carrying out all computations. However, the execution unit can only operate on data that has been loaded into one of the six registers (A, B, C, D, E, or F). Let's say we want to find the product of two numbers, one stored in location 2:3 and another stored in location 5:2, and then store the product back in location 2:3.
The CISC Approach
The primary goal of CISC architecture is to complete a task in as few lines of assembly as possible. This is achieved by building processor hardware that is capable of understanding and executing a series of operations. For this particular task, a CISC processor would come prepared with a specific instruction (we'll call it "MULT"). When executed, this instruction loads the two values into separate registers, multiplies the operands in the execution unit, and then stores the product in the appropriate register. Thus, the entire task of multiplying two numbers can be completed with one instruction:
MULT 2:3, 5:2
MULT is what is known as a "complex instruction." It operates directly on the computer's memory banks and does not require the programmer to explicitly call any loading or storing functions. It closely resembles a command in a higher-level language. For instance, if we let "a" represent the value of 2:3 and "b" represent the value of 5:2, then this command is identical to the C statement "a = a * b." One of the primary advantages of this system is that the compiler has to do very little work to translate a high-level language statement into assembly. Because the length of the code is relatively short, very little RAM is required to store instructions. The emphasis is put on building complex instructions directly into the hardware.
The RISC Approach
RISC processors only use simple instructions that can be executed within one clock cycle. Thus, the "MULT" command described above could be divided into three separate commands: "LOAD," which moves data from the memory bank to a register, "PROD," which finds the product of two operands located within the registers, and "STORE," which moves data from a register to the memory banks. In order to perform the exact series of steps described in the CISC approach, a programmer would need to code four lines of assembly:
LOAD A, 2:3
LOAD B, 5:2
PROD A, B
STORE 2:3, A
At first, this may seem like a much less efficient way of completing the operation. Because there are more lines of code, more RAM is needed to store the assembly-level instructions. The compiler must also perform more work to convert a high-level language statement into code of this form.
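To see why the longer RISC sequence is not necessarily slower, the cycle counts can be compared directly. The numbers in the following C snippet are made up purely for illustration (the text does not give actual cycle counts), but they show the trade-off the next paragraph describes:

#include <stdio.h>

/* Illustrative (made-up) cycle counts for the example above:
 * CISC: 1 instruction ("MULT") taking, say, 4 cycles.
 * RISC: 4 instructions (LOAD, LOAD, PROD, STORE) of 1 cycle each. */
int main(void)
{
    int cisc_instructions = 1, cisc_cycles_per_instruction = 4;
    int risc_instructions = 4, risc_cycles_per_instruction = 1;

    printf("CISC total cycles: %d\n",
           cisc_instructions * cisc_cycles_per_instruction);   /* 4 */
    printf("RISC total cycles: %d\n",
           risc_instructions * risc_cycles_per_instruction);   /* 4 */
    return 0;
}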
However, the RISC strategy also brings some very important advantages. Because each instruction requires only one clock cycle to execute, the entire program will execute in approximately the same amount of time as the multi-cycle "MULT" command. These RISC "reduced instructions" require fewer transistors of hardware space than the complex instructions, leaving more room for general-purpose registers. Because all of the instructions execute in a uniform amount of time (i.e. one clock), pipelining is possible. The two approaches can be summarized as follows:

CISC                                                  RISC
Emphasis on hardware                                  Emphasis on software
Includes multi-clock complex instructions             Single-clock, reduced instructions only
Memory-to-memory: "LOAD" and "STORE"                  Register-to-register: "LOAD" and "STORE"
incorporated in instructions                          are independent instructions
Small code sizes, high cycles per second              Low cycles per second, large code sizes
Transistors used for storing complex instructions     Spends more transistors on memory registers

Separating the "LOAD" and "STORE" instructions actually reduces the amount of work that the computer must perform. After a CISC-style "MULT" command is executed, the processor automatically erases the registers. If one of the operands needs to be used for another computation, the processor must re-load the data from the memory bank into a register. In RISC, the operand will remain in the register until another value is loaded in its place.
The Performance Equation
The following equation is commonly used for expressing a computer's performance ability:
time/program = (instructions/program) x (cycles/instruction) x (time/cycle)
The CISC approach attempts to minimize the number of instructions per program, sacrificing the number of cycles per instruction. RISC does the opposite, reducing the cycles per instruction at the cost of the number of instructions per program.
RISC Roadblocks
Despite the advantages of RISC-based processing, RISC chips took over a decade to gain a foothold in the commercial world. This was largely due to a lack of software support. Although Apple's Power Macintosh line featured RISC-based chips and Windows NT was RISC-compatible, Windows 3.1 and Windows 95 were designed with CISC processors in mind. Many companies were unwilling to take a chance with the emerging RISC technology. Without commercial interest, processor developers were unable to manufacture RISC chips in large enough volumes to make their price competitive. Another major setback was the presence of Intel. Although their CISC chips were becoming increasingly unwieldy and difficult to develop, Intel had the resources to plow through development and produce powerful processors. Although RISC chips might surpass Intel's efforts in specific areas, the differences were not great enough to persuade buyers to change technologies.
The Overall RISC Advantage
Today, the Intel x86 is arguably the only chip which retains CISC architecture. This is primarily due to advancements in other areas of computer technology. The price of RAM has decreased dramatically. In 1977, 1 MB of DRAM cost about $5,000. By 1994, the same amount of memory cost only $6 (when adjusted for inflation). Compiler technology has also become more sophisticated, so that the RISC use of RAM and emphasis on software has become ideal.
2. Write about the advantage of compiler complexity.
THE ADVANTAGE OF COMPILER COMPLEXITY OVER HARDWARE COMPLEXITY
While a VLIW architecture reduces hardware complexity over a superscalar implementation, a much more complex compiler is required.
Extracting maximum performance from a superscalar RISC or CISC implementation does require sophisticated compiler techniques, but the level of sophistication in a VLIW compiler is significantly higher. VLIW simply moves complexity from hardware into software. Luckily, this trade-off has a significant side benefit: the complexity is paid for only once, when the compiler is written, instead of every time a chip is fabricated. Among the possible benefits is a smaller chip, which leads to increased profits for the microprocessor vendor and/or cheaper prices for the customers that use the microprocessors. Complexity is usually easier to deal with in a software design than in a hardware design. Thus, the chip may cost less to design, be quicker to design, and may require less debugging, all of which are factors that can make the design cheaper. Also, improvements to the compiler can be made after chips have been fabricated; improvements to superscalar dispatch hardware require changes to the microprocessor, which naturally incurs all the expenses of turning a chip design.
PRACTICAL VLIW ARCHITECTURES AND IMPLEMENTATIONS
The simplest VLIW instruction format encodes an operation for every execution unit in the machine. This makes sense under the assumption that every instruction will always have something useful for every execution unit to do. Unfortunately, despite the best efforts of the best compiler algorithms, it is typically not possible to pack every instruction with work for all execution units. Also, in a VLIW machine that has both integer and floating-point execution units, the best compiler would not be able to keep the floating-point units busy during the execution of an integer-only application.
The problem with instructions that do not make full use of all execution units is that they waste precious processor resources: instruction memory space, instruction cache space, and bus bandwidth. There are at least two solutions to reducing the waste of resources due to sparse instructions. First, instructions can be compressed with a more highly encoded representation. Any number of techniques, such as Huffman encoding to allocate the fewest bits to the most frequently used operations, can be used. Second, it is possible to define an instruction word that encodes fewer operations than the number of available execution units. Imagine a VLIW machine with ten execution units but an instruction word that can describe only five operations. In this scheme, a unit number is encoded along with the operation; the unit number specifies to which execution unit the operation should be sent. The benefit is better utilization of resources. A potential problem is that the shorter instruction prohibits the machine from issuing the maximum possible number of operations at any one time. To prevent this problem from limiting performance, the size of the instruction word can be tuned based on analysis of simulations of program behavior. Of course, it is completely reasonable to combine these two techniques: use compression on shorter-than-maximum-length instructions.
3. Write about the historical perspective of VLIW.
HISTORICAL PERSPECTIVE
VLIW is not a new computer architecture. Horizontal microcode, a processor implementation technique in use for decades, defines a specialized, low-level VLIW architecture. This low-level architecture runs a microprogram that interprets (emulates) a higher-level (user-visible) instruction set.
The VLIW nature of the horizontal microinstructions is used to attain a high-performance interpretation of the high-level instruction set by executing several low-level steps concurrently. Each horizontal microcode instruction encodes many irregular, specialized operations that are directed at primitive logic blocks inside a processor. From the outside, the horizontally microcoded processor appears to be directly running the emulated instruction set. In the 1980s, a few small companies attempted to commercialize VLIW architectures in the general-purpose market. Unfortunately, they were ultimately unsuccessful. Multiflow is the most well known. Multiflow’s founders were academicians who did pioneering, fundamental research into VLIW compilation techniques. Multiflow’s computers worked, but the company was probably about a decade ahead of its time. The Multiflow machines, built from discrete parts, could not keep pace with the rapid advances in single-chip microprocessors. Using today’s technology, they would have a better chance at being competitive. In the early 1990s, Intel introduced the i860 RISC microprocessor. This simple chip had two modes of operation: a scalar mode and a VLIW mode. In the VLIW mode, the processor always fetched two instructions and assumed that one was an integer instruction and the other floating-point. A single program could switch (somewhat painfully) between the scalar and VLIW modes, thus implementing a crude form of code compression. Ultimately, the i860 failed in the market. The chip was positioned to compete with other general-purpose microprocessors for desktop computers, but it had compilers of insufficient quality to satisfy the needs of this market. Philips Semiconductors