					II BSC ECS
COMPUTER ARCHITECTURE AND ORGANIZATION
UNIT I MODERN COMPUTER ORGANIZATION
Introduction – Layers in modern computer - Computer organization – Main Memory – CPU Operation –
Computer types – System performance and measurement – High performance techniques – Booting sequence –
Computer design process – Computer structure – Computer Function – Architecture and Organization – CISC Vs
RISC
UNIT II PROCESSOR DESIGN AND DATA PATH
Introduction – Processor role – Processor design goals – Processor design process – Data path organization
– Main memory interface – Local storage register file – Data path simple instructions
UNIT III MEMORY DESIGN AND MANAGEMENT
Introduction – Memory parameters – Classification of memory – Memory Technology – Main memory
allocation – Static RAM IC – Dynamic RAM – ROM logic – Multiple memory decoding – Memory Hierarchy –
Cache memory – Principle of cache – Virtual memory Concept – Advantage of Virtual memory
UNIT IV COMPUTER PERIPHERALS
Introduction – Keyboard – CRT display monitor – Printer – Magnetic storage devices – Floppy disk drive –
Hard disk drive – Special types of disk drives – Mouse and Track ball – Modem – CD-ROM Drive – Scanner –
Digital Camera – DVD.
UNIT V ADVANCED SYSTEM ARCHITECTURE
Introduction – High performance computer architecture – RISC systems – Superscalar architecture – VLIW
architecture – EPIC architecture –Multiprocessor Systems
TEXT BOOK
1. Govindarajalu.B “Computer Architecture and Organization Design Principles and Applications” Tata
McGraw-Hill, 2006
B.Sc. Electronics & C. Sys. (Colleges-revised) 2010-11
Annexure No. 30 B
SCAA Dt.
PART-A
UNIT-1
MODERN COMPUTER ORGANIZATION
1. Arithmetic and logical operations are done in the CPU.
2. The abbreviation RISC stands for reduced instruction set computing.
3. Registers are the temporary memory areas used during data manipulation.
4. Computers use binary-based codes to represent information.
5. The maximum number of clock cycles per second, measured in MHz, is the speed.
6. The memory built into the processor is cache memory.
7. The abbreviation PGA stands for pin grid array.
8. ASCII is the standard code used for text characters.
9. The abbreviation ASCII stands for American Standard Code for Information Interchange.
10. The abbreviation DRAM stands for dynamic random access memory.
11. A microprocessor is an IC that contains the CPU on a single chip.
12. The start-up process of a computer is called booting.
13. AT and ATX stand for Advanced Technology and Advanced Technology Extended.
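The statements about binary codes and ASCII can be illustrated with a minimal Python sketch. The helper name `ascii_bits` is hypothetical and chosen only for this example; it shows the 8-bit binary pattern a computer would store for each text character.

```python
def ascii_bits(text: str) -> list[str]:
    """Return the 8-bit binary string of each character's ASCII code."""
    codes = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    return [format(code, "08b") for code in codes]


# Print each character with its decimal ASCII code and binary pattern.
for char, bits in zip("CPU", ascii_bits("CPU")):
    print(char, ord(char), bits)
```

Each character maps to one code, and each code is stored as a pattern of binary digits.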
PART-A
UNIT-2
PROCESSOR DESIGN AND DATAPATH
1. An IC is an electronic device that contains resistors, capacitors and transistors.
2. Processors are divided into 3 stages: CU, CPU, ALU.
3. The system crystal determines the speed of the CPU.
4. The three key measures of CPU performance are speed, address bus and data bus.
5. The Pentium MMX is designed for multimedia applications.
6. Today's standard P3 processor runs at speeds up to 500 MHz.
7. The Pentium processor uses multithreading and RISC technology.
8. The P3 processor uses streamlined coding and advanced cache technology.
9. The name PowerPC stands for Performance Optimization With Enhanced RISC - Performance Computing.
10. LIF and ZIF stand for low insertion force and zero insertion force.
11. When handling a CPU, watch for electrostatic discharge and potential pin damage.
12. EMI and RFI stand for electromagnetic interference and radio frequency interference.
PART-A
UNIT—3
MEMORY DESIGN AND MANAGEMENT
1. The types of memory are primary and secondary memory.
2. The abbreviation BIOS stands for basic input/output system.
3. Another name for the BIOS is firmware.
4. PCI stands for Peripheral Component Interconnect.
5. POST refers to power-on self-test.
6. ROM is a nonvolatile memory.
7. DRAM uses microscopic transistors and capacitors to store data bits.
8. SIPP stands for single in-line pin package.
9. SIMM stands for single in-line memory module.
10. SRAM uses flip-flops to store data bits.
11. The original onboard cache is known as internal cache.
12. The UMA (upper memory area) ranges from 640 KB to 1024 KB.
13. The function of shadow RAM is to copy the contents of the ROM BIOS into RAM.
14. The MEM command reports the amount and type of memory available.
PART—A
UNIT—4
COMPUTER PERIPHERALS
1. The mouse is an input device for a graphical user interface.
2. The term MODEM stands for modulator and demodulator.
3. The round device on which the letters are fixed is the track ball.
4. The types of keyboard are membrane, capacitive and mechanical keyboards.
5. The device used to get pictorial information into the CPU is the scanner.
6. The capacity of a hard disk is given by its CHS (cylinder, head, sector) values.
7. SCSI stands for Small Computer System Interface.
8. An HDD sector holds 512 bytes.
9. The 3.5-inch floppy is the industry standard.
10. FDD parameters are stored in CMOS.
11. The term LPT stands for line printer terminal.
12. Horizontal printing is known as landscape.
13. The resolution of a printer is measured in dots per inch (dpi).
14. CRT stands for cathode ray tube, and it works as an output device.
15. The scanner converts photographic information into digital information.
16. The input devices are mouse, keyboard, joystick, microphone, scanner and CD-ROM drive.
17. The output devices are printer, monitor, plotter and speaker.
18. The devices used for both input and output are HDD, FDD, modem and tape drive.
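The statements that hard disk capacity is given by CHS values and that a sector holds 512 bytes can be combined into a short calculation. The sketch below is only an illustration: the function name is hypothetical, and the two geometries used (1024 cylinders, 16 heads, 63 sectors per track for a hard disk; 80 cylinders, 2 heads, 18 sectors per track for a 3.5-inch floppy) are assumed example values, not figures taken from these notes.

```python
SECTOR_BYTES = 512  # sector size stated in the notes above


def chs_capacity_bytes(cylinders: int, heads: int, sectors_per_track: int) -> int:
    """Capacity = cylinders x heads x sectors per track x bytes per sector."""
    return cylinders * heads * sectors_per_track * SECTOR_BYTES


# Assumed example geometries (not from the notes).
hdd = chs_capacity_bytes(1024, 16, 63)
floppy = chs_capacity_bytes(80, 2, 18)
print("HDD:", hdd, "bytes =", hdd / 2**20, "MiB")
print("Floppy:", floppy, "bytes =", floppy / 1024, "KiB")
```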
PART—A
UNIT—5
ADVANCED SYSTEM ARCHITECTURE
1. The abbreviation RISC stands for reduced instruction set computing.
2. A system that runs more than one process at a time is called a multiprocessor.
3. The term CISC stands for complex instruction set computing.
4. Superscalar technology uses two instruction pipelines, U and V.
5. The abbreviation FRC stands for functional redundancy check.
6. The abbreviation DCL stands for data conversion logic.
7. In an 8088-based system, the RAM logic contains 4 banks of 9 chips.
8. The LOCK signal is used to keep other bus masters off the system bus.
9. The term DVD refers to digital versatile disc.
10. Hardware and software are collectively called firmware.
PART-B
UNIT 1
MODERN COMPUTER ORGANIZATION
1. Define computer.
A computer is a machine that can be programmed to manipulate symbols. Its principal characteristics
are:
It responds to a specific set of instructions in a well-defined manner.
It can execute a prerecorded list of instructions (a program).
It can quickly store and retrieve large amounts of data.
Therefore computers can perform complex and repetitive procedures quickly, precisely and reliably.
Modern computers are electronic and digital. The actual machinery (wires, transistors, and circuits) is
called hardware; the instructions and data are called software. All general-purpose computers require
the following hardware components:
Central processing unit (CPU): The heart of the computer, this is the component that actually
executes instructions organized in programs ("software") which tell the computer what to do.
Memory (fast, expensive, short-term memory): Enables a computer to store, at least
temporarily, data, programs, and intermediate results.
Mass storage device (slower, cheaper, long-term memory): Allows a computer to permanently
retain large amounts of data and programs between jobs. Common mass storage devices
include disk drives and tape drives.
Input device: Usually a keyboard and mouse, the input device is the conduit through which
data and instructions enter a computer.
Output device: A display screen, printer, or other device that lets you see what the computer
has accomplished.
In addition to these components, many others make it possible for the basic components to work
together efficiently. For example, every computer requires a bus that transmits data from one part of
the computer to another.
2. Write about notebook computer.
Notebook computer
An extremely lightweight personal computer. Notebook computers typically weigh less than 6 pounds
and are small enough to fit easily in a briefcase. Aside from size, the principal difference between a
notebook computer and a personal computer is the display screen. Notebook computers use a variety
of techniques, known as flat-panel technologies, to produce a lightweight and non-bulky display
screen. The quality of notebook display screens varies considerably. In terms of computing power,
modern notebook computers are nearly equivalent to personal computers. They have the same CPUs,
memory capacity, and disk drives. However, all this power in a small package is expensive. Notebook
computers cost about twice as much as equivalent regular-sized computers. Notebook computers
come with battery packs that enable you to run them without plugging them in. However, the
batteries need to be recharged every few hours.
3. Define workstation computer.
It is a type of computer used for engineering applications (CAD/CAM), desktop publishing, software
development, and other types of applications that require a moderate amount of computing power
and relatively high quality graphics capabilities. Workstations generally come with a large, high-resolution graphics screen, a large amount of RAM, built-in network support, and a graphical user
interface. Most workstations also have a mass storage device such as a disk drive, but a special type of
workstation, called a diskless workstation, comes without a disk drive. The most common operating
systems for workstations are UNIX and Windows NT. Like personal computers, most workstations are
single-user computers. However, workstations are typically linked together to form a local-area
network, although they can also be used as stand-alone systems.
N.B.: In networking, workstation refers to any computer connected to a local-area network. It could
be a workstation or a personal computer.
4. Write short notes on desktop computer.
Desktop model
A computer designed to fit comfortably on top of a desk, typically with the monitor sitting on top of
the computer. Desktop model computers are broad and low, whereas tower model computers are
narrow and tall. Because of their shape, desktop model computers are generally limited to three
internal mass storage devices. Desktop models designed to be very small are sometimes referred to as
slim line models.
5. Write about palmtop computer.
A small computer that literally fits in your palm. Compared to full-size computers, palmtops
are severely limited, but they are practical for certain functions such as phone books and calendars.
Palmtops that use a pen rather than a keyboard for input are often called hand-held computers or
PDAs. Because of their small size, most palmtop computers do not include disk drives. However, many
contain PCMCIA slots in which you can insert disk drives, modems, memory, and other devices.
Palmtops are also called PDAs, hand-held computers and pocket computers.
6. What are the requirements of a computer?
All general-purpose computers require the following hardware components:
Central processing unit (CPU): The heart of the computer, this is the component that actually
executes instructions organized in programs ("software") which tell the computer what to do.
Memory (fast, expensive, short-term memory): Enables a computer to store, at least
temporarily, data, programs, and intermediate results.
Mass storage device (slower, cheaper, long-term memory): Allows a computer to permanently
retain large amounts of data and programs between jobs. Common mass storage devices
include disk drives and tape drives.
Input device: Usually a keyboard and mouse, the input device is the conduit through which
data and instructions enter a computer.
Output device: A display screen, printer, or other device that lets you see what the computer
has accomplished.
PART-C
UNIT-1
MODERN COMPUTER ORGANIZATION
1. Write about Supercomputer and Mainframe.
Supercomputer is a broad term for one of the fastest computers currently available. Supercomputers
are very expensive and are employed for specialized applications that require immense amounts of
mathematical calculations (number crunching). For example, weather forecasting requires a
supercomputer. Other uses of supercomputers include scientific simulations, (animated) graphics, fluid
dynamic calculations, nuclear energy research, electronic design, and analysis of geological data (e.g.
in petrochemical prospecting). Perhaps the best known supercomputer manufacturer is Cray
Research.
Mainframe was a term originally referring to the cabinet containing the central processor unit or
"main frame" of a room-filling Stone Age batch machine. After the emergence of smaller
"minicomputer" designs in the early 1970s, the traditional big iron machines were described as
"mainframe computers" and eventually just as mainframes. Nowadays a Mainframe is a very large
and expensive computer capable of supporting hundreds, or even thousands, of users simultaneously.
The chief difference between a supercomputer and a mainframe is that a supercomputer channels all
its power into executing a few programs as fast as possible, whereas a mainframe uses its power to
execute many programs concurrently. In some ways, mainframes are more powerful than
supercomputers because they support more simultaneous programs. But supercomputers can execute
a single program faster than a mainframe. The distinction between small mainframes and
minicomputers is vague, depending really on how the manufacturer wants to market its machines.
2. Write about CPU in detail.
Central processing unit
[Figure: Die of an Intel 80486DX2 microprocessor (actual size: 12×6.75 mm) in its packaging.]
A central processing unit (CPU) or processor is an electronic circuit that can execute computer
programs. This term has been in use in the computer industry at least since the early 1960s
(Weik 1961). The form, design and implementation of CPUs have changed dramatically since
the earliest examples, but their fundamental operation has remained much the same.
Early CPUs were custom-designed as a part of a larger, sometimes one-of-a-kind, computer.
However, this costly method of designing custom CPUs for a particular application has largely
given way to the development of mass-produced processors that are made for one or many
purposes. This standardization trend generally began in the era of discrete transistor mainframes
and minicomputers and has rapidly accelerated with the popularization of the integrated circuit
(IC). The IC has allowed increasingly complex CPUs to be designed and manufactured to
tolerances on the order of nanometers. Both the miniaturization and standardization of CPUs
have increased the presence of these digital devices in modern life far beyond the limited
application of dedicated computing machines. Modern microprocessors appear in everything
from automobiles to cell phones to children's toys.
[Figure: EDVAC, one of the first electronic stored-program computers.]
Prior to the advent of machines that resemble today's CPUs, computers such as the ENIAC had
to be physically rewired in order to perform different tasks. These machines are often referred to
as "fixed-program computers," since they had to be physically reconfigured in order to run a
different program. Since the term "CPU" is generally defined as a software (computer program)
execution device, the earliest devices that could rightly be called CPUs came with the advent of
the stored-program computer.
The idea of a stored-program computer was already present in the design of J. Presper Eckert and
John William Mauchly's ENIAC, but was initially omitted so the machine could be finished
sooner. On June 30, 1945, before ENIAC was even completed, mathematician John von
Neumann distributed the paper entitled "First Draft of a Report on the EDVAC." It outlined the
design of a stored-program computer that would eventually be completed in August 1949 (von
Neumann 1945). EDVAC was designed to perform a certain number of instructions (or
operations) of various types. These instructions could be combined to create useful programs for
the EDVAC to run. Significantly, the programs written for EDVAC were stored in high-speed
computer memory rather than specified by the physical wiring of the computer. This overcame a
severe limitation of ENIAC, which was the large amount of time and effort it took to reconfigure
the computer to perform a new task. With von Neumann's design, the program, or software, that
EDVAC ran could be changed simply by changing the contents of the computer's memory. [1]
While von Neumann is most often credited with the design of the stored-program computer
because of his design of EDVAC, others before him such as Konrad Zuse had suggested and
implemented similar ideas. Additionally, the so-called Harvard architecture of the Harvard Mark
I, which was completed before EDVAC, also utilized a stored-program design using punched
paper tape rather than electronic memory. The key difference between the von Neumann and
Harvard architectures is that the latter separates the storage and treatment of CPU instructions
and data, while the former uses the same memory space for both. Most modern CPUs are
primarily von Neumann in design, but elements of the Harvard architecture are commonly seen
as well.
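The stored-program idea described above, where instructions and data share one memory and the program is changed by changing memory contents, can be sketched with a toy machine. Everything in this example is hypothetical: the three opcodes, the two-cell instruction encoding and the memory layout are invented for illustration and do not describe EDVAC or any real CPU.

```python
# Hypothetical opcodes for a toy accumulator machine.
HALT, LOAD, ADD = 0, 1, 2


def run(memory: list[int]) -> int:
    """Execute the program stored in memory and return the accumulator.

    Each instruction occupies two cells: [opcode, operand address].
    Instructions and data live in the same list (von Neumann style).
    """
    acc, pc = 0, 0
    while True:
        opcode, operand = memory[pc], memory[pc + 1]
        pc += 2
        if opcode == LOAD:
            acc = memory[operand]
        elif opcode == ADD:
            acc += memory[operand]
        elif opcode == HALT:
            return acc
        else:
            raise ValueError(f"unknown opcode {opcode}")


# Cells 0-5 hold the program, cells 6-7 hold the data.
memory = [LOAD, 6, ADD, 7, HALT, 0, 40, 2]
print(run(memory))  # LOAD [6]; ADD [7]; HALT

# Changing one memory cell changes the program: it now adds cell 6 to itself.
memory[3] = 6
print(run(memory))
```

No rewiring is involved: the second run behaves differently only because a memory cell was overwritten. A Harvard-style variant would keep the program and the data in two separate lists.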
Being digital devices, all CPUs deal with discrete states and therefore require some kind of
switching elements to differentiate between and change these states. Prior to commercial
acceptance of the transistor, electrical relays and vacuum tubes (thermionic valves) were
commonly used as switching elements. Although these had distinct speed advantages over
earlier, purely mechanical designs, they were unreliable for various reasons. For example,
building direct current sequential logic circuits out of relays requires additional hardware to cope
with the problem of contact bounce. While vacuum tubes do not suffer from contact bounce, they
must heat up before becoming fully operational and eventually stop functioning altogether.[2]
Usually, when a tube failed, the CPU would have to be diagnosed to locate the failing component
so it could be replaced. Therefore, early electronic (vacuum tube based) computers were
generally faster but less reliable than electromechanical (relay based) computers.
3. Write about clock rate of CPU.
Clock rate
Most CPUs, and indeed most sequential logic devices, are synchronous in nature.[8] That is, they
are designed and operate on assumptions about a synchronization signal. This signal, known as a
clock signal, usually takes the form of a periodic square wave. By calculating the maximum time
that electrical signals can move in various branches of a CPU's many circuits, the designers can
select an appropriate period for the clock signal.
This period must be longer than the amount of time it takes for a signal to move, or propagate, in
the worst-case scenario. In setting the clock period to a value well above the worst-case
propagation delay, it is possible to design the entire CPU and the way it moves data around the
"edges" of the rising and falling clock signal. This has the advantage of simplifying the CPU
significantly, both from a design perspective and a component-count perspective. However, it
also carries the disadvantage that the entire CPU must wait on its slowest elements, even though
some portions of it are much faster. This limitation has largely been compensated for by various
methods of increasing CPU parallelism.
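The rule that the clock period must exceed the worst-case propagation delay can be sketched as follows. The function name, the optional safety margin and the path delays are all assumptions made for illustration; real timing analysis also accounts for effects such as setup time and clock skew.

```python
def max_clock_hz(path_delays_ns: list[float], margin: float = 0.0) -> float:
    """Highest clock frequency whose period covers the slowest signal path.

    margin is an extra safety fraction added on top of the worst-case delay.
    """
    worst_ns = max(path_delays_ns)          # the slowest element sets the pace
    period_ns = worst_ns * (1.0 + margin)   # clock period must exceed this delay
    return 1e9 / period_ns                  # convert a period in ns to Hz


# Hypothetical propagation delays of four circuit paths, in nanoseconds.
delays_ns = [1.2, 2.5, 4.0, 0.8]
print(max_clock_hz(delays_ns) / 1e6, "MHz without margin")
print(max_clock_hz(delays_ns, margin=0.25) / 1e6, "MHz with 25% margin")
```

The three faster paths do not help: the whole CPU waits on the 4.0 ns path, which is the disadvantage the text describes.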
However, architectural improvements alone do not solve all of the drawbacks of globally
synchronous CPUs. For example, a clock signal is subject to the delays of any other electrical
signal. Higher clock rates in increasingly complex CPUs make it more difficult to keep the clock
signal in phase (synchronized) throughout the entire unit. This has led many modern CPUs to
require multiple identical clock signals to be provided in order to avoid delaying a single signal
significantly enough to cause the CPU to malfunction. Another major issue as clock rates
increase dramatically is the amount of heat that is dissipated by the CPU. The constantly
changing clock causes many components to switch regardless of whether they are being used at
that time. In general, a component that is switching uses more energy than an element in a static
state. Therefore, as clock rate increases, so does heat dissipation, causing the CPU to require
more effective cooling solutions.
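The link between clock rate and heat can be approximated with the first-order CMOS dynamic power model, P = a × C × V² × f. This formula is not stated in these notes; it is a common textbook approximation, and every number below is hypothetical.

```python
def dynamic_power_watts(activity: float, capacitance_farads: float,
                        voltage: float, clock_hz: float) -> float:
    """First-order switching power: activity x capacitance x voltage^2 x frequency.

    activity is the fraction of the capacitance that switches each cycle.
    """
    return activity * capacitance_farads * voltage ** 2 * clock_hz


# Hypothetical chip parameters, used only to show the trend.
base = dynamic_power_watts(0.2, 1e-9, 1.2, 1e9)
doubled = dynamic_power_watts(0.2, 1e-9, 1.2, 2e9)
print(base, "W at 1 GHz;", doubled, "W at 2 GHz")
```

In this model, doubling the clock rate doubles the switching power, while lowering the activity factor (fewer components switching per cycle) reduces it.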
One method of dealing with the switching of unneeded components is called clock gating, which
involves turning off the clock signal to unneeded components (effectively disabling them).
However, this is often regarded as difficult to implement and therefore does not see common
usage outside of very low-power designs.[9] Another method of addressing some of the problems
with a global clock signal is the removal of the clock signal altogether. While removing the
global clock signal makes the design process considerably more complex in many ways,
asynchronous (or clockless) designs carry marked advantages in power consumption and heat
dissipation in comparison with similar synchronous designs. While somewhat uncommon, entire
asynchronous CPUs have been built without utilizing a global clock signal. Two notable
examples of this are the ARM compliant AMULET and the MIPS R3000 compatible MiniMIPS.
Rather than totally removing the clock signal, some CPU designs allow certain portions of the
device to be asynchronous, such as using asynchronous ALUs in conjunction with superscalar
pipelining to achieve some arithmetic performance gains. While it is not altogether clear whether
totally asynchronous designs can perform at a comparable or better level than their synchronous
counterparts, it is evident that they do at least excel in simpler math operations. This, combined
with their excellent power consumption and heat dissipation properties, makes them very
suitable for embedded computers (Garside et al. 1999).
4. Write about the design and implementation of CPU.
Design and implementation
Integer range
The way a CPU represents numbers is a design choice that affects the most basic ways in which
the device functions. Some early digital computers used an electrical model of the common
decimal (base ten) numeral system to represent numbers internally. A few other computers have
used more exotic numeral systems like ternary (base three). Nearly all modern CPUs represent
numbers in binary form, with each digit being represented by some two-valued physical quantity
such as a "high" or "low" voltage.[6]
[Figure: MOS 6502 microprocessor in a dual in-line package, an extremely popular 8-bit design.]
Related to number representation is the size and precision of numbers that a CPU can represent.
In the case of a binary CPU, a bit refers to one significant place in the numbers a CPU deals
with. The number of bits (or numeral places) a CPU uses to represent numbers is often called
"word size", "bit width", "data path width", or "integer precision" when dealing with strictly
integer numbers (as opposed to floating point). This number differs between architectures, and
often within different parts of the very same CPU. For example, an 8-bit CPU deals with a range
of numbers that can be represented by eight binary digits (each digit having two possible values),
that is, 2^8 or 256 discrete numbers. In effect, integer size sets a hardware limit on the range of
integers the software run by the CPU can utilize.[7]
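The relationship between bit width and integer range can be checked with a few lines of Python. The helper names are hypothetical, and the wrap-around behaviour shown is an assumption: it is how fixed-width unsigned arithmetic commonly behaves, but the notes do not specify it.

```python
def unsigned_range(bits: int) -> tuple[int, int]:
    """Smallest and largest unsigned integers representable in the given width."""
    return 0, 2 ** bits - 1


def wrap_unsigned(value: int, bits: int) -> int:
    """Keep only the low-order bits, as a fixed-width register would."""
    return value % (2 ** bits)


low, high = unsigned_range(8)
print("8-bit range:", low, "to", high, "=", high - low + 1, "values")
print("255 + 1 in 8 bits wraps to", wrap_unsigned(255 + 1, 8))
```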
Integer range can also affect the number of locations in memory the CPU can address (locate).
For example, if a binary CPU uses 32 bits to represent a memory address, and each memory
address represents one octet (8 bits), the maximum quantity of memory that CPU can address is
2^32 octets, or 4 GiB. This is a very simple view of CPU address space, and many designs use
more complex addressing methods like paging in order to locate more memory than their integer
range would allow with a flat address space.
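The flat-address-space arithmetic above can be reproduced directly. This is a minimal sketch with a hypothetical function name; it ignores paging and other addressing schemes mentioned in the text.

```python
GIB = 2 ** 30  # one gibibyte in bytes


def addressable_bytes(address_bits: int, bytes_per_address: int = 1) -> int:
    """Memory reachable with a flat address space of the given width."""
    return 2 ** address_bits * bytes_per_address


print(addressable_bytes(32) // GIB, "GiB with 32-bit addresses")
print(addressable_bytes(16), "bytes with 16-bit addresses")
```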
Higher levels of integer range require more structures to deal with the additional digits, and
therefore more complexity, size, power usage, and general expense. It is not at all uncommon,
therefore, to see 4- or 8-bit microcontrollers used in modern applications, even though CPUs
with much higher range (such as 16, 32, 64, even 128-bit) are available. The simpler
microcontrollers are usually cheaper, use less power, and therefore dissipate less heat, all of
which can be major design considerations for electronic devices. However, in higher-end
applications, the benefits afforded by the extra range (most often the additional address space)
are more significant and often affect design choices. To gain some of the advantages afforded by
both lower and higher bit lengths, many CPUs are designed with different bit widths for different
portions of the device. For example, the IBM System/370 used a CPU that was primarily 32 bit,
but it used 128-bit precision inside its floating point units to facilitate greater accuracy and range
in floating point numbers (Amdahl et al. 1964). Many later CPU designs use similar mixed bit
width, especially when the processor is meant for general-purpose usage where a reasonable
balance of integer and floating point capability is required.
================================
PART-B
UNIT-2
PROCESSOR DESIGN AND DATA PATH
1. Write about processor design goals.
The first CPUs were designed to do mathematical calculations faster and more reliably than
human computers.
Each successive generation of CPU might be designed to achieve some of these goals:
higher performance levels of a single program or thread
higher throughput levels of multiple programs/threads
less power consumption for the same performance level
lower cost for the same performance level
greater connectivity to build larger, more parallel systems
more specialization to aid in specific targeted markets
Re-designing a CPU core to a smaller die-area helps achieve several of these goals.
Shrinking everything (a "photomask shrink"), resulting in the same number of transistors on a
smaller die, improves performance (smaller transistors switch faster), reduces power (smaller
wires have less parasitic capacitance) and reduces cost (more CPUs fit on the same wafer of
silicon).
Releasing a CPU on the same size die, but with a smaller CPU core, keeps the cost about the
same but allows higher levels of integration within one VLSI chip (additional cache, multiple
CPUs, or other components), improving performance and reducing overall system cost.
2. Write about the basic architecture of a computer.
Basic Architecture of a Modern Computer/Network:
Abstraction Layers
1. When the machine powers up, tells the central processing unit (CPU)
to check memory, etc., and where to go to find how to "boot up".
2. Controls access to almost all reads (sensing the keyboard, disk drive, memory,
or other inputs) and writes (to memory, printer, screen, speakers) through
the CPU, which actually processes the data stream. Also includes the filing
system, e.g. where you locate your documents (from papers to music and
images), applications, and the like.
a. The very basic interface between hardware and software, where the
computer "converses" with all peripheral devices, as well as hard drives,
video/sound cards, etc.
b. For Windows (up to 2000) and MacOS (up to 9.2), a patched, cobbled-in way
of using Internet-standard communications protocols, such as TCP/IP and
Ethernet (and its descendants). For Unix and its variants (such as MacOS X
and Linux), communications are now embedded in the OS.
3. [One hopes!] A set of knowable "sockets" into which data to and from
applications can be fed, and through which a keystroke or other data input
is handled by the OS and CPU. Can be open and publicly known, but is
often internal corporate, proprietary (and thus secret) information:
a de facto set of "standards".
4. The applications with which you're familiar, for example Netscape, Mulberry,
Word, WinAmp, etc.; indeed, the operating environment in which you probably
spend most of your time.
5. The applications and systems that allow communication and integration among
separate machines (caveat: Unix does this implicitly) for high-level, often Net-based
data-handling. In theory, these are independent of the specific hardware and
software of any PC: they are "cross-platform."
6. Networked processing, with the ability to hand off processing tasks to any
CPU able to perform the tasks requested. Potentially a very rich level, where
individual processors are able to negotiate with others, and perhaps develop
their own practices of deference to each other.
The Architecture of a Modern Personal Computer:
Basic Hardware Configuration
[Figure: block diagram of the basic hardware configuration. Recoverable labels: input-output bus, printer driver, "read-only" devices, "write-only" devices, video card/"rasterizer".]
3. Write about the high end processor economics.
High-end processor economics
Developing new, high-end CPUs is a very costly proposition. Both the logical complexity
(needing very large logic design and logic verification teams and simulation farms with perhaps
thousands of computers) and the high operating frequencies (needing large circuit design teams
and access to the state-of-the-art fabrication process) account for the high cost of design for this
type of chip. The design cost of a high-end CPU will be on the order of US $100 million. Since
the design of such high-end chips nominally takes about five years to complete, to stay
competitive a company has to fund at least two of these large design teams to release products at
the rate of 2.5 years per product generation.
As an example, the typical loaded cost for one computer engineer is often quoted as US $250,000
per year. This includes salary, benefits, CAD tools, computers, office space rent, etc.
Assume that 100 engineers are needed to design a CPU and that the project takes 4 years:
Total cost = $250,000 per engineer-year × 100 engineers × 4 years = US $100,000,000.
The above amount is just an example. The design teams for modern day general purpose CPUs
have several hundred team members.
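The cost estimate above is a simple product, which the following sketch reproduces. The function name is hypothetical; the default loaded cost, team size and duration are the example figures from the text, and the 300-engineer case is an assumed value standing in for "several hundred team members".

```python
def design_cost_usd(engineers: int, years: float,
                    loaded_cost_per_engineer_year: float = 250_000) -> float:
    """Total design cost = engineers x years x loaded cost per engineer-year."""
    return engineers * years * loaded_cost_per_engineer_year


print(f"Example from the text: ${design_cost_usd(100, 4):,.0f}")
# Assumed team size, for illustration only.
print(f"Hypothetical 300-engineer team: ${design_cost_usd(300, 4):,.0f}")
```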
Only the personal computer mass market (with production rates in the hundreds of millions, producing
billions of dollars in revenue) can support such large design and implementation teams. As
of 2004, only four companies were actively designing and fabricating state-of-the-art general purpose
computing CPU chips: Intel, AMD, IBM and Fujitsu. Motorola has spun off its semiconductor
division as Freescale, as that division was dragging down profit margins for the rest of the company.
Texas Instruments, TSMC and Toshiba are a few examples of companies that manufacture
another company's CPU chip designs.
4. Write about general purpose computing.
General purpose computing
The vast majority of revenue generated from CPU sales comes from general purpose computing, that
is, desktop, laptop and server computers commonly used in businesses and homes. In this
market, the Intel IA-32 architecture dominates, with its rivals PowerPC and SPARC maintaining
much smaller customer bases. Yearly, hundreds of millions of IA-32 architecture CPUs are used
by this market.
Since these devices are used to run countless different types of programs, these CPU designs are not
specifically targeted at one type of application or one function. The demands of being able to run a wide
range of programs efficiently have made these CPU designs among the more technically advanced, with
the disadvantages of being relatively costly and having high power consumption.
===============================================================
PART-C
UNIT -2
PROCESSOR DESIGN AND DATA PATH
1. What do you mean by system performance analysis?
Performance analysis
Because there are too many programs to test a CPU's speed on all of them, benchmarks were
developed. The most famous benchmarks are the SPECint and SPECfp benchmarks developed
by Standard Performance Evaluation Corporation and the ConsumerMark benchmark developed
by the Embedded Microprocessor Benchmark Consortium EEMBC.
Some important measurements include:
Instructions per second - Most consumers pick a computer architecture (normally Intel IA-32
architecture) to be able to run a large base of pre-existing pre-compiled software. Being
relatively uninformed on computer benchmarks, some of them pick a particular CPU based on
operating frequency (see Megahertz Myth).
FLOPS - The number of floating point operations per second is often important in selecting
computers for scientific computations.
Performance per watt - System designers building parallel computers, such as Google, pick CPUs
based on their speed per watt of power, because the cost of powering the CPU outweighs the
cost of the CPU itself. [1][2]
Some system designers building parallel computers pick CPUs based on the speed per dollar.
System designers building real-time computing systems want to guarantee worst-case response.
That is easier to do when the CPU has low interrupt latency and when it has deterministic
response. (DSP)
Computer programmers who program directly in assembly language want a CPU to support a
full featured instruction set.
Low power - For systems with limited power sources (e.g. solar, batteries, human power).
Small size or low weight - for portable embedded systems, systems for spacecraft.
Environmental impact - Minimizing environmental impact of computers during manufacturing
and recycling as well as during use. Reducing waste, reducing hazardous materials.
2. Write about CPU design.
CPU design
CPU design focuses on these areas:
1. Datapaths (such as ALUs and pipelines)
2. Control unit: logic which controls the datapaths
3. Memory components such as register files, caches
4. Clock circuitry such as clock drivers, PLLs, clock distribution networks
5. Pad transceiver circuitry
6. Logic gate cell library which is used to implement the logic
CPUs designed for high-performance markets might require custom designs for each of these
items to achieve frequency, power-dissipation, and chip-area goals.
CPUs designed for lower performance markets might lessen the implementation burden by:
Acquiring some of these items by purchasing them as intellectual property
Using control logic implementation techniques (logic synthesis using CAD tools) to implement the
other components - datapaths, register files, clocks
Common logic styles used in CPU design include:
Unstructured random logic
Finite-state machines
Microprogramming (common from 1965 to 1985, no longer common except for CISC CPUs)
Programmable logic array (common in the 1980s, no longer common)
Device types used to implement the logic include:
Transistor-transistor logic Small Scale Integration jelly-bean logic chips - no longer used for CPUs
Programmable Array Logic and Programmable logic devices - no longer used for CPUs
Emitter-coupled logic (ECL) gate arrays - no longer common
CMOS gate arrays - no longer used for CPUs
CMOS ASICs - what's commonly used today, they're so common that the term ASIC is not used
for CPUs
Field-programmable gate arrays (FPGA) - common for soft microprocessors, and more or less
required for reconfigurable computing
A CPU design project generally has these major tasks:
Programmer-visible instruction set architecture, which can be implemented by a variety of
microarchitectures
Architectural study and performance modeling in ANSI C/C++ or SystemC
High-level synthesis (HLS) or RTL (e.g. logic) implementation
RTL Verification
Circuit design of speed critical components (caches, registers, ALUs)
Logic synthesis or logic-gate-level design
Timing analysis to confirm that all logic and circuits will run at the specified operating frequency
Physical design including floorplanning, place and route of logic gates
Checking that RTL, gate-level, transistor-level and physical-level representations are equivalent
Checks for signal integrity, chip manufacturability
As with most complex electronic designs, the logic verification effort (proving that the design
does not have bugs) now dominates the project schedule of a CPU.
Key CPU architectural innovations include index register, cache, virtual memory, instruction
pipelining, superscalar, CISC, RISC, virtual machine, emulators, microprogram, and stack.
3. Write the role of the processor in the CPU.
The processor plays a significant role in the following important aspects of your computer
system:
Performance: The processor is probably the most important single determinant of system
performance in the PC. While other components also play a key role in determining
performance, the processor's capabilities dictate the maximum performance of a system. The
other devices only allow the processor to reach its full potential.
Software Support: Newer, faster processors enable the use of the latest software. In addition,
new processors such as the Pentium with MMX Technology, enable the use of specialized
software not usable on earlier machines.
Reliability and Stability: The quality of the processor is one factor that determines how reliably
your system will run. While most processors are very dependable, some are not. This also
depends to some extent on the age of the processor and how much energy it consumes.
Energy Consumption and Cooling: Originally processors consumed relatively little power
compared to other system devices. Newer processors can consume a great deal of power.
Power consumption has an impact on everything from cooling method selection to overall
system reliability.
Motherboard Support: The processor you decide to use in your system will be a major
determining factor in what sort of chipset you must use, and hence what motherboard you buy.
The motherboard in turn dictates many facets of your system's capabilities and performance.
PART—B
UNIT—3
MEMORY DESIGN AND MANAGEMENT
1. Write about the structure of the cache memory.
Structure
Cache row entries usually have the following structure:
Data blocks | Valid bit | Tag | Index | Displacement
The data blocks contain the actual data fetched from the main memory. The memory address is split
(MSB to LSB) into the tag, the index and the displacement (offset), while the valid bit denotes that this
particular entry has valid data. The index length is log2(cache_rows) bits and describes which row
the data has been put in. The displacement length is log2(data_blocks) bits and specifies which block
of the ones we have stored we need. The tag length is address − index − displacement bits; the tag
contains the most significant bits of the address, which are checked against the current row (the row has
been retrieved by index) to see if it is the one we need or another, irrelevant memory location that
happened to have the same index bits as the one we want.
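The field lengths described above can be worked through in a short Python sketch. The function names and the example cache geometry (32-bit addresses, 256 rows, 16 data blocks per row) are assumptions chosen for illustration, and both counts are assumed to be powers of two.

```python
def field_widths(address_bits, cache_rows, data_blocks):
    """Return (tag_bits, index_bits, displacement_bits).

    Assumes cache_rows and data_blocks are powers of two.
    """
    index_bits = (cache_rows - 1).bit_length()          # log2(cache_rows)
    displacement_bits = (data_blocks - 1).bit_length()  # log2(data_blocks)
    tag_bits = address_bits - index_bits - displacement_bits
    return tag_bits, index_bits, displacement_bits


def split_address(address, address_bits, cache_rows, data_blocks):
    """Split an address (MSB to LSB) into (tag, index, displacement)."""
    _, index_bits, displacement_bits = field_widths(
        address_bits, cache_rows, data_blocks
    )
    displacement = address & (data_blocks - 1)
    index = (address >> displacement_bits) & (cache_rows - 1)
    tag = address >> (displacement_bits + index_bits)
    return tag, index, displacement


# Hypothetical cache: 32-bit addresses, 256 rows, 16 data blocks per row.
print(field_widths(32, 256, 16))  # (20, 8, 4)
print([hex(f) for f in split_address(0x12345678, 32, 256, 16)])
# ['0x12345', '0x67', '0x8']
```

Two addresses with the same index but different tags compete for the same row, which is the situation the tag comparison detects.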
2. Write about the associativity of the cache memory.
Associativity
Which memory locations can be cached by which cache locations
The replacement policy decides where in the cache a copy of a particular entry of main memory
will go. If the replacement policy is free to choose any entry in the cache to hold the copy, the
cache is called fully associative. At the other extreme, if each entry in main memory can go in
just one place in the cache, the cache is direct mapped. Many caches implement a compromise
in which each entry in main memory can go to any one of N places in the cache, and are
described as N-way set associative. For example, the level-1 data cache in an AMD Athlon is
2-way set associative, which means that any particular location in main memory can be cached in
either of 2 locations in the level-1 data cache.
Associativity is a trade-off. If there are ten places the replacement policy can put a new cache
entry, then when the cache is checked for a hit, all ten places must be searched. Checking more
places takes more power, chip area, and potentially time. On the other hand, caches with more
associativity suffer fewer misses (see conflict misses, below), so that the CPU spends less time
servicing those misses. The rule of thumb is that doubling the associativity, from direct mapped
to 2-way, or from 2-way to 4-way, has about the same effect on hit rate as doubling the cache
size. Associativity increases beyond 4-way have much less effect on the hit rate, and are
generally done for other reasons (see virtual aliasing, below).
In order of increasing (worse) hit times and decreasing (better) miss rates,
direct mapped cache—the best (fastest) hit times, and so the best tradeoff for "large" caches
2-way set associative cache
2-way skewed associative cache -- "the best tradeoff for .... caches whose sizes are in the range
4K-8K bytes" -- André Seznec[3]
4-way set associative cache
fully associative cache -- the best (lowest) miss rates, and so the best tradeoff when the miss
penalty is very high
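The effect of associativity on conflict misses can be seen in a small simulation. This is a minimal sketch, not a model of any real cache: it assumes block addresses are mapped to sets by a simple modulo, that replacement within a set is least-recently-used, and the name count_misses is hypothetical.

```python
from collections import OrderedDict


def count_misses(block_addresses, total_lines, ways):
    """Count misses for a cache of total_lines lines organized as N-way sets.

    ways=1 is direct mapped; ways=total_lines is fully associative.
    Replacement inside a set is LRU (an assumption of this sketch).
    """
    num_sets = total_lines // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in block_addresses:
        cache_set = sets[block % num_sets]
        if block in cache_set:
            cache_set.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)  # evict least recently used
            cache_set[block] = True
    return misses


# Blocks 0 and 8 share a row in an 8-line direct-mapped cache.
trace = [0, 8] * 5
print(count_misses(trace, total_lines=8, ways=1))  # 10 (every access conflicts)
print(count_misses(trace, total_lines=8, ways=2))  # 2  (only the first two miss)
print(count_misses(trace, total_lines=8, ways=8))  # 2  (fully associative)
```

With the same total capacity, the 2-way organization removes the conflict misses of the direct-mapped one, at the price of searching two places on every access.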
3. Write short notes on pseudo- associative cache.
Pseudo-associative cache
A true set-associative cache tests all the possible ways simultaneously, using something like a
content addressable memory. A pseudo-associative cache tests each possible way one at a time.
A hash-rehash cache is one kind of pseudo-associative cache.
In the common case of finding a hit in the first way tested, a pseudo-associative cache is as fast
as a direct-mapped cache. But it has a much lower conflict miss rate than a direct-mapped cache,
closer to the miss rate of a fully associative cache.
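A hash-rehash lookup can be sketched as follows. This is a simplified illustration under stated assumptions: the class name HashRehashCache is hypothetical, the second probe location is obtained by flipping the top index bit, whole block numbers stand in for tags, and a slow hit swaps the two entries so that the next access to the same block is fast.

```python
class HashRehashCache:
    """Direct-mapped storage probed in two places, one at a time."""

    def __init__(self, num_lines):
        self.num_lines = num_lines        # assumed to be a power of two
        self.lines = [None] * num_lines   # each line holds a block number

    def access(self, block):
        first = block % self.num_lines
        second = first ^ (self.num_lines >> 1)  # rehash: flip top index bit
        if self.lines[first] == block:
            return "fast hit"
        if self.lines[second] == block:
            # Swap so the next access to this block hits on the first probe.
            self.lines[first], self.lines[second] = (
                self.lines[second],
                self.lines[first],
            )
            return "slow hit"
        # Miss: demote the current occupant to the alternate location.
        self.lines[second] = self.lines[first]
        self.lines[first] = block
        return "miss"


cache = HashRehashCache(8)
results = [cache.access(block) for block in [0, 8, 0, 0, 8]]
print(results)
# ['miss', 'miss', 'slow hit', 'fast hit', 'slow hit']
```

In a plain direct-mapped cache of the same size, blocks 0 and 8 would evict each other on every alternation; here both stay resident.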
4. Write short notes on cache.
A CPU cache is a cache used by the central processing unit of a computer to reduce the average
time to access memory. The cache is a smaller, faster memory which stores copies of the data
from the most frequently used main memory locations. As long as most memory accesses are
cached memory locations, the average latency of memory accesses will be closer to the cache
latency than to the latency of main memory.
When the processor needs to read from or write to a location in main memory, it first checks
whether a copy of that data is in the cache. If so, the processor immediately reads from or writes
to the cache, which is much faster than reading from or writing to main memory.
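The claim about average latency can be made concrete with the usual average access time estimate: hit time plus miss rate times miss penalty. The sketch below uses made-up numbers (1 ns cache hit, 100 ns additional penalty on a miss); they are assumptions for illustration, not measurements.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average memory access time = hit time + miss rate x miss penalty."""
    return hit_time + miss_rate * miss_penalty


# Assumed figures: 1 ns cache hit time, 100 ns extra penalty on a miss.
for miss_rate in (0.50, 0.05, 0.01):
    latency = average_access_time(1.0, miss_rate, 100.0)
    print(f"miss rate {miss_rate:.0%}: about {latency:.1f} ns on average")
```

As the miss rate falls, the average moves away from the main memory latency and toward the cache latency.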
The diagram on the right shows two memories. Each location in each memory has a datum (a
cache line), which in different designs ranges in size from 8[1] to 512[2] bytes. The size of the
cache line is usually larger than the size of the usual access requested by a CPU instruction,
which ranges from 1 to 16 bytes. Each location in each memory also has an index, which is a
unique number used to refer to that location. The index for a location in main memory is called
an address. Each location in the cache has a tag that contains the index of the datum in main
memory that has been cached. In a CPU's data cache these entries are called cache lines or cache
blocks.
5. What do you mean by paging supervisor?
Paging supervisor
This part of the operating system creates and manages the page tables. If the dynamic address
translation hardware raises a page fault exception, the paging supervisor searches the page space
on secondary storage for the page containing the required virtual address, reads it into real
physical memory, updates the page tables to reflect the new location of the virtual address and
finally tells the dynamic address translation mechanism to start the search again. Usually all of
the real physical memory is already in use and the paging supervisor must first save an area of
real physical memory to disk and update the page table to say that the associated virtual
addresses are no longer in real physical memory but saved on disk. Paging supervisors generally
save and overwrite areas of real physical memory which have been least recently used, because
these are probably the areas which are used least often. So every time the dynamic address
translation hardware matches a virtual address with a real physical memory address, it must put a
time-stamp in the page table entry for that virtual address.
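The least-recently-used policy described above can be sketched with a time stamp per resident page. This is a toy model, assuming a fixed number of page frames and a hypothetical function name; a real paging supervisor works with hardware-maintained page table entries rather than a Python dictionary.

```python
def simulate_lru_paging(page_references, num_frames):
    """Return (page_faults, evicted_pages) for an LRU paging supervisor."""
    last_used = {}  # resident page -> time stamp of its most recent access
    faults = 0
    evicted = []
    for time, page in enumerate(page_references):
        if page not in last_used:
            faults += 1                        # page fault: page not resident
            if len(last_used) >= num_frames:   # real memory is full
                victim = min(last_used, key=last_used.get)
                del last_used[victim]          # save least recently used page
                evicted.append(victim)
        last_used[page] = time                 # stamp the page on every access
    return faults, evicted


faults, evicted = simulate_lru_paging([1, 2, 3, 1, 4, 2], num_frames=3)
print(faults, evicted)  # 5 [2, 3]
```

Page 1 survives the fourth reference because its time stamp was refreshed just before page 4 forced an eviction.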
It is possible to combine segmentation and paging by dividing each segment into pages. In systems that combine them, such as Multics and the IBM System/38
and IBM System i machines, virtual memory is usually implemented with paging, with
segmentation used to provide memory protection.[8][9][10] With the Intel 80386 and later IA-32
processors, the segments reside in a 32-bit linear paged address space, so segments can be moved
into and out of that linear address space, and pages in that linear address space can be moved in
and out of main memory, providing two levels of virtual memory; however, few if any operating
systems do so. Instead, they only use paging.
The difference between virtual memory implementations using pages and using segments is not
only about the memory division with fixed and variable sizes, respectively. In some systems, e.g.
Multics, or later System/38 and Prime machines, the segmentation was actually visible to the
user processes, as part of the semantics of a memory model. In other words, instead of a process
just having a memory which looked like a single large vector of bytes or words, it was more
structured. This is different from using pages, which doesn't change the model visible to the
process. This had important consequences.
A segment wasn't just a "page with a variable length", or a simple way to lengthen the address
space (as in Intel 80286). In Multics, the segmentation was a very powerful mechanism that was
used to provide a single-level virtual memory model, in which there was no differentiation
between "process memory" and "file system" - a process' active address space consisted only of a
list of segments (files) which were mapped into its potential address space, both code and data. It
is not the same as the later mmap function in Unix, because inter-file pointers don't work when
mapping files into semi-arbitrary places. Multics had such an addressing mode built into most
instructions. In other words it could perform relocated inter-segment references, thus eliminating
the need for a linker completely. This also worked when different processes mapped the same
file into different places in their private address spaces.
6. Write about the virtual memory in a computer system.
Virtual memory is a computer system technique which gives an application program the
impression that it has contiguous working memory (an address space), while in fact it may be
physically fragmented and may even overflow on to disk storage. Systems that use this technique
make programming of large applications easier and use real physical memory (e.g. RAM) more
efficiently than those without virtual memory. Virtual memory differs significantly from memory
virtualization in that virtual memory allows resources to be virtualized as memory for a specific
system, as opposed to a large pool of memory being virtualized as smaller pools for many
different systems.
Note that "virtual memory" is more than just "using disk space to extend physical memory size" -
that is merely the extension of the memory hierarchy to include hard disk drives. Extending
memory to disk is a normal consequence of using virtual memory techniques, but could be done
by other means such as overlays or swapping programs and their data completely out to disk
while they are inactive. The definition of "virtual memory" is based on redefining the address
space with contiguous virtual memory addresses to "trick" programs into thinking they are
using large blocks of contiguous addresses.
All modern general-purpose computer operating systems use virtual memory techniques for
ordinary applications, such as word processors, spreadsheets, multimedia players, accounting,
etc. Older operating systems, such as DOS and Microsoft Windows[1] of the 1980s, or those for
the mainframes of the 1960s, generally had no virtual memory functionality - notable exceptions
being the Atlas, B5000 and Apple Computer's Lisa.
PART—C
UNIT—3
MEMORY DESIGN AND MANAGEMENT
1. Write in detail about the development of the virtual memory.
In the 1940s and 1950s, before the development of virtual memory, all larger programs had to
contain logic for managing two-level storage (primary and secondary, today's analogies being
RAM and hard disk), such as overlaying techniques. Programs were responsible for moving
overlays back and forth from secondary storage to primary.
The main reason for introducing virtual memory was therefore not simply to extend primary
memory, but to make such an extension as easy to use for programmers as possible.[2]
Many systems already had the ability to divide the memory between multiple programs (required
for multiprogramming and multiprocessing), provided for example by "base and bounds
registers" on early models of the PDP-10, without providing virtual memory. That gave each
application a private address space starting at an address of 0, with an address in the private
address space being checked against a bounds register to make sure it's within the section of
memory allocated to the application and, if it is, having the contents of the corresponding base
register being added to it to give an address in main memory. This is a simple form of
segmentation without virtual memory.
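The base-and-bounds check described above amounts to one comparison and one addition. A minimal sketch, with hypothetical names and register values chosen only for illustration:

```python
class BoundsViolation(Exception):
    """Raised when a private address falls outside the allocated section."""


def translate_base_bounds(private_address, base, bounds):
    """Map an address in a private address space (starting at 0) to main memory.

    base   - start of the section of main memory allocated to the application
    bounds - size of that section
    """
    if not 0 <= private_address < bounds:
        raise BoundsViolation(f"address {private_address} is outside the section")
    return base + private_address


# Assumed register contents: section of 1024 words starting at address 4096.
print(translate_base_bounds(100, base=4096, bounds=1024))  # 4196
```

Relocating the application only requires changing the base register; the addresses used by the program itself stay the same.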
Virtual memory was developed in approximately 1959–1962, at the University of Manchester for
the Atlas Computer, completed in 1962.[3] However, Fritz-Rudolf Güntsch, one of Germany's
pioneering computer scientists and later the developer of the Telefunken TR 440 mainframe,
claims to have invented the concept in 1957 in his doctoral dissertation Logischer Entwurf eines
digitalen Rechengerätes mit mehreren asynchron laufenden Trommeln und automatischem
Schnellspeicherbetrieb (Logic Concept of a Digital Computing Device with Multiple
Asynchronous Drum Storage and Automatic Fast Memory Mode).
In 1961, Burroughs released the B5000, the first commercial computer with virtual memory.[4][5]
It used segmentation rather than paging.
Like many technologies in the history of computing, virtual memory was not accepted without
challenge. Before it could be implemented in mainstream operating systems, many models,
experiments, and theories had to be developed to overcome the numerous problems. Dynamic
address translation required specialized, expensive, and hard-to-build hardware; moreover,
it initially slowed down access to memory slightly.[2] There were also worries that new
system-wide algorithms of utilizing secondary storage would be far less effective than previously
used application-specific ones.
By 1969 the debate over virtual memory for commercial computers was over.[2] An IBM
research team led by David Sayre showed that the virtual memory overlay system consistently
worked better than the best manually controlled systems.
Possibly the first minicomputer to introduce virtual memory was the Norwegian NORD-1.
During the 1970s, other minicomputers implemented virtual memory, notably VAX models
running VMS.
Virtual memory was introduced to the x86 architecture with the protected mode of the Intel
80286 processor. At first it was done with segment swapping, which became inefficient with
larger segments. The Intel 80386 introduced support for paging underneath the existing
segmentation layer. The page fault exception could be chained with other exceptions without
causing a double fault.
2. What is the difference between static RAM and dynamic RAM?
A capacitor stores electrons in computer memory cells. The memory must then be refreshed or
flip-flopped.
Your computer probably uses both static RAM and dynamic RAM at the same time, but it uses them for different
reasons because of the cost difference between the two types. If you understand how dynamic RAM and static RAM
chips work inside, it is easy to see why the cost difference is there, and you can also understand the names.
Dynamic RAM is the most common type of memory in use today. Inside a dynamic RAM chip, each memory cell
holds one bit of information and is made up of two parts: a transistor and a capacitor. These are, of course, extremely
small transistors and capacitors so that millions of them can fit on a single memory chip. The capacitor holds the bit of
information -- a 0 or a 1. The transistor acts as a switch that lets
the control circuitry on the memory chip read the capacitor or change its state.
A capacitor is like a small bucket that is able to store electrons. To store a 1 in the memory cell, the bucket is filled
with electrons. To store a 0, it is emptied. The problem with the capacitor's bucket is that it has a leak. In a matter of a
few milliseconds a full bucket becomes empty. Therefore, for dynamic memory to work, either the CPU or the
memory controller has to come along and recharge all of the capacitors holding a 1 before they discharge. To do
this, the memory controller reads the memory and then writes it right back. This refresh operation happens
automatically thousands of times per second.
This refresh operation is where dynamic RAM gets its name. Dynamic RAM has to be dynamically refreshed all of the
time or it forgets what it is holding. The downside of all of this refreshing is that it takes time and slows down the
memory.
Static RAM uses a completely different technology. In static RAM, a form of flip-flop holds each bit of memory.
A flip-flop for a memory cell takes 4 or 6 transistors along with some
wiring, but never has to be refreshed. This makes static RAM significantly faster than dynamic RAM. However,
because it has more parts, a static memory cell takes a lot more space on a chip than a dynamic memory cell.
Therefore you get less memory per chip, and that makes static RAM a lot more expensive.
So static RAM is fast and expensive, and dynamic RAM is less expensive and slower. Therefore static RAM is used
to create the CPU's speed-sensitive cache, while dynamic RAM forms the larger system RAM.
3. Write in detail about the dynamic memory allocation.
Dynamic memory allocation
In computer science, dynamic memory allocation is the allocation of memory storage for use in
a computer program during the runtime of that program. It can be seen also as a way of
distributing ownership of limited memory resources among many pieces of data and code.
Dynamically allocated memory exists until it is released, either explicitly by the programmer,
by exiting a block, or by the garbage collector. This is in contrast to static memory allocation, which
has a fixed duration. It is said that an object so allocated has a dynamic lifetime.
Details
The task of fulfilling an allocation request:
o Finding a block of unused memory of sufficient size
Problems during fulfilling an allocation request:
o Internal and external fragmentation.
  - Reduction needs special care, thus making implementation more complex (see algorithm efficiency).
o Allocator's metadata can inflate the size of (individually) small allocations.
  - Chunking attempts to reduce this effect.
Usually, memory is allocated from a large pool of unused memory area called the heap (also
called the free store). Since the precise location of the allocation is not known in advance, the
memory is accessed indirectly, usually via a reference. The precise algorithm used to organize
the memory area and allocate and deallocate chunks is hidden behind an abstract interface and
may use any of the methods described below.
Implementations
Fixed-size-blocks allocation
Fixed-size-blocks allocation, also called memory pool allocation, uses a free list of fixed-size
blocks of memory (often all of the same size). This works well for simple embedded systems.
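A free list of fixed-size blocks can be sketched as follows. The class name FixedBlockPool and its methods are hypothetical, and blocks are identified by their byte offset into one backing buffer; this is an illustration of the idea, not a production allocator.

```python
class FixedBlockPool:
    """Memory pool that hands out blocks of one fixed size from a free list."""

    def __init__(self, block_size, num_blocks):
        self.block_size = block_size
        self.memory = bytearray(block_size * num_blocks)
        # Every block starts out free; each entry is an offset into self.memory.
        self.free_list = [i * block_size for i in range(num_blocks)]

    def allocate(self):
        """Return the offset of a free block, or None if the pool is exhausted."""
        if not self.free_list:
            return None
        return self.free_list.pop()

    def free(self, offset):
        """Return a block to the pool."""
        self.free_list.append(offset)


pool = FixedBlockPool(block_size=16, num_blocks=2)
a = pool.allocate()
b = pool.allocate()
print(pool.allocate())  # None: both blocks are in use
pool.free(a)
print(pool.allocate() == a)  # True: the freed block is reused
```

Allocation and release are both constant-time and there is no external fragmentation, which is why the scheme suits simple embedded systems.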
Buddy blocks
In this system, memory is allocated from a large block in memory that is a power of two in size.
If the block is more than twice as large as desired, it is broken in two. One of the halves is
selected, and the process repeats (checking the size again and splitting if needed) until the block
is just large enough.
All the blocks of a particular size are kept in a sorted linked list or tree. When a block is freed, it
is compared to its buddy. If they are both free, they are combined and placed in the next-largest
size buddy-block list. (When a block is allocated, the allocator will start with the smallest
sufficiently large block, to avoid needlessly breaking blocks.)
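The splitting and merging steps can be sketched in Python. This is a simplified model under stated assumptions: the class name BuddyAllocator is hypothetical, free blocks are kept in plain unsorted lists per size rather than sorted linked lists or trees, and the buddy of a block is found by XOR-ing its offset with its size, which works because the total size is a power of two.

```python
class BuddyAllocator:
    def __init__(self, total_size):
        self.total_size = total_size          # must be a power of two
        self.free_lists = {total_size: [0]}   # block size -> free offsets
        self.allocated = {}                   # offset -> block size

    def allocate(self, size):
        block_size = 1
        while block_size < size:              # round request up to a power of two
            block_size *= 2
        # Start with the smallest sufficiently large free block.
        candidate = block_size
        while candidate <= self.total_size and not self.free_lists.get(candidate):
            candidate *= 2
        if candidate > self.total_size:
            return None                       # no free block is large enough
        offset = self.free_lists[candidate].pop()
        # Break the block in two until it is just large enough.
        while candidate > block_size:
            candidate //= 2
            self.free_lists.setdefault(candidate, []).append(offset + candidate)
        self.allocated[offset] = block_size
        return offset

    def free(self, offset):
        size = self.allocated.pop(offset)
        while size < self.total_size:
            buddy = offset ^ size
            peers = self.free_lists.get(size, [])
            if buddy not in peers:
                break                         # buddy still in use: stop merging
            peers.remove(buddy)
            offset = min(offset, buddy)       # merged block starts at lower offset
            size *= 2
        self.free_lists.setdefault(size, []).append(offset)


heap = BuddyAllocator(64)
first = heap.allocate(10)    # rounded up to a 16-byte block
second = heap.allocate(10)
print(first, second)         # 0 16
heap.free(first)
heap.free(second)
print(heap.free_lists[64])   # [0]: the buddies merged back into one 64-byte block
```

Rounding every request up to a power of two causes some internal fragmentation, but it makes finding and merging buddies very cheap.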
4. Write about the dynamic translation and paged virtual memory.
Paged virtual memory
Almost all implementations of virtual memory divide the virtual address space of an application
program into pages; a page is a block of contiguous virtual memory addresses. Pages are usually
at least 4K bytes in size, and systems with large virtual address ranges or large amounts of real
memory (e.g. RAM) generally use larger page sizes.
Page tables
Almost all implementations use page tables to translate the virtual addresses seen by the
application program into physical addresses (also referred to as "real addresses") used by the
hardware to process instructions. Each entry in the page table contains a mapping for a virtual
page to either the real memory address at which the page is stored, or an indicator that the page is
currently held in a disk file. (Although most do, some systems may not support use of a disk file
for virtual memory.)
Systems can have one page table for the whole system or a separate page table for each
application. If there is only one, different applications which are running at the same time share a
single virtual address space, i.e. they use different parts of a single range of virtual addresses.
Systems which use multiple page tables provide multiple virtual address spaces - concurrent
applications think they are using the same range of virtual addresses, but their separate page
tables redirect to different real addresses.
Paging
Paging is the process of saving inactive virtual memory pages to disk and restoring them to real
memory when required.
Most virtual memory systems enable programs to use virtual address ranges which in total
exceed the amount of real memory (e.g. RAM). To do this they use disk files to save virtual
memory pages which are not currently active, and restore them to real memory when they are
needed. Pages are not necessarily restored to the same real addresses from which they were
saved - applications are aware only of virtual addresses. Usually when a page is going to be
restored to real memory, the real memory already contains another virtual memory page which
will be saved to disk before the restore takes place.
Dynamic address translation
If, while executing an instruction, a CPU fetches an instruction located at a particular virtual
address, fetches data from a specific virtual address or stores data to a particular virtual address,
the virtual address must be translated to the corresponding physical address. This is done by a
hardware component, sometimes called a memory management unit, which looks up the real
address (from the page table) corresponding to a virtual address and passes the real address to the
parts of the CPU which execute instructions. If the page tables indicate that the virtual memory
page is not currently in real memory, the hardware raises a page fault exception (special internal
signal) which invokes the paging supervisor component of the operating system.
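Address translation through a page table can be sketched as below. The 4K page size follows the text; everything else is an assumption for illustration: the names are hypothetical, a page table is modelled as a dictionary from virtual page number to real frame number, and None marks a page that is currently held on disk.

```python
PAGE_SIZE = 4096  # 4K-byte pages


class PageFault(Exception):
    """Raised when the referenced page is not currently in real memory."""


def translate_paged(virtual_address, page_table):
    """Translate a virtual address to a real address using one page table."""
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table.get(page_number)
    if frame is None:
        raise PageFault(page_number)  # the paging supervisor would take over here
    return frame * PAGE_SIZE + offset


# Two applications with separate page tables use the same virtual addresses.
table_a = {0: 5, 1: None}  # page 1 of application A is on disk
table_b = {0: 9}
print(hex(translate_paged(0x0123, table_a)))  # 0x5123
print(hex(translate_paged(0x0123, table_b)))  # 0x9123
```

The same virtual address 0x0123 reaches different real addresses because each application has its own page table, while a reference to page 1 of table_a would raise PageFault.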
PART-B
UNIT-4
COMPUTER PERIPHERALS
1. Write short notes on hard disk.
Hard Disk (ATA / SATA / SCSI)
• Used to store data permanently.
• Different hard disk sizes (3.5”, 2.5”, 1.8”, Micro Drive)
• Different interfaces: ATA / SATA / SCSI (Speed: ATA < SATA < SCSI)
• Different mechanical (rotational) speeds (4,200 rpm / 5,400 rpm / 7,200 rpm / 10,000 rpm)
2. Give the details about mother board.
Main Board / Mother Board (MB)
• Provides a platform for connecting all the devices
(keyboard, mouse, power, CPU, memory, hard disk, floppy disk, display card, etc.).
• Many main boards already have a built-in sound card, network card, or even a display card.
3. Give the details of I/O device and interface.
I/O Device & Interface
• ATA / SATA / SCSI (for hard disk)
• Parallel port or LPT port (for printer)
• COM port (for modem)
• RJ45 socket (for network)
• PS/2 (for keyboard / mouse)
• D-Sub / DVI (for monitor)
• USB (all compatible devices)
4. Write the details of hard disk memory.
Hard Disk (ATA / SATA / SCSI)
• Different built-in memory sizes (2M / 8M / 16M, etc.)
• Different capacities (80G to 500G, or even 1T)
• Small hard disks are more popular because they are portable.
5. Write short notes on connects of keyboards.
Connection types
There are several ways of connecting a keyboard using cables, including the standard AT
connector commonly found on motherboards, which was eventually replaced by the PS/2 and the
USB connection. Prior to the iMac line of systems, Apple used the proprietary Apple Desktop
Bus for its keyboard connector.
Wireless keyboards have become popular for their increased user freedom. A wireless keyboard
often includes a required combination transmitter and receiver unit that attaches to the
computer's keyboard port (see Connection types above). The wireless aspect is achieved either by
radio frequency (RF) or by infrared (IR) signals sent and received from both the keyboard and
the unit attached to the computer. A wireless keyboard may use an industry standard RF, called
Bluetooth. With Bluetooth, the transceiver may be built into the computer. However, a wireless
keyboard needs batteries to work and may pose a security problem due to the risk of data
"eavesdropping" by hackers.[6]
6. What is meant by alternative text entering method?
Alternative text-entering methods
An on-screen keyboard controlled with the mouse can be used by users with limited mobility.
Optical character recognition (OCR) is preferable to rekeying for converting existing text that is
already written down but not in machine-readable format (for example, a Linotype-composed
book from the 1940s). In other words, to convert the text from an image to editable text (that is, a
string of character codes), a person could re-type it, or a computer could look at the image and
deduce what each character is. OCR technology has already reached an impressive state (for
example, Google Book Search) and promises more for the future.
Speech recognition converts speech into machine-readable text (that is, a string of character
codes). The technology has already reached an impressive state and is already implemented in
various software products. For certain uses (e.g., transcription of medical or legal dictation;
journalism; writing essays or novels) it is starting to replace the keyboard; however, it does not
threaten to replace keyboards entirely anytime soon. It can, however, interpret commands (for
example, "close window" or "undo that") in addition to text. Therefore, it has theoretical
potential to replace keyboards entirely (whereas OCR replaces them only for a certain kind of
task).
Pointing devices can be used to enter text or characters in contexts where using a physical
keyboard would be inappropriate or impossible. These accessories typically present characters on
a display, in a layout that provides fast access to the more frequently used characters or character
combinations. Popular examples of this kind of input are Graffiti, Dasher and on-screen virtual
keyboards.
7. Write short notes on keystroke hacking.
Keystroke hacking
Keystroke logging (often called keylogging) is a method of capturing and recording user
keystrokes. While it is used legitimately to measure employee productivity on certain clerical
tasks, or by law enforcement agencies to find out about illegal activities, it is also used by
hackers for unlawful purposes. Hackers use keyloggers as a means to obtain
passwords or encryption keys and thus bypass other security measures.
Keystroke logging can be achieved by both hardware and software means. Hardware keyloggers
are attached to the keyboard cable or installed inside standard keyboards. Software keyloggers
work on the target computer’s operating system and gain unauthorized access to the hardware,
hook into the keyboard with functions provided by the OS, or use remote access software to
transmit recorded data out of the target computer to a remote location. Some hackers also use
wireless keylogger sniffers to collect packets of data being transferred from a wireless keyboard
and its receiver, and then they crack the encryption key being used to secure wireless
communications between the two devices.
Anti-spyware applications are able to detect many keyloggers and remove them. Responsible
vendors of monitoring software support detection by anti-spyware programs, thus preventing
abuse of the software. Enabling a firewall does not stop keyloggers per se, but can possibly
prevent transmission of the logged material over the net if properly configured. Network
monitors (also known as reverse-firewalls) can be used to alert the user whenever an application
attempts to make a network connection. This gives the user the chance to prevent the keylogger
from "phoning home" with his or her typed information. Automatic form-filling programs can
prevent keylogging entirely by not using the keyboard at all. Most keyloggers can be fooled by
alternating between typing the login credentials and typing characters somewhere else in the
focus window. [7]
Electromagnetic waves released every time a key is pressed on the keyboard can be detected by a
nearby antenna and interpreted by computer software to work out exactly what was typed. [8]
8. Write about the key switches.
Key switches
"Dome-switch" keyboards (sometimes incorrectly referred to as membrane keyboards) are the
most common type now in use. When a key is pressed, it pushes down on a rubber dome sitting
beneath the key. A conductive contact on the underside of the dome touches (and hence
connects) a pair of conductive lines on the circuit below. This bridges the gap between them and
allows electric current to flow (the open circuit is closed). A scanning signal is emitted by the
chip along the pairs of lines in the matrix circuit which connects to all the keys. When the signal
in one pair becomes different, the chip generates a "make code" corresponding to the key
connected to that pair of lines.
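The scanning step described above can be sketched as follows. This is a minimal illustration, not the firmware of any real keyboard chip: the 3x3 layout, the row/column wiring and the names `KEYMAP`, `scan_matrix` and `make_codes` are all assumptions made for the example.

```python
# Hypothetical 3x3 key matrix: rows are driven one at a time, columns are read back.
KEYMAP = [
    ["Q", "W", "E"],
    ["A", "S", "D"],
    ["Z", "X", "C"],
]


def scan_matrix(closed_switches):
    """Return the set of keys detected in one scan pass.

    closed_switches is a set of (row, col) pairs whose contacts are bridged.
    """
    detected = set()
    for row in range(len(KEYMAP)):              # drive the scanning signal on one row
        for col in range(len(KEYMAP[row])):     # read each column line in turn
            if (row, col) in closed_switches:   # signal passes only through a closed switch
                detected.add(KEYMAP[row][col])
    return detected


def make_codes(previous, current):
    """Keys that were open on the last scan and are closed now produce a make code."""
    return sorted(current - previous)


if __name__ == "__main__":
    before = scan_matrix(set())
    after = scan_matrix({(1, 1)})
    print(make_codes(before, after))
```

Comparing two successive scans, rather than reporting every closed switch on every pass, is what lets the chip emit a make code only once per key press.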
Keycaps are also required for most types of keyboards; while modern keycaps are typically
surface-marked, they can also be 2-shot molded, or engraved, or they can be made of transparent
material with printed paper inserts.
Keys on older IBM keyboards were made with a "buckling spring" mechanism, in which a coil
spring under the key buckles under pressure from the user's finger, pressing a rubber dome,
whose inside is coated with conductive graphite, which connects two leads below, completing a
circuit. This produces a clicking sound, and gives physical feedback for the typist indicating that
the key has been depressed.[3][4] When a key is pressed and the circuit is completed, the code
generated is sent to the computer either via a keyboard cable (using on-off electrical pulses to
represent bits) or over a wireless connection. While not nearly as popular as dome-switch
keyboards, these "clicky" keyboards have been making a comeback recently, particularly among
writers and others who use keyboards heavily.[5]
A chip inside the computer receives the signal bits and decodes them into the appropriate
keypress. The computer then decides what to do on the basis of the key pressed (e.g. display a
character on the screen, or perform some action). When the key is released, a break code
(different from the make code) is sent to indicate the key is no longer pressed. If the break code
is missed (e.g. due to a keyboard switch) it is possible for the keyboard controller to believe the
key is pressed down when it is not, which is why pressing then releasing the key again will
release the key (since another break code is sent). Other types of keyboards function in a similar
manner, the main differences being how the individual key-switches work.
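The make/break behaviour, including the "stuck key" case when a break code is missed, can be illustrated with a small sketch. The class name `KeyStateTracker` and the string codes are hypothetical; real controllers use numeric scan codes.

```python
MAKE, BREAK = "make", "break"


class KeyStateTracker:
    """Tracks which keys the computer believes are held, based on make/break codes."""

    def __init__(self):
        self.held = set()

    def handle(self, kind, key):
        if kind == MAKE:
            self.held.add(key)        # key went down (repeated makes are harmless)
        elif kind == BREAK:
            self.held.discard(key)    # key came up
        else:
            raise ValueError(f"unknown code type: {kind!r}")
        return frozenset(self.held)


if __name__ == "__main__":
    tracker = KeyStateTracker()
    tracker.handle(MAKE, "A")         # the matching break code is lost here
    print("stuck:", sorted(tracker.held))
    tracker.handle(MAKE, "A")         # user presses the key again...
    tracker.handle(BREAK, "A")        # ...and releases it, sending a fresh break code
    print("after re-press:", sorted(tracker.held))
```

Because the tracker only changes state when a code arrives, a lost break code leaves the key marked as held until the next break code for that key is received.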
Certain key presses are special, namely Ctrl-Alt-Delete and SysRq, but what makes them special
is a function of software. In the PC architecture, the keyboard controller (the component in the
computer that receives the make and break codes) sends the computer's CPU a hardware
interrupt whenever a key is pressed or released. The CPU's interrupt routine which handles these
interrupts usually just places the key's code in a queue, to be handled later by other code when it
gets around to it, then returns to whatever the computer was doing before. The special keys cause
the interrupt routine to take a different "emergency" exit instead. This more trusted route is much
harder to intercept.
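A rough sketch of the interrupt routine described above, under stated assumptions: the class `KeyboardInterruptHandler`, the `SPECIAL_CODES` set and the string key codes are invented for illustration, and a real handler runs in the operating system kernel rather than in Python.

```python
from collections import deque

# Hypothetical codes that take the trusted "emergency" route; the real set is platform-specific.
SPECIAL_CODES = {"SysRq"}


class KeyboardInterruptHandler:
    def __init__(self, emergency_action):
        self.queue = deque()
        self.emergency_action = emergency_action

    def on_interrupt(self, code):
        """Runs once per make or break code reported by the keyboard controller."""
        if code in SPECIAL_CODES:
            self.emergency_action(code)   # handled immediately, never queued
            return
        self.queue.append(code)           # ordinary keys wait for later processing

    def drain(self):
        """Called later by ordinary code to consume queued key codes in arrival order."""
        while self.queue:
            yield self.queue.popleft()


if __name__ == "__main__":
    handler = KeyboardInterruptHandler(lambda code: print("emergency:", code))
    for code in ["A", "SysRq", "B"]:
        handler.on_interrupt(code)
    print(list(handler.drain()))
```

Keeping the interrupt routine this short is the point of the design: it records the code and returns, so the CPU can resume whatever it was doing.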
The layout of a keyboard can be changed by remapping the keys. When you remap a key, you
tell the computer a new meaning for the pressing of that key. Keyboard remapping is supported
at a driver-level configurable within the operating system, or as add-ons to the existing programs.
9. Write short notes on the system commands.
System commands
The SysRq / Print screen commands often share the same key. SysRq was used in earlier
computers as a "panic" button to recover from crashes. The Print screen command used to
capture the entire screen and send it to the printer, but today it usually puts a screenshot
in the clipboard. The Break key/Pause key no longer has a well-defined purpose. Its origins go
back to teletype users, who wanted a key that would temporarily interrupt the communications
line. The Break key can be used by software in several different ways, such as to switch between
multiple login sessions, to terminate a program, or to interrupt a modem connection.
In programming, especially old DOS-style BASIC, Pascal and C, Break is used (in conjunction
with Ctrl) to stop program execution. In addition to this, Linux and variants, as well as many
DOS programs, treat this combination the same as Ctrl+C. On modern keyboards, the break key
is usually labeled Pause/Break. In most Windows environments, the key combination Windows
key+Pause brings up the system properties.
The Escape key (often abbreviated Esc) is used to initiate an escape sequence. As most computer
users no longer are concerned with the details of controlling their computer's peripherals, the task
for which the escape sequences were originally designed, the escape key was appropriated by
application programmers, most often to mean Stop. This use continues today in Microsoft
Windows's use of escape as a shortcut in dialog boxes for No, Quit, Exit, Cancel, or Abort.
A common application today of the Esc key is as a shortcut key for the Stop button in many web
browsers. On machines running Microsoft Windows, prior to the implementation of the
Windows key on keyboards, the typical practice for invoking the "start" button was to hold down
the control key and press escape. This process still works in Windows XP and Windows Vista.
The Menu key or Application key is a key found on Windows-oriented computer keyboards. It is
used to launch a context menu with the keyboard rather than with the usual right mouse button. The
key's symbol is a small icon depicting a cursor hovering above a menu. This key was created at
the same time as the Windows key. This key is normally used when the right mouse button is not
present on the mouse. Some Windows public terminals do not have a Menu key on their
keyboard to prevent users from right-clicking (however, in many Windows applications, a similar
functionality can be invoked with the Shift+F10 keyboard shortcut).
PART-C
UNIT-4
COMPUTER PERIPHERALS
1. Write in detail about the computer structure and the power supply.
Basic Computer Structure
1. Logical Structure of a computer includes:
• BIOS (The Basic Input Output System)
• CPU (The Processor)
• Memory / RAM (Temporary Storage)
• Hard Disk (Permanent Storage)
• Input / Output Device
• Communication Channel (E.g. USB)
• Bus (High Speed Internal Communication)
• Other Add-on Device…
Structure of a computer
Power Supply
• The power supply converts the A.C. voltage to a lower D.C. voltage that is suitable for the computer.
• Power supplies can be classified by their load rating (watts).
• Different types of socket are provided for different devices.
BIOS
• Basic Input Output System
• Stores all the parameters needed before the OS loads (for example, hard disk size, memory
speed, and whether built-in devices such as the sound card, USB and printer ports are turned on or off)
• Usually stored in flash memory
2. Write short notes on mouse.
Mechanical mouse devices
Mechanical mouse, shown with the top cover removed
Operating a mechanical mouse.
1: moving the mouse turns the ball.
2: X and Y rollers grip the ball and transfer movement.
3: Optical encoding disks include light holes.
4: Infrared LEDs shine through the disks.
5: Sensors gather light pulses to convert to X and Y velocities.
Bill English, builder of Engelbart's original mouse,[10] invented the ball mouse in 1972 while
working for Xerox PARC.[11] The ball-mouse replaced the external wheels with a single ball that
could rotate in any direction. It came as part of the hardware package of the Xerox Alto
computer. Perpendicular chopper wheels housed inside the mouse's body chopped beams of light
on the way to light sensors, thus detecting in their turn the motion of the ball. This variant of the
mouse resembled an inverted trackball and became the predominant form used with personal
computers throughout the 1980s and 1990s. The Xerox PARC group also settled on the modern
technique of using both hands to type on a full-size keyboard and grabbing the mouse when
required.
The ball mouse utilizes two rollers rolling against two sides of the ball. One roller detects the
forward–backward motion of the mouse and the other the left–right motion. The motion of these two
rollers causes two disc-like encoder wheels to rotate, interrupting optical beams to generate
electrical signals. The mouse sends these signals to the computer system by means of connecting
wires. The driver software in the system converts the signals into motion of the mouse pointer
along X and Y axes on the screen.
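The last step, in which driver software turns roller signals into pointer motion, might look roughly like the sketch below. The text does not say how direction is sensed, so the pulse counts are assumed to arrive already signed; the function name `move_pointer`, the 640x480 screen and the `scale` factor are likewise assumptions.

```python
def move_pointer(position, x_counts, y_counts, screen=(640, 480), scale=1):
    """Apply signed encoder counts from the two rollers to the pointer position.

    x_counts: left-right roller pulses (positive = right).
    y_counts: forward-backward roller pulses (positive = forward, i.e. up the screen).
    """
    x, y = position
    width, height = screen
    # Clamp so the pointer never leaves the visible screen area.
    x = min(max(x + x_counts * scale, 0), width - 1)
    y = min(max(y - y_counts * scale, 0), height - 1)   # screen Y grows downward
    return (x, y)


if __name__ == "__main__":
    pos = (100, 100)
    pos = move_pointer(pos, x_counts=5, y_counts=3)
    print(pos)
```

The two rollers are handled independently, which mirrors the mechanical design: each roller contributes to exactly one screen axis.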
Ball mice and wheel mice were manufactured for Xerox by Jack Hawley, doing business as The
Mouse House in Berkeley, California, starting in 1975.[12][13]
Based on another invention by Jack Hawley, proprietor of the Mouse House, Honeywell
produced another type of mechanical mouse.[14][15] Instead of a ball, it had two wheels rotating at
off axes. Keytronic later produced a similar product.[16]
Modern computer mice took form at the École polytechnique fédérale de Lausanne (EPFL)
under the inspiration of Professor Jean-Daniel Nicoud and at the hands of engineer and
watchmaker André Guignard.[17] This new design incorporated a single hard rubber mouseball
and three buttons, and remained a common design until the mainstream adoption of the scroll-wheel mouse during the 1990s.[18]
Another type of mechanical mouse, the "analog mouse" (now generally regarded as obsolete),
uses potentiometers rather than encoder wheels, and is typically designed to be plug-compatible
with an analog joystick. The "Color Mouse," originally marketed by Radio Shack for their Color
Computer (but also usable on MS-DOS machines equipped with analog joystick ports, provided
the software accepted joystick input) was the best-known example.
3. Write short notes on optical mouse.
Optical mice
An optical mouse uses a light-emitting diode and photodiodes to detect movement relative to the
underlying surface, rather than moving some of its parts – as in a mechanical mouse.
Early optical mice
Early optical mice, first demonstrated by two independent inventors in 1980,[19] came in two
different varieties:
1. Some, such as those invented by Steve Kirsch of MIT and Mouse Systems Corporation,[20][21] used
an infrared LED and a four-quadrant infrared sensor to detect grid lines printed with infrared
absorbing ink on a special metallic surface. Predictive algorithms in the CPU of the mouse
calculated the speed and direction over the grid.
2. Others, invented by Richard F. Lyon and sold by Xerox, used a 16-pixel visible-light image sensor
with integrated motion detection on the same chip[22][23] and tracked the motion of light dots in
a dark field of a printed paper or similar mouse pad.[24]
These two mouse types had very different behaviors, as the Kirsch mouse used an x-y coordinate
system embedded in the pad, and would not work correctly when the pad was rotated, while the
Lyon mouse used the x-y coordinate system of the mouse body, as mechanical mice do.
4. Write short notes on 3D mice.
3D mice
Also known as bats,[30] flying mice, or wands,[31] these devices generally function through
ultrasound. Probably the best known example would be 3DConnexion/Logitech's Space Mouse
from the early 1990s.
In the late 1990s Kantek introduced the 3D Ring Mouse. This wireless mouse was worn on a ring
around a finger, which enabled the thumb to access three buttons. The mouse was tracked in
three dimensions by a base station.[32] Despite a certain appeal, it was finally discontinued
because it did not provide sufficient resolution.
A recent consumer 3D pointing device is the Wii Remote. While primarily a motion-sensing
device (that is, it can determine its orientation and direction of movement), Wii Remote can also
detect its spatial position by comparing the distance and position of the lights from the IR emitter
using its integrated IR camera (since the nunchuk lacks a camera, it can only tell its current
heading and orientation). The obvious drawback to this approach is that it can only produce
spatial coordinates while its camera can see the sensor bar.
In February, 2008, at the Game Developers' Conference (GDC), a company called Motion4U
introduced a 3D mouse add-on called "OptiBurst" for Autodesk's Maya application. The mouse
allows users to work in true 3D with 6 degrees of freedom. The primary advantage of
this system is speed of development with organic (natural) movement.
5. Write about the PS/2 interface and protocol.
PS/2 interface and protocol
With the arrival of the IBM PS/2 personal-computer series in 1987, IBM introduced the
eponymous PS/2 interface for mice and keyboards, which other manufacturers rapidly adopted.
The most visible change was the use of a round 6-pin mini-DIN, in lieu of the former 5-pin
connector. In default mode (called stream mode) a PS/2 mouse communicates motion, and the
state of each button, by means of 3-byte packets.[35] For any motion, button press or button
release event, a PS/2 mouse sends, over a bi-directional serial port, a sequence of three bytes,
with the following format:
         Bit 7  Bit 6  Bit 5  Bit 4  Bit 3  Bit 2  Bit 1  Bit 0
Byte 1   YV     XV     YS     XS     1      MB     RB     LB
Byte 2   X movement
Byte 3   Y movement
Here, XS and YS represent the sign bits of the movement vectors, XV and YV indicate an
overflow in the respective vector component, and LB, MB and RB indicate the status of the left,
middle and right mouse buttons (1 = pressed). PS/2 mice also understand several commands for
reset and self-test, switching between different operating modes, and changing the resolution of
the reported motion vectors.
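As an illustration of the packet format, the sketch below decodes one 3-byte stream-mode packet. The function name `decode_ps2_packet` and the returned dictionary keys are hypothetical. Treating each sign bit as the ninth bit of a two's-complement movement value is the usual convention, but is an assumption here because the text does not spell it out.

```python
def decode_ps2_packet(packet):
    """Decode one 3-byte PS/2 mouse packet into button states and signed movement."""
    if len(packet) != 3:
        raise ValueError("a stream-mode packet is exactly three bytes")
    b1, b2, b3 = packet
    if not b1 & 0x08:
        raise ValueError("bit 3 of byte 1 must always be 1")
    # XS (bit 4) and YS (bit 5) extend the 8-bit movement bytes to signed values.
    dx = b2 - 256 if b1 & 0x10 else b2
    dy = b3 - 256 if b1 & 0x20 else b3
    return {
        "left": bool(b1 & 0x01),        # LB
        "right": bool(b1 & 0x02),       # RB
        "middle": bool(b1 & 0x04),      # MB
        "dx": dx,
        "dy": dy,
        "x_overflow": bool(b1 & 0x40),  # XV
        "y_overflow": bool(b1 & 0x80),  # YV
    }


if __name__ == "__main__":
    # Left button held, moved 5 units in +X and 2 units in -Y.
    print(decode_ps2_packet(bytes([0x29, 0x05, 0xFE])))
```

Checking the always-1 bit first is a cheap way to notice that the reader has lost packet alignment in the byte stream.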
In Linux, a PS/2 mouse is detected as a /dev/psaux device.
6. Explain in detail about the keyboard.
Keyboard (computing)
In computing, a keyboard is an input device, partially modeled after the typewriter keyboard,
which uses an arrangement of buttons or keys, which act as mechanical levers or electronic
switches. A keyboard typically has characters engraved or printed on the keys and each press of
a key typically corresponds to a single written symbol. However, producing some symbols
requires pressing and holding several keys simultaneously or in sequence. While most keyboard
keys produce letters, numbers or signs (characters), other keys or simultaneous key presses can
produce actions or computer commands.
In normal usage, the keyboard is used to type text and numbers into a word processor, text editor
or other program. In a modern computer, the interpretation of keypresses is generally left to the
software. A computer keyboard distinguishes each physical key from every other and reports all
keypresses to the controlling software. Keyboards are also used for computer gaming, either with
regular keyboards or by using keyboards with special gaming features, which can expedite
frequently used keystroke combinations. A keyboard is also used to give commands to the
operating system of a computer, such as Windows' Control-Alt-Delete combination, which
brings up a task window or shuts down the machine.
Types
Standard
Standard keyboards, such as the 101-key US traditional keyboard or the 104-key Windows keyboard,
include alphabetic characters, punctuation symbols, numbers and a variety of function keys. The
internationally-common 102/105 key keyboards have a smaller 'left shift' key and an additional
key with some more symbols between that and the letter to its right (usually Z or Y).[1]
Laptop-size
Keyboards on laptops and notebook computers usually have a shorter travel distance for the
keystroke and a reduced set of keys. As well, they may not have a numerical keypad, and the
function keys may be placed in locations that differ from their placement on a standard, full-sized keyboard.
Gaming and multimedia
Keyboards with extra keys, such as multimedia keyboards, have special keys for accessing
music, web and other oft-used programs, a mute button, volume buttons or knob and standby
(sleep) button. Gaming keyboards have extra function keys, which can be programmed with
keystroke macros. For example, 'ctrl+shift+y' could be a keystroke that is frequently used in a
certain computer game. Shortcuts marked on color-coded keys are used for some software
applications and for specialized uses including word processing, video editing, graphic design
and audio editing.
Thumb-sized
Smaller keyboards have been introduced for laptops, PDAs, cellphones or users who have a
limited workspace. The size of a standard keyboard is dictated by the practical consideration that
the keys must be large enough to be easily pressed by fingers. To reduce the size of the
keyboard, the numeric keyboard to the right of the alphabetic keyboard can be removed, or the
size of the keys can be reduced, which makes it harder to enter text.
Another way to reduce the size of the keyboard is to reduce the number of keys and use a chording
keyer, i.e. pressing several keys simultaneously. For example, the GKOS keyboard has been
designed for small wireless devices. Other two-handed alternatives more akin to a game
controller, such as the AlphaGrip, are also used as a way to input data and text. Another way to
reduce the size of a keyboard is to use smaller buttons and pack them closer together. Such
keyboards, often called a "thumbboard" (thumbing) are used in some personal digital assistants
such as the Palm Treo and BlackBerry and some Ultra-Mobile PCs such as the OQO.
Numeric
Numeric keyboards contain only numbers, mathematical symbols for addition, subtraction,
multiplication, and division, a decimal point, and several function keys (e.g. End, Delete, etc.).
They are often used to facilitate data entry with smaller keyboard-equipped laptops or with
smaller keyboards that do not have a numeric keypad.
Non-standard or special-use types
Chorded
A keyset or chorded keyboard is a computer input device that allows the user to enter characters
or commands formed by pressing several keys together, like playing a "chord" on a piano. The
large number of combinations available from a small number of keys allows text or commands to
be entered with one hand, leaving the other hand free to do something else. A secondary
advantage is that it can be built into a device (such as a pocket-sized computer) that is too small
to contain a normal sized keyboard. A chorded keyboard designed to be used while held in the
hand is called a keyer.
Virtual
Virtual keyboards, such as the I-Tech Virtual Laser Keyboard, project an image of a full-size
keyboard onto a surface. Sensors in the projection unit identify which key is being "pressed" and
relay the signals to a computer or personal digital assistant. There is also a virtual keyboard, the
On-Screen Keyboard, for use on Windows. The On-Screen Keyboard is an image of a standard
keyboard which the user controls by using a mouse to hover over the desired letter or symbol,
and then clicks to enter the letter. The On-Screen Keyboard is provided with Windows as an
accessibility aid, to assist users who may have difficulties using a regular keyboard. The iPhone
uses a multi-touch screen to display a virtual keyboard.
7. Explain about the control processor of keyboard.
Control processor
The modern PC keyboard has more than just switches. It also includes a control processor and
indicator lights to provide feedback to the user about what state the keyboard is in. Depending on
the sophistication of the controller's programming, the keyboard may also offer other special
features. The processor is usually a single chip 8048 microcontroller variant. The keyboard
switch matrix is wired to its inputs and it processes the incoming keystrokes and sends the results
down a serial cable (the keyboard cord) to a receiver in the main computer box. It also controls
the illumination of the "caps lock", "num lock" and "scroll lock" lights.
A common test for whether the computer has crashed is pressing the "caps lock" key. The
keyboard sends the key code to the keyboard driver running in the main computer; if the main
computer is operating, it commands the light to turn on. All the other indicator lights work in a
similar way. The keyboard driver also tracks the shift, alt and control state of the keyboard.
When pressing a keyboard key, the key "bounces" like a ball against its contacts several times
before it settles into firm contact. When released, it bounces some more until it reverts to the
uncontacted state. If the computer were watching for each pulse, it would see many keystrokes
for what the user thought was just one. To resolve this problem, the processor in a keyboard (or
computer) "debounces" the keystrokes, by aggregating them across time to produce one
"confirmed" keystroke that (usually) corresponds to what is typically a solid contact.
Some low-quality keyboards suffer problems with rollover (that is, when multiple keys are
pressed in quick succession); some types of keyboard circuitry will register a maximum number
of keys at one time. This is undesirable for games (designed for multiple keypresses, e.g. casting
a spell while holding down keys to run) and undesirable for extremely fast typing (hitting new
keys before the fingers can release previous keys). A common side effect of this shortcoming is
a "phantom key": on some keyboards, pressing three keys simultaneously
sometimes results in a 4th keypress being registered.
Modern keyboards prevent this from happening by blocking the 3rd key in certain key
combinations, but while this prevents phantom input, it also means that when two keys are
depressed simultaneously, many of the other keys on the keyboard will not respond until one of
the two depressed keys is lifted. With better keyboard designs, this seldom happens in office
programs, but it remains a problem in games even on expensive keyboards, due to wildly
different and/or configurable key/command layouts in different games.
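Two of the controller tasks described in this answer, debouncing and detecting phantom keys, can be sketched as below. Both functions are simplified illustrations under assumed conditions: contact samples arrive as a list of 0/1 values, a key counts as settled after `stable_count` identical samples, and the matrix has no isolating diodes. The names `debounce` and `phantom_keys` are hypothetical.

```python
def debounce(samples, stable_count=3):
    """Collapse raw 0/1 contact samples into confirmed press/release events."""
    state = 0                      # last confirmed state (0 = open, 1 = closed)
    run_value, run_length = 0, 0   # current run of identical samples
    events = []
    for sample in samples:
        if sample == run_value:
            run_length += 1
        else:
            run_value, run_length = sample, 1
        # Accept a change only after the contact has held steady long enough.
        if run_length >= stable_count and run_value != state:
            state = run_value
            events.append("press" if state else "release")
    return events


def phantom_keys(pressed):
    """Return matrix positions that read as pressed although no key is down there.

    pressed is a set of (row, col) positions. When three corners of a rectangle
    in the matrix are closed, current can reach the fourth corner as well.
    """
    ghosts = set()
    for (r1, c1) in pressed:
        for (r2, c2) in pressed:
            if r1 != r2 and c1 != c2:
                # (r1, c1) and (r2, c2) are opposite corners of a rectangle.
                for corner, other in (((r1, c2), (r2, c1)), ((r2, c1), (r1, c2))):
                    if other in pressed and corner not in pressed:
                        ghosts.add(corner)
    return ghosts


if __name__ == "__main__":
    bouncy = [1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
    print(debounce(bouncy))
    print(phantom_keys({(0, 0), (0, 1), (1, 0)}))
```

A controller that blocks the third key of such a combination is, in effect, refusing to report any press for which `phantom_keys` would return a non-empty set.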
PART-B
UNIT—5
ADVANCED SYSTEM ARCHITECTURE.
1. Write in short about VLIW architecture.
Very-Long Instruction Word (VLIW)
Computer Architecture
ABSTRACT
VLIW architectures are distinct from traditional RISC and CISC architectures
implemented in current mass-market microprocessors. It is important to
distinguish instruction-set architecture—the processor programming
model—from implementation—the physical chip and its characteristics.
VLIW microprocessors and superscalar implementations of traditional
instruction sets share some characteristics—multiple execution units and the
ability to execute multiple operations simultaneously. The techniques used
to achieve high performance, however, are very different because the
parallelism is explicit in VLIW instructions but must be discovered by
hardware at run time by superscalar processors.
VLIW implementations are simpler for very high performance. Just as RISC
architectures permit simpler, cheaper high-performance implementations
than do CISCs, VLIW architectures are simpler and cheaper than RISCs
because of further hardware simplifications. VLIW architectures, however,
require more compiler support.
Philips Semiconductors
Introduction to VLIW Computer Architecture
INTRODUCTION AND MOTIVATION
Currently, in the mid 1990s, IC fabrication technology is advanced enough to allow unprecedented
implementations of computer architectures on a single chip. Also, the current rate of process
advancement allows implementations to be improved at a rate that is satisfying for most of the
markets these implementations serve. In particular, the vendors of general-purpose microprocessors
are competing for sockets in desktop personal computers (including workstations) by pushing the
envelopes of clock rate (raw operating speed) and parallel execution.
The market for desktop microprocessors is proving to be extremely dynamic. In particular, the x86
market has surprised many observers by attaining performance levels and price/performance levels
that many thought were out of reach. The reason for the pessimism about the x86 was its
architecture (instruction set). Indeed, with the advent of RISC architectures, the x86 is now
recognized as a deficient instruction set.
Instruction set compatibility is at the heart of the desktop microprocessor market. Because the
application programs that end users purchase are delivered in binary (directly executable by the
microprocessor) form, the end users’ desire to protect their software investments creates
tremendous instruction-set inertia.
There is a different market, though, that is much less affected by instruction-set inertia. This
market is typically called the embedded market, and it is characterized by products containing
factory-installed software that runs on a microprocessor whose instruction set is not readily
evident to the end user. Although the vendor of the product containing the embedded microprocessor
has an investment in the embedded software, just like end users with their applications, there is
considerably more freedom to migrate embedded software to a new microprocessor with a different
instruction set. To overcome this lower level of instruction-set inertia, all it takes is a
sufficiently better set of implementation characteristics, particularly absolute performance
and/or price-performance.
This lower level of instruction-set inertia gives the vendors of embedded microprocessors the
freedom and initiative to seek out new instruction sets. The relative success of RISC
microprocessors in the high-end of the embedded market is an example of innovation by
microprocessor vendors that produced a benefit large enough to overcome the market’s inertia. To
the vendors’ disappointment, the benefits of RISCs have not been sufficient to overcome the
instruction-set inertia of the mainstream desktop computer market.
Because of advances in IC fabrication technology and advances in high-level language compiler
technology, it now appears that microprocessor vendors are compelled by the potential benefits of
another change in microprocessor instruction sets. As before, the embedded market is likely to be
first to accept this change.
The new direction in microprocessor architecture is toward VLIW (very long instruction word)
instruction sets. VLIW architectures are characterized by instructions that each specify several
independent operations. This is compared to RISC instructions that typically specify one operation
and CISC instructions that typically specify several dependent operations. VLIW instructions are
necessarily longer than RISC or CISC instructions, thus the name.
2. Write short notes on the comparison of RISC and CISC.
IMPLEMENTATION COMPARISON: SUPERSCALAR CISC, SUPERSCALAR RISC, VLIW
The differences between CISC, RISC, and VLIW architectures manifest themselves in their respective
implementations. Comparing high-performance implementations of each is the most telling.
High-performance RISC and CISC designs are called superscalar implementations. Superscalar in this
context simply means “beyond scalar” where scalar means one operation at a time. Thus, superscalar
means more than one operation at a time.
Most CISC instruction sets were designed with the idea that an implementation will fetch one
instruction, execute its operations fully, then move on to the next instruction. The assumed
execution model was thus serial in nature.
RISC architects were aware of the advantages and peculiarities of pipelined processor
implementations, and so designed RISC instruction sets with a pipelined execution model in mind.
In contrast to the assumed CISC execution model, the idea for the RISC execution model is that an
implementation will fetch one instruction, issue it into the pipeline, and then move on to the
next instruction before the previous one has completed its trip through the pipeline.
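The difference between the two execution models can be made concrete with a small cycle count. This is an idealized sketch: it assumes every stage takes one clock cycle and that the pipeline never stalls, neither of which is stated in the text, and the function names are hypothetical.

```python
def serial_cycles(n_instructions, n_stages):
    """Each instruction runs through every stage before the next one is fetched."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages):
    """A new instruction enters the pipeline every cycle once the first has been issued."""
    if n_instructions == 0:
        return 0
    # The first instruction needs n_stages cycles; each later one finishes one cycle after it.
    return n_stages + n_instructions - 1


if __name__ == "__main__":
    n, k = 10, 5
    print("serial:   ", serial_cycles(n, k))
    print("pipelined:", pipelined_cycles(n, k))
```

Under these assumptions, ten instructions on a five-stage machine take 50 cycles serially but 14 when pipelined, which is the benefit the RISC designers were aiming for.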
3. Write short notes on advantage of VLIW.
SOFTWARE INSTEAD OF HARDWARE: IMPLEMENTATION ADVANTAGES OF VLIW
A VLIW implementation achieves the same effect as a superscalar RISC or CISC implementation, but
the VLIW design does so without the two most complex parts of a high-performance superscalar
design.
Because VLIW instructions explicitly specify several independent operations (that is, they
explicitly specify parallelism), it is not necessary to have decoding and dispatching hardware
that tries to reconstruct parallelism from a serial instruction stream. Instead of having hardware
attempt to discover parallelism, VLIW processors rely on the compiler that generates the VLIW code
to explicitly specify parallelism. Relying on the compiler has advantages.
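To give a feel for what "the compiler specifies parallelism" means, the sketch below packs a list of operations into long instruction words. It is a deliberately simple in-order greedy packer, not the scheduling algorithm of any real VLIW compiler; the `Op` type, the register names and the default word width of 4 are assumptions, and the dependence check is conservative.

```python
from typing import NamedTuple


class Op(NamedTuple):
    dest: str      # register written by the operation
    srcs: tuple    # registers or values read by the operation


def conflicts(op, bundle):
    """True if op cannot share a long instruction word with the operations in bundle."""
    for earlier in bundle:
        if earlier.dest in op.srcs:    # op reads a value produced in the same word
            return True
        if earlier.dest == op.dest:    # both operations write the same register
            return True
        if op.dest in earlier.srcs:    # op overwrites a value the earlier op reads
            return True
    return False


def pack(ops, width=4):
    """Pack operations, in program order, into words of at most `width` operations."""
    bundles, current = [], []
    for op in ops:
        if len(current) == width or conflicts(op, current):
            bundles.append(current)    # close the current word and start a new one
            current = []
        current.append(op)
    if current:
        bundles.append(current)
    return bundles


if __name__ == "__main__":
    program = [
        Op("r1", ("a", "b")),
        Op("r2", ("c", "d")),
        Op("r3", ("r1", "r2")),   # depends on the two operations above
        Op("r4", ("e", "f")),
    ]
    for word in pack(program):
        print([op.dest for op in word])
```

All of the dependence checking happens here, before the program runs, so the hardware can issue every operation in a word at once without examining it.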
First, the compiler has the ability to look at much larger windows of instructions than the
hardware. For a superscalar processor, a larger hardware window implies a larger amount of logic
and therefore chip area. At some point, there simply is not enough of either, and window size is
constrained. Worse, even before a simple limit on the amount of hardware is reached, complexity
may adversely affect the speed of the logic, thus the window size is constrained to avoid reducing
the clock speed of the chip. Software windows can be arbitrarily large. Thus, looking for
parallelism in a software window is likely to yield better results.
Second, the compiler has knowledge of the source code of the program. Source code typically
contains important information about program behavior that can be used to help express maximum
parallelism at the instruction-set level. A powerful technique called trace-driven compilation can
be employed to dramatically improve the quality of code output by the compiler. Trace-driven
compilation first produces a suboptimal, but correct, VLIW program. The program has embedded
routines that take note of program behavior. The recorded program behavior (which branches are
taken, how often, etc.) is then used by the compiler during a second compilation to produce code
that takes advantage of accurate knowledge of program behavior.
Philips Semiconductors
4. Write short notes on RISC SYSTEM.
Reduced instruction set computer
The acronym RISC (pronounced as risk), for reduced instruction set computer, represents a
CPU design strategy emphasizing the insight that simplified instructions that "do less" may still
provide for higher performance if this simplicity can be utilized to make instructions execute
very quickly. Many proposals for a "precise" definition[1] have been attempted, and the term is
being slowly replaced by the more descriptive load-store architecture. Well known RISC
families include Alpha, ARC, ARM, AVR, MIPS, PA-RISC, Power Architecture (including
PowerPC), SuperH, and SPARC.
RISC is an old idea. Some aspects attributed to the first RISC-labeled designs (around 1975)
include the observations that the memory-restricted compilers of the time were often unable to
take advantage of features intended to facilitate coding, and that complex addressing inherently
takes many cycles to perform. It was argued that such functions would be better performed by
sequences of simpler instructions, if this could yield implementations simple enough to cope
with really high frequencies, and small enough to leave room for many registers[2], factoring out
slow memory accesses. Uniform, fixed-length instructions with arithmetic restricted to registers
were chosen to ease instruction pipelining in these simple designs, with special load-store
instructions accessing memory.
5. Write about the typical characteristics of RISC.
Typical characteristics of RISC
For any given level of general performance, a RISC chip will typically have far fewer transistors
dedicated to the core logic, which originally allowed designers to increase the size of the
register set and increase internal parallelism.
Other features typically found in RISC architectures are:
Uniform instruction format, using a single word with the opcode in the same bit positions in
every instruction, demanding less decoding;
Identical general purpose registers, allowing any register to be used in any context, simplifying
compiler design (although normally there are separate floating point registers);
Simple addressing modes. Complex addressing performed via sequences of arithmetic and/or
load-store operations;
Few data types in hardware: some CISCs have byte string instructions or support complex
numbers; this is so far unlikely to be found on a RISC.
Exceptions abound, of course, within both CISC and RISC.
RISC designs are also more likely to feature a Harvard memory model, where the instruction stream and
the data stream are conceptually separated; this means that modifying the memory where code is held
might not have any effect on the instructions executed by the processor (because the CPU has a
separate instruction and data cache), at least until a special synchronization instruction is issued. On the
upside, this allows both caches to be accessed simultaneously, which can often improve performance.
6. Write about the comparison of RISC and x86 systems.
RISC and x86
However, despite many successes, RISC has made few inroads into the desktop PC and
commodity server markets, where Intel's x86 platform remains the dominant processor
architecture (Intel is facing increased competition from AMD, but even AMD's processors
implement the x86 platform, or a 64-bit superset known as x86-64). There are three main reasons
for this.
1. The very large base of proprietary PC applications is written for x86, whereas no RISC platform
has a similar installed base, and this meant PC users were locked into the x86.
2. Although RISC was indeed able to scale up in performance quite quickly and cheaply, Intel took
advantage of its large market by spending vast amounts of money on processor development.
Intel could spend many times as much as any RISC manufacturer on improving low level design
and manufacturing. The same could not be said about smaller firms like Cyrix and NexGen, but
they realized that they could apply pipelined design philosophies and practices to the x86
architecture, either directly as in the 6x86 and MII series, or indirectly (via extra decoding
stages) as in Nx586 and AMD K5.
3. Later, more powerful processors such as Intel P6 and AMD K6 had similar RISC-like units that
executed a stream of micro-operations generated from decoding stages that split most x86
instructions into several pieces. Today, these principles have been further refined and are used
by modern x86 processors such as Intel Core 2 and AMD K8. The first available chip deploying
such techniques was the NexGen Nx586, released in 1994 (while the AMD K5 was severely
delayed and released in 1995).
While early RISC designs were significantly different than contemporary CISC designs, by 2000
the highest performing CPUs in the RISC line were almost indistinguishable from the highest
performing CPUs in the CISC line.[12][13][14]
PART-C
UNIT—5
ADVANCED SYSTEM ARCHITECTURE.
1. Compare RISC and CISC.
The simplest way to examine the advantages and disadvantages of RISC architecture is
by contrasting it with its predecessor: CISC (Complex Instruction Set Computers)
architecture.
Multiplying Two Numbers in Memory
The accompanying diagram represents the storage scheme for a generic computer. The main
memory is divided into locations numbered from (row) 1: (column) 1 to (row) 6: (column) 4.
The execution unit is responsible for carrying out all computations. However, the execution
unit can only operate on data that has been loaded into one of the six registers (A, B, C, D,
E, or F). Let's say we want to find the product of two numbers, one stored in location 2:3 and
another stored in location 5:2, and then store the product back in location 2:3.
The CISC Approach
The primary goal of CISC architecture is to complete a task in as few lines of assembly as
possible. This is achieved by building processor hardware that is capable of understanding
and executing a series of operations. For this particular task, a CISC processor would come
prepared with a specific instruction (we'll call it "MULT"). When executed, this instruction
loads the two values into separate registers, multiplies the operands in the execution unit,
and then stores the product in the appropriate register. Thus, the entire task of multiplying
two numbers can be completed with one instruction:
MULT 2:3, 5:2
MULT is what is known as a "complex instruction." It operates directly on the
computer's memory banks and does not require the programmer to explicitly call any
loading or storing functions. It closely resembles a command in a higher level language.
For instance, if we let "a" represent the value of 2:3 and "b" represent the value of 5:2,
then this command is identical to the C statement "a = a * b."
One of the primary advantages of this system is that the compiler has to do very little
work to translate a high-level language statement into assembly. Because the length of
the code is relatively short, very little RAM is required to store instructions. The
emphasis is put on building complex instructions directly into the hardware.
The RISC Approach
RISC processors only use simple instructions that can be executed within one clock
cycle. Thus, the "MULT" command described above could be divided into three separate
commands: "LOAD," which moves data from the memory bank to a register, "PROD,"
which finds the product of two operands located within the registers, and "STORE,"
which moves data from a register to the memory banks. In order to perform the exact
series of steps described in the CISC approach, a programmer would need to code four
lines of assembly:
LOAD A, 2:3
LOAD B, 5:2
PROD A, B
STORE 2:3, A
At first, this may seem like a much less efficient way of completing the operation.
Because there are more lines of code, more RAM is needed to store the assembly level
instructions. The compiler must also perform more work to convert a high-level
language statement into code of this form.
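The two approaches can be compared on a toy model of the generic machine described above.
This is a minimal sketch, not a model of any real processor: the class name Machine, the
operand values 6 and 7, and the instruction counter are assumptions made for illustration.

```python
class Machine:
    """Toy model of the generic computer: (row, column) memory plus registers A-F."""

    def __init__(self, memory):
        self.memory = dict(memory)
        self.registers = {name: 0 for name in "ABCDEF"}
        self.instructions = 0  # instructions written by the programmer

    # RISC-style simple instructions
    def load(self, reg, addr):
        self.registers[reg] = self.memory[addr]
        self.instructions += 1

    def prod(self, dst, src):
        self.registers[dst] *= self.registers[src]
        self.instructions += 1

    def store(self, addr, reg):
        self.memory[addr] = self.registers[reg]
        self.instructions += 1

    # CISC-style complex instruction: the same steps hidden inside one instruction
    def mult(self, addr_a, addr_b):
        a = self.memory[addr_a]
        b = self.memory[addr_b]
        self.memory[addr_a] = a * b
        self.instructions += 1


initial = {(2, 3): 6, (5, 2): 7}  # hypothetical operand values

cisc = Machine(initial)
cisc.mult((2, 3), (5, 2))

risc = Machine(initial)
risc.load("A", (2, 3))
risc.load("B", (5, 2))
risc.prod("A", "B")
risc.store((2, 3), "A")

print(cisc.memory[(2, 3)], cisc.instructions)
print(risc.memory[(2, 3)], risc.instructions)
```

Both runs leave the same product in location 2:3; the CISC version uses one instruction and
the RISC version four, which matches the line counts given in the text.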
However, the RISC strategy also brings some very important advantages. Because each instruction
requires only one clock cycle to execute, the entire program will execute in approximately the
same amount of time as the multi-cycle "MULT" command. These RISC "reduced instructions" require
fewer transistors of hardware space than the complex instructions, leaving more room for general
purpose registers. Because all of the instructions execute in a uniform amount of time (i.e. one
clock), pipelining is possible.
CISC | RISC
Emphasis on hardware | Emphasis on software
Includes multi-clock complex instructions | Single-clock, reduced instruction only
Memory-to-memory: "LOAD" and "STORE" incorporated in instructions | Register to register: "LOAD" and "STORE" are independent instructions
Small code sizes, high cycles per second | Low cycles per second, large code sizes
Transistors used for storing complex instructions | Spends more transistors on memory registers
Separating the "LOAD" and "STORE" instructions actually reduces the amount of work
that the computer must perform. After a CISC-style "MULT" command is executed, the
processor automatically erases the registers. If one of the operands needs to be used
for another computation, the processor must re-load the data from the memory bank
into a register. In RISC, the operand will remain in the register until another value is
loaded in its place.
The Performance Equation
The following equation is commonly used for expressing a computer's performance ability:
time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
The CISC approach attempts to minimize the number of instructions per program,
sacrificing the number of cycles per instruction. RISC does the opposite, reducing the
cycles per instruction at the cost of the number of instructions per program.
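The trade-off can be checked with a few lines of arithmetic. The sketch assumes the usual form
of the relation, time per program = instructions per program x cycles per instruction x time
per cycle. The function name execution_time_ns and all numbers are hypothetical: the complex
"MULT" is assumed to take four cycles, and both machines are assumed to run with a 1 ns cycle.

```python
def execution_time_ns(instructions: int, cycles_per_instruction: int,
                      cycle_time_ns: int) -> int:
    """Time per program as the product of the three performance factors."""
    return instructions * cycles_per_instruction * cycle_time_ns


# CISC: one complex instruction, assumed to need four cycles.
cisc_time = execution_time_ns(instructions=1, cycles_per_instruction=4, cycle_time_ns=1)

# RISC: four simple instructions (LOAD, LOAD, PROD, STORE), one cycle each.
risc_time = execution_time_ns(instructions=4, cycles_per_instruction=1, cycle_time_ns=1)

print(cisc_time, risc_time)
```

Under these assumptions both programs take the same time, which illustrates why fewer
instructions do not by themselves make a program faster.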
RISC Roadblocks
Despite the advantages of RISC based processing, RISC chips took over a decade to
gain a foothold in the commercial world. This was largely due to a lack of software
support.
Although Apple's Power Macintosh line featured RISC-based chips and Windows NT was RISC
compatible, Windows 3.1 and Windows 95 were designed with CISC processors in mind. Many
companies were unwilling to take a chance with the emerging RISC technology. Without
commercial interest, processor developers were unable to manufacture RISC chips in large
enough volumes to make their price competitive.
Another major setback was the presence of Intel. Although their CISC chips were
becoming increasingly unwieldy and difficult to develop, Intel had the resources to plow
through development and produce powerful processors. Although RISC chips might
surpass Intel's efforts in specific areas, the differences were not great enough to
persuade buyers to change technologies.
The Overall RISC Advantage
Today, the Intel x86 is arguably the only chip which retains CISC architecture. This is
primarily due to advancements in other areas of computer technology. The price of
RAM has decreased dramatically. In 1977, 1MB of DRAM cost about $5,000. By 1994,
the same amount of memory cost only $6 (when adjusted for inflation). Compiler
technology has also become more sophisticated, so that the RISC use of RAM and
emphasis on software has become ideal.
2. Write about the advantage of compiler complexity.
THE ADVANTAGE OF COMPILER COMPLEXITY OVER HARDWARE COMPLEXITY
While a VLIW architecture reduces hardware complexity over a superscalar implementation, a
much more complex compiler is required. Extracting maximum performance from a superscalar RISC
or CISC implementation does require sophisticated compiler techniques, but the level of
sophistication in a VLIW compiler is significantly higher.
VLIW simply moves complexity from hardware into software. Luckily, this trade-off has a
significant side benefit: the complexity is paid for only once, when the compiler is written
instead of every time a chip is fabricated. Among the possible benefits is a smaller chip,
which leads to increased profits for the microprocessor vendor and/or cheaper prices for the
customers that use the microprocessors. Complexity is usually easier to deal with in a software
design than in a hardware design. Thus, the chip may cost less to design, be quicker to design,
and may require less debugging, all of which are factors that can make the design cheaper.
Also, improvements to the compiler can be made after chips have been fabricated; improvements
to superscalar dispatch hardware require changes to the microprocessor, which naturally incurs
all the expenses of turning a chip design.
PRACTICAL VLIW ARCHITECTURES AND IMPLEMENTATIONS
The simplest VLIW instruction format encodes an operation for every execution unit in the
machine. This makes sense under the assumption that every instruction will always have
something useful for every execution unit to do. Unfortunately, despite the best efforts of
the best compiler algorithms, it is typically not possible to pack every instruction with work
for all execution units. Also, in a VLIW machine that has both integer and floating-point
execution units, the best compiler would not be able to keep the floating-point units busy
during the execution of an integer-only application.
FIGURE 4
Introduction to VLIW Computer Architecture
The problem with instructions that do not make full use of all execution units is that they
waste precious processor resources: instruction memory space, instruction cache space, and bus
bandwidth.
There are at least two solutions to reducing the waste of resources due to sparse instructions.
First, instructions can be compressed with a more highly-encoded representation. Any number of
techniques, such as Huffman encoding to allocate the fewest bits to the most frequently used
operations, can be used.
Second, it is possible to define an instruction word that encodes fewer operations than the
number of available execution units. Imagine a VLIW machine with ten execution units but an
instruction word that can describe only five operations. In this scheme, a unit number is
encoded along with the operation; the unit number specifies to which execution unit the
operation should be sent. The benefit is better utilization of resources. A potential problem
is that the shorter instruction prohibits the machine from issuing the maximum possible number
of operations at any one time. To prevent this problem from limiting performance, the size of
the instruction word can be tuned based on analysis of simulations of program behavior.
Of course, it is completely reasonable to combine these two techniques: use compression on
shorter-than-maximum-length instructions.
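The saving from the second scheme can be estimated with simple arithmetic. The sketch below
uses the ten-unit, five-operation machine from the text; the 32-bit operation width and the
function names are assumptions made for illustration, and the unit number is assumed to use
the smallest whole number of bits that can name every execution unit.

```python
def full_word_bits(units: int, op_bits: int) -> int:
    """Instruction word with one operation slot per execution unit."""
    return units * op_bits


def short_word_bits(units: int, slots: int, op_bits: int) -> int:
    """Instruction word with fewer slots, each tagged with a unit number."""
    unit_id_bits = (units - 1).bit_length()  # bits needed to name any unit
    return slots * (op_bits + unit_id_bits)


OP_BITS = 32  # hypothetical width of one encoded operation

full = full_word_bits(units=10, op_bits=OP_BITS)
short = short_word_bits(units=10, slots=5, op_bits=OP_BITS)
print(full, short)
```

With these assumed widths the full word takes 320 bits and the shorter word 180 bits, at the
cost of issuing at most five operations per instruction.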
3. Write about the historical perspective of VLIW.
HISTORICAL PERSPECTIVE
VLIW is not a new computer architecture. Horizontal microcode, a processor implementation
technique in use for decades, defines a specialized, low-level VLIW architecture. This
low-level architecture runs a microprogram that interprets (emulates) a higher-level
(user-visible) instruction set. The VLIW nature of the horizontal microinstructions is used to
attain a high-performance interpretation of the high-level instruction set by executing several
low-level steps concurrently. Each horizontal microcode instruction encodes many irregular,
specialized operations that are directed at primitive logic blocks inside a processor. From the
outside, the horizontally microcoded processor appears to be directly running the emulated
instruction set.
In the 1980s, a few small companies attempted to commercialize VLIW architectures in the
general-purpose market. Unfortunately, they were ultimately unsuccessful. Multiflow is the most
well known. Multiflow’s founders were academicians who did pioneering, fundamental research
into VLIW compilation techniques. Multiflow’s computers worked, but the company was probably
about a decade ahead of its time. The Multiflow machines, built from discrete parts, could not
keep pace with the rapid advances in single-chip microprocessors. Using today’s technology,
they would have a better chance at being competitive.
In the early 1990s, Intel introduced the i860 RISC microprocessor. This simple chip had two
modes of operation: a scalar mode and a VLIW mode. In the VLIW mode, the processor always
fetched two instructions and assumed that one was an integer instruction and the other
floating-point. A single program could switch (somewhat painfully) between the scalar and VLIW
modes, thus implementing a crude form of code compression. Ultimately, the i860 failed in the
market. The chip was positioned to compete with other general-purpose microprocessors for
desktop computers, but it had compilers of insufficient quality to satisfy the needs of this
market.