FUNDAMENTALS OF INFORMATION TECHNOLOGY
ITEC1104 LECTURE NOTES
© Copyright 2007 Mrs. G. Campbell

TABLE OF CONTENTS

1. Module 1 – Introduction to computers and information technology
    1.1. Definition of Information Technology
        Purpose of Information Technology
        Functions of Information Technology
        Benefits of Information Technology
        Application of Information Technology
    1.2. Definition of computer system
        What is a computer?
        What is a computer system?
    1.3. Functions of computer systems
    1.4. Components of computer systems
        Memory (RAM & ROM, Cache)
        Central Processing Unit (CPU) – parts and their functions
        A.L.U.
        Control Unit
    1.5. Storage
        Units of storage
        Magnetic tape unit
        Magnetic disk
        Optical disk unit
        MicroFiche/Film
        Smart Card
        Flash Card/Picture Card
        Flash Drive
    1.6. Other peripheral devices
        Input
        Output
2. Module Two - Software
    2.1. Definition of software
    2.2. System Software
        Operating systems
        Utilities
    2.3. Application software
        Productivity tools/business
        Graphic design/multimedia
        Education/personal/home
        Communication
        Programming languages/program development software
3. Module Three - Data and Information
    3.1. Definitions of data and information
        Data
        Information
    3.2. Desired characteristics of information
    3.3. Information systems
        Information requirements and information systems used in the functional units of an enterprise
4. Module 4 – Telecommunication and Computer Networks
    4.1. Definition of networks
    4.2. Uses of networks/Role in business
        Intranet
        Extranet
        Internet
    4.3. Advantages and Disadvantages of Networks
    4.4. Network classifications
    4.5. Network Topologies
        Bus
        Ring
        Star
    4.6. Network components
    4.7. Software: Network Operating Systems
    4.8. Transmission Media: wireless, wired
        Wired Transmission Media (Physical/Guided)
        Wireless Transmission Media (Unguided)
5. Module 5 – Network Security
    5.1. Define Computer security
    5.2. What is a computer security risk?
    5.3. Categories of risk and their effects
    5.4. Risk Management Solutions
        Backup is the key – the ultimate safeguard
6. Module 6 – Database Management Systems
    6.1. Traditional/File Processing Approach versus Database Approach
    6.2. What is a Database?
    6.3. What is a DataBase Management System (DBMS)?
    6.4. Examples of DBMS’s
    6.5. Common examples of databases in society
    6.6. Sample Payroll Database Structure (Single example)
    6.7. The languages used in database systems (Data definition and Data manipulation)
    6.8. Functions/features common to most DBMS’s
    6.9. Database Administration
    6.10. Types of databases/Database models
        Hierarchical Model
        Network Model
        Relational
        Object-Oriented
        Multidimensional
    6.11. The advantages of databases
    6.12. The disadvantages of databases
    6.13. Data Warehousing
7. Module 7 – Information Technology in the business office
    7.1. Definition and purpose of office automation
    7.2. Features of office automation
        Facsimile
        Voice mail
        Voice messaging
        Telemarketing
        Teleconferencing
        Telecommuting
        Electronic fund transfer
        E-commerce
        Electronic mail
        Internet
    7.3. Application of computers in various fields

1. Module 1 – Introduction to computers and information technology

1.1. Definition of Information Technology
Information Technology (IT) entails all aspects of managing and processing information. IT includes any and all hardware, software, and data used to create, store, process, and communicate information electronically, as well as the services used to maintain the operation of those resources.

Purpose of Information Technology
There are many purposes of IT. IT is used:
• to improve the operations of any organization or individual by using technology as the underlying tool to improve the processing and dissemination of information.
• to help organizations achieve profitable results and keep competitive forces in check.
• to optimize the effectiveness and efficiency of processes within the industry in which it is used, whether education, business, science, etc.
• to solve problems.

Functions of Information Technology
1. Provide supporting information to assist managers in making strategic decisions
2. Provide effective communication
3. Allow the effective management of information: capturing, generation, storage, retrieval and transmission of information

Benefits of Information Technology
1. Speed: The processing of transactions is carried out at high speeds. This is made possible primarily by the ability of computers to perform information processing in fractions of a second. For example, computers are able to perform complex mathematical calculations within milliseconds.
2. Consistency: Once a computer has been given the correct instructions to execute a specific command, that command will be executed consistently without variation each time. For example, the addition of the two numbers one (1) and five (5) will result in the answer six (6) each time the addition is carried out.
3. Storage: A computer can transfer data quickly from storage to memory, process it and then store it again for future use.
4. Reliability: Computer systems provide reliability by ensuring consistency, speed and precision in the execution of tasks. Additionally, computers can carry out human-related tasks with greater efficiency and fewer errors.
5. Communications: Most computers today can communicate with other computers, often wirelessly.

Application of Information Technology
• Tourism Industry
• Education
• Edutainment
• Entertainment
• Business
• Science
• Architecture
• Personal computing etc.

1.2. Definition of computer system

What is a computer?
A computer is an electronic (no moving parts) device that can accept instructions and input data and can manipulate that data to produce meaningful output. The functions of a computer are therefore:
• Input
• Process
• Storage
• Output

What is a computer system?
A contemporary computer system consists of a central processing unit (CPU), primary storage, secondary storage, input devices, output devices, and communications devices.
2. System Unit: A case that contains the electronic "brain" of the computer and the main computational components that are used to process data.
3. Storage devices: Store and hold data, instructions and information for future use.
4. Input devices: Computer hardware that allows a user to enter data and instructions.
5. Output devices: Computer hardware that allows a user to receive information.
6. Communications devices: Hardware components that enable a computer to send and receive data, instructions and information to or from one or more computers.
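The four functions listed under "What is a computer?" (input, process, storage, output) can be illustrated with a very small program. This is only a sketch: the function name `run_cycle`, the exam-mark data and the dictionary used as "storage" are all hypothetical choices made for the illustration, not part of the notes.

```python
# Hypothetical illustration of the four functions of a computer.

def run_cycle(raw_marks, store):
    # Input: accept raw data (here, exam marks typed in as text).
    marks = [int(m) for m in raw_marks]
    # Process: manipulate the data into something meaningful.
    average = sum(marks) / len(marks)
    # Storage: keep the result for future use.
    store["average"] = average
    # Output: present the information to the user.
    return f"Average mark: {average:.1f}"

store = {}
message = run_cycle(["70", "85", "91"], store)
print(message)  # Average mark: 82.0
```

Running the same cycle again with the same marks gives the same result each time, which is the consistency benefit described earlier.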
1.3. Functions of computer systems
Computers are used to speed up the work/functions carried out in a business. Without them, the vast number of transactions that exist today could not possibly be processed at the desired speed and accuracy. Computers also allow for easier retrieval of files and save on physical storage space. Computers reduce the amount of paperwork and remove boring, repetitive tasks. Computers also allow persons in different locations to communicate effectively with each other. Computer systems provide supporting information to assist managers in making strategic decisions. Computer systems allow the effective management of information: capturing, generation, storage, retrieval and transmission of information.

1.4. Components of computer systems

Memory (RAM & ROM, Cache)
Primary storage is also called IAS (immediate access storage), main storage or main memory. It holds data and instructions after input until they are needed, and holds information awaiting output. It is fast, with data instantly accessible, and sits in close proximity to the processor. Data must be transferred here before it can be processed. It is expensive and is contained in semiconductor chips. Part of the O/S remains resident in memory. Primary storage holds:
• instructions waiting to be obeyed and instructions currently being obeyed
• data awaiting processing
• data currently being processed and data awaiting output

It is made up of silicon and gallium (semiconductors) on the motherboard. The motherboard is the main circuit board; on it are mounted the microprocessor, memory chips and slots for adding other circuit boards.

RAM - random access memory - another name for main storage. It is arranged like a series of boxes, numbered from 0, so that each location is known. Once data is placed in a box it remains there until replaced by more. Each location has a 0 or 1. Volatile - info is lost if the power is switched off. There are different types of RAM, e.g. DRAM (dynamic RAM – needs to be re-energized constantly),
SRAM (static RAM), SIMM, DIMM (dual inline memory module – pins are on opposite sides of the circuit board) and SDRAM (synchronous DRAM – synchronized to the system clock).

ROM - read only memory. Permanently written during manufacture. Non-volatile - info is kept even if the power is switched off. Firmware - software instructions held in ROM (contents hard wired into the device). Controls the function of the machine. ROM extends the computer instruction set, O/S and control S/W for peripherals.

CMOS – Complementary Metal-Oxide Semiconductor. A chip that holds computer settings/configuration, e.g. date, start-up info, keyboard. A battery is used to keep the information.

PROM - programmable read only memory. Able to have data and programs written to it after manufacture, but once written the contents become permanently fixed. Used to provide permanent instruction capability to the microprocessor on functions to perform each time it is used. Does not lose data when the power is turned off.

EPROM - erasable PROM - PROMs which may be erased by a special process (e.g. through exposure to ultraviolet light) and written again as a new PROM.

EAPROM - electrically alterable PROM - a variant of EPROM.

RAM disk - RAM used as if it were a hard disk (e.g. palm top, data bank). Also called silicon disks. The advantage is fast performance.

Cache – a special high speed memory that operates at the speed of the processor. It holds data or instructions that were recently used, in anticipation that they will be required again in the near future.

Central Processing Unit (CPU) – parts and their functions
The system unit houses the electronic components of the computer that are used to process the data. The CPU is a chip. A chip is a small piece of semiconducting material, usually no larger than ½” square, on which one or more integrated circuits are placed. An integrated circuit is a microscopic pathway capable of carrying electrical current.
Each IC can contain millions of elements such as transistors, which act as electronic switches or gates that open or close the circuit for electronic signals. The motherboard contains many types of chips, one of which is the CPU.

The CPU is where the processing of the information takes place. It is the brain of the computer system. It carries out the basic instructions that operate the computer. It consists of electronic components called chips. It is the fastest of all the devices. Some large computers have more than 1 CPU. The CPU has 3 parts that are integrated by wiring (bus). The 3 parts are the control unit, the arithmetic and logic unit (ALU) and primary memory. Each storage location has an address so that the computer knows where to find it. The CPU is located on the motherboard or system board.

[Figure: the three parts of the CPU – Control Unit, Arithmetic and Logic Unit, Memory]

A.L.U.
Composed of miniature solid state components. This is where calculations are done (add, subtract etc.). It is also where decisions are made based on comparisons. Uses logic operators, e.g. >, <, =, AND, OR etc. The ALU has a space called the accumulator where data items can be stored during processing (same concept as scrap paper).

Control Unit
Also composed of miniature solid state components. Controls the order in which program instructions are carried out. Determines priority of services. Also responsible for the timing of all the operations done in the CPU. Coordinates and controls all hardware operations. It contains the system clock to synchronize or control timing. Instructions are sent here, interpreted, then executed by sending command signals to the appropriate hardware (tells the computer what to do). Instructions in a form which can be used directly by the control unit are called machine instructions, and programs written in the form of machine instructions are written in machine language. The CU is sometimes referred to as the traffic cop.
It interprets instructions issued by a program, then initiates the appropriate action to carry out the instruction.

1.5. Storage

Units of storage
• bit - 0 or 1 (binary digit)
• nibble - 1/2 byte
• byte - 8 bits, one character. Characters are represented by combinations of bits using binary codes (e.g. ASCII codes).
• 32 bit machine - instructions take up 32 bits, i.e. 4 bytes.
• KB - kilobyte - 1024 bytes (call it 1000); 2^10 = 1024
• MB - megabyte - 1 million bytes (1000 KB); 2^20 = 1048576
• GB - gigabyte - 1 billion bytes (1000 MB); 2^30 bytes
• TB - terabyte - 1 trillion bytes (1000 GB); 2^40 bytes

Storage devices
Secondary - also called auxiliary, backing storage or mass storage. Supplements main/primary storage. For mass storage of programs and files (those not currently being operated on but which will be transferred to main storage when required). Less expensive compared to primary storage. NB. Data in secondary storage cannot be directly addressed by the CPU, so it must be transferred to main memory before it can be processed by the CPU. Secondary storage is divided into online and offline storage.

Magnetic tape unit
A serial access device. Data records are stored from start to end. Storage locations cannot be addressed. Records are read in the order stored on the medium. Aluminium strip - the load point marker marks the beginning of the tape for recording. Plastic base coated with metal oxide film. The header label is at the start of the tape - has filename etc. To record data you magnetise various spots on the tape (each spot is a bit). Tapes vary in size and length. The tape is divided into tracks (channels), 7-9 tracks. A char is recorded across the tracks in a row called a frame. Data is read one block at a time. A block is a group of records treated as a single unit during transfer of data. Density = bpi (bytes/chars per inch). Between 200-6250 bpi (800-1600 most common). About 40 million chars per reel. Tape is known as offline storage. You can only write to a tape if it has a "write-permit" ring attached. Typically 1/2" wide, 2400 ft long.
• cassette tape - similar to audio cassette - 1 track.
  1/8" cassette - 280 ft long, 340 KB
• cartridge tape - (1/4", 1/2", 8mm). 8mm stores up to 10GB. Quarter inch cartridge – 40MB-5GB
• digital audio tape (DAT). 4mm DAT stores up to 2-49GB.

Magnetic disk
A Direct Access Storage Device (DASD). Has addressable storage locations, so if the address is known the drive can go straight to it. Records can be retrieved without having to process other records. Coated with magnetic material (e.g. ferrous oxide). Data is recorded as magnetized spots on tracks which run as concentric circles. A track is a narrow recording band that forms a full circle on the surface of the disk.

Floppy disk/Diskette
Made of a thin, circular, flexible plastic disk with a magnetic coating, enclosed in a square shaped plastic shell. Drive A.
• 8" - original, now less common
• 5 1/4" - 360KB (DD), 1.2MB (HD)
• 3 1/2" - 720KB (DD), 1.44MB (HD)

Hard disk
Enclosed in an air tight case. Made of several inflexible circular disks called platters. A platter is made of aluminium, glass or ceramic and is coated with magnetic material. The first partition is always drive C.
• fixed disk - disks not removable from the disk drive
• fixed head disk - fixed disk which has one read/write head per track
• exchangeable disk - disk may be removed from the disk unit. (A disk pack is a set of exchangeable disks on one spindle.)
• winchester disk - fixed disk unit that is sealed and has robust mechanical features. Intended for use in adverse environments (e.g. dusty, humid).

Zip disk/drive
A higher capacity floppy disk that can store the equivalent of about 70 standard floppy disks. It was developed by Iomega Corporation. It stores about 250MB.

SuperDisk drive
Developed by Imation and holds 120MB.

High-Capacity FD (HiFD)
Developed by Sony Electronics and holds 200MB.

Optical disk unit
Drive D or E. Made up of a thin metal polymer compound. Data is recorded by laser burns. It is read by another laser of lower intensity, which detects the pattern of light reflected from the beam by the surface of the disk.
More storage capacity than magnetic disks and less susceptible to damage. Access to data is slower than magnetic. Used for video, audio.

CDs store items using microscopic pits (indentation = 0) and lands (flat areas = 1) in a single spiral track. A high powered laser beam creates the pits and a lower powered laser reads items by reflecting light through the disk. The reflected light is converted into a series of bits that the computer can process. The speed of the drive is measured as a multiple of X, where X is the speed of the original CD-ROM, e.g. 40X, 75X. X = 150KB per second.
• CD-ROM (compact disc - read only memory) - about 4 3/4" (120 mm), 700MB/80 min. Kept in a jewel box.
• CD-R (recordable – can record to it only once), therefore it is a WORM.
• CD-RW (rewritable)
• DVD-ROM (digital video/versatile disk – read only memory). An advancement on CD-ROM technology that can store greater amounts of data, e.g. an entire movie (4.7GB – 17GB). The pits are placed closer together, or there are two layers of pits, or the disk can be double sided.
• DVD-R
• DVD-RAM – can erase and record multiple times
• WORM (write once, read many)
• EO (erasable optical) - magnetic molecules in the disk surface are aligned when heated by a laser beam.
• PhotoCD or PictureCD – holds pictures

MicroFiche/Film
A COM recorder is used. COM - computer output on microfilm. A page of print may be photographically reduced and produced on reels of film. Alternatively the print may be output on small sheets of film called microfiche. Used to reduce filing space for paper. Used in libraries etc. for high volumes of docs. A typical microfiche measuring 105mm x 148mm will hold 98 pages reduced in size about 24 times. Film - 16mm roll, 100-215 foot. COM recorder speed is 10000 to 60000 lines per minute. Easily distributed, lightweight and compact. Microfiche can be duplicated at up to 1000 copies per hour.

Smart Card
Similar in size to a credit card; stores data on a thin microprocessor embedded in the card. Used for electronic money/digital cash.
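The units of storage and the optical drive speed rating (X = 150KB per second) described earlier in this section can be combined into a rough read-time estimate. The sketch below is illustrative only: the function names are hypothetical, it uses the 1024-based definitions of KB and MB, and it assumes the drive sustains its rated speed for the whole disc, which real drives may not.

```python
KB = 1024              # bytes in a kilobyte (2**10)
MB = 1024 * KB         # bytes in a megabyte (2**20)
BASE_RATE = 150 * KB   # 1X = 150 KB per second

def transfer_rate(speed_multiple):
    """Bytes per second for a drive rated at speed_multiple X."""
    return speed_multiple * BASE_RATE

def seconds_to_read(size_bytes, speed_multiple):
    """Idealized time to read size_bytes at the rated speed."""
    return size_bytes / transfer_rate(speed_multiple)

cd_size = 700 * MB
print(transfer_rate(40) // KB)               # 6000 (KB per second at 40X)
print(round(seconds_to_read(cd_size, 40)))   # 119 (seconds)
print(round(seconds_to_read(cd_size, 1) / 60))  # 80 (minutes at 1X)
```

The 1X result of roughly 80 minutes agrees with the 700MB/80 min figure quoted for CD-ROM.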
Flash Card/Picture Card
Stores pictures, from 2MB to 256MB.

Flash Drive
A flash drive is a small removable data storage device that uses flash memory and a USB connector. Flash drives are also known as keydrive, keychain drive, micro hard drive, pen drive, pocket drive, thumb drive, jump drive, USB flash drive, USB flash memory drive, USB key, USB memory key, USB stick, Piripicho (primarily in Spanish), and Kikinou (primarily in French).

[Figure: Flash drive]

1.6. Other peripheral devices

Input
Input is data or instructions that you enter into the memory of the computer. Once input is in memory, the CPU can access it and process the input into output. Input devices are used to load input data, programs, commands and user responses into a computer. Most computers have several input devices.

Keyboard
The most common. The user enters info by typing/pressing the appropriate keys. Most desktops have 101 to 105 keys, while laptops etc. have fewer keys. Data entered into the computer in this way is either sent directly to the processor or first stored (for processing in future) on a magnetic tape in a key-to-tape system or on a disk in a key-to-disk system.

Types of keyboards
• alphanumeric - like in the lab: letters, numbers, special keys (func, ctrl, alt etc.)
• QWERTY – named for the layout of the letter keys
• DVORAK – has an alternative layout designed to improve typing speed. The most frequently used keys are placed in the middle (not very widely used)
• Enhanced – has 12 function keys along the top, 2 CTRL keys, 2 ALT keys and a set of arrow and additional keys.
• special func - e.g. in a fast food restaurant.
• Wireless – transmits data via infrared light waves
• Ergonomic – designed to reduce repetitive strain injuries

Mouse
A pointing device that allows you to control a pointer on the screen. This allows you to move or select items on a screen. The mouse is the most common point and draw device (others include the joystick and trackball).
Used in computer systems with graphical user interfaces (GUI) alongside the keyboard.
• Mechanical mouse – the ball is at the bottom
• Optical mouse – has no moving parts inside; it emits and senses light to detect movement. No need for a mouse pad.
• Cordless/wireless – uses infrared or radio waves
• Trackerball - a variation of the mouse; the ball is on the top side so the mouse stays stationary
• Touchpad/track pad – an area on a laptop where the finger is used. It is sensitive to pressure and motion.

Joystick
A vertical lever (like the gear stick in a car) mounted on a base. Moves the graphics cursor/pointer in the direction the stick is pushed. Commonly used for video games. Has buttons called triggers to activate certain events.

Trackpoint/Pointing stick
Looks like a miniature joystick (or a pencil eraser). Operated with the tip of the finger. Used for the same purpose as a traditional mouse.

Touch screen
This is a monitor that has a touch sensitive panel. You interact by touching the screen with your finger. Because of the arm movement involved, touch screens are not used to enter large amounts of data. They are often used in ATMs, kiosks, hotels, stores and airports.

Pads and Tablets
These are able to recognise neat handwriting by means of a sensitive pad on which the source doc can be filled in by hand using a ball point pen or pencil.

Digitizer Tablet and Pen
Consists of a flat, rectangular plastic board used to input drawings. The tablet has a flat surface on which you can draw using a special pen. Pressure on the surface is detected. Popular with people who draw maps and plans (draughtsmen, architects etc.). Some people also use a puck, which looks like a mouse but allows you to look through it to the tablet.

Scanners, Digitizers/Graphics Tablet
A light sensitive device that converts images into digital data that the computer can understand and represent on the screen.

Bar code readers
Bar codes are readable by light pen, light wand or laser scanner. Bar codes are read optically or magnetically.
Light Pens, Magnetic Pens, Stylus
These resemble pens. More bulky hand held alternatives are sometimes called wands (used in point of sale systems). They can read specially coded data in the form of either optical marks/chars or magnetic codes which have been previously recorded on strips of suitable material. A common version is called the bar code reader. Light pens can emit or detect light. Some require a special monitor while others work on a standard monitor. The pen, also called a stylus, looks like a ball point pen but uses an electronic head instead of ink. Pen computers use handwriting recognition software that translates the letters and symbols used in handwriting into character data that the computer can use.

Key-Tape, Key-Disk
Data is entered, most errors are filtered out by an edit program, and the data is then stored on disk/tape. Verification is then done by another operator, who keys in the data a second time. This is compared with the data already stored, and differences can be examined and corrected. The advantage is that it reduces the load on the main computer (input is handled offline).

Document Readers
These enable data to be read directly from source documents (forms). They fall into 2 main categories: a) mark readers, b) character readers.

Mark Readers
Mark sensing (old method) - Detects pencil marks by using electrical contacts which brush the paper surface. A pencil mark between the contacts conducts electricity and is therefore detected. (Marks multiple choice exams, e.g. CXC.) Medium - mark sense sheets.
Optical Mark Recognition (OMR) (new method) - Directs thin light beams onto the paper, which reflects or absorbs the light depending on the pencil mark.

Character Readers
Optical Character Readers (OCR)
Used extensively in connection with billing, e.g. prepare bill, payment returned with bill, document is evidence of payment (an example of the turnaround technique). No typing required. Reads typed data. Letters are recognised based on their shape.
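The verification step described under Key-Tape, Key-Disk (a second operator keys in the data again and the two versions are compared) can be sketched as a simple record-by-record comparison. This is a minimal illustration, not how any particular key-to-disk system works: the function name `verify` and the sample invoice records are hypothetical, and the sketch assumes both operators keyed the same number of records.

```python
def verify(first_entry, second_entry):
    """Return (position, first, second) for each record where the two keyings differ."""
    differences = []
    for position, (a, b) in enumerate(zip(first_entry, second_entry)):
        if a != b:
            differences.append((position, a, b))
    return differences

# Hypothetical data: the second operator transposed two digits in record 1.
first_keying = ["INV001,250.00", "INV002,75.50", "INV003,19.99"]
second_keying = ["INV001,250.00", "INV002,57.50", "INV003,19.99"]

mismatches = verify(first_keying, second_keying)
print(mismatches)  # [(1, 'INV002,75.50', 'INV002,57.50')]
```

Only the records that differ are reported, so an operator needs to examine and correct just those entries rather than re-checking the whole batch.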
Magnetic Ink Character Readers (MICR)
Documents are passed through a strong magnetic field, causing the iron oxide in the ink-encoded chars to be magnetised. The document is then passed under a read head, where the magnetised chars cause a current to flow. These chars appear on most cheques. Quality of print has to be very high.

Badge Readers
These read data from small rectangular plastic cards. This is done in several ways: a) magnetized marks – a short stripe of magnetic tape sealed into the card's surface, e.g. credit cards, ATM cards; also used to open doors when an employee swipes his ID badge, b) optical marks, c) punched holes.

Smart Card Readers
Normally a badge holds static data; however, some types of badge readers may also change the data held on badges. Such a badge or smart card may be used as a form of electronic money. As the customer purchases an item, the reader may deduct units from the card. E.g. phone cards, satellite dish/decoder cards.

Digital camera
Allows you to take pictures and store the photographed images digitally instead of on traditional film. You then download or transfer a copy of the picture to your computer. A digital camera is therefore known as a data collection device instead of an input device.

Microphone - Voice/Speech Recognition (Voice Data Entry)
An example of a biometric device. Still in an early stage of development. Needs a microphone, speakers and software. Needs to be trained to recognise the speech pattern of the user. A good speech recognition system never stops learning new words. The spoken word is translated to digital form. Mostly used by doctors, lawyers, journalists and the physically disabled. Also used for security and access control.

Biometric input devices
These devices, such as retinal scanners and fingerprint scanners, read bodily features. They are mostly used in high security areas such as airports and embassies.

Output
An output device gives information to a user. Output can either be softcopy or hardcopy.

Video/Visual Display Unit (VDU)/Monitor
The VDU is also called a screen or monitor. Screens have a _ or [] called a cursor, which indicates where the next character will be entered. You can adjust brightness. A typical screen is 24 lines by 80 chars per line. Produces soft copy, because the output exists electronically and is displayed for a temporary period of time.

Input - Output
It is an Input - Output device.

Types of monitors
• CRT (cathode ray tube) – similar to a TV screen. The screen is coated with tiny dots of phosphor material that glow when electrically charged.
• Flat panel display – consumes less power than CRT. Lightweight.
• Liquid crystal display (LCD) – common on laptops, handhelds, watches, calculators. It has special molecules deposited between two sheets of material. When an electric current passes through them, the molecules twist, causing some light waves to be blocked and allowing others to pass through, which creates the image.
• Gas plasma – instead of liquid crystal there is a layer of gas. When current is applied, the gas glows.

Resolution - Pixels
The resolution or sharpness/clarity is measured in pixels (picture elements) [similar to dot matrix]. A pixel is a dot that is on or off, with attributes of color and intensity. Resolution can be (typically):
• high – 640 x 400 pixels (across, then down); 1280 x 1024
• medium – 640 x 200
• low – 320 x 200
Lower resolutions show less detail. High resolutions are needed in graphics software such as CAD, CorelDraw etc.
Dot pitch – the distance between adjacent pixels (the smaller the dot pitch, the sharper the image).
Refresh rate – speed at which the image is redrawn on the screen (if slow, the screen will flicker and give headaches).
Interlacing – odd numbered lines are drawn first in one pass, then even numbered lines are drawn. Non-interlaced is better as there is less flicker.

Graphics - VGA, EGA, SVGA
The monitor rating is set by a graphics adaptor rating. The graphics adaptor is put into an expansion slot and has a monitor port.
The graphics adaptor is a device responsible for converting the digital signals sent by the computer to the monitor into the analog form required by the monitor to display info. The monitor cable is plugged into this adaptor at the back of the computer in order to link it with the computer's processor.
• VGA - video graphics array - Supports 320 x 200 pixels in 256 colors OR 640 x 480 pixels in 16 colors. In straight text mode, characters are generated in 9 x 16 pixel grids, but in graphics mode they can be set to any size.
• EGA - enhanced graphics adaptor - This low grade adaptor supports 16 colors at a time, chosen from 64, and a resolution of 640 x 350 pixels. Chars are generated in 8 x 14 pixel grids.
• XGA - extended graphics array - 1024 x 768 with 256 colors OR 640 x 480 with 65,536 colors.
• CGA - color graphics adaptor - The original adaptor provided, now obsolete. Supported 4 colors at a time chosen from 16 at 320 x 200 pixels, or 2 colors at its highest resolution of 640 x 200 pixels. Chars generated in 8 x 8 pixel grids.
• MDA - monochrome display adapter - 720 x 350 - 1 color.
• SVGA - Super VGA - 1024 x 768 pixels (800 x 600 to 1280 x 1024), 16 million colors. Chars in 12 x 30 pixel grid.
Graphics adaptors have their own primary storage called Video RAM (VRAM), which is used to store info while it is being prepared for display. The size of VRAM basically determines the number of colors, resolution and speed.

Colour vs Monochrome
Monochrome - a single color (white, green, amber/orange) and black.
Color - At the back of the computer are a number of connectors and switches. One of these connectors provides signals for a color monitor using RGB (red, green, blue) color signals. Another connector provides the signal for operating a monochrome monitor.

Advantages of Monitors
Faster than a printer. Less noisy than a printer. Paper not wasted. Allow you to correct/edit data before it is saved on the computer.

Disadvantages of Monitors
Eye strain. No permanent hardcopy (gives temporary/soft copy).
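
The statement that the size of VRAM determines the number of colors and the resolution can be checked with simple arithmetic: one screen needs width x height pixels, and each pixel needs enough bits to select one of the available colors. The Python sketch below applies this to some of the modes listed above. The function name vram_bytes is an assumption made for the example, and the result is only the minimum storage for one full screen; real adaptors may organise their memory differently.

```python
def vram_bytes(width, height, colors):
    """Minimum bytes needed to hold one full screen at the given mode."""
    # Bits needed to distinguish `colors` values, e.g. 16 colors -> 4 bits.
    bits_per_pixel = (colors - 1).bit_length()
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Modes taken from the adaptor list above: (name, width, height, colors).
modes = [
    ("VGA", 320, 200, 256),
    ("VGA", 640, 480, 16),
    ("EGA", 640, 350, 16),
    ("XGA", 1024, 768, 256),
    ("XGA", 640, 480, 65536),
]

for name, width, height, colors in modes:
    size = vram_bytes(width, height, colors)
    print(f"{name} {width} x {height}, {colors} colors: {size} bytes")
```

For example, VGA at 320 x 200 in 256 colors needs 8 bits per pixel, so one screen takes 320 x 200 = 64,000 bytes, while 1024 x 768 in 256 colors takes 786,432 bytes. Raising either the resolution or the number of colors raises the memory required.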

Printers
The printed page is the most common type of computer output (despite the desire for a paperless office). Printed copy = hard copy. Paperless - use microfiche and film, send email instead of a memo.

Questions to ask when buying a printer
Price, speed, color, cost to print, multiple copies, graphics, photographic quality, types of paper, size of paper, amount of paper in tray, compatibility with existing equipment, ink and paper cost, reliability, envelopes and transparencies, budget, what will I be printing, how much am I printing now and in 2 years, availability of ink cartridges, noise factor.

Impact vs Non-Impact
Printers are grouped in 2 functional categories:
a) Impact - This is the most widely used. The print mechanism strikes the paper through an ink ribbon, which makes the char impression on the page. Impact printers use solid font mechanisms or dot matrix mechanisms.
b) Non-impact - Uses thermal (heat), photographic (xerographic), electrostatic or light methods to print. Does not require physical contact with the paper and generally results in very high speeds. These are quieter. Tend to be more expensive. Do not have multiple copy facilities.

Quality
Printers are also classified by their print quality.
• Letter quality
• Near letter quality (NLQ)
• Draft
Draft is the fastest speed.

Line vs Page vs Character
A character printer prints one character at a time (80-200 chars per sec).
• Dot matrix - impact
• Daisy wheel - impact
• Thermal - non impact
• Ink jet - non impact
A line printer prints one line at a time (up to 3000 lines per minute - lpm). Typically used with mainframes and uses 11 x 17" paper (speedigreen).
• Drum - impact
• Chain, train - impact
• Band - impact
A page printer prints one page at a time (up to 200 pages per min). All page printers are non impact.
• Laser - non impact

Dot Matrix Printer
These are the most popular and widely used low speed printers in use today. Relatively inexpensive. Limited by speed, noise, quality. Wide range of character sets.
Prints a pattern of dots in the shape of the desired char. The print head is a matrix of steel pins. The higher the number of pins, the better the quality of print. 18 pin. 24 pin heads produce NLQ (slower because each letter is printed twice). 80-300 chars per sec in draft mode. 50-100 chars per sec in NLQ.

The letter X in a 5 x 7 dot matrix:
*...*
*...*
.*.*.
..*..
.*.*.
*...*
*...*

Color ribbons are available. Colors cannot be blended for new colors.

Wire matrix
Similar to dot matrix. Chars are formed by the selection of a series of small wires arranged in a 5 x 7 matrix. By selecting certain wires and pressing them against an inked fabric ribbon, different chars can be formed. 500-1000 lines per minute.

Daisy Wheel
No longer very common. High quality print. Impact. Character printer. Slow speed (15-50 cps). Noisy. Relatively expensive. Cannot produce graphics, only what is on the tips of the spokes. The wheel is rotated to print each char. The appropriate spoke is struck against an inked ribbon. Some can print left to right on one line and right to left on the next line so printing is faster.

Drum, Chain, Band Printers
These are line printers. High speed. Used mainly for mainframes. The drums, chains or bands rotate at constant speed and are struck by hammers as the required chars pass the print positions. Up to 7 copies can be done at a time using paper with carbon.
Chain and Train printers - Use a continuous chain (like a bicycle chain) with 5 sections and 48 chars per section. The chain rotates behind continuous form paper. In front of the page is an ink ribbon and a set of 32 magnetically activated hammers. As the char to be printed comes to the print position the hammer is activated, driving the paper against the ribbon and the char on the chain. Train - 2000 lines per min.
Drum - revolves.
Band - made of steel and can be exchanged to provide a variety of char sets.

Bar printer
Operates similarly to the chain printer. A string of bars oscillates back and forth at high speed in front of the paper.
A char is printed when it comes to the correct print position. 200-600 lines per min.

Laser
Tends to be the most expensive non-impact printer. Very high quality. Non impact. Page printer. 146 pages per minute. 7890-20000 lines per min. 600-1200 dpi. Uses electrostatic or optical methods. Uses toner powder (dried ink) which sticks to the charged parts (on the drum) traced by the laser beam. (A rotating mirror deflects the laser beam across the surface of the drum.) Paper is pressed against the drum. Images are permanently fused to the paper using a heating unit. The image of the whole page is represented by a series of minute dots; the dots are so close together that the print looks like a shaped char (hence the high quality). Can print a combination of text and diagrams, so used with word processors and desktop publishing. Cannot print on continuous paper.
Photocopiers and lasers are similar except that:
• The data source for a laser printer is digital, then translated to the pattern to be traced.
• The data source for a photocopier is optical (analog), which may be translated to digital. If you want n copies it scans the image n times. Typically the image is not stored.

Thermal
A character printer. Non impact. Low speed (160 cps). Uses a heated print head. Quiet. Inexpensive. Electrically heated pins are pushed against heat-sensitive paper. Low print quality, and images fade over time.

Ink Jet
Developed due to the search for low-cost quality printing. It is non-impact. Character printer. Quieter. Low speed (540 chars per sec - draft, 180 chars per sec - letter quality), or measured in ppm (pages per minute), e.g. 1-12 ppm. It uses a matrix of ink dots sprayed on paper from 50 nozzles to form a character. It uses small ink drops, so more drops are needed to form a character and the resolution of the character is greater than that of a dot matrix printer (300-600 dpi - dots per inch). Able to change the size and style of type/font almost instantaneously. Some models have color. Cartridges exist with different colors. Colors can be blended.
Does not use continuous paper.

Plotters
Also called a graph plotter. A plotter is a device that uses pens moving in various directions to produce text and graphics on paper. It differs from a printer in that it can produce continuous lines. Printers generate lines by printing a series of closely spaced dots. Electrostatic plotters use a row of charged wires (styli) to draw electrostatic patterns on specially coated paper and then fuse toner to the pattern. Plotters are used in graphics, earthquake detection, lie detectors, heart monitors, for graphs, maps, CAD - computer aided design.
Types:
a) Flat bed - most common. Pen moves up (to not draw), down (to draw), across, side to side. Paper does not move.
b) Drum - pen moves up, down, across. Paper moves side to side using rollers.
Methods of receiving signals:
a) Digital plotter - receives digital input which specifies the position to which the pen should move.
b) Incremental plotter - receives input which specifies changes in position based on the current position, e.g. move 2 mm left.
Many plotters have multiple pens of different colors. Some plotters use electrostatic printing rather than pen and ink.

Voice/Audio Output
Uses speakers/headset. Examples:
1. Book reading machine for the blind (text to speech translators). The machine uses an OCR type scan on each page of the book.
2. Over the telephone - "If you want so and so press 1". Used in account queries. Pre-recorded speech is stored in EPROM (erasable programmable ROM) chips; each word has a specific address, so a combination of addresses is specified to get a sentence. (Interactive Voice Response Systems)
3. Used in games.
4. Used in cars - "Petrol is low" message.
5. Printers - "Printing started".

Data Projectors
Takes the image on the screen and projects it onto a larger screen.

2. Module Two - Software

2.1. Definition of software
Software - programs/instructions which drive the hardware.
Collection of machine interpretable instructions that define the activities related to performing a specific task by computer (i.e. tell the computer what to do).

2.2. System Software
System software consists of the programs that control or maintain the operations of the computer and its devices.

Operating systems
An operating system (O/S) affects the control and performance of a computer system. Controls hardware. Tells the computer what to do and how to do it. Provides the interface between the user and the computer hardware (h/w). The operating system is a set of programs residing in main memory (RAM) which directs all computer operations.

Functions of an operating system
• Control/co-ordinate/configure the various devices (i.e. make sure that fast devices do not have to wait for slow ones and that the computer as a whole works efficiently).
• Control the allocation and utilisation of shared resources (e.g. CPU time, storage space, I/O devices).
• Start up or boot up the computer (IPL – initial program load or booting).
• Protect the hardware and software from improper use. Maintain system integrity.
• Deal with errors (e.g. hardware failures, deadlocks, etc.).
• Keep records/statistics of programs run - date, time, cost, use of resources etc. (monitor performance).
• Send and receive messages.
• Provide an interface between the hardware and the user that is more convenient than that presented by the bare machine.
• Provide for the management, scheduling and interaction of tasks (manage programs, schedule jobs).
• Administer security.
• Control a network.
• Establish an internet connection.
• Provide file management and other utilities.

Utilities
These programs provide a useful service to the user by providing facilities for performing common tasks. Examples are:
• File viewer – Displays the contents of a file. E.g. Quickview in Windows.
• File compression – Reduces the size of a file, usually to a ZIP extension. E.g. PKZIP, WinZip.
• Diagnostic utility – Compiles technical information about your computer hardware and reports physical and logical problems. Physical – scratch on disk. Logical – corrupted file. E.g. Scandisk, Norton Disk Doctor.
• Defragmenter – Reorganizes files and unused space on a disk so that data can be accessed quickly and programs run faster. E.g. defrag in Windows.
• Backup – Copies selected files to another disk/tape. It alerts you if an additional disk is needed. The opposite RESTORE utility should also exist. E.g. MSBACKUP, NovaBACKUP.
• Anti-virus – Prevents, detects and removes viruses from a computer system. E.g. Norton AntiVirus, McAfee, Trend Micro PC-cillin.
• Screen saver – Causes the screen to display an image or a blank screen if no keyboard or mouse activity occurs for a specified period of time. This prevented ghosting in the past. Ghosting is where images are permanently etched into the screen. You can also put a password on the screen saver to prevent access.

2.3. Application software
Application software (A/S) is designed to fulfil a specific set of activities. E.g. accounting, word processing, banking.

Productivity tools/business

Word-processing
Allows easy creation, editing, correction and printing of documents. Features include: bold, underline, margins, spell check, print, page number, justify, footnotes, table of contents, font size and type, mail merge, save. E.g. Word, WordPerfect.

Spreadsheets
Designed to manipulate numeric data in a tabular form. Electronic equivalent of an accountant's ledger - a large piece of paper divided by columns and rows into a grid of cells. The user can enter numbers, formulas or text in each cell. Each cell is referred to by its co-ordinates, e.g. D10.
• You can copy formulas.
• Can do "What If" analysis, e.g. to increase sales by 10%, just type 10 and the spreadsheet automatically recalculates.
• Can change data to be represented graphically (pie, bar, line etc.).
E.g. Lotus, Excel, Quattro Pro.

Database Management
Information is vital to business.
A database is an organized collection of information stored in a way that makes it easy to find and present. Database management software allows you to create and maintain a database. Field, fieldtype, keys, fieldsize. Can sort, query. E.g. FoxPro, Oracle, dBase.

Desktop Publishing
To produce documents, newsletters, posters. Combines word processing and graphics packages. E.g. PageMaker, Microsoft Publisher.

Integrated Packages
Collections of packages which have been designed to be used together. E.g. spreadsheet data might be easily fed into a database or displayed in a diagrammatic form using a graphics package. E.g. Microsoft Office, Lotus SmartSuite, Corel WordPerfect Suite.

Graphic design/multimedia

Graphics
Provide facilities that allow the user to do various kinds of computer graphics. Require a lot of main memory and usually a special circuit board (graphics card) and a high resolution screen. Classified into:
a) business graphics - generate charts and diagrams, e.g. bar charts from existing figures.
b) computer-aided design (CAD) - used by graphics designers, architects and engineers to design things, e.g. buildings, cars etc. Use a light pen, mouse or digitiser pad. Have a large colour palette. Can rotate, flip (invert), colour, move, delete, resize.
E.g. Corel Draw.

Presentation graphics
This allows you to create a slide show. The slide show is a multimedia presentation because it contains text, graphics, video and sound. (E.g. PowerPoint)

Multimedia
An application that combines text, sound, graphics, motion video and animation. (E.g. Windows Media Player)

Education/personal/home
Word processing
Encyclopedia – Compton etc.
Internet – for shopping, research, sending email
Games

Entertainment software
This software includes games and software that allows you to listen to music or watch movies. E.g. Windows Media Player, Chess, Monopoly, Solitaire etc.

Communication
Communication software is any program that a) helps users establish a connection to another computer or network, or b) manages the transmission of data, instructions and information, or c) provides an interface for users to communicate with one another. Communication software allows users to send messages and files from one location to another. E.g. Network Operating Systems (NOS), Web browsers (Internet Explorer, Netscape Navigator), Outlook (email), etc.

Programming languages/program development software
Used to create programs. People called programmers use programming languages to tell the computer what to do. Hundreds of programming languages or language variants exist today. Most were developed for writing specific types of applications. However, many companies insist on using the most common languages so they can take advantage of programs written elsewhere and to ensure that their programs are portable, which means that they will run on different computers. E.g. FoxPro, C, C++, Pascal, Visual Basic, COBOL etc.

3. Module Three - Data and Information

3.1. Definitions of data and information
Input - operation of supplying data to a computer.
Output - operation of displaying/printing information that has been processed.
Data - raw facts which computers use. On its own it means nothing, e.g. 0, 1, 5, F, G. Can be information to another person.
Information - means something. The meaning is what makes it information. Used to make decisions.

3.2. Desired characteristics of information
• Accurate, clear, timely, complete yet concise
• Receiver has confidence in it
• Appropriate channel, given to the right person, should not be excessive
• Cost effective
• Must have a purpose, relevant for the purpose (user related)

3.3. Information systems
An information system may include a computer program and all of the users, or it may refer to a single application including those data items, programs and hardware resources devoted to it.
Another way of saying it is: an information system is a set of hardware, software, data, people and procedures that work together to produce information.
They are developed for different purposes, depending on the needs of the business. Some general purpose information systems, called enterprise-wide systems, are used throughout an enterprise. Enterprise resource planning (ERP) provides applications to help manage and coordinate ongoing activities. Customer relationship management (CRM) systems manage information about customers. A content management system (CMS) organizes and allows access to various forms of documents and files.

The types of information systems are:

An office information system (OIS) enables employees to perform tasks using computers and other electronic devices.

Management information system (MIS) (strategic level) - Outputs information that will be used for decision making. Performs queries on data produced by data processing systems. An MIS generates accurate, timely and organized information, so users can make decisions, solve problems and track progress.

Expert systems (strategic level) - An expert system is given rules to solve problems and uses these rules to come up with solutions, e.g. playing chess, making medical diagnoses. (Also called a knowledge based system, heuristic system or artificially intelligent system.) This is a step further in AI technology. An expert system is a computer program that represents and reasons with the knowledge of some specialist subject area with a view to solving problems or giving advice (e.g. doctor, lawyer, engineer, finance expert, stock broker). It stores the knowledge of human experts and then imitates human reasoning and decision making. This technology allows us, for example, to broaden the scope for medical diagnosis by computer (no need for a doctor, or to help the doctor be more productive).
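
The idea that an expert system is given rules and uses them to come up with solutions can be sketched in Python as below. This is a toy illustration only: the rules, the fact names and the function infer are invented for the example, they are not real medical knowledge, and they are not taken from any actual expert system.

```python
# Each rule: if all the conditions are known facts, the conclusion can be added.
# These rules are hypothetical and exist only to show the mechanism.
RULES = [
    ({"fever", "cough"}, "possible chest infection"),
    ({"possible chest infection", "shortness of breath"}, "refer to doctor"),
]


def infer(facts, rules):
    """Apply the rules repeatedly until no new conclusions can be drawn."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    # Return only what was concluded, not the facts supplied by the user.
    return known - set(facts)


conclusions = infer({"fever", "cough", "shortness of breath"}, RULES)
print(sorted(conclusions))
```

The stored rules play the part of the expert's knowledge, and the loop that applies them imitates the reasoning step. The second rule only fires after the first has added its conclusion, which shows how one conclusion can lead to another.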
An example of a medical expert system is MYCIN, developed at Stanford University, which makes judgments on the diagnosis of bacterial infection and proposes courses of therapy with antibiotics. MYCIN operates as a consultant by interacting with a doctor who knows the history of the patient.
Features of an expert system:
• An organized base of knowledge that contains facts and rules.
• An interactive user interface to support diagnostic discussion with the user (e.g. able to pose medical problems and get an initial diagnosis).
• Holds details of the status of the current consultation (so that the consultation can be continued at a later date from where you left off).
• Inference engine - software which can use the knowledge and current status of the consultation to either formulate further questions for the user or to draw conclusions and make recommendations.
• Knowledge acquisition system - a facility to update the knowledge base (i.e. to learn from experience).

Decision support system (DSS) (tactical level) - Similar to management information systems. They allow users to do queries and perform "What if" analysis. A DSS helps users analyze data and make decisions.

Transaction processing system (TPS) (operational level) (aka Data Processing System) - Captures and processes large amounts of data for routine (day-to-day) business transactions such as payroll and inventory. It relieves the tedium of performing repetitive tasks.

Information requirements and information systems used in the functional units of an enterprise

The Special Information Requirements of an Enterprise-Sized Corporation
A large organization, or enterprise, requires special computing solutions because of its size and geographical extent. Enterprise computing uses computers in networks or series of interconnected networks to satisfy the information needs of an enterprise. The types of information employees require depend on their level in the company.
Executive management needs information to make strategic decisions that center on a company's overall goals and objectives. Middle management needs information to make tactical decisions that apply specific programs and plans to meet stated objectives. Operational management needs information to make operational decisions that involve day-to-day activities. Non-management employees also need information to perform their jobs and make on-the-job decisions. It is important to know the different levels of management in an organization and be able to provide each with the level of information that they require.

The Information Systems used in the functional units of an enterprise
In an enterprise, the individual operating entities, called functional units, have specialized requirements for their information systems. Human Resources Information Systems (HRIS) manage one or more human resources functions. Accounting and financial systems manage everyday transactions and help budget. Computer-aided design (CAD) assists engineers in product design, and computer-aided engineering (CAE) tests product designs. Computer-aided manufacturing (CAM) controls production equipment, and computer-integrated manufacturing (CIM) integrates operations in the manufacturing process. A marketing information system serves as a central repository for marketing tasks. Salesforce automation (SFA) software equips salespeople with the tools they need. Distribution systems control inventory and manage shipping. Customer interaction management (CIM) software manages interactions with customers. The information technology (IT) department makes technology decisions for an enterprise and maintains hardware and software applications.

More on the levels of management

Operations (day to day)
This level makes decisions using predetermined rules that have predictable outcomes when implemented correctly.
They make decisions that affect work scheduling, inventory control and control of processes such as production. Decisions are repetitive/certain. The information needed is repetitive and low level in nature. They are dependent on information that captures current performance. They need online, realtime information, e.g. total sales by salesman for a particular day. They use data processing or transaction processing systems, which process large amounts of data for routine business transactions such as payroll. These relieve the tedium of performing operational transactions and require little decision making to set up.

Middle/Managerial/Tactical (year to year)
Make short term planning and control decisions about how resources should be allocated to best meet the organization's objectives. They experience very little certainty in their decision making environment. Their decisions range from forecasting future resource requirements to solving employee problems that threaten productivity. They require short term to long term information. They need information on current performance against set standards. There is a high need for historical information along with information that allows prediction of future events and simulation of possible scenarios, e.g. total sales vs budgeted sales for the quarter. They use management information systems. The output is used to make decisions.

Strategic (5-10 years)
These solve non-routine problems. They look outwards from the organization to the future, making decisions that will guide middle and operations management in the months and years ahead. They work in highly uncertain decision making environments. They look at the broad picture (the organization as a whole), e.g. to develop a new product line or divest the organization of unprofitable ventures. They are dependent on information from external sources that supply news of market trends and the strategy of competing corporations.
They need general summarized information rather than the highly detailed raw data required by low level managers. Information may be older and estimated. They use decision support systems and knowledge based or expert systems. DSS are similar to MIS and allow querying and what-if analysis (e.g. Crystal Reports). Expert systems capture and use the knowledge of an expert for solving a particular problem experienced by the organization. E.g. total sales vs sales of another company or other product.

4. Module 4 – Telecommunication and Computer Networks

4.1. Definition of networks
Network – A collection of computers and devices connected together via communications devices and transmission media, allowing computers to share resources. The purpose of a network is that of routing, managing and storing rapidly changing data.
Communications – this describes a process in which one computer transfers data, instructions or information to another computer (also known as data communication/transmission). This is done by means of a communication channel or medium.

Components required for successful communications
• A sending device (sender or encoder) that initiates an instruction to transmit data, instructions or information.
• A communication channel – the path over which signals are sent (aka path, medium, line). Signals can be in analog or digital form.
• A communication device – receives the signals from the communication channel and converts them to a form understood by the receiving device. It connects the sending device to a communication channel. It also connects the communication channel to a receiving device.
• A receiving device (receiver or decoder) that accepts the data, instructions or information.

4.2. Uses of networks/Role in business
• Facilitating communications - sending e-mail, voice mail, fax (facsimile), doing research on the internet, chat rooms, instant messaging.
• Telecommuting, video conferencing, making a phone call.
• Transferring funds (Electronic Fund Transfer, EFT) - inter-branch banking; Multilink – checking debit card balance; Western Union – wiring money to another location.
• Sharing a hardware resource such as a printer.
• Sharing data (EDI – electronic data interchange) and information.
• Sharing software.
• Verifying a Blue Cross card.
• Etc.

Intranet
Internal network that uses Internet technologies. An intranet is a network based on the Internet TCP/IP open standard. An intranet belongs to an organization, and is designed to be accessible only by the organization's members, employees or others with authorization. An intranet's Web site looks and acts just like other Web sites, but has a firewall surrounding it to fend off unauthorized users. Intranets are used to share information. Secure intranets are much less expensive to build and manage than private, proprietary-standard networks.
An intranet is a private network that is contained within an enterprise. It may consist of many interlinked local area networks and also use leased lines in the wide area network. Typically, an intranet includes connections through one or more gateway computers to the outside Internet. The main purpose of an intranet is to share company information and computing resources among employees. An intranet can also be used to facilitate working in groups and for teleconferences.
An intranet uses TCP/IP, HTTP and other Internet protocols and in general looks like a private version of the Internet. With tunneling, companies can send private messages through the public network, using the public network with special encryption/decryption and other security safeguards to connect one part of their intranet to another.
Typically, larger enterprises allow users within their intranet to access the public Internet through firewall servers that have the ability to screen messages in both directions so that company security is maintained. When part of an intranet is made accessible to customers, partners, suppliers, or others outside the company, that part becomes part of an extranet.

Extranet
An extranet is a private network that uses Internet technology and the public telecommunication system to securely share part of a business's information or operations with suppliers, vendors, partners, customers, or other businesses. It is a portion of a company's network that allows customers or suppliers to access parts of an enterprise's intranet. An extranet can be viewed as part of a company's intranet that is extended to users outside the company. It has also been described as a "state of mind" in which the Internet is perceived as a way to do business with other companies as well as to sell products to customers.

An extranet requires security and privacy. These can include firewall server management, the issuance and use of digital certificates or similar means of user authentication, encryption of messages, and the use of virtual private networks (VPNs) that tunnel through the public network.

Companies can use an extranet to:
• Exchange large volumes of data using Electronic Data Interchange (EDI)
• Share product catalogs exclusively with wholesalers or those "in the trade"
• Collaborate with other companies on joint development efforts
• Jointly develop and use training programs with other companies
• Provide or access services provided by one company to a group of other companies, such as an online banking application managed by one company on behalf of affiliated banks
• Share news of common interest exclusively with partner companies

Internet
The Internet is a large, international computer network linking millions of users around the world that use the TCP/IP protocols. It is used daily by many individuals, mainly for sending and receiving electronic mail (e-mail), obtaining information on almost any subject, or communicating with others around the world. Access to the Internet is obtained by subscription, and an Internet address is needed to receive or to send a message. Such addresses have a specific format that specifies the name of the user, the machine they are working on, and where that machine is located.

Many people use the terms Internet and World Wide Web (a.k.a. the Web) interchangeably, but in fact the two terms are not synonymous. The Internet and the Web are two separate but related things.

The Internet is a massive network of networks, a networking infrastructure. It connects millions of computers together globally, forming a network in which any computer can communicate with any other computer as long as they are both connected to the Internet. Information that travels over the Internet does so via a variety of languages known as protocols.

The World Wide Web, or simply the Web, is a way of accessing information over the medium of the Internet. The Web is a subset of the Internet and consists of Web pages that can be accessed with a Web browser. It is an information-sharing model that is built on top of the Internet. The Web uses the HTTP protocol, only one of the languages spoken over the Internet, to transmit data. Web services, which use HTTP to allow applications to communicate in order to exchange business logic, use the Web to share information. The Web also utilizes browsers, such as Internet Explorer or Netscape, to access Web documents called Web pages that are linked to each other via hyperlinks. Web documents also contain graphics, sounds, text and video.

The Web is just one of the ways that information can be disseminated over the Internet. The Internet, not the Web, is also used for e-mail (which relies on SMTP), Usenet news groups, instant messaging and FTP.
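The point that HTTP is only one of several protocols carried over the Internet can be illustrated with a small sketch. The port numbers are the well-known defaults for each protocol; NNTP is assumed here as the protocol behind Usenet news groups (the notes do not name it), and the function name and example addresses are invented for illustration.

```python
from urllib.parse import urlparse

# Well-known default TCP ports for the protocols mentioned in the notes.
# NNTP (Usenet news) is an assumption; the notes only say "Usenet news groups".
DEFAULT_PORTS = {"http": 80, "smtp": 25, "ftp": 21, "nntp": 119}

# Of these, only HTTP traffic is part of the Web.
WEB_SCHEMES = {"http"}


def describe(url):
    """Return (protocol, default port, is_part_of_the_web) for an address."""
    scheme = urlparse(url).scheme.lower()
    return scheme, DEFAULT_PORTS.get(scheme), scheme in WEB_SCHEMES


for address in ("http://www.example.com/index.html",
                "ftp://ftp.example.com/notes.txt",
                "smtp://mail.example.com"):
    print(describe(address))
```

All three addresses use the Internet, but only the first one is Web traffic.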
So the Web is just a portion of the Internet, albeit a large portion, but the two terms are not synonymous and should not be confused.

4.3. Advantages and Disadvantages of Networks

Advantages
• Faster and easier access to information
• Better communication
• Ability to have a worldwide audience (able to advertise and market your product)
• E-commerce/E-business

Disadvantages
• Children have access to pornography, harmful information, paedophiles etc.
• Easier to plagiarise information, since it can be copied and pasted.

4.4. Network classifications
Computer networks are classified according to the distance between the individual computers that are attached to the network. The classification includes the following:
1. Local Area Network (LAN)
2. Wide Area Network (WAN)
3. Metropolitan Area Network (MAN)

Local Area Network (LAN)
A Local Area Network (LAN) is a network that connects computers and devices in a limited geographical area such as a house, school laboratory or an office building. Each computer or device on the network is called a node, and often shares resources such as printers, large hard disks, and programs. Generally LANs use wires (physical cables) for connectivity between the devices. It is however possible to connect the devices via wireless means, giving rise to what is known as a WLAN (wireless LAN).

Wide Area Network (WAN)
A Wide Area Network (WAN) is a network that covers a large geographic area such as a city, country or the world. It uses a communications channel that combines many types of media such as telephone lines, cables, and radio waves. A WAN can be one large network or can consist of two or more LANs connected together. For example, the campus-spanning network that connects different departments in any university or larger company is called a WAN. The Internet is the world's largest WAN.

Metropolitan Area Network (MAN)
A Metropolitan Area Network (MAN) is a network that spans a whole metropolitan area. It is referred to as a high-speed network that covers a city. MANs use similar technology to LANs but cover a much wider geographic region. A MAN is usually managed by a consortium of users or by a single network provider that sells the service to the other users.

Types of LANs (Network Architecture)
There are two (2) main types of network architecture:
1. Client/Server
2. Peer to Peer

• Client/server – one or more computers are designated the server (host) and the others the clients. The client is the requesting machine and the server is the supplying machine. In other words, the client requests services and the server provides the service. The server controls access to the hardware and software on the network and provides a centralized storage area for data. Clients rely on the server for resources such as files, processing power and storage. Server software generally runs on powerful computers dedicated exclusively to running the business application. Client software, on the other hand, generally runs on common PCs or workstations. Clients rely on the application server for things such as configuration files and business application programs, or to offload compute-intensive application tasks back to the server in order to keep the client computer (and client computer user) free to perform other tasks.

• Peer-to-peer network (P2P) – a simple, inexpensive network that connects fewer than 10 computers using twisted pair or coaxial cables. Each computer (called a peer) can share the hardware located on any other computer. Each computer has equal responsibilities and capabilities. The Network Operating System (NOS) must be installed on each computer. It is ideal for small businesses and home offices. It is a communications environment that allows all computers in the network to act as servers and share their files with other users on the network.
Peer-to-peer networks are quite common in small offices that do not use a dedicated file server, and client versions of the Windows, Mac and Linux operating systems allow files to be shared. Peers act as both clients and servers. Peer-to-peer networks allow you to connect two or more computers in order to pool their resources. Individual resources such as disk drives, CD-ROM drives, scanners and even printers are transformed into shared resources that are accessible from each of the computers.

• Internet peer-to-peer – a different kind of peer-to-peer network that exists on the Internet and allows users to share files on their hard disks, essentially creating global peer-to-peer networks. It is used mostly for music files, e.g. Napster, KaZaa.

4.5. Network Topologies
Network topology is the configuration or physical arrangement of the devices or nodes, i.e. the layout of the computers and devices on a network. The 3 main topologies are ring, star and bus.

Bus
A bus network is a network architecture in which there is a single central cable to which all devices are attached. The central cable is called a bus. The bus transmits data in both directions. Only one device can transmit at a time. When a sending device transmits data, the address of the receiving device is included with the transmission so that the data is routed to the appropriate receiving device. It is easy to add devices to or remove devices from a bus network. It is also an inexpensive topology. Failure of one device does not affect another device. The network will fail if the bus (central cable) fails.

Advantages of a Bus Topology
• Easy to connect a computer or peripheral to a linear bus.
• Typically the cheapest topology to implement.
• Failure of one station does not affect others.

Disadvantages of a Bus Topology
• Entire network shuts down if there is a break in the main cable.
• Terminators are required at both ends of the backbone cable.
• Difficulty in identifying the problem if the entire network shuts down.
• Performance degrades as additional computers are added.

Ring
A ring network is a topology where each device is connected to two others, so as to create a ring or closed loop. Data transmitted on a ring network travels in one direction on the ring from device to device until it reaches its destination. If a device fails, the devices before it are not affected; it is the devices after it that are affected.

Advantages of Ring Network
• Growth of the system has minimal impact on performance.
• All stations have equal access.
• Each node on the ring acts as a repeater (booster of the signal), allowing ring networks to span greater distances than other physical topologies.
• Because data travels in one direction, high speeds of data transmission are possible.

Disadvantages of Ring Network
• Often the most expensive topology.
• Failure of one computer may impact others.

Star
A star network topology, in its simplest form, consists of one central, or hub, computer which acts as a router to transmit messages. All devices are connected to the central computer (hub). All data passes through the hub. If a device fails, there is no effect on the network; only if the hub fails will the network be affected.

Advantages of Star Network
• Easy to implement and extend, even in large networks.
• Well suited for temporary networks (quick setup).
• The failure of a non-central node will not have major effects on the functionality of the network.

Disadvantages of Star Network
• Limited cable length and number of stations.
• Maintenance costs may be higher in the long run.
• Failure of the central node (hub) can disable the entire network.
• Central hub can be a bottleneck.

4.6. Network components
Networking hardware includes all computers, peripherals, interface cards and other equipment needed to perform data processing and communications within the network. Below are descriptions of commonly used networking hardware.
• Modem – Short for modulator-demodulator. This device enables a computer to transmit data over telephone or cable lines. It converts digital signals to analog signals and vice versa (modulation/demodulation). Computer information is stored digitally, whereas information transmitted over telephone lines is transmitted in the form of analog waves. The device converts between the two forms. These devices can be internal or external. The internal ones come as an expansion board that you can insert into a vacant expansion slot. External ones are connected to the computer through an RS-232 interface. The following characteristics distinguish one from another:
  a) Bps (bits per second) – how fast the device can transmit and receive data (often loosely called the baud rate). The fastest ones are about 57600 bps.
  b) Voice/data – many support a switch to change between voice and data modes.
  c) Auto-answer – enables your computer to receive calls in your absence.
  d) Data compression – enables the device to send data at a faster rate. (The receiving device must be able to decompress the data using the same compression technique.)
  e) Flash memory – some devices have flash memory instead of the conventional ROM, which means that the communications protocols can be easily updated if necessary.
  f) Fax capability – most modern ones are fax modems, which means that they can send and receive faxes.

• Cable modem – sends and receives data over a cable television (CATV) network, which consists largely of coaxial cable. It can transmit up to 2 Mbps. Because the coaxial cable used by cable TV provides much greater bandwidth than telephone lines, the device can be used to achieve extremely fast access to the World Wide Web.

• ADSL modem (asymmetric digital subscriber line) – a new technology that allows more data to be sent over existing copper telephone lines (POTS – plain old telephone service). ADSL supports data rates from 1.5 to 9 Mbps when receiving data (downstream rate) and from 16 to 640 Kbps when sending data (upstream rate).

• Multiplexer – combines 2 or more input signals from various devices into a single stream of data, then transmits it over a single transmission medium (sometimes called a mux). The advantage is that it saves on cabling costs.

• Network interface card (NIC) – an expansion card that is inserted into the computer to connect it to a network. Also called a LAN adapter. Most are designed for a particular type of network, protocol and media, although some can serve multiple networks.

• Hub (concentrator, multi-station access unit (MAU)) – a device that provides a central point for cables in a network. It usually has ports for 8 to 12 devices. It is a common connection point for devices in a network. Hubs are commonly used to connect segments of a LAN. When a packet arrives at a port, it is copied to the other ports so that all segments of the LAN can see all packets. A passive hub serves simply as a conduit for the data, enabling it to go from one device or segment to another. Intelligent hubs include additional features that enable an administrator to monitor the traffic passing through the hub and to configure each port in the hub. Switching hubs read the destination address of each packet and then forward the packet to the correct port.

• Switch – a device that filters and forwards packets between LAN segments. (A packet is a piece of a message transmitted over a packet-switching network. The packet contains the destination address in addition to the data.) Most switches are active, that is, they electrically amplify the signal as it moves from one device to another. Unlike hubs, switches do not broadcast every packet to every port; they memorize the addresses of the computers and send the information directly to the correct location.
Switches are:
  o Usually configured with 8, 12, or 24 RJ-45 ports
  o Often used in a star topology
  o Sold with specialized software for port management
  o Sometimes loosely called hubs (switching hubs)
  o Usually installed in a standardized metal rack that also may store netmodems, bridges, or routers

• Repeater – a device that accepts a signal from a medium, amplifies it and retransmits it. It solves the problem of attenuation (weakening of a signal due to distance) by regenerating or replicating the signal. It can also relay messages between sub-networks that use different protocols or cable types. Hubs can operate as repeaters by relaying messages to all connected computers. A repeater cannot do the intelligent routing performed by bridges and routers. Repeaters can be separate devices or they can be incorporated into a hub. They are used when the total length of your network cable exceeds the standards set for the type of cable being used.

• Bridge – a device that connects 2 LANs, or 2 segments of the same LAN, that use the same protocol such as Ethernet. A bridge allows you to segment a large network into two smaller, more efficient networks. If you are adding to an older wiring scheme and want the new network to be up-to-date, a bridge can connect the two. A bridge monitors the information traffic on both sides of the network so that it can pass packets of information to the correct location. Most bridges can "listen" to the network and automatically figure out the address of each computer on both sides of the bridge. The bridge can inspect each message and, if necessary, broadcast it on the other side of the network. The bridge manages the traffic to maintain optimum performance on both sides of the network. You might say that the bridge is like a traffic cop at a busy intersection during rush hour. It keeps information flowing on both sides of the network, but it does not allow unnecessary traffic through.
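The way a bridge "listens" and learns which side each computer is on can be sketched in a few lines. This is a simplified illustration only, not how any particular product works: the class name, the side labels "A" and "B" and the computer names are invented for the example.

```python
class LearningBridge:
    """Toy two-sided bridge that learns where each computer is located."""

    def __init__(self):
        # Maps a computer's address to the side ("A" or "B") it was last seen on.
        self.table = {}

    def handle(self, source, destination, arrived_on):
        # Learn: the sender must be on the side the message arrived from.
        self.table[source] = arrived_on

        known_side = self.table.get(destination)
        if known_side is None:
            # Destination not learned yet: pass the message to the other side.
            return "flood"
        if known_side == arrived_on:
            # Destination is on the same side: keep the traffic local.
            return "filter"
        # Destination is known to be on the other side: pass it across.
        return "forward"


bridge = LearningBridge()
print(bridge.handle("pc1", "pc2", "A"))  # pc2 not yet known
print(bridge.handle("pc2", "pc1", "A"))  # pc1 was learned on side A
print(bridge.handle("pc3", "pc1", "B"))  # pc1 is on the other side
```

The "filter" case is what keeps unnecessary traffic from crossing the bridge.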
• Gateway –
  a) A combination of hardware and software that connects networks that use different protocols.
  b) A node on a network that serves as an entrance to another network.
  c) An earlier term for router.
The gateway node often acts as a proxy server and firewall. The proxy server sits between the client application and the real server. It intercepts all messages entering and leaving the network and checks whether it can fulfil each request itself; if not, it forwards the request to the real server. It also hides the true network addresses. The purpose of the proxy server is to improve performance and to filter requests (e.g. prevent users from accessing a specific set of websites). The firewall prevents unauthorized access to or from a private network. The gateway is also associated with both a router, which uses headers and forwarding tables to determine where packets are sent, and a switch, which provides the actual path for the packet in and out of the gateway.

• Router – a device that connects multiple networks and routes traffic to the appropriate network using the fastest available path. It forwards data packets along networks and is connected to at least 2 networks. Routers are located at gateways, the places where 2 or more networks connect. Routers use headers (the part of the data packet that holds information about the file or the transmission) and forwarding tables to determine the best path for forwarding the packets, and they use protocols such as ICMP (Internet Control Message Protocol, an extension to the Internet Protocol, IP) to communicate with each other and configure the best route between any 2 hosts. Very little filtering of data is done through these devices. A router translates information from one network to another; it is similar to a superintelligent bridge. Routers select the best path to route a message, based on the destination address and origin.
The router can direct traffic to prevent head-on collisions, and is smart enough to know when to direct traffic along back roads and shortcuts. While bridges know the addresses of all computers on each side of the network, routers know the addresses of computers, bridges, and other routers on the network. Routers can even "listen" to the entire network to determine which sections are busiest; they can then redirect data around those sections until they clear up. If you have a school LAN that you want to connect to the Internet, you will need to purchase a router. In this case, the router serves as the translator between the information on your LAN and the Internet. It also determines the best route to send the data over the Internet.
Routers can:
  o Direct signal traffic efficiently
  o Route messages between any two protocols
  o Route messages between bus and star topologies
  o Route messages across fiber optic, coaxial, and twisted-pair cabling

• Client – the client is the requesting machine. In other words, the client requests services. Clients rely on the server for resources such as files, processing power and storage. Client software generally runs on common PCs or workstations. Clients rely on the application server for things such as configuration files and business application programs, or to offload compute-intensive application tasks back to the server in order to keep the client computer (and client computer user) free to perform other tasks.

• Server – the server is the supplying machine. In other words, the server provides service to the client. The server controls access to the hardware and software on the network and provides a centralized storage area for data. Server software generally runs on powerful computers dedicated exclusively to running the business application.

4.7. Software: Network Operating Systems
The similarities and differences between a single-user operating system and a network operating system:

Similarities
• Controls/manages the computer hardware (e.g. memory)
• Provides a user interface
• Allows more than one program to run at the same time
• Schedules jobs and configures devices
• Manages programs
• Provides file management and other utilities
• Starts the computer

Differences
• A network operating system (NOS) is an operating system that organizes and coordinates how multiple users access and share resources on a network. A single-user operating system allows only one user to run one or more programs at a time.
• An NOS has more security control features. It also controls a network, establishes Internet connections and allows more than one computer to talk to each other.
• An NOS allows for the management of files on other computers.
• An NOS typically resides on a server.

4.8. Transmission Media: wireless, wired
A transmission medium or communication channel is also referred to as a circuit, line or path. It is a pathway over which data are transferred between remote devices.

Wired Transmission Media (Physical/Guided)
Wired media are identified by a physical cable on which the data travels. This cable can be seen and touched.

Twisted Pair Cable
• A thin-diameter copper wire (22 to 26 gauge) commonly used for telephone and network cabling.
• The two insulated copper wires are twisted around each other to minimize interference (electrical disturbance or noise) from other twisted pairs in the cable. The noise can degrade communication. These pairs are then bundled together.
• There are two main types: Shielded Twisted Pair (STP) and Unshielded Twisted Pair (UTP). STP has a metal wrapper around each twisted pair wire, which further reduces noise. UTP does not have this shield; it is inexpensive and easy to install.
The categories of UTP cable are:
  Category 1 – Voice only (telephone wire)
  Category 2 – Data to 4 Mbps (LocalTalk)
  Category 3 – Data to 10 Mbps (Ethernet)
  Category 4 – Data to 20 Mbps (16 Mbps Token Ring)
  Category 5 – Data to 100 Mbps (Fast Ethernet)
• The standard connector for UTP cable is an RJ-45 connector, which looks like a large telephone modular connector.
• It is easy to tap into (i.e. persons can listen on the line).
• It is a low-frequency transmission medium.
• It is relatively inexpensive (the cheapest).
• It is small in size and easy to install.
• Generally for analog transmission.
• It covers a limited distance, usually less than 100 meters.
• It is the most popular medium for LANs and is generally the best option for school networks.

Coaxial Cable (Coax for short)
• A cable consisting of a conducting outer metal tube enclosing and insulated from a central conducting core, used for high-frequency transmission of telephone, telegraph, and television signals.
• Consists of a single aluminium or copper wire surrounded by 3 layers: a) insulating material, b) woven or braided metal mesh, c) plastic outer coating.
• It is common in cable television applications.
• The cable is designed to carry a high-frequency or broadband signal (in other words, it can carry many signals at the same time). The bandwidth can be up to 400 MHz.
• It can carry signals for longer distances than twisted pair (300 to 600 meters).
• It is less susceptible to electromagnetic interference than twisted pair because it is more heavily insulated.
• Commonly used in harsh environments (e.g. factories where there are chemicals etc.).
• It is quite bulky and sometimes difficult to install.
• The most common type of connector used with coaxial cables is the BNC (Bayonet Neill-Concelman) connector.

There are two types of coaxial cable:

Thin coaxial cable
  o referred to as thinnet
  o 10Base2 is the IEEE (Institute of Electrical and Electronics Engineers) standard for Ethernet running on thin coaxial cable
  o the 2 refers to the approximate maximum segment length of 200 meters (the actual limit is 185 meters)
  o is popular in school networks, especially linear bus networks

Thick coaxial cable
  o referred to as thicknet
  o 10Base5 is the IEEE standard for Ethernet running on thick coaxial cable
  o the 5 refers to the maximum segment length of 500 meters
  o has an extra protective plastic cover that helps keep moisture away from the center conductor
  o difficult to bend and install
  o used for long-distance bus networks

Fiber Optic Cable
• A thin glass strand designed for light transmission. A single hair-thin fiber is capable of transmitting trillions of bits per second.
• The core consists of dozens or hundreds of thin strands of glass or plastic that use light to transmit signals. Inside the cable is an insulating glass cladding and a protective coating.
• Optical fibers offer many advantages over electricity and copper wire.
• Optical fibers transmit data at a faster rate. They are also able to carry more signals (broad bandwidth of up to 2 Gbps).
• Fibers allow longer distances to be spanned before the signal has to be regenerated by expensive "repeaters". (Repeaters boost a signal that has become weakened due to distance.) E.g. used for distances up to 100 kilometers.
• Fibers are more secure, because taps in the line can be detected.
• There are lower error rates (i.e. the message that is sent is the message that is received).
• Less susceptible to noise from other devices such as a copy machine.
• Smaller size (thinner, lighter).
• Fiber optic cable is very expensive.
• It is hard to install and modify, and requires highly skilled installers.
• 10BaseF refers to the specifications for fiber optic cable carrying Ethernet signals.
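To put the data rates quoted for the wired media in perspective, the short sketch below estimates how long it would take to move one file over each medium. It is a rough illustration under stated assumptions: the 700 MB file size is invented for the example, 1 MB is taken as 1,000,000 bytes, and the result is a theoretical minimum that ignores protocol overhead and other traffic.

```python
def transfer_seconds(size_megabytes, rate_mbps):
    """Ideal time to send a file: total bits divided by bits per second."""
    bits = size_megabytes * 8_000_000          # 1 MB taken as 1,000,000 bytes
    return bits / (rate_mbps * 1_000_000)      # Mbps means millions of bits/s


# Maximum data rates as quoted in the notes above.
RATES_MBPS = {
    "UTP Category 3": 10,
    "UTP Category 5": 100,
    "Fiber optic": 2000,   # 2 Gbps
}

FILE_SIZE_MB = 700  # hypothetical file, roughly one CD-ROM

for medium, rate in RATES_MBPS.items():
    print(f"{medium}: {transfer_seconds(FILE_SIZE_MB, rate):.1f} seconds")
```

Each tenfold increase in data rate cuts the ideal transfer time by a factor of ten.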
Wireless Transmission Media (Unguided)
Wireless media transmit data through high-frequency radio signals or infrared light beams. The medium cannot be seen with the naked eye. Wireless media are used when it is inconvenient, impractical or impossible to install physical cables.

Broadcast radio
• Distributes radio signals through the air over long distances.
• It is used with AM and FM radio, television stations and CB (citizens band) radio (e.g. walkie-talkie).
• Each station has a different frequency (e.g. FAME FM is 95.5 MHz, LOVE is 101 MHz etc.).
• Broadcast radio is slower and more susceptible to noise than physical transmission media, but it provides flexibility and portability.
• Bluetooth, HomeRF and 802.11 technologies use broadcast radio signals.
• Speeds range from 1 Mbps to 54 Mbps depending on the technology.

Cellular radio
• Form of broadcast radio used for mobile communication (handheld computers, phones etc.).
• Transmission speeds range from 9.6 Kbps to 2 Mbps depending on the generation.

Microwaves (also called Fixed-Point Wireless)
• High-frequency radio waves that provide high-speed transmission (up to 150 Mbps).
• Signals are sent from one microwave station to another.
• Limited to line-of-sight transmission, which means that the microwave must be transmitted in a straight line with no obstructions between antennas.
• This means that microwave antennae are placed in high places (tall building, hill etc.).

Satellites
• A satellite is a space station that receives microwave signals from an earth-based station (uplink), amplifies the signals, then broadcasts the signals back over a wide area (downlink).
• Satellites are usually placed about 22,300 miles above the equator. They are considered geosynchronous because they orbit at the same rate as the earth, therefore maintaining their position over the earth's surface.
• VSAT (very small aperture terminal) – a small earth station (dish) used for satellite communications.
• Transmission speed is up to 1 Gbps.
• More expensive, and problems are harder to fix.
• Affected by bad weather (e.g. you lose certain channels whenever it is raining).

Infrared (IR)
• Sends signals using infrared light waves.
• Also uses line-of-sight transmission.
• Signals only travel for short distances.
• Used by remote controls and wireless devices such as a mouse, printer or digital camera.

5. Module 5 – Network Security

5.1. Define Computer security
In the computer industry, computer security refers to techniques for ensuring that data stored in a computer cannot be read or compromised by any individuals without authorization. Most security measures involve data encryption and passwords. Data encryption is the translation of data into a form that is unintelligible without a deciphering mechanism. A password is a secret word or phrase that gives a user access to a particular program or system.

5.2. What is a computer security risk?
A computer security risk is any event or action that could cause a loss of or damage to computer hardware, software, data, information, or processing capability. A computer security plan is a written summary of all the safeguards that are in place to protect a company's information assets.

5.3. Categories of risk and their effects

Category: Human error – e.g. deleting a file by accident, adding data twice, entering incorrect data, not being adequately trained/experienced (e.g. a young child).
Effect: Loss of data; less data integrity (incorrect data), therefore incorrect information will be retrieved. Damage to computer due to improper use.

Category: Technical error – system failure, e.g. hard disk crash, boot file missing/corrupted.
Effect: Loss of data, loss of time in having to re-enter data.

Category: Virus – program that causes damage to files or computer.
Effect: Loss of files/data, loss of time. May need to re-install software.

Category: Disasters (natural or otherwise) – earthquake, hurricane, fire, flood, lightning, power surges, low voltage, insects.
Effect: Physical damage to computer. Loss of data. Loss of computer. Huge repair bill.

Category: Unauthorized use and access – hacker/cracker gets access illegally. This can lead to things like software piracy.
Effect: Competing entity could use data against your company. Identity theft. Loss of sales due to piracy. Also leads to theft of intellectual property, theft of marketing information (e.g. customer lists, pricing data, or marketing plans), or blackmail based on information gained from computerized files (e.g. medical information, personal history, or sexual preference). Employees do things to deliberately modify the data.

Category: Theft, vandalism, civil disorder.
Effect: Loss of computer and data. Illegal access to files. Loss of time. Loss of income due to software piracy.

Who is a Hacker?
A slang term for a computer enthusiast, i.e. a person who enjoys learning programming languages and computer systems and can often be considered an expert on the subject(s). Among professional programmers, depending on how it is used, the term can be either complimentary or derogatory, although it is developing an increasingly derogatory connotation. The pejorative sense of hacker is becoming more prominent largely because the popular press has co-opted the term to refer to individuals who gain unauthorized access to computer systems for the purpose of stealing and corrupting data. Hackers themselves maintain that the proper term for such individuals is cracker.

What is software piracy?
• Software piracy is the unauthorized copying of software.
• A software license is a type of proprietary or unwarranted license, as well as a memorandum of contract between a producer and a user of computer software (sometimes called an End User License Agreement, EULA), that specifies the limits of the permission granted by the owner to the user.
• By buying the software, a user becomes a licensed user rather than an owner.
Users are allowed to make copies of the program for backup purposes, but it is against the law to give copies to friends and colleagues.
• Software licenses are primarily written to deal with issues of copyright law.
• Copying software without authorization is an act of copyright infringement, and is subject to civil and criminal penalties.
• Copyright is the set of exclusive rights given to authors and artists to duplicate, publish, and sell their materials. A copyright gives its holder the right to restrict unauthorized copying and reproduction of an original expression (e.g. literary work, movie, music, painting, software).
• Software copyright stands in contrast to other forms of intellectual property, such as patents, which grant a monopoly right to the use of an invention or software. Copyright is not a monopoly right to do something, merely a right to prevent others from doing it.
• It is illegal whether you use pirated software yourself, give it away, or sell it. In addition, it is illegal to provide unauthorized access to software or to serial numbers used to register software.
• The Internet allows products to move from one computer to another with no hard media transaction and little risk of detection. Internet piracy is the most widely used method of piracy and illegal downloading.

Computer Crime
• Computer crime is defined as deliberate actions to steal, damage, or destroy computer data without authorization, as well as accessing a computer system and/or account without authorization.
• Criminals or perpetrators may be employees, outside users, hackers and crackers, and organized crime members.

What is Intellectual Property?
Intellectual property refers to the category of intangible (non-physical) property comprising primarily copyright, moral rights related to copyrighted materials, trademark, patent and industrial design.
Other threats
The following allow someone to gain illegal/unauthorized access to data, which leads to unauthorized use:
• Spoofing - Getting one computer on a network to pretend to have the identity of another computer, usually one with special access privileges, so as to obtain access to the other computers on the network.
• Masquerade - Accessing a computer by pretending to have an authorized user identity.
• Scanning - Sequentially testing passwords/authentication codes until one is successful.
• Snooping (Eavesdropping) - Electronic monitoring of digital networks to uncover passwords or other data.
• Shoulder Surfing - Direct visual observation of monitor displays to obtain access.
• Scavenging/Dumpster Diving - Accessing discarded trash to obtain passwords and other data.

The following causes problems on a network:
• Spamming - Overloading a system with incoming messages or other traffic to cause system crashes.

What is Malware?
Malware is a program that performs unexpected or unauthorized, but always malicious, actions. It is a general term used to refer to viruses, Trojans, and worms. Depending on its type, malware may or may not replicate itself.

What is a computer virus?
A computer virus is a computer program that is designed to replicate itself by copying itself into the other programs stored in a computer. It may be benign or have a negative effect, such as causing a program to operate incorrectly or corrupting a computer's memory.

Viruses are the colds and flus of computer security: ubiquitous (ever-present), at times impossible to avoid despite the best efforts, and often very costly to an organization's productivity.

Computer viruses are called viruses because they share some of the traits of biological viruses. A computer virus passes from computer to computer like a biological virus passes from person to person.

There are similarities at a deeper level, as well. A biological virus is not a living thing. A virus is a fragment of DNA inside a protective jacket. Unlike a cell, a virus has no way to do anything or to reproduce by itself; it is not alive. Instead, a biological virus must inject its DNA into a cell. The viral DNA then uses the cell's existing machinery to reproduce itself. In some cases, the cell fills with new viral particles until it bursts, releasing the virus. In other cases, the new virus particles bud off the cell one at a time, and the cell remains alive.

A computer virus shares some of these traits. A computer virus must piggyback on some other program or document in order to get executed. Once it is running, it is then able to infect other programs or documents. Obviously, the analogy between computer and biological viruses stretches things a bit, but there are enough similarities that the name sticks.

What is a payload?
In addition to replication, some computer viruses share another commonality: a damage routine that delivers the virus payload. A virus's payload is an action it performs on the infected computer. While payloads may only display messages or images, they can also destroy files, reformat your hard drive, or cause other damage. Even if the virus does not contain a damage routine, it can cause trouble by consuming storage space and memory, and degrading the overall performance of your computer.

Types of viruses
Several thousand viruses have been recorded by authorities. Most of them are variations of two main types:
1. File viruses, including macro viruses
2. Boot sector (or system sector) viruses.

There are, however, other types of viruses. The following describes the various types of viruses:

Worms: A worm is a small piece of software that uses computer networks and security holes to replicate itself. A copy of the worm scans the network for another machine that has a specific security hole. It copies itself to the new machine using the security hole, and then starts replicating from there, as well.
Trojans: A Trojan appears to be something that it is not, so that you give out certain information. Example: a fake login screen will allow you to put in your ID and password, thereby allowing them to be read by unscrupulous persons. Another example is a fake web site on which you give out your credit card information. A Trojan program may also claim to do one thing (it may claim to be a game) but instead do damage when you run it (it may erase your hard disk). Trojan horses have no way to replicate automatically.

Boot sector (or system sector) viruses: These viruses infect floppy disk boot records or master boot records on hard disks. The boot sector of your hard disk contains the programs used to boot (or start) your computer. These system sectors are vital for proper operation of your computer. When you switch on your computer, the hardware automatically finds and runs the system sector program. This program then loads your operating system. The viruses replace the boot record program (which is responsible for loading the operating system into memory), copying it elsewhere on the disk or overwriting it. Boot viruses load into memory if the computer tries to read the disk while it is booting. In the past, boot sector viruses were spread mainly by infected bootable floppy disks. Today any disk can cause infection if it is in the drive when the computer boots up. Boot sector viruses can also be spread across a network and by e-mail attachments. These viruses usually remain active on your computer, and can infect any floppy disk you access.
Examples: Form, Disk Killer, Michelangelo, and Stoned

Program/File viruses: These infect executable program files, such as those with extensions like .BIN, .COM, .EXE, .OVL, .DRV (driver) and .SYS (device driver). These programs are loaded into memory during execution, taking the virus with them. The virus becomes active in memory, making copies of itself and infecting files on the disk.
Examples: Sunday, Cascade

Multipartite viruses: A hybrid of boot and program viruses. This sophisticated type of virus infects program files, and when the infected program is executed, the virus infects the boot record. The next time you boot the computer, the virus loads into memory from the boot record and then starts infecting other program files on disk.
Examples: Invader, Flip, and Tequila

Stealth viruses: These viruses use certain techniques to avoid detection. They may either redirect the disk head to read another sector instead of the one in which they reside, or they may alter the infected file's size as shown in the directory listing. For instance, the Whale virus adds 9216 bytes to an infected file; the virus then subtracts the same number of bytes (9216) from the size given in the directory. A stealth virus can conceal its presence in many ways and some go undetected for years.
Examples: Frodo, Joshi, Whale

Polymorphic viruses: A virus that can encrypt its code in different ways so that it appears different in each infection. These viruses are more difficult to detect.
Examples: Involuntary, Stimulate, Cascade, Phoenix, Evil, Proud, Virus 101

Macro viruses: A macro virus is a type of computer virus that infects the macros within a document or template. A macro is an automated series of program commands, such as a list of formatting commands for a word processing program. Many applications use macros, including popular spreadsheet and word processing programs. When you open an infected word processing or spreadsheet document, the macro virus is activated and infects the Normal template (Normal.dot), a general purpose file that stores default document formatting settings. Every document you open refers to the Normal template, and hence gets infected with the macro virus. Since this virus attaches itself to documents, the infection can spread if such documents are opened on other computers.
Examples: DMV, Nuclear, Word Concept.
E-mail viruses: An e-mail virus moves around in e-mail messages, and usually replicates itself by automatically mailing itself to dozens of people in the victim's e-mail address book.

ActiveX: ActiveX and Java controls will soon be the scourge of computing. Most people do not know how to control their web browser to enable or disable the various functions like playing sound or video and so, by default, leave a nice big hole in their security by allowing applets free run of their machine. There has been a lot of commotion about this, and with the amount of power that Java imparts, things seem a bit gloomy from the security angle.

Time and logic bombs: Viruses can also be categorized by what activates them. Some viruses were written to activate on a particular date such as Friday the 13th. These viruses are called time bombs. Other viruses were written to activate when the user carries out a certain action, such as opening a particular file. These viruses are called logic bombs.

How are computers infected?
It is important to realize that viruses cannot infect e-mail messages or Web pages that consist solely of plain text, since plain text cannot contain computer viruses.

The two ways viruses are most commonly spread are through e-mail attachments and by floppy disk.

1. E-mail attachments
• You won't get a virus just from reading text e-mails. Viruses are spread through e-mail as executable file attachments (files ending in ".EXE", ".VBS" or ".COM", for example) or messages containing embedded executable code (such as JavaScript code embedded in an HTML e-mail). You won't get a virus from opening image files, audio files, text files or pure data files.
• However, some viruses such as the "Anna Kournikova" virus make themselves look like an innocent picture: the file name was AnnaKournikova.jpg.vbs. The .jpg component of the filename made many computer users think that the file was a picture and distracted them from the .vbs ending, which identifies the file as an executable file.
• Macro viruses are commonly distributed by e-mail. These are dangerous because they often use common file formats such as Microsoft Word. Users recognize the file extension and assume the file contains only data. However, when macro-enabled files are opened they actually execute a program, which could be infected with a virus.
• Some viruses and worms, such as the famous "I Love You" virus, gain broad distribution by targeting the infected host's e-mail contact listings. The virus sends copies of itself to the host's contacts using the host's name. The recipient often recognizes and trusts the sender, not realizing that the message was sent without the sender's knowledge.

2. Floppy Disk
• Many viruses are spread by sharing an infected disk. These viruses will usually reside in your computer's memory and infect any floppy disk you place in your drive. Both file viruses and system sector viruses can be spread by floppy disk.
• Your best line of defence is common sense. If you are in doubt about the source of the file or attachment, don't open it.
• When a software application is infected, the virus will attempt to infect any documents accessed by that program. If the infected computer is on a network, the infection can rapidly spread to other networked computers that share files. If a copy of the infected file is transferred to another computer through e-mail or floppy disk, the virus can spread to that computer. The virus will continue to spread until it is found and eradicated.
• Obviously, a virus can't do any harm if it is caught before it gets a chance to start. You can use anti-virus (or virus protection) software to check (or "scan") the files on your hard drive to detect if any of them contain known viruses.
If an infected file is found, most anti-virus programs give you the option of deleting the infected file or attempting to remove the virus and leave the uninfected version of the file on your computer.

What damage can computer viruses do?
It is important to remember that most viruses aren't programmed with destructive intentions. Most simply reproduce without any destructive attack. However, these viruses can still cause damage to your files, particularly since many of them are poorly written programs that can cause unintended software conflicts. At the very least, viruses are intrusive applications that steal storage and CPU cycles without your permission.

Most people's worst virus fear is having their hard drive erased, but those who regularly create back-up versions of important data could recover within a few hours. Viruses that subtly corrupt data are potentially much more destructive, because computer users may not notice their presence until a great deal of data has been ruined. Some viruses insert random numbers in spreadsheet applications or system files, or add typos to word processing documents. One particularly nasty virus posted confidential documents in the user's name to Internet newsgroups.

All viruses will attempt to infect other files, and some will launch some form of attack, but it's not obvious when these events will occur. Viruses are programmed to perform these actions upon certain conditions, or triggers. Triggers can be anything from a set day or time, a counter within the virus, or a specific number of times executed, to a specific event such as the deletion of an employee's payroll file. Some viruses will lie dormant for years, ensuring that many computers can be infected before initiating the attack phase.

Despite the claims of some hoax e-mails that have been widely circulated, viruses usually won't do direct damage to your hardware. In a few rare cases viruses can manipulate your software configurations in a way that renders some hardware components useless, but the vast majority of virus attacks affect applications or operating systems and not the underlying hardware.

5.4. Risk Management Solutions

What is risk management (risk management solution)?
Risk management is an action taken to either prevent a risk from happening or to reduce its effects. The following table shows the various categories of risks and solutions to either prevent or reduce the effects of the risks. The solutions either protect the physical computer (hardware), or protect the data/information/software (files) on the computer.

Category of risk: Human error – e.g. deleting a file by accident, adding data twice, entering incorrect data
Solutions:
• Data validation (validation rules)
• Reduction of human interaction (because humans make mistakes). In other words, automate as many processes as possible. For example, use a bar code reader to scan in the items rather than have the cashier type in the item code.
• Training of the user
• Password protection
• Authority levels (to limit access)
• Supervision of children and inexperienced users
• Separation of duties (e.g. can enter data but not change it)

Category of risk: Technical error – system failure, e.g. hard disk crash, boot file missing/corrupted
Solutions:
• Buy quality hardware from a reputable dealer
• Get a warranty period when you purchase a computer
• Backup, just in case the hardware fails you
• Air conditioning – to keep the computer cool
• Plastic dust covers to keep dust out of diskette drives etc.
• Proper (sturdy) desk on which to place the computer
• Proper diskette care procedures – e.g. keep diskettes away from magnets and sunlight, and do not open the shutter
• Proper maintenance (care) – e.g. defragmenting, cleaning the computer
• Regular testing of hardware and software

Category of risk: Virus – program that causes damage to files or computer
Solutions:
• Antivirus software (e.g. McAfee, Norton Antivirus, Trend Micro PC-cillin). This must be updated regularly.
• Firewall - a program and/or hardware that filters the data coming through the internet to prevent unauthorized access. Some firewalls also protect systems from viruses and junk email (spam). Examples: Black Ice, Zone Alarm.
• Limit connectivity, such as staying off a network if it is not necessary.
• Visit trusted sites only when on the internet.
• Limit software downloads, since viruses can be caught by downloading music, games etc.
• Use only authorized media for loading data and software.
• Write-protect diskettes when opening files on another computer.
• Back up files regularly.
• Do not open unknown email and attachments.
• Enforce mandatory access controls. Viruses generally cannot run unless the host application is running.

Category of risk: Natural disasters etc. – earthquake, hurricane, fire, flood, power surges, insects
Solutions:
• Offsite backup (located elsewhere, such as at another branch or in another country)
• Good location (e.g. not on a hillside or near the sea)
• Strong, weatherproof facilities (no windows, fireproof)
• No food/drink around the computer – no insects, spills on keyboard etc.
• Raised (false) floors – similar to a false ceiling except that it is below your feet. It is used for earthquake protection as it works as a shock absorber. Raised floors also allow you to hide cables below.
• UPS (Uninterruptible Power Supply) – this has a battery which charges while there is power. It gives you time to shut down the computer properly when there is a power cut.
• Generator – used during a power cut and runs on gas. It allows you to continue using the computer for as long as there is gas.
• Surge protectors to protect against low voltage, power surges/spikes, lightning etc.
• Lightning rod to protect the building and all electrical devices within the building from lightning storms.
• Fire extinguishers – specially made for computers (foam). These will not damage the computers, whereas water would cause damage.
• Insurance of equipment, in order to repurchase if your computer is destroyed.
Category of risk: Unauthorized use and access – e.g. hacker/cracker gets access illegally
Solutions:
• Physical security – e.g. locks, guards, grills etc.
• Access codes and passwords – passwords should not be easy to guess (e.g. do not use your birthday).
• Biometric devices – e.g. retinal scan, fingerprint scan, voice activation.
• Require frequent password changes. By the time a hacker goes through a listing of the possible passwords, the password will have changed.
• Sign off when you leave your desk, even for a moment.
• Authority levels – so that only certain users can perform certain tasks.
• Firewall - a program and/or hardware that filters the information coming through the internet to prevent unauthorized access. Some firewalls also protect systems from viruses and junk email (spam). Examples of firewalls include Black Ice and Zone Alarm.
• Encryption of data – encoding data so that it means nothing to hackers if they get into the system.
• Audit trails – keep track of what a user does when he is on the system.
• Log systems – keep track of user sign on/off.
• Intrusion detection software – e.g. detects if the wrong password is entered more than 3 times and locks you out (e.g. when you enter a false telephone card number, or the wrong PIN for your debit card at the ATM).
• Time and location controls – a user can only use the system at certain times and in certain locations (can't hide and do wrong things).
• Separation of duties (e.g. one person enters the data and another person is needed to change it, as with a cashier). This is in order to prevent employees from committing fraud or stealing from the company.
• Restrict report distribution and shred reports – e.g. do not throw away credit card statements (this prevents persons from going through your garbage and getting your private information).
• Go to reputable web sites so that your credit card number will not be stolen. Go to secure sites (lock at the bottom of the screen).
• Secrecy Act in Jamaica – so that employees do not give out company information.
• Copyright and license agreements – so that you have the right to sue persons who steal your software/data.
• Auditing the programs that are written, in case an unscrupulous employee has deliberately put in code for his own benefit.
• Callback systems – the user can connect to the computer only after the computer calls the user back at a previously established telephone number.

Category of risk: Theft and vandalism
Solutions:
• Physical security – locks, guards, dogs, biometrics
• Metal detectors to prevent hardware theft
• Backup
• Lock the computer to the desk
• Low profile facilities (no overt disclosure of the high-value nature of the site; in other words, do not display a sign to let persons know where your computer facilities are)
• Mark your computers in a secret place so that you can identify them if the police find them

Backup is the key – the ultimate safeguard
Regardless of the precautions that you take, things can still go wrong. Backup is therefore the main risk management solution. A backup is a duplicate of a file or disk that can be used if the original is lost, damaged, or destroyed. If your computer fails you can restore from the backup. The following describes the different types of backup.
• Full – backup that copies all of the files in a computer (also called archival backup)
• Incremental – backup that copies only the files that have changed since the last full or last incremental backup
• Differential – backup that copies only the files that have changed since the last full backup
• Selective – backup that allows a user to choose specific files to back up, regardless of whether or not the files have changed since the last backup
• Grandfather, Father, Son (or three-generation backup) – backup method in which you recycle 3 sets of backups. The oldest backup is called the grandfather, the middle backup is the father and the latest backup is called the son.
Each time that you back up, you reuse the oldest backup medium. The father then becomes the grandfather, the son becomes the father and the new backup becomes the son. This method allows you to have the last 3 backups at all times.

6. Module 6 – Database Management Systems
A database system is essentially nothing more than a computerized record-keeping system, similar to a filing cabinet. The users will have the following facilities: add new files, insert new data, retrieve data, update data, delete data, and delete files.

6.1. Traditional/File Processing Approach versus Database Approach
Almost all application programs use either the file processing approach or the database approach to store and manage data.

1. Traditional/File Processing Approach
This is an approach to storing and managing data where each department within an organization typically has its own set of files.
• Files are often designed specifically for their particular application.
• Files are designed to meet the needs of a given program (e.g. Prog 1 uses the employee file only).
• The focus is on procedures (what needs to be done by the programs Prog 1 and Prog 2).
• The records in a file may not relate to records in any other file (e.g. the employee file is in no way related to the warehouse file).
• Companies have usually been using file processing for many years.

Major weaknesses of the traditional approach
• Data redundancy – Each department has its own files, therefore:
  o The same fields are stored multiple times, causing wasted resources
  o The chance for errors is increased (e.g. different spelling in different locations, causing inconsistency)
• Isolated data – Resulting in difficulty accessing data stored in different files (e.g. Prog 1 cannot directly access those files designed for Prog 2).
• Poor data control – With no centralized control at the data element level, it is common for the same data element to have multiple names.
• Data had to be kept sorted (e.g. in order to locate a particular item).
• File structure changes severely impact existing programs.

Exercise: Find the employees making less than $23000 who
a) work in a warehouse with a floor area larger than 30000 square feet.
b) have issued an order to supplier "S6".
This would not be possible in the traditional approach if the files are separate.

2. The Database Approach
In this approach many programs and users share the data in the database. Users access data using software called a Database Management System (DBMS). The focus is on the data and not on the procedures or programs that use the data. The data resource is separate from the programs.

6.2. What is a Database?
A database is an organized collection of data. The data is organized in a manner that allows access, retrieval and use of that data.
OR
A database is a single organized collection of structured data, stored with a minimum of duplication of data items so as to provide a consistent and controlled pool of data. This data is common to all users of the system but is independent of the programs which use the data.

6.3. What is a DataBase Management System (DBMS)?
A DBMS is an item of complex software which constructs and maintains a database in a controlled way. It allows us to use the computer to create a database in which we can change, add and delete data. It also allows us to sort and retrieve data and create forms, queries and reports using the data in the database.
OR
A DBMS is application software that allows creation, access, and management of a database. It consists of a collection of interrelated data and a collection of programs to access that data. A DBMS is usually purchased from a software vendor and is the means by which an application programmer or end-user views and manipulates data in a database.

6.4. Examples of DBMSs
Microsoft Access, Oracle, DB2, Visual FoxPro, Informix, Ingres, Paradox, Sybase, SQL Server, Approach, GemStone, D3, Essbase, FastObjects, InterBase, JDataStore, Adabas, Versant

6.5. Common examples of databases in society
• Payroll
• Employee data
• Inventory management/Stock
• Sales
• Customer data
• Supplier data
• Library book management
• Banking
• Student record keeping

6.6. Sample Payroll Database Structure (Single example)

Character
A number, letter, punctuation mark, or other symbol that is represented by a single byte in the ASCII and EBCDIC coding schemes.

Fields
A field (also known as an attribute) contains a specific piece of information within a record. A field name uniquely identifies each field. In the example employee table above, the "lastname" field would contain all of the last names of the employees in the table. A field is an attribute or characteristic of an entity. (An entity is an object or event about which someone chooses to collect data. It may be a person, place, event or thing, e.g. student, car, library book, employee, bank account.)

Records
A record is a group of related fields. It is a collection of data items. A record contains information about a given person, place, event or thing. A record in an employee table would contain specific information about a particular employee.

Tables/File
A table is a group of related records. It captures all of the records of a particular type of entity, e.g. the employee table has all of the employee records. The structure of the table is described by the fields, that is, the type of data that will be held in the table.

6.7. The languages used in database systems (Data definition and Data manipulation)
Some databases have their own computer languages associated with them, which allow the user to access and retrieve data. Other databases are only accessed via languages such as COBOL.
Data descriptions must be standardized. For this reason a Data Description/Definition Language (DDL) is provided, which must be used to specify the data in the database. Similarly, a Data Manipulation Language (DML) is provided, which must be used to access the data. The combination of the DDL and DML is often called a Data Sub-Language (DSL) or a query language.

Data Definition Language - The DDL is that portion of the DBMS which allows us to create and modify the structure of the database and the database tables. The functions of a DDL may therefore include:
• Creating database structures
• Creating table structures
• Associating fields with table structures
• Associating data types with field structures etc.

Data Manipulation Language - The DML is that portion of the DBMS which allows us to store, modify, and retrieve data from the database. There are two types of DML: the procedural DML and the nonprocedural DML.
• Procedural DMLs require that the user specify the data that is needed from the database and how to obtain it. Procedural DMLs are more difficult to use, since they require the user to be proficient in the language commands that manipulate the structure and the contents of the data file. On the other hand they are more flexible, since they allow the user to determine the method that is used for accessing and manipulating the structure and contents of a file.
• Nonprocedural DMLs require that the user specify the data that is needed from the database, but do not allow the user to specify how to obtain it. Nonprocedural DMLs are easier to use, since they do not require a detailed knowledge of the language commands that manipulate the structure and the contents of a data file. On the other hand they lack flexibility, since the programmer has no way of determining the method for accessing and manipulating the contents of the data file.
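The difference between DDL and DML, and between the procedural and nonprocedural styles, can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the employee table, its field names (empno, lastname, salary) and the sample rows are hypothetical, and SQLite simply stands in for whichever DBMS is actually in use. The loop at the end only imitates a procedural DML by visiting records one at a time.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# DDL: create the table structure and associate data types with fields.
# The table and field names are hypothetical.
conn.execute(
    """
    CREATE TABLE employee (
        empno    INTEGER PRIMARY KEY,
        lastname TEXT NOT NULL,
        salary   REAL NOT NULL
    )
    """
)

# DML: store data in the table (made-up sample rows).
conn.executemany(
    "INSERT INTO employee (empno, lastname, salary) VALUES (?, ?, ?)",
    [(1, "Brown", 21000.0), (2, "Clarke", 30000.0), (3, "Reid", 22500.0)],
)

# DML: modify existing data.
conn.execute("UPDATE employee SET salary = 24000.0 WHERE empno = 3")

# Nonprocedural retrieval: state WHAT is wanted and let the DBMS
# decide how to find it.
nonprocedural = [
    row[0]
    for row in conn.execute(
        "SELECT lastname FROM employee WHERE salary < 23000 ORDER BY lastname"
    )
]

# Procedural-style retrieval: the program spells out HOW to obtain the
# data, visiting every record and testing each one itself.
procedural = []
for empno, lastname, salary in conn.execute(
    "SELECT empno, lastname, salary FROM employee"
):
    if salary < 23000:
        procedural.append(lastname)
procedural.sort()

print(nonprocedural)
print(procedural)
conn.close()
```

Both lists hold the same last names. The nonprocedural version is shorter and leaves the access method to the DBMS, while the loop gives the programmer control over each step at the cost of more code.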
Please note that it is the nonprocedural DML of a 4th Generation Language that allows it to exhibit structural and data independence.

Query Language
The implementation of a query language is vital for a DBMS. The query language allows the end user to generate ad hoc queries, which are immediately answered. In most languages the DML and the query language are one and the same. Today, many DBMSs also provide support for a standardized query language that may be different from the DML of the language. This is known as the Structured Query Language (SQL).

6.8. Functions/features common to most DBMSs
• Data Dictionary/Repository
  o Contains data about each field and table in the database (data about data is called metadata)
  o Should only be updated by skilled personnel
  o Is used to perform validation checks
  o Allows users to specify a default field value
• File retrieval and maintenance
  o Many tools provided
  o Involves adding new records, updating existing records and deleting unwanted records
  o It also provides the interface between the user and the data
• Query Language
  o Allows users to specify data to be displayed, printed or stored
  o Consists of simple English-like statements
  o Each has its own grammar and vocabulary
  o Usually quickly learned by a non-programmer
• Form
  o A window used to enter and change data
  o When well designed, validates data as entered, thus reducing data entry errors
• Report Generator/Writer
  o Allows users to design a report on the screen
  o Normally used only to retrieve data
• Data Security
  o A DBMS provides means to ensure that only authorized users access data, and only at permitted times
  o Most DBMSs allow different levels of access privileges
• Backup and Recovery
  o A DBMS provides a variety of techniques to restore a damaged or destroyed database to usable form
  o A backup or copy of the entire database should be made on a regular basis
  o Some DBMSs maintain a log of activities

6.9. Database Administration
Managing a company's database requires a lot of coordination. These database activities are performed by:
• Database Analyst (DA)
  o Focuses on the meaning and usage of data
  o Decides the placement of fields and defines relationships among data
• Database Administrator (DBA)
  o Creates and maintains the data dictionary
  o Manages DB security
  o Monitors performance
  o Performs backup and recovery

A database administrator (DBA) is a person who is responsible for the environmental aspects of a database. The role of coordinating the use of the database belongs to the DBA.

The duties of a database administrator vary depending on the job description, corporate and IT policies, and the technical features and capabilities of the database management systems (DBMSs) being administered. They nearly always include disaster recovery (backups and testing of backups), performance analysis and tuning, and some database design or assistance with it.

Database administrators work with database management systems software and determine ways to organize and store data. They identify user requirements, set up computer databases, and test and coordinate modifications to the computer database systems. An organization's database administrator ensures the performance of the system, understands the platform on which the database runs, and adds new users to the system. Because they may also design and implement system security, database administrators often plan and coordinate security measures. With the volume of sensitive data generated every second growing rapidly, data integrity, backup systems, and database security have become increasingly important aspects of the job of database administrators. Their salaries range from US$65,000 to US$86,000 depending on qualifications and experience.
The administrative and other controls carried out by the DBA therefore include the following:
• Select and implement the DBMS
• Develop database models (e.g. entity relationship diagrams)
• Create and maintain the data dictionary [1], including its documentation
• Ensure that the database structure is documented
• Supervise the addition of new data
• Provide manuals describing the facilities the database offers and how to make use of these facilities
• Provide facilities for retrieving data and for structuring reports that are appropriate to the needs of the organization
• Ensure that the data in the database meets the information requirements of the organization (design the database)
• Manage security of the database (includes backup and recovery)
• Recoverability - check backup and recovery/restore procedures
• Perform archiving (back up and remove historical data from current files)
• Availability - ensure that the database is running when necessary
• Use query languages to obtain reports of the information in the database
• Periodically appraise the data to ensure it is complete, accurate and not duplicated (monitor performance); verify database integrity
• Appraise the performance of the database and take corrective action if performance degrades

[1] A data dictionary (also called a repository) is a DBMS element that contains data about each table in a database and each field within those tables.

Although not strictly part of a database administrator's duties, logical and physical design of databases is sometimes part of the job. These functions are traditionally thought of as being the duties of a database analyst or database designer.

6.10. Types of databases/Database models

Every database and DBMS is based on a specific data model. The data model consists of the rules that define how the database organizes data and how users view the organization of data.
Databases are classified according to the approaches taken to database organization. The classes are:
• Relational
• Network
• Hierarchical
• Object-Oriented
• Multidimensional

A data model is a representation of data and its interrelationships which describes ideas about the real world. The hierarchical and network database models store their data in a series of records, each of which has a set of field values attached to it. They collect all the instances of a specific record together as a record type. These record types are the equivalent of tables in the relational model, with the individual records being the equivalent of rows. Links between the record types are created using parent-child relationships.

Hierarchical Model

A hierarchical system is one that is organized in the shape of a pyramid, with each row of objects linked to objects directly beneath it. Hierarchical systems pervade everyday life.

Examples of hierarchical systems in society are:
• The army, which has generals at the top and privates at the bottom
• The classification of plants and animals according to species, family, genus etc.

Examples of hierarchical systems in computers are:
• File system – a hierarchy of folders and sub-folders in which files are placed
• Menu-driven system – a system of main menus with sub-menus below (e.g. when you click on File, another menu comes up under it)

The hierarchical model is the oldest of the database models and, unlike the network, relational and object-oriented models, does not have a well-documented history of its conception and initial release. It is derived from the Information Management Systems of the 1950s and 60s. It was adopted by many banks and insurance companies, who are still running it as a legacy system to this day. Hierarchical database systems can also be found in inventory and accounting systems used by government departments and hospitals.
The hierarchical model is a tree-structured model and consists of many record types, with one being the root. The root record type exists at the top of the tree. All data must be accessed through the root. One-to-many relationships exist between records in the hierarchy, with one being the parent and the other the child. Each child has a unique parent and a parent can have many children. This child/parent rule assures that data is systematically accessible. To get to a low-level table, you start at the root and work your way down through the tree until you reach your target. One problem with this system is that the user must know how the tree is structured in order to find anything.

For example, in the diagram below, the root record type is customer, the parent of order is customer, and the parent of parts is order. In order to access an order, you must first access the customer (e.g. by knowing the customer#). Order has two children, which are parts and salesman. In order to access the parts, you must first access the customer and then the order. The path to the parts record type is therefore Customer, Order, Parts.

Hierarchical structures were widely used in the first mainframe database management systems. However, due to their restrictions, they often cannot be used to relate structures that exist in the real world. Hierarchical relationships between different types of data can make it very easy to answer some questions, but very difficult to answer others. If a one-to-many relationship is violated (e.g. a patient can have more than one physician), then the hierarchy becomes a network.

The hierarchical model is no longer used as the basis for current commercially produced systems; however, there are a large number of legacy (old) installations. These legacy systems are likely to be phased out over time, as the number of qualified staff declines due to retirement and retraining.
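The access path described above (Customer, Order, Parts) can be sketched with nested Python dictionaries standing in for the tree. This is only an illustration of the navigation idea, not how a hierarchical DBMS such as IMS stores its data; the customer number, salesman name and the two helper functions are hypothetical.

```python
# Hypothetical sample data: one root record type (customer) whose
# children are orders, whose children in turn are parts and a salesman.
database = {
    "C001": {                       # customer# (root record)
        "name": "Dartonik",
        "orders": {
            "2034": {               # order# (child of customer)
                "salesman": "J. Brown",
                "parts": ["Mouse", "Diskette"],
            },
        },
    },
}


def find_parts(db, customer_no, order_no):
    """Follow the only available path: Customer -> Order -> Parts."""
    customer = db[customer_no]            # must enter through the root
    order = customer["orders"][order_no]  # then descend to the order
    return order["parts"]                 # and finally reach the parts


def find_customer_of_order(db, order_no):
    """Without the customer#, every root record has to be searched."""
    for customer_no, customer in db.items():
        if order_no in customer["orders"]:
            return customer_no
    return None


print(find_parts(database, "C001", "2034"))
print(find_customer_of_order(database, "2034"))
```

The first function is cheap because the caller knows the structure and the customer#; the second shows why a question that does not start at the root is awkward to answer.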
Examples of hierarchical databases include:
• IMS - Information Management System by IBM
• System 2000 by MRI Systems Corp.
• Adabas
• GT.M
• Caché
• Multidimensional hierarchical toolkit
• Mumps compiler

Figure – structure of a hierarchical database

Figure – sample hierarchical database

Advantages of the Hierarchical Model
• Data is unified, since all records stem from the root
• Easier to secure the database, since you can access data through only one path
• Good for large volumes of one-to-many relationships
• Adding, updating, and deleting records is more efficient and accurate

Disadvantages of the Hierarchical Model
• Software dependence (changes to the database structure require modification to all programs which access the database)
• You cannot add a record to a child table until it has already been incorporated into the parent table. This might be troublesome if, for example, you wanted to add a student who had not yet signed up for any courses. In Figure 2, you cannot add a new salesperson until there is a customer and an order.
• Many-to-many relationships cannot be shown, or can be shown only with difficulty
• One-to-many relationships can result in redundant data
• Not flexible enough to support ad hoc queries
• Data can only be accessed through the right path
• It is not user friendly, as users have to know the structure in order to access data through the right path

Network Model

The network model is a database model conceived as a flexible way of representing objects and their relationships. Its original inventor was Charles Bachman, and it was developed into a standard specification published in 1969 by the Conference on Data Systems Languages (CODASYL) Consortium. In many ways, the network database model was designed to solve some of the problems with the hierarchical database model.
Where the hierarchical model structures data as a tree of record types, with each record type having one parent record and many children, the network model allows each record type to have multiple parent and child records, forming a lattice structure. This allows the model to support many-to-many relationships. There is no root record type. Data can therefore be accessed through more than one path.

For example, in the diagram below (Figure 6), an order can be accessed through either the salesperson or the customer, as order has salesperson and customer as its parents. Another way of saying it is that the child of salesperson and customer is order. The path to Parts is either Salesperson, Order, Parts or Customer, Order, Parts. You can therefore access parts either by knowing who the salesperson is or through the order, by knowing, for example, the order #.

The chief argument in favour of the network model, in comparison to the hierarchical model, was that it allowed a more natural modeling of relationships between entities. Although the model was widely implemented and used, it failed to become dominant for two main reasons. Firstly, IBM chose to stick to the hierarchical model in their established products such as IMS and DL/I. Secondly, it was eventually displaced by the relational model, which offered a higher-level, more declarative interface.

Examples of network databases include:
• CODASYL
• Total
• VAX-DBMS
• IMAGE of Hewlett Packard
• DMS-1100 of UNIVAC
• SUPRA of Cincom

Figure 1

Figure 2

Advantages of the Network Model
• Many-to-many relationships are easily represented
• It is more flexible, as you can access data through more than one path

Disadvantages of the Network Model
• Software dependence.
(Changes to the database structure require modification to all programs which access the database.)
• Uses more processing time than the hierarchical structure
• Users must have knowledge of the structure of the database in order to navigate
• Hard to design, use and maintain

Relational
o Stores data in tables that consist of rows and columns
o Each row has a primary key
o Each column has a unique name
o A relational DB developer calls a file a relation, a record a tuple, and a field an attribute
o A relational DB user calls a file a table, a record a row, and a field a column
o Most include Structured Query Language (SQL), a query language that allows users to manage, update and retrieve data
o Examples: Access, Sybase, Visual FoxPro, Oracle, DB2

Relational databases consist of tables called relations. Relations are made up of tuples and attributes. The rows are called tuples. The columns are called attributes. Relationships between relations are implicit in the overlapping attributes. All relations have the same simple format, making them easy to set out under column headings. Each row normally has a unique identifying key. Most relational databases include Structured Query Language (SQL), a query language that allows users to manage, update and retrieve data.

Figure – sample relational tables (column headings: CustName, Salesperson, Orderno, Salesperson, Part-No, Order-no)

Advantages of the Relational Model
• Structural independence (i.e. changes to the database structure do not require modification to all programs which access the database)
• Powerful and flexible query mechanism that makes ad hoc queries possible
• Easy representation of all types of relationships
• Unification of data that minimizes redundancy and maximizes security

Disadvantages of the Relational Model
• Requires more space and processing power
• Requires more planning if the database structure is to be designed properly
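The idea that relationships are implicit in overlapping attributes, and the claim of structural independence, can be sketched with Python's built-in sqlite3 module. The three tables below are an assumption loosely based on the column headings in the figure (CustName/Salesperson, Order-no/Salesperson, Part-No/Order-no); the sample rows, the phone column and the helper function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (cust_name TEXT PRIMARY KEY, salesperson TEXT);
    CREATE TABLE orders   (order_no INTEGER PRIMARY KEY, salesperson TEXT);
    CREATE TABLE part     (part_no TEXT, order_no INTEGER,
                           PRIMARY KEY (part_no, order_no));
    INSERT INTO customer VALUES ('Dartonik', 'Brown'), ('INC', 'Smith');
    INSERT INTO orders   VALUES (2034, 'Brown'), (2035, 'Smith');
    INSERT INTO part     VALUES ('P1', 2034), ('P2', 2034), ('P3', 2035);
    """
)


def parts_for_customer(connection, cust_name):
    """Ad hoc query: link the tables through their shared attributes."""
    rows = connection.execute(
        """
        SELECT p.part_no
        FROM customer AS c
        JOIN orders AS o ON o.salesperson = c.salesperson
        JOIN part   AS p ON p.order_no = o.order_no
        WHERE c.cust_name = ?
        ORDER BY p.part_no
        """,
        (cust_name,),
    ).fetchall()
    return [row[0] for row in rows]


before = parts_for_customer(conn, "Dartonik")

# Change the structure: the query above is untouched and still works.
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")
after = parts_for_customer(conn, "Dartonik")

print(before, after)
```

No pointers or fixed paths are stored; the join conditions alone express how customers, orders and parts relate, which is why the same data can be queried from any starting point.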
Object-Oriented
o Stores data in objects (an object contains data plus the actions that process the data)
o Can usually store more types of data than relational databases
o Can usually access data faster than a relational DB
o Stores unstructured data more efficiently than a relational DB
o Examples: FastObjects, GemStone

What is an Object?

An object generally is any item that can be individually selected and manipulated. This can include shapes and pictures that appear on a screen as well as less tangible software entities. In object-oriented programming an object is a self-contained entity that consists of both data and procedures to manipulate the data. In other words, an object is an item that contains data, as well as the actions that read or process the data.

Real-world objects share two characteristics: they all have state and behavior. For example, dogs have state (name, color, breed, hungry) and behavior (barking, fetching, wagging tail). Bicycles have state (current gear, current pedal cadence, two wheels, number of gears) and behavior (braking, accelerating, slowing down, changing gears).

Software objects are modeled after real-world objects in that they too have state and behavior. You might want to represent real-world dogs as software objects in an animation program, or a real-world bicycle as a software object in the program that controls an electronic exercise bike. You can also use software objects to model abstract concepts.

What is a Class?

A class is a category of objects. For example, there might be a class called shape that contains objects which are circles, rectangles, and triangles. The class defines all the common properties (characteristics) of the different objects that belong to it. A class is a special programming construct that allows us to create objects. In other words, a class provides the blueprint for the creation of an object.
The class must specify a description of the data that is stored and a description of the operations that the object can provide.

As indicated above, each object must have a state and a set of methods, which are encapsulated (contained) inside the object. The state refers to the data that is stored inside the object, while the methods/behaviours refer to the set of operations/functions which the object can perform. For example, a user can click on a button, put the mouse over the button, right click or double click on the button. Click, double click, right click, mouse over etc. are therefore examples of methods. When the user clicks on the button, the relevant code for the particular user action is executed. Each object must have a set of well-defined public interfaces, which a client may use to get the object to perform a specific operation.

Examples of objects

An object-oriented database can contain many classes of objects. These include:
• Command buttons
• List boxes
• Data windows
• Windows
• Menus
• Text boxes
• Pictures
• Audio clips
• Video clips (animation)
• Students
• Courses
• Employees

What is an object-oriented database (OODB)?

Object-oriented databases, or object database management systems, grew out of research during the early to mid-1980s into having intrinsic database management support for graph-structured objects. The term "object-oriented database system" first appeared around 1985.

An object-oriented database stores data in objects. The most significant characteristic of object-oriented database technology is that it combines object-oriented programming with database technology to provide an integrated application development system. Object-oriented databases are designed to work well with object-oriented programming languages such as Java, C#, and C++. An object contains data, as well as actions that read or process the data.
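The ideas of a class as a blueprint, and of objects that carry both state and methods behind a public interface, can be sketched in Python using the shape class mentioned earlier. The class names, attributes and the describe method are hypothetical choices made for illustration.

```python
import math


class Shape:
    """Blueprint shared by all shapes: a name (state) and an area method."""

    def __init__(self, name):
        self._name = name          # state kept inside the object

    def area(self):
        raise NotImplementedError("each kind of shape supplies its own area")

    def describe(self):
        # Public interface: clients call this instead of touching _name.
        return f"{self._name} with area {self.area():.2f}"


class Circle(Shape):
    def __init__(self, radius):
        super().__init__("circle")
        self._radius = radius      # state specific to circles

    def area(self):
        return math.pi * self._radius ** 2


class Rectangle(Shape):
    def __init__(self, width, height):
        super().__init__("rectangle")
        self._width = width
        self._height = height

    def area(self):
        return self._width * self._height


# Objects are created from the class blueprints.
shapes = [Circle(1.0), Rectangle(2.0, 3.0)]
for shape in shapes:
    print(shape.describe())
```

Each object holds its own data and answers the same describe request in its own way, which is the sense in which data and the actions that process it travel together.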
A Member object, for example, might contain data about a member such as Member ID, First Name, Last Name, Address, and so on. It also could contain instructions on how to print the member record or the formula required to calculate a member's balance due. A record in a relational database, by contrast, would contain only data about a member.

Object-oriented databases have several advantages compared with relational databases. They can store more types of data, access this data faster, and allow programmers to reuse objects. An object-oriented database stores unstructured data more efficiently than a relational database. Unstructured data includes photographs, video clips, audio clips, and documents. When users query an object-oriented database, the results often display more quickly than the same query of a relational database. If an object already exists, programmers can reuse it instead of recreating a new object, saving on program development time. For example, if a Close button exists on each screen, the programmer only needs to write the code once, then place the same button on each screen. This is called inheritance, as discussed below.

The following are features of an object-oriented database:
• Inheritance – the ability to create new objects by allowing them to automatically obtain the data members and the data operations of an existing class without rewriting the code that is present in the existing class.
• Polymorphism (many forms) – the ability to have multiple classes of objects using the same interfaces although the implementation details may vary from object to object. For example, you can have a function/subroutine that calculates the area of an object. The way it calculates area depends on the type of object that called the function. This is because the formula for area is different for a circle, a rectangle, a triangle etc.
In other words, there is one function called CALCULATE_AREA and multiple objects will call this function, but the function behaves differently from object to object.
• Encapsulation – the ability of an object to hide its internal representation from the program that uses it. This is accomplished by defining public interfaces and by specifying that these public interfaces must be used when accessing the internal data.
• Information-hiding – an object has a public interface that other objects can use to communicate with it. The object can maintain private information and methods that can be changed at any time without affecting other objects that depend on it. You don't need to understand a bike's gear mechanism to use it.

Examples of object-oriented databases include:
• FastObjects
• GemStone
• KE Texpress
• ObjectStore
• Versant

Examples of applications appropriate for an object-oriented database include the following:
• A multimedia database stores images, audio clips, and/or video clips. For example, a geographic information system (GIS) database stores maps. A voice mail system database stores audio messages. A television news station database stores audio and video clips.
• A groupware database stores documents such as schedules, calendars, manuals, memos, and reports. Users perform queries to search the document contents. For example, you can search people's schedules for available meeting times.
• A computer-aided design (CAD) database stores data about engineering, architectural, and scientific designs. Data in the database includes a list of components of the item being designed, the relationship among the components, and previous versions of the design drafts.
• A hypertext database contains text links to other types of documents. A hypermedia database contains text, graphics, video, and sound. The Web contains a variety of hypertext and hypermedia databases.
You can search these databases for items such as documents, graphics, audio and video clips, and links to Web pages.
• A Web database links to an e-form on a Web page. The Web browser sends and receives data between the form and the database.

OODBs add database functionality to object programming languages. A major benefit is the unification of the application and database development into a seamless data model and language environment. As a result, applications require less code, use more natural data modeling, and have code bases that are easier to maintain. Object developers can write complete database applications with a modest amount of additional effort.

According to Rao (1994), "The object-oriented database (OODB) paradigm is the combination of object-oriented programming language (OOPL) systems and persistent systems. The power of the OODB comes from the seamless treatment of both persistent data, as found in databases, and transient data, as found in executing programs." Data in a database is said to be persistent (constant) because you can read a record at one point in time, read it again at another point in time, and the record is still there. In other words, the record is not transient (temporary).

In contrast to a relational DBMS, where a complex data structure must be flattened out to fit into tables or joined together from those tables to form the in-memory structure, OODBs have no performance overhead to store or retrieve a web or hierarchy of interrelated objects. This one-to-one mapping of object programming language objects to database objects has two benefits over other storage approaches: it provides higher performance management of objects, and it enables better management of the complex interrelationships between objects.
This makes object DBMSs better suited to support applications such as financial portfolio risk analysis systems, telecommunications service applications, world wide web document structures, design and manufacturing systems, and hospital patient record systems, which have complex relationships between data.

What is a hybrid object-relational database (ORD)?

An object-relational database (ORD) or object-relational database management system (ORDBMS) combines features of the relational and object-oriented data models. It is a relational database management system that allows developers to integrate the database with their own custom data types and methods. The term object-relational database is sometimes used to describe external software products running over traditional DBMSs to provide similar features; these systems are more correctly referred to as object-relational mapping systems.

Whereas RDBMS or SQL-DBMS products focused on the efficient management of data drawn from a limited set of data types (defined by the relevant language standards), an object-relational DBMS allows software developers to integrate their own types and the methods that apply to them into the DBMS. The goal of ORDBMS technology is to allow developers to raise the level of abstraction at which they view the problem domain.

Object-relational database management systems (ORDBMSs) add new object storage capabilities to the relational systems at the core of modern information systems. These new facilities integrate management of traditional fielded data, complex objects such as time-series and geospatial data, and diverse binary media such as audio, video, images, and applets. An applet is an application that has limited features, requires limited memory resources, and is usually portable between operating systems.
By encapsulating methods with data structures, an ORDBMS server can execute complex analytical and data manipulation operations to search and transform multimedia and other complex objects.

As an evolutionary technology, the object-relational (OR) approach has inherited the robust transaction- and performance-management features of its relational ancestor and the flexibility of its object-oriented cousin. Database designers can work with familiar tabular structures while assimilating new object-management possibilities.

Examples of object-relational databases include:
• DB2
• JDataStore
• Oracle
• Polyhedra
• PostgreSQL

What is Object Definition Language (ODL)?

Object-oriented and object-relational databases often use a query language called object query language (OQL) to manipulate and retrieve data. These databases also have an object definition language (ODL). ODL is used to define and manipulate the objects in the database. ODL must specify a description of the data that is stored in objects as well as a description of the operations that the object can provide. For example, an object could be defined as being a command button. Code could be written to manipulate the button in various ways, such as: raise the button, move its location, bring it into focus, enlarge it etc.

Representation of an object-oriented database

In the sample website below, the object-oriented database contains buttons and a map. When the user clicks on a particular area of the map, information on that area will appear. When the user clicks on a button, there is a link to another web page. When the user puts their mouse over a button, a description of the button appears.

Multidimensional
o Stores data in dimensions
o The number of dimensions varies
o Most have a time dimension
o Examples: D3, Oracle Express

The following shows the difference between the relational view of sales data and the multidimensional view of sales data.
Relational View

INVOICE Table
Number   Date      Customer   Amount
2034     15/5/96   Dartonik   $3500
2035     15/5/96   INC        $1800
2036     16/5/96   Dartonik   $2000
2037     16/5/96   INC        $800

LINE Table
Number   Product    Price   Quantity
2034     Mouse      $150    20
2034     Diskette   $50     10

Multidimensional View

                      Time Dimension
Customer Dimension    15/5/96   16/5/96   Totals
Dartonik              $3500     $2000     $5500
INC                   $1800     $800      $2600
Totals                $5300     $2800     $8100

Sales figures occur at the intersection of a customer row and a time column.

6.11. The advantages of databases

• Reduced data redundancy – most data items are stored in only one file, which greatly reduces duplicate data.
  o There is also an economic advantage in not duplicating data
• Data definition and documentation are standardized (e.g. the dates in the database are all reported in the same format)
• Improved data integrity – data modification is accomplished by changing only one file, reducing the probability of introducing inconsistencies
• Shared data
  o Data belongs to, and is shared by, the entire organization, usually over a network
  o Information supplied to managers is more valuable because it is based on a comprehensive collection of data instead of files which contain only the data needed for one application (total availability)
  o The integration of different business systems is greatly facilitated
  o Security settings are usually used to define who has access and at what level
• Easier record-keeping
• Easier and faster access to data
  o Non-technical users can access and maintain data if afforded the necessary privileges
  o As well as routine reports, it is possible to obtain ad hoc reports to meet particular requirements
• Reduced development/programming time (e.g. a programmer will take less time to create a payroll system)

6.12. The disadvantages of databases

• Require more memory, storage and processing power
• Data are more vulnerable than in file processing systems

6.13. Data Warehousing

The need for data analysis
Organizations tend to grow and prosper as they gain a better understanding of their environment. Typically, business managers must be able to track daily transactions to evaluate how the business is performing. By tapping into the operational database, management can develop strategies to meet organizational goals. In addition, data analysis can provide information about short-term tactical evaluations and strategies, such as: Are our sales promotions working? What market percentage are we controlling? Are we attracting new customers?

Tactical and strategic decisions are also shaped by constant pressure from external and internal forces, including globalization, the cultural and legal environment and, perhaps most important, technology. Given the many and varied competitive pressures, managers are always looking for competitive advantages through product development, service, marketing and so on.

Managers understand that their business climate is very dynamic, thus mandating their prompt reaction to change in order to remain competitive. In other words, the decision-making cycle time is reduced. In addition, the modern business climate requires managers to approach increasingly complex problems based on a rapidly growing number of internal and external variables. There is therefore growing interest in creating support systems dedicated to facilitating quick decision making in a complex environment.

Different managerial levels have different decision support needs. For example, transaction processing systems, based on operational databases, are tailored to serve the information needs of people who deal with short-term inventory, accounts payable or purchasing. Middle-level managers, general managers, vice-presidents and presidents focus on strategic and tactical decision making. Such managers require detailed information designed to help them make decisions in a complex data and analysis environment.
Data warehousing

Downloading does move data closer to the user and thereby increases its potential utility. Unfortunately, while one or two download sites can be managed without a problem, if every department wants to have its own source of downloaded data, the management problems become immense. Accordingly, organizations began to look for some means of providing a standardized service for moving data to the user and making them more useful. That service is called data warehousing.

What is a data warehouse?

A data warehouse (DW) is a huge database that stores and manages the data required to analyze historical and current transactions. A data warehouse contains a wide variety of data that present a coherent picture of business conditions at a single point in time. A data warehouse includes not only data but also tools, procedures, training, personnel and other resources that make access to the data easier and more relevant to decision makers.

The goal of the data warehouse is to increase the value of the organization's data asset. It typically has a user-friendly interface so users can easily interact with its data. It is designed to support management decision making. Through a data warehouse, managers and other users access transactions and summaries of transactions quickly and efficiently. The databases in a data warehouse usually are quite large. Development of a data warehouse includes development of systems to extract data from operational systems, plus installation of a warehouse database system that provides managers flexible access to the data.

Figure – A Data Warehouse (DW)

The role of the data warehouse is to store extracts from operational data and make them available to users in a useful format. The data can be extracts from databases and files, but can also be document images, recordings, photos and other non-scalar data. The source data could also be purchased from other organizations.
The data warehouse stores the extracted data and also combines it, aggregates [2] it, transforms it and makes it available to users via tools that are designed for analysis and decision making such as OLAP (see section “What is On-line analytical processing (OLAP)?” below).

[2] Aggregate: a collection of, or the total of, disparate elements.

A comparison of data warehouse and operational database characteristics

Integrated
  Operational database data: Similar data can have different representations or meanings.
  Data warehouse data: Provide a unified view of all data elements with a common definition and representation for all departments.

Subject-Oriented
  Operational database data: Data are stored with a functional or process orientation (for example, invoices, credits, debits etc).
  Data warehouse data: Data are stored with a subject orientation that facilitates multiple views for data and decision making (e.g. sales, products, sales by products etc.)

Time-Variant
  Operational database data: Data represent current transactions (e.g. the sales of a product on a given date).
  Data warehouse data: Data are historic in nature. A time dimension is added to facilitate data analysis and time comparisons.

Non-volatile
  Operational database data: Data updates and deletes are very common.
  Data warehouse data: Data cannot be changed. Data are only added periodically from operational systems. Once data are stored, no changes are allowed.

Evolution of the data warehouse

The origins of today’s data warehouses can be traced to the reporting systems that were popular in the 1980s. These reporting systems provided some basic answers to the end user’s questions, although the format wasn’t always the most appropriate. The reporting systems that formed the foundation of basic decision support required direct access to the operational data through a menu interface to yield predefined report structures. Typically, the reporting system was front-ended by a text-only presentation tool.
The next development stage produced a more sophisticated form of decision support by supplying lightly summarized data extracted from the operational database. Such lightly summarized data were usually stored in an RDBMS and were accessed through SQL statements via a query tool. The SQL-based query tool provided some predefined reports and, better yet, some ad hoc query capability. Unfortunately, to use the queries the end user had to know the details of the underlying data structure. The presentation tool was similar to the one used by the original reporting system, but it did provide additional customization options for ad hoc reports.

A variation on this theme of greater end user empowerment was the use of spreadsheets or statistical packages to analyze operational data. End users used their own desktop tools to access and manipulate data in order to support their decision making process. Primitive as they were by current standards, these reporting systems and their extensions gave IS departments the first major tools with which to solve decision support problems. Given advances in hardware and software in the late 1980s and early to mid-1990s, the explosion of available operational data, and the growing sophistication of decision support systems, data warehouse developments were almost inevitable.

What is a data mart?

Some organizations decide to limit the scope of the warehouse to more manageable chunks. A data mart is a smaller version of a data warehouse, containing a database that helps a specific group or department make decisions. Marketing and sales departments may have their own separate data marts. Individual groups or departments often extract data from the data warehouse to create their data marts.

Restricting a data mart to a particular type of data makes the management of the data warehouse simpler and probably means that an off-the-shelf DBMS product can be used to manage the data warehouse.
Metadata [3] is also simpler and easier to maintain. A data mart that is restricted to a particular business function, such as marketing analysis, may have many types of data and metadata to maintain, but all of those data serve the same type of users. Tools for managing the data warehouse and for providing data to the users can be written with an eye toward the requirements that marketing analysts are likely to have. A data mart that is restricted to a particular business unit or geographical area may have many types of input and many types of users, but the amount of data to be managed is less than for the entire company. There will also be fewer requests for service, so the data warehouse resources can be allocated to fewer users.

The following diagram summarizes the scope of alternatives for sharing data. Data downloading is the smallest and easiest alternative. Data are extracted from operational systems and delivered to particular users for specific purposes. The downloaded data are provided on a regular and recurring basis, so the structure of the application is fixed, the users are well trained, and problems such as timing and domain inconsistencies are unlikely to occur because users gain experience working with the same data. At the other extreme, a data warehouse provides extensive types of data and services for both recurring and ad hoc requests. Data marts fall in the middle. As we move from left to right, the alternatives become more powerful but also more expensive and difficult to create.
Figure - Continuum of Enterprise Data Sharing (from easier to more difficult: data downloading; data marts restricted to particular data inputs, particular business functions, or a particular business unit or geographical region; data warehouse)

Components of a data warehouse

• Data extraction tools
• Extracted data
• Metadata [4] of warehouse contents
• Warehouse DBMS(s) and OLAP (online analytical processing) servers
• Warehouse data management tools
• Data delivery programs
• End-user analysis tools
• User training courses and materials
• Warehouse consultants

[3], [4] Metadata: data about the data, such as field names, field types, validation rules etc.

The source of the warehouse is operational data, that is, data generated from routine transaction processing systems such as sales, registration of a student, payroll, banking deposits and withdrawals etc. The data warehouse therefore needs tools for extracting the data and storing them. These data, however, are not useful without metadata that describe the nature of the data, their origins, their format, limits on their use and other characteristics of the data that influence the way they can and should be used.

Potentially, the data warehouse contains billions of bytes of data in many different formats. Accordingly, it needs DBMS and OLAP servers of its own to store and process the data. In fact, several DBMS and OLAP products may be used, and the features and functions of these may be augmented by additional in-house developed software that reformats, aggregates [5], integrates and transfers data from one processor to another within the data warehouse. Programs may also be needed to store and process non-scalar data like graphics and animations.
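To make the idea of metadata more concrete, the following is a minimal sketch in Python of how "data about the data" (field names, field types, validation rules, origin) might be recorded and used to check an extracted record. It is an illustration only, not a description of any particular warehouse product: the class FieldMetadata, the dictionary SALES_METADATA, the function invalid_fields and the two sample fields are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class FieldMetadata:
    """Data about one warehouse field (all names here are hypothetical)."""
    name: str                    # field name
    field_type: type             # field type
    source: str                  # operational system the data came from
    rule: Callable[[Any], bool]  # validation rule
    description: str = ""        # nature of the data, limits on its use

    def validate(self, value: Any) -> bool:
        """Return True if value has the declared type and passes the rule."""
        return isinstance(value, self.field_type) and self.rule(value)


# Assumed metadata for two fields extracted from a sales system.
SALES_METADATA = {
    "customer": FieldMetadata("customer", str, "Sales",
                              lambda v: len(v) > 0, "Customer name as invoiced"),
    "amount": FieldMetadata("amount", int, "Sales",
                            lambda v: v >= 0, "Invoice total in dollars"),
}


def invalid_fields(record: dict) -> list:
    """Return the names of fields that are missing or fail validation."""
    return [name for name, meta in SALES_METADATA.items()
            if name not in record or not meta.validate(record[name])]
```

For example, invalid_fields({"customer": "INC"}) reports that the amount field is missing, which is the kind of check an extraction tool could run before loading a record into the warehouse.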
Because the purpose of the data warehouse is to make organizational data more available, the warehouse must include tools not only to deliver the data to the users but also to transform the data for analysis, query and reporting, and OLAP for user-specified aggregation and disaggregation.

The data warehouse provides an important but complicated set of resources and services. Hence the warehouse needs to include training courses, training materials, on-line help utilities and other similar training products to make it easy for users to take advantage of the warehouse resources. Finally, the data warehouse includes knowledgeable personnel who can serve as consultants.

[5] Aggregate: a collection of, or the total of, disparate elements.

What is On-line analytical processing (OLAP)?

OLAP refers to an advanced data analysis environment that supports decision making, business modelling, and operations research activities. OLAP systems share four major characteristics:
1. Use multidimensional data analysis techniques
2. Provide advanced database support
3. Provide easy-to-use end user interfaces
4. Support client/server architecture

The following shows the difference between the operational view of sales data and the multidimensional view of sales data.

Operational View

INVOICE Table
Number  Date     Customer  Amount
2034    15/5/96  Dartonik  $3500
2035    15/5/96  INC       $1800
2036    16/5/96  Dartonik  $2000
2037    16/5/96  INC       $800

LINE Table
Number  Product   Price  Quantity
2034    Mouse     $150   20
2034    Diskette  $50    10

Multidimensional View

                    Time Dimension
Customer Dimension  15/5/96  16/5/96  Totals
Dartonik            $3500    $2000    $5500
INC                 $1800    $800     $2600
Totals              $5300    $2800    $8100

Sales figures occur at the intersection of a customer row and a time column.

What is data mining?

Often, the database is distributed. Data warehouses often use a process called data mining to find patterns and relationships among data. For example:
A state government could mine through data to check if the number of births has a relationship to income level. Many e-commerce sites use data mining to determine customer preferences. Examples of data mining findings can be:
• 65% of customers who did not use their credit card in the last six months are 88% likely to cancel their account
• 82% of customers who bought a new TV 27” or larger are 90% likely to buy an entertainment center within the next four weeks
• If age < 30 and income <= 25000 and credit rating < 3 and credit amount > 25000 then the minimum loan term is 10 years.

User requirements for a data warehouse

The requirements for a data warehouse are different from the requirements for a traditional database application. For one, in a typical database application the structure of reports and queries is standardized. While the data in a report or query may vary from month to month, for instance, the structure of the report or query stays the same. Data warehouse users, on the other hand, often need to change the structure of queries and reports.

Another difference is that users want to do their own data aggregation [6]. For example, a user who wants to investigate the impact of different marketing campaigns may want to aggregate product sales according to package color at one time, according to marketing program at another time, and according to package color within marketing program at a third time. The analyst wants the same data in each report but simply presents it differently.

[6] Aggregate: to collect or total disparate elements.

Data warehouse users also want to disaggregate the data on their own terms, or drill down into their data. For example, a user may be presented with a screen that shows total product sales for a given year. The user may then want to be able to click on the data and have them explode into sales by month; to click again and have the data explode into sales by product by month or sales by region by product by month.
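The aggregation and drill-down requests described above can be sketched in a few lines of Python. The sketch is only an illustration, not how any particular OLAP product works: it reuses the four invoice rows from the operational view of sales data shown earlier, and the names INVOICES, FIELDS and aggregate are hypothetical.

```python
from collections import defaultdict

# Invoice rows from the operational view: (number, date, customer, amount in dollars).
INVOICES = [
    (2034, "15/5/96", "Dartonik", 3500),
    (2035, "15/5/96", "INC", 1800),
    (2036, "16/5/96", "Dartonik", 2000),
    (2037, "16/5/96", "INC", 800),
]

# Position of each named field within an invoice row.
FIELDS = {"number": 0, "date": 1, "customer": 2, "amount": 3}


def aggregate(rows, *dimensions):
    """Total the amount for each combination of the named dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[FIELDS[d]] for d in dimensions)
        totals[key] += row[FIELDS["amount"]]
    return dict(totals)


grand_total = aggregate(INVOICES)                # no dimensions: one overall total
by_customer = aggregate(INVOICES, "customer")    # aggregate by customer
by_date = aggregate(INVOICES, "date")            # same data, aggregated by date
by_customer_date = aggregate(INVOICES, "customer", "date")  # drill down further
```

The same rows feed every report; only the list of dimensions changes. Adding a dimension corresponds to drilling down, and removing one corresponds to aggregating to a higher level.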
Graphical output is another common requirement. Users want to see results of geographic data in geographic form. Sales by state and province should be shown on a map. A reshuffling of employees and offices should be shown on a diagram of office space. These requirements are more difficult because they vary from user to user and from task to task.

Many users of data warehouse facilities want to import warehouse data into domain-specific programs. For example, financial analysts want to import data into their spreadsheet models and into more sophisticated financial analysis programs. Portfolio managers want to import data into portfolio management programs, and oil drilling engineers want to import data into seismic analysis programs. All of this importing usually means that the warehouse data need to be formatted in specific ways.

Rules for defining a data warehouse

The following list is made up of 12 rules that define a data warehouse. This list was created by William H. Inmon and Chuck Kelley in 1994.
1. The data warehouse and operational environments are separated.
2. The data warehouse data are integrated.
3. The data warehouse contains historical data over a long time horizon.
4. The data warehouse data are snapshot data captured at a given point in time.
5. The data warehouse data are subject-oriented.
6. The data warehouse data are mainly read-only, with periodic batch updates from operational data. No online updates are allowed.
7. The data warehouse development life cycle differs from classical systems development. The data warehouse development is data driven; the classical approach is process driven.
8. The data warehouse contains data with several levels of detail: current detail data, old detail data, lightly summarized data, and highly summarized data.
9. The data warehouse environment is characterized by read-only transactions to very large data sets.
The operational environment is characterized by numerous update transactions to a few data entities at a time.
10. The data warehouse environment has a system that traces data sources, transformations and storage.
11. The data warehouse’s metadata [7] are a critical component of this environment. The metadata identify and define all data elements. The metadata provide the source, transformation, integration, storage, usage, relationships, and history of each data element.
12. The data warehouse contains a charge-back mechanism for resource usage that enforces optimal use of the data by end users.

The 12 rules capture the data warehouse life cycle, from its introduction as an entity separate from the operational data store, to its components, functionality, and management processes. The current generation of specialized decision support systems provides a comprehensive infrastructure to design, develop, implement and use decision support systems within an organization.

[7] Metadata: data about the data, such as field names, field types, validation rules etc.

7. Module 7 – Information Technology in the business office

7.1. Definition and purpose of office automation

Office automation (OA) refers to the use of information processing and communication technologies for writing, collecting, storing, organizing, retrieving, and communicating office data. It changes the way people work. It affects the job itself, the flow of information through an organization, the process of management and even the corporate structure.

Paperless office – email, electronic filing, file security, internet, intranet
Electronic conferencing
Software – word processing, spreadsheet, PowerPoint, databases, business systems, desktop publishing
Equipment
  Fax machines
  Computers, workstations
  Printers
  Scanners, cameras

7.2. Features of office automation

Facsimile (FAX) – device that transmits and receives documents over telephone lines.
Voice mail – communications technology that functions much like an answering machine, allowing callers to leave a voice message for one or more persons.

Voice messaging – the use of voice mail as an alternative to electronic mail. Voice messages are recorded intentionally, not because the recipient was unavailable.

Telemarketing – selling over the telephone.

Teleconferencing

Conferencing is where people meet and see and speak to each other. Teleconferencing is conferencing via telecommunication channels. The user is able to see and hear a person at the other end of the line in another location. Teleconferencing requires a camera, microphone, speakers and the appropriate communication software. Once this technology catches on, it is expected to be very popular and a huge money maker. Teleconferencing minimizes the time and cost spent traveling (no hotel fees, jet lag etc.). The world becomes a smaller place (global marketplace). You can have business meetings with all the desired people when and where you need to.

Telecommuting

Commuting is traveling from one location to another, such as from home to work. Telecommuting is where a person does not have to travel to work but works from home and connects to the office via telecommunication channels (i.e. commuting to work via telecommunication channels).

Advantages
• Less traffic on the roads, less pollution etc.
• Reduction in expenses such as work clothes, gas etc.
• Reduction in the need for parking spaces and offices for staff

Disadvantages
• Persons who are not disciplined enough will not be productive, as they will talk on the phone, watch TV, eat etc. instead of working.
• Anti-social behaviour can result, as there is limited social interaction.
• Telecommuting is not possible with all jobs.

Electronic fund transfer

This system permits the movement of money by means of electronic signals relayed between computers via such means as telephone lines and radio waves. Designed primarily to reduce banking costs by decreasing paperwork, EFT virtually eliminates the use of cash, checks, and conventional credit cards. In such a system, salaries, social security payments, and other income are credited directly to a user's account. Payments of utility bills, rent, or home mortgage loans are likewise made directly, with the amount of the outlays deducted from the balance of the account.

E-commerce

E-commerce (electronic commerce) is the conducting of business online, including shopping, banking and investing. It refers to business over the Internet. Web sites such as Amazon.com, Outpost.com, and eBay are all e-commerce sites. The two major forms of e-commerce are Business-to-Consumer (B2C) and Business-to-Business (B2B). While companies like Amazon.com cater mostly to consumers, other companies provide goods and services exclusively to other businesses. The terms "e-business" and "e-tailing" are often used synonymously with e-commerce. They refer to the same idea; they are just used to confuse people trying to learn computer terms.

Electronic mail

E-mail, short for electronic mail, is the transmission of messages over communications networks. The messages can be notes entered from the keyboard or electronic files stored on disk. Most mainframes, minicomputers, and computer networks have an e-mail system. Some electronic-mail systems are confined to a single computer system or network, but others have gateways to other computer systems, enabling users to send electronic mail anywhere in the world. Companies that are fully computerized make extensive use of e-mail because it is fast, flexible, and reliable.
Most e-mail systems include a rudimentary text editor for composing messages, but many allow you to edit your messages using any editor you want. You then send the message to the recipient by specifying the recipient's address. You can also send the same message to several users at once. This is called broadcasting.

Sent messages are stored in electronic mailboxes until the recipient fetches them. To see if you have any mail, you may have to check your electronic mailbox periodically, although many systems alert you when mail is received. After reading your mail, you can store it in a text file, forward it to other users, or delete it. Copies of memos can be printed if you want a paper copy.

All online services and Internet Service Providers (ISPs) offer e-mail, and most also support gateways so that you can exchange mail with users of other systems. Usually, it takes only a few seconds or minutes for mail to arrive at its destination. E-mail is a particularly effective way to communicate with a group because you can broadcast a message or document to everyone in the group at once.

Internet

The Internet is a large, international network of computers that use the TCP/IP protocols, linking millions of users around the world. It is used daily by many individuals, mainly for sending and receiving electronic mail (email), obtaining information on almost any subject, and communicating with others around the world. Access to the Internet is obtained by subscription, and an Internet address is needed to receive or to send a message. Such addresses have a specific format that specifies the name of the user, the machine they are working on, and where that machine is located.

Advantages
• Better communication – email, chat rooms etc.
• Easier, faster access to information
• Less travelling – e.g. to a library, store
• More convenient – e.g.
shopping, paying bills

Disadvantages
• Exposure of children to pornography, paedophiles, harmful information
• Can be addictive for some persons
• Persons unable to socialize

7.3. Application of computers in various fields

Hotel

To make bookings for rooms
To bill guests
To prepare financial statements
To pay hotel staff
To do banking transactions (e.g. customer pays by credit card)

Advantages
• Able to handle customers better – can check if rooms are available etc.
• Able to produce bills faster, therefore check out is better.

Education

Timetabling
Student records - grades
Transcripts

Schools at all levels recognize the importance of training students to use computers effectively. Students can no longer rely solely on their textbooks for information. They must also learn to do their research on the World Wide Web. Efforts are underway to connect schools to the Internet, but schools must be able to afford the equipment, the connection charges, and the cost of training teachers.

Libraries that traditionally contained only books and other printed material now have PCs to allow their patrons to go online. Some libraries are transferring their printed information into databases. Rare and antique books are being photographed page by page and put onto CD-ROMs.

Advantages

Computers have proved to be valuable educational tools. Computer-assisted instruction, or CAI, uses computerized lessons that range from simple drills and practice sessions to complex interactive tutorials. These programs have become essential teaching tools in medical schools and military training centers, where the topics are complex and the cost of human teachers is extremely high. Educational aids, such as some encyclopedias and other major reference works, are available to personal-computer users, either on magnetic disks or optical discs or through various telecommunication networks such as the Internet.
Banking

Account balances - deposit, withdrawals, transfers
Calculate interest, withholding tax, loans
Bank charges - credit card etc.
ATM, ABM

Advantages
• A more sophisticated form of electronic banking that may eventually become the standard means of conducting financial transactions is electronic funds transfer (EFT). This system permits the movement of money by means of electronic signals relayed between computers via such means as telephone lines and radio waves. Designed primarily to reduce banking costs by decreasing paperwork, EFT virtually eliminates the use of cash, checks, and conventional credit cards. In such a system, salaries, social security payments, and other income are credited directly to a user's account. Payments of utility bills, rent, or home mortgage loans are likewise made directly, with the amount of the outlays deducted from the balance of the account.
  Another EFT feature is the extensive use of remote point-of-sale terminals linking stores and banks, which allows purchases to be charged against bank accounts (MULTILINK). An EFT transaction of this kind requires the use of a debit card similar to the one employed with automated tellers. The card is coded with information that identifies the bank and account number of the cardholder. The store clerk inserts the customer's card into an EFT terminal and enters the price of the item. The terminal, equipped with either a magnetic stripe reader or laser scanner, reads the encoded information and contacts the customer's bank, whose computer checks the appropriate account, compares the balance with the amount of the funds requested, and then sends back approval to the store. The funds are transferred electronically to the merchant's bank and credited to the merchant's account. The entire transaction is accomplished within minutes regardless of the geographical distances between the point of sale and the banks involved.
• This eliminates the need for cash.
• Balances are updated faster.
• ATMs – no need to go into the bank to do a transaction.

Disadvantages
• ATMs have been burglarized.
• Computer crime allows people to steal money.

Home

Advantages
• Can reorder groceries on the internet
• Home security systems
• Program appliances, lights etc.
• Shopping from home
• Research from home
• School from home
• Most recent-model cars are equipped with computerized ignition and fuel systems designed to increase fuel economy and performance. Japanese engineers also have developed a car with an on-board computer that the driver can use to plan a route. The driver simply enters the intended destination into the computer, which transmits the information to special roadside computer-sensor units. These units measure and analyze the traffic flow on all possible routes to the desired destination and recommend the one with the least amount of traffic.

Disadvantages
• Encourages laziness – not walking in the stores.
• Parents are often concerned about what their children can access on the Internet. They can install special programs called Web filters that automatically block access to Web sites that may be unsuitable for children.
• Some people are concerned that online shopping will put many physical retail stores out of business, to the detriment of personal one-on-one service.
• Criminals can log into the Internet just like everyone else, and they can commit crimes against other people who are also logged in. These criminals may give out false information to encourage others to send them money. They may also be predators who use the anonymity afforded by chat rooms and discussion groups to meet people under false pretences.
• Too little social contact with humans

Supermarkets

Advantages
• Allow faster processing at the cash register – point of sale system (by reading the UPC – universal product code/bar code)
• No need for the cashier to remember the prices, which reduces errors made by the cashier
• Calculates GCT and change instantly
• Instant update of stock balances
• Debit and credit cards remove the need to handle cash, which can be stolen

Disadvantages
• Automated systems, however, do have certain limitations and drawbacks. Although usually very reliable, they can malfunction. Moreover, an entire system may fail to operate properly if there is a single error in setting it up. A backup system has to be provided or a human "override" capability built into the system so that operations can be handled manually (e.g. what happens when the bar code reader malfunctions).
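The supermarket point-of-sale steps listed above (read the bar code, look up the price, calculate GCT and change, update the stock balance) can be sketched as follows. This is a simplified illustration under stated assumptions, not a real point-of-sale system: the GCT rate, the bar codes, the stock file STOCK and the function checkout are all hypothetical, and the sketch does not check whether an item is out of stock.

```python
from decimal import Decimal, ROUND_HALF_UP

GCT_RATE = Decimal("0.15")  # assumed rate, for illustration only

# Hypothetical stock file keyed by UPC (bar code): price and quantity on hand.
STOCK = {
    "012345678905": {"name": "Mouse", "price": Decimal("150.00"), "on_hand": 20},
    "036000291452": {"name": "Diskette", "price": Decimal("50.00"), "on_hand": 10},
}


def checkout(scanned_upcs, cash_tendered, stock=STOCK):
    """Price the scanned items, add GCT, compute change and update stock."""
    unknown = [upc for upc in scanned_upcs if upc not in stock]
    if unknown:
        # An unreadable or unknown bar code has to be handled manually.
        raise KeyError(f"unrecognised bar codes: {unknown}")

    # The price comes from the stock file, not from the cashier's memory.
    subtotal = sum((stock[upc]["price"] for upc in scanned_upcs), Decimal("0"))
    gct = (subtotal * GCT_RATE).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    total = subtotal + gct

    cash = Decimal(str(cash_tendered))
    if cash < total:
        raise ValueError("cash tendered is less than the total due")

    for upc in scanned_upcs:
        stock[upc]["on_hand"] -= 1  # instant update of stock balances

    return {"subtotal": subtotal, "gct": gct, "total": total, "change": cash - total}
```

Amounts are held as Decimal values rather than floating-point numbers so that tax and change are not affected by binary rounding errors.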