SYSTEMS DEVELOPMENT LIFE CYCLE
The sequence of activities involved in the development of new systems. It involves a series of steps in
which output from a preceding stage serves as input to the next. In most organizations, different
application systems may be needed by different users. Systems are resources that need to be managed,
that is, they will be developed according to their need, value and the availability of specialists to work on
them. Not all development projects take place at the same time; some activities are prioritized over others. A
new system may be required if an existing system fails to meet, or only partially satisfies, an organization’s
objectives. There are two main views of systems development: the managerial view and the technical
view.
REQUIREMENTS ANALYSIS: The collection of data on the end user needs of a proposed or
intended system. It is the most important stage of the cycle. If user needs are not fully gathered and
understood by a developer, then the intended system will fail to satisfy users and may be rejected.
DESIGN: Designs are made based on details from the specification document. All internal and external
aspects of a product are laid down. Input, output, processing and control procedures, file sizes, database
specifications are also described and designed. The system design is broken down into modules
(independent pieces of the system) with well-defined interfaces to the rest of the product. A designer
must specify what a module does and how it works. Flaws identified in a design are corrected through
iteration, repeating parts of the analysis and design work in order to improve the design.
BUILD PHASE (Physical Construction)
The process of translating the detailed design into the real product. While smaller components or modules
are being built or assembled, they must be tested. Next, the modules are combined or integrated into a
functional whole.
NOTE: The manner of module integration (one at a time or all at once) and the specific order (top to
bottom or bottom to top) have a critical influence on the quality of the resulting product.
BOTTOM UP: Major design faults may not be detected early and may necessitate an expensive
repetition of work.
TEST PHASE
This is to determine whether the product as a whole functions correctly. Various types of test exist:
Integration: this is to check that modules combine correctly to achieve a product that satisfies
specifications. Here care is taken to check module interfaces.
Functionality: the product is checked to see if specifications have been correctly implemented.
Correctness: constraints listed in a specification document must be tested.
Robustness: erroneous data is input to determine whether the product’s error handling capabilities are adequate or
whether it will crash.
Acceptance: the system is delivered to the client and tested using actual data. If a new product passes its
acceptance test, the task of developers is complete.
IMPLEMENTATION (Conversion or Change over)
After a new system has been built and tested, it should be implemented. Implementation is the process
of replacing an existing system with a newly built one. The method depends on a number of factors such
as the complexity of the system, cost, time frame, readiness of users etc. There are four main
techniques or strategies:
PARALLEL: Both old and new systems are operated alongside each other until all users are convinced
that the new system works correctly; then the old one is terminated. It is the most costly changeover method but
involves lower risks.
PILOT: Only a selected group of people are allowed to use the new system. It is less costly compared to
the parallel but increases the period an organization takes to attain total changeover.
PHASED: This involves introducing the new system in modules or subsystems rather than as an entire system. The
risk of error or failure is limited to the module being used.
DIRECT CUTOVER: An organization moves straight onto a new system as soon as it is fixed and discards
the old. It is a cheaper technique but highly risky because if it is faulty or not well understood, this may
disrupt operations and can end up being costly to an organization.
INFORMATION SYSTEMS
LEVELS OF INFORMATION SYSTEMS
OPERATIONAL (Ground) LEVEL: Transaction Processing Systems, Office Automation Systems,
Knowledge based Systems
TACTICAL (Middle) LEVEL: Management Information Systems, Decision Support Systems, Expert
Systems and Artificial Intelligence
STRATEGIC (Top) LEVEL: Group Decision Support Systems, Executive Systems
TRANSACTION PROCESSING SYSTEMS
Transactions are the basic business activities that take place daily within a firm e.g. clock or time cards,
received purchase orders, issuing customer receipts, payroll checks, etc.
A TPS is a system that processes the detailed data necessary to update records on the fundamental
business operations of an organization.
A TPS may be manual or automated. The automation of systems is to provide a competitive edge. A TPS
also provides insight into the structure of a firm because it includes all the different problem solving
activities employees engage in.
CHARACTERISTICS OF TPSs
1. They support the work processes of a large number of people, so the loss of one system has a negative impact
on a firm.
2. Their use has a high potential for security related problems.
3. Require regular auditing to ensure input data, processing and output are accurate.
TRANSACTION PROCESSING ACTIVITIES
The cycle of events that a business goes through, from data capture to document production:
DATA COLLECTION: the process of capturing all data necessary to complete a transaction
DATA EDITING: checking data for validity and completeness
DATA CORRECTION: re-entering mis-keyed or mis-scanned data that was found to be in error during editing
DATA MANIPULATION: performing calculations and other transformations related to a business
transaction
DATA STORAGE: updating one or more databases with new transactions
DOCUMENT PRODUCTION: the process of generating output
Example: order entry
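The six transaction processing activities can be sketched for an order entry case as a small pipeline. This is only an illustrative sketch: the order fields (item, qty, unit_price), the function names and the in-memory list standing in for a database are all assumptions, not part of any particular TPS.

```python
# Hypothetical order-entry pipeline; field names and rules are assumptions.
database = []  # stands in for the stored transaction records


def edit(order):
    """DATA EDITING: return a list of validity/completeness problems."""
    problems = []
    for field in ("item", "qty", "unit_price"):
        if field not in order:
            problems.append(f"missing {field}")
    if "qty" in order and order["qty"] <= 0:
        problems.append("qty must be positive")
    return problems


def process(order, corrections=None):
    # DATA COLLECTION: `order` is the captured transaction data.
    problems = edit(order)
    if problems and corrections:
        # DATA CORRECTION: re-enter the fields found to be in error.
        order = {**order, **corrections}
        problems = edit(order)
    if problems:
        raise ValueError("; ".join(problems))
    # DATA MANIPULATION: calculate the order total.
    record = {**order, "total": order["qty"] * order["unit_price"]}
    # DATA STORAGE: update the database with the new transaction.
    database.append(record)
    # DOCUMENT PRODUCTION: generate output, here a one-line receipt.
    return f"{record['qty']} x {record['item']} = {record['total']:.2f}"


receipt = process({"item": "pen", "qty": -3, "unit_price": 1.5},
                  corrections={"qty": 3})
print(receipt)
```

In a real system the correction step would involve a person re-keying the data; here it is reduced to a dictionary of corrected fields to keep the sketch short.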
TPSs are the backbone of every organization’s information system. Most organizations would grind to a
halt if their transaction systems failed. To ensure the operational integrity of such systems, key actions
are taken. These include:
1. The development of emergency alternate procedures: in case the system becomes unusable for
some time.
2. Disaster recovery: to keep the system running until normal operations can be re-established.
3. Periodic system audits: to assess whether procedures and controls are being properly used.
MANAGEMENT INFORMATION SYSTEMS
In every organization, every functional arm of business has its own TPS for gathering and using data. In a
sense, each system stands alone since it belongs to a particular department. Since all such systems are
part of the firm they need to be integrated otherwise a firm will end up with a collection of disjointed
and ineffective systems. A way to integrate and unify various systems is through a shared database.
The primary purpose of an MIS is to provide managers with insight into the regular activities of an
organization so that they can manage efficiently. In short, it provides information and support for
effective decision making, together with feedback on daily operations.
An MIS is an integrated system for providing information to support the planning, control and
operations of an organization. It involves people, procedures, equipment, models and data. It is
organized to take data from internal (transactions) and external (competitors, legislation, economy) sources to
develop information useful to management.
CHARACTERISTICS OF MISs
Provide various categories of reports either soft or hard copies
INPUT: The various TPSs are the most important sources of data
Example: sales figures grouped by
Week
Year
Product
Sales rep
These inputs are processed in such a way that they become useful to managers in the form of
predetermined reports.
OUTPUT: A collection of varied reports
Scheduled: a report produced periodically or on schedule e.g. daily, weekly, monthly
Key indicator: a scheduled report that summarizes the previous day’s critical activities linked to critical success factors (CSFs)
Demand : a report developed to give certain information at a manager’s request e.g. inventory levels,
total sales for a particular product in a year, hours worked by a particular employee
Exception: reports automatically produced under unusual situations that require management action
Drill down: reports that provide increasingly detailed data about a situation
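The report types above can be illustrated with a short sketch. The sales records, the names and the threshold of 200 are made-up assumptions; the point is only that a summary report, a drill down report and an exception report are different groupings and filters over the same TPS data.

```python
from collections import defaultdict

# Hypothetical sales records as they might arrive from a TPS.
sales = [
    {"week": 1, "product": "A", "rep": "Ama", "amount": 500},
    {"week": 1, "product": "B", "rep": "Kofi", "amount": 120},
    {"week": 2, "product": "A", "rep": "Ama", "amount": 450},
    {"week": 2, "product": "B", "rep": "Kofi", "amount": 60},
]


def totals_by(records, *keys):
    """Sum amounts grouped by the given keys (more keys = more detail)."""
    totals = defaultdict(int)
    for r in records:
        totals[tuple(r[k] for k in keys)] += r["amount"]
    return dict(totals)


def exception_report(records, key, threshold):
    """List only the groups whose total falls below the threshold."""
    return {k: v for k, v in totals_by(records, key).items() if v < threshold}


scheduled = totals_by(sales, "week")               # weekly summary
drill_down = totals_by(sales, "week", "product")   # more detail per week
exceptions = exception_report(sales, "product", 200)
```

A demand report would simply be one of these functions called at a manager's request rather than on a schedule.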
DIAGRAM: A typical MIS
DECISION SUPPORT SYSTEMS
While TPSs and MISs help to solve structured and semi-structured problems, DSSs and ESSs assist in
solving semi-structured and unstructured problems. In any organization, effective decision making is
needed to attain objectives. Decision making activities are influenced by the objectives as well as the
long term (strategic) plans of a firm. When managers solve problems, goals are achieved. Problem
solving begins with decision making. A DSS supports individual decision making styles and techniques.
MODELS OF DECISION MAKING
The decision making phase of the problem solving process has been explained using two models:
1. HERBERT SIMON MODEL: Decision making is made up of 3 stages: intelligence, design and
choice
2. GEORGE HUBER MODEL: Problem solving is made up of 5 stages: intelligence, design, choice,
implementation and monitoring
INTELLIGENCE: potential problems or opportunities are identified and defined. Constraints on the
possible solution and the problem environment are studied
DESIGN: alternative solutions to the problem are developed. The feasibility and implications of
alternatives are evaluated
CHOICE: a course of action is selected. This may not always be clear-cut and simple. Several
factors influence choice.
IMPLEMENTATION: action is taken to put the solution into effect.
MONITORING: the decision maker evaluates the implemented solution to determine whether
anticipated results were achieved and to modify the process in light of new information obtained.
FACTORS INFLUENCING PROBLEM SOLVING
Multiple Decision Objectives: Some organizations have multiple objectives, not just increasing profit and
reducing cost. Their problem solving objectives are complex and difficult to solve.
Increased Alternatives: Alternatives to consider for problem solving keep changing and increasing
Increased Competition: Competition involves two or more businesses vying to reach similar goals
through similar customer groups. Increased competition makes it very difficult for a business to meet
defined goals
The need for creativity: creativity is the ability to originate or generate new ideas or approaches to add
value to products and services and to offer a response to exploit an opportunity or solve a problem. This
can differentiate a company from its competitors
Social and Political action: at all levels these have a profound impact on problem solving.
TYPES OF PROBLEMS
STRUCTURED: Problems that are straight forward requiring known facts and relationships.
UNSTRUCTURED: Problems that are not routine and are complex, involving data of various formats with
unclear data relationships.
APPROACHES TO COMPUTERIZED DECISION SUPPORT SYSTEMS
In general, computerized DSSs can either optimize or satisfice.
OPTIMIZATION MODEL: An approach that will find the best solution to a problem, e.g. to find the
number of products to produce to meet a profit goal. Such models utilize problem constraints, e.g. a limit
on the number of working hours in a manufacturing plant is a problem constraint. Some spreadsheets
like Excel have optimizing features.
SATISFICING MODEL: An approach that will find a good but not necessarily the best solution. This is used
when modeling a problem to get the optimal decision is too difficult, complex or costly, e.g. a decision to
locate a new plant in Africa. The optimal approach will consider every country but the satisficing approach will
consider only five countries. This may not result in the best decision but will result in a good decision
that will not take much time.
HEURISTICS: Commonly accepted guidelines or procedures that usually find a good solution are used in
decision making.
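The difference between optimizing and satisficing can be sketched with the plant location example. The country scores, the "good enough" level of 75 and the limit of three options are invented for illustration; a real model would score locations on many criteria.

```python
# Hypothetical plant-location scores; names and numbers are made up.
candidates = {"Ghana": 72, "Kenya": 80, "Nigeria": 77, "Egypt": 91, "Senegal": 65}


def optimize(options):
    """Examine every option and return the best one."""
    return max(options, key=options.get)


def satisfice(options, good_enough, limit):
    """Examine at most `limit` options and stop at the first acceptable one."""
    for name in list(options)[:limit]:
        if options[name] >= good_enough:
            return name
    return None


best = optimize(candidates)
acceptable = satisfice(candidates, good_enough=75, limit=3)
```

The satisficing search stops early with a good answer, while the optimizing search pays the cost of examining every option to find the best one.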
TYPES OF DECISIONS
PROGRAMMED DECISIONS: These are made using a rule, procedure or quantitative method e.g. to order
inventory when levels drop below 100 units is to adhere to a rule. This is typical of most TPSs and MISs.
NON PROGRAMMED DECISIONS: Decisions made on unusual or exceptional situations and such
decisions are difficult to quantify, e.g. determining the appropriate training program for a new employee
or deciding whether to start a new type of product line
OVERVIEW OF DECISION SUPPORT SYSTEMS
A decision support system is an organized collection of people, procedures, software, databases and devices used to
support problem specific decision making.
A DSS although skewed towards top management, can be used at all levels because managers at all
levels are faced with less structured non routine problems. Since most businesses are saddled with a
bureaucracy of complex rules, procedures and decisions, DSSs are used to bring more structure to these
problems to aid the decision making process. The DSS approach realizes that people, not machines make
decisions.
CHARACTERISTICS OF DSSs
Handle large amounts of data from different sources
Provide report and presentation flexibility
Support drill down analysis e.g. cost of an entire project, cost per phase, cost per activity.
Perform “what if” analysis, that is, the process of making hypothetical changes to problem data and
observing the impact on the results.
Perform simulation analysis, that is, the ability of a DSS to act like or duplicate the features of a real
system. There is a level of uncertainty or probability involved, e.g. the number of people arriving at a
restaurant at night varies. A DSS can be used to simulate nightly arrival rates over a six-month
period to determine the best type and number of employees needed.
Perform goal seeking analysis, which is the process of determining the problem data required for a given
result, e.g. a manager wants a return of 9% on an investment. Goal seeking allows the manager to
determine what monthly net income (problem data) is needed to have a return of 9% (problem result).
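"What if" analysis and goal seeking can be sketched as two directions through the same model. The model below (yearly return = 12 x monthly net income / investment), the investment figure and the function names are assumptions chosen only to match the 9% example; a real DSS would use the organization's own model.

```python
def annual_return(monthly_net_income, investment):
    """Problem result: yearly return as a fraction of the investment."""
    return 12 * monthly_net_income / investment


def goal_seek(target, investment, low=0.0, high=1_000_000.0, tol=1e-9):
    """Bisection search for the monthly income that yields the target return."""
    while high - low > tol:
        mid = (low + high) / 2
        if annual_return(mid, investment) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


investment = 200_000  # hypothetical figure
# "What if" analysis: change the problem data and observe the result.
what_if = {income: annual_return(income, investment) for income in (1000, 1500, 2000)}
# Goal seeking: which monthly net income gives a 9% return?
needed = goal_seek(0.09, investment)
```

Goal seeking works backwards from the desired result to the input, which is why it needs a search even though the forward model is a one-line formula.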
GROUP DECISION SUPPORT SYSTEMS
This is a system that consists of most of the elements in a DSS plus software to support group decision
making. They are also called computerized collaborative work systems.
EXECUTIVE SUPPORT SYSTEMS
A specialized DSS that includes all hardware, software, data, procedures used to assist senior level
executives (answerable to stockholders) within an organization.
ARTIFICIAL INTELLIGENCE
It is a field of science and technology based on several disciplines (computer science, psychology,
mathematics, engineering, etc.). The goal of artificial intelligence is to develop computers that can think,
see, hear, walk, talk and feel. AI is an attempt to duplicate human capabilities in computer systems. The
three major application areas of AI are:
COGNITIVE SCIENCE applications: Expert Systems, Learning Systems, Neural Networks
ROBOTICS applications: Visual Perception, Locomotion, Navigation
NATURAL INTERFACE applications: Speech Recognition, Virtual Reality, Multisensory Interface
COGNITIVE SCIENCE: Focuses on researching how the human brain works and how humans think and
learn.
ROBOTICS: Applications developed to give robots the powers of sight (visual perception), touch and
other human-like capabilities.
NATURAL INTERFACE: The development of computers that can be used as if they were human beings, e.g.
being able to talk to machines in natural language and have them understand.
EXPERT SYSTEMS
These are the most practical examples of artificial intelligence
Expert System = major components of a computer information system + knowledge base
An expert system is a knowledge based information system that uses knowledge about a specific,
complex problem area to act as a consultant to users.
CHARACTERISTICS OF EXPERT SYSTEMS
They provide answers to specific problem questions by making human-like inferences from their
knowledge base.
They have the ability to explain their reasoning process and conclusions to a user, which is a form of user
support.
They provide answers to a user’s question in an interactive process.
COMPONENTS
METHODS OF MAKING INFERENCE USING EXPERT SYSTEMS
There are four main techniques
CASE BASED REASONING: Knowledge represented in the form of past or historic occurrences and
experiences
FRAME BASED: Knowledge represented in the form of a hierarchy or network of frames. A frame is a
collection of knowledge about an entity consisting of a complex package of attributes and corresponding
data values and items.
OBJECT BASED: Knowledge represented as a link of objects.
An object = data element + methods or processes that act on the data
RULE BASED: Knowledge represented in the form of statements and rules based on facts. These take the
form of a premise and a conclusion. E.g.
IF (condition) = she is a systems analyst
THEN (conclusion) = she must have a computer
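Rule based inference of this IF/THEN kind can be sketched as forward chaining over a small knowledge base. The first rule is the example above; the second rule and all names are hypothetical additions used only to show one conclusion feeding another.

```python
# Each rule: (set of premises, conclusion). Rules and facts are illustrative.
rules = [
    ({"is a systems analyst"}, "has a computer"),
    ({"has a computer"}, "needs software"),  # hypothetical second rule
]


def infer(facts, rules):
    """Forward chaining: fire rules until no new conclusion is added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


conclusions = infer({"is a systems analyst"}, rules)
```

Because each fired rule is known, such a system can also list the rules it used, which is the basis of the explanation facility described under the characteristics of expert systems.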
BENEFITS OF EXPERT SYSTEMS
They are fast and efficient machines that contain the expertise of several people, so they can outperform a
single human mind.
They preserve and reproduce the knowledge of experts, which can be shared by making
multiple copies of the software.
If an expert decides to leave, multiple copies of his or her ideas can be used in training novices throughout an
organization.
THE HIERARCHY OF DATA
Data is generally organized in a hierarchy that begins with the smallest piece and progresses to the
highest level, called the database.
BIT: The smallest unit of data is the bit, which represents either an on or off circuit. A bit represents the
actual state of data storage on primary media.
BYTE: A byte is a basic grouping of bits; 8 bits make up one byte. A byte typically represents one
character. Characters are the basic building blocks of information, that is, from a user’s point of view,
a character is the most basic element of data that can be observed and manipulated, e.g. digits (0, 1, 2, ...), letters
(a, b, c) or symbols (-, :).
FIELD: A field is a combination of characters that describes an aspect of an object or activity. Fields are
also called attributes, that is, characteristics or qualities of entities (objects, persons, places or events), e.g. an
object name or an employee salary. The specific value of an attribute is called a data item.
RECORD: a record is a collection of related fields, that is, a collection of the different aspects of an object
created to achieve a more complex description. Using employee record as an example, a record = a
collection of fields about one (single) employee. Such a record will have different fields (name, address,
phone no., pay rate)
FILE: a collection of related records for an object is called a file. In a company, an employee file = a
collection of all employee records within the company. An inventory file is a collection of inventory records
for a particular company.
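The hierarchy from bit up to file can be made concrete with a short sketch. The employee names and values are invented, and a Python list stands in for a file; the field names follow the employee record example (name, address, phone no., pay rate).

```python
from dataclasses import dataclass


@dataclass
class EmployeeRecord:          # RECORD: a collection of related fields
    name: str                  # FIELD (attribute)
    address: str
    phone_no: str
    pay_rate: float


# FILE: a collection of related records (hypothetical employees).
employee_file = [
    EmployeeRecord("Ama Mensah", "Accra", "0200000000", 25.0),
    EmployeeRecord("Kofi Boateng", "Kumasi", "0240000000", 30.0),
]

first_char = employee_file[0].name[0]        # CHARACTER
one_byte = first_char.encode("ascii")        # BYTE: one byte for this character
bits = format(one_byte[0], "08b")            # BITS: the eight on/off states
```

Here the single character "A" is stored in one byte, and that byte is the eight bits 01000001.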
Any ordered set of accessible records is called a file. In computing terms, it refers to a number of records
that must be retained over a number of operational cycles
Reference File: an example is a product price file. Reference is made to this to arrive at the total invoice value
for a particular customer.
Dynamic File: Used to record constantly changing transactions as they occur e.g. ledger files
File storage in a computer system depends on 3 factors:
1. The speed with which data can be retrieved from the storage medium onto memory
2. Medium design, that is the manner in which records can be arranged for access on a particular
medium
3. The volume of information that the medium can conveniently hold
FILE ORGANIZATION
This refers to the relationship of the key of a record to the physical location of that record in the file
SERIAL FILE ORGANIZATION
In a serial file, records are simply stored as they occur; records are not placed in any predefined order or
storage condition. Since there is no way of predicting where any particular record will be stored,
information can only be accessed by a SEARCH.
SEQUENTIAL FILE ORGANIZATION
This is the simplest form of ordered files. Records are placed in ascending or descending order of a
numerical or alphabetic key or a combination of both. To set up such files records must first of all be
sorted. The main advantage is that different master files ordered on the same key field can be merged.
These are frequently used in batch processing environments and in online query systems. The major
disadvantage is that transaction files have to be sorted in the same sequence as the master file
before updating.
INDEXED FILE ORGANIZATION
A file index is made up of the key values of records together with additional information on where each
complete record can be found. Indexed files need not be stored in order because the index aids
retrieval, which is a two-step process: first search the index, then fetch the record.
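The contrast between a serial search and the two-step indexed lookup can be sketched as follows. The records are invented, a list stands in for the file and a dictionary stands in for the index; real file systems store the index on disk, which this sketch does not attempt to model.

```python
# The "file": records stored in arrival order, not sorted by key.
records = [
    {"id": 307, "name": "Kofi"},
    {"id": 101, "name": "Ama"},
    {"id": 254, "name": "Esi"},
]

# The index: key value -> position of the complete record in the file.
index = {rec["id"]: pos for pos, rec in enumerate(records)}


def serial_search(key):
    """Serial organization: scan every record until the key is found."""
    for rec in records:
        if rec["id"] == key:
            return rec
    return None


def indexed_lookup(key):
    """Indexed organization: step 1 search the index, step 2 fetch the record."""
    pos = index.get(key)
    return None if pos is None else records[pos]
```

Both functions return the same record; the indexed version avoids reading every record at the cost of keeping the index up to date.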
INDEXED SEQUENTIAL FILE ORGANIZATION
DIRECT (RANDOM) FILE ORGANIZATION
DATABASE
A collection of integrated and related files is called a database. A database is organized information,
designed to meet the needs of a specific user or particular business environment. A database can be
manually stored in cabinets but this greatly reduces the safety and accuracy of information. A better
management procedure is to automate the storage of large databases to reduce data
inconsistencies, data isolation, data duplication and security problems.
APPROACHES TO DATA MANAGEMENT
MANUAL APPROACH
Data is stored in paper or cardboard folders, which are kept in non-computerized systems called filing cabinets.
Here the filing cabinets are files that contain records stored in paper folders. These records do not show
any interrelationships.
This method requires an individual to manually bridge the gap between files and records.
FILE BASED APPROACH (STRUCTURED PROGRAM)
A file is a collection of related records. With this approach data is stored in an electronic system. It is
the method in which separate data files are created and stored for each application program, i.e., each
and every application program has its own store of data to work on.
If new data is captured, then new programs must be written. This leads to the creation of large volumes
of data with many disadvantages, e.g.
INFLEXIBLE DATA: Since data cannot be shared, it is not possible to supervise concurrent access between
different or unrelated programs.
DATA DEPENDENCE: The same data is organized differently for each separate program, making that same
source of data incompatible across different programs.
DATA INCONSISTENCY: Different copies of the same data may no longer agree, e.g. one program may
change a piece of data while a copy of that data held in a separate file and accessed by a
different program remains unchanged.
REDUCED DATA INTEGRITY: Stored data must fall within some consistency limits. When set limits are
altered, it is difficult to change programs to enforce them and even more difficult when limits involve
several data items from different files.
DATA REDUNDANCY: Because of the duplication of data, a high storage cost is incurred on irrelevant
data.
SECURITY PROBLEMS: Not all users should be able to access all data.
DATABASE APPROACH
This is the method of data management in which a pool of related data is shared by multiple application
programs. There is greater flexibility in accessing data because of sharing from a central store.
DATABASE: A shared collection of logically related data and the description of the data designed to
meet the needs of an organization or a specific user.
Organizing data using the database approach requires the use of additional software called the
Database Management System.
DBMS:
A software system that enables users to define, create, maintain and control access to a
database. DBMSs fall into the category of software called system management programs [alongside the O.S.,
MW programs and utilities mgt., e.g. antivirus packs].
FUNCTIONS OF A DBMS
It allows users to DEFINE the database through a DATA DEFINITION LANGUAGE. A DDL is a language for
describing entities, attributes and relations for a database.
It allows users to UPDATE, that is, insert, delete and retrieve data from the database through a DATA
MANIPULATION LANGUAGE. A DML is a language that provides a set of operations to support the basic
operations on data.
It provides controlled access to the database, e.g.
a. Security: Prevents unauthorized access. User names and passwords define user access.
b. Integrity: Maintains consistency of stored data.
c. Concurrency control: Allows many users to access shared data at the same time without
interference. It also prevents two users from changing the same data at the same time.
d. Recovery control: Restores a database to a previous state following hardware or software failure.
e. User accessible catalog: Contains descriptions of the data in the database.
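The DDL and DML functions can be illustrated with SQL, a language that plays both roles, run here through Python's built-in sqlite3 module. The employee table and its columns are hypothetical; SQLite is used only because it needs no setup, not because the notes prescribe it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: define an entity and its attributes.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL, pay_rate REAL)"
)

# DML: insert, update, delete and retrieve data.
conn.execute("INSERT INTO employee (id, name, pay_rate) VALUES (1, 'Ama', 25.0)")
conn.execute("INSERT INTO employee (id, name, pay_rate) VALUES (2, 'Kofi', 30.0)")
conn.execute("UPDATE employee SET pay_rate = 27.5 WHERE id = 1")
conn.execute("DELETE FROM employee WHERE id = 2")
rows = conn.execute("SELECT id, name, pay_rate FROM employee").fetchall()

# Integrity: the DBMS rejects a row that violates the NOT NULL rule.
try:
    conn.execute("INSERT INTO employee (id, name) VALUES (3, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The last step shows the integrity function at work: the rule is declared once in the DDL and enforced by the DBMS for every program that uses the table.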
ADVANTAGES
1. CONTROL OF REDUNDANCY: The database approach does not rule out redundancy and duplication
entirely but controls them so less space is wasted.
2. DATA CONSISTENCY: Ensures all copies of a stored data item are kept consistent.
3. DATA SHARING: All authorized users can share the same data using multiple applications.
4. ECONOMY OF SCALE: Combining all operational data and creating a set of applications to work on
the data source is cost saving.
5. BALANCE OF CONFLICTING REQUIREMENTS: Users’ needs may conflict; the DBMS
enables the database administrator (DBA) to make decisions about the design and operational use of the database that
provide important applications with optimal resources at the expense of less critical ones.
DISADVANTAGES
COMPLEXITY: Its extensive functionality makes a DBMS a complex piece of software. Failure of
users to fully understand it can lead to bad decisions with serious consequences for an organization.
COST: Varies depending on the environment and functionality. Disk storage requirements may
attract additional costs because the large size of a DBMS occupies space.
PERFORMANCE: Other applications on the O.S. may not run as fast as they should because of the
size and functionality (it serves multiple applications, not just one) of a DBMS.
HIGHER IMPACT OF FAILURE: The centralization of resources increases the vulnerability of the
system. Since all users rely on the DBMS, failure of any component can bring operations to a halt.
FACTORS TO CONSIDER IN SELECTING A DBMS
PERFORMANCE: You need to consider the response speed of the DBMS in seconds rather than minutes, e.g. when checking
a credit card rating or making airline reservations. The speed with which a DBMS updates records is of utmost
importance.
EASE OF INTEGRATION: The ability of a DBMS to integrate with other applications and databases is a
critical consideration, e.g. exporting or importing data to other databases and programs.
ENHANCED FEATURES: There must be built-in tools for security procedures and privacy protection, as well as manuals
and documentation for easy use and understanding etc.
VENDOR REPUTATION: The vendor must be well respected in the information systems industry, with a good financial
position, a proven record of products, after sales support staff etc.
COST: Product packages for PCs cost a few hundred dollars; for mainframes, products cost hundreds
of thousands of dollars. Apart from initial costs, budget for monthly operating costs (repairs, upgrades,
maintenance), hardware costs and personnel training costs. Some vendors rent or lease DBMS software.
DATA MODELLING
In organizing a database, before embarking on the project you must first consider:
1. The type of data that needs to be collected.
2. User access.
Building a database requires 2 different types of design:
A Logical Design: This shows abstract models of how the database should be structured and arranged to
meet the information needs of an organization. It involves identifying the links between different
data items and grouping them in an orderly fashion. Since a database provides both input and output for
information systems, all functional users should assist in creating the logical design.
A Physical Design: How data will be organized and located within the database, e.g. actual storage on
hardware. This involves only highly technical staff.
A tool used in designing databases is a data model, which is often a map or diagram of entities and their
relationships.
DATA MODELS OR STRUCTURES
A model is a representation of real world objects or events together with their associations. The purpose of
a data model is to represent data to make it more understandable.
A model is made up of a set of rules governing how a database will be constructed, the operations that will be
permitted and the means to ensure data accuracy (integrity rules).
There are 3 main types of data models:
Object based models, e.g. E-R diagrams
Record based models, e.g. hierarchical, network, relational
Physical data models, e.g. unifying models, which describe data at the internal level.
HIERARCHICAL
Data is organized in a top-down or inverted tree structure. The structure consists of a root record and any number
of lower level records. All relationships are one-to-many since each data element is related to only one
element above it, e.g. a one parent, many children situation, and there is only one access path to any data
element or child.
DIAGRAM: A hierarchical model with PROJECT as the root element, DEPT. A and DEPT. B below it, and EMPLOYEE 1, EMPL. 2 and EMPL. 3 below the departments
NETWORK
With this model a particular data item can be accessed through more than one path since it permits
many to many relationships.
The major limitation of the network and hierarchical models is that once relationships are established
among data elements, it is difficult to modify or create new ones.
DIAGRAM: A network model linking faculties (ACCOUNTING, MARKETING, ADMINISTRATION), students (STD. 1 to STD. 5) and courses (COURSE A, COURSE B, COURSE C)
RELATIONAL MODEL.
It is based on the mathematical concept of a relation proposed by E. F. Codd in 1970.
A relation is represented physically as a table. Here all data elements are placed in 2 dimensional
tables. As long as these tables share one common element they can be linked to combine information.
PROPERTIES OF RELATIONS.
Every table must have a name distinct from all others in a schema.
Each cell must have only one value.
Each column must have a distinct name.
Each row is distinct; there are no duplicate rows. E.g. a student table for a level 200 student database may have these columns:
Std. ID | Last name | First name | DOB | Gender | Conduct | Program | Exam score
Relational models support basic manipulations, e.g. select, project, join, link etc.
Select: eliminates rows according to certain criteria.
Project: eliminates columns in a table.
Join: combines two or more tables.
Link: combines two or more tables that share at least one common data element. This makes the
relational model very flexible, and is of great advantage when information is needed from multiple
tables.
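Select, project and join can be sketched over tables held as lists of dictionaries. The two tables and their contents are invented, and the functions are a teaching sketch rather than how a DBMS implements these operations (for example, this project does not remove duplicate rows).

```python
students = [
    {"std_id": 1, "last_name": "Mensah", "program": "Accounting"},
    {"std_id": 2, "last_name": "Boateng", "program": "Marketing"},
]
scores = [
    {"std_id": 1, "exam_score": 78},
    {"std_id": 2, "exam_score": 55},
]


def select(relation, predicate):
    """Keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, columns):
    """Keep only the named columns."""
    return [{c: row[c] for c in columns} for row in relation]


def join(left, right, key):
    """Combine rows from two tables that share the same value for `key`."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


passed = project(
    select(join(students, scores, "std_id"), lambda row: row["exam_score"] >= 60),
    ["last_name", "exam_score"],
)
```

The common element std_id is what allows the two tables to be combined, which is the flexibility the relational model is valued for.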
DATA ANALYSIS.
A process of evaluating data to uncover problems with the content of a database. Data of good content
must be simple, flexible, non-redundant and adaptable to a number of different applications. This can
be achieved by:
i. Normalization: Purging data so that the attributes in a table depend on the full primary key
alone. It involves 3 main steps.
1. Eliminating repeating groups, that is, sets of attributes that appear more than once.
2. Eliminating data that occurs more than once.
3. Eliminating attributes in a table that are not dependent on the primary key of that table.
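The effect of normalization can be sketched by splitting one redundant table into two. The table contents are invented, and the assumed key of the flat table is the pair (std_id, course); the split moves the attribute that depends on only part of that key into its own table.

```python
# Unnormalized rows: the student's program is repeated for every course.
flat = [
    {"std_id": 1, "program": "Accounting", "course": "A", "score": 78},
    {"std_id": 1, "program": "Accounting", "course": "B", "score": 64},
    {"std_id": 2, "program": "Marketing", "course": "A", "score": 55},
]

# `program` depends only on std_id, so it moves to its own table.
student = {row["std_id"]: row["program"] for row in flat}

# `score` depends on the full key (std_id, course), so it stays here.
result = {(row["std_id"], row["course"]): row["score"] for row in flat}
```

After the split each program is stored once per student, so changing a student's program means updating one value instead of several rows.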
RELATIONAL TERMINOLOGY.
RELATION/File: a table with rows and columns.
Tuple/Record: a row of a relation.
Attribute/Field: a column of a relation.
Degree: the number of columns a relation contains.
Cardinality: the number of rows a relation contains.
RELATIONAL DATABASE.
A collection of normalized relations with distinct relation names. E.g. the main database with so many
tables.
RELATIONAL KEYS.
SUPERKEY: An attribute, or set of attributes, that uniquely identifies a row.
CANDIDATE KEY: A superkey that has no proper subset which is itself a superkey.
PRIMARY KEY: The candidate key selected to uniquely identify rows within a relation.
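The definitions of superkey and candidate key can be checked mechanically on a sample table. The rows below are invented, and keys found this way only reflect the sample data; in practice a key is a rule about all possible rows, decided by the designer.

```python
from itertools import combinations

rows = [
    {"std_id": 1, "last_name": "Mensah", "program": "Accounting"},
    {"std_id": 2, "last_name": "Mensah", "program": "Marketing"},
    {"std_id": 3, "last_name": "Boateng", "program": "Marketing"},
]


def is_superkey(attrs, rows):
    """True if no two rows share the same values for `attrs`."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Superkeys with no proper subset that is also a superkey."""
    attrs = list(rows[0])
    supers = [
        set(c)
        for n in range(1, len(attrs) + 1)
        for c in combinations(attrs, n)
        if is_superkey(c, rows)
    ]
    return [s for s in supers if not any(t < s for t in supers)]


keys = candidate_keys(rows)
```

For this sample both std_id alone and the pair (last_name, program) are candidate keys; a designer would normally pick std_id as the primary key.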
NULL
A null represents a value for a column that is currently not known, or not applicable to a particular row.
A null is not the same as a zero or a space; zeros and spaces are values. A null represents the absence of a
value.
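The difference between a null and a zero can be seen in a small query sketch using Python's built-in sqlite3 module. The student table and scores are invented; the behaviour shown (NULL is matched with IS NULL, and AVG skips NULLs) is standard SQL behaviour as implemented by SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (std_id INTEGER, exam_score INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(1, 80), (2, 0), (3, None)],  # None is stored as NULL
)

zero_rows = conn.execute(
    "SELECT std_id FROM student WHERE exam_score = 0"
).fetchall()
null_rows = conn.execute(
    "SELECT std_id FROM student WHERE exam_score IS NULL"
).fetchall()
# AVG ignores NULLs, so the missing score does not count as a zero.
average = conn.execute("SELECT AVG(exam_score) FROM student").fetchone()[0]
```

Student 2 scored zero and student 3 has no score recorded; treating the two as the same would change the average from 40 to about 26.7.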
RELATIONAL INTEGRITY.
Every attribute in a relation has a set of values or data items it can permit. This set is called the domain. The number of
values in a domain is definite, and this restricts the set of allowable values.
In addition, two other rules form restrictions on a relational database.
Entity Integrity: In a relation, no value of a primary key can be null.
Referential Integrity: In a relation, a foreign key value must match the candidate key value in the home
relation or the foreign key value
must