MIS 250 Definitions for Quiz 4

6.1 A grouping of characters into a word, a group of words, or a complete number (such as a person’s
name or age) is called a field.
A group of related fields, such as the student’s name, the course taken, the date, and the grade,
comprises a record; a group of records of the same type is called a file.
A group of related files makes up a database.
A record describes an entity. An entity is a person, place, thing, or event on which we store and
maintain information.
Each characteristic or quality describing a particular entity is called an attribute.
Data redundancy wastes storage resources and also leads to data inconsistency, where the same
attribute may have different values.
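As a minimal sketch of how redundancy turns into inconsistency, assume two departments each keep their own file with the same student address attribute. The file names, student IDs, and addresses below are hypothetical.

```python
# Two separately maintained files that both store the same attribute
# (a student's address). All values are made-up sample data.
registrar_file = {"S001": "12 Elm St", "S002": "9 Oak Ave"}
billing_file = {"S001": "12 Elm St", "S002": "40 Pine Rd"}

# A student whose address differs between the two files is an
# example of data inconsistency caused by redundant storage.
inconsistent_students = sorted(
    student_id
    for student_id in registrar_file.keys() & billing_file.keys()
    if registrar_file[student_id] != billing_file[student_id]
)

print(inconsistent_students)
```

Only the billing copy of one record was updated here, so the same attribute now has two different values.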
Program-data dependence refers to the coupling of data stored in files and the specific programs
required to update and maintain those files, such that a change to either the programs or the data
forces a change to the other.
6.2 A more rigorous definition of a database is a collection of data organized to serve many
applications efficiently by centralizing the data and controlling redundant data.
A database management system (DBMS) is software that permits an organization to centralize data,
manage them efficiently, and provide access to the stored data by application programs.
The most popular type of DBMS today for PCs as well as for larger computers and mainframes is the
relational DBMS. Relational databases represent data as two-dimensional tables (called relations).
The actual information about a single supplier that resides in a table is called a row. Rows are commonly
referred to as records, or in very technical terms, as tuples.
The field for Supplier_Number in the SUPPLIER table uniquely identifies each record so that the record
can be retrieved, updated, or sorted and it is called a key field.
Each table in a relational database has one field that is designated as its primary key. This key field is the
unique identifier for all the information in any row of the table and this primary key cannot be
duplicated.
When the field Supplier_Number appears in the PART table it is called a foreign key and is essentially a
lookup field to look up data about the supplier of a specific part.
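The primary key and foreign key ideas can be sketched with Python's built-in sqlite3 module. Only the names SUPPLIER, PART, and Supplier_Number come from the text above; the other column names and all sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SUPPLIER (
    Supplier_Number INTEGER PRIMARY KEY,  -- primary key: unique for every row
    Supplier_Name   TEXT NOT NULL
);
CREATE TABLE PART (
    Part_Number     INTEGER PRIMARY KEY,
    Part_Name       TEXT NOT NULL,
    -- foreign key: points at the supplier's primary key
    Supplier_Number INTEGER REFERENCES SUPPLIER(Supplier_Number)
);
INSERT INTO SUPPLIER VALUES (8259, 'Example Supplier A');
INSERT INTO PART VALUES (137, 'Door latch', 8259);
""")

# Use the foreign key as a lookup field to find the supplier of a part.
supplier_of_part = conn.execute(
    """
    SELECT s.Supplier_Name
    FROM PART p
    JOIN SUPPLIER s ON p.Supplier_Number = s.Supplier_Number
    WHERE p.Part_Number = ?
    """,
    (137,),
).fetchone()[0]

# A primary key value cannot be duplicated: a second supplier 8259 is refused.
try:
    conn.execute("INSERT INTO SUPPLIER VALUES (8259, 'Duplicate supplier')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(supplier_of_part, duplicate_rejected)
```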
An object-oriented DBMS stores the data and procedures that act on those data as objects that can be
automatically retrieved and shared.
Hybrid object-relational DBMS systems are now available to provide capabilities of both object-oriented
and relational DBMS.
A DBMS has a data definition capability to specify the structure of the content of the database. It is
used to create database tables and to define the characteristics of the fields in each table.
A data dictionary is an automated or manual file that stores definitions of data elements and their
characteristics.
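One way to picture an automated data dictionary is to ask the DBMS itself to describe a table. The sketch below uses SQLite's `PRAGMA table_info` through Python's sqlite3 module; the table layout is a hypothetical example, and a real data dictionary would usually hold more (ownership, usage, business definitions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE SUPPLIER (
        Supplier_Number INTEGER PRIMARY KEY,
        Supplier_Name   TEXT NOT NULL,
        Supplier_City   TEXT
    )
    """
)

# PRAGMA table_info returns one row per field:
# (position, name, declared type, not-null flag, default value, primary-key flag)
data_dictionary = [
    {
        "field": name,
        "type": declared_type,
        "required": bool(not_null),
        "primary_key": bool(pk),
    }
    for _position, name, declared_type, not_null, _default, pk in conn.execute(
        "PRAGMA table_info(SUPPLIER)"
    )
]

for entry in data_dictionary:
    print(entry)
```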
Most DBMS have a specialized language called a data manipulation language that is used to add,
change, delete, and retrieve the data in the database.
The most prominent data manipulation language today is Structured Query Language, or SQL.
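A short sketch of data definition followed by the four data manipulation operations (add, change, delete, retrieve), written as SQL issued through Python's sqlite3 module. The SUPPLIER columns other than Supplier_Number and all sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: create the table and describe its fields.
conn.execute(
    """
    CREATE TABLE SUPPLIER (
        Supplier_Number INTEGER PRIMARY KEY,
        Supplier_Name   TEXT NOT NULL,
        Supplier_City   TEXT
    )
    """
)

# Data manipulation: add rows.
conn.executemany(
    "INSERT INTO SUPPLIER VALUES (?, ?, ?)",
    [
        (1, "Alpha Supply", "Dayton"),
        (2, "Beta Parts", "Cleveland"),
        (3, "Gamma Tools", "Dayton"),
    ],
)

# Change a row.
conn.execute(
    "UPDATE SUPPLIER SET Supplier_City = ? WHERE Supplier_Number = ?",
    ("Akron", 2),
)

# Delete a row.
conn.execute("DELETE FROM SUPPLIER WHERE Supplier_Number = ?", (3,))

# Retrieve the remaining rows.
rows = conn.execute(
    "SELECT Supplier_Number, Supplier_Name, Supplier_City "
    "FROM SUPPLIER ORDER BY Supplier_Number"
).fetchall()

print(rows)
```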
The process of creating small, stable, yet flexible and adaptive data structures from complex groups of
data is called normalization.
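Normalization can be illustrated, in a simplified way, by splitting one wide record layout that repeats supplier details on every part into two smaller structures. The field names beyond Supplier_Number and all values are hypothetical, and the sketch covers only this one step, not the full set of normal forms.

```python
# Unnormalized data: supplier details are repeated on every part record.
flat_rows = [
    {"Part_Number": 137, "Part_Name": "Door latch",
     "Supplier_Number": 8259, "Supplier_Name": "Supplier A"},
    {"Part_Number": 145, "Part_Name": "Side mirror",
     "Supplier_Number": 8444, "Supplier_Name": "Supplier B"},
    {"Part_Number": 150, "Part_Name": "Door molding",
     "Supplier_Number": 8259, "Supplier_Name": "Supplier A"},
]

suppliers = {}  # one entry per supplier, keyed by Supplier_Number
parts = []      # one entry per part, keeping only the supplier's key

for row in flat_rows:
    # Store each supplier's details once, no matter how many parts it supplies.
    suppliers[row["Supplier_Number"]] = {"Supplier_Name": row["Supplier_Name"]}
    parts.append(
        {
            "Part_Number": row["Part_Number"],
            "Part_Name": row["Part_Name"],
            "Supplier_Number": row["Supplier_Number"],
        }
    )

print(suppliers)
print(parts)
```

After the split, a supplier's name is stored in exactly one place, so changing it cannot leave conflicting copies behind.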
Relational database systems try to enforce referential integrity rules to ensure that relationships
between coupled tables remain consistent.
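The sketch below shows referential integrity being enforced between two coupled tables, using Python's sqlite3 module. It assumes a SQLite build with foreign key support (the usual case) and turns enforcement on explicitly, because SQLite leaves it off by default. The helper `violates_integrity` and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

conn.execute("CREATE TABLE SUPPLIER (Supplier_Number INTEGER PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE PART (
        Part_Number     INTEGER PRIMARY KEY,
        Supplier_Number INTEGER NOT NULL REFERENCES SUPPLIER(Supplier_Number)
    )
    """
)
conn.execute("INSERT INTO SUPPLIER VALUES (8259)")
conn.execute("INSERT INTO PART VALUES (137, 8259)")


def violates_integrity(sql, params=()):
    """Run a statement and report whether the DBMS refused it."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


# A part may not point at a supplier that does not exist.
orphan_part_rejected = violates_integrity(
    "INSERT INTO PART VALUES (?, ?)", (150, 9999)
)

# A supplier may not be deleted while parts still refer to it.
delete_in_use_rejected = violates_integrity(
    "DELETE FROM SUPPLIER WHERE Supplier_Number = ?", (8259,)
)

print(orphan_part_rejected, delete_in_use_rejected)
```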
Database designers document their data model with an entity-relationship diagram, for example one
showing the relationships between the entities SUPPLIER, PART, LINE_ITEM, and ORDER. (The boxes in
such a diagram represent entities.)
6.3 A data warehouse is a database that stores current and historical data of potential interest to
decision makers throughout the company.
A data mart is a subset of a data warehouse in which a summarized or highly focused portion of the
organization’s data is placed in a separate database for a specific population of users.
Answering questions that span several dimensions at once requires online analytical processing (OLAP). OLAP supports
multidimensional data analysis, enabling users to view the same data in different ways using multiple
dimensions. Each aspect of information—product, pricing, cost, region, or time period—represents a
different dimension.
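A rough sketch of viewing the same data along different dimensions, in plain Python. The sales figures, the three dimensions chosen, and the helper `roll_up` are all hypothetical; a real OLAP tool would work against a much larger, precomputed multidimensional store.

```python
from collections import defaultdict

# Each fact is (product, region, quarter, units_sold). Sample data only.
DIMENSIONS = ("product", "region", "quarter")
sales = [
    ("washers", "East", "Q1", 120),
    ("washers", "West", "Q1", 80),
    ("bolts", "East", "Q1", 200),
    ("washers", "East", "Q2", 150),
    ("bolts", "West", "Q2", 90),
]


def roll_up(facts, *dims):
    """Total units sold, grouped by the chosen dimensions."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[p] for p in positions)
        totals[key] += fact[3]
    return dict(totals)


# The same facts viewed three different ways.
by_product = roll_up(sales, "product")
by_product_and_region = roll_up(sales, "product", "region")
by_quarter = roll_up(sales, "quarter")

print(by_product)
print(by_product_and_region)
print(by_quarter)
```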
Data mining is more discovery-driven than OLAP or traditional database queries. It provides insights into corporate data that cannot be
obtained with OLAP by finding hidden patterns and relationships in large databases and inferring rules
from them to predict future behavior. The patterns and rules are used to guide decision making and
forecast the effect of those decisions. The types of information obtainable from data mining include
associations, sequences, classifications, clusters, and forecasts.
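Of the types of information listed, associations are the easiest to sketch. The example below estimates how often one item is bought given that another is bought, which is a simple rule inferred from the data. The baskets and the helper `confidence` are hypothetical, and real data mining tools use far larger data sets and more sophisticated algorithms.

```python
# Each basket is the set of items bought in one transaction. Sample data only.
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips"},
    {"cola"},
    {"chips", "salsa"},
]


def confidence(transactions, antecedent, consequent):
    """Share of transactions containing `antecedent` that also contain `consequent`."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    both = sum(1 for t in with_antecedent if consequent in t)
    return both / len(with_antecedent)


# "When chips are purchased, how often is cola purchased too?"
chips_to_cola = confidence(baskets, "chips", "cola")
print(chips_to_cola)
```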
Predictive analytics use data mining techniques, historical data, and assumptions about future
conditions to predict outcomes of events, such as the probability a customer will respond to an offer or
purchase a specific product.
Text mining tools are now available to help businesses analyze unstructured text data. These tools are able to
extract key elements from large unstructured data sets, discover patterns and relationships, and
summarize the information.
The discovery and analysis of useful patterns and information from the World Wide Web is called Web
mining.
In a client/server environment, the DBMS resides on a dedicated computer called a database server.
The DBMS receives the SQL requests and provides the required data. The middleware transfers
information from the organization’s internal database back to the Web server for delivery in the form of
a Web page to the user.
6.4 An information policy specifies the organization’s rules for sharing, disseminating, acquiring,
standardizing, classifying, and inventorying information. Information policy lays out specific procedures
and accountabilities, identifying which users and organizational units can share information, where
information can be distributed, and who is responsible for updating and maintaining the information.
Data administration is responsible for the specific policies and procedures through which data can be
managed as an organizational resource. These responsibilities include developing information policy,
planning for data, overseeing logical database design and data dictionary development, and monitoring
how information systems specialists and end-user groups use data.
You may hear the term data governance used to describe many of these activities. Promoted by IBM,
data governance deals with the policies and processes for managing the availability, usability, integrity,
and security of the data employed in an enterprise, with special emphasis on promoting privacy,
security, data quality, and compliance with government regulations.
In close cooperation with users, the design group establishes the physical database, the logical relations
among elements, and the access rules and security procedures. The functions it performs are called
database administration.
Analysis of data quality often begins with a data quality audit, which is a structured survey of the
accuracy and level of completeness of the data in an information system. Data quality audits can be
performed by surveying entire data files, surveying samples from data files, or surveying end users for
their perceptions of data quality.
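The completeness part of a data quality audit can be sketched as a per-field count over a sample of records. The records, field names, and the helper `completeness` are hypothetical, and the sketch does not measure accuracy, which requires comparing the data against a trusted source.

```python
# A small sample drawn from a data file. Empty strings and None count as missing.
sample = [
    {"name": "Ann Lee", "email": "ann@example.com", "age": 34},
    {"name": "Bob Ray", "email": "", "age": 29},
    {"name": "", "email": "", "age": None},
    {"name": "Dee Fox", "email": "dee@example.com", "age": 41},
]


def completeness(rows):
    """Fraction of rows with a non-missing value, reported for each field."""
    fields = rows[0].keys()
    return {
        field: sum(1 for row in rows if row[field] not in ("", None)) / len(rows)
        for field in fields
    }


report = completeness(sample)
print(report)
```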
Data cleansing, also known as data scrubbing, consists of activities for detecting and correcting data in a
database that are incorrect, incomplete, improperly formatted, or redundant. Data cleansing not only
corrects errors but also enforces consistency among different sets of data that originated in separate
information systems.
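A minimal data cleansing sketch: it fixes formatting, sets aside incomplete records, and removes redundant ones. The records, the choice of email as the matching field, and the helper `cleanse` are assumptions for illustration; real cleansing rules depend on the organization's data.

```python
def cleanse(records):
    """Return (cleaned, rejected) lists of records."""
    seen_emails = set()
    cleaned, rejected = [], []
    for record in records:
        # Improperly formatted: normalize spacing and letter case.
        name = " ".join(record["name"].split()).title()
        email = record["email"].strip().lower()

        # Incomplete: set the record aside for manual follow-up.
        if not name or not email:
            rejected.append(record)
            continue

        # Redundant: keep only the first record for each email address.
        if email in seen_emails:
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned, rejected


raw = [
    {"name": "  ann   lee ", "email": "Ann@Example.com "},
    {"name": "Ann Lee", "email": "ann@example.com"},
    {"name": "bob ray", "email": ""},
    {"name": "DEE FOX", "email": "dee@example.com"},
]

cleaned, rejected = cleanse(raw)
print(cleaned)
print(rejected)
```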