Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
CHAPTER 3 DATA RESOURCE MANAGEMENT Learning Objectives Examine the managerial and organizational requirements for managing data as a resource. Foundation concepts A computer system organizes data in a hierarchy that starts with bytes and progresses to fields, records, files, and databases. FILE ORGANIZATION BIT: the smallest unit of data a computer can handle. BYTE: Combination of BITS called a byte, represents a CHARACTER which can be a letter, a number or another symbol. FIELD: A grouping of characters into a word, a group of words, or a complete number (such as person’s name or age). RECORD: A group of related fields, such as the student’s name, the course taken, the date, and the grade comprise a RECORD. FILE: A collection of similar RECORDS. DATABASE: A group of related files makes up a DATABASE. DATABASE ORGANIZATION’S ELECTRONIC LIBRARY STORES & MANAGES DATA IN A CONVENIENT FORM * Example Student Database HIERARCHY Course file Database Personal history file NAME File Record Financial file John Stewart Karen Taylor Emily Vincent COURSE DATE GRADE IS 101 F02 IS 101 F02 IS 101 F02 NAME COURSE DATE GRADE John Stewart IS 101 F02 B+ Field John Stewart (NAME field) Byte 01001010 (Letter J in ASCII) Bit B+ A C 0, 1 KEY FIELD Every record on a file should contain at least one field that uniquely identifies instances of that record so that record can be retrieved, updated and sorted. This identifier field is called KEY FIELD. Social security number for a personnel record would be a key field. File organization In the database, the individual database is a collection of related attributes about entities. A record describes an ENTITY. An ENTITY is something you collect data about, such as people or classes. A record is a collection of related attributes about a single entity. Each record typically consists of many attributes which are individual pieces of information. For example, order number, order date, order account, item number, and item quantity would each be an attribute of the entity order. Entity=ORDER Attributes Order number Order date Item number Quantity 4340 02/08/03 1583 2 Amount 17.40 fields Key field The simple data table Entity: Student Attribute types ID Number Last Name First Name Major 209345 Borden Chris Sociology Attribute 213009 Chan Lisa Finance 45789 Austin John Law Record (one row) Database Management Systems (DBMS) A DBMS is a software application with which you create, store, organize, and retrieve data from a single database or several databases. In the traditional file environment each functional area tended to develop systems in isolation from other functional areas. Accounting, finance, manufacturing, human resources all developed their own systems and data files. The organization is saddled with hundreds of programs with no one who knows what they do, what data they use, and who is using the data. DBMS DBMS DBMS is a software that permits to centralize data, manage them effectively, and provide access to the stored data by application programs. Software to create & maintain data Enables business applications to extract data. Rather than storing data in separate files for each application, data are stored physically to appear to users as being stored in only one location. Microsoft Access is an example of a popular DBMS for personal computers. DBMS DBMS software enables end users to create and manage their own database applications. Most DBMSs have a specialized language called a data manipulation language that is used to manipulate data in the database. STRUCTURED QUERY LANGUAGE (SQL) SQL is a data manipulation language to access, read and update data from databases. Most common language used to interface with the databases. * SQL The most important SQL commands: SELECT: lists the columns from tables that the user would like to see in a result table. FROM: identifies the tables or views from which the columns will be selected. WHERE: includes conditions for selecting specific rows (records) within a single table and conditions for joining multiple tables. SQL Database PART SUPPLIER Part number Part description Unit price Supplier number 137 145 150 152 Door latch Door handle Door seal Compressor 22.50 26.25 6.00 70.00 4058 2038 1058 1125 Supplier number 4058 2038 1125 Supplier name CBM Inc. Ace Inc. Bryant Corp. Supplier address 44 Winslow, Gary, IN 44950 Rte, 101, Essex, NJ 07763 51 Elm, Rochester, NY 11349 SQL Suppose you wanted to see the Part_Number, Part_Description, and Unit_Price for each part in the PART table. You would specify: SELECT Part_Number, Part_Description, Unit_Price FROM PART; The results of using the SELECT statement to select only the columns Part_Number, Part_Description and Unit_Price from all rows in the PART table. Part number Part description Unit price 137 145 150 152 Door latch Door handle Door seal Compressor 22.50 26.25 6.00 70.00 SQL Suppose, you wanted to see the same data only for parts in the PART table with unit prices less than $25.00. You would specify: SELECT Part_Number, Part_Description, Unit_Price FROM PART; WHERE Unit_Price<25.00 The results of using a conditional selection to select only parts that meet the condition of having unit prices less than $25. Part number Part description Unit price 137 150 Door latch Door seal 22.50 6.00 Data manipulation language Many DBMS packages have a simpler way of interfacing with the databasesusing a concept called query by example (QBE). It enables us to fill out a grid, or template, in order to construct a sample of the data we want to see. Database Trends 1.Data Warehouses Database that stores current and historical data of potential interest to managers throughout the company. This data is extracted for various operational systems and external sources including web site transactions. The data from these diverse applications are copied into the data warehouse database as often as needed (hourly, weekly, daily, monthly). Data Warehouses The data are standardized into a common data model and consolidated so that they can be used across the enterprise for management analysis and decision making. The data are available for anyone to access as needed but cannot be altered. Much like a physical warehouse for products and components, the data warehouse stores and distributes data on computer-based information systems. Components of a data warehouse Operational data INTERNAL DATA SOURCES Historical data Operational data Historical data EXTERNAL DATA SOURCES External data External data Extract and transform Data Warehouse Information directory Data Access and analysis •Data mining •Reports 2. Data marts Companies can build enterprise-wide data warehouses where a central data warehouse serves the entire organization, or they can create smaller, decentralized warehouses called data marts. Small data warehouse for special function, e.g., Focused marketing based on customer info. Data marts Subset of a data warehouse in which a summarized or highly focused portion of the organization's data is placed in a separate database for a specific population of users. Each data mart contains a subset of the data for a single aspect of a company’s business, for ex. finance, inventory, or personnel. Data marts have been popular among small and medium sized businesses and among departments within larger companies. Data marts typically contain tens of gigabytes of data, as opposed to the hundreds of gigabytes in data warehouse. The cost of data mart is typically less than $1 million, while the cost for a data warehouse can exceed $10 million. DATABASE ADMINISTRATION DEFINES & ORGANIZES DATABASE STRUCTURE AND CONTENT DEVELOPS SECURITY PROCEDURES DEVELOPS DATABASE DOCUMENTATION MAINTAINS DBMS * Management requirements for database systems The database is an organizational discipline, a method, rather than a tool or technology. It requires organizational change. Without management support and understanding, database efforts fail. The critical elements in a database environment are data administration, data planning, and database management. ELEMENTS OF DATABASE ENVIRONMENT DATA DATABASE ADMINISTRATION MANAGEMENT DATABASE MANAGEMENT SYSTEM DATA PLANNING USERS Data administration The organizations must develop a data administration function for managing (collecting, storing, and dissemination of all types of data) the organization's data resources. Data administration is responsible for the specific policies and procedures through which data can be managed as an organizational resource. Data planning The organization requires enterprise-wide planning for data. Enterprise analysis, which addresses the information requirements of the entire organization is needed to develop databases. The purpose of it is to identify the key entities, attributes, and relationships that constitute the organization's data. Database management Databases require new software and a new staff specially trained in DBMS’ techniques as well as new data management structures. Most corporations develop a database design and management group within the corporation that is responsible for defining and organizing the structure and content of the database and maintaining the database. Advantages of DBMS DBMS can manage hundreds of tables simultaneously. Helps us to manage tremendous volume and complexity of interrelated data. Prevents unnecessary redundancies of data. Improved data sharing. Improved data quality. Improved data accessibility. Database Structures In all information systems, data resources must be organized and structured in some logical manner. Contemporary DBMS uses different database models to keep track of entities, attributes and relationships. Database Structures Hierarchical Network Relational Object-oriented TYPES OF RELATIONS ONE-TO-ONE: STUDENT CLASS ONE-TO-MANY: STUDENT A MANY-TO-MANY: ID STUDENT B CLASS 1 STUDENT A STUDENT C CLASS 2 STUDENT B STUDENT C HIERARCHICAL DATA MODEL Hierarchical DBMS depict data logically as one to-many relationships. The hierarchical DBMS presents data to users in a treelike structure. To the user, each record looks like an organization chart with one top-level segment called the root. An upper segment is connected logically to a lower segment in a parent-child relationship. Used for structured, routine types of transaction processing. HIERARCHICAL DATA MODEL ROOT FIRST CHILD 2nd CHILD Employer Compensation Ratings Salary Job Assignments Pension Benefits Insurance Health NETWORK STRUCTURE Network DBMS depict data logically as many-tomany relationships. More flexible than the hierarchical structure in support of databases for many types of business operations. A typical many-to-many relationship for a network DBMS is the student-course relationship. There are many courses in the university and many students. A student takes many courses and a course has many students. NETWORK DATA MODEL MANY-TO-MANY: CLASS 1 CLASS 2 STUDENT STUDENT STUDENT A B C RELATIONAL DBMS The most popular type of DBMS today for PCs as well as for larger computers and mainframes is the relational DBMS. It can relate data stored in one table to data in another as long as the two tables share a common data element. Easier for programmers to work with and easier to maintain than the hierarchical and network models. Leading mainframe relational database management systems include IBM’s DB2 and Oracle from the Oracle Corporation. MS Access is a PC relational database management system. RELATIONAL DBMS PART SUPPLIER Part number Part description Unit price Supplier number Supplier number Supplier name Supplier address 137 145 150 152 Door latch Door handle Door seal Compressor 22.50 26.25 6.00 70.00 4058 2038 1058 1125 4058 2038 1125 CBM Inc. Ace Inc. Bryant 44 Winslow, Gary, IN 44950 Rte, 101, Essex, NJ 07763 51 Elm, Rochester, NY 11349 Select Part_number=137 or 152 Join by Supplier_number Part number Supplier number Supplier name Supplier address 137 152 4058 1125 CBM Inc. Bryant 44 Winslow, Gary, IN 44950 51 Elm, Rochester, NY 11349 Combined data from two different tables and only selected attributes are displayed. Object-oriented DBMS Many applications today and in the future will require databases that can store and retrieve not only structured numbers and characters but also drawings, images, photographs, voice, and full-motion video. Stores data & procedures as objects. Conventional DBMSs are not well suited to handling graphics-based or multimedia applications. They are becoming more popular because they can be used to manage the various multimedia components or Java applets used in Web applications which typically integrate pieces of info from a variety of sources. CREATING A DATABASE To create a database, one must go through two design exercises: CONCEPTUAL DESIGN PHYSICAL DESIGN * CREATING A DATABASE CONCEPTUAL DESIGN: Abstract model from a business perspective. It describes how data will be grouped. It establishes end-user needs. * CREATING A DATABASE PHYSICAL DESIGN: Detailed model by database specialists How data are to be physically stored and accessed on the storage devices * CREATING A DATABASE Entity-relationship diagram Database designers document the logical data model with an entity-relationship diagram (ERD) which is a methodology for documenting databases illustrating the relationship between various entities in the database. ENTITY- RELATIONSHIP DIAGRAM Entity Attributes Entities ORDER ORDER: #, DATE, PART #, QUANTITY 1 Relationships CAN HAVE M PART: #, DESCRIPTION, UNIT PRICE, SUPPLIER # PART M CAN HAVE 1 SUPPLIER SUPPLIER: #, NAME, ADDRESS