Information Technology Foundations-BIT 112

CHAPTER 4
Data and Knowledge Management

Chapter Outline
• 4.1 Managing Data
• 4.2 The Database Approach
• 4.3 Database Management Systems
• 4.4 Data Warehousing
• 4.5 Data Governance
• 4.6 Knowledge Management

Learning Objectives
• Recognize the importance of data, issues involved in managing data and their lifecycle.
• Describe the sources of data and explain how data are collected.
• Explain the advantages of the database approach.
• Explain the operation of data warehousing and its role in decision support.
• Explain data governance and how it helps to produce high-quality data.
• Define knowledge, and describe different types of knowledge.

Examples of Data Sources
• Credit card swipes
• E-mails
• RFID tags
• Digital video surveillance
• Radiology scans
• Blogs

Chapter Opening Case
• Push Model: Products

Chapter Opening Case
• Pull Model: Orders

4.1 Managing Data
• Difficulties in Managing Data
  – Amount of data increases exponentially.
  – Data are scattered and collected by many individuals using various methods and devices.
  – Data come from many sources.
  – Data security, quality and integrity are critical.

Difficulties in Managing Data
• An ever-increasing amount of data needs to be considered in making organizational decisions.
The Data Deluge (http://www.applimation.com/)

Data Life Cycle (Figure 4.1)
• Businesses run on data that have been processed or transformed into information and knowledge.
• Figure 4.1 illustrates the processing of data into information and ultimately knowledge.

Data, Information, Knowledge, Wisdom
• Putting data, information, knowledge, and wisdom into perspective.

What is the meaning of Data, Information, Knowledge, and Wisdom?
• At your tables, take a few minutes and try to define these terms.

What is the meaning of Data, Information, Knowledge, and Wisdom?
• Data Item
  – Elementary description of things, events, activities and transactions that are recorded, classified and stored but are not organized to convey any specific meaning.
• Information
  – Data organized so that they have meaning and value to the recipient.
• Knowledge
  – Data and/or information organized and processed to convey understanding, experience, accumulated learning and expertise as they apply to a current problem or activity.
• Wisdom
  – The quality or state of being wise; knowledge of what is true or right coupled with just judgment as to action.

4.2 The Database Approach
• A database management system (DBMS) provides all users with access to all the data.
• DBMSs minimize the following data management problems:
  – Data redundancy:
    • The same data are stored in many places.
  – Data isolation:
    • Applications cannot access data associated with other applications.
  – Data inconsistency:
    • Various copies of the data do not agree.

Database Approach (continued)
• DBMSs maximize the following:
  – Data security:
    • Keeping the organization’s data safe from theft, modification, and/or destruction.
  – Data integrity:
    • Data must meet constraints (e.g., student grade point averages cannot be negative).
  – Data independence:
    • Applications and data are independent of one another. This means that applications and data are not linked to each other, so application logic can be changed and the database does not have to be modified.
      The inverse is also true.

Database Management Systems

Data Hierarchy (some DBMS Terminology)
• A bit – a binary digit, either a “0” or a “1”.
• A byte – eight bits, representing a single character (e.g., a letter, number or symbol).
• A field – a group of logically related characters (e.g., a word, small group of words, or identification number).
• A record – a group of logically related fields (e.g., student in a university database).
• A file – a group of logically related records.
• A database – a group of logically related files.

Hierarchy of Data for a Computer-Based File

Data Hierarchy (continued)
• Bit (binary digit)
• Byte (eight bits)

See Digital Data Representation Handout
• Review Digital Data Representation Handout

Data Hierarchy (continued)
• Example of Field and Record

Data Hierarchy (continued)
• Example of a Database Form.

Designing the Database
• Data Model
  – A diagram that represents the entities in the database and their relationships.
• Data Model Components
  – Entity
    • An entity is a person, place, thing, or event about which information is maintained.
    • A record is a database instance of an entity.
  – Attribute
    • A particular characteristic or quality of a particular entity.
  – Primary Key
    • A field that uniquely identifies a record.
  – Non-key Attributes
    • A property or characteristic of an entity that is not part of the key.

Entity Example

Entity: MOVIE
Attributes:  Movie Number   Name            Rating   Rental Rate
Instances:   12345345       Die Hard        PG13     $3
             23456781       Wings           PG       $2
             65656565       Black Beauty    G        $2

Entity: CUSTOMER
Attributes:  Cust Number    Name            Address           Status Code
Instances:   123-345        Tom Jones       12 Oak St         OK
             789-789        Mary Sullivan   456 Hill Ave      Pend
             567-342        Bob Waters      7676 Scutter Rd   OK

Entity Attribute Try it …
• Copy # - The sequence number of the item available for rent. Used to differentiate multiple copies of a Movie.
• Customer # (Fk2) - Unique identifier of an individual authorized to rent a Movie.
• Late Status - A status code identifying if the rental item has not been returned by the Return Date.
• Length - The running time in minutes of the item available for rent.
• Movie # - Unique identifier of the item available for rent.
• Movie Rental - An instance of a Movie being rented by a customer.
• Movie Type - The genre or classification associated with the items available for rent.
• Movie - An item that is available to rent, a motion picture or television production.
• MPAA Rating - Motion Picture Association of America evaluation. Valid values are: G, PG, PG-13, R, and NC-17.
• Rent Date - The date a Movie is rented by a Customer.
• Return Date - The date a rented Movie is to be returned to the store for restocking.
• Title - The name of the item available for rent.

Entity-Relationship Modeling
• Database designers plan the database design in a process called entity-relationship (ER) modeling.
• ER diagrams consist of entities, attributes and relationships.
• Other concepts
  – Entity classes
    • Groups of entities of a certain type.
  – Instance
    • The representation of a particular entity.
  – Identifiers
    • Attributes that are unique to that entity instance.
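
As an illustration of the data model components above, here is a minimal sketch that stores the MOVIE entity from the Entity Example as a table. It uses Python's built-in sqlite3 module, which is a choice made for this sketch and not something the slides prescribe. The table and column names (movie, movie_number, and so on) are assumed spellings of the slide's attribute names, and storing the rental rate as whole dollars is also an assumption. Each row is an instance, and movie_number plays the role of the primary key.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Entity MOVIE: each column is an attribute; movie_number is the primary key.
conn.execute(
    """
    CREATE TABLE movie (
        movie_number INTEGER PRIMARY KEY,  -- identifier: unique per instance
        name         TEXT NOT NULL,        -- non-key attribute
        rating       TEXT NOT NULL,        -- non-key attribute
        rental_rate  INTEGER NOT NULL      -- non-key attribute, whole dollars
    )
    """
)

# The three instances (records) listed in the Entity Example.
movies = [
    (12345345, "Die Hard", "PG13", 3),
    (23456781, "Wings", "PG", 2),
    (65656565, "Black Beauty", "G", 2),
]
conn.executemany("INSERT INTO movie VALUES (?, ?, ?, ?)", movies)

# The primary key picks out exactly one record.
wings = conn.execute(
    "SELECT name, rating, rental_rate FROM movie WHERE movie_number = ?",
    (23456781,),
).fetchone()
print(wings)

# A second record that reuses an existing key value is rejected.
try:
    conn.execute(
        "INSERT INTO movie VALUES (?, ?, ?, ?)",
        (12345345, "Duplicate Key", "G", 1),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

The failed insert shows what "uniquely identifies a record" means in practice: the database refuses a second instance with the same key value, while non-key attributes such as rating may repeat freely.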

Sample Information Model (Relational - IDEF 1X) (SET TYPE)

Entity-Relationship Diagram Model

4.3 Database Management Systems
Key Definitions
• Database management system (DBMS)
  – A set of programs that provide users with tools to add, delete, access, and analyze data stored in one location.
• Relational database model
  – A popular type of DBMS that is based on the concept of two-dimensional tables.
• Structured Query Language (SQL)
  – SQL is a standard interactive and programming language for querying and modifying data and managing databases.
  – The core of SQL is formed by a command language that allows the retrieval, insertion, updating, and deletion of data, and performing management and administrative functions.
• Query by Example (QBE)
  – Allows users to fill out a grid or template to construct a filter or description of the data they want.

Example of a Relational Database Table

Normalization
• A set of rules for analyzing the attributes of an information model
  – Eliminate model redundancy
  – Ensure model consistency
  – Verify structural correctness
  – Maximize stability
• However, normalization cannot validate a model's accuracy in reflecting the business meaning of the information

Normal Forms
• Sequential steps for achieving an optimized and logically desirable information model
• Provides a common foundation from which an efficient physical database design can be created
• There are six degrees of normal form - the first three are usually sufficient for most modeling applications
  • First normal form
  • Second normal form
  • Third normal form
  • Boyce/Codd normal form
  • Fourth normal form
  • Fifth normal form

First Normal Form - (1NF)
• Every key and non-key attribute of an entity must be single valued
• No entity instance can have multiple values for a given attribute
• This is the No Repeat Rule
• A violating entity is corrected by moving repeating or multivalued attributes to another, dependent (child) entity

First Normal Form - Example

Violating entity: RESTAURANT (REST NAME, ADDRESS, PHONE #, EMPLOYEE NAME)

REST NAME      ADDRESS           PHONE #    EMPLOYEE NAME
BURGER KING    123 NORTH ST      123-2345   JOHN, SUE, LISA
TACO HOUSE     345 126TH PLACE   765-8907   MARY, BILL
FISH COMPANY   77 SUNSET AVE     395-5682   ED, SAM, JOSE, RICK

Corrected model:
RESTAURANT (REST NAME, ADDRESS, PHONE #) employs EMPLOYEE (EMPLOYEE NAME, REST NAME, POSITION)

Second Normal Form - (2NF)
• An entity is in second normal form if it is in first normal form and each non-key attribute is dependent on the entire primary key
• No non-key attribute instance can be determined by knowing just part of an entity instance's key
• A violating entity is corrected by moving to a parent entity any attributes that depend on only a subset of the primary key

Second Normal Form - Example

Violating entity: RESTAURANT ORDER (REST NAME, SUPPLIER NAME, ORDER ITEM, SUPPLIER PHONE #)

REST NAME      SUPPLIER NAME    ORDER ITEM   SUPPLIER PHONE #
BURGER KING    SAM'S PRODUCE    BEEF         123-2345
TACO HOUSE     SALSA INC.       PEPPERS      765-8907
FISH COMPANY   SAM'S PRODUCE    SNAPPER      123-2345

Corrected model:
SUPPLIER (SUPPLIER NAME, PHONE #) fills RESTAURANT ORDER (REST NAME, ORDER ITEM, SUPPLIER NAME (FK1))

Third Normal Form - (3NF)
• An entity is in third normal form if it is in second normal form and each non-key attribute depends on the entire primary key and on nothing other than the key
• No non-key attribute instance can be determined by knowing the value of another non-key attribute for the same instance
• A violating entity is corrected by moving to a parent entity any attributes exhibiting transitive dependencies (non-key attributes that depend not only on the whole key but also on other non-key attributes)

Third Normal Form - Example

Violating entity: RESTAURANT RESERVATION (REST NAME, RESERVATION #, CUSTOMER NAME, CUSTOMER PHONE #, TIME, # IN PARTY)

REST NAME      RES #   CUST NAME   CUST PH #   TIME       # IN PARTY
BURGER KING    12      F. JONES    123-2345    11:00 AM   4
TACO HOUSE     234     R. SMITH    765-8907    2:30 PM    4
FISH COMPANY   88      F. JONES    123-2345    8:15 PM    6

Corrected model:
CUSTOMER (CUSTOMER NAME, PHONE #) makes RESTAURANT RESERVATION (REST NAME, RESERVATION #, CUSTOMER NAME (FK1), TIME, # IN PARTY)

Example #2 Non-Normalized Relation

Normalizing the Database (part A)

Normalizing the Database (part B)

Summary: Normalization Produces Order

Database that Catches Plagiarists (P. 116)
• A Turnitin originality report (http://www.turnitin.com)

4.4 Data Warehousing
• Data warehouse
  – A repository of historical data organized by subject to support decision makers in an organization.
  – Organized by business dimension or subject.
  – Data warehouses are multidimensional.
A Data Cube with three dimensions:
• customer,
• product, and
• time.

Data Warehousing (continued)
• Data warehouses are historical.
  – Historical data in data warehouses can be used for identifying trends, forecasting, and making comparisons over time.
• Data warehouses use Online Analytical Processing (OLAP).
  – OLAP involves the analysis of accumulated data by end users (usually in a data warehouse).
  – In contrast, Online Transaction Processing (OLTP) typically involves a database, where data from business transactions are processed online and as soon as they occur.

Data Warehouse Framework & Views
• Process of building and using a data warehouse.

Relational Databases
• First slide of five showing the relationship between relational databases and a multidimensional data structure (or data cube).

Multidimensional Database View

Equivalence Between Relational and Multidimensional Databases (three slides)

Benefits of Data Warehousing
• End users can access data quickly and easily via Web browsers because the data are located in one place.
• End users can conduct extensive analysis with data in ways that may not have been possible before.
• End users have a consolidated view of organizational data.

Data Marts
• A data mart is a small data warehouse, designed for the end-user needs in a strategic business unit (SBU) or a department.
• Data marts are far less costly than an enterprise data warehouse, typically by at least an order of magnitude.
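
The three-dimensional data cube from section 4.4 (customer, product, and time) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer names, product names, years, unit counts, and the roll_up helper are all hypothetical and do not come from the slides. Each cell of the cube is keyed by one value per dimension, and an OLAP-style roll-up sums the cells along the dimensions that are not of interest.

```python
from collections import defaultdict

# Dimension order used by every cell key in the cube.
DIMENSIONS = ("customer", "product", "time")

# Hypothetical facts: (customer, product, time) -> units sold.
cube = {
    ("Acme", "nuts", 2008): 50,
    ("Acme", "bolts", 2008): 40,
    ("Acme", "nuts", 2009): 60,
    ("Zenith", "nuts", 2008): 30,
    ("Zenith", "bolts", 2009): 70,
}


def roll_up(cells, keep):
    """Sum cube cells, keeping only the dimensions named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for key, value in cells.items():
        # Drop the dimensions being summed over.
        reduced = tuple(key[i] for i in positions)
        totals[reduced] += value
    return dict(totals)


# Units per product, summed over all customers and years.
by_product = roll_up(cube, ["product"])
print(by_product)

# Units per customer per year, summed over products.
by_customer_time = roll_up(cube, ["customer", "time"])
print(by_customer_time)
```

Keeping all three dimensions returns the original cells, and keeping none collapses the cube to a single grand total. This is the sense in which end users get a consolidated view that can be sliced by business dimension.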

4.5 Data Governance: An enterprise-wide approach to managing data
• Data governance definition
  – An approach to managing data and information across an entire organization.
• Master Data Management
  – A method that organizations use in data governance.
  – Comprises a set of processes and tools for collecting, aggregating, matching, consolidating, quality-assuring, persisting and distributing data throughout an organization in such a way as to ensure consistency and control in the ongoing maintenance and application use of this information.
• Master data
  – The set of core, non-transactional data, such as customer, product, employee, and location, that spans all enterprise information systems.

Relationship Among Executive Management, IT Governance, and Data Governance
• Shows the relationship between data governance and data management.
• Master Data Management

Data Governance (continued)

4.6 Knowledge Management
• Knowledge management (KM)
  – A process that helps organizations manipulate important knowledge that is part of the organization’s memory, usually in an unstructured format.
• Knowledge
  – Is something that is contextual, relevant, and actionable.
  – Also known as intellectual capital (or intellectual assets).

Knowledge Management (continued)
Explicit Knowledge (above the waterline)
• Objective, rational, technical knowledge that has been documented.
• Examples: policies, procedural guides, reports, products, strategies, goals, core competencies.
Tacit Knowledge (below the waterline)
• Subjective or experiential learning.
• Examples: experiences, insights, expertise, know-how, trade secrets, understanding, skill sets, and learning.

Knowledge Management (continued)
• Knowledge management systems (KMSs)
  – Systems that use information technologies to systematize, enhance, and expedite intra- and inter-organization knowledge management.
• Best practices
  – The most effective and efficient ways/processes of doing things.

Knowledge Management System Life Cycle
Six steps
1. Create knowledge
2. Capture knowledge
3. Refine knowledge
4. Store knowledge
5. Manage knowledge
6. Disseminate knowledge

Chapter Closing Case (P. 131)
• High CVM passengers travel in style