Lecture 7
Data and Knowledge Management
J. S. Chou, P.E., Ph.D.
Assistant Professor
Department of Business Administration
National Chung Cheng University

Objectives
1. Describe why databases have become so important to organizations
2. Describe what databases and database management systems are and how they work
3. Explain how organizations are getting the most from their investment in database technologies
4. Describe what is meant by knowledge management and knowledge assets, as well as the benefits and challenges of deploying a knowledge management system

Database Technology
• A collection of related data organized in a way that makes it valuable and useful
• Allows organizations to retrieve, store, and analyze information easily
• Is vital to an organization’s success in running operations and making decisions

Database Terminology
Entities
• Things we store information about (e.g. persons, places, objects, events)
• Have relationships to other entities (e.g. the entity Student has a relationship to the entity Grades in a University Student database)
Attributes
• Pieces of information about an entity (e.g. Student ID, Name, etc. for the entity Student)

Relationship of DBMS Concepts to Others?

Levels of a Database Management System (DBMS)
Terms and definitions, from lowest to highest level:
• Field: Individual characteristics about an ENTITY. Fields are also called attributes or columns, depending on the type of DBMS
• Record: A group of fields or attributes describing a single instance of an ENTITY. Records are also called rows, depending on the DBMS
• File: A collection of records or instances for a given ENTITY. Files are also called tables, depending on the DBMS
• Database: A collection of files or entities containing information to support a given system or a particular topic area

View of a Database Table or File
(Figure labels: Attribute (One Column); Attribute Type; Record (One Row))

File Processing vs Database Approach Summary
File Processing Approach (Old School)
• Storage Media: Sequential tapes or files
• Data: stored in long sequential files
• Organization: redundant data in multiple files
• Efficiency: data embedded to support processing
• Updates: requires multiple updates in many files
• Processing: slower query/faster processing
Database Approach (New School, today)
• Storage Media: Direct Access Storage Device (DASD)
• Data: stored in related tables
• Organization: redundant data minimized/eliminated
• Efficiency: data stored only in tables
• Updates: requires one or a few updates for a data field
• Processing: faster query/slower processing

Roles in Database Development and Use
Database Administrator (DBA)
• Designs, develops, and monitors performance of databases
• Enforces policy and standards for data use and security
Systems Programmer
• Creates business applications that connect to databases
• Tests the new systems and databases before use
Systems Analyst
• Defines data requirements working with a DBA
• Incorporates the database design into new program designs

Database Systems Activities – Data Entry
Example
• Data is entered from paper employment applications into a form entry screen
• The entry forms are designed to match the paper forms for easy entry
• The form data is processed by the entry program and then stored in the employment database
(Figure labels: Employment Applications; Enter Forms (Form Entry Screen); Form Entry Program; Employment DB)

Database Systems Activities – Query
Query – A database function that extracts and displays information from a database given selection parameters.
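A query of this kind can be sketched in Python with the standard `sqlite3` module. This is only a minimal illustration under stated assumptions: the `applicants` table, its columns, the sample rows, and the helper `recent_applicants` are hypothetical and do not come from the lecture. The selection parameter is a cutoff date computed from a number of days.

```python
import sqlite3
from datetime import date, timedelta


def recent_applicants(conn, today, days=30):
    """Return applicants whose entry date falls within the last `days` days."""
    # ISO-formatted dates (YYYY-MM-DD) sort correctly as plain text.
    cutoff = (today - timedelta(days=days)).isoformat()
    cur = conn.execute(
        "SELECT applicant_id, name, entered_on "
        "FROM applicants WHERE entered_on >= ? ORDER BY entered_on",
        (cutoff,),  # the selection parameter is passed separately from the SQL text
    )
    return cur.fetchall()


# Hypothetical employment database, kept in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE applicants ("
    "applicant_id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "entered_on TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO applicants VALUES (?, ?, ?)",
    [
        (1, "Lee", "2024-01-15"),
        (2, "Chen", "2024-03-10"),
        (3, "Wang", "2024-03-28"),
    ],
)

today = date(2024, 3, 31)  # fixed date so the result is reproducible
rows = recent_applicants(conn, today)
print(rows)  # [(2, 'Chen', '2024-03-10'), (3, 'Wang', '2024-03-28')]
```

Passing the cutoff as a bound parameter (`?`) rather than pasting it into the SQL string keeps the query text fixed while the selection value changes.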
SQL (Structured Query Language)
• A language to select and extract data from a database
• The industry standard language for relational databases
QBE (Query by Example)
• A technique that allows a user to design a query on a screen by dragging and placing the query fields in their desired locations
Example – Display applicants entered in the last 30 days
• Query parameters are selected in the query request screen
• The database program uses SQL to query and present the result
(Figure labels: Query Request; Query Program; Employment Query)

Database Systems Activities – Report
Report – A database function that extracts and formats information from a database for printing and presentation
Report Generator
• A specialized program that uses SQL to retrieve and manipulate data (aggregate, transform, or group)
• Reports are designed using standard templates or can be custom generated to meet informational needs
Example – Report on applicants entered in the last 30 days
• Report parameters are selected in the report request screen
• The database program uses SQL to query and present the result
(Figure labels: Query Request; Query Program; Employment Report)

Designing Databases – Data Model
Data Model
• A map or diagram that represents entities and their relationships
• Used by Database Administrators to design tables with their corresponding associations
Example: ERD (Entity Relationship Diagram)

Designing Databases – Keys
Database Keys
Mechanisms used to identify, select, and maintain one or more records using an application program, query, or report
Primary Key
A unique attribute type used to identify a single instance of an entity.
Compound Primary Key
A unique combination of attribute types used to identify a single instance of an entity
Secondary Key
An attribute that can be used to identify one or more records within a table with a given value

Designing Databases – Keys (Example)
• Entities are translated into tables (Students and Grades)
• Entities are joined by common attributes
Primary Key
- Student ID
Secondary Key
- Major
Compound Primary Key
- Student ID
- Course ID
- Sec No.
- Term

Designing Databases - Associations
Associations
• Define the relationships one entity has to another
• Determine necessary key structures to access data
• Come in three relationship types:
- One-to-One
- One-to-Many
- Many-to-Many
Foreign Key
• An attribute that appears as a non-primary key in one entity (table) and as a primary key attribute in another entity (table)

Designing Databases - Associations
Entity Relationship Diagram (ERD)
• Diagramming tool used to express entity relationships
• Very useful in developing complex databases
Example
• Each Home Stadium has a Team (One-to-One)
• Each Team has Players (One-to-Many)
• Each Team participates in Games
• For each Player and Game there are Game Statistics

Designing Databases - Associations

Designing Databases – Associations (Example)

The Relational Model
• The most common type of database model used today in organizations
• Is a three-dimensional model compared to the traditional two-dimensional database models
- Rows (first dimension)
- Columns (second dimension)
- Relationships (third dimension)
• The third dimension is what makes this model so powerful, because any row of data can be related to any other row or rows of data

The Relational Model - Example

The Relational Model - Normalization
Normalization
• A technique to make complex databases more efficient by eliminating as much redundant data as possible
• Example: Database with redundant data

The Relational Model - Normalization
Normalized Database

The Relational Model – Data Dictionary
Data Dictionary
• Is a document that database designers prepare to help individuals enter data
• Provides several pieces of information about each attribute in the database, including:
- Name
- Key (is it a key or part of a key)
- Data Type (date, alphanumeric, numeric, etc.)
- Valid Value (the format or numbers allowed)
• Can be used to enforce Business Rules, which are captured by the database designer to prevent illegal or illogical values from entering the database (e.g. who has authority to enter certain kinds of data)

Online Transactional Processing (OLTP)
• The mechanism by which customers, suppliers, and employees process business transactions for an organization
• These users conduct transactions online through internal systems and external Websites for processing and storage

Operational vs Informational Systems

Organizational Use of Databases
Department Databases (Operational)
• Day-to-day department transactions
• Used primarily by departments
Data Warehouse (Informational)
• Extracted department transactions
• Used for business analysis
Data Mart
• Extracted subset of a data warehouse
• Used for highly specific business analysis
(Figure arrows: Extract Data; Extract Data)

Online Analytical Processing (OLAP)
• Graphical software tools that provide complex analysis of data stored in a database
• OLAP tools enable users to analyze different dimensions of data beyond the data summaries and data aggregations of normal database queries
• The OLAP Server is the chief component of an OLAP system; it understands how the data is organized and has special functions for analyzing data
• OLAP can provide time series and trend analysis views of data, data drill-downs, and the ability to answer “what-if” and “why” questions as part of its functions

Data Mining
• Is a method companies use to analyze information to better understand their customers, products, markets, or any other phase of their business for which they have data
• With data mining tools you can graphically drill down, sort or extract data based on certain conditions, and perform a variety of statistical analyses
• Data mining applications are very powerful and use highly complex algorithms to analyze data and identify opportunities

Data Warehouse Example

Uses of Data Warehousing

Data Life Cycle Process Continued
The result - generating knowledge

Data Sources
The data life cycle begins with the acquisition of data from data sources. These sources can be classified as internal, personal, and external.
• Internal Data Sources are usually stored in the corporate database and are about people, products, services, and processes.
• Personal Data is documentation on the expertise of corporate employees, usually maintained by the employee. It can take the form of:
– estimates of sales
– opinions about competitors
– business rules
– procedures
– etc.
• External Data Sources range from commercial databases to government reports.
• Internet and Commercial Database Services are accessible through the Internet.

Methods for Collecting Raw Data
The task of data collection is fairly complex, which can create data-quality problems requiring validation and cleansing of data.
• Collection can take place
– in the field
– from individuals
– via manual methods
  • time studies
  • surveys
  • observations
  • contributions from experts
– using instruments and sensors
– through transaction processing systems (TPS)
– via electronic transfer
– from a web site (clickstream)

Methods for Managing Data Collection
One way to improve data collection from multiple external sources is to use a data flow manager (DFM), which takes information from external sources and puts it where it is needed, when it is needed, in a usable form.
• A DFM consists of
– a decision support system
– a central data request processor
– a data integrity component
– links to external data suppliers
– the processes used by the external data suppliers

Data Quality and Integrity
Data quality (DQ) is an extremely important issue, since quality determines the data’s usefulness as well as the quality of the decisions based on the data. Data integrity means that data must be accurate, accessible, and up-to-date.
• Intrinsic DQ: Accuracy, objectivity, believability, and reputation.
• Accessibility DQ: Accessibility and access security.
• Contextual DQ: Relevancy, value added, timeliness, completeness, amount of data.
• Representation DQ: Interpretability, ease of understanding, concise representation, consistent representation.
Data quality is the cornerstone of effective business intelligence.

Knowledge Management Definitions
Knowledge Management
The process an organization uses to gain the greatest value from its knowledge assets
Knowledge Assets
All underlying skills, routines, practices, principles, formulas, methods, heuristics, and intuitions, whether explicit or tacit
Explicit Knowledge
Anything that can be documented, archived, or codified, often with the help of information systems
Tacit Knowledge
The processes and procedures on how to effectively perform a particular task, stored in a person’s mind

Knowledge Management System (KMS)
Best Practices
Procedures and processes that are widely accepted as being among the most effective and/or efficient
Primary Objective
How to recognize, generate, store, share, and manage this tacit knowledge (Best Practices) for deployment and use
Technology
Generally not a single technology but a collection of tools that includes communication technologies (e.g. e-mail, groupware, instant messaging) and information storage and retrieval systems (e.g. database management system) to meet the Primary Objective

Knowledge – Knowledge Management Systems
The goal of knowledge management is for an organization to be aware of individual and collective knowledge so that it may make the most effective use of the knowledge it has. Firms recognize the need to integrate both explicit and tacit knowledge into a formal information system, the Knowledge Management System (KMS).
• A functioning knowledge management system follows six steps in a cycle, dynamically refining information over time:
1. Create knowledge.
2. Capture knowledge.
3. Refine knowledge.
4. Store knowledge.
5. Manage knowledge.
6. Disseminate knowledge.
As knowledge is disseminated, individuals develop, create, and identify new knowledge or update old knowledge, which they feed back into the system.

Knowledge – Knowledge Management Systems Continued
Knowledge Management Cycle

Knowledge Management – Information Technology
Knowledge management is more than a technology or product; it is a methodology applied to business practices. However, information technology is crucial to the success of knowledge management systems.
• Components of Knowledge Management Systems:
– Communication technologies allow users to access needed knowledge and to communicate with each other.
– Collaboration technologies provide the means to perform group work.
– Storage and retrieval technologies (database management systems) store and manage knowledge.

Knowledge Management – Integration
Knowledge management systems integration.

Benefits and Challenges of Knowledge Management