Chapter 3
Content Management
Information Technology & Management
Thompson Cats-Baril
The McGraw-Hill Companies, Inc. 2002

Chapter Objectives
• To understand how digital content is represented.
• To have an appreciation for how transactions are recorded and processed.
• To understand the role of a database management system (DBMS) in creating and using databases.
• To appreciate the different types of DBMSs available and understand the trends in DBMSs.
• To appreciate the potential for using data mining tools to derive insights from data stored in databases and data warehouses.

Data Representation
• A byte is typically 8 bits.
• A bit is the smallest item information technology can process, normally either a 1 or 0.
• A field or data element is the smallest unit of data that has meaning to humans.
  – Examples include EmployeeNumber, EmployeeName, Department, and StartDate
• Field is normally used to describe the field name.
• Data element is used to describe the contents of the field.

Data Representation
• A record is a collection of fields that contain information concerning a specific thing or event.
  – An example of an employee record would include the four previous fields
    • EmployeeNumber=“10121”
    • EmployeeName=“Greenwood, Marie-Louise”
    • Department=“Customer Service”
    • StartDate=“05/01/2002”
• A collection of records is called a file.
• Records are usually identified by a key field or primary key.
• A group of related files would be referred to as a database.

File Access
• Sequential Access
  – A specific record is located by starting at the beginning of the file and scanning each record until the desired record is located.
• Direct Access
  – A specific record is located by going directly to the correct folder or close to it.
  – One popular technique is hashing, based on a mathematical algorithm.
    • The hashing algorithm is applied to the primary key field to generate a storage location on a physical storage device.

File Access
• ISAM, or indexed sequential access method
  – In between sequential and direct access
  – An index is maintained that points to sections of records in the file.
  – When a specific record is requested, the database software goes to the first record of the section.
  – It then reads the records in that section sequentially until the correct record is located.

Transaction Processing
• A transaction is the record of an event.
• Transaction processing involves the use of human procedures and/or computer programs to store, retrieve, and manipulate records of events.
• Master File
• Transaction File – information relevant to the most recent transaction.

Transaction Processing
• Master File
• Transaction File
• File Processing System (used to store, retrieve, and manipulate records within files)
• Sequential File Organization
• Data Redundancy

Data Processing
• DBMS – Database Management System
  – Data Definition Language (used in conjunction with the Data Dictionary)
  – Data Dictionary
  – Data Manipulation Language (program for retrieving and manipulating data)
  – Application Generators (easy-to-use queries for retrieving files)
  – Data Administration (e.g., backing up data)
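The three file access methods from the File Access slides (sequential, direct via hashing, and ISAM) can be compared with a small in-memory sketch. Everything here is an illustrative assumption rather than how any particular DBMS stores data: the record values, the modulo hash, the bucket count, and the section size are all made up, and a Python list stands in for a file on disk. Each lookup returns the record together with the number of records it had to read.

```python
import bisect

# Hypothetical "file": 20 employee records sorted by primary key.
RECORDS = [
    {"EmployeeNumber": n, "EmployeeName": f"Employee {n}"}
    for n in range(10100, 10200, 5)
]


def sequential_lookup(records, key):
    """Sequential access: scan from the first record until the key matches."""
    reads = 0
    for record in records:
        reads += 1
        if record["EmployeeNumber"] == key:
            return record, reads
    return None, reads


NUM_BUCKETS = 7  # assumed number of storage locations


def bucket_for(key):
    """Toy hashing algorithm: map a primary key to a storage location."""
    return key % NUM_BUCKETS


def build_hashed_file(records):
    """Place every record in the bucket its primary key hashes to."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for record in records:
        buckets[bucket_for(record["EmployeeNumber"])].append(record)
    return buckets


def direct_lookup(buckets, key):
    """Direct access: hash the key, then search only that one bucket."""
    reads = 0
    for record in buckets[bucket_for(key)]:
        reads += 1
        if record["EmployeeNumber"] == key:
            return record, reads
    return None, reads


SECTION_SIZE = 4  # assumed number of records per indexed section


def build_index(records):
    """ISAM index: the first key of each section of the sorted file."""
    return [records[i]["EmployeeNumber"]
            for i in range(0, len(records), SECTION_SIZE)]


def isam_lookup(records, index, key):
    """ISAM: use the index to pick a section, then scan that section."""
    section = bisect.bisect_right(index, key) - 1
    if section < 0:
        return None, 0  # key is smaller than every key in the file
    start = section * SECTION_SIZE
    reads = 0
    for record in records[start:start + SECTION_SIZE]:
        reads += 1
        if record["EmployeeNumber"] == key:
            return record, reads
    return None, reads


if __name__ == "__main__":
    buckets = build_hashed_file(RECORDS)
    index = build_index(RECORDS)
    key = 10170
    print("sequential:", sequential_lookup(RECORDS, key))
    print("direct:    ", direct_lookup(buckets, key))
    print("ISAM:      ", isam_lookup(RECORDS, index, key))
```

In this sketch the sequential scan reads every record up to the target, while the hashed and indexed lookups read only a few. That difference in records read is the trade-off the slides describe.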
Data Capture and Processing
• Batch Processing
  – Transactions are temporarily stored and then processed all at once.
• Real-Time Processing
  – Each transaction is processed as it occurs.
• OLTP – Online Transaction Processing
  – Combination of online data capture and real-time processing.

Relational Database Model
• Relational Database Model
  – Relations or tables
    • Two-dimensional
  – Keys
    • Primary Key – Uniquely identifies each record.
    • Foreign Key(s) – A primary key is placed in a second table to maintain a relationship.

Retrieving Data
• SQL – Structured Query Language
  – Is a data manipulation language incorporated in the DBMS.
  – SQL is a set of concise and powerful data management commands.
  – SELECT ORDER.OrderDate, ORDER.OrderTotal FROM ORDER WHERE ORDER.CustomerNumber=10
  – SQL can be embedded in a programming language (embedded SQL).

Presenting Information
• A report generator is a group of programs designed to facilitate the creation of standard, formatted output that is referred to as a report.
  – Paper
  – Computer monitor

DBMS Vendors
• IBM
  – Largest share of DBMSs running on a mainframe.
• Oracle
  – Leader in DBMSs running on servers.
• Microsoft
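The query on the Retrieving Data slide, along with the primary key and foreign key ideas from the Relational Database Model slide, can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: the CUSTOMER table, the OrderNumber column, and all row values are hypothetical, and SQLite stands in for whatever DBMS the chapter has in mind. Because ORDER is a reserved word in SQL, the table name is quoted here.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two related tables (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.executescript("""
        CREATE TABLE CUSTOMER (
            CustomerNumber INTEGER PRIMARY KEY,   -- primary key
            CustomerName   TEXT NOT NULL
        );
        CREATE TABLE "ORDER" (
            OrderNumber    INTEGER PRIMARY KEY,   -- primary key
            OrderDate      TEXT NOT NULL,
            OrderTotal     REAL NOT NULL,
            -- foreign key: CUSTOMER's primary key placed in a second table
            CustomerNumber INTEGER NOT NULL
                REFERENCES CUSTOMER(CustomerNumber)
        );
    """)
    conn.executemany(
        "INSERT INTO CUSTOMER VALUES (?, ?)",
        [(10, "Example Customer A"), (11, "Example Customer B")],
    )
    conn.executemany(
        'INSERT INTO "ORDER" VALUES (?, ?, ?, ?)',
        [
            (1, "2002-05-01", 250.0, 10),
            (2, "2002-05-03", 75.5, 11),
            (3, "2002-05-07", 120.0, 10),
        ],
    )
    conn.commit()
    return conn


def orders_for_customer(conn, customer_number):
    """Run the slide's SELECT, with the customer number as a parameter."""
    cursor = conn.execute(
        'SELECT "ORDER".OrderDate, "ORDER".OrderTotal '
        'FROM "ORDER" '
        'WHERE "ORDER".CustomerNumber = ? '
        'ORDER BY "ORDER".OrderNumber',
        (customer_number,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for order_date, order_total in orders_for_customer(conn, 10):
        print(order_date, order_total)
```

Issuing SQL from inside a host program like this is similar in spirit to the embedded SQL the slide mentions, although classic embedded SQL relies on a precompiler rather than a library call.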
Performance Criteria of DBMSs
• Cost
  – Includes software license fees
  – Service and maintenance fees
  – Consulting fees for installation
• Compatibility
  – Ability to support necessary applications without major modification
• Capacity
  – Number of simultaneous users
  – Volume of transactions

Object-Oriented Database Model
• An object-oriented database management system is based on a model that integrates object-oriented concepts with the database system.
• Object-Oriented Databases
• CAD – Computer-Aided Design
• Object Query Language (OQL)

Data Warehouses
• A data warehouse is a special type of database that is designed to support decision making, rather than transaction processing.
  – OLAP, or Online Analytical Processing: online systems that access databases and data warehouses and then process data to support decision making
  – Data Mart: smaller subset of a data warehouse
  – Multidimensional Database (not just 2D)

Data Mining: getting the most out of the data that have been collected
• Customer Relationship Management, or CRM
• Query and Reporting
• Neural Network Tools (use raw data points as inputs and attempt to identify patterns)
• Ad targeting and direct marketing

Distributed Databases
• A distributed database is one in which the database is duplicated, allowing users at different locations to access exact replicas of the database.
• Issues with Distributed Databases
  – Identical Copies
  – Backups
• Security (Imarbank lost its entire database for weeks)
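The Data Warehouses slide mentions OLAP, data marts, and multidimensional databases without showing what "more than 2D" looks like. The sketch below is one illustrative way to picture it, not a description of any real OLAP product: the sales facts, the three dimension names, and the helper functions `roll_up` and `data_mart` are all hypothetical. Each fact is a cell addressed by three dimensions, and analysis means summing cells along the dimensions of interest.

```python
from collections import defaultdict

# Hypothetical dimensions and sales facts (region, product, quarter, sales).
DIMENSIONS = ("region", "product", "quarter")
FACTS = [
    ("East", "Widget", "Q1", 100),
    ("East", "Widget", "Q2", 150),
    ("East", "Gadget", "Q1", 80),
    ("West", "Widget", "Q1", 120),
    ("West", "Gadget", "Q2", 60),
]


def roll_up(facts, keep):
    """Sum the sales measure, grouping only by the dimensions in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for fact in facts:
        cell = tuple(fact[p] for p in positions)
        totals[cell] += fact[-1]  # the last element is the sales measure
    return dict(totals)


def data_mart(facts, dimension, value):
    """Return the subset of facts where one dimension has a fixed value."""
    position = DIMENSIONS.index(dimension)
    return [fact for fact in facts if fact[position] == value]


if __name__ == "__main__":
    print("By region:        ", roll_up(FACTS, ("region",)))
    print("By region/quarter:", roll_up(FACTS, ("region", "quarter")))
    widget_mart = data_mart(FACTS, "product", "Widget")
    print("Widget by quarter:", roll_up(widget_mart, ("quarter",)))
```

A real data warehouse would hold far more data and precompute many of these aggregates. The point here is only that the same facts can be summarized along any combination of dimensions to support a decision.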