* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download data modelling - Applied Computer Science
Survey
Document related concepts
Transcript
ACS1803 Lecture Outline 2 DATA MANAGEMENT CONCEPTS Text, Ch. 3 How do we store data (numeric and character records) in a computer so that we can optimize on storage and retrieve data efficiently and flexibly [different possible combinations]? - we have already seen data management basics in material on MS Access dbms - we now expand this to computers in general The Database Approach • Traditional approach to data management: – Each distinct operational system used data files dedicated to that system • Database approach to data management: – Pool of related data is shared by multiple application programs 2 - firstly, there were individual files for each set of programs later, database approach: o one centralized source of data for all programs to tap into o minimizes redundancy and provides retrieval efficiency o can have one database for some subject area: student database, employee database how do we structure the data [on disk] for a database? Different people need different combinations [views] of data; yet the database must satisfy all views Relational Database - data set up as a centralized collection of tables - tables are linked by common columns of data o - tables are designed together to minimize repetition - this is the most common database structure - - this is the one used in microcomputer databases and many larger ones too The database management system -software that sets up the database structure (‘skeleton’ on disk according to a certain model, fills the structure with data and retrieves the data to provide meaningful information (p. 130 bot) -related to using a dbms: - data dictionary: defines each field and record, explains what each field means and who is authorized to update it (p. 143) -query language: used to extract data that satisfy certain criteria from a database The Database Management System Data Dictionary • Is a document that database designers prepare to help individuals enter data • Provides several pieces of information about each attribute in the database including: - Name - Key (is it a key or part of a key?) - Data Type (date, alphanumeric, numeric, etc.) - Valid Value (the format or numbers allowed) • Can be used to enforce Business Rules which are captured by the database designer to prevent illegal or illogical values from entering the database. (e.g. who has authority to enter certain kinds of data) 6 e.g., SQL: Structured Query Language: - Popular language for making requests to a relational dbms (p. 145 bot) e.g., SELECT LAST NAME, DEPARTMENT, SALARY FROM EMPLOYEE WHERE DEPARTMENT = ‘4530’ AND SALARY > 25000 Popular dbms: Access - individual Oracle - Client / Server network Data bases and computer networks - can have centralized database (at one location) or, fragmented, distributed database: different parts of the database are stored in the locations where they are accessed most often, but continue to be fully accessible to others; Database Administrator (DBA): person in an IS department responsible for the database (p. 147) Databases on the Web: - the only way for organizations to conduct business on the Internet is to people outside the organizations access to their databases (e.g., items for auction) 138 bot Text databases Databases are thought of as consisting of structured records of fields; but can also have ‘databases’ of text - - these can be on the web and can use search engines to search for keywords - such ‘databases’ can come on CD-ROMs Data Warehousing - most regular organizational databases are ‘transactional’ and current; their contents change daily - but transaction data in such a database as of a certain time (a ‘snapshot’) can be archived - a data warehouse is a large (relational) database of archived data used to gain extra insights to support management decision making -combines data from databases across the organization --data mart: smaller collection of data that focuses on a particular subject or department Data Warehouses, Data Marts, and Data Mining (continued) Operational Informational Extract Data Department Databases • Day-to-day department transactions • Used primarily by departments Extract Data Data Warehouse Data Mart • Extracted department transactions • Used for business analysis • Extracted subset of a data warehouse • Used for highly specific business analysis 11 Data Warehousing and Mining p. 155-157 • data in warehouses is accumulated and may reflect many years of business activity • various business analysis programs (models) can then be applied to the data warehouse to gain insight • organizations often choose mainframe computers with multiple CPUs to store and manage data warehouses; • • Data Mining • the process of selecting, exploring, and modelling large amounts of data to discover previously unknown relationships - data mining software searches through large amounts of data for meaningful patterns of information often used by marketing managers to see purchasing patterns - can be used in banking: to find profitable customers and patterns of fraud Data Warehouses, Data Marts, and Data Mining (continued) • Predictive analysis: – Form of data mining that combines historical data with assumptions about future conditions to predict outcomes of events – Used by retailers to upgrade occasional customers into frequent purchasers – Software can be used to analyze a company’s customer list and a year’s worth of sales data to find new market segments 14 Database Design: -before we can even start asking a dbms to set up a database structure for us, we must know what data we need to store and in what form -this work of determining what our database should look like is called database design - it is the ‘up front’ work in using dbms - - a first step in database design is data modelling we look at a business and see what main items we want to keep data about [data entities] and what are the organizational relationships between these items we then draw a diagram of entities and relationships between them; this is called data modelling - we draw an entity-relationship diagram (see board) STUDENT takes COURSE Is taken by - from this first step a data analyst would eventually determine a database design (schema) (what tables we will have and where the common columns will be) Designing Databases - Associations Entity Relationship Diagram (ERD) • Diagramming tool used to express entity relationships • Very useful in developing complex databases Example • Each Home Stadium has a Team (One-to-One) • Each Team has Players (One-to-Many) • Each Team participates in Games • For each Player and Game there are Game Statistics 17 More design • The entity-relationship diagram is then augmented to show attributes of each entity -Then, a large ERD is “trimmed down’ (normalized) and after that, entities usually become tables and attributes become columns in the tables Relationships in an ERD • Relationships between entities in the ERD dictate where the common column should be to relate certain tables The Relational Model - Example 20 REA Model • - an ERD where the entities model specific business RESOURCES, EVENTS and AGENTS is called a Resource Event Agent model - This model is useful in designing databases for various business information systems (e.g. modern accounting systems) REA Example • COOKIES(resource) Offered for SALE(ev) org by SALESPERSON (ag) attended by CUSTOMER(ag) REA Example 23 Why is database knowledge important? how you structure your data, so you shall be able to process your data -***a database is the ‘back end’ of a computer-based information system -See handout (MS Word) on Data Modeling Explanation