ISY 4340 Class Introduction
Foundations of Database Systems
Class Introduction
G. Green

Agenda
• Introductions
• Seating Chart
• Course Overview
• Syllabus
• Case
• Database Development Overview

Foundations of Database Systems

Objectives
• Understand data-related activities of SDLC
• Implement data modeling, database design, and database implementation techniques
  › CASE (Visio)
  › Database (SQL Server)

Course Contents
• Lectures, Examples, In-Class Exercises
• Individual Assignments (3)
• Team Project* (3 parts)
• Quizzes (3)
• Exams (2)
*Can request teammates; see syllabus for Team Preferences deadline

Research
• International and US
• Service Learning & Kolb’s Learning Cycle

Periodic Assessments
• Some NOT graded; others are

Learning
• Participate:
  › Prepare --read & reread book, notes-- for each class
  › Attend, listen, be attentive, engaged
  › Ask and answer questions, & add to discussion
  › Do each assignment completely & in a timely and professional manner
• Take PLENTY of notes in class:
  › Do NOT just rely on PowerPoint
• Explore:
  › Go beyond classroom material

Class Resources
• Syllabus/Schedule, Grades, Attendance: http://canvas.baylor.edu
  › Schedule also contains links to all lecture slides, study guides, assignments and project write-ups
• Other Resources: http://blogs.baylor.edu/gina_green/mis-4340-resources/
  › NOTE: the syllabus/schedule on this website will NOT contain the links described above

Syllabus…

Introduction to Databases
Chapter 1

Topics
• Chapter 1
  › The Database Environment
  › Database Development Process
  › Big Data
• Chapter 9 (Pages 409 – 410)
• Chapter 10 (Pages 444 – 445, 446 – 447)
  › Master Data Management
  › Data Federation
• Chapter 11 (Pages 464 – 472, 486, 499 – 506)
  › Database Personnel
  › Metadata Management (e.g., Data Dictionaries)
  › Backup Facilities
  › Overview of Tuning the Database for Performance

Evolution of Database Technologies
[Timeline figure, 1960’s through 2000+. Technologies shown: Federated, MDDB, Hierarchical, Object, XML, Traditional Files, Relational, Network, NoSQL, Object-Relational, …….]

Figure 1-3 Old file processing systems: Example
[Figure; annotation: Duplicate Data]

Traditional File Processing Environment
Disadvantages:
  › Program-data dependence = “structural” & “data”
  › Limited data sharing = “islands of automation”
  › Duplication of data = “redundancy”
  › Lengthy development times
  › Excessive program maintenance

The Database Environment
[Figure]

Advantages of Databases
• Program-data independence
• Improved data sharing
• Minimal data redundancy
• Improved data accessibility/responsiveness
• Improved data consistency
• Faster application development
• Enforcement of standards
• Improved data quality
• Reduced program maintenance

Data and Database Administration
Chapter 11

Traditional Administration Definitions
• Data Administration: A high-level function that is responsible for the overall management of data resources in an organization, including maintaining corporate-wide definitions and standards
• Database Administration: A technical function that is responsible for physical database design and for dealing with technical issues such as security enforcement, database performance, and backup and recovery

Data People Involved in SDLC
• Data Administrators
  › Data(base) Analysts/Designers: requirements elicitation, design
  › Business (Intelligence) Analyst: BI requirements, design
  › Data Architects: strategy, governance
  › Data Stewards: quality, metadata, MDM
  › Business Analytics Engineer; Data Mining Engineer; Big Data Engineer; Data Scientist …: data analytics, statistics, mining; “big data” specialists
• Database Administrators
  › (System) DBAs: implementation/maintenance
  › Application DBAs
  › Procedural DBAs: stored code
  › e-DBAs: web-enabled DBMSs
  › Data Warehouse Administrators: ETL, DW implementation
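
The "duplication of data" disadvantage of file processing and the "minimal data redundancy" advantage of databases listed above can be illustrated with a small sketch. The customer data, the table names, and the use of SQLite through Python (rather than the SQL Server product used in this class) are all assumptions made for illustration only.

```python
import sqlite3

# File-processing style: two applications each keep their own copy of a
# customer's address (hypothetical data, not taken from the course).
orders_file = [{"order_id": 1, "customer": "Acme", "address": "1 Main St"}]
billing_file = [{"invoice_id": 7, "customer": "Acme", "address": "1 Main St"}]

# Updating only one file leaves the two copies inconsistent.
orders_file[0]["address"] = "9 Oak Ave"
inconsistent = orders_file[0]["address"] != billing_file[0]["address"]

# Database style: the address is stored once and shared by reference.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customer(customer_id));
    CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY,
                          customer_id INTEGER REFERENCES customer(customer_id));
    INSERT INTO customer VALUES (1, 'Acme', '1 Main St');
    INSERT INTO orders VALUES (1, 1);
    INSERT INTO invoice VALUES (7, 1);
""")

# One update; every application that joins to customer sees the new value.
conn.execute("UPDATE customer SET address = '9 Oak Ave' WHERE customer_id = 1")

order_addr = conn.execute(
    "SELECT c.address FROM orders o JOIN customer c ON c.customer_id = o.customer_id"
).fetchone()[0]
invoice_addr = conn.execute(
    "SELECT c.address FROM invoice i JOIN customer c ON c.customer_id = i.customer_id"
).fetchone()[0]
```

In the file-based half the two copies drift apart after a single update. In the database half both the order and the invoice resolve to the same stored address.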

Growing Skillset
• Relational database design, implementation
• Database programming
• ETL (extract, transform, load)
• Data warehousing design (star schema) and implementation (MDDB)
• Data analysis, reporting, and mining techniques
• Cloud database implementations
• Statistical modeling with tools such as R, SAS, or SPSS
• Data visualization tools
• Technologies for structured and unstructured data
  › Hadoop (Hadoop is an Apache project to provide an open-source implementation of frameworks for reliable, scalable, distributed computing and data storage.)
  › NoSQL
  › "NewSQL"
***See Big Data University for (mostly) free self-study training

Data Quality and Integration
Chapter 10

Metadata Management
• System Catalog
  › Part of DBMS
  › "Active" dictionary
• Data Dictionary
  › Typically "passive"
  › Extension of catalog metadata
• Information Repository (e.g., IRDS)
  › Standards for data dictionaries
  › Integrates dictionaries

Master Data Management
• "Ensuring the currency, meaning, and quality of reference data within and across various subject areas" (pg 444)
• Identify
  › Common Data Subjects
  › Common Data Elements
  › Sources of "the truth"
• Cleanse
• Update applications to reference Master Data repository
• Ensures consistency of key data (not ALL data) throughout organization

Database Development Process

SDLC for this class
Systems Development Life Cycle: DB Activities in SDLC
• Planning: Enterprise Modeling*
• Analysis: DB Scope, Requirements (Conceptual Data Model)
• Design: DB Design (Logical DB Design); DB Design (Physical DB Design)
• Implementation: DB Implementation (Load, Test, Eval, Op)
• DB Maintenance*

Enterprise Data Modeling
• Determine organizational data requirements
• Build enterprise data model
  › outcome is a very high-level Entity-Relationship Diagram
  › see:
    http://da.ks.gov/kito/ITPlans/data_maps06.ppt
    http://www.tdan.com/view-articles/5205
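
The "active" system catalog described on the Metadata Management slide above can be inspected directly in most DBMSs. The following is a minimal sketch, assuming SQLite accessed through Python, where the catalog is the `sqlite_master` table and column details come from `PRAGMA table_info`. The `student` table is a hypothetical example, and SQL Server (the product used in this class) exposes its metadata through its own catalog objects rather than these.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# The catalog is maintained by the DBMS itself: creating the table above
# automatically recorded it here, which is what makes the dictionary "active".
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Column-level metadata. Each row is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]
```

A "passive" data dictionary, by contrast, would be a separately maintained document or tool that someone has to update by hand when the table definition changes.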
[Figure; Source: http://www.tdan.com/view-articles/5205]

Conceptual Data Modeling
• Determine business rules
• Determine user data requirements
• Build conceptual data model
  › outcome is an Entity-Relationship Diagram (conceptual schema)

Logical Database Design
• Select database model
  › e.g., the Relational Model
• Transform conceptual (ERD) into logical (relational) data model
• Normalize data structures
  › Outcome is normalized, relational tables

Physical Database Design
• Select storage device(s)
• Select database product (e.g., SQL Server)
• Design fields, records, files (physical schema)
  › outcomes are detailed, physical definitions for:
    fields (data dictionary)
    records (space requirements for physical structures)*
    files (access methods)
*Will not do in this class

Database Implementation
• Create database file/table structures
• Establish access rights
• Create views (external schema)
• Load test data
• Write/test programs that process data
• Install database (with production data) into production operations
  › outcomes are secured database tables loaded with data

Database Maintenance
• Maintain database structures
• Storage/space management
• Performance, tuning
  › I/O Contention
  › CPU Usage
  › Application Tuning
• Data availability
• DBMS upgrades, "fixes"
• Backup, recovery
…….

Database Maintenance, cont…
• Backup
  › Full
  › Incremental
  › Differential
• Business Continuity
• Data Replication ("fallback")

Data and Database Administration
Chapter 11

Cloud Computing
• Business Model
  › Computing resources on demand
  › Need-based architectures
  › Internet-based delivery
  › Pay as you go
• History (VERY high-level and approximate)
  [Timeline figure, 50's through 2000's. Items shown: Time-sharing, Utility Computing, Virtual Machines, WWW, Cloud Computing, Personal Computers, Grid Computing]

Cloud Computing Services
• Impacts to Data(base) Administration
• See textbook page 469
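
Several of the Database Implementation steps listed above (create table structures, create views, load test data, write a program that processes the data) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses SQLite through Python instead of SQL Server, the `department` / `employee` tables and the `employee_directory` view are hypothetical names, and the "establish access rights" step is omitted because SQLite has no user-level permissions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# 1. Create table structures (normalized, relational tables), and
# 2. create a view (external schema) exposing only what one user group needs.
conn.executescript("""
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
    );
    CREATE VIEW employee_directory AS
        SELECT e.emp_name, d.dept_name
        FROM employee e JOIN department d ON d.dept_id = e.dept_id;
""")

# 3. Load test data (hypothetical rows).
conn.executemany("INSERT INTO department VALUES (?, ?)", [(1, "Sales"), (2, "IT")])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(10, "Ann", 1), (11, "Bob", 2)])
conn.commit()

# 4. A program that processes the data through the view rather than the base tables.
directory = conn.execute(
    "SELECT emp_name, dept_name FROM employee_directory ORDER BY emp_name"
).fetchall()
```

Reading through the view keeps the program independent of how the underlying tables are split, which is the point of separating the external schema from the logical one.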

Summary
• Evolution of Data Management
• Disadvantages of file processing
• Components of a DBMS Environment
• Database Advantages
• Database Concepts
• Database Development:
  › Overall SDLC
  › Database Activities in the SDLC
• Data Models/Schemas
  › What they represent
• People Involved in SDLC (esp. DB)
  › Traditional job divisions and responsibilities
  › Newer job titles