History of Computing - Database
CSE 3002
Prof. Steven A. Demurjian
Computer Science & Engineering Department
The University of Connecticut
371 Fairfield Way, Box U-255
Storrs, CT 06269-3255
[email protected]
http://www.engr.uconn.edu/~steve
(860) 486-4818 (Office)
(860) 486-3719 (CSE Office)
HoCDB.1

Overview
• Review the History of Databases
  – History of Databases, Prof. Ying Ding, Indiana University
    info.slis.indiana.edu/~dingying/Teaching/S511/new/lectures/DatabaseOverview.ppt
  – Introduction to Databases, Steven Demurjian, CSE4701 Class Notes
• Historical Perspective
  – 40 Years of VLDB (Very Large Database), Major database conference
    http://vldb.org/2015/wp-content/uploads/2015/09/40years.pdf
• Ethics in Databases
  – https://en.wikipedia.org/wiki/Database
  – Professional, Legal, and Ethical Issues in Data Management
    http://www.cs.utexas.edu/~mitra/csSpring2011/cs327/lectures/New_Slides/ch13.ppt
• Databases Steve’s Done

Database Overview
Prof Ying Ding
School of Informatics and Computing
Indiana University
info.slis.indiana.edu/~dingying/Teaching/S511/new/lectures/DatabaseOverview.ppt
S511 Session 2, IU-SLIS

Database Management System
• manages interaction between end users and database
Database Systems: Design, Implementation, & Management: Rob & Coronel

Database System Environment
• Hardware
• Software
  – OS
  – DBMS
  – Applications
• People
• Procedures
• Data

Evolution of Data Models
• Timeline: 1960s, 1970s, 1980s, 1990s, 2000+
• Models shown: File-based, Hierarchical, Network, Relational, Entity-Relationship, Object-oriented, Web-based

Database: Historical Roots
• Manual File System
  – to keep track of data
  – used tagged file folders in a filing cabinet
  – organized according to expected use
    • e.g. file per customer
  – easy to create, but hard to
    • locate data
    • aggregate/summarize data
• Computerized File System
  – to accommodate the data growth and information need
  – manual file system structures were duplicated in the computer
  – Data Processing (DP) specialists wrote customized programs to
    • write, delete, update data (i.e. management)
    • extract and present data in various formats (i.e. report)

File System: Example
[Figure from Rob & Coronel]

File System: Weakness
• Weakness
  – “Islands of data” in scattered file systems.
• Problems
  – Duplication
    • same data may be stored in multiple files
  – Inconsistency
    • same data may be stored under different names in different formats
  – Rigidity
    • requires customized programming to implement any changes
    • cannot do ad-hoc queries
• Implications
  – Waste of space
  – Data inaccuracies
  – High overhead of data manipulation and maintenance

File System: Problem Case
File       Field             Value
CUSTOMER   A_Name (15 char)  Carol Johnson
AGENT      A_Name (20 char)  Carol T. Johnson
SALES      AGENT (20 char)   Carol J. Smith
- inconsistent field name, field size
- inconsistent data values
- data duplication

Database System vs. File System
[Figure from Rob & Coronel]

Hierarchical Database
• Background
  – Developed to manage large amounts of data for complex manufacturing projects
  – e.g., Information Management System (IMS)
    • IBM-Rockwell joint venture
    • clustered related data together
    • hierarchically associated data clusters using pointers
• Hierarchical Database Model
  – Assumes data relationships are hierarchical
    • One-to-Many (1:M) relationships
  – Each parent can have many children
  – Each child has only one parent
  – Logically represented by an upside down tree

Hierarchical Database: Example
[Figure from Rob & Coronel]

Hierarchical Database Definition
CSE 4701
DBD   @NAME = University
SEGM  @NAME = Courses
FIELD @NAME = (Course#, SEQ), TYPE = CHAR, BYTES = 6
FIELD @NAME = Title, TYPE = CHAR, BYTES = 20
FIELD @NAME = Descrip, TYPE = CHAR, BYTES = 100
SEGM  @NAME = Prereq, PARENT = Courses
FIELD @NAME = (PCourse#, SEQ), TYPE = CHAR, BYTES = 6
FIELD @NAME = Title, TYPE = CHAR, BYTES = 20
SEGM  @NAME = Formats, PARENT = Courses
FIELD @NAME = (Section#, SEQ, M), TYPE = INT, BYTES = 2
FIELD @NAME = Quarter, TYPE = CHAR, BYTES = 10
FIELD @NAME = Campus, TYPE = CHAR, BYTES = 15
SEGM  @NAME = Faculty, PARENT = Formats
FIELD @NAME = (SSN, SEQ), TYPE = CHAR, BYTES = 9
FIELD @NAME = Name, TYPE = CHAR, BYTES = 30
FIELD @NAME = Ophone, TYPE = CHAR, BYTES = 7
SEGM  @NAME = Student, PARENT = Formats
FIELD @NAME = (SSN, SEQ), TYPE = CHAR, BYTES = 9
FIELD @NAME = Name, TYPE = CHAR, BYTES = 30
FIELD @NAME = Gpa, TYPE = FLOAT, BYTES = 4
Chaps1&2-14

Hierarchical Graphical Representation
[Diagram: Courses (Course#*, Title, Descrip) is the parent of Prereq (PCourse#*, Title) and Formats (Section#*, Quarter, Campus); Formats is the parent of Student (SSN#*, Name, GPA) and Faculty (SSN#*, Name, Phone); links carry 1 and n cardinality labels]

Hierarchical Database: Pros & Cons
• Advantages
  – Conceptual simplicity
    • groups of data could be related to each other
    • related data could be viewed together
  – Centralization of data
    • reduced redundancy and promoted consistency
• Disadvantages
  – Limited representation of data relationships
    • did not allow Many-to-Many (M:N) relations
  – Complex implementation
    • required in-depth knowledge of physical data storage
  – Structural Dependence
    • data access requires physical storage path
  – Lack of Standards
    • limited portability

Network Database
• Objectives
  – Represent more complex data relationships
  – Improve database performance
  – Impose a database standard
• Network Database Model
  – Similar to Hierarchical Model
    • Records linked by pointers
  – Composed of sets
    • Each set consists of owner (parent) and member (child)
  – Many-to-Many (M:N) relationships representation
    • Each owner can have multiple members (1:M)
    • A member may have several owners

Network Database: Example
[Figure from Rob & Coronel]

Network Database Definition
SCHEMA NAME IS University.
RECORD NAME IS Student; DUPLICATES ARE NOT ALLOWED FOR SSN.
  Name ; CHARACTER 30.
  SSN ; CHARACTER 9.
  Gpa ; FLOAT.
RECORD NAME IS Faculty; DUPLICATES ARE NOT ALLOWED FOR SSN.
  Name ; CHARACTER 30.
  SSN ; CHARACTER 9.
  Ophone ; CHARACTER 7.
RECORD NAME IS Courses; DUPLICATES ARE NOT ALLOWED FOR Course#.
  Course# ; CHARACTER 6.
  Title ; CHARACTER 20.
  Descrip ; CHARACTER 100.
RECORD NAME IS Formats; DUPLICATES ARE NOT ALLOWED FOR Section#.
  Section#; FIXED 3.
  Quarter ; CHARACTER 10.
  Campus ; CHARACTER 15.
RECORD NAME IS Prereq;
  PCourse#; CHARACTER 6.
  Title ; CHARACTER 20.
SET NAME IS Requirements; OWNER IS Courses; MEMBER IS Prereq;
SET NAME IS COfferings; OWNER IS Courses; MEMBER IS Formats;
SET NAME IS QtrOfferings; OWNER IS Formats; MEMBER IS Courses;
SET NAME IS Takes; OWNER IS Formats; MEMBER IS Student;
SET NAME IS Teaches; OWNER IS Formats; MEMBER IS Faculty;

Network Graphical Representation
[Diagram: record types Courses (Course#*, Title, Descrip), Prereq (PCourse#*, Title), Formats (Section#*, Quarter, Campus), Student (SSN#*, Name, GPA), and Faculty (SSN#*, Name, Phone), linked by the sets Requirements, COfferings, QtrOfferings, Takes, and Teaches]

Network Database: Pros & Cons
• Advantages
  – More data relationship types
  – More efficient and flexible data access
    • “network” vs. “tree” path traversal
  – Conformance to standards
    • enhanced database administration and portability
• Disadvantages
  – System complexity
    • requires familiarity with the internal structure for data access
  – Lack of structural independence
    • small structural changes require significant program changes

Relational Database
• Problems with legacy database systems
  – Required excessive effort to maintain
    • Data manipulation (programs) too dependent on physical file structure
  – Hard to manipulate by end-users
    • No capacity for ad-hoc query (must rely on DB programmers).
• Evolution in Data Organization
  – E. F. Codd’s Relational Model proposal
    • Separated the notion of physical representation (machine-view) from logical representation (human-view)
    • Considered ingenious but computationally impractical in 1970
  – Relational Database Model
    • Dominant database model of today
    • Eliminated pointers and used tables to represent data
    • Tables
      – flexible logical structure for data representation
      – a series of row/column intersections
      – related by sharing common entity characteristic(s)

Relational Database: Example
Provides a logical “human-level” view of the data and associations among groups of data (i.e., tables)

Customer_ID  Customer_Account  Agent_ID
1224         4556              23
1225         4558              25

Customer_ID  Last_Name  First_Name  Phone     Account_Balance
1224         Vira       Dyne        678-9987  1223.95
1225         Davies     Tricia      556-3342  234.25

Agent_ID  Last_Name  First_Name  Phone
23        Sturm      David       334-5678
25        Long       Kyle        556-3421

Relational Tables - Rows/Columns/Tuples
[Figure]

Relational Database Definition
CREATE TABLE Student: Name(CHAR(30)), SSN(CHAR(9)), Gpa(FLOAT(2))
CREATE TABLE Faculty: Name(CHAR(30)), SSN(CHAR(9)), Ophone(CHAR(7))
CREATE TABLE Courses: Course#(CHAR(6)), Title(CHAR(20)), Descrip(CHAR(100)), PCourse#(CHAR(6))
CREATE TABLE Formats: Section#(INTEGER(3)), Quarter(CHAR(10)), Campus(CHAR(15))
CREATE TABLE TakeorTeach: SSN(CHAR(9)), Course#(CHAR(6)), Section#(INTEGER(3))
CREATE TABLE COfferings: Course#(CHAR(6)), Section#(INTEGER(3))

Student(Name*, SSN, Gpa)
Faculty(Name*, SSN, Ophone)
Courses(Course#*, Title, Descrip, PCourse#*)
Formats(Section#*, Quarter, Campus)
TakeorTeach(SSN, Course#, Section#)
COfferings(Course#, Section#)

Relational Views
• Two Views Derived From Prior Tables
  – Student Transcript View
  – Course Prerequisite View

Relational Database: Pros & Cons
• Advantages
  – Structural independence
    • Separation of database design and physical data storage/access
    • Easier database design, implementation, management, and use
  – Ad hoc query capability with Structured Query Language (SQL)
    • SQL translates user queries into code
• Disadvantages
  – Substantial hardware and system software overhead
    • more complex system
  – Poor design and implementation is made easy
    • ease-of-use allows careless use of RDBMS

Entity Relationship Model
• Peter Chen’s Landmark Paper in 1976
  – “The Entity-Relationship Model: Toward a Unified View of Data”
  – Graphical representation of entities and their relationships
• Entity Relationship (ER) Model
  – Based on Entity, Attributes & Relationships
    • Entity is a thing about which data are to be collected and stored
      – e.g. EMPLOYEE
    • Attributes are characteristics of the entity
      – e.g. SSN, last name, first name
    • Relationships describe associations between entities
      – e.g. 1:M, M:N, 1:1
  – Complements the relational data model concepts
    • Helps to visualize structure and content of data groups
      – entity is mapped to a relational table
    • Tool for conceptual data modeling (higher level representation)
  – Represented in an Entity Relationship Diagram (ERD)
    • Formalizes a way to describe relationships between groups of data

E-R Diagram: Chen Model
• Entity
  – represented by a rectangle with its name in capital letters.
• Relationships
  – represented by an active or passive verb inside the diamond that connects the related entities.
• Connectivities
  – i.e., types of relationship written next to each entity box.

E-R Diagram: Crow’s Foot Model
• Entity
  – represented by a rectangle with its name in capital letters.
• Relationships
  – represented by an active or passive verb that connects the related entities.
• Connectivities
  – indicated by symbols next to entities.
    • 2 vertical lines for 1
    • “crow’s foot” for M

E-R Model: Pros & Cons
• Advantages
  – Exceptional conceptual simplicity
    • easily viewed and understood representation of database
    • facilitates database design and management
  – Integration with the relational database model
    • enables better database design via conceptual modeling
• Disadvantages
  – Incomplete model on its own
    • Limited representational power
      – cannot model data constraints not tied to entity relationships
        » e.g. attribute constraints
      – cannot represent relationships between attributes within entities
    • No data manipulation language (e.g. SQL)
  – Loss of information content
    • Hard to include attributes in ERD

Object-Oriented Database
• Semantic Data Model (SDM)
  – Modeled both data and their relationships in a single structure (object)
    • Developed by Hammer & McLeod in 1981
• Object-oriented concepts became popular in 1990s
  – Modularity facilitated program reuse and construction of complex structures
  – Ability to handle complex data types (e.g. multimedia data)
• Object-Oriented Database Model (OODBM)
  – Maintains the advantages of the ER model but adds more features
  – Object = entity + relationships (between & within entity)
    • consists of attributes & methods
      – attributes describe properties of an object
      – methods are all relevant operations that can be performed on an object
    • self-contained abstraction of real-world entity
  – Class = collection of similar objects with shared attributes and methods
    • e.g. EMPLOYEE class = (employ1 object, employ2 object, …)
    • organized in a class hierarchy
      – e.g. PERSON > EMPLOYEE, CUSTOMER
  – Incorporates the notion of inheritance
    • attributes and methods of a class are inherited by its descendant classes

Object-Oriented Database Declarations
• Specifying the Object Types Employee, Date, and Department Using Type Constructors
[Figure]

Object-Oriented Database Declarations
• Adding Operations to Definitions of Employee and Department
[Figure]

Object Oriented DB Vendors/Products
• Cache (http://www.intersystems.com)
• CommonSQL / UncommonSQL
• db4o (DeeBeeFourOh) http://www.db4o.com (open source)
• GOODS (http://www.garret.ru/~knizhnik/goods.html)
• Objectivity/DB (http://www.objectivity.com/objectdatabase.shtml)
• ObjectDesignInc
• OzoneDb (http://ozone-db.org)
• PLOB! (acronym for Persistent Lisp OBjects; see http://plob.sourceforge.net/ )
• XL2 (http://www.xl2.net)

OO Database Model vs. E-R Model
OODBM:
- can accommodate relationships within an object
- objects to be used as building blocks for autonomous structures
[Figure from Rob & Coronel]

Object-Oriented Database: Pros & Cons
• Advantages
  – Semantic representation of data
    • fuller and more meaningful description of data via object
  – Modularity, reusability, inheritance
  – Ability to handle
    • complex data
    • sophisticated information requirements
• Disadvantages
  – Lack of standards
    • no standard data access method
  – Complex navigational data access
    • class hierarchy traversal
  – Steep learning curve
    • difficult to design and implement properly
  – More system-oriented than user-centered
  – High system overhead
    • slow transactions

Web Database
• Internet is emerging as a prime business tool
  – Shift away from models (e.g. relational vs. O-O)
  – Emphasis on interfacing with the Internet
• Characteristics of “Internet age” databases
  – Flexible, efficient, and secure Internet access
  – Support for complex data types & relationships
  – Seamless interfaces with multiple data sources and structures
  – Ease of use for end-user, database architect, and database administrator
    • Simplicity of conceptual database model
    • Many database design, implementation, and application development tools
    • Powerful DBMS GUI

NoSQL
• NoSQL does not literally mean “no SQL”. NoSQL systems are non-relational data stores.
• Next Generation Databases, being non-relational, distributed, open-source and horizontally scalable, have become a favorite back-end storage for the cloud community. High performance is the driving force.

NoSQL
• Pros
  – open source (Cassandra, CouchDB, HBase, MongoDB, Redis)
  – Elastic scaling
  – Key-value pairs, easy to use
  – Useful for statistical and real-time analysis of growing lists of elements (tweets, posts, comments)
• Cons
  – Security (No ACID: Atomicity, Consistency, Isolation, Durability)
  – No indexing support
  – Immature
  – Absence of standardization

Introduction to Databases
CSE 4701
Prof. Steven A. Demurjian, Sr.
Computer Science & Engineering Department
The University of Connecticut
191 Auditorium Road, Box U-155
Storrs, CT 06269-3155
[email protected]
http://www.engr.uconn.edu/~steve
(860) 486-4818
The majority of these slides are being used with the permission of Dr. Ling Liu, Associate Professor, College of Computing, Georgia Tech.
Some slides have been adapted from the AWL web site for the textbook

The Role of DBMS in Computing
[Figure]

What is a Database System?
[Figure: Web or PC app, Mobile app, REST API or Web Services]

What is the Role of Database System?
• Pervasive in Almost All Applications and Every Application Domain
  – Norm rather than Exception
  – Difficult to Imagine Application without Persistent Store
  – Remember: Database is a Repository at Minimum
• Database Management for Mobile Computing
  – Myriad of Architectures and Approaches
  – From: http://java.sun.com/javaone/javaone98/sessions/T400/index.html

Database Concepts - Summary
• Schema vs. Data
  – Database: Structured Collection of Data Describing Objects of Universe of Discourse being Modeled
  – A Database Consists of Schema and Data
  – Schema: Describes the Intension (Type) of Objects
  – Entity/Table/Relation: A portion of a Schema
  – Data: Describes the Extension (Tuples) of Objects
• Data Definition vs. Data Manipulation Languages
  – DDL: Defines the Schema (metadata)
  – DML: Operates on Table Data according to the Schema
• What is Metadata?

What are Programming Analogs?
• Schema is Equivalent to a Class Library
  – All of Different Types of Information
• Entity/Table/Relation
  – Data Attributes and Types Akin to a Class
  – Tuples Akin to Creating an Instance from Class
• Key Difference - Entity/Table is Two Abstractions
  – Structure like a Class
  – Also Represents a Set of all Tuples
• Meta-Data Akin to Java Reflection and Introspection
  – Access to the Runtime Features of Objects
• Let’s See Example

Classes for a Medical Application
• Data Types, Methods
• Patient Inherits from Person and Creates a Single Instance “John”
[Class diagram, reconstructed from the slide labels:]
  – Substance: Id: Integer, name: String, statusCode: String, effectiveTime: Date, repeatNumber: Int
  – Observation: Id: Integer, statusCode: String, name: String, value: String
  – Person: Id: Integer, name: Name, address: Address, bday: String, tel: String
  – Name: family-name: String, given-name: String, prefix: String, suffix: String
  – Address: street: String, locality: String, region: String, country: String
  – Patient: Ethnicity: String, prefLang: String, race: String, Email: String, gender: String; getAllergies(), get_clinical_notes(), get_demographics(), get_medications(), get_immunizations()
  – Provider: deaNumber: String, npiNumber: String, Ethnicity: String, race: String, Email: String, gender: String
  – Associations: takesPrescribedMedication, hasMedicalObservations

Database Entity Relationship Diagram
• Patient Entity represents Attributes of a set of Patients
  – Defines Type and the Collection
• Patient Entity is a Database Table with Structure Like a Class
  – However, Contains many Instances, e.g., Patients “John”, “George”, “Jane”, etc.
[ER diagram: Patient (id, name, address, tel, bday, Ethnicity, race, prefLang), Observation (id, value, statusCode, effectiveTime), Substance (id, name, statusCode, effectiveTime, repeatNumber)]

Database Tables
Patient(pid, name, address, tel, bday, etc.)
Substance(sid, name, statusCode, etc.)
Observation(oid, value, statusCode, etc.)
PatientObservations(pid, oid)
PatientMedications(pid, sid)

An Example Database System
• An Integrated Telephone Customer Information System (Circa early 1980s)
• What are Examples Today?
• Has Scale Increased?

The OpenMRS Sample Database Schema
• 99 Tables, Sample Database with 5000 patients and 500,000 observations

What are World Largest DBs? (2010)*
*http://www.comparebusinessproducts.com/fyi/10-largest-databases-in-the-world

Available Database Systems/Platforms
• Ranging from Relational to Object-Oriented to Real-Time to Embedded to Mobile
• Long History of Database Systems
• First Database Journal (1976): ACM Transactions on Database Systems
  – Founded by David K. Hsiao (my doctoral advisor)
  – 1st Issue: P. Chen on the Entity Relationship Model
  – 2nd Issue: System R, IBM’s First Mainframe DBMS; Abstraction by S. Navathe (our textbook author)
  – 3rd Issue: The INGRES DBMS, DEC (Berkeley)
  – 4th Issue: Functional Dependencies/Normal Forms
  – 6th Issue: Abstraction and Generalization

Available Database Systems
• Microsoft SQL Server
• IBM DB2
• Oracle
• MySQL
• Emerging Mobile Platforms
  – Berkeley DB
  – Couchbase Lite
  – LevelDB
  – SQLite
  – UnQLite

Databases for Mobile Platforms
• A Wide Range of Emerging Products
  – SQL Anywhere (Sybase)
  – DB2 Everyplace (IBM)
  – SQL Server Compact/Express (Microsoft)
  – Oracle Lite
  – MySQLMobile, Android PHP/MySQL
• Mobile Features
  – Embedded in the Mobile Device
  – Offers DB Query Capabilities
  – Synchronizes with Server Side
  – Allows Local Storage on Mobile Device
• Potential Topic for Project this Semester!

Databases for Mobile Platforms
• Oracle Berkeley DB
  – Via SQL, Java Objects, or XML Documents
• Couchbase Lite
  – NoSQL: storing/retrieving data in a format that is not relational/SQL-based
• LevelDB (written at Google)
  – Open Source Library for Key/Value Pair Storage and Retrieval
• SQLite
  – Manage in Memory and on Disk
• UnQLite
  – NoSQL Counterpart of SQLite

Database Market Share 1995
• Today’s Market Share, the Top 3:
  – Oracle: 44.4%
  – IBM: 21.2%
  – Microsoft: 18.6%
  – http://datadoghouse.typepad.com/data_doghouse/2007/05/database_market.html
• What will be the Role of Open Source?
  – MySQL (MS) and Innobase (Oracle on top of MySQL)
  – Evans Data Corporation (http://www.evansdata.com/)
  – http://news.taume.com/Technology/Tech-Deals/Report-MySQL-Gains-25-percent-Market-Share-729

Market: Prerelational vs. Relational 1999
• Prerelational Revenue Shrinking about 9% Per Year
  – Currently 1.8 Billion/year
• Relational Revenue Growing about 30% Per Year
  – Currently 11.5 Billion/year
• Object-oriented Revenue about 150 Million/year

Database Market Share 2007
[Figure]

Database Market Share in 2013
[Figure]

Market Share 2014
[Figure]

Professional, Legal, and Ethical Issues in Data Management
Transparencies
http://www.cs.utexas.edu/~mitra/csSpring2011/cs327/lectures/New_Slides/ch13.ppt

Objectives
• How to define ethical and legal issues in information technology.
• How to distinguish between legal and ethical issues and situations data/database administrators face.
• How new regulations are placing additional requirements and responsibilities on data/database administrators.
©Pearson Education 2009

Objectives
• How legislation such as the Sarbanes-Oxley Act and the BASEL II accords impact data/database administration functions.
• Best practices for preparing for and supporting auditing and compliance functions.
• Intellectual property (IP) issues related to IT and data/database administration.

Legal and ethical issues and database systems
• Organizations increasingly find themselves having to answer tough questions about the conduct and character of their employees and the manner in which their activities are carried out.
• At the same time, we need to develop knowledge of what constitutes professional and non-professional behavior.

Ethics in the context of information technology
• Ethics - A set of principles of right conduct or a theory or a system of moral values.
• Can consider ethical behavior as “doing what is right” according to the standards of society.
• This, of course, raises the question “of whose society”, as what might be considered ethical behavior in one culture (country, religion, and ethnicity) might not be so in another.

Difference between ethical and legal behavior
• Laws can be considered as simply enforcing certain ethical behaviors.
• This leads to two familiar ideas: what is ethical is legal and what is unethical is illegal.
• Consider: Is all unethical behavior illegal? Is all ethical behavior legal?
• Ethical codes of practice help determine whether specific laws should be introduced.
• Ethics fills the gap between the time when technology creates new problems and the time when laws are introduced.

Ethical behavior in information technology
• A survey conducted by TechRepublic, an IT oriented web portal maintained by CNET Networks (techrepublic.com), reported that 57% of the IT workers polled indicated they had been asked to do something ‘unethical’ by their supervisors (Thornberry, 2002).
• Examples include installing unlicensed software, accessing personal information, and divulging trade secrets.

Legislation and its impact on the IT function
• Securities and Exchange Commission (SEC) Regulation National Market System (NMS)
• The Sarbanes-Oxley Act, COBIT, and COSO
• The Health Insurance Portability and Accountability Act
• The European Union (EU) Directive on Data Protection of 1995
• The United Kingdom’s Data Protection Act of 1998
• International banking – BASEL II Accords

Securities and Exchange Commission (SEC) Regulation National Market System (NMS)
• Concerns activities that appear ethical but are in fact illegal.
• Presents an ‘order protection rule’ under which an activity that is acceptable to one facet of the investment community was deemed illegal under the new regulation.
• Result of this regulation is that financial services firms are now required to collect market data so that they can demonstrate that a better price was indeed not available at the time the trade was executed.

The Sarbanes-Oxley Act, COBIT, and COSO
• Result of major financial frauds allegedly carried out within companies such as Enron, WorldCom, Parmalat, and others.
• US and European governments presented legislation to tighten requirements on how companies form their board of directors, interact with auditors, and report their financial statements.

The Sarbanes-Oxley Act, COBIT, and COSO
• Requires security and auditing of financial data and has implications on data collection, processing, security and reporting both internally and externally to the organization.
• Concerns establishment of internal controls - A set of rules an organization adopts to ensure policies and procedures are not violated, data is properly secured and reliable, and operations can be carried out efficiently.

The Health Insurance Portability and Accountability Act
• Administered by Health and Human Services in the US and affects providers of healthcare and health insurance.
• Five main provisions of the Act include:
  – Privacy of patient information
  – Standardizing electronic health/medical records and transactions between health care organizations
  – Establishing a nationally recognized identifier for employees to be used by all employee health plans
  – Standards for the security of patient data and transactions involving this data
  – Need for a nationally recognized identifier for healthcare organizations and individual providers

The European Union (EU) Directive on Data Protection of 1995
• The official title of the EU’s data protection directive is: ‘Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data’ (OJEC 1995).

The United Kingdom’s Data Protection Act of 1998
• Presents eight data protection principles

International banking – BASEL II Accords
• Presents policies and framework that must be enacted into law in each country and monitored by national regulators.
• Framework presents three main ‘pillars’
  – Minimum capital requirements
  – Supervisory review process
  – Market discipline

Establishing a culture of legal and ethical data stewardship
• Senior managers such as board members, presidents, Chief Information Officers (CIOs), and data administrators are increasingly finding themselves liable for any violations of these laws.
• Steps to consider include
  – Develop an organization-wide policy for legal and ethical behavior.
  – Professional organizations and codes of ethics.

Intellectual Property (IP)
• Covers inventions, inventive ideas, designs, patents and patent applications, discoveries, improvements, trademarks, designs and design rights (registered and unregistered), written work (including computer software) and know-how devised, developed, or written by an individual or set of individuals.
• Two types of IP:
  – Background IP - IP that exists before an activity takes place.
  – Foreground IP - IP that is generated during an activity.

Intellectual Property (IP)
• Patents - provides an exclusive (legal) right for a set period of time to make, use, sell or import an invention.
• Patents are granted by a government when an individual or organization can demonstrate:
  – the invention is new;
  – the invention is in some way useful;
  – the invention involves an inventive step.

Intellectual Property (IP)
• Copyright - provides an exclusive (legal) right for a set period of time to reproduce and distribute a literary, musical, audiovisual, or other ‘work’ of authorship.
• Trademark - provides an exclusive (legal) right to use a word, symbol, image, sound, or some other distinctive element that identifies the source of origin in connection with certain goods or services.

Database Steve’s Done
• Naval Postgraduate School, 1983-1987
  – The Implementation of a Multibackend Database System (MDBS): An Exercise in Database Software Engineering
  – Multiple Process Parallel Database System
• Ohio State/UConn, 1982-1988
  – The Multilingual Database System
  – Supports Multiple Data Models
    • Attribute-Based
    • Relational
    • Network
    • Hierarchical
    • Functional

What is MBDS?
• MBDS is Multi-Process, Multi-Computer, Parallel Database System
• MBDS Composed of …
  – Host for Issuing User Requests
  – Controller to Interact with Host (and User)
  – One or More Backend Database Processors
• Goals of MBDS
  – Suppose Request Takes 4 Minutes with One Backend
  – Improve Response Time by Increasing Backends
    • Two Backends - Request 2+ Minutes
    • Four Backends - Request 1+ Minutes

What is MBDS Architecture?
• Database Blocks are Distributed Across All Backends
• Backend (BE) DB Processors are Replicated
• Database Controller Sends Same Query in Parallel to all BEs
• BEs work in Parallel on Each Query and Communicate for Join
• Results are Sent to and Collected by the DB Controller - then to the User
[Diagram: User, Host, Database Controller, and several Backend Database Processors]

Approach Distributes Data Across Backends
• Suppose System has 10 Backends
• Consider a Number of Tables
  – Inventory
  – Customers
  – Employees
  – …
• What Happens if Place One Table/Backend?
• What Happens if you Distribute … Table Across 10 Backends?
[Diagram: Backend Database Processor 1, Backend Database Processor 2, …, Backend Database Processor 10]

What are MBDS Processes?
• Database Controller
  – Request Preparation
  – Post Processing
  – Put Msg.
  – Get Msg.
• Backend Database Processor
  – Get Msg.
  – Put Msg.
  – Directory Management
  – Record Processing
  – Concurrency Control
  – Disk I/O

What are MBDS Messages?
No.  Type                                      SRC   DST
1    New Request                               Host  ReqP
2    Results of Request                        PoPr  Host
3    Number of Reqs in Transaction             ReqP  PoPr
4    Aggregate Operators (Sum, etc.)           ReqP  PoPr
6    Parsed Request to Backends                ReqP  DM
12   Backend Aggregate Operator Results        RecP  PoPr
15   Ids for Accessing Database Indexes        DM    DMs
16   Request and Disk Addresses                DM    RecP
21   Ids for Accessing Database Records        DM    CC
22   Locks Obtained: Okay to Execute Request   CC    RecP
23   ID of Finished Request                    RecP  CC

Sample Processing of Retrieve Request
[Diagram: message flow for a retrieve request among Request Preparation, Post Processing, Get Msg., Put Msg., Directory Management, Concurrency Control, Record Processing, and Disk I/O, including messages to and from other backends; arrow labels: A1, B3, C4, D6, E15, F15, G21, H22, I16, J23, K12]

What are Synchronization Issues in MBDS?
• Coordination of Synchronous Behavior …
  – Within Controller and Backend to Allow
    • Multiple Active Requests within Each Process
    • Requests at Different Stages in Different Processes
  – Between Controller and Backends to Allow
    • A Request to be Processed by All Backends
    • A Request to be Processed by One Backend
  – Among Multiple Backends to Allow a Backend
    • to Synchronize its Work on one Request with Other Backends
    • to Forward Results to Another Backend

Multi-Lingual Database System
[Figure]

Different Data Models
• network
• hierarchical
• relational
• Attribute-based

Attribute-based
[Figure]

Relational
[Figure]

Hierarchical
[Figure]

Network
[Figure]

Network
[Figure]

Functional
[Figure]
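
The MBDS architecture slides describe three steps: database blocks are distributed across all backends, the controller sends the same request to every backend in parallel, and the controller collects the partial results. The following Python fragment is a minimal sketch of that idea and not the MBDS implementation. The function names `distribute`, `backend_select`, and `controller_select` and the sample `inventory` table are hypothetical, round-robin placement is an assumed distribution policy, and threads merely stand in for separate backend computers.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute(records, num_backends):
    """Spread records round-robin so each backend holds a slice of the table.

    Round-robin is an assumption for illustration; the slides only say that
    database blocks are distributed across all backends.
    """
    backends = [[] for _ in range(num_backends)]
    for i, record in enumerate(records):
        backends[i % num_backends].append(record)
    return backends


def backend_select(partition, predicate):
    """One backend scans only its own partition of the data."""
    return [record for record in partition if predicate(record)]


def controller_select(backends, predicate):
    """Controller sends the same request to every backend and merges results."""
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        partials = pool.map(lambda part: backend_select(part, predicate), backends)
        return [record for part in partials for record in part]


# Hypothetical sample table: 100 inventory rows spread over 4 backends.
inventory = [{"item": n, "qty": n % 7} for n in range(100)]
backends = distribute(inventory, 4)
low_stock = controller_select(backends, lambda r: r["qty"] == 0)
```

Because each backend scans roughly a quarter of the rows, the per-backend work shrinks as backends are added, which is the intuition behind the "4 minutes with one backend, 1+ minutes with four" goal. Placing one whole table on one backend would leave the other backends idle for a request against that table.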