Distributed DBMSs
• A distributed database is a single logical database that is physically distributed across computers on a network.
• A homogeneous DDBMS has the same local DBMS at each site.
• A heterogeneous DDBMS has at least two sites where the local DBMSs are different.

Characteristics of Distributed DBMSs
• Location transparency: it seems to the user as though the entire database is at their location.
• Replication transparency: the user is unaware of the behind-the-scenes replication of the data.
• Fragmentation transparency: a logical object can be divided among the various locations on the network without the user being aware of it.

Advantages of Distributed Databases
• Local control of data
• Increased database capacity
• System availability
• Added efficiency

Disadvantages of Distributed Databases
• Update of replicated data
• More complex query processing
• More complex treatment of shared updates
• More complex recovery measures
• More difficult management of the data dictionary
• More complex data design

File Servers
• A file server contains the files required by the individual workstations on the network.

Client/Server Systems
• In a client/server system, the DBMS runs on the server, and the user sends requests for specific data, not files.

Advantages of Client/Server Systems
• More efficient than file server systems.
• Possibility of distributing work among several processors.
• Workstations need not be as powerful.
• The user doesn’t need to learn any special commands or techniques.
• Easier for users to access data from a variety of sources.
• Provides a greater level of security than file server systems.
• Powerful enough to replace expensive mainframe applications.

Data Warehouses
• A data warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management’s decision-making process.

Data Warehouse Architecture [figure]
Data Warehouse Structure [figure]

Why build a Data Warehouse?
• To speed up the writing and maintaining of queries and reports by technical personnel
• To more easily query and report data, on a regular basis, from multiple transaction processing systems and/or from external data sources
• To provide a repository of transaction processing system data that contains data over a span of time
• To address security concerns
• To provide a repository of "cleaned up" transaction processing systems data that can be reported against and that does not necessarily require fixing the transaction processing systems

Data Errors
• Incomplete
  – Missing records/fields
• Incorrect
  – Wrong codes (or incorrect pairing of codes)
• Incomprehensible
  – Multiple fields in one field
  – Many-to-many relationships
  – Spreadsheet and word-processing files
• Inconsistent
  – Use/meaning of codes
  – Business rules
  – Timing
  – Use of attributes
  – Use of nulls/spaces

Data Mining
• Identify the goal
• Assemble the relevant data
• Choose your analysis methods
• Decide which software tool is best for implementing the method
• Run the analysis
• Decide how to implement the results

Organizational Databases
• Operational Database
  – organized about a transaction
  – supports OLTP (record keeping)
  – thousands of users
  – accesses few records at a time
  – response time in seconds
  – primitive & detailed
  – smaller (current)
  – highly normalized (many tables with few columns)
  – dynamic (continuous updates online)
• Data Warehouses
  – organized about a subject
  – supports OLAP (decision support)
  – few hundred users
  – accesses many records at a time
  – response times in minutes
  – derived & summarized
  – larger (historical)
  – de-normalized (few tables with many columns)
  – periodic (batch update)
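
The contrast between an operational database and a data warehouse can be made concrete with a small sketch. The Python example below uses the standard sqlite3 module with an in-memory database. The table names (customer, orders, sales_summary), the sample rows, and the function names are all hypothetical and chosen only for illustration; a real warehouse would typically be a separate system loaded by a scheduled job, not a table in the same database.

```python
import sqlite3


def build_operational_db(conn):
    """Create small normalized (operational-style) tables with sample rows."""
    conn.executescript("""
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT,
            region      TEXT
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customer(customer_id),
            order_date  TEXT,
            amount      REAL
        );
    """)
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?, ?)",
        [(1, "Acme", "East"), (2, "Birch", "West")],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [
            (100, 1, "2024-01-05", 50.0),
            (101, 1, "2024-01-20", 25.0),
            (102, 2, "2024-01-11", 40.0),
            (103, 2, "2024-02-02", 10.0),
        ],
    )


def batch_load_warehouse(conn):
    """Periodic batch step: rebuild a de-normalized, summarized table."""
    conn.executescript("""
        DROP TABLE IF EXISTS sales_summary;
        CREATE TABLE sales_summary AS
        SELECT c.region                    AS region,
               substr(o.order_date, 1, 7)  AS month,
               COUNT(*)                    AS order_count,
               SUM(o.amount)               AS total_amount
        FROM orders o
        JOIN customer c ON c.customer_id = o.customer_id
        GROUP BY c.region, substr(o.order_date, 1, 7);
    """)


def oltp_lookup(conn, order_id):
    """OLTP-style access: fetch one detailed record by key."""
    return conn.execute(
        "SELECT order_id, customer_id, order_date, amount "
        "FROM orders WHERE order_id = ?",
        (order_id,),
    ).fetchone()


def olap_report(conn):
    """OLAP-style access: read the derived summary, organized by subject."""
    return conn.execute(
        "SELECT region, month, order_count, total_amount "
        "FROM sales_summary ORDER BY region, month"
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_operational_db(conn)
batch_load_warehouse(conn)
```

Here oltp_lookup touches a single detailed row in a normalized table, while olap_report reads a derived table that is only refreshed when batch_load_warehouse runs, which mirrors the "dynamic" versus "periodic" distinction in the list above.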