Download Chapter 21:Application Development and Administration

Document related concepts

Global serializability wikipedia , lookup

Tandem Computers wikipedia , lookup

Microsoft Access wikipedia , lookup

IMDb wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Commitment ordering wikipedia , lookup

Oracle Database wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

SQL wikipedia , lookup

Ingres (database) wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

PL/SQL wikipedia , lookup

Serializability wikipedia , lookup

Functional Database Model wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Database wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Versant Object Database wikipedia , lookup

Relational model wikipedia , lookup

ContactPoint wikipedia , lookup

Concurrency control wikipedia , lookup

Clusterpoint wikipedia , lookup

Database model wikipedia , lookup

Transcript
Chapter 23:
Advanced Application Development
1
Database System Concepts, 5th Ed.
23.1
©Silberschatz, Korth and Sudarshan
Database System Concepts






Chapter 1: Introduction
Part 1: Relational databases

Chapter 2: Relational Model

Chapter 3: SQL

Chapter 4: Advanced SQL

Chapter 5: Other Relational Languages
Part 2: Database Design

Chapter 6: Database Design and the E-R Model

Chapter 7: Relational Database Design

Chapter 8: Application Design and Development
Part 3: Object-based databases and XML

Chapter 9: Object-Based Databases

Chapter 10: XML
Part 4: Data storage and querying

Chapter 11: Storage and File Structure

Chapter 12: Indexing and Hashing

Chapter 13: Query Processing

Chapter 14: Query Optimization
Part 5: Transaction management

Chapter 15: Transactions

Chapter 16: Concurrency control

Chapter 17: Recovery System





Part 6: Data Mining and Information Retrieval

Chapter 18: Data Analysis and Mining

Chapter 19: Information Retreival
Part 7: Database system architecture

Chapter 20: Database-System Architecture

Chapter 21: Parallel Databases

Chapter 22: Distributed Databases
Part 8: Other topics

Chapter 23: Advanced Application Development

Chapter 24: Advanced Data Types and New Applications

Chapter 25: Advanced Transaction Processing
Part 9: Case studies

Chapter 26: PostgreSQL

Chapter 27: Oracle

Chapter 28: IBM DB2

Chapter 29: Microsoft SQL Server
Online Appendices

Appendix A: Network Model

Appendix B: Hierarchical Model

Appendix C: Advanced Relational Database Model
2
Database System Concepts, 5th Ed.
23.2
©Silberschatz, Korth and Sudarshan
Part 8: Other topics
(Chapters 23 through 25).
 Chapter 23: Advanced Application Development

covers performance benchmarks, performance tuning and standardization.
 Chapter 24: Advanced Data Types and New Applications

covers advanced data types and new applications, including temporal data,
spatial and geographic data, multimedia data, and issues in the management
of mobile and personal databases.
 Chapter 25: Advanced Transaction Processing

deals with advanced transaction processing. We discuss transactionprocessing monitors, high-performance transaction systems, real-time
transaction systems, and transactional workflows.
3
Database System Concepts, 5th Ed.
23.3
©Silberschatz, Korth and Sudarshan
Table of Contents
 23.1 Performance Tuning
 23.2 Performance Benchmarks
 23.3 Standardization
 23.4 Application Migration
 23.5 Summary
4
Database System Concepts, 5th Ed.
23.4
©Silberschatz, Korth and Sudarshan
• Ch 8: Application Design & Development
User
User
Web Browser
Application
WEB Interface
: Servlet
JSP
Monitoring DBMS
using Trigger
Limited Access
by Authorization
Protecting DBMS
by Security
DBMS
 Ch 23: Advanced Application Development

Performance

Standardization

Application Migration
5
Database System Concepts, 5th Ed.
23.5
©Silberschatz, Korth and Sudarshan
Performance Tuning

Adjusting various parameters and design choices to improve system performance
for a specific application.

Tuning is best done by identifying bottlenecks & eliminating them.

Can tune a database system at 3 levels:



Hardware

add disks to speed up I/O

add memory to increase buffer hits

move to a faster processor
Database system parameters

set buffer size to avoid paging of buffer

set checkpointing intervals to limit log size

system may have automatic tuning.
Higher level database design

such as the schema, indices and transactions (more later)
6
Database System Concepts, 5th Ed.
23.6
©Silberschatz, Korth and Sudarshan
Bottlenecks
 Performance of most systems (at least before they are tuned) usually limited by
performance of one or a few components
 These are called bottlenecks
 E.g. 80% of the code may take up 20% of time and 20% of code takes up 80%
of time
 Worth spending most time on 20% of code that take 80% of time

Finding bottlenecks is very important
 Bottlenecks may be in hardware or in software

e.g. disks are very busy, CPU is idle
 Removing one bottleneck often exposes another bottleneck

De-bottlenecking consists of repeatedly finding bottlenecks, and removing them

This is a heuristic
7
Database System Concepts, 5th Ed.
23.7
©Silberschatz, Korth and Sudarshan
Identifying Bottlenecks
 Transactions request a sequence of services

e.g. CPU, Disk I/O, locks
 With concurrent transactions, transactions may have to wait for a requested service
while other transactions are being served
 Can model database as a queueing system with a queue for each service
 transactions repeatedly do the following
 request a service  wait in queue for the service  get serviced
 Bottlenecks in a database system typically show up as very high utilizations of a
particular service
 Bottlenecks shows very long queues
 E.g. Disk utilization vs CPU utilization

100% utilization leads to very long waiting time:

Rule of thumb:
– design a system for about 70% utilization at peak load
– utilization over 90% should be avoided
8
Database System Concepts, 5th Ed.
23.8
©Silberschatz, Korth and Sudarshan
Finding Bottlenecks:
Queues in a Database System
9
Database System Concepts, 5th Ed.
23.9
©Silberschatz, Korth and Sudarshan
Tunable Parameters
 The lowest level is at the hardware level

Adding disks or using a RAID system

Adding more memory

Moving to a faster processor
 The second level consists of the DBMS parameters

Buffer size and Checkpointing intervals
 The third level is the highest level

Schema and Transactions

Indices and Materialized views
10
Database System Concepts, 5th Ed.
23.10
©Silberschatz, Korth and Sudarshan
Tuning of Hardware
 Even well-tuned transactions typically require a few I/O operations

Typical disk supports about 100 random I/O operations per second

Suppose each transaction requires just 2 random I/O operations.

Then to support n transactions per second, we need to stripe data across
n/50 disks (ignoring skew)

If 1000 transactions / 1 sec is required, 20 disks are needed
 Number of I/O operations per transaction can be reduced by keeping more data
in memory

If all data is in memory, I/O would be needed only for writes

Keeping frequently used data in memory reduces disk accesses, reducing
number of disks required, but has a memory cost
11
Database System Concepts, 5th Ed.
23.11
©Silberschatz, Korth and Sudarshan
Hardware Tuning: Memory
 Question: which data to keep in memory:

The disk cost for 1 I/0 per second is

price-per-disk-drive
accesses-per-second-per-disk
 Reducing 1 I/O saves this cost
 If a page is accessed n times per second, the value of keeping the page in
memory can be thought of as much as the disk cost for n I/Os per second
 n*
price-per-disk-drive
accesses-per-second-per-disk
 Cost of keeping a page in memory (buying more memory)

price-per-megabyte-of-memory
pages-per-megabyte-of-memory

Break-even point: value of n for which above two costs are equal
 If accesses to the papge are more than n times  better to keep the page in
the memory (buying more memory)
– saving by buying memory is greater than buying cost
12
Database System Concepts, 5th Ed.
23.12
©Silberschatz, Korth and Sudarshan
Hardware Tuning: 5-Min & 1-Min Rule
 Solving above equation with current disk and memory prices leads to:
5-minute rule: if a page that is randomly accessed is used more frequently
than once in 5 minutes  it should be kept in memory (by buying sufficient
memory!)
 For sequentially accessed data, more pages can be read per second
Assuming sequential reads of 1MB of data at a time:
1-minute rule: sequentially accessed data that is accessed once or more in
a minute should be kept in memory (by buying sufficient memory!)
 Prices of disk and memory have changed greatly over the years, but the ratios
have not changed much

rules remain as 5 minute and 1 minute rules, not 1 hour or 1 second rules!
13
Database System Concepts, 5th Ed.
23.13
©Silberschatz, Korth and Sudarshan
Hardware Tuning: Choice of RAID Level
 To use RAID 1 or RAID 5?

Depends on ratio of reads and writes

RAID 5 requires 2 block reads and 2 block writes to write out one data block
 If an application requires r reads and w writes per second

RAID 1 requires r + 2w
I/O operations per second

RAID 5 requires: r + 4w
I/O operations per second
 For reasonably large r and w, this requires lots of disks to handle workload

RAID 5 may require more disks than RAID 1 to handle load!

Apparent saving of number of disks by RAID 5 (by using parity, as opposed to
the mirroring done by RAID 1) may be illusory!
 Thumb rule: RAID 5 is fine when writes are rare and data is very large, but RAID 1 is
preferable otherwise

If you need more disks to handle I/O load, just mirror them since disk capacities
these days are enormous!
14
Database System Concepts, 5th Ed.
23.14
©Silberschatz, Korth and Sudarshan
Tuning of the Schema
 Vertically partition relations to isolate the data that is accessed most often -- only
fetch needed information.
•
E.g., split account (account-number, branch-name, balance) into two relations,
(account-number, branch-name) and (account-number, balance).
•
Branch-name need not be fetched unless required
 Improve performance by storing a denormalized relation
•
E.g., store join of account relation and depositor relation; branch-name and
balance information is repeated for each holder of an account, but join need not
be computed repeatedly.
•
•
Price paid: more space and more work for programmer to keep relation
consistent on updates
better to use materialized views (more on this later..)
 Cluster together on the same disk page records that would match in a frequently
required join,

compute join very efficiently when required.
15
Database System Concepts, 5th Ed.
23.15
©Silberschatz, Korth and Sudarshan
Tuning of Indices
 Index tuning

Tradeoff between queries and updates

Create appropriate indices to speed up slow queries/updates

Speed up slow updates by removing excess indices

Choose type of index (B-tree / Hash) appropriate for most frequent queries

Choose which index to make clustered
 Index tuning wizards (SW Tool)

Look at past history of queries and updates (the workload)

Recommend which indices would be best for the workload
16
Database System Concepts, 5th Ed.
23.16
©Silberschatz, Korth and Sudarshan
Tuning using Materialized Views
 Materialized views can help speed up certain queries

Particularly aggregate queries
 Overheads

Space

Time for view maintenance

Immediate view maintenance: done as part of update transaction
– time overhead paid by update transaction

Deferred view maintenance: done only when required
– update transaction is not affected, but system time is spent on view
maintenance
»
until updated, the view may be out-of-date
 Preferable to denormalized schema since view maintenance is systems
responsibility, not programmers

Avoids inconsistencies caused by errors in update programs
17
Database System Concepts, 5th Ed.
23.17
©Silberschatz, Korth and Sudarshan
Tuning using Materialized Views (Cont.)
 How to choose a set of materialized views

Helping one transaction type by introducing a materialized view may hurt others

Choice of materialized views depends on costs


Users often have no idea of actual cost of operations
Overall, manual selection of materialized views is tedious
 Some database systems provide tools to help DBA choose views to materialize

“Materialized view selection wizards”
18
Database System Concepts, 5th Ed.
23.18
©Silberschatz, Korth and Sudarshan
Automated Tuning of Physical Design
 Automated Tools

Index selection

Materialized View selection

Partition data in a parallel database system
 Users specify information about the size of database and related statistics
 MicroSoft’s Database Tuning Assistant

Allow users to pose “what-if” questions for selecting materialized view
 Workload compression

Generate workload using a small number of queries and updates
 Greedy heuristics and other various techniques
19
Database System Concepts, 5th Ed.
23.19
©Silberschatz, Korth and Sudarshan
Tuning of Transactions
 Basic approaches to tuning of transactions

Improve set orientation

Reduce lock contention
 Split a large transaction
 Rewriting of queries to improve performance was important in the past, but smart
optimizers have made this less important
 Improving set orientation

Communication and query handling overhead of each call are significant
 Set orientation  fewer calls to database
 Combine multiple embedded SQL/ODBC/JDBC queries into a single setoriented query
E.g. tune program that computes total salary for each department using a
separate SQL query by instead using a single query that computes total
salaries for all department at once (using group by)
 Use stored-procedures: avoids re-parsing and re-optimization of query

20
Database System Concepts, 5th Ed.
23.20
©Silberschatz, Korth and Sudarshan
Tuning of Transactions (Cont.)
 Reducing lock contention

Long transactions (typically read-only) that examine large parts of a relation
result in lock contention with update transactions


Use multi-version concurrency control


E.g. large query to compute bank statistics and regular bank transactions
E.g. Oracle “snapshots” which support multi-version 2PL
Use degree-two consistency (cursor-stability) for long transactions

Drawback: result may be approximate
21
Database System Concepts, 5th Ed.
23.21
©Silberschatz, Korth and Sudarshan
Tuning of Transactions (Cont.)
 Long update transactions cause several problems

Exhaust lock space
 Exhaust log space
 Also greatly increase recovery time after a crash

may exhaust log space even more during recovery if recovery algorithm is
badly designed!
 If a single large transaction updates every record of a very large relation,
log may grow too big
• Split large transaction into a batch of ``mini-transactions'‘
• Each mini transaction is suppose do perform a limited number of the updates
• Hold locks across transactions in a mini-batch to ensure serializability
• If lock table size is a problem  can release locks, but at the cost of
serializability
• In case of failure during a mini-batch
• must complete its remaining portion on recovery, to ensure atomicity
• support the concept of nested transaction
22
Database System Concepts, 5th Ed.
23.22
©Silberschatz, Korth and Sudarshan
Performance Simulation
 Performance simulation using queuing model

useful to predict bottlenecks as well as the effects of tuning changes,
 even without access to real system
 Queuing model as we saw earlier
 Models activities that go on in parallel
 Simulation model is quite detailed, but usually omits some low level details

Model service time, but disregard details of service
 E.g. approximate disk read time by using an average disk read time
 Experiments can be run on model, and provide an estimate of measures such as
average throughput/response time
 After simulation, parameters can be tuned in model and then replicated in real system

E.g. number of disks, memory, algorithms, etc
23
Database System Concepts, 5th Ed.
23.23
©Silberschatz, Korth and Sudarshan
Table of Contents
 23.1 Performance Tuning
 23.2 Performance Benchmarks
 23.3 Standardization
 23.4 Application Migration
 23.5 Summary
24
Database System Concepts, 5th Ed.
23.24
©Silberschatz, Korth and Sudarshan
Performance Benchmarks
 Suites of tasks used to quantify the performance of software systems

single task not enough for complex systems
 Important in comparing database systems, especially as systems become
more standards compliant.
 Commonly used performance measures:
 Throughput (transactions per second, or TPS)
 Response time (delay from submission of transaction to return of result)
 Availability or mean time to failure (MTTF)
 Need to beware when computing average throughput of different transaction types

Suppose that a system A runs transaction type T1 at 99 tps and transaction
type T2 at 1 tps and another system B runs both T1 and T2 at 50 tps.
 The idea of simple averaging TPS is wrong!
– equal performance of two systems A and B (50 tps)!
 If we ran 50 transactions of each type
– System A: T1 (0.01 sec), T2 (1 sec)  need 50.5 secs
– System B: T1 (0.02 sec), T2 (0.02)  need 2 secs
25
Database System Concepts, 5th Ed.
23.25
©Silberschatz, Korth and Sudarshan
Performance Benchmarks (Cont.)
 Instead, harmonic mean of n throughputs t1, t2, … tn:
n
1/t1 + 1/t2 + … + 1/tn

system A (T1: 99 tps. T2: 1 tps), system B (T1: 50 tps. T2: 50 tps)

The harmonic mean for system A is 1.98 ( 2 / ((1/99) + (1/1)) ) while for system
B, it is 50 (2 / ((1/ 50) + (1 / 50)) )

System B is 25 times faster than system A
 Interference (e.g. lock contention) makes even this incorrect if different transaction
types run concurrently
26
Database System Concepts, 5th Ed.
23.26
©Silberschatz, Korth and Sudarshan
Database Application Classes
 Online transaction processing (OLTP)

requires high concurrency and clever techniques to speed up commit
processing, to support a high rate of update transactions.
 Decision support applications

including online analytical processing, or OLAP applications

require good query evaluation algorithms and query optimization.
 Architecture of some database systems tuned to one of the two classes

E.g. Teradata DBMS is tuned to decision support
 Others try to balance the two requirements

E.g. Oracle, with snapshot support for long read-only transaction
27
Database System Concepts, 5th Ed.
23.27
©Silberschatz, Korth and Sudarshan
Transaction Benchmarks Suites
 The Transaction Processing Council (TPC) benchmark suites are widely used.

TPC-A and TPC-B: simple OLTP application modeling a bank teller application
with and without communication
Not used anymore
 TPC-C: complex OLTP application modeling an inventory system
 Current standard for OLTP benchmarking
 TPC-D: complex decision support application
 Superceded by TPC-H and TPC-R


TPC-H: (H for ad hoc) based on TPC-D with some extra queries
 Models ad hoc queries which are not known beforehand
– Total of 22 queries with emphasis on aggregation
 prohibits materialized views
permits indices only on primary and foreign keys
 TPC-R: (R for reporting) same as TPC-H, but without any restrictions on
materialized views and indices
 TPC-W: (W for Web) End-to-end Web service benchmark modeling a Web
bookstore, with combination of static and dynamically generated pages 28

Database System Concepts, 5th Ed.
23.28
©Silberschatz, Korth and Sudarshan
TPC Performance Measures
 TPC performance measures

transactions-per-second with specified constraints on response time

transactions-per-second-per-dollar accounts for cost of owning system
 TPC benchmark requires database sizes to be scaled up with increasing
transactions-per-second

reflects real world applications where more customers means more database
size and more transactions-per-second
 External audit of TPC performance numbers is mandatory

TPC performance claims can be trusted
29
Database System Concepts, 5th Ed.
23.29
©Silberschatz, Korth and Sudarshan
Sample TPC Performance Measures
 Two types of tests for TPC-H (ad hoc) and TPC-R (report)




Power metric test

runs queries and updates sequentially

then takes mean to find queries per hour
Throughput metric test

runs queries and updates concurrently

multiple streams running in parallel: each stream generates queries, with
one parallel update stream

the total time of the entire run
Composite “query per hour” metric

overall metric

square root of ( power metric X throughput metric )
Composite “price vs. performance” metric

system price / composite metric
Database System Concepts, 5th Ed.
23.30
30
©Silberschatz, Korth and Sudarshan
Other Benchmarks
 OODB transactions require a different set of benchmarks.

Reason: hard to define what is a typical OODB application

The object operational benchmark, version 1: 001 benchmark

OO7 benchmark has several different operations

provides a separate benchmark number for each kind of operation
– Traversing a connected objects
– Retrieving all objects in a class
 Benchmarks for XML databases are now being discussed
31
Database System Concepts, 5th Ed.
23.31
©Silberschatz, Korth and Sudarshan
Table of Contents
 23.1 Performance Tuning
 23.2 Performance Benchmarks
 23.3 Standardization
 23.4 Application Migration
 23.5 Summary
32
Database System Concepts, 5th Ed.
23.32
©Silberschatz, Korth and Sudarshan
Standardization
 The complexity of contemporary software systems and the need for their
interoperation require a variety of standards.

syntax and semantics of programming languages

functions in application program interfaces

data models (e.g. object-oriented / object-relational databases)
 Formal standards
 De facto standards
 Anticipatory standards
 Reactionary standards
33
Database System Concepts, 5th Ed.
23.33
©Silberschatz, Korth and Sudarshan
Standardization (Cont.)
 Formal standards are standards developed by a standards organization (ANSI,
ISO), or by industry groups, through a public process.
 De facto standards are generally accepted as standards without any formal
process of recognition

Standards defined by dominant vendors (IBM, Microsoft) often become de
facto standards

De facto standards often go through a formal process of recognition and
become formal standards
 Anticipatory standards lead the market place, defining features that vendors then
implement

Ensure compatibility of future products

But at times become very large and unwieldy since standards bodies may not
pay enough attention to ease of implementation (e.g.,SQL-92 or SQL:1999)
 Reactionary standards attempt to standardize features that vendors have already
implemented, possibly in different ways.

Can be hard to convince vendors to change already implemented features.
E.g. OODB systems
34
Database System Concepts, 5th Ed.
23.34
©Silberschatz, Korth and Sudarshan
SQL Standards History
 SQL developed by IBM in late 70s / early 80s
 SQL-86: The first formal standard
 IBM SAA standard for SQL in 1987
 SQL-89 added features to SQL-86 that were already implemented in many
systems

SQL-89 was a reactionary standard
 SQL-92 added many new features to SQL-89 (anticipatory standard)

Defines levels of compliance (entry, intermediate and full)

Even now few database vendors have full SQL-92 implementation
 SQL 1999

Adds variety of new features --- extended data types, object orientation,
procedures, triggers, etc.
 SQL 2003
35
Database System Concepts, 5th Ed.
23.35
©Silberschatz, Korth and Sudarshan
SQL Standards History (Cont.)
 The SQL: 2003 Standard

Part 1: SQL/Framework; overview
 Part 2: SQL/Foundation; types, schemas, DDL, DML, security, etc

Part 3: SQL/CLI (Call Level Interface); API interface
 Part 4: SQL/PSM (Persistent Stored Modules); procedural extensions
 Part 9: SQL/MED (Management of External Data)
 Interfacing of database to external data sources





– other databases and even files can be viewed as part of the database
Part 10: SQL/OLB (Object Language Bindings); embedding SQL in Java
Part 11: SQL Schemata(Information and Definition Schema) defines a standard
catalog interface
Part 13: SQL/JRT (Java Routines and Types) defines standards for accessing
routines and types in Java
Part 14: SQL/XML defines XML-related specifications
Missing parts 5,6,7, 8, 12 cover features that are not near standardization yet
 SQL/Bindings, Temporal Data, Distributed Transaction, Multimedia Data
36
Database System Concepts, 5th Ed.
23.36
©Silberschatz, Korth and Sudarshan
Database Connectivity Standards
 Open DataBase Connectivity (ODBC) standard for database interconnectivity





based on Call Level Interface (CLI) developed by X/Open consortium and the
SQL Access Group
 defines application programming interface (API), and SQL features that must
be supported at different levels of compliance
JDBC standard used for interconnectivity between Java and Database
X/Open XA standards by X/Open consortium
 define transaction management standards for supporting distributed 2PC
OLE-DB by Microsoft
 API like ODBC, but to support non-db sources such as flat files or email stores
 OLE-DB program can negotiate with data source to find what features are
supported
 Interface language may be a subset of SQL
ADO (Active Data Objects) by Microsoft
 Easy-to-use interface to OLE-DB functionality
 Can be called from scripting languages such as VBScript and Jscript
 ADO.NET API is for application in the .NET languages such as C# and Visual
Basic.NET
37
Database System Concepts, 5th Ed.
23.37
©Silberschatz, Korth and Sudarshan
Xopen / XA 작동 그림예제
38
Database System Concepts, 5th Ed.
23.38
©Silberschatz, Korth and Sudarshan
OLE-DB & ADO 작동 그림예제
39
Database System Concepts, 5th Ed.
23.39
©Silberschatz, Korth and Sudarshan
Object Oriented Databases Standards
 Object Database Management Group (ODMG) standard for OODB

version 1 in 1993, version 2 in 1997, version 3 in 2000

provides language independent Object Definition Language (ODL) as well as
several language-specific bindings
 Object Management Group (OMG) standard for distributed SW based on objects

Object Request Broker (ORB) provides transparent message dispatch to
distributed objects

Interface Definition Language (IDL) for defining language-independent data types

Common Object Request Broker Architecture (CORBA) defines specifications of
ORB and IDL
40
Database System Concepts, 5th Ed.
23.40
©Silberschatz, Korth and Sudarshan
CORBA 작동 그림예제
41
Database System Concepts, 5th Ed.
23.41
©Silberschatz, Korth and Sudarshan
XML-Based Application Standards
 Several XML based Standards for E-commerce

E.g. RosettaNet (supply chain), BizTalk
 Define catalogs, service descriptions, invoices, purchase orders, etc.
 XML wrappers are used to export information from relational databases to XML
 Simple Object Access Protocol (SOAP):

XML based remote procedure call (RPC) standard
 Uses XML to encode data (both parameters and results)
 HTTP as transport protocol (procedure call becomes HTTP request)

Endorsed by W3C
 Popular in B2B E-commerce
 Standards based on SOAP for specific applications
 E.g. OLAP and Data Mining standards from Microsoft
 DOM & SAX API are for general programming language to connect to XML Data
42
Database System Concepts, 5th Ed.
23.42
©Silberschatz, Korth and Sudarshan
SOAP 작동 그림예제
43
Database System Concepts, 5th Ed.
23.43
©Silberschatz, Korth and Sudarshan
Table of Contents
 23.1 Performance Tuning
 23.2 Performance Benchmarks
 23.3 Standardization
 23.4 Application Migration
 23.5 Summary
44
Database System Concepts, 5th Ed.
23.44
©Silberschatz, Korth and Sudarshan
Legacy Systems
 Legacy systems are older-generation systems that are incompatible with current
generation standards and systems but still in production use

E.g. applications written in Cobol that run on mainframes
 Today’s hot new system is tomorrows legacy system!
 Porting legacy system applications to a more modern environment is problematic
 Rewriting code
 Very expensive, since legacy system may involve millions of lines of code,
written over decades
 Original programmers usually no longer available
 Switching over from old system to new system is a problem
 Big-Bang Approach
 Chicken-little Approach
 One approach: build a wrapper layer on top of legacy application to allow
interoperation between newer systems and legacy application
 E.g. use ODBC or OLE-DB as wrapper
45
Database System Concepts, 5th Ed.
23.45
©Silberschatz, Korth and Sudarshan
Legacy Systems (Cont.)
 Rewriting legacy application requires a first phase of understanding what it does

Often legacy code has no documentation or outdated documentation

reverse engineering: process of going over legacy code to

Come up with schema designs in ER or OO model

Find out what procedures and processes are implemented, to get a high
level view of system
 Re-engineering: reverse engineering followed by design of new system

Improvements are made on existing system design in this process
46
Database System Concepts, 5th Ed.
23.46
©Silberschatz, Korth and Sudarshan
Migrating to a New System


Switching over from old to new system is a major problem

Production systems are in every day, generating new data

Stopping the system may bring all of a company’s activities to a halt, causing
enormous losses
Big-bang approach:
1.
Implement complete new system
2.
Populate it with data from old system
1.
No transactions while this step is executed
2.
scripts are created to do this quickly
3.
Shut down old system and start using new system

Danger with this approach: what if new code has bugs or performance
problems, or missing features

Company may be brought to a halt
47
Database System Concepts, 5th Ed.
23.47
©Silberschatz, Korth and Sudarshan
Migrating to a New System (cont.)
 Chicken-little approach:

Replace legacy system one piece at a time

Use wrappers to interoperate between legacy and new code

E.g. replace front end first, with wrappers on legacy backend
– Old front end can continue working in this phase in case of problems
with new front end

Replace back end, one functional unit at a time
– All parts that share a database may have to be replaced together, or
wrapper is needed on database also

Drawback: significant extra development effort to build wrappers and ensure
smooth interoperation

Still worth it if company’s life depends on system
48
Database System Concepts, 5th Ed.
23.48
©Silberschatz, Korth and Sudarshan
Table of Contents
 23.1 Performance Tuning
 23.2 Performance Benchmarks
 23.3 Standardization
 23.4 Application Migration
 23.5 Summary
49
Database System Concepts, 5th Ed.
23.49
©Silberschatz, Korth and Sudarshan
Ch 21: Summary (1)
 The Web browser has emerged as the most widely used user interface to
databases.

HTML provides the ability to define interfaces that combine hyperlinks with
forms facilities.

Web browsers communicate with Web servers by the HTTP protocol.

Web servers can pass on request to application programs, and return the
results to the browser.
 There are several client-side scripting languages - Java script is the most
widely used-that provide richer user interaction at the browser end.
 Web servers execute applications programs to implement desired functionality.

Servlets are a widely used mechanism to write application programs that
run as part of the Web server process, in order to reduce overheads.

There are also many server-side scripting languages that are interpreted by
the Web server and provide application program functionality as part of
Web server.
50
Database System Concepts, 5th Ed.
23.50
©Silberschatz, Korth and Sudarshan
Ch 21: Summary (2)
 Tuning of the database (system parameters, as well as the higher-level
database design-such as the schema, indices, and transactions) is important
for good performance.


Performance benchmarks play an important role in comparisons of database
systems, especially as systems become more standards compliant.


Tuning is best done by identifying bottlenecks and eliminating them.
The TPC benchmark suites are widely used, and the different TPC
benchmarks are useful for our comparing the performance of databases
under different workloads.
Standards are important because of the complexity of database systems and
their need for interoperation.


Formal standards exist for SQL. Defacto standards, such as ODBC and
JDBC, and standard adopted by industry groups, such as CORBA, have
played an important role in the growth of client-server database systems.
Standards for object-oriented databases, such as ODMG, are being
developed by industry groups.
51
Database System Concepts, 5th Ed.
23.51
©Silberschatz, Korth and Sudarshan
Ch 21: Summary (3)
 E-commerce systems are fast becoming a core part of how commerce is performed.

There are several database issues in e-commerce systems. Catalog
management, especially personalization of the catalog, is done with the
databases.

Electronic marketplaces help in pricing of products through auctions, reverse
auctions, or exchanges.

High-performance database systems are needed to handle such trading.

Orders are settled by electronic payment systems, which also need highperformance database systems to handle very high transaction rates.
 Legacy systems are systems based on older-generation technologies such as
nonrelational databases or even directly on file systems.

Interfacing legacy systems with new-generation systems is often important when
the run mission-critical systems.

Migrating from legacy systems to new-generation systems must be done
carefully to avoid disruptions, which can be very expensive.
52
Database System Concepts, 5th Ed.
23.52
©Silberschatz, Korth and Sudarshan
Ch 21: Bibliographical Notes (1)
 Information about servlets, including tutorials, standard specifications, and
software, is available on java.sun.com/products/servlet.

Information about JSP is available at java.sun.com/products/jsp.

An early proposal for a database-system benchmark (the Wisconsin benchmark)
was made by Bitton et al.[1983].

The TPC-A,-B, and –C benchmarks are described in Gray[1991].

An online version of all the TPC benchmarks descriptions, as well as benchmark
results, is available on the World Wide Web at the URL www.tpc.org




the site also contains up-to-date information about new benchmark proposals.
O’Neil and O’Neil[2000] provides a very good textbook coverage of performance
measurement and tuning.
The five minute and one minute rules are described in Gray and Putzolu[1987]
and Gray and Graefe[1997].
Brown et al.[1994] describes an approach to automated tuning.
53
Database System Concepts, 5th Ed.
23.53
©Silberschatz, Korth and Sudarshan
Ch 21: Bibliographical Notes (2)
 Index selection and materialized view selection are addressed by Ross et al.[1996],
Labio et al. [1997], Gupta [1997], Chaudhuri and Narasayya [1997], Agrawa et al.
[2000] and Mistry et al. [2001].
 Poess and Floyd[2000] gives an overview of the TPC-H, TPC-R, and TPC-W
benchmarks.

The OO1 benchmark for OODBs is described in Cattell and Skeen [1992]; the OO7
benchmark is described in Carey et al. [1993].

Kleinrock [1975] and Kleinrock [1976] is a popular two-volume textbook on queuing
theory.

Shasha [1992] provides a good overview of database tuning.
 The American National Standard SQL-86 is described in ANSI[1986].

The IBM Systems Application Architecture definition of SQL is specified by IBM
[1987].

The standards for SQL-89 and SQL-92 are available as ANSI [1989] and ANSI
[1992] respectively.

For reference in the SQL:1999 standard, see bibliographical notes of Chapter 9.
54
Database System Concepts, 5th Ed.
23.54
©Silberschatz, Korth and Sudarshan
Ch 21: Bibliographical Notes (3)
 The X/Open SQL call-level interface is defined in X/Open [1993]; the ODBC API
is described in Microsoft [1997] and Sanders [1998].
 The X/Open XA interface is defined in X/Open[1991]. More information about
ODBC, OLE-DB, and ADO can be found on the Web site
www.microsoft.com.data, and in a number of books on the subject can be found
through www.amazon.com.
 The ODMG 3.0 standard is defined in Cattell [2000]. ACM Sigmod Record, which
is published quarterly, has a regular section on standards in databases, including
benchmark standards.
 A wealth of information on XML based standards is available online.
 You can use a Web search engine such as Google to search for more up-to-date
information about the XML and other standards.

Loeb[1998] provides a detailed description of secure electronic transactions.
 Business process reengineering is covered by Cook [1996]. Kirchmer [1999]
describes application implementation using standard software such as Enterprise
Resource Planning (ERP) packages.
 Umar [1997] covers reengineering and issues in dealing with legacy systems.
55
Database System Concepts, 5th Ed.
23.55
©Silberschatz, Korth and Sudarshan
Ch 21: Tools
 There are many Web development tools that support database connectivity
through servlets, JSP, Javascript, or other mechanisms.
 We list a few of the better-known ones here:


JAVA SDK from SUN (java.sun.com)

Apache’s Tomcat (jakarta.apache.org) and Webserver (apache.org)

IBM WebSphere (www.software.ibm.com)

Microsoft’s ASP tools (www.microsoft.com)

Allaire’s Coldfusion and JRun products (www.allaire.com)

Caucho’s Resin (www.caucho.com)

Zope (www.zope.org).
A few of these, such as Apache, are free for any use, some are free for
noncommercial use or for personal use, while others need to be paid for

See the respective Web sites for more information.
Database System Concepts, 5th Ed.
23.56
56
©Silberschatz, Korth and Sudarshan
Table of Contents
 23.1 Performance Tuning
 23.2 Performance Benchmarks
 23.3 Standardization
 23.4 Application Migration
 23.5 Summary
57
Database System Concepts, 5th Ed.
23.57
©Silberschatz, Korth and Sudarshan
End of Chapter
Database System Concepts
©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use
Database System Concepts
58
©Silberschatz, Korth and Sudarshan