* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download ch15
Survey
Document related concepts
Microsoft Access wikipedia , lookup
Expense and cost recovery system (ECRS) wikipedia , lookup
Semantic Web wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Data vault modeling wikipedia , lookup
Information privacy law wikipedia , lookup
Business intelligence wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Open data in the United Kingdom wikipedia , lookup
Web analytics wikipedia , lookup
Concurrency control wikipedia , lookup
Versant Object Database wikipedia , lookup
Relational model wikipedia , lookup
Transcript
Chapter 15 Database and the Internet Fundamentals of Database Management Systems by Mark L. Gillenson, Ph.D. University of Memphis Presentation by: Amita Goyal Chin, Ph.D. Virginia Commonwealth University John Wiley & Sons, Inc. Chapter Objectives List the four differences between the Internet database environment and the standard database environment. Describe the database connectivity issues in the Internet environment. Describe the expanded set of data types found in the Internet environment. 15-2 Chapter Objectives Describe such database control issues as performance, availability, scalability, and security and privacy in the Internet environment. Describe the significance of data extraction into XML in the Internet environment. 15-3 The Internet E-mail File transfers Electronic commerce Shop online Bank online Get news online Get entertainment online Etc. 15-4 Internet & Databases Essence of e-commerce activity is data stored in databases. Data for a company’s product selections comes from a database. Your orders with a company go into a database. Checking your bank account balance requires querying a database. 15-5 Internet & Non-Internet Database Environment: Similarities Most (but not all) e-commerce databases are relational databases and many are transactional in nature. SQL and other standard query tools can be and are used in the e-commerce environment by the companies that own the databases. 15-6 Internet & Non-Internet Database Environment: Differences Database Connectivity Issues Expanded Set of Data Types Database Control Issues Data Extraction into XML 15-7 Database Connectivity Issues In a simple database environment, the application program, DBMS, and data are all contained within a single computer (single or multi-user environment). 15-8 Database Connectivity Issues: Client/Server Systems Two classes of computers. Client computers are enduser PCs, connected to a server computer on a LAN. Server contains application programs, the database management system, and the database, which all of the clients share. 15-9 Database Connectivity Issues: Client/Server Systems When an end user wants to run an application or retrieve data from the shared database: client computers handle the initial processing (the presentation or graphical user interface aspect) the data is sent on to the server for processing by the application code, including data retrieval from the shared database. server returns the results to the client PC. client is responsible for formatting the screen display. 15-10 Database Connectivity Issues: Client/Server System Usually thought of as a system built on a LAN. World Wide Web (WWW) can be considered to be a massive client/server system built on the Internet. 15-11 Database Connectivity Issues: Client/Server System WWW The clients are the PCs that individuals and companies use to connect to the Internet. The browsers in the PCs handle the client side screen presentation. The servers are the company Web servers with which people use their PCs for ecommerce. 15-12 Database Connectivity Issues: Hardware Components of the Web - Database Connectivity There are three levels of computers in this arrangement: the client PCs the Web server the database server. 15-13 Database Connectivity Issues: Specialized Web Software for E-Commerce When a client’s browser sends a message to the Web server, message follows rules of the Transmission Control Protocol/Internet Protocol (TCP/IP). All Internet traffic (including, e.g., e-mail) must follow TCP/IP. 15-14 Database Connectivity Issues: Specialized Web Software for E-Commerce The client’s message must also follow the rules of the Hypertext Transfer Protocol (HTTP), which is an additional protocol layer for WWW traffic on the Internet. 15-15 Database Connectivity Issues: TCP/IP & HTTP TCP specifies how the message is broken up into smaller “packets” for transmission. IP deals with the address of the computer to which the message is being sent. HTTP indicates the type of browser in the client and other information needed to format Web pages. 15-16 Database Connectivity Issues: The Web Database Environment There can be different kinds of: Hardware, even just between the Web server and the database server Application software languages Browsers on the client side Data, not just data in relational databases. Need specialized interfaces and specialized software known as middleware. 15-17 Database Connectivity Issues: Interfaces Common Gateway Interface (CGI) Application Program Interface (API) These interfaces have software “scripts” associated with them that allow them to exchange data between the application in the server and the databases controlled by the database server. 15-18 Database Connectivity Issues: ODBC Open Database Connectivity One standard way of accessing data. Designed as an interface to relational databases. 15-19 Database Connectivity Issues: JDBC Java Database Connectivity Another standard way of accessing data. Designed as an interface to relational databases. 15-20 Database Connectivity Issues: Middleware Used for connecting the applications in the Web server with the databases in the database server. Examples: Cold Fusion Oracle Application Server Microsoft Active Server Pages (ASP) and others. 15-21 Expanded Set of Data Types Data Types needed for various kinds of data: Numeric Character Large text blocks Graphic images Video clips Audio clips Databases that support Web sites must be capable of storing, searching, and retrieving this wide variety of data. 15-22 Expanded Set of Data Types: Oracle LOB Category LOB = large object category, includes data types: Binary LOB (BLOB) - for graphic images, photographs, video clips, audio clips Binary File (BFILE) - a pointer to a file external to the database. Character LOB (CLOB) - for large text files or documents National Character LOB (NCLOB) - for large text files or documents in non-Latin characters 15-23 Expanded Set of Data Types: Informix Universal Server The object/relational DBMS Informix Universal Server provides another style of handling multimedia and large text data using “Data Blades”: IMAGE - a general-purpose image data type DOC - for storing large text blocks A set of data types, including point, line, polygon, path, and circle, which can be used for storing a variety of graphic images. 15-24 Database Control Issues The Internet database environment requires a special emphasis on: Performance Availability Scalability Security and Privacy 15-25 Database Control Issues: Performance Internet traffic to a Web site, the number of people or companies trying to access it simultaneously, can vary greatly and because of a variety of factors such as: Time of day (worldwide basis) Season of the year (e.g., Christmas shopping) Popularity of a Web site A major new product introduction A major event (e.g., the Victoria’s Secret annual fashion show) 15-26 Database Control Issues: Performance Spikes in Internet traffic require serious predictive capacity planning. System performance is also affected by software design and database design. Performance-boosting techniques should be considered. 15-27 Database Control Issues: Performance-Boosting Techniques Physical design techniques, including denormalization. Database persistence - holding a temporary copy of the retrieved data outside of the database. Query cache - a special dedicated memory associated with the Web server Use canned queries instead of having employees write their own SQL queries to access data over an internal Intranet. 15-28 Database Control Issues: Availability A company’s Web site and the databases that it accesses should be available to the public at all times. Three o’clock in the morning in one part of the world is the middle of the day in another. 24/7 15-29 Database Control Issues: Availability An information system may be unavailable because: A system or telecommunications failure Failure of a support system, such as an electrical outage Planned down period for system maintenance Excessive traffic that clogs the system 15-30 Database Control Issues: Improving Availability Clustering A cluster of several servers is built, each with its own replicated copy of the database. As queries come in over the Web, sophisticated software checks the activity on each of the servers and their databases and performs load balancing, sending each particular query to a server that is relatively idle at that moment. 15-31 Database Control Issues: Scalability An information system (IS) that supports a Web site and its traffic growth must be scalable. An IS must be capable of growing in size without adversely affecting the operations of the site. Hardware and software must be chosen that is capable of rapid and major expansion. 15-32 Database Control Issues: Security and Privacy In the business-to-consumer e-commerce environment, the company wants as many people as possible to “visit” its Web site and buy its products. This means that hackers, data thieves, virus writers, and anyone else with mischief on their minds has an openly published entry point into the company’s information system. 15-33 Database Control Issues: Security Measures- Isolation Separating the different parts of the information system so that they run on different computers. The Web server and the database server should be different computers. Should be separated from the rest of the company’s information system by being on a separate LAN. 15-34 Database Control Issues: Security Measures - Firewalls Making major use of firewalls. Firewalls can be separate “proxy” computers that extract data from incoming messages and pass the data on in a different format to the Web server. 15-35 Database Control Issues: Privacy Companies have long held personal data about their customers in their databases. Now companies are communicating digitally with their customers through their Web sites, including passing their personal data, over the Internet. The collected personal data in the company’s database makes a tempting target for someone out to steal such data. 15-36 Data Extraction into XML XML = Extensible Markup Language When a Web server sends a Web page to your PC, the text and data in the page comes formatted in HyperText Markup Language (HTML). Embedded HTML “tags” mark up the text and data, instructing your PC’s browser on how to display the page on your monitor. 15-37 Data Extraction into XML: SGML SGML = Standard Generalized Markup Language HTML is derived from SGML SGML is also capable of indicating the meaning of data. XML, also derived from SGML, focuses on meaning of data. 15-38 Data Extraction into XML: XML DTD DTD = <!ELEMENT book> <!ELEMENT booknumbe r (#PCDATA)> Document <!ELEMENT booknam e (#PCDATA)> type <!ELEMENT publicationyea r (#PCDATA)> definition <!ELEMENT pages (#PCDATA)> <!ELEMENT publishername (#PCDATA)> Good <book> Reading’s <booknu mber>374566</booknumbe r> database <bookname>Catch-22</booknam e> <publicationye ar>1955</public ationyear> example <pages>443</pages> <publishername>Simon and Schuster</publishername> 15-39 Data Extraction into XML: XML and Database Management Supply chain management and electronic data interchange (EDI) require coordination between companies of database constructs. XML provides an independent layer of data definition that is separate from the particular formatting of each company’s data in their databases. 15-40 Data Extraction into XML: Good Reading Bookstores 15-41 “Copyright 2004 John Wiley & Sons, Inc. All rights reserved. Reproduction or translation of this work beyond that permitted in Section 117 of the 1976 United States Copyright Act without express permission of the copyright owner is unlawful. Request for further information should be addressed to the Permissions Department, John Wiley & Sons, Inc. The purchaser may make back-up copies for his/her own use only and not for distribution or resale. The Publisher assumes no responsibility for errors, omissions, or damages caused by the use of these programs or from the use of the information contained herein.” 15-42