GISC 6383 Technology Assessment Report
GIS Database Options
By: Deo Nabar, Uchit Patel
Instructor: Dr. Ronald Briggs
Date: October 21st, 2004

Topics to be covered
• Introduction
• Generations of DBMS
• The Move to Object-Oriented Architectures
• Spatial Data Representation
• Spatial Data Model
• Factors to consider when selecting an RDBMS
• Comparison of major RDBMS (Oracle, DB2, SQL Server, MS Access)
• Spatial features supported by major RDBMS
• Conclusion

Introduction
What is GIS?
• A Geographic Information System (GIS) is a computerized database management system used for assembling, storing, manipulating, analyzing and displaying geographically referenced information.
• The role of DBMS in GIS
  – GIS is a data-driven information system
  – Managing the data is a major issue in a GIS application

Introduction
• What is a database and a database management system?
  – A database is a collection of related data.
  – Information is generally stored in tables (rows and columns):
    • rows: records, observations, features (ArcInfo and ArcGIS), concepts or entities; each row holds all information about one occurrence of a feature
    • columns: fields, data elements, variables, items (ArcInfo), properties or attributes
  – A table is a two-dimensional list (array) of records containing attributes of objects.
  – A database management system (DBMS) is a collection of programs that enables users to create and maintain databases.
• Why use a database management system?
  – To manage data centrally
  – To normalize data
  – To control redundancy
  – To restrict unauthorized access
  – To provide multiple user interfaces
  – To represent complex relationships among data
  – To enforce integrity constraints
  – To provide backup and recovery

Why use a database management system?
• With operating system (OS) files, the application programmer must define the data needed and specify its characteristics and location.
• A DBMS provides an interface between the application program and the physical data files stored by the OS.
• With a DBMS, the user/programmer defines only the data needed; the DBMS tracks its physical location and characteristics.
• The DBMS presents a logical view of the data to the user/programmer, while internally maintaining a physical view of where and how the data is actually stored. (* from dbconcept.ppt)

[Diagram: User/Programmer (logical data element) → DBMS → OS file structure (physical field)]

Nuts & Bolts of DBMS
• Data Dictionary: inventory of data elements; defines and stores their characteristics:
  – physical characteristics (size, type)
  – location
  – ownership and security
  – usage (last date, programs, reports, etc.)
• Data Definition Language (DDL): language used by the database administrator to specify the content and structure (the schema) of the database
  – Originally this was unique to each vendor, and still is to a degree
  – UML (Unified Modeling Language) now provides a standardized, visual-based approach for creating schemas
• Data Manipulation Language (DML): commands permitting end users and/or programmers to extract and transform the data
  – Structured Query Language (SQL) is the standard
  – Applications often contain point-and-click interfaces which generate SQL queries
(* dbconcept.ppt)

Generations of DBMS

Flat file system
  Characteristics:
  - One large file contains all the data
  - Unique identifier
  Problems:
  - Data redundancy
  - Access time is high
  - Wastage of memory
  - Not easy to add new fields

Navigational file system (Hierarchical & Network)
  Characteristics:
  - Multiple files with different record structures
  - A record acts as master or parent
  - Each parent has many child records
  - Child records have children and parents
  - Access is via the parent
  Problems:
  - Pointer structure is very complex
  e.g. IBM's IMS

Relational DBMS
  Characteristics:
  - Multiple files, each with a different record structure
  - Tables can be related on a common record identifier
  Problems:
  - High computational requirements if there are many joins
  - Tables and E-R design must be carefully planned
  e.g. ESRI's INFO, IBM's DB2, Oracle, Ingres, Sybase, Informix, SQL Server, MS Access

The Move to Object-Oriented Architectures
• Relational (RDBMS)
  – Today 95% of corporate data is stored in RDBMS
  – Commercial off-the-shelf (COTS) DBMS products are provided by leading industry players: IBM DB2, Oracle Universal Server, Informix, Microsoft SQL Server and Access
  – Cannot store complete objects directly in the database: data and code are functionally separate
  – Poor performance for many types of geographic query
• Object (ODBMS)
  – Designed to address several weaknesses of RDBMS
  – An object is data encapsulated by code, so what is stored can be "data" or "software code"
  – An ODBMS can store objects persistently (semi-permanently on disk)
  – Provides object-oriented query tools
  – ODBMS products have been developed by vendors such as eXcelon Corp (ObjectStore) and GemStone Systems (GemStone)
  – Lack of industry support due to the massive installed base of RDBMS
• Object-Relational (ORDBMS)
  – Hybrid DBMS: an RDBMS engine adapted to handle objects (i.e. data describing object attributes, and object behavior as methods or functions)
  – Many ODBMS capabilities have been added to RDBMS
  – Code and objects can be reused in new programs
  – The ideal geographic ORDBMS is extended to support geographic object types and functions
* GIS text by Longley

Geographic (Spatial) DBMS extensions
• Beginning in 2001, the following three major DBMS vendors have released spatial database extensions to their standard ORDBMS products to store, manage and query geographic objects:
  – IBM DB2 Spatial Extender
  – Informix Spatial Datablade
  – Oracle Spatial Option
• The main focus of these extensions is data storage, retrieval and management; they do not have real capabilities for geographic editing, mapping and analysis. Consequently they should be used in conjunction with a GIS, except for the simplest query-focused applications.
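
To make "store, manage and query geographic objects" concrete, here is a minimal sketch of a point layer held in an ordinary relational table and filtered with a bounding-box query. It uses Python's standard sqlite3 module with plain SQL and no spatial extension; the table name bus_stop, its columns, the helper stops_in_box and the sample coordinates are all hypothetical and are not taken from any of the products discussed. A spatial extender would replace the two numeric columns with a geometry type and the BETWEEN test with spatial functions and indexes.

```python
import sqlite3


def stops_in_box(conn, xmin, ymin, xmax, ymax):
    """Return the names of point features that fall inside a bounding box.

    Plain SQL comparison on x/y columns; a real spatial extension would
    use a geometry column and a spatial predicate instead.
    """
    rows = conn.execute(
        "SELECT name FROM bus_stop "
        "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ? "
        "ORDER BY name",
        (xmin, xmax, ymin, ymax),
    ).fetchall()
    return [name for (name,) in rows]


# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# DDL: define the schema (hypothetical table for illustration only).
conn.execute(
    "CREATE TABLE bus_stop ("
    "id INTEGER PRIMARY KEY, name TEXT, x REAL, y REAL)"
)

# DML: load a few made-up point features.
conn.executemany(
    "INSERT INTO bus_stop (name, x, y) VALUES (?, ?, ?)",
    [("Elm St", 2.0, 3.0), ("Main St", 8.5, 9.0), ("Oak Ave", 15.0, 4.0)],
)

print(stops_in_box(conn, 0, 0, 10, 10))
```

This works for points, but lines and polygons need geometry types, topological predicates and spatial indexing, which is the gap the vendor extensions listed above are meant to fill.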
Spatial Data Representations
• Vector data for representing features
  – CAD, coverages, shapefiles, geodatabases
  – Classified into three shape types by dimension:
    • Points: zero-dimensional shapes, for features covering very small areas.
    • Lines: one-dimensional shapes, for features covering narrow areas. A line segment can be straight, circular, elliptical or splined.
    • Polygons: two-dimensional shapes, for broad geographic features. A series of segments forms a set of closed areas.
• Raster data represents gridded data such as photos, images and scanned maps.
  – Cameras and imaging systems record data as pixel values in a two-dimensional grid or raster.
  – Image data is stored in .bmp, .tiff, .jpeg, .sid and ERDAS formats
  – Raster data is held in discrete or continuous GRIDs (ESRI's native file format for raster)
  – Discrete grids (but not continuous grids) can have an attribute data table

Spatial Data Representations
• Triangulated Irregular Networks (TINs) are used for modeling surfaces.
  – TINs enable surface analyses such as watershed studies and surface visibility, and model surface features such as peaks, ridges and streams.
  – Although TINs are a vector format, as of ArcGIS 9 they are not yet supported by the Personal Geodatabase and must be stored in coverage workspaces.

Examples of Spatial Data Representation
• Property associated
  – Legal parcel: polygon
  – Assessor parcel: polygon
  – Parcel boundary: line string
  – Plat map: raster
  – Parcel photograph: image
  – Owner: alphanumeric
  – Address: alphanumeric
  – Land value: numeric
• Street associated
  – Street: line string
  – Street segment: line segment
  – Intersection: node
  – Traffic light: point
  – Traffic analysis zone: polygon
  – Bus route: route
  – Bus stop: point
• Example values: pixels (raster/image), abcdef123 (alphanumeric), 110210.67 (numeric)
* dbdesign.ppt

Spatial (Geographic) Data Models

Computer Aided Design (CAD)
  Characteristics:
  - Primary approach in 1960-1970: vector maps with lines displayed on CRTs, raster maps with overprinted characters on line printers
  - Stores geographic data in binary files as representations of points, lines and areas
  Problems:
  - Graphics only, with no topological information
  - Limited attribute information and no database
  - Different features may be combined in the same layer or feature class, such as roads, streams and railroads

Coverage or Georelational Model
  Characteristics:
  - Introduced by ESRI in 1981
  - Spatial data stored in indexed binary files, combined with attribute data stored in tables
  - Stores topological relationships between vector features
  Problems:
  - Complex structure
  - Non-industry-standard database; two databases must be maintained
  - Homogeneous collection of points, lines and polygons with generic behavior (Road = Stream)
  - Data duplicated

Shapefile
  Characteristics:
  - Introduced in the early 1990s with ArcView 2
  - Openly published structure, unlike the proprietary coverage model
  - Attribute data stored in a .dbf file
  Problems:
  - Spatial data not fully topological
  - Requires 3 separate files (.shp, .shx and .dbf) for storage
  - The dBase database format (.dbf) is now outdated in the IT industry

Geodatabase DBMS model
• Introduced by ESRI with ArcView v8.0 in 2000
• Characteristics
  - Built on object-oriented concepts and technology
  - Brings the physical data model closer to the logical data model, with objects such as owner, parcel, road and building
  - Implements the majority of custom behaviors without writing code
• Benefits
  - Features are made smarter by endowing them with natural behaviors
  - Any sort of relationship can be defined among features
  - A uniform repository for all geographic data
  - Data entry and editing are more accurate
  - Features on a map display are dynamic
  - Shapes of features are better defined
  - Many users can edit geographic data simultaneously

Geodatabase Today
  – The preferred approach to use in GIS today
  – Is a true database, unlike shapefiles
    • Powerful capabilities (domains, validation rules, etc.) for ensuring data integrity and simplifying data entry
  – Can be incorporated into "industry standard" databases such as Microsoft Access, SQL Server and Oracle

Spatial (Geographic) Data Models from Other Vendors
• The focus here is on spatial data models from ESRI, the GIS market leader
• Other GIS vendors (e.g. MapInfo, Intergraph GeoMedia, etc.) have equivalent data models, although perhaps less sophisticated than the geodatabase
• Oracle Spatial is offered by Oracle Corporation, the database market leader
• Many other off-the-shelf RDBMS vendors provide extensions to their existing DBMS, as discussed before.
* dbconcepts.ppt

Spatial File Formats: example
[Diagram: contents of a workspace. A Personal Geodatabase containing a feature data set with feature classes (feature type = polygon; feature type = arc); coverages (= feature class), each containing several feature types (arc, point, polygon); a tracts feature class table; a locator (table); a raster; shapefiles. Table annotations: features (rows), feature ID (key field), feature type, secondary or foreign key.]
• In a geodatabase (gdb), a feature class can have only one feature type.
• A coverage can have multiple feature types, now viewed as a shortcoming.

Topics to be covered by Uchit Patel
• Factors to consider when selecting an RDBMS
• Comparison of major RDBMS (Oracle, DB2, SQL Server, MS Access)
• Spatial features supported by major RDBMS
• Conclusion

Factors to consider when selecting an RDBMS
- Platform and system requirements
- Data type support
- Manageability
  • Administrative Assistant
  • Database Administration
  • Database Performance
  • SQL Performance
  • System Performance
  • Memory Management
  • Recovery Management
- Security
- Programming language support
- Scalability
- Capability for spatial applications
- Price
- Ease of use

Current Market Leaders

Oracle
  Version: Oracle 9i
  Address: Oracle Corporation, 500 Oracle Parkway, Redwood Shores, CA 94065
  Website: www.oracle.com

IBM
  Version: DB2 8.1
  Address: International Business Machines Corporation, New Orchard Road, Armonk, NY 10504. (914) 499-1900
  Website: www.ibm.com

Microsoft
  Version: Microsoft SQL Server 2000, MS Access
  Address: Microsoft Corporation, One Microsoft Way, Redmond, WA 98052. 1(800)-360-7561
  Website: www.microsoft.com

Product description
• Oracle 9i
  - Stores and manages more data types than any other database
  - Most advanced SQL, Java, XML and web services support
  - Sophisticated performance, reliability and security features
  - Easy to configure and manage from any web browser
• DB2 8.1
  - DB2 Universal Database is the first multimedia, web-ready relational DBMS strong enough to meet the demands of large corporations and flexible enough to serve small and medium-sized businesses
  - Open solutions that can access and integrate data from multiple, geographically separated sources on different platforms
  - Provides an object-relational DBMS that is ideal for mobile applications
• SQL Server 2000
  - Complete database and analysis offering for rapidly delivering the next generation of scalable e-commerce, line-of-business and data warehousing solutions
  - Query, analyze and manipulate data over the web
  - XML in SQL Server to exchange data between loosely coupled systems
  - Grow without limits with enhanced scalability and reliability features
  - Takes advantage of symmetric multiprocessing (SMP) hardware

Platform Requirement
                                   Windows NT/2000   UNIX   Linux
Oracle 9i                          Y                 Y      Y
DB2 8.1                            Y                 Y      Y
SQL Server 2000, MS Access 2003    Y                 N      N

System Requirement
Requirements differ by platform, installation, product edition, operating system, memory, hard disk and software.

Supported Data Types

Built-in data types
  Oracle 9i: Character, Number, Datetime, Long and Raw, Large object, Rowid
  DB2 8.1: Character, Graphic string, Number, Datetime, Binary, External data
  SQL Server 2000: Character, Number, Binary, Unicode character

User-defined data types
  Oracle 9i: built-in or other user-defined types serve as building blocks to model the structure and behavior of data
  DB2 8.1: Distinct, Structured, Reference
  SQL Server 2000: Object

ANSI SQL supported data types
  Oracle 9i: most data types can be converted to Oracle data types
  DB2 8.1: the built-in data types are the ANSI SQL data types
  SQL Server 2000: converted to built-in data types

Manageability: Oracle 9i compared with DB2 8.1

Administrative Assistant
  Oracle 9i: provides an integrated management framework that includes a configuration and administration console and security services
  DB2 8.1: SmartDBA Cockpit can manage databases from anywhere at any time

Database Administration
  Oracle 9i: Enterprise Manager provides a GUI to manage instances, database users, storage structures, database objects, etc.
  DB2 8.1: Change Manager is a change management solution for resolving the most complex database changes

Database Performance
  Oracle 9i: Parallel Server / Real Application Clusters use clustered servers without modification; native compilation of PL/SQL; better Java performance; distributed database performance enhancements
  DB2 8.1: optimizes database performance through an integrated set of expert DBA tools

SQL Performance
  Oracle 9i: Hint wizard adds hints to optimize the execution plan; SQL tuning wizard rewrites SQL to generate a better execution plan
  DB2 8.1: SQL-Explorer for DB2 simulates SQL execution non-intrusively, without degrading performance

System Performance
  Oracle 9i: PL/SQL and general I/O latch contention enhancements improve system performance
  DB2 8.1: PATROL Database knowledge module shows the status of databases and lets the administrator navigate through data and graphs

Memory Management
  Oracle 9i: self-tuning memory management, Legato Storage Manager, resumable space allocation, unused index identification
  DB2 8.1: Space Expert provides intelligent, high-performance space management

Recovery Management
  Oracle 9i: Recovery Manager and server-managed backup and recovery
  DB2 8.1: SQL-BackTrack is a high-performance backup and recovery product that simplifies and automates the backup and recovery process

Manageability in SQL Server 2000
• On Windows 2000 and later, SQL Server 2000 databases can be managed centrally
• SQL Server 2000 automates management and tuning as much as possible to reduce the burden on the administrator
• SQL Server offers intuitive wizards to quickly step administrators through complex tasks

Security
• Oracle 9i
  - Enhanced 3-tier security, hosting security, user security
  - Role-level, function-level and row-level security
  - Advanced Security option (EE), Label Security (EE), encryption toolkit, password management, proxy authentication
• DB2 8.1
  - Relies on security mechanisms to authenticate users
  - Once a user has been authenticated, all authorization lies within the database
• SQL Server 2000
  - Provides role-based security and integrated tools for security auditing
  - Provides support for sophisticated file and network encryption
  - Certified at the U.S. government's C2 security level

Program Language Support
               Oracle 9i   DB2 8.1   SQL Server 2000
Cobol          N           Y         N
C/C++          Y           Y         Y
SQL            Y           Y         Y
PL/SQL         Y           N         N
Visual Basic   Y           Y         Y
Perl           Y           Y         N
Java           Y           Y         N (.NET)
XML            Y           Y         Y

Scalability
• DB2 8.1
  - Handles terabytes of data
  - Excellent scalability
• Oracle 9i
  - Handles gigabytes of data but gives poor performance with terabytes of data
  - Real Application Clusters: new servers are added as required
  - Reduced resource requirement per user
  - Very good scalability
• SQL Server 2000
  - Handles gigabytes of data but works only on Intel and Microsoft platforms
  - Good scalability
• MS Access
  - Handles only 2 gigabytes of data
  - Very poor scalability

Capability for Spatial Application
• Oracle 9i
  - Oracle Spatial, used with Enterprise Edition (EE), manages location information
  - Extensible spatial object data types
  - Supports spatial reference systems
  - Geometric functions and procedures, coordinate system transformation functions
  - Full use of the SQL language for spatial data operations
• DB2 8.1
  - Provides the new Spatial Extender
  - Supplies spatial types to model real-world entities
• SQL Server 2000
  - No spatial extender
  - Used as a back-end database for GIS applications

Price
• Oracle 9i
  - Personal Edition: $400 per named user
  - Standard Edition: $15,000 per processor
  - Enterprise Edition: $40,000 per processor
  - Extra cost for Spatial, Data Mining, Label Security, Advanced Security and Enterprise Manager tools
• DB2 8.1
  - Version 8.1.4 of the DB2 Everyplace Enterprise Database package costs $15,000 per processor
• SQL Server 2000
  - Personal Edition: comes with Enterprise Edition
  - Enterprise Edition: $1599.95 - $10594.02 per processor
• MS Access
  - Comes with MS Office
  - Approx. $250

Ease of Use
• Oracle 9i
  - A series of tools makes work easier, e.g. Oracle Developer, Oracle Reports Developer, Oracle Forms Developer, Oracle E-Business Suite, JDeveloper, XML Developer's Kit
• DB2 8.1
  - Complete suite of GUI administration tools
  - Programmer-friendly tools for getting applications up and running quickly
• SQL Server 2000
  - Developer tools for accessing and manipulating data
  - New applications can take advantage of existing code

Conclusion
• Oracle 9i is the most functional, mainly because it runs on all of the platforms compared. Its object features let it model objects in the real world. It also has the largest market share.
• DB2 8.1 has recently moved from the mainframe to the client-server database market. It runs on most operating systems.
• SQL Server 2000 is the least expensive, but it runs only in the Windows/Intel environment.
- Database selection depends on consistency with the organization's existing software systems and on the functional requirements.
- Migrating the data to a new system may be relatively expensive.

Reference
• www.oracle.com
• www.ibm.com
• www.microsoft.com
• www.bmc.com
• http://www.utdallas.edu/~briggs/poec6383.html
• Modeling Our World, by Michael Zeiler
• Geographic Information Systems and Science, by Longley, Goodchild, Maguire, Rhind