Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
New kinds of databases 1. Distributed Databases 2. Warehouse Architecture 3. Mobile databases 4. GIS 5. Multimedia Databases 6. Genome Data Management Distributed Databases Distributed Database System DBMS DBMS DBMS DBMS data data data data CS 245 Notes 12 3 Parallelism: Pipelining Example: CS 245 T1 SELECT * FROM A WHERE cond T2 JOIN T1 and B select join A B (with index) Notes 12 4 Parallelism: Concurrent Operations Example: SELECT * FROM A WHERE cond merge data location is important... select select select A where A where A where A.x < 10 CS 245 10 A.x < 20 Notes 12 20 A.x 5 ATM Two Phase Commit Bank Mainframe ATM CS 245 Notes 12 6 What is a Warehouse? Collection of diverse data subject oriented aimed at executive, decision maker often a copy of operational data with value-added data (e.g., summaries, history) integrated time-varying non-volatile more CS 245 Notes11 7 What is a Warehouse? Collection of tools gathering data cleansing, integrating, ... querying, reporting, analysis data mining monitoring, administering warehouse CS 245 Notes11 8 Warehouse Architecture Client Client Query & Analysis Metadata Warehouse Integration Source CS 245 Source Notes11 Source 9 Motivating Examples Forecasting Comparing performance of units Monitoring, detecting fraud Visualization CS 245 Notes11 10 Mobile Databases Recent advances in portable and wireless technology led to mobile computing, a new dimension in data communication and processing. Portable computing devices coupled with wireless communications allow clients to access data from virtually anywhere and at any time. 2 Multimedia Databases In the years ahead multimedia information systems are expected to dominate our daily lives. Our houses will be wired for bandwidth to handle interactive multimedia applications. Our high-definition TV/computer workstations will have access to a large number of databases, including digital libraries, image and video databases that will distribute vast amounts of multisource multimedia content. 2.1 Multimedia Databases DBMSs have been constantly adding to the types of data they support. Today the following types of multimedia data are available in current systems. Text: May be formatted or unformatted. For ease of parsing structured documents, standards like SGML and variations such as HTML are being used. Graphics: Examples include drawings and illustrations that are encoded using some descriptive standards (e.g. CGM, PICT, postscript). 2.1 Multimedia Databases(2) Images: Includes drawings, photographs, and so forth, encoded in standard formats such as bitmap, JPEG, and MPEG. Compression is built into JPEG and MPEG. These images are not subdivided into components. Hence querying them by content (e.g., find all images containing circles) is nontrivial. Animations: Temporal sequences of image or graphic data. 2.1 Multimedia Databases(3) Video: A set of temporally sequenced photographic data for presentation at specified rates– for example, 30 frames per second. Structured audio: A sequence of audio components comprising note, tone, duration, and so forth. Audio: Sample data generated from aural recordings in a string of bits in digitized form. Analog recordings are typically converted into digital form before storage. 2.1 Multimedia Databases(4) Composite or mixed multimedia data: A combination of multimedia data types such as audio and video which may be physically mixed to yield a new storage format or logically mixed while retaining original types and formats. Composite data also contains additional control information describing how the information should be rendered. 2.1 Multimedia Databases(5) Nature of Multimedia Applications: Multimedia data may be stored, delivered, and utilized in many different ways. Applications may be categorized based on their data management characteristics as follows: Repository applications: A large amount of multimedia data as well as metadata is stored for retrieval purposes. Examples include repositories of satellite images, engineering drawings and designs, space photographs, and radiology scanned pictures. 2.1 Multimedia Databases(6) Presentation applications: A large amount of applications involve delivery of multimedia data subject to temporal constraints; simple multimedia viewing of video data, for example, requires a system to simulate VCR-like functionality. Complex and interactive multimedia presentations involve orchestration directions to control the retrieval order of components in a series or in parallel. Interactive environments must support capabilities such as real-time editing analysis or annotating of video and audio data. 2.3 Multimedia Database Applications Large-scale applications of multimedia databases can be expected encompasses a large number of disciplines and enhance existing capabilities. Documents and records management Knowledge dissemination Education and training Marketing, advertising, retailing, entertainment, and travel Real-time control and monitoring 3 Geographic Information Systems Geographic information systems(GIS) are used to collect, model, and analyze information describing physical properties of the geographical world. The scope of GIS broadly encompasses two types of data: 1. spatial data, originating from maps, digital images, administrative and political boundaries, roads, transportation networks, physical data, such as rivers, soil characteristics, climatic regions, land elevations, and 3 Geographic Information Systems(2) 2. nonspatial data, such as socio-economic data (like census counts), economic data, and sales or marketing information. GIS is a rapidly developing domain that offers highly innovative approaches to meet some challenging technical demands. 3.1 GIS Applications It is possible to divide GISs into three categories: cartographic applications, digital terrain modeling applications, and geographic objects applications 3.1 GIS Applications(2) GIS Applications Geographic Objects Digital Terrain Applications Modeling Irrigation Applications Car Earth navigation science Crop yield systems Geographic analysis Civil engineering market analysis and military Land Utility evaluation Evaluation Soil Surveys distribution Planning and Air and water and Facilities pollution studies Consumer product consumption management and services – Landscape Flood Control economic analysis studies Cartographic Traffic pattern analysis Water resource management 4.1 Genome Data Management Biological Sciences and Genetics: The biological sciences encompass an enormous variety of information. Environmental science gives us a view of how species live and interact in a world filled with natural phenomena. Biology and ecology study particular species. Anatomy focuses on the overall structure of an organism, documenting the physical aspects of individual bodies. Traditional medicine and physiology break the organism into systems and tissues and strive to collect information on the workings of these systems and the organism as a whole. 4.1 Genome Data Management(2) Histology and cell biology delve into the tissue and cellular levels and provide knowledge about the inner structure and function of the cell. This wealth of information that has been generated, classified, and stored for centuries has only recently become a major application of database technology. 4.1 Genome Data Management(3) Genetics has emerged as an ideal field for the application of information technology. In a broad sense, it can be taught of as the construction of models based on information about genes – which can be defined as units of heredity – and population and the seeking out of relationships in that information. 4.1 Genome Data Management(4) The study of genetics can be divided into three branches: 1. Mendelian genetics is the study of the transmission of traits between generations. 2. Molecular genetics is the study of the chemical structure and function of genes at the molecular level. 3. Population genetics is the study of how genetic information varies across populations of organisms. 4.1 Genome Data Management(5) The origins of molecular genetics can be traced to two important discoveries: 1. In 1869 when Friedrich Miescher discovered nuclein and its primary component, deoxyribonucleic acid (DNA). In subsequent research DNA and a related compound, ribonucleic acid , were found to be composed of nucleotides (a sugar, a phosphate, and a base which combined to form nucleic acid) linked into long polymers via the sugar and phosphate. 4.1 Genome Data Management(6) 2. The second discovery was the demonstration in 1944 by Oswald Avery that DNA was indeed the molecular substance carrying genetic information. 4.1 Genome Data Management(7) Genes were shown to be composed of chains of nucleic acids arranged linearly on chromosomes and to serve three primary functions: 1. replicating genetic information between generations, 2. providing blueprints for the creation of polypeptides, and 3. accumulating changes– thereby allowing evolution to occur. Watson and Crick found the double-helix structure of the DNA in 1953, which gave molecular biology a new direction. Summary From this lecture you can learn some new kinds of databases Any Questions? If there are any outstanding questions you can ask me one-to-one after the lecture OR privately in my office. Exercises