Introduction to Databases
Background and Fundamentals 1.2

Background and History

1.2.1 Core concepts and definitions

Data are valuable organizational resources. One of the technologies to manage data is a database. Databases are at the core of many (most?) information systems, including Electronic Health Records.

Informatics practice has been defined as management and processing of data, information, and knowledge. Relevant to databases, processing includes:
• Transactions (moving data): passing data into and out of the database(s)
• Presentation (appearance/format): columns, labels, grouping
• Transformation (data type, value conversions…): changes of data type (number to text or text to number), value conversions such as degrees F to degrees C or lb to kg, graphing… all the way to more complex rule-based transformations (“if person is the CEO then show *** for salary otherwise show salary rate per week * 52 and label this column as annual salary”)
• Aggregation (sum, count…): sum, count, average and other mathematical formulas that evaluate across a group of records. Example: average temperature

Key concepts

Informatics Meta-structures (see ANA draft document for definitions). You should be able to define these!
• Data: smallest unit. Discrete “facts” or values.
• Information: data that have been aggregated, processed, or put with other data so that meaning emerges
• Knowledge: information that has been synthesized so that relationships are known

Null (unknown or data do not exist)
• This is not the same as zero.
• This is not the same as the string of letters “N-U-L-L” (or “null”, “NULL”...).
• Some databases differentiate between an empty string (“” or ‘’) and null; others treat them as the same thing. Be careful of this when we write queries.

Meta Data: loosely described as data about data
• Structural metadata: the design and specification of data structures (data about the containers for data).
Describes the structures that will contain data, e.g., “this field contains date and time data”.
• Descriptive metadata: the “data about specific instances of data (contents)”, e.g., when it was last modified…

Data Element

The Unified Modeling Language (UML) is a standard maintained by the Object Management Group (OMG) and also published by the International Organization for Standardization (ISO). According to this standard, an element is an atomic constituent of a model. “Atomic” means the smallest unit that has precise meaning or precise semantics.

A data element, then, is the smallest unit of data, and a clinical data element is the atomic form of clinical data (the smallest subset of meaningful clinical data). For clinical data this includes:
• the name-value pair (e.g., temp 37 C: temp is the “name” and 37 is the “value”; the unit “C” is called a qualifier)
• a person component (who the data belong to or are about)
• the date and time associated with it (some sort of time stamp)
An example might be: John Smith temp 37.2 C 1/5/2011 8:00 am

Tables

Data are organized into tables that can be linked together. You can envision a table like a spreadsheet, with rows and columns. In a relational database:
• Each table represents a topic. This can be a real-world thing (person) or a conceptual topic (doctor visit). The topic is also called an entity.
• A row [also called a record or instance] contains the data for one individual value set (e.g., data about one specific patient).
• The columns are the descriptors (called attributes of the entity). For a patient, this would be things like medical record number, name, date of birth...
• A cell [field] is the value of the column for a particular patient.
At the intersection of a row and column is ONE value. This value is in its smallest useful form (atomic form).

One more thing… We use a table, with rows and columns, as a MENTAL MODEL. It helps us envision the data.
The data are not physically stored that way, though… the data are physically stored in whatever manner the operating system and database software determine.

Database notation

When describing a database table, I will use a common notation: the table name, followed by the names of the attributes in parentheses. For example,
Person (Person_ID, Last_Name, First_Name, Birth_Date)

More Definitions

• Database: a collection of related data (and a description of this data), designed to meet the information needs of an individual or an organization. A self-describing collection of related records.
  o Metadata: the self-describing part of the database. Also called the system catalog or data dictionary. It is “data about data” such as the data type (number, text, etc.), field size, and other characteristics. In Access we can use a tool called the documenter to obtain a report showing metadata. In Oracle, we can use a describe (desc) command to do the same thing. Part of the reason that databases tend to have a larger file size than spreadsheets is that the metadata are stored in system tables: the database contains both the content and the description of the structures.
• Database Management System (DBMS): software that defines, creates, maintains, and controls access to a database (e.g., MS Access, Oracle, MySQL, SQL Server, Sybase…)
• Database Application: a computer program that interacts with a database. Serves as an intermediary between the user and the DBMS (often contains forms, reports…). Interacts with the DBMS by sending requests, often SQL statements.
• Database System: a collection of database applications
• Schema: this term is used in many ways in the DB literature! Basically it means some sort of description of the database. It can mean an overall description of the database. It can also mean the portion of the overall database that “belongs to” (is owned by) a particular user.
• Instance: the actual data at any particular point in time.
A single instance can be thought of as one row in a table.

Database Structure Terms… from biggest/most general to smallest/most specific:
*Know the definitions of these items, and their synonyms* (hint hint)
• Database (Workspace): the overall “container” for the data
• Table (File, Entity): represents one “thing” about which we want to store data
• Row (Record, Instance): one individual example of the entity (e.g., one specific person)
• Column (Field, Attribute): the descriptors or properties of the entity
• Cell (data; value; Character, Number, Picture): the value of the column for a particular row
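
To try out several of the ideas in these notes (the table notation, null versus zero versus an empty string, aggregation, transformation, and structural metadata), here is a minimal sketch. It assumes Python's built-in sqlite3 module with an in-memory database, which is not one of the DBMSs named in these notes. The Observation table, its columns, and the sample temperatures are hypothetical and only for illustration. The results shown are SQLite's behavior; other databases may treat empty strings and nulls differently.

```python
import sqlite3

# Assumption: an in-memory SQLite database, used only for illustration.
conn = sqlite3.connect(":memory:")

# Person (Person_ID, Last_Name, First_Name, Birth_Date), following the notation in the notes.
conn.execute(
    "CREATE TABLE Person ("
    "Person_ID INTEGER PRIMARY KEY, Last_Name TEXT, First_Name TEXT, Birth_Date TEXT)"
)
# Hypothetical table of clinical data elements:
# person, name-value pair, qualifier (units), and a time stamp.
conn.execute(
    "CREATE TABLE Observation ("
    "Person_ID INTEGER, Name TEXT, Value REAL, Units TEXT, Observed_At TEXT)"
)

# One row = one instance of the entity. Birth_Date is unknown here, so it is NULL.
conn.execute("INSERT INTO Person VALUES (1, 'Smith', 'John', NULL)")
conn.executemany(
    "INSERT INTO Observation VALUES (?, ?, ?, ?, ?)",
    [
        (1, "temp", 37.2, "C", "2011-01-05 08:00"),
        (1, "temp", 38.0, "C", "2011-01-05 12:00"),
        (1, "temp", None, "C", "2011-01-05 16:00"),  # Python None is stored as NULL
    ],
)

# Aggregation: AVG and COUNT(column) skip NULLs, while COUNT(*) counts every row.
avg_temp, n_values, n_rows = conn.execute(
    "SELECT AVG(Value), COUNT(Value), COUNT(*) FROM Observation"
).fetchone()

# NULL is not zero: it is found with IS NULL, never with "= 0".
null_rows = conn.execute(
    "SELECT COUNT(*) FROM Observation WHERE Value IS NULL"
).fetchone()[0]
zero_rows = conn.execute(
    "SELECT COUNT(*) FROM Observation WHERE Value = 0"
).fetchone()[0]

# In SQLite an empty string is a value, not NULL (this query returns 0 for false).
empty_is_null = conn.execute("SELECT '' IS NULL").fetchone()[0]

# Transformation: a value conversion from degrees C to degrees F.
temps_f = [
    row[0]
    for row in conn.execute(
        "SELECT Value * 9.0 / 5.0 + 32 FROM Observation "
        "WHERE Value IS NOT NULL ORDER BY Observed_At"
    )
]

# Structural metadata: SQLite's PRAGMA table_info lists each column's name and type.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Person)")]

print("average temp (C):", round(avg_temp, 1))
print("values:", n_values, "rows:", n_rows)
print("NULL rows:", null_rows, "zero rows:", zero_rows)
print("temps (F):", [round(t, 2) for t in temps_f])
print("Person columns:", columns)
```

Note that the average is taken over the two known temperatures only; the row with the null value is counted as a row but contributes nothing to the sum or the average.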