Introduction to Databases
Background and Fundamentals
1.2 Background and History
1.2.1 Core concepts and definitions
Data are valuable organizational resources. One of the technologies to manage data is a database. Databases are at the
core of many (most?) information systems, including Electronic Health Records.
Informatics practice has been defined as management and processing of data, information, and knowledge. Relevant to
databases, processing includes:
• Transactions (moving data): passing data into and out of the database(s)
• Presentation (appearance/format): columns, labels, grouping
• Transformation (data type, value conversions…): changes of data type (number to text or text to number), value
conversions such as degrees F to degrees C or lb to kg, graphing…all the way to more complex rule-based
transformations (“if person is the CEO then show *** for salary otherwise show salary rate per week * 52 and label
this column as annual salary”)
• Aggregation (sum, count…): sum, count, average and other mathematical formulas that evaluate across a group of
records. Example: average temperature
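The transformation and aggregation ideas above can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function names and the sample temperature readings are hypothetical, and a real system would normally do this work inside the database or the application.

```python
from statistics import mean


def fahrenheit_to_celsius(deg_f):
    """Value conversion: degrees F to degrees C."""
    return (deg_f - 32) * 5 / 9


def pounds_to_kilograms(lb):
    """Value conversion: lb to kg."""
    return lb * 0.45359237


def annual_salary_column(title, weekly_rate):
    """Rule-based transformation from the text: mask the CEO, annualize everyone else."""
    if title == "CEO":
        return "***"
    return weekly_rate * 52


# Data type changes: number to text and text to number
temp_as_text = str(98.6)
temp_as_number = float("98.6")

# Aggregation: one result evaluated across a group of records
# (hypothetical readings in degrees F)
readings_f = [98.6, 100.4, 97.7]
average_c = mean([fahrenheit_to_celsius(f) for f in readings_f])
```

Note that the rule-based transformation returns text for one row and a number for the others, which is one reason such rules usually belong in the presentation layer rather than in the stored data.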
Key concepts
Informatics Meta-structures
See the ANA draft document for definitions. You should be able to define these!
Data: smallest unit. Discrete “facts” or values.
Information: data that have been aggregated, processed, or put with other data so that meaning emerges
Knowledge: information that has been synthesized so that relationships are known
Null: unknown, or the data do not exist.
• This is not the same as zero
• This is not the same as the string of letters “N-U-L-L” (or “null”, “NULL”...)
• Some databases differentiate between an empty string (“” or ‘’) and null; others treat them as the same thing.
Be careful of this when we write queries.
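The differences between null, zero, the text “NULL”, and an empty string can be checked with a short sketch. It uses Python's built-in sqlite3 module purely as a convenient stand-in (the course tools are Access and Oracle), and the table and column names are hypothetical. SQLite keeps the empty string and null distinct; other products may not, so test this on the database you actually use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (label TEXT, val)")
conn.executemany(
    "INSERT INTO reading VALUES (?, ?)",
    [
        ("true null", None),      # Python None is stored as SQL NULL
        ("zero", 0),
        ("text NULL", "NULL"),    # just a four-letter string
        ("empty string", ""),
    ],
)

# Only the true null matches IS NULL
null_labels = [
    row[0] for row in conn.execute("SELECT label FROM reading WHERE val IS NULL")
]

# Comparing with '= NULL' never matches anything, not even the true null
equals_null_count = conn.execute(
    "SELECT COUNT(*) FROM reading WHERE val = NULL"
).fetchone()[0]

# COUNT(*) counts rows; COUNT(val) skips nulls
total_rows, non_null_values = conn.execute(
    "SELECT COUNT(*), COUNT(val) FROM reading"
).fetchone()
```

The practical rule for queries: test for null with IS NULL (or IS NOT NULL), never with an equals sign.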
Metadata: loosely described as data about data
• Structural metadata: the design and specification of data structures (data about the containers for data).
Describes the structures that will contain data – e.g., “this field contains date and time data”.
• Descriptive metadata: the “data about specific instances of data (contents)” e.g., when it was last modified…
Data Element
The Unified Modeling Language (UML) is a standard maintained by the Object Management Group (OMG) and also published
by the International Organization for Standardization (ISO). According to this standard, an element is an atomic
constituent of a model. “Atomic” means the smallest unit that has precise meaning or precise semantics.
A data element, then, is the smallest unit of data, and a clinical data element is the atomic form of clinical data (the
smallest subset of meaningful clinical data). For clinical data this includes the name-value pair (e.g., temp 37 C: temp is
the “name” and 37 is the “value”; the unit “C” is called a qualifier), a person component (who the data belong to or are
about), and the date and time associated with it (some sort of time stamp).
An example might be
John Smith
temp 37.2 C
1/5/2011 8:00 am
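As a rough sketch, the parts of this clinical data element can be written as a small Python structure. The class and field names are hypothetical, and the date 1/5/2011 is assumed here to mean January 5, 2011 (month/day order).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ClinicalDataElement:
    person: str            # who the data belong to or are about
    name: str              # the "name" half of the name-value pair
    value: float           # the "value" half of the name-value pair
    qualifier: str         # units, e.g. "C"
    recorded_at: datetime  # the time stamp


element = ClinicalDataElement(
    person="John Smith",
    name="temp",
    value=37.2,
    qualifier="C",
    recorded_at=datetime(2011, 1, 5, 8, 0),  # assumes month/day order
)
```

Take away any one of these parts (the person, the units, the time stamp) and the remaining pieces no longer have precise clinical meaning, which is what “atomic” is getting at.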
Tables
Data are organized into tables that can be linked together. You can envision a table like a spreadsheet, with rows and
columns. In a relational database:
• Each table represents a topic. This can be a real world thing (person) or conceptual topic (doctor visit).
The topic is also called an entity
• A row [also called a record or instance] contains the data for one individual value set (e.g., data about
one specific patient)
• The columns are the descriptors (called attributes of the entity). For a patient, this would be things like
medical record number, name, date of birth...
• A cell [field] is the value of the column for a particular patient
At the intersection of a row and column is ONE value. This value is in its smallest useful form (atomic form).
One more thing…
We use a table, with rows and columns, as a MENTAL MODEL. It helps us envision the data. The data are not physically
stored that way, though…the data are physically stored in whatever manner the operating system and database software
determine.
Database notation
When describing a database table, I will use a common notation: the table name, followed by the names of the
attributes in parentheses. For example,
Person (Person_ID, Last_Name, First_Name, Birth_Date)
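To connect the notation to a real table, here is a minimal sketch that builds the Person table with Python's built-in sqlite3 module. The data types, the choice of Person_ID as primary key, and the two sample rows are assumptions made for illustration; the notation itself only gives the table and attribute names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table (entity): Person. Columns (attributes): the names in parentheses.
conn.execute(
    """
    CREATE TABLE Person (
        Person_ID  INTEGER PRIMARY KEY,
        Last_Name  TEXT,
        First_Name TEXT,
        Birth_Date TEXT
    )
    """
)

# Rows (records): one per individual person (made-up sample data)
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?, ?)",
    [(1, "Smith", "John", "1970-03-15"), (2, "Jones", "Mary", "1985-11-02")],
)

cursor = conn.execute("SELECT * FROM Person ORDER BY Person_ID")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

# Cell (field): ONE value at the intersection of a row and a column
first_row = rows[0]
cell = first_row[columns.index("Last_Name")]
```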
More Definitions
• Database: a collection of related data (and a description of this data), designed to meet the information needs
of an individual or an organization. A self-describing collection of related records.
o Metadata: the self-describing part of the database. Also called the system catalog or data dictionary. It is
“data about data” such as the data type (number, text, etc.), field size, and other characteristics.
In Access we can use a tool called the documenter to obtain a report showing metadata. In Oracle, we can use a
describe (desc) command to do the same thing. Part of the reason that databases tend to have a larger file size than
spreadsheets is that the metadata are stored in system tables: the database contains both the content and the
description of the structures.
• Database Management System (DBMS): software that defines, creates, maintains, and controls access to a
database (e.g., MS Access, Oracle, MySQL, SQL Server, Sybase …)
• Database Application: a computer program that interacts with a database. Serves as an intermediary between
the user and the DBMS. (Applications often contain forms, reports…). Interacts with the DBMS by sending requests,
often SQL statements.
• Database System: a collection of database applications that interact with the database, together with the DBMS
and the database itself
• Schema: this term is used in many ways in DB literature! Basically it means some sort of description of the database.
It can mean an overall description of the database. It could also mean the portion of the overall database that
“belongs to” (is owned by) a particular user.
• Instance: the actual data at any particular point in time. A single instance can be thought of as one row in a table.
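The “self-describing” part of a database (the metadata, or system catalog, defined above) can be seen directly by asking the database about its own structure. The sketch below uses Python's built-in sqlite3 module as a stand-in; the sqlite_master table and the PRAGMA table_info command are specific to SQLite and play roughly the role that the documenter plays in Access or desc plays in Oracle. The shortened Person table is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person (Person_ID INTEGER PRIMARY KEY, Last_Name TEXT, Birth_Date TEXT)"
)

# The system catalog is itself a table we can query
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
column_types = {
    row[1]: row[2] for row in conn.execute("PRAGMA table_info(Person)")
}
```

Notice that no data rows were inserted: the description of the structure exists, and can be queried, before the table holds any content.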
Database Structure Terms…From biggest/most general to smallest/most specific:
** Know the definitions of these items, and their synonyms ** (hint hint)
• Database (Workspace)
the overall “container” for the data
• Table (File, Entity)
represents one “thing” about which we want to store data
• Row (Record, Instance)
one individual example of the entity (e.g., one specific person)
• Column (Field, Attribute)
the descriptors or properties of the entity
• Cell (Data, Value; Character, Number, Picture)
the value of the column for a particular row