* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Document
Survey
Document related concepts
Oracle Database wikipedia , lookup
Microsoft Access wikipedia , lookup
Concurrency control wikipedia , lookup
Functional Database Model wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Ingres (database) wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Clusterpoint wikipedia , lookup
ContactPoint wikipedia , lookup
Versant Object Database wikipedia , lookup
Transcript
1 Intro to Databases (using Microsoft Access) Yitzchak Rosenthal 2 Database (DB) • A Database is an organized collection of data that can be accessed and manipulated by computer programs. 3 People who use databases • End Users – Most business people do not usually use a database directly. Rather, business people use programs geared to their specific business needs. “Behind the scenes” those programs store or retrieve data from a database in response to the user’s actions. • Technical IT Professionals (e.g. programmers, database administrators) – IT Professional such as programmers and database administrators need to understand the in-depth issues about databases. These professionals create and maintain the programs that are used by the end users. • Business Analysts (BA) – When a new program needs to be created, the business analysts work with both the “end users” and the “technical IT professionals” to design the programs. The BAs don’t actually create the programs, they just work on the design. BAs must also understand databases concepts. • (See architecture diagram later in this presentation …) Database (DB) vs. DataBase Management System (DBMS) • Database (DB) – A Database contains the actual data. • Database Management System (DBMS) – A Database Management System (DBMS) is the software that is used to add, remove and manage the data in a database. • Difference between DB and DBMS – You can think of a database and a DBMS similar to the way you think about a word processing program (e.g. Microsoft Word) VS a word processing file. The DBMS is similar to the word processing program, while the database is similar to a word processing file. 4 One DBMS can control many different databases • A single “Database Management System” can control many different “databases”. • Individual databases contain information about a particular topic. • EXAMPLE: A telephone company may have the following different databases (as well as many others). These different databases can all be controlled through one DBMS program: 5 – Human Resources Database: contains information about employee name, titles, hire dates, salaries, pension and benefit information, 401K data, etc – Customer and billing database: information about customer names and addresses, billing information, calling patterns, etc – Assets database: information about real estate that the company owns and leases, cell phone towers and leased lines, etc. – Marketing database: information about marketing campaigns, advertising rates, locations of billboards that are being used, times/stations and rates for radio and TV ad spots, data from focus groups, etc. – etc. 6 CATEGORIES OF DATABASES (RELATIONAL, OBJECT ORIENTED, HIERARCHICAL, NETWORK) Relational / Hierarchical / Network / Object • Databases are categorized by how they organize information. • Types of databases – Hierarchical Database model – Network Database model – Object Oriented Database model – Relational Database model 7 8 Hierarchical Databases • Hierarchical Databases: – Data is organized into a hierarchy, similar to the folder/file hierarchy of the file system. – Hierarchical databases can be faster than an equivalent relational database. However, with modern faster hardware this is not as much of an issue as it originally was. – Hierarchical databases are not used much anymore for storing business data. – IBM’s IMS database is an example of a hierarchical database that is still used, mostly on large mainframe computers. 9 Network Databases • Network databases: – The word “network” in “network database” refers to how the information is organized in the database. (The word “network” is NOT related to the common concept of a computer network.) – Network databases were proposed as an enhancement of the hierarchical database. – Discussion of network databases is primarily academic since they are not really used anymore. 10 Object Oriented Databases • Object Oriented Databases: – Data is organized into information about “objects”. – Object Oriented Databases are designed to be easily integrated with Object Oriented programming languages. – Object oriented programming languages are very popular today. However, object oriented programming languages can ALSO be used with other types of databases. In practice most object oriented programs today use Relational Databases. – Object Oriented databases are the newest type of database. They will probably NOT replace relational databases but their use will probably grow. – An Object-Relational database is an object oriented database that can be implemented on top of a relational database. 11 Relational Databases • Relational Databases: – Data is organized into many “tables” of information. – Vast majority of databases in use today are “Relational” databases. – This is the type of database that we will be studying in depth. 12 RELATIONAL DBMS PRODUCTS 13 Current Relational DBMS Products • Most popular – MySql: “open source” DBMS. Can be downloaded and used for free. Powerful. Companies can purchase a technical support license from MySql, inc. Used a lot by smaller companies. Not as quick to catch on in larger fortune 500 companies. – Microsoft SQL Server – Microsoft’s “high-end” DBMS. More powerful than Access (i.e. faster, can handle more data, more users, better security features, etc). – Microsoft Access – Microsoft’s “low-end” DBMS. Used for smaller applications. Cheaper than SQL Server to buy/license. – Oracle – first commercially successful Relational DBMS. Very powerful. Has majority market share for large DB applications in fortune 500 companies. • Others – – – – Sybase IBM UDB IBM Informix many others … 14 CLIENTS AND SERVERS 15 Client Programs and Server Programs • Client programs and server programs are programs that are designed to communicate with each other. • A client program makes requests of a server program to send some information or perform some action. • The server program performs the requests of the client program and sends the client a response. • The overall architecture of such a system is known as a “client-server” architecture. 16 Example: databases • Example: – In the database architecture described above, the “application program” is acting as a “client program” and the DBMS is acting as a “server program”. – The “application program” requests the DBMS to lookup some information in the database and the DBMS performs the lookup and returns the results to the application program. 17 OTHER TYPES OF CLIENT AND SERVER PROGRAMS 18 Other types of client and server programs • The client server model is used to implement many different types of computer applications. • Examples – Web browsing – Email – Instant Messaging (IM) – Many other examples … 19 Web Browsing • TODO – show architecture diagrams for – Web – Email – IM – File sharing What is a network? • A network is a group of computers that can communicate with each other directly. servers clients 20 What is Internetworking? • • • any network can communicate with any other network some communications are direct (e.g. A-B, A-C, C-D) some communications go through other networks (e.g. B-C, B-D, A-D) Network A Network B servers clients Network D 21 direct communication line Network C The Internet • • The Internet "cloud" in the diagram represents many many networks connected to each other. The internet permits communication (usually indirect) between any two computers attached to the internet. Network A Network B servers clients Network D 22 The Internet Network C Clients and Servers • There are two general classifications of computers (AKA machines) on a network (or the Internet) – clients – servers 23 What Is a Server? • Server Machine – a computer on a network that runs programs ("Server Programs") which provide other computers on the network (i.e. "Client Machines") with services. • Server Program (AKA Server Software) – these are the programs that are run on "server machines" to provide other computers on the network with services • Server – The term "server" is sometimes used to refer to a "server machine" and is sometimes used to refer to a "server program" – In general it is the thing (machine or program) that provides the services. You will be able to tell from context the word refers to the machine or the program. 24 What is a Client? • Client Machine – a computer on a network that communicates with a Server to get some services. • Client Program (AKA Client Software) – a program that runs on the "client machine" which communicates with the server to get some services. • Client – The term "client" is sometimes used to refer to the client machine and sometimes used to refer to the client program – In general it is the thing (machine or program) that uses the services. You will be able to tell from context the word refers 25 to the machine or the program. One server, many clients • In general, one server handles the requests of many clients. • If there are too many clients for a single server then you can usually add a second (third, fourth, etc) server to handle the extra load. server clients 26 Server Types • There are many many types of servers. The following are just a few: – Web Server: Sends web pages to clients that request them. The client program that communicates with a web server is a web browser. – Email server: Delivers email from one client to another client – IM server: Coordinates IM connections – File server: Allows client machines to store files on the server machine's disk drives. (e.g. Z: drive, F: drive, etc) – Print server: Allows different machines to use a single printer. – Network server: This is a "traffic cop" that controls the communications that allow computers on the network to "talk" with each other. – Database Server: Coordinates access to the information in a database. Allows different client programs to read/write/modify information in the database. – Custom Server: For running a custom application. The custom application would also consist of custom client software. For example, a bank may develop an application so that its branches can exchange their financial information in "real time". 27 One computer can run both client programs and server programs at the same time • A single computer can simultaneously run several server programs and client programs. – Example • a person sitting at the console of a database server can run a web browser that gets a web page from a different web server. • In this case the "database server machine" is running a "web client program" (i.e. the browser). • When referring to the database, this machine can be thought of as the server machine and when referring to the web, this machine can be thought of as the client machine. – Another example • The "database server program" might store a file on another machine. • In this case the database server machine is acting as a client to the "file server machine" – Many server programs on one machine • You may have one machine that runs a database server program and a web server program 28 29 DATABASE CLIENT SERVER ARCHITECTURE 30 Database Architecture Diagram • The diagram on the next slide shows the organization of a typical database system. Don’t be worried if you don’t understand the diagram now. The diagram will be referenced and explained throughout the coming slides. • Two computers in the diagram: – The computer on the left is running an “application program” (see later slides for an explanation) – The computer on the right is running DBMS software (see later slides for an explination). The computer running the DBMS software is shown without a screen. This is because the user doesn’t use this computer, rather the application program does. A computer that runs software that is used by other programs is known as a “server computer”. The program that it runs is known as a “server program”. • Symbols in the diagram – A cylinder is generally used to represent a hard drive or a database. • See next slide … Architecture Diagram Application Program running on user’s computer DBMS software running on server computer Tel# Lookup Program Step 1: Name: Joe Smith here look up Click to to Lookup # # CLICK Click DB Management System (DBMS) Server Program Step 4: 938-4763 Step 3: DBMS Software finds answer in database on server computer’s hard drive. EXPLINATION Step 1: User runs application program on his computer. Step 2: application program sends query to server: “What’s Joe Smith’s tel Number?” (query is actaully expressed using Sequential Query Language (SQL), NOT in English as in picture) Step 3: DBMS Software looks up answer in the database. Step 4: DBMS Software sends answer back to appl. program Step 5 (not shown): application program displays answer to the user. Server Computer’s Hard Drive First Name Last Name Telephone Address Oscar Grouch 9837232 123 Sesame St Joe Smith 9384763 987 Main St Bugs Bunny 8372364 1 Rabbit Hole 31 32 About the Diagram: DB and DBMS • The diagram above shows two different computers, one the left and one on the right. (The reason why there are 2 computers will be explained a little later.) The computer on the left is known as the “client” computer and the one on the right is the “server” computer. (The terms “client” and “server” will also be explained later.) • The DBMS is shown as a program that is running on the server computer (the computer on the right). • The database appears on the diagram below the server computer (the computer on the right). The cylinder that surrounds the data on the diagram represents the server computer’s hard drive. (Even though the hard drive is usually “inside” the server computer, it is often shown on such diagrams separately). 33 APPLICATION PROGRAMS 34 Application Programs • People do not usually use a DBMS program directly, rather people use an “application program”. • The application program communicates with a DBMS program to store or retrieve data from a database in response to the user’s actions. • Very often, the application program and the DBMS are running on different computers and communicate with each other using the computer network. [Note that it is possible to have the application program and the DBMS on the same computer. However for large database applications used by businesses, the DBMS is usually on a different computer than the application program is running on. ] 35 About the Diagram: Application Program • The diagram above shows an application program that is running on the computer on the left. • In “Step 1” on the diagram, the application program in the diagram presents a screen for the user to enter a name and click a button to lookup the person’s telephone number. • The arrows for “Step 2” and “Step 4” show the application program and the DBMS communicating over the computer network to get the desired telephone number. The user does not need to know that this part is happening. • In “Step 5” (not shown on the diagram) the “application program” displays the telephone number to the user. It appears to the user that the application program “knows” the telephone number. However, really, the application program just asked for this information from the database management system. 36 Other Types of Database Clients • The following are all systems that use a database – ATM machine – data is stored on the bank’s database. The data is accessible from any ATM machine. – Point of Sale – when you buy items in a supermarket and the cashier “scans” the item – the scanner and the checkout system acts as a client to the database server. The database sends product and pricing information back to the checkout system to produce the bill. The database also updates its inventory records to reflect the fact that the items have been purchased. – “EZ-Pass” – when a car passes through an “ez-pass” toll booth, the information about the car is recorded in a database which is then used to generate billing information. – eCommerce websites – use a database to keep track of product descriptions, customer orders and shipping information – etc. 37 STRUCTURED QUERY LANGUAGE (SQL) 38 Structured Query Language (SQL) • The application program uses a special computer language, called Structured Query Language (SQL), to communicate with the DBMS program. • SQL is often pronounced as “sequel” • Users of the application program do NOT need to understand SQL. It is the computer programmers who create application programs who must understand how to write SQL. 39 SQL : Inserts, Updates, Deletes, Queries, … • The most common types of SQL requests are: – Inserts: Requests from the application program to add information to the database (DB). – Deletes: Requests from the application program to remove information from the DB. – Updates: Requests from the application program to modify some information that already exists in the DB. – Queries: Requests from the application program to retrieve specific information from the DB. The diagram above shows the application querying the DMBS for Joe Smiths telephone number. This is a simple query that returns a single piece of information. Queries can be much more complex and return a lot of information. An example of a more complex query might be – find the list of names and telephone numbers for those people whose last name begins with “S” and whose telephone number is in the 917 area code. 40 Complete Explanation of Architecture Diagram • Complete Explanation of Architecture Diagram – Step 1: The user runs an application program that allows him to look up people’s telephone numbers. The user types in a name to lookup and then clicks the “Lookup” button. – Step 2: The application needs to get the information so the application program sends a query to the DBMS over the computer network to lookup the information in the database. This query is expressed using the SQL programming language. – Step 3: The DBMS looks up the telephone number in the database. – Step 4: The DBMS sends the information back to the client application program. – Step 5: The application program displays the answer to the user (not shown in the diagram). 41 ONE DATABASE – MANY APPLICATION PROGRAMS 42 Many Clients at the Same Time • The diagram above only shows a single client computer. • In reality it is more common for many different client computers running application programs to communicate with the DBMS at the same time. • EXAMPLE: Given a database that stores information about flights for an airline. An application program might be created for the ticketing agents to issue tickets to passengers. It is possible for many different ticketing agents to use the application program on their different computers at the same time. Server running DBMS Clients running application programs 43 Different types of application programs. • The information contained in a database can be used by different types of application programs simultaneously. • Example: A database that tracks information about planes and flights for an airline might be accessed by any of the following types of application programs – A program used by ticketing agents to sell tickets for the flights – A different program used by the airplane maintenance staff to determine which planes need to be maintained based on how many miles they have flown so far – A third program used by the payroll department to calculate paychecks based on the number of hours and miles flown by the crew. – Etc. 44 FEATURES OF DATABASES 45 Database vs. Excel • Many students wonder why a database is useful more than an Excel file would be for storing data. • The following slide lists some features of modern databases that cannot be easily accomplished with a program such as Excel 46 Features of Modern Databases • Automated storage, retrieval and manipulation of the data programs can access the data, not just people • Network accessible. Data can be accessed over a network by several users/programs at the same time • Able to process HUGE amounts of data – generally, the amount of data that data database can process is limited only by the amount of disk space on the computer. • Efficient (i.e. fast) storage, retrieval and manipulation of the data • Flexible querying capability – You can quickly summarize and reorganize the data in the database in many different ways. 47 Features of Modern Databases (continued) • Validation features. Has features for preventing “bad data” from being entered into the database (e.g. typos or illogical information such as a person’s age being recorded as 237, or as a word such as “blue”, etc). Databases can be setup with rules to describe what is considered valid data. Invalid data will not be accepted into the database. • Durable data. Modern databases have features to ensure that data will remain valid and uncorrupted even in the face of catastrophic errors such as a power outage. 48 Features of Modern Databases (continued) • Security. “Authentication” means that before accessing the data, a user must “login” or otherwise identify who they are. If a user has already logged into the operating system (e.g. windows) then many databases can identify the user based on how he logged into the operating system. Authentication is necessary to allow the DBMS to enforce security of the data. The database can be setup with rules to indicate which users can access which data. For example a database might store information about employees in a company. The names and telephone numbers might be available to every employee in the company. However, salary data might only be available to employees in the human resources department. 49 Overview of Databases using Access 2007 50 What is “Microsoft Access”? • What is “Microsoft Access”? – Access is a Database Management System (DBMS) program (see above for an explanation of what a DBMS is). – Access contains features to allow designers of database applications to create and manage the database. This includes the ability to create • “Tables” to store the data • “Queries” that extract and display information from the tables • “Reports” that display the information from a query in a “fancy” formatted way – It is important to remember that even though Access provides the ability to run queries directly, often the queries are started by an application program, and not directly by a user. 51 Books Database • The information in this tutorial will be based on storing information in a database that keeps track of books. • The database will store information about – books (title, number of pages, retail price, etc) – authors (names, addresses, etc) – publishers (names addresses, etc) – Advances and royalties (i.e. how much each author was paid) 52 Creating a new Access database Start Access and Click Here Name the Database File Specify a name for the database file here. If you need to choose a different folder, click here. Click “Create” 53 54 Extension: .accdb vs .mdb • Extension – The extension for an “Access 2007” file is .accdb – The extension for files from “Access 2003” and earlier is .mdb 55 ACCESS FILES AND VIRUSES 56 Opening a File That You Didn’t Create • It is possible for a malicious person to insert a computer virus into an Access Database file. (NOTE: This is also true for other Microsoft Office applications, e.g. word files, powerpoint files, excel files, etc.) • Therefore, if you open an Access file that you didn’t create yourself, you must trust the person or website who you got the file from. • Access can tell which files were created on the computer that you are using. If you open an Access file that was created on any other computer, you may see a warning message (see next slide). • The message does NOT mean that the file contains a virus. Rather it means that any file that was created on another computer is suspect and that you must trust the person or website that you got the file from. 57 Security Warning Message • If you open an Access file that was created on a different computer you may see the warning message shown below. • Click on the “Options…” button to see options (see next slide …) 58 “Enable this content” • Click on the “Options …” button to get this dialog box. • If you trust the person you got the file from (e.g. a book publisher, your professor, etc.) then choose “Enable this content” and press OK. • If you don’t choose to “Enable this content” then you will still be able to use the access file, but some of its features may not be available to you. 59 TABLES 60 What is a database “Table” • Information in a relational database is organized into several “tables” • Each table stores information about a “specific” type of concept or thing (known as an “entity”). Examples: people, cars, books, purchases, etc. • Below is an example of a table that contains information about books. Each row in the table stores information about a specific book. 61 Parts of a table • Every table has – A name of the table – “Rows” (or “Records”) that contain the data for specific items (or “entities”) – “Columns” (or “fields”) that contain specific “attributes” of each item. – Each “column” (or “field”) has a name. See next slide … 62 Name of the table A row (or record) of the table (contains info about a specific book) Example – “Books” table Columns or “fields” of the table. Each column contains information about a different “attribute” of a book. Column names or field names The “value” of one “field” in a specific row. To add a new “row” just start typing in the last row of the table (the one with the “*” next to it) 63 Entering new Information • To enter a new row of information start typing the new information on the last row of the table (the row with the “*” next to it). • Press TAB and Shift-TAB to move from one field to the next. • When you start typing the new row will have a pencil icon next to it. The pencil means that the row is not yet saved to the database. Once you’ve entered all of the fields and press TAB, the row is AUTOMATICALLY SAVED in the database and the pencil icon goes away. You do not have to do anything else to save the new row. 64 Deleting a Row • Deleting a Row: – STEP 1: right click on the “header” to the left of the row – STEP 2: choose “Delete Record” 65 CREATING YOUR FIRST TABLE Every database must have at least one table 66 • Every database must have at least one table • Therefore, when you create a new database, Access automatically presents the following screen which allows you to create a new table. • See next slide … 67 Creating Your First Table • The table already has a single field named “ID”. This is a “special” field that will be important for later. • The ID field has numbers that are automatically assigned by the database as new rows are added. We will discuss this more later. • To enter information just start typing HERE (see next slide). This will automatically create new fields. Adding a New Row • • Our first table will store information about books (titles, number of pages, prices, publication dates) To enter information about the first book, – – – – • start by typing the title (“The Great Pickle Caper”) into Field1. Press TAB to move to the next field and enter the number of pages (87). Continue pressing TAB to enter the price ($9.99) and publication date (1/1/2007). To move to the next field, press the TAB key (or Shift-TAB to move to the previous field). To enter information about the next book, (not shown in picture) – click on the 2nd row and enter the title, etc. Click here and type this The number 1 is automatically entered here by Access Then press TAB and type this. Click here to enter info about the 2nd book. Click and drag here to make field wider or narrower. 68 Saving and Naming the Table 69 • To save the table click on the “save” icon (or if you close the table you’ll be asked to save it). • When you save the table, you will be asked to give a name to the table. • The navigation pane lists the names of all created tables. If your database contains more than one table, you can double click on a table name to see the contents of that table. 70 SWITCHING BETWEEN “DATASHEET VIEW” AND “DESIGN VIEW” 71 Design of a table • In addition to the data in a table, you must specify the “Design” of each table. • The “Design” defines the type of data that is allowed in each field. • You cannot see both the design and the data at the same time. The data is visible in the “Datasheet View” of the table and the design is visible in the “Design View” of the table. • To switch between the “Datasheet View” and “Design View” of the table, use the “View” button as shown on the following slides … 72 To see the “Design View” of the table Design View Step 1: Click on the “Home” tab Step 2: Click on the arrow under the “View” button Step 3: Choose “Design View” Currently looking at the “Data”. Follow the steps at the left to view the “Design” 73 To see the “Datasheet View” of the table: Datasheet View Step 1: Click on the “Design” or “Home” tabs Step 2: Click on the arrow under the “View” button Step 3: Choose “Datasheet View” Currently looking at the “Design”. Follow the steps at the left to view the “Data” 74 DESIGN VIEW 75 Design of a table • The “design” of a table describes : – The names of the columns (or fields). – The type of information you may place in each field. • The type of information allowed in a field could be any of the following. (These types will be explained more fully in later slides) – – – – – – – • It is IMPOSSIBLE to enter information into a field of a different type than the type that is defined for that field. – – – – • text, number, currency, auto-number, date/time, yes/no, and some other types. Example: “pages” in the books table is defined to only accept “number” data (see next slide). Therefore page numbers must be entered as numbers (e.g. 298). It will be IMPOSSIBLE for someone to enter words for the page number field (e.g. “yellow”). It is also IMPOSSIBLE to enter words that represent numbers such as “two hundred and ninety eight”. See next slide for the design of the “books” table … 76 Design View Design view allows you to specify the field names and data types of each column in the table. “Description” describes in English what the column is used for. The Description is ignored by Access. The section on the bottom gives more control over the type of data allowed in the currently selected field on top. 77 DATA TYPES Data Types 78 Below are some of the data types that can be defined for fields. Keep reading for more info … Data Type What’s allowed in the field Examples Number Only numbers 0 -3 325 12.345 Text Anything that can be typed on the keyboard “Hello there!” “red” “several words in a row” Currency Numbers that will be used to represent currency amounts $125.99 Date/time A date and/or a time. 1/1/2008 6:00 AM 1/1/2008 6:00 AM Yes / no Yes or no. These values are represented on the screen as a checked checkbox (Yes) or an empty checkbox (No). Yes No Autonumber (See description on a later slide …) 79 DATA TYPES - NUMBER AND TEXT 80 Number Data Type • NUMBER – Can store only numbers – Cannot store letters or any “special” symbols (e.g. @#$%+_) – Use the number data type to store numerical data such as peoples ages, number of pages in a book, etc. • Integer numbers vs. numbers with decimal points – By default, you can only store “integer” values (i.e. whole numbers) in a number field – You can modify number fields to allow the storage of numbers with decimal points but we will not show how to do this here. 81 Text data type • TEXT – Can store any combination of letters, numbers and special symbols (e.g. @#$%+_) – Use the Text data type for fields that will store people’s names, addresses, names of colors, etc. 82 Storing numbers in a Text field - Question • QUESTION: Since both the Number and Text data types can be used to store a number, how should a field that is intended to store a number (eg. number of pages, zip codes, tel. numbers, etc) be defined, with the Number data type or Text data type? • ANSWER: see next slide … 83 Storing numbers in a Text field - Answer • ANSWER: In general, it is best to use the “Number” data type for numbers that you may need to either - perform mathematical calculations with, OR - compare two values numerically to see which is larger Use the “Text” data type for all other numbers • There are exceptions to this rule, but don’t worry about them. • EXAMPLES on next slide … 84 Storing numbers in a Text field - Examples • EXAMPLES: – Number of pages in a book should be stored as Number. This is because you may want to add up all the pages from several books - e.g. add up the page counts from all Harry Potter books to see how prolific JK Rowling is (performing mathematical calculations). You may also want to compare two different books to see which has more pages (numerical comparison) – ZIP Codes and Telephone Numbers should be stored as Text. This is because you would never need to do math with these values and you generally wouldn’t need to compare two ZIP codes or telephone numbers to see which is numerically larger. 85 OTHER DATA TYPES 86 Currency Data Type • CURRENCY – Can store numbers that represent amounts of money (e.g. prices, bank account balances, etc.) – Numbers are automatically displayed with a “$” and 2 decimal places of information 87 Date/Time Data Type • Date/Time – Can store a date, a time, or both a date and a time (i.e. a moment in history) – Works similar to the way Excel stores dates and times. 88 Yes/No Data Type • Yes/No – Can ONLY store the word “Yes” and the word “No”. – Use Yes/No to store information that can only have one of two values. – Works similarly to the way Excel stores “Logical”/”Boolean” values. 89 Autonumber Data Type • Autonumber – Used to store a number that is different for every row in the table. – The actual numbers to be put in the table are automatically generated by Access. – By default, the number for the first row in the table is 1. – Every time a new row is created the value in the autonumber field is one more than it was for the last row. – Autonumber is generally only used for “primary key” fields (see later in the presentation for information about what a “primary key” field is and more information about the Autonumber data type.) – You may only have one Autonumber field per table. 90 RELATING 2 DIFFERENT TABLES IN A 1 TO MANY RELATIONSHIP 91 Adding Publisher Information • We would like to add the publisher’s name, address and tel # for each book. – There is a RIGHT way to do this and – There is a WRONG way to do this. See next slides for more info … 92 The WRONG Way to Do It • The WRONG way to add publisher data would be to simply add columns to the existing books table, such as below. • This approach is WRONG because it introduces duplicate information into the database. New columns for publisher information Duplicated information for “Doe Books” Duplicated information for “Abc Publishing” 93 Duplicate Information is BAD • One of the goals of a relational database is to never repeat the same information twice in the database (or at least to minimize duplications as much as possible). • We’ll see how it is possible to do that soon … • Two reasons to avoid duplicate data – (more important) – Duplicate data can lead to errors. Suppose that the tel. number for “Doe Books” were entered differently in two different places. You would not know which number is correct. Even if you knew that the tel. number was correct in one part of the database you would have to check EVERY row to make sure that it is correct all over. It is much easier to have a single copy of the data that you can check. – (less important) – To save disk space. 94 The RIGHT Way to Do It • The RIGHT way to do this is to create a separate table for publishers. • You will then need to somehow connect the two tables to show which book was published by which publisher. • More info on next few slides … 95 What is a Primary Key field? • What is a Primary Key field? – A Primary Key is a field (or set of fields) that is guaranteed to not be empty and to have a different value for every single row in a table. • We need to add a Primary key to the Publisher’s Table. 96 The Publishers Table (with a primary key) • Below we show a new table named “Publishers”. • The publishers table now contains a column named “pubId” that contains a different value for every row in the table. This column will serve as a “primary key” for the publishers table. • We will soon see how we make use of the primary key … Relating the Books table to the Publishers table 97 • To record the publisher for each book, we can add a column to the Books table to record the pubId of the publisher of the book. • The two tables are shown below. • The pubId column in the books table is known as a foreign key. 98 Primary Key VS Foreign Key • Primary Key – Must have different values for each row in the table – The primary key is used to uniquely identify each row in the table – A table may only have ONE Primary Key • Foreign Key – Values may be the same for different rows of the table – A foreign key stores a value from the primary key of a different table. – A table may have more than one foreign key (we’ll see why later) 99 Names of Primary and Foreign key fields • As mentioned above, each Foreign Key field in one table, has a corresponding Primary Key field in a different table. • A foreign key field in the first table often has the same name as its corresponding primary key field in the second table. • However, these names can be different too. 100 Entity Relationship Diagram (ERD) • We show the relationship between tables in a database using an Entity Relationship Diagram. • The diagram below shows that the Books and Publishers tables are related to each other through the primary key – foreign key relationship of the pubId fields in each table (the line is drawn from pubId to pubId). • The key symbol next to pubId in the publishers table indicates that pubId is a primary key in the publishers table. • The 1 and the infinity sign (a sideways 8) will be discussed on the next slide … 101 One To Many Relationship • One to many relationship – A book can only be published by one publisher. – However, a publisher can publish many books. – This defines a “one to many” relationship. • The 1 in the ERD represents the one side of the relationship (i.e. one publisher) and the infinity sign represents the “many” side of the relationship (i.e. can publish many books”) Review – how to store a 1 to Many relationship in a database 102 • How to create a 1 to many relationship in a database – Create 2 different tables, one for the 1 side and one for the many side – Create a primary key for the table on the 1 side – Create a foreign key for the table on the many side. This foreign key will be used to store values from the primary key of the other table. 103 ANOTHER 1 TO MANY RELATIONSHIP (AUTHORS-BOOKS) 104 Authors and Books • Next we want to record which author wrote which books. • As a first attempt we will approach this just like we did for publishers. • We will create a different table to store author information (name, address, tel #, etc). • The author’s table will have a primary key (authorId). • We will add a new column to the books table to store the authorId. This will be a “foreign key”. • See the next slide … 105 New “authors” table Primary Key Tables Added a new “foreign” key field to the books table. 106 Who Wrote Which Book? • From the previous slide, you can see that – “Joe Jones” wrote • How to Stuff an Olive • Waltzing with Walt and Wendy • Uncle Sam is Sleeping – “Alice Aames” wrote • The Great Pickle Caper • A is for Artichoke – “Sue Smith” didn’t write any books 107 Updated ERD • The updated Entity Relationship Diagram is below. • This ERD shows that – a book can only have one author (we will change this soon, keep reading …) – However an author can write many books 108 MANY-MANY RELATIONSHIPS (AUTHORS-BOOKS) 109 Many-Many relationships • In the “real world”: – A single book can be written by many authors and – A single author can also write many books • This is an example of a many-many relationship. 110 Allowing Several Authors For a Single Book • Unfortunately, the way the database is currently setup, we cannot record information about books that were written by more than one author. • For example, suppose that both “Alice Aames” and “Sue Smith” collaborated to write the book “A is for Artichoke”. There is no way to record that in our database. – NOTE: You can’t add another authorId in the books table for “A is Artichoke” since there is only one authorId value allowed. Recording a Many-Many relationship requires 3 tables. 111 • The proper way to record information about manymany relationships in a database is to use 3 tables. • We will modify our database tables to allow us to record information about books that were written by several authors. • Keep reading … 112 Authors_Books Table • We create a new table called Authors_Books that shows which author wrote which book. • We do not need to record the “name” of the author in this table, just the author’s ID. • Similarly, we do not need to record the “title” of the book, just an ID for the book. To create an ID for the books table we will add a new column, bookID and make it the primary key of the books table. • Keep reading … 113 Creating the new tables • On the next slide you can see the new table structure. • The data in the new tables on the next slide represents the exact same information that we had in the previous table structure, i.e. the same authors wrote the same books (this slide doesn’t show yet two authors for a single book). • On a later slide we will show that you can now add a 2nd author for one or more of the books. Tables 114 New primary key for books table New table authors_books shows who wrote which book. Authors table is the same as before Removed authorId field from books table and moved it to the authors_books table 115 Recording 2 authors for one book • Each row in the Authors_Books table represents the fact that a specific author was involved with writing a specific book. • With the new table structure you can indicate that 2 authors wrote the same book by just adding another record to the Authors_Books table for the 2nd author. • To indicate that “Alice Aames” and “Sue Smith” collaborated to write the book “A is for Artichoke” add the row as shown on the next slide. Tables These two rows now show that both Alice Aames and Sue Smith collaborated to write “A is for Artichoke”. 116 Added this new row. Everything else is the same as before. Composite Primary Key and 2 Foreign Keys of Authors_Books Table 117 • Foreign Keys – The authorId column in the Authors_Books table is a foreign key that stores the value of a primary key from the authors table. – The bookId column in the Authors_Books table is a foreign key that stores the value of a primary key from the books table. • Composite Primary Key – The primary key in the Authors_Books table is made up of both the authorId and the bookId fields together. – A primary key that is made up of more than one field is known as a “composite primary key” – What this means is that there cannot be two rows in the Authors_Books table that have the same value for both the authorId and the bookId. However, two rows MAY have the same value for either the authorId or bookId. 118 Updated ERD for New Table Structure 119 REVIEW OF MAJOR IDEAS 120 Major Concepts • Relational databases avoid the repeating of duplicate information. • Different tables are created to store related types of information. • Relationships between rows in the tables is implemented through primary-key to foreign-key relationships • A one-to-many relationship can be represented with exactly 2 tables • A many-to-many relationship must be represented using 3 tables • An entity relationship diagram (ERD) shows the tables and how they are related to each other 121 Access Queries 122 What is a Query • What is a Query – A “query” means a question. We can ask the database a question about the data and the database will figure out the answer. – Example on next slide … 123 Example • Example – We can ask the database to list the titles of all books that have more than 140 pages. – After you create and run the query (see later slides) the database would respond as shown below. – To get the answer though, you must first understand how to ask the question (i.e. how to “create the query”) – Keep reading … 124 CREATING A QUERY - STEP 1 : OPENING THE QUERY DESIGN WINDOW 125 Example Query • We will create a query that answers the following question: List the titles of all books that have more than 140 pages. 126 Opening the “Query Designer” • To create a query – Click the “Query Design” button on the “Create” tab. 127 CREATING A QUERY – STEP 2: SPECIFYING TABLES AND FIELDS 128 Choose the Table(s) • When you open the query designer you are prompted to specify the tables that are needed to answer the query. • To add a table to the query designer click on the table name and then click the Add button. • (DO NOT choose a table if it is not necessary to answer the query). 129 Layout of Query Designer Window • Top of Query window shows tables. • Bottom shows other info about query (we’ll explain more about this soon …) 130 Adding more tables Right Click here • To add additional tables to the query – Right click on a blank portion of the top of the query window and choose “Show Table” – Then you will see the “Show Table” dialog box (not shown in picture) which allows you to add more tables. 131 Related Tables in a Query • All tables that are in a query should be “related” to each other through a Primary Key – Foreign Key relationship. • If the primary key and the foreign key have the same name, Access automatically draws the connecting line that shows the relationship. • If the primary key and the foreign key have different names, Access doesn’t draw the line automatically. Line that connects foreign key in books table to primary key in publishers table. 132 Drawing missing relationship lines Click here • To manually draw a “relationship-line” – click on the foreign key in one table – Drag it over the corresponding primary key in the other table – Let go of the mouse button. – (you can also drag in the reverse direction) • NOTE: lines that you draw manually may not be labeled with the 1 and the infinity sign. Then drag to here and let go of mouse button 133 Deleting Relationship Lines • To remove a relationship line that was drawn incorrectly – Right click in the middle of the line – Choose “Delete” Right-click here and choose “Delete” 134 Removing a table from the query Right-click here and choose “Remove Table” • You can remove a table from a query. • This does not remove the table from the database (i.e. your data will still be there) it just removes the table from “participating“ in answering the query. • To remove a table from the query – Right click on the name of the table and then – Choose “Remove Table” 135 Tables and lines must be correct • A query should never include more tables than necessary. If too few or too many tables are included in the query the results will not be correct. • The relationship lines between the tables in the query must be drawn correctly or else the results of the query will not be correct. 136 Back to the example … • List the titles of all books that have more than 140 pages. – To answer our query, the database only needs to look at the contents of “books” table since the books table contains both the title and the number of pages for each book. 137 Bottom of query window … • The bottom of the query window lists the fields that will be involved in the query. • Double click on each filed that is needed to answer the query. • The fields are then listed in the bottom of the query window. Double-click here and here Fields are shown here and here 138 Adding and Removing Fields From Query • To add or remove a column from the bottom portion of the query designer, click in one of the columns and press the appropriate button from the “Query Setup” group on the “Design” tab. – Choose “Delete Columns” to delete a column – Choose “Insert Columns” to insert a new column • NOTE: inserting or deleting columns from the query does NOT affect the underlying tables – only the query. First Click in a column Then choose appropriate button above 139 RUNNING A QUERY 140 Running A Query • Once you’ve added the tables and fields, you can start to see some results. • To see the results of the query – Click on the “Run” button from the “Results” group of the “Design” Tab. (see the results on next slide …) OR – [Another way to do the same thing: click on the “Design View” option from the “View” button.] 141 Query Results (datasheet view) • The results (or the “datasheet view”) of the query is below. • The results show only those fields that we selected in the query designer. • The contents of all rows from the original tables are displayed. • To restrict the rows that are displayed to only those books which have more than 140 pages, we must specify additional information in the design of the query. To do so, we must return to the “design view” of the query (see next slide). Switch from “Datasheet View” to “Design View” 142 • To return to the “Design View” of the query, choose “Design View” from the drop down list accessible from the “View” button on the “Home” tab. 143 CREATING A QUERY – STEP 3: SPECIFYING CRITERIA, SORT ORDER AND SHOW OPTIONS 144 Specifying Criteria • To specify which rows to display you must specify criteria for the fields. • Only those rows that match the specified criteria will be displayed in the results. • You can specify criteria for as many fields as you like. 145 Specific values VS numeric comparisons • Specific values – To specify that a field must have a specific value just type the value in the criteria for the field • Numeric comparisons – To specify that a field’s value must be greater than a specific value, precede the value for the criteria with the greater than sign. – You can use any of the following numeric operators: > < >= <= <> = greater than less than greater than or equal to less than or equal to not equal to equal to (for this you can leave out the operator too) 146 Text (quotes) vs Numeric (no quotes) • When a specifying value for a text field you must surround the value with quotes. (If you leave out the quotes, Access will usually add them for you.) • When a specifying value for a numeric field you must NOT surround the value with quotes. 147 Example - Design • To specify that only books with more than 140 pages should be listed, add that to the criteria in the query designer. 148 Example - Results • The results now correctly show only those books that have more than 140 pages. • Notice that the results are NOT sorted by either title or pages. 149 Sorting • To sort the results by a field, click on the “Sort” option for that field and choose either “Ascending” (increasing order) or “Descending” (decreasing order) • see results on next slide … 150 Sorting - results • Results are listed in “Decreasing” order of the number of pages. 151 Sorting by several fields - Example • For this query, we added a category field to the table. • The results are sorted by category, price and pages. • See next slide. 152 Example - Results • All rows are sorted in ascending alphabetical order by category (“howto” before “mystery”, before “politics”) • Those rows with the same category value are then sorted in ascending order by price. • Those rows with the same category and price are then sorted in descending order by number of pages. All rows sorted by category Rows with same categor y sorted by price Rows with same category and price are sorted descending by pages 153 Show • On the query designer uncheck “Show” to cause a field to not be shown on the results screen. • For example, you can uncheck “show” on the pages column to display only the titles of the books that contain more than 140 pages. • Results on next slide … 154 Show (unchecked) - results • By unchecking “show” from the pages field in the designer, you only see the title for those books that have 140 or more pages. 155 QUERIES THAT USE MORE THAN ONE TABLE 156 Direct and Indirect Relationships • All tables in a query MUST be related to each other either directly or indirectly. – Direct relationship between books and publishers – Indirect relationship between books and authors by going “through” authors_books 157 Queries Of More Than One Table • The following queries require more than one table – Show the titles and prices of all books that were published by “Abc Publishing”. – Show the titles and prices of all books written by Alice Aames • The 1st query requires just publishers and the books tables – Publishers is required to specify the name of the publisher (Abc Publishing) – Books is required to specify the title and the price • The 2nd query requires the authors, books AND authors_books tables – Authors is required to specify the name of the author – Books is required to specify the title and the price – Authors_books is required to “link together” the authors and the books tables. 158 QUERIES THAT USE AGGREGATE FUNCTIONS 159 Aggregate Functions • Aggregate functions are functions that create a single value out of multiple values. • Examples: – Sum creates a single value by adding up several values – Average creates a single value by taking the average of several values – Max creates a single value by finding the largest value of several values – Min creates a single value by finding the smallest value of several values – Count creates a single value by how many values there are of several values 160 Combining values from several rows • A query that uses aggregate functions combines values from several rows to create a single value. • Example 1: – List the total number of pages in all books. – This query uses the sum aggregate function to add up the page numbers for all books in the books table. • Example 2: – List the average number of pages in each category of book. – This query uses the average aggregate function to calculate up the average number of pages in all “howto” books, all “politics” books and all “mystery” books. Show “Total” row in the query designer 161 • To create a query that uses aggregate functions, right click in the bottom portion of the query designer and choose “Totals”. • This will add a new “Total” row to the bottom portion of the query designer. 162 Example 1: total pages in all books • Step 1: show the “Total” row in the designer (see previous slide) • Step 2: choose the pages field from the books table • Step 3: choose “Sum” from the drop-down list of aggregate functions for the pages field. • results on next slide … 163 Results • Results show a row for the entire table 164 Sum, Average, Max, Min of all pages • The following query shows several aggregate functions all being used on the pages field. • Results on next slide … 165 Results 166 GROUP BY 167 Group By • Queries that use aggregate functions may have one or more of the fields in the query specified as a “group by” field. Queries with no “Group By” fields return exactly one row 168 • If no “group by” fields are specified then the query is guaranteed to return a single row. • This is because the aggregate functions create a single value out of ALL of the values in the table. Queries with one or more “Group By” fields can return several rows 169 • If one or more fields in a query is specified as a “Group By” field that means that the aggregate functions that are specified for the other fields will work on “groups” of values. • The aggregate functions return separate numbers for each “group” of rows from the underlying table(s) for which the “Group By” field(s) have the same value(s). • See example on next slide Example 2: List the average number of pages in each category of book. • Step 1: show the “Total” row in the designer. • Step 2: choose the category and pages fields from the books table • Step 3: choose “Group By” from the drop-down list of aggregate functions for the category field • Step 4: choose “Avg” from the drop-down list of aggregate functions for the pages field • results on next slide … 170 171 Example 2 - Results • Results shows separate rows for “howto” books, “mystery” books and “politics” books 172 Access Forms and Reports 173 Access Forms and Reports • Access tables and queries can be very confusing for someone who has not been trained in Access or database techniques. • Access Forms are custom designed windows that display information from the tables and queries in a database. • The forms can be designed to allow a user to enter and retrieve information online without having to use tables or queries directly. The forms can present information on the screen in a more intuitive way than would be displayed in the underlying tables and queries. • Access Reports are similar to forms in that they present information from the underlying tables and queries in a more intuitive way. The difference between forms and reports is that forms are designed for working with information on the screen and reports are designed to display information in printed report. 174 Access is Often Used on a Single Computer • Very often databases are accessed over a network by a custom built application program. Those application programs are often very complex and difficult to create. • Complex application programs can be created for use with Access, however, instead of creating a custom program, Access Forms and Reports are often sufficient to satisfy the needs of a simple application.