MySQL/PHP Workshop
http://www.trii.org/courses/mysql/
• 2 MySQL lectures
• 2 PHP lectures
• Each lecture builds on concepts taught and learned in the previous
lectures.
• The first two lectures discuss the concept of a relational database
such as MySQL and show you how to manipulate the data stored in
the database from the command line. It is essential to learn this
first because PHP makes use of the language of the database.
• The third and fourth lectures will introduce you to PHP, a
server-side scripting language that allows you to interact with the MySQL
database from a web browser and create fancy web pages to
display the data. PHP is the go-between that fetches the data
from the MySQL database and then spits it out dynamically as the
nicely formatted HTML page that the browser expects.
Class #1: Introduction to MySQL
Manda Wilson, BIC, cBio, MSKCC
Trii Training: Bioinformatics and Computational Methods
• Relational databases
• Database design
• SQL
o Creating databases
o Creating tables
o Selecting from, deleting, and updating tables
• Exercises
• You should have
o Class notes
o Exercise handout
o MySQL Pocket Reference
• Useful resource: http://dev.mysql.com/doc/
Relational Databases
• A database is a collection of tables
• Columns define attributes of the data
o All data in a column must have the same data type
• A record is stored in a row
Table name: Employees
First Name | Last Name | Phone
Nadia | Li | 2687
Madhu | Charu | 7856
Ajuma | Kinsaka | 4489
Wade | Randal | 5257
Helen | Clark | 2147
(Each line of values is a row; First Name, Last Name, and Phone are columns.)
Use a Relational Database When…
• You have a very large dataset
• There is redundant data
o Wastes disk space
o Increases errors
• Information must be updated in multiple
locations
• Security is important
o Different users can be granted different
permissions
• Strict enforcement of data types is important
Spreadsheet Example
Title | Author
A House for Mr. Biswas | VS Naipaul
Midnight's Children | Salman Rushdie
On the Road | Jack Kerouac
One Flew Over the Cuckoo's Nest | Ken Kesey
Sula | Toni Morrison
Villette | Charlotte Bronte
Borrower | Phone
Sarah | 646.555.1234
Sarah | 646.555.1244
Jim | 646.555.4586
Data is inconsistent! Sarah is listed with two different phone numbers.
Now imagine you are designing the New York Public
Library database, which has tens of millions of books and
well over a million cardholders.
Database Design
Entity Relationship Design
Entity (“thing”, “object”) -> Table
Attributes (describe entity) -> Columns
Entity Instance -> Row
Relationships between entities preserved in relationships
between tables.
If you are interested in learning more formal design
methods look up “normalization” and/or “third normal
form”.
Our data
Ensembl Gene ID | Symbol / Name | Chromosome | Start Position (bp) | End Position (bp) | LocusLink ID | Taxonomy ID | Common Name | Species
ENSG00000186891.3 | TNFRSF18 | 1 | 1044947 | 1048147 | 8784 | 9606 | human | Homo sapiens
ENSG00000078808.4 | CAB45 | 1 | 1058370 | 1073469 | 51150 | 9606 | human | Homo sapiens
ENSG00000176022.1 | B3GALT6 | 1 | 1073703 | 1076476 | 126792 | 9606 | human | Homo sapiens
ENSG00000160087.5 | UBE2J2 | 1 | 1095352 | 1115292 | 118424 | 9606 | human | Homo sapiens
ENSG00000162572.4 | SCNN1D | 1 | 1123634 | 1133467 | 6339 | 9606 | human | Homo sapiens
ENSG00000162576.4 | MGC3047 | 1 | 1194130 | 1199973 | 84308 | 9606 | human | Homo sapiens
ENSG00000175756.3 | AKIP | 1 | 1215168 | 1216641 | 54998 | 9606 | human | Homo sapiens
ENSG00000131586.2 | MRPL20 | 1 | 1288703 | 1294063 | 55052 | 9606 | human | Homo sapiens
ENSG00000179403.2 | WARP | 1 | 1322311 | 1327547 | 64856 | 9606 | human | Homo sapiens
ENSG00000160072.5 | ATAD3B | 1 | 1358611 | 1396091 | 83858 | 9606 | human | Homo sapiens
ENSG00000008128.5 | CDC2L2 | 1 | 1582617 | 1604060 | 985 | 9606 | human | Homo sapiens
ENSG00000169911.4 | SLC35E2 | 1 | 1611978 | 1625728 | 9906 | 9606 | human | Homo sapiens
ENSG00000008130.3 | FLJ13052 | 1 | 1630975 | 1659805 | 65220 | 9606 | human | Homo sapiens
ENSG00000078369.3 | GNB1 | 1 | 1665027 | 1770792 | 2782 | 9606 | human | Homo sapiens
ENSMUSG00000041954.1 | TNFRSF18 | 4 | 154139702 | 154142251 | 21936 | 10090 | mouse | Mus musculus
ENSMUSG00000023286.1 | UBE2J2 | 4 | 154057210 | 1540722964 | 140499 | 10090 | mouse | Mus musculus
What entities or “objects” are defined here?
Is there any redundant data?
What happens if we want to add another species attribute (e.g. genus)?
Our tables
How do we know which organism a gene belongs to?
Do we have unique identifiers?
Can the organism have more than one gene?
Can the gene have more than one organism?
Our tables
The connector between the two tables in the diagram means each gene has
“one and only one” organism, but that each organism can
have more than one gene.
This is an example of an entity relationship diagram.
Our tables
• A primary key is a unique identifier for a record in a
table.
• A foreign key in one table refers to the primary key of
another.
• The gene table has a foreign key, Organism_ID. Each
record in the gene table will have an organism id to link it
to the correct species record in the organism table.
Database Design Caveat
• Sometimes design is “sacrificed” for
speed.
Data Types
• float
• integer
• tinyint
• varchar(size)
o stores strings
o size can be between 0 and 255, inclusive (0 to 65,535 as of MySQL 5.0.3)
• datetime
• + more
What data types should our attributes (columns) be?
Complete Design
Database name: ensmartdb
Gene
Column | Data Type
gene_id | integer
ensembl_gene_id | varchar(50)
organism_id | integer
name | varchar(35)
locuslink | varchar(10)
chromosome | tinyint
chromo_start | integer
chromo_end | integer
description | varchar(255)
Organism
Column | Data Type
organism_id | integer
taxonomy_id | integer
common_name | varchar(35)
species | varchar(35)
Connecting to MySQL from the Command Line
mysql -uusername -p
Example:
>mysql -uroot
To EXIT MySQL:
EXIT;
Basic SQL Commands
• SQL statements end with a semicolon
• View databases
SHOW DATABASES;
Importing a Database
• Creating a database
CREATE DATABASE trii;
• From the command line:
mysql -uusername -ppassword databasename < filename.sql
• Example:
mysql -uroot trii < trii.sql
Basic SQL Commands
• Use database databasename
USE databasename;
• Display all tables in a database
SHOW TABLES;
Create Table
CREATE TABLE organism (   -- organism is the table name
organism_id INTEGER NOT NULL AUTO_INCREMENT,   -- each definition starts with the column name
taxonomy_id INTEGER NOT NULL,
common_name VARCHAR(35) NOT NULL,
species VARCHAR(35) NOT NULL,
PRIMARY KEY (organism_id),
UNIQUE (taxonomy_id)
);
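The gene table could be created the same way. This is only a sketch: the column names and data types come from the complete design, but the NOT NULL choices, AUTO_INCREMENT on gene_id, and the FOREIGN KEY clause are assumptions rather than the workshop's actual schema. MySQL only enforces foreign keys on storage engines that support them, such as InnoDB.

```sql
-- Sketch of the gene table; all constraints here are assumptions.
CREATE TABLE gene (
  gene_id INTEGER NOT NULL AUTO_INCREMENT,
  ensembl_gene_id VARCHAR(50) NOT NULL,
  organism_id INTEGER NOT NULL,
  name VARCHAR(35),
  locuslink VARCHAR(10),
  chromosome TINYINT,
  chromo_start INTEGER,
  chromo_end INTEGER,
  description VARCHAR(255),
  PRIMARY KEY (gene_id),
  -- Links each gene to exactly one row in the organism table.
  FOREIGN KEY (organism_id) REFERENCES organism (organism_id)
);
```

The organism table has to exist before this statement runs, because the foreign key refers to it.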
View column details for a table
DESC tablename;
Selecting all data
SELECT * FROM tablename;
Select only the columns you need (it will be faster)
SELECT common_name, species
FROM organism;
Limiting your data
• Get the species name for a specific
organism (you have the id)
SELECT species
FROM organism
WHERE organism_id=1;
How do we select all the gene names for chromosome 1?
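One possible answer, assuming the gene table uses the name and chromosome columns from the complete design:

```sql
-- Return only the name column, restricted to genes on chromosome 1.
SELECT name
FROM gene
WHERE chromosome = 1;
```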
Insert
• Inserting a gene
INSERT INTO gene
(ensembl_gene_id, organism_id, name, chromosome, chromo_start, chromo_end)
VALUES ('MY_NEW_GENE', 1, 'MY GENE', 1, 12345, 67890);
• Get the id of the gene:
SELECT gene_id FROM gene WHERE name='MY GENE';
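If gene_id is an AUTO_INCREMENT column (an assumption; the slides only show AUTO_INCREMENT on organism_id), MySQL's LAST_INSERT_ID() function is another way to get the id, and it does not rely on the gene name being unique. It returns the id generated by the most recent INSERT on the current connection, so it should be run right after the insert.

```sql
-- Run in the same session, immediately after the INSERT INTO gene statement.
SELECT LAST_INSERT_ID();
```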
Delete/Update
• Deleting a gene
DELETE FROM gene WHERE gene_id=19;
• Updating a gene
UPDATE gene SET name='NEW NAME'
WHERE name='AKIP';
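DELETE and UPDATE change every row that matches the WHERE clause, and every row in the table if the WHERE clause is left out, so it can help to preview the matching rows first. A minimal sketch that reuses the WHERE conditions from the statements above:

```sql
-- Preview the row that the DELETE would remove.
SELECT * FROM gene WHERE gene_id = 19;

-- Preview the rows that the UPDATE would rename.
SELECT gene_id, name FROM gene WHERE name = 'AKIP';
```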
Table Joins
• Sometimes you want data from more than
one table. To do this we join the tables.
The result is a new (temporary) table.
Cross Join
SELECT gene.name, organism.species
FROM gene, organism;
Note: There are only two records in the gene table with the name “TNFRSF18”. One is a mouse gene and the other is a human gene. What do you think happened?
Cross Join (Continued)
Each row of the gene table was joined with
every row in the organism table.
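One way to check this is to compare row counts. COUNT(*) is an aggregate function that returns the number of rows. The cross join should return the number of gene rows multiplied by the number of organism rows; the actual numbers depend on what is in your tables.

```sql
SELECT COUNT(*) FROM gene;            -- rows in gene
SELECT COUNT(*) FROM organism;        -- rows in organism
SELECT COUNT(*) FROM gene, organism;  -- rows in the cross join: the product of the two counts above
```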
Cross Join (Continued)
• We want to use the organism id to match
a gene record with the correct organism
record so that we get:
Remember there are two gene records with the name “TNFRSF18”.
Cross Join (Continued)
SELECT gene.name, organism.species
FROM gene, organism
WHERE gene.organism_id=organism.organism_id;
Notice that we have 18 rows and that there are 18 rows in the gene table.
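The same join can also be written with the explicit INNER JOIN ... ON syntax, which keeps the matching condition next to the tables being joined. This is an equivalent alternative, not the form used in the slides.

```sql
-- Match each gene with its organism through the shared organism_id.
SELECT gene.name, organism.species
FROM gene
INNER JOIN organism
  ON gene.organism_id = organism.organism_id;
```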
Class #2: Introduction to MySQL (Continued)
• Table Joins
• Where clause (continued)
• Aggregate Functions
• Backing up your database
• Indexes (if time)
• Importing large datasets (if time)