Introduction to your genome
CSE291: Personal Genomics for Bioinformaticians
01/10/17
The personal genomics revolution
23andMe: >1 million customers ($200)
Genographic Project: >800,000 customers
($150)
Family Tree DNA: >800,000 in database ($99)
Genome sequencing is quickly becoming a commodity!
The power of commercial genome databases
Survey: are you a morning or a night person?
Survey: what color are the stripes?
Can perform a GWAS on hundreds of thousands of people in a matter of days!
Hu et al. Nature Communications 2016
I have a long standing interest in genetics…
Age: 20
Age: 1
Extra credit: which one is me?
Outline
• Why analyze your genome?
• Course overview
• History of analyzing
genomes
• Basic biology intro
• Basic human genetics intro
• Discuss problem set 1
Why analyze your genome?
Mutations have implications in human health
Example: Cystic Fibrosis
- Caused by mutations in the gene CFTR; the most
common mutation is ΔF508.
- Results in salty skin, poor growth,
accumulation of thick, sticky mucus, frequent
chest infections.
- Life expectancy: 37 years
- ~1 in 25 Europeans is a carrier
Pre-natal carrier testing of parents can now identify couples at risk
and inform reproductive options
[Diagram: two carrier parents (0 1 x 0 1); offspring genotypes 0 0: 25%, 0 1: 50%, 1 1: 25%]
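The 25% / 50% / 25% split for two carrier parents, combined with the ~1 in 25 carrier frequency quoted for cystic fibrosis, gives a rough population-level risk. The sketch below is only an illustration: it assumes the two parents are unrelated and mate at random, and the function and variable names are made up for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_distribution(parent1, parent2):
    """Enumerate the four equally likely allele combinations from two parents.

    Genotypes are tuples of alleles: 0 = normal copy, 1 = disease copy.
    Returns {number of disease copies in the child: probability}.
    """
    counts = Counter(a + b for a, b in product(parent1, parent2))
    return {copies: Fraction(n, 4) for copies, n in counts.items()}


carrier = (0, 1)
dist = offspring_distribution(carrier, carrier)
# dist[0] = 1/4 (0 0), dist[1] = 1/2 (0 1), dist[2] = 1/4 (1 1)

carrier_freq = Fraction(1, 25)        # ~1 in 25 Europeans, from the slide
p_both_carriers = carrier_freq ** 2   # assumption: unrelated partners
p_affected_birth = p_both_carriers * dist[2]

print(dist)
print(p_affected_birth)               # 1/2500
```

Under these assumptions, about 1 in 2,500 births would be affected, which is why carrier testing of both parents is informative.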
Our genomes contain a record of human history
Recent history: familial relationships (parents, siblings, cousins, etc.)
Ancient history: populations (human migration, ancient humans)
https://aliciarmartin.com/research/migration_map_revised-2/
Novembre, et al. 2008
Your genome is uniquely identifying
Your genome can help science!
Interpreting one genome requires tens of thousands of genomes
- Daniel MacArthur
e.g. the latest schizophrenia genome-wide association study used >100,000 control genomes
Course overview
Course objectives
• Gain basic bioinformatics skills needed to analyze a
personal genome using the UNIX command line
• Gain the ability to critically read and interpret basic
science and translational literature relevant to
personal genomics
• Demonstrate knowledge and understanding of the
social impacts of the personal genomics revolution
• Gain skills and experience necessary to carry out
original research related to personal genomics
Grading
• Participation 10%
• Attendance 10%
• Problem set 1 5%
• Problem set 2 10%
• Problem set 3 10%
• Problem set 4 10%
• Problem set 5 10%
• Project proposal 5%
• Final Project 30%
Analyzing your own genome
• You are welcome and encouraged to explore
your own genome (e.g. from 23andMe) through
the problem sets.
• If you want to do that, order ASAP, it takes
several weeks to get the data back.
• Your grade does not depend in any way on
whether you analyze your own genome.
• You do not need to tell me if you analyze your
own genome.
• We cannot offer to pay for the test, or provide
any counseling
A whirlwind history
of human genetics
Mendel establishes heredity as a principle (~1865)
Green peas (GG) x Yellow peas (YY)
F1 Generation: 100% Yellow
YG, YG, YG, YG
F2 Generation: 75% Yellow, 25% Green
YY, YG, GY, GG
Conclusions:
1. Inheritance is determined by “units” (now called genes)
2. An individual inherits one such unit from each parent for each trait
3. A trait may “skip” a generation
mid-1900s: DNA is the genetic material
• Griffith experiment (1928): showed bacteria
can transfer genetic information
• Avery-MacLeod-McCarty experiment
(1944): showed that DNA was key component
of Griffith’s experiment
• Hershey-Chase experiment (1952): used
radioactive labeling to show DNA, not protein,
transfers genetic information
• DNA structure identified (1953) by Watson,
Crick (using data from Rosalind Franklin)
First disease gene mapped (1983)
George Huntington’s paper (1872)
Huntington’s Disease
• Progressive neurodegenerative disease
• Loss of motor control, jerky movements
• Age of onset: typically 30-45 years old
• Caused by expansion of a CAG repeat,
encoding polyglutamine, in the gene HTT
Gusella et al. 1983
The human genome is sequenced (2001)
• $3 Billion public project
beginning in 1990
• In 1998, Craig Venter started
competing private project at
Celera
• “Draft” announced in 2000 and published in 2001.
We still do not have a
complete genome
sequence!
• >70% from a single male
donor from Buffalo, NY
(RP11). At least 4 individuals
included.
Toward the $1000 Genome
The personal genomics revolution
Hair color
Eye color
>1 million customers
$200 to genotype 1.5 million
genomic positions
Ancestry
Biology Intro
Bird’s eye view of the human genome
Nucleus
Cell
Autosomes
Sex chromosomes
http://missinglink.ucsf.edu/lm/genes_and_genomes/content.html
DNA (deoxyribonucleic acid) structure
Bases: Cytosine (C), Guanine (G), Adenine (A), Thymine (T)
Watson-Crick base pairing: C pairs with G, A pairs with T
Other components: phosphate, deoxyribose (sugar)
The two strands run antiparallel (5’ to 3’ and 3’ to 5’)
Forward strand: 5’-TGAC-3’
Reverse strand: 5’-GTCA-3’ (reverse complement)
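The forward/reverse strand relationship can be checked with a few lines of code. This is a minimal sketch using only the four unambiguous bases; the function name is illustrative, not taken from any course material.

```python
# Watson-Crick pairing: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    # Complement each base, reading the input from its 3' end backwards,
    # so the result is again written 5' to 3'.
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


print(reverse_complement("TGAC"))  # GTCA
```

Applying the function twice returns the original sequence, which is a handy sanity check when working with strand-aware data.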
The central dogma
DNA (gene) -> Transcription -> RNA (mRNA) -> Translation -> Protein
The genetic code
http://www.chemguide.co.uk/organicprops/aminoacids/dna4.html
The structure of a gene
DNA: TF, Promoter, Exon 1, Intron 1, Exon 2, Intron 2, Exon 3
Transcription
RNA: ACACUAUCGAUGCAGAUAAAGUUGAGUAGCUGUCUCGGUCGAGCGUACGUAUAAAUCACUAC
Splicing
mRNA (5’ UTR, coding sequence, 3’ UTR): ACACUAUCGAUGCAGAUAAAUAGCUGUCUCGCGUACGUAUAAAUCACUAC
Translation
Protein: M Q I N S C L A Y V *
Start codon (AUG=Methionine)
Stop codon (UGA, UAA, UAG)
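The translation step can be reproduced on the spliced mRNA from this slide. The sketch below builds the standard genetic code table and translates from the first AUG to the first stop codon. It is a simplification: it ignores real start-site selection rules, and it normalizes any stray T to U before translating.

```python
# Standard genetic code, codons ordered with U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first AUG up to (not including) the first stop codon."""
    mrna = mrna.upper().replace("T", "U")
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: UAA, UAG, or UGA
            break
        protein.append(amino_acid)
    return "".join(protein)


spliced = "ACACUAUCGAUGCAGAUAAAUAGCUGUCUCGCGUACGUAUAAAUCACUAC"
print(translate(spliced))  # MQINSCLAYV
```

The bases before the AUG and after the UAA are not translated; they correspond to the 5’ and 3’ UTRs.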
Organization of the human genome
~20,000 protein-coding genes in the human genome
http://book.bionumbers.org/how-many-genes-are-in-a-genome/
Cell division – mitosis (somatic)
DNA replication
Mitosis
Two diploid cells
Cell division – meiosis (germline)
DNA replication
Homologous
recombination
Meiosis I
Meiosis II
Four haploid cells
Recombination
https://www.reddit.com/r/askscience/comments/3hq4zl/does_crossover_occur_in_all_4_nonsister/
Human genetics intro
Mutations – the bread and butter of genetics!
SNP: ACGACTCGAGCG -> ACGACACGAGCG (μSNP: 1.20 × 10^-8 /loc/gen)
Short indel (1-20bp): ACGACTCGAGCG -> ACGAC-CGAGCG (μINDEL: 0.68 × 10^-9 /loc/gen)
Short tandem repeat: CAGCAG---CAGCAGCA -> CAGCAGCAGCAGCAGCA (μSTR: 10^-2 to 10^-5 /loc/gen)
Alu retrotransposition
Struct. Var / CNV (>20bp)
[Figure: # de novo mutations per generation for SNP, Indel, STR, Alu, and SV; axis 0 to 100]
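The per-locus rates translate into an expected number of new mutations per child once multiplied by genome size. This is a back-of-the-envelope sketch: the haploid genome size of about 3 × 10^9 bp is an assumption not stated on the slide, and the calculation treats the rate as uniform across the genome.

```python
HAPLOID_GENOME_BP = 3.0e9  # assumption: approximate size of one genome copy
MU_SNP = 1.20e-8           # per locus per generation, from the slide
MU_INDEL = 0.68e-9         # per locus per generation, from the slide


def expected_de_novo(rate_per_site, genome_bp=HAPLOID_GENOME_BP, copies=2):
    """Expected new mutations in one child: rate x sites x inherited copies."""
    return rate_per_site * genome_bp * copies


print(round(expected_de_novo(MU_SNP)))    # 72
print(round(expected_de_novo(MU_INDEL)))  # 4
```

So each person is expected to carry on the order of tens of de novo SNPs and a handful of de novo short indels.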
How do mutations affect proteins?
But also…
• Regulatory regions
• Large structural variations
• Alternative splicing
• Many others…
http://www.nbs.csudh.edu/chemistry/faculty/nsturm/CHEMXL153/DNAMutationRepair.htm
Intro to Mendelian genetics
Back to Mendel’s peas…
YG x YG
F2 Generation: 75% Yellow, 25% Green
Punnett square (rows: Parent 1 allele, columns: Parent 2 allele):
      Y    G
Y     YY   YG
G     GY   GG
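The Punnett square is just an enumeration of every pairing of one allele from each parent. A minimal sketch, assuming Y (yellow) is dominant over G (green) as in the pea example; the function names are illustrative.

```python
from collections import Counter
from itertools import product


def punnett(parent1, parent2):
    """List the four equally likely offspring genotypes, e.g. punnett("YG", "YG")."""
    return [a + b for a, b in product(parent1, parent2)]


def phenotype(genotype, dominant="Y"):
    """One copy of the dominant allele is enough to show the dominant trait."""
    return "Yellow" if dominant in genotype else "Green"


f1 = punnett("YY", "GG")  # every F1 plant is YG
f2 = punnett("YG", "YG")  # YY, YG, GY, GG
f2_fractions = {p: n / len(f2) for p, n in Counter(map(phenotype, f2)).items()}

print(f1)
print(f2)
print(f2_fractions)       # Yellow: 0.75, Green: 0.25
```

The green trait is absent in F1 and reappears in a quarter of F2, which is the "skipped generation" pattern.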
Modes of inheritance - dominant
[Pedigree: aa x Aa parents; children Aa, Aa, aa, aa]
Example – Marfan Syndrome
• Tall and slender build
• Long arms, legs, and fingers
• Heart murmurs, other cardiovascular defects
• Nearsightedness
Caused by loss of function mutations in FBN1
>=1 copies of dominant allele: affected
0 copies of dominant allele: unaffected
Unless de novo, at least one parent is affected
http://www.mayoclinic.org/diseases-conditions/marfan-syndrome/symptoms-causes/dxc-20195415
Modes of inheritance - recessive
[Pedigree: Aa x Aa parents; children AA, Aa, aA, aa]
Example – Cystic Fibrosis
• Caused by mutations in the gene CFTR;
the most common mutation is ΔF508 (in-frame
deletion).
• Results in salty skin, poor growth,
accumulation of thick, sticky mucus,
frequent chest infections.
• Life expectancy: 37 years
• ~1 in 25 Europeans is a carrier
Caused by loss of function mutations in CFTR
2 copies of recessive allele: affected
<=1 copies of recessive allele: unaffected
Often, both parents unaffected
https://hutchbio.wordpress.com/2012/11/07/cystic-fibrosis/
Modes of inheritance – X linked recessive
[Pedigree: XX’ x XY parents; children X’Y, XY, XX’, XX]
Example – Hemophilia A
• Blood doesn’t clot properly
• Heavy bleeding even from small cuts
• Bruise easily
• Some female carriers show symptoms
Caused by loss of function mutations in
clotting Factor VIII
Need at least one unaffected copy of X to be
unaffected
X’Y, X’X’ affected (X’X’ lethal for some disorders)
Typically affects only males
Heterozygous females are called “carriers”
http://reference.medscape.com/features/slideshow/hemophilia-a
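The three modes of inheritance above boil down to simple rules on allele counts. The sketch below encodes them in a simplified form, assuming full penetrance and ignoring de novo mutations and symptomatic carriers. Genotypes are written as tuples of alleles, following the slide notation ("A" dominant allele, "a" recessive allele, "X'" mutant X); the function name and mode labels are illustrative.

```python
def is_affected(genotype, mode):
    """Return True if a genotype is affected under the given inheritance mode.

    genotype: tuple of two alleles, e.g. ("A", "a") or ("X'", "Y").
    mode: "dominant", "recessive", or "x_linked_recessive".
    """
    alleles = tuple(genotype)
    if mode == "dominant":
        # >= 1 copy of the dominant disease allele "A"
        return alleles.count("A") >= 1
    if mode == "recessive":
        # 2 copies of the recessive disease allele "a"
        return alleles.count("a") == 2
    if mode == "x_linked_recessive":
        # affected unless at least one unaffected X copy is present
        return "X" not in alleles
    raise ValueError(f"unknown mode: {mode}")


print(is_affected(("A", "a"), "dominant"))             # True
print(is_affected(("A", "a"), "recessive"))            # False (carrier)
print(is_affected(("X'", "Y"), "x_linked_recessive"))  # True
print(is_affected(("X", "X'"), "x_linked_recessive"))  # False (carrier)
```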
Example recessive trait – red hair
https://blog.23andme.com/health-traits/no-im-not-irish/
Example recessive trait – blue eyes
IrisPlex: predicts eye color from 6 SNPs
All blue eyes have a single common ancestor with a regulatory change in HERC2
Walsh, et al. 2010
Sturm, et al. 2008
Beyond Mendelian – complex traits
Example: height
Fisher hypothesized that Mendelian inheritance could
explain continuous traits if many genes each
contribute additively to a phenotype.
Sir Ronald Fisher
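Fisher's argument can be illustrated with a small simulation. This is a toy sketch, not a model of height: it assumes every locus has the same allele frequency (0.5), every "+" allele adds exactly one unit, and there is no environmental effect. All names and parameters are hypothetical.

```python
import random
import statistics


def simulate_additive_trait(n_loci, n_people, allele_freq=0.5, seed=0):
    """Trait value = number of '+' alleles summed over n_loci diploid loci."""
    rng = random.Random(seed)
    traits = []
    for _ in range(n_people):
        # Two alleles per locus, each independently '+' with prob. allele_freq.
        score = sum(
            (rng.random() < allele_freq) + (rng.random() < allele_freq)
            for _ in range(n_loci)
        )
        traits.append(score)
    return traits


one_gene = simulate_additive_trait(1, 2000)      # only three classes: 0, 1, 2
many_genes = simulate_additive_trait(100, 2000)  # many classes, bell-shaped

print(sorted(set(one_gene)))
print(statistics.mean(many_genes), statistics.stdev(many_genes))
```

With one locus the trait falls into three discrete classes, as in Mendel's crosses. With 100 loci the values spread into a near-continuous distribution centered on 100.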
Example complex trait: schizophrenia
Heritability: ~80%
i.e. ~80% of the variation in SCZ risk is attributed to genetic differences (estimated from twin studies)
Schizophrenia Working Group of the Psychiatric Genomics Consortium
Problem set 1
SNP array data
• This is the type of data you’ll get from 23andMe and other companies
• As opposed to whole genome sequencing, which sequences the
entire genome, genotype arrays genotype a pre-determined set of
known polymorphic positions
• E.g. 23andMe genotypes ~1.5 million variants
• Probes for allele “A” and “B”
• By comparing intensities, can infer genotype (e.g. AA, AB, BB)
[Figure: probe intensity plot with three genotype clusters: AA, AB, BB]
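Inferring a genotype from the two probe intensities can be sketched as a threshold on the fraction of signal coming from the "B" probe. This is a deliberately simplified illustration: production genotype callers cluster intensities across many samples, and the thresholds and function name here are hypothetical.

```python
def call_genotype(intensity_a, intensity_b, low=0.2, high=0.8):
    """Call AA, AB, or BB from probe intensities using the B-signal fraction.

    low/high are hypothetical cutoffs chosen for illustration only.
    """
    total = intensity_a + intensity_b
    if total <= 0:
        return "NoCall"  # no usable signal from either probe
    b_fraction = intensity_b / total
    if b_fraction < low:
        return "AA"      # signal almost entirely from the A probe
    if b_fraction > high:
        return "BB"      # signal almost entirely from the B probe
    return "AB"          # comparable signal from both probes


print(call_genotype(1.0, 0.05))  # AA
print(call_genotype(0.5, 0.5))   # AB
print(call_genotype(0.02, 0.9))  # BB
```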
Getting started
https://gymreklab.github.io/teaching/personal_genomics/ps1_resources.html
Before you go:
• Sign up for an XSEDE account
• Get started on PS1