Genes
Outline
Genes: definitions
Molecular genetics - methodology
Genome Content
Molecular structure of mRNA-coding genes
Genetics
Gene regulation
 Genetics
 Molecular biology
 Arrays
 Issues
 Genetic and misexpression approaches






Gene Definitions
 Gene
 Molecular definition: stretch of DNA that encodes:
 Functional RNAs - tRNA, rRNA
 Functional proteins - mRNA
 All sequences necessary for proper function (genetic) – includes regulatory elements and transcription unit
 Generally excludes other types of genomic sequences
 Centromeres, telomeres, origins of DNA replication, transposons
 Genetic definition: element required for proper organismal function
Molecular Genetics
 Genetics [mutant phenotype]
 Molecular Biology [gene: sequence, expression-arrays]
 Biochemistry [activities, interactions]
 Cell biology [structure, dynamics]
Genomic Content
 Calf thymus DNA sheared to a size of ~300 bp, denatured, and reannealed
 3 classes:
 Highly repetitive – 10% of DNA – anneals very rapidly
 Middle repetitive – 30% of DNA – C₀t½ = 0.04
 Non-repetitive (unique) – 60% of DNA – C₀t½ = 4000
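The reannealing classes above can be sketched with the standard second-order reassociation curve, in which the fraction of a class still single-stranded is 1 / (1 + C₀t / C₀t½). This is a minimal sketch, not taken from the slides: the function names are hypothetical, the C₀t values are assumed to be in the same (unstated) units as the slide, and the highly repetitive 10% is assumed to have reannealed already because no C₀t½ is given for it.

```python
# Ideal second-order reassociation: fraction still single-stranded at a given C0t.
def fraction_single_stranded(cot, cot_half):
    return 1.0 / (1.0 + cot / cot_half)

# (class name, fraction of genome, C0t1/2) taken from the slide values above.
# The highly repetitive 10% is left out: assumed fully reannealed (no C0t1/2 given).
CLASSES = [
    ("middle repetitive", 0.30, 0.04),
    ("non-repetitive", 0.60, 4000.0),
]

def remaining_single_stranded(cot):
    """Fraction of total DNA still single-stranded, summed over the modeled classes."""
    return sum(frac * fraction_single_stranded(cot, half) for _, frac, half in CLASSES)

if __name__ == "__main__":
    for cot in (0.0004, 0.04, 4.0, 4000.0, 400000.0):
        print(f"C0t = {cot:>10}: {remaining_single_stranded(cot):.3f} single-stranded")
```

By construction, each class is half reannealed when C₀t equals its own C₀t½, which is why the two classes separate so cleanly: their half-points differ by five orders of magnitude.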
Highly Repetitive Simple Sequence DNA
 Clusters of tandemly-linked 5-10 bp repeats
 Can have > 10⁶ copies/genome
 Not transcribed
 Drosophila virilis satellite DNAs: > 95% of each satellite consists of a predominant sequence
Intermediately Repetitive DNA – Mobile Elements
[Figure: diagram with segments labeled "Unique DNA" and "Repeat"]
Repetitive elements interspersed among unique DNA
Most are transposons – mobile DNA
Many are no longer able to transpose
Dispersed throughout the genome
Different classes
Transpose as DNA or RNA intermediates
Unique DNA-Coding Sequence Genes
 Slow kinetic class corresponds mainly to protein-coding genes
 Average gene size (transcribed region only) by organism
 E. coli: 1.2 kb
 Yeast: 1.7 kb
 Drosophila: 11.3 kb
 Human: 27.0 kb
 As complexity increases, so does gene size
Overview of Gene Expression-1
[Figure: regulatory region and transcription unit on DNA (ACGT); transcription to RNA (ACGU) in the nucleus; transport to cytoplasm]
Overview of Gene Expression-2
[Figure: protein example, myoglobin; aa1 = methionine]
DNA and Clones
 Genomic or chromosomal DNA – genomic clones (exons, introns, spacer, etc.)
 Transcription unit – entire region of gene transcribed (exons + introns)
 mRNA – cDNA clones (exonic sequences)
 ESTs – expressed sequence tags
 Oligonucleotides – small stretches of DNA (~20-50 nt)
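The relationship between a transcription unit (exons + introns) and a cDNA clone (exonic sequence only) can be illustrated with a toy example. Everything here is hypothetical: the sequences, the exon coordinates, and the function names are made up for illustration, and the cDNA is written on the coding strand for readability.

```python
# Toy transcription unit: two exons separated by one intron (made-up sequences).
EXON1 = "ATGGCC"
INTRON = "GTAAGTTTCAG"   # begins with GT and ends with AG, like a typical intron
EXON2 = "AAATGA"
GENOMIC = EXON1 + INTRON + EXON2

# Exon positions as half-open [start, end) intervals on the transcription unit.
EXONS = [
    (0, len(EXON1)),
    (len(EXON1) + len(INTRON), len(GENOMIC)),
]

def transcribe(dna):
    """Primary transcript: same sequence as the coding strand, with U in place of T."""
    return dna.replace("T", "U")

def splice(pre_mrna, exons):
    """Mature mRNA: exons joined in order, introns removed."""
    return "".join(pre_mrna[start:end] for start, end in exons)

def to_cdna(mrna):
    """cDNA sequence (shown on the coding strand): mRNA with T in place of U."""
    return mrna.replace("U", "T")

if __name__ == "__main__":
    pre_mrna = transcribe(GENOMIC)
    mrna = splice(pre_mrna, EXONS)
    print("genomic:", GENOMIC)
    print("mRNA:   ", mrna)
    print("cDNA:   ", to_cdna(mrna))
```

The genomic clone is longer than the cDNA by exactly the intron length, which is the practical difference between the two clone types listed above.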
Human Genome Project: Gene Number
 Size 3,200 Mb
 Predicted gene number
 Celera – 39,114
 Public consortium – 29,691
 Refseq (known genes) – 11,015
 Non-identity
 ~64% novel genes don’t overlap
 > 80% novel genes expressed
 Indicates they are real
 Estimate ~50,000 genes
 Estimate ~ 64 kb/gene
 Transcribed region = 27 kb
 Spacer DNA = 37 kb
 Repeats + control elements
 Human – large number of transcripts/gene exist because of alternative splicing
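The per-gene estimates above follow from simple division, which can be checked with a short sketch. The constant names are hypothetical; the numbers are the ones given on the slide.

```python
# Figures from the slide; the variable names are illustrative only.
GENOME_SIZE_MB = 3200        # human genome size in megabases
ESTIMATED_GENES = 50_000     # estimated gene number
TRANSCRIBED_KB = 27          # average transcribed region per gene

# Average genomic space per gene, in kilobases (1 Mb = 1000 kb).
kb_per_gene = GENOME_SIZE_MB * 1000 / ESTIMATED_GENES

# Whatever is not transcribed is counted as spacer (repeats + control elements).
spacer_kb = kb_per_gene - TRANSCRIBED_KB

if __name__ == "__main__":
    print(f"{kb_per_gene:.0f} kb per gene, of which {spacer_kb:.0f} kb is spacer")
```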
Completed Genomic Sequencing Projects
 Human – disease genes
 Drosophila – model system for animal development and gene control
 Strength - genetics
 Nematode - model system for development and behavior
 Strength - genetics
 Fly and human are more closely related than worm and human
 Arabidopsis – weed: model plant genetic system
 Crop plants – rice, maize
 Yeast – typical eukaryotic cell
 E. coli
 Many pathogenic bacteria - disease
Genome Projects in Progress
 Multiple Humans – SNPs : disease genes and predispositions
 Mouse – model system to study human/mammalian gene function
 Strength – knockout mutants
 Zebrafish - model vertebrate genetic system
 Strength – large-scale genetic screens
 Crop plants – poplar, apple, tomato + pests
 Additional Drosophila species
 Identify gene control regions
Drosophila
 Drosophila genome = 180 Mb
 Sequenced 120 Mb euchromatic region
 60 Mb heterochromatic region unsequenced (few genes)
 Annotation – 13,601 predicted genes
 Genie – predicts ORFs/exons
 Compare to Expressed Sequence Tags (ESTs-cDNAs)
 Blast searches – sequence identity to known genes
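To give a feel for what ORF prediction involves, here is a deliberately naive forward-strand ORF scan. It is not how Genie works; it is a minimal sketch under stated assumptions (ATG start, standard stop codons, no introns, forward strand only), and the function name and `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return half-open (start, end) intervals of ATG...stop ORFs on the forward strand.

    min_codons counts every codon in the ORF, including the start and stop codons.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)

if __name__ == "__main__":
    seq = "CCATGAAATGGTAACC"   # made-up sequence for illustration
    for start, end in find_orfs(seq):
        print(start, end, seq[start:end])
```

A scan like this treats every intron as a potential source of stop codons, so a multi-exon gene can be split into several short predictions. That is one reason computed predictions are compared against ESTs and known genes.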
Complications of Gene Prediction by Computer: Cranky Example
 RT-PCR of embryonic RNA
[Figure: 33 kb genomic region showing exons 1-9, ESTs LP05454-5' and LP05454-3', and Genie predictions CG14554, CG12561 (1-3), CG14552 (1-4), CG14553 (1-3)]
Drosophila Gene Functions
 14,113 predicted transcripts with different coding sequences
Biochemical function
 2,081 Transcription factors
 2,422 Enzymes
 665 Transporters
 622 Signal transduction
 303 Structural proteins
 216 Cell adhesion
 7,576 Unknown
Process
 2,274 Metabolism
 530 Cell communication
 486 Development
 201 Physiology
 118 Sensation & behavior
 8,884 Unknown
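The counts above are easier to compare as shares of the 14,113 predicted transcripts. The sketch below uses only the slide's numbers; the dictionary and function names are hypothetical. Note that the categories listed under each heading do not add up to 14,113, so the slide presumably shows a selection of categories rather than a complete partition.

```python
TOTAL_TRANSCRIPTS = 14_113

BIOCHEMICAL_FUNCTION = {
    "Transcription factors": 2081,
    "Enzymes": 2422,
    "Transporters": 665,
    "Signal transduction": 622,
    "Structural proteins": 303,
    "Cell adhesion": 216,
    "Unknown": 7576,
}

PROCESS = {
    "Metabolism": 2274,
    "Cell communication": 530,
    "Development": 486,
    "Physiology": 201,
    "Sensation & behavior": 118,
    "Unknown": 8884,
}

def percent_of_total(count, total=TOTAL_TRANSCRIPTS):
    """Share of all predicted transcripts, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)

if __name__ == "__main__":
    for title, table in (("Biochemical function", BIOCHEMICAL_FUNCTION), ("Process", PROCESS)):
        print(title, "(listed categories sum to", sum(table.values()), ")")
        for name, count in table.items():
            print(f"  {name}: {count} ({percent_of_total(count)}%)")
```

On these figures, more than half of the predicted transcripts have no assigned biochemical function, and an even larger share have no assigned process.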