What are genomes and how are they studied

BB30055: genes and genomes
MV Hejmadi (bssmvh), 2006-07
Genomes and genome projects – Lecture 4: Insights from HGP
What makes us human?
Major insights from the HGP on genome organisation:
1) Genes: Genes vary widely in their size, content and location
• More genes: Twice as many as Drosophila / C. elegans
• Uneven gene distribution: Gene-rich and gene-poor regions
• More paralogs: some gene families have expanded the number of paralogs, e.g. olfactory genes
• More alternative transcripts: Increased RNA splice variants, thereby expanding the number of proteins 5-fold
2) Proteome: the human proteome is more complex than that of invertebrates
Domain arrangements in humans:
• Largest total number of domains is 130
• Largest number of domain types per protein is 9
• Mostly identical arrangements of domains: no huge difference in domain number in humans, but the frequency of domain sharing is very high in human proteins (especially structural proteins and proteins involved in signal transduction and immune function)
• Only 3 cases where a combination of 3 domain types is shared by human and yeast proteins
3) Single nucleotide polymorphism (SNP) identification: Sites that result from point mutations in individual base pairs
• More than 1.4 million SNPs identified (~1 in every 1.9 kb on average)
• ~60,000 SNPs lie within exons and untranslated regions (85% of exons lie within 5 kb of a SNP)
• May or may not alter the protein encoded by the ORF (non-synonymous or synonymous)
• Variable densities over regions and chromosomes, e.g. the HLA region has a high SNP density, reflecting maintenance of diverse haplotypes over millions of years
HAPLOTYPE: A haplotype is a set of single nucleotide polymorphisms (SNPs) on a single chromatid that are statistically associated. Haplotypes are generally shared between populations but their frequency can vary.
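The distinction between synonymous and non-synonymous SNPs can be illustrated with a small Python sketch. It is not part of the lecture material: the function name `classify_coding_snp` and its arguments are hypothetical, and it assumes the standard nuclear genetic code and a codon given on the coding strand.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(codon, position, alt_base):
    """Classify a single-base substitution inside one codon.

    codon: reference codon (DNA, coding strand)
    position: 0, 1 or 2 (index of the changed base within the codon)
    alt_base: the variant base
    """
    codon = codon.upper()
    mutated = codon[:position] + alt_base.upper() + codon[position + 1:]
    # Compare the amino acid (or stop, '*') encoded before and after the change.
    if CODON_TABLE[codon] == CODON_TABLE[mutated]:
        return "synonymous"
    return "non-synonymous"


# GAA and GAG both encode glutamate; GTG encodes valine.
print(classify_coding_snp("GAA", 2, "G"))
print(classify_coding_snp("GAG", 1, "T"))
```

A change in the third codon position is often, but not always, synonymous, which is why the codon table rather than the position alone decides the outcome.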
4) Distribution of GC content
Genome-wide average of 41%. Huge regional variations exist, e.g. the distal 48 Mb of chromosome 1p has 47% GC but chromosome 13 has only 36%. This is consistent with cytogenetic staining of G-bands (Giemsa): dark G-bands have low GC content (37%); light G-bands have high GC content (45%).
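Regional variation in GC content of this kind is usually measured by computing the GC fraction in windows along a sequence. The following is a minimal Python sketch, not taken from the lecture; the function names and the choice of non-overlapping windows are assumptions made for illustration.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Sequences with no unambiguous bases (e.g. all N) have no defined GC fraction.
    return gc / total if total else float("nan")


def windowed_gc(seq, window):
    """GC fraction of consecutive non-overlapping windows.

    A shorter trailing window is included if len(seq) is not a multiple of window.
    """
    return [gc_fraction(seq[i:i + window]) for i in range(0, len(seq), window)]


# Toy sequence: a GC-rich stretch followed by an AT-rich stretch.
print(windowed_gc("GGCCGCGCAATTATAT", 8))
```

On real chromosomes the window would be far larger (tens of kilobases or more) so that the profile can be compared with cytogenetic bands.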
5) CpG islands (~28,890 in number)
• CpG dinucleotides are greatly under-represented in the human genome
• CpG islands show no methylation
• Variable density, e.g. Y has 2.9/Mb but chromosomes 16, 17 & 22 have 19-22/Mb (average is 10.5/Mb)
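Under-representation of CpG is commonly quantified as an observed/expected ratio: the number of CpG dinucleotides seen, divided by the number expected from the C and G counts alone. The sketch below is an illustration rather than the method used in the lecture; the function name `cpg_obs_exp` is hypothetical and the formula assumed is (number of CpG × sequence length) / (number of C × number of G).

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G).

    Values well below 1 indicate CpG depletion; CpG islands stand out
    as regions where the ratio is comparatively high.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # 'CG' cannot overlap itself, so count() is exact
    if c == 0 or g == 0:
        return 0.0
    return cpg * n / (c * g)


print(cpg_obs_exp("CGCGCGCG"))  # CpG-rich toy sequence
print(cpg_obs_exp("CATGCATG"))  # contains C and G but no CpG
```

In practice the ratio is evaluated in sliding windows and combined with length and GC-content thresholds before a region is called a CpG island.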
6) Recombination rates
• Recombination rate increases with decreasing chromosome arm length
• Recombination rate is suppressed near the centromeres and increases towards the distal 20-35 Mb
7) Repeat content
a) Age distribution
• Most interspersed repeats predate the eutherian radiation
• LINEs and SINEs have extremely long lives
• 2 major peaks of transposon activity
• No evidence of DNA transposon activity in the past 50 Myr
• LTR retroposons are teetering on the brink of extinction
• Overall decline in interspersed repeat (IR) activity in the hominid lineage over the past 35-40 Myr, compared to the mouse genome
b) Comparison with other genomes: Compared to fruit fly, C. elegans and plant genomes, the human genome shows
• Higher density of transposable elements in the euchromatic portion of the genome
• Higher abundance of ancient transposons
• 60% of IR made up of LINE1 and Alu repeats, whereas DNA transposons represent only 6%
c) Variation in distribution of repeats: regions show either a high repeat density (e.g. chromosome Xp11, where a 525 kb region shows 89% repeat density) or a low repeat density (e.g. the HOX homeobox gene cluster, <2% repeats, indicative of regulatory elements which have low tolerance for insertions)
d) Distribution by GC content (high GC: gene rich; high AT: gene poor): LINEs are abundant in AT-rich regions but SINEs are less abundant in AT-rich regions. Alu repeats in particular are retained in actively transcribed GC-rich regions. E.g. chromosome 19 has 5% Alus compared to Y
e) Y chromosome: Unusually young genome (high tolerance to gaining insertions). Mutation rate is 2.1X higher in the male germline.
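Repeat densities such as those quoted in (c) are the fraction of bases in a region that are covered by annotated repeats. A minimal Python sketch of that calculation follows; it is an illustration under stated assumptions, not the lecture's method. The function name `repeat_fraction`, the half-open (start, end) coordinate convention and the toy coordinates are all hypothetical.

```python
def repeat_fraction(region_start, region_end, repeats):
    """Fraction of [region_start, region_end) covered by repeat annotations.

    repeats: iterable of (start, end) half-open intervals. Overlapping or
    nested annotations are merged so that no base is counted twice.
    """
    length = region_end - region_start
    if length <= 0:
        raise ValueError("region must have positive length")
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(repeats):
        # Clip each annotation to the region of interest.
        start, end = max(start, region_start), min(end, region_end)
        if start >= end:
            continue
        if cur_end is None or start > cur_end:
            # Close the previous merged block and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / length


# Toy example: two overlapping repeats and one running past the region end.
print(repeat_fraction(0, 100, [(0, 30), (20, 50), (90, 120)]))
```

Merging before summing matters because repeat annotations frequently overlap or nest, for example when a younger element has inserted into an older one.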
Working draft published – Feb 2001
Finished sequence – April 2003
Annotation of genes is ongoing
(refer: IHGSC. Finishing the euchromatic sequence of the human genome. Nature 21 October 2004)
References:
Chapter 9: Human Molecular Genetics 3 by Strachan and Read AND/OR
Chapter 10: Genetics from genes to genomes by Hartwell et al (2/e) pp 339-348
Nature (2001) 409: pp 879-891
Nature (1 Sept 2005) 437: pp 50-51 (chimp genome)