The Human Genome
Some interesting facts

Biological system overview
Genes have variability, which causes a phenotype
Genes need to be expressed at the right time in the right place: ~5k – 10k genes per tissue
Proteins and RNAs interact in pathways and networks: ~8 interactions pp
Genes encode proteins which may be processed or modified: 100k – 500k proteins

The human genome
Genome size: 3200 Mbp
24 chromosomes + mitochondrion
http://www.ensembl.org

Sequencing the genome
In 1953 James Watson and Francis Crick discovered the structure of DNA, the code of instructions for all life on earth
50 years later the human genome was sequenced by hierarchical shotgun sequencing

Sequencing the genome
The human genome was sequenced by:
    The International Human Genome Sequencing Consortium
    Celera Genomics
Technique: hierarchical shotgun sequencing
Draft sequences were released in early 2001, but ~10% of the euchromatin was missing and there were 150 000 gaps!
After finishing, the sequence was rereleased in 2004 with 341 gaps, covering 99% of the euchromatic genome

Sequencing time period
The first human genome took ~5 years and cost ~$3 billion
Now, a genome can be sequenced in a few weeks for ~$5,000
BUT: this doesn’t consider the cost and time for data analysis!
International Human Genome Sequencing Consortium 2001. Nature 409, 860 – 921.

Size of the genome
There are 100 trillion (100,000,000,000,000) cells in your body.
There are three billion (3,000,000,000) base pairs in the DNA code within each cell.
The genome requires more than 3 gigabytes of computer storage space
A full genome done by NGS costs $100 per genome per year to store
http://www.pbs.org/wgbh/nova/genome/facts.html

Interesting facts
If all the DNA in your body was put end to end, it would reach to the sun and back over 600 times (100 trillion times six feet/92 million miles).
If unwound and tied together, the strands of DNA in one cell would stretch almost six feet but would be only 50 trillionths of an inch wide.
It would take a person typing 60 words per minute, eight hours a day, around 50 years to type the human genome.
If all three billion letters in the human genome were stacked one millimeter apart, they would reach a height 7,000 times the height of the Empire State Building.
http://www.pbs.org/wgbh/nova/genome/facts.html

Some statistics
Only 1.5% of the genome is coding
Other non-protein-coding sequence is for other kinds of “genes” or “lost genes”
A proportion of our genome is not our own!
50% repeat regions, most of viral origin!
The single most common protein is the "recipe" for making Reverse Transcriptase
99.9% of our sequences are identical

Number of human genes
First estimates of between 20 000 and 150 000 genes
Seems to be between 20 000 and 30 000 genes
Expansion of the number of different protein molecules due to:
    (a) alternative splicing (30 to 50% increase);
    (b) post-translational modifications (5 to 10 fold increase)
There could be about 1 million different protein molecules in the human body

Gene numbers
[Figure: gene counts for several organisms. Values shown: 21000, 14000, 22000, 19000, 2000-5000, 6000 and 24000 genes. The organism labels are not recoverable.]

Latest genome build
Known protein-coding genes:   20,442
Novel protein-coding genes:   434
Pseudogenes:                  15,007
RNA genes:                    12,523
Gene exons:                   649,964
Gene transcripts:             181,744

Protein coding genes
Many of the genes are alternatively spliced
Human genes have short exons (50 codons) and long introns (10k)
Average gene length is 3000 bp, max is 2.4 million bp
We know the function of less than half of all the genes

Comparative genomics
Comparing the human genome to others:

Organism     Genome size (Mbp)   No. of genes
Human        3000                21,000
Mouse        2800                22,000
Fruit fly    180                 14,000
Worm         97                  19,000
Yeast        12                  6000

Evolution of humans

Genes in common with other organisms
About 75% of human genes have non-human homologues, ~70% match mouse proteins
International Human Genome Sequencing Consortium 2001. Nature 409, 860 – 921.
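
The comparative genomics figures above become easier to compare once they are expressed as gene density (genes per Mbp). The following is a minimal sketch of that arithmetic in Python. The genome sizes and gene counts are copied from the slide; the names GENOMES, genes_per_mbp and density_table are illustrative and not part of the original material.

```python
# Genome size (Mbp) and approximate gene count, as listed on the
# comparative genomics slide. The dictionary name is illustrative.
GENOMES = {
    "Human": (3000, 21_000),
    "Mouse": (2800, 22_000),
    "Fruit fly": (180, 14_000),
    "Worm": (97, 19_000),
    "Yeast": (12, 6_000),
}


def genes_per_mbp(size_mbp, gene_count):
    """Return the average number of genes per megabase pair."""
    return gene_count / size_mbp


def density_table(genomes):
    """Return (organism, genes per Mbp) pairs, densest genome first."""
    rows = [
        (name, genes_per_mbp(size, genes))
        for name, (size, genes) in genomes.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, density in density_table(GENOMES):
        print(f"{name:<10} {density:8.1f} genes per Mbp")
```

With these numbers the human and mouse genomes come out at roughly 7 to 8 genes per Mbp while yeast comes out at 500, which is consistent with the earlier point that only a small fraction of the human genome is coding.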

Functional composition
Humans have more multifunctional genes, and genes involved in cell-cell communication and signalling
International Human Genome Sequencing Consortium 2001. Nature 409, 860 – 921.

Human genome resources
Ensembl
UCSC Genome Browser
OMIM: human genes and inherited disorders
dbSNP: single nucleotide polymorphisms
Genetic Map at NCBI
Etc.
http://www.ncbi.nlm.nih.gov
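
As a closing sanity check, the size figures quoted earlier in the slides (three billion base pairs, 100 trillion cells, six feet of DNA per cell, 92 million miles to the sun, typing at 60 words per minute) can be reproduced with a few lines of arithmetic. This is a rough sketch, not part of the original slides. It assumes one byte per base for plain-text storage, five letters per typed word, and 5280 feet per mile; the function names are illustrative.

```python
# Figures taken from the slides.
BASE_PAIRS = 3_000_000_000        # base pairs per cell
CELLS = 100_000_000_000_000       # cells in the body
DNA_FEET_PER_CELL = 6             # unwound DNA per cell, in feet
SUN_DISTANCE_MILES = 92_000_000   # one-way distance to the sun

# Assumptions that are NOT in the slides.
FEET_PER_MILE = 5280
LETTERS_PER_WORD = 5              # common typing-speed convention


def storage_gigabytes(base_pairs, bits_per_base):
    """Storage needed for a sequence, in gigabytes (10**9 bytes)."""
    return base_pairs * bits_per_base / 8 / 1e9


def sun_round_trips(cells, feet_per_cell, sun_miles):
    """How many times the body's DNA would span sun-and-back."""
    total_miles = cells * feet_per_cell / FEET_PER_MILE
    return total_miles / (2 * sun_miles)


def typing_years(letters, words_per_minute, hours_per_day):
    """Years needed to type the sequence, working every day."""
    letters_per_day = words_per_minute * LETTERS_PER_WORD * 60 * hours_per_day
    return letters / letters_per_day / 365


if __name__ == "__main__":
    print("GB at 1 byte per base:", storage_gigabytes(BASE_PAIRS, 8))
    print("GB at 2 bits per base:", storage_gigabytes(BASE_PAIRS, 2))
    print("Sun round trips:", round(sun_round_trips(
        CELLS, DNA_FEET_PER_CELL, SUN_DISTANCE_MILES)))
    print("Typing years:", round(typing_years(BASE_PAIRS, 60, 8)))
```

Under these assumptions the results land close to the quoted figures: about 3 GB when each base is stored as one character, a little over 600 round trips to the sun, and between 50 and 60 years of typing. Packing each base into 2 bits would bring storage down to under 1 GB, so the 3 GB figure depends on the encoding.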