Introduction to Bioinformatics

Human Genome Project
Goals
• Identify the approximately 40,000 genes in human DNA
• Determine the sequences of the 3 billion bases that make up human DNA
• Store this information in databases
• Develop tools for data analysis
• Address the ethical, legal, and social issues that arise from genome research
Genome Health Implications
• A New Disease Encyclopedia
• New Genetic Fingerprints
• New Diagnostics
• New Treatments

Bioinformatics
[Figure labels: IT, BT, Biocomputing]

What is Bioinformatics?
• Bio: molecular biology
• Informatics: computer science
• Bioinformatics: solving problems arising from biology using methodology from computer science

Basics in Molecular Biology

Chromosomes
• DNA in a human cell: 2 m
• DNA in a human body: 2 × 10^11 km
• Earth-to-Sun distance: 1.5 × 10^8 km

DNA (Deoxyribonucleic Acid)
• Composed of nucleotides
• Nucleotide = sugar + phosphate + nitrogenous base
• Base pairs: Adenine – Thymine, Guanine – Cytosine
• Double-helix structure

DNA Base-pairs

DNA
AACCTGCGGAAGGATCATTACCGAGTGCGG
GTCCTTTGGGCCCAACCTCCCATCCGTGTCT
ATTGTACCCGTTGCTTCGGCGGGCCCGCCG
CTTGTCGGCCGCCGGGGGGGCGCCTCTGC
CCCCCGGGCCCGTGCCCGCCGGAGACCCC
AACACGAACACTGTCTGAAAGCGTGCAGTCT
GAGTTGATTGAATGCAATCAGTTAAAACTTTC
AACAATGGATCTCTTGGTTCCGGCATGCAAT
CAGTCCCGTTGCTTCGGCACTGTCTGAAAGC
GCCTTTGGGCCCAACCTCCCATCCGTGTCTA
TTGTACCCGTTGCTTCGGCGGGCCCGCCGC
TTGTCGGCCGCCGGGGGGGCGCCGTTGCTT
CGGCGGGCCCGCCGCTTGTCGGCCGCCGG
GGCTATTGTACCCGTTGCTTCGGATCTCTTG
GGGATCTCTTGGTTCCGGCATGCAATCAGTC
CCGTTGCTTCGGCACTGTCTGAAAGCGCCTT
TGGGCCCAACCTCCCACCGTTGCTTCGGCG
GGCCCGCCGCTTGTCGGCCGCCGGGGGGG
CGGCCGCCGGGGGCACTGTCTGAAAGCTCG
GCCGCC

Some Facts
• DNA differs between humans by 0.2% (1 in 500 bases).
• Human DNA is 98% identical to that of chimpanzees.
• 97% of DNA in the human genome has no known function.
• There are 3.2 × 10^9 letters in the DNA code in every cell in your body.
• There are 10^14 cells in the body.
• 12,000 letters of DNA were decoded by the Human Genome Project every second.
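The pairing rule from the DNA slides above (adenine with thymine, guanine with cytosine) means that one strand of the double helix fully determines the other. The following is a minimal Python sketch of that idea, not part of the original slides; the names `COMPLEMENT`, `complement`, and `reverse_complement` are hypothetical, and the input is assumed to contain only the letters A, C, G, and T.

```python
# Watson-Crick base pairing as stated on the DNA slide: A-T and G-C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the opposite strand read in its own 5'-to-3' direction."""
    return complement(seq)[::-1]


# First line of the sequence shown on the "DNA" slide.
fragment = "AACCTGCGGAAGGATCATTACCGAGTGCGG"
print(complement(fragment))
print(reverse_complement(fragment))
```

Applying `reverse_complement` twice returns the original string, which is a quick sanity check that the pairing table is symmetric.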
Gene and Genome
Gene
• Fundamental unit of heredity
• Contains the information needed to synthesize a protein
• Part of the genome
Genome
• The entire DNA of an organism

Numbers of Genes
• Humans: 25,000 - 40,000
• C. elegans (worm): 19,000
• S. cerevisiae (yeast): 6,000
• Tuberculosis microbe: 4,000

RNA (Ribonucleic Acid)
• Bases: A, C, G, U (Uracil)
• mRNA: transcribed from a gene in DNA; carries the information to the ribosome, the cellular machinery that synthesizes proteins
• tRNA: acts as the adaptor between mRNA and amino acids while the ribosome assembles a protein

Molecular Biology: Flow of Information (Central Dogma)
DNA ("gene") → RNA → Protein → Folded Protein

[Figure: gene structure and expression. DNA (gene): control statement, TATA box, start, termination (stop), control statement. Transcription (RNA polymerase) produces mRNA with a 5' UTR, a ribosome binding site, and a 3' UTR. Translation (ribosome) produces the protein.]

Codon
• A tRNA binds to three nucleotides (one codon).
• There are 64 possible codon combinations.
• Codons specify the 20 amino acids and the stop signals.
• Each codon specifies one amino acid, and the amino acids join to form a protein.

Genetic Code: 3 bases = 1 amino acid

First position   Second position                 Third position
(5' end)         T       C       A       G       (3' end)
T                Phe     Ser     Tyr     Cys     T
                 Phe     Ser     Tyr     Cys     C
                 Leu     Ser     STOP    STOP    A
                 Leu     Ser     STOP    Trp     G
C                Leu     Pro     His     Arg     T
                 Leu     Pro     His     Arg     C
                 Leu     Pro     Gln     Arg     A
                 Leu     Pro     Gln     Arg     G
A                Ile     Thr     Asn     Ser     T
                 Ile     Thr     Asn     Ser     C
                 Ile     Thr     Lys     Arg     A
                 Met     Thr     Lys     Arg     G
G                Val     Ala     Asp     Gly     T
                 Val     Ala     Asp     Gly     C
                 Val     Ala     Glu     Gly     A
                 Val     Ala     Glu     Gly     G

Protein Structure

Human Genetic Variations (Single Nucleotide Polymorphisms)
• SNPs: "genetic individuality"
• About 1 in 1,000 bases varies between two humans
• Make us more or less susceptible to diseases
• May influence the effect of drug treatments
Example (Y stands for C or T):
TTTGCTCCGTTTTCA
TTTGCTCYGTTTTCA
TTTGCTCTGTTTTCA

SNP (Single Nucleotide Polymorphism)
• Finding single nucleotide changes at specific regions of genes
• Diagnosis of hereditary diseases
• Personalized drugs
• Finding more effective drugs and treatments

Human Individuality

Flood of Data! (SWISS-PROT)
[Figure: number of sequences (× 1000, axis from 0 to 80) by year of release, 1988 to 1996]

How Can We Analyze the Flood of Data?
Data: don't just store it, analyze it!
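As a first small example of analyzing a sequence rather than just storing it, the genetic code table above can be applied programmatically. This is a minimal Python sketch, not part of the original slides. It assumes the standard genetic code written in the DNA alphabet (T instead of U), reads the coding strand from its first base, and stops at the first stop codon. The names `CODON_TABLE` and `translate` are hypothetical, and the trailing `TAA` in the example input is added purely for illustration.

```python
from itertools import product

BASES = "TCAG"  # same row and column order as the genetic code table

# One row per first base (T, C, A, G); within a row, four groups for the
# second base (T, C, A, G), each listing the third base in order T, C, A, G.
AMINO_ACIDS = """
Phe Phe Leu Leu  Ser Ser Ser Ser  Tyr Tyr STOP STOP  Cys Cys STOP Trp
Leu Leu Leu Leu  Pro Pro Pro Pro  His His Gln Gln    Arg Arg Arg Arg
Ile Ile Ile Met  Thr Thr Thr Thr  Asn Asn Lys Lys    Ser Ser Arg Arg
Val Val Val Val  Ala Ala Ala Ala  Asp Asp Glu Glu    Gly Gly Gly Gly
""".split()

# product() varies the third base fastest, matching the listing above.
CODON_TABLE = {
    first + second + third: amino_acid
    for (first, second, third), amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> list[str]:
    """Translate a coding-strand DNA string codon by codon until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


# ATG GAT CTC TTG appears in the sequence slide; TAA is appended for illustration.
print(translate("ATGGATCTCTTGTAA"))  # ['Met', 'Asp', 'Leu', 'Leu']
```

A real gene finder would also search for a start codon and consider all reading frames on both strands; this sketch deliberately skips both.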
By comparing sequences, one can find out about things like:
• Ancestors of organisms
• Phylogenetic trees
• Protein structures
• Protein function

Bioinformatics Is About:
• Elicitation of DNA sequences from genetic material
• Sequence annotation (e.g. with information from experiments)
• Understanding the control of gene expression (i.e. under what circumstances genes are transcribed from DNA and their proteins produced)
• The relationship between the amino acid sequence of proteins and their structure

Aim of Research in Bioinformatics
Understand the functioning of living things in order to "improve the quality of life":
• Drug design
• Identification of genetic risk factors
• Gene therapy
• Genetic modification of food crops and animals, etc.

Extension of Bioinformatics Concept
• Genomics
• Functional genomics
• Structural genomics
• Proteomics: large-scale analysis of the proteins of an organism
• Pharmacogenomics: developing new drugs that will target a particular disease
• Microarray: DNA chip, protein chip
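Returning to the idea of comparing sequences, the simplest comparison is a position-by-position check of two sequences of equal length. This is a minimal Python sketch, not part of the original slides. It assumes the two sequences are already aligned and contain no gaps; real comparisons usually need an alignment step first. The function names `differences` and `percent_identity` are hypothetical. The inputs are the two variants from the SNP slide.

```python
def differences(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """List (0-based position, base in seq_a, base in seq_b) for every mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length (pre-aligned, no gaps)")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which the two sequences carry the same base."""
    mismatches = len(differences(seq_a, seq_b))
    return 100.0 * (len(seq_a) - mismatches) / len(seq_a)


# The two variants shown on the SNP slide.
variant_1 = "TTTGCTCCGTTTTCA"
variant_2 = "TTTGCTCTGTTTTCA"

print(differences(variant_1, variant_2))   # [(7, 'C', 'T')]
print(round(percent_identity(variant_1, variant_2), 2))
```

The single mismatch reported is the polymorphic site written as Y on the SNP slide; the other 14 of the 15 positions are identical.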