Bioinformatics Research - Purdue University :: Computer Science
Bioinformatics Research
Presented by: Amgad Madkour

Outline
- What is Bioinformatics? (Definition)
- Some important terms
- Major research areas in Bioinformatics

What is Bioinformatics?
- Bioinformatics, or computational biology, is the use of techniques from applied mathematics, informatics, statistics, and computer science to solve biological problems.
- The terms bioinformatics and computational biology are often used interchangeably, although the latter typically focuses on algorithm development and specific computational methods.

National Institutes of Health (NIH) definition of the field
- Bioinformatics: Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral, or health data, including those to acquire, store, organize, archive, analyze, or visualize such data.
- Computational Biology: The development and application of data-analytical and theoretical methods, mathematical modeling, and computational simulation techniques to the study of biological, behavioral, and social systems.

Some Important Terms
- Protein
- Amino Acid
- DNA
- RNA
- Chromosome
- Gene Expression
- Genetic Code

Protein
- A protein is a complex, high-molecular-weight organic compound that consists of amino acids joined by peptide bonds.
- Many proteins are enzymes or subunits of enzymes, catalyzing chemical reactions.
- Other proteins play structural or mechanical roles, such as those that form the struts and joints of the cytoskeleton, serving as biological scaffolds that provide mechanical integrity and support tissue signaling.
- Other protein functions include the immune response.

Protein (Function)
- Proteins are involved in practically every function performed by a cell, including regulation of cellular functions such as signal transduction and metabolism.
- Chemically speaking, much of life comes down to the function of proteins, although the information needed to make each unique protein resides in DNA.
- Proteins control almost all the molecular processes of the body.
- Without such proteins (enzymes in particular), the same reactions would require a different set of conditions, such as high temperature and pressure.

Protein (Types)
- Enzymes, which are responsible for catalyzing the thousands of chemical reactions of the living cell
- Keratin, elastin, and collagen, which are important types of structural, or support, proteins
- Hemoglobin and other gas transport proteins
- Ovalbumin, casein, and other nutrient molecules
- Antibodies, which are molecules of the immune system (see immunity)
- Protein hormones, which regulate metabolism
- Proteins that perform mechanical work, such as actin and myosin, the contractile muscle proteins

Amino Acids
- Amino acids are the basic structural building units of proteins.
- They form short polymer chains called peptides or polypeptides, which in turn form structures called proteins.
- The process of such formation is known as translation, which is part of protein synthesis.
- Other amino acids contained in proteins are usually formed by post-translational modification, that is, modification after translation in protein synthesis.
- Twenty amino acids are encoded by the standard genetic code and are called proteinogenic or standard amino acids.
- At least two others (selenocysteine and pyrrolysine) are also coded by DNA in a non-standard manner.

DNA
- Deoxyribonucleic acid (DNA) is a nucleic acid, usually in the form of a double helix, that contains the genetic instructions specifying the biological development of all cellular forms of life.
- Two paired bases form a "rung of the DNA ladder."
- A DNA nucleotide is made of a molecule of sugar, a molecule of phosphoric acid, and a molecule called a base.
- In DNA, the code letters are A, T, G, and C, which stand for the chemicals adenine, thymine, guanine, and cytosine, respectively.
- In base pairing, adenine always pairs with thymine, and guanine always pairs with cytosine.
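The base-pairing rule above (A with T, G with C) can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the names `PAIRS`, `complement`, and `reverse_complement` are chosen here for the example, and the reversal step assumes the standard fact that the two strands of the double helix run in opposite directions.

```python
# Watson-Crick pairing rules as stated on the slide: A<->T and G<->C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base partner of each letter in a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the opposite strand, read in its own direction.

    Assumption (not stated on the slide): the two strands are antiparallel,
    so the partner strand is conventionally written reversed.
    """
    return complement(strand)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Applying `complement` twice returns the original strand, which is a quick sanity check that the pairing table is symmetric.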
RNA
- Ribonucleic acid (RNA) is a nucleic acid polymer consisting of covalently bound nucleotides.
- RNA nucleotides contain ribose rings and uracil, unlike deoxyribonucleic acid (DNA), which contains deoxyribose and thymine.
- It is transcribed from DNA by enzymes called RNA polymerases and further processed by other enzymes.
- RNA serves several roles in protein synthesis: messenger RNA (mRNA) is the template for translating genes into proteins, transfer RNA (tRNA) carries amino acids to the ribosome, and ribosomal RNA (rRNA) forms part of the ribosome, which translates the transcript into protein.

Central Dogma (figure)

Chromosome
- The DNA which carries genetic information in cells is normally packaged in the form of one or more large macromolecules called chromosomes.
- If you were to stretch out all the DNA from one of your cells, it would be over 3 feet (1 meter) long from end to end! You can think of chromosomes as "DNA packages" that enable all this DNA to fit in the nucleus of each cell. Normally, we have 46 of these packages in each cell; we received 23 from our mother and 23 from our father.

Chromosome Phases (figure)

Gene Expression
- Gene expression, also called protein expression or often simply expression, is the process by which a gene's DNA sequence is converted into the structures and functions of a cell.
- Gene expression is a multi-step process that begins with transcription of DNA, which genes are made of, into messenger RNA. It is then followed by post-transcriptional modification and translation into a gene product, followed by folding, post-translational modification, and targeting.
- The amount of protein that a cell expresses depends on the tissue, the developmental stage of the organism, and the metabolic or physiologic state of the cell.

Genetic Code
- The genetic code is a set of rules that maps DNA sequences to proteins in the living cell, and is employed in the process of protein synthesis.
- Nearly all living things use the same genetic code, called the standard genetic code, although a few organisms use minor variations of the standard code.
- The gene sequence inscribed in DNA, and in RNA, is composed of tri-nucleotide units called codons, each coding for a single amino acid (or, for a few codons, a stop signal).
- There are 4^3 = 64 different codon combinations.
- For example, the RNA sequence UUUAAACCC contains the codons UUU, AAA, and CCC, each of which specifies one amino acid (phenylalanine, lysine, and proline). So, this RNA sequence represents a protein sequence three amino acids long.

Major research areas in Bioinformatics
- Sequence analysis
- Genome annotation
- Computational evolutionary biology
- Measuring biodiversity
- Gene expression analysis
- Regulation analysis
- Protein expression analysis
- Structure prediction
- Comparative genomics

Sequence Analysis
- Data is analyzed to determine genes that code for proteins, as well as regulatory sequences.
- A comparison of genes within a species or between different species can show similarities between protein functions, or relations between species.
- A variant of sequence alignment is used in the sequencing process itself.
- Automatic search for genes and regulatory sequences within a genome. Not all of the nucleotides within a genome are genes. Within the genome of higher organisms, large parts of the DNA do not serve any obvious purpose.

Genome Annotation (GO)
- Genome annotation is the process of attaching biological information to sequences. It consists of two main steps: first, identifying elements on the genome, and second, attaching biological information to these elements.
- Structural annotation consists of identification of genomic elements:
  - gene structure
  - coding regions
  - location of regulatory motifs
- Functional annotation consists of attaching biological information to genomic elements:
  - biochemical function
  - biological function
  - involved regulation and interactions
  - expression

Computational evolutionary biology
- Trace the evolution of a large number of organisms by measuring changes in their DNA, rather than through physical taxonomy or physiological observations alone.
- More recently, compare entire genomes, which permits the study of more complex evolutionary events, such as gene duplication, lateral gene transfer, and the prediction of bacterial speciation factors.
- Build complex computational models of populations to predict the outcome of the system over time.
- Track and share information on an increasingly large number of species and organisms.

Measuring Biodiversity
- Databases are used to collect the species names, descriptions, distributions, genetic information, status and size of populations, habitat needs, and how each organism interacts with other species.
- Specialized software programs are used to find, visualize, and analyze the information.
- Computer simulations model such things as population dynamics, or calculate the cumulative genetic health of a breeding pool (in agriculture) or endangered population (in conservation).
- One very exciting potential of this field is that entire DNA sequences, or genomes, of endangered species can be preserved, allowing the results of Nature's genetic experiment to be remembered in silico, and possibly reused in the future, even if that species is eventually lost.

Gene Expression Analysis
- The expression of many genes can be determined by measuring mRNA levels with multiple techniques, including microarrays, expressed cDNA sequence tags (ESTs), and so forth.
- All of these techniques are extremely noise-prone and/or subject to bias in the biological measurement, and a major research area in computational biology involves developing statistical tools to separate signal from noise in high-throughput (HT) gene expression studies.
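To make "separating signal from noise" concrete, here is a minimal sketch of one common approach, which the slides do not name: comparing replicate measurements of a single gene under two conditions using a log2 fold change and Welch's t statistic. The function names and the replicate values are hypothetical, invented purely for illustration; real pipelines add normalization and multiple-testing correction on top of this.

```python
import math
from statistics import mean, stdev


def log2_fold_change(control, treated):
    """Effect size: log2 of the ratio of mean expression levels."""
    return math.log2(mean(treated) / mean(control))


def welch_t(control, treated):
    """Welch's t statistic: difference in means scaled by replicate noise."""
    var_c = stdev(control) ** 2 / len(control)
    var_t = stdev(treated) ** 2 / len(treated)
    return (mean(treated) - mean(control)) / math.sqrt(var_c + var_t)


# Hypothetical replicate intensities for one gene (not real data).
control = [100.0, 110.0, 90.0]
treated = [200.0, 210.0, 190.0]

print(log2_fold_change(control, treated))  # 1.0, i.e. a two-fold increase
print(welch_t(control, treated))
```

A large fold change with a small t statistic suggests the difference may be replicate noise rather than signal, which is why both quantities are usually reported together.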
Regulation analysis
- Regulation is the complex orchestration of events starting with an extracellular signal and ultimately leading to the increase or decrease in the activity of one or more protein molecules. Bioinformatics techniques have been applied to explore various steps in this process.
- Promoter analysis involves the elucidation and study of sequence motifs in the genomic region surrounding the coding region of a gene.
- Expression data can be used to infer gene regulation.

Protein expression analysis
- Bioinformatics is very much involved in making sense of protein microarray and high-throughput (HT) mass spectrometry (MS) data.
- The former involves a number of the same problems involved in examining microarrays targeted at mRNA.
- The latter involves the problem of matching large amounts of mass data against predicted masses from protein sequence databases, and the complicated statistical analysis of samples where multiple, but incomplete, peptides from each protein are detected.

Structure prediction
- One of the key ideas in bioinformatics research is the notion of homology.
- In the genomic branch of bioinformatics, homology is used to predict the function of a gene: if the sequence of gene A, whose function is known, is homologous to the sequence of gene B, whose function is unknown, one could infer that B may share A's function.
- In the structural branch of bioinformatics, homology is used to determine which parts of the protein are important in structure formation and interaction with other proteins.
- In a technique called homology modelling, this information is used to predict the structure of a protein once the structure of a homologous protein is known. At the time of writing, this remains the only way to predict protein structures reliably.

Structure Prediction (Example)
- One example of this is the structural similarity between hemoglobin in humans and the hemoglobin in legumes (leghemoglobin). Both serve the same purpose of transporting oxygen in their respective organisms.
- Though these two proteins have completely different amino acid sequences, their protein structures are virtually identical, which reflects their near-identical purposes.

Comparative genomics
- The core of comparative genome analysis is the establishment of the correspondence between genes (orthology analysis) or other genomic features in different organisms.
- It is these intergenomic maps that make it possible to trace the evolutionary processes responsible for the divergence of two genomes.
- Many of these studies are based on homology detection and the computation of protein families.

Special Thanks
- I would like to thank my father, Prof. Dr. Magdy Madkour, for his support with the biological background of the subject.
- I would also like to thank my friend Ibrahim Imam for his support with the biological background, which helped me a great deal in understanding many of the concepts.

References
- Wikipedia: http://www.wikipedia.org
- EMC: http://www.emc.maricopa.edu/faculty/farabee/BIOBK/BioBookPROTSYn.html
- Fact Monster: http://www.factmonster.com/ce6/sci/A0860558.html
- Molecular Biology of the Gene (Watson, Hopkins, Roberts, Steitz, Weiner)
- University of Utah: http://gslc.genetics.utah.edu/units/disorders/karyotype/whatarechrom.cfm