Medical Biochemistry and Molecular Biology Department

PRACTICAL GUIDE NOTES ON MOLECULAR BIOLOGY
PRACTICAL COURSE: MOLECULAR BIOLOGY TECHNIQUES

BIOINFORMATICS

ILO of the current topic:

By the end of this topic, the student will be able to gather, organize and appraise information, including the use of information technology where applicable.

What Is Bioinformatics?

Many definitions can apply; all concern the use of computers and software to store, analyze and interpret biological data.

Bioinformatics is the rapidly developing area of computer science devoted to collecting, organizing, and analyzing DNA and protein sequences.

Bioinformatics can be defined as the interface between biological and computational science; the field deals with the computational management of all kinds of biological information about genes and their products.

Bioinformatics is the unified discipline formed from the combination of biology, computer science, software engineering, mathematics and molecular biology.

Bioinformatics: "The mathematical, statistical and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information." (Frank Tekaia)

A Molecular Alphabet

Most large biological molecules are polymers: ordered chains of simple molecules called monomers. All monomers of a polymer belong to the same general class, but there are several types with distinct and well-defined characteristics. Many monomers can be joined to form a single, large macromolecule; the ordering of monomers in the macromolecule encodes information, just like the letters of an alphabet.

These alphabets are the nucleotide bases in the case of nucleic acids (A, G, C, T or U), or the amino acids in the case of proteins (single-letter abbreviations of amino acids such as A, G, V, L, P, etc.).
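The idea of a molecular alphabet can be made concrete with a few lines of Python. The following is a minimal sketch, not part of the course material: the function name `possible_alphabets` and the representation of each alphabet as a Python set are assumptions made for illustration. It reports which of the three alphabets (DNA, RNA, protein) could have spelled a given sequence.

```python
# Illustrative sketch only: names and data layout are assumptions.
DNA_ALPHABET = set("ACGT")
RNA_ALPHABET = set("ACGU")
PROTEIN_ALPHABET = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def possible_alphabets(sequence):
    """Return the name of every alphabet whose letters can spell the sequence."""
    letters = set(sequence.upper())
    candidates = {
        "DNA": DNA_ALPHABET,
        "RNA": RNA_ALPHABET,
        "protein": PROTEIN_ALPHABET,
    }
    return [name for name, alphabet in candidates.items() if letters <= alphabet]


print(possible_alphabets("AUGGUGCAUCUG"))  # ['RNA']
print(possible_alphabets("MVHLTPEEK"))     # ['protein']
print(possible_alphabets("ACGT"))          # ['DNA', 'protein']
```

The last call shows why sequence tools usually ask the user to state the sequence type: A, C, G and T are also valid amino acid letters, so a short DNA string cannot be told apart from a protein by its letters alone.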
Genomic sequencing

The genomes of living organisms contain many elements, including genes coding for proteins. Scientists are working to sequence and assemble the genomes of these organisms. The goal is to obtain a genomic sequence and to identify the complete set of genes that code for proteins. A vast amount of DNA sequence has already been determined, and the pace at which new sequences are characterized is continuously accelerating. Computers are necessary to store and distribute this enormous volume of data.

Bioinformatics Techniques:

Reviews of bioinformatics are most often technology centered, focusing on the techniques that have evolved rapidly in this new discipline for ever more sophisticated analysis of sequences and structures. As a consequence of the large amount of data produced in the field of molecular biology, most current bioinformatics projects deal with structural and functional aspects of genes and proteins.

First, the data produced by thousands of research teams all over the world are collected and organized in specialized databases. Next, computational tools are needed to analyze the collected data in the most efficient manner. Computational tools were also developed to integrate the information in new types of web resources. Through these web sites, molecular cell biologists throughout the world access different databases such as GenBank, the Protein Data Bank (PDB), etc. (www.ncbi.nlm.nih.gov)

Bioinformatics can be used to suggest the functions of newly identified genes and proteins. Proteins with similar functions contain homologous amino acid sequences that correspond to important functional domains in the three-dimensional structure of the proteins. The function of a protein that has not been isolated can therefore often be predicted from the homology of its gene or cDNA with DNA sequences encoding proteins of known function.
This is done by identifying and cloning the gene encoding the protein of unknown function, and then comparing the newly derived sequence with previously determined sequences stored in data banks to search for similar sequences, called homologous sequences. Protein-coding regions can be translated into amino acid sequences, which can also be compared. Because of degeneracy in the genetic code, related proteins often exhibit more homology than the genes encoding them.

Computational programming used for searching sequence databases

As mentioned above, the discovery of sequence homology to a known protein or family of proteins often provides the first clues about the function of a newly sequenced gene. As the DNA and amino acid sequence databases continue to grow in size, they become increasingly useful in the analysis of newly sequenced genes and proteins, because the chance of finding such homology is greater. There are a number of software tools for searching sequence databases, but all use some measure of similarity between sequences to distinguish biologically significant relationships from chance similarities.

Basic Local Alignment Search Tool (BLAST) Program:

BLAST is the best-known, user-friendly, web-based tool (www.ncbi.nlm.nih.gov/blast/blast.cgi) for rapid searching of nucleotide and protein databases with query sequences, such as those obtained from a DNA sequencer or a nucleic acid translation program. It directly approximates alignments between the novel sequences (queries) and the previously characterized genes (databases) that optimize a measure of local similarity, the maximal segment pair (MSP) score. Sequence alignment provides a powerful way to compare novel sequences with previously characterized genes.

Bioinformatics Lab Activity

1. Go to http://www.ncbi.nlm.nih.gov/
2. Choose: BLAST
3. Choose: Human
4. Feed your sequence
5. Choose: MegaBlast or BlastN
6. Run BLAST and wait.
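The maximal segment pair (MSP) score named in the BLAST description above can be illustrated with a small exhaustive search. This is a sketch under stated assumptions, not BLAST itself: BLAST uses fast heuristics and tunable scoring schemes to approximate this quantity across a whole database, whereas the hypothetical function `maximal_segment_pair` below checks every ungapped diagonal between just two sequences, with an assumed score of +1 per match and -1 per mismatch.

```python
def maximal_segment_pair(query, subject, match=1, mismatch=-1):
    """Exhaustively find the best-scoring ungapped segment pair.

    Returns (score, query_start, subject_start, length), 0-based starts.
    The +1/-1 scoring is an assumption chosen for illustration.
    """
    best = (0, 0, 0, 0)
    n, m = len(query), len(subject)
    # Each diagonal pairs query[i] with subject[i + offset].
    for offset in range(-(n - 1), m):
        i = max(0, -offset)
        running, start = 0, i
        while i < n and i + offset < m:
            running += match if query[i] == subject[i + offset] else mismatch
            if running <= 0:
                # A segment never benefits from a non-positive prefix.
                running, start = 0, i + 1
            elif running > best[0]:
                best = (running, start, start + offset, i - start + 1)
            i += 1
    return best


# Codons 5 to 8 region of the normal and Hemoglobin S beta-globin mRNA.
print(maximal_segment_pair("CUCCUGAGGAG", "CUCCUGUGGAG"))  # (9, 0, 0, 11)
print(maximal_segment_pair("AAAGGG", "UUGGGU"))            # (3, 3, 2, 3)
```

In the first call the best segment spans the whole fragment: ten matches and one mismatch give a score of 9, which shows how a single base change lowers the score without breaking the local alignment.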
Normal Hemoglobin beta subunit mRNA

AUGGUGCAUCUGACUCCUGAGGAGAAGUCUGCCGUUACUGCCCUGUGGGGCAAGGUGAACGUGGAUGA
AGUUGGUGGUGAGGCCCUGGGCAGGCUGCUGGUGGUCUACCCUUGGACCCAGAGGUUCUUUGAGUCCU
UUGGGGAUCUGUCCACUCCUGAUGCUGUUAUGGGCAACCCUAAGGUGAAGGCUCAUGGCAAGAAAGUG
CUCGGUGCCUUUAGUGAUGGCCUGGCUCACCUGGACAACCUCAAGGGCACCUUUGCCACACUGAGUGAG
CUGCACUGUGACAAGCUGCACGUGGAUCCUGAGAACUUCAGGCUCCUGGGCAACGUGCUGGUCUGUGU
GCUGGCCCAUCACUUUGGCAAAGAAUUCACCCCACCAGUGCAGGCUGCCUAUCAGAAAGUGGUGGCUGG
UGUGGCUAAUGCCCUGGCCCACAAGUAUCACUAA

To translate: http://expasy.org/tools/dna.html

Normal Hemoglobin beta protein sequence

MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGA
FSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALA
HKYH

Hemoglobin S beta chain mRNA: GAG became GUG (GTG in the DNA)

AUGGUGCAUCUGACUCCUGUGGAGAAGUCUGCCGUUACUGCCCUGUGGGGCAAGGUGAACGUGGAUGA
AGUUGGUGGUGAGGCCCUGGGCAGGCUGCUGGUGGUCUACCCUUGGACCCAGAGGUUCUUUGAGUCCU
UUGGGGAUCUGUCCACUCCUGAUGCUGUUAUGGGCAACCCUAAGGUGAAGGCUCAUGGCAAGAAAGUG
CUCGGUGCCUUUAGUGAUGGCCUGGCUCACCUGGACAACCUCAAGGGCACCUUUGCCACACUGAGUGAG
CUGCACUGUGACAAGCUGCACGUGGAUCCUGAGAACUUCAGGCUCCUGGGCAACGUGCUGGUCUGUGU
GCUGGCCCAUCACUUUGGCAAAGAAUUCACCCCACCAGUGCAGGCUGCCUAUCAGAAAGUGGUGGCUGG
UGUGGCUAAUGCCCUGGCCCACAAGUAUCACUAA

Hemoglobin S beta chain protein sequence: glutamic acid 6 is changed to valine (residue 7 in the sequence below, which counts the initiator methionine)

MVHLTPVEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGA
FSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALA
HKYH

To align two sequences: http://xylian.igh.cnrs.fr/bin/align-guess.cgi

Normal Hemoglobin beta protein sequence

MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGA
FSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALA
HKYH

Hemoglobin S beta chain

MVHLTPVEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGA
FSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALA
HKYH

Normal Hemoglobin beta subunit mRNA

AUGGUGCAUCUGACUCCUGAGGAGAAGUCUGCCGUUACUGCCCUGUGGGGCAAGGUGAACGUGGAUGA
AGUUGGUGGUGAGGCCCUGGGCAGGCUGCUGGUGGUCUACCCUUGGACCCAGAGGUUCUUUGAGUCCU
UUGGGGAUCUGUCCACUCCUGAUGCUGUUAUGGGCAACCCUAAGGUGAAGGCUCAUGGCAAGAAAGUG
CUCGGUGCCUUUAGUGAUGGCCUGGCUCACCUGGACAACCUCAAGGGCACCUUUGCCACACUGAGUGAG
CUGCACUGUGACAAGCUGCACGUGGAUCCUGAGAACUUCAGGCUCCUGGGCAACGUGCUGGUCUGUGU
GCUGGCCCAUCACUUUGGCAAAGAAUUCACCCCACCAGUGCAGGCUGCCUAUCAGAAAGUGGUGGCUGG
UGUGGCUAAUGCCCUGGCCCACAAGUAUCACUAA

Hemoglobin S beta chain mRNA: GAG became GUG (GTG in the DNA)

AUGGUGCAUCUGACUCCUGUGGAGAAGUCUGCCGUUACUGCCCUGUGGGGCAAGGUGAACGUGGAUGA
AGUUGGUGGUGAGGCCCUGGGCAGGCUGCUGGUGGUCUACCCUUGGACCCAGAGGUUCUUUGAGUCCU
UUGGGGAUCUGUCCACUCCUGAUGCUGUUAUGGGCAACCCUAAGGUGAAGGCUCAUGGCAAGAAAGUG
CUCGGUGCCUUUAGUGAUGGCCUGGCUCACCUGGACAACCUCAAGGGCACCUUUGCCACACUGAGUGAG
CUGCACUGUGACAAGCUGCACGUGGAUCCUGAGAACUUCAGGCUCCUGGGCAACGUGCUGGUCUGUGU
GCUGGCCCAUCACUUUGGCAAAGAAUUCACCCCACCAGUGCAGGCUGCCUAUCAGAAAGUGGUGGCUGG
UGUGGCUAAUGCCCUGGCCCACAAGUAUCACUAA
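The translation and comparison steps of this activity can also be reproduced offline. The sketch below is illustrative and does not describe how the web tools listed above work internally: it builds the standard genetic code table, translates the normal beta-globin mRNA given above, and compares it position by position with the Hemoglobin S version. The names `translate` and `differences` are hypothetical, and the Hemoglobin S mRNA is constructed here by applying the single GAG to GUG change stated in the text rather than being retyped.

```python
from itertools import product

# Standard genetic code, listed in the conventional UCAG codon-table order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

NORMAL_MRNA = (
    "AUGGUGCAUCUGACUCCUGAGGAGAAGUCUGCCGUUACUGCCCUGUGGGGCAAGGUGAACGUGGAUGA"
    "AGUUGGUGGUGAGGCCCUGGGCAGGCUGCUGGUGGUCUACCCUUGGACCCAGAGGUUCUUUGAGUCCU"
    "UUGGGGAUCUGUCCACUCCUGAUGCUGUUAUGGGCAACCCUAAGGUGAAGGCUCAUGGCAAGAAAGUG"
    "CUCGGUGCCUUUAGUGAUGGCCUGGCUCACCUGGACAACCUCAAGGGCACCUUUGCCACACUGAGUGAG"
    "CUGCACUGUGACAAGCUGCACGUGGAUCCUGAGAACUUCAGGCUCCUGGGCAACGUGCUGGUCUGUGU"
    "GCUGGCCCAUCACUUUGGCAAAGAAUUCACCCCACCAGUGCAGGCUGCCUAUCAGAAAGUGGUGGCUGG"
    "UGUGGCUAAUGCCCUGGCCCACAAGUAUCACUAA"
)
# Hemoglobin S: the 7th codon GAG becomes GUG (A -> U at nucleotide 20).
HBS_MRNA = NORMAL_MRNA[:19] + "U" + NORMAL_MRNA[20:]


def translate(mrna):
    """Translate an mRNA reading frame, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def differences(a, b):
    """List (1-based position, letter in a, letter in b) where a and b differ.

    Assumes equal-length sequences, so no gaps are needed.
    """
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


normal_protein = translate(NORMAL_MRNA)
hbs_protein = translate(HBS_MRNA)

print(normal_protein[:10], hbs_protein[:10])
print(differences(NORMAL_MRNA, HBS_MRNA))        # [(20, 'A', 'U')]
print(differences(normal_protein, hbs_protein))  # [(7, 'E', 'V')]
```

Position 7 in the protein comparison counts the initiator methionine; in the conventional numbering of the mature beta chain, which lacks that methionine, this is the glutamic acid 6 to valine change. Because the two sequences have the same length, a position-by-position comparison is enough here; sequences with insertions or deletions need a true alignment tool.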