A Bioinformatics Tool for Analyzing G-quadruplexes in the mRNA Untranslated Regions
ザカレ ザッパァ
Zachary Zappala

But why?
• To map the theoretical existence of quadruplex-forming G-rich sequences (QGRS) in cytoplasmic mRNAs

Utility Belt
• Access to NCBI Entrez Gene
• PHP/MySQL/C++/JavaScript
• Laptop (Dell Latitude D505, 1.6 GHz, 512 MB DDR)
• Internet server
• Perseverance
• Starbucks Frappuccinos™

Bioinformatics?
• A hybrid of information sciences and biology
• Similar to, but not the same as, computational biology
• Enlists the help of databases and tools to analyze large masses of data and find patterns that are not easily discernible by the human eye
• Tools like NCBI BLAST are especially well known

…Biology? In computer science?
• Well, it’s not that complicated
• The biology involved in this project is transcription/translation
• Genetics! Quick overview:
• DNA (double helix; 2 nucleotide strands)
• RNA (single nucleotide strand)
• DNA is transcribed into RNA, which travels to ribosomes, which translate the RNA data into amino acids, the building blocks of proteins

Eukaryotic mRNA!
There are 3 sections in cytoplasmic eukaryotic mRNAs:
• The 5’ UTR
• The coding sequence (CDS)
• The 3’ UTR
DNA in the nucleus → transcription → RNA in the nucleus (pre-mRNA) → RNA processing (SPLICING!) → RNA in the cytoplasm (mRNA): 5’ UTR | CDS | 3’ UTR
Gene expression regulation factors?

G-quadrawhats?
• That’s right: G-quadruplexes
• A type of secondary structure that forms in single-stranded nucleotide sequences (a.k.a. mRNA)
• …GG~GG~GG~GG…
• Plates (tetrads) form between 4 guanine molecules that line up

Why is this important?
• Since all this work is theoretical, it’s important to know that there could be an application
• QGRS in pre-mRNA have already been shown to play an important role in pre-mRNA splicing (Kikin, D’Antonio, Bagga 2006)

So, what about cytoplasmic mRNA?
• Gene expression control (since not all mRNAs become proteins)
• Internal Ribosomal Entry Sites (IRES)
• These allow ribosomes to enter and start translation somewhere other than the beginning of the 5’ UTR

But how to predict?
• Prior research has given several clues to what constitutes a strong QGRS
• For instance, it is known that only one loop can have a length of zero
• Also, the more tetrad plates that form, the more likely it is that the QGRS will exist
• QGRS motif: G_x N_y1 G_x N_y2 G_x N_y3 G_x (four runs of x guanines, giving x tetrads, separated by three loops of lengths y1, y2 and y3)
• A G-score is assigned using a straightforward function

Divining the QGRS in an mRNA sequence
• When a gene is requested by the user, data is parsed from NCBI and given as parameters to a C++ program
• The C++ program executes and saves its results for the sequence, which the PHP program then picks up again and displays to the user

“Behind the scenes”
Program Flow! [diagram not reproduced; one label reads “PHP Session Variable”]

“Interface”
Screenshots! …more screenshots! …and more screenshots! [images not reproduced]

But that’s not all!
• In order to not overload you with pretty pictures, let’s just say that you can also view direct data tables and a sequence view that blocks out the QGRS locations
• The program executes quickly, and due to the nature of mRNA there are not many abnormal situations
• Poor internet connections do tend to slow display…but that’s your ISP

Successes & Failures
• Research on the NRAS oncogene has shown, using crystallography, that a QGRS exists at the -222 CDS bp position (within the 5’ UTR)
• When QGRS Mapper 2 analyzed the same gene, it predicted a QGRS at the same position
• Incomplete NCBI entries have prevented full verification of the reported data
• Unfortunately, not enough data is available for research to be done on IRES sites
• All IRES sites must be determined empirically, as no strict pattern has been shown to exist yet

Think of tomorrow…
• Currently analyzing various oncogenes, especially NRAS, to find out whether the Mapper successfully maps the conserved QGRS
• GRS UTRdb is currently being built as well, making it possible for large calculations to be applied to mapped data

The design of this database is…
Entity Relationship Diagrams! ORACLE Certified!
• Shows the relationships between different tables in the database
• Is currently being populated, and is not yet public

Want to try mapping yourself?
• Go to http://bioinformatics.ramapu.edu/QGRS2/index.php
• While the Mapper program is publicly available, the database is still not ready for public access

Related References
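
Extra slide: sketching the QGRS motif scan

The prediction rules from "But how to predict?" (four equal G-runs, three loops, at most one loop of length zero, more tetrads being better) can be tried out in a few lines. What follows is only a minimal Python sketch of that motif search, not the C++ program behind the Mapper. The names `QGRS` and `find_qgrs`, the minimum of 2 guanines per run, and the 30-nucleotide length cap are all assumptions made for illustration. The G-score function is not given in these slides, so it is not reproduced; hits are simply ordered by tetrad count.

```python
from typing import Iterator, NamedTuple, Tuple


class QGRS(NamedTuple):
    start: int                    # 0-based index of the first G in the hit
    tetrads: int                  # x: guanines per G-run
    loops: Tuple[int, int, int]   # (y1, y2, y3): loop lengths
    sequence: str                 # the matched stretch of the input


def find_qgrs(seq: str, min_tetrads: int = 2, max_length: int = 30) -> Iterator[QGRS]:
    """Yield every match of GxNy1GxNy2GxNy3Gx in seq.

    min_tetrads and max_length are illustrative assumptions, not values
    taken from the slides.
    """
    seq = seq.upper()
    for start in range(len(seq)):
        x = min_tetrads
        # Try every run length x that fits at this start position.
        while 4 * x <= max_length and seq.startswith("G" * x, start):
            run = "G" * x
            budget = max_length - 4 * x   # nucleotides left for the 3 loops
            pos1 = start + x
            for y1 in range(budget + 1):
                if not seq.startswith(run, pos1 + y1):
                    continue
                pos2 = pos1 + y1 + x
                for y2 in range(budget - y1 + 1):
                    if not seq.startswith(run, pos2 + y2):
                        continue
                    pos3 = pos2 + y2 + x
                    for y3 in range(budget - y1 - y2 + 1):
                        if not seq.startswith(run, pos3 + y3):
                            continue
                        loops = (y1, y2, y3)
                        # Rule from the slides: only one loop may be empty.
                        if loops.count(0) > 1:
                            continue
                        end = pos3 + y3 + x
                        yield QGRS(start, x, loops, seq[start:end])
            x += 1


if __name__ == "__main__":
    # Made-up toy sequence, not a real UTR.
    hits = sorted(find_qgrs("AAGGAGGCGGUGGAA"), key=lambda h: -h.tetrads)
    for hit in hits:
        print(hit)
```

The scan is brute force on purpose: the length cap keeps the loop ranges small, and overlapping candidates are all reported rather than filtered, so any scoring or overlap handling would have to be layered on top.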