Genome databases and webtools for genome analysis
• Become familiar with microbial genome databases
• Use some of the tools useful for analyzing genomes
• Visit sites used in lab exercise #2

Major components of NCBI
• GenBank
• PubMed
• Entrez
• BLAST
• Conserved Domain Database (CDD)
• Clusters of Orthologous Groups (COGs)
• OMIM

GenBank
• Database of DNA and protein sequences
• Searchable
• Caution: Sequences are deposited by the community and are not curated for accuracy.
• RefSeq - verified by NCBI.

Example of a GenBank record

BLAST
• Basic Local Alignment Search Tool
• Compares nucleotide sequences and protein sequences
• Microbial-specific BLAST page
• Focus of a future lab

OMIM
• Online Mendelian Inheritance in Man.
• Database that links diseases and genes

TIGR
• Comprehensive Microbial Resource (CMR).
• Many genomes.
• Tools to analyze genomes.

SubtiList
• Website for the B. subtilis genome.
• Features
– Annotated genes
– Gene region display
– Updated similarity searches for every protein
– BLAST and pattern search capabilities
– Links to journal articles and protein databases

RDP
• Ribosomal Database Project
• Curated at MSU
• Contains a compilation of all ribosomal DNA sequences (currently over 100,000).
• Second database contains information regarding copy number of ribosomal RNA.

KEGG
• Kyoto Encyclopedia of Genes and Genomes
• Frequently updated database of gene content, metabolic pathways, etc.
• Excellent resource for reconstructing pathways in an organism of interest.

Genome sequencing and annotation
Week 2 reading assignments - pages 65-79, 110-122. Boxes 2.1, 2.2 and 2.3. Don’t worry about the details of HMM. Hughes Functional Genomics Review.
• Sequencing - dideoxy method for DNA sequencing.
• Methods for sequencing genomes.
• Methods for finding and annotating genes in microbial genomes.

Dideoxy sequencing (Sanger method)
• Developed by Frederick Sanger (for which he won his second Nobel Prize in 1980).
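
The slides do not spell out how chain termination turns into a readable sequence, so the following is a minimal, idealized sketch of that idea rather than a model of the chemistry. It assumes each of the four dideoxy reactions produces a fragment ending at every position where its base occurs in the newly synthesized strand, and that sorting all fragments by length recovers the sequence. The function names `termination_lengths` and `read_gel` are hypothetical, chosen only for this illustration.

```python
def termination_lengths(synthesized: str) -> dict:
    """Map each dideoxy base to the fragment lengths its reaction yields.

    Idealized assumption: a ddNTP terminates synthesis at every position
    where the matching base occurs, so fragment length == position.
    """
    lanes = {base: [] for base in "ACGT"}
    for position, base in enumerate(synthesized.upper(), start=1):
        lanes[base].append(position)
    return lanes


def read_gel(lanes: dict) -> str:
    """Read the sequence by ordering fragments from shortest to longest."""
    base_at_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_at_length[n] for n in sorted(base_at_length))


if __name__ == "__main__":
    lanes = termination_lengths("GATTACA")
    for base, lengths in lanes.items():
        print(f"dd{base} reaction: fragment lengths {lengths}")
    print("Sequence read from the gel:", read_gel(lanes))
```

With radioactive labeling each entry of `lanes` corresponds to a separate lane on the gel; with four fluorophores the same information can be read from a single mixed reaction.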

Two types of labeling
• Radioactive
– 32P, 35S
– Run each dideoxy base in a separate reaction and a separate lane on a gel.
– No longer used
• Fluorescent
– Four different fluorophores, one for each base
– Reactions can be mixed.
– Chromatograms - GTSF

Cycle sequencing

Phred
• Method for automated quality assessment of DNA sequence traces.
– Variance in peak spacing in a 7-peak window
– Ratio of largest uncalled peak to smallest called peak in 7- and 3-peak windows.
– Number of bases between current base and nearest unresolved base.
• Phred score = -10 x log10(P), where P is the estimated probability that the base call is wrong.
• Phred scores of 20 or higher are considered good calls. Why?

Sequencing of genomes
• Hierarchical or contig-based sequencing
– Clone smaller segments of the genome.
– Labor intensive, slow
– Not needed for sequencing microbial genomes
• Shotgun method
– Randomly clone and sequence 1.5-2 kb fragments of DNA. 5-10 fold coverage.
– Computationally intensive.

Finding genes in a genome sequence
• What to look for?
• Glimmer - Markov-model-based algorithm (interpolated Markov models) for identifying genes. (TIGR).
• ORF finder - NCBI.
• Most automated annotation engines have ORF finding capabilities.
• Much more difficult in eukaryotic genomes.
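
As one possible answer to "What to look for?", a naive gene finder can scan all six reading frames for open reading frames (ORFs): an ATG start codon followed in frame by a stop codon. The sketch below is only an illustration of that idea. It is not how Glimmer or NCBI's ORF finder is implemented, and the function names, the `min_codons` cutoff and the example sequence are assumptions made up for this example.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3) -> list:
    """Scan six reading frames for ATG...stop open reading frames.

    Returns tuples (strand, frame, start, end, orf_sequence). Coordinates
    are 0-based, end-exclusive, and refer to the strand being scanned.
    min_codons counts every codon in the ORF, including the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # index of the first ATG in the current ORF
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START:
                    start = i
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None
    return orfs


if __name__ == "__main__":
    # Toy sequence invented for this example
    for orf in find_orfs("CCATGAAATTTTAGGG"):
        print(orf)
```

In a real microbial genome the cutoff would be far higher than three codons, and long ORFs alone still produce false positives, which is part of why model-based tools are used. The approach also fails in eukaryotes, where introns interrupt the reading frame.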
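
The Phred slide asks why a score of 20 counts as a good call. A short calculation makes the relationship concrete, assuming the standard definition in which P is the estimated probability that a base call is wrong and the logarithm is base 10. The helper names `phred_quality` and `error_probability` are hypothetical.

```python
import math


def phred_quality(error_probability: float) -> float:
    """Phred score Q = -10 * log10(P) for a base-call error probability P."""
    return -10 * math.log10(error_probability)


def error_probability(quality: float) -> float:
    """Invert the Phred formula: P = 10 ** (-Q / 10)."""
    return 10 ** (-quality / 10)


if __name__ == "__main__":
    for q in (10, 20, 30, 40):
        p = error_probability(q)
        print(f"Q{q}: about 1 miscalled base in {round(1 / p)}")
```

Under this definition a score of 20 corresponds to an error probability of 1 in 100, and every additional 10 points lowers the error rate by another factor of ten.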