• Review of important points from the NCBI lectures.
  – Example slides
• Review the two types of microarray platforms.
  – Spotted arrays
  – Affymetrix
• Specific examples that use microarray technology.
  – Gene expression - role of a transcription factor

Web Access
• Text: Entrez
• Sequence: BLAST
• Structure: VAST

Translated BLAST
Particularly useful for nucleotide sequences without protein annotations, such as ESTs or genomic DNA
(N = nucleotide, P = protein)

Program   Query            Database
blastx    N (translated)   P
tblastn   P                N (translated)
tblastx   N (translated)   N (translated)

Position Specific Score Matrix (PSSM)
[Figure: PSSM for alignment positions 206-224 (residues D G V I S S C N G D S G G P L N C Q A), with one score per amino acid (A R N D C Q E G H I L K M F P S T W Y V) at each position. Annotations: "Serine is scored differently in these two positions"; "Active site nucleophile".]

PSI-BLAST
Create your own PSSM: Confirming relationships of purine nucleotide metabolism proteins
[Diagram labels: query, PSSM, BLOSUM62, Alignment]
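The PSSM slide's point, that the same residue can score differently depending on its position, can be illustrated with a minimal sketch. The matrix below is a toy example: its values, the `DEFAULT_SCORE` fallback, and the function names are assumptions for illustration, not the scores from the slide and not PSI-BLAST's actual implementation.

```python
# Toy PSSM: one dict per alignment position, mapping residue -> score.
# All values are hypothetical; they are NOT the scores shown on the slide.
TOY_PSSM = [
    {"G": 6, "A": 0, "S": -1},
    {"D": 7, "E": 2, "S": -2},
    {"S": 7, "T": 1, "A": 0},   # serine is strongly favored only here
    {"G": 6, "A": -1, "S": -2},
]
DEFAULT_SCORE = -4  # assumed score for residues not listed at a position


def score_window(window, pssm):
    """Sum the position-specific scores for a window as long as the PSSM."""
    if len(window) != len(pssm):
        raise ValueError("window and PSSM must have the same length")
    return sum(col.get(res, DEFAULT_SCORE) for res, col in zip(window, pssm))


def best_window(sequence, pssm):
    """Slide the PSSM along the sequence; return (start, score) of the best ungapped match."""
    width = len(pssm)
    scored = [
        (score_window(sequence[i:i + width], pssm), i)
        for i in range(len(sequence) - width + 1)
    ]
    score, start = max(scored)
    return start, score


start, score = best_window("CNGDSGGP", TOY_PSSM)
```

Unlike a general substitution matrix such as BLOSUM62, where serine always receives the same score against a given residue, this toy matrix rewards serine only at the third position and penalizes it elsewhere.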
Affymetrix vs. glass slide based arrays
• Affymetrix
  – Short oligonucleotides
  – Many oligos per gene
  – Single sample hybridized to chip
• Glass slide
  – Long oligonucleotides or PCR products
  – A single oligo or PCR product per gene
  – Two samples hybridized to chip

Bacterial DNA microarrays
• Small genome size
• Fully sequenced genomes, well annotated
• Ease of producing biological replicates
• Genetics

Applications of DNA microarrays
• Monitor gene expression
  – Study regulatory networks
  – Drug discovery - mechanism of action
  – Diagnostics - tumor diagnosis
  – etc.
• Genomic DNA hybridizations
  – Explore microbial diversity
  – Whole genome comparisons
  – Diagnostics - tumor diagnosis
• ?

Characterization of the stationary phase sigma factor regulon (σH) in Bacillus subtilis
• Patrick Eichenberger, Eduardo Gonzalez-Pastor, and Richard Losick, Harvard University.
• Robert A. Britton and Alan D. Grossman, Massachusetts Institute of Technology.

What is a sigma factor?
• Directs RNA polymerase to promoter sequences
• Bacteria use many sigma factors to turn on regulatory networks at different times.
  – Sporulation
  – Stress responses
  – Virulence
(Wosten, 1998)

Alternative sigma factors in B. subtilis sporulation
[Figure: Kroos and Yu, 2000]

The stationary phase sigma factor: σH
• most active at the transition from exponential growth to stationary phase
• mutants are blocked at stage 0 of sporulation
• known targets involved in:
  – phosphorelay (kinA, spo0F)
  – sporulation (sigF, spoVG)
  – cell division (ftsAZ)
  – cell wall (dacC)
  – general metabolism (citG)
  – phosphatase inhibitors (phr peptides)

Experimental approach
• Compare expression profiles of wt and ∆sigH mutant at times when sigH is active.
• Artificially induce the expression of sigH during exponential growth.
  – When Sigma-H is normally not active.
  – Might miss genes that depend on additional factors other than Sigma-H.
• Identify potential promoters using computer searches.
Pspac sigH, ∆sigH, wild-type
• Grow cells
• Isolate RNA
• Make labeled cDNA
• Mix and hybridize
• Scan slide
• Analyze data

wild type (Cy5) vs. sigH mutant (Cy3)
[Figure: array images at Hour -1, Hour 0, and Hour +1; the citG and sacT spots are labeled.]

Identifying differentially expressed genes
• Many different methods
• Arbitrary assignment of fold change is not a valid approach
• Statistical representation of the data
  – Iterative outlier analysis
  – SAM (significance analysis of microarrays)

Data from a microarray are expressed as ratios
• Cy3/Cy5 or Cy5/Cy3
• Measuring differences in two samples, not absolute expression levels
• Ratios are often log2 transformed before analysis

Genes whose transcription is influenced by σH
• 433 genes were altered when comparing wt vs. ∆sigH.
• 160 genes were altered when sigH was overexpressed.
• Which genes are directly regulated by Sigma-H?

Identifying sigH promoters
• Two bioinformatics approaches
  – Hidden Markov Model database (P. Fawcett)
    • HMMER 2.2 (hmm.wustl.edu)
  – Pattern searches (SubtiList)
• Identify 100s of potential promoters

Correlate potential sigH promoters with genes identified with microarray data.
• Genes positively regulated by Sigma-H in a microarray experiment that have a putative promoter within 500 bp of the gene.

Directly controlled sigH genes
• 26 new sigH promoters controlling 54 genes
• Genes involved in key processes associated with the transition to stationary phase
  – generation of new food sources (i.e., proteases)
  – transport of nutrients
  – cell wall metabolism
  – cytochrome biogenesis
• Correctly identified nearly all known sigH promoters
• Complete sigH regulon:
  – 49 promoters controlling 87 genes.

• Identification of DNA regions bound by proteins. Iyer et al. 2001, Nature 409:533-538

Pathogen 1, Pathogen 2
• Grow cells
• Isolate RNA
• Make labeled cDNA
• Mix and hybridize
• Scan slide
• Analyze data
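Two steps from the sigH analysis above lend themselves to a short sketch: log2 transformation of Cy5/Cy3 ratios, and keeping only those regulated genes that have a putative promoter within 500 bp. Everything concrete here is an assumption for illustration: the gene names, intensities, and coordinates are made up, the set of regulated genes is taken as given (for example from a statistical method such as SAM, not from a fold-change cutoff), and distance is measured from the gene start without regard to strand.

```python
import math


def log2_ratio(cy5, cy3):
    """log2(Cy5/Cy3) for one spot: a 2-fold increase gives +1, a 2-fold decrease gives -1."""
    if cy5 <= 0 or cy3 <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(cy5 / cy3)


def direct_candidates(regulated_genes, gene_starts, promoter_positions, max_dist=500):
    """Return regulated genes with a putative promoter within max_dist bp of the gene start.

    Simplification: uses absolute distance and ignores strand.
    """
    hits = []
    for gene in sorted(regulated_genes):
        start = gene_starts[gene]
        if any(abs(start - pos) <= max_dist for pos in promoter_positions):
            hits.append(gene)
    return hits


# Hypothetical example data (names, intensities, and coordinates are invented).
intensities = {
    "geneA": (4000.0, 1000.0),
    "geneB": (500.0, 1000.0),
    "geneC": (2000.0, 1000.0),
}
ratios = {g: log2_ratio(cy5, cy3) for g, (cy5, cy3) in intensities.items()}

gene_starts = {"geneA": 10_500, "geneB": 52_000, "geneC": 90_000}
promoter_positions = [10_200, 51_900, 70_000]  # e.g. hits from an HMM or pattern search
regulated = {"geneA", "geneC"}  # assumed output of a statistical test

candidates = direct_candidates(regulated, gene_starts, promoter_positions)
```

The log2 scale makes increases and decreases symmetric around zero, which a raw ratio does not. In this toy data, geneC is called regulated but has no nearby promoter hit, so it would be treated as a possible indirect target.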