CRISPR-associated Proteins
Sarah Pyfrom

Research Questions
• Which Cas proteins does our species share with the 10 other species we chose to study, and how do they compare?
• How do Cas proteins function in relation to CRISPR units?
• [Edit]: Why did JGI change its annotation?

Cas Proteins
• Proteins that are almost always associated with (near) CRISPR sequences
• Originally four major families
• Now, at least 45 families total

JGI annotation
"Old" Cas proteins: Cas1, Cas2, Cas3, Cas4, TM1800, TM1801
"New" Cas proteins: Cas1, Cas2, Cas4, Cas5, Cas6, Csh1, Csh2

Changes:
• TM1800 = Cas5
• TM1801 = Csh2
• Hypothetical protein = Csh1
• Part of hypothetical protein = Cas6
• Cas3 = hypothetical protein

Cas4:
MTDSSGDPVDRFLAAARDESAELPFRLTGVMFQYYVVCER
ELWFLSRDVEIDRDTPAIVRGSDVDDSAYADKRRDVRVDGII
AIDVLDSGEILEVKPSSSMTEPARLQLLFYLWYLDRVTGVEK
TGVLAHPAEKRRETVELTPETSAEVESAIEGIRAVVTAESPPP
AEEKPVCDSCAYHDFCWSC
(red = original Cas4)

Map of CRISPR region
[Figure: map of the CRISPR region. Labels: TM1800, TM1801, transposases, Cas3, hypothetical proteins, CRISPR, Cas1, Cas2, unidentified, Cas4, Csh1, Cas5, Cas6, Csh2]

Cas1 (from Sulfolobus solfataricus)
• High-affinity nucleic acid binding protein
• Binds DNA, RNA and DNA–RNA hybrids sequence non-specifically in a multi-site binding mode
• Promotes the hybridization of complementary nucleic acid strands
From: SSO1450 – A CAS1 protein from Sulfolobus solfataricus P2 with high affinity for RNA and DNA

Cas2
• Function unknown

Cas3
• Usually similar to helicases
• Unwinds double-stranded DNA
• Thought to be involved in DNA metabolism and repair

Cas4
• Often resemble RecB exonucleases
• Break down nucleic acid strands
• Thought to be involved in DNA metabolism
From: GenBank

Cas5
• Often found with Cas1 and Cas6
• Share an N-terminal region of about 43 amino acids in length
• Are usually 210-265 amino acids long
From: EMBL IPR013422 profile page (http://www.ebi.ac.uk/interpro/IEntry?ac=IPR013422)

Cas6
• Characterized by a GhGxxxxxGhG motif, where h indicates a hydrophobic residue, at the C-terminus
From: Sanger PF09559 profile page (http://pfam.sanger.ac.uk/family/PF09559)

Csh1 and Csh2?
• Protein families determined for ease of alignment
• Often large differences between species
• Alignment is easier if the protein "soup" is divided into more readily compared subgroups

• CRISPRs are thought to create stable secondary RNA structures
• Spacers remain associated with their direct-repeat (DR) neighbors
• This provides a way for Cas proteins to recognize the spacers and facilitate the immune response
From: Evolutionary conservation of sequence and secondary structures in CRISPR repeats

Cas Proteins and Immunity
• Thought to act like Slicer and Dicer (eukaryotic counterparts)
• Create siRNA that will inhibit/break down invading RNA
• Not known whether Cas proteins are involved in integrating pathogenic DNA into spacers
• Video of eukaryotic siRNA process: http://www.youtube.com/watch?v=D77BvIOLd0

Alignments of Cas
• Compared Cas1, Cas2, Cas3, etc. proteins across all 10 species

Comparison with other species (based on "old" proteins)
Columns: Species, Cas1, Cas2, Cas3, Cas4, TM1800, TM1801
[Flattened table: empty cells were lost, so the X marks below cannot be assigned to columns unless a row has all six.]
X X X X
H. vallismortis
H. volcanii
H. sulfurifontis X
H. sinaiiensis X X X
H. californiae X X
H. utahensis X X X X X X
H. mucosum X X X X X X
H. mediterranei X X X X X X
H. denitrificans X X X X X X
H. mukohataei X X X X X X

Phylogenetic tree comparing amino acid sequences for all Cas proteins
[Figure: phylogenetic tree. Tip labels: 1, 2, 3, 4, 1800, 1801. Species shown: Halomicrobium mukohataei, Haloarcula sinaiiensis, Haloarcula californiae, Haloferax denitrificans, Haloferax mediterranei, Haloferax sulfurifontis, Haloferax mucosum, Halorhabdus utahensis]

Cas1 and Cas2 did not change

Cas4
• JGI revision shortened this protein
• Would expect low sequence similarity near the end of the protein

TM1801 (Csh2)
• Revision by JGI simply renamed this protein
• Would expect sequence similarity

In conclusion:
We don't know much… but we do know everything that everybody else knows.

Questions?

References
Kunin, V., Sorek, R., Hugenholtz, P. (2007). Evolutionary conservation of sequence and secondary structures in CRISPR repeats. Genome Biology. http://genomebiology.com/2007/8/4/R61. Accessed 24 Nov 2009.
Haft, D.H., Selengut, J., Mongodin, E.F., Nelson, K.E. (2005). A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes. PLoS Comput Biol. http://www.ncbi.nlm.nih.gov/pubmed/16292354. Accessed 24 Nov 2009.
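
Supplementary sketch: the GhGxxxxxGhG motif quoted for Cas6 in the slides can be written as a regular expression and searched for in a protein sequence. The slides do not say which residues count as hydrophobic, so the residue set below is an assumption, and the function name and test sequence are hypothetical illustrations rather than anything taken from the presentation.

```python
import re

# Assumption: the slides do not define "h"; this residue set is one common
# choice of hydrophobic amino acids and can be adjusted.
HYDROPHOBIC = "AVLIMFW"

# G-h-G, then any five residues, then G-h-G
MOTIF = re.compile(rf"G[{HYDROPHOBIC}]G.{{5}}G[{HYDROPHOBIC}]G")


def find_cas6_motif(seq: str) -> list[int]:
    """Return 0-based start positions of every GhGxxxxxGhG match in seq."""
    # Normalise case and strip whitespace left over from wrapped sequences.
    cleaned = "".join(seq.split()).upper()
    # A lookahead is used so that overlapping matches are reported too.
    lookahead = re.compile(rf"(?=({MOTIF.pattern}))")
    return [m.start() for m in lookahead.finditer(cleaned)]


if __name__ == "__main__":
    # Hypothetical test sequence, not a real Cas6 protein.
    print(find_cas6_motif("MKTGLGAAAAAGVGKK"))
```

Since the motif is described as lying at the C-terminus, a real check would probably also compare the reported positions with the sequence length; that step is left out here.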