BIT150 – Fall 2010 Midterm – Take-home exercises Due on Monday November 1st at 12pm by email to TA: [email protected] as Midterm_Lastname BEFORE STARTING… Exams are INDIVIDUAL. The finding of two or more unusual and identical errors will be considered evidence of copying. Make sure you CAREFULLY READ AND UNDERSTAND EACH QUESTION of the exam. Ask any doubt you may have ONLY regarding interpretation of the questions of the exam to TA. Make sure that all parts of each question are answered. Follow in detail the assignments given in each question of the exam. TA will NOT guess the answers of the exam. Make sure you provide appropriate references to indicate what each color/highlight/character used means. Question 1 20 points Question 2 15 points Question 3 15 points Question 4 25 points Question 5 25 points TOTAL 100pts 1. 20 Points The cDNA sequence for Triticum monococcum gene FtsH is provided below: >cDNA_TmFtsH_Wildtype_sequence ATGTTTGAACTTCTGTTTGTGTCGCAGGCTGGAAGGCGGTCGTCGAGTGTGGTCTACAATGAGCTAGTGAGTACAAGTGCTTTCAGGAC ACCTGCAAATGGCACCGGCGGAGTTCTCAAGGCGCTGCAAGAGAGGTACCGATCAAGCTACGTCGGTAGCTTCGCGCGCAGGCTACGAG ACTTTGACACGCCAAGTGATGCCTCCCTTCTTAAAGAGATCTACAGAAGTAACCCAGAAAGGGTCGTGCAGATCTTTGAGAGCCAGCCT TCCTTACATAACAACTCTTCGGCTCTCTCCCAGTATGTGAAGGCTCTTGTCGCTCTCGACAGGCTGGATGAAAGCCCGCTGCTTAAGAC ATTGCAGAGAGGAATTGTCAATTCAGCAAGGGAGGAAGAAGGGTTTAGTGGCATCCCAGCATTTCAAAGTGTTGGCCGTACGACGAAAG ATGGGGCTCTTGGTACTGCTGGTGCACCAATTCACATGGTTGCATCAGAGACTGGCCAATTCAAGGAGCAGCTTTGGCGTACCTTCCGA AGCATTGCACTCACTTTCCTAGTAATCTCTGGCATCGGGGCTCTGATTGAAGATAGAGGAATTAGTAAAGGCCTTGGATTGCATGAAGA GGTTCAGCCAAGCTTGGATTCGAGCACAAAATTCAGTGATGTCAAGGGGGTTGATGAAGCTAAAGCTGAACTCGAGGAAATAGTTCACT ACCTACGAGATCCCAAGCGTTTCACACGCCTTGGTGGCAAGCTTCCAAAAGGTGTTCTACTTGTCGGCCCACCCGGGACAGGGAAAACC ATGTTGGCAAGGGCTATCGCTGGGGAAGCTGGCGTTCCTTTCTTCTCCTGCAGCGGCAGTGAGTTTGAGGAGATGTTTGTGGGTGTCGG GGCAAGAAGAGTGAGGGATCTATTCAGTGCAGCAAAGAAACGATCTCCATGTATAATTTTCATTGATGAAATTGATGCAATTGGTGGGA 
GCAGAAACCCAAAAGATCAACAGTATATGAAGATGACCTTGAACCAGTTACTTGTTGAGCTGGATGGCTTTAAGCAGAATGATGGGATC ATTGTAATTGCAGCAACAAACTTCCCCCAGTCACTAGATAAAGCCCTTGTTAGGCCTGGGCGTTTTGACCGTCATATTGTGGTTCCTAA CCCAGATGTTGAGGGCCGACGGCAGATCCTGGAGACTCATATGTCAAAGGTGTTAAAAGCAGACGATGTGGATTTGATGACCATTGCCA GGGGAACGCCTGGATTCTCAGGTGCAGACCTTGCAAACCTGGTGAACGTGGCTGCTCTCAAGGCTGCCATGGATGGAGCGAAATCTGTT TCAATGACCGACCTCGAGTTTGCCAAGGACAGGATCATGATGGGCAGCGAGCGCAAATCAGCAGTGATATCCGACGAAAGCAGGAAGAT GACTGCATACCATGAGGGAGGGCATGCGCTGGTCGCGATACACACGGCCGGTGCCCACCCCGTCCACAAGGCCACCATTGTTCCGAGGG GAATGGCTCTGGGCATGGTCACGCAGCTGCCAGAGAAGGACCAGACCAGCGTGTCTAGGAAGCAGATGCTGGCAAGGCTGGACGTCTGC ATGGGAGGGCGGGTGGCCGAGGAGCTTATATTTGGGGAGAGTGAGGTAACGTCGGGCGCGTCGTCCGACCTAAGCCAAGCGACCCGGGC TGCCAAAGCCATGGTGACCAAGTATGGCATGAGCAAACGCGTGGGCCTTGTAGCCTACAATTATGACGACGATGGGAAGACCATGAGCA CGCAGACGCGGGGTCTGGTGGAGCAGGAGGTGAAGGAGCTGCTGGAGACGGCCTACAACAATGCCAAGACGATCCTCACGACCCACAAC AAGGAGCTGCACGCGCTCGCCAACGCCCTCATCGAGCGCGAGACCCTCACTGGCGCCCAGATCAAGAACCTTCTGTCGCAGGTAAACAG CAGCAGTGACACTCAGCAGCCCCAGGCCGCTGAGGTTCCACAGCAAACACCCGCTGCCCCAGCCTCACCCCAGTCTCCAGCAGCAGCGG CTGCGGCCGCCGCAGCAGCTGCAGCACAGCAAGCAGCGGCTCAAGCCAAAGGAGTCGCAGGCATCGGGTCCTAG The following mutations have been found in the FtsH gene at the DNA level: Mutation 1: G at position 1793 was mutated to T Mutation 2: C at position 782 was mutated to T Mutation 3: A at position 436 was mutated to G 1.1 Translate the provided wildtype cDNA into a protein sequence and paste it here. 1.2 Use the appropriate BLAST program to identify orthologous proteins. List the program used: To what conserved domain superfamily does this gene belong? Provide accession number and protein of the following orthologs: Organism Accession Protein sequence -Oryza sativa -Zea mays -Arabidopsis thaliana 1.3 Determine the resulting amino acid change for each base mutation and refer to the BLOSUM62 scoring matrix to find the scores for each mutant. (See slide 16 in Lecture 2 for BLOSUM score matrix) Mutant T. 
monococcum Amino Acid    Mutant Amino Acid    BLOSUM Score
1    G
2    P
3    T

1.4 Use MEGA to create a CLUSTAL alignment with the five orthologous proteins to determine the most useful TILLING mutant to disrupt the function of this protein among the three listed above. Present an image of the highlighted variable residues showing at least the first 25 amino acid positions.

1.5 Rank the mutants above in the order you would use them to study the function of this protein and describe your reasoning:

2. 15 Points
Consider the four proteins below, and select the most appropriate program to perform a Multiple Sequence Alignment. Name the program and describe why you chose it. Include a publication-quality image of your alignment.

Unaligned
>SequenceA
MAGSGRDRDPLVVGRVVGDVLDAFVRSTNLKVTYGSKTVSNGCELKPSMVTHQPRVEVGGNDMRTFYTLVMVDMRDPDAPSPSDP
>SequenceB
MAGRDREPLVVGRVVGDVLDPFVRTTNLRVSYGARTVSNGCELKPSMVDMPSPPDP
>SequenceC
MSINRDPLIVSRVVGDVLDPFNRSITLKVTYGQREVTNGLDLRPSQVQNKPRVEIGGEDLRNFYTLVMVDMRDPDVPSPSNP
>SequenceD
MVGSGMQRGAPLVVGRVIGDVVDPFVRRVALRVGYASRDVANGCELRPSAIADPVMVPDAPSPSDP

3. 15 Points
Using the provided alignment (midterm_alignment.mas), produce the following phylogenetic trees:
NJ with Bootstrap values
ME
MP
UPGMA

NJ – Bootstrap    ME    MP    UPGMA

By hand, create a strict consensus tree for the trees you produced above.

Strict Consensus

4. 25 Points
Annotate any retroelements or genes in the following 23.5kb sequence from Triticum monococcum.
>Tm_2010M TGTTGACGTGTGCCTCCATGATGTGATCTCTCATGTCCAAAGGTGTTGTTTTCTATGATGCAAAACTAGTCTTGCCTTAATTGGTTCTA GAGATTCTTCAAGAAAAGGGATTTTTCGTGTGGGTTTGTCTCGTTTAAGTTTAAACAACAGCAGGGATTTTCGTTTCTGATTCTTTCTG TTCTGTGTCCTTGTGGCGATGTGTGAGCTCAGCTTAACAATTTCCATATAATAAGCAAATATACATCAGATGTACAGTCTGCATCTGGG GCCCAATCTCTTTCAGATAAACTCATGCGGGCGGCTGTAAGTGTCATGTCTGTATGAGTGTAGGTTTGTGATAGCAATAGTAGGAGGCA AACCTGAACCCTTCCAGTGGTCCCAGGAGGCCAAGTGATTGCAGATAGCCACACATCCAAACAGCCTGAATAGCATCCACTAGCTGGTG CTCTCCTTCTCTATGCTCTAGATCAGAATTGGAGGGCTTCTGTGCAGCACCAGGATGCCCGGCAGCATCCACATCTCAGGCAAGTTATT CTCTCTTATGATTTGTGCTGTTGTCTTGTAAAAACAAATCCGAACAAAATGACCATACTGTTATACTGGAAGCAAGAATTTTGCTGTTC ATTTTGTTTTCATTTTACTGTGAAGCCATTAAACCACCAGTCACCGCTCCGCTGTTCTTGCAAGGTTGGTGTTCTGATTGTACTGTGCT TTCCTCTTTCTTCACTTTCCAGTTACTAATATTATTGTTTGTATACACAAAAATCGTTTTGGACACAGTCGTTGTGGGCAAAAGAGAAT ACCGTGGAAGCATAGGACAAGATGGTTTCTCTTTGTAAGCCTATTTTCATACTGCCTTGAGTGGCATTTAATTACTGTTACCTTTCTAC TTCCACAAATGTACATTGAAGCTTTCTTTATGGTGCTTAAAGGAAACTTTGTAGTTCCCTCACACGGAAATATCTATGTGCAGTCCAGT TACATCACTCCGTGACAGCACGGTAATGATGCTGCGGAATGCTGACGAAGAATTGATATCAAAAACAGGTATTATTTTTGTTGTTGGTG GTGTTTTTATCTTTATCAGCAGACCTACAATTGATGTGCATCACGATTTCTTGAGAATCACCACAGAAGTAAAGACAAAGGAAATTGTT GAGTTAGGAACTATGGATGTTGTTTTTACACTTGATAGTGGAGGGAAAATTATTATCCAGTTGCAATTCTTGCTCAGTGCTGAAGATCG CAAACGCGTCCAAGAAATGGTAAGATGGCCCTTTTTACCTTGTTTGCTATTTCATCATAAGTTAAAGATCGTTTTCCCCCACAAAATGT CCAGCTTAAGTCAAAGCATATTACTTTGTTTATTTATACTTATTTCCTATTATTCAGAGGAACTTTGCAATGAAAAGGAAACAGCAAGA GCTGCTTGGGAATGGTCTTTATTTTCAAGGTAACCCGTAAATTATAATATTTTCAGTTCTGTCTGTTTTTCCATTGCAAGTACAAATTT GGAGATCCATGCAACAGATTGTATGCGCTAGATGTGTGATGCCAATTTCTCCCCTACAAGGCTGAGGATACCCACTTCATGCATTAGTA CTTCTTAAAATTCTTTCTTGTGATGTTAACATAGGCCCAGAACTGAGGCATGTACACTCATTAAACTTCTAGATTTAGTCCTAATTTTT TGTTATACCTTAATTGCGATGCTAGTTTGTAATATTAATCATAGATAAAATATGTAATTGATATGAACAACATGTGGGAGTTCTCTTTC TTACAGATTGCTTTCAAATATATCAGACAGCCAGTTGTCTAAGGAGCAGACTGAAAAGATCTCCGACATCCCAAGCAAAGGAGATCAAC TTACGCTTTGGAAGAGTCTGTTGCTAGATGACTTGAAAGAGAGCGCTGTCTTCTCTGAAATAAGAGTTGATTCCCGCATGAAGGCTTCA 
AAGGACCTGCTACTGCCAAGTGTTGGAAGCACTTCAAAGCTTGAAGGACCCATCATCGGCTCCAAGAAAGGGCATGGTGAACCAGAGAG TAGAGCAAGTAGTGCAGTGAAGAAGATGATAAGTGCCTTCGAAAGCAGCCCTCCACAGGTTTTAAACCAATCAATATATTGAGAAACAA ATAGCTTCATTCCCTGAATTATGCTTGTCAGGTTTTGGGGGGAAACACCGATTGCTTTACTGTGCACCAAGAAGGGTTTATAATGTATA CATGTACGGAGACAAAGGAAAAATGCATGGAACCTAATTCTAACAAACTTCACATATATTTACACCATCTTCTCTTGGATTGCAGAGTC TGCCTTCGATTACAAGGATCAAATCAGAAAGCTCATTGGAAGTGATGTCAGTTTCTTCAGAGACTGGTACCAATTCTTCAGACAAGCCT TCTACTCCTGGCGCACCAGCGAATGCTTCCGACCGTACGCAGACAGGCCTTGTGGCCGAGACATCGGGCAAAGTGACCCTTCGTTCTGG TGATAAGGATTCCAGTTCGAGGTCAGGAAGGCAAGTTATGTTTGGCAATAAGAAATCAAATGCATCTAGACAGATCAATCTATCAAATA CATATGAAAGCAGAAGACGAAGCTCCAGTAGGCGCGATGAACCAGCTAAGAAGAGCATGGGAGAAGCTGACCTGATCCGTTCCAAGAAG CGGTCTGAAGACAAGCACCGTCGCTCCATTGGTCCCTACTCGCCTGAGCAGACTAACAGTTTGGTTGCAACATCTAGCATCACCTGGAT CCACCCACATGTTTGTATCACCACCGCGAGCCGACAGCTCAAAGATCTTGTTGAGCTTGAGCGCCTGGACCCAATGAAGTACGTAGAAC AGAATGTCCAGGAAGACACTGACGAGGTACGGAAAACTAAACTGATGTCGATGTTCCTACCAACTGTTGGTGCTTTTACTGCTTTTTGA GTTTGTCTTGCTGCTTCGTTCATCTTCATATGTATCACAAACCAAATTTTTGGAGCAGTGCACGAGCATCGATGAGGTGAGGCACGTGG CTGACTCTGCGCAACGGAGTGGTGGGTTTCCAGTGTTGAATGGGTGGATGATTAACCAGGCAAGTTATTGCTTTGACCTCTCCTATCCT ATCCTCTCTGATGCATGTTGCTCAAGAGTTTGTTTGCACATGCTAAATATGGATGTACTTGAAAACACCACTGATGCAGGGGGTGCGTG TGGTCATTGTGATCATAGCCTGCGGTGCTGTGTTCCTCAACAACAGGTGAAGTCAGAGGACTTGCAATATCGATTAGAGGTTATTGTCG CTCAGACTGTCAAGTAGCTGTGGAGTCCAGGATTAGGGGGTGTTCGGGTAGCCGGACTATACCTTCAGCCGGACTCCAGGACTATGAAG ATACAAGATTGAAGACTTTGTCCCGTGTCCGGATGGGACTTTCCTTGGCGTGGAAGGCAAGCCTGGCGATGCAGATATTCAAGATCTCC TACCATTGTAACCGACTTTGTGTAACCCTAACCCCCTCTGGTGTCTATATAAACCGGAGGGTTTTAGTCCGTAGGACAACTTCATCATA CAACAATCATACCATAGGCTAGCTTCTAGGGTTTAGACTCCTTGATCTCGCGGTAGATCTACTCTTGTACTACCCATATCATCAATATT AATCAAGCAGGACGTAGCCGGTCCTGAGATTTTGGGGGCCCGGGGCGAAAATAAATATGGGGCCCCTTAAAATTTTCTTTTAGCATTCT TTGACGGAAAGTCACTTACCGTGGTCTTACCTTGGATCCCTCCTCCTGCTGGTTGTGTGAAGGTGAACGTAGATGCTGCGATGACACAG TTGTGATTGTGTCAGAATGCAAGGAGCCGTGAGGGGGATTTCATGGGAGCAACAAATGTCATCTTGCTGGCATCCGGGGTCTTAGACAA 
TGGAAGCACTTTCGATGGCGATGAATTTGCTCGGTGGCAAGGGGAGAGTGGTTAGTACTGGTTCTAGGTCAGCAATACCCCTACGATGG TTCAGTGTGGCATTATGGGAAAACCATGGAGGAGATTACGGCAATAAAGCAAATATGATTCTCAAGAATATAAATTATGAAGGTTTCAT CTACAGTAATAATTTGAACTAAGATGGTATAAAAACAAATGCATGAACATACACATCAACAAAACTCCCGCTGCTGCTTCCATGCATGA AATTTATGACTTCATGATATAAATAGAATCACAAATCATACCTCTCATGAAGAGAGGAGAGAGAGAGAGTACTCAAATTAAACTTACTT GTACTGATGCGTACACACCTGAAGATGACAATTGAGATGACAGATCGTGAGGCCTTTCACCCTTAGACCCCCATAATCCTCATGATAGT ATTCGAACTTGCCACTGGCTGCCATGATTTTTTTTATAAGGTAACTGAAAACTCAACAAATTATCAGCCTCTAGACTGAAACAACAAGC ATAGCTGCATAGATCGATCTCCTCATAGTCTGCTCTCCAAGTTACGCGGCAGCGACATCGTCAGCGAATCTGTCGTGGGGTGGATTTCG ACGGTCCACCGGAGGCGAGACTTAGGGTCATTGTGAGGACGATCTGGGTGAGTCCGCAGTCGCCGGCTTGTGATTCACAATCTGGGCGA AAAGGAATCAGCCGAAATCACAGGGAAGCAGTCAAGCGATTCACGATCTGGGCGAAAAGGATTTTGTGGGGAGGTGGCGTGATGAGTCC AATGGCTCTGGTACGGACTGGCGACTCGGGTGGCGACGCGGCGGGTGGAGCGGCTCTGCTCGCTGGCCCCGGCGAGTCAGCGACCTGGT TGGTGGTGCGGACGCCGGGCCGGTCGGCACGGGCGTCCCCGGAGGCGGCGCCGCGCTGGACGGGGAGCCGTTTCTGGGCTCTGGCCGTG GGAAGCGACGATGAGGATTCGGAGGCGGAGGCGAGCGATGGCGAGGCTCGTCGTGCCGAGGCGAGACGACCGTCGATCGGCGCGTTCGT GGCTCGGGCTGAGGAGCTAGGGGGCTCGCTCACGGCCGGGCGGCGGCGGGCCCTTTGCGCCCTGGCGGCGGCGGTGCTCGCCGACCAGG GGTGTCTGGTCCAGCGTCGTTGGTGCGCGCGGCCTTGTTGGGGGGCGTGCGGGCGGCCTCCGGCAGCATGCGGGCTGCCTCCCATTCGC GGGGACCGGGGCTTGCACCAGTGGTGCAGGTGGTGCCTGGGGCGGGGGTGCCGGCGGGAGCAGGGCGGCGACGGCGACGGAGGAGGCAA CCGATCCCTGGCCTAGGGTTGGGGAGGCTGGCCGGGCTGGGCCGGTGGGCCCAAACCCAGGAAGTGAGTCAAGGTGGCCTTCGTGGGCC GTGCGCGGTCCAGGGCGTTAGTTCCGCGGAGACGGGGCTGCGTCTGGAGGTGGGCTGCTCAGATTCGCGGGGTTGCGCGGAGAAGCCCA GGCCTGGCCCGGCCTCGGAGCGGGCAGTACTTAAGTGGCTCTGGATCCGCCGCGGCGCCTCCGACAGCTCACTACGGTTTCCTGCCACT CCATCTGAGGTGCGGCGCTTCGGTGGATCCAGGCGCTTCCTTGTCCCGCCCGCCTCGCGGACGGCTTTTGCGATCTTTCGCAGAGGTGG TAGCCATGGTCTCCAAGGCGTACGGGCGTAATAACCAGAGGTCGAGTCTGTATATGTAGTCGAACAGGCGTAGCTATCAGTATATGCTC GCACTAGTTAATTCGAGTGGGTCGGAGCGTGTCGGCTAGGAGGGTGGCTGGGGGCCTCCCCCAGCTTGGTGGCTCAAGGAGTAAGAGAG AAAGCAGAAGAAGAAGGAGGCTTCAAGAAGAAGGGGCTGGAACTCAAGCGTAAGGAATCGTTCCCCCCCTACGGTGGTGACCGCGCCGG 
CGACCCCTACTCCAAGCAGATGCTGAAGAAGCACAAGCCGGCGATGCAGGCTCCCTCCAAGGCCCCCAAGCCCTCCGCCTCCAAGGGCA CGGCAGCGGAGCCCATTCCGGTCGAGGATGCGGGAGGGCCAGAGTGCTTCCACTGCGGGCGCACGGGCCACTACCAGAGCGAGTGCGGC TTCAAGCAACTGTGCGTGCTCTGCAGAAACGAGGGGCATGCCTCGGCCTACTGCCCCACGCGCGACAAGCAGCTCGTGCTCCAGACAAT GGGCCACACCTTTGCGGGTGGCGGTTTCTTCTGCCTATAATACCCGGAGAAGGCGGATGTGGCGGCGGAAGGAATGCTAGGGGCTAACG CGGCCCTGGTGTCGGCGGCGCCCGGCGTCCTGTCCAAGGAGATCCTCGAGGCTGAGCTCCCCCATCTTTTTGAGGGCGAGTGGGACTGG CAGGTGGCACCGTTCGATAGGGACTGCTTCTCTGTGGCTTTCCCGGACCCTGTCATGCTACGGATGGCGACGCGCAGCGGGAAGCTCTT CCTCTCCATCAACAACATCACCGCCGACATCCGTGACGCGGTCCTCGCCGAGCCCAAGGCGCTGTCAATGCCGGAGGTTTGGGTGAAGC TATCAGGTGTGCCTCCGAACCAGCGGCGTGTGGAGCGGCTGATGGCGGCCACGACTATGATCGGCCGGCCGTTGGTGGTGGATGAATTG TCCCTTATCCGTTCGGGCCCGGTGCGTATGAAGTTTGCTTGTCGTGTGCCGGCCAAGCTGTGCGTATCTGTGCAAATCTGGTTCAATGG CGAGGGCTACACCATCAAGCTCGAGCCTGAGGTGGACCAACCGCGCCCGGCGGCCCCTGCTCTCCCGCCCCCGCCGCCGCCCCCTGCGG GTCGGGGCCCGAGGGGTCACGGCAAGGACCAGCGCAAGGACAAGGAGCCGGAGCAAGACGCCAGCATGGAGGAGGACGACTCCATCGAT ACCGCGGCCTGGGACAAGCTCGGGATCTCTCCGTCGGGCCCTGCCGCGCTGGTGGAGGGGGTGGCGGGGCTGCGGGCTGTGCTGGGGTC GGGCCACGAGGGGATGCCAATCTCAATCCCCAACCCCTCGCGGCTCCGTTCGGTGGGATCTCCGGCGTCCGGTGGGCGTGGCTACACCT TGACCGGCTCAGTCAAGCCGATGAAGAAGACGGCGGTGCGCAAGGTCAGCACGGAGGTGGTGCGCACGATGGGACCGCTGGGGGCGCCT CCCATGCCTTGGCGGTCGGTCATGGCTCTGGCGGCCCCAACCTTGCCACCCGTCGCTGACCCGCTGCTGCAAGCGGCGGCCAAGGCGGC GATGATCTCCAAGGCCAAGCGTACAAAGACGGTGCCGGCGGCTCCGGTCCGCACAAGCTCCTGCTCCAAGGGGGCCAAGGGCAACATGC CCTCGCTGCAGCGGGCGCAACTCCTTCAGGCGCAGAAGAACATGGAGATCTCAGGTAATCCTCCGCCTCGCTTCACGGTTCTAGGTTCC TTTTCGGACGATCACTTGCATGAGGTGTTGGAAGCGAGCGGCGTGGTTTCTTCTTGTGAGGGGGTGTCAGGGAGCTGATTTCGCTCGTT CGCACCAAGGAGAGGGCGCAGGCAGCGCTCGCGGCTGCTGCGGCAGCGCTTGCTGACCAGAGGCCCGGTGAGCCCCGAGGGAGGGGCCA CCACGGCACTTTCTGTGCCCGAAGAGTGGGGTCAGGCAACAGGGGTTGCCCCCTCCCTCGCCTTGCCGTGGAAGGGTGAGGTGCGTGGC TAAGGCGGTCACCTCGCGAGGTGTCCGGCTCCGCAATCGCGTCATCTAGATGCGGGCCATCTTTTGGAACATTCGTGGCTTTGGCCACG CGGGGGATTTTCATTTAAGGAGTACATGCGTAGAGAGGACGTCGACATTGTTGGTTTGTAGGAAACAATCAAGGGGGATTTTCGTTTCC 
ATGAGTTGCTCGCGATTGATCCTTTGGAGCGCTTTGAGTGGCAACATGTTCCGGCTGTTGGTCACTCCGGTGGTTTGTTGCTAGGTTTA AATCGCGCTCTGTATGAGATTATCGACTGGGACGTTGGTTCGTTTTTTTATCTCAGCACATTTTAGAGTTAGGGCCTCGCGCCGCGCAT TGGTTGTCATTCAGGTCTACGGGCCGGCCGATCACTCGAGATCGACGGAGTTCTTGGGAGAACTCCAAGCCAAAGTTAACTTCATGATT GAGGCGGCGCTTCCTGTCCTGGTGGGAGGCGATTTCAACCTGATTAGGTCGGGTGCGGATAAGAGTAACAGTAACATAAACTGGCCTCG GGTGGCTATGTTTAATGTTGCGATCGCCTCGATGGCCCTCAGGGAGGTGGCTAGAATGGGCGCTCGATTCACATGGACAAACAAGCAAT TAGACCCGGTACGCTCGGTGCTGGATCGTGTGTTCATGTCTCCGGATTGGGAAATGGTTTTTCCCCTCTACACTCTGGTCGCGGAGACT CGTATTGGGTCGGACCATGTACCACTTGTGTACTCCTCGGGGGAGGATAGAGTGAGACGTAGCCCTCGTTTTTTCTTCGAAACGGCCTA GTTTGAGGTGCCAGGCTTCGAGGAGCTCTTTAGGGAGAAGTGGAGGGCGTGCGTGAGCCAAGTGGTACATGGTCCGACCCAATACGAAA TATGCCCTAGAGGCCCCAAAGTGGCCCGATGGAGTTCTGGAACGCTATCGGCGGACGACTCAGGGCAAGTCTCAAGGGATGGGGCGCTA ACCTAGGGAGGTCTGACAAAGCACATAGAGCCGCGATCCTAGCGGAGATCGCAACGATTGATTCCCAGTCTGATATACGGGACCTTTCT GAGACGAAATGGGCCCACGGATACGACCTGGAAAGTCAGGTTGAAGCGTCGTTGTGTGCGGAGGAAGAATATTGGCATCGCCGTAGTGG TTTGAAATGGATCCTCAAAGGGGATGCCAATACCAAATATTTTCAAGCCTATGCTAATGGCCGCCGTCGGAAATGCTCGATCCTGAGGT TGCAATCGGAGCAGGGGCTTCTGTTGCGACAAGAGGACATCTCTCGTCATATTTATGAGTTCTACATTAAGTTAATGTGGACATGTGGG GACCAGAGGGCTGGGATGAGCGCGGATATGTGGGAAGCCGGACAACGGGTTCTTGACGGCAAGAACGAGGGGCTGGGGCTAGCATTCCT TTCCCGAAGAAATCGATGCCGCGCTGATGGGTATGAAGGCGGACACGGCCCCTGGCCCGGATGGGTGGCCGGTGGCAATGTTTAAACGT TTCTGGCCCTTGCTTAGGGGCCCAATTTTCGAGATCTGCAACGGGTTTATGCACGGTTCGGTAGATATATCGCGCCTTAACTTTGGGGT GTTATCACTCATTCCAAAGGTTCAAGGGGCCAATGACATTAGACAGTTCCGTCCCATTGCGCTCATCACCGTTCCGTTTAAAATTTGTG CCAAAGTGTATGCGACTAGGTTAGTTCCGATTTCCCATCGTGTGATCAATTGCAACCAATCCGTGTTCATCCGTGGTCGCAATATCCTT GAGGCCCCCCTGGCCCTTCAGGAGATGATACACGAACTCAAACGCACTAAGGAACCGGCGGTGCCGTTTAAGCTAGACTTCGAAAAGGC ATACGATCGGGTTAATTGGAATTTTCTCCGTCAGGTGCTACTCAGCTGGGGTTTTTCCGCTGTTTGGGTGCACCGCGTCATGCAGTTGG TCTCGGGAGGACAAACTGCTATTTCGGTGAACGGAGAAGTTGGGCACTTCTTCCGGAATAAACGGGGCCTCAGACAAGGGGATCTATTT TCCCCCCTCCTGTTCAACTTTATTGTTGACGCGCTGTCCTCTATGCTGAGGAAAGCGGCGGAGGCTAGCCATATCAAAGGGTTGGTTGG 
ACATCTCATTCCAGGGGGAGTGACCCACTTGCAGTATGCGGATGACACGCTAGTGCTGTTTCGTCCGGACCTTCATAGCATTGCTGCGG TCAAGGCGATTCTTCTCAGTTTTGAGCTCATGTCGGGCCTCAAAATTAACTTCCACAAATGCGAGGTGCTCTCGCTGGGGATCAAGGCA CATTCCGGATCTGCTCAACTGCAAAGTGGGCAAATTCCCGTTCATTTATCTGGGCCTCCTGGTAGACACCAAACGACCCACGATAGAGG ATTGGGAGCCTCTATGTGCCAAAGTGAGGAATCGTGTATGTCCATGGCGGGGCAAATTTTTGTCGAAAGCAGCGAGTCTAGTGCTCACA AATTCCAGCCTGTCTTCCTTGCCAACGTCCTCCTGGTAGACTGCTTCTTCTTGCAGAAGGGGTCCACGCCAAGTTTGACACGCCTCGTG CCAAGTTCTTTTGGGAAGGAACTAGCCCGAGCCATAAATACCACATGGTTAAATGGGCCTGGGTGTGTCGGCCCAAGGATTTGGGGGGT CTTGGGATCACCAATTCTAGATGGCTTAACATAGCGTTGATGTGCAAATGGATTTGGAAAATCACCCAGGGGGCCTCCGGGTTGTGGGT TGATCTCCTAAGGGCCAAATATTTTCCTAACGGGAACTTCTTTGAAGGGAGGGCAAGGGGCTCACCCTTTTGGAATGATCTGCAGTTGA TCAAACCAGCTTTTTCCATGGGGGCAAAATTTTCGATCAGGAACGGCAGATCCGCGCGATTCTGGACTGATCATTGGATAGGTACCCAA CCCCTTTGGGTCGAATTCCGGGATTTGTACGATATTGCTGACGACACCGCGCTGTCGGTGGCGGATGCGCTTGCCGCGATGCCACCTGA GATTCAGTTTAAGCGGGAACTCAACAGGCCCGAGCAGGCAAGCCTCGCGGCCTTGCTGCAACTAATCGAACCAGTGGGTCTTTTGGACC AGTCTGATTCGGTAAGTTGGGCACTTACTAACTCGGGGAAGTTCTCGGTGAACTCCTTATACCGCAAGTTGTGTCGAGGGCCGACACAA CCAGTGATTGCTGGTTTGTGGAAAGCGCGGATCCCTTTGAAGATCAAACTTTTCATTTGCCAACTGTTTCGCCATAGGCTCCCCACTTC CTTGAACTAGCCAAACGTAATGGACCGGCCATGGGTCCGTGTGCGTTGTGCGGGGAACCGGAGGATGCCAATCATGTGTTTTTCCGTTG GCCTCTAGCGAGGTTCGCGTGGAGTGCAGTCCGGACCGCGGCGGGTGTCGTTTGGGACCCGCGCTCGGCTACCGAGCTTTTCAACCTCC TAGATGCGATTAAGGGCCCTGAATATAGGGTTATGTGGAGTTGTGTGGGGCACTTCTCTGGGCCTTATGGCGAACTAGGAACAAGTTTA CTATAGAAGGGTGTTTTCCAAATCATCCGGCTAATATCATCTTCAAATGCAACCTCCTATTGCAGCAGTGGAGTCCGTTGGGAAGGCGC AAGGATGCTGAGTTGATCAAGATCGCCCAACAATGACTGGTGCAAGTATATACAATGTTTAGGGAGTCATGACTTCGGTCCCTTTTGGT TGCGTGATGAGCCTGCGTGCTTTGTACGGGCCTTGCCTTGTAACCTTGATTGGTGGCTTGTTAAGTTTCGTTTCTACTTTCGTGGTCGA GCCGATATGGCTGTTGGTGATGTATTAAGACTTGGTATGTGGTACGCTGCTGTTGGGGCTTTATTAATCTAAAGCTGGACGTATCTGGC GTCTTCGTTCTAAAAAAGGAATTAGCCGAAATCACTGGGAAGCAGTCAATCAATTCAGGATCGGAAAGAATACCATGATAACGCTATCG TGGGCTACCAACTACGTGTGCATCTCAGGCCTTGTTTCTCCCGGAGCTTGTGCCAAGGGCTCACTTTTTTTCCCAGAAACTGCCATGGG 
CTAATTATGTTCGTGGGAGGGGGCTCCTAGCTTTCGCGGGCCCGGGGCGGCCGCCCCTGCTGCCCCCCCACGTAGGGTTTTACCTCCAT CAAGAGGGCCCGAACCTGGGTAAAACATTGTTTCCCTTGTCTCCTGTTACCATCCGCCTAGACGCACAGTTCGGGACCCCCTACCCGAG ATCCGCCGGTTTTGACACCGACATTGGTGCTTTCATTGAGAGTTCCTCTGTGTCGTTGCTTTTAGTCCCGATGGCTCCTTCGATCATCA ACAACGATGCAGTCCAGGGTGAGACTTTTCTCCCCGGACAGATCTTCGTCTTCGGCGGCTTCGCACTGCGGGCCAATTCGCTTGGCCAC CTTGAGCAGATCGAAAGCTACGCCCCTGGCCATCAGGTCAGGTTTGGAAGCCTAAACTATACGGCTGACATCCGCGGGGACTTGATCTT CGACGGATTCGAGCCACAGCCAAGCGCGCCGCACTGTCTCGATGGGCATGATATAGCTCTGCCGCCGAACAGCGCTTTGGAGGCCGCAC ACACACCGGTTCCGACCATTGATTCGGAGCCTACTGCGCCGATCGAGGATCAGCGGTTGGACGTTGCCTCAGGGGCTGCGATCTCAGAG GCGATCGAGCCGAACTCGAACCCCGCACTCCGCATGGCCCGTGACTCCGAGGAGCCGGATTCCTCTCCGAACTCCGAGCCCCCCGCGCC CCTGCCGATCGAATCCGATTGGGCGCCGATAATGGAGTTCACCGCCGTGGACATCTTTCAGCACTCGCCCTTCGGCGACATCCTGAATT CTCTAAAGTCTCTCTCTTTATCAGGAGAGCCCTAGCCGGACTACGGTCAGCAAGGTTGGGATACGGACGATGAAGAAATTCAAAACCCA CCCACCACCCACTTCGTAGCCACTGTCGACGACTTAACCGACATGCTTGACTTCGACTCCGAAGACATCGACGGTATGGACGACGATGC AGGAGACGAACAAGAACCAGCACATGTAGGGCGCTGGAAGGCCACCTCGTCATATGACATATATATGGTGGACACTCCAAAGGATGGAG ACGGCGATGGAATAGCGGGGGACGATACCTCTAAGAAACAGCCCAAGCGCCGGCGTCAGCGGCGCCGCTCTAAATCCCGCCAAAGGAAA AACGGTGATTCCGGCACGGGAGATAATACTACTCCGGATAGCACCGAAGAACACCCCCTCCAGCAAGAGTCAGCACAGGAGGACAGAGA AGCCAGCCCTCACGAGAGGGTGGCGGACAAAGAGGTTGAGGACGATAATCATATGCCTCCCTCCGAAGACGAGGCAAGCCTCGACGACG ACGAGTTCGTCGTGCCAGAGCATCTCGCCGAACAAGAGCGTTTTAAACGCAGGCTTATGGCCACGGCAAGCAGCCTCAAGAAAAAGCAG CAACAGCTTAGAGCTGATCAGGATTTGCTAGCTGATAGATGGACTGAAGTCCTTGCGGCCGAAGAGTATGAACTCGAACGCCCCTCCAA GAGTTACCCAAAGCACAGGCTGCTACCCCGACTAGAGGAGGAAGCACCTACATCACCAGCGCATGACATGGCCGATCGGCCACCTCGTG GCTGCGACAGAGAGGCCTCTCGGCCCTCCACTCAAGCCATGCCCCGGCACCGCGTCAAGCATACTAAGGCACGGGAAAATGCGCCCGAC CTGCGCGACATACTGGAGGACAAGGCAAGGCAAACAAGATCGATCTATGGATCGCGCAGGCACCCCACGGCACGTGACGGTGACCGTCA CTCCGGATGCAATGAATCCAGCCGGGCCGAACTCAACAGACAAAGCTCCTTCAAGCTGCGTCGTGATATAGCCCAATACAGAGGCGCCG CACACCCACTATGCTTCACAGATGAAGTAATGGATCATAAAATCCCTGACGGTTTCAAACCCATAAACATCAAATCATATGATGGCACA 
ACAGATCCTGCGGTATGGATCGAGGATTATCTCCTTCATATCCACATGGCCCGCGGTGATGATCTACACGCCATCAAATACCTCCCACT CAAACTTAAGGGACCGGCCCGGGATTGGCTTAACAGCTTGCCAGTAGACTCAATCGGTTCTTGGGAGGACCTGGAAGCCGCATTCCTTG ACAACTTCCAGGGCACTTATGTGCGACCACCGGACGCCGATGACCTAAGCCACATAATTCAGCAGCCAGAGGAATCGGCCAGGCAATTC TGGACACGGTTCCTAACAAAGAAAAACCAGATAGTCGACTGTCCGGACGCAAAGGCCTTAGCGGCCTTCAAGCATAATATCCGTGATGA GTGGCTTGCCCGGCACCTGGGACAGGAAAAGCCGAAATCTATCGCAACCCTCACGACACTCATGACCCGCTTTTGCGCGGGAGAAGACA GCTGGCTAGCTCGCAGCAACAACTTAACCAAGAACCCTGGTAATTCGAATACCAAGGACAAAAGTGACAGGTCGCGTCGGAACAAACAA AAGCCCCGCATTAACAGCGACAGCAATGAGGATACGACAGTTAATGCCAGATTCCGAGGCTACAAACCCAGTCAACGGAAAAGGCCATT CAAAAGAAATACTCAGGGCCCGTCCAGTTTGGACCGAATACTCGACCGCTTGTGCCAGATACATGGCAACCCCGAAAAGCCAGCCAATC ACACCAACAGGGATTGTCAGGTGTTCAAGCAGGCAGGCAAGTTAAGAGTCGAAAACAAAGACAAGGGGCTGCATAGCGACGACGAGGAG GAGCCCAGGCCGCCGAACAACAATGGACAAAAGGGATTTCCCCCGCAAGTGCGGACGGTGAACATGATATACGCAACCCACATCCCCAA GAGGGAGCAGAAGCGTGCGTTACGGGACGTATATGCGATGGAGCCAGTCGCCCCAAAGTTCAACCCATGGTCCTCCTGCCCGATCACCT TTGATCGAAGGGACCACCCCACTAGCATCCGTCACGGTGGCTTCGCCGCATTGGTTCTCGACCCAATCATTGACGGATTTCATCTCACA AGAGTCCTCATGGACAGCGGCACAGCCTGAACCTGCTTTACTAGGATACAGTGCAAAAAATAGGCATAGATCCCTCGAGGATCAAGCCC ACCAAAATGACCTTTAAAGGTGTCATACCAGGTGTAGAAGCCAACTATACAGGCTCAGTTACATTGGAAGTGGTCTTCGGATCTCCGGA TAACTTCCGAAGCGAGGAGTTAATCTTCGACATAGTCCCGTTCCGTAGTGGCTATCACGCACTGCTCGGGCGAACCGCATTCGCAAAAT TCAACGCGGTACCGCACTATGCATACCTCAAGATCAAGATGCCAGGCCCTAGAGGAGTAATCACGGTCAATGGGAACACTGAATGCTCC CTCCGAATGGAGGAGCACACGGCAGCGCTCGCAGCAGAAGTACAAAGCAGCCTCTCTAGGCAGTTCTCCAGTTCGGCCTTCAAAAAGCC GGACACTATCAAGCGCGCCCGGAGTACCCCACAACAAGACCGCCTGGCATGTTCTGAGCTAGCGTAGCAATGCGGCCCCAACCCTAGCC CTCGCGATATAGCGAAACCAGTGCTTCACATACATAACTACGCTCTTGAAATACCATGGGCACAGGGGAAGGGGCACTATCACGGCACG CCCGAAATACGGCTTAAACCGCACCAGGGGCTGCCGGATTCTTTTTTTTTCTTTTACTCTCAGGACTCCATACTTCGGACGACCCGTTC GGCAATTCAACTGCCACACAAACGATGCAAGACCCAGGAAAGCAGACAAGCCACGCCGCATTATGGAACTCCCAGGTGGTCTCTATTGC GAGCAGTATACCTATTTTTTAATACAATTCCGCGGCCTGCCCCTGGCCAAGACATGTAAATAGTCCAATTTCTTTTGCTTATCGCACTA 
TTTGTATCGTTCCGCTTTCATAGCAGCCTTTCTATAAACAATGCATAGCTTTTTGTCTATTTTTTGCATTGTCCTCTTTTTATATATAT GTTTATTAATAACATGTTGCATCCATACACTGTGGCACGGCAAAAATACGCCAGGGGCTTTAGTACCCATCAATATGGCGTGAGAAGTC CGTACACTTTCACAAGTGCGGCACCCCGAACTTATAGCACTATATGCATTGGCTCCGAATCATGATTTGGGTCAATAGTTGGGTTTGCC TGGCTCCTATGTTTTGGTGCCTTACGTTCCGCTATATCGGCTAAGGTAGCACTAGGAGAACTACTGCGATTGTGCCCCAGTTGAGCTGG GCTGAGCACCTTAGTAGAGAAAGCTAAAACTGACTGTCATGATGAGGCGAGAGACCGGTCGCTGTTCGAGAGGTTTTTTCGAGTCCTTA AAGACTTATGCTGCTTCGAGCGAGGAACCGGCTTTGTCCGGCCAAGGCGTGGATAGCGCCCCGAACTCGGTCTTCCGAATACTAGGGGC TTCGCTGAAATTTTAAAATTATAGAGTTCTATGGCTAAGTGAGAGTGTTCAAGCATTATACTCCGATTGCCTTGTTCGTTGTGCTGAGT GCCTCCCTCGACGGACCCAATCATGGGAAAAAGAGCGCTCGGGTTTATCCCGAACACCCCAGCACTAGTGGCATGGGGGCAGAAGCCGA CGAGTGGCCATCTCTCAATTTTTTGATAAACGGCCACACAGAAAGTAATATTTTAAATTCAAGCATTGCTTAGCGCATATGAACAAGTT TTCAGCGCACAGGATAACACGAGCGAGTTCATTCAAAAATTACATCCTTGGTACATTCATCCGCCATAAGGCGGGCACCAGCCAGAACA TTCTTGTAATAGTTCTCGGGCTTGCGATGCTCCTTCCCCGGCGGCGGCCCGTCCCTCACAAGCTTCTCACCGTCCAGCTTACCCCAGTG CACCTTTGCACGGGCAAGGGCCCTACGGGCACCCCTTGATGCAGACGGAGCGCTTGATGACCTCGAGCCTTGGACAAGCCTCCACCAGC CGCCGCACCAGTCCGAAGTAGCTCCCAGGCAGGGCCTCTCCAGGCCATAGCCGAACTATGAAGCCCTTTAGGGCCTGTTCAGCCGCCTT GTGGAGCTCGACCGGCTACTTCAGCTGGTCGCTCAAGGGCACAGGATGTCCGGCCTCAGCGTACTGAGACTCGGCTCAGTAGAATGCGG CGGCATCGTATACGCTGCGGGGAAGATCTGCGAATGCCCCTGGAGAGCTCCGGATTCGGGTAAGTAACAAGTAGTTTACTTTTATATGT CTGCTTTGCATGGAGAATGCCTTACCCGCCGCTATCTTTTTCACCGCATCCAATTCTTGGAGGGCCTTCTGGGCTTCGGCCTTGGCAGA TTTGGCAGTTTTAAGGGACGCAGCAAGCTCGGACGCTCGCGTCTTCGAGTCAAGCTCCAAACTCTCATGTTTTTTCATGAGAGCCTGAA GCTCTTGCTGCACCTCGCCGACCTGTGCCTCAAATTTCTCCCGCTCGGTGCGCTCCGTGGCCGCCTTCTTCTCGGCCTCGGACAACGCC TGCTTGAGGGTCGCCACCTCAGTTGTGGCCCCTACAATAAGCAGTGTACTCCTGTCATTTTTTGCAATTGCGTCTTCTTATAGGCATTT TTTTCTATAAGGTATCTCTTACCTTCTTTCTCCTCGAGCTGCCGTTTTGCAAGGCCGAGCTCTTGCTCGGTCCGCTCGAGGTCCTGCTT CAGGGCACCCACCTCCGCAGTCAGTGCGGCGGTGGCCAGCAGCAAAGCCTGCATACGTATATTGACTCCTTTTTAGTTAGACTCCTGTG ATATTTAATAGATCCTCTATTCGGCTTTTCTTTCCGAACGCCAAACAGAGCATCAGGGGCTACTGTCTATGCGGTAATATTTTTACATA 
TTTTTTACTTACCTCGAAGCCTGTTAGAAGGCTAGCACAAGCTTCAGTCAGTCCGCTTTTGGCGGACCGAACCTTCTGGACCATCGTAC TCATGATGGTACGGTGCTCCTCGTCGATGGAGGCGCCCTTAAGCACCTCCAACAGATTGTCCGGCGCCTCCGGATGGACGGAGGACGCT GGCTCAATAGGCTTGCTCCTCTTGGCAGGAGTTCGCCTGCCGCGGTCCGGAACCGTTGATGATTCCGGCGCGGTGTCCGGCTTAAAGCC GGACTTGGAGCCCTGGGGGGTCTTGTCCCCTTCACTCCTGGAGTCCGGGAGGTCGCCTCGTGGCTCCTCCTTCAAGTGAAGGAAATATG CCCTAGAGGCAATAATAAAGTTATTATTTATTTCCTTATATCATGATAAATGTTTATTATTCATGCTAGAATTGTATTAACCGGAAACA TAATACATGCGTGAATACATAGACAAATAGAGTGTCACTAGTATGCCTCTACTTGACTAGCTCGTTAATCAAAGATGGTTATGTTTCCT AACCATGAACAAAGAGTTGTTATTTGATTAACGAGGTCACATCATTAGTTGAATGATCTGATTGACATGACCCATTCCATTAGCTTAGC ACCCGATCGTTTAGTATGTTGCTATTGCTTTCTTCATGACTTATACATGTTCCTATAACTATGAGATTATGCAACTCCCGTTTACCGGA GGAACACTTTGGGTACTACCAAACGTCACAACGTAATTGGGTGATTATAAAGGAGTACTACAGGTGTCTCCAATGGTAGATGTTGGTTA GGGTCTGTTTGATTCAAAGGATTTTCATAGGATCTTTGAAGGATTAGAATCCTTAGGAATTTTTCCTACGTTGGTCGTTTGATTCGTAG GATTGAATCATGTAGAATATTTTCCTAAGGATTCATTTGTACTACGTTTCACAGGAATTATAACATGCACTCCAACCTCTTGAAAGAAA TCCTTTGTTTTTCATGTGACACAATCAAACAAACTCAAATCCTATAGGGATCCAATGAACATGCCATTCCAATTCTACTTTTTTCCTAT TCCCGTGTTTCTGCAATCCTATGAATCAAAGAGGCCCTTAGTTGGCGTATTTCGAGATTAGGGTTTGTCACTCCGATTGTCGGAGAGGT ATCTCTGGGCCCTCTCGGTAATACACATCACATAAGCCTTGCAAGCATTATAACTAAGATGTTAGTTGTGAGATGATGTATTACGGAAC GAGTAAAGAGACTTGCCAGTAACGAGATTGAACTAGGTATTGGATACCGGCGATCGAATCTCGGGCAAGTAACATACCGATGACAAAGG GAACAACGTATGTTGTTATGCGGTCTGACCGATAAAGATCTTCGTAGAATATGTAGGAGCCAATATGGGCATCCAGGTCCCGCTATTGG TTATTGGCCGGAGACGTGTCTCGGTCATGTCTACATTGTTCTCGAACTGTAGGGTCCGCACGCTTAACGTTACGATGACAGTTATTATG AGTTTATGCATTTTGATGTACCGAAGGTTGTTCGGAGTCCCGGATGTGATCACGGACATGACGAGGAGTCTCGAAATGGTCGAGACATA AAGATTGATATATTGGAAGCCTATGTTTGGACATCGGAAGTGTTCCGGGTGAAATCGGGATTTTACCGGATTACCGGGAGGGTTACCGG AACCCCCCGGGAGCCAAATGGGCCTACATGGGCCTTAGTGGAAAGGTGAAAGGGGCTGCCATGGAGGGCTGCGCGCCTCCCCCCCTCCC CTAGTCCTATTAGGACTAGGAGAGGTGGCCGGCCACCTCTCTCTCTCTTTCCCCCTTGGAGTCCTAGTTGGAATAGGATTGGAGGGGGG AGTCCTACTCCCGGTAGGAGTAGGACTCCTCCTGCGCCTCCCTTGCTTGGCCAGCCAGCCCTCCCCCTCTCATCCTTTATATACGGGGG 
CAGGGGGCACCTCTAGACACACAAGTTGATCCTTGAGATCATTCCTTAGCCGTGTGCGGTGCCCCCTGCCACCAAATTCCACCTCGATC ATACCGTTGTAGTGCTTAGGCGAAGCCCTGCGTCGGTAGTACATCAAGATCGTCACCACGCCGTCGTGCTGACGGAACTCTTCCTCGAC GCTTTGCTGGATCGGAGCCCGAGGATCGTCATCGAGCTGAACGTGTGCTAAGAACTCGGAGGTGCCGGAGTAACGGTGCTTGGATCGGT CGGATCGGGAAGACGTACGACTATTTCCTCTACGTTGTGTGTGATCGCTTCCGCAGTCGGTCTGCGTTGGTACGTAGACAACACTCTCC CCTCTCGTTGCTATGCATCACCATGATCTTGCGTGTGCGTAGGAAATTTTTTGAAATTACTACGTTCCCTAACATCAAGGGACCACCTC CTCTTGCCCCGGAGATTGTCGCGACGCCACTTTGGCGTCAGCGCGGCACAGGGGGAGGCAGCGGGCGGAACTGAATTCACGTCTGATGA ATCCAGGGAGCCGCTCGACAACTCGTCGAGCCCGTCCTTGGGCGGGCTGCATGATTATATTCGACATTAGGGAAAGTTGTGCAACAAAA GGAATATCATGAGTTACTCTGGTATCCGAACACTTACGATCTCGCCAGACGCTTGGCCCTATCCGGCCACTCCTCTTCGTCCTCGTCGG CGTCGGGGGCGTAGTCCGGCGGAGGGGTTTTCCTCTTCTTGGACCCTTCGGCGCCCCCTATTGGGGCGGCCTTCCTCTTGTTTCCTCCC CCCGCTGGAGGGGGAGAGGTTTCTTCTTCCTCCTCCTCGTCTTCACGGGAGGAGTGCGTCTCGGACTCGTCGGACGACGAGTCCGACAC CACCATATGCCGGGAACTCTTTTGAGTCCCCGTGGCCTTCTTCTTCTTGGCCTTCTTCTCCGGCACCATATAAGGTGCCGGAACCAGCA GCTTCACTAGGGGAGCAGGGGCTGGGTCTTCGGGCAAGGGAGCCGGACAGTTAATCAGTCCGGACTTCGCCTGCCAAGCCTGTCAAAGG TGAGGGAGTTTAGATCCCGCATAGAGTCAAACTATGAGAAAACTTAACATCCTGTAAAAGATGAAAATATCTTACCTCGCCAGCAGGAC GCTGCGAGCTGAATCCGCGGTCTTCGGAGGCGGATGTGGGAGCCTCGGCGCCTTTGGTTAGCCCCTTCCAGACATCTTTGTACGTTGTG TCGAAGAGCCTGCTCAGAGTCCGGTGGTGCGCCGGGTTGAACTCCCACAAATTGAAGTCGCGTTCTTGACACGGAAGGATCGGGCGGAC GAGCATGACCTGGACTACGTTGACAAGCTTGAGCTGCTTGTTCACCTAGGGACTGGATGCATTTTTGCAGTCCGGTCACCTCTTCTTCG TCGCCCCACGACAAGCCCGTCTCCTTCCAGGACGTGAGCCGTGTAGGGGGTCCGGATCGGAATTCAGGGGCTACGATCCACTTTGGATC GCGTGGCTCGGTGATGTAAAACCACCCTGACTGCCAGCCCTTCAAGGTCTCCACAAAGGAGCCCTCGAGGCATAGGACGTTGGCTATTT TGCCCGCCATGGCACCTCCGCACTCCGCCTGGTTGCTGCGCACCACCTTTGGCTTGACGTTGAAGGTCTTGAGCCAGAGGCCGAAATGG GGGTGGATGCAGAGGAAAGCCTCGCACACGACGATAAACGCCGAGATATTGAGGATGAAGTTCGGGGCCAAATCGTGGAAATCTAGGCC ATAATAGAACATGAGCCCACGGACAAATGGGTGGAGAGGAATACCCAGTCCGCAGAGGAAGTGGGGAAGAAATACTACCCTCTCATGGG ACCTTGGGGTGGGGAGAAGCTACCCCTCCTCAGGAAGCCGGTGCGCGATGTCTTCGGACAAGTATCCGGCCTTCCTTAGCTTTTTGACA 
TGGCCCTTCGTGACGGAGGAGACCGTCCACTTGCCTCCCGCTCCGGACATTGTTGGAGAAGATTGAGGTAGGAAGTGCGGGCTTGGGCG CTGGAGCTCGGGTGGGCAAAGGAGGAAGAAGGCGTAGGTAAAAAGGTGGATCCTTATCCCCTTATATGCGCGGATGCGACTACGCGTCC CCACCAGCCTAGTAAAACTCGCTTGCCTCCCAAGCGTCGTGATAAATTGCACGGTTGGGTTACCCACGTCCGTATTGATGAGAATCCCG TAAATGGGGGACACGATCTCTGCTTTGACAAGACGTGCCAAGGAAACCGCCTCGCAAAACACGCTGAGGTGGAAAAGTGAAAACGATTC GAATAAAGGCTTGGCCGTAGTGTGATGTCACGCTGCGGAATACGTCAGCAGATTAGATTTGTGTTAATATTATTCTCTCTGTGGCAATA CGTGGAAACTTATTTTGCAGAGCCAGACACTACTCTTGGTGTTTACAAACTTTTATGAAGAATTTGGAGGAGGAACCCGCCTTGCAATG TCGAAGACAATCTGCGCGTCGGACTCGTCGTCATTGAAACCTGGTTCAGGGGCTATTGAGGGAGTCCTGGATTAGGGGGTGCTTGAGTA GCCGGACTATACCTTCAGTCGGACTCCAGGACTATGAAGATGCAAGATTGAAGACTTCGTCCGTGTCCGGATGGGACTTTCCTTGGCGT GGAAGGCAAGCCTGGCGATGCGGATATTCAAGATCTCCTACCATTGTAACCGACTTTGTGTAACCCTAACnCCCTCCGGTGGTCTATAT AAACCGGAGGGTTTTAGTTCGTAGGACAACTTCATCATACAACAATCATACCATAGGCTAGCTACTAGGGTTTAGCCTCCTTGATCTCG TGGTAGATCTACTCTTGTACTACCCATATCATCAATATTAATCAAGCAGGACGTAGGGTTTTACCTCCATCAAGAGGGCCCAAACCTGG GTAAAACATTGTGTCCCTTGTCTCCTGTTACCATCTGCCTAGACGCACAGTTCGGGACCCCCTACCCAAGATCCGCCGGTTTTGACACC GACAGTAGCTGATTTGGAAGCGTCTTCATGATACCGAGCTCCCACACAGGAATGTGTGGGAGGACAGGGCACACGAGTATCCAGGATTT TGTGTGATGTTTTTTCAGAAATGATTATTTATTCGCTCTCGCGTTCTTGTGGATGTTGTTGGCATATATACACGTGAGATTACAGGTAT GTAAAATGACAAAAGATCTGATGAAAACCATCCTTACATATGTTGGAGGTGTGATATTGTACAAATTTGCAAAATTAACGGATGCCTCA CAGAAGGCAGATTGGCTTTATTTTTAACGACAGATTGGCTCTTAATTTAAAGGCATTCAGCACGTAACGATTCGATAGATGTTTAGGGT TAAAATCAACTTTGAGACGCACGATTTGTAAGAAGGCACGACAGTTTGTAAGTTGGGCTGGCTGGGCTCTAACCTAAGGAAGGAAGAAT ATAGTACATGTATATATACAGTTGTTTTTGCTTGCAACTATCTCTTGCAATCAAGGATCGATCTAGCTGTTAATCTAGCTAGCTAGTAA AGATAGACCTAATTAAGAAGTCGATGAGATTAACAGCTAGATCGACTTCTTAATTAGTTGGGCTGGGCCCTAACCTAAACCTAAGGAAG GAAGGATAGATCGCGGCCCACGGCCCACCCCCACCTTAATTTCTCATTCTCTGATTCTTCTCCCTGGCGATCGCTGGCGGCTCTTCTTG GGAGGAGGCTCAGTGGCTCCGTGCTCCAGCCAACGCCTGTTGGCCTGCAGTTCTTGCCGAGTATATTTAATACTTAGCAGTATTGAGCA GCCTAGCTATGTAGTCTAGTATAATTTACAAAACAAATCACAGTATAAATTAAATATTACCAGAGGATAACTTGCTCGTCCCTCCAGCT 
AGTTTGCTGTTGCAGTTGCGTCGATACCCTCTTCG

4.1 Use Dotter to align the complete sequence with itself. Present a picture of the alignment and describe what you see.

4.2 Use TREP to annotate the retroelements. Include a picture of the results.
- Highlight each retroelement with a different color (one color per element).
- Where appropriate, mark the Long Terminal Repeats (LTRs) of each retroelement by making them bold and underlined with the same highlight color as the element that they belong to.
- Find the inversions delimiting the LTRs and indicate them in the sequence with bold, red, underlined letters with the same highlight color as the element that they belong to.
- Highlight in yellow Host Duplications (HD) or Target Site Duplications (TSD) flanking the complete retroelement.
- Create a text box next to each element (or insert a comment on the first letter of the element) and annotate its name and type of element.

4.3 Interpret the Dotter results in light of your final annotation from the TREP results. Explain the order of insertion of any elements present in your sequence.

4.4 Annotate the genes you predict in the non-repetitive regions of this sequence. Show your work. In different colors from any repeat elements, highlight each gene you find in the sequence using one color per gene. Identify in red font the Start (ATG), Stop (TGA) and splice site (GT or AG) locations in the sequence. Create a key for your highlighting scheme (e.g. GENE1, GENE2, ELEMENT1, ELEMENT2).

4.5 Briefly list the steps (in order) that you took to annotate this sequence.

4.6 Provide the translated proteins of any genes you predict.

5. 25 Points
Using PreGAP4 and GAP4, assemble the 14 sequences from potential mutants of a particular gene.
In the ‘Configure Modules’ tab of Pregap4:
- Sequencing Vector Clip: Unselect this option.
- Screen for Unclipped Vector: Unselect this option.
- Cloning Vector Clip: Unselect this option.

5.1 Answer the following questions: Were all the sequences provided used to perform the assembly? How many contigs were created? What is the length (bp) of each contig? 5.2 Considering the reference sequence below, identify which reads (individuals) contain the following mutations (Hint- check the trace files): WT Position Number 1 C 508 Number 2 G 497 Number 3 C 1014 (Mutation positions are numbered from the Start codon) Mutation T A T Individual >PHD_genomicDNA ATGGCCGGTAGGGATAGGGACCCGCTGGTGGTTGGCAGGGTTGTGGGGGACGTGCTGGACCCCTTCGTCCGGACCACCAACCTCAGGGT GACCTTCGGGAACAGGACCGTGTCCAACGGCTGCGAGCTCAAGCCGTCCATGGTCGCCCAGCAGCCCAGGGTTGAGGTGGGCGGCAATG AGATGAGGACCTTCTACACACTCGTACGTACACAGTCACTATCTAATGCCAATTTATCTCTGAAAGTGCTCACCACACGCACATGATCG ATCGAGCTCGATCTATAGTACGTGAGGGAAATTGATTTTCGATGCTTCTGTTCACATGTTTGCCTCAGCAAGCACATGACTAATGCTCC ATCTTGCATATGTCTCTGTGCCCTCTGGTGTTGATCATGATTTTTCTATGCTTCTTCTATGTTCGGGGAGCATTTATTTTTTATGCTTC TCTTGACATGTTTCATGTTTGTCCTAGCAAGCACACGAGTAATTAAAGCTCGATCTTAAATACTCTCTCCGTCCGAATAAATGTACTTC TAGCTTTTGTCTTAAGTCAAAGTTTTAAAATTTTGACCAACTTTATAGGAAAAAGTAGCAGCATTTATGACACTAAATTAGTATCACTA GATTCGTTTTGAAATGTATTTTCATAATATATCAATTTGATATTATATATGTTACTACTTATTTGTATATAGTTGGTCAAAGTTTTAAA ACTTTGACTTAGGATAAAAACTAGAAGTACACTTATTCGTGGACGGAGGGAGTATATGCTTATGTAGGTAGTACTCTCTACTTTGATCA TGATGTGCACGCGTTTACTGCCCGCAGGTGATGGTAGACCCAGATGCTCCAAGTCCAAGCGATCCCAACCTTAGGGAGTATCTCCACTG GTAAGTACTAAATTTGTAACTCAGTTGAATAATTTCTCTGTCCCTAGATATACACACTAGCTCATGTGTGCGTGTGTGTGTCTACATGT GTGTGCAGGCTTGTGACAGATATCCCCGGTACAACTGGTGCGTCGTTCGGGCAGGAGGTGATGTGCTACGAGAGCCCTCGTCCGACCAT GGGGATCCACCGCTTCGTGCTCGTACTCTTCCAGCAGCTCGGGCGGCAGACGGTGTACGCCCCCGGGTGGCGCCAGAACTTCAACACCA GGGACTTCGCCGAGCTCTACAACCTCGGCCCGCCTGTCGCCGCCGTCTACTTCAACTGCCAGCGTGAGGCCGGCTCCGGCGGCAGGAGG ATGTACAATTGA 5.3 Which of the three mutations would you expect to see in the protein? _______ Generate the mutated cDNA and protein sequences for this mutant and use it with BLASTp on the Protein Data Bank database. 
Use the structure link at the right of any significant hit and navigate to the Cn3D link to view the database protein structure aligned with your mutant protein. Take a screenshot showing the location of the mutated amino acid in the worm-style rendering with secondary-structure coloring.

Mutant cDNA

Mutant Protein

Cn3D Image
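
Questions 1.3 and 5.3 both come down to the same operation: apply a single-base change at a 1-based position and compare the affected codon before and after. The sketch below illustrates that step only, under stated assumptions: the input is an intron-free coding sequence that starts at the ATG and contains only A, C, G and T; the standard genetic code applies; and the function names `translate` and `point_mutation_effect` are hypothetical helpers, not part of any course tool. A genomic sequence with introns, such as the reference in 5.2, would first have to be reduced to its coding sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna):
    """Translate from the first base; stop at the first stop codon (not included)."""
    cdna = cdna.upper()
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        aa = CODON_TABLE[cdna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def point_mutation_effect(cdna, position, ref, alt):
    """Return (codon number, wild-type residue, mutant residue).

    `position` is 1-based and counted from the A of the start codon.
    Raises ValueError if the base at that position is not `ref`.
    """
    cdna = cdna.upper()
    index = position - 1
    if cdna[index] != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {cdna[index]}"
        )
    mutant = cdna[:index] + alt + cdna[index + 1:]
    start = index - index % 3  # first base of the affected codon
    wt_aa = CODON_TABLE[cdna[start:start + 3]]
    mut_aa = CODON_TABLE[mutant[start:start + 3]]
    return index // 3 + 1, wt_aa, mut_aa


if __name__ == "__main__":
    toy = "ATGGCCTGA"  # toy sequence, not taken from the exam
    print(translate(toy))                          # MA
    print(point_mutation_effect(toy, 5, "C", "T"))  # (2, 'A', 'V')
```

The reference-base check is deliberate: if the stated wild-type base does not match the sequence, the position is probably being counted from the wrong origin, and failing loudly is safer than reporting a residue change for the wrong codon.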