Biophysics 101, Fall 2009
Assignment 3
Due: 9/24/2009 6:00 PM
Please use Python to answer the following questions. Include your script in your assignment, along with
any output you get from the script/commands.
p53 (protein 53) is an important cell regulator and a tumor suppressor that helps
prevent cancer. Below is a segment of the p53 gene (GenBank # X54156.1):
p53seg
cggagcagctcactattcacccgatgagaggggaggagagagagagaaaatgtcctttag
gccggttcctcttacttggcagagggaggctgctattctccgcctgcatttctttttctg
gattacttagttatggcctttgcaaaggcaggggtatttgttttgatgcaaacctcaatc
cctccccttctttgaatggtgtgccccaccccccgggtcgcctgcaacctaggcggacgc
taccatggcgtagacagggagggaaagaagtgtgcagaaggcaagcccggaggcactttc
aagaatgagcatatctcatcttcccggagaaaaaaaaaaaagaatggtacgtctgagaat
gaaattttgaaagagtgcaatgatgggtcgtttgataatttgtcgggaaaaacaatctac
ctgttatctagctttgggctaggccattccagttccagacgcaggctgaacgtcgtgaag
cggaaggggcgggcccgcaggcgtccgtgtggtcctccgtgcagccctcggcccgagccg
gttcttcctggtaggaggcggaactcgaattcatttctcccgctgccccatctcttagct
cgcggttgtttcattccgcagtttcttcccatgcacctgccgcgtaccggccactttgtg
ccgtacttacgtcatctttttcctaaatcgaggtggcatttacacacagcgccagtgcac
acagcaagtgcacaggaagatgagttttggcccctaaccgctccgtgatgcctaccaagt
cacagacccttttcatcgtcccagaaacgtttcatcacgtctcttcccagtcgattcccg
accccacctttattttgatctccataaccattttgcctgttggagaacttcatatagaat
ggaatcaggatgggcgctgtggctcacgcctgcactttggctcacgcctgcactttggga
ggccgaggcgggcggattacttgaggataggagttccagaccagcgtggccaacgtggtg
1. GC content is the % of G and C nucleotides in a sequence and is a marker to distinguish
genomes among various organisms. Please determine the GC content of p53seg.
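One possible way to compute GC content is sketched below. It assumes Python 3, and the function name gc_content is illustrative rather than part of the assignment. The short example string is the one used in question 2; p53seg would be passed in the same way.

```python
def gc_content(seq):
    """Return the percentage of G and C bases in seq (case-insensitive)."""
    # Drop whitespace so a sequence pasted over several lines still works.
    seq = "".join(seq.split()).lower()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("g") + seq.count("c")
    return 100.0 * gc / len(seq)


if __name__ == "__main__":
    # 5 of the 8 bases are G or C.
    print(gc_content("ATGGGCCT"))
```

Lowercasing the input first means the same function accepts either the lowercase p53seg text or uppercase sequences.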
2. The reverse complement of a DNA sequence is the DNA sequence reversed and then taken
with its complementary base pairs. For example, the reverse complement of the sequence
ATGGGCCT is AGGCCCAT. Determine the DNA reverse complement of p53seg.
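A minimal sketch of the reverse complement, assuming Python 3 (str.maketrans and str.translate) and a sequence that contains only A, C, G and T. The names COMPLEMENT and reverse_complement are illustrative.

```python
# Translation table mapping each base to its complement, in both cases.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    # The example from the question text.
    print(reverse_complement("ATGGGCCT"))
```

Characters outside the table (for example N) are left unchanged by translate, so ambiguous bases would need an extended table.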
3. To determine the protein sequence from the DNA sequence, use the standard codon table
below to convert each trinucleotide (codon) to its one-letter amino acid
code.
Note: “*” means stop codon.
standard = {
    'ttt': 'F', 'tct': 'S', 'tat': 'Y', 'tgt': 'C',
    'ttc': 'F', 'tcc': 'S', 'tac': 'Y', 'tgc': 'C',
    'tta': 'L', 'tca': 'S', 'taa': '*', 'tga': '*',
    'ttg': 'L', 'tcg': 'S', 'tag': '*', 'tgg': 'W',

    'ctt': 'L', 'cct': 'P', 'cat': 'H', 'cgt': 'R',
    'ctc': 'L', 'ccc': 'P', 'cac': 'H', 'cgc': 'R',
    'cta': 'L', 'cca': 'P', 'caa': 'Q', 'cga': 'R',
    'ctg': 'L', 'ccg': 'P', 'cag': 'Q', 'cgg': 'R',

    'att': 'I', 'act': 'T', 'aat': 'N', 'agt': 'S',
    'atc': 'I', 'acc': 'T', 'aac': 'N', 'agc': 'S',
    'ata': 'I', 'aca': 'T', 'aaa': 'K', 'aga': 'R',
    'atg': 'M', 'acg': 'T', 'aag': 'K', 'agg': 'R',

    'gtt': 'V', 'gct': 'A', 'gat': 'D', 'ggt': 'G',
    'gtc': 'V', 'gcc': 'A', 'gac': 'D', 'ggc': 'G',
    'gta': 'V', 'gca': 'A', 'gaa': 'E', 'gga': 'G',
    'gtg': 'V', 'gcg': 'A', 'gag': 'E', 'ggg': 'G'
}
Translate the p53seg gene into its protein sequence in all 6 frames (+1, +2, +3, -1, -2, -3),
that is, starting with
(+1) frame: cgg agc agc …
(+2) frame: gga gca gct …
(+3) frame: gag cag ctc …
Reverse complement frame:
(-1) frame: cac cac gtt …
(-2) frame: acc acg ttg …
(-3) frame: cca cgt tgg …
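The six-frame translation could be sketched as follows, assuming Python 3. To keep the block self-contained, the codon table is rebuilt compactly from the standard t, c, a, g ordering instead of being typed out again; the helper names translate and six_frames are illustrative, and trailing bases that do not fill a codon are dropped.

```python
from itertools import product

# Rebuild the standard codon table: codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
standard = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Complement every base of a lowercase sequence, then reverse it."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq, table=standard):
    """Translate seq codon by codon; unknown codons become 'X'."""
    seq = seq.lower()
    return "".join(table.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Return a dict mapping '+1'..'+3' and '-1'..'-3' to protein strings."""
    seq = seq.lower()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames["+%d" % (offset + 1)] = translate(seq[offset:])
        frames["-%d" % (offset + 1)] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    # First 12 bases of p53seg, used here only as a short demonstration.
    for name, protein in sorted(six_frames("cggagcagctca").items()):
        print(name, protein)
```

Stop codons are kept as "*" in the output rather than ending the translation, so each frame can be inspected along its full length.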
4. Mutations in a gene can lead to changes in the protein sequence. This can occur in many
different ways including the insertion of nucleotides, loss of nucleotides, or the conversion of
one sequence to another. For example in sickle-cell disease, the replacement of A by T at
the 17th nucleotide of the gene for the β-chain of hemoglobin changes the codon GAG (for
glutamic acid) to GTG (which encodes valine), leading to the 6th amino acid in the protein
being converted to valine instead of glutamic acid. Please introduce single base-pair
mutations (i.e. replacement of A by T/C/G, G by A/T/C, etc…) to the p53seg gene at a rate
of 1% (i.e. ~1 mutation every 100 base pairs) and document the changes to the protein
sequence (give a couple of trial results). How often do you see premature terminations?
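A rough sketch of the mutation experiment, assuming Python 3. It makes several assumptions that the assignment does not specify: each base mutates independently with probability 0.01, only the +1 frame is compared, and a "premature termination" is counted whenever a codon that was an amino acid becomes a stop codon. All function names are hypothetical.

```python
import random
from itertools import product

BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
standard = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate the +1 frame of seq; unknown codons become 'X'."""
    seq = seq.lower()
    return "".join(standard.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def mutate(seq, rate=0.01, rng=random):
    """Replace each base by a different base with probability rate."""
    out = []
    for base in seq.lower():
        if base in BASES and rng.random() < rate:
            base = rng.choice([b for b in BASES if b != base])
        out.append(base)
    return "".join(out)


def protein_changes(original, mutant):
    """List (1-based position, old residue, new residue) for each change."""
    pairs = zip(translate(original), translate(mutant))
    return [(i + 1, old, new)
            for i, (old, new) in enumerate(pairs) if old != new]


def premature_stop_fraction(seq, trials=1000, rate=0.01, seed=0):
    """Fraction of trials in which some codon mutates into a stop codon."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = 0
    for _ in range(trials):
        changes = protein_changes(seq, mutate(seq, rate, rng))
        if any(new == "*" for _, _, new in changes):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # The sickle-cell example from the text: GAG (E) becomes GTG (V).
    print(protein_changes("gag", "gtg"))
```

Calling premature_stop_fraction on the full p53seg string with a few different seeds gives an estimate of how often a new stop codon appears; the estimate depends on the counting rule assumed above.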