Exercise 4
Nucleic acid sequence analysis
using the Internet
Sequence analysis using Internet functions
Short tutorial on restriction mapping, translation, and BLAST.
Many of the following exercises involve copying one sequence from a page in Netscape to another. For
these types of exercises, therefore, it is a good idea to use multiple windows of Netscape. To create a
new window select File - New Web Browser (Ctrl-N). Another method involves the temporary copying
of the sequence to an independent window, such as that of a Word document.
Restriction mapping. Connect to NCBI Entrez. Retrieve the nucleotide sequence entry with accession
code X01405. Mark the sequence from the NCBI page, select Edit-Copy (or Ctrl-C at the keyboard), go
to Webcutter (www.medkem.gu.se/cutter) and select Edit-Paste (or Ctrl-V at the keyboard) to introduce
the sequence in the sequence window of Webcutter. Click on the button "Analyze sequence" to see
restriction sites. We now want to find out where BamHI cleaves. Go back to the main page of
Webcutter and deselect "Map of restriction sites". Select "Only the following enzymes" and click on
BamHI. "Analyze sequence" gives you the BamHI cleavage sites. How many times does BamHI cleave
in the sequence X01405? Go back to the main Webcutter page and reselect "All enzymes in the
database". To see all enzymes that cleave exactly once in the sequence select "Enzymes cutting once".
The result should be a table where the top looks like:
Enzyme    Positions of Recognition Sites    Recognition Sequence
______    ______________________________    ____________________
AatI      740                               agg/cct
AccI      827                               gt/mkac
AccIII    1762                              t/ccgga
AflIII    408                               a/crygt
AlwNI     423                               cagnnn/ctg
AocI      369                               cc/tna
.
.
Other exercises on restriction mapping with Webcutter:
www.medkem.gu.se/edu/res.html (in Swedish) or www.medkem.gu.se/edu/4/start.html (in
English)
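The site search that Webcutter performs can be approximated offline. The following is a minimal sketch, not Webcutter's actual implementation: the function name `find_sites` and the toy sequence are made up for illustration, and it assumes that a reported position is the 1-based start of the recognition sequence. Only the given strand is scanned, which is sufficient for palindromic sites such as BamHI (g/gatcc); a non-palindromic pattern would also need a search on the reverse complement.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "[ag]", "y": "[ct]", "m": "[ac]", "k": "[gt]",
    "s": "[cg]", "w": "[at]", "b": "[cgt]", "d": "[agt]",
    "h": "[act]", "v": "[acg]", "n": "[acgt]",
}


def find_sites(sequence, recognition):
    """Return 1-based start positions of every match of a recognition pattern.

    `recognition` uses the notation of the table above, e.g. "g/gatcc";
    the "/" marks the cut point and is ignored when matching.
    """
    seq = sequence.lower()
    pattern = "".join(IUPAC[base] for base in recognition.lower().replace("/", ""))
    # The lookahead reports overlapping sites as well.
    return [m.start() + 1 for m in re.finditer("(?=" + pattern + ")", seq)]


# Hypothetical toy sequence, NOT the X01405 entry.
toy = "ttggatccaaaggatccgt"
print(find_sites(toy, "g/gatcc"))   # prints [3, 12]
```

To use it on the real entry, paste the X01405 sequence (letters only, without the position numbers) in place of `toy`.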
Translation. Again retrieve the nt entry with accession code X01405. Translate this sequence using the
translation function at www.medkem.gu.se/edu/translat.html.
Copy the sequence X01405 as in the restriction exercise above to introduce the sequence in the
appropriate box of the page www.medkem.gu.se/edu/translat.html. Click on the "Translate" button to
see the three forward reading frames. Try to identify the open reading frame corresponding to the p53
sequence. (You can get help from the annotation section on the NCBI page!). When you have
identified the correct reading frame (1 out of 3 possible) do the translation again but select the correct
frame (under "Translate entire sequence and select reading frame ...") so that you obtain a linear
sequence of amino acids at the top of the result page that corresponds to p53.
The sequence (at the top of the page) should look like this:
KTYQGSYGFRLGFLHSGTAKSVTCTYSPALNKMFCQLAKTCPVQLWVDST 50
PPPGTRVRAMAIYKQSQHMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGN 100
LRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNRRP 150
ILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELP 200
PGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALEL 250
KDAQAGKEPGGSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD
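For reference, translation in the three forward reading frames can be sketched in a few lines. This is an illustration under stated assumptions rather than the code behind the web utility: it uses the standard genetic code, prints stop codons as "*" and unrecognised codons as "X", and the names `translate` and `three_frames` as well as the toy sequence are invented here.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; amino acids listed with codon positions in TCAG order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate `dna` starting at offset `frame` (0, 1 or 2)."""
    seq = dna.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    # Codons containing ambiguous letters are shown as "X".
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frames(dna):
    """Return the translations of the three forward reading frames."""
    return [translate(dna, frame) for frame in range(3)]


# Hypothetical toy sequence, NOT the X01405 entry.
toy = "ATGAAAACCTATCAGGGCTAA"
for number, protein in enumerate(three_frames(toy), start=1):
    print("frame", number, protein)
```

The frame that corresponds to the protein is typically the one with a long stretch free of "*" characters.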
Additional introductory exercises with translation, homology searching and multiple sequence
alignment are available at www.medkem.gu.se/edu/3/start.html. Connect to this page and follow the
instructions there:
Sequence analysis
Question # 1 of 8
You have recently been successful with the cloning of a protein gene. You have used the Sanger
dideoxynucleotide chain termination method to determine the DNA sequence of one of your clones.
What is the sequence read from your sequencing gel?
Analysis of genomic information
A few of the exercises below involve the alignment of 2-4 different sequences. Useful sites for the
alignment of 2 sequences are:
www-hto.usc.edu/software/seqaln/seqaln-query.html (select global alignment!)
genome.eerie.fr/bin/align-guess.cgi
Site for multiple sequence alignment:
www.medkem.gu.se/ln/molbio/gene/msf.html
1. Alu repeats.
Use SRS or Entrez to retrieve the nucleotide entry Z82206, a sequence from human
chromosome 22. (In SRS, select EMBL as the database and enter the accession code in the field
"AccNumber". In Entrez, specify the field as "Accession".) The annotation section may be used to identify
Alu repeats ("repeat region .. Alu.."). Part of the feature table of the annotation section will look like:
FT   repeat_region   1495..1797
FT                   /note="AluSp repeat: matches 1..303 of consensus"
FT   repeat_region   2171..2375
FT                   /note="AluJo repeat: matches 100..302 of consensus"
FT                   /note="incomplete repeat"
FT   repeat_region   2443..2519
FT                   /note="MIR repeat: matches 191..260 of consensus"
FT   repeat_region   2664..2966
FT                   /note="AluSp repeat: matches 1..303 of consensus"
FT   repeat_region   3146..4279
FT                   /note="TIGGER1 repeat: matches 2418..1298 of consensus"
FT   repeat_region   4273..5430
FT                   /note="TIGGER1 repeat: matches 1172..1 of consensus"
FT   repeat_region   6761..7062
FT                   /note="AluSx repeat: matches 1..302 of consensus"
FT   repeat_region   7771..8069
FT                   /note="AluSc repeat: matches 1..299 of consensus"
FT   repeat_region   8289..8692
FT                   /note="MLT1A1 repeat: matches 3..365 of consen
Use a multiple sequence alignment (www.medkem.gu.se/edu/msf.html) to compare four different Alu
repeats. Select for instance the regions 1495..1797, 2171..2375, 6761..7062 and 7771..8069. Are they
homologous?
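Cutting the four regions out of the entry by hand is error-prone, because feature-table locations are 1-based and include both end points. The sketch below shows one way to do the extraction. The helper names (`parse_location`, `extract`, `to_fasta`) are hypothetical, and the FASTA output is only an assumption about what the alignment page accepts.

```python
def parse_location(location):
    """Turn a feature-table location such as "1495..1797" into (start, end).

    A leading "<" or ">" (partial feature) is dropped.
    """
    start, end = location.split("..")
    return int(start.lstrip("<>")), int(end.lstrip("<>"))


def extract(sequence, location):
    """Return the subsequence for a 1-based, inclusive location."""
    start, end = parse_location(location)
    # Python slices are 0-based and exclude the end index.
    return sequence[start - 1:end]


def to_fasta(name, seq, width=60):
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines)


# Hypothetical placeholder; paste the real Z82206 sequence here.
entry = "acgt" * 2500
for location in ["1495..1797", "2171..2375", "6761..7062", "7771..8069"]:
    region = extract(entry, location)
    print(to_fasta("Z82206_" + location, region))
```

As a sanity check, the region 1495..1797 should come out 303 bases long, in agreement with the note "matches 1..303 of consensus".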
2. Exon - intron structure.
a. Identification of a gene with BLAST. Make use of the same sequence Z82206 as above. In the
annotation section there is information about an exon (<20814..21617). Use BLAST
(www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-blast?Jform=0) to compare this sequence to the database
(select program : blastn and database: nr). What seems to be the protein encoded by the exon?
b. Alignment of genomic sequence with mRNA. Retrieve the DNA sequences V00594 (Human
mRNA for metallothionein) and J00271 (corresponding genomic sequence). Compare these sequences
by doing a global sequence alignment. Useful sites are
www-hto.usc.edu/software/seqaln/seqaln-query.html or
genome.eerie.fr/bin/align-guess.cgi
If you are using www-hto.usc.edu/software/seqaln/seqaln-query.html please note:
1. Select Global alignment
2. Enter the two sequences in the 2nd and 3rd larger frames. The first frame is not for
sequences!
Based on the alignment, how many exons are there in this gene? Compare your result to what's in the
annotation section for J00271.
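The idea behind a global alignment can be illustrated with a small Needleman-Wunsch implementation. This is a teaching sketch, not the method used by the sites listed above: the scoring values (match +1, mismatch -1, gap -2 per position) are arbitrary assumptions, and the linear gap penalty handles long intron-sized gaps worse than the affine penalties that dedicated tools usually offer.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b); "-" marks a gap.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


# Hypothetical toy sequences, NOT V00594 or J00271.
total, genomic, mrna = global_alignment("ACGT", "AGT")
print(total)
print(genomic)
print(mrna)
```

When a genomic sequence is aligned with its mRNA, long runs of "-" in the mRNA row mark candidate introns, and the blocks between them are the exons.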
3. Pseudogenes. Consider the following human beta-tubulin sequences:
Normal genomic sequence : DNA
A processed pseudogene : DNA
Normal beta-tubulin : protein
Try to examine these sequences to find out why the pseudogene does not give rise to a functional
protein.
Hints: Translate the pseudogene in all three forward reading frames with the web translation utility
(www.medkem.gu.se/edu/translat.html). Compare each of the translation products to the beta-tubulin
sequence (betatub). Once you have identified the relevant reading frame, examine the amino acid
alignment between the pseudogene product and the normal beta-tubulin (make use of
www.medkem.gu.se/edu/msf.html). What is the major discrepancy? What are the significant
differences at the nucleotide level?
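A quick way to see where a reading frame is interrupted is to list the positions of stop codons in each forward frame. The sketch below is only an aid for this kind of inspection and does not give the answer to the exercise. The function name `stop_positions` and the toy sequence are invented, and positions are reported as the 1-based index of the first base of each stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_positions(dna, frame):
    """Return 1-based positions of stop codons in forward frame 0, 1 or 2."""
    seq = dna.upper()
    return [i + 1
            for i in range(frame, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS]


# Hypothetical toy sequence, NOT the beta-tubulin pseudogene.
toy = "ATGAAATGACCCTAAGG"
for frame in range(3):
    print("frame", frame + 1, stop_positions(toy, frame))
```

A frame that matches the normal protein for a while and then runs into a stop codon, or whose match continues in a different frame, points to a nonsense or frameshift change at the nucleotide level.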