SARS: Severe Acute Respiratory Syndrome
Author: Ryan D. Zaremba
Severe Acute Respiratory Syndrome, also known as SARS, is a new and
mysterious disease currently affecting thousands of people worldwide. Although little is
known about the disease so far, breakthroughs concerning SARS occur on a daily basis.
The disease is caused by a virus that invades the lungs of a human, causing
severe pneumonia-like symptoms. Viruses are able to change their genetic material (DNA
or RNA), or mutate, and by doing so can become resistant to many of our current vaccines.
Mutation is also why several slightly different SARS viruses have been found throughout
the world.
Viruses are not classified as living species because they cannot replicate on their
own, although they do contain a DNA or RNA genome (the SARS virus has an RNA genome).
They require the machinery of a living cell to make more copies of the virus and begin
an infection. The SARS human coronavirus is transmitted through tiny droplets of saliva
from infected people. When a healthy person inhales these droplets, the virus makes its
way into the lungs. Once in the lungs, it attaches itself to the cells of the alveoli
(tiny air sacs in the lungs used for the oxygen-carbon dioxide gas exchange in normal
human respiration or breathing). The SARS virus then releases its genome into the cell,
thereby taking over the cell's machinery. The cell is held hostage by the replicating
virus and many copies of the virus are made. Finally, the cell becomes filled with
viruses and it bursts, releasing the newly synthesized viruses to invade other cells and
repeat the process. This leads to a full-blown SARS viral infection. The SARS virus is
also thought to be transmitted through the air (airborne).
This tutorial gives an overview of the sequences of the proteins within the virus that
causes SARS. You will use the basic tools of the Biology Workbench to view the
sequence of a protein, find similar “mutated” SARS viruses, and see where the
differences between two of them lie. Next, you will use Protein Explorer to view a SARS
protein in 3 dimensions. With this program, you will also be able to view the active
site of the protein (the region of an enzyme where it binds its target molecule and
carries out its chemical reaction).
In order to make this an enjoyable and valuable learning experience, follow along
step by step; to do this we recommend that you print out a copy of this tutorial.
Part I:
Opening An Account
Go to the Biology Student Workbench homepage. The address is:
http://workbench.sdsc.edu
Click on the link that says, “Set up a free account”
If you already have an account, go ahead and log into the program. If not, this is what
you need to do:
Workbench will then ask you for some information. Type in all the information and click
“Register”.
Type in your user name and password and click “OK”.
You will be brought to the homepage for the Biology Workbench:
Choose the background color that you want to work with on the Workbench by scrolling
down the page. (Rose is the recommended color because it makes it easier to see
contrasts in colors that will be used later in alignments of sequences). Click on “Session
Tools”.
Note: There are a variety of different layouts for the Biology Workbench. You can toggle
between the different layouts by clicking on the Biology Workbench logo at the top of the
screen. Click on it until you find the format that fits the layout of this tutorial.
Starting a New Session:
There are a few different sets of tools supplied by the Workbench. They are “Session
Tools”, “Protein Tools”, “Nucleic Tools”, “Alignment Tools” and “Structure Tools
(Alpha)”. For this tutorial, we will only concern ourselves with “Session Tools”,
“Protein Tools”, and “Alignment Tools”. To maintain some organization in your
research, you should create a new session for every new topic you research.
First, highlight “Start New Session” and click on “Run”.
Next, the following screen will ask that you name the new session. Call this
session “Severe Acute Respiratory Syndrome”. Then click on “Start New Session”.
The new “Severe Acute Respiratory Syndrome” session will appear right below your
“Default Session”. You will be able to click back and forth between the default and
SARS sessions easily. But remember, you are working on the SARS session, so make
sure you have the “Severe Acute Respiratory Syndrome” selected.
In this tutorial you will learn how to use the tools that let you search the protein
databases and analyze the protein sequences that you want to use. By clicking on the box
that says “Protein Tools” at the top of the page, you will be brought to the Protein Tools
homepage. It is “Empty” because you have not yet found and saved any protein
sequences. To “import” a sequence means to save it for later use. When you import a
sequence, the word “Empty” is replaced by the name of that sequence.
PART II:
Importing Sequences from Protein Sequence Databases
In this section of the tutorial, you will learn how to search databases for a specific
protein sequence. You will be working with the sequence that encodes the Replicase
protein and a few others of the SARS virus.
Notice the scrollable textbox on the page that contains a variety of tools. You will
use some of these tools within this tutorial. Highlight “Ndjinn – Multiple Database
Search” and then click “Run”.
The next screen lists the different databases that you can search. In the search box
at the top of the page type: “SARS”. This tells the search engine what to search for.
You can also control the number of results you get. Notice the
“Hits per page” drop-down menu. This simply lets you decide how many sequences are
displayed on the results page. To see every match, select “All”.
As you scroll down the page, you will notice a list of different databases. They all
contain a variety of sequences that you can search for. They are separated into 2 distinct
groups: The first group contains sequences from many different organisms (for example,
database “GBBCT” contains a large number of sequences from many different bacteria),
whereas the second group contains the entire DNA genome sequences of specific
organisms (for example, “Aaeo” contains the entire genome sequence of the bacterium
Aquifex aeolicus).
Since you are looking for a viral protein, click on the box that is next to the
SWISSPROT database. This is a large database filled with many different protein
sequences from many different organisms.
Scroll back up to the top of the screen and click “Search”. You will then be sent
to a page that contains the results of your search. At the time this tutorial was written, the
search engine found 49 matches for “SARS”. If you get more than 49 results, do not
panic. The number of search results can change because new sequences
are being added to the databases on a daily basis.
From the descriptions of the search results, we need to determine which one
contains a series of proteins including the Replicase protein. The sequence you want is
“SWISSPROT:R1AB_CVHSA” (rank # 0). This sequence comes from one particular variant of
the SARS virus (known as a “strain”). There are minor differences between strains of SARS
viruses.
Select this sequence and click on the “Import Sequence(s)” button at the bottom of the
page.
Your sequence will be transferred to the Protein Tools homepage.
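The identifier shown by the Workbench has two parts: the database name before the colon and the entry name after it. Swiss-Prot entry names conventionally join a protein code and a species code with an underscore. The short sketch below splits an identifier into these parts; the function names are hypothetical and are not part of the Workbench.

```python
def parse_workbench_id(identifier):
    """Split 'DATABASE:ENTRY' into (database, entry)."""
    database, _, entry = identifier.partition(":")
    return database, entry


def split_swissprot_entry(entry):
    """Split a Swiss-Prot style entry name 'PROTEIN_SPECIES' into its two codes."""
    protein_code, _, species_code = entry.partition("_")
    return protein_code, species_code


database, entry = parse_workbench_id("SWISSPROT:R1AB_CVHSA")
protein_code, species_code = split_swissprot_entry(entry)
print("database:", database)
print("protein code:", protein_code)
print("species code:", species_code)
```

Only the first function applies to identifiers from other databases, such as the GenBank entries used later in this tutorial, because their entry names follow a different convention.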
Part III:
Finding another similar strain of SARS: Using BLASTP
Now that you have a strain of SARS, you can use it to find another similar strain
of the virus. Because we know that the SARS sequence that you have found is
similar to the sequences of other strains, we can use it to search a database for
sequences that are extremely similar (or homologous) to it. To do this, you are
going to use a tool called BLASTP, the version of BLAST that compares protein sequences.
Scroll down the text box on the Protein Tools homepage and look for the tool
called “BLASTP – Compare a PS to a PS DB”. (This stands for “Compare a Protein
Sequence to a Protein Sequence Database”). Select this tool and ensure that your
SWISSPROT:R1AB_CVHSA protein sequence has a checkmark in the box next to it.
Then click on the “Run” button.
You will next be sent to a screen that will give you many options. These options
allow the user to specify their search. For the purposes of this tutorial you do not need to
deal with this. The important step here is to choose the database that you want to use for
your homology search. Scroll down in the text box until you come to the “GenBank
Viral Sequences” database. Once again, you are only interested in the viral protein
databases because you are dealing with viral sequences. Select “GenBank Viral
Sequences” and then click on the “Submit” button at the bottom of the page.
You will be sent to the following screen that contains the results of your BLASTP search:
The first column after the description of the sequence is the “Score (bits)” column. You
will see that the sequences at the top of the list have very high Score (bits) values. A
Score (bits) value above 400 means that the sequence is very similar to the original
sequence you compared it to. However, you can be more certain that the similarity is real
by looking at the “E Value”, the number to the right of the “Score (bits)” value. The
E Value estimates how many matches this good you would expect to find in the database by
chance alone, so the smaller the E Value, the more significant the match. An E Value of
zero means the expected number of chance matches is too small to display, as happens for
a perfect or near perfect match.
As you can see, all the E Values shown are zero. This shows the high degree of
similarity between different strains of the SARS virus. You will be able to see how
similar each of them is later in the tutorial.
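The link between the bit score and the E Value can be made concrete with the standard BLAST relationship E = m × n × 2^(-S), where S is the bit score, m is the length of the query sequence, and n is the total length of the database. The sketch below is only an illustration: the function name and both lengths are assumed values chosen for the example, not figures taken from the Workbench, and real BLAST programs apply further length corrections.

```python
def expected_chance_hits(bit_score, query_length, database_length):
    """Approximate E Value: E = m * n * 2**(-S) for bit score S."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical sizes, chosen only for illustration.
QUERY_LENGTH = 7_000          # amino acids in the query sequence
DATABASE_LENGTH = 1_000_000   # total amino acids in the database searched

for score in (30, 50, 400):
    e_value = expected_chance_hits(score, QUERY_LENGTH, DATABASE_LENGTH)
    print(f"bit score {score:>3}: E Value is about {e_value:.3g}")
```

Each additional bit halves the E Value, so a score in the hundreds gives a number so small that a results page can reasonably display it as zero.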
Now it is time to decide which strain sequence you want to use to compare to
your original sequence. Since you know that different strains of SARS virus are very
similar, you should expect them to be nearly identical (have very high homology). By
reading the descriptions of the sequences, you notice that most of them say “[SARS
coronavirus…]”. This indicates that it is a SARS virus and the following code of letters
indicates the specific strain. You are able to find out more information about the
sequences by checking the box next to the sequence and clicking the “Show Record(s)”
button.
Highlight the first few boxes and click “Show Record(s)”. Notice that a detailed
report is given to you for each sequence you highlighted.
For this tutorial we will be working with one of the strains found in Shanghai, China. If
you are working in Internet Explorer, click on the “Back” button or, if you are working
in Netscape, simply close the Records window (by clicking on the box in the top left-hand
corner of the window). You should be back on the screen that contains the results
of your BLASTP search. Make sure that only the Shanghai sequence
(GBVRL:40457448_40457449) is highlighted.
Now scroll to the bottom of the screen and click on “Import Sequence(s)”. Your
Shanghai strain of SARS virus will be imported to the Protein Tools homepage.
Now you have two different strains of SARS virus sequences. You can compare
and contrast the sequences. It is time to see where the differences lie in these two
sequences.
Part IV:
Aligning the Sequences: Using CLUSTALW
In order to compare the protein sequences, they must be aligned one on top of the
other. This is the purpose of the CLUSTALW tool. The alignment process takes place
by comparing the two sequences and finding common regions within them. Therefore,
alignment is a key step for you to determine where differences in the sequences are
located.
The two sequences that you imported are aligned one on top of the other using
colors that show you where similarities and differences exist.
Scroll down the textbox menu and highlight “CLUSTALW – Multiple Sequence
Alignment”. Click on your two SARS virus sequences, SWISSPROT:R1AB_CVHSA
and GBVRL:40457448_40457449. Then click on the “Run” button.
Another screen will appear in which alignment options can be altered. We are
going to use the default settings, so just click on the “Submit” button. You will now be
taken to a screen that will show you the aligned sequences.
You will notice that in the middle of the alignment, there are dashes (-----) for the
sequence GBVRL:40457448_40457449. Dashes mark gaps, positions where one sequence has no
amino acid to line up with the other. Here the gap is due to the fact that the scientists
who submitted the sequence to the database were unable to determine the sequence of the
protein at this spot. All amino acids in the sequences that are the same are colored
bright blue. For the sake of this tutorial, amino acids that are colored dark blue
or green are very similar amino acids, but they ARE different. Amino acids
that are colored black are not similar at all. These differences in amino acids are
what distinguish one strain of the SARS virus from another.
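The comparison that the colored alignment shows can also be expressed as a short program. The sketch below walks through two aligned sequences column by column and counts identical positions, gaps, and differences. The two fragments are made up for illustration and are not real SARS sequence data, and the function name is hypothetical. Unlike the colored display, this sketch does not distinguish similar amino acids from dissimilar ones.

```python
def compare_aligned(seq_a, seq_b):
    """Compare two aligned sequences of equal length, column by column."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    gaps = 0
    differences = []  # (alignment column, residue in seq_a, residue in seq_b)
    for column, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == "-" or b == "-":
            gaps += 1          # one sequence has no amino acid here
        elif a == b:
            identical += 1     # same amino acid in both sequences
        else:
            differences.append((column, a, b))
    return identical, gaps, differences


# Made-up aligned fragments, for illustration only.
strain_1 = "MESLVLGVNEKTHV"
strain_2 = "MESLVPGVN--THI"

identical, gaps, differences = compare_aligned(strain_1, strain_2)
print("identical columns:", identical)
print("gap columns:", gaps)
for column, a, b in differences:
    print(f"column {column}: {a} in strain 1, {b} in strain 2")
```

The column numbers refer to positions in the alignment, which can differ from positions in either original sequence once gaps have been inserted.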
Conclusion
Using the various tools in the Workbench, you have shown that different strains of
the SARS virus differ by small mutations (changes) in the amino acids of their protein
sequences. What effect could this have on the proteins of different virus strains? In the
next part of this tutorial, you will use a tool that will allow you to see a SARS protein in
3-D.
PART V:
Using Protein Explorer to Visualize the Mpro Protein
In the first part of this tutorial you examined the amino acid sequence of a
SARS virus protein and noticed that there were minor differences between different
strains of the virus. Now you are going to use a program that is available on the web at
http://molvis.sdsc.edu/protexpl/frntdoor.htm (it is also accessible through the Biology
Workbench website) to visualize a SARS protein in 3-D. The program will also show
you where the active site of the protein is.
The protein structure that we are going to use is:
1Q2W = SARS Mpro protein (Mpro is the main protease of the virus)
First we will look at the 3-dimensional view of the protein and then we will highlight the
active site.
WARNING: It is important that you follow these directions step by step and do
everything exactly as it says in this tutorial. The Protein Explorer software may
crash if you try to move through it too quickly. This tutorial includes several
warnings to prevent you from moving ahead too quickly…if you pay attention to
them, you should not have a problem.
PART A:
1. Go to the website for the Biology Workbench by typing in the web address:
<workbench.sdsc.edu>
2. Scroll down the page until you see the link that says “Protein Explorer”.
You will arrive at a new screen where you need to type in the identification code of the
protein structure that you would like to see. Type in “1Q2W” in the first space provided.
This will bring up the Mpro protein for us to see. Once you have typed in “1Q2W”, click
on the “Go” button.
3. Be patient: it may take a few seconds to load the information. On Windows
machines an intermediate window may appear – if so, click “Start Explorer Session”.
This window will appear. Do not press the “OK” button until the red “Busy” sign
says “Ready”.
Once you have pressed the “OK” button AFTER the sign says “Ready”, we need to do
a few more things to make sure that the program does not crash.
4. Click on the “Toggle Spinning” button. This will stop the molecule from
spinning, which can cause the computer to crash.
Once you have stopped the spinning, we are going to hide the water molecules because
they can prevent us from seeing the things we want to see. Click on the button that says
“Hide/Show Water”. At this point, if you have followed the directions correctly, the
molecule should look like this:
If you do not see this on your screen, re-read the directions and start over.
5. Before we move on, let’s take a look at our Mpro protein. Place your cursor
(mouse arrow) on the molecule and hold down the left button on the mouse (or the only
button on the mouse if your mouse has only one button). Now move the mouse around.
This will allow you to rotate your molecule and look at it from any angle you wish. We
will use this technique later on to find the active site of the protein.
6. Now, we are going to find the active site of the protein. In the box on the top
left, click on “Explore More!”.
The screen on the left will change. Now, we are going to make some changes to our
molecule that will make it easier for us to see what is going on.
7. First, let’s change the way the protein looks. Click on the menu that says
“Display” and highlight “Cartoon” in the drop-down menu.
IF YOU WANT TO PLAY AROUND WITH THE DIFFERENT TYPES, YOU CAN
CHOOSE ANY OF THE CHOICES ABOVE THE LINE (SPACEFILL AND ABOVE)
TO SEE WHAT THE MOLECULE WILL LOOK LIKE. WE WILL NOT USE ANY
OF THE CHOICES BELOW THE LINE.
Make sure that “Cartoon” is highlighted before continuing with the
tutorial.
8. Now we are going to change the color of the protein. As of now, it appears in
two colors because two copies of the same smaller protein chain join to form one
protein. This is called a protein dimer. Click on the menu box that says “Color” and
choose “Blue”.
9. Now we are going to get more specific. Mpro is an enzyme, so it has an
active site. We are going to change the color of one of the monomers
(two monomers make up the dimer) so that it stands out from the rest of the protein. This
will make it easier for us to see where the active site is. Click on the “Select” menu and
choose “Chain A”.
Nothing visible should happen right away after you choose this. Go to the “Color” menu
and choose the color “Red”.
After you have chosen “Red”, Chain A in your protein should turn red. Rotate the
protein with your mouse again to see what the protein looks like now. As you can see,
the red chain and the blue chain are actually identical to each other. They are stuck to
each other in a diagonal way.
10. If you have done everything correctly so far, your protein should look like
this:
Now that you can tell that there are really two identical chains making up the SARS Mpro
protein, let’s take a closer look. Because proteins are made of amino acids, the active
site of this protein is made of amino acids too. Let’s see if we can find it…
11. Click on the link that says “Advanced Explorer” (you may need to scroll
down).
REMEMBER: DO NOT PRESS ANY KEYS OR BUTTONS WHEN THE RED
“BUSY” SIGN IS ON THE SCREEN UNDERNEATH THE PROTEIN! IF YOU
PRESS SOMETHING, THE PROGRAM MIGHT CRASH!
12. After the “Busy” sign has changed to “Ready”, click on the link that says
“Seq3D”. This is going to allow us to view individual amino acids on the protein. We
will use this tool to find out where the active site on the protein is.
13. Go to the “CPK Colors” menu and choose “Sec’y Structure”. Nothing
should have happened to the protein yet. This is going to allow us to pick specific amino
acids to highlight.
14. Now, look at the bottom left-hand panel. Each one of those letters represents
an amino acid in the protein sequence. We want to view the amino acids that make up
the active site of the protein.
Each of the 20 amino acids has a specific letter that represents it (F-phenylalanine,
R-arginine, K-lysine, etc.). As you roll the cursor over a letter, the text box above the
sequences tells you which amino acid the cursor is on. It also tells you the position
of the amino acid (the first F in the A chain represents phenylalanine and it is in the
number 3 position). We can use this information to find the active site.
The sequence of the active site of this protein, with the position of each amino acid, is:
Amino acids    Positions
CGTT           22, 23, 24, 25
L              27
HV             41, 42
CTAE           44, 45, 46, 47
MLNP           49, 50, 51, 52
Y              54
C              145
HM             164, 165
DRQ            187, 188, 189
The position of each amino acid is given with the sequence. The first amino acid of
the active site is cysteine (C) at position 22. Glycine (G) is next at position 23. With
your mouse, click on the “C” at position 22 on chain A. When you click on the C, the
location of this amino acid should become visible on the protein. It should appear as a
colored bump. Next, click on the “G” at position 23. Then, click on the “T” at position
24 and the “T” at position 25. Continue to click on the amino acids that correspond to
the active site sequence. When you are done you should be able to see the active site and
your protein should look like this:
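To keep track of which letters to click, it can help to write out every active site amino acid with its name and position. The sketch below does this from the active site sequence and starting positions given above. The variable and function names are hypothetical, and the lookup table covers only the one-letter codes that occur in this active site.

```python
# Three-letter names for the one-letter codes used in this active site.
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "G": "Gly",
    "H": "His", "L": "Leu", "M": "Met", "N": "Asn", "P": "Pro",
    "Q": "Gln", "R": "Arg", "T": "Thr", "V": "Val", "Y": "Tyr",
}

# Each entry is (position of the first amino acid, run of consecutive amino acids).
ACTIVE_SITE_SEGMENTS = [
    (22, "CGTT"), (27, "L"), (41, "HV"), (44, "CTAE"), (49, "MLNP"),
    (54, "Y"), (145, "C"), (164, "HM"), (187, "DRQ"),
]


def residue_labels(segments):
    """Expand each run into labels such as 'Cys22', one per amino acid."""
    labels = []
    for start, letters in segments:
        for offset, letter in enumerate(letters):
            labels.append(f"{THREE_LETTER[letter]}{start + offset}")
    return labels


labels = residue_labels(ACTIVE_SITE_SEGMENTS)
print(len(labels), "amino acids to click on chain A:")
print(", ".join(labels))
```

Ticking off each label as you click it in the Seq3D panel makes it easier to confirm that no amino acid of the active site was missed.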