1. Use sequence 1 from the multiple alignment file in a BLAST search and
comment on the results.
The BLAST search (BLASTP 2.2.12; nr database) first reveals the presence of a
conserved domain.
This is shown to be a domain common to the trypsin-like serine protease enzyme
family. The results from RPS-BLAST 2.2.11 show that the scores are large (88.2 to 243)
and the E values are all <0.1, suggesting that this domain is likely to be present in our
query sequence (i.e. our query sequence could be a serine protease).
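
The link between a large bit score and a small E value can be sketched with the standard Karlin-Altschul relation E = m * n * 2^(-S'), where S' is the bit score, m the query length and n the database length. This is only a rough illustration: the function name and the two lengths below are assumptions chosen for the example, not values taken from the search, and real BLAST output also applies effective-length corrections.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E value for a bit score: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical search-space sizes, for illustration only.
QUERY_LEN = 262
DB_LEN = 1_000_000

for score in (88.2, 243.0):
    # A higher bit score gives a much smaller expected number of chance hits.
    print(score, expected_hits(score, QUERY_LEN, DB_LEN))
```

Under these assumed sizes, both ends of the reported score range give E values far below 0.1, which is why hits with large scores are treated as unlikely to have arisen by chance.
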
The remaining results from the BLAST search show that all of the alignments listed are
significant (scores range from 288 to 550; E-values from 9e-77 to 2e-155). The first
alignment is to the sequence P06870, which is the accession number for KLK1
(Kallikrein 1 precursor). This sequence is 100% identical to our query sequence, with
no gaps inserted. The first seven results are all for human kallikrein 1, and they match
with a high level of identity. This suggests that our query sequence is human
Kallikrein 1. You will also notice that the other results include kallikrein sequences
from a variety of other species, including mouse, rat and chimp. This would indicate
that the kallikreins are a family of proteins that is well conserved between species.
This may be of use when identifying the functional areas of the protein.
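
As a minimal sketch of what "100% identical, with no gaps inserted" means, the percent identity of a pairwise alignment can be computed as the number of matching columns divided by the alignment length. The function name and the short test strings are hypothetical; they are not taken from the kallikrein sequences.

```python
def identity_and_gaps(aligned_a: str, aligned_b: str) -> tuple[float, int]:
    """Return (percent identity, number of gapped columns) for two aligned rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    gaps = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            gaps += 1      # column containing a gap in either row
        elif a == b:
            matches += 1   # identical residues in both rows
    percent = 100.0 * matches / len(aligned_a) if aligned_a else 0.0
    return percent, gaps


# Toy aligned fragments, for illustration only.
print(identity_and_gaps("IVGGWE", "IVGGWE"))
print(identity_and_gaps("IVGG-E", "IVGGWE"))
```

A hit that matches the query at every column, as reported for P06870, would return 100.0 and 0 from this function.
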
2. Using your results from the exercises in section 1, check the alignment of the Vega
and Ensembl sequences for SerpinA3 and identify where they differ. How do they
align to the UCSC sequence?
CLUSTALW 1.8 was used for multiple alignment of the three sequences. Areas
where there were differences between the sequences were identified using Boxshade
version 3.21; these are shown in the file “Boxshade results”.
It can be seen that the VEGA sequence has an extra 22 amino acids at the N terminus.
Since the UCSC entry for SerpinA3 tells us that the protein is extracellular, it is
possible that these 22 amino acids are the signal sequence, which is involved in the
secretion of the protein out of the cell.
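
An N-terminal extension of this kind shows up in the alignment as a run of leading gap characters in the shorter sequences. The sketch below counts leading and trailing gaps in one aligned row; the function name and the toy rows are assumptions for illustration and are not the real SerpinA3 alignment.

```python
def terminal_gaps(aligned_seq: str) -> tuple[int, int]:
    """Return (leading gap count, trailing gap count) for one aligned row."""
    leading = len(aligned_seq) - len(aligned_seq.lstrip("-"))
    trailing = len(aligned_seq) - len(aligned_seq.rstrip("-"))
    return leading, trailing


# Toy alignment rows, for illustration only.
toy = {
    "longer":  "MKRLAVGGTE",
    "shorter": "----AVGGTE",
}
for name, row in toy.items():
    print(name, terminal_gaps(row))
```

Here the four leading gaps in the shorter row correspond to four extra N-terminal residues in the longer one; in the real alignment the same count would be expected to give 22 for the sequences lacking the Vega extension.
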
You will also notice that there is an area between residues 102 and 114 of the Ensembl
sequence that differs from the other two sequences. This is likely to be an error. The
Ensembl genes are predicted by automated methods, whereas the Vega entries are checked
manually. It is therefore more likely that the Vega and UCSC sequences are correct at
this point. This is also true for the end of the protein sequence, where the Ensembl
sequence has extra amino acids compared to the Vega and UCSC entries.
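
The regions of disagreement that Boxshade highlights can also be located programmatically. The following is a minimal sketch, assuming the aligned rows have already been read into equal-length strings; the function name and the toy rows are hypothetical, and the ranges returned are alignment columns rather than residue numbers in any one sequence.

```python
def differing_runs(rows: list[str]) -> list[tuple[int, int]]:
    """Return 1-based inclusive column ranges where the aligned rows disagree."""
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all aligned rows must have equal length")
    runs = []
    start = None
    for col, column in enumerate(zip(*rows), start=1):
        if len(set(column)) > 1:       # at least two different symbols in this column
            if start is None:
                start = col
        elif start is not None:        # a run of differences has just ended
            runs.append((start, col - 1))
            start = None
    if start is not None:              # run extends to the end of the alignment
        runs.append((start, len(rows[0])))
    return runs


# Toy three-way alignment, for illustration only.
toy_rows = ["MKAVGGTE--", "--AVGATEKL", "--AVGGTE--"]
print(differing_runs(toy_rows))
```

For the toy rows this reports one run at the start (an N-terminal extension), one internal mismatch and one run at the end (a C-terminal extension), mirroring the three kinds of difference described above.
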