Welcome to UniDrug_Target. This tool identifies bacterial pathogen-specific drug targets and
reports functional information for each target: essentiality, chokepoint activity, the pathways
in which the target acts, pathway connectivity diagrams showing the effect of inhibiting the
target (the number of other pathways disturbed by the inhibition), and targetable residue-level
variations specific to the pathogen.
Step 1:
Click on “Click here to execute the server” on the home page of the Uni-Drug target web server
(http://117.211.115.67/udt/main.html). You will see three hyperlinks:
1. Identification of pathogen bacteria specific drug targets
Integrates genomic-level information from sequences, essentiality, chokepoint
property, pathway connectivity, partial metabolic network construction and residue-level
variation analysis at the cavity sites to determine pathogen-specific drug targets and
their efficacy and specificity against non-pathogenic organisms and human.
2. Identification of unique targetable sequences of user interest against a set of bacterial genomes
Integrates genomic-level information from sequences, essentiality, chokepoint
property, pathway connectivity, partial metabolic network construction and residue-level
variation analysis at the cavity sites to determine specific drug targets in
user-submitted sequences. This helps in determining the efficacy of the drug targets, their
commonality across the set of pathogenic organisms and their specificity against
non-pathogenic organisms.
3. Identification of uniqueness of a cavity in the user given sequences w.r.t. non-pathogen and
human sequences, and commonality w.r.t. a set of pathogen sequences
Hyperlinks 2 and 3 accept sequences from users (through the text box or the upload option). All
three hyperlinks list the names of pathogenic and nonpathogenic organisms in two separate
selection areas.
For the first hyperlink, “Identification of pathogen bacteria specific drug targets”, select the
pathogenic organism strain(s) of interest and the nonpathogenic organisms against which you
intend to find “unique pathogenic-specific drug targets”.
For hyperlinks 2 and 3, likewise select the pathogenic organism strain(s) of interest and the
nonpathogenic organisms. These selections are used to find common and specific targets
(hyperlink 2) or cavities (hyperlink 3) in the user-given sequences.
Click on the “submit” button and you will see a message such as “Your data has been received,
Book mark the hyperlink http://117.211.115.67/program/status/26_475280.html to find out status”.
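If you submit several jobs, it can help to pull the status link out of each confirmation message and keep it in a list. The following is a minimal sketch in Python, not part of the server. It assumes that every status link follows the same ".../program/status/<job id>.html" shape as the example message above. The function name extract_status_url is hypothetical.

```python
import re
from typing import Optional

# Assumption: status links look like the one in the example message,
# i.e. http://<host>/program/status/<job id>.html
STATUS_URL = re.compile(r"http://\S+/program/status/[\w-]+\.html")


def extract_status_url(message: str) -> Optional[str]:
    """Return the first status-page URL in the confirmation text, or None."""
    match = STATUS_URL.search(message)
    return match.group(0) if match else None


message = (
    "Your data has been received, Book mark the hyperlink "
    "http://117.211.115.67/program/status/26_475280.html to find out status"
)
print(extract_status_url(message))
```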
Step 2: Results
Use the bookmarked hyperlink to check the status of your job. The last column of the
transaction table indicates the status of your request(s). If the status column displays
“pending”, please wait until it shows three hyperlinks (Complete HTML format, HTML/CSV format
and Download HTML/CSV format).
Click on any of these three hyperlinks. Depending on the hyperlink, your results are either
displayed in tabular format with the columns described below, or downloaded in both HTML and
CSV formats.
Note 1:
If you downloaded the results using the “Download HTML/CSV format” hyperlink, extract the
respective .rar file and use the results.html and result_csv.html files to view the results in
HTML and HTML/CSV forms respectively. Images are not transferred with the download, so for the
images related to enzyme/pathway connectivity information please refer to the UDT server’s
enzyme (http://117.211.115.67/udt/enzymes.html) and pathway
(http://117.211.115.67/udt/pathways.html) locations.
Note 2:
For the 3rd hyperlink, “Identification of uniqueness of cavity…”, the tabular output
contains Column 1, Column 2 and Column 12 repeated three times, for the pathogenic,
non-pathogenic and human protein sequences respectively.
Description of the results in Table:
The table has a total of 12 columns. The information provided by each column is given
below by column number.
Column 1:  Serial No.
Column 2:  GI number/synonymous name of the unique protein
Column 3:  Highest matching non-pathogenic protein similarity score (in percentage)
Column 4:  Essentiality of a unique protein or its similarity with the known essential
           proteins
Column 5:  Protein function as a chokepoint or not, along with its rank
Column 6:  Whether the unique protein is highly similar to known chokepoint enzymes in
           other pathogenic organisms, along with rank (applicable for organisms whose
           chokepoint enzymes and ranks were not reported)
Column 7:  Hyperlink to the enzyme and its reactions data, along with reactions uniquely
           consuming or producing metabolites, with a pathway connectivity graph starting
           from each reaction catalyzed by the enzyme (applicable when the unique protein
           is an enzyme)
Column 8:  KEGG orthologous group ID
Column 9:  Hyperlink to the pathways in which the protein is involved, along with the
           pathway connectivity diagram
Column 10: Maximum similarity between the unique protein and the human proteome, matching
           domains and unique cavity information
Column 11: Domains and non-conserved residues information in the pathogenic organism
           compared to non-pathogenic
Column 12: Unique active site information of pathogen protein sequences, along with
           non-pathogen matching residue information
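Once the table has been copied into a script, the columns above can be combined to narrow down candidates. The sketch below is only an illustration in Python. The field names, the simplification of Columns 4 and 5 to yes/no values, the percentage form of Column 10 and the 40% cut-offs are all assumptions made for this example; they are not values defined or recommended by the server. The rows are placeholders, not real results.

```python
from dataclasses import dataclass


@dataclass
class TargetRow:
    serial_no: int                 # Column 1
    protein_id: str                # Column 2: GI number or synonymous name
    nonpathogen_similarity: float  # Column 3: percent
    essential: bool                # Column 4, simplified to yes/no (assumption)
    chokepoint: bool               # Column 5, rank omitted (assumption)
    human_similarity: float        # Column 10, taken as percent (assumption)


def shortlist(rows, max_nonpathogen=40.0, max_human=40.0):
    """Keep essential chokepoint proteins with low off-target similarity.

    The default cut-offs are illustrative only.
    """
    return [
        row for row in rows
        if row.essential
        and row.chokepoint
        and row.nonpathogen_similarity <= max_nonpathogen
        and row.human_similarity <= max_human
    ]


# Placeholder rows, not real server output.
rows = [
    TargetRow(1, "protein_A", 25.0, True, True, 18.0),
    TargetRow(2, "protein_B", 72.0, True, True, 20.0),
    TargetRow(3, "protein_C", 30.0, False, True, 15.0),
]
print([row.protein_id for row in shortlist(rows)])
```

Here only protein_A passes: protein_B is too similar to a non-pathogenic protein and protein_C is not essential.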
The 7th column is hyperlinked to the enzyme data (EC number) along with the names of reactions
that uniquely consume or produce metabolites. Once you click on the enzyme's "EC
number", you will see all the reactions catalyzed by the enzyme. Reactions with
hyperlinks represent the reactions involved in synthesizing end-metabolites in a
metabolic pathway. The other reactions belong to sets of cyclic reactions within the metabolic
pathways that do not result in end-metabolite production. When you click on a "Reaction"
hyperlink, you will see the pathway connectivity diagram. If you see only one pathway name,
the end metabolites are not transferred to any other metabolic pathway. If you
see multiple pathways with arrows drawn between them, the tail of each arrow
indicates the source pathway and its head points to the pathway(s) using the end
metabolite(s) of the source pathway as substrate. The diagram shows the whole set of pathways
disturbed by targeting the enzyme. The user can determine the role of these pathways in
pathogenesis and survival from the literature. This information can be used to assess the drug
target potential of pathogen-specific proteins. Similarly, the user can retrieve information on
disturbances in pathways and metabolite production for any pathogen-specific enzyme of a
bacterium by using this server.
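The arrow convention of the connectivity diagram can be read as a directed graph: each arrow is an edge from the source pathway (tail) to a pathway that uses its end metabolites (head). The Python sketch below collects every pathway reachable from a starting pathway with a breadth-first search. It assumes that a disturbance propagates along every arrow, which is one reading of the diagram rather than a documented rule of the server. The pathway names are placeholders.

```python
from collections import deque


def disturbed_pathways(edges, source):
    """Return the source pathway plus every pathway reachable along the arrows.

    edges is a list of (tail, head) pairs: tail is the source pathway and
    head is a pathway that uses the tail's end metabolites as substrate.
    """
    graph = {}
    for tail, head in edges:
        graph.setdefault(tail, []).append(head)

    seen = {source}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in graph.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Placeholder diagram: A feeds B, B feeds C, and D feeds A.
edges = [("Pathway A", "Pathway B"), ("Pathway B", "Pathway C"),
         ("Pathway D", "Pathway A")]
print(sorted(disturbed_pathways(edges, "Pathway A")))
```

Pathway D is not included in this example because its arrow points into Pathway A, not out of it. A diagram with a single pathway and no arrows yields just that pathway.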
The 9th column is hyperlinked to the "pathways" (if any) in which the enzyme/protein plays a
role. When you click on the hyperlink, you will see the partial metabolic network (PMN) diagram
indicating the pathways disturbed by inhibiting the target (enzyme cases only). If the
pathways are involved in metabolic reactions, you will see the
connectivity diagram. For signal transduction pathways, you will see only the
pathway name(s).
The 10th column indicates the extent of similarity between the pathogen-specific protein and
the sequences of the human proteome.
The 10th and 11th columns are hyperlinked to data on domain-level conservation of residues among
pathogenic, nonpathogenic and human sequences. The output for these columns
is explained below:
For example, one of the predicted pathogen-specific proteins, gi|15607938 in M. tuberculosis,
matched a region within the Q743F5_MYCPA (Swissprot_id) sequence containing the
Linocin_M18 (Pfam_id: PF04454.6) domain. The observed matching regions (row 3 shows the
matching residues between the pathogenic and domain sequences) show conservation of certain
residues between pathogenic and nonpathogenic sequences, which are shown in the subsequent rows
with their GI numbers. This pathogen-specific protein showed low matching with a set of
sequences (GI: 108797777, 119866868, 126435542, 108802357, etc.) in the
nonpathogenic/beneficial organisms. The role of the conserved residues of the Linocin_M18
domain in protein function is determined by mapping the domain to the respective location in
the available PDB structure (3fdx), which shows whether the matching residues are
pocket-forming and functionally conserved. This signifies the importance of conserved residues
in low-matching regions for preserving the functionality of a diversified protein. For example,
from Table-1, we can identify that some of the conserved residues are involved in the formation
of pocket '0' (the biggest cavity of a protein) and pocket '2' (the 3rd biggest cavity).
The 12th and 10th columns provide variation analysis of the residues forming similar cavities
in pathogenic, nonpathogenic and human proteins, to identify cavities specific to a
pathogen protein.
To evaluate the uniqueness of a pocket in a protein 3D structure, you need to combine the
information from the different chains of the PDB structure. For example, one of the identified
target proteins, gi_id: 15607945, has only 27% identity and 43% similarity in the
matching regions of chains A, B, C, D and E of the 3DO3 PDB structure, and 99% similarity with
3DO3 chain F. (This analysis is given instead of the exactly matching structures 3IB7 and 2HY1
to demonstrate the use of the tool for predicting targetable unique pocket sites from
low-matching structures, as only 20% structure data is available to date.)
You can observe that pocket5 in 3DO3 comprises residues from two chains, A and D.
Pocket5 chain A residue composition in the pathogenic 15607945 sequence against nonpathogenic
(108800899, 119870039, 126436525, 108799028, etc.) sequences:
Pocket5 chain D residue composition in the pathogenic 15607945 sequence against nonpathogenic
(108800899, 119870039, 126436525, 108799028, etc.) sequences:
To determine pocket5 residue variability in the pathogenic organism, you need to combine the
data of both chains (A and D). You can then find the variability of all pocket-forming residues
by comparing each pathogenic sequence residue with the matching residues of the nonpathogenic
organisms shown in the same row, which occupy the same position in the similar pocket formed
by the different pathogenic and nonpathogenic protein structures.
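Combining the chain tables can also be done in a short script. The Python sketch below merges per-chain rows for one pocket and flags the positions where at least one nonpathogenic residue differs from the pathogenic residue in the same row. The row layout (position, pathogenic residue, list of nonpathogenic residues) is an assumption made for this example and may not match the server's table layout. The residues shown are placeholders, not the real pocket5 data.

```python
def combine_pocket_chains(chain_tables):
    """Merge per-chain pocket rows into one table and flag variable positions.

    chain_tables maps a chain ID to rows of
    (position, pathogen_residue, [nonpathogen_residues]).
    """
    combined = []
    for chain_id in sorted(chain_tables):
        for position, pathogen_res, nonpathogen_res in chain_tables[chain_id]:
            combined.append({
                "chain": chain_id,
                "position": position,
                "pathogen": pathogen_res,
                "nonpathogen": list(nonpathogen_res),
                # True when at least one nonpathogenic sequence differs here.
                "variable": any(r != pathogen_res for r in nonpathogen_res),
            })
    return combined


# Placeholder rows for a pocket spanning chains A and D.
pocket5 = {
    "A": [(10, "G", ["G", "G"]), (11, "D", ["E", "D"])],
    "D": [(42, "H", ["N", "Q"])],
}
pocket_rows = combine_pocket_chains(pocket5)
variable = [(r["chain"], r["position"]) for r in pocket_rows if r["variable"]]
print(variable)
```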
Note: The sequence-level cavity data generated for pathogen-specific protein sequences to
determine unique cavities in the protein, which is hyperlinked from the 12th and 10th columns of
the main result table, should be considered authenticated for further evaluation if and only if
the Residue name column (Column No. 1: the three-letter code of an amino acid) and the Residue
column (Column No. 4: the one-letter code of an amino acid) represent the same amino
acids.
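The consistency check in this note is easy to automate. The Python sketch below uses the standard three-letter and one-letter codes of the 20 common amino acids. It assumes that you have already extracted the (Residue name, Residue) pairs from Columns 1 and 4 of the cavity table. The function names are hypothetical, and any code outside the 20 standard ones is treated as a mismatch.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def codes_agree(three_letter, one_letter):
    """True when both codes name the same standard amino acid."""
    expected = THREE_TO_ONE.get(three_letter.strip().upper())
    return expected is not None and expected == one_letter.strip().upper()


def table_is_consistent(pairs):
    """pairs: (Residue name, Residue) values taken from Columns 1 and 4."""
    return all(codes_agree(name, letter) for name, letter in pairs)


# Placeholder pairs: the second table has a mismatch (GLY paired with A).
print(table_is_consistent([("GLY", "G"), ("Asp", "d")]))
print(table_is_consistent([("GLY", "A"), ("ASP", "D")]))
```

The first call prints True and the second prints False, so only the first table would be taken forward for further evaluation.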
For whole bacterial pathways connectivity and enzymes information, you can
click on the “pathways connectivity data” and “Enzymes data” buttons on the home page
(http://117.211.115.67/udt/main.html).