BIOINFORMATICS
Biological information is encoded in the nucleotide sequence of DNA. Bioinformatics is the field that
identifies biological information in DNA using computer-based tools. Some bioinformatics algorithms aid
the identification of genes, promoters, and other functional elements of DNA. Other algorithms help
determine the evolutionary relationships between DNA sequences.
Because of the large number of tools and DNA sequences available on the Internet, experiments that are
done on the computer now complement experiments done in vitro and in vivo. This movement between
biochemistry and computation is a key feature of modern biological research. In Part I, you will use the
Basic Local Alignment Search Tool (BLAST) to identify sequences in biological databases and to make
predictions about the outcome of your experiments. In Part II, you will find and copy the human PTC taster
and non-taster allele DNA sequences. In Part III, you will discover the chromosome location of the PTC
tasting gene. In Part IV, you will explore the evolutionary history of the gene.
I. Use BLAST to Find DNA Sequences in Databases
The following primer set was used in the experiment:
5'-CCTTCGTTTTCTTGGTGAATTTTTGGGATGTAGTGAAGAGGCGG-3' (Forward Primer)
5'-AGGTTGGCTTGGTTTGCAATCATC-3' (Reverse Primer)
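The BLAST query box accepts only nucleotide letters, so the 5'/3' end labels and the primer names have to be dropped before the two primers are pasted in back to back. A minimal Python sketch of that clean-up follows; the helper name clean_primer is hypothetical and is not part of any NCBI tool.

```python
import re

# Primer lines exactly as printed in this handout.
forward_line = "5'-CCTTCGTTTTCTTGGTGAATTTTTGGGATGTAGTGAAGAGGCGG-3' (Forward Primer)"
reverse_line = "5'-AGGTTGGCTTGGTTTGCAATCATC-3' (Reverse Primer)"


def clean_primer(text):
    """Return only the nucleotide letters from a labelled primer line."""
    # Drop the parenthetical name first, since it contains letters such as A, C, T.
    text = re.sub(r"\(.*?\)", "", text)
    # Remove everything that is not A, C, G, or T (digits, quotes, dashes, spaces).
    return re.sub(r"[^ACGT]", "", text.upper())


forward = clean_primer(forward_line)
reverse = clean_primer(reverse_line)

# Paste the primers one right after the other, with no spaces.
query = forward + reverse

print(len(forward), len(reverse), len(query))
print(query)
```

The combined query is 68 nucleotides long (44 from the forward primer plus 24 from the reverse primer), which is why the reverse primer match ends at query position 68 in the BLAST results.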
1. Initiate a BLAST search.
a. Open the Internet site of the National Center for Biotechnology Information (NCBI)
www.ncbi.nlm.nih.gov.
b. Click on the BLAST link in the “Popular Resources” list on the right side of the page.
c. Click on the link nucleotide BLAST under the heading Basic BLAST.
d. Where it says “NCBI/BLAST/blastn suite,” be sure the “blastn” tab is selected.
e. Enter the sequences of the primers into the Search window. These are the query sequences. It
may be easiest to cut and paste them from this document rather than typing them by hand.
Paste these sequences one right after the other into the query box (where it says “enter
accession number, gi, or FASTA sequence”), no spaces needed.
f. Omit any non-nucleotide characters from the window, because they will not be recognized by
the BLAST algorithm.
g. In the “Choose search set” section, under database, select “others (nr etc)” and in the drop
down list directly below this, select “Nucleotide collection (nr/nt).”
h. Under Program Selection, optimize for somewhat similar sequences by selecting blastn.
i. Click on BLAST! and the query sequences are sent to a server at the National Center for
Biotechnology Information in Bethesda, Maryland. There, the BLAST algorithm will attempt to
match the primer sequences to the millions of DNA sequences stored in its database. While
the search runs, a page showing its status will be displayed until your results are
available. This may take only a few seconds, or more than a minute if a lot of other searches are
queued at the server.
2. The results of the BLAST search are displayed in three ways as you scroll down the page:
a. First, a “Graphic Summary” illustrates how significant matches, or hits, aligned with the query
sequence. Matches of differing lengths are coded by color.
b. This is followed by “Descriptions,” a list of significant alignments, or hits, with links to
Accession (individual record in the database) information.
c. Last, the “Alignments” section shows a detailed view of how each primer sequence (query) matched
up (aligned) to the nucleotide sequence in the record in the database (subject). Notice that a match
to the forward primer (nucleotides 1–42), and a match to the reverse primer (nucleotides 44–68)
are within the same Accession.
3. Determine the predicted length of the product that the primer set would amplify in a PCR reaction (in
vitro):
a. In the list of significant alignments, notice the E-values in the column on the right. The Expectation
or E-value is the number of alignments with the query sequence that would be expected to occur by
chance in the database. The lower the E-value, the higher the probability that the record in the
database is related to the query.
What is the E-value for the first alignment in the database?
b. Look at the top 15 “hits” on the list.
Do they make sense (YES/NO)?
What do they have in common?
c. Scroll down to the Alignments section to see exactly where the two primers have landed in the first
subject sequence.
d. The lowest and highest nucleotide positions in the subject sequence indicate the borders of the
amplified sequence. Subtracting one from the other gives the difference between the two
coordinates.
e. However, the actual length of the fragment includes both ends, so add 1 nucleotide to the result to
determine the exact length of the PCR product amplified by the two primers.
What is the total length of the amplified sequence? ____________________________
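The arithmetic in steps 3d and 3e can be sketched in Python as follows. The coordinates below are made-up placeholders, not values from a real BLAST result; substitute the subject positions you read from your own Alignments section. The function name pcr_product_length is hypothetical.

```python
def pcr_product_length(subject_positions):
    """Length of the amplified fragment, counting both end nucleotides."""
    lowest = min(subject_positions)
    highest = max(subject_positions)
    # highest - lowest is the distance between the two borders;
    # adding 1 counts the first nucleotide as well as the last.
    return highest - lowest + 1


# Why the +1 is needed: positions 5, 6, 7, 8 are four nucleotides, but 8 - 5 = 3.
print(pcr_product_length([5, 8]))

# Hypothetical subject coordinates (start, end) for each primer hit.
# The reverse primer aligns to the opposite strand, so its coordinates
# may be listed from high to low in the BLAST output.
forward_hit = (1001, 1042)
reverse_hit = (1300, 1277)

print(pcr_product_length(forward_hit + reverse_hit))
```

Taking the minimum and maximum over all four coordinates means the order in which BLAST reports each hit's start and end does not matter.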
II. Find and Copy the Human (Homo sapiens) PTC Taster and Non-taster Alleles
1. In the “Descriptions” section of the BLAST output, find the records for the human PTC taster and
non-taster alleles (you will see “taster” and “non-taster” in the name of the record).
What is the accession number for the human PTC taster allele? ____________
What is the accession number for the human PTC non-taster allele? ________
2. In the list of significant alignments, select the hit containing the human taster allele from among those
with the lowest E-values. Tell your browser to open this in a new window or tab. We will be returning
to the BLAST output page later.
3. Click on the Accession link at the left to open the sequence datasheet for this hit.
4. At the top of the report, note basic information about the sequence, including its basepair length,
database accession number, source, and references.
How large is the PTC taster sequence in this record (in basepairs, bp)? ___________
5. In the middle section of the report, note annotations of gene and regulatory features, with their
beginning and ending nucleotide positions (xx .. xx). One of the features should be the translation.
Find the protein sequence within this record and paste the sequence or write the first 15 amino acids
here:
6. The bottom section of the report lists the entire nucleotide sequence of the gene or DNA sequence that
contains the PCR product. Paste/record this sequence below.
7. Repeat Steps 2–6 for the human non-taster allele.
Paste or record the first 20 bases of the PTC taster allele sequence here:
Paste or record the first 20 bases of the PTC non-taster allele sequence here:
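The “translation” feature in a sequence record is the protein obtained by reading the coding DNA three nucleotides (one codon) at a time. A minimal Python sketch of that reading process is shown below, using the standard genetic code. The input sequence is a short made-up example, not the TAS2R38 sequence, and the function name translate is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks an unrecognized codon
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up coding sequence: ATG (M), GCT (A), TGG (W), TAA (stop).
toy_cds = "ATGGCTTGGTAA"
print(translate(toy_cds))  # MAW
```

The one-letter amino acid codes printed here are the same codes used in the translation line of the database record.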
III. Use Map Viewer to Determine the Chromosome Location of the TAS2R38 Gene
1. Return to the NCBI home page and click on “Map Viewer” located at the bottom of the page under the
“Featured” heading.
2. In the “Search:” box, select Homo sapiens (humans). In the “for:” box, enter TAS2R38 and click “GO.”
3. The top of the results page shows a schematic of all human chromosomes. Each chromosome is
represented by two bars; the space between the bars represents the location of the centromere. The
location of TAS2R38 is also indicated in the schematic.
On what chromosome (and which arm) is the TAS2R38 gene?__________________
4. Click on the marked chromosome number in the diagram to move to the TAS2R38 locus.
5. On the left side of the page (shaded in blue, towards the bottom) is a picture of the chromosome
containing TAS2R38. Part of it is shaded in. This is the region of that chromosome that is shown to the
right. Just above this is a zoom tool. Use this to zoom in so that the map shows 1/1,000th of the
chromosome. You will need to mouse over the tool to see what zoom level you are on.
6. There are multiple maps displayed in the remainder of the output; the title of each map is at the top. In
each of these maps, the location of TAS2R38 is highlighted. Some of the maps that may be present in
this output include:
a. Pheno: this shows loci associated with phenotypes
b. Morbid: this shows genes associated with disease phenotypes
c. Genes_cyto: this shows the cytogenetic locations of genes (e.g. where genes are relative to the
banding patterns seen in metaphase chromosomes).
d. ensTranscripts: this shows regions that correspond to mRNA
e. ensGenes: this shows regions that correspond to annotated genes
f. RnRNA: this shows related rat mRNAs aligned to the human chromosome
Using the phenotype map, what are the names of the genes on either side of TAS2R38?
7. Click on the TAS2R38 locus in the phenotype map and then click on the “symbol” link that appears. This
takes you to another section of NCBI called “Online Mendelian Inheritance in Man (OMIM).” Use the
information on this site to answer the following questions.
What is the full name of the TAS2R38 gene?
8. Read about how the TAS2R38 gene was cloned and how it functions.
Paste the paragraph or record the first 20 words of the paragraph that describes the function of the
TAS2R38 gene here:
9. Much of this paragraph describes work done in 2009 by Shah et al. Click on the link for this reference.
This takes you to yet another section of NCBI called “PubMed.” PubMed is a massive database of
scientific literature. You are currently viewing the abstract of the paper written by Shah et al.
concerning TAS2R38. If you wanted, you could download this paper or related works from this site.
Paste the abstract or record the first 20 words of the abstract of the Shah paper here:
IV. Use Multiple Sequence Alignment to Explore the Evolution of the TAS2R38 Gene
1. Open the BioServers Internet site at the Dolan DNA Learning Center (www.bioservers.org).
2. Enter Sequence Server using the button in the left-hand column. (You can register if you want to save
your work for future reference.)
3. Create PTC gene sequences for comparison
a. Click on Create Sequence at the top of the page.
b. Copy one of the TAS2R38 sequences (from the “PTC sequences” file on Moodle), and paste it
into the Sequence window. Enter a name for the sequence, and click OK. Your new sequence
will appear in the workspace at the bottom half of the page.
c. Repeat Steps a. and b. for each of the human and primate sequences from the PTC sequences
file.
4. To compare sequences, click the check box in the left-hand column next to each sequence you want to
compare (for example, human PTC taster vs. human PTC non-taster). Then click on Compare in the grey bar.
(The default operation is a multiple sequence alignment, using the CLUSTAL W algorithm.) The checked
sequences are sent to a server at Cold Spring Harbor Laboratory, where the CLUSTAL W algorithm will
attempt to align each nucleotide position.
5. First, let’s compare the following sequences:
a. Human taster
b. Human non-taster
c. Human PCR product (non-taster)
6. The results will appear in a new window. This may take only a few seconds, or more than a minute if a
lot of other searches are queued at the server.
a. The sequences are displayed in rows of 25 nucleotides. Yellow highlighting denotes mismatches
between sequences or regions where only one sequence begins or ends before another.
b. To view the entire gene, enter 1100 as the number of nucleotides to display per page, then click
Redraw.
7. Scan the alignment where all three sequences appear; this is the region that is not highlighted in yellow.
8. Find a nucleotide position where you can see a difference between human taster and non-taster (a
single nucleotide in this region that is highlighted yellow). Use the numbers at the end of each row to
help you determine the precise location.
At what nucleotide position do you notice a difference between human taster and non-taster?
What is the nucleotide (G, A, T, or C) at this position in the human non-taster?
9. Now compare the following (DO NOT include the PCR product here):
a. Human taster
b. Human non-taster
c. Gorilla gorilla (Gorilla)
d. Pan troglodytes (Chimpanzee)
e. Pan paniscus (Bonobo)
At how many positions do you notice a difference among these sequences (i.e. the sequence is
highlighted in yellow)?
10. We can calculate how closely related these gene sequences are in all of the species we’ve compared.
To do this, plug in the answer to the above question into the following formula:
((1002 – answer above) / 1002) x 100 = % of nucleotides that are identical across species
What percentage of nucleotides in the TAS2R38 sequence are identical across all of the species we
compared?
11. Finally, let’s compare all of the sequences, including the PCR product. Earlier we determined which
nucleotide was different in human tasters vs. non-tasters. Find this nucleotide in the new alignment.
Using the alignment you just generated that includes all species AND the PCR product, do you think that
other primates are tasters or non-tasters?
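The counting in steps 8 through 10 can also be sketched in Python: find the alignment columns where the sequences disagree (the columns highlighted in yellow), then apply the percent-identity formula from step 10. The three short sequences below are made-up placeholders that are already aligned to equal length; they are not TAS2R38 data. The function names are hypothetical.

```python
def differing_positions(aligned_sequences):
    """Return 1-based positions where the aligned sequences are not all identical."""
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i + 1
        for i in range(length)
        if len({seq[i] for seq in aligned_sequences}) > 1
    ]


def percent_identity(n_differences, total_length):
    """Formula from step 10: (total - differences) / total x 100."""
    return 100 * (total_length - n_differences) / total_length


# Made-up, pre-aligned toy sequences (10 nucleotides each).
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACCTAC",
    "ACGAACGTAC",
]

diffs = differing_positions(toy_alignment)
print(diffs)                                                 # [4, 7]
print(percent_identity(len(diffs), len(toy_alignment[0])))   # 80.0
```

For the worksheet, the total length is the 1002 used in the formula in step 10, and the number of differences is your count of yellow-highlighted positions.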