BIOINFORMATICS

Biological information is encoded in the nucleotide sequence of DNA. Bioinformatics is the field that identifies biological information in DNA using computer-based tools. Some bioinformatics algorithms aid the identification of genes, promoters, and other functional elements of DNA. Other algorithms help determine the evolutionary relationships between DNA sequences.

Because of the large number of tools and DNA sequences available on the Internet, experiments done on the computer now complement experiments done in vitro and in vivo. This movement between biochemistry and computation is a key feature of modern biological research.

In Part I, you will use the Basic Local Alignment Search Tool (BLAST) to identify sequences in biological databases and to make predictions about the outcome of your experiments. In Part II, you will find and copy the human PTC taster and non-taster allele DNA sequences. In Part III, you will discover the chromosome location of the PTC tasting gene. In Part IV, you will explore the evolutionary history of the gene.

I. Use BLAST to Find DNA Sequences in Databases

The following primer set was used in the experiment:

5'-CCTTCGTTTTCTTGGTGAATTTTTGGGATGTAGTGAAGAGGCGG-3' (Forward Primer)
5'-AGGTTGGCTTGGTTTGCAATCATC-3' (Reverse Primer)

1. Initiate a BLAST search.
   a. Open the Internet site of the National Center for Biotechnology Information (NCBI), www.ncbi.nlm.nih.gov.
   b. Click on the BLAST link in the “Popular Resources” list on the right side of the page.
   c. Click on the link nucleotide BLAST under the heading Basic BLAST.
   d. Where it says “NCBI/BLAST/blastn suite”, be sure the “blastn” tab is selected.
   e. Enter the sequences of the primers into the Search window. These are the query sequences. It may be easiest to cut and paste them from this document rather than typing them by hand.
      Paste these sequences one right after the other into the query box (where it says “enter accession number, gi, or FASTA sequence”); no spaces are needed.
   f. Omit any non-nucleotide characters from the window, because they will not be recognized by the BLAST algorithm.
   g. In the “Choose search set” section, under database, select “others (nr etc)” and, in the drop-down list directly below this, select “Nucleotide collection (nr/nt)”.
   h. Under Program Selection, optimize for somewhat similar sequences by selecting blastn.
   i. Click on BLAST! The query sequences are sent to a server at the National Center for Biotechnology Information in Bethesda, Maryland. There, the BLAST algorithm will attempt to match the primer sequences to the millions of DNA sequences stored in its database. While the search runs, a page showing its status is displayed until your results are available. This may take only a few seconds, or more than a minute if a lot of other searches are queued at the server.
2. The results of the BLAST search are displayed in three ways as you scroll down the page:
   a. First, a “Graphic Summary” illustrates how significant matches, or hits, aligned with the query sequence. Matches are coded by color according to their alignment scores.
   b. This is followed by “Descriptions”, a list of significant alignments, or hits, with links to Accession (individual record in the database) information.
   c. Last, the “Alignments” section shows a detailed view of how each primer sequence (query) matched up (aligned) to the nucleotide sequence in the record in the database (subject). Notice that a match to the forward primer (nucleotides 1–42) and a match to the reverse primer (nucleotides 44–68) are within the same Accession.
3. Determine the predicted length of the product that the primer set would amplify in a PCR reaction (in vitro):
   a. In the list of significant alignments, notice the E-values in the column on the right.
      The Expectation value, or E-value, is the number of alignments with the query sequence that would be expected to occur by chance in the database. The lower the E-value, the higher the probability that the record in the database is related to the query.
      What is the E-value for the first alignment in the database?
   b. Look at the top 15 “hits” on the list. Do they make sense (YES/NO)? What do they have in common?
   c. Scroll down to the Alignments section to see exactly where the two primers have landed in the first subject sequence.
   d. The lowest and highest nucleotide positions in the subject sequence indicate the borders of the amplified sequence. Subtracting one from the other gives the difference between the two coordinates.
   e. However, the actual length of the fragment includes both ends, so add 1 nucleotide to the result to determine the exact length of the PCR product amplified by the two primers.
      What is the total length of the amplified sequence? ____________________________

II. Find and Copy the Human (Homo sapiens) PTC Taster and Non-taster Alleles

1. In the “Descriptions” section of the BLAST output, find the records for the human PTC taster and non-taster alleles (you will see “taster” and “non-taster” in the name of the record).
   What is the accession number for the human PTC taster allele? ____________
   What is the accession number for the human PTC non-taster allele? ________
2. In the list of significant alignments, select the hit containing the human taster allele from among those with the lowest E-values. Tell your browser to open this in a new window or tab, because you will return to the BLAST output page later.
3. Click on the Accession link at the left to open the sequence datasheet for this hit.
4. At the top of the report, note basic information about the sequence, including its base-pair length, database accession number, source, and references.
   How large is the PTC taster sequence in this record (in base pairs, bp)? ___________
5. In the middle section of the report, note annotations of gene and regulatory features, with their beginning and ending nucleotide positions (xx .. xx). One of the features should be the translation. Find the protein sequence within this record and paste the sequence or write the first 15 amino acids here:
6. The bottom section of the report lists the entire nucleotide sequence of the gene or DNA sequence that contains the PCR product. Paste/record this sequence below.
7. Repeat Steps 2–6 for the human non-taster allele.
   Paste or record the first 20 bases of the PTC taster allele sequence here:
   Paste or record the first 20 bases of the PTC non-taster allele sequence here:

III. Use Map Viewer to Determine the Chromosome Location of the TAS2R38 Gene

1. Return to the NCBI home page and click on “Map Viewer”, located at the bottom of the page under the “Featured” heading.
2. In the “Search:” box, select Homo sapiens (humans); in the “for:” box, enter TAS2R38 and click “GO”.
3. The top of the results page shows a schematic of all human chromosomes. Each chromosome is represented by two bars; the space between the bars represents the location of the centromere. The location of TAS2R38 is also indicated in the schematic.
   On what chromosome (and which arm) is the TAS2R38 gene? __________________
4. Click on the marked chromosome number in the diagram to move to the TAS2R38 locus.
5. On the left side of the page (shaded in blue, towards the bottom) is a picture of the chromosome containing TAS2R38. Part of it is shaded in; this is the region of that chromosome that is shown to the right. Just above this is a zoom tool. Use it to zoom in until the map shows 1/1,000th of the chromosome. You will need to mouse over the tool to see what zoom level you are on.
6. Multiple maps are displayed in the remainder of the output; the title of each map is at the top. In each of these maps, the location of TAS2R38 is highlighted.
   Some of the maps that may be present in this output include:
   a. Pheno: shows loci associated with phenotypes
   b. Morbid: shows genes associated with disease phenotypes
   c. Genes_cyto: shows the cytogenetic locations of genes (e.g., where genes are relative to the banding patterns seen in metaphase chromosomes)
   d. ensTranscripts: shows regions that correspond to mRNA
   e. ensGenes: shows regions that correspond to annotated genes
   f. RnRNA: shows related rat mRNAs aligned to the human chromosome
   Using the phenotype map, what are the names of the genes on either side of TAS2R38?
7. Click on the TAS2R38 locus in the phenotype map and then click on the “symbol” link that appears. This takes you to another section of NCBI called “Online Mendelian Inheritance in Man (OMIM).” Use the information on this site to answer the following questions.
   What is the full name of the TAS2R38 gene?
8. Read about how the TAS2R38 gene was cloned and how it functions. Paste the paragraph or record the first 20 words of the paragraph that describes the function of the TAS2R38 gene here:
9. Much of this paragraph describes work done in 2009 by Shah et al. Click on the link for this reference. This takes you to yet another section of NCBI called “PubMed.” PubMed is a massive database of scientific literature. You are currently viewing the abstract of the paper written by Shah et al. concerning TAS2R38. If you wanted, you could download this paper or related works from this site.
   Paste the abstract or record the first 20 words of the abstract of the Shah paper here:

IV. Use Multiple Sequence Alignment to Explore the Evolution of the TAS2R38 Gene

1. Open the BioServers Internet site at the Dolan DNA Learning Center, www.bioservers.org.
2. Enter Sequence Server using the button in the left-hand column. (You can register if you want to save your work for future reference.)
3. Create PTC gene sequences for comparison.
   a. Click on Create Sequence at the top of the page.
   b. Copy one of the TAS2R38 sequences (from the “PTC sequences” file on Moodle), and paste it into the Sequence window. Enter a name for the sequence, and click OK. Your new sequence will appear in the workspace at the bottom half of the page.
   c. Repeat Steps a and b for each of the human and primate sequences from the PTC sequences file.
4. To compare sequences, click on the check boxes in the left-hand column for the sequences you want to compare, such as the human PTC taster vs. human PTC non-taster. Then click on Compare in the grey bar. (The default operation is a multiple sequence alignment, using the CLUSTAL W algorithm.) The checked sequences are sent to a server at Cold Spring Harbor Laboratory, where the CLUSTAL W algorithm will attempt to align each nucleotide position.
5. First, let’s compare the following sequences:
   a. Human taster
   b. Human non-taster
   c. Human PCR product (non-taster)
6. The results will appear in a new window. This may take only a few seconds, or more than a minute if a lot of other searches are queued at the server.
   a. The sequences are displayed in rows of 25 nucleotides. Yellow highlighting denotes mismatches between sequences or regions where one sequence begins or ends before another.
   b. To view the entire gene, enter 1100 as the number of nucleotides to display per page, then click Redraw.
7. Scan the alignment where all three sequences appear; this is the region that is not highlighted in yellow.
8. Find a nucleotide position where you can see a difference between human taster and non-taster (a single nucleotide in this region that is highlighted yellow). Use the numbers at the end of each row to help you determine the precise location.
   At what nucleotide position do you notice a difference between human taster and non-taster?
   What is the nucleotide (G, A, T, or C) at this position in the human non-taster?
9. Now compare the following (DO NOT include the PCR product here):
   a. Human taster
   b. Human non-taster
   c. Gorilla gorilla (Gorilla)
   d. Pan troglodytes (Chimpanzee)
   e. Pan paniscus (Bonobo)
   At how many positions do you notice a difference among these sequences (i.e., the sequence is highlighted in yellow)?
10. We can calculate how closely related these gene sequences are in all of the species we’ve compared. To do this, plug the answer to the above question into the following formula:
   (1002 – answer above)/1002 x 100 = % of nucleotides that are identical across species
   What percentage of nucleotides in the TAS2R38 sequence are identical across all of the species we compared?
11. Finally, let’s compare all of the sequences, including the PCR product. Earlier we determined which nucleotide was different in human tasters vs. non-tasters. Find this nucleotide in the new alignment.
   Using the alignment you just generated, which includes all species AND the PCR product, do you think that other primates are tasters or non-tasters?
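
To check your hand calculations, the two pieces of arithmetic in this exercise (the PCR product length in Part I, step 3, and the percent identity in Part IV, step 10) can be sketched in Python. This is only an illustrative sketch: the function names are made up for this example, the coordinates and short sequences in the usage lines are invented placeholders rather than real BLAST or TAS2R38 data, and the difference count assumes the sequences have already been aligned to equal length (for example, copied from the CLUSTAL W output).

```python
def amplicon_length(lowest_pos, highest_pos):
    """Length of a PCR product given the lowest and highest subject
    coordinates matched by the two primers (both ends are included)."""
    if highest_pos < lowest_pos:
        raise ValueError("highest_pos must not be smaller than lowest_pos")
    # Subtract the coordinates, then add 1 so that both end nucleotides count.
    return highest_pos - lowest_pos + 1


def count_differences(aligned_sequences):
    """Count alignment columns in which the sequences are not all identical.

    Assumes every sequence has already been aligned to the same length.
    """
    lengths = {len(seq) for seq in aligned_sequences}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*...) walks the alignment column by column.
    return sum(1 for column in zip(*aligned_sequences) if len(set(column)) > 1)


def percent_identity(total_positions, differing_positions):
    """(total - differences) / total x 100, as in Part IV, step 10."""
    return (total_positions - differing_positions) / total_positions * 100


if __name__ == "__main__":
    # Placeholder coordinates, NOT real BLAST output.
    print(amplicon_length(101, 321))

    # Placeholder aligned sequences, NOT real TAS2R38 data.
    toy_alignment = ["ACGTAC", "ACGTAC", "ACCTAC"]
    differences = count_differences(toy_alignment)
    print(differences, percent_identity(len(toy_alignment[0]), differences))
```

For the worksheet itself, you would call percent_identity(1002, n), where n is the number of yellow-highlighted positions you counted in step 9.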