UCSC genome support forum

Subject: RE: [genome] rRNA track
Posted by sjnair on Thu, 15 Oct 2015 07:31:04 GMT
Thank you all for the illuminating discussion. We appreciate your input. We will try your suggestions
and read more on this issue. One question: where can we find the contact info for the producers of
the reference genome assembly?
Sincerely,
Sree
________________________________
From: Angie Hinrichs [[email protected]]
Sent: Wednesday, October 14, 2015 12:49 PM
To: Galt Barber
Cc: Nair, Sreejith; Qi Ma; btanasa-forward; [email protected]
Subject: Re: [genome] rRNA track
I believe Galt is referring to the alignment of highly repeated rRNA subunits, but the intention was
actually to align larger regions of the reference assembly sequence. The suitability of BLAT
depends on the size of the region that you're searching in addition to the amount of repetitive
content. The tiles that Galt referred to are 11-base sequences that are overrepresented in the
genome. In addition, parts of the genome that are annotated by RepeatMasker or short Tandem
Repeats are soft-masked so that alignments cannot begin there -- but alignments can extend
through those tiles and regions from adjacent non-repetitive sequence.
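To illustrate the tile masking described above, here is a minimal sketch in Python. It is not BLAT's implementation: the function names and the `max_count` threshold are hypothetical, and the only detail taken from the thread is the 11-base tile size. It counts every 11-mer in a sequence, treats the overrepresented ones as masked, and lists the positions where an alignment could still be seeded.

```python
from collections import Counter

TILE_SIZE = 11  # tile length mentioned in the thread


def overused_tiles(sequence, max_count):
    """Return the set of 11-mers that occur more than max_count times.

    max_count is a hypothetical threshold chosen for illustration only.
    """
    seq = sequence.upper()
    counts = Counter(
        seq[i:i + TILE_SIZE] for i in range(len(seq) - TILE_SIZE + 1)
    )
    return {tile for tile, n in counts.items() if n > max_count}


def seedable_positions(sequence, masked_tiles):
    """Return start positions whose 11-mer is not masked.

    Only seeding is modelled here; as noted in the thread, an alignment
    seeded in unmasked sequence can still extend through masked tiles.
    """
    seq = sequence.upper()
    return [
        i
        for i in range(len(seq) - TILE_SIZE + 1)
        if seq[i:i + TILE_SIZE] not in masked_tiles
    ]


if __name__ == "__main__":
    # Toy sequence: a repetitive run followed by non-repetitive sequence.
    toy = "A" * 30 + "CGTACGTTAGCCGATAGCTAGG"
    masked = overused_tiles(toy, max_count=5)
    print(sorted(masked))
    print(seedable_positions(toy, masked)[:3])
```

In the toy sequence, no seed can start inside the repetitive run, but seeds are available as soon as a tile includes adjacent non-repetitive bases.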
> rDNA clusters (200-400 repeats of 43kb region)
43kb is too long for online BLAT's 25kb limit, so that region could be split in half.
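A minimal sketch of that split in Python, assuming the sequence is already available as a plain string. The function name `split_for_blat` is hypothetical; the 25,000-base limit is the one quoted in this thread. The sequence is cut into the smallest number of roughly equal pieces that each fit under the limit.

```python
BLAT_WEB_LIMIT = 25_000  # online BLAT limit quoted in the thread, in bases


def split_for_blat(sequence, limit=BLAT_WEB_LIMIT):
    """Split a sequence into the fewest near-equal chunks of at most `limit` bases."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if not sequence:
        return []
    n_chunks = -(-len(sequence) // limit)   # ceiling division
    size = -(-len(sequence) // n_chunks)    # chunk length, never above limit
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


if __name__ == "__main__":
    region = "A" * 43_000  # stand-in for a 43 kb rDNA repeat unit
    for n, chunk in enumerate(split_for_blat(region), start=1):
        print(n, len(chunk))
```

Note that a hit spanning the cut point would be reported as two separate alignments, so it may be worth checking the junction separately.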
I strongly suggest contacting the producers of the reference genome assembly as Steve said, to
ask how the clusters were placed in the assembly. Given the repetitive and highly polymorphic
nature of these clusters (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2134781/) it seems that
those regions of the genome would be extremely difficult to assemble. The discussion here might
also be of interest: https://www.biostars.org/p/12325/
Angie
On Wed, Oct 14, 2015 at 11:31 AM, Galt Barber <[email protected]> wrote:
BLAT will have problems dealing with highly repetitive sequences.
Not only are highly-used tiles masked out providing no seeds at those locations,
but it also has built-in limits to only return 16 alignments per chromosome per strand.
Perhaps another aligner like Bowtie or BWA would work better.
Repetitive DNA and next-generation sequencing: computational challenges and solutions
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3324860/
-Galt
2015-10-14 9:41 GMT-07:00 Steve Heitner <[email protected]>:
Hello, Sree.
We cannot provide assistance with sequence analysis, but we can suggest a way to align your
region of interest on chr21 to other regions in the reference genome to see whether the
alignments show informative differences between similar regions. Perform the following steps:
1. Get your coordinates of interest from a Table Browser query as previously outlined by my
colleague Matt Speir
2. View this region in the Browser:
2.1. Navigate to http://genome.ucsc.edu/cgi-bin/hgGateway
2.2. Enter your assembly of choice and enter your coordinates in the “search term” box
2.3. Click the “submit” button
3. In the blue navigation bar at the top of the screen, click “View/DNA”
4. Click the “get DNA” button
5. Copy the DNA sequence
6. Navigate to http://genome.ucsc.edu/cgi-bin/hgBlat
7. Paste the sequence into the text box (note that blat has a limit of 25,000 bases, so if your
region is larger than this, you will need to trim the sequence – this can be done more easily by
just viewing a smaller region in the Browser before obtaining the DNA sequence in steps 3-5)
8. Click the “submit” button
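Steps 2-5 can also be scripted. The sketch below is not part of the procedure above: it assumes the UCSC REST API `getData/sequence` endpoint at api.genome.ucsc.edu, which is not mentioned anywhere in this thread, and it assumes the JSON reply carries the bases in a `dna` field. The function names and the example coordinates are placeholders. The only value taken from the thread is the 25,000-base BLAT limit, which the code checks before building the request.

```python
import json
from urllib.request import urlopen

# Assumed endpoint of the UCSC REST API (not mentioned in the thread).
API_BASE = "https://api.genome.ucsc.edu/getData/sequence"
BLAT_WEB_LIMIT = 25_000  # online BLAT limit quoted in the thread, in bases


def sequence_url(genome, chrom, start, end):
    """Build a request URL for a region small enough to paste into web BLAT."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    if end - start > BLAT_WEB_LIMIT:
        raise ValueError("region exceeds the 25,000-base BLAT limit; use a smaller region")
    return f"{API_BASE}?genome={genome};chrom={chrom};start={start};end={end}"


def fetch_dna(genome, chrom, start, end):
    """Download the region and return its bases (assumes a 'dna' field in the reply)."""
    with urlopen(sequence_url(genome, chrom, start, end)) as response:
        return json.load(response)["dna"]


if __name__ == "__main__":
    # Placeholder coordinates, not an actual rDNA locus.
    print(sequence_url("hg38", "chr21", 1_000_000, 1_020_000))
```

The returned sequence can then be pasted into the BLAT text box as in step 7.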
You may also wish to contact the producers of the reference genome assembly (NCBI for hg18
and the Genome Reference Consortium for hg19 and hg38) to see if they have any comments
about how the rDNA repetitive regions were handled and whether different sequences were
assigned to different chromosomes in those assemblies.
Please contact us again at [email protected] if you have
any further questions. All messages sent to that address are archived on a publicly-accessible
Google Groups forum. If your question includes sensitive data, you may send it instead to
[email protected].
--Steve Heitner
UCSC Genome Bioinformatics Group
From: Nair, Sreejith [mailto:[email protected]]
Sent: Friday, October 09, 2015 12:17 AM
To: Qi Ma; Matthew Speir
Cc: btanasa-forward; [email protected]
Subject: RE: [genome] rRNA track
Dear Dr. Speir,
Thanks for the explanation. One specific problem we are facing is distinguishing the rDNA
sequences of different chromosomes. As you know, rDNA clusters (200-400 repeats of 43kb
region) are distributed on the p-arms of 5 different acrocentric chromosomes across the human
genome. The 43 kb region is assumed to be the same throughout the genome. However, it is possible that
there are some markers within the repeats or in other parts of the p-arms of the different
chromosomes that would help us align the sequences in a chromosome-specific manner. The
reason we need this information is to align data from our various next-generation sequencing
experiments to the rDNA repeat on the p-arm of chromosome 21.
We would greatly appreciate it if you could provide any insight regarding this matter.
Sincerely,
Sree Nair
________________________________
From: Qi Ma [[email protected]]
Sent: Thursday, October 08, 2015 4:13 PM
To: Matthew Speir
Cc: btanasa-forward; [email protected]; Nair, Sreejith
Subject: Re: [genome] rRNA track
Attn to Sree:
Could you also add some of your comments on our discussion, and explain more of what we are
searching for to Dr. Matthew?
Many Thanks,
Best,
Qi
On Thu, Oct 8, 2015 at 4:11 PM, Qi Ma <[email protected]> wrote:
Dear Dr. Matthew,
Thank you so much for your reply. It is very helpful.
However, we are searching for the locations of rRNA (the sequence should include the components
indicated in http://www.ncbi.nlm.nih.gov/nuccore/555853/) on each chromosome in the hg18
genome. Could you give us any clue on how to get that information? Many
thanks.
Best,
Qi
On Thu, Oct 8, 2015 at 2:39 PM, Matthew Speir <[email protected]> wrote:
Hi Bogdan,
Thank you for your question about finding ribosomal RNA (rRNA) in the UCSC Genome Browser.
Identifying rRNA in the Genome Browser is going to depend on which assembly you are using as
some assemblies have better annotation than others. If you are looking at the human (hg19,
hg38) or mouse (mm10) genomes, you can use the "GENCODE Gene Annotation" tracks to view
rRNA. You can also use the Ensembl Genes track to view rRNA genes in the Genome Browser as
it is available for many more organisms.
To find these genes in the Browser, you can use the Table Browser to filter these tables for only
the rRNA genes. In the following steps, I've used GENCODE Genes on hg38 as my example, but
you should be able to modify these steps to use Ensembl Genes for a different organism.
1. Navigate to the Table Browser, http://genome.ucsc.edu/cgi-bin/hgTables.
2. Make the following selections:
clade: Mammal
genome: Human
assembly: Dec. 2013 (GRCh38/hg38)
group: Genes and Gene Predictions
track: All GENCODE V22
table: Basic (wgEncodeGencodeBasicV22)
output: Hyperlinks to Genome Browser
3. Next to 'filter', click "create".
4. Under 'Linked Tables', check the box next to 'wgEncodeGencodeAttrsV22'.
5. Click 'allow filtering using fields in checked tables'.
6. Under 'hg38.wgEncodeGencodeAttrsV22 based filters', type 'rRNA' in the 'geneType' and
'transcriptType' fields.
The "geneType" line should read: geneType does match rRNA
The "transcriptType" line should read: transcriptType does match rRNA
7. Click 'submit'.
8. Click 'get output'
You will now see a page full of links to rRNA genes in the Genome Browser. Note that some of
these may be rRNA pseudogenes. You will need to click through to the Ensembl site to see more
information about each gene.
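The same filter can be sketched offline in Python for anyone who has exported the attributes table as tab-separated text. This is only an illustration: the column names `geneType` and `transcriptType` come from the steps above, but the remaining columns and all sample rows are made up, and the function name `rrna_rows` is hypothetical. The Table Browser's "does match" is a wildcard match, whereas this sketch uses exact string equality.

```python
import csv
import io


def rrna_rows(handle):
    """Return rows whose geneType and transcriptType are both 'rRNA'.

    Expects tab-separated text with a header line naming the columns.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row
        for row in reader
        if row["geneType"] == "rRNA" and row["transcriptType"] == "rRNA"
    ]


# Made-up rows for illustration; these are not real GENCODE records.
SAMPLE_ROWS = [
    ["geneId", "geneName", "geneType", "transcriptId", "transcriptType"],
    ["GENE_A", "name_a", "rRNA", "TX_A", "rRNA"],
    ["GENE_B", "name_b", "protein_coding", "TX_B", "protein_coding"],
    ["GENE_C", "name_c", "rRNA", "TX_C", "rRNA"],
]
SAMPLE_TSV = "\n".join("\t".join(fields) for fields in SAMPLE_ROWS) + "\n"


if __name__ == "__main__":
    for row in rrna_rows(io.StringIO(SAMPLE_TSV)):
        print(row["geneName"], row["transcriptId"])
```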
I hope this is helpful. If you have any further questions, please reply to
[email protected]. All messages sent to that address are
archived on a publicly-accessible Google Groups forum. If your question includes sensitive data,
you may send it instead to [email protected].
Matthew Speir
UCSC Genome Bioinformatics Group
On 10/6/15 6:31 PM, Bogdan Tanasa wrote:
Dear all,
please could you advise on finding a good rRNA (5S, 5.8S, 28S) track in the genome browser.
Many thanks,
-- Qi and Bogdan
--
Qi Ma, Postdoctoral Scholar
Department of Bioengineering & Department of Medicine,
University of California, San Diego (UCSD)
9500 Gilman Dr. #0419
La Jolla, CA 92093-0419
(858)5983866
[email protected]