Finding needles in a haystack - predicting gene regulatory pathways from
microarray and yeast sequence data; and a selection of human genome
anecdotes
David Landsman
Computational Biology Branch, NCBI NLM NIH
The gathering of sequence information has accelerated to the point where it is reasonable to
expect more than 10 bacterial and archaeal, and 1-2 eukaryotic, complete genome sequences
to be deposited in the public databases in a given year. In addition, the identification of the
open reading frames in a genome is a challenge that is being met both computationally and
experimentally, and considerable efforts are underway to expedite the determination of
many of the protein folds and structures encoded by these reading frames. However, the regulatory
networks that underpin the normal functioning of cells, and that represent the interactions
between the genome and its protein and RNA products, are less well understood. For example, in the
yeast Saccharomyces cerevisiae, there are predicted to be about 300 DNA-binding proteins
with a wide range of specific and non-specific DNA-binding activities. Many of the sites that these
proteins bind are as yet undiscovered, and several methods for predicting them have been
developed. Many of these methods use consensus pattern and matrix-based searches, which are
designed to predict cis-acting transcriptional regulatory sequences but have historically
produced large numbers of false positives. We sought to decrease the rate of false positive
detection by incorporating expression profile data into a consensus pattern-based search
methodology. Based on our analysis, we have developed a web-based tool called PROSPECT,
which allows consensus pattern-based searching of gene clusters obtained from microarray
data.
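
To make the idea concrete, here is a minimal sketch of a consensus pattern-based search restricted to one expression cluster. It is not the PROSPECT implementation, whose details are not given here. The function names, the gene names, and the short upstream sequences are hypothetical, and the sketch assumes the consensus is written in standard IUPAC nucleotide codes and that the upstream sequence of each gene in the cluster is already in hand.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus into a regex that also finds overlapping sites."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    # A lookahead matches without consuming, so overlapping sites are reported.
    return re.compile("(?=(" + body + "))")


def reverse_complement(seq):
    """Reverse-complement a DNA string; characters other than ACGT are left as-is."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(consensus, seq):
    """Return sorted (position, strand) hits, in forward-strand coordinates.

    A palindromic consensus is reported once per strand at the same position.
    """
    pattern = consensus_to_regex(consensus)
    seq = seq.upper()
    width = len(consensus)  # each IUPAC code matches exactly one base
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert the reverse-strand offset to the leftmost forward-strand base.
        hits.append((len(seq) - (m.start() + width), "-"))
    return sorted(hits)


def search_cluster(consensus, upstream_by_gene):
    """Map each gene of a cluster to its hits, omitting genes without any hit."""
    result = {}
    for gene, seq in upstream_by_gene.items():
        hits = find_sites(consensus, seq)
        if hits:
            result[gene] = hits
    return result


if __name__ == "__main__":
    # Hypothetical cluster: gene names and upstream sequences are made up.
    cluster = {
        "geneA": "AATGACTCAA",
        "geneB": "TTGAGTCATT",
        "geneC": "AAAAAAAAAA",
    }
    for gene, hits in search_cluster("TGACTC", cluster).items():
        print(gene, hits)
```

Restricting the search to genes that share an expression profile is what reduces the candidate set. The same pattern run against every upstream region in the genome would return many more chance matches.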
For millions of years, L1 retrotransposons have been duplicating in mammalian genomes by an
efficient “copy and paste” mechanism; consequently, L1s now make up 15% of the human
genome. These autonomous elements are thought to have played an important role in the
expansion and evolution of our genome. For example, a recent, and still active, L1 element
was found to have inserted into genes, thereby causing disease. We will show examples of 3’
transduction events for this particular L1 element; 3’ transduction is another mechanism by
which L1s have probably shaped the human genome.
Processed pseudogenes are created by retrotransposition, a process by which an mRNA is
reverse transcribed into DNA and inserted at a new location in a genome. In the human
genome, the total number of processed pseudogenes is estimated at approximately 20,000, and
while most are inactive, some may acquire a new promoter and remain functional. To
understand the nature of this process, we set out to conduct a detailed genomic survey of
processed pseudogenes, concentrating on three families of HMG (high mobility group) genes
(HMGN, HMGA, and HMGB), which are known to have numerous processed
pseudogenes. We will present the general characteristics of the insertions found and
describe some unique retrotransposition events.