1. What is a gene?


Definition: A gene is a discrete unit of DNA (or RNA in some viruses)
that encodes a nucleic acid or protein product that contributes to or
influences the phenotype of the cell or the organism.
Genes are the functional units of chromosomal DNA. Each gene not
only encodes the structure of some cellular product, but also bears
control elements (short sequences) that determine when, where, and
how much of that product is synthesized. Most genes encode protein
products; special classes of genes encode RNA molecules.

The way genes encode proteins is indirect and involves several steps. The first
step is to copy (transcribe) the information encoded in the DNA of the gene as
a related but single-stranded molecule called messenger RNA. Subsequently the
information in the messenger RNA is translated (decoded) into a string of
amino acids called a polypeptide. The polypeptides, on their own or by
aggregating with other polypeptides and cell constituents, form the functional
proteins of the cell.
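The two steps above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: the input is taken to be the coding (sense) strand, so transcription reduces to replacing T with U, and translation uses the standard genetic code, starts at the first AUG, and stops at the first stop codon. The function names and the example sequence are hypothetical, chosen only for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (simplification: T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, no polypeptide
    residues = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# Made-up 12-nucleotide example: Met-Ala-Trp followed by a stop codon.
mrna = transcribe("ATGGCCTGGTAA")
polypeptide = translate(mrna)
```

Here the polypeptide is written with one-letter amino acid codes; a real pipeline would also have to deal with the template strand, splicing, and ambiguous bases, none of which this sketch attempts.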
Genetica per Scienze Naturali (Genetics for Natural Sciences)
Academic year 2003-04, Prof. S. Presciuttini
2. Introns and exons


Trying to pinpoint precisely what genes are is complicated by the fact
that many eukaryotic genes contain mysterious segments of DNA,
called introns, interspersed in the transcribed region of the gene.
Introns do not contain information for a functional gene product such as
a protein. They are transcribed together with the coding regions
(called exons) but are then excised from the initial transcript.
Since the correct sequence in the introns (as well as in the regulatory
region) is necessary to generate a properly sized transcript at
the right time and place, introns (along with coding and regulatory
regions) should be considered part of the overall functional unit; in
other words, part of the gene.
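The excision of introns from the initial transcript can be pictured with a toy Python sketch. The sequence and the exon coordinates below are made up for illustration (exons in upper case, the intron in lower case), and the `splice` helper is hypothetical; real splicing depends on sequence signals that this sketch does not model.

```python
def splice(pre_mrna, exons):
    """Join the exon segments of a primary transcript, dropping the introns.

    exons is a list of (start, end) pairs, 0-based and end-exclusive.
    """
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Toy primary transcript: exon 1, one 12-nucleotide intron, exon 2.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UGGUAA"
exons = [(0, 6), (18, 24)]  # hypothetical coordinates for this toy sequence

mature_mrna = splice(pre_mrna, exons)
```

The primary transcript is 24 nucleotides long, while the mature mRNA keeps only the 12 exon nucleotides, which is the sense in which introns affect transcript size without contributing coding information.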
4. Schematic gene structure
Generalized gene structure in prokaryotes and eukaryotes. The coding
region (dark green) is the region that contains the information for the
structure of the gene product (usually a protein). The adjacent regulatory
regions (lime green) contain sequences that are recognized and bound by
proteins that make the gene's RNA and by proteins that influence the
amount of RNA made.
3. The average length of coding regions
Organism                               Average length of gene product (aa)
Vibrio cholerae (bacterium)            304
Saccharomyces cerevisiae (yeast)       477
Drosophila melanogaster (fruit fly)    492
Caenorhabditis elegans (nematode)      436
Arabidopsis thaliana (weed)            435
Homo sapiens                           497
Estimates of the average length of polypeptide chains
coded by genes of various organisms; these values have to
be multiplied by 3 to obtain the length of the
corresponding coding DNA. Typical values are 1,000 to
1,500 bp.
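The multiplication by 3 can be made concrete with a short Python sketch. The pairing of each organism with its value assumes that the lengths are listed in the same order as the organisms; the dictionary and function names are hypothetical. The stop codon is optional in this helper and is not counted by default.

```python
# Average polypeptide length in amino acids (aa), assuming the values
# are listed in the same order as the organisms in the table.
AVG_LENGTH_AA = {
    "Vibrio cholerae": 304,
    "Saccharomyces cerevisiae": 477,
    "Drosophila melanogaster": 492,
    "Caenorhabditis elegans": 436,
    "Arabidopsis thaliana": 435,
    "Homo sapiens": 497,
}


def coding_length_bp(n_amino_acids, include_stop=False):
    """Coding DNA length in bp: three nucleotides per amino acid."""
    return 3 * n_amino_acids + (3 if include_stop else 0)


coding_bp = {org: coding_length_bp(aa) for org, aa in AVG_LENGTH_AA.items()}

for organism, bp in coding_bp.items():
    print(f"{organism}: {bp} bp")
```

With these inputs the eukaryotic values fall between about 1,300 and 1,500 bp, while the bacterial value is somewhat below 1,000 bp.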
5. Number of introns-exons per gene
Distribution of the number of exons among genes of three organisms
Many eukaryotic genes contain mysterious segments of DNA, called
introns, interspersed in the transcribed region of the gene. Introns do
not contain information for a functional gene product such as a protein.
6. Genomes and genes
The number of genes increases with genome size, but the trend is
complicated by repetitive DNA and introns.
Counting genes is difficult, even in completely sequenced genomes.
7. Average gene length
Intron/exon statistics for various organisms
8. Plasmid genomes
Bacterial cells isolated from nature often contain small DNA elements that are not
essential for the basic operation of the bacterial cell. These elements are called
plasmids. Plasmids are symbiotic molecules that cannot survive at all outside of cells.
Even though plasmids are not part of the basic operational system of their host cells,
some are quite complex, carrying many genes, so it is quite appropriate to refer to their
distinctive DNA as a "plasmid genome." Bacterial plasmids often contain genes that are
extremely useful to the bacterial host, for example, by promoting bacterial cell fusion,
conferring antibiotic resistance, or producing toxins.
Plasmids also are occasionally found in fungal and plant cells. Most are found inside
mitochondria and chloroplasts, but some are found in nuclei or in the cytosol. Unlike
the bacterial plasmids mentioned above, these eukaryotic plasmids seem to provide no
benefits for their hosts; they seem to exist selfishly, only for the purpose of their own
propagation.
For their replication and maintenance, plasmids depend on the general cellular
machinery encoded by the host genome. Bacterial plasmids are most often circular, but
there are linear types too. In fungi and plants, linear plasmids are most common, but
circular types are known in fungi.
9. Organellar genomes




Mitochondrial and chloroplast chromosomes consist of double-stranded DNA
molecules. Individual mitochondria and chloroplasts contain multiple identical
copies of their chromosomes, and each eukaryotic cell contains several to many of
these organelles.
The organelle chromosomes contain genes specific to the functions of the organelle
concerned. Nevertheless, most of the biological functions that occur inside these
organelles are specified by genes in the nuclear genome. There is no overlap with the
nuclear genome in gene content.
Mitochondria and chloroplasts probably were originally prokaryotic cells that
entered and took up a symbiotic relationship inside another cell. Throughout
evolution most of the original prokaryotic genes were transferred to the nuclear
genome or lost.
Mitochondrial genomes can be eliminated in some organisms such as yeasts, but
most organisms cannot survive without them, so there is still mutual interdependence
between nuclear and organelle subdivisions of the genome. Chloroplasts can be
eliminated only in photosynthetic organisms that can survive by taking in preformed
nutrients from the environment (that is, that can act as heterotrophs).
10. Most eukaryotic DNA does not include genes




Between genes there is DNA, mostly of unknown function. The size
and nature of this DNA vary with the genome.
In bacteria and fungi there is little, but in mammals the intergenic
regions can be huge.
Sequences of DNA that exist quite distant from a given gene can
affect the regulation of that gene. They could thus be considered
part of the functional gene unit, even though separated by long
segments of DNA having nothing to do with the gene in question.
In many eukaryotes some of the DNA between genes is repetitive,
consisting of several different types of units repeated throughout the
genome. Some of the repetitive DNA is dispersed; some is found in
contiguous "tandem" arrays. Repetitive DNA is also found in some
introns. The extent of this DNA is different in different species, and
indeed there is variation of repeat number within species.
11. Comparing gene densities
Schematic diagram of gene topography in four organisms.
Light green = introns; dark green = exons; white = intergenic regions
12. A small fraction of total eukaryotic DNA is coding
In mammals, only a few percent of the DNA is actually coding.
13. Coding sequences are needles in the haystack


It is apparent that the coding sequences are only a small part of the
genome in most eukaryotes, particularly in humans. Finding these
regions is like finding a needle in a haystack.
In addition, the genes are not uniformly distributed. There are regions
in the genome where the genes are packed together, and regions where
they are sparse, where finding genes is like finding water in a desert.
14. Categorizing the genes in eukaryotic genomes

Classification schemes based on gene function suggest that all eukaryotes possess
the same basic set of genes, but that more complex species have a greater number of
genes in each category. For example, humans have the greatest number of genes in
all but one of the categories used in the figure, the exception being 'metabolism'
where Arabidopsis comes out on top as a result of its photosynthetic capability,
which requires a large set of genes not present in the other four genomes included in
this comparison.
This functional classification reveals other interesting features, notably
that C. elegans has a relatively high number of genes whose functions are
involved in cell-cell signaling, which is surprising given that this organism
has just 959 cells. Humans, who have 10^13 cells, have only 250 more genes
for cell-cell signaling.
15. Overview of the human genome












Genome size is approximately 3,200 Mb
Gene number is approximately 30,000
Average gene density is 1 per 100 kb (5% of DNA encodes proteins); some areas are gene rich, others are gene deserts (0 to 64 genes per 100 kb)
Average gene size (including introns) is 27 kb; gene regions account for about 25% of genome
Average coding sequence per polypeptide is 1.3 kb
Fraction of genome with coding functions is about 1.5%
At least 50% of genome is made of transposable elements (e.g. LINEs and Alus)
Intron number ranges from 0 (in histones) to 234 (titin, a muscle protein)
Hundreds of genes appear to have been transferred directly from bacteria to vertebrate genomes. Mechanism unknown.
Functions have been assigned to 60% of genes
Largest human gene is dystrophin (mutated in muscular dystrophy): 2.5 Mb (larger than some bacterial genomes)
1077 blocks of duplicated regions in human genome (contain 10,000 genes): suggests genome rearrangements are common in evolution
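Several of the figures above can be cross-checked against each other with simple arithmetic. The Python sketch below does this under two assumptions: that 1 Mb equals 1,000 kb, and that the 1.3 kb figure is the average coding length per gene. The variable names are hypothetical, and the results are rough estimates, not measurements.

```python
GENOME_SIZE_KB = 3_200 * 1_000   # 3,200 Mb expressed in kb
GENE_COUNT = 30_000              # approximate gene number
AVG_GENE_SIZE_KB = 27            # average gene size, introns included
AVG_CODING_KB = 1.3              # assumed: average coding length per gene

# Average spacing of genes along the genome.
kb_per_gene = GENOME_SIZE_KB / GENE_COUNT

# Share of the genome occupied by genes (exons plus introns).
gene_region_fraction = GENE_COUNT * AVG_GENE_SIZE_KB / GENOME_SIZE_KB

# Share of the genome that is coding sequence.
coding_fraction = GENE_COUNT * AVG_CODING_KB / GENOME_SIZE_KB

print(f"one gene per {kb_per_gene:.0f} kb")
print(f"gene regions: {gene_region_fraction:.1%} of genome")
print(f"coding sequence: {coding_fraction:.1%} of genome")
```

Under these assumptions the sketch gives about one gene per 107 kb, gene regions covering about 25% of the genome, and coding sequence covering a little over 1%, which is broadly in line with the quoted density of 1 per 100 kb, the 25% figure, and the coding fraction of about 1.5%.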