Chair of Computational Biology

V5 Forward genetics – molecular markers
Review of lecture V4 ...
- The Arabidopsis genome contains a large number of gene duplications
- More than 50% of the genome is duplicated
- The TAIR website contains the full sequence of the Columbia ecotype
Biological Sequence Analysis
SS 2008
lecture 5
Reverse genetics
The reverse genetics approach tries to identify the function of a particular gene
by studying the effects of a manipulation of its sequence.
Possible manipulations:
- random insertions or deletions
- site-directed mutagenesis (point mutations)
- gene knockout (yeast or mouse)
- RNA silencing
After an alteration, the method looks for a phenotype that may result from this
sequence change.
If differences become observable, conclusions can be drawn about the normal
function of the mutated gene.
Modifying the sequence of a gene requires sequence information retrieved from
genome sequencing, EST sequencing or transcript profiling projects.
M.Sc. thesis S. Pfeifer
Reverse genetics
After choosing a specific target sequence, select mutations that inactivate
the gene or disrupt its function and thus hopefully lead to a visible mutant
phenotype.
Main advantage of reverse genetic studies:
the gene concerned is known beforehand.
Regrettably, the mutations used often result in reduced function
(thus gain-of-function mutations cannot be identified),
and redundant pathways cannot be discovered.
Unfortunately, only a small portion of the mutations exhibit informative
phenotypes, and even fewer display morphological changes that provide a
direct clue about gene function.
Forward genetics
Instead, one often uses a forward genetic (also called classical genetic) approach
to discover the function(s) of a gene.
It allows one
- to consider gain-of-function mutations,
- to identify genes acting within a common pathway as well as genes encoding
interacting proteins, and
- to work without being restricted to any tissue type.
Because of its wide range of applications, this method is often the preferred
strategy in functional studies.
Meiosis can be divided into the first and the second meiotic division.
First meiosis: the homologous chromosomes segregate from each other, and the
diploid cell divides into two haploid cells, each containing one homologue of
every chromosome pair.
Second meiosis: the sister strands (chromatids) of each chromosome are
decoupled, and the DNA is segregated into two sets of strands (each containing
one of each homologue).
It further divides both haploid, duplicated cells to produce 4 gametes, which
can fuse with other haploid cells during fertilisation to create a new diploid
cell, or zygote.
Meiosis terms
A zygote is a cell that is the result of fertilization.
The haploid number is the number of chromosomes in a gamete of an individual.
Diploid cells have two homologous copies of each chromosome, usually one from
the mother and one from the father.
Plants and some algae switch between a haploid and a diploid or polyploid state,
with one of the stages emphasized over the other.
Meiosis terms
Zygosity describes the similarity or dissimilarity of DNA between homologous
chromosomes at a specific allelic position or gene. Every gene in a diploid
organism has two alleles at the gene's locus. These alleles are defined as
dominant or recessive, depending on the phenotype resulting from the two alleles.
An organism is called homozygous at a specific locus when it carries two identical
copies of the gene affecting a given trait on the two corresponding homologous
chromosomes (e.g., the genotype is PP or pp when P and p refer to different
possible alleles of the same gene).
An organism is heterozygous at a locus or gene when it has different alleles
occupying the gene's position in each of the homologous chromosomes.
In diploid organisms, the two different alleles were inherited from the organism's
two parents. For example, a heterozygous individual would have the allele
combination Pp.
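The definitions above can be sketched in a few lines of Python. This is only an illustration: the function name `zygosity` and the convention of writing a genotype as a two-character string such as `Pp` are assumptions made for the example, not part of the lecture.

```python
def zygosity(genotype: str) -> str:
    """Classify a diploid genotype written as two allele symbols, e.g. 'Pp'.

    Assumption for this sketch: each allele is a single character, and
    upper/lower case distinguishes the two alleles (P vs. p).
    """
    if len(genotype) != 2:
        raise ValueError("expected exactly two allele symbols")
    first, second = genotype
    # Identical alleles on both homologous chromosomes -> homozygous.
    return "homozygous" if first == second else "heterozygous"


for g in ("PP", "pp", "Pp"):
    print(g, zygosity(g))
```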
During the pairing of the homologous chromosomes in the first meiosis (the
synapsis), the two copies of each chromosome pair come physically close.
A process named recombination or crossover can occur if the homologous
chromosome arms break and exchange DNA segments, resulting in gametic
chromosomes that consist of material from both members of the chromosome pair.
The crossover directly affects the inheritance pattern of the genes involved, as it
determines whether two genes remain linked and are inherited together or
whether they are separated and inherited independently.
Thus, meiosis not only ensures proper chromosome disjunction but also contributes
to genetic diversity among the gametes.
Because recombination events give insight into the distance between two
genes, they can assist map-based cloning approaches.
Map-based cloning relies on the high frequency of genetic exchange during meiotic
recombination: in a randomly occurring crossover, two closely adjacent markers are
separated less frequently than two markers that are more distant from each other.
In general, the crossover probability between two markers increases monotonically
with the distance between the two markers along the chromosome.
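The slides only state that the crossover probability grows monotonically with distance. One common model with this property is Haldane's map function, r = 0.5 (1 - exp(-2d)), with d in Morgans. Using it here is an assumption made for illustration; the lecture does not name a specific mapping function. The helper `observed_recombination_fraction` and its argument names are likewise hypothetical.

```python
import math


def haldane_recombination_fraction(distance_morgans: float) -> float:
    """Expected recombination fraction between two markers (Haldane's model).

    Assumes crossovers occur independently along the chromosome.
    The value rises monotonically with distance and approaches 0.5.
    """
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))


def observed_recombination_fraction(recombinants: int, total: int) -> float:
    """Fraction of scored gametes in which the two markers were separated."""
    return recombinants / total


# Closely adjacent markers are separated less often than distant ones.
for d in (0.01, 0.1, 0.5, 2.0):
    print(f"{d:5.2f} Morgan -> r = {haldane_recombination_fraction(d):.4f}")
```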
Map-based cloning
Outcrossing: the practice of introducing unrelated
genetic material into a breeding line.
Marker in F2 generation
Bulk analysis of mutation effect
Below: schematic representation of the marker positions used in the mapping
experiment. Open circles: centromeres.
Right: gel of PCR products for these markers. In each panel, the left lane shows
the result for the heterozygous control sample and the right lane that for a
pooled mutant sample. Bands specific for the Ler ecotype are marked.
The mutation created in Ler is linked to the markers ciw1 and nga280, because
both markers show only the Ler-specific band in the pooled mutant sample. In
contrast, all other markers used show approximately the same ratio of Col and
Ler amplification in both lanes. This indicates that the mutation is not linked
to these loci. Lukowitz et al. [2000].
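The reasoning behind the pooled analysis can be sketched as follows. Everything specific here is an assumption for illustration: the function name `linked_markers`, the idea of summarising each lane as the Ler share of the band intensity, the threshold `min_shift`, the placeholder markers `markerX` and `markerY`, and all numbers. None of these values come from the gel in the lecture.

```python
def linked_markers(pool_ler_share, control_ler_share, min_shift=0.4):
    """Return markers whose Ler band share in the mutant pool clearly exceeds
    the share seen in the heterozygous control lane.

    pool_ler_share / control_ler_share: dicts mapping marker name to the
    fraction (0..1) of amplification product that is Ler-specific.
    """
    linked = []
    for marker, pool_share in pool_ler_share.items():
        if pool_share - control_ler_share[marker] >= min_shift:
            linked.append(marker)
    return linked


# Made-up intensities: two markers show only the Ler band in the pool,
# the others look like the heterozygous control.
control = {"ciw1": 0.5, "nga280": 0.5, "markerX": 0.5, "markerY": 0.5}
pool = {"ciw1": 1.0, "nga280": 1.0, "markerX": 0.55, "markerY": 0.45}
print(linked_markers(pool, control))
```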
In most organisms, SNPs (single nucleotide polymorphisms) comprise the largest set
of sequence variants.
SNP: a position at which a single nucleotide is replaced by one of the other three
nucleotides between members of a species (see Figure below).
Two kinds of SNPs are distinguished:
- transitions (substitutions between the purines A and G or between the
pyrimidines C and T), and
- transversions (substitutions between a purine and a pyrimidine).
In Arabidopsis both kinds are equally abundant in the genome (see Table).
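The distinction between the two kinds of substitution can be expressed as a small Python sketch. The function name `snp_type` is chosen for this example only.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def snp_type(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not ({ref, alt} <= (PURINES | PYRIMIDINES)):
        raise ValueError("expected two different bases out of A, C, G, T")
    # Both bases in the same chemical class -> transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(snp_type("A", "G"))  # purine <-> purine
print(snp_type("C", "T"))  # pyrimidine <-> pyrimidine
print(snp_type("A", "C"))  # purine <-> pyrimidine
```

Of the 12 possible single-base substitutions, 4 are transitions and 8 are transversions.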
SNPs can be found
- in intergenic regions (frequency: 1 SNP per 3.5 kb),
- in coding regions of genes (frequency: 1 SNP per 2.2 kb), as well as
- in non-coding regions of genes (frequency: 1 SNP per 3.1 kb).
SNPs falling within coding regions are of particular interest. Due to the
redundancy of the genetic code, not every change necessarily results in a
different amino acid.
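The effect of code redundancy can be checked with a short Python sketch that builds the standard genetic code and compares the amino acid before and after a substitution. The function name `is_synonymous` and the example codons are chosen for illustration only.

```python
from itertools import product

# Standard genetic code in the usual compact form (base order T, C, A, G;
# '*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(codon: str, position: int, new_base: str) -> bool:
    """True if replacing the base at `position` (0-2) leaves the amino acid unchanged."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    return CODON_TABLE[codon] == CODON_TABLE[mutated]


print(is_synonymous("GCT", 2, "C"))  # GCT -> GCC, both alanine
print(is_synonymous("GAT", 2, "A"))  # GAT (Asp) -> GAA (Glu)
```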
Molecular markers: SNPs
RFLPs – Restriction fragment length polymorphisms
RFLPs [Botstein et al., 1980] were one of the first types of DNA markers to be
developed. They exploit the fact that different accessions have almost
identical genomes but always differ at a few nucleotides (due to base
substitutions, insertions, deletions or sequence rearrangements during
evolution).
Idea: Use these variations to distinguish between ecotypes.
Employ restriction endonucleases that recognise specific nucleic acid sequences
in the DNA and cleave the DNA at these (or adjacent) sites.
Some restriction enzymes and their recognition sites (arrows indicate the cut
sites). Some enzymes recognise not only one particular sequence but also allow
variation of certain nucleotides within their recognition site, e.g. N stands
for any nucleotide.
Source: Restriction Enzyme Database (REBASE)
Effect of RFLPs
RFLPs – Restriction fragment length polymorphisms
After digestion, the resulting fragments may differ in size (due to
insertions or deletions), and the number of fragments may also vary (when a
base change alters a recognition site in the sequence) between
different accessions (see Fig.).
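A minimal in-silico digest illustrates how a single base change alters the fragment pattern. The sketch uses the EcoRI recognition site GAATTC, which is cut after the first G. The two toy alleles and the function name `digest` are made up for this example and do not correspond to real Arabidopsis sequence.

```python
def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list:
    """Fragment lengths after cutting a linear sequence at every occurrence of `site`.

    cut_offset is the cut position within the site (1 for EcoRI: G^AATTC).
    Only the given strand is searched, which suffices for palindromic sites.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Toy alleles: allele_b carries a base change (A -> G) that destroys the site.
allele_a = "ATGC" * 5 + "GAATTC" + "ATGC" * 10
allele_b = "ATGC" * 5 + "GAGTTC" + "ATGC" * 10

print(digest(allele_a))  # two fragments
print(digest(allele_b))  # one uncut fragment
```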
Cleaved amplified polymorphic sequence (CAPS) markers
CAPS markers detect single base changes that create or remove a recognition site
for a restriction enzyme in one of a pair of alleles.
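As a rough sketch of how a CAPS marker might be chosen, the code below lists the enzymes whose recognition site is present in exactly one of two amplified alleles. The three recognition sequences are the well-known sites of EcoRI, BamHI and HindIII. The amplicon sequences and the function name `caps_enzymes` are invented for illustration.

```python
# Small selection of enzymes with palindromic recognition sites, so that
# searching one strand is sufficient.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def caps_enzymes(amplicon_a: str, amplicon_b: str, enzymes=ENZYMES) -> list:
    """Enzymes that cut one amplicon but not the other (candidate CAPS markers)."""
    return sorted(
        name
        for name, site in enzymes.items()
        if (site in amplicon_a) != (site in amplicon_b)
    )


# The two toy amplicons differ by a single base (C vs. T) inside a BamHI site.
amplicon_a = "CCTAGGATCCTTAGC"
amplicon_b = "CCTAGGATTCTTAGC"
print(caps_enzymes(amplicon_a, amplicon_b))
```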
Molecular markers: short sequence repeats
Simple sequence repeats (SSRs; also called short tandem repeats, STRs, simple
sequence length polymorphisms, SSLPs, or microsatellites) are highly polymorphic
loci consisting of short sequence motifs (2-4 bp) that are repeated multiple
times in tandem and embedded in unique DNA sequence.
Minisatellites (also named variable number tandem repeats, VNTRs) are similar to
SSRs, but their repeat unit is longer (about 10-100 base pairs). Both often
arise from tandem duplications or slipped-strand mispairing (slippage) occurring
during replication or DNA repair on a single DNA double helix.
Microsatellites can be classified according to the number of nucleotides in the
repeat unit. Mononucleotide and dinucleotide repeat elements are quite common;
longer repeat units become increasingly rare.
Alternative classification:
- perfect repeats, containing a single uninterrupted repeat element flanked on both
sides by non-repeated sequences, and
- imperfect ones with two or more runs of the same repeat unit interrupted by short
stretches of other sequences.
Besides these simple perfect repeats (such as (CA)n) and simple imperfect repeats
(for example (CA)nGT(CA)m), compound perfect repeats (for instance (AC)n(TC)m)
and compound imperfect repeats (such as (CA)nA(AC)mA(GA)o) also arise in the
genome of most organisms.
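Perfect simple repeats such as (CA)n can be located with a regular expression. The sketch below is one possible approach, not a method described in the lecture. The function name `find_perfect_ssrs` and the defaults `max_unit=4` and `min_repeats=5` are arbitrary choices for the example.

```python
import re


def find_perfect_ssrs(seq: str, max_unit: int = 4, min_repeats: int = 5) -> list:
    """Find perfect simple repeats in a DNA string.

    Returns a list of (start, repeat_unit, repeat_count) tuples. The lazy
    quantifier makes the regex prefer the shortest repeat unit.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Toy sequence: a (CA)8 repeat flanked by non-repeated bases.
example = "GGTT" + "CA" * 8 + "TTGG"
print(find_perfect_ssrs(example))
```

Imperfect and compound repeats would need additional logic, for example merging neighbouring hits that are separated by only a few bases.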
Molecular markers: microsatellites
As early as 1984, Tautz and Renz showed that all possible types of perfect simple
sequence repeats composed of only one or two nucleotide(s) are present to at least
some extent in eukaryotic genomes and that one can expect to encounter at least
one simple sequence stretch every 10 kb of DNA sequence.
In 1994, Bell and Ecker examined mono- or dinucleotide repeats longer than
20 nucleotides in the Arabidopsis accessions Columbia and Landsberg
erecta. Most of them display polymorphisms between these ecotypes due to
variation in the number of repeat units.
In 2000, this result was confirmed by a study of Lukowitz et al. showing that there
is a likelihood of 40 % that such DNA segments are polymorphic between different
ecotypes.