A Simpler 1.5-Approximation
Algorithm for Sorting by
Transpositions
Tzvika Hartman
Weizmann Institute
Genome Rearrangements
During evolution, genomes undergo large-
scale mutations which change gene order
(reversals, transpositions, translocations).
Given 2 genomes, GR algs infer the most
economical sequence of rearrangement
events which transform one genome into the
other.
Genome Rearrangements Model
Chromosomes are viewed as ordered lists of
genes.
Unichromosomal genome, every gene
appears once.
Genomes are represented by unsigned
permutations of genes.
Circular genomes (e.g., bacteria &
mitochondria) are represented by circular
perms.
Sorting by Transpositions
A transposition exchanges 2
consecutive segments of a perm.
Example (segments 3 4 5 and 6 7 are exchanged):
1 2 3 4 5 6 7 8 9
1 2 6 7 3 4 5 8 9
Sorting by transpositions : finding a
shortest sequence of transpositions which
sorts the perm.
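A minimal Python sketch of a single transposition on a linear perm stored as a list. The function name `transpose` and the 0-based cut points `i < j < k` are illustrative choices, not notation from the talk.

```python
def transpose(perm, i, j, k):
    """Exchange the consecutive segments perm[i:j] and perm[j:k].

    Cut points are 0-based and must satisfy 0 <= i < j < k <= len(perm).
    """
    if not 0 <= i < j < k <= len(perm):
        raise ValueError("need 0 <= i < j < k <= len(perm)")
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


# The example above: exchange the segments 3 4 5 and 6 7.
example = transpose([1, 2, 3, 4, 5, 6, 7, 8, 9], 2, 5, 7)
print(example)  # [1, 2, 6, 7, 3, 4, 5, 8, 9]
```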
Previous work
1.5-approximation algs for sorting by
transpositions [BafnaPevzner98, Christie99].
An alg that sorts every perm of size n in at
most 2n/3 transpositions [Eriksson et al 01].
Complexity of the problem is still open.
Main Results
1. The problem of sorting circular permutations
by transpositions is equivalent to sorting linear
perms by transpositions.
2. A new and simple 1.5-approximation alg for
sorting by transpositions, which runs in
quadratic time.
Linear & Circular Perms
A transposition “cuts” the perm at 3 points.
Linear transposition t : A B C D → A C B D
Circular transposition t : (A B C) → (A C B)
• Circular transpositions can be represented by
exchanging any 2 of the 3 segments.
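The claim that any 2 of the 3 segments can be exchanged is easy to check on a small case. In the sketch below a circular perm is stored as a list and compared up to rotation; the helper `canonical` and the segment contents are assumptions made for the illustration.

```python
def canonical(circ):
    """Rotate a circular perm so that its smallest element comes first."""
    i = circ.index(min(circ))
    return circ[i:] + circ[:i]


# Cut the circle (1 2 3 4 5 6 7) at 3 points into segments A, B, C.
A, B, C = [1, 2], [3, 4, 5], [6, 7]

swap_ab = canonical(B + A + C)  # exchange A and B
swap_bc = canonical(A + C + B)  # exchange B and C
swap_ac = canonical(C + B + A)  # exchange A and C

# All three exchanges give the same circular perm.
print(swap_ab, swap_bc, swap_ac)
```

Each variant prints `[1, 2, 6, 7, 3, 4, 5]`, so the three exchanges differ only by a rotation of the circle.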
Linear & Circular Equivalence
Thm : Sorting linear perms by transpositions
is computationally equivalent to sorting
circular perms.
Pf sketch: Circularize linear perm by adding
an n+1 element and closing the circle.
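[Figure: the linear perm $\pi_1 \ldots \pi_n$ is closed into the circle $(\pi_1 \ldots \pi_n \, \pi_{n+1})$, with the new element $\pi_{n+1} = n+1$ placed between $\pi_n$ and $\pi_1$.]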
• Every linear transposition is equivalent to a
circular transposition that exchanges the 2
segments that do not include n+1.
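A small sketch of the circularization step, assuming perms are Python lists. The names `circularize`, `linearize` and `transpose` are hypothetical helpers. It shows one circular transposition that moves the segment containing n+1, and the linear transposition that yields the same circular perm.

```python
def circularize(linear):
    """Append the extra element n+1; the list is then read as a circle."""
    return linear + [len(linear) + 1]


def linearize(circ):
    """Cut the circle at its largest element (n+1) and drop that element."""
    i = circ.index(len(circ))
    return circ[i + 1:] + circ[:i]


def transpose(seq, i, j, k):
    """Exchange the consecutive segments seq[i:j] and seq[j:k]."""
    return seq[:i] + seq[j:k] + seq[i:j] + seq[k:]


linear = [1, 3, 2]
circ = circularize(linear)           # [1, 3, 2, 4]

# Cut the circle into A = [1], B = [3], C = [2, 4] and exchange B and C.
# C contains the extra element 4, so this is not a linear move as written.
moved = transpose(circ, 1, 2, 4)     # [1, 2, 4, 3]

# Exchanging A and B instead avoids the extra element: a linear transposition.
same = transpose(linear, 0, 1, 2)    # [3, 1, 2]

print(linearize(moved), same)
```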
Breakpoint Graph [BafnaPevzner98]
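Perm : $\pi$ = (1 6 5 4 7 3 2)
Replace each element j by 2j-1,2j:
(1 2 11 12 9 10 7 8 13 14 5 6 3 4)
Circular Breakpoint graph $G(\pi)$:
Vertex for every element.
Black edges $(\pi_{2i}, \pi_{2i+1})$, where $\pi_k$ is the k-th element of the extended perm and positions wrap around the circle.
Grey edges (2i, 2i+1), where 2n is joined to 1.
[Figure: the 14 vertices placed on a circle in the order of the extended perm, with black and grey edges drawn.]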
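The construction can be sketched in a few lines of Python. The sketch assumes the circular convention that black edges join the elements at positions 2i and 2i+1 of the extended perm, and grey edges join the values 2i and 2i+1, both wrapping around the circle. The function names are illustrative.

```python
def extend(perm):
    """Replace each element j by the pair 2j-1, 2j."""
    ext = []
    for j in perm:
        ext += [2 * j - 1, 2 * j]
    return ext


def cycle_lengths(perm):
    """Number of black edges in each cycle of the circular breakpoint graph."""
    ext = extend(perm)
    m = len(ext)                             # 2n vertices
    pos = {v: p for p, v in enumerate(ext)}  # 0-based position of each vertex

    def black(v):
        # Assumed convention: 1-based positions 2i and 2i+1 are joined,
        # i.e. 0-based odd p pairs with p+1 and even p pairs with p-1.
        p = pos[v]
        return ext[(p + 1) % m] if p % 2 == 1 else ext[(p - 1) % m]

    def grey(v):
        # Values 2i and 2i+1 are joined; 2n is joined to 1.
        return v % m + 1 if v % 2 == 0 else (v - 2) % m + 1

    seen, lengths = set(), []
    for start in ext:
        if start in seen:
            continue
        v, k = start, 0
        while v not in seen:      # walk grey edge, then black edge
            seen.add(v)
            u = grey(v)
            seen.add(u)
            v = black(u)
            k += 1
        lengths.append(k)
    return lengths


print(extend([1, 6, 5, 4, 7, 3, 2]))
print(cycle_lengths([1, 6, 5, 4, 7, 3, 2]))  # [2, 3, 2]
```

Under these assumptions the example perm decomposes into cycles with 2, 3 and 2 black edges, and the identity perm into n cycles with one black edge each.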
Breakpoint Graph (Cont.)
Unique decomposition into cycles.
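$c_{odd}(\pi)$ : # of odd cycles in $G(\pi)$.
Define $\Delta c_{odd}(\pi, t) = c_{odd}(t \cdot \pi) - c_{odd}(\pi)$
Lemma [BP98]: $\forall$ t and $\pi$,
$\Delta c_{odd}(\pi, t) \in \{0, 2, -2\}$.
[Figure: circular breakpoint graph on the vertices 1..14.]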
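The lemma can be checked by brute force on small perms. The sketch below is self-contained and assumes the circular convention in which black edges join positions 2i and 2i+1 of the extended perm and grey edges join values 2i and 2i+1, both wrapping around. The helper names `c_odd`, `transpose` and `deltas` are illustrative.

```python
from itertools import combinations


def c_odd(perm):
    """Count cycles with an odd number of black edges in the circular graph."""
    ext = [x for j in perm for x in (2 * j - 1, 2 * j)]
    m = len(ext)
    pos = {v: p for p, v in enumerate(ext)}
    seen, odd = set(), 0
    for start in ext:
        if start in seen:
            continue
        v, k = start, 0
        while v not in seen:
            u = v % m + 1 if v % 2 == 0 else (v - 2) % m + 1          # grey edge
            seen.update((v, u))
            p = pos[u]
            v = ext[(p + 1) % m] if p % 2 == 1 else ext[(p - 1) % m]  # black edge
            k += 1
        odd += k % 2
    return odd


def transpose(perm, i, j, k):
    """Exchange the consecutive segments perm[i:j] and perm[j:k]."""
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def deltas(perm):
    """Set of changes in c_odd over all transpositions of perm."""
    base = c_odd(perm)
    cuts = combinations(range(len(perm) + 1), 3)
    return {c_odd(transpose(perm, i, j, k)) - base for i, j, k in cuts}


print(deltas([1, 3, 2]))              # {0, 2}
print(deltas([1, 6, 5, 4, 7, 3, 2]))  # a subset of {-2, 0, 2}
```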
Effect on Graph : Example
Perm: (1 3 2).
After extension: (1 2 5 6 3 4).
Breakpoint graph:
[Figure: breakpoint graph on the vertices 1..6, before and after a transposition.]
• # of cycles increased by 2
Effect on Graph : Example
Perm : (6 5 4 3 2 1).
After extension : (11 12 9 10 7 8 5 6 3 4 1 2).
Breakpoint graph :
[Figure: breakpoint graph on the vertices 1..12, before and after a transposition.]
• # of cycles remains 2
Breakpoint Graph (Cont.)
Max # of odd cycles, n, is in the id perm, thus:
Lower bound [BP98]: For all $\pi$,
$d(\pi) \ge [n - c_{odd}(\pi)]/2$.
Goal : increase # of odd cycles in G.
t is a k-transposition if $\Delta c_{odd}(\pi, t) = k$.
A cycle that admits a 2-transposition is
oriented.
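As a worked instance of the lower bound, take the circular perm (1 6 5 4 7 3 2) from the breakpoint graph slide. Assuming black edges join positions 2i and 2i+1 of the extended perm and grey edges join values 2i and 2i+1, tracing the graph by hand gives three cycles with 2, 3 and 2 black edges, so only one cycle is odd:

```latex
n = 7, \qquad c_{odd}(\pi) = 1
\quad\Longrightarrow\quad
d(\pi) \;\ge\; \frac{n - c_{odd}(\pi)}{2} \;=\; \frac{7 - 1}{2} \;=\; 3 .
```

At least 3 transpositions are therefore needed to sort this perm, since each transposition raises the number of odd cycles by at most 2.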
Simple Permutations
A perm is simple if its breakpoint graph
contains only short ($\le 3$) cycles.
The theory is much simpler for simple perms.
Thm : Every perm can be transformed into a
simple one, while maintaining the lower bound.
Moreover, the sorting sequence can be
mimicked.
Corr : We can focus only on simple perms.
3-Cycles
2 possible configurations of 3-cycles:
Non-oriented 3-cycle
Oriented 3-cycle
(0,2,2)-Sequence of Transpositions
A (0,2,2)-sequence is a sequence of 3
transpositions: the 1st is a 0-transposition and
the next two are 2-transpositions.
A series of (0,2,2)-sequences preserves a 1.5
approximation ratio.
Throughout the alg, we show that there is
always a 2-transposition or a (0,2,2)-sequence.
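The ratio can be derived from the lower bound in one line. Suppose, as an illustration, that the alg sorts a perm using p single 2-transpositions and q (0,2,2)-sequences (the symbols p and q are introduced here only for the count). Each 2-transposition adds 2 odd cycles and each (0,2,2)-sequence adds 4 using 3 transpositions, and sorting must raise the number of odd cycles from $c_{odd}(\pi)$ to n:

```latex
2p + 4q = n - c_{odd}(\pi),
\qquad
\#\text{transpositions} = p + 3q
\;\le\; \tfrac{3}{2}\,(p + 2q)
\;=\; \tfrac{3}{2}\cdot\frac{n - c_{odd}(\pi)}{2}
\;\le\; \tfrac{3}{2}\, d(\pi).
```

The last step uses the lower bound $d(\pi) \ge [n - c_{odd}(\pi)]/2$.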
Interleaving Cycles
2 cycles interleave if their black edges appear
alternately along the circle.
Lemma : If G contains 2 interleaving 3-cycles,
then there exists a (0,2,2)-sequence.
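A minimal sketch of the interleaving test, assuming each cycle is given as the list of positions of its black edges along the circle (black edge i sits between positions 2i and 2i+1 of the extended perm). The function name and the input format are assumptions made for the illustration.

```python
def interleave(cycle_a, cycle_b):
    """True if the black edges of the two cycles alternate along the circle."""
    if len(cycle_a) != len(cycle_b):
        return False
    labelled = sorted([(p, "a") for p in cycle_a] + [(p, "b") for p in cycle_b])
    labels = [lab for _, lab in labelled]
    # Compare each label with its successor, wrapping around the circle.
    return all(labels[i] != labels[(i + 1) % len(labels)]
               for i in range(len(labels)))


# Under the convention above, the two 3-cycles of the perm (6 5 4 3 2 1)
# use black edges {2, 4, 6} and {1, 3, 5}.
print(interleave([2, 4, 6], [1, 3, 5]))  # True
print(interleave([1, 2, 3], [4, 5, 6]))  # False
```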
Shattered Cycles
2 pairs of black edges intersect if they appear
alternately along the circle.
Cycle A is shattered by cycles B
and C if every pair of black
edges in A intersects with a pair
in B or with a pair in C.
Lemma : If G contains a shattered cycle, then
there exists a (0,2,2)-sequence.
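The shattered condition can be sketched the same way, with each cycle given as the list of positions of its black edges along the circle. The position sets in the example are hypothetical, chosen only to exercise the predicate, and are not taken from a particular perm.

```python
from itertools import combinations


def pairs_intersect(p, q):
    """True if the two pairs of black-edge positions alternate along the circle."""
    lo, hi = sorted(p)
    inside = [lo < x < hi for x in q]
    # The pairs alternate exactly when one element of q lies between lo and hi.
    return inside[0] != inside[1]


def is_shattered(a, b, c):
    """True if every pair of black edges in a intersects a pair in b or in c."""
    other_pairs = list(combinations(b, 2)) + list(combinations(c, 2))
    return all(any(pairs_intersect(p, q) for q in other_pairs)
               for p in combinations(a, 2))


# Hypothetical black-edge positions for three mutually non-interleaving 3-cycles.
A, B, C = [1, 5, 9], [2, 3, 6], [7, 10, 11]
print(is_shattered(A, B, C))                          # True
print(is_shattered([1, 5, 9], [2, 3, 4], [6, 7, 8]))  # False
```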
Shattered Cycles (Cont.)
Lemma : If G contains no 2-cycles, no
oriented cycles and no interleaving cycles,
then there exists a shattered cycle.
The Algorithm
While G contains a 2-cycle, apply a
2-transposition [Christie99].
If G contains an oriented 3-cycle, apply a
2-transposition on it.
If G contains a pair of interleaving 3-cycles,
apply a (0,2,2)-sequence.
If G contains a shattered unoriented 3-cycle,
apply a (0,2,2)-sequence.
Repeat until perm is sorted.
Conclusions
We introduced 2 new ideas which simplify the
theory and the alg:
1. Working with circular perms simplifies the case
analysis.
2. Simple perms avoid the complication of
dealing with long cycles (similarly to the HP
theory for sorting by reversals).
Open Problems
Complexity of sorting by transpositions.
Models which allow several rearrangement
operations, such as trans-reversals, reversals
and translocations (both signed & unsigned).
Acknowledgements
Ron Shamir.
Thank you !