Models for Structural and Numerical Alterations in Cancer

Sorting by Cuts, Joins and Whole
Chromosome Duplications
Ron Zeira and Ron Shamir
Combinatorial Pattern Matching 2015
30.6.15
Genome rearrangements
Motivation I: evolution
Human genome project
Motivation II: cancer
MCF-7 breast cancer cell-line
Normal karyotype
NCI, 2001
Definitions: gene
• A gene is an oriented segment.
• A gene has two extremities: head and tail.
• Positive: tail→head; Negative: head→tail.
Definitions: chromosome
• Chromosome is a series of consecutive genes.
• 2 consecutive extremities form an adjacency.
• A telomere is an extremity that is not part of
an adjacency.
• Circular chrom. has no telomeres. Linear
chrom. has 2 telomeres.
Definitions: genome
• A genome is a set of chromosomes.
• Equivalently, a genome is a set of adjacencies.
Π = { {a_h, b_h}, {b_t, c_h}, {d_h, f_h}, {f_t, e_t} }
• An ordinary genome has one copy of each gene.
Otherwise it is duplicated.
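The definitions above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the type names and the helper `linear_chromosome` are hypothetical, and an extremity is assumed to be encoded as a (gene, "h"/"t") pair.

```python
from typing import FrozenSet, List, Set, Tuple

Extremity = Tuple[str, str]        # (gene name, "h" for head or "t" for tail)
Adjacency = FrozenSet[Extremity]   # unordered pair of consecutive extremities


def extremities(gene: str, positive: bool) -> Tuple[Extremity, Extremity]:
    """Extremities in reading order: tail→head if positive, head→tail if negative."""
    tail, head = (gene, "t"), (gene, "h")
    return (tail, head) if positive else (head, tail)


def linear_chromosome(
    genes: List[Tuple[str, bool]],
) -> Tuple[Set[Adjacency], Set[Extremity]]:
    """Return (adjacencies, telomeres) of a linear chromosome of signed genes."""
    ends = [extremities(g, pos) for g, pos in genes]
    adjacencies = {
        frozenset((ends[i][1], ends[i + 1][0])) for i in range(len(ends) - 1)
    }
    telomeres = {ends[0][0], ends[-1][1]}  # the two extremities left unpaired
    return adjacencies, telomeres


# The example genome Π, read as two linear chromosomes: (+a -b -c) and (+d -f +e).
adj1, tel1 = linear_chromosome([("a", True), ("b", False), ("c", False)])
adj2, tel2 = linear_chromosome([("d", True), ("f", False), ("e", True)])
genome_pi = adj1 | adj2  # a genome as a set of adjacencies
```

Reading Π as these two signed chromosomes is one interpretation that reproduces exactly the four adjacencies listed on the slide.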
GR distance problem
• Distance d_op(Π, Σ): minimal number of
operations transforming genome Π into genome Σ.
• Operations:
– Reversals
– Translocations
– Transpositions
– Others…
The SCJ model
• SCJ – Single Cut or Join (Feijão, Meidanis 11):
– Cut an adjacency to 2 telomeres.
– Join 2 telomeres to an adjacency.
[Figure: a cut and a join illustrated on an example]
• Simple and practical model.
• Reflects evolutionary distance (Biller et al. 13)
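A minimal sketch of the two SCJ operations on an ordinary genome stored as a set of adjacencies. The function names are hypothetical. The distance formula used here, |Π \ Σ| + |Σ \ Π| for two genomes over the same genes, is the closed form known from Feijão and Meidanis; it is not stated on the slides.

```python
from typing import FrozenSet, Tuple

Extremity = Tuple[str, str]             # (gene name, "h" or "t")
Adjacency = FrozenSet[Extremity]
Genome = FrozenSet[Adjacency]           # ordinary genome as a set of adjacencies


def cut(genome: Genome, adjacency: Adjacency) -> Genome:
    """Cut an adjacency; its two extremities become telomeres."""
    if adjacency not in genome:
        raise ValueError("adjacency not present in genome")
    return genome - {adjacency}


def join(genome: Genome, x: Extremity, y: Extremity) -> Genome:
    """Join two telomeres x and y into a new adjacency."""
    used = {e for adj in genome for e in adj}
    if x == y or x in used or y in used:
        raise ValueError("both extremities must be distinct telomeres")
    return genome | {frozenset((x, y))}


def scj_distance(source: Genome, target: Genome) -> int:
    """Cuts for adjacencies only in source plus joins for those only in target."""
    return len(source - target) + len(target - source)


# Hypothetical example: turn adjacency {a_h, b_t} into {a_h, c_t}.
a_h, b_t, c_t = ("a", "h"), ("b", "t"), ("c", "t")
source = frozenset({frozenset((a_h, b_t))})
target = frozenset({frozenset((a_h, c_t))})
result = join(cut(source, frozenset((a_h, b_t))), a_h, c_t)
```

One cut followed by one join reaches the target, matching the distance of 2 returned by `scj_distance(source, target)`.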
Models with multiple gene copies
• Most models with multiple gene copies are
NP-hard.
• Not many models allow duplications or
deletions.
• Many normal and cancer genomes have
multiple gene copies.
The SCJD model
• A duplication takes a linear chromosome and
produces an additional copy of it.
abc → abc, abc
• An SCJD operation is a cut, a join, or a
duplication.
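The duplication operation can be sketched as follows, assuming a genome is kept as a list of linear chromosomes, each a list of (gene, is_positive) pairs. Both this representation and the helper names are illustrative assumptions. With duplicated genes the adjacencies form a multiset, so they are counted with a `Counter`.

```python
from collections import Counter
from typing import List, Tuple

Chromosome = List[Tuple[str, bool]]   # (gene, is_positive) in reading order


def adjacency_multiset(genome: List[Chromosome]) -> Counter:
    """Count every adjacency over all (linear) chromosomes of the genome."""
    counts: Counter = Counter()
    for chrom in genome:
        ends = [
            ((g, "t"), (g, "h")) if pos else ((g, "h"), (g, "t"))
            for g, pos in chrom
        ]
        for left, right in zip(ends, ends[1:]):
            counts[frozenset((left[1], right[0]))] += 1
    return counts


def duplicate(genome: List[Chromosome], index: int) -> List[Chromosome]:
    """Return a new genome with one additional copy of chromosome `index`."""
    return genome + [list(genome[index])]


# abc → abc, abc
abc = [("a", True), ("b", True), ("c", True)]
before = [abc]
after = duplicate(before, 0)
counts_after = adjacency_multiset(after)
```

After the duplication each of the two adjacencies of abc appears twice in `counts_after`, while `before` is left unchanged.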
The SCJD distance
• The minimal number of SCJD operations that
transform an ordinary genome into a
duplicated genome.
Results outline
• Characterize optimal solution structure.
• Give a distance optimization function.
• Solve the optimization problem.
• Study the number of duplications in optimal
scenario.
SCJD optimal scenario structure
• Theorem: There exists an optimal SCJD sorting
scenario, consisting, in this order, of
– SCJ operations on single-copy genes.
– Duplications.
– SCJ operations acting on duplicated genes.

Γ → (SCJs) → Γ’ → (duplications) → 2Γ’ → (SCJs) → Δ
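The three-phase structure suggests how to price a scenario once the intermediate genome Γ’ is fixed. The sketch below rests on assumptions not spelled out on the slides: Γ’ is linear, every gene has exactly two copies in Δ (so each chromosome of Γ’ is duplicated once), and an SCJ phase on multisets costs the size of the multiset symmetric difference. All function names are hypothetical.

```python
from collections import Counter
from typing import Iterable


def multiset_scj(source: Counter, target: Counter) -> int:
    """Cuts for surplus adjacencies in source plus joins for those missing."""
    return sum((source - target).values()) + sum((target - source).values())


def three_phase_cost(
    gamma: Iterable, gamma_prime: Iterable, delta: Iterable, n_genes: int
) -> int:
    """Cost of: SCJs Γ → Γ', duplicate every chromosome of Γ', SCJs 2Γ' → Δ.

    Assumes Γ' is linear, so it has n_genes - |Γ'| chromosomes.
    """
    gamma_prime = set(gamma_prime)
    phase1 = multiset_scj(Counter(gamma), Counter(gamma_prime))
    duplications = n_genes - len(gamma_prime)
    doubled = Counter({adj: 2 for adj in gamma_prime})
    phase3 = multiset_scj(doubled, Counter(delta))
    return phase1 + duplications + phase3


# Hypothetical example on two genes: Γ is the chromosome ab, Δ is two copies of it.
ab = frozenset({("a", "h"), ("b", "t")})
cost_keep = three_phase_cost({ab}, {ab}, [ab, ab], n_genes=2)   # Γ' = Γ
cost_cut = three_phase_cost({ab}, set(), [ab, ab], n_genes=2)   # Γ' has no adjacency
```

Keeping the adjacency in Γ’ needs a single duplication, whereas cutting it first costs one cut, two duplications and two joins, which illustrates why the choice of Γ’ drives the distance.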

Proof outline
• An SCJ operation acts on extremities of 2
duplicated genes or 2 unduplicated genes.
• Preempting SCJ on unduplicated genes keeps
a valid sorting scenario.
• Preempt duplications while scenario is valid.
Corollary: SCJD distance
• Write the distance as a function of Γ’.
• Find Γ’ that minimizes the distance.
η: gives a higher score to adjacencies in Γ and Δ
Distance optimization solution
• The following genome maximizes H:
 '  { |  ( )  0}
• If Γ not linear, remove an adjacency with η=1
from each circular chromosome in Γ’ to obtain
Γ’’.
• Theorem: SCJD distance is computable in
linear time.
Controlling the number of duplications
• Duplications are more “radical” events than
cut or join.
• Lemma: Our algorithm gives an optimal
sorting scenario with a maximum number of
duplications.
Optimal solutions can have different
numbers of duplications
Minimizing duplications is hard
• Theorem: Finding an optimal SCJD sorting
scenario with a minimum number of
duplications is NP-hard.
• Reduction from Hamiltonian path problem on
a directed graph with in/out degree 2.
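To make the source problem of the reduction concrete, here is a brute-force check for a Hamiltonian path between two given vertices of a directed graph. It only illustrates the problem being reduced from, not the reduction itself; the function name and the example graph are hypothetical, and x and y are assumed to be distinct vertices.

```python
from itertools import permutations
from typing import Hashable, Iterable, Tuple


def has_hamiltonian_path(
    edges: Iterable[Tuple[Hashable, Hashable]], x: Hashable, y: Hashable
) -> bool:
    """Exponential-time check: does a path from x to y visit every vertex once?"""
    edge_set = set(edges)
    vertices = {v for edge in edge_set for v in edge}
    inner = vertices - {x, y}
    for order in permutations(inner):
        path = (x, *order, y)
        if all((u, v) in edge_set for u, v in zip(path, path[1:])):
            return True
    return False


# A digraph on 4 vertices in which every vertex has in-degree and out-degree 2:
# each vertex i points to i+1 and i+2 (mod 4).
edges = [(i, (i + 1) % 4) for i in range(4)] + [(i, (i + 2) % 4) for i in range(4)]
path_0_to_3 = has_hamiltonian_path(edges, 0, 3)
path_0_to_2 = has_hamiltonian_path(edges, 0, 2)
```

In this graph the path 0, 1, 2, 3 exists, while no Hamiltonian path runs from 0 to 2.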
Proof outline
• For a 2-digraph G and two vertices x, y, there
is an Eulerian path P: x → y.
• Create a duplicated genome Σ from P and an
empty genome Π.
• Add auxiliary genes and k copies of Σ, Π.
• There is a Hamiltonian path x → y in G iff there
is an optimal sorting scenario with k
duplications.
Summary
• Genome rearrangements are important.
• Problems with multiple gene copies are hard.
• SCJD – allows SCJ and duplications:
– Linear algorithm for the SCJD distance.
– Study the number of duplications in optimal
solution.
• We hope to generalize the model and apply it
to cancer data.
Thank You!