Download Handouts

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

RNA wikipedia , lookup

Nutriepigenomics wikipedia , lookup

Gene wikipedia , lookup

Messenger RNA wikipedia , lookup

Gene expression programming wikipedia , lookup

Epigenetics of neurodegenerative diseases wikipedia , lookup

Site-specific recombinase technology wikipedia , lookup

RNA interference wikipedia , lookup

Gene desert wikipedia , lookup

Vectors in gene therapy wikipedia , lookup

Gene therapy of the human retina wikipedia , lookup

History of RNA biology wikipedia , lookup

Gene therapy wikipedia , lookup

Gene expression profiling wikipedia , lookup

NEDD9 wikipedia , lookup

Primary transcript wikipedia , lookup

Neuronal ceroid lipofuscinosis wikipedia , lookup

Microevolution wikipedia , lookup

RNA silencing wikipedia , lookup

Non-coding RNA wikipedia , lookup

Gene nomenclature wikipedia , lookup

Therapeutic gene modulation wikipedia , lookup

Designer baby wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Epitranscriptome wikipedia , lookup

RNA-Seq wikipedia , lookup

Transcript
11/18/15 Abstract Algebras: Category Theory for Biomedicine Eric Neumann PSB 2016 Material Used •  Category Theory for ScienFsts, Spivak 2013 •  Categorical Databases, Spivak 2012 •  Categories for the Working MathemaFcian, Mac Lane 1998 •  Conceptual MathemaFcs: A First IntroducFon to Categories Lawvere and Schanuel 2009 •  Sheaves in Geometry and Logic: A First IntroducFon to Topos Theory, MacLane and Moerdijk, 1994 1 11/18/15 Category Theory (CT) Basics •  All Categories consist of: •  Objects (more like what we think of as classes) •  Morphisms (funcFons, maps, arrows between objects) •  Morphisms can be composed in a sensible way: g ∘ f: x ⟼ g(f(x)) •  HomC(c, d) = the set of all funcFons/maps between c and d •  Categories are oYen constructed in the simplest form possible •  Set, Graph, Grp, Mon, Vect, Top, etc … even Cat •  Generality is the name of the game! Category Theory (CT) Basics •  Functors F map Categories (C) to Categories (D): F: C -­‐> D •  F: Obj(C) -­‐> Obj(D) , e.g., c ⟼ F(c) = d •  F: Morphism(C) -­‐> Morphism(D) , e.g., f ⟼ F(f) •  For example, GS: Graph -­‐> Set maps Graph objects and arrows to sets of objects and sets of funcFons in Set: Think Biochemical Networks to Genes and Reac:ons! •  There’s even morphisms between Functors called Natural TransformaDons… turtles all the way up! 2 11/18/15 Category Theory (CT) Basics C := Domain, PreImage Codomain, Image f g h k i Sets •  SurjecFve = “onto” •  Everything in Image is mapped •  InjecFve = “one-­‐to-­‐one” •  Inclusions, subobjects •  BijecFve = “onto” & “one-­‐to-­‐one” Category Theory (CT) Basics Where can one apply CT? •  Sets and their funcFons (general rules for arithmeFc, data) •  All areas of Math and Groups, incl. Topology (String Theory) •  All Concepts that can be formally represented: Schemas and Ontologies (Functors ~ logic, Inference, concept generaFon) •  All Databases (Functors ~ data migraFon, queries) •  All ScienFfic Models: ontological, dynamic, analyFcal Power of CT is that theorems in one parent CAT can be applied to all proper subsumed categories! Just be sure to follow the formal rules! 3 11/18/15 PracFcal Categories Some Science… Space à Plane x Line , Time à Space Time à Plane x Line from Lawvere and Schanuel, 2009 CT Universal ProperFes •  Products, Pullbacks •  Limits •  Equalizers •  Sums, Pushouts •  CommutaFon C_has_width∘ f = s ∘ B_has_width 4 11/18/15 CT Universal ProperFes •  Products, Pullbacks •  Limits •  Equalizers •  Sums, Pushouts •  CommutaFon C_has_width∘ f = s ∘ B_has_width FuncFons in CT Given funcFon/map f: X -­‐> Y ≡ x ⟼ f(x) ExponenFaFon: f ~ YX , means each f is iden:fied by name: YX ~ “f” •  “How many ways to maps from X to Y?” •  Ex 1: For an input set X of size |X| mapping to any subset of X (Y ~ membership is binary, so |Y|=2), the number of possible subsets is |Y||X| (i.e., each x∊X is either included or not), hence there are just as many maps. •  Ex 2: A funcFon with 3 boolean args must have values for 23=8 different inputs •  Ex 3: For seq’s X of a similar length and Y, the 4 bases one can choose from, then 4|X| is the number of possible sequences for a string of length |X| EvaluaFon: {f(X) = Y} ~ YX à ev: X × YX à Y gz f Currying: X × Y ⟶
Z ~ X ⟶
Z , where gY = f(Y,_ ) ~ (ZY)X = (ZX×Y ) ExponenFals objecFfy funcFons and maps! 5 11/18/15 Path CommutaFon Protein tl mRNA e s ts Gene e = ts∘ tl # Protein from Gene = # RNA from Gene∘ # Protein from RNA Prot express rate per Gene = RNA transcribe rate per Gene∘ Prot transl rate amount per RNA Prot mRNA s = inv e Protein Seq complexity for Gene = Gene Seq complexity ∘ RNA Seq complexity Protein-­‐>Gene coord = Prot-­‐>RNA coord ∘ Gene-­‐>RNA coord Gene G-­‐Sheaf mut Variant Sheaf binding domains Prot πp mRNA funcFon 1° Causal Flow (mutaFon effects) 𝐹G Structural dependence (informaFon) πr Gene 𝓑G Sequence Topology e ∘ πp p ∝ e ∘ πp ∘ int p 6 11/18/15 T-­‐Sheaf modificaFon abundance interacFons Prot πp acFvity mRNA 𝐹G 𝐹T πr Gene πg Tissue 𝓑T Some obvious Products • 
• 
• 
• 
• 
• 
• 
• 
• 
• 
• 
• 
• 
• 
Gene-­‐expr * Time Prot-­‐expr * Tissue P-­‐Interacts * Gene-­‐expr P-­‐Interacts * Prot-­‐expr Gene-­‐expr * G-­‐Interacts Gene-­‐expr * G-­‐Interacts * Time Gene-­‐expr * G-­‐Interacts * Prot-­‐expr Gene-­‐expr * G-­‐Interacts * Prot-­‐expr * Tissue * Time Prot Structure * Mut Seq-­‐space Prot Structure * P-­‐Interacts Prot Structure * P-­‐Interacts * Mut Seq-­‐space Prot-­‐Act * Prot-­‐expr * Mut Seq-­‐space Cell Processes * Prot-­‐Act * Prot-­‐expr * Mut Seq-­‐space Onco Processes * Prot-­‐Act * Mut Seq-­‐space * Therapy * Outcomes 7 11/18/15 OLOGs •  “Ontology Log” – Spivak & Kent, PLOS 2012 •  Dynamic Data Schema •  compose new concepts via U-­‐operaFons •  Graph formalism w/ path constraints •  As powerful as Ontology + Rules •  Great for whiteboarding new semanFcs Product as a Pullback Pullback D = B x C a Signaling Oncogene C a Oncogene B a Signaling Gene A a Gene 8 11/18/15 Product as a Pullback Pullback A Disease B D = B x C PaFents with a Molecular AberraFon-­‐ Disease AssociaFon PaFents w a Disease C a Molecular AberraFon A PaFents with a Molecular AberraFon PaFents OLOGs – rules as programs Spivak and Kent, PLOS, 2012
9 11/18/15 Endomaps C := α A x’ = α.α.α.x = α(α(α(x))) Idempotent x = β.β.x β from Lawvere and Schanuel, 2009 Endomaps father father mother ♀ P ♂ father mother mother
♀ ♂ 10 11/18/15 Endomap Products e.g. pedigree & clans father P father mother ♀ mother ♂ father mother mother mother W ♀W B mother father father ♂W mother ♀B father clan father gender ♂B mother mother Graph Products ex: Mitochondrial Genomes father father mother G mother mother M2 mother father father mother mother ♀M2 ♂M1 father father father father father ♀M1 father M1 father father mother mother father ♂ ♀ ♂M2 mother 11 11/18/15 Graph DefiniFon • VerFces are objects • Arrows are objects ~
V A V
<Subj> <Prop> <Obj> It’s the basis of a Triple! RelaFon to Shapes Count commuta:on X h f g Y s Z h = g∘ f ∀z {∑x∊X(h(x)=z) = ∑y∊Y (g(y)=z * ∑x∊Xf(x)=y)} :h cat:comm_eq (:f :g)
h∘ s = 1Z ~ :h cat:right_inv :s
This can also be a data constraint!! 12 11/18/15 … even deeper Logic ApplicaFons All Φh(x) must have a fixed point for some t specifically Φh(Φt(t)) = ΦΦt(t); otherwise FALSE: e.g., h() = neg()à Gödel’s Incompleteness Theorem Yanofsky 2003
Yoneda lemma The Yoneda lemma allows the embedding of any category into a category of functors defined on that category. It suggests that instead of studying the (small) category C, one should study the category of all functors of C into Set (the category of sets with funcFons as morphisms). Set is a category we understand, and a functor of C into Set can be seen as a "representaFon" of C in terms of known structures. The original category C is contained in this functor category, but new objects appear in the functor category, which were absent and "hidden" in C. For a functor F and for CAT C with objects A, B, C and morphisms f,g,h… hA = Hom(A,_) Nat(hA , F) ≃ F(A) F(B) hA(B) A B F(A) F(C) hA(C) C 13 11/18/15 Yoneda lemma applied to databases For instances (functor) I and for Schema category objects (tables) A, B, C… hA = Hom(A,_) Nat(hA , I) ≃ I(A) Nat(hA , I)(_) à Hom_(hA(_), I(_)) hA(B) = Hom(A,B) HomB(hA(B), I(B)) ≃ IB(A) hA(C) = Hom(A, C) HomC(hA(C), I(C)) ≃ IC(A) A A B B I(B) C I(A) C I(C) CT applicaFon areas 1.  Valid Science concept (domain) representaFons – 
FuncFonal maps, dependencies, limits/products 2.  Rules for database design and use – 
Data integrity, Schema migraFon, Ontology alignment, query forms 3.  Rules for inference and knowledge discovery – 
CondiFons for producFzaFon, exponenFaFon, Discovery Functors, Tool Wrappers (monads) 4.  A formal bridge between CT and LD could help accelerate the development of tools and applicaFons for many areas of Data Sciences 14 11/18/15 Concluding remarks •  Should we begin a regular, semi-­‐formal discussion group around CT and LD? •  PSB2016 Workshop on Abstract Algebra and Topology 15