* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Handouts
Nutriepigenomics wikipedia , lookup
Messenger RNA wikipedia , lookup
Gene expression programming wikipedia , lookup
Epigenetics of neurodegenerative diseases wikipedia , lookup
Site-specific recombinase technology wikipedia , lookup
RNA interference wikipedia , lookup
Gene desert wikipedia , lookup
Vectors in gene therapy wikipedia , lookup
Gene therapy of the human retina wikipedia , lookup
History of RNA biology wikipedia , lookup
Gene therapy wikipedia , lookup
Gene expression profiling wikipedia , lookup
Primary transcript wikipedia , lookup
Neuronal ceroid lipofuscinosis wikipedia , lookup
Microevolution wikipedia , lookup
RNA silencing wikipedia , lookup
Non-coding RNA wikipedia , lookup
Gene nomenclature wikipedia , lookup
Therapeutic gene modulation wikipedia , lookup
Designer baby wikipedia , lookup
Artificial gene synthesis wikipedia , lookup
11/18/15 Abstract Algebras: Category Theory for Biomedicine Eric Neumann PSB 2016 Material Used • Category Theory for ScienFsts, Spivak 2013 • Categorical Databases, Spivak 2012 • Categories for the Working MathemaFcian, Mac Lane 1998 • Conceptual MathemaFcs: A First IntroducFon to Categories Lawvere and Schanuel 2009 • Sheaves in Geometry and Logic: A First IntroducFon to Topos Theory, MacLane and Moerdijk, 1994 1 11/18/15 Category Theory (CT) Basics • All Categories consist of: • Objects (more like what we think of as classes) • Morphisms (funcFons, maps, arrows between objects) • Morphisms can be composed in a sensible way: g ∘ f: x ⟼ g(f(x)) • HomC(c, d) = the set of all funcFons/maps between c and d • Categories are oYen constructed in the simplest form possible • Set, Graph, Grp, Mon, Vect, Top, etc … even Cat • Generality is the name of the game! Category Theory (CT) Basics • Functors F map Categories (C) to Categories (D): F: C -‐> D • F: Obj(C) -‐> Obj(D) , e.g., c ⟼ F(c) = d • F: Morphism(C) -‐> Morphism(D) , e.g., f ⟼ F(f) • For example, GS: Graph -‐> Set maps Graph objects and arrows to sets of objects and sets of funcFons in Set: Think Biochemical Networks to Genes and Reac:ons! • There’s even morphisms between Functors called Natural TransformaDons… turtles all the way up! 2 11/18/15 Category Theory (CT) Basics C := Domain, PreImage Codomain, Image f g h k i Sets • SurjecFve = “onto” • Everything in Image is mapped • InjecFve = “one-‐to-‐one” • Inclusions, subobjects • BijecFve = “onto” & “one-‐to-‐one” Category Theory (CT) Basics Where can one apply CT? • Sets and their funcFons (general rules for arithmeFc, data) • All areas of Math and Groups, incl. Topology (String Theory) • All Concepts that can be formally represented: Schemas and Ontologies (Functors ~ logic, Inference, concept generaFon) • All Databases (Functors ~ data migraFon, queries) • All ScienFfic Models: ontological, dynamic, analyFcal Power of CT is that theorems in one parent CAT can be applied to all proper subsumed categories! Just be sure to follow the formal rules! 3 11/18/15 PracFcal Categories Some Science… Space à Plane x Line , Time à Space Time à Plane x Line from Lawvere and Schanuel, 2009 CT Universal ProperFes • Products, Pullbacks • Limits • Equalizers • Sums, Pushouts • CommutaFon C_has_width∘ f = s ∘ B_has_width 4 11/18/15 CT Universal ProperFes • Products, Pullbacks • Limits • Equalizers • Sums, Pushouts • CommutaFon C_has_width∘ f = s ∘ B_has_width FuncFons in CT Given funcFon/map f: X -‐> Y ≡ x ⟼ f(x) ExponenFaFon: f ~ YX , means each f is iden:fied by name: YX ~ “f” • “How many ways to maps from X to Y?” • Ex 1: For an input set X of size |X| mapping to any subset of X (Y ~ membership is binary, so |Y|=2), the number of possible subsets is |Y||X| (i.e., each x∊X is either included or not), hence there are just as many maps. • Ex 2: A funcFon with 3 boolean args must have values for 23=8 different inputs • Ex 3: For seq’s X of a similar length and Y, the 4 bases one can choose from, then 4|X| is the number of possible sequences for a string of length |X| EvaluaFon: {f(X) = Y} ~ YX à ev: X × YX à Y gz f Currying: X × Y ⟶ Z ~ X ⟶ Z , where gY = f(Y,_ ) ~ (ZY)X = (ZX×Y ) ExponenFals objecFfy funcFons and maps! 5 11/18/15 Path CommutaFon Protein tl mRNA e s ts Gene e = ts∘ tl # Protein from Gene = # RNA from Gene∘ # Protein from RNA Prot express rate per Gene = RNA transcribe rate per Gene∘ Prot transl rate amount per RNA Prot mRNA s = inv e Protein Seq complexity for Gene = Gene Seq complexity ∘ RNA Seq complexity Protein-‐>Gene coord = Prot-‐>RNA coord ∘ Gene-‐>RNA coord Gene G-‐Sheaf mut Variant Sheaf binding domains Prot πp mRNA funcFon 1° Causal Flow (mutaFon effects) 𝐹G Structural dependence (informaFon) πr Gene 𝓑G Sequence Topology e ∘ πp p ∝ e ∘ πp ∘ int p 6 11/18/15 T-‐Sheaf modificaFon abundance interacFons Prot πp acFvity mRNA 𝐹G 𝐹T πr Gene πg Tissue 𝓑T Some obvious Products • • • • • • • • • • • • • • Gene-‐expr * Time Prot-‐expr * Tissue P-‐Interacts * Gene-‐expr P-‐Interacts * Prot-‐expr Gene-‐expr * G-‐Interacts Gene-‐expr * G-‐Interacts * Time Gene-‐expr * G-‐Interacts * Prot-‐expr Gene-‐expr * G-‐Interacts * Prot-‐expr * Tissue * Time Prot Structure * Mut Seq-‐space Prot Structure * P-‐Interacts Prot Structure * P-‐Interacts * Mut Seq-‐space Prot-‐Act * Prot-‐expr * Mut Seq-‐space Cell Processes * Prot-‐Act * Prot-‐expr * Mut Seq-‐space Onco Processes * Prot-‐Act * Mut Seq-‐space * Therapy * Outcomes 7 11/18/15 OLOGs • “Ontology Log” – Spivak & Kent, PLOS 2012 • Dynamic Data Schema • compose new concepts via U-‐operaFons • Graph formalism w/ path constraints • As powerful as Ontology + Rules • Great for whiteboarding new semanFcs Product as a Pullback Pullback D = B x C a Signaling Oncogene C a Oncogene B a Signaling Gene A a Gene 8 11/18/15 Product as a Pullback Pullback A Disease B D = B x C PaFents with a Molecular AberraFon-‐ Disease AssociaFon PaFents w a Disease C a Molecular AberraFon A PaFents with a Molecular AberraFon PaFents OLOGs – rules as programs Spivak and Kent, PLOS, 2012 9 11/18/15 Endomaps C := α A x’ = α.α.α.x = α(α(α(x))) Idempotent x = β.β.x β from Lawvere and Schanuel, 2009 Endomaps father father mother ♀ P ♂ father mother mother ♀ ♂ 10 11/18/15 Endomap Products e.g. pedigree & clans father P father mother ♀ mother ♂ father mother mother mother W ♀W B mother father father ♂W mother ♀B father clan father gender ♂B mother mother Graph Products ex: Mitochondrial Genomes father father mother G mother mother M2 mother father father mother mother ♀M2 ♂M1 father father father father father ♀M1 father M1 father father mother mother father ♂ ♀ ♂M2 mother 11 11/18/15 Graph DefiniFon • VerFces are objects • Arrows are objects ~ V A V <Subj> <Prop> <Obj> It’s the basis of a Triple! RelaFon to Shapes Count commuta:on X h f g Y s Z h = g∘ f ∀z {∑x∊X(h(x)=z) = ∑y∊Y (g(y)=z * ∑x∊Xf(x)=y)} :h cat:comm_eq (:f :g) h∘ s = 1Z ~ :h cat:right_inv :s This can also be a data constraint!! 12 11/18/15 … even deeper Logic ApplicaFons All Φh(x) must have a fixed point for some t specifically Φh(Φt(t)) = ΦΦt(t); otherwise FALSE: e.g., h() = neg()à Gödel’s Incompleteness Theorem Yanofsky 2003 Yoneda lemma The Yoneda lemma allows the embedding of any category into a category of functors defined on that category. It suggests that instead of studying the (small) category C, one should study the category of all functors of C into Set (the category of sets with funcFons as morphisms). Set is a category we understand, and a functor of C into Set can be seen as a "representaFon" of C in terms of known structures. The original category C is contained in this functor category, but new objects appear in the functor category, which were absent and "hidden" in C. For a functor F and for CAT C with objects A, B, C and morphisms f,g,h… hA = Hom(A,_) Nat(hA , F) ≃ F(A) F(B) hA(B) A B F(A) F(C) hA(C) C 13 11/18/15 Yoneda lemma applied to databases For instances (functor) I and for Schema category objects (tables) A, B, C… hA = Hom(A,_) Nat(hA , I) ≃ I(A) Nat(hA , I)(_) à Hom_(hA(_), I(_)) hA(B) = Hom(A,B) HomB(hA(B), I(B)) ≃ IB(A) hA(C) = Hom(A, C) HomC(hA(C), I(C)) ≃ IC(A) A A B B I(B) C I(A) C I(C) CT applicaFon areas 1. Valid Science concept (domain) representaFons – FuncFonal maps, dependencies, limits/products 2. Rules for database design and use – Data integrity, Schema migraFon, Ontology alignment, query forms 3. Rules for inference and knowledge discovery – CondiFons for producFzaFon, exponenFaFon, Discovery Functors, Tool Wrappers (monads) 4. A formal bridge between CT and LD could help accelerate the development of tools and applicaFons for many areas of Data Sciences 14 11/18/15 Concluding remarks • Should we begin a regular, semi-‐formal discussion group around CT and LD? • PSB2016 Workshop on Abstract Algebra and Topology 15