Why is Surface Area Important to Normal Cell Function?

Geoff Sutcliffe
University of Miami
Miami, Florida, USA
[email protected]

Adam Pease
Rearden Commerce
Foster City, California, USA
[email protected]

Abstract

This paper shows two first-order logic encodings of the question and sentences necessary to answer “Why is surface area important to normal cell function?”, and shows how the proof provides the answer to the question.

1 Introduction

The question “Why is surface area important to normal cell function?” can be answered using the sentence “A high surface-to-volume ratio facilitates the exchange of materials between a cell and its environment.” A human’s answer to the question might simply be that sentence, leaving much commonsense and biological knowledge implicit, e.g., that there is a relationship between surface area and surface-to-volume ratio, that if one of two related things is important then the other is also important, and that the exchange rate of a cell with its environment is important. This biological and commonsense knowledge needs to be encoded. Some of this knowledge might be inferred from other sentences, e.g., from “As an object of a particular shape increases in size, its volume grows proportionately more than its surface area.”, but that was not done for this submission.

For the two encodings described in this submission, the implicit commonsense knowledge that has been encoded is:

• Every 3-dimensional volume (cells in this task) has a numeric surface area.
• Every 3-dimensional volume (cells in this task) has a numeric surface to volume ratio.
• Surface area is proportional to surface to volume ratio.
• Proportional is symmetric and transitive on numeric functions.
• If something numeric is important, and is proportional to another numeric thing, then the other numeric thing is also important.

The implicit biological knowledge that has been encoded is:

• Every cell has an environment.
• Every cell and environment have a numeric exchange rate between the cell and the environment.
• The exchange rate of a cell with its environment is important.

Given the explicit and implicit knowledge, the question can be encoded as a conjecture, and the proof of the conjecture provides the reasoning that is the detailed answer to the question. This submission has encoded the knowledge and conjecture in first-order logic, in two different ways. The first encoding is shallow, but provided the intuitive argument that underlies the second, deep, encoding. Section 2 provides the shallow encoding, and Section 3 provides the deep encoding. Section 4 provides some commentary on the encodings, suggesting what might be generally necessary for extensive deep knowledge representation and reasoning tasks.

2 The Shallow Encoding

The shallow encoding was done in typed first-order logic, using the typed first-order form (TFF) language of the TPTP [Sut09], as follows:

    tff(cell_type,type,(
        cell: $tType )).

    tff(environment_type,type,(
        environment: $tType )).

    tff(number_type,type,(
        number: $tType )).

    tff(numeric_function_on_cell_type,type,(
        numeric_function_on_cell: $tType )).

    tff(surface_area_type,type,(
        surface_area: cell > numeric_function_on_cell )).

    tff(surface_to_volume_ratio_type,type,(
        surface_to_volume_ratio: cell > numeric_function_on_cell )).

    tff(exchange_rate_type,type,(
        exchange_rate: ( cell * environment ) > numeric_function_on_cell )).

    tff(environment_of_type,type,(
        environment_of: cell > environment )).

    tff(apply_type,type,(
        apply: ( numeric_function_on_cell * cell ) > number )).

    tff(proportional_type,type,(
        proportional: ( numeric_function_on_cell * numeric_function_on_cell ) > $o )).

    tff(important_type,type,(
        important: numeric_function_on_cell > $o )).

    %----Proportional is symmetric and transitive
    tff(proportional_symmetric,axiom,(
        ! [R1: numeric_function_on_cell,R2: numeric_function_on_cell] :
          ( proportional(R1,R2)
         => proportional(R2,R1) ) )).

    tff(proportional_transitive,axiom,(
        ! [R1: numeric_function_on_cell,R2: numeric_function_on_cell,
           R3: numeric_function_on_cell] :
          ( ( proportional(R1,R2)
            & proportional(R2,R3) )
         => proportional(R1,R3) ) )).

    %----A high surface-to-volume ratio facilitates the exchange of materials
    %----between a cell and its environment.
    tff(proportional1,axiom,(
        ! [C: cell] :
          proportional(surface_to_volume_ratio(C),exchange_rate(C,environment_of(C))) )).

    %----Surface area is proportional to surface-to-volume ratio.
    tff(proportional2,axiom,(
        ! [C: cell] :
          proportional(surface_area(C),surface_to_volume_ratio(C)) )).

    %----If something numeric is important, and proportional to another numeric
    %----thing, then the other numeric thing is important.
    tff(important_proportional,axiom,(
        ! [R1: numeric_function_on_cell,R2: numeric_function_on_cell] :
          ( ( important(R1)
            & proportional(R1,R2) )
         => important(R2) ) )).

    %----The exchange rate of a cell with its environment is important.
    tff(important_to_cell,axiom,(
        ! [C: cell] :
          important(exchange_rate(C,environment_of(C))) )).

    %----Why is surface area important to normal cell function?
    tff(surface_area_important,conjecture,(
        ! [C: cell] :
          important(surface_area(C)) )).

A proof for the TFF encoding is easily found, either directly by a TFF-enabled automated theorem proving (ATP) system such as SNARK, or by a standard translation to first-order form (FOF) [Coh86] followed by almost any FOF ATP system, e.g., Vampire or EP.

3 The Deep Encoding

The deep encoding is based on the concepts encoded in the Suggested Upper Merged Ontology (SUMO) [Pea11, NP01]. SUMO is a large theory stated in first-order logic with some higher-order extensions. It has been open source since its first release in 2000.
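
The argument behind the proof of the shallow encoding in Section 2, which also underlies the deep encoding, can be illustrated with a minimal sketch. It grounds the axioms for one hypothetical cell constant `c` and saturates the facts under symmetry, transitivity, and the `important_proportional` rule. The function name `closure` and the string representation of terms are assumptions made for illustration; ATP systems such as Vampire or EP use quite different calculi and do not work this way internally.

```python
# Ground terms for one hypothetical cell "c" (the constant is an assumption).
SA = "surface_area(c)"
SVR = "surface_to_volume_ratio(c)"
XR = "exchange_rate(c,environment_of(c))"


def closure(proportional, important):
    """Saturate facts under symmetry, transitivity, and important_proportional."""
    proportional, important = set(proportional), set(important)
    changed = True
    while changed:
        changed = False
        # proportional_symmetric
        new_prop = {(b, a) for (a, b) in proportional}
        # proportional_transitive
        new_prop |= {
            (a, y)
            for (a, b) in proportional
            for (x, y) in proportional
            if b == x
        }
        # important_proportional
        new_imp = {b for (a, b) in proportional if a in important}
        if not new_prop <= proportional or not new_imp <= important:
            proportional |= new_prop
            important |= new_imp
            changed = True
    return proportional, important


proportional, important = closure(
    proportional={(SVR, XR), (SA, SVR)},  # axioms proportional1, proportional2
    important={XR},                       # axiom important_to_cell
)
print(SA in important)  # True: the conjecture holds for this cell
```

The derivation mirrors the intended answer: the exchange rate is important, it is proportional (by symmetry and transitivity) to the surface area, and therefore the surface area is important.
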
SUMO has grown steadily since its initial development as an upper ontology, and now contains a Mid-Level Ontology (MILO) and dozens of domain ontologies. It numbers roughly 20,000 terms and 70,000 statements, including thousands of rules. SUMO has been mapped by hand to the entire WordNet lexicon [NP03]. SUMO is particularly valuable as a basis for deep knowledge representation and formal inference because of the large number of hand-crafted rules that serve to define concepts in the ontology. More recently, SUMO has been automatically extended with millions of facts [dMSP08].

The deep encoding was done in the SUO-KIF language that is used by SUMO, using some concepts already defined by axioms in SUMO, and adding new axioms to define concepts that are specific to this task. Some of the axioms are not needed for this task, e.g., the new axioms that encode the temporal and spatial properties of an exchange rate. They were included because it is likely that they would be selected for use if the entire SUMO (extended to include those new axioms) was available. Selection of relevant SUMO axioms for specific reasoning tasks has recently been effectively addressed by the SInE system [HV11]. The encoding is:

    ;;----General SUMO background
    (subclass TransitiveRelation Relation)
    (subclass SymmetricRelation Relation)
    (subclass BinaryRelation Relation)
    (subclass TernaryRelation Relation)

    (<=>
      (instance ?REL TransitiveRelation)
      (forall (?INST1 ?INST2 ?INST3)
        (=>
          (and
            (?REL ?INST1 ?INST2)
            (?REL ?INST2 ?INST3))
          (?REL ?INST1 ?INST3))))

    (<=>
      (instance ?REL SymmetricRelation)
      (forall (?INST1 ?INST2)
        (=>
          (?REL ?INST1 ?INST2)
          (?REL ?INST2 ?INST1))))

    (=>
      (and
        (subclass ?X ?Y)
        (subclass ?Y ?Z))
      (subclass ?X ?Z))

    (=>
      (and
        (instance ?X ?Y)
        (subclass ?Y ?Z))
      (instance ?X ?Z))

    ;;----Every 3-dimensional volume has a numeric surface area.
    (instance surfaceArea BinaryRelation)
    (domain surfaceArea 1 SelfConnectedObject)
    (domain surfaceArea 2 AreaMeasure)

    (=>
      (surfaceArea ?S ?M)
      (exists (?O)
        (surface ?S ?O)))

    (=>
      (surfaceArea ?S (MeasureFn ?N ?U))
      (measure ?S (MeasureFn ?N ?U)))

    ;;----Every 3-dimensional volume has a numeric surface to volume ratio.
    (instance surfaceToVolumeRatio BinaryRelation)
    (domain surfaceToVolumeRatio 1 SelfConnectedObject)
    (domain surfaceToVolumeRatio 2 FunctionQuantity)

    (=>
      (surfaceToVolumeRatio ?O
        (PerFn (MeasureFn ?A ?U) (MeasureFn ?B ?U)))
      (and
        (surfaceArea ?O (MeasureFn ?N (SquareFn ?U2)))
        (measure ?O (MeasureFn ?N2 (CubeFn ?U2)))
        (equal
          (PerFn (MeasureFn ?A ?U) (MeasureFn ?B ?U))
          (PerFn
            (MeasureFn ?N (SquareFn ?U2))
            (MeasureFn ?N2 (CubeFn ?U2))))))

    ;;----Every cell and environment have a numeric exchange rate between the
    ;;----cell and the environment.
    (instance exchangeRate TernaryRelation)
    (domain exchangeRate 1 Cell)
    (domain exchangeRate 2 Region)
    (domain exchangeRate 3 FunctionQuantity)

    (=>
      (exchangeRate ?C ?R
        (PerFn (MeasureFn ?N ?U) (MeasureFn ?N2 ?TU)))
      (instance ?TU TimeDuration))

    (=>
      (and
        (exchangeRate ?C ?R
          (PerFn (MeasureFn ?N ?U) (MeasureFn ?N2 ?TU)))
        (greaterThan ?N 0))
      (exists (?M ?S)
        (and
          (instance ?S Substance)
          (measure ?S (MeasureFn ?N ?U))
          (instance ?M Motion)
          (duration (WhenFn ?M) (MeasureFn ?N2 ?TU))
          (holdsDuring (ImmediatePastFn ?M) (orientation ?S ?C Inside))
          (holdsDuring (ImmediateFutureFn ?M) (orientation ?S ?C Outside)))))

    (=>
      (and
        (exchangeRate ?C ?R
          (PerFn (MeasureFn ?N ?U) (MeasureFn ?N2 ?TU)))
        (greaterThan 0 ?N))
      (exists (?M ?S)
        (and
          (instance ?S Substance)
          (measure ?S (MeasureFn ?N ?U))
          (instance ?M Motion)
          (duration (WhenFn ?M) (MeasureFn ?N2 ?TU))
          (holdsDuring (ImmediatePastFn ?M) (orientation ?S ?C Outside))
          (holdsDuring (ImmediateFutureFn ?M) (orientation ?S ?C Inside)))))

    ;;----Every cell has an environment.
    (instance cellEnvironmentFn UnaryFunction)
    (domain cellEnvironmentFn 1 Cell)
    (range cellEnvironmentFn Region)

    (=>
      (cellEnvironmentFn ?C ?R)
      (orientation ?C ?R Inside))

    ;;----Proportional is symmetric and transitive
    (instance proportional SymmetricRelation)
    (instance proportional TransitiveRelation)
    (domain proportional 1 Relation)
    (domain proportional 2 Relation)

    (=>
      (proportional ?X ?Y)
      (proportional ?Y ?X))

    (=>
      (and
        (proportional ?X ?Y)
        (proportional ?Y ?Z))
      (proportional ?X ?Z))

    ;;----A high surface to volume ratio facilitates the exchange of materials
    ;;----between a cell and its environment.
    (proportional surfaceToVolumeRatio exchangeRate)

    ;;----Surface area is proportional to surface to volume ratio.
    (proportional surfaceArea surfaceToVolumeRatio)

    ;;----If something numeric is important, and is proportional to another
    ;;----numeric thing, then the other numeric thing is important.
    (instance Important RelationalAttribute)

    (=>
      (and
        (instance ?R Relation)
        (attribute ?R Important)
        (proportional ?R ?R2))
      (attribute ?R2 Important))

    ;;----The exchange rate of a cell with its environment is important.
    (attribute exchangeRate Important)

    ;;----Why is surface area important to normal cell function?
    (attribute surfaceArea Important)

The SUO-KIF encoding was exported to TPTP FOF format by the SigmaKEE system [PB10]. The export instantiates higher-order axioms for predicates, e.g., the axioms for transitivity and symmetry are instantiated for proportional. The resultant axiomatization has 46 axioms, and the conjecture is easily proved by most first-order ATP systems.

4 Conclusion

The two encodings help to illustrate the difference between shallow and deep knowledge representation. It is often possible to do a quick shallow encoding of any given topic, which leaves much knowledge about the domain implicit. If that knowledge is not to be reused, a shallow encoding may be the best and cheapest choice.
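
As an aside on the export step described at the end of Section 3, the idea of instantiating higher-order axioms for specific predicates can be sketched as follows. This is only an illustration under stated assumptions: the `SCHEMAS` table, the `instantiate` function, and the formula strings are hypothetical, and they do not reproduce SigmaKEE's actual export format or naming scheme.

```python
# Hypothetical first-order templates for axioms that quantify over a relation
# variable (?REL) in the SUO-KIF encoding. The syntax is TPTP-like but is an
# assumption, not SigmaKEE output.
SCHEMAS = {
    "SymmetricRelation": "![X,Y]: ({rel}(X,Y) => {rel}(Y,X))",
    "TransitiveRelation": "![X,Y,Z]: (({rel}(X,Y) & {rel}(Y,Z)) => {rel}(X,Z))",
}

# (instance <relation> <class>) facts taken from the deep encoding.
INSTANCE_FACTS = [
    ("proportional", "SymmetricRelation"),
    ("proportional", "TransitiveRelation"),
    ("surfaceArea", "BinaryRelation"),
]


def instantiate(instance_facts, schemas):
    """Return one first-order axiom per instance fact whose class has a schema."""
    return [
        schemas[cls].format(rel=rel)
        for rel, cls in instance_facts
        if cls in schemas
    ]


for axiom in instantiate(INSTANCE_FACTS, SCHEMAS):
    print(axiom)
```

Replacing the relation variable by each concrete relation keeps the exported problem first-order, at the cost of one axiom copy per relation and schema.
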
However, when large amounts of knowledge must be accumulated over time, where each new chunk of knowledge must be coherent with the content that has been created before, it is those implicit assumptions that can cause problems.

Creating deep knowledge takes effort. One payoff comes in reuse. Rather than creating knowledge from scratch to support each new topic, existing knowledge may be reused, amortizing the effort of creating general knowledge over many different domain-focused projects. This is similar to conventional software development, e.g., a travel website need not write its own database and server software, but rather reuses general tools that have already been created with significant investment. Creating all general software infrastructure for every new project is impractical. We believe the same will gradually come to be generally understood in ontology development as well.

There are a number of statements in the deep encoding of this question that are not used in the proof that solves the question. One might ask why they are needed. In fact, if the entire problem were to encode this one question, their creation would be a waste of time. But if the ultimate goal is to encode a biology textbook, each definition contributes to that goal. The discipline of defining each term, we believe, also makes for better knowledge representation in the short term.

The benefit of reusing the massive existing content in SUMO is noteworthy. There was no need to create a theory of different types of relations; instead the existing notions of transitivity, symmetry, etc., were reused. Similarly, a whole theory of units and measures, as well as theories of process and action and roles in actions, were reused. The reasoning task used a large inference system that includes techniques for optimizing SUMO for FOL inference.
The inference system takes advantage of ATP systems that have been optimized through several years of open competition to work well on this type of problem, i.e., inference over very large knowledge bases in which most axioms are not relevant to any given query.

References

[Coh86] A.G. Cohn. Many Sorted Logic = Unsorted Logic + Control? In M. Bramer, editor, Proceedings of Expert Systems ’86, The 6th Annual Technical Conference on Research and Development in Expert Systems, pages 184–194. Cambridge University Press, 1986.

[dMSP08] G. de Melo, F. Suchanek, and A. Pease. Integrating YAGO into the Suggested Upper Merged Ontology. In S. Chung, editor, Proceedings of the 20th IEEE International Conference on Tools with Artificial Intelligence, pages 190–193. IEEE Computer Society, 2008.

[HV11] K. Hoder and A. Voronkov. Sine Qua Non for Large Theory Reasoning. In V. Sofronie-Stokkermans and N. Bjørner, editors, Proceedings of the 23rd International Conference on Automated Deduction, Lecture Notes in Artificial Intelligence, page To appear. Springer-Verlag, 2011.

[NP01] I. Niles and A. Pease. Towards A Standard Upper Ontology. In C. Welty and B. Smith, editors, Proceedings of the 2nd International Conference on Formal Ontology in Information Systems, pages 2–9, 2001.

[NP03] I. Niles and A. Pease. Linking Lexicons and Ontologies: Mapping WordNet to the Suggested Upper Merged Ontology. In H. Arabnia, editor, Proceedings of the 2003 International Conference on Information and Knowledge Engineering, pages 412–416, 2003.

[PB10] A. Pease and C. Benzmüller. Sigma: An Integrated Development Environment for Logical Theories. In Proceedings of ECAI 2010 Workshop on Intelligent Engineering Techniques for Knowledge Bases, pages 7–12, 2010.

[Pea11] A. Pease. Ontology: A Practical Guide. Articulate Software Press, 2011.

[Sut09] G. Sutcliffe. The TPTP Problem Library and Associated Infrastructure. The FOF and CNF Parts, v3.5.0. Journal of Automated Reasoning, 43(4):337–362, 2009.