Applied Differential Ontology Framework
Bringing the knowledge of concepts to Information Assurance and Cyber Security using FARES**
How we have done this and why

Presentation by Dr. Peter Stephenson and Paul Stephen Prueitt, PhD
Draft version 1.0, April 2, 2005

** FARES is the name of an Information Assurance product available from the Center for Digital Forensics.

Ontology Tutorial 7, copyright Paul S. Prueitt, 2005

Goal: extend some of the functionality of individual intelligence to the group

This goal, of course, requires political and cultural activity. Such activity depends on a level of education in the natural science of behavior, computation and neuroscience. Since 1993, the BCNGroup has focused on defining the relevant educational, political and cultural dependencies.

We make a principled observation that semantic technology has not been as enabling as one might have assumed. BCNGroup scientists suggest that the reason for this absence of performance is the Artificial Intelligence (AI) polemic. This polemic falsely characterized human cognition as computation, and created the mythology that a computer program will reason based on an active perception of reality.

If we look beyond the AI polemic, we see that natural science understands a great deal about individual intelligence and about group intelligence. In the Differential Ontology Framework, this understanding is operationalized in a software architecture that relies on humans to make judgments and on the computer to create categories of patterns based on real-time data.

The Fundamental Diagram

Scientific origins: J. J. Gibson (late 1950s), ecological physics, evolutionary psychology, cognitive engineering, and other literatures.

Does a human Community of Practice (CoP) have a perceptual, cognitive and/or action system?
It depends:
• Some groups within the State Department (yes)
• Some groups at NIST, NSF, DARPA, etc. (yes)
• Some groups in the Academy (yes)
• Other groups in these same organizations (no, not at all)
• The Knowledge Management community (no, not really)
• The Computer Security and Information Assurance community (no, not really)
• The Iraqi Sunni community in Iraq in March 2005 (this might be forming)

Diagram from Prueitt, 2003: the first two steps of the seven-step AIPM are missing; the RDBMS diagram from Prueitt, 2003 is not complete.

The measurement/instrumentation task

The first two steps in the AIPM: measurement is part of the "semantic extraction" task, and is accomplished with a known set of techniques:
• Latent semantic technologies
• Some form of n-gram measurement with encoding into hash tables or an internal ontology representation (CCM and NdCore, perhaps AeroText and Convera's process ontology, Orbs, Hilbert encoding, CoreTalk/Cubicon)
• Stochastic and neural/genetic architectures

Differential Ontology Framework applications:
• Increase the degree of executive decision-making capacity and the cognitive capability available to a human community of practice, such as a group in the US State Department or a group in the US Treasury.
• Social groups interested in citizen watchdog activities, or other civil activities, can have this same technology.
• Business entities will be able to use this software to develop a greater understanding of risks to the business.

For technical descriptions see tutorials 1-7 (from [email protected]).

Development steps for the DOF Beta site

Development of an ontology with an editor like Protégé: concepts related to Threats, Vulnerabilities, Impacts and Inter-domain communications are specified, but the set of concepts about Risks is not. The domain expert, Peter Stephenson, used the methods of "Descriptive Enumeration" (DE) and community polling to develop the set of concepts, properties and relationships.
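One of the measurement techniques listed above, n-gram measurement with encoding into hash tables, can be illustrated with a minimal sketch. The function names and the sample text below are illustrative assumptions, not part of FARES or any of the named products; a Python dict stands in for the hash table.

```python
from collections import Counter

def ngrams(tokens, n=2):
    """Yield successive n-grams from a token sequence."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])

def measure(text, n=2):
    """First-level measurement: count n-gram co-occurrence
    patterns and store them in a hash table (here, a Counter)."""
    tokens = text.lower().split()
    return Counter(ngrams(tokens, n))

# Illustrative firewall-log vocabulary, not real FARES input.
table = measure("deny tcp deny tcp allow udp deny tcp")
```

The resulting table of pattern frequencies is the raw material that later stages would categorize against concepts.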
Peter's role here is to represent what he knows about these realities without being concerned about computable inference or ontology representation standards. He used Protégé as a typewriter to write out concepts and specify relationships and properties. It is a very creative process.

The modular DOF architecture

Three levels are used: upper ontology, middle ontologies, and scoped ontology individuals. The top level has a higher-level abstraction for each of the core concepts that appear in each of the five middle-level ontologies. Initially these middle-level ontologies were developed manually for Threats, Vulnerabilities, Impacts, and Inter-domain communication channels. The exercise at this Beta site will demonstrate how to automate the development of a set of Risk concepts through the measurement of event log data.

Goal: an ontology over Risks is to be developed as a consequence of a measurement process over some data set. It is important to see that, in theory, any one of the five "upper level ontologies" can be deleted and rebuilt using a data source, the other four, and the process we are prototyping. Of course, one discovers what one discovers, and human tacit knowledge is involved in any of these HIP processes, since a human in the loop is core to DOF use.

How does one judge the results? An "armchair" evaluation is used, whereby knowledgeable individuals look at how and why the various steps are done and make a subjective evaluation of the results. We also have a mapping between the Risk evaluation ontology and a numerical value with quantitative metrics. This mapping provides an informed measure of Risk that can be converted to a financial and legal statement.
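The three-level modular structure described above can be sketched as a small data model. The class names, the concept "CovertChannel", and the sample evidence string are hypothetical illustrations under the stated assumption that the upper level holds one abstraction per middle-level domain; this is not the FARES implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    """A concept authored by descriptive enumeration."""
    name: str
    relations: dict = field(default_factory=dict)  # relation -> concept name

@dataclass
class MiddleOntology:
    """One of the five middle-level DOF ontologies."""
    domain: str
    concepts: dict = field(default_factory=dict)

    def add(self, concept: Concept):
        self.concepts[concept.name] = concept

@dataclass
class ScopedIndividual:
    """A small scoped ontology individual grounded in measured data."""
    concept: str
    domain: str
    evidence: list = field(default_factory=list)  # e.g. log records

# Five middle-level ontologies under one upper level; Risks is the one
# to be derived from measurement rather than authored by hand.
upper = {d: MiddleOntology(d) for d in
         ("Threats", "Vulnerabilities", "Impacts",
          "Inter-domain communications", "Risks")}
upper["Threats"].add(Concept("CovertChannel"))  # hypothetical concept
risk = ScopedIndividual("CovertChannel", "Threats",
                        evidence=["src=10.0.0.5 dst=10.0.0.9 proto=icmp"])
```

The sketch shows why any one middle ontology can in principle be deleted and rebuilt: it is just one of five parallel containers fed by the same measurement process.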
[Diagram: Differential Ontology Framework data flow. Inputs: log data (src address/port, dst address/port, protocol), organizational groups, organizational group IP address ranges, and security policy domains. Orb analysis produces scoped ontology individuals (possible risks). HumanCentric Information Production analysis and the CPNet model of inter-domain communications feed inter-domain communications channel behavior, including out-of-band or covert communications channel behavior. Threats, Vulnerabilities and Impacts at the policy domain level combine into a Risk profile. Top and middle ontology, scoped ontology, and the human expert all participate.]

The Fundamental Diagram

DOF grounds the Fundamental Diagram with correspondence to several levels of event observation.

First level: Data Instance

Example: Customs manifest data. The event is measured (by humans or algorithms) in a report having both relational-database-type "structured" data and weakly structured, free-form human language text.

Example: Cyber Security or Information Assurance data co-occurrence patterns. The event is measured (by algorithms) and expressed as a record in a log file.

In both cases, a FARES or modified FARES product establishes the ontology resources for a longer-term "True Risk Analysis" (TRA) process. DataRenewal Inc will start marketing both the FARES product and the TRA product in April 2005.
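The first-level measurement of Cyber Security data, an event expressed as a record in a log file, can be sketched as follows. The field names and the sample log lines are assumptions chosen to match the log data named in the diagram (src address/port, dst address/port, protocol); they are not the actual FARES log format.

```python
import csv
import io

# Assumed field order, mirroring the diagram's log-data labels.
FIELDS = ["src_addr", "src_port", "dst_addr", "dst_port", "protocol"]

def measure_log(raw):
    """First-level measurement: turn raw comma-separated log lines
    into structured event records (one dict per observed event)."""
    reader = csv.reader(io.StringIO(raw))
    return [dict(zip(FIELDS, row)) for row in reader]

events = measure_log("10.0.0.5,4312,192.168.1.9,80,tcp\n"
                     "10.0.0.5,4313,192.168.1.9,443,tcp\n")
```

Records in this structured form are what the semantic-extraction stage aggregates into concept instances.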
The Fundamental Diagram

Second level: Concept Instance

Instance aggregation is a "collapse" of many instances into a category. Example: the concept of "two-ness" allows one to talk about any instance of two things. This aggregation of instances into categories bypasses many scalability problems (the scalability issue never comes up in practice). The aggregation process is called "semantic extraction" of instances into Subject Matter Indicators (SMIs) that reference "concepts". These concepts provide context for any specific data instance. There are several classes of patents on semantic extraction; all of them are useful within DOF, and none is perfect with respect to always being right.

Matching Subject Matter Indicators to concepts

SMIs are found using algorithms.
• The algorithms are complex and require expert use; however, good work produces a computational filter that is used to profile the SMI and thus allows parsing programs to identify SMIs in new sources of data.
• An SMI always produces a conjecture that a concept is present.
• Once the conjecture is examined by a human, the concept's "neighborhood" in the explicit ontology can be reproduced as the basis for a small scoped ontology individual.

Concepts are expressed through a process of human descriptive enumeration and iterative refinement. In the FARES Beta site, Threats, Vulnerabilities, Impacts and Inter-domain communications are separate middle DOF ontologies, each having about 40 concepts. These ontologies also have relationships, attributes, properties and some subsumption (subconcept) relationships. However, they are designed for subsetting rather than as a basis for "inference". Because we do not use the Ontology Inference Layer in OWL, we convert the OWL-formatted information into Ontology referential base (Orb) encoded information. We have an Orb representation of the SMIs.
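The SMI-matching step, a computational filter whose every hit is only a conjecture that a concept is present, pending human examination, can be sketched as follows. The indicator profiles, concept names, threshold, and sample text are all illustrative assumptions, not actual FARES SMI profiles.

```python
# Hypothetical indicator profiles: concept name -> indicator terms.
SMI_PROFILES = {
    "CovertChannel": {"icmp", "tunnel", "payload"},
    "PortScan": {"syn", "sweep", "sequential"},
}

def conjectures(record_text, threshold=2):
    """Run the SMI filter over new data. Each returned concept is a
    conjecture only; a human must examine it before the concept's
    neighborhood is scoped into an individual."""
    tokens = set(record_text.lower().split())
    return [concept for concept, profile in SMI_PROFILES.items()
            if len(profile & tokens) >= threshold]

hits = conjectures("outbound icmp tunnel with oversized payload")
# hits names concepts conjectured to be present, for human review
```

The filter deliberately stops at a conjecture: it is the human-in-the-loop examination, not the algorithm, that promotes a hit into a scoped ontology individual.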
Thus a common representational standard exists between the SMIs and the set of explicitly defined concepts.

Ontology referential base encoding within the DOF