Multi Entity Bayesian Network

An Overview By:
Asma Sanam Larik & Syeda Saleha Raza
Authors:
Paulo da Costa & Prof. Laskey



Davis (1990) defines logic as
“a schema for defining languages to
describe and reason about entities in different
domains of application”.
A logic with propositional expressive power can
reason about particular individuals but cannot
express generalizations.
A first-order logic can reason about general
properties and relationships that apply to
collections of individuals.
Terminology of Logic
Vocabulary
consists of symbols that can be combined to form expressions
that represent and reason about entities in a given domain
Logical symbols
(e.g., variables, connectives, punctuation)
Non-logical symbols
(e.g., constant symbols, function symbols, relation symbols)
Syntax
consists of rules for combining these symbols
to form legal expressions
Semantics
characterizes the meaning of expressions
Theory of reference
specifies what the expressions denote in the domain
Model theory
specifies domain-independent aspects of meaning that
are purely logical consequences of collections of expressions
Terminology of Logic
Sentence
A legal expression that makes an assertion about a
particular domain
Theory
A collection of sentences in a given language
Axiom
A proposition that is taken to be true and is
used as a starting point for inferring other truths

In a computational theory, expressions are
encoded as data structures on a computer
and the proof rules are implemented as
computer programs




First-order logic is primary among logical systems
from both a theoretical and a practical standpoint. It
has been proposed as a unifying logical foundation
for defining extended logics and interchanging
knowledge among applications written in different
languages.
However, its applicability has been limited by the lack
of a coherent semantics for plausible reasoning.
A theory in first-order logic assigns definite truth-values
only to sentences that have the same truth-value
(either true or false) in all interpretations of the
theory.
The most that can be said about any other sentence
is that its truth-value is indeterminate.
Syntax of First-Order Logic
Logical symbols
The logical connectives (¬, ∧, ∨, ⇒, ⇔), the equality
relation (=), the quantifiers (∀, ∃), the comma, open and
close parentheses, and a collection of variable symbols
Non-logical symbols
Constant symbols, function symbols,
and predicate symbols
Term
• A term is a constant symbol, a variable
symbol, or a function symbol followed by a
parenthesized list of terms separated by
commas
• Terms are used to refer to entities in the
domain
Atomic Formula
• A predicate symbol followed by a
parenthesized list of terms, or
• A parenthesized expression consisting of a
term followed by an equal sign followed
by another term
Syntax of First-Order Logic
Formula
• An atomic formula;
• An expression of the form ¬α, (α∧β), (α∨β), (α⇒β),
or (α⇔β), where α and β are formulas, e.g.,
((Fernandez = Manager(Maintenance,2003))
∨ (Nguyen = Manager(Maintenance,2003))); or
• An expression of the form ∀μα or ∃μα, where μ is a
variable symbol and α is a formula, e.g.,
∃x (Employee(x) ∧ (x = Manager(department,year))).
Open Formula
A formula in which some variables are free, i.e., not bound by a quantifier
Closed Formula
A formula in which no variables are free
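
The definitions of term, formula, open formula and closed formula can be made concrete with a small sketch. The class names below (Var, Const, Func, Pred, Eq, Not, BinOp, Quant) are illustrative choices, not part of the slides, and the sketch assumes that department and year in the example above are variable symbols.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Var:            # variable symbol
    name: str

@dataclass(frozen=True)
class Const:          # constant symbol
    name: str

@dataclass(frozen=True)
class Func:           # function symbol applied to a list of terms
    symbol: str
    args: Tuple["Term", ...]

Term = Union[Var, Const, Func]

@dataclass(frozen=True)
class Pred:           # atomic formula: predicate symbol applied to terms
    symbol: str
    args: Tuple[Term, ...]

@dataclass(frozen=True)
class Eq:             # atomic formula: (left = right)
    left: Term
    right: Term

@dataclass(frozen=True)
class Not:
    body: "Formula"

@dataclass(frozen=True)
class BinOp:          # op is one of "and", "or", "implies", "iff"
    op: str
    left: "Formula"
    right: "Formula"

@dataclass(frozen=True)
class Quant:          # kind is "forall" or "exists"
    kind: str
    var: Var
    body: "Formula"

Formula = Union[Pred, Eq, Not, BinOp, Quant]

def term_vars(t):
    """Names of all variable symbols occurring in a term."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Func):
        return set().union(*(term_vars(a) for a in t.args))
    return set()      # constants contain no variables

def free_vars(f):
    """Names of variables that are not bound by any quantifier."""
    if isinstance(f, Pred):
        return set().union(*(term_vars(a) for a in f.args))
    if isinstance(f, Eq):
        return term_vars(f.left) | term_vars(f.right)
    if isinstance(f, Not):
        return free_vars(f.body)
    if isinstance(f, BinOp):
        return free_vars(f.left) | free_vars(f.right)
    if isinstance(f, Quant):
        return free_vars(f.body) - {f.var.name}
    raise TypeError(f"not a formula: {f!r}")

def is_closed(f):
    return not free_vars(f)

# The slide's example: exists x (Employee(x) and (x = Manager(department, year)))
x = Var("x")
manager = Func("Manager", (Var("department"), Var("year")))
example = Quant("exists", x, BinOp("and", Pred("Employee", (x,)), Eq(x, manager)))
```

Under that assumption the example is an open formula, because department and year are not bound by any quantifier. Wrapping it in quantifiers over both variables yields a closed formula.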


A BN assumes a simple attribute-value
representation, i.e., each problem involves
reasoning about the same fixed number of
attributes, with only the evidence changing from
one problem instance to the next.
However, many domains require reasoning about a
varying number of related entities of different
types, where the number, types, and relationships
of the entities cannot be specified in
advance and may themselves be uncertain.


Systems based on FOL have the ability to
represent entities of different types
interacting with each other in varied ways.
However, systems based on FOL lack a
theoretically principled, widely accepted,
logically coherent methodology for reasoning
under uncertainty.


MEBN is not a computer language such as Java or
C++, or an application such as Netica or Hugin.
Rather, it is a formal system that combines the
expressive power of FOL with a logically consistent
treatment of uncertainty.
MEBN provides a syntax, a set of model
construction and inference processes, and a
semantics that together provide a means of
defining probability distributions over
unbounded and possibly infinite numbers of
interrelated hypotheses.
Like a BN:
• MEBN theories use directed graphs to specify joint probability
distributions for a collection of interrelated random variables
• MEBN theories represent relationships among hypotheses using
directed graphs in which nodes represent uncertain hypotheses and edges
represent probabilistic dependencies
• Knowledge in MEBN theories is expressed via MEBN Fragments
(MFrags), each of which represents probability information about a group
of related random variables
• MEBN theories extend ordinary Bayesian networks to provide an inner
structure for random variables. Random variables in MEBN theories take
arguments that refer to entities in the domain of application. For
example, Manager(d,y) might represent the manager of the department
designated by the variable d during the year designated by the variable y.

The MEBN language treats the world as being
comprised of entities that have attributes and are
related to other entities

Constant and variable symbols are used to refer to
entities

Random variables represent features of entities and
relationships among entities.

Unique identifier symbols are either truth-value
symbols, undefined symbols, or entity identifier
symbols.



Random variable symbols include logical
connectives, equality operators, and quantifiers.
The reserved random variable symbol <> denotes the
identity random variable. It is the identity function
on T, F, and Absurd.
Findings are the random variables used to
represent observed evidence.





Like a BN, an MFrag contains nodes, which
represent Random Variables, arranged in a
directed graph whose edges represent direct
dependence relationships.
An MFrag contains three types of nodes:
Context Node
Input Node
Resident Node


An MFrag represents a conditional probability distribution
for instances of its resident random variables given their
parents in the fragment graph and the context nodes.
A node in an MFrag can have a parenthesized list of
arguments. These arguments are placeholders for entities
in the domain.

The resident nodes of an MFrag have their own local
distributions.

Input and context nodes influence the distributions of the
resident nodes.

Context nodes specify the conditions that must be
satisfied for the MFrag to apply. They are Boolean nodes.
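
The three node types can be pictured with a minimal data-structure sketch. Everything here is an assumption made for illustration: the class names RVTemplate and MFrag, the applies method, and the node names IsOwnStarship and StarshipsNearby are hypothetical, and only DangerToSelf echoes an MFrag named in these slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass(frozen=True)
class RVTemplate:
    """A random variable with placeholder arguments, e.g. DangerToSelf(s, t)."""
    name: str
    args: Tuple[str, ...]

    def instantiate(self, binding: Dict[str, str]) -> str:
        # Substitute entity identifiers for the placeholder arguments.
        return f"{self.name}({', '.join(binding[a] for a in self.args)})"

@dataclass
class MFrag:
    name: str
    context: List[RVTemplate]    # Boolean conditions that must hold
    inputs: List[RVTemplate]     # nodes whose distributions are defined elsewhere
    residents: List[RVTemplate]  # nodes whose local distributions are defined here

    def applies(self, binding: Dict[str, str], truth: Callable[[str], bool]) -> bool:
        # The MFrag applies to a binding only if every context node is True.
        return all(truth(c.instantiate(binding)) for c in self.context)

# Hypothetical fragment loosely modeled on the Danger To Self MFrag.
danger_to_self = MFrag(
    name="DangerToSelf_MFrag",
    context=[RVTemplate("IsOwnStarship", ("s",))],
    inputs=[RVTemplate("StarshipsNearby", ("s", "t"))],
    residents=[RVTemplate("DangerToSelf", ("s", "t"))],
)

# Made-up knowledge about which context statements are true.
facts = {"IsOwnStarship(!ST0)": True}
truth = lambda statement: facts.get(statement, False)
```

Binding s to !ST0 satisfies the context node, so the fragment would contribute an instance DangerToSelf(!ST0, !T0). Binding s to any other starship does not.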
This table shows the Danger To Self MFrag probability distribution:

Relevant Starships Nearby                  Danger Level Dist.
At least 1 Cardassian                      [0.925, 0.024, 0.006, 0]
At least 2 Cardassians                     [0.99, 0.008, 0.002, 0]
At least 3 Cardassians                     [0.975, 0.2, 0.05, 0]
More than 4 Cardassians                    [1, 0, 0, 0]
No Cardassians but at least 1 Romulan      [.73, .162, .081, .027]
No Cardassians but at least 1 Romulans     [.76, .144, .072, .024]
…                                          … (see formula)
No Cardassians but 10 or more Romulans     [1, 0, 0, 0]
No Cardassians or Romulans, one Unknown    [.02, .48, .48, .02]
…                                          … (see formula)
No Cardassians or Romulans, 10+ Unknown    [.20, .30, .30, .20]
…                                          … (see formula)



MEBN also provides theoretically grounded support
for representing very general forms of recursion via
MFrags that allow influences between instances of the
same RV template.
Allowable recursive definitions must ensure that no
RV instance can influence its own probability
distribution.
As in non-recursive MFrags, the input nodes in a
recursive MFrag may include nodes whose local
distributions are defined in another MFrag. In
addition, the input nodes may include instances of
recursively defined nodes in the MFrag itself.
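
The requirement that no RV instance may influence its own distribution can be checked mechanically when an instance has finitely many ancestors. The sketch below assumes a hypothetical recursive template whose instance at time step t has the instance at t - 1 as its only parent. The names ZoneMD, temporal_parents and cyclic_parents are illustrative and not taken from the slides.

```python
def temporal_parents(instance):
    """Hypothetical recursive rule: RV(t) depends on RV(t - 1), with RV(0) as the base case."""
    name, t = instance
    return [(name, t - 1)] if t > 0 else []

def cyclic_parents(instance):
    """A deliberately invalid rule: time steps wrap around modulo 3."""
    name, t = instance
    return [(name, (t + 1) % 3)]

def ancestors(instance, parents_of):
    """All instances that can influence `instance` (assumes finitely many exist)."""
    seen, stack = set(), list(parents_of(instance))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents_of(node))
    return seen

def influences_itself(instance, parents_of):
    # A valid recursive definition never lists an instance among its own ancestors.
    return instance in ancestors(instance, parents_of)
```

With temporal_parents the recursion bottoms out at time step 0, so the check passes. With cyclic_parents every instance is its own ancestor, which is the situation the constraint rules out.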



MFrags provide a flexible means to represent knowledge
about specific subjects within the domain of discourse,
but the true gain in expressive power is revealed when we
aggregate these “knowledge patterns” to form a coherent
domain theory that can be applied to reason about specific
situations and refined through learning.
A collection of MFrags that satisfies consistency
constraints ensuring the existence of a unique joint
probability distribution over its random variables is called
an MTheory.
MTheories can express probability distributions over truth
values of arbitrary first-order logic sentences and can be
used to express domain-specific ontologies that capture
statistical regularities in a particular domain of application.
A valid MTheory must ensure that:
◦ all recursive definitions terminate in finitely many steps and
contain no circular influences; and
◦ the local distributions have reasonable limiting behavior as more
and more parents are added, since random variable instances may have
a large, and possibly unbounded, number of parents.
Laskey (2005) proved that when an MTheory satisfies
these conditions (as well as other technical conditions
that are unimportant to our example), there exists a
joint probability distribution on the set of instances of its
random variables that is consistent with the local
distributions assigned within its MFrags.
MEBN can represent uncertainty about the type of an entity, refine
type-specific probability distributions through Bayesian learning,
inherit distributions from parent types, and incorporate other
features related to representing and reasoning with incomplete
and/or uncertain information in typed systems (Costa, 2005).
As an example, we might consider two subtypes of starships,
fighters and cargo ships. When we are unsure about a starship’s
type, the result of a query that depends on type will be a weighted
average of the result given that the ship is a fighter and the
result given that it is a cargo ship.
As another example, suppose we had two different types of space
traveling entities, starships and comets, and we are not sure about
the type of a given entity. In this case, the result of a query that
depends on the entity type will be a weighted average of the result
given that the entity is a comet and the result given that it is a
starship.
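
The weighted average described in these examples is the law of total probability applied to the uncertain type. A minimal sketch follows. The function name mix_over_types, the query (probability that a ship is hostile) and all numbers are made up for illustration.

```python
def mix_over_types(type_belief, result_given_type):
    """Average a type-dependent result, weighting each type by our belief in it."""
    total = sum(type_belief.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("type beliefs must sum to 1")
    return sum(p * result_given_type[t] for t, p in type_belief.items())

# Made-up beliefs about the type of one starship.
type_belief = {"Fighter": 0.3, "CargoShip": 0.7}
# Made-up answers to a type-dependent query, one per type.
p_hostile_given_type = {"Fighter": 0.8, "CargoShip": 0.1}

p_hostile = mix_over_types(type_belief, p_hostile_given_type)
# 0.3 * 0.8 + 0.7 * 0.1 = 0.31
```

The same computation applies to the starship and comet example by swapping in the corresponding type labels.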
To allow for hypothetical starships, the local
distribution for Exists(st) assigns non-zero
probability to False. Suppose the unique identifier
!ST4 refers to a hypothetical starship nominated to
explain the report.
In this case, Isa(Starship, !ST4) has value True, but the
value of Exists(!ST4) is uncertain. A value of False
would mean !ST4 is a spurious starship or false
alarm. Queries involving the unique identifier of a
hypothetical starship return results weighted by our
belief that it is an actual or a spurious starship. Belief
in Exists(!ST4) is updated by Bayesian conditioning as
relevant evidence accrues. Representing existence
uncertainty is especially useful for counterfactual
reasoning and reasoning about causality.
MEBN logic can also represent association uncertainty, a major
problem for multi-source fusion systems. Association
uncertainty means we are not sure about the source of a given
report. For example, a report (say, !SR4) may indicate a starship
near a given location, but it may be unclear whether the report
was generated by !ST1 or !ST3, two starships known to be near
the reported location, or by a previously unreported starship
!ST4. In this case, we would enumerate these three unique
identifiers as possible values for Subject(!SR4), and specify that
Exists(!ST4) has value False if Subject(!SR4) has any value other
than !ST4. Many weakly discriminatory reports coming from
possibly many starships produce an exponential set of
combinations that requires special hypothesis management
methods (cf. Stone et al. 1999). For example, we might not
nominate !ST3 as a possible value for Subject(!SR4) if its distance
from the reported location exceeded our gating threshold, even
though it is logically possible for the report to have been
generated by !ST3.
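
The gating idea can be illustrated with a short sketch. It assumes positions are 2-D coordinates and uses Euclidean distance as the gate. The function name nominate_subjects, the coordinates and the threshold are hypothetical. Real hypothesis management methods are considerably more involved.

```python
import math

def nominate_subjects(report_pos, known_ships, gate, new_id):
    """Candidate values for Subject(report): known ships inside the gate, plus one new hypothesis."""
    candidates = [
        ship_id
        for ship_id, pos in known_ships.items()
        if math.dist(pos, report_pos) <= gate
    ]
    candidates.append(new_id)  # a previously unreported starship is always possible
    return candidates

# Made-up positions for the two known starships and the report !SR4.
known_ships = {"!ST1": (1.0, 1.0), "!ST3": (9.0, 9.0)}
report_position = (0.0, 0.0)

subjects = nominate_subjects(report_position, known_ships, gate=5.0, new_id="!ST4")
# !ST3 lies outside the gate, so it is not nominated.
```

Widening the gate to cover !ST3 would restore all three candidates listed in the text, at the cost of more combinations to track.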
Closely related to association uncertainty is
identity uncertainty, or uncertainty about
whether two expressions refer to the same
entity. Association uncertainty can be
regarded as a special case of identity
uncertainty – that is, we are unsure about the
identity of Subject(!SR4).
MEBN inference responds to queries for the
degree of belief in target RVs given evidence RVs.
Situation Specific Bayesian Network (SSBN)
This is an ordinary Bayesian network constructed
by combining instances of the MFrags in the
generative MTheory. A standard Bayesian
network inference algorithm is applied to the
SSBN to answer the query.
The answer to the query is obtained by
inspecting the posterior probabilities of the
target nodes.
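
Once an SSBN has been built, answering the query is ordinary Bayesian network inference. The sketch below uses brute-force enumeration on a two-node network with Boolean variables. The node names and probabilities are made up, and enumeration stands in for whichever standard algorithm an actual implementation would use.

```python
from itertools import product

# Each node maps to (parents, table), where table[parent values] = P(node = True).
network = {
    "StarshipNearby": ([], {(): 0.2}),
    "SensorReport": (["StarshipNearby"], {(True,): 0.9, (False,): 0.1}),
}

def joint_prob(assignment, network):
    """Probability of one complete assignment: the product of the local distributions."""
    p = 1.0
    for node, (parents, table) in network.items():
        p_true = table[tuple(assignment[q] for q in parents)]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p

def query(target, findings, network):
    """Posterior P(target = True | findings), computed by summing over all assignments."""
    nodes = list(network)
    totals = {True: 0.0, False: 0.0}
    for values in product([True, False], repeat=len(nodes)):
        assignment = dict(zip(nodes, values))
        if all(assignment[k] == v for k, v in findings.items()):
            totals[assignment[target]] += joint_prob(assignment, network)
    z = totals[True] + totals[False]
    if z == 0.0:
        raise ValueError("findings have probability zero (inconsistent)")
    return totals[True] / z

belief = query("StarshipNearby", {"SensorReport": True}, network)
# 0.2 * 0.9 / (0.2 * 0.9 + 0.8 * 0.1) = 0.18 / 0.26
```

Enumeration is exponential in the number of nodes, so it is only suitable for toy networks like this one.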
Finding:
In our example, the finding MFrags will convey
information that we have five starships (!ST0 through
!ST4) and that the first is our own starship. For the
sake of illustration, let’s assume that our Finding set
also includes data regarding the nature of the space
zone we are in (!Z0), its magnetic disturbance for the
first time step (!T0), and sensor reports for starships
!SR1 to !SR4 for the first two time steps.
Targets:
We assume that the Target set for our illustrative
query includes an assessment of the level of danger
experienced by the Enterprise and the best decision
to take given this level of danger.
1. Initialization. Set the query set Q to the union of the target nodes and the
finding nodes. Initialize the RV instances R0 = Q. Set the maximum number of
states per random variable N0 equal to a finite integer. Set i = 0.
2. SSBN Structure Construction. Set the current SSBN Bi to contain the nodes in Ri
and all arcs corresponding to influencing configurations. Remove from Bi any
barren nodes, nodes d-separated from target nodes by finding nodes, and
nuisance nodes for which marginal distributions do not need to be updated.
3. Local Distribution Construction. Set the local distributions in Bi, modifying
them to restrict random variables to no more than Ni possible values, to
approximate the effect of random variables that have not been enumerated,
and to compute for no more than Ki steps.
4. Inference. Apply standard Bayesian network inference to compute conditional
distributions for the target random variables given the finding random variables.
If findings have probability zero, report that the findings are inconsistent.
5. Instance Enumeration and Approximation Parameter Updating. If a stopping
criterion is met, output Bi. Otherwise, add to Ri additional parents of random
variables for which adding parents might change the distribution, increase Ni
and Ki, and return to Step 2.
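
One part of Step 2, the removal of barren nodes, is simple enough to sketch. A node is barren if it is neither a target nor a finding and has no target or finding among its descendants, so pruning amounts to keeping only the targets, the findings and their ancestors. The function name prune_barren and the four-node graph are hypothetical, and the sketch does not handle d-separated or nuisance nodes.

```python
def prune_barren(parents, targets, findings):
    """Keep targets, findings, and their ancestors. Every other node is barren."""
    keep, stack = set(), list(targets | findings)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, ()))
    # Rebuild the parent map restricted to the nodes that were kept.
    return {node: list(parents.get(node, ())) for node in keep}

# Made-up graph, written as node -> list of parents.
graph = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}

pruned = prune_barren(graph, targets={"C"}, findings={"A"})
# D has no target or finding below it, so it is removed.
```

Because the kept set is closed under the parent relation, every arc in the pruned graph still points between kept nodes.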

In some cases the SSBN can be infinite, but under the
conditions given in Definition 2 above, the
algorithm produces a sequence of approximate
SSBNs for which the posterior distribution of the
target nodes converges to their posterior
distribution given the findings. Mahoney and
Laskey (1998) define an SSBN as a minimal
Bayesian network sufficient to compute the
response to a query. An SSBN may contain any
number of instances of each MFrag, depending
on the number of entities and their
interrelationships.
As a full integration of first-order logic and probability,
MEBN provides:
◦ a means of expressing a globally consistent joint distribution over
models of any consistent, finitely axiomatizable FOL theory;
◦ a proof theory capable of identifying inconsistent theories in
finitely many steps and converging to correct responses to
probabilistic queries; and
◦ a built-in mechanism for adding sequences of new axioms and
refining theories in the light of observations.
Thus, even the most complex situations can be
represented in MEBN, provided they can be represented in
FOL.