Maximum Entropy
RESM 575
Spring 2011
Lecture 7
From Pearson, 2007
Maximum entropy




(Phillips et al., 2008)
History
E. T. Jaynes, 1957
Thermodynamics
Inference and information theory
The Maximum Entropy Method
Origins: Jaynes 1957, statistical mechanics
Recent use: machine learning, e.g., automatic language translation
To estimate an unknown distribution:
1. Determine what you know (constraints)
2. Among distributions satisfying constraints:
Output the one with maximum entropy
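The two steps above can be tried on a toy problem. The sketch below is an illustration only, not part of the lecture: it assumes a six-sided die whose only known property is its mean, and finds the maximum-entropy distribution with that mean. The function name `maxent_die` and the bisection search are hypothetical choices made for this example.

```python
import math

def maxent_die(target_mean, faces=(1, 2, 3, 4, 5, 6), tol=1e-12):
    """Maximum-entropy distribution on `faces` with a fixed mean.

    With a single mean constraint the solution has the form
    p_i proportional to exp(lam * face_i); lam is found by bisection
    because the mean increases as lam increases.
    """
    def dist(lam):
        weights = [math.exp(lam * f) for f in faces]
        z = sum(weights)
        return [w / z for w in weights]

    def mean(lam):
        return sum(f * p for f, p in zip(faces, dist(lam)))

    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return dist((lo + hi) / 2)

# Step 1: what we know is that the mean is 4.5.
# Step 2: among all distributions with that mean, take the one of maximum entropy.
p = maxent_die(4.5)
```

If the known mean is 3.5, the same call returns the uniform distribution, since the constraint then adds no information beyond what a fair die already satisfies.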
What is it?
Maxent is a general-purpose method for making
predictions or inferences from incomplete information.
Its origins lie in statistical mechanics (Jaynes, 1957), and it has
since been applied in diverse areas such as astronomy, portfolio
optimization, image reconstruction, statistical physics, and signal
processing.
Like other Bayesian models…


Uses prior information
Maxent is an alternative to methods of
inference of classical statistics
Maximum Entropy Principle
The fact that a certain probability distribution maximizes entropy
subject to certain constraints representing our incomplete information,
is the fundamental property which justifies the use of that distribution
for inference; it agrees with everything that is known but carefully
avoids assuming anything that is not known (Jaynes, 1990).
Why?

Maxent was introduced as a general approach for
presence-only modeling of species
distributions, suitable for all existing
applications involving presence-only
datasets.
Modeling species distributions
[Figure: Yellow-throated Vireo occurrence points and environmental variables, leading to the predicted potential distribution]
Estimating a probability distribution
Given:
• Map divided into cells
• Environmental variables, with values in each cell
• Occurrence points: samples from an unknown distribution
Our task is to estimate the unknown probability distribution.
Note:
• The distribution sums to 1 over the whole map
• Most probability values will be very small
• Different from estimating probability of presence
Entropy
More entropy: more spread out, closer to the uniform distribution
2nd law of thermodynamics:
- Without external influences, a system moves to increase
entropy
Maximum entropy method:
- Apply constraints to remove external influences
- Species spreads out to fill areas with suitable conditions
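A small numeric check makes the "more spread out" reading concrete. This is a sketch under assumptions: the four-cell distributions are made up, and Shannon entropy with the natural logarithm is used as the measure.

```python
import math

def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # fully spread out over 4 cells
peaked = [0.85, 0.05, 0.05, 0.05]    # concentrated in one cell
certain = [1.0, 0.0, 0.0, 0.0]       # all mass in a single cell

h_uniform = entropy(uniform)   # equals ln(4), the maximum for 4 cells
h_peaked = entropy(peaked)     # smaller than h_uniform
h_certain = entropy(certain)   # zero: no spread at all
```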
Using Maxent for Species Distributions
“Features”
“Constraints”
“Regularization”
Features impose constraints
Feature = environmental variable, or function thereof
[Figure: occurrence points plotted by temperature and precipitation, with the sample average marked]
Find the distribution p of maximum entropy such that,
for all features f: mean(f) = sample average of f
Features
Environmental variables or functions thereof.
Maxent has these classes of features (others are possible):
1. Linear … variable itself
2. Quadratic … square of variable
3. Product … product of two variables
4. Binary (indicator) … membership in a category
5. Threshold … [Figure: feature value, from 0 to 1, plotted against the environmental variable]
6. Hinge … [Figure: feature value, from 0 to 1, plotted against the environmental variable]
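The six feature classes can be written as small functions of one or two environmental variables. The definitions below are a sketch, not the Maxent program's code: the names `knot` and `x_max` are hypothetical, and the hinge is assumed to be a "forward" hinge that stays at 0 up to the knot and then rises linearly to 1 at the variable's maximum.

```python
def linear(x):
    # 1. Linear: the variable itself
    return x

def quadratic(x):
    # 2. Quadratic: square of the variable
    return x * x

def product(x, y):
    # 3. Product: product of two variables
    return x * y

def binary(category, target):
    # 4. Binary (indicator): 1 if the cell belongs to the target category
    return 1.0 if category == target else 0.0

def threshold(x, knot):
    # 5. Threshold: step from 0 to 1 once the variable exceeds the knot
    return 1.0 if x > knot else 0.0

def hinge(x, knot, x_max):
    # 6. Hinge (assumed forward form): 0 up to the knot, then a linear
    # ramp that reaches 1 at x_max
    if x <= knot:
        return 0.0
    return (x - knot) / (x_max - knot)
```

For example, with a temperature knot at 10 and a maximum of 30, `hinge(20, 10, 30)` gives 0.5, while `threshold(20, 10)` gives 1.0.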
Constraints
Each feature type imposes constraints on the output distribution:
Linear features … mean
Quadratic features … variance
Product features … covariance
Threshold features … proportion above threshold
Hinge features … mean above threshold
Binary features (categorical) … proportion in each category
Regularization
[Figure: temperature and precipitation axes showing the sample average, the true mean, and a confidence region around the sample average]
Find the distribution p of maximum entropy such that
mean(f) is in the confidence region of the sample average of f
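Written as an optimization problem, the regularized version relaxes each equality constraint to an interval. The notation here is an assumption made for illustration: x ranges over map cells, \tilde{f}_j is the sample average of feature f_j over the m occurrence points, and \beta_j is the width of the confidence region for that feature.

```latex
\max_{p}\; H(p) = -\sum_{x} p(x)\,\ln p(x)
\quad\text{subject to}\quad
\bigl|\,\mathbb{E}_{p}[f_j] - \tilde{f}_j\,\bigr| \le \beta_j
\ \text{ for every feature } f_j,
\qquad \sum_{x} p(x) = 1,
\qquad\text{where } \tilde{f}_j = \frac{1}{m}\sum_{i=1}^{m} f_j(x_i).
```

Setting every \beta_j to zero recovers the unregularized problem, in which each feature mean must match its sample average exactly.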
The Maxent distribution
… is always a Gibbs distribution:
$q_\lambda(x) = \exp\left(\sum_j \lambda_j f_j(x)\right) / Z$
where
$Z$ is a scaling factor so the distribution sums to 1
$f_j$ is the j'th feature
$\lambda_j$ is a coefficient, calculated by the program
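To see what the Gibbs form produces, the sketch below evaluates it on a made-up map of four cells with two features each. The function name, the feature values, and the coefficients are all hypothetical; in practice the coefficients are fitted by the program rather than chosen by hand.

```python
import math

def gibbs_distribution(features, lambdas):
    """Return q(x) = exp(sum_j lambda_j * f_j(x)) / Z for every cell x.

    features: one list of feature values per cell
    lambdas:  one coefficient per feature
    """
    scores = [sum(l * f for l, f in zip(lambdas, fx)) for fx in features]
    top = max(scores)  # subtract the max before exponentiating, for stability
    weights = [math.exp(s - top) for s in scores]
    z = sum(weights)   # scaling factor so the result sums to 1
    return [w / z for w in weights]

# Hypothetical 4-cell map with two features per cell (values made up).
cells = [[0.1, 0.9], [0.4, 0.5], [0.7, 0.3], [0.9, 0.1]]
q = gibbs_distribution(cells, lambdas=[2.0, -1.0])
q_flat = gibbs_distribution(cells, lambdas=[0.0, 0.0])
```

With all coefficients at zero the features have no effect and every cell receives the same probability, which is the unconstrained maximum-entropy answer.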
Maxent is penalized maximum likelihood
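Log likelihood:
$\mathrm{LogLikelihood}(q_\lambda) = \frac{1}{m} \sum_{i=1}^{m} \ln q_\lambda(x_i)$
where $x_1, \dots, x_m$ are the occurrence points.
Maxent maximizes regularized likelihood:
$\mathrm{LogLikelihood}(q_\lambda) - \sum_j \beta_j |\lambda_j|$
where $\beta_j$ is the width of the confidence interval for $f_j$.
Similar in spirit to the Akaike Information Criterion (AIC): both penalize model complexity.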
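The regularized objective is short to compute directly. The following is a sketch with made-up data, not the Maxent program's implementation: the function name, the single-feature map, the occurrence indices, and the value of beta are all assumptions.

```python
import math

def regularized_log_likelihood(features, occurrences, lambdas, betas):
    """Average log-likelihood of the occurrence cells minus the L1 penalty.

    features:    feature values for every cell of the map
    occurrences: indices of the cells holding occurrence points
    lambdas:     Gibbs coefficients, one per feature
    betas:       regularization widths, one per feature
    """
    scores = [sum(l * f for l, f in zip(lambdas, fx)) for fx in features]
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    log_lik = sum(scores[i] - log_z for i in occurrences) / len(occurrences)
    penalty = sum(b * abs(l) for b, l in zip(betas, lambdas))
    return log_lik - penalty

cells = [[0.1], [0.4], [0.7], [0.9]]   # hypothetical single-feature map
occ = [2, 3, 3]                        # occurrences favor high feature values
betas = [0.05]

score_uniform = regularized_log_likelihood(cells, occ, [0.0], betas)
score_fitted = regularized_log_likelihood(cells, occ, [2.0], betas)
```

Here a positive coefficient scores higher than the uniform model even after the penalty, because the occurrences sit in cells with high feature values. A larger beta would shrink that advantage and pull the coefficient toward zero.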
Output


When Maxent is applied to presence-only species
distribution modeling, the pixels of the study area
make up the space on which the Maxent probability
distribution is defined.
Pixels with known species occurrence records
constitute the sample points, and the features are:
• climatic variables,
• elevation,
• soil category,
• vegetation type or other environmental variables, and
functions thereof.
To note
Sometimes both presence and absence
occurrence data are available for the
development of models, in which case
general-purpose statistical methods can be
used
(for an overview of the variety of techniques
currently in use, see Corsi et al., 2000; Elith,
2002; Guisan and Zimmermann, 2000; Scott et
al., 2002).

Opportunity



However, while vast stores of presence-only
data exist (records etc.), absence data are
rarely available.
Poorly sampled areas, remote, difficult…
Absence data may be of questionable value
in many situations
Background


16 modeling methods
226 well surveyed species in 6 regions of the
world
24
The authors used three statistics, the area under the Receiver
Operating Characteristic curve (AUC), correlation
(COR) and Kappa, to assess the agreement between
the presence-absence records and the predictions.
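Of the three statistics, AUC has a compact interpretation: it is the probability that a randomly chosen presence site receives a higher prediction than a randomly chosen absence site. The sketch below computes it by direct pairwise comparison. The function name and the prediction values are made up for illustration and are not taken from the study.

```python
def auc(presence_scores, absence_scores):
    """Probability that a random presence site outscores a random absence site.

    Ties count as one half, which matches the area under the ROC curve.
    """
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))

# Hypothetical model predictions at surveyed sites (values made up).
presences = [0.9, 0.8, 0.6, 0.4]
absences = [0.7, 0.3, 0.2, 0.1]
model_auc = auc(presences, absences)
```

A value of 0.5 corresponds to predictions no better than random ranking, and 1.0 to perfect separation of presence and absence sites.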
Maximum Entropy



Only useful when applied to testable information
(information for which one can check whether a given distribution is consistent with it).
Given testable information, the maximum entropy
procedure consists of seeking the probability
distribution which maximizes information entropy,
subject to the constraints of the information.
This constrained optimization problem is typically
solved using the method of Lagrange multipliers.
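A sketch of that Lagrange-multiplier argument, for the case of exact (unregularized) feature constraints, shows where the Gibbs form of the Maxent distribution comes from. The symbols \mathcal{L}, \mu, and \tilde{f}_j (the sample average of feature f_j) are notation assumed for this illustration.

```latex
\begin{aligned}
\mathcal{L}(p, \lambda, \mu)
  &= -\sum_{x} p(x)\ln p(x)
     + \sum_{j} \lambda_j \Bigl(\sum_{x} p(x) f_j(x) - \tilde{f}_j\Bigr)
     + \mu \Bigl(\sum_{x} p(x) - 1\Bigr) \\
\frac{\partial \mathcal{L}}{\partial p(x)}
  &= -\ln p(x) - 1 + \sum_{j} \lambda_j f_j(x) + \mu = 0 \\
\Rightarrow\quad p(x)
  &= e^{\mu - 1}\,\exp\Bigl(\sum_{j} \lambda_j f_j(x)\Bigr)
   = \frac{\exp\bigl(\sum_{j} \lambda_j f_j(x)\bigr)}{Z},
  \qquad Z = \sum_{x'} \exp\Bigl(\sum_{j} \lambda_j f_j(x')\Bigr)
\end{aligned}
```

The normalization constraint fixes e^{\mu - 1} = 1/Z, so the only free parameters left are the multipliers \lambda_j, one per feature.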
Michael Dougherty – GIS Project Manager WVDNR
Develop statewide conservation prioritization map based on the
distribution of:
1. Species of Greatest Conservation Need (SGCN)
2. Habitats of Concern
3. Existing public land
The Challenge:
• Develop distribution models for 500 state-tracked species
• Species include: plants, herps, birds, bats, mammals, aquatics
• Modeling process must be defensible, transparent, and repeatable
Occurrence data:
1. State Natural Heritage Program “Biotics” database:
• Biologists collect “Source Features”
• “Source Features” are grouped into “Element Occurrences” (EOs)
• EOs represent known populations
• Species identification is accurate and spatial accuracy documented
• Use of EOs seems to greatly reduce spatial autocorrelation
2. Community Ecologists’ Vegetation Plots Database
Predictor Variables:
Developed a broad range of predictor variables:
• Climate
• Landcover
• Terrain
• Ecoregions
• Geology
• Soils
• Disturbances
Workflow Overview:
• Build an array of workstations to run models
• Develop R scripts to automate running the maxent models by
iterating through all 500 species
• Develop web-based map viewer to assist biologists in reviewing
maxent model results
• Perform patch and connectivity analysis using FunConn
• (TBD) Assign weights to patches and connectors