Statistics Based Predictive Geo-Spatial Data
Mining: Forest Fire Hazardous Area Mapping
Application
Jong Gyu Han, Keun Ho Ryu, Kwang Hoon Chi and Yeon
Kwang Yeon
26-10-2003
Problem Definition
-Forest fire prevention
-Finding spatial-temporal distribution of forest fires
-Predicting forest fire hazardous areas from large spatial
data sets
-Leads to a forest fire hazard prediction model
Problem Definition 2
Youngdong Region of Kangwon
Province, Republic of Korea
Using:
-Historical data on fire ignition
point locations
-Grid-based multi-layer GIS
Prediction Methods
Depends on relationship of spatial data sets
relevant to forest fire with respect to areas of
previous forest fire ignition
N[S] = all area
N[F] = fire ignition areas
N[A] = forest type A
N[E] = area of fire ignition on forest
type A
Conditional Probability Prediction
Model
Average density of ignition areas:
P(F) = N[F]/N[S]
Without other information this is the
probability of a forest fire ignition area
Favourability of finding a forest fire ignition area given the presence of forest type A:
$$\mathrm{CondP}(F \mid A) = \frac{P(A \mid F) \cdot P(F)}{P(A)}$$
$$P(A \mid F) = \frac{P(A \cap F)}{P(F)}$$
$$P(A \cap F) = \frac{N[A \cap F]}{N[S]} = \frac{N[E]}{N[S]}$$
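Substituting the count-based definitions suggests that the conditional probability reduces to a ratio of two counts. The short derivation below uses only the symbols defined on these slides, writes "|" for "given", and assumes P(A) = N[A]/N[S] as in the worked example:

```latex
\begin{aligned}
\mathrm{CondP}(F \mid A)
  &= \frac{P(A \mid F)\,P(F)}{P(A)}
   = \frac{P(A \cap F)}{P(A)} \\
  &= \frac{N[E]/N[S]}{N[A]/N[S]}
   = \frac{N[E]}{N[A]}
\end{aligned}
```

In words: of all pixels with forest type A, the share that are fire ignition pixels.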
Conditional Probability Prediction
Model Example
N[S] = 100,000
N[F] = 500
N[A] = 2,500
N[E] = 100
P(F) = N[F]/N[S]
-> 500/100,000 = 0.005
P(A|F) = N[E]/N[F]
-> 100/500 = 0.2
P(A) = N[A]/N[S]
-> 2,500/100,000 = 0.025
CondP(F|A) = ((N[E]/N[F]) · (N[F]/N[S])) / (N[A]/N[S])
-> 0.2 × 0.005 / 0.025 = 0.04
Given the presence of forest type A, the probability of a forest fire occurrence is 8
times greater than the prior probability (0.04 / 0.005 = 8)
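The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch, not the authors' implementation; the function name `conditional_probability` and its parameter names are hypothetical, and the counts are the ones given in the example.

```python
def conditional_probability(n_s, n_f, n_a, n_e):
    """Return (P(F), CondP(F|A)) from pixel counts.

    n_s: all pixels, n_f: fire ignition pixels,
    n_a: pixels of forest type A, n_e: ignition pixels on forest type A.
    """
    p_f = n_f / n_s              # prior probability of an ignition pixel
    p_a = n_a / n_s              # probability of forest type A
    p_a_given_f = n_e / n_f      # P(A|F)
    cond_p = p_a_given_f * p_f / p_a
    return p_f, cond_p


# Counts from the slide example
p_f, cond_p = conditional_probability(100_000, 500, 2_500, 100)
ratio = cond_p / p_f             # how much forest type A raises the prior
print(round(p_f, 6), round(cond_p, 6), round(ratio, 6))
```

The ratio `cond_p / p_f` is the "times greater than the prior" figure quoted on the slide.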
Likelihood Ratio Prediction Model
Represents the ratio of two spatial
distribution functions: one for areas with forest fire
occurrences and one for areas without
$$\mathrm{LR}(A \mid F) = \frac{P(A \mid F)}{P(A \mid \bar{F})}$$
where \bar{F} denotes the areas without fire ignition
$$\mathrm{LR}(A \mid F) = \frac{N[E] \cdot (N[S] - N[F])}{N[F] \cdot (N[A] - N[E])}$$
N[E] · (N[S] - N[F]) = 100 × (100,000 - 500) = 9,950,000
N[F] · (N[A] - N[E]) = 500 × (2,500 - 100) = 1,200,000
LR(A|F) = 9,950,000 / 1,200,000 ≈ 8.29
LR > 1: positive evidence for forest fire ignition
LR = 1: uncorrelated
LR < 1: negatively correlated
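The likelihood ratio on this slide can be reproduced from the same four counts. The sketch below is illustrative only; `likelihood_ratio` is a hypothetical helper name, and the denominator is taken to be the probability of forest type A in areas without fire ignition, which is what the count formula on the slide implies.

```python
def likelihood_ratio(n_s, n_f, n_a, n_e):
    """Ratio of P(A | fire) to P(A | no fire), computed from pixel counts."""
    p_a_given_fire = n_e / n_f                      # type A among ignition pixels
    p_a_given_no_fire = (n_a - n_e) / (n_s - n_f)   # type A among all other pixels
    return p_a_given_fire / p_a_given_no_fire


# Counts from the slide example
lr = likelihood_ratio(100_000, 500, 2_500, 100)
print(round(lr, 4))
```

A value above 1 means forest type A is over-represented among ignition pixels compared with the rest of the study area.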
Prediction Procedure
-Forestry Maps
-Topography Maps
-Human Activities
-Fire History Data
-A large number of thematic layers can be related to forest fire occurrences
-Relevance filter is subjective
-> Thematic layers are user-selected
Forest Fire Hazard Rate
Multiple Layer integration shares intermediate analysis with other levels
FHR: Forest Fire Hazard Rate:
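$$\mathrm{FHR}_{\mathrm{CondP}}(p) = \mathrm{CondP}(V_1(p)) \times \dots \times \mathrm{CondP}(V_m(p))$$
$$\mathrm{FHR}_{\mathrm{LR}}(p) = \mathrm{LR}(V_1(p)) \times \dots \times \mathrm{LR}(V_m(p))$$
V_i(p) = attribute value at point p in thematic map i, i = 1, …, m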
CondP = Conditional Probability
LR = Likelihood Ratio
For each local area, a FHR can be computed, and fire ignition danger can be
analysed
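As a rough illustration of the product over thematic layers, the sketch below multiplies per-layer scores for a single point. The function name `hazard_rate`, the layer names, and all score values are hypothetical and chosen only to show the mechanics; the scores could be either conditional probabilities or likelihood ratios.

```python
from math import prod


def hazard_rate(point_values, layer_scores):
    """Multiply the per-layer score of each attribute value found at one point.

    point_values: attribute value at the point, one entry per thematic layer.
    layer_scores: one dict per layer, mapping attribute value -> CondP or LR.
    """
    return prod(scores[value] for value, scores in zip(point_values, layer_scores))


# Hypothetical likelihood ratios for two layers (illustrative numbers only)
forest_type_lr = {"A": 8.29, "B": 0.5}
slope_lr = {"steep": 2.0, "flat": 0.5}

fhr = hazard_rate(["A", "steep"], [forest_type_lr, slope_lr])
print(round(fhr, 2))
```

Multiplying the scores this way relies on the layers being conditionally independent.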
Experiment: Attribute selection
For practical use, thematic layers must be
selected, based on relative importance for
explaining fire ignition
Condition: chosen layers have to be
conditionally independent
Layers for Experiment:
-Forest Type
-Elevation
-Slope
-Road Network
-Farms
-Building Boundaries
Experiment: Data sets
-It is assumed the time of study was 1996:
All spatial data in 1996 are compiled, including distribution of fire
ignition locations which occurred prior to that year
-Cross Validation: Predictions based on those relationships are evaluated
by comparing the estimated hazard classes with the distribution of forest
fire ignition locations that occurred after 1996, during the period 1997 to
2001
- Evaluation of Conditional Probability and Likelihood Ratio can be
expressed in a Prediction Rate Curve
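One plausible way to build such a curve, sketched below, is to rank pixels by estimated hazard and record what share of the later fires falls inside the top x% of the area. This reading of "prediction rate curve" is an assumption, as are the function name `prediction_rate_curve`, the `steps` parameter, and the toy data.

```python
def prediction_rate_curve(hazard, later_fire, steps=20):
    """Return (area fraction, fraction of later fires captured) pairs.

    hazard: estimated hazard per pixel (from the pre-1996 data).
    later_fire: True where a fire ignited in the validation period.
    """
    total_fires = sum(later_fire)
    if total_fires == 0:
        raise ValueError("no validation fires to evaluate against")
    # Pixel indices ordered from highest to lowest estimated hazard
    order = sorted(range(len(hazard)), key=lambda i: hazard[i], reverse=True)
    curve = []
    for k in range(1, steps + 1):
        cutoff = round(len(order) * k / steps)
        hits = sum(later_fire[i] for i in order[:cutoff])
        curve.append((k / steps, hits / total_fires))
    return curve


# Toy data: 10 pixels, 3 later fires (illustrative only)
hazard = [0.9, 0.1, 0.8, 0.3, 0.2, 0.7, 0.05, 0.6, 0.4, 0.5]
later_fire = [True, False, True, False, False, False, False, True, False, False]
curve = prediction_rate_curve(hazard, later_fire, steps=5)
print(curve)
```

A curve that rises steeply at small area fractions indicates that the highest-hazard pixels capture most of the later fires.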
Experiment: Evaluation
Prediction rate curve of both models
Conclusion:
Likelihood Ratio is a more powerful
method than Conditional Probability.
The effectiveness of the estimated
models is acceptable
Prediction Rates with respect to the ‘future’
1997 to 2001 forest fire occurrences
Experiment: Visualisation
Using Forest Fire Hazard Index (FHI)
-Sort estimated probabilities of all pixels
in descending order
-ordered pixels are divided into 11
classes:
Pixels with the highest 5% estimated
probability are classified as the first
class, the next 5% as second class and so
on.
-Remaining low 50% is assigned to the
last class
-Add color to classes
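The class assignment described above can be sketched as follows. The function name `hazard_index_classes` is hypothetical, and how ties and pixel counts that are not multiples of 20 are handled is an assumption, since the slides do not say.

```python
def hazard_index_classes(hazard):
    """Assign each pixel a class from 1 (highest hazard) to 11 (lowest 50%)."""
    n = len(hazard)
    # Pixel indices ordered from highest to lowest estimated hazard
    order = sorted(range(n), key=lambda i: hazard[i], reverse=True)
    classes = [0] * n
    for rank, i in enumerate(order):
        # Each 5% slice of the ranking is one class; integer arithmetic
        # avoids floating-point rounding at slice boundaries.
        five_percent_slice = rank * 20 // n
        classes[i] = min(five_percent_slice + 1, 11)
    return classes


# Toy data: 100 pixels whose hazard equals their index (illustrative only)
classes = hazard_index_classes(list(range(100)))
print(classes.count(1), classes.count(11))
```

Classes 1 to 10 each cover 5% of the pixels, so the cap at 11 collects the remaining lower half.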
Conclusion
Statistics based Forest Fire prediction works well.
The Likelihood ratio method is more powerful than the Conditional probability
method.
Prediction of the forest fire hazardous area could be helpful to increase the
efficiency of forest fire management:
The ability to quantify the ignition risk could lead to a more informed allocation
of fire prevention resources.
Questions