Saving the planet: Challenges for spatiotemporal database researchers
SSTD 2005, Angra dos Reis
Gilberto Câmara
Director for Earth Observation, National Institute for Space Research
Member of Executive Committee, Group on Earth Observations
Member, Scientific Steering Committee, IGBP LAND project
INPE - brief description
National Institute for Space Research
• main civilian organization for space activities in Brazil
• staff of 1,800 (800 M.Sc. and Ph.D.)
Areas:
• Space Science, Earth Observation, Meteorology and Space Engineering
R&D in GIScience at INPE


Graduate programs in Computer Science and Remote Sensing
Research areas
• Spatial statistics
• Spatial dynamical modelling
• Spatio-temporal databases
• Image databases and image processing
Technology
• TerraLib: open source library for ST DBMS
“Give us some new problems”
(Dimitrios Papadias, SSTD 2005)
What about saving the planet?
Great challenge:
Database support for
earth system science
source: NASA
Earth as a system
(Diagram, from Earth System Science: An Overview, NASA, 1988. Components shown:
Physical Climate System (Atmospheric Physics/Dynamics, Ocean Dynamics,
Terrestrial Energy/Moisture); Biogeochemical Cycles (Marine Biogeochemistry,
Terrestrial Ecosystems, Tropospheric Chemistry); Global Moisture; Soil; CO2;
Land Use; Pollutants; Climate Change; Human Activities.)
The fundamental question

How is the Earth’s environment changing,
and what are the consequences for human
civilization?

A society with the ability to gather and
understand Earth Science information and make
proactive, timely environmental predictions and
decisions at all relevant geographical and
societal levels.
Source: NASA,
The Road Ahead for Earth
Observations



A new international organization tasked with
implementing a Global Earth Observation
System of Systems (GEOSS).
GEOSS shall coordinate a wide range of space-based, air-based, land-based, and ocean-based
environmental monitoring platforms, resources
and networks – presently often operating
independently.
Membership in GEO currently includes 51
countries plus the European Commission, and
29 participating international organisations.
Coordinating Earth Observing Systems
(Diagram of vantage points and capabilities: FarSpace (L1/HEO/GEO; TDRSS and
commercial satellites); LEO/MEO (commercial satellites and manned spacecraft);
NearSpace (aircraft/balloon; event tracking and campaigns); Deployable;
Airborne; Terrestrial; Permanent. Outputs: forecasts and predictions for the
user community.)
Remote Sensing:
Increased EO capability
GeoSensors: New technology of earth observations
(Photos: Smart Dust (UC Berkeley); “Spec” mote (UC Berkeley); Intel mote; MICA mote)
Group on Earth Observation System of
Systems
G8 supports GEOSS


The G8 welcome the adoption of the 10-year
implementation plan for development of the Global Earth
Observation System of Systems (GEOSS).
We will:


(a) move forward in the national implementation of GEOSS in our
member states;
(b) support efforts to help developing countries and regions
obtain full benefit from GEOSS, such as placement of
observational systems to fill data gaps, developing of capacity for
analysing and interpreting observational data, and development
of decision-support systems and tools relevant to local needs;
Gleneagles plan of action, 2005
How good is the plan for GEOSS?



GEOSS should agree to use any one of four open
standard ways to describe service interfaces (CORBA,
WSDL, ebXML or UML).
The standard ISO/IEC 11179, Information Technology-Metadata Registries, provides guidance on representing
data semantics.
Data and information resources and services in GEOSS
typically include references to specific places on the
Earth. Interfaces to use these geospatial data and
services are agreed upon through the various Spatial
Data Infrastructure initiatives (e.g., OGC WMS, WCS,
and WFS).
GEOSS Implementation Plan, 2004
The Five Orders of Ignorance

0th Order Ignorance (0OI): Lack of Ignorance
• I (provably) know something
1st Order Ignorance (1OI): Lack of Knowledge
• I do not know something
2nd Order Ignorance (2OI): Lack of Awareness
• I do not know that I do not know something
3rd Order Ignorance (3OI): Lack of Process
• I do not know a suitably effective way to find out that I don’t know that I
don’t know something
4th Order Ignorance (4OI): Meta-Ignorance
• I do not know about the Five Orders of Ignorance
The five orders of ignorance, Phillip G. Armour, CACM, 43(10), Oct 2000
What doesn’t GEOSS know that it doesn’t
know?
Ontology
Lake
Habitat
Converter
GIS A
GIS B
We don’t know how to do this!!
The Road Ahead: Geosensors



Advances in remote sensing are giving
computer networks the eyes and ears they need
to observe their physical surroundings.
Sensors detect physical changes in pressure,
temperature, light, sound, or chemical
concentrations and then send a signal to a
computer that does something in response.
Scientists expect that billions of these devices
will someday form rich sensory networks linked
to digital backbones that put the environment
itself online.
(Rand Corporation, “The Future of Remote Sensing”)
From Global to local
scales: Spatio-temporal
modeling in Amazonia
Source: Carlos Nobre (INPE)
Amazonia: the forest..
Source: Carlos Nobre (INPE)
Deforestation...
Fire...
Source: Carlos Nobre (INPE)
photo source: Edson Sano (EMBRAPA)
Large-Scale Agriculture
Agricultural Areas (ha)
                  1970          1995/1996     %
Legal Amazonia    5,375,165     32,932,158    513
Brazil            33,038,027    99,485,580    203
Source: IBGE - Agrarian Census
photo source: Edson Sano (EMBRAPA)
Cattle in Amazonia and Brazil
Unit              1992           2001           %
Legal Amazonia    29,915,799     51,689,061     72.78%
Brazil            154,229,303    176,388,726    14.36%
Source: PAM - IBGE
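The % column in the cattle figures appears to be the relative growth of the herd between 1992 and 2001. As a minimal sketch under that reading (the function name `percent_growth` is illustrative, not part of the original slides), the arithmetic can be checked as follows:

```python
def percent_growth(start: int, end: int) -> float:
    """Relative growth from start to end, in percent."""
    return (end - start) / start * 100.0


# Herd sizes (head of cattle) for 1992 and 2001, transcribed from the slide.
cattle = {
    "Legal Amazonia": (29_915_799, 51_689_061),
    "Brazil": (154_229_303, 176_388_726),
}

# Growth per region under the assumed reading of the % column.
growth = {region: percent_growth(a, b) for region, (a, b) in cattle.items()}

for region, pct in growth.items():
    print(f"{region}: {pct:.2f}% growth, 1992 to 2001")
```

Under this interpretation the computed values agree with the slide's percentages to within rounding of the last digit.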
photo source: Edson Sano (EMBRAPA)
McDonald’s is bad for the planet!
Amazon Deforestation 2003
Deforestation 2002/2003
Deforestation until 2002
Source: INPE PRODES Digital, 2004.
Amazônia in 2005
source: Greenpeace
Amazônia in 2015?
source: Aguiar et al., 2004
Modelling Complex Problems

Application of interdisciplinary knowledge to produce a
model.
If (... ? ) then ...
Deforestation?
Limits for Models: the line of our ignorance
(Diagram: fields placed by uncertainty on basic equations versus complexity of
the phenomenon: Quantum Gravity, Particle Physics, Chemical Reactions, Applied
Sciences, Solar System Dynamics, Meteorology, Living Systems, Global Change,
Social and Economic Systems.)
source: John Barrow
What Drives Tropical Deforestation?
(Figure legend: % of the cases: 5%, 10%, 50%. Underlying factors driving
proximate causes; causative interlinkages at proximate/underlying levels;
internal drivers. If less than 5% of cases, not depicted here.)
source: Geist & Lambin
Nested Cellular Automata:
A Foundation for Building
Multifunctional Landscape
and Urban Dynamic
Models
Tiago Garcia Carneiro
Gilberto Câmara
Antônio Miguel Monteiro
Ana Paula Aguiar
Maria Isabel Escada
(manuscript under preparation, 2005)
Dynamic Spatial Models
f(I_t) -> F -> f(I_{t+1}) -> F -> f(I_{t+2}) -> ... -> f(I_{t+n})
“A dynamical spatial model is a computational
representation of a real-world process where a location
on the earth’s surface changes in response to variations
on external and internal dynamics on the landscape”
(Peter Burrough)
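The chain of states above, where a transition function F is applied repeatedly to the landscape, can be sketched in a few lines. This is only an illustration: the grid encoding (0 = forest, 1 = deforested) and the neighbourhood rule are hypothetical and are not taken from the talk.

```python
from typing import List

Grid = List[List[int]]  # 0 = forest, 1 = deforested (assumed encoding)


def step(grid: Grid) -> Grid:
    """One application of F: a forest cell changes if any of its four
    edge-adjacent neighbours is already deforested (hypothetical rule)."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # copy so all cells update synchronously
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0:
                neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
                if any(0 <= i < rows and 0 <= j < cols and grid[i][j] == 1
                       for i, j in neighbours):
                    new[r][c] = 1
    return new


def run(grid: Grid, n: int) -> List[Grid]:
    """Return the sequence of states I_t, I_t+1, ..., I_t+n."""
    states = [grid]
    for _ in range(n):
        states.append(step(states[-1]))
    return states


initial = [[0, 0, 0],
           [0, 1, 0],
           [0, 0, 0]]
states = run(initial, 2)
for t, state in enumerate(states):
    print(f"t+{t}:", state)
```

Each call to `step` reads only the previous state, so the update is synchronous, as in a classical cellular automaton.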
Dynamic Spatial Models
Calibration: tp - 20, tp - 10, tp
Forecast: tp + 10
Source: Cláudia Almeida
Emergence: Clocks, Clouds or Ants?

Clocks
• Paradigms: Newton’s laws (mechanistic, cause-effect
phenomena describe the world)
Clouds
• Stochastic models
• Theory of chaotic systems
Ants
• The colony behaves intelligently
• Intelligence is an emergent property
Deterministic CA Models
(Clarke et al., 1997)
(Diagram of growth types:
• Spontaneous new growth
• Diffusive growth and spread of a new growth centre
• Organic growth
• Road influenced growth: growth moved to road, and spread
Legend: seed cell; cell urbanised by this step; cell urbanised at previous step; road)
Source: Cláudia Almeida
Requirements
Simulation of different partitions in space
• Each partition has different actors and processes
(Map legend: Farms; Recent Settlements (less than 4 years); Settlements 10 to 20
years; Old Settlements (more than 20 years))
Source: Escada, 2003
Spatial dynamic modeling
Demands                                     Requirements
Locations change due to external forces     discretization of space in cells
Realistic representation of landscape       generalization of CA
Elements of dynamic models                  discrete and continuous processes
Geographical space is inhomogeneous         Flexible neighborhood definitions
Different types of models                   Extensibility to include user-defined models

Cell Spaces
(a) land_cover equals deforested in 1985

attr_id                                      object_id  initial_time  final_time  land_cover  dist_primary_road  dist_secondary_road
C34L181985-01-0100:00:001985-12-3123:59:59   C34L18     01/01/1985    31/12/1985  forest      7068.90            669.22
C34L181988-01-0100:00:001988-12-3123:59:59   C34L18     01/01/1988    31/12/1988  forest      7068.90            669.22
C34L181991-01-0100:00:001991-12-3123:59:59   C34L18     01/01/1991    31/12/1991  forest      7068.90            669.22
C34L181994-01-0100:00:001994-12-3123:59:59   C34L18     01/01/1994    31/12/1994  deforested  7068.90            669.22
C34L181997-01-0100:00:001997-12-3123:59:59   C34L18     01/01/1997    31/12/1997  deforested  7068.90            669.22
C34L182000-01-0100:00:002000-12-3123:59:59   C34L18     01/01/2000    31/12/2000  deforested  7068.90            669.22
C34L191985-01-0100:00:001985-12-3123:59:59   C34L19     01/01/1985    31/12/1985  forest      7087.29            269.24
C34L191988-01-0100:00:001988-12-3123:59:59   C34L19     01/01/1988    31/12/1988  deforested  7087.29            269.24
C34L191991-01-0100:00:001991-12-3123:59:59   C34L19     01/01/1991    31/12/1991  deforested  7087.29            269.24
C34L191994-01-0100:00:001994-12-3123:59:59   C34L19     01/01/1994    31/12/1994  deforested  7087.29            269.24
C34L191997-01-0100:00:001997-12-3123:59:59   C34L19     01/01/1997    31/12/1997  deforested  7087.29            269.24
C34L192000-01-0100:00:002000-12-3123:59:59   C34L19     01/01/2000    31/12/2000  deforested  7087.29            269.24
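A selection such as "land_cover equals deforested in 1985" amounts to filtering the temporal attribute records by value and by validity interval. The following is a minimal sketch with plain Python records, not TerraLib code. The helper names `record` and `cells_where` are hypothetical, and the sample rows are a subset of the attribute listing above, assuming the land_cover values line up with the records in the order listed.

```python
from datetime import date


def record(object_id: str, year: int, land_cover: str) -> dict:
    """Build one temporal attribute record valid for a whole calendar year."""
    return {
        "object_id": object_id,
        "initial_time": date(year, 1, 1),
        "final_time": date(year, 12, 31),
        "land_cover": land_cover,
    }


# Subset of the cell-space attribute records (assumed row alignment).
rows = [
    record("C34L18", 1985, "forest"),
    record("C34L18", 1994, "deforested"),
    record("C34L19", 1985, "forest"),
    record("C34L19", 1988, "deforested"),
    record("C34L19", 1994, "deforested"),
]


def cells_where(rows: list, attribute: str, value: str, year: int) -> list:
    """Ids of cells whose attribute equals value at some time in the given year."""
    start, end = date(year, 1, 1), date(year, 12, 31)
    return sorted({
        r["object_id"]
        for r in rows
        # keep records whose validity interval overlaps the queried year
        if r[attribute] == value
        and r["initial_time"] <= end
        and r["final_time"] >= start
    })


print(cells_where(rows, "land_cover", "deforested", 1988))
print(cells_where(rows, "land_cover", "deforested", 1994))
```

The interval-overlap test is what makes the query temporal: the same cell id can match in one year and not in another.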
Hybrid Automata

Formalism developed by Tom Henzinger (UC Berkeley)
• Applied to embedded systems, robotics, process
control, and biological systems
Hybrid automaton
• Combines discrete transition graphs with continuous
dynamical systems
• Infinite-state transition system
(Diagram: Control Mode A and Control Mode B, each with a flow condition, linked
by an event with a jump condition)
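To make the combination of control modes, flow conditions and jump conditions concrete, here is a small sketch of a hybrid automaton simulated with fixed Euler steps. It is an illustration only: the mode names echo the deforestation agent states used later in the talk, but the rates, the thresholds as coded, and the `Mode` and `simulate` names are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Mode:
    """A control mode: a flow (dx/dt) plus guarded jumps to other modes."""
    flow: Callable[[float], float]
    jumps: List[Tuple[Callable[[float], bool], str]] = field(default_factory=list)


def simulate(modes: Dict[str, Mode], start: str, x0: float,
             dt: float, steps: int) -> List[Tuple[float, str, float]]:
    """Euler simulation; x is a fraction, so it is clamped to at most 1.0."""
    mode, x = start, x0
    trace = [(0.0, mode, x)]
    for k in range(1, steps + 1):
        # continuous part: follow the flow of the current mode
        x = min(1.0, x + modes[mode].flow(x) * dt)
        # discrete part: take the first jump whose condition holds
        for condition, target in modes[mode].jumps:
            if condition(x):
                mode = target
                break
        trace.append((k * dt, mode, x))
    return trace


# Hypothetical rates: fraction of a unit deforested per year.
modes = {
    "deforesting": Mode(lambda x: 0.10, [(lambda x: x > 0.8, "slowing_down")]),
    "slowing_down": Mode(lambda x: 0.02, [(lambda x: x >= 1.0, "idle")]),
    "idle": Mode(lambda x: 0.0),
}

trace = simulate(modes, "deforesting", x0=0.0, dt=1.0, steps=30)
print(trace[-1])
```

The state variable evolves continuously inside a mode, while the mode itself changes only when a jump condition becomes true, which is the split the formalism is built around.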
Space is Anisotropic
Spaces of fixed location and spaces of fluxes in Amazonia
Motivation
Which objects are NEAR each other?
Using Generalized Proximity Matrices
Consolidated area
Emergent area
Computational Modelling with Cell
Spaces
Cell Spaces

Components
• Cell Spaces
• Generalized Proximity Matrix (GPM)
• Hybrid Automata model
• Nested environment
Environment: A Key Concept in TerraME
An environment has 3 kinds of sub models:
• Spatial Model: cellular space + region + GPM
(Generalized Proximity Matrix)
• Behavioral Model: hybrid automata + situated agents
• Temporal Model: discrete event simulator
The spatio-temporal structure is shared by several
communicating agents
Support for Nested Environments
Environments can be nested
Multiscale modelling
Space can be modelled in different resolutions
Nested CA x Traditional CA

CA
• Homogeneous, isotropic space
• Local action
• One attribute per cell (discrete values)
• Finite state space
Nested CA
• Non-homogeneous space
• Action-at-a-distance
• Many attributes per cell
• Infinite state space
Software Architecture
(Layer diagram, top to bottom:
• Models: Rondônia Model, São Felix Model, Amazon Model, Hydro Model
• TerraME Language
• TerraME Compiler
• TerraME Virtual Machine
• TerraME Framework
• TerraLib; C++ signal processing libraries; C++ mathematical libraries;
C++ statistical libraries)
TerraLib: http://www.terralib.org/
Deforestation Rate Distribution Module
Small Units Agent (state diagram)
• States: Newly implanted, Deforesting, Slowing down, Idle
• Transition labels: Year of creation; latency > 6 years;
Deforestation > 80%; Deforestation = 100%
Large and Medium Units Agent (state diagram)
• States: Deforesting, Slowing down, Idle
• Transition labels: Year of creation; Deforestation > 80%;
Deforestation = 100%
Factors affecting rate:
• Global rate
• Relation properties density - speed of change
• Year of creation
• Credit in the first years (small)
Allocation Module: different resolution,
variables and neighborhoods
Small farms environments:
• 500 m resolution
• Categorical variable: deforested or forest
• One neighborhood relation: connection through roads
Large farm environments:
• 2500 m resolution
• Continuous variable: % deforested
• Two alternative neighborhood relations: connection through roads;
farm limits proximity
(Map panels: 1985 and 1997)
Simulation Results
(Map panels: 1985, 1988, 1991, 1994, 1997)
Mining Patterns of Change
in Remote Sensing Image
Databases
Marcelino Pereira S. Silva
Gilberto Câmara
Ricardo Cartaxo M. Souza
Dalton M. Valeriano
Maria Isabel S. Escada
(IEEE Data Mining Conf, 2005)
Images are everywhere!



Observational satellites from 1 meter to 1 km
Spectral bands, ranging from visible to radar
Periodic sources of information
Knowledge gap in Earth Observation
source: John McDonald (MDA)
Why image database mining?

Most applications of EO data
• “Snapshot” paradigm
Recipe analogy
• Take 1 image (“raw”)
• “Cook” the image (correction + interpretation)
• Add “salt” (i.e., ancillary data)
• Serve while hot (on a “GIS plate”)
But we have lots of images!
MSS - Landsat 1
WRS1 248/62
07/07/1973
Why image database mining?

What’s in an Image?
• Is an image just a field of energy received by a
sensor?
• Are images instruments for capturing landscape
dynamics?
(Camara, Egenhofer et al, 2001)
Improving Societal Benefits

In search of a “killer-app”
• How many cutting-edge applications exist for
extracting information in large image databases?
• How much R&D is being invested in spatial data
mining in large repositories of EO data?
• How do we put our image databases to more effective
use?
Image Database mining

A large remote sensing image database is a
collection of snapshots of landscapes, which
provide us with a unique opportunity for
understanding how, when, and where changes
take place in our world.

We should search for changes, not search for
content
Image mining part I: Finding patterns
Image segmentation
Segmentation extracts objects from images
Segmentation comparison
A comparison of segmentation programs for high resolution
remote sensing data, G. Meinel, M. Neubert, ISPRS Congress,
2004
Spatial Patterns of Deforestation
CORRIDOR     Colonization along roads and rivers
DIFFUSE      Small farms
FISHBONE     Planned settlement
GEOMETRIC    Large farms
Image mining part II: finding spatial
configurations
Geometrical patterns – Terra do Meio (PA),
2003
Diffuse patterns – Terra do Meio (PA),
2003
Image mining results

What’s the behavior of large farmers in Terra do
Meio during this period (1997-2003)? Is the area
of new large farms increasing?

In 2000, this kind of deforestation reached a
peak of 55,000 ha, then decreased in the
following years. In 2003, the deforestation area
associated with large farms fell to 29,000
ha. This indicates that large farms are reducing
their contribution to deforestation.
From Moving Objects to
Moving Regions
Real-time monitoring of Amazon
deforestation
Yearly estimates
Deforestation maps
Recent MODIS/WFI
data
Detection of new deforestation
Web maps
External users
Ground Station
Deforestation between
13/Aug/2003 and 07/May/2004
Landsat image: 13/Aug/2003
Deforestation: 13/Aug/2003 until 07/May/2004
Deforestation in 13/Aug/2003 (yellow) +
deforestation from 13/Aug/2003 until 07/May/2004 (red)
Fifteen days later...
Deforestation on 21/May/2004
Deforestation in 13/Aug/2003 (yellow) +
deforestation from 13/Aug/2003 until 07/May/2004 (red) +
deforestation on 21/May/2004 (orange)
From moving objects to moving regions
Deforestation areas detected on 07/21 May
(blue dots) + fire spots detected on 10/11 Jun
Query examples

Select all deforestation units and fire spots within these units that
occurred in the same year

Select all deforestation units and fire spots within these units, such
that the fire spots occurred up to 90 days after the deforestation

Select all deforestation units and fire spots within these units that
occurred in the same month of the same year

Select all deforestation units and fire spots within these units that
occurred in the same month (of any year)

Select the time difference between the deforestation units and fire
spots within these units
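Each of these queries pairs a spatial predicate (fire spot within deforestation unit) with a temporal predicate on the two detection dates. The sketch below shows one way to express that pairing in plain Python. It is not TerraLib or SQL code: the class names, the `join` helper, the bounding-box stand-in for "within", and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class DeforestationUnit:
    unit_id: str
    detected: date
    bbox: Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), simplified geometry


@dataclass(frozen=True)
class FireSpot:
    spot_id: str
    detected: date
    x: float
    y: float


def within(spot: FireSpot, unit: DeforestationUnit) -> bool:
    """Spatial predicate: point inside the unit's bounding box (a simplification)."""
    xmin, ymin, xmax, ymax = unit.bbox
    return xmin <= spot.x <= xmax and ymin <= spot.y <= ymax


def join(units: List[DeforestationUnit], fires: List[FireSpot],
         temporal: Callable[[date, date], bool]) -> List[Tuple[str, str, int]]:
    """Pairs (unit, fire, days between) satisfying both predicates."""
    return [(u.unit_id, f.spot_id, (f.detected - u.detected).days)
            for u in units for f in fires
            if within(f, u) and temporal(u.detected, f.detected)]


# Temporal predicates, one per query on the slide.
same_year = lambda d, f: d.year == f.year
up_to_90_days_after = lambda d, f: 0 <= (f - d).days <= 90
same_month_same_year = lambda d, f: (d.year, d.month) == (f.year, f.month)
same_month_any_year = lambda d, f: d.month == f.month
any_time = lambda d, f: True  # used to list the time differences

# Made-up sample data for illustration.
units = [DeforestationUnit("D1", date(2004, 5, 21), (0, 0, 10, 10)),
         DeforestationUnit("D2", date(2003, 8, 13), (20, 20, 30, 30))]
fires = [FireSpot("F1", date(2004, 6, 10), 5, 5),
         FireSpot("F2", date(2004, 8, 2), 25, 25),
         FireSpot("F3", date(2004, 6, 11), 50, 50)]

print(join(units, fires, up_to_90_days_after))
print(join(units, fires, any_time))
```

Keeping the spatial and temporal predicates separate means the five queries differ only in the function passed to `join`. A real system would replace the bounding-box test with a polygon containment operation.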
TerraLib: Open Source
Tools for GIS Application
Development
www.terralib.org
Spatial Information Engineering


Technological change
Current generation of GIS
• Built on proprietary architectures
• Interface + function + database = “monolithic” system
• Geometric data structures = archived outside of the DBMS
New generation of object-relational DBMS
• All data will be handled by DBMS
• Standardized access methods (e.g. OpenGIS)
• Users can develop customized applications
TerraLib: the support for TerraME


Open source library for GIS
Data management
• object-relational DBMS
• raster + vector geometries
• ORACLE, Postgres, mySQL, Access
Environment for customized GIS applications
Web-based cooperative development
• http://www.terralib.org
TerraLib: Scientific Motivation

GIScience has brought us new concepts
• Ontologies
• Spatio-Temporal models
• Uncertainty
• Geocomputation
How do we build “proof-of-concept” prototypes?
• Will GIScience be driven by the industry?
• We need open source tools to share our results!
TerraLib: Open source GIS library

Data management
• All data (spatial + attributes) is in the database
Functions
• Spatial statistics, Image Processing, Map Algebra
Web-based co-operative development
• http://www.terralib.org
Operational Vision of TerraLib
(Diagram: Geographic Application; TerraLib, with its API for Spatial Operations
and its own Spatial Operations; DBMS layer with Spatial Operations: Access,
Oracle Spatial, MySQL, PostgreSQL)
TerraLib ≈ MapObjects + ArcSDE + cell spaces + spatio-temporal models
TerraLib applications

Cadastral Mapping
• Improving urban management of large Brazilian cities
Public Health
• Spatial statistical tools for epidemiology and health services
Social Exclusion
• Indicators of social exclusion in inner-city areas
Land-use change modelling
• Spatio-temporal models of deforestation in Amazonia
Emergency action planning
• Oil refineries and pipelines (Petrobras)
TerraLib Support for Cell Spaces
Cellular Space
TerraLib: Spatial
statistics
R- TerraLib interface: the case for strong
coupling
R-Terralib interface
Loaded into a TerraLib database, and visualized with TerraView.
R data from geoR package.
Wrap-up: databases for
Earth System Science
Computational Modelling and Databases

Design and implementation of computational environments for modelling
• Requires a formal and stable description
• Implementation allows experimentation
Role of computer representation
• Bring together expertise in different fields
• Make the different conceptions explicit
• Make sure these conceptions are represented in the information system
The basic question

What are the different spatio-temporal data
types and how can we design an algebra for
spatio-temporal objects based on them?

How can these spatio-temporal data types be
represented in an object-relational DBMS?

What query languages and algorithms are
needed to handle ST data?
Spatio-temporal data types
                           Life        Geometry       Attributes  Genealogy
Events                     Ephemeral   Point          Fixed       -
Cadastre                   Long        Polygon        Variable    Important
Fields                     Permanent   Cell, Polygon  Variable    -
Spatio-temporal patterns   Permanent   Polygon        Variable    Important
Location based services    Permanent   Point          Variable    -
Genealogy of ST objects
Earth System Computation Models: The role of
database community

What are the data structures we need for earth
system science?

How to handle spatio-temporal fields in
databases?

How to handle moving-regions in databases?

How to develop computational models for
spatio-temporal data?
Image database mining: research
challenges

How to handle images effectively in DBMS?

What types of patterns are important for remote
sensing image DB?

What pattern-finding algorithms capture changes
in images?
Geosensors: How can DB researchers
help?

We need a comprehensive semantics of spatial
data

What are spatio-temporal fields? What are the
associated data types, operations, data
structures and query and indexing schemas?

What is geosensor data? How to describe the
world based on samples in space and time?
Spatial modeling and spatial statistics

How can we merge spatial statistics
with ST DBMS?

How can we support nested CA and
cell spaces?

How do we model anisotropic
space?

What modeling languages are
suitable for ST DBMS?
Long-term challenges




Can our current STDBMS handle the challenges
of earth system modeling?
What STDBMS technology would handle earth
system modeling?
What knowledge processing tools are included in
our STDBMS?
What knowledge processing tools do we need
for the next generation of STDBMS?
Vision: from data to knowledge
source: NASA
“Give us some new problems”
We need databases for saving the planet