SMART Aquifer Characterization and Mapping with Machine-Learning and Evolutionary Techniques
Data 2 Knowledge
Keynote, Groundwater Session
Australian Earth Sciences Convention
Adelaide, Australia
28 June, 2016
Michael J. Friedel, Ph.D.
[email protected]

Outline
  Background
  Applications
  Future

BACKGROUND

Goal
  Save Money and Reduce Time (SMART) in characterization and mapping of aquifers

Past - disparate data
  Hydrogeologic surveys
    Physical properties (lithologic composition, Ks or T, porosity, bulk density, retention, water levels)
    Hydrochemistry (major ions, metals, isotopes, tracers)
    Field parameters (pH, temp, sc)
    Sampling (pump/injection, point/crosswell)
  Geophysical surveys
    Physical properties (density, velocity, resistivity)
    Sampling (point, crosswell, surface, remote)

Today - big data
  Velocity – rate data are generated
  Volume – number of records
  Variety – coupled, disparate, nonlinear, scale-dependent, sparse, spatiotemporal, uncertain

The challenge ...
  “We're drowning in data and starving for knowledge”
  Rutherford D. Rogers

Objective
  SMART aquifer characterization and mapping with machine-learning and evolutionary techniques
  [Diagram: Traditional, Machine Learning, Evolutionary = Model]

Model
  Simplification of Reality

Model accuracy
  Dependent on data diversity
  Observations across multiple gradients
  Natural and anthropogenic features/stresses

Data diversity
  Physical - geophysical and hydrogeologic response to lithology (natural)
  Chemical - agricultural to urban land use, forest to mining land use (anthropogenic)

Data diversity
  Climate change – time continuum
    Groundwater archive paleotemperatures
    Aquatic biology archive El Niño – La Niña events
  Sample support – space and time
    Borehole induction resistivity to airborne resistivity
    Slug test conductivity to pump test transmissivity
    Algae to macroinvertebrates to fish

Model algorithms
  Machine-learning
    Supervised - maps inputs to outputs
    Unsupervised - models set of patterns
  Evolutionary
    Unsupervised optimization (heredity, fitness)
    Functional evolution (functions, parse trees)

Model algorithms
  Hybrid
    Traditional plus machine-learning
    Evolutionary plus machine-learning
    ML plus minimum spanning tree
  Workflows
    Multiple ML processes

Selecting algorithms
  Classification or regression
  Imputation, estimation, prediction, forecast
  Sparse or complete data
  Number of attributes, processes, responses
  Linearly separable or nonlinear relations

APPLICATIONS

Machine learning - supervised
  Perceptron
    Mapping presence/absence till (Gunnik et al. 2012)
  Back-propagation (BP)
    Mapping 3-layer resistivity (Zhu et al. 2012)
  Naïve Bayes/BP/Support Vector Machine
    Precipitation effects on groundwater recharge (unpublished)

Predict precipitation effects on GW recharge (unpublished)
  ANN – supervised artificial neural network
  Minnesota, USA
  [Figure: axes Recharge, cm; Year; Precipitation, cm]
  Data-driven challenge – predicting extremes
  Solution - train ANN with a set of correlated random variables

Machine-learning - unsupervised
  Modified Self-Organizing Map (SOM)
    Forecast climate change on groundwater recharge (unpublished)
    Climate-change reconstruction (Friedel, 2012)

Forecast climate change on GW recharge (unpublished)
  ANN – supervised artificial neural network with correlated random variables
  Wisconsin, USA
  Use for water-resource management

Climate-change reconstruction (Friedel, 2012)
  [Figure labels: Medieval Warming Period; Little Ice Age; Modern Warming Period]
  Northern and Southern Hemisphere

Climate-change reconstruction (unpublished)

Hybrid Modeling – SOM plus others …
  GW recharge scaling equations (Friedel, unpublished)
  Spatial continuity for GW models (Friedel et al., 2013)
  Climate-change forecasting (Esfahani and Friedel, 2014)
  Toward real-time aquifer mapping (Friedel et al., 2015)
  Estimation and scaling hydrostratigraphy (Friedel, 2016)
  Real-time satellite mapping landscape features (in review)

Groundwater-recharge scaling equations (unpublished)
  [Figure: Self-Organizing Map; Symbolic Regression; Scaling Equations; axes Precipitation, Recharge]
  Use to adjust groundwater recharge based on scale of model

Spatial continuity for GW models (Friedel, 2013)
  … Various scales, Brazil
  Use to conceptualize and inform groundwater model calibration

Climate-change forecasting (Esfahani and Friedel, 2014)
  Precipitation trend for California, USA: 2012 to 2020
  [Figure labels: Drought; Wildfires; California, USA]
  Use to evaluate climate change effects on groundwater modeling

Toward real-time aquifer mapping (Friedel, 2015)
  [Figure: panels Numerical Inversion, Machine-Learning Estimates; scale 0 to 5 km; units Aquifer, Confining Unit]
  Use to conceptualize groundwater system

Estimation & scaling hydrostratigraphy (Friedel, 2016)
  [Figure: Borehole Hydrostratigraphic Units; Continuous Hydrostratigraphic Units; SOM Scaling Network]
  HSU (lithology, hydraulic properties, water chemistry, geophysical properties)
  Use to conceptualize and inform groundwater modeling process

Real-time satellite landscape mapping (in review)
  Subpixel Soil and Vegetation Fractions
  Brazil
  [Figure: panels Soil; Nonphotosynthetically Active Vegetation (NPAV); Photosynthetically Active Vegetation (PAV)]
  Use for downscaling GW recharge estimates

FUTURE
  Feature selection
  Spatial autocorrelation
  Linear PCA on SOM estimates
  Clustering ACM on minimum spanning tree
  Genetic doping

Feature Selection – Spatial Autocorrelation
  Self-Organizing Map 1
  Otago Region, NZ
  Layer Resistivities
  Electromagnetic Measurements
  Distance of Resistivity Sounding to Borehole
  [Figure: Autocorrelation @ Boreholes; panels F41-0389, C2017, g41-0308; axes Correlation; Distance from borehole, m (0 to 60)]
  Relate borehole lithology to AEM soundings <10 m

Feature Selection – Spatial Autocorrelation
  Self-Organizing Map 2
  Lithology + Hydraulic properties + Chemistry + Soundings < 10 m
  Predict Hydraulic Conductivity
  Predict Lithology
  Predict Total Dissolved Solids

Feature Selection – Genetic Doping
  Southland Region, NZ
  Model (53 Variables)
  Linear PCA on SOM estimates

Feature Selection – Genetic Doping
  [Figure: Feature Selection; Data Worth; 6 clusters; Genetic Doping; Conceptualize Model; 3 clusters (1, 2, 3)]

Workflows – Murray-Darling, AU

Modified self-organizing map
  Parallelization – processing big data (speed)
  Regularization – elastic net estimation (add Lasso, ...)
  Open source – flexibility with model-independent software (Python, R) for customizable workflows

Modified self-organizing map
  Drill-log uncertainty - random correlated variables
  Generalize cross-validation to n-fold
  Ensemble - boosting with SOM as base learner
  Quantile regression – MCMC training resistivity models
  ML proposal distributions - MCMC resistivity inversions

Conclusion
  Data 2 knowledge paradigm provides a SMART approach to characterizing and mapping aquifers
  SMART = Save Money And Reduce Time

Thank you!
Answers? Questions?
Mike Friedel
[email protected]
[email protected]
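
Supplementary sketch (not from the talk): the slides name a modified self-organizing map for estimation and imputation but do not describe its internals. The block below is a plain textbook SOM in standard-library Python, shown only to illustrate the general idea of training a map on complete records and then filling a missing value from the best-matching unit. All function names, parameters, and the synthetic two-variable data are assumptions made for illustration; none of the modifications listed above (parallelization, elastic net, n-fold cross-validation) are included.

```python
import math
import random


def best_matching_unit(weights, x, mask=None):
    """Return the grid coordinates of the node closest to x.

    Only components where mask is True are compared (all if mask is None).
    """
    idx = range(len(x)) if mask is None else [k for k, m in enumerate(mask) if m]
    return min(weights, key=lambda n: sum((weights[n][k] - x[k]) ** 2 for k in idx))


def train_som(data, rows, cols, epochs=40, lr0=0.5, seed=0):
    """Train a rectangular SOM on a list of equal-length numeric records."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0
    # One weight vector per map node, initialised from randomly chosen records.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = data[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays to zero
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def impute(weights, x):
    """Fill None entries of x with the matching components of its BMU."""
    mask = [v is not None for v in x]
    filled = [v if v is not None else 0.0 for v in x]
    bmu = best_matching_unit(weights, filled, mask)
    return [v if v is not None else weights[bmu][k] for k, v in enumerate(x)]


# Synthetic, already-normalised records with two hypothetical attributes.
# Group A has a low first value and a high second value; group B is the reverse.
_rng = random.Random(1)
data = ([[0.1 + _rng.uniform(-0.05, 0.05), 0.9 + _rng.uniform(-0.05, 0.05)]
         for _ in range(20)] +
        [[0.9 + _rng.uniform(-0.05, 0.05), 0.1 + _rng.uniform(-0.05, 0.05)]
         for _ in range(20)])

som = train_som(data, rows=3, cols=3)
print(impute(som, [0.1, None]))  # second attribute estimated from the map
```

In practice the records would first be scaled to comparable ranges, and an estimate taken from a single best-matching unit is only one of several possible choices; averaging over neighbouring nodes is another.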