SMART Aquifer Characterization and Mapping with Machine-Learning and Evolutionary Techniques
Data 2 Knowledge
Keynote, Groundwater Session
Australian Earth Sciences Convention
Adelaide, Australia
28 June, 2016
Michael J. Friedel, Ph.D.
[email protected]

Outline
• Background
• Applications
• Future

BACKGROUND

Goal
• Save Money and Reduce Time (SMART) in characterization and mapping of aquifers

Past - disparate data
• Hydrogeologic surveys
  • Physical properties (lithologic composition, Ks or T, porosity, bulk density, retention, water levels)
  • Hydrochemistry (major ions, metals, isotopes, tracers)
  • Field parameters (pH, temperature, SC)
  • Sampling (pump/injection, point/crosswell)
• Geophysical surveys
  • Physical properties (density, velocity, resistivity)
  • Sampling (point, crosswell, surface, remote)

Today - big data
• Velocity – rate at which data are generated
• Volume – number of records
• Variety – coupled, disparate, nonlinear, scale-dependent, sparse, spatiotemporal, uncertain

The challenge ...
• “We're drowning in data and starving for knowledge” (Rutherford D. Rogers)

Objective
• SMART aquifer characterization and mapping with machine-learning and evolutionary techniques
[Diagram: Traditional, Machine Learning, Evolutionary = Model]

[Diagram: Reality; Model = simplification of reality]

Model accuracy
• Dependent on data diversity
  • Observations across multiple gradients
  • Natural and anthropogenic features/stresses

Data diversity
• Physical - geophysical and hydrogeologic response to lithology (natural)
• Chemical - agricultural to urban land use, forest to mining land use (anthropogenic)
• Climate change – time continuum
  • Groundwater archives paleotemperatures
  • Aquatic biology archives El Niño – La Niña events
• Sample support – space and time
  • Borehole induction resistivity to airborne resistivity
  • Slug test conductivity to pump test transmissivity
  • Algae to macroinvertebrates to fish

Model algorithms
• Machine-learning
  • Supervised - maps inputs to outputs
  • Unsupervised - models set of patterns
• Evolutionary
  • Unsupervised optimization (heredity, fitness)
  • Functional evolution (functions, parse trees)
[Figure labels: Supervised, Unsupervised]

Model algorithms
• Hybrid
  • Traditional plus machine-learning
  • Evolutionary plus machine-learning
  • ML plus minimum spanning tree
• Workflows
  • Multiple ML processes

Selecting algorithms
• Classification or regression
• Imputation, estimation, prediction, forecast
• Sparse or complete data
• Number of attributes, processes, responses
• Linearly separable or nonlinear relations

APPLICATIONS

Machine learning - supervised
• Perceptron
  • Mapping presence/absence of till (Gunnink et al. 2012)
• Back-propagation (BP)
  • Mapping 3-layer resistivity (Zhu et al. 2012)
• Naïve Bayes/BP/Support Vector Machine
  • Precipitation effects on groundwater recharge (unpublished)

Predict precipitation effects on GW recharge (unpublished)
• ANN – supervised artificial neural network
• Minnesota, USA
[Figure: axes labelled Recharge (cm), Year, and Precipitation (cm)]
• Data-driven challenge – predicting extremes
• Solution – train ANN with a set of correlated random variables

Machine-learning - unsupervised
• Modified Self-Organizing Map (SOM)
  • Forecast climate change on groundwater recharge (unpublished)
  • Climate-change reconstruction (Friedel, 2012)

Forecast climate change on GW recharge (unpublished)
• ANN – supervised artificial neural network with correlated random variables
• Wisconsin, USA
• Use for water-resource management

Climate-change reconstruction (Friedel, 2012)
• Northern and Southern Hemisphere
[Figure labels: Medieval Warming Period, Little Ice Age, Modern Warming Period]

Climate-change reconstruction (unpublished)

Hybrid Modeling – SOM plus others …
• GW recharge scaling equations (Friedel, unpublished)
• Spatial continuity for GW models (Friedel et al., 2013)
• Climate-change forecasting (Esfahani and Friedel, 2014)
• Toward real-time aquifer mapping (Friedel et al., 2015)
• Estimation and scaling hydrostratigraphy (Friedel, 2016)
• Real-time satellite mapping landscape features (in review)

Groundwater-recharge scaling equations (unpublished)
[Diagram: Self-Organizing Map, Symbolic Regression, Scaling Equations; axes labelled Precipitation and Recharge]
• Use to adjust groundwater recharge based on scale of model

Spatial continuity for GW models (Friedel et al., 2013)
• …
• Various scales, Brazil
• Use to conceptualize and inform groundwater model calibration
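
The hybrid applications above all build on a self-organizing map (SOM). The following is a minimal sketch of a generic textbook SOM, not the modified SOM used in the cited studies, showing one way a trained map can estimate a missing attribute from the observed ones. The function names (`train_som`, `best_matching_unit`, `impute`), the 4 x 4 grid, the decay schedules, and the synthetic precipitation/recharge numbers are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid node closest to x, skipping attributes that are None."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x) if xi is not None)
    return min(weights, key=lambda node: sq_dist(weights[node]))


def train_som(data, rows=4, cols=4, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM on complete records (lists of floats)."""
    rng = random.Random(seed)
    # Initialise every node from a randomly chosen training record.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)              # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.5  # neighbourhood radius shrinks
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))  # Gaussian neighbourhood
                for i, xi in enumerate(x):
                    w[i] += lr * h * (xi - w[i])
    return weights


def impute(weights, x):
    """Fill None entries of x with the values stored at its best-matching node."""
    bmu = best_matching_unit(weights, x)
    return [weights[bmu][i] if xi is None else xi for i, xi in enumerate(x)]


# Synthetic, made-up records: [scaled precipitation, scaled recharge].
# The 0.3 slope and the noise level are invented, not taken from the talk.
gen = random.Random(1)
data = []
for _ in range(100):
    p = gen.random()
    data.append([p, 0.3 * p + gen.gauss(0.0, 0.01)])

som = train_som(data)
est = impute(som, [0.8, None])  # recharge unknown, precipitation observed
print(est)
```

Matching on the observed attributes only and reading the remaining values from the best-matching node is one simple way to apply a SOM to sparse records; the studies cited above may differ in every detail.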

Climate-change forecasting (Esfahani and Friedel, 2014)
• Precipitation trend for California, USA: 2012 to 2020
[Figure labels: Drought, Wildfires; California, USA]
• Use to evaluate climate change effects on groundwater modeling

Toward real-time aquifer mapping (Friedel et al., 2015)
[Figure: Numerical Inversion and Machine-Learning Estimates panels; units labelled Aquifer and Confining Unit; scale bar 0 to 5 km]
• Use to conceptualize groundwater system

Estimation & scaling hydrostratigraphy (Friedel, 2016)
[Diagram: Borehole Hydrostratigraphic Units, SOM Scaling Network, Continuous Hydrostratigraphic Units]
• HSU(lithology, hydraulic properties, water chemistry, geophysical properties)
• Use to conceptualize and inform groundwater modeling process

Real-time satellite landscape mapping (in review)
• Subpixel soil and vegetation fractions, Brazil
[Figure panels: Soil, Photosynthetically Active Vegetation (PAV), Nonphotosynthetically Active Vegetation (NPAV)]
• Use to downscale GW recharge estimates

FUTURE

Feature selection
• Spatial autocorrelation
• Linear PCA on SOM estimates
• Clustering ACM on minimum spanning tree
• Genetic doping

Feature Selection – Spatial Autocorrelation
• Self-Organizing Map 1
• Otago Region, NZ
[Figure: Layer Resistivities, Electromagnetic Measurements, Distance of Resistivity Sounding to Borehole]
[Figure: Autocorrelation @ Boreholes F41-0389, C2017, and g41-0308; correlation versus distance from borehole, 0 to 60 m]
• Relate borehole lithology to AEM soundings <10 m

Feature Selection – Spatial Autocorrelation
• Self-Organizing Map 2
• Lithology + hydraulic properties + chemistry + soundings < 10 m
• Predict hydraulic conductivity
• Predict lithology
• Predict total dissolved solids

Feature Selection – Genetic Doping
• Southland Region, NZ
• Model (53 variables)
• Linear PCA on SOM estimates
[Diagram: Feature Selection, Data Worth, Genetic Doping, Conceptualize Model; labels 6 clusters and 3 clusters]

Workflows – Murray-Darling, AU

Modified self-organizing map
• Parallelization – processing big data (speed)
• Regularization – elastic net estimation (add Lasso, ...)
• Open source – flexibility with model-independent software (Python, R) for customizable workflows

Modified self-organizing map
• Drill-log uncertainty – random correlated variables
• Generalize cross-validation to n-fold
• Ensemble – boosting with SOM as base learner
• Quantile regression – MCMC training resistivity models
• ML proposal distributions – MCMC resistivity inversions

Conclusion
• Data 2 knowledge paradigm provides a SMART approach to characterizing and mapping aquifers
• SMART = Save Money And Reduce Time

Thank you!
Answers? Questions?
Mike Friedel
[email protected]
[email protected]