SIDA – AN OVERVIEW

Focus of research: method development
• Wavelets in statistics (David Donald, Lachlan McKinna, Yvette Everingham)
• Tree-based methods (Tim Hancock)
• Assembling methods for improved performance (Christine, Lewis)
• Categorical data analysis (Mike)
• Grid computing (Nigel Sim)
• Management Statistics (Daniel Zamykal)
• Software development (Stefan)
Danny Coomans

Regression problem

\[ y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \qquad X = \begin{pmatrix} x_{11} & \cdots & x_{1d} \\ \vdots & & \vdots \\ x_{n1} & \cdots & x_{nd} \end{pmatrix} \]

With L denoting the loss function:

\[ F^* = \arg\min_F E_{y,x}\, L(y, F(x)) = \arg\min_F E_x\big[ E_y\big( L(y, F(x)) \big) \mid x \big] \]

Activity of tamiflu = genes + life style + …

What to do?
• Linear regression
• Polynomial regression
• Partial least squares
• Least median of squares
• Multivariate regression splines
• Neural networks
• Support vector machines
• Regression trees
• Bayesian linear regression
• Generalised linear models
• Projection pursuit regression
• Deep regression
• Local regression
• Weighted least squares
• ……..

Try them all and take the best one? WRONG
Answer: Combine Them!
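The slides do not say how the methods should be combined. As a minimal sketch of one possible reading, the code below averages the predictions of three deliberately simple regressors (a constant mean, a straight line, and k-nearest neighbours) with equal weights. The simulated data, the choice of member models, and every function name are assumptions made for illustration only.

```python
import math
import random
import statistics


def fit_mean(xs, ys):
    """Baseline member: always predict the training mean."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Ordinary least squares with a single predictor."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def fit_knn(xs, ys, k=5):
    """k-nearest-neighbour regression: average the k closest responses."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean([y for _, y in nearest])

    return predict


def combine(models):
    """Equal-weight ensemble: average the member predictions."""
    return lambda x: statistics.fmean([m(x) for m in models])


def mse(model, xs, ys):
    """Mean squared prediction error on a data set."""
    return statistics.fmean([(model(x) - y) ** 2 for x, y in zip(xs, ys)])


def make_data(rng, n):
    """Hypothetical data: a smooth nonlinear trend plus Gaussian noise."""
    xs = [rng.uniform(0.0, 6.0) for _ in range(n)]
    ys = [math.sin(x) + 0.5 * x + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys


rng = random.Random(0)
x_train, y_train = make_data(rng, 80)
x_test, y_test = make_data(rng, 40)

members = [fit(x_train, y_train) for fit in (fit_mean, fit_line, fit_knn)]
ensemble = combine(members)

member_errors = [mse(m, x_test, y_test) for m in members]
ensemble_error = mse(ensemble, x_test, y_test)
print("member test MSEs:", [round(e, 3) for e in member_errors])
print("ensemble test MSE:", round(ensemble_error, 3))
```

Under squared-error loss, an equal-weight average can never do worse than the average of its members' errors, but it can still lose to the single best member. Weighted combinations or stacking are common alternatives when the members differ widely in quality.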
Christine Smyth, Nigel Sim and Lewis Anderson

GRID computing
• Ensemble learning
• Randomisation methods:
  Bootstrapping
  Permutation tests
  Genetic Algorithms
  Monte Carlo
  Crossvalidation
  MCMC
  etc.
Nigel Sim

NIR spectrometry
• Rapid Assessment
• Quality monitoring of:
  Avocados
  Sandalwood
  Wine
  Sugar Cane
Yvette, Mike, Danny, Ron

[Figure: NIR Spectra. Normalised absorbance against wavelength (nm), 400 to 2400 nm.]

Responses: Sucrose Concentration, Fructose Concentration, Glucose Concentration
250 wavelengths of NIR spectra
125 training samples
21 evaluation samples

Wavelets
• Demonstrate the use of adaptive wavelets in different situations:
  Regression
  Experimental design
  Clustering
  Classification
  etc.
David Donald and Lachlan McKinna

2D Wavelet Transform of SST Anomalies – Lachlan McKinna

[Figure: 2D WT in four panels. Approximation A1 (smooth), Horizontal Detail H1, Vertical Detail V1, Diagonal Detail D1.]

Sustaining our natural resources: Dairying for tomorrow.
• Analysis of survey results examining natural resource management on Australian dairy farms
• Investigating the specific management practices responsible for producing greater than expected yield, given the nutritional inputs
Daniel Zamykal

Mike Steele
Main research area:
• Looking at the power of goodness-of-fit tests (e.g. Chi-Square, Kolmogorov-Smirnov, …)
• Use Monte Carlo simulation techniques for this as it is easy!
• Also do minor statistical consulting work. This has led to publications with a couple of cardiologists. More of these coming.
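The Monte Carlo approach to the power of a goodness-of-fit test can be sketched as follows: simulate many samples from a chosen alternative, apply the test to each, and report the fraction of rejections. This is an illustration only, not the procedure used in the research above. The six-category example, the loaded-die alternative, the sample size, and all names are assumptions; the critical value 11.0705 is the upper 5% point of the chi-square distribution with 5 degrees of freedom.

```python
import random

# Upper 5% point of the chi-square distribution with 5 degrees of freedom
# (six categories, no estimated parameters).
CHI2_CRIT_DF5 = 11.0705


def chi_square_stat(counts, null_probs):
    """Pearson chi-square statistic for observed counts against null_probs."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, null_probs))


def simulate_counts(rng, n, probs):
    """Draw n observations from the categorical distribution probs."""
    counts = [0] * len(probs)
    for idx in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[idx] += 1
    return counts


def rejection_rate(rng, true_probs, null_probs, n, n_sims, crit=CHI2_CRIT_DF5):
    """Monte Carlo estimate of P(reject null) when data follow true_probs."""
    rejections = 0
    for _ in range(n_sims):
        counts = simulate_counts(rng, n, true_probs)
        if chi_square_stat(counts, null_probs) > crit:
            rejections += 1
    return rejections / n_sims


rng = random.Random(1)
fair = [1 / 6] * 6                                # null hypothesis: fair die
loaded = [0.25, 0.15, 0.15, 0.15, 0.15, 0.15]     # hypothetical alternative

size = rejection_rate(rng, fair, fair, n=120, n_sims=2000)
power = rejection_rate(rng, loaded, fair, n=120, n_sims=2000)
print("estimated size under the null:", size)
print("estimated power against the loaded die:", power)
```

Running the same function with the null distribution as the truth estimates the size of the test, which gives a quick check that the nominal 5% level is roughly attained before the power figure is interpreted.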
Multivariate profiling
Tim Hancock

Applications
• Climate variability, NIR (Yvette)
• Forensic, NIR, … (Mike)
• Clinical symptom profiling (Danny) …
• Aerodrome weather (Keith Ross)

Research Advancement Programs
• Computational Life Sciences
• AVANTI = ageing, veins, arteria, nutrition, trials, information
• ($750,000, 3 years)

Computational Life Sciences
• Data mining/modeling
  Tropical diseases: Q fever, melioidosis, viral characterisation.
  Images: retina/diabetes
  Data analysis/diabetes
• Grid computing: genetic data

AIMS@JCU: 4th program
• Statistical data mining in Drug Discovery
• Prediction of biological activities on the basis of chemical fingerprints
• Prediction of biological activities on the basis of molecular descriptors
Regression methods

Micro-arrays and Drug Discovery
• Drug activity patterns
• Different tumor cell lines
• Drug molecules
• Molecular structure descriptors

AVANTI: statistical modeling
• Chronic diseases: diabetes, aortic aneurism, chronic inflammation.
Symptom profiles == biological inf
Image features == disease status
Time profiles == disease status

Aortic Aneurism

\[ Y_{ij} = \sum_{h=0}^{p} \beta_h x_{hij} + \sum_{h=0}^{m} U_{hi} z_{hij} + R_{ij} \]

Prediction of time to reach some critical threshold (5 cm)

Time to critical threshold

[Figure: time profile, shown with the model equation above.]
Tim Hancock and Danny Coomans

Risk factors: Tobacco, HT, CAD, COPD, CRF, PAD, Diabetes, etc.

Are the military in Iraq properly protected against chemical attacks?
• Are the antidote drugs they carry resistant to large temperature variations?
• Longitudinal Analysis. New PhD student co-supervision.

Work-Flows and Data Mining

[Diagram labels: Input Dataset, Method 1, 1st Output Data, Method 2, 2nd Output Data.]

• Work-flows make combining statistical algorithms easy
• The results from each algorithm flow from one node to the next, making a combination of techniques easy.
• Output and graphics can be viewed at each intermediate stage in the work-flow
Stefan Aberhard
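The work-flow idea, where results flow from one node to the next and every intermediate output stays available for inspection, can be sketched in a few lines. This is not the software described on the slide. The two stand-in "methods" (centring and rescaling) and all names are hypothetical placeholders for real statistical algorithms.

```python
from typing import Callable, List, Sequence, Tuple

Dataset = List[float]
Node = Tuple[str, Callable[[Dataset], Dataset]]


def run_workflow(data: Sequence[float],
                 nodes: Sequence[Node]) -> List[Tuple[str, Dataset]]:
    """Pass data through each node in order, keeping every stage's output."""
    current = list(data)
    outputs = [("input", list(current))]
    for name, method in nodes:
        current = method(current)          # output of one node feeds the next
        outputs.append((name, list(current)))
    return outputs


def centre(data: Dataset) -> Dataset:
    """Hypothetical method 1: subtract the mean."""
    mean = sum(data) / len(data)
    return [x - mean for x in data]


def scale_by_max_abs(data: Dataset) -> Dataset:
    """Hypothetical method 2: divide by the largest absolute value."""
    largest = max(abs(x) for x in data)
    return [x / largest for x in data] if largest else list(data)


stages = run_workflow(
    [2.0, 4.0, 6.0],
    [("method 1: centre", centre), ("method 2: scale", scale_by_max_abs)],
)
for name, output in stages:
    print(name, output)                    # inspect each intermediate stage
```

Keeping the list of (name, output) pairs, rather than only the final result, is what allows output to be viewed at each intermediate stage; swapping or adding methods only requires editing the node list.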