Data Mining on the Farm
Accelerating the search for a better pesticide
John B. Kinney, Senior Research Associate
DuPont Biosolutions Enterprise
© 2001, DuPont, Inc. - All Rights Reserved.
Spotfire User’s Conference, May 3-4 2001

DuPont Biosolutions Enterprise
  Crop Protection Products*: Control weeds, insects, and plant diseases
  Pioneer Hi-Bred: High performance seeds
  Protein Technologies: Soy protein isolates used in the food industry
  Qualicon: Food safety
  * Focus of today’s talk

CPP Goals
  Control pests
    Efficaciously
    Safely
    Environmentally
    Cost effectively

CPP Research & Data Styles
  in vitro
  in vivo
  Field

in vivo CPP Research
  “Treating the bed to cure the patient”
  Plants in pots
  Length of test is a factor
  “Extra” data

Herbicide Test Unit
  [Figure labels: Control, Test Substance; MOG, BGC, FTI, VEL, BYG, PWX, CRL]

Field Tests
  Same as in vivo, but with less control!
  “Extra” data
  Degradation and movement in the environment are major issues

Data Issues
  Biological variability
  (Highly) Multivariate data
  EC50 results are uncommon
  Historical data is valuable

Successful Applications of Data Visualization
  Sourcing: Preformatted data sets for sample acquisition analysis
  Hit Followup: R-group visualization and analysis
  Lead Optimization: Color-coded reports for rapid, high-dimensional comparisons

Browsing Acquisition Analysis Data
  Challenge: Characterize and evaluate offerings from compound brokers and collaborators
  Solution: External system to characterize offerings and build tables for browsing in Spotfire

Minimal interface...
  User selection from existing “evaluation tables”
  Spotfire for browsing

Parallel Synthesis Hit Followup
  Visualization and analysis of combinatorial library
  Row and Column layout useful, but not chemically relevant!
  Merging synthetic schemes combined with biology
  Hansch-style characterization often helpful for identifying trends and features
  [Structure diagram with substituent positions R1 and R2]
    R1 == methyl, ethyl, propyl, etc
    R2 == -F, -Cl, -Br, -I
  Fragment properties and whole molecule data can provide insights

Plate layout vs. Fragment Data

Lead Optimization
  Numerous test and characterization values for each compound
  History of complex, printed data reports

  [Figure: example printed report]
  PRIMARY PLANT RESPONSE (WEEDS)
  INCODE = CPD1 DEPT = 8 DATE = 891127 SUBMITTER = N.B = 056898 N.B.PAGE = 021
  AMT =.21G % = 100 FORM = LEAD AREA =
  INCODE= CPD1
  /MOLNM
  /Info= CHEMICAL NAME AVAILABLE UPON REQUEST
  TEST  YY/MM/DD  TYPE  RATE  UNITS
        90/01/02  POST  1.0   KG/HA
        90/01/02  PRE   2.0   KG/HA
  Responses, in the order extracted:
    MORN GLORY: 90H 0
    COCKL BUR: 70H 10H
    VELV LEAF, PIG WEED: 70H 30H
    CRAB GRASS: 10C 0
    GIANT FOXTL, FOXTL MILLT, B Y GRASS, CHEAT GRASS, DOWNY BROME, WILD OATS: 10C 0 40G 0 20H 0 0 0
    SOR GUM: 30C 30C
    COMMENT: HERBICIDE

HeLo Project Overview w/Heat Maps

Future Challenges
  Better data extraction/formatting techniques
  Expanding data warehouse to include non-traditional data sources
  Computer screen real estate!

Acknowledgements
  At the risk of missing someone...
  Kevin Kranis (retired)
  Laurie Christianson
  Dan Kleier
  The entire Discovery Organization -- They generated the data!
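
Note on the printed report under Lead Optimization: entries such as 90H, 10C, 40G and 0 have to be split into a number and a letter before they can feed a color-coded report or heat map. The sketch below shows one possible way to do that. It assumes each entry is a whole number optionally followed by a single capital letter; the slides do not say what the letters mean, so the letter is kept as an opaque code. The names Rating, parse_rating and to_numeric_row are hypothetical and are not part of Spotfire or any DuPont system.

```python
import re
from typing import List, NamedTuple, Optional


class Rating(NamedTuple):
    score: int            # numeric part of the entry, e.g. 90 in "90H"
    code: Optional[str]   # trailing letter, e.g. "H"; None when absent


# Assumed entry shape: 1-3 digits, then at most one capital letter.
_ENTRY = re.compile(r"(\d{1,3})([A-Z])?")


def parse_rating(entry: str) -> Rating:
    """Split one report entry into its numeric score and letter code."""
    match = _ENTRY.fullmatch(entry.strip())
    if match is None:
        raise ValueError(f"unrecognized rating entry: {entry!r}")
    return Rating(int(match.group(1)), match.group(2))


def to_numeric_row(entries: List[str]) -> List[int]:
    """Keep only the numeric scores, e.g. as one row of a heat-map matrix."""
    return [parse_rating(e).score for e in entries]
```

Calling to_numeric_row on the entries of one test row gives a list of plain integers that can be loaded as a numeric column set; the letter codes can be kept in a parallel column for filtering or coloring.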