Data Mining on the Farm
Accelerating the search for a better pesticide
John B. Kinney, Senior Research Associate
DuPont Biosolutions Enterprise
© 2001, DuPont, Inc. - All Rights Reserved.
Spotfire User’s Conference, May 3-4 2001
DuPont Biosolutions Enterprise
Crop Protection Products*
Control weeds, insects, and plant diseases
Pioneer Hi-Bred
High performance seeds
Protein Technologies
Soy protein isolates used in the food industry
Qualicon
Food safety
* Focus of today’s talk
CPP Goals
Control pests
Efficaciously
Safely
Environmentally
Cost effectively
CPP Research & Data Styles
in vitro
in vivo
Field
in vivo CPP Research
“Treating the bed to cure the patient”
Plants in pots
Length of test is a factor
“Extra” data
Herbicide Test Unit
[Figure: Control vs. Test Substance; labels MOG, BGC, FTI, VEL, BYG, PWX, CRL]
Field Tests
Same as in vivo, but with less control!
“Extra” data
Degradation and movement in the environment are major issues
Data Issues
Biological variability
(Highly) Multivariate data
EC50 results are uncommon
Historical data is valuable
Successful Applications of Data Visualization
Sourcing: Preformatted data sets for sample acquisition analysis
Hit Followup: R-group visualization and analysis
Lead Optimization: Color-coded reports for rapid, high-dimensional comparisons
Browsing Acquisition Analysis Data
Challenge: Characterize and evaluate offerings from compound brokers and collaborators
Solution: External system to characterize offerings and build tables for browsing in Spotfire
Minimal interface...
User selection from existing “evaluation tables”
Spotfire for browsing
Parallel Synthesis Hit Followup
Visualization and analysis of combinatorial library
Row and Column layout useful, but not chemically relevant!
Merging synthetic schemes combined with biology
Hansch-style characterization often helpful for identifying trends and features
[Structure diagram with substituents R1 and R2; atom label N]
R1 = methyl, ethyl, propyl, etc.
R2 = -F, -Cl, -Br, -I
Fragment properties and whole molecule data can provide insights
Plate layout vs. Fragment Data
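The contrast between plate layout and fragment layout can be sketched in a few lines of Python. This is only an illustration of the idea, not the system described in the talk: the well positions and activity values are invented, and the function names pivot_by_fragment and format_grid are hypothetical. The R1 and R2 labels are taken from the slide above.

```python
from collections import defaultdict

# Hypothetical screening records: (plate well, R1 group, R2 group, activity).
# Wells and activity values are invented purely for illustration.
records = [
    ("A1", "methyl", "-F", 10),
    ("A2", "ethyl", "-Cl", 40),
    ("B1", "propyl", "-F", 30),
    ("B2", "methyl", "-Cl", 70),
    ("C1", "ethyl", "-F", 20),
    ("C2", "propyl", "-Cl", 90),
]


def pivot_by_fragment(rows):
    """Re-index results from plate wells to an R1-by-R2 grid."""
    grid = defaultdict(dict)
    for _well, r1, r2, value in rows:
        grid[r1][r2] = value
    return {r1: dict(cols) for r1, cols in grid.items()}


def format_grid(grid, r1_order, r2_order):
    """Render the grid as plain text, one row per R1 and one column per R2."""
    lines = ["R1".ljust(8) + "".join(r2.rjust(6) for r2 in r2_order)]
    for r1 in r1_order:
        row = grid.get(r1, {})
        # Missing combinations are left blank rather than filled in.
        cells = [str(row.get(r2, "")) for r2 in r2_order]
        lines.append(r1.ljust(8) + "".join(c.rjust(6) for c in cells))
    return "\n".join(lines)


grid = pivot_by_fragment(records)
print(format_grid(grid, ["methyl", "ethyl", "propyl"], ["-F", "-Cl"]))
```

Ordering the rows and columns by substituent (chain length for R1, halogen for R2) rather than by well position is what makes a trend across the series visible at a glance.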
Lead Optimization
Numerous test and characterization values for each compound
History of complex, printed data reports
PRIMARY PLANT RESPONSE (WEEDS)
INCODE = CPD1    DEPT = 8    DATE = 891127
SUBMITTER =      N.B = 056898    N.B.PAGE = 021
AMT = .21G       % = 100    FORM =    LEAD AREA =
/MOLNM
/Info = CHEMICAL NAME AVAILABLE UPON REQUEST

                                 MORN    COCKL   CRAB    SOR
YY/MM/DD  TYPE  RATE  UNITS      GLORY   BUR     GRASS   GUM
TEST
--------  ----  ----  -----      -----   -----   -----   ---
90/01/02  POST  1.0   KG/HA      90H     70H     10C     30C
90/01/02  PRE   2.0   KG/HA      0       10H     0       30C

[Column alignment lost in extraction; values listed in extracted order]
VELV LEAF, PIG WEED: 70H 30H
GIANT FOXTL, FOXTL MILLT, B Y GRASS, CHEAT GRASS, DOWNY BROME, WILD OATS: 10C 0 40G 0 20H 0 0 0
COMMENT: HERBICIDE
HeLo
Project Overview w/Heat Maps
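As a rough sketch of how entries from a printed report could feed a color-coded overview, the Python below splits rating tokens such as "90H" into a number and a letter, then assigns each number to a color bin. Several things here are assumptions rather than facts from the talk: that the number is a response score and the letter a symptom code, the bin thresholds and color names, and the function names parse_rating and color_bin. The sample tokens are the POST-row values from the report above.

```python
import re

# Assumed token shape: digits followed by an optional single capital letter.
RATING = re.compile(r"^(\d+)([A-Z]?)$")


def parse_rating(token):
    """Split a token such as '90H' into (90, 'H'); a bare '0' gives (0, '')."""
    match = RATING.match(token.strip())
    if match is None:
        raise ValueError(f"unrecognized rating token: {token!r}")
    return int(match.group(1)), match.group(2)


def color_bin(score):
    """Map a numeric score to a color name (thresholds are illustrative only)."""
    if score >= 70:
        return "red"
    if score >= 30:
        return "yellow"
    return "green"


# POST-row tokens as they appear in the printed report.
post_row = {"MORN GLORY": "90H", "COCKL BUR": "70H", "CRAB GRASS": "10C", "SORGUM": "30C"}

for species, token in post_row.items():
    score, code = parse_rating(token)
    print(f"{species:<12}{score:>4} {code:<2}{color_bin(score)}")
```

Keeping the letter separate from the number means the color can encode the magnitude while the code stays available as a label or tooltip.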
Future Challenges
Better data extraction/formatting techniques
Expanding data warehouse to include non-traditional data sources
Computer screen real estate!
Acknowledgements
At the risk of missing someone...
Kevin Kranis (retired)
Laurie Christianson
Dan Kleier
The entire Discovery Organization
-- They generated the data!