Adventures in Data Mining
Kimberly Kirkpatrick
Kansas State University
What is Data Mining?
• The process of knowledge discovery in databases
• Data mining should be hypothesis driven
• This is not the same as data dredging or data fishing!

Why Data Mining?
• Flexibility in data analysis
◦ Scaling issues
• May be able to use one data set for many purposes
◦ Grant applications
◦ Modelling/curve fitting
◦ Generate new research questions
Data Collection: Time-event codes
Time stamp in ms
841.005
1564.005
1650.005
2901.005
3666.005
3856.005
15409.005
19075.005
20331.005
21975.005
47126.006
47277.006
47391.006
47495.006
47598.006
55217.006
55268.006
59765.005
59959.010
60793.005
62070.005
62326.005
62377.005
62411.005
63585.005
64494.005
64882.005
65873.005
66514.005
66741.005
69959.020
69959.013
70059.023
70477.005
106429.005
108570.006
108702.006
109337.010
112883.005
113133.005
119337.020
119337.013
119387.023
120100.005
Event codes
Head entry into food cup = 005
Drinking from water tube = 006
Tone on = 010
Tone off = 020
Food on = 013
Food off = 023
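Each record in the list above appears to pack two fields into one number: the part before the decimal point is the time stamp in ms, and the three digits after it are the event code. A minimal sketch of decoding such records, assuming they are read as text strings (the names EVENT_NAMES and parse_record are illustrative and not taken from the original analysis tools):

```python
# Event codes as listed on the slide.
EVENT_NAMES = {
    "005": "head entry into food cup",
    "006": "drinking from water tube",
    "010": "tone on",
    "020": "tone off",
    "013": "food on",
    "023": "food off",
}


def parse_record(record):
    """Split a 'time.code' string into (time in ms, event code).

    Working on the text rather than a float keeps trailing zeros in
    the code, so '59959.010' yields '010' and not '01'.
    """
    time_part, code = record.strip().split(".")
    return int(time_part), code


# A few records copied from the time-event list.
records = ["841.005", "47126.006", "59959.010", "69959.020", "69959.013"]
events = [parse_record(r) for r in records]
for time_ms, code in events:
    print(time_ms, code, EVENT_NAMES[code])
```

Keeping the code as a string is a design choice made here for safety; reading the records as floating-point numbers would also work only if the fractional part is rounded carefully.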
Value of Data Mining: Scaling Issues
[Diagram labels: Delay, Temporal Conditioning, US]
Extract times of responses of interest (005)
Take the difference of those times
IRT1 = 1564 - 841 = 723 ms
IRT2 = 1650 - 1564 = 86 ms
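The two steps above (extract the times of the response of interest, then take successive differences) can be sketched as follows. This is an illustration under the assumption that records are 'time.code' strings; the function name interresponse_times is hypothetical.

```python
def interresponse_times(records, code="005"):
    """Return successive differences (ms) between events with `code`."""
    times = []
    for record in records:
        time_part, event_code = record.strip().split(".")
        if event_code == code:
            times.append(int(time_part))
    # Pair each time with the next one and subtract.
    return [later - earlier for earlier, later in zip(times, times[1:])]


# First head-entry records from the time-event list.
records = ["841.005", "1564.005", "1650.005", "2901.005",
           "3666.005", "3856.005", "15409.005"]
irts = interresponse_times(records)
print(irts)  # [723, 86, 1251, 765, 190, 11553]
```

Passing a different code (for example "006" for drinking) reuses the same raw file for a different response measure.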
[Figure: probability distributions of interresponse time (s), shown on a linear time axis and on a logarithmic time axis. Kirkpatrick & Church (2003)]
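The scaling issue can be illustrated by binning the same interresponse times on a linear and on a logarithmic time scale. The sketch below is only an illustration: the 50 s linear bin width and the one-bin-per-decade log spacing are assumptions chosen for the example, not the binning used in Kirkpatrick & Church (2003).

```python
import math
from collections import Counter


def linear_bin(irt_s, width_s=50.0):
    """Lower edge (s) of the fixed-width bin containing irt_s."""
    return math.floor(irt_s / width_s) * width_s


def log_bin(irt_s, bins_per_decade=1):
    """Lower edge (s) of the logarithmic bin containing irt_s (> 0)."""
    index = math.floor(math.log10(irt_s) * bins_per_decade)
    return 10 ** (index / bins_per_decade)


# IRTs between the first head entries, converted from ms to s.
irts_s = [0.723, 0.086, 1.251, 0.765, 0.190, 11.553]

linear_counts = Counter(linear_bin(t) for t in irts_s)
log_counts = Counter(log_bin(t) for t in irts_s)
print(dict(linear_counts))
print(dict(log_counts))
```

With these six values, every IRT lands in the first linear bin, while the logarithmic bins spread them over four decades, which is why the choice of scale changes what the distribution appears to show.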
Value of Data Mining: Scaling Issues
[Figure: probability density of interresponse time (s) on a logarithmic axis, with separate curves for Bout IRTs and Pause IRTs. Kirkpatrick & Church (2003)]
Value of Data Mining
• May be able to use one data set for many purposes
◦ Grant applications
◦ Generate new research questions
[Figure, panel B: change-over time (s) as a function of reinforcer magnitude. Galtress, Garcia and Kirkpatrick (2012)]
The Dark Side of Data Mining
• Time consuming (500-2000 data files/study)
• Resource intensive
• Requires high-level programming and data analysis skills

Summary and Conclusions

Consider sharing tools for analysis

[email protected]
Russ Church
Ana Garcia
Tiffany Galtress
Yuci Gou
Andrew Marshall
My rats