CISC 4631 Data Mining
DATA MINING
SAMPLE RESEARCH: ACTIVITY RECOGNITION
CLASSIFICATION IN ACTION
Mobile Activity Recognition
 Mobile devices like smartphones and
smartwatches have many sensors
 Some sensors measure motion
 Tri-axial accelerometer
 Gyroscope
 GPS and other location sensors
 Activity Recognition is now pretty common
but wasn’t when this research started
 It took a year for our Fitbit order to be delivered
What is Activity Recognition?
 Identifying a user’s activity based on data
 In our case the mobile sensor data from the
accelerometer and gyroscope
 What type of data mining task is this?
 Classification
 How would you formulate this as a
classification task?
 Not so obvious if you have not read the paper,
since the time dimension complicates things
More on Activity Recognition
 Examples of activities
 Walking, jogging, running, jumping, washing
dishes, playing basketball, reading, partying,
studying, eating, drinking, etc
 Why do we care?
 Context sensitive “smart” devices
 Fitness and health applications
 To track what we do for other purposes
The Data
 The data is collected at 20 Hz
 A timestamped sequence of numbers for
each of 3 dimensions for both sensors
Walking Data
Phone accelerometer
Watch Gyroscope
Phone Jogging Accelerometer
Phone Accelerometer Standing Data
WISDM Activity Recognition Studies
 2010 study using only smartphones
 Good results, but only 6 basic activities (29 subjects)
 More refined studies over the next few years, including
the impact of personal models
 2016 study: smartphones & smartwatches
 Good results over 18 activities (17 subjects)
 Hand-based activities including eating
 In progress
 Increasing test subjects to 50-100 and more thorough
evaluation of the four sensors
 Phone accel, phone gyro, watch accel, watch gyro, fusion
The 2016 Smartwatch Activities
General Activities
 Walking*
 Jogging*
 Climbing Stairs*
 Sitting*
 Standing*
 Kicking Soccer Ball
* These were used in the 2010 smartphone study
General Activities (hand-oriented)
 Dribbling Basketball
 Playing Catch with Tennis Ball
 Typing
 Handwriting
 Clapping
 Brushing Teeth
 Folding Clothes
Eating Activities (hand-oriented)
 Eating Pasta
 Eating Soup
 Eating Sandwich
 Eating Chips
 Drinking from a Cup
Formulation as Classification
 Take the raw time series sensor data in non-overlapping
10-second chunks and create one example per chunk
 Use higher level features to describe behavior
over the 10 second period
 This is data transformation
 Mapping the data to a very different representation
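The chunking step can be sketched as follows. This is a minimal illustration, assuming the 20 Hz sampling rate mentioned earlier (so a 10-second chunk holds 200 readings) and that each reading is an (x, y, z) tuple. The function name `make_windows` and the decision to drop a trailing partial chunk are assumptions made for this sketch, not details taken from the study.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float, float]  # one (x, y, z) sensor reading

SAMPLING_RATE_HZ = 20  # from the slides
WINDOW_SECONDS = 10    # from the slides


def make_windows(samples: Sequence[Sample],
                 rate_hz: int = SAMPLING_RATE_HZ,
                 seconds: int = WINDOW_SECONDS) -> List[List[Sample]]:
    """Split a stream of readings into non-overlapping fixed-length windows.

    A trailing partial window is dropped (an assumption of this sketch).
    """
    size = rate_hz * seconds          # 200 readings per window at 20 Hz
    n_full = len(samples) // size     # number of complete windows
    return [list(samples[i * size:(i + 1) * size]) for i in range(n_full)]


# 25 seconds of dummy data: two full windows, the last 5 seconds are dropped.
stream = [(0.0, 9.8, 0.0)] * (25 * SAMPLING_RATE_HZ)
windows = make_windows(stream)
print(len(windows), len(windows[0]))  # 2 200
```

Each window then becomes one training or test example once it is summarized by higher-level features and paired with its activity label.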
High Level Features: 43 Total
 Average[3]: average acceleration per axis
 Standard Deviation[3]: standard deviation per axis
 Average Absolute Difference[3]: per axis
 Average Resultant Acceleration[1]: average of the
square root of the sum of squares of the 3 axis values
 Time Between Peaks[3]: per axis
 Binned Distribution[30]: for each axis, take the range
(max – min), divide it into 10 equal-sized bins,
and record the fraction of values in each bin
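A rough sketch of this transformation for one 10-second chunk is shown below. Several details are assumptions, since the slides do not spell them out: the average absolute difference is taken relative to the axis mean, the standard deviation is the population form, and `time_between_peaks` is a simple stand-in heuristic (local maxima within 10% of the range from the window maximum) rather than the procedure used in the study. The sketch emits a single average resultant value, which is what makes the feature counts add up to 43. All function names are hypothetical.

```python
import math
from statistics import fmean, pstdev


def binned_distribution(values, n_bins=10):
    """Fraction of values falling in each of n_bins equal-width bins over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # A constant signal has zero width; put everything in the first bin.
        idx = 0 if width == 0 else min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def time_between_peaks(values, rate_hz=20):
    """Mean gap in ms between prominent local maxima (heuristic, an assumption)."""
    lo, hi = min(values), max(values)
    threshold = hi - 0.1 * (hi - lo)
    peaks = [i for i in range(1, len(values) - 1)
             if values[i] >= threshold
             and values[i] > values[i - 1] and values[i] >= values[i + 1]]
    if len(peaks) < 2:
        return 0.0
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return 1000.0 * (sum(gaps) / len(gaps)) / rate_hz


def extract_features(window, rate_hz=20):
    """Map a list of (x, y, z) readings to a 43-value feature vector."""
    axes = list(zip(*window))                      # all x, all y, all z
    means = [fmean(a) for a in axes]
    features = list(means)                                          # Average[3]
    features += [pstdev(a) for a in axes]                           # Standard Deviation[3]
    features += [fmean(abs(v - m) for v in a)
                 for a, m in zip(axes, means)]                      # Avg. Absolute Difference[3]
    features.append(fmean(math.sqrt(x * x + y * y + z * z)
                          for x, y, z in window))                   # Avg. Resultant Acceleration[1]
    features += [time_between_peaks(a, rate_hz) for a in axes]      # Time Between Peaks[3]
    for a in axes:
        features += binned_distribution(a)                          # Binned Distribution[30]
    return features


# Synthetic chunk: 1 Hz sine on x, constant y, linear ramp on z.
demo_window = [(math.sin(2 * math.pi * i / 20), 9.8, 0.1 * i) for i in range(200)]
print(len(extract_features(demo_window)))  # 43
```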
Types of Models
 Impersonal Models
 Generated using data from a panel of other users
 Personal Models
 Generated using data from the intended user.
Training data must be kept separate from test data, as usual.
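The difference between the two model types comes down to which examples go into the training set. The sketch below illustrates this under stated assumptions: the `Example` record, the function names, and the 80/20 split for the personal model are all hypothetical choices for illustration; the slides do not say how the studies partitioned the data.

```python
from typing import List, NamedTuple, Tuple


class Example(NamedTuple):
    user: str              # subject identifier
    features: List[float]  # e.g. the 43 high-level features
    label: str             # activity name


def impersonal_split(examples: List[Example],
                     target_user: str) -> Tuple[List[Example], List[Example]]:
    """Train on every other user, test on the target user."""
    train = [e for e in examples if e.user != target_user]
    test = [e for e in examples if e.user == target_user]
    return train, test


def personal_split(examples: List[Example], target_user: str,
                   train_fraction: float = 0.8) -> Tuple[List[Example], List[Example]]:
    """Train and test on the target user's own data, using disjoint portions."""
    own = [e for e in examples if e.user == target_user]
    cut = int(len(own) * train_fraction)
    return own[:cut], own[cut:]


# Tiny made-up dataset: 3 users with 10 examples each.
data = [Example(u, [float(i)], "Walking") for u in ("u1", "u2", "u3") for i in range(10)]

imp_train, imp_test = impersonal_split(data, "u1")
per_train, per_test = personal_split(data, "u1")
print(len(imp_train), len(imp_test))  # 20 10
print(len(per_train), len(per_test))  # 8 2
```

In both cases the test examples come from the target user; only the source of the training examples changes.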
Results
2010 Study using Impersonal Model (IB3 Method)
72.4% accuracy (rows: actual class; columns: predicted class)

Actual        Walking  Jogging  Stairs  Sitting  Standing  Lying Down
Walking          2209       46     789        2         4           0
Jogging            45     1656     148        1         0           0
Stairs            412       54     869        3         1           0
Sitting            10        0      47      553        30         241
Standing            8        0      57        6       448           3
Lying Down          5        1       7      301        13         131
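The 72.4% figure can be recomputed from the confusion matrix counts. The sketch below assumes rows are actual classes and columns are predicted classes, in the order Walking, Jogging, Stairs, Sitting, Standing, Lying Down; the function names are illustrative.

```python
LABELS = ["Walking", "Jogging", "Stairs", "Sitting", "Standing", "Lying Down"]

# Impersonal IB3 confusion matrix from the slide (rows: actual, columns: predicted).
CONFUSION = [
    [2209,   46, 789,   2,   4,   0],
    [  45, 1656, 148,   1,   0,   0],
    [ 412,   54, 869,   3,   1,   0],
    [  10,    0,  47, 553,  30, 241],
    [   8,    0,  57,   6, 448,   3],
    [   5,    1,   7, 301,  13, 131],
]


def overall_accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_accuracy(matrix):
    """For each actual class, the fraction of its examples predicted correctly."""
    return [matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]


print(f"Overall: {100 * overall_accuracy(CONFUSION):.1f}%")  # Overall: 72.4%
for label, acc in zip(LABELS, per_class_accuracy(CONFUSION)):
    print(f"{label}: {100 * acc:.1f}%")
```

The off-diagonal counts show where the impersonal model struggles: Walking is often confused with Stairs, and Sitting with Lying Down.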
2010 Study using Personal Model (IB3 Method)
98.4% accuracy (rows: actual class; columns: predicted class)

Actual        Walking  Jogging  Stairs  Sitting  Standing  Lying Down
Walking          3033        1      24        0         0           0
Jogging             4     1788       4        0         0           0
Stairs             42        4    1292        1         0           0
Sitting             0        0       4      870         2           6
Standing            5        0      11        1       509           0
Lying Down          4        0       8        7         0         442
2010 Study Accuracy Results
% of Records Correctly Classified

                   Personal              Universal        Straw
Activity       IB3    J48    NN      IB3    J48    NN     Man
Walking        99.2   97.5   99.1    72.4   77.3   60.6   37.7
Jogging        99.6   98.9   99.9    89.5   89.7   89.9   22.8
Stairs         96.5   91.7   98.0    64.9   56.7   67.6   16.5
Sitting        98.6   97.6   97.7    62.8   78.0   67.6   10.9
Standing       96.8   96.4   97.3    85.8   92.0   93.6    6.4
Lying Down     95.9   95.0   96.9    28.6   26.2   60.7    5.7
Overall        98.4   96.6   98.7    72.4   74.9   71.2   37.7
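The Straw Man column is consistent with a baseline that ignores the sensor data and always predicts a single activity: each row matches the share of records belonging to that activity, and the overall figure matches always predicting the most common one. This reading is an inference from the numbers, not something the slides state. The sketch below reproduces the column using class counts taken as the row totals of the impersonal IB3 confusion matrix shown earlier; the names are illustrative.

```python
# Records per actual class (row totals of the impersonal IB3 confusion matrix).
CLASS_COUNTS = {
    "Walking": 3050,
    "Jogging": 1850,
    "Stairs": 1339,
    "Sitting": 881,
    "Standing": 522,
    "Lying Down": 458,
}
TOTAL = sum(CLASS_COUNTS.values())


def always_predict_accuracy(label: str) -> float:
    """Accuracy of a classifier that predicts `label` for every record."""
    return CLASS_COUNTS[label] / TOTAL


# The overall straw man always predicts the most frequent activity.
majority = max(CLASS_COUNTS, key=CLASS_COUNTS.get)

for label in CLASS_COUNTS:
    print(f"{label}: {100 * always_predict_accuracy(label):.1f}%")
print(f"Overall ({majority}): {100 * always_predict_accuracy(majority):.1f}%")  # 37.7%
```

Any useful model should beat this baseline, which the universal models do overall (roughly 71% to 75% against 37.7%).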
2016 Study Universal Models

Algorithm   Phone accel (%)   Watch accel (%)   Watch gyro (%)
RF               35.1              70.3              57.5
J48              24.1              59.3              49.6
IB3              22.5              62.0              49.3
NB               26.2              63.8              53.5
MLP              18.9              64.6              57.7
Average          25.3              64.0              53.5

Note: based on 18 activities
2016 Study Personal Models

Algorithm   Phone accel (%)   Watch accel (%)   Watch gyro (%)
RF               75.5              93.3              79.0
J48              65.5              86.1              73.0
IB3              67.7              93.3              60.1
NB               77.1              92.7              80.2
MLP              77.0              94.2              70.0
Average          72.6              91.9              72.4
2016 Detailed Summary Results (Random Forest)

                     Impersonal (%)             Personal (%)
Activity         Watch   Phone   Watch      Watch   Phone   Watch
                 accel   accel   gyro       accel   accel   gyro
Walking           79.8    60.7    87.0       94.2    88.5    93.5
Jogging           97.7    93.8    48.6       99.2    68.8    98.1
Stairs            58.5    66.7    43.1       88.9    66.7    80.0
Sitting           84.9    26.9    70.5       97.5    87.0    82.2
Standing          96.3    65.9    57.9       98.1    73.1    68.6
Kicking           71.3    72.5    41.4       88.7    91.7    67.9
Dribbling         89.3    26.1    86.0       98.7    84.8    96.9
Catch             66.0    26.1    68.9       93.3    78.3    94.6
Typing            80.4    76.9    60.8       99.4    72.0    88.6
Handwriting       85.2    12.9    63.1      100.0    75.9    80.5
Clapping          76.3    40.9    67.9       96.9    77.3    95.6
Brush Teeth       84.5    19.2    66.2       97.3    96.2    89.6
Fold Clothes      80.8     8.3    37.8       95.0    79.2    73.1
Eat Pasta         47.1     0.0    57.9       88.6    40.0    72.9
Eat Soup          52.7     0.0    47.7       90.7    82.4    69.8
Eat Sandwich      29.0     7.1    31.1       68.9    63.0    44.2
Eat Chips         65.0    16.0    50.6       83.4    76.0    52.5
Drink             62.7    31.8    61.1       93.3    77.3    78.5
Overall           70.3    35.1    57.5       93.3    75.5    79.0
Actitracker
 The phone-based research was incorporated
into a deployed app/system called Actitracker
 The development effort to handle real-time
activity recognition was substantial
 Actitracker is no longer supported
New Directions
 My WISDM Lab is finishing work on the
smartwatch activity recognition
 Beginning to consider data mining of static
sensors since cheap Bluetooth sensors are
now available
 Research related to the Internet of Things (IoT)
Data Collection
 Collecting the data is quite time intensive
 We are still collecting data for a “definitive” set
of AR experiments, so if you want to
volunteer, please email me.
 Data collection usually at RH
 Will provide an Amazon gift card