The Pennsylvania State University
The Graduate School
Department of Industrial and Manufacturing Engineering
A PROPOSED DATA MINING DRIVEN METHODOLOGY FOR MODELING HUMAN
GAIT AND GEOSPATIAL TRAJECTORIES
A Thesis in
Industrial Engineering
by
Yixiang Han
© 2013 Yixiang Han
Submitted in Partial Fulfillment
of the Requirements
for the Degree of
Master of Science
August 2013
The thesis of Yixiang Han was reviewed and approved* by the following:
Conrad S. Tucker
Assistant Professor of Industrial Engineering
Thesis Advisor
Timothy W. Simpson
Professor of Industrial Engineering
Paul Griffin
Professor of Industrial Engineering
Head of the Department of Industrial Engineering
*Signatures are on file in the Graduate School
ABSTRACT
Less than 35% of human communication is verbal (hearing, listening, etc.), whereas
greater than 65% of human communication is nonverbal (body posture, facial expressions, etc.).
By analyzing nonverbal human communication instead of just verbal communication, researchers
may be able to perceive latent human features such as body language, neurological patterns, etc.,
otherwise missed through verbal communication alone. In this thesis, human kinematics (i.e.,
human gait and geospatial trajectory) is modeled and analyzed so as to perceive and predict
human behavior and kinematic patterns. A data mining driven methodology is proposed for
modeling and predicting both human gait (i.e., human walking posture) and human geospatial
trajectory (i.e., a sequence of geospatial locations from a moving individual in an indoor space).
The human gait mining component of the proposed methodology captures multimodal gait data in
order to model and predict neurological patterns that influence human gaits. The human trajectory
mining component of the methodology aims to predict common regions of interest (CRI) in
indoor design spaces by modeling geospatial trajectory patterns. A Parkinson’s disease (PD)
detection case study is used to validate the human gait component of the methodology, and an
engineering design case study involving students working in teams is used to validate the human
trajectory component. Analyzing human gait and geospatial trajectory is intended to reduce human
variation and recognize desired patterns in both, so as to
evaluate human movement characteristics and understand human movement dynamics.
TABLE OF CONTENTS
List of Figures
List of Tables
Acknowledgements
Chapter 1 Introduction
Chapter 2 Literature Review
2.1 Existing Techniques for Modeling Human Movement
2.1.1 Existing Human Gait Modeling
2.1.2 Existing Human Geospatial Trajectory Modeling
2.2 Data Mining based Human Movement Modeling
2.2.1 Data Mining based Human Gait Modeling
2.2.2 Data Mining based Human Geospatial Trajectory Modeling
Chapter 3 Methodology
3.1 Human Gait Modeling Methodology
3.1.1 Step 1: Sensor Data Acquisition
3.1.2 Step 2: Data Preprocessing
3.1.3 Step 3: Data Mining Knowledge Discovery
3.1.4 Step 4: Model Performance Evaluation and Application
3.2 Geospatial Trajectory based Human Motion Modeling Methodology
3.2.1 Step 1: Data Acquisition
3.2.2 Step 2: Data Transfer
3.2.3 Step 3: Data Mining Knowledge Discovery
3.2.4 Step 4: Model Visualization
Chapter 4 Case Studies and Discussion
4.1 Parkinson’s Disease Detection based Case Study
4.1.1 PD Data Acquisition and Preprocessing
4.1.2 PD-based Data Mining Knowledge Discovery and Evaluation
4.2 Geospatial Trajectory Clustering
4.2.1 Geospatial Trajectory Data Acquisition and Preprocessing
4.2.2 Geospatial Trajectory based Knowledge Discovery and Explanation
Chapter 5 Conclusions and Future Work
References
List of Figures
Figure 3-1. Framework of the proposed human gait based methodology.
Figure 3-2. Skeletal image with 20 nodes and example data from Shoulder_Center node.
Figure 3-3. Framework of geospatial trajectory modeling.
Figure 4-1. PD forward walking experiment overhead view.
Figure 4-2. The learning factory layout.
Figure 4-3. Extracted characteristic points of User 1.
Figure 4-4. Visualization of trajectory partitioning for User 1.
Figure 4-5. Clustering visualization.
Figure 4-6. Clustering visualizations in the first period.
Figure 4-7. Clustering visualizations in the second period.
Figure 4-8. Clustering visualizations in the third period.
List of Tables
Table 3-1. Confusion matrix example.
Table 4-1. Algorithms performances in walking experiment.
Table 4-2. Other evaluations among multiple algorithms in walking experiment.
Table 4-3. Original trajectory of User 1.
Table 4-4. Example result based on clustering algorithm.
Table 4-5. Result of clustering algorithm.
Acknowledgements
It is my pleasure to thank everyone that helped make my thesis possible.
I would like to express the deepest appreciation to my advisor, Dr. Conrad S. Tucker. He
patiently provided the guidance, motivation, remarks and useful comments for me to proceed
through not just my Master's study and the learning process of this master's thesis but my overall
academic and professional career as well. He has shaped my growth and development regarding
my research and scholarship. Without his tremendous mentorship and persistent help this thesis
would not have been possible.
I would also like to offer my special thanks to my thesis committee member, Dr. Timothy
W. Simpson, for his guidance, encouragement, insightful comments, and immensely helpful
suggestions. His guidance has served me well and I owe him my heartfelt appreciation for taking
the time to advise me through the process of writing this thesis.
I thank my fellow lab mates in the Design Analysis Technology Advancement
(D.A.T.A.) Lab in the School of Engineering Design, Technology and Professional Programs for
their great support and enlightenment. Their friendship and assistance have meant more to me than
I can express in words. Thank you all for your patience and friendly assistance.
It has been a great pleasure to be a student in the Harold and Inge Marcus Department of
Industrial and Manufacturing Engineering at the Pennsylvania State University at University
Park. I deeply thank Dr. Paul Griffin, the department head, Dr. M. Jeya Chandra, the graduate
program coordinator, and all other members of the department and the university.
Last but not least, I would like to thank my family, Xiaokao Han and Fengmei Li, for
giving birth to me and supporting me throughout my life. I love you all my life.
Chapter 1
Introduction
Research has shown that more than 65% of human communication is non-verbal (e.g.,
posture, gesture) while about 35% is considered verbal (e.g., speech, discussion) [1]. The verbal
human communication component conveys a large volume of information, but may miss latent
aspects such as body posture, facial gestures, etc. that may provide researchers with added
dimensions of knowledge. Within research pertaining to nonverbal communication, human
movement behavior modeling is gaining significant interest across research domains ranging
from public security surveillance to human movement-based disease diagnosis [2–4]. By
analyzing human motion, researchers are able to compare and evaluate human movement
characteristics in order to capture movement patterns and understand human dynamics.
The objectives of analyzing human gait behavior and geospatial trajectory behavior are
both important and complementary. Human gait is defined as the act of self-propulsion achieved
by using the human extremities [5]. Human geospatial trajectory is defined as a sequence of geospatial
locations recorded from a moving individual in order to recover human motion in a given space [6].
Human gait behavior analysis focuses on human motion analysis including human body segments
(e.g., posture detection) while geospatial trajectory behavior analysis focuses on human trajectory
movement analysis in a given space without considering specific parts of the human body
structure.
There is already a wide spectrum of applications based on human gait modeling such as
athletic performance evaluation, medical diagnosis, public security surveillance and video
conferencing, etc. [7-8]. Analyzing human gait behaviors helps capture and recognize human
movement characteristics and interesting gait patterns that could be taken as evidence for
different targets and applications. For example, potential Parkinson’s disease (PD) patients are
typically diagnosed from specific symptoms, such as muscle rigidity, vocal problems, and gait
disorders, observed through several kinematics experiments based on published criteria [8–10]. Another
example is that swimmers analyze video recordings of other swimmers in order to learn about
performance indicators such as basic speed, stroke mechanics, and starting and turning abilities, which
could be helpful in their personal training [12-13]. Other similar case studies can be found in
other domains such as tennis, basketball and airport security surveillance [13–15].
In addition to human gait analysis, human geospatial trajectory methodologies focus on
human movement within a given space to capture geospatial position, velocity, time, acceleration,
etc., without considering human body segments (i.e., the human body is considered as one point, and
different segments are assumed to have the same movement status). The objective is to detect
geospatial movement patterns such as the trajectory shape, common regions of interest
(CRI) and density of multiple trajectories in a specific setting. For example, researchers have
proposed methodologies to detect overcrowded situations in an indoor space (e.g., shopping malls,
career fairs, railway stations, etc.) so as to provide event alarms and better reorganized layouts
[16]. Furthermore, this tracking strategy could also contribute to traffic control in order to
relocate different facilities and promote user experience [17]. Other similar examples can be
found in [19-20].
The aforementioned methodologies for modeling human gait and geospatial trajectories
are usually achieved by a number of different human body models, which range from stick
figures, ribbon-based 2-D contour, to 3-D volumetric human models. However, there are some
limitations that must be addressed [2], [20–22]. First, there is no human motion variation (i.e.,
human size) included in the applied standard 2-D or 3-D models. For example, different
individuals may have different heights, weights, and other parameters which could lead to
variations in human motion modeling and consequently, affect the predictive accuracy. In this
thesis, the human variation is addressed by introducing ratio components for position, velocity,
and acceleration between each pair of joints. In addition, joint correspondence (i.e., identifying
every joint in successive frames) is required in certain 2-D and 3-D models, which may restrict
the modeling flexibility and make it only applicable to some motion types (e.g., simple walking).
In this thesis, each joint in the 3-D model is detected automatically by using a multimodal sensor.
In order to mitigate these challenges, a data mining driven methodology is proposed to model
human gait and geospatial trajectory in order to normalize and categorize various human gaits and
understand utilization density in an indoor space.
The rest of this thesis is organized as follows. This Chapter provides an introduction and
background relating to human motion analysis. Chapter 2 describes previous work related to the
research topic, discusses pros and cons of these methodologies, and contrasts them to the
methodology proposed in this thesis. The human gait and geospatial trajectory components of the
methodology are introduced in Chapter 3 with results and discussions presented in Chapter 4.
Chapter 5 concludes the thesis and identifies future research expansions.
Chapter 2
Literature Review
This chapter discusses the past research that is closely related to the topic in this thesis.
The literature review begins by discussing the performances of various existing human motion
modeling methodologies in section 2.1. Data mining based human motion modeling
methodologies are then discussed in section 2.2. As part of this section, various classification
algorithms that are most relevant to this study topic are discussed and compared with emphasis on
extracting significant motion features in both human gait and trajectory classification and
prediction.
2.1 Existing Techniques for Modeling Human Movement
Existing methodologies proposed to model human movement focus on human motion
tracking without system-based automatic motion recognitions and classifications (e.g., public
security surveillance system) [23–25]. For example, existing passive surveillance systems (e.g.,
Closed Circuit Television (CCTV) cameras) can only track human movements and require well-trained
camera operators to manually view the video feed so as to recognize any suspicious act. In
this section, multiple existing modeling methodologies are discussed relating to both human gait
modeling (see Section 2.1.1) and geospatial trajectory modeling (see Section 2.1.2).
2.1.1 Existing Human Gait Modeling
An approach to modeling human gait motion is based on 2-D human stick figures, which
consist of multiple joints or nodes connected by multiple line segments. This “skeletal”
representation of the human body could be a significant aid to help track and estimate human gait.
One example is to model human gait based on moving light display (MLD) [23]. In this
methodology, the human body kinematics is modeled using 12 MLD lights representing the head,
shoulders, hips, elbows, wrists, knees, and ankles. This MLD-based model can help translate 3-D
human gait into 2-D projections during different human motion experiments. Other similar
examples can be found in [25], [27-28]. However, the joint correspondence required in human
gait modeling is the most challenging and complex part since each joint requires node-to-node
correspondence between successive frames. In addition, the 2-D projection provided by these
methodologies cannot provide depth data (i.e., only X and Y) for each joint and is not capable of
describing real-world 3-D human movements. Depth data is needed for more accurate 3-D
modeling. These shortcomings are addressed in the proposed methodology by collecting 3-D joint
data (i.e., X, Y and Z coordinates) with the applied multimodal sensor and storing the data
in a database (the data structure is discussed in more detail in Chapter 3). Once the data is stored in a
database, researchers are able to extract information or run predictive models on it, thus
making it possible to model and predict human gait, as demonstrated later in this thesis.
Another approach to tracking and recognizing human motion is based on the application
of 2-D contour modeling. The objective in this methodology is to model human gait by adding
human outlines, which is more precise than just applying “skeletal” models in previously
discussed approaches [27-28]. For example, human gait could be modeled based on a ribbon-based
2-D model without putting markers on the human body [28]. In this model, there are eight
joints included in the 2-D model, which represent the shoulders, elbows, hips, and knees, respectively.
By comparing the ribbons between two frames, the moving ribbons can
be extracted. The resulting parameter curves recover motion characteristics in different body parts.
Other similar 2-D ribbon-based examples can be found in [26,30]. However, human size
variation during gait analysis is not considered. In addition, there are body structure constraints
that may restrict the modeling flexibility and make it only applicable to some simple motion types
(e.g., simple walking). In order to reduce human size variation and model other human motion
types, the proposed data mining driven methodology is introduced in Chapter 3.
Compared with the previous 2-D human gait modeling, 3-D modeling has several
advantages, such as viewing each joint independently based on the 3-D angle parameters and
modeling other complex and unconstrained human movements [30]. There are usually two
components included in 3-D models, classified as “skeletal” figures and surrounding tissues [30].
For instance, a 22-DOF model is applied to construct the skeleton of the human body, such as the arms,
legs and torso. Then cylinders, spheres and other primitives are applied to generate the 3-D
model [31]. However, the 3-D model is still incapable of addressing human gait variations
since the parameters in the 3-D model are unchanged. In addition, the included body constraints may
reduce the model flexibility in complex motions other than simple walking. Other
similar 3-D models can be found in [32–34].
2.1.2 Existing Human Geospatial Trajectory Modeling
Human trajectory analysis addresses geospatial positions without considering the human
body structure. Some studies have considered qualitative methodologies to collect geospatial
trajectory patterns such as visual observations, interviews, and questionnaires [35], [37-38]. For
instance, human geospatial trajectory analysis has been conducted based on a questionnaire at the Osaka
Science Museum in Japan [35]. In the experiment, each volunteer was asked to fill out a
questionnaire after the tour about their interactions with different robots. By analyzing popular
types of robots and the amount of time spent on each one based on the feedback provided in the
questionnaires, researchers were able to discover that there was no difference in preference between males and
females and no correlation between the age of visitors and the popularity of robots. However,
there are subjective biases among volunteers, since different people may give different feedback on the
same robot. In addition, this methodology requires a lot of time for human collaboration. Finally,
there could be potential privacy problems during the data collection process. For example,
confidential information such as name, age and phone number might be collected and accessed.
Other examples can be found in [37-38].
Other approaches to modeling human geospatial movement are based on technologies such
as Bluetooth, Wi-Fi, GPS and video camera recording, which can record precise time, location
and trajectory data. For example, a Bluetooth sensor was installed in one of the busiest regions
inside the Louvre museum in order to record the number of visitors and collect geospatial
positions with corresponding times [38]. The researchers concluded that there are strong
connections between Samothrace and Hall access, and between Hall access and the big gallery,
which could help explain the most frequent trajectory patterns [34]. In their work, visitors’
trajectories could be clearly described, and correlations between different nodes could also be
discovered. However, the main limitation is that the busiest nodes are predetermined by officers
(i.e., domain knowledge is included). In addition, this methodology may only be able to
model one-directional trajectories, while other types may not be supported (e.g., circular or
back-and-forth trajectories). Other similar examples can be found in [40-41].
2.2 Data Mining based Human Movement Modeling
Instead of just tracking and modeling individual gait and geospatial trajectories, high-level
objectives such as comparing and recognizing both different human gait features and
geospatial movement patterns could provide additional insights and discoveries. The
methodology aims to generate clusters for similar features that would finally provide reliable
guidance to recognize human activities for new untested people. In order to achieve this goal,
data mining based methodologies are discussed in this section for both human gait analysis and
human geospatial trajectory analysis.
2.2.1 Data Mining based Human Gait Modeling
Charayaphan and Marble proposed a data mining based methodology to detect hand
motion and classify different hand signs [41]. In the methodology, a frame grabber with an IBM PC
was applied to help extract multiple frames of hand motions without applying any 2-D or 3-D
models. The hand detection is accomplished by comparing the gray-scale difference between two
successive frames, followed by hand sign classification based on stop position. Another similar
example can be found in [42]. Polana and Nelson proposed a methodology that does not require
joint correspondence or tracking of specific parts of an individual as mentioned in Section 2.1 [43].
Instead, human motion is tracked from the moving pixels, followed by spatial translation where
the image frames are reduced to the same size as the object. Finally, the generated spatial
gray-valued frame set at each time point t is considered as the feature vector to compare with
the reference motion set based on the K nearest neighbors (KNN) algorithm [43]. Heisele
proposed another methodology to model human gait based on Color Cluster Flow (CCF) [44]. In
this methodology, the pedestrian is represented by two initial color-based clusters (i.e., lilac
jacket and blue pants). The trajectories of these two clusters are considered as the approximation
of real human motion. Comparing the cluster trajectories to the reference trajectories helps
identify the real human motion. However, the main limitation of these methodologies is that they
can only be applied to periodic and simple motion types (e.g., parallel walking), and have lower
predictive accuracies in complex motions such as rotation of the arms or legs. In addition,
there is usually a predefined reference motion set required before the motion modeling.
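
The frame-differencing idea that underlies the hand detection and moving-pixel tracking described in this section can be illustrated with a minimal sketch. It is not the implementation used in the cited works: the gray-level threshold, the nested-list frame layout, and the function names moving_pixel_mask and bounding_box are all assumptions made here for illustration.

```python
def moving_pixel_mask(previous_frame, current_frame, threshold=20):
    """Mark pixels whose gray level changed by more than `threshold`.

    Frames are nested lists of gray levels (rows of ints); the threshold
    value is an illustrative assumption, not a value from the cited works.
    """
    return [
        [abs(cur - prev) > threshold for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous_frame, current_frame)
    ]


def bounding_box(mask):
    """Smallest (top, left, bottom, right) box around all moving pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, moving in enumerate(row) if moving]
    if not rows:
        return None
    return min(rows), min(cols), max(rows), max(cols)


# Synthetic 4x4 frames: a bright 2x2 patch appears between the two frames.
frame_a = [[0] * 4 for _ in range(4)]
frame_b = [row[:] for row in frame_a]
for r in (1, 2):
    for c in (2, 3):
        frame_b[r][c] = 200

print(bounding_box(moving_pixel_mask(frame_a, frame_b)))
```

Cropping each frame to such a box is one plausible way to obtain frames "reduced to the same size as the object" before they are compared with a reference motion set.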
2.2.2 Data Mining based Human Geospatial Trajectory Modeling
Johnson [45] proposed a methodology to model human trajectory based on probability
density function (pdf) modeling. In the methodology, the human trajectory is described as a
sequence of flow vectors denoted by Q = {f_1, f_2, ..., f_n}, where n is the total number of images
captured for one subject. Then a learning network is applied to model the pdf to classify n input
data nodes into k output nodes (k and n are predetermined parameters) based on nearest distance.
Fu [46] models and predicts human geospatial trajectory by measuring the similarity between two
individual trajectories based on a similarity matrix. Then a two-layer clustering algorithm is
employed where the dominant paths and routes are generated in the first layer. Then the Tightness
& Separation Criterion (TSC) is applied to quantitatively evaluate the clustering results.
However, the main shortcoming of these methodologies is that they only deal with trajectory
clustering without detecting common regions of interest (CRI) to help explain the different
motivation behind each individual’s activities. In addition, they may only perform well on
single-directional trajectories and cannot be applied in other cases such as circular or rectangular
trajectories.
Having reviewed the research problem, motivation, previous related methodologies and
corresponding pros and cons, a data mining driven methodology is proposed to overcome the
aforementioned limitations in both human gait analysis and human geospatial
trajectory analysis and extract significant movement patterns to compare and understand human
gait and geospatial activities. This methodology is introduced in the next chapter.
Chapter 3
Methodology
Based on the background and explanation established, the proposed data mining driven
methodology for modeling human gait and geospatial trajectories is presented in detail in this
section. The proposed methodology aims to overcome the human motion variation by adding the
ratios of position, velocity and acceleration between each pair of joints. There are four steps
included in the methodology. In the first step, data acquisition is conducted in order to collect
human gait and geospatial trajectory data. In the second step, a data preprocessing technique
is proposed for data cleaning and transfer. In Step 3, data mining algorithms are employed to
model and extract human movement features so as to explore common movement patterns. Step 4
of the proposed methodology outlines a validation and evaluation framework that helps determine
the robustness of the proposed methodology. The human gait based methodology is discussed in
Section 3.1 followed by the human geospatial trajectory based methodology in Section 3.2.
3.1 Human Gait Modeling Methodology
The human gait modeling methodology aims to capture multimodal gait data in order to
model and predict neurological patterns that influence human gait. The Human Gait Modeling
component of the methodology is partitioned into four steps: data acquisition (Step 1), data
preprocessing (Step 2), data mining knowledge discovery (Step 3), and model evaluation and
application (Step 4), as shown in Fig. 3-1. Step 1 discusses how to set up experiments and collect human
gait data based on the multimodal sensor employed in this work. Step 2 discusses how to
preprocess the collected data and store it on a server. Step 3 discusses how to extract correlated
features from the generated data set and apply these features to model, recognize and evaluate
human gait patterns. In Step 4, the trained model can be evaluated and extended into different
domains to help recognize and compare human gait patterns. For example, Step 4 in Fig. 3-1
could apply the proposed human gait modeling to human movement related disease detection (e.g.,
Parkinson’s disease detection). Similar applications would be threat detection and athletic
performance evaluation.
Figure 3-1. Framework of the proposed human gait based methodology.
3.1.1 Step 1: Sensor Data Acquisition
The first step of human gait modeling outlines the experiment setup for collecting human
gait data. In this step, the overall body movement (i.e., human gait) is captured through a
multimodal sensor system including an RGB video camera and an infrared depth sensor. In the sensor
data acquisition, the human body gait is modeled and captured based on the “skeletal” model
shown in Fig. 3-2, where a total of twenty joints are represented by the black circles. Compared with
the modeling methodologies mentioned in the Literature Review, this multimodal sensor is able
to automatically recognize each of the twenty joints without placing sensors on the human body. In
addition, position, velocity and acceleration ratios are created between each pair of joints to
help normalize the shape variation existing in a population. By utilizing the multimodal sensor, this
virtual skeletal model is able to capture movements of joints in a 3-D environment (i.e., X, Y and Z
coordinates) in real time while preserving privacy, which is sometimes a desirable feature in
human gait modeling (e.g., human gait based Parkinson’s disease detection).
In this research study, the Microsoft Kinect sensor is used for data collection. It is
capable of tracking human motion by applying a virtual skeletal image similar to the one shown in Fig. 3-2.
The hardware is able to capture a frame of human gait approximately every 33 ms and
generates a data file of about 3 MB in 4 s.
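
The capture figures quoted above imply a frame rate and data rate that can be worked out directly. The short calculation below is only a rough estimate: it assumes decimal megabytes and treats the quoted 33 ms, 3 MB and 4 s as exact.

```python
# Rough throughput estimate from the figures quoted in the text.
# Assumption: 1 MB = 1000 KB, and the quoted values are treated as exact.
FRAME_INTERVAL_S = 0.033   # one frame roughly every 33 ms
CAPTURE_WINDOW_S = 4.0     # duration covered by one data file
FILE_SIZE_MB = 3.0         # approximate size of that data file

frames_per_second = 1.0 / FRAME_INTERVAL_S
frames_per_file = CAPTURE_WINDOW_S / FRAME_INTERVAL_S
megabytes_per_second = FILE_SIZE_MB / CAPTURE_WINDOW_S
kilobytes_per_frame = FILE_SIZE_MB * 1000.0 / frames_per_file

print(f"{frames_per_second:.1f} frames/s, {frames_per_file:.0f} frames per file")
print(f"{megabytes_per_second:.2f} MB/s, {kilobytes_per_frame:.1f} KB per frame")
```

Under these assumptions the sensor delivers roughly 30 frames per second, so a 4 s recording holds on the order of 120 frames at about 25 KB each.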
Figure 3-2. Skeletal image with 20 nodes and example data from Shoulder_Center node.
3.1.2 Step 2: Data Preprocessing
In addition to the initial X, Y, Z position data, the velocity and acceleration of each
joint are calculated by taking the derivative of position and velocity, respectively, which creates additional
features in the raw data. In order to reduce human gait variation (e.g., longer legs may produce
a longer stride), the ratios of position, velocity, and acceleration between each pair of joints
are also generated.
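
The preprocessing described above can be sketched as follows. This is an illustrative sketch rather than the thesis implementation: it assumes finite differences over a fixed 33 ms frame spacing as the derivative, and it assumes that a "ratio" between two joints means the ratio of vector magnitudes per frame, which the text does not specify. The joint name Hip_Center and all function names are hypothetical.

```python
import math
from itertools import combinations

FRAME_INTERVAL_S = 0.033  # approximate frame spacing quoted for the sensor


def derivative(series, dt=FRAME_INTERVAL_S):
    """Finite-difference derivative of a list of (x, y, z) samples."""
    return [
        tuple((b - a) / dt for a, b in zip(p0, p1))
        for p0, p1 in zip(series, series[1:])
    ]


def magnitude(vec):
    return math.sqrt(sum(c * c for c in vec))


def ratio_series(series_a, series_b, eps=1e-9):
    """Per-frame ratio of vector magnitudes (assumed definition of 'ratio')."""
    return [magnitude(a) / (magnitude(b) + eps) for a, b in zip(series_a, series_b)]


def build_features(joint_positions):
    """joint_positions maps a joint name to its list of (x, y, z) positions."""
    velocity = {j: derivative(p) for j, p in joint_positions.items()}
    acceleration = {j: derivative(v) for j, v in velocity.items()}
    features = {}
    for a, b in combinations(sorted(joint_positions), 2):
        features[f"pos_ratio:{a}/{b}"] = ratio_series(joint_positions[a], joint_positions[b])
        features[f"vel_ratio:{a}/{b}"] = ratio_series(velocity[a], velocity[b])
        features[f"acc_ratio:{a}/{b}"] = ratio_series(acceleration[a], acceleration[b])
    return velocity, acceleration, features


# Synthetic example: two joints moving along X at different constant speeds.
positions = {
    "Shoulder_Center": [(0.033 * i, 1.0, 2.0) for i in range(4)],
    "Hip_Center": [(0.0165 * i, 0.5, 2.0) for i in range(4)],  # hypothetical joint name
}
velocity, acceleration, features = build_features(positions)
print(sorted(features))
```

With twenty joints there are 190 joint pairs, so this scheme yields 570 ratio series in addition to the per-joint velocity and acceleration, which is one reason a feature selection step is needed afterwards.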
Since not all features generated previously are expected to have the same predictive
power to the response variable, only the most relevant features corresponding to the response
variable should be selected in order to obtain more insights to capture and distinguish human gait
patterns, which leads to the feature space reduction [52]. Since multiple data mining algorithms
are applied in this study, a candidate feature selection algorithm should be independent of the
multiple classifiers while maintaining good performance [52]. In this thesis, Correlation-based
Feature Selection (CFS) is selected, where the feature set most relevant to the corresponding
output variable is selected with minimum correlation inside the feature set [48]. In the CFS
methodology, the correlation between relevant features (i.e., the features included in the relevant
feature set) and irrelevant features (i.e., the features not included in the relevant feature set) is a
function of the number of components inside the feature set, the average inner-correlation
among the inside features, and the average correlation between the inside components and the outside
features, as shown in Equation 3.1. More technical details can be found in [48].
$$ r_{zc} = \frac{k \, r_{zi}}{\sqrt{k + k(k-1) \, r_{ii}}} $$
(Equation 3.1)
where,
$r_{zc}$: is the correlation between the current relevant features and the potential relevant features.
$k$: is the number of features.
$r_{zi}$: is the average correlation between the relevant features and the potentially relevant features.
$r_{ii}$: is the average correlation between two relevant features.
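As a worked illustration of Equation 3.1, the sketch below evaluates the CFS merit for hypothetical correlation values. The function name `cfs_merit` and the numbers used are assumptions for illustration only.

```python
import math


def cfs_merit(k, r_zi, r_ii):
    """Equation 3.1: merit of a k-feature subset.

    r_zi: average correlation of the subset's features with the outside variable.
    r_ii: average correlation among the features inside the subset.
    """
    return (k * r_zi) / math.sqrt(k + k * (k - 1) * r_ii)


# Four features that are uncorrelated with each other reinforce one another ...
independent = cfs_merit(k=4, r_zi=0.5, r_ii=0.0)
# ... whereas four perfectly redundant features are worth no more than one.
redundant = cfs_merit(k=4, r_zi=0.5, r_ii=1.0)
single = cfs_merit(k=1, r_zi=0.5, r_ii=0.0)
```

The comparison shows why the method prefers feature sets with low internal correlation: redundancy enlarges the denominator without enlarging the numerator.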
3.1.3 Step 3: Data Mining Knowledge Discovery
After data preprocessing is completed, the aim is to develop a function f(X) → Y that
maps the selected features X = (x1, x2, ..., xn), where n is the total number of features selected,
to the class variable Y. From a theoretical point of view, there are two types of data mining
learning methodologies: supervised learning and unsupervised learning. Supervised learning is
the machine learning task of inferring a function from labeled training data, while unsupervised
learning refers to the problem of finding hidden structure in unlabeled data [49]. Since
observations are labeled in human gait modeling, supervised learning is selected. In addition,
multiple data mining algorithms, including Binary Logistic Regression, Support Vector Machine,
C4.5, Random Forest, and IBK, are employed since they have been shown to perform well in
human gait classification problems [50–55]. Based on the performances of the different algorithms,
the most accurate and reliable model and partitioning criteria are identified.
Binary Logistic Regression
In Binary Logistic Regression, each selected input variable is given a coefficient
in order to formulate a function mapping the input variables to the output variable. Here the
coefficients can be considered indicators of predictive power. The equations are shown in
Eq. 3.2 and Eq. 3.3. By applying a linear combination of the features of a new observation,
the model estimates its probability of falling into one category. For example, in
the Parkinson’s disease (PD) case study presented in Chapter 4, one category is
PD patient and the other is control. More information about logistic regression can be
found in [50][51]. Logistic regression may help with the classification problem; however, its accuracy is
sometimes inconsistent since a linear combination of features may not be able to explain all the
variation in the response variable. In view of this limitation, the support vector machine is
introduced.
$$ f(x) = \beta^{T} x $$
(Equation 3.2)
$$ \beta^{*} = \arg\min_{\beta} \sum_{i=1}^{n} \left( f(x_i) - y \right)^2 $$
(Equation 3.3)
where,
$\beta$: is the coefficient and f(x) is the logistic function.
$x_i$: is the value of the ith feature for one observation.
$y$: is the value of the output variable.
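The way a fitted model turns the linear score of Eq. 3.2 into a class probability can be sketched as follows. The sigmoid transform is the standard logistic function and is an assumption beyond Eq. 3.2; the coefficients, the intercept, and the feature values are hypothetical.

```python
import math


def predict_probability(beta, x, intercept=0.0):
    """Probability of the positive class for one observation.

    The linear score beta^T x (Eq. 3.2) is passed through the
    standard logistic (sigmoid) function.
    """
    z = intercept + sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-z))


beta = [1.2, -0.7]    # hypothetical coefficients for two selected features
x_new = [0.9, 0.4]    # hypothetical feature values of a new observation
p_pd = predict_probability(beta, x_new)
label = "PD" if p_pd >= 0.5 else "control"
```

A coefficient with a larger magnitude shifts the score, and hence the probability, more strongly, which is why the coefficients can be read as indicators of predictive power.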
Support Vector Machine (SVM)
In addition to the logistic regression model, SVM is another available classifier, which works by
maximizing the margin between two different classes. In contrast to other data mining
algorithms, SVM gives more attention to the observations that are close to the partition boundary
between the classes, and it generates the separating boundary based on a kernel function
(shown in Eq. 3.4). In practice, SVMs are made robust by adding “slack variables” that
allow the training error to be non-zero. In addition, SVM is also able to transform the current
data to a higher dimensional space and construct the decision boundary there. Specific technical details
can be found in [52][53]. SVM may improve accuracy relative to logistic regression modeling;
however, it is sometimes difficult to explain the kernel function and the results of the algorithm since
it is a non-parametric technique, lacks transparency of results, and cannot represent the kernel
function as a simple parametric function of the input variables [56]. In order to overcome these
limitations while maintaining modeling accuracy and robustness, the C4.5 decision tree
algorithm is discussed.
$$ f(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} w_i x_i + b $$
(Equation 3.4)
where,
$w_i$: is the coefficient of the ith feature and f is the decision function.
$b$: is the bias (offset) term of the separating boundary.
$x_i$: is the value of the ith feature for one observation.
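A minimal sketch of how Eq. 3.4 is used for classification, and of what a slack variable measures, is given below. The hinge form of the slack, the class coding of +1 and -1, and all numeric values are assumptions for illustration.

```python
def decision_value(w, x, b):
    """Linear decision function of Eq. 3.4: sum_i w_i * x_i + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, x, b):
    """Class +1 on one side of the separating boundary, -1 on the other."""
    return 1 if decision_value(w, x, b) >= 0 else -1


def slack(w, x, b, y):
    """Slack of one training observation with true class y in {+1, -1}.

    Zero when the observation lies on the correct side with a margin of
    at least 1; positive when it is inside the margin or misclassified.
    """
    return max(0.0, 1.0 - y * decision_value(w, x, b))


w, b = [1.0, -1.0], 0.0                   # hypothetical fitted weights and bias
far_point = slack(w, [3.0, 1.0], b, +1)   # well outside the margin
near_point = slack(w, [1.5, 1.0], b, +1)  # correct side but inside the margin
wrong_side = slack(w, [3.0, 1.0], b, -1)  # misclassified observation
```

Only observations with non-zero slack or lying exactly on the margin influence the boundary, which matches the statement that SVM concentrates on observations near the partition boundary.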
C4.5
C4.5 is a well-established classification algorithm developed by Quinlan as an extension of his
ID3 algorithm of 1986 [54][55].
C4.5 is usually employed to classify one type of pattern in binary classification problems [54][55].
The algorithm comprises two main steps: (1) best attribute evaluation and (2) splitting point
selection. The attribute evaluation step attempts to select the most informative node in each
subset of the training data set (the whole training data set for the root node selection) based on the
maximum value of the gain ratio, which is calculated using Eq. 3.5 to Eq. 3.8. The
splitting point selection attempts to find the best numerical split point, that is, the one with the minimum
misclassification error, which is based on Eq. 3.6. More information about decision trees can be
found in [50], [59-60].
$$ \mathrm{Info}(D) = -\sum_{i=1}^{m} p_i \log_2(p_i) $$
(Equation 3.5)
$$ \mathrm{Info}_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times \mathrm{Info}(D_j) $$
(Equation 3.6)
$$ \mathrm{Gain}(A) = \mathrm{Info}(D) - \mathrm{Info}_A(D) $$
(Equation 3.7)
$$ GR(A) = \frac{\mathrm{Gain}(A)}{\mathrm{SplitInfo}(A)} = \frac{\mathrm{Info}(D) - \mathrm{Info}_A(D)}{-\sum_{j=1}^{v} \frac{|D_j|}{|D|} \log_2\left(\frac{|D_j|}{|D|}\right)} $$
(Equation 3.8)
where,
Info(D): is the expected information needed to classify a tuple in D.
D: is the data set.
m: is the total number of classes.
$p_i$: is the probability that an arbitrary sample belongs to class $C_i$ and is estimated by $|C_{i,D}| / |D|$.
$\mathrm{Info}_A(D)$: is the information needed to split D into v partitions by selecting the attribute A.
Gain(A): is the information gained by branching on an attribute A.
I(A): is the information of attribute A.
GR(A): is the gain ratio of attribute A.
SplitInfo(A): is the split information of attribute A.
$|D_j|$: is the number of instances in D that belong to the jth partition.
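Equations 3.5 to 3.8 can be traced on a small example. The sketch below computes the gain ratio of a categorical attribute; the attribute values and class labels are hypothetical, and returning zero when the split information is zero is a convention assumed here.

```python
import math
from collections import Counter


def info(labels):
    """Eq. 3.5: expected information (entropy) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels):
    """Eq. 3.6 to 3.8 for one attribute with categorical values."""
    n = len(labels)
    partitions = {}
    for v, y in zip(values, labels):
        partitions.setdefault(v, []).append(y)
    info_a = sum(len(p) / n * info(p) for p in partitions.values())      # Eq. 3.6
    gain = info(labels) - info_a                                         # Eq. 3.7
    split_info = -sum(len(p) / n * math.log2(len(p) / n)
                      for p in partitions.values())
    return gain / split_info if split_info > 0 else 0.0                  # Eq. 3.8


labels = ["PD", "PD", "control", "control"]
informative = gain_ratio(["slow", "slow", "fast", "fast"], labels)
uninformative = gain_ratio(["slow", "fast", "slow", "fast"], labels)
```

The attribute that separates the two classes perfectly receives the maximum gain ratio, so it would be chosen as the node; the attribute that leaves each partition evenly mixed receives zero.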
Random Forest (RF)
Random Forest retains many benefits of decision tree classification algorithms (such as
C4.5) while achieving better results through the use of bagging, random subsets of variables, and
a voting scheme [57]. First, M random cases are sampled with
replacement from the training data set for each tree, and N features are also randomly sampled to
help construct that tree (M and N are predetermined parameters). Second, the sampled input variables
and cases are used to generate a single tree following a procedure similar to C4.5. Finally, a large
number of trees are generated and they vote for the most popular class. This entire
procedure is called a random forest (RF). More details can be found in [61-62].
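The three steps above can be sketched in a few lines. To keep the example short, each tree is replaced by a one-level decision stump, which is a simplification and not the C4.5-style tree used in the thesis; all names and data values are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature_ids):
    """One-level tree: the (feature, threshold) with the fewest training errors."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]


def fit_forest(rows, labels, n_trees, m_cases, n_features, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Step 1: sample M cases with replacement and N features at random.
        idx = [rng.randrange(len(rows)) for _ in range(m_cases)]
        feats = rng.sample(range(len(rows[0])), n_features)
        # Step 2: grow one (simplified) tree on the sampled cases and features.
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest


def predict(forest, row):
    # Step 3: every tree votes and the most popular class wins.
    votes = [l_lab if row[f] <= t else r_lab for f, t, l_lab, r_lab in forest]
    return Counter(votes).most_common(1)[0][0]


rows = [[0.0, 0.0], [1.0, 0.1], [2.0, 0.2], [10.0, 5.0], [11.0, 5.1], [12.0, 5.2]]
labels = ["control", "control", "control", "PD", "PD", "PD"]
forest = fit_forest(rows, labels, n_trees=25, m_cases=6, n_features=1, seed=0)
```

Individual trees may be poor because of an unlucky bootstrap sample, but the vote over many trees is far more stable than any single tree.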
IBK
The IBK classifier is an instance-based machine learning algorithm based on K-Nearest
Neighbor (KNN). Instead of constructing explicit abstractions such as a logistic regression
model, a decision tree model, or an SVM model, IBK compares the similarity between the observations
in the training data set and the hold-out observations in the test data set. This algorithm assumes
that similar instances should have similar classifications. By computing the instance similarity
(shown in Eq. 3.9), IBK assigns each new instance to the class of its nearest neighbors.
More information is given in [59][53].
$$ \mathrm{Similarity}(x, y) = -\sqrt{\sum_{i=1}^{n} f(x_i, y_i)} = -\sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} $$
(Equation 3.9)
where,
$x_i$: is the value in one dimension of one observation.
$y_i$: is the value in one dimension of another observation.
n: is the total number of features in the feature space.
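A minimal sketch of nearest-neighbor classification with the similarity of Eq. 3.9 follows. The majority vote among the k most similar training observations is the usual KNN rule and is assumed here; the training points are hypothetical.

```python
import math
from collections import Counter


def similarity(x, y):
    """Eq. 3.9: negative Euclidean distance, so larger means more similar."""
    return -math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def knn_predict(train_x, train_y, query, k=1):
    """Majority class among the k training observations most similar to query."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: similarity(pair[0], query),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


train_x = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]]
train_y = ["control", "control", "PD", "PD"]
near_origin = knn_predict(train_x, train_y, [0.2, 0.1], k=3)
near_corner = knn_predict(train_x, train_y, [5.5, 5.0], k=1)
```

No model is fitted in advance; all the work happens at prediction time, which is what distinguishes instance-based learning from the previous classifiers.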
3.1.4 Step 4: Model Performance Evaluation and Application
After discussing human gait modeling, the next step is to evaluate model performance
based on multiple evaluation measures. Before employing the following performance metrics, k-fold cross
validation is employed. In k-fold cross validation, the whole data set is randomly partitioned
into k subsets (folds). Each time, k-1 folds form the training data set used to train the model
while the remaining fold forms the test data set used to validate performance. This procedure is repeated k
times so that each fold serves once as the test data set, and the performance is averaged and reported
using multiple evaluation measures. Based on the literature review, k is set to 10 [52] [60].
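The partitioning used in k-fold cross validation can be sketched as follows. The generator name, the fixed random seed, and the round-robin assignment of shuffled indices to folds are assumptions of this illustration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)           # random partition of the data set
    folds = [idx[i::k] for i in range(k)]      # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


splits = list(k_fold_indices(25, k=10))
```

Each observation appears in exactly one test fold, so every observation is used for validation once and for training k-1 times.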
The first evaluation is based on the confusion matrix (an example is shown in Table 3-1), which
contains four cells: (1) true positive, (2) false positive, (3) false negative, and (4) true negative.
These values are used to generate the Correctly Classified Instance (CCI), precision, recall,
F-measure, and ROC curve [61].
Table 3-1. Confusion matrix example.
                      Actual Status
Predicted Status      True                   False
True                  True Positive (TP)     False Positive (FP)
False                 False Negative (FN)    True Negative (TN)
The second evaluation measure, Correctly Classified Instance (CCI), gives the weighted
average accuracy of a model over the two categories. The calculation is shown in Eq. 3.10.
$$ CCI = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% $$
(Equation 3.10)
The third metric, the Kappa statistic (KS) [53][62], measures the proportion of correctly predicted
positive and negative cases after accounting for chance prediction. Generally, its value ranges from -1 to 1,
and the model is considered reliable when its value is between 0.8 and 1. In addition, KS ≤ 0.2
(poor); 0.2 < KS ≤ 0.6 (fair); 0.6 < KS ≤ 0.8 (substantial). The calculation is shown in Eq. 3.11.
$$ KS = \frac{p_0 - p_c}{1 - p_c} $$
(Equation 3.11)
where,
$p_0$: is the probability of total agreement.
$p_c$: is the probability of agreement because of chance.
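Equations 3.10 and 3.11 can be evaluated directly from the four cells of a confusion matrix, as sketched below. The chance agreement is computed from the row and column totals, which is the usual definition and is assumed here; the counts are hypothetical.

```python
def cci(tp, fp, fn, tn):
    """Eq. 3.10: percentage of correctly classified instances."""
    return (tp + tn) / (tp + tn + fp + fn) * 100.0


def kappa(tp, fp, fn, tn):
    """Eq. 3.11: agreement corrected for chance.

    Undefined (division by zero) when chance agreement equals 1,
    for example when every observation falls in a single cell.
    """
    n = tp + fp + fn + tn
    p0 = (tp + tn) / n
    # Chance agreement: product of predicted and actual marginals per class.
    pc = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (p0 - pc) / (1 - pc)


accuracy = cci(tp=40, fp=10, fn=10, tn=40)
ks = kappa(tp=40, fp=10, fn=10, tn=40)
```

In this example 80% of the instances are classified correctly, but half of that agreement would be expected by chance, so the Kappa statistic is only 0.6.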
Other evaluation measures, namely precision, recall, and F-measure, can also be
calculated from the confusion matrix. Precision and recall are related to Type I and Type
II errors, respectively: precision decreases as false positives increase, and recall decreases as
false negatives increase. The calculations are shown in Eq. 3.12 and Eq. 3.13. For example, if the
precision is greater than 0.95, then more than 95% of the observations that the model classifies as
positive are truly positive. F-measure is another performance indicator, and it is the harmonic
mean of precision and recall. Note that it reaches its best value at 1 and its worst value at 0.
The equation is shown in Eq. 3.14.
$$ precision = \frac{TP}{TP + FP} $$
(Equation 3.12)
$$ recall = \frac{TP}{TP + FN} $$
(Equation 3.13)
$$ F = 2 \times \frac{precision \times recall}{precision + recall} $$
(Equation 3.14)
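As a short worked example of Eq. 3.12 to Eq. 3.14, the sketch below uses hypothetical counts.

```python
def precision(tp, fp):
    """Eq. 3.12: share of predicted positives that are truly positive."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Eq. 3.13: share of actual positives that are detected."""
    return tp / (tp + fn)


def f_measure(p, r):
    """Eq. 3.14: harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


p = precision(tp=30, fp=10)   # 30 / 40
r = recall(tp=30, fn=20)      # 30 / 50
f = f_measure(p, r)
```

The F-measure lies between the two values but closer to the smaller one, so a model cannot obtain a high F-measure by doing well on only one of the two measures.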
The last evaluation measure is the receiver operating characteristic (ROC) curve. Since a
classification model is usually applied with particular values of thresholds or parameters, the
ROC curve describes the model performance at different threshold values
so that the best operating point can be chosen. The best operating point might be chosen so
that the classifier gives the best trade-off between the cost of failing to detect positives and
the cost of raising false alarms. These costs need not be equal; however, equal costs are a common
assumption. Note that, under this assumption, the best place to operate the classifier is usually the
point on its ROC curve that lies on the 45 degree line closest to the north-west corner (0, 1) of the ROC plot.
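The choice of an operating point under equal costs can be sketched as follows. A 45 degree line in the ROC plot has the form TPR = FPR + c, so the line closest to the corner (0, 1) is the one with the largest c = TPR - FPR; reading the rule this way is an interpretation assumed here, and the scores, labels, and thresholds are hypothetical.

```python
def roc_points(scores, labels, thresholds):
    """(threshold, FPR, TPR) per threshold; positive when score >= threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / neg, tp / pos))
    return points


def best_operating_point(points):
    """Equal costs: maximize TPR - FPR over the candidate thresholds."""
    return max(points, key=lambda p: p[2] - p[1])


scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]   # hypothetical classifier scores
labels = [1, 1, 0, 1, 0, 0]               # 1 = positive, 0 = negative
points = roc_points(scores, labels, thresholds=[0.2, 0.5, 0.75])
best = best_operating_point(points)
```

With unequal costs the slope of the line would change, and a different threshold could become the preferred operating point.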
Once the human gait modeling and performance evaluation are completed, the most
suitable model can be applied to detect and recognize human gait patterns and to visualize the results
for decision support. The main benefit for decision support is the ability to quantify and visualize
human gait results. In addition, the decision support helps measure and evaluate human gait
patterns based on a small subset of relevant features. Finally, this decision support may serve as a
reference system for detecting gait patterns of interest in a particular
application domain.
3.2 Geospatial Trajectory based Human Motion Modeling Methodology
The motivation for analyzing human geospatial trajectories is not only to model geospatial
movement patterns relative to an indoor space but also to recognize common trajectory patterns
across multiple people so as to achieve objectives in different application domains (e.g.,
averaging indoor space utilization, maintaining crowd control, etc.). Here, a trajectory pattern
can be understood as a set of regions that are of interest to different individuals. The
methodology consists of four steps: data collection (Step 1), data preprocessing (Step 2),
data mining knowledge discovery (Step 3), and visualization (Step 4). The framework for human
trajectory modeling is shown in Fig. 3-3.
Figure 3-3. Framework of geospatial trajectory modeling.
3.2.1 Step 1: Data Acquisition
Since there is little novel contribution in Step 2, Step 1 and Step 2 are combined.
The first step of the human trajectory based methodology is data acquisition. The data are captured
through a wireless indoor tracking system that updates each individual's geospatial location
(i.e., X and Y coordinates) in real time with corresponding time stamps. By utilizing the GPS-based tracking
system, the geospatial locations of each individual can be updated and considered as an
approximation of the individual geospatial trajectory. Researchers are then able to extract
individual trajectory patterns and establish common trajectory patterns among multiple people. In
this study, the BuzNet Real-Time Locating System (RTLS) was used to track the trajectories of
multiple individuals in an indoor space. Once the data collection is completed, the data are
stored in a database in a suitable format for subsequent steps in the data mining process.
3.2.2 Step 2: Data Transfer
Human geospatial trajectory data transfer is based on a hardware and software platform
that consists of three primary components: (1) routers, (2) tags, and (3) a Base Station. Routers are
fixed-position devices that form the wireless network infrastructure of the hardware. Tags are
wireless, battery-powered mobile devices placed on individuals in an indoor environment. The Base
Station is a PC (typically Microsoft Windows-based) that is loaded with the software. When
individuals walk around in a given indoor space, this system provides an interactive
visualization interface for tracking individual locations approximately every 2 minutes. At
the same time, the Base Station stores every calculated location for every tag in a database
(locally stored or cloud-based) that can be accessed and analyzed.
3.2.3 Step 3: Data Mining Knowledge Discovery
By comparing individual trajectories among multiple people in an indoor space,
researchers can extract common trajectory patterns in order to understand and recognize how the
indoor design space is utilized. Since the TRACLUS algorithm is independent of trajectory type
(e.g., dual direction trajectory), it can extract individual movement features from different
trajectories, which provides more information for trajectory clustering [63]. The methodology
contains two steps: (1) partitioning and (2) clustering. The first step attempts to capture and
recover the real trajectory based on a subset of trajectory points. The second step attempts to
group the line segments generated in the previous step so as to recognize trajectory patterns
among different people based on clusters. In this section, the individual trajectory partitioning
methodology is explained first, followed by the clustering methodology.
Trajectory partitioning
We assume that the original individual trajectory can be reproduced from the
data collected in the previous step. Some simple trajectories could be classified directly (i.e., one
directional trajectory); however, most trajectories are not of this type and cannot provide
insight if they are classified directly without any partitioning. The partitioning algorithm provides
one approach to reproduce the original trajectory without losing much information, based on an
optimal subset of characteristic points. Given an individual trajectory T = {t1, t2, ..., tn}, optimal
characteristic points P = {p1, p2, ..., pn} are generated [63]. Here t_i is any position point
collected in the previous data collection, and p_j is any characteristic point extracted. In the
partitioning algorithm, the Minimum Description Length (MDL) function is applied to evaluate
each point (the equations are shown in Eq. 3.15 to Eq. 3.18). Assuming the first point t1 in the
original trajectory is the starting point, if its MDL_par cost is less than or equal to its
MDL_nonpar cost, then we continue searching subsequent points until the first point that violates this
requirement is found. Assuming the first point that violates the MDL cost requirement is t_k, then the
preceding point t_(k-1) is considered a characteristic point and taken as the new starting point to search for the
next characteristic point, until all the points in the original trajectory have been checked. Finally, the
characteristic point set P can be established.
$$ MDL_{par} = L(H) + L(D \mid H) $$
(Equation 3.15)
$$ L(H) = \log_2\left(len(p_j p_{j+1})\right) $$
(Equation 3.16)
$$ L(D \mid H) = \sum_{k=p_j}^{p_{j+1}-1} \left[ \log_2\left(d_{\perp}(p_j p_{j+1}, t_k t_{k+1})\right) + \log_2\left(d_{\theta}(p_j p_{j+1}, t_k t_{k+1})\right) \right] $$
(Equation 3.17)
$$ MDL_{nonpar} = \sum_{j=startindex}^{currentindex} \log_2\left(len(p_j p_{j+1})\right) $$
(Equation 3.18)
where,
$MDL_{par}$: is the MDL cost of one possible characteristic point.
$MDL_{nonpar}$: is the non-MDL cost of one point.
$L(H)$: is the length of hypothesis when the next location is added.
$L(D \mid H)$: is the distance between line segments.
$len(p_j p_{j+1})$: is the Euclidean distance between two points.
$d_{\perp}(p_j p_{j+1}, t_k t_{k+1})$: is the perpendicular distance between two line segments.
$d_{\theta}(p_j p_{j+1}, t_k t_{k+1})$: is the angle distance between two line segments.
$p_j$: is the jth characteristic point in one trajectory.
$p_{j+1}$: is the (j+1)th characteristic point in one trajectory.
$t_k$: is the kth location point in one trajectory.
$t_{k+1}$: is the (k+1)th location point in one trajectory.
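The partitioning procedure can be sketched as follows for 2-D points. This is an illustrative reading of the MDL comparison rather than the reference TRACLUS implementation: distances below 1 are clamped to 1 before taking the logarithm so that the cost stays defined and non-negative, the hypothesis segment is always used as the reference line, and all function names and sample points are assumptions.

```python
import math


def _dist(a, b):
    return math.hypot(b[0] - a[0], b[1] - a[1])


def _log2(x):
    # Assumption of this sketch: clamp at 1 so log2 is defined and non-negative.
    return math.log2(max(x, 1.0))


def _perp_angle(s, e, a, b):
    """Perpendicular and angle distances of segment a-b relative to segment s-e."""
    vx, vy = e[0] - s[0], e[1] - s[1]
    seg = math.hypot(vx, vy)
    if seg == 0.0:
        return 0.0, 0.0
    l1 = abs(vx * (a[1] - s[1]) - vy * (a[0] - s[0])) / seg
    l2 = abs(vx * (b[1] - s[1]) - vy * (b[0] - s[0])) / seg
    d_perp = (l1 * l1 + l2 * l2) / (l1 + l2) if l1 + l2 > 0 else 0.0
    wx, wy = b[0] - a[0], b[1] - a[1]
    if vx * wx + vy * wy >= 0:                  # angle below 90 degrees
        d_angle = abs(vx * wy - vy * wx) / seg  # len(a-b) * sin(angle)
    else:
        d_angle = math.hypot(wx, wy)
    return d_perp, d_angle


def mdl_par(points, i, j):
    cost = _log2(_dist(points[i], points[j]))            # L(H), Eq. 3.16
    for k in range(i, j):                                # L(D|H), Eq. 3.17
        d_perp, d_angle = _perp_angle(points[i], points[j], points[k], points[k + 1])
        cost += _log2(d_perp) + _log2(d_angle)
    return cost


def mdl_nonpar(points, i, j):
    return sum(_log2(_dist(points[k], points[k + 1])) for k in range(i, j))  # Eq. 3.18


def characteristic_points(points):
    result, start, length = [points[0]], 0, 1
    while start + length < len(points):
        curr = start + length
        if length > 1 and mdl_par(points, start, curr) > mdl_nonpar(points, start, curr):
            result.append(points[curr - 1])   # previous point becomes characteristic
            start, length = curr - 1, 1
        else:
            length += 1
    result.append(points[-1])
    return result


straight = characteristic_points([(0, 0), (10, 0), (20, 0), (30, 0)])
corner = characteristic_points([(0, 0), (10, 0), (20, 0), (20, 10), (20, 20)])
```

A straight walk is summarized by its two end points, whereas an L-shaped walk keeps the corner as an additional characteristic point, which is the behavior the partitioning step is meant to achieve.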
Trajectory clustering
By grouping different individual movement features into clusters, researchers
are able to understand the density of the trajectories in the indoor design space in order
to improve user experience. Based on the characteristic points selected in the previous section, each
original individual trajectory can be reproduced and represented as a combination of line segments.
In this section, the objective is to group these line segments into different clusters in which
common movement patterns are captured.
The clustering algorithm is based on the DBSCAN algorithm, which is a density-
based clustering algorithm [63]. Given a set of line segments L = {l1, l2, ..., lj}, multiple clusters
C = {c1, c2, ..., ck} are generated, where j and k are the total numbers of line segments and
clusters, respectively [63]. The methodology has two parameters: (1) ε and (2) MinLn. ε is a threshold
on the distance between any pair of line segments, and MinLn is the
minimum number of line segments inside a cluster.
The algorithm contains three steps [63]. First, a queue Q is constructed to hold all the
unlabeled line segments during the algorithm. Each time, the ε-neighborhood of one unclassified
line segment l_i in the queue is computed based on the distance function shown in Eq. 3.19 and
Eq. 3.20. If |N_ε(l_i)| >= MinLn is satisfied, then a density-based set is generated; otherwise, the
line segment is considered noise. This is repeated until all the unclassified line segments have
been examined.
Second, the algorithm attempts to expand clusters. Assuming there are M neighborhoods
generated in the first step, a similar process is repeated for each of these neighborhoods.
If other neighborhoods are connected to it, then a
cluster is generated. Finally, the trajectory cardinality is checked to ensure that the line
segments inside a cluster come from different individual trajectories. More details can be found in
[63].
$$ N_{\varepsilon}(l_i) = \{\, l_j \in Q \mid dist(l_i, l_j) \le \varepsilon \,\} $$
(Equation 3.19)
$$ dist(l_i, l_j) = d_{\perp}(l_i, l_j) + d_{\theta}(l_i, l_j) + d_{\parallel}(l_i, l_j) $$
(Equation 3.20)
where,
$N_{\varepsilon}(l_i)$: is the ε-neighborhood of one unclassified line segment $l_i$.
ε: is a threshold to determine the distance between any pair of line segments.
$dist(l_i, l_j)$: is the distance between two line segments.
$d_{\perp}(l_i, l_j)$: is the perpendicular distance between two line segments.
$d_{\theta}(l_i, l_j)$: is the angle distance between two line segments.
$d_{\parallel}(l_i, l_j)$: is the parallel distance between two line segments.
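Equations 3.19 and 3.20 can be sketched for 2-D line segments as follows. The three component distances follow the usual TRACLUS construction (the shorter segment is projected onto the longer one) with equal weights, as in Eq. 3.20; the segments are assumed to have non-zero length, and the function names and coordinates are hypothetical.

```python
import math


def _d(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def _project(s, e, p):
    """Projection of point p onto the line through s and e."""
    vx, vy = e[0] - s[0], e[1] - s[1]
    u = ((p[0] - s[0]) * vx + (p[1] - s[1]) * vy) / (vx * vx + vy * vy)
    return (s[0] + u * vx, s[1] + u * vy)


def segment_distance(li, lj):
    """Eq. 3.20: perpendicular + angle + parallel distance."""
    if _d(*li) < _d(*lj):
        li, lj = lj, li                       # measure the shorter against the longer
    (s, e), (a, b) = li, lj
    pa, pb = _project(s, e, a), _project(s, e, b)
    l1, l2 = _d(a, pa), _d(b, pb)
    d_perp = (l1 * l1 + l2 * l2) / (l1 + l2) if l1 + l2 > 0 else 0.0
    d_par = min(_d(pa, s), _d(pa, e), _d(pb, s), _d(pb, e))
    vx, vy = e[0] - s[0], e[1] - s[1]
    wx, wy = b[0] - a[0], b[1] - a[1]
    if vx * wx + vy * wy >= 0:                # angle below 90 degrees
        d_angle = abs(vx * wy - vy * wx) / _d(s, e)
    else:
        d_angle = _d(a, b)
    return d_perp + d_angle + d_par


def neighborhood(i, segments, eps):
    """Eq. 3.19: indices of all segments within eps of segment i (itself included)."""
    return [j for j, seg in enumerate(segments)
            if segment_distance(segments[i], seg) <= eps]


segments = [((0, 0), (10, 0)), ((0, 1), (10, 1)), ((0, 50), (10, 50))]
dense = neighborhood(0, segments, eps=2.0)
isolated = neighborhood(2, segments, eps=2.0)
```

With MinLn = 2, the first segment has enough neighbors to start a density-based set, while the third segment has only itself in its neighborhood and would be treated as noise.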
As mentioned earlier, the goal of trajectory modeling is to detect and recognize
movement patterns not only for individuals but also for multiple people. By visualizing the
trajectory clustering results based on the methodology discussed in Section 3.2.3, we may be able
to understand the dynamics behind the geospatial movement patterns in these two respects. In
addition, the clustering results may also be applied to help achieve some high-level objectives.
For instance, introducing the specific facility layout of a specific setting could help
understand indoor space utilization or public space crowd control so as to generate guidance for
relocating various resources and improving user experience. In the next chapter, two case studies
on human gait and geospatial trajectory are discussed to validate the methodologies.
3.2.4 Step 4: Model Visualization
The goal of human geospatial trajectory modeling is to discover all utilized
regions in order to better understand human movement dynamics, to describe how space utilization patterns
evolve over different time periods based on the clustering results, and to suggest possible improvements to
indoor space design. From Step 1 to Step 3, researchers obtain the total number of
clusters, the number of line segments included in each cluster, the number of different individuals
included, and the locations of the line segments in each cluster. Clustering visualization helps to better
understand how the indoor space is utilized and may lead to a better indoor space design.
In the previous sections, a utilized region is assumed to be a location in a design
space containing clusters of individuals. Based on the data mining trajectory clustering
methodology, several common movement patterns from different individual trajectories can be
detected, which are equivalent to the clusters of common movement patterns. In addition,
based on the clustering result, the total number of clusters generated and the number of
individuals included in each one can be obtained. Finally, the evolution of indoor space
utilization, based on the change of movement patterns over different time periods, is also addressed
in order to capture changes in human movement behavior patterns.
Chapter 4
Case Studies and Discussion
In order to validate the methodologies proposed in Chapter 3, a suitable application/case
study is chosen for human gait modeling and for geospatial trajectory modeling, respectively. A
Parkinson’s disease (PD) detection case study is introduced to demonstrate the human gait
modeling. The objective of this case study is to propose a non-invasive motion tracking
methodology that will serve as a healthcare decision support system capable of predicting the
emergence of PD based on extracted PD gait patterns. An indoor design space utilization case
study is presented to validate the human geospatial movement component of the
proposed methodology. The objective is to understand how the indoor space is utilized based on
the density of the trajectories.
For data acquisition in the two case studies, voluntary participants from the university
were invited. Since the two studies involve human volunteers whose
personal information may be identifiable during the experiments, which may cause privacy
problems, skeletal frames are used in the human gait study while user IDs are tracked
anonymously in the geospatial trajectory study. It is important to note that all the
experiments were designed and carried out following the guidelines and rules enforced by the
Institutional Review Board (IRB) and the Office for Research Protections (ORP) for research
involving human participants.
The details of PD detection are discussed in Section 4.1, while the trajectory clustering
details are discussed in Section 4.2.
4.1 Parkinson’s disease detection based Case Study
Parkinson’s disease (PD) is a motor related disease that affects more than one million
people in North America and is the second most common neurological disorder after Alzheimer’s disease
[7],[68-69]. PD results from the death of dopamine-generating cells in a region of the midbrain
called the substantia nigra [66]. The symptoms of PD include shaking, muscle rigidity,
slowness of movement, difficulty with walking, and some vocal problems; however, the most
obvious symptoms are gait-related, especially during the early stages of PD [8]. Here the early
PD stages are defined as stages I to III in the Hoehn and Yahr Staging of PD [8].
PD is now diagnosed based on published criteria such as the Unified
Parkinson's Disease Rating Scale (UPDRS) [67]. Since the reason for neuron cell death is still
unclear, it is sometimes difficult to diagnose PD accurately, and a misdiagnosis rate of approximately
20%-25% is expected in clinical PD diagnosis [68]. In addition, the current clinical PD
diagnosis process places a high demand on human resources and facilities, which could increase the
financial burden on not only PD patients but also insurance providers and even the government.
As a result of these limitations, PD patients may occupy more healthcare resources and experience more
possible side effects, and PD management becomes less efficient. Some data mining
based methodologies have also been applied in PD diagnosis. Even though they have proved effective in PD
recognition, their fundamental limitation is that they rely on predetermined assumptions about the
biomarkers, which may reduce the final PD modeling performance. For example, hands and feet are
usually taken as the biomarkers to track and capture PD [10]. However, these may not be the best
features to predict PD in terms of accuracy and robustness of prediction.
Due to the disadvantages of current PD diagnosis, there is a demand for an integrated PD
detection system that is capable of identifying the emergence of PD motor symptoms in a
cost-efficient, objective, and non-invasive way. The proposed data mining driven methodology is
designed to meet this demand.
4.1.1 PD Data Acquisition and Preprocessing
For these experiments, the Microsoft Kinect was positioned at an elevation of 3 feet
10 inches above the floor. Each subject’s body presence was verified, and the camera angle was
adjusted by having the subject stand relaxed while facing the Kinect at a distance of 10 feet. Then
each volunteer was invited to the walking experiment, in which one frame of human gait was
collected about every 30 ms. In this forward
walking (FW) experiment, the subject was asked to first take 2-3 steps backward (4 feet) from the
point of camera calibration, while still remaining within the distance limit of the device. Subjects were
then instructed to walk comfortably toward the Kinect and were not given any specific instructions
regarding the side of initiation. Finally, each individual human gait data set was labeled with a class
variable (i.e., subject is PD or control) since the PD status was known before the experiment. An
overhead view of the experiment is shown in Fig. 4-1.
In this forward walking experiment, the subject pool consists of a total of seven PD patients
without medication and seven controls without PD symptoms. Based on the data sampling rate of
33 Hz, more than one thousand frames were collected for each subject during the walking
experiment.
Figure 4-1. PD forward walking experiment overhead view.
In the next step, data preprocessing is conducted to reduce noise in the original data set.
For example, in the FW experiment, an arm swing may not be captured when the arm swings behind
the body, in which case zeros are generated in multiple successive frames. This
step is summarized as follows:
1. The velocity and acceleration of each node are generated in the X, Y, and Z
directions, similar to the position data;
2. The ratios of position, velocity, and acceleration are generated between every two
nodes in the X, Y, and Z coordinates to reduce human motion variation;
3. PD status is the response variable; PD is coded as TRUE and control is
coded as FALSE;
4. Two data sets are finally generated. The first is the PD-OFF data set, which contains the
data from seven PD patients without medication. The second is the Control data set, which
contains the data from seven controls. There are 1891 features included in each of the two data
sets.
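The feature count can be checked with a short calculation. The breakdown below is one way of arriving at the stated total from the preprocessing steps, assuming the 20 skeletal nodes of Fig. 3-2, three axes, three quantities per node, and one ratio per unordered pair of nodes; the variable names are hypothetical.

```python
from math import comb

N_NODES = 20       # skeletal nodes (Fig. 3-2)
N_AXES = 3         # X, Y, Z
N_QUANTITIES = 3   # position, velocity, acceleration

per_node = N_NODES * N_AXES * N_QUANTITIES               # 20 * 9 = 180
pair_ratios = comb(N_NODES, 2) * N_AXES * N_QUANTITIES   # 190 * 9 = 1710
total_inputs = per_node + pair_ratios                    # 1890 input features
total_with_response = total_inputs + 1                   # plus the PD status
```

Under these assumptions the calculation reproduces the 1891 features per data set, of which one is the response variable.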
4.1.2 PD-based Data Mining Knowledge Discovery and Evaluation
In the first step, feature selection based on the CFS algorithm described in Section
3.1 is conducted to generate the optimal feature subset for PD
detection in the forward walking experiment. A total of 32 features are selected from the 1890 original
features (the remaining one is the output variable). Among these 32 features, 18 are related to position,
9 are related to velocity, 1 is related to acceleration, and the rest are ratios.
In the second step, multiple machine learning algorithms are employed to discover novel
knowledge pertaining to the data acquired. As discussed in Section 3.1, different models may
have different advantages and lead to different classification accuracies. By evaluating the
performances based on the 10-fold cross validation technique, the most accurate and reliable
classification model can be identified. As shown in Table 4-1, the IBK classifier is the best
classifier since its accuracy is almost 99%, which means the model identifies about 99% of the
human gait frames correctly among the seven PD patients and seven controls. At the same time,
the accuracies of J48 (a classifier based on C4.5) and random forest both exceed 90%. The worst
model is logistic regression, whose accuracy is only about 64.3%. The confusion matrices for
forward walking are also given in Table 4-1. From this table, we can also
confirm that logistic regression has lower accuracy than the SVM, J48, random forest,
and IBK models. In addition, the values of the other performance measures are given in Table 4-2. For
example, the IBK classifier can recognize 98.6% of the PD frames correctly. Based on these two
tables, IBK and random forest are the best classifiers for the forward
walking experiment.
Table 4-1. Algorithm performance in the walking experiment.
Algorithm                     Confusion matrix                       Accuracy
                                        PD      Control   Sum
IBK                           PD        1498    25        1523       98.8%
                              Control   16      1349      1365
                              Sum       1514    1374      2888
Binary Logistic Regression    PD        1079    444       1523       64.3%
                              Control   556     779       1365
                              Sum       1665    1223      2888
J48                           PD        1414    109       1523       92.1%
                              Control   120     1245      1365
                              Sum       1534    1354      2888
SVM                           PD        1055    468       1523       64.5%
                              Control   556     809       1365
                              Sum       1611    1277      2888
Random Forest                 PD        1472    51        1523       95.4%
                              Control   82      1283      1365
                              Sum       1554    1334      2888
Table 4-2. Other evaluation measures for the algorithms in the walking experiment.
Algorithm                     TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area
IBK                           0.986     0.014     0.986       0.986    0.986       0.986
Binary Logistic Regression    0.643     0.364     0.643       0.643    0.642       0.705
J48                           0.921     0.08      0.921       0.921    0.921       0.929
SVM                           0.645     0.36      0.645       0.655    0.645       0.643
Random Forest                 0.954     0.048     0.954       0.954    0.954       0.991
To summarize, the machine learning classifiers perform differently in the FW experiment. Among
them, IBK, random forest and J48 outperform the other two classifiers on multiple evaluation
measures and could be applied in future PD detection applications. For example, by examining the
features extracted by these three models, researchers can identify the common features relevant
to PD detection. In addition, the average output of these three algorithms may be used as a
single quantitative PD detection result for PD comparison. However, this case study is a pilot
study and has several limitations. The main one is that only seven PD patients and seven
controls were involved in validating the human gait based modeling. Possible future work
includes involving more subjects of different ages while keeping the same proportion of males
and females. Furthermore, quantifying the multi-stage rating scales currently used to track
long-term PD evolution (e.g., UPDRS) may help improve long-term PD management.
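The idea of averaging the three best classifiers into one quantitative PD detection result can
be sketched as follows. This is only an illustration: the per-frame scores are hypothetical
values, not results from the case study, and the 0.5 decision threshold is an assumption.

```python
from statistics import mean


def average_pd_score(scores):
    """Average the PD scores (between 0 and 1) produced by several classifiers."""
    return mean(scores.values())


# Hypothetical PD scores for a single gait frame (not taken from the thesis).
frame_scores = {"IBK": 1.0, "Random Forest": 0.9, "J48": 0.8}

score = average_pd_score(frame_scores)
label = "PD" if score >= 0.5 else "Control"  # assumed threshold
print(f"average PD score = {score:.2f}, label = {label}")
```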
4.2 Geospatial Trajectory Clustering
In this section, the geospatial trajectory based case study is discussed. The objective is to
extract individual movement patterns and compare these patterns in order to generate clusters of
common movement patterns that could help explain the motivations behind these activities.
To achieve this goal, the related data acquisition is discussed in Section 4.2.1, followed by
the modeling and visualization in Section 4.2.2.
4.2.1 Geospatial Trajectory Data Acquisition and Preprocessing
The data were collected in the Learning Factory at The Pennsylvania State University, covering
the 3,500 sq. ft. of the facility's lab, work, and shop space (see Fig. 4-2) [74-75]. The
facility is designed for students in the College of Engineering to carry out design and other
related work. The BuzNet Real-Time Locating System (RTLS) was used to track the
trajectories of teaching assistants (TAs) at the Learning Factory [71].
Figure 4-2. The learning factory layout.
In the experiment, twelve battery-powered BuzNet tags were provided to TAs while they were on
duty. Each TA was assumed to wear a tag while guiding students’ experiments until the work was
done and the tag was returned to the container. By collecting and analyzing the TAs’
trajectories over a semester, we are able to understand their trajectory patterns and dynamics.
During each experiment, the 2-D (X-Y) position data were updated about every two minutes with
the corresponding time stamp, tag ID, and sequence number. These data were also stored in a
database automatically.
4.2.2 Geospatial Trajectory Based Knowledge Discovery and Explanation
The results of the partitioning algorithm show that it is able to approximate an original
individual trajectory with a minimum number of characteristic points. For example, there are 13
position nodes in the original trajectory of User 1 (shown in Table 4-3); however, only Points
1, 4, 12 and 13 are selected as characteristic points to approximate the original trajectory
(shown in Fig. 4-3). Similar results can be seen for User 2. For a clearer understanding, the
trajectory visualization of User 1 is shown in Fig. 4-4. The original trajectory of User 1 is
represented as black nodes connected by a green line. The trajectory approximated by the
proposed partitioning algorithm is shown as red dots connected by a black line.
Table 4-3. Original trajectory of User 1.
user number   sequence number   x location   y location   date        time
1             1                 18.6         11.8         1/20/2012   18:15:16
1             2                 21.4         14.9         1/20/2012   18:17:17
1             3                 21.5         15.1         1/20/2012   18:19:16
1             4                 20.8         15.2         1/20/2012   18:21:17
1             5                 20.5         15.6         1/20/2012   18:23:18
1             6                 20.7         15.1         1/20/2012   18:25:18
1             7                 21.1         15           1/20/2012   18:27:18
1             8                 21.5         15.2         1/20/2012   18:29:18
1             9                 21.2         15.1         1/20/2012   18:31:18
1             10                20.9         15.4         1/20/2012   18:33:17
1             11                20.9         15.4         1/20/2012   18:35:18
1             12                21.1         15.2         1/20/2012   18:37:18
1             13                18.6         11.8         1/20/2012   18:39:18
2             1                 18.6         11.8         1/20/2012   18:43:18
2             2                 20.9         15.6         1/20/2012   18:45:18
2             3                 21.7         15.3         1/20/2012   18:47:18
2             4                 20.9         15.8         1/20/2012   18:49:16
Figure 4-3. Extracted characteristic points of User 1
Figure 4-4. Visualization of trajectory partitioning for User 1.
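The partitioning step described above can be illustrated with a minimal Python sketch of
MDL-based approximate partitioning in the style of the partition-and-group framework [63]. The
thesis text does not give the exact cost function or scaling, so the formulas below, the clamping
of log terms, and all function names are assumptions of this sketch. With these choices the
selected points need not coincide with Points 1, 4, 12 and 13 reported for User 1.

```python
import math

# User 1 positions (x, y) from Table 4-3, in sequence order.
USER1 = [
    (18.6, 11.8), (21.4, 14.9), (21.5, 15.1), (20.8, 15.2), (20.5, 15.6),
    (20.7, 15.1), (21.1, 15.0), (21.5, 15.2), (21.2, 15.1), (20.9, 15.4),
    (20.9, 15.4), (21.1, 15.2), (18.6, 11.8),
]


def _log2(value):
    # Assumption: clamp to 1 so zero-length segments and sub-unit distances cost 0 bits.
    return math.log2(max(value, 1.0))


def _length(a, b):
    return math.hypot(b[0] - a[0], b[1] - a[1])


def _perpendicular(start, end, seg_a, seg_b):
    """Perpendicular distance of segment (seg_a, seg_b) to the line through (start, end)."""
    base = _length(start, end)

    def point_to_line(p):
        if base == 0.0:  # degenerate hypothesis: fall back to point distance
            return _length(start, p)
        cross = (end[0] - start[0]) * (p[1] - start[1]) - (end[1] - start[1]) * (p[0] - start[0])
        return abs(cross) / base

    d1, d2 = point_to_line(seg_a), point_to_line(seg_b)
    return 0.0 if d1 + d2 == 0.0 else (d1 ** 2 + d2 ** 2) / (d1 + d2)


def _angular(start, end, seg_a, seg_b):
    """Angle distance: |segment| * sin(theta), or |segment| if theta >= 90 degrees."""
    base, seg = _length(start, end), _length(seg_a, seg_b)
    if seg == 0.0:
        return 0.0
    if base == 0.0:  # direction undefined: treat as the worst case
        return seg
    ux, uy = end[0] - start[0], end[1] - start[1]
    vx, vy = seg_b[0] - seg_a[0], seg_b[1] - seg_a[1]
    cos_theta = (ux * vx + uy * vy) / (base * seg)
    if cos_theta <= 0.0:
        return seg
    return seg * math.sqrt(max(0.0, 1.0 - cos_theta ** 2))


def mdl_par(points, i, j):
    """Cost of replacing points[i..j] by the single segment (points[i], points[j])."""
    start, end = points[i], points[j]
    cost = _log2(_length(start, end))  # L(H): length of the hypothesis segment
    for k in range(i, j):  # L(D|H): deviation of each original segment
        cost += _log2(_perpendicular(start, end, points[k], points[k + 1]))
        cost += _log2(_angular(start, end, points[k], points[k + 1]))
    return cost


def mdl_nopar(points, i, j):
    """Cost of keeping every original segment between points[i] and points[j]."""
    return sum(_log2(_length(points[k], points[k + 1])) for k in range(i, j))


def characteristic_points(points):
    """Return 0-based indices of the characteristic points of one trajectory."""
    chosen = [0]
    start, length = 0, 1
    while start + length < len(points):
        curr = start + length
        if mdl_par(points, start, curr) > mdl_nopar(points, start, curr):
            chosen.append(curr - 1)  # partition at the previous point
            start, length = curr - 1, 1
        else:
            length += 1
    chosen.append(len(points) - 1)
    return chosen


# Print 1-based point numbers to match the numbering used in Table 4-3.
print([index + 1 for index in characteristic_points(USER1)])
```

The first and last points are always kept, so consecutive characteristic points define the line
segments that are passed on to the clustering step.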
The partitioning algorithm generated a total of 1287 line segments. With ε = 1 and MinLn = 8,
the clustering algorithm was applied to these line segments to generate clusters. In the end,
each effective line segment in the queue was assigned to a cluster, together with the original
trajectory to which the line segment belongs. Note that some line segments cannot be assigned to
any cluster because they do not satisfy the parameters ε and MinLn; we labeled these line
segments as noise. For example, in Table 4-4, Line 1 and Line 2 are grouped into Cluster 1 while
Line 3 is grouped into Cluster 2, even though all three lines are from the same trajectory.
Since an original trajectory can contain multiple line segments, an individual trajectory can be
assigned to several clusters, which provides more detail about trajectory patterns.
Table 4-4. Example result of the clustering algorithm.
Line Segment No.   Cluster No.   Trajectory No.
Line 1             C1            Tra 1
Line 2             C1            Tra 1
Line 3             C2            Tra 1
Line 4             C2            Tra 2
Line 5             C2            Tra 2
Line 6             C1            Tra 3
Line 7             C1            Tra 3
Line 8             C1            Tra 3
Line 9             C1            Tra 3
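The density-based grouping of line segments with the parameters ε and MinLn can be sketched as
below. This is a simplified illustration in the style of [63], not the implementation used in
the case study: the distance is an equally weighted sum of perpendicular, parallel and angular
components (the weights are an assumption), the example segments are hypothetical, and the
per-cluster check on the number of participating trajectories is omitted.

```python
import math

NOISE = -1


def segment_distance(seg_a, seg_b):
    """Perpendicular + parallel + angular distance between two 2-D line segments."""
    def length(seg):
        return math.hypot(seg[1][0] - seg[0][0], seg[1][1] - seg[0][1])

    # Use the longer segment as the reference line.
    ref, other = (seg_a, seg_b) if length(seg_a) >= length(seg_b) else (seg_b, seg_a)
    (ax, ay), (bx, by) = ref
    ux, uy = bx - ax, by - ay
    ref_len, other_len = length(ref), length(other)
    if ref_len == 0.0:  # both segments degenerate to points
        return math.hypot(other[0][0] - ax, other[0][1] - ay)

    def project(point):
        """Return (normalized position along ref, distance to the ref line)."""
        px, py = point[0] - ax, point[1] - ay
        t = (px * ux + py * uy) / ref_len ** 2
        return t, math.hypot(px - t * ux, py - t * uy)

    (t1, d1), (t2, d2) = project(other[0]), project(other[1])
    perpendicular = 0.0 if d1 + d2 == 0.0 else (d1 ** 2 + d2 ** 2) / (d1 + d2)
    parallel = min(abs(t1), abs(t1 - 1.0), abs(t2), abs(t2 - 1.0)) * ref_len
    if other_len == 0.0:
        angular = 0.0
    else:
        vx, vy = other[1][0] - other[0][0], other[1][1] - other[0][1]
        cos_theta = (ux * vx + uy * vy) / (ref_len * other_len)
        sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta ** 2))
        angular = other_len * sin_theta if cos_theta > 0.0 else other_len
    return perpendicular + parallel + angular


def cluster_segments(segments, eps, min_ln):
    """Assign a cluster id (1, 2, ...) or NOISE to every line segment."""
    labels = [None] * len(segments)

    def neighbors(i):
        return [j for j, seg in enumerate(segments)
                if segment_distance(segments[i], seg) <= eps]

    cluster_id = 0
    for i in range(len(segments)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_ln:
            labels[i] = NOISE  # may later be absorbed as a border segment
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            reachable = neighbors(j)
            if len(reachable) >= min_ln:  # j is a core segment: keep expanding
                queue.extend(k for k in reachable if labels[k] in (None, NOISE))
    return labels


# Hypothetical segments: three nearly parallel neighbors and one distant segment.
segments = [
    ((0.0, 0.0), (4.0, 0.0)),
    ((0.0, 0.2), (4.0, 0.2)),
    ((0.1, 0.4), (4.1, 0.4)),
    ((10.0, 10.0), (10.0, 14.0)),
]
labels = cluster_segments(segments, eps=1.0, min_ln=3)
print(labels)
```

In this toy example MinLn is set to 3 rather than 8 so that four segments are enough to form one
cluster and one noise segment.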
Table 4-5. Result of the clustering algorithm.
Cluster No.   Total number of line segments   Cluster cardinality
C1            58                              20
C2            41                              18
C3            8                               3
C4            42                              15
C5            15                              8
C6            59                              14
C7            48                              14
C8            224                             46
C9            322                             44
The final clustering result is presented in Table 4-5. Nine clusters are generated from 817 of
the 1287 line segments produced in the first step. That is to say, about 63.5% of the
individual movement patterns are shared among multiple people and are represented by the nine
clusters. Moreover, Cluster 8 and Cluster 9, shown in blue and green in Fig. 4-5, are the most
common movement patterns, since together they include 546 line segments from 90 individual
trajectories. Cluster 8 helps explain the movements of about 18.7% of the people in the case
study, and most of these movements occur in the middle two spaces (work space and shop space).
There are also “back and forth” movement patterns, since most of the line segments are
parallel. Cluster 9 explains the movements shared by 17.8% of the sample in the case study.
Compared with Cluster 8, more trajectory patterns are represented and more spaces are used, such
as the PC room, the presentation room, and the toilet. As in Cluster 8, “back and forth”
patterns are still present. Note that some lines fall outside the Learning Factory because
people left the building before returning their tags.
Compared with Fig. 4-2, the clustering results shown in Fig. 4-5 provide a clearer picture of
the human trajectory movement patterns as well as the indoor space utilization patterns.
Figure 4-5. Clustering visualization.
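The counts quoted for Table 4-5 can be recomputed with a few lines of Python. The sketch below
only restates the table; the variable names are illustrative.

```python
# Table 4-5: cluster -> (number of line segments, cluster cardinality).
CLUSTERS = {
    "C1": (58, 20), "C2": (41, 18), "C3": (8, 3),
    "C4": (42, 15), "C5": (15, 8), "C6": (59, 14),
    "C7": (48, 14), "C8": (224, 46), "C9": (322, 44),
}
TOTAL_SEGMENTS = 1287  # line segments produced by the partitioning step

clustered_segments = sum(count for count, _ in CLUSTERS.values())  # 817
clustered_share = clustered_segments / TOTAL_SEGMENTS              # about 0.635

top_two_segments = CLUSTERS["C8"][0] + CLUSTERS["C9"][0]      # 546
top_two_trajectories = CLUSTERS["C8"][1] + CLUSTERS["C9"][1]  # 90

print(f"{clustered_segments} of {TOTAL_SEGMENTS} segments clustered "
      f"({clustered_share:.1%}); C8 and C9 hold {top_two_segments} segments "
      f"from {top_two_trajectories} trajectories")
```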
Figure 4-6. Clustering visualizations in the first period.
Figure 4-7. Clustering visualizations in the second period.
Figure 4-8. Clustering visualizations in the third period.
In order to detect possible evolution of the movement patterns, the original trajectory data set
was separated into three periods: from January 20th 2012 to February 21st 2012 for the first
period, from February 22nd 2012 to March 22nd 2012 for the second one, and from March 23rd
2012 to April 23rd 2012 for the last one. The visualizations are shown in Fig. 4-6, Fig. 4-7
and Fig. 4-8. Two observations can be made. First, the utilized space increases over time from
the first picture (Fig. 4-6) to the last one (Fig. 4-8). Second, the similarities among the
clusters increase over time. One possible explanation is that, at the beginning of the semester,
students have no specific assignments or tasks and simply wander around to get to know each
section of the Learning Factory. As the semester goes on, however, students may need to design a
prototype and then go to the machining room for milling. Toward the end of the semester, PC room
usage decreases while presentation room usage increases, since students may have already
completed their projects and are giving final presentations. To summarize, the methodology
proposed in this thesis provides more information about different geospatial trajectory patterns
than simply mapping location points. In addition, this methodology provides one approach to
recognizing the utilization relationships among multiple spaces in order to capture indoor space
utilization patterns, which can be taken as evidence for indoor space utilization optimization.
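The split of the data set into three periods can be expressed as a small lookup on the record
date. The sketch below is illustrative: the period boundaries come from the text, the date
format follows Table 4-3, and the function and variable names are assumptions.

```python
from datetime import date, datetime

# Period boundaries (inclusive) as stated in the text.
PERIODS = [
    ("first", date(2012, 1, 20), date(2012, 2, 21)),
    ("second", date(2012, 2, 22), date(2012, 3, 22)),
    ("third", date(2012, 3, 23), date(2012, 4, 23)),
]


def parse_record_date(text):
    """Parse a date written as month/day/year, e.g. '1/20/2012'."""
    return datetime.strptime(text, "%m/%d/%Y").date()


def period_of(day):
    """Return the name of the period containing `day`, or None if outside all periods."""
    for name, start, end in PERIODS:
        if start <= day <= end:
            return name
    return None


print(period_of(parse_record_date("1/20/2012")))
```

Grouping the trajectory records by this period label before partitioning and clustering yields
one set of clusters per period.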
Chapter 5
Conclusions and Future Work
This thesis proposes a human movement tracking methodology for both human gait and
geospatial trajectory with preserved privacy, which means that a person is unidentifiable from
the movement data collected. The methodology is partitioned into two components. The first
component is human gait modeling, where the objective is to model and predict neurological
patterns that influence human gait. In addition, we are able to address the human gait variation
problem by introducing ratios in position, velocity and acceleration. The second component is
human geospatial trajectory modeling, which aims to predict common regions of interest (CRI) in
indoor design spaces in order to capture and optimize indoor space design. The experimental
results show that the proposed human gait modeling is able to detect significant gait
differences between PD patients and controls, and that the proposed human geospatial trajectory
modeling is able to detect common regions of interest from multiple people in the Learning
Factory, which can serve as a tool for future indoor space design. These research findings
demonstrate the feasibility of employing multimodal sensors and supervised machine learning
algorithms to model and predict human movement kinematics.
This work can be expanded and improved upon in several ways. In terms of human gait modeling,
one possible direction is to identify the features that are commonly selected by multiple
machine learning algorithms in order to find the features most relevant to the human gait class
variable. For example, by examining all the features selected by the different machine learning
algorithms, researchers can recognize the most predictive features for PD detection. In terms of
human geospatial trajectory modeling, one possible future extension is to add indoor space
layout information in order to optimize the indoor space utilization efficiency. For example, by
adding facility layout information of the Learning Factory, designers can better design the
space and improve the utilization efficiency.
References
[1] B. James, Body language: 7 easy lessons to master the silent language. Saddle River, New
Jersey, 07458: FT Press, 2009.
[2] J. K. Aggarwal and Q. Cai, “Human motion analysis: a review,” in Proceedings of the
1994 IEEE Workshop on. IEEE, 1994, pp. 90–102.
[3] L. Wang, W. Hu, and T. Tan, “Recent developments in human motion analysis,” in
Pattern Recognition, vol. 36, no. 3, pp. 585–601.
[4] H. Fujiyoshi, “Real-time Human Motion Analysis by Image Skeletonization,” in Fourth
IEEE Workshop on. IEEE, 1998, pp. 15–21.
[5] “Human Gait,” http://en.wikipedia.org/wiki/Gait_(human).
[6] A. Hakeem, R. Vezzani, M. Shah, and R. Cucchiara, Estimating Geospatial
Trajectory of a Moving Camera. Hong Kong: ICPR 2006, 2006, pp. 82–87.
[7] D. Gil and D. J. Manuel, “Diagnosing Parkinson by using artificial neural networks and
support vector machines,” Global Journal of Computer Science and Technology, vol. 9,
no. 4, 2009.
[8] S. J. G. Lewis, T. Foltynie, A. D. Blackwell, T. W. Robbins, A. M. Owen, and R. A. Barker,
“Heterogeneity of Parkinson’s disease in the early clinical stages using a data driven
approach,” Journal of Neurology, Neurosurgery, and Psychiatry, vol. 76, no. 3, pp. 343–8,
Mar. 2005.
[9] D. B. Calne, B. J. Snow, and C. Lee, “Criteria for Diagnosing Parkinson’s disease,”
Annals of Neurology, vol. 32, no. Supplement S1, pp. 125–127, 1992.
[10] J. Barth, M. Sunkel, K. Bergner, G. Schickhuber, J. Winkler, J. Klucken, and B. Eskofier,
“Combined analysis of sensor data from hand and gait motor function improves automatic
recognition of Parkinson’s disease,” in Engineering in Medicine and Biology Society
(EMBC), 2012 Annual International Conference of the IEEE, 2012, pp. 5122–5125.
[11] “Swimming video analysis,” http://www.swimsmooth.com/certifiedcoaches.html.
[12] D. J. Smith, S. R. Norris, and J. M. Hogg, “Performance evaluation of swimmers,”
Sports Medicine, vol. 32, no. 9, pp. 539–554, 2002.
[13] G. L. Foresti, “A Real-Time System for Video Surveillance of Unattended Outdoor
Environments,” in IEEE Transactions on Circuits and Systems for Video
Technology, 1998, vol. 8, no. 6, pp. 697–704.
[14] M. Xu, L. Duan, C. Xu, and Q. Tian, “A fusion scheme of visual and auditory modalities
for event detection in sports video,” in Acoustics, Speech, and Signal Processing, 2003,
vol. 3, pp. 111–189.
[15] A. F. Smeaton, P. Over, and W. Kraaij, “Multimedia Content Analysis,” in Signals and
Communication Technology, 2009, pp. 151–174.
[16] E. Prassler, J. Scholz, and A. Elfes, Tracking People in a Railway Station during Rush-Hour.
1999, pp. 162–179.
[17] C. S. Regazzoni and A. Tesei, “Distributed data fusion for real-time crowding estimation,”
in Signal Processing, 1996, vol. 53, pp. 47–63.
[18] A. Fod, A. Howard, and M. J. Mataric, “A Laser-Based People Tracker,” in Robotics and
Automation, 2002. Proceedings, 2002, no. May, pp. 3024–3029.
[19] H. Zhao and R. Shibasaki, “A Novel System for Tracking Pedestrians
Using Multiple Single-Row Laser-Range Scanners,” Systems, Man and Cybernetics, Part
A: Systems and Humans, IEEE Transactions, vol. 35, no. 2, pp. 283–291, 2005.
[20] L. W. Campbell and A. F. Bobick, “Recognition of human body motion using phase space
constraints,” in Proceedings of IEEE International Conference on Computer Vision, 1995,
pp. 624–630.
[21] N. H. Goddard, “Incremental Model-Based Discrimination of Articulated Movement
from Motion Features,” in Proceedings of the 1994 IEEE Workshop on. IEEE, 1994, pp.
89–94.
[22] I. A. Kakadiaris, D. Metaxas, and R. Bajcsy, “Active Part-Decomposition,
Shape and Motion Estimation of Articulated Objects: A Physics-based Approach,”
Computer Vision and Pattern Recognition, 1994. Proceedings CVPR’94, pp. 980–984,
1994.
[23] R. Rashid, “Towards a system for the interpretation of moving light displays,” in Pattern
Analysis and Machine Intelligence, IEEE …, 1980, no. 6, pp. 574–581.
[24] G. Johansson, “Visual perception of biological motion and a model for its analysis,”
Perception & Psychophysics, vol. 14, no. 2, pp. 201–211, 1973.
[25] M. K. Leung and Y. Yang, “First Sight: A Human Body Outline Labeling
System,” Pattern Analysis and Machine Intelligence, IEEE Transactions, vol. 17, no. 4,
1995.
[26] G. Johansson, “Visual motion perception,” Scientific American, vol. 232, no. 6, pp. 76–88,
1975.
[27] J. A. Webb and J. K. Aggarwal, Visually Interpreting the Motion of Objects in Space. Computer
Science Department, University of Texas at Austin, 1981.
[28] I.-C. C. H. Chang, “Ribbon-Based Motion Analysis of Human Body Movements,” in
Pattern Recognition, Proceedings of the 13th International Conference, 1996, pp. 436–440.
[29] C. S. Works, “Image sequence analysis of real world human motion,” Pattern Recognition,
vol. 17, no. 1, 1984.
[30] D. M. Gavrila, “The Visual Analysis of Human Movement: A Survey,” Computer Vision
and Image Understanding, vol. 73, no. 1, pp. 82–98, Jan. 1999.
[31] D. M. Gavrila and L. S. Davis, “3-D model-based tracking of humans in action,” in
Computer Vision and Pattern Recognition, 1996, pp. 73–80.
[32] L. Goncalves, E. Di Bernardo, E. Ursella, and P. Perona, “Monocular tracking of the
human arm in 3D,” in Computer Vision, 1995, pp. 764–770.
[33] I. A. Kakadiaris and D. Metaxas, “3D Human Body Model Acquisition from Multiple
Views,” in Computer Vision, 1995. Proceedings., Fifth International Conference on. IEEE,
1995, pp. 618–623.
[34] R. Szeliski and S. B. Kang, “Recovering 3D Shape and Motion from Image
Streams using Non-Linear Least Squares,” in Computer Vision and Pattern Recognition,
1993. Proceedings CVPR ’93., 1993 IEEE Computer Society Conference, pp. 752–753.
[35] T. Nomura, T. Tasaki, and T. Kanda, “Questionnaire-Based Research on Opinions of
Visitors for Communication Robots at an Exhibition in Japan,” in Human-Computer
Interaction-INTERACT, 2005, pp. 685–698.
[36] T. Shibata, K. Wada, and K. Tanie, “Tabulation and analysis of questionnaire results of
subjective evaluation of seal robot at Science Museum in London,” Proceedings. 11th
IEEE International Workshop on Robot and Human Interactive Communication, pp. 23–28,
2002.
[37] F. Girardin, F. D. Fiore, C. Ratti, and J. Blat, “Leveraging explicitly disclosed location
information to understand tourist dynamics: a case study,” Journal of Location Based
Services, vol. 2, no. 1, pp. 41–56, Mar. 2008.
[38] Y. Yoshimura, F. Girardin, J. P. Carrascal, and C. Ratti, “New Tools for Studying
Visitor Behaviors in Museum: A Case Study at the Louvre,” in Information and Communication
Technologies in Tourism 2012. Proceedings of the International conference in
Helsingborg (ENTER 2012)., pp. 15–27.
[39] H. Cao, N. Mamoulis, and D. W. Cheung, “Mining Frequent Spatiotemporal Sequential
Patterns,” in Data Mining, Fifth IEEE International Conference, pp. 27–30.
[40] G. Andrienko, N. Andrienko, and S. Wrobel, “Visual Analytics Tools for Analysis of
Movement Data,” ACM SIGKDD Explorations Newsletter, vol. 9, no. 2, pp. 38–46, 2007.
[41] C. Charayaphan, “Communications Image processing system for interpreting in American
Sign Language motion,” Journal of Biomedical Engineering, vol. 14, no. 5, pp. 419–425,
1992.
[42] S. Tamura and S. Kawasaki, “Recognition of sign language motion images,” Pattern
Recognition, vol. 21, no. 4, pp. 343–353, Jan. 1988.
[43] F. Polana and R. Nelson, “Low Level Recognition of Human Motion,” in
Motion of Non-Rigid and Articulated Objects, 1994., Proceedings of the 1994 IEEE
Workshop on. IEEE, 1994, pp. 77–82.
[44] B. Heisele, U. Kressel, and W. Ritter, “Tracking Non-Rigid, Moving
Objects Based on Color Cluster Flow,” in IEEE Computer Society Conference, 1997, pp.
257–260.
[45] N. Johnson and D. Hogg, “Learning the distribution of object trajectories for event
recognition,” Image and Vision Computing, vol. 14, no. 8, pp. 609–615, Aug. 1996.
[46] Z. Fu, W. Hu, and T. Tan, “Similarity based vehicle trajectory clustering and
anomaly detection,” in Image Processing, 2005. ICIP 2005. IEEE International
Conference on (Volume: 2), pp. 11–14.
[47] I. K. Fodor, “A Survey of Dimension Reduction Techniques,” 2002.
[48] M. A. Hall, “Correlation-based feature selection for machine learning,” Doctoral
dissertation, The University of Waikato, 1999.
[49] B. Fritzke, “Growing Cell Structures: A Self-Organizing Network for Unsupervised and
Supervised Learning,” Neural Networks, vol. 7, no. 9, pp. 1441–1460, 1994.
[50] R. G. Ramani and G. Sivagami, “Parkinson disease
classification using data mining algorithms,” International Journal of Computer
Applications, vol. 32, no. 9, pp. 17–22.
[51] N. Landwehr, M. Hall, and E. Frank, “Logistic Model Trees,” Machine Learning, vol. 59,
no. 1–2, pp. 161–205, May 2005.
[52] A. Tsanas, M. A. Little, P. E. McSharry, J. Spielman, and L. O. Ramig, “Novel speech
signal processing algorithms for high-accuracy classification of Parkinson’s disease,”
Biomedical Engineering, IEEE Transactions on, vol. 59, no. 5, pp. 1264–1271, 2012.
[53] A. Ozcift, “SVM feature selection based rotation forest ensemble classifiers to improve
computer-aided diagnosis of Parkinson disease,” Journal of Medical Systems, vol. 36, no. 4,
pp. 2141–2147, 2012.
[54] S. Wu, “A Data Mining Analysis of The Parkinson’s Disease,” in iBusiness, 2011, vol. 03,
no. 01, pp. 71–75.
[55] E. Gabrilovich and S. Markovitch, “Text Categorization with Many
Redundant Features: Using Aggressive Feature Selection to Make SVMs Competitive
with C4.5,” in Proceedings of the twenty-first international conference on Machine
learning, pp. 41–48.
[56] S. Abe, Support vector machines for pattern classification. Springer London Dordrecht
Heidelberg New York, 2010.
[57] L. Breiman, “Random Forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.
[58] M. F. Amasyalı and B. Diri, Automatic Turkish Text Categorization in
terms of Author, genre and gender. Springer Berlin Heidelberg, 2006, pp. 221–226.
[59] D. W. Aha, D. Kibler, and M. K. Albert, “Instance-based learning algorithms,” Machine
Learning, vol. 6, no. 1, pp. 37–66, Jan. 1991.
[60] T. D’heygere, P. L. M. M. Goethals, and N. De Pauw, “Use of genetic algorithms to select
input variables in decision tree models for the prediction of benthic macroinvertebrates,”
Ecological Modelling, vol. 160, no. 3, pp. 291–300, Feb. 2003.
[61] A. H. Fielding and J. F. Bell, “A review of methods for the assessment of prediction errors
in conservation presence/absence models,” Environmental Conservation, vol. 24, no. 1, pp.
38–49, 1997.
[62] E. Dakou, T. D’heygere, A. P. Dedecker, P. L. M. Goethals, M. Lazaridou-Dimitriadou, and
N. De Pauw, “Decision Tree Models for Prediction of Macroinvertebrate Taxa
in the River Axios (Northern Greece),” Aquatic Ecology, vol. 41, no. 3, pp. 399–411, Jul.
2006.
[63] J. Lee and J. Han, “Trajectory Clustering: A Partition-and-Group Framework,” in
Proceedings of the 2007 ACM SIGMOD international conference on Management of data,
2007, pp. 593–604.
[64] S. Patel, K. Lorincz, R. Hughes, N. Huggins, J. Growdon, D. Standaert, M. Akay, J. Dy,
M. Welsh, and P. Bonato, “Monitoring motor fluctuations in patients with Parkinson’s
disease using wearable sensors,” Information Technology in Biomedicine, IEEE
Transactions on, vol. 13, no. 6, pp. 864–873, 2009.
[65] X. Huang, H. Chen, W. C. Miller, R. B. Mailman, J. L. Woodard, P. C. Chen, D. Xiang, R.
W. Murrow, Y.-Z. Wang, and C. Poole, “Lower low-density lipoprotein cholesterol levels
are associated with Parkinson’s disease,” Movement Disorders, vol. 22, no. 3, pp. 377–381,
2007.
[66] “Parkinson’s disease introduction.” [Online]. Available:
http://en.wikipedia.org/wiki/Parkinson’s_disease.
[67] P. Martinez-Martin, A. Gil-Nagel, L. M. Gracia, J. B. Gomez, J. Martínez-Sarriés, and F.
Bermejo, “Unified Parkinson’s disease rating scale characteristics and structure,”
Movement Disorders, vol. 9, no. 1, pp. 76–83, 1994.
[68] A. J. Hughes, S. E. Daniel, L. Kilford, and A. J. Lees, “Accuracy of clinical diagnosis of
idiopathic Parkinson’s disease: a clinico-pathological study of 100 cases,” Journal of
Neurology, Neurosurgery & Psychiatry, vol. 55, no. 3, pp. 181–184, Mar. 1992.
[69] T. W. Simpson and E. Kisenwether, “Driving entrepreneurial innovation through the
learning factory: The power of interdisciplinary capstone design projects,” in ASME
Design Engineering Technical Conferences-Design Education Conference, 2013.
[70] J. S. Lamancusa and T. W. Simpson, “The Learning Factory–10 Years of Impact at
Penn State,” in International Conference on Engineering Education, pp. 16–21.
[71] “Simple & Reliable Indoor Positioning Overview.” [Online]. Available:
http://www.buzbynetworks.com/buznet/buznet-overview.