THE OPEN SOURCE MATLAB TOOLBOX Gait-CAD
AND ITS APPLICATION TO BIOELECTRIC SIGNAL PROCESSING
R. Mikut, O. Burmeister, S. Braun, M. Reischl
Institute for Applied Computer Science, Forschungszentrum Karlsruhe GmbH, Germany
E-Mail [email protected]
Abstract In this paper, the open source Matlab toolbox
Gait-CAD is presented. This toolbox is designed for the
visualization and analysis of time series and single features
with a special focus on classification problems. The aim is
to provide an open platform for the development and improvement of data mining methods and their application to
various medical and technical problems.
Keywords Data Mining, Tools, Neuroprostheses
Introduction
In many applications, large data sets of time series and
single features are recorded. An at least semi-automatic
search for unknown or partially known relations requires
the use of data mining methods [1]. In recent years, a huge
number of potentially useful methods and software tools
have been proposed including methods for feature extraction, classification, and regression.
Many existing software tools are very powerful, but each
covers only a very limited subset of the available methods.
Moreover, the coupling between the different necessary processing steps (e.g. feature extraction from time series and
classification) is rather weak. This often leads to the reimplementation of existing methods or to a stepwise transfer of
partial results between different tools.
Some tools focus on script-based processing. Transferring them to other applications is difficult because the implemented algorithms
must be adapted manually, which is time-consuming. A generally accepted tool platform does not exist at
the moment.
These facts make a fast comparison of newly developed
methods against a broader set of existing methods very
time-consuming. As a consequence, new methods are
only compared with a small number of competing approaches; a broad comparison is not feasible.
In our opinion, an ideal data mining tool
• has to contain various data mining methods from
feature extraction to classification and regression using statistical approaches up to newer approaches
from computational intelligence,
• has to be free and open source to guarantee a wide
acceptance in the scientific community and the fast integration of new methods,
• needs to be modular with well-documented interfaces
to integrate various methods useful for highly specialized application domains, and
• has to support a GUI-based exploration of the data set
as well as a highly automated script-based processing
of routine operations.
This paper presents the Matlab toolbox Gait-CAD as a first
step in this direction. It is focused on the visualization and
analysis of time series and features, especially for classification, but also for regression problems. Our intention is
the design of an open platform as a framework for the
development and improvement of data mining methods.
Methods
The toolbox Gait-CAD is based on Matlab (tested with
versions 5.3 and 2007b). A Matlab-based
solution was chosen to make use of the wide mathematical functionality of this package provided by The MathWorks, Inc. A
main disadvantage is the need for a Matlab license.
The toolbox is operated through a graphical user interface
(GUI) with menu items and control elements such as popup
lists, checkboxes, and edit elements (Figure 1). This enables inexperienced users to work with the toolbox. However, the implemented algorithms work independently
of the GUI. Thus, the Matlab-typical way of programming using a command prompt and variables is also possible.
Furthermore, analyses can be automated and standardized as batch runs
by designing individual macros. More
details on the handling are explained in a comprehensible
PDF handbook.
Figure 1: Gait-CAD screenshot
Gait-CAD is open source software. The German version
has been available since November 2006, the English one since
January 2008. It is licensed under the conditions of the
GNU General Public License (GNU-GPL) of the Free
Software Foundation. It can be downloaded from the
download section at
http://www.iai.fzk.de/projekte/biosignal/index.html.
To use the toolbox for the design of a data mining algorithm, a training data set is required. This data set is normally given by a binary Matlab project file containing
matrices and vectors with predefined structures and names. Additionally,
the user can add their own textual identifiers and further
information to the matrices and structures. Missing information is filled in with standard values and identifiers.
Data can also be imported from text files (single files or complete
directories, single features or time series).
The training data set is organized as n = 1, ..., N data
points, each containing
• s_z time series (described by a matrix with the dimension s_z × K, where K is the number of sample points),
• s single features (vector with the dimension s), and
• s_y discrete output variables (vector with the dimension
s_y).
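The layout described above can be sketched as a small container. This is an illustration in Python rather than Matlab, and it is not Gait-CAD's actual project file format: the class name `TrainingSet`, its field names, and the axis order are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TrainingSet:
    """Hypothetical container mirroring the N / s_z / K / s / s_y layout."""
    # time_series[n][i][k]: sample point k of time series i for data point n
    time_series: List[List[List[float]]]
    # features[n][j]: single feature j for data point n
    features: List[List[float]]
    # outputs[n][m]: discrete output variable m (class label) for data point n
    outputs: List[List[int]]

    def dimensions(self) -> Tuple[int, int, int, int, int]:
        """Return (N, s_z, K, s, s_y) after checking that all shapes agree."""
        n_points = len(self.time_series)
        if n_points == 0:
            raise ValueError("the data set contains no data points")
        if len(self.features) != n_points or len(self.outputs) != n_points:
            raise ValueError("every block must hold the same N data points")
        s_z = len(self.time_series[0])
        k = len(self.time_series[0][0]) if s_z else 0
        s = len(self.features[0])
        s_y = len(self.outputs[0])
        for n in range(n_points):
            series_ok = (len(self.time_series[n]) == s_z
                         and all(len(ts) == k for ts in self.time_series[n]))
            if not (series_ok and len(self.features[n]) == s
                    and len(self.outputs[n]) == s_y):
                raise ValueError(f"data point {n} has inconsistent dimensions")
        return n_points, s_z, k, s, s_y


# Two data points, each with two time series of three samples,
# one single feature, and two output variables (made-up values).
example = TrainingSet(
    time_series=[[[0.0, 0.1, 0.2], [1.0, 1.1, 1.2]],
                 [[0.3, 0.2, 0.1], [0.9, 0.8, 0.7]]],
    features=[[5.0], [7.0]],
    outputs=[[1, 2], [2, 1]],
)
dims = example.dimensions()
```

Keeping several output variables per data point, as in `outputs`, is what makes it possible to pick a different classification problem from the same data set later on.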
The management of multiple output variables (e.g. diagnoses with respect to diseases in medical applications, decisions for therapies, qualitative evaluations of therapy successes, gender, age groups etc.) for each data point allows
a flexible selection of multiple classification problems.
Additionally, input and output variables may be switched
depending on the problem.
Gait-CAD implements the standardized data mining
process proposed in [2]. The main components are shown
in Figure 2. Gait-CAD provides convenient access to
numerous algorithms for the
• selection of data points (e.g. detection of outliers,
discarding of incomplete data points and features, selection of parts of data sets),
• feature extraction (e.g. spectrograms, FFT analysis,
correlation analysis, linear filtering, calculation of extrema, mean values, fuzzification etc.),
• evaluation and selection of features and time series
(e.g. multivariate analysis of variance, t-test, information measures, regression analysis),
• feature aggregation (e.g. discriminant analysis, principal component analysis - PCA, independent component analysis - ICA),
• supervised and unsupervised classification (e.g. decision trees, cluster algorithms, Bayes classifier, artificial neural networks (ANN), nearest neighbour algorithms, support vector machines - SVM, fuzzy systems), and
• validation strategies (e.g. cross-validation, bootstrap).
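To make the interplay of classification and validation concrete, the following minimal sketch estimates the error rate of a nearest neighbour classifier by k-fold cross-validation. It is written in Python for illustration and does not reproduce Gait-CAD's Matlab implementation; the function names, the fold assignment, and the toy data are assumptions.

```python
import random


def nearest_neighbour_predict(train_x, train_y, query):
    """Return the label of the training point closest to `query`
    (squared Euclidean distance)."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2
                                 for a, b in zip(train_x[i], query)))
    return train_y[best]


def cross_validate(features, labels, n_folds=5, seed=0):
    """Estimate the 1-nearest-neighbour error rate by k-fold cross-validation."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds disjoint test sets.
    folds = [indices[f::n_folds] for f in range(n_folds)]
    errors = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        train_x = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        for i in test_idx:
            predicted = nearest_neighbour_predict(train_x, train_y, features[i])
            if predicted != labels[i]:
                errors += 1
    return errors / len(features)


# Toy data: two well-separated classes in a two-dimensional feature space.
features = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.0],
            [10.0, 10.0], [10.1, 10.2], [10.2, 10.1], [10.3, 10.3], [10.1, 10.0]]
labels = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
error_rate = cross_validate(features, labels)
```

Each data point is classified exactly once by a model that never saw it during training, which is the property that makes the estimate usable for comparing methods.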
Additionally, there are various possibilities to visualize
results, automatically log results and process steps in text
and LaTeX files, rename variables etc.
For some functions, Gait-CAD uses additional commercial
Matlab toolboxes (e.g. the Signal, Statistics, Neural Network,
and Wavelet toolboxes from The MathWorks, Inc.) or freely
available GNU-GPL toolboxes. However, most of the self-implemented functions require only a standard Matlab
installation.
The feature extraction is realized with plugins. Plugins
are single Matlab functions named plugin_*.m, which are
stored in a special directory or in the working directory.
Figure 2: Design process of a data mining algorithm [2]
[Figure: blocks labelled Database; Problem formulation (verbalized); Collecting training data set; Problem formulation (formalized); Evaluation measures; Data point selection; Feature extraction; Feature selection; Feature aggregation; Classification/Regression; Validation strategies; Visualization; with a box labelled "Design of a data mining method (Gait-CAD)"]
Plugins generate
• new time series from one (e.g. by low-pass or highpass filtering, segmentation) or more (e.g. minimum,
mean or maximum value) existing time series, or
• new single features from one time series in a predefined segment (e.g. mean value for the complete
time series or the first 50% of sampling points). The
segment can be defined by a special file or interactively by selecting a region of interest.
Gait-CAD contains a large number of pre-defined plugins
and segments. The structure allows a user-defined expansion with special feature types for each specific application
field.
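The two plugin types described above can be sketched as plain functions. The example is written in Python for illustration; real Gait-CAD plugins are Matlab functions named plugin_*.m whose interface is not given here, so the function names, arguments, and the way a segment is specified (as fractions of the series length) are assumptions.

```python
def plugin_moving_average(series, window=3):
    """Time series -> time series: a simple low-pass filter
    (trailing moving average over up to `window` samples)."""
    smoothed = []
    for k in range(len(series)):
        start = max(0, k - window + 1)
        chunk = series[start:k + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def plugin_segment_mean(series, start_fraction=0.0, end_fraction=1.0):
    """Time series -> single feature: mean value inside a relative segment."""
    first = int(start_fraction * len(series))
    last = int(end_fraction * len(series))
    segment = series[first:last]
    if not segment:
        raise ValueError("segment contains no sample points")
    return sum(segment) / len(segment)


signal = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0]

# Single features: mean of the complete series and of its first 50 %.
mean_all = plugin_segment_mean(signal)
mean_first_half = plugin_segment_mean(signal, 0.0, 0.5)

# New time series derived from an existing one.
smoothed = plugin_moving_average(signal, window=2)
```

Because the segment is an argument rather than part of the function, the same feature definition can be reused for any predefined or interactively chosen region of interest.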
Macros are recorded sequences of clicked menu items and
control elements. Their main advantages are the automation
of long sequences of operations (e.g. for use in different
projects) and the opportunity to integrate user-defined functions. Macros can be modified manually because they are stored
as text in Matlab syntax.
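The idea of a macro as a replayable, editable sequence of operations can be illustrated as follows. This is a conceptual sketch in Python, not Gait-CAD's macro format (which is Matlab text); the operation names, the parameters, and the representation of a project as a plain list are hypothetical.

```python
def run_macro(macro, operations, project):
    """Replay recorded steps in order; each step is (operation name, parameters)."""
    for name, params in macro:
        project = operations[name](project, **params)
    return project


# Hypothetical operations standing in for menu items and control elements.
def select_data_points(project, max_value):
    """Keep only the values that do not exceed max_value."""
    return [x for x in project if x <= max_value]


def scale(project, factor):
    """Multiply every value by a constant factor."""
    return [x * factor for x in project]


OPERATIONS = {"select_data_points": select_data_points, "scale": scale}

# A macro is plain data, so it can be edited by hand and reused across projects.
macro = [("select_data_points", {"max_value": 10}),
         ("scale", {"factor": 0.5})]

result_a = run_macro(macro, OPERATIONS, [2, 4, 50])
result_b = run_macro(macro, OPERATIONS, [8, 12])
```

Applying the same `macro` to two different inputs mirrors the use of one recorded sequence in different projects.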
Application-specific extension packages can be easily
integrated into the graphical user interface. Gait-CAD
contains templates for new menu items and control elements as a starting point for manual modification. This
allows users to integrate their own functions, which can use any parameter from the control elements or any available variable. An
example is a special package for electroneurography provided by the University of Freiburg. It contains the algorithms described in [3].
Results
In many clinical applications, the available data set contains time series of recorded bioelectric signals such as
muscle, nerve, or brain signals.
The automatic design of data mining solutions offers an
objective and reliable method for the generation of hypotheses for clinical trials, the data-based design of clinical
decision support systems for diagnosis and therapy planning, and the adaptation of medical devices to individual
patients.
An example of the latter task is the detection of user intentions from brain, nerve, or muscle signals, or the processing of nerve signals from natural limbs, for
neuroprostheses (Figure 3).
Figure 3: Interface for the design of neuroprostheses [4]
[Figure: blocks labelled Intentions; Central Nervous System; Neural Interface; Sensor; Interface Software; Data analysis; Control; Stimulator; Feedback; Pattern generator; Artificial Prostheses; Natural Limbs]
Table 1: Examples of recent applications of Gait-CAD to
bioelectric signals (EMG: electromyography, ENG: electroneurography, EEG: electroencephalography, ECoG:
electrocorticography)

Applications                                                                  Signals
Hand prosthesis control [5]                                                   EMG
Detection of mechanical stimuli from nerve signals with cuff electrodes [6]   ENG
Detection of artefacts from Functional Electrical Stimulation (FES) [7]       EMG, ENG
Analysis of Central Pattern Generators [8]                                    EMG
Design algorithms for Brain Computer Interfaces [5, 9]                        EEG, ECoG
Gait analysis [10]                                                            EMG
Data analysis plays a key role in this concept, both for the data-based detection of human intentions from bioelectric signals and for the use of biosensors. Gait-CAD has supported
these steps in a number of different scenarios (Table 1).
For the first task, the detection of human intentions, Brain Computer Interfaces are often
controlled by imagined movements. The brain signals can
be recorded by surface (EEG) or invasive (ECoG) electrode arrays, resulting in a set of time series. The data mining task consists of the extraction of new time series (e.g. by
bandpass filters) and a classification to differentiate the
movement intentions. In addition, an analysis of the spatial
and temporal information content is useful to understand
the processes [9]. Hand prostheses are usually controlled
by muscle signals originating from two electrodes. Here,
classification problems exist for the switching between
different grasp types [5].
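The combination of frequency-selective feature extraction and classification can be sketched on synthetic data. Note the simplification: instead of a bandpass filter that yields a new time series, the sketch computes the signal power inside a frequency band from a naive discrete Fourier transform and uses it as a single feature. The sampling rate, the band limits, the class labels, and the sine-wave "trials" are arbitrary assumptions and carry no physiological meaning.

```python
import cmath
import math


def band_power(series, sample_rate, low_hz, high_hz):
    """Mean power of the DFT bins whose frequency lies in [low_hz, high_hz]."""
    n = len(series)
    powers = []
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if low_hz <= freq <= high_hz:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(series))
            powers.append(abs(coeff) ** 2 / n ** 2)
    return sum(powers) / len(powers) if powers else 0.0


def make_sine(freq_hz, amplitude, n=100, sample_rate=100.0):
    """Synthetic stand-in for one recorded trial."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n)]


def nearest_mean_classify(value, class_means):
    """Assign the class whose mean feature value is closest to `value`."""
    return min(class_means, key=lambda label: abs(value - class_means[label]))


SAMPLE_RATE = 100.0   # Hz, assumed
BAND = (8.0, 12.0)    # Hz, assumed band of interest

# "Training": one synthetic trial per hypothetical class.
class_means = {
    "movement": band_power(make_sine(10.0, 1.0), SAMPLE_RATE, *BAND),
    "rest": band_power(make_sine(25.0, 1.0), SAMPLE_RATE, *BAND),
}

# A new trial with activity inside the band, at a lower amplitude.
new_feature = band_power(make_sine(10.0, 0.8), SAMPLE_RATE, *BAND)
predicted = nearest_mean_classify(new_feature, class_means)
```

With real recordings, one such feature would be computed per electrode and per band, which leads directly to the feature selection and aggregation steps listed earlier.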
For future neuroprostheses, a scenario combining functional
electrical stimulation with the recording of afferent nerve signals
induced by mechanical stimuli is envisaged. The nerve
signals are recorded by cuff electrodes. Here, very high
sampling frequencies (50 kHz) are necessary to extract
useful information. The task is to detect and localize mechanical stimuli, which is formulated as a classification problem [6].
Besides these applications, Gait-CAD is now used in many
medical, biological, and technical application scenarios.
From a data mining point of view, these very different
applications can be unified and the synergies can be used
with the presented platform.
Discussion
The aim of Gait-CAD is to provide an interface to apply
and compare data mining methods. Its architecture allows
the toolbox to be extended with further algorithms. Everyone is
invited to support the further development of Gait-CAD.
Acknowledgements
Thanks to all the busy programmers, developers of algorithms, and testers, especially to Tobias Loose and Sebastian Gollmer. The support by the Deutsche Forschungsgemeinschaft (German Research Foundation) within the project "Diagnosis support in gait analysis" and the Collaborative
Research Center "Humanoid Robots" was a great help to
build the basis for the further development of the toolbox.
References
[1] Fayyad, U.; Piatetsky-Shapiro, G.; Smyth, P.: From
Data Mining to Knowledge Discovery in Databases.
AI Magazine, vol. 17, pp. 37–54, 1996.
[2] Mikut, R.; Reischl, M.; Burmeister, O.; Loose, T.:
Data Mining in Medical Time Series. Biomedizinische Technik, vol. 51, pp. 288–293, 2006.
[3] Krüger, T. B.; Levchuk, O.; Stieglitz, T.: Decoding of
Neural Signals with MATLAB - Onset Detection and
Classification as a Guided Tool. Biomedizinische
Technik, vol. 52, Ergänzungsband, 2007.
[4] Mikut, R.; Krüger, T.; Reischl, M.; Burmeister, O.;
Rupp, R.; Stieglitz, T.: Regelungs- und Steuerungskonzepte für Neuroprothesen am Beispiel der oberen
Extremitäten. at - Automatisierungstechnik, vol. 54,
pp. 523–536, 2006.
[5] Reischl, M.: Ein Verfahren zum automatischen
Entwurf von Mensch-Maschine-Schnittstellen am
Beispiel myoelektrischer Handprothesen. Dissertation, Universität Karlsruhe, Universitätsverlag
Karlsruhe. 2006.
[6] Krüger, T.; Reischl, M.; Lago, N.; Burmeister, O.;
Mikut, R.; Ruff, R.; Hoffmann, K.-P.; Navarro, X.;
Stieglitz, T.: Analysis of Microelectrode-Signals in
the Peripheral Nervous System, In-Vivo and PostProcessing. In: Proc., Mikrosystemtechnik Kongress
Deutschland, pp. 69–72. Freiburg: VDE-Verlag.
2005.
[7] Rohm, M.: Evaluierung und Inbetriebnahme von
Sensorkonzepten für die Steuerung von funktionellen
Orthesen der oberen Extremität. Diplomarbeit, Universität Darmstadt, Forschungszentrum Karlsruhe.
2008.
[8] Chen, Y.: A Concept for the Application of Neural
Oscillators and Spinal Reflexes to Humanoid Robots
and Neuroprostheses. Diplomarbeit, Universität
Karlsruhe (TH), in preparation, 2008.
[9] Burmeister, O.; Reischl, M.; Mikut, R.: Application
of Time-Variant Classifiers to Invasively Recorded
Signals from Brain and Peripheral Nerve. Biomedizinische Technik, vol. 52, Ergänzungsband, 2007.
[10] Wolf, S.; Loose, T.; Schablowski, M.; Döderlein, L.;
Rupp, R.; Gerner, H. J.; Bretthauer, G.; Mikut, R.:
Automated feature assessment in instrumented gait
analysis. Gait & Posture, vol. 23 (3), pp. 331–338, 2006.