BME452 Biomedical Signal Processing
Lecture 3
Signal conditioning
Lecture 3 BME452 Biomedical Signal Processing 2013 (copyright Ali Işın, 2013)
Lecture 3 Outline
 In this lecture, we’ll study the following signal conditioning methods (specifically for noise reduction):
 Ensemble averaging
 Median filtering
 Moving average filtering
 Principal component analysis
 Independent component analysis (in brief)
 Before we study these, an introduction to some mathematics will be given
Mean
 The arithmetic mean is the "standard" average, often simply called the "mean"
\bar{x} = \frac{1}{N}\sum_{n=1}^{N} x[n]   or, equivalently,   \bar{x} = \frac{1}{N}\sum_{n=0}^{N-1} x[n]
 where N is used to denote the data size (length)
 In MATLAB, indices run n = 1, …, N, but sometimes we use n = 0, 1, …, N-1
Example
An experiment yields the following data: 34,27,45,55,22,34
To get the arithmetic mean
 How many items? There are 6. Therefore N=6
 What is the sum of all items? =217.
 To get the arithmetic mean divide sum by N, here 217/6=36.1667
Expectation
 What is the expected value of X, E[X]? Simply put, it refers to the sum divided by the quantity, i.e. the mean of the value inside the square brackets
 Eg: E[x^2] = \frac{1}{N}\sum_{n=0}^{N-1} x[n]^2
Mean removal for signals
 Very often, we set the mean to zero before performing any signal analysis
 This is to remove the dc (0 Hz) noise
 xm = x - mean(x)
[Figure: the same signal before mean removal (values roughly between 13 and 22) and after mean removal (values roughly between -4 and 5), each plotted over about 300 samples]
Mean removal across channels/recordings
 Sometimes, noise corrupts all the signals in a multi-channel signal, or all the recordings of a single-channel signal
 Since the noise is common to all the channels/recordings, the simplest way of removing this noise is to remove the mean across the channels/recordings (a MATLAB sketch is shown below)
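 A minimal MATLAB sketch of this idea, assuming the channels/recordings are stored as the rows of a matrix X (names are illustrative):
% X is M x N: M channels/recordings, each with N samples
common = mean(X, 1);                           % 1 x N mean across channels at each sample point
Xclean = X - repmat(common, size(X, 1), 1);    % remove the common (shared) component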
Standard deviation (σ)
 Measures how spread out the values in a data set are
 Suppose we are given a signal x_1, ..., x_N of real-valued numbers (all recorded signals are real valued)
 The arithmetic mean of this population is defined as
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i
 The standard deviation of this population is defined as
\sigma = \sqrt{ \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2 }
 Given only a sample of values x_1, ..., x_N from some larger population, many authors define the sample (or estimated) standard deviation by
s = \sqrt{ \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2 }
 This is known as an unbiased estimator for the actual standard deviation
Standard deviation example
Interpreting standard deviation
 A large standard deviation indicates that the data points are far from the mean, and a small standard deviation indicates that they are clustered closely around the mean
 For example, each of the three samples (0, 0, 14, 14), (0, 6, 8, 14), and (6, 6, 8, 8) has an average of 7.
 Their (population) standard deviations are 7, 5 and 1, respectively.
 The third set has a much smaller standard deviation than the other two because its values are all close to 7. (A quick MATLAB check is given below.)
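 These values can be checked in MATLAB (a minimal sketch; std(x,1) gives the population standard deviation used here, std(x) the sample version):
std([0 0 14 14], 1)    % returns 7
std([0 6  8 14], 1)    % returns 5
std([6 6  8  8], 1)    % returns 1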
Normalisation
 Sometimes, we may wish to normalise a signal to mean = 0 and standard deviation = 1
 For example, if we record the same signal using different instruments with different amplification factors, it will be difficult to analyse the signals together
 In this case, we normalise the signals using
x_{nor}[n] = \frac{x[n] - \bar{x}}{std(x)}
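 In MATLAB this is a one-liner (a minimal sketch):
xnor = (x - mean(x)) / std(x);    % zero mean, unit standard deviation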
Variance
 Variance is simply the square of the standard deviation
var = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2
Uncertainty measure
 Variance may be thought of as a measure of uncertainty
 When deciding whether measurements agree with a theoretical prediction, variance can be used
 If the variance (computed using the predicted mean) is high, then the measurements contradict the prediction
 Example: say we have predicted that x[1]=7, x[2]=6, x[3]=5
 x is measured 3 times => (7.2 6.7 5.6); (4.2 6.8 5.2); (11.2 6.3 5.9)
 Do this => compute the variance using the predicted value as the mean
 var[1]=12.76, var[2]=0.610, var[3]=0.605
 So we know that the x[1] measurements contradict the prediction, while the x[2] and x[3] measurements probably do not (a MATLAB check follows)
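 A minimal MATLAB sketch reproducing the numbers in the example (variable names are illustrative; the predicted values are used in place of the mean):
pred = [7 6 5];                     % predicted x[1], x[2], x[3]
meas = [ 7.2 6.7 5.6 ;              % three repeated measurements of x
         4.2 6.8 5.2 ;
        11.2 6.3 5.9 ];
N = size(meas, 1);                  % number of measurements (3)
v = sum((meas - repmat(pred, N, 1)).^2) / (N - 1)
% v = [12.76 0.610 0.605], matching var[1], var[2], var[3] above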
Covariance
 If we have multi-channel/multi-trial recorded signals, we can compute the cross-variance, or simply covariance
 Covariance measures the variance between different signals (from different channels/recordings)
 The covariance between two signals X and Y, with respective means μ and ν, is
cov(X, Y) = E[(X - μ)(Y - ν)]
 The covariance is sometimes used as a measure of "linear dependence" between the two signals, but correlation is a better measure
Correlation
 The correlation between two signals X and Y is
corr(X, Y) = \frac{cov(X, Y)}{\sigma_X \sigma_Y}
 It is simply normalised covariance
 It measures the linear dependence between X and Y
 The correlation is 1 in the case of an increasing linear relationship, −1 in the case of a decreasing linear relationship, and some value in between in all other cases, indicating the degree of linear dependence between the variables
 The closer the coefficient is to either −1 or 1, the stronger the correlation between the variables (a quick MATLAB example is shown below)
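 A minimal MATLAB sketch of both quantities on two made-up signals (names and values are illustrative only):
x = [1 2 3 4 5 6];
y = 2*x + 0.1*randn(1, 6);     % y is (almost) linearly related to x
c = cov(x, y);                 % 2x2 covariance matrix; c(1,2) is cov(x,y)
r = corrcoef(x, y);            % 2x2 matrix; r(1,2) is close to 1 here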
Application of correlation (example)
 The diagram shows how an unknown signal can be identified
 A copy of a known reference signal is correlated with the unknown signal
 The correlation will be high if the reference is similar to the unknown signal
 The unknown signal is correlated with a number of known reference functions
 A large value of correlation shows the degree of similarity to the reference
 The largest correlation value is the most likely match (see the sketch below)
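 A minimal MATLAB sketch of this matching scheme, assuming the reference signals are stored as the rows of a matrix refs (all names are illustrative):
% unknown: 1 x N signal, refs: K x N matrix of reference signals
K = size(refs, 1);
r = zeros(1, K);
for k = 1:K
    c = corrcoef(unknown, refs(k, :));   % 2x2 correlation matrix
    r(k) = c(1, 2);                      % correlation with the k-th reference
end
[rmax, best] = max(r);                   % 'best' is the most likely match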
Application of correlation (another example)
Application to heart disease detection using ECG signals
 Cross-correlation is one way in which different types of heart disease can be identified using ECG signals
 Each heart disease has a unique ECG signal
 Some examples of ECG signals for different diseases are shown below
[Figure: example ECG recordings (amplitude in arbitrary units vs. sampling points, about 700 samples each) for Normal Sinus Rhythm, Sinus bradycardia, Right Bundle Branch Block and Accelerated Junctional Rhythm]
 The system has a library of pre-recorded ECG signals (known as templates)
 An unknown ECG signal is correlated with all the ECG templates in this library
 The largest correlation is the most likely match of the heart disease
Signal-to-noise ratio (SNR)
 Before we move on to the noise reduction methods, we need a measure of the noise in the signals
 This is important to gauge the performance of the noise reduction techniques
 For this purpose, we use SNR
 SNR = 10log10[(signal energy)/(noise energy)]
Energy = \sum_{n=1}^{N} x(n)^2
 The original noise is
 x(noise) = x(original signal) – x(noisy signal)
 After using some noise reduction method,
 x(noise) = x(original signal) – x(noise reduced signal)
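 A minimal MATLAB sketch of this SNR computation, assuming the clean signal is available (as in the worked examples that follow; names are illustrative):
noise  = original - denoised;     % or original - noisy, for the SNR before noise reduction
snr_dB = 10*log10( sum(original.^2) / sum(noise.^2) );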
Ensemble averaging
 If we have many recordings, we can use ensemble averaging to reduce noise that is not correlated between the recordings
 Repeated recordings are known as trials
 EP EEG signals from one trial to another are about the same (high correlation)
 But the noise will be different from one trial to another (low correlation)
 Hence, it is possible to use ensemble averaging to reduce the noise (a MATLAB sketch follows the figure)
[Figure: ensemble averaging to reduce noise from Evoked Potential (EP) EEG – the clean EP EEG, the noisy trials (EP EEG + noise, trials 1, 2, …, 20) and the EP EEG after ensemble averaging, each plotted over about 300 samples]
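 A minimal MATLAB sketch, assuming the trials are stored as the rows of a matrix trials (names are illustrative):
% trials is M x N: M repeated trials, each with N samples
ep_estimate = mean(trials, 1);    % ensemble average across the trials
% the EP (correlated across trials) is preserved, while the amplitude of the
% uncorrelated noise is reduced roughly by a factor of sqrt(M)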
Worked example 1 - Ensemble averaging
 Assume we have 3 signals corrupted with noise. Assume we have the original also (for SNR computation).
 1. Set the mean to zero first
   n                0     1     2     3
   Noisy signal 1   2.9   4.9   5.3   7.3
   Noisy signal 2   2.9   5.1   5.1   6.9
   Noisy signal 3   3.1   4.8   5.1   7.0
   Original         3     5     5     7
   After mean removal:
   n                0     1     2     3
   Noisy signal 1  -2.2  -0.2   0.2   2.2
   Noisy signal 2  -2.1   0.1   0.1   1.9
   Noisy signal 3  -1.9  -0.2   0.1   2.0
   Original        -2.0   0.0   0.0   2.0
 2. The ensemble average is (the average is done for each sample point n)
   n                 0     1     2     3
   Ensemble average -2.1  -0.1   0.1   2.0
 3. The noises in the signals are (original signal – noise corrupted signal)
   n                       0     1     2     3
   Signal 1 noise          0.2   0.2  -0.2  -0.2
   Signal 2 noise          0.1  -0.1  -0.1   0.1
   Signal 3 noise         -0.1   0.2  -0.1   0.0
   Ensemble average noise  0.1   0.1  -0.1   0.0
 4. SNR = 10log10(e(signal)/e(noise)), with original signal energy = 8
                  signal 1   signal 2   signal 3   ensemble average
   noise energy   0.16       0.04       0.06       0.03
   SNR            16.99      23.01      21.25      24.26
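 The whole example can be reproduced with a short MATLAB script (a minimal sketch; the values match the tables above up to rounding):
X = [2.9 4.9 5.3 7.3;                      % noisy signal 1
     2.9 5.1 5.1 6.9;                      % noisy signal 2
     3.1 4.8 5.1 7.0];                     % noisy signal 3
orig  = [3 5 5 7];
Xm    = X - repmat(mean(X, 2), 1, 4);      % step 1: remove the mean of each signal
origm = orig - mean(orig);
ens   = mean(Xm, 1);                       % step 2: ensemble average
noise = repmat(origm, 4, 1) - [Xm; ens];   % step 3: noise in each signal and in the average
snr   = 10*log10( sum(origm.^2) ./ sum(noise.^2, 2) )
% approximately [17.0; 23.0; 21.3; 23.8] dB (the table rounds intermediate values)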
Median filtering
 Similar to ensemble averaging, if we have many recordings, we can use median filtering to reduce noise that is not correlated between the recordings
 What is median filtering?
 If we have x[1] as [3 2 1 0 6 7 9 3 2] from 9 trials, we sort the numbers from small to big, then take the centre value (i.e. the 5th) as the median
 Sorted x[1] is [0 1 2 2 3 3 6 7 9], so median x[1] = 3
 Median filtering is advantageous compared to ensemble averaging if there is one trial containing a lot of noise AND if the number of trials/recordings is small
 This is because the one heavily noise-corrupted signal will distort the ensemble average values but is less likely to affect the median values
[Figure: the M trials are arranged as rows of length N (n = 1, 2, …, N); at each sample point the median is taken across the M trials to obtain the median-filtered values]
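 A minimal MATLAB sketch (trials as rows of a matrix, as in the figure; names are illustrative):
x1 = [3 2 1 0 6 7 9 3 2];     % the 9 trial values at sample point n = 1
median(x1)                    % returns 3, as above
% for a full M x N matrix of trials:
% xmed = median(trials, 1);   % median across trials at every sample point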
Worked example 2 – median filtering
 Assume we have 3 signals corrupted with noise, one heavily corrupted (assume the mean has already been set to zero)
   n                0      1      2      3
   Noisy signal 1  -2.15   0.95  -0.25   1.45
   Noisy signal 2  -9      0      5      4
   Noisy signal 3  -2.25  -0.45   0.85   1.85
   Original        -2.0    0.0    0.0    2.0
 1. The ensemble average and median filtered signals
   n                 0      1      2      3
   Ensemble average -4.47   0.17   1.87   2.43
   Median filter    -2.25   0      0.85   1.85
 2. The noises in the signals are (original signal – noise corrupted signal)
   n                             0      1      2      3
   Noise in signal 1             0.15  -0.95   0.25   0.55
   Noise in signal 2             7      0     -5     -2
   Noise in signal 3             0.25   0.45  -0.85   0.15
   Noise in ensemble averaging   2.47  -0.17  -1.87  -0.43
   Noise in median filter signal 0.25   0     -0.85   0.15
 3. SNR = 10log10(e(signal)/e(noise)), with original signal energy = 8
                  signal 1   signal 2   signal 3   ensemble average   median filter
   noise energy   1.29       78         1.01       9.81               0.81
   SNR            7.93      -9.89       8.99      -0.89               9.95
 Which technique gave better noise reduction using SNR – ensemble averaging or median filtering?
 Why? (A MATLAB check is given below.)
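 A minimal MATLAB sketch that reproduces the comparison and answers the question (median filtering wins here because the single heavily corrupted trial distorts the mean but not the median):
Xm   = [-2.15  0.95 -0.25 1.45;      % trial 1
        -9     0     5    4   ;      % trial 2 (heavily corrupted)
        -2.25 -0.45  0.85 1.85];     % trial 3
orig = [-2 0 0 2];
ens  = mean(Xm, 1);                  % ensemble average: [-4.47 0.17 1.87 2.43]
med  = median(Xm, 1);                % median filter:    [-2.25 0    0.85 1.85]
snr_ens = 10*log10( sum(orig.^2) / sum((orig - ens).^2) )   % roughly -0.9 dB
snr_med = 10*log10( sum(orig.^2) / sum((orig - med).^2) )   % roughly 10 dB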
Moving average filtering
 How do we reduce noise if we have only one signal from one recording/trial?
 We can’t use ensemble averaging or median filtering
 Normally, in any signal, the few points before and after a certain point n are correlated (i.e. related)
 But generally the noise is not correlated
 So, we can use moving average (MA) filtering
 It is defined as
y[n] = \frac{1}{S}\sum_{i=n}^{n+S-1} x[i]
 where S is the filter order
 Example: for S=3, y[5]=(x[5]+x[6]+x[7])/3
[Figure: over the short window from n to n+S-1 the signal correlation is high while the noise correlation is low]
 For signals x and y to remain of the same sample length, we have to pad (S-1) zeros to the signal x to get the last (S-1) points of the signal y
Moving average filtering – zero padding
 If zero padding is NOT allowed: for x[n] with N=256 samples and S=3, the moving averaged signal y[n] has only N=254 samples.
 The length of y is S-1 less than x, because the last complete window is y[254]=(x[254]+x[255]+x[256])/3.
 If zero padding is allowed: y[n] has N=256 samples no matter what the value of S is, so the length of y is the same as x.
 The last complete window is still y[254]=(x[254]+x[255]+x[256])/3, and the final points use the samples that remain:
 y[255]=(x[255]+x[256]+0)/2
 y[256]=(x[256]+0+0)/1
Example - moving average filtering
 Assume we have an EEG signal corrupted with noise
 Set the mean to zero
 Apply a moving average filter to the noisy signal (use filter order = 3 and 5)
 The higher filter order will remove more noise, but it will also distort the signal more (i.e. remove parts of the signal as well)
load eeg;                                 % noisy EEG signal, N = 256 samples
N=length(eeg);
% moving average filter, order S = 3
for i=1:N-2,
  eegMA1(i)=(eeg(i)+eeg(i+1)+eeg(i+2))/3;
end
eegMA1(255)=(eeg(255)+eeg(256))/2;        % last points: average the samples that remain
eegMA1(256)=eeg(256)/1;
% moving average filter, order S = 5
for i=1:N-4,
  eegMA2(i)=(eeg(i)+eeg(i+1)+eeg(i+2)+eeg(i+3)+eeg(i+4))/5;
end
eegMA2(253)=(eeg(253)+eeg(254)+eeg(255)+eeg(256))/4;
eegMA2(254)=(eeg(254)+eeg(255)+eeg(256))/3;
eegMA2(255)=(eeg(255)+eeg(256))/2;
eegMA2(256)=eeg(256)/1;
subplot(3,1,1), plot(eeg,'g');            % noisy EEG
subplot(3,1,2), plot(eegMA1,'r');         % filtered, S = 3
subplot(3,1,3), plot(eegMA2,'b');         % filtered, S = 5
 So, a compromise has to be found for the value of S (normally by trial and error)
[Figure: the noisy EEG (top), the S=3 moving averaged EEG (middle) and the S=5 moving averaged EEG (bottom), each plotted over about 300 samples with amplitudes roughly between -5 and 5]
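 The hard-coded loops above can be written once for any filter order S; a minimal generalised sketch with the same edge handling:
S = 3;                                    % filter order (e.g. 3 or 5)
N = length(eeg);
eegMA = zeros(1, N);
for n = 1:N
  idx = n:min(n+S-1, N);                  % window, truncated at the end of the signal
  eegMA(n) = sum(eeg(idx)) / length(idx); % average over the samples available
end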
Median filter for noisy images
 Consider applying median filtering to some noisy images
 In a computer, these grayscale images are stored as 2D arrays
 x(i,j), where i and j are the coordinates and x is the grayscale value (in general from 0 (black) to 255 (white))
 [Figure: the noisy images and the result after applying the median filter]
 A mean (averaging) filter could be applied in a similar manner, though for images the median filter normally gives better results
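 A minimal MATLAB sketch, assuming the Image Processing Toolbox is available for medfilt2 and that noisy.png is a hypothetical grayscale image file:
img      = imread('noisy.png');       % hypothetical grayscale image
filtered = medfilt2(img, [3 3]);      % median over a 3x3 neighbourhood of each pixel
imshowpair(img, filtered, 'montage'); % compare before and after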
Principal component analysis
 PCA can be used to reduce noise from signals provided we have repeated recordings, signals from a number of trials, or multi-channel signals
 Principal components (PCs) are obtained from PCA; they are orthogonal signals, i.e. signals that are uncorrelated with each other
 Since noise is less correlated between the trials than the signals are, the first few PCs will account for the signals while the last few PCs will account for the noise
 By discarding the last few PCs before reconstruction, we’ll get the signals without noise/with less noise
Principal component analysis – algorithm
 PCA algorithm (a MATLAB sketch of these steps is given below)
 Organise the data X in an M x N matrix
 Set the mean to zero
 Compute CX = covariance of the matrix X
 Compute the eigenvalues and eigenvectors of CX
 Sort the eigenvectors (i.e. principal components) in descending order of eigenvalue
 Compute the Zscores
 Decide how many PCs to keep using some criterion
 Reconstruct the noise reduced signals using the first few PCs and Zscores
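 A minimal MATLAB sketch of these steps, following the worked example in the next slides (the data layout – signals as rows of X – and the 99% threshold are assumptions taken from that example):
% X is M x N: M trials/channels, each with N samples
Xm = X - repmat(mean(X, 2), 1, size(X, 2));   % set the mean of each signal to zero
CX = cov(Xm');                                % M x M covariance between the signals
[V, D] = eig(CX);                             % eigenvectors (PCs) and eigenvalues
[d, order] = sort(diag(D), 'descend');        % sort PCs by descending eigenvalue
Vsort = V(:, order);
Z = Vsort' * Xm;                              % Zscores
q = find(cumsum(d)/sum(d) > 0.99, 1);         % keep enough PCs for 99% of the variance
Xnonoise = Vsort(:, 1:q) * Z(1:q, :);         % reconstruct the noise reduced signals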
Eigenvector, eigenvalue – a brief review
 The steps of setting the mean to zero and computing the covariance have been covered earlier, so let us move to the step of computing the eigenvectors and eigenvalues
 Let us assume that A = cov(X), where X is the mean-zero data
 In MATLAB, [V,D] = eig(A) produces matrices of eigenvalues (D) and eigenvectors (V) of matrix A
 These satisfy A*V = V*D (note: A has to be a square matrix)
 Eg: A = [2 3; 2 1]
 So A*[3; 2] = [12; 8] = 4*[3; 2]
 [3; 2] is the eigenvector and 4 is the eigenvalue
 The eigenvector [3; 2] can be thought of as the vector direction, and the eigenvalue 4 is the weight of this vector
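 A quick MATLAB check of this small example (a minimal sketch):
A = [2 3; 2 1];
[V, D] = eig(A);     % columns of V are eigenvectors, diag(D) the eigenvalues
A*[3; 2]             % returns [12; 8], i.e. 4*[3; 2]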
Eigenvector, eigenvalue (cont.)
 Finding the eigenvalues and eigenvectors of a matrix bigger than 3 x 3 by hand is extremely difficult, so we will skip the algorithms and just use the MATLAB function eig
 Example: for the square matrix A = [3 0 1; -4 1 2; -6 0 -2], decide which, if any, of a given set of candidate column vectors are eigenvectors of that matrix and give the corresponding eigenvalue
 Answer: the eigenvector is [0; 1; 0] and the eigenvalue is 1, because A*[0; 1; 0] = [0; 1; 0] = 1*[0; 1; 0]
Sort the eigenvectors
 Sort the eigenvectors from big to small using the eigenvalues
 Let’s use the example we saw earlier for ensemble averaging and median filtering
 X = [2.9 4.9 5.3 7.3; 2.9 5.1 5.1 6.9; 3.1 4.8 5.1 7]
 Xm = [-2.2 -0.2 0.2 2.2; -2.1 0.1 0.1 1.9; -1.9 -0.2 0.1 2.0]
 A = cov(Xm') =
   3.2533  2.9333  2.8800
   2.9333  2.6800  2.5933
   2.8800  2.5933  2.5533
 The eigenvectors and eigenvalues are given by [V,D] = eig(A):
   V =  0.7103  0.3335  0.6198        D = 0.0017  0       0
       -0.1039 -0.8213  0.5610            0       0.0272  0
       -0.6962  0.4629  0.5487            0       0       8.4578
 The corresponding eigenvalues are 0.0017, 0.0272 and 8.4578
 So now sort the eigenvectors (the columns of V) in descending order of eigenvalue: 8.4578, 0.0272, 0.0017
 So the sorted eigenvectors are
   Vsort =  0.6198  0.3335  0.7103
            0.5610 -0.8213 -0.1039
            0.5487  0.4629 -0.6962
Zscores
 Zscores = Vsort' * Xm
 where Vsort is the sorted eigenvectors and Xm is the mean-zero data matrix
 In the previous example, the size of A = 3, so we will have 3 Zscores (the rows below)
 Zscores will have the same dimensions as Xm
   Zscores = -3.5843 -0.1776  0.2349  3.5270    (Zscore 1)
              0.1115 -0.2414  0.0309  0.0991    (Zscore 2)
             -0.0218 -0.0132  0.0621 -0.0270    (Zscore 3)
How to select the number of PCs to keep
 The PCs with higher eigenvalues represent the signals while the PCs with lower eigenvalues represent the noise
 So we keep the first few PCs and discard the rest
 But how many PCs do we keep?
 Use a certain percentage of variance to retain, normally 95% or 99%
 Eigenvalues represent the weight of the PCs, i.e. some sort of variance (power) measure of the PCs
 So, we can use sum(D1:Dq)/sum(D1:Dlast) > 0.99, where D represents the sorted eigenvalues [D1, D2, D3, …, Dlast]
 In our example, say we wish to retain 99% of the variance
 The sorted eigenvalues are 8.4578, 0.0272, 0.0017
 Sum(D1:Dlast) = 8.4867
 Sum(D1:D1) = 8.4578; sum(D1:D1)/sum(D1:Dlast) = 0.9966
 Sum(D1:D2) = 8.4850; sum(D1:D2)/sum(D1:Dlast) = 0.9998
 Sum(D1:Dlast) = 8.4867; sum(D1:Dlast)/sum(D1:Dlast) = 1.0
 Since the first eigenvalue accounts for 99.66% of the variance (which is more than 99%), we can discard the second and third PCs
 If we wish to retain 99.97%, how many PCs do we retain? Answer = 2 (see the quick check below)
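 A one-line MATLAB check of these ratios (a minimal sketch):
d = [8.4578 0.0272 0.0017];    % sorted eigenvalues
cumsum(d) / sum(d)             % returns approximately [0.9966 0.9998 1.0000]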
Reconstruct using the selected PCs
 To get back the original signals without noise, we need to reconstruct using the selected PCs
 Xnonoise = Vselected * Zscoreselected
 In our example, only 1 PC was selected, so the first eigenvector and the first Zscore will be used to get back the 3 noise reduced signals
 Xnonoise = Vsort(:,1) * Zscore(1,:) =
   -2.2217 -0.1101  0.1456  2.1861
   -2.0107 -0.0996  0.1318  1.9786
   -1.9668 -0.0975  0.1289  1.9353
 noise = Xm - Xnonoise =
    0.0217 -0.0899  0.0544  0.0139
   -0.0893  0.1996 -0.0318 -0.0786
    0.0668 -0.1025 -0.0289  0.0647
 Energy (noise) = 0.0117  0.0550  0.0200
 Original signal, x = [-2 0 0 2]; this is the actual original mean-removed signal from the earlier slide
 Energy (original signal) = 8
 SNR = 28.3483  21.6265  26.0222
 SNR using PCA is generally higher than ensemble averaging or median filtering, and we get 3 signal outputs, unlike the single output from ensemble averaging or median filtering
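 The whole PCA worked example can be reproduced with a short MATLAB script (a minimal sketch; the printed values match the slides up to rounding):
X  = [2.9 4.9 5.3 7.3; 2.9 5.1 5.1 6.9; 3.1 4.8 5.1 7];
Xm = X - repmat(mean(X, 2), 1, 4);        % mean-zero data
[V, D] = eig(cov(Xm'));                   % eigenvectors/eigenvalues of the covariance
[d, order] = sort(diag(D), 'descend');    % d is approximately [8.4578 0.0272 0.0017]
Vsort = V(:, order);
Z = Vsort' * Xm;                          % Zscores
Xnonoise = Vsort(:, 1) * Z(1, :);         % reconstruct keeping only the first PC
noise = Xm - Xnonoise;
orig  = [-2 0 0 2];
SNR = 10*log10( sum(orig.^2) ./ sum(noise.^2, 2) )   % approximately [28.3; 21.6; 26.0]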
Principal component analysis – an example of application
 Consider the following 3 noise corrupted signals
[Figure: noisy EP signals (trials 1, 2 and 3), each plotted over about 300 samples]
 Obtain the principal components (in descending order of eigenvalue magnitude)
 Obtain the Zscores
 Decide how many PCs to retain – assume that we retain only the first PC
 Reconstruct using only one PC
 By retaining only the first PC for reconstruction, we will have 3 noise reduced EP signals
[Figure: the noise reduced EP signals (trials 1, 2 and 3) after reconstruction]
Independent component analysis – a brief study
 ICA is a new method that can be used to separate noise from signal
 Sometimes known as blind source separation
 It requires more than one signal recording (like PCA)
 ICA separates the signals into independent signals (signals and noises – we keep the signals, discard the noises)
 Example:
 Assume we have 3 observed (i.e. recorded) signals x1[n], x2[n] and x3[n] from 3 original signal sources s1[n], s2[n] and s3[n]
 x1[n]=a11.s1[n]+a12.s2[n]+a13.s3[n]
 x2[n]=a21.s1[n]+a22.s2[n]+a23.s3[n]
 x3[n]=a31.s1[n]+a32.s2[n]+a33.s3[n]
 The matrix A = [a11 a12 a13; a21 a22 a23; a31 a32 a33] is known as the mixing matrix
 ICA can be used to obtain the original signals by estimating the unmixing matrix W = [w11 w12 w13; w21 w22 w23; w31 w32 w33], where W = A^-1
 The original signals can then be obtained using (a small MATLAB illustration is given below)
 s1[n]=w11.x1[n]+w12.x2[n]+w13.x3[n]
 s2[n]=w21.x1[n]+w22.x2[n]+w23.x3[n]
 s3[n]=w31.x1[n]+w32.x2[n]+w33.x3[n]
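 A minimal MATLAB illustration of mixing and unmixing with a known matrix (the mixing matrix and source signals here are made up; a real ICA algorithm has to estimate W without knowing A):
n = 0:255;
s = [sin(2*pi*n/64);           % source 1: a slow sine wave
     randn(1, 256)];           % source 2: noise
A = [0.8 0.3; 0.4 0.9];        % hypothetical mixing matrix
x = A * s;                     % observed (mixed) signals
W = inv(A);                    % unmixing matrix, W = A^-1
s_hat = W * x;                 % recovered sources (equal to s here)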
Independent component analysis – a pictorial example
Figures from Independent Component Analysis,
Hyvarinen, Karhunen and Oja
Maximising non-gaussianity using kurtosis
 How does ICA work?
 The central limit theorem says that sums of non-gaussian random variables are closer to gaussian than the original ones
 So the mixed (combined) signals are more gaussian, while the source (original) signals are less gaussian
 => the independent signals are less gaussian than the combined signals
 So by maximising non-gaussian behaviour, we get closer to the original signals
 Kurtosis can be used to measure gaussian behaviour
 BUT what is gaussian?
 See the next slide
Gaussian and probability distributions
 The Gaussian (or normal) probability distribution is
pdf(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)
 BUT what is a probability distribution?
 The probability distribution of a discrete-time signal is simply the number of occurrences of each value
 Eg: if x has values from 1 to 10
x = [4 1 2 3 9 8 6 5 7 3 4 2 2 6 9 5 6 7];
count(1:10)=0;
for i=1:10,
  y=find(x==i);           % indices where x equals the value i
  count(i)=length(y);     % number of occurrences of the value i
end
plot(count);              % the probability distribution of x
[Figure: the probability distribution (number of occurrences) of x over the values 0 to 10]
 Gaussian distribution
 Super-gaussian distribution: the data close to the mean have higher occurrences
 Sub-gaussian distribution: most of the data have a similar number of occurrences
Kurtosis
 Non-gaussianity can be measured using kurtosis
 Gaussian signals have kurtosis = 3
 Sub-gaussian signals have a lower kurtosis value
 Super-gaussian signals have a higher kurtosis value
 Examples
x = [4 1 2 3 9 8 6 5 7 3 4 2 2 6 9 5 6 7];
y = kurtosis(x,0);   % unbiased kurtosis using MATLAB
% y = 1.9509 (sub-gaussian)
x = randn(1,100000); % gaussian signal with mean=0, std=1
plot(x);
y = kurtosis(x,0)    % unbiased kurtosis using MATLAB
% y = 3.00
[Figure: the gaussian-distributed signal from randn, first 1000 samples, amplitudes roughly between -4 and 4]
Example – Kurtosis for EP and noise
[Figure: the original signals – the EP signal (kurtosis = 3.32) and the noise (kurtosis = 2.81) – and the recorded signals – X1 = EP+noise (kurtosis = 2.79) and X2 = EP+noise (kurtosis = 2.61) – each plotted over about 300 samples]
 Can you see that the kurtosis is lower for the combined signals, i.e. the actual independent signals (the sources) have higher kurtosis?
Simple ICA algorithm – an example using EP and noise
 ICA tries to obtain EP and noise by estimating the unmixing matrix
 The solution is [EP[n]; noise[n]] = [0.8 0.9; 0.9 0.5] * [X1[n]; X2[n]], where [0.8 0.9; 0.9 0.5] is the unmixing matrix
 In the beginning, we don’t know the unmixing matrix!
 A simple ICA method is to randomly generate values in [0,1] for the unmixing matrix [w11 w12; w21 w22], so that
 [EP[n]; noise[n]] = [w11 w12; w21 w22] * [X1[n]; X2[n]]
 Now, EP[n]=w11.X1[n]+w12.X2[n] and noise[n]=w21.X1[n]+w22.X2[n]
 Kurtosis values are computed for these estimated EP and noise signals
 Repeat with other random values for the unmixing matrix (say a thousand times)
 The unmixing matrix that gives the highest kurtosis values denotes the actual EP and noise
 Actual ICA algorithms use complicated neural network learning algorithms, so we’ll skip them
 It suffices to know that by using certain measures like kurtosis (representing non-gaussianity), we can separate the signals into independent components (a sketch of the random-search idea is given below)
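 A minimal MATLAB sketch of this random-search idea (X1 and X2 are the recorded signals; kurtosis requires the Statistics Toolbox; this is for illustration only):
best_k = -inf;
for trial = 1:1000
    W  = rand(2, 2);                        % random candidate unmixing matrix in [0,1]
    s1 = W(1,1)*X1 + W(1,2)*X2;             % estimated EP
    s2 = W(2,1)*X1 + W(2,2)*X2;             % estimated noise
    k  = kurtosis(s1,0) + kurtosis(s2,0);   % total non-gaussianity of the estimates
    if k > best_k
        best_k = k;  Wbest = W;             % keep the best unmixing matrix so far
    end
end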
Study guide (Lecture 3)
 From this week’s lecture, you should know
 Basic mathematics – mean, standard deviation, variance, covariance, correlation, autocorrelation, SNR, etc.
 Uses of these basic maths in signal analysis
 Noise reduction methods like ensemble averaging, median filtering, moving average filtering, principal component analysis and the basics of independent component analysis
End of lecture 3