BME452 Biomedical Signal Processing
Lecture 3: Signal Conditioning
(copyright Ali Işın, 2013)

Lecture 3 Outline
In this lecture we study the following signal conditioning methods (specifically for noise reduction):
- Ensemble averaging
- Median filtering
- Moving average filtering
- Principal component analysis
- Independent component analysis (in brief)
Before we study these, an introduction to some basic mathematics is given.

Mean
The arithmetic mean is the "standard" average, often simply called the "mean":
mean(x) = (1/N) * sum_{n=1}^{N} x[n] = (1/N) * sum_{n=0}^{N-1} x[n]
where N denotes the data size (length). In MATLAB the index runs from n = 1 to N, but sometimes we write n = 0, 1, ..., N-1.
Example: an experiment yields the data 34, 27, 45, 55, 22, 34.
How many items? There are 6, so N = 6.
What is the sum of all items? 217.
The arithmetic mean is the sum divided by N: 217/6 = 36.1667.

Expectation
What is the expected value of X, E[X]? Simply put, it refers to the sum divided by the number of items, i.e. the mean of the quantity inside the square brackets.
E.g. E[x^2] = (1/N) * sum_{n=0}^{N-1} x[n]^2

Mean removal for signals
Very often we set the mean to zero before performing any signal analysis.
This removes the dc (0 Hz) component: xm = x - mean(x)
[Figure: the same signal before and after mean removal; the waveform is unchanged but is now centred around zero.]

Mean removal across channels/recordings
Sometimes a noise corrupts all the signals in a multi-channel recording, or all the recordings of a single-channel signal.
Since the noise is common to all channels/recordings, the simplest way of removing it is to subtract the mean across channels/recordings.

Standard deviation (sigma)
The standard deviation measures how spread out the values in a data set are.
Suppose we are given a signal x_1, ..., x_N of real numbers (all recorded signals are real-valued).
The arithmetic mean of this population is mean(x) = (1/N) * sum_{i=1}^{N} x_i, and the standard deviation of this population is
sigma = sqrt( (1/N) * sum_{i=1}^{N} (x_i - mean(x))^2 )
Given only a sample of values x_1, ..., x_N from some larger population, many authors define the sample (or estimated) standard deviation with N-1 in the denominator:
s = sqrt( (1/(N-1)) * sum_{i=1}^{N} (x_i - mean(x))^2 )
This is known as an unbiased estimator of the actual standard deviation.

Standard deviation example
[Figure: worked standard deviation example.]

Interpreting standard deviation
A large standard deviation indicates that the data points are far from the mean; a small standard deviation indicates that they are clustered closely around the mean.
For example, each of the three samples (0, 0, 14, 14), (0, 6, 8, 14) and (6, 6, 8, 8) has an average of 7. Their (population) standard deviations are 7, 5 and 1, respectively. The third set has a much smaller standard deviation than the other two because its values are all close to 7.
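As a quick illustration of these definitions, here is a minimal MATLAB sketch (the variable names are illustrative, not from the lecture files) that reproduces the mean example, the three-sample standard deviation comparison and mean removal:

x = [34 27 45 55 22 34];
N = length(x);                   % N = 6
xbar = sum(x)/N;                 % 36.1667, the same as mean(x)

% Population vs sample (unbiased) standard deviation
a = [0 0 14 14]; b = [0 6 8 14]; c = [6 6 8 8];
std(a, 1)                        % population std (divide by N)   -> 7
std(b, 1)                        %                                -> 5
std(c, 1)                        %                                -> 1
std(a)                           % sample std (divide by N-1)     -> 8.0829

% Mean removal (dc removal) for a signal
xm = x - mean(x);                % zero-mean version of x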
Normalisation
Sometimes we may wish to normalise a signal to mean = 0 and standard deviation = 1.
For example, if we record the same signal with different instruments that have different amplification factors, it will be difficult to analyse the signals together.
In this case we normalise the signals using
xnor[n] = (x[n] - mean(x)) / std(x)

Variance
Variance is simply the square of the standard deviation:
var = (1/(N-1)) * sum_{i=1}^{N} (x_i - mean(x))^2
Uncertainty measure: variance may be thought of as a measure of uncertainty.
When deciding whether measurements agree with a theoretical prediction, variance can be used: if the variance (computed using the predicted mean) is high, the measurements contradict the prediction.
Example: say we have predicted that x[1] = 7, x[2] = 6, x[3] = 5, and x is measured 3 times, giving (7.2, 6.7, 5.6), (4.2, 6.8, 5.2) and (11.2, 6.3, 5.9).
Compute the variance using the predicted value as the mean: var[1] = 12.76, var[2] = 0.610, var[3] = 0.605.
So we know that the x[1] measurements contradict the prediction, while the x[2] and x[3] measurements probably do not.

Covariance
If we have multi-channel/multi-trial recorded signals, we can compute the cross variance, or simply covariance.
Covariance measures the variance between different signals (from different channels/recordings).
The covariance between two signals X and Y, with respective means mu and nu, is
cov(X, Y) = E[(X - mu)(Y - nu)]
Covariance is sometimes used as a measure of "linear dependence" between two signals, but correlation is a better measure.

Correlation
The correlation between two signals X and Y is
corr(X, Y) = cov(X, Y) / (std(X) * std(Y))
It is simply normalised covariance, and it measures the linear dependence between X and Y.
The correlation is 1 in the case of an increasing linear relationship, -1 in the case of a decreasing linear relationship, and some value in between in all other cases, indicating the degree of linear dependence between the variables.
The closer the coefficient is to -1 or 1, the stronger the correlation between the variables.

Application of correlation (example)
An unknown signal can be identified as follows: a copy of a known reference signal is correlated with the unknown signal.
The correlation will be high if the reference is similar to the unknown signal.
The unknown signal is correlated with a number of known reference functions; a large correlation value shows the degree of similarity to the reference, and the largest value is the most likely match.
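The template-matching idea can be sketched in a few lines of MATLAB. This is a minimal illustration with synthetic, made-up reference signals (the variable names and waveforms are assumptions for the demo, not lecture data):

% Template matching by correlation
n = 0:99;
templates = [sin(2*pi*n/25);                  % reference 1
             sin(2*pi*n/10);                  % reference 2
             cos(2*pi*n/25)];                 % reference 3
unknown = sin(2*pi*n/25) + 0.3*randn(1, 100); % noisy copy of reference 1

M = size(templates, 1);
r = zeros(M, 1);
for k = 1:M
    C = corrcoef(unknown, templates(k, :));   % 2 x 2 correlation matrix
    r(k) = C(1, 2);                           % correlation with template k
end
[~, best] = max(r);    % index of the most likely match (should be 1 here)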
Application of correlation (another example)
Application to heart disease detection using ECG signals.
Cross-correlation is one way in which different types of heart disease can be identified from ECG signals: each heart condition has a characteristic ECG waveform.
[Figure: example ECG waveforms (amplitude in arbitrary units vs. sampling points) for Normal Sinus Rhythm, Sinus Bradycardia, Right Bundle Branch Block and Accelerated Junctional Rhythm.]
The system has a library of pre-recorded ECG signals (known as templates).
An unknown ECG signal is correlated with all the ECG templates in this library; the largest correlation is the most likely match for the heart condition.

Signal-to-noise ratio (SNR)
Before we move on to the noise reduction methods, we need a measure of the noise in the signals. This is important for gauging the performance of the noise reduction techniques. For this purpose we use the SNR:
SNR = 10*log10( signal energy / noise energy ), where Energy = sum_{n=1}^{N} x(n)^2
The original noise is x(noise) = x(original signal) - x(noisy signal).
After using some noise reduction method, x(noise) = x(original signal) - x(noise-reduced signal).

Ensemble averaging
If we have many recordings, we can use ensemble averaging to reduce noise that is not correlated between the recordings.
Example: ensemble averaging to reduce noise from evoked potential (EP) EEG.
Repeated recordings are known as trials. The EP EEG signals are about the same from one trial to another (high correlation), but the noise differs from trial to trial (low correlation). Hence ensemble averaging can be used to reduce the noise.
[Figure: the clean EP EEG, the noisy trials (EP EEG + noise, trials 1, 2, ..., 20), and the EP EEG after ensemble averaging.]

Worked example 1 - Ensemble averaging
Assume we have 3 signals corrupted with noise, and assume we also have the original (for SNR computation).
1. Set the mean to zero first:
   n               0     1     2     3        n               0      1      2      3
   Noisy signal 1  2.9   4.9   5.3   7.3      Noisy signal 1  -2.2   -0.2    0.2    2.2
   Noisy signal 2  2.9   5.1   5.1   6.9      Noisy signal 2  -2.1    0.1    0.1    1.9
   Noisy signal 3  3.1   4.8   5.1   7.0      Noisy signal 3  -1.9   -0.2    0.1    2.0
   Original        3     5     5     7        Original        -2.0    0.0    0.0    2.0
2. The ensemble average (the average is taken at each sample point n):
   n                  0      1      2      3
   Ensemble average   -2.1   -0.1    0.1    2.0
3. The noise in each signal is (original signal - noise-corrupted signal):
   n                        0      1      2      3
   Signal 1 noise           0.2    0.2   -0.2   -0.2
   Signal 2 noise           0.1   -0.1   -0.1    0.1
   Signal 3 noise           -0.1   0.2   -0.1    0.0
   Ensemble average noise   0.1    0.1   -0.1    0.0
4. SNR = 10*log10(e(signal)/e(noise)), with original signal energy = 8:
                  signal 1   signal 2   signal 3   ensemble average
   noise energy   0.16       0.04       0.06       0.03
   SNR            16.99      23.01      21.25      24.26
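Worked example 1 can be reproduced with a minimal MATLAB sketch (the numbers are the lecture's example values; the variable names are illustrative):

% Worked example 1: ensemble averaging and SNR
X = [2.9 4.9 5.3 7.3;        % noisy signal 1
     2.9 5.1 5.1 6.9;        % noisy signal 2
     3.1 4.8 5.1 7.0];       % noisy signal 3
orig = [3 5 5 7];

% 1. Remove the mean of each signal (each row)
Xm = X - mean(X, 2) * ones(1, size(X, 2));
om = orig - mean(orig);

% 2. Ensemble average across trials (average each column)
ea = mean(Xm, 1);

% 3. Remaining noise = original - noise-reduced signal
noiseEA = om - ea;

% 4. SNR in dB
snrEA = 10 * log10(sum(om.^2) / sum(noiseEA.^2));
% about 25 dB here (the slides report 24.26 dB using values rounded to one decimal)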
Median filtering
Similar to ensemble averaging, if we have many recordings we can use median filtering to reduce noise that is not correlated between the recordings.
What is median filtering? If we have x[1] as [3 2 1 0 6 7 9 3 2] from 9 trials, we sort the numbers from small to big and take the centre value (i.e. the 5th) as the median.
Sorted x[1] is [0 1 2 2 3 3 6 7 9], so median x[1] = 3.
Median filtering is advantageous compared to ensemble averaging if one trial contains a lot of noise AND the number of trials/recordings is small.
This is because the one heavily noise-corrupted signal will distort the ensemble average values, but is much less likely to affect the median values.
[Figure: an M x N matrix of trials (M trials, each of data length n = 1, 2, ..., N); the median is taken across the M trials at each sample point.]

Worked example 2 - Median filtering
Assume we have 3 signals corrupted with noise, one of them heavily corrupted (assume the mean has already been set to zero):
   n               0       1       2       3
   Noisy signal 1  -2.15    0.95   -0.25    1.45
   Noisy signal 2  -9       0       5       4
   Noisy signal 3  -2.25   -0.45    0.85    1.85
   Original        -2.0     0.0     0.0     2.0
1. The ensemble average and median filtered signals:
   n                  0       1       2       3
   Ensemble average   -4.47    0.17    1.87    2.43
   Median filter      -2.25    0       0.85    1.85
2. The noise in each signal is (original signal - noise-corrupted signal):
   n                               0       1       2       3
   Noise in signal 1               0.15   -0.95    0.25    0.55
   Noise in signal 2               7       0      -5      -2
   Noise in signal 3               0.25    0.45   -0.85    0.15
   Noise in ensemble average       2.47   -0.17   -1.87   -0.43
   Noise in median filter signal   0.25    0      -0.85    0.15
3. SNR = 10*log10(e(signal)/e(noise)), with original signal energy = 8:
                  signal 1   signal 2   signal 3   ensemble average   median filter
   noise energy   1.29       78         1.01       9.81               0.81
   SNR            7.93       -9.89      8.99       -0.89              9.95
Which technique gave better noise reduction (by SNR) - ensemble averaging or median filtering? Why?
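A minimal MATLAB sketch of worked example 2, comparing ensemble averaging and median filtering across trials (the numbers are the lecture's example values):

% Worked example 2: median filtering vs ensemble averaging across trials
Xm   = [-2.15  0.95 -0.25  1.45;    % noisy signal 1 (already zero-mean)
        -9.00  0.00  5.00  4.00;    % noisy signal 2 (heavily corrupted)
        -2.25 -0.45  0.85  1.85];   % noisy signal 3
orig = [-2 0 0 2];

ea = mean(Xm, 1);      % ensemble average at each sample point
mf = median(Xm, 1);    % median across trials at each sample point

snr = @(clean, est) 10 * log10(sum(clean.^2) / sum((clean - est).^2));
snrEA = snr(orig, ea);   % about -0.9 dB: ruined by the bad trial
snrMF = snr(orig, mf);   % about 9.9 dB: the median ignores the outlier trial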
Moving average filtering
How do we reduce noise if we have only one signal, from one recording/trial? We cannot use ensemble averaging or median filtering.
Normally, in any signal, the few points before and after a certain point n are correlated (i.e. related), but the noise is generally not correlated. So we can use moving average (MA) filtering, defined as
y[n] = (1/S) * sum_{i=n}^{n+S-1} x[i]
where S is the filter order. For example, for S = 3, y[5] = (x[5] + x[6] + x[7])/3.
For the signals x and y to remain of the same sample length, we have to pad (S - 1) zeros to the signal x to get the last (S - 1) points of the signal y.

Moving average filtering - zero padding
If zero padding is NOT allowed: for x[n] with N = 256 and S = 3, y[n] has N = 254 samples (the length of y is S - 1 less than x), because the last full average is y[254] = (x[254] + x[255] + x[256])/3.
If zero padding is allowed: y[n] has N = 256 samples no matter what the value of S (the length of y is the same as x), because
y[254] = (x[254] + x[255] + x[256])/3
y[255] = (x[255] + x[256] + 0)/2
y[256] = (x[256] + 0 + 0)/1
i.e. the last points are averaged over the available samples only.

Example - moving average filtering
Assume we have an EEG signal corrupted with noise.
Set the mean to zero, then apply the moving average filter to the noisy signal (use filter order 3 and 5).
The higher filter order will remove more noise, but it will also distort the signal more (i.e. remove parts of the signal as well), so a compromise has to be found for the value of S (normally by trial and error).
MATLAB code for the example (filter orders 3 and 5, averaging over the available samples at the end of the record):

load eeg;                      % noisy EEG signal, N = 256 samples
N = length(eeg);
% Moving average, order S = 3
for i = 1:N-2
    eegMA1(i) = (eeg(i) + eeg(i+1) + eeg(i+2)) / 3;
end
eegMA1(255) = (eeg(255) + eeg(256)) / 2;
eegMA1(256) = eeg(256) / 1;
% Moving average, order S = 5
for i = 1:N-4
    eegMA2(i) = (eeg(i) + eeg(i+1) + eeg(i+2) + eeg(i+3) + eeg(i+4)) / 5;
end
eegMA2(253) = (eeg(253) + eeg(254) + eeg(255) + eeg(256)) / 4;
eegMA2(254) = (eeg(254) + eeg(255) + eeg(256)) / 3;
eegMA2(255) = (eeg(255) + eeg(256)) / 2;
eegMA2(256) = eeg(256) / 1;
subplot(3,1,1), plot(eeg, 'g');
subplot(3,1,2), plot(eegMA1, 'r');
subplot(3,1,3), plot(eegMA2, 'b');

[Figure: the noisy EEG (top), the order-3 moving average (middle) and the order-5 moving average (bottom).]

Median filter for noisy images
Median filtering can also be applied to noisy images.
In a computer, a grayscale image is stored as a 2D array x(i, j), where i and j are the pixel coordinates and x is the grayscale value (in general from 0 (black) to 255 (white)).
[Figure: a noisy grayscale image before and after applying a median filter.]
A mean (averaging) filter could be applied in a similar manner, although for images the median filter normally gives better results.

Principal component analysis
PCA can be used to reduce noise from signals, provided we have repeated recordings, signals from a number of trials, or multi-channel signals.
PCA produces principal components (PCs), which are orthogonal signals, i.e. signals that are uncorrelated with each other.
Since the noise is less correlated between trials than the signal is, the first few PCs will account for the signal while the last few PCs account for the noise.
By discarding the last few PCs before reconstruction, we get the signals without noise (or with less noise).

Principal component analysis - algorithm
1. Organise the data X in an M x N matrix.
2. Set the mean to zero.
3. Compute CX = covariance matrix of X.
4. Compute the eigenvalues and eigenvectors of CX.
5. Sort the eigenvectors (i.e. principal components) in descending order of eigenvalue.
6. Compute the Zscores.
7. Decide how many PCs to keep using some criterion.
8. Reconstruct the noise-reduced signals using the first few PCs and the Zscores.
(A MATLAB sketch of these steps follows the worked example below.)

Eigenvector, eigenvalue - a brief review
The steps of setting the mean to zero and computing the covariance were covered earlier, so let us move to the step of computing eigenvalues and eigenvectors.
Assume A = cov(X), where X is the zero-mean data. In MATLAB, [V, D] = eig(A) produces a matrix whose columns are the eigenvectors (V) and a diagonal matrix of eigenvalues (D) of the matrix A, satisfying A*V = V*D. Note that A has to be a square matrix.
E.g. for A = [2 3; 2 1]:
[2 3; 2 1] * [3; 2] = [12; 8] = 4 * [3; 2]
so [3; 2] is an eigenvector of A and 4 is the corresponding eigenvalue.
The eigenvector [3; 2] can be thought of as a vector direction, and the eigenvalue 4 as the weight of this vector.

Eigenvector, eigenvalue (cont.)
Finding the eigenvalues and eigenvectors of matrices bigger than 3 x 3 by hand is extremely tedious, so we will skip the algorithms and just use the MATLAB function eig.
Example: for the square matrix [3 0 1; -4 1 2; -6 0 -2], decide which, if any, of a set of candidate vectors are eigenvectors of that matrix and give the corresponding eigenvalue.
Answer: [0; 1; 0] is an eigenvector with eigenvalue 1, because [3 0 1; -4 1 2; -6 0 -2] * [0; 1; 0] = [0; 1; 0] = 1 * [0; 1; 0].
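Both eigenvector examples can be checked directly in MATLAB. A minimal sketch (note that eig may return the eigenvectors scaled to unit length, with arbitrary sign and in a different order, so compare directions rather than exact values):

% Verify the 2 x 2 example: [3; 2] is an eigenvector of A with eigenvalue 4
A = [2 3; 2 1];
A * [3; 2]            % = [12; 8] = 4 * [3; 2]
[V, D] = eig(A);      % columns of V are eigenvectors, diag(D) the eigenvalues
                      % the eigenvalues of A are -1 and 4

% Verify the 3 x 3 example: [0; 1; 0] is an eigenvector with eigenvalue 1
B = [3 0 1; -4 1 2; -6 0 -2];
B * [0; 1; 0]         % = [0; 1; 0] = 1 * [0; 1; 0]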
Sort the eigenvectors
Sort the eigenvectors from largest to smallest eigenvalue.
Let us use the example we saw earlier for ensemble averaging and median filtering:
X  = [2.9 4.9 5.3 7.3; 2.9 5.1 5.1 6.9; 3.1 4.8 5.1 7.0]
Xm = [-2.2 -0.2 0.2 2.2; -2.1 0.1 0.1 1.9; -1.9 -0.2 0.1 2.0]
A = cov(Xm') =
   3.2533   2.9333   2.8800
   2.9333   2.6800   2.5933
   2.8800   2.5933   2.5533
The eigenvectors and eigenvalues are obtained with [V, D] = eig(A):
V =                                   D =
   0.7103   0.3335   0.6198              0.0017        0        0
  -0.1039  -0.8213   0.5610                   0   0.0272        0
  -0.6962   0.4629   0.5487                   0        0   8.4578
The corresponding eigenvalues are 0.0017, 0.0272 and 8.4578.
Now sort the eigenvectors (the columns of V) in descending order of eigenvalue: 8.4578, 0.0272, 0.0017. The sorted eigenvectors are
Vsort =
   0.6198   0.3335   0.7103
   0.5610  -0.8213  -0.1039
   0.5487   0.4629  -0.6962

Zscores
Zscores = Vsort' * Xm, where Vsort is the matrix of sorted eigenvectors and Xm is the zero-mean data matrix.
In the previous example the size of A is 3, so we will have 3 Zscores (one per PC), and the Zscores matrix has the same dimensions as Xm:
Zscores =
  -3.5843  -0.1776   0.2349   3.5270     (Zscore 1)
   0.1115  -0.2414   0.0309   0.0991     (Zscore 2)
  -0.0218  -0.0132   0.0621  -0.0270     (Zscore 3)

How to select the number of PCs to keep
The PCs with higher eigenvalues represent the signal, while the PCs with lower eigenvalues represent the noise. So we keep the first few PCs and discard the rest. But how many PCs do we keep?
We retain a certain percentage of the variance, normally 95% or 99%. The eigenvalues represent the weights of the PCs, i.e. a kind of variance (power) measure of the PCs.
So we can use sum(D1:Dq)/sum(D1:Dlast) > 0.99, where [D1, D2, D3, ..., Dlast] are the sorted eigenvalues.
In our example, say we wish to retain 99% of the variance. The sorted eigenvalues are 8.4578, 0.0272, 0.0017, and sum(D1:Dlast) = 8.4867.
sum(D1:D1) = 8.4578;  sum(D1:D1)/sum(D1:Dlast) = 0.9966
sum(D1:D2) = 8.4850;  sum(D1:D2)/sum(D1:Dlast) = 0.9998
sum(D1:D3) = 8.4867;  sum(D1:D3)/sum(D1:Dlast) = 1.0
Since the first eigenvalue already accounts for 99.66% of the variance (which is more than 99%), we can discard the second and third PCs.
If we wish to retain 99.97% of the variance, how many PCs do we retain? Answer: 2.
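The whole PCA procedure for this worked example can be written as a short MATLAB sketch (a minimal illustration of the algorithm listed earlier, applied to the 3 x 4 example; the variable names are illustrative):

% PCA-based noise reduction on the 3-trial example
X = [2.9 4.9 5.3 7.3;
     2.9 5.1 5.1 6.9;
     3.1 4.8 5.1 7.0];

% 1-2. Organise the data (one trial per row) and remove the mean of each trial
Xm = X - mean(X, 2) * ones(1, size(X, 2));

% 3-4. Covariance matrix and its eigen-decomposition
A = cov(Xm');                 % 3 x 3 covariance between the trials
[V, D] = eig(A);

% 5. Sort the eigenvectors by descending eigenvalue
[d, idx] = sort(diag(D), 'descend');
Vsort = V(:, idx);

% 6. Zscores (the principal component "time courses")
Z = Vsort' * Xm;

% 7. Keep enough PCs to retain 99% of the variance
q = find(cumsum(d) / sum(d) > 0.99, 1);   % q = 1 in this example

% 8. Reconstruct the noise-reduced trials from the first q PCs
Xclean = Vsort(:, 1:q) * Z(1:q, :);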
Reconstruct using the selected PCs
To get back the original signals without noise, we reconstruct using the selected PCs: Xnonoise = Vselected * Zscoreselected.
In our example only 1 PC was selected, so the first eigenvector and the first Zscore are used to get back the 3 noise-reduced signals:
Xnonoise = Vsort(:,1) * Zscore(1,:) =
  -2.2217  -0.1101   0.1456   2.1861
  -2.0107  -0.0996   0.1318   1.9786
  -1.9668  -0.0975   0.1289   1.9353
noise = Xm - Xnonoise =
   0.0217  -0.0899   0.0544   0.0139
  -0.0893   0.1996  -0.0318  -0.0786
   0.0668  -0.1025  -0.0289   0.0647
Energy(noise) = 0.0117, 0.0550, 0.0200 (one value per signal).
The actual original mean-removed signal (from the earlier slide) is x = [-2 0 0 2], with energy 8.
SNR = 28.3483, 21.6265, 26.0222 for the three reconstructed signals.
The SNR using PCA is generally higher than with ensemble averaging or median filtering, and we get 3 signal outputs, unlike the single output of ensemble averaging or median filtering.

Principal component analysis - an example of application
Consider three noise-corrupted EP (evoked potential) signals.
[Figure: noisy EP signals from trials 1-3, and the corresponding noise-reduced EP signals (trials 1-3) after PCA reconstruction.]
Obtain the principal components (in descending order of eigenvalue magnitude), obtain the Zscores, and decide how many PCs to retain - assume that we retain only the first PC.
Reconstruct using only one PC: by retaining the first PC only for reconstruction, we obtain 3 noise-reduced EP signals.

Independent component analysis - a brief study
ICA is a newer method that can be used to separate noise from signal; it is sometimes known as blind source separation.
Like PCA, it requires more than one signal recording.
ICA separates the recordings into independent signals (signals and noises - we keep the signals and discard the noises).
Example: assume we have 3 observed (i.e. recorded) signals x1[n], x2[n], x3[n] generated from 3 original source signals s1[n], s2[n], s3[n]:
x1[n] = a11*s1[n] + a12*s2[n] + a13*s3[n]
x2[n] = a21*s1[n] + a22*s2[n] + a23*s3[n]
x3[n] = a31*s1[n] + a32*s2[n] + a33*s3[n]
The matrix A = [a11 a12 a13; a21 a22 a23; a31 a32 a33] is known as the mixing matrix.
ICA can be used to obtain the original signals by estimating the unmixing matrix W = A^-1 = [w11 w12 w13; w21 w22 w23; w31 w32 w33].
The original signals are then obtained using
s1[n] = w11*x1[n] + w12*x2[n] + w13*x3[n]
s2[n] = w21*x1[n] + w22*x2[n] + w23*x3[n]
s3[n] = w31*x1[n] + w32*x2[n] + w33*x3[n]

Independent component analysis - a pictorial example
[Figures from Independent Component Analysis, Hyvärinen, Karhunen and Oja.]
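To make the mixing/unmixing relationship concrete, here is a minimal MATLAB sketch with two sources and a known 2 x 2 mixing matrix. The sources and the matrix values are made up for illustration; in real ICA the mixing matrix is unknown and W has to be estimated from the data:

% Mixing two sources and recovering them with the exact unmixing matrix
n  = 0:255;
s1 = sin(2*pi*n/32);              % "signal" source (illustrative)
s2 = 0.5 * randn(1, 256);         % "noise" source (illustrative)
S  = [s1; s2];

A = [0.6 0.4; 0.3 0.7];           % mixing matrix (made-up values)
X = A * S;                        % observed (recorded) signals x1 and x2

W    = inv(A);                    % unmixing matrix, W = A^-1
Sest = W * X;                     % recovered sources (equal to S here)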
Maximising non-gaussianity using kurtosis
How does ICA work? The central limit theorem says that sums of non-gaussian random variables are closer to gaussian than the original variables:
- mixed (combined) signals: more gaussian
- source (original) signals: less gaussian
=> the independent source signals are less gaussian than the combined signals.
So by maximising non-gaussian behaviour, we get closer to the original signals. Kurtosis can be used to measure gaussian behaviour. BUT what is gaussian? See the next slide.

Gaussian and probability distributions
The gaussian (or normal) probability distribution is
pdf(x) = (1/sqrt(2*pi*sigma^2)) * exp( -(x - mu)^2 / (2*sigma^2) )
BUT what is a probability distribution? The probability distribution of a discrete-time signal is simply the number of occurrences of each value.
E.g. if x takes values from 1 to 10:

x = [4 1 2 3 9 8 6 5 7 3 4 2 2 6 9 5 6 7];
count(1:10) = 0;
for i = 1:10
    y = find(x == i);         % indices where x equals the value i
    count(i) = length(y);     % number of occurrences of the value i
end
plot(count);                  % the probability distribution of x

[Figure: the probability distribution of x, and examples of gaussian, super-gaussian (values close to the mean occur more often, giving a sharper peak) and sub-gaussian (most values occur about equally often, giving a flatter shape) distributions.]

Kurtosis
Non-gaussianity can be measured using kurtosis:
- gaussian signals have kurtosis = 3
- sub-gaussian signals have a lower kurtosis value
- super-gaussian signals have a higher kurtosis value
Examples:

x = [4 1 2 3 9 8 6 5 7 3 4 2 2 6 9 5 6 7];
y = kurtosis(x, 0);      % unbiased kurtosis using MATLAB -> y = 1.9509

x = randn(1, 100000);    % gaussian signal with mean = 0, std = 1
plot(x);
y = kurtosis(x, 0)       % unbiased kurtosis using MATLAB -> y = 3.00

[Figure: a gaussian-distributed signal generated with randn.]

Example - kurtosis for EP and noise
[Figure: original signals - the EP signal (kurtosis = 3.32) and the noise (kurtosis = 2.81); recorded signals - X1 = EP + noise (kurtosis = 2.79) and X2 = EP + noise (kurtosis = 2.61).]
Can you see that the kurtosis is lower for the combined signals? In other words, the actual independent signals (the sources) have higher kurtosis.

Simple ICA algorithm - an example using EP and noise
ICA tries to obtain the EP and the noise by estimating the unmixing matrix:
[EP[n]; noise[n]] = [0.8 0.9; 0.9 0.5] * [X1[n]; X2[n]]
Here [0.8 0.9; 0.9 0.5] is the solution, i.e. the unmixing matrix - but in the beginning we do not know the unmixing matrix!
A simple ICA method is to randomly generate values in [0, 1] for the unmixing matrix:
[EP[n]; noise[n]] = [w11 w12; w21 w22] * [X1[n]; X2[n]]
Now EP[n] = w11*X1[n] + w12*X2[n] and noise[n] = w21*X1[n] + w22*X2[n].
Kurtosis values are computed for these estimated EP and noise signals.
Repeat with other random values for the unmixing matrix (say, a thousand times).
The unmixing matrix that gives the highest kurtosis values denotes the actual EP and noise (a sketch of this search follows below).
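A crude MATLAB sketch of this random-search idea. Everything here is an assumption for illustration - the sources, the mixing matrix and the 1000-iteration search are made up, and practical ICA algorithms use proper optimisation rather than random guessing:

% "ICA by random search": keep the unmixing matrix that maximises kurtosis
s1 = randn(1, 1000).^3;                 % super-gaussian source (stands in for the EP)
s2 = randn(1, 1000);                    % gaussian-like noise source
X  = [0.6 0.4; 0.3 0.7] * [s1; s2];     % observed mixtures X1 and X2 (made-up mixing matrix)

bestK = -inf;
for trial = 1:1000
    W = rand(2, 2);                     % random candidate unmixing matrix, values in [0, 1]
    S = W * X;                          % candidate estimates of the two sources
    k = sum(kurtosis(S, 0, 2));         % the slides' criterion: total kurtosis of the estimates
    if k > bestK
        bestK = k;                      % keep the unmixing matrix giving the highest kurtosis
        Wbest = W;
        Sbest = S;
    end
end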
Actual ICA algorithms use complicated neural network learning algorithms, so we will skip them here.
It suffices to know that by using certain measures like kurtosis (representing non-gaussianity), we can separate the recorded signals into independent components.

Study guide (Lecture 3)
From this week's lecture, you should know:
- the basic mathematics - mean, standard deviation, variance, covariance, correlation, autocorrelation, SNR, etc.
- the uses of these basic mathematical tools in signal analysis
- noise reduction methods such as ensemble averaging, median filtering, moving average filtering, principal component analysis, and the basics of independent component analysis

End of Lecture 3