ENSEMBLE EMPIRICAL MODE DECOMPOSITION
Noise Assisted Signal Analysis (NASA)
Part I: Preliminary

Zhaohua Wu and N. E. Huang: Ensemble Empirical Mode Decomposition: A Noise Assisted Data Analysis Method. Advances in Adaptive Data Analysis, 1, 1-41, 2009.

Theoretical Foundations
• The intermittency test, though it ameliorates mode mixing, destroys the adaptive nature of EMD.
• The EMD study of white noise guarantees a uniform frame of scales.
• White noise cancels out given a sufficient number of ensemble members.

Theoretical Background I: Intermittency

Sifting with Intermittence Test
• To avoid mode mixing, we have to institute a special criterion to separate oscillations of different time scales into different IMF components.
• The criterion is to select a time scale so that oscillations with time scales longer than this pre-selected criterion are not included in the IMF.

Observations
• The intermittency test ameliorates mode mixing considerably.
• The intermittency test requires a set of subjective criteria.
• EMD with an intermittency test is no longer totally adaptive.
• For complicated data, the subjective criteria are hard, or impossible, to determine.

Effects of EMD (Sifting)
• To separate data into components of similar scale.
• To eliminate riding waves.
• To make the results symmetric with respect to the x-axis and the amplitude more even.
– Note: The first two effects are necessary for a valid IMF; the last actually causes the IMF to lose its intrinsic properties.

Theoretical Background II: A Study of White Noise

Wu, Zhaohua and N. E. Huang, 2004: A Study of the Characteristics of White Noise Using the Empirical Mode Decomposition Method. Proceedings of the Royal Society of London A, 460, 1597-1611.

Methodology
• Based on observations from Monte Carlo numerical experiments on 1 million white noise data points.
• All IMFs generated by 10 siftings.
• Fourier spectra based on 200 realizations of 4,000-data-point sections.
• Probability density based on 50,000-data-point sections.
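The claim above that white noise cancels "with a sufficient number of ensemble members" can be illustrated numerically: the ensemble mean of M independent unit-variance white-noise series has an rms amplitude of roughly 1/sqrt(M), so quadrupling the ensemble size halves the residual noise, while a signal common to every member survives the averaging untouched. A minimal stdlib-Python sketch (the helper name `noise_residual_rms` is ours, not from the paper):

```python
import random
import statistics

def noise_residual_rms(n_points: int, n_ensemble: int, seed: int = 0) -> float:
    """RMS amplitude of the ensemble mean of `n_ensemble` independent
    unit-variance white-noise series of length `n_points`.

    In EEMD each ensemble member is signal + fresh noise; the signal is
    identical across members, so averaging leaves it intact while the
    added noise shrinks like 1/sqrt(n_ensemble)."""
    rng = random.Random(seed)
    mean_series = [0.0] * n_points
    for _ in range(n_ensemble):
        for t in range(n_points):
            mean_series[t] += rng.gauss(0.0, 1.0) / n_ensemble
    return statistics.pstdev(mean_series)

# Quadrupling the ensemble size roughly halves the residual noise:
r100 = noise_residual_rms(2000, 100)   # about 0.1
r400 = noise_residual_rms(2000, 400)   # about 0.05
```

This is why EEMD can afford to inject noise deliberately: the injected part is an ensemble average away from vanishing.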
IMF Period Statistics

  IMF              1       2       3      4      5      6      7      8      9
  Number of peaks  347042  168176  83456  41632  20877  10471  5290   2658   1348
  Mean period      2.881   5.946   11.98  24.02  47.90  95.50  189.0  376.2  741.8
  Period in years  0.240   0.496   0.998  2.000  3.992  7.958  15.75  31.35  61.75

Fourier Spectra of IMFs
[Figure: Fourier spectra of the IMFs, and the same spectra shifted and plotted against ln T, where they collapse onto a common shape.]

Empirical Observation I: Mean energy

    \bar{E}_n = \frac{1}{N} \sum_{j=1}^{N} c_n(j)^2

Empirical Observation II: The normalized spectral area is constant

    \int S_{\ln T, n} \, d\ln T = \mathrm{const}

Empirical Observation III: The spectral area is the total energy of the n-th IMF component

    \bar{E}_n = \int S_{\omega, n} \, d\omega

Empirical Observation IV: Computation of the mean period

    \bar{T}_n = \frac{\int S_{\ln T, n} \, d\ln T}{\int T^{-1} S_{\ln T, n} \, d\ln T}

Empirical Observation V: The product of the mean energy and the mean period is constant

    \bar{E}_n \bar{T}_n = \mathrm{const} \quad \Longleftrightarrow \quad \ln \bar{E}_n + \ln \bar{T}_n = \mathrm{const}

Monte Carlo Result: IMF Energy vs. Period
[Figure: IMF mean energy against mean period on log-log axes.]

Empirical Observation: Histograms of the IMFs
By the central limit theorem, the IMFs should be normally distributed.
[Figure: histograms of IMF modes 2-9; each is approximately Gaussian.]

Fundamental Theorem of Probability
• If we know the density function of a random variable x, then we can express the density function of any random variable y given by y = g(x). The procedure is as follows: solve for the roots x_1, ..., x_n, ... of y = g(x); then

    f_y(y) = \frac{f_x(x_1)}{|g'(x_1)|} + \cdots + \frac{f_x(x_n)}{|g'(x_n)|} + \cdots

because dy = g'(x_j) dx_j, and therefore dx_j = dy / |g'(x_j)|.

Fundamental Theorem of Probability (continued)
• If the density function of the random variable x is normal, then y = x^2 should have density

    f(y) = \frac{1}{\sigma \sqrt{2\pi y}} \exp\left( -\frac{y}{2\sigma^2} \right) U(y),

where U(y) is the unit step function. See: A.
Papoulis: Probability, Random Variables, and Stochastic Processes, 1984, pp. 97-98.

Chi and Chi-Square Statistics
Given n identical, independent normal random variables with joint density

    f(x_1, \ldots, x_n) = \frac{1}{(\sigma \sqrt{2\pi})^n} \exp\left( -\frac{x_1^2 + \cdots + x_n^2}{2\sigma^2} \right),

the RV x = (x_1^2 + \cdots + x_n^2)^{1/2} is chi-distributed, and y = x^2 is chi-square distributed; the density for y with n degrees of freedom is

    f(y) = a \, y^{n/2 - 1} \exp\left( -\frac{y}{2\sigma^2} \right) U(y), \qquad a = \frac{1}{\sigma^n 2^{n/2} \Gamma(n/2)}.

See: A. Papoulis: Probability, Random Variables, and Stochastic Processes, 1984, pp. 187-188.

Chi-Square Distribution of Energy
[Figure: distributions of the energies of IMF modes 2-9, each following a chi-square shape.]

Chi-Squared Energy Density Distributions
The probability density of N E_n, with N \bar{E}_n degrees of freedom, should be

    \rho(N E_n) \propto (N E_n)^{\frac{N \bar{E}_n}{2} - 1} e^{-\frac{N E_n}{2}}.

Then, by the fundamental theorem of probability, the density of E_n is

    \rho(E_n) \propto N (N E_n)^{\frac{N \bar{E}_n}{2} - 1} e^{-\frac{N E_n}{2}}.

Let us make the variable change E = e^y; then

    \rho(y) \propto N (N e^y)^{\frac{N \bar{E}_n}{2} - 1} e^{-\frac{N e^y}{2}} e^y = C \exp\left[ \frac{N \bar{E}_n}{2} y - \frac{N e^y}{2} \right].

Degrees of Freedom
• A random sample of length N contains N degrees of freedom.
• Each Fourier component contains one degree of freedom.
• For EMD, each IMF's share of the degrees of freedom is proportional to its share of the energy; therefore, the degrees of freedom of the i-th IMF are given by f_i = N \bar{E}_i.

Chi-Square Distribution of Energy (with fitted curves)
[Figure: the same energy distributions for modes 2-9, overlaid with the chi-square density \rho(w_i) \propto w_i^{r_i/2 - 1} e^{-w_i/2}, where w_i = N E_i and r_i = N \bar{E}_i.]

Formula of Confidence Limit for IMF Distributions I
Introducing the new variable y = \ln E, so that E = e^y, it follows that

    \rho(y) = N (N e^y)^{\frac{N \bar{E}}{2} - 1} e^{-\frac{N e^y}{2}} e^y = C \exp\left[ \frac{N \bar{E}}{2} y - \frac{N e^y}{2} \right],

which is then expanded about \bar{y} using

    e^{y - \bar{y}} = 1 + (y - \bar{y}) + \frac{(y - \bar{y})^2}{2!} + \frac{(y - \bar{y})^3}{3!} + \cdots
Formula of Confidence Limit for IMF Distributions II
With the new variable y = \ln E (so E = e^y), it follows that

    \rho(y) = C' \exp\left[ -\frac{N \bar{E}}{2} \left( \frac{(y - \bar{y})^2}{2!} + \frac{(y - \bar{y})^3}{3!} + \cdots \right) \right],

with

    C' = C \, N^{\frac{N \bar{E}}{2} - 1} \exp\left[ -\frac{N \bar{E}}{2} (1 - \bar{y}) \right].

Formula of Confidence Limit for IMF Distributions III
When |y - \bar{y}| << 1, we can neglect the higher-power terms:

    \rho(y) \approx C' \exp\left[ -\frac{N \bar{E}}{2} \cdot \frac{(y - \bar{y})^2}{2!} \right].

Formula of Confidence Limit for IMF Distributions IV
For a given confidence level \alpha, the corresponding variable y_\alpha should satisfy

    \int_{-\infty}^{y_\alpha} \rho(y) \, dy = \alpha.

For a Gaussian distribution, it is common to relate \alpha to the standard deviation \sigma: confidence level \alpha corresponds to k\sigma, where k varies with \alpha. For example, k takes the values -2.326, -0.675, 0, 0.675, and 2.326 at the 1st, 25th, 50th, 75th and 99th percentiles (\alpha = 0.01, 0.25, 0.5, 0.75, 0.99), respectively.

Formula of Confidence Limit for IMF Distributions V
When |y - \bar{y}| << 1, the distribution of E_n is approximately Gaussian, with

    \sigma^2 = \frac{1}{N \bar{E}_n / 2} = \frac{2 \bar{T}_n}{N}.

Therefore, for any given \alpha, in terms of k,

    y - \bar{y} = k\sigma = k \sqrt{\frac{2 \bar{T}}{N}}.

Formula of Confidence Limit for IMF Distributions VI
Given y - \bar{y} = \pm k \sqrt{2\bar{T}/N} and \ln \bar{E} + \ln \bar{T} = 0: if we write x = \ln \bar{T} and y = \ln \bar{E} as defined before, then \bar{y} = -x; therefore a pair of upper and lower bounds is

    y = -x \pm k \sqrt{\frac{2}{N}} \, e^{x/2}.

Confidence Limit for IMF Distributions
[Figure: raw SOI data and its IMFs C1-C9, 1930-2000.]

Statistical Significance for SOI IMFs
IMFs 4, 5, 6 and 7 are statistically significant at the 99% level.
[Figure: IMF energy vs. mean period (1 month to 100 years) against the confidence bounds.]

Summary
• Not all IMFs have the same statistical significance.
• Based on the white noise study, we have established a method to determine the statistically significant components.

References:
• Wu, Zhaohua and N. E.
Huang, 2004: A Study of the Characteristics of White Noise Using the Empirical Mode Decomposition Method. Proceedings of the Royal Society of London A, 460, 1597-1611.
• Flandrin, P., G. Rilling, and P. Gonçalvès, 2004: Empirical Mode Decomposition as a Filter Bank. IEEE Signal Processing Letters, 11 (2), 112-114.

Observations
• The white noise signal consists of signals of all scales.
• EMD separates the scales dyadically.
• White noise provides a uniformly distributed frame of scales through EMD.

Different approaches reach the same end:
• Flandrin, P., G. Rilling and P. Gonçalvès, 2004: Empirical Mode Decomposition as a Filter Bank. IEEE Signal Processing Letters, 11, 112-114.
• Flandrin, P., P. Gonçalvès and G. Rilling, 2005: EMD Equivalent Filter Banks, from Interpretation to Applications. In: Introduction to Hilbert-Huang Transform and Its Applications, Ed. N. E. Huang and S. S. P. Shen, pp. 57-74. World Scientific, New Jersey.

Fractional Gaussian Noise, aka Fractional Brownian Motion
A continuous-time Gaussian process x_H(t) is a fractional noise if it starts at zero, has zero mean, and has correlation function

    R(t, s) = E[ x_H(t) x_H(s) ] = \frac{\sigma^2}{2} \left( |t|^{2H} + |s|^{2H} - |t - s|^{2H} \right),

where H is a parameter known as the Hurst index, with value in (0, 1), and \sigma is the rms value of x_H(t).
• If H = 1/2, the process is regular Brownian motion.
• If H > 1/2, the process is positively correlated, or more red.
• If H < 1/2, the process is negatively correlated, or more blue.

[Figures: examples of fractional noise, and Flandrin's results, including the EMD response to a delta function.]

Theoretical Background III: Effects of Adding White Noise

Some Preliminaries
• Robert John Gledhill, 2003: Methods for Investigating Conformational Change in Biomolecular Simulations. Ph.D. thesis, University of Southampton, Department of Chemistry.
• He investigated the effect of added noise as a tool for checking the stability of EMD.
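The fractional-noise correlation function quoted above is easy to sanity-check numerically. The sketch below (our own helper, not from the slides) evaluates R(t, s) and, at H = 1/2, reduces to the ordinary Brownian-motion covariance sigma^2 * min(t, s):

```python
def fbm_cov(t: float, s: float, hurst: float, sigma: float = 1.0) -> float:
    """Correlation function of fractional Brownian motion x_H(t):
        R(t, s) = (sigma^2 / 2) * (|t|^2H + |s|^2H - |t - s|^2H),
    with Hurst index `hurst` in (0, 1)."""
    h2 = 2.0 * hurst
    return 0.5 * sigma**2 * (abs(t)**h2 + abs(s)**h2 - abs(t - s)**h2)

# H = 1/2 recovers regular Brownian motion: R(t, s) = sigma^2 * min(t, s).
# H > 1/2 gives positively correlated ("red") increments, since
#   E[x(1) * (x(2) - x(1))] = R(1, 2) - R(1, 1) > 0.
```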
Some Preliminaries (continued)
• His basic assumption is that the correct result is the one without noise:

    \mathrm{Discrepancy} = \left[ \frac{1}{M} \sum_{j=1}^{M} \frac{1}{N} \sum_{t=1}^{N} \left( c_j^p(t) - c_j^r(t) \right)^2 \right]^{1/2},

where c_j^p(t) is the j-th IMF from the perturbed signal (signal + noise) and c_j^r(t) is the j-th IMF from the original signal without noise.

Test Results
[Figure: discrepancy test results; top, the whole data record perturbed; bottom, only 10% of it perturbed.]

Observations
• They made the critical assumption that the unperturbed signal gives the correct results.
• When the amplitude of the added perturbing noise is small, the discrepancy is small.
• When the amplitude of the added perturbing noise is large, the discrepancy becomes bi-modal.
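Gledhill's discrepancy measure is straightforward to implement. A sketch (our naming, stdlib Python) assuming both decompositions are given as lists of M equal-length IMFs:

```python
import math

def discrepancy(perturbed_imfs, reference_imfs):
    """Gledhill's discrepancy between the IMFs of the perturbed signal
    (signal + noise) and those of the clean reference signal:

        D = sqrt( (1/M) * sum_j (1/N) * sum_t (c_j^p(t) - c_j^r(t))^2 )

    Each argument is a list of M IMFs, each a length-N sequence."""
    m = len(reference_imfs)
    total = 0.0
    for c_p, c_r in zip(perturbed_imfs, reference_imfs):
        n = len(c_r)
        total += sum((p - r) ** 2 for p, r in zip(c_p, c_r)) / n
    return math.sqrt(total / m)

# Identical decompositions give zero discrepancy; a uniform offset of d
# on every point of every IMF gives a discrepancy of exactly d.
```

Note that D is an rms distance in IMF space, so it inherits the critical assumption flagged in the observations above: the unperturbed decomposition is treated as ground truth.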