Digital Signal Processing

1. Define statistical variance and covariance.

In probability theory and statistics, the variance is a measure of how far a set of numbers is spread out. It is one of several descriptors of a probability distribution, describing how far the numbers lie from the mean (expected value). In particular, the variance is one of the moments of a distribution, and in that context it forms part of a systematic approach to distinguishing between probability distributions; while other such approaches have been developed, those based on moments are advantageous in terms of mathematical and computational simplicity. The concept of variance extends to continuous data sets as well: instead of summing the individual squared differences from the mean, we integrate them. This approach is also useful when the number of data points is very large, such as the population of a country. Variance is used extensively in probability theory, where generalized conclusions must be drawn from a smaller sample set. Because variance describes the distribution of data around the mean, it tells us where we can expect an unknown data point to fall. The variance is a parameter describing either the actual probability distribution of an observed population of numbers, or the theoretical probability distribution of a sample (a not-fully-observed population) of numbers. In the latter case, a sample of data from such a distribution can be used to construct an estimate of its variance; in the simplest cases this estimate is the sample variance.

In probability theory and statistics, covariance is a measure of how much two random variables change together. If the greater values of one variable mainly correspond with the greater values of the other variable, and the same holds for the smaller values (i.e. the variables tend to show similar behavior), the covariance is positive. In the opposite case, when the greater values of one variable mainly correspond to the smaller values of the other (i.e. the variables tend to show opposite behavior), the covariance is negative. The sign of the covariance therefore shows the tendency of the linear relationship between the variables. The magnitude of the covariance is not easy to interpret; the normalized version of the covariance, the correlation coefficient, shows by its magnitude the strength of the linear relation. A distinction has to be made between the covariance of two random variables, a population parameter that is a property of the joint probability distribution, and the sample covariance, which serves as an estimate of that parameter.

2. How do you compute the energy of a discrete signal in the time and frequency domains?

3. Define the sample autocorrelation function. Give the mean value of this estimate.

Autocorrelation is the cross-correlation of a signal with itself. Informally, it is the similarity between observations as a function of the time separation between them. It is a mathematical tool for finding repeating patterns, such as a periodic signal buried under noise, or for identifying the missing fundamental frequency in a signal implied by its harmonic frequencies. It is often used in signal processing for analyzing functions or series of values, such as time-domain signals. Different fields of study define autocorrelation differently, and not all of these definitions are equivalent; in some fields, the term is used interchangeably with autocovariance. In statistics, the autocorrelation of a random process describes the correlation between values of the process at different points in time, as a function of the two times or of the time difference.
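Question 2 is answered by Parseval's relation for the DFT: the energy E = Σn |x(n)|² computed in the time domain equals (1/N) Σk |X(k)|² computed from the N-point DFT X(k). A minimal NumPy check of this identity (the random test signal is an arbitrary choice):

```python
import numpy as np

# Arbitrary finite-energy test signal.
rng = np.random.default_rng(0)
x = rng.standard_normal(256)

# Time-domain energy: E = sum over n of |x(n)|^2.
energy_time = np.sum(np.abs(x) ** 2)

# Frequency-domain energy via Parseval's relation:
# E = (1/N) * sum over k of |X(k)|^2, with X the N-point DFT of x.
X = np.fft.fft(x)
energy_freq = np.sum(np.abs(X) ** 2) / len(x)

# The two computations agree up to floating-point round-off.
assert np.isclose(energy_time, energy_freq)
```

The factor 1/N depends on the DFT normalization convention; NumPy's default forward transform is unnormalized, which is assumed here.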
Let X be some repeatable process, and i some point in time after the start of that process (i may be an integer for a discrete-time process or a real number for a continuous-time process). Then Xi is the value (or realization) produced by a given run of the process at time i. Suppose that the process is further known to have defined values for mean μi and variance σi² for all times i. Then the definition of the autocorrelation between times s and t is

R(s, t) = E[(Xt − μt)(Xs − μs)] / (σt σs),

where "E" is the expected value operator. Note that this expression is not well defined for all time series or processes, because the variance may be zero (for a constant process) or infinite. If the function R is well defined, its value must lie in the range [−1, 1], with 1 indicating perfect correlation and −1 indicating perfect anti-correlation. If Xt is a second-order stationary process, then the mean μ and the variance σ² are time-independent, and further the autocorrelation depends only on the difference between t and s: the correlation depends only on the time distance between the pair of values, not on their position in time. It is common practice in some disciplines other than statistics and time series analysis to drop the normalization by σ² and use the term "autocorrelation" interchangeably with "autocovariance". However, the normalization is important both because the interpretation of the autocorrelation as a correlation provides a scale-free measure of the strength of statistical dependence, and because the normalization has an effect on the statistical properties of the estimated autocorrelations.

4. What is the basic principle of the Welch method to estimate the power spectrum?

In physics, engineering, and applied mathematics, Welch's method, named after P. D. Welch, is used for estimating the power of a signal at different frequencies: that is, it is an approach to spectral density estimation.
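For question 3, the sample autocorrelation can be computed directly from its definition. The sketch below uses the biased estimator that divides by N rather than N − k; its mean value is approximately ((N − k)/N)·ρ(k), i.e. it underestimates the true autocorrelation at large lags. The sinusoidal test signal is an arbitrary illustration:

```python
import numpy as np

def sample_autocorrelation(x, max_lag):
    """Biased sample autocorrelation estimate, normalised so r[0] = 1."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()  # remove the sample mean first
    # Biased estimator: divide by n (not n - k). Its expected value is
    # roughly ((n - k)/n) * rho(k), so it shrinks toward zero at large lags.
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(max_lag + 1)])
    return r / r[0]

# A sinusoid with period 20 shows peaks in r at multiples of 20 lags.
t = np.arange(200)
x = np.sin(2 * np.pi * t / 20)
r = sample_autocorrelation(x, 40)
# r[0] = 1 by construction; r[20] is about 0.9, the exact value being
# reduced by the bias factor (n - k)/n = 180/200.
```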
The method is based on the concept of using periodogram spectrum estimates, which are the result of converting a signal from the time domain to the frequency domain. Welch's method is an improvement on the standard periodogram spectrum estimating method and on Bartlett's method, in that it reduces noise in the estimated power spectra in exchange for reduced frequency resolution. Because of the noise caused by imperfect and finite data, this noise reduction is often desired. The Welch method is based on Bartlett's method and differs in two ways:

1. The signal is split up into overlapping segments: the original data record is split into L data segments of length M, overlapping by D points.
   1. If D = M / 2, the overlap is said to be 50%.
   2. If D = 0, the overlap is said to be 0%. This is the same situation as in Bartlett's method.
2. The overlapping segments are then windowed: after the data is split into overlapping segments, each of the L data segments has a window applied to it (in the time domain).
   1. Most window functions give more influence to the data at the center of the set than to the data at the edges, which represents a loss of information. To mitigate that loss, the individual data sets are commonly overlapped in time (as in the step above).
   2. The windowing of the segments is what makes the Welch method a "modified" periodogram.

After doing the above, the periodogram of each segment is calculated by computing the discrete Fourier transform and then the squared magnitude of the result. The individual periodograms are then averaged, which reduces the variance of the individual power measurements. The end result is an array of power measurements vs. frequency "bin".

5. How do you find the ML estimate?

6. Give the basic principle of the Levinson recursion.

Ans. The Levinson recursion is a simplified method for solving the normal equations. It may be shown to be equivalent to a recurrence relation in orthogonal polynomial theory.
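The split–window–average procedure described for Welch's method in question 4 can be sketched in a few lines of NumPy. The segment length, 50% overlap, and Hann window below are arbitrary choices for illustration; production code would typically call scipy.signal.welch instead:

```python
import numpy as np

def welch_psd(x, segment_len=64, overlap=0.5):
    """Sketch of Welch's method: split, window, periodogram, average."""
    step = int(segment_len * (1 - overlap))   # advance; D = segment_len - step
    window = np.hanning(segment_len)
    scale = np.sum(window ** 2)               # window power normalisation
    # 1. Split the record into overlapping segments of length M.
    segments = [x[i:i + segment_len]
                for i in range(0, len(x) - segment_len + 1, step)]
    # 2. Window each segment, then form its periodogram: |DFT|^2.
    periodograms = [np.abs(np.fft.rfft(seg * window)) ** 2 / scale
                    for seg in segments]
    # 3. Average the periodograms; this reduces the variance of the estimate.
    return np.mean(periodograms, axis=0)

# A sinusoid at normalised frequency 0.25 buried in noise: the averaged
# spectrum peaks at bin 0.25 * segment_len = 16.
rng = np.random.default_rng(1)
n = np.arange(1024)
x = np.sin(2 * np.pi * 0.25 * n) + 0.5 * rng.standard_normal(1024)
psd = welch_psd(x)
peak_bin = int(np.argmax(psd))
```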
The simplification in Levinson's method is possible because the (Toeplitz) matrix has only N distinct elements, whereas a general matrix could have N² distinct elements. Levinson developed his recursion with a single time series in mind (the basic idea was presented in Section 3.3); it is very little extra trouble to do the recursion for multiple time series. We begin with the prediction-error normal equations. With multiple time series, unlike a single time series, the prediction problem changes if time is reversed, so the forward and backward prediction-error normal equations are written together as one equation. Since end effects play an important role, the recursion proceeds as follows: given the solutions for the three-term forward and backward prediction-error filters, the four-term solutions are found by forming linear combinations of them with constant matrices. The new forward filter is obtained by choosing the constants so that the bottom element on the right-hand side of the combined equation vanishes; the new backward filter is obtained by choosing them so that the top element vanishes. Of course, one will want to solve more than just the prediction-error problem. To go from the 3 × 3 to the 4 × 4 solution of the filter problem with an arbitrary right-hand side, the same construction is used, with the combination chosen so that the unwanted element introduced on the right-hand side is cancelled.

7. Why are FIR filters widely used for adaptive filters?

8. Express the Widrow–Hoff LMS adaptive algorithm. State its properties.

Ans. Least mean squares (LMS) algorithms are a class of adaptive filter used to mimic a desired filter by finding the filter coefficients that produce the least mean square of the error signal (the difference between the desired and the actual signal). LMS is a stochastic gradient descent method, in that the filter is adapted based only on the error at the current time. It was invented in 1960 by Stanford University professor Bernard Widrow and his first Ph.D. student, Ted Hoff.
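For the single-time-series case of question 6, the Levinson–Durbin recursion below sketches the order update: the order-m prediction-error filter is a linear combination of the order-(m − 1) filter and its time reverse, the scalar analogue of the forward/backward combination described above. The autocorrelation sequence used is a made-up AR(1) example:

```python
import numpy as np

def levinson_durbin(r, order):
    """Solve the Toeplitz normal equations R a = [E, 0, ..., 0]^T recursively.

    r is the autocorrelation sequence r[0..order]; returns the prediction-error
    filter a (with a[0] = 1) and the final prediction-error power E.
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    error = r[0]
    for m in range(1, order + 1):
        # Reflection coefficient from the current filter and error power.
        k = -np.dot(a[:m], r[m:0:-1]) / error
        # Order update: combine the filter with its time reverse so the
        # unwanted element of the right-hand side cancels.
        a[:m + 1] += k * a[:m + 1][::-1]
        error *= (1.0 - k ** 2)
    return a, error

# AR(1)-like autocorrelation r[k] = 0.5**k: the recursion should recover
# the predictor a = [1, -0.5, 0, 0] with error power 0.75.
r = np.array([1.0, 0.5, 0.25, 0.125])
a, err = levinson_durbin(r, 3)
```

Each order update costs O(N) operations, so the full solve is O(N²) instead of the O(N³) of general Gaussian elimination, which is the point of exploiting the Toeplitz structure.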
LMS algorithm summary. The LMS algorithm for a p-th order filter can be summarized as:

Parameters: p = filter order, μ = step size
Initialisation: ŵ(0) = 0
Computation: for n = 0, 1, 2, ...
    x(n) = [x(n), x(n − 1), ..., x(n − p + 1)]^T
    e(n) = d(n) − ŵ^H(n) x(n)
    ŵ(n + 1) = ŵ(n) + μ e*(n) x(n)

where ŵ^H(n) denotes the Hermitian transpose of ŵ(n). The main drawback of the "pure" LMS algorithm is that it is sensitive to the scaling of its input x(n). This makes it very hard (if not impossible) to choose a learning rate μ that guarantees stability of the algorithm (Haykin 2002). The normalised least mean squares filter (NLMS) is a variant of the LMS algorithm that solves this problem by normalising with the power of the input. The NLMS algorithm can be summarised as:

Parameters: p = filter order, μ = step size
Initialization: ŵ(0) = 0
Computation: for n = 0, 1, 2, ...
    x(n) = [x(n), x(n − 1), ..., x(n − p + 1)]^T
    e(n) = d(n) − ŵ^H(n) x(n)
    ŵ(n + 1) = ŵ(n) + μ e*(n) x(n) / (x^H(n) x(n))

Optimal learning rate: it can be shown that if there is no interference (v(n) = 0), then the optimal learning rate for the NLMS algorithm is μopt = 1, independent of the input x(n) and of the real (unknown) impulse response. In the general case with interference (v(n) ≠ 0), the optimal learning rate is

μopt = E[|y(n) − ŷ(n)|²] / E[|e(n)|²],

where y(n) is the noise-free output of the unknown system and ŷ(n) = ŵ^H(n) x(n) is the adaptive filter output. The results above assume that the signals v(n) and x(n) are uncorrelated with each other, which is generally the case in practice.
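The NLMS update can be sketched for real-valued signals, where the Hermitian transpose reduces to an ordinary dot product. The system-identification setup below (the unknown filter h, the signal length, and the regularisation eps are arbitrary choices) uses μ = 1, which is optimal here since there is no interference:

```python
import numpy as np

def nlms(x, d, p, mu=1.0, eps=1e-8):
    """Normalised LMS: adapt a length-p filter so that w^T x(n) tracks d(n)."""
    w = np.zeros(p)
    for n in range(p, len(x)):
        u = x[n - p + 1:n + 1][::-1]   # regressor: p most recent samples
        e = d[n] - w @ u               # a-priori error e(n)
        # Normalising the step by the input power u @ u makes the update
        # insensitive to the scaling of x; eps guards against division by zero.
        w += mu * e * u / (u @ u + eps)
    return w

# Identify an unknown FIR system with no interference (v(n) = 0).
rng = np.random.default_rng(2)
h = np.array([0.8, -0.4, 0.2])                 # hypothetical unknown filter
x = rng.standard_normal(2000)                  # white input
d = np.convolve(x, h)[:len(x)]                 # desired signal d(n)
w = nlms(x, d, p=3)                            # w converges to h
```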