Chapter 2: Digitization of Sound

Acoustic pressure waves are converted to electrical signals by a microphone. The output of the microphone is an analog signal, i.e., a continuous-valued voltage waveform, with the horizontal axis representing time and the vertical axis representing signal strength in volts.

[Figure: an audio signal, amplitude versus time]

For the computer to process the sound signal and/or store it in memory, we must digitize it (convert it into a stream of numbers).

There are two stages in the digitization process:

Sampling - take samples of the signal at regular locations along the time dimension (the horizontal axis). Uniform sampling is ubiquitous.
Questions:
- Can we recover the original signal from the sampled signal?
- How often do we need to sample the signal? (Sampling frequency)

Quantization - divide the signal strength (the vertical axis) into discrete levels. Quantization selects the closest level to represent each signal sample. Each level can be represented by a fixed-length binary number. For example, 8-bit quantization divides the vertical axis into 256 levels, and 16-bit quantization gives 65,536 levels.
Questions:
- How good is the signal after the conversion? (Number of bits in the quantization)
- How is audio data formatted?
  (The coding of the sampled signal)

Mathematical Formulation of Sampling

Consider the uniform impulse train

$$p(t) = \sum_{k=-\infty}^{\infty} \delta(t - kT)$$

where $T$ is the sampling period. The sampled signal is

$$y(t) = x(t)\,p(t) = \sum_{k=-\infty}^{\infty} x(t)\,\delta(t - kT) = \sum_{k=-\infty}^{\infty} x(kT)\,\delta(t - kT)$$

Its spectrum is

$$Y(j\omega) = \frac{1}{2\pi}\, X(j\omega) * P(j\omega)$$

Note: multiplication in the time domain is equivalent to convolution in the frequency domain. Here

$$P(j\omega) = \omega_s \sum_{k=-\infty}^{\infty} \delta(\omega - k\omega_s), \qquad \omega_s = \frac{2\pi}{T}$$

where $\omega_s$ is the sampling frequency. Thus

$$Y(j\omega) = \frac{1}{2\pi}\, X(j\omega) * \omega_s \sum_{k=-\infty}^{\infty} \delta(\omega - k\omega_s) = \frac{1}{T}\, X(j\omega) * \sum_{k=-\infty}^{\infty} \delta(\omega - k\omega_s) = \frac{1}{T} \sum_{k=-\infty}^{\infty} X[j(\omega - k\omega_s)]$$

The bandwidth $B$ of the spectrum $X(j\omega)$ must satisfy $\omega_s > 2B$, so that no overlap of the repeated replicas $X(j(\omega - k\omega_s))$ occurs in $Y(j\omega)$. If the bandwidth of $X(j\omega)$ does not satisfy the inequality, i.e., $\omega_s < 2B$, irreversible overlap of the spectrum replicas is produced. This effect is known as aliasing.

[Figure: time-domain signals x(t), p(t), and y(t) = x(t)p(t) shown next to their spectra X(jω) (peak A, bandwidth B), P(jω) (impulses spaced ω_s apart), and Y(jω) (replicas with peak A/T)]

The Sampling Theorem states that the sampling frequency must exceed twice the signal bandwidth in order for the original signal to be recoverable from the sampled signal.

[Figure: Y(jω) with overlapping replicas (aliasing), and recovery of X(jω) from non-overlapping replicas using a lowpass filter]

The minimum sampling frequency, $\omega_s = 2B$, is the Nyquist rate (often loosely called the Nyquist frequency). To prevent aliasing, we must remove any frequencies above half the sampling rate from the signal. This is done by lowpass filtering the signal before sampling.
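The aliasing effect described above can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the function names `alias_frequency` and `sample_cosine` and the 10 kHz / 7 kHz values are illustrative assumptions. It shows that a tone above half the sampling rate produces exactly the same samples as a lower-frequency tone, which is why the overlap cannot be undone after sampling.

```python
import math

def alias_frequency(f_hz, fs_hz):
    """Apparent frequency (Hz) of a real tone at f_hz after sampling at fs_hz."""
    # Sampling replicates the spectrum at every multiple of fs, so the tone is
    # indistinguishable from one at its distance to the nearest multiple of fs.
    return abs(f_hz - fs_hz * round(f_hz / fs_hz))

def sample_cosine(f_hz, fs_hz, count):
    """Return `count` samples x(kT) of cos(2*pi*f*t), with T = 1/fs."""
    return [math.cos(2 * math.pi * f_hz * k / fs_hz) for k in range(count)]

fs = 10_000    # sampling frequency in Hz (illustrative value)
tone = 7_000   # above fs/2 = 5 kHz, so the sampling theorem is violated
alias = alias_frequency(tone, fs)

high = sample_cosine(tone, fs, 50)
low = sample_cosine(alias, fs, 50)
max_diff = max(abs(a - b) for a, b in zip(high, low))

print(f"{tone} Hz sampled at {fs} Hz looks like {alias} Hz")
print(f"largest difference between the two sample sequences: {max_diff:.2e}")
```

The two sample sequences agree to within floating-point rounding, so no processing after the sampler can tell the 7 kHz tone from a 3 kHz one. This is the reason the lowpass filter has to act on the analog signal.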
Problem with the Nyquist-rate A/D converter:
- It needs a sharp-cutoff (brick-wall) analog lowpass filter -> a high-order analog filter that is difficult to build -> lots of RC components.

Quantization

Quantization uses a finite number of bits to represent a sampled signal, i.e., finite precision.
- The quantization process introduces quantization error (noise) and is not reversible.
- Quantization noise affects the quality of the sampled signal. Higher precision (more bits per sample) gives less noise because the spacing between adjacent quantization levels is finer.

Linear and Non-linear Quantization

If the quantization levels are uniformly spaced over the entire signal range, the quantization is called uniform (or linear) quantization. If the levels are spaced according to a non-linear function, it is called nonuniform (or non-linear) quantization.

The choice between linear and non-linear quantization depends strongly on the probability distribution of the signal. If the signal has a uniform probability density, it is better to use a linear quantizer. Examples of signals with approximately this property are video and music signals. A speech signal, on the other hand, has a probability distribution that resembles a Gamma distribution. For speech it is better to use non-linear quantization with the quantization levels placed on a logarithmic scale, e.g., A-law (used in Europe) and μ-law (used in the US) quantization for telephone speech. The dynamic range of digital telephone signals is effectively about 13 bits (A-law) to 14 bits (μ-law) rather than 8 bits.

[Figure: probability density pdf(V_S) versus V_S for the two cases]

Definition of Signal-to-Noise Ratio (SNR)
- In any analog system, some of the voltage is what you want to measure (signal), and some of it is random fluctuation (noise).
- The ratio of the power of the two (signal and noise) is called the signal-to-noise ratio (SNR). SNR is a measure of the quality of the signal.
- A signal with higher SNR is less noisy and, in general, of better quality.
- SNR is usually measured in decibels (dB):

$$\mathrm{SNR} = 10\log_{10}\frac{V_S^2}{V_N^2} = 20\log_{10}\frac{V_S}{V_N}$$

where $V_S$ is the signal voltage and $V_N$ is the noise voltage.

Signal-to-Quantization-Noise Ratio (SQNR)

The quantization process introduces error (noise). The quantization error (or quantization noise) is the difference between the actual value of the analog signal at the sampling time and the nearest quantization level. The largest (worst-case) quantization error is half of the quantization interval.

The precision of a digital audio sample is determined by the number of bits per sample, typically 8 or 16 bits. The quality of the quantization can be measured by the signal-to-quantization-noise ratio (SQNR).

Assume linear quantization with $N$ bits per sample. For sufficiently large $N$, the quantization error of an ideal linear quantizer can be treated as random noise with a uniform distribution. With the quantization step normalized to 1:

- Signal rms value (full-scale sine wave):
$$V_S = \frac{2^N}{2\sqrt{2}}$$
- Quantization noise variance:
$$V_N^2 = \int_{-1/2}^{1/2} q^2\,dq = \frac{1}{12}$$
- Noise rms value:
$$V_N = \frac{1}{\sqrt{12}}$$
- Signal-to-quantization-noise ratio:
$$\mathrm{SQNR} = \frac{V_S}{V_N} = \frac{2^N\sqrt{12}}{2\sqrt{2}} = \sqrt{1.5}\cdot 2^N$$
- In terms of dB:
$$\mathrm{SQNR} = 20\log_{10}\left(\sqrt{1.5}\cdot 2^N\right) = 6.02N + 1.76\ \text{dB}$$

In other words, each bit adds about 6 dB of resolution, so 16-bit quantization enables a maximum SQNR of about 96 dB (about 98 dB when the 1.76 dB term is included). Converters capable of 24-bit resolution at 96 kHz and even 192 kHz sample rates are now becoming standard.

Oversampling A/D Conversion

- A brick-wall filter may produce unwanted acoustic effects in Nyquist-rate A/D conversion.
- Such a filter suffers from limitations such as noise, distortion, group delay, and passband ripple, and it is difficult for such A/D converters to achieve resolution beyond 18 bits.
- Oversampling is a technique aimed at improving the results of the digitization process.
- In oversampling A/D conversion, the input signal is first passed through a mild analog low-pass filter, which provides sufficient attenuation at high frequencies. To extend the Nyquist frequency, the signal is then sampled at a high frequency and quantized. Afterwards, a digital low-pass filter (e.g., a linear-phase FIR filter) prevents aliasing when its output is downsampled to the desired output sampling frequency.

[Figure: spectra X(jω) and Y(jω) for sampling at 4ω_s]

In addition to eliminating the unwanted effects of a brick-wall analog filter, oversampling helps achieve increased resolution by spreading the spectrum of the quantization error far beyond the audio baseband. The out-of-band quantization noise can then be removed by a digital lowpass filter, which leaves relatively little in-band noise and thus increases the equivalent SQNR.

The decimator (downsampler) converts the oversampled signal back to the Nyquist rate and generates the correct output word length of, say, 16, 20, or 24 bits.

[Figure: quantization noise power spectrum at sampling rates ω_s and 2ω_s, with the out-of-band noise digitally filtered]

Each doubling of the sampling frequency decreases the in-band noise by 3 dB, increasing the resolution by half a bit. (Recall the formula SQNR = 6.02N + 1.76 dB.)

An added advantage of oversampling is that it relaxes the requirement for a sharp analog anti-aliasing filter.

Delta-Sigma A/D Converter

- Essentially, a delta-sigma converter digitizes the audio signal with a very low resolution (1-bit) A/D converter at a very high sampling rate.
- It is the oversampling rate and the subsequent digital processing that separate this from plain delta modulation.
- A delta-sigma converter consists of three parts: an analog modulator (1-bit converter), a digital lowpass filter, and a decimation circuit (downsampler).
- The simplicity of 1-bit circuitry makes the conversion process very fast and reduces distortion.
- High-speed "mixed-signal" IC processing allows the total integration of analog and digital circuitry.

Principles of the Σ-Δ A/D Converter
- Slope overload problem -> the analog signal must not change too rapidly.
- Higher sampling frequency -> smaller noise.
- Still, for a 1-bit converter, 64 times oversampling gives only about 4-bit precision -> insufficient for audio applications.

Adding Noise Shaping

If the quantization noise is shaped and pushed largely out of band during the quantization process in the oversampling A/D converter, the noise that remains in band is reduced. After digital lowpass filtering, a further improvement in SQNR can then be achieved.
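To make the three parts of a delta-sigma converter concrete, here is a minimal Python sketch of a first-order loop. It is an illustration under stated assumptions, not the circuit from the slides: the integrator and 1-bit DAC are idealized, the block average stands in for the digital lowpass filter and decimator, and the names `delta_sigma_1bit` and `average_and_decimate` are hypothetical.

```python
def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator returning a list of +1.0 / -1.0 outputs.

    Inputs are assumed to lie strictly between -1 and +1.
    """
    integrator = 0.0
    feedback = 0.0  # previous 1-bit output, fed back through an ideal 1-bit DAC
    bits = []
    for x in samples:
        integrator += x - feedback                      # delta (difference), then sigma (accumulate)
        feedback = 1.0 if integrator >= 0.0 else -1.0   # 1-bit quantizer
        bits.append(feedback)
    return bits

def average_and_decimate(bits, ratio):
    """Crude lowpass filter plus downsampler: the mean of each block of `ratio` bits."""
    return [sum(bits[i:i + ratio]) / ratio
            for i in range(0, len(bits) - ratio + 1, ratio)]

osr = 64                          # oversampling ratio (illustrative value)
dc_input = [0.3] * (osr * 100)    # a constant input held for 100 output samples
bitstream = delta_sigma_1bit(dc_input)
decoded = average_and_decimate(bitstream, osr)
overall_mean = sum(bitstream) / len(bitstream)

print(f"first decoded values: {decoded[:4]}")
print(f"mean of the whole bitstream: {overall_mean:.4f}")
```

Because the integrator stays bounded, the long-run average of the 1-bit stream has to track the input, even though each individual output carries only one bit. A real converter would replace the block average with a proper decimation filter.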
Adding a Noise Shaping Filter in the Σ-Δ A/D Converter

An integrator is a first-order lowpass filter:

$$H(f) \propto \frac{1}{f}$$

In-band quantization noise:

$$N_e^2 = \frac{\Delta^2}{12}\cdot\frac{\pi^2}{3K^3}$$

where $K$ is the oversampling ratio (OSR) and $\Delta$ denotes the quantization step size; i.e., doubling the OSR ($K$) increases SQNR by 9 dB.

Higher Order Noise Shaping Filter

In-band quantization noise for a second-order filter:

$$N_e^2 = \frac{\Delta^2}{12}\cdot\frac{\pi^4}{5K^5}$$

where $K$ is the oversampling ratio (OSR); i.e., doubling the OSR ($K$) increases SQNR by 15 dB.

Generalization to $L$th order: doubling the OSR ($K$) increases SQNR by $(6L + 3)$ dB.

Multi-bit Σ-Δ A/D Converter
- Uses an n-bit flash ADC and an n-bit DAC.
- Gives a higher dynamic range for a given oversampling ratio and order of loop filter.
- Stabilization is easier, since second-order loops can generally be used.

Adding Dither Noise

With large-amplitude complex signals, there is little correlation between the signal and the quantization error; thus the error is random and perceptually similar to white noise. With low-level signals, the character of the error changes as it becomes correlated with the signal, and potentially audible distortion results.

Dither is a small amount of noise added to the audio signal before conversion. The added noise causes the small signal to jump around, which makes the converter switch rapidly between levels rather than being forced to choose between two fixed values.

Quantization error is thus decorrelated from the signal and is perceived as wideband noise rather than audible distortion.

Typical Audio Formats

Signal samples can be represented in Pulse Code Modulation (PCM) format. If a sample can have a positive or negative value, it can be stored in the computer as an N-bit number with 1 bit representing the sign and N-1 bits representing the magnitude, i.e., a signed integer.
In most cases, samples are simply stored as fixed-point two's-complement numbers.

Popular audio file formats include .au (Unix workstations) and .wav (PC). There are also popular formats that compress the audio signal, for example .mp3 for the MPEG-1 Audio Layer III codec and .ra for the RealAudio codec.

Data Rates of Common Digital Audio Signals

[Table: not recoverable from the source text]
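The data rate of uncompressed PCM audio follows directly from the sampling and quantization parameters: sample rate × bits per sample × number of channels. The Python sketch below illustrates this together with two's-complement storage. The function names are hypothetical, and the telephone and CD parameters are well-known values used as examples rather than entries taken from the original table.

```python
import struct

def pcm_data_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

def to_int16(x):
    """Quantize a sample in [-1.0, 1.0] to a 16-bit two's-complement integer."""
    scaled = round(x * 32767)
    return max(-32768, min(32767, scaled))  # clip to the representable range

telephone = pcm_data_rate(8_000, 8, 1)     # 8 kHz, 8-bit, mono
cd_audio = pcm_data_rate(44_100, 16, 2)    # 44.1 kHz, 16-bit, stereo
print(telephone, "bit/s")   # 64000 bit/s
print(cd_audio, "bit/s")    # 1411200 bit/s

# '<h' packs a signed 16-bit integer as little-endian two's complement,
# which is the sample layout used by 16-bit PCM .wav files.
print(struct.pack("<h", to_int16(-1.0)).hex())   # 0180
print(struct.pack("<h", -1).hex())               # ffff
```

Note that in two's complement the value -1 is stored as all ones (`ffff`), not as a sign bit plus a magnitude of 1.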
