Chapter 2: Digitization of Sound
Acoustic pressure waves are converted to electrical signals
by a microphone. The output signal from the microphone is an
analog signal, i.e., a continuous-valued voltage waveform,
with the horizontal axis representing time and the vertical
axis representing signal strength in volts.
[Figure: An audio signal (amplitude versus time)]
For the computer to process the sound signal and/or store it
in memory, we must digitize it (convert it into a stream of
numbers).
Digitization of Sound
Two stages in the digitization process:
Sampling - take samples of the signal at regular locations
along the time dimension (the horizontal axis). Uniform
sampling is ubiquitous. Questions:
• Can we recover the original signal from the sampled
signal?
• How often do we need to sample the signal? (Sampling
frequency)
Quantization - divide the signal strength (the vertical axis)
into discrete levels. Quantization selects the closest level
to represent each signal sample. Each level can be represented
by a fixed-length binary number. For example, 8-bit
quantization divides the vertical axis into 256 levels, and
16-bit quantization gives 65536 levels. Questions:
• How good is the signal after the conversion? (Number of
bits in the quantization)
• How is audio data formatted? (The coding of the sampled
signal)
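The two stages can be sketched in a few lines of Python. This is only an illustration under assumed parameters: the function name `sample_and_quantize`, the 1 kHz test tone, the 8 kHz sampling rate and the signal range [-1, 1) are not from the slides.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, n_bits, n_samples):
    """Sample a unit-amplitude sine and quantize each sample to n_bits."""
    levels = 2 ** n_bits                  # e.g. 256 levels for 8 bits
    step = 2.0 / levels                   # level spacing over the range [-1, 1)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz            # sampling: take the signal at t = nT
        x = math.sin(2 * math.pi * freq_hz * t)
        code = math.floor(x / step + 0.5)     # quantization: nearest level index
        code = max(-levels // 2, min(levels // 2 - 1, code))  # clip to valid codes
        codes.append(code)
    return codes

codes = sample_and_quantize(freq_hz=1000, sample_rate_hz=8000, n_bits=8, n_samples=8)
print(codes)
```

Each entry of `codes` is an integer that fits in 8 bits, so one period of the tone becomes a stream of eight small numbers.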
Mathematical Formulation of Sampling

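Consider the uniform impulse train
$$p(t) = \sum_{k=-\infty}^{\infty} \delta(t - kT)$$
where $T$ is the sampling period. The sampled signal is
$$y(t) = x(t)\,p(t) = \sum_{k=-\infty}^{\infty} x(t)\,\delta(t - kT) = \sum_{k=-\infty}^{\infty} x(kT)\,\delta(t - kT)$$
and its spectrum is
$$Y(j\omega) = \frac{1}{2\pi}\, X(j\omega) * P(j\omega)$$
(note: multiplication in the time domain is equivalent to
convolution in the frequency domain)
where
$$P(j\omega) = \omega_s \sum_{k=-\infty}^{\infty} \delta(\omega - k\omega_s)$$
and $\omega_s = \frac{2\pi}{T}$ is the sampling frequency.
Thus
$$Y(j\omega) = \frac{1}{2\pi}\, X(j\omega) * \omega_s \sum_{k=-\infty}^{\infty} \delta(\omega - k\omega_s)$$
$$= \frac{1}{T} \sum_{k=-\infty}^{\infty} X(j\omega) * \delta(\omega - k\omega_s)$$
$$= \frac{1}{T} \sum_{k=-\infty}^{\infty} X[j(\omega - k\omega_s)]$$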
The bandwidth $B$ of the spectrum $X(j\omega)$ must satisfy $\omega_s > 2B$, so that
no overlap of the repeated replicas $X(j(\omega - k\omega_s))$ occurs in $Y(j\omega)$. If the
bandwidth of $X(j\omega)$ does not satisfy the inequality, i.e., $\omega_s < 2B$,
irreversible overlap of the spectrum replicas is produced. This effect is
known as aliasing.
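A small numerical check makes the effect concrete. The sketch below uses frequencies in Hz rather than rad/s, and the 7 kHz tone, the 10 kHz sampling rate and the helper names are assumptions chosen for illustration. It shows that a tone above half the sampling rate produces exactly the same samples as a lower-frequency tone.

```python
import math

def sample_tone(freq_hz, sample_rate_hz, n_samples):
    """Return n_samples of cos(2*pi*f*t) taken at t = n / sample_rate_hz."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

def alias_frequency(freq_hz, sample_rate_hz):
    """Apparent frequency of a real tone after sampling, folded into [0, fs/2]."""
    f = freq_hz % sample_rate_hz
    return min(f, sample_rate_hz - f)

fs = 10_000                                   # sampling rate, too low for 7 kHz
high = sample_tone(7_000, fs, 50)
low = sample_tone(alias_frequency(7_000, fs), fs, 50)
max_diff = max(abs(a - b) for a, b in zip(high, low))
print(alias_frequency(7_000, fs), max_diff)
```

Once the two tones have been sampled, nothing in the sample values distinguishes them, which is why the overlap is irreversible.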
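[Figure: Sampling in the time domain and in the frequency domain. Time: $x(t)$, the impulse train $p(t)$ with impulses at $-T, 0, T, 2T$, and $y(t) = x(t)\,p(t)$. Frequency: $X(j\omega)$ with peak $A$ and bandwidth $B$, $P(j\omega)$ with impulses spaced $\omega_s$ apart, and $Y(j\omega)$ with replicas of height $A/T$.]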
The sampling theorem states that the sampling frequency must
exceed twice the signal bandwidth in order for the original signal to
be recoverable from the sampled signal.
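Recovery can be illustrated with ideal lowpass (sinc) interpolation of the samples. This is a sketch under assumptions: the 1 Hz tone, the 8 Hz sampling rate, the finite window of 801 samples and the helper names are all illustrative. The ideal formula sums over infinitely many samples, so the truncated sum below is only approximate.

```python
import math

def sinc(u):
    """Normalized sinc: sin(pi*u) / (pi*u), with sinc(0) = 1."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(samples, first_index, sample_rate_hz, t):
    """Sinc-interpolate x(t) from samples x[n] taken at t = n / sample_rate_hz."""
    u = t * sample_rate_hz
    return sum(x * sinc(u - (first_index + i)) for i, x in enumerate(samples))

fs = 8.0         # sampling rate in Hz, more than twice the 1 Hz tone
f0 = 1.0
samples = [math.sin(2 * math.pi * f0 * n / fs) for n in range(-400, 401)]

t = 0.3          # an instant that falls between two sampling points
estimate = reconstruct(samples, -400, fs, t)
exact = math.sin(2 * math.pi * f0 * t)
print(estimate, exact)
```

The estimate agrees with the original waveform up to the truncation error of the finite sum.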
[Figure: spectrum $Y(j\omega)$ with aliasing (overlapping replicas), and $Y(j\omega)$ without aliasing, where a lowpass filter recovers $X(j\omega)$]
$\omega_s = 2B$ is the Nyquist rate.
In order to prevent aliasing, we must remove any frequencies above half the
sampling rate from the signal. This is done by lowpass filtering the signal
before sampling.
Problem with Nyquist-Rate A/D Converter
• needs a sharp-cutoff (brick-wall) analog lowpass filter -> a high-order analog
filter, which is difficult to build -> lots of RC components
Quantization
• uses a finite number of bits to represent each sampled value,
i.e., finite precision
• the quantization process introduces quantization error
(noise) and is not reversible
• quantization noise affects the quality of the sampled
signal; higher precision (more bits per sample) gives
less noise because adjacent quantization levels are
closer together

Linear and Non-linear Quantization
If the quantization levels are uniformly spaced over the
entire signal range, the resulting quantization is called
uniform (or linear) quantization. If the levels are
spaced according to a non-linear function, it is called non-uniform (or non-linear) quantization.
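A uniform quantizer is short enough to write out. The sketch below assumes a mid-rise quantizer on the range [-1, 1), where each input is replaced by the midpoint of the cell it falls in; the function names are illustrative. It checks that the error never exceeds half the spacing between levels, and that more bits give a smaller error.

```python
import math

def uniform_quantize(x, n_bits, full_scale=1.0):
    """Map x in [-full_scale, full_scale) to one of 2**n_bits uniform levels."""
    step = 2 * full_scale / 2 ** n_bits           # spacing between adjacent levels
    index = math.floor(x / step)                  # cell that contains x
    index = max(-2 ** (n_bits - 1), min(2 ** (n_bits - 1) - 1, index))
    return (index + 0.5) * step                   # reconstruct at the cell midpoint

def max_error(n_bits, n_points=2001):
    """Largest |x - Q(x)| over a grid of inputs inside the quantizer range."""
    xs = [-1 + 2 * i / n_points for i in range(n_points)]
    return max(abs(x - uniform_quantize(x, n_bits)) for x in xs)

for bits in (4, 8, 12):
    print(bits, max_error(bits), 1 / 2 ** bits)   # error stays within step / 2
```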
The choice of linear or non-linear
quantization depends very much
on the probability distribution of the
signal. If the signal has a uniform
probability density, it is better to use
a linear quantizer. Examples of signals
with this property are video and
music signals.
On the other hand, a speech signal has
a probability density that
resembles a Gamma density. It is
better to use non-linear quantization
with the quantization levels placed
according to a logarithmic scale, e.g.,
A-law (used in Europe) and μ-law
(used in the US) quantization for
telephone speech. The dynamic range
of digital telephone signals (A-law or
μ-law) is effectively 13 bits rather
than 8 bits.
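The effect of logarithmic level placement can be sketched with the continuous μ-law curve. The assumptions here are μ = 255 and a plain 8-bit uniform quantizer in the compressed domain; telephone codecs use a segmented approximation of this curve, which is not reproduced here, and the function names are illustrative.

```python
import math

MU = 255.0   # assumed value; the usual choice for 8-bit mu-law telephony

def mu_compress(x):
    """Compress x in [-1, 1] with the continuous mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def quantize_8bit(y):
    """Uniform 8-bit quantizer on [-1, 1): 256 cells, reconstructed at midpoints."""
    step = 2.0 / 256
    index = max(-128, min(127, math.floor(y / step)))
    return (index + 0.5) * step

def mu_law_quantize(x):
    """Non-linear quantization: compress, quantize uniformly, then expand."""
    return mu_expand(quantize_8bit(mu_compress(x)))

for x in (0.001, 0.01, 0.1, 0.9):
    linear_err = abs(x - quantize_8bit(x))
    mu_err = abs(x - mu_law_quantize(x))
    print(f"x={x}: linear error {linear_err:.5f}, mu-law error {mu_err:.5f}")
```

Small amplitudes, which dominate speech, are represented far more accurately than with 8-bit linear quantization, at the cost of coarser steps near full scale.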
[Figure: two plots of probability density pdf($V_S$) versus $V_S$]
Definition of Signal to Noise Ratio (SNR)
In any analog system, part of the voltage is the signal you
want to measure, and part of it is random
fluctuation (noise).

The ratio of the power of the two (signal and noise) is
called the signal to noise ratio (SNR). SNR is a measure of
the quality of the signal. A signal with a higher SNR
is less noisy and, in general, of better quality.

SNR is usually measured in decibels (dB):
$$\mathrm{SNR} = 10 \log_{10} \frac{V_S^2}{V_N^2} = 20 \log_{10} \frac{V_S}{V_N}$$
where $V_S$ is the signal voltage and $V_N$ is the noise voltage.
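As a quick check of the two equivalent forms of the definition, the following sketch computes the SNR from voltages and from powers. The function names and the example voltages are illustrative assumptions.

```python
import math

def snr_db(signal_voltage, noise_voltage):
    """SNR in dB from rms signal and noise voltages: 20*log10(Vs/Vn)."""
    return 20 * math.log10(signal_voltage / noise_voltage)

def snr_db_from_power(signal_power, noise_power):
    """Same ratio expressed with powers: 10*log10(Ps/Pn)."""
    return 10 * math.log10(signal_power / noise_power)

print(snr_db(1.0, 0.001))                   # voltage ratio of 1000, about 60 dB
print(snr_db_from_power(1.0, 0.001 ** 2))   # the matching power ratio, also about 60 dB
```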
Signal to Quantization Noise Ratio (SQNR)
Quantization introduces error (noise). The quantization
error (or quantization noise) is the difference between the actual
value of the analog signal at the sampling time and the nearest
quantization level. The largest (worst-case) quantization
error is half of the quantization interval.
The precision of a digital audio sample is determined by the
number of bits per sample, typically 8 or 16 bits. The quality of
the quantization can be measured by the Signal to Quantization
Noise Ratio (SQNR).
Assuming linear quantization with $N$ bits per sample and
sufficiently large $N$, the quantization error of an ideal linear
quantizer can be considered random noise with a uniform distribution.
With the quantization interval normalized to 1, a full-scale sine wave has rms value
$$V_S = \frac{2^N}{2\sqrt{2}}$$
where $N$ is the number of bits per sample.
Quantization noise variance:
$$V_N^2 = \int_{-1/2}^{1/2} q^2 \, dq = \frac{1}{12}$$
Noise rms value:
$$V_N = \left(\frac{1}{12}\right)^{1/2}$$
Signal-to-quantization noise ratio:
$$\mathrm{SQNR} = \frac{V_S}{V_N} = \frac{2^N \sqrt{12}}{2\sqrt{2}} = \sqrt{1.5} \cdot 2^N$$
In terms of dB,
$$\mathrm{SQNR} = 20 \log_{10}\left(\sqrt{1.5} \cdot 2^N\right) = 6.02N + 1.76 \text{ dB}$$
In other words, each bit adds about 6 dB of resolution, so 16-bit
quantization enables a maximum SQNR of about 96 dB.
Converters capable of 24-bit resolution at 96 kHz
and even 192 kHz sample rates are now becoming standard.
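The formula can be compared against a direct measurement. The sketch below quantizes a full-scale sine wave to integer steps and computes the ratio of signal power to error power. The 997 Hz tone, the 48 kHz rate and the function names are assumptions for illustration, and clipping of the single topmost code is ignored.

```python
import math

def theoretical_sqnr_db(n_bits):
    """SQNR of a full-scale sine with an ideal n_bits linear quantizer."""
    return 20 * math.log10(math.sqrt(1.5) * 2 ** n_bits)

def measured_sqnr_db(n_bits, n_samples=48_000, freq_hz=997, sample_rate_hz=48_000):
    """Quantize a full-scale sine to integer steps and measure signal/noise power."""
    amplitude = 2 ** (n_bits - 1)             # peak value in units of one step
    signal_power = 0.0
    noise_power = 0.0
    for n in range(n_samples):
        x = amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        q = math.floor(x + 0.5)               # round to the nearest level
        signal_power += x * x
        noise_power += (x - q) ** 2
    return 10 * math.log10(signal_power / noise_power)

for bits in (8, 12, 16):
    print(bits, round(theoretical_sqnr_db(bits), 2), round(measured_sqnr_db(bits), 2))
```

The measured values should track 6.02N + 1.76 dB closely, since the rounding error of a busy signal behaves much like the uniform noise assumed in the derivation.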
Oversampling A/D Conversion
A brick-wall filter may produce unwanted acoustic effects in Nyquist-rate
A/D conversion. It suffers from limitations such as noise, distortion, group
delay, and passband ripple, and it is difficult for A/D converters to achieve
resolution beyond 18 bits.
Oversampling is a technique aimed at improving the results of the
digitization process. In oversampling A/D conversion, the input signal is first
passed through a mild analog low-pass filter, which provides sufficient
attenuation at high frequencies. To extend the Nyquist frequency, the signal
is then sampled at a high frequency and quantized. Afterwards, a digital
low-pass filter (e.g., a linear-phase FIR filter) prevents aliasing when its
output is downsampled to the desired output sampling frequency.
[Figure: spectra $X(j\omega)$ and $Y(j\omega)$ for oversampled conversion, with the frequency axis marked at $\pm 4\omega_s$]
In addition to eliminating the unwanted effects of a brick-wall analog filter,
oversampling helps achieve increased resolution by spreading the spectrum
of the quantization error far beyond the audio baseband. The out-of-band
quantization noise can then be removed by a digital lowpass filter, leaving
relatively little in-band noise and thus increasing the equivalent
SQNR. The decimator (downsampler) converts the oversampled signal back to
the Nyquist rate and generates the correct output word length of, say, 16,
20, or 24 bits.
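[Figure: quantization noise power density versus frequency for sampling rates $\omega_s$ and $2\omega_s$; at the higher rate the same noise power is spread over twice the bandwidth, and the out-of-band part is digitally filtered]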
Each doubling of the sampling frequency decreases the in-band
noise by 3 dB, increasing the resolution by half a bit.
(Remember the formula SQNR = 6.02N + 1.76 dB.)
An added advantage of oversampling is that it relaxes the requirement for a sharp
analog anti-aliasing filter.
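The 3 dB per doubling rule translates directly into equivalent bits. A minimal sketch follows, assuming plain oversampling with no noise shaping; the function names are illustrative.

```python
import math

def oversampling_gain_db(osr):
    """In-band noise reduction from plain oversampling: 3 dB per doubling."""
    return 10 * math.log10(osr)

def extra_bits(osr):
    """Equivalent resolution gain, using about 6.02 dB per bit."""
    return oversampling_gain_db(osr) / (20 * math.log10(2))

for osr in (2, 4, 64):
    print(osr, round(oversampling_gain_db(osr), 2), round(extra_bits(osr), 2))
```

Four times oversampling buys one extra bit, and 64 times oversampling buys only three, so plain oversampling on its own is an expensive way to gain resolution.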
Delta-Sigma A/D Converter
Essentially, a delta-sigma converter digitizes the audio signal with a
very low resolution (1-bit) A/D converter at a very high sampling
rate. It is the oversampling rate and subsequent digital processing
that separate this from plain delta modulation.
A delta-sigma converter consists of three parts: an analog
modulator (1-bit converter), a digital lowpass filter and a
decimation circuit (downsampler).
The simplicity of 1-bit circuitry makes the conversion process very fast and
reduces distortion.
High-speed "mixed-signal" IC processing allows the total integration of
analog and digital circuitry.
Principles of Σ-Δ A/D Converter
Slope overload problem -> the analog signal must not change too rapidly
Higher sampling frequency -> smaller noise
Still, for a 1-bit converter, 64 times oversampling gives only about 4-bit
precision -> insufficient for audio applications
Adding Noise Shaping
If the quantization noise is shaped and pushed
largely out of band during the quantization process in
the oversampling A/D converter, the noise that remains
in-band is reduced; after digital lowpass
filtering, a further improvement in SQNR can be
achieved.
Adding Noise Shaping Filter in Σ-Δ A/D Converter
An integrator is a first-order lowpass filter, $H(f) = \frac{1}{f}$.
In-band quantization noise:
$$N_e^2 = \frac{\Delta^2}{12} \cdot \frac{\pi^2}{3K^3}$$
where $\Delta$ is the quantization step and $K$ is the oversampling ratio (OSR),
i.e., doubling the OSR ($K$) increases the SQNR by 9 dB
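A first-order modulator with an integrator in the loop can be simulated in a few lines. This is a behavioural sketch, not a circuit model: the function names, the constant test input of 0.5 and the use of a moving average as the lowpass filter and decimator are all assumptions.

```python
def delta_sigma_first_order(samples):
    """First-order modulator: integrate (input - fed-back output), then 1-bit quantize."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback                      # accumulate the difference
        feedback = 1.0 if integrator >= 0 else -1.0     # 1-bit quantizer and 1-bit DAC
        bits.append(feedback)
    return bits

def moving_average(values, length):
    """Crude digital lowpass filter plus decimation by `length`."""
    return [sum(values[i:i + length]) / length
            for i in range(0, len(values) - length + 1, length)]

bits = delta_sigma_first_order([0.5] * 6400)    # constant input, well inside [-1, 1]
decimated = moving_average(bits, 64)            # 64 one-bit samples per output value
print(decimated[:5])
```

Every output of the modulator is +1 or -1, yet the local average of the bit stream follows the input, because the integrator forces the accumulated error to stay bounded.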
Higher Order Noise Shaping Filter
In-band quantization noise for a second-order filter:
$$N_e^2 = \frac{\Delta^2}{12} \cdot \frac{\pi^4}{5K^5}$$
where $K$ is the oversampling ratio (OSR),
i.e., doubling the OSR ($K$) increases the SQNR by 15 dB
Generalization to $L$th order: doubling the OSR ($K$) increases the SQNR by
$(6L+3)$ dB
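The (6L+3) dB rule can be checked numerically. The general expression used in the sketch below is an assumption: it is the natural extension of the first- and second-order noise formulas, $\frac{\Delta^2}{12} \cdot \frac{\pi^{2L}}{(2L+1)K^{2L+1}}$, and the function names are illustrative.

```python
import math

def in_band_noise(order, osr, step=1.0):
    """In-band noise power for an order-L noise shaper (assumed general form).

    (step**2 / 12) * pi**(2L) / ((2L + 1) * K**(2L + 1))
    """
    return (step ** 2 / 12) * math.pi ** (2 * order) / (
        (2 * order + 1) * osr ** (2 * order + 1))

def gain_per_doubling_db(order, osr=64):
    """SQNR improvement when the oversampling ratio goes from osr to 2 * osr."""
    return 10 * math.log10(in_band_noise(order, osr) / in_band_noise(order, 2 * osr))

for order in (0, 1, 2, 3):
    print(order, round(gain_per_doubling_db(order), 2), 6 * order + 3)
```

Order 0 corresponds to plain oversampling and reproduces the 3 dB per doubling figure.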
Multi-bit Σ-Δ A/D Converter
• uses an n-bit flash ADC and an n-bit DAC
• gives a higher dynamic range for a given oversampling ratio and
order of loop filter
• stabilization is easier, since second-order loops can generally be
used
Adding Dither Noise
With large-amplitude complex signals,
there is little correlation between the
signal and the quantization error; thus
the error is random and perceptually
similar to white noise. With low-level
signals, the character of the error
changes as it becomes correlated with
the signal, and potentially audible
distortion results.
Dither is a small amount of noise
added to the audio signal before
conversion. The added noise causes
the small signal to jump around, which
causes the converter to switch rapidly
between levels rather than being
forced to choose between two fixed
values. The quantization error is thus
decorrelated from the signal and is
perceived as wideband noise rather
than audible distortion.
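The effect is easy to reproduce with a constant input smaller than half a quantization step. The sketch below assumes uniform dither spanning one step and a fixed random seed; the function names and the input level of 0.3 are illustrative.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer level (step size of 1)."""
    return math.floor(x + 0.5)

def average_output(level, n_samples, dither):
    """Mean quantizer output for a constant input, with or without dither."""
    rng = random.Random(0)                       # fixed seed so runs are repeatable
    total = 0
    for _ in range(n_samples):
        noise = rng.uniform(-0.5, 0.5) if dither else 0.0
        total += quantize(level + noise)
    return total / n_samples

level = 0.3                                      # a signal smaller than half a step
plain = average_output(level, 20_000, dither=False)
dithered = average_output(level, 20_000, dither=True)
print(plain, dithered)
```

Without dither the low-level input disappears entirely; with dither the converter toggles between two levels and the average of its output still carries the input value, at the price of added wideband noise.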
Typical Audio Formats
Signal samples can be represented in Pulse Code Modulation (PCM)
format. If a sample can have a positive or negative value, it can
be stored in the computer as an N-bit number with 1 bit representing
the sign and N-1 bits representing the magnitude, i.e., a signed integer.
In most cases, samples are simply stored as fixed-point two's-complement
numbers. Popular audio file formats include .au (Unix
workstations) and .wav (PC). Other popular formats require
compression of the audio signal, for example .mp3 for the MPEG Layer 3
audio codec and .ra for the RealPlayer audio codec.
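Storing samples as two's-complement numbers can be sketched with Python's `struct` module. The 16-bit word length, the little-endian byte order and the scaling of [-1, 1] to ±32767 are assumptions made for the example, and the function names are illustrative.

```python
import struct

def to_pcm16(samples):
    """Pack floats in [-1, 1] as little-endian 16-bit two's-complement PCM."""
    codes = [max(-32768, min(32767, int(round(s * 32767)))) for s in samples]
    return struct.pack(f"<{len(codes)}h", *codes)

def from_pcm16(data):
    """Unpack little-endian 16-bit PCM bytes back to integer sample codes."""
    return list(struct.unpack(f"<{len(data) // 2}h", data))

raw = to_pcm16([0.0, 0.5, -0.5, 1.0, -1.0])
print(raw.hex())
print(from_pcm16(raw))
```

Each sample occupies two bytes, and a negative value such as -16384 is stored as the bit pattern 0xC000 rather than as a sign bit plus magnitude.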
Data Rates of Common Digital Audio Signals
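The data rate of an uncompressed PCM signal is the product of sampling rate, bits per sample and number of channels. The parameter sets in the sketch below are common, well-known configurations chosen for illustration; they are assumptions and are not taken from this document.

```python
def data_rate_bps(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

# Illustrative parameter sets (assumptions, not taken from this document).
examples = {
    "telephone, 8 kHz, 8-bit, mono": (8_000, 8, 1),
    "CD audio, 44.1 kHz, 16-bit, stereo": (44_100, 16, 2),
    "studio, 96 kHz, 24-bit, stereo": (96_000, 24, 2),
}
for name, params in examples.items():
    rate = data_rate_bps(*params)
    print(f"{name}: {rate / 1000:.1f} kbit/s, {rate / 8 / 1024:.1f} KiB/s")
```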