Chapter 6
Basics of Digital Audio
Li & Drew
Fundamentals of Multimedia, Chapter 6
6.1 Digitization of Sound
What is Sound?
• Sound is a wave phenomenon like light, but is macroscopic
and involves molecules of air being compressed and
expanded under the action of some physical device.
(a) For example, a speaker in an audio system vibrates back and
forth and produces a longitudinal pressure wave that we
perceive as sound.
(b) Since sound is a pressure wave, it takes on continuous values,
as opposed to digitized ones.
(c) Even though such pressure waves are
longitudinal, they still have ordinary wave
properties and behaviors, such as reflection
(bouncing), refraction (change of angle when
entering a medium with a different density)
and diffraction (bending around an obstacle).
(d) If we wish to use a digital version of sound
waves we must form digitized representations
of audio information.
Digitization
• Digitization means conversion to a stream of
numbers, and preferably these numbers
should be integers for efficiency.
• Fig. 6.1 shows the 1-dimensional nature of
sound: amplitude values depend on a 1D
variable, time. (And note that images depend
instead on a 2D set of variables, x and y).
Fig. 6.1: An analog signal: continuous measurement
of pressure wave.
• The graph in Fig. 6.1 has to be made digital in both time and
amplitude. To digitize, the signal must be sampled in each
dimension: in time, and in amplitude.
(a) Sampling means measuring the quantity we are interested in, usually
at evenly-spaced intervals.
(b) The first kind of sampling, using measurements only at evenly spaced
time intervals, is simply called sampling. The rate at which it is
performed is called the sampling frequency (see Fig. 6.2(a)).
(c) For audio, typical sampling rates are from 8 kHz (8,000 samples per
second) to 48 kHz. This range is determined by the Nyquist theorem,
discussed later.
(d) Sampling in the amplitude or voltage dimension is called
quantization. Fig. 6.2(b) shows this kind of sampling.
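The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method used for the figures: the function name `sample_and_quantize`, the 440 Hz test tone, and the assumption that the signal lies in [-1, 1] are all choices made for this example.

```python
import math

def sample_and_quantize(signal, duration_s, sample_rate_hz, n_bits):
    """Sample signal(t) at evenly spaced times, then quantize each sample.

    Assumes (for this sketch) that signal values lie in [-1.0, 1.0].
    Returns a list of integer codes in the range 0 .. 2**n_bits - 1.
    """
    levels = 2 ** n_bits
    n_samples = round(duration_s * sample_rate_hz)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz               # sampling: evenly spaced in time
        x = signal(t)
        # quantization: map [-1, 1] onto the integer levels 0 .. levels - 1
        code = round((x + 1.0) / 2.0 * (levels - 1))
        codes.append(min(max(code, 0), levels - 1))
    return codes

def tone(t):
    """Hypothetical test signal: a 440 Hz sine wave with amplitude 1."""
    return math.sin(2 * math.pi * 440.0 * t)

# 10 ms of the tone at 8 kHz with 8-bit samples gives 80 integer codes.
codes = sample_and_quantize(tone, duration_s=0.01, sample_rate_hz=8000, n_bits=8)
print(len(codes), codes[:8])
```

The result is the "stream of numbers" described earlier: integers, one per sampling instant.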
Fig. 6.2: Sampling and Quantization. (a) Sampling the
analog signal in the time dimension. (b) Quantization is
sampling the analog signal in the amplitude dimension.
Signal to Noise Ratio (SNR)
• The ratio of the power of the correct signal to the power of the
noise is called the signal to noise ratio (SNR), a measure of the
quality of the signal.
• The SNR is usually measured in decibels (dB), where 1 dB is
a tenth of a bel. The SNR value, in units of dB, is defined in
terms of base-10 logarithms of squared voltages, as follows:
$$
SNR = 10 \log_{10} \frac{V_{signal}^{2}}{V_{noise}^{2}} = 20 \log_{10} \frac{V_{signal}}{V_{noise}} \tag{6.2}
$$
(a) The power in a signal is proportional to the
square of the voltage. For example, if the
signal voltage Vsignal is 10 times the noise voltage,
then the SNR is 20 log10(10) = 20 dB.
(b) In terms of power, if the power from ten
violins is ten times that from one violin
playing, then the ratio of power is 10 dB, or
1 B.
(c) To remember: use a factor of 10 for power ratios
and a factor of 20 for signal voltage ratios.
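As a quick check of the two examples, the decibel formulas can be evaluated directly. The function names below are hypothetical helpers introduced for this sketch.

```python
import math

def snr_db_from_voltage(v_signal, v_noise):
    """SNR in dB from a voltage ratio: factor of 20."""
    return 20.0 * math.log10(v_signal / v_noise)

def ratio_db_from_power(p_signal, p_reference):
    """Level difference in dB from a power ratio: factor of 10."""
    return 10.0 * math.log10(p_signal / p_reference)

# Signal voltage 10 times the noise voltage: 20 dB.
print(snr_db_from_voltage(10.0, 1.0))

# Ten violins delivering ten times the power of one violin: 10 dB.
print(ratio_db_from_power(10.0, 1.0))
```

Both forms agree when the power is taken as the square of the voltage, since 10 log10(V^2) = 20 log10(V).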
• The usual levels of sound we hear around us are described in terms of decibels, as a
ratio to the quietest sound we are capable of hearing. Table 6.1 shows approximate
levels for these sounds.
Table 6.1: Magnitude levels of common sounds, in decibels

Sound                      Level (dB)
Threshold of hearing       0
Rustle of leaves           10
Very quiet room            20
Average room               40
Conversation               60
Busy street                70
Loud radio                 80
Train through station      90
Riveter                    100
Threshold of discomfort    120
Threshold of pain          140
Damage to ear drum         160
Signal to Quantization Noise Ratio (SQNR)
• Aside from any noise that may have been present
in the original analog signal, there is also an
additional error that results from quantization.
(a) If voltages actually lie in the range 0 to 1 but we have only 8
bits in which to store values, then effectively we force
all continuous values of voltage into only 256 different
values.
(b) This introduces a roundoff error. It is not really
“noise”. Nevertheless it is called quantization noise
(or quantization error).
• The quality of the quantization is characterized
by the Signal to Quantization Noise Ratio
(SQNR).
(a) Quantization noise: the difference between the
actual value of the analog signal, for the
particular sampling time, and the nearest
quantization interval value.
(b) At most, this error can be as much as half of the
interval.
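The half-interval bound and the resulting SQNR can be measured numerically. The sketch below is an illustration under stated assumptions: a uniform quantizer on [-1, 1] that reconstructs each value at the midpoint of its interval, and a full-scale sine test tone of arbitrary frequency. The function names are hypothetical. For this kind of input, the measured value should land near the commonly quoted rule of thumb of roughly 6 dB per bit.

```python
import math

def quantize(x, n_bits):
    """Uniform quantizer for x in [-1, 1]; returns the reconstructed value."""
    levels = 2 ** n_bits
    step = 2.0 / levels                        # width of one quantization interval
    index = min(int((x + 1.0) / step), levels - 1)
    return -1.0 + (index + 0.5) * step         # midpoint of the chosen interval

def measured_sqnr_db(n_bits, n_samples=10000):
    """Ratio of signal power to quantization-error power, in dB."""
    signal_power = 0.0
    noise_power = 0.0
    for n in range(n_samples):
        x = math.sin(2 * math.pi * 0.0123 * n)  # full-scale test tone
        error = x - quantize(x, n_bits)         # quantization error for this sample
        signal_power += x * x
        noise_power += error * error
    return 10.0 * math.log10(signal_power / noise_power)

for bits in (8, 12, 16):
    print(bits, "bits:", round(measured_sqnr_db(bits), 1), "dB")
```

Each extra bit halves the interval width, which is why the error power falls and the SQNR rises as the sample size grows.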
Audio Filtering
• Prior to sampling and AD conversion, the audio signal is also usually filtered
to remove unwanted frequencies. The frequencies kept depend on the
application:
(a) For speech, typically from 50Hz to 10kHz is retained, and other frequencies
are blocked by the use of a band-pass filter that screens out lower and higher
frequencies.
(b) An audio music signal will typically contain from about 20Hz up to 20kHz.
(c) At the DA converter end, high frequencies may reappear in the output:
because of sampling and then quantization, the smooth input signal is replaced by
a series of step functions containing all possible frequencies.
(d) So at the decoder side, a lowpass filter is used after the DA circuit.
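The effect of the low-pass filter on a staircase output can be illustrated with a toy example. This is only a sketch: a moving average stands in for the real analog low-pass filter, and the function names, the hold length of 8, and the test tone are assumptions made for the illustration.

```python
import math

def zero_order_hold(samples, hold):
    """Repeat each sample `hold` times, giving a staircase like a raw DA output."""
    return [s for s in samples for _ in range(hold)]

def moving_average(signal, width):
    """Crude low-pass filter: average over the last `width` points."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - width + 1):i + 1]
        out.append(sum(window) / len(window))
    return out

def largest_jump(signal):
    """Largest change between neighbouring points (a rough measure of 'steppiness')."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))

samples = [math.sin(2 * math.pi * n / 16) for n in range(32)]
staircase = zero_order_hold(samples, hold=8)
smoothed = moving_average(staircase, width=8)

print("largest jump before filtering:", largest_jump(staircase))
print("largest jump after filtering: ", largest_jump(smoothed))
```

The abrupt steps, which carry the unwanted high frequencies, are spread over several points after filtering, so the output is much closer to the smooth original.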
• Every compression scheme has three stages:
A. The input data is transformed to a new
representation that is easier or more efficient to
compress.
B. We may introduce loss of information. Quantization is
the main lossy step ⇒ we use a limited number of
reconstruction levels, fewer than in the original signal.
C. Coding. Assign a codeword (thus forming a binary
bitstream) to each output level or symbol. This could
be a fixed-length code, or a variable length code such
as Huffman coding (Chap. 7).
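The three stages can be sketched as a toy pipeline. Everything specific here is an assumption chosen for illustration, not a scheme taken from the text: the transform is a simple difference between successive samples, the quantizer is uniform with a fixed step, and the code is fixed-length.

```python
def transform(samples):
    """Stage A: replace each sample by its difference from the previous one."""
    previous = 0
    diffs = []
    for s in samples:
        diffs.append(s - previous)
        previous = s
    return diffs

def quantize(values, step):
    """Stage B (lossy): keep only the index of the nearest multiple of `step`."""
    return [round(v / step) for v in values]

def code(indices, n_bits):
    """Stage C: fixed-length binary codewords.

    Assumes every index fits in the range -2**(n_bits-1) .. 2**(n_bits-1) - 1.
    """
    offset = 2 ** (n_bits - 1)                 # shift indices to be non-negative
    return "".join(format(i + offset, f"0{n_bits}b") for i in indices)

samples = [10, 12, 15, 15, 11]                 # hypothetical input samples
diffs = transform(samples)                     # [10, 2, 3, 0, -4]
indices = quantize(diffs, step=2)              # fewer reconstruction levels
bitstream = code(indices, n_bits=4)            # 4 bits per symbol
print(diffs, indices, bitstream)
```

Only stage B discards information; stages A and C are reversible, which matches the remark that quantization is the main lossy step.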
Pulse Code Modulation
• The basic techniques for creating digital signals
from analog signals are sampling and
quantization.
• Quantization consists of selecting breakpoints
in magnitude, and then re-mapping any value
within an interval to one of the representative
output levels (see Fig. 6.2).
PCM in Speech Compression
• Assuming a bandwidth for speech from about 50 Hz to about 10 kHz,
the Nyquist rate would dictate a sampling rate of 20 kHz.
(a) Using uniform quantization without companding, the minimum
sample size we could get away with would likely be about 12 bits.
Hence for mono speech transmission the bit-rate would be 240 kbps.
[Compander = COM(PRESSOR) + (EX)PANDER]
(b) With companding, we can reduce the sample size down to about 8
bits with the same perceived level of quality, and thus reduce the
bit-rate to 160 kbps.
(c) However, the standard approach to telephony in fact assumes that the
highest-frequency audio signal we want to reproduce is only about 4
kHz. Therefore the sampling rate is only 8 kHz, and with companding
the bit-rate reduces to 64 kbps.
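The three bit-rates quoted above follow from one multiplication: sampling rate (twice the highest frequency) times bits per sample. A small sketch, with a hypothetical helper name:

```python
def pcm_bit_rate_bps(max_frequency_hz, bits_per_sample, channels=1):
    """Bit-rate of uncompressed PCM sampled at the Nyquist rate."""
    sampling_rate_hz = 2 * max_frequency_hz    # Nyquist rate
    return sampling_rate_hz * bits_per_sample * channels

print(pcm_bit_rate_bps(10_000, 12))  # 240000 bps: 10 kHz speech, 12 bits, no companding
print(pcm_bit_rate_bps(10_000, 8))   # 160000 bps: 10 kHz speech, 8 bits with companding
print(pcm_bit_rate_bps(4_000, 8))    # 64000 bps: telephony, 4 kHz, 8 bits with companding
```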
• However, there are two small wrinkles we must
also address:
1. Since only sounds up to 4 kHz are to be considered, all
other frequency content must be noise. Therefore, we
should remove this high-frequency content from the
analog input signal. This is done using a band-limiting
filter that blocks out high, as well as very low,
frequencies.
2. A discontinuous signal contains a theoretically infinite
set of higher-frequency components. Therefore the output
of the digital-to-analog converter goes to a low-pass
filter that allows only frequencies up to the original
maximum to be retained.
• The complete scheme for encoding and decoding telephony
signals is shown as a schematic in Fig. 6.14. As a result of
the low-pass filtering, the output becomes smoothed.
Fig. 6.14: PCM signal encoding and decoding.