Introduction to Audio
This beginner-level tutorial covers the basics of audio production. It is
suitable for anyone wanting to learn more about working with sound,
in either amateur or professional situations. The tutorial is five pages
and takes about 20 minutes to complete.
What is "Audio"?
Audio means "of sound" or "of the reproduction of sound". Specifically, it refers to the range of
frequencies detectable by the human ear — approximately 20Hz to 20kHz. It's not a bad idea to
memorise those numbers — 20Hz is the lowest-pitched (bassiest) sound we can hear, 20kHz is
the highest pitch we can hear.
Audio work involves the production, recording, manipulation and reproduction of sound waves.
To understand audio you must have a grasp of two things:
1. Sound Waves: What they are, how they are produced and how we hear them.
2. Sound Equipment: What the different components are, what they do, how to choose the
correct equipment and use it properly.
Fortunately it's not particularly difficult. Audio theory is simpler than video theory and once you
understand the basic path from the sound source through the sound equipment to the ear, it all
starts to make sense.
Technical note: In physics, sound is a form of energy known as acoustical energy.
The Field of Audio Work
The field of audio is vast, with many areas of specialty. Hobbyists use audio for all sorts of
things, and audio professionals can be found in a huge range of vocations. Some common areas
of audio work include:
- Studio Sound Engineer
- Live Sound Engineer
- Musician
- Music Producer
- DJ
- Radio Technician
- Film/Television Sound Recordist
- Field Sound Engineer
- Audio Editor
- Post-Production Audio Creator
In addition, many other professions require a level of audio proficiency. For example, video
camera operators should know enough about audio to be able to record good quality sound with
their pictures.
Speaking of video-making, it's important to recognise the importance of audio in film and video.
A common mistake amongst amateurs is to concentrate only on the vision and assume that as
long as the microphone is working the audio will be fine. However, satisfactory audio requires
skill and effort. Sound is critical to the flow of the programme — indeed in many situations high
quality sound is more important than high quality video.
Most jobs in audio production require some sort of specialist skill set, whether it be micing up a
drum kit or creating synthetic sound effects. Before you get too carried away with learning
specific tasks, you should make sure you have a general grounding in the principles of sound.
Once you have done this homework you will be well placed to begin specialising.
The first thing to tackle is basic sound wave theory...
Acoustics
Acoustics is the interdisciplinary science that deals with the study of all mechanical waves in
gases, liquids, and solids including vibration, sound, ultrasound and infrasound. A scientist who
works in the field of acoustics is an acoustician while someone working in the field of acoustics
technology may be called an acoustical engineer. The application of acoustics can be seen in
almost all aspects of modern society with the most obvious being the audio and noise control
industries.
Hearing is one of the most crucial means of survival in the animal world, and speech is one of
the most distinctive characteristics of human development and culture. So it is no surprise that
the science of acoustics spreads across so many facets of our society—music, medicine,
architecture, industrial production, warfare and more. Art, craft, science and technology have
provoked one another to advance the whole, as in many other fields of knowledge. Lindsay's
'Wheel of Acoustics' is a widely accepted overview of the various fields in acoustics.[1]
The word "acoustic" is derived from the Greek word ἀκουστικός (akoustikos), meaning "of or for
hearing, ready to hear"[2] and that from ἀκουστός (akoustos), "heard, audible",[3] which in turn
derives from the verb ἀκούω (akouo), "I hear".[4]
The Latin synonym is "sonic", after which the term sonics used to be a synonym for acoustics[5]
and later a branch of acoustics.[6] After acousticians had extended their studies to frequencies
above and below the audible range, it became conventional to identify these frequency ranges as
"ultrasonic" and "infrasonic" respectively, while letting the word "acoustic" refer to the entire
frequency range without limit
. Nature of Sound Waves
Sound is a longitudinal wave, in which the particles oscillate to and fro along the direction of
wave propagation. Sound waves cannot be transmitted through a vacuum; transmission requires a
medium, which can be a solid, liquid, or gas.
Figure 1: Propagation of a sound wave, showing alternating regions of condensation and
rarefaction. The wavelength, λ, is the distance between two successive rarefactions or
condensations.
The displacement of any point on the wave, y, along the direction of propagation is related to
position x and time t by:

y = A sin(ωt − kx)  (1)

where A is the amplitude of the wave, ω is the angular frequency and k is the wavenumber, given by:

ω = 2πf = 2π/T  (2)

k = 2π/λ  (3)

and the velocity of the wave is:

v = fλ  (4)
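As an illustration, the Python sketch below evaluates equation (1) at a fixed point in space. The
amplitude, frequency and observation distance are arbitrary illustrative choices, not values from
the text; the air velocity is taken from Table 1 below.

```python
import numpy as np

A = 1e-6       # displacement amplitude in metres (illustrative value)
f = 440.0      # frequency in Hz (illustrative value)
v = 344.0      # speed of sound in air in m/s, from Table 1 below

wavelength = v / f           # equation (4) rearranged: lambda = v / f
omega = 2 * np.pi * f        # angular frequency, equation (2)
k = 2 * np.pi / wavelength   # wavenumber, equation (3)

x = 0.5                            # observation point 0.5 m from the source
t = np.linspace(0, 2 / f, 9)       # two periods of the wave, coarsely sampled
y = A * np.sin(omega * t - k * x)  # displacement, equation (1)
print(f"wavelength = {wavelength:.3f} m")
print(y)
```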
Table 1 shows the velocity of sound in some common media.

Material         Velocity of Sound (m/s)
Air              344
Water            1,372
Concrete         3,048
Glass            3,658
Iron             5,182
Lead             1,219
Steel            5,182
Wood (hard)      4,267
Wood (soft)      3,353

Table 1: Approximate Velocities of Sound in Some Common Media
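To get a feel for these figures, the short Python sketch below computes how long a sound takes to
cross each medium; the 100 m distance is an arbitrary example, not from the text.

```python
# Velocities in m/s, taken from Table 1
velocities = {
    "Air": 344, "Water": 1372, "Concrete": 3048, "Glass": 3658,
    "Iron": 5182, "Lead": 1219, "Steel": 5182,
    "Wood (hard)": 4267, "Wood (soft)": 3353,
}

distance = 100.0  # metres (arbitrary example distance)
for material, v in velocities.items():
    print(f"{material:12s} {distance / v * 1000:6.1f} ms")
# Air takes about 291 ms; steel carries the same sound in about 19 ms.
```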
Fundamental characteristics of sound
Sound is a mechanical wave: an oscillation of pressure transmitted through a solid, liquid, or
gas, composed of frequencies within the range of hearing and of a level sufficiently strong to be
heard; equivalently, it is the sensation stimulated in organs of hearing by such vibrations.
Propagation of sound
Sound is a sequence of waves of pressure that propagates through compressible media such as air
or water. (Sound can propagate through solids as well, but there are additional modes of
propagation). During propagation, waves can be reflected, refracted, or attenuated by the
medium.[2]
The behavior of sound propagation is generally affected by three things:
- A relationship between density and pressure. This relationship, affected by temperature,
  determines the speed of sound within the medium.
- The motion of the medium itself, for example sound moving through wind. Independent of the
  motion of sound through the medium, if the medium is moving, the sound is further transported.
- The viscosity of the medium, which determines the rate at which sound is attenuated. For many
  media, such as air or water, attenuation due to viscosity is negligible.
When sound is moving through a medium that does not have constant physical properties, it may
be refracted (either dispersed or focused).[2]
Perception of sound
Human ear
The perception of sound in any organism is limited to a certain range of frequencies. For
humans, hearing is normally limited to frequencies between about 20 Hz and 20,000 Hz (20
kHz)[3], although these limits are not definite. The upper limit generally decreases with age.
Other species have a different range of hearing. For example, dogs can perceive vibrations higher
than 20 kHz, but are deaf to anything below 40 Hz. As a signal perceived by one of the major
senses, sound is used by many species for detecting danger, navigation, predation, and
communication. Earth's atmosphere, water, and virtually any physical phenomenon, such as fire,
rain, wind, surf, or earthquake, produces (and is characterized by) its unique sounds. Many
species, such as frogs, birds, marine and terrestrial mammals, have also developed special organs
to produce sound. In some species, these produce song and speech. Furthermore, humans have
developed culture and technology (such as music, telephone and radio) that allows them to
generate, record, transmit, and broadcast sound. The scientific study of human sound perception
is known as psychoacoustics.
Physics of sound
The mechanical vibrations that can be interpreted as sound are able to travel through all forms of
matter: gases, liquids, solids, and plasmas. The matter that supports the sound is called the
medium. Sound cannot travel through a vacuum.
Longitudinal and transverse waves
Sinusoidal waves of various frequencies; the bottom waves have higher frequencies than those
above. The horizontal axis represents time.
Sound is transmitted through gases, plasma, and liquids as longitudinal waves, also called
compression waves. Through solids, however, it can be transmitted as both longitudinal waves
and transverse waves. Longitudinal sound waves are waves of alternating pressure deviations
from the equilibrium pressure, causing local regions of compression and rarefaction, while
transverse waves (in solids) are waves of alternating shear stress at right angle to the direction of
propagation.
Matter in the medium is periodically displaced by a sound wave, and thus oscillates. The energy
carried by the sound wave converts back and forth between the potential energy of the extra
compression (in case of longitudinal waves) or lateral displacement strain (in case of transverse
waves) of the matter and the kinetic energy of the oscillations of the medium.
Sound wave properties and characteristics
Sound waves are often simplified to a description in terms of sinusoidal plane waves, which are
characterized by these generic properties:
- Frequency, or its inverse, the period
- Wavelength
- Wavenumber
- Amplitude
- Sound pressure
- Sound intensity
- Speed of sound
- Direction
Sometimes speed and direction are combined as a velocity vector; wavenumber and direction are
combined as a wave vector.
Transverse waves, also known as shear waves, have an additional property, polarization, which
longitudinal sound waves do not have.
Speed of sound
U.S. Navy F/A-18 breaking the sound barrier. The white halo is formed by condensed water
droplets thought to result from a drop in air pressure around the aircraft (see Prandtl-Glauert
Singularity).[4][5]
Main article: Speed of sound
The speed of sound depends on the medium the waves pass through, and is a fundamental
property of the material. In general, the speed of sound is proportional to the square root of the
ratio of the elastic modulus (stiffness) of the medium to its density. Those physical properties
and the speed of sound change with ambient conditions. For example, the speed of sound in
gases depends on temperature. In 20 °C (68 °F) air at sea level, the speed of sound is
approximately 343 m/s (1,230 km/h; 767 mph), using the formula v = (331 + 0.6T) m/s with T in degrees Celsius. In
fresh water, also at 20 °C, the speed of sound is approximately 1,482 m/s (5,335 km/h;
3,315 mph). In steel, the speed of sound is about 5,960 m/s (21,460 km/h; 13,330 mph).[6] The
speed of sound is also slightly sensitive (a second-order anharmonic effect) to the sound
amplitude, which means that there are nonlinear propagation effects, such as the production of
harmonics and mixed tones not present in the original sound (see parametric array).
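The temperature dependence quoted above is easy to compute directly; the sketch below simply
evaluates the v = (331 + 0.6T) approximation for a few temperatures.

```python
def speed_of_sound_air(t_celsius: float) -> float:
    """Approximate speed of sound in dry air: v = (331 + 0.6*T) m/s, T in deg C."""
    return 331.0 + 0.6 * t_celsius

for t in (0, 20, 35):
    print(f"{t:3d} degC -> {speed_of_sound_air(t):5.1f} m/s")
# 20 degC gives 343.0 m/s, matching the figure quoted above.
```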
Noise
Main article: Noise
Noise is a term often used to refer to an unwanted sound. In science and engineering, noise is an
undesirable component that obscures a wanted signal.
Sound pressure level
Main article: Sound pressure
Sound pressure is the difference, in a given medium, between the
average local pressure and the pressure in the sound wave. The square
of this difference (i.e., the square of the deviation from the
equilibrium pressure) is usually averaged over time and/or space,
and the square root of this average gives a root mean square
(RMS) value. For example, 1 Pa RMS sound pressure (94 dB SPL)
in atmospheric air implies that the actual pressure in the sound wave
oscillates between (1 atm − √2 Pa) and (1 atm + √2 Pa), that is,
between 101,323.6 and 101,326.4 Pa. Such a tiny (relative to atmospheric)
variation in air pressure at an audio frequency is perceived as a
deafening sound, and can cause hearing damage.
As the human ear can detect sounds with a wide range of
amplitudes, sound pressure is often measured as a level on a
logarithmic decibel scale. The sound pressure level (SPL) or Lp is
defined as
Lp = 20 log10(p / pref) dB
where p is the root-mean-square sound pressure and pref is a
reference sound pressure. Commonly used reference sound
pressures, defined in the standard ANSI S1.1-1994, are 20
µPa in air and 1 µPa in water. Without a specified reference
sound pressure, a value expressed in decibels cannot
represent a sound pressure level.
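As a worked example, the following Python sketch applies the definition above using the 20 µPa
reference for air, reproducing the 94 dB SPL figure for 1 Pa RMS mentioned earlier.

```python
import math

P_REF_AIR = 20e-6  # reference RMS sound pressure in air, 20 uPa (ANSI S1.1-1994)

def spl_db(p_rms: float, p_ref: float = P_REF_AIR) -> float:
    """Sound pressure level: Lp = 20 * log10(p / p_ref), in dB."""
    return 20.0 * math.log10(p_rms / p_ref)

print(f"{spl_db(1.0):.1f} dB SPL")    # 1 Pa RMS  -> ~94.0 dB SPL
print(f"{spl_db(20e-6):.1f} dB SPL")  # reference -> 0.0 dB SPL
```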
Since the human ear does not have a flat spectral response,
sound pressures are often frequency weighted so that the
measured level matches perceived levels more closely. The
International Electrotechnical Commission (IEC) has
defined several weighting schemes. A-weighting attempts to
match the response of the human ear to noise; A-weighted sound pressure levels are labeled dBA. C-weighting is used to measure peak levels.
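For reference, the A-weighting curve has a standard closed-form approximation (IEC 61672); the
sketch below evaluates it at a few frequencies, showing how strongly low frequencies are
attenuated relative to 1 kHz.

```python
import math

def a_weighting_db(f: float) -> float:
    """A-weighting in dB relative to 1 kHz (standard IEC 61672 approximation)."""
    ra = (12194.0**2 * f**4) / (
        (f**2 + 20.6**2)
        * math.sqrt((f**2 + 107.7**2) * (f**2 + 737.9**2))
        * (f**2 + 12194.0**2)
    )
    return 20.0 * math.log10(ra) + 2.0

for f in (100, 1000, 10000):
    print(f"{f:5d} Hz -> {a_weighting_db(f):+6.1f} dB")
# ~ -19.1 dB at 100 Hz, 0.0 dB at 1 kHz, ~ -2.5 dB at 10 kHz
```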
Equipment for dealing with sound
Equipment for generating or using sound includes musical instruments, hearing aids,
sonar systems and sound reproduction and broadcasting equipment. Many of these use
electro-acoustic transducers such as microphones and loudspeakers.
Sound measurement
- Decibel, Sone, Mel, Phon, Hertz
- Sound pressure level, Sound pressure
- Particle velocity, Acoustic velocity
- Particle displacement, Particle amplitude, Particle acceleration
- Sound power, Acoustic power, Sound power level
- Sound energy flux
- Sound intensity, Acoustic intensity, Sound intensity level
- Acoustic impedance, Sound impedance, Characteristic impedance
- Speed of sound, Amplitude
Microphone
A microphone (colloquially called a mic or mike; both pronounced /ˈmaɪk/[1]) is an acoustic-to-electric transducer or sensor that converts sound into an electrical signal. In 1877, Emile Berliner
invented the first microphone used as a telephone voice transmitter.[2] Microphones are used in
many applications such as telephones, tape recorders, karaoke systems, hearing aids, motion
picture production, live and recorded audio engineering, FRS radios, megaphones, in radio and
television broadcasting and in computers for recording voice, speech recognition, VoIP, and for
non-acoustic purposes such as ultrasonic checking or knock sensors.
Most microphones today use electromagnetic induction (dynamic microphone), capacitance
change (condenser microphone), piezoelectric generation, or light modulation to produce an
electrical voltage signal from mechanical vibration.
Components
The sensitive transducer element of a microphone is called its element or capsule. A complete
microphone also includes a housing, some means of bringing the signal from the element to other
equipment, and often an electronic circuit to adapt the output of the capsule to the equipment
being driven. A wireless microphone contains a radio transmitter.
Varieties
Microphones are referred to by their transducer principle, such as condenser, dynamic, etc., and
by their directional characteristics. Sometimes other characteristics such as diaphragm size,
intended use or orientation of the principal sound input to the principal axis (end- or side-address) of the microphone are used to describe the microphone.
Condenser microphone
Inside the Oktava 319 condenser microphone
The condenser microphone, invented at Bell Labs in 1916 by E. C. Wente,[3] is also called a
capacitor microphone or electrostatic microphone—capacitors were historically called
condensers. Here, the diaphragm acts as one plate of a capacitor, and the vibrations produce
changes in the distance between the plates. There are two types, depending on the method of
extracting the audio signal from the transducer: DC-biased and radio frequency (RF) or high
frequency (HF) condenser microphones. With a DC-biased microphone, the plates are biased
with a fixed charge (Q). The voltage maintained across the capacitor plates changes with the
vibrations in the air, according to the capacitance equation (C = Q⁄V), where Q = charge in
coulombs, C = capacitance in farads and V = potential difference in volts. The capacitance of the
plates is inversely proportional to the distance between them for a parallel-plate capacitor. (See
capacitance for details.) The assembly of fixed and movable plates is called an "element" or
"capsule".
A nearly constant charge is maintained on the capacitor. As the capacitance changes, the charge
across the capacitor does change very slightly, but at audible frequencies it is sensibly constant.
The capacitance of the capsule (around 5 to 100 pF) and the value of the bias resistor (100 MΩ to
tens of GΩ) form a filter that is high-pass for the audio signal, and low-pass for the bias voltage.
Note that the time constant of an RC circuit equals the product of the resistance and capacitance.
Within the time-frame of the capacitance change (as much as 50 ms at 20 Hz audio signal), the
charge is practically constant and the voltage across the capacitor changes instantaneously to
reflect the change in capacitance. The voltage across the capacitor varies above and below the
bias voltage. The voltage difference between the bias and the capacitor is seen across the series
resistor. The voltage across the resistor is amplified for performance or recording. In most cases,
the electronics in the microphone itself contribute no voltage gain as the voltage differential is
quite significant, up to several volts for high sound levels. Since this is a very high impedance
circuit, only current gain is usually needed, with the voltage remaining constant.
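The capsule capacitance and bias resistor values quoted above imply a high-pass corner frequency
well below the audio band. A minimal sketch, using illustrative values from the ranges given and
the standard RC corner formula f = 1/(2πRC):

```python
import math

def rc_corner_hz(r_ohms: float, c_farads: float) -> float:
    """-3 dB corner frequency of an RC filter: f = 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)

capsule_c = 50e-12   # 50 pF, within the 5-100 pF range quoted above
bias_r = 1e9         # 1 gigaohm bias resistor (illustrative value)
print(f"{rc_corner_hz(bias_r, capsule_c):.1f} Hz")
# ~3.2 Hz: the audio signal passes, while the DC bias is held effectively constant.
```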
AKG C451B small-diaphragm condenser microphone
RF condenser microphones use a comparatively low RF voltage, generated by a low-noise
oscillator. The signal from the oscillator may either be amplitude modulated by the capacitance
changes produced by the sound waves moving the capsule diaphragm, or the capsule may be part
of a resonant circuit that modulates the frequency of the oscillator signal. Demodulation yields a
low-noise audio frequency signal with a very low source impedance. The absence of a high bias
voltage permits the use of a diaphragm with looser tension, which may be used to achieve wider
frequency response due to higher compliance. The RF biasing process results in a lower
electrical impedance capsule, a useful by-product of which is that RF condenser microphones
can be operated in damp weather conditions that could create problems in DC-biased
microphones with contaminated insulating surfaces. The Sennheiser "MKH" series of
microphones use the RF biasing technique.
Condenser microphones span the range from telephone transmitters through inexpensive karaoke
microphones to high-fidelity recording microphones. They generally produce a high-quality
audio signal and are now the popular choice in laboratory and recording studio applications. The
inherent suitability of this technology is due to the very small mass that must be moved by the
incident sound wave, unlike other microphone types that require the sound wave to do more
work. They require a power source, provided either via microphone inputs on equipment as
phantom power or from a small battery. Power is necessary for establishing the capacitor plate
voltage, and is also needed to power the microphone electronics (impedance conversion in the
case of electret and DC-polarized microphones, demodulation or detection in the case of RF/HF
microphones). Condenser microphones are also available with two diaphragms that can be
electrically connected to provide a range of polar patterns (see below), such as cardioid,
omnidirectional, and figure-eight. It is also possible to vary the pattern continuously with some
microphones, for example the Røde NT2000 or CAD M179.
Electret condenser microphone
Main article: Electret microphone
First patent on foil electret microphone by G. M. Sessler et al. (pages 1 to 3)
An electret microphone is a type of capacitor microphone invented at Bell laboratories in 1962
by Gerhard Sessler and Jim West.[4] The externally applied charge described above under
condenser microphones is replaced by a permanent charge in an electret material. An electret is a
ferroelectric material that has been permanently electrically charged or polarized. The name
comes from electrostatic and magnet; a static charge is embedded in an electret by alignment of
the static charges in the material, much the way a magnet is made by aligning the magnetic
domains in a piece of iron.
Due to their good performance and ease of manufacture, hence low cost, the vast majority of
microphones made today are electret microphones; a semiconductor manufacturer[5] estimates
annual production at over one billion units. Nearly all cell-phone, computer, PDA and headset
microphones are electret types. They are used in many applications, from high-quality recording
and lavalier use to built-in microphones in small sound recording devices and telephones.
Though electret microphones were once considered low quality, the best ones can now rival
traditional condenser microphones in every respect and can even offer the long-term stability and
ultra-flat response needed for a measurement microphone. Unlike other capacitor microphones,
they require no polarizing voltage, but often contain an integrated preamplifier that does require
power (often incorrectly called polarizing power or bias). This preamplifier is frequently
phantom powered in sound reinforcement and studio applications. Monophonic microphones
designed for personal computer (PC) use, sometimes called multimedia microphones, use a
3.5 mm plug as usually used, without power, for stereo; the ring, instead of carrying the signal
for a second channel, carries power via a resistor from (normally) a 5 V supply in the computer.
Stereophonic microphones use the same connector; there is no obvious way to determine which
standard is used by equipment and microphones.
Only the best electret microphones rival good DC-polarized units in terms of noise level and
quality; electret microphones lend themselves to inexpensive mass-production, while inherently
expensive non-electret condenser microphones are made to higher quality.
Dynamic microphone
Patti Smith singing into a Shure SM58 (dynamic cardioid type) microphone
Dynamic microphones work via electromagnetic induction. They are robust, relatively
inexpensive and resistant to moisture. This, coupled with their potentially high gain before
feedback, makes them ideal for on-stage use.
Moving-coil microphones use the same dynamic principle as in a loudspeaker, only reversed. A
small movable induction coil, positioned in the magnetic field of a permanent magnet, is attached
to the diaphragm. When sound enters through the windscreen of the microphone, the sound wave
moves the diaphragm. When the diaphragm vibrates, the coil moves in the magnetic field,
producing a varying current in the coil through electromagnetic induction. A single dynamic
membrane does not respond linearly to all audio frequencies. Some microphones for this reason
utilize multiple membranes for the different parts of the audio spectrum and then combine the
resulting signals. Combining the multiple signals correctly is difficult and designs that do this are
rare and tend to be expensive. There are on the other hand several designs that are more
specifically aimed towards isolated parts of the audio spectrum. The AKG D 112, for example, is
designed for bass response rather than treble.[6] In audio engineering several kinds of
microphones are often used at the same time to get the best result.
Ribbon microphone
Main article: Ribbon microphone
Edmund Lowe using a ribbon microphone
Ribbon microphones use a thin, usually corrugated metal ribbon suspended in a magnetic field.
The ribbon is electrically connected to the microphone's output, and its vibration within the
magnetic field generates the electrical signal. Ribbon microphones are similar to moving coil
microphones in the sense that both produce sound by means of magnetic induction. Basic ribbon
microphones detect sound in a bi-directional (also called figure-eight) pattern because the ribbon,
which is open to sound both front and back, responds to the pressure gradient rather than the
sound pressure. Though the symmetrical front and rear pickup can be a nuisance in normal stereo
recording, the high side rejection can be used to advantage by positioning a ribbon microphone
horizontally, for example above cymbals, so that the rear lobe picks up only sound from the
cymbals. Crossed figure 8, or Blumlein pair, stereo recording is gaining in popularity, and the
figure 8 response of a ribbon microphone is ideal for that application.
Other directional patterns are produced by enclosing one side of the ribbon in an acoustic trap or
baffle, allowing sound to reach only one side. The classic RCA Type 77-DX microphone has
several externally adjustable positions of the internal baffle, allowing the selection of several
response patterns ranging from "Figure-8" to "Unidirectional". Such older ribbon microphones,
some of which still provide high quality sound reproduction, were once valued for this reason,
but a good low-frequency response could only be obtained when the ribbon was suspended very
loosely, which made them relatively fragile. Modern ribbon materials, including new
nanomaterials[7] have now been introduced that eliminate those concerns, and even improve the
effective dynamic range of ribbon microphones at low frequencies. Protective wind screens can
reduce the danger of damaging a vintage ribbon, and also reduce plosive artifacts in the
recording. Properly designed wind screens produce negligible treble attenuation. In common
with other classes of dynamic microphone, ribbon microphones don't require phantom power; in
fact, this voltage can damage some older ribbon microphones. Some new modern ribbon
microphone designs incorporate a preamplifier and, therefore, do require phantom power, and
circuits of modern passive ribbon microphones, i.e., those without the aforementioned
preamplifier, are specifically designed to resist damage to the ribbon and transformer by
phantom power. Also there are new ribbon materials available that are immune to wind blasts
and phantom power.
Carbon microphone
Main article: Carbon microphone
A carbon microphone, also known as a carbon button microphone (or sometimes just a button
microphone), uses a capsule or button containing carbon granules pressed between two metal
plates, like the Berliner and Edison microphones. A voltage is applied across the metal plates,
causing a small current to flow through the carbon. One of the plates, the diaphragm, vibrates in
causing a small current to flow through the carbon. One of the plates, the diaphragm, vibrates in
sympathy with incident sound waves, applying a varying pressure to the carbon. The changing
pressure deforms the granules, causing the contact area between each pair of adjacent granules to
change, and this causes the electrical resistance of the mass of granules to change. The changes
in resistance cause a corresponding change in the current flowing through the microphone,
producing the electrical signal. Carbon microphones were once commonly used in telephones;
they have extremely low-quality sound reproduction and a very limited frequency response
range, but are very robust devices. The Boudet microphone, which used relatively large carbon
balls, was similar to the granule carbon button microphones.[8]
Unlike other microphone types, the carbon microphone can also be used as a type of amplifier,
using a small amount of sound energy to control a larger amount of electrical energy. Carbon
microphones found use as early telephone repeaters, making long distance phone calls possible
in the era before vacuum tubes. These repeaters worked by mechanically coupling a magnetic
telephone receiver to a carbon microphone: the faint signal from the receiver was transferred to
the microphone, with a resulting stronger electrical signal to send down the line. One illustration
of this amplifier effect was the oscillation caused by feedback, resulting in an audible squeal
from the old "candlestick" telephone if its earphone was placed near the carbon microphone.
Piezoelectric microphone
A crystal microphone or piezo microphone uses the phenomenon of piezoelectricity—the
ability of some materials to produce a voltage when subjected to pressure—to convert vibrations
into an electrical signal. An example of this is potassium sodium tartrate, which is a piezoelectric
crystal that works as a transducer, both as a microphone and as a slimline loudspeaker
component. Crystal microphones were once commonly supplied with vacuum tube (valve)
equipment, such as domestic tape recorders. Their high output impedance matched the high input
impedance (typically about 10 megohms) of the vacuum tube input stage well. They were
difficult to match to early transistor equipment, and were quickly supplanted by dynamic
microphones for a time, and later small electret condenser devices. The high impedance of the
crystal microphone made it very susceptible to handling noise, both from the microphone itself
and from the connecting cable.
Piezoelectric transducers are often used as contact microphones to amplify sound from acoustic
musical instruments, to sense drum hits, for triggering electronic samples, and to record sound in
challenging environments, such as underwater under high pressure. Saddle-mounted pickups on
acoustic guitars are generally piezoelectric devices that contact the strings passing over the
saddle. This type of microphone is different from magnetic coil pickups commonly visible on
typical electric guitars, which use magnetic induction, rather than mechanical coupling, to pick
up vibration.
Fiber optic microphone
The Optoacoustics 1140 fiber optic microphone
A fiber optic microphone converts acoustic waves into electrical signals by sensing changes in
light intensity, instead of sensing changes in capacitance or magnetic fields as with conventional
microphones.[9][10]
During operation, light from a laser source travels through an optical fiber to illuminate the
surface of a reflective diaphragm. Sound vibrations of the diaphragm modulate the intensity of
light reflecting off the diaphragm in a specific direction. The modulated light is then transmitted
over a second optical fiber to a photo detector, which transforms the intensity-modulated light
into analog or digital audio for transmission or recording. Fiber optic microphones possess high
dynamic and frequency range, similar to the best high fidelity conventional microphones.
Fiber optic microphones do not react to or influence any electrical, magnetic, electrostatic or
radioactive fields (this is called EMI/RFI immunity). The fiber optic microphone design is
therefore ideal for use in areas where conventional microphones are ineffective or dangerous,
such as inside industrial turbines or in magnetic resonance imaging (MRI) equipment
environments.
Fiber optic microphones are robust, resistant to environmental changes in heat and moisture, and
can be produced for any directionality or impedance matching. The distance between the
microphone's light source and its photo detector may be up to several kilometers without need
for any preamplifier and/or other electrical device, making fiber optic microphones suitable for
industrial and surveillance acoustic monitoring.
Fiber optic microphones are used in very specific application areas such as for infrasound
monitoring and noise-canceling. They have proven especially useful in medical applications,
such as allowing radiologists, staff and patients within the powerful and noisy magnetic field to
converse normally, inside the MRI suites as well as in remote control rooms.[11] Other uses
include industrial equipment monitoring and sensing, audio calibration and measurement, high-fidelity recording and law enforcement.
Laser microphone
Main article: Laser microphone
Laser microphones are often portrayed in movies as spy gadgets, because they can be used to
pick up sound at a distance from the microphone equipment. A laser beam is aimed at the surface
of a window or other plane surface that is affected by sound. The vibrations of this surface
change the angle at which the beam is reflected, and the motion of the laser spot from the
returning beam is detected and converted to an audio signal.
In a more robust and expensive implementation, the returned light is split and fed to an
interferometer, which detects movement of the surface by changes in the optical path length of
the reflected beam. The former implementation is a tabletop experiment; the latter requires an
extremely stable laser and precise optics.
A new type of laser microphone is a device that uses a laser beam and smoke or vapor to detect
sound vibrations in free air. On 25 August 2009, U.S. patent 7,580,533 issued for a Particulate
Flow Detection Microphone based on a laser-photocell pair with a moving stream of smoke or
vapor in the laser beam's path. Sound pressure waves cause disturbances in the smoke that in turn
cause variations in the amount of laser light reaching the photo detector. A prototype of the
device was demonstrated at the 127th Audio Engineering Society convention in New York City
from 9 through 12 October 2009.
Liquid microphone
Main article: Water microphone
Early microphones did not produce intelligible speech, until Alexander Graham Bell made
improvements including a variable resistance microphone/transmitter. Bell's liquid transmitter
consisted of a metal cup filled with water with a small amount of sulfuric acid added. A sound
wave caused the diaphragm to move, forcing a needle to move up and down in the water. The
electrical resistance between the wire and the cup was then inversely proportional to the size of
the water meniscus around the submerged needle. Elisha Gray filed a caveat for a version using a
brass rod instead of the needle. Other minor variations and improvements were made to the
liquid microphone by Majoranna, Chambers, Vanni, Sykes, and Elisha Gray, and one version
was patented by Reginald Fessenden in 1903. These were the first working microphones, but
they were not practical for commercial application. The famous first phone conversation between
Bell and Watson took place using a liquid microphone.
MEMS microphone
Main article: Microelectromechanical systems
The MEMS (MicroElectrical-Mechanical System) microphone is also called a microphone chip
or silicon microphone. The pressure-sensitive diaphragm is etched directly into a silicon chip by
MEMS techniques, and is usually accompanied with integrated preamplifier. Most MEMS
microphones are variants of the condenser microphone design. Often MEMS microphones have
built in analog-to-digital converter (ADC) circuits on the same CMOS chip making the chip a
digital microphone and so more readily integrated with modern digital products. Major
manufacturers producing MEMS silicon microphones are Wolfson Microelectronics (WM7xxx),
Analog Devices, Akustica (AKU200x), Infineon (SMM310 product), Knowles Electronics,
Memstech (MSMx), NXP Semiconductors, Sonion MEMS, AAC Acoustic Technologies,[12] and
Omron.[13]
Speakers as microphones
A loudspeaker, a transducer that turns an electrical signal into sound waves, is the functional
opposite of a microphone. Since a conventional speaker is constructed much like a dynamic
microphone (with a diaphragm, coil and magnet), speakers can actually work "in reverse" as
microphones. The result, though, is a microphone with poor quality, limited frequency response
(particularly at the high end), and poor sensitivity. In practical use, speakers are sometimes used
as microphones in applications where high quality and sensitivity are not needed such as
intercoms, walkie-talkies or Video game voice chat peripherals, or when conventional
microphones are in short supply.
However, there is at least one other practical application of this principle: Using a medium-size
woofer placed closely in front of a "kick" (bass drum) in a drum set to act as a microphone. The
use of relatively large speakers to transduce low frequency sound sources, especially in music
production, is becoming fairly common. A product example of this type of device is the Yamaha
Subkick, a 6.5-inch (170 mm) woofer shock-mounted into a 10-inch drum shell and used in front of
kick drums. Since a relatively massive membrane is unable to transduce high frequencies,
placing a speaker in front of a kick drum is often ideal for reducing cymbal and snare bleed into
the kick drum sound. Less commonly, microphones themselves can be used as speakers, almost
always as tweeters. Microphones, however, are not designed to handle the power that speaker
components are routinely required to cope with. One instance of such an application was the
STC microphone-derived 4001 super-tweeter, which was successfully used in a number of high
quality loudspeaker systems from the late 1960s to the mid-70s.
Capsule design and directivity
The inner elements of a microphone are the primary source of differences in directivity. A
pressure microphone uses a diaphragm between a fixed internal volume of air and the
environment, and responds uniformly to pressure from all directions, so it is said to be
omnidirectional. A pressure-gradient microphone uses a diaphragm that is at least partially open
on both sides. The pressure difference between the two sides produces its directional
characteristics. Other elements such as the external shape of the microphone and external devices
such as interference tubes can also alter a microphone's directional response. A pure pressure-gradient microphone is equally sensitive to sounds arriving from front or back, but insensitive to
sounds arriving from the side because sound arriving at the front and back at the same time
creates no gradient between the two. The characteristic directional pattern of a pure pressure-gradient microphone is like a figure-8. Other polar patterns are derived by creating a capsule that
combines these two effects in different ways. The cardioid, for instance, features a partially
closed backside, so its response is a combination of pressure and pressure-gradient
characteristics.[14]
Microphone polar patterns
(Microphone facing top of page in diagram, parallel to page):
- Omnidirectional
- Subcardioid
- Cardioid
- Supercardioid
- Bi-directional or Figure of 8
- Hypercardioid
- Shotgun
A microphone's directionality or polar pattern indicates how sensitive it is to sounds arriving at
different angles about its central axis. The polar patterns illustrated above represent the locus of
points that produce the same signal level output in the microphone if a given sound pressure
level (SPL) is generated from that point. How the physical body of the microphone is oriented
relative to the diagrams depends on the microphone design. For large-membrane microphones
such as in the Oktava (pictured above), the upward direction in the polar diagram is usually
perpendicular to the microphone body, commonly known as "side fire" or "side address". For
small diaphragm microphones such as the Shure (also pictured above), it usually extends from
the axis of the microphone commonly known as "end fire" or "top/end address".
Some microphone designs combine several principles in creating the desired polar pattern. This
ranges from shielding (meaning diffraction/dissipation/absorption) by the housing itself to
electronically combining dual membranes.
Omnidirectional
An omnidirectional (or nondirectional) microphone's response is generally considered to be a
perfect sphere in three dimensions. In the real world, this is not the case. As with directional
microphones, the polar pattern for an "omnidirectional" microphone is a function of frequency.
The body of the microphone is not infinitely small and, as a consequence, it tends to get in its
own way with respect to sounds arriving from the rear, causing a slight flattening of the polar
response. This flattening increases as the diameter of the microphone (assuming it's cylindrical)
reaches the wavelength of the frequency in question. Therefore, the smallest diameter
microphone gives the best omnidirectional characteristics at high frequencies.
The wavelength of sound at 10 kHz is little over an inch (3.4 cm) so the smallest measuring
microphones are often 1/4" (6 mm) in diameter, which practically eliminates directionality even
up to the highest frequencies. Omnidirectional microphones, unlike cardioids, do not employ
resonant cavities as delays, and so can be considered the "purest" microphones in terms of low
coloration; they add very little to the original sound. Being pressure-sensitive they can also have
a very flat low-frequency response down to 20 Hz or below. Pressure-sensitive microphones also
respond much less to wind noise and plosives than directional (velocity sensitive) microphones.
An example of a nondirectional microphone is the round black eight ball.[15]
Unidirectional
A unidirectional microphone is sensitive to sounds from only one direction. The diagram above
illustrates a number of these patterns. The microphone faces upwards in each diagram. The
sound intensity for a particular frequency is plotted for angles radially from 0 to 360°.
(Professional diagrams show these scales and include multiple plots at different frequencies. The
diagrams given here provide only an overview of typical pattern shapes, and their names.)
Cardioid
US664A University Sound Dynamic Supercardioid Microphone
The most common unidirectional microphone is a cardioid microphone, so named because the
sensitivity pattern is heart-shaped. A hyper-cardioid microphone is similar but with a tighter area
of front sensitivity and a smaller lobe of rear sensitivity. A super-cardioid microphone is similar
to a hyper-cardioid, except there is more front pickup and less rear pickup. These three patterns
are commonly used as vocal or speech microphones, since they are good at rejecting sounds from
other directions.
A cardioid microphone is effectively a superposition of an omnidirectional and a figure-8
microphone; for sound waves coming from the back, the negative signal from the figure-8
cancels the positive signal from the omnidirectional element, whereas for sound waves coming
from the front, the two add to each other. A hypercardioid microphone is similar, but with a
slightly larger figure-8 contribution. Since pressure gradient transducer microphones are
directional, putting them very close to the sound source (at distances of a few centimeters) results
in a bass boost. This is known as the proximity effect.[16]
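This superposition can be written as p(θ) = a + (1 − a)·cos θ, where a is the omnidirectional
fraction. The sketch below evaluates this simplified first-order model for a few patterns; it is
an idealisation, not any particular microphone's measured response.

```python
import numpy as np

def first_order_pattern(theta_deg, omni_fraction):
    """Sensitivity p(theta) = a + (1 - a) * cos(theta) for a first-order capsule.
    a = 1.0 -> omnidirectional, a = 0.5 -> cardioid, a = 0.0 -> figure-8.
    """
    theta = np.radians(theta_deg)
    return omni_fraction + (1.0 - omni_fraction) * np.cos(theta)

angles = [0, 90, 180]
print(first_order_pattern(angles, 0.5))   # cardioid: [1.0, 0.5, 0.0] -- full null at the rear
print(first_order_pattern(angles, 0.25))  # hypercardioid-like: a smaller, inverted rear lobe
print(first_order_pattern(angles, 0.0))   # figure-8: [1.0, 0.0, -1.0] -- inverted rear polarity
```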
Bi-directional
"Figure 8" or bi-directional microphones receive sound equally from both the front and back of
the element. Most ribbon microphones are of this pattern. In principle they do not respond to
sound pressure at all, only to the gradient between front and back; since sound arriving from the
side reaches front and back equally there is no gradient and therefore no sensitivity to sound
from that direction. While omnidirectional microphones are scalar transducers responding to
pressure from any direction, bi-directional microphones are vector transducers responding to the
gradient along an axis normal to the plane of the diaphragm. As a result, output polarity is
inverted for sounds arriving from the back side.
Shotgun
An Audio-Technica shotgun microphone
Shotgun microphones are the most highly directional. They have small lobes of sensitivity to
the left, right, and rear but are significantly less sensitive to the side and rear than other
directional microphones. This results from placing the element at the back end of a tube with
slots cut along the side; wave cancellation eliminates much of the off-axis sound. Due to the
narrowness of their sensitivity area, shotgun microphones are commonly used on television and
film sets, in stadiums, and for field recording of wildlife.
Boundary or "PZM"
Several approaches have been developed for effectively using a microphone in less-than-ideal
acoustic spaces, which often suffer from excessive reflections from one or more of the surfaces
(boundaries) that make up the space. If the microphone is placed in, or very close to, one of these
boundaries, the reflections from that surface are not sensed by the microphone. Initially this was
done by placing an ordinary microphone adjacent to the surface, sometimes in a block of
acoustically transparent foam. Sound engineers Ed Long and Ron Wickersham developed the
concept of placing the diaphragm parallel to and facing the boundary.[17] While the patent has
expired, "Pressure Zone Microphone" and "PZM" are still active trademarks of Crown
International, and the generic term "boundary microphone" is preferred. While a boundary
microphone was initially implemented using an omnidirectional element, it is also possible to
mount a directional microphone close enough to the surface to gain some of the benefits of this
technique while retaining the directional properties of the element. Crown's trademark on this
approach is "Phase Coherent Cardioid" or "PCC," but there are other makers who employ this
technique as well.
Application-specific designs
A lavalier microphone is made for hands-free operation. These small microphones are worn on
the body. Originally, they were held in place with a lanyard worn around the neck, but more
often they are fastened to clothing with a clip, pin, tape or magnet. The lavalier cord may be
hidden by clothes and either run to an RF transmitter in a pocket or clipped to a belt (for mobile
use), or run directly to the mixer (for stationary applications).
A wireless microphone transmits the audio as a radio or optical signal rather than via a cable. It
usually sends its signal using a small FM radio transmitter to a nearby receiver connected to the
sound system, but it can also use infrared waves if the transmitter and receiver are within sight of
each other.
A contact microphone picks up vibrations directly from a solid surface or object, as opposed to
sound vibrations carried through air. One use for this is to detect sounds of a very low level, such
as those from small objects or insects. The microphone commonly consists of a magnetic
(moving coil) transducer, contact plate and contact pin. The contact plate is placed directly on
the vibrating part of a musical instrument or other surface, and the contact pin transfers
vibrations to the coil. Contact microphones have been used to pick up the sound of a snail's
heartbeat and the footsteps of ants. A portable version of this microphone has recently been
developed. A throat microphone is a variant of the contact microphone that picks up speech
directly from a person's throat, which it is strapped to. This lets the device be used in areas with
ambient sounds that would otherwise make the speaker inaudible.
A parabolic microphone uses a parabolic reflector to collect and focus sound waves onto a
microphone receiver, in much the same way that a parabolic antenna (e.g. satellite dish) does
with radio waves. Typical uses of this microphone, which has unusually focused front sensitivity
and can pick up sounds from many meters away, include nature recording, outdoor sporting
events, eavesdropping, law enforcement, and even espionage. Parabolic microphones are not
typically used for standard recording applications, because they tend to have poor low-frequency
response as a side effect of their design.
A stereo microphone integrates two microphones in one unit to produce a stereophonic signal. A
stereo microphone is often used for broadcast applications or field recording where it would be
impractical to configure two separate condenser microphones in a classic X-Y configuration (see
microphone practice) for stereophonic recording. Some such microphones have an adjustable
angle of coverage between the two channels.
A noise-canceling microphone is a highly directional design intended for noisy environments.
One such use is in aircraft cockpits where they are normally installed as boom microphones on
headsets. Another use is in live event support on loud concert stages for vocalists involved with
live performances. Many noise-canceling microphones combine signals received from two
diaphragms that are in opposite electrical polarity or are processed electronically. In dual
diaphragm designs, the main diaphragm is mounted closest to the intended source and the second
is positioned farther away from the source so that it can pick up environmental sounds to be
subtracted from the main diaphragm's signal. After the two signals have been combined, sounds
other than the intended source are greatly reduced, substantially increasing intelligibility. Other
noise-canceling designs use one diaphragm that is affected by ports open to the sides and rear of
the microphone, with the sum being a 16 dB rejection of sounds that are farther away. One noise-canceling headset design using a single diaphragm has been used prominently by vocal artists
such as Garth Brooks and Janet Jackson.[18] A few noise-canceling microphones are throat
microphones.
Connectors
Electronic symbol for a microphone
The most common connectors used by microphones are:
- Male XLR connector on professional microphones
- ¼ inch (sometimes referred to as 6.3 mm) jack plug, also known as a 1/4 inch TRS connector, on
  less expensive consumer microphones. Many consumer microphones use an unbalanced 1/4 inch phone
  jack. Harmonica microphones commonly use a high impedance 1/4 inch TS connection to be run
  through guitar amplifiers.
- 3.5 mm (sometimes referred to as 1/8 inch mini) stereo (wired as mono) mini phone plug on very
  inexpensive and computer microphones
A microphone with a USB connector, made by Blue Microphones
Some microphones use other connectors, such as a 5-pin XLR, or mini XLR for connection to
portable equipment. Some lavalier (or 'lapel', from the days of attaching the microphone to the
news reporter's suit lapel) microphones use a proprietary connector for connection to a wireless
transmitter. Since 2005, professional-quality microphones with USB connections have begun to
appear, designed for direct recording into computer-based software.
Impedance-matching
Microphones have an electrical characteristic called impedance, measured in ohms (Ω), that
depends on the design. Typically, the rated impedance is stated.[19] Low impedance is considered
under 600 Ω. Medium impedance is considered between 600 Ω and 10 kΩ. High impedance is
above 10 kΩ. Owing to their built-in amplifier, condenser microphones typically have an output
impedance between 50 and 200 Ω.[20]
The output of a given microphone delivers the same power whether it is low or high impedance.
If a microphone is made in high and low impedance versions, the high impedance version has a
higher output voltage for a given sound pressure input, and is suitable for use with vacuum-tube
guitar amplifiers, for instance, which have a high input impedance and require a relatively high
signal input voltage to overcome the tubes' inherent noise. Most professional microphones are
low impedance, about 200 Ω or lower. Professional vacuum-tube sound equipment incorporates
a transformer that steps up the impedance of the microphone circuit to the high impedance and
voltage needed to drive the input tube; the impedance conversion inherently creates voltage gain
as well. External matching transformers are also available that can be used in-line between a low
impedance microphone and a high impedance input.
Low-impedance microphones are preferred over high impedance for two reasons: one is that
using a high-impedance microphone with a long cable results in high frequency signal loss due
to cable capacitance, which forms a low-pass filter with the microphone output impedance. The
other is that long high-impedance cables tend to pick up more hum (and possibly radio-frequency
interference (RFI) as well). Nothing is damaged if the impedance between microphone and other
equipment is mismatched; the worst that happens is a reduction in signal or change in frequency
response.
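The cable-capacitance effect is easy to quantify: the microphone's output impedance and the
cable's capacitance form a first-order low-pass filter. A sketch of the arithmetic follows; the
100 pF/m cable figure is a typical assumption, not a value from the text.

```python
import math

def cable_corner_khz(z_out_ohms: float, cable_m: float,
                     pf_per_m: float = 100.0) -> float:
    """-3 dB corner (kHz) of the low-pass formed by output impedance and cable capacitance."""
    c = cable_m * pf_per_m * 1e-12          # total cable capacitance in farads
    return 1.0 / (2.0 * math.pi * z_out_ohms * c) / 1000.0

for z_out in (200.0, 10000.0):              # low- vs high-impedance microphone
    for metres in (5, 50):
        print(f"Z = {z_out:7.0f} ohm, {metres:2d} m cable -> "
              f"corner ~ {cable_corner_khz(z_out, metres):8.1f} kHz")
# A 200-ohm mic is fine even on 50 m of cable (~159 kHz corner); a 10 kohm mic
# on the same cable rolls off from ~3.2 kHz, audibly dulling the high end.
```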
Most microphones are designed not to have their impedance matched by the load they are
connected to.[21] Doing so can alter their frequency response and cause distortion, especially at
high sound pressure levels. Certain ribbon and dynamic microphones are exceptions, due to the
designers' assumption of a certain load impedance being part of the internal electro-acoustical
damping circuit of the microphone.[22][dubious – discuss]
Digital microphone interface
Neumann D-01 digital microphone and Neumann DMI-8 8-channel USB Digital Microphone
Interface
The AES 42 standard, published by the Audio Engineering Society, defines a digital interface for
microphones. Microphones conforming to this standard directly output a digital audio stream
through an XLR or XLD male connector, rather than producing an analog output. Digital
microphones may be used either with new equipment with appropriate input connections that
conform to the AES 42 standard, or else via a suitable interface box. Studio-quality microphones
that operate in accordance with the AES 42 standard are now available from a number of
microphone manufacturers.
Measurements and specifications
A comparison of the far field on-axis frequency response of the Oktava 319 and the Shure SM58
Because of differences in their construction, microphones have their own characteristic responses
to sound. This difference in response produces non-uniform phase and frequency responses. In
addition, microphones are not uniformly sensitive to sound pressure, and can accept differing
levels without distorting. Although for scientific applications microphones with a more uniform
response are desirable, this is often not the case for music recording, as the non-uniform response
of a microphone can produce a desirable coloration of the sound. There is an international
standard for microphone specifications,[19] but few manufacturers adhere to it. As a result,
comparison of published data from different manufacturers is difficult because different
measurement techniques are used. The Microphone Data Website has collated the technical
specifications complete with pictures, response curves and technical data from the microphone
manufacturers for every currently listed microphone, and even a few obsolete models, and shows
the data for them all in one common format for ease of comparison.[1] Caution should be used
in drawing any solid conclusions from this or any other published data, however, unless it is
known that the manufacturer has supplied specifications in accordance with IEC 60268-4.
A frequency response diagram plots the microphone sensitivity in decibels over a range of
frequencies (typically 20 Hz to 20 kHz), generally for perfectly on-axis sound (sound arriving at
0° to the capsule). Frequency response may also be stated, less informatively, in text form, for example: "30 Hz–16 kHz ±3 dB". This is interpreted as a nearly flat, linear plot between the stated frequencies, with variations in amplitude of no more than plus or minus 3 dB. However,
one cannot determine from this information how smooth the variations are, nor in what parts of
the spectrum they occur. Note that commonly made statements such as "20 Hz–20 kHz" are
meaningless without a decibel measure of tolerance. Directional microphones' frequency
response varies greatly with distance from the sound source, and with the geometry of the sound
source. IEC 60268-4 specifies that frequency response should be measured in plane progressive
wave conditions (very far away from the source) but this is seldom practical. Close talking
microphones may be measured with different sound sources and distances, but there is no
standard and therefore no way to compare data from different models unless the measurement
technique is described.
The self-noise or equivalent noise level is the sound level that creates the same output voltage as
the microphone does in the absence of sound. This represents the lowest point of the
microphone's dynamic range, and is particularly important if you wish to record quiet sounds. The measure is often stated in dB(A), which is the equivalent loudness of the noise on
a decibel scale frequency-weighted for how the ear hears, for example: "15 dBA SPL" (SPL
means sound pressure level relative to 20 micropascals). The lower the number the better. Some
microphone manufacturers state the noise level using ITU-R 468 noise weighting, which more
accurately represents the way we hear noise, but gives a figure some 11–14 dB higher. A quiet
microphone typically measures 20 dBA SPL or 32 dB SPL 468-weighted. Very quiet
microphones have existed for years for special applications, such as the Brüel & Kjaer 4179, with a
noise level around 0 dB SPL. Recently some microphones with low noise specifications have
been introduced in the studio/entertainment market, such as models from Neumann and Røde
that advertise noise levels of 5–7 dBA. Typically this is achieved by altering the frequency
response of the capsule and electronics to result in lower noise within the A-weighting curve
while broadband noise may be increased.
The maximum SPL the microphone can accept is measured for particular values of total
harmonic distortion (THD), typically 0.5%. This amount of distortion is generally inaudible, so
one can safely use the microphone at this SPL without harming the recording. Example: "142 dB
SPL peak (at 0.5% THD)". The higher the value, the better, although microphones with a very
high maximum SPL also have a higher self-noise.
The clipping level is an important indicator of maximum usable level, as the 1% THD figure
usually quoted under max SPL is really a very mild level of distortion, quite inaudible especially
on brief high peaks. Clipping is much more audible. For some microphones the clipping level
may be much higher than the max SPL.
The dynamic range of a microphone is the difference in SPL between the noise floor and the
maximum SPL. If stated on its own, for example "120 dB", it conveys significantly less
information than having the self-noise and maximum SPL figures individually.
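As a simple arithmetic sketch of how the two underlying specifications combine, using the example figures quoted in this section (15 dBA self-noise and a 142 dB SPL maximum):

    def dynamic_range_db(self_noise_dba, max_spl_db):
        """Span between the equivalent noise level and the maximum SPL."""
        return max_spl_db - self_noise_dba

    print(dynamic_range_db(15, 142))  # 127 dB for this hypothetical microphone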
Sensitivity indicates how well the microphone converts acoustic pressure to output voltage. A
high sensitivity microphone creates more voltage and so needs less amplification at the mixer or
recording device. This is a practical concern but is not directly an indication of the mic's quality; in fact the term sensitivity is something of a misnomer, 'transduction gain' (or simply "output level") being perhaps more meaningful, because true sensitivity is generally set by the noise floor, and too much "sensitivity" in terms of output level compromises the clipping level. There
are two common measures. The (preferred) international standard is made in millivolts per pascal
at 1 kHz. A higher value indicates greater sensitivity. The older American method is referred to a
1 V/Pa standard and measured in plain decibels, resulting in a negative value. Again, a higher
value indicates greater sensitivity, so −60 dB is more sensitive than −70 dB.
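Converting between the two conventions is straightforward, since the American figure is 20 log10 of the sensitivity referenced to 1 V/Pa. A small Python sketch (the 2 mV/Pa input is an arbitrary example value):

    import math

    def mv_per_pa_to_db(mv_per_pa):
        """mV/Pa to dB re 1 V/Pa (the older American convention)."""
        return 20.0 * math.log10(mv_per_pa / 1000.0)

    def db_to_mv_per_pa(db_re_1v_pa):
        """dB re 1 V/Pa back to mV/Pa."""
        return 1000.0 * 10.0 ** (db_re_1v_pa / 20.0)

    print(f"{mv_per_pa_to_db(2.0):.1f} dB re 1 V/Pa")  # 2 mV/Pa -> about -54 dB
    print(f"{db_to_mv_per_pa(-60):.1f} mV/Pa")         # -60 dB -> 1 mV/Pa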
Measurement microphones
Some microphones are intended for testing speakers, measuring noise levels and otherwise
quantifying an acoustic experience. These are calibrated transducers and are usually supplied
with a calibration certificate that states absolute sensitivity against frequency. The quality of
measurement microphones is often referred to using the designations "Class 1," "Type 2" etc.,
which are references not to microphone specifications but to sound level meters.[23] A more
comprehensive standard[24] for the description of measurement microphone performance was
recently adopted.
Measurement microphones are generally scalar sensors of pressure; they exhibit an
omnidirectional response, limited only by the scattering profile of their physical dimensions.
Sound intensity or sound power measurements require pressure-gradient measurements, which
are typically made using arrays of at least two microphones, or with hot-wire anemometers.
Microphone calibration
Main article: Measurement microphone calibration
To take a scientific measurement with a microphone, its precise sensitivity must be known (in
volts per pascal). Since this may change over the lifetime of the device, it is necessary to
regularly calibrate measurement microphones. This service is offered by some microphone
manufacturers and by independent certified testing labs. All microphone calibration is ultimately
traceable to primary standards at a national measurement institute such as NPL in the UK, PTB
in Germany and NIST in the USA, which most commonly calibrate using the reciprocity primary
standard. Measurement microphones calibrated using this method can then be used to calibrate
other microphones using comparison calibration techniques.
Depending on the application, measurement microphones must be tested periodically (every year
or several months, typically) and after any potentially damaging event, such as being dropped
(most such mikes come in foam-padded cases to reduce this risk) or exposed to sounds beyond
the acceptable level.
Microphone array and array microphones
Main article: Microphone array
A microphone array is any number of microphones operating in tandem. There are many
applications:
 Systems for extracting voice input from ambient noise (notably telephones, speech recognition systems, hearing aids)
 Surround sound and related technologies
 Locating objects by sound: acoustic source localization, e.g. military use to locate the source(s) of artillery fire, aircraft location and tracking
 High fidelity original recordings
 3D spatial beamforming for localized acoustic detection of subcutaneous sounds
Typically, an array is made up of omnidirectional microphones distributed about the perimeter of
a space, linked to a computer that records and interprets the results into a coherent form.
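One simple way such an array can be "steered" toward a source is delay-and-sum beamforming: delay each channel so that a wavefront from the chosen direction lines up across all microphones, then average. The Python sketch below is a teaching illustration only, assuming a linear array, plane-wave arrival, and integer-sample delays:

    import math

    def delay_and_sum(signals, mic_positions_m, angle_rad, fs, c=343.0):
        """Align each channel for a plane wave arriving at angle_rad
        (measured from broadside) and average; signals is a list of
        equal-length per-microphone sample lists."""
        delays_s = [x * math.sin(angle_rad) / c for x in mic_positions_m]
        base = min(delays_s)
        delays_n = [round((d - base) * fs) for d in delays_s]
        n = len(signals[0])
        out = [0.0] * n
        for sig, d in zip(signals, delays_n):
            for i in range(n):
                out[i] += sig[(i + d) % n]  # advance by d samples (wraps at edges)
        return [s / len(signals) for s in out]

Sound from the steered direction adds coherently while sound from other directions partially cancels, which is the basis of the voice-extraction and localization applications listed above.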
Microphone windscreens
Various microphone covers
Windscreens[note 1] are used to protect microphones that would otherwise be buffeted by wind or
vocal plosives from consonants such as "P", "B", etc. Most microphones have an integral
windscreen built around the microphone diaphragm. A screen of plastic, wire mesh or a metal
cage is held at a distance from the microphone diaphragm, to shield it. This cage provides a first
line of defense against the mechanical impact of objects or wind. Some microphones, such as the
Shure SM58, may have an additional layer of foam inside the cage to further enhance the
protective properties of the shield. One disadvantage of all windscreen types is that the
microphone's high frequency response is attenuated by a small amount, depending on the density
of the protective layer.
Beyond integral microphone windscreens, there are three broad classes of additional wind
protection.
Microphone covers
Microphone covers are often made of soft open-cell polyester or polyurethane foam because of
the inexpensive, disposable nature of the foam. Optional windscreens are often available from
the manufacturer and third parties. A visible example of an optional accessory windscreen is the
A2WS from Shure, one of which is fitted over each of the two Shure SM57 microphones used on
the United States president's lectern.[25] One disadvantage of polyurethane foam microphone
covers is that they can deteriorate over time. Windscreens also tend to collect dirt and moisture
in their open cells and must be cleaned to prevent high frequency loss, bad odor and unhealthy
conditions for the person using the microphone. On the other hand, a major advantage of concert
vocalist windscreens is that one can quickly change to a clean windscreen between users,
reducing the chance of transferring germs. Windscreens of various colors can be used to
distinguish one microphone from another on a busy, active stage.
Pop filter by Gauge Precision Instruments
Pop filters
Pop filters or pop screens are used in controlled studio environments to minimize plosives when
recording. A typical pop filter is composed of one or more layers of acoustically transparent
gauze-like material, such as woven nylon (e.g. pantyhose) stretched over a circular frame and a
clamp and a flexible mounting bracket to attach to the microphone stand. The pop shield is
placed between the vocalist and the microphone. The closer a vocalist brings his or her lips to the microphone, the greater the need for a pop filter. Singers can be trained either to soften their plosives or to direct the air blast away from the microphone, in which case they don't need a pop filter.
Pop filters also keep spittle off the microphone. Most condenser microphones can be damaged by
spittle.
Blimps
Two recordings being made: a blimp is being used on the left, and an open-cell foam windscreen is being used on the right.
A 'dead cat' and a 'dead kitten' windscreen. The dead kitten covers a stereo mic for a DSLR camera; the difference in name is due to the size of the fur.
Blimps (also known as Zeppelins) are large, hollow windscreens used to surround microphones
for outdoor location audio, such as nature recording, electronic news gathering, and for film and
video shoots. They can cut wind noise by as much as 25 dB, especially low-frequency noise. The
blimp is essentially a hollow cage or basket with acoustically transparent material stretched over
the outer frame. The blimp works by creating a volume of still air around the microphone. The
microphone is often further isolated from the blimp by an elastic suspension inside the basket.
This reduces wind vibrations and handling noise transmitted from the cage. To extend the range
of wind speed conditions in which the blimp remains effective, many have the option of a
secondary cover over the outer shell. This is usually an acoustically transparent, synthetic fur
material with long, soft hairs. Common and slang names for this include "dead cat" or
"windmuff". The hairs deaden the noise caused by the shock of wind hitting the blimp. A
synthetic fur cover can reduce wind noise by an additional 10 dB.[26]
Amplifier
Generally, an amplifier (or simply amp) is a device for increasing the power of a signal.
In popular use, the term usually describes an electronic amplifier, in which the input "signal" is
usually a voltage or a current. In audio applications, amplifiers drive the loudspeakers used in PA
systems to make the human voice louder or play recorded music. Amplifiers may be classified
according to the input (source) they are designed to amplify (such as a guitar amplifier, to
perform with an electric guitar), the device they are intended to drive (such as a headphone
amplifier), the frequency range of the signals (Audio, IF, RF, and VHF amplifiers, for example),
whether they invert the signal (inverting amplifiers and non-inverting amplifiers), or the type of
device used in the amplification (valve or tube amplifiers, FET amplifiers, etc.).
A related device that emphasizes conversion of signals of one type to another (for example, a
light signal in photons to a DC signal in amperes) is a transducer, a transformer, or a sensor.
However, none of these amplify power.
Figures of merit
The quality of an amplifier can be characterized by a number of specifications, listed below.
Gain
The gain of an amplifier is the ratio of output to input power or amplitude, and is usually measured in decibels. (When measured in decibels it is logarithmically related to the power ratio: G(dB) = 10 log10(Pout/Pin).) RF amplifiers are often specified in terms of the maximum power
gain obtainable, while the voltage gain of audio amplifiers and instrumentation amplifiers will be
more often specified (since the amplifier's input impedance will often be much higher than the
source impedance, and the load impedance higher than the amplifier's output impedance).
 Example: an audio amplifier with a gain given as 20 dB will have a voltage gain of ten (but a power gain of 100 would only occur in the unlikely event the input and output impedances were identical).
If two equivalent amplifiers are being compared, the amplifier with higher gain settings would be
more sensitive as it would take less input signal to produce a given amount of power.[1]
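A quick sketch of the decibel arithmetic behind that example; the factor of 20 for voltage arises because power is proportional to the square of voltage:

    import math

    def power_gain_db(p_out, p_in):
        return 10.0 * math.log10(p_out / p_in)

    def voltage_gain_db(v_out, v_in):
        # 20 log10 because power goes as voltage squared
        return 20.0 * math.log10(v_out / v_in)

    print(voltage_gain_db(10.0, 1.0))  # 20.0 dB for a voltage gain of ten
    print(power_gain_db(100.0, 1.0))   # 20.0 dB for a power gain of one hundred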
Bandwidth
The bandwidth of an amplifier is the range of frequencies for which the amplifier gives
"satisfactory performance". The definition of "satisfactory performance" may be different for
different applications. However, a common and well-accepted metric is the half power points
(i.e. frequency where the power goes down by half its peak value) on the output vs. frequency
curve. Therefore bandwidth can be defined as the difference between the lower and upper half
power points. This is therefore also known as the −3 dB bandwidth. Bandwidths (otherwise
called "frequency responses") for other response tolerances are sometimes quoted (−1 dB, −6 dB
etc.) or "plus or minus 1dB" (roughly the sound level difference people usually can detect).
The gain of a good quality full-range audio amplifier will be essentially flat between 20 Hz and
about 20 kHz (the range of normal human hearing). In ultra high fidelity amplifier design, the
amp's frequency response should extend considerably beyond this (one or more octaves either
side) and might have −3 dB points < 10 Hz and > 65 kHz. Professional touring amplifiers often
have input and/or output filtering to sharply limit frequency response beyond 20 Hz-20 kHz; too
much of the amplifier's potential output power would otherwise be wasted on infrasonic and
ultrasonic frequencies, and the danger of AM radio interference would increase. Modern
switching amplifiers need steep low pass filtering at the output to get rid of high frequency
switching noise and harmonics.
Efficiency
Efficiency is a measure of how much of the power drawn from the supply is usefully applied to the amplifier's
output. Class A amplifiers are very inefficient, in the range of 10–20% with a max efficiency of
25% for direct coupling of the output. Inductive coupling of the output can raise their efficiency
to a maximum of 50%.
Drain efficiency is the ratio of output RF power to input DC power when primary input DC
power has been fed to the drain of an FET. Based on this definition, the drain efficiency cannot
exceed 25% for a class A amplifier that is supplied drain bias current through resistors (because
the RF signal has its zero level at about 50% of the input DC). Manufacturers specify much higher
drain efficiencies, and designers are able to obtain higher efficiencies by providing current to the
drain of the transistor through an inductor or a transformer winding. In this case the RF zero
level is near the DC rail and will swing both above and below the rail during operation. While
the voltage level is above the DC rail, current is supplied by the inductor.
Class B amplifiers have a very high efficiency but are impractical for audio work because of high levels of distortion (see: crossover distortion). In practice, the result of a tradeoff is the class AB design. Modern class AB amplifiers commonly have peak efficiencies of 30–55% in audio systems and 50–70% in radio frequency systems, with a theoretical maximum of 78.5%.
Commercially available Class D switching amplifiers have reported efficiencies as high as 90%.
Class C–F amplifiers are usually known as very high efficiency amplifiers. RCA manufactured an AM broadcast transmitter employing a single class-C low-mu triode with an RF efficiency in the 90% range.
More efficient amplifiers run cooler, and often do not need any cooling fans even in multi-kilowatt designs. This is because the power lost to inefficiency is dissipated as heat; a more efficient amplifier loses less energy and therefore produces less heat.
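A rough sketch makes the relationship between efficiency and heat concrete; the efficiencies below are the illustrative class-typical values quoted in this section, and the 100 W output level is an arbitrary assumption:

    def heat_dissipated_w(output_power_w, efficiency):
        """Power lost as heat: input power minus useful output power."""
        input_power = output_power_w / efficiency
        return input_power - output_power_w

    # At 100 W of audio output:
    print(heat_dissipated_w(100, 0.25))  # class A:  ~300 W of heat
    print(heat_dissipated_w(100, 0.50))  # class AB: ~100 W of heat
    print(heat_dissipated_w(100, 0.90))  # class D:  ~11 W of heat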
In RF linear power amplifiers, such as cellular base stations and broadcast transmitters, special design techniques can be used to improve efficiency. Doherty designs, which use a second output stage as a "peak" amplifier, can lift efficiency from the typical 15% up to 30–35% in a narrow
bandwidth. Envelope Tracking designs are able to achieve efficiencies of up to 60%, by
modulating the supply voltage to the amplifier in line with the envelope of the signal.
Linearity
An ideal amplifier would be a totally linear device, but real amplifiers are only linear within
limits.
When the signal drive to the amplifier is increased, the output also increases until a point is
reached where some part of the amplifier becomes saturated and cannot produce any more
output; this is called clipping, and results in distortion.
In most amplifiers a reduction in gain takes place before hard clipping occurs; the result is a
compression effect, which (if the amplifier is an audio amplifier) sounds much less unpleasant to
the ear. For these amplifiers, the 1 dB compression point is defined as the input power (or output
power) where the gain is 1 dB less than the small signal gain. Sometimes this nonlinearity is
deliberately designed in to reduce the audible unpleasantness of hard clipping under overload.
Ill effects of nonlinearity can be reduced with negative feedback.
Linearization is an emerging field, with many techniques, such as feedforward, predistortion and postdistortion, used to counteract the undesired effects of non-linearities.
Noise
This is a measure of how much noise is introduced in the amplification process. Noise is an
undesirable but inevitable product of the electronic devices and components; also, much noise
results from intentional economies of manufacture and design time. The metric for noise
performance of a circuit is noise figure or noise factor. Noise figure compares the signal-to-noise ratio at the output with that at the input, with the input noise taken to be thermal noise.
Output dynamic range
Output dynamic range is the range, usually given in dB, between the smallest and largest useful
output levels. The lowest useful level is limited by output noise, while the largest is limited most
often by distortion. The ratio of these two is quoted as the amplifier dynamic range. More
precisely, if S = maximal allowed signal power and N = noise power, the dynamic range DR is
DR = (S + N)/N.[2]
In many switched mode amplifiers, dynamic range is limited by the minimum output step size.
Slew rate
Slew rate is the maximum rate of change of the output, usually quoted in volts per second (or
microsecond). Many amplifiers are ultimately slew rate limited (typically by the impedance of a
drive current having to overcome capacitive effects at some point in the circuit), which
sometimes limits the full power bandwidth to frequencies well below the amplifier's small-signal
frequency response.
Rise time
The rise time, tr, of an amplifier is the time taken for the output to change from 10% to 90% of
its final level when driven by a step input. For a Gaussian response system (or a simple RC roll
off), the rise time is approximated by:
tr × BW ≈ 0.35, where tr is the rise time in seconds and BW is the bandwidth in Hz.
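As a worked example of this relation (assuming the Gaussian or simple RC response above):

    def rise_time_s(bandwidth_hz):
        """tr = 0.35 / BW for a Gaussian or single-pole response."""
        return 0.35 / bandwidth_hz

    print(rise_time_s(20_000))  # a 20 kHz bandwidth gives ~17.5 microseconds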
Settling time and ringing
The time taken for the output to settle to within a certain percentage of the final value (for
instance 0.1%) is called the settling time, and is usually specified for oscilloscope vertical
amplifiers and high accuracy measurement systems. Ringing refers to an output variation that
cycles above and below an amplifier's final value and leads to a delay in reaching a stable output.
Ringing is the result of overshoot caused by an underdamped circuit.
Overshoot
In response to a step input, the overshoot is the amount the output exceeds its final, steady-state
value.
Stability
Stability is an issue in all amplifiers with feedback, whether that feedback is added intentionally
or results unintentionally. It is especially an issue when applied over multiple amplifying stages.
Stability is a major concern in RF and microwave amplifiers. The degree of an amplifier's
stability can be quantified by a so-called stability factor. There are several different stability
factors, such as the Stern stability factor and the Linvill stability factor, which specify a condition
that must be met for the absolute stability of an amplifier in terms of its two-port parameters.
Electronic amplifiers
Main article: Electronic amplifier
There are many types of electronic amplifiers, commonly used in radio and television
transmitters and receivers, high-fidelity ("hi-fi") stereo equipment, microcomputers and other
electronic digital equipment, and guitar and other instrument amplifiers. Critical components
include active devices, such as vacuum tubes or transistors.
Other amplifier types
Carbon microphone
One of the first devices used to amplify signals was the carbon microphone (effectively a sound-controlled variable resistor). By channeling a large electric current through the compressed
carbon granules in the microphone, a small sound signal could produce a much larger electric
signal. The carbon microphone was extremely important in early telecommunications; analog
telephones in fact work without the use of any other amplifier. Before the invention of electronic
amplifiers, mechanically coupled carbon microphones were also used as amplifiers in telephone
repeaters for long distance service.
Magnetic amplifier
Main article: magnetic amplifier
A magnetic amplifier is a transformer-like device that makes use of the saturation of magnetic
materials to produce amplification. It is a non-electronic electrical amplifier with no moving
parts. The bandwidth of magnetic amplifiers extends to the hundreds of kilohertz.
Rotating electrical machinery amplifier
A Ward Leonard control is a rotating machine like an electrical generator that provides
amplification of electrical signals by the conversion of mechanical energy to electrical energy.
Changes in generator field current result in larger changes in the output current of the generator,
providing gain. This class of device was used for smooth control of large motors, primarily for
elevators and naval guns.
Field modulation of a very high speed AC generator was also used for some early AM radio
transmissions.[3] See Alexanderson alternator.
Johnsen-Rahbek effect amplifier
The earliest form of audio power amplifier was Edison's "electromotograph" loud-speaking
telephone, which used a wetted rotating chalk cylinder in contact with a stationary contact. The
friction between cylinder and contact varied with the current, providing gain. Edison discovered
this effect in 1874, but the theory behind the Johnsen-Rahbek effect was not understood until the
semiconductor era.
Mechanical amplifiers
Mechanical amplifiers were used in the pre-electronic era in specialized applications.
Early autopilot units designed by Elmer Ambrose Sperry incorporated a mechanical amplifier
using belts wrapped around rotating drums; a slight increase in the tension of the belt caused the
drum to move the belt. A paired, opposing set of such drives made up a single amplifier. This
amplified small gyro errors into signals large enough to move aircraft control surfaces. A similar
mechanism was used in the Vannevar Bush differential analyzer.
The electrostatic drum amplifier used a band wrapped partway around a rotating drum, and fixed
at its anchored end to a spring. The other end connected to a speaker cone. The input signal was
transformed up to high voltage, and added to a high voltage dc supply line. This voltage was
connected between drum and belt. The input signal thus varied the electric field between belt and drum, which varied the friction between them, and in turn the lateral movement of the belt and hence of the speaker cone.
Other variations on the theme also existed at one time.
Optical amplifiers
Main article: Optical amplifier
Optical amplifiers amplify light through the process of stimulated emission. See Laser and
Maser.
Miscellaneous types
 There are also mechanical amplifiers, such as the automotive servo used in braking.
 Relays can be included under the above definition of amplifiers, although their transfer function is not linear (that is, they are either open or closed).
 Purely mechanical manifestations of such digital amplifiers can also be built (for theoretical or instructional purposes, or for entertainment); see e.g. the domino computer.
 Another type of amplifier is the fluidic amplifier, based on the fluidic triode.
Loudspeaker
loudspeaker (or "speaker") is an electroacoustic transducer that produces sound in response to an
electrical audio signal input. Non-electrical loudspeakers were developed as accessories to
telephone systems, but electronic amplification by vacuum tube made loudspeakers more generally
useful. The most common form of loudspeaker uses a paper cone supporting a voice coil
electromagnet acting on a permanent magnet, but many other types exist. Where accurate
reproduction of sound is required, multiple loudspeakers may be used, each reproducing a part of
the audible frequency range. Miniature loudspeakers are found in devices such as radio and TV
receivers, and many forms of music players. Larger loudspeaker systems are used for music, sound
reinforcement in theatres and concerts, and in public address systems.
Audio Mixer
In professional audio, a mixing console, or audio mixer, also called a sound board, mixing
desk, or mixer is an electronic device for combining (also called "mixing"), routing, and
changing the level, timbre and/or dynamics of audio signals. A mixer can mix analog or digital
signals, depending on the type of mixer. The modified signals (voltages or digital samples) are
summed to produce the combined output signals.
Mixing consoles are used in many applications, including recording studios, public address
systems, sound reinforcement systems, broadcasting, television, and film post-production. An
example of a simple application would be to enable the signals that originated from two separate
microphones (each being used by vocalists singing a duet, perhaps) to be heard through one set
of speakers simultaneously. When used for live performances, the signal produced by the mixer
will usually be sent directly to an amplifier, unless that particular mixer is "powered" or it is
being connected to powered speakers.
Structure
Yamaha 2403 audio mixing console in a 'live' mixing application
A typical analog mixing board has three sections:
 Channel inputs
 Master controls
 Audio level metering
The channel input strips are usually a bank of identical monaural or stereo input channels. The
master control section has sub-group faders, master faders, master auxiliary mixing bus level
controls and auxiliary return level controls. In addition it may have solo monitoring controls, a
stage talk-back microphone control, muting controls and an output matrix mixer. On smaller
mixers the inputs are on the left of the mixing board and the master controls are on the right. In
larger mixers, the master controls are in the center with inputs on both sides. The audio level
meters may be above the input and master sections or they may be integrated into the input and
master sections themselves.
Digital Audio
Digital audio is sound reproduction using pulse-code modulation and digital signals. Digital
audio systems include analog-to-digital conversion (ADC), digital-to-analog conversion (DAC),
digital storage, processing and transmission components. A primary benefit of digital audio is in
its convenience of storage, transmission and retrieval.
Overview of digital audio
Sampling and 4-bit quantization of an analog signal (red) using Pulse-code modulation
Digital audio has emerged because of its usefulness in the recording, manipulation, mass-production, and distribution of sound. Modern distribution of music across the Internet via online stores depends on digital recording and digital compression algorithms. Distribution of audio as data files rather than as physical objects has significantly reduced the cost of distribution.
In an analog audio system, sounds begin as physical waveforms in the air, are transformed into
an electrical representation of the waveform, via a transducer (for example, a microphone), and
are stored or transmitted. To be re-created into sound, the process is reversed, through
amplification and then conversion back into physical waveforms via a loudspeaker. Although its
nature may change, analog audio's fundamental wave-like characteristics remain the same during
its storage, transformation, duplication, and amplification.
Analog audio signals are susceptible to noise and distortion, unavoidable due to the innate
characteristics of electronic circuits and associated devices. In the case of purely analog
recording and reproduction, numerous opportunities for the introduction of noise and distortion
exist throughout the entire process. When audio is digitized, distortion and noise are introduced
only by the stages that precede conversion to digital format, and by the stages that follow
conversion back to analog.
The digital audio chain begins when an analog audio signal is first sampled, and then (for pulse-code modulation, the usual form of digital audio) it is converted into binary signals—'on/off' pulses—which are stored as binary electronic, magnetic, or optical signals, rather than as continuous-time, continuous-level electronic or electromechanical signals. This signal may then be further encoded to allow correction of any errors that might occur in the storage or transmission of the signal; however, this encoding is for error correction and is not strictly part of the digital audio process. This "channel coding" is essential to the ability of a broadcast or recorded digital system to avoid loss of bit accuracy. The discrete time and level of the binary
signal allow a decoder to recreate the analog signal upon replay. An example of a channel code is
Eight to Fourteen Bit Modulation as used in the audio Compact Disc (CD).
Conversion process
The lifecycle of sound from its source, through an ADC, digital processing, a DAC, and finally as sound
again.
A digital audio system starts with an ADC that converts an analog signal to a digital signal.[note 1]
The ADC runs at a sampling rate and converts at a known bit resolution. For example, CD audio
has a sampling rate of 44.1 kHz (44,100 samples per second) and 16-bit resolution for each
channel. For stereo there are two channels: 'left' and 'right'. If the analog signal is not already
bandlimited then an anti-aliasing filter is necessary before conversion, to prevent aliasing in the
digital signal. (Aliasing occurs when frequencies above the Nyquist frequency have not been
band limited, and instead appear as audible artifacts in the lower frequencies).
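As an illustration of sampling and quantization at CD parameters, here is a minimal pure-Python stand-in for an ADC. A real converter would also apply anti-aliasing filtering and dither, which are omitted for brevity:

    import math

    FS = 44_100                       # CD sampling rate, samples per second
    BITS = 16                         # CD resolution per channel
    MAX_CODE = 2 ** (BITS - 1) - 1    # 32767 for signed 16-bit samples

    def sample_sine_16bit(freq_hz, duration_s, amplitude=0.8):
        """Sample a sine wave at FS and quantize to signed 16-bit codes."""
        n_samples = int(FS * duration_s)
        return [round(amplitude * MAX_CODE * math.sin(2 * math.pi * freq_hz * n / FS))
                for n in range(n_samples)]

    samples = sample_sine_16bit(440.0, 0.01)  # 10 ms of A440
    print(len(samples))                       # 441 samples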
The digital audio signal may be stored or transmitted. Digital audio storage can be on a CD, a
digital audio player, a hard drive, USB flash drive, CompactFlash, or any other digital data
storage device. The digital signal may then be altered in a process which is called digital signal
processing where it may be filtered or have effects applied. Audio data compression techniques
— such as MP3, Advanced Audio Coding, Ogg Vorbis, or FLAC — are commonly employed to
reduce the file size. Digital audio can be streamed to other devices.
The last step is for digital audio to be converted back to an analog signal with a DAC. Like
ADCs, DACs run at a specific sampling rate and bit resolution but through the processes of
oversampling, upsampling, and downsampling, this sampling rate may not be the same as the
initial sampling rate.
History of digital audio use in commercial recording
Pulse-code modulation was invented by British scientist Alec Reeves in 1937[1] and was used in
telecommunications applications long before its first use in commercial broadcast and recording.
Commercial digital recording was pioneered in Japan by NHK, and Nippon Columbia (a.k.a.
Denon) in the 1960s. The first commercial digital recordings were released in 1971.[2]
The BBC also began experimenting with digital audio in the 1960s. By the early 1970s they had
developed a 2-channel recorder and in 1972 they deployed a digital audio transmission system
linking their broadcast center to their remote transmitters.[2]
The first 16-bit PCM recording in the United States was made by Thomas Stockham at the Santa
Fe Opera in 1976 on a Soundstream recorder. In 1978, an improved version of the Soundstream
system was used by Telarc to produce several classical recordings. At the same time 3M was
well along in development of their digital multitrack recorder based on BBC technology. The
first all-digital album recorded on this machine was Ry Cooder's "Bop 'Til You Drop" which was
released in 1979. In a crash program started in 1978, British record label Decca developed their
own 2-track digital audio recorders. Decca released the first European digital recording in
1979.[2]
Helped along by the introduction of popular digital multitrack recorders from Sony and Mitsubishi
in the early 1980s, digital recording was soon embraced by the major record companies. With the
introduction of the CD by Sony and Philips in 1982, digital audio was embraced by consumers as
well.[2]
Digital audio technologies
Digital audio broadcasting:
 Digital Audio Broadcasting (DAB)
 HD Radio
 Digital Radio Mondiale (DRM)
 In-band on-channel (IBOC)
Storage technologies:
 Digital audio player
 Digital Audio Tape (DAT)
 Compact Disc (CD)
 Hard disk recorder
 DVD-Audio
 MiniDisc
 Super Audio CD
 Various audio file formats
Synthesizers
A synthesizer (often abbreviated "synth") is an electronic instrument capable of producing
sounds by generating electrical signals of different frequencies. These electrical signals are
played through a loudspeaker or set of headphones. Synthesizers can usually produce a wide
range of sounds, which may either imitate other instruments ("imitative synthesis") or generate
new timbres.
Synthesizers use a number of different technologies or programmed algorithms, each with their
own strengths and weaknesses. Among the most popular waveform synthesis techniques are
subtractive synthesis, additive synthesis, wavetable synthesis, frequency modulation synthesis,
phase distortion synthesis, physical modeling synthesis and sample-based synthesis. Other sound
synthesis methods, like subharmonic synthesis or granular synthesis, are not commonly found in
hardware music synthesizers.
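As a toy illustration of one of these techniques, the sketch below implements classic two-operator frequency modulation (FM) synthesis; the carrier and modulator frequencies and the modulation index are arbitrary example values:

    import math

    def fm_tone(carrier_hz, modulator_hz, mod_index, duration_s, fs=44_100):
        """Two-operator FM: out(t) = sin(2*pi*fc*t + I*sin(2*pi*fm*t)).
        The modulation index I controls how harmonically rich the tone is."""
        n = int(fs * duration_s)
        return [math.sin(2 * math.pi * carrier_hz * t / fs
                         + mod_index * math.sin(2 * math.pi * modulator_hz * t / fs))
                for t in range(n)]

    # A bell-like timbre: non-integer carrier-to-modulator ratio, moderate index.
    tone = fm_tone(440.0, 615.0, mod_index=3.0, duration_s=1.0)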
Synthesizers are often controlled with a piano-style keyboard, leading such instruments to be
referred to simply as "keyboards". Several other forms of controller have been devised to
resemble violins, guitars (see guitar synthesizer) and wind-instruments. Synthesizers without
controllers are often called "modules", and they can be controlled using MIDI or CV/Gate
methods.
MIDI
MIDI (/ˈmɪdi/; Musical Instrument Digital Interface) is an industry-standard protocol that
enables electronic musical instruments (synthesizers, drum machines), computers and other
electronic equipment (MIDI controllers, sound cards, samplers) to communicate and synchronize
with each other. Unlike analog devices, MIDI does not transmit an audio signal: it sends event
messages about musical notation, pitch and intensity, control signals for parameters such as
volume, vibrato and panning, cues, and clock signals to set the tempo. As an electronic protocol,
it is notable for its widespread adoption throughout the music industry. MIDI protocol was
defined in 1982.[1]
All MIDI-compatible controllers, musical instruments, and MIDI-compatible software follow the
same MIDI 1.0 specification, and thus interpret any given MIDI message the same way, and so
can communicate with and understand each other. MIDI composition and arrangement takes advantage of MIDI 1.0 and General MIDI (GM) technology to allow musical data files to be shared among many different devices, despite the incompatibilities of various electronic instruments, by using a standard, portable set of commands and parameters. Because the music is
stored as instructions rather than recorded audio waveforms, the data size of the files is quite
small by comparison.
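As a concrete illustration of these event messages, the sketch below assembles raw MIDI 1.0 Note On and Note Off bytes; actually sending them to an instrument (via a MIDI interface) is outside its scope:

    def note_on(channel, note, velocity):
        """MIDI 1.0 Note On: status byte 0x9n (n = channel 0-15),
        then 7-bit note number and velocity data bytes."""
        assert 0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127
        return bytes([0x90 | channel, note, velocity])

    def note_off(channel, note):
        """MIDI 1.0 Note Off (status 0x8n); release velocity 64 is typical."""
        return bytes([0x80 | channel, note, 64])

    print(note_on(0, 60, 100).hex())  # '903c64': middle C on channel 1, velocity 100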
Basics of Staff Notation
To better understand how to read music, maybe it is best to first ask ourselves:
What is music exactly?
Well, according to the 1976 edition (okay so I need to update my book collection!) of Funk &
Wagnalls Standard Desk Dictionary the definition is:
mu.sic (myoo'zik) n. 1. The art of producing significant arrangements of sounds, usually with
reference to rhythm, pitch and tone colour. 3. A succession or combination of notes, especially if
pleasing to the ear.
Man, don't you just hate it when you look up a definition and you need to look up words the definition uses? Well, I'll try to save you the trouble this time. Pitch is the frequency at which a note vibrates; I'll explain this shortly. Tone colour is the type of sound, for example an overdriven electric guitar has a very rough, aggressive tone while a flute usually has a soft, mellow tone (unless the flute player really sucks, I suppose). Rhythm is a measure of the time
frame you play the notes in, but I will explain that later too. For now, let's just say that music is
the art of producing significant arrangements of sounds, usually for the purpose of causing
emotional responses in people (usually, you want people to like what they hear unless of course
you are trying to be the latest punk band and want people to be offended by your sound! To each
his own I guess...).
Okay, now back to what we set out to do in the first place: teach you how to read music...
Sound and Pitch in Music
Now that we've established that music is made up of sounds I will explain what a sound actually
is:
All sounds are caused by the vibrations of air molecules. These waves ("sound waves") of
vibrations in air molecules originate from some kind of vibrating object, perhaps a musical
instrument or a person's vocal cords. In music we refer to the frequency (how many times the
molecules vibrate per second) a note vibrates at as the pitch of the note.
In most contemporary sheet music you will see the music will be written on either the treble clef
staff:
Or the bass clef staff:
As the notes are written closer to the top of these clefs their pitch increases, giving them a higher,
lighter sound. Conversely, as notes are written closer to the bottom of the clefs the pitch
decreases giving them a lower, darker sound. The treble clef contains notes that are higher in
pitch than the bass clef and the bass clef contains notes that are lower in pitch than the treble clef.
For this reason, for some instruments that have a wide range of notes, the piano in particular, you
may see these two staffs combined as follows:
The next image may help you visualize how notes are placed on the staffs in relation to their
pitch. It is a picture of a piano keyboard with the clefs and notes written over top:
Click here to listen to sound of notes in picture from left to right
Notice that as you go from the lower pitch notes on the left of the piano to the higher pitch notes
on the right side of the piano the notes are written on the staffs in ascending order. As you can
see from the diagram above we sometimes write notes that are below or above the lines on the
staff; these notes appear on small extra lines called ledger lines. You may also notice that there is
one note (middle C) which can be written as either one ledger line above the bass clef or as one
ledger line below the treble clef. The diagram above shows all of the white notes on the piano
written on the staffs, but you are probably wondering about the black notes: how are they written? Well, this can be answered by viewing the diagram below:
Click here to compare regular and accidental notes in picture
(Note: The rhythm in the sound file is slightly different from the rhythm shown in the picture.)
In music there are notes that we sometimes come across called "Accidentals". So what exactly
are these accidentals, you may be asking, the notes I accidentally play by mistake? No, although
some musicians might try to use that as an excuse, accidentals are actually notes that are called
for you to play in a piece of music which are not in the general key that most of the song is
written in.
When you encounter a note in music that has a flat (♭) in front of it you play the note immediately to the left of it on the keyboard. If you encounter a note that has a sharp (♯) in front of it you play the note immediately to the right of it on the keyboard.
Rhythm and Note Durations
There are many different durations of notes; typically you will see the following basic note
durations in today's contemporary music:
Whole Note
Half Note
Quarter Note
Eighth Note
Sixteenth Note
The majority of the contemporary rock and pop music you hear on the radio these days is written
in the 4/4 time signature:
The top number tells us how many of the specified notes are in a bar and the bottom number tells
us what duration (i.e. how long) that specified note is. For example, in 4/4 time the top number tells us there are 4 notes in a bar and the bottom number tells us that each note is 1/4 of the length of the bar, or more simply put, a quarter note. Therefore, we can tell that a song written
with a 4/4 time signature is made up of bars (musical units a song is divided up into) which
contain 4 quarter note long beats. The following picture may help in visualizing this:
Click here to listen
(Note: I have added a drum click to emphasize the beat and will also do so in some later examples. The drum will be played on every beat with an accent on beat one.)
Sound Card
A sound card (also known as an audio card) is an internal computer expansion card that
facilitates the input and output of audio signals to and from a computer under control of
computer programs. The term sound card is also applied to external audio interfaces that use
software to generate sound, as opposed to using hardware inside the PC. Typical uses of sound
cards include providing the audio component for multimedia applications such as music
composition, editing video or audio, presentation, education and entertainment (games) and
video projection. Many computers have sound capabilities built in, while others require
additional expansion cards to provide for audio capability.
Sound channels and polyphony
8-channel DAC Cirrus Logic CS4382 placed on Sound Blaster X-Fi Fatal1ty.
An important sound card characteristic is polyphony, which refers to its ability to process and
output multiple independent voices or sounds simultaneously. These distinct channels are seen as
the number of audio outputs, which may correspond to a speaker configuration such as 2.0
(stereo), 2.1 (stereo and sub woofer), 5.1 (surround), or other configuration. Sometimes, the
terms voice and channel are used interchangeably to indicate the degree of polyphony, not the
output speaker configuration.
For example, many older sound chips could accommodate three voices, but only one audio
channel (i.e., a single mono output) for output, requiring all voices to be mixed together. Later
cards, such as the AdLib sound card, had 9-voice polyphony combined into one mono output channel.
For some years, most PC sound cards have had multiple FM synthesis voices (typically 9 or 16)
which were usually used for MIDI music. The full capabilities of advanced cards aren't often
completely used; only one (mono) or two (stereo) voice(s) and channel(s) are usually dedicated
to playback of digital sound samples, and playing back more than one digital sound sample
usually requires a software downmix at a fixed sampling rate. Modern low-cost integrated
soundcards (i.e., those built into motherboards) such as audio codecs like those meeting the
AC'97 standard and even some lower-cost expansion sound cards still work this way. These
devices may provide more than two sound output channels (typically 5.1 or 7.1 surround sound),
but they usually have no actual hardware polyphony for either sound effects or MIDI
reproduction – these tasks are performed entirely in software. This is similar to the way
inexpensive softmodems perform modem tasks in software rather than in hardware.
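At its core, the software downmix mentioned above just sums the sample streams and guards against overflow. A minimal sketch, assuming signed 16-bit samples and a naive equal-gain mix:

    def mix_voices(voices):
        """Sum equal-length sample streams into one channel, scale by the
        voice count, and hard-clip to the signed 16-bit range."""
        gain = 1.0 / max(len(voices), 1)
        mixed = []
        for frame in zip(*voices):
            s = int(sum(frame) * gain)
            mixed.append(max(-32768, min(32767, s)))
        return mixed

    a = [10000, -12000, 5000]
    b = [20000, 24000, -8000]
    print(mix_voices([a, b]))  # [15000, 6000, -1500]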
Also, in the early days of wavetable synthesis, some sound card manufacturers advertised
polyphony based on MIDI capabilities alone. In this case, the card's output channel is irrelevant (and typically, the card is only capable of two channels of digital sound). Instead, the polyphony measurement applies solely to the number of MIDI instruments the sound card is
capable of producing at one given time.
Today, a sound card providing actual hardware polyphony, regardless of the number of output
channels, is typically referred to as a "hardware audio accelerator", although actual voice
polyphony is not the sole (or even a necessary) prerequisite, with other aspects such as hardware
acceleration of 3D sound, positional audio and real-time DSP effects being more important.
Since digital sound playback has become available and provided better performance than
synthesis, modern soundcards with hardware polyphony don't actually use DACs with as many
channels as voices. Instead, they perform voice mixing and effects processing in hardware
(possibly performing digital filtering and conversions to and from the frequency domain for
applying certain effects) inside a dedicated DSP. The final playback stage is performed by an
external (in reference to the DSP chip(s)) DAC with significantly fewer channels than voices
(e.g., 8 channels for 7.1 audio, which can be divided among 32, 64 or even 128 voices).
Color codes
Connectors on the sound cards are colour coded as per the PC System Design Guide. They will
also have symbols with arrows, holes and soundwaves that are associated with each jack
position; the meaning of each is given below:

Colour      Function                                                    Connector   Symbol
Pink        Analog microphone audio input.                              3.5 mm TRS  A microphone
Light blue  Analog line level audio input.                              3.5 mm TRS  An arrow going into a circle
Lime green  Analog line level audio output for the main stereo          3.5 mm TRS  Arrow going out one side of a circle into a wave
            signal (front speakers or headphones).
Brown/Dark  Analog line level audio output for a special panning,       3.5 mm TRS
            'Right-to-left speaker'.
Black       Analog line level audio output for surround speakers,       3.5 mm TRS
            typically rear stereo.
Orange      Analog line level audio output for center channel           3.5 mm TRS  Arrow going out both sides into waves
            speaker and subwoofer.
Gold/Grey   Game port / MIDI                                            15 pin D
History of sound cards for the IBM PC architecture
The AdLib Music Synthesizer Card was one of the first sound cards, circa 1990. Note the manual volume adjustment knob. ISA-8 bus.
Sound card Mozart 16 for ISA-16 bus.
A Turtle Beach sound card. PCI bus.
Echo Digital Audio's Indigo IO — PCMCIA card 24-bit 96 kHz stereo in/out sound card.
Sound cards for computers compatible with the IBM PC were very uncommon until 1988, which
left the single internal PC speaker as the only way early PC software could produce sound and
music. The speaker hardware was typically limited to square waves, which fit the common
nickname of "beeper". The resulting sound was generally described as "beeps and boops".
Several companies, most notably Access Software, developed techniques for digital sound
reproduction over the PC speaker; the resulting audio, while barely functional, suffered from
distorted output and low volume, and usually required all other processing to be stopped while
sounds were played. Other home computer models of the 1980s included hardware support for
digital sound playback, or music synthesis (or both), leaving the IBM PC at a disadvantage to
them when it came to multimedia applications such as music composition or gaming.
It is important to note that the initial design and marketing focuses of sound cards for the IBM
PC platform were not based on gaming, but rather on specific audio applications such as music
composition (AdLib Personal Music System, Creative Music System, IBM Music Feature Card)
or on speech synthesis (Digispeech DS201, Covox Speech Thing, Street Electronics Echo). Not
until Sierra and other game companies became involved in 1988 was there a switch toward
gaming.
Hardware manufacturers
One of the first manufacturers of sound cards for the IBM PC was AdLib, who produced a card
based on the Yamaha YM3812 sound chip, also known as the OPL2. The AdLib had two modes:
A 9-voice mode where each voice could be fully programmed, and a less frequently used
"percussion" mode with 3 regular voices producing 5 independent percussion-only voices for a
total of 11. (The percussion mode was considered inflexible by most developers; it was used
mostly by AdLib's own composition software.)
Creative Labs also marketed a sound card about the same time called the Creative Music System.
Although the C/MS had twelve voices to AdLib's nine, and was a stereo card while the AdLib
was mono, the basic technology behind it was based on the Philips SAA 1099 chip which was
essentially a square-wave generator. It sounded much like twelve simultaneous PC speakers
would have, and failed to sell well, even after Creative renamed it the Game Blaster a year later,
and marketed it through Radio Shack in the US. The Game Blaster retailed for under $100 and
included the hit game Silpheed.
A large change in the IBM PC compatible sound card market happened when Creative Labs introduced the Sound Blaster card. The Sound Blaster cloned the AdLib and added a sound coprocessor for recording and playback of digital audio (likely an Intel microcontroller relabeled by Creative, incorrectly called a "DSP" to suggest it was a digital signal processor), a game port for adding a joystick, and the capability to interface with MIDI equipment (using the game port and a special cable). With more features at nearly the same
price, and compatibility as well, most buyers chose the Sound Blaster. It eventually outsold the
AdLib and dominated the market.
The Sound Blaster line of cards, together with the first inexpensive CD-ROM drives and
evolving video technology, ushered in a new era of multimedia computer applications that could
play back CD audio, add recorded dialogue to computer games, or even reproduce motion video
(albeit at much lower resolutions and quality in early days). The widespread decision to support
the Sound Blaster design in multimedia and entertainment titles meant that future sound cards
such as Media Vision's Pro Audio Spectrum and the Gravis Ultrasound had to be Sound Blaster
compatible if they were to sell well. Until the early 2000s, by which time the AC'97 audio standard had become more widespread and eventually usurped the Sound Blaster as a standard due to its low cost and integration into many motherboards, Sound Blaster compatibility remained a standard that many other sound cards supported to maintain compatibility with the many games and applications that had been released.
Industry adoption
When game company Sierra On-Line opted to support add-on music hardware (instead of built-in hardware such as the PC speaker and the built-in sound capabilities of the IBM PCjr and Tandy
1000), what could be done with sound and music on the IBM PC changed dramatically. Two of
the companies Sierra partnered with were Roland and Adlib, opting to produce in-game music
for King's Quest 4 that supported the Roland MT-32 and Adlib Music Synthesizer. The MT-32
had superior output quality, due in part to its method of sound synthesis as well as built-in
reverb. Since it was the most sophisticated synthesizer they supported, Sierra chose to use most
of the MT-32's custom features and unconventional instrument patches, producing background
sound effects (e.g., chirping birds, clopping horse hooves, etc.) before the Sound Blaster brought
playing real audio clips to the PC entertainment world. Many game companies also supported the
MT-32, but supported the Adlib card as an alternative because of the latter's higher market base.
The adoption of the MT-32 led the way for the creation of the MPU-401/Roland Sound Canvas
and General MIDI standards as the most common means of playing in-game music until the mid-1990s.
Feature evolution
Early ISA bus soundcards were half-duplex, meaning they couldn't record and play digitized sound simultaneously, mostly due to inferior card hardware (e.g., DSPs). Later, ISA cards like the SoundBlaster AWE series and Plug-and-Play SoundBlaster clones eventually became full-duplex and supported simultaneous recording and playback, but at the expense of using up two IRQ and DMA channels instead of one, making them no different from having two half-duplex sound cards in terms of configuration. Towards the end of the ISA bus' life, ISA soundcards started taking advantage of IRQ sharing, reducing the IRQs needed to one, but they still needed two DMA channels. Many PCI bus cards do not have these limitations and are mostly full-duplex; many modern PCI bus cards also do not require free DMA channels to operate.
Also, throughout the years, soundcards have evolved in terms of digital audio sampling rate
(from 8-bit, 11.025 kHz to the 32-bit, 192 kHz that the latest solutions support). Along the way, some cards started offering wavetable synthesis, which provides superior MIDI synthesis quality relative to the earlier OPL-based solutions, which use FM synthesis. Also, some higher
end cards started having their own RAM and processor for user-definable sound samples and
MIDI instruments as well as to offload audio processing from the CPU.
For years, soundcards had only one or two channels of digital sound (most notably the Sound
Blaster series and their compatibles) with the exception of the E-MU card family, which had
hardware support for up to 32 independent channels of digital audio. Early games and MOD players needing more channels than a card could support had to resort to mixing multiple channels in software. Even today, the tendency is still to mix multiple sound streams in software,
except in products specifically intended for gamers or professional musicians, with a considerable difference in price from "software based" products. Also, in the early era of wavetable synthesis,
soundcard companies would also sometimes boast about the card's polyphony capabilities in
terms of MIDI synthesis. In this case polyphony refers solely to the number of MIDI notes the card is capable of synthesizing simultaneously and not the number of digital audio streams the card is capable of handling.
In regards to physical sound output, the number of physical sound channels has also increased.
The first soundcard solutions were mono. Stereo sound was introduced in the early 1990s, and quadraphonic sound followed later in the decade. This was shortly followed by 5.1 channel audio. The latest
soundcards support up to 8 physical audio channels in the 7.1 speaker setup.
Audio File formats and CODECs
An audio file format is a file format for storing digital audio data on a computer system. This data can be
stored uncompressed, or compressed to reduce the file size. It can be a raw bitstream, but it is usually a
container format or an audio data format with a defined storage layer.
Types of formats
It is important to distinguish between a file format and an audio codec. A codec performs the encoding and decoding of the raw audio data, while the data itself is stored in a file with a specific audio file format. Most publicly documented audio file formats can be created with two or more different encoders or codecs. Although most audio file formats support only one type of audio data (created with an audio codec), a multimedia container format (such as Matroska or AVI) may support multiple types of audio and video data.
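One way to see the format/codec distinction in practice is to inspect a container's header without touching the encoded audio at all. A small sketch using Python's standard wave module, which reads PCM WAV containers ('example.wav' is a placeholder file name):

    import wave

    # Inspect the container, not the codec: the file format records
    # how the audio data is laid out.
    with wave.open('example.wav', 'rb') as w:
        print('channels:   ', w.getnchannels())
        print('sample rate:', w.getframerate(), 'Hz')
        print('bit depth:  ', w.getsampwidth() * 8, 'bits')
        print('duration:   ', w.getnframes() / w.getframerate(), 's')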
There are three major groups of audio file formats:
 Uncompressed audio formats, such as WAV, AIFF, AU or raw header-less PCM;
 Formats with lossless compression, such as FLAC, Monkey's Audio (filename extension APE), WavPack (filename extension WV), TTA, ATRAC Advanced Lossless, Apple Lossless (filename extension m4a), MPEG-4 SLS, MPEG-4 ALS, MPEG-4 DST, Windows Media Audio Lossless (WMA Lossless), and Shorten (SHN);
 Formats with lossy compression, such as MP3, Vorbis, Musepack, AAC, ATRAC and Windows Media Audio Lossy (WMA lossy).
Uncompressed audio formats
There is one major uncompressed audio format, PCM, which is usually stored in a .wav file on Windows or in a .aiff file on Mac OS. The AIFF format is based on the Interchange File Format (IFF). The WAV format is based on the Resource Interchange File Format (RIFF), which is similar to IFF. WAV and AIFF are flexible file formats designed to store more or less any combination of sampling rates and bit depths. This makes them suitable file formats for storing and archiving an original recording.
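As an illustration of how simple the PCM/WAV combination is, the following Python sketch writes one second of a 440 Hz sine tone as a 16-bit mono .wav file (the output file name is arbitrary):

    import math, struct, wave

    # One second of a 440 Hz tone as 16-bit mono PCM at 44.1 kHz.
    rate = 44100
    frames = b''.join(
        struct.pack('<h', int(32767 * 0.5 *
                              math.sin(2 * math.pi * 440 * n / rate)))
        for n in range(rate))

    with wave.open('tone.wav', 'wb') as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 2 bytes per sample = 16-bit
        w.setframerate(rate)
        w.writeframes(frames)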
BWF (Broadcast Wave Format) is a standard audio format created by the European Broadcasting
Union as a successor to WAV. BWF allows metadata to be stored in the file. See European
Broadcasting Union: Specification of the Broadcast Wave Format (EBU Technical document
3285, July 1997). This is the primary recording format used in many professional audio
workstations in the television and film industry. BWF files include a standardized timestamp
reference which allows for easy synchronization with a separate picture element. Stand-alone, file-based, multi-track recorders from Sound Devices,[1] Zaxcom,[2] HHB USA,[3] Fostex, and Aaton[4] all use BWF as their preferred format.
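The timestamp lives in a 'bext' chunk inside the file's RIFF structure. The sketch below scans for that chunk and returns its 64-bit TimeReference field (samples since midnight), following the field layout given in EBU Tech 3285; it is a minimal reader for illustration, not a full BWF parser (it ignores RF64 and other extensions):

    import struct

    def bwf_time_reference(path):
        """Return the TimeReference from a BWF file's 'bext' chunk,
        or None if the chunk is absent."""
        with open(path, 'rb') as f:
            riff, _, wave_id = struct.unpack('<4sI4s', f.read(12))
            if riff != b'RIFF' or wave_id != b'WAVE':
                raise ValueError('not a RIFF/WAVE file')
            while True:
                header = f.read(8)
                if len(header) < 8:
                    return None
                chunk_id, size = struct.unpack('<4sI', header)
                if chunk_id == b'bext':
                    body = f.read(size)
                    # TimeReference sits at byte offset 338 of the chunk,
                    # after the Description/Originator/date/time fields.
                    return struct.unpack('<Q', body[338:346])[0]
                f.seek(size + (size & 1), 1)   # RIFF chunks are word-aligned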
A .cda (Compact Disc Audio track) file is a small file that serves as a shortcut to the audio data for a track on a music CD. It does not contain audio data and is therefore not considered a proper audio file format.
Lossless compressed audio formats
A lossless compressed format stores data in less space by eliminating redundant data. It requires more processing power both to compress the data and to uncompress it for playback.
Uncompressed audio formats encode both sound and silence with the same number of bits per
unit of time. Encoding an uncompressed minute of absolute silence produces a file of the same
size as encoding an uncompressed minute of music. In a lossless compressed format, however,
the music would occupy a smaller portion of the file and the silence would take up almost no
space at all.
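You can demonstrate the principle with any lossless compressor. The sketch below uses Python's zlib, a general-purpose compressor rather than an audio codec, so treat the exact numbers as illustrative only:

    import os, zlib

    # A minute of 16-bit mono "silence" shrinks to almost nothing,
    # while random data barely compresses at all.
    minute = 44100 * 60 * 2                  # bytes in one minute of audio
    silence = bytes(minute)                  # all-zero samples
    noise = os.urandom(minute)               # incompressible worst case

    print(len(zlib.compress(silence)))       # a few kilobytes
    print(len(zlib.compress(noise)))         # roughly the original 5.3 MB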
Lossless compression formats enable the original uncompressed data to be recreated exactly. They include the common[5] FLAC, WavPack, Monkey's Audio, and ALAC (Apple Lossless) formats. They typically provide a compression ratio of about 2:1 (i.e. their files take up half the space of the originals). Development in lossless compression formats aims to reduce processing time while maintaining a good compression ratio.
Lossy compressed audio formats
Lossy compression enables even greater reductions in file size by removing some of the data. A
variety of techniques are used, mainly by exploiting psychoacoustics, to remove data with
minimal reduction in the quality of reproduction. For many everyday listening situations, the loss
in data (and thus quality) is imperceptible. The popular MP3 format is probably the best-known example, but the AAC format is another common one. Most formats offer a range of degrees of compression, generally measured in bit rate. The lower the rate, the smaller the file and the greater the quality loss.
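Because the bit rate is a data rate, the size of a constant-bit-rate encode is simple arithmetic: bits per second times duration. A quick Python sketch (the function name is ours):

    def encoded_size_mb(bit_rate_kbps, seconds):
        """Approximate size of a constant-bit-rate lossy encode."""
        return bit_rate_kbps * 1000 / 8 * seconds / 1_000_000

    # A four-minute track at common MP3 bit rates:
    for kbps in (96, 128, 192, 320):
        print(f'{kbps} kbps -> {encoded_size_mb(kbps, 240):.1f} MB')
    # 96 kbps -> 2.9 MB ... 320 kbps -> 9.6 MB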
Audio Recording Systems
Sound recording and reproduction is an electrical or mechanical inscription and re-creation of
sound waves, such as spoken voice, singing, instrumental music, or sound effects. The two main
classes of sound recording technology are analog recording and digital recording. Acoustic analog recording is achieved by a small microphone diaphragm that detects changes in atmospheric pressure (acoustic sound waves) and records them as a graphic representation of the sound waves on a medium such as a phonograph record (in which a stylus cuts grooves into the disc during recording and traces them during playback).
In magnetic tape recording, the sound waves vibrate the microphone diaphragm and are
converted into a varying electric current, which is then converted to a varying magnetic field by
an electromagnet, which makes a representation of the sound as magnetized areas on a plastic
tape with a magnetic coating on it. Analog sound reproduction is the reverse process, with a
bigger loudspeaker diaphragm causing changes to atmospheric pressure to form acoustic sound
waves. Electronically generated sound waves may also be recorded directly from devices such as
an electric guitar pickup or a synthesizer, without the use of acoustics in the recording process
other than the need for musicians to hear how well they are playing during recording sessions.
Digital recording and reproduction converts the analog sound signal picked up by the
microphone to a digital form by a process of digitization, allowing it to be stored and transmitted
by a wider variety of media. Digital recording stores audio as a series of binary numbers
representing samples of the amplitude of the audio signal at equal time intervals, at a sample rate
high enough to convey all sounds capable of being heard. Digital recordings are considered
higher quality than analog recordings not necessarily because they have higher fidelity (wider
frequency response or dynamic range), but because the digital format can prevent much loss of
quality found in analog recording due to noise and electromagnetic interference in playback, and
mechanical deterioration or damage to the storage medium. A digital audio signal must be
reconverted to analog form during playback before it is applied to a loudspeaker or earphones.
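The core of digitization, sampling at equal time intervals and quantizing each sample to an integer, fits in a few lines of Python. A sketch using a 440 Hz sine wave as a stand-in for the analog signal:

    import math

    # Sample a 440 Hz "analog" sine wave at equal time intervals,
    # then quantize each sample to a 16-bit integer.
    sample_rate = 44100                       # samples per second
    samples = [round(32767 * math.sin(2 * math.pi * 440 * n / sample_rate))
               for n in range(10)]            # first ten samples only
    print(samples)   # the binary numbers a digital recorder stores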
Audio and Multimedia
Multimedia is media and content that uses a combination of different content forms. The term
can be used as a noun (a medium with multiple content forms) or as an adjective describing a
medium as having multiple content forms. The term is used in contrast to media which use only
rudimentary computer display such as text-only, or traditional forms of printed or hand-produced
material. Multimedia includes a combination of text, audio, still images, animation, video, and interactive content forms.
Examples of individual content forms combined in multimedia:
 Text
 Audio
 Still images
 Animation
 Video footage
 Interactivity
Multimedia is usually recorded and played, displayed or accessed by information content
processing devices, such as computerized and electronic devices, but can also be part of a live
performance. Multimedia (as an adjective) also describes electronic media devices used to store
and experience multimedia content. Multimedia is distinguished from mixed media in fine art; by
including audio, for example, it has a broader scope. The term "rich media" is synonymous with interactive multimedia. Hypermedia can be considered one particular multimedia application.
Voice Recognition and Response
Interactive voice response (IVR) is a technology that allows a computer to interact with
humans through the use of voice and DTMF keypad inputs.
In telecommunications, IVR allows customers to interact with a company’s database via a
telephone keypad or by speech recognition, after which they can service their own inquiries by
following the IVR dialogue. IVR systems can respond with prerecorded or dynamically
generated audio to further direct users on how to proceed. IVR applications can be used to
control almost any function where the interface can be broken down into a series of simple
interactions. IVR systems deployed in the network are sized to handle large call volumes.
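At its core, an IVR dialogue is a menu tree keyed by the digits a caller presses. The toy Python sketch below models only that structure; the prompts and menu layout are invented for illustration, and a real system would play recorded audio and decode DTMF tones from the line:

    # A toy IVR dialogue as a keypad-driven menu (hypothetical prompts).
    MENU = {
        '':   'For billing press 1, for support press 2.',
        '1':  'Billing: press 1 for balance, 2 for payments.',
        '2':  'Connecting you to a support agent...',
        '11': 'Your balance would be read out here.',
        '12': 'Payment options would be read out here.',
    }

    def ivr(keys_pressed):
        path = ''
        print(MENU[path])                     # opening prompt
        for key in keys_pressed:
            path += key
            print(MENU.get(path, 'Sorry, that is not a valid option.'))

    ivr(['1', '1'])   # caller presses 1, then 1 -> balance prompt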
IVR technology is also being introduced into automobile systems for hands-free operation.
Current deployment in automobiles revolves around satellite navigation, audio and mobile phone
systems.
It has become common in industries that have recently entered the telecommunications sector to refer to an automated attendant as an IVR. The terms are distinct, however, and mean different things to traditional telecommunications professionals, whereas emerging telephony and VoIP professionals often use the term IVR as a catch-all for any kind of telephony menu, even a basic automated attendant. The term voice response unit (VRU) is sometimes used as well.
Audio Processing Software
Typical Audio Editing Applications
 Software audio editing for studios and professional journalists.
 Editing sound files to broadcast over the internet with the BroadWave Streaming Audio Server.
 Normalizing the level of audio files during mastering, before burning to CD (see the sketch after this list).
 Editing mp3 files for an iPod, PSP or other portable device.
 Music editing, including the creation of ringtones.
 Music editing and recording to produce mp3 files.
 Voice editing for multimedia productions, alongside a video editor.
 Restoration of audio files, including the removal of excess noise such as hiss and hum.
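As an example of the normalization task mentioned above, the sketch below peak-normalizes a 16-bit WAV file using Python's standard wave and audioop modules (note that audioop was removed in Python 3.13; the file names are placeholders):

    import audioop   # available up to Python 3.12
    import wave

    # Peak-normalize: scale every sample so the loudest one just
    # reaches full scale for 16-bit audio.
    with wave.open('input.wav', 'rb') as src:
        params = src.getparams()
        data = src.readframes(src.getnframes())

    peak = audioop.max(data, 2)              # loudest absolute sample
    factor = 32767 / peak if peak else 1.0
    louder = audioop.mul(data, 2, factor)    # scale all samples

    with wave.open('normalized.wav', 'wb') as dst:
        dst.setparams(params)
        dst.writeframes(louder)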