Mid-term review

Physics of sound, part 1

What to focus on:

When a sound is created in the environment:
1. Something vibrates (our vocal cords, a machine, a musical instrument, surfaces rubbing together when we move, etc.)
2. The air (or sometimes another fluid medium, such as water) is disturbed by the vibration, resulting in pressure waves
3. The pressure waves ‘wash over’ our ears (or the hearing apparatus of other creatures) and are interpreted as sounds by our brain circuitry, or the creature’s (more on this later)

Representation of sound as an oscillogram:
- When the oscillogram line passes above the center line, it represents an increase in atmospheric pressure (molecules compressed by the surge of the pressure wave)
- The center line represents zero change in atmospheric pressure (no disturbance in the environment)
- When the oscillogram line passes below the center line, it represents a decrease in atmospheric pressure (molecules decompressing after the surge of the pressure wave)
- The vertical axis represents the amplitude, or force, of the pressure wave: lower-amplitude parts of the sound are softer, higher-amplitude parts are louder
- The horizontal axis represents the timeline over which the pressure wave plays out, from the beginning (earlier part) of the sound to the end (later part)

Physics of hearing

Stage one:
- vibrating air enters the ear canal
- pressure waves cause the eardrum to vibrate sympathetically

[Figure: ear canal and eardrum; labels: 2 cm; sympathetic vibrations to incoming pressure waves]

dB SPL (a different reading from the dB scales on mixers, in sound editors, etc.)

[Figure: chart of dB SPL levels, including the threshold of pain]

Guard against hearing loss! Sound pressure levels at many concerts, in clubs and when wearing headphones can be deceptively high, and can cause temporary or permanent damage to parts of the outer, middle or inner ear.
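For reference, dB SPL expresses a sound pressure relative to a fixed reference pressure, conventionally 20 micropascals, which is why its readings differ from the meters on mixers and in sound editors. The slide itself does not give the formula, so the following is only a minimal sketch of the standard definition; the function name `pressure_to_db_spl` is hypothetical.

```python
import math

# Conventional reference pressure for dB SPL in air (20 micropascals).
REFERENCE_PRESSURE_PA = 20e-6


def pressure_to_db_spl(pressure_pa):
    """Convert an RMS sound pressure in pascals to dB SPL."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be greater than zero")
    return 20.0 * math.log10(pressure_pa / REFERENCE_PRESSURE_PA)


if __name__ == "__main__":
    # Each tenfold increase in pressure adds 20 dB.
    for pressure in (0.002, 0.02, 0.2, 2.0):
        print(pressure, "Pa ->", round(pressure_to_db_spl(pressure), 1), "dB SPL")
```

Because the scale is logarithmic, a reading that looks only slightly higher can correspond to a much larger pressure on the eardrum.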
Stage two:
- eardrum motion causes the attached malleus (hammer) bone to move, which acts as a lever for the incus (anvil) and the stapes (stirrup)

Stage three:
- the stapes bone vibrates against the oval window at the entrance of the cochlea
- the vibration (amplified because of the leverage between the three bones) causes a liquid pressure wave to move through the cochlea

Stage four:
- in the basilar membrane of the cochlea, tiny hair cells react to the liquid pressure wave
- humans have around 20,000 of these cells, each of which is a slightly different length, and each of which resonates with a different frequency in our hearing range
- each of the cells connects with nerve endings that then transmit precise frequency, spectrum and amplitude information electrically to the brain
- spatial information comes from the different readings each ear gets from the environment

Full auditory perception, including spectrum, then occurs.

It is important to note that we do not have ‘flat response’ hearing.
The red lines in the chart indicate the level at which a frequency must be played in order to sound to us as if it is at the level shown in the center column (threshold, 20, 40, 60, 80 and 100 phons, i.e. psychoacoustically equivalent levels):
- In the high range (5.5-20 kHz), increasing amounts of boost must be made for a frequency to be heard at an equal level (e.g. ~30 dB SPL = 20 phons at 17 kHz)
- In the low range (20-100 Hz), large amounts of boost must be made for a frequency to be heard at an equal level (e.g. ~60 dB SPL = 20 phons at 40 Hz)
- In the mid to upper-mid range (400 Hz-5 kHz), frequencies are either left at the same level (e.g. 20 dB SPL = 20 phons at 500 Hz) or reduced (e.g. ~12 dB SPL = 20 phons at 3-4 kHz)

[This is why I so often give notes about having to reduce mid-range frequencies in mixes]

Physics of sound, part 2

Representation of sound as a spectrogram / sonogram (sound analyzed as our ears do in Stage four):
- Like oscillograms, the sonogram's left-to-right axis shows the time placement of spectral elements
- Sonogram height shows the frequency of spectral elements (axis ticks: 20 Hz, 200 Hz, 500 Hz, 1 kHz, 5 kHz, 10 kHz, 20 kHz)
- Sonogram colors show the intensity of spectral elements, from softest to loudest

Sonogram A:
- In this sound, we see strong energy in the 100-300 Hz range, which is where the fundamentals of voices and many musical instruments sound
- Just above, in the 300-750 Hz range, are medium-intensity spectral frequencies that might be higher fundamentals or strong harmonics
- The weakest energy is in the 1-20 kHz range - very little there

Sonogram B:
- In this sound, the energy below 250 Hz is fairly weak
- The strongest energy is between 250 Hz and 4 kHz, and it is more ‘grainy’ and less sustained than in Sonogram A
- In the top (4-20 kHz) range there is medium-intensity energy, which means the sound has a lot of very high frequencies
Sonogram C:
- In this sound, the energy below 450 Hz is strong and sustained
- The next strongest area is between 500 Hz and 5 kHz, possibly higher components of a complex mix below
- In the top (5-20 kHz) range there is very weak energy, which indicates a faint high-frequency ‘sheen’ to the sound

Sonogram D:
- In this sound, the energy is completely evenly distributed all across the spectrum - something that rarely happens in natural sounds

Pitched and unpitched sounds:

Certain sounds, like the notes produced by musical instruments, have what is referred to as pitch. Pitched sounds are more neatly organized than other sounds, consisting of stable frequencies that reinforce one another because they are mathematically related in a simple way.
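One way to picture frequencies that are ‘mathematically related in a simple way’ is as whole-number multiples of a single fundamental. The following is a minimal sketch under that assumption, using an illustrative 450 Hz fundamental; the function names `harmonic_series` and `is_harmonic` are hypothetical.

```python
def harmonic_series(fundamental_hz, count):
    """Return the first `count` harmonics (1x, 2x, 3x, ...) of a fundamental."""
    if fundamental_hz <= 0 or count < 1:
        raise ValueError("fundamental must be positive and count at least 1")
    return [fundamental_hz * k for k in range(1, count + 1)]


def is_harmonic(partials_hz, fundamental_hz, tolerance_hz=1.0):
    """Check whether every partial lies near a whole-number multiple of the fundamental."""
    for partial in partials_hz:
        multiple = round(partial / fundamental_hz)
        if multiple < 1 or abs(partial - multiple * fundamental_hz) > tolerance_hz:
            return False
    return True


if __name__ == "__main__":
    print(harmonic_series(450, 4))             # [450, 900, 1350, 1800]
    print(is_harmonic([450, 900, 1350], 450))  # True: ordered harmonics
    print(is_harmonic([450, 1000, 1730], 450)) # False: partials do not line up
```

A pitched sound's partials would pass a check like `is_harmonic`, while the thick clusters of partials in an unpitched sound would not.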
Again, a zoomed-out oscillogram does not tell us whether a sound is pitched or unpitched. All we know is that the first sound (Sound A) has a sharp attack and then a consistent decay, and that the second sound (Sound B) has several peaks before decaying.

But a look at the sonograms for the two sounds immediately reveals which is a stable, organized pitch, and which isn’t:
- Sonogram A: a fundamental frequency plus harmonic multiples of the fundamental
- Sonogram B: thick clusters of partials not reinforcing any particular frequency

Pitched sounds have ordered harmonics and only cover specific parts of the spectrum:
  1st harmonic: ~450 Hz (freq2)
  2nd harmonic: 2 x 450 = 900 Hz
  3rd harmonic: 3 x 450 = 1350 Hz
  4th harmonic: 4 x 450 = 1800 Hz (1.8 kHz)
  then 5x, 6x, 7x, 8x, 9x, 10x, 11x, etc.

Unpitched sounds (e.g. the ‘hits’ and hiss in this file) cover full stretches of the spectrum.

Parts of a zoomed-in oscillogram (not the attack) can give us clues about whether the sound is pitched or unpitched:
- A regularly repeating wave cycle will generate a stable pitch
- An irregular waveform, which contains little or no repetition, will not generate a stable pitch

Pitched sounds can also be stable (e.g. the examples looked at so far) or unstable (e.g. a violin note followed by a glissando and a note with vibrato).

[Figure: sonogram of a violin note, gliss and vibrato note]

Human speech combines pitched and unpitched sounds:
- The hard (unvoiced) consonants have complex spectra and aren’t organized like pitches
- The vowels and soft (voiced) consonants are much louder and have pitch-like organizations, often gliding

[Figure: oscillogram and sonogram of the speech excerpt “The fourth anniversary of the nuclear misunderstanding…”, with each syllable labeled]

In the oscillogram, the slower / lower-frequency voiced sounds appear less compressed and the faster / higher-frequency unvoiced sounds appear more compressed, which is confirmed by the sonogram of the same excerpt.

Spatial behavior / perception of sound

Especially in indoor environments, the sound we hear is not only the direct vibrations made by the object, person or process, but also a number of echoes and reverberations laid on top of the direct sound, the nature of which is generally referred to as (room) acoustics. The time it takes for reverberated sound to die down to below the threshold of hearing is called the reverb decay, and is given in seconds or milliseconds [SWP; OPCH].

Concert halls and other listening environments are (or were) designed to create a pleasing, but not too confusing, reverberation, generally in the range of about 1.5-2.0 seconds:
- Esterházy Palace Hall, Austria, where most of the classical composer Haydn’s works were performed in his lifetime. Reverb time: 1.2 seconds
- Grosser Musikvereinssaal, Vienna, a major European concert hall. Reverb time: 2.0 seconds
- Powell Hall, St. Louis. Reverb time: 2.2 seconds

Recording studios, on the other hand, are designed with shorter reverb times, because this allows for more control of recorded signals.
But contrary to what many people think, a completely ‘dead’ studio acoustic (i.e., reverb time < 0.3 sec) was only ever considered the ideal in the late 70s and 80s. Today, most studio control rooms have reverb times of 0.5-0.8 sec, while recording rooms vary from 0.3 sec (e.g., booths for recording voiceovers) to rooms that can be ‘opened up’ to over 1.0 sec when carpeting and other acoustic baffling are removed.

Arenas and other large concert venues not designed for sound present the challenge of very long reverb times (> 3-4 seconds) that can confuse listeners with too much overlap. This is usually dealt with by providing as much direct sound to as many parts of the venue as possible.

[Figure: speaker clusters for a stadium rock concert]

MICROPHONES

Recording the Voice: Vowels and Consonants

  Frequencies          % Contribution to speech intelligibility
  16 - 400 Hz           7.2
  400 - 800 Hz         14.4
  800 - 1600 Hz        22.2
  1600 - 3200 Hz       32.8
  3200 - 20,000 Hz     23.4

[Figure labels: Vowels, Consonants]

DIGITAL AUDIO RECORDING

Signal chain:
1. INPUT ANALOG SIGNAL (the voltage stream from a microphone, the line-level stream from playback devices or keyboards passing through a mixer, etc.)
2. ANTI-DISTORTION FILTERING (required to eliminate aliasing - more on this in a moment)
3. SAMPLE AND HOLD (linked to sampling rate and resolution)
4. ANALOG TO DIGITAL CONVERSION (the stream of numbers created according to the sampling rate & resolution)
5. DATA STORAGE (on hard drive, in live RAM or using other storage media)
6. DIGITAL TO ANALOG CONVERSION (the crucial process of remaking the waveform from the stream of numbered samples, the quality and accuracy of which is linked to the sampling rate, resolution and other factors)
7. The resulting analog electrical waveform
8. The sound we hear through speakers or headphones

[Anti-aliasing low-pass filters are set relative to 1/2 the sampling rate, to eliminate problematic high frequencies]

All frequencies above 1/2 the sampling rate must be eliminated to avoid distortion. E.g. at a 44.1 kHz sampling rate, a 25 kHz frequency (about 3 kHz above the half-sampling-rate point of 22.05 kHz) distorts at about 19 kHz (about 3 kHz below it).

[Figure: filter response, amplitude against frequency, showing the cutoff frequency with 19 kHz (-3 k) and 25 kHz (+3 k) marked]

Digital audio recording can be thought of as a series of ‘snapshots’ that the digital recorder takes of the sound wave (sample & hold), producing a stream of numbers according to the sampling rate & digital resolution.

Sampling rate = number of digital ‘snapshots’ taken of the sound every second

Common sampling rates:
- 44.1 kHz (CD audio)
- 48 kHz (video & DAT)
- 96 kHz (high-end digital recording)

[ProTools LE sessions (i.e. recording & editing) can only be in 44.1 kHz or 48 kHz; however, when bouncing (rendering) audio, ProTools LE offers a wide variety of sampling rates, from ‘lo-fi’ 8 kHz to ultra-high-end 192 kHz]

Sampling resolution = number range available for representing amplitude

Because digital memory is built on switch-like binary bits, the resolution of each sample taken is determined by the bit depth (number of bits per sample) of the digital recording:
- 1 bit provides a resolution of 2 binary numbers (0 or 1)
- 4 bits provide a resolution of 16 binary numbers (0-0-0-0; 0-0-0-1; 0-0-1-0; etc.)
- 8 bits (still heard in some digital samples on the Internet and in older video games) provide a resolution of 256 binary numbers
- 16 bits (standard for CDs and most digital sound) provide a resolution of 65,536 binary numbers
- 24 bits (now the industry standard for high-fidelity sound recording & reproduction) provide a resolution of 16,777,216 binary numbers

[ProTools LE sessions and rendering / bouncing can be in 8, 16 or 24-bit resolution]

Signal processing

In-line effects

Definition: an effect through which the signal on a track or mixer channel passes directly en route to the output. The ‘raw’ signal is transformed by the effect, and cannot be heard as is unless the in-line effect is bypassed.

[Diagram: Signal -> In-line effect (or bypass) -> Output]

In some situations (recording studios, film shoots) external effects units are still used. These are of two basic types:
- Rack-mounted effects devices, patched into a patch bay and routed through the mixing board
- Pedal or ‘stomp box’ external effects, patched directly from a guitar, keyboard or other electric instrument to a direct box, and then into the mixing console via patched connections. Pedals often require external noise reduction devices to make the processed signal they output usable in a mix.

But most effects used today are plug-ins: sub-routines within digital audio editors.

Equalizers (EQ)

Manipulation of the spectral color of a sound through phase cancelation (cut) or reinforcement (boost) within certain frequency bands

EQ types:
- Rolloff (low-pass and high-pass filters)
- Peak / dip (notch filter)
- Shelving

Key points:
- Pro engineers and mixers always advise using EQ to CUT rather than BOOST frequencies.
- EQs are not designed to act as effective preamps, which is what they do when they boost certain frequency bands. Usually you boost conservatively: a few dB or so. But extreme EQ can sometimes work - experiment!
- Because of the way human hearing works, mid-range frequencies (which our ears are hypersensitive to) almost always have to be cut to avoid a ‘crowded’ feeling in mixes.

Some examples of EQ use in music:
- To establish clear ‘frequency areas’ in the mix that don’t conflict and don’t mask other sounds
- To bring out the most characteristic part of a sound, so that it is present in the mix without being overbearing, e.g.:
  Electric piano <--> Electric guitar
  Bass drum <--> Lower register bass
  Low toms <--> Upper register bass
  Voice(s) <--> Synth, saxophone

Variable Gain Amplifiers

1. Compressors

Definition: an automatic volume (a.k.a. ‘gain’, ‘level’, ‘amplitude’) control. Compressor/limiters are the only signal processors generally used during recording itself, as a safeguard against distortion. A compressor works by reducing gain by a certain ratio when the signal goes above a set threshold.

Example: a compressor with a ratio of 2:1 and a threshold of -10 dB will reduce a -2 dB signal at the input to -6 dB at the output (1/2 x 8 dB over threshold = 4 dB over threshold).

[Figure: input and output level meters marked -2 dB, -6 dB, -10 dB and -14 dB]

- No compression: -2 dB in = -2 dB out
- With compression (2:1 ratio & -10 dB threshold): the 8 dB of signal above the threshold is halved by the compressor (a reduction of 4 dB), so the compressed output is -6 dB

2. Limiters

Definition: heavy compression (10:1 ratio or greater; ProTools’ default is 100:1). Severely cuts gain above the set threshold. Much more obvious to the ear than compression. Often used as a ‘safety’ to prevent clipping above a certain limit, for broadcast signals, or for dynamically limited media like cassettes.
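The threshold-and-ratio arithmetic in the compressor example above can be sketched in a few lines of Python. This is only a minimal sketch of an idealized, static gain calculation in dB (no attack, release or knee behavior); the function name `compress_db` and its parameters are hypothetical and are not taken from ProTools or any particular plug-in.

```python
def compress_db(input_db, threshold_db, ratio):
    """Return the output level in dB for an idealized compressor.

    Below the threshold the level passes through unchanged; above it,
    the amount over the threshold is divided by the ratio.
    """
    if ratio < 1:
        raise ValueError("ratio must be 1 (no compression) or greater")
    if input_db <= threshold_db:
        return float(input_db)
    over_threshold_db = input_db - threshold_db
    return threshold_db + over_threshold_db / ratio


if __name__ == "__main__":
    # 2:1 ratio, -10 dB threshold, -2 dB input: 8 dB over becomes 4 dB over.
    print(compress_db(-2, -10, 2))   # -6.0
    # A signal below the threshold is left alone.
    print(compress_db(-14, -10, 2))  # -14.0
```

A limiter, in the sense defined above, is the same calculation with a ratio of 10:1 or greater, so the output barely rises above the threshold however loud the input gets.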
Limiters are also used to reduce pops & clicks.

Example: a compressor with a ratio of 100:1 and a threshold of -10 dB will reduce a -2 dB signal at the input to -9.92 dB at the output (1/100 x 8 dB over threshold = 0.08 dB over threshold).

[Figure: input and output level meters marked -2 dB, -6 dB, -10 dB and -14 dB]

- No limiting: -2 dB in = -2 dB out
- With severe limiting (100:1 ratio & -10 dB threshold): the 8 dB of signal above the threshold is reduced to 1/100th of its size by the limiter, so the limited output is -9.92 dB

3. Gates

Definition: does the opposite of a limiter. When the signal falls below a set threshold, the level is abruptly cut to a specified level. Used to eliminate unwanted low-level sounds in speech, music and other recordings (e.g. to eliminate headphone or other leakage on music tracks). Also used for special effects like gated reverb.

[Figure: level meters marked -2 dB, -6 dB, -10 dB and -14 dB]

- With no gating, the signal decays at a natural rate
- With severe gating (100:1 ratio & -10 dB threshold), the gain is reduced dramatically (to 1/100th of the original) by the gate when the threshold is reached

4. Expanders

Definition: does the opposite of a compressor. When the signal falls below a set threshold, the level is cut according to the set ratio. Also used to eliminate unwanted low-level sounds in speech, music and other recordings, but somewhat more gently. Also used for special effects like gated reverb.

[Figure: level meters marked -2 dB, -6 dB, -10 dB and -14 dB]

- With no expander applied, the signal decays at a natural rate
- With an expander in-line using a 3:1 ratio, the signal decays three times more rapidly than the original once the threshold is reached

5. De-essers

Definition: a limiter that is only applied within a set frequency range and at a set amplitude threshold.
All frequencies in the de-esser's range are sharply attenuated (ProTools’ default is 100:1). Normal use is to reduce the harshness of hard consonants (‘s’, ‘t’, ‘tch’, etc.). The typical frequency range for this purpose is 6-7 kHz and up. In the settings shown, as soon as a consonant above 7 kHz reaches 0 dB, sharp limiting will occur to those frequencies only.

Send/Return effects

Definition: an effect added to the direct signal on a track or mixer channel via an auxiliary send/return prior to the output. The amount of effect can be controlled by adjusting the auxiliary send and/or return levels on the mixer.

[Diagram: Signal, Send effect, Output]

Reverb

Definition: a device or plug-in that simulates acoustic reverberation, a series of complex sound reflections varying with the size, proportions and reflective surfaces of an interior space. Digital reverbs simulate acoustic spaces by providing programmable parameters such as room size, room type, decay, etc. The usual use is as a send/return effect, but reverb can also be used as an in-line effect if RAM permits. Playing with the mix between direct signal and reverb then becomes a consideration.

Delay

Definition: a device or plug-in that delays the input signal by a set number of milliseconds. In addition to the delay itself, a feedback control feeds the delayed signal back into itself to varying degrees (expressed as percentages), generating diminishing or infinite repeats of the delay. Most digital delays also include a filter to make the repeated or single delays sound ‘duller’, similar to a real acoustic echo. ProTools’ delay also includes modulation, a cyclic pitch variation of the delayed signal used to create chorus and flange effects. The delay repetitions can also be synced to a tempo for musical purposes.

Delay uses:
- A short delay (less than 40 ms) acts a bit like an EQ, in that it reinforces or cancels certain frequencies because of the phase differences it causes.
- Slap echo is obtained with slightly longer delay times (40 to 170 ms), and adds a type of ‘ambience’ to the processed sound.
- Medium (340-680 ms) and longer (680 ms and up) delays are generally used for musical and special effects. The repetitions of the sound are distinctly audible, and tempo sync considerations come into play.
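The delay-plus-feedback behavior described above can be sketched as a simple feedback delay line. This is a minimal illustration in plain Python, not how ProTools or any specific plug-in implements it; the names `ms_to_samples` and `feedback_delay` and their parameters are hypothetical, and the filter and modulation stages are left out.

```python
def ms_to_samples(delay_ms, sample_rate=44100):
    """Convert a delay time in milliseconds to a whole number of samples."""
    return int(round(delay_ms * sample_rate / 1000.0))


def feedback_delay(dry, delay_samples, feedback=0.5, wet_level=1.0, tail_samples=0):
    """Mix a signal with delayed repeats of itself.

    feedback is the fraction (0.0 to 1.0) of the delayed signal fed back
    into the delay: 0.0 gives a single repeat, values below 1.0 give
    diminishing repeats, and 1.0 gives repeats that never die away.
    """
    if delay_samples < 1:
        raise ValueError("delay must be at least one sample")
    if not 0.0 <= feedback <= 1.0:
        raise ValueError("feedback must be between 0.0 and 1.0")
    length = len(dry) + tail_samples
    wet = [0.0] * length   # output of the delay line
    out = [0.0] * length   # dry signal plus delayed signal
    for n in range(length):
        direct = dry[n] if n < len(dry) else 0.0
        if n >= delay_samples:
            source = n - delay_samples
            delayed_input = dry[source] if source < len(dry) else 0.0
            wet[n] = delayed_input + feedback * wet[source]
        out[n] = direct + wet_level * wet[n]
    return out


if __name__ == "__main__":
    # A single click, delayed by 2 samples with 50% feedback.
    print(feedback_delay([1.0], delay_samples=2, feedback=0.5, tail_samples=6))
    # [1.0, 0.0, 1.0, 0.0, 0.5, 0.0, 0.25]
    print(ms_to_samples(100))  # 4410 samples at 44.1 kHz
```

With a delay of only a few samples the repeats overlap the direct sound and color its spectrum, while delay times in the hundreds of milliseconds produce the distinctly audible repetitions described in the last item.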