Richard Dobson
Dr Archer Endrich
Composers Desktop Project

CAS Wiltshire Hub
Kingdown School, Warminster
14 November 2012

The Science of Sound – a Micro-history

We stand on the shoulders of many giants.

“Musical training is a more potent instrument than any other, because rhythm and harmony find their way into the inward places of the soul, on which they mightily fasten, imparting grace, and making the soul of him who is rightly educated graceful.”
Plato

Pythagoras, Guido d’Arezzo, Hermann von Helmholtz, Max Mathews

Some topics in SMC
• Digital Audio – sampling, synthesis, processing
• Music Representation and Analysis
• Performance and Interactive Composition
• Languages for Music – Algorithmic Composition
• Software and Hardware Design
• Acoustics and Psychoacoustics
• Sonification and Audification

The Shapes of Sound

A sound wave is bipolar.
• A wave comprises alternating displacements from a central “zero” position.
• For a sound wave the zero line corresponds to silence.
• Displacements are both positive and negative, and should sum to zero.
(Figure label: area above = area below)

Sampling Sound 1
• The overall process is generally called “digitising”.
• Two aspects: we need to digitise both amplitude and time.
• Quantisation of amplitude to discrete levels (represented in N-bit words).
• Sampling properly refers strictly to discretising of time (sampling rate).
• (Hence technical literature will refer to “periodic sampling”, “discrete-time”, etc.)
• Quantisation introduces “quantisation error”, which manifests as “quantisation noise”.
• Sampling depends on a very accurate clock. Errors in timing are known as “jitter”; not something we need to worry about.
• Soundcard clocks are based on crystals, just as CPUs are.
• Nothing is perfect; one 44100 Hz clock may not exactly match another. Independent devices will drift out of sync over time.

Sound Example 1 – quantisation noise for N = 16, 12, 10, 8, 6, 4, 2, 1

Quantisation – the Challenge

Integers: N bits gives us $2^N$ levels, an even number.
Where is the middle?

Standard quantisation is called “mid-tread”:
qval = floor(val + 0.5)
• Includes a zero-valued sample
• = two's complement arithmetic
• Asymmetrical: e.g. 16-bit range = -32768 to +32767
• Tiny values quantise to zero, so are lost
• Standard choice for audio codecs
(Figure: 4-bit quantisation)

The alternative is “mid-rise” quantisation:
qval = floor(val) + 0.5
• No zero value
• Symmetric for all level values
• Bipolar one-bit quantisation possible
• Tiny values quantise to a quasi square wave

Sampling Sound 2: Nyquist

The “modified” Nyquist-Shannon sampling theorem:
$f_{\mathrm{input}} < \frac{sr}{2}$

“Perfect reconstruction” requires phase independence. Sampling an input exactly at sr/2:
• Cosine phase: amplitude = 1 (what the textbooks usually show)
• Sine phase: amplitude = 0 (what the textbooks usually don't show)

For perfect reconstruction, input frequencies must be below Nyquist. The Nyquist limit itself (sr/2) defines the onset of frequency aliasing, where the Nyquist frequency aliases with DC. Put another way: we need more than two samples per cycle.

Sampling Sound 3: anti-aliasing

Aliasing is now impossible to demonstrate using consumer hardware.
(Figure labels: The crystal; This is a Cirrus Logic 8-channel 192 kHz Sigma-Delta oversampling ADC)
• Anti-alias filter is integrated into the device
• Even cheap chips do this now
• So whatever sample rate you set, the input is correctly filtered

Examples of aliasing

To sample analogue audio without anti-alias filters we can use older types of ADC (e.g. using the method of successive approximation), or industrial data acquisition systems.

Sound Examples 2
These examples were prepared by Dr R.W. Stewart, University of Strathclyde, for his CDROM project “DSPedia”.
• On the other hand, aliasing is very easy to demonstrate using digital sound synthesis.
• The dominant sources of aliasing these days are synthesis and processing, not recording.

Aliasing - a synthetic example
• We need a program to generate a plain sine frequency sweep (“chirp signal”) and write it to a sound file.
• Use a low sampling rate: 11025 Hz
• Let the sweep rise to an extreme value: 16000 Hz!
• Listen to it…
• And view it in the frequency domain (Audacity)

Sound Example 3

Reconstruction : the Digital to Analogue Converter
• The DAC is strangely absent from most CS curricula
• but it is more important than the ADC:
• We can manage without audio input but not without audio output!
• With oversampling (as in the ADC), the final analogue reconstruction filter can be very simple – and cheap
• It restores the required smooth curves of the underlying waveform

Periodic Waves: Time v Distance

The Time Domain
The speed of sound in air is approximately 340 m/s. So we can measure frequency either in terms of distance or in terms of time.
• Wavelength – literally the length of a cycle
• Period – duration of one cycle
• Frequency = speed of sound / wavelength
• Frequency = 1 / period
• Frequency is not a measure of either length or duration. It is therefore best to avoid labelling either wavelength or period directly as “frequency”.

Audio Data Representation – Time Domain
• Two basic forms – data stream, and a file format.
• Two primary number representations:
• Integer (e.g. -32768 to 32767)
• Floating point
• These days, the ±1.0 floating point normalised representation is the most important.
• We can display amplitude (V scale) either as normalised sample values, or in decibels (dB)
(Figure: Normalised – Audacity)
(Figure: Decibel (logarithmic) scale – Adobe Audition)

To convert an amplitude a to dB:
$\mathrm{dBval} = 20 \log_{10}(a)$

Amplitude Display – the dB log scale
Using the standard display, most of the signal is invisible. The ear senses both loudness and frequency on a logarithmic scale, e.g. from a maximum of 1.0 (0 dB) to less than 0.0001 (-80 dB).
(Figure labels: Where does the sound finish? Here? Here!)

Representation – Frequency Domain
We have two primary and complementary ways to represent sound.
• Time Domain : amplitude / time
• Frequency Domain : two related forms.
• Spectrum : amplitude / frequency
• Spectrogram (or sonogram) : spectrum / time
• Audacity supports both, with linear/logarithmic options
• Again, the log frequency scale reflects how we hear – e.g. axis marks in octaves – or musical notes.

The figures display a sine “log frequency sweep” (without aliasing).
(Figure panels: Log vertical scale; Linear vertical scale)

Digital Sound Synthesis
A basic definition : using algorithms (and some maths) to generate audio data

Two computer-based approaches:
• Real-time, e.g. using “soft” or hardware-based synthesisers
• Offline – writing data to a soundfile for later playback.
• Many possible approaches; most are technically difficult, maths-heavy, and (especially for real-time) computationally demanding – need fast hardware, and compilers able to generate very fast code.

One (relatively) simple but classic approach identifies three fundamental ingredients:
• Sine waves
• Noise
• Time-varying Control functions (“automation”, “breakpoint data”)
Together, these form the basis of additive synthesis. This means, quite literally, arbitrarily or algorithmically adding sound waves together – also known as “mixing”.

Music Synthesis and Algorithmic Composition
Concentrates on the control aspect.
• Most common route is the algorithmic generation of MIDI data, in real time or written as a standard MIDI file.
• Many free domain-specific languages are available. Some support both direct synthesis and algorithmic score generation using a library of freely arranged modules. The (arguably) pre-eminent example is Csound.

Algorithmic composition can be very complex, but can also be very simple, such as loop-based generation of scale and chord patterns. The auto-arpeggiator built into many synths and home organs is a simple example of a musical automaton.
• MIT Scratch: supports basic soundfile playback and MIDI note generation. Loose timing limits scope to simple patterns.
• Python: many extension libraries available, for both synthesis and MIDI programming. It includes standard modules for basic soundfile i/o.

Sonification and Audification
The rendering of non-audio data as sound in order to reveal patterns and features.
• Audification : source data already has a time dimension.
• e.g. seismic, volcanic, astrophysics, even stock price movements.
• Sonification : applied to any arbitrary numeric data.
• Generally applied to large data sets which are already a challenge to analyse. For example, we have worked on particle collision data from the Large Hadron Collider (searching for the Higgs boson), as part of the LHCsound outreach project [1].
• However, it can be applied to small data sets and processes too:
• Any algorithms involving lists, iteration and loops
• Shapes of mathematical functions and formulae
• Whether the output is sonification or algorithmic composition depends entirely on your intention and interest – the process itself is the same.
(Examples of simple sonification were presented in Scratch)

[1] http://people.bath.ac.uk/masrwd/lhcsoundresources.html and http://www.lhcsound.com
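
Sonification sketch (Python)
The small-data case mentioned above (the shape of a mathematical function) can be sketched with nothing but Python's standard math, struct and wave modules. This is only an illustration, not the method used in the talk, whose own examples were presented in Scratch. The function names, the 220 Hz base pitch, the two-octave range, the 0.15 s tone length and the output filename are all assumptions made for the example. Values are mapped to pitch on a logarithmic scale, tones are built as ±1.0 normalised floats, and the 16-bit conversion uses mid-tread quantisation, floor(val + 0.5).

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # Hz; every tone below stays far under the Nyquist limit (sr/2)


def value_to_frequency(value, lo, hi, f_min=220.0, octaves=2.0):
    """Map a value in [lo, hi] to a frequency on a logarithmic (musical) scale.

    f_min and octaves are illustrative choices, not values from the talk.
    """
    position = 0.0 if hi == lo else (value - lo) / (hi - lo)
    return f_min * 2.0 ** (octaves * position)


def sine_tone(freq, duration=0.15, amplitude=0.5):
    """Return normalised (+/-1.0) float samples of one sine tone.

    A short linear fade in and out avoids clicks between successive tones.
    """
    n = int(SAMPLE_RATE * duration)
    fade = max(1, n // 10)
    samples = []
    for i in range(n):
        envelope = min(1.0, i / fade, (n - 1 - i) / fade)
        phase = 2.0 * math.pi * freq * i / SAMPLE_RATE
        samples.append(amplitude * envelope * math.sin(phase))
    return samples


def sonify(data):
    """Turn a list of numbers into one tone per value, played in sequence."""
    lo, hi = min(data), max(data)
    samples = []
    for value in data:
        samples.extend(sine_tone(value_to_frequency(value, lo, hi)))
    return samples


def write_wav(path, samples):
    """Write normalised floats as a 16-bit mono WAV file."""
    frames = bytearray()
    for s in samples:
        q = math.floor(s * 32767.0 + 0.5)   # mid-tread quantisation
        q = max(-32768, min(32767, q))      # clamp to the 16-bit range
        frames += struct.pack("<h", q)      # little-endian signed 16-bit
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(SAMPLE_RATE)
        wf.writeframes(bytes(frames))


if __name__ == "__main__":
    # The shape of y = x*x: the pitch falls to a minimum, then rises again.
    data = [x * x for x in range(-8, 9)]
    write_wav("parabola.wav", sonify(data))
```

Running the script writes parabola.wav (a hypothetical filename), which can be played back or viewed as a spectrogram in Audacity. Swapping the list comprehension for any other list of numbers sonifies that data instead.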