Proceedings of the 20th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Vol. 20, No. 3, 1998

The Generation and Validation of High Fidelity Virtual Auditory Space

Simon Carlile(1), Craig Jin(1,2), Vaughn Harvey(1,2)
(1) Department of Physiology, The University of Sydney, NSW, Australia 2006
(2) Department of Electrical Engineering, The University of Sydney, NSW, Australia 2006
E-mail: [email protected]

Abstract - This paper reviews a number of the issues involved in recording the filter functions of the outer ear and subsequently using these functions to render virtual auditory space using headphones. The acoustical problems associated with recording within the confined tube of the auditory canal are considered for the specification of both the head related transfer function and the headphone transfer function. The difficulties of acoustically validating the rendered VAS indicate the importance of validation using a powerful test of auditory performance. We describe such methods as used in our laboratory. The issue of individualised HRTFs is considered and two relevant experiments are described concerning (i) the mapping of the morphology of the outer ear to the filter characteristics and (ii) the subsequent manipulation of standard HRTFs.

Keywords - Virtual auditory space, Head-related transfer function, Sound localization, Virtual auditory displays

I. INTRODUCTION

A. Applications of VAS

There are a number of outstanding problems in rendering high fidelity virtual auditory space (VAS). This paper will review some of the most common ways of measuring the filter functions of the outer ears and of using these functions to generate sounds localised in a virtual space. We will highlight some of the technical difficulties associated with performing these kinds of measurements on humans as well as the problems associated with acoustically validating the rendered virtual space.
In discussing the nature of the perceptual errors that can arise as a result of poorly rendered virtual space, we will focus on a class of large localisation errors: the so-called cone-of-confusion errors. On the one hand, for relatively trivial applications of these technologies, such as by the games industry, these problems are not so acute; but where high fidelity localisation is required (such as for directional or location mapping) these problems are far more profound. Work over the last decade has demonstrated that, under carefully controlled conditions, it is possible to render auditory environments with high fidelity.

In the case of artificial virtual environments, non-auditory data can also be mapped into this domain. This provides a means by which data that is either highly symbolic, or data that is usually delivered to a different sensory channel, can be mapped onto the auditory channel. For instance, there are a variety of situations where the visual channel carries a very high data load, such as in a flight cockpit. Efforts have been made to remodel these kinds of data to facilitate operations, such as in the development of head-up displays. In addition to the remapping of some of these data, the auditory dimension has also been used to replace, supplement or facilitate visual information. Some examples include a horizon indicator that uses a moving auditory icon to indicate the gravitational 'up' direction, a collision warning system which indicates the direction of an incoming threat using an appropriate auditory icon, or a flight plan navigational aid (e.g. see [1-4]). The fidelity of the rendered virtual auditory space is most important in mission critical applications such as the gravitational horizon or collision indicator.

B. The Acoustical Basis of Virtual Auditory Displays

The perception of auditory space is dependent on acoustical cues that arise as a result of the pattern of sound waves arriving at each ear (for recent review see [5]). A sound source located away from the midline results in differences in the arrival time and the level of the sounds at each ear. The auditory periphery, comprising the pinna, concha, head and torso, interacts with the incoming sound waves and spectrally filters the sound. Because of the morphological asymmetries of these structures, the nature of this filtering is dependent on the relative location of the sound in space (see [6]). These filter functions are commonly described as the head related transfer functions (HRTFs) and ideally provide a complete description of the transformation of the sound from a point in free space to the eardrum. However, the perception of an externalised sound is not dependent exclusively on the HRTFs and there are a range of other relevant acoustical cues ([7]).

In the ideal case, the functional requirement of a VAS display is to generate the pattern of sound waves at the eardrums that would have occurred had the sound occurred in the free field (for recent review see [8]). In practice this is very difficult to achieve. There are a range of technical difficulties associated with recording the transfer function to the eardrum. First, the placement of the probe microphone in the human outer ear is not a straightforward procedure due to the delicacy of the eardrum and the sensitivity of the proximal portion of the auditory canal. Secondly, the impedance mismatch between the outer and inner ear results in the reflection of energy at the eardrum and the consequent development of standing waves along the length of the auditory canal. A microphone within the auditory canal will sample pressures that result from a combination of incoming and outgoing sounds [9].
Moreover, close to the eardrum the sound field becomes particularly complicated due to the coupling of the sound field and the effective reflecting surface of the eardrum [10]. As a consequence, it is difficult to specify an optimal location within the proximal portion of the auditory canal at which the HRTF can be specified. In addition, a compromise needs to be made between recording convenience, the safety of the experimental subject and the complex acoustics close to the eardrum.

Of course, the transfer functions of the headphones (HpTF) used to deliver the VAS also need to be measured and compensated for in the process. All of the problems associated with the measurement of the HRTF also apply to the measurement of the headphone transfer function. There are, in addition, two other problems that need to be overcome. First, there is an increased difficulty in maintaining the exact location of the microphone for both the free field and headphone transfer functions. The headphones have a tendency to distort the ear and push on the recording microphone. Variations in the point to which these functions are specified will lead to sharp and quite large disparities resulting from small variations in the frequency of the standing wave null at the location of the probe microphone ([6]). Secondly, most headphones tested demonstrate significant variation in the transfer functions dependent on their exact placement on the outer ear ([11]). This latter study also demonstrated that the use of a standardised headphone calibration would lead to significant errors in the regenerated HRTF, as the headphone calibration captures a significant component of the individualised characteristics of a listener's ears.
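The sensitivity of the standing-wave null to probe position follows a quarter-wave relation, f_null = c/(4d), if the eardrum is idealised as a rigid reflector (the true terminating impedance is more complex, as noted above). A minimal numerical sketch under that simplifying assumption:

```python
# Quarter-wave relation for a probe microphone at distance d from an
# idealised rigid reflector: the first pressure null sits at f = c / (4 d).
# Illustrative sketch only; the eardrum's effective reflecting surface is
# more complicated than a rigid plane.
c = 343.0  # speed of sound in air at ~20 degrees C, m/s

def null_frequency(d):
    """First standing-wave null (Hz) for probe-to-reflector distance d (m)."""
    return c / (4.0 * d)

def probe_distance(f_null):
    """Probe-to-reflector distance (m) placing the first null at f_null (Hz)."""
    return c / (4.0 * f_null)

# Keeping the null above 14 kHz requires the probe within ~6 mm of the eardrum.
print(round(probe_distance(14e3) * 1000, 2), "mm")

# A 1 mm placement difference near that depth shifts the null by several kHz,
# which is why small variations in probe position produce large disparities.
print(round((null_frequency(5e-3) - null_frequency(6e-3)) / 1e3, 2), "kHz")
```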
As a result of the many technical difficulties outlined above, most approaches involve a number of approximations in the recording procedure and untested or untestable assumptions in the rendering of VAS.

C. Common Methods Applied in Generating Virtual Auditory Space

There are a number of methods that have been employed to measure the HRTF (e.g. [12-15]). These methods can be broadly divided into (i) deep ear canal recordings, (ii) shallow ear canal recordings and (iii) blocked ear canal recordings. The different techniques all aim to overcome the problem of the artifacts produced by the standing waves within the canal. We have previously attempted to record as deeply as possible within the canal by using the frequency of the standing wave null to determine the distance of the microphone probe from the effective reflecting surface of the eardrum ([14]; see also [16]). The microphone is placed sufficiently close to the eardrum that the frequency of the standing wave null occurs above 14 kHz. This provides a reasonable estimate of the transfer function at the low to mid frequencies [16], but provides an increasing underestimate for progressively higher frequencies. Other authors have recorded at more distal locations within the canal, although these locations provide a poor estimate of the spectrum of the sound at the eardrum (e.g. [29]).

An alternative approach is to record the so-called 'Thévenin pressure', which is obtained at the entrance of an ear canal that has been plugged to eliminate the effects of the canal resonance and the standing waves. This situation is similar to the 'open circuit' situation for deriving the Thévenin equivalent circuit in electronics, only in this case it is an acoustical circuit. There are two characteristic impedances describing this situation: (i) the input impedance seen at the ear canal, Z_ear_canal, and (ii) the output impedance seen at the ear canal, Z_radiation.
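The two impedances just described act as a pressure divider on the Thévenin pressure. A toy numerical sketch of that divider (the impedance values are invented for illustration, not measured):

```python
# Pressure-divider sketch of the Thevenin description of the plugged-canal
# recording. All impedance values are invented, arbitrary-unit illustrations.
Z_ear_canal = 1.0 + 0.5j   # input impedance seen at the ear canal
Z_radiation = 0.3 + 0.2j   # output (radiation) impedance seen at the ear canal

def transfer(Z_load, Z_source):
    # Fraction of the source pressure delivered across the load impedance.
    return Z_load / (Z_load + Z_source)

free_field = transfer(Z_ear_canal, Z_radiation)

# An 'open' headphone presents a source impedance close to Z_radiation,
# so its delivery closely matches the free field transfer function...
open_phone = transfer(Z_ear_canal, 1.05 * Z_radiation)
# ...whereas a phone with a very different source impedance does not.
sealed_phone = transfer(Z_ear_canal, 10.0 * Z_radiation)

print(abs(free_field - open_phone))    # small mismatch
print(abs(free_field - sealed_phone))  # large mismatch
```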
The validity of this approach relies on the argument that, using a sufficiently 'open' headphone, the transfer function obtained with the headphone will be equivalent to the free field transfer function. Briefly, this can be seen as follows. The free field transfer function is given by:

Z_ear_canal / (Z_ear_canal + Z_radiation),

while the transfer function with headphone delivery is:

Z_ear_canal / (Z_ear_canal + Z_headphone),

where Z_headphone is the impedance of the headphone seen from the ear canal. Thus it can be seen that an 'open' headphone must have Z_headphone and Z_radiation approximately equivalent.

The methods employed in generating the VAS will depend on the methods that have been used in recording the HRTFs. The simplest case is for the deep ear canal recordings. If the transfer function of the measurement system is known (system transfer function: the speaker, microphone and recording system), then dividing the measurement made within the ear canal by the system transfer function should provide the HRTF to the measured point in the ear. Likewise, if the HpTF has been obtained, then dividing the HRTF by the HpTF will produce a signal that, when filtered using the headphone, should produce the desired sound at the eardrum. However, as noted above, there are problems associated with obtaining reliable HpTFs. To avoid this difficulty, in-ear tube phones such as those produced by Etymotic have been used in a number of laboratories to generate VAS. Although relatively expensive, the tube phones appear to have two main advantages. First, the placement within the ear is highly reproducible compared to the placement of circum-aural or supra-aural headphones. Secondly, the manufacturers claim that the transfer function of the driver is flat to the eardrum, so there is effectively no headphone transfer function that needs to be compensated for.

It is common to treat the transfer function from a particular sound source location as a two-stage transfer function.
One stage is a location independent transfer function and the other stage is a location dependent transfer function. If the location independent transfer function is deconvolved from the original transfer function, the resulting transfer function is known as the directional transfer function or DTF. The primary reason for performing such a manipulation is to remove measurement artifacts. Theoretically this is generally taken care of by calibrating the recordings using the inverse of the recording microphone transfer function. However, this method can be prone to errors because of noise inherent in the measurement process. If instead all of the transfer functions are averaged to obtain a location independent transfer function, which is then deconvolved from the original transfer function, the influence of noise upon the measurement process is greatly reduced. Different techniques exist for computing the average transfer function (also known as the diffuse field transfer function), and probably the most popular method is the one introduced by Middlebrooks [17], which involves averaging the log magnitude spectra across all locations.

A drawback to this procedure is the invariable introduction of an overall interaural distortion to the simulation process. This can be reasoned as follows. If the original interaural transfer function, ITF_original, is given by the equation:

ITF_original = (I_R * D_R) / (I_L * D_L),

where I_R is the right location-independent transfer function and D_R is the right location-dependent transfer function (similarly for the left ear), then the interaural transfer function resulting from using the DTFs, D_R and D_L, is given by:

ITF_dtf = D_R / D_L.

It then follows that the difference between the two interaural transfer functions can be expressed as:

ITF_original / ITF_dtf = I_R / I_L.
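This algebra can be checked numerically. The sketch below builds synthetic magnitude spectra (random values, not real HRTFs), forms the diffuse-field average by averaging log-magnitude spectra across locations, and confirms that the interaural residual is the fixed ratio of the two location-independent components:

```python
import numpy as np

rng = np.random.default_rng(0)
n_loc, n_freq = 8, 16

# Synthetic magnitude spectra: |HRTF| at each location and frequency bin.
H_R = rng.uniform(0.1, 2.0, (n_loc, n_freq))
H_L = rng.uniform(0.1, 2.0, (n_loc, n_freq))

# Diffuse-field (location-independent) spectrum per ear: the average of the
# log-magnitude spectra across locations, as in the Middlebrooks method.
C_R = np.exp(np.mean(np.log(H_R), axis=0))
C_L = np.exp(np.mean(np.log(H_L), axis=0))

# Directional transfer functions: divide out the diffuse-field average.
D_R = H_R / C_R
D_L = H_L / C_L

# Interaural transfer functions before and after the DTF manipulation.
ITF_orig = H_R / H_L
ITF_dtf = D_R / D_L

# The residual equals the ratio of the location-independent components:
# the same interaural distortion at every location.
residual = ITF_orig / ITF_dtf
print(np.max(np.abs(residual - C_R / C_L)))
```

Because the residual is identical at every location, the DTF manipulation introduces a single overall interaural distortion rather than a location-dependent error.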
Thus I_R / I_L gives a measure of the distortion introduced into the interaural transfer function when using the DTF. One outstanding question is the general acoustical validation of this approach. It is not practical, or in the case of the in-ear tube phones even possible, to calibrate the headphone transfer functions for each operator prior to generating VAS. This leads to some uncertainty regarding the acoustical accuracy of the rendered HRTFs and begs the need for an appropriate means by which individualised VAS can be appropriately and conveniently validated.

D. Tests of the Fidelity of Rendered Virtual Auditory Space

We have assessed the fidelity of virtual auditory space by using sound localisation performance. The ability of a subject to localise a burst of broadband noise in anechoic space is compared with his/her localisation performance for the same type of stimulus presented in virtual auditory space. From an evolutionary point of view, the most sophisticated form of sound localisation involves binaural and monaural processing and is capable of determining the locations of very brief sounds with considerable accuracy, particularly if the sounds are spectrally dense (for recent review see [5]). There is considerable evolutionary pressure to accurately localise sounds such as the inadvertent movement noises of a predator or prey. We have chosen localisation of a brief sound as a test of the fidelity of VAS as processing of these kinds of stimuli represents a demanding test of auditory localisation abilities. Combined with appropriate statistical treatments (see below), such a test gives a powerful evaluation of VAS fidelity.

II. ASSESSMENT OF SOUND LOCALISATION PERFORMANCE

A. Free Field Localisation

We have developed an automated stimulus system that allows the placement of a sound source at any position on an imaginary sphere surrounding the subject. The robot arm places the sound stimulus at one of 76 randomly chosen locations in a darkened anechoic chamber. The subject turns to face the perceived location of the target and points his/her nose at the source. An electromagnetic tracking device mounted on the top of the head (Polhemus IsoTrack) is used to measure the orientation of the head and thus provides an objective measure of the perceived location. Turning to face the source of a sound is a highly ecological behaviour which brings the source of the sound into the visual field [18]. All subjects undergo a short period of training prior to localisation testing to ensure that they can reliably use this method of pointing to indicate the perceived location of the sound source.

Localisation errors fall broadly into two categories. The most common type of error is associated with relatively small deviations of the perceived location from the actual location and is referred to here as a "local error". The second type of error typically involves a very large error where the perceived location of the target is at a location reflected about the interaural axis (the line passing through the two ears). In this type of error the correct angle with respect to the median plane is estimated but the spatial quadrant is confused; for example, a sound located close to the anterior midline is judged to be close to the posterior midline. Such errors occur relatively infrequently (typically less than 4% under the conditions tested in our laboratory) and are referred to as "front-back confusions" or, more properly, "cone-of-confusion" errors. The large qualitative difference between local and cone-of-confusion errors is generally taken to indicate the failure of two different processes in auditory localisation (for an extensive discussion see [6]).

We have examined both the performance of individuals and a pooled population of subjects using a number of statistical measures. The cluster of localisation estimates about an individual target location can be described by the centroid and standard deviation calculated using a Kent or Fisher distribution as appropriate ([19, 20]). The centroid indicates the systematic error in localisation and the dispersion the accuracy. The systematic errors and the dispersion of these clusters are smallest for frontal locations close to the audio-visual horizon and largest for locations behind and above the subject (Figure 1: taken from [21]). The spherical correlation coefficient is used to assess the association between the centroids of the perceived locations and the actual target locations; the correlation of the data shown in Figure 1 was 0.98 for the pooled data and the rate of front-back confusions was 3.2% of the localisation trials.

B. Localisation Performance in Virtual Auditory Space

Individualised HRTFs were used to generate virtual auditory space stimuli for each subject. The impulse responses for the left and right ear for the same 76 locations used in testing the free field localisation performance were calculated as described above. In some experiments the localisation accuracy for different kinds of filtering was tested: namely, impulse responses calculated from (i) deep ear canal HRTFs, (ii) blocked ear canal HRTFs and (iii) DTFs calculated from any of the microphone recording conditions. For the data shown in Figure 2 the impulse responses for the left and right ears were calculated from the deep ear canal recordings of five subjects and convolved with 150 ms of broadband noise using MATLAB (MathWorks). Stimuli were generated using 16-bit D-A converters at 80 kHz (TDT System II) and were presented to the subjects using in-ear tube phones (Etymotic ER-2). As
mentioned above, the principal disadvantage of using the tube phones was that any inter-subject differences in the tube-phone transfer functions could not be measured and accounted for under these experimental conditions. Sound localisation performance for stimuli presented in VAS was assessed in exactly the same manner as for the free field localisation. The only difference in conditions was that the subjects wore the in-ear tube phones to deliver the stimulus. Some subjects demonstrated a pattern of acclimatisation to the VAS, initially showing a higher than normal rate of front-back confusions that declined significantly after one or two trial blocks ([22]). The spatial distribution of localisation errors for sounds located in the virtual space was very similar to that for sounds presented in the free field (Figure 2). On average, dispersion was greater by 1.5° for stimuli presented in VAS compared to free field. There was a slight increase in the dispersion of localisation estimates for locations behind and also above the subjects when compared to free field localisation. An increase in the front-back confusion rate (the most prominent form of the cone-of-confusion errors) was also seen, with average rates rising from around 3-4% in the free field up to 6% for sounds presented in virtual space. A significant proportion of these front-back confusions can probably be attributed to an increase in the angular errors about the interaural axis, particularly at the higher elevations. The spherical correlation between the perceived and actual target locations was 0.973. Furthermore, the spherical correlation between the VAS and free field localisation was higher still (0.98), indicating that subject biases evident in the free field data were also replicated in the VAS data. For some subjects we have carried out VAS testing using a range of different methods of generating VAS, and some differences have been found between DTF and HRTF recordings (e.g. Table 1).
In this case, the subject was known to have slight asymmetries between the ears that resulted in differences in the directional characteristics of the ears. The calculation of the DTF might be expected to produce a subtle distortion in the interaural differences by removing a constant interaural difference, resulting in confounded binaural spectral cues. While there is some evidence that the VAS rendered without calculating the DTF resulted in better performance for either the deep ear canal recording or the blocked ear canal recording, the differences are not that marked. This suggests that there is redundancy in the auditory localisation cues that may result in compensation for the deficiencies or distortion in one cue set.

Table 1. Localisation performance for different recording and filtering conditions (SCC: spherical correlation coefficient).

Condition          | SCC  | Front-back confusions
Normal free field  | 0.95 | 1.5%
Deep ear: HRTF     | 0.89 | 4.6%
Deep ear: DTF      | 0.93 | 2.6%
Blocked ear: DTF   | 0.96 | 4.5%
Blocked ear: HRTF  | 0.95 | 4.0%

III. INDIVIDUALISED HRTFS

It has been known for some time that individual differences in the structural features of the outer ears can lead to significant differences in the filter functions. Wenzel et al. ([23]) showed that localisation in VAS generated using HRTFs that were not obtained from the listeners often resulted in localisation performance that was significantly degraded. Any one subject showed a range of localisation performance using HRTFs obtained from different subjects. The performance differences presumably reflect the extent of the physical differences between the ears of the listener and the ears from which the HRTFs were obtained. Subjects seemed largely unable to learn to use these non-individualised HRTFs despite repeated exposure. This has important consequences for the generalised use of VAS displays. Unless the HRTFs used in generating a display are individualised for the listener, there will likely be significant variations in the fidelity of the VAS generated across a population of listeners.
To that end, a number of projects are being pursued in our laboratory that seek (i) to find ways in which the structural features of an individual's ears can be used to predict the nature of the HRTF and (ii) to examine ways in which a standard set of HRTFs might be modified or 'morphed' to better fit the filter functions of a particular listener's ears.

A. HRTF Mapping

No one has yet found a mathematically tractable or useful functional relationship between the morphology of the outer ear and the HRTF. And yet, such a relationship could be exploited to produce a generic VAS system adaptable to each individual. In hopes of such a realization, a number of inroads have been made towards solving this problem. For example, Lopez-Poveda and Meddis [24] have used a wave-equation model to characterize the human concha. Kulkarni and Colburn [25] have applied ARMA modeling to HRTFs and have noted that the resonances and anti-resonances of the pinna may be characterized via the pole-zero positions of their model. Wightman and Kistler [26] have used multidimensional scaling analysis to analyze the differences between individual HRTFs in a low-dimensional space. No matter which method is employed, the underlying difficulty arises from the large morphological differences between people and the remarkable sensitivity of the auditory system to the specific filter characteristics of its own ears. It is not just the shape of the ears that plays a role, but also the angle of the ears, the size of the head and the shape of the torso. Therefore we have taken a reductionist approach and have studied a system with a reduced number of varying parameters. To this end, a life-like acoustical mannequin was built which allows us to vary many characteristics of the auditory periphery and to measure acoustically the consequences of such modifications.
In the experiment described here we have recorded the HRTFs while varying only the angle between the ear and the head (the ears were attached to the head using a notched hinge). A mathematical model of the functional relationship between the angle of the ear and the HRTFs was derived. While only one physical parameter was varied, namely the angle of the ear, we did not expect that only a single parameter in the model would be required to completely describe the HRTF variations. Essentially, the angle of the ear influences the HRTFs through a complicated boundary-value problem involving the wave equation. However, by utilizing this physical reductionist approach, one can begin to assess more accurately the difficulties of the original problem.

Seven complete sets of HRTF recordings of the mannequin were performed, one for each of 7 different angles of the ear with respect to the head. These angles were (in degrees): 0, 10, 20, 30, 40, 50, 60. Each set of recordings contained the left and right ear transfer functions for 390 locations equally distributed on the sphere. Principal component analysis [27] was performed across the HRTFs in all 7 recordings (each ear was treated separately). A linear functional was then used to map the angle between the ear and the head to the principal component representation of the HRTFs in a given recording. The linear functional was derived such that the coefficients of the functions within the linear functional did not vary with the angle of the ear, but did vary with location. This relationship is described as follows:

hrtf_i(α) = a(α) k_i + b(α) l_i + c(α) m_i,

where hrtf_i specifies the principal component representation of the HRTF with an outer ear angle of α at a location labeled by i, and a(α), b(α), c(α) are functions of α with respective vector coefficients k_i, l_i, m_i that vary with the location, again labeled by i. The parameter functions and coefficients of the linear functional were derived by minimization of the mean square error between the original HRTF and the calculated HRTF. The mean square error also provided a measure of the explained variance accounted for by the linear functional model of the data.

It was found that a reasonable mapping between the outer ear angle, α, and the HRTF data could be derived using a linear functional consisting of three parameter functions and coefficients. The degree of match between an original HRTF and that derived from the linear functional model is shown in Figure 3. The first two functions, a(α) and b(α), of the linear functional demonstrated a linear and quadratic functional dependence, respectively, upon the outer ear angle, whilst the third function, c(α), had a more complicated functional dependence. The approximately linear dependence of the first function, a(α), on the angle between the ear and the head is shown in Figure 4. These data indicate that the dependence of the HRTFs upon a single physical parameter, the angle between the outer ear and the head, expanded within the model to at least three parameter functions. This is a rather sharp increase in complexity when examined in terms of the number of potential physical variations between individuals. Furthermore, this example varies only a single physical parameter, and the additional complications introduced when two physical parameters co-vary are unknown.

Figure 1: Pooled localisation responses from 19 subjects shown for front (f), back (b), left (l) and right (r) hemispheres. The actual target locations are shown by the small symbols and the centroid of the pooled data by the filled circle. The ellipse indicates the standard deviation of the cluster of perceived responses.

Figure 2: Location estimates pooled for 5 subjects for sounds presented in virtual auditory space. All other details as for Figure 1.

Figure 4: The first parameter function, a(α), as a function of the angle, α, between the outer ear and the head.

B. HRTF Morphing

Another interesting approach towards exploring the relationship between outer ear morphology and HRTFs starts not with the basic functional relationships between the ear and the HRTF, but with the functional relationship between the HRTFs of different individuals. This approach shall be called the 'morphing approach' and is one in which a given set of HRTFs for one individual is adapted or 'morphed' into another set for a different individual. Middlebrooks [28] has recently demonstrated that a simple scaling of the HRTFs along the frequency axis can account for a substantial proportion of the variance between the HRTFs of different individuals. The difficulty, in general, with this approach is deriving a suitable 'morphing function' between different HRTF data sets. Ongoing research in this laboratory is exploring the linear functional technique described above. The intriguing aspect of the morphing approach is the possibility of discovering a psychophysical relationship between the morphing parameters and the perception of the derived VAS.
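The frequency-scaling morph reported by Middlebrooks [28] can be sketched as resampling a magnitude spectrum along a scaled frequency axis. The notch centre and scale factor below are invented for illustration, not fitted to any listener:

```python
import numpy as np

def scale_hrtf_in_frequency(freqs, mag, k):
    # Evaluate the magnitude spectrum at f / k: k > 1 stretches spectral
    # features toward higher frequencies, k < 1 compresses them.
    # np.interp clamps values outside the measured band to the endpoints.
    return np.interp(freqs / k, freqs, mag)

freqs = np.linspace(0.0, 20000.0, 512)
# Toy "HRTF" magnitude: a single pinna-like notch centred near 8 kHz.
mag = 1.0 - 0.8 * np.exp(-((freqs - 8000.0) / 500.0) ** 2)

scaled = scale_hrtf_in_frequency(freqs, mag, k=1.1)

# The notch should now sit about 10% higher in frequency, near 8.8 kHz.
print(freqs[np.argmin(mag)], freqs[np.argmin(scaled)])
```

A morphing function of this kind has a single parameter, k, which is the attraction of the approach: the fit between two listeners' HRTF sets reduces to estimating one scale factor.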
However, experiments within our own laboratory and personal communication with Wightman and Kistler indicate that such a relationship will only be useful if a small number of parameters are required.

ACKNOWLEDGEMENTS

This work was supported by grants from the Australian Research Council, the National Health and Medical Research Council (Australia), the Ramaciotti Foundation and the University of Sydney. The authors would like to acknowledge the assistance of Johahn Leung, Stephanie Hyams, Anna Corderoy and André van Schaik.

REFERENCES

[1] E. M. Wenzel, "Research in virtual auditory displays at NASA", in SimTecT 96, Melbourne, 1996.
[2] E. M. Wenzel, "Localization in virtual acoustic displays", Presence, pp. 80-105, 1992.
[3] D. R. Begault, 3-D Sound for Virtual Reality and Multimedia, Chestnut Hill, MA, Academic Press, Inc., 1994.
[4] R. L. McKinley and M. A. Ericson, "Flight demonstration of a 3-D auditory display", in Binaural and Spatial Hearing in Real and Virtual Environments, R. H. Gilkey and T. R. Anderson, Eds., Lawrence Erlbaum Associates, Mahwah, New Jersey, 1997, pp. 683-699.
[5] S. Carlile, "Auditory space", in Virtual Auditory Space: Generation and Applications, S. Carlile, Ed., Landes, Austin, chapter 1, 1996.
[6] S. Carlile, "The physical and psychophysical basis of sound localization", in Virtual Auditory Space: Generation and Applications, S. Carlile, Ed., Landes, Austin, chapter 2, 1996.
[7] N. I. Durlach, et al., "On the externalization of auditory images", Presence, vol. 1, pp. 251-257, 1992.
[8] S. Carlile, Ed., Virtual Auditory Space: Generation and Applications, Landes, Austin, 1996.
[9] S. M. Khanna and M. R. Stinson, "Specification of the acoustical input to the ear at high frequencies", J. Acoust. Soc. Am., vol. 77(2), pp. 577-589, 1985.
[10] R. D. Rabbitt and M. H. Holmes, "Three dimensional acoustic waves in the ear canal and their interaction with the tympanic membrane", J. Acoust. Soc. Am., vol. 83(3), pp. 1064-1080, 1988.
[11] D. Pralong and S. Carlile, "The role of individualized headphone calibration for the generation of high fidelity virtual auditory space", J. Acoust. Soc. Am., vol. 100(6), pp. 3785-3793, 1996.
[12] F. L. Wightman and D. J. Kistler, "Headphone simulation of free field listening. I: Stimulus synthesis", J. Acoust. Soc. Am., vol. 85(2), pp. 858-867, 1989.
[13] J. C. Middlebrooks, J. C. Makous, and D. M. Green, "Directional sensitivity of sound-pressure levels in the human ear canal", J. Acoust. Soc. Am., vol. 86(1), pp. 89-108, 1989.
[14] D. Pralong and S. Carlile, "Measuring the human head-related transfer functions: A novel method for the construction and calibration of a miniature "in-ear" recording system", J. Acoust. Soc. Am., vol. 95(6), pp. 3435-3444, 1994.
[15] D. Hammershøi and H. Møller, "Sound transmission to and within the human ear canal", J. Acoust. Soc. Am., vol. 100(1), pp. 408-427, 1996.
[16] J. C. K. Chan and C. D. Geisler, "Estimation of eardrum acoustic pressure and of ear canal length from remote points in the canal", J. Acoust. Soc. Am., vol. 87(3), pp. 1237-1247, 1990.
[17] J. C. Middlebrooks and D. M. Green, "Directional dependence of interaural envelope delays", J. Acoust. Soc. Am., vol. 87(5), pp. 2149-2162, 1990.
[18] J. C. Middlebrooks and D. M. Green, "Sound localization by human listeners", Annu. Rev. Psychol., vol. 42, pp. 135-159, 1991.
[19] P. H. W. Leong and S. Carlile, "Methods for spherical data analysis and visualisation", J. Neurosci. Methods, in press, 1998.
[20] N. I. Fisher, T. Lewis, and B. J. J. Embleton, Statistical Analysis of Spherical Data, paperback (with errata) ed., Cambridge, Cambridge University Press, 1993.
[21] S. Carlile, P. Leong, and S. Hyams, "The nature and distribution of errors in the localization of sounds by humans", Hear. Res., vol. 114, pp. 179-196, 1997.
[22] D. Pralong and S. Carlile, "Localization accuracy in virtual auditory space", Proc. Aust. Neurosci. Soc., vol. 7, p. 225, 1996.
[23] E. M. Wenzel, et al., "Localization using non-individualized head-related transfer functions", J. Acoust. Soc. Am., vol. 94(1), pp. 111-123, 1993.
[24] E. A. Lopez-Poveda and R. Meddis, "A physical model of sound diffraction and reflections in the human concha", J. Acoust. Soc. Am., vol. 100(5), pp. 3248-3259, 1996.
[25] A. Kulkarni and H. S. Colburn, "Infinite-impulse response models of the head-related transfer function", J. Acoust. Soc. Am., submitted, 1997.
[26] F. L. Wightman and D. J. Kistler, "Multidimensional scaling analysis of head-related transfer functions", IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 1993.
[27] I. T. Jolliffe, Principal Component Analysis, New York, Springer-Verlag, 1986.
[28] J. C. Middlebrooks, Z. A. Onsan, and L. Xu, "Virtual sound localization with non-individualized transfer functions is improved by scaling transfer functions in frequency", Abstracts of the Twenty-first Midwinter Research Meeting of the Association for Research in Otolaryngology, vol. 21, p. 43, 1998.
[29] J. C. Middlebrooks, J. C. Makous, and D. M. Green, "Directional sensitivity of sound-pressure levels in the human ear canal", J. Acoust. Soc. Am., vol. 86, pp. 89-108, 1989.