Improving Speech Intelligibility in Noise with Binaural Beamforming
The technology behind binax Narrow Directionality & Spatial SpeechFocus

Homayoun Kamkar-Parsi, Ph.D., Dipl.-Ing. Eghart Fischer, Dipl.-Ing. Marc Aubreville
© Siemens Audiology, 2014. www.siemens.com/hearing

Abstract

This paper describes the new binaural directional features implemented in binax, the latest binaural Siemens hearing aid system, which can transmit audio signals from one hearing instrument to the other in a bilateral fitting. ‘Narrow Directionality’, an enhanced binaural beamforming algorithm designed especially for very difficult listening situations, is introduced. A second feature, ‘Spatial SpeechFocus’, a self-steering binaural beamforming algorithm, is also described. This feature is designed especially for situations with talkers from side directions in noisy environments, e.g. in cars. The advantages of automatic activation and control of the new algorithms, and their low power consumption, are also reviewed.

Introduction

Binaural hearing, meaning the perception and processing of slightly different acoustical information from both sides of the head, enables our hearing system to perform amazingly well in multiple respects. Surely the most well-known benefit of our binaural system is the ability to localize acoustic sources. This ability is essential for spatial orientation and for reacting quickly and appropriately to acoustic events. But there are more fascinating qualities associated with binaural processing. Our brain is able to fuse the information perceived by both ears to form one enhanced common output, even when the individual ears receive only incomplete or distorted input. This process is called binaural redundancy. Furthermore, our binaural system provides a kind of natural noise reduction: due to spatial, spectral, and time-based differences between the ears, a so-called binaural squelch is applied.
But perhaps the most intriguing area of binaural processing is the set of effects that help us understand a desired talker in background noise. Our brain is able to localize multiple sound sources and to assign the correct characteristics to these sources simultaneously. As soon as the auditory system has localized a preferred sound, it can extract the signals of this sound source from a mixture of interfering sounds in a process called binaural directed listening. Common real-world examples are a dinner table, where an individual is trying to listen to a person sitting off to one side, or driving a car with the radio on while trying to understand the passenger.

All these benefits of natural binaural hearing can be reduced by hearing loss, aging, declining cognitive function, and central auditory processing deficits, all factors that often apply to hearing instrument wearers. This presents a challenge when new hearing instrument technology is designed, as these user-related conditions need to be taken into consideration. It is especially critical because not being able to understand speech in background noise is the most common complaint from new hearing instrument wearers, and poor performance in background noise is the most common reason for hearing instrument rejection.

How can we enhance binaural processing?

In order to emulate binaural listening, we first need to link the two hearing instruments, just as the brain uses input from both ears. Building upon the original e2e technology, which won the prestigious German President’s Future Prize in 2012, e2e wireless 3.0 is more powerful than ever before. In addition to exchanging hearing instrument data such as volume and program settings, e2e wireless 3.0 is also able to directly transmit audio signals between the two hearing instruments.
This means that each hearing instrument in a bilateral pair works with input not only from the two microphones on its own housing, but also from the two microphones on the other hearing instrument, creating a hearing instrument that works with input from four microphones. And because the microphones are located at different parts of the head in the wearing position, the acoustic information each one picks up is slightly different, just as it is for the right and left ear. Together, the information from all the microphones creates a much more complete and accurate impression of the surrounding acoustic environment. We call this heightened sensitivity to the acoustic environment high definition sound resolution (HDSR). The HDSR processing is what enables the new Siemens binax hearing instruments to offer a number of features which emulate the natural binaural hearing processes.

We believe that the algorithms in binaural hearing instruments have to follow the same principles as our internal binaural processor, the brain. To achieve optimal speech intelligibility in very noisy situations, we strategically combine the signals from both right and left instruments in order to use any information available from the target direction, while at the same time reducing the influence of the non-frontal directions where interferers are assumed. To better hear a speaker from one side in the presence of noise from the other side, we also exploit the natural principles of auditory localization, using interaural time differences (ITDs) and interaural level differences (ILDs). In this way, we keep the sound sources in their natural positions even though the non-desired side is attenuated.
In all that we do, we know that speech intelligibility is not only a matter of simply attenuating interfering noise, but also of keeping these (attenuated) interferers in the correct spatial position for later processing in the brain (spatial stream segregation). Therefore, we ensure that the important binaural cues are well preserved so that the natural and artificial “binaural processors” can work together.

Bringing benefits of binaural hearing to daily use

New features should not only be innovative, they also need to provide practical benefit for everyday use. First of all, this means that these features should work automatically, without wearer manipulation, and fast enough to be effective in ever-changing real-life situations. The features of binax are fully integrated into the automatic Universal program. They engage, disengage, and adapt smoothly and automatically in response to the changing acoustic environment, and function synergistically with other existing hearing instrument features. They do not require separate programs, additional accessories, or volume adjustments. Only when everything just works, and works automatically, can the wearer forget that he or she is wearing hearing instruments.

Secondly, when we are talking about features in tiny hearing instruments with even tinier batteries, practicality also means that new features need to be energy efficient. Only when these features can be engaged when necessary without significantly reducing the battery life can they be truly beneficial. Compared with our previous micon platform, which did not have bilateral audio data transfer, binax battery consumption remains the same for the normal microphone modes. Most importantly, when audio streaming is active, the additional binaural features offered by binax increase the battery consumption by only 200 µA or less, from 1.3 mA to still below 1.6 mA. And this of course is only while the new features are automatically activated.
This is considerably more efficient than other products using this type of binaural processing. Put in more practical terms, a size 312 battery in a Pure micon S would last approximately 9.5 days given a typical usage profile (a 16-hour wearing day including 2 hours of Bluetooth streaming). That same battery would last a Pure binax S 8.3 days given the same usage profile, including active binaural features.

binax Narrow Directionality

Our objective was to develop binaural algorithms which address the most dissatisfying listening situations for those with hearing loss. It has been well documented in MarkeTrak studies (e.g. Kochkin, 2005; Kochkin, 2010) that the most problematic situations for the hearing impaired are understanding speech in noisy environments and large groups [1, 2]. In fact, satisfaction in this area did not improve substantially from MarkeTrak VII to MarkeTrak VIII. Therefore, we created Narrow Directionality to address those problems.

Narrow Directionality is a new advanced binaural beamforming system, which uses our binaural wireless audio link technology (e2e wireless 3.0). This algorithm improves speech understanding in extremely noisy and adverse acoustic environments, and provides a more efficient solution to the cocktail party effect than bilaterally fitted conventional monaural differential microphone systems. Narrow Directionality is designed to enhance the speech signal coming from a target speaker located among multiple other competing or interfering speakers around the listener. It creates a narrow beam towards the front direction so the wearer can listen easily to any distinct speaker by turning his or her head towards that speaker. It improves the speech signal from the target speaker in two ways simultaneously: by quickly reducing other competing speech signals outside the beam angular range, and by boosting the level of the target speaker signal within the beam (i.e. ‘focus’ on the target speaker).
As is well known, conventional monaural directional microphone systems can only effectively suppress interference in the rear hemisphere. This new system, however, can attenuate essentially any interfering speaker or noise that is not immediately in front of the wearer (i.e. a ‘narrow point of listening’). The system also incorporates a module that prevents attenuation or distortion of the target speech signal due to small head movements during a typical conversation. As a result, the system can quickly adapt and compensate for small movements such as ±10°, so that the target speech signal is not constrained to be precisely within the very narrow frontal beam range. This allows a more comfortable conversation without obliging the user to always face the target speaker directly. Additionally, the system is well integrated with an automatic control which smoothly adjusts (channel-wise) the directivity from a wider to a narrower beam as the SNR in the environment deteriorates. This ensures the optimal listening experience in all kinds of noisy environments.

Monaural processing and binaural processing

Existing advanced monaural directional microphone systems (i.e. monaural beamforming) provide great noise suppression, but mostly for the back hemisphere. In simple terms, only noises or interfering speakers behind the hearing impaired person are well attenuated. So what if the interfering noise comes from the side or from next to the target speaker? Narrow Directionality (i.e. our advanced binaural beamforming system; see Figure 2) is built on top of our existing monaural directional microphone system to tackle even more challenging noisy situations.
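
To make the limitation of monaural directionality concrete, the following minimal sketch evaluates an ideal free-field cardioid, one common first-order differential pattern. This is an assumption for illustration only: the paper does not state which pattern the monaural stage uses, real instruments adapt their pattern and are worn on the head, and the function name `cardioid_gain_db` is hypothetical.

```python
import math

def cardioid_gain_db(angle_deg: float) -> float:
    """Gain in dB of an ideal cardioid relative to the frontal direction (0 degrees)."""
    theta = math.radians(angle_deg)
    # Ideal first-order cardioid: full sensitivity at 0 degrees, a null at 180 degrees.
    response = 0.5 * (1.0 + math.cos(theta))
    # Floor the response so the rear null does not produce log10(0).
    return 20.0 * math.log10(max(response, 1e-6))

for angle in (0, 45, 90, 135, 180):
    print(f"{angle:3d} deg: {cardioid_gain_db(angle):7.1f} dB")
```

Under this idealization an interferer at 45° is reduced by only about 1.4 dB and one at 90° by about 6 dB, while sources near 180° are strongly suppressed, which is consistent with the statement that monaural systems mainly help against noise from behind.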
Figure 2: Simplified block diagram of the binax Narrow Directionality system, composed of a monaural processing stage followed by a binaural processing stage. The binaural stage takes as inputs the local signal (the monaural directional signal) and the contralateral signal (the monaural directional signal transmitted from the hearing instrument on the other side of the head via the e2e wireless audio link).

Figure 3 illustrates the various listening modes from omnidirectional to Narrow Directionality. In Figure 3a, the listening mode is set to omnidirectional. This implies no sound suppression of any kind. In this setting, the listener hears all the surrounding sounds equally. In Figure 3b, the listening mode is set to monaural or traditional directional processing. This creates a wide beam towards the front direction, and any interfering noise or talkers from the back hemisphere are attenuated. In Figure 3c, the listening mode is set to Narrow Directionality. The interferers from the back hemisphere remain attenuated, but additionally, interfering speakers in the proximity of the target speaker are also well suppressed. This is due to the narrower frontal beam offered by binax technology.

Figure 3: a) Omnidirectionality. b) Monaural directionality. c) Binaural Narrow Directionality.

Insights into Narrow Directionality processing

As introduced in the previous section, Figure 2 is a simplified block diagram of our Narrow Directionality system, incorporating the fusion of monaural and binaural systems. If we take a closer look at the system, as shown in Figure 4, the binaural processing is composed of three essential components: the binaural beamforming, the binaural noise reduction gain, and the head movement compensation module. All those components work together to achieve the Narrow Directionality effect.
Figure 4: binax Narrow Directionality, a closer look at the binaural processing system, composed of three components: binaural beamforming, binaural noise reduction gain, and a head movement compensation module.

Binaural Beamforming

This is a new kind of binaural beamforming based on utilizing the head shadowing effect. It takes into account the contralateral wireless binaural signal as shown in Figure 4. For each side, the binaural beamformer is designed as follows: it takes as input the local signal, which is the monaural directional signal, and the contralateral signal, which is the monaural directional signal transmitted from the hearing instrument on the other side of the head (i.e. via our binaural wireless link, e2e wireless 3.0). It should be noted that the monaural directional signal (from the local side or the contralateral side) is already an enhanced signal with reduced noise from the back hemisphere, as illustrated in Figure 3b. The output of the beamformer is generated by linearly adding the weighted local signal and the weighted contralateral signal. The weighting scheme, which is a crucial part of the overall design, aims to provide an output signal with maximum lateral interference cancellation while keeping the frontal speaker untouched.

How should the weights then be optimized to achieve this goal? Taking the left hearing instrument as the reference, imagine the following example: a target speaker is located at 0° in front of the listener, who is fitted with binaural hearing instruments. The local and contralateral (i.e. the transmitted signal from the right hearing instrument) signals will therefore arrive at the same time at both ears without any head shadowing effect. In other words, both signals will have approximately the same power and phase. However, for the case of a lateral interfering noise (e.g. at 45°), the local and contralateral signals will be different due to head shadowing and interaural time difference, i.e.
the signals will have power and phase (or time) differences. Therefore, given a target at 0° in the presence of an interfering lateral noise or a competing talker at 45°, the local signal power will be higher than the contralateral signal power due to the interfering noise. It also implies that the contralateral signal is the one which is less affected by the noise, due to the head shadowing. Having this example in mind, we designed an algorithm to derive the optimum weights with the following criteria:

A) The sum of the weighted local and the weighted contralateral input signals should always produce an output signal (to each ear) with minimum power with respect to the original local and contralateral signal powers. This criterion ensures that lateral interferences are attenuated. Importantly, the weights are also adaptive and are updated within milliseconds to closely follow fast changes in the noisy environment.

B) The weights should always add up to 1. This ensures that the target signal at 0° remains untouched.

Applying A and B, our binaural beamformer creates a narrow beam to the front direction with the beam pattern representation as depicted in Figure 5. Figure 5 illustrates the Narrow Directionality (combination of binaural beamforming and binaural noise reduction) output signal characteristic relative to monaural directionality.

Binaural Noise Reduction

To further enhance the output signal from our binaural beamformer, as depicted in Figure 5b, we also developed a new adaptive binaural noise reduction gain which is fully integrated within the noise reduction unit. This noise reduction gain can be interpreted as a binaural Wiener-based gain computed using the local and the contralateral signals as inputs. We designed it to have some specific properties. It also attenuates lateral interfering speakers coming from outside the frontal target angular range (±10°).
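
As an aside on the beamformer criteria A and B stated above, the following is a minimal single-band sketch of weights that sum to 1 and minimize output power. It is not the binax implementation: the batch least-squares solution, the real-valued signals, the synthetic scene, and the name `min_power_weights` are all assumptions made for illustration, whereas the text describes weights that adapt within milliseconds.

```python
import numpy as np

def min_power_weights(local, contra, eps=1e-12):
    """Return (w_local, w_contra) with w_local + w_contra == 1 that minimize
    the power of w_local * local + w_contra * contra (real-valued signals)."""
    diff = local - contra
    # With the sum-to-one constraint the output is contra + w_local * diff,
    # so the minimum-power w_local is a one-dimensional least-squares fit.
    w_local = -np.dot(diff, contra) / (np.dot(diff, diff) + eps)
    return w_local, 1.0 - w_local

def power(x):
    """Mean squared value of a signal."""
    return float(np.mean(x ** 2))

# Hypothetical scene: a frontal target reaches both sides equally, while a
# lateral interferer is louder on the local side because of head shadow.
rng = np.random.default_rng(0)
target = rng.standard_normal(4000)
interferer = rng.standard_normal(4000)
local = target + 1.0 * interferer
contra = target + 0.3 * interferer

w_local, w_contra = min_power_weights(local, contra)
output = w_local * local + w_contra * contra

print(f"weights: local={w_local:.3f}, contra={w_contra:.3f}")
print(f"power: local={power(local):.3f}, contra={power(contra):.3f}, "
      f"output={power(output):.3f}")
```

Because the target is identical in both inputs, any weights that sum to 1 pass it unchanged, and minimizing the output power can then only remove the interferer. In this toy scene the optimal local weight comes out negative, which the two stated criteria permit; whether the product bounds the weights is not stated in the paper.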
Therefore, a competing speaker very close to the target speaker is further attenuated (i.e. by applying a gain below 1, the typical Wiener-based gain attenuation). However, if there is a frontal target speaker present within the frontal angular range, the frontal speaker is moderately amplified (or boosted) by applying a gain above 1 (which is not a typical Wiener-based gain property).

Figure 5: Narrow Directionality output signal characteristic. a) Monaural directional output characteristic. b) Binaural beamforming output characteristic. c) binax Narrow Directionality: binaural beamforming combined with binaural noise reduction.

Narrow Directionality gives the hearing impaired wearer the perception of focusing on the person he or she is directly looking at (like a magnifying glass), as illustrated in Figure 5c. It should be noted that the adaptation of the binaural noise reduction gain is fast enough (within milliseconds) to rapidly amplify or attenuate depending on the acoustic situation, without any increase in background noise.

Head movement compensation

As discussed earlier, Narrow Directionality creates a narrow frontal beam to efficiently reduce interfering noise from all directions, under the assumption that the target speaker is in front of the listener. But what about small movements of the target speaker or the listener? Would small head movements also attenuate the desired target speaker due to the narrow beam? At first glance, the answer would be yes. This is why Narrow Directionality also includes a head movement compensation module to avoid any target distortion. This is necessary to ensure a normal, comfortable conversation without requiring the listener to always directly face the target speaker. Head movements are a part of normal behavior during a conversation, and they usually occur very quickly.
Plus and minus 10 degrees can be considered a normal range of regular head movements by either the speaker or the listener. So if the target signal is at +10°, the head compensation module modifies the originating input binaural signal at +10° and brings it back to 0° (more specifically, the phase and the level of the contralateral signal are readjusted to match the local signal). This allows the binaural beamforming and the binaural noise reduction to behave in the same original manner; that is, the target signal is ‘seen’ at 0°, which lies within the narrow frontal beam of Narrow Directionality. As a result, the target signal is still enhanced rather than attenuated despite head movements.

Automatic Control of Narrow Directionality

With Narrow Directionality, the user can now understand speech better in challenging situations, such as a loud cafeteria environment. However, in quieter situations, it is important to hear from all around. This is where an automatic control of the binaural algorithms comes into play. The goal of this intelligent algorithm is to estimate the complexity of a situation and seamlessly introduce more directionality when needed. For this, multiple criteria are evaluated. The acoustic complexity of the hearing aid user’s listening environment is usually linked with background noise level. In order to activate binaural processing, a certain background noise level, which was optimized in various noisy environments, is needed. This threshold is higher than the respective noise level expected for monaural processing. If this binaural threshold is exceeded, the binaural audio transmission is enabled. Once the audio signal from the contralateral hearing instrument is available, the hearing instrument has far more possibilities to analyze the scene. Using a combination of both ipsi- and contralateral metrics, the effect of the beamformer is restricted to situations and frequency bands where it is useful.
Furthermore, its strength is adjusted frequency-specifically, depending on the background noise level (Figure 6a). For lower noise levels, the monaural directional microphone is engaged to provide sufficient noise attenuation in these situations (Figure 6b). As the noise level increases, Narrow Directionality engages and its effects are increased accordingly until it reaches full directionality for high noise levels. This has the important advantage of keeping spatial orientation and sound naturalness to a maximum in medium noisy situations, and in situations where most noise is contained in the lowest frequencies. It should be noted that the classification of the acoustic situation surrounding the hearing aid wearer is also taken into account. For instance, there might be some situations, such as enjoying loud music, that should not activate a narrow beamformer in an automatic program.

Figure 6: a) An example of frequency-dependent activation of the binaural Narrow Directionality effect, e.g. in a cocktail party situation. b) Directional microphone benefit is maximized by providing it in 48 channels.

For the hearing instrument user, the mechanisms “under the hood” are not important. He or she only needs to be focused on good speech understanding and listening comfort in every situation. This is why it is so important to have a seamless adaptation as the acoustic scenery changes. Any control that acts too fast is prone to misdetections, whereas any control that acts too slowly can be noticed and is therefore irritating. If the situation changes quickly, the automatic steering has to adapt quickly; if the situation changes gradually, the automatic steering should also adapt gradually. The best possible user reaction is: “I did not notice any automatic adjustments of the hearing instrument; it just always seemed right.”
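
The noise-dependent, per-band engagement described above can be pictured with a small sketch. Everything specific in it is hypothetical: the paper gives no threshold values, so `onset_db` and `full_db` are placeholder numbers, and the linear ramp and one-pole smoothing are merely one simple way to obtain a gradual, band-wise transition.

```python
def beam_strength(noise_db: float, onset_db: float = 60.0, full_db: float = 75.0) -> float:
    """Map a band's noise level to a beamformer strength between 0 and 1.

    onset_db and full_db are hypothetical placeholders, not binax values.
    """
    if full_db <= onset_db:
        raise ValueError("full_db must be greater than onset_db")
    ramp = (noise_db - onset_db) / (full_db - onset_db)
    return min(1.0, max(0.0, ramp))

def smooth(previous: float, target: float, alpha: float = 0.1) -> float:
    """Move a fraction alpha of the way from the previous strength to the target."""
    return previous + alpha * (target - previous)

# Hypothetical noise levels (dB) in four frequency bands, low to high.
band_levels = [78.0, 70.0, 62.0, 55.0]
targets = [beam_strength(level) for level in band_levels]

# Starting from no binaural effect, take one smoothing step per band.
strengths = [smooth(0.0, target) for target in targets]
print(targets)
print(strengths)
```

A smaller `alpha` corresponds to a slower, less noticeable transition and a larger one to a faster reaction, which mirrors the trade-off between misdetections and audible switching discussed in the text.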
The maximum user benefit is reached when the amount of directionality and noise reduction is optimized so that the overall perception is natural and the adjustments are unnoticeable.

The efficacy of the Narrow Directionality algorithm for speech recognition for the hearing impaired has been studied in clinical trials at two independent sites. These behavioral findings are very encouraging, and are in good agreement with the SNR advantages expected from the algorithm design. This research is summarized in a companion paper by Powers & Froehlich, 2014 [3].

Advanced beamforming in 360 degrees: Spatial SpeechFocus

The binaural audio transmission introduced with the binax platform enables not only beamforming for situations where the target speaker is in front, but also beamforming to the side of the hearing instrument wearer, e.g. for situations like walking or sitting side by side. Up to now, the best one could do when a target speaker is located to the side was to select the omnidirectional mode. However, in these situations the interfering sources, such as street noise, often come from the other directions, and omnidirectional processing cannot suppress the undesired sources. This is where Spatial SpeechFocus, introduced in the binax platform, comes into play. By suppressing an undesired side and enhancing the target signal from the desired side on both ears, this technology is built to increase listening comfort as well as speech understanding.

Figure 7: Interaural time difference. a) f = 250 Hz. b) f = 1 kHz. Whereas in the left figure the direction of arrival of the sound wave is clear, in the right figure it is already ambiguous.

The algorithm works on the same fundamental principles as human hearing. When sound comes from one side of the head, it will differ in two major ways between the two ears. First, as sound propagates, it will arrive earlier at the ear closer to the sound source. This effect is named interaural time difference (ITD).
If sound comes exactly from the side of the person (90 degrees), this time difference is approximately 0.7 milliseconds. This results in a phase difference of the respective sound, which can be used by a state-of-the-art differential beamformer [4]. This phase difference is most evident for frequencies lower than around 750 Hz (Figure 7). Luckily, there is another effect to exploit for higher frequencies. As the sound waves travel past the head, lower-frequency sound waves are diffracted by the head and hence not much attenuated. But for higher frequencies, there is significant attenuation that can be detected and used to determine the origin of a sound. This can be accomplished even for high-frequency noises, albeit with less precision. This effect is called interaural level difference (ILD). In the Spatial SpeechFocus algorithm, introduced with the binax platform, a so-called Wiener filter based approach is used to suppress signals coming from an undesired side.

Figure 8: Polar patterns for binax Spatial SpeechFocus, measured on the left hearing aid in anechoic chamber conditions. Dashed: omnidirectional; solid: Spatial SpeechFocus. a) f = 500 Hz. b) f = 2 kHz.

Both ITD and ILD are utilized to create a powerful beamforming algorithm, the effect of which can be observed in the polar plots shown in Figure 8. Depending on the frequency and room acoustics, the attenuation is approximately 10 dB. The major advantage over merely copying the signal of the preferred ear to the other side is that in this application, spatial cues are kept. That is, the user can still localize the sound and has a natural spatial impression. The polar pattern in Figure 9b shows a beamformer focused to the left side. It can be observed that sources coming from the right side (regarded as noise in this case) are attenuated by the same amount on the left and right ear compared to the omni signal in Figure 9a.
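
As a side check on the interaural time difference figures quoted above, the 0.7 ms delay and the roughly 750 Hz limit are consistent with each other. The short sketch below assumes a pure tone and a fixed 0.7 ms delay (the paper's value for 90 degrees); the function names are hypothetical. A phase difference beyond 180° can no longer be attributed unambiguously to one side.

```python
ITD_MAX_S = 0.7e-3  # interaural time difference at 90 degrees, as quoted in the text

def interaural_phase_deg(freq_hz: float, itd_s: float = ITD_MAX_S) -> float:
    """Phase difference in degrees between the ears for a pure tone."""
    return 360.0 * freq_hz * itd_s

def ambiguity_frequency_hz(itd_s: float = ITD_MAX_S) -> float:
    """Frequency at which the interaural phase difference reaches 180 degrees."""
    return 1.0 / (2.0 * itd_s)

for freq in (250.0, 1000.0):
    print(f"{freq:6.0f} Hz: {interaural_phase_deg(freq):6.1f} deg")
print(f"ambiguous above about {ambiguity_frequency_hz():.0f} Hz")
```

At 250 Hz the phase difference is about 63°, well below 180°, while at 1 kHz it is about 252°, matching the clear and ambiguous cases of Figure 7. The 180° point falls near 714 Hz, in line with the "around 750 Hz" stated in the text.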
The interaural level differences (shaded area), and thus the localization of the sources, remain untouched.

This beamformer can be controlled manually, using a remote control application (“Spatial Configurator Direction”), but it can also be controlled automatically. For this, features that correlate with the signal-to-noise ratio of speech are calculated independently in both hearing aids. These features represent the probability of speech from the front, the back, and one of the sides. By exchanging this information and combining the data, it is possible to determine from which direction a speech signal originates. This is an extension of the previous generation micon SpeechFocus algorithm. Now the Spatial SpeechFocus beamformer is able to provide a signal focused to either side in addition to the front and back. If speech is coming from both sides, or in quiet, an omnidirectional microphone mode is chosen (Figure 10).

Figure 9: Polar plots for left and right ear for a) omnidirectional mode and b) Spatial SpeechFocus to the left side, f = 2 kHz, measured on KEMAR in a low-reverberation room. The interaural level difference, represented by the shaded area, is the same in both modes.

The switching is performed synchronously on both ears, using a smooth transition. In all situations, the correct perception of the speaker(s) will be kept because the binaural cues are kept to a large extent.

Figure 10: Directivity patterns in Spatial SpeechFocus. The directionality is steered according to where speech originates.

Spatial SpeechFocus is activated if the hearing instrument detects a dedicated car situation in the automatic program, or when a special program that the hearing instrument acoustician can configure is selected.

Conclusion

In this paper, we provide insights into the new binaural directional features implemented in the latest binaural Siemens hearing aid binax.
All described functionalities are based on the new ability of the hearing aid system to transmit audio signals from ear to ear with low latency. With ‘Narrow Directionality’, we presented an enhanced binaural beamforming algorithm, designed especially for very difficult listening situations with multiple talkers in the background. It provides a narrow acoustic focus in the “look direction” of the user, and thus enables the hearing aid user to understand the preferred talker, even with several other people talking in close proximity. What also makes it unique is its smooth, situation-dependent activation and deactivation and the high-resolution control of its strength of effect, which depends separately on the acoustic conditions in each frequency band. We also described ‘Spatial SpeechFocus’, a self-steering binaural beamforming algorithm, especially useful for situations with speakers talking from the left or right side in noisy environments. It is automatically activated when speech from one side is detected in the presence of background noise. Like the human ear, it uses interaural phase and level differences to maintain a natural, spatially correct sound impression. Finally, we highlighted the surprisingly low additional power consumption needed for operating these powerful binaural algorithms.

References

[1] Kochkin, S. (2005). MarkeTrak VII: Customer satisfaction with hearing instruments in the digital age. Hearing J. 58(9), 30, 32–34, 38–40, 42–43.
[2] Kochkin, S. (2010). MarkeTrak VIII: Consumer satisfaction with hearing aids is slowly increasing. Hearing J. 63(1), 19–20, 22, 24, 26, 28, 30–32.
[3] Powers, T. & Froehlich, M. (2014). Clinical results with a new wireless binaural directional hearing system. Hearing Rev., in print.
[4] Elko, G. W. & Nguyen Pong, A.-T. (1995). A simple adaptive first-order differential microphone. IEEE AASP Workshop on Applications of Signal Processing to Audio and Acoustics.
