AES Milan 2018
Paper Session P07
P07 - Spatial Audio-Part 1
Thursday, May 24, 09:00 — 12:30 (Scala 4)
Chair: Sascha Spors, University of Rostock - Rostock, Germany
P07-1 Continuous Measurement of Spatial Room Impulse Responses on a Sphere at Discrete Elevations—Nara Hahn, University of Rostock - Rostock, Germany; Sascha Spors, University of Rostock - Rostock, Germany
Analyzing a sound field with a high spatial resolution requires a large number of measurements. A recently proposed continuous measurement technique, in which the impulse response measurement is performed with a moving microphone, is suited for this purpose. In this paper it is applied to the measurement of spatial room impulse responses on a spherical surface. The microphone captures the sound field on the sphere at discrete elevations while the system is periodically excited by a so-called perfect sequence. The captured signal is considered as a spatio-temporal sampling of the sound field, and the impulse responses are obtained by a spatial interpolation in the spherical harmonics domain. The elevation angles and the speed of the microphone are chosen in such a way that the spatial sampling points constitute a Gaussian sampling grid.
Convention Paper 9938
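As a rough illustration of the Gaussian sampling grid mentioned in the P07-1 abstract, the following sketch computes the ring colatitudes, azimuths, and quadrature weights for a given spherical harmonic order. The function name `gaussian_grid` and the `order` parameter are assumptions made here for illustration. The sketch only shows the grid geometry and does not reproduce the authors' continuous measurement procedure.

```python
import numpy as np

def gaussian_grid(order):
    """Gaussian sampling grid for spherical harmonics up to `order`.

    Returns (colatitudes, azimuths, weights). The grid has order + 1 rings
    at Gauss-Legendre colatitudes and 2 * (order + 1) equiangular azimuths
    on every ring. `weights` holds one quadrature weight per grid point,
    ordered ring by ring.
    """
    n_rings = order + 1
    n_azi = 2 * (order + 1)
    # Gauss-Legendre nodes x = cos(colatitude) and their weights (sum to 2)
    x, w = np.polynomial.legendre.leggauss(n_rings)
    colat = np.arccos(x)
    azi = 2 * np.pi * np.arange(n_azi) / n_azi
    # Per-point weights: ring weight shared equally among the azimuths
    weights = np.repeat(w * 2 * np.pi / n_azi, n_azi)
    return colat, azi, weights

# Usage: grid for order 3 (4 rings, 8 azimuths per ring)
colat, azi, weights = gaussian_grid(3)
elevations_deg = 90.0 - np.degrees(colat)  # discrete ring elevations
# Quadrature check: the integral of cos^2(colatitude) over the sphere is 4*pi/3
integral = np.sum(np.repeat(np.cos(colat) ** 2, len(azi)) * weights)
```

The weights sum to the surface area of the unit sphere, 4π, which is a quick sanity check for any such grid.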
P07-2 Real-Time Conversion of Sensor Array Signals into Spherical Harmonic Signals with Applications to Spatially Localized Sub-Band Sound-Field Analysis—Leo McCormack, Aalto University - Espoo, Finland; Symeon Delikaris-Manias, Aalto University - Helsinki, Finland; Angelo Farina, Università di Parma - Parma, Italy; Daniel Pinardi, Università di Parma - Parma, Italy; Ville Pulkki, Aalto University - Espoo, Finland
This paper presents two real-time audio plug-ins for processing sensor array signals for sound-field visualization. The first plug-in utilizes spherical or cylindrical sensor array specifications to provide analytical spatial filters which encode the array signals into spherical harmonic signals. The second plug-in utilizes these intermediate signals to estimate the direction-of-arrival of sound sources, based on a spatially localized pressure-intensity (SLPI) approach. The challenge with the traditional pressure-intensity (PI) sound-field analysis is that it performs poorly when presented with multiple sound sources with similar spectral content. Test results indicate that the proposed SLPI approach is capable of identifying sound source directions with reduced error in various environments, when compared to the PI method.
Convention Paper 9939
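For orientation, the traditional pressure-intensity (PI) analysis that P07-2 uses as its baseline can be sketched in a few lines. This is a minimal broadband, time-domain version and not the SLPI method of the paper. The function name `pi_doa` and the encoding convention (dipole signals equal to the pressure signal scaled by the source direction cosines) are assumptions made here.

```python
import numpy as np

def pi_doa(w, x, y, z):
    """Broadband DoA estimate (azimuth, elevation in radians).

    w is the omnidirectional (pressure) signal; x, y, z are the first-order
    dipole signals. Under the assumed encoding convention, the time-averaged
    product of pressure and dipole signals points toward the source.
    """
    i_vec = np.array([np.mean(w * x), np.mean(w * y), np.mean(w * z)])
    azi = np.arctan2(i_vec[1], i_vec[0])
    ele = np.arctan2(i_vec[2], np.hypot(i_vec[0], i_vec[1]))
    return azi, ele

# Usage: a single noise source at 40 degrees azimuth, 10 degrees elevation
rng = np.random.default_rng(0)
s = rng.standard_normal(4800)
az, el = np.deg2rad(40.0), np.deg2rad(10.0)
w = s
x = s * np.cos(az) * np.cos(el)
y = s * np.sin(az) * np.cos(el)
z = s * np.sin(el)
est_az, est_el = pi_doa(w, x, y, z)
```

With one source the estimate recovers the true direction. With several simultaneous sources of similar spectral content, the averaged vector blends their directions, which is the weakness the abstract describes.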
P07-3 Parametric Multidirectional Decomposition of Microphone Recordings for Broadband High-Order Ambisonic Encoding—Archontis Politis, Aalto University - Espoo, Finland; Sakari Tervo, Aalto University - Espoo, Finland; Tapio Lokki, Aalto University - Aalto, Finland; Ville Pulkki, Aalto University - Espoo, Finland
Higher-order Ambisonics (HOA) is a flexible recording and reproduction method, which makes it attractive for several applications in virtual and augmented reality. However, the recording of HOA signals with practical compact microphone arrays is limited to a certain frequency range, which depends on the applied microphone array. In this paper we present a parametric signal-dependent approach that improves the HOA signals at all frequencies. The presented method assumes that the sound field consists of multiple directional components and a diffuse part. The performance of the method is evaluated in simulations with a rigid microphone array in different direct-to-diffuse and signal-to-noise ratio conditions. The results show that the proposed method performs better than the traditional signal-independent encoding in all the simulated conditions.
Convention Paper 9940
P07-4 Adaptive Non-Coincidence Phase Correction for A to B-Format Conversion—Alexis Favrot, Illusonic GmbH - Uster, Switzerland; Christof Faller, Illusonic GmbH - Uster, Zürich, Switzerland; EPFL - Lausanne, Switzerland
B-format is usually obtained from A-format signals, i.e., from four directive microphone capsules pointing in different directions. Ideally, these capsules should be coincident, but due to design constraints, small distances always remain between them. As a result, phase mismatches between the microphone capsule signals lead to inaccuracies and interferences, impairing B-format directional responses, especially at high frequencies. A non-coincidence correction is proposed based on adaptive phase matching of the four A-format microphone signals before conversion to B-format. It improves the directional responses at high frequencies, enabling higher focus and a better spatial image and timbre in B-format signals.
Convention Paper 9941
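The static A-to-B-format conversion that P07-4 starts from can be sketched as a fixed 4x4 matrix. This is the commonly used sum-and-difference conversion for a tetrahedral array and contains no non-coincidence correction, so it is not the adaptive method of the paper. The capsule order (front-left-up, front-right-down, back-left-down, back-right-up), the 0.5 scaling, and the name `a_to_b` are assumptions made here.

```python
import numpy as np

# Assumed capsule order: FLU, FRD, BLD, BRU
A_TO_B = 0.5 * np.array([
    [1,  1,  1,  1],   # W: omnidirectional sum
    [1,  1, -1, -1],   # X: front minus back
    [1, -1,  1, -1],   # Y: left minus right
    [1, -1, -1,  1],   # Z: up minus down
])

def a_to_b(a_format):
    """Convert A-format signals of shape (4, n_samples) to W, X, Y, Z."""
    return A_TO_B @ np.asarray(a_format)

# Usage: ideal, perfectly coincident cardioid capsules and one plane wave
capsule_dirs = np.array([
    [ 1,  1,  1],   # FLU
    [ 1, -1, -1],   # FRD
    [-1,  1, -1],   # BLD
    [-1, -1,  1],   # BRU
]) / np.sqrt(3)
u = np.array([0.6, 0.8, 0.0])             # unit vector toward the source
gains = 0.5 * (1.0 + capsule_dirs @ u)    # cardioid capsule gains
b = a_to_b(gains[:, None])                # shape (4, 1)
```

In this idealized case the X, Y, Z components are exactly proportional to the source direction. Real capsules are spaced apart, so the proportionality degrades at high frequencies, which is the problem the abstract addresses.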
P07-5 Advanced B-Format Analysis—Mihailo Kolundzija, École Polytechnique Fédérale de Lausanne (EPFL) - Lausanne, Switzerland; Christof Faller, Illusonic GmbH - Uster, Zürich, Switzerland; EPFL - Lausanne, Switzerland
Spatial sound rendering methods that use B-format have moved from static to signal-dependent, making B-format signal analysis a crucial part of B-format decoders. In the established B-format signal analysis methods, the acquired sound field is commonly modeled in terms of a single plane wave and diffuse sound, or in terms of two plane waves. We present a B-format analysis method that models the sound field with two direct sounds and diffuse sound, and computes the three components' powers and direct sound directions as a function of time and frequency. We show the effectiveness of the proposed method with experiments using artificial and realistic signals.
Convention Paper 9942
P07-6 Ambisonic Decoding with Panning-Invariant Loudness on Small Layouts (AllRAD2)—Franz Zotter, IEM, University of Music and Performing Arts - Graz, Austria; Matthias Frank, University of Music and Performing Arts Graz - Graz, Austria
On ITU BS.2051 surround-with-height loudspeaker layouts, Ambisonic panning is practice-proof when using AllRAD decoders, which involve imaginary loudspeaker insertion and downmix. Yet on the 4+5+0 layout, this still yields a loudness difference of nearly 3 dB between sounds panned to the front and those panned to the back. AllRAD linearly superimposes a series of two panning functions: optimally sampled Ambisonics and VBAP. Both are perfectly energy-preserving and therefore do not cause the loudness differences themselves, but their linear superposition does. In this contribution we present and analyze a new AllRAD2 approach that achieves decoding of constant loudness by (i) superimposing the squares of both panning functions, and (ii) calculating the equivalent linear decoder of the square root thereof.
Convention Paper 9943
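The energy argument in the P07-6 abstract can be checked numerically with a toy example. The sketch below covers only step (i), superimposing the squares of two energy-preserving gain sets and taking the square root. It does not compute the equivalent linear decoder of step (ii). The matrices are random stand-ins with unit-energy columns, not real VBAP or Ambisonic gains for any ITU layout, and all names are assumptions made here.

```python
import numpy as np

def unit_columns(m):
    """Scale every column to unit energy (sum of squares equal to 1)."""
    return m / np.linalg.norm(m, axis=0, keepdims=True)

rng = np.random.default_rng(1)
n_real, n_virtual = 5, 12

# Toy stand-in for VBAP gains from each virtual to the real loudspeakers
vbap = unit_columns(rng.random((n_real, n_virtual)))
# Toy stand-in for Ambisonic panning gains on the virtual loudspeakers
amb = rng.standard_normal(n_virtual)
amb /= np.linalg.norm(amb)

# AllRAD-style linear superposition: energy depends on the gains involved
g_linear = vbap @ amb
# Superposition of squares followed by a square root: energy stays at 1
g_squared = np.sqrt((vbap ** 2) @ (amb ** 2))

energy_linear = np.sum(g_linear ** 2)
energy_squared = np.sum(g_squared ** 2)
```

The squared superposition preserves energy because each column of `vbap ** 2` sums to 1, so the total equals the sum of `amb ** 2`, which is also 1. The linear superposition has no such guarantee, since cross terms between columns enter the energy.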
P07-7 BRIR Synthesis Using First-Order Microphone Arrays—Markus Zaunschirm, University of Music and Performing Arts - Graz, Austria; IEM; Matthias Frank, University of Music and Performing Arts Graz - Graz, Austria; Franz Zotter, IEM, University of Music and Performing Arts - Graz, Austria
Both the quality and immersion of binaural auralization benefit from head movements and individual measurements. However, measurements of binaural room impulse responses (BRIRs) for various head rotations are both time consuming and costly. Hence, for efficient BRIR synthesis, a separate measurement of the listener-dependent part (head-related impulse responses, HRIRs) and the room-dependent part (room impulse response, RIR) is desirable. The room-dependent part can be measured with compact first-order microphone arrays; however, their inherent spatial resolution is often unsatisfactory. Our contribution presents an approach to enhance the spatial resolution using the spatial decomposition method in order to synthesize high-resolution BRIRs that facilitate easy application of arbitrary HRIRs and incorporation of head movements. Finally, the synthesized BRIRs are compared to measured BRIRs.
Convention Paper 9944