
AES Paris 2016
Poster Session P19

P19 - Perception Part 2, Audio Signal Processing Part 3, and Recording and Production Techniques


Monday, June 6, 14:45 — 16:45 (Foyer)

P19-1 Two Alternative Minimum-Phase Filters Tested Perceptually
Robert Mores, University of Applied Sciences Hamburg - Hamburg, Germany; Ralf Hendrych, University of Applied Sciences Hamburg - Hamburg, Germany
A widely used method for designing minimum-phase filters is based on the real cepstrum (Oppenheim, 1975). An alternative method is proposed for symmetric FIR filters that flips the filter’s “left side” around the central coefficient onto the “right side” using a sine ramp of perceptually irrelevant duration. The resulting phase is nearly minimal and nearly linear. The method is applied to impulse responses. Perception tests use original sound samples (A), samples processed by real-cepstrum-based minimum-phase filters (B), and samples processed by the proposed method (C). The tests reveal that for impulsive sound samples the perceived dissimilarity between A and C is smaller than the dissimilarity between A and B, suggesting that the alternative method has some potential for sound processing. [Also a lecture—see session P13-2]
Convention Paper 9554
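
A minimal sketch of the real-cepstrum minimum-phase construction referenced in P19-1 above (the classical homomorphic method, not the authors' proposed alternative), assuming numpy/scipy; the FFT length and the linear-phase prototype are illustrative choices:

    import numpy as np
    from scipy.signal import firwin

    def minimum_phase_from_real_cepstrum(h, n_fft=4096):
        """Convert an FIR filter to a (nearly) minimum-phase version via the
        real cepstrum (homomorphic method)."""
        H = np.maximum(np.abs(np.fft.fft(h, n_fft)), 1e-12)   # magnitude, floored
        cepstrum = np.fft.ifft(np.log(H)).real                # real cepstrum
        # Fold the anti-causal part of the cepstrum onto the causal part.
        w = np.zeros(n_fft)
        w[0] = 1.0
        w[1:n_fft // 2] = 2.0
        w[n_fft // 2] = 1.0
        H_min = np.exp(np.fft.fft(w * cepstrum))               # minimum-phase spectrum
        return np.fft.ifft(H_min).real[:len(h)]

    # Example: a 101-tap linear-phase low-pass prototype (illustrative only).
    h_min = minimum_phase_from_real_cepstrum(firwin(101, 0.25))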

P19-2 A Further Investigation of Echo Thresholds for the Optimization of Fattening Delays
Michael Uwins, University of Huddersfield - Huddersfield, UK; Dan Livesey, Confetti College, Nottingham Trent University - Nottingham, UK
Since the introduction of stereophonic sound systems, mix engineers have developed and employed numerous artificial methods to enhance their productions. A simple yet notable example is the effect commonly known as “fattening,” where a mono signal is cloned, delayed, and then panned to the opposite side of the stereo field. The technique can improve a sound’s prominence in the mix by increasing its overall amplitude while creating a pseudo-stereo image, and it relies on a renowned psychoacoustic phenomenon, the “precedence effect.” The aim of this investigation was to build upon previous accepted studies, conducting further experiments to produce refined estimates of echo thresholds for elements common to a multi-track music production. This investigation obtained new estimates of echo thresholds and fattening delay times, for a variety of isolated instrumental and vocal recordings, as perceived by a sample population of trained mix engineers. The study concludes that current recommendations for the delay times used to create fattening effects should be refined, taking into account not only the features of the source material but also the consequences of temporal and spectral masking, when applied in the context of a multitrack mix. [Also a lecture—see session P16-3]
Convention Paper 9571
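
The “fattening” effect in P19-2 is simple to sketch: duplicate a mono source, delay the copy by a sub-echo-threshold amount, and pan dry and delayed copies to opposite sides. The delay value and sample rate below are illustrative assumptions, not figures from the paper:

    import numpy as np

    def fatten(mono, fs, delay_ms=18.0):
        """Pseudo-stereo 'fattening': dry signal hard left, delayed clone hard right."""
        d = int(round(delay_ms * 1e-3 * fs))
        dry = np.concatenate([mono, np.zeros(d)])
        delayed = np.concatenate([np.zeros(d), mono])
        return np.stack([dry, delayed], axis=1)   # columns: [left, right]

    fs = 48000
    burst = np.random.randn(fs // 2) * np.hanning(fs // 2)   # toy test signal
    stereo = fatten(burst, fs)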

P19-3 Extraction of Anthropometric Measures from 3D-Meshes for the Individualization of Head-Related Transfer Functions
Manoj Dinakaran, Huawei Technologies, European Research Center - Munich, Germany; Technical University of Berlin - Berlin, Germany; Peter Grosche, Huawei Technologies European Research Center - Munich, Germany; Fabian Brinkmann, Technical University of Berlin - Berlin, Germany; Stefan Weinzierl, Technical University of Berlin - Berlin, Germany
Anthropometric measures are used for individualizing head-related transfer functions (HRTFs), for example by selecting best-matching HRTFs from a large library or by manipulating HRTFs with respect to anthropometrics. Within this process, an accurate extraction of anthropometric measures is crucial, as even small deviations may influence the individualization. Anthropometrics can be measured in many different ways, e.g., from pictures or with anthropometers. However, these approaches tend to be inaccurate. Therefore, we propose to use Kinect for generating individual 3D head-and-shoulder meshes from which anthropometrics are automatically extracted. This is achieved by identifying and measuring distances between characteristic points on the outline of each mesh and was found to yield accurate and reliable estimates of the corresponding features. In our experiment, a large set of anthropometric measures was automatically extracted for 61 subjects and evaluated by cross-validation against their manually measured counterparts.
Convention Paper 9579

P19-4 Methods of Phase-Aligning Individual Instruments Recorded with Multiple Microphones during Post-Production
Bartlomiej Kruk, Wroclaw University of Technology, Faculty of Electronics, Chair of Acoustics and Multimedia - Wroclaw, Poland; University of Applied Science - Nysa, Poland; Aleksander Sobecki, Wroclaw University of Technology - Wroclaw, Poland
When recording any instrument, such as a guitar cabinet or a drum set, with a multi-microphone setup, phase plays a key role in shaping the sound. Despite its importance, phase is often overlooked during the recording process because of a lack of time or experience. During the mixing stage, engineers then tend to use equalizers and compressors to correct issues that might originate in signals not being well time-aligned. Phase-measuring tools like goniometers are widely used by mastering engineers to diagnose phase-related issues in a mix, yet their usefulness in shaping the sounds of individual instruments is largely overlooked. The main aim of this paper is to present and analyze easy phase-aligning methods.
Convention Paper 9580
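
One easy alignment method of the kind P19-4 surveys is to estimate the inter-microphone delay from the cross-correlation peak and shift one track accordingly, checking polarity from the sign of the peak. A hedged sketch, assuming two mono tracks at the same sample rate:

    import numpy as np
    from scipy.signal import correlate

    def align_to_reference(ref, track):
        """Time-align 'track' to 'ref' via the cross-correlation peak."""
        xcorr = correlate(track, ref, mode="full")
        idx = int(np.argmax(np.abs(xcorr)))
        lag = idx - (len(ref) - 1)         # > 0: track lags the reference
        if xcorr[idx] < 0:
            track = -track                 # negative peak suggests inverted polarity
        return np.roll(track, -lag), lag   # circular shift; zero-pad in practice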

P19-5 Wireless Sensor Networks for Sound Design: A Summary on Possibilities and Challenges
Felipe Reinoso Carvalho, Vrije Universiteit Brussel, ETRO - Pleinlaan 2, 1050, Brussels; KU Leuven; Abdellah Touhafi, Vrije Universiteit Brussel - Pleinlaan, Brussels; Kris Steenhaut, Vrije Universiteit Brussel - Pleinlaan, Brussels
This article presents opportunities for using Wireless Sensor Networks (WSNs) equipped with acoustic sensors as tools for sound design. We introduce the technology, state-of-the-art examples, and several potential applications involving different profiles of sound design. The ability to add real-time audio messages to a sound design is a central concern of this proposal. The current technological situation and its challenges are discussed. The use of WSNs for sound design is plausible, although technological challenges demand strong interaction between sound designers and WSN developers.
Convention Paper 9581

P19-6 Object-Based Audio Recording Methods
Jean-Christophe Messonnier, CNSMDP Conservatoire de Paris - Paris, France; Jean-Marc Lyzwa, CNSMDP - Paris, France; Delphine Devallez, Arkamys - Paris, France; Catherine De Boisheraud, CNSMDP - Paris, France
The new Audio Definition Model (ADM) standard makes it possible to define an audio file as object-based audio. Along with many other functionalities, polar coordinates can be specified for each audio object. An audio scene can therefore be described independently of the reproduction system. This means that an object-based recording can be rendered on a 5.1 system, a binaural system, or any other system. In the case of a binaural system, it also offers the opportunity to interact with the audio content, as a head-tracker can be used to follow the movements of the listener’s head and change the binaural rendering accordingly. This paper describes how such an object-based recording can be achieved. [Also a lecture—see session P16-2]
Convention Paper 9570
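
To make the object-based idea in P19-6 concrete, a toy representation is sketched below: an object is a mono signal plus position metadata, and a renderer maps it to one particular layout (here a stereo pair with constant-power panning). The class and the azimuth convention (positive to the right) are assumptions for illustration, not the ADM format itself:

    import numpy as np
    from dataclasses import dataclass

    @dataclass
    class AudioObject:
        signal: np.ndarray   # mono audio
        azimuth_deg: float   # static position metadata (azimuth only)

    def render_to_stereo(objects, span_deg=60.0):
        """Render objects to a +/-30 degree stereo pair with constant-power
        panning; a different renderer could map the same objects to 5.1 or binaural."""
        out = np.zeros((max(len(o.signal) for o in objects), 2))
        for o in objects:
            x = np.clip((o.azimuth_deg + span_deg / 2) / span_deg, 0.0, 1.0)
            theta = x * np.pi / 2                        # 0 = hard left, pi/2 = hard right
            out[:len(o.signal), 0] += np.cos(theta) * o.signal
            out[:len(o.signal), 1] += np.sin(theta) * o.signal
        return out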

P19-7 The Harmonic Centroid as a Predictor of String Instrument Timbral Clarity
Kirsten Hermes, University of Surrey - Guildford, Surrey, UK; Tim Brookes, University of Surrey - Guildford, Surrey, UK; Chris Hummersone, University of Surrey - Guildford, Surrey, UK
Spectrum is an important factor in determining timbral clarity. An experiment in which listeners rate the changes in timbral clarity resulting from spectral equalization (EQ) can provide insight into the relationship between EQ and the clarity of string instruments. Overall, higher frequencies contribute to clarity more positively than lower ones, but the relationship is program-item-dependent. Fundamental frequency and spectral slope both appear to be important. Change in harmonic centroid (a dimensionless spectral centroid) correlates well with change in clarity, more so than the octave band boosted/cut, the harmonic number boosted/cut, or other variations on the spectral centroid. [Also a lecture—see session P13-5]
Convention Paper 9557
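
The harmonic centroid used in P19-7 is essentially the spectral centroid expressed in harmonic numbers rather than Hz, i.e. the amplitude-weighted mean harmonic number. A minimal sketch under that reading, assuming partial amplitudes have already been estimated (the paper's exact weighting may differ):

    import numpy as np

    def harmonic_centroid(partial_amplitudes):
        """Amplitude-weighted mean harmonic number; index 0 is the fundamental."""
        a = np.asarray(partial_amplitudes, dtype=float)
        k = np.arange(1, len(a) + 1)
        return np.sum(k * a) / np.sum(a)

    bright = 1.0 / np.arange(1, 21) ** 0.5   # slowly decaying partials
    dull = 1.0 / np.arange(1, 21) ** 2       # steep spectral slope
    print(harmonic_centroid(bright), harmonic_centroid(dull))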

P19-8 Subjective Listening Tests for Preferred Room Response in Cinemas - Part 2: Preference Test Results
Linda A. Gedemer, University of Salford - Salford, UK; Harman International - Northridge, CA, USA
SMPTE and ISO have specified near-identical in-room target response curves for cinemas and dubbing stages. However, to the author's knowledge, these standards have to date never been scientifically tested and validated with modern technology and measurement techniques. For this reason it is still not known whether the current SMPTE and ISO in-room target response curves are optimal or whether better solutions exist. Using a Binaural Room Scanning system for room capture and simulation, various seating positions in three cinemas were reproduced through headphones for the purpose of conducting controlled listening experiments. The system uses a binaural mannequin equipped with a computer-controlled rotating head to accurately capture binaural impulse responses of the sound system and the listening space, which are then reproduced via calibrated headphones equipped with a head-tracker. In this way controlled listening evaluations can be made among different cinema audio systems tuned to different in-room target responses. Results from a MUSHRA-style preference test are presented. [Also a lecture—see session P13-3]
Convention Paper 9555

P19-9 Exploiting Envelope Fluctuations to Enhance Binaural Perception
G. Christopher Stecker, Vanderbilt University School of Medicine - Nashville, TN, USA
A review of recent and classic studies of binaural perception leads to the conclusion that envelope fluctuations, such as sound onsets, play a critical role in the sampling of spatial information from auditory stimuli. Specifically, listeners’ perception of sound location follows the binaural cues (interaural time and level differences) that coincide with brief increases in sound amplitude, while binaural cues occurring at other times are disregarded. This discrete, envelope-triggered sampling of binaural information can be exploited to enhance the spatial perception of synthesized sound mixtures, or to facilitate the localization of mixture components. [Also a lecture—see session P13-1]
Convention Paper 9553
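
A rough illustration of the envelope-triggered sampling idea in P19-9: estimate interaural time differences only in short windows that follow envelope onsets rather than uniformly over the signal. The onset detector, window length, and threshold are assumptions for the sketch, not the author's procedure:

    import numpy as np
    from scipy.signal import hilbert, correlate

    def onset_gated_itd(left, right, fs, win_ms=10.0, rel_thresh=0.3):
        """ITD estimates (in samples) taken only just after envelope onsets."""
        env = np.abs(hilbert(left + right))                     # broadband envelope
        rise = np.diff(env, prepend=env[0])
        onsets = np.flatnonzero(rise > rel_thresh * rise.max())
        win, itds, last = int(win_ms * 1e-3 * fs), [], None
        for n in onsets:
            if (last is not None and n - last < win) or n + win > len(left):
                continue                                        # skip crowded or truncated onsets
            last = n
            xc = correlate(left[n:n + win], right[n:n + win], mode="full")
            itds.append(int(np.argmax(np.abs(xc))) - (win - 1))  # > 0: left lags right
        return np.array(itds)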

P19-10 Comparison of Simple Self-Oscillating PWM Modulators
Nicolai Dahl, Technical University of Denmark - Lyngby, Denmark; Niels Elkjær Iversen, Technical University of Denmark - Kongens Lyngby, Denmark; Arnold Knott, Technical University of Denmark - Kgs. Lyngby, Denmark; Michael A. E. Andersen, Technical University of Denmark - Kgs. Lyngby, Denmark
Switch-mode power amplifiers have become the conventional choice for audio applications due to their superior efficiency and excellent audio performance. These amplifiers rely on high-frequency modulation of the audio input. Conventional modulators use a fixed high frequency for modulation. Self-oscillating modulators do not have a fixed modulation frequency and can provide good audio performance with very simple circuitry. This paper proposes a new type of self-oscillating modulator, which is compared to an existing modulator of a similar type both theoretically and experimentally. The results show that the proposed modulator provides a higher degree of linearity, resulting in around 2% lower Total Harmonic Distortion (THD). [Also a lecture—see session P14-5]
Convention Paper 9562

P19-11 Spatial Multi-Zone Sound Field Reproduction Using Higher-Order Loudspeakers in Reverberant Rooms
Keigo Wakayama, NTT Service Evolution Laboratories - Kanagawa, Japan; Hideaki Takada, NTT Service Evolution Laboratories - Kanagawa, Japan
We propose a method for reproducing multi-zone sound fields in a reverberant room using an array of higher-order loudspeakers. This method enables a sparse arrangement of loudspeakers and the reproduction of independent sound fields for multiple listeners without the need for headphones. For multi-zone reproduction, global sound field coefficients are obtained using a translation operator. Using the coefficients of the room transfer function, measured or simulated with an extension of the image-source method, the loudspeakers’ coefficients are then calculated with the minimum-norm method in the cylindrical harmonic domain. From experiments with two-zone and three-zone examples, we show that accurate reproduction with the proposed method requires (2N + 1) times fewer loudspeakers when Nth-order units are used, compared to conventional methods. [Also a lecture—see session P14-3]
Convention Paper 9560
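
The (2N + 1)-fold figure quoted in P19-11 follows from the two-dimensional cylindrical harmonic expansion of an interior sound field (a standard result, sketched here rather than the authors' full derivation):

    p(r, \phi, k) = \sum_{m=-N}^{N} A_m(k) \, J_m(kr) \, e^{\mathrm{i} m \phi}

Truncation at order N leaves 2N + 1 coefficients A_m per frequency. An Nth-order loudspeaker can drive these 2N + 1 directivity components independently, whereas a conventional (zeroth-order) unit offers a single degree of freedom, so controlling a given set of field coefficients needs roughly 1/(2N + 1) as many higher-order units.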

P19-12 Stereo Panning Law Remastering Algorithm Based on Spatial Analysis
François Becker, Paris, France; Benjamin Bernard, Medialab Consulting SNP - Monaco, Monaco; Longcat Audio Technologies - Chalon-sur-Saone, France
Changing the panning law of a stereo mixture is often impossible when the original multitrack session cannot be retrieved or used, or when the mixing desk uses a fixed panning law. Yet such a modification would be of interest during tape mastering sessions, among other applications. We present a frequency-based algorithm that computes the panorama power ratio from stereo signals and changes the panning law without altering the original panorama. [Also a lecture—see session P7-6]
Convention Paper 9523
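
A minimal sketch in the spirit of P19-12, not the authors' algorithm: per STFT bin, estimate the pan position from the left/right power ratio under the source law, then apply the common gain that moves the mix onto the level contour of the target law, leaving the L/R ratio (the panorama) untouched. The sin/cos law family, the exponent mapping, and the STFT parameters are assumptions:

    import numpy as np
    from scipy.signal import stft, istft

    def change_panning_law(left, right, fs, src_drop_db=6.0, dst_drop_db=3.0, nfft=2048):
        """Re-map the level-vs-pan contour from one pan law to another.
        Laws are modeled as gL = cos(t)**a, gR = sin(t)**a with a = drop_dB / 3
        (a = 1 gives a -3 dB center, a = 2 a -6 dB center)."""
        a, b = src_drop_db / 3.0, dst_drop_db / 3.0
        _, _, L = stft(left, fs, nperseg=nfft)
        _, _, R = stft(right, fs, nperseg=nfft)
        eps = 1e-12
        theta = np.arctan((np.abs(R) / (np.abs(L) + eps)) ** (1.0 / a))   # pan angle per bin
        c, s = np.cos(theta), np.sin(theta)
        gain = np.sqrt((c ** (2 * b) + s ** (2 * b)) /
                       (c ** (2 * a) + s ** (2 * a) + eps))               # common L/R gain
        _, left_out = istft(gain * L, fs, nperseg=nfft)
        _, right_out = istft(gain * R, fs, nperseg=nfft)
        return left_out, right_out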

P19-13 Non-Linear Extraction of a Common Signal for Upmixing Stereo Sources
François Becker, Paris, France; Benjamin Bernard, Medialab Consulting SNP - Monaco, Monaco; Longcat Audio Technologies - Chalon-sur-Saone, France
In the context of a two- to three-channel upmix, center-channel derivation falls within the field of common-signal extraction methods. In this paper we explore the pertinence of the performance criteria that can be obtained from a probabilistic approach to source extraction; we propose a new, non-linear method to extract a common signal from two sources that makes the implementation choice of deeper extraction under a criterion of information preservation; and we provide the results of preliminary listening tests made with real-world audio material. [Also a lecture—see session P7-7]
Convention Paper 9524
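
For context on P19-13, a simple baseline for common-signal extraction (not the authors' non-linear method): per STFT bin, take the common component as the smaller of the two channel magnitudes with the phase of the channel sum, and subtract it from each side. Parameters are illustrative:

    import numpy as np
    from scipy.signal import stft, istft

    def extract_center(left, right, fs, nfft=2048):
        """Crude two-to-three upmix front end: center plus residual sides."""
        _, _, L = stft(left, fs, nperseg=nfft)
        _, _, R = stft(right, fs, nperseg=nfft)
        common = np.minimum(np.abs(L), np.abs(R)) * np.exp(1j * np.angle(L + R))
        _, center = istft(common, fs, nperseg=nfft)
        _, left_res = istft(L - common, fs, nperseg=nfft)
        _, right_res = istft(R - common, fs, nperseg=nfft)
        return center, left_res, right_res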


