AES Paris 2016
Poster Session P15
P15 - Live Sound Practice, Rendering, Human Factors and Interfaces
Monday, June 6, 08:45 — 10:45 (Foyer)
P15-1 Evaluation of Quality Features of Spatial Audio Signals in Non-Standardized Rooms: Two Mixed Method Studies—Ulrike Sloma, Technische Universität Ilmenau - Ilmenau, Germany
It is known that the propagation and the characteristics of a reproduced sound wave in a room are influenced by the room acoustics. Is the perceived sound quality influenced as well? To answer this question it is indispensable to study the quality evaluation of reproduced spatial audio signals in non-standardized rooms and to compare the results with those from the standardized listening rooms in which quality evaluation is usually conducted. Besides the overall quality, it is reasonable to assess which parameters of the room acoustics influence which quality features. To evaluate the principal influence of different listening rooms on the perception of audio signals, two listening tests were conducted in which three acoustically different rooms were examined. The first study aimed to determine whether there is an influence on the basic audio quality and on five given quality features; it was realized as a single-stimulus test. Based on the results, a second test was conducted, with an approach adapted from the Open Profiling of Quality. The results of both studies suggest that the room characteristics are of minor importance for the perception of spatial audio signals.
Convention Paper 9565 (Purchase now)
P15-2 Novel Designs for the Audio Mixing Interface Based on Data Visualization First Principles—Christopher Dewey, University of Huddersfield - Huddersfield, UK; Jonathan Wakefield, University of Huddersfield - Huddersfield, UK
Given the shortcomings of current audio mixing interfaces (AMIs), this study focuses on the development of alternative AMIs based on data-visualization first principles. The elementary perceptual tasks defined by Cleveland informed the design process. Two design ideas were considered for pan: using the elementary perceptual task “scale” to display pan on either a single horizontal line or multiple horizontal lines. Four design ideas were considered for level: using “length,” “area,” “saturation,” or a “scalable icon” for visualization. Each level idea was prototyped with each pan idea, yielding eight novel interfaces. Seven subjects undertook a usability evaluation, replicating a 16-channel reference mix with each interface. Results showed that “scalable icons,” especially on multiple horizontal lines, appear to show the most potential.
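By way of illustration of the design approach (not code from the paper), mapping mixer state onto Cleveland-style visual variables might look like the following Python sketch; the function names, pixel ranges, and dB scaling are assumptions chosen only to make the idea concrete.

```python
# Illustrative sketch: map a channel's pan/level to visual variables in the
# spirit of Cleveland's elementary perceptual tasks ("scale" for pan,
# "length"/"area" for level). Names and scaling are assumptions, not the
# authors' implementation.

def pan_to_x(pan: float, line_width: float = 800.0) -> float:
    """Map pan in [-1.0, 1.0] (left..right) to an x position on a horizontal line."""
    return (pan + 1.0) / 2.0 * line_width

def level_to_length(level_db: float, max_px: float = 120.0,
                    floor_db: float = -60.0) -> float:
    """Map level in dBFS to a bar length (the 'length' encoding)."""
    norm = max(0.0, min(1.0, (level_db - floor_db) / -floor_db))
    return norm * max_px

def level_to_icon_radius(level_db: float, max_px: float = 60.0,
                         floor_db: float = -60.0) -> float:
    """Map level to an icon radius so that *area* tracks the value ('area' encoding)."""
    norm = max(0.0, min(1.0, (level_db - floor_db) / -floor_db))
    return max_px * norm ** 0.5  # sqrt: area, not radius, is proportional to level

# Example: a channel panned slightly left at -12 dBFS.
print(pan_to_x(-0.25), level_to_length(-12.0), level_to_icon_radius(-12.0))
```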
Convention Paper 9566 (Purchase now)
P15-3 The Method for Generating Movable Sound Source—Heng Wang, Wuhan Polytechnic University - Wuhan, Hubei, China; Yafei Wu, Wuhan University - Wuhan, Hubei, China; Cong Zhang, Wuhan Polytechnic University - Wuhan, Hubei, China
The rapid development of 3D video has inspired demand for 3D audio technology and products, but the products currently on the market largely follow existing stereo or surround sound technology, and it is difficult for them to produce a three-dimensional sound field synchronized with 3D video content. The proposed method is based on the VBAP principle: principles and formulas for generating a movable sound source in 3D space are derived, and a method of generating movable sound sources is implemented in a purpose-built 3D audio system. The generated virtual sound source can follow customized trajectories at customized speeds in 3D space. Existing audio equipment is retained to the greatest extent possible; it only needs to be configured according to the allocation model to give the audience a truly immersive audio-visual experience. In field tests the movement of the virtual sound sources generated by this method was clearly perceptible in the 3D sound field, so the method is useful both for generating 3D movable sound sources in future research and experimental film-making and for promoting 3D audio in home entertainment.
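As a concrete illustration of the panning principle the paper builds on, the following sketch computes VBAP gains for a source direction inside a loudspeaker triplet; a moving source is rendered by re-evaluating the gains along a trajectory over time. The triplet geometry and the trajectory are illustrative assumptions, not the authors' setup.

```python
# Minimal VBAP sketch (vector base amplitude panning): gains for a source
# direction inside an active loudspeaker triplet.
import numpy as np

def vbap_gains(src_dir, spk_dirs):
    """Gains g solving g @ L = p, then normalized to unit power.
    spk_dirs: (3, 3) matrix with one loudspeaker unit vector per row."""
    p = np.asarray(src_dir, dtype=float)
    p /= np.linalg.norm(p)
    L = np.asarray(spk_dirs, dtype=float)
    g = p @ np.linalg.inv(L)          # solve g L = p
    return g / np.linalg.norm(g)      # keep loudness constant while panning

# Assumed triplet: left-front, right-front, elevated-center (unit vectors).
triplet = np.array([[0.707,  0.707, 0.0],
                    [0.707, -0.707, 0.0],
                    [0.894,  0.0,   0.447]])

# Sample a trajectory that rises in elevation (the "movable source").
for t in np.linspace(0.0, 1.0, 5):
    src = [np.cos(0.3 * t), 0.5 * np.sin(0.3 * t), 0.3 * t]
    print(np.round(vbap_gains(src, triplet), 3))
```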
Convention Paper 9567 (Purchase now)
P15-4 Graphical Interface Aimed for Organizing Music Based on Mood of Music—Magdalena Plewa, Gdansk University of Technology - Gdansk, Poland; Bozena Kostek, Gdansk University of Technology - Gdansk, Poland; Audio Acoustics Lab.; Mateusz Bień, Academy of Music in Kraków - Kraków, Poland
Mood of music is one of the most intuitive criteria for listeners; thus it is used in automated systems for organizing music. This study is based on the emotional content of music and its automatic recognition, and it contains the outcomes of a series of experiments related to building models and descriptions of emotions in music. One hundred fifty-four excerpts from 10 music genres were evaluated in listening experiments using a graphical model proposed by the authors, dedicated to the subjective evaluation of the mood of music. The proposed model of the mood of music was created in the Max/MSP environment. Automatic mood recognition employing self-organizing maps (SOM) and artificial neural networks (ANN) was carried out, and both methods returned results coherent with the subjective evaluation.
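A minimal sketch of the automatic-recognition step might look like the following: a small ANN regressor mapping audio features to 2-D mood coordinates learned from listener ratings. The feature choice, network size, use of scikit-learn, and the placeholder data are all assumptions; the paper itself reports SOM- and ANN-based recognition against its own graphical mood model.

```python
# Hedged sketch: ANN mapping audio features to 2-D mood coordinates.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Placeholder training data: rows = excerpts, columns = assumed audio
# features (e.g., tempo, spectral centroid, RMS); targets = listener-rated
# 2-D mood coordinates from the graphical model.
X = rng.normal(size=(154, 3))
y = rng.uniform(-1.0, 1.0, size=(154, 2))

ann = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
ann.fit(X, y)

new_excerpt = rng.normal(size=(1, 3))
print("predicted mood coordinates:", ann.predict(new_excerpt))
```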
Convention Paper 9568 (Purchase now)
P15-5 Subjective Evaluation of High Resolution Audio through Headphones—Mitsunori Mizumachi, Kyushu Institute of Technology - Kitakyushu, Fukuoka, Japan; Ryuta Yamamoto, Digifusion Japan Co., Ltd. - Hiroshima, Japan; Katsuyuki Niyada, Hiroshima Cosmopolitan University - Hiroshima, Japan
Recently, high resolution audio (HRA) has become playable on portable devices and has spread across musical genres and generations, which means that most people listen to HRA through headphones and earphones. In this study perceptual discrimination among audio formats, including HRA, was investigated using headphones. Thirty-six subjects with a wide variety of audio and musical experience, ranging in age from their 20s to their 70s, participated in listening tests. Headphone presentation proved superior to loudspeaker presentation for discriminating details; it was found, however, that headphone presentation is weaker at reproducing presence and reality. Audio enthusiasts and musicians could discriminate audio formats significantly better than ordinary listeners in both headphone and loudspeaker listening conditions.
[Also a lecture—see session P10-1]
Convention Paper 9529 (Purchase now)
P15-6 Accelerometer Based Motional Feedback Integrated in a 2 3/4" Loudspeaker—Ruben Bjerregaard, Technical University of Denmark - Kongens Lyngby, Denmark; Anders N. Madsen, Technical University of Denmark - Kongens Lyngby, Denmark; Henrik Schneider, Technical University of Denmark - Kgs. Lyngby, Denmark; Finn T. Agerkvist, Technical University of Denmark - Kgs. Lyngby, Denmark; Michael A. E. Andersen, Technical University of Denmark - Kgs. Lyngby, Denmark
It is a well-known fact that loudspeakers produce distortion when they are driven into large diaphragm displacements. Various methods exist to reduce this distortion, using forward compensation and feedback. Acceleration-based motional feedback is one of these methods and was already thoroughly described in the 1960s, showing good results at low frequencies. In spite of this, the technique has mainly been used for closed-box subwoofers, and only to a limited extent. In this paper design and experimental results for a 2 3/4" acceleration-based motional feedback loudspeaker are presented, extending this feedback method to a small full-range loudspeaker. Furthermore, the audio quality of the system with feedback is discussed based on measurements of harmonic distortion, intermodulation distortion, and subjective evaluation.
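The key property exploited here is that distortion generated inside a feedback loop is attenuated by roughly the loop gain. A minimal numeric sketch, with loop-gain values that are illustrative assumptions rather than measurements from the paper:

```python
# Why acceleration feedback suppresses low-frequency distortion: components
# generated inside the loop are attenuated by the sensitivity 1/(1 + G*H).
import numpy as np

def distortion_attenuation_db(loop_gain):
    """Attenuation of in-loop distortion for a given open-loop gain G*H."""
    return 20.0 * np.log10(abs(1.0 + loop_gain))

for gh in (1.0, 3.0, 10.0):   # assumed loop gains at low frequencies
    print(f"loop gain {gh:4.1f} -> distortion reduced by "
          f"{distortion_attenuation_db(gh):5.1f} dB")
```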
[Also a lecture—see session P10-6]
Convention Paper 9534 (Purchase now)
P15-7 A Headphone Measurement System Covers both Audible Frequency and beyond 20 kHz (Part 2)—Naotaka Tsunoda, Sony Corporation - Shinagawa-ku, Tokyo, Japan; Takeshi Hara, Sony Video & Sound Products Inc. - Tokyo, Japan; Koji Nageno, Sony Video and Sound Corporation - Tokyo, Japan
A new scheme consisting of measurement with a wide-range HATS and free-field HRTF correction is proposed to enable frequency-response measurement over the entire range, from the audible band up to 140 kHz, and direct comparison with free-field loudspeaker frequency responses. This report supplements the previous report [N. Tsunoda et al., “A Headphone Measurement System for Audible Frequency and Beyond 20 kHz,” AES Convention 139, October 2015, convention paper 9375], which described the system concept, by adding details of the ear simulator and tips for obtaining reliable data with much improved reproducibility.
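The correction step can be pictured as a spectral division: the headphone response measured on the HATS is referenced to the HATS's free-field response to a loudspeaker, so both land on the same axis. The sketch below is a hedged illustration of that idea; the array names, toy responses, and smoothing-free division are assumptions.

```python
# Hedged sketch of free-field HRTF correction by spectral division.
import numpy as np

def corrected_response_db(hp_mag, ff_hrtf_mag, eps=1e-12):
    """Headphone magnitude response referenced to the free-field response.
    Both inputs are linear magnitude spectra on a common frequency grid."""
    return 20.0 * np.log10((hp_mag + eps) / (ff_hrtf_mag + eps))

freqs = np.linspace(20.0, 140e3, 2048)          # grid up to 140 kHz
hp = 1.0 / np.sqrt(1.0 + (freqs / 60e3) ** 4)   # toy headphone roll-off
ff = 1.0 / np.sqrt(1.0 + (freqs / 20e3) ** 2)   # toy free-field reference
print(corrected_response_db(hp, ff)[:4])
```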
[Also a lecture—see session P10-2]
Convention Paper 9530 (Purchase now)
P15-8 Deep Neural Networks for Dynamic Range Compression in Mastering Applications—Stylianos Ioannis Mimilakis, Fraunhofer Institute for Digital Media Technology (IDMT) - Ilmenau, Germany; Konstantinos Drossos, Tampere University of Technology - Tampere, Finland; Tuomas Virtanen, Tampere University of Technology - Tampere, Finland; Gerald Schuller, Ilmenau University of Technology - Ilmenau, Germany; Fraunhofer Institute for Digital Media Technology (IDMT) - Ilmenau, Germany
The process of audio mastering often, if not always, includes audio signal processing techniques such as frequency equalization and dynamic range compression. The parameters of these techniques are controlled by a mastering engineer, with respect to the genre and style of the audio content, in order to process the original audio material. This operation relies on the musically and perceptually pleasing facets of the acoustic characteristics conveyed by the audio material being mastered. Modeling such dynamic operations, which involve adaptation to the audio content, becomes vital in automated applications, since it significantly affects the overall performance. In this work we present a system capable of modeling such behavior, focusing on automatic dynamic range compression. Via a trained deep neural network it predicts frequency coefficients that realize the dynamic range compression and applies them to the unmastered audio signal that serves as input. Both the dynamic range compression and the prediction of the corresponding frequency coefficients take place in the time-frequency domain, using magnitude spectra acquired from a critical-band filter bank similar to the human peripheral auditory system. Results from listening tests incorporating professional music producers and audio mastering engineers demonstrate, on average, performance equivalent to professionally mastered audio content. Improvements were also observed when comparing against relevant commercial software.
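A hedged sketch of the modeling step: a small feed-forward network that maps an unmastered frame's critical-band magnitudes to per-band coefficients, which are applied multiplicatively. The layer sizes, 24-band resolution, and sigmoid-scaled output are assumptions; the paper's actual architecture and filter bank may differ.

```python
# Sketch: DNN predicting per-band coefficients for dynamic range compression.
import torch
import torch.nn as nn

N_BANDS = 24  # assumed critical-band resolution (Bark-like)

class BandCompressor(nn.Module):
    def __init__(self, n_bands=N_BANDS):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_bands, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_bands), nn.Sigmoid(),  # scaled to gains in (0, 2)
        )

    def forward(self, band_mags):
        # Predict per-band gains from log-compressed magnitudes, then apply them.
        gains = 2.0 * self.net(torch.log1p(band_mags))
        return gains * band_mags

model = BandCompressor()
frame = torch.rand(1, N_BANDS)   # one unmastered analysis frame
mastered = model(frame)          # compressed band magnitudes
print(mastered.shape)            # torch.Size([1, 24])
```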
[Also a lecture—see session P11-3]
Convention Paper 9539 (Purchase now)
P15-9 Visualization Tools for Soundstage Tuning in Cars—Delphine Devallez, Arkamys - Paris, France; Alexandre Fénières, Arkamys - Paris, France; Vincent Couteaux, Telecom ParisTech - Paris, France
In order to improve the spatial fidelity of automotive audio systems by means of digital signal processing, the authors investigated means to objectively assess the spatial perception of reproduced stereophonic sound in car cabins. This implied choosing a convenient binaural microphone system representative of real listening situations, as well as metrics to analyze interaural time differences below 1.5 kHz in the resulting binaural recordings. Frequency-dependent correlation correctly showed the frequencies at which the fidelity was improved and allowed the improvement to be quantified. The time-domain correlation seemed to be a good indicator of the apparent source width but failed to give the perceived azimuth of the virtual sound source. That metric must therefore be refined before it can be used efficiently during audio tunings.
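A minimal sketch of such a metric: band-limit a binaural recording below 1.5 kHz, then estimate the interaural time difference from the lag that maximizes the cross-correlation of the two ear signals. The filter order, lag window, and toy test signal are assumptions.

```python
# Hedged sketch: ITD estimation from low-passed binaural signals.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def itd_from_binaural(left, right, fs, fmax=1500.0, max_lag_ms=1.0):
    sos = butter(4, fmax, btype="low", fs=fs, output="sos")
    l, r = sosfiltfilt(sos, left), sosfiltfilt(sos, right)
    max_lag = int(fs * max_lag_ms / 1000.0)
    lags = np.arange(-max_lag, max_lag + 1)
    xcorr = [np.dot(l[max(0, -k):len(l) - max(0, k)],
                    r[max(0, k):len(r) - max(0, -k)]) for k in lags]
    return lags[int(np.argmax(xcorr))] / fs   # ITD in seconds

fs = 48000
t = np.arange(fs) / fs
src = np.sin(2 * np.pi * 500 * t)   # toy 500 Hz source
left, right = src, np.roll(src, 20) # right ear lags by ~0.42 ms
print(f"estimated ITD: {itd_from_binaural(left, right, fs) * 1e3:.3f} ms")
```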
[Also a lecture—see session P10-7]
Convention Paper 9536 (Purchase now)
P15-10 The Difference between Stereophony and Wave Field Synthesis in the Context of Popular Music—Christoph Hold, Technische Universität Berlin - Berlin, Germany; Hagen Wierstorf, Technische Universität Ilmenau - Ilmenau, Germany; Alexander Raake, Technische Universität Ilmenau - Ilmenau, Germany
Stereophony and Wave Field Synthesis (WFS) are both capable of providing the listener with a rich spatial audio experience, and each comes with different advantages and challenges. Due to different requirements during the music production stage, a meaningful direct comparison of the two methods has rarely been carried out in previous research. As stereophony relies on a channel-based and WFS on a model-based approach, the same mix cannot be used for both systems. In this study, mixes of different popular-music recordings were generated for two-channel stereophony, surround stereophony, and WFS, with a focus on comparability between the reproduction systems in terms of the resulting sound quality. In a paired-comparison test listeners rated their preferred listening experience.
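To make the channel- vs. model-based distinction concrete: a model-based mix stores a source *position*, and the reproduction system derives each loudspeaker's delay and gain at render time. The sketch below is a simplified point-source renderer under that assumption, not the full WFS driving function and not the authors' production chain.

```python
# Hedged sketch: model-based rendering of a virtual point source.
import numpy as np

C = 343.0  # speed of sound, m/s

def render_point_source(src_pos, spk_pos):
    """Per-loudspeaker delay (s) and 1/r amplitude for a virtual source."""
    r = np.linalg.norm(spk_pos - src_pos, axis=1)
    return r / C, 1.0 / np.maximum(r, 0.1)

# Linear array of 8 loudspeakers; virtual source 1 m behind the array.
spk = np.stack([np.linspace(-2.0, 2.0, 8), np.zeros(8)], axis=1)
delays, gains = render_point_source(np.array([0.5, -1.0]), spk)
print(np.round(delays * 1e3, 2), np.round(gains, 2))
```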
[Also a lecture—see session P10-5]
Convention Paper 9533 (Purchase now)
P15-11 Can Bluetooth ever Replace the Wire?—Jonny McClintock, Qualcomm Technology International Ltd. - Belfast, Northern Ireland, UK
Bluetooth is widely used as a wireless connection for audio applications including mobile phones, media players, and wearables, removing the need for cables. The combination of the A2DP protocol and the frame-based codecs used in many Bluetooth stereo audio implementations has led to excessive latency and acoustic performance significantly below CD quality. This paper covers the latest developments in Bluetooth audio connectivity that will deliver CD-quality audio, or better, and low latency for video and gaming applications. These developments, together with the increased battery life delivered by Bluetooth Smart, could lead to the elimination of wires for many applications.
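The latency of a frame-based codec chain is easy to reason about: end-to-end delay grows with the frame duration times the number of frames buffered across encoder, radio, and decoder. The frame size and buffer depth below are assumptions chosen only to show the calculation, not figures from the paper.

```python
# Illustrative latency arithmetic for a frame-based Bluetooth audio codec.
SAMPLE_RATE = 44100     # Hz
FRAME_SAMPLES = 128     # samples per codec frame (assumed, SBC-like)
BUFFERED_FRAMES = 50    # frames queued across encoder, radio, and decoder

frame_ms = 1000.0 * FRAME_SAMPLES / SAMPLE_RATE
print(f"frame duration: {frame_ms:.2f} ms")                  # ~2.90 ms
print(f"buffered latency: {BUFFERED_FRAMES * frame_ms:.0f} ms")  # ~145 ms
```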
[Also a lecture—see session P11-2]
Convention Paper 9538 (Purchase now)