AES New York 2013
Paper Session P5
Thursday, October 17, 2:30 pm — 5:00 pm (Room 1E09)
Paper Session: P5 - Signal Processing—Part 2
Chair: Juan Pablo Bello, New York University - New York, NY, USA
P5-1 Evaluation of Dynamics Processors’ Effects Using Signal Statistics—Tim Shuttleworth, Renkus Heinz - Oceanside, CA, USA
Existing methods of evaluating the action of dynamics processors (i.e., limiters, compressors, expanders, and gates) do not provide results that correlate directly with the perceived and actual effect on the signal’s dynamics: aspects such as crest factor, dynamic range, subjective acceptability of the processed signal, and the degree to which use of the transmission medium is optimized. A method is described that uses statistical analysis of the pre- and post-processed signal to characterize the processor’s action in a manner that correlates with the perceived effects and actual modification of signal dynamics. A number of signal-statistical and user-definable characteristics are introduced and, together with well-known statistical techniques, form the basis for this evaluation method.
Convention Paper 8938
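As an illustration of one statistic named in the abstract above, the following minimal sketch computes crest factor (peak-to-RMS ratio) before and after a toy hard limiter. It is not the paper's method: the `hard_limit` function, the test signal, and the 0.5 ceiling are all hypothetical choices made only to show how a pre/post comparison might look.

```python
import math


def crest_factor_db(samples):
    """Return the peak-to-RMS ratio of a sample sequence, in decibels."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)


def hard_limit(samples, ceiling):
    """Toy limiter (hypothetical stand-in): clip each sample to +/- ceiling."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]


# Assumed test signal: 10 full cycles of a unit-amplitude sine, 1000 samples.
sine = [math.sin(2.0 * math.pi * 10 * n / 1000) for n in range(1000)]
limited = hard_limit(sine, 0.5)

cf_in = crest_factor_db(sine)
cf_out = crest_factor_db(limited)
print(f"crest factor before: {cf_in:.2f} dB, after: {cf_out:.2f} dB")
```

A full-scale sine has a crest factor of about 3.01 dB, and clipping its peaks lowers that figure, which is the kind of pre/post change a statistical evaluation would need to capture.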
P5-2 A New Ultra Low Delay Audio Communication Coder—Brijesh Singh Tiwari, ATC Labs - Noida, India; Midathala Harish, ATC Labs - Noida, India; Deepen Sinha, ATC Labs - Newark, NJ, USA
We propose a new full-bandwidth audio codec with an algorithmic delay ranging from as low as 0.67 ms to a maximum of 2.7 ms. Low delay is a critical requirement in many real-time applications such as networked music performances, wireless speakers and microphones, and Bluetooth devices. The proposed Ultra Low Delay Audio Communication Codec (ULDACC) is a perceptual transform codec that uses very small transform windows, the shape of which is optimized to compensate for the lack of frequency resolution. A specially adapted psychoacoustic model and intra-frame coding techniques are employed to achieve transparent audio quality for bit rates approaching 128 kbps/channel at an algorithmic delay of about 1 ms.
Convention Paper 8939
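To give a sense of scale for the delay and bit-rate figures quoted above, the sketch below converts a block length in samples to milliseconds and estimates how many bits are available per block. The 48 kHz sampling rate and the block lengths of 32, 64, and 128 samples are assumptions chosen for illustration; the abstract does not state either.

```python
def delay_ms(num_samples, sample_rate_hz):
    """Duration of num_samples at the given sampling rate, in milliseconds."""
    return 1000.0 * num_samples / sample_rate_hz


def bits_per_block(bit_rate_bps, num_samples, sample_rate_hz):
    """Average bit budget for one block at a constant bit rate."""
    return bit_rate_bps * num_samples / sample_rate_hz


SAMPLE_RATE_HZ = 48_000   # assumption: not given in the abstract
BIT_RATE_BPS = 128_000    # 128 kbps per channel, as quoted in the abstract

for block in (32, 64, 128):  # hypothetical block lengths in samples
    d = delay_ms(block, SAMPLE_RATE_HZ)
    b = bits_per_block(BIT_RATE_BPS, block, SAMPLE_RATE_HZ)
    print(f"{block:4d} samples -> {d:.3f} ms, about {b:.0f} bits per block")
```

Under these assumptions, a delay near 1 ms leaves on the order of a hundred bits per block, which suggests why very short windows put pressure on both frequency resolution and side information.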
P5-3 Cascaded Long Term Prediction of Polyphonic Signals for Low Power Decoders—Tejaswi Nanjundaswamy, University of California, Santa Barbara - Santa Barbara, CA, USA; Kenneth Rose, University of California, Santa Barbara - Santa Barbara, CA, USA
An optimized cascade of long term prediction filters, each corresponding to an individual periodic component of the polyphonic audio signal, was shown in our recent work to be highly effective as an inter-frame prediction tool for low delay audio compression. The earlier paradigm involved backward adaptive parameter estimation, and hence significantly higher decoder complexity, which is unsuitable for applications that pose a stringent power constraint on the decoder. This paper overcomes this limitation via an extension that includes forward adaptive parameter estimation, in two modes that trade complexity for side information: (i) a subset of parameters is sent as side information and the remainder is backward adaptively estimated; (ii) all parameters are sent as side information. We further exploit inter-frame parameter dependencies to minimize the side information rate. Objective and subjective evaluation results clearly demonstrate substantial gains and effective control of the tradeoff between rate-distortion performance and decoder complexity.
Convention Paper 8940
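The idea of cascading one long term prediction stage per periodic component can be sketched as follows. This is a simplified illustration, not the authors' filter design or estimation procedure: it assumes single-tap stages with unit gain and known integer lags, and the two-tone test signal is hypothetical.

```python
import math


def ltp_residual(signal, lag, gain):
    """One long term prediction stage: e[n] = x[n] - gain * x[n - lag]."""
    return [x - gain * (signal[n - lag] if n >= lag else 0.0)
            for n, x in enumerate(signal)]


def cascaded_ltp_residual(signal, stages):
    """Apply one stage per (lag, gain) pair, feeding each residual onward."""
    residual = signal
    for lag, gain in stages:
        residual = ltp_residual(residual, lag, gain)
    return residual


def energy(segment):
    return sum(v * v for v in segment)


# Assumed polyphonic test signal: two components with periods 50 and 72.
x = [math.sin(2 * math.pi * n / 50) + 0.5 * math.sin(2 * math.pi * n / 72)
     for n in range(400)]

single = ltp_residual(x, 50, 1.0)                       # one stage only
cascade = cascaded_ltp_residual(x, [(50, 1.0), (72, 1.0)])

warmup = 50 + 72  # skip samples where a stage lacks full history
print("single-stage residual energy:", energy(single[warmup:]))
print("cascaded residual energy:    ", energy(cascade[warmup:]))
```

A single stage removes only the component matching its lag, while the cascade removes both, leaving a residual that is essentially zero after the warm-up region. In a real codec the lags and gains would have to be estimated and, in the forward adaptive modes the abstract describes, some or all of them sent as side information.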
P5-4 Voice Coding with Opus—Koen Vos, vocTone - San Francisco, CA, USA; Karsten Vandborg Sørensen, Microsoft - Stockholm, Sweden; Søren Skak Jensen, GN Netcom A/S - Ballerup, Denmark; Jean-Marc Valin, Mozilla Corporation - Mountain View, CA, USA
In this paper we describe the voice mode of the Opus speech and audio codec. As only the decoder is standardized, the details in this paper will help anyone who wants to modify the encoder or gain a better understanding of the codec. We go through the main components that constitute the voice part of the codec, provide an overview, give insights, and discuss the design decisions made during the development. Tests have shown that Opus quality is comparable to or better than several state-of-the-art voice codecs, while covering a much broader application area than competing codecs.
Convention Paper 8941
P5-5 High-Quality, Low-Delay Music Coding in the Opus Codec—Jean-Marc Valin, Mozilla Corporation - Mountain View, CA, USA; Gregory Maxwell, Mozilla Corporation; Timothy B. Terriberry, Mozilla Corporation; Koen Vos, vocTone - San Francisco, CA, USA
The IETF recently standardized the Opus codec as RFC 6716. Opus targets a wide range of real-time Internet applications by combining a linear prediction coder with a transform coder. We describe the transform coder, with particular attention to the psychoacoustic knowledge built into the format. The result outperforms existing audio codecs that do not operate under real-time constraints.
Convention Paper 8942