This new educational CD, produced by the AES Technical Committee on Signal Processing (TC_SP), is intended primarily to address issues relevant to digital signal processing algorithm designers and implementers.
Digital audio systems have become ubiquitous in recent years, due in large part to the ingenuity and effort of Audio Engineering Society members. The maturity of the digital audio signal processing field can present a serious challenge for new students, researchers, software engineers, product testers, and other newcomers, who must try to understand the relevance and relative importance of various signal processing parameters without being able to hear and see the details and effects. Simply reading a written description is not the best way to learn and understand audio processing issues.
Since a typical characteristic of audio digital signal processing algorithms is the need for sustained, uninterrupted processing, even a single sample dropout or parameter update error can result in an audible artifact. Detecting, diagnosing, and correcting this sort of implementation error often requires experience listening for the defects, and examples of this type are included on this CD. The examples on this disc are intended to demonstrate a variety of the effects, both good and bad, that digital audio signal processing engineers are likely to come across in their work.
Note: this is a "dual session" disc: it contains CD-Audio tracks in the audio session and WAV files in the CD-ROM/HTML session. This allows you to use it in a computer or in a standard audio CD player.
Thank you for exploring this educational CD-ROM on digital audio signal processing, prepared by members of the AES Technical Committee on Signal Processing.
There are several notable audio demonstration CDs already available from commercial publishers and professional societies. These include the AES Technical Committee on Coding of Audio Signals' "What to Listen For" CD.
This new AES Technical Committee on Signal Processing (TC_SP) CD is intended primarily to address issues relevant to digital signal processing algorithm designers and implementers.
This is the second educational/tutorial CD-ROM presented by the AES Technical Council on a particular topic, combining background information with specific audio examples. Be aware that playing back the audio examples through small computer loudspeakers or with a noisy computer sound card may not reproduce the full frequency range and subtle nuances intended by the authors. To facilitate the use of high-quality home playback equipment for the audio excerpts, the disc can also be played on any standard audio CD player. We hope you find this a useful feature.
While at one time it was only possible to implement audio signal processing algorithms using special purpose hardware or arcane assembly language with a fast DSP microprocessor, it is now quite common to do audio processing with general-purpose microprocessors and high-level languages. This has made audio processing accessible to programmers who might not have had extensive experience with real-time audio systems. We believe that an awareness of numerical issues and proper algorithmic implementation can help avoid audible distortion and computational instability.
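As one illustration of the numerical issues mentioned above, the following minimal Python sketch contrasts two ways a 16-bit integer overflow can be handled: saturation (hard clipping) and two's-complement wrap-around. The function names, the 441 Hz test tone, and the peak value of 40000 are assumptions chosen for this sketch and are not taken from the CD material.

```python
import math

INT16_MIN, INT16_MAX = -32768, 32767


def saturate16(x):
    """Clip an integer to the signed 16-bit range (hard limiting)."""
    return max(INT16_MIN, min(INT16_MAX, x))


def wrap16(x):
    """Wrap an integer into the signed 16-bit range, as two's-complement overflow does."""
    return ((x - INT16_MIN) % 65536) + INT16_MIN


def sine_int(n_samples, amplitude, freq_hz, fs_hz):
    """Integer-valued sine; the amplitude may exceed the 16-bit range on purpose."""
    return [round(amplitude * math.sin(2 * math.pi * freq_hz * n / fs_hz))
            for n in range(n_samples)]


def max_jump(x):
    """Largest sample-to-sample difference, a crude indicator of a discontinuity."""
    return max(abs(b - a) for a, b in zip(x, x[1:]))


# One period of a 441 Hz tone at 44.1 kHz, peaking above 16-bit full scale.
too_loud = sine_int(100, 40000, 441.0, 44100.0)
clipped = [saturate16(v) for v in too_loud]
wrapped = [wrap16(v) for v in too_loud]

print("largest step, clipped:", max_jump(clipped))
print("largest step, wrapped:", max_jump(wrapped))
```

In this sketch, saturation only flattens the peaks of the waveform, whereas wrap-around turns a modest overshoot into a jump of nearly the full 16-bit range from one sample to the next.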
This project has actually been underway for many years. The AES TC_SP began initial discussions in 2001, and technical contributions and suggestions have come from many TC_SP members and friends since that time. The AES TC_SP thanks the AES Technical Council and the AES headquarters staff for making this project possible.
We also thank the TC_SP members who volunteered their time and talent in creating the tutorial material and examples, and we acknowledge the organizational contributions since 2001 of the TC_SP leadership: Ronald Aarts, James D. Johnston, and Christoph M. Musialik, and the work of Rob Maher, CD project coordinator. Many of the examples are prepared from a special recording provided by Stanley Lipshitz. The entire project team also acknowledges the work of Nermin Osmanovic in assembling the final CD layout.
We hope that you will find this CD useful, and thank you for your support.
- Christoph M. Musialik and James D. Johnston, co-chairs, AES Technical Committee on Signal Processing
- Robert C. Maher, TC_SP CD project coordinator
1.1 Sampling/Aliasing
1.2 Quantization and Dithering
1.3 Hard Limiting (Clipping) and Wrap-Around
1.4 Audible Effects of Frequency Filtering
1.5 Audibility of Interchannel Phase and Timing Differences
1.6 The Decibel Scale and Frequency Weightings
2.0 Section Introduction
2.1 Tone Masking Noise, and Noise Masking Tone Demonstrations
2.2 FM versus AM Modulation Inside a Critical Band
2.3 Level Differences
3.1 Envelopes and Parameter Update Rate
3.2 Wavetable Signal Synthesis
3.3 Broadband Denoising for Audio Signals
3.4 Effects of Implementation Errors and Error Concealment
3.5 Effects of Cascaded Sample Rate Conversion
Audio Track | Section | Description |
---|---|---|
1.1 Sampling and Aliasing |
||
1 | 1.1.1 | Original piano |
2 | 1.1.2 | Piano with 20 kHz bandwidth sampled at 10 kHz without a proper anti-aliasing filter. |
3 | 1.1.3 | Piano with 20 kHz bandwidth sampled at 8 kHz without a proper anti-aliasing filter. |
1.2 Quantization, Dither & Noise Shaping Tracks |
||
4 | 1.2.1 | Original 16-bit piano music excerpt. Call it U ("Unattenuated"). |
5 | 1.2.2 | 20-dB attenuated 16-bit piano music excerpt. Call it A ("Attenuated"). |
6 | 1.2.3 | Faded 16-bit piano music excerpt (0 to -60 dB at -3 dB/s). Call it F ("Faded"). |
7 | 1.2.4 | Undithered mid-tread requantization of U to n bits, where n goes from 16 to 2 and back to 16 again, one bit at a time. |
8 | 1.2.5 | Undithered mid-riser requantization of U to n bits, where n goes from 16 to 2 and back to 16 again, one bit at a time. |
9 | 1.2.6 | Undithered mid-tread 8-bit requantization of A. |
10 | 1.2.7 | Undithered mid-riser 8-bit requantization of A. |
11 | 1.2.8 | Undithered mid-tread 8-bit requantization of F. |
12 | 1.2.9 | Quantization error of Track 11 (1.2.8). |
13 | 1.2.10 | Undithered mid-riser 8-bit requantization of F. |
14 | 1.2.11 | Quantization error of Track 13 (1.2.10). |
15 | 1.2.12 | Fractional RPDF-dithered mid-tread 8-bit quantization of a 300-Hz sine-wave of 9 (8-bit) LSBs peak-to-peak amplitude, where the peak-to-peak width of the RPDF dither is varied linearly from zero to one (8-bit) LSB and back to zero again. |
16 | 1.2.13 | Total error of Track 15 (1.2.12). |
17 | 1.2.14 | Fractional RPDF-dithered mid-riser 8-bit quantization of a 300-Hz sine-wave of 9 (8-bit) LSBs peak-to-peak amplitude, where the peak-to-peak width of the RPDF dither is varied linearly from zero to one (8-bit) LSB and back to zero again. |
18 | 1.2.15 | Total error of Track 17 (1.2.14). |
19 | 1.2.16 | Fractional RPDF-dithered mid-tread 8-bit requantization of A, where the peak-peak width of the RPDF dither is varied linearly from zero to one (8-bit) LSB and back to zero again. |
20 | 1.2.17 | Fractional RPDF-dithered mid-riser 8-bit requantization of A, where the peak-peak width of the RPDF dither is varied linearly from zero to one (8-bit) LSB and back to zero again. |
21 | 1.2.18 | RPDF-dithered mid-tread 8-bit requantization of F. |
22 | 1.2.19 | Total error of Track 21 (1.2.18). |
23 | 1.2.20 | RPDF-dithered mid-tread 8-bit requantization of {F + triangle-wave dc ramp varying linearly up and down between 0 and +1 (8-bit) LSB with a period of 2 s}. |
24 | 1.2.21 | Total error of Track 23 (1.2.20). |
25 | 1.2.22 | Undithered mid-tread 8-bit requantization of F with RPDF dither added after the requantization. |
26 | 1.2.23 | TPDF-dithered mid-tread 8-bit requantization of F. |
27 | 1.2.24 | Total error of Track 26 (1.2.23). |
28 | 1.2.25 | TPDF-dithered mid-tread 8-bit requantization of {F + triangle-wave dc ramp varying linearly up and down between 0 and +1 LSB with a period of 2 s}. |
29 | 1.2.26 | Total error of Track 28 (1.2.25). |
30 | 1.2.27 | Subtractive RPDF-dithered mid-tread 8-bit requantization of F. |
31 | 1.2.28 | High-pass TPDF-dithered mid-tread 8-bit requantization of F. |
32 | 1.2.29 | Undithered 1st-order mid-tread 8-bit noise-shaper requantization of F. |
33 | 1.2.30 | Undithered 1st-order mid-tread 8-bit noise-shaper requantization of {F + triangle-wave dc ramp varying linearly up and down between 0 and +1 LSB with a period of 2 s}. |
34 | 1.2.31 | RPDF-dithered 1st-order mid-tread 8-bit noise-shaper requantization of F. |
35 | 1.2.32 | TPDF-dithered 1st-order mid-tread 8-bit noise-shaper requantization of F. |
36 | 1.2.33 | Undithered tilted-F-weighted 9th-order mid-tread 8-bit noise-shaper requantization of F. |
37 | 1.2.34 | Undithered tilted-F-weighted 9th-order mid-tread 8-bit noise-shaper requantization of {F + triangle-wave dc ramp varying linearly up and down between 0 and +1 LSB with a period of 2 s}. |
38 | 1.2.35 | RPDF-dithered tilted-F-weighted 9th-order mid-tread 8-bit noise-shaper requantization of F. |
39 | 1.2.36 | TPDF-dithered tilted-F-weighted 9th-order mid-tread 8-bit noise-shaper requantization of F. |
40 | 1.2.37 | Undithered 12th-order mid-tread 8-bit noise-shaper requantization of F. |
41 | 1.2.38 | Undithered 12th-order mid-tread 8-bit noise-shaper requantization of {F + triangle-wave dc ramp varying linearly up and down between 0 and +1 LSB with a period of 2 s}. |
42 | 1.2.39 | Total error of Track 41 (1.2.38). |
43 | 1.2.40 | RPDF-dithered 12th-order mid-tread 8-bit noise-shaper requantization of F. |
44 | 1.2.41 | RPDF-dithered 12th-order mid-tread 8-bit noise-shaper requantization of {F + triangle-wave dc ramp varying linearly up and down between 0 and +1 LSB with a period of 2 s}. |
45 | 1.2.42 | TPDF-dithered 12th-order mid-tread 8-bit noise-shaper requantization of F. |
46 | 1.2.43 | Concatenation of 5 s segments of dithered "silence" from each of the systems of Tracks 26 (1.2.23, TPDF-dithered rounding), 31 (1.2.28, high-pass TPDF-dithered rounding), 35 (1.2.32, TPDF-dithered 1st-order noise-shaped rounding), and 39 (1.2.36, TPDF-dithered tilted-F-weighted 9th-order noise-shaped rounding), repeated twice. |
1.3 Hard Limiting (Clipping), Analog and Digital |
||
47 | 1.3.1 | Hard digital clipping. |
48 | 1.3.2 | Wrap around. |
49 | 1.3.3 | Hard 'analog' clipping. |
1.4 Audible Effects of Frequency Filtering |
||
50 | 1.4.1 | Original piano. |
50 | 1.4.2 | Piano with 100 Hz highpass filter (signal frequency content attenuated below 100 Hz). |
50 | 1.4.3 | Piano with 500 Hz highpass filter (signal frequency content attenuated below 500 Hz). |
50 | 1.4.4 | Piano with 1000 Hz highpass filter (signal frequency content attenuated below 1000 Hz). |
51 | 1.4.5 | Piano with 1000 Hz lowpass filter (signal frequency content attenuated above 1000 Hz). |
51 | 1.4.6 | Piano with 500 Hz lowpass filter (signal frequency content attenuated above 500 Hz). |
51 | 1.4.7 | Piano with 100 Hz lowpass filter (signal frequency content attenuated above 100 Hz). Note: this example may be inaudible when using small loudspeakers that are unable to reproduce frequency content below 100 Hz. |
52 | 1.4.8 | Piano with 440 Hz - 880 Hz bandpass filter (signal frequency content attenuated below 440 Hz and above 880 Hz). |
1.5 Audibility of Interchannel Phase and Timing Differences |
||
53 | 1.5.1 | Interchannel delay sequence, 441 Hz tones. Phase difference between channels: 0 samples (in phase), 1 sample, 2 samples, 5 samples, 10 samples, 20 samples, 50 samples. Sequence presented three times. |
53 | 1.5.2 | Interchannel delay sequence, 1 ms clicks. Phase difference between channels: 0 samples (in phase), 1 sample, 2 samples, 5 samples, 10 samples, 20 samples, 50 samples. Sequence presented three times. |
53 | 1.5.3 | Interchannel delay sequence, 1 ms clicks. Time difference between channels: 0 ms (synchronized), 100 µs, 200 µs, 500 µs, 1 ms, 2 ms, 5 ms, 10 ms. Sequence presented three times. |
1.6 The Decibel Scale |
||
54 | 1.6.1 | Original broadband noise scaled for -12 dB with respect to full scale. |
54 | 1.6.2 | Broadband noise with A-weighting applied. |
54 | 1.6.3 | Broadband noise with C-weighting applied. |
2.1 Tone Masking Noise and Noise Masking Tone |
||
55 | 2.1.1 | Tone masking noise sequence. |
56 | 2.1.2 | Noise masking tone sequence. |
2.2 Audibility of Phase: AM vs. FM Within a Critical Band |
||
57 | 2.2.1 | AM and FM sequence |
2.3 Level Differences |
||
58 | 2.3.1 | Level comparison by windowing between the levels. |
59 | 2.3.2 | Level comparison by forcing a time interval between the two pulses at different levels. |
3.1 Envelopes and Parameter Update Rate |
||
60 | 3.1.1 | Gain changes abruptly from unity to 8 and back to unity (0 dB to +18 dB and back to 0 dB), presented twice. |
61 | 3.1.2 | Gain changes linearly from unity to 8 and back to unity (0 dB to +18 dB and back to 0 dB), presented twice. |
62 | 3.1.3 | Gain changes linearly from x1 to x8 in a set of 4 abrupt steps. |
62 | 3.1.4 | Gain changes linearly from x1 to x8 in a set of 16 abrupt steps. |
62 | 3.1.5 | Gain changes linearly from x1 to x8 in a set of 64 abrupt steps. |
63 | 3.1.6 | Gain changes linearly from x1 to x8 in a set of 4 linearly ramped steps. |
63 | 3.1.7 | Gain changes linearly from x1 to x8 in a set of 16 linearly ramped steps. |
63 | 3.1.8 | Gain changes linearly from x1 to x8 in a set of 64 linearly ramped steps. |
64 | 3.1.9 | Gain changes logarithmically from x1 to x8 in a set of 4 abrupt steps. |
64 | 3.1.10 | Gain changes logarithmically from x1 to x8 in a set of 16 abrupt steps. |
64 | 3.1.11 | Gain changes logarithmically from x1 to x8 in a set of 64 abrupt steps. |
65 | 3.1.12 | Gain changes logarithmically from x1 to x8 in a set of 4 linearly ramped steps. |
65 | 3.1.13 | Gain changes logarithmically from x1 to x8 in a set of 16 linearly ramped steps. |
65 | 3.1.14 | Gain changes logarithmically from x1 to x8 in a set of 64 linearly ramped steps. |
3.2 Signal Generation Via Wave Tables |
||
66 | 3.2.1 | Sample Increment (SI) = 1 (each sample of the wave table is played sequentially). |
67 | 3.2.2 | SI=2 (every other sample of the wave table is played). |
68 | 3.2.3 | SI=3 (every third sample of the wave table is played). |
69 | 3.2.4 | SI=1.49 (non-integer SI means that the table lookup index is generally not an integer, so round-to-nearest sample introduces a waveform error). |
70 | 3.2.5 | SI=1.71 (non-integer SI means that the table lookup index is generally not an integer, so round-to-nearest sample introduces a waveform error). |
71 | 3.2.6 | Sample Increment (SI) swept from 0.5 to 2. |
72 | 3.2.7 | Difference between the wave table sweep and a "perfect" (no lookup rounding) sweep. |
73 | 3.2.8 | Difference boosted by 20 dB to make it more audible. |
3.3 Denoising to Remove Broadband Noise |
||
74 | 3.3.1 | Original. |
75 | 3.3.2 | Strong (-20 dBFS peak) tape-like noise. |
76 | 3.3.3 | Distorted file resulting from adding the original and the noise. |
77 | 3.3.4 | File denoised with DeNoiser from Sound Laundry, without attack and release parameters: note an obtrusive chirping. |
78 | 3.3.5 | File denoised with NoiseFree denoiser, short attack, short release: better, but still with a lot of phasy noise. |
79 | 3.3.6 | File denoised with NoiseFree denoiser; medium attack and release: less noise, but note the "fuzzy" onsets due to large block size leading to transient "smearing". |
80 | 3.3.7 | File denoised with NoiseFree denoiser, short attack and long release: more denoising is still possible, but at the cost of transient smoothing. |
81 | 3.3.8 | Original file optimally denoised with Algorithmix NoiseFree denoiser. |
3.4 Effects of Implementation Errors and Error Concealment |
||
82 | 3.4.1 | Every eighth sample is left out (missed). |
83 | 3.4.2 | Every 32nd sample is left out (missed). |
84 | 3.4.3 | Every eighth sample is set to zero (skipped). |
85 | 3.4.4 | Every 32nd sample is set to zero (skipped). |
86 | 3.4.5 | Comb filter (delay length 8) for comparison to miss and skip examples. |
87 | 3.4.6 | Comb filter (delay length 32) for comparison to miss and skip examples. |
88 | 3.4.7 | Filtered source without overflows (correct output). |
89 | 3.4.8 | Filtered source with internal filter overflow compensation by clipping. |
90 | 3.4.9 | Filtered source with internal filter overflow compensation by wrap-around. |
91 | 3.4.10 | Lattice filtered source without overflows (correct output). |
92 | 3.4.11 | Lattice filtered source with overflow compensation by clipping. |
93 | 3.4.12 | Lattice filtered source with overflow compensation by wrap-around. |
3.5 Cascading Sample Rate Conversions |
||
94 | 3.5.1 | Original source (44.1kHz/16b stereo). |
95 | 3.5.2 | High quality, cascaded 1x (44.1kHz/16b stereo). |
96 | 3.5.3 | High quality, cascaded 2x (44.1kHz/16b stereo). |
97 | 3.5.4 | Consumer quality, cascaded 1x (44.1kHz/16b stereo). |
98 | 3.5.5 | Consumer quality, cascaded 2x (44.1kHz/16b stereo). |
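
The terms used in the Section 1.2 track descriptions (mid-tread, mid-riser, RPDF and TPDF dither) can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the processing used to prepare the CD tracks: the function names, the sample range of [-1, 1), the 300 Hz test tone with a peak of 0.4 LSB, and the fixed random seed are all choices made for this example. The dither here is non-subtractive and is added before the quantizer.

```python
import math
import random


def quantize_mid_tread(x, step):
    """Mid-tread: output levels are integer multiples of step, so zero is a level."""
    return step * math.floor(x / step + 0.5)


def quantize_mid_riser(x, step):
    """Mid-riser: output levels are odd multiples of step/2, so zero is a threshold."""
    return step * (math.floor(x / step) + 0.5)


def rpdf(step, rng):
    """Rectangular-PDF dither, 1 LSB peak-to-peak."""
    return rng.uniform(-0.5, 0.5) * step


def tpdf(step, rng):
    """Triangular-PDF dither, 2 LSB peak-to-peak: sum of two independent RPDF values."""
    return rpdf(step, rng) + rpdf(step, rng)


def requantize(signal, bits, dither=None, seed=0):
    """Mid-tread requantization of samples in [-1, 1) with optional dither."""
    step = 2.0 / 2 ** bits
    rng = random.Random(seed)
    out = []
    for x in signal:
        d = dither(step, rng) if dither else 0.0
        out.append(quantize_mid_tread(x + d, step))
    return out


FS = 44100
BITS = 8
STEP = 2.0 / 2 ** BITS

# A tone whose peak is below half an 8-bit LSB (assumed test signal).
tone = [0.4 * STEP * math.sin(2 * math.pi * 300 * n / FS) for n in range(FS // 10)]

undithered_tread = requantize(tone, BITS)                       # signal vanishes
undithered_riser = [quantize_mid_riser(x, STEP) for x in tone]  # becomes a square wave
tpdf_tread = requantize(tone, BITS, dither=tpdf)                # tone survives in noise

# Correlation of each output with the input tone, normalized by the tone energy.
energy = sum(x * x for x in tone)
corr_undithered = sum(y * x for y, x in zip(undithered_tread, tone)) / energy
corr_tpdf = sum(y * x for y, x in zip(tpdf_tread, tone)) / energy
print("correlation with input, undithered mid-tread:", corr_undithered)
print("correlation with input, TPDF-dithered mid-tread:", corr_tpdf)
```

In this sketch the undithered mid-tread quantizer outputs silence for the low-level tone, the undithered mid-riser quantizer turns it into a two-level square wave, and the TPDF-dithered output remains correlated with the input at the cost of added noise.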