Heyser Memorial Lecture
AES 112th Convention
M.O.C.-Center, München, Germany
Saturday, May 11, 18:15h
Some Musings on Progress in Audio: A Quest for Better Sound at Affordable Prices
Ray Dolby
In the 37 years since my company was founded in 1965 the audio recording technologies we use in manufacturing and licensing have gone from 100% analog to over 90% digital. How did that happen? No one could plan such a dramatic change just by making a decision at a particular time. There were many technological developments in the intervening years, and no one could predict which developments would allow or even dictate a changeover from a wholly analog world to a predominantly digital audio world.
I believe that the driving force in these changes was not any particular recording technology itself but other issues such as operating convenience and price. The market seems to apply continuous upward forces on utility and convenience until satisfactory levels have been achieved. However, there is an incessant downward pressure on prices, which are largely determined by manufacturing costs. In the last few years a phenomenal decrease in the cost of manufacturing digital equipment has given digital products an overwhelming marketplace advantage that has totally transformed the face of audio.
Ray Dolby bio
Ray Dolby, founder and Chairman of Dolby Laboratories, Inc., was born in Portland, Oregon, in 1933. From 1949-52 he worked on audio and instrumentation projects at Ampex Corporation, where from 1952-57, as a student, he was mainly responsible for the development of the electronic aspects of the Ampex video tape recording system. He received his B.S. in Electrical Engineering from Stanford University in 1957 and, as a Marshall Scholar, left Ampex to pursue further studies at Cambridge University in England. He received a Ph.D. degree in physics from Cambridge in 1961, and was elected a Fellow of Pembroke College (Honorary Fellow, 1983). During his last year at Cambridge, he was also a consultant to the United Kingdom Atomic Energy Authority.
In 1963, Dolby took up a two-year appointment as a United Nations technical advisor in India, returning to England in 1965 to found Dolby Laboratories in London. In 1976 he established further offices, laboratories, and manufacturing facilities in California. He holds more than 50 US patents, and has written papers on video tape recording, long wavelength X-ray analysis, and noise reduction. Ray Dolby makes his home in San Francisco with his wife Dagmar. He enjoys skiing, boating and flying airplanes and helicopters.
Honors and Awards - AES: Fellow and Past Pres.; Silver Medal; Gold Medal. BKSTS: Fellow; Sci. and Tech. Award. SMPTE: Fellow; Samuel L. Warner Mem. Award; Alexander M. Poniatoff Gold Medal; Progress Medal; Hon. Memb. AMPAS: Sci. and Eng. Award; "Oscar" Award. NATAS: "Emmy" Award. NARAS: "Grammy" Award. IEEE: Masaru Ibuka Award. AEA: Medal of Achievement. Cambridge U.: Hon. Doctor of Science. University of York: Hon. Doctor of the University. U.S. Govt: Nat. Medal of Tech. U.K. Govt: OBE.
Complete transcript of Ray Dolby's speech
Today let me take you on my own personal tour of the audio industry and how it changed from analog to digital. When I started business, the audio recording and processing technologies used in our products and licensing were 100% analog. Now, 37 years later, our manufactured products are 90% digital. Our licensed technologies are over 95% digital. I think this is totally amazing. How did it happen? Nobody could plan such a dramatic change just by making a decision at a particular time. There were many technological developments in the intervening years, and no one could predict which ones would allow or even compel such a changeover. Maybe it would never happen. Perhaps the established analog technologies would prevail because of unforeseen fundamental limitations in recording media manufacture, head design, or chip densities.
You may be interested in my own experiences of analog, digital, and other processes that I have witnessed over the decades, including those in my own company. These experiences have shaped my attitudes and actions.
In 1949 I was lucky enough to have a summer job at the tiny Ampex Electric Company in San Carlos, California. There I got to see first hand how the technology marketplace really works. I had been hired by the company founder, Alex Poniatoff, and was somewhat under his wing. My task was to duplicate the first calibration tapes for the original Ampex recorder, the Model 200. I had to keep the frequency response of these tapes within better than 1/2 dB from 30 Hz to 15 kHz.
The Model 200 tape recorder was magnificent - with a large black polished wooden console and heavy anodized aluminum fixtures and fittings. The audio quality was excellent. Not only did the machine run at 30 inches per second, but it used the full 1/4" of tape for one track only. The frequency response, the distortion, the noise, and the wow and flutter were all impeccable.
How much did the Ampex 200 cost? $4,000 US in 1949. This would be equivalent to about $31,000 today. Imagine spending this on a one track recorder in 2002! No one would do it. However, remember that then this machine represented the only way of achieving such performance. It was analog.
Who was the first customer willing to pay the price? Bing Crosby. He was determined to change the way his radio show was produced, using edited tape instead of live performances.
However, even in 1949 I soon discovered that there was an awareness that tape recorder prices had to come down. I took the opportunity to look around Ampex - at this point 25 employees - and see what else was going on. In one room I discovered something that both thrilled and alarmed me. Harold Lindsay and other engineers were planning a new recorder, the Model 300, that would cost only $1,500. As an admirer of fine machinery, I could not help but have the feeling that the new design would be a debasement of the original Rolls Royce concept of the Model 200, especially after I learned that the tape speed would be reduced from 30 to only 15 ips.
This is when I began to learn about the continuing balancing act that characterizes good design. If, as a designer, you overdesign a system or a product, you will be found out, eventually. Sooner or later, some smart engineer, or even a businessman, will realize that you have made something that is heavier than it has to be, uses more tape than it has to, or consumes too much power. In other words, there is always a downward pressure on price. So, even in the 1940's there was a demand to get the price of analog recording down. And digital wasn't even in the contest.
While at university I continued working at Ampex fifteen hours a week and during the vacations. In 1951 one of the projects I worked on was the Model 500 instrumentation recorder, the most ambitious and precision recorder that the company had yet developed. Depending on the application, the machine would be used in several modes: direct recording, amplitude modulation, frequency modulation and pulse recording. The machine employed a vacuum system that sucked the tape tight against the capstan, practically eliminating scrape flutter.
As an experiment, Walter Selsted, the designer, thought that it would be interesting to try an FM modulation system to make an audio recording, using a carrier frequency in the 100 kHz range, with about a +/- 40 kHz deviation. In an FM recording system flutter translates directly into noise, so with the low flutter characterizing the Model 500 recorder, a good result could be expected. My recollection was that the resulting signal to noise ratio was about 70-75 dB, which I thought a bit of a disappointment in view of the rather extravagant measures taken. So, it was difficult to make a large improvement over the best analog audio tape recorders of the time. But, at least, here was proof that FM could make a modest improvement over analog recording, without the extreme bandwidth requirements of the still uncontemplated digital audio.
In the summer of 1952, there was an engineer whose name I now remember only as Louis who was working in an adjacent lab, to investigate what the practical limits would be to recording digital data on magnetic tape. He had a couple of adapted Ampex Model 300 tape recorders offering various speeds and tape widths, especially by mounting and using head blocks of different designs. Louis had an array of threshold setting counters rigged up to measure the amplitudes of the reproduced pulses. Week after week of discouraging looking results were produced. If the track width was decreased, the dropouts increased. If the wavelength was decreased, the dropouts increased. Scores of tables and graphs were created, whereby the awful truth was revealed. The tapes and heads existing at that time were just not good enough for reasonably high density recording. A lot of tape would have to be used to record digital data reliably. My observation of this sorry situation, and the ensuing discussions, made a deep impression on me. Apparently the dilemma made an impression on management, too, and the analysis project was abandoned for the time.
The next project I worked on, the video tape recorder, from 1952-57, included an aspect that may have proved useful for digital recording. Charles Ginsburg and I had ordered special 2" wide tape from the 3M company. We had only a couple of reels of the tape and considered them to be valuable because of their custom made origin. We therefore reused the tapes countless times, after which we noticed that the signal dropouts we suffered gradually became fewer and fewer. This we attributed to the polishing action of our rotary head system. We then went back to 3M and asked whether they could polish the next tapes they made for us. They devised a mechanical scheme whereby the tape would be looped back on itself and the oxide coating of one surface would polish the oxide of the adjoining looped-back surface. I did not realize it at the time but such improved tape manufacture no doubt made things easier for magnetic data recording and eventually digital audio recording.
When I was at university in the 1950's digital was talked about and the basics were even taught. I did some experiments in my electronics course, but digital seemed so impractical, so expensive, so big. Tube circuits were big, but there were no other kinds of circuits.
At this time analog computers were active competition to digital efforts. Both digital and analog existed at Stanford University while I was there. Regarding analog, there was a conviction that using basic elements like amplifiers, resistors, and capacitors would ultimately provide a practical solution for many computational applications. It was not at all clear whether analog or digital would prevail, or whether the world would move forward using both, depending on the application.
When I went to Cambridge University in 1957 to do my graduate work I happened to come across an Engineering Department project to create a pneumatic digital computer. At this stage of semiconductor technology, transistors were big and expensive. The concept of the pneumatic computer was that a fine jet of air could be directed from one orifice to another by a second jet of air at right angles, thereby controlling the primary air flow action in a binary way. Large sheets of such pneumatic switches could be injection molded and assembled into large three-dimensional arrays. The resulting computer would be much smaller and less expensive than either a vacuum tube version or a semiconductor version.
I never did find out what happened to that pneumatic digital computer, but you can imagine what a reminder it gave me of the ongoing battles between one technology possibility and another. I was also left with the feeling of how difficult it is to predict the outcome of such a competition. I sometimes think of the struggles between the gas and electricity industries at the end of the 19th century and beginning of the 20th. Our family house in San Francisco was built in 1904 and was designed and equipped with both gas and electricity for lighting. Our architect evidently could not predict which would be the winner of the lighting competition.
In Cambridge, 1957, the main university research computer was the EDSAC vacuum tube computer, which was a vast machine occupying a good sized room. The equipment was mainly racks of vacuum tube logic circuits, in modular form, so that a faulty circuit could be replaced easily. My recollection is that there were some hundreds of these logic modules.
In 1958, in the middle of my Ph.D. program in long wavelength x-ray physics, I had a mathematical problem that required some very tedious calculations. So I decided to learn how to program the EDSAC computer by means of a program called Autocode, with punched tape as the input. The computer was in operation 24 hours a day, manned by specialists in white coats. It was necessary to sign up for time several days in advance, often accepting a time slot in the wee hours of the morning. My computation took 15 minutes, but before every run a diagnostic program was used for a couple of minutes to test all parts of the computer. It was very exciting getting the results of the run, knowing that I had been saved weeks of time in slide rule and adding machine calculations.
The state of the digital art being what it was in this period, I think I could be forgiven for not even thinking of the possibility that a computer would ever be small enough to fit on my desk, let alone in my pocket.
Within a few years I had started my own company. I wanted to improve the state of recorded sound, vintage 1965. Analog was still the only game in town. There were murmurings about the possibility of digital sound. Of course, if data could be recorded digitally, there was no reason why the data could not be audio. However, simple calculations showed that for a single PCM channel something of the order of 800 kilobits per second would be required, plus error correction, say a total of about 1.2 megabits per second. In 1965 this seemed like a totally extravagant expenditure of bandwidth, when by analog only 20 kHz of bandwidth would suffice.
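The "simple calculations" mentioned above can be sketched as follows. The lecture gives only the totals, so the sampling rate, word length, and error-correction overhead below are assumptions chosen to reproduce the quoted figures, not values stated by the speaker.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Raw PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# Assumed figures (not from the lecture): 50 kHz sampling, 16-bit words.
raw_bps = pcm_bit_rate(50_000, 16)

# Assumed 50% overhead for error correction, picked to match the
# "about 1.2 megabits per second" total quoted in the text.
total_bps = raw_bps * 1.5

# The analog bandwidth quoted in the text for comparison.
analog_bandwidth_hz = 20_000

print(f"raw PCM:        {raw_bps / 1e3:.0f} kbit/s")
print(f"with overhead:  {total_bps / 1e6:.1f} Mbit/s")
print(f"bits/s per Hz of analog bandwidth: {total_bps / analog_bandwidth_hz:.0f}")
```

Bits per second and hertz are different units, so the last figure is only a rough indication of why digital looked extravagant in 1965.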
In my whole technical life up to this point there had been an emphasis on conserving bandwidth. Bandwidth was precious. AM broadcast frequencies were allocated on the bare minimum that would be serviceable, which was judged to be about 9-10 kHz spacing between stations. FM radio occupied a more luxurious 200 kHz of bandwidth, but this was far below that which would be required with digital.
In 1965, I had already resolved to do analog noise reduction. I had identified an important application for my A-type noise reduction system - landlines, by which high quality signals could be transmitted from city to city for broadcast purposes. Then I read about digital fiber optic development work that was going on in England. I thought, "Damn, fiber optic could really make digital audio practical for landlines - someday." The information I had about the fiber optic work indicated that the creation of the infrastructure would take two or three decades. I thought, "OK, I've got until sometime near the end of the century until transmission bandwidth is cheap enough to make digital a threat. With noise reduction I will be offering something that is usable right now. And even in a worst case scenario my analog system will have had a decent run."
At that point I was thinking that digital audio would be a threat only to signal transmission applications, not recording. I had been working with magnetic tape recording long enough to have a sense that there seemed to be inherent limitations on signal packing densities on tape, and that the state of the art in the 1960's was pretty close to those limits. There was Louis' investigation at Ampex, for instance. A further indicator was tape signal to noise ratios. The tape manufacturers had been struggling for years to improve performance, with only modest results, perhaps an improvement of the order of 10 dB over a couple of decades. I thought, "At least there will be a good noise reduction market in tape recording. And then there's motion pictures."
Dolby Labs' first incursion into what would be called digital occurred in the early 1970's, when we got started with our work in motion pictures. Ioan Allen made a study of what was wrong with motion picture sound technology. This included the much needed but despised ground noise reduction (GNR) system used in variable area recording. During quiet passages, the clear background on the film was automatically reduced in width, significantly lowering audible noise introduced by dust and scratches. Unfortunately, sudden transient sounds would be clipped and distorted by the GNR system. We realized that by introducing about a 10 ms delay into the sound recorded on the film it was possible to avoid the clipping of such signals. The undelayed sound would be used to activate the GNR system, giving it time to respond. We introduced this improvement in 1972, at first using a double playback head to provide the required time delay. To avoid the use of special heads, in 1974 we started using a charge coupled bucket brigade delay chip, which had just become available. A pure PCM delay would have been much too expensive. This became the beginning of how we acquired a distaste for PCM audio. It is a very inefficient code, utilizing no psychoacoustic principles whatsoever.
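The look-ahead idea described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual GNR circuit: the opening rate, the resting bias, the sample units, and the rule that the opening follows the larger of the undelayed and delayed signals are all hypothetical choices made for illustration.

```python
def count_clipped(signal, delay, attack_per_sample=0.1, bias=0.05):
    """Count samples clipped by a slow-opening noise-reduction gate.

    The gate width is driven by the undelayed signal (and, as an assumed
    hold rule, by the delayed one), while the delayed signal is what gets
    recorded. All parameters are hypothetical.
    """
    early = list(signal) + [0.0] * delay   # undelayed control signal
    late = [0.0] * delay + list(signal)    # delayed signal that is recorded
    width = bias                           # narrow opening during silence
    clipped = 0
    for control, recorded in zip(early, late):
        target = max(abs(control), abs(recorded)) + bias
        if target > width:
            # The opening can widen only at a limited rate.
            width = min(target, width + attack_per_sample)
        else:
            # Closing is treated as instantaneous, a simplification.
            width = target
        if abs(recorded) > width:
            clipped += 1
    return clipped


# Silence, then a sudden full-level burst, then silence again.
burst = [0.0] * 20 + [1.0] * 30 + [0.0] * 20

clipped_without_delay = count_clipped(burst, delay=0)
clipped_with_delay = count_clipped(burst, delay=10)
print(clipped_without_delay, clipped_with_delay)
```

In this model the gate needs about ten samples to open fully, so a delay of ten samples is just enough for the opening to be ready when the recorded transient arrives.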
Our first real cinema processor, the CP100, designed under the guidance of David Robinson, was originally introduced in 1974, but without surround capability. Soon we realized that if surround sound were inexpensive enough it would become popular even in small cinemas. We therefore took our second step into digital audio in 1976 when we demonstrated stereo optical sound tracks with matrix surround. For a test, we used one of the first commercially available PCM digital delay lines to introduce a preset delay into the surround channel, of the order of 60 ms. This rather expensive arrangement was workable for demonstration purposes only. When we added surround on a production basis we knew we had to keep our production costs down. And so we utilized an adaptive delta modulation scheme that was about three times as efficient as PCM.
In this 1960's and 70's period the audio product manufacturers, including me and my company, had just a bare inkling that there was an electronic and recording technology revolution brewing out there that would totally change our industry.
Circuits were being pushed to higher and higher speeds, and digital circuit packing densities increased at a rate that could hardly have been predicted. Gordon Moore made the observation that densities had roughly doubled every 18 months for some time, through several generations of IC's and microprocessors. Others rashly turned this observation into a prediction, and the amazing thing is that this progress has been maintained to this day and appears set to continue.
In the early 1970's I was at an AES convention and heard of a hush-hush commercial project to develop a digital audio tape recorder. The proponents were full of optimism, and naturally I was curious and also somewhat apprehensive to see how this would play out. At following AES conventions I heard of progress, but I remained somewhat skeptical; I never did see any real information. In about 1975 I heard that the project had been abandoned, primarily because of reliability problems. It was too difficult to deal with tape dropouts.
During this 1970's period there were also developments of digital recorders by record companies for their own use. These included the Denon (Nippon Columbia) recorder, based on the Ampex quadruplex rotary head design. The Decca Record Company also developed a helical scan digital recorder for its own use.
At the November 1976 AES Convention in New York, Tom Stockham made an impressive demonstration of what may have been the first commercial digital audio recorder, the Soundstream. Other manufacturers apparently were spurred on by this and by the late 1970's a number of digital contenders appeared, including Denon, 3M, JVC, Matsushita, Mitsubishi, and Sony. However, the costs of all these machines were very high. The studios were afraid of being pressured into buying digital equipment that might well become obsolete. A few studios made the investment in the name of progress, but for the most part there was a lot of foot dragging going on. No one could be sure whether digital would stick or not.
On the consumer equipment manufacturer side there was also apprehension. By 1980 the Dolby B-type noise reduction system was well established in cassette tape recording. Nonetheless the manufacturers were anxious to have new high performance offerings. Change was in the air, but no one knew what that change would be. Some of the manufacturers had digital consumer recorders under development, but the head and tape technology did not seem to be ready yet. The result was an effort to create more powerful analog noise reduction systems. Among these were dbx, JVC VNR, Pioneer New NR, CBS CX, Telefunken Hi-Com, Hitachi Lo-D, Toshiba ADRES, and Sanyo Super-D.
Of course, all of these systems were mutually incompatible. It was an alarming situation. In early 1980 several manufacturers, including Nakamichi, Sony, Pioneer, Matsushita, and JVC, approached my company expressing the view that Dolby had to offer a new noise reduction system that was significantly more powerful than the B-type system. If we were successful, all the chaos would disappear.
The arguments put forth by the Japanese manufacturers were persuasive, and I agreed to drop everything and design the new system. I knew that this would be impossible at the Dolby offices and labs, so I decided to set up a lab on the top floor of my home in San Francisco. In about eight months of high pressure work the new system was created. The new system, C-type, had to survive competitive trials against all other contenders. By late 1980 there was general agreement that the new system met the requirements. Moreover, because of Dolby's independence, it was likely to be accepted by the many manufacturers in the industry.
Within a few months some 25 companies were preparing product offerings. The introduction of C-type seemed to clear away uncertainty in the industry. Indeed the venerable B-type system became more popular than ever. The existence of C-type gave consumers the feeling of choice and luxury, even though they would rarely use its full capabilities. Cassette tape recording flourished, and the cassette continues to this day, although with gradually declining numbers.
I was riding high with the success of C-type, which seemed to stave off the digital peril.
With the engineering momentum that I acquired during the development of C-type, I began to consider a new professional noise reduction system to supplement or even replace the A-type system. In the early days of A-type, nay-sayers would sometimes express fears that I would be tempted to tinker with and change the standard, thereby throwing into chaos all the established recording programs and archives. However, after 15 or 20 years of an unchanged standard, I thought that it would be safe enough to introduce a new system, especially if it resulted in a significant improvement in performance.
At this stage, late 1980, the "big iron" professional digital audio recorders were being designed by many. However, it appeared to me that their high cost and reported unreliability left room for the development of a better analog recording system. No one could predict how the digital development would turn out. I therefore decided to go ahead and design the best professional analog recording system that I could devise. It was partly a matter of pride. I wanted to prove that it was possible, by analog methods, to equal or better the audible signal quality obtainable with digital. I thought that, at least, the world would have two ways of making master tapes of the highest quality. In the short term we would be providing a facility that apparently digital could not yet do: deliver high signal quality reliably and at reasonable cost. The new system became known as Spectral Recording, or SR.
In 1980 I did not realize how long and intricate the new SR development would be - that it would not be until March 1986 that the new recording system would be presented and demonstrated for the first time at the AES convention in Montreux, Switzerland.
Throughout those SR development years I kept a wary eye on digital recorder progress. Fortunately, by 1986 there was still not enough digital recorder presence for anyone to say that digital would be the ultimate winner. There were still problems with reliability and with A/D and D/A converter design, particularly high-order distortion. Most of all, digital recording continued to be very expensive.
When SR was introduced, I was gratified to see that it was immediately welcomed for its signal quality; SR regularly won signal quality shootout competitions against digital. For some time SR was also appreciated because of its practicality and economy. Our SR order books were overflowing, and this happy condition continued for several years.
Meanwhile, in the 1970's and 80's my engineers, particularly Ken Gundry, Craig Todd, Louis Fielder, Mark Davis, and Grant Davidson, were trying to figure out ways of applying psychoacoustic principles to digital applications. The success we had had in utilizing the characteristics of sound and human hearing to optimize signal processing in the analog domain persuaded us that something similar should be possible in the digital domain.
Following our use of efficient adaptive delta modulation in our cinema processor CP100 in 1976, we continued to develop this technology for use in other applications. We later called this technology "Audio Code No. 1" (AC-1). We proposed its use as the sound track format for 8 mm video and for BBC television sound. While neither proposal was accepted, we learned enough to go on to other applications which did succeed, including direct broadcast satellite and cable television.
We continued on our way of trying to do things in the most bandwidth and bit efficient way that we knew. As far as we could imagine, bandwidth and bits would continue to be expensive. We could not foresee the development work that eventually resulted in phenomenal increases in chip speeds and bit packing densities in recording.
In 1984 Louis Fielder was hired to help perfect our AC-1 system, but he soon became intrigued by the possibility of using the FFT as a means of developing an audio coder. One of the first things he invented was the windowing technique for bridging blocks in the bitstream. This technique became pivotal in the development of our perceptual audio coders.
In early 1986 Fielder demonstrated his first coder to me. At this time I was still finishing the work on the SR system and so had not paid much attention to Fielder's work. I was amazed by his results. The signal quality was surprisingly good from the outset. I had been expecting to hear various kinds of distortions, the sounds of which I could not even imagine. Since I was totally unprepared to hear such a good result so quickly, I became an early convert to perceptual audio coding.
It seemed evident that this coding technique could be used in a number of different applications, including film sound tracks. For some years we had been thinking about putting digital sound on film, either somewhere on the film or using an interlock system. However, all proposals seemed problematic until the development of our perceptual coder. This seemed to be the necessary key to unlock the puzzle. So after 1986, under Martin Richards, we accelerated our long-standing digital on film project, with an emphasis on optical issues such as recording bits on the film and reproducing them. Charles Seagrave led the group to produce the final digital electronic products in 1992.
Needless to say, having just finished the development of the SR system, I was eager to see my own new system used on film soundtracks. I didn't especially fancy seeing a new digital system, even one created by us, immediately upstage my new super-analog development. However, I needn't have worried. It took six more years of technical development work before the new optical and digital techniques were finished and could be released on film sound tracks. This finally happened in June 1992. Meanwhile SR was ready to go in 1986 and began to be used on film sound tracks in 1987. SR on film did what it was supposed to; the noise level was significantly reduced, and higher signal levels could be recorded with reduced distortion. The optical sound format had been transformed into something new and wonderful.
I had the satisfaction of seeing my SR development being used for several years as the premier sound format on film, and of course its use continues to this day as the backup analog track. Nonetheless, I was in the position of seeing my own company gradually obsolete my own development. This is not the worst result that could be imagined, but is one that decidedly results in mixed feelings. So I became an enthusiast and champion of our new digital perceptual coder direction and just let SR take care of itself.
SR didn't do too badly. It is not only the analog backup on film but remains an archival format. In the 1980's and somewhat into the 1990's analog SR was a preferred master recording format in recording studios and post production facilities - and still is in some circles.
However, we could not anticipate the amazing inventions and discoveries that favored digital recording, especially the increased magnetic bit packing densities on hard drives. The density soared from 1 million to 800 million bits/sq. in. between 1975 and 1995. These improvements were nothing short of astounding. We can now clearly see that the effect was to so definitely tip the economic scales in favor of recording sound digitally that it's simply not worth it anymore trying to find ways of making analog recording competitive. It is now significantly more expensive to find and use analog recording media than digital media of equivalent capability. A roll of two inch tape costs about $160 and provides 24 channels of recording for about half an hour. An 80 GB hard drive also costs about $160 and will provide the same amount of digital recording for about 6 hours. These digital advantages are improving all the time. The latest number, about two weeks ago, is 100 GB for $49 US, about a factor of four improvement over the ones I just quoted.
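The media arithmetic above can be checked with a short sketch. The prices, capacities, and the half hour of tape come from the text; the 48 kHz sampling rate, 24-bit words, and decimal gigabytes are assumptions, chosen because they reproduce the "about 6 hours" figure.

```python
def recording_hours(capacity_gb, channels, sample_rate_hz, bits_per_sample):
    """Hours of uncompressed PCM that fit on a drive (decimal gigabytes)."""
    bytes_per_second = channels * sample_rate_hz * bits_per_sample / 8
    return capacity_gb * 1e9 / bytes_per_second / 3600


def cost_per_gb(price_usd, capacity_gb):
    """Price per gigabyte in US dollars."""
    return price_usd / capacity_gb


# Assumed format (not stated in the lecture): 24 channels, 48 kHz, 24-bit.
disk_hours = recording_hours(80, channels=24, sample_rate_hz=48_000, bits_per_sample=24)

# Cost of one hour of 24-channel recording on each medium.
tape_cost_per_hour = 160 / 0.5           # $160 roll lasting half an hour
disk_cost_per_hour = 160 / disk_hours    # $160 drive

# Improvement between the two drive prices quoted in the text.
improvement = cost_per_gb(160, 80) / cost_per_gb(49, 100)

print(f"80 GB drive: {disk_hours:.1f} hours of 24-channel audio")
print(f"tape ${tape_cost_per_hour:.0f}/hour vs disk ${disk_cost_per_hour:.0f}/hour")
print(f"price per GB improved by a factor of {improvement:.1f}")
```

Under these assumptions the drive holds a little over six hours, and the newer price works out to roughly a fourfold drop in cost per gigabyte, consistent with the figures quoted.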
So, following Louis Fielder's convincing perceptual coder demonstration in 1986, the digital transformation was well under way at Dolby Laboratories. This was not exactly planned, but it happened. We were being pushed by our own developments and those of others and also by the incred