DVD Audio Part 2: Sounding Good
Many Methods For Digital Audio
By Trevor Marshall
December 11, 2000
If you look at the directory of a DVD you will see two folders on it, VIDEO_TS, which contains the DVD-Video content, and AUDIO_TS, which contains the content to be processed by the new DVD-Audio players.
VIDEO_TS can contain a number of audio formats, the most important of which are plain old 48-KHz, 16-bit PCM, Dolby AC3 Digital Surround, DTS 5.1 Digital Surround, and Sony's SDDS (1 to 7.1 channels). The PCM channel can also carry Dolby Pro-Logic or Analog Surround Sound data.
Will My DVD Machine Play DVD-Audio?
In Part I, I examined the technologies for processing audio so that it can be distributed on CD and DVD media. I explained that the industry's WG-4 Standards Committee, which was defining the DVD-Audio format, had settled on a PCM system that will allow up to 6 channels of 44.1, 48, or 96 KHz, 16, 20, or 24-bit PCM to be distributed in a DVD format, and that new DVD players and DVD-ROM drives would be needed in order to play this new audio content.
An optional 2-channel, 24-bit datastream at a 192-KHz sampling rate is primarily intended for "studio masters," no doubt to preserve for posterity the huge library of old vinyl records and analog master tapes. :-)
None of these data schemes are compatible with the DVD-Video audio formats I mentioned above, or the DVD-Video player in your living room, or the DVD-ROM drive in your computer.
So please excuse me for flaming a little as I cast my technical eye over the bill of goods we are being sold in this Great Step Forward for audio technology.
In my October column, I wrote about analog-to-digital conversion, and explained how the Nyquist Theorem limits the maximum frequency that can be reproduced (after a digital-sampling process) to one half of the sampling frequency.
But that was a basic explanation, and what I didn't cover was digital-filter theory, and how the filtering algorithms break down when the input gets close to the Nyquist frequency. A good ear can (allegedly) hear to 20 KHz, so why can't we just sample at 40 KHz and be done with it? Well, the digital-filtering theory is a little complex (that's an understatement) and only the DSP designers tend to be interested, anyway. But what I will say is that a number of good DSP implementations have been developed for the audio CD sampling frequency of 44.1 KHz that allow adequate performance right up to input frequencies approaching the 20-KHz audible limit.
So why do we need to sample at 48 KHz, or 96 KHz or even 192 KHz?
I'll buy into the first one, 48 KHz, which is now universally used in DVD and DV recording. It gives a little more headroom over the 20-KHz target without expanding the size of the data-stream too much. But 96 KHz and 192 KHz? You've got to be kidding.
I guess, on the other hand, if the public can be persuaded that any audio sampled at less than 96 thousand samples per second is "inferior," or "sub-standard," maybe we will all go out and buy new receivers and DVD-Audio players and continue to fuel the boilers of the consumer electronics industry.
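To put numbers on the headroom argument above, here is a minimal Python sketch that applies the Nyquist rule (half the sampling frequency) to each rate mentioned in this column and compares it with the 20-KHz audible limit. The function names are illustrative only, and the 20-KHz figure is simply the "good ear" limit quoted earlier, not a measured value.

```python
AUDIBLE_LIMIT_HZ = 20_000  # the "good ear" limit quoted in the column (an assumption, not a measurement)


def nyquist_hz(sample_rate_hz):
    """Highest frequency that can be represented: half the sampling rate."""
    return sample_rate_hz / 2


def headroom_hz(sample_rate_hz):
    """Gap between the Nyquist frequency and the audible limit."""
    return nyquist_hz(sample_rate_hz) - AUDIBLE_LIMIT_HZ


for rate in (40_000, 44_100, 48_000, 96_000, 192_000):
    print(
        f"{rate / 1000:g} KHz sampling: "
        f"Nyquist {nyquist_hz(rate) / 1000:g} KHz, "
        f"headroom {headroom_hz(rate) / 1000:g} KHz"
    )
```

Sampling at 40 KHz leaves no room at all for the filters to work in, 44.1 KHz leaves about 2 KHz, and 48 KHz leaves 4 KHz. The higher rates push the Nyquist frequency far beyond anything the ear is claimed to hear.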
When the bitrate for six channels of 24-bit, 96-KHz PCM was first calculated, it exceeded the maximum bitrate for the DVD media (which is just under 10 Mbps). The industry therefore adopted a lossless compression scheme called Meridian Lossless Packing, or MLP, and all the new DVD-Audio players have this compression integrated into their chip sets.
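The arithmetic behind that overflow is easy to check. The sketch below multiplies channels by bits per sample by samples per second for uncompressed PCM; the 10-Mbps ceiling is the rounded figure from the paragraph above, used here only as an upper bound, and the function name is illustrative.

```python
DVD_MAX_MBPS = 10.0  # "just under 10 Mbps" per the column; treated as an upper bound


def pcm_bitrate_mbps(channels, bits_per_sample, sample_rate_hz):
    """Raw (uncompressed) PCM bitrate in megabits per second."""
    return channels * bits_per_sample * sample_rate_hz / 1_000_000


six_channel = pcm_bitrate_mbps(6, 24, 96_000)
two_channel = pcm_bitrate_mbps(2, 24, 192_000)

for label, rate in (("6 ch, 24-bit, 96 KHz", six_channel),
                    ("2 ch, 24-bit, 192 KHz", two_channel)):
    verdict = "exceeds" if rate > DVD_MAX_MBPS else "fits within"
    print(f"{label}: {rate:.3f} Mbps ({verdict} the {DVD_MAX_MBPS:g} Mbps ceiling)")
```

Six channels at 96 KHz come to 13.824 Mbps, well over the limit, while the two-channel 192-KHz stream comes to 9.216 Mbps.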
There is some very interesting data in the technical description of the MLP compression scheme, which is basically a lossless Huffman algorithm with some pre-processing. The following data is given for MLP compression's effectiveness:
Sampling Freq | Peak Bits Saved | Average Bits Saved |
48 KHz | 4 | 8 |
96 KHz | 8 | 9 |
192 KHz | 10 | 11 |
In other words, with a 192-KHz sampling frequency, a 24-bit audio datastream, on the average, will compress down to 24-11 = 13 bits of data.
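The same subtraction can be run for every row of the table. This is only a sketch of the arithmetic: the figures are copied from the table as quoted above, the 24-bit word length is the one used in the example, and nothing here models how MLP itself works.

```python
SOURCE_BITS = 24  # word length used in the column's example

# Sampling rate in KHz -> (peak bits saved, average bits saved), as quoted in the table.
MLP_BITS_SAVED = {48: (4, 8), 96: (8, 9), 192: (10, 11)}


def average_packed_bits(rate_khz, source_bits=SOURCE_BITS):
    """Average bits per sample left after compression, per the quoted table."""
    _peak, average = MLP_BITS_SAVED[rate_khz]
    return source_bits - average


for rate_khz in MLP_BITS_SAVED:
    packed = average_packed_bits(rate_khz)
    print(
        f"{rate_khz} KHz: {SOURCE_BITS} bits -> about {packed} bits per sample "
        f"on average ({packed / SOURCE_BITS:.0%} of the original size)"
    )
```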
This degree of compression probably means the entropy of the data stream was not sufficiently high in the first place or, in other words, that there are too many samples that contain exactly the same data value as the samples around them. I would conclude, therefore, that either 192 KHz is too high a sampling frequency or 24 bits is too accurate a conversion (probably the former).
Luckily, it is lossless compression, and you are getting all that data streaming to your sound system, and some of it just might be inaudible if the entropy had been attacked with more aggressive techniques (like MP3 or Dolby compression), and therefore all that data can't be bad all the time. At least, that's how I think the industry's argument goes.
Oh, and if somebody can explain to me why Meridian says the 192-KHz average bit saving (11) is higher than the peak bit saving (10), it would give me a lot more confidence in the MLP technology it is promoting in this paper.
Most of the new DVD-audio titles currently available contain both the DVD-audio multi-channel PCM format as well as video-compatibility tracks recorded in the "old" DTS and Dolby Digital modes.
While researching this article, I called my local Good Guys audio store and asked if someone could give me a demonstration of the new DVD-Audio players. "Of course" was the reply. But when I got to the store I found that the DVD-audio disk that it had placed in the new player for me (with considerable hoopla) was actually playing through the Dolby Digital 5.1 VIDEO_TS compatibility mode. I found the Pioneer DVD-Audio player did not have any internal AUDIO_TS compatible decoding, and that a new receiver that could decode the new datastream would also be needed. And this store hadn't seen any of those receivers yet. But then again, since I was the first to call its bluff on the demonstration of this so-called "new" DVD-audio technology, why would it need to set up a proper demonstration anyway?
I Will Finish On A Positive Note
The DVD-video specification has already defined a method for decoding 48-KHz, 16-bit audio, Dolby Digital Surround, DTS Surround, and SDDS, all of which (in my humble opinion) are capable of delivering excellent content that can be played on the current generation of DVD-video players and DVD-ROM drives.
The software tools are starting to become available so that technically literate people (Byte.com readers, for example) can create multi-channel sound tracks for their own DV videos. I am intrigued by the concept of using the four (lower-quality) DV audio channels to record signals from a surround-sound microphone like this one, for later matrixing into a full surround-sound presentation.
Surround-Sound Software Tools
There are very few open source software tools for manipulating surround audio, and of these only one, AC3DEC.EXE, an MS-DOS-based AC3-to-PCM file converter, seems to be stable and reasonably configurable. But I recommend you find a GUI to control it, because to get a proper five-channel to two-channel downmix, with the correct balance between each of the five input channels, you will need a command string such as:
c:\ac3dec.exe "f:test.ac3" -gain 250 -gain2 175 -gaincenter 150 -gainrear 150 -gainlfe 150 -wav "c:\test.wav"
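If no GUI is at hand, a few lines of script can at least keep the gain settings in one place. The sketch below only assembles the argument list; the flag names and default values are copied from the command string above, their exact meaning has not been verified here, and the function name build_ac3dec_args is hypothetical.

```python
def build_ac3dec_args(ac3_path, wav_path, gain=250, gain2=175,
                      gain_center=150, gain_rear=150, gain_lfe=150,
                      exe=r"c:\ac3dec.exe"):
    """Assemble an AC3DEC argument list using only the flags quoted in the column."""
    return [
        exe, ac3_path,
        "-gain", str(gain),
        "-gain2", str(gain2),
        "-gaincenter", str(gain_center),
        "-gainrear", str(gain_rear),
        "-gainlfe", str(gain_lfe),
        "-wav", wav_path,
    ]


args = build_ac3dec_args("f:test.ac3", r"c:\test.wav")
print(" ".join(args))
```

On a machine where the tool is actually installed, a list like this could be handed to Python's subprocess.run; that step is left out here because it depends on the local setup.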
Sonic Systems, Minnetonka Audio, and Sonic Foundry all have software that will convert .wav files to and from the surround-sound formats.
Of these, I prefer the SoftEncode product from Sonic Foundry.
It is quite easy, using SoftEncode, to take five wav files and put them together into a Dolby Digital AC3 5.1 track that can be burned just like an audio CD. It will play on most DVD players that play audio CDs.
(Note: As this column went to press I received an e-mail from Rebecca Grow, PR coordinator for Sonic Foundry, which said that SoftEncode was no longer being sold as an identifiable product, and "eventually, the technology will be used in future Sonic Foundry products." I would like to refer you to Sonic Foundry to buy this product, but I am not sure what to advise you to do, since I know that you will stumble over copies of SoftEncode as you wander around the Internet.)
First, produce the five wav files. They need to be sampled at 44.1 KHz (the audio CD standard). Now, using the SoftEncode command FILE:OPEN(wav), add each of the five files in turn. The first is Front Left, then Center, then Front Right. Finally, add the Rear Left and Rear Right channels. Note that, as each is added, the little 5.1 icon at the far left of the track display will have a white dot in the speaker position that the wav file from that channel will be routed into.
Then select OPTIONS:ENCODE SETTINGS and either accept the default values, or play with them to get the feel for how flexible and easy AC3 encoding really is (don't forget to select 44.1 KHz as the sampling frequency).
Finally, select FILE:SAVE AS (dolby digital.wav) and then burn the resulting file to a CD-R with your burner set into the Audio CD mode. I use NERO, which works fine.
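Since the procedure above depends on all five wav files being sampled at 44.1 KHz, a quick check before opening SoftEncode can save a wasted encode. The sketch below uses Python's standard wave module to read each file's sampling rate. The file names and helper functions are hypothetical, and the demo writes short silent files into a temporary folder purely so that the example runs on its own.

```python
import os
import tempfile
import wave

CD_RATE_HZ = 44_100
# Channel order as given in the procedure above.
CHANNEL_ORDER = ["Front Left", "Center", "Front Right", "Rear Left", "Rear Right"]


def sample_rate_hz(path):
    """Read the sampling rate from a wav file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def wrong_rate_channels(paths_by_channel, expected_hz=CD_RATE_HZ):
    """Return the channels whose wav file is not at the expected rate."""
    return [ch for ch in CHANNEL_ORDER
            if sample_rate_hz(paths_by_channel[ch]) != expected_hz]


def write_silent_wav(path, rate_hz, frames=100):
    """Write a short mono 16-bit file of silence (demo helper only)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * frames)


def demo():
    """Create five test files, one deliberately at 48 KHz, and check them."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = {}
        for ch in CHANNEL_ORDER:
            rate = 48_000 if ch == "Center" else CD_RATE_HZ
            paths[ch] = os.path.join(tmp, ch.replace(" ", "_") + ".wav")
            write_silent_wav(paths[ch], rate)
        return wrong_rate_channels(paths)


print(demo())
```

In real use, wrong_rate_channels would be given a dictionary mapping each channel name to the path of your own wav file, and any channel it reports would need resampling before encoding.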
(Note: An explanation of how to take an existing AC3 file, resample it and burn it onto an audio CD, can be found at http://dvd.box.sk/?pid=m&s=ac3-1.)
The DVD-Audio Revolution
One way or another, the 4.7 gigabytes of storage on a DVD open up a plethora of solutions for delivery of next-generation content. Just how many consumers end up using the new DVD-audio datastreams, and how many stay with the DVD-video compatibility modes is anybody's guess.
Right now, however, one thing is certain. The number of low-cost Dolby Digital and DTS home-theater systems has grown rapidly over the past few months. Consumer-level systems (which sound much better than an equivalently priced stereo) are now available for a few hundred dollars. Up until now, the DVD manufacturers have been complaining that the DVD market has grown too slowly.
It will be interesting to monitor the rate of growth during the transition from multi-thousand-dollar high-end equipment extravaganzas to the new affordable, mass-market configurations. Personally, I think that DVD's time has come. And the marketing dollars poured into DVD-Audio cannot help but boost the sales of all DVD equipment.