8 Bits Too Many

All other things being equal, in audio 24 bits are better than 16 bits. The latter permits an analogue wave to be defined on a scale of 65,000-odd levels, while 24 bits bumps that up to more than 16.7 million (each additional bit doubles the precision).

To look at it another way, the fundamental imprecision of a 16 bit system — due to the digital system alone — is at -96dB (relative to the full scale of the system). For 24 bits, that imprecision is pushed way down to -144dBFS. A 24 bit system could encompass the full range of human hearing — from too quiet to be heard at all, all the way through to beyond the threshold of pain — without its defects even being theoretically detectable by a human.
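
If you want to check those figures, the arithmetic is simple enough. Here is a quick sketch in Python, using nothing more than the standard levels-and-decibels rule of thumb (no codec involved):

```python
import math

for bits in (16, 24):
    levels = 2 ** bits                       # discrete amplitude steps available
    floor_db = 20 * math.log10(2 ** -bits)   # quantisation floor relative to full scale
    print(f"{bits} bits: {levels:,} levels, floor at about {floor_db:.0f}dBFS")

# 16 bits: 65,536 levels, floor at about -96dBFS
# 24 bits: 16,777,216 levels, floor at about -144dBFS
```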

But all other things aren’t equal. 24 bits comes at a cost: larger file sizes. And we are dealing with the real world, so imprecisions due to the digital system alone are, most of the time, the least of our problems.

So I hope readers will entertain a rather contrarian view here. In brief, I shall argue that in the case of movies delivered on Blu-ray, most of the extra eight bits offered by 24 bit audio are not being used to provide a more pure signal at all. Instead, they are mostly delivering nothing but noise.

A Bold Claim

What evidence do I have for this claim? Well, there are two main arguments: one to do with how movie sound is captured, and one to do with how it is presented.

Sound is, of course, captured with microphones. A fine quality condenser studio microphone will typically have a signal to noise ratio of 85dBA to 90dBA. That in itself is 15 bit territory. The sound is usually fed to a microphone preamplifier, which adds more noise. Of course, you can hear signal even somewhat below the noise floor. Maybe enough to get coherent sound down towards -96dB, the theoretical limit of a 16 bit system. But how far into 24 bit territory does it extend?
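
That “15 bit” figure is just the roughly-6dB-per-bit rule of thumb run in reverse, which a one-liner will show (Python, and illustrative only):

```python
# The rough 6dB-per-bit rule of thumb, run in reverse
for snr_db in (85, 90):
    print(f"{snr_db}dBA signal to noise ratio ~ {snr_db / 6.02:.1f} bits of useful resolution")
# 85dBA ~ 14.1 bits; 90dBA ~ 15.0 bits
```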

I am by no means suggesting that sound shouldn’t be recorded at 24 bits. In fact, that’s the best use of higher precision. It means that more headroom can be allowed when recording levels are set to allow for unexpected peaks, without losing significant resolution in the sections of the recording which are performed at more moderate levels.

Movie sound is not really what would be called audiophile quality anyway. An audiophile music label will go on about the minimal (if any) processing applied to the music between the microphone and the disc that you buy. A movie sound track, by contrast, is assembled from bits and pieces. Much of it, especially dialogue, is captured as the scene is being acted, but then the Foley people add sound effects and the music is mixed in. Sometimes dialogue is re-recorded separately in an additional dialogue recording (ADR) process. And then, in the final mix, all the elements are panned to their appropriate places in the 360 degree sound field by an audio engineer. Purity this isn't.

Lossless Compression

As to how it is presented, Dolby TrueHD and DTS-HD Master Audio offer a new window into the sound. Before Blu-ray, multichannel movie sound was always lossily compressed. That is, the sound was an approximation of what the movie audio engineer actually produced.

Consider a movie presented in Dolby Digital. An uncompressed 5.1 channel version of the sound would consume 4608kbps … and that’s at 16 bits of resolution with a 48kHz sampling rate. At 24 bits that is bumped up to 6912kbps. If you went to see that movie at the cinema, the Dolby Digital track would squash the 4608kbps down to just 320kbps, or 7%. On DVD, it would deliver 384kbps (8%) or 448kbps (10%). On Blu-ray, up to 640kbps is permitted (14%).
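
Those PCM figures are simply channels times sample rate times bit depth, counting the LFE as a full channel as the figures above do. A quick check in Python:

```python
channels, sample_rate = 6, 48_000   # 5.1 counted as six full channels, as above

for bits in (16, 24):
    kbps = channels * sample_rate * bits / 1000
    print(f"{bits} bit 5.1 PCM: {kbps:.0f}kbps")

# 16 bit 5.1 PCM: 4608kbps
# 24 bit 5.1 PCM: 6912kbps
```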

Dolby Digital works in part by eliminating redundancy (there is no need to encode the rear channels if nothing is happening back there, and a sampling frequency of 300 hertz is sufficient for the LFE channel, rather than 48,000Hz). But most of the saving comes from tossing out bits of sound that the Dolby algorithms consider relatively inaudible. That's why it is called lossy compression: some of the sound is actually lost.

DTS does something similar (its cinema and home systems are entirely different), offering a choice of 1509kbps (33%) or 768kbps (17%) on both DVD and Blu-ray. At the higher bitrate it is actually lossless for much of the time, but lossy at those moments when too much is happening for its bitrate to fully capture.

Lossless compression is a real challenge with audio. Once again, some redundancy can be removed (when channels are silent they can be omitted, and when quiet they can be encoded with fewer bits). But, basically, it is hard to compress audio because there really isn’t all that much repetition.

But there is something in audio that is almost as good: predictability. Only approximate, to be sure, but enough to allow big savings. If the waveform of an audio track has been climbing steadily from negative to positive for three or four samples, you can be fairly confident that the next sample will be higher still. Simple models can approximately predict the next sample based on the previous few. More developed ones can take into account the rounded tops and bottoms of waveforms. All this points the way to lossless compression.

Basically, you use a model to approximate the waveform, and then supply a list of corrections. Here’s the tricky part: the model itself uses very little data. It can just be a set of rules built into the encoder and decoder. Almost all the actual bitstream of data is the corrections.

But because the model-based waveform is approximately right, most of the corrections do not need the full 24 or 16 bits of resolution to be defined. Most can be defined by 4 or 8 bit offsets, perhaps even fewer. If you have a 16 bit sound track and a good predictive model, sufficient to allow 8 bit corrections, then the actual bitrate will be close as dammit to 8 bits per sample instead of 16: a 50% compression ratio! And a compression that allows perfect reconstruction of the original sound!
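
To make the predict-and-correct idea concrete, here is a minimal sketch in Python. It is an illustration only, not the actual MLP or DTS algorithm: a second-order predictor extrapolates each sample in a straight line from the previous two, and only the (usually small) corrections need to be stored.

```python
def encode(samples):
    """Store each sample as the correction to a simple straight-line prediction."""
    corrections = []
    for i, s in enumerate(samples):
        prediction = 0 if i < 2 else 2 * samples[i - 1] - samples[i - 2]
        corrections.append(s - prediction)   # this is all that needs to be transmitted
    return corrections

def decode(corrections):
    """Rebuild the samples exactly by re-running the same predictor."""
    samples = []
    for i, c in enumerate(corrections):
        prediction = 0 if i < 2 else 2 * samples[i - 1] - samples[i - 2]
        samples.append(prediction + c)
    return samples

original = [0, 1200, 2300, 3300, 4100, 4700, 5000]   # a smoothly rising waveform
corrections = encode(original)
print(corrections)                     # [0, 1200, -100, -100, -200, -200, -300]
assert decode(corrections) == original # perfect, lossless reconstruction
```

Apart from the start-up samples, the corrections are far smaller than the samples themselves, so they can be coded with far fewer bits per sample.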

Embeds and Cores

Both DTS-HD Master Audio and Dolby TrueHD are based on this concept. In a sense, Dolby TrueHD is the purer conception because it is based most closely on a single predictive algorithm with offsets (the codec is a development of that used on DVD Audio, called Meridian Lossless Packing). The problem is that on Blu-ray the Dolby TrueHD audio has embedded within it the same sound in Dolby Digital format, which can be used by the Blu-ray player in certain circumstances (e.g. when providing the audio bitstream over optical digital audio rather than HDMI). If you have a system fully comfortable with Dolby TrueHD, then this embedded Dolby Digital is an unnecessary waste of space. But it does no harm.

DTS-HD Master Audio is a little different. It also contains within itself a regular DTS Audio stream for backwards compatibility, but this isn’t merely embedded like Dolby Digital; it is called a ‘core’. For good reason. Instead of using a predictive mechanism and then adjusting that, DTS-HD Master Audio uses its DTS core as the approximation, and adjusts that instead. The reasoning is sound: if you have to carry the DTS track anyway, why not put it to good use?
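
Conceptually (and this is only a rough illustration, not the actual DTS-HD bitstream layout), the core-as-model approach amounts to transmitting the lossy core plus a correction track:

```python
# The lossy core reconstruction stands in for the predicted waveform,
# and the lossless layer carries the corrections back to the original samples.
original = [100, 250, 400, 380, 200]
core_decode = [96, 252, 398, 384, 196]     # what the lossy DTS core might reproduce

corrections = [o - c for o, c in zip(original, core_decode)]
print(corrections)                          # [4, -2, 2, -4, 4]: small and cheap to encode

reconstructed = [c + d for c, d in zip(core_decode, corrections)]
assert reconstructed == original            # the original is recovered exactly
```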

Both DTS-HD Master Audio and Dolby TrueHD are, unlike their lossy forebears, variable bitrate codecs. In quiet and easy times they require less data throughput than in busy multichannel times. So their average bitrates allow us to compare certain things.

For example, we can see how much more efficient DTS-HD Master Audio's 'adjustment' scheme is. I checked the average bitrate of 121 Dolby TrueHD movie sound tracks, all of which offered 16 bit, 48kHz sound (16/48). The mean of these was 1550kbps, with a range from 1138 to perhaps as much as 2639kbps (Dolby TrueHD doesn't record the bit depth in an easily reportable way, so there is inevitably some guesswork in this).

Out of the 85 16/48 DTS-HD Master Audio sound tracks I checked, the average bitrate was 2142kbps.

That looks worse, of course. But consider this: most of those Dolby TrueHD tracks had an embedded 640kbps Dolby Digital track, which would increase the total track bitrate to 2190kbps, while the figures for the DTS-HD Master Audio tracks include their 1509kbps cores.

Now let’s consider the efficiency of adjustments alone. Virtually all of Dolby TrueHD’s 1550kbps is adjustment from the modelled waveform. But only 633kbps (= 2142-1509) of DTS-HD Master Audio’s bitrate was adjustment from its regular DTS ‘model’.
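
Laying that arithmetic out in one place (all figures are the averages reported above):

```python
# Comparing like with like, in kbps, using the averages reported above
truehd_total = 1550 + 640            # TrueHD adjustment plus its embedded Dolby Digital
dtshd_total = 2142                   # the DTS-HD MA figure already includes its core
print(truehd_total, dtshd_total)     # 2190 vs 2142: much of a muchness overall

truehd_adjustment = 1550             # virtually the whole TrueHD stream is adjustment
dtshd_adjustment = 2142 - 1509       # 633kbps of correction above the DTS core
print(truehd_adjustment, dtshd_adjustment)
```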

For no particularly rational reason I prefer Dolby as a brand to DTS, but I’ve got to score this as a win for DTS.

But we’re not here to compare Dolby and DTS. What we’re interested in is 24 bits vs 16 bits.

Predictability

What determines the average bitrate of a Dolby TrueHD or DTS-HD Master Audio track? Basically, two things tend to require more bits. The first is channel usage. A dialogue-heavy movie with little surround channel content can be very compact, since most of the channels will be unused, or little used. One of the DTS-HD Master Audio tracks I measured required a mere 171kbps above its standard DTS core.

The other attribute controlling bitrate is predictability, particularly with Dolby TrueHD, but also to a significant extent with DTS-HD Master Audio. Although the DTS core is fixed in bitrate, how accurately it defines the sound is in large part determined by predictability, because the core itself uses a predictive-plus-adjustment mechanism internally. This is partially masked by the fixed bitrate, but it is still there.

Obviously a predictive technique will not work very well with inherently unpredictable data. That being the case, the lack of predictability will be marked by larger adjustments from the failed prediction to the correct value.

In other words, the average adjustment bitrate is a decent marker of predictability.

Now what is unpredictable? Noise!

Noise is stuff in the signal that isn’t supposed to be there. More precisely, it is stuff other than harmonic distortion that isn’t supposed to be there. Harmonic distortion isn’t random, but noise is random or semi-random (noise can sometimes be weighted, such as pink noise). One good definition of random is unpredictable.

Consequently a predictive algorithm is, basically, crap at predicting noise. So noise tends to blow out the average bitrate required to adjust the prediction back to reality.
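
You can demonstrate this with the same toy predictor used earlier: feed it a smooth tone and then random noise, and compare the size of the corrections it is left to carry. (Again Python, and an illustration only; neither codec's real predictor is this crude.)

```python
import math, random, statistics

def residual_corrections(samples):
    """Corrections left over after a simple straight-line predictor."""
    out = []
    for i, s in enumerate(samples):
        prediction = 0 if i < 2 else 2 * samples[i - 1] - samples[i - 2]
        out.append(s - prediction)
    return out[2:]   # ignore the start-up samples with no history

n = 4800   # a tenth of a second at 48kHz
tone = [int(20000 * math.sin(2 * math.pi * 500 * t / 48000)) for t in range(n)]
noise = [random.randint(-20000, 20000) for _ in range(n)]

for name, sig in (("500Hz tone", tone), ("white noise", noise)):
    avg = statistics.mean(abs(r) for r in residual_corrections(sig))
    print(f"{name}: average correction magnitude ~{avg:.0f}")
# The tone's corrections are tiny; the noise's corrections are as large as,
# or larger than, the samples themselves.
```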

The Numbers

All 48kHz 5.1 movie audio must, in a good system, end up in PCM format. As mentioned, the bitrates will be 4608kbps for 16 bit material, and 6912kbps for 24 bit content. Obviously the latter figure is 50% bigger than the former.

Let us compare 16 and 24 bit Dolby TrueHD audio.

For the 121 sixteen bit audio tracks, as mentioned, the average bitrate was 1550kbps, virtually all of which was adjustment to the prediction. For fifty 24 bit audio tracks, the average bitrate was 3307kbps, an increase not of 50% but of more than 110%. To put this another way, 1550kbps was enough to completely encode the most significant 16 bits of a movie sound track, a compression to about a third of the 4608kbps they would occupy uncompressed. But the additional eight bits required a further 1757kbps, which is 76% of the 2304kbps they would occupy uncompressed. We would be justified in concluding that the first 16 bits of a 24 bit audio track are much, much more compressible than the last eight.

More compressible, therefore more predictable. Those less predictable 8 bits are consequently most likely to be largely full of random data, aka noise.

Repeating this with DTS-HD Master Audio, I measured 85 sixteen bit encodes with an average adjustment of 633kbps. For the 166 twenty-four bit discs, the average adjustment (total bitrate minus the DTS core's bitrate) was 2296kbps. Subtracting the 16 bit adjustment rate of 633kbps leaves 1663kbps for the final eight bits, or 72% of the 2304kbps those bits would occupy uncompressed.
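
Putting those averages through the arithmetic makes the pattern plain (Python, using only the figures reported above):

```python
pcm_16, pcm_24 = 4608, 6912        # uncompressed 5.1/48kHz PCM bitrates in kbps
extra_8_bits = pcm_24 - pcm_16     # 2304kbps carried by the last eight bits alone

# Dolby TrueHD: the adjustment is effectively the whole stream
truehd_16, truehd_24 = 1550, 3307
print(round(truehd_16 / pcm_16, 2))                        # 0.34: first 16 bits compress to a third
print(round((truehd_24 - truehd_16) / extra_8_bits, 2))    # 0.76: last 8 bits barely compress

# DTS-HD Master Audio: adjustment = total bitrate minus the 1509kbps core
dts_adj_16, dts_adj_24 = 633, 2296
print(round((dts_adj_24 - dts_adj_16) / extra_8_bits, 2))  # 0.72: the same story
```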

Conclusion

Whether 16 bit and 24 bit audio are audibly different with a pure signal is not something I've addressed here. I am looking at noise. On that basis, 24 bits is simply too much, at least on average.

However, 16 bits might not be enough. It seems that the least significant eight bits are at least somewhat compressible, which means that they are at least slightly predictable, which means that they are not entirely noise.

Just mostly noise.

Perhaps a good solution would be to use 24 bit resolution, but to truncate the data to 18 or 20 bits prior to encoding. That would eliminate much of the incompressible noise, saving precious bits, while preserving the full audio resolution of the actual signal.
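
What that would amount to, in a rough sketch, is zeroing the lowest bits of each 24 bit sample before the lossless encoder ever sees it. (A real implementation would dither down to the lower bit depth rather than simply truncating, but the effect on compressibility is the point here.) For illustration, in Python:

```python
def truncate_to(sample_24bit: int, keep_bits: int) -> int:
    """Zero the least significant bits of a 24 bit sample, keeping keep_bits of resolution."""
    drop = 24 - keep_bits
    return (sample_24bit >> drop) << drop   # the dropped bits become zeros: cheap to encode

sample = 0b101101011011010110101101            # an arbitrary positive 24 bit sample
print(f"{truncate_to(sample, 20):024b}")       # low four bits forced to zero
print(f"{truncate_to(sample, 18):024b}")       # low six bits forced to zero
```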

—–

A NOTE ON MEASUREMENT

The bitrates were derived from scans I performed on more than five hundred Blu-ray discs using the tool BDInfo, available at www.cinemasquid.com/blu-ray/tools/bdinfo. I chose only the main feature's bitrate, and if several versions of a movie were available, I chose the Theatrical version. I counted only those discs with 5.1 sound (quite a few have 7.1) and a 48kHz sampling frequency.

A small number of discs used a DTS core with a bitrate of 768kbps rather than 1509kbps. I omitted those as well to compare like with like.

—–

A NOTE ON THE PRECIOUSNESS OF BITS

So what if a couple of thousand kilobits per second are wasted on noise in an audio track? Does it matter at all? Isn't there plenty of space on a Blu-ray?

Well, there is. But that's not really the issue. More important is that there is a maximum bitrate available from Blu-ray: the total audio and video stream is limited to 48,000kbps, with the video portion capped at 40,000kbps. High bitrate audio tracks come at the expense of other audio tracks and of the video bitrate. Some movies have their video encoded at a more or less constant bitrate, but many others make considerable use of the variable bitrate capabilities of the formats. In the movie world, the moments that make the highest demands on video compression, and therefore need the highest video bitrates (fast movement and action), tend to coincide with the moments of intense surround activity that demand high audio bitrates. So a high audio bitrate can actually lead to reduced picture quality!
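
A rough budget illustration, using that stream limit and the average audio bitrates measured earlier (a real disc also carries secondary audio tracks, subtitles and multiplexing overhead, so treat the numbers as indicative only):

```python
stream_cap = 48_000   # maximum total audio/video bitrate on the disc, in kbps

for label, audio_kbps in (
    ("16 bit Dolby TrueHD plus 640kbps Dolby Digital", 1550 + 640),
    ("average 24 bit DTS-HD Master Audio, core included", 1509 + 2296),
):
    print(f"{label}: ~{stream_cap - audio_kbps}kbps left for video and everything else")
# Every kilobit per second spent on audio is one the video encoder
# cannot call on at its busiest moments.
```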

[A version of this was published in Sound and Image magazine in 2011/2012].

© Stephen Dawson 2012
