24 bit sound vs 16 bit sound and DTS-HD Master Audio

I’ve run BDInfo on the forthcoming Australian release of Slumdog Millionaire on Blu-ray from Icon Film Distribution. This turns out to be a very different encode to the US version.

The main title details of the US version are here. The main title details for the Australian version, posted by me, are here. In brief, the US version gets MPEG-4 AVC, we get VC-1 (both have healthy bitrates in the high 20s of megabits per second). Australia gets a PIP video commentary which the US doesn’t get.

Both versions get DTS-HD Master Audio 5.1 channel sound. But the US version gets 24 bits and 48kHz, whereas we get 16 bits and 48kHz. The US version has an average bitrate of 3962kbps. The Australian version averages 2032kbps. Note: throughout this Blog post, ‘k’ equals 1,000, not 1,024.

I find those bitrates interesting. Are they comparable? What do they tell us about lossless codecs?

I’m inclined to think that, aside from the 24 vs 16 bit difference, the two audio tracks are very similar, but not identical. The run time of the US release is around 30 seconds longer than the Australian release. If you look at the chapter breakdowns at those links, you will see that chapters 2 through 27 are the same length, and chapter 28, the last chapter, is actually about a second longer on the Australian version. The major timing difference is accounted for by the first chapter, and this is most likely due to different company logos at the very start of the movie.

Aside from the different sound of the logos, I’d be extremely surprised if anything was different in the source sound between the two versions. It is a very recent movie, so I expect the original multichannel PCM recording was used for both versions.

As I understand it, the only way to do DTS-HD Master Audio compression is to use equipment and software supplied by DTS, so there should be no differences in encoding methodology. If all of this is the case, then the overwhelming responsibility for the difference in bitrates between the two audio tracks must lie with the 16 vs 24 bit question.

Now 3,962 is nearly twice 2,032 (1.95x anyway). An uncompressed LPCM 24 bit, 48kHz 5.1 channel audio track runs at 6,912kbps. The 16 bit version runs at 4,608kbps. Unsurprisingly, the former is 1.5 times the size of the latter.
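
That arithmetic is straightforward: uncompressed LPCM bitrate is channels × sample rate × bits per sample, treating 5.1 as six full channels. A quick sketch in Python, using this disc’s figures:

    # Uncompressed LPCM bitrate: channels x sample rate x bits per sample.
    channels = 6         # 5.1 = six discrete channels
    sample_rate = 48000  # Hz

    for bits in (24, 16):
        kbps = channels * sample_rate * bits / 1000  # 'k' = 1,000, as above
        print(f"{bits} bit LPCM: {kbps:,.0f}kbps")

    # 24 bit LPCM: 6,912kbps
    # 16 bit LPCM: 4,608kbps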

So why does the efficiency of DTS-HD MA seemingly fall off so much with 24 bits? I’m not sure. But let’s look at how DTS-HD MA works.

The lossless compression I know best is Dolby TrueHD. This is very similar indeed to the Meridian Lossless Packing used on DVD Audio, which Dolby licensed from Meridian and championed for that purpose. Audio typically doesn’t compress very well using ‘traditional’ computer compression processes (e.g. WinZip), which largely rely on finding and eliminating redundancy. I have just dragged a 16 bit, 44.1kHz music file into a Zip folder and managed to reduce its size by 7%. Even our relatively inefficient compression of the 24 bit sound on this movie got it down by 43%. The 16 bit sound was reduced by 56%!

A 7% reduction is probably not worth the trouble. But 43% and 56% certainly are.
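
For the record, those reduction figures fall straight out of the BDInfo averages and the LPCM rates above:

    # Percentage reduction achieved by DTS-HD MA on each track of this disc.
    tracks = {
        "24 bit": (3962, 6912),  # (average compressed kbps, uncompressed kbps)
        "16 bit": (2032, 4608),
    }
    for name, (compressed, uncompressed) in tracks.items():
        print(f"{name}: reduced by {100 * (1 - compressed / uncompressed):.0f}%")

    # 24 bit: reduced by 43%
    # 16 bit: reduced by 56%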

So how to get big compression factors? The trick used by MLP and Dolby TrueHD is to build in an algorithm that, based on the sound so far, deduces what the sound will be in the future. If a waveform is increasing, then it’s highly likely that in the next sample it will still be rising. The algorithm assumes as much, and sets its guess for the next sample at a reasonable estimate of how much the waveform will have risen, given the preceding samples. Except when there is some sudden transient, which the system treats as an exception to be dealt with by other means, this gives an adequate approximation of the sound.

By ‘adequate’, I do not mean adequate for listening purposes, but adequate for the next stage of the process. That stage tweaks the sample to make it accurate. Consider what happens if, instead of using, say, 16 bit samples to describe a sound wave, you use 8 bit samples: the amount of data you handle is halved. Normal 16 bit sound describes a series of sample values. But another way of thinking about it is that it describes a series of offsets: how far each sample diverges from a particular reference value. With uncompressed PCM, that reference value is zero. With MLP and TrueHD, the reference value is different for each sample, and is derived from that approximation algorithm.

So MLP and TrueHD basically work by using a formula to guess what the sound will be, based on what it has been, and then a series of offsets to correct it. Because the guess is typically very good, the offsets are small and you can use 8 or 6 or 4 bits to communicate these, much of the time. Exceptions are provided for, of course, but most sound is efficiently compressed. And since the encoder and decoder both use the same algorithm for the ‘guessing’ part of the process, the reconstruction of the sound (guess + offset) is conducted perfectly.
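
To make the guess-plus-offset idea concrete, here is a toy lossless coder in Python. It is emphatically not the real MLP/TrueHD algorithm (the real thing uses adaptive prediction filters and entropy coding), but it shows the principle: run a crude predictor, store only the small corrections, and reconstruct perfectly.

    import math

    # Toy 'predict then correct' lossless coder. The predictor is deliberately
    # crude: assume the signal keeps moving by the same amount as last time.
    def predict(prev2, prev1):
        return 2 * prev1 - prev2

    def encode(samples):
        # Store the first two samples verbatim, then only the offsets
        # (residuals) between prediction and reality.
        residuals = list(samples[:2])
        for i in range(2, len(samples)):
            residuals.append(samples[i] - predict(samples[i - 2], samples[i - 1]))
        return residuals

    def decode(residuals):
        # Run the same predictor and add the stored offsets back:
        # guess + offset rebuilds every sample exactly.
        samples = list(residuals[:2])
        for i in range(2, len(residuals)):
            samples.append(predict(samples[i - 2], samples[i - 1]) + residuals[i])
        return samples

    # A loud 16 bit sine wave: the samples span a huge range, yet the
    # residuals stay tiny and could be stored in far fewer bits.
    wave = [round(20000 * math.sin(2 * math.pi * 440 * n / 48000)) for n in range(200)]
    res = encode(wave)
    assert decode(res) == wave                 # lossless round trip
    print(max(abs(s) for s in wave))           # ~20,000: needs ~16 bits
    print(max(abs(r) for r in res[2:]))        # ~67: fits comfortably in 8 bits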

That’s Dolby. How about DTS?

Both Dolby TrueHD (but not MLP) and DTS-HD Master Audio carry, on Blu-ray, a ‘core’ within themselves to cater for equipment that doesn’t support the new audio formats. Dolby TrueHD carries a Dolby Digital core (typically at 640kbps, but sometimes at 448kbps). DTS-HD Master Audio carries a DTS core. In every case I have looked at so far, the DTS core is a high bitrate core at 1,536kbps (sometimes reported as 1,509kbps). Most normal DTS tracks on DVD and many on Blu-ray use a half bitrate of 768kbps.

If your system will decode Dolby TrueHD itself, then the Dolby Digital core is totally ignored. The TrueHD component of the bitstream stands alone. DTS works a little differently. Presumably the DTS engineers thought to themselves: ‘If our audio tracks are going to be carrying a 1.5Mbps data load anyway, we might as well make use of it.’ Note, also, that DTS has always claimed that DTS is nearly lossless. That is, DTS thinks that much of the time the entire 5.1 channels of PCM can be completely and perfectly reconstructed from the 1,536kbps. I am certainly not competent to dispute this claim, although I would note that this is rather more likely to be the case with 16 bit than 24 bit sound.
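
One way to see why 16 bit is the easier case: compare the fixed 1,536kbps core with the uncompressed PCM rates worked out earlier. The core has a third of the 16 bit data rate to play with, but well under a quarter of the 24 bit rate:

    # How much of the uncompressed PCM rate does the 1,536kbps core represent?
    core = 1536  # kbps
    for bits, pcm_kbps in ((16, 4608), (24, 6912)):
        print(f"{bits} bit: core is {100 * core / pcm_kbps:.0f}% of the PCM rate")

    # 16 bit: core is 33% of the PCM rate
    # 24 bit: core is 22% of the PCM rate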

So the standard DTS core forms an integral part of DTS-HD Master Audio sound. The decoder uses both the core and the rest of the bitstream to losslessly reconstruct the original sound.

Now we get to some educated guesswork on my part. It seems likely to me that DTS-HD Master Audio works in the same way as Dolby TrueHD: it uses relatively compact offsets to tweak an approximate representation of the signal into perfection. But whereas Dolby TrueHD uses a predictive algorithm, DTS uses the regular DTS core as its approximation (I suspect that regular DTS uses a much cruder predictive algorithm anyway).
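
If that guess is right, the structure would look something like the following toy sketch: a coarse core approximation that legacy decoders can use on its own, plus losslessly stored offsets that restore the original exactly. To be clear, this is my illustration of the general core-plus-residual idea, not DTS’s actual algorithm; the ‘core’ here is just crude requantisation.

    # Toy 'lossy core plus lossless residual' coder.
    SHIFT = 8  # the fake core simply throws away the bottom 8 bits

    def encode(samples):
        core = [s >> SHIFT for s in samples]            # coarse approximation
        residuals = [s - (c << SHIFT) for s, c in zip(samples, core)]
        return core, residuals                          # residuals are small

    def decode(core, residuals):
        return [(c << SHIFT) + r for c, r in zip(core, residuals)]

    samples = [12345, -20480, 31000, -7, 0, 255]
    core, res = encode(samples)
    assert decode(core, res) == samples  # core + offsets = perfect reconstruction
    print(core)  # what a legacy decoder would work from
    print(res)   # what the lossless extension adds back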

Now let us consider the sizes of the tweaks involved. Both of our Slumdog Millionaire audio tracks have DTS cores of 1,536kbps. The 16 bit version requires tweaks of a modest 496kbps (2,032-1,536). The 24 bit version needs tweaks of 2,426kbps, even though the standard DTS core approximation is itself claimed to be 24 bits in resolution!
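
Spelled out, with the fixed core subtracted from each average:

    core = 1536               # kbps, identical on both discs
    tweaks_16 = 2032 - core   # 496kbps of correction data
    tweaks_24 = 3962 - core   # 2,426kbps of correction data
    print(tweaks_24 / tweaks_16)  # about 4.9: nearly five times as much

So going from 16 to 24 bits multiplies the lossless correction data by nearly five.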

Is 24 bits worth it? That’s a question for another day.


5 Responses to 24 bit sound vs 16 bit sound and DTS-HD Master Audio

  1. Fred says:

    Yes – 24 bit is worth it

    Some may say that equipment & the human ear can only produce/hear 20 bits, but 24 bit allows more resolution in the audio. If a track is 24 bit & is downconverted to 16 bit, quality is lost because the resolution is being reduced to fit into 16 bits. The bad side of this is that the sound will not sound as “rich” & “live” as the original, but close. Also, artifacts will be present & a tendency for a ringing sound on loud scenes in the background – if you are an audio purist, it makes you cringe.

  2. Thanks for joining us here Fred.

    Without a doubt 24 bits offers more resolution than 16 bits. The question is: can a human being hear it?

    In this piece, which expands on the above, I adduce evidence to suggest that at least four bits, and perhaps as many as eight bits, of 24 bit sound in movies is simply noise, not valid signal at all.

    Meanwhile, another study provides strong evidence that quite aside from this, downconverting from 24 bits to 16 bits produces no difference in audible quality detectable by humans.

  3. Joe says:

    Ok, I don’t know why this is even a question. 24 bit has 1/3 more information: 8×2=16, 8×3=24. That’s a shitload more information. I personally hear the difference instantly, and I’m not sure if that’s because 24 bit offers higher bitrates or because 24 bit itself gives those benefits. Either way I’m in favor of every track being lossless LPCM 192kHz 24 bit instead of this bullshit 48kHz TrueHD.

  4. Jason says:

    Ok. I know this post is old, but I have to clarify for Joe here about 24-bit being 1/3 more information.
    This is simply not true, and not how digital information works.

    First of all, sample rate. When you sample audio digitally into binary digits, you have to do it a certain number of times per second. It’s good to do this at least twice as often as the fastest frequency the human ear can usually perceive, which is 20,000 Hz, so around 40 kHz or better is optimal. So the faster the better, but eventually you’re not going to notice the difference.

    Either way, the standards are 44.1, 48, 88.2, 96, 176.4 and 192 kHz, and certain people are using 384 kHz and higher for special applications. 88.2 and 176.4 are really there just to downsample cleanly to 44.1, while keeping higher sample rate material on file to mix and master with.

    But the key issue here is bit depth, or dynamic resolution. Each sample must have a certain number of bits (binary digits) to represent the dynamics of the recording, and every binary digit you add doubles the number of values you can represent. For example, 8-bit audio has 256 possible levels of dynamic resolution. But 16-bit audio has not 512 levels, but 65,536 levels of dynamic resolution. When you jump to 24-bit you have over 16.7 million levels of dynamic resolution. So it’s much more than just 1/3 more information when jumping from 16 to 24 bit; it leaps exponentially.

    In the end, the uncompressed PCM files that the recording studio mastered will always be the best copies. How they choose to distribute that recording to the end user determines how it gets degraded from there. The best way would be uncompressed on every channel, but media has to evolve to where we can distribute it that way. Until then, smart compression techniques will have to do their very best to deliver as close as possible to the original recording. Yet with streaming becoming more and more the norm, I suspect that internet access speeds, rather than physical media storage and playback, will be the determining factor.

  5. Sorry Jason, I didn’t notice your comment waiting for approval. Well, only a couple of months late!
