The Magic of Dither
(I have written an article for Australian HI-FI suggesting that a good way to start to understand the differences between 16 and 24 bit audio is to listen to the differences between 8 bit and 16 bit audio. To that end, I will in the future be posting four versions of a musical excerpt here. In the meantime, I want to explore a weirdness that my experiments revealed, and explain why it happened.)
Let’s create a sine wave of 980 hertz with a sampling rate of 44.1kHz and 16 bits of resolution. Let this sine wave be of an extremely low level, with a peak of -80dB. Here’s its spectrum:
[Figure: spectrum of the 980 hertz sine wave, -80dB, 16 bits]
(This was generated with a small amount of dither, which is why it has a noise floor at -138dB and is free of the nasty harmonics that would normally accompany a sine wave sketched out in integer values of +3 to -3.) Note that the spike reaches -80dB.
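To see where those values of +3 to -3 come from, here is a minimal Python sketch of the test tone. It assumes 16 bit full scale is 32768 and uses plain rounding with no dither; the names are illustrative and not taken from whatever software produced the graphs.

```python
import math

SAMPLE_RATE = 44100       # Hz
FREQ = 980                # Hz, exactly 45 samples per cycle at 44.1kHz
LEVEL_DB = -80.0          # peak level relative to full scale
FULL_SCALE_16 = 32768     # assumed 16 bit full scale (2**15)

# Peak amplitude of the tone in 16 bit units: about 3.28
peak = FULL_SCALE_16 * 10 ** (LEVEL_DB / 20)

# One second of the tone, rounded to whole 16 bit sample values (no dither)
rounded = [round(peak * math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE))
           for n in range(SAMPLE_RATE)]

print(round(peak, 2))         # 3.28
print(sorted(set(rounded)))   # [-3, -2, -1, 0, 1, 2, 3]
```

Only seven distinct sample values are available to draw the whole wave, which is why the undithered version would be so rich in harmonics.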
Now let’s downconvert it to 8 bits, accompanied by some dither noise:
Note the very high level of the noise, to the point that the 980 hertz signal barely peeks out over the top. This signal’s spike reaches -80dB, just as it did before it was downconverted.
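A rough Python sketch of a dithered conversion follows. The article does not say which kind of dither was used, so triangular (TPDF) dither of plus or minus one 8 bit step is an assumption here, as is an 8 bit full scale of 128. The 16 bit source is rounded without dither for brevity, and `tone_level_db` is a hypothetical helper that measures the 980 hertz component by correlating with a single sine and cosine.

```python
import math
import random

SAMPLE_RATE = 44100
FREQ = 980
PEAK_16 = 32768 * 10 ** (-80 / 20)   # -80dB peak in 16 bit units, about 3.28

def tone_level_db(samples, full_scale):
    """Peak level of the FREQ component in dB relative to full_scale (one DFT bin)."""
    w = 2 * math.pi * FREQ / SAMPLE_RATE
    re = sum(s * math.cos(w * i) for i, s in enumerate(samples))
    im = sum(s * math.sin(w * i) for i, s in enumerate(samples))
    return 20 * math.log10(2 * math.hypot(re, im) / len(samples) / full_scale)

def to_8bit_dithered(samples_16, rng):
    """Scale 16 bit values down by 256, adding TPDF dither before rounding."""
    out = []
    for s in samples_16:
        dither = rng.random() - rng.random()             # triangular, within +/-1 step
        out.append(math.floor(s / 256 + dither + 0.5))   # round to the nearest integer
    return out

n = 10 * SAMPLE_RATE   # ten seconds, so the noise averages down in the measurement
tone_16 = [round(PEAK_16 * math.sin(2 * math.pi * FREQ * i / SAMPLE_RATE))
           for i in range(n)]
dithered = to_8bit_dithered(tone_16, random.Random(1))

level = tone_level_db(dithered, 128)   # 8 bit full scale assumed to be 128
print(sorted(set(dithered)))           # only a handful of values, mostly noise
print(f"980 Hz level after dithered conversion: {level:.1f} dB")
```

The output samples are almost entirely noise, yet the measured level of the 980 hertz component should land close to -80dB, give or take the randomness of the dither.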
So let’s go back to the 16 bit original and downconvert it again to 8 bits, this time with some aggressive noise shaping:
That’s better. The 980 hertz signal stands well out from the noise, and if you turn the whole thing up it is clearly audible. It, once again, reaches -80dB.
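The shaper used for the graph is not specified, so the sketch below swaps in the simplest possible one: first-order error feedback, which subtracts each sample's quantisation error from the next sample. It is far gentler than anything "aggressive", and the function names, the TPDF dither and the 8 bit full scale of 128 are all assumptions. To get a crude reading of low-frequency noise, it sums the output over whole 45 sample cycles, where the tone itself cancels out.

```python
import math
import random

SAMPLE_RATE = 44100
FREQ = 980
PEAK_16 = 32768 * 10 ** (-80 / 20)   # -80dB peak in 16 bit units, about 3.28

def tone_level_db(samples, full_scale):
    """Peak level of the FREQ component in dB relative to full_scale (one DFT bin)."""
    w = 2 * math.pi * FREQ / SAMPLE_RATE
    re = sum(s * math.cos(w * i) for i, s in enumerate(samples))
    im = sum(s * math.sin(w * i) for i, s in enumerate(samples))
    return 20 * math.log10(2 * math.hypot(re, im) / len(samples) / full_scale)

def to_8bit(samples_16, rng, shaped):
    """Dithered conversion to 8 bits, optionally with first-order error feedback."""
    out, err = [], 0.0
    for s in samples_16:
        target = s / 256 - (err if shaped else 0.0)   # feed back the previous error
        dither = rng.random() - rng.random()          # TPDF dither, within +/-1 step
        q = math.floor(target + dither + 0.5)
        err = q - target                              # total error made on this sample
        out.append(q)
    return out

def low_freq_noise(samples, block=45):
    """Mean square of sums over whole 980 Hz cycles; the tone cancels in each sum."""
    sums = [sum(samples[i:i + block])
            for i in range(0, len(samples) - block + 1, block)]
    return sum(v * v for v in sums) / len(sums)

n = 2 * SAMPLE_RATE
tone_16 = [round(PEAK_16 * math.sin(2 * math.pi * FREQ * i / SAMPLE_RATE))
           for i in range(n)]
plain = to_8bit(tone_16, random.Random(1), shaped=False)
shaped = to_8bit(tone_16, random.Random(1), shaped=True)

lf_plain, lf_shaped = low_freq_noise(plain), low_freq_noise(shaped)
level_shaped = tone_level_db(shaped, 128)
print(f"low-frequency noise, plain dither: {lf_plain:.2f}")
print(f"low-frequency noise, shaped:       {lf_shaped:.2f}")
print(f"980 Hz level, shaped: {level_shaped:.1f} dB")
```

The shaped version should show much less low-frequency noise than plain dither while the tone stays at about -80dB. The noise has not gone away; it has been pushed up towards the high frequencies.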
Now what I’ve been leading up to: again we return to the 16 bit original, and again we downconvert to 8 bits. But this time there’s no dither, no added low level noise. Instead we simply divide the 16 bit value of each sample by 256, and map it onto the nearest 8 bit integer value. Here’s the spectrum:
Lots of harmonics, quite a bit of noise (although at a lower level than with the plain dithering) and something that’s distinctly odd. Instead of the 980 hertz spike reaching -80dB, it makes it up to -48dB. How could reducing its precision from 16 bits to 8 bits have caused that?
Well, here’s the thing: -80dB is lower than the minimum value definable in an 8 bit signal. The lowest quantisation level is -48dB. So the only way this wave form can be represented is by toggling between values for zero and one. Let’s zoom in on the wave form:
As you can see, our sine wave has become a kind of square wave, which explains all those harmonics. But it also explains why the fundamental is way too powerful. A sample value of ‘1’ is -48dB on an 8 bit scale, so that’s what our signal becomes.
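The toggling can be reproduced in a few lines of Python. One caveat: at this level every sample divided by 256 is closer to zero than to anything else, so strict rounding to the nearest value would give pure silence. The sketch therefore assumes the converter truncated (rounded downwards), which sends positive samples to 0 and negative ones to -1. That is an assumption about the software, as is the 8 bit full scale of 128, and `tone_level_db` is a hypothetical helper.

```python
import math

SAMPLE_RATE = 44100
FREQ = 980
PEAK_16 = 32768 * 10 ** (-80 / 20)   # -80dB peak in 16 bit units, about 3.28

def tone_level_db(samples, full_scale):
    """Peak level of the FREQ component in dB relative to full_scale (one DFT bin)."""
    w = 2 * math.pi * FREQ / SAMPLE_RATE
    re = sum(s * math.cos(w * i) for i, s in enumerate(samples))
    im = sum(s * math.sin(w * i) for i, s in enumerate(samples))
    return 20 * math.log10(2 * math.hypot(re, im) / len(samples) / full_scale)

# One second of the 16 bit tone (integer values from -3 to +3, no dither)
tone_16 = [round(PEAK_16 * math.sin(2 * math.pi * FREQ * i / SAMPLE_RATE))
           for i in range(SAMPLE_RATE)]

truncated = [s >> 8 for s in tone_16]         # floor(s / 256): a two-level wave
nearest = [round(s / 256) for s in tone_16]   # round to nearest: everything is 0

print(sorted(set(truncated)))                 # [-1, 0]
print(sorted(set(nearest)))                   # [0]

# One 8 bit step as a fraction of 256 levels, in dB
print(round(20 * math.log10(1 / 256), 2))     # -48.16

level = tone_level_db(truncated, 128)
print(f"980 Hz level after truncation: {level:.1f} dB")
```

With full scale taken as 128, the fundamental of the truncated wave comes out at roughly -46dB, in the same region as the -48dB reading and more than 30dB above the original tone. The exact figure depends on the scaling convention and on the analyser.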
Which is why dither is so vitally important, at least with 8 bit audio.
Here are the four test signals with the low level 980 hertz sine wave:
Download them and listen. You’ll have to turn up the volume quite a way to hear the 980 hertz tone.
And here are the four excerpts of the music from the group B’Jezus referred to in the article: