I had this clever idea, or so I thought, in which I’d test some CDs and compare them to each other. Of course, CDs aren’t comparable unless they are of the same music. The idea was: when a new digitally remastered CD is released, how does it differ from the original? Suspicious individual that I am, I had half a mind to think that maybe a bit of dynamic range compression was applied in some remasters, with a view to allowing a higher average volume level.
So how to measure dynamic range? I was going to use Cool Edit 2000, which generates some useful statistics about audio files, but one of my editors suggested the Dynamic Range Meter.
The problem was that this software had expired in August; the authors had apparently planned to have a better version out by then. It seems that they haven’t.
Fortunately, someone else has produced his own version with a couple of enhancements.
These meters take a pretty naive approach. Basically, you take the peak level achieved by the sound file, take an average measure of the file, and subtract the latter from the former. Express the result in decibels and you have a sensible number that you can use to compare versions of music. This figure, the DR, is the ratio between peak and average, and it would typically range between maybe six and twenty, depending on the type of music.
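To make that concrete, here is the calculation as I’ve just described it, sketched in Python. This is a literal reading of peak-minus-average, not the actual algorithm used by either meter:

```python
import numpy as np

def dynamic_range_db(samples):
    """Naive DR: peak level minus RMS average level, both in dB."""
    x = np.asarray(samples, dtype=np.float64)
    peak = np.max(np.abs(x))
    rms = np.sqrt(np.mean(x ** 2))
    return 20 * np.log10(peak / rms)

# A full-scale sine wave has a peak-to-RMS ratio of sqrt(2),
# so its naive DR comes out at 20*log10(sqrt(2)), about 3.01dB.
t = np.arange(44100) / 44100.0
print(round(dynamic_range_db(np.sin(2 * np.pi * 440 * t)), 2))  # ~3.01
```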
Now I say this is naive because, while the average figure (calculated using RMS methods to overcome the fact that roughly half the samples are negative) should be representative of the whole file, the peak depends on only one point. If that point is a transient, as it typically will be, its value can vary quite widely purely by fluke: if the sample lands on a rising or falling part of the transient, it will read lower than if it happens to land right on the peak. It all hangs on when the sample happens to be taken.
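You can demonstrate the fluke with a toy example. Here the same synthetic transient (an invented, rapidly decaying 10kHz burst, with numbers chosen purely for illustration) is sampled at twenty different sub-sample offsets, and the captured peak varies by a couple of decibels depending on nothing but where the sample grid lands:

```python
import numpy as np

fs = 44100.0

def burst(t):
    """A toy transient: a rapidly decaying 10kHz tone burst."""
    return np.exp(-t / 0.0002) * np.sin(2 * np.pi * 10000 * t)

# Sample the same transient with different sub-sample start offsets
# and see how much the captured peak moves around.
peaks = []
for offset in np.linspace(0, 1 / fs, 20, endpoint=False):
    t = offset + np.arange(200) / fs
    peaks.append(np.max(np.abs(burst(t))))

peaks_db = 20 * np.log10(np.array(peaks))
print(f"peak spread: {peaks_db.max() - peaks_db.min():.2f} dB")  # roughly 2dB
```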
But let’s put that aside for the moment. I like to make sure my instruments are working well, and one way of partly confirming this is to compare measurements using two different instruments. So I used the facility in Cool Edit 2000 to gather stats on an audio track and applied the Dynamic Range Meter to the same track.
Oh, oh. Around three to four decibels difference for the RMS average, with the Dynamic Range Meter giving a value closer to zero (that is, a higher RMS level). Consequently its reported dynamic range was also 3-4dB lower than Cool Edit 2000 suggested.
Fortunately Cool Edit 2000 has an export facility, where you can turn an audio file into a text file consisting of a header followed by two long lists of numbers (one per channel) which represent the samples.
I trimmed the test file down to precisely one second in length (44,100 samples), exported it to text, imported it into Excel, and did my own max, min and RMS average calculations. These agreed, kind of, with Cool Edit 2000 rather than the Dynamic Range Meter.
(‘Kind of’ because Cool Edit seems to call the average, as I calculated it, ‘Total RMS Power’, and gives a slightly different answer, out by up to 0.5dB, for ‘Average RMS Power’.)
So am I missing something? Is there a better, more representative method of calculating the average than the one I used?
My manual methodology was simple: square each sample, add them all up, divide by the number of samples and take the square root of the result.
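For what it’s worth, here is that same arithmetic in Python (equivalent to my Excel version; NumPy is just for convenience):

```python
import numpy as np

def rms_db(samples, full_scale=1.0):
    """Square each sample, average, take the square root, express in dB."""
    x = np.asarray(samples, dtype=np.float64) / full_scale
    return 20 * np.log10(np.sqrt(np.mean(x ** 2)))

# 16-bit samples from the text export would use full_scale=32768; here,
# noise uniform on [-1, 1] stands in for a real track. Its RMS is
# 1/sqrt(3), or about -4.77dB.
x = np.random.uniform(-1, 1, 44100)
print(round(rms_db(x), 2))  # ~ -4.77
```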
Update (6 December 2011): The author of the improved Dynamic Range Meter emailed me back in response to a query and he clarifies things.
It is all to do with the reference level. The maximum possible level of a digital sample is 0dB, of course; all other values in the range are negative. If you take the RMS average of a full-scale square wave you will get a result of 0dB, because half the samples sit at the positive end of the full scale and half at the negative end. If you take the RMS average of a full-scale sine wave you will get -3.01dB, because the RMS of a sine is its peak divided by √2 (20·log10(1/√2) ≈ -3.01).
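Both reference cases are easy to verify numerically:

```python
import numpy as np

def rms_db(x):
    return 20 * np.log10(np.sqrt(np.mean(np.asarray(x, dtype=np.float64) ** 2)))

t = np.arange(44100) / 44100.0
sine = np.sin(2 * np.pi * 1000 * t)   # full-scale sine wave
square = np.sign(sine)                # full-scale square wave

print(round(rms_db(square), 2))  # ~0.0dB
print(round(rms_db(sine), 2))    # ~ -3.01dB
```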
Apparently there has been some disagreement over whether 0dB should be taken as the reference for RMS measures, or -3.01dB. There are arguments on both sides. Intuitively, 0dB seems the obvious choice. But, as mentioned, that would mean a full-scale sine wave could never average higher than -3.01dB.
So in the end, it seems, the international standard has gone for the sine wave reference, treating a full-scale sine as 0dBFS for RMS purposes, which effectively counts the RMS level of every other signal as 3.01dB higher than it otherwise would be.
But this does not apply to individual samples: they are still counted with reference to the real 0dBFS. So when you subtract this redefined average RMS level from the peak level, you come up with a result 3.01dB less than the raw numbers would suggest. And, in addition, the average RMS level of a full-scale square wave is actually positive rather than negative!
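Here is a short sketch of the difference the convention makes (my own illustration, not either program’s code):

```python
import numpy as np

def rms_db(x):
    return 20 * np.log10(np.sqrt(np.mean(np.asarray(x, dtype=np.float64) ** 2)))

def rms_db_sine_ref(x):
    """RMS with a full-scale sine taken as 0dBFS: adds 3.01dB to the raw figure."""
    return rms_db(x) + 20 * np.log10(np.sqrt(2.0))

t = np.arange(44100) / 44100.0
sine = np.sin(2 * np.pi * 1000 * t)
square = np.sign(sine)

peak_db = 0.0  # a full-scale peak; samples keep the real 0dBFS reference
print(round(peak_db - rms_db(sine), 2))           # raw convention: ~3.01
print(round(peak_db - rms_db_sine_ref(sine), 2))  # sine reference: ~0.0, 3.01dB less
print(round(rms_db_sine_ref(square), 2))          # a full-scale square reads +3.01
```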
According to Wikipedia, the intuitive approach (the same 0dB reference for both peak and RMS) is also the norm for analogue. I’m inclined to think I’ll stick with this, but the lesson is that one should make entirely clear precisely how one is doing one’s measurement, because there is plenty of room for confusion.