Before I dive into some solutions to the problem, there are some audio concepts to understand. These relate to how audio is produced and measured. Inconsistent measurement and poor monitoring practices are at the core of the problem, so we need to understand these first.
There are two forms of measurement used when producing audio.
A level meter is a device for measuring variations in the electrical amplitude of the audio signal. The two most commonly used meters in broadcasting are the VU meter and the IEC standard PPM. Both measure audio differently, and experienced engineers know how to use them correctly. Measures of level are generally absolute and repeatable.
There are more sophisticated meters that purport to measure loudness as well; I will talk about these later.
The second, and by far the most important, tool is the human auditory system. The ears combined with the brain are the most powerful psychoacoustic measuring device on the planet.
The human ear has two attributes that bear on this problem.
The first is that the perceived volume of a sound is based on the average loudness over time. According to Wikipedia:
"The perception of loudness is related to both the sound pressure level and duration of a sound. The human auditory system integrates (averages) the effects of sound pressure level (SPL) over a 600–1,000 ms window."
The second important attribute is that it is optimised for processing speech. Human speech at 1 metre is typically around 60 dBA, measured with a sound pressure meter.
If you play a recording of speech and ask a group of people to set the volume so it is comfortable, the set volume tends to converge on 60 dBA. This applies when watching TV too.
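To make the averaging idea concrete, here is a rough Python sketch. It is only an illustration, not a real loudness meter: it ignores frequency weighting, and the 8 kHz sample rate, 400 Hz test tone, 800 ms window, and function names are all my own assumptions. It compares a tone that lasts the whole window with a short burst of the same tone followed by silence.

```python
import math

SAMPLE_RATE = 8000                 # assumed sample rate in Hz
WINDOW = int(0.8 * SAMPLE_RATE)    # 800 ms, inside the 600-1,000 ms range quoted above

def peak_dbfs(samples):
    """Highest absolute sample value, in dB relative to full scale (1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_dbfs(samples):
    """Average (RMS) level over the whole window, in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

def tone(n, freq=400):
    """n samples of a full-scale sine wave (hypothetical test signal)."""
    return [math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]

# A tone that fills the window, and a burst that fills only a tenth of it.
sustained = tone(WINDOW)
burst = tone(WINDOW // 10) + [0.0] * (WINDOW - WINDOW // 10)

for name, signal in (("sustained", sustained), ("burst", burst)):
    print(f"{name}: peak {peak_dbfs(signal):.2f} dBFS, average {rms_dbfs(signal):.2f} dBFS")
```

Both signals reach the same peak, but the averaged level of the burst comes out about 10 dB lower than the sustained tone. A peak-reading meter sees them as equal; an averaging listener does not.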
The two measures above would be fine except for two types of processing that are applied to audio to change the perception of loudness.
The first is equalization (EQ), which boosts or cuts selected frequencies. The aim in doing this is to improve the intelligibility and impact of the sound.
Mixing desks and digital editing systems have very complex controls that allow specific frequencies to be targeted for enhancement, allowing for fine-grained control.
The second is audio compression or limiting. Put simply, this reduces the dynamic range of the audio so that the difference between the loudest and the quietest sounds is reduced. This allows the average volume to be increased.
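As a minimal sketch of how compression trades dynamic range for average level, the Python below applies a simple static compressor curve to a handful of levels. The threshold, the 4:1 ratio, and the example levels are hypothetical values chosen for illustration; real compressors also have attack and release behaviour that is left out here.

```python
def compress_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Static compressor curve: anything above the threshold is scaled down by `ratio`."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

def dynamic_range(levels_db):
    """Difference between the loudest and quietest level, in dB."""
    return max(levels_db) - min(levels_db)

# Hypothetical programme levels in dBFS, quiet passage to loud peak.
original = [-30.0, -20.0, -10.0, 0.0]

# Step 1: compress. The loud levels are pulled down towards the threshold.
compressed = [compress_db(level) for level in original]

# Step 2: make-up gain. Raise everything until the loudest level is back at 0 dBFS.
makeup_gain = 0.0 - max(compressed)
louder = [level + makeup_gain for level in compressed]

print("original:  ", original, "range", dynamic_range(original), "dB")
print("compressed:", compressed, "range", dynamic_range(compressed), "dB")
print("with gain: ", louder, "make-up gain", makeup_gain, "dB")
```

With these assumed settings the dynamic range drops from 30 dB to 15 dB, and the make-up gain lifts every level by 15 dB without the peak exceeding the original maximum.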
Both of these are used in commercial production. The voice is EQed and compressed. Any music may also be compressed, and the finished product compressed again. It is common to see commercials with a dynamic range of less than 2 dB.
The audio of most TV productions has a dynamic range greater than 2 dB. 15 to 25 dB is more typical. This difference means that the level (on a meter) has to be set lower to avoid overload on the peaks. Commercials don't have any peaks, so can be set higher.
This is what happens in the average consumer's lounge:
They turn on the TV, and when the programme starts the volume control is adjusted so that the speech is at a comfortable volume. As stated above, this will be close to 60 dBA. The dynamic range of the spoken material will be (say) 15 dB, and the broadcast level must be set so that any peaks do not cause overloading in the broadcast equipment.
When a commercial is played, its average level can be set 13 dB higher (it has only 2 dB of dynamic range) without causing electrical overload.
At the consumer's end, this means that content with a reduced dynamic range (like commercials) will sound a lot louder.
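The arithmetic in this example can be written out as a short Python sketch. It leans on the same simplification as the text: dynamic range is treated as the distance between the average level and the peak, and the peak is parked at a ceiling of 0 dB. The function name and the ceiling value are my own assumptions.

```python
PEAK_CEILING_DB = 0.0       # assumed level at which the broadcast chain overloads
COMFORTABLE_SPEECH_DBA = 60.0   # listening level the viewer sets for programme speech

def average_level_db(dynamic_range_db, ceiling_db=PEAK_CEILING_DB):
    """Highest average level that keeps the peaks at or below the ceiling."""
    return ceiling_db - dynamic_range_db

programme_avg = average_level_db(15.0)    # programme speech, 15 dB dynamic range
commercial_avg = average_level_db(2.0)    # commercial, 2 dB dynamic range

# How much higher the commercial's average sits on the meter.
difference_db = commercial_avg - programme_avg

# The viewer's volume control is unchanged, so the difference carries straight through.
commercial_in_lounge_dba = COMFORTABLE_SPEECH_DBA + difference_db

print(f"programme average:  {programme_avg} dB")
print(f"commercial average: {commercial_avg} dB")
print(f"difference:         {difference_db} dB")
print(f"commercial at home: about {commercial_in_lounge_dba} dBA")
```

Under these assumptions the commercial's average sits 13 dB above the programme's, so a viewer who set speech to around 60 dBA hears the commercial at roughly 73 dBA, even though both peak at the same point on the meter.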
This is a massive simplification, but hopefully it makes sense. I suspect that I'll need to make a companion video to this series to demonstrate things more clearly.
How to fix this?
Most discussion I've seen suggests that this is either a technical problem, or deliberate.
The technical crowd think that the problem occurs because there are no agreed standards. There are standards, and though they are not always followed I think this is a side-issue, because the problem of differences in loudness is operational in nature.
As stated in part one, I don't believe that most TV stations turn up the volume of the Ads. This is an error of omission. The problem persists either because it is ignored for lack of understanding, or because there is a belief that Ads must be louder in order to be effective.
Next time I will explain the solution to this problem by first presenting a simplified version, and then applying it to some real-world situations.
If you want clarification on anything here, use the comments section.