Audio Processing 101

By Cornelius Gould

What Is Audio Processing?
To help with understanding some of the terms used here, I suggest you check out Audio Processing University part One.

Audio processing has its roots in the early days of radio, stemming from the desire to automatically control peak levels in a broadcast chain. It came about as a way to assist radio console operators, who literally controlled the modulation of a broadcast station by manually turning the levels on their console up and down. If the levels peaked above 100% on the board, the transmitter would operate illegally. So, the operators had to act quickly when this situation arose to keep distortion and illegal over-modulation to a minimum.

This is where the automatic peak limiter came into play. Physically, the peak limiter is a device with audio inputs and outputs. The unit operates by monitoring the levels feeding into it and making level corrections any time the levels exceed a pre-set reference point (or threshold). Usually this reference point was set to 100%.

If program levels remain below 100%, the limiter assumes a “unity gain” state. That is, the audio level on the output equals the level on the input. If the audio exceeds 100%, then the output of the limiter is reduced by the amount of the excess. That is, if the input is, say, 175%, then the limiter pulls the output down by 75 percentage points, back to 100%.
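The fixed-threshold behavior described here can be sketched in a few lines of Python. This is only an illustration of an ideal limiter working on level readings in percent modulation; the name `limit_peak` and its parameters are made up for this example, and attack and release timing are ignored entirely.

```python
def limit_peak(level_pct, threshold_pct=100.0):
    """Ideal peak limiter acting on a level reading in percent modulation.

    Returns (output_level, gain_reduction), both in percentage points.
    """
    if level_pct <= threshold_pct:
        # Below the threshold the limiter sits at unity gain.
        return level_pct, 0.0
    # Above the threshold the excess is removed, holding the output at the threshold.
    return threshold_pct, level_pct - threshold_pct


if __name__ == "__main__":
    for level in (60.0, 100.0, 175.0):
        out, reduction = limit_peak(level)
        print(f"in {level:5.1f}%  out {out:5.1f}%  reduced by {reduction:.1f} points")
```

A 60% input passes through untouched, while a 175% input comes out at 100% with 75 points of gain reduction.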

The action of these peak limiters is usually quite fast. The time it takes for these units to act is on the order of milliseconds. After the peak condition passes, the limiter quickly recovers to unity gain. The time it takes for a limiter to return to unity gain is usually 2 seconds or less.
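To get a feel for the fast attack and slower recovery, here is a rough sketch that smooths the limiter gain with a simple one-pole filter. The smoothing method, the 2 ms attack, the 1 second release, and the rate of 1000 level readings per second are all assumptions chosen for illustration, not values taken from any particular unit.

```python
import math


def limiter_gain_trace(levels, sample_rate=1000, threshold=1.0,
                       attack_s=0.002, release_s=1.0):
    """Return the limiter gain after each level reading (1.0 means unity gain)."""
    attack_coeff = math.exp(-1.0 / (attack_s * sample_rate))
    release_coeff = math.exp(-1.0 / (release_s * sample_rate))
    gain = 1.0
    trace = []
    for level in levels:
        # Gain that would put this reading exactly at the threshold.
        target = 1.0 if level <= threshold else threshold / level
        # Move quickly when gain must drop (attack), slowly when it recovers (release).
        coeff = attack_coeff if target < gain else release_coeff
        gain = target + coeff * (gain - target)
        trace.append(gain)
    return trace


if __name__ == "__main__":
    # 0.1 s of a 175% peak condition followed by 3 s of quiet program.
    levels = [1.75] * 100 + [0.5] * 3000
    trace = limiter_gain_trace(levels)
    for index in (0, 10, 99, 1099, 3099):
        print(f"t = {index / 1000:5.3f} s  gain = {trace[index]:.3f}")
```

The gain settles within a few milliseconds of the peak arriving, then drifts back toward unity over the following seconds.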

The Birth of Modern Audio Processing

With the advent of Rock and Roll radio, someone discovered that setting these limiters to “kick in” sooner than 100% caused the music to sound “bigger”. This happens because of the “squeezing effect” of the limiter compressing most of the natural level variations of the program material down to one uniform level. This, of course, was an abuse of the intention of a transmission limiter, but those who dared to break the rules discovered that they appeared to sound much louder than their competitors who used the limiter units as they were intended.

The following two clips demonstrate this effect:

Audio which has no processing on it

Audio that is processed to show the squeezing effect.

(The processing on the latter file is done with wideband processing – remember this as you read along…)

OK, back to our lesson…

This “squeezing effect” also causes the sound of music material to change dramatically…. so a rock song played through one of these radio stations after heavy limiting would sound nothing like it does when you play it on your home stereo.   It had a “much bigger” sound than normal.

As these loudness wars heated up, we found stations connecting these units back to back, thus creating (at that time) stations that were incredibly loud sounding when compared to the ones that “played by the rules”.

The main problem with this approach to loudness is that it is still totally dependent on the DJ at the studio to keep the board running as close to 100% as possible. While Mr. DJ could no longer accidentally over-modulate the transmitter, the effect of loudness hinged totally on how much studio level was being sent to the limiter.

A Better Mousetrap

Realizing this limitation, units called Automatic Gain Control (AGC) devices were created. The AGCs were basically slower versions of the transmission limiters, and their specific function is to control the average level of the program signal. The compression ratio is also much looser than that of the transmission limiter. In a transmission limiter, the compression ratio would be as close to infinity to one as possible, meaning the output level should never exceed the 100% reference regardless of the level of signal present at its input. In an AGC, the typical ratio used was about ten to one (10:1), meaning that for every 10 dB increase in input level above the threshold, the output would only rise 1 dB.
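The difference between the two ratios is easy to see with a little arithmetic. The sketch below assumes a hard-knee static curve with the threshold at 0 dB; the function name `compressed_level_db` is hypothetical and real units will have their own curve shapes.

```python
def compressed_level_db(input_db, threshold_db=0.0, ratio=10.0):
    """Static output level of a compressor with the given ratio (hard knee assumed)."""
    if input_db <= threshold_db:
        # Below the threshold the signal passes at unity gain.
        return input_db
    # Above the threshold, each `ratio` dB of input yields 1 dB of output.
    return threshold_db + (input_db - threshold_db) / ratio


if __name__ == "__main__":
    for input_db in (-6.0, 0.0, 10.0, 20.0):
        agc = compressed_level_db(input_db, ratio=10.0)
        limiter = compressed_level_db(input_db, ratio=float("inf"))
        print(f"in {input_db:+5.1f} dB  AGC 10:1 {agc:+5.1f} dB  limiter {limiter:+5.1f} dB")
```

With a 10:1 ratio, an input 10 dB over the threshold comes out 1 dB over. With an infinite ratio the output never rises above the threshold at all.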

What the AGC units accomplished was a consistent operating level into the limiter unit. This allowed the best of both worlds for most broadcasters. The AGC would serve the function of keeping overall levels consistent over a wide range of erroneous operating levels from the studio, while the transmission limiter could be used more for what it was intended for (to protect the transmitter from over-modulation). If the users wanted to “abuse” the limiters to gain more loudness, the effect was now much more consistent.

This method of creating loudness has its limitations. It is a wideband processing chain: a single AGC / limiter operates across the entire audio spectrum. The side effect is that any dominant material concentrated in a specific frequency range will cause a reduction in audio level for the entire audio spectrum.

If there is a song that starts off a cappella, the wideband processing chain will adjust its output level to bring the voice to “100%” modulation.

But when the rest of the instrumentation resumes, the voice is suddenly pushed way back behind the instruments.

If the instrumentation stops, and the vocals are once again the only sound present in the program, the vocals will once again be adjusted to 100%.  This obvious “up and down” action on the vocals by the backing instruments is usually referred to as “pumping”.
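The pumping effect comes down to one gain value being shared by everything in the mix. The toy model below assumes the levels of the vocal and the instruments simply add together, and that the wideband chain always brings the total to 100%; both are simplifications, and the function names are invented for this sketch.

```python
def wideband_agc_gain(total_level_pct, target_pct=100.0):
    """Single gain applied to the whole mix so that the total reaches the target."""
    return target_pct / total_level_pct


def vocal_after_wideband_agc(vocal_pct, instruments_pct):
    """Level of the vocal at the output when one gain serves the whole mix."""
    total = vocal_pct + instruments_pct
    return vocal_pct * wideband_agc_gain(total)


if __name__ == "__main__":
    alone = vocal_after_wideband_agc(vocal_pct=40.0, instruments_pct=0.0)
    with_band = vocal_after_wideband_agc(vocal_pct=40.0, instruments_pct=120.0)
    print(f"vocal alone:     {alone:.0f}%")
    print(f"vocal with band: {with_band:.0f}%")
```

The singer has not changed level at all, yet the voice swings between full modulation and a fraction of it depending on what else is playing.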

Slowing down the recovery time on the AGCs and limiters solved this problem to a degree, but the effect of loudness was lost as the “smashing” effect was diminished.

Some improvement came with the creation of the “freeze gate”. Freeze gates are usually implemented in the AGC section of a broadcast chain. The gate monitors the input level of the AGC, and whenever the audio falls below a user-determined level, the unit holds its gain state until the audio crosses the threshold again. This can be considered a “dual speed” AGC system, where the AGC recovers faster on louder material and slows to a virtual freeze on low-level material.
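A freeze gate can be sketched as a single extra condition in the AGC loop. The gain-update rule, the gate threshold of 0.1, and the adaptation rate used here are assumptions for the sake of the example, not a description of any specific product.

```python
def gated_agc(levels, target=1.0, gate_threshold=0.1, rate=0.1):
    """Return the AGC gain after each level reading, freezing below the gate threshold."""
    gain = 1.0
    gains = []
    for level in levels:
        if level >= gate_threshold:
            # Normal operation: move the gain toward the value that hits the target.
            desired = target / level
            gain += rate * (desired - gain)
        # Below the gate threshold the gain is simply held where it was.
        gains.append(gain)
    return gains


if __name__ == "__main__":
    # Program at half level, then a near-silent pause, then program again.
    levels = [0.5] * 200 + [0.01] * 50 + [0.5] * 10
    gains = gated_agc(levels)
    print(f"gain before pause: {gains[199]:.3f}")
    print(f"gain during pause: {gains[249]:.3f}")
```

Without the gate, the gain would race upward during the pause and bring up the noise floor. With it, the gain after the pause is exactly where it was left.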

…Then came the 80’s

The 80’s brought on a new style of music, featuring heavier bass percussion than before. This is where the wideband audio processing chain started to show its limitations. The bass elements, such as kick drums and synthesized beat tracks, would cause pumping in the wideband chain.

In this situation, each beat of bass percussion would “punch holes” in the upper frequency areas of the audio spectrum, and certain songs would have totally unnatural artifacts introduced by the wideband chain. One perfect example from this era was Kim Carnes’ “Bette Davis Eyes”. Every time there was a bass drum beat, Kim’s voice and the synthesized keyboard would just about disappear for the duration of the bass drum sound, then suddenly “jump back” in between beats. This was one of the actual songs that showed the broadcast community that something better was necessary.

Multiband Processing

Multiband processing actually has its roots in the early to mid 70’s, and the concept behind it was simple, actually…

Split the audio into multiple frequency bands, say Highs, Mids, and Lows (bass), and feed the output of each band to its own AGC unit. The bands could then be put back together as a single source. Doing this would give two improvements:

1) The audio was now spectrally more consistent from cut to cut as the multiband AGC, by its very nature, will have a tendency to “re-equalize” the audio to achieve more-or-less the same amount of treble, mid, and bass out of wildly varying recording styles.

2) Since there was no longer a real problem with bass notes modulating the entire audio spectrum, the units could be driven harder to achieve much greater loudness levels with fewer side-effects than was possible with the wideband chain.
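The contrast with the wideband chain can be shown with a small comparison of gain decisions. In this sketch the band levels are given directly as numbers rather than produced by real crossover filters, and the wideband detector is assumed to react to the loudest band; both are simplifications, and the function names are hypothetical.

```python
def wideband_gains(band_levels, threshold=1.0):
    """One gain for every band, set by whichever band is loudest."""
    loudest = max(band_levels.values())
    gain = min(1.0, threshold / loudest)
    return {band: gain for band in band_levels}


def multiband_gains(band_levels, threshold=1.0):
    """Each band gets its own gain, based only on its own level."""
    return {band: min(1.0, threshold / level) for band, level in band_levels.items()}


if __name__ == "__main__":
    # A kick drum hit: the low band is at 200%, the others are well under 100%.
    kick_hit = {"low": 2.0, "mid": 0.5, "high": 0.4}
    print("wideband: ", wideband_gains(kick_hit))
    print("multiband:", multiband_gains(kick_hit))
```

In the wideband case the mids and highs are cut in half along with the bass on every kick. In the multiband case only the low band is turned down.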

Early multiband AGC systems were used ahead of wideband limiters.  This worked fine for a while, but as the loudness wars continued to heat up, and people were forced to rely more and more on the wideband limiter, the problems of the wideband chain were re-introduced via the wideband limiter.

The next obvious step was to replace that wideband limiter with a multiband one, giving us the basis behind modern audio processing as we see it today.

The following two clips are processed to the extreme, but they should show the difference in the two methods of processing quite clearly…

Heavily processed wideband audio

Heavily processed multiband audio

Today’s Systems

Today’s audio processing chains are a mixture of units, each used for a specific purpose.

The typical modern audio processing “chain”

Virtually every processing chain starts with some kind of gain riding AGC (element #1 in our typical chain).

The Wideband AGC

The job of the gain riding AGC is simply to control the levels of the programming leaving the main control studio. Doing this will ensure that the rest of the chain always sees ideal operating levels, even if the jock is “pegging” the console level meters.

The designers of these devices, and ultimately YOU, must decide: “How exactly do we handle the correction once the threshold of ‘too much level’ is crossed?”

The obvious answer to this question is “as slowly as possible so it cannot be heard”.

What exactly does “slow” mean?

The slower you make the AGC, the longer it takes to recover from a loud segment after the operator re-adjusts the “board levels”.  But it also means that the audio is corrupted the least by AGC activity.

Speeding up the AGC means that the unit will recover quicker from “board operator” errors, but at the same time, it will have unpleasant “artifacts”.   These artifacts will make themselves known as “pumping” on the percussion elements of your music.

This brings us to the second element in the chain.

The Multiband Leveler

A leveler can be a slightly faster version of the AGC, or it can be one of the quite elegantly designed RMS detector types, where only the relative power level of the audio is sensed, rather than the instantaneous peak level.
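The difference between sensing peaks and sensing relative power shows up clearly on two simple test signals. The sketch below uses plain peak and RMS measurements over a whole block of samples; real detectors work on a running window, and the helper names here are made up for the example.

```python
import math


def peak_level(samples):
    """Largest instantaneous magnitude in the block."""
    return max(abs(s) for s in samples)


def rms_level(samples):
    """Root-mean-square level, which tracks the power of the block."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


if __name__ == "__main__":
    # A steady full-scale tone versus one isolated full-scale click.
    tone = [math.sin(2.0 * math.pi * 5 * n / 1000) for n in range(1000)]
    click = [0.0] * 999 + [1.0]
    for name, signal in (("tone", tone), ("click", click)):
        print(f"{name:5s}  peak {peak_level(signal):.3f}  rms {rms_level(signal):.3f}")
```

Both signals hit the same peak, but the click carries almost no power, so an RMS-sensing leveler would barely react to it.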

The leveler is usually used for two purposes at once.

1)    To provide a “signature sound” for your station by automatically adjusting the equalization of your programming to your taste.

2)    Since it is multiband, it can aid in gain riding by operating faster than the AGC ahead of it.  The multiband nature of this stage will not cause much noticeable pumping to percussion elements, assuming it too is not running “too fast”.

The operating range of the multiband is usually less than that of the AGC so as not to cause too much of an upset in spectral balance, which limits any changes in timbre of some programming elements to an “acceptable” point.   Some people / manufacturers will give their multiband more range than others.  In any case, the usefulness of the multiband to add any significant “pep” to gain riding is typically limited.

Stage three: The (dynamic) limiter

Here, “the limiter” means a faster version of the leveler with a compression ratio high enough to make it a limiter. We make this distinction because many manufacturers call the clipping sections of an audio processor a limiter.

A limiter is generally used as a tight peak level control device. It is usually tighter than the leveler, but it cannot limit brief transients, so clipping limiters are used after this stage to clip off the transients.

These days, limiters are made multiband (two or more bands) to mask, to some degree, most of the effects of extreme waveform distortion (which is what a limiter is supposed to do).

The Final Limiter

This sounds so… “final”, doesn’t it? Anyway, this stage removes any loose peaks left in the audio from the limiter stage (there will be lots of them!). It does this by “chopping”, or “clipping”, the peaks from the audio waveform.

This stage is also used to increase apparent loudness, because most peaks can be removed without much of the audience noticing. Peaks can easily overshoot the 100% average modulation reference point by 10% or more. So if you have peaks exceeding the maximum average modulation level by 10-20%, you would have to reduce modulation by that much to prevent illegal operation.

Clipping these peaks from the audio lets you turn the audio levels back up by this same 10-20% amount, all without introducing any artifacts that the listening audience will notice.
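The trade can be put into rough numbers. This sketch assumes the only choice is between scaling the whole signal down so the highest peak is legal, or leaving the level alone and clipping the overshoot; the function names are hypothetical.

```python
import math


def gain_to_stay_legal(peak_pct, limit_pct=100.0):
    """Scale factor that brings the highest peak down to the limit without clipping."""
    return min(1.0, limit_pct / peak_pct)


def headroom_recovered_db(peak_pct, limit_pct=100.0):
    """Level increase in dB gained by clipping the overshoot instead of turning down."""
    return 20.0 * math.log10(max(1.0, peak_pct / limit_pct))


if __name__ == "__main__":
    for peak in (110.0, 120.0):
        scale = gain_to_stay_legal(peak)
        gained = headroom_recovered_db(peak)
        print(f"peaks at {peak:.0f}%: turn down to {scale:.3f}x, or clip and keep {gained:.2f} dB")
```

For peaks in the 110-120% range this works out to somewhere between roughly 1 and 1.5 dB of level that clipping lets you keep.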

The more clipping used, the louder the station appears to sound. This also means that the louder the station becomes, the more the clipping becomes noticeable as harmonic distortion.
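The link between clipping depth and harmonic distortion can be checked numerically. The sketch below hard-clips a pure sine wave and measures the third harmonic relative to the fundamental with a direct Fourier sum. The test tone, the clip ceilings, and the helper names are all choices made for this illustration; a real final limiter is considerably more elaborate than a bare hard clipper.

```python
import math


def clipped_sine(ceiling, cycles=10, length=2000):
    """A unit-amplitude sine wave hard-clipped at +/- ceiling."""
    samples = []
    for i in range(length):
        s = math.sin(2.0 * math.pi * cycles * i / length)
        samples.append(max(-ceiling, min(ceiling, s)))
    return samples


def harmonic_amplitude(samples, harmonic, cycles=10):
    """Amplitude of the given harmonic of the test tone, from a direct Fourier sum."""
    n = len(samples)
    freq = harmonic * cycles
    re = sum(s * math.cos(2.0 * math.pi * freq * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * freq * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


def third_harmonic_ratio(ceiling):
    """Third-harmonic amplitude relative to the fundamental after clipping."""
    samples = clipped_sine(ceiling)
    return harmonic_amplitude(samples, 3) / harmonic_amplitude(samples, 1)


if __name__ == "__main__":
    for ceiling in (1.0, 0.9, 0.5):
        print(f"clip ceiling {ceiling:.1f}: third harmonic ratio {third_harmonic_ratio(ceiling):.4f}")
```

With the ceiling at 1.0 the sine is untouched and the third harmonic is essentially zero. Lowering the ceiling makes the third harmonic grow relative to the fundamental.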

This effect is purely aesthetic, and it requires the user to decide how much distortion he or she is willing to live with to achieve a certain amount of perceived loudness. This is because the final limiter is a distortion device, similar in concept to a guitar distortion pedal. The more distortion, the louder and bigger it seems. The problem here is that the more distortion we can hear, the harder it is to listen to for long periods of time. The more distortion, the shorter that time frame becomes.

Cleverly designed final limiters will incorporate distortion-canceled clipping. What this does is buy “extra” range by masking most of the objectionable distortion. The net result (if done properly) is more perceived loudness with less obvious distortion. This process is not transparent, and the distortion masking techniques all have a sound unique to the specific type of distortion control used. The differences between the “Brand X” processor and “Brand Y” lie mostly in the “sound” of their final clipping limiter, and how it interacts with the music material used to make up the format of the station using the processor.