Well, here I am already trying not to over-geek you. But, this is important. And, it’s a little technical. Not a lot… but a little.
You hear me crow about “great sound quality” on The Other Side. Maybe you’ve listened with a nice set of headphones, or maybe you’re like Denise and me and listen, more often than not, on crappy laptop speakers, or maybe on your phone’s built-in tin-can speaker. But, when I do pop on a pair of nice headphones, The Other Side sounds pretty good. Not perfect, but pretty good, and far better than most Internet streams and undeniably better than any FM (HD or not) radio station.
Let’s talk about music, how it’s recorded, and what can be done to it to screw it up before it’s broadcast (either over the air or over the Internet). High-quality music from the past 30 years or so was probably recorded digitally, and in stereo. That means you should hear good separation between your left and right ears (especially if you’re using headphones or earbuds) and the audio should be clean: there should be little or no hiss during soft musical segments. And, depending on what kind of music it is, who produced it, and who engineered it, it might exhibit wide dynamic range; that is, there should be a fairly large difference between the soft and loud parts of the song. Rock and roll, hip-hop, and other contemporary music is probably louder, with less dynamic range, than classical, symphonic, or jazz music. That’s the nature of those genres.
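If you like to see numbers, engineers measure that soft-versus-loud difference in decibels (dB). Here’s a minimal Python sketch of the idea, not a real measurement of any song: the two sine tones and their amplitudes (0.01 and 0.8) are made-up stand-ins for a quiet passage and a loud one, and the helper names are hypothetical.

```python
import math

def rms_dbfs(samples):
    """RMS level of a block of samples (floats in -1.0..1.0), in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence has no finite dB level
    return 20 * math.log10(math.sqrt(mean_square))

def sine(amplitude, n=4410, freq=440.0, rate=44100):
    """Generate n samples of a test tone (assumed values, for illustration only)."""
    return [amplitude * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]

soft = sine(0.01)  # stand-in for a quiet string passage
loud = sine(0.8)   # stand-in for the full orchestra

# Dynamic range here is simply the loud level minus the soft level.
dynamic_range_db = rms_dbfs(loud) - rms_dbfs(soft)
print(f"{dynamic_range_db:.1f} dB between soft and loud")
```

With those assumed amplitudes the gap works out to about 38 dB. A real song would be measured over many short windows, but the arithmetic is the same.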
Let’s use a hybrid example. Jeff Beck released Emotion and Commotion in 2010. It features his typical screamin’ guitar work. But, he’s also joined by various vocalists (including Imelda May on Lilac Wine) and a 64-piece orchestra on several tracks (including a version of Nessun Dorma from Puccini’s opera Turandot). Since Lilac Wine and Nessun Dorma are back-to-back with no break on the CD, I took both tracks, pasted them together, and now run them as one song on The Other Side. The combined song exhibits tremendous dynamic range, starting with a small string section behind Imelda May’s voice, melding into screamin’ electric guitar, and finishing with the full orchestra at fortissimo.
If you have them handy, put on a pair of headphones or a nice pair of earbuds and listen. The song is in .WAV format, so it’s full CD quality (and, if you don’t have a really fast Internet connection, this will probably be choppy at best):
Now that you’ve heard the uncompressed, uncompromised version of the song, let’s explore where things can go wrong from here.
Digital Audio Compression
When the World Wide Web began to take hold, there was an early requirement to play audio over the Internet. Remember… early Internet was almost entirely over dial-up modems (starting, in many cases, at 300, 1200, or 2400 bits-per-second). When DSL connections began to replace modems, connectivity speeds increased, but nowhere near enough to support full CD quality uncompressed audio (and, even today, you may not have been able to listen to the full CD version above). Enter the MP3 file. This compressed format permits audio to be “smashed” into a much smaller file size, supporting audio over low-speed Internet connections. But, there’s no free lunch: the harder you squeeze the audio to get a small file, the more audio quality suffers. That’s the basic trade-off of lossy compression. As MP3 files “these days” are typically encoded at 128K bits-per-second or more, you won’t notice as much audio compromise. But, keen ears can hear a low-bitrate MP3 file a mile away. Just for fun, here’s the Jeff Beck piece encoded at 24K bits-per-second (even this was too much for the early dial-up modems):
When you set up an audio stream, you can choose the type of audio compression you use, as well as the bitrate you want to use to send the stream (considering how much bandwidth you have at your disposal, and what Internet speeds your listeners will have). Lots of streaming servers still use MP3 at, typically, 128K bits per second (or lower). Not bad, but nowhere near CD quality. If you listen on good equipment, you’ll typically hear “tearing” on higher frequencies (strings, cymbals, hi-hat). Also, certain voices are destroyed by lower bitrates. Alison Krauss, for example, just demands a higher bitrate.
And, is there something better than MP3? Enter AAC audio. This successor to MP3 can jam more audio into less bandwidth, which means that the crappy-sounding 24K MP3 you heard above would sound substantially better if encoded with AAC at that same 24K bit rate. Take my word for it; I’m not adding it here. ’Cause, if you want to hear AAC encoding, just click LISTEN NOW! at the top of this page! The Other Side presents AAC audio at 256K bits per second. That’s tons of bandwidth with an encoder that’s very efficient at compressing audio. Is it full CD quality? No, but many would be hard pressed to tell the difference.
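To put the bitrates mentioned in this section side by side, here’s a rough back-of-the-envelope sketch in Python. It assumes standard CD audio (44,100 samples per second, 16 bits per sample, two channels) and treats “K” as 1,000 bits. The function names are hypothetical, and the ratios only compare raw data rates; they say nothing about how good a given encoder sounds.

```python
# Standard CD audio parameters (assumed here; not stated in the article itself)
CD_SAMPLE_RATE = 44_100  # samples per second, per channel
CD_BIT_DEPTH = 16        # bits per sample
CD_CHANNELS = 2          # stereo

# Uncompressed CD data rate in kilobits per second
cd_kbps = CD_SAMPLE_RATE * CD_BIT_DEPTH * CD_CHANNELS / 1000

def compression_ratio(stream_kbps):
    """How many times smaller the stream is than uncompressed CD audio."""
    return cd_kbps / stream_kbps

def megabytes_per_minute(kbps):
    """Data used by one minute of audio at the given bitrate (1 MB = 1,000,000 bytes)."""
    return kbps * 1000 * 60 / 8 / 1_000_000

print(f"CD: {cd_kbps:.1f} kbps, {megabytes_per_minute(cd_kbps):.2f} MB per minute")
for kbps in (24, 128, 256):
    print(f"{kbps:>4} kbps: about {compression_ratio(kbps):.1f}:1 versus CD, "
          f"{megabytes_per_minute(kbps):.2f} MB per minute")
```

Uncompressed CD audio comes out to about 1,411K bits per second, so the 24K stream is squeezed roughly 59:1, a 128K stream about 11:1, and a 256K stream about 5.5:1.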
Audio Over-Processing
FM radio stations, generally, don’t worry about digital audio compression. That’s because they still broadcast in analog. So, while listeners won’t experience digital “tearing” or compression artifacts, they will hear multipath effects (distorted audio as the signal bounces off nearby buildings), fading as they get farther away from the transmitter, not-as-crisp highs (FM audio stops at about 15 kHz), and general analog hiss.
Radio stations also face a series of challenges when deciding how much audio processing to use before they send audio to their transmitter. Since the primary listener to a traditional radio station will probably be in a car, stations must consider how they’ll sound with traffic noise, the “whoosh” of air passing by the car at 65 mph, engine noise, and so forth. They want the car listener to hear everything.
And, then there are the “loudness wars.” Started largely by Top 40 and Rock stations in the 70s and 80s, radio stations compete with each other for the loudest sound. The thought was (and, still typically is) that “louder is better — the loudest station will win.” Every FM station I worked at in the 70s and 80s used the OptiMod 8100 audio processor — it was the standard “make it loud” processor in the industry (a fair number of these can still be found in equipment racks).
The idea of radio station audio processors is to limit (maybe even eliminate) dynamic range and make the music as loud as possible. If an unprocessed song has a soft intro, it may completely go away in a car with a V8 driving down a freeway at 70 mph. Instead, FM station audio processing will boost it to be about as loud as the loudest part of the song coming up later. But, what if you want to hear the song as originally recorded, with some real dynamic range? Well, don’t listen to it on the radio!
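For the curious, here’s a toy Python sketch of the basic trick: squash everything above a threshold, then turn the whole thing up. This is emphatically not how any real broadcast processor works inside (those are far more elaborate); the threshold, ratio, and makeup gain values are assumptions I picked for illustration, and the function names are hypothetical.

```python
import math

def to_db(level):
    """Convert a linear level (0 < level <= 1.0) to dB relative to full scale."""
    return 20 * math.log10(level)

def compress(level, threshold_db=-30.0, ratio=8.0, makeup_db=24.0):
    """Toy static compressor with makeup gain, acting on one signal level.

    All three parameters are assumed values, not settings from a real processor.
    """
    level_db = to_db(level)
    if level_db > threshold_db:
        # Above the threshold, every `ratio` dB of input yields only 1 dB of output.
        level_db = threshold_db + (level_db - threshold_db) / ratio
    out_db = min(level_db + makeup_db, 0.0)  # turn it all up, but never past full scale
    return 10 ** (out_db / 20)

soft_in, loud_in = 0.01, 0.8  # stand-ins for a soft intro and a loud finale
soft_out, loud_out = compress(soft_in), compress(loud_in)

print(f"before: {to_db(loud_in) - to_db(soft_in):.1f} dB between soft and loud")
print(f"after:  {to_db(loud_out) - to_db(soft_out):.1f} dB between soft and loud")
```

With these made-up settings, a gap of about 38 dB shrinks to about 13.5 dB: the soft intro gets pushed way up while the loud finale barely moves.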
Let’s go back to the Jeff Beck piece discussed earlier, starting with the original off-the-CD untouched version. If you open the song in an audio editor, you’ll see the left and right channels separately, and note the wide dynamic range (low wave heights and high wave heights) throughout the song.
Now, let’s talk about what FM radio can do to this. I’m going to pick on a particular station I volunteered for recently, because its approach to audio quality was truly fascinating. This is the station whose program director chided me when he occasionally heard me talking over song intros for a second or two: “Avoid that. Our listeners are super-song aware and don’t want anything to interrupt the music.” This is an FM stereo radio station that inadvertently ripped CDs to their automation system in mono for the first 6 months they were on the air and never caught it (guess who did). At last check, the vast majority of that music was never re-ripped in stereo. And, this is the station whose chief engineer, in a meeting with the general manager, described his audio processing as “admittedly, very aggressive.”
Consider, if you will, what it might look like if you recorded Jeff Beck/Imelda May’s Lilac Wine/Nessun Dorma off the air from this FM station and opened it up in an audio editor:
And, imagine what it would sound like:
We will now save you from your pain. Click LISTEN NOW! in the toolbar above this article!