Compression vs. A Limited Domain

Digital audio is not inherently compressed, but it is inherently limited in its representational scope – kinda like analog gear, actually.

One of the great things about being an audio-human is that you get to answer your friends’ questions. I don’t know exactly what it is about having an audio discussion between the sets at a show, but it’s just a fun thing to do.

Maybe it’s because there’s a sort of instant reward to it all.

Anyway.

The other day, I was using some between-set time to transfer sound files from a recording of a Please Be Human show. As the process was moving along, I got into a conversation with “Mills” (the bass player) about the file sizes. As it turned out, Mills was a little mystified by all the filetypes that a person working with digital audio can encounter. Because so many files containing audio are encoded for the purposes of compression, Mills was under the impression that all digital audio is, inherently, compressed. (That is, compressed in terms of data size, as opposed to signal dynamic range compression.)

The idea that this would be the case is easy to understand. We live in a world where digital audio files containing compressed data are commonplace. MP3, AAC, WMA, Ogg Vorbis – all of these are formats that can utilize data compression to save space. If every digital audio format you’ve encountered involves a file encoded for data compression, then it’s just natural to start assuming that all digital audio files use some kind of compression scheme.

This is not actually the case, though. A file that stores LPCM (Linear Pulse Code Modulation) audio data is definitely not compressed – but it IS limited to a useful “information domain.”

A Limited Domain

What on Earth did I mean by that last sentence? Well, let’s start with this:

Uncompressed digital audio has inherent restrictions on the sonic information that it can represent. These restrictions are determined by sample rate and the size of each sample “word.” Sonic information that falls outside these restrictions is either excluded from digital conversion, or included and then later represented inaccurately.

Let’s use a common recording format as an example: 24-bit, 44.1 kHz LPCM. This format is commonly wrapped in “Broadcast” WAV files or AIFF files.

The length of each sample word (24 bits) imposes a limitation on a stored signal’s dynamic range. The theoretical domain-size of a 24-bit integer (whole numbers – no fractions) sample is 144 decibels. So, if you set up your system such that a signal peaked PRECISELY at the clipping point, the file could theoretically store and retrieve any signal from that peak level down to 144 dB below it. Below that level, signal information would be totally indistinguishable from quantization noise, and thus totally unrecoverable.

(As an aside, 144 decibels of dynamic range is performance that is far in excess of most gear, and still in excess of even very good gear. To the best of my knowledge, the very finest analog systems in existence are limited to about 120 dB of dynamic range.)
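If you want to check that 144 dB figure yourself, the arithmetic is straightforward – each bit of word length buys you about 6 dB of range. Here’s a quick sketch in Python (the function name is mine, not anything standard):

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of an integer LPCM word: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)

print(f"16-bit: {dynamic_range_db(16):.1f} dB")  # 96.3 dB
print(f"24-bit: {dynamic_range_db(24):.1f} dB")  # 144.5 dB
```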

On the other side of things, the sample rate (44.1 kHz) imposes a limitation on a stored signal’s frequency range. The theoretical domain-size of a signal sampled at 44.1 kHz is from 0 Hz to 22,050 Hz – half the sample rate, a boundary known as the Nyquist frequency. Unlike the dynamic-range domain, signal information that exceeds this domain is not “drowned.” Instead, the information is misrepresented as an incorrect lower-frequency signal, known as an “alias.” (This is why there are anti-aliasing filters in analog-to-digital converters. The filters block sonic information above 22,050 Hz from entering the conversion process.)
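To make the aliasing idea concrete, here’s a minimal numpy sketch, assuming an ideal sampler with no anti-aliasing filter in front of it. A 30 kHz tone sampled at 44.1 kHz produces exactly the same sample values (apart from sign) as a 14.1 kHz tone:

```python
import numpy as np

fs = 44100   # sample rate; the Nyquist limit is fs / 2 = 22,050 Hz

def alias_frequency(f_in, fs):
    """Where an out-of-band tone lands after being sampled at fs."""
    f = f_in % fs
    return fs - f if f > fs / 2 else f

n = np.arange(fs)                                # one second of sample indices
ultrasonic = np.sin(2 * np.pi * 30000 * n / fs)  # a 30 kHz tone, above Nyquist
f_alias = alias_frequency(30000, fs)             # 14100.0 Hz
aliased = np.sin(2 * np.pi * f_alias * n / fs)   # a 14.1 kHz tone

# Sample for sample, the 30 kHz tone is the mirror image of the 14.1 kHz
# tone; once sampled, the two are indistinguishable apart from sign:
print(f_alias)                               # 14100.0
print(np.max(np.abs(ultrasonic + aliased)))  # effectively zero
```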

What I’m getting at with all of this is that, yes, LPCM digital audio has restrictions on what it can represent correctly. As such, even if we’re not conscious of exactly what we’re doing, we do EXTERNALLY impose data reduction on the signals that we intend to store in the file. We do this because of the file’s limited data domain.

HOWEVER (and this is the key bit), the information stored in the LPCM file has NOT had any further data-size reduction applied to it. Whatever signal actually made it to conversion and storage is contained within the file at “full data width” for each data point. Every single sample utilizes the entirety of whatever sample-word length has been specified: 16-bit, 24-bit, whatever. Five seconds of perfect, digital silence occupies the same amount of storage space as five seconds of anything audible. The data is not compressed in any way.
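You can demonstrate this directly with Python’s standard wave module. The sketch below uses 16-bit words, simply because that’s what the module handles most readily; the file names are arbitrary:

```python
import os
import wave

import numpy as np

fs, seconds = 44100, 5
silence = np.zeros(fs * seconds, dtype=np.int16)
noise = np.random.randint(-20000, 20000, fs * seconds, dtype=np.int16)

for name, samples in (("silence.wav", silence), ("noise.wav", noise)):
    with wave.open(name, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 16-bit sample words
        f.setframerate(fs)
        f.writeframes(samples.tobytes())

# Five seconds of digital silence and five seconds of noise occupy
# exactly the same number of bytes:
print(os.path.getsize("silence.wav"))  # 441044
print(os.path.getsize("noise.wav"))    # 441044
```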

The Distinction

Understanding the above allows us to distinguish between the concept of a limited information domain, and the situation where information within that domain is represented in a compacted manner.

Any piece of audio equipment has an effective information domain. For instance, just like a digital audio file, a power amplifier has a limited dynamic range and usable frequency range. The amplifier’s electronics can only handle so much input and output voltage, and the unit’s noise floor imposes a lower limit on usable signal level. In the same vein, amplifiers can behave unpredictably (or even destructively) when asked to reproduce ultrasonic signals, so it can be helpful to filter very high frequencies out of the amplifier’s inputs.

Now…

Consider the case of an amplifier that can be set so that a mono signal of appropriate scope is duplicated across two or more channels. In a sense, what has happened is a rudimentary form of data compression. Instead of directly “storing and retrieving” two channels of duplicate audio throughout the amplifier (which would require two outputs from, say, a mixing console), we’ve combined a simpler input with an electronic “directive.” The directive says:

“Hey, Amplifier! I need you to mult this one input signal to both of your outputs.”

What happens is that a signal of the appropriate information domain (dynamic range and frequency content) is effectively encoded to have 50% of its otherwise required data-size, and then decoded into its full size at a more convenient time. In this case, the data size is the channel count, as opposed to anything within the signal itself – this analogy is far from perfect.
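To put that in concrete terms, here’s a toy sketch in Python. The packet layout and all the names in it are invented purely for illustration:

```python
# A toy model of the "mult" directive: rather than carrying two identical
# channels, carry one copy of the data plus an instruction to duplicate it.

def encode(left, right):
    if left == right:
        return ("MULT_TO_2", left)    # one channel's data + a directive
    return ("DISCRETE", left, right)  # genuinely different channels: no savings

def decode(packet):
    if packet[0] == "MULT_TO_2":
        mono = packet[1]
        return mono, mono             # expanded back to full channel count
    return packet[1], packet[2]

signal = [0.0, 0.5, 1.0, 0.5]
packet = encode(signal, signal)            # half the channel data "in transit"
print(decode(packet) == (signal, signal))  # True
```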

Even though the metaphor I just presented is both flawed and simplified, it still gives you an idea of the difference between a limited scope of information storage, and compressed information storage of data falling within that scope:

A digital audio file that stores compressed information effectively has the same signal storage domain as an uncompressed parent file or counterpart file. However, information within that domain is represented in such a way as to occupy less storage space. This can be done in a manner which ultimately loses no information (“lossless” compression), or in a destructive manner (“lossy” compression) which relies on human perception to ignore or imagine the missing data.

As I mentioned before, an uncompressed format stores information such that each data point occupies the same amount of space. Silence is just as big as the most complex sonic event that you can imagine. This can be something of a waste, which is why data compression is often useful. The downside to compression is that it requires an extra “decoding step” in order for the stored data to be extracted. In the case of data stored on computers, this creates a tradeoff – you save storage space, but reading the data requires more processing power. If processing muscle is at a premium, and storage is cheap, then large, uncompressed files are the best solution.
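Here’s a sketch of that tradeoff, using Python’s general-purpose zlib codec as a stand-in for a proper lossless audio codec like FLAC:

```python
import zlib

import numpy as np

fs, seconds = 44100, 5
silence = np.zeros(fs * seconds, dtype=np.int16).tobytes()
noise = np.random.randint(-20000, 20000, fs * seconds, dtype=np.int16).tobytes()

# Uncompressed, both blocks are exactly the same size. Losslessly
# compressed, the highly redundant silence collapses to almost nothing,
# while the unpredictable noise shrinks only a little:
print(len(silence), "->", len(zlib.compress(silence)))  # 441000 -> a few hundred bytes
print(len(noise), "->", len(zlib.compress(noise)))      # 441000 -> still close to 441000

# ...and getting the audio back now requires an explicit decoding step:
assert zlib.decompress(zlib.compress(silence)) == silence
```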

…and the good news is that, just because audio is in a digital format, it doesn’t mean that you can’t have an uncompressed file.