Resources.
-
Normalization
Normalization is a technique used in audio processing to adjust the level of an audio file to a specific target. This process does not affect the dynamic range of the audio; it simply raises or lowers the overall level of the file so that the loudest part (or the measured loudness) reaches a predetermined level. Normalization is often used to ensure that different audio tracks play back at a consistent volume, without significant jumps in level between them.
During normalization, the audio signal is analyzed to determine its peak or loudness level, and the overall gain of the signal is then adjusted so that it reaches the target level.
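As a loose illustration of the peak flavor, here is a minimal sketch in Python (assuming NumPy and the soundfile library are available; the file names and the -1 dBFS target are placeholders chosen just for the example):

```python
import numpy as np
import soundfile as sf

TARGET_PEAK_DBFS = -1.0  # example target peak level

# Load the audio as floating-point samples in the range [-1.0, 1.0]
audio, sample_rate = sf.read("input.wav")

# Find the loudest sample and convert it to dBFS
peak = np.max(np.abs(audio))
peak_dbfs = 20 * np.log10(peak)

# Compute the gain needed to bring that peak to the target level
gain_db = TARGET_PEAK_DBFS - peak_dbfs
gain_linear = 10 ** (gain_db / 20)

# The same gain is applied to every sample, so dynamic range is unchanged
normalized = audio * gain_linear
sf.write("normalized.wav", normalized, sample_rate)
```

Because every sample is multiplied by the same factor, the relationship between loud and quiet parts stays exactly as it was.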
-
Loudness
Loudness is the subjective perception of the intensity of a sound. It is closely tied to the amplitude or strength of a sound wave: the greater the amplitude, the more intense the sound is perceived to be. Sound level is measured in decibels (dB), and loudness is perceived on a roughly logarithmic scale, which means that a small increase in decibels corresponds to a large increase in perceived loudness. For example, a sound that is 10 dB louder than another is generally perceived to be about twice as loud.
-
LUFS
LUFS (Loudness Units relative to Full Scale) is a unit of measurement commonly used in audio production to describe the perceived loudness of audio. It is a standardized measurement (defined in ITU-R BS.1770) that takes into account both the level of the signal and how the ear responds to different frequencies, measured over time, in order to provide a more accurate representation of how loud the material is perceived to be by the human ear.
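For a rough idea of how this is measured in practice, here is a minimal sketch using Python with the pyloudnorm library (using that particular library is an assumption on my part, and the file name is a placeholder):

```python
import soundfile as sf
import pyloudnorm as pyln

# Load the audio as floating-point samples
audio, rate = sf.read("song.wav")

# Create a BS.1770 loudness meter for this sample rate
meter = pyln.Meter(rate)

# Integrated loudness of the whole file, in LUFS
loudness = meter.integrated_loudness(audio)
print(f"Integrated loudness: {loudness:.1f} LUFS")
```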
-
Dynamic Range
Dynamic range refers to the difference between the quietest and loudest parts of an audio signal: the range from the softest sound in a recording to the loudest. This concept is crucial because it affects how expressive and immersive a track feels to the listener. Preserving dynamic range keeps every part of a track from sounding equally loud, which would otherwise lead to listener fatigue and a flat, less engaging sound, while still leaving enough level that the song is not perceived as quiet. I'm sure you have heard the saying, "if everything is loud, then nothing is loud".
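One rough numerical proxy for this is the crest factor, the gap between peak and average (RMS) level. The sketch below is only an illustration, assuming NumPy and a floating-point sample array:

```python
import numpy as np

def crest_factor_db(audio: np.ndarray) -> float:
    """Difference between peak and RMS level in dB.

    A larger value suggests a more dynamic signal; a heavily
    limited track tends to have a small crest factor.
    """
    peak = np.max(np.abs(audio))
    rms = np.sqrt(np.mean(audio ** 2))
    return 20 * np.log10(peak / rms)

# Example: a pure sine wave has a crest factor of about 3 dB
t = np.linspace(0, 1, 44100, endpoint=False)
sine = np.sin(2 * np.pi * 440 * t)
print(f"{crest_factor_db(sine):.1f} dB")  # roughly 3.0 dB
```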
-
Frequency Stacking
Frequency stacking refers to a situation in a record where multiple instruments or elements occupy the same frequency range, often leading to muddiness or a lack of clarity in the final sound. When similar frequencies "stack" on top of each other, certain frequency bands (like the low end or mids) become overly crowded. This often happens when multiple instruments have overlapping frequency content: for example, a kick drum, bass guitar, and low synth might all occupy the low end, or a snare and vocals might fight for a similar space in the midrange. Think of it as similar to stereo width, but on the vertical (frequency) axis rather than the horizontal (left-right) one.
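As a loose diagnostic sketch (assuming NumPy and soundfile, with hypothetical stem file names and an arbitrary 20-150 Hz band), you can compare how much of each stem's energy sits in the same region:

```python
import numpy as np
import soundfile as sf

def band_energy_share(audio, rate, low_hz, high_hz):
    """Fraction of a signal's spectral energy inside a frequency band."""
    if audio.ndim > 1:                     # fold stereo down to mono
        audio = audio.mean(axis=1)
    spectrum = np.abs(np.fft.rfft(audio)) ** 2
    freqs = np.fft.rfftfreq(len(audio), d=1 / rate)
    band = (freqs >= low_hz) & (freqs < high_hz)
    return spectrum[band].sum() / spectrum.sum()

# If both stems put most of their energy below ~150 Hz,
# they are stacking in the low end and will likely fight each other.
for name in ("kick.wav", "bass.wav"):
    data, rate = sf.read(name)
    share = band_energy_share(data, rate, 20, 150)
    print(f"{name}: {share:.0%} of energy between 20 and 150 Hz")
```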
-
Mastering for Vinyl
Mastering for vinyl is a unique process that prepares music specifically for the physical characteristics of vinyl records. Unlike digital formats, vinyl has specific needs to ensure the record plays smoothly and sounds great. Remember that vinyl is not a digital medium but an analog one, so it needs to be treated differently.
-
Gain
In audio processing, gain refers to the increase or decrease of the level of an audio signal. It is usually expressed in decibels (dB) and can be used to adjust the volume of an audio signal, either to increase it to a desired level or to reduce it to avoid clipping or distortion.
In some cases, gain can also refer to the overall loudness level of an audio signal, such as when using the term "gain staging" to describe the process of optimizing the gain structure of a signal chain to avoid clipping or distortion.
Overall, gain is an important parameter in audio processing that allows users to control the level and loudness of audio signals, and is essential for achieving a well-balanced and dynamic mix or recording.
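Since gain in dB is logarithmic, applying it to a digital signal means converting it to a linear multiplier first. A minimal sketch, assuming NumPy and float samples:

```python
import numpy as np

def apply_gain(audio: np.ndarray, gain_db: float) -> np.ndarray:
    """Scale a float audio signal by a gain expressed in dB."""
    return audio * 10 ** (gain_db / 20)

# -6 dB roughly halves the amplitude; +6 dB roughly doubles it
signal = np.array([0.5, -0.25, 0.8])
print(apply_gain(signal, -6.0))  # about [0.25, -0.125, 0.4]
```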
-
Decibels
Decibels (dB) are a unit of measurement used to quantify the intensity or power of a sound wave. It is a logarithmic scale that is based on the ratio of the measured sound wave to a reference sound wave. The reference sound wave used for comparison is typically the threshold of human hearing, which is defined as 0 dB.
Because decibels are measured on a logarithmic scale, a difference of 10 dB represents a tenfold difference in sound intensity, while a difference of 20 dB represents a hundredfold difference in sound intensity. This means that a sound wave that is 70 dB is ten times more intense than a sound wave that is 60 dB.
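To make those ratios concrete, here is a small sketch of the underlying arithmetic (intensity or power uses 10 times the log of the ratio, while amplitude uses 20 times the log):

```python
import math

def intensity_ratio_to_db(ratio: float) -> float:
    """Convert an intensity (power) ratio to decibels."""
    return 10 * math.log10(ratio)

def db_to_intensity_ratio(db: float) -> float:
    """Convert a decibel difference back to an intensity ratio."""
    return 10 ** (db / 10)

print(intensity_ratio_to_db(10))        # 10.0 dB -> tenfold intensity
print(intensity_ratio_to_db(100))       # 20.0 dB -> hundredfold intensity
print(db_to_intensity_ratio(70 - 60))   # 10.0 -> 70 dB is ten times the intensity of 60 dB
```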
-
Spotify Loudness Normalization
Spotify loudness normalization is a feature that aims to provide a consistent listening experience across different tracks, regardless of their original loudness. When a user plays a song on Spotify, the streaming service applies loudness normalization to the track to ensure that it plays at a consistent volume level relative to other tracks in the playlist or album.
The goal of loudness normalization is to prevent songs with higher overall loudness levels from sounding overly loud and overpowering songs with lower loudness levels. Spotify uses the ReplayGain algorithm to analyze the loudness of each track and adjust its playback level accordingly: it calculates the difference between the track's loudness level and the reference loudness level, then applies a gain adjustment to bring the track up or down to that reference level.
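The core idea can be sketched in a few lines. Note that the -14 LUFS reference below is an assumption about the commonly cited default target, the measured loudness values are placeholders, and the real player applies further rules (for example around boosting and limiting) that this ignores:

```python
REFERENCE_LUFS = -14.0   # assumed default reference level

def playback_gain_db(track_lufs: float, reference_lufs: float = REFERENCE_LUFS) -> float:
    """Gain the player would apply so the track lands at the reference loudness."""
    return reference_lufs - track_lufs

# A loud master at -8 LUFS gets turned down by 6 dB;
# a quiet master at -18 LUFS would be turned up by 4 dB
print(playback_gain_db(-8.0))    # -6.0
print(playback_gain_db(-18.0))   # 4.0
```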
-
Dither
Dithering is the process of adding a small amount of random noise to a digital signal before quantization to reduce the distortion that quantization causes. The added noise spreads the quantization error across a wider range of frequencies and decorrelates it from the music, so it behaves like a faint, benign hiss rather than audible distortion.
There are different types of dither used in digital audio, including rectangular (RPDF), triangular (TPDF), and noise-shaped dither. Noise-shaped dither is a more advanced technique that pushes the noise energy into the frequency ranges where human hearing is least sensitive, which can make the remaining noise even less audible.
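Here is a rough sketch of TPDF (triangular) dither applied before reducing a float signal to 16-bit, assuming NumPy; it is only an illustration, not a production converter:

```python
import numpy as np

def quantize_16bit(audio: np.ndarray, dither: bool = True) -> np.ndarray:
    """Reduce a float signal in [-1, 1] to 16-bit integers, optionally with TPDF dither."""
    step = 1.0 / 32768.0                      # size of one 16-bit quantization step
    if dither:
        # Triangular (TPDF) noise: the sum of two uniform noises, one step peak to peak
        noise = (np.random.uniform(-0.5, 0.5, audio.shape) +
                 np.random.uniform(-0.5, 0.5, audio.shape)) * step
        audio = audio + noise
    return np.clip(np.round(audio / step), -32768, 32767).astype(np.int16)

# Quantize a very quiet sine: with dither, the error behaves like benign broadband noise
t = np.linspace(0, 1, 44100, endpoint=False)
quiet_sine = 0.0005 * np.sin(2 * np.pi * 440 * t)
samples = quantize_16bit(quiet_sine)
```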
-
Bit depth
Bit depth in audio refers to the number of bits of information used to represent each sample of an audio signal. It determines the dynamic range of the audio signal, which is the difference between the loudest and softest sounds that can be represented.
For example, a 16-bit audio system can represent 65,536 different amplitude levels (2^16), while a 24-bit system can represent 16,777,216 levels (2^24). The higher the bit depth, the more accurately the system can represent the original sound wave, resulting in a higher quality and more dynamic sound.
In digital audio, bit depth is often accompanied by sampling rate, which is the number of samples taken per second. Together, bit depth and sampling rate determine the overall quality of the digital audio signal.
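A small sketch of the numbers behind this (the 6.02 dB-per-bit figure, plus 1.76 dB, is the standard rule of thumb for the dynamic range of an ideal converter):

```python
def levels(bits: int) -> int:
    """Number of distinct amplitude levels a bit depth can represent."""
    return 2 ** bits

def approx_dynamic_range_db(bits: int) -> float:
    """Rule-of-thumb dynamic range of an ideal digital system."""
    return 6.02 * bits + 1.76

for bits in (16, 24):
    print(f"{bits}-bit: {levels(bits):,} levels, "
          f"~{approx_dynamic_range_db(bits):.0f} dB dynamic range")
# 16-bit: 65,536 levels, ~98 dB; 24-bit: 16,777,216 levels, ~146 dB
```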
-
Aliasing
Aliasing refers to the distortion or artifacts that occur when a signal is sampled at too low a rate or improperly filtered. When a continuous analog signal is converted to a digital signal, it is sampled at a specific rate, known as the sampling rate, which determines the maximum frequency that can be accurately represented in the digital signal.
If the sampling rate is too low, frequencies above the Nyquist frequency (half the sampling rate) can fold back and appear in the sampled signal at lower frequencies, causing aliasing. This can result in unwanted sounds, such as high-pitched whines, distortion, or in extreme cases, complete loss of the original signal.
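To see the fold-back in action, here is a minimal sketch (assuming NumPy) that samples a 9 kHz sine at only 8 kHz, where the Nyquist frequency is 4 kHz, and locates the resulting alias:

```python
import numpy as np

sample_rate = 8000          # Nyquist frequency is 4000 Hz
true_freq = 9000            # above Nyquist, so it cannot be represented correctly

# Sample one second of the sine at the (too low) sampling rate
t = np.arange(sample_rate) / sample_rate
samples = np.sin(2 * np.pi * true_freq * t)

# Find the strongest frequency actually present in the sampled signal
spectrum = np.abs(np.fft.rfft(samples))
freqs = np.fft.rfftfreq(len(samples), d=1 / sample_rate)
alias = freqs[np.argmax(spectrum)]

print(f"A {true_freq} Hz tone sampled at {sample_rate} Hz shows up at {alias:.0f} Hz")
# Expected: about 1000 Hz, since 9000 folds back to |9000 - 8000| = 1000
```

This is exactly why analog-to-digital converters place a low-pass (anti-aliasing) filter before the sampler.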