MFCC vs Mel Spectrogram

Tiya Vaj
1 min read · Sep 6, 2023

MFCCs (Mel-Frequency Cepstral Coefficients) and a Mel Spectrogram do not contain the same numbers. They are two different audio feature representations, each with its own characteristics and applications.

1. Mel Spectrogram: A Mel Spectrogram is a time-frequency representation of an audio signal. It is a 2D array in which the x-axis represents time, the y-axis represents frequency on the Mel scale, and each value is the amplitude or energy of that frequency band in that time frame. It is usually displayed as an image, with a colormap encoding the intensity (often on a log or decibel scale).

2. MFCC (Mel-Frequency Cepstral Coefficients): MFCCs are a small set of coefficients per time frame that summarize the shape of the spectral envelope. They are derived from the Mel Spectrogram through further processing: typically the logarithm of the Mel-band energies is taken, a discrete cosine transform (DCT) is applied, and only the first few coefficients (often around 13) are kept. MFCCs are commonly used as features for speech and audio analysis tasks such as speech recognition.
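
The relationship between the two representations can be written out as a short sketch. It assumes one common variant of the Mel scale and an unnormalized DCT-II; the symbols S (Mel Spectrogram), M (number of Mel bands), and K (number of coefficients kept) are introduced here for illustration, and libraries differ in scaling and in whether they use a natural log or decibels.

```latex
% Mel scale (one common variant): frequency f in Hz mapped to mels
m(f) = 2595 \, \log_{10}\!\left(1 + \frac{f}{700}\right)

% MFCCs for time frame t, from Mel-band energies S[m, t], m = 0, \dots, M-1
c_k[t] = \sum_{m=0}^{M-1} \log\bigl(S[m, t]\bigr)\,
         \cos\!\left(\frac{\pi k \left(m + \tfrac{1}{2}\right)}{M}\right),
\qquad k = 0, \dots, K-1
```

Each MFCC is therefore a weighted sum over all Mel bands of a frame, not a copy of any single Mel Spectrogram value.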

While both Mel Spectrograms and MFCCs are built on the same Mel scale, they serve different purposes. The Mel Spectrogram provides a time-frequency representation of the audio signal, while MFCCs are a compact representation of its spectral features. The numbers differ because MFCCs go through additional processing steps: they are a transformation of the Mel Spectrogram values, not a subset of them.
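
To make the difference concrete, here is a minimal NumPy-only sketch that computes both features for a synthetic two-tone signal. The function names, the test signal, and the parameter values (512-sample frames, 40 Mel bands, 13 coefficients) are assumptions chosen for illustration, and the scaling will not match any particular library exactly.

```python
import numpy as np

def hz_to_mel(f):
    # One common variant of the Mel scale
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the Mel scale, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges_hz = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40):
    """Power spectrogram projected onto Mel bands, shape (n_mels, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[t * hop : t * hop + n_fft] * window
                       for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return mel_filterbank(sr, n_fft, n_mels) @ power.T

def mfcc_from_mel(mel_spec, n_mfcc=13):
    """Log of the Mel energies followed by an unnormalized DCT-II over the Mel axis."""
    log_mel = np.log(mel_spec + 1e-10)  # small offset avoids log(0)
    n_mels = mel_spec.shape[0]
    m = np.arange(n_mels)
    k = np.arange(n_mfcc)[:, None]
    dct_basis = np.cos(np.pi * k * (m + 0.5) / n_mels)  # (n_mfcc, n_mels)
    return dct_basis @ log_mel  # (n_mfcc, n_frames)

# Synthetic one-second test signal: two sine tones at 440 Hz and 1000 Hz
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

mel = mel_spectrogram(signal, sr)
mfcc = mfcc_from_mel(mel)

print("Mel spectrogram shape:", mel.shape)
print("MFCC shape:", mfcc.shape)
print("First frame, first 5 Mel values:", mel[:5, 0])
print("First frame, first 5 MFCCs:", mfcc[:5, 0])
```

With these settings each frame has 40 Mel Spectrogram values but only 13 MFCCs, and the two sets of numbers are on different scales: Mel energies are non-negative, whereas MFCCs can take either sign.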

Tiya Vaj

Ph.D. Research Scholar in NLP, passionate about data-driven work for social good. Let's connect here: https://www.linkedin.com/in/tiya-v-076648128/