Librosa Examples

This step-by-step tutorial walks through 10 examples of using Librosa, a popular Python library for audio analysis and feature extraction. Let's get started!

Example 1: Loading an audio file

The first step is to load an audio file using Librosa. Librosa supports various audio file formats such as WAV, MP3, and FLAC. Here's how you can load an audio file:

import librosa

# Load the audio file
audio_path = 'path/to/audio/file.wav'
audio, sr = librosa.load(audio_path)

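In the above code, audio_path is the path to the audio file, audio is a one-dimensional NumPy array of floating-point samples, and sr is the sampling rate. By default, librosa.load resamples the audio to 22,050 Hz and mixes it down to mono; pass sr=None to keep the file's native sampling rate.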
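
To make the relationship between the sample array, the sampling rate, and the duration concrete, here is a minimal sketch that needs only NumPy. It uses a synthetic sine wave as a stand-in for a loaded file, and the helper name duration_seconds is hypothetical, not part of Librosa.

```python
import numpy as np


def duration_seconds(samples: np.ndarray, sr: int) -> float:
    """Return the length of a mono signal in seconds."""
    return len(samples) / sr


# Stand-in for `audio, sr = librosa.load(...)`: a 440 Hz sine wave,
# two seconds long, sampled at 22,050 Hz.
sr = 22050
t = np.arange(2 * sr) / sr
audio = np.sin(2 * np.pi * 440.0 * t).astype(np.float32)

print(audio.shape)                  # (44100,)
print(duration_seconds(audio, sr))  # 2.0
```

The same arithmetic applies to any array returned by librosa.load: the number of samples divided by sr gives the duration in seconds.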

Example 2: Displaying the waveform

You can visualize the waveform of an audio file using Matplotlib. Here's an example:

import librosa
import librosa.display
import matplotlib.pyplot as plt

# Load the audio file
audio_path = 'path/to/audio/file.wav'
audio, sr = librosa.load(audio_path)

# Display the waveform (waveshow replaces waveplot, which has been removed from recent Librosa releases)
plt.figure(figsize=(14, 5))
librosa.display.waveshow(audio, sr=sr)
plt.xlabel('Time (seconds)')
plt.ylabel('Amplitude')
plt.title('Waveform')
plt.show()

The above code displays the waveform of the audio file using Matplotlib. The x-axis represents time in seconds, and the y-axis represents the amplitude of the audio.
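
For long recordings, a plot of every sample looks much the same as a plot of a per-block amplitude envelope, which is far cheaper to draw. The sketch below computes such an envelope with NumPy. It illustrates the idea only and is not Librosa's internal implementation; the function name peak_envelope and the block size are assumptions.

```python
import numpy as np


def peak_envelope(samples: np.ndarray, block: int) -> np.ndarray:
    """Return the maximum absolute amplitude of each full block of samples."""
    n_blocks = len(samples) // block
    trimmed = np.abs(samples[: n_blocks * block])
    return trimmed.reshape(n_blocks, block).max(axis=1)


sr = 22050
t = np.arange(sr) / sr
# A 5 Hz tone whose amplitude ramps linearly from 0 to 1 over one second.
audio = t * np.sin(2 * np.pi * 5.0 * t)

env = peak_envelope(audio, block=2205)   # one value per 0.1 s
times = np.arange(len(env)) * 2205 / sr  # start time of each block, in seconds
print(env.shape)  # (10,)
```

Plotting env against times gives a coarse outline of the waveform with ten points instead of 22,050.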

Example 3: Extracting the Mel-frequency cepstral coefficients (MFCCs)

MFCCs are compact descriptors of a signal's spectral envelope and are widely used in speech and music analysis. Librosa provides a function to extract MFCCs from an audio signal. Here's an example:

import librosa

# Load the audio file
audio_path = 'path/to/audio/file.wav'
audio, sr = librosa.load(audio_path)

# Extract MFCCs (recent Librosa releases require the signal to be passed by keyword as y=)
mfccs = librosa.feature.mfcc(y=audio, sr=sr)

# Print the shape of the MFCCs
print(mfccs.shape)

In the above code, mfccs is a NumPy array of shape (n_mfcc, n_frames): one row per MFCC coefficient (20 by default) and one column per analysis frame.
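
The number of frames follows from the signal length and the hop length. As a rough sketch, assuming Librosa's default hop_length of 512 samples and centered frames, the expected shape can be computed by hand. The helper names n_frames and expected_mfcc_shape are hypothetical.

```python
def n_frames(n_samples: int, hop_length: int = 512) -> int:
    """Number of frames produced by a centered short-time analysis."""
    return 1 + n_samples // hop_length


def expected_mfcc_shape(n_samples: int, n_mfcc: int = 20, hop_length: int = 512):
    """Shape of the MFCC matrix as (coefficients, frames)."""
    return (n_mfcc, n_frames(n_samples, hop_length))


# One second and thirty seconds of audio at 22,050 Hz.
print(expected_mfcc_shape(22050))       # (20, 44)
print(expected_mfcc_shape(22050 * 30))  # (20, 1292)
```

Comparing this against print(mfccs.shape) is a quick sanity check that the audio was loaded at the length and sampling rate you expect.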

Example 4: Visualizing the MFCCs

You can visualize the MFCCs using a heatmap. Here's an example:

import librosa
import librosa.display
import matplotlib.pyplot as plt

# Load the audio file
audio_path = 'path/to/audio/file.wav'
audio, sr = librosa.load(audio_path)

# Extract MFCCs
mfccs = librosa.feature.mfcc(y=audio, sr=sr)

# Display the MFCCs
plt.figure(figsize=(10, 4))
librosa.display.specshow(mfccs, x_axis='time')
plt.colorbar()
plt.title('MFCCs')
plt.tight_layout()
plt.show()

The above code displays the MFCCs of the audio file as a heatmap. The x-axis represents time, and the y-axis represents the MFCC coefficient index. The color encodes the value of each coefficient in each frame.
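
The time axis of such a heatmap comes from converting frame indices to seconds. A minimal sketch of that conversion follows, assuming the default sampling rate of 22,050 Hz and hop length of 512 samples; the helper name frame_to_seconds is hypothetical.

```python
def frame_to_seconds(frame: int, sr: int = 22050, hop_length: int = 512) -> float:
    """Start time of a frame: its sample offset divided by the sampling rate."""
    return frame * hop_length / sr


print(frame_to_seconds(0))             # 0.0
print(round(frame_to_seconds(43), 3))  # 0.998
```

If the audio was loaded at a different sampling rate, or the features were computed with a different hop length, the same values need to be passed to the plotting call so that the axis labels stay accurate.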

Example 5: Extracting the chroma feature

The chroma feature describes how the signal's energy is distributed across the 12 pitch classes (C, C#, D, ..., B), regardless of octave. Librosa provides a function to extract the chroma feature from an audio signal. Here's an example:

import librosa

# Load the audio file
audio_path = 'path/to/audio/file.wav'
audio, sr = librosa.load(audio_path)

# Extract the chroma feature
chroma = librosa.feature.chroma_stft(y=audio, sr=sr)

# Print the shape of the chroma feature
print(chroma.shape)

In the above code, chroma is a NumPy array of shape (12, n_frames): one row per pitch class and one column per analysis frame.
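
To see what "pitch class" means in practice, the following sketch maps a frequency to its nearest note and then discards the octave. It assumes standard tuning (A4 = 440 Hz) and 12-tone equal temperament; the function name pitch_class is hypothetical.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(freq_hz: float) -> str:
    """Name of the pitch class nearest to a frequency, ignoring the octave."""
    midi = 69 + 12 * math.log2(freq_hz / 440.0)  # MIDI note number, A4 = 69
    return PITCH_CLASSES[round(midi) % 12]


print(pitch_class(440.0))   # A
print(pitch_class(880.0))   # A (one octave higher, same pitch class)
print(pitch_class(261.63))  # C
```

A chroma feature applies the same folding to a whole spectrum, summing the energy of every frequency bin into one of the 12 rows.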

Example 6: Visualizing the chroma feature

You can visualize the chroma feature using a heatmap. Here's an example:

import librosa
import librosa.display
import matplotlib.pyplot as plt

# Load the audio file
audio_path = 'path/to/audio/file.wav'
audio, sr = librosa.load(audio_path)

# Extract the chroma feature
chroma = librosa.feature.chroma_stft(y=audio, sr=sr)

# Display the chroma feature
plt.figure(figsize=(10, 4))
librosa.display.specshow(chroma, x_axis='time', y_axis='chroma')
plt.colorbar()
plt.title('Chroma Feature')
plt.tight_layout()
plt.show()

The above code displays the chroma feature of the audio file as a heatmap. The x-axis represents time, and the y-axis represents the 12 different pitch classes.
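
A heatmap like this can also be read programmatically. The sketch below picks the strongest pitch class in each frame of a small hand-made chroma matrix. It assumes row 0 corresponds to C, which is how Librosa orders chroma rows by default; the function name dominant_pitch_classes and the toy data are illustrative only.

```python
import numpy as np

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def dominant_pitch_classes(chroma: np.ndarray) -> list:
    """Return the name of the strongest pitch class in each frame (column)."""
    return [PITCH_CLASSES[i] for i in np.argmax(chroma, axis=0)]


# Toy chroma matrix with 12 pitch classes and 3 frames.
chroma = np.zeros((12, 3))
chroma[0, 0] = 1.0  # frame 0: C
chroma[4, 1] = 1.0  # frame 1: E
chroma[7, 2] = 1.0  # frame 2: G is strongest ...
chroma[9, 2] = 0.4  # ... with a weaker A underneath

print(dominant_pitch_classes(chroma))  # ['C', 'E', 'G']
```

On real audio the result changes rapidly from frame to frame, so it is usually smoothed or aggregated before being interpreted as notes or chords.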

Example 7: Extracting the spectral contrast

For each frequency sub-band, spectral contrast measures the difference in level between the peaks and the valleys of the spectrum. Librosa provides a function to extract the spectral contrast from an audio signal. Here's an example:

import librosa

# Load the audio file
audio_path = 'path/to/audio/file.wav'
audio, sr = librosa.load(audio_path)

# Extract the spectral contrast
contrast = librosa.feature.spectral_contrast(y=audio, sr=sr)

# Print the shape of the spectral contrast
print(contrast.shape)

In the above code, contrast is a NumPy array of shape (n_bands + 1, n_frames): one row per frequency sub-band (7 rows with the default n_bands=6) and one column per analysis frame.
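
The peak-versus-valley idea can be sketched for a single sub-band of a single frame. This is a simplified illustration, not Librosa's implementation: the function name band_contrast_db is hypothetical, and the quantile used to pick peaks and valleys is an assumed parameter.

```python
import numpy as np


def band_contrast_db(power: np.ndarray, quantile: float = 0.02) -> float:
    """Peak-minus-valley level, in dB, of one sub-band of a power spectrum.

    All values in `power` must be strictly positive.
    """
    ordered = np.sort(power)
    k = max(1, int(round(quantile * len(ordered))))  # bins counted as peak/valley
    valley = ordered[:k].mean()
    peak = ordered[-k:].mean()
    return float(10.0 * np.log10(peak) - 10.0 * np.log10(valley))


flat = np.full(50, 1.0)   # noise-like band: every bin has the same power
peaky = np.full(50, 1.0)
peaky[25] = 1000.0        # tonal band: one bin stands far above the rest

print(band_contrast_db(flat))   # 0.0
print(band_contrast_db(peaky))  # approximately 30 dB
```

High contrast suggests a band dominated by clear harmonic peaks, while low contrast suggests broadband, noise-like content.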

Example 8: Visualizing the spectral contrast

You can visualize the spectral contrast using a heatmap. Here's an example:

import librosa
import librosa.display
import matplotlib.pyplot as plt

# Load the audio file
audio_path = 'path/to/audio/file.wav'
audio, sr = librosa.load(audio_path)

# Extract the spectral contrast
contrast = librosa.feature.spectral_contrast(y=audio, sr=sr)

# Display the spectral contrast
plt.figure(figsize=(10, 4))
librosa.display.specshow(contrast, x_axis='time')
plt.colorbar(format='%+2.0f dB')
plt.title('Spectral Contrast')
plt.tight_layout()
plt.show()

The above code displays the spectral contrast of the audio file as a heatmap. The x-axis represents time, and the y-axis represents the spectral contrast bands.

Example 9: Extracting the tonnetz feature

The tonnetz feature represents the tonal centroids of the audio: a six-dimensional projection of the chroma onto the perfect fifth, minor third, and major third axes. Librosa provides a function to extract the tonnetz feature from an audio signal. Here's an example:

import librosa

# Load the audio file
audio_path = 'path/to/audio/file.wav'
audio, sr = librosa.load(audio_path)

# Extract the tonnetz feature
tonnetz = librosa.feature.tonnetz(y=audio, sr=sr)

# Print the shape of the tonnetz feature
print(tonnetz.shape)

In the above code, tonnetz is a NumPy array of shape (6, n_frames): one row per tonal centroid dimension and one column per analysis frame.
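
The six dimensions can be understood as three (x, y) pairs, one per interval circle. The sketch below projects a single 12-bin chroma vector onto those circles using only the standard library. It is modeled on the usual tonal centroid construction and is an assumption about how the projection works, not a copy of Librosa's code; the names CIRCLES and tonal_centroid are hypothetical.

```python
import math

# (angle advanced per semitone, radius) for each interval circle:
# perfect fifths, minor thirds, major thirds.
CIRCLES = [
    (7 * math.pi / 6, 1.0),
    (3 * math.pi / 2, 1.0),
    (2 * math.pi / 3, 0.5),
]


def tonal_centroid(chroma):
    """Project a 12-bin chroma vector (positive sum) onto six tonal dimensions."""
    total = sum(chroma)
    weights = [c / total for c in chroma]  # normalize so the weights sum to 1
    centroid = []
    for step, radius in CIRCLES:
        x = sum(w * radius * math.sin(k * step) for k, w in enumerate(weights))
        y = sum(w * radius * math.cos(k * step) for k, w in enumerate(weights))
        centroid.extend([x, y])
    return centroid


only_c = [1.0] + [0.0] * 11
print(tonal_centroid(only_c))  # [0.0, 1.0, 0.0, 1.0, 0.0, 0.5]
```

Pitch classes a fifth apart, such as C and G, land on neighboring points of the first circle, which is why harmonically related frames end up close together in this space.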

Example 10: Visualizing the tonnetz feature

You can visualize the tonnetz feature using a heatmap. Here's an example:

import librosa
import librosa.display
import matplotlib.pyplot as plt

# Load the audio file
audio_path = 'path/to/audio/file.wav'
audio, sr = librosa.load(audio_path)

# Extract the tonnetz feature
tonnetz = librosa.feature.tonnetz(y=audio, sr=sr)

# Display the tonnetz feature
plt.figure(figsize=(10, 4))
librosa.display.specshow(tonnetz, x_axis='time', y_axis='tonnetz')
plt.colorbar()
plt.title('Tonnetz Feature')
plt.tight_layout()
plt.show()

The above code displays the tonnetz feature of the audio file as a heatmap. The x-axis represents time, and the y-axis represents the tonnetz coefficients.

These are just a few examples of what you can do with Librosa. It provides many more features and functions for audio analysis. Experiment with different audio files and explore the various functions available in Librosa to extract meaningful insights from audio data.
