AudioRead How To Examples in Python
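The audioread library provides a simple, uniform way to decode audio files in Python. It does not decode audio itself; it delegates to whichever backend is available on the system (for example GStreamer, Core Audio, MAD, FFmpeg, or the standard library's readers for uncompressed formats). Formats such as WAV, MP3, FLAC, and OGG are therefore supported when a suitable backend is installed. In this tutorial, we will explore some common use cases of audioread with step-by-step examples.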
Installation
Before we begin, let's make sure we have audioread installed. You can install it using pip:
pip install audioread
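The examples in this tutorial assume a file named audio_file.wav exists. If you do not have one at hand, the following sketch generates a short 16-bit stereo sine wave using only the standard library. The function name write_test_wav and its parameters are illustrative and not part of audioread.

```python
import math
import struct
import wave


def write_test_wav(path, seconds=1.0, samplerate=44100, channels=2, freq=440.0):
    """Write a 16-bit PCM sine-wave WAV file and return the number of frames."""
    num_frames = int(seconds * samplerate)
    frames = bytearray()
    for n in range(num_frames):
        # Half-amplitude sine, scaled to the signed 16-bit range
        value = int(0.5 * 32767 * math.sin(2 * math.pi * freq * n / samplerate))
        # Same sample in every channel, little-endian signed 16-bit
        frames += struct.pack('<h', value) * channels
    with wave.open(path, 'wb') as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 2 bytes = 16-bit samples
        w.setframerate(samplerate)
        w.writeframes(bytes(frames))
    return num_frames
```

Calling write_test_wav('audio_file.wav') creates a one-second stereo file at 44100 Hz in the current directory.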
Example 1: Reading Audio File Metadata
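Let's start by reading the basic properties of an audio file: its duration, sample rate, and number of channels. Note that audioread exposes these stream properties only; it does not read tags such as title or artist.
import audioread
# Open the audio file
with audioread.audio_open('audio_file.mp3') as f:
    # Collect the stream properties exposed by the file object
    info = {'duration': f.duration, 'channels': f.channels, 'sample_rate': f.samplerate}
    print(info)
Expected Output (illustrative; the values depend on the file):
{'duration': 180.123456789, 'channels': 2, 'sample_rate': 44100}
In this example, we open the audio file audio_file.mp3 using audioread.audio_open. The returned file object has no metadata attribute; instead it provides three separate attributes: duration (in seconds), channels, and samplerate (in Hz). We gather them into a dictionary for printing.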
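The three properties also tell us roughly how much decoded data to expect. The sketch below works through that arithmetic, assuming the 16-bit (2-byte) samples that audioread yields; expected_pcm_size is a hypothetical helper name, and for compressed formats the reported duration may be approximate, so treat the result as an estimate.

```python
def expected_pcm_size(duration, samplerate, channels, bytes_per_sample=2):
    """Estimate the frame count and decoded size in bytes of a stream.

    bytes_per_sample=2 reflects the 16-bit samples that audioread yields.
    """
    frames = round(duration * samplerate)
    return frames, frames * channels * bytes_per_sample


frames, size = expected_pcm_size(duration=180.0, samplerate=44100, channels=2)
print(frames, size)  # 7938000 31752000
```

A three-minute stereo file at 44100 Hz thus decodes to about 32 MB of raw PCM, which is worth keeping in mind before loading a whole file into memory.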
Example 2: Reading Audio Data
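Now let's read the audio data from a file and print the first few frames. audioread yields raw bytes, so we use NumPy (pip install numpy) to interpret them as samples.
import audioread
import numpy as np
# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    num_channels = f.channels
    # Iterating over the file yields bytes blocks of raw PCM; join them all
    raw = b''.join(f)
# Interpret the bytes as 16-bit little-endian signed integers, one row per frame
audio_data = np.frombuffer(raw, dtype='<i2').reshape(-1, num_channels)
# Print the first 10 frames
print(audio_data[:10])
Expected Output (illustrative; actual sample values depend on the file):
[[ 0  0]
 [ 3  3]
 [ 6  6]
 ...]
In this example, we join the blocks produced by iterating over the file object, since audioread has no read_data method. The decoded data is interleaved 16-bit PCM, so after reshaping, audio_data is a NumPy array with one row per frame and one column per channel.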
Example 3: Iterating over Audio Frames
Sometimes, we may want to process audio data block by block instead of loading the whole file. Let's see how to iterate over the decoded blocks using audioread.
import audioread
# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    # Iterate over the decoded blocks
    for buf in f:
        # Each block is a bytes object of raw 16-bit PCM
        print(len(buf))
Expected Output (illustrative; block sizes depend on the backend and the file):
4096
4096
In this example, we iterate over the file object using a for loop. Each item is not a NumPy array but a bytes object containing interleaved 16-bit little-endian signed integer samples. The number of frames per block is chosen by the backend.
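Because audioread delivers each block as raw bytes of interleaved 16-bit little-endian PCM, it can help to see how those bytes map to samples. The following sketch decodes one block with the standard library's struct module; decode_block is a hypothetical helper, and it assumes the block holds a whole number of frames (struct raises an error otherwise).

```python
import struct


def decode_block(buf, channels):
    """Turn a block of interleaved 16-bit little-endian PCM into a list of frames.

    Each frame is a tuple with one integer sample per channel.
    """
    frame_format = '<' + 'h' * channels
    return list(struct.iter_unpack(frame_format, buf))


# Synthetic stereo block holding two frames: (1, -1) and (2, -2)
block = struct.pack('<4h', 1, -1, 2, -2)
print(decode_block(block, channels=2))  # [(1, -1), (2, -2)]
```

Inside the loop above, decode_block(buf, f.channels) would give the frames of each block as they are decoded.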
Example 4: Extracting Audio Channels
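If an audio file contains multiple channels, we can extract individual channels for further processing. audioread file objects cannot be indexed, so we decode the data first and then select columns with NumPy. Let's extract the first two channels from an audio file.
import audioread
import numpy as np
# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    # Get the number of channels
    num_channels = f.channels
    raw = b''.join(f)
# One row per frame, one column per channel
audio_data = np.frombuffer(raw, dtype='<i2').reshape(-1, num_channels)
# Extract the first two channels (transpose so that each row is one channel)
channels = audio_data[:, :2].T
# Print the extracted channels
for channel in channels:
    print(channel)
Expected Output (illustrative; actual sample values depend on the file):
[0 3 6 ...]
[0 3 6 ...]
In this example, we reshape the interleaved samples into one column per channel and slice the first two columns. The resulting channels object is a NumPy array whose rows are the extracted channels.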
Example 5: Reading a Subset of Audio Frames
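To read only a subset of audio frames from a file, we can stop iterating as soon as enough data has been decoded. audioread does not support seeking, so a range always has to be read from the start of the file. Let's read the first 100 frames of an audio file.
import audioread
import numpy as np
num_frames = 100
with audioread.audio_open('audio_file.wav') as f:
    num_channels = f.channels
    bytes_needed = num_frames * num_channels * 2  # 2 bytes per 16-bit sample
    raw = b''
    for buf in f:
        raw += buf
        if len(raw) >= bytes_needed:
            break  # stop decoding once enough data has arrived
# Keep exactly the first 100 frames, one row per channel
audio_data = np.frombuffer(raw[:bytes_needed], dtype='<i2').reshape(-1, num_channels).T
# Print the first 10 samples from each channel
for channel in audio_data:
    print(channel[:10])
Expected Output (illustrative; actual sample values depend on the file):
[0 3 6 ...]
[0 3 6 ...]
In this example, we accumulate blocks until they cover the requested number of frames and then trim the excess bytes. The resulting audio_data array contains the samples for the first 100 frames of each channel (or fewer if the file is shorter).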
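For an arbitrary range of frames rather than just the beginning of the file, the same idea can be expressed as a helper that walks the stream of blocks and keeps only the overlapping bytes. This is a sketch under the assumption that blocks contain 16-bit interleaved PCM, as audioread produces; read_frame_range and its parameters are hypothetical names, not part of the library.

```python
def read_frame_range(blocks, channels, start, stop, bytes_per_sample=2):
    """Return the raw bytes for frames [start, stop) from an iterable of PCM blocks."""
    frame_size = channels * bytes_per_sample
    first_byte, last_byte = start * frame_size, stop * frame_size
    out = bytearray()
    position = 0  # byte offset of the current block within the stream
    for buf in blocks:
        block_end = position + len(buf)
        if block_end > first_byte:
            # Keep only the part of this block that overlaps the requested range
            lo = max(first_byte - position, 0)
            hi = min(last_byte - position, len(buf))
            out += buf[lo:hi]
        position = block_end
        if position >= last_byte:
            break  # nothing further is needed, so stop decoding
    return bytes(out)

# Possible use with audioread (file name is a placeholder):
#     with audioread.audio_open('audio_file.wav') as f:
#         raw = read_frame_range(f, f.channels, start=100, stop=200)
```

Because the helper accepts any iterable of bytes blocks, it can be tried on synthetic data without an audio file.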
Example 6: Reading Audio as Float32
audioread always delivers audio data as 16-bit signed integers and has no option to change this. If we want 32-bit floating-point numbers, we can convert the decoded samples with NumPy. Let's read an audio file as float32.
import audioread
import numpy as np
# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    num_channels = f.channels
    raw = b''.join(f)
# Decode as int16, one row per frame
samples = np.frombuffer(raw, dtype='<i2').reshape(-1, num_channels)
# Scale to float32 in the range [-1.0, 1.0), one row per channel
audio_data = (samples.astype(np.float32) / 32768.0).T
# Print the first 10 samples from each channel
for channel in audio_data:
    print(channel[:10])
Expected Output (illustrative; actual sample values depend on the file):
[0.001 0.002 0.003 ...]
[0.004 0.005 0.006 ...]
In this example, we convert the int16 samples to float32 and divide by 32768 so that the values fall in the range [-1.0, 1.0). The resulting audio_data array contains the audio samples in floating-point format.
Example 7: Reading Audio Frames with Timestamps
audioread does not attach timestamps to the blocks it yields, but we can derive each block's start time from the number of frames read so far and the sample rate. Let's iterate over the blocks of a file together with their timestamps.
import audioread
# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    bytes_per_frame = f.channels * 2  # 16-bit samples
    frames_read = 0
    for buf in f:
        # Start time of this block, in seconds
        timestamp = frames_read / f.samplerate
        print(f'{timestamp:.3f}', len(buf))
        frames_read += len(buf) // bytes_per_frame
Expected Output (illustrative; block sizes and therefore timestamps depend on the backend and the file):
0.000 4096
0.023 4096
In this example, we keep a running count of the frames decoded so far. Dividing that count by the sample rate gives the start time of each block in seconds, which we print alongside the block's size in bytes.
Example 8: Reading Audio Frames in Blocks
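The size of the blocks that audioread yields is chosen by the backend. To process audio in blocks of a specific size, we can buffer the incoming data and cut it into fixed-size pieces ourselves. Let's read audio in blocks of 1024 frames.
import audioread
block_size = 1024  # frames per block
with audioread.audio_open('audio_file.wav') as f:
    block_bytes = block_size * f.channels * 2  # 2 bytes per 16-bit sample
    pending = b''
    for buf in f:
        pending += buf
        # Emit as many full blocks as are currently buffered
        while len(pending) >= block_bytes:
            block, pending = pending[:block_bytes], pending[block_bytes:]
            print(len(block))
    if pending:
        print(len(pending))  # final, shorter block
Expected Output (illustrative; shown for a stereo file, where 1024 frames occupy 4096 bytes):
4096
4096
In this example, we collect the decoded bytes in a buffer and slice off one block whenever at least 1024 frames are available. Every block except possibly the last contains exactly 1024 frames.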
Example 9: Reading Audio Frames with Sample Rate Conversion
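audioread decodes audio at the file's original sample rate and does not resample. If we need a different sample rate, we can resample the decoded samples with another library. The example below uses resample_poly from SciPy (pip install scipy), which is one possible choice. Let's convert the audio to a sample rate of 22050 Hz.
import audioread
from math import gcd
import numpy as np
from scipy.signal import resample_poly
target_rate = 22050
with audioread.audio_open('audio_file.wav') as f:
    source_rate, num_channels = f.samplerate, f.channels
    raw = b''.join(f)
# Decode to float32, one row per frame
samples = np.frombuffer(raw, dtype='<i2').reshape(-1, num_channels).astype(np.float32) / 32768.0
# Resample along the time axis using the reduced up/down ratio
g = gcd(target_rate, source_rate)
resampled = resample_poly(samples, target_rate // g, source_rate // g, axis=0)
print(resampled[:10])
Expected Output (illustrative; actual sample values depend on the file):
[[0.001 0.001]
 [0.002 0.002]
 ...]
In this example, we decode the whole file, convert it to floating point, and pass it to resample_poly with the ratio of the target rate to the source rate. The resulting array holds the audio resampled to 22050 Hz, with one row per frame.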
Example 10: Reading Audio Frames in Mono
If an audio file has multiple channels and we want to work with it as mono, we can downmix the decoded samples ourselves, since audioread has no option for this. Let's convert a stereo file to mono.
import audioread
import numpy as np
# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    num_channels = f.channels
    raw = b''.join(f)
# One row per frame, one column per channel
samples = np.frombuffer(raw, dtype='<i2').reshape(-1, num_channels)
# Downmix to mono by averaging the channels of each frame
mono = samples.mean(axis=1)
print(mono[:10])
Expected Output (illustrative; actual sample values depend on the file):
[0. 3. 6. ...]
In this example, we reshape the interleaved samples into one column per channel and average across the columns. The result is a one-dimensional floating-point array containing one mono sample per frame, still on the 16-bit integer scale.
Conclusion
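In this tutorial, we explored various examples of using the audioread library in Python. The library itself does one job: it opens a file, reports its duration, sample rate, and channel count, and yields the decoded audio as blocks of raw 16-bit PCM. On top of that, we saw how to turn the blocks into NumPy arrays, extract channels, read a subset of frames, convert to float32, compute timestamps, re-block the stream, resample, and downmix to mono. These examples provide a good starting point for working with audio files in Python using audioread.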