AudioRead Examples - Python

AudioRead How To Examples in Python

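The audioread library provides a simple way to decode audio files to raw PCM data in Python. It delegates decoding to whichever backend is available on the system: the standard library for uncompressed formats such as WAV and AIFF, or GStreamer, Core Audio, MAD, or FFmpeg for compressed formats. Support for formats such as MP3, FLAC, and OGG therefore depends on the installed backend. In this tutorial, we will explore some common use cases of audioread with step-by-step examples.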

Installation

Before we begin, let's make sure we have audioread installed. You can install it using pip:

pip install audioread

Example 1: Reading Audio File Metadata

Let's start by reading the basic properties of an audio file: its number of channels, sample rate, and duration. audioread exposes these as attributes of the opened file object. It does not read tags such as artist or title.

import audioread

# Open the audio file
with audioread.audio_open('audio_file.mp3') as f:
    # Print the number of channels, the sample rate in Hz, and the duration in seconds
    print(f.channels, f.samplerate, f.duration)

Expected Output (values depend on the file):

2 44100 180.123456789

In this example, we open the audio file audio_file.mp3 using audioread.audio_open. Then we read the channels, samplerate, and duration attributes of the file object; duration is given in seconds.
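These three properties are enough to estimate how much memory the decoded audio will need, because audioread decodes to 16-bit PCM (2 bytes per sample). The following is a minimal sketch of that arithmetic; the helper name decoded_size_bytes is hypothetical and not part of audioread.

```python
def decoded_size_bytes(duration, samplerate, channels, bytes_per_sample=2):
    """Estimate the size of decoded PCM audio in bytes."""
    # Number of frames (one sample per channel) in the whole file
    frames = round(duration * samplerate)
    return frames * channels * bytes_per_sample


# A 3-minute stereo file at 44.1 kHz decodes to about 31.8 MB
print(decoded_size_bytes(180.0, 44100, 2))  # → 31752000
```

With a real file, the arguments would come from f.duration, f.samplerate, and f.channels.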

Example 2: Reading Audio Data

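Now let's read the audio data from a file and print the first few samples. Iterating over the file object yields buffers of raw PCM data (16-bit signed little-endian integers, with the channels interleaved), so we join the buffers and use NumPy to interpret them.

import audioread
import numpy as np

# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    # Read all decoded PCM data into a single bytes object
    raw = b''.join(f)

# Interpret the bytes as 16-bit little-endian signed integers
audio_data = np.frombuffer(raw, dtype='<i2')

# Print the first 10 samples
print(audio_data[:10])

Expected Output (illustrative; values depend on the file):

[12 10 15 14 -3 -1 ...]

In this example, we join the buffers produced by iterating over the file object and convert them with np.frombuffer. The resulting audio_data is a one-dimensional NumPy array of int16 samples. For a multi-channel file the channels are interleaved, so a stereo file is stored as left, right, left, right, and so on.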
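If NumPy is not available, the standard library's array module can decode the same data. The sketch below assumes the input is 16-bit little-endian PCM, which is the format audioread produces. The helper name pcm16_to_ints is hypothetical, and the input is synthetic rather than read from a file.

```python
import array
import struct
import sys


def pcm16_to_ints(data):
    """Decode 16-bit little-endian PCM bytes into a list of Python ints."""
    samples = array.array('h')
    # Drop a trailing odd byte, if any, so only whole samples are decoded
    samples.frombytes(data[:len(data) - len(data) % 2])
    if sys.byteorder == 'big':
        # array uses the native byte order, so swap on big-endian machines
        samples.byteswap()
    return samples.tolist()


# Synthetic input standing in for the bytes yielded by audioread
raw = struct.pack('<4h', 0, 1, -1, 32767)
print(pcm16_to_ints(raw))  # → [0, 1, -1, 32767]
```

With a real file, the argument would be the joined buffers, for example b''.join(f) inside the with block.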

Example 3: Iterating over Audio Frames

Sometimes, we may want to process audio data piece by piece instead of loading it all at once. Iterating over the file object yields one buffer of raw PCM bytes at a time.

import audioread

# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    # Iterate over the decoded buffers
    for buf in f:
        # Process each buffer (here, print its size in bytes)
        print(len(buf))

Expected Output (buffer sizes depend on the backend):

4096
4096

In this example, we iterate over the file object using a for loop. Each item is a bytes object holding a chunk of interleaved 16-bit samples rather than a single frame, and the chunk size is chosen by the backend.
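As a concrete case of processing a file one buffer at a time, the sketch below tracks the peak absolute sample value across a stream of buffers without holding the whole file in memory. It assumes the buffers contain 16-bit little-endian PCM, as audioread yields. The helper name peak_amplitude is hypothetical, and the demo feeds it synthetic buffers in place of an open file.

```python
import array
import struct
import sys


def peak_amplitude(buffers):
    """Return the largest absolute 16-bit sample value seen in the buffers."""
    peak = 0
    pending = b''  # leftover byte when a buffer ends in the middle of a sample
    for buf in buffers:
        data = pending + bytes(buf)
        usable = len(data) - len(data) % 2
        pending = data[usable:]
        samples = array.array('h')
        samples.frombytes(data[:usable])
        if sys.byteorder == 'big':
            # array uses the native byte order, so swap on big-endian machines
            samples.byteswap()
        for s in samples:
            peak = max(peak, abs(s))
    return peak


# Synthetic buffers standing in for the buffers yielded by an open file
buffers = [struct.pack('<3h', 10, -200, 30), struct.pack('<2h', 150, -5)]
print(peak_amplitude(buffers))  # → 200
```

With a real file, the open file object can be passed directly, for example peak_amplitude(f) inside the with block.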

Example 4: Extracting Audio Channels

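If an audio file contains multiple channels, we can extract individual channels for further processing. audioread returns the channels interleaved in a single stream, so we decode the data and separate the channels with NumPy. Let's extract the first two channels from an audio file.

import audioread
import numpy as np

# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    # Get the number of channels
    num_channels = f.channels
    raw = b''.join(f)

# Reshape to one row per frame and one column per channel
samples = np.frombuffer(raw, dtype='<i2').reshape(-1, num_channels)

# Extract the first two channels (one row per channel)
channels = samples[:, :2].T

# Print the extracted channels
for channel in channels:
    print(channel)

Expected Output (illustrative; values depend on the file):

[12 15 -3 ...]
[10 14 -1 ...]

In this example, we reshape the interleaved samples into a two-dimensional array and slice out the first two columns. The resulting channels array has one row per extracted channel. For a mono file it contains a single row.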

Example 5: Reading a Subset of Audio Frames

To read only a subset of audio frames from a file, we can stop iterating once enough data has been decoded. A frame holds one sample per channel, so 100 frames occupy 100 * channels * 2 bytes. Let's read the first 100 frames of an audio file.

import audioread
import numpy as np

# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    channels = f.channels
    # Number of bytes in the first 100 frames (2 bytes per sample)
    wanted = 100 * channels * 2
    raw = b''
    for buf in f:
        raw += buf
        if len(raw) >= wanted:
            break

# Keep at most 100 frames, one row per frame
audio_data = np.frombuffer(raw[:wanted], dtype='<i2').reshape(-1, channels)

# Print the first 10 samples from each channel
for channel in audio_data.T:
    print(channel[:10])

Expected Output (illustrative; values depend on the file):

[12 15 -3 ...]
[10 14 -1 ...]

In this example, we accumulate buffers until they cover the first 100 frames and then discard any extra bytes. The resulting audio_data array contains the samples for the requested range, or fewer if the file is shorter than 100 frames.
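Because audioread decodes sequentially and yields raw 16-bit PCM buffers, a range that does not start at the beginning can be selected by counting bytes: skip everything before the first wanted frame and stop after the last. The following is a minimal sketch under that assumption. The helper name read_frame_range is hypothetical, and the demo uses synthetic buffers in place of an open file.

```python
import struct


def read_frame_range(buffers, channels, start, stop, bytes_per_sample=2):
    """Return the raw bytes for frames start..stop-1 from a stream of buffers."""
    frame_size = channels * bytes_per_sample
    first, last = start * frame_size, stop * frame_size
    out = bytearray()
    offset = 0  # byte offset of the current buffer within the stream
    for buf in buffers:
        end = offset + len(buf)
        if end > first:
            # Keep the part of this buffer that overlaps the wanted range
            out += buf[max(first - offset, 0):last - offset]
        offset = end
        if offset >= last:
            break  # everything wanted has been read
    return bytes(out)


# Ten mono samples (0..9) split across three synthetic buffers
data = struct.pack('<10h', *range(10))
buffers = [data[0:8], data[8:16], data[16:20]]
chunk = read_frame_range(buffers, channels=1, start=3, stop=7)
print(struct.unpack('<4h', chunk))  # → (3, 4, 5, 6)
```

Frames before start are still decoded, because audioread offers no seeking; they are simply not kept.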

Example 6: Reading Audio as Float32

audioread always produces 16-bit signed integer samples. If we want 32-bit floating-point numbers, we convert after decoding: dividing by 32768 maps the samples into the range [-1.0, 1.0). Let's read an audio file and convert it to float32.

import audioread
import numpy as np

# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    channels = f.channels
    raw = b''.join(f)

# Convert the 16-bit integers to float32 in the range [-1.0, 1.0)
samples = np.frombuffer(raw, dtype='<i2').reshape(-1, channels)
audio_data = samples.astype(np.float32) / 32768.0

# Print the first 10 samples from each channel
for channel in audio_data.T:
    print(channel[:10])

Expected Output (illustrative; values depend on the file):

[0.001 0.002 0.003 ...]
[0.004 0.005 0.006 ...]

In this example, we cast the decoded samples to float32 with astype and scale them by 1/32768. The resulting audio_data array contains the same audio as normalized 32-bit floating-point numbers.

Example 7: Reading Audio Frames with Timestamps

audioread does not attach timestamps to the buffers it yields, but we can derive the start time of each buffer from the number of bytes read so far. Let's iterate over the buffers of a file together with their timestamps.

import audioread

# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    # Bytes of 16-bit PCM per second of audio
    bytes_per_second = f.samplerate * f.channels * 2
    bytes_read = 0
    for buf in f:
        # Start time of this buffer in seconds
        timestamp = bytes_read / bytes_per_second
        print(round(timestamp, 3), len(buf))
        bytes_read += len(buf)

Expected Output (for a 44100 Hz stereo file decoded in 4096-byte buffers):

0.0 4096
0.023 4096

In this example, we keep a running count of the bytes decoded and divide it by the number of bytes per second of audio. Each iteration prints the start time of the buffer in seconds and the buffer's size in bytes.

Example 8: Reading Audio Frames in Blocks

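The size of the buffers that audioread yields is chosen by the backend. To process audio in blocks of a specific size, we can collect the incoming bytes and cut them into blocks ourselves. Let's read audio in blocks of 1024 frames.

import audioread

block_size = 1024  # frames per block

# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    # Size of one block in bytes (2 bytes per 16-bit sample)
    block_bytes = block_size * f.channels * 2
    pending = b''
    for buf in f:
        pending += buf
        # Emit as many full blocks as are available
        while len(pending) >= block_bytes:
            block, pending = pending[:block_bytes], pending[block_bytes:]
            print(len(block))
    if pending:
        # The final block may be shorter than block_size frames
        print(len(pending))

Expected Output (for a stereo file):

4096
4096

In this example, we accumulate the decoded bytes and slice off one block each time 1024 frames are available. Every block except possibly the last contains exactly 1024 frames, which is 4096 bytes for a stereo file.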

Example 9: Reading Audio Frames with Sample Rate Conversion

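audioread decodes audio at the file's own sample rate and does not resample. If we want the audio at a different sample rate, we convert it after decoding. Let's resample a file to 22050 Hz using linear interpolation with NumPy. This is a simple approach that applies no anti-aliasing filter, so a dedicated resampling library is a better choice when quality matters.

import audioread
import numpy as np

target_rate = 22050

# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    source_rate, channels = f.samplerate, f.channels
    raw = b''.join(f)

samples = np.frombuffer(raw, dtype='<i2').reshape(-1, channels)

# Time of each original frame and of each resampled frame, in seconds
old_times = np.arange(len(samples)) / source_rate
new_times = np.arange(int(len(samples) * target_rate / source_rate)) / target_rate

# Interpolate each channel at the new frame times
resampled = np.stack([np.interp(new_times, old_times, samples[:, c]) for c in range(channels)], axis=1)
print(resampled.shape)

Expected Output (for a one-second stereo file sampled at 44100 Hz):

(22050, 2)

In this example, we compute the time of every frame at the original and target rates and use np.interp to estimate each channel's value at the new times. The resulting resampled array has one row per frame at 22050 Hz and holds float64 values.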

Example 10: Reading Audio Frames in Mono

If an audio file has multiple channels and we want to work with it as mono, we can average the channels after decoding. Let's downmix a stereo file to mono.

import audioread
import numpy as np

# Open the audio file
with audioread.audio_open('audio_file.wav') as f:
    channels = f.channels
    raw = b''.join(f)

# One row per frame, one column per channel
frames = np.frombuffer(raw, dtype='<i2').reshape(-1, channels)

# Average the channels of each frame to get a mono signal
mono = frames.mean(axis=1)
print(mono[:10])

Expected Output (illustrative; values depend on the file):

[11.  14.5 -2.  ...]

In this example, we reshape the interleaved samples so that each row is one frame and then take the mean across the channels. The resulting mono array holds one float64 value per frame, downmixed by averaging the samples from all channels.

Conclusion

In this tutorial, we explored various examples of using the audioread library in Python. We learned how to read audio file metadata, extract audio data, iterate over audio frames, read a subset of frames, and perform sample rate conversion and channel extraction. These examples provide a good starting point for working with audio files in Python using audioread.
