Set "sample rate" attribute on a file Transform using PySoX?

Question

I'm transforming audio files using PySoX:

import sox
tfm = sox.Transformer()
tfm.build('./abc/1.raw', './abc/2.flac')

This is the error I'm getting: "sox.core.SoxError: Stdout: Stderr: sox FAIL formats: bad input format for file `./abc/1.raw': sampling rate was not specified"

How can I specify a sampling rate so that the transform completes?

After reading your comments on another question concerning upvotes I went through your questions and found this one. It deserves an upvote. As I researched the subject this evening I thought I could provide an answer although it is not my field of expertise. Please have a look. — trincot

trincot · Accepted Answer · 2018-01-19T23:32:33

The reason is that raw audio files contain no information about the audio format, so you need to provide it yourself. The sample rate is just one such attribute; you'll need to specify a few other parameters as well.
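To make the difference concrete, here is a small sketch using only the Python standard library (not PySoX). It packs the same samples once as headerless raw bytes and once as a WAV container. The 8000 Hz rate, mono channel and 16-bit width are assumptions chosen for the demo, not values taken from your file.

```python
import io
import math
import struct
import wave

RATE = 8000        # samples per second (assumed for this demo)
CHANNELS = 1       # mono (assumed)
SAMPLE_WIDTH = 2   # bytes per sample, i.e. 16-bit (assumed)

# One second of a 440 Hz sine tone as signed 16-bit little-endian samples.
samples = [int(16000 * math.sin(2 * math.pi * 440 * n / RATE)) for n in range(RATE)]
raw_bytes = struct.pack('<%dh' % len(samples), *samples)

# The same samples wrapped in a WAV container, written to memory.
buf = io.BytesIO()
with wave.open(buf, 'wb') as w:
    w.setnchannels(CHANNELS)
    w.setsampwidth(SAMPLE_WIDTH)
    w.setframerate(RATE)
    w.writeframes(raw_bytes)
wav_bytes = buf.getvalue()

# The WAV data begins with a header that records rate, width and channels;
# the raw data is nothing but the samples themselves.
print(wav_bytes[:4])
print(len(wav_bytes) - len(raw_bytes), 'extra header bytes in the WAV version')

# Reading the WAV back recovers the format without being told anything.
with wave.open(io.BytesIO(wav_bytes), 'rb') as r:
    print(r.getframerate(), r.getnchannels(), r.getsampwidth())
```

Nothing in `raw_bytes` says whether it is 8 kHz mono or 16 kHz stereo, which is why sox refuses to guess.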

Quoted from sox.sourceforge.net:

SoX can work with ‘self-describing’ and ‘raw’ audio files. ‘self-describing’ formats (e.g. WAV, FLAC, MP3) have a header that completely describes the signal and encoding attributes of the audio data that follows. ‘raw’ or ‘headerless’ formats do not contain this information, so the audio characteristics of these must be described on the SoX command line or inferred from those of the input file.

The following four characteristics are used to describe the format of audio data such that it can be processed with SoX:

sample rate

The sample rate in samples per second (‘Hertz’ or ‘Hz’). Digital telephony traditionally uses a sample rate of 8000 Hz (8 kHz), though these days, 16 and even 32 kHz are becoming more common. Audio Compact Discs use 44100 Hz (44.1 kHz). Digital Audio Tape and many computer systems use 48 kHz. Professional audio systems often use 96 kHz.

sample size [...]

data encoding [...]

channels [...]
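As a rough illustration of why these characteristics matter, the sketch below computes how long a raw file plays for a given description. The helper name `raw_duration_seconds` and the 160000-byte file size are made up for the example.

```python
def raw_duration_seconds(num_bytes, rate, bits, channels):
    """Duration of headerless PCM data under the given format description."""
    bytes_per_frame = (bits // 8) * channels  # one sample per channel
    return num_bytes / (bytes_per_frame * rate)

size = 160000  # hypothetical size of a raw file, in bytes

# The same bytes, interpreted with two different sample rates.
at_8k = raw_duration_seconds(size, rate=8000, bits=16, channels=1)
at_16k = raw_duration_seconds(size, rate=16000, bits=16, channels=1)
print(at_8k, at_16k)
```

The same bytes last 10 seconds when read as 8 kHz and 5 seconds when read as 16 kHz, so a wrong rate yields audio that plays at the wrong speed and pitch rather than an error.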

The pysox documentation describes the set_input_format method:

set_input_format(file_type=None, rate=None, bits=None, channels=None, encoding=None, ignore_length=False)

Sets input file format arguments. This is primarily useful when dealing with audio files without a file extension. Overwrites any previously set input file arguments.

If this function is not explicitly called the input format is inferred from the file extension or the file’s header.

Parameters:

file_type : str or None, default=None

The file type of the input audio file. Should be the same as what the file extension would be, for ex. ‘mp3’ or ‘wav’.

rate : float or None, default=None

The sample rate of the input audio file. If None the sample rate is inferred.

[...]

So, you should set the rate as follows:

tfm.set_input_format(file_type='raw', rate=8000, bits=16, channels=1, encoding='signed-integer')

You'll have to adjust the values to match what is really encoded in that raw file. The format settings stay on the transformer, so if you process more than one such raw file with it, there is no need to call the method again. Only when a different raw file has different characteristics do you need to call it again with the appropriate values.
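Putting it together with the code from the question, a minimal sketch might look like the following. The `convert_raw` wrapper and the `RAW_FORMAT` dict are names invented here for illustration, and the format values are placeholders that must match how your file was actually recorded.

```python
# Placeholder description of the raw input; adjust to your real data.
RAW_FORMAT = dict(
    file_type='raw',
    rate=8000,
    bits=16,
    channels=1,
    encoding='signed-integer',
)

def convert_raw(in_path, out_path, fmt=RAW_FORMAT):
    """Convert a headerless raw file using pysox (hypothetical helper)."""
    import sox  # pysox is imported as `sox`; requires the sox binary

    tfm = sox.Transformer()
    tfm.set_input_format(**fmt)   # describe the input before building
    tfm.build(in_path, out_path)  # output format is inferred from the extension

# Example call, using the paths from the question:
# convert_raw('./abc/1.raw', './abc/2.flac')
```

Keeping the description in one dict makes it easy to reuse for several files that share the same format.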
