I am aiming to convert wave data (read from a .wav file with the wave module) to a numpy array.
The data is currently formatted as a byte array, meaning each element is 8 bits wide. The wav file is mono, so it contains only 1 channel. Most wav files are stereo, however, and in that case the data are formatted as a sequence of samples with the left and right channels interleaved.
The samples are 16 bits, so each pair of sequential bytes in the array is one 16-bit sample. Some audio files are 24 bits per sample. The number of bytes per sample (multiply by 8 for the number of bits) can be obtained from
len(bytearray) // (wave.getnframes() * wave.getnchannels())
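A minimal sketch of that check, assuming the file is opened with the standard library's wave module. The in-memory file and the variable names are made up for illustration. The reader object also has a getsampwidth() method that reports the same number of bytes directly.

```python
import io
import wave

# Build a tiny mono, 16-bit WAV in memory so the example is self-contained.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)      # 2 bytes = 16 bits per sample
    w.setframerate(8000)
    w.writeframes(b"\x00\x00\xff\x7f\x00\x80")  # three 16-bit samples

buf.seek(0)
with wave.open(buf, "rb") as w:
    raw = w.readframes(w.getnframes())
    # The formula from the text: this yields bytes per sample, not bits.
    bytes_per_sample = len(raw) // (w.getnframes() * w.getnchannels())
    bits_per_sample = 8 * bytes_per_sample
    # The wave module reports the same value directly.
    sampwidth = w.getsampwidth()
```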
So I need to somehow
- group bytes into pairs of bytes (samples)
- copy the pairs of bytes to some new storage with a "stride". For mono, stride = 0? For stereo, the stride is presumably 1? (It would depend on how Python counts in memory.)
- convert the new storage to a numpy array
- at some point convert from 16 bit signed integer format to floating point format, which could be done at any stage of the process
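The steps above can be sketched with NumPy roughly as follows. This is only a sketch, under the assumption that the samples are 16-bit signed, little-endian PCM; the sample values are made up, and dividing by 32768 is just one common convention for scaling to floating point.

```python
import numpy as np

# Hypothetical stereo, 16-bit, little-endian data laid out as L0, R0, L1, R1.
raw = np.array([100, -100, 200, -200], dtype="<i2").tobytes()

# Group bytes into pairs by interpreting the buffer as 16-bit signed integers.
samples = np.frombuffer(raw, dtype="<i2")

# Pick out each channel with a strided slice (step = number of channels).
nchannels = 2
left = samples[0::nchannels]
right = samples[1::nchannels]

# Convert to floating point, scaled to the range [-1.0, 1.0).
left_f = left.astype(np.float32) / 32768.0
right_f = right.astype(np.float32) / 32768.0
```

In this sketch the slice step equals the channel count, so it would be 1 for mono and 2 for stereo.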
I could implement a C++ style solution using for loops and indices, but I assume this would be very slow in Python.
My guess is that Python (probably) includes some functions for
- conversion between int and float/double formats (perhaps as a numpy array or perhaps elsewhere in the above described process)
- "de-interlacing" data (separating out the left/right channel data from the byte array)
- converting a byte array with a specified format to a numpy array with a specified type
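For reference, the three items above seem to correspond to numpy.frombuffer (bytes to a typed array), reshape (one column per channel) and astype (integer to float). A hedged sketch combining them with the wave module follows. The helper name wav_to_float_array is hypothetical, and the sketch only handles 16-bit samples; 24-bit files would need separate handling.

```python
import io
import wave

import numpy as np


def wav_to_float_array(fileobj):
    """Hypothetical helper: read a 16-bit PCM WAV into a float32 array
    of shape (frames, channels)."""
    with wave.open(fileobj, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        nchannels = w.getnchannels()
        raw = w.readframes(w.getnframes())
    # Interpret the bytes as little-endian int16, one row per frame.
    data = np.frombuffer(raw, dtype="<i2").reshape(-1, nchannels)
    # Convert to float and scale to the range [-1.0, 1.0).
    return data.astype(np.float32) / 32768.0


# Usage: build a two-frame stereo WAV in memory, then read it back.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(2)
    w.setframerate(8000)
    w.writeframes(np.array([16384, -16384, 0, 8192], dtype="<i2").tobytes())
buf.seek(0)
audio = wav_to_float_array(buf)  # audio[:, 0] is left, audio[:, 1] is right
```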
However, I have no idea what these might be or what form they might take. (Builtins? Libraries/Modules?)
This seemed like a problem that should be easily "duckduckgoable", but I had no luck. Perhaps working with wav format data is a bit of a niche application these days?
Even a simple answer with a list of things to type into duckduckgo would be appreciated. I can read/figure out the documentation, I just don't know what to search for.