I need to create an integer sequence from an audio file. I was looking at waveform libraries, since they draw a line graph. But the key information I am after is: what is the source of the integers used to draw the graph? Is it amplitude, frequency, or something else? There are libraries available, but I need to know what unit of information to extract so that I have data I can feed to a graph. Drawing a graph is not my objective, though; I just want the raw integer array.
2 Answers
It's the amplitudes that you need to draw a waveform (oscillogram), and that is how PCM data are stored in WAV files, for example: the samples sit in the data chunk that follows the file header. Note that there are 8-bit and 16-bit formats. In a standard RIFF WAV file, 8-bit samples are unsigned and 16-bit samples are signed little-endian; other containers, such as AIFF, store big-endian samples, so be aware of the byte order of whatever format you read.
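As a minimal sketch of pulling those amplitudes out as plain integers, assuming Python and its standard-library `wave` and `struct` modules (the function name `read_pcm_samples` and the file name in the usage note are made up for illustration), something like this should work for uncompressed 8-bit or 16-bit PCM WAV files:

```python
import struct
import wave


def read_pcm_samples(path_or_file):
    """Return (samples, sample_rate, channels) for an uncompressed PCM WAV.

    Hypothetical helper: handles only 8-bit and 16-bit PCM.
    For stereo files the samples are interleaved (L, R, L, R, ...).
    """
    with wave.open(path_or_file, "rb") as wav:
        channels = wav.getnchannels()
        width = wav.getsampwidth()   # bytes per sample
        rate = wav.getframerate()    # samples per second, per channel
        raw = wav.readframes(wav.getnframes())

    if width == 1:
        # 8-bit WAV samples are unsigned: 0..255, with silence at 128
        samples = list(raw)
    elif width == 2:
        # 16-bit WAV samples are signed little-endian: -32768..32767
        count = len(raw) // 2
        samples = list(struct.unpack("<%dh" % count, raw))
    else:
        raise ValueError("unsupported sample width: %d bytes" % width)

    return samples, rate, channels
```

Usage would look like `samples, rate, channels = read_pcm_samples("input.wav")`, after which `samples` is the raw integer array. Letting the `wave` module parse the header avoids hard-coding a header size.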
Audio is simply a curve. When you plot it with time on the X axis, the Y axis is amplitude, similar to plotting a sine function. Each point on the curve is a number that gets stored in the audio file. In the WAV format this number is typically a 16-bit signed integer, so after the header (44 bytes in the simplest canonical layout, though extra chunks can make it longer) the rest of the file is essentially a sequence of these integers. When the curve varies up and down quickly over time, the frequency is higher than when it varies more slowly. If you download the audio editor Audacity, you can view this curve for any audio file (WAV, MP3, ...).
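To illustrate the point about frequency, here is a rough Python sketch that generates two sine tones as integer amplitudes and counts how often each curve crosses zero. The helper names, the 44100 Hz sample rate, and the amplitude of 10000 are assumptions chosen for the example, not anything taken from a particular file:

```python
import math


def sine_samples(freq_hz, sample_rate=44100, duration_s=1.0, amplitude=10000):
    """Generate integer amplitudes of a pure sine tone (illustrative helper)."""
    n = int(sample_rate * duration_s)
    return [
        round(amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n)
    ]


def count_zero_crossings(samples):
    """Count sign changes between consecutive samples."""
    crossings = 0
    for prev, cur in zip(samples, samples[1:]):
        if (prev >= 0) != (cur >= 0):
            crossings += 1
    return crossings


slow = count_zero_crossings(sine_samples(220))  # lower-pitched tone
fast = count_zero_crossings(sine_samples(880))  # higher-pitched tone
print(slow, fast)
```

A pure tone crosses zero about twice per cycle, so the 880 Hz curve changes sign roughly four times as often as the 220 Hz one. The integers themselves are still amplitudes; frequency only shows up in how quickly they change.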
.wav is pretty simple (amplitudes at fixed intervals). Compressed formats are more complex, but most use some sort of transform (DCT, FFT, etc.) to convert from individual samples to a frequency-based encoding. – Jerry Coffin