2
votes

Usually the subchunk1size of a WAV file is 16. However, I have some WAV files where subchunk1size = 18. I have C++ code that reads WAV files with subchunk1size = 16, and now I want it to read files with subchunk1size = 18 as well. Any help would be appreciated.

typedef struct header_file
{
    char chunk_id[4];
    int chunk_size;
    char format[4];
    char subchunk1_id[4];
    int subchunk1_size;
    short int audio_format;
    short int num_channels;
    int sample_rate;            
    int byte_rate;
    short int block_align;
    short int bits_per_sample;
    char subchunk2_id[4];
    int subchunk2_size;         
} header;

The above is the header_file struct my code uses to read WAV files with subchunk1size = 16.

2
Don't forget to show us your code. - Prashant Kumar
That's the "fmt " chunk for non-PCM data, like µ-law. Don't just assume that sub-chunk is the first one. Basic reference is here. - Hans Passant
If you'd prefer to avoid the pain of parsing WAV files by hand, and also get support for a number of other audio formats "for free", you might check out libsndfile; it allows you to just sf_open() pretty much any audio file and get right to reading the audio samples. ( mega-nerd.com/libsndfile ) - Jeremy Friesner
I use libsndfile to read the audio samples, so I don't have to worry about audio files with different headers or structures. - Kelvin Tan

2 Answers

4
votes

WAV files do not have as rigid a structure as you are expecting. The "fmt " chunk is not necessarily the first to follow the file header (though it usually is), and its size is not necessarily 16 bytes (though again that's often the case). Compressed audio can be stored in a WAV file, in which case the audio_format field will be something other than 1 and the "fmt " chunk can have a size other than 16 bytes.

The proper and flexible way to parse wav files is to use more granular structures:

struct wave_header
{
    char chunk_id[4];
    int chunk_size;
    char format[4];     
};

struct riff_chunk_header
{
    char id[4];
    int size;
};

struct wave_fmt_chunk
{
    short audio_format;
    short num_channels;
    int sample_rate;            
    int byte_rate;
    short block_align;
    short bits_per_sample;
};

Then your parsing logic should be (taking care to validate the data you've read at each step):

  1. Read a wave_header
  2. Read a riff_chunk_header
  3. If the ID of the chunk header you've read is not "fmt ", skip the chunk (you have its size in bytes) and loop back to step 2 to read the next chunk header
  4. Read the audio_format field
  5. Interpret the rest of the "fmt " chunk's data based on this audio_format. If it's 1, you have PCM data and the chunk should have your expected 16 bytes. If it's not 1, you have to find documentation on that compression format.

In general, it's also a good idea to gracefully ignore additional data, so if you do see a PCM-encoded wav file with a "fmt " chunk of 18 bytes, try to ignore the last 2 bytes and see where that gets you.

1
votes

The files do have a rigid format structure. If this structure is not adhered to, files may not play or open for editing by some applications.

To the original question: wave files can be divided into two groups. The first group consists of files with more than 2 channels of audio, a PCM bit depth greater than 16, or both. The second group consists of files that meet neither of those conditions, i.e. 1 or 2 channels with up to 16 bits. Over the years, Microsoft has kludged the structures contained in wav files to accommodate advances in computer audio technology. Specifically, they added a 2-byte field called cbSize to the WAVEFORMATEX structure. This is why you see subchunk1size values of both 16 and 18: the two-byte difference is the presence or absence of the cbSize field. Properly formed modern-day audio files using the current version of WAVEFORMATEX will have a subchunk1size of 18 regardless of channel count or bit depth. Old files created before Microsoft changed the WAVEFORMATEX structure have a subchunk1size of 16.

Here is my policy:

When reading a file, subchunk1size can be either 16 or 18, so the code should adapt accordingly. There are plenty of old wav files with the old format out there, or a modern file could be incorrectly written with the old WAVEFORMATEX structure without the cbSize field.

When creating a wav file, I always use a subchunk1size of 18 regardless of channel count or bit depth because Microsoft has permanently changed the WAVEFORMATEX structure and that makes the file conformant with spec.

Windows Media Player is useful for making sure your wav file can be opened and played.

http://msdn.microsoft.com/en-us/library/windows/desktop/dd390970%28v=vs.85%29.aspx