0
votes

I have an API that gives me PCM wave data:

http://msdn.microsoft.com/en-us/library/ff966424.aspx

The byte[] buffer format used as a parameter for the SoundEffect constructor, Microphone.GetData method, and DynamicSoundEffectInstance.SubmitBuffer method is PCM wave data. Additionally, the PCM format is interleaved and in little-endian.

The audio format has the following constraints:

  • The audio channels can be mono (1) or stereo (2).
  • The PCM wave file must have 16-bits per sample.
  • The sample rate must be between 8,000 Hz and 48,000 Hz.
  • The interleaving for stereo data is left channel to right channel.
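To illustrate the layout described above, here is a minimal sketch of turning such a buffer into plain sample values. It is written in Python rather than C# purely for illustration. The function name `decode_pcm16` and the choice to average stereo down to mono are assumptions, not part of the XNA API.

```python
import struct

def decode_pcm16(buffer, channels=1):
    """Convert interleaved 16-bit little-endian PCM bytes to mono floats in [-1, 1)."""
    if channels not in (1, 2):
        raise ValueError("channels must be 1 (mono) or 2 (stereo)")
    frame_size = 2 * channels                          # 2 bytes per sample per channel
    usable = len(buffer) - len(buffer) % frame_size    # drop a trailing partial frame
    count = usable // 2
    # "<" = little-endian, "h" = signed 16-bit integer
    samples = struct.unpack("<%dh" % count, buffer[:usable])
    if channels == 1:
        return [s / 32768.0 for s in samples]
    # Stereo: samples alternate left, right; average each pair to get mono.
    return [(samples[i] + samples[i + 1]) / 65536.0 for i in range(0, count, 2)]
```

The resulting list of floats is the time-domain signal that any later frequency analysis would work on.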

I would like to do a visualisation based on this data.

I want to split the pitch (frequency) range into three bands and get the volume/level of each.

So if I speak in a low voice, I'd get a high value followed by two low values; if I speak normally, I'd get a low value, a high value, and a low value; and if I speak in a high voice, I'd get two low values followed by a high value.

How can I achieve this? I've never done anything with sound before, so I'm at level 1 and don't know where to start.

1 Answer

4
votes

A full answer would probably be too complex to give here, but you need to take the time-domain PCM sample data and derive its frequency-domain representation, so that you can then assess the level of the signal in different frequency ranges. The standard algorithm for doing this is the Fast Fourier Transform (FFT). Implementing it yourself requires a significant amount of DSP knowledge, so your best approach is probably to find a library that offers an FFT implementation out of the box.
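To give a rough idea of what that looks like, here is a minimal sketch in Python (standard library only, chosen for illustration; the same steps apply in C#). It assumes you already have one window of mono samples as floats and that the window length is a power of two. The names `fft` and `band_levels`, and the band edges of 300 Hz and 2000 Hz, are placeholders I picked, so tune them for your own voice range. In a real project, a library FFT would replace the hand-rolled one.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def band_levels(samples, sample_rate, edges=(300.0, 2000.0)):
    """Return (low, mid, high) energy for one window of mono samples.

    The band edges (in Hz) are arbitrary placeholders, not values from the question.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    # A Hann window reduces spectral leakage at the window boundaries.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    spectrum = fft(windowed)
    levels = [0.0, 0.0, 0.0]
    # Skip DC (k = 0) and Nyquist (k = n/2); bins above n/2 mirror the lower ones.
    for k in range(1, n // 2):
        freq = k * sample_rate / n            # centre frequency of bin k in Hz
        band = 0 if freq < edges[0] else 1 if freq < edges[1] else 2
        levels[band] += abs(spectrum[k]) ** 2  # accumulate energy per band
    return tuple(levels)
```

Calling `band_levels` on each successive window of, say, 1024 samples gives three numbers per window that you can scale and draw as bars.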