
I have a DirectSound application I'm writing in C, running on Windows 7. The application just captures some sound frames, and plays them back. For sanity-checking the capture results, I'm writing out the PCM data to a file, which I can play in Linux using aplay.

Unfortunately, the sound is choppy and sometimes stutters, and it plays at the wrong speed in Linux. Oddly, the capture file is less distorted if the PCM data is not also being played through the playback buffer during capture.

Here's the initialization of my WAVEFORMATEX:

memset(&wfx, 0, sizeof(WAVEFORMATEX)); 
wfx.cbSize = 0;
wfx.wFormatTag = WAVE_FORMAT_PCM; 
wfx.nChannels = 1; 
wfx.nSamplesPerSec = sampleRate; 
wfx.wBitsPerSample = sampleBitWidth;
wfx.nBlockAlign = (wfx.nChannels * wfx.wBitsPerSample) / 8; 
wfx.nAvgBytesPerSec = wfx.nSamplesPerSec * wfx.nBlockAlign;

The sampleRate is 8000, and sampleBitWidth is 16.

I create a capture and play buffer using this same structure, and the capture buffer has 3 notification positions. I start capturing with:

lpDsCaptureBuffer->Start(DSCBSTART_LOOPING);

I then spark off a playback thread that calls WaitForMultipleObjects on the events associated with the notification points. Upon notification, I reset all the events, and copy the 1 or 2 pieces of the capture buffer to a local buffer, and pass those on to a play routine:

void playFromBuff(LPVOID captureBuff,DWORD captureLen) {
  LPVOID playBuff;
  DWORD playLen;
  HRESULT hr;

  hr = lpDsPlaybackBuffer->Lock(0L,captureLen,&playBuff,&playLen,NULL,0L,0L);

  memcpy(playBuff,captureBuff,playLen);
  hr = lpDsPlaybackBuffer->Unlock(playBuff,playLen,NULL,0L);
  hr = lpDsPlaybackBuffer->SetCurrentPosition(0L);
  hr = lpDsPlaybackBuffer->Play(0L,0L,0L);
}

(some error-checking omitted).

Note that the playback buffer has no notification positions. Each time I get a chunk from the capture buffer, I lock the playback buffer starting at position 0.

The capture code, guarded by the WaitForMultipleObjects, looks like:

    lpDsCaptureBuffer->GetCurrentPosition(NULL,&readPos);

    hr = lpDsCaptureBuffer->Lock(...,...,&captureBuff1,&captureLen1,&captureBuff2,&captureLen2,0L);

where the ellipses contain calculations involving the current and previously-seen read positions. I'm omitting those likely-wrong calculations -- I suspect that's where the problem lies.

My notification positions are multiples of 1024. Yet the read positions reported are 1500, 2500, and 3500. So if I see a read position of 1500, does that mean I can read from bytes 0 to 1500? And when I next see 2500, does that mean I should read from 1501 to 2500? Why do those read positions not correspond exactly to my notification positions? What's the right algorithm here?

I've tried the simpler alternative of stopping the capture when the capture buffer is full, without other notification positions. But that means, I think, allowing some sound to escape capture.

1 Answer

My notification positions are multiples of 1024. Yet the read positions reported are 1500, 2500, and 3500. So if I see a read position of 1500, does that mean I can read from bytes 0 to 1500? And when I next see 2500, does that mean I should read from 1501 to 2500? Why do those read positions not correspond exactly to my notification positions? What's the right algorithm here?

The DirectSound API is nowadays a compatibility layer on top of another, "real" audio capture API (on Windows Vista and later, including Windows 7, it is emulated on top of WASAPI). Internally, the underlying capture API fills its own buffers (apparently in multiples of 500 bytes here) and then passes the filled buffers to DirectSound capture, which in turn reports them to you. This explains why you see read positions that are multiples of 500: that is the granularity at which DirectSound itself receives data.
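
To get a feel for that granularity, it can help to convert byte counts into time using the format from the question (8000 Hz, 16-bit, mono). The helper below is only an illustrative sketch: bytes_to_ms is a made-up name, not a DirectSound function, and it simply repeats the nBlockAlign / nAvgBytesPerSec arithmetic from the WAVEFORMATEX setup.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of DirectSound): converts a byte count
   into milliseconds of audio for an uncompressed PCM format. */
double bytes_to_ms(uint32_t bytes, uint32_t samples_per_sec,
                   uint16_t channels, uint16_t bits_per_sample)
{
    /* Same arithmetic as nBlockAlign and nAvgBytesPerSec in WAVEFORMATEX. */
    uint32_t block_align = (uint32_t)channels * bits_per_sample / 8;
    uint32_t avg_bytes_per_sec = samples_per_sec * block_align;

    assert(avg_bytes_per_sec > 0);
    return 1000.0 * bytes / avg_bytes_per_sec;
}
```

With the question's format, the 1000-byte steps between the reported read positions (1500, 2500, 3500) work out to 62.5 ms of audio each, while a 1024-byte notification interval is 64 ms, so the two grids will not normally line up.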

Since you are interested in the captured data, you are right to focus on the read position. When a notification arrives, the read position tells you the offset up to which it is safe to read. Because the capture API is layered, there is some latency: the layers have to pass chunks of data to one another before the data becomes available to you.
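
As a rough sketch of the bookkeeping this implies: remember the read position you consumed up to last time, and on each notification read everything from there up to (but not including) the new read position, wrapping around the end of the circular buffer. The names CaptureSpan and next_capture_span are hypothetical helpers, not DirectSound API, and the 4096-byte buffer size used in the note after the code is an assumption, since the question does not state it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type (not part of DirectSound): the span of newly
   captured bytes in a circular capture buffer. */
typedef struct {
    uint32_t offset; /* where to start reading: the previous read position */
    uint32_t length; /* number of new bytes; may run past the buffer end   */
} CaptureSpan;

/* Returns the span [prev_read_pos, read_pos), measured modulo buf_size.
   read_pos itself is excluded: it is the first byte not yet safe to read.
   Equal positions are treated as "no new data", which assumes the reader
   never falls a whole buffer behind. */
CaptureSpan next_capture_span(uint32_t prev_read_pos,
                              uint32_t read_pos,
                              uint32_t buf_size)
{
    CaptureSpan span;

    assert(buf_size > 0);
    assert(prev_read_pos < buf_size && read_pos < buf_size);

    span.offset = prev_read_pos;
    if (read_pos >= prev_read_pos) {
        span.length = read_pos - prev_read_pos;            /* no wrap */
    } else {
        span.length = buf_size - prev_read_pos + read_pos; /* wrapped */
    }
    return span;
}
```

With an assumed 4096-byte buffer, read positions of 1500 and then 2500 give the spans (offset 0, length 1500) and (offset 1500, length 1000), i.e. bytes 0..1499 and then 1500..2499, not 1501..2500. The offset and length would go in as the first two arguments of lpDsCaptureBuffer->Lock; when the span wraps, Lock hands back the second pointer/length pair for the part at the start of the buffer. After unlocking, store the new read position as the previous one for the next notification.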