2
votes

I've written a program that generates a sine-wave at a user-specified frequency, and plays it on a 96kHz audio channel. To save a few CPU cycles I employ the old trick of pre-rendering a short section of audio into a buffer, and then playing back the buffer in a loop, so that I can avoid calling the sin() function 96000 times per second for the duration of the program and just do simple memory-copying instead.

My problem is efficiently determining what the minimum usable size of this pre-rendered buffer would be. For some frequencies it is easy -- for example, an 8kHz sine wave can be perfectly represented by generating a 12-sample buffer and playing it in a looping, because (8000*12 == 96000). For other frequencies, however, a single cycle of the sine wave requires a non-integral number of samples to represent, and therefore looping a single cycle's worth of samples would cause unacceptable glitching.

For some of those frequencies, however, it's possible to get around that problem by pre-rendering more than one cycle of the sine wave and looping that -- if I can figure out how many cycles are required so that the number of cycles present in the buffer will be integral, while also guaranteeing that the number of samples in the buffer are integral. For example, a sine-wave frequency of 12.8kHz translates to a single-cycle buffer-size of 7.5 samples, which won't loop cleanly, but if I render two consecutive cycles of the sine wave into a 15-sample buffer, then I can cleanly loop the result.

My current approach to solving this issue is brute force: I try all possible cycle-counts and see if any of them result in a buffer size with an integral number of samples in it. I think that approach is unsatisfactory for the following reasons:

1) It's very inefficient. For example, the program shown below (which prints buffer-size results for 480,000 possible frequency values between 0Hz and 48kHz) takes 35 minutes to complete on my 2.7GHz machine. I think there must be a much faster way to do this.

2) I suspect that the results are not 100% accurate, due to floating-point errors.

3) The algorithm gives up if it can't find an acceptable buffer size less than 10 seconds long. (I could make the limit higher, but of course that would make the algorithm even slower).

So, is there any way to calculate the minimum-usable-buffer-size analytically, preferably in O(1) time? It seems like it should be easy, but I haven't been able to figure out what kind of math I should use.

Thanks in advance for any advice!

#include <stdio.h>
#include <math.h>

static const long long SAMPLES_PER_SECOND              = 96000;
static const long long MAX_ALLOWED_BUFFER_SIZE_SAMPLES = (SAMPLES_PER_SECOND * 10);

// Returns the length of the pre-render buffer needed to properly
// loop a sine wave at the given frequence, or -1 on failure.
static int GetNumCyclesNeededForPreRenderedBuffer(float freqHz)
{
   double oneCycleLengthSamples = SAMPLES_PER_SECOND/freqHz;

   for (int count=1; (count*oneCycleLengthSamples) < MAX_ALLOWED_BUFFER_SIZE_SAMPLES; count++)
   {
      double remainder = fmod(oneCycleLengthSamples*count, 1.0);
      if (remainder > 0.5) remainder = 1.0-remainder;
      if (remainder <= 0.0) return count;
   }
   return -1;
}

int main(int, char **)
{
   for (int i=0; i<48000*10; i++)
   {
      double freqHz = ((double)i)/10.0f;
      int numCyclesNeeded = GetNumCyclesNeededForPreRenderedBuffer(freqHz);
      if (numCyclesNeeded >= 0)
      {
         double oneCycleLengthSamples = SAMPLES_PER_SECOND/freqHz;
         printf("For %.1fHz, use a pre-render-buffer size of %f samples (%i cycles, %f samples/cycle)\n", freqHz, (numCyclesNeeded*oneCycleLengthSamples), numCyclesNeeded, oneCycleLengthSamples);
      }
      else printf("For %.1fHz, there was no suitable pre-render-buffer size under the allowed limit!\n", freqHz);
   }
   return 0;
}
1
Another way to avoid calling sin() a lot would be to cache the results. If your audio samples are 8 bit or 12 bit then you could build map of which ranges of phase map to each output amplitude (with no more than 256 or 4096 entries); each sample then requires updating the phase plus one table lookup.Alan Stokes
Alternatively: you're really just rotating round a circle, with a constant incremental angle on each sample. The rotation can be expressed as multiplication by a 2x2 matrix, where you can precompute the coefficients - so the computation for each sample is just a few multiplies and adds. Eventually the errors will build up though.Alan Stokes

1 Answers

1
votes
number_of_cycles/size_of_buffer = frequency/samples_per_second

This implies that if you can simplify your frequency/samples_per_second fraction, you can find the size of your buffer and the number of cycles in the buffer. If frequency and samples_per_second are integers, you can simplify the fraction by finding the greatest common divisor, otherwise you can use the method of continued fractions.

Example:

Say your frequency is 1234.5, and your samples_per_second is 96000. We can make these into two integers by multiplying by 10, so we get the ratio:

frequency/samples_per_second = 12345/960000

The greatest common divisor is 15, so it can be reduced to 823/64000.

So you would need 823 cycles in a 64000 sample buffer to reproduce the frequency exactly.