I need to create a sound analyzer to isolate certain frequencies in a song. For now, I'm interested in bass (60-250 Hz).
I read the signal (IEEE float) and, for each block of 1024 samples, do an FFT and then extract the value corresponding to each frequency.
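To make the per-block processing concrete, here is a minimal sketch in plain Python. It is not my actual code: the 48 kHz sample rate is an assumption (it is what makes 1024-sample blocks come out at roughly 47 per second), the two-tone test signal is made up, and `fft` and `band_magnitude` are illustrative helper names, not functions from any library.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT. len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def band_magnitude(block, sample_rate, f_lo, f_hi):
    """Sum the FFT magnitudes of the bins whose centre lies in [f_lo, f_hi]."""
    n = len(block)
    spectrum = fft(block)
    bin_width = sample_rate / n          # Hz covered by one bin
    k_lo = math.ceil(f_lo / bin_width)   # first bin at or above f_lo
    k_hi = math.floor(f_hi / bin_width)  # last bin at or below f_hi
    return sum(abs(spectrum[k]) for k in range(k_lo, k_hi + 1))

SAMPLE_RATE = 48000  # assumption, not stated in the question
BLOCK = 1024

# Made-up test signal: a 100 Hz bass tone plus a 5000 Hz tone.
signal = [
    math.sin(2 * math.pi * 100 * i / SAMPLE_RATE)
    + math.sin(2 * math.pi * 5000 * i / SAMPLE_RATE)
    for i in range(BLOCK * 4)
]

# One bass value per non-overlapping block of 1024 samples.
bass_per_block = [
    band_magnitude(signal[start:start + BLOCK], SAMPLE_RATE, 60, 250)
    for start in range(0, len(signal) - BLOCK + 1, BLOCK)
]
```

Each entry of `bass_per_block` is one "datapoint" in the sense used here. No window function is applied in this sketch, so some energy from tones outside the band leaks into it.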
What I don't understand is this: I know the FFT (at least the common radix-2 kind) needs a block size that is a power of 2 in order to work. I've seen code using blocks of 512, code using 2048, 4096 and so on.
I've settled on 1024 (which gives me roughly 47 datapoints/second). Am I correct in assuming that using 2048, for instance, will work just the same, giving me roughly 23.5 datapoints/second, and that the only difference is accuracy (and speed of computation, of course)?
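The arithmetic behind those numbers can be written out as a small sketch. It assumes a 48 kHz sample rate (not stated above, but consistent with the "roughly 47" figure) and non-overlapping blocks; `block_stats` is just an illustrative name.

```python
import math

def block_stats(sample_rate, block_size, f_lo=60.0, f_hi=250.0):
    """Return (blocks per second, bin width in Hz, bins inside [f_lo, f_hi])."""
    blocks_per_second = sample_rate / block_size  # non-overlapping blocks
    bin_width_hz = sample_rate / block_size       # spacing between FFT bins
    k_lo = math.ceil(f_lo / bin_width_hz)
    k_hi = math.floor(f_hi / bin_width_hz)
    return blocks_per_second, bin_width_hz, k_hi - k_lo + 1

SAMPLE_RATE = 48000  # assumption

stats_1024 = block_stats(SAMPLE_RATE, 1024)  # (46.875, 46.875, 4)
stats_2048 = block_stats(SAMPLE_RATE, 2048)  # (23.4375, 23.4375, 8)
```

Under these assumptions, doubling the block size halves the number of datapoints per second and halves the bin spacing, so the 60-250 Hz band is covered by 8 bins instead of 4.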
Also, am I required to read blocks aligned to 1024-sample boundaries? For instance, if I simply skip the first 200 floats, will the results end up being very similar? (My tests seem to say yes.)
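A small experiment along the lines of my tests, as a hedged sketch: for a steady tone that falls exactly on an FFT bin, starting the block 200 samples later changes the phase of that bin but not its magnitude. The 48 kHz sample rate and the 187.5 Hz tone are assumptions chosen for the demonstration, and `dft_bin` is an illustrative helper that evaluates a single DFT bin directly instead of running a full FFT. Real music changes over time and rarely sits exactly on a bin, so there the two results would only be similar, not identical.

```python
import cmath
import math

def dft_bin(block, k):
    """Directly evaluate DFT bin k of one block (slow, but fine for one bin)."""
    n = len(block)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(block))

SAMPLE_RATE = 48000  # assumption
BLOCK = 1024
K = 4                                # bin index under test
tone_hz = K * SAMPLE_RATE / BLOCK    # 187.5 Hz, exactly on bin 4

# Steady test tone, long enough for a block that starts 200 samples in.
signal = [math.sin(2 * math.pi * tone_hz * i / SAMPLE_RATE)
          for i in range(BLOCK + 200)]

aligned = dft_bin(signal[0:BLOCK], K)          # block starting at sample 0
shifted = dft_bin(signal[200:200 + BLOCK], K)  # block starting at sample 200

# Same magnitude (about BLOCK / 2 = 512), different phase.
magnitude_gap = abs(abs(aligned) - abs(shifted))
phase_aligned = cmath.phase(aligned)
phase_shifted = cmath.phase(shifted)
```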
LATER EDIT: updated title to make it easier to understand