I have a bit of a hypothetical question to help me understand this concept.
Let's say I captured a mono voice clip at an 8000 Hz sample rate, and it is 4096 bytes of data. Feeding the first 512 bytes (256 samples at 16-bit encoding) through an FFT of size 256 returns 128 values, which I convert to amplitudes. So the frequencies for this output are:
FFT BIN #1
0: 0*8000/256
1: 1*8000/256
.
.
127: 127*8000/256
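To spell out the arithmetic in that list, here is a minimal sketch of the bin-to-frequency mapping as I understand it. The names `SAMPLE_RATE`, `FFT_SIZE` and `bin_frequency` are just mine for illustration and don't come from any library.

```python
SAMPLE_RATE = 8000   # Hz, sample rate of the clip
FFT_SIZE = 256       # samples per FFT (512 bytes of 16-bit audio)

def bin_frequency(k, sample_rate=SAMPLE_RATE, fft_size=FFT_SIZE):
    """Frequency in Hz that FFT bin k corresponds to."""
    return k * sample_rate / fft_size

# Bins 0..127 run from 0 Hz up to just below the 4000 Hz Nyquist limit,
# spaced 8000 / 256 = 31.25 Hz apart.
bin_freqs = [bin_frequency(k) for k in range(FFT_SIZE // 2)]
print(bin_freqs[0], bin_freqs[1], bin_freqs[127])  # 0.0 31.25 3968.75
```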
So far so good, eh? So now I have 3584 bytes of unprocessed data left. I perform another FFT of size 256 on the next 512 bytes of data and get the same number of results. For this second FFT, do I again have frequencies of:
FFT BIN #2:
Example1:
0: 0*8000/256
1: 1*8000/256
.
.
127: 127*8000/256
or
FFT BIN #2
Example2:
128: 128*8000/256
129: 129*8000/256
.
.
255: 255*8000/256
I ask because I would like to plot this amplitude/frequency graph, but I don't understand whether the results of each FFT should be overlaid on the same frequencies, as in Example 1, or spread out over new frequencies, as in Example 2.
Or am I trying to do something that is completely redundant? What I want to accomplish is to find the peak amplitude value of every 30-50 ms time frame, to use for comparison against other sound files.
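To make the goal concrete, here is a rough sketch of what I have in mind. It assumes the Example 1 interpretation (every frame is mapped onto the same 0 to 4000 Hz axis), which is exactly the part I'm unsure about. It uses a naive DFT so it runs without extra libraries, a synthetic 1000 Hz tone in place of my real clip, and 256-sample frames (256 / 8000 = 32 ms). The function names are made up for illustration.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz
FRAME_SIZE = 256    # samples per frame: 256 / 8000 = 32 ms

def dft_magnitudes(frame):
    """Magnitudes of bins 0..N/2-1 from a naive DFT (slow, but no dependencies)."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags

def peak_per_frame(samples, frame_size=FRAME_SIZE, sample_rate=SAMPLE_RATE):
    """Return (peak_frequency_hz, peak_magnitude) for each full, non-overlapping frame."""
    peaks = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        mags = dft_magnitudes(samples[start:start + frame_size])
        k = max(range(len(mags)), key=mags.__getitem__)  # index of the largest bin
        # Assumption: bin k means the same frequency in every frame (Example 1).
        peaks.append((k * sample_rate / frame_size, mags[k]))
    return peaks

# Synthetic stand-in for the 4096-byte clip: 2048 samples of a 1000 Hz tone.
samples = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(2048)]
peaks = peak_per_frame(samples)
for frame_index, (freq, mag) in enumerate(peaks):
    print(frame_index, freq, round(mag, 3))
```

With this input I would expect one peak per frame, eight in total, each at 1000 Hz; with a real voice clip the peak would presumably move from frame to frame.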
If anyone can clear this up for me, I'd be very grateful.