I am using a Knowles SPH0645LM4H-B microphone to acquire data. It outputs 24-bit PCM with 18 bits of precision. The 24-bit PCM data is truncated to 18 bits, because the last 6 bits are always 0 according to the specification. After that, the 18-bit value is stored in a 32-bit unsigned integer and converted to a signed value: an MSB of 0 means it is a positive integer, and an MSB of 1 means it is a negative integer.

After that, I find that all the data is positive, no matter which sound I use for testing. I tested with a dual-frequency tone and ran an FFT. The result is almost right, except that the low-frequency components at about 0-100 Hz are larger than expected. I also reconstructed the sound from the same data that I used for the FFT. The reconstructed sound is almost right, but noisy.

I use a buffer to store the microphone data, which is transferred using DMA. The buffer is:

uint16_t fft_buffer[FFT_LENGTH*4];

The DMA is configured as follows:

DMA_InitStructure.DMA_Channel = DMA_Channel_0;
DMA_InitStructure.DMA_PeripheralBaseAddr = (uint32_t)&(SPI2->DR);
DMA_InitStructure.DMA_Memory0BaseAddr = (uint32_t)fft_buffer;
DMA_InitStructure.DMA_DIR = DMA_DIR_PeripheralToMemory;
DMA_InitStructure.DMA_PeripheralInc = DMA_PeripheralInc_Disable;
DMA_InitStructure.DMA_MemoryInc = DMA_MemoryInc_Enable;
DMA_InitStructure.DMA_PeripheralDataSize = DMA_PeripheralDataSize_HalfWord;
DMA_InitStructure.DMA_MemoryDataSize = DMA_MemoryDataSize_HalfWord;
DMA_InitStructure.DMA_BufferSize = FFT_LENGTH*4;
DMA_InitStructure.DMA_Mode = DMA_Mode_Normal;
DMA_InitStructure.DMA_Priority = DMA_Priority_VeryHigh;
DMA_InitStructure.DMA_FIFOMode = DMA_FIFOMode_Disable;
DMA_InitStructure.DMA_FIFOThreshold = DMA_FIFOThreshold_Full;
DMA_InitStructure.DMA_MemoryBurst = DMA_MemoryBurst_Single;
DMA_InitStructure.DMA_PeripheralBurst = DMA_PeripheralBurst_Single;

I extract the data from the buffer, truncate it to 18 bits, extend it to 32 bits and store it in fft_integer:

int32_t fft_integer[FFT_LENGTH];

fft_buffer stores the original data from one channel plus redundant data from the other channel. Each original sample spans two array elements, e.g. fft_buffer[4] and fft_buffer[5], which are both 16 bits wide. fft_integer stores data from one channel only, with each sample taking 32 bits. This is why fft_buffer has FFT_LENGTH*4 elements: 2 elements hold the data from one channel and 2 elements hold the data from the other. fft_integer only needs FFT_LENGTH elements, because only one channel is stored and an 18-bit sample fits in one int32_t element.

for (t=0;t<FFT_LENGTH*4;t=t+4){
    uint8_t  first_8_bits, second_8_bits, last_2_bits;
    uint32_t store_int;
    /* get the first 8 bits, middle 8 bits and last 2 bits, combine it to a new value */
    first_8_bits = fft_buffer[t]>>8;
    second_8_bits = fft_buffer[t]&0xFF;
    last_2_bits = (fft_buffer[t+1]>>8)>>6;

    store_int = ((first_8_bits <<10)+(second_8_bits <<2)+last_2_bits);

    /* convert it to signed integer number according to the MSB of value
     * if MSB is 1, then set all the bits before MSB to 1
     */
    const uint8_t negative = ((store_int & (1 << 17)) != 0);
    int32_t nativeInt;
    if (negative)
        nativeInt = store_int | ~((1 << 18) - 1);
    else
        nativeInt = store_int;

    fft_integer[cnt] = nativeInt;
    cnt++;
}

The microphone uses an I2S interface and is a single mono microphone, which means that only half of the transmitted data is valid, during half of the transmission time. It runs for about 128 ms and then stops.

This picture shows the data after I converted it to integers. [image]

My question is: why are there large low-frequency components, even though a similar sound can be reconstructed from the data? I'm sure there is no problem in the hardware configuration.

I did an experiment to see what raw data is stored in the buffer, using the following test:

uint8_t a, b, c, d;
for (t=0;t<FFT_LENGTH*4;t=t+4){
    a = (fft_buffer[t]&0xFF00)>>8;
    b = fft_buffer[t]&0x00FF;
    c = (fft_buffer[t+1]&0xFF00)>>8;
    /* set the tri-state to 0 */
    d = fft_buffer[t+1]&0x0000;
    printf("%.2x",a);
    printf("%.2x",b);
    printf("%.2x",c);
    printf("%.2x\n",d);

}

The PCM data looks like the following:

0ec40000
0ec48000
0ec50000
0ec60000
0ec60000
0ec5c000
...    
0cf28000
0cf20000
0cf10000
0cf04000
0cef8000
0cef0000
0cedc000
0ced4000
0cee4000
0ced8000
0cec4000
0cebc000
0ceb4000
....    
0b554000
0b548000
0b538000
0b53c000
0b524000
0b50c000
0b50c000
...

Raw data in Memory:

c4 0e ff 00
c5 0e ff 40
...
52 0b ff c0
50 0b ff c0

I use it as little endian.
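
For illustration, here is a minimal sketch of how the four raw bytes of one dump line map onto two little-endian uint16_t elements of fft_buffer. The helper name le16 is hypothetical and not part of the project code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read one little-endian 16-bit value from raw bytes.
 * bytes[0] is the low byte and bytes[1] is the high byte. */
static uint16_t le16(const uint8_t *bytes)
{
    assert(bytes != NULL);
    return (uint16_t)(bytes[0] | (bytes[1] << 8));
}
```

With the first dump line, c4 0e ff 00, le16 applied to the first two bytes gives 0x0ec4 and applied to the last two bytes gives 0x00ff.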

And what do you expect us to do about that? You don't show the code, can't guarantee the hardware is correct (which is off-topic here), don't provide any relevant information. "with noise" is not even subjective. Quantisation? How many dB? Did you read the datasheet of the involved devices? What is the power supply noise? SNR? - too honest for this site
What is your question? - Clifford
This is my first time on Stack Overflow. Sorry for the blurry description. I have edited the question again. It may be better now. - Willi
How is the data packed in fft_integer? You extract 16 bits from one 32 bit word, and 2 bits from a different 32 bit word. That looks unlikely. - Clifford
The data packed in fft_integer is 8 bits | 8 bits | 2 bits (all the precision there is). My DMA is configured to transfer 16 bits at a time, as shown by the fft_buffer variable, which means that the DMA needs two transfers to receive one 24 bit PCM sample. fft_buffer contains the original data. Only the data from one channel is valid and the other channel is redundant, so to drop the unnecessary data from fft_buffer I store the valid samples in fft_integer. - Willi

1 Answer

The large low-frequency component starting from DC in the original data is due to the large DC offset caused by incorrectly translating the 24 bit two's complement samples to int32_t. DC offset is inaudible unless it causes clipping or arithmetic overflow. There are not really any low frequencies up to 100Hz; that is merely an artefact of the FFT's response to the strong DC (0Hz) element. That is why you cannot hear any low frequencies.
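
A common way to keep a DC offset out of the spectrum is to subtract the mean of the block before running the FFT. A minimal sketch, assuming the samples are already in an int32_t array such as fft_integer; the name remove_dc is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: subtract the mean (the DC offset) from the samples
 * in place. The 64-bit accumulator avoids overflow for realistic lengths. */
static void remove_dc(int32_t *samples, size_t count)
{
    assert(samples != NULL && count > 0);

    int64_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += samples[i];

    /* Integer mean, truncated toward zero. */
    const int32_t mean = (int32_t)(sum / (int64_t)count);

    for (size_t i = 0; i < count; i++)
        samples[i] -= mean;
}
```

Calling remove_dc(fft_integer, FFT_LENGTH) before the FFT would only hide an offset, though. It does not correct a wrong sample conversion.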

Below I have stated a number of assumptions as clearly as possible so that the answer may perhaps be adapted to match the actualité.

Given:

Raw data in Memory:

c4 0e ff 00
c5 0e ff 40
...
52 0b ff c0
50 0b ff c0

I use it as little endian.

and

2 elements are used for data from one channel and 2 element is used for the other channel

and given the subsequent comment:

fft_buffer[0] stores the higher 16 bits, fft_buffer[1] stores the lower 16 bits

Then the data is in fact cross-endian such that for example, for:

c4 0e ff 00

then

fft_buffer[n]   = 0x0ec4 ;
fft_buffer[n+1] = 0x00ff ;

and the reconstructed sample should be:

0x00ff0ec4

then the translation is a matter of reinterpreting fft_buffer as a 32 bit array, swapping the 16 bit word order, then a shift to move the sign-bit to the int32_t sign-bit position and (optionally) a re-scale, e.g.:

c4 0e ff 00 => 0x00ff0ec4
0x00ff0ec4 << 8    = 0xff0ec400
0xff0ec400 / 16384 = 0xfffffc3c (-964)

thus:

// Reinterpret DMA buffer as 32bit samples (unsigned, so that the
// shifts below neither sign-extend nor overflow)
uint32_t* fft_buffer32 = (uint32_t*)fft_buffer ;

// For each even numbered DMA buffer sample...
for( t = 0; t < FFT_LENGTH * 2; t += 2 )
{
    // ... swap 16 bit word order
    uint32_t sample = fft_buffer32 [t] << 16 |
                      fft_buffer32 [t] >> 16 ;

    // ... from 24 to 32 bit 2's complement and rescale to
    //     maintain original magnitude. Copy to single channel
    //     fft_integer array.
    fft_integer[t / 2] = (int32_t)(sample << 8) / 16384 ;
}
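
The shift-and-divide step can be checked in isolation. The following is a minimal sketch, assuming the 24 bit sample sits in the low 24 bits of a 32 bit word and a two's complement target; the function name pcm24_to_int32 and its precision_bits parameter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend a 24 bit two's complement sample held
 * in the low 24 bits of `raw`, then scale it down to `precision_bits`
 * significant bits (18 for this microphone, per the question). */
static int32_t pcm24_to_int32(uint32_t raw, unsigned precision_bits)
{
    assert(precision_bits >= 2 && precision_bits <= 24);

    /* Move bit 23 (the 24 bit sign bit) up to bit 31. The shift is done
     * unsigned; the cast to int32_t assumes a two's complement target. */
    const int32_t widened = (int32_t)(raw << 8);

    /* Divide rather than shift right, so the result does not depend on
     * how the compiler shifts negative values. */
    return widened / ((int32_t)1 << (32 - precision_bits));
}
```

With precision_bits set to 18 the divisor is 16384, and with 24 it is 256, which keeps the full 24 bit magnitude.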