I'm computing the incremental mean of my input data (an array of 6 elements, so I'll end up with 6 means).
This is the code I run every time a new input array is available (I also update the number of samples, etc.):
computing_mean:for(int i=0;i<6;i++){
temp_mean[i]=temp_mean[i] + (input[i]-temp_mean[i])/number_of_samples;
//Possible optimization?
//temp_mean[i]=temp_mean[i] + divide(input[i]-temp_mean[i],number_of_samples);
}
All the data in the code are arrays or single values of the following type:
typedef ap_fixed <36,24,AP_RND_CONV,AP_SAT> decimalNumber;
According to my synthesis report, this loop has a latency of 324 and an iteration latency of 54, caused mainly by the division operation.
Are there any ways I can improve the speed of the division? I tried using hls_math and the divide function, but it doesn't seem to work with my type of data.
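For reference, one commonly suggested restructuring is to divide once per update instead of once per element: compute 1/number_of_samples before the loop and multiply inside it. The sketch below is plain C++, not HLS code. It emulates a value with 12 fractional bits (as in `ap_fixed<36,24>`) using a 64-bit integer, and the names `fixed_t`, `RECIP_FRAC_BITS`, and `update_means` are hypothetical. It truncates instead of applying `AP_RND_CONV` rounding and does not saturate, so results may differ slightly from the original type.

```cpp
#include <cassert>
#include <cstdint>

// Plain-C++ stand-in for a fixed-point value with 12 fractional bits:
// the stored integer is the real value scaled by 2^12. Hypothetical type.
typedef int64_t fixed_t;

// Assumption: the reciprocal is kept with 24 fractional bits, more than the
// data itself, so that 1/N does not lose too much precision for large N.
// With 36-bit data, the 64-bit product below cannot overflow.
const int RECIP_FRAC_BITS = 24;

// Update six running means using one division per call instead of six.
void update_means(fixed_t temp_mean[6], const fixed_t input[6],
                  int64_t number_of_samples) {
    assert(number_of_samples > 0);
    // The only division: 1/N scaled by 2^24, computed once outside the loop.
    const int64_t recip = (int64_t(1) << RECIP_FRAC_BITS) / number_of_samples;
    for (int i = 0; i < 6; i++) {
        const fixed_t diff = input[i] - temp_mean[i];
        // Multiply by the reciprocal, then drop its 24 fractional bits.
        // Assumes an arithmetic right shift for negative values
        // (guaranteed since C++20, typical on mainstream compilers before).
        temp_mean[i] += (diff * recip) >> RECIP_FRAC_BITS;
    }
}
```

With all means at zero, an input of 8.0 (stored as 32768) and `number_of_samples` equal to 4, each mean becomes 2.0 (stored as 8192). Whether a multiply of this width maps to fewer cycles than the divider in a given HLS design is something the synthesis report would have to confirm.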
EDIT 1: I'm including the performance profile from Vivado HLS. I'll add a self-contained, reproducible example later in another edit.
As you can see, the majority of the time is spent in SDIV.
sin() and cos(), or things like sqrt(), division will always be the most painful. There are occasions where you can skip division and do bit-shifting instead, if you're working with integer data and the number you're dividing by is a power of 2, but that's about it. - tadman
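A minimal sketch of the bit-shifting idea from the comment above, in plain C++ with unsigned integers. The helper names `is_power_of_two`, `log2_exact`, and `divide_fast` are hypothetical and only illustrate the technique. Note that in the running-mean loop the divisor grows by one on every update, so this would only apply on the updates where the sample count happens to be a power of 2.

```cpp
#include <cassert>
#include <cstdint>

// True when n is a power of two, i.e. exactly one bit is set.
bool is_power_of_two(uint32_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Position of the single set bit; only meaningful for powers of two.
unsigned log2_exact(uint32_t n) {
    unsigned shift = 0;
    while (n > 1) {
        n >>= 1;
        ++shift;
    }
    return shift;
}

// Divide value by n, replacing the division with a right shift
// whenever n is a power of two. Falls back to '/' otherwise.
uint32_t divide_fast(uint32_t value, uint32_t n) {
    assert(n != 0);
    if (is_power_of_two(n)) {
        return value >> log2_exact(n);
    }
    return value / n;
}
```

For example, `divide_fast(40, 8)` takes the shift path and returns 5, while `divide_fast(40, 6)` falls back to the ordinary division and returns 6.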