I want to calculate percentiles over an ensemble of several large vectors in Python. Is there a more efficient way than concatenating the vectors and passing the resulting huge vector to numpy.percentile?
My idea is to first count the frequency of each distinct value in every vector (e.g. using scipy.stats.itemfreq), then combine those frequency tables across the vectors, and finally calculate the percentiles from the combined counts.
Unfortunately, I haven't been able to find functions either for combining the frequency tables (this is not trivial, since different tables may cover different items) or for calculating percentiles from an item frequency table. Do I need to implement these myself, or can I use existing Python functions? If so, which ones?
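
To make the idea concrete, here is a minimal sketch of the three steps, assuming the vectors are 1-D NumPy arrays with relatively few distinct values (for example integers or quantized data). The helper names `merge_counts` and `percentile_from_counts` are hypothetical, not existing library functions. `np.unique(..., return_counts=True)` stands in for `scipy.stats.itemfreq`, which is deprecated in newer SciPy releases, and the percentile step is meant to mirror the default linear interpolation of `numpy.percentile`.

```python
import numpy as np


def merge_counts(vectors):
    """Build one (values, counts) frequency table from several 1-D arrays.

    Hypothetical helper: each vector is reduced to its own table first,
    so the full concatenated vector is never materialized.
    """
    vals_list, cnts_list = [], []
    for v in vectors:
        vals, cnts = np.unique(v, return_counts=True)
        vals_list.append(vals)
        cnts_list.append(cnts)

    all_vals = np.concatenate(vals_list)
    all_cnts = np.concatenate(cnts_list)

    # Tables may cover different items: collapse repeated values and
    # sum their counts. merged_vals comes back sorted.
    merged_vals, inverse = np.unique(all_vals, return_inverse=True)
    merged_cnts = np.zeros(merged_vals.size, dtype=np.int64)
    np.add.at(merged_cnts, inverse.ravel(), all_cnts)
    return merged_vals, merged_cnts


def percentile_from_counts(values, counts, q):
    """Percentile(s) q (0..100) from a sorted frequency table.

    Hypothetical helper using linear interpolation between the two
    nearest order statistics of the implied sorted sample.
    """
    cum = np.cumsum(counts)           # cum[i] = number of items <= values[i]
    n = cum[-1]                       # total sample size
    h = (n - 1) * np.asarray(q, dtype=float) / 100.0  # fractional rank
    lo = np.floor(h).astype(np.int64)
    hi = np.minimum(lo + 1, n - 1)

    # The k-th order statistic (0-based) is the first value whose
    # cumulative count exceeds k.
    x_lo = values[np.searchsorted(cum, lo, side="right")]
    x_hi = values[np.searchsorted(cum, hi, side="right")]
    return x_lo + (h - lo) * (x_hi - x_lo)


# Usage: compare against the concatenate-then-percentile approach.
rng = np.random.default_rng(0)
vectors = [rng.integers(0, 50, size=1000) for _ in range(3)]
q = [5, 50, 95]

values, counts = merge_counts(vectors)
from_counts = percentile_from_counts(values, counts, q)
reference = np.percentile(np.concatenate(vectors), q)
```

This only saves memory and time when the number of distinct values is much smaller than the total number of elements. For continuous floating-point data, where almost every value is unique, the frequency tables would be about as large as the original vectors.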