1
votes

NumPy provides a function, numpy.cov, that computes the covariance of an array, and it works fine. However, I would like to compute the covariance from a generator to save memory. Is there a way to do this without writing my own covariance function?

1
We might need a little more information about the size of your array etc, to highlight exactly how/why you want to use generators - inspectorG4dget
Well, the question was mostly academic to begin with. I understand that this construction might not be the right way to go in many cases, but that is what I wanted to discuss. - Robert
If you look into the source of cov (github.com/numpy/numpy/blob/v1.9.1/numpy/lib/…), everything passed in is copied and converted to an np.array anyway, so you don't save any memory by passing generators to cov. If you really want generators, I think you're stuck with writing your own function. - RickardSjogren

1 Answer

0
votes

You can use the following implementation:

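from numpy import outer

def gen_cov(g):
    """Sample covariance of the vectors yielded by g, computed in one pass."""
    mean, covariance = 0, 0
    n = 0
    for n, x in enumerate(g, start=1):
        diff = x - mean                  # deviation from the previous running mean
        mean = mean + diff / n           # update the running mean
        # accumulate the sum of outer products of deviations
        covariance = covariance + outer(diff, diff) * (n - 1) / n
    if n < 2:
        raise ValueError("gen_cov needs at least two observations")
    return covariance / (n - 1)          # normalise by n - 1 (sample covariance)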

You may need something other than numpy.outer, depending on what the generator yields; the version above assumes each element is a 1-D vector holding one observation. This is a Python implementation of this answer.
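As a quick sanity check, here is a minimal sketch that compares the streaming result with numpy.cov on random data. The helper `observations`, the seed, and the data shape are made up for illustration, and gen_cov is repeated in a lightly adapted form only so that the snippet runs on its own. It assumes each yielded element is one observation (a row), which is why numpy.cov is called with rowvar=False.

```python
import numpy as np

def gen_cov(g):
    # One-pass sample covariance, same update rule as in the answer above.
    mean, covariance = 0, 0
    n = 0
    for n, x in enumerate(g, start=1):
        diff = x - mean
        mean = mean + diff / n
        covariance = covariance + np.outer(diff, diff) * (n - 1) / n
    if n < 2:
        raise ValueError("gen_cov needs at least two observations")
    return covariance / (n - 1)

def observations(data):
    # Hypothetical helper: yield one observation (row) at a time.
    for row in data:
        yield row

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 3))        # 1000 observations of 3 variables

streamed = gen_cov(observations(data))
reference = np.cov(data, rowvar=False)   # rows are observations here

print(np.allclose(streamed, reference))
```

In this sketch the full array still sits in memory, since it is needed for the comparison. In real use the generator would read observations lazily (from a file, say), so only the running mean and the covariance matrix are kept.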