Could someone explain what is wrong with the code below?

from multiprocessing import Pool
def sq(x):
    yield x**2
p = Pool(2)

n = p.map(sq, range(10))

I am getting the following error:

MaybeEncodingError                        Traceback (most recent call last)
in ()
      5 p = Pool(2)
      6
----> 7 n = p.map(sq, range(10))

/home/devil/anaconda3/lib/python3.4/multiprocessing/pool.py in map(self, func, iterable, chunksize)
    258         in a list that is returned.
    259         '''
--> 260         return self._map_async(func, iterable, mapstar, chunksize).get()
    261
    262     def starmap(self, func, iterable, chunksize=None):

/home/devil/anaconda3/lib/python3.4/multiprocessing/pool.py in get(self, timeout)
    606             return self._value
    607         else:
--> 608             raise self._value
    609
    610     def _set(self, i, obj):

MaybeEncodingError: Error sending result: '[, ]'. Reason: 'TypeError("can't pickle generator objects",)'

Many thanks in advance.

How about changing yield to return? - Shiping
I am trying to avoid storing the values. - Manu
yield would try to save the value, and return will just return the value and forget it. Nevertheless, yield won't work. - Shiping

1 Answer

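You have to use a regular function here, not a generator function. That is, replace yield with return so that sq returns a plain value. Pool.map has to pickle each result to send it back from the worker process, and generator objects can't be pickled, which is exactly what the error message says.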
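To see the underlying problem without involving a pool at all, here is a minimal sketch that tries to pickle the two kinds of result directly. The names sq_gen and can_pickle are hypothetical helpers made up for this illustration; they are not part of multiprocessing.

```python
import pickle

def sq_gen(x):
    # Generator function: calling it returns a generator object, not a number.
    yield x ** 2

def sq(x):
    # Regular function: returns a plain int.
    return x ** 2

def can_pickle(obj):
    """Hypothetical helper: True if pickle.dumps(obj) works, False on TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False

print(can_pickle(sq_gen(3)))  # result of the generator version
print(can_pickle(sq(3)))      # result of the return version
```

The generator version fails for the same reason the pool does: the worker would have to serialize a generator object to hand it back to the parent process.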

Moreover, when trying to create a working version on Windows, I got a strange, repeating error message:

Attempt to start a new process before the current process
has finished its bootstrapping phase.

This probably means that you are on Windows and you have
forgotten to use the proper idiom in the main module:

if __name__ == '__main__':

Literally quoting the comment I got, since it's self-explanatory:

"The error on Windows is because each process spawns a new Python process which interprets the Python file etc., so everything outside the 'if main' block is executed again."

So, to be portable, you have to put the pool code under an if __name__ == "__main__": guard when running this module:

from multiprocessing import Pool

def sq(x):
    return x**2

if __name__=="__main__":
    p = Pool(2)
    n = p.map(sq, range(10))
    print(n)

result:

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Edit: if you don't want to store the values beforehand, you can use imap:

n = p.imap(sq, range(10))

n is now an iterator over the results rather than a list. To consume the values, I force iteration with list() and get the same result as above:

print(list(n))

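Note that imap sends items to the workers one at a time by default (chunksize=1); the documentation notes that for very long iterables a larger chunksize can make the job complete much faster.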
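Since the goal was to avoid storing the values, wrapping the imap result in list() defeats the purpose. Here is a rough sketch, under the assumption that each result can be consumed and then discarded, which loops over the iterator directly and passes an explicit chunksize. The helper sum_of_squares and the values workers=2 and chunksize=4 are arbitrary choices for illustration, not recommendations.

```python
from multiprocessing import Pool

def sq(x):
    return x ** 2

def sum_of_squares(n, workers=2, chunksize=4):
    # Hypothetical helper: consume imap results one by one instead of
    # collecting them into a list first.
    total = 0
    with Pool(workers) as p:
        # chunksize batches the inputs sent to each worker; 4 is arbitrary.
        for value in p.imap(sq, range(n), chunksize=chunksize):
            total += value
    return total

if __name__ == "__main__":
    print(sum_of_squares(10))  # 0 + 1 + 4 + ... + 81 = 285
```

The results still arrive in input order, as with map; only the running total is kept in the parent process.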