python multiprocessing runs serial with certain parameters

Question

I am trying to run some code in parallel, and it seems to work in some cases but not others. The code below runs in parallel when saved_models[item] is None or not present, but runs very slowly and serially when it has data.

Any thoughts? The saved_model object is not that big, and it's different for every single run.

import multiprocessing as mp
from statsmodels.tsa.arima_model import ARIMA

def do_parallel_work(self):
    with mp.Pool(processes=self.max_workers) as pool:
        job_args = [(item,
                     target_col,
                     saved_models[item] if saved_models is not None and item in saved_models else None)
                    for item in items]
        results = pool.map(self.do_work_helper, job_args)

        for result in results:
            if result[1] is not None:
                results_dict[result[0]] = result[1]

def do_work_helper(self, args):
    return self.do_work(*args)

def do_work(self, item, target_cols, saved_model):
    # can't show exactly what this does, but essentially it does something to the effect of:
    my_model = ARIMA()
    # if saved_model is None
    fit_model = my_model.fit(trend='nc', maxiter=1000, disp=0)
    # else
    my_model.predict()
    return item, stuff

I showed you how I call all of the functions except do_parallel_work, but I don't see how that is relevant. All the functions are part of a class. Made a small edit because I forgot the "self" for the do_work function. — cooke
No you haven't. "Call" doesn't mean "give a name to", it means "make the code run inside the function". The code you've posted does absolutely nothing. You haven't called the functions. — roganjosh
OK, well you weren't very clear. How I call a function is not the same as what the function does. I have not shown what the do_work function does. I can't show that exactly but I can provide more detail. — cooke
I don't believe I was unclear. "call" a function is the correct terminology; just like you'd call your friends and expect a response. The code inside a function body will not execute unless you call that function. It can be evaluated in terms of being syntactically correct, but it doesn't actually do something to your data until it's called. — roganjosh

cooke · Accepted Answer · 2019-04-02T13:46:52

My issue was that I was doing this inside a class and not all of the functions were static. As a result, the full class instance was being copied for each worker process, which I did not want. I believe just one sub-function was not static, and that was enough to force the entire instance to be copied.
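
To make the copying concrete, here is a minimal sketch that measures how much data gets pickled for a bound method versus a static one. The `Forecaster` class, its attribute, and both method names are hypothetical stand-ins, not the original code. It only assumes that multiprocessing pickles the callable passed to `pool.map` along with the tasks, so pickled size is a rough proxy for the overhead of shipping work to a process.

```python
import pickle


class Forecaster:
    """Hypothetical stand-in for the class in the question; names are illustrative."""

    def __init__(self, n_cached):
        # Pretend the instance holds a lot of state, e.g. every saved model.
        self.saved_models = {i: list(range(1000)) for i in range(n_cached)}

    def do_work_bound(self, args):
        # Bound method: pickling it also pickles `self` and all of its state.
        item, target_col, saved_model = args
        return item, saved_model is not None

    @staticmethod
    def do_work_static(args):
        # Static method: pickled by reference, so no instance state travels with it.
        item, target_col, saved_model = args
        return item, saved_model is not None


forecaster = Forecaster(n_cached=200)

# Compare what would have to be serialized to hand each callable to a worker.
bound_size = len(pickle.dumps(forecaster.do_work_bound))
static_size = len(pickle.dumps(Forecaster.do_work_static))

print(f"bound method:  {bound_size} bytes")
print(f"static method: {static_size} bytes")
```

Under these assumptions, passing `Forecaster.do_work_static` (or a module-level function) to `pool.map` means each task carries only its own arguments, which is consistent with the slowdown appearing only once the instance held saved models.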
