I am running a Moving Average and SARIMA model for time series forecasting on my machine which has 12 cores.
The moving average model takes 25 min to run on a single core. By using the multiprocessing module, I was able to bring down the running time to ~4 min (by using 8 out of 12 cores). On checking the results of the "top" command, one can easily see that multiprocessing is actually using the 8 cores and the behaviour is as expected.
Moving Average(1 core) -> CPU Usage for Moving Average 1 core
Moving Average(8 core) -> CPU Usage for Moving Average 8 cores
I ran the same routines using the SARIMA model first without using multiprocessing. To my surprise, it was automatically using all the cores/distributing work on all cores. Unlike Moving Average model(Image 1) where I could see the CPU Usage of the process to be 100% for the single process and ~800% cumaltively on using 8 cores, here the CPU Usage for a single core only was fluctuating between 1000%-1200%(i.e all 12 cores). As expected, the multiprocessing module didn't help me much in this case and the results were far worse.
SARIMA(1 core) -> CPU USage Sarima 1 core
SARIMA (8 core) -> CPU Usage Sarima 8 core (Instead of one process using 1200% in this case, some processes go over 100% )
My question is why is the OS automatically distributing work on different cores in case of SARIMA model, while I have to do it explicitly(using multiprocessing)in Moving Average Model. Can it be due to the style of writing the python program?
Some other info:
I am using http://alkaline-ml.com/pmdarima/0.9.0/modules/generated/pyramid.arima.auto_arima.html for SARIMA Tuning.
I am using process queue technique to parallelise the code
SARIMA is taking 9 hrs on 1 core(maxing at 1200% as shown in above images) and more than 24 hrs if I use multiprocessing.
I am new to stackoverflow and will be happy to supplement any other information required. Please let me know if anything is not clear.
Updated: I had raised an issue on the official repo of pyramid package and the author had replied. The same can be accessed here: https://github.com/alkaline-ml/pmdarima/issues/301