58
votes

I have to run jobs on a regular basis on compute servers that I share with others in the department and when I start 10 jobs, I really would like it to just take 10 cores and not more; I don't care if it takes a bit longer with a single core per run: I just don't want it to encroach on the others' territory, which would require me to renice the jobs and so on. I just want to have 10 solid cores and that's all.

I am using Enthought 7.3-1 on Redhat, which is based on Python 2.7.3 and numpy 1.6.1, but the question is more general.

5
I'm pretty sure that numpy doesn't do any multithreading, there is nothing to switch off. - Winston Ewert
set cpu affinity for the processes - jfs
@WinstonEwert: incorrect. Try np.dot with large matrix on multicore cpu. The libraries that it uses may utilize more than one cpu - jfs
Thanks a lot. Now that I know what to search for, I found this other page that seems to answer my question: stackoverflow.com/questions/1575067/… - MasDaddy

5 Answers

37
votes

Set the MKL_NUM_THREADS environment variable to 1. As you might have guessed, this environment variable controls the behavior of the Math Kernel Library which is included as part of Enthought's numpy build.

I just do this in my startup file, .bash_profile, with export MKL_NUM_THREADS=1. You should also be able to do it from inside your script to have it be process specific.

44
votes

Only hopefully this fixes all scenarios and system you may be on.

  1. Use numpy.__config__.show() to see if you are using OpenBLAS or MKL

From this point on there are a few ways you can do this.

2.1. The terminal route export OPENBLAS_NUM_THREADS=1 or export MKL_NUM_THREADS=1

2.2 (This is my preferred way) In your python script import os and add the line os.environ['OPENBLAS_NUM_THREADS'] = '1' or os.environ['MKL_NUM_THREADS'] = '1'.

NOTE when setting os.environ[VAR] the number of threads must be a string! Also, you may need to set this environment variable before importing numpy/scipy.

There are probably other options besides openBLAS or MKL but step 1 will help you figure that out.

18
votes

In case you want to set the number of threads dynamically, and not globally via an environment variable, you can also do:

import mkl
mkl.set_num_threads(2)
10
votes

In more recent versions of numpy I have found it necessary to also set NUMEXPR_NUM_THREADS=1.

In my hands, this is sufficient without setting MKL_NUM_THREADS=1, but under some circumstances you may need to set both.

-8
votes

For me, the solution was simple as I stopped using numpy.dot:

import numpy as np

a = np.random.rand(1e6)
b = np.random.rand(1e6, 10)

# potentially uses multiple threads
dotted = np.dot(a, b)

# single-thread
summed = np.sum(a[:, np.newaxis] * b, axis=0)

assert np.all(dotted == summed)