Python threading vs. multiprocessing in Linux

Question

Based on this question I assumed that creating new process should be almost as fast as creating new thread in Linux. However, little test showed very different result. Here's my code:

from multiprocessing import Process, Pool
from threading import Thread

times = 1000

def inc(a):
    b = 1
    return a + b

def processes():
    for i in xrange(times):
        p = Process(target=inc, args=(i, ))
        p.start()
        p.join()

def threads():
    for i in xrange(times):
        t = Thread(target=inc, args=(i, ))
        t.start()
        t.join()

Tests:

>>> timeit processes() 
1 loops, best of 3: 3.8 s per loop

>>> timeit threads() 
10 loops, best of 3: 98.6 ms per loop

So, processes are almost 40 times slower to create! Why does it happen? Is it specific to Python or these libraries? Or did I just misinterpreted the answer above?

UPD 1. To make it more clear. I understand that this piece of code doesn't actually introduce any concurrency. The goal here is to test the time needed to create a process and a thread. To use real concurrency with Python one can use something like this:

def pools():
    pool = Pool(10)
    pool.map(inc, xrange(times))

which really runs much faster than threaded version.

UPD 2. I have added version with os.fork():

for i in xrange(times):
    child_pid = os.fork()
    if child_pid:
        os.waitpid(child_pid, 0)
    else:
        exit(-1)

Results are:

$ time python test_fork.py 

real    0m3.919s
user    0m0.040s
sys     0m0.208s

$ time python test_multiprocessing.py 

real    0m1.088s
user    0m0.128s
sys     0m0.292s

$ time python test_threadings.py

real    0m0.134s
user    0m0.112s
sys     0m0.048s

Well, the question you linked to is comparing the cost of just calling fork(2) vs. pthread_create(3), whereas your code does quite a bit more. How about comparing os.fork() with thread.start_new_thread()? — Aya
@Aya: I couldn't find any kind of join in thread module to create similar test, but even compared to high-level threading version with os.fork() is still much slower. In fact, it is the slowest one (though additional conditions may affect performance). See my update. — ffriend
You have to use a mutex to wait for the thread if you're using the low-level thread module, which is how the higher-level threading module implements join(). But, if you're just trying to measure the time it takes to create the new process/thread, then you shouldn't be calling join(). See also my answer below. — Aya

Aya Aya · Accepted Answer · 2013-07-02T14:11:06

The question you linked to is comparing the cost of just calling fork(2) vs. pthread_create(3), whereas your code does quite a bit more, e.g. using join() to wait for the processes/threads to terminate.

If, as you say...

The goal here is to test the time needed to create a process and a thread.

...then you shouldn't be waiting for them to complete. You should be using test programs more like these...

fork.py

import os
import time

def main():
    for i in range(100):
        pid = os.fork()
        if pid:
            #print 'created new process %d' % pid
            continue
        else:
            time.sleep(1)
            return

if __name__ == '__main__':
    main()

thread.py

import thread
import time

def dummy():
    time.sleep(1)

def main():
    for i in range(100):
        tid = thread.start_new_thread(dummy, ())
        #print 'created new thread %d' % tid

if __name__ == '__main__':
    main()

...which give the following results...

$ time python fork.py
real    0m0.035s
user    0m0.008s
sys     0m0.024s

$ time python thread.py
real    0m0.032s
user    0m0.012s
sys     0m0.024s

...so there's not much difference in the creation time of threads and processes.

Python threading vs. multiprocessing in Linux

3 Answers