7
votes

Based on this question, I assumed that creating a new process should be almost as fast as creating a new thread in Linux. However, a little test showed a very different result. Here's my code:

from multiprocessing import Process, Pool
from threading import Thread

times = 1000

def inc(a):
    b = 1
    return a + b

def processes():
    for i in xrange(times):
        p = Process(target=inc, args=(i, ))
        p.start()
        p.join()

def threads():
    for i in xrange(times):
        t = Thread(target=inc, args=(i, ))
        t.start()
        t.join()

Tests:

>>> timeit processes() 
1 loops, best of 3: 3.8 s per loop

>>> timeit threads() 
10 loops, best of 3: 98.6 ms per loop

So, processes are almost 40 times slower to create! Why does this happen? Is it specific to Python or to these libraries? Or did I just misinterpret the answer above?


UPD 1. To make it clearer: I understand that this piece of code doesn't actually introduce any concurrency. The goal here is to test the time needed to create a process and a thread. To get real concurrency with Python, one can use something like this:

def pools():
    pool = Pool(10)
    pool.map(inc, xrange(times))

which really runs much faster than the threaded version.


UPD 2. I have added a version with os.fork():

import os

times = 1000

for i in xrange(times):
    child_pid = os.fork()
    if child_pid:
        # parent: wait for the child to finish
        os.waitpid(child_pid, 0)
    else:
        # child: exit immediately
        exit(-1)

Results are:

$ time python test_fork.py 

real    0m3.919s
user    0m0.040s
sys     0m0.208s

$ time python test_multiprocessing.py 

real    0m1.088s
user    0m0.128s
sys     0m0.292s

$ time python test_threadings.py

real    0m0.134s
user    0m0.112s
sys     0m0.048s
3
Well, the question you linked to is comparing the cost of just calling fork(2) vs. pthread_create(3), whereas your code does quite a bit more. How about comparing os.fork() with thread.start_new_thread()? - Aya
@Aya: I couldn't find any kind of join in the thread module to create a similar test, but even compared to the high-level threading version, the os.fork() version is still much slower. In fact, it is the slowest one (though additional conditions may affect performance). See my update. - ffriend
You have to use a mutex to wait for the thread if you're using the low-level thread module, which is how the higher-level threading module implements join(). But, if you're just trying to measure the time it takes to create the new process/thread, then you shouldn't be calling join(). See also my answer below. - Aya
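A minimal sketch of the lock-based wait described in the comment above. It assumes Python 3, where the low-level thread module is named _thread; the helper names (run_and_signal, start_joinable, join) are made up for illustration and are not part of any library.

```python
import _thread  # Python 3 name of the low-level `thread` module


def run_and_signal(done_lock, results, value):
    """Do the work, then release the lock so the waiter can proceed."""
    try:
        results.append(value + 1)
    finally:
        done_lock.release()


def start_joinable(results, value):
    """Start a low-level thread; the returned lock acts as its 'join' handle."""
    done_lock = _thread.allocate_lock()
    done_lock.acquire()  # held until the worker releases it
    _thread.start_new_thread(run_and_signal, (done_lock, results, value))
    return done_lock


def join(done_lock):
    """Block until the worker has released the lock."""
    done_lock.acquire()
    done_lock.release()


results = []
handles = [start_joinable(results, i) for i in range(5)]
for handle in handles:
    join(handle)
print(sorted(results))
```

The waiter blocks on a lock that only the worker releases, which is the same idea a higher-level join() builds on.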

3 Answers

5
votes

The question you linked to is comparing the cost of just calling fork(2) vs. pthread_create(3), whereas your code does quite a bit more, e.g. using join() to wait for the processes/threads to terminate.

If, as you say...

The goal here is to test the time needed to create a process and a thread.

...then you shouldn't be waiting for them to complete. You should be using test programs more like these...

fork.py

import os
import time

def main():
    for i in range(100):
        pid = os.fork()
        if pid:
            #print 'created new process %d' % pid
            continue
        else:
            time.sleep(1)
            return

if __name__ == '__main__':
    main()

thread.py

import thread
import time

def dummy():
    time.sleep(1)

def main():
    for i in range(100):
        tid = thread.start_new_thread(dummy, ())
        #print 'created new thread %d' % tid

if __name__ == '__main__':
    main()

...which give the following results...

$ time python fork.py
real    0m0.035s
user    0m0.008s
sys     0m0.024s

$ time python thread.py
real    0m0.032s
user    0m0.012s
sys     0m0.024s

...so there's not much difference in the creation time of threads and processes.
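For anyone who wants to time only the creation step inside a single script, here is a rough sketch. It assumes Python 3 on a POSIX system (os.fork() is not available on Windows); the function names are hypothetical, and the numbers it prints will vary from machine to machine.

```python
import os
import threading
import time

COUNT = 100  # same number of iterations as fork.py and thread.py above


def time_fork_creation(count):
    """Time `count` os.fork() calls; children exit at once and are reaped later."""
    pids = []
    start = time.perf_counter()
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: leave immediately, skipping interpreter cleanup
        pids.append(pid)
    elapsed = time.perf_counter() - start
    for pid in pids:  # reaping happens outside the timed region
        os.waitpid(pid, 0)
    return elapsed


def time_thread_creation(count):
    """Time `count` Thread.start() calls; joining happens outside the timed region."""
    threads = [threading.Thread(target=time.sleep, args=(0.1,))
               for _ in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    elapsed = time.perf_counter() - start
    for t in threads:
        t.join()
    return elapsed


# Fork first, while the process is still single-threaded.
fork_seconds = time_fork_creation(COUNT)
thread_seconds = time_thread_creation(COUNT)
print("fork:   %.4f s for %d processes" % (fork_seconds, COUNT))
print("thread: %.4f s for %d threads" % (thread_seconds, COUNT))
```

Waiting for the children and threads is deliberately kept outside the timed region, so that only the cost of creating them is measured.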

2
votes

Yes, it is true. Starting a new process (called a heavyweight process) is costly.

As an overview ...

The OS has to (in the Linux case) fork the first process, set up the accounting for the new process, set up the new stack, do the context switch, copy any memory that gets changed, and tear all that down when the new process returns.

The thread just allocates a new stack and thread structure, does the context switch, and returns when the work is done.

... that's why we use threads.
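The memory point in the overview above ("copy any memory that gets changed") can be seen from Python with a small sketch. It assumes Python 3 on a POSIX system; the counter variable and bump() function are made up for the example.

```python
import os
import threading

counter = [0]  # module-level state that a thread or a child process will modify


def bump():
    counter[0] += 1


# A thread shares the parent's memory, so its write is visible here.
t = threading.Thread(target=bump)
t.start()
t.join()
after_thread = counter[0]

# A forked child works on its own copy of the memory; its write stays private.
pid = os.fork()
if pid == 0:
    bump()
    os._exit(0)  # child: exit without running the parent's cleanup handlers
os.waitpid(pid, 0)
after_fork = counter[0]

print(after_thread, after_fork)  # prints "1 1": the child's increment is not seen
```

The thread's increment shows up in the parent, while the forked child's increment does not, because the child modified its own copy of the page.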

1
votes

In my experience there is a significant difference between creating a thread (with pthread_create) and forking a process.

For example, I created a C test similar to your Python test, with thread code like this:

pthread_t thread; 
pthread_create(&thread, NULL, &test, NULL); 
void *res;
pthread_join(thread, &res);

and process forking code like this:

pid_t pid = fork();
if (!pid) {
  test(NULL);
  exit(0);
}         
int res;
waitpid(pid, &res, 0);

On my system the forking code took about 8 times as long to execute.

However, it's worth noting that the Python implementation is even slower: for me it was about 16 times as slow. I suspect that is because, in addition to the regular overhead of creating a new process, there is also extra Python overhead associated with the new process.