User level thread programming - multiple kernel threads?

Question

Many programming languages, like C++ (POSIX library) and Java, provide the ability to work with user-level threads. However, if all these user-level threads run in a single kernel thread, you only get the illusion of multiprogramming or multiprocessing (if multiple processors are available), right? I mean, all these threads still run in the same kernel thread. Am I right in saying that? And if so, how exactly do we expect to get a performance improvement from user-level threads?

EDIT: I guess a performance gain would not really be possible on a many-to-one model (user-level threads to kernel thread mapping). So in a many-to-many model, performance improvement is possible only if a kernel-level thread forks off. So my question is: even if user-level threads have low overhead, I cannot really envisage a performance improvement as great as the one from scheduling kernel-level threads.

EDIT2: Essentially, this is what I am trying to get verified: "Assume a computer has 4 processors. Now, say my program is the only thing running, and it has forked 4 threads, each of which does completely independent things. If the mapping is one-to-one (user to kernel mapping), I can actually get a perfect 4 times speedup. However, if (for some reason) all 4 user threads map to the same kernel thread, then there is no speedup from multiprocessing. This is because, even though I have 4 user-level threads, they run in the same kernel thread and cannot be split across 4 cores."

User level threads that run in a single kernel thread? There's no operating system that I know that works that way. Nor could I conceive of one. Document your question better. — Hans Passant
Look at en.wikipedia.org/wiki/GNU_Portable_Threads. There is a model called many-to-one that maps multiple user-level threads to a single kernel thread. — Hari
Hmm, better known as "fiber" or "co-routine". Nobody uses them anymore because they suck on modern CPU cores. Very poor CPU cache locality. — Hans Passant
My comment is: they are not worse off because of poor CPU cache locality. Rather, they are not intended for multi-core processing because of their mapping model (many-to-one). — Hari
In fact, that is precisely the downside of the many-to-one model: if one user-level thread blocks, the entire kernel thread blocks. — Hari

Roman Pietrzak · Accepted Answer · 2011-08-19T17:35:51

No, you are simply not right.

In most cases, using the POSIX library for C/C++ or the Thread implementation in Java to create and run threads means that the underlying user-space implementation runs real threads, each backed by its own kernel thread, within the memory space of one process. That means running 4 threads on a machine with 4 CPUs gives you a real 4x speedup, provided everything is properly written and the OS is not somehow prevented from balancing the threads across the CPUs.

I said "most cases" because there can always be an implementation of the POSIX library (e.g. a debug or incomplete implementation) or of Java threads (e.g. an incomplete VM or an exotic setup) that does not run real threads and only simulates them. But in a standard PC environment, you can be sure: "No, you are simply not right" :)
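
If you want to check the mapping yourself, here is a minimal sketch. It assumes Linux, where `syscall(SYS_gettid)` returns the ID of the kernel thread the caller is running on, and a compiler invocation along the lines of `g++ -std=c++17 -pthread`. The helper names `kernel_thread_ids` and `all_distinct` are made up for this illustration and are not part of any library.

```cpp
#include <atomic>
#include <set>
#include <thread>
#include <vector>

#include <sys/syscall.h>  // SYS_gettid (Linux-specific assumption)
#include <unistd.h>       // syscall

// Hypothetical helper: start n std::thread workers and return the
// kernel thread ID (TID) that Linux reports for each of them.
std::vector<long> kernel_thread_ids(int n) {
    std::vector<long> tids(n);
    std::atomic<int> arrived{0};
    std::vector<std::thread> workers;
    workers.reserve(n);

    for (int i = 0; i < n; ++i) {
        workers.emplace_back([&tids, &arrived, i, n] {
            // Each worker writes only its own slot, so there is no data race.
            tids[i] = syscall(SYS_gettid);
            ++arrived;
            // Stay alive until every worker has reported, so the kernel
            // cannot hand a finished worker's TID to a later one.
            while (arrived.load() < n) {
                std::this_thread::yield();
            }
        });
    }
    for (auto& t : workers) {
        t.join();
    }
    return tids;
}

// Hypothetical helper: true if every worker reported a different TID.
bool all_distinct(const std::vector<long>& ids) {
    std::set<long> unique(ids.begin(), ids.end());
    return unique.size() == ids.size();
}
```

With a one-to-one implementation, `all_distinct(kernel_thread_ids(4))` should come back true, meaning the scheduler is free to place the 4 workers on 4 different cores. Under a purely many-to-one library, every worker would be expected to report the same TID instead.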
