pthread vs. kthread in Linux kernel v2.6+

Question

This is a conceptual question.

According to this post, pthreads are actually implemented using the clone() system call. So we can infer that there is a kernel thread (or a light-weight process) backing each pthread in user space. The kernel is aware of each pthread and can schedule it like a process.

As for kthread, according to Robert Love, kthreads are also created with the clone() system call:

clone(CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND, 0);

So pthreads and kthreads both use the clone() call. My first question is:

Is there a difference between these two kinds of threads?

To answer my own question, I read on:

The significant difference between kernel threads and normal processes is that kernel threads do not have an address space (in fact, their mm pointer is NULL).

Is this one difference? I mean, a thread created by pthread_create() shares the address space with the normal process. In contrast, a kthread does not have its own address space. Is that correct?

What else is different?

askb · Accepted Answer · 2014-12-21T15:02:10

In contrast, a kthread does not have its own address space. Is that correct?

Yes

a thread created by pthread_create() shares the address space with the normal process.

Related: kernel: how to find all threads from a process's task_struct

pthreads: pthread_create() is used in user space, where multiple threads within your application share the same process address space. To use it, you need to link your program against the pthread library. pthreads provide multi-threading at the application level, that is, in user space. Internally, pthread_create() translates into a clone() syscall, which gives every application thread its own struct task_struct in the kernel.

kthreads: Examples of kernel threads include those that flush disk caches, service softirqs, and write back dirty buffers. These threads run only in kernel space and have no access to user-space virtual memory; they use only kernel addresses above PAGE_OFFSET, so the mm field in their task descriptor (current->mm) is always NULL. Internally, the kernel_thread() API translates into do_fork() within the kernel. Kernel threads are created asynchronously, either when the init process comes up or when a kernel module is loaded (for example, a file system).
