2
votes

How does a single JVM that creates one thread per CPU core differ from multiple JVMs that each create one thread per CPU core, given that all the JVMs run on one physical machine and share the same CPU? In other words, how does a multithreaded Java program running 8 threads in parallel compare with the same multithreaded program running in 8 different JVMs sharing the same CPU?

Below are some ways I found to implement parallel processing with threads, but I could not understand the essential differences between them.

Approach one: A single thread periodically queries the database for changes and, whenever changes occur, starts (long-running) threads in parallel that work on the changed data. (The work involves arithmetic and persisting the result to a database.)

Approach two: Multiple threads query the database for changes and lock the modified data; each thread then starts a thread (from a thread pool) that processes the changed data.

Approach three: Multiple threads, each running in a different JVM as a separate process, query the database, lock the changed records they find, and start threads to process the changed data. Each JVM has its own thread pool, whose maximum size is the number of CPU cores.

Is the third approach in any way better than the other two? If so (or if not), why? (Because the monitoring threads run in different JVMs, can each of them create as many threads as there are CPU cores? For example, on an 8-core CPU, create 8 monitoring threads in separate JVMs (as separate processes), with each of them submitting the change jobs to a thread pool of 8? But doesn't this argument fail, since there are only 8 physical cores and the processor can run only 8 threads at any time?)

Is there any other effective way to implement this scenario?

1
Typically, running multiple JVMs on one machine will not make an application faster. - ControlAltDel
Does that also down vote a question? - Vijay Veeraraghavan
Downvoted and voted for closure based on lack of specific details. It's not enough to say "threads" like "ah, I've heard threads improve everything!" We'd need to know the details of what the processing is. - ControlAltDel
You should consider whether your question is even JVM-specific. I mean a C program can use multiple threads just as well. So do you want to know the difference between parallel C code and Java? Or is your question more general than that? Or specific to some kind of workload. You probably should rethink your question and split it into different aspects of limited scope each. - the8472
I updated the question that includes what the work is. - Vijay Veeraraghavan

1 Answer

3
votes

I think your question boils down to a choice between:

  • Processing with one thread in one process, end of story.
  • Processing with multiple threads in one process.
  • Processing with multiple threads in multiple processes.

If your goals are to saturate the CPU with as much work as possible and to finish your processing as fast as possible, then the answer is usually the second option: multiple threads in one process.
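As a rough illustration of multiple threads in one process, here is a minimal sketch of a single JVM with one fixed thread pool sized to the number of cores. The class name `ChangeProcessor`, the `process` method, and the sum-of-squares arithmetic are hypothetical stand-ins for your real work; the database polling and persistence are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChangeProcessor {
    // Hypothetical stand-in for the real work: CPU-bound arithmetic on one change.
    static long process(int changeId) {
        long sum = 0;
        for (int i = 1; i <= changeId; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    // Processes all changes on one pool with as many workers as there are cores.
    static List<Long> processAll(List<Integer> changeIds) {
        int workers = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int changeId : changeIds) {
                Callable<Long> task = () -> process(changeId);
                futures.add(pool.submit(task));
            }
            // Collect results in submission order.
            List<Long> results = new ArrayList<>();
            for (Future<Long> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(processAll(List.of(1, 2, 3)));
    }
}
```

With CPU-bound tasks, a pool of this size can already keep every core busy, so starting 8 such JVMs on an 8-core machine would mostly add threads that compete for the same cores. If the tasks spend much of their time waiting on the database, a somewhat larger pool in the same JVM may help, but that is something to measure rather than assume.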

Multiple threads in multiple processes doesn't buy you much, and has several downsides:

  • If all threads are in the same process, then they can use slim mutexes/locks (intraprocess mutexes/locks), which have significantly better performance than interprocess mutexes/locks. Multiple processes means using kernel-provided locking primitives, which are typically much slower.

  • If all threads are in the same process, they can all access the same memory, and have all of their memory pooled together. Having everything in one heap helps data locality, and better locality can improve CPU cache performance. Additionally, if you had to share data between the threads in multiple processes, you would need to use shared memory (which is cumbersome in Java) or message passing (which duplicates data, wasting CPU and RAM).
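To make the two points above concrete, here is a small sketch in which several worker threads update one shared total on the common heap, guarded by an ordinary in-process `ReentrantLock`. The class name `SharedHeapDemo` and the sum-of-squares workload are assumptions made for illustration only.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class SharedHeapDemo {
    // Adds id*id for id = 1..tasks into one shared total, using a pool of worker threads.
    static long sumOfSquares(int tasks) {
        final long[] total = {0};                       // lives on the heap shared by all threads
        final ReentrantLock lock = new ReentrantLock(); // in-process lock, no IPC involved
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        for (int i = 1; i <= tasks; i++) {
            final long id = i;
            pool.execute(() -> {
                long square = id * id;                  // the "work", done outside the lock
                lock.lock();
                try {
                    total[0] += square;                 // short critical section
                } finally {
                    lock.unlock();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Read under the same lock so the final value is safely visible.
        lock.lock();
        try {
            return total[0];
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(100));
    }
}
```

Doing the same across separate JVMs would mean replacing the array and the lock with something external, such as database row locks, files, or sockets, and copying the data between processes.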

The main benefit of using multiple processes is that you can easily perform privilege separation; a crash in one process also does not take down the others.