75
votes

I just finished reading this post: What's the advantage of a Java-5 ThreadPoolExecutor over a Java-7 ForkJoinPool? and felt that the answer is not straight enough.

Can you explain in simple language and examples, what are the trade-offs between Java 7's Fork-Join framework and the older solutions?

I also read the Google's #1 hit on the topic Java Tip: When to use ForkJoinPool vs ExecutorService from javaworld.com but the article doesn't answer the title question when, it talks about api differences mostly ...

6

6 Answers

65
votes

Fork-join allows you to easily execute divide and conquer jobs, which have to be implemented manually if you want to execute it in ExecutorService. In practice ExecutorService is usually used to process many independent requests (aka transaction) concurrently, and fork-join when you want to accelerate one coherent job.

42
votes

Fork-join is particularly good for recursive problems, where a task involves running subtasks and then processing their results. (This is typically called "divide and conquer" ... but that doesn't reveal the essential characteristics.)

If you try to solve a recursive problem like this using conventional threading (e.g. via an ExecutorService) you end up with threads tied up waiting for other threads to deliver results to them.

On the other hand, if the problem doesn't have those characteristics, there is no real benefit from using fork-join.


References:

25
votes

Java 8 provides one more API in Executors

static ExecutorService  newWorkStealingPool()

Creates a work-stealing thread pool using all available processors as its target parallelism level.

With addition of this API,Executors provides different types of ExecutorService options.

Depending on your requirement, you can choose one of them or you can look out for ThreadPoolExecutor which provides better control on Bounded Task Queue Size, RejectedExecutionHandler mechanisms.

  1. static ExecutorService newFixedThreadPool(int nThreads)

    Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue.

  2. static ScheduledExecutorService newScheduledThreadPool(int corePoolSize)

    Creates a thread pool that can schedule commands to run after a given delay, or to execute periodically.

  3. static ExecutorService newCachedThreadPool(ThreadFactory threadFactory)

    Creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available, and uses the provided ThreadFactory to create new threads when needed.

  4. static ExecutorService newWorkStealingPool(int parallelism)

    Creates a thread pool that maintains enough threads to support the given parallelism level, and may use multiple queues to reduce contention.

Each of these APIs are targeted to fulfil respective business needs of your application. Which one to use will depend on your use case requirement.

e.g.

  1. If you want to process all submitted tasks in order of arrival, just use newFixedThreadPool(1)

  2. If you want to optimize performance of big computation of recursive tasks, use ForkJoinPool or newWorkStealingPool

  3. If you want to execute some tasks periodically or at certain time in future, use newScheduledThreadPool

Have a look at one more nice article by PeterLawrey on ExecutorService use cases.

Related SE question:

java Fork/Join pool, ExecutorService and CountDownLatch

7
votes

Brian Goetz describes the situation best: https://www.ibm.com/developerworks/library/j-jtp11137/index.html

Using conventional thread pools to implement fork-join is also challenging because fork-join tasks spend much of their lives waiting for other tasks. This behavior is a recipe for thread starvation deadlock, unless the parameters are carefully chosen to bound the number of tasks created or the pool itself is unbounded. Conventional thread pools are designed for tasks that are independent of each other and are also designed with potentially blocking, coarse-grained tasks in mind — fork-join solutions produce neither.

I recommend reading the whole post, as it has a good example of why you'd want to use a fork-join pool. It was written before ForkJoinPool became official, so the coInvoke() method he refers to became invokeAll().

6
votes

Fork-Join framework is an extension to Executor framework to particularly address 'waiting' issues in recursive multi-threaded programs. In fact, the new Fork-Join framework classes all extend from the existing classes of the Executor framework.

There are 2 characteristics central to Fork-Join framework

  • Work Stealing (An idle thread steals work from a thread having tasks queued up more than it can process currently)
  • Ability to recursively decompose the tasks and collect the results. (Apparently, this requirement must have popped up along with the conception of the notion of parallel processing... but lacked a solid implementation framework in Java till Java 7)

If the parallel processing needs are strictly recursive, there is no choice but to go for Fork-Join, otherwise either of executor or Fork-Join framework should do, though Fork-Join can be said to better utilize the resources because of the idle threads 'stealing' some tasks from busier threads.

6
votes

Fork Join is an implementation of ExecuterService. The main difference is that this implementation creates DEQUE worker pool. Where task is inserted from oneside but withdrawn from any side. It means if you have created new ForkJoinPool() it will look for the available CPU and create that many worker thread. It then distribute the load evenly across each thread. But if one thread is working slowly and others are fast, they will pick the task from the slow thread. from the backside. The below steps will illustrate the stealing better.

Stage 1 (initially):
W1 -> 5,4,3,2,1
W2 -> 10,9,8,7,6

Stage 2:
W1 -> 5,4
W2 -> 10,9,8,7,

Stage 3:
W1 -> 10,5,4
W2 -> 9,8,7,

Whereas Executor service creates asked number of thread, and apply a blocking queue to store all the remaining waiting task. If you have used cachedExecuterService, it will create single thread for each job and there will be no waiting queue.