6
votes

I was reading about the Task Parallel Library and the article said:

In the .NET Framework 4, tasks are the preferred API for writing multi-threaded, asynchronous, and parallel code

But it also says they use the ThreadPool behind the scenes. What I'm having difficulty figuring out is if Tasks should only be used when you'd use a ThreadPool (and so "Thread versus Task" would be equivalent to "Thread versus ThreadPool"), or if Microsoft intended for Tasks to be used anywhere multiple threads are required, without the considerations inherent to the "Thread versus ThreadPool" dilemma.

So, should Tasks be used anywhere multiple threads are required?

3 Answers

4
votes

The design advantage of using tasks is that you hand the nitty-gritty of threading over to the runtime, which can presumably do that work with a less buggy, better-optimized implementation than hand-written code. I know certain Task-based paradigms, such as PLINQ, allow you to hint at which strategy the runtime should adopt, so that the question of "to ThreadPool or not to ThreadPool" could be handled directly.
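To give an idea of what such hints look like in PLINQ, here is a minimal sketch. The class and method names are made up for illustration, and the degree of parallelism of 2 is an arbitrary value. Note that these particular hints control whether and how widely the query is parallelized; they are not a direct "use the ThreadPool or not" switch.

```csharp
using System;
using System.Linq;

public static class PlinqHints
{
    // Sums the squares of 1..n, passing two hints to PLINQ.
    public static long SumOfSquares(int n)
    {
        return Enumerable.Range(1, n)
            .AsParallel()
            // Hint: run in parallel even if PLINQ would otherwise
            // decide that sequential execution is cheaper.
            .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
            // Hint: use at most two concurrent workers (arbitrary value).
            .WithDegreeOfParallelism(2)
            .Select(i => (long)i * i)
            .Sum();
    }
}
```

Calling PlinqHints.SumOfSquares(4) adds 1 + 4 + 9 + 16. How the work is spread over threads is still left to the runtime.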

The switch to this model is analogous to the switch from a language that requires you to clean up your own memory to a managed, garbage-collected one. There will always be arguments in favour of manual control, but garbage collection has become so well optimized that it's practically a non-issue. Ideally, the runtime's scheduling mechanism for Tasks will evolve and get better in the same way. So in theory, your application written and compiled for .NET 4 could get faster with better implementations of the runtime, without further recompilation. Also, threading code is notoriously hard to get right, so any mechanism that hides those details is good for the programmer.

Whether those benefits outweigh potential detriments, such as edge cases that the runtime doesn't deal with well, is something that should be considered case-by-case. I would certainly try to not optimize early here, though.

1
votes

You can use TaskCreationOptions.LongRunning as a hint to tell the TPL that your task might be more involved than what the ThreadPool is tuned for. But, yes, the TPL does seem to be the preferred method for multithreaded programming going forward. Microsoft is even building on top of it to support the new async and await keywords which are proposed in the Async CTP. That does not mean you have to abandon the old-style Thread and ThreadPool APIs altogether. However, I am personally finding that the TPL does most of what I want with a more elegant API, and I tend to rely on it almost exclusively now.
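A minimal sketch of passing that hint. The class name, method name and the 10 ms sleep are invented for illustration; the sleep just stands in for slow or blocking work. What the scheduler does with the hint is up to the scheduler, although the default one typically gives such a task its own thread instead of tying up a pool thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningDemo
{
    // Runs a hypothetical slow loop as a task flagged LongRunning
    // and returns how many iterations completed.
    public static int CountTicks(int ticks)
    {
        Task<int> worker = Task.Factory.StartNew(() =>
        {
            int count = 0;
            for (int i = 0; i < ticks; i++)
            {
                Thread.Sleep(10); // stand-in for slow or blocking work
                count++;
            }
            return count;
        },
        CancellationToken.None,
        TaskCreationOptions.LongRunning, // the hint discussed above
        TaskScheduler.Default);

        return worker.Result; // blocks until the task finishes
    }
}
```

The calling code looks the same as for any other task, which is the point: the option is only a hint, not a different API.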

0
votes

A Task is a higher-level abstraction than a thread or a ThreadPool. Essentially you package a function in a Task and ask the runtime to execute it as best it can. You can have many dozens if not hundreds of Tasks that will be executed by a limited number of threads.

Using tasks, a developer creates as many tasks as needed, chains them to create flows (workflows in F#) and controls their cancellation, without bothering with how threads are allocated or used. It is up to the runtime to select the best way to execute all the tasks using the limited number of threads.
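A rough sketch of both ideas, chaining and cancellation, using only Task.Factory.StartNew, ContinueWith and CancellationTokenSource. The class, the method names and the work done inside the tasks are invented for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskFlowDemo
{
    // Chains two steps: compute a value, then format it.
    public static string ComputeThenFormat(int input)
    {
        Task<int> compute = Task.Factory.StartNew(() => input * 2);
        // The continuation runs only after 'compute' has finished.
        Task<string> format = compute.ContinueWith(t => "result=" + t.Result);
        return format.Result;
    }

    // Starts a polling worker, cancels it, and reports whether the
    // task ended in the Canceled state.
    public static bool CancelWorker()
    {
        var cts = new CancellationTokenSource();
        CancellationToken token = cts.Token;

        Task worker = Task.Factory.StartNew(() =>
        {
            for (int i = 0; i < 1000; i++)
            {
                // Cooperative cancellation: the task checks the token itself.
                token.ThrowIfCancellationRequested();
                Thread.Sleep(10);
            }
        }, token);

        cts.Cancel();
        try { worker.Wait(); }
        catch (AggregateException) { /* expected when the task is canceled */ }
        return worker.IsCanceled;
    }
}
```

Nowhere in this code is a thread created or picked; the code only describes the work and the order it has to run in.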

Tasks make the implementation of concurrent programming patterns much easier. The ParallelExtensionsExtras library provides methods to convert BeginXXX/EndXXX pairs to tasks that can be chained, as well as a preliminary version of the Task iteration idiom that the Async CTP uses to provide the async/await syntax. You can use a concurrent collection of tasks (for example a ConcurrentQueue<T>) to create a queue of jobs, similar to the AsyncCall example in ParallelExtensionsExtras, or go even further and create agents similar to those in Scala, Erlang or F#. The Dataflow library in the Async CTP is yet another example of what you can create with tasks.
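As an illustration of the Begin/End conversion, here is a minimal sketch. It uses the TaskFactory.FromAsync method that ships with the framework rather than the helpers from the library mentioned above, and the class and method names are made up.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class FromAsyncDemo
{
    // Wraps the Stream.BeginRead/EndRead pair in a Task<int> that
    // completes with the number of bytes read.
    public static Task<int> ReadAsTask(Stream stream, byte[] buffer)
    {
        return Task<int>.Factory.FromAsync(
            stream.BeginRead, stream.EndRead,
            buffer, 0, buffer.Length,
            null); // no state object needed for this sketch
    }
}
```

The returned task can then be chained with ContinueWith like any other, for example to process the buffer once the read completes.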

You have to keep in mind that while a typical laptop has 2 cores and a typical desktop has 4, small servers already have 8 or more cores and soon they will have many more. Keeping all those cores busy by manually scheduling threads AND avoiding blocking can become a major headache.