X-Post from Lightbend's Discuss Forum: https://discuss.lightbend.com/t/actor-message-allocation-to-dispatcher-thread/6314
Before I even get started, the short answer is "don't worry about it". I understand the curiosity, but at the micro level it's going to be indeterminate, and at the macro level the only things you need to care about are in the Dispatcher documentation (such as the differences between regular dispatchers, pinned dispatchers, and the fork-join and thread-pool executors) and in the ordering guarantees described in the message ordering documentation.
Also, disclaimer, I don't claim to be an expert on the internals of the dispatchers: I'm just an end user. But I am procrastinating a bit and figured I'd share some of my tuning observations and poke around the Akka source code a bit for fun. For more detailed answers, you should look at the source as well. Most of the answers you are looking for are going to be in the akka-actor/src/main/scala/akka/dispatch folder.
With that disclaimer out of the way, let me answer your second question first.
"Whether [the dispatcher] will check for idle actors mailbox also for the messages?"
The dispatcher doesn't actually check for anything: it's entirely reactive. (As we will see later.) It certainly doesn't waste any time checking empty mailboxes. This is why a single dispatcher can scale to millions of actors.
Your more generic question "how dispatcher thread will pick the actor?" is much harder to answer in a simple way. There are so many types of dispatchers, and just about every answer I can give you will have an exception. (For example, the aforementioned pinned dispatcher, where threads are dedicated to specific actors, and the CallingThreadDispatcher, which is designed for testing and runs all invocations on the current thread.) But let me talk about the typical dispatchers under typical circumstances.
Dispatchers don't pick actors. In the typical case, a dispatcher is just an interface to a Java ExecutorService. The typical scenario looks like:
- You add a message to the mailbox of an Actor. (From a dispatcher's perspective we have an inverted view of the world: we interact with the mailbox, not the actor.)
- If the mailbox isn't already scheduled (which it may already be if it has messages in it), the mailbox goes to its dispatcher and schedules itself.
- The dispatcher goes to the underlying ExecutorService (let's say a ForkJoinPool) and enqueues a task to process the mailbox.
- A Java ForkJoinPool is a complicated piece of scheduling machinery and I don't claim to be an expert. But the short version is that each worker thread has its own queue and can "steal" tasks from other threads' queues when its own queue is empty. The Java implementation can also dynamically adjust the number of threads it is using, up to the parallelism limit. This is why I said that "at the micro level" it's indeterminate. Work stealing with a dynamic number of threads is very efficient, but it isn't deterministic.
- At some point the task for the mailbox containing the message will be selected by the executor and its Runnable will be run.
- The Runnable is going to first process system messages in the mailbox and then regular messages. There are all kinds of exceptions here too, like priority mailboxes, stashing, throughput limits, etc., but in general the mailbox will process messages (using the actor's behaviors) until the mailbox is empty or one of the throughput limits is reached. Note that the task is tied to the mailbox and not to the message.
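To make the steps above a bit more concrete, here is a minimal sketch in plain Java. To be clear, this is not Akka's actual code: `MailboxSketch`, `Mailbox`, `tell`, `scheduleIfNeeded`, `throughput` and `runDemo` are all names I made up for illustration, and it ignores system messages, supervision, and everything else the real mailbox handles. It only shows the shape of the idea: the mailbox is the Runnable, it schedules itself on an ExecutorService only when it has work and isn't already scheduled, and it processes at most `throughput` messages per run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

public class MailboxSketch {

    /** Hypothetical stand-in for an actor's mailbox; not Akka's real class. */
    static final class Mailbox<T> implements Runnable {
        private final Queue<T> messages = new ConcurrentLinkedQueue<>();
        private final AtomicBoolean scheduled = new AtomicBoolean(false);
        private final ExecutorService executor;
        private final Consumer<T> behavior; // stands in for the actor's behavior
        private final int throughput;       // max messages handled per run

        Mailbox(ExecutorService executor, Consumer<T> behavior, int throughput) {
            this.executor = executor;
            this.behavior = behavior;
            this.throughput = throughput;
        }

        /** Enqueue a message, then make sure a task exists to process it. */
        void tell(T message) {
            messages.add(message);
            scheduleIfNeeded();
        }

        /** Submit this mailbox to the executor unless it is empty or already scheduled. */
        private void scheduleIfNeeded() {
            if (!messages.isEmpty() && scheduled.compareAndSet(false, true)) {
                executor.execute(this);
            }
        }

        /** One task run: drain up to `throughput` messages, then yield the thread. */
        @Override
        public void run() {
            try {
                for (int i = 0; i < throughput; i++) {
                    T message = messages.poll();
                    if (message == null) {
                        break; // mailbox is empty
                    }
                    behavior.accept(message);
                }
            } finally {
                scheduled.set(false);
                scheduleIfNeeded(); // reschedule only if messages remain
            }
        }
    }

    /** Sends messageCount integers from one sender and returns them in processing order. */
    public static List<Integer> runDemo(int messageCount, int throughput) {
        ExecutorService executor = new ForkJoinPool(4);
        // Safe to use a plain list: only one run of the mailbox is active at a time.
        List<Integer> processed = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(messageCount);
        Mailbox<Integer> mailbox = new Mailbox<>(executor, msg -> {
            processed.add(msg);
            done.countDown();
        }, throughput);
        try {
            for (int i = 0; i < messageCount; i++) {
                mailbox.tell(i);
            }
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for messages");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(runDemo(10, 3));
    }
}
```

Two things to notice in the sketch. An idle mailbox never touches the executor, so nothing ever has to poll it. And which pool thread handles a given run is up to the ForkJoinPool, yet messages from the single sender still come out in the order they were sent, because at most one run of the mailbox is active at any time.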
The above is oversimplified and ignores some of the edge cases and performance optimizations, but that is the 30,000 foot view.
I hope that helps, because I'm aware that it's both filled with exceptions (mailboxes and dispatchers are designed to be flexible) and complicated. But Akka is highly optimized and insanely efficient. If any of the Akka devs want to step in and tell me where I got sloppy with my description, feel free. But the net result is where I started: there are multiple layers of abstraction here, so the only ordering guarantees you get are the ones documented, but overall system throughput stays very high even if the work accomplished per message is small and the number of messages is huge.