I'm currently looking into Fault Tolerance and Supervisor strategies in Akka (Java version).

I've been reading http://doc.akka.io/docs/akka/2.3.2/java/fault-tolerance.html and http://doc.akka.io/docs/akka/2.3.2/general/supervision.html#supervision

A few questions:

1) Should we ever use try/catch blocks in our actors when we know what kind of exceptions to expect? Why or why not? If not, should we depend on a supervisor strategy to effectively handle exceptions that a child might throw?

2) By default, if no supervisor is configured explicitly in a parent actor, it looks like any child actor who throws an exception will be restarted by default. What if none of your actors in your entire system carry state...Should we really be doing restarts?

3) What if one of your top-level actors created by system.actorOf( ... ) throws an exception? How do you provide a supervisor strategy outside of the actor system?

4) Let's assume a scenario in which actor A has a child actor B. Now let's say Actor A asks Actor B to do some work.

Some code might look like this:

Future<Object> future = Patterns.ask(child, message, timeout);
future.onComplete(new OnComplete<Object>() {

    @Override
    public void onComplete(Throwable failure, Object result) throws Throwable {
        // ... handle here (failure is non-null if the ask failed or timed out)
    }
}, getContext().dispatcher());

Now... what if actor A somehow throws an exception? By default it is restarted by its supervisor. The question is, does the onComplete "closure" still get executed sometime in the future, or is it effectively "wiped out" on the restart?

5) Let's assume I have a hierarchy such as A->B->C. Let's also assume that I override preRestart so that I effectively do NOT stop my children. In A's preStart it calls getContext().actorOf(B), and in B's preStart it calls getContext().actorOf(C). If A throws an exception, will more than one actor B and more than one actor C now exist in the system?

Thanks!

1 Answer

This is going to be a pretty long answer, but let me tackle your points as orderly as possible.
Also, I will rely on the official Akka documentation, as I believe Akka to be one of the best documented projects out there and I don’t want to reinvent the wheel. :)

  1. A good introduction to how fault tolerance works in Akka is [1]; I think that article sums up several pages of the Akka docs quite well. To answer your question specifically: it depends. You can certainly try/catch exceptions, but the Error Kernel Pattern says you should push anything that can fail down the actor hierarchy, to prevent or at least limit the loss of state within actors. That said, if you have a very specific exception and you know how to handle it as part of processing a message, I don’t see any intrinsic problem with catching it. In fact, there is at least one case where you do want to catch exceptions and handle them: if your actor is replying to a Patterns.ask, you need to wrap the exception in a Status.Failure and send it back to the sender if you want the caller to be notified ([2]); otherwise the caller’s future only fails with a timeout.

  2. As stated in [3], the default behaviour is indeed Restart, but only when an Exception is thrown during message processing. Note that ActorInitializationException and ActorKilledException will, by default, stop the child instead, and keep in mind that any Exception thrown within preStart is wrapped in an ActorInitializationException. As to whether Restart is a sound default “in case you don’t have state in your actors”: well, an actor is, by definition, an abstraction for safely accessing and manipulating state in a concurrent environment. If you don’t have state, you might as well use Futures instead of actors. In general, Restart was deemed a safe and reasonable default for the typical use case. In your specific case (which is not a typical use case for an actor system), you can override the default supervision strategy anyway.

  3. Top-level actors are top level only from the “user” point of view. As explained in [4], any top-level actor is created as a child of the Guardian actor, which applies a normal default supervision strategy. You can change that default with the property akka.actor.guardian-supervisor-strategy. Also, keep in mind that you should always design your systems with Akka's hierarchical nature ([5]) in mind, and hence not use too many top-level actors ([6]).

  4. Whether onComplete’s callback will be called or not depends on when A fails. If A fails after B has completed and responded to A’s request, then the callback might execute. Otherwise it will not: it is “wiped out” along with the old A instance.

  5. This is a bit confusing, but I will assume the following:

    • When you say “A throws an Exception”, you mean within message processing (onReceive)
    • You have a field in your actor that will store the ref returned by getContext().actorOf(C).

The quick answer is: yes. Given the scenario you describe, there will be multiple instances of B and C. The new instance of A will not know that, however: it will have a reference to the new B and, indirectly, the new C. This is reasonable and expected, because you have manually and explicitly disabled a default piece of cleanup logic that handles failures in an actor hierarchy (by overriding preRestart). It is now your responsibility to clean up, and the preStart implementation you describe does not do it.
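
To make the duplication in point 5 concrete, here is a toy model in plain Java. It is not Akka code and every name in it (RestartModel, Context, NaiveActor, CarefulActor, childrenAfterRestart) is made up for illustration. It only assumes what the docs describe about a restart: the old instance gets preRestart, a fresh instance is created, and the fresh instance gets postRestart, which by default calls preStart again, while the children belong to the context and not to the instance.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a restart (NOT Akka): the context outlives actor instances and owns the children. */
final class RestartModel {

    /** Hypothetical stand-in for an actor's context; it survives restarts. */
    static class Context {
        final List<String> children = new ArrayList<>();

        String actorOf(String kind) {
            String name = kind + "-" + children.size();
            children.add(name);
            return name;
        }
    }

    /** preStart always creates a child, and preRestart does not stop existing children. */
    static class NaiveActor {
        final Context context;
        String child;

        NaiveActor(Context context) { this.context = context; }

        void preStart() { child = context.actorOf("B"); }

        void preRestart() { /* overridden so that children are NOT stopped */ }

        void postRestart() { preStart(); } // mirrors the default: run preStart again
    }

    /** Same actor, but postRestart reuses the surviving child instead of calling preStart. */
    static class CarefulActor extends NaiveActor {
        CarefulActor(Context context) { super(context); }

        @Override
        void postRestart() { child = context.children.get(0); }
    }

    /** Simulates a single restart and returns how many children exist afterwards. */
    static int childrenAfterRestart(boolean careful) {
        Context ctx = new Context();
        NaiveActor old = careful ? new CarefulActor(ctx) : new NaiveActor(ctx);
        old.preStart();    // first start creates one child
        old.preRestart();  // failure: children are kept alive
        NaiveActor fresh = careful ? new CarefulActor(ctx) : new NaiveActor(ctx);
        fresh.postRestart();
        return ctx.children.size();
    }

    public static void main(String[] args) {
        System.out.println("naive:   " + childrenAfterRestart(false)); // prints "naive:   2"
        System.out.println("careful: " + childrenAfterRestart(true));  // prints "careful: 1"
    }
}
```

In real Akka code, the move that corresponds to CarefulActor would be to override postRestart so that it does not call preStart again. Treat the sketch above as a way to reason about the lifecycle, not as a drop-in implementation.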