I've been working with Akka for some time, but I am now exploring its actor system in depth. I know there are the thread-pool executor, the fork-join executor, and the affinity-pool executor. I know how a dispatcher works and the rest of the details. By the way, this link gives a great explanation:

https://scalac.io/improving-akka-dispatchers

However, when I experimented with a simple actor call and switched execution contexts, I always got roughly the same performance. I run 60 requests simultaneously, and the average execution time is around 800 ms just to return a simple string to the caller.

I'm running on a Mac with 8 cores (Intel i7 processor).

So, here are the execution contexts I tried:

thread-poll {
  type = Dispatcher
  executor = "thread-pool-executor"
  thread-pool-executor {
    fixed-pool-size = 32
  }
  throughput = 10
}

fork-join {
  type = Dispatcher
  executor = "fork-join-executor"
  fork-join-executor {
    parallelism-min = 2
    parallelism-factor = 1
    parallelism-max = 8
  }
  throughput = 100
}

pinned {
  type = Dispatcher
  executor = "affinity-pool-executor"
}

So, my questions are:

  1. Is there any way to get better performance in this example?
  2. What is the role of actor instances? Why does their number matter, given that the dispatcher schedules a thread (from the execution context) to run the actor's receive method on the next message from the actor's mailbox? Isn't the actor's receive method then just a callback? When does the number of actor instances come into play?
  3. I have some code that executes a Future. If I run it directly from the main file, it executes around 100-150 ms faster than when I put it in an actor and execute the Future from there, piping its result to the sender. What is making it slower?

If you have a real-world example that explains this, it is more than welcome. I have read some articles, but they were all theoretical. When I try something on a simple example, I get unexpected results in terms of performance.

Here is the code:

object RedisService {
  case class Get(key: String)
  case class GetSC(key: String)
}

class RedisService extends Actor {
  private val host = config.getString("redis.host")
  private val port = config.getInt("redis.port")

  var currentConnection = 0

  val redis = Redis()

  implicit val ec = context.system.dispatchers.lookup("redis.dispatchers.fork-join")

  override def receive: Receive = {
    case GetSC(key) => {
      val sen = sender()

      sen ! ""
    }
  }
}

Caller:

    val as = ActorSystem("test")
    implicit val ec = as.dispatchers.lookup("redis.dispatchers.fork-join")

    val service = as.actorOf(Props(new RedisService()), "redis_service")

    var sumTime = 0L
    val futures: Seq[Future[Any]] = (0 until 4).flatMap { index =>
      terminalIds.map { terminalId =>
        val future = getRedisSymbolsAsyncSCActor(terminalId)

        val s = System.currentTimeMillis()
        future.onComplete {
          case Success(r) => {
            val duration = System.currentTimeMillis() - s
            logger.info(s"got redis symbols async in ${duration} ms: ${r}")
            sumTime = sumTime + duration
          }
          case Failure(ex) => logger.error(s"Failure on getting Redis symbols: ${ex.getMessage}", ex)
        }

        future
      }
    }

    val f = Future.sequence(futures)


    f.onComplete {
      case Success(r) => logger.info(s"Mean time: ${sumTime / (4 * terminalIds.size)}")
      case Failure(ex) => logger.error(s"error: ${ex.getMessage}")
    }

The code is pretty basic, just to test how it behaves.

1 Answer

It's a little unclear to me what you're specifically asking, but I'll take a stab.

Suppose your dispatcher(s) allow m actors to be processing messages simultaneously. (If what the actors do is CPU- or memory-bound rather than IO-bound, m is also limited by the actual number of cores available, and that gets hazier the more virtualization (thank you, oversubscribed host CPU...) and containerization (thank you, share- and quota-based cgroup limits) come into play.) If you rarely or never have more than n actors with a message to handle, and m > n, then trying to increase parallelism via dispatcher settings won't gain you anything. (Note that here any task scheduled on the dispatcher(s), e.g. a Future callback, is effectively the same thing as an actor.)
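To make the m versus n point concrete, here is a minimal sketch that uses only the Scala standard library rather than Akka. The object name ParallelismSketch and the helper elapsedMillis are made up for illustration, and Thread.sleep stands in for whatever work a message handler would do. Exact timings will vary by machine, so none are listed here.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ParallelismSketch {
  // Hypothetical helper: runs `tasks` sleeping jobs on a fixed pool of
  // `poolSize` threads and returns the wall-clock time in milliseconds.
  def elapsedMillis(poolSize: Int, tasks: Int, workMillis: Long): Long = {
    val pool = Executors.newFixedThreadPool(poolSize)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val start = System.nanoTime()
      // Each Future plays the role of one actor that has a message to handle.
      val jobs = (1 to tasks).map(_ => Future { Thread.sleep(workMillis) })
      Await.result(Future.sequence(jobs), 30.seconds)
      (System.nanoTime() - start) / 1000000L
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit = {
    // n = 2 runnable tasks: going from m = 2 to m = 32 threads should not help.
    println(s"2 tasks,  2 threads: ${elapsedMillis(2, 2, 200)} ms")
    println(s"2 tasks, 32 threads: ${elapsedMillis(32, 2, 200)} ms")
    // n = 8 runnable tasks on m = 2 threads: now the pool is the limit.
    println(s"8 tasks,  2 threads: ${elapsedMillis(2, 8, 200)} ms")
  }
}
```

The same reasoning applies to the question's setup: a single RedisService actor handles one message at a time, so there is at most one runnable actor no matter how many threads the dispatcher has.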

n in the previous paragraph is obviously at most the number of actors in the application or in the dispatcher, depending on the scope we want to look at, so in general more actors implies more parallelism, which implies more core utilization. Actors are cheap relative to threads, so a profusion of them isn't a huge matter for concern. As an aside on scope: every dispatcher beyond two (one for actors and futures that don't block and one for those that do) is an increasingly strong smell. If you're on Akka 2.5, it's probably a decent idea to adopt some of the 2.6 changes around default dispatcher settings and to run things like remoting/cluster in their own dispatcher so they don't get starved out. Note also that Alpakka Kafka uses its own dispatcher by default. I wouldn't count those against the two.
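As a rough illustration of the "two dispatchers" idea, the sketch below keeps the default dispatcher for non-blocking work and adds one dedicated dispatcher for blocking calls. It assumes Akka classic is on the classpath. The names blocking-io-dispatcher, BlockingWorker and TwoDispatchersSketch are invented for this example, the pool size of 16 is arbitrary, and Thread.sleep stands in for a blocking Redis call.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import com.typesafe.config.ConfigFactory
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical actor that blocks its thread while handling a message.
class BlockingWorker extends Actor {
  override def receive: Receive = {
    case key: String =>
      Thread.sleep(50) // stands in for a blocking client call
      sender() ! s"value-for-$key"
  }
}

object TwoDispatchersSketch {
  // One extra dispatcher for blocking work; everything else stays on the default.
  private val config = ConfigFactory.parseString(
    """
      |blocking-io-dispatcher {
      |  type = Dispatcher
      |  executor = "thread-pool-executor"
      |  thread-pool-executor { fixed-pool-size = 16 }
      |  throughput = 1
      |}
      |""".stripMargin).withFallback(ConfigFactory.load())

  def lookup(key: String): String = {
    val system = ActorSystem("two-dispatchers", config)
    implicit val timeout: Timeout = Timeout(3.seconds)
    try {
      // Only the blocking actor is moved off the default dispatcher.
      val worker = system.actorOf(
        Props(new BlockingWorker).withDispatcher("blocking-io-dispatcher"),
        "blocking_worker")
      Await.result((worker ? key).mapTo[String], timeout.duration)
    } finally system.terminate()
  }

  def main(args: Array[String]): Unit =
    println(lookup("some-key"))
}
```

The point of the split is isolation: threads parked in blocking calls belong to their own pool, so they cannot starve the actors and futures running on the default dispatcher.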

Singleton actors (whether at node or cluster (or even, in really extreme cases, entity) level) can do a lot to limit overall parallelism and throughput: the one-message-at-a-time restriction can be a very effective throttle (sometimes that's what you want, often it's not). So don't be afraid to create short-lived actors that do one high-level thing (they can definitely process more than one message) and then stop (note that many simple cases of this can be done in a slightly more lightweight way via futures). If they're interacting with some external service, having them be children of a router actor which spawns new children if the existing ones are all busy (etc.) is probably worth doing: this router is a singleton, but as long as it doesn't spend a lot of time processing any message, the chances of it throttling the system are low. Your RedisService might be a good candidate for this sort of thing.
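One possible shape for the router suggestion, sketched with Akka classic's built-in pool router and resizer. This is an assumption about how it could be wired up, not the asker's actual code: RedisWorker and RouterSketch are hypothetical names, the bounds 2 and 16 are arbitrary, and the worker simply replies with an empty string as in the question.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.routing.{DefaultResizer, RoundRobinPool}
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

final case class GetSC(key: String)

// Hypothetical worker: each instance still handles one message at a time,
// but several instances can run in parallel behind the router.
class RedisWorker extends Actor {
  override def receive: Receive = {
    case GetSC(_) => sender() ! "" // placeholder reply, as in the question
  }
}

object RouterSketch {
  def askOnce(key: String): String = {
    val system = ActorSystem("router-sketch")
    implicit val timeout: Timeout = Timeout(3.seconds)
    try {
      // The resizer grows the pool (up to upperBound) when routees are busy.
      val resizer = DefaultResizer(lowerBound = 2, upperBound = 16)
      val router = system.actorOf(
        RoundRobinPool(nrOfInstances = 2, resizer = Some(resizer))
          .props(Props(new RedisWorker)),
        "redis_router")
      // The router forwards the message; the chosen routee replies to the asker.
      Await.result((router ? GetSC(key)).mapTo[String], timeout.duration)
    } finally system.terminate()
  }

  def main(args: Array[String]): Unit =
    println(s"reply: '${askOnce("some-key")}'")
}
```

The router itself does almost no work per message, so it is unlikely to become the throttle, while the routees supply the parallelism that a single RedisService instance cannot.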

Note also that performance and scalability aren't always one and the same, and improving one can diminish the other. Akka is often willing to trade some performance in the small for reduced degradation in the large.