Grayzone between blocking and non-blocking I/O?

Question

I am familar with programming according to the two paradigms, blocking and non-blocking, on the JVM (Java/nio, Scala/Akka). However, I see a kind of grayzone in between that confuses me.

Look at any non-blocking program of your choice: it is full of blocking statements!

For example, each assignment of a variable is a blocking operation that waits for CPU-registers and memory-reads to succeed. Moreover, non-blocking programs even contain blocking statements that carry out computations on complex in-memory-collections, without violating the non-blocking paradigm.

In contrast to that, the non-blocking paradigm would clearly be violated if we would call some external web-service in a blocking way to receive its result.

But what is in between these extremes? What about reading/writing a tiny file, a local socket, or making an API-call to an embedded data storage engine (such as SQLite, RocksDb, etc.). Is it ok to do blocking reads/writes to these APIs? They usually give strong timing guarantees in practice (say << 1ms as long as the OS is not stalled), so there is almost no practical difference to pure in-memory-access. As a precise example: is calling RocksDBs get/put within an Akka Actor considered to be an inadvisable blocking I/O?

So, my question is whether there are rules of thumb or precise criteria that help me in deciding whether I may stick to a simple blocking statement in my non-blocking program, or whether I shall wrap such a statement into non-blocking boilerplate (framework-depending, e.g., outsourcing such calls to a separate thread-pool, nesting one step deeper in a Future or Monad, etc.).

the8472 the8472 · Accepted Answer · 2017-08-05T20:40:03

for example, each assignment of a variable is a blocking operation that waits for CPU-registers and memory-reads to succeed

That's not really what is considered "blocking". Those operations are constant time, and that constant is very low (a few cycles in general) compared to the latency of any IO operations (anywhere between thousands and billions of cycles) - except for page faults due to swapped memory, but if those happen regularly you have a problem anyway.

And if we want to get all nitpicky, individual instructions do not fully block a CPU thread as modern CPUs can reorder instructions and execute ones that have no data dependencies out of order while waiting for memory/caches or other more expensive instructions to finish.

Moreover, non-blocking programs even contain blocking statements that carry out computations on complex in-memory-collections, without violating the non-blocking paradigm.

Those are not considered as blocking the CPU from doing work. They should not even block user interactivity if they are correctly designed to present the results to the user when they are done without blocking the UI.

Is it ok to do blocking reads/writes to these APIs?

That always depends on why you are using non-blocking approaches in the first place. What problem are you trying to solve? Maybe one API warrants a non-blocking approach while the other does not. For example most file IO methods are nominally blocking, but writes without fsync can be very cheap, especially if you're not writing to spinning rust so it can be overkill to avoid those methods on your compute threadpool. On the other hand one usually does not want to block a thread in a fixed threadpool while waiting for a multi-second database query

Grayzone between blocking and non-blocking I/O?

1 Answers