I am developing a distributed computing system using dask.distributed. Some of the tasks that I submit to it with the Executor.map function fail, while other, seemingly identical tasks run successfully.
Does the framework provide any means to diagnose problems?
Update: By "failing" I mean that the counter of failed tasks in the Bokeh web UI provided by the scheduler increases. The counter of finished tasks increases too.
The function run by Executor.map returns None. It communicates with a database, retrieves some rows from a table, performs calculations, and updates values.
I have more than 40000 tasks in the map call, so studying the logs by hand is tedious.
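
To make the goal concrete, here is a minimal sketch of the kind of summary I am after. It is written against the standard-library concurrent.futures as a stand-in, not dask.distributed itself, and `process_rows`, `summarize_failures`, and the artificial failure condition are all hypothetical names and behaviour made up for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_rows(task_id):
    """Hypothetical stand-in for the real task: returns None, sometimes fails."""
    # Assumption for illustration only: every 7th task raises.
    if task_id % 7 == 0:
        raise ValueError(f"bad row batch {task_id}")
    return None


def summarize_failures(func, inputs, max_workers=4):
    """Run func over inputs and group the failures by exception type."""
    counts = Counter()
    failed = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Remember which input each future belongs to.
        futures = {pool.submit(func, x): x for x in inputs}
        for future in as_completed(futures):
            exc = future.exception()  # None if the task succeeded
            if exc is not None:
                counts[type(exc).__name__] += 1
                failed[futures[future]] = exc
    return counts, failed


counts, failed = summarize_failures(process_rows, range(20))
```

With a summary like `counts` and `failed`, I could look only at the inputs that failed instead of reading the logs for all tasks. Whether dask.distributed offers something equivalent out of the box is what I am asking.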