I am applying multiple operations to a Dask DataFrame. Can I specify distributed worker resource requirements for a particular operation?
e.g. I call something like:
df.fillna(value="").map_partitions(...).map(...)
I want to specify resource requirements for map_partitions() (potentially different from those for map()), but it seems the method does not accept a resources parameter.
P.S. As a workaround, I figured out that I can call client.persist() after map_partitions() and pass resources in that call, but it immediately triggers computation instead of staying lazy.