0
votes

I want to run N concurrent operations with asyncio. This is the general layout of my code:

N = 100  # roughly 100 in my case

def heavy_operation():
    # Heavy operation 1..N that I want to run concurrently.
    pass

for x in range(N):
    heavy_operation()

heavy_operation is a GET request I make to an API; each response is only a few lines of text. Since N is close to 100, my code runs slowly.

Solutions that don't use a for loop are also fine, as long as the heavy operations run concurrently.

1 Answer

0
votes

If your heavy operations are not async (that is, not defined with async def and not actually awaiting IO operations), asyncio won't do you much good, because it is single-threaded and suited to overlapping IO waits. To run synchronous code concurrently, you can use threads directly, like this:

import concurrent.futures

with concurrent.futures.ThreadPoolExecutor() as pool:
    for i in range(N):
        pool.submit(heavy_operation)  # you can also add positional arguments
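
The snippet above discards the return values. If you need them, a minimal sketch along these lines keeps the futures and reads their results. The body of heavy_operation here is a hypothetical stand-in (a short sleep) for the real blocking call, and its argument and return value are made up for illustration.

```python
import concurrent.futures
import time

def heavy_operation(i):
    # Hypothetical stand-in for the real blocking GET request.
    time.sleep(0.1)
    return i * 2

def run_all(n):
    results = {}
    with concurrent.futures.ThreadPoolExecutor() as pool:
        # Map each future back to the argument it was submitted with.
        futures = {pool.submit(heavy_operation, i): i for i in range(n)}
        for fut in concurrent.futures.as_completed(futures):
            # result() re-raises any exception raised inside the worker.
            results[futures[fut]] = fut.result()
    return results

results = run_all(10)
```

as_completed yields futures in the order they finish, which is why the results are keyed by the submitted argument and not appended to a list.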

Note that if heavy_operation is CPU-bound, your code will likely run no faster than the sequential version because of the GIL. In that case you can try changing ThreadPoolExecutor to ProcessPoolExecutor to run heavy_operation in separate Python processes.

If heavy_operation is IO-bound (e.g. it uses requests to download over HTTP), the optimal approach would be to convert it to async by switching the underlying library to one that supports asyncio, for example from requests to aiohttp. Then you'd use asyncio.gather to run the heavy_operation calls concurrently.
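
A minimal sketch of the gather pattern, assuming heavy_operation has already been converted to a coroutine. To keep the example self-contained, asyncio.sleep stands in for the awaited HTTP call (in real code you would await the aiohttp request there instead); the delay and the returned string are made up for illustration.

```python
import asyncio

async def heavy_operation(i):
    # Hypothetical stand-in for an awaited HTTP request
    # (e.g. an aiohttp GET in real code).
    await asyncio.sleep(0.1)
    return f"response {i}"

async def main(n):
    # gather schedules all coroutines concurrently and returns
    # their results in the same order as the inputs.
    return await asyncio.gather(*(heavy_operation(i) for i in range(n)))

results = asyncio.run(main(100))
```

Because all 100 coroutines wait at the same time, the whole run takes roughly as long as the slowest single call, not the sum of all of them.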

Having said that, I'd also add that, in practice, spawning 100 threads (assuming you want all your requests to run in parallel) won't burden a modern system. You just need to pass that number to the ThreadPoolExecutor constructor, and it will work fine. The benefit of asyncio is that you get a solution that: a) actually scales, because it will work equally well for 10, 100, or 10,000 concurrent connections, b) avoids many pitfalls of threads, such as race conditions, and c) automatically provides other goodies such as reliable cancellation and timeouts.
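
For the thread-based route, sizing the pool might look like the sketch below. max_workers is the actual ThreadPoolExecutor constructor argument; heavy_operation is again a hypothetical stand-in that sleeps instead of calling the API.

```python
import concurrent.futures
import time

N = 100

def heavy_operation(i):
    # Hypothetical stand-in for the blocking GET request.
    time.sleep(0.1)
    return i

start = time.perf_counter()
# One worker per request, so all N calls can be in flight at once.
with concurrent.futures.ThreadPoolExecutor(max_workers=N) as pool:
    # map() yields results in input order.
    results = list(pool.map(heavy_operation, range(N)))
elapsed = time.perf_counter() - start
```

Without max_workers, the default pool size in recent Python versions is capped at 32 threads, so 100 requests would be processed in several batches.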