
I have some trouble understanding how a Function App works.

My environment is as follows: Python 3.8, Blob Trigger, Consumption Plan.

I am creating an application which is triggered when an audio file is uploaded into a container. The audio file triggers an Azure Function that runs Speech-To-Text using Azure Cognitive Services (so my function waits for an answer from that service). I set FUNCTIONS_WORKER_PROCESS_COUNT to 5 in order to allow each of my Function App instances to run several Speech-To-Text analyses in parallel.
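For context, that app setting can be applied to a Function App with the Azure CLI. This is just a sketch; the Function App and resource group names below are placeholders:

```shell
# Set FUNCTIONS_WORKER_PROCESS_COUNT=5 on the Function App.
# "my-func-app" and "my-rg" are placeholder names -- substitute your own.
az functionapp config appsettings set \
  --name my-func-app \
  --resource-group my-rg \
  --settings FUNCTIONS_WORKER_PROCESS_COUNT=5
```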

So I uploaded 100 blobs into my container to check my function's behaviour. Here is what I get:

The Function App is triggered and starts several servers (5 for 100 blobs), then processes 1 blob per server until more than 30 minutes have passed since I uploaded the blobs, at which point I get a timeout.

But I was expecting this behaviour: the Function App is triggered and starts several servers. Each server processes 5 blobs in parallel and gives me an answer for all of my blobs in 15 to 20 minutes!

So I don't get 2 things here:

  • Why are my functions processing 1 blob per server instead of 5 blobs per server, given that I set FUNCTIONS_WORKER_PROCESS_COUNT to 5?
  • Why do my blobs seem to be processed as soon as they appear in the container instead of being put in a queue? This behaviour is responsible for the timeouts, since the blobs wait for quite a long time instead of being processed.

I hope I was clear. Thank you for your help!

EDIT: I just added 100 more blobs to see how the Function App reacts, and my freshly uploaded blobs are being processed before the ones that I uploaded at the beginning.


1 Answer


1. For your first question:

As far as I understand, Python is a single-threaded runtime.

Because Python is a single-threaded runtime, a host instance for Python can process only one function invocation at a time. For applications that process a large number of I/O events and/or are I/O bound, you can improve performance significantly by running functions asynchronously.
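To illustrate the difference async makes for I/O-bound work, here is a standalone sketch where asyncio.sleep stands in for the wait on the Speech-To-Text service (the function and blob names are made up):

```python
import asyncio
import time

async def transcribe(blob_name: str) -> str:
    # Placeholder for an I/O-bound call such as Speech-To-Text;
    # asyncio.sleep simulates waiting on the remote service.
    await asyncio.sleep(0.1)
    return f"transcript of {blob_name}"

async def main() -> list:
    blobs = [f"audio_{i}.wav" for i in range(5)]
    # All five awaits overlap, so total time is ~0.1 s, not ~0.5 s.
    return await asyncio.gather(*(transcribe(b) for b in blobs))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(len(results), round(elapsed, 2))
```

Five 0.1-second waits finish in roughly 0.1 seconds total because the event loop overlaps them, which is the kind of gain the documentation is describing for a single-threaded worker.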

And it is right that FUNCTIONS_WORKER_PROCESS_COUNT lets you run 5 blob-triggered function invocations per host, but they run one by one if they compete for the same resource. In other words, even though you can run 5 worker processes at the same time, if the first process is busy running your function, the second process (running the same function) may have to wait; and if your first process is waiting for data to come in, the second process can run first.

Here is an article about how FUNCTIONS_WORKER_PROCESS_COUNT works.

And you can check how many worker instances you are using. If you have 100 blobs triggering your function and 5 worker processes per worker instance, it should start 20 instances to consume the requests. (Welcome to correct me if I'm wrong.)


2. For your second question:

The Blob storage trigger starts a function when a new or updated blob is detected.

That's how a blob-triggered function works.
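For reference, a minimal blob trigger binding for a Python function looks like this in function.json (the container name below is a placeholder; adjust the path to your own container):

```json
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "myblob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "audio-container/{name}",
      "connection": "AzureWebJobsStorage"
    }
  ]
}
```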