Azure Function Drain mode

(2021-Jun-16) Recently, Google started to notify me that my blog website (http://datanrg.blogspot.com/) had been experiencing some ‘Server (5xx)’ errors. Nothing has been changed on my side, and I’m not sure how long this Google-powered Blogger platform will be active. 

Another “sucker punch” came from my youngest child :-). She is 9; however, she is aware, to a certain extent, of Azure Data Factory and the other data platform components that I attempt to write about. A few days ago she suggested that I should stop writing about all this “boring” stuff and add some action and fun stuff, like what she reads about in her chapter books or watches in her TV shows.

So, if you stop seeing any new posts on my blog site: I’m either fighting with Blogger server errors or watching a new episode of the “Captain Underpants” TV show with my daughter :-)

Image by PublicDomainPictures from Pixabay 

Here is some action and fun stuff about this blog post idea. It came from my recent communication with the Microsoft Support team and the assistance I received while they were helping me troubleshoot a failed Azure Function execution incident in my project.

Azure Functions enable developers to write their application code and put all other infrastructure worries onto the shoulders of the cloud platform (Microsoft Azure), which provides “compute on-demand” resources.

As requests to execute Azure Functions increase, additional compute resources are provided to meet the demand, but only for as long as they are needed (scale-out). As requests fall, any extra resources and application instances drop off automatically (scale-in).

Recently Microsoft enabled a new Drain mode in Azure Functions that allows for a graceful shutdown of the Azure Function host: the host completes in-flight invocations and stops listening for new events from triggering sources. When the host is put in drain mode, the following should happen:

  1. The host stops listening for new incoming requests,
  2. A cancellation token is passed as a parameter to the function invocation,
  3. Finally, a scale-in operation is performed.

Once all currently executing invocations have stopped, the Azure Function listeners are stopped as well, to ensure that the worker that is shutting down doesn’t handle any new invocations. Prior to this change, a scale-in operation caused the existing function host to lose its in-flight and pending invocations.

Drain mode was added to help with the scenario where a worker was signalled to shut down but continued to pick up new events, which then never completed because of the shutdown.
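The sequence described above can be pictured with a small simulation. This is only a sketch in plain Python with asyncio, not Azure Functions code: `ToyHost`, `dispatch` and `drain` are hypothetical names made up for illustration, and the real host's internals may work quite differently. It shows the idea only: once draining starts, new events are turned away while in-flight invocations are allowed to finish.

```python
import asyncio


class ToyHost:
    """Hypothetical stand-in for a function host (NOT the Azure Functions API)."""

    def __init__(self) -> None:
        self.listening = True
        self.in_flight = set()   # invocations that are still running
        self.completed = []      # events whose invocation finished
        self.rejected = []       # events that arrived after drain started

    async def _invoke(self, event: str) -> None:
        await asyncio.sleep(0.01)  # pretend to do some work
        self.completed.append(event)

    def dispatch(self, event: str) -> bool:
        """Start an invocation unless the host has stopped listening."""
        if not self.listening:
            self.rejected.append(event)
            return False
        task = asyncio.create_task(self._invoke(event))
        self.in_flight.add(task)
        task.add_done_callback(self.in_flight.discard)
        return True

    async def drain(self) -> None:
        """Stop accepting new events, then wait for in-flight work to finish."""
        self.listening = False
        if self.in_flight:
            await asyncio.gather(*self.in_flight)


async def demo() -> ToyHost:
    host = ToyHost()
    host.dispatch("event-1")
    host.dispatch("event-2")
    drain_task = asyncio.create_task(host.drain())
    await asyncio.sleep(0)     # give drain() a chance to flip the flag
    host.dispatch("event-3")   # arrives during the drain: turned away
    await drain_task           # in-flight invocations finish first
    return host


host = asyncio.run(demo())
print("completed:", sorted(host.completed))
print("rejected:", host.rejected)
```

In this toy model the first two events run to completion and the third one is never started, which is the opposite of the pre-drain-mode scenario where a worker kept picking up events it could not finish.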

If a Function App is running on the Premium (ElasticPremium) hosting plan, then its instances are dynamically added and removed based on demand, i.e. the number of incoming messages. Azure Functions uses a component called the Scale Controller to monitor the rate of events. The scale controller uses heuristics for each trigger type to decide whether to scale out, scale in, or stay the same. When the function app is scaled out, additional resources are allocated to run multiple instances of the Azure Functions host. Conversely, as compute demand is reduced, the scale controller removes function host instances. The number of instances is eventually "scaled-in" to zero when no functions are running within a function app. The Drain mode API puts the Function App host into a suspended mode, in which it stops accepting new incoming traffic and turns off the listeners for its triggers, thus facilitating a graceful shutdown.
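As a rough illustration of the "scale out, scale in, or stay the same" decision, here is a toy heuristic in Python. Everything in it is an assumption made up for this sketch: the `decide` function, the `per_instance_capacity` threshold and the backlog-based rule are hypothetical, and the actual Scale Controller heuristics differ per trigger type and are not described in this post.

```python
from enum import Enum


class Decision(Enum):
    SCALE_OUT = "scale out"
    SCALE_IN = "scale in"
    STAY = "stay the same"


def decide(pending_events: int, instances: int,
           per_instance_capacity: int = 100) -> Decision:
    """Toy rule: compare the backlog with what the current instances can absorb.

    per_instance_capacity is a hypothetical number, not an Azure setting.
    """
    if pending_events > instances * per_instance_capacity:
        return Decision.SCALE_OUT
    # Scale in only if one instance fewer could still absorb the backlog.
    if instances > 0 and pending_events <= (instances - 1) * per_instance_capacity:
        return Decision.SCALE_IN
    return Decision.STAY


for backlog, count in [(250, 2), (150, 2), (50, 2), (0, 1)]:
    print(backlog, count, decide(backlog, count).value)
```

In a real function app, the instance picked for removal would be the one put into drain mode before it goes away.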

Kudos to the Microsoft support engineers who explained the recent changes to the Function App and shared all this information with me, and big thanks to my daughter who still believes that I could write about some fun stuff :-) 


Comments

  1. Is there detail available on the Azure Drain Function API?

    1. I couldn't find much information on Function App drain mode myself; much of it came from the MSFT support engineers. In a way, it resembles a VM drain mode, but I wish more information would be shared for the Function Apps as well.

    2. Where could I find the FunctionApp Drain Mode in the GitHub code? Could you share the API location itself?

    3. MSFT support engineers haven't shared this information with me.

    4. https://github.com/Azure/azure-functions-host/pull/6330

  2. Great article! I wonder what signals the function app to start again. The listener should resume, especially when it's a trigger-based function app. What are the criteria for the listener to restart?

    1. This could be your time trigger or an event-based call for a function within your Function App.

  3. FYI, it appears the behaviour of drain mode was altered to NOT trigger cancellation tokens with https://github.com/Azure/azure-webjobs-sdk/issues/2795, and the new way is to leverage the IDrainModeManager dependency.

