Fail activity in Azure Data Factory, and why would I want to Fail?

(2021-Nov-30) I heard a story about a young person who was asked why she always cut a small piece off the meat and put it aside before cooking the larger one. She answered that this was how it was always done at home when she was growing up, and she didn’t really know why. Only after checking with her older relatives did she discover the true reason for cutting that small piece away: her grandmother’s cutting board was too small for a regular-size piece of meat, and she trimmed it to fit the board while preparing a meal.

Photo by olia danilevich from Pexels

Recently, Microsoft introduced a new Fail activity (https://docs.microsoft.com/en-us/azure/data-factory/control-flow-fail-activity) in Azure Data Factory (ADF), and I wondered why I would ever want to fail a pipeline in ADF when my whole being tries very hard to make pipelines succeed once and for all. Yes, I understand the documented explanation that this activity lets you “customize both its error message and error code”, but why?

Internal Conditions

When I work with the Switch activity in ADF, which lets me branch a control flow into multiple streams, the Default case always puzzles me. You either define all your explicit Cases and let the Default case catch any undefined anomalies, or you are bold enough to define all Cases but one and let that last one fall through to the Default case, which doesn’t sound very stable. So, in my case, I can add a Fail activity to the Switch Default case, but it still doesn’t feel right.
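To make the Default-case idea concrete, here is a minimal sketch in Python that builds the pipeline JSON for a Switch activity whose Default case holds a Fail activity. The `on`, `cases`, `defaultActivities`, `message` and `errorCode` keys follow the activity JSON as I understand it from the documentation, so please verify them there. The activity names, the `fileType` pipeline parameter, the case values and the error code are all hypothetical, and the empty activity lists are placeholders.

```python
import json


def fail_activity(name, message, error_code):
    """Return a JSON-style definition of an ADF Fail activity."""
    return {
        "name": name,
        "type": "Fail",
        "typeProperties": {"message": message, "errorCode": error_code},
    }


def switch_with_failing_default(on_expression, cases):
    """Build a Switch activity whose Default case fails the pipeline.

    `cases` maps each explicit case value to its list of activities.
    """
    return {
        "name": "SwitchOnFileType",  # hypothetical activity name
        "type": "Switch",
        "typeProperties": {
            "on": {"value": on_expression, "type": "Expression"},
            "cases": [
                {"value": value, "activities": activities}
                for value, activities in cases.items()
            ],
            # Anything not matched by an explicit case lands here and fails.
            "defaultActivities": [
                fail_activity(
                    "FailOnUnknownCase",  # hypothetical activity name
                    "Unexpected value reached the Switch Default case",
                    "500",  # hypothetical error code
                )
            ],
        },
    }


switch = switch_with_failing_default(
    "@pipeline().parameters.fileType",  # hypothetical pipeline parameter
    {"csv": [], "parquet": []},  # placeholder cases with no activities
)
print(json.dumps(switch, indent=2))
```

With this shape, every expected value gets its own explicit Case, and the Default case is reserved for values nobody planned for.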

External Conditions

Also, I was reading about other people who intend to Fail a pipeline in ADF when certain files are not available or because of some other file-based malfunction. I believe that the Validation activity in ADF is still very underrated: it is a powerful way to regulate your pipeline control flow based on the absence or availability of files and folders. Plus, if you wisely control the Timeout and Retry settings of similar activities, that will help you not to fail but to exit your pipeline workflow gracefully if something bad happens.
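As a rough illustration of the Validation settings mentioned above, the following Python sketch assembles a Validation activity definition with an explicit timeout and polling interval. The `dataset`, `timeout`, `sleep` and `minimumSize` keys are my reading of the documented activity JSON and should be checked against the docs. The activity name, the dataset name and the chosen values are hypothetical.

```python
import json


def validation_activity(name, dataset_name, timeout, sleep_seconds, minimum_size=None):
    """Return a JSON-style definition of an ADF Validation activity.

    `timeout` uses the d.hh:mm:ss format; `sleep_seconds` is the delay
    between validation attempts.
    """
    type_properties = {
        "dataset": {"referenceName": dataset_name, "type": "DatasetReference"},
        "timeout": timeout,
        "sleep": sleep_seconds,
    }
    if minimum_size is not None:
        # Only relevant when the dataset points at a file.
        type_properties["minimumSize"] = minimum_size
    return {"name": name, "type": "Validation", "typeProperties": type_properties}


wait_for_file = validation_activity(
    "WaitForInputFile",   # hypothetical activity name
    "InputFileDataset",   # hypothetical dataset name
    timeout="0.00:10:00",  # give up after ten minutes
    sleep_seconds=30,      # check every thirty seconds
    minimum_size=1,        # require a non-empty file
)
print(json.dumps(wait_for_file, indent=2))
```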

Successfully Failed External Activity

However, the recent release of a new Azure Data Factory solution template, “Call Synapse pipeline with a notebook activity”, has shown me the best use case for the Fail activity: https://docs.microsoft.com/en-us/azure/data-factory/solution-template-synapse-notebook

This new solution template describes how to call a Synapse pipeline with a notebook activity using a Web activity, then run a polling process that checks the Synapse pipeline status until it completes (status output of Succeeded, Failed, or Canceled). If the run fails on the Synapse notebook side, the template executes a Fail activity and customizes both the error message and the error code.
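The poll-then-fail flow can be sketched in plain Python. This is only an analogy of the pattern, not the template itself and not an ADF or Synapse API: `wait_for_run`, `PipelineFailed`, the `get_status` callable and the error codes are all hypothetical names, and the non-terminal statuses in the demo are purely illustrative.

```python
import time

# Terminal statuses named in the template description.
TERMINAL_STATUSES = {"Succeeded", "Failed", "Canceled"}


class PipelineFailed(Exception):
    """Stands in for the Fail activity: carries a custom code and message."""

    def __init__(self, error_code, message):
        super().__init__(f"[{error_code}] {message}")
        self.error_code = error_code
        self.message = message


def wait_for_run(get_status, poll_seconds=0.0, max_polls=100):
    """Poll `get_status()` until a terminal status, then fail on "Failed"."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            break
        time.sleep(poll_seconds)
    else:
        # The loop ran out of attempts without seeing a terminal status.
        raise PipelineFailed("POLL_TIMEOUT", f"No terminal status after {max_polls} polls")

    if status == "Failed":
        raise PipelineFailed(
            "SYNAPSE_RUN_FAILED",
            "Synapse pipeline run ended with status Failed",
        )
    return status


# Demo with a canned sequence of statuses instead of a real service call.
statuses = iter(["Queued", "InProgress", "Failed"])
try:
    wait_for_run(lambda: next(statuses))
except PipelineFailed as err:
    print(err.error_code, "-", err.message)
```

The external run is allowed to finish on its own terms, and the caller alone decides that a Failed outcome should surface as a failure with its own code and message.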

I believe that when you build an external workflow, whether a Function, a Logic App, or, as in this case, a Synapse pipeline with a notebook activity, you intend to make it stable enough to withstand all possible scenarios, even the bad ones, and to exit gracefully. Then your ADF workflow should be smart enough to interpret the output of the external execution and act appropriately, even if that means to Fail. I like this use case of the Fail activity more!

Teach me Microsoft, teach me well, teach me the ways of Failure!

Data Adventures
