Collecting custom parameters/variables for Azure Data Factory deployment

(2020-June-22) It's a noble act to develop a very cool database solution that brings value to your customers. Beyond that, you can help deploy this solution to another isolated Testing or Production environment and free yourself from the claim that it only "works on my machine". By automating this deployment process and setting the various environment-specific parameters/variables, you will steer your solution ship toward greater customer appreciation of your efforts.
 
The Microsoft Azure Data Factory (ADF) deployment process relies on two Azure Resource Manager (ARM) template files: ARMTemplateForFactory.json and ARMTemplateParametersForFactory.json. These two files go together like a horse and carriage: the first contains the complete ADF definition, and the second lists the configurable elements (parameters) that you can adjust when deploying to another environment.

A good colleague of mine, Eric Bressot-Perrin, introduced me to a recent enhancement to the Azure Data Factory CI/CD process that allows adding custom parameters to the Resource Manager templates.

Photo by Tom Fisk from Pexels

Microsoft does a good job describing this set of changes to the existing deployment process, and they keep updating the documentation with examples, so I highly recommend checking it from time to time: https://docs.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment


If I needed to explain visually how this custom parameterization works for an Azure Data Factory resource, I would picture it this way. Previously, you relied solely on publishing your ADF code from your collaboration Git branch to the adf_publish branch, where the ARMTemplateForFactory.json and ARMTemplateParametersForFactory.json files live before being deployed to other environments. You had some flexibility to parameterize your deployment or to run custom code that updated the ARM templates before they were deployed.



With the introduction of ADF custom parameterization, you have an additional JSON file, arm-template-parameters-definition.json, in which you can define rules that add supplementary parameters to the main ARMTemplateParametersForFactory.json file. A very important statement in the Microsoft documentation explains how this new file operates: "A definition can't be specific to a resource instance. Any definition applies to all resources of that type". It's like a garden rake that collects all the leaves or none: if your rule targets a JSON property, let's say the "timeout" of your ForEach loop container, then all such timeouts will be scooped into the ARM template parameter file.


Here is a quick illustration of how I can change parameters, and even ADF activity names, during a deployment.

(A) Azure Data Factory development code
I have a very simple pipeline where I want to change my "waitTimeInSeconds" parameter value along with the Wait activity name.
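
For readers who cannot see the pipeline itself, here is a minimal sketch of what such a pipeline definition might look like in the collaboration branch. The pipeline name "pl_wait_demo", the activity name "Wait 5 seconds" and the default value of 5 are my assumptions for illustration only; they are not the values from the original pipeline.

```json
{
    "name": "pl_wait_demo",
    "properties": {
        "activities": [
            {
                "name": "Wait 5 seconds",
                "type": "Wait",
                "dependsOn": [],
                "userProperties": [],
                "typeProperties": {
                    "waitTimeInSeconds": {
                        "value": "@pipeline().parameters.waitTimeInSeconds",
                        "type": "Expression"
                    }
                }
            }
        ],
        "parameters": {
            "waitTimeInSeconds": {
                "type": "int",
                "defaultValue": 5
            }
        },
        "annotations": []
    }
}
```

The two values of interest are the activity's "name" and the "defaultValue" of the pipeline parameter, since those are what the deployment is meant to change.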

(B) Parameterization template
I change my arm-template-parameters-definition.json file and add new elements under the "Microsoft.DataFactory/factories/pipelines" path that instruct ADF to parameterize all activity names and all pipeline parameter default values.
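
A rule set along these lines could look like the fragment below. This is a sketch based on the syntax in the Microsoft documentation, not a copy of my actual file: "=" keeps the current value as the parameter's default, and "*" matches every property at that level. Only the pipelines section is shown; a real definition file would normally also carry sections for the other resource types you want parameterized.

```json
{
    "Microsoft.DataFactory/factories/pipelines": {
        "properties": {
            "activities": [
                {
                    "name": "="
                }
            ],
            "parameters": {
                "*": {
                    "defaultValue": "="
                }
            }
        }
    }
}
```

Because a definition applies to every resource of that type, these two rules would pick up the name of every activity and the default value of every parameter in every pipeline of the factory, not just the Wait activity in this example.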


(C) ARM template parameter file
After publishing Data Factory code from my collaboration branch, the ARMTemplateParametersForFactory.json file will receive additional elements that I can further adjust.
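
The shape of the resulting file can be sketched as follows, assuming a Wait activity named "Wait 5 seconds" and a parameter default of 5 (both illustrative values). The keys in angle brackets are placeholders, not real generated names: ADF derives the actual parameter names from the resource name and the property path, so copy them from your own published file rather than from this sketch.

```json
{
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "factoryName": {
            "value": "<your-development-factory-name>"
        },
        "<generated-name-for-the-activity-name>": {
            "value": "Wait 5 seconds"
        },
        "<generated-name-for-the-waitTimeInSeconds-defaultValue>": {
            "value": 5
        }
    }
}
```

Each entry is an ordinary ARM template parameter, so it can be overridden at deployment time like any other.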


(D) Override template parameters
In my deployment pipeline code, I specifically override waitTimeInSeconds to 10 seconds and, accordingly, the name of the Wait activity that uses this value.


(E) Validating ADF deployment
My ADF code gets successfully deployed to the Testing environment.

And when I access the Testing Azure Data Factory resource, I can see that both the parameter and the activity name were changed to their desired (adjusted) values during the deployment.


This custom parameterization of the ADF template parameter file may not look like a very significant change, but it gives you much more flexibility when deploying your Data Factory code.
