How to add a Constant value to Dynamic JSON array in Azure Data Factory

(2021-Mar-22) I had an interesting Azure Data Factory (ADF) case last week while pulling a list of files from an Azure Storage account with the help of [Get Metadata] activity. The existing ‘childItems’ option of this activity provides a basic sufficient level of details with the list of filenames in a given folder, however, I needed a few additional elements in that list.

For my specific case, the [Get Metadata] JSON output lacked a file path for each of the files; also I needed to include a file type (‘log’ or ‘data’) depending on a file folder this activity would scan those files from. Also, the updated JSON array object had to rely on a dynamic output of the [Get Metadata] activity, and I couldn’t use nested [For Each] loop container, because they are still not supported in ADF.

So, I needed to come up with a way to make this transformation in such a way, so that my final output would fit into a classic JSON array type definition and with both dynamic and static values combined.


Sample of my Initial [Get Metadata] activity output

As I have already mentioned the list of filenames was good, and it might look odd to include a path (static value in this case) along with the additional attribute. But I needed this within my JSON array so that it would help me in my further iteration (which I may write another blog post later).

Statis JSON elements I wanted to include

Sample values:
- "filetype" : "log",
- "path" : "files/system/logs"

And my ADF static variable was set to this value then:
- "filetype":"log", "path":"files/system/logs"

Solution to this problem

A key to solve this odd problem couldn’t be just a simple step, but a very odd sequence of JSON to text and then text to JSON transformations. First I created a simple ADF pipeline with the [Get Metadata] activity to read the list of files from the "files/system/logs" Azure Storage location, 2nd step was to save the output of this activity into a JSON array variable and then do a set of strange transformations to populate my final merged JSON array object.

Here is my final transformation logic:

And here is the GitHub location to this ADF pipeline with all the code within:
https://github.com/NrgFly/Azure-DataFactory/blob/master/Samples/pipeline/pl_constant_dynamic_JSON.json

Transformations Steps explained

(1) Join Dynamic Array with Constant

The official Microsoft documentation states that JOIN function in ADF returns a string that has all the items from an array and has each character separated by a delimiter. Example: join(createArray('a', 'b', 'c'), '.') will result into: "a.b.c". Since I already had an array output from my [Get Metadata] activity, I just need to squeeze my static text between the array elements.

(2) Replace '}*' with ','

In my first step, I specifically included the ‘*’ character then to be replaced with ‘,’ to pretend that I want to extend a set of elements for each array item.

(3) Replace '}{' with '},{'

This was done specifically to segregate JSON {} array elements since the JOIN function removed all the commas.

(4) Wrap output with '[]'

Curly brackets are good, but they’re not good enough to represent a JSON array, so I needed to additionally wrap the output of my preceding step to make it look like an array.

(5) Add Constant at the end

This step was odd, and I didn’t catch it right away. Since the very first JOIN transformation squeezed the static elements between [Get Metadata] activity JSON array, so I needed to add the static value to the last array element to complete this text transformation.

(6) Convert String to JSON Array

Nothing works better than a conversion of a well-formatted JSON array in a text format into a real JSON array 

Final Result

As a result, I received the output I was looking for, my minor mission was accomplished, and I moved on to another work item of my project :-)



Comments