(2021-Mar-22) I had an interesting Azure Data Factory (ADF) case last week while pulling a list of files from an Azure Storage account with the help of [Get Metadata] activity. The existing ‘childItems’ option of this activity provides a basic sufficient level of details with the list of filenames in a given folder, however, I needed a few additional elements in that list.
For my specific case, the [Get Metadata] JSON output lacked a file path for each of the files; also I needed to include a file type (‘log’ or ‘data’) depending on a file folder this activity would scan those files from. Also, the updated JSON array object had to rely on a dynamic output of the [Get Metadata] activity, and I couldn’t use nested [For Each] loop container, because they are still not supported in ADF.
So, I needed to come up with a way to make this transformation in such a way, so that my final output would fit into a classic JSON array type definition and with both dynamic and static values combined.
Sample of my Initial [Get Metadata] activity output
As I have already mentioned the list of filenames was good, and it might look odd to include a path (static value in this case) along with the additional attribute. But I needed this within my JSON array so that it would help me in my further iteration (which I may write another blog post later).Statis JSON elements I wanted to include
Solution to this problem
A key to solve this odd problem couldn’t be just a simple step, but a very odd sequence of JSON to text and then text to JSON transformations. First I created a simple ADF pipeline with the [Get Metadata] activity to read the list of files from the "files/system/logs" Azure Storage location, 2nd step was to save the output of this activity into a JSON array variable and then do a set of strange transformations to populate my final merged JSON array object.
Here is my final transformation logic:
https://github.com/NrgFly/Azure-DataFactory/blob/master/Samples/pipeline/pl_constant_dynamic_JSON.json
Transformations Steps explained
(1) Join Dynamic Array with Constant
(2) Replace '}*' with ','
In my first step, I specifically included the ‘*’ character then to be replaced with ‘,’ to pretend that I want to extend a set of elements for each array item.
(3) Replace '}{' with '},{'
This was done specifically to segregate JSON {} array elements since the JOIN function removed all the commas.
(4) Wrap output with '[]'
Curly brackets are good, but they’re not good enough to represent a JSON array, so I needed to additionally wrap the output of my preceding step to make it look like an array.
(5) Add Constant at the end
This step was odd, and I didn’t catch it right away. Since the very first JOIN transformation squeezed the static elements between [Get Metadata] activity JSON array, so I needed to add the static value to the last array element to complete this text transformation.
(6) Convert String to JSON Array
Nothing works better than a conversion of a well-formatted JSON array in a text format into a real JSON array
Final Result
As a result, I received the output I was looking for, my minor mission was accomplished, and I moved on to another work item of my project :-)
Comments
Post a Comment