Event-driven architecture (EDA) with Azure Data Factory

(2019-Oct-27) Creating or deleting files in your Azure Storage account can initiate a data ingestion process and support your event-driven data platform architecture.

Image by Lars_Nissen_Photoart from Pixabay

Microsoft recently introduced an additional change to file-event triggers in Azure Data Factory (ADF). This change gives you a bit more control over which files are, or are not, used in your data ingestion process.

Before we get into the details of how to use this new "Ignore empty blobs" feature, let's briefly review two common scenarios for using file-event triggers in your data processing workflow.

Ingest new data in batches

A batch of incoming source data may arrive as a set of files, and sometimes those files are archives that contain other files. The files usually don't arrive all at once: uploading is a sequential process, and there may be delays between the first and last files of the set. To orchestrate a synchronized data ingestion process and load those files as a complete set, your data provider generates an additional flag file (or end file) that marks the end of the upload for a particular batch. Only after this flag file is received does your data ingestion process start.
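The flag-file check can be sketched in a few lines of Python. This is only an illustration of the idea, not ADF code: the `_DONE` flag-file name, the folder layout, and the function name `files_ready_for_ingestion` are assumptions made up for this example, and your data provider may use a different convention.

```python
from typing import List, Optional

# Hypothetical flag-file name agreed with the data provider (an assumption).
FLAG_FILE_NAME = "_DONE"


def files_ready_for_ingestion(blob_names: List[str], batch_folder: str) -> Optional[List[str]]:
    """Return the data files of a batch once its flag file has arrived.

    Returns None while the flag file is still missing, which means the
    batch is still being uploaded and should not be loaded yet.
    """
    prefix = batch_folder.rstrip("/") + "/"
    # Keep only the blobs that belong to this batch folder.
    in_batch = [name for name in blob_names if name.startswith(prefix)]
    flag_path = prefix + FLAG_FILE_NAME
    if flag_path not in in_batch:
        return None
    # The flag file itself carries no data, so it is excluded from the load.
    return sorted(name for name in in_batch if name != flag_path)
```

In ADF itself, the same effect is usually achieved by pointing the trigger's "Blob path ends with" filter at the flag-file name, so that the pipeline fires only when that file lands.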

Ingest new data as it comes

With this approach, you build your data ingestion framework to react to each incoming data file that arrives at a particular location. As soon as a new file arrives, it triggers your process to ingest just that file into your data store.

This is where the new ADF trigger change helps: it adds control over this reactive process of loading new data files. If your data vendor sends you an empty file, by mistake or for any other reason, and you have set "Ignore empty blobs" to "Yes", your data ingestion pipeline won't be triggered, so you don't have to build special logic to handle empty source files in the pipeline. Empty files won't be loaded at all.
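For readers who prefer to see the setting outside the ADF user interface, here is a rough sketch of what such a trigger definition might look like, built as a Python dictionary that serializes to JSON. The trigger name, the `incoming` container, and both function names are hypothetical; `would_fire` is only a local approximation of the behavior described above (a zero-byte blob does not start the pipeline), not the service's actual implementation.

```python
import json
from typing import Any, Dict


def build_blob_event_trigger(storage_account_id: str, pipeline_name: str) -> Dict[str, Any]:
    """Build a storage event trigger definition with "Ignore empty blobs" on."""
    return {
        "name": "NewFileTrigger",  # hypothetical trigger name
        "properties": {
            "type": "BlobEventsTrigger",
            "typeProperties": {
                # Hypothetical container; the path format is /<container>/blobs/<folder>
                "blobPathBeginsWith": "/incoming/blobs/",
                "ignoreEmptyBlobs": True,
                "events": ["Microsoft.Storage.BlobCreated"],
                # Resource ID of the storage account, supplied by the caller.
                "scope": storage_account_id,
            },
            "pipelines": [
                {
                    "pipelineReference": {
                        "referenceName": pipeline_name,
                        "type": "PipelineReference",
                    }
                }
            ],
        },
    }


def would_fire(trigger: Dict[str, Any], blob_size_bytes: int) -> bool:
    """Approximate the setting locally: skip zero-byte blobs when it is on."""
    props = trigger["properties"]["typeProperties"]
    ignore_empty = props.get("ignoreEmptyBlobs", False)
    return not (ignore_empty and blob_size_bytes == 0)


if __name__ == "__main__":
    # The resource ID and pipeline name below are placeholders.
    trigger = build_blob_event_trigger("<storage-account-resource-id>", "IngestNewFile")
    print(json.dumps(trigger, indent=2))
```

With this sketch, `would_fire(trigger, 0)` is false and any non-empty blob passes, which mirrors the "Empty files won't be loaded at all" behavior without any extra branching inside the pipeline.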

Working with ADF triggers has become a bit easier! :-)

Comments

Unknown, January 28, 2022 at 2:12 PM
Hi,
My pipeline is failing even though my file is 125 bytes in size when "ignore empty blobs" is set to true. When I set "ignore empty blobs" to false, my pipeline succeeds.

I have different environments, such as dev and pre-prod.
In dev, "ignore empty blobs" set to true works fine, but not in pre-prod.
What could be the reason?
