Search This Blog

Monday, October 15, 2018

Setting Variables in Azure Data Factory Pipelines

(2018-Oct-15) Working with Azure Data Factory you always tend to compare its functionality with well established ETL packages in SSIS. Data flow task have been recreated as Data Copy activities; logical components have found they cloud-based siblings; as well as new kids on the block, such as Databricks and Machine Learning activities could boost adoption rate of Azure Data Factory (ADF) pipelines.

Support for local variables hasn't always been available in ADF and was only recently introduced to already available pipeline parameters. This addition makes more flexible to create interim properties (variables) that you can adjust multiple times within a workflow of your pipeline.

Here is my case-study to test this functionality.

I have a simple SQL Database with 2 tables that could hold daily and monthly sales data which I plan to load from a sample set of CSV data files from my Blob storage in Azure.

My new ADF pipeline has an event trigger that passes a file path and file name values from newly created objects in my Blob storage container:

The logic then would be to check a data feed type (Daily or Monthly) based on a file name and load data to the corresponding table in SQL Database in Azure.

And here is where this [Set Variable] activity comes as a very handy tool to store a value based on a define expression of my variable:

Then I define two sub-tasks to copy data from those flat files into corresponding tables based on the value of the FeedType variable:

And once this all being done, I place two different data files into by Blob storage container and ADF Pipeline trigger successfully executes the same pipeline twice to load data into two separate tables:

Which I can further check and validate in my Azure SQL Database:

My [Set Variable] activity has been tested successfully! 
And it's one more point toward using ADF pipelines more often.

Thursday, August 30, 2018

Maps in Power BI. Do I really need them?

(2018-Aug-30) While preparing for a recent Power BI Toronto meetup session, I found a very valuable whitepaper on Maps in Power BI written by David Eldersveld from the BlueGranite consulting company, and I had used some of the ideas and examples from that document in my presentation.

There is one idea that I couldn't agree more which could be traced and found in many of my Power BI report developments (I admit) and reports developed by others. We've integrated external data source systems, created key metrics in our data model, identified that some of the data elements could be categorized as geolocations in our datasets, and... then we want to start using maps visualizations right away, but do I really need them?

If my dataset contains a customer location info or list of countries where all the tweet messages came from based on a recent marketing campaign then I'm tempted to visualize them. Let's take an example of the following report with the Canadian population by Province. What would be the best way to analyze the population itself, a table, a map, a bar or a pie chart or combination of all of them?

I could easily sort the table by the Population column or look at another non-map visualization (Donut chart, for example, developed by ZoomCharts company) to realize that Quebec is the second largest province by its population.

or I could just select the least populated provinces and territories to find out that their ratio vs. other populated area is less than 1%.

So I wouldn't suggest using Map visualization in Power BI just because my data can allow this to happen. Use other more effective tools or visual components to present your data; combine them with maps, and most of all create a story with your visualizations that will help your audience to understand it better.

It only takes to learn maps visualizations to understand that sometimes something else could be used instead :-)

Sunday, July 29, 2018

ArcGIS Maps for Power BI: Plus subscription (2nd part); What's in it for me?

(2018-July-29) My initial attempt to explore and test the ArcGIS Map paid version continued after receiving a response from the ESRI support team. Initially, I couldn't see all the additional features of the Plus subscription and I wrote about that in my last blog post - ArcGIS Maps for Power BI: Free vs. Paid version

But then I received a response from the Canadian Tech ESRI support team on my request to look into the issue of non-connectivity to Plus subscription content in Power BI. Basically, it communicated that all things were OK with the map itself and I needed to check further with Microsoft Power BI support team.

Well, I was a bit disappointed with this outcome; in addition to this, I received a few comments from other people about their unsuccessful attempts to resolve the very same issue with ESRI support team.

The next day I received two other messages: one with a confirmation that my ArcGIS Map Plus subscription had been activated and another email from a business analyst from the ESRI team in the US with a request to check if I still had any issues connecting to my account and additional help was offered if I needed. This restored my faith in people, and they willing to help others :-) Plus, after quickly checking my original Power BI report, I was able to connect to the Plus subscription of ArcGIS MAP, yes!

So, this is my story of checking all the new features available in the paid version of the ArcGIS Map in PowerBI.


First, the basesmaps add a contextual background to your map geo points. And with the Plus subscription you have access to 8 new basemaps, identified by a Plus badge on the screen:

New basemaps are:

- Imagery
- Imagery with Labels
- National Geographic
- Oceans
- Terrain with Labels
- Topographics
- USA Topo Labels
- USGS National Map (USA)


And my favorite basemep among those eight ones is the National Geographic, it hard to tell why. Probably, because it combines both technical geocoding details like any standards maps, but at the same time, it bears that graphical look from past historical sea voyages into the future endeavors to explore oceans, and I just like it :-)

Reference Layer

With Plus subscription you have access to Live Atlases, which can enrich you map look with other graphical layers on the top of your existent basemep. The search option allows you to examine available layers that you can use. Currently, most of the layers are US based data, but I found a North America Ecoregions layer that showed Canada in so many different colors. Just take a look!



Infographics are just "floating" data cards that you can add to your map with some statistical data  (population, age distribution, income levels, etc.) to support either your whole map visualization or a part of it.

The beauty of those infographics data cards is that their content is updated depending on what you're doing with your ArcGIS. If you select one or more features on the map, the cards will show demographic information for an area around each selected location. See, how the Toronto infographics data gets changed when I zoom my view in to a core downtown area.


I won't be making any conclusions at the end. If you prefer using MapBox in Power BI, it's your choice; if you want to give a try to ArcGIS Plus subscription or other map visualization in Power BI, go for it!

For me, this version of the ArcGIS Map also reminded me that I needed to purchase a National Geographic Magazine subscription for my daughter :-)

Happy data adventures!

Wednesday, July 25, 2018

ArcGIS Maps for Power BI: Free vs. Paid version

(2018-July-25) I like the ESRI company slogan, "The Science of Where". Hopefully, it will help someone not to get lost in the myriads of geo points on their maps. 

Working with ArcGIS map visualization in Power BI, you always hit that notification point that tells you about more features available in the non-free version of the ArcGIS map, so I wanted to explore this.

I pulled an opened dataset of TTC Routes and Schedules with the route definitions, stop patterns, stop locations, and schedules of the Toronto TTC ground buses and streetcars into my Power BI data model.

With the following table count statistics:
Stops - 10,614 records
Trips - 134,949 records
Routes - 200 records
Stop Times - 5,584,011 records

And this data model helped me to plot all (almost all) stops' geo locations in my ArcGIS map:

My next step was to sign up for the Plus subscription (Paid version of the ArgGIS map) in Power BI. The registration went smoothly, my credit card information has been collected, no warning signs. However, when I tried to sign in again and authenticate myself with my registration email address, I received an error message that something went wrong with ArcGIS Maps for Power BI with apologies for the inconvenience and request to refresh the page or check back later. I contacted the Canadian ESRI tech support but still haven't received any updates for this issue.

So, I was only left with the option to trust the ESRI documentation and check limitations of the free version of the ArcGIS Map in Power BI. If you read the documentation, basically you will get an idea of "more", there a couple of additional new features, but the Plus version of the map has all the standard features with more capacity:

Included with Power BIWith a Plus subscription
4 basic basemaps
4 basic basemaps and 8 others, including satellite imagery
1,500 features per map
5,000 features per map
100,000 features per month
1 million features per month
Reference layers
10 reference layers that contain U.S. demographics
Access to Esri Living Atlas maps and layers (feature services)
Publicly shared feature layers on ArcGIS
Publicly shared feature layers on ArcGIS
Curated gallery of U.S. demographics variables (7 categories)
Full access to the ArcGIS GeoEnrichment data browser, including U.S. and global demographics variables

More basic maps, more geo coordinates to map, more reference layers. The good thing about the reference layers is that you get exposed to the same set of publicly shared feature layers in both free and paid versions of the map, which is a cool thing! There is a search field in the Edit mode of the map that allows you to select and view one of those public layers within your map.

So now, knowing what my limits are, I switched my map theme to clustering and realized that it also had another limit that is not listed on the official ESRI site, the clustering feature in my case was limited to 10,000 geo points vs. the actual dataset of 10,614 stop locations. You can check those numbers if you like.

I still hope that ESRI tech support team will get back to me with my inquiry, otherwise, I will need to find a way to cancel my Plus subscription.

Happy data adventures!

(2018-July-29) Update: the ESRI support team finally reached out to me and helped to resolve the issue, please read my 2nd part of this story:

Sunday, July 15, 2018

SQL Server data flow into Excel template with Power Query function-based parameter

(2018-July-15) Your case is to create an Excel template with data extracted from a backend SQL Server database; it takes a few minutes and Power Query is a very handy tool to connect/shape and extract your data into a worksheet format. What if you need to create multiple files where your SQL bases data source is only one parameter different (like a report date or a product name).

In my case, I have a list of 7000 geo stations across the globe, with my default extract logic I was able to pull a list of Canadian stations in my Excel file. Just connect to a SQL Server using Power Query, and the job is done. But that is still part of the problem to solve.

What if you need to create individual files for each of the main parameter's values, in my case, it's a country name. Let's explore if we can add some flexibility to the existing Power Query with Excel-based parameters.

1) A new Power Query is created based on my existing Excel table by pressing the "From Table" option:

2) One conversion step is removed from this table in Power Query:

3) A specific cell is selected from sourcing Excel table, which converts the table into a scalar value in Power Query:

4) A Custom function is then created based on the sourcing scalar value:

5) Then an explicit filtering condition in the main data query is replaced with the newly created function:

6) And now I can choose any country from the list that I want and see their geo stations, even Greenland :-)

Happy data adventures!

Sunday, May 27, 2018

Power BI with Mapbox and Volcanic Ring of Fire

(2018-May-27) With recent Hawaiian volcano activity, it would be interesting to see how other similar events have shaped our Earth planet. The NOAA's National Centers for Environmental Information (NCEI) provides public access to the Volcano Location Database - And currently, it contains information about 1571 known volcanic eruptions.

I was able to extract this dataset into my Power BI data model and decided to use the new custom map visualization created by Mapbox that became available with this monthly Power BI update. 

The working experience with Mapbox is similar to other custom and native map visualization, however, it has one new additional feature that would require you to register and get your personal Mapbox access token.

As soon as you place this visualization control into your report canvas and then link it with your geo dataset, the following initial screen will show up:

Then after the initial steps of registration, you are able to get this access token to place into the Power BI Viz format panel:

Going back to the actual volcanic map, I found the proof, that the Ring of Fire does exist and it's around the Pacific Ocean.

And here is the actual Power BI report with the volcanic interactive map using Mapbox (which currently is only supported in Chrome and Firefox) :

Happy data adventures!

Sunday, May 20, 2018

Power BI with Maps: Choose Your Destination

(2018-May-20) A childhood dream to travel around the world fueled by reading Gulliver's Travels stories and Robinson Crusoe attempts to survive on a deserted island. Those books were filled with geographical description and map locations. I think I always loved working with maps and I still do! There is something that could captivate your mind and propel your imagination for traveling when you immerse yourself in maps.

Power BI itself a great tool that could feed your appetite for an adventure using built-in and custom map visualizations.

The built-in (or native) map visualizations are available right away when you install the Power BI desktop for the first time and Microsoft has provided sufficient information for you to get started working with them:
- Map
- Filled Map
- ArgGIS Map
- Shape Map

And there is another set of custom map visualization that we can use in building data analytics in Power BI (not a complete list):
- Flow map by Weiwei Cui
- Heatmap by Weiwei Cui
- Globe Map by Microsoft 
- Icon Map by Altius
- Mapbox Visual by Mapbox

Since there is more information about the built-in map visualizations then I'd like to share my experience working with the custom ones.

But before I do this, let me spend a few minutes describing how the Shape Map (still in Preview at the moment) works.

Shape Map

Once you enable this Preview feature in your desktop version of the Power BI then can you use a predefined set of Shapes:

or you can free your mind and upload a custom shapefile with Canadian provinces and Northen Territories, which I did from this site

Flow map by Weiwei Cui

Flow map is a special type of network visualization that portrays movements of objects among geo-locations. 

I built a simple dataset with the list of Provinces and Territories capitals and linked them all to Ottawa (Canadian capital). The component itself has a very rich set of formatting settings which I liked the most; so the final visualization could look totally different by changing one of them.

Heatmap by Weiwei Cui

Heat maps are a type of visualization to show data density on a map. They are particularly helpful when you have a lot of (e.g., tens of thousands of) data points on the map and are mainly interested in their overall distribution. Technically, in a heatmap, data points are aggregated locally and mapped to colors (either gradient or quantile), so that we can make better sense of the density of the data from the colors while still being able to see and use the map. 

I already blogged about the use of the heatmaps in Power BI - and we can always use the built-in ArgGIS heat maps option as well to see alternative ways for your dataset density.

Globe Map by Microsoft

One the best features of the GlobeMap is that it allows you to rotate the Globe and see it from different angles, and this is right within your Power BI report.

Globe Map is a 3D Map that makes the map exploration experience more immersive and magical. It provides the sense of connection to the data with the physical world. This, combined with our spatial ability, brings a new perspective to the data when presented as 3D objects. Globe Map also supports heat map on the spatial map. You can use a second measure for heat intensity and draw immediate attention to the right areas.

I used the Berkley Earth datasets - for this 3D reporting which was already featured in my previous blog post - 

Icon Map by Altius

The Icon Map visual shows data-bound images & lines or circles on a map and allows you to render images and lines or circles on the map. 

Mapbox Visual by Mapbox

The Mapbox makes heatmaps, circles, and clusters using big location data.
Your location data is more than just points on a map. With the Mapbox Visual for Power BI, quickly create beautiful custom map visuals that answer your business question with fast performance.

And here is the actual Power BI report with some of the custom map visualizations:

Happy data adventures!