Geo Location of Twitter messages in Power BI

[2017-June-25] There are several custom solutions that make Twitter messages analysis possible in Power BI. And with a recent addition of the Campaign/Brand Management for Twitter solution template for Power BI (https://powerbi.microsoft.com/en-us/solution-templates/brand-management-twitter/) it's getting even less complicated to get into this social media mainstream data exploration.

Here is the architecture of this solution template, where your Microsoft Azure subscription and Twitter account will provide you with access to analytical data based on Twitter handlers and hashtags of your choice. 


- Logic Apps pull data from Twitter
- Logic App passes tweet into an Azure Function
- Azure Function enriches tweet and writes it to Azure SQL
- Azure Function also calls Azure ML experiment running Python script to work out sentiment of tweet
- Power BI imports data into it from Azure SQL and renders pre-defined reports


You can find more details on how to customize this solution template here: Brand & Campaign Management Solution Template Documentation

However, with my recent customer Power BI project implementation with this template, a request was made to add a filter for twitter messages based on their geo locations; customer wanted to see tweets only from the Canadian Twitter audience. Then I realized that all this information was not easy to obtain, because it's optional for Twitter users to populate it and even update their profile location is also optional (Twitter Geo Guidelines). But I still decided to try parsing user profile location in order to separate tweets from the Canadian users and users from other countries.

For this blog post I have used @canada150th twitter account messages and messages from all other accounts that contained #canada150 hashtag as well.

Here is my final report:




And here is how I was able to add twitter user profile location data into my report. 

1) I added a new location column to the table that holds all the processed tweet message in my Azure SQL database:
alter table [pbist_twitter].[tweets_processed] add location nvarchar(100);


2) Made a few changes in the code of the Azure function (TweetFunctionCSharp.cs) that processes tweet messages:

    private Dictionary<string, string> processedTweets = new Dictionary<string, string>()
        {
            {"tweetid", null},
            {"masterid", null},
            {"image_url", null},
            {"dateorig", null},
            {"authorimage_url", null},
            {"username", null},
            {"hourofdate", null},
            {"minuteofdate", null},
            {"direction", "Text"},
            {"favorited", "1"},
            {"retweet", "False"},
            {"user_followers", null},
            {"user_friends", null},
            {"user_favorites", null},
            {"user_totaltweets", null},
            {"location", null}
        };


processedTweets["location"] = tweet.UserDetails.Location;


3) Created a function in the SQL Azure database that would try to parse a text from the user profile location column and check if that profile belongs to Canada:

CREATE FUNCTION [dbo].[ufnCheckTwitterProfileCountry]
(
    @location nvarchar(100)
)
RETURNS nvarchar(30)
AS
BEGIN
        -- Declare the return variable here
    DECLARE @Country nvarchar(30)
 SELECT @Country = 'Non-Canada'

 DECLARE @TempTable as TABLE (location nvarchar(100));
 INSERT INTO @TempTable (location)
 select ' AB' as Location
 union select ' BC'
 union select ' MB'
 union select ' NB'
 union select ' NL'
 union select ' NS'
 union select ' NT'
 union select ' NU'
 union select ' ON'
 union select ' PE'
 union select ' QC'
 union select ' SK'
 union select ' YT'
 union select ',AB'
 union select ',BC'
 union select ',MB'
 union select ',NB'
 union select ',NL'
 union select ',NS'
 union select ',NT'
 union select ',NU'
 union select ',ON'
 union select ',PE'
 union select ',QC'
 union select ',SK'
 union select ',YT'
 union select 'Alberta'
 union select 'British Columbia'
 union select 'Manitoba'
 union select 'New Brunswick'
 union select 'Newfoundland and Labrador'
 union select 'Nova Scotia'
 union select 'Northwest Territories'
 union select 'Nunavut'
 union select 'Ontario'
 union select 'Prince Edward Island'
 union select 'Quebec'
 union select 'Québec'
 union select 'Saskatchewan'
 union select 'Yukon'
 union select 'Canada'

    -- Add the T-SQL statements to compute the return value here
    SELECT @Country = 'Canada'
 WHERE 
 (
 SELECT SUM(CHARINDEX(t.location, @location)) 
 from @TempTable t
 ) > 0;
 
    -- Return the result of the function
 RETURN @Country
END
GO


4) Updated a SQL view code to bring a Country code that is used in my Power BI report:

CREATE VIEW [pbist_twitter].[vw_tweets_processed]
AS
    SELECT tweetid    AS [Tweet Id],
           dateorig    AS [Original Date],
           Convert(date,[dateorig])  AS [Date],
           hourofdate    AS [Hours],
           minuteofdate    AS [Minutes],
           latitude    AS [Latitude],
           longitude    AS [Longitude],
           masterid    AS [Master Id],
           retweet    AS [Retweet],
           username    AS [Username],
           usernumber    AS [User Number],
           image_url    AS [Image URL],
           authorimage_url   AS [Author Image URL],
           direction    AS [Direction],
           favorited    AS [Favorited],
           user_followers   AS [User Followers],
           user_friends    AS [User Friends],
           user_favorites   As [User Favourites],
           user_totaltweets   AS [User Total Tweets]
 ,location
 ,dbo.ufnCheckTwitterProfileCountry(location) as Country
    FROM   pbist_twitter.tweets_processed;
GO


5) And now, with the hope that all Twitter users in my solution would have their profile location populated in addition to another hope, that this location free text field would be correctly parsed by my SQL function, I can add a Country filter to the Power BI report, which you could see at the very beginning of this blog post in all of its 6 pages.

Problem solved! Feel free to comment if you find this approach to add location details to your Power BI Twitter template solution helpful!

Happy 150th Canada day and happy data adventures!

Comments

  1. Hi there. I'm not sure if this helps any, but with respect to getting the Twitter data, this is possible to do from within SQL Server itself. It does require SQLCLR so it won't work on Azure SQL Database (not even when SQLCLR was allowed since this requires EXTERNAL_ACCESS), but it works just fine on SQL Server Express Edition which can be run on premise or in an Azure VM. The Free version of the SQL# ( https://SQLsharp.com/ ) SQLCLR library (which I wrote) has several TVFs for querying Twitter (it handles the OAuth and parsing of most of the fields). There are instructions in the documentation area for how to set it up. Take care, Solomon...

    ReplyDelete
  2. Thanks Solomon for sharing the script. In my case I only changed the existing Power BI solution in Azure, where SQL Azure database was used just as a data repository. In addition to that, tweet sentiment analysis was done with the help of Text Cognitive services in Azure.

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete
  4. Ok. Just to be clear, I was not suggesting that SQL# ( http://SQLsharp.com/ {link didn't work in first comment and I have no way to edit it} ) could help with the analysis, just that perhaps it could help with the fetching of the data from Twitter that then gets stored in Azure SQL Database. Regardless, what you have done here is very interesting :).

    ReplyDelete

Post a Comment