Data story behind a Global Fishing Power BI report

(2019-Mar-01) The first time I heard about the Data For A Cause initiatives was at one of the Power BI meetups in Toronto. And I actually didn't attend that meeting myself, but only saw pictures and learned a bit more about this virtual community of data professionals and Olga Tsubiks, who is the lead of this group. They help not-for-profit organizations to explore their datasets and create visualizations to assist them build awareness and better understanding. 

This time the Data for a Cause challenge was to work with Global Fishing Watch and visualize the fishing activity of vessels fishing in other country's waters. I was able to create a Power BI report using their data, however, this blog post is not about best practices of creating analytical reports in Power BI, but it's rather a data story of working with provided datasets and a thinking process of creating an interactive tool to visually explore fishing data.

Link to Global Fishing Power BI report.





Data

Fishing dataset had been presented in a flat file format with the following list of attributes:
       fishing_year
      ,fishing_date: a string in format “YYYY-MM-DD”
      ,fishing_lat_bin: grid cell
      ,fishing_lon_bin: grid cell
      ,fishing_mmsi: unique ID of the fishing vessel, AIS tracking device
      ,fishing_fishing_hours: hours the vessels were present in this location on this day
      ,vessel_mmsi: same as fishing_mmsi
      ,vessel_shipname
      ,vessel_callsign
      ,vessel_flag:  the country of the fishing vessel, in iso3 value
      ,vessel_imo
      ,vessel_registry_geartype: vessel gear types. The column Vessel Registry Geartype refers to the official purpose of the vessel. By analyzing the movement of the boat, GFW identified the vessel as a fishing vessel (even if registered otherwise) - Vessel Inferred Geartype.
      ,vessel_inferred_geartype
      ,vessel_inferred_geartype_score
      ,vessel_inferred_subgeartype
      ,vessel_inferred_subgeartype_score
      ,vessel_registry_length
      ,vessel_inferred_length
      ,vessel_registry_tonnage
      ,vessel_inferred_tonnage
      ,vessel_registry_engine_power
      ,vessel_inferred_engine_power
      ,vessel_source
      ,ISO_Ter1: the country where the fishing occurred (an exclusive economic zone of, in iso3). 
      ,ISO_Ter2
      ,ISO_Ter3

Data file contained 6,929,885 rows and it covered 5 years of fishing activities from 2012 to 2016.

Additionally, I pulled the list country codes with their names from this Wikipedia resource - https://en.wikipedia.org/wiki/List_of_ISO_3166_country_codes.
And I also found a Capture fisheries production dataset from the World Data Bank that contained data for the period of time from 1960 to 2016. This datasets was in an Excel file format and where one of the worksheets contained list of countries with some additional attributes, which I thought would be a good candidate for my country profile data table.


Eventually, after pulling data from the main data file, Wikipedia county codes list and fish capture data, my final data model in Power BI looked the following way, where I replicated the county code table both for Vessel & Fishing Area country info.




Metrics & Relationships between different data attributes

This was my thinking process to create a visual story with this data model:
1) Metrics:
    - Total Fishing Hours
    - Counts of Vessels (by Country, by Types/SubTypes)
    - Fish Caught (other data set)
2) Relationships:
    - Fish Caught by Country and its fishing activity on the map
    - Relationships between Vessel countries and Fishing area countries
3) Additional consideration to use maps since the Latitude/Longitude coordinates were provided.

Creating visualizations in Power BI

Summary report




For the summary page report, I used the following custom and standard visual components:
- Information cards to show counts of vessels, fishing hours and fish caught. 
Enlighten World Flag Slicer to filter vessel countries
Drill-down donut chart by ZoomCharts to show country fleet vessel distribution by vessel type
- Area and Bar charts to show fish capture data
- Narrative Science for Power BI component to create natural language narratives based on fish capture time series. I really liked this component in Power BI since it allows me to automatically and interactively generate a textual description of my numeric data, it also helps to show any potential trendlines in my dataset. And I had already blogged about this Narrative Science component for Power BI in the past: http://datanrg.blogspot.com/search?q=narrative+science.

Example of narratives generated for one selected country:



Heat map




For the second page of my report, I decided to use Heatmap visualization to show fishing hours density across different regions of the world. Latitude & longitude coordinates helped me to plot vessel fishing events on this map along with fishing hours to show different scales of heat.
Timeline slicer was used to select a date range for my heat map using the fishing_date attribute.


Flow map



Knowing that my main data set contained information about vessel and fishing area countries, it came naturally to assess how both of them relate to each, i.e. to analyze how each vessel country fish in another country's waters.

And Flow map visualization worked well for this type of analysis, where you can set Origin and Destination fields from your dataset. Also, you can set whether your connections would be in a form of straight or curved lines. 

In overall this Flow Map visualization helped to analyze fishing activities of richer countries in poorer countries' waters. And you could select a country vessels flow with the help of the filter at the top of this report page.

Summary
- Working with open data sets is always an interesting experience, especially when it contains geo coordinates that you can utilize with various map visualizations.
- I hope that this Power BI report does help to analyze fishing activities in the world. And if you are aware of other nonprofit organizations that I can help with this type of data analytics, please let me know, it would be my way of giving back to the community.

Comments

  1. Hello, would you be happy to share your dataset? Thank you

    ReplyDelete
    Replies
    1. Hello Faye, it can be found here - https://globalfishingwatch.org/datasets-and-code/

      Delete
    2. Thanks Rayis. I'm very much a data and Power BI newbie, and am having trouble figuring out where you extracted the data for your "Fish Watch" table, especially the geographic data. Would you be able to help me out? Thanks for replying.

      Delete
    3. It's easy:
      1) Go to this web-site: https://globalfishingwatch.org/datasets-and-code/
      2) Click Download Data, this will lead you to this site: https://globalfishingwatch.org/data-download/
      3) Register yourself
      4) This will open available datasets, you can choose any datasets if you like, however this one will be interesting to you, "Daily Fishing Effort at 100th Degree Resolution by Flag State and Geartype"
      5) Download either Daily or Yearly csv file to your local computer, then unzip it.
      6) Then you can upload this file into your Power BI desktop, create your data model and analyze

      Delete
    4. Table Schema
      date: a string in format “YYYY-MM-DD” indicating the date for daily csvs
      year: year of data for annual csvs
      lat_bin: the southern edge of the grid cell, in 100ths of a degree -- 101 is the grid cell with a southern edge at 1.01 degrees north
      lon_bin: the western edge of the grid cell, in 100ths of a degree -- 101 is the grid cell with a western edge at 1.01 degrees east
      flag: the flag state of the fishing effort, in iso3 value
      geartype: see our description of geartpyes
      vessel_hours: hours that vessels of this geartype and flag were present in this gridcell on this day
      fishing_hours: hours that vessels of this geartype and flag were fishing in this gridcell on this day
      mmsi_present: number of mmsi of this flag state and geartype that visited this grid cell on this day

      Delete
  2. Good morning. I wonder if there is any new link to download the dashboard. The current link is giving error. Thank you very much and congratulations for the blog.

    ReplyDelete

Post a Comment