Week 25 - Semi-structured Data

To celebrate Women in Data and the upcoming Data+Women event, we've partnered up with Gaia Pometto , a data consultant in the Netherlands, to release a challenge for the Data+Women event. She'll go through the challenge, live, at the event and you can find her on LinkedIn, Twitter and YouTube.

She also wanted to give a special mention to colleague Atzmon Ben Binyamin who collaborated with her on this challenge and was a great help throughout the process. You can find Atzmon on LinkedIn.

The Challenge

In order to prepare a future ingestion flow of data, your organization wants to get data from a weather API into Snowflake. At the end of the flow, it is expected to create a fact table to be used by consumers. 

Your colleague Atzmon has already pulled a sample of the weather data from the API and momentarily stored it in an s3 bucket. The s3 bucket contains a JSON file with hourly data for 7 days. 

Your Flow will consist of 3 layers (schemas) of data:
a. Landing zone (external stage, raw table - weather_raw)
b. Curated zone (raw data parsed table - weather_parsed)
c. Consumption zone (aggregated data table - weather_agg)

Objectives: 
1. Create a stage for the landing layer and copy the JSON. 
2. Parse the JSON and create a table - weather_parsed (in the curated zone). 
3. Create a table for consumption weather_agg (in the consumption zone).     
The table should have aggregates per day for: 
a. Temperature       
b. Wind speed       
c. Distinct 'icon' definitions for a day (e.g. ["cloudy", "rain"])

Here is the link to the s3 bucket (hint: bucket = frostyfridaychallenges, path = challenge_25).
Good luck!

The solution should look something like this :

Remember, if you want to participate:

  1. Sign up as a member of Frosty Friday. You can do this by clicking on the sidebar, and then going to ‘REGISTER
  2. Post your code to GitHub and make it publicly available (Check out our guide if you don't know how to here)
  3. Post the URL in the comments of the challenge
Previous
Previous

Week 26 - Tasks & Scheduling

Next
Next

Week 24 - Streamlit & Apps