Week 15 - Stored Procedures & UDFs

This week we’re working with small dataset about house sales that needs to be categorized according to certain sliding bins. The catch however , is that the sizes and the number of bins can change quickly.

The challenge is to create a function with a single name that does the following:

  • can handle uneven bin sizes
  • the first parameter must be the column that will inform your bins (in this example, we categorise according to [price])
  • the second parameter should specify the ranges of your bins (remember, these are uneven bins, bin 1 could be 1 - 400, and bin 2 401 - 708, while bin 3 is 709 - 3000) how you do this is up to you: you can specify lower bounds, upper bounds, both, count within each bin....
  • if using SQL, as a minimum, it should be able to handle 2-6 bins, if using other languages then you will find them flexible enough to allow you to do any number of bins

The query should look like the following :

SELECT sale_date,price,your_function(price,<bin_ranges>) AS BUCKET_SET1,your_function(price,<bin_ranges>) AS BUCKET_SET2,your_function(price,<bin_ranges>) AS BUCKET_SET3,FROM home_sales

You will then need to pass in the following bin/bucket ranges for testing purposes:

  • Bucket_Set1:
    • 1: 0 - 1
    • 2: 2 - 310,000
    • 3: 310001 - 400000
    • 4: 400001 - 500000
  • Bucket_Set2:
    • 1: 0 - 210000
    • 2: 210001 - 350000
  • Bucket_Set3:
    • 1: 0 - 250000
    • 2: 250001 - 290001
    • 3: 290002 - 320000
    • 4: 320001 - 360000
    • 5: 360001 - 410000
    • 6: 410001 - 470001
Click image to expand

create table home_sales (
sale_date date,
price number(11, 2)
);

insert into home_sales (sale_date, price) values
('2013-08-01'::date, 290000.00),
('2014-02-01'::date, 320000.00),
('2015-04-01'::date, 399999.99),
('2016-04-01'::date, 400000.00),
('2017-04-01'::date, 470000.00),
('2018-04-01'::date, 510000.00);

Given the buckets above, the following results should be generated:

Click image to expand

Remember if you want to participate:

  1. Sign up as a member of Frosty Friday. You can do this by clicking on the sidebar, and then going to ‘REGISTER‘ (note joining our mailing list does not give you a Frosty Friday account)
  2. Post your code to GitHub and make it publicly available (Check out our guide if you don't know how to here)
  3. Post the URL in the comments of the challenge.

You can also sign up to our mailing list below.

Previous
Previous

Week 16 - Semi-structured Data

Next
Next

Week 14 - Semi-structured Data