Week 152 - SQL & Query Techniques
Challenge
Frosty's Frozen Logistics handles thousands of deliveries across the Arctic Circle every week. Partner warehouses submit shipment manifests as flat CSV dumps — and the data quality is, shall we say, glacial. Wrong types, missing IDs, strings that don't fit, dates that aren't dates. Until now, the ops team ran each import as a prayer: one bad row and the entire INSERT crashes, taking all the good rows with it.
Good news: Snowflake now supports DML Error Logging — set ERROR_LOGGING = TRUE on a table and your INSERTs keep going when they hit a bad row. The good rows land in the table, the bad rows get captured in a companion error table that you can query with ERROR_TABLE(). No more all-or-nothing loads.
Assignment
Help the Frosty Logistics ops team load a messy shipment manifest into a properly typed table, then use the error table to build a rejection report that shows exactly which rows failed and why.
Setup Code
-- Frosty Friday Week 152 — Setup
-- DML Error Logging
-- https://frostyfri.day/en/challenges/week-152
USE ROLE SYSADMIN;
CREATE DATABASE IF NOT EXISTS FROSTY_DB;
CREATE SCHEMA IF NOT EXISTS FROSTY_DB.WEEK_152;
USE SCHEMA FROSTY_DB.WEEK_152;
-- Raw shipment manifest from partner warehouses (everything is VARCHAR — the wild west)
CREATE OR REPLACE TABLE RAW_SHIPMENTS (
shipment_id VARCHAR,
sender VARCHAR,
recipient VARCHAR,
weight_kg VARCHAR,
destination VARCHAR,
priority VARCHAR,
delivery_date VARCHAR
);
INSERT INTO RAW_SHIPMENTS VALUES
('SH-001', 'Polar Provisions', 'Ice Hotel Tromsoe', '12.5', 'Tromsoe', 'STANDARD', '2026-04-01'),
('SH-002', 'Arctic Apothecary', 'Glacier General Store', '3.2', 'Reykjavik', 'EXPRESS', '2026-04-01'),
('SH-003', 'Tundra Tech', 'FrostByte Data Centre', 'heavy', 'Longyearbyen', 'EXPRESS', '2026-04-02'),
('SH-004', 'Snowdrift Supplies', 'Camp Basecamp', '45.0', 'Nuuk', 'STANDARD', '2026-04-02'),
(NULL, 'Penguin Post', 'McMurdo Station', '8.7', 'Antarctica', 'EXPRESS', '2026-04-03'),
('SH-006', 'Blizzard Books', 'Library of the North', '2.1', 'Murmansk', 'STANDARD', '2026-04-03'),
('SH-007', 'Icicle Instruments', 'Northern Philharmonic', '67.3', 'Fairbanks', 'OVERNIGHT','2026-04-03'),
('SH-008', 'Frostbite Fashion', 'Aurora Boutique', '1.8', 'Rovaniemi', 'EXPRESS', '2026-04-04'),
('SH-009', 'Glacier Gourmet', 'Polar Bear Bistro', '25.0', 'Hammerfest', 'STANDARD', '2026-04-04'),
('SH-010', 'Permafrost Parts', 'Snowmobile Central', '120.5', 'Yellowknife', 'STANDARD', '2026-04-04'),
('SH-011', 'Chill Chemicals', 'Ice Core Lab', '5.5', 'Summit Station', 'EXPRESS', '2026-04-05'),
('SH-012', 'Arctic Apothecary', 'Glacier General Store', '-3.0', 'Reykjavik', 'STANDARD', '2026-04-05'),
('SH-013', 'Snowdrift Supplies', 'Camp Basecamp', '15.0', 'Nuuk', 'STANDARD', 'next tuesday'),
('SH-014', 'Tundra Tech', 'FrostByte Data Centre', '9.9', 'Longyearbyen', 'EXPRESS', '2026-04-06'),
('SH-015', 'Polar Provisions', 'Ice Hotel Tromsoe', '7.7', 'Tromsoe', 'STANDARD', '2026-04-06'),
('SH-016', 'Blizzard Books', 'Library of the North', '4.3', 'Murmansk', 'EXPRESS', '2026-04-07'),
('SH-017', 'Icicle Instruments', NULL, '22.0', 'Fairbanks', 'STANDARD', '2026-04-07'),
('SH-018', 'Glacier Gourmet', 'Polar Bear Bistro', '11.1', 'Hammerfest', 'EXPRESS', '2026-04-07');
Your Objective
Step 1: Create the typed delivery table
Create a table called DELIVERIES with proper column types and constraints. Enable DML error logging on it.
| Column | Type | Constraint |
|---|---|---|
| shipment_id | VARCHAR(6) | NOT NULL |
| sender | VARCHAR(50) | |
| recipient | VARCHAR(50) | NOT NULL |
| weight_kg | NUMBER(5,1) | |
| destination | VARCHAR(50) | |
| priority | VARCHAR(8) | |
| delivery_date | DATE | |
🧊 I'm Stuck — Help! (Click me)
💡 Add `ERROR_LOGGING = TRUE` after the closing parenthesis of your `CREATE TABLE` statement, just like you'd add `CLUSTER BY` or `DATA_RETENTION_TIME_IN_DAYS`.
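Putting that hint together with the column table above, a sketch of the Step 1 DDL (syntax as described in this challenge) might look like:

```sql
-- Typed target table with DML error logging enabled
CREATE OR REPLACE TABLE DELIVERIES (
    shipment_id   VARCHAR(6)   NOT NULL,
    sender        VARCHAR(50),
    recipient     VARCHAR(50)  NOT NULL,
    weight_kg     NUMBER(5,1),
    destination   VARCHAR(50),
    priority      VARCHAR(8),
    delivery_date DATE
)
ERROR_LOGGING = TRUE;  -- a table property, like CLUSTER BY or DATA_RETENTION_TIME_IN_DAYS
```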
Step 2: Load the data
Write an INSERT INTO ... SELECT that casts each column from RAW_SHIPMENTS into the types expected by DELIVERIES. Run it and note how many rows were inserted.
🧊 I'm Stuck — Help! (Click me)
💡 Cast each column explicitly: `shipment_id::VARCHAR(6)`, `weight_kg::NUMBER(5,1)`, `delivery_date::DATE`, etc. With error logging enabled, the INSERT won't crash on bad rows — it will skip them and keep going. You should see 13 rows inserted out of 18.
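One way to write that load, casting every raw VARCHAR to the target type (a sketch, assuming the DELIVERIES table from Step 1):

```sql
-- With error logging enabled, rows that fail a cast or constraint
-- are diverted to the error table instead of failing the statement
INSERT INTO DELIVERIES
SELECT
    shipment_id::VARCHAR(6),
    sender::VARCHAR(50),
    recipient::VARCHAR(50),
    weight_kg::NUMBER(5,1),
    destination::VARCHAR(50),
    priority::VARCHAR(8),
    delivery_date::DATE
FROM RAW_SHIPMENTS;
```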
Step 3: Build the error report
Query the error table to produce a clean rejection report. Show:
- `ERROR_DATA:SHIPMENT_ID::VARCHAR` aliased as `SHIPMENT_ID`
- `ERROR_DATA:SENDER::VARCHAR` aliased as `SENDER`
- `ERROR_METADATA:error_source::VARCHAR` aliased as `PROBLEM_COLUMN`
- `ERROR_METADATA:error_message::VARCHAR` aliased as `ERROR_MESSAGE`
Order by SHIPMENT_ID.
🧊 I'm Stuck — Help! (Click me)
💡 Use `ERROR_TABLE(DELIVERIES)` as your FROM clause — it's a table function that returns the error log for any table with `ERROR_LOGGING = TRUE`. The columns `ERROR_METADATA` and `ERROR_DATA` are VARIANT (JSON), so use colon notation (`:`) to extract fields. The offending column's value is wrapped in an array in `ERROR_DATA` — that's how Snowflake flags which value caused the problem.
Expected Output:
SELECT
ERROR_DATA:SHIPMENT_ID::VARCHAR AS shipment_id,
ERROR_DATA:SENDER::VARCHAR AS sender,
ERROR_METADATA:error_source::VARCHAR AS problem_column,
ERROR_METADATA:error_message::VARCHAR AS error_message
FROM ERROR_TABLE(DELIVERIES)
ORDER BY shipment_id;
| SHIPMENT_ID | SENDER | PROBLEM_COLUMN | ERROR_MESSAGE |
|---|---|---|---|
| SH-003 | Tundra Tech | WEIGHT_KG | Numeric value 'heavy' is not recognized |
| SH-007 | Icicle Instruments | PRIORITY | String 'OVERNIGHT' is too long and would be truncated |
| SH-013 | Snowdrift Supplies | DELIVERY_DATE | Date 'next tuesday' is not recognized |
| SH-017 | Icicle Instruments | RECIPIENT | NULL result in a non-nullable column |
| [null] | Penguin Post | SHIPMENT_ID | NULL result in a non-nullable column |
Step 4: Reconciliation check
Prove that every row in RAW_SHIPMENTS is accounted for — either loaded into DELIVERIES or captured in the error table. Write a single query that returns three numbers:
- `rows_loaded` — count of rows in `DELIVERIES`
- `rows_rejected` — count of rows in `ERROR_TABLE(DELIVERIES)`
- `total_raw_rows` — count of rows in `RAW_SHIPMENTS`
🧊 I'm Stuck — Help! (Click me)
💡 Use scalar subqueries: `SELECT (SELECT COUNT(*) FROM DELIVERIES) AS rows_loaded, ...`. The three counts should add up: 13 + 5 = 18.
Expected Output:
SELECT
(SELECT COUNT(*) FROM DELIVERIES) AS rows_loaded,
(SELECT COUNT(*) FROM ERROR_TABLE(DELIVERIES)) AS rows_rejected,
(SELECT COUNT(*) FROM RAW_SHIPMENTS) AS total_raw_rows;
| ROWS_LOADED | ROWS_REJECTED | TOTAL_RAW_ROWS |
|---|---|---|
| 13 | 5 | 18 |
Hints
- DML error logging is enabled per-table with `ERROR_LOGGING = TRUE` on `CREATE TABLE` or `ALTER TABLE`. It's a table property, not a session setting.
- There's also a session parameter `OPT_OUT_ERROR_LOGGING` (default `FALSE`). If someone sets it to `TRUE`, error logging is disabled for that session regardless of the table property.
- `ERROR_TABLE(<table_name>)` is a table function — use it in your `FROM` clause like any other table function.
- The error table has five columns: `TIMESTAMP`, `QUERY_ID`, `ERROR_CODE`, `ERROR_METADATA` (VARIANT), and `ERROR_DATA` (VARIANT).
- `ERROR_METADATA` contains the error code, message, source column, and SQL state as JSON fields. `ERROR_DATA` contains the full row that failed, as JSON. The column that caused the error has its value wrapped in an array — that's Snowflake's way of flagging the offending value.
- You can `TRUNCATE TABLE ERROR_TABLE(<table_name>)` to clear the error log.
- This feature works with INSERT (both VALUES and SELECT), UPDATE, MERGE, and COPY INTO.
- Check out the DML error logging documentation for the full reference.
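The `ALTER TABLE` and `TRUNCATE` hints above can be sketched together like this (syntax as described in this challenge's hints, shown against the `DELIVERIES` table):

```sql
-- Enable error logging on a table that was created without it
ALTER TABLE DELIVERIES SET ERROR_LOGGING = TRUE;

-- Clear the error log once the rejection report has been handled
TRUNCATE TABLE ERROR_TABLE(DELIVERIES);
```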