Week 152 - SQL & Query Techniques

Challenge

Frosty's Frozen Logistics handles thousands of deliveries across the Arctic Circle every week. Partner warehouses submit shipment manifests as flat CSV dumps — and the data quality is, shall we say, glacial. Wrong types, missing IDs, strings that don't fit, dates that aren't dates. Until now, the ops team ran each import on a wing and a prayer: one bad row and the entire INSERT crashed, taking all the good rows down with it.

Good news: Snowflake now supports DML Error Logging — set ERROR_LOGGING = TRUE on a table and your INSERTs keep going when they hit a bad row. The good rows land in the table, the bad rows get captured in a companion error table that you can query with ERROR_TABLE(). No more all-or-nothing loads.

Assignment

Help the Frosty Logistics ops team load a messy shipment manifest into a properly typed table, then use the error table to build a rejection report that shows exactly which rows failed and why.

Setup Code

-- Frosty Friday Week 152 — Setup
-- DML Error Logging
-- https://frostyfri.day/en/challenges/week-152

USE ROLE SYSADMIN;

CREATE DATABASE IF NOT EXISTS FROSTY_DB;
CREATE SCHEMA IF NOT EXISTS FROSTY_DB.WEEK_152;
USE SCHEMA FROSTY_DB.WEEK_152;

-- Raw shipment manifest from partner warehouses (everything is VARCHAR — the wild west)
CREATE OR REPLACE TABLE RAW_SHIPMENTS (
    shipment_id    VARCHAR,
    sender         VARCHAR,
    recipient      VARCHAR,
    weight_kg      VARCHAR,
    destination    VARCHAR,
    priority       VARCHAR,
    delivery_date  VARCHAR
);

INSERT INTO RAW_SHIPMENTS VALUES
    ('SH-001', 'Polar Provisions',    'Ice Hotel Tromsoe',      '12.5',   'Tromsoe',        'STANDARD', '2026-04-01'),
    ('SH-002', 'Arctic Apothecary',   'Glacier General Store',  '3.2',    'Reykjavik',      'EXPRESS',  '2026-04-01'),
    ('SH-003', 'Tundra Tech',         'FrostByte Data Centre',  'heavy',  'Longyearbyen',   'EXPRESS',  '2026-04-02'),
    ('SH-004', 'Snowdrift Supplies',  'Camp Basecamp',          '45.0',   'Nuuk',           'STANDARD', '2026-04-02'),
    (NULL,     'Penguin Post',        'McMurdo Station',        '8.7',    'Antarctica',     'EXPRESS',  '2026-04-03'),
    ('SH-006', 'Blizzard Books',      'Library of the North',   '2.1',    'Murmansk',       'STANDARD', '2026-04-03'),
    ('SH-007', 'Icicle Instruments',  'Northern Philharmonic',  '67.3',   'Fairbanks',      'OVERNIGHT','2026-04-03'),
    ('SH-008', 'Frostbite Fashion',   'Aurora Boutique',        '1.8',    'Rovaniemi',      'EXPRESS',  '2026-04-04'),
    ('SH-009', 'Glacier Gourmet',     'Polar Bear Bistro',      '25.0',   'Hammerfest',     'STANDARD', '2026-04-04'),
    ('SH-010', 'Permafrost Parts',    'Snowmobile Central',     '120.5',  'Yellowknife',    'STANDARD', '2026-04-04'),
    ('SH-011', 'Chill Chemicals',     'Ice Core Lab',           '5.5',    'Summit Station', 'EXPRESS',  '2026-04-05'),
    ('SH-012', 'Arctic Apothecary',   'Glacier General Store',  '-3.0',   'Reykjavik',      'STANDARD', '2026-04-05'),
    ('SH-013', 'Snowdrift Supplies',  'Camp Basecamp',          '15.0',   'Nuuk',           'STANDARD', 'next tuesday'),
    ('SH-014', 'Tundra Tech',         'FrostByte Data Centre',  '9.9',    'Longyearbyen',   'EXPRESS',  '2026-04-06'),
    ('SH-015', 'Polar Provisions',    'Ice Hotel Tromsoe',      '7.7',    'Tromsoe',        'STANDARD', '2026-04-06'),
    ('SH-016', 'Blizzard Books',      'Library of the North',   '4.3',    'Murmansk',       'EXPRESS',  '2026-04-07'),
    ('SH-017', 'Icicle Instruments',  NULL,                     '22.0',   'Fairbanks',      'STANDARD', '2026-04-07'),
    ('SH-018', 'Glacier Gourmet',     'Polar Bear Bistro',      '11.1',   'Hammerfest',     'EXPRESS',  '2026-04-07');

Your Objective

Step 1: Create the typed delivery table

Create a table called DELIVERIES with proper column types and constraints. Enable DML error logging on it.

Column         Type         Constraint
-------------  -----------  ----------
shipment_id    VARCHAR(6)   NOT NULL
sender         VARCHAR(50)
recipient      VARCHAR(50)  NOT NULL
weight_kg      NUMBER(5,1)
destination    VARCHAR(50)
priority       VARCHAR(8)
delivery_date  DATE

🧊 I'm Stuck — Help! (Click me)

💡 Add ERROR_LOGGING = TRUE after the closing parenthesis of your CREATE TABLE statement, just like you'd add CLUSTER BY or DATA_RETENTION_TIME_IN_DAYS.
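One possible shape for Step 1 — the column types and constraints come straight from the table above, and the ERROR_LOGGING property is placed after the closing parenthesis as the hint describes:

```sql
-- Typed target table with DML error logging enabled.
-- Types and NOT NULL constraints follow the spec table above.
CREATE OR REPLACE TABLE DELIVERIES (
    shipment_id    VARCHAR(6)   NOT NULL,
    sender         VARCHAR(50),
    recipient      VARCHAR(50)  NOT NULL,
    weight_kg      NUMBER(5,1),
    destination    VARCHAR(50),
    priority       VARCHAR(8),
    delivery_date  DATE
)
ERROR_LOGGING = TRUE;
```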

Step 2: Load the data

Write an INSERT INTO ... SELECT that casts each column from RAW_SHIPMENTS into the types expected by DELIVERIES. Run it and note how many rows were inserted.

🧊 I'm Stuck — Help! (Click me)

💡 Cast each column explicitly: shipment_id::VARCHAR(6), weight_kg::NUMBER(5,1), delivery_date::DATE, etc. With error logging enabled, the INSERT won't crash on bad rows — it will skip them and keep going. You should see 13 rows inserted out of 18.
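The load described in the hint might look like this — each column is cast explicitly, and with ERROR_LOGGING = TRUE on DELIVERIES, rows that fail a cast or constraint are diverted to the error table instead of aborting the whole statement:

```sql
-- Cast-and-load from the all-VARCHAR staging table.
-- Expect 13 of the 18 rows to land; the other 5 go to the error table.
INSERT INTO DELIVERIES
SELECT
    shipment_id::VARCHAR(6),
    sender::VARCHAR(50),
    recipient::VARCHAR(50),
    weight_kg::NUMBER(5,1),
    destination::VARCHAR(50),
    priority::VARCHAR(8),
    delivery_date::DATE
FROM RAW_SHIPMENTS;
```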

Step 3: Build the error report

Query the error table to produce a clean rejection report. Show:

  • ERROR_DATA:SHIPMENT_ID::VARCHAR aliased as SHIPMENT_ID
  • ERROR_DATA:SENDER::VARCHAR aliased as SENDER
  • ERROR_METADATA:error_source::VARCHAR aliased as PROBLEM_COLUMN
  • ERROR_METADATA:error_message::VARCHAR aliased as ERROR_MESSAGE

Order by SHIPMENT_ID.

🧊 I'm Stuck — Help! (Click me)

💡 Use ERROR_TABLE(DELIVERIES) as your FROM clause — it's a table function that returns the error log for any table with ERROR_LOGGING = TRUE. The columns ERROR_METADATA and ERROR_DATA are VARIANT (JSON), so use colon notation (:) to extract fields. The offending column's value is wrapped in an array in ERROR_DATA — that's how Snowflake flags which value caused the problem.

Expected Output:

SELECT
    ERROR_DATA:SHIPMENT_ID::VARCHAR AS shipment_id,
    ERROR_DATA:SENDER::VARCHAR AS sender,
    ERROR_METADATA:error_source::VARCHAR AS problem_column,
    ERROR_METADATA:error_message::VARCHAR AS error_message
FROM ERROR_TABLE(DELIVERIES)
ORDER BY shipment_id;
SHIPMENT_ID  SENDER              PROBLEM_COLUMN  ERROR_MESSAGE
-----------  ------------------  --------------  -----------------------------------------------------
SH-003       Tundra Tech         WEIGHT_KG       Numeric value 'heavy' is not recognized
SH-007       Icicle Instruments  PRIORITY        String 'OVERNIGHT' is too long and would be truncated
SH-013       Snowdrift Supplies  DELIVERY_DATE   Date 'next tuesday' is not recognized
SH-017       Icicle Instruments  RECIPIENT       NULL result in a non-nullable column
[null]       Penguin Post        SHIPMENT_ID     NULL result in a non-nullable column

Step 4: Reconciliation check

Prove that every row in RAW_SHIPMENTS is accounted for — either loaded into DELIVERIES or captured in the error table. Write a single query that returns three numbers:

  • rows_loaded — count of rows in DELIVERIES
  • rows_rejected — count of rows in ERROR_TABLE(DELIVERIES)
  • total_raw_rows — count of rows in RAW_SHIPMENTS
🧊 I'm Stuck — Help! (Click me)

💡 Use scalar subqueries: SELECT (SELECT COUNT(*) FROM DELIVERIES) AS rows_loaded, .... The three counts should add up: 13 + 5 = 18.

Expected Output:

SELECT
    (SELECT COUNT(*) FROM DELIVERIES) AS rows_loaded,
    (SELECT COUNT(*) FROM ERROR_TABLE(DELIVERIES)) AS rows_rejected,
    (SELECT COUNT(*) FROM RAW_SHIPMENTS) AS total_raw_rows;
ROWS_LOADED  ROWS_REJECTED  TOTAL_RAW_ROWS
-----------  -------------  --------------
13           5              18

Hints

  • DML error logging is enabled per-table with ERROR_LOGGING = TRUE on CREATE TABLE or ALTER TABLE. It's a table property, not a session setting.
  • There's also a session parameter OPT_OUT_ERROR_LOGGING (default FALSE). If someone sets it to TRUE, error logging is disabled for that session regardless of the table property.
  • ERROR_TABLE(<table_name>) is a table function — use it in your FROM clause like any other table function.
  • The error table has five columns: TIMESTAMP, QUERY_ID, ERROR_CODE, ERROR_METADATA (VARIANT), and ERROR_DATA (VARIANT).
  • ERROR_METADATA contains the error code, message, source column, and SQL state as JSON fields.
  • ERROR_DATA contains the full row that failed, as JSON. The column that caused the error has its value wrapped in an array — that's Snowflake's way of flagging the offending value.
  • You can TRUNCATE TABLE ERROR_TABLE(<table_name>) to clear the error log.
  • This feature works with INSERT (both VALUES and SELECT), UPDATE, MERGE, and COPY INTO.
  • Check out the DML error logging documentation for the full reference.
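Putting a couple of these hints together — note that the exact ALTER TABLE clause is an assumption based on the first hint, so check the documentation for the precise syntax:

```sql
-- Enable error logging on an existing table (assumed SET form,
-- per the hint that ALTER TABLE can enable the property).
ALTER TABLE DELIVERIES SET ERROR_LOGGING = TRUE;

-- Clear the error log for a fresh run, using the
-- TRUNCATE form mentioned in the hints above.
TRUNCATE TABLE ERROR_TABLE(DELIVERIES);
```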
Next

Week 151 - Cortex Agents