Week 157 - Custom Incremental Dynamic Tables

Challenge

Snowflake Summit 2026 is days away and the event operations team is drowning. Thousands of attendees are registering, updating their session picks, and cancelling β€” all streaming into a single raw registrations table. The standard dynamic table pipeline they built last year can't keep up: it doesn't know how to handle cancellations (soft-deletes), and re-processing every attendee on every refresh is melting their warehouse budget faster than a snow cone in San Francisco.

They've heard about Custom Incremental Dynamic Tables β€” a new way to write your own MERGE logic inside a REFRESH USING clause, letting Snowflake handle the scheduling and retries while you control exactly how changes land. Time to put it to work.

Assignment

Build a custom incremental dynamic table pipeline that maintains a live attendee dashboard for Summit 2026 β€” handling new registrations, session changes, and cancellations incrementally using MERGE INTO SELF.

Setup Code

Setup Code (click me)
-- Frosty Friday Week 157 β€” Setup
-- Custom Incremental Dynamic Tables
-- https://frostyfri.day/en/challenges/week-157

USE ROLE SYSADMIN;

CREATE DATABASE IF NOT EXISTS FROSTY_DB;
CREATE SCHEMA IF NOT EXISTS FROSTY_DB.WEEK_157;
USE SCHEMA FROSTY_DB.WEEK_157;

CREATE OR REPLACE WAREHOUSE FROSTY_WH
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;

-- Summit sessions catalogue (static dimension)
CREATE OR REPLACE TABLE SESSIONS (
    session_id    VARCHAR(6)   NOT NULL,
    session_title VARCHAR(100) NOT NULL,
    track         VARCHAR(30)  NOT NULL,
    capacity      INT          NOT NULL,
    day           DATE         NOT NULL
);

INSERT INTO SESSIONS VALUES
    ('S001', 'Keynote: The Future of Data',         'General',       5000, '2026-06-03'),
    ('S002', 'Dynamic Tables Deep Dive',            'Engineering',    300, '2026-06-03'),
    ('S003', 'Cortex AI in Production',             'AI/ML',          400, '2026-06-03'),
    ('S004', 'Building Native Apps',                'Apps',           250, '2026-06-04'),
    ('S005', 'Iceberg & Open Formats',              'Engineering',    350, '2026-06-04'),
    ('S006', 'Data Governance at Scale',            'Governance',     200, '2026-06-04'),
    ('S007', 'Snowpark Container Services',         'Engineering',    300, '2026-06-05'),
    ('S008', 'Cost Optimization Masterclass',       'Administration', 250, '2026-06-05'),
    ('S009', 'Streaming & Real-Time Pipelines',     'Engineering',    350, '2026-06-05'),
    ('S010', 'Closing Keynote: Community Awards',   'General',       5000, '2026-06-05');

-- Raw registrations table (change-tracked source)
-- This simulates the CDC feed from the registration platform.
CREATE OR REPLACE TABLE REGISTRATIONS (
    registration_id VARCHAR(8)   NOT NULL,
    attendee_name   VARCHAR(50)  NOT NULL,
    attendee_email  VARCHAR(100) NOT NULL,
    company         VARCHAR(50)  NOT NULL,
    session_id      VARCHAR(6)   NOT NULL,
    status          VARCHAR(12)  NOT NULL,  -- REGISTERED, CANCELLED
    registered_at   TIMESTAMP_NTZ NOT NULL,
    updated_at      TIMESTAMP_NTZ NOT NULL,
    CONSTRAINT pk_registrations PRIMARY KEY (registration_id) RELY
);

-- Enable change tracking on the source
ALTER TABLE REGISTRATIONS SET CHANGE_TRACKING = TRUE;

-- Initial batch of registrations (the "historical" data already in the system)
INSERT INTO REGISTRATIONS VALUES
    ('R001', 'Alice Chen',      'alice@dataflow.io',     'DataFlow Inc',   'S001', 'REGISTERED', '2026-05-01 09:00:00', '2026-05-01 09:00:00'),
    ('R002', 'Bob Martinez',    'bob@snowpros.com',      'SnowPros LLC',   'S002', 'REGISTERED', '2026-05-01 09:15:00', '2026-05-01 09:15:00'),
    ('R003', 'Carol Johansson', 'carol@nordicdata.se',   'Nordic Data AB', 'S003', 'REGISTERED', '2026-05-01 10:00:00', '2026-05-01 10:00:00'),
    ('R004', 'Dave Kim',        'dave@querycraft.kr',    'QueryCraft',     'S005', 'REGISTERED', '2026-05-02 08:30:00', '2026-05-02 08:30:00'),
    ('R005', 'Eve Novak',       'eve@frostbyte.cz',     'FrostByte s.r.o','S001', 'REGISTERED', '2026-05-02 11:00:00', '2026-05-02 11:00:00'),
    ('R006', 'Frank Liu',       'frank@icebreaker.cn',   'Icebreaker Ltd', 'S004', 'REGISTERED', '2026-05-03 14:00:00', '2026-05-03 14:00:00'),
    ('R007', 'Grace Okafor',    'grace@polarpipe.ng',    'PolarPipe',      'S006', 'REGISTERED', '2026-05-03 15:30:00', '2026-05-03 15:30:00'),
    ('R008', 'Hiro Tanaka',     'hiro@blizzardbi.jp',   'BlizzardBI',     'S007', 'REGISTERED', '2026-05-04 09:00:00', '2026-05-04 09:00:00'),
    ('R009', 'Ines Bergmann',   'ines@alpineml.de',      'Alpine ML',      'S003', 'REGISTERED', '2026-05-04 10:30:00', '2026-05-04 10:30:00'),
    ('R010', 'Jake Thompson',   'jake@summitdata.us',    'Summit Data Co', 'S008', 'REGISTERED', '2026-05-05 07:45:00', '2026-05-05 07:45:00'),
    ('R011', 'Kaya Petrov',     'kaya@glacierdb.bg',     'GlacierDB',      'S009', 'REGISTERED', '2026-05-05 13:00:00', '2026-05-05 13:00:00'),
    ('R012', 'Liam Frost',      'liam@tundratech.ca',    'TundraTech',     'S002', 'REGISTERED', '2026-05-06 08:00:00', '2026-05-06 08:00:00'),
    ('R013', 'Mia Santos',      'mia@snowbound.br',      'Snowbound',      'S010', 'REGISTERED', '2026-05-06 09:30:00', '2026-05-06 09:30:00'),
    ('R014', 'Noah Fischer',    'noah@permafrost.at',    'Permafrost GmbH','S005', 'REGISTERED', '2026-05-07 11:00:00', '2026-05-07 11:00:00'),
    ('R015', 'Olivia Park',     'olivia@arcticai.kr',    'Arctic AI',      'S003', 'REGISTERED', '2026-05-07 14:15:00', '2026-05-07 14:15:00');

Your Objective

Step 1: Create the custom incremental dynamic table

Create a dynamic table called SUMMIT_DASHBOARD that uses REFRESH USING with a MERGE INTO SELF to incrementally process changes from the REGISTRATIONS table. The table should:

  • Join incoming registration changes with the SESSIONS dimension table to enrich each row
  • Handle INSERTs (new registrations)
  • Handle UPDATEs (session changes or cancellations)
  • Handle DELETEs from the source (remove from dashboard)

Columns:

Column Type
registration_id VARCHAR(8)
attendee_name VARCHAR(50)
company VARCHAR(50)
session_id VARCHAR(6)
session_title VARCHAR(100)
track VARCHAR(30)
day DATE
status VARCHAR(12)
registered_at TIMESTAMP_NTZ
updated_at TIMESTAMP_NTZ

Use TARGET_LAG = '1 minute' and warehouse FROSTY_WH.

🧊 I'm Stuck β€” Help! (Click me)

πŸ’‘ The key pattern is MERGE INTO SELF USING (SELECT ... FROM registrations CHANGES() AS r JOIN sessions AS s ON r.session_id = s.session_id) AS src ON self.registration_id = src.registration_id. Use METADATA$ACTION to detect deletes β€” when the action is 'DELETE' and it's not part of an update (NOT METADATA$ISUPDATE), that's a real delete from the source. Handle MATCHED with UPDATE for changes and NOT MATCHED with INSERT for new rows. Don't forget to filter out the DELETE half of update-pairs in your insert/update logic.

Expected Output (after the initial refresh populates from existing data β€” query SELECT COUNT(*) AS attendees FROM SUMMIT_DASHBOARD WHERE status = 'REGISTERED'):

ATTENDEES
15

Step 2: Simulate new registrations, a session change, and a cancellation

Run these DML statements against the REGISTRATIONS source table to simulate real-world changes arriving after the initial load:

-- New registration
INSERT INTO FROSTY_DB.WEEK_157.REGISTRATIONS VALUES
    ('R016', 'Priya Sharma', 'priya@snowscale.in', 'SnowScale', 'S002',
     'REGISTERED', '2026-05-20 10:00:00', '2026-05-20 10:00:00');

-- Session change: Bob switches from Dynamic Tables Deep Dive to Iceberg
UPDATE FROSTY_DB.WEEK_157.REGISTRATIONS
   SET session_id = 'S005',
       updated_at = '2026-05-20 11:00:00'
 WHERE registration_id = 'R002';

-- Cancellation: Frank can't make it
UPDATE FROSTY_DB.WEEK_157.REGISTRATIONS
   SET status = 'CANCELLED',
       updated_at = '2026-05-20 12:00:00'
 WHERE registration_id = 'R006';

Wait for the dynamic table to refresh (you can trigger a manual refresh with ALTER DYNAMIC TABLE SUMMIT_DASHBOARD REFRESH), then query the dashboard.

🧊 I'm Stuck β€” Help! (Click me)

πŸ’‘ After the refresh, Bob's session_title should now show 'Iceberg & Open Formats' (not 'Dynamic Tables Deep Dive'), Frank's status should be 'CANCELLED', and Priya should appear as a new row. The MERGE handles all three cases: WHEN MATCHED THEN UPDATE covers both the session change and the cancellation; WHEN NOT MATCHED covers the new registration.

Expected Output (query: SELECT registration_id, attendee_name, session_title, status FROM SUMMIT_DASHBOARD WHERE registration_id IN ('R002', 'R006', 'R016') ORDER BY registration_id):

REGISTRATION_ID ATTENDEE_NAME SESSION_TITLE STATUS
R002 Bob Martinez Iceberg & Open Formats REGISTERED
R006 Frank Liu Building Native Apps CANCELLED
R016 Priya Sharma Dynamic Tables Deep Dive REGISTERED

Step 3: Build an attendee count summary using a standard dynamic table downstream

Create a second dynamic table called SESSION_HEADCOUNT that sits downstream of SUMMIT_DASHBOARD β€” this one uses a normal SELECT-based definition (not custom incremental). It should show, per session, the count of currently registered (non-cancelled) attendees versus total capacity.

Columns:

Column Type
session_id VARCHAR(6)
session_title VARCHAR(100)
track VARCHAR(30)
day DATE
registered INT
capacity INT
fill_pct NUMBER(5,1)

Use TARGET_LAG = '1 minute' and warehouse FROSTY_WH.

🧊 I'm Stuck β€” Help! (Click me)

πŸ’‘ This is a straightforward SELECT-based dynamic table β€” no REFRESH USING needed. LEFT JOIN SESSIONS with SUMMIT_DASHBOARD on session_id with an additional filter d.status = 'REGISTERED' in the ON clause. GROUP BY session, and compute fill_pct as ROUND(COUNT(d.registration_id) / s.capacity * 100, 1). The LEFT JOIN from SESSIONS ensures sessions with zero registrations still appear.

Expected Output (query: SELECT session_title, registered, capacity, fill_pct FROM SESSION_HEADCOUNT ORDER BY fill_pct DESC LIMIT 5):

SESSION_TITLE REGISTERED CAPACITY FILL_PCT
Iceberg & Open Formats 3 350 0.9
Cortex AI in Production 3 400 0.8
Dynamic Tables Deep Dive 2 300 0.7
Data Governance at Scale 1 200 0.5
Cost Optimization Masterclass 1 250 0.4

Step 4: Verify the pipeline handled changes correctly

Write a single reconciliation query that returns:

  • total_in_dashboard β€” total rows in SUMMIT_DASHBOARD
  • active_registrations β€” rows with status = 'REGISTERED'
  • cancelled β€” rows with status = 'CANCELLED'
  • source_rows β€” total rows in REGISTRATIONS
🧊 I'm Stuck β€” Help! (Click me)

πŸ’‘ Use scalar subqueries or a cross join with two aggregations. The numbers should reconcile: the dashboard total should equal the source total (16), with 15 active and 1 cancelled.

Expected Output:

TOTAL_IN_DASHBOARD ACTIVE_REGISTRATIONS CANCELLED SOURCE_ROWS
16 15 1 16

Hints

  • Custom incremental dynamic tables use REFRESH USING (MERGE INTO SELF ...) or REFRESH USING (INSERT INTO SELF ...). They require an explicit column list in the CREATE statement β€” Snowflake can't infer the schema from DML.
  • The CHANGES() clause replaces stream semantics. You don't specify time bounds β€” Snowflake automatically binds the change interval to refresh boundaries.
  • METADATA$ACTION returns 'INSERT' or 'DELETE'. Updates appear as a DELETE + INSERT pair with METADATA$ISUPDATE = TRUE.
  • A RELY primary key on the source table tells Snowflake how to identify rows, enabling delete-then-insert pairs on the same key to cancel out (net-zero change).
  • Objects outside a CHANGES() clause (like your SESSIONS dimension) are read at their state at the refresh snapshot time β€” they're not tracked incrementally.
  • Custom incremental dynamic tables can feed downstream standard dynamic tables. The downstream table just sees a normal table being updated.
  • If you get a "change tracking not enabled" error, make sure you ran ALTER TABLE ... SET CHANGE_TRACKING = TRUE on your source table.
  • For the full syntax reference, see: Custom incrementalization
Next
Next

Week 156 - AI_EXTRACT