Skip to main content

Snowplow Optimized Materialization

This package makes use of the standard dbt incremental materialization with an optimization applied for incremental models. Its key advantage is that it limits table scans on the target table when updating/inserting based on the new data. This improves performance and reduces cost. We do this by overriding the macro that generates the sql for the merge and insert_delete incremental methods.

All other features of the incremental materialization are supported including incremental_predicates and on_schema_change. The code for the overridden macro can be found here.

Usage

To enable the materialization on a model you need to ensure a unique_key and upsert_date_key are provided in the model config, and that snowplow_optimize=true in the config as well.

In addition, the following must be added to your dbt_project.yml file once.

# dbt_project.yml
...
dispatch:
- macro_namespace: dbt
search_order: ['snowplow_utils', 'dbt']

If you wish to disable the buffer we apply to the upsert in the case of late arriving data (defined by snowplow__upsert_lookback_days) you can set disable_upsert_lookback to true in your model config.

Was this page helpful?