Let's collect best practices for dbt incremental models. I've been through a few iterations and want to share what works.
My Key Learnings
1. Always use a reliable updated_at column
Don't rely on id ordering. Use a proper timestamp:
{{
config(
materialized='incremental',
unique_key='order_id',
incremental_strategy='merge'
)
}}
SELECT *
FROM {{ source('raw', 'orders') }}
{% if is_incremental() %}
WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}2. Use merge strategy over delete+insert for large tables
The merge strategy is significantly faster for tables > 100M rows:
- delete+insert: ~45 min for 500M row table
- merge: ~8 min for same table
3. Partition your target tables wisely
Partitioning by date works well for time-series data, but consider your query patterns first.
What patterns have worked for you? Drop your configs below!
More details on incremental strategies: https://docs.getdbt.com/docs/build/incremental-models