Design data models that are both query-efficient and easy to maintain over time.
Data modelling determines how your data is organised in the warehouse.
Kimball Dimensional Modelling:
- Fact tables: numeric measures (sales_amount, quantity, revenue) plus foreign keys to the dimension tables
- Dimension tables: descriptive attributes (date, customer, product, location)
- Star schema: one fact table surrounded by dimension tables
- Snowflake schema: a star schema whose dimensions are normalised into further sub-dimension tables
- Slowly Changing Dimensions (SCDs):
  - SCD Type 1: overwrite the old value (no history kept)
  - SCD Type 2: add a new row with validity dates (most common)
  - SCD Type 3: add a column that keeps the previous value alongside the current one
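The SCD Type 2 behaviour described above can be sketched in plain Python. This is a minimal illustration, not a warehouse implementation: the `CustomerRow` class, the `apply_scd2` function and the column names (`customer_key`, `valid_from`, `valid_to`) are hypothetical, and an open-ended `valid_to` of `None` is assumed to mark the current version.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class CustomerRow:
    customer_key: int          # surrogate key, unique per version (hypothetical name)
    customer_id: str           # business key from the source system
    city: str                  # the attribute whose history is tracked
    valid_from: date
    valid_to: Optional[date]   # None marks the current version (an assumption)


def apply_scd2(rows: List[CustomerRow], customer_id: str,
               new_city: str, change_date: date) -> List[CustomerRow]:
    """Return a new list of rows with an SCD Type 2 change applied."""
    current = next(
        (r for r in rows if r.customer_id == customer_id and r.valid_to is None),
        None,
    )
    if current is not None and current.city == new_city:
        return list(rows)  # attribute unchanged, so history stays as it is

    next_key = max((r.customer_key for r in rows), default=0) + 1
    # Close the current version (if any) instead of overwriting it.
    updated = [
        replace(r, valid_to=change_date) if r is current else r
        for r in rows
    ]
    # Add the new version as an open-ended row.
    updated.append(CustomerRow(next_key, customer_id, new_city, change_date, None))
    return updated


history = [CustomerRow(1, "C-100", "Leeds", date(2023, 1, 1), None)]
history = apply_scd2(history, "C-100", "York", date(2024, 6, 1))
```

After the call, `history` holds two rows for `C-100`: the Leeds row closed on the change date and an open York row. A Type 1 change would instead overwrite `city` on the single existing row.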
Data Vault 2.0:
- Hubs (business keys), Links (relationships), Satellites (context/attributes)
- Better suited than Kimball to highly regulated industries and complex history tracking
- More complex to implement than Kimball
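To make the three Data Vault table types concrete, here is a rough sketch of one record of each kind as Python dictionaries. All table and column names (`hub_customer`, `customer_hk`, `record_source`, and so on) are hypothetical, the `hash_key` helper is an illustration, and SHA-256 is an arbitrary choice of hash function for this example.

```python
import hashlib
from datetime import datetime, timezone


def hash_key(*business_keys: str) -> str:
    """Derive a deterministic key from one or more business keys."""
    # Normalise before hashing so " c-100 " and "C-100" give the same key.
    normalised = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


load_ts = datetime(2024, 1, 1, tzinfo=timezone.utc)
source = "crm"  # hypothetical source system name

# Hub: one row per business key.
hub_customer = {
    "customer_hk": hash_key("C-100"),
    "customer_id": "C-100",
    "load_ts": load_ts,
    "record_source": source,
}
hub_order = {
    "order_hk": hash_key("O-555"),
    "order_id": "O-555",
    "load_ts": load_ts,
    "record_source": source,
}

# Link: a relationship between hubs, keyed by the combined business keys.
link_customer_order = {
    "customer_order_hk": hash_key("C-100", "O-555"),
    "customer_hk": hub_customer["customer_hk"],
    "order_hk": hub_order["order_hk"],
    "load_ts": load_ts,
    "record_source": source,
}

# Satellite: descriptive attributes for a hub, versioned by load timestamp.
sat_customer_details = {
    "customer_hk": hub_customer["customer_hk"],
    "load_ts": load_ts,
    "city": "Leeds",
    "record_source": source,
}
```

History is tracked by adding a new satellite row with a later `load_ts` whenever an attribute changes, while hub and link rows stay untouched.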
OBT (One Big Table):
- Denormalised, pre-joined — fast for reporting, wasteful on storage
- Suitable for smaller organisations or specific analytical use cases
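The trade-off between a star schema and an OBT can be seen in a small sketch using Python's built-in `sqlite3` module. The tables (`dim_customer`, `dim_product`, `fct_sales`, `obt_sales`) and the sample rows are made up for illustration; a real warehouse would use its own engine and naming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star schema: one fact table referencing two dimension tables.
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY, customer_name TEXT, city TEXT);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE fct_sales (
    sale_id INTEGER PRIMARY KEY,
    customer_key INTEGER REFERENCES dim_customer (customer_key),
    product_key INTEGER REFERENCES dim_product (product_key),
    quantity INTEGER,
    sales_amount REAL);

INSERT INTO dim_customer VALUES (1, 'Ada', 'Leeds'), (2, 'Bo', 'York');
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Gizmo', 'Hardware');
INSERT INTO fct_sales VALUES
    (1, 1, 1, 2, 20.0), (2, 1, 2, 1, 15.0), (3, 2, 1, 3, 30.0);

-- OBT: the same data pre-joined, so dimension values repeat on every row.
CREATE TABLE obt_sales AS
SELECT f.sale_id, c.customer_name, c.city,
       p.product_name, p.category, f.quantity, f.sales_amount
FROM fct_sales AS f
JOIN dim_customer AS c ON c.customer_key = f.customer_key
JOIN dim_product AS p ON p.product_key = f.product_key;
""")

# Reporting against the OBT needs no joins.
rows = conn.execute(
    "SELECT city, SUM(sales_amount) FROM obt_sales GROUP BY city ORDER BY city"
).fetchall()
```

The reporting query reads a single table, which is where the speed comes from. The cost is that `customer_name`, `city`, `product_name` and `category` are stored once per sale rather than once per customer or product.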
dbt modelling layers:
- Staging (stg_): one model per raw source table, with light cleaning such as renaming and type casting
- Intermediate (int_): business logic that joins and transforms staging models
- Mart (dim_, fct_): final analytics-ready tables
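The layering idea can be mirrored in plain Python to show what each step is responsible for. This is only an analogy: real dbt models are SQL files, and the function names, field names and sample data below are hypothetical.

```python
from collections import defaultdict

# Raw source rows as they might arrive: inconsistent names, strings everywhere.
raw_orders = [
    {"ID": "1", "CUST": " c-100 ", "AMT": "20.00", "STATUS": "complete"},
    {"ID": "2", "CUST": "C-100", "AMT": "15.50", "STATUS": "cancelled"},
    {"ID": "3", "CUST": "C-200", "AMT": "30.00", "STATUS": "complete"},
]


def stg_orders(rows):
    """Staging: rename, cast and trim only. No business rules."""
    return [
        {
            "order_id": int(r["ID"]),
            "customer_id": r["CUST"].strip().upper(),
            "amount": float(r["AMT"]),
            "status": r["STATUS"].lower(),
        }
        for r in rows
    ]


def int_orders_completed(staged):
    """Intermediate: apply a business rule (only completed orders count)."""
    return [r for r in staged if r["status"] == "complete"]


def fct_customer_revenue(completed):
    """Mart: aggregate to the grain that analysts query."""
    totals = defaultdict(float)
    for r in completed:
        totals[r["customer_id"]] += r["amount"]
    return [{"customer_id": k, "revenue": v} for k, v in sorted(totals.items())]


mart = fct_customer_revenue(int_orders_completed(stg_orders(raw_orders)))
```

Each layer depends only on the one before it, so a change to the source format is absorbed in staging and a change to a business rule is absorbed in the intermediate layer.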