Dylan's BI Study Notes

My notes about Business Intelligence, Data Warehousing, OLAP, and Master Data Management

Lookup the Data Warehouse FK with SCD enabled

There are at least two ways to design the data warehouse surrogate key for the dimension tables with the SCD type 2 enabled.

1. The Warehouse ID can uniquely identify the object across time periods. The unique key for the dimension is a composite key including the Warehouse ID and the date column. Some OLTP, such as PeopleSoft, supports the Date Effectivity feature in this way. That means that all the joins have to include the date column as part of the join criteria.

2 . The Warehouse ID can uniquely identify the record in the dimension table and is specific to a period of time. There may be still a column to represent the nature key, but the warehouse ID isolate the changes to the nature key from the data warehouse. All the joins from the fact table to the dimension table using the Warehouse ID.

Each of the above approaches has its advantages. However, from the ETL process perspective, I think that the following design makes sense:

When resolving the warehouse FK, you should pass the date from the fact table to identify the effective records from the dimension. If the dimension is SCD enabled, always lookup the warehouse surrogate key based on the nature key and the specific date passed from the fact staging table.

The ETL approach should be valid for both cases. Even the data columns is used as part of the composite key, you should ensure the record exists to avoid dangling join.