Dylan's BI Study Notes

My notes about Business Intelligence, Data Warehousing, OLAP, and Master Data Management

Data Scrubbing

Data scrubbing is a technical term used in the data warehouse world.  It means the same thing as data cleansing in MDM: the process of detecting, removing, and/or correcting dirty data in a database.

Dirty data can be missing, incorrect, out-of-date, redundant, incomplete, or formatted incorrectly.  Dirty data in a database can be the result of human error in data entry, the merging of multiple systems, a lack of company-wide standards, or old systems that contain outdated data.
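To make these categories concrete, here is a minimal sketch of a scrubbing pass in plain Python. Everything in it is an assumption for illustration only: the record fields (`name`, `email`, `updated`), the `scrub` function, the one-year staleness threshold, and the very crude email check. A real scrubbing tool would apply much richer rules.

```python
from datetime import date


def scrub(records, today, max_age_days=365):
    """Split records into (clean, rejected), tagging each rejection with a reason."""
    clean = []
    rejected = []
    seen_emails = set()
    for rec in records:
        # Fix formatting first: trim whitespace and lower-case the email.
        name = (rec.get("name") or "").strip()
        email = (rec.get("email") or "").strip().lower()
        updated = rec.get("updated")

        if not name or not email:
            rejected.append((rec, "missing"))
        elif "@" not in email:
            # Deliberately crude validity check, for illustration only.
            rejected.append((rec, "incorrect"))
        elif updated is None or (today - updated).days > max_age_days:
            rejected.append((rec, "out-of-date"))
        elif email in seen_emails:
            rejected.append((rec, "redundant"))
        else:
            seen_emails.add(email)
            clean.append({
                "name": " ".join(name.split()).title(),  # collapse inner spaces
                "email": email,
                "updated": updated,
            })
    return clean, rejected


# Hypothetical sample data, one record per kind of dirt.
records = [
    {"name": "  alice   smith ", "email": "Alice@Example.com ", "updated": date(2024, 1, 10)},
    {"name": "Alice Smith", "email": "alice@example.com", "updated": date(2024, 2, 1)},
    {"name": "Bob", "email": "", "updated": date(2024, 2, 1)},
    {"name": "Carol", "email": "carol.example.com", "updated": date(2024, 1, 1)},
    {"name": "Dan", "email": "dan@example.com", "updated": date(2020, 1, 1)},
]

clean, rejected = scrub(records, today=date(2024, 3, 1))
reasons = [reason for _, reason in rejected]
```

In this sketch the badly formatted record is corrected and kept, while the other four are set aside with a reason instead of being silently deleted, so someone can review them later.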

A data scrubbing program is the software application used to clean the database.  Dirty data can be introduced anywhere human error can occur.  However, the need for data scrubbing becomes much clearer when companies merge multiple databases to build their data warehouse.  A data scrubbing program is thus usually part of a data warehouse implementation.
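The merge case can be sketched in a few lines. This is only an illustration under assumed names: the two source systems (`crm`, `billing`), their fields, and the choice of phone number as the matching key are all hypothetical. The point is that the same customer looks like two different records until the formats are standardized.

```python
import re


def normalize_phone(raw):
    """Keep digits only, so '(555) 123-4567' and '555.123.4567' match."""
    return re.sub(r"\D", "", raw)


def merge_sources(*sources):
    """Merge records from several systems, keyed on the normalized phone."""
    merged = {}
    for source in sources:
        for rec in source:
            key = normalize_phone(rec["phone"])
            target = merged.setdefault(key, {"phone": key})
            for field, value in rec.items():
                # Fill in a field only if we do not already have a value for it.
                if field != "phone" and value and not target.get(field):
                    target[field] = value
    return list(merged.values())


# Hypothetical source systems with different phone formats.
crm = [
    {"phone": "(555) 123-4567", "name": "Alice Smith", "city": ""},
]
billing = [
    {"phone": "555.123.4567", "name": "Alice Smith", "city": "Boston"},
    {"phone": "555-987-6543", "name": "Bob Jones", "city": "Denver"},
]

warehouse = merge_sources(crm, billing)
```

Here three source rows collapse into two warehouse rows, and the missing city in the first system is filled in from the second. Matching on a single normalized field is a simplification; matching on real data usually needs more than one field.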

