Learning ETL – ETL Process
Posted by Dylan Wan on January 6, 2007
Here is the typical ETL Process:
- Specify metadata for sources, such as tables in an operational system
- Specify metadata for targets: the tables and other data stores in a data warehouse
- Specify how data is extracted, transformed, and loaded from sources to targets
- Schedule and execute the processes
- Monitor the execution
An ETL tool thus involves the following components:
- A design tool for building the mapping and the process flows
- A monitoring tool for executing and monitoring the processes
The process flows are sequences of steps for the extraction, transformation, and loading of data. The data is extracted from sources (inputs to an operation) and loaded into a set of targets (outputs of an operation) that make up a data warehouse or a data mart.
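To make the idea of a process flow concrete, here is a minimal sketch in Python using the standard sqlite3 module. The table names (src_orders, dw_orders), the columns, and the transformation rules are all made up for illustration; a real ETL tool would generate or configure these steps from its metadata rather than hard-code them.

```python
import sqlite3


def extract(conn):
    """Extract step: read raw rows from the (hypothetical) source table."""
    return conn.execute(
        "SELECT id, name, amount_cents FROM src_orders ORDER BY id"
    ).fetchall()


def transform(rows):
    """Transform step: clean up names and convert cents to a decimal amount."""
    return [(row_id, name.strip().upper(), cents / 100)
            for row_id, name, cents in rows]


def load(conn, rows):
    """Load step: write the transformed rows into the (hypothetical) target table."""
    conn.executemany(
        "INSERT INTO dw_orders (id, customer, amount) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


def run_flow(conn):
    """Run the steps in order and keep a simple row-count log for monitoring."""
    log = []
    rows = extract(conn)
    log.append(("extract", len(rows)))
    rows = transform(rows)
    log.append(("transform", len(rows)))
    log.append(("load", load(conn, rows)))
    return log


def demo():
    """Build an in-memory source and target, run the flow, return log and result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE src_orders (id INTEGER, name TEXT, amount_cents INTEGER)"
    )
    conn.execute(
        "CREATE TABLE dw_orders (id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO src_orders VALUES (?, ?, ?)",
        [(1, " alice ", 1250), (2, "bob", 399)],
    )
    log = run_flow(conn)
    result = conn.execute(
        "SELECT id, customer, amount FROM dw_orders ORDER BY id"
    ).fetchall()
    conn.close()
    return log, result


if __name__ == "__main__":
    flow_log, loaded_rows = demo()
    print(flow_log)
    print(loaded_rows)
```

The log returned by run_flow is a stand-in for what a monitoring tool would record: which step ran and how many rows it handled.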
A good ETL design tool should provide change management features that satisfy the following criteria:
- A metadata repository that stores the metadata about sources, targets, and the transformations that connect them.
- Metadata source control for team-based development: multiple designers should be able to work with the same metadata repository at the same time without overwriting each other’s changes. Each developer should be able to check out metadata from the repository into their project or workspace, modify it, and check the changes back into the repository.
- After a metadata object has been checked out by one person, it is locked so that it cannot be updated by another person until the object has been checked back in.
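The check-out and locking behavior described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the class name MetadataRepository, its methods, and the CheckoutError exception are hypothetical, everything lives in memory, and a real ETL tool would persist the repository and handle concurrent access properly.

```python
import copy


class CheckoutError(Exception):
    """Raised when a check-out or check-in violates the locking rules."""


class MetadataRepository:
    """Hypothetical in-memory metadata repository with check-out locks."""

    def __init__(self):
        self._objects = {}  # object name -> metadata definition
        self._locks = {}    # object name -> user who holds the lock

    def add(self, name, definition):
        """Store a new metadata object (source, target, or mapping)."""
        self._objects[name] = copy.deepcopy(definition)

    def get(self, name):
        """Return a read-only snapshot; reading does not require a lock."""
        return copy.deepcopy(self._objects[name])

    def check_out(self, name, user):
        """Lock the object for one user and hand back a working copy."""
        holder = self._locks.get(name)
        if holder is not None and holder != user:
            raise CheckoutError(f"{name} is checked out by {holder}")
        self._locks[name] = user
        return copy.deepcopy(self._objects[name])

    def check_in(self, name, user, definition):
        """Save the modified copy and release the lock."""
        if self._locks.get(name) != user:
            raise CheckoutError(f"{name} is not checked out by {user}")
        self._objects[name] = copy.deepcopy(definition)
        del self._locks[name]


if __name__ == "__main__":
    repo = MetadataRepository()
    repo.add("orders_mapping", {"source": "src_orders", "target": "dw_orders"})

    working = repo.check_out("orders_mapping", "alice")
    try:
        repo.check_out("orders_mapping", "bob")
    except CheckoutError as err:
        print("blocked:", err)

    working["target"] = "dw_orders_v2"
    repo.check_in("orders_mapping", "alice", working)
    print(repo.get("orders_mapping"))
```

Handing out deep copies means a designer's edits stay in their own workspace until check-in, which is what keeps two people from overwriting each other's changes.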