Incremental ETL in BI Apps 188.8.131.52.1 (1)
Posted by Dylan Wan on May 10, 2013
The data warehouse needs to be refreshed periodically to make sure that the data warehouse have the up to date information. The Incremental Load is the regular ETL process running everyday. The frequency can vary, but it is commonly seen as nightly job. To ensure that the incremental ETL job can be finished within a limited time window, incremental ETL load typically identifies the changes happened in the source system and only deal with the delta to avoiding doing a Full ETL Load.
In BI Apps 184.108.40.206.1, if you are not leveraging the Golden Gate and the SDS (Source Dependent Data Store), the incremental ETL is accomplished by directly comparing the last update date captured in the source table with the last refreshed date.
BI Apps 220.127.116.11.1 ODI based ETL does not use separate ETL tasks for incremental load and full load. The logic is determined by whether the data has been executed before. In other word, it is based on whether the data has been refreshed. This was a productivity gain for BI App developers and also make the support of incremental ETL becomes clear. In the past, you may see the Incremental Load Informatica Workflow and Full Load. Every time when there is a logic change, the workflows need to be modified. It was a source of code bugs before. By using a better ETL architecture, we eliminate the needs of maintaining two separate tasks.
Here is a high level summary of how the incremental ETL is accomplished in BI Apps 18.104.22.168.1 ETL:
1. Use the W_ETL_LOAD_DATES table for tracking all the ETL execution
2. Use the IS_INCREMENTAL Variable for holding and passing the state
3. Refresh Variable in a Package Step. If there is an entry in the W_ETL_LOAD_DATES, perform an incremental ETL.
4. Use the RUN_INCREMENTAL to turn on and off the incremental filter
5. Optionally, use the ODI Variable Evaluate Step to branch the Full and Incremental ETL logic within a package
6. Table Maintenance, take the ETL Load Date information to determine the operations, including Drop Index, Truncate Table before Full Load, Rebuild the index after Load, and Analyze Table after Load, tec.