How to implement Union in ODI and OBIEE?
Posted by Dylan Wan on November 16, 2015
When do you need to use UNION in BI?
You need to use UNION when you would like to combine the data sets as a single data set. However, when do you exactly need to do this type of thing in BI or in data warehouse ETL? Here are some real business cases:
1. Multiple Source Instances
For example, you have more than one EBS instances, perhaps for different divisions of your company and you would like to view the data across.
2. Multiple Sources
You have employee data coming from your HR system. You also have the candidate information from your recruitment applications. You would like to view the data across HR and recruitment systems.
3. Multiple source tables from a given source
This scenario is similar to the number 2. However, the number 2 is referring to different source systems. This item number 3 refers to s single instance of the application.
4. The scenario can be applied to fact table only
Transactional data my be stored in two separate transaction tables and both are joined to the same dimensions. When viewing the data, you want to group data in one result set. For example, quotes and orders, or AR transactions and GL journal entries.
5. The scenario can be applied to dimension only
I run into such case in OBIA. The CRM sales revenue in Fusion Apps can be created for a product item (the item master in ERP Cloud) or for a product group (the product catalog in Sales Cloud). Two FK columns and only one of them will be populated.
6. The scenario can be applied to both fact and dimension at the same time
I see an example in a custom data warehouse. The enterprise calendar is captured in W_DAY_D and in OBIA’s OOTB model, the facts joined to W_MCAL_DAY_D only. I thought that the customization can be built by joining to W_DAY_D. But the customization I have seen is to add the enterprise calendar columns to W_MCAL_DAY_D.
The two transaction tables in the case are similar to the case mentioned in the item 4 above, but they join to two different day dimensions.
There are more cases if we talk about implementations under the item 1. Almost all data are like this.
How do we support the UNION scenarios?
Viewing the result sets as one result set does not mean that you have to use UNION.
1. Use more than one ETL flow to populate the same table
OBIA uses the data source num ID to strip the data. In Informatica based ETL, DAC can generate multiple tasks in an execution plan for extracting the data from different source instances. In the ODI based ETL, load plan generator can branch the load plan templates and create multiple ODI steps in the generated load plan.
The trick is that when you need to perform any tasks at the table level, such as truncate table for full load, dropping indexes, recreating indexes, collecting database statistics, you need to treat all these tasks as a single unit. Both DAC and LPG can handle this.
If the tasks are different source tables as described in the above item 3, the application developers will create different ETL tasks.
In ODI, the load plan template will have separate tasks and can be used to group these tasks. In DAC, the task group can be used to group these tasks.
2. ODI also support UNION
ODI supports UNION via creating multiple data sets in an interface. The tasks cannot be executed in parallel since it will actually generate a SQL statement with UNION clause.
3. We do not really need to deal with the item 1 and item 2 using OBIEE since OBIA always source the data from the data warehouse. Another way to view this problem is that OBIEE alone really cannot support multiple source instances and cross subject joins across sources well. At least OBIA does not use OBIEE to accomplish these requirements.
However, OBIEE features are used for support item 3-6.
(to be continued)