Why is Sunopsis so different?
Posted by Dylan Wan on December 27, 2006
On October 9, 2006, Oracle announced that they acquire Sunopsis.
I read the white paper from Sunopsis. I learned that the key is in their “E-LT” concept.
Tranditionally ETL does these three steps in orders –
Extract: This process is to extract the data from the source system, which can be a relational database, a CSV file, an excel file, or any system whose data can be retrived by calling some API. Most tool includes a repository to allow you to model the source data structure or load the data dictionary from the source system. Some can do the data polling such as try to listen to a FTP directory or periodically query the records from a relational database.
Transformation: This typically involves a transformation engine, which allows you to define the mapping between the source data structure and the target data structure. It also allows you to lookup value sets which they provide to convert the data value to those acceptible by the target system.
Load: This process actually loads the transformed data into the target system. It typically populates the tables in a relational database. Sometime it may call the API provided by the target system.
Many ETL tools are deployed as a middle-tier engine and use very little features from the source or target database since they want to make sure that they can support different platforms and independent from the source and target vendors. Some ETL tools are actually code generation engines. They allow the user to define the mapping and transformation in their GUI tool and they internally generates the program to do the transformation and load the data. Usually I found that the code generation engine provides better performance then the middle-tier engine. However, these generated programs cannot compare their performance with the manually-written codes that can use a lot of specific features from the target database server.
Oracle has been enhancing its native ETL type of capabilities within Oracle database servers for several years. It provides several bulk transformation capabilities in its SQL and PL/SQL. Only a few ETL tool, such as Oracle Warehouse Builder, can fully take the advantage of the capabilities from Oracle database. However, today companies face vast amount of heterogeneous data from disparate sources. It requires a BI platform that can work with multiple database systems.
I guess that this is where Sunopsis come into the picture.
The E-LT architecture is to load the data to the target database first before performing the transformation. It can fully leverage the target database features. Another important feature is to include a rule engine that can define the business rules including the mapping and transformation logic independently from any source system, while the engine can generate the target specific transformation program. They bring to Oracle the ability to load and transform data on all data platforms. Conmsistent with its Oracle Business Intelligence EE philosophy to provide a BI solution for heterogeneous environments
More information can be found at: