Dylan's BI Study Notes

My notes about Business Intelligence, Data Warehousing, OLAP, and Master Data Management

Posts Tagged ‘Cloud BI’

Cloud BI Features – Amazon QuickSight

Posted by Dylan Wan on October 17, 2015

Here is a list of features available from Amazon QuickSight:

Data Source Connect to supported AWS data sources
Data Source Upload flat files
Data Source Access third-party data sources
Data Preparation Data Preparation Tools
Visualization Build Visualizations
Visualization Access all chart types
Visualization Filter Data
Data Access Capture and Share, Collaborate
Data Access API/ODBC connection to SPICE
Security Encryption at Rest
Security Active Directory Integration
Security Fine-grained User Access Control
Security Enable Audit Logs with AWS CloudTrail
Performance In-memory calculation with SPICE
Performance Scale to thousands of users
Performance Support up to petabytes of data

.

I categorize the features into these groups:

  1. Data Source
  2. Data Preparation
  3. Visualization
  4. Data Access (or Alternate Access)
  5. Security
  6. Performance

They are almost same features available from other BI tools, such OBIEE, except the in-memory engine, and perhaps the scalability.  Here are some questions I have. Read the rest of this entry »

Advertisements

Posted in BI, Business Intelligence | Tagged: , , , , | Leave a Comment »

Lessons Learned – Data Warehouse on Cloud – Amazon Redshift

Posted by Dylan Wan on September 22, 2015

I feel that these are the rules applicable for any cloud based data warehouse solution.  In general, I feel that the on-premise data warehouse deployment probably will remain for a while.

1. For a columnar database, “select *” is bad

I think that the projection needs to be done as early as possible and should be pushed down.

If a column is not needed in the downstream flow, it should not be selected in the first place.

If the application logic is defined in the metadata, the tool should read it and generate the extract logic. Read the rest of this entry »

Posted in Business Intelligence | Tagged: , , , , | Leave a Comment »

Extracting Data from Cloud Apps

Posted by Dylan Wan on March 5, 2015

I think that it would be easier if the cloud application can be aware of the data integration needs and publish the interfaces proactively.

Here are some basic requirements for the applications that can be considered as data integration friendly:

1. Publish the object data model

This is required for source analysis.

For example, here is the RightNow Object Model.

The documentation should describe the object, the attributes available, the identifiers, the validation rules, the relationships among the objects, including their cardinality, the domain or lookup used in validating the data, etc.

2. Support a scalable solution for the Full Extract

Initial load may require a full extract of the historical transaction data and reference data.  This is typically a one-time only event during implementation phase.

If the extract starts about the same time when the cloud apps is initially implemented, the volume of the initial extract may be similar to incremental extract.  However, if the Cloud BI is implemented much later, after the transational apps, the volume of initial load may be big.

3. Support incremental extract / Change Data Capture

The daily or near real-time extract requires a way to identify the delta, the data created or updated recently.

In many systems, this can be done via an auditing column, such as the last updated date column.  However, if the source object model allows a nested structure.  The last update date may need to be available in two forms, update the header only, or update any objects including the child and the descendant.

4. Delete Capture

Capturing the deleted data during the incremental extract is different from the creation and update cases. If the deleted rows are not extracted, the data will be kept in the data warehouse and be treated as a normal data.  The report could be considered wrong.

In some source systems, deletion and archiving are different events.  We need to have a way to differentiate these two events.

5. Metadata Extract

This is a more advanced feature.  It is required for supporting the custom objects.

In other normal objects, the expected data model is determined during the design time.

However, if the source system allows customization that can add or enable new attributes or even creating new objects, the extraction program needs to be able to detect the model change and react accordingly.

6. Extracting the Lookup or Domain data

Many systems support the multi-language using lookup table.  The idea is that the translated strings are kept in a separate table and allow the data shown on the application UI in different languages depending on the preferred language from the login user.  Not all source system can support this feature.  However, for the source systems that can support this feature, extracting the lookup data can allow the BI showing the data in the same translated as the source system.

The requirement for the extracting process is to extract the data in all languages that are expected in the BI / Data Warehouse side.

7. Extracting data for integrated data security

If the users that can access the cloud source apps and the users that can access BI are the same set of users, or a subset, the user and authentication service should be shared.

However, if the user or the internal person record identifiers are used in building the data security, extracting the users and the roles granted may be necessary.

8. Extracting the UI displayed Labels

In some cases, the cloud BI can be a seamless integrated or even embedded into the source apps.  The displayed lable for the object and for the attributes need to be consistent.  In such cases, extracting the display labels are also requried.

It may be similar to the process of extracting the lookup or domain data which have the data available in multiple languages.

Posted in Business Intelligence | Tagged: , , , , , | Leave a Comment »

BI Applications in Cloud

Posted by Dylan Wan on October 6, 2014

Prepackaged analytics applications are available as cloud services.

The idea is that the client company does not need to use their own hardware and does not need to install the software or apply patches by themselves.

What they need is just simply the browsers.

For the end users, there should not be much difference.   The BI apps built on the OBIEE platform is already a web based application.  Users today, even when they use the BI applications deployed on premise, access the applications via the browser.

The real difference is to the IT department and to the company from the investment perspectives.

The deployment time should be very short.

Where is the role of the ISV, VAR, or SI in this picture?

I think that their role is still very important.

First of all, the out of box reports may not work for everyone.  New reports may need to be created.

Secondly, the source apps may be extended.  For example, oracle cloud services, such as CX clouds, support extensible attributes.  Where those attribute need to be reported should be probably added or hidden.

When the data are not loaded from other Oracle cloud applications, loading the data from Oracle’s  Application Unlimited products such as PeopleSoft, E-Business Suite, Siebel, J.D. Edwards may require a lot of work.

Posted in Business Intelligence | Tagged: , , , | Leave a Comment »