Dylan's BI Study Notes

My notes about Business Intelligence, Data Warehousing, OLAP, and Master Data Management

Posts Tagged ‘big-data’

Cloud BI Features – Amazon QuickSight

Posted by Dylan Wan on October 17, 2015

Here is a list of features available from Amazon QuickSight:

Category          Feature
Data Source       Connect to supported AWS data sources
Data Source       Upload flat files
Data Source       Access third-party data sources
Data Preparation  Data Preparation Tools
Visualization     Build Visualizations
Visualization     Access all chart types
Visualization     Filter Data
Data Access       Capture and Share, Collaborate
Data Access       API/ODBC connection to SPICE
Security          Encryption at Rest
Security          Active Directory Integration
Security          Fine-grained User Access Control
Security          Enable Audit Logs with AWS CloudTrail
Performance       In-memory calculation with SPICE
Performance       Scale to thousands of users
Performance       Support up to petabytes of data


I categorize the features into these groups:

  1. Data Source
  2. Data Preparation
  3. Visualization
  4. Data Access (or Alternate Access)
  5. Security
  6. Performance

These are almost the same features available from other BI tools, such as OBIEE, except for the in-memory engine and perhaps the scalability.  Here are some questions I have.


Posted in BI, Business Intelligence

Data Lake vs. Data Warehouse

Posted by Dylan Wan on October 4, 2015

These are different concepts.

Data Lake – Collects data from various sources in a central place.  The data are stored in their original form.  Big data technologies are used, so the typical data storage is Hadoop HDFS.

Data Warehouse – The “traditional” way of collecting data from various sources for reporting.  The data are consolidated and integrated.  A data warehouse design that follows the dimensional modeling technique may store data in a star schema with fact tables and dimension tables.  Typically a relational database is used.
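To make the contrast concrete, here is a minimal sketch in Python. It assumes a handful of made-up order records kept as raw JSON strings (standing in for the data lake side) and loads them into a tiny star schema in an in-memory SQLite database (standing in for the warehouse side). The table names, column names, and sample records are all hypothetical and chosen only for illustration; a real warehouse would use a full relational database and a proper ETL process.

```python
import json
import sqlite3

# Hypothetical raw events, kept in their original form as a data lake would keep them.
RAW_EVENTS = [
    '{"order_id": 1, "customer": "Acme", "product": "Widget", "amount": 30.0}',
    '{"order_id": 2, "customer": "Acme", "product": "Gadget", "amount": 45.5}',
    '{"order_id": 3, "customer": "Globex", "product": "Widget", "amount": 12.0}',
]


def build_star_schema(raw_events):
    """Load raw JSON events into one fact table and two dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE fact_sales (
            order_id     INTEGER PRIMARY KEY,
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            product_key  INTEGER REFERENCES dim_product(product_key),
            amount       REAL
        );
        """
    )
    for line in raw_events:
        event = json.loads(line)
        # Each distinct customer and product becomes one dimension row.
        conn.execute("INSERT OR IGNORE INTO dim_customer (name) VALUES (?)",
                     (event["customer"],))
        conn.execute("INSERT OR IGNORE INTO dim_product (name) VALUES (?)",
                     (event["product"],))
        # The fact row stores the measure plus keys pointing at the dimensions.
        conn.execute(
            """
            INSERT INTO fact_sales (order_id, customer_key, product_key, amount)
            SELECT ?, c.customer_key, p.product_key, ?
            FROM dim_customer c, dim_product p
            WHERE c.name = ? AND p.name = ?
            """,
            (event["order_id"], event["amount"], event["customer"], event["product"]),
        )
    conn.commit()
    return conn


def sales_by_customer(conn):
    """Typical reporting query: join the fact table to a dimension and aggregate."""
    return conn.execute(
        """
        SELECT c.name, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_customer c ON f.customer_key = c.customer_key
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


if __name__ == "__main__":
    warehouse = build_star_schema(RAW_EVENTS)
    print(sales_by_customer(warehouse))
```

The raw strings are never modified, which mirrors the "original form" idea of the data lake, while the star schema is a consolidated copy shaped for reporting queries.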

If we look at the analytics platform at eBay from this LinkedIn SlideShare and this 2013 article:

Posted in Big Data, Data Warehouse, EDW

About Big Data (1)

Posted by Dylan Wan on March 7, 2012

Recently I read several articles and books about big data.

I found that many of them define big data in a very funny way.

Big data is the data that you typically cannot handle in the database.  It is bigger than the size of the data you have.


It is like a joke I told my daughter during dinner.  Someone says that they are selling a very good car.  You ask them: how good is it?  They say that their car can carry more people, runs faster, is much more comfortable, is safer, and is cheaper.  When you ask them for more details, they keep saying that it is better than what you have.  Will you buy it?  She felt that the salesperson was a liar.

I do believe that the big data problem exists today, but it concerns a special kind of data that requires special handling.

It is not everything.  It may require a new approach that did not exist before.  It may also require approaches that have been around for some time but that we simply did not pay attention to.

Posted in Big Data, Data Warehouse