Data Warehouse on Cloud – Amazon Redshift
Posted by Dylan Wan on September 17, 2015
Here is a brief summary of what I learned by reading these materials.
1. The data warehouse is stored in clusters
It scales out by adding nodes to the cluster, rather than scaling up a single machine.
“Extend the existing data warehouse rather than adding hardware”
2. Use SQL to access the data warehouse
3. Load data from Amazon S3 (Simple Storage Service) in parallel, using MPP (massively parallel processing)
4. Partition / Distribute the data by time
“The BI team wanted to calculate some expensive analytics on a few years of data, so we just restored a snapshot and added a bunch of nodes for a few days”
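To make points 2 through 4 a little more concrete, here is a minimal sketch in Python that only builds the SQL text one might send to Redshift. It is not taken from the materials summarized above. The table name, column names, S3 prefix, and IAM role ARN are all hypothetical, and using a sort key on the timestamp column is an assumption about how "by time" could be expressed. The COPY and CREATE TABLE keywords are standard Redshift SQL, but nothing here connects to a cluster.

```python
def build_create_table(table: str, time_column: str) -> str:
    """Return DDL for a table whose rows are sorted by a timestamp column.

    The column list is hypothetical. DISTSTYLE EVEN spreads rows across
    the nodes; SORTKEY on the time column is an assumed way to organize
    the data by time.
    """
    return (
        f"CREATE TABLE {table} (\n"
        f"    event_id BIGINT,\n"
        f"    {time_column} TIMESTAMP,\n"
        f"    amount DECIMAL(12, 2)\n"
        f")\n"
        f"DISTSTYLE EVEN\n"
        f"SORTKEY ({time_column});"
    )


def build_copy(table: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Return a COPY statement that loads CSV files from an S3 prefix.

    Redshift reads the files under the prefix in parallel across the
    cluster, so splitting the input into several files helps.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS CSV;"
    )


# Hypothetical names, for illustration only.
ddl = build_create_table("sales_events", "event_time")
copy_sql = build_copy(
    "sales_events",
    "s3://example-bucket/sales/",
    "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)

print(ddl)
print(copy_sql)
```

The generated statements could then be run through any SQL client that can connect to the cluster, which is the point of item 2: access is plain SQL.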