Dylan's BI Study Notes

My notes about Business Intelligence, Data Warehousing, OLAP, and Master Data Management

About Big Data (2): SQL and No SQL

Posted by Dylan Wan on April 14, 2012

This is a good article written by Alan Gates, Pig architect at Yahoo!

Comparing Pig Latin and SQL for Constructing Data Processing Pipelines

He compares Pig Latin, a query language for Hive with SQL, the query language for relational database. He gave a good example that helps those who know SQL to understand the differences.

The query language is not difficult to write.  The key point is that it lets the programmer to control the execution plan.  SQL on the other is a higher level abstraction that hides the execution plan from users.  It does not means that we do not really need to be worried about execution plan.  We do.  It is a key for performance tuning.

A good practice for SQL developers is to get the explain plan and to see how Oracle plans to execute the query and optionally use hint and index to control the execution.

Comparing to this, the approach of directly telling the system what to do may seem easier.

But it means that the database system to handle big data does less things, not smarter.

Advertisements

One Response to “About Big Data (2): SQL and No SQL”

  1. Thanks Dylan for sharing the information. It was indeed very informative.

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s