July 2008 Archives

We debunk another commonly proposed approach for making a row-store perform like a column-store: vertically partitioning a row-store. Vertical partitioning is a trick that some DBAs use to improve performance on read-mostly data warehouse workloads. The idea is to store an n-column table in n new tables. Each of these new tables contains two columns: a tuple ID column and a data value column from the original table. Continue reading "Debunking Another Myth: Column-Stores vs. Vertical Partitioning" »
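As a rough illustration of the layout described above, the following sketch splits a small in-memory table into one two-column table per attribute and then rebuilds a subset of columns by joining on the tuple ID. The function names, the sample rows, and the use of the row position as the tuple ID are all assumptions made for this example; they are not taken from the post.

```python
def vertically_partition(rows, columns):
    """Split an n-column table into n two-column tables of (tuple_id, value)."""
    partitions = {name: [] for name in columns}
    for tuple_id, row in enumerate(rows):  # row position serves as the tuple ID (assumption)
        for name, value in zip(columns, row):
            partitions[name].append((tuple_id, value))
    return partitions


def reconstruct(partitions, wanted):
    """Rebuild only the requested columns by joining partitions on tuple_id."""
    lookups = [dict(partitions[name]) for name in wanted]
    tuple_ids = sorted(lookups[0])
    return [tuple(lookup[tid] for lookup in lookups) for tid in tuple_ids]


# Hypothetical three-column table.
columns = ["customer", "region", "amount"]
rows = [("alice", "east", 30), ("bob", "west", 45), ("carol", "east", 25)]

partitions = vertically_partition(rows, columns)
customer_amounts = reconstruct(partitions, ["customer", "amount"])
```

A query that touches only `customer` and `amount` reads two of the three partitions, but it pays for a join on the tuple ID to stitch the values back together, and every partition also stores its own copy of the tuple IDs.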

Consider a traditional, row-oriented database. Indexes are known to improve performance in database systems. They can greatly reduce I/O costs by avoiding the need to perform table scans, since they either directly contain the data you need to answer a query or contain pointers to that data. If you have a query that accesses only two out of thirty columns from a large table, and you have an index on these two columns, then you can use the index to avoid scanning all of the data in the table. Continue reading "Debunking a Myth: Column-Stores vs. Indexes" »
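The scenario above can be tried out with SQLite from Python's standard library. This is only a sketch: the table `wide`, its columns, and the index name `idx_a_b` are invented for the example, and the table has far fewer than thirty columns. The point is that the query plan reports a covering index, meaning the query is answered from the index without reading the table rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table; "filler" stands in for the columns the query does not need.
conn.execute(
    "CREATE TABLE wide (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, filler TEXT)"
)
conn.executemany(
    "INSERT INTO wide (a, b, filler) VALUES (?, ?, ?)",
    [(i % 10, i, "x" * 20) for i in range(1000)],
)

# Index on exactly the two columns the query reads.
conn.execute("CREATE INDEX idx_a_b ON wide (a, b)")

# The last field of each plan row is the human-readable description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT a, b FROM wide WHERE a = 3"
).fetchall()
plan_text = " ".join(row[-1] for row in plan)

result = conn.execute("SELECT a, b FROM wide WHERE a = 3 ORDER BY b").fetchall()
```

Printing `plan_text` shows how SQLite intends to run the query. The exact wording varies between SQLite versions.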

Both column-stores and data cubes are designed to provide high performance on analytical database workloads (often referred to as Online Analytical Processing, or OLAP). These workloads are characterized by queries that select a subset of tuples and then aggregate and group along one or more dimensions. In this post, we study how column-stores and data cubes would evaluate a query on a sample database. Continue reading "Understanding the Difference Between Column-Stores and OLAP Data Cubes" »
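To make the select-then-aggregate query shape concrete, here is a toy sketch of the two approaches. The data, the function names, and the query (total sales per region from a given year onward) are assumptions for illustration and are not the sample database used in the post. The column-oriented version scans only the columns it needs at query time, while the cube version answers from totals precomputed for every (region, year) cell.

```python
from collections import defaultdict

# Hypothetical fact table stored column by column.
region = ["east", "west", "east", "west", "east"]
year = [2007, 2007, 2008, 2008, 2008]
sales = [10, 20, 30, 40, 5]


def column_store_query(min_year):
    """Scan the needed columns, filter on year, then group by region."""
    totals = defaultdict(int)
    for r, y, s in zip(region, year, sales):
        if y >= min_year:
            totals[r] += s
    return dict(totals)


def build_cube():
    """Precompute total sales for every (region, year) cell."""
    cube = defaultdict(int)
    for r, y, s in zip(region, year, sales):
        cube[(r, y)] += s
    return dict(cube)


def cube_query(cube, min_year):
    """Answer the same query by rolling up precomputed cells."""
    totals = defaultdict(int)
    for (r, y), cell_total in cube.items():
        if y >= min_year:
            totals[r] += cell_total
    return dict(totals)


cube = build_cube()
from_columns = column_store_query(2008)
from_cube = cube_query(cube, 2008)
```

Both paths return the same totals. The difference is when the work happens: the cube pays up front to aggregate along dimensions chosen in advance, while the column scan does the work per query and is not limited to those dimensions.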

About this Archive

This page is an archive of posts from July 2008 listed from newest to oldest.
