Supporting Column Store Performance Claims

Comments (0)
In this post, Mike Stonebraker tackles two issues with regard to row- versus column-store databases. First, he looks at performance challenges given the demands of users. Second, he discusses the availability of third-party connectivity as well as automatic database design tools. Continue reading "Supporting Column Store Performance Claims" »

In response to Monash's post on the four categories of RDBMS

Comments (0)
In this response to a Curt Monash post over at the DBMS2 blog, Mike Stonebraker offers his reactions. He sees two categories of relational analytic/data warehouse databases, row stores and column stores, and notes that they have very different characteristics and should not be lumped together. He also points out that if high performance is required, current high-end relational engines can be beaten by a factor of 80 or so on TPC-C. Continue reading "In response to Monash's post on the four categories of RDBMS" »

Responding to Monash's recent post on diversity of database systems

Comments (1)
In this post, Mike Stonebraker comments on a post over at DBMS2 titled "Database management system choices - overview." Mike makes two points. First, he offers his list of the different types of DBMSs that he sees as viable. Second, he discusses OLTP and the shared nothing architecture. Continue reading "Responding to Monash's recent post on diversity of database systems" »

INSERT performance in column stores

Comments (4)
In this post, Stan Zdonik examines the issue of INSERT performance in column stores. He notes that, with certain strategies, a column store can achieve INSERT performance that is at least competitive with that of the major row stores. Continue reading "INSERT performance in column stores" »

MapReduce II

Comments (22) | TrackBacks (1)
In this follow-up post, David DeWitt and Michael Stonebraker discuss the feedback from their previous post on MapReduce. They focus on four criticisms of their first article: 1) that MapReduce is not a database system and should not be judged as one; 2) that MapReduce has excellent scalability, demonstrated by Google's use; 3) that MapReduce is cheap compared to high-end DBMS solutions; and 4) that their stance was the result of DBMS "gray beards" trying to defend their turf/legacy from the MapReduce "young turks." Continue reading "MapReduce II" »

MapReduce: A major step backwards

Comments (40) | TrackBacks (1)
In this post, David DeWitt and Michael Stonebraker discuss MapReduce. While it may be a good idea for writing certain types of general-purpose computations, they believe it is a giant step backward in the programming paradigm for large-scale data-intensive applications; a sub-optimal implementation, in that it uses brute force instead of indexing; not novel, as it represents a specific implementation of well-known techniques developed nearly 25 years ago; missing most of the features that are routinely included in current DBMSs; and incompatible with all of the tools DBMS users have come to depend on. Continue reading "MapReduce: A major step backwards" »

Relational databases for storing and querying RDF

Comments (1)
The Resource Description Framework (RDF) is a way to describe information about relationships between entities and objects. It was originally developed by the W3C as a way to describe information about resources on the Web. It is intended to be the data model used in the Semantic Web, where web pages contain not just text but also structured records describing the data they contain and the relationships in that data. In this post, Sam Madden and Daniel Abadi discuss RDF and database issues. Continue reading "Relational databases for storing and querying RDF" »

The Database Column in 2008: Building on initial success

With 2007 now in the books, all of us affiliated with the Database Column blog want to thank you for your readership and thoughtful commentary. There are many topics in the publishing queue, but we want to make sure we are covering topics that matter to readers. We encourage you to send us your questions, comments, and ideas for new topics. Continue reading "The Database Column in 2008: Building on initial success" »

To ETL or federate ... that is the question

Comments (0)
Enterprises must integrate data from a number of operational systems. But how should they do it? There are two technical approaches: ETL or federation. Michael Stonebraker discusses the pros and cons of each approach with regard to data element "heat," indexing, resource management, complexity of schema change, contention, timeliness, and mapping, concluding that the ETL approach makes sense in most cases. Continue reading "To ETL or federate ... that is the question" »

The new economics of the BI market

Comments (0) | TrackBacks (1)
Consolidation within the Business Intelligence (BI) market continues. After more than a dozen acquisitions made by Business Objects, Cognos, and Hyperion over the past few years, these BI tools/analytics industry leaders were themselves snapped up in a matter of months by SAP, IBM, and Oracle, respectively. But economies of scale enabled by consolidation are just one of the two primary drivers of the new economics of BI. Jerry Held explains how the other driver is economies of innovation, which result from the continuing stream of new entrants. Continue reading "The new economics of the BI market" »