Haderle responds to commenters regarding RDBMS history

I noticed a couple of comments in response to my recent blog post, "Once upon a time ... the origins of today's relational database architectures." I respond to them below.

Reader "Dave" commented [note: we have made a few edits to the following comment]:

Old-fashioned relational database modeling was designed to limit CPU and I/O for both transaction processing and analytical processing by enforcing structure on the data for the sake of efficiency. Modern CRMs have traded structure for flexibility (a generalized database structure), maintaining OLTP performance at the expense of analytical performance. OLAP inverts the view to speed up seeks at the cost of slow CRUD operations while maintaining flexibility, but it introduces I/O and CPU overhead for synchronizing and operating a second database.

Can we go back to structure and a single engine or is the man-hour cost of implementation not worth the low operating cost of an extract to columnar for analytics that are seldom predefined?

One of the mantras of the relational database was design flexibility to handle unanticipated uses of the data. Codd observed that prior databases were fragile when it came to handling additional information or additional usage. It was well known that new fields were always added at the bottom right of a hierarchy, not in the segment/record where the information most deserved to be. This was done simply to avoid having to redefine the applications and databases.

While a relational design may be flexible, it may not be feasible to execute given performance considerations. Hence, one denormalizes the data design, sacrificing flexibility for performance. This is still true today in database design, whether the database is relational or something else.
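To make the trade-off concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables and columns (customer, orders, orders_wide) are hypothetical names invented for illustration, not taken from any real system. The denormalized copy answers the report without a join, but a single change to a customer has to touch every copied row.

```python
import sqlite3

def build_db():
    """Create a tiny in-memory database holding both designs (hypothetical schema)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, city TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
        INSERT INTO customer VALUES (1, 'Austin'), (2, 'Boston');
        INSERT INTO orders VALUES (10, 1, 50), (11, 1, 70), (12, 2, 20);
        CREATE TABLE orders_wide AS
            SELECT o.id, o.customer_id, o.amount, c.city
            FROM orders o JOIN customer c ON c.id = o.customer_id;
    """)
    return con

def totals_normalized(con):
    """Report sales by city; the normalized design needs a join."""
    return con.execute("""
        SELECT c.city, SUM(o.amount) FROM orders o
        JOIN customer c ON c.id = o.customer_id
        GROUP BY c.city ORDER BY c.city
    """).fetchall()

def totals_denormalized(con):
    """Same report from the denormalized table; no join required."""
    return con.execute(
        "SELECT city, SUM(amount) FROM orders_wide GROUP BY city ORDER BY city"
    ).fetchall()

def move_customer(con, customer_id, new_city):
    """Return (rows updated in the normalized design, rows updated in the wide table)."""
    normalized = con.execute(
        "UPDATE customer SET city = ? WHERE id = ?", (new_city, customer_id)
    ).rowcount
    denormalized = con.execute(
        "UPDATE orders_wide SET city = ? WHERE customer_id = ?", (new_city, customer_id)
    ).rowcount
    return normalized, denormalized
```

Calling `move_customer(build_db(), 1, "Dallas")` updates one row in the normalized design and two in the wide table, which is the flexibility being given up in exchange for the cheaper read.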

You ask whether it's possible to build a single DBMS implementation that handles all usages (OLTP, analytics, ...) well enough that we don't need another. I think not. Existing implementations work. The US Federal Government still maintains tax systems on M204, which is a 1970s inverted database. It works, and the cost of changing the applications that use it is prohibitive. Someday the economics may change, but not at present. So one leaves the existing system intact and moves the data to another system (e.g., columnar) to do analysis or processing for which M204 is not intended.
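As a rough sketch of what "moving the data to a columnar system" means at its simplest, the Python below pivots row-oriented records into one list per column. The function name extract_to_columns and the sample rows are assumptions made up for this illustration; a real extract would also handle types, missing fields, and incremental loads.

```python
def extract_to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists.

    Assumes every row carries the same keys (a simplification for illustration).
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns

# Hypothetical rows as a transactional system might hand them over.
rows = [
    {"id": 1, "region": "east", "amount": 50},
    {"id": 2, "region": "west", "amount": 70},
    {"id": 3, "region": "east", "amount": 20},
]
columns = extract_to_columns(rows)

# An analytic scan now reads only the two columns it needs.
east_total = sum(
    amount
    for region, amount in zip(columns["region"], columns["amount"])
    if region == "east"
)
```

The transactional system keeps its row layout untouched; only the extracted copy is reorganized for scans and aggregates.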


Reader "Michael M David" commented [note: this comment has been truncated; the full comment is on the original post's page]:

My company has finally been able to turn ANSI SQL's processor into a full nonlinear hierarchical processor and the same relational optimizations you mention work to make the hierarchical processing more efficient. We have naturally extended this ANSI SQL hierarchical processing to ANSI SQL transparent native XML hierarchical processing. This shows there are still new areas of ANSI SQL operation to explore and utilize.

Making ANSI SQL operate as a hierarchical processor is done by modeling nonlinear hierarchical structures using a series of SQL-92 Left Outer Joins. The Left Outer Join syntax models the nonlinear hierarchical structure requiring processing and its associated semantics specifies how to process the defined hierarchical structure hierarchically. The Left Outer Join is also performing the required hierarchical data preservation, and the insertions of the NULLs allow multiple legs each with varying lengths to be stored correctly and separately in the working rowset and resulting rowset. Relational projection where all columns for a node are not selected causes hierarchical node promotion and node collection.
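For readers who want to see the basic idea, here is a small sketch in Python with the built-in sqlite3 module. It is not the commenter's product, only plain SQL left outer joins over a hypothetical two-leg hierarchy (a dept parent with emp and proj children, all names invented here). It shows the data-preservation part: a parent with a missing leg, or with no children at all, still appears, with NULLs filling the gaps.

```python
import sqlite3

def hierarchy_rows():
    """Join a parent to two child legs with LEFT OUTER JOINs (hypothetical schema)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE emp  (id INTEGER PRIMARY KEY, dept_id INTEGER, name TEXT);
        CREATE TABLE proj (id INTEGER PRIMARY KEY, dept_id INTEGER, title TEXT);
        INSERT INTO dept VALUES (1, 'Sales'), (2, 'Research'), (3, 'Legal');
        INSERT INTO emp  VALUES (1, 1, 'Ann'), (2, 1, 'Bob'), (3, 2, 'Cy');
        INSERT INTO proj VALUES (1, 2, 'Alpha');
    """)
    # dept is the root; emp and proj are sibling legs hanging off it.
    return con.execute("""
        SELECT d.name, e.name, p.title
        FROM dept d
        LEFT OUTER JOIN emp  e ON e.dept_id = d.id
        LEFT OUTER JOIN proj p ON p.dept_id = d.id
        ORDER BY d.id, e.id, p.id
    """).fetchall()

for row in hierarchy_rows():
    print(row)
```

Sales keeps both employees even though it has no project, and Legal survives as a row of NULLs (None in Python) even though both of its legs are empty. An inner join would have dropped both.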

It's interesting that we have reused much of our hierarchical brethren's technology inside relational databases, especially with the incorporation of XML, while still preserving the nonprocedural attribute of relational. This is true for IBM relational databases as well as many others. It's good to see you doing this.

It's also interesting to see hierarchical databases (IMS, ...) incorporate relational technology and support XML data natively.

 


About this Post

This page contains a single post by Don Haderle published on November 28, 2007 3:34 PM.
