
Dr. Madden is an Associate Professor in the EECS department at MIT and a member of MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL). His research interests span all areas of database systems; past projects include the TinyDB system for data collection from sensor networks and the Telegraph adaptive query processing engine. His current research focuses on modeling and statistical techniques for value prediction and outlier detection in sensor networks, high performance database systems, and networking and data processing in disconnected environments.
Dr. Madden:
I have read with interest your entries in "The Database Column" about parallelism. Those columns helped me understand several basic concepts related to that technology. Thank you. However, there is one issue that I didn't see you address (unless I missed it, but I don't think I did). I'm curious which of the three parallel architectures you discussed in your column dated 10/30/07 (shared-memory, shared-disk and shared-nothing) is advantaged the most by multithreaded multi-core processors. Intel said on 3/3/08 that it was on schedule to deliver octo-core processors using the Nehalem microarchitecture. These chips will include something called an "integrated memory controller," which leads me to believe either the shared-memory or the shared-nothing parallel database architecture is potentially the best candidate to fully utilize this new processor technology. If so, is there one that is especially promising given the Nehalem and Larrabee chipsets from Intel?
It seems like the shared-memory approach can potentially utilize the multi-core processors the most; however, I believe these processors will still have the shared-bus issue. Am I correct about that? Even if so, it still seems like multi-cores will significantly enhance the processing speed of what is otherwise limited by the I/O and memory requests transferring over admittedly the same bus. If the shared-memory approach is not advantaged by multi-cores, would shared-nothing become more advantaged since multi-core processors could, in effect, be assigned multi-sets of disks? I'm not sure how a multi-core processor could share data across "horizontally partitioned" nodes, but it seems like it would be feasible.
I hope this is a question that all of the readers of "The Database Column" blog will benefit from hearing more about from you. If I'm asking too neophytic a question, I apologize. Hopefully, you'll see some benefit in helping me out with this inquiry. Thanks again for your contributions to the blog at "The Database Column".