Architectural Considerations for Parallel Query Evaluation Algorithms
Abstract: "Parallelism is key to high performance relational database systems. Since there are several parallel architectures suitable for database systems, a few interesting problems arise, mostly from an emphasis on the differences among the architectures. Specifically, in the literature, differences rather than similarities between the architectures are pointed out, and the specific details of a particular architecture, crucial to high perfomance, are generally ignored. In this thesis we have attempted to remedy this situation by emphasizing the similarities and a deeper understanding of two popular parallel architectures, shared nothing and shared memory, from a database perspective. We show that there is complementarity and similarity in the two architectures by showing that software shared-memory support can be used to improve performance on shared-nothing hardware and by showing that shared-nothing software can run on shared-memory hardware with performance comparable to that of 'native' algorithms. We also show that by understanding the architectural details and tradeoffs, we can design algorithms that have superior performance. We illustrate this via examples of hash join algorithms on shared-memory hardware that exploit cache memories, hash aggregation algorithms on shared-nothing hardware that tradeoff communication for memory consumption, and hash aggregation algorithm on shared-memory hardware that tradeoff computation for reduced latch conflicts. All these algorithms show performance superior to the previously known algorithms."
What people are saying - Write a review
We haven't found any reviews in the usual places.
Hash Join using Shared Virtual Memory
Hash Join on Shared Memory Hardware
Hash Aggregation on Shared Nothing Hardware
3 other sections not shown
Adaptive Two Phase algorithm performs basic Basic/SVM algorithm building relation cache coherence cache memory Centralized Two Phase Chapter cumulative value data skew disk foreach global hash table GROUP BY attributes group entry already group values groups is large groups is small hash based aggregation hash bucket hash join algorithm hash table bucket hash-based Hybrid algorithm hybrid hash join I/O cost implementation intermediate I/O join attribute join product skew latch conflicts load balancing message passing library modify the cumulative multiprocessors number of groups number of processors number of tuples optimized output tuple overhead page fault parallel architectures parallel hybrid hash Phase algorithm probing relation query redistribution skew Repartitioning algorithm repartitioning the relations result tuples rithms Samp/SVM Sampling scaleup characteristics shared virtual memory shared-memory algorithms shared-nothing algorithm shared-nothing architecture shows Simple algorithm simulation skewed node SMPs software architectures Step algorithm tradeoff uniprocessor update value else insert Zipf distribution