Marek, R. ; Rahm, E.

On the Performance of Parallel Join Processing in Shared Nothing Database Systems

Proc. 5th Int. PARLE Conf. (Parallel Architectures and Languages Europe), Springer Lecture Notes in Computer Science 694, pp. 622-633

1993

Paper

Abstract

Parallel database systems aim at providing high throughput for OLTP transactions as well as short response times for complex and data-intensive queries. Shared nothing systems represent the major architecture for parallel database processing. While the performance of such systems has been extensively analyzed in the past, the corresponding studies have made a number of best-case assumptions. In particular, almost all performance studies on parallel query processing assumed single-user mode, i.e., that the entire system is exclusively reserved for processing a single query. We study the performance of parallel join processing under more realistic conditions, in particular for multi-user mode. Experiments conducted with a detailed simulation model of shared nothing systems demonstrate the need for dynamic load balancing strategies for efficient join processing in multi-user mode. We focus on two major issues: (a) determining the number of processors to be allocated for the execution of join queries, and (b) determining which processors are to be chosen for join processing. For these scheduling decisions, we consider the current resource utilization as well as the size of intermediate results. Even simple dynamic scheduling strategies are shown to outperform static schemes by a large margin.
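
The two scheduling decisions named in the abstract, (a) how many processors and (b) which ones, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `choose_join_processors`, the `pages_per_processor` threshold, and the rule "size the join by the intermediate result, then pick the least utilized nodes" are all assumptions made here for illustration only.

```python
import math


def choose_join_processors(cpu_utilization, result_pages, pages_per_processor=1000):
    """Pick processors for one join query (illustrative sketch, not the paper's method).

    cpu_utilization     -- dict mapping processor id to current utilization (0.0 to 1.0)
    result_pages        -- estimated size of the intermediate result, in pages
    pages_per_processor -- hypothetical tuning knob: pages one join processor should handle
    """
    if not cpu_utilization:
        raise ValueError("no processors available")

    # (a) Degree of parallelism: grows with the intermediate result size,
    # is at least 1, and never exceeds the number of processors in the system.
    wanted = max(1, math.ceil(result_pages / pages_per_processor))
    degree = min(wanted, len(cpu_utilization))

    # (b) Processor selection: prefer the currently least utilized processors.
    # Ties are broken by processor id so the choice is deterministic.
    ranked = sorted(cpu_utilization, key=lambda node: (cpu_utilization[node], node))
    return ranked[:degree]


if __name__ == "__main__":
    load = {"P1": 0.9, "P2": 0.2, "P3": 0.5, "P4": 0.1}
    print(choose_join_processors(load, result_pages=2500))
```

A static scheme, by contrast, would fix both the degree and the processor set in advance, regardless of the load that other concurrent queries and transactions place on each node.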