Multiprocessing: Trade-Offs in Computation and Communication
Multiprocessing: Trade-Offs in Computation and Communication presents an in-depth analysis of several commonly observed regular and irregular computations for multiprocessor systems. This book includes techniques which enable researchers and application developers to quantitatively determine the effects of algorithm data dependencies on execution time, on communication requirements, on processor utilization and on the speedups possible.
Starting with simple, two-dimensional, diamond-shaped directed acyclic graphs, the analysis is extended to more complex and higher dimensional directed acyclic graphs. The analysis allows for the quantification of the computation and communication costs and their interdependencies. The practical significance of these results on the performance of various data distribution schemes is clearly explained. Using these results, the performance of the parallel computations are formulated in an architecture independent fashion. These formulations allow for the parameterization of the architecture specitific entities such as the computation and communication rates. This type of parameterized performance analysis can be used at compile time or at run-time so as to achieve the most optimal distribution of the computations.
The material in Multiprocessing: Trade-Offs in Computation and Communication connects theory with practice, so that the inherent performance limitations in many computations can be understood, and practical methods can be devised that would assist in the development of software for scalable high performance systems.
What people are saying - Write a review
We haven't found any reviews in the usual places.
Other editions - View all
3-D dag algorithm analyzed architecture assignment scheme asymptotically optimal BLOCC scheme bottommost vertex chains Cholesky decomposition Cholesky factorization columns communication processor-node pairs communication requirements computation sweep compute the dag corresponding d-dimensional dag dag considered data dependencies data traffic associated data traffic complexity dense dependency arcs diagonal planes effect expanding edges Gaussian elimination given grid graph high(T horizontal sub-separators Lemma loop low(T lower bound m-vertex magnitude sense method of partitioning minimum data traffic minimum total execution multiprocessor multiprocessor system n2 dag nested dissection node nonzero elements number of processors optimal parallel partitioning computes partitioning scheme performance positive definite matrix processor pi proof recomputation rectangular dag region FGH schedule shared memory shown in Figure sparse matrix speedup square blocks method strips method subcubes subdiamond sweep time less symmetric positive definite Theorem tion topmost vertex total data traffic tradeoff factor uniform load distribution values vertical strips vertical sub-separator X n dag x-plane