Introduction to Parallel Processing: Algorithms and Architectures
THE CONTEXT OF PARALLEL PROCESSING
The field of digital computer architecture has grown explosively in the past two decades. Through a steady stream of experimental research, tool-building efforts, and theoretical studies, the design of an instruction-set architecture, once considered an art, has been transformed into one of the most quantitative branches of computer technology. At the same time, better understanding of various forms of concurrency, from standard pipelining to massive parallelism, and invention of architectural structures to support a reasonably efficient and user-friendly programming model for such systems, have allowed hardware performance to continue its exponential growth. This trend is expected to continue in the near future.
This explosive growth, linked with the expectation that performance will continue its exponential rise with each new generation of hardware and that (in stark contrast to software) computer hardware will function correctly as soon as it comes off the assembly line, has its downside. It has led to unprecedented hardware complexity and almost intolerable development costs. The challenge facing current and future computer designers is to institute simplicity where we now have complexity; to use fundamental theories being developed in this area to gain performance and ease-of-use benefits from simpler circuits; and to understand the interplay between technological capabilities and limitations, on the one hand, and design decisions based on user and application requirements, on the other.
Contents
Fundamental Concepts
Problems
References and Suggested Reading
A Taste of Parallel Algorithms
Parallel Algorithm Complexity
Extreme Models
Problems
Sorting and Routing on Hypercubes
Other Hypercubic Architectures
A Sampler of Other Networks
Some Broad Topics
Problems
Problems
Implementation Aspects
Other editions
Introduction to Parallel Processing: Algorithms and Architectures, Behrooz Parhami, 2006
Introduction to Parallel Processing: Algorithms and Architectures, Behrooz Parhami, 2013
Common terms and phrases
2-sorter 2D mesh applications binary tree bisection width bitonic block broadcasting buffer butterfly network cache Chapter chip column communication complexity components connected cycle defined delay destination diameter discussed disk Distributed edge efficient elements emulation example execution Figure floating-point hardware hypercube IEEE ILLIAC IV implementation input instruction integer interconnection network label latency linear array lower bound main memory malfunctioning matrix multiplication memory access MIMD multiprocessor needed neighbors node degree number of processors O(log odd–even operations optimal output p-processor packet Parallel Algorithms parallel computers parallel machine parallel prefix computation parallel processing parallel systems path performance permutation phase pipelined PRAM prefix sum Proc q-cube recursive result ring routing problem row-major order scalable scheduling scheme Section semigroup computation sequence sequential shared-memory shearsort Show shown in Fig sieve of Eratosthenes SIMD sorting algorithm sorting networks speed-up steps switch torus vector wormhole routing