Recent Advances in Parallel Virtual Machine and Message Passing Interface: 11th European PVM/MPI Users' Group Meeting, Budapest, Hungary, September 19-22, 2004, Proceedings

Dieter Kranzlmüller, Peter Kacsuk, Jack Dongarra

The message passing paradigm is the most frequently used approach to develop high-performance computing applications on parallel and distributed computing architectures. Parallel Virtual Machine (PVM) and Message Passing Interface (MPI) are the two main representatives in this domain. This volume comprises 50 selected contributions presented at the 11th European PVM/MPI Users' Group Meeting, which was held in Budapest, Hungary, September 19–22, 2004. The conference was organized by the Laboratory of Parallel and Distributed Systems (LPDS) at the Computer and Automation Research Institute of the Hungarian Academy of Sciences (MTA SZTAKI).

The conference was previously held in Venice, Italy (2003), Linz, Austria (2002), Santorini, Greece (2001), Balatonfüred, Hungary (2000), Barcelona, Spain (1999), Liverpool, UK (1998), and Kraków, Poland (1997). The first three conferences were devoted to PVM and were held in Munich, Germany (1996), Lyon, France (1995), and Rome, Italy (1994).

In its eleventh year, this conference is well established as the forum for users and developers of PVM, MPI, and other message passing environments. Interactions between these groups have proved to be very useful for developing new ideas in parallel computing, and for applying some of those already existent to new practical fields. The main topics of the meeting were evaluation and performance of PVM and MPI; extensions, implementations and improvements of PVM and MPI; parallel algorithms using the message passing paradigm; and parallel applications in science and engineering. In addition, the topics of the conference were extended to include cluster and grid computing, in order to reflect the importance of this area for the high-performance computing community.
Contents
Fast Tuning of Intracluster Collective Communications | 28 |
More Efficient Reduction Algorithms for Non-Power-of-Two Number | 36 |
Zero-Copy MPI Derived Datatype Communication over InfiniBand | 47 |
Minimizing Synchronization Overhead in the Implementation | 57 |
Efficient Implementation of MPI-2 Passive One-Sided Communication | 68 |
Providing Efficient I/O Redundancy in MPI Environments | 77 |
The Impact of File Systems on MPI-IO Scalability | 87 |
Goals Concept and Design | 97 |
The Architecture and Performance of WMPI II | 112 |
A New MPI Implementation for Cray SHMEM | 122 |
Algorithms | 131 |
BSP/CGM Algorithms for Maximum Subsequence | 139 |
A Parallel Approach for a Non-rigid Image Registration Algorithm | 147 |
Asynchronous Distributed Broadcasting in Cluster Environment | 164 |
Nesting OpenMP and MPI in the Conjugate Gradient Method | 181 |
Applications | 199 |
A Grid-Based Parallel Maple | 215 |
Parallel Simulations of Electrophysiological Phenomena in Myocardium | 234 |
Parallel I/O in an Object-Oriented Message-Passing Library | 251 |
Detecting Unaffected Race Conditions in Message-Passing Programs | 268 |
A Lightweight Framework for Executing Task Parallelism | 287 |
A High-Performance Scalable | 303 |
Identifying Logical Homogeneous Clusters | 319 |
Heterogeneous Parallel Computing Across Multidomain Clusters | 337 |
A Domain Decomposition Strategy for GRID Environments | 353 |
A PVM Extension to Exploit Cluster Grids | 362 |
An Initial Analysis of the Impact of Overlap | 370 |
A Performance-Oriented Technique for Hybrid Application Development | 378 |
A Refinement Strategy for a User-Oriented Performance Analysis | 388 |
What Size Cluster Equals a Dedicated Chip | 397 |
Architecture and Performance of the BlueGene/L Message Layer | 405 |
ParSim 2004 | 415 |
On the Parallelization of a Cache-Optimal Iterative Solver for PDEs | 425 |
A Framework for Optimising Parameter Studies on a Cluster Computer | 436 |
Numerical Simulations on PC Graphics Hardware | 442 |
Author Index | 451 |