Hardware and Software Mechanisms for Reducing Load Latency

University of Wisconsin--Madison, 1996 - Computer architecture - 372 pages
Abstract: "As processor demands quickly outpace memory, the performance of load instructions becomes an increasingly critical component to good system performance. This thesis contributes four novel load latency reduction techniques, each targeting a different component of load latency: address calculation, data cache access, address translation, and data cache misses. The contributed techniques are as follows: Fast Address Calculation employs a stateless set index predictor to allow address calculation to overlap with data cache access. The design eliminates the latency of address calculation for many loads. Zero-Cycle Loads combine fast address calculation with an early-issue mechanism to produce pipeline designs capable of hiding the latency of many loads that hit in the data cache. High-Bandwidth Address Translation develops address translation mechanisms with better latency and area characteristics than a multi-ported TLB. The new designs provide multiple-issue processors with effective alternatives for keeping address translation off the critical path of data cache access. Cache-conscious Data Placement is a profile- guided data placement optimization for reducing the frequency of data cache misses. The approach employs heuristic algorithms to find variable placement solutions that decrease inter-variable conflict, and increase cache line utilization and block prefetch. Detailed design descriptions and experimental evaluations are provided for each approach, confirming the designs as cost-effective and practical solutions for reducting load latency."


Contents

High-Bandwidth Address Translation    10
Experimental Framework    13
Fast Address Calculation    19

10 other sections not shown

