Advances in Computer Systems Architecture: 12th Asia-Pacific Conference, ACSAC 2007, Seoul, Korea, August 23-25, 2007, Proceedings

On behalf of the program and organizing committee members of this conference, we are pleased to present you with the proceedings of the 12th Asia-Pacific Computer Systems Architecture Conference (ACSAC 2007), which was hosted in Seoul, Korea on August 23-25, 2007. This conference has traditionally been a forum for leading researchers in the Asian, American and Oceanian regions to share recent progress and the latest results in both architectural and system issues. In the past few years the conference has become more international in the sense that the geographic origin of participants has become broader to include researchers from all around the world, including Europe and the Middle East.

This year, we received 92 paper submissions. Each submission was reviewed by at least three primary reviewers along with up to three secondary reviewers. The total number of completed reviews reached 333, giving each submission 3.6 reviews on average. All the reviews were carefully examined during the paper selection process, and finally 26 papers were accepted, resulting in an acceptance rate of about 28%. The selected papers encompass a wide range of topics, with much emphasis on hardware and software techniques for state-of-the-art multicore and multithreaded architectures.
Contents
A Compiler Framework for Supporting Speculative Multicore Processors
Power-Efficient Heterogeneous Multicore Technology for Digital Convergence
An Efficient Multiplatform Dynamic Binary Translation System
An Open Problem
An Online Profile Guided Optimization Approach for Speculative Parallel Threading
Entropy-Based Profile Characterization and Classification for Automatic Profile Management
Laplace Transformation on the FT64 Stream Processor
Towards Data Tiling for Whole Programs in Scratchpad Memory Allocation
Evolution of NAND Flash Memory Interface
A Fast Close-Coupled Shared Data Pool for Multicore DSPs
Exploiting Single-Usage for Effective Memory Management
An Alternative Organization of Defect Map for Defect-Resilient Embedded On-Chip Memories
An Effective Design of Master-Slave Operating System Architecture for Multiprocessor Embedded Systems
Optimal Placement of Frequently Accessed IPs in Mesh NoCs
An Efficient Link Controller for Test Access to IP Core-Based Embedded System Chips
Performance of Keyword Connection Algorithm in Nested Mobility Networks
Leakage Energy Reduction in Cache Memory by Software Self-invalidation
Exploiting Task Temperature Profiling in Temperature-Aware Task Scheduling for Computational Clusters
Runtime Performance Projection Model for Dynamic Power Management
A Power-Aware Alternative for the Perceptron Branch Predictor
Power Consumption and Performance Analysis of 3D NoCs
A Design Methodology for Performance-Resource Optimization of a Generalized 2D Convolution Architecture with Quadrant Symmetric Kernels
Bipartition Architecture for Low Power JPEG Huffman Decoder
A SWP Specification for Sequential Image Processing Algorithms
A Stream System-on-Chip Architecture for High Speed Target Recognition Based on Biologic Vision
FPGA-Accelerated Active Shape Model for Real-Time People Tracking
Performance Evaluation of Evolutionary Multicore and Aggressively Multithreaded Processor Architectures
Synchronization Mechanisms on Modern Multicore Architectures
Concerning with On-Chip Network Features to Improve Cache Coherence Protocols for CMPs
A New Fault-Tolerant Mathematical Model for Adaptively Wormhole-Routed Interconnect Networks
Open Issues in MPI Implementation
Implicit Transactional Memory in Kilo-Instruction Multiprocessors
Design of a Low-Power Embedded Processor Architecture Using Asynchronous Function Units
A Bypass Mechanism to Enhance Branch Predictor for SMT Processors
Thread Priority-Aware Random Replacement in TLBs for a High-Performance Real-Time SMT Processor
Architectural Solution to Object-Oriented Programming