Data Structures, Near Neighbor Searches, and Methodology: Fifth and Sixth DIMACS Implementation Challenges : Papers Related to the DIMACS Challenge on Dictionaries and Priority Queues (1995-1996) and the DIMACS Challenge on Near Neighbor Searches (1998-1999)Michael H. Goldwasser, David S. Johnson, Catherine C. McGeoch This book presents reviewed and revised papers from the fifth and sixth DIMACS Implementation Challenge workshops. These workshops, held approximately annually, aim at encouraging high-quality work in experimental analysis of data structures and algorithms. The papers published in this volume are the results of year-long coordinated research projects and contain new findings and insights. Three papers address the performance evaluation of implementations for two fundamental data structures, dictionaries and priority queues as used in the context of real applications. Another four papers consider the still evolving topic of methodologies for experimental algorithmics. Five papers are concerned with implementations of algorithms for nearest neighbor search in high dimensional spaces, an area with applications in information retrieval and data mining on collections of Web documents, DNA sequences, images and various other data types. |
Contents
1 | |
A practical perfect hashing algorithm | 23 |
Computational evaluation of hot queues | 49 |
Nearest neighbor search for data compression | 69 |
Experimental evaluation of diskbased data structures for nearest neighbor | 87 |
Analysis of approximate nearest neighbor searching with clustered point sets | 105 |
Approximate nearest neighbor search using the extended general spacefilling | 125 |
Locally lifting the curse of dimensionality for nearest neighbor search | 177 |
The role of experiment in the theory of algorithms | 191 |
Towards a discipline of experimental algorithmics | 197 |
A theoreticians guide to the experimental analysis of algorithms | 212 |
A bibliography of algorithm experimentation | 251 |
Common terms and phrases
analysis Annulus Method applications approximate nearest neighbor asymptotic average bck2 binary bound bucket cells clustered codebook codewords searched Computer Science constant cost data points data set database decrease-key operations deletion demo DIMACS dimension dimensionality disk-based data structures distance distribution efficient element experimental experiments FIGURE graph Gray code hash function hash table heuristics hot queues hotq2 hypercube IEEE implementation insert instances internal node iteration k-d tree kd-tree linear probing lookup machine mapping maximum memory metric mFKS minimum minimum-ambiguity monotone priority queues nearest neighbor search neighbor search algorithm non-internal node number of prototypes O(nN optimal paper parameters PCP tree performance PET PEEVE pointer priority queue problem Proc query points random rank number real line red-black tree rithm running Section sliding-midpoint solution space space-filling curves sparse array splitting method sub-models Tabu Search theoretical values VAMSplit R-tree variance Vector Quantization worst-case