## Compressed Data Structures for Strings: On Searching and Extracting Strings from Compressed Textual Data

Data compression is mandatory to manage massive datasets, and indexing is fundamental to query them. However, their goals appear opposed: the former aims at minimizing data redundancies, whereas the latter augments the dataset with auxiliary information to speed up query resolution. In this monograph we introduce solutions that overcome this dichotomy. We start by presenting the use of optimization techniques to improve the compression achieved by classical data compression algorithms; then we move to the design of compressed data structures providing fast random access or efficient pattern matching queries on the compressed dataset. These theoretical studies are supported by experimental evidence of their impact in practical scenarios.

### Contents

3 Optimally Partitioning a Text to Improve Its Compression

4 Bit-Complexity of Lempel-Ziv Compression

5 Fast Random Access on Compressed Data

6 Experiments on Compressed Full-Text Indexing

7 Dictionary Indexes

8 Future Directions of Research

Bibliography

### Other editions

Compressed Data Structures for Strings: On Searching and Extracting Strings ... Rossano Venturini, 2013
