## Data Streams: Algorithms and ApplicationsData stream algorithms as an active research agenda emerged only over the past few years, even though the concept of making few passes over the data for performing computations has been around since the early days of Automata Theory. The data stream agenda now pervades many branches of Computer Science including databases, networking, knowledge discovery and data mining, and hardware systems. Industry is in synch too, with Data Stream Management Systems (DSMSs) and special hardware to deal with data speeds. Even beyond Computer Science, data stream concerns are emerging in physics, atmospheric science and statistics. Data Streams: Algorithms and Applications focuses on the algorithmic foundations of data streaming. In the data stream scenario, input arrives very rapidly and there is limited memory to store the input. Algorithms have to work with one or few passes over the data, space less than linear in the input size or time significantly less than the input size. In the past few years, a new theory has emerged for reasoning about algorithms that work within these constraints on space, time and number of passes. Some of the methods rely on metric embeddings, pseudo-random computations, sparse approximation theory and communication complexity. The applications for this scenario include IP network traffic analysis, mining text message streams and processing massive data sets in general. Data Streams: Algorithms and Applications surveys the emerging area of algorithms for processing data streams and associated applications. An extensive bibliography with over 200 entries points the reader to further resources for exploration. |

### What people are saying - Write a review

We haven't found any reviews in the usual places.

### Contents

Introduction | 1 |

Fishing | 3 |

Pointer and Chaser | 6 |

14 Lessons | 8 |

Map | 9 |

The Data Stream Phenomenon | 11 |

Data Streaming Formal Aspects | 15 |

42 Motivating Scenarios | 20 |

Streaming Systems | 73 |

New Directions | 77 |

92 Functional Approximation Theory | 78 |

93 Data Structures | 85 |

94 Computational Geometry | 86 |

95 Graph Theory | 88 |

96 Databases | 91 |

97 Hardware | 95 |

43 Other Data Streaming Applications | 24 |

44 Other Applications for Data Stream Models | 26 |

Foundations Basic Mathematical Ideas | 29 |

52 Random Projections | 40 |

Foundations Basic Algorithmic Techniques | 51 |

62 Tree Method | 54 |

63 Other Algorithmic Techniques | 62 |

Foundations Summary | 67 |

72 Summary and Data Stream Principles | 69 |

98 Streaming Models | 96 |

99 Data Stream Quality Monitoring | 101 |

910 FishEye View | 103 |

Historic Notes | 109 |

Concluding Remarks | 111 |

Acknowledgements | 113 |

References | 115 |

### Common terms and phrases

analysis B-bucket histogram bucket cash register model CM Sketch coeﬃcients communication complexity Computational Geometry compute Consider count data quality data stream algorithms data stream applications data stream models data structure database deﬁne deltoids deterministic deterministic algorithm dictionary diﬀerent diﬃcult distribution edit distance eﬃcient estimate example ﬁnd ﬁnding ﬁrst ﬁsh types ﬁt ﬁxed Gigascope given graph GSAP Haar wavelet hash function heavy hitters Hence histogram integer interesting inverse signal IP addresses IP network IP traﬃc logN lower bound Lp-sum methods monitoring multiset needs nodes norms NP-hard number of bits number of distinct O(logN optimal parsing Paul and Carole permutation piecewise-constant point queries polylog probability at least Proc processing puzzle quantiles random rangesum representation routers sampling Section 1.2 set cover signiﬁcant small space solution solve sparse approximation speciﬁc stream processing string subset subset-sum Theorem theory tion tree Turnstile model updates vectors Zipf distribution