Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification

No Starch Press, 2005 - Computers - 287 pages

Join author Jonathan Zdziarski for a look inside the brilliant minds that have conceived clever new ways to fight spam in all its nefarious forms. This landmark title describes in depth how next-generation spam filters use statistical filtering to identify and filter unwanted messages, how spam filtering works, and how language classification and machine learning combine to produce remarkably accurate spam filters.

After reading Ending Spam, you'll have a complete understanding of the mathematical approaches used by today's spam filters as well as decoding, tokenization, various algorithms (including Bayesian analysis and Markovian discrimination) and the benefits of using open-source solutions to end spam. Zdziarski interviewed creators of many of the best spam filters and has included their insights in this revealing examination of the anti-spam crusade.

If you're a programmer designing a new spam filter, a network admin implementing a spam-filtering solution, or just someone who's curious about how spam filters work and the tactics spammers use to evade them, Ending Spam will serve as an informative analysis of the war against spammers.

Table of Contents

Introduction

PART I: An Introduction to Spam Filtering
Chapter 1: The History of Spam
Chapter 2: Historical Approaches to Fighting Spam
Chapter 3: Language Classification Concepts
Chapter 4: Statistical Filtering Fundamentals

PART II: Fundamentals of Statistical Filtering
Chapter 5: Decoding: Uncombobulating Messages
Chapter 6: Tokenization: The Building Blocks of Spam
Chapter 7: The Low-Down Dirty Tricks of Spammers
Chapter 8: Data Storage for a Zillion Records
Chapter 9: Scaling in Large Environments

PART III: Advanced Concepts of Statistical Filtering
Chapter 10: Testing Theory
Chapter 11: Concept Identification: Advanced Tokenization
Chapter 12: Fifth-Order Markovian Discrimination
Chapter 13: Intelligent Feature Set Reduction
Chapter 14: Collaborative Algorithms

Appendix: Shining Examples of Filtering

Index


What people are saying

Review: Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification

User Review  - James Luedke - Goodreads

Well written; covers advanced concepts of statistical filtering in depth. A good read for software developers or sys admins who deal with spam.

Contents

War Waged on Spam
Final Thoughts
Historical Approaches to Fighting Spam
Copyright


About the author (2005)

Jonathan A. Zdziarski has been fighting spam for eight years and has spent a significant portion of the past two years working on DSPAM, a next-generation spam filter with up to 99.985% accuracy. Zdziarski lectures widely on the topic of spam.
