Programming Collective Intelligence: Building Smart Web 2.0 Applications
Want to tap the power behind search rankings, product recommendations, social bookmarking, and online matchmaking? This fascinating book demonstrates how you can build Web 2.0 applications to mine the enormous amount of data created by people on the Internet. With the sophisticated algorithms in this book, you can write smart programs to access interesting datasets from other web sites, collect data from users of your own applications, and analyze and understand the data once you've found it.
"Bravo! I cannot think of a better way for a developer to first learn these algorithms and methods, nor can I think of a better way for me (an old AI dog) to reinvigorate my knowledge of the details." - Dan Russell, Google
"Toby's book does a great job of breaking down the complex subject matter of machine-learning algorithms into practical, easy-to-understand examples that can be directly applied to analysis of social interaction across the Web today. If I had this book two years ago, it would have saved precious time going down some fruitless paths." - Tim Wolters, CTO, Collective Intellect
What people are saying

I was drawn to this book because I want to understand how to model the data gushing from social networks and large data sets. Although I wasn't sure how the book would approach data mining, I expected it to give me some basic skills for working with large data sets, or at least some examples of how data can be analyzed algorithmically.
After reading the book, I found that it teaches material across several skill sets:
1. Mining data from the internet and filtering that data. Since beginning to read the book, I have borrowed several techniques for gathering and saving data.
2. An introduction to about a dozen mathematical concepts and algorithms that filter and categorize that data, predict its properties, and determine relationships within it. I have realized that much of the entrepreneurial activity aimed at enhancing human interaction comes down to manipulating data with these algorithms.
As for the mathematics and algorithms themselves, much of the material was more advanced than my current understanding, and it will serve in the long run as a springboard from which to pursue these concepts. The book has given me a lot of starting points for further exploration.
3. An introduction to the Python programming language and some useful tools and packages written in it. Most of the Python you will glean from this book is not laid out explicitly but gathered from reading examples of data mining and analysis in Python. The examples use basic syntax and script structure, data types such as lists and dictionaries, and functions for operating on data structures and performing mathematics.
Python tools introduced in the book include FeedParser, the Python Imaging Library, Beautiful Soup for HTML/XML parsing, pysqlite for database creation, NumPy for linear algebra and matrix mathematics, and matplotlib for 2D graphics.
Reading this book has piqued my interest in algorithms and the mathematical analysis of datasets. From here I will pursue the study of these fields and a more in-depth understanding of some of the methods and algorithms presented. It was also a nice mental exercise to learn to read Python code from real-world examples.
This looks like the 21st century successor to the AI programming books of the previous century (mine, and Charniak, Riesbeck, & McDermott). Lots of interesting applications; you can learn from them without much background required.
Contents
Discovering Groups
Searching and Ranking
Optimization
Document Filtering
Modeling with Decision Trees
Building Price Models
Other editions
Programming Collective Intelligence: Building Smart Web 2.0 Applications, Toby Segaran, 2007