User reviews

User Review

I was drawn to this book because I want to understand how to model data gushing from social networks and large data sets. Although I wasn't sure how the book would approach the topics of data mining, I hypothesized that it would give me some basic skills for leveraging large data sets, or at least some examples of how data can be analyzed algorithmically.
After reading the book, I have found that one can learn several skill sets from it:
1. Mining data from the internet and filtering that data. Since beginning to read the book, I have borrowed several techniques for gathering and saving data.
2. Introduction to about a dozen mathematical concepts and algorithms that use said data to filter, categorize, predict properties of, and determine relationships within the data. I have realized that much of the entrepreneurial activity aimed at enhancing human interaction is achieved through manipulation of data with these algorithms.
As for the mathematics and algorithms themselves, much of the material was more advanced than my current understanding, and it will serve in the long run as a springboard from which I must pursue these concepts. The book has given me a lot of starting points for further exploration.
3. An introduction to the Python programming language and some useful tools and packages written in it. Most of the Python you will glean from this book is not laid out explicitly, but gathered from reading examples of data mining and analysis in Python. The examples use basic syntax and script structure, data types such as lists and dictionaries, and functions for operating on data structures and performing mathematics.
Python tools introduced in the book include FeedParser, Python Imaging Library, Beautiful Soup for HTML/XML parsing, pysqlite for database creation, NumPy for linear algebra and matrix mathematics, and matplotlib for 2D graphics.
Reading this book has piqued my interest in algorithms and mathematical analysis of datasets. From here I will pursue the study of these fields and a more in-depth understanding of some of the methods and algorithms presented. It was also a nice mental exercise to learn to read Python code from real-world examples.

User Review

This looks like the 21st century successor to the AI programming books of the previous century (mine, and Charniak, Riesbeck, & McDermott). Lots of interesting applications; you can learn from them without much background required.

User Review

One of my favorite AI books ever. This book is simple to understand and provides easy examples in Python.

User Review

The emphasis of the book is on applications rather than theory, which is what you would expect from a machine learning book published by O'Reilly. The applications are interesting and implemented in Python.

User Review

Programming Collective Intelligence is a new book from O'Reilly, written by Toby Segaran. The author graduated from MIT and currently works at Metaweb Technologies, where he develops ways to put large public datasets into Freebase, a free online semantic database. You can find more information about him on his blog:
Web 2.0 cannot exist without Collective Intelligence. The "giants" use it everywhere: YouTube recommends similar movies, music services know what you would like to listen to, Flickr knows which photos are your favorites, and so on. This technology powers intelligent search, clustering, price modeling and ranking on the web. I cannot imagine a modern service without data analysis. That is why it is worth starting to read about it.
There are many titles about collective intelligence, but recently I have read two: this one and "Collective Intelligence in Action". Both are very pragmatic, but the O'Reilly one is more focused on the substance of CI. Its code listings are much shorter (the examples are written in Python, so that was easy). In general, comparing these books is like comparing Java and Python. If you would like to build a recommendation engine the "in Action"/Java way, you would have to read the whole book, attach extra JARs and design dozens of classes. The rapid Python way requires reading only 15 pages and voila, you have got your first recommendations. It is awesome!
So how about the rest of the book? There are still 319 pages! Further chapters cover discovering groups, searching, ranking, optimization, document filtering, decision trees, price models and genetic algorithms. The book explains how to implement Simulated Annealing, k-Nearest Neighbors, a Bayesian Classifier and many more. Take a look at the table of contents (here:); it does not list all the algorithms, but you can find more information there.
Each chapter has about 20-30 pages. You do not have to read them all; you can choose the most important ones and still know what is going on. Every chapter contains a minimal theoretical introduction, which might not be enough for total beginners. I recommend this book for students who have taken a statistics course (not only in IT or computer science): it will show you how to use your knowledge in practice, and there are many inspiring examples.
If you do not know Python, do not be afraid: at the beginning you will find an introduction to the language syntax. All listings are very short and well described by the author, sometimes line by line. The book also contains the necessary information about the basic standard libraries for XML processing and downloading web pages.
If you would like to start learning about collective intelligence, I would strongly recommend reading “Programming Collective Intelligence” first, then “Collective Intelligence in Action”. The first one shows how easy it is to implement basic algorithms; the second one shows you how to use existing open source projects related to machine learning.
You can find more about this book on its catalogue page:
Best regards,
Dariusz Walczak

User ratings

All reviews - 8
Editorial reviews - 0
2 stars - 0
1 star - 0