Programming Collective Intelligence is a new book from O'Reilly and Associates on the concept of collecting data from disparate sources (like, say, users) and integrating that data into your programs. It's an excellent book for anyone who wonders how to use data from other websites, or how to use user behavior to learn how to serve those users better.
Toby Segaran, the author, writes in a clear, expository style that lets you learn how collective intelligence techniques work as well as see them implemented. The implementations are in Python, but porting them to another language isn't difficult, especially given how clear his explanations are.
It's not a book for learning to program, by any means; it's about the process of using data to create "smarter" applications. It also covers some techniques used in mashups, working with web sites like del.icio.us, Kayak, eBay, Hot or Not, and Akismet.
The first chapter is an introduction to collective intelligence, explaining the concepts behind machine learning. Machine learning in this context isn't the same as artificial intelligence; it's more like data analysis applied so well that users might think it's artificial intelligence.
The second chapter, "Making Recommendations," covers the kind of technique Amazon.com uses to suggest similar titles to readers. The following chapters cover clustering (discovering groups of similar items in large datasets), searching and ranking (think "PageRank"), optimization, document filtering (Bayesian classifiers), decision trees, price modeling, support-vector machines, genetic programming, and more.
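To give a flavor of what "making recommendations" can mean in practice, here is a minimal sketch of user-based collaborative filtering. It is not code from the book: the ratings data, the function names, and the choice of a Euclidean-distance similarity score are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical ratings (user -> {title: score}), invented for this example.
ratings = {
    "ann": {"Book A": 5.0, "Book B": 3.0, "Book C": 4.0},
    "bob": {"Book A": 5.0, "Book B": 3.0, "Book D": 5.0},
    "cat": {"Book A": 1.0, "Book B": 5.0, "Book C": 2.0, "Book D": 1.0},
}


def similarity(prefs, a, b):
    """Score in (0, 1]: 1 means identical ratings on the shared titles."""
    shared = [title for title in prefs[a] if title in prefs[b]]
    if not shared:
        return 0.0  # nothing in common, so no evidence of similarity
    distance = sqrt(sum((prefs[a][t] - prefs[b][t]) ** 2 for t in shared))
    return 1.0 / (1.0 + distance)


def recommend(prefs, user):
    """Rank titles the user has not rated by a similarity-weighted average."""
    totals, weights = {}, {}
    for other in prefs:
        if other == user:
            continue
        sim = similarity(prefs, user, other)
        if sim <= 0:
            continue
        for title, score in prefs[other].items():
            if title in prefs[user]:
                continue  # only suggest titles the user has not rated
            totals[title] = totals.get(title, 0.0) + sim * score
            weights[title] = weights.get(title, 0.0) + sim
    ranked = [(totals[t] / weights[t], t) for t in totals]
    return sorted(ranked, reverse=True)


if __name__ == "__main__":
    print(recommend(ratings, "ann"))
```

The idea is that people who rated the same titles similarly get more say in what is suggested: "bob" agrees with "ann" on every shared title, so his opinion of "Book D" dominates her predicted score for it. Real systems typically use more robust similarity measures and much larger datasets.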
All told, it's a fascinating book. Web 2.0 isn't just about interactivity; it's about intelligence, too. Interactivity is easy to achieve, since so many web frameworks focus on interaction. Intelligence is a little harder, and this book goes a long way toward making it easier.