The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Google eBook)

During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry.

The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees, and boosting, the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for "wide" data (p bigger than n), including multiple testing and false discovery rates.
What people are saying

Review: The Elements of Statistical Learning: Data Mining, Inference, and Prediction
User Review - Danial - Goodreads
Unnecessarily dry and difficult to read through, but as a reference book with a solid index it hits its mark.
Review: The Elements of Statistical Learning: Data Mining, Inference, and Prediction
User Review - Tianlin - Goodreads
This book lays out a nice statistical foundation for modern machine learning. Because the field has developed so rapidly, the book seems a bit outdated, and the notation is sometimes inconsistent. Still, it offers a unique statistical perspective on learning and is quite interesting in its own right.
Contents
Linear Methods for Regression
Linear Methods for Classification
Basis Expansions and Regularization
Kernel Smoothing Methods
Model Assessment and Selection
Model Inference and Averaging
Additive Models, Trees, and Related Methods
Boosting and Additive Trees
Neural Networks
Support Vector Machines and Flexible Discriminants
Prototype Methods and Nearest-Neighbors
Unsupervised Learning
Random Forests
Ensemble Learning
Undirected Graphical Models
High-Dimensional Problems: p ≫ N
Common terms and phrases
β̂, AdaBoost, additive model, algorithm, approximation, average, B-spline, basis functions, Bayes, Bayesian, bias, bootstrap, centroids, Chapter, classifier, coefficients, compute, correlation, covariance, criterion, cross-validation, curve, data points, decision boundary, defined, density, dimension, distribution, error rate, estimate, example, Gaussian, genes, gradient boosting, hence, input, iteration, kernel, lasso, least squares, left panel, linear discriminant analysis, linear model, linear regression, log-likelihood, logistic regression, loss function, matrix, methods, minimize, mixture, nearest-neighbor, neural networks, node, nonlinear, observations, optimal, overfit, panel of Figure, panel shows, parameters, penalty, plot, posterior, prediction error, predictors, principal components, problem, procedure, quadratic, random forests, regression model, ridge regression, right panel, sample, Section, shrinkage, simulated, smoothing spline, solution, spam, split, statistical, subset, supervised learning, support vector, support vector machine, test error, Tibshirani, training data, training set, tree, values, variables, variance, wavelet, weights, zero