Data Analysis and Data Mining: An Introduction
Oxford University Press, USA, Apr 23, 2012 - Business & Economics - 278 pages
An introduction to statistical data mining, Data Analysis and Data Mining is both textbook and professional resource. Assuming only a basic knowledge of statistical reasoning, it presents core concepts in data mining and exploratory statistical models to students and professional statisticians, both those working in communications and those working in a technological or scientific capacity, who have a limited knowledge of data mining. This book presents key statistical concepts by way of case studies, giving readers the benefit of learning from real problems and real data. Aided by a diverse range of statistical methods and techniques, readers will move from simple problems to complex problems. Through these case studies, authors Adelchi Azzalini and Bruno Scarpa explain exactly how statistical methods work; rather than relying on the "push the button" philosophy, they demonstrate how to use statistical tools to find the best solution to any given problem. Case studies feature current topics highly relevant to data mining, such as web page traffic; the segmentation of customers; selection of customers for direct mail commercial campaigns; fraud detection; and measurements of customer satisfaction. Appropriate for both advanced undergraduate and graduate students, this much-needed book will fill a gap between higher-level books, which emphasize technical explanations, and lower-level books, which assume no prior knowledge and do not explain the methodology behind the statistical operations.
3 Optimism, Conflicts, and Tradeoffs
4 Prediction of Quantitative Variables
5 Methods of Classification
6 Methods of Internal Analysis
Complements of Mathematics and Statistics