Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning

Springer Science & Business Media, Mar 2, 2009 - Mathematics - 758 pages
Remarkable advances in computation and data storage and the ready availability of huge data sets have been the keys to the growth of the new disciplines of data mining and machine learning, while the enormous success of the Human Genome Project has opened up the field of bioinformatics. These exciting developments, which led to the introduction of many innovative statistical tools for high-dimensional data analysis, are described here in detail. The author takes a broad perspective; for the first time in a book on multivariate analysis, nonlinear methods are discussed in detail as well as linear methods. Techniques covered range from traditional multivariate methods, such as multiple regression, principal components, canonical variates, linear discriminant analysis, factor analysis, clustering, multidimensional scaling, and correspondence analysis, to the newer methods of density estimation, projection pursuit, neural networks, multivariate reduced-rank regression, nonlinear manifold learning, bagging, boosting, random forests, independent component analysis, support vector machines, and classification and regression trees. Another unique feature of this book is the discussion of database management systems.

This book is appropriate for advanced undergraduate students, graduate students, and researchers in statistics, computer science, artificial intelligence, psychology, cognitive sciences, business, medicine, bioinformatics, and engineering. Familiarity with multivariable calculus, linear algebra, and probability and statistics is required. The book presents a carefully integrated mixture of theory and applications, and of classical and modern multivariate statistical techniques, including Bayesian methods. The book uses over 60 data sets as examples and includes over 200 exercises and many color illustrations and photographs.
  


Contents

Bibliographical Notes
Data and Databases
Mixtures of Polyaromatic Hydrocarbons
Face Recognition
2.3.1 Data Types
2.3.3 Databases on the Internet
2.4 Database Management
2.4.2 Structured Query Language (SQL)
2.4.3 OLTP Databases
2.4.5 Data Warehousing
2.4.6 Decision Support Systems and OLAP
2.4.7 Statistical Packages and DBMSs
2.5.2 Outliers
2.5.3 Missing Data
2.5.4 More Variables than Observations
2.6 The Curse of Dimensionality
Bibliographical Notes
Exercises
Random Vectors and Matrices
3.2.2 Basic Matrix Operations
3.2.4 Eigenanalysis for Square Matrices
3.2.5 Functions of Matrices
3.2.6 Singular-Value Decomposition
3.2.8 Matrix Norms
3.2.9 Condition Numbers for Matrices
3.2.11 Matrix Calculus
3.3 Random Vectors
3.3.2 Multivariate Gaussian Distribution
3.4 Random Matrices
3.5 Maximum Likelihood Estimation for the Gaussian
3.5.2 Admissibility
Bibliographical Notes
Nonparametric Density Estimation
4.2.2 Consistency
4.2.3 Bona Fide Density Estimators
4.3 The Histogram
4.3.1 The Histogram as an ML Estimator
4.3.2 Asymptotics
4.3.3 Estimating Bin Width
4.3.4 Multivariate Histograms
4.4 Maximum Penalized Likelihood
4.5 Kernel Density Estimation
4.5.2 Asymptotics
4.7 Assessing Multimodality
Model Assessment and Selection in Multiple Regression
5.2.2 Fixed-X Case
5.3.1 Random-X Case
5.4 Estimating Prediction Error
5.4.3 Bootstrap
5.6.4 Ridge Regression
5.7.3 Criticisms of Variable Selection Methods
Exercises
Mixtures of Polyaromatic Hydrocarbons
Exercises
7.2.2 Population Principal Components
7.2.4 PCA as a Variance-Maximization Technique
7.2.6 How Many Principal Components to Retain?
7.2.11 Functional PCA
7.2.12 What Can Be Gained from Using PCA?
7.3.3 Least-Squares Optimality of CVA
7.3.4 Relationship of CVA to RRR
7.3.6 Sample Estimates
7.4.1 Projection Indexes
7.4.2 Optimizing the Projection Index
Linear Discriminant Analysis
8.2 Classes and Features
8.3 Binary Classification
8.3.5 Logistic Discrimination
8.5 Multiclass LDA
8.5.1 Bayes's Rule Classifier
Recursive and Tree-Based Partitioning Methods
Cleveland Heart-Disease Data
9.2.2 Tree-Growing Procedure
9.2.6 Pruning the Tree
9.2.7 Choosing the Best Pruned Subtree
9.3.1 The Terminal-Node Value
9.3.3 Pruning the Tree
9.4 Extensions and Adjustments
9.4.2 Survival Trees
9.4.4 Missing Data
9.5 Software Packages
Artificial Neural Networks
10.2 The Brain as a Neural Network
10.3 The McCulloch-Pitts Neuron
10.5 Single-Layer Perceptrons
10.5.3 Rosenblatt's Single-Unit Perceptron
10.5.4 The Perceptron Learning Rule
10.5.6 Limitations of the Perceptron
10.6 Artificial Intelligence and Expert Systems
10.7.1 Network Architecture
10.7.2 A Single Hidden Layer
10.7.3 ANNs Can Approximate Continuous Functions
10.7.4 More than One Hidden Layer
10.7.5 Optimality Criteria
10.7.6 The Backpropagation of Errors Algorithm
10.8.3 How Many Hidden Nodes and Layers?
Detecting Hidden Messages in Digital Images
10.10 Examples of Fitting Neural Networks
10.11.2 Generalized Additive Models
10.12 Bayesian Learning for ANN Models
10.12.1 Laplace's Method
10.13 Software Packages
Support Vector Machines
11.3.1 Nonlinear Transformations
11.3.8 Binary Classification Examples
11.4 Multiclass Support Vector Machines
11.5.3 Extensions
Bibliographical Notes
Cluster Analysis
12.3 Hierarchical Clustering
12.3.1 Dendrogram
12.3.3 Agglomerative Nesting (agnes)
12.4.1 K-Means Clustering (kmeans)
12.4.3 Fuzzy Analysis (fanny)
12.4.4 Silhouette Plot
Landsat Satellite Image Data
12.5 Self-Organizing Maps (SOMs)
12.5.3 Batch Version
12.5.4 Unified-Distance Matrix
12.5.5 Component Planes
12.6.2 Principal-Component Gene Shaving
Colon Cancer Data
12.8 Two-Way Clustering of Microarray Data
12.8.2 Plaid Models
12.9.1 The EM Algorithm for Finite Mixtures
12.9.2 How Many Components?
Multidimensional Scaling and Distance Geometry
Airline Distances
13.3 Proximity Matrices
13.4 Comparing Protein Sequences
Two Hemoglobin Chains
13.6 Classical Scaling and Distance Geometry
13.6.1 From Dissimilarities to Principal Coordinates
13.6.2 Assessing Dimensionality
Airline Distances (Continued)
Mapping the Protein Universe
13.8 Metric Distance Scaling
Lloyds Bank Employees
13.9 Nonmetric Distance Scaling
13.9.4 How Good Is an MDS Solution?
Bibliographical Notes
Exercises
Committee Machines
14.2 Bagging
14.2.2 Bagging Regression-Tree Predictors
Boosting by Reweighting
Aqueous Solubility in Drug Discovery
14.3.3 Convergence Issues and Overfitting
14.3.4 Classification Margins
14.3.8 Gradient Boosting for Regression
14.3.10 Regularization
14.3.11 Noisy Class Labels
14.4 Random Forests
14.4.3 An Upper Bound on Generalization Error
Diagnostic Classification of Four Childhood Tumors
14.4.6 Proximities for Classical Scaling
14.4.7 Identifying Multivariate Outliers
14.4.8 Treating Unbalanced Classes
14.5 Software Packages
Latent Variable Models for Blind Source Separation
15.2 Blind Source Separation and the Cocktail-Party Problem
Cutaneous Potential Recordings of a Pregnant Woman
15.3.3 Connection to Projection Pursuit
15.3.4 Centering and Sphering
Noiseless ICA
15.3.8 Objective Functions
15.3.10 Mutual Information
Identifying Artifacts in MEG Recordings
15.3.13 Maximum-Likelihood ICA
15.3.14 Kernel ICA
15.4.3 Maximum-Likelihood FA
Twenty-four Psychological Tests
15.4.6 Confirmatory Factor Analysis
Nonlinear Dimensionality Reduction and Manifold Learning
16.2 Polynomial PCA
16.3 Principal Curves and Surfaces
16.3.1 Curves and Curvature
16.3.2 Principal Curves
16.3.3 Projection-Expectation Algorithm
16.3.5 Principal Surfaces
16.4 Multilayer Autoassociative Neural Networks
16.4.2 Relationship to Principal Curves
16.5 Kernel PCA
16.5.1 PCA in Feature Space
16.5.2 Centering in Feature Space
16.5.4 Kernel PCA and Metric MDS
16.6.1 Manifolds
16.6.2 Data on Manifolds
16.6.7 Other Methods
Exercises
Correspondence Analysis
Shoplifting in The Netherlands
17.2.2 Row and Column Dummy Variables
Hair Color and Eye Color
17.2.4 Profiles, Masses, and Centroids
17.2.5 Chi-squared Distances
17.2.8 Graphical Displays
17.4.5 A Weighted Least-Squares Approach
17.5 Software Packages