## Support Vector Machines

*Every mathematical discipline goes through three periods of development: the naive, the formal, and the critical.* (David Hilbert)

The goal of this book is to explain the principles that made support vector machines (SVMs) a successful modeling and prediction tool for a variety of applications. We try to achieve this by presenting the basic ideas of SVMs together with the latest developments and current research questions in a unified style. In a nutshell, we identify at least three reasons for the success of SVMs: their ability to learn well with only a very small number of free parameters, their robustness against several types of model violations and outliers, and, last but not least, their computational efficiency compared with several other methods.

Although there are several roots and precursors of SVMs, these methods gained particular momentum during the last 15 years, since Vapnik (1995, 1998) published his well-known textbooks on statistical learning theory with a special emphasis on support vector machines. Since then, the field of machine learning has witnessed intense activity in the study of SVMs, which has spread more and more to other disciplines such as statistics and mathematics. Thus it seems fair to say that several communities are currently working on support vector machines and on related kernel-based methods. Although there are many interactions between these communities, we think that there is still room for additional fruitful interaction and would be glad if this textbook were found helpful in stimulating further research.

Many of the results presented in this book have previously been scattered in the journal literature or are still under review. As a consequence, these results have been accessible only to a relatively small number of specialists, sometimes probably only to people from one community but not the others.
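Two objects that recur throughout the book's chapters are the hinge loss (used for classification) and the Gaussian RBF kernel (whose RKHSs are studied in Chapter 4). As a minimal illustration, and not code from the book itself, both can be written down in a few lines of plain Python (the function names here are our own):

```python
import math

def gaussian_rbf_kernel(x, y, gamma=1.0):
    """Gaussian RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)

def hinge_loss(y, f_x):
    """Hinge loss L(y, t) = max(0, 1 - y * t) for labels y in {-1, +1}."""
    return max(0.0, 1.0 - y * f_x)

# A point compared with itself has kernel value 1; the value decays
# toward 0 as the squared distance grows.
print(gaussian_rbf_kernel([0.0, 0.0], [0.0, 0.0]))  # 1.0
# A correctly classified point with margin >= 1 incurs zero loss ...
print(hinge_loss(+1, 2.0))  # 0.0
# ... while a misclassified point is penalized linearly.
print(hinge_loss(+1, -0.5))  # 1.5
```

The hinge loss is convex but not differentiable at the margin, which is one reason the book devotes a chapter to surrogate loss functions and their calibration properties.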


### Contents

Loss Functions and Their Risks | 21 |

2.2 Basic Properties of Loss Functions and Their Risks | 28 |

2.3 Margin-Based Losses for Classification Problems | 34 |

2.4 Distance-Based Losses for Regression Problems | 38 |

2.5 Further Reading and Advanced Topics | 45 |

2.6 Summary | 46 |

Surrogate Loss Functions | 49 |

3.1 Inner Risks and the Calibration Function | 51 |

3.2 Asymptotic Theory of Surrogate Losses | 60 |

3.3 Inequalities between Excess Risks | 63 |

3.5 Surrogates for Weighted Binary Classification | 76 |

3.6 Template Loss Functions | 80 |

3.7 Surrogate Losses for Regression Problems | 81 |

3.8 Surrogate Losses for the Density Level Problem | 93 |

3.9 Self-Calibrated Loss Functions | 97 |

3.10 Further Reading and Advanced Topics | 105 |

3.11 Summary | 106 |

3.12 Exercises | 107 |

Kernels and Reproducing Kernel Hilbert Spaces | 111 |

4.1 Basic Properties and Examples of Kernels | 112 |

4.2 The Reproducing Kernel Hilbert Space of a Kernel | 119 |

4.3 Properties of RKHSs | 124 |

4.4 Gaussian Kernels and Their RKHSs | 132 |

4.5 Mercer's Theorem | 149 |

4.6 Large Reproducing Kernel Hilbert Spaces | 151 |

4.7 Further Reading and Advanced Topics | 159 |

4.8 Summary | 161 |

4.9 Exercises | 162 |

Infinite-Sample Versions of Support Vector Machines | 165 |

5.1 Existence and Uniqueness of SVM Solutions | 166 |

5.2 A General Representer Theorem | 169 |

5.3 Stability of Infinite-Sample SVMs | 173 |

5.4 Behavior for Small Regularization Parameters | 178 |

5.5 Approximation Error of RKHSs | 187 |

5.6 Further Reading and Advanced Topics | 197 |

5.7 Summary | 200 |

Basic Statistical Analysis of SVMs | 203 |

6.1 Notions of Statistical Learning | 204 |

6.2 Basic Concentration Inequalities | 210 |

6.3 Statistical Analysis of Empirical Risk Minimization | 218 |

6.4 Basic Oracle Inequalities for SVMs | 223 |

6.5 Data-Dependent Parameter Selection for SVMs | 229 |

6.6 Further Reading and Advanced Topics | 234 |

6.7 Summary | 235 |

6.8 Exercises | 236 |

Advanced Statistical Analysis of SVMs | 239 |

7.1 Why Do We Need a Refined Analysis? | 240 |

7.2 A Refined Oracle Inequality for ERM | 242 |

7.3 Some Advanced Machinery | 246 |

7.4 Refined Oracle Inequalities for SVMs | 258 |

7.5 Some Bounds on Average Entropy Numbers | 270 |

7.6 Further Reading and Advanced Topics | 279 |

7.7 Summary | 282 |

7.8 Exercises | 283 |

Support Vector Machines for Classification | 286 |

8.1 Basic Oracle Inequalities for Classifying with SVMs | 288 |

8.2 Classifying with SVMs Using Gaussian Kernels | 290 |

8.3 Advanced Concentration Results for SVMs | 307 |

8.4 Sparseness of SVMs Using the Hinge Loss | 310 |

8.5 Classifying with other Margin-Based Losses | 314 |

8.6 Further Reading and Advanced Topics | 326 |

8.7 Summary | 329 |

8.8 Exercises | 330 |

Support Vector Machines for Regression | 333 |

9.2 Consistency | 335 |

9.3 SVMs for Quantile Regression | 340 |

9.4 Numerical Results for Quantile Regression | 344 |

9.5 Median Regression with the eps-Insensitive Loss | 348 |

9.6 Further Reading and Advanced Topics | 352 |

9.7 Summary | 353 |

Robustness | 355 |

10.1 Motivation | 356 |

10.2 Approaches to Robust Statistics | 362 |

10.3 Robustness of SVMs for Classification | 368 |

10.4 Robustness of SVMs for Regression | 379 |

10.5 Robust Learning from Bites | 391 |

10.6 Further Reading and Advanced Topics | 403 |

10.7 Summary | 408 |

10.8 Exercises | 409 |

Computational Aspects | 411 |

11.1 SVMs, Convex Programs, and Duality | 412 |

11.2 Implementation Techniques | 420 |

11.3 Determination of Hyperparameters | 443 |

11.4 Software Packages | 448 |

11.6 Summary | 452 |

11.7 Exercises | 453 |

Data Mining | 455 |

12.1 Introduction | 456 |

12.2 CRISP-DM Strategy | 457 |

12.3 Role of SVMs in Data Mining | 467 |

12.5 Further Reading and Advanced Topics | 468 |

12.6 Summary | 469 |

Appendix | 470 |

A.2 Topology | 475 |

A.3 Measure and Integration Theory | 479 |

A.3.1 Some Basic Facts | 480 |

A.3.2 Measures on Topological Spaces | 486 |

A.3.3 Aumann's Measurable Selection Principle | 487 |

A.4 Probability Theory and Statistics | 489 |

A.4.2 Some Limit Theorems | 492 |

A.4.3 The Weak Topology and Its Metrization | 494 |

A.5 Functional Analysis | 497 |

A.5.2 Hilbert Spaces | 501 |

A.5.3 The Calculus in Normed Spaces | 507 |

A.5.4 Banach Space Valued Integration | 508 |

A.5.5 Some Important Banach Spaces | 511 |

A.5.6 Entropy Numbers | 516 |

A.6 Convex Analysis | 519 |

A.6.1 Basic Properties of Convex Functions | 520 |

A.6.2 Subdifferential Calculus for Convex Functions | 523 |

A.6.3 Some Further Notions of Convexity | 526 |

A.6.4 The Fenchel-Legendre Biconjugate | 529 |

A.6.5 Convex Programs and Lagrange Multipliers | 530 |

A.7 Complex Analysis | 534 |

A.9 Talagrand's Inequality | 538 |

Notation and Symbols | 579 |

Abbreviations | 583 |
