Feature Engineering and Selection: A Practical Approach for Predictive Models
The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for finding the best subset of predictors for improving model performance. A variety of example data sets are used to illustrate the techniques, along with R programs for reproducing the results.
Contents
A Review of the Predictive Modeling Process
Exploratory Visualizations
Engineering Numeric Predictors
Detecting Interaction Effects
Handling Missing Data
Working with Profile Data
Feature Selection Overview
Other editions
Feature Engineering and Selection: A Practical Approach for Predictive Models, Max Kuhn, Kjell Johnson, 2021
Feature Engineering and Selection: A Practical Approach for Predictive Models, Max Kuhn, Kjell Johnson, 2019
Common terms and phrases
10-fold cross-validation analysis set approach assessment set autoencoder average bioreactor Blue Line categorical predictors chapter Clark/Lake coefficients computed contain correlation created cross-validation data points data set dimension reduction distribution dummy variables encoding estimate evaluated example external resampling feature engineering feature selection feature subset filter function hash identify illustrate imaging predictors important improve imputation interaction terms iterations kernel PCA linear model linear regression logistic regression matrix measure metric missing data missing values missingness model performance neural network nonlinear number of predictors OkCupid data optimal original predictors overall overfitting p-value partial least squares polynomial potential predicted values predictive models predictive performance predictor set preprocessing principal component principal component analysis procedure random forest relationship response RMSE ROC curve samples scores Section shows simple simulated annealing specific split statistic STEM profiles support vector machines techniques test set training set transformation trees trend tuning parameters variation visualization zero


