Boosting is a method for combining a series of simple individual models to create a more powerful model. The name gradient boosting arises because the target outcomes for each new model are set based on the gradient of the error with respect to the prediction for each case. Each new model therefore takes a step in the direction that reduces prediction error, in the space of predictions for each training case.
This article describes how to create a Gradient Boosting Importance output.
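The idea can be sketched in a few lines of Python. This is a minimal illustration, not the xgboost implementation: it assumes a single numeric predictor, squared-error loss (for which the negative gradient is simply the residual), and one-split "stump" models. The function names, the learning rate, the number of rounds, and the small data set are all made up for the example.

```python
def fit_stump(x, targets):
    """Fit a one-split model to the targets by least squares (assumes >= 2 distinct x values)."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for xi, t in zip(x, targets) if xi <= threshold]
        right = [t for xi, t in zip(x, targets) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def boost(x, y, n_rounds=50, learning_rate=0.3):
    """Build an additive model: a constant plus a series of scaled stumps."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient with respect to each
        # prediction is the residual, so residuals are the new targets.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        # Take a small step in the direction suggested by the new model.
        preds = [pi + learning_rate * predict_stump(stump, xi)
                 for xi, pi in zip(x, preds)]
    return {"base": base, "stumps": stumps, "learning_rate": learning_rate}


def predict(model, xi):
    total = model["base"]
    for stump in model["stumps"]:
        total += model["learning_rate"] * predict_stump(stump, xi)
    return total


def mean_squared_error(y, preds):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)


# Illustrative data only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.0, 1.2, 0.9, 3.1, 3.0, 2.8, 5.2, 5.0]

baseline_mse = mean_squared_error(y, [sum(y) / len(y)] * len(y))
short_model = boost(x, y, n_rounds=5)
long_model = boost(x, y, n_rounds=50)
short_mse = mean_squared_error(y, [predict(short_model, xi) for xi in x])
long_mse = mean_squared_error(y, [predict(long_model, xi) for xi in x])
print(baseline_mse, short_mse, long_mse)
```

Each round fits a simple model to the current residuals and adds a scaled copy of it to the running prediction, so the training error shrinks as rounds are added. Real implementations such as xgboost add regularization, support other loss functions, and use deeper trees.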
Requirements
- A numeric or categorical variable to be used as an Outcome variable to be predicted.
- Predictor variables to be used as predictors of the outcome variable.
Method
- In the Anything menu select Advanced Analysis > Machine Learning > Gradient Boosting.
- In the object inspector go to the Inputs tab.
- In the Outcome menu select the variable to be predicted by the predictor variables.
- Select the predictor variable(s) from the Predictor(s) list.
- OPTIONAL: Select the desired Output type:
  - Accuracy: Produces measures of the goodness of model fit. For categorical outcomes the breakdown by category is shown.
  - Importance: Produces a chart showing the importance of the predictors in determining the outcome. Only available for the gbtree booster.
  - Prediction-Accuracy Table: Produces a table relating the observed and predicted outcome. Also known as a confusion matrix.
  - Detail: Text output from the underlying xgboost package.
- OPTIONAL: Select the desired Missing Data treatment. (See Missing Data Options).
- OPTIONAL: Select Variable names to display variable names in the output instead of labels.
- OPTIONAL: Select the underlying Booster model. Choose between gbtree and gblinear.
- OPTIONAL: Enable Grid search to search the parameter space in order to tune the model. If not checked, the default parameters of xgboost are used. Enabling this will usually create a more accurate predictor, at the cost of taking longer to run.
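To make the Accuracy and Prediction-Accuracy Table output types listed above more concrete, the following sketch builds a confusion matrix from observed and predicted categories and derives the overall and per-category accuracy from it. It is an illustration only, not the code behind the output: the function names and the small yes/no data set are hypothetical, and the actual output may present the figures differently.

```python
from collections import Counter


def prediction_accuracy_table(observed, predicted):
    """Cross-tabulate observed (rows) against predicted (columns) categories."""
    labels = sorted(set(observed) | set(predicted))
    counts = Counter(zip(observed, predicted))
    table = {obs: {pred: counts[(obs, pred)] for pred in labels}
             for obs in labels}
    return labels, table


def overall_accuracy(labels, table):
    """Share of all cases that fall on the diagonal of the table."""
    total = sum(sum(row.values()) for row in table.values())
    correct = sum(table[label][label] for label in labels)
    return correct / total


def accuracy_by_category(labels, table):
    """Share of each observed category that was predicted correctly."""
    result = {}
    for label in labels:
        row_total = sum(table[label].values())
        result[label] = table[label][label] / row_total if row_total else None
    return result


# Illustrative data only.
observed = ["yes", "yes", "no", "no", "yes", "no", "no", "yes"]
predicted = ["yes", "no", "no", "no", "yes", "yes", "no", "yes"]

labels, table = prediction_accuracy_table(observed, predicted)
for obs in labels:
    print(obs, [table[obs][pred] for pred in labels])
print(overall_accuracy(labels, table))
print(accuracy_by_category(labels, table))
```

Cases on the diagonal of the table are correct predictions. Off-diagonal cells show which categories the model confuses with one another, which the single overall accuracy figure hides.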
See Also
How to Create a Classification And Regression Trees (CART)
How to Run Machine Learning Diagnostics - Prediction-Accuracy Table
How to Run Machine Learning Diagnostics - Table of Discriminant Function Coefficients extension
How to Create an Ensemble of Machine Learning Models
How to Compare Machine Learning Models
How to Run Machine Learning Linear Discriminant Analysis
How to Save Machine Learning Discrimination Variables
How to Save Machine Learning Predicted Values Variables
How to Save Machine Learning Probability of Each Response Variable
How to Run Support Vector Machine