How to Run a Gradient Boosting Machine Learning Model

Boosting is a method for combining a series of simple individual models to create a more powerful model. The name gradient boosting arises because target outcomes are set based on the gradient of the error with respect to the prediction of each case. Each new model takes a step in the direction that minimizes prediction error in the space of predictions for each training case.
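The idea described above can be illustrated with a minimal sketch. It assumes a squared-error loss, for which the negative gradient with respect to each prediction is simply the residual, and it uses one-split regression stumps as the simple individual models. The function names, the toy data, and the learning rate are illustrative assumptions. This is not the xgboost implementation that the feature relies on.

```python
from statistics import mean


def fit_stump(x, targets):
    """Fit a one-split regression stump to (x, targets) by least squares.

    Assumes x contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [t for xi, t in zip(x, targets) if xi <= threshold]
        right = [t for xi, t in zip(x, targets) if xi > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def boost(x, y, n_rounds=20, learning_rate=0.3):
    """Gradient boosting for squared error with stumps as the simple models."""
    base = mean(y)                      # initial prediction for every case
    predictions = [base] * len(y)
    stumps, history = [], []
    for _ in range(n_rounds):
        # For the loss 0.5 * (y - f)^2, the negative gradient with respect
        # to the prediction f is the residual y - f. These residuals are
        # the target outcomes for the next simple model.
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        # Take a small step in the direction that reduces the error.
        predictions = [pi + learning_rate * stump(xi)
                       for xi, pi in zip(x, predictions)]
        history.append(mean((yi - pi) ** 2 for yi, pi in zip(y, predictions)))

    def predict(xi):
        return base + learning_rate * sum(s(xi) for s in stumps)

    return predict, history


# Toy data, made up for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 1.9, 3.1, 3.9, 8.2, 8.8, 10.1, 10.9]
predict, history = boost(x, y)
print("training MSE after round 1: ", round(history[0], 3))
print("training MSE after round 20:", round(history[-1], 3))
```

The training error recorded in `history` never increases from one round to the next, because each stump is a least-squares fit to the current residuals and the step size is small.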
This article describes how to create a Gradient Boosting Importance output as shown below.

Requirements

A numeric or categorical variable to be used as an Outcome variable to be predicted.
Predictor variables to be considered as predictors of the outcome variable.

Method

From the Report tree, select + > Advanced Analysis > Machine Learning > Gradient Boosting.
In the Object Inspector, go to the Data tab.
In the Outcomes menu, select the variable to be predicted by the predictor variables.
Select the predictor variable(s) from the Predictor(s) list.
OPTIONAL: Select the desired Output type:
- Accuracy: Produces measures of the goodness of model fit. For categorical outcomes, the breakdown by category is shown.
- Importance: Produces a chart showing the importance of the predictors in determining the outcome, as shown above. Only available for the gbtree booster.
- Prediction-Accuracy Table: Produces a table relating the observed and predicted outcomes, also known as a confusion matrix.
- Detail: Text output from the underlying xgboost package.
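To make the Accuracy and Prediction-Accuracy Table outputs more concrete, here is a small sketch of how a table of observed versus predicted categories, and the accuracy figures derived from it, can be computed. The labels are made up for illustration and the function names are hypothetical. The product's own output may be laid out differently.

```python
from collections import Counter


def prediction_accuracy_table(observed, predicted):
    """Cross-tabulate observed (rows) against predicted (columns) labels."""
    labels = sorted(set(observed) | set(predicted))
    counts = Counter(zip(observed, predicted))
    table = {o: {p: counts[(o, p)] for p in labels} for o in labels}
    return labels, table


def accuracy_summary(labels, table):
    """Return overall accuracy and the accuracy within each observed category."""
    total = sum(sum(row.values()) for row in table.values())
    correct = sum(table[label][label] for label in labels)
    by_category = {
        label: table[label][label] / sum(table[label].values())
        for label in labels
        if sum(table[label].values()) > 0
    }
    return correct / total, by_category


# Illustrative labels only, not output from a fitted model.
observed = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "no", "yes", "yes"]

labels, table = prediction_accuracy_table(observed, predicted)
overall, by_category = accuracy_summary(labels, table)

for row_label in labels:
    print(row_label, [table[row_label][col] for col in labels])
print("overall accuracy:", round(overall, 3))
```

Cells on the diagonal of the table count correct predictions. Off-diagonal cells show which categories are confused with each other.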
OPTIONAL: Select the desired Missing Data treatment. (See Missing Data Options).
OPTIONAL: Select Variable names to display variable names in the output instead of labels.
OPTIONAL: Select the underlying Booster model. Choose between gbtree and gblinear.
OPTIONAL: Enable Grid search to search the parameter space in order to tune the model. If not checked, the default xgboost parameters are used. Enabling this usually creates a more accurate predictor, at the cost of longer runtime.
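A grid search can be pictured as trying every combination of candidate parameter values and keeping the one with the lowest validation error. This is also why it takes longer to run. The sketch below uses real xgboost parameter names (`booster`, `eta`, `max_depth`, `subsample`), but the candidate values are assumptions, since this article does not state which grid the feature searches. `toy_score` is a hypothetical placeholder for a cross-validated error that would normally come from fitting the model.

```python
from itertools import product


def expand_grid(param_grid):
    """Return every combination of the candidate values as a list of dicts."""
    names = list(param_grid)
    return [dict(zip(names, values))
            for values in product(*(param_grid[name] for name in names))]


def grid_search(param_grid, score_fn):
    """Score every candidate and return the one with the lowest score."""
    candidates = expand_grid(param_grid)
    scored = [(score_fn(params), params) for params in candidates]
    best_score, best_params = min(scored, key=lambda pair: pair[0])
    return best_params, best_score, len(candidates)


# Candidate values are illustrative assumptions, not the product's grid.
param_grid = {
    "booster": ["gbtree"],
    "eta": [0.1, 0.3],
    "max_depth": [3, 6],
    "subsample": [0.8, 1.0],
}


def toy_score(params):
    # Hypothetical stand-in for a cross-validated error (lower is better).
    # A real search would fit a model with `params` and measure its error.
    return (abs(params["eta"] - 0.1)
            + abs(params["max_depth"] - 6) / 10
            + (1.0 - params["subsample"]))


best_params, best_score, n_fits = grid_search(param_grid, toy_score)
print("models fitted:", n_fits)
print("best parameters:", best_params)
```

The number of models fitted is the product of the number of candidate values for each parameter, so runtime grows quickly as the grid is widened.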