This article describes how to convert a predictive model simulator created in Displayr into a tool that assigns a predicted category to many records at once. The predictive model simulator is designed to predict only one record at a time, based on the control selections. However, we can update this setup to apply the predictions to all records in a new data set.
Requirements
- An existing regression or machine learning predictive model. This article uses the Gradient Boost example from How to Create a Predictive Model Simulator in Displayr.
- New data to apply the predictive model to.
Set up your new data
Use one of the following two options:
- Create a pasted table:
  - In the toolbar, select Table > Paste or Enter Table > Paste or type data.
  - Paste or enter the respondent-level responses using the same column names as the predictor variable names from your predictive model.
- Create a raw data table:
  - Add your new data set to your document via Data Sets > Plus (+).
  - Select the variables that match the ones used as predictors in your predictive model.
  - In the toolbar, select Table > Raw Data > Variable(s) and tick Variable Names.
Importantly, the response labels must exactly match those used in your predictive model; otherwise, the prediction will be NA for that record.
The column names must also match the variable names used in your predictive model. The variable name is found under Properties > GENERAL > Name when you select a variable in the Data Sets tree.
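Both matching requirements above can be checked before running any prediction. The following is a minimal sketch in base R, not part of Displayr's own tooling: check_new_data is a hypothetical helper, and training and new.data are small made-up data frames standing in for the data your model was built on and your new table.

```r
# Hypothetical helper: compare new data against the data used to fit the model
# and report column names and category labels that do not match.
check_new_data <- function(training, new.data) {
    # Columns in the new data that the training data does not have, and vice versa
    # (the training data may also hold the outcome variable, which is expected to be missing)
    unknown.columns <- setdiff(names(new.data), names(training))
    missing.columns <- setdiff(names(training), names(new.data))
    shared <- intersect(names(new.data), names(training))
    # For each shared categorical column, list labels not seen in training
    unknown.labels <- lapply(shared, function(nm) {
        if (is.numeric(training[[nm]])) return(character(0))
        setdiff(unique(as.character(new.data[[nm]])),
                unique(as.character(training[[nm]])))
    })
    names(unknown.labels) <- shared
    # Keep only the columns that actually have unmatched labels
    unknown.labels <- unknown.labels[lengths(unknown.labels) > 0]
    list(unknown.columns = unknown.columns,
         missing.columns = missing.columns,
         unknown.labels = unknown.labels)
}

# Illustrative data only (assumed names and values)
training <- data.frame(Contract = c("Month-to-month", "One year", "Two year"),
                       Tenure = c(1, 24, 60),
                       stringsAsFactors = TRUE)
new.data <- data.frame(Contract = c("One year", "one year"),
                       tenure = c(12, 30))

issues <- check_new_data(training, new.data)
print(issues)
```

In this made-up example, the lower-case column name tenure and the label "one year" are both flagged, because matching is case-sensitive.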
Set up your prediction calculation
In this example, we now have a table called raw.data. To predict the outcome from the model on this new data, we can use a simplified version of the simulator's prediction code as follows:
1. Select Calculation > Custom Code in the toolbar and click your page.
2. Paste the below into the R CODE section:
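# New data table and the model to predict from
DF = raw.data
arguments <- list(model, newdata = DF)
do.call(predict, arguments)  # assumed final step: apply predict to every row of DF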
If necessary, replace raw.data and model above with the names of your own new data table and model.
This runs the prediction on every record in the new data.
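Outside Displayr, the same idea can be illustrated with any model that has a predict method. The sketch below is a stand-in built on assumptions: it fits a base R logistic regression to the built-in mtcars data in place of the Displayr model, and new.records is a made-up table in place of raw.data. The 0.5 cut-off used to turn probabilities into categories is also only an illustrative choice.

```r
# Stand-in for the Displayr model: logistic regression on built-in data
# (am is 0 for automatic and 1 for manual transmission)
toy.model <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# Stand-in for raw.data: one row per record, columns named as the predictors
new.records <- data.frame(wt = c(2.2, 3.4, 5.0),
                          hp = c(110, 150, 200))

# Build the argument list, then call predict once for all rows
arguments <- list(toy.model, newdata = new.records, type = "response")
probabilities <- do.call(predict, arguments)

# Convert each probability into a predicted category (0.5 cut-off is illustrative)
predicted.category <- ifelse(probabilities > 0.5, "Manual", "Automatic")
print(data.frame(new.records, predicted.category))
```

Because predict receives the whole table through newdata, it returns one prediction per row, so there is no need to loop over records.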
Note that, in this simplified form, the predict function runs on all regression and machine learning algorithms except CART and Linear Discriminant Analysis. For these two algorithms, it is better to slightly amend the existing R code in your predicted category calculation, as follows.
- Select your predicted category calculation, i.e. model.predicted.outcome, on your Simulator page.
- Under Properties > R CODE, keep the first line as is, then replace the code that starts on line 2 and previously defined DF with DF = raw.data, so that this becomes line 2.
The calculation now returns predictions for all records regardless of the source algorithm.
3. OPTIONAL: Export the predicted category output from the toolbar via Publish > Export Pages > Excel.