Displayr has a built-in stacking widget that provides a point and click way of stacking SPSS .sav files, see How to Stack Data in Displayr. However, sometimes data quality issues like missing variables, reusing codes, and different value attributes won't allow the stacking to be done. In this instance, you will need to use R to stack the data.
- An SPSS (.sav) data set saved on your Displayr Cloud drive.
1. To start, you should generate the list of variables you want to stack. It's recommended that you use the Stacking Code Generator Excel tool to create the R code needed to list out your stacked variables - Stacking Code Generator. Follow the instructions on the Instructions tab and paste the generated code in step (4) in the example code from below - being sure to remove the last comma.
2. Go to Data Sets > Plus (+) > R.
3. Enter a name for the data set under Name.
4. Paste in the below R code where it states Enter your R code here and review each numbered step in the code to make the appropriate edits:
#Review the numbered list of settings below
#Load the foreign package to use
### (1.a) SPECIFY the file location - link must end in .sav and automatically download the file
# if not, comment out the lines in this section and comment in the code in section 1.b
location = "https://wiki.q-researchsoftware.com/images/3/35/Technology_2018.sav"
use.value.labels = TRUE,
to.data.frame = TRUE))
### (1.b) SPECIFY the file location - if using a file on your Displayr Cloud drive
##put the file name from your Displayr Cloud drive inside '' below
##convert to a data frame to reshape later
### (2) specify your ID (Case) variable that can be used to link to the unstacked data
id.variable = 'RESPNUM'
### (3) list any other variables you want to include in the stacked dataset but leave unstacked
variables.to.stretch = c('Q1', 'Rec_Age')
### provide lists of variables to include in each stacked variable
variables.to.stack = list(
# (4) REPLACE the below with a copy/paste from BLUE (first) column of Excel template
# if you need to insert blank values then use the name 'dummy' in the list below
## !!!! be sure to remove the last comma before the closing round bracket !!!!
'Recommend' = c('Q3_01', 'Q3_02', 'Q3_03', 'Q3_04', 'Q3_05',
'Fun' = c('Q4a_01','Q4a_02','Q4a_03','Q4a_04','Q4a_05','Q4a_06',
'Worth_what_you_pay_for' = c('Q4b_01','Q4b_02','Q4b_03','Q4b_04','Q4b_05','Q4b_06',
'Innovative' = c('Q4c_01','Q4c_02','Q4c_03','Q4c_04','Q4c_05','Q4c_06',
'Good_customer_service' = c('Q4d_01','Q4d_02','Q4d_03','Q4d_04','Q4d_05','Q4d_06',
'Stylish' = c('Q4e_01','Q4e_02','Q4e_03','Q4e_04','Q4e_05','Q4e_06',
'Easy_to_use' = c('Q4f_01','Q4f_02','Q4f_03','Q4f_04','Q4f_05','Q4f_06',
'High_quality' = c('Q4g_01','Q4g_02','Q4g_03','Q4g_04','Q4g_05','Q4g_06',
'High_performance' = c('Q4h_01','Q4h_02','Q4h_03','Q4h_04','Q4h_05','Q4h_06',
'Low_prices' = c('Q4i_01','Q4i_02','Q4i_03','Q4i_04','Q4i_05','Q4i_06',
### get lists of variables to include/exclude
#create dummy variable in the data set to use if needed
#pull off the column headers from the data set
all.names <- names(thedata)
#identify variables to exclude from the stacked data set
variables.to.exclude = all.names[!all.names %in% c(unlist(variables.to.stack), id.variable, variables.to.stretch)]
### Stack the data
stacked.data = reshape(data = thedata,
idvar = id.variable, direction = "long",
drop = variables.to.exclude,
varying = variables.to.stack)
### Format the stacked data
#make column names pretty
names(stacked.data) = c(id.variable, variables.to.stretch, "Observation", names(variables.to.stack))
### (5) return the final stacked data set or read article for how to save to cloud
#keep category metadata
stacked.data = as_factor(stacked.data)
- Note that the variables.to.stack object includes the following format:
- Each element of the list has a name to tell us what the variables mean. This will become the name of the variable in the stacked data frame.
- Each element of the list is a vector that tells us the variable names in the original data frame and the order in which they are to be stacked.
- If you don't have a variable to stretch you can use the following code to set variables.to.stretch to an empty vector.
variables.to.stretch = c()
5. OPTIONAL: If you are unsure of the exact code you need to use to stack your data set, you can prototype the code in a Calculation by using Calculation > Custom Code and typing in your R CODE. The Calculation will allow you to preview the results and modify your code. You can then use that same code to add your R data set as in step 1.
6. OPTIONAL: If you would rather save your stacked data to a .sav file on your Displayr Cloud Drive, in the case that you're working from a data preparation document, you should:
- Copy your stacking code to a Calculation > Custom Code
- Ensure the names of your new stacked variables are valid SPSS variables names (no spaces, begin with an alphabetical character, are unique, etc).
- Replace the code in step 5 of the example above with the following:
### (5) specify filename and save the final stacked data as a sav to the Displayr Cloud
#change stackeddata.sav to the filename you want to save
#note files get overwritten automatically based on name
stackedfile = "stackeddata.sav"
#save the dataset to the cloud
7. R data doesn’t have the same level of metadata as some file types, like SSS and SAV. The code above will retain labels that are consistent across stacking, but care should be taken to review variables with inconsistencies. After the data set is loaded into Displayr, you should always review the Value Attributes and Structures of your variables.
8. OPTIONAL: There will be a new variable Observation in your stacked data file that identifies which iteration of the stacking the case comes from. You can relabel the Values of this variable to correspond to the appropriate product/brand/item for that particular iteration.
9. OPTIONAL: To use this in the same document as your unstacked data, please see How to Link Data Sets Together in a Document