This article describes how to stack a data set in Displayr using R. Stacking is also known as converting data from wide format (one row per respondent) to long format (one row per respondent per observation). To follow along, you need:
- A Displayr document.
- An SPSS (.sav) data set that is available at a URL.
1. Go to Data Sets > Plus (+) > R.
2. Enter a name for the data set under Name.
3. Paste the R code below where it states "Enter your R code here", updating the URL and variable names to match your own data:
# Load the foreign package, which provides read.spss()
library(foreign)
# Reading in the data
tech = suppressWarnings(read.spss("https://wiki.q-researchsoftware.com/images/3/35/Technology_2018.sav",
    use.value.labels = TRUE,
    to.data.frame = TRUE))
# Stacking the data
id.variable = 'RESPNUM'
variables.to.stretch = c('Q1', 'Rec_Age')
# NOTE: every vector must list the same number of variables, in the same
# order, or reshape() will fail. The vectors shown here are incomplete
# ('Recommend' is one name shorter than the rest); extend each one with
# the remaining variable names from your data.
variables.to.stack = list(
    'Recommend' = c('Q3_01', 'Q3_02', 'Q3_03', 'Q3_04', 'Q3_05'),
    'Fun' = c('Q4a_01','Q4a_02','Q4a_03','Q4a_04','Q4a_05','Q4a_06'),
    'Worth what you pay for' = c('Q4b_01','Q4b_02','Q4b_03','Q4b_04','Q4b_05','Q4b_06'),
    'Innovative' = c('Q4c_01','Q4c_02','Q4c_03','Q4c_04','Q4c_05','Q4c_06'),
    'Good customer service' = c('Q4d_01','Q4d_02','Q4d_03','Q4d_04','Q4d_05','Q4d_06'),
    'Stylish' = c('Q4e_01','Q4e_02','Q4e_03','Q4e_04','Q4e_05','Q4e_06'),
    'Easy-to-use' = c('Q4f_01','Q4f_02','Q4f_03','Q4f_04','Q4f_05','Q4f_06'),
    'High quality' = c('Q4g_01','Q4g_02','Q4g_03','Q4g_04','Q4g_05','Q4g_06'),
    'High performance' = c('Q4h_01','Q4h_02','Q4h_03','Q4h_04','Q4h_05','Q4h_06'),
    'Low prices' = c('Q4i_01','Q4i_02','Q4i_03','Q4i_04','Q4i_05','Q4i_06'))
all.names = names(tech)
variables.to.exclude = all.names[!all.names %in% c(unlist(variables.to.stack), id.variable, variables.to.stretch)]
stacked.tech = reshape(data = tech,
    idvar = id.variable, direction = "long",
    drop = variables.to.exclude,
    varying = variables.to.stack)
names(stacked.tech) = c(id.variable, variables.to.stretch, "Observation", names(variables.to.stack))
The above code uses the foreign R package to import the SPSS file from our URL. Next, we set the name of the ID variable in id.variable, the variables to stack in variables.to.stack, and the variables to stretch (i.e., include but not stack) in variables.to.stretch.
Note that the variables.to.stack object has the following format:
- Each element of the list has a name to tell us what the variables mean. This will become the name of the variable in the stacked data frame.
- Each element of the list is a vector which tells us the variable names in the original data frame and the order in which they are to be stacked.
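The format described above can be sanity-checked before stacking. The following is a minimal sketch using a hypothetical two-measure, three-brand specification (the shortened lists are illustrative only, not the full Technology data set). It confirms that every vector has the same length, which reshape requires.

```r
# Hypothetical toy specification: two measures, each asked about three brands
variables.to.stack = list(
    'Recommend' = c('Q3_01', 'Q3_02', 'Q3_03'),
    'Fun'       = c('Q4a_01', 'Q4a_02', 'Q4a_03'))

# The list names are the ones used later to rename the stacked columns
stacked.names = names(variables.to.stack)

# Every vector must list the same number of variables, in the same order
set.sizes = lengths(variables.to.stack)
stopifnot(length(unique(set.sizes)) == 1)
```

If the stopifnot call raises an error, one of the vectors is missing a variable or has an extra one.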
The variables.to.exclude code then works out the variables that are to be excluded by process of elimination.
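As an illustration of the elimination step, here is a small sketch on a made-up data frame. The column names Weight and Notes are hypothetical and stand in for any variables you do not want in the stacked data.

```r
# Toy data frame; all column names and values are illustrative
toy = data.frame(RESPNUM = 1:2,
                 Q1 = c("Male", "Female"),
                 Q3_01 = c(9, 7), Q3_02 = c(5, 6),
                 Weight = c(1.1, 0.9),
                 Notes = c("a", "b"))
id.variable = 'RESPNUM'
variables.to.stretch = c('Q1')
variables.to.stack = list('Recommend' = c('Q3_01', 'Q3_02'))

# Everything that is not an ID, stretched, or stacked variable is excluded
all.names = names(toy)
keep = c(unlist(variables.to.stack), id.variable, variables.to.stretch)
variables.to.exclude = all.names[!all.names %in% keep]

# setdiff() expresses the same elimination more compactly
same.result = setdiff(all.names, keep)
```

Here variables.to.exclude contains only Weight and Notes, so those two columns would be dropped by reshape.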
If you don't have any variables to stretch, you can use the following code to set variables.to.stretch to an empty vector:
variables.to.stretch = c()
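The reason this works is that c() with no arguments returns NULL, and NULL disappears when combined with other values. A brief sketch, using illustrative variable names:

```r
variables.to.stretch = c()

# c() with no arguments is NULL
stretch.is.null = is.null(variables.to.stretch)

# NULL contributes nothing when combined, so the rest of the code is unaffected
keep = c(c('Q3_01', 'Q3_02'), 'RESPNUM', variables.to.stretch)
new.names = c('RESPNUM', variables.to.stretch, "Observation", 'Recommend')
```

Both keep and new.names simply omit the stretched variables, so no other line of the stacking code needs to change.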
We next use the reshape function to create a stacked data frame. Its arguments tell it what to do with each column: idvar identifies the respondent, drop lists the columns to leave out, and varying lists the sets of columns to stack. Setting direction to "long" tells the function to stack the data (wide to long) rather than unstack it (long to wide).
The final line of the code changes the column names of the stacked data to make them more meaningful, and this affects how the final data appears in Displayr.
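To see the whole stacking and renaming sequence on something small, the sketch below applies the same calls to a made-up data frame with two respondents and two brands. The data values are invented for illustration, and the drop argument is left out because this toy data has no extra columns.

```r
# Toy wide data: one row per respondent (values are illustrative)
toy = data.frame(RESPNUM = 1:2,
                 Q1 = c("Male", "Female"),
                 Q3_01 = c(9, 7), Q3_02 = c(5, 6),
                 Q4a_01 = c(1, 0), Q4a_02 = c(0, 1))
id.variable = 'RESPNUM'
variables.to.stretch = c('Q1')
variables.to.stack = list('Recommend' = c('Q3_01', 'Q3_02'),
                          'Fun' = c('Q4a_01', 'Q4a_02'))

# Stack: each respondent now gets one row per brand
stacked.toy = reshape(data = toy,
                      idvar = id.variable, direction = "long",
                      varying = variables.to.stack)

# Rename by position: ID, stretched variables, observation index, stacked sets
names(stacked.toy) = c(id.variable, variables.to.stretch,
                       "Observation", names(variables.to.stack))
```

The result has four rows (two respondents by two brands). Because the renaming is positional, this sketch assumes the ID and stretched variables appear in the data in the same order as they are listed in the names vector.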
4. OPTIONAL: If you are unsure of the exact code that you need to stack your data set, you can prototype it in a Calculation by using Calculation > Custom Code and typing in your R code. The Calculation lets you preview the results and modify your code as you go. You can then use that same code to add your R data set.
5. R doesn’t have the same level of metadata as some file types, like SSS and SAV. For example, variables in an R data frame do not have the concept of both Name and Label. Remember to add such information after you have added your data set.