This article describes how to stack a data set in Displayr using R. Stacking can also be described as going from wide format (one row per respondent, one column per observation) to long format (one row per respondent per observation).
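As a small illustration of wide versus long format, the sketch below builds the same made-up ratings both ways. The data frame and column names (id, rating_A, rating_B, brand, rating) are invented for this example and are not part of the data set used later in the article.

```r
# Wide format: one row per respondent, one column per brand rating.
# All names and values here are made up for illustration.
wide = data.frame(id = c(1, 2),
                  rating_A = c(5, 3),
                  rating_B = c(4, 2))

# Long (stacked) format: one row per respondent-brand combination.
long = data.frame(id = rep(wide$id, times = 2),
                  brand = rep(c("A", "B"), each = nrow(wide)),
                  rating = c(wide$rating_A, wide$rating_B))

print(wide)
print(long)
```

The long version has twice as many rows as the wide version because each respondent rated two brands.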
Requirements
- A Displayr document
- An SPSS (.sav) data set saved as a URL
Method
1. Go to Data Sets > Plus (+) > R.
2. Enter a name for the data set under Name.
3. Paste the R code below where it states "Enter your R code here", updating it to suit your data:
# Reading in the data
library(foreign)
tech = suppressWarnings(read.spss("https://wiki.q-researchsoftware.com/images/3/35/Technology_2018.sav",
                                  use.value.labels = TRUE,
                                  to.data.frame = TRUE))
# Stacking the data
id.variable = 'RESPNUM'
variables.to.stretch = c('Q1', 'Rec_Age')
variables.to.stack = list(
    'Recommend' = c('Q3_01', 'Q3_02', 'Q3_03', 'Q3_04', 'Q3_05',
                    'Q3_06', 'Q3_07', 'Q3_08', 'Q3_09', 'Q3_10', 'Q3_11',
                    'Q3_12', 'Q3_13'),
    'Fun' = c('Q4a_01', 'Q4a_02', 'Q4a_03', 'Q4a_04', 'Q4a_05', 'Q4a_06',
              'Q4a_07', 'Q4a_08', 'Q4a_09', 'Q4a_10', 'Q4a_11', 'Q4a_12', 'Q4a_13'),
    'Worth what you pay for' = c('Q4b_01', 'Q4b_02', 'Q4b_03', 'Q4b_04', 'Q4b_05', 'Q4b_06',
                                 'Q4b_07', 'Q4b_08', 'Q4b_09', 'Q4b_10', 'Q4b_11', 'Q4b_12', 'Q4b_13'),
    'Innovative' = c('Q4c_01', 'Q4c_02', 'Q4c_03', 'Q4c_04', 'Q4c_05', 'Q4c_06',
                     'Q4c_07', 'Q4c_08', 'Q4c_09', 'Q4c_10', 'Q4c_11', 'Q4c_12', 'Q4c_13'),
    'Good customer service' = c('Q4d_01', 'Q4d_02', 'Q4d_03', 'Q4d_04', 'Q4d_05', 'Q4d_06',
                                'Q4d_07', 'Q4d_08', 'Q4d_09', 'Q4d_10', 'Q4d_11', 'Q4d_12', 'Q4d_13'),
    'Stylish' = c('Q4e_01', 'Q4e_02', 'Q4e_03', 'Q4e_04', 'Q4e_05', 'Q4e_06',
                  'Q4e_07', 'Q4e_08', 'Q4e_09', 'Q4e_10', 'Q4e_11', 'Q4e_12', 'Q4e_13'),
    'Easy-to-use' = c('Q4f_01', 'Q4f_02', 'Q4f_03', 'Q4f_04', 'Q4f_05', 'Q4f_06',
                      'Q4f_07', 'Q4f_08', 'Q4f_09', 'Q4f_10', 'Q4f_11', 'Q4f_12', 'Q4f_13'),
    'High quality' = c('Q4g_01', 'Q4g_02', 'Q4g_03', 'Q4g_04', 'Q4g_05', 'Q4g_06',
                       'Q4g_07', 'Q4g_08', 'Q4g_09', 'Q4g_10', 'Q4g_11', 'Q4g_12', 'Q4g_13'),
    'High performance' = c('Q4h_01', 'Q4h_02', 'Q4h_03', 'Q4h_04', 'Q4h_05', 'Q4h_06',
                           'Q4h_07', 'Q4h_08', 'Q4h_09', 'Q4h_10', 'Q4h_11', 'Q4h_12', 'Q4h_13'),
    'Low prices' = c('Q4i_01', 'Q4i_02', 'Q4i_03', 'Q4i_04', 'Q4i_05', 'Q4i_06',
                     'Q4i_07', 'Q4i_08', 'Q4i_09', 'Q4i_10', 'Q4i_11', 'Q4i_12', 'Q4i_13'))
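# Work out which variables to drop: everything that is not the id,
# a stretched variable, or a stacked variable
all.names = names(tech)
variables.to.exclude = all.names[!all.names %in% c(unlist(variables.to.stack), id.variable, variables.to.stretch)]
stacked.tech = reshape(data = tech,
                       idvar = id.variable,
                       direction = "long",
                       drop = variables.to.exclude,
                       varying = variables.to.stack)
# Rename the columns: id, stretched variables, observation number, stacked variables
names(stacked.tech) = c(id.variable, variables.to.stretch, "Observation", names(variables.to.stack))
stacked.tech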
The above code uses the foreign R package to import the SPSS file from a URL. Next, we set the name of the ID variable in id.variable, the variables to stretch (i.e. include but not stack) in variables.to.stretch, and the list of variables to stack in variables.to.stack.
Note that the variables.to.stack object has the following format:
- Each element of the list has a name to tell us what the variables mean. This will become the name of the variable in the stacked data frame.
- Each element of the list is a vector that tells us the variable names in the original data frame and the order in which they are to be stacked.
The variables.to.exclude code then works out the variables that are to be excluded by process of elimination.
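To make the process of elimination concrete, here is a minimal sketch on a made-up set of column names. The names Q2 and Weight are hypothetical, and variables.to.keep is a helper introduced only for this example.

```r
# Hypothetical column names standing in for a real data set.
all.names = c("RESPNUM", "Q1", "Q2", "Q3_01", "Q3_02", "Weight")
id.variable = "RESPNUM"
variables.to.stretch = c("Q1")
variables.to.stack = list("Recommend" = c("Q3_01", "Q3_02"))

# Everything we want to keep, flattened into one character vector.
variables.to.keep = c(unlist(variables.to.stack), id.variable, variables.to.stretch)

# Whatever is left over is excluded.
variables.to.exclude = all.names[!all.names %in% variables.to.keep]
print(variables.to.exclude)
```

Here only Q2 and Weight are left over, so they are the variables that would be dropped. Base R's setdiff(all.names, variables.to.keep) gives the same result for this input.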
If you don't have any variables to stretch, you can use the following code to set variables.to.stretch to an empty vector.
variables.to.stretch = c()
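The short sketch below shows why an empty vector is safe here: c() returns NULL, and NULL simply disappears when combined with other values. The variable names are taken from the example above; kept and new.names are helper names introduced only for this illustration.

```r
# An empty c() is NULL in R.
variables.to.stretch = c()
print(is.null(variables.to.stretch))

id.variable = "RESPNUM"
variables.to.stack = list("Recommend" = c("Q3_01", "Q3_02"))

# NULL contributes nothing when concatenated, so both the list of
# kept variables and the new column names stay valid.
kept = c(unlist(variables.to.stack), id.variable, variables.to.stretch)
new.names = c(id.variable, variables.to.stretch, "Observation", names(variables.to.stack))
print(length(kept))
print(new.names)
```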
We next use the reshape function to create a stacked data frame. Its arguments tell it what to do with each of the columns: idvar identifies the respondent, drop lists the columns to leave out, and varying lists the sets of columns to stack. The direction argument tells the function that we want to stack the data (wide to long) rather than apply the reverse transformation.
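To see what reshape returns before any renaming, here is a minimal sketch on a tiny made-up data frame with two respondents and two observations per set. The values are invented; only the variable naming pattern follows the example data.

```r
# Tiny made-up wide data set: two respondents, two observations per set.
wide = data.frame(RESPNUM = 1:2,
                  Q1 = c("Male", "Female"),
                  Q3_01 = c(9, 7), Q3_02 = c(6, 8),
                  Q4a_01 = c(1, 0), Q4a_02 = c(0, 1))

variables.to.stack = list("Recommend" = c("Q3_01", "Q3_02"),
                          "Fun" = c("Q4a_01", "Q4a_02"))

stacked = reshape(data = wide,
                  idvar = "RESPNUM",
                  direction = "long",
                  varying = variables.to.stack)

# Inspect the column names and the stacked rows.
print(names(stacked))
print(stacked)
```

Each respondent now appears once per observation, and the column names printed here are the ones that the final renaming step replaces.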
The final line of the code changes the column names of the stacked data to make them more meaningful, and this affects how the final data appears in Displayr.
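The renaming in the final line is positional, so it depends on the order in which reshape returns the columns. As an alternative sketch (not the method used in this article), the new names can be passed to reshape directly through its v.names and timevar arguments. The data below is made up for illustration.

```r
# Tiny made-up wide data set.
wide = data.frame(RESPNUM = 1:2,
                  Q1 = c("Male", "Female"),
                  Q3_01 = c(9, 7), Q3_02 = c(6, 8),
                  Q4a_01 = c(1, 0), Q4a_02 = c(0, 1))

variables.to.stack = list("Recommend" = c("Q3_01", "Q3_02"),
                          "Fun" = c("Q4a_01", "Q4a_02"))

# v.names supplies the stacked column names and timevar names the
# observation counter, so no positional renaming is needed afterwards.
stacked = reshape(data = wide,
                  idvar = "RESPNUM",
                  direction = "long",
                  varying = variables.to.stack,
                  v.names = names(variables.to.stack),
                  timevar = "Observation")

print(names(stacked))
```

Either approach should give the same data. This variant only removes the need to keep the list of new names in step with the column order.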
4. OPTIONAL: If you are unsure of the exact code that you need to stack your data set, you can prototype it in a Calculation by using Calculation > Custom Code and typing in your R code. The Calculation will allow you to preview the results and modify your code as you go. You can then use that same code to add your R data set.
5. R doesn’t have the same level of metadata as some file types, like SSS and SAV. For example, variables in an R data frame do not have the concept of both Name and Label. Remember to add such information after you have added your data set.
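When prototyping the stacking code as described in step 4, a quick sanity check can catch a mis-specified list of variables. The sketch below, on made-up data, checks that the stacked data has one row per respondent per observation. The name n.observations is introduced only for this example.

```r
# Tiny made-up wide data set: three respondents, two observations.
wide = data.frame(RESPNUM = 1:3,
                  Q3_01 = c(9, 7, 5),
                  Q3_02 = c(6, 8, 4))
variables.to.stack = list("Recommend" = c("Q3_01", "Q3_02"))

stacked = reshape(data = wide,
                  idvar = "RESPNUM",
                  direction = "long",
                  varying = variables.to.stack)

# Number of observations stacked per respondent.
n.observations = length(variables.to.stack[[1]])

# The stacked data should have one row per respondent per observation,
# and each respondent should appear exactly n.observations times.
stopifnot(nrow(stacked) == nrow(wide) * n.observations)
stopifnot(all(table(stacked$RESPNUM) == n.observations))
print(dim(stacked))
```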
Additional Tools
You can use the following Excel file to assist in generating your R Code: Stacking Code Generator
To use this template:
- Assemble the list of variable names you want to stack. You don't need every single variable in the data set, just the ones you wish to eventually stack.
- Paste them into Column C in the Original Variable List tab.
- In the Stacking Code Generator tab, from Column D onward, add your variable names so that each row is a "set" of variables that will be stacked together and each column is an individual observation.
- Tip: use the “Transpose” paste feature in Excel (if you need to) and delete any unnecessary rows in the generator. Some stacking does not require transpose pasting.
- Column A (the blue column) automatically generates some code for each resultant stacked variable. Copy all the cells in the blue column and paste them into the appropriate spot in the R Code template.
- When pasting the code into Displayr, be sure to remove the last comma before closing the round brackets.
- Be careful when using Excel to copy, paste, delete, and transpose data. All of the above requires some common sense when working in Excel.
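If your variable names follow a regular pattern, as they do in the example data set, the list can also be generated in R rather than in Excel. This is a minimal sketch that assumes names of the form prefix, underscore, two-digit number. The helper numbered.names is hypothetical and not part of Displayr or the template.

```r
# Hypothetical helper: builds names such as "Q3_01", "Q3_02", ...
# Assumes a prefix, an underscore, and a zero-padded two-digit number.
numbered.names = function(prefix, n = 13) {
    sprintf("%s_%02d", prefix, seq_len(n))
}

variables.to.stack = list("Recommend" = numbered.names("Q3"),
                          "Fun" = numbered.names("Q4a"))

print(variables.to.stack)
```

It is still worth checking the generated names against names() of your data set before stacking, since any name that does not exist in the data will cause reshape to fail.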