This article describes how to set up your data file for tracking studies, including those with advertising campaigns.
Requirements
- A cumulative data file. That is, a single data file should contain all waves of the study. If a cumulative data file cannot be obtained, Displayr cannot be used to test for differences between waves.
- A data file that follows the guidelines in How to Set Up Your SPSS File for Importing into Displayr.
A single project
- Setting up a separate project for each wave will prevent any version control tools from working. That is, better data collection programs keep track of all changes to the questionnaire within a project.
- Good data collection programs will prevent users from making 'silly' changes to questionnaires on tracking studies (e.g., adding or removing categories, changing between single and multiple response questions).
- When exporting from a single project, the export will be forced to be consistent. However, if exporting from different projects, there will often be inconsistencies. For example, sometimes question numbers are automatic, with the result being that Q2 in one project will be Q3 in another. Or, sometimes numeric variables will be exported as text and vice versa. Such inconsistencies greatly add to the complexity of analyzing the data.
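The kinds of inconsistency described above (variables that exist in only one export, or that change between numeric and text) can be detected before any merging is attempted. The following is a minimal sketch in Python, assuming each export has already been summarized as a dictionary of variable names and types. The function name `compare_exports` and the sample variables are hypothetical and are not part of Displayr or any data collection program.

```python
def compare_exports(first, second):
    """Compare two exports, each given as {variable_name: type_name}."""
    shared = sorted(set(first) & set(second))
    return {
        # Variables that appear in only one of the two exports.
        "only_in_first": sorted(set(first) - set(second)),
        "only_in_second": sorted(set(second) - set(first)),
        # Variables present in both exports but with a different type.
        "type_changes": {
            name: (first[name], second[name])
            for name in shared
            if first[name] != second[name]
        },
    }


# Hypothetical exports from two separate projects.
wave1 = {"ID": "numeric", "Q1": "numeric", "Q2": "numeric"}
wave2 = {"ID": "numeric", "Q1": "text", "Q3": "numeric"}

report = compare_exports(wave1, wave2)
for issue, details in report.items():
    print(issue, details)
```

A report with no entries suggests the exports line up by name and type. It does not confirm that the questions behind the variables are the same, so a Q2 that has silently become Q3 still needs to be checked against the questionnaire.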
Defensive programming of the questionnaire
It is common for the questionnaire to evolve over the life of a tracking study. Changes to the questionnaire, such as the addition of new brands, can make later analysis difficult. For example, if a question collects awareness data and a new brand is added, then a new variable must be created in the data file for that brand. Some useful tricks:
- If changing the wording of an option (e.g., changing "My main brand" to "My number 1 brand"), hide the original option and add a new option. That is, do not simply overwrite the existing label. The reason for having them as separate versions is that it will make it possible to disentangle the effects of the different wordings.
- When changing a question, consider hiding the old question and creating a new question. For example, if a question asks people to choose between A, B, and C, and in the second wave there is a desire to add option D, it is often a good idea to set this up as a completely different question. When this is not done, it is easy to forget, when analyzing the data, that the initial respondents did not get the option of choosing D.
- When changing response options in Nominal, Nominal - Multi and Binary - Multi (Compact) questions, create additional variables for exporting that track which respondents saw which versions. This is useful because the information often cannot be deduced from the data otherwise (i.e., there is no way to distinguish between people not choosing an option because it was not there and people not choosing it because it was not applicable).
- A useful trick when programming Binary - Multi and Binary - Grid questions is to put dummy brands into the original data file and hide these. For example, the questionnaire may be set up as offering the following brands: Coke, Pepsi, Fanta, DUMMY1, DUMMY2, and DUMMY3, which then ensures that when the data is exported, it contains the additional variables.
- Be careful when moving questions in your survey platform, and don't rename them. Not all survey platforms will retain the same export variable definitions after such actions. When exporting your data, keep variable names consistent for the questions you wish to merge, as merging software generally uses these names for matching. The same goes for changing the order of response options, as some survey platforms may automatically shift their values when this is done.
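The tricks above about tracking questionnaire versions can be illustrated with a small Python sketch. It assumes a multiple response awareness question with two versions, where "Sprite" is an invented brand added in version 2. The names `OPTIONS_BY_VERSION`, `ALL_BRANDS`, and `code_response` are hypothetical. The idea is to export a version variable and to store a missing value, rather than 0, for options a respondent was never shown.

```python
# Hypothetical questionnaire versions: "Sprite" is an invented brand added in version 2.
OPTIONS_BY_VERSION = {
    1: ["Coke", "Pepsi", "Fanta"],
    2: ["Coke", "Pepsi", "Fanta", "Sprite"],
}
ALL_BRANDS = ["Coke", "Pepsi", "Fanta", "Sprite"]


def code_response(version, selected):
    """Return one row of export data for a respondent."""
    shown = OPTIONS_BY_VERSION[version]
    record = {"version": version}  # extra variable recording the version seen
    for brand in ALL_BRANDS:
        if brand not in shown:
            record[brand] = None  # missing: the option was not on the questionnaire
        else:
            record[brand] = 1 if brand in selected else 0
    return record


wave1_row = code_response(1, {"Coke"})
wave2_row = code_response(2, {"Coke"})
print(wave1_row)
print(wave2_row)
```

Both respondents selected only Coke, but the two rows differ for Sprite: missing for the version 1 respondent, and 0 for the version 2 respondent who saw the option and did not choose it.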
Preparation of a cumulative data file
There are four main ways to create cumulative data files. These are ordered from best to worst according to their desirability:
- Export a single data file from the data collection software. The reason that this is generally the best approach is that where there are changes in the questionnaire these will either have to be resolved prior to exporting or will be obvious in the exported data file (i.e., with separate versions of the same question).
- Have the data glued together using data collection software that has tools for addressing different versions of a questionnaire.
- Merge the files together in Displayr per How to Merge Files by Case (Add New Cases) and How to Merge Files by Variable (Add New Variables). Note that this is less desirable than having them glued together by specialist data collection software because the file formats used by such software typically contain more metadata, which makes the merging more successful.
- Merge the files in another program, such as SPSS. Note that this is generally inferior to using Displayr or Q to do the merging, as the data merging tools in Displayr/Q are written on the assumption that the data will be used in Displayr/Q, and thus they tend to produce better-merged data files.
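To make concrete what merging by case involves, here is a conceptual Python sketch. It is not how Displayr, Q, or SPSS implement merging. It assumes each wave is a list of dictionaries keyed by variable name, and the function name `merge_by_case` and the sample data are hypothetical. Rows from each wave are appended, a wave variable is added, and variables absent from a wave are left missing.

```python
def merge_by_case(waves):
    """Stack waves given as {wave_label: [row_dict, ...]} into one table."""
    # Build the union of variable names, keeping first-seen order.
    columns = ["wave"]
    for rows in waves.values():
        for row in rows:
            for name in row:
                if name not in columns:
                    columns.append(name)

    merged = []
    for label, rows in waves.items():
        for row in rows:
            # Variables not collected in this wave become missing (None).
            out = {name: row.get(name) for name in columns}
            out["wave"] = label
            merged.append(out)
    return columns, merged


# Hypothetical waves: Q2 was only asked from wave 2 onwards.
waves = {
    "Wave 1": [{"ID": 1, "Q1": 3}],
    "Wave 2": [{"ID": 2, "Q1": 4, "Q2": 1}],
}
columns, merged = merge_by_case(waves)
print(columns)
for row in merged:
    print(row)
```

The sketch matches variables purely by name, which is why consistent variable names across waves matter, and why a tool with access to richer metadata can do a better job.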
Version control and the avoidance of crystallizing errors
Over the course of a tracker, it is typical that many small changes will occur in the questionnaire. It is generally useful to have some way of working out which versions of which questions were seen by which respondents. In general, the best way to do this is by using version control tools in the data collection software. Two things to avoid are:
- Merging different versions of questions when merging the data file.
- Recoding data in SPSS or in Displayr/Q, then exporting the result as a data file and merging this data file with other files.
Both of these practices cause errors to be locked in (i.e., crystallized). The better process is to merge the data files, merging only identical questions, and then use tools within Displayr to merge different versions of the questionnaire, as this makes it easy to identify when changes occurred. See How to Merge Files by Case (Add New Cases) and How to Merge Files by Variable (Add New Variables).
With very large or complex projects, it can often be useful to have two projects. The first project contains all the different versions and any data cleaning. An SPSS data file is then exported from this first project and analyzed in a second project.
Working with advertising data
Advertising trackers regularly replace the campaigns being tracked. There are a variety of ways of addressing this. How well each one suits a given project can differ substantially, so a cost-benefit analysis should be undertaken before making a decision. However, in general, having a format that allows for stacking is the best strategy.
1. Stacking the advertising data
When stacking, rather than having a single data file for the tracking project, there are instead two files. One file will contain:
- An ID variable.
- The advertising data in a stacked format (e.g., if a respondent evaluates three campaigns then there are three rows of data for that respondent in the data file).
- A categorical variable indicating the advertising campaign.
The other file will contain all the other data and an ID variable that can be matched with the stacked data. The data is then analyzed in Displayr using a Data File Relationship with different advertising campaigns automatically analyzed by using filters.
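The two-file layout can be sketched in Python as follows. This is an illustration of the data shape only, not of how Displayr or a data collection program produces it. It assumes a wide export in which each advertising slot has variables named with a prefix such as `ad1_` or `ad2_`. Those prefixes, the function name `split_and_stack`, and the sample campaigns are all hypothetical.

```python
def split_and_stack(respondents, slots, measures):
    """Split wide rows into a main file and a stacked advertising file."""
    prefixes = tuple(f"{slot}_" for slot in slots)
    main, stacked = [], []
    for resp in respondents:
        # Main file: the ID plus everything that is not advertising data.
        main.append({k: v for k, v in resp.items() if not k.startswith(prefixes)})
        # Stacked file: one row per campaign the respondent evaluated.
        for slot in slots:
            campaign = resp.get(f"{slot}_campaign")
            if campaign is None:
                continue  # this respondent did not evaluate an ad in this slot
            row = {"ID": resp["ID"], "campaign": campaign}
            for measure in measures:
                row[measure] = resp.get(f"{slot}_{measure}")
            stacked.append(row)
    return main, stacked


# Hypothetical wide data: respondent 1 saw two campaigns, respondent 2 saw one.
respondents = [
    {"ID": 1, "age": 34, "ad1_campaign": "Summer", "ad1_recall": 1,
     "ad2_campaign": "Winter", "ad2_recall": 0},
    {"ID": 2, "age": 51, "ad1_campaign": "Summer", "ad1_recall": 0},
]
main, stacked = split_and_stack(respondents, ["ad1", "ad2"], ["recall"])
print(main)
print(stacked)
```

The stacked file has three rows for two respondents, and the shared ID variable is what allows the two files to be matched.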
2. New variables for each new campaign
From a data file creation perspective, the simplest approach is usually to have new variables for each campaign. However, this is often the least helpful way of setting up the data as it complicates the analysis and there is no straightforward way of automating the updating of any reporting (i.e., updating of PowerPoint, Excel, Word, or Dashboards) if this approach is used.
3. Re-using variables
With this approach, if one ad is taken out of a tracker and replaced with another, the data for the new ad is exported in the same variables as the old ad, with another variable then used to create filters. This ensures that any work done in Displayr is automatically re-applied.
When this is done it is generally prudent to include a variable in the data file which flags the campaign that the data relates to.
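The role of the flag variable can be shown with a short Python sketch. It assumes a single re-used variable, here called `ad_recall`, and a flag variable called `campaign`. Both names, the function `recall_by_campaign`, and the data are hypothetical. The same calculation is applied to every campaign, with the flag acting as the filter.

```python
from collections import defaultdict


def recall_by_campaign(rows):
    """Return the proportion recalling the ad, split by the campaign flag."""
    totals = defaultdict(lambda: [0, 0])  # campaign -> [sum of recall, row count]
    for row in rows:
        entry = totals[row["campaign"]]
        entry[0] += row["ad_recall"]
        entry[1] += 1
    return {campaign: total / n for campaign, (total, n) in totals.items()}


# Hypothetical data: campaign "B" replaced campaign "A" in the same variable.
rows = [
    {"campaign": "A", "ad_recall": 1},
    {"campaign": "A", "ad_recall": 0},
    {"campaign": "B", "ad_recall": 1},
    {"campaign": "B", "ad_recall": 1},
    {"campaign": "B", "ad_recall": 0},
    {"campaign": "B", "ad_recall": 1},
]
results = recall_by_campaign(rows)
print(results)
```

Without the flag, the values for the two campaigns would be indistinguishable in the re-used variable, and any result would blend them together.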