Once your data is in Displayr, you need to check that it is as you expect. Checking is typically an ongoing process, but the following basic checks are the minimum:
- Sample size
- Checking that the file only contains completed interviews
- Grouping of variables into variable sets
To review the sample size of a data file, click the data set in the Data Sets tree. The sample size is shown as the Number of cases in the object inspector on the right side of the page. In the example below, the sample size is 300.
Sometimes a data file contains incomplete interviews. It is standard practice to remove such interviews before performing any analysis. Some data files contain a variable that indicates which interviews were completed. Otherwise, the way to check is to create a table using the data from the last question in the study that was meant to be asked of everybody.
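To create a table of the last question in the study in Displayr, scroll down to the bottom of the Data Sets tree and drag that question's variable onto the page. The sample size will be shown in the footer at the bottom of the table. A sample size smaller than the Number of cases for the data set suggests that some interviews are incomplete.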
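The same two checks can also be sketched outside Displayr, for example on a CSV export before importing it. The snippet below is a minimal illustration, not part of Displayr: the data and the column names (`id`, `Q1`, `Q10`) are hypothetical, and `Q10` stands in for the last question that everybody was meant to answer.

```python
import csv
import io

# Hypothetical survey export: one row per respondent, one column per variable.
# Respondent 2 has no answer to Q10, the assumed last question asked of everybody.
RAW = """id,Q1,Q10
1,Yes,Agree
2,No,
3,Yes,Disagree
4,No,Agree
"""


def count_cases(rows):
    """Return the total number of rows (cases) in the file."""
    return len(rows)


def count_answered(rows, variable):
    """Return the number of rows with a non-blank value for the given variable."""
    return sum(1 for row in rows if (row.get(variable) or "").strip())


rows = list(csv.DictReader(io.StringIO(RAW)))
total = count_cases(rows)
completed = count_answered(rows, "Q10")

print(f"Number of cases: {total}")
print(f"Answered last question: {completed}")
if completed < total:
    print(f"{total - completed} interview(s) appear to be incomplete")
```

If the two counts differ, the file probably contains partial interviews that should be removed or investigated before analysis.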
Correct grouping of variables into variable sets
The underlying structure of a data set is a large table, where each row contains the data for one person who completed the survey, and each column represents some property of those people. These columns are commonly referred to as variables.
Displayr automatically groups related variables into variable sets. Often a variable set contains only a single variable, but it can contain several. Grouping is useful because Displayr lets you manipulate all the variables in a set at the same time, and summarizes them together when creating tables.
In the example below, the triangle to the left of Race tells us that it contains more than one variable (the triangle appears when you hover over the data set).
When you click on the triangle, the variable set is expanded, and we can see all the variables within it.
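To make the idea of a variable set concrete, here is a small sketch that groups column names sharing a common prefix. It is only an illustration of the concept: the column names are hypothetical, and grouping by prefix is an assumption made for this example, not a description of how Displayr decides which variables belong together.

```python
from collections import defaultdict

# Hypothetical column names; Race_1..Race_3 stand in for a multiple-response question.
columns = ["Age", "Gender", "Race_1", "Race_2", "Race_3"]


def group_by_prefix(names, sep="_"):
    """Group column names by the text before the first separator."""
    groups = defaultdict(list)
    for name in names:
        prefix = name.split(sep, 1)[0]
        groups[prefix].append(name)
    return dict(groups)


variable_sets = group_by_prefix(columns)

# Mark sets with more than one variable, much like the triangle in the tree.
for label, members in variable_sets.items():
    marker = ">" if len(members) > 1 else " "
    print(f"{marker} {label}: {', '.join(members)}")
```

Here `Age` and `Gender` each form a set of one variable, while `Race` holds three variables that would be manipulated and summarized together.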
If the data set passes these checks, the next step is to start analysis (see Creating Tables and Crosstabs in Displayr). Otherwise, you need to either:
- Obtain a better data file and import it.
- Clean the data.