During data cleaning, visualizing missing data across variables helps you identify the variables that need a closer look. One approach is to visualize missing data by case. This is especially helpful when cleaning trackers, where data is usually organized by wave: you can see whether specific variables or questions were introduced in certain waves, and whether a variable was replaced by another variable partway through the study. This article describes how to create an interactive chart showing the missing data by case in the data file. Each column represents a different variable, and the blue shading indicates missing values.
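To make the idea concrete, the sketch below builds the same kind of information the chart displays, using plain Python rather than Displayr. It is only an illustration: the small tracker data set, the variable names (`q1`, `q2_old`, `q2_new`), and the helper `missing_case_indices` are all hypothetical, and this is not how Displayr computes the chart internally.

```python
# Illustrative only: a tiny tracker with three waves. None marks a missing value.
# The variable names (q1, q2_old, q2_new) are hypothetical.
cases = [
    {"wave": 1, "q1": 5, "q2_old": 3, "q2_new": None},
    {"wave": 1, "q1": 4, "q2_old": None, "q2_new": None},
    {"wave": 2, "q1": 2, "q2_old": 1, "q2_new": None},
    {"wave": 3, "q1": 3, "q2_old": None, "q2_new": 4},
    {"wave": 3, "q1": None, "q2_old": None, "q2_new": 5},
]


def missing_case_indices(cases, variables):
    """Return {variable: [1-based indices of the cases where it is missing]}."""
    return {
        var: [i for i, case in enumerate(cases, start=1) if case.get(var) is None]
        for var in variables
    }


missing = missing_case_indices(cases, ["q1", "q2_old", "q2_new"])

# Print one row of text per variable: '#' = missing, '.' = observed.
for var, idx in missing.items():
    pattern = "".join("#" if i in idx else "." for i in range(1, len(cases) + 1))
    print(f"{var:>7} {pattern}")
```

In this made-up example, `q2_new` is missing for every case before wave 3 while `q2_old` is missing for every wave 3 case, which is the kind of pattern that suggests one question replaced another.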
Requirements
A Displayr document with a data set.
Method
- Insert a blank visualization, click Anything > Data > Missing Data > Plot by Case.
- In the object inspector, under Inputs > DATA SOURCE > Variables, select the variables you want to visualize (or drag them from the Data Sets tree into the green Variables box that appears on the visualization).
- [Optional]: Modify settings on the Inputs tab to add/remove information, including:
  - Variable names - Show variable names instead of labels.
  - Show hover text - Show index of missing cases on hover. This may be quite slow if the number of missing cases is large.
  - Show number of cases missing - Include the number of cases missing in the X axis labels.
  - Show percentage of cases missing - Include the percentage of data missing in the X axis labels.
- [Optional]: Add labels and change other formatting on the Chart tab, including:
  - Fill color - Color used to display missing cases.
  - Background color - Color used to display non-missing cases.
  - Global font family - Font family used in all of the text elements in the chart. This can be overridden for specific text elements in the controls below.
  - Global font color - Font color used in all of the text elements in the chart. This can be overridden for specific text elements in the controls below.
  - Show data labels - Show index of missing cases as data labels on the chart. This is useful when hover is not available. However, it is often not useful if the number of missing cases is large, because the labels will overlap.
  - Data label background opacity - By default, data labels have a background with an opacity of 0.5 so that they can be easily seen above the chart lines. However, if there are only a few cases, so that the filled areas look more like rectangles, it may be better to set the opacity to 0.
  - Title - Optional title to show on the chart.
  - Subtitle - Optional subtitle to show on the chart.
  - Footer - Optional footer to show on the chart.
  - X axis label orientation - Whether to show the labels horizontally or vertically. If this is set to automatic, labels will be placed horizontally if there are fewer than 10 variables and vertically otherwise.
  - Wrap X axis label - Whether labels should be wrapped across multiple lines if they are too long.
  - X axis label width (in characters) - Maximum number of characters before a line is wrapped.
  - Customize margins - Select this option to manually specify margin sizes (in pixels).
- Click Calculate if the item doesn't automatically calculate.
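Several of the settings above interact when the X axis labels are built: the missing count, the missing percentage, the automatic orientation rule, and label wrapping. The Python sketch below illustrates that logic under stated assumptions. The function names `axis_label` and `label_orientation` are hypothetical, and the exact label text is an assumption made for illustration; only the "fewer than 10 variables" rule and the character-based wrap width come from the descriptions above. It is not Displayr's implementation.

```python
import textwrap


def axis_label(name, n_missing, n_cases, show_count=True, show_percent=True, wrap_width=None):
    """Build an X axis label, optionally adding missing counts and percentages.

    The label wording is an assumption for illustration only.
    """
    parts = []
    if show_count:
        parts.append(f"{n_missing} cases missing")
    if show_percent and n_cases > 0:
        parts.append(f"{100 * n_missing / n_cases:.0f}%")
    label = f"{name} ({', '.join(parts)})" if parts else name
    if wrap_width:
        # Break the label into lines of at most wrap_width characters.
        label = "\n".join(textwrap.wrap(label, width=wrap_width))
    return label


def label_orientation(n_variables, setting="automatic"):
    """Automatic: horizontal for fewer than 10 variables, vertical otherwise."""
    if setting != "automatic":
        return setting
    return "horizontal" if n_variables < 10 else "vertical"


print(axis_label("q2_new", 3, 5))
print(label_orientation(9), label_orientation(10))
```

Passing `wrap_width=12`, for example, breaks a long label onto several lines of at most 12 characters, which mirrors what the X axis label width setting controls.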
Next
How to Create a Filter for Complete Cases
How to Impute Missing Data in Displayr