This article describes how to delete observations, also known as cases or respondents, from a data set. A great feature of Displayr is that manual changes you make to a data set in Displayr are not applied to the raw data file that was imported. This lets you revert any changes you make in Displayr without the worry of losing data or messing up your raw data file.
Displayr uses a unique identifier to track which changes were made to which records, so you cannot edit or delete data in the Raw Data panel until you have selected one. If you later update your data set, Displayr looks up the IDs that had manual changes and reapplies those changes to the updated data, so your manual edits are not lost.
Requirements
- A Displayr document with an imported data file in your Data Sources.
- Whenever you need to manually edit raw data values or delete cases, you must first set a Unique identifier for the data set (see Unique Identifier ID Variable in our Data Story Guide). This allows Displayr to map each edit to the specific case it was made to. Select your data set in the Data Sources and, in Properties, set the Unique identifier to your unique ID variable or [Use case number]. See Further Considerations below if you're unsure which to select.
Method - Removing cases from the raw data using an existing filter
Filters enable you to select multiple records at once for deletion. You can create filters based on logic using our filtering tool, or use R to pick out specific records or apply more complex criteria.
- Select the name of your data set from the Data Sources.
- In Properties, go to General > Unique identifier, and select a variable with unique values, or select [Use case number].
- Select any variables in Data Sources you wish to view as raw data, then go to Properties and select Raw Data from the Data tab.
- Select your filter variable in the Filter dropdown so that all the rows selected by the filter will appear in green. Here, we have applied our $200,001 or more filter while showing Income (D2) data:
- Right-click a row header matching the filter and select Delete Row(s) Matching Filter to delete these cases from your data set.
- OPTIONAL: You can return deleted cases to your data set by going back to the Raw Data panel, right-clicking the row header, and selecting Undelete All Rows.
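The idea behind these steps can be pictured with a minimal Python sketch. This is an illustration only, not Displayr's implementation: the rows, the `id` field, the income categories other than "$200,001 or more", and the `visible_rows` helper are all hypothetical. It shows a filter as one true/false value per case, deletion as a record of IDs kept separately from the raw rows, and undeleting as clearing that record.

```python
# Hypothetical sample data; the raw rows are never modified below.
raw_rows = [
    {"id": 101, "income": "$200,001 or more"},
    {"id": 102, "income": "$50,001 to $100,000"},
    {"id": 103, "income": "$200,001 or more"},
    {"id": 104, "income": "Less than $50,000"},
]

# A filter is one boolean per row: True where the case matches.
high_income = [row["income"] == "$200,001 or more" for row in raw_rows]

# "Delete Row(s) Matching Filter": remember the IDs of the matching
# rows instead of removing anything from raw_rows.
deleted_ids = {row["id"] for row, hit in zip(raw_rows, high_income) if hit}


def visible_rows(rows, deleted):
    """Return the rows whose ID has not been marked as deleted."""
    return [row for row in rows if row["id"] not in deleted]


remaining = visible_rows(raw_rows, deleted_ids)

# "Undelete All Rows": with an empty deletion record, every case is back.
restored = visible_rows(raw_rows, set())

print([row["id"] for row in remaining])  # [102, 104]
print(len(restored))                     # 4
```

Because the deletion is stored alongside the data rather than in it, the raw rows stay intact throughout.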
Method - Removing cases from the raw data using the Raw Data panel
- Select the name of your data set from the Data Sources.
- In Properties, go to General > Unique identifier, and select a variable with unique values, or select [Use case number].
- Select one or more variables (or a combined variable set) in the Data Sources, go to Properties, and select Raw Data from the Data tab.
- To delete an individual row, right-click the row header and select Delete Row(s).
- To delete multiple rows, Ctrl-click each row individually to add them, or Shift-click to select a range of rows, and then right-click in a selected row and select Delete Row(s).
- OPTIONAL: You can return deleted cases to your data set by going back to the Raw Data panel, right-clicking the row header, and selecting Undelete All Rows.
Further Considerations
The unique identifier you choose for your data set has implications for how edits are tracked. You can use either a unique identifier variable from your raw data file or the case number (row number) to map edits to the data.
The best option is to use a unique variable from your raw data file that will identify each case/respondent in the data. This ensures that Displayr makes the appropriate edits to that specific response/case, no matter where it appears in the raw data file (even if your raw data file is updated at a later date).
[Use case number] is also an option if you don't have a unique identifier variable in the raw data. This option maps edits to the data based on the row of the data file. This means that if your raw data file changes in the future, you need to ensure that all previous responses remain in the same order as in the original file. Otherwise, if respondents are on a different row of the updated data, the edits will be mapped to the incorrect respondent.
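The difference between the two options can be seen in a small Python sketch. It is a simplified model under stated assumptions, not Displayr's actual logic: the respondent IDs and the `remaining_ids` helper are hypothetical. One deletion is recorded, the file is then updated with a new respondent inserted at the top, and the deletion is reapplied first by ID and then by case number.

```python
def remaining_ids(rows, deleted_keys, key):
    """Return the IDs of rows that survive the recorded deletions.

    key="id" matches deletions against the ID variable;
    key="case" matches against the 1-based row position (case number).
    """
    kept = []
    for case_number, row in enumerate(rows, start=1):
        lookup = row["id"] if key == "id" else case_number
        if lookup not in deleted_keys:
            kept.append(row["id"])
    return kept


# Original file: respondent B sits on row 2 and is deleted by the user.
original = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
deleted_by_id = {"B"}    # deletion recorded against the ID variable
deleted_by_case = {2}    # the same deletion recorded against the case number

# Updated file: a new respondent Z is inserted first, so B moves to row 3.
updated = [{"id": "Z"}, {"id": "A"}, {"id": "B"}, {"id": "C"}]

print(remaining_ids(updated, deleted_by_id, key="id"))      # ['Z', 'A', 'C']
print(remaining_ids(updated, deleted_by_case, key="case"))  # ['Z', 'B', 'C']
```

With the ID variable, respondent B stays deleted wherever it appears. With the case number, row 2 is deleted, which now removes respondent A and brings B back.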
Additionally, cases that are deleted in Displayr are not included when you export the data set. If you have lots of unused cases in your data set, you can delete them in a separate data preparation document, export the data set, and then update your report with the exported file. This removes those cases from the main reporting document, making your raw data set smaller and your document run more efficiently. Note that if you do this, you should make sure you aren't using [Use case number] as your Unique identifier, as the raw case numbers will change.
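To see why case numbers change after such an export, consider this short Python sketch. The respondent labels and deleted case numbers are made up for illustration, and the export is modelled simply as dropping the deleted rows.

```python
# Hypothetical respondents occupying case numbers 1 to 5.
rows = ["r1", "r2", "r3", "r4", "r5"]
deleted_cases = {2, 4}

# Modelled export: deleted cases are left out of the exported file.
exported = [r for n, r in enumerate(rows, start=1) if n not in deleted_cases]

# Case numbers in the exported file start again from 1.
new_case_numbers = {r: n for n, r in enumerate(exported, start=1)}

print(exported)          # ['r1', 'r3', 'r5']
print(new_case_numbers)  # {'r1': 1, 'r3': 2, 'r5': 3}
```

Respondent r5 was case 5 in the original file but is case 3 in the exported one, so any edit recorded against case number 5 would no longer point at the same respondent.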
Next
How to Create Filters Using Variables in Your Data
How to Tag a Variable as a Filter