This article describes how to delete observations, also known as cases or respondents, from a data set. A great feature of Displayr is that manual changes you make to the data set in Displayr are not applied to the raw data file that was imported. This allows you to revert any changes you make in Displayr without worrying about losing data or corrupting your raw data file.
Requirements
- A Displayr document with an imported data file in your Data Sources.
- Whenever you need to manually edit raw data values or delete cases, you must set a Unique identifier for the data set (see Unique Identifier ID Variable in our Data Story Guide). This allows Displayr to map each edit to the specific case it was made on. Select your data set in the Data Sources and, in the object inspector, set the Unique identifier to your unique ID variable or [Use case number]. See Further Considerations below if you're unsure which to select.
Method - Removing cases from the raw data using an existing filter
Filters enable you to select multiple records at once for deletion. You can create filters based on logic using our filtering tool, or use R to pick out specific records or apply more complex criteria.
- Select the name of your data set from the Data Sources.
- In the object inspector, go to General > Unique identifier and select a variable with unique values, or select [Use case number].
- Select any variables in Data Sources you wish to view as raw data, then right-click > View in Data Editor.
- Select your filter variable in the Filter dropdown so that all the rows selected by the filter appear in green. Here, we have applied our $200,001 or more filter while showing Income (D2) data:
- Right-click the row header > Delete Row(s) Matching Filter to delete these cases from your data set.
- OPTIONAL: You can return deleted cases to your data set by going back to the Data Editor and right-clicking the row header > Undelete All Rows.
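The steps above can be pictured as a non-destructive overlay: deletions are recorded against each case's unique identifier, and the imported rows themselves are left alone. The Python sketch below only illustrates that idea and is not Displayr's implementation. The record layout, the `respondent_id` field, the income categories other than $200,001 or more, and the helpers `delete_matching`, `visible_rows`, and `undelete_all` are all hypothetical.

```python
# Illustrative only: models deletions as an overlay kept apart from the raw rows.
# The field names, sample values, and helpers are hypothetical, not part of Displayr.

raw_rows = [
    {"respondent_id": 101, "income": "$50,001 to $100,000"},
    {"respondent_id": 102, "income": "$200,001 or more"},
    {"respondent_id": 103, "income": "$100,001 to $200,000"},
    {"respondent_id": 104, "income": "$200,001 or more"},
]

deleted_ids = set()  # the overlay: unique IDs of cases marked as deleted


def delete_matching(rows, predicate, deleted):
    """Mark every row matching the filter as deleted; the rows are not modified."""
    for row in rows:
        if predicate(row):
            deleted.add(row["respondent_id"])


def visible_rows(rows, deleted):
    """Return the rows that have not been marked as deleted."""
    return [row for row in rows if row["respondent_id"] not in deleted]


def undelete_all(deleted):
    """Restore every deleted case by clearing the overlay."""
    deleted.clear()


def high_income(row):
    """The filter: True for cases in the top income bracket."""
    return row["income"] == "$200,001 or more"


# Delete the cases matching the filter, then list what remains.
delete_matching(raw_rows, high_income, deleted_ids)
remaining_ids = [row["respondent_id"] for row in visible_rows(raw_rows, deleted_ids)]

# Undeleting clears the overlay, so every original case is visible again.
undelete_all(deleted_ids)
restored_count = len(visible_rows(raw_rows, deleted_ids))
```

Because the sketch stores only identifiers in `deleted_ids`, undoing the deletion is just a matter of clearing that set. `raw_rows` is never modified at any point.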
Method - Removing cases from the raw data using the raw data editor
- Select the name of your data set from the Data Sources.
- In the object inspector, go to General > Unique identifier and select a variable with unique values, or select [Use case number].
- Select one or more variables (or combined variable set) in the Data Sources, then right-click and select View in Data Editor.
- To delete an individual row, click the row header and select Delete Row(s).
- To delete multiple rows, Ctrl-click each row individually to add it to the selection, or Shift-click to select a range of rows, then right-click a selected row and select Delete Row(s).
- OPTIONAL: You can return deleted cases to your data set by going back to the Data Editor and right-clicking the row header > Undelete All Rows.
Further Considerations
The unique identifier you choose for your data set determines how edits are mapped to cases. You can use either a unique identifier variable from your raw data file or the case number (row number).
The best option is to use a variable from your raw data file that uniquely identifies each case/respondent in the data. This ensures Displayr applies each edit to the correct case no matter where that case appears in the raw data file, even if the file is updated at a later date.
[Use case number] is also an option if you don't have a unique identifier variable in the raw data. With this option, edits are mapped to the data based on the row of the data file. This means that if your raw data file changes in the future, you need to ensure all of the previous responses are in the same order as in the original file. Otherwise, if respondents are on a different row of the updated data, the edits will be mapped to the incorrect respondent.
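The difference between the two options can be seen in a small Python sketch. It is a simplified illustration under stated assumptions, not a description of how Displayr stores edits. The respondent IDs, the updated file, and the helpers `apply_by_id` and `apply_by_case_number` are all hypothetical.

```python
# Illustrative only: compares a deletion keyed by a unique ID with the same
# deletion keyed by case number, after the raw file is updated.
# All names and values here are hypothetical.

# Respondent IDs in row order; respondent r102 is on row 2 and gets deleted.
original_file = ["r101", "r102", "r103", "r104"]

deleted_by_id = {"r102"}       # deletion recorded against the unique ID
deleted_by_case_number = {2}   # the same deletion recorded against the 1-based row

# The updated file has a new respondent at the top, which shifts every row down.
updated_file = ["r100", "r101", "r102", "r103", "r104"]


def apply_by_id(ids, deleted):
    """Keep every respondent whose ID was not marked as deleted."""
    return [rid for rid in ids if rid not in deleted]


def apply_by_case_number(ids, deleted_rows):
    """Keep every respondent whose 1-based row number was not marked as deleted."""
    return [rid for row, rid in enumerate(ids, start=1) if row not in deleted_rows]


kept_by_id = apply_by_id(updated_file, deleted_by_id)
kept_by_case_number = apply_by_case_number(updated_file, deleted_by_case_number)
```

In this sketch, `kept_by_id` still excludes r102. `kept_by_case_number` instead drops r101, the respondent who now occupies row 2, and r102 reappears in the data.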
Next
How to Tag a Variable as a Filter
How to Delete Variables from Your Data Set
How to Filter Cases in the Data Editor Using R