This article describes how to link different data sets in a single document. Linking lets you crosstab questions from two different data files, provided those files have a data file relationship that tells Displayr how their observations relate to each other. The link connects the respondents/observations that are common to both data sets, so that when you analyze them, the appropriate value(s) from each data set line up for the corresponding observations.
Linking data sets is similar to merging their variables together. However, it should ideally not be used as a substitute for merging new variables into your data (see How to Merge Files by Variable (Add New Variables)), because there are some limitations to working with variables across data sets: they cannot be used in the same banner, and a filter can only be applied to an output that uses data from the same data set as the filter.
Linking data sets is commonly done when you have unstacked data and want to analyze it against stacked data from the same survey, or when you have related data from two different data sources. It should not be used to analyze wave-on-wave data; for that, append the new waves to the same data file instead (see Tracking Study Best Practices and How to Merge Files by Case (Add New Cases)).
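To make the stacked/unstacked case concrete, here is a minimal sketch in plain Python of what a link conceptually does. It is not Displayr's implementation; the two small data sets, the field names (`id`, `gender`, `brand`, `rating`) and the helper functions are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical unstacked file: one row per respondent.
respondents = [
    {"id": 1, "gender": "Female"},
    {"id": 2, "gender": "Male"},
    {"id": 3, "gender": "Female"},
]

# Hypothetical stacked file: one row per respondent per brand.
ratings = [
    {"id": 1, "brand": "A", "rating": 4},
    {"id": 1, "brand": "B", "rating": 2},
    {"id": 2, "brand": "A", "rating": 5},
    {"id": 2, "brand": "B", "rating": 3},
    {"id": 3, "brand": "A", "rating": 2},
    {"id": 3, "brand": "B", "rating": 4},
]


def link_one_to_many(left_rows, right_rows, key):
    """Attach the single matching left row to every right row with the same key."""
    left_by_key = {row[key]: row for row in left_rows}
    linked = []
    for right in right_rows:
        left = left_by_key.get(right[key])
        if left is not None:  # rows without a match are simply skipped here
            linked.append({**left, **right})
    return linked


def mean_by(rows, row_var, col_var, value_var):
    """Average value_var within each (row_var, col_var) cell."""
    totals = defaultdict(lambda: [0.0, 0])
    for row in rows:
        cell = totals[(row[row_var], row[col_var])]
        cell[0] += row[value_var]
        cell[1] += 1
    return {cell: total / n for cell, (total, n) in totals.items()}


linked = link_one_to_many(respondents, ratings, "id")
table = mean_by(linked, "gender", "brand", "rating")
```

Once the rows are lined up on `id`, a variable from the unstacked file (`gender`) can be tabulated against variables from the stacked file (`brand`, `rating`), which is the kind of crosstab a link makes possible.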
Requirements
- At least two data sets loaded in your Data Sets tree in Displayr, with a variable in both files that can be used to match the data in one data set to the data in the other. Normally this is some sort of ID variable.
- The variables used to link the data sets must have the same Structure (text, categorical, numeric, etc.) in both data sets. If using a categorical/Nominal variable, the underlying values will be used to do the linking and not the category labels.
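If you want to sanity-check candidate linking variables outside Displayr before loading the files, something like the following sketch may help. It is an illustration only: the `key_kind` and `check_link_keys` helpers and the sample data are hypothetical, and the check is far simpler than whatever validation Displayr itself performs.

```python
def key_kind(values):
    """Classify a list of key values as 'numeric', 'text', 'mixed' or 'empty'."""
    kinds = set()
    for value in values:
        if value is None:
            continue  # ignore missing keys
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            kinds.add("numeric")
        elif isinstance(value, str):
            kinds.add("text")
        else:
            kinds.add("other")
    if not kinds:
        return "empty"
    return kinds.pop() if len(kinds) == 1 else "mixed"


def check_link_keys(left_values, right_values):
    """Report whether two key columns share a type and which values they share."""
    left_kind = key_kind(left_values)
    right_kind = key_kind(right_values)
    compatible = left_kind == right_kind and left_kind in ("numeric", "text")
    shared = set()
    if compatible:
        shared = {v for v in left_values if v is not None} & set(right_values)
    return {
        "left_kind": left_kind,
        "right_kind": right_kind,
        "compatible": compatible,
        "shared": shared,
    }


# Text IDs in one file and numeric IDs in the other do not line up.
mismatch = check_link_keys(["001", "002"], [1, 2])

# Hypothetical categorical variable: the same underlying values (1, 2)
# carry different labels in each file.
region_a = {"values": [1, 2, 2], "labels": {1: "North", 2: "South"}}
region_b = {"values": [2, 1], "labels": {1: "N", 2: "S"}}
by_value = check_link_keys(region_a["values"], region_b["values"])
shared_labels = set(region_a["labels"].values()) & set(region_b["labels"].values())
```

In the categorical example the two files share the values 1 and 2 but no labels, so a match based on underlying values succeeds where a label-based comparison would find nothing in common.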
Method
1. Select any data source folder name in your Data Sets tree.
2. In the object inspector, click Edit relationships. The dialog that opens shows a list of the current data file relationships.
3. Click New to add a new relationship.
4. Select the names of each data set to link.
5. Set the variable that appears in both data sets to match on.
6. Select the appropriate Relationship type.
- One to one is where each single value in the left variable matches exactly to a single value in the right variable.
- One to many is where a single value in the left variable matches multiple values in the right variable, for example, a stacked file.
- Many to one is where multiple values in the left variable match a single value in the right variable. This is the same type of relationship as one to many, with left and right sides swapped.
- Many to many is where multiple values in the left variable match multiple values in the right variable, resulting in Data Fusion.
7. Choose what to do When a value is not found in the other data file.
- Exclude respondents from the matched data: If a respondent's value in the left variable cannot be found in the right variable (or the other way round), the respondent is excluded from the sample.
- Insert missing values into the matched data: If a respondent's value in the left variable cannot be found in the right variable (or the other way round), the respondent is included in the sample as missing data (NaN) rather than their actual response data.
- Show a warning message: When a respondent's value in the left variable cannot be found in the right variable (or the other way round), a warning is shown and you will not be able to proceed until you either fix the data or come back to this screen and select another option.
8. OPTIONAL: If matching on a date variable, you can use Match dates that fall in the same to group dates by year, month, week or day.
9. When the relationship between the files is Many to many, you may choose which data file is the Recipient.
10. Press OK to save the relationship.
11. OPTIONAL: To delete a relationship, move the mouse over the relationship, and then click on the Delete button which appears to the right.
12. OPTIONAL: To edit a relationship, move the mouse over the relationship, and then click on the Edit button which appears to the right.
13. Press OK to return to your document.
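The choices in steps 6 to 8 can be easier to reason about with a small model. The sketch below, in plain Python, infers a relationship type from two key columns, mimics the three options for values that are not found in the other file, and groups dates before matching. It is an assumption-laden illustration rather than Displayr's actual logic: the function names and option strings (`"exclude"`, `"missing"`, `"warn"`) are hypothetical, missing data is represented with `None` rather than NaN, weeks are assumed to be ISO weeks, and `link_rows` only handles left keys that are unique.

```python
from collections import Counter
from datetime import date


def relationship_type(left_keys, right_keys):
    """Name the relationship implied by repeated values on each side."""
    left_repeats = any(n > 1 for n in Counter(left_keys).values())
    right_repeats = any(n > 1 for n in Counter(right_keys).values())
    if left_repeats and right_repeats:
        return "Many to many"
    if left_repeats:
        return "Many to one"
    if right_repeats:
        return "One to many"
    return "One to one"


def unmatched_keys(left_keys, right_keys):
    """Return (values only in left, values only in right), each sorted."""
    left, right = set(left_keys), set(right_keys)
    return sorted(left - right), sorted(right - left)


def link_rows(left_rows, right_rows, key, when_not_found="exclude"):
    """Link rows on key; assumes left keys are unique (one to one / one to many)."""
    left_only, right_only = unmatched_keys(
        [row[key] for row in left_rows], [row[key] for row in right_rows]
    )
    if when_not_found == "warn" and (left_only or right_only):
        raise ValueError(f"Unmatched values: left={left_only}, right={right_only}")
    right_by_key = {}
    for row in right_rows:
        right_by_key.setdefault(row[key], []).append(row)
    left_fields = [f for f in left_rows[0] if f != key]
    right_fields = [f for f in right_rows[0] if f != key]
    linked = []
    for left in left_rows:
        matches = right_by_key.get(left[key], [])
        if matches:
            linked.extend({**left, **right} for right in matches)
        elif when_not_found == "missing":
            # Keep the row, with None standing in for the other file's data.
            linked.append({**left, **{f: None for f in right_fields}})
    if when_not_found == "missing":
        for right in right_rows:
            if right[key] in right_only:
                linked.append({**{f: None for f in left_fields}, **right})
    return linked


def date_match_key(d, unit):
    """Reduce a date to the period used for matching (ISO weeks assumed)."""
    if unit == "year":
        return (d.year,)
    if unit == "month":
        return (d.year, d.month)
    if unit == "week":
        iso = d.isocalendar()
        return (iso[0], iso[1])
    if unit == "day":
        return (d.year, d.month, d.day)
    raise ValueError(f"Unknown unit: {unit}")


# Hypothetical data: ID 3 appears only on the left, ID 4 only on the right.
left = [{"id": 1, "age": 30}, {"id": 2, "age": 41}, {"id": 3, "age": 25}]
right = [
    {"id": 1, "brand": "A"},
    {"id": 1, "brand": "B"},
    {"id": 2, "brand": "A"},
    {"id": 4, "brand": "B"},
]

kind = relationship_type([r["id"] for r in left], [r["id"] for r in right])
excluded = link_rows(left, right, "id", when_not_found="exclude")
with_missing = link_rows(left, right, "id", when_not_found="missing")
same_month = date_match_key(date(2024, 1, 3), "month") == date_match_key(
    date(2024, 1, 28), "month"
)
```

With this data the left keys are unique and ID 1 repeats on the right, so the relationship is one to many. Excluding unmatched respondents keeps only the rows for IDs 1 and 2, while inserting missing values also keeps IDs 3 and 4 with the other file's fields left empty.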
Next
How to Copy Variables Between Linked Data Sets