For various reasons, as outlined in Find the Best Possible Data File, ideally you would not be importing your data using an Excel file. However, in reality sometimes this is unavoidable. This article describes how to format your Excel file (.xlsx or .csv) to aid Displayr in correctly importing and setting up your data. If not already done so, please review
- A Displayr document.
- An Excel data set.
Method - Before importation
1. All of your Excel data should be in one worksheet. Displayr won’t be able to use data in multiple worksheets. Only the first worksheet in the file will be imported.
2. Each variable from your survey should have its own column (e.g., Age is in one column, Income is in another).
3. The first row of the Excel file should have the names for each variable. Ensure:
- Headings begin with letters or words, not numbers or symbols (#!?&)
- Headings describe what is in each column (e.g., Age or D1 - Age rather than just D1). This will make things a lot easier when the data is in Displayr.
- Not to name a heading one of R's common functions. Using one may conflict with some of our automations and cause unintended results. These names include: list, length, exists, assign, names, rep, any, stop, min, max, matrix, attr, sum.
4. Ensure there is only a single header row.
5. For multiple-response questions or grid-style questions, use a common prefix in the header row label so as to tell Displayr to group these together as the same variable set.
6. Each person’s responses should be contained in a separate row with a unique identifier column included.
7. Where possible, ensure that categories are stored as labels rather than numbers (e.g. use Male and Female instead of 1 and 2 in the column for the gender question).
8. You should not have any merged cells in the table.
9. When you have a multiple response question, the data should be stored as one column for each of the possible options that were shown in the survey. Be sure to use the same two labels for selected and not selected for each of the options (Aware and Not Aware below). Avoid storing multiple responses within a single column, as this will make the data very difficult to analyze in Displayr.
10. When a respondent has not given an answer to a question or skipped the question, then they should have an entry of NA in that cell. When working with categories, other kinds of responses that indicate the respondent saw the question but did not give an answer are generally best given a label of Don’t know or whatever phrasing matches your questionnaire.
Method - After importation
1. When Displayr sees a column full of text in the Excel file, it needs to make a decision about whether the text in the column represents a set of category labels or it represents open-ended responses. Displayr makes an educated guess based on the number of unique responses in the column, and the lengths of the responses. Sometimes Displayr can guess incorrectly, particularly if, for example, you have a really large brand list.
- In these cases, you can select your variable(s) in the Data Sets tree and go to Properties > GENERAL > Structure in the object inspector to change the data format. For example, changing a variable from Nominal to Numeric will display an average instead of percentages, and changing Text to Nominal will display category percentages instead of raw text.
2. When you import categorical data as labels from Excel, it does not include any metadata to tell Displayr what values to assign each category. Values are therefore automatically assigned based on alphabetical order.
- If you wish to amend category values or labels, you can select your variable(s) in the Data Sets tree and go to Properties > DATA VALUES > Values in the object inspector.
- You can additionally edit variable labels directly in the Data Sets tree by right-clicking a variable and selecting Rename.
3. If variables are meant to be grouped, as in a multiple-response question or grid, you can group them as follows:
- Hold down your Ctrl key and from the Data Sets tree, select each of the variables that you want to group.
- Right-click > Combine and the variables will be grouped together into a single variable set.
- Expand the combined variable set in the Data Sets tree to view the individual variables.