If you’re using Excel’s VLOOKUP to match and update data before importing it into Displayr, you’ve probably run into these common problems:
- Variable labels, structure, and missing value settings are lost.
- Value and label order aren't maintained.
- Automatic alphabetical sorting can skew results.
- Misclassified data leads to inaccurate analysis.
- Manual rework that slows projects down.
- Increased risk of mistakes from copying and pasting.
By building a custom analysis tool (also called a Standard R widget), you can replicate the behavior of VLOOKUP directly within Displayr. This allows you to import data that contains metadata, such as variable labels and types, while also supporting advanced fuzzy matching to handle naming discrepancies or formatting differences.
This tool enables you to:
- Automate data matching using your reference tables.
- Match values exactly or approximately using fuzzy matching, even when there are variations in spelling, formatting, or naming conventions.
- Refine the matching process by adjusting similarity thresholds to control how closely values must align, as needed.
- Reuse the tool across projects by saving it as a Template.
- Automatically create new variables for reporting in the Data Sources tree.
For this guide, we’ll refer to the custom analysis tool as the Smart Lookup Tool, as it effectively recreates VLOOKUP functionality within Displayr.
The Smart Lookup Tool brings together multiple technologies to streamline and automate the matching process. It uses:
- JavaScript – for interactive input controls
- R Code – for lookup and fuzzy match logic
- QScript – for exporting matched variables to your dataset
Why it's better than Excel's VLOOKUP:
| Excel VLOOKUP | Smart Lookup Tool |
| Returns only a single column from the matched row, requiring additional steps to retrieve more information. | Returns multiple columns from the matched row, providing a more complete set of information in one step. |
| Limited to flat file formats (e.g., .xlsx, .csv) that do not retain metadata | Compatible with metadata-rich file formats (e.g., .sav, .mdd, Triple-S) |
| Applies an approximate match (default), or an exact match (specified by the FALSE argument) to find a matching value in the first column of a table. | Applies a step-by-step matching hierarchy, starting with exact matches and progressing to advanced fuzzy techniques to maximize match accuracy. |
| External to Displayr | Fully integrated in Displayr |
Follow this guide to:
- Build your custom analysis tool in Displayr
- Configure input controls to make it easy to use
- Automate your VLOOKUP process and integrate it directly into your analysis
- Perform troubleshooting
Once you’ve built your Smart Lookup Tool, you can reuse it as a template for future projects, helping you work smarter and faster every time.
Requirements
Before building your Smart Lookup Tool, ensure you’ve completed the following prerequisites:
- An R calculation created from the toolbar via Calculation
> Custom Code.
- You’ve read the guide on How to Create a Custom Analysis Tool.
- A Displayr Enterprise license, which allows you to add custom QScripts to the toolbar menu. If this feature isn’t currently enabled, you may need to contact your Account Manager or reach out to support@displayr.com to discuss upgrading to Displayr Enterprise.
- A data table used as a lookup reference that has been imported into your Displayr document. If you'd like to use example data, you can download the file here.
Step-by-Step Guide: How to Build a Smart Lookup Tool in Displayr
The user workflow is:
Step 1: Understand the Challenge and the Smart Solution
Step 2: Prepare Your Lookup Table
Step 3: Set Up Input Controls
Step 4: Understanding Fuzzy Matching
Step 5: Write the R Code to Perform the Lookup
Step 6: Review the Matched Data
Step 8: Run the QScript
Step 7: Create the QScript to Export Variables
Step 9: Save the Smart Lookup Tool as a Template for Reuse
Step 10: Troubleshooting
Step 1: Understand the Challenge and the Smart Solution
In Excel, VLOOKUP searches for a value in the first column of a reference table and returns a value from a specified column in the same row. However, it can only return a single value from that row, even when additional information is available.
The Smart Lookup Tool in Displayr expands on this concept. Instead of returning just one value, it:
- Allows you to choose the lookup value, reference table, and matching column.
- Automatically retrieves the entire matching row, combining all relevant columns.
- Offers greater flexibility while maintaining your data structure within Displayr.
Before setting up controls or writing code, it's essential to understand the Smart Lookup Tool's core functions. The tool is designed to:
- Automate the process of matching input data, like text responses or product codes, to a reference table.
- Support both exact and fuzzy matching to handle variations in spelling, formatting, or naming conventions.
- Return an entire row (or specific columns) from the reference table directly into your dataset, eliminating the need for Excel or manual VLOOKUP steps.
Understanding these functions helps you determine the necessary inputs, plan the structure of matched results, and decide how to handle cases when no match is found.
Step 2: Prepare Your Lookup Table
The lookup table acts as the reference dataset that the Smart Lookup Tool will match against — for example, a list of product codes, country names, or any other standardized values used to enrich or clean your main data.
Create a Lookup Table
- From the Data Sources tree, click Add Data or the + button.
- Expand Advanced data file options.
- Update the Data format to Data Set.
- Click Upload file and select the file that contains the data.
- A data set will appear in the Data Sources tree.
- From the toolbar, go to Table
> Raw Data > Variable(s).
- Click onto the page to place the table.
- From the object inspector
, go to Data > Variables and select the variables that you want to use from the data set uploaded at Step 4.
Once added, this table becomes available for selection in the input controls of your Smart Lookup Tool.
Step 3: Set Up Input Controls
In the JavaScript editor of your Displayr document, you’ll define a series of form inputs that control how the lookup is performed. These will appear as dropdowns or numeric fields to end users.
Recommended Inputs:
| Input Name | Description | Example Value |
|---|---|---|
formPrimaryKey |
Selects the ID or key column (e.g., respondent ID) used to track matched records. | RespondentID |
formLookUpValue |
Dropdown to select the column containing your input values. | "Brand Input" |
formTableArray |
Reference table containing lookup values and additional columns to return. | "MasterBrandList" |
formColumnIndex |
Numeric input to specify the column number in the reference table that contains the values to match against; all other columns will be included in the result once a match is found. | 2 |
formFuzzyToggle |
Checkbox or dropdown to enable/disable fuzzy matching. |
"Yes" or "No"
|
formFuzzyTextThreshold |
Numeric slider to adjust Jaro-Winkler similarity threshold—suitable for short string matches. |
0.88 (default) |
formTextSimilarityThreshold |
Numeric slider to adjust TF-IDF cosine similarity threshold—ideal for longer or complex strings. |
0.85 (default) |
formViewComboBox |
Option to preview data or create variables. |
"Review Matched Data" / "Add Variables to Dataset"
|
formExportColumns |
Leave blank to display all columns. Enter a number to display only the corresponding column. |
💡 Tip: Carefully plan your input controls to provide a smooth and user-friendly experience.
Consider:
- Which inputs should be adjustable versus fixed?
- How should controls be logically grouped?
- Should controls be visible at all times or appear conditionally?
- Which control type (e.g., dropdown, text box) suits each input?
Getting Started: Add Input Controls for the Smart Lookup Tool
To begin creating your Smart Lookup Tool in Displayr:
- Select the R Calculation you created earlier using Calculation
> Custom Code from the toolbar.
- To access the JavaScript code editor, either go to Data > Inputs JavaScript > Edit Code in the object inspector
, or directly in the code editor, use the drop-down menu (which defaults to R Code) and select Inputs JavaScript.
-
Add Interactive Input Controls: Open the JavaScript editor and paste in the provided code snippet to create input controls such as dropdowns, numeric fields, or checkboxes. These controls let users adjust Smart Lookup Tool settings directly from the interface—no need to edit the R code. The controls will be grouped logically for ease of use.
// Add Heading
form.setHeading("Smart Lookup Tool");
// Add a summary of what the tool does
form.setSummary("Key Features: Unique Identifier: A column containing unique values used to match against your Lookup Value. Lookup Value: What you're searching for (e.g., product ID, employee name). Table Array: The data range to search within. Column Index Number: The column number with the value to return.")
// Add Data Source Group
form.group("Data Source");
form.dropBox({name: "formPrimaryKey",
label: "Primary Key:",
types: ["Variable"],
multi: false,
height: 1,
prompt: "Select the column with unique keys to match (e.g., Respondent ID)"})
form.dropBox({name: "formLookUpValue",
label: "Lookup Value:",
types: ["Variable"],
multi: false,
height: 1,
prompt: "Select the Lookup Value: The value you're searching for (e.g., a product ID, employee name)"})
form.dropBox({name: "formTableArray",
label: "Table Array:",
types: ["RItem"],
multi: false,
height: 1,
prompt: "Select the Table Array: The table that contains the data to search through."})
form.textBox({name: "formColumnIndex",
label: "Column Index:",
type: "number",
default_value: "1",
multi: false,
required: false,
height: 1,
prompt: "Column Index Number: The column number within the reference table that contains the data you are matching on."})
var fuzzymatch = form.checkBox({
name: "formPerformFuzzy",
label: "Enable Fuzzy Matching",
default_value: true,
prompt: "Enable to allow flexible (fuzzy) matching for input values. When disabled, only exact and cleaned matches will be used."
})
if (fuzzymatch.getValue() == true) {
// Add section for advanced fuzzy matching thresholds
form.group("Fuzzy Matching Settings");
// Jaro-Winkler Similarity Threshold
form.numericUpDown({
name: "formFuzzyTextThreshold",
label: "Jaro-Winkler Similarity Threshold:",
default_value: 0.88,
increment: 0.01,
minimum: 0,
maximum: 1,
prompt: "Set the minimum Jaro-Winkler similarity score for matching. Higher values (closer to 1) require more exact matches; lower values allow looser matches. The scale ranges from 0 (no similarity) to 1 (exact match)."
});
// TF-IDF Cosine Similarity Threshold
form.numericUpDown({
name: "formTextSimilarityThreshold",
label: "TF-IDF Cosine Similarity Threshold:",
default_value: 0.85,
increment: 0.01,
minimum: 0,
maximum: 1,
prompt: "Set the minimum cosine similarity score for text matching using TF-IDF. Higher values enforce stricter matching. The scale ranges from 0 (no similarity) to 1 (perfect similarity)."
});
}
// Add third group
form.group("View Options");
var selectedview = form.comboBox({name: "formViewComboBox",
label: "Select Data View:",
alternatives: ["Review Matched Data", "Add Variables to Dataset"],
default_value: "Review Matched Data",
required: false,
expression: false,
prompt: "Choose whether to add the matched results as variables to your dataset or review them first in a detailed table."})
if (selectedview.getValue() == "Add Variables to Dataset") {
form.textBox({
name: "formExportColumns",
label: "Export Columns:",
type: "text",
required: false,
prompt: "Enter comma separated list of column indices or names to export"
});
}
Why Fuzzy Matching Matters
Real-world data often contains inconsistencies, such as differences in case, abbreviations, misspellings, or extra spaces. The Smart Lookup Tool uses fuzzy matching to resolve these issues, enhancing match accuracy even when values don’t align exactly.
Matching Hierarchy (Applied Automatically)
To replicate Excel’s VLOOKUP functionality in R, while maximizing match coverage, a tiered matching strategy is used. It begins with exact matching and then progressively applies more lenient fuzzy matching techniques. At each stage, only unmatched records continue to the next step, ensuring both high accuracy and efficient performance.
- Exact Match: A direct one-to-one match between your input and the reference column.
-
Cleaned Match: Ignores differences in case and spacing (e.g.,
"Apple "→"apple") -
Jaro-Winkler Similarity: Best suited for short strings (e.g., names, product codes).
Captures common typos, character swaps, or minor variations. Default threshold:0.88. -
TF-IDF + Cosine Similarity: Ideal for longer phrases or reworded text entries *(e.g.,
"Zoom Nike"vs"Nike Air Zoom"). Default threshold:0.85.
Note: Only unmatched records progress through each state of the hierarchy.
Threshold Tuning (Optional)
Each fuzzy match method calculates a similarity score between 0 and 1. You can adjust the thresholds to fine-tune how strict or flexible the matching should be:
| Score Range | Meaning |
|---|---|
1.0 |
Exact match |
0.88+ |
High similarity (reliable) |
0.70–0.87 |
Moderate (review suggested) |
< 0.70 |
Low similarity (likely incorrect) |
To customize these thresholds:
- Go to the object inspector
and navigate to Data > Fuzzy Matching Settings.
- These options are only visible when the Enable Fuzzy Matching checkbox is selected. If Enable Fuzzy Matching is not selected, only Exact and Cleaned matches will be applied.
- Increase the threshold to improve precision (stricter matches).
- Decrease the threshold to capture more potential matches (but with a higher risk of false positives).
Why These Defaults?
| Method | Default Threshold | Why? |
|---|---|---|
| Jaro-Winkler | 0.88 |
A sweet spot for detecting types and minor string variations whilst avoiding false positives |
| TF-IDF Cosine | 0.85 |
Commonly used for short-to-medium-length text when some reordering or synonyms may be involved |
The default thresholds used in the Smart Lookup Tool are heuristic values based on best practices from both academic research and widely used R packages, including Jaro-Winkler from the stringdist package (van der Loo, 2014) and TF-IDF with cosine similarity from the text2vec package (Selivanov, 2023). While these exact thresholds are not formally defined in the core literature, they reflect values commonly adopted in real-world fuzzy matching workflows and strike a practical balance between accuracy and recall. Full citations are provided at the end of this guide.
Step 5: Write the R Code to Perform the Lookup
Now it’s time to add the R code that will handle the data matching process in the Smart Lookup Tool. This code follows a three-stage approach:
- First, it performs an exact match using the efficient
data.tablepackage. - If no exact match is found, the code applies fuzzy matching using the Jaro-Winkler method for character-level similarity. For more complex cases, it uses TF-IDF with cosine similarity to identify matches based on the semantic meaning of text. It also identifies and handles any unmatched records appropriately, then combines the exact, fuzzy, and semantic matches into a single, consolidated output ready for analysis.
To add this R Code in Displayr:
- Select the custom code object (your Smart Lookup Tool) on the page.
- Go to Data > R Code > Edit Code in the object inspector
.
- Copy and paste the provided R Code snippet into the editor.
-
Press Calculate.
library(data.table)
library(stringdist)
library(text2vec)
library(parallel)
# ---- 1. Inputs & Defaults ----
perform_fuzzy <- ifelse(exists("formPerformFuzzy"), as.logical(formPerformFuzzy), TRUE)
jw_threshold <- suppressWarnings(as.numeric(ifelse(exists("formFuzzyTextThreshold") && !is.na(formFuzzyTextThreshold) && formFuzzyTextThreshold != "", formFuzzyTextThreshold, 0.88)))
tfidf_threshold <- suppressWarnings(as.numeric(ifelse(exists("formTextSimilarityThreshold") && !is.na(formTextSimilarityThreshold) && formTextSimilarityThreshold != "", formTextSimilarityThreshold, 0.85)))
lookup <- as.data.table(formTableArray); setDT(lookup)
data_key <- attr(formLookUpValue, "name")
data <- data.table(PrimaryKey = formPrimaryKey, temp_val = formLookUpValue)
setnames(data, "temp_val", data_key)
key_index <- suppressWarnings(as.numeric(formColumnIndex))
# Blank or missing
if (!exists("formColumnIndex") || is.null(formColumnIndex) || is.na(formColumnIndex) || trimws(formColumnIndex) == "") {
stop("Column Index cannot be left blank. Please enter a valid column number.")
}
# Invalid number
if (is.na(key_index) || key_index < 1 || key_index > ncol(lookup)) {
stop("Invalid column index. Please enter a valid column number within the lookup table.")
}
lookup_key <- names(lookup)[key_index]
return_cols <- setdiff(names(lookup), lookup_key)
final_cols <- c("PrimaryKey", data_key, "LookupKeyValue", return_cols, "MatchType", "SimilarityScore", "SimilarityFlag")
data_lookup <- data.table(PrimaryKey = data$PrimaryKey, InputVal = data[[data_key]]); setkey(data_lookup, PrimaryKey)
# ---- 2. Clean Function ----
clean_text <- function(x) tolower(gsub("[^a-zA-Z0-9]", "", trimws(x)))
# ---- 3. Exact Matching ----
setkeyv(lookup, lookup_key)
exact <- lookup[data, on = setNames(data_key, lookup_key)]
match_indicator <- !is.na(exact[[return_cols[1]]])
exact[, `:=`(
MatchType = fifelse(match_indicator, "Exact", NA),
SimilarityScore = fifelse(match_indicator, 100, NA),
SimilarityFlag = fifelse(match_indicator, "Exact", NA),
LookupKeyValue = exact[[lookup_key]]
)]
missing_cols <- setdiff(final_cols, names(exact))
if (length(missing_cols) > 0) exact[, (missing_cols) := NA]
setcolorder(exact, final_cols)
exact_success <- exact[MatchType == "Exact"]
# ---- 4. Cleaned Matching ----
unmatched_ids <- setdiff(data$PrimaryKey, exact_success$PrimaryKey)
data_unmatched <- data[PrimaryKey %in% unmatched_ids]
lookup_unmatched <- copy(lookup)
data_unmatched[, CleanedKey := clean_text(get(data_key))]
lookup_unmatched[, CleanedKey := clean_text(get(lookup_key))]
setkey(lookup_unmatched, CleanedKey)
fuzzy_cleaned <- lookup_unmatched[data_unmatched, on = .(CleanedKey)]
match_indicator_cleaned <- !is.na(fuzzy_cleaned[[return_cols[1]]])
fuzzy_cleaned[, `:=`(
MatchType = fifelse(match_indicator_cleaned, "Cleaned", NA),
SimilarityScore = fifelse(match_indicator_cleaned, 95, NA),
SimilarityFlag = fifelse(match_indicator_cleaned, "Cleaned", NA),
LookupKeyValue = fuzzy_cleaned[[lookup_key]]
)]
missing_cols_cleaned <- setdiff(final_cols, names(fuzzy_cleaned))
if (length(missing_cols_cleaned) > 0) fuzzy_cleaned[, (missing_cols_cleaned) := NA]
setcolorder(fuzzy_cleaned, final_cols)
fuzzy_cleaned_success <- fuzzy_cleaned[MatchType == "Cleaned"]
# ---- 5. Fuzzy Matching ----
jw_dt <- tfidf_dt <- data.table(matrix(nrow = 0, ncol = length(final_cols)))
setnames(jw_dt, final_cols); setnames(tfidf_dt, final_cols)
fuzzy_best <- data.table(matrix(ncol = length(final_cols), nrow = 0)); setnames(fuzzy_best, final_cols)
if (perform_fuzzy) {
unmatched <- data[!PrimaryKey %in% exact_success$PrimaryKey]
if (nrow(unmatched) > 0) {
unmatched_first_chars <- substr(unmatched[[data_key]], 1, 1)
filtered_lookup <- lookup[substr(lookup[[lookup_key]], 1, 1) %in% unmatched_first_chars]
if (nrow(filtered_lookup) > 0) {
dist_matrix <- stringdistmatrix(unmatched[[data_key]], filtered_lookup[[lookup_key]], method = "jw")
best_idx <- max.col(-dist_matrix)
best_sims <- 1 - dist_matrix[cbind(seq_len(nrow(unmatched)), best_idx)]
jw_raw <- data.table(
PrimaryKey = unmatched$PrimaryKey,
temp_key = filtered_lookup[[lookup_key]][best_idx],
Similarity = best_sims,
temp_val = unmatched[[data_key]]
)
setnames(jw_raw, c("temp_key", "temp_val"), c(lookup_key, data_key))
jw_dt <- merge(filtered_lookup, jw_raw, by = lookup_key, all.y = TRUE)
jw_dt <- jw_dt[Similarity >= jw_threshold]
if (nrow(jw_dt)) {
jw_dt[, c("MatchType", "SimilarityScore", "SimilarityFlag", "LookupKeyValue") := .("Fuzzy", round(Similarity * 100, 2), fifelse(Similarity >= jw_threshold, "Acceptable", "Low"), get(lookup_key))]
jw_dt <- jw_dt[, ..final_cols]
}
}
remaining <- data[!PrimaryKey %in% c(exact_success$PrimaryKey, jw_dt$PrimaryKey)]
if (nrow(remaining) > 0 && nrow(filtered_lookup) > 0) {
combined_vocab <- c(remaining[[data_key]], filtered_lookup[[lookup_key]])
it_all <- itoken(combined_vocab, progressbar = FALSE)
vocab <- create_vocabulary(it_all)
vectorizer <- vocab_vectorizer(vocab)
dtm_all <- create_dtm(it_all, vectorizer)
n_remaining <- nrow(remaining)
sims <- sim2(dtm_all[1:n_remaining, ], dtm_all[-(1:n_remaining), ], method = "cosine")
best_idx <- max.col(sims, "first")
best_scores <- sims[cbind(seq_len(nrow(sims)), best_idx)]
tfidf_raw <- data.table(
PrimaryKey = remaining$PrimaryKey, temp_key = filtered_lookup[[lookup_key]][best_idx], Similarity = best_scores, temp_val = remaining[[data_key]])
setnames(tfidf_raw, c("temp_key", "temp_val"), c(lookup_key, data_key))
tfidf_dt <- merge(filtered_lookup, tfidf_raw, by = lookup_key, all.y = TRUE)
tfidf_dt <- tfidf_dt[Similarity >= tfidf_threshold]
if (nrow(tfidf_dt)) {
tfidf_dt[, c("MatchType", "SimilarityScore", "SimilarityFlag", "LookupKeyValue") := .("Fuzzy", round(Similarity * 100, 2), fifelse(Similarity >= tfidf_threshold, "Acceptable", "Low"), get(lookup_key))]
tfidf_dt <- tfidf_dt[, ..final_cols]
}
}
}
fuzzy_best <- rbindlist(list(jw_dt, tfidf_dt), fill = TRUE)
}
# ---- 6. Unmatched ----
all_ids <- c(exact_success$PrimaryKey, fuzzy_best$PrimaryKey, fuzzy_cleaned_success$PrimaryKey)
unmatched <- data[!PrimaryKey %in% all_ids][, (return_cols) := NA]
unmatched[, c("MatchType", "SimilarityScore", "SimilarityFlag", "LookupKeyValue") := .("Unmatched", NA, "Unmatched", NA)]
setcolorder(unmatched, final_cols)
# ---- 7. Combine and Final Output ----
combined <- rbindlist(list(exact_success, fuzzy_cleaned_success, fuzzy_best, unmatched), fill = TRUE)
setkey(combined, PrimaryKey)
combined <- merge(combined, data_lookup, by = "PrimaryKey", all.x = TRUE)
setnames(combined, old = c("InputVal", "LookupKeyValue"), new = c("Input Value", "Lookup Value"), skip_absent = TRUE)
allowed_cols <- c("PrimaryKey", "Input Value", "Lookup Value", return_cols, "MatchType", "SimilarityScore", "SimilarityFlag")
cols_to_keep <- intersect(names(combined), unique(allowed_cols))
combined <- combined[, ..cols_to_keep]
base_cols <- c("PrimaryKey", "Input Value")
remaining_cols <- setdiff(names(combined), base_cols)
setcolorder(combined, c(base_cols, remaining_cols))
if (nrow(combined) > 0 && ncol(combined) > 0)
combined <- combined[rowSums(!is.na(combined)) > 0]
setnames(combined, "PrimaryKey", "Primary Key")
export_output <- combined[order(`Primary Key`)]
# ---- 8. Final Output Selection ----
if (exists("formViewComboBox") && formViewComboBox == "Add Variables to Dataset") {
# Filter output to only selected columns (excluding technical ones)
excluded_cols <- c("Primary Key", "Input Value", "Lookup Value")
allowed_cols <- setdiff(names(combined), excluded_cols)
exportable_data <- combined[, ..allowed_cols]
selected_cols <- character(0)
if (exists("formExportColumns") && !is.null(formExportColumns) && formExportColumns != "") {
raw_input <- formExportColumns
if (length(raw_input) == 1 && grepl(",", raw_input))
raw_input <- unlist(strsplit(raw_input, ",\\s*"))
numeric_indices <- suppressWarnings(as.numeric(raw_input))
index_valid <- !is.na(numeric_indices) & numeric_indices >= 1 & numeric_indices <= ncol(exportable_data)
index_names <- names(exportable_data)[numeric_indices[index_valid]]
name_matches <- intersect(raw_input, names(exportable_data))
selected_cols <- unique(c(name_matches, index_names))
}
# If selected_cols is not empty, filter; otherwise export all columns
export_output <- if (length(selected_cols) > 0) {
exportable_data[, ..selected_cols]
} else {
exportable_data
}
# Rename only if the match fields are present
setnames(export_output,
old = c("MatchType", "SimilarityScore", "SimilarityFlag"),
new = c("Check Match Type", "Check Similarity Score", "Check Similarity Flag"),
skip_absent = TRUE)
} else if (exists("formViewComboBox") && formViewComboBox == "Review Matched Data") {
export_output <- combined[order(`Primary Key`)]
} else {
export_output <- data.table(Message = "No matching output option selected.")
}
# ---- 9. Attach class and assign ----
attr(export_output, "class") <- c("data.frame", "smart.lookup.tool", colnames(export_output))
smart.lookup.tool <- export_output
After combining the exact and fuzzy match results, the R code formats the data into a table that’s compatible with QScript, allowing it to export the matched data as variables into the Data Sources tree.
Specifically, the R code:
- Merges the matching results into a single
data.table. - Converts the
data.tableinto adata.frameto ensure compatibility with QScript. - Assigns a special class attribute that includes:
-
data.frameso Displayr recognizes it as a valid table. -
smart.lookup.toolis a unique identifier that allows QScript to locate and use the output as the widget's result. - The column names inform QScript which variables to export.
-
Once the code is in place, the user must populate the fields in the object inspector with the relevant information and press Calculate to generate the output.
Step 6: Review the Matched Data
After running your Smart Lookup Tool, you can inspect the results or prepare them for export using the Select Data View drop-down found under Data > View Options.
There are two view modes available:
-
Review Matched Data:
This option lets you visually check the output of your matching process. It’s ideal for performing quality checks - either by scanning the results on screen or exporting them to Excel for closer inspection. To export for review, go to Share > Export Report > Excel > Export Selected > Create New Document. -
Add Variables to the Dataset:
Choose this view when you're ready to export the matched data into your dataset as new variables. This prepares the output so it can be picked up by QScript and added to your Data Sources tree.
Use the dropdown menu to switch between these modes depending on whether you're reviewing or finalizing your data.
Step 7: Create the QScript to Export Variables
To export the columns in the output as variable sets in the Data Sources tree, you can create a QScript that automatically adds each selected column as a separate variable in your dataset, streamlining the process and reducing manual work.
- In the object inspector
, go to Data > View Options and set Selected Data View to Add variables to the dataset.
- Under Export Columns, leave this field blank to export all columns from the output to the Data Sources tree. If you only want to export specific columns, enter either the column name(s) or number(s) (comma-separated) to specify which ones to include.
- In the Report tree, hover and click + > Calculation > QScript > New to open the QScript Editor. This is where you can write JavaScript to automate tasks in Displayr.
-
Copy and paste the code into the code editor.
// Log message to indicate the script has started log("Smart Lookup script loaded"); // Define the main function that handles the logic function createSmartLookupVariable() { // Load the helper functions needed for user input dialogs like selectOne() includeWeb('QScript Selection Functions'); // ################# Step 1: Get the list of all data files in the project const dataFiles = project.dataFiles; // This is an array of all data files loaded in the Q project let data_file; // This will hold the selected data file object // Check if there are no data files in the project (edge case) if (dataFiles.length === 0) { log("❌ No data files found in the project."); return; // Exit the script early } // If there is more than one data file, prompt the user to select one if (dataFiles.length > 1) { // Extract the names of the data files to display in the selection list const labels = dataFiles.map(df => df.name); // e.g., ["Main Survey", "Tracking Data"] // Show a dropdown to the user and get the index of the selected item const selectedIndex = selectOne("Select the data file to store new variables:", labels); // Log the selected index for debugging // log("🔍 User selected index: " + selectedIndex); // Validate the selected index: must not be null, negative, or beyond array length if (selectedIndex == null || selectedIndex < 0 || selectedIndex >= dataFiles.length) { log("❌ Invalid selection or user cancelled."); return; // Exit early due to invalid choice } // Use the index to get the actual data file object data_file = dataFiles[selectedIndex]; } else { // Only one data file is available, so automatically select it data_file = dataFiles[0]; } // Log which data file was selected log(`✅ Variables will be added to data file: "${data_file.name}"`); // ################# Step 2: Find Smart Lookup widgets let items = []; let smart_lookup_tools = []; recursiveGetAllItemsInGroup(project.report, items); items.forEach(i => { const classes = i.outputClasses || []; if (classes.includes("smart.lookup.tool")) { smart_lookup_tools.push(i); } }); if (smart_lookup_tools.length === 0) { log("⚠️ No Smart Lookup widgets found."); return; } // ################# Step 3: Let user pick one Smart Lookup widget let selection = 0; if (smart_lookup_tools.length > 1) { const names = smart_lookup_tools.map(w => w.name); const selected = selectOne("Select a Smart Lookup tool to use:", names); selection = names.indexOf(selected); } const widget = smart_lookup_tools[selection]; log(`🚀 Using Smart Lookup widget: "${widget.name}"`); // ################# Step 4: Get variable names from outputClasses // Get the list of output classes for the selected lookup tool let widget_classes = smart_lookup_tools[selection].outputClasses || []; // Filter out generic or tool-level classes that are not variable names let col_names = widget_classes.filter(x => !["data.frame", "smart.lookup.tool"].includes(x)); // Check if there are any variable names to work with if (col_names.length === 0) { log("⚠️ No variable names found in the widget's output classes."); return; } // Log the variable names that will be created in the data set log("🆕 Variables to be added to the data set:"); col_names.forEach((name, i) => { log(` ${i + 1}. ${name}`); }); // ################# Step 5: Create new variables only if they don't already exist const widget_name = widget.name; col_names.forEach(function(name) { let v_name = name.replace(/ /g, ""); // Variable name let q_name = name; // Question name let existing_q = data_file.getQuestionByName(q_name); if (existing_q != null) { log(`⚠️ Skipping creation: question "${q_name}" already exists.`); return; } let r_code = widget_name + "$`" + name + "`"; let new_question = data_file.newRVariable(r_code, v_name, q_name, null); new_question.variableType = "Categorical"; new_question.isFilter = true; log(`✅ Created new variable: ${v_name}`); }); } // Call the function createSmartLookupVariable(); - Click the Play button to run the script. This will check for any errors and execute the code.
- If there are any issues, error messages will appear so you can review and fix them.
- If the script runs successfully, you'll be prompted to give it a name.
- Enter a meaningful name and click Save. The script will be stored in your Displayr Cloud Drive, ready to reuse if needed.
- Click OK.
- The QScript will have exported the matched data to the Data Sources tree as variable sets.
Note: Because the variables created are R variables, they will update automatically whenever the data changes. You only need to run the script once unless you make changes to the output structure.
Step 8: Run the QScript
Once your Smart Lookup Tool output is ready, you can skip this step if you’ve already pressed Play and run the code; however, it’s useful to know in case you ever need to re-run it. This script exports the matched data into your dataset as individual variables.
Note: QScript is an Enterprise feature used to automate data preparation, cleaning, recoding, and variable creation.
- Go to Tools > QScript, then select the script you just saved.
- The script will:
-
- Scan the project for the output object with the
smart.lookup.toolidentifier. - Read the column names from the object’s class attribute.
- Create new R variables in your selected data file—one for each matched column.
- Scan the project for the output object with the
This ensures your matched data is immediately available for analysis, with no manual copying or retyping. Since the output variables are dynamic R variables, they’ll continue to update automatically as your data changes.
If you don’t need all of the matched columns to be added to the Data Sources tree, you can control which ones are exported. When you select View Options > Add Variables to Dataset, you’ll see an option to specify which columns to export. Leave it blank to export all columns, or enter the ones you want to include.
References
- Cohen, W. W., Ravikumar, P., & Fienberg, S. E. (2003). A Comparison of String Distance Metrics for Name-Matching Tasks.
- Van der Loo, M. P. J. (2014). The stringdist Package for Approximate String Matching.
- Selivanov, D. (2023‑11‑09). text2vec: Modern Text Mining Framework for R. CRAN.
- Official package page: https://CRAN.R-project.org/package=text2vec
- Vignette Analyzing Texts with the text2vec package, covering vectorization, TF‑IDF, and cosine similarity
See Also
How to Create an Auto-Updating Custom Analysis Tool
How to Add Headings, Groups, and Pages to a Custom Analysis Tool
How to Create User Input Fields in a Custom Analysis Tool
How to Set Conditional Controls in a Custom Analysis Tool
How to Set Object Types in Dropbox Controls in Custom Analysis Tools