Adding Calculations into the R Code of Visualizations

Considerable performance gains can be obtained by migrating calculations into the R code section of some data visualizations. This article assumes basic competence in writing R code. It:

Explains the R code section of a data visualization.
Provides a detailed worked example of the strategy.
Explains how to do this so that the visualization can easily be reused.

The R Code section of a data visualization

Some visualizations in Displayr are written in the R language. The code that creates the visualization is visible in the Code Editor section of Displayr. For example, you can see the code for a pictograph below.

Stick Man.png

Detailed worked example using the R Code section

The dependency graph below is for a visualization that has been created in a way that is guaranteed to be slow. The graph shows two separate paths. The one at the top takes a total of 0.11 + 0.00 + 0.02 + 0.02 + 0.44 + 0.45 + 0.42 + 0.42 + 1.64 + 1.01 = 4.53 seconds. The one at the bottom takes a little less time, so the visualization takes at least 4.53 seconds to calculate. (Where multiple visualizations are being shown simultaneously, the overall time may be longer.)

Screenshot 2024-05-29 101221.png
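The total for the top path is the sum of its node times, which can be checked with a couple of lines of R. This is only a sketch of the arithmetic quoted above; the variable names are illustrative and are not part of Displayr.

```r
# Calculation times (in seconds) of the nodes on the top path, as quoted above
top.path <- c(0.11, 0.00, 0.02, 0.02, 0.44, 0.45, 0.42, 0.42, 1.64, 1.01)

# Nodes on one path run one after another, so their times add up
total <- sum(top.path)
print(total)
```

Because the two paths must both finish before the visualization can be drawn, the longer of the two path totals sets the minimum calculation time.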

Explaining each of the nodes

The leftmost node shows that the entire data set takes 0.11 seconds to load.

The second node shows that extracting the data for the Q5 variable set from the data set is effectively instant.

The top-most Q5 table contains the percentages of people who associate different brands with different personality attributes. It takes 0.02 seconds to calculate.

The next node selects the first six rows of data for the Older column. It takes 0.45 seconds. It is slow because there is an overhead associated with each calculation.

The next node sorts the table and takes 0.42 seconds.

The next node, viz, creates a bar chart.

The rightmost node shows the visualization if the sample size is greater than 50 and, otherwise, shows nothing.

The remaining nodes extract the sample sizes for the Older column and calculate the smallest value.
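As a rough illustration, the chain of nodes described above corresponds to something like the following R, where every assignment is a separate calculation with its own overhead. The two matrices are made-up stand-ins for the Q5 tables, the names other than table.Q5 are hypothetical, and the descending sort order is an assumption (the article does not state the direction).

```r
# Made-up stand-ins for the percentages table and the sample size table
table.Q5 <- matrix(
    c(10, 25, 40, 15, 30, 20, 35, 5,
      12, 45, 22, 38, 18, 27, 33, 9),
    nrow = 8,
    dimnames = list(paste("Attribute", 1:8), c("Younger", "Older"))
)
table.Q5.n <- matrix(rep(c(120, 80), each = 8), nrow = 8,
                     dimnames = dimnames(table.Q5))

# Each line below plays the role of one node in the dependency graph
selection <- table.Q5[1:6, "Older"]            # first six rows of Older
sorted <- sort(selection, decreasing = TRUE)   # sort (direction assumed)
n <- table.Q5.n[1:6, "Older"]                  # sample sizes for Older
min.n <- min(n)                                # smallest sample size

# Final node: show the result only if the sample size is greater than 50
result <- if (min.n > 50) sorted else ""
print(result)
```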

Optimizing the dependency graph

The article now describes optimizing the dependency graph by placing calculations into the R Code of the visualization.

The first win is replacing the two tables with a single table containing the percentages and the sample size.

Then, we reference this table as the input data to the visualization:

viz bar.png

Then, we modify the code in the Code Editor window. There are three steps to this. The code below is doing the same thing as was previously done in the separate calculations. Some things to note:

A message box will appear asking Are you sure you want to edit the R code. Click Show.
The warning in the first line is to alert anybody who clicks on the visualization that the underlying R Code has been modified.
formTable refers to the table that has been selected (i.e., table.Q5).
viz.2 is the name that the visualization will be assigned. If you use a name that has been previously used, you will get an error and need to select another name.
"" means that nothing will appear if the sample size is too small. We could also include a message (e.g., "Sample size too small").
An if statement has been used, and the entire visualization is the else condition.
Once we have selected and sorted the sub-selection, we assign it to formTable, as this is the object that is ultimately used by the code that creates the visualization.

R Example 2 .png
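The pattern described in the notes above can be sketched as standalone R. This is not the exact code from the screenshot: the mock formTable, the statistic names "Column %" and "Column Sample Size", the descending sort, and the placeholder at the end of the else block are all assumptions made so that the example runs outside Displayr.

```r
# Mock of the combined table (in Displayr, formTable is supplied by the input)
stats <- c("Column %", "Column Sample Size")   # assumed statistic names
formTable <- array(
    c(10, 25, 40, 15, 30, 20, 35, 5,    # Column %, Younger
      12, 45, 22, 38, 18, 27, 33, 9,    # Column %, Older
      rep(120, 8),                      # sample size, Younger
      rep(80, 8)),                      # sample size, Older
    dim = c(8, 2, 2),
    dimnames = list(paste("Attribute", 1:8), c("Younger", "Older"), stats)
)

# Smallest sample size among the cells of interest
min.n <- min(formTable[1:6, "Older", "Column Sample Size"])

viz.2 <- if (min.n <= 50) {
    ""   # nothing is shown when the sample size is too small
} else {
    # Select and sort, then reassign to formTable, which the chart code reads
    formTable <- sort(formTable[1:6, "Older", "Column %"], decreasing = TRUE)
    # ... the original visualization code would follow here ...
    formTable   # placeholder standing in for the chart the original code returns
}
print(viz.2)
```

Doing the selection, sort, and sample size check inside the visualization replaces several separate calculations with one, which is where the time saving comes from.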

Due to the use of the if statement, two further modifications are required. First, scroll down to around line 250 (this will change depending on how much code you've added at the beginning), and find the place where the visualization is being named. Here, we can see it's named as viz.2.

We delete the name assignment (i.e., viz.2 <- ), as shown below. We do this because all this code is now nested within the if statement at the top, and the naming now occurs at the top.

Last, we must put a closing brace at the bottom of the code. This is done to close the else condition of the if statement.
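The reason a single assignment at the top is enough is that, in R, an if/else is an expression whose value is the last expression evaluated in the branch that runs. A minimal sketch, using a hypothetical show.chart flag in place of the sample size check and a trivial vector in place of the chart:

```r
show.chart <- TRUE   # hypothetical flag standing in for the sample size check

viz.2 <- if (!show.chart) {
    ""
} else {
    x <- c(b = 2, a = 1)
    rev(x)   # no "viz.2 <-" here: the last value in the braces is returned
}            # closing brace that ends the else block

print(viz.2)
```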

Once the above is done, the dependency graph is shorter and takes less than half the time to calculate.

Screenshot 2024-05-29 092946.png

Making the visualization easily re-usable

The detailed worked example uses code in the R code box to select rows 1 to 6 of the Older column of the table. If anybody wants to change the selection to a different column, they must modify the R Code.

An alternative approach is to only select the cells of interest as inputs rather than the whole table:

viz bar.png

Then, we modify the R Code as above, but remove the parts that do the sub-selection:

R Example 2 .png

The visualization can then be copied and pasted. The user only needs to modify the inputs, and the new visualization will automatically be sorted and will be hidden if the sample size is small.
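To illustrate why this version is reusable, the sketch below wraps the remaining logic (sample size check and sort) in a hypothetical helper, sort.or.hide, and applies it to two different selections. In Displayr the logic would sit inline in the R Code rather than in a function, and how the selected cells arrive (here, one vector of percentages and one of sample sizes) is an assumption.

```r
# Hypothetical helper: no row or column names are hard-coded, so the same
# code works for whatever cells are selected as the input
sort.or.hide <- function(percent, n, min.n = 50) {
    if (min(n) <= min.n) return("")     # hide when the sample is too small
    sort(percent, decreasing = TRUE)    # otherwise sort (direction assumed)
}

# Two made-up selections, as if two copies of the visualization had
# different cells chosen as their inputs
older <- sort.or.hide(c(A = 12, B = 45, C = 22), n = c(80, 80, 80))
small <- sort.or.hide(c(A = 10, B = 25, C = 40), n = c(30, 30, 30))

print(older)
print(small)
```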
