The best-known type of calculation in Displayr is the one created by clicking the Calculation button in the toolbar. However, most advanced techniques and visualizations are also calculations. A document can almost always be sped up by reducing the number of calculations.
This article contains:
- A simple worked example
- An edge case where the strategy won't work (putting calculations into dropboxes does not give a performance advantage)
- How to combine multiple aligned Calculations into one
- Discussions of other non-obvious ways of reducing the number of calculations
Simple worked example
Consider a document that contains the following six calculations:
cats = 2
dogs = 2
giraffes = 2
wombats = 2
platapi = 2
animals = cats + dogs + giraffes + wombats + platapi
This is surprisingly slow in Displayr: it takes about 0.8 seconds. If we examine the dependency graph, we see that each of the six calculations takes about 0.4 seconds, but the first five can be done in parallel (if we haven't slammed the servers with lots of other calculations). The total is therefore roughly two sequential steps of 0.4 seconds each: the five inputs, followed by animals.
As mentioned in the title, we make Displayr faster by reducing the number of calculations. That's easy with this example, as we can replace the six calculations with a single calculation that contains all six lines of code:
cats = 2
dogs = 2
giraffes = 2
wombats = 2
platapi = 2
animals = cats + dogs + giraffes + wombats + platapi
And we get a big payoff, with the time going from about 0.8 seconds to 0.45 seconds. Why is it so much faster? If we view the raw R output, we see that the total time executing the code was around 0.05 seconds, and the rest of the time was spent on something called Other overhead on R server (0.39 seconds) and Time spent transferring data (0.01 seconds).
The overhead and time spent transferring data occur with every single R calculation. When we reduce the number of calculations, this time just disappears, which is why reducing the number of calculations is such a powerful approach.
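The arithmetic behind this can be sketched in a few lines of R. This is only a rough model built from the figures quoted above; the variable names are illustrative, and it assumes the five input calculations run fully in parallel.

```r
# Rough timing model using the figures quoted above (all in seconds).
overhead_per_calc <- 0.39 + 0.01  # R server overhead + data transfer, paid per calculation
code_time_total   <- 0.05         # time actually spent executing the R code

# Six separate calculations: the five inputs run in parallel (one step),
# then `animals` runs as a second step, so the overhead is paid twice in sequence.
sequential_steps  <- 2
separate_estimate <- sequential_steps * overhead_per_calc + code_time_total

# One combined calculation: the overhead is paid once.
combined_estimate <- 1 * overhead_per_calc + code_time_total

print(c(separate = separate_estimate, combined = combined_estimate))
```

Under these assumptions almost all of the saving comes from paying the per-calculation overhead once rather than twice; the R code itself contributes very little either way.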
Putting calculations into dropboxes does not give a performance advantage
Most users use dropboxes to select things (e.g., tables, variables). However, when you click on a dropbox it will often show a more advanced option of Enter a calculation or value. With this option, you can enter code into the box. In the example below, the code calculates the proportion of males.
As of the time of writing, at the back-end Displayr treats this as an entirely separate calculation, so you don't get any performance benefit. However, the next section describes a strategy that can work in the same situation.
Combine multiple aligned Calculations into one
Similar to Use Small Multiples Instead of Lots of Small Visualizations, if you have multiple R Calculations (like conditional images/links/text, color changing numbers, number visualizations, and dynamic texts) aligned in some way on the page, they can all be combined into one Table with Custom Formatting (Autofit) or by using the CreateCustomTable function in R. There are many examples of how to use these, such as:
- How to Create an Autofit Table
- How to Align Values to a Visualization Using a CreateCustomTable R Table
- How to Create a CreateCustomTable R Table of Images
- How to Customize Fonts in a CreateCustomTable R Table
Put controls on page masters
Often documents contain the same controls on multiple pages. For example, each page may contain a listbox showing ages. Then, this listbox needs to be connected to multiple distinct other calculations and R variables. A much better design is to instead move the controls to a master page, which reduces the number of controls and can also reduce the number of calculations and variables required to refer to the control. See How to Use the Same Control on Multiple Pages.
The visualization hack
Many visualizations are also calculations. This means you can remove overhead by combining them together. See Adding Calculations into the R Code of Visualizations. However, before doing this, please also explore the next section.
Row and column manipulations of visualizations
In support, we commonly see pages with lots of visualizations, where each one of these visualizations is based on a table, and each of these tables is a part of a larger table (e.g., maybe the larger table contains data on brands, and each small table is for a separate brand). A much better way of structuring this is to make each of the visualizations pull data from the larger table, using ROW MANIPULATIONS > Rows to show and COLUMN MANIPULATIONS > Columns to show.
Even if you are not using Displayr's inbuilt visualizations, a common reference table or `list` that is referenced by lots of other outputs is often an effective way of reducing the number of Calculations.
Similarly, there are many other things built into ROW MANIPULATIONS and COLUMN MANIPULATIONS, such as sorting, that can reduce the number of calculations required to prepare the data.
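For outputs written in R, the common reference table idea might look something like the following. This is a minimal sketch: the table name, brand names, and values are all made up for illustration.

```r
# One shared reference table, computed once in a single calculation.
# Brand names and values are hypothetical.
brand.table <- matrix(c(40, 25, 35,    # Awareness column
                        55, 60, 45),   # Preference column
                      nrow = 3,
                      dimnames = list(c("Brand A", "Brand B", "Brand C"),
                                      c("Awareness", "Preference")))

# Inside the R code of each downstream output, select only the rows or
# columns needed, instead of depending on a separately computed small table.
brand.a   <- brand.table["Brand A", , drop = FALSE]
awareness <- brand.table[, "Awareness", drop = FALSE]
```

Using `drop = FALSE` keeps each selection as a table with its row and column names, which tends to be what a visualization expects.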
Deliberately create repetitive code
Let's say you have three calculations, where the first one is computed and then calculations 2 and 3 are computed using the first one as an input. It is often the case that it will be more efficient to include the code from the first Calculation at the beginning of the second and third Calculations, and then remove the first Calculation. That is, the reduction in time achieved by reducing the number of calculations will make up for the duplication of code. This is partly due to Displayr performing things in parallel and partly due to the overhead associated with calculations.
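As an illustration, the restructuring described above might look like this. The calculation names and the data are hypothetical, and the comment banners simply mark where one Calculation ends and the next begins.

```r
# Before: three Calculations, each paying its own overhead.
#   Calculation 1:  scores <- c(3, 5, 8)
#   Calculation 2:  mean(scores)   # has to wait for Calculation 1
#   Calculation 3:  max(scores)    # has to wait for Calculation 1

# After: two Calculations. The shared line is deliberately repeated
# and Calculation 1 is deleted.

# --- Calculation "average.score" ---
scores <- c(3, 5, 8)
average.score <- mean(scores)

# --- Calculation "top.score" ---
scores <- c(3, 5, 8)
top.score <- max(scores)
```

The two remaining Calculations no longer depend on anything else, so they can run in parallel and the overhead of the removed Calculation is never paid.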
Use functions
An even better strategy than creating repetitive code is to abstract the common code by writing a function, and having that function appear as a separate calculation used by the other two calculations. The key thing to appreciate about doing this is that while you still have three outputs, the function itself will be cached, so will not have any real effect on performance.
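A minimal sketch of this pattern follows. The function name, its body, and the data are assumptions made purely for illustration; in a real document the shared function would contain whatever code the other calculations have in common.

```r
# Stand-in for your data (hypothetical values).
raw <- c(2, 4, NA, 8)

# --- Calculation 1: contains only the shared function ---
prepare.scores <- function(x) {
    x <- x[!is.na(x)]   # drop missing values
    x / max(x) * 100    # rescale so the largest value is 100
}

# --- Calculation 2: uses the function ---
average.index <- mean(prepare.scores(raw))

# --- Calculation 3: uses the same function ---
lowest.index <- min(prepare.scores(raw))
```

The common logic now lives in one place, so a change to `prepare.scores` is picked up by both of the calculations that call it.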
Hide information in attributes
Consider the situation where you have a calculation that computes one table, tbl, and some text summarizing the table, dscr. The orthodox way of dealing with this in R is via a list: out = list(table = tbl, description = dscr). However, if you want to hook the table up to a visualization, you would need to create a separate calculation with code out$table, and refer to this in your visualization.
An alternative way of doing the same thing is via attributes. For example:
attr(tbl, 'description') <- dscr
tbl
This returns the table, which can then be referenced directly in a visualization without any need for an intermediary table. The description is then extracted by referring to attr(tbl, 'description') in R code.
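A self-contained version of the attribute approach is sketched below. The contents of the table and the wording of the description are made up for illustration; only `attr()` and the names tbl and dscr come from the text above.

```r
# Build a small example table (hypothetical values).
tbl <- matrix(c(10, 20, 30, 40), nrow = 2,
              dimnames = list(c("Male", "Female"), c("Yes", "No")))

# Some text summarizing the table.
dscr <- sprintf("The table has %d rows and %d columns.", nrow(tbl), ncol(tbl))

# Attach the description to the table as an attribute. The object is still
# a table, so a visualization can use it directly.
attr(tbl, "description") <- dscr

# Elsewhere in R code, retrieve the description from the same object.
description <- attr(tbl, "description")
```

Note that some operations on the table, such as subsetting, can drop custom attributes, so it is safest to read the description from the original object.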
Next
Examine the Calculation Timings in the Raw R Output
How to Perform Mathematical Calculations Using R
How to Perform Mathematical Calculations on Tables