Displayr keeps track of the relationships between all variables and outputs in a Displayr document. Every time you update your data file or the variables in your data, Displayr updates all the objects and variables that depend on whatever you changed. The Dependency Graph is a visual representation of how everything in your document is connected. As discussed in Viewing Dependency Graphs to Understand Calculations and Performance, you can view the dependencies for any data or result in Displayr. This article covers the following ways to optimize them:
- Reduce the number of things in the dependency graph
- Identify bottlenecks
- Shorten the dependency graph
- Optimize the changeable parts of a dependency graph
- The impact of data set loading
Many of the other articles in How to Speed Up Displayr include other ways of optimizing the dependency graph.
Common misunderstanding: times taken to compute things are constant
A common misunderstanding is that if something takes 10 seconds to compute in Displayr, it will take 10 seconds if redone tomorrow. This isn't true for two different reasons:
- Displayr automatically saves many results and re-uses them. This is called caching. Consequently, if a user goes to a page containing a calculation that has already been computed, the result will be retrieved from the cache and won't take any time to compute. Keep this in mind when reviewing a dependency graph, as it means that the speed you experience in edit mode may be slower or faster than the speed you experience in view mode.
- The amount of time something takes depends on other calculations and network bandwidth. For example, if you are accessing Displayr on a slow internet connection, it will take longer for whatever instructions you give Displayr to be processed. Or, if you have a lot of large conjoint models computing, everything else will be slower.
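The effect of caching on timings can be illustrated with a minimal Python sketch. This is not how Displayr implements its cache; `compute_table` and the filter labels are hypothetical stand-ins, and the sketch only shows why a repeated request can cost almost nothing while a new one still has to be computed.

```python
from functools import lru_cache

call_count = 0  # counts how often the expensive work actually runs


@lru_cache(maxsize=None)
def compute_table(filter_name: str) -> str:
    """Stand-in for an expensive calculation (hypothetical, not a Displayr function)."""
    global call_count
    call_count += 1
    return f"table filtered by {filter_name}"


first = compute_table("18 to 24")   # not cached yet: the work runs
second = compute_table("18 to 24")  # same input: served from the cache
third = compute_table("25 to 34")   # new input: the work runs again

print(call_count)  # number of real computations across the three requests
```

Three requests trigger only two real computations, which is why timing the same page twice can give very different results.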
Reduce the number of things in the dependency graph
This is discussed in detail in Reduce the Number of "Things" and the Size of the "Things" in a Document and Reduce the Number of Calculations in a Document.
Identify bottlenecks
The dependency graph below for calc.1 shows that to perform this calculation, Displayr needs first to calculate A, B, and C.
Displayr will generally attempt to perform calculations in parallel; consequently, C is a bottleneck because it takes much longer than A or B. In this example, the shortest time in which this dependency graph can be calculated is 2.36 + 0.37 = 2.73 seconds.
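The bottleneck arithmetic can be sketched in Python as a longest-path calculation. The sketch assumes that 2.36 seconds is the time for C and 0.37 seconds is the time for calc.1 itself; the times for A and B are invented for illustration, and the sketch assumes unlimited parallelism, which a real system may not provide.

```python
from functools import lru_cache

# Seconds each item takes on its own. The values for C and calc.1 are read
# from the example above; the values for A and B are assumed.
durations = {"A": 0.20, "B": 0.50, "C": 2.36, "calc.1": 0.37}

# Each item maps to the items it depends on.
depends_on = {"A": [], "B": [], "C": [], "calc.1": ["A", "B", "C"]}


@lru_cache(maxsize=None)
def earliest_finish(item: str) -> float:
    """Earliest finish time if independent items run fully in parallel."""
    slowest_input = max((earliest_finish(d) for d in depends_on[item]), default=0.0)
    return slowest_input + durations[item]


best_case = round(earliest_finish("calc.1"), 2)
print(best_case)
```

Because only the slowest input matters, speeding up A or B would not change the result here; only a faster C (or calc.1) would.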
Shorten the dependency graph
A dependency graph describes the order in which things must be computed: if A needs to compute before B, and B before C, then A -> B -> C is the dependency graph (some people draw the arrows in the other direction). Sometimes people inadvertently create very inefficient dependency graphs. For example, say you create one calculation and link every other calculation to it. Even a trivial modification to that one calculation will then cause everything else to update. Similarly, a long chain of Calculations must be executed in sequence, which is slower than a structure that permits them to be calculated in parallel.
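As a rough illustration of the chain-versus-parallel point, the following Python sketch compares two hypothetical structures built from the same calculations. The names `w`, `x`, `y`, `z`, `out` and all durations are made up, and the sketch assumes the calculations can run fully in parallel when nothing links them.

```python
def elapsed(durations, depends_on, item):
    """Earliest finish time for `item`, assuming unlimited parallelism."""
    inputs = depends_on.get(item, [])
    slowest = max((elapsed(durations, depends_on, d) for d in inputs), default=0.0)
    return slowest + durations[item]


# Four hypothetical calculations of one second each, plus a final output.
durations = {"w": 1.0, "x": 1.0, "y": 1.0, "z": 1.0, "out": 0.5}

# Long chain: w -> x -> y -> z -> out (each must wait for the previous one).
chain = {"x": ["w"], "y": ["x"], "z": ["y"], "out": ["z"]}

# Flat structure: all four feed the output directly and can run side by side.
fan_in = {"out": ["w", "x", "y", "z"]}

chain_time = elapsed(durations, chain, "out")
fan_in_time = elapsed(durations, fan_in, "out")
print(chain_time, fan_in_time)
```

The flat structure is only an option when the calculations do not genuinely need each other's results; where they do, the chain cannot be avoided.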
Optimize the changeable parts of a dependency graph
When optimizing performance, the total dependency graph is usually not what matters. Instead, focus on the parts of the dependency graph that need to be re-computed.
The example below shows a donut chart where the user has a control (top-right). When the user changes the control selection, the table is filtered and the Visualization updates. To understand the time taken for the Visualization to update when the user changes the ages in the ComboBoxS1Age control, the only relevant parts are those shown in yellow.
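One way to think about the parts that need re-computing is as everything downstream of the item that changed. The Python sketch below walks a small hypothetical graph to find them. Only the control, the table, and the visualization come from the example; `data set` and `other table` are assumed for illustration, and the function is not part of Displayr.

```python
from collections import deque

# Each item maps to the items that depend on it (hypothetical graph).
dependents = {
    "data set": ["table", "other table"],
    "ComboBoxS1Age": ["table"],
    "table": ["visualization"],
    "other table": [],
    "visualization": [],
}


def needs_recompute(changed: str) -> list[str]:
    """Return everything downstream of `changed`, in breadth-first order."""
    seen, queue, order = {changed}, deque([changed]), []
    while queue:
        for nxt in dependents[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


affected = needs_recompute("ComboBoxS1Age")
print(affected)
```

Changing the control touches only the table and the visualization; the data set and the unrelated table are left alone, so their timings are irrelevant to this interaction.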
The impact of data set loading
Consider the dependency graph below for the Visualization named viz. The total time taken to show viz will be, reading from right to left, 0.86 + 0.36 + 0.11 + 11.39 = 12.72 seconds.
However, the data set is loaded when the Document is loaded, and, provided the Document stays loaded, it does not need to be loaded again. Consequently:
- In edit mode, once the data set is loaded, the 11.39 seconds for loading the data set is not incurred again.
- In view mode, if the data set is already loaded, the 11.39 seconds is not incurred again.
- Once the data set is loaded, the time taken to calculate the visualization is thus only 0.86 + 0.36 + 0.11 = 1.33 seconds.
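The two totals above can be checked with a few lines of Python. The step times are taken from the example; the variable names are illustrative only.

```python
# Step times in seconds, taken from the example above.
data_set_load = 11.39
downstream_steps = [0.11, 0.36, 0.86]

# Data set not yet loaded: every step contributes.
cold = round(data_set_load + sum(downstream_steps), 2)

# Data set already loaded: only the downstream steps contribute.
warm = round(sum(downstream_steps), 2)

print(cold, warm)
```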