This article describes how to create a box plot from one or more numeric variables, where the range of the box indicates the interquartile range of the data and the central line shows the median.
You will end up with a plot like this:
Requirements
You will need one or more Numeric format variables in your data, either in a loaded data set or in a table. If you use a table, each row is treated as a separate observation when calculating the median, percentiles, and other statistics of the distribution. A table used as the input must therefore contain the raw data needed to calculate the box.
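To make concrete what is calculated from the raw observations, here is a minimal Python sketch of the statistics a box typically shows. It is an illustration only, not the product's own code: the function name `box_summary` and the sample values are hypothetical, and the "inclusive" quantile convention is an assumption, since different tools interpolate quartiles slightly differently.

```python
from statistics import quantiles


def box_summary(values):
    """Return the quartiles and interquartile range for one numeric variable.

    Each element of `values` is one raw observation (one row of the input).
    """
    # quantiles() with n=4 returns the three cut points: Q1, median, Q3.
    # method="inclusive" treats the data as the full population (an assumption).
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return {"q1": q1, "median": med, "q3": q3, "iqr": q3 - q1}


# Hypothetical raw observations, one per row.
observations = [2, 4, 4, 5, 7, 9, 10, 12]
summary = box_summary(observations)
print(summary)
```

The box spans `q1` to `q3`, so its height is the `iqr`, and the central line sits at `median`. A table holding only pre-computed averages or counts would not provide the individual observations this calculation needs.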
Method - Use variables as inputs
- Go to Visualization > Distributions > Box.
- In the object inspector, under Data > Data Source > Variables, select your numeric variable(s).
- From the Groups dropdown, select a secondary variable to group the numeric data by.
- Click Calculate, or ensure Calculate automatically is checked.
Selecting more than one numeric variable will generate multiple boxes.
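The effect of a grouping variable can be sketched in Python as follows: the numeric values are split by the group label of each observation, and one set of box statistics is calculated per group. This is a hedged illustration under assumptions, not the product's implementation. The names `grouped_box_summaries`, `spend`, and `region` are hypothetical, and the quantile convention is again assumed to be "inclusive".

```python
from collections import defaultdict
from statistics import quantiles


def grouped_box_summaries(values, groups):
    """Compute Q1, median, and Q3 separately for each group label."""
    # Collect the observations that belong to each group.
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)

    # One box (three quartile cut points) per group.
    summaries = {}
    for group, obs in by_group.items():
        q1, med, q3 = quantiles(obs, n=4, method="inclusive")
        summaries[group] = {"q1": q1, "median": med, "q3": q3}
    return summaries


# Hypothetical data: a numeric variable and a secondary grouping variable.
spend = [10, 12, 14, 16, 18, 20, 30, 40, 50, 60]
region = ["North"] * 5 + ["South"] * 5

boxes = grouped_box_summaries(spend, region)
for name, stats in boxes.items():
    print(name, stats)
```

Each key in `boxes` corresponds to one box in the plot. Selecting several numeric variables works analogously, with one box per variable instead of one per group.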
Method - Use a table as the input
- Select the table that has the numeric data in the Rows.
- From the object inspector, go to Visualization > Distributions > Box.
Acknowledgements
The density is computed using the base R density function, and the plot is created using plotly.
Next
How to Create a Grouped Box Plot