Create a box plot chart in Excel

For those who rely on Excel for their data analysis (rather than Minitab or JMP), the charts available can be a little limiting.  So I thought I would post this technique to let you produce a box plot using Excel.

Collate the data

First you need to gather your data together.  In this example, I have 10 sites, with values by month for the year 2011.  For clarification, I have also uploaded the Excel file, so please feel free to have a look!

In the example attached, I have created a tab in Excel for each site (labelled 1 to 10).  You can of course use just one tab for all sites if you prefer!

The first thing you need to do is calculate the median, lower quartile, minimum, maximum and upper quartile values for each site.  You can see this on the “Box plot” tab of the Excel sheet attached.

The median is calculated by:

=MEDIAN('1'!$C$2:$C$11)

'1' refers to tab “1”, and the formula looks at the values in column C from rows 2 to 11.  Adjust the range if your data covers more or fewer rows.

Repeat this for each site (and tab) by changing the figure ‘1’ to ‘2’, then ‘3’ etc to form row 2 of the table below.  The $ symbol ensures that the range used for column C never changes when you copy / paste the calculation across the row.
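
If retyping the tab name ten times feels error-prone, the formulas can be generated instead.  The snippet below is only a minimal Python sketch, assuming the tabs are named 1 to 10 and the data sits in $C$2:$C$11 as in the example; the helper name site_formulas is made up for illustration.

```python
def site_formulas(function, sites, cell_range="$C$2:$C$11"):
    """Build one Excel formula per site tab, e.g. =MEDIAN('1'!$C$2:$C$11)."""
    formulas = []
    for site in sites:
        # Excel expects straight single quotes around the tab name,
        # followed by "!" and the cell range.
        formulas.append(f"={function}('{site}'!{cell_range})")
    return formulas


# One median formula per site, separated by tab characters so the
# whole line can be pasted across a row of the summary table.
median_row = site_formulas("MEDIAN", range(1, 11))
print("\t".join(median_row))
```

Passing a different function name (for example "MIN" or "MAX") gives the formulas for the other rows that use the same single-argument pattern.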

I will come back to row 3 in a moment, for now let’s skip on to rows 4 and 5.

The minimum is calculated by:

=MIN('1'!$C$2:$C$11)

The maximum is calculated by:

=MAX('1'!$C$2:$C$11)

Again, repeat this for each site (and tab) to form rows 4 and 5 of the table above.  The minimum and maximum values will be used to create the whiskers of the box plot.

Now let’s go back to row 3.  Excel has a built-in function, QUARTILE, which we can use for this row and for row 6.

Row 3 (Lower quartile, Q1) is calculated by:

=QUARTILE('1'!$C$2:$C$11,1)

The value after the comma shows that we want the lower quartile (Q1).

Row 6 (Upper quartile, Q3) is calculated by:

=QUARTILE('1'!$C$2:$C$11,3)

Once again, repeat this for each site (and tab) to form rows 3 and 6 of the table.  The Q1 and Q3 values will form the bottom and top of the box.

The order the rows appear in (median, Q1, min, max, Q3) is very important!  Do not change this or the graph will be wrong!
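
If you want to check the spreadsheet values independently, the same five numbers can be computed outside Excel.  This is a small Python sketch rather than part of the workbook: the sample values are made up, and it assumes the "inclusive" quartile method in Python's statistics module, which uses the same interpolation as Excel's QUARTILE.

```python
import statistics


def box_plot_rows(values):
    """Return the five values in the order the chart expects:
    median, Q1, min, max, Q3."""
    # method="inclusive" treats the smallest and largest data points as
    # the 0th and 100th percentiles and interpolates linearly in between.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [statistics.median(values), q1, min(values), max(values), q3]


# Hypothetical values for one site (not taken from the attached workbook).
site_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(box_plot_rows(site_values))  # [5.5, 3.25, 1, 10, 7.75]
```

Entering the same ten values in a sheet and applying MEDIAN, QUARTILE(…,1), MIN, MAX and QUARTILE(…,3) should give matching figures.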

Create the chart

Since Excel doesn’t have a box plot chart, we are going to have to create one.  To do this, first select the data table shown above, then on the Insert tab, click on the small down arrow next to Other Charts in the Charts section.  Click on the 4th chart in the Stock section (Volume-Open-High-Low-Close).  This chart type needs five series of values, which is why the table has five rows.
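
The reason the row order matters is that the stock chart gives each series a role purely by position.  As an illustration only, here is that mapping written out in Python, on the assumption that Excel assigns the five series in the order given in the chart's name:

```python
# Roles the Volume-Open-High-Low-Close chart gives to five series, in order.
STOCK_ROLES = ["Volume", "Open", "High", "Low", "Close"]

# Row order of the summary table built earlier.
TABLE_ROWS = ["Median", "Q1", "Min", "Max", "Q3"]

role_for_row = dict(zip(TABLE_ROWS, STOCK_ROLES))
for row, role in role_for_row.items():
    print(f"{row:<6} -> {role}")

# Open and Close (Q1 and Q3) bound the box.  High and Low (Min and Max)
# are joined by the vertical line that forms the whiskers; the line spans
# the same two values whichever of them is labelled "High".  Volume (the
# median) is drawn as columns, which a later step turns into markers.
```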

Excel will automatically create the graph below.

This is not exactly what we want, so we will have to do some formatting.  The first step is to adjust the two Y axes so that they are identical, i.e. both go from 0 to 4.0.

Double click on the left axis.  In the menu that appears, ensure that Axis Options is selected, then change the maximum value from Auto to Fixed and enter a value of 4.0.  Then click on the Close button.  Do the same for the right axis if it does not already run from 0 to 4.0.

Then right click on the blue filled median columns on the graph and select Change Series Chart Type.  In the menu that appears, change the graph type to Line and choose the 4th option (Line with Markers).  Then click on the OK button.

The graph will then look as shown below.

There are a few more formatting points to do.

  1. Change the fill of the boxes to none.  Double click on the boxes.  In the menu that appears, ensure that Fill is selected and choose No fill.  Then click on Close.
  2. Remove the interconnecting line of the median series.  With the formatting menu open, left click on the median series.  Left click on Line Color and choose No line.  Then left click on Marker Options and choose Built-in.  Change the marker type to a line, increase the size to 11, then click on Close.
  3. For the finishing touches, you could delete the legend and the secondary Y axis.  You could also reduce the number of decimal places for the primary Y axis to 2, e.g. 0.00.  Finally, I have also changed the font colour and the chart lines to grey.

Enjoy!

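Please note: JMP, Minitab and Excel differ slightly in how the statistics are calculated, in particular in whether outliers are included or not!  The whiskers in the chart above run to the minimum and maximum, so outliers are not shown separately.  If you are serious about data analysis, I recommend using the right tools (i.e. JMP or Minitab).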
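
To illustrate the difference: many statistics packages stop each whisker at the most extreme point within 1.5 times the interquartile range of the box and plot anything beyond that as an individual outlier.  The following is a rough Python sketch of that rule, assuming the 1.5 × IQR convention and Excel-style inclusive quartiles; the exact quartile method varies between packages, and the readings are made up.

```python
import statistics


def whiskers_and_outliers(values, k=1.5):
    """Return the whisker ends and the outliers under the k * IQR rule."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    # Whiskers stop at the most extreme points still inside the fences.
    return min(inside), max(inside), outliers


# Made-up readings with one unusually high value.
readings = [1.1, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 3.9]
low, high, outliers = whiskers_and_outliers(readings)
print(low, high, outliers)
```

With these readings the upper whisker would stop at 2.0 and 3.9 would be flagged separately, whereas the Excel chart above would draw the whisker all the way to 3.9.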