⇦ Back

This tutorial is about plotting Bland-Altman graphs. For a more complete tutorial on Bland-Altman analysis as a whole, see here. For a tutorial on scatter plots in general, see here, and for line plots in general, see here.

Bland-Altman plots are used to assess the agreement between two methods of measuring something, usually clinical information. They were discussed in Bland & Altman’s 1986 paper (available here and/or here) and see also the Wikipedia page. Here is the example data that was used in the 1986 paper in data frame format (note that, in order to use data frames, the pandas library first needs to be imported):

import pandas as pd

# Peak expiratory flow rate measurements made using a Wright peak flow meter
# and a mini Wright peak flow meter.
# - https://www-users.york.ac.uk/~mb55/datasets/pefr.dct
# - https://www-users.york.ac.uk/~mb55/datasets/datasets.htm
# - https://www-users.york.ac.uk/~mb55/meas/ba.pdf
df = pd.DataFrame({
    'Wright Mini': [
        512, 430, 520, 428, 500, 600, 364, 380, 658,
        445, 432, 626, 260, 477, 259, 350, 451
    ],
    'Wright Large': [
        494, 395, 516, 434, 476, 557, 413, 442, 650,
        433, 417, 656, 267, 478, 178, 423, 427
    ]
})

This data shows the maximum speed of expiration (the ‘peak expiratory flow rate’, or PEFR) of 17 people. It was collected using two different measurement devices: a large and a mini Wright peak flow meter.

1 Value-Value Plot

The data can be plotted as a scatter graph:

  • Import the pyplot module from the Matplotlib library and give it the shorthand name plt
  • The scatter(x, y) function plots x- and y-data as points in a scatter graph
  • A figure title and axis labels can be added with title(), ylabel() and xlabel()
import matplotlib.pyplot as plt

plt.scatter(df['Wright Large'], df['Wright Mini'])
plt.title('Peak Expiratory Flow Rate')
plt.ylabel('Mini Meter (L/min)')
plt.xlabel('Large Meter (L/min)')

Now let’s add a line of equality. This shows ideal agreement: the closer the points are to the line the better the two measurement methods agree with each other.

  • First, use xlim() to get the lowest and highest numbers on the x-axis. Note that these are not the smallest and largest values that have been plotted; as you can see from the plot above, when the graph is created there is some whitespace added around the points to make the graph look better. This means that the range over which the x-axis runs is larger than that of the data.
  • Once the left and right extents of the x-axis have been returned, a straight line can be created from the origin to the upper-right corner with plot()
  • If we were to take a look at the graph now the line of equality would be hanging in the middle of the plot area because the axis limits would have been updated when it was plotted. Manually set the x-axis limits you want using ylim() and xlim().
  • The show() function is an explicit instruction to (re-)show the plot
plt.scatter(df['Wright Large'], df['Wright Mini'])
plt.title('Peak Expiratory Flow Rate')
plt.ylabel('Mini Meter (L/min)')
plt.xlabel('Large Meter (L/min)')
# Get the axis limits
left, right = plt.xlim()
# Plot the line of equality
plt.plot([0, right], [0, right])
# Set the axis limits
plt.ylim(0, right)
plt.xlim(0, right)

plt.show()

Finally, let’s make some aesthetic improvements:

  • Change the colour, size, transparency, marker shape and line type of the plot points and the line of equality by adding keyword arguments in the scatter() and plot() calls:
    • c or color will change the marker or line colour (see your options here)
    • s will change the marker or line size
    • alpha will change the transparency
    • marker will change the marker type (see your options here)
    • ls will change the line style (see your options here)
  • Make the axes’ scales equal by setting the aspect ratio using the gca() function to get the current axes then the .set_aspect('equal') method
  • Add a legend to give a label to the line of equality. This can be done by first using the label keyword argument in the plot() call and, secondly, by adding a legend via legend() (use the frameon=False option to remove the legend’s border):
plt.scatter(df['Wright Large'], df['Wright Mini'], c='k', s=20, alpha=0.6, marker='o')
plt.title('Peak Expiratory Flow Rate')
plt.ylabel('Mini Meter (L/min)')
plt.xlabel('Large Meter (L/min)')
# Get axis limits
left, right = plt.xlim()
# Reference line
plt.plot([0, right], [0, right], c='grey', ls='--', label='Line of Equality')
# Set axis limits
plt.ylim(0, right)
plt.xlim(0, right)
# Set aspect ratio
plt.gca().set_aspect('equal')
# Legend
plt.legend(frameon=False)

plt.show()

2 Bland-Altman Plot

The Bland-Altman analysis is discussed here, but it can be replicated using the following code:

import pandas as pd
import numpy as np

means = df.mean(axis=1)
diffs = df.diff(axis=1).iloc[:, -1]
# Average difference (aka the bias)
bias = np.mean(diffs)
# Sample standard deviation
sd = np.std(diffs, ddof=1)
# Limits of agreement
upper_loa = bias + 2 * sd
lower_loa = bias - 2 * sd

The results can now be plotted in a new scatter graph:

plt.scatter(means, diffs, c='k', s=20, alpha=0.6, marker='o')
plt.title('Bland-Altman Plot for Two Methods of Measuring PEFR')
plt.ylabel('Difference (L/min)')
plt.xlabel('Mean (L/min)')
# Get axis limits
left, right = plt.xlim()
bottom, top = plt.ylim()
# Set y-axis limits
max_y = max(abs(bottom), abs(top))
plt.ylim(-max_y * 1.1, max_y * 1.1)
# Set x-axis limits
domain = right - left
plt.xlim(left, left + domain * 1.1)

plt.show()

We should add in the zero line, the bias line and the limits of agreement to give these points some context. Horizontal lines that span the entire width like these can be added easily using axhline():

# Plot the zero line
plt.axhline(y=0, c='k', lw=0.5)
# Plot the bias and the limits of agreement
plt.axhline(y=upper_loa, c='grey', ls='--')
plt.axhline(y=bias, c='grey', ls='--')
plt.axhline(y=lower_loa, c='grey', ls='--')

plt.show()

Next, let’s add the values of the limits of agreement and of the bias right onto the graph. This can be done using annotations created by the annotate() function. Annotations consist of three elements:

  • The text to display on the graph
  • The xy coordinates of the point you want to annotate, specified as a tuple (ie two numbers separated by a comma and inside round brackets)
  • The x- and y-coordinates of the text label, more properly called xytext. This is again a tuple.
  • Optionally, you can choose the coordinate system used by the xytext argument. This option is called textcoords. We will use 'offset pixels' as this will allow us to specify the positions of our text labels relative to the points we are annotating.

These elements should be specified in the annotate() calls, in the same order as which they are discussed above.

# Add the annotations
plt.annotate('+2×SD', (right, upper_loa), (-10, 10), textcoords='offset pixels')
plt.annotate(f'{upper_loa:+4.2f}', (right, upper_loa), (-10, -27), textcoords='offset pixels')
plt.annotate('Bias', (right, bias), (-10, 10), textcoords='offset pixels')
plt.annotate(f'{bias:+4.2f}', (right, bias), (-10, -27), textcoords='offset pixels')
plt.annotate('-2×SD', (right, lower_loa), (-10, 10), textcoords='offset pixels')
plt.annotate(f'{lower_loa:+4.2f}', (right, lower_loa), (-10, -27), textcoords='offset pixels')

plt.show()

2.1 Confidence Intervals

Again, we will just replicate the code for calculating these here:

import scipy.stats as stats

# Sample size
n = df.shape[0]
# Variance
var = sd**2
# Standard error of the bias
se_bias = np.sqrt(var / n)
# Standard error of the limits of agreement
se_loas = np.sqrt(3 * var / n)
# Endpoints of the range that contains 95% of the Student’s t distribution
t_interval = stats.t.interval(alpha=0.95, df=n - 1)
# Confidence intervals
ci_bias = bias + np.array(t_interval) * se_bias
ci_upperloa = upper_loa + np.array(t_interval) * se_loas
ci_lowerloa = lower_loa + np.array(t_interval) * se_loas

These can now be added to our plot. This is done by plotting a series of straight lines:

# Plot the confidence intervals
plt.plot([left] * 2, list(ci_upperloa), c='grey', ls='--', alpha=0.5)
plt.plot([left] * 2, list(ci_bias), c='grey', ls='--', alpha=0.5)
plt.plot([left] * 2, list(ci_lowerloa), c='grey', ls='--', alpha=0.5)
# Plot the confidence intervals' caps
x_range = [left - domain * 0.025, left + domain * 0.025]
plt.plot(x_range, [ci_upperloa[1]] * 2, c='grey', ls='--', alpha=0.5)
plt.plot(x_range, [ci_upperloa[0]] * 2, c='grey', ls='--', alpha=0.5)
plt.plot(x_range, [ci_bias[1]] * 2, c='grey', ls='--', alpha=0.5)
plt.plot(x_range, [ci_bias[0]] * 2, c='grey', ls='--', alpha=0.5)
plt.plot(x_range, [ci_lowerloa[1]] * 2, c='grey', ls='--', alpha=0.5)
plt.plot(x_range, [ci_lowerloa[0]] * 2, c='grey', ls='--', alpha=0.5)
# Adjust the x- and y-axis limits
max_y = max(abs(ci_upperloa[1]), abs(ci_lowerloa[0]))
plt.ylim(-max_y * 1.05, max_y * 1.05)
plt.xlim(left - domain * 0.05, left + domain * 1.1)

plt.show()

3 Latex and Image Size

See here for more about using Latex formatting in the title and axes’ labels and see here for more about changing the image size. Here it is in action:

# Make figures A5 in size
A = 5
plt.rc('figure', figsize=[46.82 * .5**(.5 * A), 33.11 * .5**(.5 * A)])
# Image quality
plt.rc('figure', dpi=141)
# Be able to add Latex
plt.rc('text', usetex=True)
plt.rc('font', family='serif')
plt.rc('text.latex', preamble=r'\usepackage{textgreek}')

#
# Plot
#
plt.scatter(means, diffs, c='k', s=20, alpha=0.6, marker='o')
plt.title('Bland-Altman Plot for Two Methods of Measuring PEFR')
plt.ylabel(r'Difference, $d$ (L/min)')
plt.xlabel(r'Mean, $\mu$ (L/min)')
# Set x-axis limits
left, right = plt.xlim()
domain = right - left
plt.xlim(left - domain * 0.05, left + domain * 1.1)
# Plot the zero line
plt.axhline(y=0, c='k', lw=0.5)
# Plot the bias and the limits of agreement
plt.axhline(y=upper_loa, c='grey', ls='--')
plt.axhline(y=bias, c='grey', ls='--')
plt.axhline(y=lower_loa, c='grey', ls='--')
# Add the annotations
plt.annotate('+2×SD', (right, upper_loa), (-10, 10), textcoords='offset pixels')
plt.annotate(f'{upper_loa:+4.2f}', (right, upper_loa), (-10, -27), textcoords='offset pixels')
plt.annotate('Bias', (right, bias), (-10, 10), textcoords='offset pixels')
plt.annotate(f'{bias:+4.2f}', (right, bias), (-10, -27), textcoords='offset pixels')
plt.annotate('-2×SD', (right, lower_loa), (-10, 10), textcoords='offset pixels')
plt.annotate(f'{lower_loa:+4.2f}', (right, lower_loa), (-10, -27), textcoords='offset pixels')
# Plot the confidence intervals
plt.plot([left] * 2, list(ci_upperloa), c='grey', ls='--', alpha=0.5)
plt.plot([left] * 2, list(ci_bias), c='grey', ls='--', alpha=0.5)
plt.plot([left] * 2, list(ci_lowerloa), c='grey', ls='--', alpha=0.5)
# Plot the confidence intervals' caps
x_range = [left - domain * 0.025, left + domain * 0.025]
plt.plot(x_range, [ci_upperloa[1]] * 2, c='grey', ls='--', alpha=0.5)
plt.plot(x_range, [ci_upperloa[0]] * 2, c='grey', ls='--', alpha=0.5)
plt.plot(x_range, [ci_bias[1]] * 2, c='grey', ls='--', alpha=0.5)
plt.plot(x_range, [ci_bias[0]] * 2, c='grey', ls='--', alpha=0.5)
plt.plot(x_range, [ci_lowerloa[1]] * 2, c='grey', ls='--', alpha=0.5)
plt.plot(x_range, [ci_lowerloa[0]] * 2, c='grey', ls='--', alpha=0.5)
# Set y-axis limits
max_y = max(abs(ci_upperloa[1]), abs(ci_lowerloa[0]))
plt.ylim(-max_y * 1.05, max_y * 1.05)

plt.show()

Finish by saving the figure as a PNG, JPG, PDF or other type of image with savefig(<filename>) where <filename> is the name you want the image file to have with the extension included. If you are plotting more than one figure in the same Python script, use figure() and close() before and after each, respectively, in order to tell Python when one plot ends and the next one starts.

4 Using a Custom Function

Now that we have the code we need to plot both a value-value plot and a Bland-Altman plot we can put it into a function and call it on any dataset we want. Here’s another example from the Bland-Altman 1986 paper: the repeatability of the large measurement device (two rounds of measurements on the same participants, compared to each other):

def bland_altman_analysis(x, y):
    """Perform Bland-Altman analysis."""
    diffs = x - y
    means = (x + y) / 2
    # Average difference (aka the bias)
    bias = np.mean(diffs)
    # Sample standard deviation
    sd = np.std(diffs, ddof=1)
    # Limits of agreement
    lower_loa = bias - 1.96 * sd
    upper_loa = bias + 1.96 * sd
    # Within-subject sample variances
    var_w = np.var((x, y), axis=0, ddof=1)
    # Mean within-subject sample variance
    var_w = np.mean(var_w)
    # Within-subject sample standard deviation
    s_w = np.sqrt(var_w)
    # Coefficient of repeatability
    rc = 1.96 * np.sqrt(2) * s_w
    print(f'Coefficient of repeatability: {rc}')

    # Value-value plot
    plt.figure()
    plt.scatter(x, y, c='k', s=20, alpha=0.6, marker='o')
    plt.title('Peak Expiratory Flow Rate using a Large Wright Peak Flow Meter')
    plt.ylabel('Second Measurement (L/min)')
    plt.xlabel('First Measurement (L/min)')
    # Get axis limits
    left, right = plt.xlim()
    # Reference line
    plt.plot([0, right], [0, right], c='grey', ls='--', label='Line of Equality')
    # Set axis limits
    plt.ylim(0, right)
    plt.xlim(0, right)
    # Set aspect ratio
    plt.gca().set_aspect('equal')
    # Legend
    plt.legend(frameon=False)
    # Finish
    plt.show()
    plt.close()

    # Bland-Altman plot
    plt.figure()
    plt.scatter(means, diffs, c='k', s=20, alpha=0.6, marker='o')
    plt.title('Repeatability of the Large Wright Peak Flow Meter')
    plt.ylabel('Difference (L/min)')
    plt.xlabel('Mean (L/min)')
    # Get axis limits
    left, right = plt.xlim()
    bottom, top = plt.ylim()
    # Set y-axis limits
    max_y = max(abs(bottom), abs(top))
    plt.ylim(-max_y * 1.1, max_y * 1.1)
    # Set x-axis limits
    domain = right - left
    plt.xlim(left, left + domain * 1.1)
    # Plot the zero line
    plt.axhline(y=0, c='k', lw=0.5)
    # Plot the bias and the limits of agreement
    plt.axhline(y=upper_loa, c='grey', ls='--')
    plt.axhline(y=bias, c='grey', ls='--')
    plt.axhline(y=lower_loa, c='grey', ls='--')
    # Finish
    plt.show()
    plt.close()


x = np.array([
    494, 395, 516, 434, 476, 557, 413, 442, 650,
    433, 417, 656, 267, 478, 178, 423, 427
])
y = np.array([
    490, 397, 512, 401, 470, 611, 415, 431, 638,
    429, 420, 633, 275, 492, 165, 372, 421
])
bland_altman_analysis(x, y)
## Coefficient of repeatability: 42.42792199372817

⇦ Back