
If you have two groups of measurements with very different means, you might be tempted to attach significance to that immediately (and conclude that the measurements in one group will generally be larger than those in the other). However, if there is a lot of variability in the data then even a large difference in means might be meaningless. Think of it like a dartboard: a player whose throws land to the right of the bullseye on average might have a genuine bias, or might simply be scattering darts all over the board.

This is the idea behind effect size: the strength of the relationship between two variables depends on:

  1. The size of the relationship (how far to the right the person misses on average), and
  2. The variability in the data (how good the person is at darts)

Indeed, in general, effect size is the size of the relationship relative to the variability. There are three main formulas for calculating it: Cohen’s d, Glass’s Δ (delta) and Hedges’s g. All of these require two independent samples (and that you know their means and their standard deviations), and they differ only in how the standard deviation in the denominator is calculated.
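To make this concrete, here is a quick sketch (with made-up numbers, not the data used below) showing that the same difference in means can produce very different effect sizes depending on variability. The `cohens_d` helper is hypothetical; it implements the formula given in the Cohen’s d section further down:

```python
import numpy as np

def cohens_d(x, y):
    """Difference in means over the root of the mean of the two sample variances."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    s = np.sqrt((x.var(ddof=1) + y.var(ddof=1)) / 2)
    return abs(x.mean() - y.mean()) / s

# Two pairs of samples, each with the same difference in means (5),
# but with low vs high variability within the groups
tight_a = [10, 11, 10, 9, 10]
tight_b = [15, 16, 15, 14, 15]
wide_a = [2, 18, 5, 15, 10]
wide_b = [7, 23, 10, 20, 15]

print(f"Low variability:  d = {cohens_d(tight_a, tight_b):.2f}")
print(f"High variability: d = {cohens_d(wide_a, wide_b):.2f}")
```

The mean difference is 5 in both cases, yet the effect is much stronger for the low-variability pair.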

1 Example 1

Here’s a worked example that comes from this page from Real Statistics with the following raw data:

import pandas as pd

# Raw data
data = {
    'values': [
        2311, 2274, 2262, 2297, 2291, 2319, 2263, 2329, 2289, 2287, 2290, 2301,
        2298, 2260, 2250, 2242, 2302, 2297, 2293, 2286, 2270, 2313, 2327, 2290,
    ],
    'types': ['original'] * 12 + ['new'] * 12
}
df = pd.DataFrame(data)

The results can be verified using this online calculator.

1.1 Cohen’s d

The formula is:

\(d = \dfrac{\bar{x}_1 - \bar{x}_2}{s}\)

where \(\bar{x}_1\) and \(\bar{x}_2\) are the means of your two samples and \(s\) is the pooled standard deviation (the square root of the mean of your two samples’ variances).

import numpy as np

# Variances
variances = df.groupby('types').var(ddof=1)
# Mean variance
mean_var = variances.mean()['values']
# Pooled standard deviation
s_pooled = np.sqrt(mean_var)
# Difference of the means
diff_mean = abs(df.groupby('types').mean().diff()['values'].iloc[-1])
# Cohen's d
cohens_d = diff_mean / s_pooled

print(f"Cohen's d = {cohens_d:.3f}")
## Cohen's d = 0.306

1.2 Glass’s Δ

Glass’s delta uses only the standard deviation of the control group in the denominator:

\(Δ = \dfrac{\bar{x}_1 - \bar{x}_2}{s_1}\)

where \(\bar{x}_1\) and \(\bar{x}_2\) are the means of your two samples and \(s_1\) is the standard deviation of the first sample (or whichever one is your control group).

# Variances
variances = df.groupby('types').var(ddof=1)
# Difference of the means
diff_mean = abs(df.groupby('types').mean().diff()['values'].iloc[-1])
# Glass's delta (groupby sorts the groups alphabetically, so index by label)
glasss_delta = diff_mean / np.sqrt(variances['values']['new'])

print(f"Glass's delta = {glasss_delta:.3f}")
## Glass's delta = 0.278

1.3 Hedges’s g

Hedges’s g uses a different formula for the pooled standard deviation:

\(g = \dfrac{\bar{x}_1 - \bar{x}_2}{s^*}\)

where \(\bar{x}_1\) and \(\bar{x}_2\) are the means of your two samples and \(s^*\) is the pooled standard deviation as calculated with the following formula:

\(s^* = \sqrt{\dfrac{\left(n_1 - 1\right)s^2_1 + \left(n_2 - 1\right)s^2_2}{n_1 + n_2 - 2}}\)

where \(n_1\) and \(n_2\) are the sample sizes and \(s^2_1\) and \(s^2_2\) are the sample variances. Note that \(n_1 + n_2 - 2\) is the number of degrees of freedom.

# Sample sizes
n = df.groupby('types').count()
n1 = n['values']['new']
n2 = n['values']['original']
# Degrees of freedom
dof = n.sum()['values'] - 2
# Variances
variances = df.groupby('types').var(ddof=1)
var1 = variances['values']['new']
var2 = variances['values']['original']
# Difference of the means
diff_mean = abs(df.groupby('types').mean().diff()['values'].iloc[-1])
# Pooled standard deviation
s_pooled_star = np.sqrt((((n1 - 1) * var1) + ((n2 - 1) * var2)) / dof)
# Hedges's g
hedgess_g = diff_mean / s_pooled_star

print(f"Hedges's g = {hedgess_g:.3f}")
## Hedges's g = 0.306

Note that this is the same as Cohen’s d. This will always be the case when the sample sizes of the groups are the same.
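This can also be checked algebraically: when \(n_1 = n_2 = n\), the weighted pooled variance \(\frac{(n - 1)s^2_1 + (n - 1)s^2_2}{2n - 2}\) reduces to \(\frac{s^2_1 + s^2_2}{2}\), which is exactly the pooled variance used for Cohen’s d. A quick numerical sketch with arbitrary generated samples (not the data above):

```python
import numpy as np

# Two equal-sized samples of arbitrary generated data
rng = np.random.default_rng(42)
x = rng.normal(10, 2, size=20)
y = rng.normal(12, 2, size=20)
n1, n2 = len(x), len(y)

# Cohen's pooled SD: square root of the mean of the two variances
s_cohen = np.sqrt((x.var(ddof=1) + y.var(ddof=1)) / 2)
# Hedges's pooled SD: variances weighted by their degrees of freedom
s_hedges = np.sqrt(
    ((n1 - 1) * x.var(ddof=1) + (n2 - 1) * y.var(ddof=1)) / (n1 + n2 - 2)
)

# The two denominators coincide whenever n1 == n2
print(np.isclose(s_cohen, s_hedges))
```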

1.4 Maximum Likelihood

This is essentially a different definition of Cohen’s d where the pooled standard deviation is defined as:

\(s = \sqrt{\dfrac{\left(n_1 - 1\right)s^2_1 + \left(n_2 - 1\right)s^2_2}{n_1 + n_2}}\)

Note that this is identical to Hedges’s g except that there is no “\(-2\)” in the denominator in the radical.

# Sample sizes
n = df.groupby('types').count()
n1 = n['values']['new']
n2 = n['values']['original']
# Variances
variances = df.groupby('types').var(ddof=1)
var1 = variances['values']['new']
var2 = variances['values']['original']
# Difference of the means
diff_mean = abs(df.groupby('types').mean().diff()['values'].iloc[-1])
# Pooled standard deviation
s_pooled = np.sqrt((((n1 - 1) * var1) + ((n2 - 1) * var2)) / (n1 + n2))
# Maximum likelihood
maximum_likelihood = diff_mean / s_pooled

print(f"Maximum likelihood = {maximum_likelihood:.3f}")
## Maximum likelihood = 0.319
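Since all four measures share the same numerator and differ only in the standard deviation used in the denominator, they can be collected into a single helper function. This `effect_size` function is a sketch (not part of the original example) that reproduces the four results above:

```python
import numpy as np

def effect_size(x, y, kind='cohen'):
    """Difference in means over a pooled or reference SD; `kind` selects the measure."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n1, n2 = len(x), len(y)
    v1, v2 = x.var(ddof=1), y.var(ddof=1)
    if kind == 'cohen':
        s = np.sqrt((v1 + v2) / 2)
    elif kind == 'glass':
        s = np.sqrt(v1)  # SD of the first (reference) sample only
    elif kind == 'hedges':
        s = np.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    elif kind == 'ml':
        s = np.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2))
    else:
        raise ValueError(f'unknown kind: {kind}')
    return abs(x.mean() - y.mean()) / s

# The raw data from Example 1
new = [2298, 2260, 2250, 2242, 2302, 2297, 2293, 2286, 2270, 2313, 2327, 2290]
original = [2311, 2274, 2262, 2297, 2291, 2319, 2263, 2329, 2289, 2287, 2290, 2301]

for kind in ['cohen', 'glass', 'hedges', 'ml']:
    print(f'{kind}: {effect_size(new, original, kind):.3f}')
## cohen: 0.306
## glass: 0.278
## hedges: 0.306
## ml: 0.319
```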

2 Example 2

Here’s a worked example that comes from this page from Real Statistics with the following raw data:

import pandas as pd

# Raw data
data = {
    'values': [
        13, 17, 19, 10, 20, 15, 18, 9, 12, 15, 16,
        12, 8, 6, 16, 12, 14, 10, 18, 4, 11
    ],
    'types': ['new'] * 11 + ['old'] * 10
}
df = pd.DataFrame(data)

The results can be verified using this online calculator.

2.1 Cohen’s d

import numpy as np

# Variances
variances = df.groupby('types').var(ddof=1)
# Mean variance
mean_var = variances.mean()['values']
# Pooled standard deviation
s_pooled = np.sqrt(mean_var)
# Difference of the means
diff_mean = abs(df.groupby('types').mean().diff()['values'].iloc[-1])
# Cohen's d
cohens_d = diff_mean / s_pooled

print(f"Cohen's d = {cohens_d:.3f}")
## Cohen's d = 0.957

2.2 Glass’s Δ

# Variances
variances = df.groupby('types').var(ddof=1)
# Difference of the means
diff_mean = abs(df.groupby('types').mean().diff()['values'].iloc[-1])
# Glass's delta (groupby sorts the groups alphabetically, so index by label)
glasss_delta = diff_mean / np.sqrt(variances['values']['new'])

print(f"Glass's delta = {glasss_delta:.3f}")
## Glass's delta = 1.061

2.3 Hedges’s g

# Sample sizes
n = df.groupby('types').count()
n1 = n['values']['new']
n2 = n['values']['old']
# Degrees of freedom
dof = n.sum()['values'] - 2
# Variances
variances = df.groupby('types').var(ddof=1)
var1 = variances['values']['new']
var2 = variances['values']['old']
# Difference of the means
diff_mean = abs(df.groupby('types').mean().diff()['values'].iloc[-1])
# Pooled standard deviation
s_pooled_star = np.sqrt((((n1 - 1) * var1) + ((n2 - 1) * var2)) / dof)
# Hedges's g
hedgess_g = diff_mean / s_pooled_star

print(f"Hedges's g = {hedgess_g:.3f}")
## Hedges's g = 0.962

2.4 Maximum Likelihood

# Sample sizes
n = df.groupby('types').count()
n1 = n['values']['new']
n2 = n['values']['old']
# Variances
variances = df.groupby('types').var(ddof=1)
var1 = variances['values']['new']
var2 = variances['values']['old']
# Difference of the means
diff_mean = abs(df.groupby('types').mean().diff()['values'].iloc[-1])
# Pooled standard deviation
s_pooled = np.sqrt((((n1 - 1) * var1) + ((n2 - 1) * var2)) / (n1 + n2))
# Maximum likelihood
maximum_likelihood = diff_mean / s_pooled

print(f"Maximum likelihood = {maximum_likelihood:.3f}")
## Maximum likelihood = 1.011

3 Interpreting the Results

We can interpret the results as follows:

Effect Size   d, Δ or g   % of std dev
Very small    0.01        1%
Small         0.20        20%
Medium        0.50        50%
Large         0.80        80%
Very large    1.20        120%
Huge          2.00        200%
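If you want to turn this table into code, a hypothetical `interpret_effect_size` helper could use the table’s values as lower-bound cut-offs (the “Negligible” label for values below 0.01 is an addition, not from the table):

```python
def interpret_effect_size(d):
    """Label an effect size using the thresholds in the table above."""
    thresholds = [
        (2.00, 'Huge'), (1.20, 'Very large'), (0.80, 'Large'),
        (0.50, 'Medium'), (0.20, 'Small'), (0.01, 'Very small'),
    ]
    for cutoff, label in thresholds:
        if abs(d) >= cutoff:
            return label
    return 'Negligible'  # below every cut-off in the table

print(interpret_effect_size(0.306))  # Example 1's Cohen's d
## Small
print(interpret_effect_size(0.957))  # Example 2's Cohen's d
## Large
```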
