The code on this page uses the numpy, scipy, pandas, statsmodels and matplotlib packages. These can be installed from the terminal with:

$ python3.11 -m pip install numpy
$ python3.11 -m pip install scipy
$ python3.11 -m pip install pandas
$ python3.11 -m pip install statsmodels
$ python3.11 -m pip install matplotlib

where python3.11 corresponds to the version of Python you have installed and are using.

Import these packages into Python with:

import numpy as np
from scipy import stats as st
import pandas as pd
import statsmodels.api as sm
from matplotlib import pyplot as plt

For the statistical tests of normality, the hypotheses used are:

\(H_0\) : The sample is not significantly different than a normal population
\(H_1\) : The sample is significantly different than a normal population

We’ll use a significance level of 5% for the entire page:

# Chosen the p-value below which will indicate significance
alpha = 0.05

1 Example Data: The Standard Normal Distribution

Set a seed for the random numbers:

# Set the seed so that we get the same random numbers each time this code runs
np.random.seed(20230811)

Create and plot the example data:

# Draw random numbers from the standard normal distribution
mu, sigma = 0, 1
data = np.random.normal(mu, sigma, 1000)

# Plot the random numbers
label = '1000 Random Numbers'
plt.hist(data, density=True, color='gray', label=label)
# Plot the exact distribution
x = np.linspace(-3, 3, 100)
y = st.norm.pdf(x, mu, sigma)
plt.plot(x, y, 'k--', label='Standard Normal Distribution')
# Format
plt.title('Example Data')
plt.ylabel('Probability Density')
plt.xlabel('Random Variables')
plt.ylim([0, 0.5])
plt.xlim([-3, 3])
plt.legend(frameon=False)
plt.show()

1.1 D’Agostino’s (and Pearson’s) K-squared Test

See the documentation here and the Wikipedia page here.

statistic, p_value = st.normaltest(data)
if p_value > alpha:
    result = 'Normal'
else:
    result = 'NOT Normal'
print(f"D'Agostino & Pearson: {result:>11s}")
print(statistic)

## D'Agostino & Pearson:      Normal
## 4.612690205727473

1.2 Jarque-Bera Goodness-of-Fit Test

See the documentation here and the Wikipedia page here.

statistic, pvalue = st.jarque_bera(data)
if pvalue > alpha:
    result = 'Normal'
else:
    result = 'NOT Normal'
print(f'Jarque-Bera: {result:>20s}')
print(statistic)

## Jarque-Bera:               Normal
## 3.880393602164786

1.3 Kolmogorov–Smirnov Test

Note that you cannot actually use the Kolmogorov-Smirnov test as a normality test; this function uses the Lilliefors test in the background (which is based on Kolmogorov-Smirnov).

See the documentation here (note that this is identical to the documentation for lilliefors()) and the Wikipedia page here.

ksstat, pvalue = sm.stats.diagnostic.kstest_normal(data)
if pvalue > alpha:
    result = 'Normal'
else:
    result = 'NOT Normal'
print(f'Kolmogorov-Smirnov: {result:>13s}')
print(ksstat)

## Kolmogorov-Smirnov:        Normal
## 0.0184928859912723

1.4 Lilliefors Test

The Lilliefors test is a Kolmogorov-Smirnov test with estimated parameters.

See the documentation here (note that this is identical to the documentation for kstest_normal()) and the Wikipedia page here.

ksstat, pvalue = sm.stats.diagnostic.lilliefors(data)
if pvalue > alpha:
    result = 'Normal'
else:
    result = 'NOT Normal'
print(f'Lilliefors: {result:>21s}')
print(ksstat)

## Lilliefors:                Normal
## 0.0184928859912723

Note that the test statistic (ksstat) is the same as the one for the Kolmogorov–Smirnov test.

1.5 Shapiro–Wilk Test

See the documentation here and the Wikipedia page here.

statistic, pvalue = st.shapiro(data)
if pvalue > alpha:
    result = 'Normal'
else:
    result = 'NOT Normal'
print(f'Shapiro-Wilk: {result:>19s}')
print(statistic)

## Shapiro-Wilk:              Normal
## 0.9976267218589783

1.6 Anderson–Darling Test

This tests for whether data comes from a particular distribution.

See the documentation here and the Wikipedia page here.

result = st.anderson(data)
idx = np.where(result.significance_level == alpha * 100)
if result.statistic < result.critical_values[idx]:
    verdict = 'Normal'
else:
    verdict = 'NOT Normal'
print(f'Anderson–Darling: {verdict:>15s}')
print(result.statistic)

## Anderson–Darling:          Normal
## 0.4273934579834986

1.7 Probability Plot

A probability plot shows how closely two data sets agree. Types of probability plots include:

P–P plots (probability-probability plots/percent-percent plots)
Q–Q plots (quantile-quantile plots)
- Normal probability plots (Q–Q plots against the standard normal distribution)

SciPy’s probplot() function can produce a probability plot, see the SciPy documentation here and the Wikipedia disambiguation page here.

st.probplot(data, dist='norm', plot=plt)
plt.show()
plt.close()

The easiest way to control the format of the plot is to take the objects that the function returns and plot those manually:

ax = plt.axes()
ax.set_title('Probability Plot')
ax.set_ylabel('Ordered Values')
ax.set_xlabel('Theoretical Quantiles')
ax.axline((0, 0), slope=1, c='gray', ls='--')
ax.plot(*st.probplot(data, dist='norm')[0], c='k', lw=3)
axis_lim = max([abs(min(data)), max(data)])
ax.set_xlim([-axis_lim, axis_lim])
ax.set_ylim([-axis_lim, axis_lim])
ax.set_aspect('equal')
ax.grid(ls='--')
plt.show()
plt.close()

1.8 Q–Q Plot

A quantile-quantile plot shows the quantiles of two probability distributions. If the points lie close to or on the 45° line then it can be concluded that they are similar. This is a visual aid in testing for normality and should be checked in addition to the numerical test(s) because those do not tell you about behaviour at the tails.

Statsmodels has the qqplot() function that will create a Q–Q plot, see the documentation here and the Wikipedia page here.

sm.qqplot(data, line='45')
plt.show()
plt.close()

Controlling the format here is a bit tricky. The easiest way is probably to use the ProbPlot() function to create a probability plot that can then be plotted with qqplot() and qqline() to get a Q-Q plot. Use the ax keyword argument to use a Matplotlib Axes object:

ax = plt.axes()
ax.set_title('Q-Q Plot')
ax.set_ylabel('Sample Quantiles')
ax.set_xlabel('Theoretical Quantiles')
pp = sm.ProbPlot(data)
qq = pp.qqplot(marker='.', markerfacecolor='gray', markeredgecolor='k', alpha=0.3, ax=ax)
sm.qqline(qq.axes[0], line='45', color='gray', linestyle='--')
axis_lim = max([abs(min(data)), max(data)])
ax.set_xlim([-axis_lim, axis_lim])
ax.set_ylim([-axis_lim, axis_lim])
ax.set_aspect('equal')
ax.grid(ls='--')
plt.show()
plt.close()

2 Real-World Data: Mexican Population Densities

Taken from here.

dct = {
    'Region': [
        'Ajuno', 'Angahuan', 'Arantepacua', 'Aranza', 'Charapan', 'Cheran',
        'Cocucho', 'Comachuen', 'Corupo', 'Ihuatzio', 'Janitzio', 'Jaracuaro',
        'Nahuatzen', 'Nurio', 'Paracho', 'Patzcuaro', 'Pichataro',
        'Pomacuaran', 'Quinceo', 'Quiroga', 'San Felipe', 'San Lorenzo',
        'Sevina', 'Tingambato', 'Turicuaro', 'Tzintzuntzan', 'Urapicho'
    ],
    'Population Density': [
        5.11, 5.15, 5.00, 4.13, 5.10, 5.22, 5.04, 5.25, 4.53, 5.74, 6.63, 5.73,
        4.77, 6.06, 4.82, 4.98, 5.36, 4.96, 5.94, 5.01, 4.10, 4.69, 4.97, 5.01,
        6.19, 4.67, 6.30
    ]
}
df = pd.DataFrame(dct)
print(df)

##           Region  Population Density
## 0          Ajuno                5.11
## 1       Angahuan                5.15
## 2    Arantepacua                5.00
## 3         Aranza                4.13
## 4       Charapan                5.10
## 5         Cheran                5.22
## 6        Cocucho                5.04
## 7      Comachuen                5.25
## 8         Corupo                4.53
## 9       Ihuatzio                5.74
## 10      Janitzio                6.63
## 11     Jaracuaro                5.73
## 12     Nahuatzen                4.77
## 13         Nurio                6.06
## 14       Paracho                4.82
## 15     Patzcuaro                4.98
## 16     Pichataro                5.36
## 17    Pomacuaran                4.96
## 18       Quinceo                5.94
## 19       Quiroga                5.01
## 20    San Felipe                4.10
## 21   San Lorenzo                4.69
## 22        Sevina                4.97
## 23    Tingambato                5.01
## 24     Turicuaro                6.19
## 25  Tzintzuntzan                4.67
## 26      Urapicho                6.30

2.1 Plot the Raw Data

# Plot the data
data = df['Population Density']
plt.hist(data, density=True, color='gray', label='Population Densities')
# Plot a normal distribution
x_bar = np.mean(data)
s = np.std(data)
x = np.linspace(x_bar - 3 * s, x_bar + 3 * s, 100)
y = st.norm.pdf(x, x_bar, s)
plt.plot(x, y, 'k--', label='Normal Distribution')
# Axes
plt.xlim([x_bar - 3 * s, x_bar + 3 * s])
# Finish
plt.title('Population Densities in Mexican Regions')
plt.ylabel('Probability Density')
plt.xlabel('Binned Values')
plt.legend(frameon=False)
plt.show()
plt.close()

2.2 Statistical Tests for Normality

# D'Agostino's (and Pearson's) K² test
if st.normaltest(data)[1] > alpha:
    print("D'Agostino & Pearson:        Normal")
else:
    print("D'Agostino & Pearson:    NOT Normal")

# Jarque-Bera goodness-of-fit test
if st.jarque_bera(data)[1] > alpha:
    print('Jarque-Bera:                 Normal')
else:
    print('Jarque-Bera:             NOT Normal')

# Lilliefors test (based on the Kolmogorov–Smirnov test)
if sm.stats.diagnostic.lilliefors(data)[1] > alpha:
    print('Lilliefors:                  Normal')
else:
    print('Lilliefors:              NOT Normal')

# Shapiro–Wilk test
if st.shapiro(data)[1] > alpha:
    print('Shapiro-Wilk:                Normal')
else:
    print('Shapiro-Wilk:            NOT Normal')

# Anderson–Darling test
result = st.anderson(data)
if result[0] < result[1][2]:
    print('Anderson–Darling:            Normal')
else:
    print('Anderson–Darling:        NOT Normal')

## D'Agostino & Pearson:        Normal
## Jarque-Bera:                 Normal
## Lilliefors:              NOT Normal
## Shapiro-Wilk:                Normal
## Anderson–Darling:        NOT Normal

2.3 Compare to the Source

The source performs seven statistical normality tests, two of which are the same (Lilliefors and Kolmogorov-Smirnov) and two of which are not implemented in Python (the W/S test - not to be confused with the Shapiro–Wilk test - and D’Agostino’s D test - not to be confused with D’Agostino’s K² test). The test statistics of the remaining four tests match those given by Python:

print(f'Jarque–Bera test:      {st.jarque_bera(data)[0]:.3f}')
print(f'Lilliefors test:       {sm.stats.diagnostic.lilliefors(data)[0]:.3f}')
print(f'Shapiro–Wilk test:     {st.shapiro(data)[0]:.3f}')
print(f'Anderson–Darling test: {result.statistic:.3f}')

## Jarque–Bera test:      1.209
## Lilliefors test:       0.173
## Shapiro–Wilk test:     0.943
## Anderson–Darling test: 0.764

So our results match the source.

2.4 Graphical Tests for Normality

2.4.1 Probability Plot

ax = plt.axes()
ax.set_title('Probability Plot')
ax.set_ylabel('Ordered Values')
ax.set_xlabel('Theoretical Quantiles')
(osm, osr), (slope, intercept, r) = st.probplot(data, dist='norm')
ax.plot(osm, osr, c='k', lw=3)
ax.axline((0, intercept), slope=slope, c='gray', ls='--')
ax.set_ylim([min(osr), max(osr)])
ax.set_xlim([min(osm), max(osm)])
ax.grid(ls='--')
plt.show()
plt.close()

2.4.2 Q–Q Plot

ax = plt.axes()
ax.set_title('Q-Q Plot')
ax.set_ylabel('Sample Quantiles')
ax.set_xlabel('Theoretical Quantiles')
pp = sm.ProbPlot(data)
qq = pp.qqplot(markerfacecolor='gray', markeredgecolor='k', ax=ax)
x = pp.theoretical_quantiles
y = pp.sample_quantiles
sm.qqline(qq.axes[0], x=x, y=y, line='s', color='gray', linestyle='--')
ax.set_xlim([min(x), max(x)])
ax.grid(ls='--')
plt.show()
plt.close()

⇦ Back

Statistics in Python:Tests for Normality

1 Example Data: The Standard Normal Distribution

1.1 D’Agostino’s (and Pearson’s) K-squared Test

1.2 Jarque-Bera Goodness-of-Fit Test

1.3 Kolmogorov–Smirnov Test

1.4 Lilliefors Test

1.5 Shapiro–Wilk Test

1.6 Anderson–Darling Test

1.7 Probability Plot

1.8 Q–Q Plot

2 Real-World Data: Mexican Population Densities

2.1 Plot the Raw Data

2.2 Statistical Tests for Normality

2.3 Compare to the Source

2.4 Graphical Tests for Normality

2.4.1 Probability Plot

2.4.2 Q–Q Plot

Statistics in Python:
Tests for Normality