This example uses the built-in ‘chickwts’ dataset, which records the weights of 71 chicks that were fed different diets.
An experiment was conducted to measure and compare the effectiveness of various feed supplements on the growth rate of chickens. Newly-hatched chicks were randomly allocated into six groups, and each group was given a different feed supplement. Their weights in grams after six weeks are given along with feed types.
Here’s an idea of what the data looks like:
print(head(chickwts, 15))
## weight feed
## 1 179 horsebean
## 2 160 horsebean
## 3 136 horsebean
## 4 227 horsebean
## 5 217 horsebean
## 6 168 horsebean
## 7 108 horsebean
## 8 124 horsebean
## 9 143 horsebean
## 10 140 horsebean
## 11 309 linseed
## 12 229 linseed
## 13 181 linseed
## 14 141 linseed
## 15 260 linseed
Let’s see what the six groups (type of feed given to the chickens) are:
for (feed_type in unique(chickwts$feed)) {
  print(feed_type)
}
## [1] "horsebean"
## [1] "linseed"
## [1] "soybean"
## [1] "sunflower"
## [1] "meatmeal"
## [1] "casein"
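Before choosing a test it can help to check how many chicks are in each group and how the typical weights compare. The following is a minimal sketch using only base R; the variable names group_sizes and group_medians are illustrative and not part of the original example.

```r
# Number of chicks in each feed group (chickwts ships with base R)
group_sizes <- table(chickwts$feed)
print(group_sizes)

# Median weight (in grams) per feed group; the Kruskal-Wallis test is
# rank-based, so medians are a natural summary to look at alongside it
group_medians <- tapply(chickwts$weight, chickwts$feed, median)
print(sort(group_medians))
```

The group sizes should add up to the 71 chicks in the dataset.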
Using the above flowchart we see that we should use Kruskal-Wallis one-way ANOVA.
The Kruskal-Wallis test is performed using the kruskal.test() function, which accepts R’s formula (‘tilde’) notation. This format requires that the dependent variable be specified first, followed by a tilde (“~”, which in an R formula reads as ‘is modelled as a function of’), followed by the independent (grouping) variable: weight ~ feed
kruskal.test(weight ~ feed, data = chickwts)
##
## Kruskal-Wallis rank sum test
##
## data: weight by feed
## Kruskal-Wallis chi-squared = 37.343, df = 5, p-value = 5.113e-07
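The formula is a convenience rather than a requirement: kruskal.test() also accepts the values and the grouping factor as two separate arguments. The sketch below, with illustrative variable names res_formula and res_default, checks that both ways of calling the function agree on this dataset.

```r
# Formula interface: response ~ grouping variable, looked up in `data`
res_formula <- kruskal.test(weight ~ feed, data = chickwts)

# Default interface: a numeric vector of values and a grouping factor
res_default <- kruskal.test(chickwts$weight, chickwts$feed)

# Both calls rank the same data, so statistic and p-value should match
print(all.equal(unname(res_formula$statistic), unname(res_default$statistic)))
print(all.equal(res_formula$p.value, res_default$p.value))
```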
As we can see from the output, the p-value is 5.113e-07. This is strong evidence against the null hypothesis that weights follow the same distribution under all six feeds, and hence in favour of the alternative hypothesis (that chicks grow at different rates on at least some of these diets). The individual values that have been returned can be accessed by assigning the result of kruskal.test() to a variable and then extracting its elements with $:
k <- kruskal.test(weight ~ feed, data = chickwts)
# Get the Kruskal-Wallis rank sum statistic
chi_squared <- k$statistic
# Get the degrees of freedom of the approximate chi-squared distribution of
# the test statistic
df <- k$parameter
# Get the p-value of the test
p <- k$p.value
# Get the character string "Kruskal-Wallis rank sum test"
name_of_test <- k$method
# Get a character string giving the names of the data
comparison <- k$data.name
# Print them all
print(chi_squared)
print(df)
print(p)
print(name_of_test)
print(comparison)
## Kruskal-Wallis chi-squared
## 37.34272
## df
## 5
## [1] 5.11283e-07
## [1] "Kruskal-Wallis rank sum test"
## [1] "weight by feed"
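To see how these pieces fit together, the p-value can be recomputed by hand from the statistic and the degrees of freedom: it is the upper-tail probability of the approximate chi-squared distribution. This is a minimal sketch; the names res, stat, dof and p_manual are illustrative.

```r
res <- kruskal.test(weight ~ feed, data = chickwts)

# Drop the names so we work with plain numbers
stat <- unname(res$statistic)
dof <- unname(res$parameter)

# Probability of a chi-squared value at least as large as the statistic
p_manual <- pchisq(stat, df = dof, lower.tail = FALSE)
print(p_manual)

# Should agree with the p-value reported by kruskal.test()
print(all.equal(p_manual, res$p.value))
```

Note that the degrees of freedom are the number of groups minus one (six feeds, so five).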