Question
The penguins data in base R provides various measurements of adult
penguins from three different species. See ?penguins for more details.
Originally, the data was used to study sex dimorphism separately for
the three species.
The first three rows of the data can be inspected as follows. Employ
summary() to obtain a first overview.
R> data("penguins", package = "datasets")
R> head(penguins, 3)
species island bill_len bill_dep flipper_len body_mass sex year
1 Adelie Torgersen 39.1 18.7 181 3750 male 2007
2 Adelie Torgersen 39.5 17.4 186 3800 female 2007
3 Adelie Torgersen 40.3 18.0 195 3250 female 2007
Explore the sex differences with respect to body mass (weight, in grams)
of the penguins. Create parallel boxplots of weight by sex, such as the
one below, separately for the three species.
Which species does this plot pertain to?
To complement the plot compute median, mean, and standard deviation (left to right)
for the female and male penguins (top down), respectively.
The average weight difference of
is thus slightly higher / lower than the
median weight difference of .
Compute the full summary() of weight by sex for this species and select
the correct statements in the following list.
Less than half of the male penguins weigh more than 3731.25 grams. / None of the penguins weighs less than 2700 grams. / The standard deviation of weight is lower for males compared to females.
Solution
One way to obtain the exploratory boxplots separately for the three species is:
R> par(mfrow = c(1, 3))
R> for(i in levels(penguins$species)) plot(body_mass ~ sex, data = penguins,
+ subset = species == i, main = i, ylim = range(penguins$body_mass, na.rm = TRUE))
The question shows the parallel boxplots for the Chinstrap species.
Groupwise statistics of body mass by sex and species (including mean, median,
and standard deviation) can be obtained by aggregating the data with the combined
summary() and sd() functions.
R> aggregate(body_mass ~ sex + species, data = penguins,
+ FUN = function(x) c(summary(x), `Std. dev.` = sd(x)))
sex species body_mass.Min. body_mass.1st Qu. body_mass.Median
1 female Adelie 2850.0000 3175.0000 3400.0000
2 male Adelie 3325.0000 3800.0000 4000.0000
3 female Chinstrap 2700.0000 3362.5000 3550.0000
4 male Chinstrap 3250.0000 3731.2500 3950.0000
5 female Gentoo 3950.0000 4462.5000 4700.0000
6 male Gentoo 4750.0000 5300.0000 5500.0000
body_mass.Mean body_mass.3rd Qu. body_mass.Max. body_mass.Std. dev.
1 3368.8356 3550.0000 3900.0000 269.3801
2 4043.4932 4300.0000 4775.0000 346.8116
3 3527.2059 3693.7500 4150.0000 285.3339
4 3938.9706 4100.0000 4800.0000 362.1376
5 4679.7414 4875.0000 5200.0000 281.5783
6 5484.8361 5700.0000 6300.0000 313.1586
Based on this the remaining elements of the question can be answered.