penguins: Cloze Quiz for Exploratory Analysis of Penguins Data

Cloze exercise exploring sex differences in body mass for a randomly selected species of penguins, based on the eponymous data in base R.

Name:
penguins
Type:
Related:
Preview:

The penguins data in base R provides various measurements of adult penguins from three different species. See ?penguins for more details. Originally, the data was used to study sex dimorphism separately for the three species.

The first three rows of the data can be inspected as follows. Employ summary() to obtain a first overview.

data("penguins", package = "datasets")
head(penguins, 3)
##   species    island bill_len bill_dep flipper_len body_mass    sex year
## 1  Adelie Torgersen     39.1     18.7         181      3750   male 2007
## 2  Adelie Torgersen     39.5     17.4         186      3800 female 2007
## 3  Adelie Torgersen     40.3     18.0         195      3250 female 2007

Explore the sex differences with respect to body mass (weight, in grams) of the penguins. Create parallel boxplots of weight by sex, such as the one below, separately for the three species.

Which species does this plot pertain to?

To complement the plot complete the corresponding table of groupwise statistics:

median mean std. deviation
female
male

The average weight difference of is thus slightly than the median weight difference of .

Compute the full summary() of weight by sex for this species and select the correct statements in the following list.

One way to obtain the exploratory boxplots separately for the three species is:

par(mfrow = c(1, 3))
for(i in levels(penguins$species)) plot(body_mass ~ sex, data = penguins,
  subset = species == i, main = i, ylim = range(penguins$body_mass, na.rm = TRUE))

The question shows the parallel boxplots for the Chinstrap species.

Groupwise statistics of body mass by sex and species (including mean, median, and standard deviation) can be obtained by aggregating the data with the combined summary() and sd() functions.

aggregate(body_mass ~ sex + species, data = penguins,
  FUN = function(x) c(summary(x), `Std. dev.` = sd(x)))
##      sex   species body_mass.Min. body_mass.1st Qu. body_mass.Median
## 1 female    Adelie      2850.0000         3175.0000        3400.0000
## 2   male    Adelie      3325.0000         3800.0000        4000.0000
## 3 female Chinstrap      2700.0000         3362.5000        3550.0000
## 4   male Chinstrap      3250.0000         3731.2500        3950.0000
## 5 female    Gentoo      3950.0000         4462.5000        4700.0000
## 6   male    Gentoo      4750.0000         5300.0000        5500.0000
##   body_mass.Mean body_mass.3rd Qu. body_mass.Max. body_mass.Std. dev.
## 1      3368.8356         3550.0000      3900.0000            269.3801
## 2      4043.4932         4300.0000      4775.0000            346.8116
## 3      3527.2059         3693.7500      4150.0000            285.3339
## 4      3938.9706         4100.0000      4800.0000            362.1376
## 5      4679.7414         4875.0000      5200.0000            281.5783
## 6      5484.8361         5700.0000      6300.0000            313.1586

Based on this the remaining elements of the question can be answered.

The penguins data in base R provides various measurements of adult penguins from three different species. See ?penguins for more details. Originally, the data was used to study sex dimorphism separately for the three species.

The first three rows of the data can be inspected as follows. Employ summary() to obtain a first overview.

data("penguins", package = "datasets")
head(penguins, 3)
##   species    island bill_len bill_dep flipper_len body_mass    sex year
## 1  Adelie Torgersen     39.1     18.7         181      3750   male 2007
## 2  Adelie Torgersen     39.5     17.4         186      3800 female 2007
## 3  Adelie Torgersen     40.3     18.0         195      3250 female 2007

Explore the sex differences with respect to body mass (weight, in grams) of the penguins. Create parallel boxplots of weight by sex, such as the one below, separately for the three species.

Which species does this plot pertain to?

To complement the plot complete the corresponding table of groupwise statistics:

median mean std. deviation
female
male

The average weight difference of is thus slightly than the median weight difference of .

Compute the full summary() of weight by sex for this species and select the correct statements in the following list.

One way to obtain the exploratory boxplots separately for the three species is:

par(mfrow = c(1, 3))
for(i in levels(penguins$species)) plot(body_mass ~ sex, data = penguins,
  subset = species == i, main = i, ylim = range(penguins$body_mass, na.rm = TRUE))

The question shows the parallel boxplots for the Adelie species.

Groupwise statistics of body mass by sex and species (including mean, median, and standard deviation) can be obtained by aggregating the data with the combined summary() and sd() functions.

aggregate(body_mass ~ sex + species, data = penguins,
  FUN = function(x) c(summary(x), `Std. dev.` = sd(x)))
##      sex   species body_mass.Min. body_mass.1st Qu. body_mass.Median
## 1 female    Adelie      2850.0000         3175.0000        3400.0000
## 2   male    Adelie      3325.0000         3800.0000        4000.0000
## 3 female Chinstrap      2700.0000         3362.5000        3550.0000
## 4   male Chinstrap      3250.0000         3731.2500        3950.0000
## 5 female    Gentoo      3950.0000         4462.5000        4700.0000
## 6   male    Gentoo      4750.0000         5300.0000        5500.0000
##   body_mass.Mean body_mass.3rd Qu. body_mass.Max. body_mass.Std. dev.
## 1      3368.8356         3550.0000      3900.0000            269.3801
## 2      4043.4932         4300.0000      4775.0000            346.8116
## 3      3527.2059         3693.7500      4150.0000            285.3339
## 4      3938.9706         4100.0000      4800.0000            362.1376
## 5      4679.7414         4875.0000      5200.0000            281.5783
## 6      5484.8361         5700.0000      6300.0000            313.1586

Based on this the remaining elements of the question can be answered.

The penguins data in base R provides various measurements of adult penguins from three different species. See ?penguins for more details. Originally, the data was used to study sex dimorphism separately for the three species.

The first three rows of the data can be inspected as follows. Employ summary() to obtain a first overview.

data("penguins", package = "datasets")
head(penguins, 3)
##   species    island bill_len bill_dep flipper_len body_mass    sex year
## 1  Adelie Torgersen     39.1     18.7         181      3750   male 2007
## 2  Adelie Torgersen     39.5     17.4         186      3800 female 2007
## 3  Adelie Torgersen     40.3     18.0         195      3250 female 2007

Explore the sex differences with respect to body mass (weight, in grams) of the penguins. Create parallel boxplots of weight by sex, such as the one below, separately for the three species.

Which species does this plot pertain to?

To complement the plot complete the corresponding table of groupwise statistics:

median mean std. deviation
female
male

The average weight difference of is thus slightly than the median weight difference of .

Compute the full summary() of weight by sex for this species and select the correct statements in the following list.

One way to obtain the exploratory boxplots separately for the three species is:

par(mfrow = c(1, 3))
for(i in levels(penguins$species)) plot(body_mass ~ sex, data = penguins,
  subset = species == i, main = i, ylim = range(penguins$body_mass, na.rm = TRUE))

The question shows the parallel boxplots for the Chinstrap species.

Groupwise statistics of body mass by sex and species (including mean, median, and standard deviation) can be obtained by aggregating the data with the combined summary() and sd() functions.

aggregate(body_mass ~ sex + species, data = penguins,
  FUN = function(x) c(summary(x), `Std. dev.` = sd(x)))
##      sex   species body_mass.Min. body_mass.1st Qu. body_mass.Median
## 1 female    Adelie      2850.0000         3175.0000        3400.0000
## 2   male    Adelie      3325.0000         3800.0000        4000.0000
## 3 female Chinstrap      2700.0000         3362.5000        3550.0000
## 4   male Chinstrap      3250.0000         3731.2500        3950.0000
## 5 female    Gentoo      3950.0000         4462.5000        4700.0000
## 6   male    Gentoo      4750.0000         5300.0000        5500.0000
##   body_mass.Mean body_mass.3rd Qu. body_mass.Max. body_mass.Std. dev.
## 1      3368.8356         3550.0000      3900.0000            269.3801
## 2      4043.4932         4300.0000      4775.0000            346.8116
## 3      3527.2059         3693.7500      4150.0000            285.3339
## 4      3938.9706         4100.0000      4800.0000            362.1376
## 5      4679.7414         4875.0000      5200.0000            281.5783
## 6      5484.8361         5700.0000      6300.0000            313.1586

Based on this the remaining elements of the question can be answered.

Description:
This cloze exercise randomly selects one (out of three) species of penguins from the eponymous data set in base R and then asks for a number of exploratory analyses investigating sex differences in body weight for the selected penguin species. The exercise template illustrates how such a data-based cloze including all types of items (numeric, string, single-choice, multiple-choice) can be created relatively easily using the add_cloze() function.
Solution feedback:
Yes
Randomization:
Shuffling (1 out of 3 species)
Mathematical notation:
No
Verbatim R input/output:
Yes
Images:
Yes
Other supplements:
No
Raw: (1 random version)
PDF:
penguins-Rmd-pdf
penguins-Rnw-pdf
HTML:
penguins-Rmd-html
penguins-Rnw-html

Demo code:

library("exams")

set.seed(403)
exams2html("penguins.Rmd")
set.seed(403)
exams2pdf("penguins.Rmd")

set.seed(403)
exams2html("penguins.Rnw")
set.seed(403)
exams2pdf("penguins.Rnw")