Exam 1

  1. Question

    Consider the following table:

    Name Length Weight
    Fritz 187 85
    Wilhelm 161 66
    Dieter 163 66
    Detlef 195 98

    What is the average of the variable “Length”?


    Solution

    The average “Length” is 176.5:

    \[\bar x = \frac{187 + 161 + 163 + 195}{4} = 176.5.\]
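
    The arithmetic above can be checked with a few lines of Python. This is only a sketch; the variable names `lengths` and `average_length` are chosen here for illustration and do not come from the exam.

```python
from statistics import mean

# "Length" column of the table (Fritz, Wilhelm, Dieter, Detlef)
lengths = [187, 161, 163, 195]

# Arithmetic mean: sum of the values divided by their count
average_length = mean(lengths)
print(average_length)
```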


  2. Question

    A machine fills milk into \(250\) ml packages. It is suspected that the machine is not working correctly and that the amount of milk filled differs from the setpoint \(\mu_0 = 250\). A sample of \(157\) packages filled by the machine is collected. The sample mean \(\bar{y}\) is equal to \(240.8\) and the sample variance \(s^2_{n-1}\) is equal to \(137.67\).

    Test the hypothesis that the amount filled corresponds on average to the setpoint. What is the value of the t-test statistic?


    1. \(25.240\)
    2. \(-9.825\)
    3. \(-9.495\)
    4. \(5.415\)
    5. \(28.504\)

    Solution

    The t-test statistic is calculated by:

    \[ t = \frac{\bar y - \mu_0}{\sqrt{\frac{s^2_{n-1}}{n}}} = \frac{240.8 - 250}{\sqrt{\frac{137.67}{157}}} = -9.825. \]

    The t-test statistic is thus equal to \(-9.825\).


    1. False
    2. True
    3. False
    4. False
    5. False
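
    The one-sample t statistic from the solution can be reproduced as follows. This is a minimal sketch using only the summary values given in the question; the variable names are illustrative.

```python
from math import sqrt

n = 157        # sample size
y_bar = 240.8  # sample mean
mu_0 = 250.0   # setpoint under the null hypothesis
s2 = 137.67    # sample variance (denominator n - 1)

# Standard error of the mean, then the standardized deviation from mu_0
standard_error = sqrt(s2 / n)
t_stat = (y_bar - mu_0) / standard_error
print(round(t_stat, 3))
```

    The negative sign reflects that the sample mean lies below the setpoint.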

  3. Question

    In a small city the satisfaction with the local public transportation is evaluated. One question of interest is whether inhabitants of the city are more satisfied with public transportation compared to those living in the suburbs.

    A survey with 250 respondents gave the following contingency table:

               Location
    Evaluation  City Suburbs
      Very good   18      22
      Good        36      23
      Bad         36      66
      Very bad    10      39
    

    The following table of percentages was constructed:

               Location
    Evaluation  City    Suburbs
      Very good    18.0    14.7
      Good         36.0    15.3
      Bad          36.0    44.0
      Very bad     10.0    26.0
    

    Which of the following statements are correct?


    1. The percentage table provides the satisfaction distribution for each location type.
    2. The percentage table provides row percentages.
    3. The value in row 1 and column 2 in the percentage table indicates: 14.7 percent of those living in the suburbs evaluated the public transportation as very good.
    4. The value in row 2 and column 2 in the percentage table indicates: 15.3 percent of those who evaluated the public transportation as good live in the suburbs.
    5. The percentage table can be easily constructed from the original contingency table: percentages are calculated for each column.

    Solution

    In the percentage table, each column sums to 100 (up to rounding errors). Hence, the table provides column percentages, i.e., conditional relative frequencies for satisfaction level given location type.


    1. True. Each column sums to 100 (up to rounding errors).
    2. False. The row sums are not equal to 100.
    3. True. This is the correct interpretation for column percentages.
    4. False. This is an interpretation for row percentages, but the table provides column percentages.
    5. True. This calculation yields column percentages.
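
    Column percentages can be recomputed from the contingency table by dividing each count by its column total. The following sketch assumes the counts are stored in a nested dictionary; the names `counts` and `column_percentages` are illustrative.

```python
# Counts from the contingency table, stored per location (one column each)
counts = {
    "City": {"Very good": 18, "Good": 36, "Bad": 36, "Very bad": 10},
    "Suburbs": {"Very good": 22, "Good": 23, "Bad": 66, "Very bad": 39},
}

total_respondents = sum(sum(column.values()) for column in counts.values())

# Divide every count by its column total and express it in percent
column_percentages = {}
for location, column in counts.items():
    column_total = sum(column.values())
    column_percentages[location] = {
        level: round(100 * count / column_total, 1)
        for level, count in column.items()
    }

for location, column in column_percentages.items():
    print(location, column)
```

    Dividing by the row totals instead would give row percentages, which answer a different question: where the respondents who gave a certain evaluation live.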

  4. Question

    Consider the following regression results:

    
    Call:
    lm(formula = log(y) ~ x, data = d)
    
    Residuals:
         Min       1Q   Median       3Q      Max 
    -1.45662 -0.31421 -0.04617  0.26287  1.27416 
    
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
    (Intercept)  0.03675    0.06787   0.541     0.59    
    x           -0.74279    0.05701 -13.030   <2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.4987 on 52 degrees of freedom
    Multiple R-squared:  0.7655,    Adjusted R-squared:  0.761 
    F-statistic: 169.8 on 1 and 52 DF,  p-value: < 2.2e-16
    

    Describe how the response y depends on the regressor x.


    Solution

    The presented results describe a semi-logarithmic regression.

    The mean of the response y decreases with increasing x.

    If x increases by 1 unit, then y can be expected to change by a factor of \(\exp(-0.74279)\), i.e., by about \(\exp(-0.74279) - 1 \approx -0.5242\) or -52.42 percent.

    Also, the effect of x is significant at the 5 percent level.
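
    The percentage effect in a semi-logarithmic model follows from exponentiating the slope. The sketch below uses the slope estimate from the regression output; the variable names are illustrative.

```python
from math import exp

slope = -0.74279  # estimated coefficient of x in the model for log(y)

# A one-unit increase in x multiplies y by exp(slope)
factor = exp(slope)
percent_change = 100 * (factor - 1)
print(round(percent_change, 2))
```

    Reading the slope directly as a percentage (about -74 percent) would only be a reasonable approximation for coefficients close to zero, which is not the case here.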


  5. Question

    For the 30 observations of the variable x in the data file boxhist.csv, draw a histogram, a boxplot, and a stripchart. Based on the graphics, answer the following questions or check the correct statements, respectively. (Comment: The tolerance for numeric answers is \(\pm0.3\); the true/false statements are either approximately correct or clearly wrong.)


    1. The distribution is unimodal. / The distribution is not unimodal.
    2. The distribution is symmetric. / The distribution is right-skewed. / The distribution is left-skewed.
    3. The boxplot shows outliers. / The boxplot shows no outliers.
    4. A quarter of the observations are smaller than which value?
    5. A quarter of the observations are greater than which value?
    6. Half of the observations are greater than which value?

    Solution



    1. True / False
    2. False / True / False
    3. True / False
    4. 0.53
    5. 1.13
    6. 0.58
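
    The numeric answers correspond to the lower quartile, the upper quartile, and the median, and the outlier statement corresponds to the usual 1.5 times IQR rule of a boxplot. The sketch below shows how such values could be computed. It assumes that boxhist.csv is comma-separated with a header row containing a column named x; the helpers `read_column` and `five_number_summary` are hypothetical, and the toy data are made up for illustration and are not the contents of boxhist.csv. The quartiles use linear interpolation and may differ slightly from the hinges that a particular boxplot implementation draws.

```python
import csv
from statistics import quantiles


def read_column(path, column="x"):
    """Read one numeric column from a CSV file with a header row (assumed layout)."""
    with open(path, newline="") as handle:
        return [float(row[column]) for row in csv.DictReader(handle)]


def five_number_summary(values):
    """Return min, quartiles, max, and the points beyond the 1.5 * IQR fences."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "outliers": outliers,
    }


# Illustrative toy data only, NOT the contents of boxhist.csv
toy = [1, 2, 3, 4, 5, 6, 7, 8, 100]
summary = five_number_summary(toy)
print(summary)

# With the real file one would call, for example:
# summary = five_number_summary(read_column("boxhist.csv"))
```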