Sangeetha Pulapaka

The remainder of the question goes like this:

Consider the formula:

Why is it not appropriate for Amy to use this formula for the standard deviation of Green – Pink?

The formula you did not mention is

\sigma _{\widehat{p}_{1} - \widehat{p}_{2}} = \sqrt{\frac{p_{1}(1-p_{1})}{n_{1}}+\frac{p_{2}(1-p_{2})}{n_{2}}}

where n_{1} and n_{2} are the sizes of the two samples.

This standard deviation formula works as long as we have:

  • Independent observations between the two samples.
  • Independent observations within each sample.

Here p_{1} and p_{2} have to come from independent samples, but that is not the case here because both proportions come from the same sample.
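
To see how much the dependence matters, here is a sketch under one extra assumption that the question does not state: the balls are drawn with replacement (or from a population much larger than the sample), so the colour counts in a single sample of size n follow a multinomial distribution. For any two estimates,

\mathrm{Var}(\widehat{p}_{1} - \widehat{p}_{2}) = \mathrm{Var}(\widehat{p}_{1}) + \mathrm{Var}(\widehat{p}_{2}) - 2\,\mathrm{Cov}(\widehat{p}_{1}, \widehat{p}_{2})

The formula above drops the covariance term, which is only valid when the covariance is zero. Under the multinomial assumption, two proportions from the same sample have

\mathrm{Cov}(\widehat{p}_{1}, \widehat{p}_{2}) = -\frac{p_{1}p_{2}}{n}

so the standard deviation of the difference becomes

\sigma _{\widehat{p}_{1} - \widehat{p}_{2}} = \sqrt{\frac{p_{1}(1-p_{1}) + p_{2}(1-p_{2}) + 2p_{1}p_{2}}{n}}

The covariance is negative because every ball that turns out green is one that cannot be pink, so the independent-samples formula would understate the spread.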

The correct procedure would be to

  1. Randomly select a sample and calculate the proportion of pink balls.
  2. Randomly select another, independent sample and calculate the proportion of green balls

and then calculate the difference. So Amy cannot use this formula for the standard deviation of Green – Pink, because the two proportions come from the same sample and are therefore not independent.
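
If you want to check this numerically, here is a minimal simulation sketch. It is not part of the original question: the function names, the proportions 0.4 (green) and 0.3 (pink), and the sample size 50 are all made-up values for illustration, and it assumes the balls are drawn with replacement. It compares the spread of Green – Pink measured within one sample against what the independent-samples formula predicts.

```python
import math
import random


def independent_formula_sd(p1, p2, n1, n2):
    """SD of p1_hat - p2_hat, valid only for two independent samples."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def same_sample_sd(p1, p2, n):
    """SD of p1_hat - p2_hat when both come from ONE sample of size n.

    Assumes multinomial counts (draws with replacement), where
    Cov(p1_hat, p2_hat) = -p1 * p2 / n.
    """
    return math.sqrt((p1 * (1 - p1) + p2 * (1 - p2) + 2 * p1 * p2) / n)


def simulate_same_sample_sd(p_green, p_pink, n, trials=20000, seed=0):
    """Empirical SD of (green proportion - pink proportion) in one sample."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(trials):
        green = pink = 0
        for _ in range(n):
            u = rng.random()
            if u < p_green:
                green += 1
            elif u < p_green + p_pink:
                pink += 1
            # otherwise the ball is some other colour
        diffs.append((green - pink) / n)
    mean = sum(diffs) / trials
    variance = sum((d - mean) ** 2 for d in diffs) / (trials - 1)
    return math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical values, chosen only for illustration.
    p_green, p_pink, n = 0.4, 0.3, 50
    print("independent-samples formula:", independent_formula_sd(p_green, p_pink, n, n))
    print("same-sample formula:        ", same_sample_sd(p_green, p_pink, n))
    print("simulated (one sample):     ", simulate_same_sample_sd(p_green, p_pink, n))
```

The simulated value should land close to the same-sample figure and noticeably above the independent-samples figure, which is the gap Amy would introduce by using the formula here.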

Here is a detailed explanation of the same concept: