Sangeetha Pulapaka

The remainder of the question goes like this:

Consider the formula:

Why is it not appropriate for Amy to use this formula for the standard deviation of Green – Pink?

The formula that was left out of your question is

\sigma _{\widehat{p}_{1} - \widehat{p}_{2}} = \sqrt{\frac{p_{1}(1-p_{1})}{n_{1}}+\frac{p_{2}(1-p_{2})}{n_{2}}}

where n_{1} and n_{2} are the sizes of the two samples.

This standard deviation formula works as long as we have:

  • Independent observations between the two samples.
  • Independent observations within each sample.

Here p_{1} and p_{2} have to come from independent samples, but that is not the case here because both proportions come from the same sample.
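To see what the formula leaves out, here is a sketch of the variance of the difference. It assumes (this is not stated in the question) that each of the n people in the single sample picks exactly one color, so the counts follow a multinomial distribution:

```latex
\operatorname{Var}(\widehat{p}_{1} - \widehat{p}_{2}) = \operatorname{Var}(\widehat{p}_{1}) + \operatorname{Var}(\widehat{p}_{2}) - 2\operatorname{Cov}(\widehat{p}_{1}, \widehat{p}_{2})

\operatorname{Cov}(\widehat{p}_{1}, \widehat{p}_{2}) = -\frac{p_{1}p_{2}}{n}

\sigma_{\widehat{p}_{1} - \widehat{p}_{2}} = \sqrt{\frac{p_{1}(1-p_{1}) + p_{2}(1-p_{2}) + 2p_{1}p_{2}}{n}}
```

The formula for independent samples drops the covariance term, which is zero only when the samples are independent. Under the assumption above, the covariance is negative, so that formula understates the standard deviation.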

It is not appropriate for Amy to use this standard deviation formula because the samples are not independent of each other.
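A quick simulation can illustrate the point. This is only a minimal sketch: the proportions (30% green, 20% pink) and the sample size (100) are made-up values, not taken from the original question, and the function names are hypothetical. It assumes each person picks exactly one color.

```python
import math
import random


def sd_independent_formula(p1, n1, p2, n2):
    """SD of p1_hat - p2_hat from the formula for two INDEPENDENT samples."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def simulate_sd_same_sample(p_green, p_pink, n, trials, seed=0):
    """Empirical SD of (green share - pink share) when both come from ONE sample."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(trials):
        green = pink = 0
        for _ in range(n):
            u = rng.random()
            if u < p_green:
                green += 1
            elif u < p_green + p_pink:
                pink += 1
            # otherwise the respondent chose some other color
        diffs.append((green - pink) / n)
    mean = sum(diffs) / trials
    # Sample standard deviation of the simulated differences
    return math.sqrt(sum((d - mean) ** 2 for d in diffs) / (trials - 1))


# Hypothetical values, not from the original question.
P_GREEN, P_PINK, N = 0.30, 0.20, 100

formula_sd = sd_independent_formula(P_GREEN, N, P_PINK, N)
simulated_sd = simulate_sd_same_sample(P_GREEN, P_PINK, N, trials=5000)

print(f"independent-samples formula: {formula_sd:.4f}")
print(f"simulated, same sample:      {simulated_sd:.4f}")
```

With these made-up values, the simulated spread should come out larger than the value from the formula. A respondent who picks green cannot also pick pink, so the two proportions tend to move in opposite directions, and the formula for independent samples ignores that.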

Here is a detailed explanation of the same concept: