# Sampling Distribution of the Sample Proportion as an Example of the Sampling Distribution of the Sample Mean

Proportion problems can be likened to a coin toss, where we think of a "heads" as 1, and a "tails" as 0. Proportions are related to percentages and probabilities, but proportions are (like probabilities) always given as numbers between 0 and 1. Examples include the proportion of

• Americans in favor of the death penalty,
• water samples that turn up E. coli,
• paint samples that turn up lead,

etc.

If all samples of paint turn up lead, then the proportion of contaminated samples is 1; if no samples turn up lead, then the proportion is 0; and generally, the truth lies somewhere between!

You might wonder how one could ever know whether a coin is fair (that is, the chance of a head is the same as the chance of a tail). The truth of the matter is, you can't. All conditions might point to a fair coin -- it may be perfectly symmetric, etc. -- but you'll never know for sure. This points to the existence of a parameter, which we call ${\displaystyle \left.\pi \right.}$, and which indicates the true underlying proportion of heads (say). Note well: this ${\displaystyle \left.\pi \right.}$ is not the same as the ${\displaystyle \left.\pi \right.}$ that plays such an important role in the study of circles.

From a "frequentist's" perspective, the only way to understand ${\displaystyle \left.\pi \right.}$ is to toss the coin forever and see what happens to the ratio of heads to tosses. That's how we find the underlying parameter of the coin:

${\displaystyle \pi =\lim _{n\rightarrow \infty }{\frac {\Sigma _{i=1}^{n}x_{i}}{n}}}$
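Nobody can toss a coin forever, but a simulation gives a feel for the limit. The following is a minimal sketch, assuming a simulated coin whose true proportion `true_pi` we choose ourselves (something we never get to do with a real coin); the function name `running_proportion`, the checkpoints, and the seed are illustrative choices, not part of the theory.

```python
import random

def running_proportion(true_pi, n_tosses, seed=0):
    """Toss a simulated coin n_tosses times and record heads/tosses
    at a few checkpoints. true_pi is an assumed 'true' proportion."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    checkpoints = {}
    heads = 0
    for i in range(1, n_tosses + 1):
        heads += 1 if rng.random() < true_pi else 0   # heads = 1, tails = 0
        if i in (10, 100, 1_000, 10_000, 100_000):
            checkpoints[i] = heads / i
    return checkpoints

proportions = running_proportion(true_pi=0.5, n_tosses=100_000)
for n, p in proportions.items():
    print(f"n = {n:>6}: proportion of heads = {p:.4f}")
```

The early ratios typically wander, while the later ones should settle near the chosen `true_pi`; that settling is what the limit expresses.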

Then the variance of that coin is

${\displaystyle \sigma ^{2}=\lim _{n\rightarrow \infty }{\frac {\Sigma _{i=1}^{n}(x_{i}-\pi )^{2}}{n}}=\lim _{n\rightarrow \infty }{\frac {\Sigma _{i=1}^{n}(x_{i}^{2}-2x_{i}\pi +\pi ^{2})}{n}}}$

or

${\displaystyle \sigma ^{2}=\lim _{n\rightarrow \infty }{\frac {\Sigma _{i=1}^{n}x_{i}}{n}}-2\pi \lim _{n\rightarrow \infty }{\frac {\Sigma _{i=1}^{n}x_{i}}{n}}+\lim _{n\rightarrow \infty }{\frac {\Sigma _{i=1}^{n}\pi ^{2}}{n}}}$

(because ${\displaystyle x_{i}^{2}=x_{i}}$ for the coin toss); or

${\displaystyle \sigma ^{2}=\pi -2\pi ^{2}+\pi ^{2}=\pi -\pi ^{2}=\left(1-\pi \right)\pi }$

Therefore

${\displaystyle \sigma ={\sqrt {(1-\pi )\pi }}}$
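The algebra can be checked numerically. For any finite list of 0s and 1s, the variance computed directly (dividing by n) agrees with the proportion times one minus the proportion, for the same reason as in the derivation: squaring a 0 or a 1 changes nothing. The sketch below assumes a simulated coin with a hypothetical true proportion of 0.3.

```python
import math
import random
import statistics

rng = random.Random(1)                      # fixed seed for repeatability
true_pi = 0.3                               # assumed value, for illustration only
tosses = [1 if rng.random() < true_pi else 0 for _ in range(10_000)]

p_hat = sum(tosses) / len(tosses)           # observed proportion of heads
var_direct = statistics.pvariance(tosses)   # mean squared deviation (divide by n)
var_formula = p_hat * (1 - p_hat)           # the shortcut derived above

print(f"observed proportion : {p_hat:.4f}")
print(f"variance, direct    : {var_direct:.6f}")
print(f"variance, shortcut  : {var_formula:.6f}")
print(f"standard deviation  : {math.sqrt(var_formula):.4f}")
```

The two variance figures should match up to floating-point rounding, and both should sit near 0.3 × 0.7 = 0.21 for this assumed coin.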

Consequently, according to the theory of the sampling distribution of the sample mean, the parameters of the distribution of the sample mean are

${\displaystyle \mu _{\overline {x}}=\pi }$,

and

${\displaystyle \sigma _{\overline {x}}={\sqrt {\frac {(1-\pi )\pi }{n}}}}$.
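These two parameters can be compared against a simulation: draw many samples of size n, compute the proportion in each, and look at the mean and spread of those proportions. This is only a sketch under assumed values (`true_pi = 0.3`, `n = 50`, and 5000 repeated samples are arbitrary choices), and `sample_proportions` is a hypothetical helper name.

```python
import math
import random
import statistics

def sample_proportions(true_pi, n, n_samples, seed=0):
    """Return the sample proportion from each of n_samples samples of size n."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < true_pi for _ in range(n)) / n
        for _ in range(n_samples)
    ]

true_pi, n = 0.3, 50                         # assumed values, for illustration
props = sample_proportions(true_pi, n, n_samples=5_000)

sim_mean = statistics.fmean(props)           # should be close to true_pi
sim_sd = statistics.pstdev(props)            # should be close to theory_sd
theory_sd = math.sqrt((1 - true_pi) * true_pi / n)

print(f"mean of sample proportions: {sim_mean:.4f} (theory {true_pi})")
print(f"sd of sample proportions  : {sim_sd:.4f} (theory {theory_sd:.4f})")
```

The simulated figures will not equal the theoretical ones exactly, but with this many repeated samples they should agree to a couple of decimal places.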

The proportion problem is interesting for (at least) two reasons:

1. once the mean is given, the standard deviation is known (somewhat unusual); and
2. our rule for when the normal distribution assumption is valid casts a shadow on the old lie that ${\displaystyle \left.n=30\right.}$ is magic. For the proportion problem the rule we use is:
${\displaystyle 0<\pi -3\sigma _{\overline {x}}<\pi +3\sigma _{\overline {x}}<1}$
That is, ${\displaystyle \left.n=30\right.}$ won't save you: it's a rule of thumb, and whether it is good enough depends entirely on the underlying distribution of x.
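The rule is easy to apply mechanically. Here is a minimal sketch, with `normal_approx_ok` as a hypothetical helper name, that checks whether three standard errors on either side of the proportion stay strictly inside the interval from 0 to 1. In practice the true proportion is unknown, so one would plug in an estimate or a hypothesized value.

```python
import math

def normal_approx_ok(true_pi, n):
    """Check the rule 0 < pi - 3*se and pi + 3*se < 1 for a sample of size n."""
    se = math.sqrt((1 - true_pi) * true_pi / n)   # standard error of the proportion
    return 0 < true_pi - 3 * se and true_pi + 3 * se < 1

print(normal_approx_ok(0.5, 30))    # True:  a fair coin is fine with n = 30
print(normal_approx_ok(0.05, 30))   # False: a rare event is not
print(normal_approx_ok(0.05, 200))  # True:  a larger sample rescues it
```

For a proportion of 0.05, thirty observations are nowhere near enough, while for a proportion of 0.5 they are plenty, which is the sense in which the required n depends on the underlying distribution.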