PMF
Suppose that a sample of size $n$ is chosen randomly (without replacement) from an urn containing $N$ balls, of which $m$ are white and $N-m$ are black. If we let $X$ denote the number of white balls selected, then $$f(x; N, m, n) = \Pr(X = x) = {{m\choose x}{N-m\choose n-x}\over {N\choose n}}$$ for $x = 0, 1, 2, \cdots, n$.
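As a quick illustration, the formula above can be evaluated directly with `choose()` and compared with R's built-in `dhyper()`. The helper name `hyper_pmf` and the parameter values are assumptions made only for this sketch. Note that `dhyper(x, m, n, k)` takes the number of white balls, the number of black balls, and the sample size, in that order.

```r
# Hypothetical helper: PMF computed from the formula above (name is illustrative).
hyper_pmf <- function(x, N, m, n) {
  choose(m, x) * choose(N - m, n - x) / choose(N, n)
}

# Illustrative values: N = 20 balls, m = 8 white, sample size n = 5.
N <- 20; m <- 8; n <- 5
x <- 0:n

manual  <- hyper_pmf(x, N, m, n)
builtin <- dhyper(x, m, N - m, n)   # dhyper(x, white, black, sample size)

print(round(manual, 4))
print(isTRUE(all.equal(manual, builtin)))   # do the two computations agree?
print(sum(manual))                          # the probabilities should sum to 1
```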
Proof:
This is essentially Vandermonde's identity: $${m+n\choose r} = \sum_{k=0}^{r}{m\choose k}{n\choose r-k}$$ where $m$, $n$, $k$, $r \in \mathbb{N}_0$. Because
$$ \begin{align*} \sum_{r=0}^{m+n}{m+n\choose r}x^r &= (1+x)^{m+n} \quad\quad \mbox{(binomial theorem)}\\ &= (1+x)^m (1+x)^n\\ &= \left(\sum_{i=0}^{m}{m\choose i}x^{i}\right) \left(\sum_{j=0}^{n}{n\choose j}x^{j}\right)\\ &= \sum_{r=0}^{m+n}\left(\sum_{k=0}^{r}{m\choose k}{n\choose r-k}\right) x^r \quad\quad \mbox{(product of both binomials)} \end{align*} $$
The last step uses the general product of two polynomials (with $a_k = 0$ for $k > m$ and $b_j = 0$ for $j > n$):
$$ \begin{eqnarray*} \left(\sum_{i=0}^{m}a_i x^i\right) \left(\sum_{j=0}^{n}b_j x^j\right) &=& \left(a_0+a_1x+\cdots + a_mx^m\right) \left(b_0+b_1x+\cdots + b_nx^n\right)\\ &=& a_0b_0 + a_0b_1x + a_1b_0x + a_0b_2x^2 + a_1b_1x^2 + a_2b_0x^2 +\\ & & \cdots + a_mb_nx^{m+n}\\ &=& \sum_{r=0}^{m+n}\left(\sum_{k=0}^{r}a_{k}b_{r-k}\right) x^{r} \end{eqnarray*} $$
Hence, comparing coefficients of $x^r$,
$$ \begin{eqnarray*} & &\sum_{r=0}^{m+n}{m+n\choose r}x^r = \sum_{r=0}^{m+n}\left(\sum_{k=0}^{r}{m\choose k}{n\choose r-k}\right) x^r\\ &\implies& {m+n\choose r} = \sum_{k=0}^{r}{m\choose k}{n\choose r-k}\\ &\implies& \sum_{k=0}^{r}{{m\choose k}{n\choose r-k}\over {m+n\choose r}} = 1 \end{eqnarray*} $$
Replacing $m$, $n$, $r$, $k$ in the identity by $m$, $N-m$, $n$, $x$ respectively shows that the probabilities $f(x; N, m, n)$ sum to 1.
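Vandermonde's identity can also be checked numerically. The following is a minimal sketch with arbitrarily chosen values of $m$ and $n$ (they are illustrative assumptions, not values from the text); it relies on R's `choose(n, k)` returning 0 when $k > n$.

```r
# Illustrative values only; any non-negative integers work.
m <- 6; n <- 9
r <- 0:(m + n)

# Right-hand side: for each r, sum choose(m, k) * choose(n, r - k) over k = 0..r.
rhs <- sapply(r, function(ri) {
  k <- 0:ri
  sum(choose(m, k) * choose(n, ri - k))
})

# Left-hand side: choose(m + n, r) for every r.
lhs <- choose(m + n, r)

print(isTRUE(all.equal(lhs, rhs)))
```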
Mean
The expected value is $$\mu = E[X] = {nm\over N}$$
Proof:
$$ \begin{eqnarray*} E[X^k] &=& \sum_{x=0}^{n}x^k f(x; N, m, n)\\ &=& \sum_{x=0}^{n}x^k{{m\choose x}{N-m\choose n-x}\over {N\choose n}}\\ &=& {nm\over N}\sum_{x=1}^{n} x^{k-1} {{m-1 \choose x-1}{N-m\choose n-x}\over {N-1 \choose n-1}}\\ & & (\mbox{identities:}\ x{m\choose x} = m{m-1\choose x-1},\ n{N\choose n} = N{N-1\choose n-1})\\ &=& {nm\over N}\sum_{y=0}^{n-1} (y+1)^{k-1} {{m-1 \choose y}{(N-1)-(m-1) \choose (n-1)-y}\over {N-1 \choose n-1}}\quad\quad (\mbox{setting}\ y=x-1)\\ &=& {nm\over N}E\left[(Y+1)^{k-1}\right] \quad\quad (\mbox{since}\ Y\sim g(y; m-1, n-1, N-1)) \end{eqnarray*} $$
Here $Y$ is a hypergeometric random variable for a sample of size $n-1$ from an urn of $N-1$ balls, of which $m-1$ are white. The $x = 0$ term vanishes for $k \geq 1$, so the sum may start at $x = 1$. Hence, setting $k = 1$, we have $$E[X] = {nm\over N}$$ Note that this matches the mean of the binomial distribution, $\mu = np$, where $p = {m\over N}$.
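The closed form for the mean, and the relation between the moments of $X$ and $Y$ used in the proof, can be sanity-checked numerically. This is only a sketch under assumed values of $N$, $m$ and $n$; it uses `dhyper(x, white, black, sample size)` for both $X$ and $Y$.

```r
# Illustrative values: N = 20 balls, m = 8 white, sample size n = 5.
N <- 20; m <- 8; n <- 5

x <- 0:n
f <- dhyper(x, m, N - m, n)            # PMF of X
y <- 0:(n - 1)
g <- dhyper(y, m - 1, N - m, n - 1)    # PMF of Y: one white ball and one draw fewer

mean_direct  <- sum(x * f)             # E[X] from the definition
mean_formula <- n * m / N              # closed form nm/N

# Relation with k = 2: E[X^2] = (nm/N) * E[Y + 1]
ex2_direct   <- sum(x^2 * f)
ex2_relation <- (n * m / N) * sum((y + 1) * g)

print(c(mean_direct, mean_formula))
print(c(ex2_direct, ex2_relation))
```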
Variance
The variance is $$\sigma^2 = \mbox{Var}(X) = np(1-p)\left(1-{n-1 \over N-1}\right)$$ where $p = {m\over N}$.
Proof:
$$ \begin{align*} E[X^2] &= {nm\over N}E[Y+1] \quad\quad (\mbox{setting}\ k=2)\\ &= {nm\over N}\left(E[Y] + 1\right)\\ &= {nm\over N}\left[{(n-1)(m-1) \over N-1}+1\right] \end{align*} $$
Hence the variance is
$$ \begin{align*} \mbox{Var}(X) &= E\left[X^2\right]-E[X]^2\\ &= {nm\over N}\left[{(n-1)(m-1) \over N-1}+ 1-{nm\over N}\right]\\ &= np \left[(n-1) \cdot {pN-1\over N-1}+1-np\right] \quad\quad (\mbox{setting}\ p={m\over N})\\ &= np\left[(n-1) \cdot {p(N-1) + p-1 \over N-1} + 1-np\right]\\ &= np\left[(n-1)p + (n-1) \cdot{p-1 \over N-1} + 1-np\right]\\ &= np\left[1-p-(1-p) \cdot {n-1\over N-1}\right]\\ &= np(1-p)\left(1-{n-1 \over N-1}\right) \end{align*} $$
Note that the factor $1-{n-1\over N-1}$ is approximately equal to 1 when $N$ is sufficiently large (i.e. ${n-1\over N-1}\rightarrow 0$ as $N \rightarrow +\infty$). The variance is then the same as that of the binomial distribution, $\sigma^2 = np(1-p)$, where $p = {m\over N}$.
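A short numerical sketch of the variance formula and of the limiting behaviour of the factor $1-{n-1\over N-1}$ follows. The values of $N$, $m$, $n$ and the grid `N_big` are assumptions chosen only for illustration.

```r
# Illustrative values: N = 20 balls, m = 8 white, sample size n = 5.
N <- 20; m <- 8; n <- 5
p <- m / N

x <- 0:n
f <- dhyper(x, m, N - m, n)                    # PMF of X

var_direct  <- sum(x^2 * f) - sum(x * f)^2     # E[X^2] - E[X]^2
var_formula <- n * p * (1 - p) * (1 - (n - 1) / (N - 1))
print(c(var_direct, var_formula))

# With n fixed, the factor 1 - (n-1)/(N-1) approaches 1 as N grows,
# so the variance approaches the binomial value n * p * (1 - p).
N_big <- c(20, 200, 2000, 20000)
correction <- 1 - (n - 1) / (N_big - 1)
print(correction)
```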
Examples
1. In a lotto game, seven balls are drawn randomly from an urn containing 37 balls numbered from 0 to 36. Calculate the probability $P$ of having exactly $k$ balls with an even number for $k = 0, 1, \cdots, 7$.
Solution:
Of the 37 numbers, 19 are even ($0, 2, \cdots, 36$) and 18 are odd, so $$P(X = k) = {{19\choose k}{18\choose 7-k}\over {37 \choose 7}}$$
# exact hypergeometric probabilities for k = 0, ..., 7 even balls
p = NA; k = 0:7
for (i in k) {
  p[i+1] = round(choose(19, i) * choose(18, 7-i) / choose(37, 7), 3)
}
p
# [1] 0.003 0.034 0.142 0.288 0.307 0.173 0.047 0.005
2. Determine the same probabilities as in the previous problem, this time using the normal approximation.
Solution:
The mean is $$\mu = {nm\over N} = {7\times19\over 37} = 3.594595$$ and the standard deviation is $$\sigma = \sqrt{{nm\over N}\left(1-{m\over N}\right) \left(1-{n-1\over N-1}\right)} = \sqrt{{7\times19\over 37}\left(1-{19\over 37}\right) \left(1-{7-1\over 37-1}\right)} = 1.207174$$ The normal approximation evaluates the normal density with this mean and standard deviation at each $k$:
# normal approximation: density with matching mean and standard deviation
p = NA; k = 0:7
mu = 7 * 19/37
s = sqrt(7 * 19/37 * (1-19/37) * (1-6/36))
for (i in k) {
  p[i+1] = round(dnorm(i, mu, s), 3)
}
p
# [1] 0.004 0.033 0.138 0.293 0.312 0.168 0.045 0.006
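To see how close the approximation is, the exact and approximate probabilities from the two examples can be placed side by side. This is a minimal sketch; it uses the built-in `dhyper()` for the exact values instead of the `choose()` formula, and the variable names are illustrative.

```r
k <- 0:7

# Exact hypergeometric probabilities: 19 even balls, 18 odd balls, 7 drawn.
exact <- dhyper(k, 19, 18, 7)

# Normal approximation with the same mean and standard deviation.
mu <- 7 * 19 / 37
s  <- sqrt(7 * 19 / 37 * (1 - 19 / 37) * (1 - 6 / 36))
approx_normal <- dnorm(k, mu, s)

comparison <- data.frame(k = k,
                         exact = round(exact, 3),
                         normal = round(approx_normal, 3))
print(comparison)

# Largest absolute difference between the two sets of probabilities.
max_diff <- max(abs(exact - approx_normal))
print(max_diff)
```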
References
- Ross, S. (2010). A First Course in Probability (8th Edition). Chapter 4. Pearson. ISBN: 978-0-13-603313-4.
- Brink, D. (2010). Essentials of Statistics: Exercises. Chapter 11. ISBN: 978-87-7681-409-0.