University of California, Berkeley Stat2.2x Probability Study Notes: Section 4, The Central Limit Theorem

Source: Internet
Author: User

The Stat2.2x Probability course was taught on the edX platform in 2014 by the University of California, Berkeley.


Summary

  • Standard Error
    The standard error of a random variable $X$ is defined by $$SE(X)=\sqrt{E((X-E(X))^2)}$$ $SE$ measures the rough size of the chance error in $X$: roughly how far off $X$ is from $E(X)$.
  • Standard Deviation
    The standard deviation of a list of numbers is $$SD=\sqrt{E((X-\mu)^2)}$$ where $\mu=E(X)$. $SD$ measures the rough size of the deviations: roughly how far off the numbers are from the average.
  • $SE$ of the sum of the draws
    For $n$ draws at random with replacement from a box of numbered tickets, the standard error of the sum of the draws is $$SE=\sqrt{\text{number of draws}}\cdot (SD\ \text{of the box})=\sqrt{n}\cdot\sigma$$ where $\sigma=\sqrt{E((X-\mu)^2)}$.
  • Chebychev's inequality
    The probability that $X$ is $k$ or more $SE$s away from $E(X)$ is at most $\frac{1}{k^2}$, that is, $$P(X\ \text{is outside the interval}\ E(X)\pm k\cdot SE(X))\leq\frac{1}{k^2}$$ For instance, $$P(X\ \text{is inside the interval}\ E(X)\pm2\cdot SE(X))\geq1-\frac{1}{2^2}=\frac{3}{4}$$
  • De Moivre-Laplace theorem
    Fix any $p$ strictly between $0$ and $1$. As the number of trials $n$ increases, the probability histogram for the binomial distribution looks like the normal curve with mean $\mu=n\cdot p$ and $SD=\sqrt{n\cdot p\cdot (1-p)}$.
  • Central Limit Theorem
    Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed, each with expected value $\mu$ and standard error $\sigma$. Let $S_n=X_1+X_2+\ldots+X_n$. Then for large $n$, the probability distribution of $S_n$ is approximately normal with mean $n\mu$ and standard deviation $\sqrt{n}\sigma$, no matter what the distribution of each $X_i$ is.
  • Normal approximation of binomial distribution
    $$\mu=n\cdot p,\ SE=\sqrt{n\cdot p\cdot (1-p)}$$ $$z_1=\frac{x_1-\mu}{SE},\ z_2=\frac{x_2-\mu}{SE}$$ $$P(x_1\leq X\leq x_2)\doteq\text{area under the standard normal curve between}\ z_1\ \text{and}\ z_2$$ R code:
        mu = n * p; se = sqrt(n * p * (1 - p))
        z1 = (x1 - mu) / se; z2 = (x2 - mu) / se
        pnorm(z2) - pnorm(z1)
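Chebychev's inequality in the summary holds for any distribution, so it can be checked against an exact calculation. The sketch below does this for one illustrative case; the choice of a binomial with $n=100$ and $p=0.5$ and of $k=2$ is an assumption made only for this example and does not come from the course.

```r
# Illustrative check of Chebychev's inequality.
# ASSUMPTION: X ~ Binomial(n = 100, p = 0.5) and k = 2 are example values.
n <- 100
p <- 0.5
k <- 2

mu <- n * p                      # E(X)
se <- sqrt(n * p * (1 - p))      # SE(X)

x <- 0:n
# Exact chance that X is k or more SEs away from E(X)
outside <- abs(x - mu) >= k * se
tail_prob <- sum(dbinom(x[outside], size = n, prob = p))

bound <- 1 / k^2                 # Chebychev's upper bound
c(tail_prob = tail_prob, bound = bound)
tail_prob <= bound               # TRUE: the bound holds
```

The exact tail chance comes out far below the bound, which is typical: Chebychev's inequality is safe for every distribution but usually quite loose.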

Practice

Problem 1

In 6000 rolls of a die, approximately what is the chance of getting between 950 and 1050 sixes (inclusive)?

Solution

Binomial distribution with $n=6000, k=950:1050, p=1/6$: $$P(\text{between 950 and 1050 sixes})$$ $$=\sum_{k=950}^{1050}C_{6000}^{k}\left(\frac{1}{6}\right)^k\cdot\left(\frac{5}{6}\right)^{6000-k}\doteq0.9198021$$ R code:

    sum(dbinom(x = 950:1050, size = 6000, prob = 1/6))
    ## [1] 0.9198021

Alternatively, using the normal approximation: $$\mu=np=6000\times\frac{1}{6}=1000$$ $$SE=\sqrt{n\cdot p\cdot (1-p)}\doteq28.86751$$ $$z_1=\frac{950-1000}{SE},\ z_2=\frac{1050-1000}{SE}$$ $$P(\text{between 950 and 1050 sixes})$$ $$\doteq\text{area under the standard normal curve between}\ z_1\ \text{and}\ z_2$$ $$=0.9167355$$ R code:

    n = 6000; p = 1/6
    mu = n * p; se = sqrt(n * p * (1 - p))
    z1 = (950 - mu) / se; z2 = (1050 - mu) / se
    pnorm(z2) - pnorm(z1)
    ## [1] 0.9167355
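The normal approximation above uses the endpoints 950 and 1050 directly, whereas later problems in these notes widen the interval by 0.5 on each side (a continuity correction). As a sketch, the comparison below applies that same correction to this problem; the variable names `plain` and `corrected` are introduced here for illustration only.

```r
# Compare the exact binomial chance with the normal approximation,
# with and without a continuity correction of 0.5.
n <- 6000
p <- 1/6
mu <- n * p
se <- sqrt(n * p * (1 - p))

exact     <- sum(dbinom(950:1050, size = n, prob = p))
plain     <- pnorm((1050 - mu) / se)   - pnorm((950 - mu) / se)
corrected <- pnorm((1050.5 - mu) / se) - pnorm((949.5 - mu) / se)

round(c(exact = exact, plain = plain, corrected = corrected), 4)
# Absolute error of each approximation
abs(c(plain = plain, corrected = corrected) - exact)
```

With the correction, the approximation lands noticeably closer to the exact binomial value than the uncorrected figure does.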

Problem 2

The "column" bet in roulette pays 2 to 1 and there are 12 chances in 38 to win. Suppose you bet \$1 on a column 100 times independently. Find

a) the expected number of times you win

b) the $SE$ of the number of times you win

c) the expected value of your net gain

d) the $SE$ of your net gain

e) the chance that you come out ahead

Solution

2a) $$E(\text{number of wins})=100\times\frac{12}{38}\doteq31.57895$$

2b) $$SE=\sqrt{n\cdot p\cdot (1-p)}=\sqrt{100\times\frac{12}{38}\times\frac{26}{38}}\doteq4.648295$$

2c) $$E(\text{net gain})=100\times\left(2\times\frac{12}{38}+(-1)\times\frac{26}{38}\right)\doteq-5.263158$$ Alternatively, let $W$ be the number of wins and $X$ the net gain. Then $$X=2\cdot W-1\cdot (100-W)=3\cdot W-100$$ $$E(X)=3\cdot E(W)-100=3\times31.57895-100\doteq-5.26315$$

2d) Because $SE=\sqrt{n}\sigma$ and $$n=100,\ \mu=2\times\frac{12}{38}+(-1)\times\frac{26}{38}=-\frac{1}{19}$$ $$\sigma=\sqrt{E((X-\mu)^2)}=\sqrt{\left(2+\frac{1}{19}\right)^2\times\frac{12}{38}+\left(-1+\frac{1}{19}\right)^2\times\frac{26}{38}}\doteq1.394489$$ where $\mu$ and $\sigma$ refer to a single bet. Thus $$SE=\sqrt{n}\sigma\doteq13.94489$$ Alternatively, $$SE(X)=3\cdot SE(W)=3\times4.6483\doteq13.945$$
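Parts 2a to 2d can be reproduced in a few lines of R. This is a sketch of the arithmetic above rather than code from the course; the variable names (`e_wins`, `se_gain`, `mu_box`, and so on) are introduced here for illustration. It computes the net-gain figures both ways, through $X=3W-100$ and through the box of payoffs $\{2,-1\}$, to show that the two routes agree.

```r
n <- 100
p <- 12 / 38

# 2a, 2b: number of wins W ~ Binomial(n, p)
e_wins  <- n * p
se_wins <- sqrt(n * p * (1 - p))

# 2c, 2d via the linear relation X = 3W - 100
e_gain  <- 3 * e_wins - n
se_gain <- 3 * se_wins

# 2c, 2d via the box model: each bet pays 2 or -1
value  <- c(2, -1)
prob   <- c(p, 1 - p)
mu_box <- sum(value * prob)                       # mean of one bet
sd_box <- sqrt(sum((value - mu_box)^2 * prob))    # SD of one bet

round(c(e_wins = e_wins, se_wins = se_wins,
        e_gain = e_gain, se_gain = se_gain,
        e_gain_box = n * mu_box, se_gain_box = sqrt(n) * sd_box), 5)
```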

2e) $X>0\rightarrow W>\frac{100}{3}\rightarrow W\geq34$. Binomial distribution with $n=100, k=34:100, p=12/38$: $$\sum_{k=34}^{100}C_{100}^{k}\cdot\left(\frac{12}{38}\right)^k\cdot\left(\frac{26}{38}\right)^{100-k}\doteq0.3357928$$ R code:

    sum(dbinom(x = 34:100, size = 100, prob = 12/38))
    ## [1] 0.3357928

Problem 3

Find the normal approximation to the chance of getting 43 heads in 100 tosses of a coin.

Solution

Normal approximation: $$\mu=100\times0.5=50,\ SE=\sqrt{n\cdot p\cdot (1-p)}=\sqrt{100\times0.5\times0.5}=5$$ $$z_1=\frac{42.5-50}{5},\ z_2=\frac{43.5-50}{5}$$ $$P(\text{getting 43 heads in 100 tosses of a coin})\doteq0.02999328$$ R code:

    n = 100; p = 1/2
    mu = n * p; se = sqrt(n * p * (1 - p))
    z1 = (42.5 - mu) / se; z2 = (43.5 - mu) / se
    pnorm(z2) - pnorm(z1)
    ## [1] 0.02999328

Binomial distribution (exact value): $$C_{100}^{43}\times\left(\frac{1}{2}\right)^{100}\doteq0.03006864$$ R code:

    dbinom(x = 43, size = 100, prob = 1/2)
    ## [1] 0.03006864

Therefore the normal approximation is excellent: it differs from the exact value by less than 0.0001.

EXERCISE 4

Problem 1

A random variable $W$ has the probability distribution

| Value | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Probability | 0.5 | 0.25 | 0.125 | 0.125 |

(For those who are interested, this is the geometric $p=0.5$ distribution 'killed' at 4. $W$ is the number of times I toss a coin if I follow this rule: I'll toss the coin till I get the first head, but I'll stop after 4 tosses even if I haven't got a head by that time.)

1a Find $E(W)$

1b Find $SE(W)$

Solution

1a) $$E(W)=1\times0.5+2\times0.25+3\times0.125+4\times0.125=1.875$$

1b) $$SE(W)=\sqrt{E[(W-E(W))^2]}$$ $$=\sqrt{(1-1.875)^2\times0.5+(2-1.875)^2\times0.25+(3-1.875)^2\times0.125+(4-1.875)^2\times0.125}$$ $$\doteq1.053269$$ R code:

    v = 1:4; p = c(0.5, 0.25, 0.125, 0.125)
    mu = sum(v * p)
    sqrt(sum((v - mu)^2 * p))
    ## [1] 1.053269

Problem 2

A true-false test consists of 20 questions, each of which has one correct answer: true or false. One point is awarded for each correct answer, but one point is taken off for each wrong answer. Suppose a student answers every question by guessing at random, independently of other questions. Let $S$ be the student's score on the test.

2a Find $E(S)$

2b Find $SE(S)$

2c Find $P(S=0)$ without using a large-sample approximation.

Solution

2a) This is very similar to the net gain in the roulette problem: $$E(S)=20\times\left(1\times\frac{1}{2}+(-1)\times\frac{1}{2}\right)=0$$

2b) $S$ is the sum of 20 independent scores, each with $$\mu=1\times\frac{1}{2}+(-1)\times\frac{1}{2}=0$$ $$SE(S)=\sqrt{n}\sigma=\sqrt{20\times\left((1-0)^2\times\frac{1}{2}+(-1-0)^2\times\frac{1}{2}\right)}\doteq4.472136$$
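As a cross-check on 2a and 2b, the score can also be written as $S=2C-20$, where $C$ is the number of correct answers and has the binomial distribution with $n=20$, $p=\frac{1}{2}$. The sketch below builds the exact distribution of $S$ from that relation and computes its expected value and $SE$ directly; the names `correct`, `score`, `e_s` and `se_s` are introduced here for illustration.

```r
n <- 20
correct <- 0:n                                   # possible numbers of correct answers
score   <- 2 * correct - n                       # +1 per correct, -1 per wrong
pr      <- dbinom(correct, size = n, prob = 1/2) # chance of each outcome

e_s  <- sum(score * pr)                          # expected score
se_s <- sqrt(sum((score - e_s)^2 * pr))          # SE of the score

c(e_s = e_s, se_s = se_s, sqrt_n = sqrt(n))
```

The $SE$ from the exact distribution agrees with $\sqrt{20}$ from the box calculation, as it should.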

2c) $S=0$ means there are 10 correct answers and 10 incorrect answers. Binomial distribution with $n=20, k=10, p=\frac{1}{2}$: $$P(S=0)=C_{20}^{10}\times\left(\frac{1}{2}\right)^{20}\doteq0.1761971$$ R code:

    dbinom(x = 10, size = 20, prob = 1/2)
    ## [1] 0.1761971

Problem 3

A die is rolled 60 times.

3a Find the expected number of times the face with 6 spots appears.

3b Find the $SE$ of the number of times the face with 6 spots appears.

3c Find the normal approximation to the chance that the face with six spots appears 10 times.

3d Find the exact chance that the face with six spots appears 10 times.

3e Find the normal approximation to the chance that the face with six spots appears 9, 10, or 11 times.

3f Find the exact chance that the face with six spots appears 9, 10, or 11 times.

Solution

3a) $$E(\text{number of times 6 spots appears})=60\times\frac{1}{6}=10$$

3b) $$SE(\text{number of times 6 spots appears})=\sqrt{60\times\frac{1}{6}\times\left(1-\frac{1}{6}\right)}\doteq2.886751$$

3c) $$z_1=\frac{9.5-10}{SE},\ z_2=\frac{10.5-10}{SE}$$ Computing in R:

    mu = 10; se = sqrt(60 * 1/6 * 5/6)
    z1 = (9.5 - mu) / se; z2 = (10.5 - mu) / se
    pnorm(z2) - pnorm(z1)
    ## [1] 0.1375098

3d) Binomial distribution with $n=60, k=10, p=\frac{1}{6}$: $$C_{60}^{10}\times\left(\frac{1}{6}\right)^{10}\times\left(\frac{5}{6}\right)^{50}\doteq0.1370131$$ R code:

    dbinom(x = 10, size = 60, prob = 1/6)
    ## [1] 0.1370131

3e) $$z_1=\frac{8.5-10}{SE},\ z_2=\frac{11.5-10}{SE}$$ Computing in R:

    mu = 10; se = sqrt(60 * 1/6 * 5/6)
    z1 = (8.5 - mu) / se; z2 = (11.5 - mu) / se
    pnorm(z2) - pnorm(z1)
    ## [1] 0.3966682

3f) Binomial distribution with $n=60, k=9:11, p=\frac{1}{6}$: $$\sum_{k=9}^{11}C_{60}^{k}\cdot\left(\frac{1}{6}\right)^{k}\cdot\left(\frac{5}{6}\right)^{60-k}\doteq0.3958971$$ R code:

    sum(dbinom(x = 9:11, size = 60, prob = 1/6))
    ## [1] 0.3958971

Problem 4

According to genetic theory, plants of a particular species have a 25% chance of being red-flowering, independently of other plants. Find the normal approximation to the chance that among 10,000 plants of this species, more than 2400 are red-flowering.

Solution

Normal approximation: $$p=0.25,\ n=10000$$ $$\mu=np,\ SE=\sqrt{np(1-p)},\ z=\frac{2400.5-\mu}{SE}$$ Computing in R:

    n = 10000; p = 0.25
    mu = n * p; se = sqrt(n * p * (1 - p))
    z = (2400.5 - mu) / se
    1 - pnorm(z)
    ## [1] 0.989215

Binomial distribution (exact value): $$\sum_{k=2401}^{10000}C_{10000}^{k}\cdot (0.25)^k\cdot (0.75)^{10000-k}$$ R code:

    sum(dbinom(x = 2401:10000, size = 10000, prob = 0.25))
    ## [1] 0.9894525

Problem 5

A random number generator draws at random with replacement from the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. In 5000 draws, the chance that the digit 0 appears fewer than 495 times is closest to

Solution

Normal approximation: $$n=5000,\ p=0.1$$ $$\mu=np,\ SE=\sqrt{np(1-p)},\ z=\frac{494.5-\mu}{SE}$$ Computing in R:

    n = 5000; p = 0.1
    mu = n * p; se = sqrt(n * p * (1 - p))
    z = (494.5 - mu) / se
    pnorm(z)
    ## [1] 0.3977125

Binomial distribution (exact value): $$\sum_{k=0}^{494}C_{5000}^{k}\cdot (0.1)^k\cdot (0.9)^{5000-k}$$ R code:

    sum(dbinom(x = 0:494, size = 5000, prob = 0.1))
    ## [1] 0.3999814

EXERCISE 5

Problem 1

The durations of phone calls taken by the receptionist at an office are like draws made at random with replacement from a list that has an average of 8.5 minutes (that's 8 minutes and 30 seconds) and an $SD$ of 3 minutes. Approximately what is the chance that the total duration of the next 100 calls is more than 15 hours?

Solution

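Central Limit Theorem: with $n=100$ calls, the expected total is $n\mu=850$ minutes, and 15 hours is 900 minutes. $$\mu=8.5,\ SD=3,\ SE=\sqrt{n}\cdot SD=30$$ $$z=\frac{900-850}{30}$$ Computing in R:

    z = (900 - 850) / 30
    1 - pnorm(z)
    ## [1] 0.04779035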

Problem 2

A multiple choice test consists of 100 questions. Each question has 5 possible answers, only one of which is correct. Four points are awarded for each correct answer, and 1 point is taken off for each wrong answer. Suppose you answer all the questions by guessing at random, independently of all other questions.

2a In order to score more than 30 points, you have to get more than ________ answers right. Fill in the blank with the smallest correct whole number.

2b What is the chance that you get more than 30 points?

Solution

2a) Let $x$ be the number of correct answers. Then $$4x+(-1)\cdot (100-x)>30\rightarrow 5x>130\rightarrow x>26$$ Therefore you have to get more than 26 answers right.
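The inequality in 2a can also be checked by brute force: list the score for every possible number of correct answers and pick the smallest count that clears 30 points. This is only a sketch of that check; the names `score` and `min_right` are introduced here for illustration.

```r
n <- 100
x <- 0:n                          # possible numbers of correct answers
score <- 4 * x - (n - x)          # +4 per correct, -1 per wrong

min_right <- min(x[score > 30])   # fewest correct answers that score more than 30
blank     <- min_right - 1        # "more than ____ answers right"

c(min_right = min_right, blank = blank)
```

The smallest qualifying count is 27 correct answers, which is why the sum in 2b starts at $k=27$.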

2b) Binomial distribution with $n=100, k=27:100, p=\frac{1}{5}$: $$P(\text{more than 30 points})=\sum_{k=27}^{100}C_{100}^{k}\cdot\left(\frac{1}{5}\right)^k\cdot\left(\frac{4}{5}\right)^{100-k}\doteq0.05583272$$ R code:

    sum(dbinom(x = 27:100, size = 100, prob = 1/5))
    ## [1] 0.05583272

Normal approximation: $$n=100,\ p=\frac{1}{5},\ \mu=np=20,\ SE=\sqrt{np(1-p)}=4$$ $$z=\frac{26.5-20}{SE}$$ Computing in R:

    z = (26.5 - 20) / 4
    1 - pnorm(z)
    ## [1] 0.05208128

This approximation is not very good: 0.0521 against the exact 0.0558, a relative error of about 7%.

Problem 3

Assume that each person in a population has chance 2/1000 of carrying a particular disease, independently of all other people. Among people in this population, the number of people who carry the disease [pick all that are correct]

Solution

First, the number of carriers has a binomial distribution. Second, because $p$ is very small, the distribution is right-skewed.
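The right skew can be made concrete by computing the skewness of a binomial distribution with $p=2/1000$. The sample size is not given in the problem statement as reproduced here, so the value `n <- 1000` below is purely a hypothetical choice for illustration. The sketch computes the skewness from the exact probabilities and compares it with the standard closed form $(1-2p)/\sqrt{np(1-p)}$.

```r
# ASSUMPTION: n = 1000 is a hypothetical sample size chosen for illustration;
# it is not taken from the problem statement.
n <- 1000
p <- 2 / 1000

k  <- 0:n
pr <- dbinom(k, size = n, prob = p)

mu    <- sum(k * pr)                        # mean number of carriers
sigma <- sqrt(sum((k - mu)^2 * pr))         # SE of the number of carriers
skew  <- sum((k - mu)^3 * pr) / sigma^3     # third standardized moment

skew_formula <- (1 - 2 * p) / sqrt(n * p * (1 - p))
c(skew = skew, skew_formula = skew_formula)
```

A positive skewness means a longer right tail. The closed form also shows that the skew shrinks as $n$ grows, so "very small $p$" leads to a visibly skewed histogram only when $np$ is not large.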

Problem 4

Jack and Jill gamble on a roll of a die (yes, a fair die), as follows. If the die shows 1 or 2 spots, Jack gives Jill $\$1$. If the die shows 5 or 6 spots, Jill gives Jack $\$1$. If the die shows 3 or 4 spots, no money changes hands. Suppose Jack and Jill play this game 400 times. The chance that Jill's net gain is more than $\$20$ is closest to?

Solution

$$P(\text{Jill wins 1})=P(\text{Jill loses 1})=P(\text{no money changes hands})=\frac{1}{3}$$ $$\mu=1\times\frac{1}{3}+(-1)\times\frac{1}{3}+0\times\frac{1}{3}=0$$ $$SD=\sqrt{(1-0)^2\times\frac{1}{3}+(-1-0)^2\times\frac{1}{3}+(0-0)^2\times\frac{1}{3}}=\sqrt{\frac{2}{3}}$$ With $n=400$ plays, $$SE=\sqrt{n}\cdot SD=\sqrt{\frac{800}{3}},\ z=\frac{20-0}{SE}$$ Computing in R:

    se = sqrt(800/3)
    z = (20 - 0) / se
    1 - pnorm(z)
    ## [1] 0.1103357

Problem 5

In roulette, the bet on a "split" pays 17 to 1 and there are 2 chances in 38 to win. The bet on "red" pays 1 to 1 and there are 18 chances in 38 to win. Compare the following strategies:

A: Bet $\$1$ on a split, 200 times independently.

B: Bet $\$1$ on red, 200 times independently.

In what follows, "making more than $\$x$" means having a net gain of more than $\$x$; "losing more than $\$x$" means having a net gain of less than $-\$x$. For each of the following events, compare its chance under A with its chance under B: coming out ahead, winning more than $\$20$, losing more than $\$20$.

Solution

Use the Central Limit Theorem.

Let $P_{X0}$ be the chance of "coming out ahead" when following strategy $X$. Similarly, $P_{X20^{+}}$ and $P_{X20^{-}}$ denote the chances of winning and losing more than $\$20$ respectively. Strategy $A$: with $\mu$ the expected net gain of a single bet, $$n=200,\ \mu=17\times\frac{2}{38}+(-1)\times\frac{36}{38}=-\frac{1}{19},\ E(\text{net gain})=n\mu=-\frac{200}{19}$$ $$SE=\sqrt{n}\cdot SD=\sqrt{200\times\left[(17-\mu)^2\times\frac{2}{38}+(-1-\mu)^2\times\frac{36}{38}\right]}$$ Strategy $B$ is calculated in the same way. Finally, computing in R (the values in the comments are rounded to three decimal places):

    NetGain = function(n, prob, value, gain) {
      mu1 = sum(prob * value)                      # expected net gain of one bet
      mu = n * mu1                                 # expected total net gain
      se = sqrt(n * sum((value - mu1)^2 * prob))   # sqrt(n) times the SD of one bet
      if (gain >= 0) {
        z = (gain + 0.5 - mu) / se                 # chance of a net gain above `gain`
        print(1 - pnorm(z))
      } else {
        z = (gain - 0.5 - mu) / se                 # chance of a net gain below `gain`
        print(pnorm(z))
      }
    }
    NetGain(n = 200, prob = c(2/38, 36/38), value = c(17, -1), gain = 0)     # A: about 0.423
    NetGain(n = 200, prob = c(18/38, 20/38), value = c(1, -1), gain = 0)     # B: about 0.217
    NetGain(n = 200, prob = c(2/38, 36/38), value = c(17, -1), gain = 20)    # A: about 0.293
    NetGain(n = 200, prob = c(18/38, 20/38), value = c(1, -1), gain = 20)    # B: about 0.014
    NetGain(n = 200, prob = c(2/38, 36/38), value = c(17, -1), gain = -20)   # A: about 0.430
    NetGain(n = 200, prob = c(18/38, 20/38), value = c(1, -1), gain = -20)   # B: about 0.240

According to the results above, $$P_{A0}>P_{B0}$$ $$P_{A20^{+}}>P_{B20^{+}}$$ $$P_{A20^{-}}>P_{B20^{-}}$$ That is, $P_A>P_B$ for each of the following:

    • Coming out ahead
    • Winning more than $\$20$
    • Losing more than $\$20$
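The comparison can be cross-checked without the normal approximation, since the net gain is a linear function of the number of wins $W$, which is binomial: a bet that pays `payoff` to 1 gives a net gain of `(payoff + 1) * W - n` after `n` bets of $\$1$. The sketch below is an illustration of that idea, not code from the course; the function name `exact_chances` and its argument names are introduced here.

```r
# Exact chances from the binomial distribution of the number of wins W.
# Net gain after n bets of $1 at `payoff` to 1: X = (payoff + 1) * W - n.
exact_chances <- function(n, p, payoff) {
  w  <- 0:n
  x  <- (payoff + 1) * w - n
  pr <- dbinom(w, size = n, prob = p)
  c(ahead  = sum(pr[x > 0]),      # coming out ahead
    win20  = sum(pr[x > 20]),     # winning more than $20
    lose20 = sum(pr[x < -20]))    # losing more than $20
}

a <- exact_chances(n = 200, p = 2/38,  payoff = 17)   # strategy A: split
b <- exact_chances(n = 200, p = 18/38, payoff = 1)    # strategy B: red

rbind(A = a, B = b)
a > b    # TRUE in every column if A has the larger chance each time
```

The exact figures differ somewhat from the normal approximation, especially for strategy A, whose net gain moves in steps of 18 and has a skewed distribution, but the ordering between A and B is the same for all three events.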
