Under what circumstances does the binomial distribution approximate a normal distribution?

Mean and variance of the binomial distribution
Normal approximation to the binomial distribution

With S(uccess) scored as 1 and F(ailure) scored as 0, the mean of a single binomial trial is 0(1-p) + 1(p) = p, where p is the probability of S. Because the mean of a sum is the sum of the means, the mean of the binomial distribution with n trials is np.
Likewise, the variance of a single trial is (0-p)^2(1-p) + (1-p)^2 p = p(1-p). Because the n trials are independent, their variances add, so the variance of the binomial distribution with n trials is np(1-p) and the standard deviation is (np(1-p))^.5.
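
The two formulas can be checked numerically by computing the mean and variance directly from the binomial probabilities. The following is a minimal sketch; the values n = 12 and p = 0.3 and the helper names are illustrative assumptions, not part of the original notes.

```python
from math import comb, sqrt

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_moments(n: int, p: float):
    """Mean and variance computed term by term from the pmf."""
    mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
    var = sum((k - mean) ** 2 * binomial_pmf(k, n, p) for k in range(n + 1))
    return mean, var

n, p = 12, 0.3  # illustrative values, not taken from the text
mean, var = binomial_moments(n, p)
print(mean, n * p)             # the two numbers agree up to rounding error
print(var, n * p * (1 - p))    # likewise
print(sqrt(var))               # standard deviation
```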

If the number of trials, n, is large, the binomial distribution is approximately normal with mean np and standard deviation (np(1-p))^.5. (This is nice, since we really do not want to explicitly calculate binomial probabilities when n > 100.)

Example: If 10% of men are bald, what is the probability that fewer than 100 in a random sample of 818 men are bald?
Form the z-score, which requires the mean (*mu*) and standard deviation (*sigma*):
*mu* = np = 818 × .1 = 81.8
*sigma* = (np(1-p))^.5 = (818 × .1 × .9)^.5 = 8.5802
z = (x-*mu*)/*sigma* = (100-81.8)/8.58 = 2.12, where x = 100 is the count of interest
Since we are interested in fewer than (draw a picture), from the normal table we find that 98.3% of the time there will be fewer than 100 bald men.
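
As a rough check on this example, the sketch below recomputes the z-score and the normal probability, and compares them with the exact binomial probability that at most 99 of the 818 men are bald. The exact comparison and the continuity-corrected variant (using 99.5 instead of 100) are additions for illustration; the function name `binomial_cdf` is an assumption.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

def binomial_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ binomial(n, p), summing the pmf in log space."""
    total = 0.0
    for k in range(x + 1):
        log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                   + k * log(p) + (n - k) * log(1 - p))
        total += exp(log_pmf)
    return total

n, p = 818, 0.1
mu = n * p
sigma = sqrt(n * p * (1 - p))

z = (100 - mu) / sigma                 # z-score used in the text
approx = NormalDist().cdf(z)           # normal-table value for that z

z_cc = (99.5 - mu) / sigma             # same idea with a continuity correction
approx_cc = NormalDist().cdf(z_cc)

exact = binomial_cdf(99, n, p)         # "fewer than 100" means at most 99

print(f"z = {z:.2f}, normal approximation = {approx:.4f}")
print(f"with continuity correction    = {approx_cc:.4f}")
print(f"exact binomial                = {exact:.4f}")
```

All three numbers should land close to one another, which is the point of the approximation.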


Simulating a binomial experiment with a large number of trials is one way to generate an approximately normal distribution.

N.B.: Either do all the calculations with count data as we have done here, or convert everything (including the standard deviation) to proportions.

Applets: The normal approximation to the binomial is illustrated by David Lane (this employs the continuity correction factor). A cruder version is also available. The classic falling ball model for the binomial convergence to the normal distribution can be seen at Davidson University or a .com (The classical model has each yellow ball going to the adjacent slot to the right or left with probability .5 when it hits a green ball, but these simulations look like more horizontal travel is possible).
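
For readers without access to the applets, the classical falling ball model can be sketched in a few lines: each ball moves right or left with probability .5 at every row, so its final slot is a binomial count. The number of rows, the number of balls, and the seed below are arbitrary assumptions chosen for illustration.

```python
import random
from collections import Counter

def galton_board(rows: int, balls: int, seed: int = 0) -> Counter:
    """Drop `balls` balls through `rows` rows; count balls per final slot.

    A ball's slot is the number of times it bounced right, which is a
    binomial(rows, 0.5) outcome.
    """
    rng = random.Random(seed)
    slots = Counter()
    for _ in range(balls):
        rights = sum(rng.random() < 0.5 for _ in range(rows))
        slots[rights] += 1
    return slots

rows, balls = 20, 5000  # illustrative sizes
slots = galton_board(rows, balls)

mean = sum(k * c for k, c in slots.items()) / balls
var = sum((k - mean) ** 2 * c for k, c in slots.items()) / balls
print(f"sample mean {mean:.2f} (np = {rows * 0.5})")
print(f"sample variance {var:.2f} (np(1-p) = {rows * 0.25})")

# Text histogram: one '#' per 25 balls; the outline is roughly bell-shaped.
for k in range(rows + 1):
    print(f"{k:2d} {'#' * (slots[k] // 25)}")
```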

Competencies: If n=25 and p=.2, calculate the mean, variance, and standard deviation of the binomial distribution.
If n=200 and p = .67, estimate the probability that the number of successes is greater than 140.


Let \(X_i\) denote whether or not a randomly selected individual approves of the job the President is doing. More specifically:

Let \(X_i=1\), if the person approves of the job the President is doing, with probability \(p\)
Let \(X_i=0\), if the person does not approve of the job the President is doing with probability \(1-p\)

Then, recall that \(X_i\) is a Bernoulli random variable with mean:

\(\mu=E(X)=(0)(1-p)+(1)(p)=p\)

and variance:

\(\sigma^2=Var(X)=E[(X-p)^2]=(0-p)^2(1-p)+(1-p)^2(p)=p(1-p)[p+1-p]=p(1-p)\)

Now, take a random sample of \(n\) people, and let:

\(Y=X_1+X_2+\ldots+X_n\)

Then \(Y\) is a binomial(\(n, p\)) random variable, \(y=0, 1, 2, \ldots, n\), with mean:

\(\mu=np\)

and variance:

\(\sigma^2=np(1-p)\)

Now, let \(n=10\) and \(p=\frac{1}{2}\), so that \(Y\) is binomial(\(10, \frac{1}{2}\)). What is the probability that exactly five people approve of the job the President is doing?

Solution

There is really nothing new here. We can calculate the exact probability using the binomial table in the back of the book with \(n=10\) and \(p=\frac{1}{2}\). Doing so, we get:

\begin{align} P(Y=5)&= P(Y \leq 5)-P(Y \leq 4)\\ &= 0.6230-0.3770\\ &= 0.2460\\ \end{align}

That is, there is a 24.6% chance that exactly five of the ten people selected approve of the job the President is doing.
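
For readers without a binomial table at hand, the same probability follows directly from the binomial probability mass function. This worked step is an addition; it agrees with the table value up to rounding:

\(P(Y=5)=\dbinom{10}{5}\left(\dfrac{1}{2}\right)^{5}\left(\dfrac{1}{2}\right)^{5}=\dfrac{252}{1024}\approx 0.2461\)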

Note, however, that \(Y\) in the above example is defined as a sum of independent, identically distributed random variables. Therefore, as long as \(n\) is sufficiently large, we can use the Central Limit Theorem to calculate probabilities for \(Y\). Specifically, the Central Limit Theorem tells us that:

\(Z=\dfrac{Y-np}{\sqrt{np(1-p)}}\stackrel {d}{\longrightarrow} N(0,1)\).

Let's use the normal distribution then to approximate some probabilities for \(Y\). Again, what is the probability that exactly five people approve of the job the President is doing?

Solution

First, recognize in our case that the mean is:

\(\mu=np=10\left(\dfrac{1}{2}\right)=5\)

and the variance is:

\(\sigma^2=np(1-p)=10\left(\dfrac{1}{2}\right)\left(\dfrac{1}{2}\right)=2.5\)

Now, if we look at a graph of the binomial distribution with the rectangle corresponding to \(Y=5\) shaded in red:

[Figure: histogram of \(Y\) with a normal curve overlaid (mean 5, standard deviation 1.581); the bar at \(Y=5\) is shaded.]

we should see that we would benefit from making some kind of correction for the fact that we are using a continuous distribution to approximate a discrete distribution. Specifically, it seems that the rectangle \(Y=5\) really includes any \(Y\) greater than 4.5 but less than 5.5. That is:

\(P(Y=5)=P(4.5< Y < 5.5)\)

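Such an adjustment is called a "continuity correction." Once we've made the continuity correction, the calculation reduces to a normal probability calculation:

\begin{align} P(Y=5)=P(4.5< Y < 5.5) &= P(\dfrac{4.5-5}{\sqrt{2.5}}<Z<\dfrac{5.5-5}{\sqrt{2.5}})\\ &= P(-0.32<Z<0.32)\\ &= P(Z<0.32)-P(Z>0.32)\\ &= 0.6255-0.3745=0.2510\\ \end{align}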

Now, recall that we previously used the binomial distribution to determine that the probability that \(Y=5\) is exactly 0.246. Here, we used the normal distribution to determine that the probability that \(Y=5\) is approximately 0.251. That's not too shabby of an approximation, in light of the fact that we are dealing with a relatively small sample size of \(n=10\)!

Let's try a few more approximations. What is the probability that more than 7, but at most 9, of the ten people sampled approve of the job the President is doing?

Solution

If we look at a graph of the binomial distribution with the area corresponding to \(7<Y\le 9\) shaded in red:

[Figure: histogram of \(Y\) with a normal curve overlaid (mean 5, standard deviation 1.581); the bars for \(7<Y\le 9\) are shaded.]

we should see that we'll want to make the following continuity correction:

\(P(7<Y \leq 9)=P(7.5< Y < 9.5)\)

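Now again, once we've made the continuity correction, the calculation reduces to a normal probability calculation:

\begin{align} P(7<Y \leq 9)=P(7.5< Y < 9.5) &= P(\dfrac{7.5-5}{\sqrt{2.5}}<Z<\dfrac{9.5-5}{\sqrt{2.5}})\\ &= P(1.58<Z<2.85)\\ &= P(Z>1.58)-P(Z>2.85)\\ &= 0.0571-0.0022=0.0549\\ \end{align}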

By the way, you might find it interesting to note that the approximate normal probability is quite close to the exact binomial probability. We showed that the approximate probability is 0.0549, whereas the following calculation shows that the exact probability (using the binomial table with \(n=10\) and \(p=\frac{1}{2}\)) is 0.0537:

\(P(7<Y \leq 9)=P(Y\leq 9)-P(Y\leq 7)=0.9990-0.9453=0.0537\)

Let's try one more approximation. What is the probability that at least 2, but less than 4, of the ten people sampled approve of the job the President is doing?

Solution

If we look at a graph of the binomial distribution with the area corresponding to \(2\le Y<4\) shaded in red:

[Figure: histogram of \(Y\) with a normal curve overlaid (mean 5, standard deviation 1.581); the bars for \(2\le Y<4\) are shaded.]

we should see that we'll want to make the following continuity correction:

\(P(2 \leq Y <4)=P(1.5< Y < 3.5)\)

Again, once we've made the continuity correction, the calculation reduces to a normal probability calculation:

\begin{align} P(2 \leq Y <4)=P(1.5< Y < 3.5) &= P(\dfrac{1.5-5}{\sqrt{2.5}}<Z<\dfrac{3.5-5}{\sqrt{2.5}})\\ &= P(-2.21<Z<-0.95)\\ &= P(Z>0.95)-P(Z>2.21)\\ &= 0.1711-0.0136=0.1575\\ \end{align}

By the way, the exact binomial probability is 0.1612, as the following calculation illustrates:

\(P(2 \leq Y <4)=P(Y\leq 3)-P(Y\leq 1)=0.1719-0.0107=0.1612\)
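
The three worked examples can be compared side by side with a short script. This is a sketch under stated assumptions: the helper names are hypothetical, each event is first rewritten as an inclusive range of integers, and the z-values are not rounded to two decimals, so the approximations may differ from the table-based figures in the third decimal place.

```python
from math import comb, sqrt
from statistics import NormalDist

def exact_prob(lo: int, hi: int, n: int, p: float) -> float:
    """Exact P(lo <= Y <= hi) for Y ~ binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1))

def normal_approx(lo: int, hi: int, n: int, p: float) -> float:
    """Normal approximation to P(lo <= Y <= hi) with a continuity correction."""
    dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return dist.cdf(hi + 0.5) - dist.cdf(lo - 0.5)

n, p = 10, 0.5
# Each event rewritten as an inclusive integer range lo <= Y <= hi.
events = {
    "P(Y = 5)": (5, 5),
    "P(7 < Y <= 9)": (8, 9),
    "P(2 <= Y < 4)": (2, 3),
}
results = {
    name: (exact_prob(lo, hi, n, p), normal_approx(lo, hi, n, p))
    for name, (lo, hi) in events.items()
}
for name, (exact, approx) in results.items():
    print(f"{name:15s} exact = {exact:.4f}   normal approx = {approx:.4f}")
```

Writing the event as an inclusive integer range first makes the continuity correction mechanical: subtract 0.5 from the lower end and add 0.5 to the upper end.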

Just a couple of comments before we close our discussion of the normal approximation to the binomial.

(1) First, we have not yet discussed what "sufficiently large" means in terms of when it is appropriate to use the normal approximation to the binomial. The general rule of thumb is that the sample size \(n\) is "sufficiently large" if:

\(np\ge 5\) and \(n(1-p)\ge 5\)

For example, in the above example, in which \(p=0.5\), the two conditions are met if:

\(np=n(0.5)\ge 5\) and \(n(1-p)=n(0.5)\ge 5\)

Now, both conditions are true if:

\(n\ge 5\left(\frac{10}{5}\right)=10\)

Because our sample size was at least 10 (well, barely!), we now see why our approximations were quite close to the exact probabilities. In general, the farther \(p\) is from 0.5, the larger the sample size \(n\) needs to be. For example, suppose \(p=0.1\). Then, the two conditions are met if:

\(np=n(0.1)\ge 5\) and \(n(1-p)=n(0.9)\ge 5\)

Now, the first condition is met if:

\(n\ge 5(10)=50\)

And, the second condition is met if:

\(n\ge 5\left(\frac{10}{9}\right)\approx 5.56\)

That is, the only way both conditions are met is if \(n\ge 50\). So, in summary, when \(p=0.5\), a sample size of \(n=10\) is sufficient. But, if \(p=0.1\), then we need a much larger sample size, namely \(n=50\).
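
The rule of thumb translates into a one-line calculation of the smallest acceptable \(n\) for a given \(p\). The function name and the default threshold argument below are assumptions made for this sketch; the threshold of 5 is the one stated in the rule above.

```python
from math import ceil

def min_sample_size(p: float, threshold: float = 5.0) -> int:
    """Smallest integer n with n*p >= threshold and n*(1-p) >= threshold."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    # The binding condition is the one involving the smaller of p and 1-p.
    # The tiny offset guards against floating-point noise such as 50.000000001.
    return ceil(threshold / min(p, 1 - p) - 1e-9)

for p in (0.5, 0.1, 0.9):
    print(p, min_sample_size(p))
```

For \(p=0.5\) this gives 10 and for \(p=0.1\) it gives 50, matching the two cases worked out above; by symmetry, \(p=0.9\) also gives 50.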

(2) In truth, if you have the available tools, such as a binomial table or a statistical package, you'll probably want to calculate exact probabilities instead of approximate probabilities. Does that mean all of our discussion here is for naught? No, not at all! In reality, we'll most often use the Central Limit Theorem as applied to the sum of independent Bernoulli random variables to help us draw conclusions about a true population proportion \(p\). If we take the \(Z\) random variable that we've been dealing with above, and divide the numerator by \(n\) and the denominator by \(n\) (and thereby not changing the overall quantity), we get the following result:

\(Z=\dfrac{\sum X_i-np}{\sqrt{np(1-p)}}=\dfrac{\hat{p}-p}{\sqrt{\dfrac{p(1-p)}{n}}}\stackrel {d}{\longrightarrow} N(0,1)\)

The quantity:

\(\hat{p}=\dfrac{\sum\limits_{i=1}^n X_i}{n}\)

that appears in the numerator is the "sample proportion," that is, the proportion in the sample meeting the condition of interest (approving of the President's job, for example). In Stat 415, we'll use the sample proportion in conjunction with the above result to draw conclusions about the unknown population proportion \(p\). You'll definitely be seeing much more of this in Stat 415!
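
A quick numerical check confirms that the count form and the proportion form of \(Z\) are the same quantity. In this sketch the observed count of 7 approvals out of 10 is a hypothetical value chosen for illustration, and the function names are assumptions.

```python
from math import sqrt

def z_from_count(y: float, n: int, p: float) -> float:
    """Z computed on the count scale: (Y - np) / sqrt(np(1-p))."""
    return (y - n * p) / sqrt(n * p * (1 - p))

def z_from_proportion(p_hat: float, n: int, p: float) -> float:
    """Z computed on the proportion scale: (p_hat - p) / sqrt(p(1-p)/n)."""
    return (p_hat - p) / sqrt(p * (1 - p) / n)

n, p = 10, 0.5
y = 7                 # hypothetical observed number of approvals
p_hat = y / n         # corresponding sample proportion

z_count = z_from_count(y, n, p)
z_prop = z_from_proportion(p_hat, n, p)
print(z_count, z_prop)  # identical up to floating-point rounding
```

The practical consequence is that a calculation must stay on one scale throughout: either counts with standard deviation \(\sqrt{np(1-p)}\), or proportions with standard deviation \(\sqrt{p(1-p)/n}\).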
