Hypergeometric Distribution: A Distribution of Dependent Events
By Bindeshwar Singh Kushwaha
PostNetwork Academy
Introduction
- In the previous sections, we studied distributions such as the binomial distribution.
- The binomial distribution assumes that each trial is independent and the probability of success remains constant.
- However, in many real-life problems, selections are made without replacement.
- In such cases, the trials are not independent, and the hypergeometric distribution is used.
Real-life Scenario
- Suppose we have 10 tickets numbered 1 to 10.
- We draw 3 tickets without replacement.
- Let “success” be the event of drawing an odd-numbered ticket.
- Total odd numbers = 5 (1, 3, 5, 7, 9).
- If we replace each ticket before the next draw, the probability of success stays at 5/10, the trials are independent, and we use the binomial distribution.
- If we do not replace the tickets, each draw changes what remains, the trials are dependent, and we use the hypergeometric distribution.
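To make the dependence concrete, the following is a small illustrative sketch (the variable names are assumptions, not part of the original example). It compares the chance of an odd ticket on the second draw, given that the first ticket was odd, with and without replacement.

```python
from fractions import Fraction

total_tickets = 10   # tickets numbered 1 to 10
odd_tickets = 5      # 1, 3, 5, 7, 9

# Probability that the first ticket drawn is odd (same in both schemes).
p_first_odd = Fraction(odd_tickets, total_tickets)

# With replacement: the first ticket goes back, so nothing changes.
p_second_odd_with = Fraction(odd_tickets, total_tickets)

# Without replacement: one odd ticket is gone, and so is one ticket overall.
p_second_odd_without = Fraction(odd_tickets - 1, total_tickets - 1)

print(p_first_odd, p_second_odd_with, p_second_odd_without)
```

With replacement the probability stays at 1/2; without replacement it drops to 4/9, which is exactly the dependence the hypergeometric distribution accounts for.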
Basic Definition
- The hypergeometric distribution gives the probability of obtaining exactly \(x\) successes in \(n\) draws from a finite population of size \(N\) containing \(M\) successes and \((N - M)\) failures, when sampling is done without replacement.
\[
P(X = x) = \frac{\binom{M}{x}\binom{N - M}{n - x}}{\binom{N}{n}}
\]
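The formula translates almost directly into code. The sketch below is one possible implementation, assuming Python 3.8 or later for `math.comb`; the function name `hypergeom_pmf` is chosen here for illustration.

```python
from math import comb

def hypergeom_pmf(x, N, M, n):
    """P(X = x) for n draws without replacement from N items, M of them successes."""
    # Outside the support, the probability is zero.
    if x < max(0, n - (N - M)) or x > min(n, M):
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Tickets from the scenario above: N = 10, M = 5 odd tickets, n = 3 draws.
for x in range(4):
    print(x, hypergeom_pmf(x, 10, 5, 3))
```

The guard at the top returns zero for impossible values of \(x\), such as asking for more successes than draws.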
Parameters of Hypergeometric Distribution
- \(N\) = Total number of items in the population
- \(M\) = Number of success items in the population
- \(n\) = Number of draws (sample size)
- \(x\) = Number of observed successes in the sample
Mean and Variance
- The mean or expected value is given by
\[
E(X) = n \frac{M}{N}
\]
- The variance is
\[
Var(X) = n \frac{M}{N} \frac{N - M}{N} \frac{N - n}{N - 1}
\]
The term \(\frac{N - n}{N - 1}\) is called the finite population correction factor.
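As a sanity check, the mean and variance can be computed directly from the probabilities and compared with the closed-form expressions. The sketch below does this for the ticket scenario (\(N = 10\), \(M = 5\), \(n = 3\)), using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

N, M, n = 10, 5, 3  # ticket scenario: 10 tickets, 5 odd, 3 draws

# Exact PMF over the support, kept as fractions to avoid rounding.
support = range(max(0, n - (N - M)), min(n, M) + 1)
pmf = {x: Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n)) for x in support}

# Mean and variance computed directly from the definition.
mean_direct = sum(x * p for x, p in pmf.items())
var_direct = sum((x - mean_direct) ** 2 * p for x, p in pmf.items())

# Closed-form expressions from the text.
mean_formula = Fraction(n * M, N)
fpc = Fraction(N - n, N - 1)  # finite population correction factor
var_formula = n * Fraction(M, N) * Fraction(N - M, N) * fpc

print(mean_direct, mean_formula)
print(var_direct, var_formula)
```

Both routes give a mean of 3/2 and a variance of 7/12. Without the correction factor the variance would be 3/4, the binomial value.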
Example 1: Ticket Problem
- 10 tickets are numbered from 1 to 10.
- 5 tickets have odd numbers (successes) and 5 have even numbers (failures).
- 3 tickets are drawn without replacement.
- Find the probability that exactly 2 tickets have odd numbers.
\[
P(X=2) = \frac{\binom{5}{2}\binom{5}{1}}{\binom{10}{3}} = \frac{10 \times 5}{120} = \frac{5}{12} \approx 0.4167
\]
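Because the population here is tiny, the result can also be checked by brute force. The sketch below lists every possible sample of 3 tickets and counts those with exactly 2 odd numbers; it is an independent check, not the formula itself.

```python
from itertools import combinations

tickets = range(1, 11)  # tickets numbered 1 to 10

# Every unordered way of drawing 3 distinct tickets is equally likely.
samples = list(combinations(tickets, 3))

# Keep the samples that contain exactly 2 odd-numbered tickets.
favourable = [s for s in samples if sum(t % 2 for t in s) == 2]

probability = len(favourable) / len(samples)
print(len(favourable), len(samples), probability)
```

The enumeration finds 50 favourable samples out of 120, matching the numerator and denominator of the formula.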
Example 2: Jury Selection Problem
- A jury of 12 members is chosen from a pool of 20 people.
- 8 are men (successes) and 12 are women (failures).
- Find the probability that the jury contains exactly 5 men.
\[
P(X=5) = \frac{\binom{8}{5}\binom{12}{7}}{\binom{20}{12}}
\]
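The expression above is left unevaluated; a short sketch like the following can supply the number. The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

pool, men, jury_size, men_wanted = 20, 8, 12, 5
women = pool - men

# Ways to pick 5 of the 8 men and 7 of the 12 women.
numerator = comb(men, men_wanted) * comb(women, jury_size - men_wanted)
# Ways to pick any 12 of the 20 people.
denominator = comb(pool, jury_size)

p_exact = Fraction(numerator, denominator)
print(numerator, denominator)
print(p_exact, float(p_exact))
```

The counts are 44352 and 125970, so the probability is roughly 0.352.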
Example 3: Fish Tank Problem
- A tank contains 200 fish. 60 are tagged and 140 are untagged.
- A sample of 10 fish is drawn without replacement.
- Find the probability that exactly 4 tagged fish are drawn.
\[
P(X=4) = \frac{\binom{60}{4}\binom{140}{6}}{\binom{200}{10}}
\]
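The binomial coefficients in this example are too large to evaluate comfortably by hand. A minimal sketch of the computation, assuming Python's arbitrary-precision integers via `math.comb`, might look like this:

```python
from math import comb

fish, tagged, sample, tagged_wanted = 200, 60, 10, 4
untagged = fish - tagged

# math.comb works with exact integers, so the large binomial
# coefficients do not overflow; only the final ratio is a float.
p = (comb(tagged, tagged_wanted)
     * comb(untagged, sample - tagged_wanted)
     / comb(fish, sample))
print(round(p, 4))
```

The result is close to 0.20, slightly above the binomial value for \(n = 10\) and \(p = 0.3\), as expected when the sample is a small fraction of the population.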
Example 4: Probability of Acceptance Without Further Inspection
- A lot of 25 units contains 10 defective units.
- An engineer inspects 2 randomly selected units from the lot.
- The lot is accepted if both selected units are non-defective.
- Find the probability that the lot is accepted without further inspection.
\[
P(\text{no defective units in sample}) = \frac{\binom{15}{2}}{\binom{25}{2}}
\]
\[
\frac{\binom{15}{2}}{\binom{25}{2}} = \frac{15 \times 14}{25 \times 24} = \frac{210}{600} = 0.35
\]
Answer: The probability that the lot is accepted without further inspection is \(\boxed{0.35}\).
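The same answer can be reached in two ways: by counting samples, or by multiplying the conditional probabilities of each draw. The sketch below checks that both agree; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

lot_size, defective, inspected = 25, 10, 2
good = lot_size - defective  # 15 non-defective units

# Counting argument: choose both inspected units from the good ones.
p_counting = Fraction(comb(good, inspected), comb(lot_size, inspected))

# Sequential argument: first unit good, then second unit good
# given that one good unit has already been removed.
p_sequential = Fraction(good, lot_size) * Fraction(good - 1, lot_size - 1)

print(p_counting, p_sequential, float(p_counting))
```

Both arguments give 7/20, that is 0.35. The sequential form shows the dependence between draws explicitly: the second factor is 14/24, not 15/25.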
Important Properties
- Trials are dependent since sampling is without replacement.
- The total population \(N\) is finite.
- The random variable \(X\) can take integer values within
\[
\max(0, n - (N - M)) \le X \le \min(n, M)
\]
The sum of all probabilities is 1:
\[
\sum_{x} P(X=x) = 1
\]
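Both properties, the range of \(X\) and the probabilities summing to 1, can be checked numerically. The sketch below uses the jury example and a second set of hypothetical numbers (\(N = 10\), \(M = 7\), \(n = 5\)) chosen so that the lower bound is positive.

```python
from fractions import Fraction
from math import comb

def support(N, M, n):
    """Integer values x with P(X = x) > 0."""
    return range(max(0, n - (N - M)), min(n, M) + 1)

def pmf(x, N, M, n):
    """Exact hypergeometric probability as a fraction."""
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))

# Case 1: jury example (N = 20, M = 8, n = 12); the lower bound is 0.
# Case 2: hypothetical numbers where the lower bound is positive,
#         because there are only 3 failures but 5 draws.
results = {}
for N, M, n in [(20, 8, 12), (10, 7, 5)]:
    xs = support(N, M, n)
    total = sum(pmf(x, N, M, n) for x in xs)
    results[(N, M, n)] = (list(xs), total)
    print((N, M, n), list(xs), total)
```

In the second case at least 2 successes are guaranteed, since 5 draws cannot all come from only 3 failures. The total is exactly 1 in both cases.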
Comparison with Binomial Distribution
- The Binomial Distribution assumes independent trials with constant probability \(p\).
- The Hypergeometric Distribution assumes dependent trials without replacement.
- When \(N\) is large relative to \(n\), removing an item barely changes the proportion of successes, so the hypergeometric distribution is well approximated by the binomial distribution with \(p = M/N\).
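The quality of the binomial approximation can be explored numerically. The sketch below fixes \(n = 10\) and \(M/N = 0.3\) (illustrative values, not from the examples above) and measures the largest gap between the two distributions as \(N\) grows.

```python
from math import comb

def hypergeom_pmf(x, N, M, n):
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3  # sample size and success fraction M / N (illustrative values)

gaps = {}
for N in (20, 200, 10_000):
    M = round(p * N)
    # Largest pointwise gap between the two PMFs over x = 0..n.
    gaps[N] = max(abs(hypergeom_pmf(x, N, M, n) - binom_pmf(x, n, p))
                  for x in range(n + 1))
    print(N, gaps[N])
```

The gap shrinks as \(N\) increases, which is the sense in which sampling without replacement from a large population behaves like independent trials.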
Summary
- Used when sampling is done without replacement.
- Suitable for finite populations.
- Depends on \(N\), \(M\), and \(n\).
- Probability formula:
\[
P(X = x) = \frac{\binom{M}{x}\binom{N - M}{n - x}}{\binom{N}{n}}
\]
Mean and variance summarize the distribution’s behavior.
Reach PostNetwork Academy
- Website: www.postnetwork.co
- YouTube Channel: www.youtube.com/@postnetworkacademy
- Facebook Page: www.facebook.com/postnetworkacademy
- LinkedIn Page: www.linkedin.com/company/postnetworkacademy
- GitHub Repositories: www.github.com/postnetworkacademy
Thank You!