Fitting of Poisson Distribution

Bindeshwar Singh Kushwaha — PostNetwork Academy


Introduction

Master the technique of fitting the Poisson distribution to real-world frequency data.
This tutorial shows a step-by-step method to calculate theoretical frequencies for observed datasets.

Key Concepts & Techniques

  • Introduction to Fitting: Fit a theoretical Poisson distribution to experimental data to derive expected frequencies.
  • The Recurrence Advantage: Use the recurrence relation for Poisson probabilities to compute successive probabilities easily.
  • Calculating Expected Frequencies: Determine theoretical frequency \( f(x) \) from the total number of observations \( N \).
  • The Fitting Procedure:
    1. Calculate the mean \( \lambda \) from observed data.
    2. Find the initial probability \( p(0) \).
    3. Apply the recurrence formula to get \( p(1), p(2), \dots \).
    4. Multiply probabilities by \( N \) to obtain theoretical frequencies.

Recurrence Formula for the Poisson Probabilities

For a Poisson distribution with parameter \( \lambda \):

\( p(x) = \dfrac{e^{-\lambda}\,\lambda^x}{x!} \)   … (1)

If we change \( x \) to \( x+1 \):

\( p(x+1) = \dfrac{e^{-\lambda}\,\lambda^{x+1}}{(x+1)!} \)   … (2)

Divide (2) by (1):

\( \dfrac{p(x+1)}{p(x)} = \dfrac{e^{-\lambda}\,\lambda^{x+1}}{(x+1)!} \cdot \dfrac{x!}{e^{-\lambda}\,\lambda^{x}} = \dfrac{\lambda}{x+1} \)

So the recurrence relation is:

\( p(x+1) = \dfrac{\lambda}{x+1}\,p(x) \)   … (3)


Using the Recurrence Relation

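  • Relation (3) holds for every \( x = 0, 1, 2, \dots \), so each probability follows from the previous one with a single multiplication, without recomputing powers or factorials.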
  • Start with \( p(0) = e^{-\lambda} \), then compute \( p(1), p(2), \dots \) successively using (3).
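
The successive computation can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `poisson_probs` and the value \( \lambda = 0.5 \) are assumptions chosen for the demonstration. The loop applies relation (3) and the printout cross-checks each value against the closed form (1).

```python
import math

def poisson_probs(lam, max_x):
    """Return [p(0), ..., p(max_x)] using p(x+1) = lam / (x+1) * p(x)."""
    probs = [math.exp(-lam)]                      # p(0) = e^{-lam}
    for x in range(max_x):
        probs.append(probs[-1] * lam / (x + 1))   # relation (3)
    return probs

# Cross-check against the closed form (1) for an illustrative lam
lam = 0.5
for x, p in enumerate(poisson_probs(lam, 4)):
    direct = math.exp(-lam) * lam**x / math.factorial(x)
    print(x, round(p, 6), round(direct, 6))
```

Both columns of the printout should agree, since the recurrence is only a rearrangement of formula (1).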

Poisson Frequency Distribution

If an experiment follows Poisson assumptions and is repeated \( N \) times, the expected frequency of observing \( x \) occurrences is

\( f(x) = N \cdot P(X=x) = N \cdot \dfrac{e^{-\lambda}\,\lambda^x}{x!}, \quad x = 0,1,2,\dots \)
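
As a rough sketch of this formula, the snippet below evaluates \( f(x) \) for a table truncated at \( x = 5 \). The helper name `expected_frequencies` is hypothetical, and the values \( \lambda = 1 \), \( N = 100 \) are used only for illustration. It also shows that the frequencies of a truncated table add up to slightly less than \( N \), because the classes beyond the last value of \( x \) are left out.

```python
import math

def expected_frequencies(lam, n_trials, max_x):
    """f(x) = N * e^{-lam} * lam^x / x! for x = 0..max_x."""
    return [n_trials * math.exp(-lam) * lam**x / math.factorial(x)
            for x in range(max_x + 1)]

# Illustrative values only: lam = 1, N = 100, table truncated at x = 5
freqs = expected_frequencies(1.0, 100, 5)
print([round(f, 2) for f in freqs])
print(round(sum(freqs), 2))   # slightly below N because x > 5 is cut off
```

In hand calculations the small shortfall usually disappears once the frequencies are rounded to whole numbers.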


Example 1 — Defective Bottles

A manufacturer knows that 0.1% of its bottles are defective. The bottles are packed in boxes of 500, and a buyer purchases 100 boxes.
Using the Poisson distribution, find how many boxes are expected to contain at least two defective bottles.

Step 1: Parameters

  • Probability defective: \( p = \dfrac{0.1}{100} = 0.001 \).
  • Box size \( n = 500 \Rightarrow \lambda = n p = 500 \times 0.001 = 0.5 \).
  • Number of boxes \( N = 100 \).
  • Poisson PMF: \( P(X=x) = \dfrac{e^{-0.5}(0.5)^x}{x!} \).

Step 2: Probability \( X \ge 2 \)

\( P(X \ge 2) = 1 - [P(X=0) + P(X=1)] \)

\( = 1 - \left[ \dfrac{e^{-0.5}(0.5)^0}{0!} + \dfrac{e^{-0.5}(0.5)^1}{1!} \right] \)
\( = 1 - e^{-0.5}(1 + 0.5) \)

Numerical: \( e^{-0.5} \approx 0.60653 \Rightarrow P(X \ge 2) \approx 1 - 0.60653\times1.5 = 0.090205 \)

Step 3: Expected Number of Boxes

Expected = \( N \times P(X\ge2) = 100 \times 0.090205 \approx 9.02 \)

So about 9 boxes are expected to contain at least 2 defective bottles.
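
The arithmetic of Example 1 can be checked with a short script. This is a sketch, not part of the original solution: the variable names are illustrative, and the exact binomial calculation is added here only as a comparison to show how close the Poisson value is when \( n = 500 \) and \( p = 0.001 \).

```python
import math

p_defective = 0.001   # 0.1% of bottles
box_size = 500
n_boxes = 100
lam = box_size * p_defective          # lambda = n * p = 0.5

# Poisson: P(X >= 2) = 1 - e^{-lam} * (1 + lam)
p_poisson = 1 - math.exp(-lam) * (1 + lam)

# Exact binomial for comparison (an addition, not in the example): 1 - P(0) - P(1)
q = 1 - p_defective
p_binomial = 1 - q**box_size - box_size * p_defective * q**(box_size - 1)

print(round(n_boxes * p_poisson, 2))    # expected boxes, Poisson
print(round(n_boxes * p_binomial, 2))   # expected boxes, binomial
```

Both models lead to about 9 boxes, which is why the simpler Poisson formula is used here.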


Process of Fitting a Poisson Distribution (Summary)

  1. Compute mean \( \bar{x} = \dfrac{\sum f x}{\sum f} \) and use it as \( \lambda \).
  2. Compute \( p(0) = e^{-\lambda} \).
  3. Use recurrence \( p(x+1) = \dfrac{\lambda}{x+1} p(x) \) to find additional probabilities.
  4. Compute theoretical frequencies \( f(x) = N \cdot p(x) \).
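
The four steps can be collected into one small function. The sketch below assumes the observed frequencies are given as a list indexed by \( x = 0, 1, 2, \dots \); the name `fit_poisson` is hypothetical. The usage lines apply it to the typist data of Example 4 further down.

```python
import math

def fit_poisson(observed):
    """observed[x] is the observed frequency of the value x (x = 0, 1, 2, ...).

    Returns (lam, theoretical) where theoretical[x] = N * p(x).
    """
    n_total = sum(observed)
    lam = sum(x * f for x, f in enumerate(observed)) / n_total  # step 1: mean
    p = math.exp(-lam)                                          # step 2: p(0)
    theoretical = []
    for x in range(len(observed)):
        theoretical.append(n_total * p)                         # step 4: N * p(x)
        p *= lam / (x + 1)                                      # step 3: recurrence
    return lam, theoretical

# Mistakes per page from Example 4 (x = 0..5)
lam, theoretical = fit_poisson([42, 33, 14, 6, 4, 1])
print(lam)
print([round(f) for f in theoretical])
```

Rounding is left to the caller so that the unrounded frequencies remain available for further checks.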

Example 2 — Aircraft Accidents (Fitting Example)

Data for 2480 pilots (number of accidents):

Number of accidents (X)     0     1     2     3     4     5
Observed frequency (f)   1970   422    71    13     3     1

Step 1: Mean \( \lambda \)

\( N = \sum f = 1970 + 422 + 71 + 13 + 3 + 1 = 2480 \).
\( \sum fX = 0\cdot1970 + 1\cdot422 + 2\cdot71 + 3\cdot13 + 4\cdot3 + 5\cdot1 = 620 \).
\( \lambda = \dfrac{620}{2480} = 0.25 \).

Step 2: \( p(0) \)

\( p(0) = e^{-0.25} \approx 0.7788008 \) (rounded to 0.7788)

Step 3: Probabilities by recurrence (λ = 0.25)

  • \( p(1) = p(0) \times \dfrac{0.25}{1} \approx 0.7788 \times 0.25 = 0.1947 \)
  • \( p(2) = p(1) \times \dfrac{0.25}{2} \approx 0.1947 \times 0.125 = 0.02434 \)
  • \( p(3) \approx 0.02434 \times \dfrac{0.25}{3} \approx 0.00203 \)
  • \( p(4) \approx 0.00203 \times \dfrac{0.25}{4} \approx 0.000127 \)
  • \( p(5) \approx 0.000127 \times \dfrac{0.25}{5} \approx 6.35\times10^{-6} \)

Step 4: Theoretical frequencies \( f(x) = 2480 \times p(x) \)

  • \( f(0) \approx 2480 \times 0.7788 \approx 1931 \)
  • \( f(1) \approx 2480 \times 0.1947 \approx 483 \)
  • \( f(2) \approx 2480 \times 0.02434 \approx 60 \)
  • \( f(3) \approx 2480 \times 0.00203 \approx 5 \)
  • \( f(4) \approx 2480 \times 0.000127 \approx 0 \)
  • \( f(5) \approx 2480 \times 6.35\times10^{-6} \approx 0 \)

Comparison Table

Accidents (X)              0     1     2     3     4     5
Observed (f)            1970   422    71    13     3     1
Theoretical \( f(x) \)  1931   483    60     5     0     0

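Conclusion: The Poisson distribution with \( \lambda = 0.25 \) reproduces the overall shape of the observed accident data, but the agreement is only approximate: it overestimates the number of pilots with one accident (483 against 422) and underestimates every class with two or more accidents.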
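
To see class by class how far the observed and theoretical frequencies lie apart, the comparison can be tabulated with a short script. This is only a sketch: it uses the closed-form probability rather than the recurrence, and the variable names are illustrative.

```python
import math

observed = [1970, 422, 71, 13, 3, 1]      # pilots with 0..5 accidents
n_total = sum(observed)
lam = sum(x * f for x, f in enumerate(observed)) / n_total   # 0.25

# Theoretical frequencies N * p(x), rounded to whole pilots
theoretical = [round(n_total * math.exp(-lam) * lam**x / math.factorial(x))
               for x in range(len(observed))]
differences = [o - t for o, t in zip(observed, theoretical)]

for x, (o, t, d) in enumerate(zip(observed, theoretical, differences)):
    print(f"x={x}: observed={o:5d} theoretical={t:5d} difference={d:+d}")
```

A positive difference means the fitted distribution predicts fewer pilots in that class than were observed.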


Example 3 — Fountain Pens (Poisson Approximation)

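Scenario: each fountain pen is defective with probability \( p = \dfrac{1}{500} \). Pens are sold in packets of \( n = 10 \), and a consignment contains \( N = 20000 \) packets. Find the expected number of packets with exactly one and with exactly two defective pens.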

Step 1: Mean

\( \lambda = n p = 10 \times \dfrac{1}{500} = 0.02 \).

Step 2: Poisson formula and \( e^{-\lambda} \)

\( P[X=x] = \dfrac{e^{-\lambda}\,\lambda^x}{x!} \), and \( e^{-0.02} \approx 0.9801987 \) (rounded 0.9802).

Step 3: Packets with exactly one defective \( X=1 \)

\( P[X=1] = e^{-0.02}\cdot 0.02 \approx 0.9802 \times 0.02 = 0.019604 \).
\( f(1) = 20000 \times 0.019604 \approx 392.08 \Rightarrow \textbf{392 packets} \).

Step 4: Packets with exactly two defectives \( X=2 \)

\( P[X=2] = e^{-0.02}\dfrac{(0.02)^2}{2!} \approx \dfrac{0.9802 \times 0.0004}{2} = 0.00019604 \).
\( f(2) = 20000 \times 0.00019604 \approx 3.9208 \Rightarrow \textbf{4 packets} \).

Summary: \( \lambda = 0.02 \), \( f(1)\approx392 \), \( f(2)\approx4 \).
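
Because this example uses the Poisson distribution as an approximation to the binomial, it can be instructive to put the two side by side. The sketch below is an addition for comparison only; the helper names `poisson_pmf` and `binomial_pmf` are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math

n, p, n_packets = 10, 1 / 500, 20000
lam = n * p                                   # 0.02

def poisson_pmf(x):
    """Poisson probability of x defectives with mean lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)

def binomial_pmf(x):
    """Exact binomial probability of x defectives among n pens."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

# Expected number of packets: Poisson value, then binomial value
for x in (1, 2):
    print(x,
          round(n_packets * poisson_pmf(x), 2),
          round(n_packets * binomial_pmf(x), 2))
```

For these values the two models differ by less than one packet in each class, so the approximation is adequate for this problem.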


Example 4 — Typist Mistakes (100 pages)

Mistakes per page (X)    0    1    2    3    4    5
Frequency (f)           42   33   14    6    4    1

Step 1: Mean \( \lambda \)

\( N = 100 \).
\( \sum fX = 0\cdot42 + 1\cdot33 + 2\cdot14 + 3\cdot6 + 4\cdot4 + 5\cdot1 = 100 \).
\( \lambda = \dfrac{100}{100} = 1 \).

Step 2: Initial probability \( p(0) \)

\( p(0) = e^{-1} \approx 0.367879 \) (rounded to 0.3679).

Step 3: Probabilities using recurrence (λ = 1)

  • \( p(1) = p(0) \times \dfrac{1}{1} = 0.3679 \)
  • \( p(2) = p(1) \times \tfrac{1}{2} \approx 0.1840 \)
  • \( p(3) = p(2) \times \tfrac{1}{3} \approx 0.0613 \)
  • \( p(4) = p(3) \times \tfrac{1}{4} \approx 0.0153 \)
  • \( p(5) = p(4) \times \tfrac{1}{5} \approx 0.0031 \)

Step 4: Theoretical frequencies (N = 100)

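  • \( f(0) \approx 100 \times 0.3679 \approx 37 \)
  • \( f(1) \approx 100 \times 0.3679 \approx 37 \)
  • \( f(2) \approx 100 \times 0.1840 \approx 18 \)
  • \( f(3) \approx 100 \times 0.0613 \approx 6 \)
  • \( f(4) \approx 100 \times 0.0153 \approx 2 \)
  • \( f(5) \approx 100 \times 0.0031 \approx 0 \)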

Comparison Table

Mistakes (X)             0    1    2    3    4    5   Total
Observed (f)            42   33   14    6    4    1     100
Theoretical \( f(x) \)  37   37   18    6    2    0     100

Conclusion: The Poisson distribution with \( \lambda = 1 \) fits the typist data reasonably well: the theoretical frequencies differ from the observed counts by at most 5 pages in any class.
