Skip to main content

TL;DR

The expected number of draws (with replacement) needed to collect all n distinct coupons: \( E[T] = n \cdot H_n = n \sum_{k=1}^n \frac{1}{k} \approx n \ln n + \gamma n \), where \( \gamma \approx 0.577 \) is the Euler-Mascheroni constant.

By Valenke Exam Prep Team·Last updated 2026-06-03

Coupon Collector Problem

The expected number of draws (with replacement) needed to collect all n distinct coupons: \( E[T] = n \cdot H_n = n \sum_{k=1}^n \frac{1}{k} \approx n \ln n + \gamma n \), where \( \gamma \approx 0.577 \) is the Euler-Mascheroni constant.

Why it matters for interviews

A fundamental problem in probability that appears in interview questions about random sampling, completeness of datasets, and coverage problems. The harmonic number structure is elegant and widely applicable.

Definition and Mathematical Foundation

The expected number of draws (with replacement) needed to collect all n distinct coupons: \( E[T] = n \cdot H_n = n \sum_{k=1}^n \frac{1}{k} \approx n \ln n + \gamma n \), where \( \gamma \approx 0.577 \) is the Euler-Mascheroni constant.

Application in Quantitative Finance

A fundamental problem in probability that appears in interview questions about random sampling, completeness of datasets, and coverage problems. The harmonic number structure is elegant and widely applicable.

Related Concepts

Related Terms

Ready to practice for the Quant Trading Interview?

Adaptive practice powered by Item Response Theory targets your weak areas. Start with 3 free sessions.

Start free practice →

Frequently Asked Questions

How do you derive the expected time?
After collecting k coupons, the probability of a new one is \( (n-k)/n \). The waiting time for the next new coupon is geometric with mean \( n/(n-k) \). Sum over \( k = 0 \) to \( n-1 \): \( E[T] = \sum_{k=0}^{n-1} n/(n-k) = nH_n \).
What is the variance of the coupon collector time?
\( \text{Var}(T) = n^2 \sum_{k=1}^n \frac{1}{k^2} \approx n^2 \pi^2/6 \). The standard deviation is \( O(n) \), same order as the mean, indicating high variability.
What are practical applications?
Estimating how many random samples are needed to cover all categories in a dataset, the time to discover all bugs in testing, or the number of users needed to see all ad variants in A/B testing.