In probability and statistics, the Yule-Simon distribution is a discrete probability distribution. It is named after Udny Yule and Herbert Simon. Simon originally called it the Yule distribution.
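The probability mass function of the Yule-Simon(ρ) distribution is
$$f(k;\rho) = \rho\,\mathrm{B}(k, \rho+1),$$
for integer k ≥ 1 and real ρ > 0, where B is the beta function. Equivalently, the pmf can be written in terms of the falling factorial as
$$f(k;\rho) = \frac{\rho\,\Gamma(\rho+1)}{(k+\rho)^{\underline{\rho+1}}},$$
where Γ is the gamma function and the generalized falling factorial is $(k+\rho)^{\underline{\rho+1}} = \Gamma(k+\rho+1)/\Gamma(k)$.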
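As a quick numerical check of the definition, the following Python sketch evaluates the pmf f(k; ρ) = ρ B(k, ρ+1) through the log-gamma function and compares it with the gamma-function form ρ Γ(ρ+1) Γ(k) / Γ(k+ρ+1). The function names are illustrative and not taken from any library; only the standard `math` module is assumed.

```python
import math


def yule_simon_pmf(k: int, rho: float) -> float:
    """Evaluate f(k; rho) = rho * B(k, rho + 1) for integer k >= 1, rho > 0.

    The beta function is computed as exp(lgamma(k) + lgamma(rho + 1)
    - lgamma(k + rho + 1)) so that large k does not overflow.
    """
    if k < 1 or rho <= 0:
        raise ValueError("requires integer k >= 1 and rho > 0")
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)


def yule_simon_pmf_gamma_form(k: int, rho: float) -> float:
    """Same pmf written with the generalized falling factorial.

    (k + rho) to the falling power (rho + 1) equals
    Gamma(k + rho + 1) / Gamma(k). math.gamma overflows for large
    arguments, so this form is only meant for small k.
    """
    falling_factorial = math.gamma(k + rho + 1) / math.gamma(k)
    return rho * math.gamma(rho + 1) / falling_factorial


if __name__ == "__main__":
    rho = 2.0
    for k in range(1, 6):
        print(k, yule_simon_pmf(k, rho), yule_simon_pmf_gamma_form(k, rho))
    # Partial sum of the pmf; it should approach 1 as more terms are added.
    print(sum(yule_simon_pmf(k, rho) for k in range(1, 10_001)))
```

For ρ = 2 the pmf reduces to 4 / (k (k+1) (k+2)), which gives a convenient hand check of the first few values.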
The probability mass function f has the property that for sufficiently large k we have
$$f(k) \approx \frac{\rho\,\Gamma(\rho+1)}{k^{\rho+1}} \propto \frac{1}{k^{\rho+1}}.$$
This means that the tail of the Yule-Simon distribution is a realization of Zipf's law: f(k) can be used to model, for example, the relative frequency of the kth most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of k.
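The power-law tail can be illustrated numerically. The sketch below compares the exact pmf ρ B(k, ρ+1) with the approximation ρ Γ(ρ+1) / k^(ρ+1) and estimates the slope of the pmf on a log-log scale, which should be close to −(ρ+1) for large k. The helper names are hypothetical and the choice ρ = 1.5 is arbitrary.

```python
import math


def yule_simon_pmf(k: int, rho: float) -> float:
    """Exact pmf f(k; rho) = rho * B(k, rho + 1), computed via log-gamma."""
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)


def tail_approximation(k: int, rho: float) -> float:
    """Large-k approximation rho * Gamma(rho + 1) / k**(rho + 1)."""
    return rho * math.gamma(rho + 1) / k ** (rho + 1)


def loglog_slope(k1: int, k2: int, rho: float) -> float:
    """Slope of log f(k) against log k between two values of k."""
    rise = math.log(yule_simon_pmf(k2, rho)) - math.log(yule_simon_pmf(k1, rho))
    run = math.log(k2) - math.log(k1)
    return rise / run


if __name__ == "__main__":
    rho = 1.5
    for k in (10, 100, 1_000, 10_000):
        ratio = yule_simon_pmf(k, rho) / tail_approximation(k, rho)
        print(k, ratio)  # the ratio tends to 1 as k grows
    print(loglog_slope(1_000, 10_000, rho))  # close to -(rho + 1)
```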
Generalizations
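Simon also hinted at a two-parameter generalization of the Yule-Simon distribution, in which the beta function is replaced by an incomplete beta function. The probability mass function of the generalized Yule-Simon(ρ, α) distribution is defined as
$$f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^{\rho}}\,\mathrm{B}_{1-\alpha}(k, \rho+1),$$
with 0 ≤ α < 1, where $\mathrm{B}_{x}(a, b) = \int_0^x t^{a-1}(1-t)^{b-1}\,dt$ is the incomplete beta function. For α = 0 the ordinary Yule-Simon(ρ) distribution is obtained as a special case.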
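The generalized pmf, (ρ / (1 − α^ρ)) B_{1−α}(k, ρ+1) with 0 ≤ α < 1, can be sketched in Python as follows. To stay within the standard library, the incomplete beta function B_x(a, b) = ∫₀ˣ t^(a−1) (1−t)^(b−1) dt is approximated here with composite Simpson quadrature. That is a simplification chosen for illustration, not a recommended numerical method, and all function names are hypothetical.

```python
import math


def incomplete_beta(x: float, a: float, b: float, n: int = 2000) -> float:
    """Approximate B_x(a, b) = integral of t**(a-1) * (1-t)**(b-1) on [0, x].

    Uses composite Simpson's rule with n (even) subintervals. Assumes
    a >= 1 and b >= 1 so that the integrand is bounded on [0, 1].
    """
    if n % 2:
        n += 1

    def integrand(t: float) -> float:
        # Clamp 1 - t at zero to guard against rounding at the endpoint.
        return t ** (a - 1) * max(0.0, 1.0 - t) ** (b - 1)

    h = x / n
    total = integrand(0.0) + integrand(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def generalized_yule_simon_pmf(k: int, rho: float, alpha: float) -> float:
    """f(k; rho, alpha) = rho / (1 - alpha**rho) * B_{1-alpha}(k, rho + 1)."""
    if k < 1 or rho <= 0 or not 0.0 <= alpha < 1.0:
        raise ValueError("requires k >= 1, rho > 0 and 0 <= alpha < 1")
    return rho / (1.0 - alpha ** rho) * incomplete_beta(1.0 - alpha, k, rho + 1)


if __name__ == "__main__":
    # With alpha = 0 this should agree with the ordinary pmf rho * B(k, rho + 1).
    rho = 2.0
    for k in range(1, 6):
        ordinary = rho * math.exp(
            math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
        )
        print(k, generalized_yule_simon_pmf(k, rho, 0.0), ordinary)
```

Summing the series inside the integral shows why the factor ρ / (1 − α^ρ) normalizes the pmf: the sum over k ≥ 1 of t^(k−1) equals 1 / (1 − t), so the total mass of B_{1−α}(k, ρ+1) is ∫₀^(1−α) (1−t)^(ρ−1) dt = (1 − α^ρ) / ρ.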
References
- Herbert A. Simon, On a Class of Skew Distribution Functions, Biometrika 42(3/4): 425–440, December 1955.