
Relationships among probability distributions

In probability theory and statistics, there are several relationships among probability distributions. These relations can be categorized in the following groups:

  • One distribution is a special case of another with a broader parameter space;
  • Transforms (function of a random variable);
  • Combinations (function of several variables);
  • Approximation (limit) relationships;
  • Compound relationships (useful for Bayesian inference);
  • Duality[clarification needed];
  • Conjugate priors.
Relationships among some univariate probability distributions are illustrated with connected lines; dashed lines indicate an approximate relationship. More information:[1]
Relationships between univariate probability distributions in ProbOnto.[2]

Special case of distribution parametrization


Transform of a variable


Multiple of a random variable


Multiplying the variable by any positive real constant yields a scaling of the original distribution. Some are self-replicating, meaning that the scaling yields the same family of distributions, albeit with a different parameter: normal distribution, gamma distribution, Cauchy distribution, exponential distribution, Erlang distribution, Weibull distribution, logistic distribution, error distribution, power-law distribution, Rayleigh distribution.

Example:

  • If X is a gamma random variable with shape and rate parameters (α, β), then Y = aX is a gamma random variable with parameters (α, β/a).
  • If X is a gamma random variable with shape and scale parameters (k, θ), then Y = aX is a gamma random variable with parameters (k, aθ).
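
A minimal simulation sketch of the first example above, in Python with NumPy and SciPy (illustrative only; the parameter values are arbitrary). Note that NumPy's and SciPy's gamma routines take a scale parameter, so the rate β is passed as scale = 1/β.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, beta, a = 3.0, 2.0, 5.0               # shape, rate, positive scaling constant

# X ~ Gamma(shape=alpha, rate=beta); NumPy's gamma uses scale = 1/rate.
x = rng.gamma(alpha, scale=1.0 / beta, size=100_000)
y = a * x

# Claim: Y = aX ~ Gamma(shape=alpha, rate=beta/a), i.e. scale = a/beta.
ks = stats.kstest(y, stats.gamma(alpha, scale=a / beta).cdf)
print(ks.pvalue)   # a p-value that is not tiny is consistent with the claim
```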

Linear function of a random variable


The affine transform ax + b yields a relocation and scaling of the original distribution. The following are self-replicating: Normal distribution, Cauchy distribution, Logistic distribution, Error distribution, Power distribution, Rayleigh distribution.

Example:

  • If Z is a normal random variable with parameters (μ = m, σ² = s²), then X = aZ + b is a normal random variable with parameters (μ = am + b, σ² = a²s²).
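
A brief simulation sketch of this example in Python with NumPy (illustrative; the constants are arbitrary), checking the claimed mean and variance of the transformed variable.

```python
import numpy as np

rng = np.random.default_rng(1)
m, s, a, b = 1.0, 2.0, -3.0, 4.0

z = rng.normal(loc=m, scale=s, size=200_000)
x = a * z + b

# Claimed parameters of X = aZ + b: mean a*m + b, variance a^2 * s^2.
print(x.mean(), a * m + b)       # the two values should be close
print(x.var(), a**2 * s**2)      # likewise, up to sampling error
```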

Reciprocal of a random variable


The reciprocal 1/X of a random variable X is a member of the same family of distributions as X in the following cases: Cauchy distribution, F distribution, log-logistic distribution.

Examples:

  • If X is a Cauchy (μ, σ) random variable, then 1/X is a Cauchy (μ/C, σ/C) random variable where C = μ² + σ².
  • If X is an F(ν1, ν2) random variable, then 1/X is an F(ν2, ν1) random variable.
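
A minimal simulation check of the second example in Python with NumPy/SciPy (illustrative; the degrees of freedom are arbitrary), comparing 1/X against the F distribution with swapped degrees of freedom.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
nu1, nu2 = 5, 8

x = stats.f(nu1, nu2).rvs(size=100_000, random_state=rng)

# Claim: 1/X ~ F(nu2, nu1).
print(stats.kstest(1.0 / x, stats.f(nu2, nu1).cdf).pvalue)
```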

Other cases


Some distributions are invariant under a specific transformation.

Example:

  • If X is a beta (α, β) random variable then (1 − X) is a beta (β, α) random variable.
  • If X is a binomial (n, p) random variable then (n − X) is a binomial (n, 1 − p) random variable.
  • If X has a continuous cumulative distribution function FX, then FX(X) is a standard uniform (0, 1) random variable (the probability integral transform; see the sketch after this list).
  • If X is a normal (μ, σ²) random variable then e^X is a lognormal (μ, σ²) random variable; conversely, if X is a lognormal (μ, σ²) random variable then log X is a normal (μ, σ²) random variable.
  • If X is an exponential random variable with mean β, then X^(1/γ) is a Weibull (γ, β) random variable.
  • The square of a standard normal random variable has a chi-squared distribution with one degree of freedom.
  • If X is a Student's t random variable with ν degrees of freedom, then X² is an F(1, ν) random variable.
  • If X is a double exponential random variable with mean 0 and scale λ, then |X| is an exponential random variable with mean λ.
  • A geometric random variable is the floor of an exponential random variable.
  • A rectangular random variable is the floor of a uniform random variable.
  • A reciprocal random variable is the exponential of a uniform random variable.
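
A minimal simulation sketch in Python with NumPy/SciPy (illustrative; the parameter values are arbitrary) for two of the items above: the probability integral transform and the normal-to-lognormal relationship. SciPy's lognorm is parametrized by shape s = σ and scale = exp(μ).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
mu, sigma = 0.5, 1.2

x = rng.normal(mu, sigma, size=100_000)

# Probability integral transform: F_X(X) should be standard uniform.
u = stats.norm(mu, sigma).cdf(x)
print(stats.kstest(u, "uniform").pvalue)

# e^X should be log-normal with the same (mu, sigma^2).
y = np.exp(x)
print(stats.kstest(y, stats.lognorm(s=sigma, scale=np.exp(mu)).cdf).pvalue)
```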

Functions of several variables


Sum of variables


The distribution of the sum of independent random variables is the convolution of their distributions. Suppose Z is the sum of n independent random variables X1, …, Xn, each with probability mass function fXi. Then

fZ = fX1 ∗ fX2 ∗ ⋯ ∗ fXn,

where ∗ denotes convolution.

If the sum has a distribution from the same family of distributions as the original variables, that family is said to be closed under convolution. Some of these families are also stable distributions (see also Discrete-stable distribution), although closure under convolution does not by itself imply stability.
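
As an illustrative check, the following Python/NumPy/SciPy sketch convolves two Poisson probability mass functions numerically and compares the result with the Poisson distribution of the summed means, demonstrating both the convolution formula above and closure under convolution. The truncation range is an arbitrary choice made large enough that the neglected tail mass is negligible.

```python
import numpy as np
from scipy import stats

mu1, mu2 = 2.0, 3.5
k = np.arange(0, 60)            # truncation range (arbitrary; tail mass negligible here)

p1 = stats.poisson.pmf(k, mu1)
p2 = stats.poisson.pmf(k, mu2)

# PMF of the sum by discrete convolution, restricted to the same range.
p_sum = np.convolve(p1, p2)[: k.size]

# Closure under convolution: the result should match Poisson(mu1 + mu2).
print(np.max(np.abs(p_sum - stats.poisson.pmf(k, mu1 + mu2))))   # close to 0
```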

Examples of such univariate distributions are: normal distributions, Poisson distributions, binomial distributions (with common success probability), negative binomial distributions (with common success probability), gamma distributions (with common rate parameter), chi-squared distributions, Cauchy distributions, hyperexponential distributions.

Examples:[3][4]

    • If X1 and X2 are Poisson random variables with means μ1 and μ2 respectively, then X1 + X2 is a Poisson random variable with mean μ1 + μ2.
    • The sum of gamma (αi, β) random variables has a gamma (Σαi, β) distribution.
    • If X1 is a Cauchy (μ1, σ1) random variable and X2 is a Cauchy (μ2, σ2) random variable, then X1 + X2 is a Cauchy (μ1 + μ2, σ1 + σ2) random variable.
    • If X1 and X2 are chi-squared random variables with ν1 and ν2 degrees of freedom respectively, then X1 + X2 is a chi-squared random variable with ν1 + ν2 degrees of freedom.
    • If X1 is a normal (μ1, σ1²) random variable and X2 is a normal (μ2, σ2²) random variable, then X1 + X2 is a normal (μ1 + μ2, σ1² + σ2²) random variable.
    • The sum of N chi-squared (1) random variables has a chi-squared distribution with N degrees of freedom.

Other distributions are not closed under convolution, but their sum has a known distribution:

  • The sum of n Bernoulli (p) random variables is a binomial (n, p) random variable.
  • The sum of n geometric random variables with probability of success p is a negative binomial random variable with parameters n and p.
  • The sum of n exponential (β) random variables is a gamma (n, β) random variable. Since n is an integer, this gamma distribution is also an Erlang distribution (see the sketch after this list).
  • The sum of the squares of N standard normal random variables has a chi-squared distribution with N degrees of freedom.
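
A minimal simulation sketch in Python with NumPy/SciPy (illustrative; n and β are arbitrary) for the exponential-to-Erlang item above, treating β as the exponential mean (so scale = β) as in the text.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, beta = 5, 2.0                 # number of summands, exponential mean

x = rng.exponential(scale=beta, size=(200_000, n)).sum(axis=1)

# Claim: the sum is Gamma(shape=n, scale=beta), i.e. an Erlang distribution.
print(stats.kstest(x, stats.gamma(n, scale=beta).cdf).pvalue)
```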

Product of variables


The product of independent random variables X and Y may belong to the same family of distributions as X and Y: Bernoulli distribution and log-normal distribution.

Example:

  • If X1 and X2 are independent log-normal random variables with parameters (μ1, σ1²) and (μ2, σ2²) respectively, then X1 X2 is a log-normal random variable with parameters (μ1 + μ2, σ1² + σ2²).

(See also Product distribution.)
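
A brief simulation sketch of the log-normal example in Python with NumPy/SciPy (illustrative; parameter values are arbitrary). SciPy's lognorm uses shape s = σ and scale = exp(μ).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
mu1, s1, mu2, s2 = 0.2, 0.5, -0.3, 1.0

x1 = rng.lognormal(mean=mu1, sigma=s1, size=200_000)
x2 = rng.lognormal(mean=mu2, sigma=s2, size=200_000)

# Claim: X1*X2 is log-normal with parameters (mu1 + mu2, s1^2 + s2^2).
target = stats.lognorm(s=np.sqrt(s1**2 + s2**2), scale=np.exp(mu1 + mu2))
print(stats.kstest(x1 * x2, target.cdf).pvalue)
```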

Minimum and maximum of independent random variables


For some distributions, the minimum value of several independent random variables is a member of the same family, with different parameters: Bernoulli distribution, Geometric distribution, Exponential distribution, Extreme value distribution, Pareto distribution, Rayleigh distribution, Weibull distribution.

Examples:

  • If X1 and X2 are independent geometric random variables with probability of success p1 and p2 respectively, then min(X1, X2) is a geometric random variable with probability of success p = p1 + p2 − p1p2. The relationship is simpler if expressed in terms of the probability of failure: q = q1q2.
  • If X1 and X2 are independent exponential random variables with rates μ1 and μ2 respectively, then min(X1, X2) is an exponential random variable with rate μ = μ1 + μ2.
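
A minimal simulation sketch of the exponential example in Python with NumPy/SciPy (illustrative; the rates are arbitrary). NumPy's exponential is parametrized by scale = 1/rate.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
rate1, rate2 = 0.5, 2.0

x1 = rng.exponential(scale=1.0 / rate1, size=200_000)
x2 = rng.exponential(scale=1.0 / rate2, size=200_000)

# Claim: min(X1, X2) is exponential with rate rate1 + rate2.
m = np.minimum(x1, x2)
print(stats.kstest(m, stats.expon(scale=1.0 / (rate1 + rate2)).cdf).pvalue)
```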

Similarly, distributions for which the maximum value of several independent random variables is a member of the same family of distribution include: Bernoulli distribution, Power law distribution.

Other

  • If X and Y are independent standard normal random variables, X/Y is a Cauchy (0,1) random variable.
  • If X1 and X2 are independent chi-squared random variables with ν1 and ν2 degrees of freedom respectively, then (X1/ν1)/(X2/ν2) is an F(ν1, ν2) random variable.
  • If X is a standard normal random variable and U is an independent chi-squared random variable with ν degrees of freedom, then X/√(U/ν) is a Student's t(ν) random variable (see the sketch below).
  • If X1 is a gamma (α1, 1) random variable and X2 is an independent gamma (α2, 1) random variable, then X1/(X1 + X2) is a beta(α1, α2) random variable. More generally, if X1 is a gamma(α1, β1) random variable and X2 is an independent gamma(α2, β2) random variable, then β2X1/(β2X1 + β1X2) is a beta(α1, α2) random variable.
  • If X and Y are independent exponential random variables with mean μ, then X − Y is a double exponential random variable with mean 0 and scale μ.
  • If Xi are independent Bernoulli random variables then their parity (XOR) is a Bernoulli variable described by the piling-up lemma.

(See also ratio distribution.)
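
A minimal simulation sketch in Python with NumPy/SciPy (illustrative; the degrees of freedom and shape parameters are arbitrary) for two of the constructions above: Student's t from a normal and a chi-squared variable, and the beta from two unit-rate gamma variables.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
reps = 200_000

# Student's t: X / sqrt(U / nu) with X standard normal, U chi-squared(nu).
nu = 6
z = rng.standard_normal(reps)
u = rng.chisquare(nu, size=reps)
print(stats.kstest(z / np.sqrt(u / nu), stats.t(nu).cdf).pvalue)

# Beta: X1 / (X1 + X2) with X1 ~ gamma(a1, 1), X2 ~ gamma(a2, 1).
a1, a2 = 2.0, 5.0
g1 = rng.gamma(a1, size=reps)
g2 = rng.gamma(a2, size=reps)
print(stats.kstest(g1 / (g1 + g2), stats.beta(a1, a2).cdf).pvalue)
```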

Approximate (limit) relationships


Approximate or limit relationship means

  • either that the combination of an infinite number of iid random variables tends to some distribution,
  • or that the distribution of a variable approaches a different distribution in the limit as a parameter tends to some value.

Combination of iid random variables:

  • Given certain conditions, the sum (hence the average) of a sufficiently large number of iid random variables, each with finite mean and variance, will be approximately normally distributed. This is the central limit theorem (CLT).
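
A minimal simulation sketch of the CLT in Python with NumPy/SciPy (illustrative; the summand distribution, sample size, and evaluation points are arbitrary), comparing the empirical distribution of a standardized average with the standard normal CDF.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
n, reps = 50, 100_000

# Averages of n iid exponential(mean 1) variables; exponential(1) has mean 1, variance 1.
means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
z = (means - 1.0) * np.sqrt(n)               # standardized averages

# The empirical CDF of z should be close to the standard normal CDF.
for q in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(q, (z <= q).mean(), stats.norm.cdf(q))
```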

Special case of distribution parametrization:

  • X is a hypergeometric (m, N, n) random variable. If N and m are large compared to n, and p = m/N is not close to 0 or 1, then X approximately has a Binomial(n, p) distribution.
  • X is a beta-binomial random variable with parameters (n, α, β). Let p = α/(α + β) and suppose α + β is large; then X approximately has a binomial(n, p) distribution.
  • If X is a binomial (n, p) random variable and if n is large and p is small, then X approximately has a Poisson(np) distribution (see the sketch after this list).
  • If X is a negative binomial random variable with r large, P near 1, and r(1 − P) = λ, then X approximately has a Poisson distribution with mean λ.
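
A brief numerical sketch of the Poisson approximation to the binomial in Python with SciPy (illustrative; n and p are arbitrary choices with n large and p small), comparing the two probability mass functions pointwise.

```python
import numpy as np
from scipy import stats

n, p = 1_000, 0.003                       # n large, p small
k = np.arange(0, 16)

# Poisson(np) as an approximation to Binomial(n, p): pointwise pmf difference.
print(np.max(np.abs(stats.binom.pmf(k, n, p) - stats.poisson.pmf(k, n * p))))
```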

Consequences of the CLT:

  • If X is a Poisson random variable with large mean, then for integers j and k, P(j ≤ X ≤ k) approximately equals P(j − 1/2 ≤ Y ≤ k + 1/2), where Y is a normal random variable with the same mean and variance as X.
  • If X is a binomial(n, p) random variable with large np and n(1 − p), then for integers j and k, P(j ≤ X ≤ k) approximately equals P(j − 1/2 ≤ Y ≤ k + 1/2), where Y is a normal random variable with the same mean and variance as X, i.e. np and np(1 − p) (see the sketch after this list).
  • If X is a beta random variable with parameters α and β equal and large, then X approximately has a normal distribution with the same mean and variance, i.e. mean α/(α + β) and variance αβ/((α + β)²(α + β + 1)).
  • If X is a gamma(α, β) random variable and the shape parameter α is large relative to the scale parameter β, then X approximately has a normal distribution with the same mean and variance.
  • If X is a Student's t random variable with a large number of degrees of freedom ν, then X approximately has a standard normal distribution.
  • If X is an F(ν, ω) random variable with ω large, then νX is approximately distributed as a chi-squared random variable with ν degrees of freedom.
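
A brief numerical sketch in Python with SciPy (illustrative; the parameters and interval are arbitrary) of the normal approximation to the binomial with the continuity correction described above.

```python
from scipy import stats

n, p = 400, 0.3
j, k = 110, 130
mean, sd = n * p, (n * p * (1 - p)) ** 0.5

exact = stats.binom.cdf(k, n, p) - stats.binom.cdf(j - 1, n, p)      # P(j <= X <= k)
approx = stats.norm.cdf(k + 0.5, mean, sd) - stats.norm.cdf(j - 0.5, mean, sd)
print(exact, approx)                                                  # the two should be close
```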

Compound (or Bayesian) relationships


When one or more parameters of a distribution are themselves random variables, the compound distribution is the marginal distribution of the variable.

Examples:

  • If X | N is a binomial (N, p) random variable, where the parameter N is a random variable with negative-binomial (m, r) distribution, then X is distributed as a negative-binomial (m, r/(p + qr)), where q = 1 − p.
  • If X | N is a binomial (N, p) random variable, where the parameter N is a random variable with Poisson(μ) distribution, then X is distributed as a Poisson (μp).
  • If X | μ is a Poisson(μ) random variable and the parameter μ is a random variable with gamma(m, θ) distribution (where θ is the scale parameter), then X is distributed as a negative-binomial (m, θ/(1 + θ)), sometimes called the gamma–Poisson distribution.
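
A minimal simulation sketch of the gamma–Poisson mixture in Python with NumPy/SciPy (illustrative; m and θ are arbitrary). Note the parametrization assumption: SciPy's nbinom uses the number of successes n = m and success probability 1/(1 + θ), which corresponds to the (m, θ/(1 + θ)) statement above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(13)
m, theta = 3, 2.0                          # gamma shape and scale for the Poisson mean
reps = 200_000

mu = rng.gamma(m, scale=theta, size=reps)  # random Poisson mean
x = rng.poisson(mu)                        # Poisson draw for each mean

# Marginally, X should be negative binomial (SciPy convention: n=m, p=1/(1+theta)).
k = np.arange(0, 20)
empirical = np.array([(x == i).mean() for i in k])
print(np.max(np.abs(empirical - stats.nbinom.pmf(k, m, 1.0 / (1.0 + theta)))))
```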

Some distributions have been specially named as compounds: beta-binomial distribution, Beta negative binomial distribution, gamma-normal distribution.

Examples:

  • If X is a Binomial(n, p) random variable, and the parameter p is a random variable with beta(α, β) distribution, then X is distributed as a Beta-Binomial(α, β, n).
  • If X is a negative-binomial(r, p) random variable, and the parameter p is a random variable with beta(α, β) distribution, then X is distributed as a beta negative binomial(r, α, β).
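
A minimal simulation sketch of the beta-binomial compound in Python with NumPy/SciPy (illustrative; n, α and β are arbitrary). The beta-binomial pmf is written out explicitly via the beta function rather than relying on a named SciPy distribution.

```python
import numpy as np
from scipy import special

rng = np.random.default_rng(14)
n, a, b = 10, 2.0, 3.0
reps = 200_000

p = rng.beta(a, b, size=reps)              # random success probability
x = rng.binomial(n, p)                     # binomial draw for each p

# Beta-binomial pmf: C(n, k) * B(k + a, n - k + b) / B(a, b)
k = np.arange(0, n + 1)
pmf = special.comb(n, k) * special.beta(k + a, n - k + b) / special.beta(a, b)
empirical = np.array([(x == i).mean() for i in k])
print(np.max(np.abs(empirical - pmf)))     # should be small (sampling error only)
```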


References

  1. ^ Leemis, Lawrence M.; McQueston, Jacquelyn T. (February 2008). "Univariate Distribution Relationships" (PDF). American Statistician. 62 (1): 45–53. doi:10.1198/000313008x270448. S2CID 9367367.
  2. ^ Swat, MJ; Grenon, P; Wimalaratne, S (2016). "ProbOnto: ontology and knowledge base of probability distributions". Bioinformatics. 32 (17): 2719–21. doi:10.1093/bioinformatics/btw170. PMC 5013898. PMID 27153608.
  3. ^ Cook, John D. "Diagram of distribution relationships".
  4. ^ Dinov, Ivo D.; Siegrist, Kyle; Pearl, Dennis; Kalinin, Alex; Christou, Nicolas (2015). "Probability Distributome: a web computational infrastructure for exploring the properties, interrelations, and applications of probability distributions". Computational Statistics. 594 (2): 249–271. doi:10.1007/s00180-015-0594-6. PMC 4856044. PMID 27158191.