Probability Distributions

Uniform Distribution

The uniform distribution on the interval $[a, b]$ has density
\[
f(x) = \begin{cases} \dfrac{1}{b-a}, & \text{if } x \in [a,b], \\ 0, & \text{otherwise.} \end{cases}
\]
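A minimal sketch of this density in Python; the function name `uniform_pdf` is illustrative, and the check that `a < b` is an added assumption.

```python
def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of the uniform distribution on [a, b]; assumes a < b."""
    if a >= b:
        raise ValueError("need a < b")
    # Constant height 1/(b - a) on the interval, zero elsewhere.
    return 1.0 / (b - a) if a <= x <= b else 0.0


inside = uniform_pdf(1.0, 0.0, 4.0)   # 0.25
outside = uniform_pdf(5.0, 0.0, 4.0)  # 0.0
```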

Normal Distribution

A random vector $X$ of dimension $n$ is said to have a Normal distribution with mean $\mu$ and covariance matrix $\Sigma$ if $\Sigma$ is positive definite and its density function is given by
\[
f(x) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \, e^{-\frac{1}{2}(x-\mu)^\top \Sigma^{-1} (x-\mu)}.
\]
In this case, we write $X \sim N(\mu, \Sigma)$.
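As a quick consistency check, consider the case $n = 1$. Writing $\Sigma = \sigma^2$, we have $|\Sigma|^{1/2} = \sigma$ and $\Sigma^{-1} = 1/\sigma^2$, so the multivariate density reduces to the familiar univariate Normal density:
\[
f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\}.
\]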

Regression Property

Suppose that $X$ is a random vector with a multivariate Normal distribution and partition the vector as $X = (X_1, X_2)$ so that
\[
\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\left[ \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \right]. \tag{1}
\]
The conditional distribution of $X_1$ given $X_2 = x_2$ is
\[
X_1 \mid X_2 = x_2 \sim N\left( \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2),\; \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} \right). \tag{2}
\]

Conversely, whenever (2) holds and the marginal distribution of $X_2$ satisfies $X_2 \sim N(\mu_2, \Sigma_{22})$, the joint distribution of $(X_1, X_2)$ is given by (1).
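When both blocks are scalars, the conditional mean and variance can be computed directly. The following is a minimal sketch under that assumption, so that multiplying by the inverse of $\Sigma_{22}$ is ordinary division; the function name `conditional_normal` and the numbers in the usage lines are illustrative only.

```python
def conditional_normal(mu1, mu2, s11, s12, s22, x2):
    """Mean and variance of X1 | X2 = x2 when X1 and X2 are scalars.

    s11, s12, s22 are the entries of the 2x2 covariance matrix (s21 == s12).
    """
    if s22 <= 0 or s11 * s22 - s12 * s12 <= 0:
        raise ValueError("covariance matrix must be positive definite")
    slope = s12 / s22                     # Sigma_12 * Sigma_22^{-1}
    cond_mean = mu1 + slope * (x2 - mu2)  # shifts linearly in x2
    cond_var = s11 - slope * s12          # does not depend on x2
    return cond_mean, cond_var


mean, var = conditional_normal(mu1=1.0, mu2=2.0, s11=4.0, s12=1.5, s22=1.0, x2=3.0)
# mean == 2.5, var == 1.75
```

The conditional mean is linear in $x_2$, which is why this result is referred to as the regression property, and the conditional variance is never larger than the marginal variance $\Sigma_{11}$.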

Bivariate Normal Distribution

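Suppose the random variables $X_1$ and $X_2$ are jointly distributed according to the bivariate normal distribution. This distribution can be characterized by five parameters: the means $\mu_1$ and $\mu_2$, the standard deviations $\sigma_1 > 0$ and $\sigma_2 > 0$, and the correlation coefficient $\rho \in (-1, 1)$.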

Their joint density is

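\[
f(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \left(\frac{x_1-\mu_1}{\sigma_1}\right)^2 - 2\rho\,\frac{x_1-\mu_1}{\sigma_1}\,\frac{x_2-\mu_2}{\sigma_2} + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^2 \right] \right\}.
\]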
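The joint density can be evaluated with a few lines of Python. This is a sketch only: the function name `bivariate_normal_pdf` and the parameter values in the usage lines are illustrative. With zero correlation the density should factor into the product of the two marginal Normal densities, which the usage lines compute for comparison.

```python
import math
from statistics import NormalDist


def bivariate_normal_pdf(x1, x2, mu1, mu2, sigma1, sigma2, rho):
    """Bivariate normal density with parameters (mu1, mu2, sigma1, sigma2, rho)."""
    if sigma1 <= 0 or sigma2 <= 0 or not -1.0 < rho < 1.0:
        raise ValueError("need sigma1, sigma2 > 0 and -1 < rho < 1")
    u = (x1 - mu1) / sigma1  # standardized first coordinate
    v = (x2 - mu2) / sigma2  # standardized second coordinate
    quad = (u * u - 2.0 * rho * u * v + v * v) / (1.0 - rho**2)
    norm_const = 2.0 * math.pi * sigma1 * sigma2 * math.sqrt(1.0 - rho**2)
    return math.exp(-0.5 * quad) / norm_const


# With rho = 0 the joint density equals the product of the marginals.
joint = bivariate_normal_pdf(0.3, -1.0, 0.0, 1.0, 2.0, 0.5, 0.0)
product = NormalDist(0.0, 2.0).pdf(0.3) * NormalDist(1.0, 0.5).pdf(-1.0)
```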

Sampling From the Bivariate Normal Distribution

The following algorithm can be used to sample from the bivariate normal distribution:

  1. Let $z_1$ and $z_2$ be independent draws from the standard normal distribution $N(0,1)$.

  2. Then, $x_1$ and $x_2$ calculated as follows will have a joint bivariate normal distribution with parameters $(\mu_1, \mu_2, \sigma_1, \sigma_2, \rho)$:

\begin{align*}
x_1 &= \mu_1 + \sigma_1 z_1, \\
x_2 &= \mu_2 + \sigma_2 \left[ z_1 \rho + z_2 \sqrt{1-\rho^2} \right].
\end{align*}
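The two steps above can be sketched in Python using only the standard library. The function names, the fixed seed, and the parameter values are illustrative assumptions; the sample correlation is computed only as a rough check that the draws behave as expected.

```python
import math
import random


def sample_bivariate_normal(mu1, mu2, sigma1, sigma2, rho, rng):
    """Return one draw (x1, x2) from the bivariate normal distribution."""
    # Step 1: two independent standard normal draws.
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    # Step 2: scale, shift, and mix the draws.
    x1 = mu1 + sigma1 * z1
    x2 = mu2 + sigma2 * (rho * z1 + math.sqrt(1.0 - rho**2) * z2)
    return x1, x2


def sample_correlation(pairs):
    """Pearson correlation of a list of (x1, x2) pairs."""
    n = len(pairs)
    m1 = sum(p[0] for p in pairs) / n
    m2 = sum(p[1] for p in pairs) / n
    cov = sum((p[0] - m1) * (p[1] - m2) for p in pairs)
    v1 = sum((p[0] - m1) ** 2 for p in pairs)
    v2 = sum((p[1] - m2) ** 2 for p in pairs)
    return cov / math.sqrt(v1 * v2)


rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [sample_bivariate_normal(1.0, -2.0, 2.0, 0.5, 0.8, rng) for _ in range(50_000)]
rho_hat = sample_correlation(draws)  # should be close to 0.8
```

The construction works because $z_1\rho + z_2\sqrt{1-\rho^2}$ has variance $\rho^2 + (1 - \rho^2) = 1$ and covariance $\rho$ with $z_1$.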

Truncated Normal Distribution

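Let $X$ be normally distributed with mean $\mu$ and variance $\sigma^2$ and consider the conditional distribution of $X$ in the interval $[a, b]$. The distribution of $X$ conditional on $X \in [a,b]$ is the truncated normal distribution. The conditional density is
\[
f(x \mid x \in [a,b]) = \frac{\frac{1}{\sigma}\phi\left(\frac{x-\mu}{\sigma}\right)}{\Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)}
\]
for $a \le x \le b$, where $\phi$ and $\Phi$ denote respectively the standard normal density and CDF.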
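A short sketch of the truncated density, using `statistics.NormalDist` from the Python standard library for $\phi$ and $\Phi$. The function names and the parameter values are illustrative; the midpoint-rule integral is included only to check that the density integrates to approximately one over $[a, b]$.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: pdf is phi, cdf is Phi


def truncated_normal_pdf(x, mu, sigma, a, b):
    """Density of X ~ N(mu, sigma^2) conditional on a <= X <= b."""
    if sigma <= 0 or a >= b:
        raise ValueError("need sigma > 0 and a < b")
    if not a <= x <= b:
        return 0.0
    # Probability mass that the untruncated distribution puts on [a, b].
    mass = STD_NORMAL.cdf((b - mu) / sigma) - STD_NORMAL.cdf((a - mu) / sigma)
    return STD_NORMAL.pdf((x - mu) / sigma) / (sigma * mass)


def midpoint_integral(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))


total = midpoint_integral(lambda x: truncated_normal_pdf(x, 1.0, 2.0, 0.0, 3.0), 0.0, 3.0)
```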

Censored Regression

In econometrics, the truncated normal distribution is used in the censored regression model (also commonly called the Tobit model). In this model, we observe an outcome $y_i$ and covariates $x_i$ according to
\[
y_i^* = x_i^\top \beta + \varepsilon_i, \qquad
y_i = \begin{cases} y_i^*, & \text{if } y_i^* > 0, \\ 0, & \text{if } y_i^* \le 0, \end{cases}
\]
where $\varepsilon_i \sim N(0, \sigma^2)$. Here $y_i^*$ is a latent variable which is only observed when it is above a certain threshold (normalized to zero here).

The expected value of y i and the likelihood function can be derived using the properties of the truncated normal distribution.
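As an illustration, the sketch below codes the standard textbook expressions for this model, which are stated here as assumptions rather than derived in the text: $E[y_i \mid x_i] = \Phi(m_i/\sigma)\, m_i + \sigma\, \phi(m_i/\sigma)$ with $m_i = x_i^\top \beta$, and a log-likelihood contribution of $\log[\sigma^{-1}\phi((y_i - m_i)/\sigma)]$ for uncensored observations and $\log \Phi(-m_i/\sigma)$ for censored ones. The function names, the scalar argument `index` (standing in for $m_i$), the seed, and the parameter values are all hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def tobit_expected_outcome(index, sigma):
    """E[y | x] in the censored model, where index = x' beta."""
    z = index / sigma
    return STD_NORMAL.cdf(z) * index + sigma * STD_NORMAL.pdf(z)


def tobit_loglik_contribution(y, index, sigma):
    """Log-likelihood contribution of a single observation (y, x)."""
    if y > 0:
        # Uncensored: density of the latent variable at the observed value.
        return math.log(STD_NORMAL.pdf((y - index) / sigma) / sigma)
    # Censored at zero: probability that the latent variable is <= 0.
    return math.log(STD_NORMAL.cdf(-index / sigma))


# Simulation check of the expected value for one covariate value.
rng = random.Random(1)
index, sigma = 0.5, 1.5
simulated = [max(0.0, index + sigma * rng.gauss(0.0, 1.0)) for _ in range(200_000)]
simulated_mean = sum(simulated) / len(simulated)
formula_mean = tobit_expected_outcome(index, sigma)
```

The direct logarithms can underflow for observations far in the tails, so a production implementation would likely work with log-densities throughout.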

Derivation of the Mean

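\begin{align*}
E[X \mid X \in [a,b]]
&= \mu + \int_a^b (x-\mu)\, \frac{\frac{1}{\sigma}\phi\left(\frac{x-\mu}{\sigma}\right)}{\Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)} \, dx \\
&= \mu + \frac{1}{\Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)} \int_a^b \frac{x-\mu}{\sigma}\, \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \, dx \\
&= \mu + \frac{1}{\Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)} \int_a^b -\sigma\, \frac{\partial}{\partial x} \left[ \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \right] dx \\
&= \mu - \frac{1}{\Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)}\, \sigma \left[ \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \right]_a^b \\
&= \mu - \sigma\, \frac{\phi\left(\frac{b-\mu}{\sigma}\right) - \phi\left(\frac{a-\mu}{\sigma}\right)}{\Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)}.
\end{align*}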
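The closed form at the end of the derivation, $\mu - \sigma\,[\phi(\beta) - \phi(\alpha)]/[\Phi(\beta) - \Phi(\alpha)]$ with $\alpha = (a-\mu)/\sigma$ and $\beta = (b-\mu)/\sigma$, can be checked numerically. The sketch below compares it with a midpoint-rule approximation of the conditional mean; the function names and parameter values are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def truncated_normal_mean(mu, sigma, a, b):
    """Closed-form E[X | a <= X <= b] for X ~ N(mu, sigma^2)."""
    alpha = (a - mu) / sigma
    beta = (b - mu) / sigma
    numerator = STD_NORMAL.pdf(beta) - STD_NORMAL.pdf(alpha)
    denominator = STD_NORMAL.cdf(beta) - STD_NORMAL.cdf(alpha)
    return mu - sigma * numerator / denominator


def numerical_truncated_mean(mu, sigma, a, b, steps=20_000):
    """Midpoint-rule estimate of the same conditional mean, for comparison."""
    dist = NormalDist(mu, sigma)
    h = (b - a) / steps
    xs = [a + (k + 0.5) * h for k in range(steps)]
    weights = [dist.pdf(x) for x in xs]
    # The ratio of the two sums approximates the integral of x f(x) over that of f(x).
    return sum(x * w for x, w in zip(xs, weights)) / sum(weights)


closed_form = truncated_normal_mean(1.0, 2.0, 0.0, 5.0)
numerical = numerical_truncated_mean(1.0, 2.0, 0.0, 5.0)
```

When the interval is symmetric around $\mu$, the numerator vanishes and the truncated mean equals $\mu$, as expected.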

Special Cases

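For the standard Normal distribution ($\mu = 0$, $\sigma = 1$) truncated from below only ($b \to \infty$), the mean reduces to
\[
E[X \mid X > a] = \frac{\phi(a)}{1 - \Phi(a)}.
\]
This expression is called the inverse Mills ratio and is denoted $\lambda(a) \equiv \frac{\phi(a)}{1 - \Phi(a)}$. It is the hazard function of the standard Normal distribution.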
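A minimal sketch of the inverse Mills ratio in Python. It computes $1 - \Phi(a)$ as $\tfrac{1}{2}\operatorname{erfc}(a/\sqrt{2})$, which avoids the cancellation that subtracting $\Phi(a)$ from one would cause for large $a$. The helper `mean_above` applies the general one-sided result $E[X \mid X > a] = \mu + \sigma\,\lambda((a-\mu)/\sigma)$, obtained by letting $b \to \infty$ in the truncated mean. All function names are illustrative.

```python
import math


def std_normal_pdf(a):
    """Standard normal density phi(a)."""
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)


def std_normal_sf(a):
    """Upper tail probability 1 - Phi(a), computed via erfc."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))


def inverse_mills_ratio(a):
    """lambda(a) = phi(a) / (1 - Phi(a)); fails once the tail underflows to zero."""
    return std_normal_pdf(a) / std_normal_sf(a)


def mean_above(mu, sigma, a):
    """E[X | X > a] for X ~ N(mu, sigma^2)."""
    return mu + sigma * inverse_mills_ratio((a - mu) / sigma)


at_zero = inverse_mills_ratio(0.0)  # equals sqrt(2 / pi)
```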

Exponential Distribution

The probability density function (pdf) of an exponential distribution with rate parameter λ is

\[
f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0, \\ 0, & x < 0. \end{cases}
\]

The cumulative distribution function (cdf) is

\[
F(x; \lambda) = \begin{cases} 1 - e^{-\lambda x}, & x \ge 0, \\ 0, & x < 0. \end{cases}
\]

The distribution has support $[0, \infty)$. If a random variable $X$ has this distribution, we write $X \sim \operatorname{Expo}(\lambda)$.
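The pdf and cdf translate directly into code. In this sketch the function names and the value of `lam` are illustrative; the median $\ln(2)/\lambda$ is used as a simple check, since it solves $F(x; \lambda) = 1/2$.

```python
import math


def expo_pdf(x, lam):
    """Exponential density with rate lam."""
    if lam <= 0:
        raise ValueError("rate must be positive")
    return lam * math.exp(-lam * x) if x >= 0 else 0.0


def expo_cdf(x, lam):
    """Exponential cdf with rate lam."""
    if lam <= 0:
        raise ValueError("rate must be positive")
    # -expm1(-t) evaluates 1 - exp(-t) accurately even for small t.
    return -math.expm1(-lam * x) if x >= 0 else 0.0


lam = 2.0
median = math.log(2.0) / lam          # solves F(x) = 1/2
cdf_at_median = expo_cdf(median, lam)
```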

Alternative parameterization

Sometimes the exponential distribution is parameterized by $\beta = 1/\lambda$, where $\beta$ is the mean of the $\operatorname{Expo}(\lambda)$ distribution. Thus, to avoid ambiguity, it is important to specify whether the parameter denotes the rate or the mean.
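Software libraries differ on this point. For example, Python's standard-library `random.expovariate` takes the rate $\lambda$, whereas NumPy's `numpy.random.exponential` takes the scale $\beta = 1/\lambda$. The sketch below draws from a distribution specified by its mean and converts to the rate explicitly; the wrapper name `draw_expo_given_mean`, the seed, and the value of `beta` are illustrative.

```python
import random


def draw_expo_given_mean(beta, rng):
    """Draw from the exponential distribution with MEAN beta."""
    if beta <= 0:
        raise ValueError("mean must be positive")
    # random.expovariate expects the RATE, so convert: lambda = 1 / beta.
    return rng.expovariate(1.0 / beta)


rng = random.Random(42)  # fixed seed so the run is reproducible
beta = 4.0
draws = [draw_expo_given_mean(beta, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)  # should be close to beta
```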

Weibull Distribution

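The Weibull distribution (Weibull, 1951) has cdf $F(x) = 1 - e^{-\lambda x^{\alpha}}$ for $x \ge 0$ and pdf
\[
f(x) = \begin{cases} \lambda \alpha x^{\alpha-1} e^{-\lambda x^{\alpha}}, & x \ge 0, \\ 0, & x < 0. \end{cases}
\]
Here, $\lambda > 0$ is a scale parameter and $\alpha > 0$ is a shape parameter.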

Special Cases

For certain values of $\alpha$, the Weibull distribution reduces to other common distributions. With $\alpha = 1$, it reduces to the exponential distribution with rate parameter $\lambda$.
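The reduction to the exponential case can be checked numerically. In this sketch the density is taken to be the derivative of the cdf $1 - e^{-\lambda x^{\alpha}}$, namely $\lambda\alpha x^{\alpha-1} e^{-\lambda x^{\alpha}}$. The function names and parameter values are illustrative, and returning infinity at $x = 0$ when $\alpha < 1$ is a convention chosen here.

```python
import math


def weibull_cdf(x, lam, alpha):
    """Weibull cdf F(x) = 1 - exp(-lam * x**alpha) for x >= 0."""
    if lam <= 0 or alpha <= 0:
        raise ValueError("need lam > 0 and alpha > 0")
    return -math.expm1(-lam * x**alpha) if x >= 0 else 0.0


def weibull_pdf(x, lam, alpha):
    """Weibull density, the derivative of weibull_cdf with respect to x."""
    if lam <= 0 or alpha <= 0:
        raise ValueError("need lam > 0 and alpha > 0")
    if x < 0:
        return 0.0
    if x == 0 and alpha < 1:
        return math.inf  # the density is unbounded at the origin
    return lam * alpha * x ** (alpha - 1.0) * math.exp(-lam * x**alpha)


# With alpha = 1 the Weibull density matches the exponential density.
weibull_value = weibull_pdf(0.7, 2.0, 1.0)
exponential_value = 2.0 * math.exp(-2.0 * 0.7)
```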