Probability Distributions

Uniform Distribution

The uniform distribution on the interval $\left[a,b\right]$ has density $f\left(x\right)=\left\{\begin{array}{ll}\frac{1}{b-a},& \text{if}\quad x\in \left[a,b\right]\\ 0,& \text{otherwise}\end{array}\right..$

Normal Distribution

A random vector $X$ of dimension $n$ is said to have a Normal distribution with mean $\mu$ and covariance matrix $\Sigma$ if $\Sigma$ is positive definite and the density function of $X$ is given by $f\left(x\right)=\frac{1}{{\left(2\pi \right)}^{n/2}{|\Sigma |}^{1/2}}{e}^{-\frac{1}{2}{\left(x-\mu \right)}^{\top }{\Sigma }^{-1}\left(x-\mu \right)}.$ In this case, we write $X\sim N\left(\mu ,\,\Sigma \right).$

Regression Property

Suppose that $X$ is a random vector with a multivariate Normal distribution and partition the vector as $X=\left({X}_{1},{X}_{2}\right)$ so that

(1) $\left(\begin{array}{c}{X}_{1}\\ {X}_{2}\end{array}\right)\sim N\left[\left(\begin{array}{c}{\mu }_{1}\\ {\mu }_{2}\end{array}\right),\left(\begin{array}{cc}{\Sigma }_{11}& {\Sigma }_{12}\\ {\Sigma }_{21}& {\Sigma }_{22}\end{array}\right)\right].$

The conditional distribution of ${X}_{1}$ given ${X}_{2}={x}_{2}$ is

(2) ${X}_{1}|{X}_{2}={x}_{2}\sim N\left({\mu }_{1}+{\Sigma }_{12}{\Sigma }_{22}^{-1}\left({x}_{2}-{\mu }_{2}\right),\,{\Sigma }_{11}-{\Sigma }_{12}{\Sigma }_{22}^{-1}{\Sigma }_{21}\right).$

Conversely, if (2) holds and the marginal distribution of ${X}_{2}$ is ${X}_{2}\sim N\left({\mu }_{2},\,{\Sigma }_{22}\right)$, then the joint distribution of $\left({X}_{1},{X}_{2}\right)$ is given by (1).
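As an illustration of (2), the following minimal sketch covers only the special case in which both blocks are scalars, so that ${\Sigma }_{22}^{-1}$ is an ordinary reciprocal. The function name and argument names are hypothetical and chosen for this example; the general vector case would need a linear algebra library.

```python
def conditional_normal_scalar(mu1, mu2, s11, s12, s22, x2):
    """Mean and variance of X1 | X2 = x2 when X1 and X2 are scalars.

    s11, s12, s22 are the entries of the covariance matrix in (1);
    s21 equals s12 by symmetry. This is an illustrative helper, not
    part of any library.
    """
    if s22 <= 0:
        raise ValueError("s22 must be positive")
    cond_mean = mu1 + s12 / s22 * (x2 - mu2)  # mu1 + S12 S22^{-1} (x2 - mu2)
    cond_var = s11 - s12 * s12 / s22          # S11 - S12 S22^{-1} S21
    return cond_mean, cond_var


mean, var = conditional_normal_scalar(
    mu1=1.0, mu2=2.0, s11=4.0, s12=1.5, s22=9.0, x2=5.0
)
print(mean, var)
```

Note that the conditional variance does not depend on the observed value ${x}_{2}$, and it is never larger than the marginal variance ${\Sigma }_{11}$.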

Bivariate Normal Distribution

Suppose the random variables ${X}_{1},{X}_{2}$ are jointly distributed according to the bivariate normal distribution. This distribution can be characterized by five parameters:

• ${\mu }_{1}$: the mean of ${X}_{1}$,
• ${\mu }_{2}$: the mean of ${X}_{2}$,
• ${\sigma }_{1}$: the standard deviation of ${X}_{1}$,
• ${\sigma }_{2}$: the standard deviation of ${X}_{2}$,
• $\rho$: the correlation of ${X}_{1}$ and ${X}_{2}$.

Their joint density is

$f\left({x}_{1},\,{x}_{2}\right)=\frac{1}{2\pi {\sigma }_{1}{\sigma }_{2}\sqrt{1-{\rho }^{2}}}\exp \left\{-\frac{1}{2\left(1-{\rho }^{2}\right)}\left[{\left(\frac{{x}_{1}-{\mu }_{1}}{{\sigma }_{1}}\right)}^{2}-2\rho \frac{{x}_{1}-{\mu }_{1}}{{\sigma }_{1}}\frac{{x}_{2}-{\mu }_{2}}{{\sigma }_{2}}+{\left(\frac{{x}_{2}-{\mu }_{2}}{{\sigma }_{2}}\right)}^{2}\right]\right\}.$
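The joint density can be transcribed directly into code. The sketch below is one possible transcription using only the Python standard library; the function name is an assumption made for this example, and the check requires $|\rho |<1$ so that the density is well defined.

```python
import math


def bivariate_normal_pdf(x1, x2, mu1, mu2, sigma1, sigma2, rho):
    """Joint density of the bivariate normal with the five parameters above."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    u = (x1 - mu1) / sigma1  # standardized first coordinate
    v = (x2 - mu2) / sigma2  # standardized second coordinate
    one_minus_rho2 = 1.0 - rho * rho
    quad_form = (u * u - 2.0 * rho * u * v + v * v) / one_minus_rho2
    norm_const = 2.0 * math.pi * sigma1 * sigma2 * math.sqrt(one_minus_rho2)
    return math.exp(-0.5 * quad_form) / norm_const


# At the mean the exponent vanishes, leaving only the normalizing constant.
peak = bivariate_normal_pdf(0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0)
print(peak)
```

When $\rho =0$ the cross term drops out and the density factors into the product of two univariate normal densities, which gives a quick sanity check.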

Sampling From the Bivariate Normal Distribution

The following algorithm can be used to sample from the bivariate normal distribution:

1. Let ${z}_{1}$ and ${z}_{2}$ be independent draws from the standard normal distribution $N\left(0,1\right)$.

2. Then, ${x}_{1}$ and ${x}_{2}$ calculated as follows will have a joint bivariate normal distribution with parameters $\left({\mu }_{1},{\mu }_{2},{\sigma }_{1},{\sigma }_{2},\rho \right)$:

$\begin{array}{rl}{x}_{1}& ={\mu }_{1}+{\sigma }_{1}{z}_{1}\\ {x}_{2}& ={\mu }_{2}+{\sigma }_{2}\left[{z}_{1}\rho +{z}_{2}\sqrt{1-{\rho }^{2}}\right]\\ \end{array}$
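The two steps above can be sketched as follows. This is a minimal illustration using Python's standard `random` module; the function names are hypothetical, and the sample correlation at the end is only a rough check that the draws have the intended dependence.

```python
import math
import random


def sample_bivariate_normal(mu1, mu2, sigma1, sigma2, rho, rng=random):
    """One draw (x1, x2) built from two independent N(0, 1) draws."""
    z1 = rng.gauss(0.0, 1.0)  # step 1: independent standard normals
    z2 = rng.gauss(0.0, 1.0)
    x1 = mu1 + sigma1 * z1    # step 2: the transformation in the text
    x2 = mu2 + sigma2 * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return x1, x2


def sample_correlation(pairs):
    """Plain sample correlation coefficient of a list of (x1, x2) pairs."""
    n = len(pairs)
    m1 = sum(p[0] for p in pairs) / n
    m2 = sum(p[1] for p in pairs) / n
    cov = sum((p[0] - m1) * (p[1] - m2) for p in pairs) / n
    var1 = sum((p[0] - m1) ** 2 for p in pairs) / n
    var2 = sum((p[1] - m2) ** 2 for p in pairs) / n
    return cov / math.sqrt(var1 * var2)


rng = random.Random(0)  # fixed seed so the run is reproducible
pairs = [sample_bivariate_normal(1.0, -1.0, 2.0, 0.5, 0.6, rng) for _ in range(20_000)]
print(sample_correlation(pairs))  # should be near rho = 0.6
```

The construction works because ${z}_{1}\rho +{z}_{2}\sqrt{1-{\rho }^{2}}$ has unit variance and covariance $\rho$ with ${z}_{1}$.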

Truncated Normal Distribution

Let $X$ be normally distributed with mean $\mu$ and variance ${\sigma }^{2}$ and consider the conditional distribution of $X$ in the interval $\left[a,\,b\right]\subset \mathbb{R}$. The distribution of $X$ conditional on $X\in \left[a,\,b\right]$ is the truncated normal distribution. The conditional density is $f\left(x|X\in \left[a,\,b\right]\right)=\frac{\frac{1}{\sigma }\varphi \left(\frac{x-\mu }{\sigma }\right)}{\Phi \left(\frac{b-\mu }{\sigma }\right)-\Phi \left(\frac{a-\mu }{\sigma }\right)}$ for $a\le x\le b$, where $\varphi$ and $\Phi$ denote, respectively, the standard normal density and CDF.

Censored Regression

In econometrics, the truncated normal distribution is used in the censored regression model (also commonly called the Tobit model). In this model, we observe an outcome ${y}_{i}$ and covariates ${x}_{i}$ according to the model

(3) $\begin{array}{rl}{y}_{i}^{*}& ={x}_{i}^{\top }\beta +{\epsilon }_{i},\\ {y}_{i}& =\left\{\begin{array}{ll}{y}_{i}^{*},& \text{if}\quad {y}_{i}^{*}>0\\ 0,& \text{if}\quad {y}_{i}^{*}\le 0\end{array}\right.,\end{array}$

where ${\epsilon }_{i}\sim N\left(0,{\sigma }^{2}\right)$. The latent variable ${y}_{i}^{*}$ is observed only when it exceeds a threshold, which is normalized to zero here.

The expected value of ${y}_{i}$ and the likelihood function can be derived using the properties of the truncated normal distribution.
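For example, combining $P\left({y}_{i}^{*}>0\right)=\Phi \left({x}_{i}^{\top }\beta /\sigma \right)$ with the truncated normal mean derived below gives $E\left[{y}_{i}|{x}_{i}\right]=\Phi \left({x}_{i}^{\top }\beta /\sigma \right){x}_{i}^{\top }\beta +\sigma \varphi \left({x}_{i}^{\top }\beta /\sigma \right)$. The sketch below compares this expression with a simulation of model (3). It is an illustration only: the function names are hypothetical, and the argument `index` stands for the scalar ${x}_{i}^{\top }\beta$.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: pdf is phi, cdf is Phi


def tobit_expected_outcome(index, sigma):
    """E[y | x] under censoring at zero, where index = x' beta."""
    z = index / sigma
    return STD_NORMAL.cdf(z) * index + sigma * STD_NORMAL.pdf(z)


def simulate_tobit_mean(index, sigma, n_draws, seed=0):
    """Average of max(y*, 0) over simulated latent outcomes y* = index + eps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        latent = index + rng.gauss(0.0, sigma)
        total += max(latent, 0.0)  # censor the latent outcome at zero
    return total / n_draws


analytic = tobit_expected_outcome(index=0.5, sigma=1.0)
simulated = simulate_tobit_mean(index=0.5, sigma=1.0, n_draws=200_000)
print(analytic, simulated)
```

The expected outcome is always above both zero and the uncensored mean ${x}_{i}^{\top }\beta$, which is why ordinary least squares on censored data is biased.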

Derivation of the Mean

$\begin{array}{rl}E\left[X|X\in \left[a,\,b\right]\right]& =\mu +{\int }_{a}^{b}\left(x-\mu \right)\frac{\frac{1}{\sigma }\varphi \left(\frac{x-\mu }{\sigma }\right)}{\Phi \left(\frac{b-\mu }{\sigma }\right)-\Phi \left(\frac{a-\mu }{\sigma }\right)}\,dx\\ & =\mu +\frac{1}{\Phi \left(\frac{b-\mu }{\sigma }\right)-\Phi \left(\frac{a-\mu }{\sigma }\right)}{\int }_{a}^{b}\frac{x-\mu }{\sigma }\frac{1}{\sqrt{2\pi }}{e}^{-\frac{1}{2}{\left(\frac{x-\mu }{\sigma }\right)}^{2}}\,dx\\ & =\mu +\frac{1}{\Phi \left(\frac{b-\mu }{\sigma }\right)-\Phi \left(\frac{a-\mu }{\sigma }\right)}{\int }_{a}^{b}\sigma \frac{\partial }{\partial x}\left[-\frac{1}{\sqrt{2\pi }}{e}^{-\frac{1}{2}{\left(\frac{x-\mu }{\sigma }\right)}^{2}}\right]\,dx\\ & =\mu +\frac{1}{\Phi \left(\frac{b-\mu }{\sigma }\right)-\Phi \left(\frac{a-\mu }{\sigma }\right)}\sigma {\left[-\frac{1}{\sqrt{2\pi }}{e}^{-\frac{1}{2}{\left(\frac{x-\mu }{\sigma }\right)}^{2}}\right]}_{a}^{b}\\ & =\mu -\sigma \frac{\varphi \left(\frac{b-\mu }{\sigma }\right)-\varphi \left(\frac{a-\mu }{\sigma }\right)}{\Phi \left(\frac{b-\mu }{\sigma }\right)-\Phi \left(\frac{a-\mu }{\sigma }\right)}.\end{array}$
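The closed form in the last line can be checked numerically by integrating $x$ against the truncated density. The sketch below does this with a simple midpoint rule and the standard library's `statistics.NormalDist`; the helper names and the parameter values are assumptions made for the example.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: pdf is phi, cdf is Phi


def truncated_normal_pdf(x, mu, sigma, a, b):
    """Density of X ~ N(mu, sigma^2) conditional on a <= X <= b."""
    if x < a or x > b:
        return 0.0
    mass = STD_NORMAL.cdf((b - mu) / sigma) - STD_NORMAL.cdf((a - mu) / sigma)
    return STD_NORMAL.pdf((x - mu) / sigma) / (sigma * mass)


def truncated_normal_mean(mu, sigma, a, b):
    """Closed-form mean from the last line of the derivation."""
    alpha = (a - mu) / sigma
    beta = (b - mu) / sigma
    numerator = STD_NORMAL.pdf(beta) - STD_NORMAL.pdf(alpha)
    denominator = STD_NORMAL.cdf(beta) - STD_NORMAL.cdf(alpha)
    return mu - sigma * numerator / denominator


def midpoint_integral(func, lower, upper, steps=20_000):
    """Midpoint rule; adequate for a smooth integrand on a finite interval."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (k + 0.5) * width) for k in range(steps))


mu, sigma, a, b = 1.0, 2.0, 0.0, 3.0
numeric_mean = midpoint_integral(
    lambda x: x * truncated_normal_pdf(x, mu, sigma, a, b), a, b
)
closed_form_mean = truncated_normal_mean(mu, sigma, a, b)
print(numeric_mean, closed_form_mean)
```

When the interval is symmetric around $\mu$, the two density terms cancel and the truncated mean equals $\mu$, as expected.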

Special Cases

• Left truncation of the standard normal: $\left[a,\,b\right]=\left[a,\,\infty \right)$, $\mu =0$, $\sigma =1$.

Since $\varphi \left(b\right)\to 0$ and $\Phi \left(b\right)\to 1$ as $b\to \infty$, the general formula gives $E\left[X|X>a\right]=\frac{\varphi \left(a\right)}{1-\Phi \left(a\right)}.$ This expression is called the inverse Mills ratio and is denoted $\lambda \left(a\right)\equiv \frac{\varphi \left(a\right)}{1-\Phi \left(a\right)}.$ It is the hazard function of the standard Normal distribution.
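A minimal sketch of the inverse Mills ratio using only the standard `math` module is given below. The function name is hypothetical. The upper tail $1-\Phi \left(a\right)$ is computed with the complementary error function, an implementation choice made here so that the ratio stays accurate when $a$ is large and the tail probability is tiny.

```python
import math


def inverse_mills_ratio(a):
    """lambda(a) = phi(a) / (1 - Phi(a)) for the standard normal."""
    density = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)  # phi(a)
    upper_tail = 0.5 * math.erfc(a / math.sqrt(2.0))             # 1 - Phi(a)
    return density / upper_tail


for a in (-1.0, 0.0, 1.0, 2.0):
    print(a, inverse_mills_ratio(a))
```

Because $\lambda \left(a\right)=E\left[X|X>a\right]$, the ratio always exceeds $a$, and it increases with $a$.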

Exponential Distribution

The probability density function (pdf) of an exponential distribution with rate parameter $\lambda$ is

$f\left(x;\lambda \right)=\left\{\begin{array}{ll}\lambda {e}^{-\lambda x},& x\ge 0,\\ 0,& x<0.\end{array}\right.$

The cumulative distribution function (cdf) is

$F\left(x;\lambda \right)=\left\{\begin{array}{ll}1-{e}^{-\lambda x},& x\ge 0,\\ 0,& x<0.\end{array}\right.$

The distribution has support $\left[0,\infty \right)$. If a random variable $X$ has this distribution, we write $X\sim Expo\left(\lambda \right).$

Alternative parameterization

Sometimes the exponential distribution is parameterized by $\beta =\frac{1}{\lambda }$ where $\beta$ is the mean of the $Expo\left(\lambda \right)$ distribution. Thus, to avoid ambiguity it is important to specify whether the parameter denotes the rate or mean.
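To make the two parameterizations concrete, the sketch below draws from the exponential distribution by inverting the cdf, once with the rate $\lambda$ and once with the mean $\beta =1/\lambda$. The function names are hypothetical and only the Python standard library is used.

```python
import math
import random


def sample_exponential_rate(rate, rng=random):
    """Inverse-CDF draw from Expo(rate): solve u = 1 - exp(-rate * x) for x."""
    u = rng.random()  # uniform on [0, 1), so 1 - u lies in (0, 1]
    return -math.log(1.0 - u) / rate


def sample_exponential_mean(mean, rng=random):
    """Same distribution, parameterized by the mean beta = 1 / rate."""
    return sample_exponential_rate(1.0 / mean, rng)


rng = random.Random(0)  # fixed seed so the run is reproducible
rate_draws = [sample_exponential_rate(2.0, rng) for _ in range(100_000)]
mean_draws = [sample_exponential_mean(0.5, rng) for _ in range(100_000)]
rate_sample_mean = sum(rate_draws) / len(rate_draws)
mean_sample_mean = sum(mean_draws) / len(mean_draws)
print(rate_sample_mean, mean_sample_mean)  # both should be near 0.5
```

Library conventions differ on this point. For instance, Python's built-in `random.expovariate` takes the rate, not the mean, as its argument.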

Weibull Distribution

The Weibull distribution (Weibull, 1951) has cdf $F\left(x\right)=1-{e}^{-\lambda {x}^{\alpha }}$ for $x\ge 0$ and pdf $f\left(x\right)=\left\{\begin{array}{ll}\lambda \alpha {x}^{\alpha -1}{e}^{-\lambda {x}^{\alpha }},& x\ge 0,\\ 0,& x<0.\end{array}\right.$ Here, $\lambda >0$ is a scale parameter and $\alpha >0$ is a shape parameter.

Special Cases

For certain values of $\alpha$, the Weibull distribution reduces to, or approximates, other common distributions:

• When $\alpha =1$, it is equivalent to the Exponential distribution with rate parameter $\lambda$.
• When $\alpha =2$, it is equivalent to the Rayleigh distribution with scale parameter $\sigma =1/\sqrt{2\lambda }$.
• When $\alpha =3.4$, it is approximately the univariate normal distribution.
• As $\alpha \to \infty$, it converges to a point mass (Dirac delta) at $x=1$.
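Since the cdf $F\left(x\right)=1-{e}^{-\lambda {x}^{\alpha }}$ can be inverted in closed form, Weibull draws can be generated from uniform draws. The sketch below uses this document's $\left(\lambda ,\alpha \right)$ parameterization; the function names are hypothetical, and the mean ${\lambda }^{-1/\alpha }\Gamma \left(1+1/\alpha \right)$ is used only as a rough check on the simulated draws.

```python
import math
import random


def weibull_cdf(x, lam, alpha):
    """F(x) = 1 - exp(-lam * x**alpha) for x >= 0, and 0 otherwise."""
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-lam * x ** alpha)


def sample_weibull(lam, alpha, rng=random):
    """Inverse-CDF draw: solve u = F(x) for x, with u uniform on [0, 1)."""
    u = rng.random()
    return (-math.log(1.0 - u) / lam) ** (1.0 / alpha)


rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [sample_weibull(lam=1.0, alpha=2.0, rng=rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)
# Mean is lam**(-1/alpha) * Gamma(1 + 1/alpha); here lam = 1 and alpha = 2.
theoretical_mean = math.gamma(1.0 + 1.0 / 2.0)
print(sample_mean, theoretical_mean)
```

Setting `alpha=1.0` turns `weibull_cdf` into the exponential cdf with rate `lam`, matching the first special case above. Other libraries may parameterize the distribution differently: Python's `random.weibullvariate`, for example, takes a scale and a shape argument, and that scale corresponds to ${\lambda }^{-1/\alpha }$ in the notation used here.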

References

• Cameron, A. C. and P. K. Trivedi (2005). Microeconometrics: Methods and Applications. New York: Cambridge University Press.

• Weibull, W. (1951). A statistical distribution function of wide applicability. J. Appl. Mech.-Trans. ASME 18, 293–297.