# Probability Distributions

## Uniform Distribution

The uniform distribution on the interval $[a,b]$, with $a<b$, has density $f(x)=\left\{\begin{array}{ll}\frac{1}{b-a},& \text{if}\quad x\in [a,b],\\ 0,& \text{otherwise}.\end{array}\right.$

## Normal Distribution

A random vector $X$ of dimension $n$ is said to have a Normal distribution with mean $\mu$ and covariance matrix $\Sigma$ if $\Sigma$ is positive definite and the density function of $X$ is given by $f(x)=\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}}e^{-\frac{1}{2}(x-\mu)^{\top}\Sigma^{-1}(x-\mu)}.$ In this case, we write $X\sim N(\mu,\,\Sigma).$

### Regression Property

Suppose that $X$ is a random vector with a multivariate Normal distribution and partition the vector as $X=\left({X}_{1},{X}_{2}\right)$ so that

(1) $\left(\begin{array}{c}X_{1}\\ X_{2}\end{array}\right)\sim N\left[\left(\begin{array}{c}\mu_{1}\\ \mu_{2}\end{array}\right),\left(\begin{array}{cc}\Sigma_{11}& \Sigma_{12}\\ \Sigma_{21}& \Sigma_{22}\end{array}\right)\right].$

The conditional distribution of ${X}_{1}$ given ${X}_{2}={x}_{2}$ is

(2) $X_{1}\mid X_{2}=x_{2}\sim N\left(\mu_{1}+\Sigma_{12}\Sigma_{22}^{-1}(x_{2}-\mu_{2}),\,\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\right).$

Conversely, if (2) holds and the marginal distribution of $X_{2}$ is $X_{2}\sim N(\mu_{2},\,\Sigma_{22})$, then the joint distribution of $(X_{1},X_{2})$ is given by (1).
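As a minimal sketch of (2), the snippet below computes the conditional mean and variance in the simplest case, where $X_{1}$ and $X_{2}$ are both scalars so that every block of $\Sigma$ is a single number. The function name `conditional_normal`, its argument names, and the numerical values are illustrative assumptions, not part of the original text.

```python
def conditional_normal(mu1, mu2, s11, s12, s22, x2):
    """Mean and variance of X1 | X2 = x2 when X1 and X2 are scalar blocks.

    Assumes s22 > 0 and that [[s11, s12], [s12, s22]] is a valid covariance matrix.
    """
    slope = s12 / s22                      # Sigma_12 * Sigma_22^{-1}
    cond_mean = mu1 + slope * (x2 - mu2)   # linear (regression) function of x2
    cond_var = s11 - slope * s12           # Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_21
    return cond_mean, cond_var


# Illustrative numbers only
mean, var = conditional_normal(mu1=1.0, mu2=2.0, s11=4.0, s12=1.5, s22=9.0, x2=5.0)
print(mean, var)  # approximately 1.5 and 3.75
```

The conditional mean is linear in $x_{2}$, which is why this result is called the regression property, and the conditional variance does not depend on $x_{2}$.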

## Bivariate Normal Distribution

Suppose the random variables ${X}_{1},{X}_{2}$ are jointly distributed according to the bivariate normal distribution. This distribution can be characterized by five parameters:

• ${\mu }_{1}$: the mean of ${X}_{1}$,
• ${\mu }_{2}$: the mean of ${X}_{2}$,
• ${\sigma }_{1}$: the standard deviation of ${X}_{1}$,
• ${\sigma }_{2}$: the standard deviation of ${X}_{2}$,
• $\rho$: the correlation of ${X}_{1}$ and ${X}_{2}$.

Their joint density is

$f(x_{1},\,x_{2})=\frac{1}{2\pi \sigma_{1}\sigma_{2}\sqrt{1-\rho^{2}}}\exp\left\{-\frac{1}{2(1-\rho^{2})}\left[\left(\frac{x_{1}-\mu_{1}}{\sigma_{1}}\right)^{2}-2\rho \frac{x_{1}-\mu_{1}}{\sigma_{1}}\frac{x_{2}-\mu_{2}}{\sigma_{2}}+\left(\frac{x_{2}-\mu_{2}}{\sigma_{2}}\right)^{2}\right]\right\}.$
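The joint density can be evaluated directly from the five parameters. The sketch below does so and, as a sanity check, compares the result with the product of the marginal density of $X_{2}$ and the conditional density of $X_{1}$ given $X_{2}=x_{2}$ implied by the regression property. The function name `bivariate_normal_pdf` and the parameter values are illustrative assumptions.

```python
import math
from statistics import NormalDist


def bivariate_normal_pdf(x1, x2, mu1, mu2, sigma1, sigma2, rho):
    """Bivariate normal density; assumes sigma1, sigma2 > 0 and |rho| < 1."""
    u = (x1 - mu1) / sigma1
    v = (x2 - mu2) / sigma2
    quad = (u * u - 2.0 * rho * u * v + v * v) / (1.0 - rho**2)
    norm = 2.0 * math.pi * sigma1 * sigma2 * math.sqrt(1.0 - rho**2)
    return math.exp(-0.5 * quad) / norm


# Illustrative parameter values and evaluation point
mu1, mu2, sigma1, sigma2, rho = 1.0, -2.0, 2.0, 0.5, 0.6
x1, x2 = 0.3, -1.7

joint = bivariate_normal_pdf(x1, x2, mu1, mu2, sigma1, sigma2, rho)

# Factorization check: f(x1, x2) = f(x2) * f(x1 | x2)
marginal = NormalDist(mu2, sigma2).pdf(x2)
cond_mean = mu1 + rho * sigma1 / sigma2 * (x2 - mu2)
cond_sd = sigma1 * math.sqrt(1.0 - rho**2)
conditional = NormalDist(cond_mean, cond_sd).pdf(x1)
print(joint, marginal * conditional)  # the two values should agree up to rounding
```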

### Sampling From the Bivariate Normal Distribution

The following algorithm can be used to sample from the bivariate normal distribution:

1. Let ${z}_{1}$ and ${z}_{2}$ be independent draws from the standard normal distribution $N\left(0,1\right)$.

2. Then, ${x}_{1}$ and ${x}_{2}$ calculated as follows will have a joint bivariate normal distribution with parameters $\left({\mu }_{1},{\mu }_{2},{\sigma }_{1},{\sigma }_{2},\rho \right)$:

$\begin{array}{rl}{x}_{1}& ={\mu }_{1}+{\sigma }_{1}{z}_{1}\\ {x}_{2}& ={\mu }_{2}+{\sigma }_{2}\left[{z}_{1}\rho +{z}_{2}\sqrt{1-{\rho }^{2}}\right]\\ \end{array}$
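The two steps above can be sketched in Python using only the standard library. The function name `sample_bivariate_normal`, the seed, and the parameter values are illustrative assumptions; the sample correlation at the end is only a rough check that the draws behave as intended.

```python
import math
import random


def sample_bivariate_normal(mu1, mu2, sigma1, sigma2, rho, rng):
    """Draw one (x1, x2) pair from two independent N(0, 1) draws."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x1 = mu1 + sigma1 * z1
    x2 = mu2 + sigma2 * (rho * z1 + math.sqrt(1.0 - rho**2) * z2)
    return x1, x2


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
draws = [sample_bivariate_normal(1.0, -2.0, 2.0, 0.5, 0.6, rng) for _ in range(50_000)]

# Sample moments; these should be close to the parameters passed in above
n = len(draws)
m1 = sum(x for x, _ in draws) / n
m2 = sum(y for _, y in draws) / n
sd1 = math.sqrt(sum((x - m1) ** 2 for x, _ in draws) / n)
sd2 = math.sqrt(sum((y - m2) ** 2 for _, y in draws) / n)
cov = sum((x - m1) * (y - m2) for x, y in draws) / n
sample_rho = cov / (sd1 * sd2)
print(round(sample_rho, 2))
```

The construction works because $z_{1}\rho +z_{2}\sqrt{1-\rho^{2}}$ is itself standard normal and has covariance $\rho$ with $z_{1}$.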

## Truncated Normal Distribution

Let $X$ be normally distributed with mean $\mu$ and variance $\sigma^{2}$ and consider the conditional distribution of $X$ on the interval $[a,\,b]\subset \mathbb{R}$. The distribution of $X$ conditional on $X\in [a,\,b]$ is the truncated normal distribution. The conditional density is $f(x\mid x\in [a,\,b])=\frac{\frac{1}{\sigma}\varphi\left(\frac{x-\mu}{\sigma}\right)}{\Phi\left(\frac{b-\mu}{\sigma}\right)-\Phi\left(\frac{a-\mu}{\sigma}\right)}$ for $a\le x\le b$ (and zero otherwise), where $\varphi$ and $\Phi$ denote the standard normal density and CDF, respectively.
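A minimal sketch of this density, using `statistics.NormalDist` from the Python standard library for $\varphi$ and $\Phi$. The function name `truncnorm_pdf` and the parameter values are illustrative assumptions. The midpoint-rule sum at the end is a rough numerical check that the density integrates to one over $[a,b]$.

```python
from statistics import NormalDist


def truncnorm_pdf(x, mu, sigma, a, b):
    """Density of X ~ N(mu, sigma^2) conditional on a <= X <= b."""
    if x < a or x > b:
        return 0.0
    dist = NormalDist(mu, sigma)
    mass = dist.cdf(b) - dist.cdf(a)   # Phi((b-mu)/sigma) - Phi((a-mu)/sigma)
    return dist.pdf(x) / mass          # dist.pdf(x) equals phi((x-mu)/sigma) / sigma


# Illustrative values; midpoint rule over [a, b]
mu, sigma, a, b = 1.0, 2.0, 0.0, 4.0
steps = 10_000
h = (b - a) / steps
total = sum(truncnorm_pdf(a + (i + 0.5) * h, mu, sigma, a, b) for i in range(steps)) * h
print(total)  # should be very close to 1
```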

### Censored Regression

In econometrics, this distribution is used in the censored regression model (also commonly called the Tobit model). In this model, we observe an outcome ${y}_{i}$ and covariates ${x}_{i}$ according to the model

(3) $\begin{array}{rl}y_{i}^{*}& =x_{i}^{\top}\beta +\epsilon_{i},\\ y_{i}& =\left\{\begin{array}{ll}y_{i}^{*},& \text{if}\quad y_{i}^{*}>0,\\ 0,& \text{if}\quad y_{i}^{*}\le 0,\end{array}\right.\end{array}$

where $\epsilon_{i}\sim N(0,\sigma^{2})$. The variable $y_{i}^{*}$ is latent: it is observed only when it exceeds a threshold (normalized to zero here).

The expected value of ${y}_{i}$ and the likelihood function can be derived using the properties of the truncated normal distribution.
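As an illustration, a standard result for this model (stated here without derivation) is $E[y_{i}\mid x_{i}]=\Phi\left(\frac{x_{i}^{\top}\beta}{\sigma}\right)x_{i}^{\top}\beta +\sigma \varphi\left(\frac{x_{i}^{\top}\beta}{\sigma}\right)$. The sketch below compares this expression with a simulated average of censored outcomes for a single value of the index $x_{i}^{\top}\beta$. The function name `tobit_expected_y`, the variable `index` standing in for $x_{i}^{\top}\beta$, the seed, and the numerical values are all illustrative assumptions.

```python
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.pdf is phi, STD.cdf is Phi


def tobit_expected_y(index, sigma):
    """E[y | x] in the Tobit model, where index = x'beta and censoring is at 0."""
    z = index / sigma
    return STD.cdf(z) * index + sigma * STD.pdf(z)


# Simulate y = max(0, y*) with y* = index + eps, eps ~ N(0, sigma^2)
rng = random.Random(1)  # fixed seed, chosen arbitrarily
index, sigma = 0.5, 2.0
n = 200_000
sim_mean = sum(max(0.0, index + rng.gauss(0.0, sigma)) for _ in range(n)) / n

print(tobit_expected_y(index, sigma), sim_mean)  # the two should be close
```

Note that the expected observed outcome exceeds the latent mean $x_{i}^{\top}\beta$ here, because negative latent values are replaced by zero.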

### Derivation of the Mean

For $X\sim N(\mu,\sigma^{2})$, the mean of the truncated distribution follows by direct integration, using $\frac{\partial}{\partial x}\varphi\left(\frac{x-\mu}{\sigma}\right)=-\frac{x-\mu}{\sigma^{2}}\varphi\left(\frac{x-\mu}{\sigma}\right)$:

$\begin{array}{rl}E\left[X\mid X\in [a,\,b]\right]& =\mu +\int_{a}^{b}(x-\mu)\frac{\frac{1}{\sigma}\varphi\left(\frac{x-\mu}{\sigma}\right)}{\Phi\left(\frac{b-\mu}{\sigma}\right)-\Phi\left(\frac{a-\mu}{\sigma}\right)}\,dx\\ & =\mu +\frac{1}{\Phi\left(\frac{b-\mu}{\sigma}\right)-\Phi\left(\frac{a-\mu}{\sigma}\right)}\int_{a}^{b}\frac{x-\mu}{\sigma}\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\,dx\\ & =\mu +\frac{1}{\Phi\left(\frac{b-\mu}{\sigma}\right)-\Phi\left(\frac{a-\mu}{\sigma}\right)}\int_{a}^{b}\sigma \frac{\partial}{\partial x}\left[-\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\right]\,dx\\ & =\mu +\frac{1}{\Phi\left(\frac{b-\mu}{\sigma}\right)-\Phi\left(\frac{a-\mu}{\sigma}\right)}\sigma \left[-\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\right]_{a}^{b}\\ & =\mu -\sigma \frac{\varphi\left(\frac{b-\mu}{\sigma}\right)-\varphi\left(\frac{a-\mu}{\sigma}\right)}{\Phi\left(\frac{b-\mu}{\sigma}\right)-\Phi\left(\frac{a-\mu}{\sigma}\right)}.\end{array}$
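The closed-form expression in the last line of the derivation can be checked numerically. The sketch below compares it with a midpoint-rule approximation of the first line of the derivation's underlying integral, $\int_{a}^{b}x\,f(x\mid x\in [a,b])\,dx$. The function names, the step count, and the parameter values are illustrative assumptions.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.pdf is phi, STD.cdf is Phi


def truncated_mean(mu, sigma, a, b):
    """Closed-form E[X | a <= X <= b] for X ~ N(mu, sigma^2)."""
    za, zb = (a - mu) / sigma, (b - mu) / sigma   # standardized bounds
    return mu - sigma * (STD.pdf(zb) - STD.pdf(za)) / (STD.cdf(zb) - STD.cdf(za))


def truncated_mean_numeric(mu, sigma, a, b, steps=20_000):
    """Midpoint-rule approximation of the same conditional mean."""
    dist = NormalDist(mu, sigma)
    mass = dist.cdf(b) - dist.cdf(a)
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += x * dist.pdf(x) / mass * h
    return total


closed = truncated_mean(1.0, 2.0, 0.0, 4.0)
numeric = truncated_mean_numeric(1.0, 2.0, 0.0, 4.0)
print(closed, numeric)  # the two values should agree to several decimal places
```

When the interval is symmetric around $\mu$, the two density terms cancel and the truncated mean equals $\mu$, as expected.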

### Special Cases

• Left truncation of the standard normal: $[a,\,b]=[a,\,\infty)$, $\mu =0$, $\sigma =1$.

$E\left[X\mid X>a\right]=\frac{\varphi(a)}{1-\Phi(a)}.$ This expression is called the inverse Mills ratio and is denoted $\lambda(a)\equiv \frac{\varphi(a)}{1-\Phi(a)}.$ It is the hazard function of the standard Normal distribution.
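A minimal sketch of the inverse Mills ratio for the special case above, again using `statistics.NormalDist` for $\varphi$ and $\Phi$. The function name `inverse_mills` is an illustrative assumption. This direct formula is only a sketch: for large positive $a$ the denominator $1-\Phi(a)$ loses floating-point precision, so a more careful implementation would be needed in the far tail.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.pdf is phi, STD.cdf is Phi


def inverse_mills(a):
    """lambda(a) = phi(a) / (1 - Phi(a)): mean of a standard normal given X > a."""
    return STD.pdf(a) / (1.0 - STD.cdf(a))


for a in (-1.0, 0.0, 1.0, 2.0):
    print(a, inverse_mills(a))
```

At $a=0$ the ratio equals $\sqrt{2/\pi}\approx 0.798$, the mean of a half-normal distribution, and for every $a$ the ratio exceeds $a$, since the mean of $X$ given $X>a$ must lie above $a$.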

## Exponential Distribution

The probability density function (pdf) of an exponential distribution with rate parameter $\lambda$ is

$f(x;\lambda)=\left\{\begin{array}{ll}\lambda e^{-\lambda x},& x\ge 0,\\ 0,& x<0.\end{array}\right.$

The cumulative distribution function (cdf) is

$F(x;\lambda)=\left\{\begin{array}{ll}1-e^{-\lambda x},& x\ge 0,\\ 0,& x<0.\end{array}\right.$

The distribution has support $[0,\infty)$. If a random variable $X$ has this distribution, we write $X\sim \mathrm{Expo}(\lambda).$

### Alternative Parameterization

Sometimes the exponential distribution is parameterized by $\beta =\frac{1}{\lambda}$, where $\beta$ is the mean of the $\mathrm{Expo}(\lambda)$ distribution. To avoid ambiguity, it is therefore important to specify whether the parameter denotes the rate or the mean.
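The ambiguity shows up in software as well: Python's `random.expovariate` takes the rate $\lambda$, whereas some other libraries take the mean (scale) instead. The sketch below starts from a mean $\beta$, converts it to a rate, and checks the sample mean and one value of the cdf. The helper names `expo_cdf` and `sample_expo_by_mean`, the seed, and the value of $\beta$ are illustrative assumptions.

```python
import math
import random


def expo_cdf(x, rate):
    """CDF of the exponential distribution with the given rate parameter."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def sample_expo_by_mean(mean, rng):
    """Draw from the exponential distribution parameterized by its mean beta."""
    rate = 1.0 / mean              # convert the mean to a rate first
    return rng.expovariate(rate)   # random.expovariate expects the rate, not the mean


rng = random.Random(2)  # fixed seed, chosen arbitrarily
beta = 4.0
draws = [sample_expo_by_mean(beta, rng) for _ in range(100_000)]

sample_mean = sum(draws) / len(draws)
share_below_one = sum(d <= 1.0 for d in draws) / len(draws)
print(sample_mean)                                  # should be close to beta
print(share_below_one, expo_cdf(1.0, 1.0 / beta))   # empirical vs. theoretical F(1)
```

Passing $\beta$ directly where a rate is expected would produce draws with mean $1/\beta$ instead of $\beta$, which is the mistake the parameterization warning is meant to prevent.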

## Weibull Distribution

The Weibull distribution (Weibull, 1951) has cdf $F(x)=1-e^{-\lambda x^{\alpha}}$ for $x\ge 0$ and pdf $f(x)=\left\{\begin{array}{ll}\lambda \alpha x^{\alpha -1}e^{-\lambda x^{\alpha}},& x\ge 0,\\ 0,& x<0.\end{array}\right.$ Here, $\lambda >0$ governs the scale (in the more common parameterization, the scale parameter is $\lambda^{-1/\alpha}$) and $\alpha >0$ is a shape parameter.

### Special Cases

For certain values of $\alpha$, the Weibull distribution reduces to, or closely resembles, other common distributions:

• When $\alpha =1$, it is equivalent to the Exponential distribution with rate parameter $\lambda$.
• When $\alpha =2$, it is equivalent to the Rayleigh distribution with scale parameter $\sigma =1/\sqrt{2\lambda}$.
• When $\alpha \approx 3.4$, its density is close in shape to that of a univariate normal distribution.
• As $\alpha \to \infty$ with $\lambda$ fixed, it converges in distribution to a point mass at $x=1$ (a Dirac delta centered at $1$).
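A short sketch of the Weibull cdf $F(x)=1-e^{-\lambda x^{\alpha}}$ and of the density obtained by differentiating it, $\lambda \alpha x^{\alpha -1}e^{-\lambda x^{\alpha}}$. It checks the density against a finite-difference derivative of the cdf and confirms the exponential special case $\alpha =1$. The function names and the parameter values are illustrative assumptions.

```python
import math


def weibull_cdf(x, lam, alpha):
    """F(x) = 1 - exp(-lam * x**alpha) for x >= 0, and 0 otherwise."""
    return 1.0 - math.exp(-lam * x**alpha) if x >= 0 else 0.0


def weibull_pdf(x, lam, alpha):
    """Derivative of weibull_cdf; assumes x > 0 whenever alpha < 1."""
    if x < 0:
        return 0.0
    return lam * alpha * x ** (alpha - 1) * math.exp(-lam * x**alpha)


lam, alpha, x = 0.7, 2.5, 1.3  # illustrative values

# Central finite difference of the cdf should match the pdf
h = 1e-6
finite_diff = (weibull_cdf(x + h, lam, alpha) - weibull_cdf(x - h, lam, alpha)) / (2 * h)
print(weibull_pdf(x, lam, alpha), finite_diff)

# With alpha = 1 the density reduces to the exponential density with rate lam
expo_pdf = lam * math.exp(-lam * x)
print(weibull_pdf(x, lam, 1.0), expo_pdf)
```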

## References

• Cameron, A. C. and P. K. Trivedi (2005). Microeconometrics: Methods and Applications. New York: Cambridge University Press.

• Weibull, W. (1951). A statistical distribution function of wide applicability. J. Appl. Mech.-Trans. ASME 18, 293–297.