Heckman and Honoré (1989)

The Identifiability of the Competing Risks Model

These notes are based on the following article:

Heckman, James J. and Bo E. Honoré (1989). The identifiability of the competing risks model. Biometrika, 76: 325–330.

The Classical Competing Risks Model

Suppose there are $J$ competing causes of death $\{1, 2, \dots, J\}$.
Associated with each cause of death is a stochastic failure time $T_{j}$ .
We observe only the distribution of the identified minimum:
- The time of death $T = \min_{j} T_{j}$ .
- The cause of death $I = \arg \min_{j} T_{j}$ .
Goal: Identify the joint distribution of the latent failure times given that we only observe the distribution of the identified minimum.
Note that we aren’t considering regressors yet.
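The observation scheme can be illustrated with a short simulation. The sketch below assumes independent exponential latent failure times with made-up rates (an illustrative choice, not part of the article) and keeps only the identified minimum $(T, I)$.

```python
import random

def draw_identified_minimum(rates, rng):
    """Draw latent exponential failure times and return (T, I).

    rates: hypothetical hazard rates, one per cause (illustrative only).
    The latent times are discarded; only the minimum and its index remain.
    """
    latent = [rng.expovariate(r) for r in rates]
    T = min(latent)
    I = latent.index(T) + 1  # causes are labelled 1..J
    return T, I

rng = random.Random(0)
rates = [0.5, 1.0, 1.5]  # assumed values, not from the article
sample = [draw_identified_minimum(rates, rng) for _ in range(50_000)]

# With independent exponentials, Pr(I = j) = rate_j / sum(rates)
# and T is exponential with rate sum(rates).
share = [sum(1 for _, i in sample if i == j) / len(sample) for j in (1, 2, 3)]
mean_T = sum(t for t, _ in sample) / len(sample)
```

Everything the analyst can learn comes from the joint behavior of the pairs in `sample`; the individual entries of `latent` are never observed.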

Cox and Tsiatis Nonidentification Theorem

For any joint distribution of latent failure times, there exists another such distribution with independent failure times that yields the same distribution of the minimum (Cox, 1959, 1962; Tsiatis, 1975).
That is, given r.v.’s $(T_{1}, T_{2}, \dots, T_{J})$ there exist $(S_{1}, S_{2}, \dots, S_{J})$ with $S_{i} ⫫ S_{j}$ for all $i \neq j$ such that $(T, I_{T})$ and $(S, I_{S})$ are observationally equivalent.
In light of this result, empirical work must place some structure on the form of dependence across risks, for example by assuming independence.
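For two risks, the independent model in the theorem can be written down explicitly. The following sketch assumes that the joint survivor function $S (t_{1}, t_{2})$ of the latent times is differentiable with $S (0, 0) = 1$; the symbols $h_{j}$ (cause-specific hazard) and $S^{*}$ (independent survivor function) are notation introduced here for illustration.

```latex
\begin{align*}
h_j(t) &= -\frac{1}{S(t,t)} \left[\frac{\partial S}{\partial t_j}\right]_{t_1 = t_2 = t}, \qquad j = 1, 2, \\
S^{*}(t_1, t_2) &= \exp\left[-\int_0^{t_1} h_1(s)\,ds\right] \exp\left[-\int_0^{t_2} h_2(s)\,ds\right], \\
\frac{d}{dt}\log S(t,t) &= -\{h_1(t) + h_2(t)\} \;\Rightarrow\; S^{*}(t,t) = S(t,t), \\
\left[\frac{\partial S^{*}}{\partial t_j}\right]_{t_1 = t_2 = t} &= -h_j(t)\, S(t,t) = \left[\frac{\partial S}{\partial t_j}\right]_{t_1 = t_2 = t}.
\end{align*}
```

The dependent model $S$ and the independent model $S^{*}$ agree on the diagonal and on the partial derivatives along it, so they generate the same distribution of $(T, I)$.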

Importance of Dependence

We are concerned with conditional independence—independence of the risks $T_{1}, \dots, T_{J}$ conditional on $X$ .
Even conditional independence may not hold if, for example, we are studying an individual whose behavior may affect all of the risks.
Yashin, Manton, and Stallard (1986): How do smoking, blood pressure, and body weight (regressors) affect the time of death from cancer, heart disease, etc. (risks)?

Overview

Establish an identification theorem for a general class of competing risks models with regressors.
This class includes models with marginal distributions that follow:
- Proportional hazards.
- Mixed proportional hazards.
- Accelerated hazards.
Results are presented for only two competing risks but generalize to any arbitrary finite number of risks.

Proportional Hazards Model

We want to model the time of death $T$ from a single risk conditional on some covariates $X$ .
Conditional on $X$ , $T$ has cdf $F (t | x)$ and pdf $f (t | x)$ .
Hazard function: $λ (t | x) = \frac{f (t | x)}{1 - F (t | x)}$ .
Integrated hazard: $Λ (t | x) = \int_{0}^{t} λ (s | x) ds$ .
If $λ (t | x) = z (t) ϕ (x)$ then $Λ (t | x) = Z (t) ϕ (x)$ with $Z (t) = \int_{0}^{t} z (s) ds$ .
Equivalently, we can work with the survivor function: $S (t | x) = \Pr (T > t | x) = \exp [- Z (t) ϕ (x)]$
It is common in practice to use $ϕ (x) = e^{x β}$ .
Suppose $F (t | x) = 1 - e^{- Z (t) ϕ (x)}$ where $Z (t)$ is the baseline integrated hazard and $ϕ (x)$ is a scaling term.
If $Z$ is differentiable, then $Z' (t)$ is the baseline hazard.
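A single-risk proportional hazards model is easy to simulate by inverting the survivor function. The sketch below assumes a Weibull-type baseline $Z (t) = t^{a}$ and $ϕ (x) = e^{x β}$ with a scalar covariate; the parameter values and function names are hypothetical.

```python
import math
import random

def survivor(t, x, beta, shape):
    """S(t | x) = exp[-Z(t) phi(x)] with Z(t) = t**shape, phi(x) = exp(x*beta)."""
    return math.exp(-(t ** shape) * math.exp(x * beta))

def draw_failure_time(x, beta, shape, rng):
    """Inverse-transform draw: solve U = S(T | x) for T."""
    u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is finite
    return (-math.log(u) / math.exp(x * beta)) ** (1.0 / shape)

rng = random.Random(1)
x, beta, shape = 0.5, 0.8, 1.5  # assumed values for illustration
draws = [draw_failure_time(x, beta, shape, rng) for _ in range(50_000)]

# Compare the simulated survival frequency at t0 with the model formula.
t0 = 0.7
empirical = sum(d > t0 for d in draws) / len(draws)
theoretical = survivor(t0, x, beta, shape)
```

Because $Z (1) = 1$ for this baseline, the example is also consistent with the scale normalization used later in the identification theorem.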

Proportional Hazards and Competing Risks

Assuming for the moment that failure times are independent, we can easily generalize this to model competing risks.
The distribution of each failure time has a proportional hazard specification.
$Z (t)$ and $ϕ$ may differ across risks.
The joint survivor function is the product of the marginal survivor functions: $S (t_{1}, t_{2} | x) = \exp [- Z_{1} (t_{1}) ϕ_{1} (x)] \exp [- Z_{2} (t_{2}) ϕ_{2} (x)] .$

Introducing Dependence

We could draw two independent failure times $T_{1}$ and $T_{2}$ by drawing (independently) $U_{j} \sim U (0, 1)$ and solving $U_{j} = S_{j} (T_{j} | x)$ for $T_{j}$, where

$S_{j} (t | x) = \exp \{- Z_{j} (t) ϕ_{j} (x)\} .$

If $K (u_{1}, u_{2}) = u_{1} u_{2}$ is the joint CDF of $U_{1}$ and $U_{2}$, the joint survivor function is

$S (t_{1}, t_{2} | x) = K [\exp \{- Z_{1} (t_{1}) ϕ_{1} (x)\}, \exp \{- Z_{2} (t_{2}) ϕ_{2} (x)\}] .$

We can introduce dependence in $T_{1}$ and $T_{2}$ by introducing dependence in $U_{1}$ and $U_{2}$ via $K$ .
Suppose $(U_{1}, U_{2}) \sim K (\cdot, \cdot)$ on $[0, 1]^{2}$ and assume that $Z_{1} (0) = Z_{2} (0) = 0$ .
Then the survivor function for $(T_{1}, T_{2})$ is

$S (t_{1}, t_{2} | x) = K (\exp [- Z_{1} (t_{1}) ϕ_{1} (x)], \exp [- Z_{2} (t_{2}) ϕ_{2} (x)]) . \qquad (1)$
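The construction can be checked by simulation. The sketch below uses a Clayton copula as one possible choice of $K$ (an assumption made here for illustration; the notes do not single out any particular $K$), Weibull-type baselines $Z_{j} (t) = t^{a_{j}}$, and hypothetical parameter values. It draws $(U_{1}, U_{2}) \sim K$, solves $U_{j} = \exp [- Z_{j} (T_{j}) ϕ_{j}]$ for $T_{j}$, and compares the simulated joint survival frequency with the formula in (1).

```python
import math
import random

THETA = 2.0  # Clayton dependence parameter (assumed for illustration)

def K(u1, u2, theta=THETA):
    """Clayton copula, used as an example of a CDF K on [0, 1]^2."""
    return (u1 ** -theta + u2 ** -theta - 1.0) ** (-1.0 / theta)

def draw_u(rng, theta=THETA):
    """Draw (U1, U2) with joint CDF K by conditional inversion."""
    u1 = 1.0 - rng.random()  # lies in (0, 1]
    w = 1.0 - rng.random()
    u2 = (u1 ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u1, u2

def draw_times(phi1, phi2, a1, a2, rng):
    """Solve U_j = exp[-Z_j(T_j) phi_j] for T_j, with Z_j(t) = t**a_j."""
    u1, u2 = draw_u(rng)
    first = (-math.log(u1) / phi1) ** (1.0 / a1)
    second = (-math.log(u2) / phi2) ** (1.0 / a2)
    return first, second

rng = random.Random(2)
phi1, phi2, a1, a2 = 1.0, 2.0, 1.5, 0.8  # hypothetical values
draws = [draw_times(phi1, phi2, a1, a2, rng) for _ in range(50_000)]

# Simulated Pr(T1 > t1, T2 > t2) versus K(S1(t1), S2(t2)).
t1, t2 = 0.6, 0.3
empirical = sum(d1 > t1 and d2 > t2 for d1, d2 in draws) / len(draws)
implied = K(math.exp(-(t1 ** a1) * phi1), math.exp(-(t2 ** a2) * phi2))
```

The key step is that $T_{j} > t_{j}$ holds exactly when $U_{j} < \exp [- Z_{j} (t_{j}) ϕ_{j}]$, which is why the CDF of $(U_{1}, U_{2})$ becomes the survivor function of $(T_{1}, T_{2})$.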

Generalization: Mixed Proportional Hazards

Suppose that the competing risks are independent, $ϕ_{j} (x) = e^{x β_{j}}$ , and that one of the covariates, $ω$ , is not observed:

$S (t_{1}, t_{2} | x) = \int_{Ω} \exp [- Z_{1} (t_{1}) e^{x β_{1} + c_{1} ω}] \exp [- Z_{2} (t_{2}) e^{x β_{2} + c_{2} ω}] dG (ω) .$

We can arrive at this model by choosing $K$ such that:

$K (η_{1}, η_{2}) = \int_{Ω} η_{1}^{\exp (c_{1} ω)} η_{2}^{\exp (c_{2} ω)} dG (ω) .$
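To see why this choice of $K$ reproduces the mixed proportional hazards survivor function, substitute the marginal terms into $K$. The steps below are a direct calculation using only the definitions above.

```latex
\begin{align*}
\eta_j &= \exp\left[-Z_j(t_j)\, e^{x \beta_j}\right], \\
\eta_j^{\exp(c_j \omega)} &= \exp\left[-Z_j(t_j)\, e^{x \beta_j} e^{c_j \omega}\right]
  = \exp\left[-Z_j(t_j)\, e^{x \beta_j + c_j \omega}\right], \\
K(\eta_1, \eta_2) &= \int_{\Omega} \exp\left[-Z_1(t_1)\, e^{x \beta_1 + c_1 \omega}\right]
  \exp\left[-Z_2(t_2)\, e^{x \beta_2 + c_2 \omega}\right] dG(\omega) = S(t_1, t_2 \mid x).
\end{align*}
```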

Generalization: Accelerated hazards

$S (t | x) = \exp [- Z \{t ϕ (x)\}]$

Joint survivor with dependent competing risks:

$S (t_{1}, t_{2} | x) = K (\exp [- Z_{1} \{t_{1} ϕ_{1} (x)\}], \exp [- Z_{2} \{t_{2} ϕ_{2} (x)\}]) .$

For any $K$ , the marginal distributions give rise to univariate accelerated hazard models.

Identification Theorem

Assume that $(T_{1}, T_{2})$ has joint survivor function (1). Then $Z_{1}$, $Z_{2}$, $ϕ_{1}$, $ϕ_{2}$, and $K$ are identified from the distribution of the identified minimum of $(T_{1}, T_{2})$ under the following assumptions:

- Assumption 1: $K$ is continuously differentiable with partial derivatives $K_{1}$ and $K_{2}$, and for $i = 1, 2$, $\lim_{n \to ∞} K_{i} (η_{1 n}, η_{2 n})$ is finite for all sequences $η_{1 n}$, $η_{2 n}$ for which $η_{1 n} \to 1$ and $η_{2 n} \to 1$ as $n \to ∞$. We also assume that $K$ is strictly increasing in each of its arguments.
- Assumption 2: $Z_{1} (1) = Z_{2} (1) = 1$ and $ϕ_{1} (x_{0}) = ϕ_{2} (x_{0}) = 1$ for some $x_{0}$.
- Assumption 3: The support of $\{ϕ_{1} (x), ϕ_{2} (x)\}$ is $(0, ∞) \times (0, ∞)$.
- Assumption 4: $Z_{1}$ and $Z_{2}$ are nonnegative, differentiable, strictly increasing functions, except that we allow them to be infinite for finite $t$.

Notes about these assumptions:

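- Assumption 1: $K$ is already weakly increasing because it is a distribution function, so only strict monotonicity is an additional restriction.
- Assumption 2: This is an innocuous normalization since $ϕ_{j} (x)$ and $Z_{j} (t)$ enter only through their product and are therefore each identified only up to scale.
- Assumption 3: This is satisfied, for example, when $ϕ_{j} (x) = \exp (x β_{j})$ and there is a common covariate with support $ℝ$ and different coefficients.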

Mapping Observables to Unobservables

Observed distributions:

$Q_{1} (t | x) = \Pr (T_{1} \geq t, T_{2} \geq T_{1} | x) , \qquad Q_{2} (t | x) = \Pr (T_{2} \geq t, T_{1} \geq T_{2} | x) .$

Tsiatis (1975) establishes the following mappings:

$\frac{\partial Q_{1}}{\partial t} (t | x) = \left[\frac{\partial S}{\partial t_{1}}\right]_{t_{1} = t_{2} = t} , \qquad \frac{\partial Q_{2}}{\partial t} (t | x) = \left[\frac{\partial S}{\partial t_{2}}\right]_{t_{1} = t_{2} = t} .$

Differentiating (1) with respect to $t_{1}$ and evaluating at $t_{1} = t_{2} = t$, we have $\frac{\partial Q_{1}}{\partial t} (t | x) = - K_{1} [\exp \{- Z_{1} (t) ϕ_{1} (x)\}, \exp \{- Z_{2} (t) ϕ_{2} (x)\}] \exp \{- Z_{1} (t) ϕ_{1} (x)\} Z'_{1} (t) ϕ_{1} (x) .$

Identification of $ϕ_{j}$

Taking the ratio of $\frac{\partial Q_{1} (t | x)}{\partial t}$ at $x$ and $x_{0}$ yields

$\frac{K_{1} [\exp \{- Z_{1} (t) ϕ_{1} (x)\}, \exp \{- Z_{2} (t) ϕ_{2} (x)\}] \exp \{- Z_{1} (t) ϕ_{1} (x)\} Z'_{1} (t) ϕ_{1} (x)}{K_{1} [\exp \{- Z_{1} (t) ϕ_{1} (x_{0})\}, \exp \{- Z_{2} (t) ϕ_{2} (x_{0})\}] \exp \{- Z_{1} (t) ϕ_{1} (x_{0})\} Z'_{1} (t) ϕ_{1} (x_{0})} .$

Taking $t \to 0$ and using the normalization yields $ϕ_{1} (x)$ . Our choice of $x$ was arbitrary so $ϕ_{1} (x)$ is identified on the entire support of $X$ . Similarly for $ϕ_{2} (x)$ .
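The limit can be spelled out as follows. This is a sketch that uses $Z_{j} (0) = 0$, cancels the common factor $Z'_{1} (t)$, and writes $K_{1} (1, 1)$ for the limit of $K_{1}$ as both arguments tend to 1, which Assumption 1 makes finite; that this limit is also nonzero is an assumption made here so that it cancels.

```latex
\begin{align*}
\lim_{t \to 0} \frac{\partial Q_1(t \mid x) / \partial t}{\partial Q_1(t \mid x_0) / \partial t}
&= \lim_{t \to 0}
   \frac{K_1\left[e^{-Z_1(t)\phi_1(x)}, e^{-Z_2(t)\phi_2(x)}\right] e^{-Z_1(t)\phi_1(x)}\, \phi_1(x)}
        {K_1\left[e^{-Z_1(t)\phi_1(x_0)}, e^{-Z_2(t)\phi_2(x_0)}\right] e^{-Z_1(t)\phi_1(x_0)}\, \phi_1(x_0)} \\
&= \frac{K_1(1, 1)\, \phi_1(x)}{K_1(1, 1)\, \phi_1(x_0)}
 = \frac{\phi_1(x)}{\phi_1(x_0)} = \phi_1(x).
\end{align*}
```

The last equality uses the normalization $ϕ_{1} (x_{0}) = 1$ from Assumption 2.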

Identification of $K$

We know $S (t, t | x)$ since $S (t, t | x) = Q_{1} (t | x) + Q_{2} (t | x)$ . Furthermore, $S (t, t | x) = K (\exp [- Z_{1} (t) ϕ_{1} (x)], \exp [- Z_{2} (t) ϕ_{2} (x)]) .$

Setting $t = 1$ and using $Z_{1} (1) = Z_{2} (1) = 1$ gives $S (1, 1 | x) = K (\exp [- ϕ_{1} (x)], \exp [- ϕ_{2} (x)])$, and letting $ϕ_{1} (x)$ and $ϕ_{2} (x)$ vary over $(0, ∞)^{2}$ (by Assumption 3) yields $K$.
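Put differently, each point in the interior of the unit square is reached by some covariate value. A short worked version of this step, using only quantities already shown to be known:

```latex
\begin{align*}
&\text{Given } (\eta_1, \eta_2) \in (0, 1)^2, \text{ choose } x \text{ such that }
  \phi_1(x) = -\log \eta_1, \quad \phi_2(x) = -\log \eta_2. \\
&K(\eta_1, \eta_2) = K\left(e^{-\phi_1(x)}, e^{-\phi_2(x)}\right)
  = S(1, 1 \mid x) = Q_1(1 \mid x) + Q_2(1 \mid x).
\end{align*}
```

Such an $x$ exists because the support of $\{ϕ_{1} (x), ϕ_{2} (x)\}$ is all of $(0, ∞)^{2}$, and $ϕ_{1}$, $ϕ_{2}$ were identified in the previous step.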

Identification of $Z_{j}$

$S (t, t | x_{n}) = K (\exp [- Z_{1} (t) ϕ_{1} (x_{n})], \exp [- Z_{2} (t) ϕ_{2} (x_{n})])$

Let $ϕ_{2} (x_{n}) \to 0$ while holding $ϕ_{1} (x_{n})$ fixed, which is possible by Assumption 3.
Then $S (t, t | x_{n}) \to K (\exp [- Z_{1} (t) ϕ_{1} (x_{n})], 1) .$
Since $K$ and $ϕ_{1}$ are known and $K$ is strictly increasing in both arguments, this equation can be solved uniquely for $Z_{1} (t)$ for any $t$.
Similarly for $Z_{2} (t)$ .
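The inversion step can be sketched numerically. The example below is hypothetical: it takes a two-point mixture $K$ of the mixed proportional hazards type, Weibull-type baselines $Z_{1} (t) = t^{1.5}$ and $Z_{2} (t) = t^{0.8}$, approximates the limit $ϕ_{2} \to 0$ by a very small value, and inverts $K (\cdot, 1)$ by bisection. All names and parameter values are assumptions made for illustration.

```python
import math

# Hypothetical "true" model, used only to generate the known diagonal S(t, t | x).
P = (0.5, 0.5)  # mixing weights (assumed)
A = (0.5, 2.0)  # exponents on eta_1, i.e. exp(c_1 * omega) (assumed)
B = (2.0, 0.5)  # exponents on eta_2, i.e. exp(c_2 * omega) (assumed)

def K(eta1, eta2):
    """Mixture K(eta1, eta2) = sum_k p_k * eta1**a_k * eta2**b_k."""
    return sum(p * eta1 ** a * eta2 ** b for p, a, b in zip(P, A, B))

def Z1_true(t):
    return t ** 1.5

def Z2_true(t):
    return t ** 0.8

def S_diag(t, phi1, phi2):
    """S(t, t | x) for an x with phi_1(x) = phi1 and phi_2(x) = phi2."""
    return K(math.exp(-Z1_true(t) * phi1), math.exp(-Z2_true(t) * phi2))

def recover_Z1(t, phi1=1.0, phi2=1e-12, iters=200):
    """Approximate phi2 -> 0, then solve K(eta1, 1) = S(t, t | x) for eta1."""
    target = S_diag(t, phi1, phi2)
    lo, hi = 0.0, 1.0
    for _ in range(iters):  # bisection works because K(., 1) is strictly increasing
        mid = 0.5 * (lo + hi)
        if K(mid, 1.0) < target:
            lo = mid
        else:
            hi = mid
    eta1 = 0.5 * (lo + hi)
    return -math.log(eta1) / phi1  # since eta1 = exp[-Z_1(t) * phi1]

recovered = [recover_Z1(t) for t in (0.5, 1.0, 2.0)]
```

Note that this $K$ does not satisfy $K (η_{1}, 1) = η_{1}$, so the numerical inversion is doing real work; it relies only on $K$ being known and strictly increasing in its first argument.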

Conclusion

Identification argument:

Given the distribution of $(T, I)$, exploiting multiplicative separability and taking $t \to 0$ gives us $ϕ_{j} (x)$ for $j = 1, 2$.
Using the full range of $\{ϕ_{1} (x), ϕ_{2} (x)\}$ on $(0, ∞)^{2}$ at $t = 1$ yields $K$.
Using $K$, $ϕ_{j}$, and the strict monotonicity of $K$ gives us $Z_{j} (t)$.

Implications of Nonparametric Identification:

Identification does not depend on parametric functional forms or assumed forms of risk dependence (modulo separability of the hazard).
Highlights the role of regressors in identification in contrast to the Cox-Tsiatis nonidentification result.
Suggests the possibility of a nonparametric estimator.