88 MISSING DATA MECHANISMS
–Missed visits, either at random or for reasons related to response, such
as when a patient in a study of depression fails to show up when he is
experiencing symptoms;
–Withdrawal from a study, decided either by the participant or by the
investigator conducting the study; common examples in pharmacologic
trials are withdrawal due to side effects, toxicity, or lack of efficacy;
–Loss to follow-up, distinguished from withdrawal because the reasons are
not reported;
–Death or disabling event, possibly related to the outcome but sometimes
not; for example, in a longitudinal study of HIV, accidental death is not
outcome related;
–Missingness by design, as in a longitudinal survey where only a subset of
individuals is selected for follow-up.
This list is certainly not exhaustive, but it covers many reasons for missing data
in longitudinal studies. Adding to the complexity of handling missing data
is that missingness may have different causes, and in most cases should be
treated differently. Withdrawal for lack of efficacy is a different process than
withdrawal for toxicity; outcome-related mortality must be treated differently
than death by other causes.
Our focus throughout the book is primarily on dropout, and for simplicity
we begin with the assumption that dropout — and its relation to the response
process — can be captured using a single random variable. When there are
distinct types of dropout such that they are related to outcome in different
ways, then it is straightforward to introduce a multinomial version of the
missing data indicator, and many of the same ideas discussed here will apply
directly (see Rotnitzky et al., 2001, for example). Also, see Section 8.4.5.
Before moving forward to describing models for incomplete data, it is
necessary to define formally two important terms: dropout and monotone missing
data pattern. For many of the models we discuss in this and subsequent
chapters, monotone missingness is a key requirement.
Definition 5.1. Dropout process.
For full-data responses $Y_1, \ldots, Y_J$ scheduled to be recorded at times
$t_1, \ldots, t_J$, let $R_1, \ldots, R_J$ denote the missing data indicators,
with $R_j = 1$ if $Y_j$ is observed and $R_j = 0$ if missing. A missing data
process is a dropout process if for some $j$ such that $1 < j < J$,
$R_j = 0 \Rightarrow R_{j+k} = 0$ for all $1 < k \le J - j$; that is,
there exists a measurement occasion $j$ such that a missing response at time $j$
implies all subsequent observations are missing. □
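As an illustration only, the definition can be checked on a realized vector of missing data indicators. The sketch below is not from the text: the function name dropout_occasion is hypothetical, and it follows the verbal reading of the definition (a missing response at occasion j with every later response also missing) without enforcing the boundary conditions on j.

```python
from typing import Optional, Sequence


def dropout_occasion(r: Sequence[int]) -> Optional[int]:
    """Return the 1-based occasion at which dropout begins, or None.

    Assumption (not from the text): dropout is taken to begin at the first
    occasion of the trailing run of zeros in r, i.e. the earliest j such
    that R_j = 0 and every later indicator is also 0.
    """
    if any(v not in (0, 1) for v in r):
        raise ValueError("indicators must be 0 (missing) or 1 (observed)")

    n_occasions = len(r)
    # Walk backwards over the trailing run of missing indicators.
    last_observed = n_occasions
    while last_observed > 0 and r[last_observed - 1] == 0:
        last_observed -= 1

    # No trailing zeros means the final response was observed: no dropout.
    if last_observed == n_occasions:
        return None
    return last_observed + 1


if __name__ == "__main__":
    for pattern in ([1, 1, 0, 0], [1, 0, 1, 1], [1, 0, 1, 0], [1, 1, 1, 1]):
        print(pattern, dropout_occasion(pattern))
```

Under this reading, a gap that is followed by an observed response, as in the indicator vector (1, 0, 1, 1), is not counted as dropout, because the process resumes after the missing occasion.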
Missingness that does not lead to dropout usually is called intermittent
missingness because rather than truncating the longitudinal process, it creates
gaps. Dropout that occurs in the absence of intermittent missingness leads to
a monotone pattern for the responses